UGA RCC: Research Computing Center
 
 

RCCBatchBlast


Category(ies): Bioinformatics

Version: 2.0

Author / Distributor:

Yecheng Huang, RCC, UGA

Description:

RCCBatchBlast is a semi-automatic pipeline for running NCBI BLAST on the RCC rcluster.
It splits a large query file containing multiple query sequences into multiple small input files and runs blastall (NCBI) on each of them.
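
The splitting step can be pictured with the minimal sketch below. It is not the rccbatchblast source; the function name split_fasta and the chunk file naming are assumptions made only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT part of rccbatchblast: split a multi-sequence
# FASTA file into chunk files holding at most `per_chunk` sequences each.
# Chunk files are named <prefix>.0001.fasta, <prefix>.0002.fasta, ...
split_fasta() {
    local input="$1" per_chunk="$2" prefix="$3"
    awk -v n="$per_chunk" -v prefix="$prefix" '
        /^>/ {
            # A header line starts a new sequence; open a new chunk
            # file after every n sequences.
            if (count % n == 0) {
                if (out != "") close(out)
                out = sprintf("%s.%04d.fasta", prefix, int(count / n) + 1)
            }
            count++
        }
        # Copy lines only once the first header has been seen.
        out != "" { print > out }
    ' "$input"
}
```

For example, `split_fasta query.fasta 100 chunk` would write chunk.0001.fasta, chunk.0002.fasta, and so on, each with up to 100 sequences (query.fasta is a placeholder file name).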

altix: Not available



inQuiry: Not available



pcluster: Not available



rcluster, IOB:

Running Program:

  • Search utilities
    • rccbatchblast - given sequences in FASTA format, finds similar sequences in a BLAST database on rcluster. It splits the input file into chunks and submits all chunks to the queue. It accepts all standard options of NCBI blastall, plus two additional options:
            -s  Number of sequences in each unit. The input sequence file is split into many smaller files; this option sets how many sequences go into each split file.
            -q  Name of the queue to which the jobs are submitted. For more detail about queues, please refer to rcc queue.
  • Search Result utilities
    • rccbatchblast-check - check the results of rccbatchblast
      * After submitting your jobs, use it to check whether they are done.
      * If all jobs succeed, the BLAST results are merged into check.blast; the number of input sequences, the number of result queries, and the total CPU time are summarized.
      * If jobs failed, or there are duplicated results in units, the suspicious folders are backed up with the prefix e + original folder name; commands for cleanup and resubmission are given in the report.
      * Please check and analyze the errors, then resubmit. All results are written to check.report.
      * In: original FASTA file
      * Out: check.report, check.blast
  • Advanced utilities
Refer to NCBI Blast and Bioteam BTBatchBlast for more options.

e.g.

rccbatchblast -i inputfile -o outputfile -d targetdatabase -p program-name -b bValue -v vValue -s number-of-sequences-in-split-unit -q queue-name -m mValue

where the defaults are bValue=1, vValue=1, number-of-sequences-in-split-unit=100, and queue-name=r1-96h; mValue is N/A. Refer to queue at rcluster for more options for queue-name.
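
As a rough guide to how many split files a run produces, the arithmetic is a ceiling division of the sequence count by the -s value. The sketch below assumes one queued job per split file, which this page implies but does not state; chunk_count is a hypothetical helper, not an RCC tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: number of split units for a given number of input
# sequences and a given -s value (100 is the default stated above).
chunk_count() {
    local total="$1" per_chunk="${2:-100}"
    # Integer ceiling division: round up so a partial last unit is counted.
    echo $(( (total + per_chunk - 1) / per_chunk ))
}
```

For instance, `chunk_count "$(grep -c '^>' inputfile)" 100` estimates the number of units for inputfile with the default split size.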



bjobs -u your-user-name

where your-user-name is the user who ran the above rccbatchblast command.


rccbatchblast-check infile

where infile is the original input FASTA file that was submitted to rccbatchblast.
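
The comparison of input sequences against result queries that rccbatchblast-check summarizes can also be approximated by hand. The sketch below is not the rccbatchblast-check implementation; it assumes the merged result is in the default blastall pairwise format, where each query section begins with a "Query= " line, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, NOT part of rccbatchblast-check.

# Count sequences in a FASTA file (one ">" header line per sequence).
count_fasta_sequences() { grep -c '^>' "$1" || true; }

# Count query sections in a BLAST report. Assumes default pairwise output,
# where each query starts with a line beginning "Query= ".
count_blast_queries() { grep -c '^Query= ' "$1" || true; }

# Compare the two counts; returns non-zero when they differ.
compare_counts() {
    local n_in n_out
    n_in=$(count_fasta_sequences "$1")
    n_out=$(count_blast_queries "$2")
    if [ "$n_in" -eq "$n_out" ]; then
        echo "OK: $n_in input sequences, $n_out result queries"
    else
        echo "MISMATCH: $n_in input sequences, $n_out result queries"
        return 1
    fi
}
```

A call such as `compare_counts infile check.blast` prints one summary line. With tabular output (for example -m 8) the "Query= " pattern does not apply, so this check would need a different counting rule.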

Documentation:

Note: DO NOT use "submit job to the queue".
rccbatchblast is a script that already takes care of submitting the jobs to the queue.
Apart from the command name "rccbatchblast", the options are the same as for NCBI blastall, plus the options for queue name and chunk size. Please refer to Blast.

Installation: iNquiry Package

System(s): Unix



 