This page is scheduled for deletion because it is either redundant with information available on the CC wiki, or the software is no longer supported.
MPIBLAST
Description: Parallel implementation of NCBI BLAST
SHARCNET Package information: see MPIBLAST software page in web portal
Full list of SHARCNET supported software
Note: Some of the information on this page is for our legacy systems only. The page is scheduled for an update to make it applicable to Graham.
Introduction
The mpiblast module must be manually loaded before submitting any mpiblast jobs. The two examples below demonstrate how to set up and submit jobs to the mpi queue.
Version Selection
All Clusters (except Guppy)
module unload openmpi intel
module load intel/12.1.3 openmpi/intel/1.6.2 mpiblast/1.6.0
Guppy Only
module unload openmpi intel
module load intel/11.0.083 openmpi/intel/1.4.2 mpiblast/1.6.0
Job Submission
sqsub -r 15m -n 24 -q mpi --mpp=3g -o ofile%J mpiblast etc
where the number of cores n=24 is arbitrary.
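As a rough guide only, the core count can be derived from the number of database fragments using the "fragments + 8" rule of thumb that appears in the examples below. The following is a minimal sketch, not an official formula; the database name mydb and query file myquery.in are placeholders:

NFRAGS=8                  # number of fragments created with mpiformatdb -N
NCORES=$((NFRAGS + 8))    # rule of thumb used in the examples below: fragments + 8
sqsub -r 15m -n $NCORES -q mpi --mpp=3g -o ofile%J mpiblast -d mydb -i myquery.in -p blastn -o myresults.out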
Example Jobs
DROSOPH
Copy the sample problem files (FASTA database and query input) from the /opt/sharcnet examples directory to a directory under /work as shown here. The FASTA database used in this example can be obtained as a guest from NCBI at http://www.ncbi.nlm.nih.gov/guide/all/#downloads_ by clicking "FTP: FASTA BLAST Databases".
mkdir -p /work/$USER/samples/mpiblast/test1
rm /work/$USER/samples/mpiblast/test1/*
cd /work/$USER/samples/mpiblast/test1
cp /opt/sharcnet/mpiblast/1.6.0/examples/drosoph.in drosoph.in
gunzip -c /opt/sharcnet/mpiblast/1.6.0/examples/drosoph.nt.gz > drosoph.nt.$USER
Create a hidden configuration file using a text editor (such as vi) to define a Shared storage location between nodes and a Local storage directory available on each compute node as follows:
[roberpj@hnd20:/work/$USER/samples/mpiblast/test1] vi .ncbirc
[NCBI]
Data=/opt/sharcnet/mpiblast/1.6.0/ncbi/data
[BLAST]
BLASTDB=/scratch/YourSharcnetUsername/mpiblasttest1
BLASTMAT=/work/YourSharcnetUsername/samples/mpiblast/test1
[mpiBLAST]
Shared=/scratch/YourSharcnetUsername/mpiblasttest1
Local=/tmp
Partition the database into 8 fragments in /scratch/$USER/mpiblasttest1, where they will be searched for when a job runs in the queue, as specified in the .ncbirc file:
module load mpiblast/1.6.0
mkdir -p /scratch/$USER/mpiblasttest1
rm -f /scratch/$USER/mpiblasttest1/*
cd /work/$USER/samples/mpiblast/test1

[roberpj@hnd20:/work/$USER/samples/mpiblast/test1] mpiformatdb -N 8 -i drosoph.nt.$USER -o T -p F -n /scratch/$USER/mpiblasttest1
Reading input file
Done, read 1534943 lines
Breaking drosoph.nt into 8 fragments
Executing: formatdb -p F -i drosoph.nt -N 8 -n /scratch/roberpj/mpiblasttest1/drosoph.nt -o T
Created 8 fragments.
<<< Please make sure the formatted database fragments are placed in /scratch/roberpj/mpiblasttest1/ before executing mpiblast. >>>

[roberpj@hnd20:/scratch/roberpj/mpiblasttest1] ls
drosoph.nt.000.nhr  drosoph.nt.002.nin  drosoph.nt.004.nnd  drosoph.nt.006.nni
drosoph.nt.000.nin  drosoph.nt.002.nnd  drosoph.nt.004.nni  drosoph.nt.006.nsd
drosoph.nt.000.nnd  drosoph.nt.002.nni  drosoph.nt.004.nsd  drosoph.nt.006.nsi
drosoph.nt.000.nni  drosoph.nt.002.nsd  drosoph.nt.004.nsi  drosoph.nt.006.nsq
drosoph.nt.000.nsd  drosoph.nt.002.nsi  drosoph.nt.004.nsq  drosoph.nt.007.nhr
drosoph.nt.000.nsi  drosoph.nt.002.nsq  drosoph.nt.005.nhr  drosoph.nt.007.nin
drosoph.nt.000.nsq  drosoph.nt.003.nhr  drosoph.nt.005.nin  drosoph.nt.007.nnd
drosoph.nt.001.nhr  drosoph.nt.003.nin  drosoph.nt.005.nnd  drosoph.nt.007.nni
drosoph.nt.001.nin  drosoph.nt.003.nnd  drosoph.nt.005.nni  drosoph.nt.007.nsd
drosoph.nt.001.nnd  drosoph.nt.003.nni  drosoph.nt.005.nsd  drosoph.nt.007.nsi
drosoph.nt.001.nni  drosoph.nt.003.nsd  drosoph.nt.005.nsi  drosoph.nt.007.nsq
drosoph.nt.001.nsd  drosoph.nt.003.nsi  drosoph.nt.005.nsq  drosoph.nt.dbs
drosoph.nt.001.nsi  drosoph.nt.003.nsq  drosoph.nt.006.nhr  drosoph.nt.mbf
drosoph.nt.001.nsq  drosoph.nt.004.nhr  drosoph.nt.006.nin  drosoph.nt.nal
drosoph.nt.002.nhr  drosoph.nt.004.nin  drosoph.nt.006.nnd
Submit a short job with a 15m time limit on n=16 cores, calculated by taking the N=8 fragments + 8. If all goes well, output results will be written to drosoph.out and the execution time will appear in ofile%J, where %J is the job number:
[roberpj@hnd20:/work/$USER/samples/mpiblast/test1]
sqsub -r 15m -n 16 -q mpi --mpp=1G -o ofile%J mpiblast -d drosoph.nt.$USER -i drosoph.in
 -p blastn -o drosoph.out --use-parallel-write --use-virtual-frags
submitted as jobid 6966896

[roberpj@hnd20:/work/roberpj/samples/mpiblast/test1] cat ofile6966896.hnd50
Total Execution Time: 1.80031
When submitting an mpiblast job on a cluster such as goblin that does not have an InfiniBand interconnect, better performance (at least a 2X speedup) will be achieved by running the mpi job on a single compute node. Regular users of non-contributed hardware would typically specify "-n 8" to reflect the maximum number of cores on a single node:
sqsub -r 15m -n 8 -N 1 -q mpi --mpp=4G -o ofile%J mpiblast -d drosoph.nt.$USER -i drosoph.in -p blastn -o drosoph.out --use-parallel-write --use-virtual-frags
Sample output results, computed previously with BLASTN 2.2.15 [Oct-15-2006], are included in /opt/sharcnet/mpiblast/1.6.0/examples/ROSOPH.out so you can compare your newly generated drosoph.out file against them.
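One quick, rough way to make that comparison is with diff. This is only a sketch of the check, not an official validation procedure; since the reference file was produced with an older BLASTN release, version strings, dates and statistics may legitimately differ even when the alignments agree:

cd /work/$USER/samples/mpiblast/test1
diff /opt/sharcnet/mpiblast/1.6.0/examples/ROSOPH.out drosoph.out | head -n 20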
UNIGENE
The main purpose of this example is to illustrate some additional options and switches that may be useful for debugging and for dealing with larger databases, as described in official detail at http://www.mpiblast.org/Docs/Guide. The FASTA database used in this example can also be downloaded from http://www.ncbi.nlm.nih.gov/guide/all/#downloads_ as a guest by clicking "FTP: UniGene" then entering the "Homo_sapiens" sub-directory. More information about UniGene alignments can be found at https://cgwb.nci.nih.gov/cgi-bin/hgTrackUi?hgsid=95443&c=chr1&g=uniGene_3 . As with Example1 above, for convenience all required files can simply be copied from the /opt/sharcnet examples subdirectory to /work as shown here:
mkdir /work/$USER/samples/mpiblast/test2
rm -f /work/$USER/samples/mpiblast/test2/*
cd /work/$USER/samples/mpiblast/test2
cp /opt/sharcnet/mpiblast/1.6.0/examples/il2ra.in il2ra.in
gunzip -c /opt/sharcnet/mpiblast/1.6.0/examples/Hs.seq.uniq.gz > Hs.seq.$USER

[roberpj@orc-login2:/work/roberpj/samples/mpiblast/test2] ls
Hs.seq.roberpj  il2ra.in
Create a hidden configuration file using a text editor (such as vi) to define a Shared storage location between nodes and a Local storage directory available on each compute node, as follows. Note that the ncbi/data directory is not used in this example and hence can be omitted. If the Local and Shared directories are the same, replace --copy-via=mpi with --copy-via=none, as demonstrated in the sqsub commands below.
[username@orc-login1:/work/$USER/samples/mpiblast/test2] vi .ncbirc
[NCBI]
Data=/opt/sharcnet/mpiblast/1.6.0/ncbi/data
[BLAST]
BLASTDB=/work/YourSharcnetUsername/mpiblasttest2
BLASTMAT=/work/YourSharcnetUsername/samples/mpiblast/test2
[mpiBLAST]
Shared=/work/YourSharcnetUsername/mpiblasttest2
Local=/tmp
Again partition the database into 8 fragments in /work/$USER/mpiblasttest2, where they will be searched for when a job runs in the queue according to the .ncbirc file, first removing any files left over from a previous partitioning. For this example assume $USER=roberpj, but be sure to replace it with your own username!
module load mpiblast/1.6.0
mkdir -p /work/$USER/mpiblasttest2
rm -f /work/$USER/mpiblasttest2/*
cd /work/$USER/samples/mpiblast/test2

[roberpj@orc-login1:/work/roberpj/samples/mpiblast/test2] mpiformatdb -N 8 -i Hs.seq.$USER -o T -p F -n /work/roberpj/mpiblasttest2
Reading input file
Done, read 2348651 lines
Breaking Hs.seq.roberpj into 8 fragments
Executing: formatdb -p F -i Hs.seq.roberpj -N 8 -n /work/roberpj/mpiblasttest2/Hs.seq.roberpj -o T
Created 8 fragments.
<<< Please make sure the formatted database fragments are placed in /work/roberpj/mpiblasttest2/ before executing mpiblast. >>>

[roberpj@orc-login1:/work/roberpj/samples/mpiblast/test2] ls
formatdb.log  Hs.seq.roberpj  il2ra.in

[roberpj@orc-login1:/work/roberpj/mpiblasttest2] ls
Hs.seq.roberpj.000.nhr  Hs.seq.roberpj.002.nin  Hs.seq.roberpj.004.nnd  Hs.seq.roberpj.006.nni
Hs.seq.roberpj.000.nin  Hs.seq.roberpj.002.nnd  Hs.seq.roberpj.004.nni  Hs.seq.roberpj.006.nsd
Hs.seq.roberpj.000.nnd  Hs.seq.roberpj.002.nni  Hs.seq.roberpj.004.nsd  Hs.seq.roberpj.006.nsi
Hs.seq.roberpj.000.nni  Hs.seq.roberpj.002.nsd  Hs.seq.roberpj.004.nsi  Hs.seq.roberpj.006.nsq
Hs.seq.roberpj.000.nsd  Hs.seq.roberpj.002.nsi  Hs.seq.roberpj.004.nsq  Hs.seq.roberpj.007.nhr
Hs.seq.roberpj.000.nsi  Hs.seq.roberpj.002.nsq  Hs.seq.roberpj.005.nhr  Hs.seq.roberpj.007.nin
Hs.seq.roberpj.000.nsq  Hs.seq.roberpj.003.nhr  Hs.seq.roberpj.005.nin  Hs.seq.roberpj.007.nnd
Hs.seq.roberpj.001.nhr  Hs.seq.roberpj.003.nin  Hs.seq.roberpj.005.nnd  Hs.seq.roberpj.007.nni
Hs.seq.roberpj.001.nin  Hs.seq.roberpj.003.nnd  Hs.seq.roberpj.005.nni  Hs.seq.roberpj.007.nsd
Hs.seq.roberpj.001.nnd  Hs.seq.roberpj.003.nni  Hs.seq.roberpj.005.nsd  Hs.seq.roberpj.007.nsi
Hs.seq.roberpj.001.nni  Hs.seq.roberpj.003.nsd  Hs.seq.roberpj.005.nsi  Hs.seq.roberpj.007.nsq
Hs.seq.roberpj.001.nsd  Hs.seq.roberpj.003.nsi  Hs.seq.roberpj.005.nsq  Hs.seq.roberpj.dbs
Hs.seq.roberpj.001.nsi  Hs.seq.roberpj.003.nsq  Hs.seq.roberpj.006.nhr  Hs.seq.roberpj.mbf
Hs.seq.roberpj.001.nsq  Hs.seq.roberpj.004.nhr  Hs.seq.roberpj.006.nin  Hs.seq.roberpj.nal
Hs.seq.roberpj.002.nhr  Hs.seq.roberpj.004.nin  Hs.seq.roberpj.006.nnd
Submit a couple of short jobs with a 15m time limit. If all goes well, output results will be written to biobrew.out and the execution times will appear in the corresponding ofile%J files, where %J is the job number as per usual. In these examples we will submit the jobs from a saw-specific directory, which can be copied from test2 as follows:
[roberpj@saw-login1:~] module load mpiblast/1.6.0
[roberpj@saw-login1:~] cd /work/roberpj/samples/mpiblast
[roberpj@saw-login1:/work/roberpj/samples/mpiblast] cp -a test2 test2.saw

[roberpj@saw-login1:/work/roberpj/samples/mpiblast/test2.saw] vi .ncbirc
[roberpj@saw-login1:/work/roberpj/samples/mpiblast/test2.saw] cat .ncbirc | grep BLASTMAT
BLASTMAT=/work/roberpj/samples/mpiblast/test2.saw
A) Submit a job with the time-profile option, choosing n=24 (8 fragments + 8 or greater):
[roberpj@saw-login1:/work/roberpj/samples/mpiblast/test2.saw] rm -f oTime*;
sqsub -r 15m -n 24 -q mpi -o ofile%J mpiblast --use-parallel-write --copy-via=mpi
 -d Hs.seq.roberpj -i il2ra.in -p blastn -o biobrew.out --time-profile=oTime
B) Submit a job with the debug option, this time choosing n=16 (8 fragments + 8 or greater):
[roberpj@saw-login1:/work/roberpj/samples/mpiblast/test2.saw] rm -f oLog*;
sqsub -r 15m -n 16 -q mpi -o ofile%J mpiblast --use-parallel-write --copy-via=none
 -d Hs.seq.$USER -i il2ra.in -p blastn -o biobrew.out --debug=oLog
Finally, compare /opt/sharcnet/mpiblast/1.6.0/examples/BIOBREW.out, computed previously with BLASTN 2.2.15 [Oct-15-2006], with your newly generated biobrew.out file to verify the results, and submit a ticket if there are any problems!
SUPPORTED PROGRAMS IN MPIBLAST
As described in http://www.mpiblast.org/Docs/FAQ , mpiblast supports the standard BLAST programs listed at http://www.ncbi.nlm.nih.gov/BLAST/blast_program.shtml , which are reproduced here for reference:
blastp:  Compares an amino acid query sequence against a protein sequence database.
blastn:  Compares a nucleotide query sequence against a nucleotide sequence database.
blastx:  Compares a nucleotide query sequence translated in all reading frames against a protein sequence database.
tblastn: Compares a protein query sequence against a nucleotide sequence database dynamically translated in all reading frames.
tblastx: Compares the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
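In practice only the -p argument (plus a suitably formatted database) changes when switching programs. The following is a hedged sketch of a protein search; the query file myprots.faa and database name nr_local are placeholders, and the database would first need to be formatted as protein with mpiformatdb ... -p T:

sqsub -r 15m -n 16 -q mpi --mpp=2g -o ofile%J mpiblast -d nr_local -i myprots.faa -p blastp -o myprots.out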
MPIBLAST BINARIES OPTIONS
[roberpj@orc-login1:/opt/sharcnet/mpiblast/1.6.0/bin] ./mpiblast -help
mpiBLAST requires the following options:
 -d [database]
 -i [query file]
 -p [blast program name]
[roberpj@orc-login1:/opt/sharcnet/mpiblast/1.6.0/bin] ./mpiformatdb --help
Executing: formatdb -
formatdb 2.2.20   arguments:

  -t  Title for database file [String]  Optional
  -i  Input file(s) for formatting [File In]  Optional
  -l  Logfile name: [File Out]  Optional
        default = formatdb.log
  -p  Type of file
         T - protein
         F - nucleotide [T/F]  Optional
        default = T
  -o  Parse options
         T - True: Parse SeqId and create indexes.
         F - False: Do not parse SeqId. Do not create indexes. [T/F]  Optional
        default = F
  -a  Input file is database in ASN.1 format (otherwise FASTA is expected)
         T - True, F - False. [T/F]  Optional
        default = F
  -b  ASN.1 database in binary mode
         T - binary, F - text mode. [T/F]  Optional
        default = F
  -e  Input is a Seq-entry [T/F]  Optional
        default = F
  -n  Base name for BLAST files [String]  Optional
  -v  Database volume size in millions of letters [Integer]  Optional
        default = 4000
  -s  Create indexes limited only to accessions - sparse [T/F]  Optional
        default = F
  -V  Verbose: check for non-unique string ids in the database [T/F]  Optional
        default = F
  -L  Create an alias file with this name
        use the gifile arg (below) if set to calculate db size
        use the BLAST db specified with -i (above) [File Out]  Optional
  -F  Gifile (file containing list of gi's) [File In]  Optional
  -B  Binary Gifile produced from the Gifile specified above [File Out]  Optional
  -T  Taxid file to set the taxonomy ids in ASN.1 deflines [File In]  Optional
  -N  Number of database volumes [Integer]  Optional
        default = 0
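As an illustration of the protein-related options above, a minimal sketch of formatting a protein FASTA file might look like the following. The file name protein.faa and the target directory /work/$USER/mpiblastprot are placeholders, not part of the shipped examples:

module load mpiblast/1.6.0
mkdir -p /work/$USER/mpiblastprot
mpiformatdb -N 8 -i protein.faa -o T -p T -n /work/$USER/mpiblastprot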
General Notes
Issue1: Jobs Fail to Start
If the following error message occurs, it may be necessary to stop all mpiblast jobs you are running on the cluster, clear out all the related files from /tmp, and then submit your job again. In the case of Example2 above the following steps resolved the problem, where the nodes used by the last job run were red[7-14]. Should there be any questions please open a problem ticket.
Purging Job Node Subset (redfin)
pdsh -w red[7-14] ls /tmp/Hs*      (confirm existence of file from previous runs)
pdsh -w red[7-14] rm -f /tmp/Hs*
pdsh -w red[7-14] ls /tmp/Hs*      (confirm all files from previous runs removed)

Purging Full Cluster (hound)
pdsh -w hnd[1-18] -f 4 rm -f /tmp/Hs*
pdsh -w hnd[1-18] -f 4 ls /tmp/Hs*
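If pdsh is not available on a particular system, an equivalent (if slower) cleanup can be sketched as a plain ssh loop over the node names from the failed job. This assumes password-less ssh to the compute nodes is permitted, which is not the case everywhere:

for n in $(seq 7 14); do
    ssh red$n 'rm -f /tmp/Hs*'      # remove leftover fragment copies on each node
done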
[roberpj@red-admin:/work/roberpj/samples/mpiblast/test2.red4] cat ofile489214.red-admin.redfin.sharcnet
8       0.577904        Bailing out with signal 11
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 8 in communicator MPI_COMM_WORLD
with errorcode 0.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
20      0.577719        Bailing out with signal 11
0       0.591128        Bailing out with signal 15
[red14:09194] [[36424,0],0]-[[36424,1],0] mca_oob_tcp_msg_recv: readv failed: Connection reset by peer (104)
1       0.591327        Bailing out with signal 15
etc
Issue2: Jobs Run for a While then Die
The solution here is to filter the input sequence file (a sketch of one possible filter is shown after the log excerpts below). For reasons not yet understood, the presence of repeat sections results in many thousands of WARNING and ERROR messages being rapidly written to the "sqsub -o ofile" output file, presumably as mpiblast ignores the offending sequences, before the job eventually dies after several hours, or possibly days.
# cat ofile1635556.saw-admin.saw.sharcnet | grep "WARNING\|ERROR" | wc -l
10560
# cat ofile1635556.saw-admin.saw.sharcnet
Selenocysteine (U) at position 60 replaced by X
Selenocysteine (U) at position 42 replaced by X
[blastall] WARNING: [000.000] NODE_84_length_162_cov_46.259258_1_192_-: SetUpBlastSearch failed.
[blastall] ERROR: [000.000] NODE_84_length_162_cov_46.259258_1_192_-: BLASTSetUpSearch: Unable to calculate Karlin-Altschul params, check query sequence
<<<< snipped out ~10000 similar WARNING and ERROR messages from this example >>>>
[blastall] WARNING: [000.000] NODE_65409_length_87_cov_2.367816_1_77_+: SetUpBlastSearch failed.
[blastall] ERROR: [000.000] NODE_65409_length_87_cov_2.367816_1_77_+: BLASTSetUpSearch: Unable to calculate Karlin-Altschul params, check query sequence
Selenocysteine (U) at position 61 replaced by X
Selenocysteine (U) at position 62 replaced by X
Selenocysteine (U) at position 34 replaced by X
Selenocysteine (U) at position 1058 replaced by X
--------------------------------------------------------------------------
mpirun noticed that process rank 52 with PID 30067 on node saw214 exited on signal 9 (Killed).
--------------------------------------------------------------------------
1618 17 651 62909.3 Bailing out with signal 15
14 3 62909.3 Bailing out with signal 15
19 15 62909.3 Bailing out with signal 1536 5 7 62909.312 62909.310 62909.3 Bailing out with signal 1547 21 62909.3 Bailing out with signal 159 62909.3 Bailing out with signal 15
45 62909.3 Bailing out with signal 1525 62909.3 Bailing out with signal 158 62909.3 Bailing out with signal 15
50 62909.3 Bailing out with signal 1522 62909.3 Bailing out with signal 15
11 62909.3 Bailing out with signal 15
48 62909.323 62909.3 Bailing out with signal 15
Bailing out with signal 15
46 62909.3 Bailing out with signal 15
etc
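One possible filtering approach is sketched below: it simply drops FASTA query records shorter than a chosen length cutoff, which removes the kind of tiny sequences that trigger the Karlin-Altschul errors above. This is only an illustration under assumed file names (queries.in and queries.filtered.in are placeholders), not the exact filter used for the runs above, and the cutoff should be chosen to suit your data:

awk 'BEGIN { RS=">"; ORS="" }
     NR > 1 {
         n = split($0, lines, "\n"); seq = ""
         for (i = 2; i <= n; i++) seq = seq lines[i]     # concatenate sequence lines
         if (length(seq) >= 100) print ">" $0            # keep records of 100+ bases
     }' queries.in > queries.filtered.in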
References
o MPIBLAST Homepage
http://www.mpiblast.org/
o MPIBLAST Version History
http://www.mpiblast.org/Downloads/Version-History