Revision as of 18:48, 17 June 2014
LS-DYNA |
---|
Description: Suite of programs for transient dynamic finite element analysis |
SHARCNET Package information: see LS-DYNA software page in web portal |
Full list of SHARCNET supported software |
Introduction
Before a research group can use LS-DYNA on SHARCNET, a license must be purchased directly from LSTC for the SHARCNET license server. Alternatively, if a research group resides at an institution that has SHARCNET computing hardware (mac, uwo, guelph, waterloo), it may be possible to use a pre-existing site license hosted on an accessible institutional license server. To access and use this software you must be a member of the lsdyna group.
Version Selection
License
Once a license configuration is established, the research group will be given a 5-digit port number. Insert this value in place of 'Port-Number' in the appropriate departmental export statement, and run the statement before loading the module file:
o UofT Mechanical Engineering Dept
export LSTC_LICENSE_SERVER='Port-Number'@license1.uwo.sharcnet
o McGill Mechanical Engineering Department
export LSTC_LICENSE_SERVER='Port-Number'@license2.uwo.sharcnet
o UW Mechanical and Mechatronics Engineering Dept
export LSTC_LICENSE_SERVER='Port-Number'@license3.uwo.sharcnet
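The export statements above can be wrapped in a small check so that a mistyped port is caught before a job is submitted. The following is only a sketch: the function name lstc_server_string and the port 12345 are made up for illustration and are not part of the sharcnet environment; substitute the port and license host issued to your group.

```shell
#!/bin/bash
# Hypothetical helper (not provided by sharcnet): build the value for
# LSTC_LICENSE_SERVER from a port number and a license server host.
lstc_server_string() {
    local port="$1" host="$2"
    # The page states the port is a 5-digit number; reject anything else.
    case "$port" in
        [0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "error: port must be exactly 5 digits" >&2; return 1 ;;
    esac
    printf '%s@%s\n' "$port" "$host"
}

# Example with a made-up port number.
export LSTC_LICENSE_SERVER="$(lstc_server_string 12345 license3.uwo.sharcnet)"
echo "$LSTC_LICENSE_SERVER"
```

Run as shown, the last line prints the port@host string that the module and solver will read from the environment.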
Module
The next step is to load the SHARCNET lsdyna module for the version you want to use. First check which modules are available by running the module avail command, then load one of the modules as shown:
[roberpj@orc-login2:~] module avail lsdyna
-------------------------------------- /opt/sharcnet/modules ---------------------------------------
lsdyna/hyb/r711.88920   lsdyna/ls971mpp6.1.1   lsdyna/ls971smp6.1.1   lsdyna/mpp/r711.88920
lsdyna/ls971mpp6.0.0    lsdyna/ls971smp6.0.0   lsdyna/ls980mppB1      lsdyna/smp/r711.88920
lsdyna/ls971mpp6.1.0    lsdyna/ls971smp6.1.0   lsdyna/ls980smpB1
r711 versions
o For mpi jobs on hound, orca and saw do:
module load lsdyna/mpp/r711.88920
o For serial or threaded jobs on hound, orca and saw do:
module load lsdyna/smp/r711.88920
o For hybrid mpi jobs submitted into the threaded queue on hound, orca and saw do the following. Note that a minimum of 128 cores is required and that no test jobs have yet been run to verify this approach works. The module is therefore unsupported at present and should not be used:
module load lsdyna/hyb/r711.88920
ls971 versions
o For serial or threaded jobs on hound, orca or saw:
module load lsdyna/ls971smp6.1.1
o For mpi jobs on saw:
module unload intel openmpi lsdyna
module load intel/11.1.069 openmpi/intel/1.5.4 lsdyna/ls971mpp6.1.1
o For mpi jobs on orca and hound do:
module unload intel openmpi lsdyna
module load intel/11.1.069 openmpi/intel/1.5.5 lsdyna/ls971mpp6.1.1
ls980 versions
o For serial or threaded jobs, using version ls980 (and noting for this beta module the restart capability is currently broken) do the following:
module load lsdyna/ls980smpB1
o For mpi jobs, using version ls980 (and noting for this beta module the restart capability is currently broken) do the following:
module unload intel openmpi lsdyna
module load intel/12.1.3 openmpi/intel/1.4.5 lsdyna/ls980mppB1
Job Submission
To run the single or double precision solvers specify lsdyna_s or lsdyna_d respectively:
o SUBMIT 1CPU SERIAL JOB
sqsub -r 1h -q serial -o ofile.%J --mpp=2G lsdyna_d i=airbag.deploy.k ncpu=1
o SUBMIT 4CPU SMP JOB
sqsub -r 1h -q threaded -n 4 -o ofile.%J --mpp=2G lsdyna_d i=airbag.deploy.k ncpu=4
o SUBMIT 4CPU MPI JOB
Hound or Saw:
sqsub -r 1h -q mpi -o ofile.%J -n 4 --mpp=2G lsdyna_d i=airbag.deploy.k
Orca:
sqsub -r 1h -q mpi -o ofile.%J -n 4 --mpp=2G -f opteron lsdyna_d i=airbag.deploy.k
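The submission lines above differ only in the queue, the core count and the optional -f opteron flag, so they can be generated by a small dry-run helper. This is a sketch under assumptions: lsdyna_sqsub_cmd is a hypothetical function name, it only prints a command line and does not submit anything, and it uses no sqsub flags beyond those shown above.

```shell
#!/bin/bash
# Hypothetical dry-run helper: prints an sqsub command line without running it.
# Usage: lsdyna_sqsub_cmd QUEUE NCPU SOLVER INPUT [EXTRA_SQSUB_FLAGS]
lsdyna_sqsub_cmd() {
    local queue="$1" ncpu="$2" solver="$3" input="$4" extra="${5:-}"
    local cmd="sqsub -r 1h -q $queue"
    # The serial example on this page omits -n; threaded and mpi pass it.
    if [ "$queue" != "serial" ]; then
        cmd="$cmd -n $ncpu"
    fi
    cmd="$cmd -o ofile.%J --mpp=2G"
    # Extra flags, for example "-f opteron" for mpi jobs on orca.
    if [ -n "$extra" ]; then
        cmd="$cmd $extra"
    fi
    cmd="$cmd $solver i=$input"
    # Serial and threaded examples pass ncpu= to the solver; mpi ones do not.
    if [ "$queue" != "mpi" ]; then
        cmd="$cmd ncpu=$ncpu"
    fi
    printf '%s\n' "$cmd"
}

lsdyna_sqsub_cmd serial 1 lsdyna_d airbag.deploy.k
lsdyna_sqsub_cmd threaded 4 lsdyna_d airbag.deploy.k
lsdyna_sqsub_cmd mpi 4 lsdyna_d airbag.deploy.k "-f opteron"
```

Review the printed line before pasting it into the shell; the helper deliberately never calls sqsub itself.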
Example Job
STEP1) The following shows sqsub submission of the airbag example to the mpi queue. It is recommended to first edit airbag.deploy.k and change endtim to 3.000E-00 so the job runs long enough to perform the restart in steps 2 to 4 below:
cp -a /opt/sharcnet/lsdyna/ls971smp6.1.1/examples /work/$USER/test-lsdyna
[to grab /opt/sharcnet/lsdyna/ls971smp6.1.1/examples/misc/airbag]
cd /work/$USER/test-lsdyna/misc/airbag
gunzip airbag.deploy.k.gz
module unload intel openmpi lsdyna
SAW: module load intel/11.1.069 openmpi/intel/1.5.4 lsdyna/ls971mpp6.1.1
ORCA or HOUND: module load intel/11.1.069 openmpi/intel/1.5.5 lsdyna/ls971mpp6.1.1
export LSTC_LICENSE_SERVER=#####@license#.uwo.sharcnet
cp airbag.deploy.k airbag.deploy.restart.k
SAW or HOUND: sqsub -r 1h -q mpi --mpp=2G -n 4 -o ofile.%J lsdyna_d i=airbag.deploy.k
ORCA: sqsub -r 10m -f opteron -q mpi --mpp=2G -n 4 -o ofile.%J lsdyna_s i=airbag.deploy.k
STEP2) With the job still running, use the echo command as follows to create a file called "D3KIL" that will trigger generation of restart files at which point the file D3KIL itself will be erased. Do this once a day if data loss is critical to you OR once or twice just before the sqsub -r time limit is reached. Further information can be found here http://www.dynasupport.com/tutorial/ls-dyna-users-guide/sense-switch-control:
echo "sw3" > D3KIL
sqkill job#
STEP3) Before the job can be restarted, the following two lines must be added to the airbag.deploy.restart.k file:
*CHANGE_CURVE_DEFINITION
1
STEP4) Now resubmit the job as follows using "r=" to specify the restart file:
SAW or HOUND: sqsub -r 1h -q mpi --mpp=2G -n 4 -o ofile.%J lsdyna_d i=airbag.deploy.restart.k r=d3dump01
ORCA: sqsub -r 1h -f opteron -q mpi --mpp=2G -n 4 -o ofile.%J lsdyna_d i=airbag.deploy.restart.k r=d3dump01
General Notes
Memory Issues
A minimum of mpp=2G is recommended, even when the memory requirement of the job itself suggests much less is needed. For instance, setting mpp=1G for the airbag test job above results in the following error when the job runs in the queue:
--------------------------------------------------------------------------
An attempt to set processor affinity has failed - please check to
ensure that your system supports such functionality. If so, then
this is probably something that should be reported to the OMPI developers.
--------------------------------------------------------------------------
To get an idea of the amount of memory a job used, run grep on the output file:
[roberpj@orc-login1:~/samples/lsdyna/airbag] cat ofile.2063302.orc-admin2.orca.sharcnet | grep Memory
| Distributed Memory Parallel |
Memory size from command line: 500000, 500000
Memory for the head node
Memory installed (MB) : 32237
Memory free (MB) : 11259
Memory required (MB) : 0
Memory required to process keyword : 458120
Memory required for decomposition : 458120
Memory required to begin solution (memory= 458120 memory2= 230721)
Max. Memory reqd for implicit sol: max used 0
Max. Memory reqd for implicit sol: incore 0
Max. Memory reqd for implicit sol: oocore 0
Memory required to complete solution (memory= 458120 memory2= 230721)
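The memory= and memory2= figures are not in MB. Assuming they are counted in words, as the conversion note in the References suggests (500000000w=3.73gb, which works out to 8 bytes per word), a rough conversion to GB can be sketched as follows. The function name words_to_gb is hypothetical, and the 8-byte default is an assumption taken from that note; pass 4 as the second argument if your solver uses 4-byte words.

```shell
#!/bin/bash
# Hypothetical helper: convert a memory figure in words to gigabytes.
# Assumption: 8 bytes per word by default, which reproduces the figures
# quoted in the References (500000000w=3.73gb, 800000000w=5.96gb).
words_to_gb() {
    local words="$1" bytes_per_word="${2:-8}"
    awk -v w="$words" -v b="$bytes_per_word" \
        'BEGIN { printf "%.2f\n", w * b / (1024 * 1024 * 1024) }'
}

words_to_gb 500000000      # prints 3.73
words_to_gb 800000000      # prints 5.96
```

By this estimate the 458120 words reported for the airbag job amount to only a few MB, which is consistent with the remark above that the mpp=2G recommendation is not driven by the job's own memory requirement.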
Version Revision
Again run grep on the output file to extract the major and minor revision:
[roberpj@orc-login1:~/samples/lsdyna/airbag] cat ofile.2063302.orc-admin2.orca.sharcnet | grep 'Revision\|Version'
| Version : mpp s R6.1.1 Date: 01/02/2013 |
| Revision: 78769 Time: 07:43:30 |
| SVN Version: 80542 |
Using lstc_qrun
The available options for lstc_qrun are given by:
[roberpj@hnd19:~] lstc_qrun -help
o PRINT HELP AND QUIT: lstc_qrun -help
o PRINT VERSION INFO AND QUIT: lstc_qrun -v
o PRINT SERVER LIST: lstc_qrun -i
o PRINT JOB INFO: lstc_qrun [-s server]
o PRINT LICENSE INFORMATION: lstc_qrun [-s server] -r
o PRINT VERBOSE LICENSE INFORMATION: lstc_qrun [-s server] -R
To check the status of your job in the queue, use the sqjobs command. On x86_64 systems you can use lstc_qrun to check whether any of your jobs are queued on the license server. It is possible that sqsub will start your job but that the job will then sit idle until enough licenses are available on the license server; the lstc_qrun command will reveal this:
[roberpj@hnd19:~] lstc_qrun
Defaulting to server 1 specified by LSTC_LICENSE_SERVER variable
Running Programs
User      Host                      Program       Started            # procs
-----------------------------------------------------------------------------
dgierczy  20277@saw61.saw.sharcn    LS-DYNA_971   Mon Apr 8 17:44    1
dgierczy  8570@saw32.saw.sharcn     LS-DYNA_971   Mon Apr 8 17:44    1
dscronin  25486@hnd6                MPPDYNA_971   Tue Apr 9 20:19    6
dscronin  14897@hnd18               MPPDYNA_971   Tue Apr 9 21:48    6
dscronin  14971@hnd18               MPPDYNA_971   Tue Apr 9 21:48    6
dscronin  15046@hnd18               MPPDYNA_971   Tue Apr 9 21:48    6
dscronin  31237@hnd16               MPPDYNA_971   Tue Apr 9 21:53    6
dscronin  31313@hnd16               MPPDYNA_971   Tue Apr 9 21:54    6
dscronin  6396@hnd15                MPPDYNA_971   Tue Apr 9 21:54    6
csharson  28890@saw175.saw.sharc    MPPDYNA_971   Wed Apr 10 16:48   6
roberpj   11257
At the time of this writing UWO is running on a 4cpu demo license with details:
[roberpj@hnd19:~] lstc_qrun -R
Defaulting to server 1 specified by LSTC_LICENSE_SERVER variable
**** LICENSE INFORMATION ****
PROGRAM          EXPIRATION CPUS  USED   FREE    MAX | QUEUE
---------------- ---------- ----- ------ ------ | -----
LS-DYNA_971      12/31/2013          -    966   1024 | 0
 dgierczy 20277@saw61.saw.sharcnet 1
 dgierczy 8570@saw32.saw.sharcnet 1
MPPDYNA_971      12/31/2013          -    966   1024 | 0
 dscronin 25486@hnd6 6
 dscronin 14897@hnd18 6
 dscronin 14971@hnd18 6
 dscronin 15046@hnd18 6
 dscronin 31237@hnd16 6
 dscronin 31313@hnd16 6
 dscronin 6396@hnd15 6
 csharson 28890@saw175.saw.sharcnet 6
 roberpj 11257@hnd15 8
LICENSE GROUP                       58    966   1024 | 0
PROGRAM          EXPIRATION CPUS  USED   FREE    MAX | QUEUE
---------------- ---------- ----- ------ ------ | -----
LS-OPT           12/31/2013          0   1024   1024 | 0
LICENSE GROUP                        0   1024   1024 | 0
Legacy Instructions
The following binaries remain available on saw and orca for backward compatibility testing:
[roberpj@orc-login2:~] cd /opt/sharcnet/local/lsdyna
[roberpj@orc129:/opt/sharcnet/local/lsdyna] ls ls971*
ls971_d_R3_1    ls971_d_R4_2_1  ls971_s_R3_1    ls971_s_R4_2_1  ls971_s_R5_1_1
ls971_d_R4_2_0  ls971_d_R5_0    ls971_s_R4_2_0  ls971_s_R5_0
There are currently no sharcnet modules for these versions, hence jobs should be submitted as follows:
module load lsdyna
export PATH=/opt/sharcnet/local/lsdyna:$PATH
export LSTC_LICENSE_SERVER=XXXXX@license3.uwo.sharcnet
cp /opt/sharcnet/local/lsdyna/examples/airbag.deploy.k airbag.deploy.k
SERIAL JOB: sqsub -q serial -r 1d -o ofile.%J ls971_d_R4_2_1 i=airbag.deploy.k
THREADED JOB: sqsub -q threaded -n 4 -r 1d -o ofile.%J ls971_d_R4_2_1 ncpu=4 para=2 i=airbag.deploy.k
Note: the ls971_s_R3_1 and ls971_d_R3_1 binaries do not work; a fix is being sought.
References
o LSTC LS-DYNA Homepage
http://www.lstc.com/lsdyna.htm
o LSTC LS-DYNA Support (Tutorials, HowTos, Faq, Manuals, Release Notes, News, Links)
http://www.dynasupport.com/
o LS-DYNA and d3VIEW Blog
http://blog2.d3view.com/a-few-words-on-memory-settings-in-ls-dyna/
o Convert Words to GB (500000000w=3.73gb, 800000000w=5.96gb)
http://deviceanalytics.com/memcalc.php
o LS-DYNA Support Environment variables
http://www.dynasupport.com/howtos/general/environment-variables
o LS-DYNA Release Notes
http://www.dynasupport.com/release-notes