Revision as of 22:58, 29 January 2015 by Roberpj (Talk | contribs) (For mpi jobs on hound)

LSDYNA
Description: Suite of programs for transient dynamic finite element analysis
SHARCNET Package information: see LSDYNA software page in web portal
Full list of SHARCNET supported software


Introduction

Before a research group can use LSDYNA on sharcnet, a license must be purchased directly from LSTC for the sharcnet license server. Alternatively, if a research group resides at an institution that has sharcnet computing hardware (mac, uwo, guelph, waterloo), it may be possible to use a pre-existing site license hosted on an accessible institutional license server. To access and use this software you must be a member of the lsdyna group.

Version Selection

License

Once a license configuration is established, the research group will be given a five-digit port number. Substitute this value for 'Port-Number' in the appropriate departmental export statement below, and run the statement before loading the module file:

o UofT Mechanical Engineering Dept

export LSTC_LICENSE_SERVER='Port-Number'@license1.uwo.sharcnet

o McGill Mechanical Engineering Department

export LSTC_LICENSE_SERVER='Port-Number'@license2.uwo.sharcnet

o UW Mechanical and Mechatronics Engineering Dept

export LSTC_LICENSE_SERVER='Port-Number'@license3.uwo.sharcnet
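
As a minimal sketch of the export step, the snippet below assumes a hypothetical port number 12345 and the license1 server; substitute the values issued to your group. The check_license_server helper is an illustration only, not a sharcnet or LSTC tool. It merely confirms that the value has the five-digit port@host shape before the module is loaded.

```shell
# Hypothetical values: replace with the port and server assigned to your group.
PORT=12345
export LSTC_LICENSE_SERVER="${PORT}@license1.uwo.sharcnet"

# Illustrative helper (not a sharcnet tool): succeed only if the value
# looks like a five-digit port followed by @host.
check_license_server() {
    case "$1" in
        [0-9][0-9][0-9][0-9][0-9]@?*) return 0 ;;
        *) return 1 ;;
    esac
}

if check_license_server "$LSTC_LICENSE_SERVER"; then
    echo "LSTC_LICENSE_SERVER=$LSTC_LICENSE_SERVER"
else
    echo "unexpected LSTC_LICENSE_SERVER value: $LSTC_LICENSE_SERVER" >&2
fi
```

A check like this catches the common mistake of leaving the literal 'Port-Number' placeholder in the variable.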

Module

The next step is to load the sharcnet lsdyna module for the version you want to use. First check which modules are available by running the module avail command, then load one of the modules as shown:

[roberpj@saw-login2:~] module avail lsdyna
lsdyna/mpp/r611.79036   lsdyna/mpp/r611.80542
lsdyna/smp/r711.88920   lsdyna/mpp/r711.88920  lsdyna/hyb/r711.88920
lsdyna/ls980smpB1         lsdyna/ls980mppB1        lsdyna/ls980mppB2

r7.1.1 versions

o For mpi jobs:

module load lsdyna/mpp/r711.88920

o For serial or threaded jobs:

module load lsdyna/smp/r711.88920

o Note that hybrid threaded/mpi jobs are currently not supported. A minimum of 128 cores is expected to be required before using this module once usage instructions are provided. The module is provided for internal testing purposes only:

module load lsdyna/hyb/r711.88920
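
The choice between the two supported r7.1.1 modules can be sketched as a small lookup. The function name lsdyna_r711_module is hypothetical, and the hybrid module is deliberately left out because it is not supported for production use. The sketch prints the module load command rather than running it, so it can be tried on any machine.

```shell
# Illustrative helper: map a job type to the r7.1.1 module named above.
lsdyna_r711_module() {
    case "$1" in
        mpi)             echo "lsdyna/mpp/r711.88920" ;;
        serial|threaded) echo "lsdyna/smp/r711.88920" ;;
        *) echo "unsupported job type: $1" >&2; return 1 ;;
    esac
}

# Print the command rather than running it, so the sketch works anywhere.
echo "module load $(lsdyna_r711_module mpi)"
```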

r6.1.1 versions

Note that r611.79036 provides both lsdyna_s and lsdyna_d, while r611.80542 provides only lsdyna_s.

For mpi jobs on saw

module unload intel openmpi lsdyna

module load intel/11.1.069 openmpi/intel/1.5.4 lsdyna/mpp/r611.79036
or
module load intel/11.1.069 openmpi/intel/1.6.4 lsdyna/mpp/r611.80542

For mpi jobs on orca

module unload intel openmpi lsdyna

module load intel/11.1.069 openmpi/intel/1.5.5 lsdyna/mpp/r611.79036
or
module load intel/11.1.069 openmpi/intel/1.6.4 lsdyna/mpp/r611.80542

For mpi jobs on hound

module unload intel openmpi lsdyna

module load intel/11.1.069 openmpi/intel/1.5.5 lsdyna/mpp/r611.79036
or
module load intel/11.1.069 openmpi/intel/1.6.4 lsdyna/mpp/r611.80542
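
Since only the openmpi version differs between clusters for lsdyna/mpp/r611.79036, the per-cluster commands above can be summarised in one place. This is a sketch under the assumption that you set the cluster name yourself; the function name r611_79036_modules is hypothetical, and the commands are printed rather than executed.

```shell
# Illustrative helper: print the module list for lsdyna/mpp/r611.79036
# on a given cluster, following the per-cluster commands above.
r611_79036_modules() {
    case "$1" in
        saw)        ompi="1.5.4" ;;
        orca|hound) ompi="1.5.5" ;;
        *) echo "unknown cluster: $1" >&2; return 1 ;;
    esac
    echo "intel/11.1.069 openmpi/intel/$ompi lsdyna/mpp/r611.79036"
}

cluster=saw   # assumption: set this to the cluster you are logged in to
echo "module unload intel openmpi lsdyna"
echo "module load $(r611_79036_modules "$cluster")"
```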

ls980 versions

Restart capability for the beta B1 versions described here is broken.

o For serial or threaded jobs:

module load lsdyna/ls980smpB1 

o For mpi jobs using openmpi 1.4.5:

module unload intel openmpi lsdyna
module load intel/12.1.3 openmpi/intel/1.4.5 lsdyna/ls980mppB1

o For mpi jobs using the default environment openmpi 1.6.2:

module load lsdyna/ls980mppB2

Job Submission

To run the single or double precision solvers specify lsdyna_s or lsdyna_d respectively:

o SUBMIT 1CPU SERIAL JOB

sqsub -r 1h -q serial -o ofile.%J --mpp=2G lsdyna_d i=airbag.deploy.k ncpu=1 

o SUBMIT 4CPU SMP JOB

sqsub -r 1h -q threaded -n 4 -o ofile.%J --mpp=2G lsdyna_d i=airbag.deploy.k ncpu=4

o SUBMIT 4CPU MPI JOB

Hound or Saw:

sqsub -r 1h -q mpi -o ofile.%J -n 4 --mpp=2G lsdyna_d i=airbag.deploy.k

Orca:

sqsub -r 1h -q mpi -o ofile.%J -n 4 --mpp=2G -f xeon lsdyna_s i=airbag.deploy.k    (xeon nodes)
or
sqsub -r 1h -q mpi -o ofile.%J -n 4 --mpp=2G -f opteron lsdyna_d i=airbag.deploy.k  (opteron nodes)
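
The three submission patterns above differ only in the queue, the core count and whether ncpu= is passed to the solver. The following sketch assembles the command line for each case and prints it instead of submitting. The function name build_sqsub is hypothetical, the 1h run time and 2G memory are simply the values used in the examples above, and the Orca-specific -f option is left out for brevity.

```shell
# Illustrative helper: build (but do not run) an sqsub command line.
# Arguments: queue (serial|threaded|mpi), core count, solver, input deck.
build_sqsub() {
    queue=$1; ncores=$2; solver=$3; deck=$4
    common="sqsub -r 1h -q $queue -o ofile.%J --mpp=2G"
    case "$queue" in
        serial)   echo "$common $solver i=$deck ncpu=1" ;;
        threaded) echo "$common -n $ncores $solver i=$deck ncpu=$ncores" ;;
        mpi)      echo "$common -n $ncores $solver i=$deck" ;;
        *) echo "unknown queue: $queue" >&2; return 1 ;;
    esac
}

build_sqsub threaded 4 lsdyna_d airbag.deploy.k
```

Printing the command first makes it easy to review the options before pasting the line into a login shell.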

Example Job

STEP1) The following shows sqsub submission of the airbag example to the mpi queue. It is recommended to first edit airbag.deploy.k and change endtim to 3.000E-00 so the job runs long enough to perform the restart in steps 2 and 3 below:

cp -a /opt/sharcnet/lsdyna/r611.79036/examples /work/$USER/test-lsdyna
cd /work/$USER/test-lsdyna/misc/airbag
gunzip airbag.deploy.k.gz
module unload intel openmpi lsdyna
SAW: module load intel/11.1.069 openmpi/intel/1.5.4 lsdyna/mpp/r611.79036
ORCA or HOUND: module load intel/11.1.069 openmpi/intel/1.5.5 lsdyna/mpp/r611.79036
cp airbag.deploy.k airbag.deploy.restart.k
export LSTC_LICENSE_SERVER=XXXXX@licenseY.uwo.sharcnet
SAW or HOUND:
  sqsub -r 1h -q mpi --mpp=2G -n 4 -o ofile.%J lsdyna_d i=airbag.deploy.k
ORCA:
  sqsub -r 10m -f opteron -q mpi --mpp=2G -n 4 -o ofile.%J  lsdyna_s i=airbag.deploy.k

STEP2) With the job still running, use the echo command as follows to create a file called "D3KIL". This triggers generation of restart files, at which point D3KIL itself is erased. Do this once a day if data loss is critical to you, or once or twice just before the sqsub -r time limit is reached. Further information can be found at http://www.dynasupport.com/tutorial/ls-dyna-users-guide/sense-switch-control

echo "sw3" > D3KIL
sqkill job#
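
Because the job should only be killed after the switch has been picked up, the two commands above can be wrapped so that the kill waits for D3KIL to disappear. This is a sketch only: the helper names are hypothetical, the one-second polling interval is an arbitrary choice, and it relies on the behaviour described above that D3KIL is erased once the restart files are generated.

```shell
# Illustrative helpers (not part of LS-DYNA or the sharcnet tools).

# Write the sw3 sense switch into the job's working directory.
request_restart_dump() {
    echo "sw3" > "$1/D3KIL"
}

# Poll until D3KIL has been erased, i.e. the switch was picked up.
# Gives up after the given number of one-second tries.
wait_for_pickup() {
    dir=$1; tries=$2
    while [ -e "$dir/D3KIL" ]; do
        tries=$((tries - 1))
        [ "$tries" -le 0 ] && return 1
        sleep 1
    done
    return 0
}

# Typical use, run from the job directory while the job is active:
#   request_restart_dump . && wait_for_pickup . 600 && sqkill job#
```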

STEP3) Before the job can be restarted, the following two lines must be added to the airbag.deploy.restart.k file:

*CHANGE_CURVE_DEFINITION
         1

STEP4) Now resubmit the job as follows using "r=" to specify the restart file:

SAW or HOUND:
  sqsub -r 1h -q mpi --mpp=2G -n 4 -o ofile.%J lsdyna_d i=airbag.deploy.restart.k r=d3dump01
ORCA:
  sqsub -r 1h -f opteron -q mpi --mpp=2G -n 4 -o ofile.%J lsdyna_d i=airbag.deploy.restart.k r=d3dump01

General Notes

Affinity Issues

It was reported on Jan 14, 2015 that using openmpi/intel/1.5.5 with lsdyna/mpp/r611.80542 can result in affinity errors on orca. The instructions in https://www.sharcnet.ca/help/index.php/LSDYNA#For_mpi_jobs_on_orca have therefore been revised to recommend using openmpi/intel/1.5.4 instead. Because this is also enforced by the module load command, the module switch command must be used to move back to 1.5.5 as follows:

[roberpj@orc-login2:~] module unload intel openmpi lsdyna
[roberpj@orc-login2:~] module load intel/11.1.069 openmpi/intel/1.5.4 lsdyna/mpp/r611.80542
[roberpj@orc-login2:~] module switch openmpi/intel/1.5.4 openmpi/intel/1.5.5

Memory Issues

A minimum of mpp=2G is recommended, although the memory requirement of a job may suggest that much less is needed. For instance, setting mpp=1G for the airbag test job above will result in the following error when the job runs in the queue:

--------------------------------------------------------------------------
An attempt to set processor affinity has failed - please check to
ensure that your system supports such functionality. If so, then
this is probably something that should be reported to the OMPI developers.
--------------------------------------------------------------------------

To get an idea of the amount of memory a job used, run grep on the output file:

[roberpj@orc-login1:~/samples/lsdyna/airbag]  cat  ofile.2063302.orc-admin2.orca.sharcnet | grep Memory
     |    Distributed Memory Parallel                  |
 Memory size from command line:      500000,       500000
 Memory for the head node
 Memory installed (MB)        :        32237
 Memory free (MB)             :        11259
 Memory required (MB)         :            0
 Memory required to process keyword     :       458120
 Memory required for decomposition      :       458120
 Memory required to begin solution (memory=      458120 memory2=      230721)
 Max. Memory reqd for implicit sol: max used               0
 Max. Memory reqd for implicit sol: incore                 0
 Max. Memory reqd for implicit sol: oocore                 0
 Memory required to complete solution (memory=      458120 memory2=      230721)
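
The memory figures in this output are counted in words rather than bytes. As a rough sketch, the conversion below assumes 8 bytes per word for the double precision solver and 4 bytes per word for single precision; treat both sizes as assumptions to verify against the LSTC documentation. The function name words_to_mb is hypothetical.

```shell
# Convert an LS-DYNA memory figure in words to megabytes.
# Assumption: 8 bytes per word (double precision); pass 4 for single precision.
words_to_mb() {
    awk -v w="$1" -v b="${2:-8}" 'BEGIN { printf "%.2f\n", w * b / 1048576 }'
}

words_to_mb 500000000      # large memory= setting, double precision
words_to_mb 458120 4       # figure reported above, single precision
```

Under these assumptions 500000000 words comes to 3814.70 MB, or about 3.73 GB, while the 458120 words reported above amount to only a few MB.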

Version Revision

Again run grep on the output file to extract the major and minor revision:

[roberpj@orc-login1:~/samples/lsdyna/airbag]  cat  ofile.2063302.orc-admin2.orca.sharcnet | grep 'Revision\|Version'
     |  Version : mpp s R6.1.1    Date: 01/02/2013     |
     |  Revision: 78769           Time: 07:43:30       |
     |  SVN Version: 80542                             |
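
When only the number is needed, for example to compare against a module name such as r611.80542, the SVN version can be extracted with sed. This is a minimal sketch; the function name svn_version is hypothetical and the sample file simply reproduces the three banner lines shown above.

```shell
# Illustrative helper: pull the SVN version number out of an output file.
svn_version() {
    sed -n 's/.*SVN Version: *\([0-9][0-9]*\).*/\1/p' "$1"
}

# Demonstrate on a temporary copy of the banner lines shown above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
     |  Version : mpp s R6.1.1    Date: 01/02/2013     |
     |  Revision: 78769           Time: 07:43:30       |
     |  SVN Version: 80542                             |
EOF
svn_version "$sample"
rm -f "$sample"
```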

Using lstc_qrun

The available options for lstc_qrun are given by:

[roberpj@hnd19:~]  lstc_qrun -help
  o PRINT HELP AND QUIT:                lstc_qrun -help
  o PRINT VERSION INFO AND QUIT:        lstc_qrun -v
  o PRINT SERVER LIST:                  lstc_qrun -i
  o PRINT JOB INFO:                     lstc_qrun [-s server]
  o PRINT LICENSE INFORMATION:          lstc_qrun [-s server] -r
  o PRINT VERBOSE LICENSE INFORMATION:  lstc_qrun [-s server] -R

To check the status of your job in the queue, use the sqjobs command. On x86_64 systems, use lstc_qrun to check whether any of your jobs are queued on the license server. It is possible that sqsub will start your job but that it will sit idle until enough licenses are available on the license server; the lstc_qrun command will reveal this:

[roberpj@hnd19:~] lstc_qrun
Defaulting to server 1 specified by LSTC_LICENSE_SERVER variable
                     Running Programs
    User             Host          Program              Started       # procs
-----------------------------------------------------------------------------
dgierczy    20277@saw61.saw.sharcn LS-DYNA_971      Mon Apr  8 17:44     1
dgierczy     8570@saw32.saw.sharcn LS-DYNA_971      Mon Apr  8 17:44     1
dscronin    25486@hnd6             MPPDYNA_971      Tue Apr  9 20:19     6
dscronin    14897@hnd18            MPPDYNA_971      Tue Apr  9 21:48     6
dscronin    14971@hnd18            MPPDYNA_971      Tue Apr  9 21:48     6
dscronin    15046@hnd18            MPPDYNA_971      Tue Apr  9 21:48     6
dscronin    31237@hnd16            MPPDYNA_971      Tue Apr  9 21:53     6
dscronin    31313@hnd16            MPPDYNA_971      Tue Apr  9 21:54     6
dscronin     6396@hnd15            MPPDYNA_971      Tue Apr  9 21:54     6
csharson    28890@saw175.saw.sharc MPPDYNA_971      Wed Apr 10 16:48     6
 roberpj    11257
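
To see how many license processes a single user holds, the "# procs" column of this listing can be totalled with awk. The sketch below is illustrative only: the function name procs_for_user is hypothetical, and it assumes the column layout shown above, where the second field has the form pid@host and the last field is the process count.

```shell
# Illustrative helper: total the "# procs" column for one user.
# Reads lstc_qrun output on stdin; only rows whose second field looks
# like pid@host are counted, so headers and truncated rows are skipped.
procs_for_user() {
    awk -v user="$1" '$1 == user && $2 ~ /@/ { total += $NF } END { print total + 0 }'
}

# Sample rows copied from the listing above; on a login node you could
# pipe the live output instead: lstc_qrun | procs_for_user "$USER"
procs_for_user dscronin <<'EOF'
dgierczy    20277@saw61.saw.sharcn LS-DYNA_971      Mon Apr  8 17:44     1
dscronin    25486@hnd6             MPPDYNA_971      Tue Apr  9 20:19     6
dscronin    14897@hnd18            MPPDYNA_971      Tue Apr  9 21:48     6
EOF
```

For the three sample rows this prints 12, the sum of the two dscronin entries.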

At the time of this writing UWO is running on a 4cpu demo license with details:

[roberpj@hnd19:~] lstc_qrun -R
Defaulting to server 1 specified by LSTC_LICENSE_SERVER variable
**** LICENSE INFORMATION ****
PROGRAM          EXPIRATION CPUS  USED   FREE    MAX | QUEUE
---------------- ----------      ----- ------ ------ | -----
LS-DYNA_971      12/31/2013          -    966   1024 |     0
 dgierczy   20277@saw61.saw.sharcnet   1
 dgierczy    8570@saw32.saw.sharcnet   1
MPPDYNA_971      12/31/2013          -    966   1024 |     0
 dscronin   25486@hnd6               6
 dscronin   14897@hnd18              6
 dscronin   14971@hnd18              6
 dscronin   15046@hnd18              6
 dscronin   31237@hnd16              6
 dscronin   31313@hnd16              6
 dscronin    6396@hnd15              6
 csharson   28890@saw175.saw.sharcnet   6
 roberpj    11257@hnd15              8
                   LICENSE GROUP    58    966   1024 |     0
 
PROGRAM          EXPIRATION CPUS  USED   FREE    MAX | QUEUE
---------------- ----------      ----- ------ ------ | -----
LS-OPT           12/31/2013          0   1024   1024 |     0
                   LICENSE GROUP     0   1024   1024 |     0

Legacy Instructions

The following binaries remain available on saw and orca for backward compatibility testing:

[roberpj@orc-login2:~] cd /opt/sharcnet/local/lsdyna
[roberpj@orc129:/opt/sharcnet/local/lsdyna] ls ls971*
ls971_d_R3_1    ls971_d_R4_2_1  ls971_s_R3_1    ls971_s_R4_2_1  ls971_s_R5_1_1
ls971_d_R4_2_0  ls971_d_R5_0    ls971_s_R4_2_0  ls971_s_R5_0

There are currently no sharcnet modules for these versions, hence jobs should be submitted as follows:

module load lsdyna
export PATH=/opt/sharcnet/local/lsdyna:$PATH
export LSTC_LICENSE_SERVER=XXXXX@license3.uwo.sharcnet
cp /opt/sharcnet/local/lsdyna/examples/airbag.deploy.k airbag.deploy.k
SERIAL JOB:  sqsub -q serial -r 1d -o ofile.%J ls971_d_R4_2_1 i=airbag.deploy.k
THREADED JOB:  sqsub -q threaded -n 4 -r 1d -o ofile.%J ls971_d_R4_2_1 ncpu=4 para=2 i=airbag.deploy.k

Note: the ls971_s_R3_1 and ls971_d_R3_1 binaries do not work; a fix is being sought.

References

o LSTC Release Notes
http://www.dynasupport.com/search?Subject%3Alist=release%20note

o LSTC LS-DYNA Homepage
http://www.lstc.com/products/ls-dyna

o LSTC LS-DYNA Support (Tutorials, HowTos, Faq, Manuals, Release Notes, News, Links)
http://www.dynasupport.com/

o LS-DYNA and d3VIEW Blog
http://blog2.d3view.com/a-few-words-on-memory-settings-in-ls-dyna/

o Convert Words to GB (500000000w=3.73gb,800000000w=5.96gb )
http://deviceanalytics.com/memcalc.php

o LS-DYNA Support Environment variables
http://www.dynasupport.com/howtos/general/environment-variables

o http://www.dynasupport.com/release-notes