(→betay.f90) |
(→betax.f90) |
||
Line 161: | Line 161: | ||
jobs: | jobs: | ||
− | ===betax.f90=== | + | ===1) betax.f90=== |
<pre> | <pre> |
Revision as of 18:48, 5 January 2016
MPFUN90 |
---|
Description: A Fortran-90 arbitrary precision package |
SHARCNET Package information: see MPFUN90 software page in web portal |
Full list of SHARCNET supported software |
Contents
Introduction
MPFUN90 is a Fortran-90 arbitrary precision package (i.e. user specifies the desired precision)written by David H. Bailey et. al from the Lawrence Berkeley National Laboratory. If 64 digits of precision are sufficient for an application then users should use QD instead of MPFUN which would run much faster.
Version Selection
20060123
module purge; module load ldwrapper; module load intel/12.1.3; module load mpfun90/20060123
20100825
module purge; module load ldwrapper; module load intel/15.0.3; module load mpfun90/intel1503/20100825
Job Submission
To submit a job to the serial queue use:
sqsub -r 10m --mpp=2G -o ofile.%J ./a.out
Sample Problems
1) testmp90
This example shows how to compile and run testmp90.f90 and testmp90.f using the 2006 and 2010 sharcnet mpfun90 modules:
20060123
[roberpj@orc-login2:~] module purge; module load ldwrapper; module load intel/12.1.3; module load mpfun90/20060123 [roberpj@orc-login2:~/samples/mpfun90/20060123] cp /opt/sharcnet/mpfun90/20060123/f90/src/testmp90.f90 . [roberpj@orc-login2:~/samples/mpfun90/20060123] ifort testmp90.f90 $CPPFLAGS $LDFLAGS -lmp [roberpj@orc-login2:~/samples/mpfun90/20060123] ./a.out Completed Test 1 Completed Test 2 Completed Test 3 ................. Completed Test 62
20100825
[roberpj@orc-login2:~] module purge; module load ldwrapper; module load intel/15.0.3; module load mpfun90/intel1503/20100825 [roberpj@orc-login2:~/samples/mpfun90/20100825] cp $MPFUN90_ROOT/f90/testmp90.f . [roberpj@orc-login2:~/samples/mpfun90/20100825] ifort -free testmp90.f $MPFUN90_ROOT/toolkit/mpfun90.o -I$MPFUN90_ROOT/toolkit [roberpj@orc-login2:~/samples/mpfun90/20100825] ./a.out Completed Test 1 Completed Test 2 Completed Test 3 ................. Completed Test 62
2) factorial_100
This example shows how to compile and run the program factorial_100.f90 using the 2010 sharcnet mpfun90 module only:
program factorial_100_pgm use mpmodule implicit none integer, parameter :: Nmx = 100 integer :: AllocateStatus integer :: J type(mp_real) zero, one type(mp_real), dimension (:), allocatable :: fac call mpinit(800) call mpsetprec(800) call mpsetoutputprec(700) zero='0.' one ='1.' allocate(fac(0:Nmx),STAT=AllocateStatus) IF (AllocateStatus /= 0) STOP "Not enough memory for fac " fac(0)=one do J=1,Nmx fac(J)=fac(J-1)*J end do J=4 write(6,1001) J 1001 format(/"fac(",i3.3,") =",$) call mpwrite(6,fac(J)) J=100 write(6,1001) J call mpwrite(6,fac(J)) write(6,1005) 1005 format(/"DONE"//) end
Compile
[roberpj@orc-login2:~/samples/mpfun90/factorial_100_pgm] ifort factorial_100.f90 $MPFUN90_ROOT/toolkit/mpmod90.o $MPFUN90_ROOT/toolkit/mpfun90.o -I$MPFUN90_ROOT/toolkit
Execute
[roberpj@orc-login2:~/samples/mpfun90/factorial_100_pgm] ./a.out fac(004) =10 ^ 1 x 2.4, fac(100) =10 ^ 157 x 9.3326215443944152681699238856266700490715968264381621468592 963895217599993229915608941463976156518286253697920827223758251185210916864, DONE
General Notes
OpenMP with MPFUN and QD
The Multi Precision packages MPFUN90 and QD use "user defined data type" for their variables. Since the OpenMP reduction clause cannot be used for "user defined data types" to run MPFUN90 or QD with OpenMP the reduction must be done manually in a PARALLEL OpenMP region.
Detailed documentation on how to use OpenMP with MPFUN90 and QD are presented in the following document:
https://www.sharcnet.ca/help/index.php/OpenMP_Reduction_for_User_Defined_Data_Types
Handling large arrays in an MPFUN program
Declaring large arrays in an MPFUN program often leads to compile errors like:
"relocation truncated to fit r_x86_64_pc32 against .bss"
Even with the flag '-mcmodel large' in the compile command the error will persist, because there is a limit to the amount of data you can have allocated in static arrays for x86_64 architecture. This data segment is limited to 2GB. The data in COMMON plus any other statically declared arrays PLUS your code must fit in 2GB.
In order to overcome this problem use allocatable arrays declared in a module together with a short subroutine which allocates the arrays.
We illustrate these concepts by starting with a program 'betax.f90' which work fine for values of parameter 'Nsz' less than 37. Thus, if you compile it with
integer, parameter :: Nsz=36
it works, but for any higher values it fails with error message:
"relocation truncated to fit: R_X86_64_PC32 against symbol '...'"
To overcome this problem, we first copy program betax.f90 into a new file, say, betay.f90, and write a module called 'mp_data' and a subroutine called 'allocate_arrays' at the top of the new file. We move all the declarations of large arrays into the module, and declare them as 'allocatable'.
In the subroutine 'allocate_arrays' we do the allocations and in the main program we add the statements:
USE mp_data ... call allocate_arrays ...
Below you will find the listings for the files 'betax.f90', 'betay.f90' and the 'HIST' file containing the commands to compile these files and submit jobs:
1) betax.f90
program BETAX use mpmodule implicit none ! integer, parameter :: Nsz=12 ! compilation fails for 37 integer, parameter :: Nsz=37 INTEGER :: I,J,K,L type(mp_real) zero, one type(mp_real) fac(0:Nsz),fci(0:Nsz) type(mp_real) BETA2(Nsz,Nsz) type(mp_real) BETA3(Nsz,Nsz,Nsz) type(mp_real) BETA4(Nsz,Nsz,Nsz,Nsz) call mpinit(800) call mpsetprec(800) call mpsetoutputprec(700) zero='0.' one ='1.' fac(0)=one fci(0)=one do j=1,Nsz fac(j)=fac(j-1)*j fci(j)=one/fac(j) end do I=Nsz/2 J=I-1 K=J-1 L=K-1 write(6,1000) I,J,K,L 1000 format(/"I,J,K,L = ",4i3) write(6,1001) I 1001 format(/"fac(",i3.3,") =",$) call mpwrite(6,fac(I)) BETA2(I,J) =fac(I)*fac(J)*fci(I+J) BETA3(I,J,K) =fac(I)*fac(J)*fac(K)*fci(I+J)*fci(J+K) BETA4(I,J,K,L) =fac(I)*fac(J)*fac(K)*fac(L)* & & fci(I+J)*fci(J+K)*fci(K+L) write(6,1002) I,J 1002 format(/"BETA2(",i3.3,",",i3.3,") =") call mpwrite(6,BETA2(I,J)) write(6,1003) I,J,K 1003 format(/"BETA3(",i3.3,",",i3.3,",",i3.3,") =") call mpwrite(6,BETA3(I,J,K)) write(6,1004) I,J,K,L 1004 format(/"BETA4(",i3.3,",",i3.3,",",i3.3,",",i3.3,") =") call mpwrite(6,BETA4(I,J,K,L)) write(6,1005) 1005 format(/"DONE"//) stop end
Set Nsz=12 (works)
[roberpj@orc-login2:~] ssh orc-dev1 [roberpj@orc129:~] module purge; module load ldwrapper; module load intel/15.0.3; module load mpfun90/intel1503/20100825 [roberpj@orc129:~] cd ~/samples/mpfun90/betax [roberpj@orc129:~/samples/mpfun90/betax] ifort betax.f90 $MPFUN90_ROOT/toolkit/mpmod90.o $MPFUN90_ROOT/toolkit/mpfun90.o -I$MPFUN90_ROOT/toolkit [roberpj@orc129:~/samples/mpfun90/betax] ./a.out I,J,K,L = 6 5 4 3 fac(006) =10 ^ 2 x 7.2, BETA2(006,005) = 10 ^ -3 x 2.1645021645021645021645021645021645021645021645021645021645 021645021645021645021645021645021645021645021645021645021645021645021645021645 021645021645021645021645021645021645021645021645021645021645021645021645021645 021645021645021645021645021645021645021645021645021645021645021645021645021645 021645021645021645021645021645021645021645021645021645021645021645021645021645 021645021645021645021645021645021645021645021645021645021645021645021645021645 021645021645021645021645021645021645021645021645021645021645021645021645021645 021645021645021645021645021645021645021645021645021645021645021645021645021645 021645021645021645021645021645021645021645021645021645021645021645021645021645 021645021645021645, BETA3(006,005,004) = 10 ^ -7 x 1.4315490505966696442886919077395267871458347648823839300029 776220252410728601204791680982157172633363109553585744061934538125014315490505 966696442886919077395267871458347648823839300029776220252410728601204791680982 157172633363109553585744061934538125014315490505966696442886919077395267871458 347648823839300029776220252410728601204791680982157172633363109553585744061934 538125014315490505966696442886919077395267871458347648823839300029776220252410 728601204791680982157172633363109553585744061934538125014315490505966696442886 919077395267871458347648823839300029776220252410728601204791680982157172633363 109553585744061934538125014315490505966696442886919077395267871458347648823839 300029776220252410, BETA4(006,005,004,003) = 10 ^ -10 x 1.7042250602341305289151094139756271275545651962885522976225 924071729060391191910466286883520443611146558992363981026112545386921804155364 246067193912998901661033180307556724790284880987828833633822295953815228191645 425205515908463754268742930874450148826566060126150829098674903663565795085069 461486695046785749733595538584200715719990096407329967420670368516173504835636 354910731327964888055591003436808425470556989831366248599808690511638357443346 105477624752001169234729325432273278078266740398259672636089869649960352908198 713187375318894593271010504570595273543119348108010239529513905931139491230194 178039983028645160164434540851774411865114812960617949280080799355175772409332 500035447881252869, DONE
Set Nsz=37 (fails)
[roberpj@orc-login2:~/samples/mpfun90/betax] ssh orc-dev1 [roberpj@orc129:~] module purge; module load ldwrapper; module load intel/15.0.3; module load mpfun90/intel1503/20100825 [roberpj@orc129:~] cd ~/samples/mpfun90/betax [roberpj@orc129:~/samples/mpfun90/betax] ifort betax.f90 $MPFUN90_ROOT/toolkit/mpmod90.o $MPFUN90_ROOT/toolkit/mpfun90.o -I$MPFUN90_ROOT/toolkit /opt/sharcnet/mpfun90/20100825/intel1503/toolkit/mpmod90.o: In function `mpdefmod_mp_mpsetprec_': mpmod90.f:(.text+0x1b): relocation truncated to fit: R_X86_64_PC32 against symbol `mpdefmod_mp_new_mpipl_' defined in COMMON section in /opt/sharcnet/mpfun90/20100825/intel1503/toolkit/mpmod90.o mpmod90.f:(.text+0xa0): relocation truncated to fit: R_X86_64_PC32 against symbol `mpdefmod_mp_new_mpwds_' defined in COMMON section in /opt/sharcnet/mpfun90/20100825/intel1503/toolkit/mpmod90.o mpmod90.f:(.text+0xa6): relocation truncated to fit: R_X86_64_PC32 against symbol `mpdefmod_mp_mpnwx_' defined in COMMON section in /opt/sharcnet/mpfun90/20100825/intel1503/toolkit/mpmod90.o mpmod90.f:(.text+0xcf): relocation truncated to fit: R_X86_64_PC32 against symbol `mpdefmod_mp_mpnwx_' defined in COMMON section in /opt/sharcnet/mpfun90/20100825/intel1503/toolkit/mpmod90.o /opt/sharcnet/mpfun90/20100825/intel1503/toolkit/mpmod90.o: In function `mpdefmod_mp_mpgetprec_': mpmod90.f:(.text+0xe6): relocation truncated to fit: R_X86_64_PC32 against symbol `mpdefmod_mp_mpnwx_' defined in COMMON section in /opt/sharcnet/mpfun90/20100825/intel1503/toolkit/mpmod90.o /opt/sharcnet/mpfun90/20100825/intel1503/toolkit/mpmod90.o: In function `mpdefmod_mp_mpsetprecwords_': mpmod90.f:(.text+0x109): relocation truncated to fit: R_X86_64_PC32 against symbol `mpdefmod_mp_new_mpwds_' defined in COMMON section in /opt/sharcnet/mpfun90/20100825/intel1503/toolkit/mpmod90.o mpmod90.f:(.text+0x18c): relocation truncated to fit: R_X86_64_PC32 against symbol `mpdefmod_mp_mpnwx_' defined in COMMON section in /opt/sharcnet/mpfun90/20100825/intel1503/toolkit/mpmod90.o mpmod90.f:(.text+0x198): relocation truncated to fit: R_X86_64_PC32 against symbol `mpdefmod_mp_mpnwx_' defined in COMMON section in /opt/sharcnet/mpfun90/20100825/intel1503/toolkit/mpmod90.o /opt/sharcnet/mpfun90/20100825/intel1503/toolkit/mpmod90.o: In function `mpdefmod_mp_mpgetprecwords_': mpmod90.f:(.text+0x1b2): relocation truncated to fit: R_X86_64_PC32 against symbol `mpdefmod_mp_mpnwx_' defined in COMMON section in /opt/sharcnet/mpfun90/20100825/intel1503/toolkit/mpmod90.o /opt/sharcnet/mpfun90/20100825/intel1503/toolkit/mpmod90.o: In function `mpdefmod_mp_mpsetoutputprec_': mpmod90.f:(.text+0x1c9): relocation truncated to fit: R_X86_64_PC32 against symbol `mpdefmod_mp_new_mpipl_' defined in COMMON section in /opt/sharcnet/mpfun90/20100825/intel1503/toolkit/mpmod90.o mpmod90.f:(.text+0x24c): additional relocation overflows omitted from the output
betay.f90
module mp_data use mpmodule use mpfunmod implicit none integer, parameter :: Nsz =48 integer :: AllocateStatus ! type(mp_real) zero, one ! ! type(mp_real) fac(0:Nsz),fci(0:Nsz) ! type(mp_real) BETA2(Nsz,Nsz) ! type(mp_real) BETA3(Nsz,Nsz,Nsz) ! type(mp_real) BETA4(Nsz,Nsz,Nsz,Nsz) type(mp_real) zero, one type(mp_real), dimension (:), allocatable :: fac,fci type(mp_real), dimension (:,:), allocatable :: BETA2 type(mp_real), dimension (:,:,:), allocatable :: BETA3 type(mp_real), dimension (:,:,:,:), allocatable :: BETA4 end module mp_data subroutine allocate_arrays USE mp_data implicit none allocate(fac(0:Nsz),STAT=AllocateStatus) IF (AllocateStatus /= 0) STOP "Not enough memory for fac " allocate(fci(0:Nsz),STAT=AllocateStatus) IF (AllocateStatus /= 0) STOP "Not enough memory for fci " allocate(BETA2(Nsz,Nsz),STAT=AllocateStatus) IF (AllocateStatus /= 0) STOP "Not enough memory for BETA2" allocate(BETA3(Nsz,Nsz,Nsz),STAT=AllocateStatus) IF (AllocateStatus /= 0) STOP "Not enough memory for BETA3" allocate(BETA4(Nsz,Nsz,Nsz,Nsz),STAT=AllocateStatus) IF (AllocateStatus /= 0) STOP "Not enough memory for BETA4" print *,"Done with allocations" return END program BETAY USE mp_data implicit none INTEGER :: I,J,K,L call mpinit(800) call mpsetprec(800) call mpsetoutputprec(700) zero='0.' one ='1.' call allocate_arrays fac(0)=one fci(0)=one do j=1,Nsz fac(j)=fac(j-1)*j fci(j)=one/fac(j) end do I=Nsz/2 J=I-1 K=J-1 L=K-1 write(6,1000) I,J,K,L 1000 format(/"I,J,K,L = ",4i3) write(6,1001) I 1001 format(/"fac(",i3.3,") =",$) call mpwrite(6,fac(I)) BETA2(I,J) =fac(I)*fac(J)*fci(I+J) BETA3(I,J,K) =fac(I)*fac(J)*fac(K)*fci(I+J)*fci(J+K) BETA4(I,J,K,L) =fac(I)*fac(J)*fac(K)*fac(L)* & & fci(I+J)*fci(J+K)*fci(K+L) write(6,1002) I,J 1002 format(/"BETA2(",i3.3,",",i3.3,") =") call mpwrite(6,BETA2(I,J)) write(6,1003) I,J,K 1003 format(/"BETA3(",i3.3,",",i3.3,",",i3.3,") =") call mpwrite(6,BETA3(I,J,K)) write(6,1004) I,J,K,L 1004 format(/"BETA4(",i3.3,",",i3.3,",",i3.3,",",i3.3,") =") call mpwrite(6,BETA4(I,J,K,L)) write(6,1005) 1005 format(/"DONE"//) stop end
============================================================================ # File betay.f90 was copied from betax.f90 and modified as described earlier. # Compile betay.f90 ifort betay.f90 -mcmodel medium -shared-intel \ -I/opt/sharcnet/mpfun/6.1.23/f90/include \ -L/opt/sharcnet/mpfun/6.1.23/f90/lib -lmp >& AOUT2 ============================================================================ # The executable might require more than 2.0G of memory: # This will pruduce an error: sqsub -r 5m --mpp=4.0G -o OUTPUTFILE4 ./a.out # This should work: sqsub -r 5m --mpp=8.0G -o OUTPUTFILE8 ./a.out ============================================================================
MPFUN's TOOLKIT
This toolkit provides and extensive example of how the mpfun package can be used. Read the header of files
mathinit.f90 and mathtool.f90
found under
/opt/sharcnet/mpfun90/version/[flavor]/toolkit
for an explanation what the package does. The Makefile in this directory shows how the mpfun libraries were linked to compile this sample program pair. The following steps shows the first initialization step followed by actually running the mathtool program to perform some examples such as performing an integration and displaying the value of pie. Note that the displayed zero cputimes are due to a bug in the mpfun current build which will be fixed in future updates on the clusters.
Initializing Mathtool
To run Mathtool you must first generate following two input files
const.dat and quadts.dat
which can be accomplished by running following command:
[roberpj@bal34:~/samples/mpfun] /opt/sharcnet/mpfun90/version/[flavor]/toolkit/mathinit mathinit: start const complete file const.dat written cpu time = 0.E+0 initqts: Tanh-sinh quadrature initialization 0 18632 1000 18632 2000 18632 3000 18632 4000 18632 5000 18632 6000 18632 7000 18632 8000 18632 initqts: Table spaced used = 8177 initqts complete file quadts.dat written cpu time = 0.E+0 total cpu time = 0.E+0 [roberpj@bal34:~/samples/mpfun] ls const.dat quadts.dat
Running Mathtool
Use the files created in the previous section to run mathtool as follows:
[roberpj@bal34:~/samples/mpfun] /opt/sharcnet/mpfun90/version/[flavor]/toolkit/mathtool Welcome to the Experimental Mathematician's Toolkit Initializing... Current settings: Debug level = 2 Primary precision level = 100 digits Secondary precision level = 200 digits Primary epsilon = 10^ -100 Secondary epsilon = 10^ -200 PSLQ bound = 100 PSLQ level = 1 Quadrature level = 6 Quadrature type = 3 (Tanh-Sinh) -- snip -- TYPE: integrate[1/(1+x^2), {x, 0, 1}] (hit enter) quadts: Iteration 1 of 6; est error = 10^ 0; approx value = 7.853157298876894309110693190063938396203990510665917998384243e-1 quadts: Iteration 2 of 6; est error = 10^ 0; approx value = 7.853981605125528448099840689980111720633592558858121242731082e-1 quadts: Iteration 3 of 6; est error = 10^ -17; approx value = 7.853981633974483082028533923930157568609007576776270726821597e-1 quadts: Iteration 4 of 6; est error = 10^ -36; approx value = 7.853981633974483096156608458198757190586422063114153956005278e-1 quadts: Iteration 5 of 6; est error = 10^ -71; approx value = 7.853981633974483096156608458198757210492923498437764552437361e-1 quadts: Iteration 6 of 6; est error = 10^ -101; approx value = 7.853981633974483096156608458198757210492923498437764552437361e-1 Result[ 1] = 7.85398163397448309615660845819875721049292349843776455243736148076954e-1 CPU time = 0.0000 TYPE: functions (hit enter) Abs Arccos Arcsin Arctan Arctan2 Bessel Besselexp Binomial Cos Erf Exp Factorial Gamma Integrate Log Max Min Polyroot Pslq Result Sin Sqrt Sum Table Tan Zeta Zetap Zetaz TYPE: variables (hit enter) E Log2 Log10 Pi Catalan Eulergamma Infinity Arg1 Arg2 Arg3 Arg4 Arg5 Arg6 Arg7 Arg8 Arg9 Debug Digits Digits2 Epsilon Epsilon2 Pslqbound Pslqlevel Quadlevel Quadtype x TYPE: Pi (hit enter) Result[ 2] = 3.14159265358979323846264338327950288419716939937510582097494459230781e0 CPU time = 0.0000
More Mathtool commands
To start mathtool (assuming you have issued the command /opt/sharcnet/mpfun/current/toolkit/mathinit and have these two files: const.dat HIST quadts.dat - you type:
/opt/sharcnet/mpfun/current/toolkit/mathtool
Then try following commands:
integrate[2*x,{x,0,1}] sqrt[1600] help Binomial Binomial[3,2] hypo[x,y] = sqrt[x^2+y^2] z=hypo[3,4] z exit
This is what you should get from above commands:
integrate[2*x,{x,0,1}] quadts: Iteration 1 of 6; est error = 10^ 0; approx value = 1.000003359570811224742398159504692881030640064061042259417294e0 quadts: Iteration 2 of 6; est error = 10^ 0; approx value = 1.000000000000036584185557611216071731517571134591404403031033e0 quadts: Iteration 3 of 6; est error = 10^ -27; approx value = 1.000000000000000000000000000001046972220803162709182580101627e0 quadts: Iteration 4 of 6; est error = 10^ -60; approx value = 1.000000000000000000000000000000000000000000000000000000000000e0 quadts: Iteration 5 of 6; est error = 10^ -101; approx value = 9.999999999999999999999999999999999999999999999999999999999999e-1 Result[ 1] = 9.99999999999999999999999999999999999999999999999999999999999999999999e-1 CPU time = 0.0000
sqrt[1600] Result[ 2] = 4.00000000000000000000000000000000000000000000000000000000000000000000e1 CPU time = 0.0000
help Binomial Binomial[m,n] computes the binomial coefficient of integers (m,n).
Binomial[3,2] Result[ 3] = 3.00000000000000000000000000000000000000000000000000000000000000000000e0 CPU time = 0.0000
hypo[x,y] = sqrt[x^2+y^2] parse: new function name = hypo CPU time = 0.0000
z=hypo[3,4] parse: new variable name = z CPU time = 0.0000
z Result[ 4] = 5.00000000000000000000000000000000000000000000000000000000000000000000e0 CPU time = 0.0000
exit
References
o High-Precision Software Directory
http://crd.lbl.gov/~dhbailey/mpdist/index.html