MKL |
---|
Description: Intel Math Kernel Library |
SHARCNET Package information: see MKL software page in web portal |
Full list of SHARCNET supported software |
COMPILING WITH MKL
1) On the few remaining centos5 clusters (as shown here https://www.sharcnet.ca/my/software/show/129) the compile script can be used for codes that need to link with the MKL blas and lapack libraries (where the program extension xyz can be any of c/cc, cxx/CC/c++ or f77/f90/f95) as follows:
compile program.xyz -llapack
To demonstrate this approach, consider the following example, where the result from a.out can be compared with the expected vendor-provided solution:
[hnd50:~] ln -s /opt/sharcnet/acml/4.3.0/ifort-64bit/ifort64/examples/dgetrf_example.f
[hnd50:~] compile -v dgetrf_example.f -llapack
[hnd50:~] ./a.out
[hnd50:~] cat /opt/sharcnet/acml/4.3.0/ifort-64bit/ifort64/examples/dgetrf_example.expected
2) More generally (on any SHARCNET cluster) the <a href="http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/">Intel® Math Kernel Library Link Line Advisor</a> can be used to generate linker options for linking situations with MKL that are more complex than the compile script supports. For instance, to determine the link arguments for the Linux operating system, IA64 Itanium processor, Intel compiler, static linking, the LP64 interface (32-bit integers), the multi-threaded version of MKL, the Intel OpenMP library (libiomp5) plus the ScaLAPACK library, the MKL Link Line Advisor would (at the time of this writing) return the following recommendation:
$MKLPATH/libmkl_scalapack_lp64.a $MKLPATH/libmkl_solver_lp64.a -Wl,--start-group $MKLPATH/libmkl_intel_lp64.a $MKLPATH/libmkl_intel_thread.a $MKLPATH/libmkl_core.a $MKLPATH/libmkl_blacs_sgimpt_lp64.a -Wl,--end-group -openmp -lpthread
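A link line this long is easy to mistype, so one option is to keep it in a small shell function and set the library directory in one place. The following is only a sketch: the function name mkl_link_line, the source file myprog.f90 and the default value of MKLPATH (taken from the directory listing on this page) are assumptions for illustration, and the compile command is printed rather than run.

```shell
# Sketch: wrap the advisor's recommendation in a function so that the
# library directory is set in one place. The default MKLPATH below is an
# assumption; adjust it for the cluster and Intel module in use.
MKLPATH="${MKLPATH:-/opt/sharcnet/intel/current/ifc/mkl/lib/em64t}"

mkl_link_line() {
    # $1: directory that holds the MKL static libraries
    p="$1"
    line="$p/libmkl_scalapack_lp64.a $p/libmkl_solver_lp64.a"
    line="$line -Wl,--start-group"
    line="$line $p/libmkl_intel_lp64.a $p/libmkl_intel_thread.a $p/libmkl_core.a"
    line="$line $p/libmkl_blacs_sgimpt_lp64.a"
    line="$line -Wl,--end-group -openmp -lpthread"
    printf '%s\n' "$line"
}

# Print the compile command instead of running it (myprog.f90 is hypothetical).
echo "ifort myprog.f90 $(mkl_link_line "$MKLPATH") -o myprog"
```

If the advisor returns a different set of libraries for your configuration, only the body of the function needs to change.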
All of the installed MKL libraries can be listed as follows:
[roberpj@saw377:~] ls /opt/sharcnet/intel/current/ifc/mkl/lib/em64t
libmkl_blacs_ilp64.a              libmkl_gf_lp64.a        libmkl_scalapack_ilp64.a
libmkl_blacs_intelmpi20_ilp64.a   libmkl_gf_lp64.so       libmkl_scalapack_ilp64.so
libmkl_blacs_intelmpi20_lp64.a    libmkl_gnu_thread.a     libmkl_scalapack_lp64.a
libmkl_blacs_intelmpi_ilp64.a     libmkl_gnu_thread.so    libmkl_scalapack_lp64.so
libmkl_blacs_intelmpi_ilp64.so    libmkl_intel_ilp64.a    libmkl_sequential.a
libmkl_blacs_intelmpi_lp64.a      libmkl_intel_ilp64.so   libmkl_sequential.so
libmkl_blacs_intelmpi_lp64.so     libmkl_intel_lp64.a     libmkl.so
libmkl_blacs_lp64.a               libmkl_intel_lp64.so    libmkl_solver.a
libmkl_blacs_openmpi_ilp64.a      libmkl_intel_sp2dp.a    libmkl_solver_ilp64.a
libmkl_blacs_openmpi_lp64.a       libmkl_intel_sp2dp.so   libmkl_solver_ilp64_sequential.a
libmkl_blacs_sgimpt_ilp64.a       libmkl_intel_thread.a   libmkl_solver_lp64.a
libmkl_blacs_sgimpt_lp64.a        libmkl_intel_thread.so  libmkl_solver_lp64_sequential.a
libmkl_cdft.a                     libmkl_lapack.a         libmkl_vml_def.so
libmkl_cdft_core.a                libmkl_lapack.so        libmkl_vml_mc2.so
libmkl_core.a                     libmkl_mc3.so           libmkl_vml_mc3.so
libmkl_core.so                    libmkl_mc.so            libmkl_vml_mc.so
libmkl_def.so                     libmkl_p4n.so           libmkl_vml_p4n.so
libmkl_em64t.a                    libmkl_pgi_thread.a     locale
libmkl_gf_ilp64.a                 libmkl_pgi_thread.so
libmkl_gf_ilp64.so                libmkl_scalapack.a
Note About Using The ILP64 Vs LP64 Variants Of MKL
You should use the Intel MKL ilp64 libraries in the following cases:
1. If you are using huge data arrays (indexing exceeds 2^31-1)
2. If you compile Fortran code with the -i8 compiler option (/4I8 on Windows)
The ilp64 version of the MKL libraries defines integers as 64 bit. This implies codes should be compiled with -i8 OR be modified internally to use the integer*8 type. Otherwise the standard lp64 version of MKL should be used, which assumes standard 32-bit integers.
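A signed 32-bit integer holds at most 2^31-1 = 2147483647, so a quick way to decide between the two variants is to compare the largest array index your program will use against that limit. The sketch below does this in the shell; the function name mkl_interface_for and the two matrix sizes are hypothetical and chosen only for illustration.

```shell
# Largest value a signed 32-bit integer can hold (2^31 - 1).
INT32_MAX=2147483647

mkl_interface_for() {
    # $1: largest array index (element count) the program will use
    if [ "$1" -gt "$INT32_MAX" ]; then
        echo ilp64
    else
        echo lp64
    fi
}

# Hypothetical sizes: a 40000 x 40000 matrix versus a 60000 x 60000 matrix.
for n in 40000 60000; do
    elems=$((n * n))
    echo "$n x $n = $elems elements -> $(mkl_interface_for "$elems")"
done
```

The first case (1600000000 elements) still fits the lp64 libraries, while the second (3600000000 elements) needs ilp64 together with -i8 or integer*8 declarations as described above.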
Support for Third-Party Interfaces
o GMP Functions
The Intel MKL implementation of GMP arithmetic functions includes arbitrary precision arithmetic operations on integer numbers. The interfaces of these functions fully match the GNU Multiple Precision (GMP) Arithmetic Library. For specifications of these functions, please see this <a href='http://www.intel.com/software/products/mkl/docs/gnump/WebHelp/'>link</a>. If you currently use the GMP library, you need to change the include statements in your programs to reference mkl_gmp.h.
o FFTW Interface Support
Intel MKL provides interface wrappers for FFTW (www.fftw.org) versions 2.x and 3.x. They are located in the same directory on all clusters. Using hound and version 11.0.083 of the Intel compiler as an example, the wrappers and the corresponding FFTW wrapper header files are found in the following locations:
[roberpj@hnd50:/opt/sharcnet/intel/11.0.083/ifc/mkl/interfaces] ls
blas95  fftw2xc  fftw2x_cdft  fftw2xf  fftw3xc  fftw3xf  lapack95
[roberpj@hnd50:/opt/sharcnet/intel/11.0.083/ifc/mkl/include/fftw] ls
fftw3.f  fftw_f77.i  fftw_mpi.h      rfftw.h      rfftw_threads.h
fftw3.h  fftw.h      fftw_threads.h  rfftw_mpi.h
The wrappers allow programs that currently use FFTW to call the equivalent Intel MKL Fourier transform functions instead, without changing the program source code. The online document <a href='http://www.intel.com/software/products/mkl/docs/fftw_mkl_user_notes_2.htm'>FFTW to Intel® Math Kernel Library Wrappers Technical User Notes</a> mentions that "FFTW2MKL wrappers are delivered as the source code that must be compiled by the user to build the wrapper library." By popular demand these wrappers have been precompiled for immediate use and are located in two directories for each intel module (at present 11.0.083 and 11.1.069) as follows:
[roberpj@hnd50:/opt/sharcnet/intel/11.0.083/mkl/lib/em64t/interfaces] tree
.
|-- ilp64
|   |-- libfftw2xc_intel.a
|   |-- libfftw2xf_intel.a
|   |-- libfftw3xc_intel.a
|   |-- libfftw3xf_intel.a
|   |-- libmkl_blas95.a
|   |-- libmkl_lapack95.a
|   |-- mkl77_lapack.mod
|   |-- mkl77_lapack1.mod
|   |-- mkl95_blas.mod
|   |-- mkl95_lapack.mod
|   `-- mkl95_precision.mod
`-- lp64
    |-- libfftw2xc_intel.a
    |-- libfftw2xf_intel.a
    |-- libfftw3xc_intel.a
    |-- libfftw3xf_intel.a
    |-- libmkl_blas95.a
    |-- libmkl_lapack95.a
    |-- mkl77_lapack.mod
    |-- mkl77_lapack1.mod
    |-- mkl95_blas.mod
    |-- mkl95_lapack.mod
    `-- mkl95_precision.mod
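To pick the precompiled wrapper that matches the integer interface of the rest of the link line, the path can be built from the directory layout shown above. This is a rough sketch only: the function name fftw3_wrapper_lib, the variable INTEL_ROOT and the program myfft.c are hypothetical, the paths assume the 11.0.083 module, and the MKL libraries that must follow the wrapper on the link line are left as a placeholder for whatever the Link Line Advisor recommends.

```shell
# Assumed install root for the Intel 11.0.083 module, as shown on this page.
INTEL_ROOT="${INTEL_ROOT:-/opt/sharcnet/intel/11.0.083}"

fftw3_wrapper_lib() {
    # $1: integer interface, either lp64 or ilp64
    case "$1" in
        lp64|ilp64)
            printf '%s\n' "$INTEL_ROOT/mkl/lib/em64t/interfaces/$1/libfftw3xc_intel.a"
            ;;
        *)
            echo "unknown interface: $1" >&2
            return 1
            ;;
    esac
}

# Print (rather than run) a compile line for a hypothetical FFTW3 program.
# "<MKL link line>" stands for the MKL libraries from the Link Line Advisor.
echo "icc myfft.c -I$INTEL_ROOT/ifc/mkl/include/fftw $(fftw3_wrapper_lib lp64) <MKL link line>"
```

The same pattern would apply to the Fortran wrapper (libfftw3xf_intel.a) or the FFTW 2.x wrappers by changing the library name.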
Introduction to Using the MKL FFT
Intel markets two implementations of the FFT: the first is part of <a href='http://software.intel.com/en-us/intel-mkl/'>MKL</a> and the other is part of <a href='http://software.intel.com/en-us/intel-ipp/'>IPP</a>. Their differences are described <a href='http://software.intel.com/en-us/articles/mkl-ipp-choosing-an-fft/'>here</a>. Only the MKL version is installed on SHARCNET.
The main FFT computation functions provided with MKL are DftiComputeForward and DftiComputeBackward, which compute the forward and backward FFT respectively. These functions, along with the descriptor manipulation, descriptor configuration and status checking functions, are listed in the <a href='http://www.intel.com/software/products/mkl/docs/webhelp/fft/fft_DFTF.html'>Table “FFT Functions in Intel MKL”</a>. Intel describes how to use these functions in their <a href='http://www.intel.com/software/products/mkl/docs/webhelp/appendices/mkl_appC_FFT.html'>Fourier Transform Functions Code Examples</a> document, which also covers multi-threading aspects.
The simplest way to explain how to use the MKL FFT is by compiling and running an example problem. Several are located under /opt/sharcnet/intel/11.0.083/ifc/mkl/examples, where the Fortran samples are contained in the dftf sub-directory and the C samples in the dftc sub-directory. The problem demonstrated here is from the source complex_2d_double_ex1.f90, an MKL DFTI (Fortran interface) example program that demonstrates a forward-backward 2D complex in-place transform for double precision data. Steps to run this program are as follows:
1) Copy the example directory to a test directory in your account with:
cp -r /opt/sharcnet/intel/11.0.083/ifc/mkl/examples/dftf /scratch/myusername/dftfdemo
cd /scratch/myusername/dftfdemo
2) Next compile the example program. In this case the machine used is Silky, i.e. ia64 based.
make lib64 function=complex_2d_double_ex1 compiler=intel interface=ia64 threading=parallel 2>&1 | tee myMake.out
3) The build output appears as follows. Note that the first step is to compile mkl_dfti.f90 into a module, which is then used in the program on line 42 where the statement Use MKL_DFTI can be seen, viz.:
make lib64 function=complex_2d_double_ex1 compiler=intel interface=ia64 threading=parallel 2>&1 | tee myMake.out
rm -fr *.o *.mod
make mkl_dfti.o dfti_example_support.o dfti_example_status_print.o complex_2d_double_ex1.res _IA=64 EXT=a RES_EXT=lib
make[1]: Entering directory `/home/roberpj/samples/fft-intel/fft/dftf'
mkdir -p ./_results/intel_ia64_parallel_64_lib
ifort -w -c /opt/sharcnet/intel/11.0.083/ifc/mkl/include/mkl_dfti.f90 -o mkl_dfti.o
mkdir -p ./_results/intel_ia64_parallel_64_lib
ifort -w -c source/dfti_example_support.f90 -o dfti_example_support.o
mkdir -p ./_results/intel_ia64_parallel_64_lib
ifort -w -c source/dfti_example_status_print.f90 -o dfti_example_status_print.o
mkdir -p ./_results/intel_ia64_parallel_64_lib
ifort -w mkl_dfti.o dfti_example_support.o dfti_example_status_print.o source/complex_2d_double_ex1.f90 -L"/opt/sharcnet/intel/11.0.083/ifc/mkl/lib/64" "/opt/sharcnet/intel/11.0.083/ifc/mkl/lib/64"/libmkl_intel_lp64.a -Wl,--start-group "/opt/sharcnet/intel/11.0.083/ifc/mkl/lib/64"/libmkl_intel_thread.a "/opt/sharcnet/intel/11.0.083/ifc/mkl/lib/64"/libmkl_core.a -Wl,--end-group -L"/opt/sharcnet/intel/11.0.083/ifc/mkl/lib/64" -liomp5 -lpthread -o _results/intel_ia64_parallel_64_lib/complex_2d_double_ex1.out
export LD_LIBRARY_PATH="/opt/sharcnet/intel/11.0.083/ifc/mkl/lib/64":/opt/sharcnet/lsf/6.2/linux2.6-glibc2.4-ia64/lib:/opt/sharcnet/lsf/6.2/linux2.6-glibc2.4-ia64/lib:/opt/sharcnet/intel/11$
_results/intel_ia64_parallel_64_lib/complex_2d_double_ex1.out <data/complex_2d_double_ex1.d >_results/intel_ia64_parallel_64_lib/complex_2d_double_ex1.res
make[1]: Leaving directory `/home/roberpj/samples/fft-intel/fft/dftf'
4) Since the program is run automatically by the makefile, the output data can be examined by viewing the results file complex_2d_double_ex1.res that it creates (for example with cat, more or less):
cat _results/intel_ia64_parallel_64_lib/complex_2d_double_ex1.res
COMPLEX_2D_DOUBLE_EX1
Forward-Backward 2D complex transform for double precision data
Configuration parameters:
DFTI_FORWARD_DOMAIN = DFTI_COMPLEX
DFTI_PRECISION = DFTI_DOUBLE
DFTI_DIMENSION = 2
DFTI_LENGTHS = { 5, 3}
DFTI_PLACEMENT = DFTI_INPLACE
DFTI_INPUT_STRIDES = { 0, 1, 15}
DFTI_FORWARD_SCALE = 1.0
DFTI_BACKWARD_SCALE = 1.0/real(m*n)
INPUT vector X (2D columns)
( 0.729, 0.486) ( -0.865, -0.577) ( -0.278, -0.186)
( 0.787, 0.525) ( 0.839, 0.559) ( -0.586, -0.391)
( 0.122, 0.081) ( -0.741, -0.494) ( -0.794, -0.529)
( -0.655, -0.437) ( 0.580, 0.387) ( -0.866, -0.577)
( -0.830, -0.554) ( -0.371, -0.247) ( -0.791, -0.527)
Compute DftiComputeForward
Forward OUTPUT vector X (2D columns)
( -3.720, -2.480) ( 3.681, -0.995) ( 0.497, 3.780)
( 2.932, -1.810) ( 1.422, 0.044) ( 3.078, -1.928)
( 1.115, -2.479) ( -2.040, 1.814) ( 3.144, 1.228)
( -1.859, 1.982) ( 2.343, 2.430) ( 0.890, -2.581)
( -0.543, 3.403) ( -0.596, 3.583) ( 0.588, 1.295)
Compute DftiComputeBackward
Backward OUTPUT vector X (2D columns)
( 0.729, 0.486) ( -0.865, -0.577) ( -0.278, -0.186)
( 0.787, 0.525) ( 0.839, 0.559) ( -0.586, -0.391)
( 0.122, 0.081) ( -0.741, -0.494) ( -0.794, -0.529)
( -0.655, -0.437) ( 0.580, 0.387) ( -0.866, -0.577)
( -0.830, -0.554) ( -0.371, -0.247) ( -0.791, -0.527)
ACCURACY = 0.248253E-15
TEST PASSED
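When several examples have been built, reading every results file by eye becomes tedious. Since the output above ends with an ACCURACY line and TEST PASSED, a small helper can check for those instead. This is a minimal sketch; the function name check_mkl_result is hypothetical, and it assumes every results file reports success with the same TEST PASSED text as the one shown here.

```shell
check_mkl_result() {
    # $1: path to a .res file written by the example makefile
    if grep -q "TEST PASSED" "$1"; then
        # Show the accuracy figure the example printed, if there is one.
        grep "ACCURACY" "$1"
        return 0
    fi
    echo "no TEST PASSED line in $1" >&2
    return 1
}

# Check every results file under _results (does nothing if none exist yet).
for f in _results/*/*.res; do
    if [ -f "$f" ]; then
        check_mkl_result "$f" || echo "FAILED: $f"
    fi
done
```

Run from the dftfdemo directory after the make step, this prints the ACCURACY line of each passing example and flags any results file that lacks the TEST PASSED marker.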
SUMMARY EXAMPLE PROGRAMS
The Intel compiler comes with many MKL examples, which can be copied to your work directory to experiment with as follows:
cp -r /opt/sharcnet/intel/current/ifc/mkl/examples /work/$USER
Then each example can be compiled by going into any example directory and executing:
make soem64t
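To build more than one example in a single pass, the two steps above can be wrapped in a loop. The sketch below is one possible way to do that; the function name build_mkl_examples and the build.log file name are hypothetical, and it assumes each example sub-directory carries its own makefile that accepts the same target.

```shell
build_mkl_examples() {
    # $1: directory containing the copied examples
    # $2: make target, e.g. soem64t
    for dir in "$1"/*/; do
        # Skip anything that does not contain a makefile.
        [ -f "${dir}makefile" ] || [ -f "${dir}Makefile" ] || continue
        # Build in a subshell so the caller's working directory is unchanged.
        if (cd "$dir" && ${MAKE:-make} "$2" > build.log 2>&1); then
            echo "ok:     $dir"
        else
            echo "failed: $dir (see ${dir}build.log)"
        fi
    done
}

# Example call, left commented out because a full build can take a while:
# build_mkl_examples "/work/$USER/examples" soem64t
```

Each example writes its own build.log, so a failed build can be inspected afterwards without rerunning the others.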