(pycuda work in progress)
<big>'''PyCUDA''' makes it possible to easily use CUDA inside Python code.</big>

Documentation can be found on the [http://mathema.tician.de/software/pycuda package webpage].

This package is not currently installed as SHARCNET-supported software, but users can easily install it themselves by following the instructions below. If you encounter any difficulties when following these instructions, please ask SHARCNET staff for help.

See also: [[PyOpenCL]]

==SHARCNET installation instructions==

===Monk cluster===
(These instructions were tested on Oct 22, 2014)

1. Unload the Intel compiler module (loaded by default) so that GCC becomes the default compiler, and load a newer Python version.
<source lang="bash">
module unload intel
module unload openmpi
module load gcc/4.8.2
module load openmpi/gcc/1.8.1
module load python/gcc/2.7.8
</source>
Note: the openmpi module is loaded because the python module depends on it; PyCUDA itself does not use it.


2. Create a directory in which to build the package, cd into it, then get the PyCUDA source code:

<source lang="bash">
git clone http://git.tiker.net/trees/pycuda.git
cd pycuda
git submodule init
git submodule update

wget https://pypi.python.org/packages/source/p/pycuda/pycuda-2014.1.tar.gz#md5=fdc2f59e57ab7256a7e0df0d9d943022
tar xfz pycuda-2014.1.tar.gz
cd pycuda-2014.1
</source>

3. At this point, decide where you want the package to be installed. This example uses a directory called python_packages in the home directory. If this directory does not yet exist, create it with:

<source lang="bash">
mkdir -p ~/python_packages/lib/python/
</source>
The -p option creates the required subdirectories as well.

4. Edit the file Makefile.in and add a --home flag, pointing to the directory you created, to the setup.py install line, so that it reads:
<source lang="bash">
${PYTHON_EXE} setup.py install --home=~/python_packages
</source>

5. You now need to update the PYTHONPATH variable to point to the library directory:
<source lang="bash">
export PYTHONPATH=~/python_packages/lib/python/:$PYTHONPATH
</source>
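
Steps 3 to 5 rely on the distutils "home" installation scheme, which places modules under <code>lib/python</code> inside the chosen prefix. The following is a minimal sketch, not part of PyCUDA, for checking that the directory from step 3 is actually listed in PYTHONPATH. The function names <code>home_scheme_lib_dir</code> and <code>is_on_pythonpath</code> are hypothetical helpers introduced only for this illustration.

```python
import os


def home_scheme_lib_dir(home_prefix):
    """Return the module directory used by `setup.py install --home=<prefix>`."""
    # With --home, distutils places modules under <prefix>/lib/python.
    return os.path.join(os.path.expanduser(home_prefix), "lib", "python")


def is_on_pythonpath(directory, pythonpath):
    """Check whether `directory` appears in a PYTHONPATH-style string."""
    target = os.path.normpath(os.path.expanduser(directory))
    entries = [entry for entry in pythonpath.split(os.pathsep) if entry]
    return any(os.path.normpath(os.path.expanduser(entry)) == target
               for entry in entries)


if __name__ == "__main__":
    lib_dir = home_scheme_lib_dir("~/python_packages")
    print(lib_dir)
    print(is_on_pythonpath(lib_dir, os.environ.get("PYTHONPATH", "")))
```

If the second line printed is False, the export from step 5 has probably not been run in the current shell.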

6. Configure and compile, providing the path to the CUDA files on monk:

<source lang="bash">
python configure.py --cuda-root=/opt/sharcnet/cuda/6.0.37/toolkit
make install
</source>

7. As a first test of the installation, make sure the pycuda module can be imported by starting python and executing:

<source lang="python">
import pycuda
</source>

If no errors are reported, the package was installed where Python can find it.
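
The import test can also be scripted. The sketch below is an assumption-laden helper (the name <code>check_import</code> is hypothetical) that tries both <code>pycuda</code> and <code>pycuda.driver</code>, since the compiled driver module is imported separately from the top-level package and can fail on its own.

```python
import importlib


def check_import(module_name):
    """Try to import a module; return (True, location) or (False, error text)."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        return False, str(exc)
    return True, getattr(module, "__file__", "<built-in>")


if __name__ == "__main__":
    # pycuda.driver loads the compiled extension, so it is checked separately.
    for name in ("pycuda", "pycuda.driver"):
        ok, detail = check_import(name)
        print("%-14s %s  %s" % (name, "OK" if ok else "FAILED", detail))
```

On a node without the CUDA driver library, the second import is expected to fail even when the first succeeds.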

8. Add the lines:

<source lang="bash">
module unload intel
module unload openmpi
module load gcc/4.8.2
module load openmpi/gcc/1.8.1
module load python/gcc/2.7.8
export PYTHONPATH=~/python_packages/lib/python/:$PYTHONPATH
</source>

to your ~/.bashrc file so that the modules are loaded and PYTHONPATH is set automatically on every login.

9. Test PyCUDA on a development node that has a GPU (the login node does not have one, so PyCUDA tests will produce an error there). To do this, execute on the monk login node:

<source lang="bash">
ssh mon54
</source>

Then go to the directory where you put the PyCUDA source code, and execute:

<source lang="bash">
python test/test_driver.py
</source>
At the time of testing, this gave the following error:

<source lang="bash">
[ppomorsk@mon241:~/supported_sharcnet_packages/pycuda/pycuda-2014.1/test] python test_driver.py
Traceback (most recent call last):
  File "test_driver.py", line 4, in <module>
    from pycuda.tools import mark_cuda_test
  File "/home/ppomorsk/python_packages/lib/python/pycuda-2014.1-py2.7-linux-x86_64.egg/pycuda/tools.py", line 30, in <module>
    import pycuda.driver as cuda
  File "/home/ppomorsk/python_packages/lib/python/pycuda-2014.1-py2.7-linux-x86_64.egg/pycuda/driver.py", line 2, in <module>
    from pycuda._driver import * # noqa
ImportError: /home/ppomorsk/python_packages/lib/python/pycuda-2014.1-py2.7-linux-x86_64.egg/pycuda/_driver.so: undefined symbol: cuStreamAttachMemAsync
</source>

The ldd output for the compiled driver module is:

<source lang="bash">
[ppomorsk@mon54:~/supported_sharcnet_packages/pycuda/pycuda-2014.1/test] ldd /home/ppomorsk/python_packages/lib/python/pycuda-2014.1-py2.7-linux-x86_64.egg/pycuda/_driver.so
linux-vdso.so.1 => (0x00007fffac5c1000)
libcuda.so.1 => /usr/lib64/libcuda.so.1 (0x00002b7dbda9e000)
libcurand.so.6.0 => /opt/sharcnet/cuda/6.0.37/toolkit/lib64/libcurand.so.6.0 (0x00002b7dbea01000)
libstdc++.so.6 => /opt/sharcnet/gcc/4.8.2/lib64/libstdc++.so.6 (0x00002b7dc3f08000)
libm.so.6 => /lib64/libm.so.6 (0x00002b7dc4211000)
libgcc_s.so.1 => /opt/sharcnet/gcc/4.8.2/lib64/libgcc_s.so.1 (0x00002b7dc4496000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b7dc46ac000)
libc.so.6 => /lib64/libc.so.6 (0x00002b7dc48c9000)
libz.so.1 => /lib64/libz.so.1 (0x00002b7dc4c5e000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002b7dc4e74000)
librt.so.1 => /lib64/librt.so.1 (0x00002b7dc5078000)
/lib64/ld-linux-x86-64.so.2 (0x00002b7dbd4a9000)
</source>

The missing symbol does appear to be defined in the driver library on mon54:

<source lang="bash">
[ppomorsk@mon54:/usr/lib64] readelf -Ws libcuda.so.331.89 | grep cuStreamAttachMemAsync
43: 0000000000139890 538 FUNC GLOBAL DEFAULT 10 cuStreamAttachMemAsync
</source>
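
The same symbol check can be made from Python on whichever node the import fails; note that the traceback above was captured on mon241, while the ldd and readelf commands were run on mon54. The following is only a diagnostic sketch using the standard ctypes module. The helper name <code>exports_symbol</code> is hypothetical, and the check says nothing about why a symbol might be missing.

```python
import ctypes


def exports_symbol(library, symbol):
    """Return True if the library, as the dynamic loader resolves it, exports `symbol`."""
    try:
        handle = ctypes.CDLL(library)
    except OSError:
        return False  # library not found or not loadable on this node
    # Attribute lookup on a CDLL object resolves the name in the loaded
    # library and raises AttributeError when the symbol is absent.
    return hasattr(handle, symbol)


if __name__ == "__main__":
    # libcuda.so.1 is the name shown in the ldd output above.
    print(exports_symbol("libcuda.so.1", "cuStreamAttachMemAsync"))
```

Because the library is loaded by name, this tests the copy the loader picks on the current node, which may differ from a file inspected by hand with readelf.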

If everything is working properly, the output should look like this:

<source lang="bash">
[ppomorsk@mon54:~/pycuda] python test/test_driver.py
============================= test session starts ==============================
platform linux2 -- Python 2.6.6 -- pytest-2.3.4
collected 21 items

test_driver.py .....................

========================== 21 passed in 61.39 seconds ==========================
</source>

10. Try the example programs provided with the source code, found in the ''examples'' subdirectory of your pycuda source directory:

<source lang="bash">
python examples/dump_properties.py
python examples/hello_gpu.py
</source>
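
For a quick look at the available GPUs without running the full example, a short query along the following lines may help. This is a sketch rather than the code of dump_properties.py; <code>list_cuda_devices</code> is a hypothetical helper, and it returns None instead of raising when PyCUDA or a GPU is unavailable (for example on a login node).

```python
def list_cuda_devices():
    """Return [(name, compute capability, memory in MiB), ...] or None if unavailable."""
    try:
        import pycuda.driver as drv
    except ImportError:
        return None  # PyCUDA or the CUDA driver library cannot be imported
    try:
        drv.init()  # must be called before any other driver function
    except drv.Error:
        return None  # no usable GPU on this node
    devices = []
    for index in range(drv.Device.count()):
        dev = drv.Device(index)
        devices.append((dev.name(), dev.compute_capability(),
                        dev.total_memory() // (1024 * 1024)))
    return devices


if __name__ == "__main__":
    print(list_cuda_devices())
```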

==Sample PyCUDA code==
This is code from the '''hello_gpu.py''' example program. It multiplies two vectors elementwise on the GPU, then prints the difference from the same calculation done on the CPU (all zeros if the results agree).

<source lang="python">
import pycuda.driver as drv
import pycuda.tools
import pycuda.autoinit  # initializes the driver and creates a context on import
import numpy
import numpy.linalg as la
from pycuda.compiler import SourceModule

# Compile the CUDA kernel; each thread handles one vector element.
mod = SourceModule("""
__global__ void multiply_them(float *dest, float *a, float *b)
{
  const int i = threadIdx.x;
  dest[i] = a[i] * b[i];
}
""")

multiply_them = mod.get_function("multiply_them")

# Input vectors in single precision, matching the kernel's float arguments.
a = numpy.random.randn(400).astype(numpy.float32)
b = numpy.random.randn(400).astype(numpy.float32)

dest = numpy.zeros_like(a)
# drv.In copies an array to the GPU; drv.Out copies the result back.
multiply_them(
        drv.Out(dest), drv.In(a), drv.In(b),
        block=(400,1,1))

# Difference between the GPU and CPU results.
print(dest-a*b)
</source>
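
The kernel above indexes the arrays with <code>threadIdx.x</code> only, so the launch uses a single block whose size equals the vector length. The following pure-Python sketch emulates that mapping on the CPU to show which elements each thread touches. It is an illustration only; <code>launch_single_block</code> is a hypothetical stand-in for the real launch mechanism, and the loop runs sequentially whereas GPU threads run in parallel.

```python
import random


def multiply_them(thread_idx_x, dest, a, b):
    """CPU stand-in for the kernel body: one call per GPU thread."""
    i = thread_idx_x
    dest[i] = a[i] * b[i]


def launch_single_block(kernel, block, *args):
    """Call `kernel` once for every thread index of a one-dimensional block."""
    threads_x = block[0]
    for thread_idx_x in range(threads_x):
        kernel(thread_idx_x, *args)


if __name__ == "__main__":
    n = 400
    a = [random.gauss(0.0, 1.0) for _ in range(n)]
    b = [random.gauss(0.0, 1.0) for _ in range(n)]
    dest = [0.0] * n
    launch_single_block(multiply_them, (n, 1, 1), dest, a, b)
    print(all(d == x * y for d, x, y in zip(dest, a, b)))
```

With a block smaller than the vector, the trailing elements of <code>dest</code> would stay at zero, which is why the block size in the example matches the array length.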

[[Category:Software packages]]
Revision as of 17:02, 22 October 2014