Run SystemDS with GPU
This guide covers the GPU hardware and software setup for running SystemDS in GPU mode.
- Requirements
- Linux
- Windows
- Command-line users
- Scala users
- Advanced Configuration
- Training very deep network
Requirements
Hardware
The following GPUs are supported:
- NVIDIA GPU cards with CUDA architectures 5.0, 6.0, 7.0, 7.5, 8.0, and higher. For a list of CUDA-enabled GPU cards, see CUDA GPUs.
- For GPUs with unsupported CUDA architectures, or to avoid JIT compilation from PTX, or to use different versions of the NVIDIA libraries, build on Linux from source code.
- Release artifacts contain PTX code for the latest supported CUDA architecture. If PTX for your specific architecture is not available, enable JIT compilation from PTX by following the instructions for the compiler driver nvcc in GPU Compilation. For example, --gpu-code takes actual GPU names, and --gpu-architecture is the name of the virtual compute architecture:
nvcc SystemDS.cu --gpu-architecture=compute_50 --gpu-code=sm_50,sm_52
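As a rough illustration of how the two flags relate, the sketch below derives both from a compute capability string. The helper name nvcc_arch_flags is hypothetical and is not part of SystemDS or the CUDA toolkit; it only formats the flag string and does not invoke nvcc.

```shell
# nvcc_arch_flags CAP: map a compute capability such as "5.2" to nvcc flags.
# Hypothetical helper for illustration only; it just formats strings.
nvcc_arch_flags() {
  cap="$(printf '%s' "$1" | tr -d '.')"   # "5.2" -> "52"
  # compute_XX names the virtual architecture, sm_XX the actual GPU.
  printf '%s\n' "--gpu-architecture=compute_${cap} --gpu-code=sm_${cap}"
}

# Print the flags for a compute capability 5.0 card.
nvcc_arch_flags 5.0
```

The printed flags can be appended to an nvcc invocation such as the one above.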
Note: At least 30 GB of disk space is recommended.
CUDA toolkit version 10.2 or later is recommended for the following GPUs.
GPU type | Status |
---|---|
NVIDIA T4 | Experimental |
NVIDIA V100 | Experimental |
NVIDIA P100 | Experimental |
NVIDIA P4 | Experimental |
NVIDIA K80 | Tested |
NVIDIA A100 | Not supported |
Software
The following NVIDIA software must be installed on your system:
CUDA toolkit
- NVIDIA GPU drivers: CUDA 10.2 requires driver version 440.33 or later. See CUDA compatibility.
- CUDA 10.2
- cuDNN 7.x
Linux
The easiest way to install the NVIDIA software is with apt on Ubuntu. For other distributions, refer to the CUDA install Linux guide.
Note: Not all Linux distributions support this. You might encounter problems with driver installation.
To check the CUDA compatible driver version:
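One possible way to do this, assuming nvidia-smi is already on the PATH, is to query the installed driver version and compare it with the 440.33 minimum listed under Software. The version_ge helper is a hypothetical name introduced here, and it relies on GNU sort -V, which is available on Ubuntu.

```shell
# Minimum driver version that CUDA 10.2 requires, per the Software section.
MIN_DRIVER="440.33"

# version_ge A B: succeeds when version A >= version B (uses GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v nvidia-smi >/dev/null 2>&1; then
  # Driver version reported for the first GPU.
  installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
  if version_ge "$installed" "$MIN_DRIVER"; then
    echo "Driver $installed satisfies CUDA 10.2 (>= $MIN_DRIVER)"
  else
    echo "Driver $installed is older than $MIN_DRIVER"
  fi
else
  echo "nvidia-smi not found; install the NVIDIA driver first"
fi
```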
Install CUPTI, which ships with the CUDA toolkit, for profiling:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/extras/CUPTI/lib64
Install CUDA with apt
The following instructions are for installing CUDA 10.2 on Ubuntu 18.04. These instructions might work for other Debian-based distros.
Note: Secure Boot tends to complicate installation. These instructions may not address this.
Ubuntu 18.04 (CUDA 10.2)
# Add NVIDIA package repositories
# 1. Download the Ubuntu 18.04 driver repository
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
# 2. Move the repository to preferences
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
# 3. Fetch keys
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
# 4. Add the repository
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
# 5. Update package lists
sudo apt-get update
# ---
# 6. Get the machine-learning repository
# This downloads the repository package, not the actual installation package
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt-get update
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7_7.6.5.32-1+cuda10.2_amd64.deb
sudo apt install ./libcudnn7_7.6.5.32-1+cuda10.2_amd64.deb
sudo apt-get update
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7-dev_7.6.5.32-1+cuda10.2_amd64.deb
sudo apt install ./libcudnn7-dev_7.6.5.32-1+cuda10.2_amd64.deb
sudo apt-get update
# ---
# 7. Install development and runtime libraries (~4GB)
sudo apt-get install --no-install-recommends \
cuda-10-2 \
libcudnn7=7.6.5.32-1+cuda10.2 \
libcudnn7-dev=7.6.5.32-1+cuda10.2
# Reboot the system, then run `nvidia-smi` to check the GPU.
Installation check
$ nvidia-smi
Thu May 13 04:19:11 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.19.01    Driver Version: 465.19.01    CUDA Version: 11.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA Tesla K80    Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   38C    P0    58W / 149W |      0MiB / 11441MiB |     98%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
To run SystemDS with CUDA
Pass the .dml file with the -f flag:
java -Xmx4g -Xms4g -Xmn400m -cp target/SystemDS.jar:target/lib/*:target/SystemDS-*.jar org.apache.sysds.api.DMLScript -f ../main.dml -exec singlenode -gpu
[ INFO] BEGIN DML run 05/14/2021 02:37:26
[ INFO] Initializing CUDA
[ INFO] GPU memory - Total: 11996.954624 MB, Available: 11750.539264 MB on GPUContext{deviceNum=0}
[ INFO] Total number of GPUs on the machine: 1
[ INFO] GPUs being used: -1
[ INFO] Initial GPU memory: 10575485337
This is SystemDS!
SystemDS Statistics:
Total execution time: 0.020 sec.
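For reference, a script that produces the message in the log above could be as small as a single print statement. The sketch below writes such a file and runs it; the contents of main.dml are an assumption inferred from the sample output, and the java command is only executed when a built target/SystemDS.jar is present.

```shell
# Write a one-line DML script whose message matches the sample log.
cat > main.dml <<'EOF'
print("This is SystemDS!");
EOF

# Run it on the GPU backend only if a built SystemDS jar exists here.
if [ -f target/SystemDS.jar ]; then
  java -Xmx4g -Xms4g -Xmn400m \
    -cp target/SystemDS.jar:target/lib/*:target/SystemDS-*.jar \
    org.apache.sysds.api.DMLScript -f main.dml -exec singlenode -gpu
fi
```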
Windows
Install the hardware and software requirements.
Add the CUDA, CUPTI, and cuDNN installation directories to the %PATH% environment variable. Neural networks won’t run without the cuDNN library cuDNN64_7*.dll.
See the Windows install from source guide.
SET PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin;%PATH%
SET PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\extras\CUPTI\lib64;%PATH%
SET PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include;%PATH%
SET PATH=C:\tools\cuda\bin;%PATH%
Command-line users
To enable the GPU backend from the command line, provide systemds-*-extra.jar in the classpath and pass the -gpu flag.
spark-submit --jars systemds-*-extra.jar SystemDS.jar -f myDML.dml -gpu
To skip memory checking and force all GPU-enabled operations onto the GPU, pass the force option to the -gpu flag.
spark-submit --jars systemds-*-extra.jar SystemDS.jar -f myDML.dml -gpu force
Scala users
To enable the GPU backend from Scala, provide systemds-*-extra.jar in the classpath and call the setGPU(true) method of the MLContext API.
spark-shell --jars systemds-*-extra.jar,SystemDS.jar
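A minimal sketch of what the Scala side might look like once the shell is running, wrapped in a shell snippet so it can be preloaded with spark-shell -i. The file name gpu_demo.scala and the one-line DML script are assumptions made for illustration; spark is the session that spark-shell provides, and Scala's boolean literal is lowercase true.

```shell
# Write a small Scala script; `spark` is the SparkSession provided by spark-shell.
cat > gpu_demo.scala <<'EOF'
import org.apache.sysds.api.mlcontext._
import org.apache.sysds.api.mlcontext.ScriptFactory._

val ml = new MLContext(spark)
ml.setGPU(true)                                  // enable the GPU backend
val script = dml("print('This is SystemDS!')")   // illustrative DML script
ml.execute(script)
EOF

# Launch only when spark-shell and the SystemDS jar are available.
if command -v spark-shell >/dev/null 2>&1 && [ -f SystemDS.jar ]; then
  spark-shell --jars systemds-*-extra.jar,SystemDS.jar -i gpu_demo.scala
fi
```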
Advanced Configuration
Using single precision
By default, SystemDS uses double precision to store its matrices in GPU memory.
To use single precision, set the configuration property sysds.floating.point.precision to single. However, with the exception of BLAS operations, SystemDS always performs all CPU operations in double precision.
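A minimal sketch of setting this property, assuming it is supplied through an XML configuration file passed with the -config flag. The file name SystemDS-config.xml is an assumption; any path can be used.

```shell
# Write a config file that stores GPU matrices in single precision.
cat > SystemDS-config.xml <<'EOF'
<root>
   <sysds.floating.point.precision>single</sysds.floating.point.precision>
</root>
EOF

# Show the resulting file.
cat SystemDS-config.xml
```

The file can then be supplied at launch, for example by appending -config SystemDS-config.xml to the java command shown earlier in this guide.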
Training very deep network
Shadow buffer
To train a very deep network with double precision, no additional configuration is necessary.
To train a very deep network with single precision, however, you can speed up eviction by using a shadow buffer. The fraction of the driver memory allocated to the shadow buffer is set with the configuration property sysds.gpu.eviction.shadow.bufferSize.
In the current version, the shadow buffer is not guarded by SystemDS and can lead to out-of-memory (OOM) errors if the network is both deep and wide.
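As a sketch, the shadow buffer could be enabled together with single precision in one configuration file. The file name shadow-buffer-config.xml and the fraction 0.1 are illustrative assumptions, not recommended values; pick a fraction that leaves enough driver memory for the rest of the workload.

```shell
# Illustrative fraction of driver memory for the shadow buffer (an assumption).
SHADOW_FRACTION="0.1"

# Single precision plus a shadow buffer, written to a hypothetical config file.
cat > shadow-buffer-config.xml <<EOF
<root>
   <sysds.floating.point.precision>single</sysds.floating.point.precision>
   <sysds.gpu.eviction.shadow.bufferSize>${SHADOW_FRACTION}</sysds.gpu.eviction.shadow.bufferSize>
</root>
EOF

cat shadow-buffer-config.xml
```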
Unified memory allocator
SystemDS uses CUDA’s memory allocator and performs on-demand eviction using only the Least Recently Used (LRU) eviction policy, as set by sysds.gpu.eviction.policy.
To use CUDA’s unified memory allocator, which performs page-level eviction instead, set the configuration property sysds.gpu.memory.allocator to unified_memory.
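A sketch of a configuration file that selects the unified memory allocator. The property is written here with the sysds. prefix to match the other properties in this guide, which is an assumption to verify against your SystemDS version; the file name unified-memory-config.xml is also hypothetical.

```shell
# Select CUDA's unified memory allocator (property prefix is an assumption).
cat > unified-memory-config.xml <<'EOF'
<root>
   <sysds.gpu.memory.allocator>unified_memory</sysds.gpu.memory.allocator>
</root>
EOF

cat unified-memory-config.xml
```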