Class CSRPointer
- java.lang.Object
-
- org.apache.sysds.runtime.instructions.gpu.context.CSRPointer
-
public class CSRPointer extends Object
Compressed Sparse Row (CSR) format for CUDA. Generalized matrix multiply, among other operations, is implemented for the CSR format in the cuSPARSE library.

Since we assume that the matrix is stored with zero-based indexing (i.e. CUSPARSE_INDEX_BASE_ZERO), the matrix

1.0 4.0 0.0 0.0 0.0
0.0 2.0 3.0 0.0 0.0
5.0 0.0 0.0 7.0 8.0
0.0 0.0 9.0 0.0 6.0

is stored as

val    = 1.0 4.0 2.0 3.0 5.0 7.0 8.0 9.0 6.0
rowPtr = 0 2 4 7 9
colInd = 0 1 1 2 0 3 4 2 4
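The layout described above can be reproduced on the host with a few lines of Java. The following is a minimal sketch, not part of SystemDS: the class name CsrFromDense is hypothetical and the code only illustrates how val, rowPtr and colInd relate to a dense matrix under zero-based indexing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a SystemDS class): builds zero-based CSR arrays on the host.
final class CsrFromDense {
    final double[] val;  // non-zero values, stored row by row
    final int[] rowPtr;  // rowPtr[i] = offset of row i in val; rowPtr[rows] = nnz
    final int[] colInd;  // column index of each entry in val

    CsrFromDense(double[][] dense) {
        int rows = dense.length;
        List<Double> values = new ArrayList<>();
        List<Integer> cols = new ArrayList<>();
        rowPtr = new int[rows + 1];
        for (int i = 0; i < rows; i++) {
            rowPtr[i] = values.size();          // start of row i
            for (int j = 0; j < dense[i].length; j++) {
                if (dense[i][j] != 0.0) {
                    values.add(dense[i][j]);
                    cols.add(j);
                }
            }
        }
        rowPtr[rows] = values.size();           // end of last row + 1, i.e. nnz
        val = values.stream().mapToDouble(Double::doubleValue).toArray();
        colInd = cols.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

Passing the 4 x 5 example matrix to `new CsrFromDense(...)` should yield the three arrays listed above, with `rowPtr[4]` equal to the number of non-zeroes (9).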
-
-
Field Summary
Fields: Modifier and Type, Field, Description
-
jcuda.Pointer colInd
integer array of nnz values' column indices
-
jcuda.jcusparse.cusparseMatDescr descr
descriptor of matrix, only CUSPARSE_MATRIX_TYPE_GENERAL supported
-
static jcuda.jcusparse.cusparseMatDescr matrixDescriptor
-
long nnz
Number of non zeroes
-
jcuda.Pointer rowPtr
integer array of start of all rows and end of last row + 1
-
jcuda.Pointer val
double array of non zero values
-
Method Summary
Methods: Modifier and Type, Method, Description
-
static CSRPointer allocateEmpty(GPUContext gCtx, long nnz2, long rows)
-
static CSRPointer allocateEmpty(GPUContext gCtx, long nnz2, long rows, boolean initialize)
Factory method to allocate an empty CSR sparse matrix on the GPU
-
static CSRPointer allocateForDgeam(GPUContext gCtx, jcuda.jcusparse.cusparseHandle handle, CSRPointer A, CSRPointer B, int m, int n)
Estimates the number of non zero elements from the results of a sparse cusparseDgeam operation C = a op(A) + b op(B)
-
static CSRPointer allocateForMatrixMultiply(GPUContext gCtx, jcuda.jcusparse.cusparseHandle handle, CSRPointer A, int transA, CSRPointer B, int transB, int m, int n, int k)
Estimates the number of non-zero elements from the result of a sparse matrix multiplication C = A * B and returns the CSRPointer to C with the appropriate GPU memory.
-
CSRPointer clone(int rows)
-
static void copyPtrToHost(CSRPointer src, int rows, long nnz, int[] rowPtr, int[] colInd)
Static method to copy a CSR sparse matrix from device to host
-
static void copyToDevice(GPUContext gCtx, CSRPointer dest, int rows, long nnz, int[] rowPtr, int[] colInd, double[] values)
Static method to copy a CSR sparse matrix from host to device
-
void deallocate()
Calls cudaFree lazily on the allocated Pointer instances
-
void deallocate(boolean eager)
Calls cudaFree lazily or eagerly on the allocated Pointer instances
-
static long estimateSize(long nnz2, long rows)
Estimate the size of a CSR matrix in GPU memory. Size of pointers is not needed and is not added in.
-
static jcuda.jcusparse.cusparseMatDescr getDefaultCuSparseMatrixDescriptor()
-
boolean isUltraSparse(int rows, int cols)
Check for ultra sparsity
-
jcuda.Pointer toColumnMajorDenseMatrix(jcuda.jcusparse.cusparseHandle cusparseHandle, jcuda.jcublas.cublasHandle cublasHandle, int rows, int cols, String instName)
Copies this CSR matrix on the GPU to a dense column-major matrix on the GPU.
-
static int toIntExact(long l)
-
String toString()
-
-
-
Field Detail
-
matrixDescriptor
public static jcuda.jcusparse.cusparseMatDescr matrixDescriptor
-
nnz
public long nnz
Number of non zeroes
-
val
public jcuda.Pointer val
double array of non zero values
-
rowPtr
public jcuda.Pointer rowPtr
integer array of start of all rows and end of last row + 1
-
colInd
public jcuda.Pointer colInd
integer array of nnz values' column indices
-
descr
public jcuda.jcusparse.cusparseMatDescr descr
descriptor of matrix, only CUSPARSE_MATRIX_TYPE_GENERAL supported
-
-
Method Detail
-
toIntExact
public static int toIntExact(long l)
-
getDefaultCuSparseMatrixDescriptor
public static jcuda.jcusparse.cusparseMatDescr getDefaultCuSparseMatrixDescriptor()
- Returns:
- Singleton default matrix descriptor object (set with CUSPARSE_MATRIX_TYPE_GENERAL, CUSPARSE_INDEX_BASE_ZERO)
-
estimateSize
public static long estimateSize(long nnz2, long rows)
Estimate the size of a CSR matrix in GPU memory. Size of pointers is not needed and is not added in.
- Parameters:
nnz2
- number of non zeroes
rows
- number of rows
- Returns:
- size estimate
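As an illustration of what such an estimate may look like, the sketch below adds up the three CSR arrays described in the field summary: nnz doubles for val, nnz ints for colInd and rows + 1 ints for rowPtr. The class and method names are hypothetical, and the formula is an assumption derived from the field descriptions, not a copy of the SystemDS implementation.

```java
// Hypothetical stand-in for the estimate; assumes 8-byte doubles and 4-byte ints
// and ignores the size of the pointers themselves, as the description states.
final class CsrSizeSketch {
    static long estimateSizeSketch(long nnz, long rows) {
        long valBytes = Math.multiplyExact(nnz, (long) Double.BYTES);          // val
        long colIndBytes = Math.multiplyExact(nnz, (long) Integer.BYTES);      // colInd
        long rowPtrBytes = Math.multiplyExact(rows + 1, (long) Integer.BYTES); // rowPtr
        return Math.addExact(Math.addExact(valBytes, colIndBytes), rowPtrBytes);
    }
}
```

Under these assumptions, the 4-row example matrix with 9 non-zeroes would need 72 + 36 + 20 = 128 bytes.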
-
copyToDevice
public static void copyToDevice(GPUContext gCtx, CSRPointer dest, int rows, long nnz, int[] rowPtr, int[] colInd, double[] values)
Static method to copy a CSR sparse matrix from host to device
- Parameters:
gCtx
- GPUContext
dest
- [input] destination location (on GPU)
rows
- number of rows
nnz
- number of non-zeroes
rowPtr
- integer array of row pointers
colInd
- integer array of column indices
values
- double array of non zero values
-
copyPtrToHost
public static void copyPtrToHost(CSRPointer src, int rows, long nnz, int[] rowPtr, int[] colInd)
Static method to copy a CSR sparse matrix from device to host
- Parameters:
src
- [input] source location (on GPU)
rows
- [input] number of rows
nnz
- [input] number of non-zeroes
rowPtr
- [output] pre-allocated integer array of row pointers of size (rows+1)
colInd
- [output] pre-allocated integer array of column indices of size nnz
-
allocateForDgeam
public static CSRPointer allocateForDgeam(GPUContext gCtx, jcuda.jcusparse.cusparseHandle handle, CSRPointer A, CSRPointer B, int m, int n)
Estimates the number of non zero elements from the results of a sparse cusparseDgeam operation C = a op(A) + b op(B)
- Parameters:
gCtx
- a valid GPUContext
handle
- a valid cusparseHandle
A
- Sparse Matrix A on GPU
B
- Sparse Matrix B on GPU
m
- Rows in A
n
- Columns in B
- Returns:
- CSR (compressed sparse row) pointer
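To make the estimate concrete: the sparsity pattern of C = A + B is, row by row, the union of the column indices of A and B. The sketch below computes the resulting rowPtr on the host. It is an illustration only; the class and method names are hypothetical, no transposition is applied, and the real method obtains the count from cuSPARSE on the device.

```java
import java.util.Arrays;

// Hypothetical host-side illustration of counting non-zeroes of C = A + B.
final class GeamNnzSketch {
    // Returns rowPtr of C; the total number of non-zeroes is the last element.
    static int[] rowPtrOfSum(int[] rowPtrA, int[] colIndA,
                             int[] rowPtrB, int[] colIndB, int m, int n) {
        int[] rowPtrC = new int[m + 1];
        int[] lastSeenRow = new int[n];   // lastSeenRow[j] == i: column j already counted in row i
        Arrays.fill(lastSeenRow, -1);
        for (int i = 0; i < m; i++) {
            int count = 0;
            for (int p = rowPtrA[i]; p < rowPtrA[i + 1]; p++) {
                if (lastSeenRow[colIndA[p]] != i) { lastSeenRow[colIndA[p]] = i; count++; }
            }
            for (int p = rowPtrB[i]; p < rowPtrB[i + 1]; p++) {
                if (lastSeenRow[colIndB[p]] != i) { lastSeenRow[colIndB[p]] = i; count++; }
            }
            rowPtrC[i + 1] = rowPtrC[i] + count;
        }
        return rowPtrC;
    }
}
```

The count is structural: entries that happen to cancel numerically still occupy a slot, so it is an upper bound on the number of numerically non-zero values in C.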
-
allocateForMatrixMultiply
public static CSRPointer allocateForMatrixMultiply(GPUContext gCtx, jcuda.jcusparse.cusparseHandle handle, CSRPointer A, int transA, CSRPointer B, int transB, int m, int n, int k)
Estimates the number of non-zero elements from the result of a sparse matrix multiplication C = A * B and returns the CSRPointer to C with the appropriate GPU memory.
- Parameters:
gCtx
- a valid GPUContext
handle
- a valid cusparseHandle
A
- Sparse Matrix A on GPU
transA
- 'T' if A is to be transposed, 'N' otherwise
B
- Sparse Matrix B on GPU
transB
- 'T' if B is to be transposed, 'N' otherwise
m
- Rows in A
n
- Columns in B
k
- Columns in A / Rows in B
- Returns:
- a CSRPointer instance that encapsulates the CSR matrix on GPU
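The non-zero count of a sparse product can be found without computing any values: row i of C has a structural non-zero in column j whenever some column index c in row i of A selects a row c of B that contains j. The sketch below shows this symbolic pass on the host for the untransposed case. The names are hypothetical, and the real method relies on cuSPARSE on the device rather than on code like this.

```java
import java.util.Arrays;

// Hypothetical host-side illustration of the symbolic phase of C = A * B,
// with A of size m x k and B of size k x n, both in zero-based CSR.
final class SpGemmNnzSketch {
    // Returns rowPtr of C; the total number of non-zeroes is the last element.
    static int[] rowPtrOfProduct(int[] rowPtrA, int[] colIndA,
                                 int[] rowPtrB, int[] colIndB, int m, int n) {
        int[] rowPtrC = new int[m + 1];
        int[] lastSeenRow = new int[n];   // marks columns of C already counted in the current row
        Arrays.fill(lastSeenRow, -1);
        for (int i = 0; i < m; i++) {
            int count = 0;
            for (int p = rowPtrA[i]; p < rowPtrA[i + 1]; p++) {
                int rowOfB = colIndA[p];  // column of A selects a row of B
                for (int q = rowPtrB[rowOfB]; q < rowPtrB[rowOfB + 1]; q++) {
                    int j = colIndB[q];
                    if (lastSeenRow[j] != i) { lastSeenRow[j] = i; count++; }
                }
            }
            rowPtrC[i + 1] = rowPtrC[i] + count;
        }
        return rowPtrC;
    }
}
```

Once rowPtr of C is known, val and colInd of C can be allocated with exactly `rowPtrC[m]` entries before the numeric multiplication runs.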
-
allocateEmpty
public static CSRPointer allocateEmpty(GPUContext gCtx, long nnz2, long rows, boolean initialize)
Factory method to allocate an empty CSR sparse matrix on the GPU
- Parameters:
gCtx
- a valid GPUContext
nnz2
- number of non-zeroes
rows
- number of rows
initialize
- memset to zero?
- Returns:
- a CSRPointer instance that encapsulates the CSR matrix on GPU
-
allocateEmpty
public static CSRPointer allocateEmpty(GPUContext gCtx, long nnz2, long rows)
-
clone
public CSRPointer clone(int rows)
-
isUltraSparse
public boolean isUltraSparse(int rows, int cols)
Check for ultra sparsity
- Parameters:
rows
- number of rows
cols
- number of columns
- Returns:
- true if ultra sparse
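A check of this kind typically compares the fraction of non-zero cells against a cut-off. The sketch below shows that idea only; the class name, the explicit nnz argument and the threshold parameter are all hypothetical, since this page does not state the value or the exact formula the method uses.

```java
// Hypothetical illustration of a sparsity test; the threshold is supplied by
// the caller because the actual cut-off is not documented on this page.
final class UltraSparsitySketch {
    static boolean isUltraSparse(long nnz, int rows, int cols, double threshold) {
        // Divide step by step in double precision to avoid overflow of rows * cols.
        double sparsity = (double) nnz / rows / cols;
        return sparsity < threshold;
    }
}
```

With an assumed threshold of 0.001, the 4 x 5 example matrix (9 of 20 cells non-zero) would not count as ultra sparse, while a 1000 x 1000 matrix with a single non-zero would.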
-
toColumnMajorDenseMatrix
public jcuda.Pointer toColumnMajorDenseMatrix(jcuda.jcusparse.cusparseHandle cusparseHandle, jcuda.jcublas.cublasHandle cublasHandle, int rows, int cols, String instName)
Copies this CSR matrix on the GPU to a dense column-major matrix on the GPU. This is a temporary matrix for operations such as cusparseDcsrmv. Since the allocated matrix is temporary, bookkeeping is not updated. The caller is responsible for calling "free" on the returned Pointer object.
- Parameters:
cusparseHandle
- a valid cusparseHandle
cublasHandle
- a valid cublasHandle
rows
- number of rows in this CSR matrix
cols
- number of columns in this CSR matrix
instName
- name of the invoking instruction to record Statistics.
- Returns:
- A Pointer to the allocated dense matrix (in column-major format)
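Column-major means that element (i, j) of the dense result lives at offset j * rows + i. The host-side sketch below performs the same layout conversion for arrays already copied from the device. It is illustrative only: the class name is hypothetical, and the real method performs the conversion on the GPU through the library handles it receives.

```java
// Hypothetical host-side illustration of expanding zero-based CSR arrays
// into a dense column-major array.
final class CsrToColumnMajor {
    static double[] toColumnMajor(double[] val, int[] rowPtr, int[] colInd,
                                  int rows, int cols) {
        double[] dense = new double[rows * cols];   // zero-initialized by Java
        for (int i = 0; i < rows; i++) {
            for (int p = rowPtr[i]; p < rowPtr[i + 1]; p++) {
                dense[colInd[p] * rows + i] = val[p];   // column-major offset
            }
        }
        return dense;
    }
}
```

For the 4 x 5 example matrix, the value 7.0 at row 2, column 3 ends up at offset 3 * 4 + 2 = 14.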
-
deallocate
public void deallocate()
Calls cudaFree lazily on the allocated Pointer instances
-
deallocate
public void deallocate(boolean eager)
Calls cudaFree lazily or eagerly on the allocated Pointer instances
- Parameters:
eager
- whether to do eager or lazy cudaFrees
-
-