Class CSRPointer


  • public class CSRPointer
    extends Object
    Compressed Sparse Row (CSR) format for CUDA. Generalized matrix multiply, among other operations, is implemented for the CSR format in the cuSPARSE library. Since we assume that the matrix is stored with zero-based indexing (i.e. CUSPARSE_INDEX_BASE_ZERO), the matrix

        1.0 4.0 0.0 0.0 0.0
        0.0 2.0 3.0 0.0 0.0
        5.0 0.0 0.0 7.0 8.0
        0.0 0.0 9.0 0.0 6.0

    is stored as

        val    = 1.0 4.0 2.0 3.0 5.0 7.0 8.0 9.0 6.0
        rowPtr = 0 2 4 7 9
        colInd = 0 1 1 2 0 3 4 2 4
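    The layout described above can be reproduced on the host with plain Java arrays. The following is a minimal sketch for illustration only; the class CsrLayoutSketch and its method fromDense are hypothetical names and are not part of CSRPointer, which keeps these arrays in GPU memory.

```java
import java.util.Arrays;

/** Host-side illustration of the CSR layout; hypothetical, not part of CSRPointer. */
final class CsrLayoutSketch {
    final double[] val;  // non-zero values, stored row by row
    final int[] rowPtr;  // rows + 1 offsets into val / colInd
    final int[] colInd;  // column index of each entry in val

    private CsrLayoutSketch(double[] val, int[] rowPtr, int[] colInd) {
        this.val = val;
        this.rowPtr = rowPtr;
        this.colInd = colInd;
    }

    /** Builds zero-based CSR arrays from a dense matrix given as rows. */
    static CsrLayoutSketch fromDense(double[][] dense) {
        int rows = dense.length;
        int nnz = 0;
        for (double[] row : dense)
            for (double v : row)
                if (v != 0.0) nnz++;

        double[] val = new double[nnz];
        int[] colInd = new int[nnz];
        int[] rowPtr = new int[rows + 1];
        int pos = 0;
        for (int i = 0; i < rows; i++) {
            rowPtr[i] = pos; // offset at which row i starts
            for (int j = 0; j < dense[i].length; j++) {
                if (dense[i][j] != 0.0) {
                    val[pos] = dense[i][j];
                    colInd[pos] = j;
                    pos++;
                }
            }
        }
        rowPtr[rows] = pos; // end of last row + 1, which equals nnz
        return new CsrLayoutSketch(val, rowPtr, colInd);
    }

    public static void main(String[] args) {
        double[][] dense = {
            {1.0, 4.0, 0.0, 0.0, 0.0},
            {0.0, 2.0, 3.0, 0.0, 0.0},
            {5.0, 0.0, 0.0, 7.0, 8.0},
            {0.0, 0.0, 9.0, 0.0, 6.0}
        };
        CsrLayoutSketch csr = fromDense(dense);
        System.out.println("val    = " + Arrays.toString(csr.val));
        System.out.println("rowPtr = " + Arrays.toString(csr.rowPtr)); // [0, 2, 4, 7, 9]
        System.out.println("colInd = " + Arrays.toString(csr.colInd)); // [0, 1, 1, 2, 0, 3, 4, 2, 4]
    }
}
```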
    • Field Detail

      • matrixDescriptor

        public static jcuda.jcusparse.cusparseMatDescr matrixDescriptor
      • nnz

        public long nnz
        Number of non-zero values
      • val

        public jcuda.Pointer val
        double array of non-zero values
      • rowPtr

        public jcuda.Pointer rowPtr
        integer array holding the start offset of every row, followed by the end of the last row + 1 (which equals nnz under zero-based indexing)
      • colInd

        public jcuda.Pointer colInd
        integer array holding the column index of each of the nnz non-zero values
      • descr

        public jcuda.jcusparse.cusparseMatDescr descr
        descriptor of matrix, only CUSPARSE_MATRIX_TYPE_GENERAL supported
    • Method Detail

      • toIntExact

        public static int toIntExact​(long l)
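        No description is given for this method. Its name matches java.lang.Math.toIntExact, so a plausible reading is an overflow-checked narrowing from long to int. The sketch below illustrates that reading only; the class name, method name and the exception type are assumptions, and the real method may signal failure differently.

```java
/** Hypothetical illustration of an overflow-checked long-to-int conversion. */
final class IntNarrowingSketch {
    /**
     * Returns l as an int, or throws if l is outside the int range.
     * The exception type is an assumption made for this sketch.
     */
    static int toIntExactSketch(long l) {
        if (l < Integer.MIN_VALUE || l > Integer.MAX_VALUE) {
            throw new ArithmeticException("long value " + l + " does not fit into an int");
        }
        return (int) l;
    }

    public static void main(String[] args) {
        System.out.println(toIntExactSketch(9L)); // 9
        try {
            toIntExactSketch(Integer.MAX_VALUE + 1L);
        } catch (ArithmeticException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

        A check of this kind matters here because nnz is carried as a long, while Java arrays are indexed with int.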
      • getDefaultCuSparseMatrixDescriptor

        public static jcuda.jcusparse.cusparseMatDescr getDefaultCuSparseMatrixDescriptor()
        Returns:
        Singleton default matrix descriptor object (set with CUSPARSE_MATRIX_TYPE_GENERAL, CUSPARSE_INDEX_BASE_ZERO)
      • estimateSize

        public static long estimateSize​(long nnz2,
                                        long rows)
        Estimates the size of a CSR matrix in GPU memory. The size of the pointers themselves is not needed and is not included.
        Parameters:
        nnz2 - number of non-zeroes
        rows - number of rows
        Returns:
        size estimate
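        The formula behind the estimate is not stated here. A rough sketch, assuming double-precision values and 32-bit indices, adds up the three CSR arrays: nnz doubles for val, nnz ints for colInd, and rows + 1 ints for rowPtr. The class CsrSizeSketch is hypothetical, and the real method may include further terms (for example for the matrix descriptor).

```java
/** Hypothetical byte-size estimate for the three CSR arrays. */
final class CsrSizeSketch {
    /**
     * Bytes needed for val (nnz doubles), colInd (nnz ints) and
     * rowPtr (rows + 1 ints). Uses exact arithmetic so that an
     * overflow raises ArithmeticException instead of wrapping.
     */
    static long estimateBytes(long nnz, long rows) {
        long valBytes = Math.multiplyExact(nnz, (long) Double.BYTES);
        long colIndBytes = Math.multiplyExact(nnz, (long) Integer.BYTES);
        long rowPtrBytes = Math.multiplyExact(Math.addExact(rows, 1L), (long) Integer.BYTES);
        return Math.addExact(Math.addExact(valBytes, colIndBytes), rowPtrBytes);
    }

    public static void main(String[] args) {
        // 4 x 5 example matrix with 9 non-zeroes: 72 + 36 + 20 bytes
        System.out.println(estimateBytes(9, 4)); // 128
    }
}
```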
      • copyToDevice

        public static void copyToDevice​(GPUContext gCtx,
                                        CSRPointer dest,
                                        int rows,
                                        long nnz,
                                        int[] rowPtr,
                                        int[] colInd,
                                        double[] values)
        Static method to copy a CSR sparse matrix from Host to Device
        Parameters:
        gCtx - GPUContext
        dest - [input] destination location (on GPU)
        rows - number of rows
        nnz - number of non-zeroes
        rowPtr - integer array of row pointers
        colInd - integer array of column indices
        values - double array of non-zero values
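        The parameters above must describe a consistent CSR matrix. As an illustration of what consistency means, the following sketch checks host arrays before a copy. It is a hypothetical pre-flight helper (CsrValidationSketch is not part of this API), and nothing here implies that copyToDevice performs these checks itself.

```java
/** Hypothetical consistency check for host-side CSR arrays. */
final class CsrValidationSketch {
    /** Throws IllegalArgumentException if the arrays cannot describe a zero-based CSR matrix. */
    static void validate(int rows, long nnz, int[] rowPtr, int[] colInd, double[] values) {
        if (rows < 0 || nnz < 0)
            throw new IllegalArgumentException("rows and nnz must be non-negative");
        if (rowPtr.length < rows + 1)
            throw new IllegalArgumentException("rowPtr needs at least rows + 1 entries");
        if (colInd.length < nnz || values.length < nnz)
            throw new IllegalArgumentException("colInd and values need at least nnz entries");
        if (rowPtr[0] != 0)
            throw new IllegalArgumentException("zero-based CSR must start at offset 0");
        if (rowPtr[rows] != nnz)
            throw new IllegalArgumentException("last rowPtr entry must equal nnz");
        for (int i = 0; i < rows; i++) {
            if (rowPtr[i] > rowPtr[i + 1])
                throw new IllegalArgumentException("rowPtr must be non-decreasing at row " + i);
        }
        for (int p = 0; p < nnz; p++) {
            if (colInd[p] < 0)
                throw new IllegalArgumentException("negative column index at position " + p);
        }
    }

    public static void main(String[] args) {
        int[] rowPtr = {0, 2, 4, 7, 9};
        int[] colInd = {0, 1, 1, 2, 0, 3, 4, 2, 4};
        double[] values = {1.0, 4.0, 2.0, 3.0, 5.0, 7.0, 8.0, 9.0, 6.0};
        validate(4, 9, rowPtr, colInd, values);
        System.out.println("arrays are consistent");
    }
}
```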
      • copyPtrToHost

        public static void copyPtrToHost​(CSRPointer src,
                                         int rows,
                                         long nnz,
                                         int[] rowPtr,
                                         int[] colInd)
        Static method to copy the row pointers and column indices of a CSR sparse matrix from device to host
        Parameters:
        src - [input] source location (on GPU)
        rows - [input] number of rows
        nnz - [input] number of non-zeroes
        rowPtr - [output] pre-allocated integer array of row pointers of size (rows+1)
        colInd - [output] pre-allocated integer array of column indices of size nnz
      • allocateForDgeam

        public static CSRPointer allocateForDgeam​(GPUContext gCtx,
                                                  jcuda.jcusparse.cusparseHandle handle,
                                                  CSRPointer A,
                                                  CSRPointer B,
                                                  int m,
                                                  int n)
        Estimates the number of non-zero elements in the result of a sparse cusparseDgeam operation C = a op(A) + b op(B)
        Parameters:
        gCtx - a valid GPUContext
        handle - a valid cusparseHandle
        A - Sparse Matrix A on GPU
        B - Sparse Matrix B on GPU
        m - Rows in A
        n - Columns in B
        Returns:
        CSR (compressed sparse row) pointer
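        To make the estimation step concrete, here is a host-side sketch of counting the structural non-zeroes of C = A + B from the CSR index arrays alone. It assumes no transposition (op is the identity) and counts positions, so values that cancel to zero are still counted. The class CsrAddNnzSketch is hypothetical; the actual method delegates this work to cuSPARSE on the GPU.

```java
import java.util.Arrays;

/** Hypothetical host-side count of the sparsity pattern of A + B. */
final class CsrAddNnzSketch {
    /**
     * Returns rowPtr of C = A + B for m x n inputs in zero-based CSR.
     * The last entry of the result is the number of non-zeroes in C.
     */
    static int[] rowPtrOfSum(int m, int n,
                             int[] rowPtrA, int[] colIndA,
                             int[] rowPtrB, int[] colIndB) {
        int[] rowPtrC = new int[m + 1];
        int[] lastRowSeen = new int[n]; // row in which each column was last counted
        Arrays.fill(lastRowSeen, -1);
        for (int i = 0; i < m; i++) {
            int count = 0;
            for (int p = rowPtrA[i]; p < rowPtrA[i + 1]; p++) {
                if (lastRowSeen[colIndA[p]] != i) { lastRowSeen[colIndA[p]] = i; count++; }
            }
            for (int p = rowPtrB[i]; p < rowPtrB[i + 1]; p++) {
                if (lastRowSeen[colIndB[p]] != i) { lastRowSeen[colIndB[p]] = i; count++; }
            }
            rowPtrC[i + 1] = rowPtrC[i] + count;
        }
        return rowPtrC;
    }

    public static void main(String[] args) {
        // A is the 4 x 5 example matrix; B has entries at (0,0), (1,4), (3,2), (3,3)
        int[] rowPtrA = {0, 2, 4, 7, 9};
        int[] colIndA = {0, 1, 1, 2, 0, 3, 4, 2, 4};
        int[] rowPtrB = {0, 1, 2, 2, 4};
        int[] colIndB = {0, 4, 2, 3};
        int[] rowPtrC = rowPtrOfSum(4, 5, rowPtrA, colIndA, rowPtrB, colIndB);
        System.out.println(Arrays.toString(rowPtrC)); // [0, 2, 5, 8, 11]
    }
}
```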
      • allocateForMatrixMultiply

        public static CSRPointer allocateForMatrixMultiply​(GPUContext gCtx,
                                                           jcuda.jcusparse.cusparseHandle handle,
                                                           CSRPointer A,
                                                           int transA,
                                                           CSRPointer B,
                                                           int transB,
                                                           int m,
                                                           int n,
                                                           int k)
        Estimates the number of non-zero elements from the result of a sparse matrix multiplication C = A * B and returns the CSRPointer to C with the appropriate GPU memory.
        Parameters:
        gCtx - a valid GPUContext
        handle - a valid cusparseHandle
        A - Sparse Matrix A on GPU
        transA - 'T' if A is to be transposed, 'N' otherwise
        B - Sparse Matrix B on GPU
        transB - 'T' if B is to be transposed, 'N' otherwise
        m - Rows in A
        n - Columns in B
        k - Columns in A / Rows in B
        Returns:
        a CSRPointer instance that encapsulates the CSR matrix on GPU
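        The estimation step can be pictured as a symbolic multiplication that looks only at index arrays. The sketch below does this on the host for C = A * B without transposition, where A is m x k and B is k x n; it counts structural non-zeroes and ignores numerical cancellation. CsrMultiplyNnzSketch is a hypothetical name, and the actual method relies on cuSPARSE rather than on code like this.

```java
import java.util.Arrays;

/** Hypothetical host-side count of the sparsity pattern of A * B. */
final class CsrMultiplyNnzSketch {
    /**
     * Returns rowPtr of C = A * B (A is m x k, B is k x n, zero-based CSR).
     * The last entry of the result is the number of non-zeroes in C.
     */
    static int[] rowPtrOfProduct(int m, int n,
                                 int[] rowPtrA, int[] colIndA,
                                 int[] rowPtrB, int[] colIndB) {
        int[] rowPtrC = new int[m + 1];
        int[] lastRowSeen = new int[n]; // row of C in which each column was last counted
        Arrays.fill(lastRowSeen, -1);
        for (int i = 0; i < m; i++) {
            int count = 0;
            // every non-zero a(i, l) pulls in the pattern of row l of B
            for (int pa = rowPtrA[i]; pa < rowPtrA[i + 1]; pa++) {
                int l = colIndA[pa];
                for (int pb = rowPtrB[l]; pb < rowPtrB[l + 1]; pb++) {
                    int j = colIndB[pb];
                    if (lastRowSeen[j] != i) { lastRowSeen[j] = i; count++; }
                }
            }
            rowPtrC[i + 1] = rowPtrC[i] + count;
        }
        return rowPtrC;
    }

    public static void main(String[] args) {
        // A = [[1, 0, 2], [0, 3, 0]] (2 x 3), B = [[1, 0], [0, 1], [1, 1]] (3 x 2)
        int[] rowPtrA = {0, 2, 3};
        int[] colIndA = {0, 2, 1};
        int[] rowPtrB = {0, 1, 2, 4};
        int[] colIndB = {0, 1, 0, 1};
        int[] rowPtrC = rowPtrOfProduct(2, 2, rowPtrA, colIndA, rowPtrB, colIndB);
        System.out.println(Arrays.toString(rowPtrC)); // [0, 2, 3]
    }
}
```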
      • allocateEmpty

        public static CSRPointer allocateEmpty​(GPUContext gCtx,
                                               long nnz2,
                                               long rows,
                                               boolean initialize)
        Factory method to allocate an empty CSR Sparse matrix on the GPU
        Parameters:
        gCtx - a valid GPUContext
        nnz2 - number of non-zeroes
        rows - number of rows
        initialize - whether to memset the allocated memory to zero
        Returns:
        a CSRPointer instance that encapsulates the CSR matrix on GPU
      • allocateEmpty

        public static CSRPointer allocateEmpty​(GPUContext gCtx,
                                               long nnz2,
                                               long rows)
      • isUltraSparse

        public boolean isUltraSparse​(int rows,
                                     int cols)
        Check for ultra sparsity
        Parameters:
        rows - number of rows
        cols - number of columns
        Returns:
        true if ultra sparse
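        The cutoff used for this check is not stated here. A sketch of the general idea, assuming sparsity is measured as nnz / (rows * cols) and compared against a threshold, might look as follows. SparsitySketch and its threshold parameter are hypothetical; the real method takes no threshold argument and uses the instance's nnz field.

```java
/** Hypothetical sparsity test; the threshold is supplied by the caller. */
final class SparsitySketch {
    /** Returns true if the fraction of non-zero cells is below the given threshold. */
    static boolean isUltraSparse(long nnz, int rows, int cols, double threshold) {
        // divide step by step in floating point to avoid overflowing rows * cols
        double sparsity = (double) nnz / rows / cols;
        return sparsity < threshold;
    }

    public static void main(String[] args) {
        // the 4 x 5 example matrix has 9 non-zeroes, a fraction of 0.45
        System.out.println(isUltraSparse(9, 4, 5, 0.5)); // true
        System.out.println(isUltraSparse(9, 4, 5, 0.1)); // false
    }
}
```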
      • toColumnMajorDenseMatrix

        public jcuda.Pointer toColumnMajorDenseMatrix​(jcuda.jcusparse.cusparseHandle cusparseHandle,
                                                      jcuda.jcublas.cublasHandle cublasHandle,
                                                      int rows,
                                                      int cols,
                                                      String instName)
        Copies this CSR matrix on the GPU to a dense column-major matrix on the GPU. This is a temporary matrix for operations such as cusparseDcsrmv. Since the allocated matrix is temporary, bookkeeping is not updated. The caller is responsible for calling "free" on the returned Pointer object.
        Parameters:
        cusparseHandle - a valid cusparseHandle
        cublasHandle - a valid cublasHandle
        rows - number of rows in this CSR matrix
        cols - number of columns in this CSR matrix
        instName - name of the invoking instruction, used to record statistics
        Returns:
        A Pointer to the allocated dense matrix (in column-major format)
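        For readers unfamiliar with the target layout, the following host-side sketch expands CSR arrays into a column-major dense array, where element (i, j) of a matrix with the given number of rows lives at index j * rows + i. It only illustrates the layout; CsrToDenseSketch is a hypothetical class, and the actual method performs the conversion on the GPU.

```java
import java.util.Arrays;

/** Hypothetical host-side expansion of CSR arrays to a column-major dense array. */
final class CsrToDenseSketch {
    /** Returns a rows * cols array in which element (i, j) is stored at j * rows + i. */
    static double[] toColumnMajor(int rows, int cols, double[] val, int[] rowPtr, int[] colInd) {
        double[] dense = new double[rows * cols]; // Java zero-initializes the array
        for (int i = 0; i < rows; i++) {
            for (int p = rowPtr[i]; p < rowPtr[i + 1]; p++) {
                dense[colInd[p] * rows + i] = val[p];
            }
        }
        return dense;
    }

    public static void main(String[] args) {
        double[] val = {1.0, 4.0, 2.0, 3.0, 5.0, 7.0, 8.0, 9.0, 6.0};
        int[] rowPtr = {0, 2, 4, 7, 9};
        int[] colInd = {0, 1, 1, 2, 0, 3, 4, 2, 4};
        double[] dense = toColumnMajor(4, 5, val, rowPtr, colInd);
        // first four entries are column 0 of the example matrix: 1.0, 0.0, 5.0, 0.0
        System.out.println(Arrays.toString(dense));
    }
}
```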
      • deallocate

        public void deallocate()
        Calls cudaFree lazily on the allocated Pointer instances
      • deallocate

        public void deallocate​(boolean eager)
        Calls cudaFree lazily or eagerly on the allocated Pointer instances
        Parameters:
        eager - whether to do eager or lazy cudaFrees