Package | Description |
---|---|
org.apache.sysml.runtime.instructions.gpu.context | |
org.apache.sysml.runtime.matrix.data | |
Modifier and Type | Method and Description |
---|---|
static List<GPUContext> |
GPUContextPool.reserveAllGPUContexts()
Reserves and gets an initialized list of GPUContexts
|
Modifier and Type | Method and Description |
---|---|
static CSRPointer |
CSRPointer.allocateEmpty(GPUContext gCtx,
long nnz2,
long rows)
Factory method to allocate an empty CSR Sparse matrix on the GPU
|
static CSRPointer |
CSRPointer.allocateForDgeam(GPUContext gCtx,
jcuda.jcusparse.cusparseHandle handle,
CSRPointer A,
CSRPointer B,
int m,
int n)
Estimates the number of non-zero elements in the result of a sparse cusparseDgeam operation
C = a op(A) + b op(B)
|
static CSRPointer |
CSRPointer.allocateForMatrixMultiply(GPUContext gCtx,
jcuda.jcusparse.cusparseHandle handle,
CSRPointer A,
int transA,
CSRPointer B,
int transB,
int m,
int n,
int k)
Estimates the number of non-zero elements from the result of a sparse matrix multiplication C = A * B
and returns the
CSRPointer to C with the appropriate GPU memory. |
static CSRPointer |
GPUObject.columnMajorDenseToRowMajorSparse(GPUContext gCtx,
jcuda.jcusparse.cusparseHandle cusparseHandle,
jcuda.Pointer densePtr,
int rows,
int cols)
Convenience method to convert a column-major dense matrix to a row-major CSR sparse matrix on the GPU.
Since the allocated matrix is temporary, bookkeeping is not updated.
|
static void |
CSRPointer.copyToDevice(GPUContext gCtx,
CSRPointer dest,
int rows,
long nnz,
int[] rowPtr,
int[] colInd,
double[] values)
Static method to copy a CSR sparse matrix from Host to Device
|
static jcuda.Pointer |
GPUObject.transpose(GPUContext gCtx,
jcuda.Pointer densePtr,
int m,
int n,
int lda,
int ldc)
Transposes a dense matrix on the GPU by calling the cublasDgeam operation
|
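The `rowPtr`, `colInd`, `values` and `nnz` arguments of `CSRPointer.copyToDevice` above describe a matrix in compressed sparse row (CSR) form. The following is a minimal host-side sketch of that layout, assuming the standard zero-based CSR convention; `CsrLayoutSketch` and `fromDense` are hypothetical names used for illustration and are not part of SystemML.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of SystemML): builds host-side CSR arrays from a dense matrix. */
final class CsrLayoutSketch {
    final int[] rowPtr;    // length rows + 1; entries rowPtr[i] .. rowPtr[i + 1] - 1 belong to row i
    final int[] colInd;    // length nnz; column index of each stored value
    final double[] values; // length nnz; the non-zero values, stored row by row

    private CsrLayoutSketch(int[] rowPtr, int[] colInd, double[] values) {
        this.rowPtr = rowPtr;
        this.colInd = colInd;
        this.values = values;
    }

    /** Scans the dense matrix row by row and records every non-zero entry. */
    static CsrLayoutSketch fromDense(double[][] dense) {
        int rows = dense.length;
        int[] rowPtr = new int[rows + 1];
        List<Integer> cols = new ArrayList<>();
        List<Double> vals = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < dense[i].length; j++) {
                if (dense[i][j] != 0.0) {
                    cols.add(j);
                    vals.add(dense[i][j]);
                }
            }
            // Running count of non-zeros seen so far marks the end of row i.
            rowPtr[i + 1] = cols.size();
        }
        int nnz = cols.size();
        int[] colInd = new int[nnz];
        double[] values = new double[nnz];
        for (int k = 0; k < nnz; k++) {
            colInd[k] = cols.get(k);
            values[k] = vals.get(k);
        }
        return new CsrLayoutSketch(rowPtr, colInd, values);
    }

    /** Number of stored non-zero values. */
    long nnz() {
        return values.length;
    }
}
```

Under these assumptions, arrays produced this way have the shapes that `copyToDevice` takes: `rowPtr` has `rows + 1` entries, and `colInd` and `values` each have `nnz` entries.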
Constructor and Description |
---|
GPUMemoryManager(GPUContext gpuCtx) |
Modifier and Type | Method and Description |
---|---|
static void |
LibMatrixCUDA.abs(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs an "abs" operation on a matrix on the GPU
|
static void |
LibMatrixCUDA.acos(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs an "acos" operation on a matrix on the GPU
|
static void |
LibMatrixCUDA.asin(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs an "asin" operation on a matrix on the GPU
|
static void |
LibMatrixCUDA.atan(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs an "atan" operation on a matrix on the GPU
|
static void |
LibMatrixCUDA.axpy(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in2,
String outputName,
double constant)
Performs a daxpy operation
|
static void |
LibMatrixCuDNN.batchNormalizationBackward(GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject image,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject dout,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject scale,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject dX,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject dScale,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject dBias,
double epsilon,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject resultSaveMean,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject resultSaveInvVariance)
This method computes the backpropagation errors for the image, scale and bias of the batch normalization layer
|
static void |
LibMatrixCuDNN.batchNormalizationForwardInference(GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject image,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject scale,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject bias,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject runningMean,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject runningVar,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject ret,
double epsilon)
Performs the forward BatchNormalization layer computation for inference
|
static void |
LibMatrixCuDNN.batchNormalizationForwardTraining(GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject image,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject scale,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject bias,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject runningMean,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject runningVar,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject ret,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject retRunningMean,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject retRunningVar,
double epsilon,
double exponentialAverageFactor,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject resultSaveMean,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject resultSaveInvVariance)
Performs the forward BatchNormalization layer computation for training
|
static void |
LibMatrixCUDA.biasAdd(GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject input,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject bias,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject outputBlock)
Performs the operation corresponding to the DML script:
ones = matrix(1, rows=1, cols=Hout*Wout)
output = input + matrix(bias %*% ones, rows=1, cols=F*Hout*Wout)
This operation often follows conv2d, hence the built-in function bias_add(input, bias)
|
static void |
LibMatrixCUDA.biasMultiply(GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject input,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject bias,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject outputBlock)
Performs the operation corresponding to the DML script:
ones = matrix(1, rows=1, cols=Hout*Wout)
output = input * matrix(bias %*% ones, rows=1, cols=F*Hout*Wout)
This operation often follows conv2d, hence the built-in function bias_multiply(input, bias)
|
static void |
LibMatrixCUDA.cbind(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in2,
String outputName) |
static void |
LibMatrixCUDA.ceil(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs a "ceil" operation on a matrix on the GPU
|
static void |
LibMatrixCUDA.channelSums(GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject input,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject outputBlock,
long C,
long HW)
Performs the channel_sums operation: out = rowSums(matrix(colSums(A), rows=C, cols=HW))
|
static int |
LibMatrixCUDA.computeNNZ(GPUContext gCtx,
jcuda.Pointer densePtr,
int length)
Utility to compute number of non-zeroes on the GPU
|
static void |
LibMatrixCuDNN.conv2d(GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject image,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject filter,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject outputBlock,
int N,
int C,
int H,
int W,
int K,
int R,
int S,
int pad_h,
int pad_w,
int stride_h,
int stride_w,
int P,
int Q,
double intermediateMemoryBudget)
Performs a 2D convolution
|
static void |
LibMatrixCuDNN.conv2dBackwardData(GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject filter,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject dout,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject output,
int N,
int C,
int H,
int W,
int K,
int R,
int S,
int pad_h,
int pad_w,
int stride_h,
int stride_w,
int P,
int Q,
double intermediateMemoryBudget)
This method computes the backpropagation errors for the previous layer of the convolution operation
|
static void |
LibMatrixCuDNN.conv2dBackwardFilter(GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject image,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject dout,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject outputBlock,
int N,
int C,
int H,
int W,
int K,
int R,
int S,
int pad_h,
int pad_w,
int stride_h,
int stride_w,
int P,
int Q,
double intermediateMemoryBudget)
This method computes the backpropagation errors for the filter of the convolution operation
|
static void |
LibMatrixCuDNN.conv2dBiasAdd(GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject image,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject bias,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject filter,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject output,
int N,
int C,
int H,
int W,
int K,
int R,
int S,
int pad_h,
int pad_w,
int stride_h,
int stride_w,
int P,
int Q,
double intermediateMemoryBudget)
Does a 2D convolution followed by a bias_add
|
static void |
LibMatrixCUDA.cos(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs a "cos" operation on a matrix on the GPU
|
static void |
LibMatrixCUDA.cosh(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs a "cosh" operation on a matrix on the GPU
|
static LibMatrixCuDNNConvolutionAlgorithm |
LibMatrixCuDNNConvolutionAlgorithm.cudnnGetConvolutionBackwardDataAlgorithm(GPUContext gCtx,
String instName,
int N,
int C,
int H,
int W,
int K,
int R,
int S,
int pad_h,
int pad_w,
int stride_h,
int stride_w,
int P,
int Q,
long workspaceLimit)
Factory method to get the algorithm wrapper for convolution backward data
|
static LibMatrixCuDNNConvolutionAlgorithm |
LibMatrixCuDNNConvolutionAlgorithm.cudnnGetConvolutionBackwardFilterAlgorithm(GPUContext gCtx,
String instName,
int N,
int C,
int H,
int W,
int K,
int R,
int S,
int pad_h,
int pad_w,
int stride_h,
int stride_w,
int P,
int Q,
long workspaceLimit)
Factory method to get the algorithm wrapper for convolution backward filter
|
static LibMatrixCuDNNConvolutionAlgorithm |
LibMatrixCuDNNConvolutionAlgorithm.cudnnGetConvolutionForwardAlgorithm(GPUContext gCtx,
String instName,
int N,
int C,
int H,
int W,
int K,
int R,
int S,
int pad_h,
int pad_w,
int stride_h,
int stride_w,
int P,
int Q,
long workspaceLimit)
Factory method to get the algorithm wrapper for convolution forward
|
static LibMatrixCuDNNPoolingDescriptors |
LibMatrixCuDNNPoolingDescriptors.cudnnPoolingBackwardDescriptors(GPUContext gCtx,
String instName,
int N,
int C,
int H,
int W,
int K,
int R,
int S,
int pad_h,
int pad_w,
int stride_h,
int stride_w,
int P,
int Q,
LibMatrixDNN.PoolingType poolingType)
Get descriptors for maxpooling backward operation
|
static LibMatrixCuDNNPoolingDescriptors |
LibMatrixCuDNNPoolingDescriptors.cudnnPoolingDescriptors(GPUContext gCtx,
String instName,
int N,
int C,
int H,
int W,
int K,
int R,
int S,
int pad_h,
int pad_w,
int stride_h,
int stride_w,
int P,
int Q,
LibMatrixDNN.PoolingType poolingType)
Get descriptors for maxpooling operation
|
static void |
LibMatrixCUDA.denseTranspose(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
jcuda.Pointer A,
jcuda.Pointer C,
long numRowsA,
long numColsA)
Computes C = t(A)
|
void |
SinglePrecisionCudaSupportFunctions.deviceToHost(GPUContext gCtx,
jcuda.Pointer src,
double[] dest,
String instName,
boolean isEviction) |
void |
DoublePrecisionCudaSupportFunctions.deviceToHost(GPUContext gCtx,
jcuda.Pointer src,
double[] dest,
String instName,
boolean isEviction) |
void |
CudaSupportFunctions.deviceToHost(GPUContext gCtx,
jcuda.Pointer src,
double[] dest,
String instName,
boolean isEviction) |
static jcuda.Pointer |
LibMatrixCUDA.double2float(GPUContext gCtx,
jcuda.Pointer A,
jcuda.Pointer ret,
int numElems) |
static void |
LibMatrixCUDA.exp(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs an "exp" operation on a matrix on the GPU
|
static jcuda.Pointer |
LibMatrixCUDA.float2double(GPUContext gCtx,
jcuda.Pointer A,
jcuda.Pointer ret,
int numElems) |
static void |
LibMatrixCUDA.floor(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs a "floor" operation on a matrix on the GPU
|
protected static jcuda.jcublas.cublasHandle |
LibMatrixCUDA.getCublasHandle(GPUContext gCtx) |
static JCudaKernels |
LibMatrixCUDA.getCudaKernels(GPUContext gCtx) |
protected static jcuda.jcudnn.cudnnHandle |
LibMatrixCuDNN.getCudnnHandle(GPUContext gCtx) |
protected static jcuda.jcusparse.cusparseHandle |
LibMatrixCUDA.getCusparseHandle(GPUContext gCtx) |
static jcuda.Pointer |
LibMatrixCUDA.getDensePointer(GPUContext gCtx,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject input,
String instName)
Convenience method to get jcudaDenseMatrixPtr.
|
protected static jcuda.Pointer |
LibMatrixCuDNN.getDensePointerForCuDNN(GPUContext gCtx,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject image,
String instName)
Convenience method to get jcudaDenseMatrixPtr.
|
static jcuda.Pointer |
LibMatrixCuDNN.getDensePointerForCuDNN(GPUContext gCtx,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject image,
String instName,
int numRows,
int numCols)
Convenience method to get jcudaDenseMatrixPtr.
|
static long |
LibMatrixCUDA.getNnz(GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject mo,
boolean recomputeDenseNNZ)
Note: if the matrix is in dense format, it explicitly re-computes the number of nonzeros.
|
protected static CSRPointer |
LibMatrixCUDA.getSparsePointer(GPUContext gCtx,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject input,
String instName)
Convenience method to get the sparse matrix pointer from a
MatrixObject. |
void |
SinglePrecisionCudaSupportFunctions.hostToDevice(GPUContext gCtx,
double[] src,
jcuda.Pointer dest,
String instName) |
void |
DoublePrecisionCudaSupportFunctions.hostToDevice(GPUContext gCtx,
double[] src,
jcuda.Pointer dest,
String instName) |
void |
CudaSupportFunctions.hostToDevice(GPUContext gCtx,
double[] src,
jcuda.Pointer dest,
String instName) |
static boolean |
LibMatrixCUDA.isInSparseFormat(GPUContext gCtx,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject mo) |
static void |
LibMatrixCUDA.log(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs a "log" operation on a matrix on the GPU
|
static void |
LibMatrixCuDNN.lstm(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
jcuda.Pointer X,
jcuda.Pointer wPointer,
jcuda.Pointer out0,
jcuda.Pointer c0,
boolean return_sequences,
String outputName,
String cyName,
int N,
int M,
int D,
int T)
Computes the forward pass for an LSTM layer with M neurons.
|
static void |
LibMatrixCuDNN.lstmBackward(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
jcuda.Pointer x,
jcuda.Pointer hx,
jcuda.Pointer cx,
jcuda.Pointer wPointer,
String doutName,
String dcyName,
String dxName,
String dwName,
String dbName,
String dhxName,
String dcxName,
boolean return_sequences,
int N,
int M,
int D,
int T) |
static org.apache.sysml.runtime.controlprogram.caching.MatrixObject |
LibMatrixCuMatMult.matmult(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject left,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject right,
String outputName,
boolean isLeftTransposed,
boolean isRightTransposed)
Performs a matrix multiply on the GPU. Examines sparsity and shapes and routes the call to the
appropriate cuBLAS or cuSPARSE method to compute C = op(A) x op(B).
The user is expected to call
ec.releaseMatrixOutputForGPUInstruction(outputName);
|
static void |
LibMatrixCUDA.matmultTSMM(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject left,
String outputName,
boolean isLeftTransposed)
Performs tsmm, A %*% A' or A' %*% A, on GPU by exploiting cublasDsyrk(...)
|
static void |
LibMatrixCUDA.matrixMatrixArithmetic(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in2,
String outputName,
boolean isLeftTransposed,
boolean isRightTransposed,
org.apache.sysml.runtime.matrix.operators.BinaryOperator op)
Performs the elementwise arithmetic operation specified by op on two input matrices in1 and in2
|
static void |
LibMatrixCUDA.matrixMatrixRelational(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in2,
String outputName,
org.apache.sysml.runtime.matrix.operators.BinaryOperator op)
Performs the elementwise relational operation specified by op on two input matrices in1 and in2
|
static void |
LibMatrixCUDA.matrixScalarArithmetic(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in,
String outputName,
boolean isInputTransposed,
org.apache.sysml.runtime.matrix.operators.ScalarOperator op)
Entry point to perform elementwise matrix-scalar arithmetic operation specified by op
|
static void |
LibMatrixCUDA.matrixScalarOp(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in,
String outputName,
boolean isInputTransposed,
org.apache.sysml.runtime.matrix.operators.ScalarOperator op)
Utility to run the matrix-scalar operation kernel
|
static void |
LibMatrixCUDA.matrixScalarRelational(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in,
String outputName,
org.apache.sysml.runtime.matrix.operators.ScalarOperator op)
Entry point to perform elementwise matrix-scalar relational operation specified by op
|
static void |
LibMatrixCuDNN.pooling(GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject image,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject outputBlock,
int N,
int C,
int H,
int W,
int K,
int R,
int S,
int pad_h,
int pad_w,
int stride_h,
int stride_w,
int P,
int Q,
LibMatrixDNN.PoolingType poolingType,
double intermediateMemoryBudget)
Performs maxpooling on GPU by exploiting cudnnPoolingForward(...)
|
static void |
LibMatrixCuDNN.poolingBackward(GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject image,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject dout,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject maxpoolOutput,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject outputBlock,
int N,
int C,
int H,
int W,
int K,
int R,
int S,
int pad_h,
int pad_w,
int stride_h,
int stride_w,
int P,
int Q,
LibMatrixDNN.PoolingType poolingType,
double intermediateMemoryBudget)
Performs maxpoolingBackward on GPU by exploiting cudnnPoolingBackward(...).
This method computes the backpropagation errors for the previous layer of the maxpooling operation
|
static void |
LibMatrixCUDA.rbind(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in2,
String outputName) |
static void |
LibMatrixCuDNN.relu(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in,
String outputName)
Performs the relu operation on the GPU.
|
static void |
LibMatrixCUDA.reluBackward(GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject input,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject dout,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject outputBlock)
This method computes the backpropagation errors for previous layer of relu operation
|
static void |
LibMatrixCUDA.round(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs a "round" operation on a matrix on the GPU
|
static void |
LibMatrixCUDA.sigmoid(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs a "sigmoid" operation on a matrix on the GPU
|
static void |
LibMatrixCUDA.sign(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs a "sign" operation on a matrix on the GPU
|
static void |
LibMatrixCUDA.sin(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs a "sin" operation on a matrix on the GPU
|
static void |
LibMatrixCUDA.sinh(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs a "sinh" operation on a matrix on the GPU
|
protected static void |
LibMatrixCUDA.sliceDenseDense(GPUContext gCtx,
String instName,
jcuda.Pointer inPointer,
jcuda.Pointer outPointer,
int rl,
int ru,
int cl,
int cu,
int inClen)
Performs a slice operation on dense input and writes the result in dense format
|
static void |
LibMatrixCUDA.sliceOperations(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
org.apache.sysml.runtime.util.IndexRange ixrange,
String outputName)
Method to perform the rightIndex operation for given lower and upper bounds in the row and column dimensions.
|
protected static void |
LibMatrixCUDA.sliceSparseDense(GPUContext gCtx,
String instName,
CSRPointer inPointer,
jcuda.Pointer outPointer,
int rl,
int ru,
int cl,
int cu,
int inClen)
Performs a slice operation on sparse input and writes the result in dense format
|
static void |
LibMatrixCuDNN.softmax(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs a "softmax" operation on a matrix on the GPU
|
static void |
LibMatrixCUDA.solve(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in2,
String outputName)
Implements the "solve" function for SystemML: Ax = B (A is of size m*n, B is of size m*1, x is of size n*1)
|
static void |
LibMatrixCUDA.sqrt(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs a "sqrt" operation on a matrix on the GPU
|
static void |
LibMatrixCUDA.tan(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs a "tan" operation on a matrix on the GPU
|
static void |
LibMatrixCUDA.tanh(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String outputName)
Performs a "tanh" operation on a matrix on the GPU
|
static void |
LibMatrixCUDA.transpose(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in,
String outputName)
Transposes the input matrix using cublasDgeam
|
static void |
LibMatrixCUDA.unaryAggregate(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject in1,
String output,
org.apache.sysml.runtime.matrix.operators.AggregateUnaryOperator op)
Entry point to perform Unary aggregate operations on the GPU.
|
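The convolution and pooling entries above share the arguments `N`, `C`, `H`, `W`, `K`, `R`, `S`, `pad_h`, `pad_w`, `stride_h`, `stride_w`, `P` and `Q`, which this listing does not define. A common reading, assumed here rather than taken from this page, is: batch size `N`, input channels `C`, input height and width `H` and `W`, number of filters `K`, filter height and width `R` and `S`, and output height and width `P` and `Q`. Under that assumption, and with the usual floor-division rule for strided, zero-padded convolution, the output dimensions can be sketched as follows. `ConvShapeSketch` and `outputDim` are hypothetical names, not SystemML API.

```java
/** Hypothetical helper (not part of SystemML): output size along one spatial dimension. */
final class ConvShapeSketch {

    private ConvShapeSketch() { }

    /**
     * Assumed rule: out = (in + 2 * pad - filter) / stride + 1, using integer (floor) division.
     * For the height dimension this would read P = (H + 2 * pad_h - R) / stride_h + 1,
     * and for the width dimension Q = (W + 2 * pad_w - S) / stride_w + 1.
     */
    static int outputDim(int inputDim, int filterDim, int pad, int stride) {
        if (stride <= 0) {
            throw new IllegalArgumentException("stride must be positive");
        }
        int span = inputDim + 2 * pad - filterDim;
        if (span < 0) {
            throw new IllegalArgumentException("filter is larger than the padded input");
        }
        return span / stride + 1;
    }
}
```

For instance, under this assumed rule an input with `H = 28`, a filter with `R = 5`, `pad_h = 0` and `stride_h = 1` gives `P = 24`.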
Constructor and Description |
---|
LibMatrixCuDNNInputRowFetcher(GPUContext gCtx,
String instName,
org.apache.sysml.runtime.controlprogram.caching.MatrixObject image)
Initialize the input fetcher
|
LibMatrixCuDNNRnnAlgorithm(org.apache.sysml.runtime.controlprogram.context.ExecutionContext ec,
GPUContext gCtx,
String instName,
String rnnMode,
int N,
int T,
int M,
int D,
boolean isTraining,
jcuda.Pointer w) |
Copyright © 2018 The Apache Software Foundation. All rights reserved.