public class LibMatrixCuMatMult extends LibMatrixCUDA
Fields inherited from class LibMatrixCUDA: cudaSupportFunctions, customKernelSuffix, sizeOfDataType
Constructor and Description |
---|
LibMatrixCuMatMult() |
Modifier and Type | Method and Description |
---|---|
static MatrixObject | matmult(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject left, MatrixObject right, String outputName, boolean isLeftTransposed, boolean isRightTransposed) Matrix multiply on GPU. Examines sparsity and shapes and routes the call to the appropriate method from cuBLAS or cuSPARSE: C = op(A) x op(B). The user is expected to call ec.releaseMatrixOutputForGPUInstruction(outputName). |
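
The `C = op(A) x op(B)` semantics of the two transpose flags can be illustrated with a small CPU reference. This is only a sketch for reasoning about shapes and results: the class `MatMultReference` and its helper `op` are hypothetical names introduced here, it works on plain `double[][]` arrays rather than `MatrixObject`, and it does not reflect how the GPU implementation dispatches to cuBLAS or cuSPARSE.

```java
import java.util.Arrays;

// Hypothetical CPU reference, not part of SystemDS: shows what the
// isLeftTransposed / isRightTransposed flags mean for C = op(A) x op(B).
public class MatMultReference {

    /** Returns op(M): the transpose of m if transposed is true, otherwise m itself. */
    public static double[][] op(double[][] m, boolean transposed) {
        if (!transposed) return m;
        int rows = m.length, cols = m[0].length;
        double[][] t = new double[cols][rows];
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                t[j][i] = m[i][j];
        return t;
    }

    /** Computes C = op(A) x op(B) with a naive triple loop. */
    public static double[][] matmult(double[][] left, double[][] right,
                                     boolean isLeftTransposed, boolean isRightTransposed) {
        double[][] a = op(left, isLeftTransposed);
        double[][] b = op(right, isRightTransposed);
        int m = a.length, k = a[0].length, n = b[0].length;
        // The inner dimensions must agree after the transposes are applied.
        if (b.length != k)
            throw new IllegalArgumentException(
                "Inner dimensions do not match: " + k + " vs " + b.length);
        double[][] c = new double[m][n];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < n; j++)
                for (int p = 0; p < k; p++)
                    c[i][j] += a[i][p] * b[p][j];
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2, 3}, {4, 5, 6}};  // 2 x 3
        double[][] b = {{1, 0, 1}, {0, 1, 1}};  // 2 x 3
        // A x t(B): (2 x 3) x (3 x 2) gives a 2 x 2 result.
        System.out.println(Arrays.deepToString(matmult(a, b, false, true)));
        // t(A) x B: (3 x 2) x (2 x 3) gives a 3 x 3 result.
        System.out.println(Arrays.deepToString(matmult(a, b, true, false)));
    }
}
```

Note that the same pair of inputs yields differently shaped outputs depending on the flags, and that passing `false, false` for these two 2 x 3 matrices is rejected because the inner dimensions (3 and 2) disagree.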
Methods inherited from class LibMatrixCUDA: abs, acos, asin, atan, axpy, biasAdd, biasMultiply, cbind, ceil, channelSums, computeNNZ, cos, cosh, cumulativeScan, cumulativeSumProduct, denseTranspose, deviceCopy, double2float, exp, float2double, floor, getCudaKernels, getDenseMatrixOutputForGPUInstruction, getDensePointer, getNnz, isInSparseFormat, log, matmultTSMM, matrixMatrixArithmetic, matrixMatrixRelational, matrixScalarArithmetic, matrixScalarOp, matrixScalarRelational, one, rbind, reluBackward, resetFloatingPointPrecision, round, sigmoid, sign, sin, sinh, sliceOperations, solve, sqrt, tan, tanh, toInt, transpose, unaryAggregate, zero
public static MatrixObject matmult(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject left, MatrixObject right, String outputName, boolean isLeftTransposed, boolean isRightTransposed)
ec - Current ExecutionContext instance
gCtx - a valid GPUContext
instName - name of the invoking instruction to record Statistics.
left - Matrix A
right - Matrix B
outputName - Name of the output matrix C (in code generated after LOP layer)
isLeftTransposed - op for A, transposed or not
isRightTransposed - op for B, transposed or not

Copyright © 2020 The Apache Software Foundation. All rights reserved.