Class ExecutionConfig
- java.lang.Object
-
- org.apache.sysds.runtime.instructions.gpu.context.ExecutionConfig
-
public class ExecutionConfig extends Object
Java wrapper that specifies the CUDA execution configuration for launching custom kernels.
-
-
Constructor Summary
Constructors
ExecutionConfig(int gridDimX, int blockDimX)
Convenience constructor for setting the number of blocks and the number of threads.
ExecutionConfig(int gridDimX, int blockDimX, int sharedMemBytes)
Convenience constructor for setting the number of blocks, the number of threads, and the shared memory size.
ExecutionConfig(int gridDimX, int gridDimY, int blockDimX, int blockDimY)
Convenience constructor for setting the number of blocks and the number of threads on both axes.
ExecutionConfig(int gridDimX, int gridDimY, int blockDimX, int blockDimY, int sharedMemBytes)
Convenience constructor for setting the number of blocks and the number of threads on both axes, and the shared memory size.
-
Method Summary
Modifier and Type, Method, Description
static ExecutionConfig
getConfigForSimpleMatrixOperations(int rlen, int clen)
Use this for simple matrix operations, and compute the index in the kernel as: int index = blockIdx.x * blockDim.x + threadIdx.x
static ExecutionConfig
getConfigForSimpleVectorOperations(int numCells)
Use this for simple vector operations, and compute the index in the kernel as: int index = blockIdx.x * blockDim.x + threadIdx.x
String
toString()
-
-
-
Constructor Detail
-
ExecutionConfig
public ExecutionConfig(int gridDimX, int blockDimX, int sharedMemBytes)
Convenience constructor for setting the number of blocks, the number of threads, and the shared memory size.
- Parameters:
gridDimX - Number of blocks on the horizontal axis of the grid (for the CUDA kernel)
blockDimX - Number of threads on the horizontal axis of a block (for the CUDA kernel)
sharedMemBytes - Amount of shared memory, in bytes (for the CUDA kernel)
-
ExecutionConfig
public ExecutionConfig(int gridDimX, int blockDimX)
Convenience constructor for setting the number of blocks and the number of threads.
- Parameters:
gridDimX - Number of blocks on the horizontal axis of the grid (for the CUDA kernel)
blockDimX - Number of threads on the horizontal axis of a block (for the CUDA kernel)
-
ExecutionConfig
public ExecutionConfig(int gridDimX, int gridDimY, int blockDimX, int blockDimY)
Convenience constructor for setting the number of blocks and the number of threads on both axes.
- Parameters:
gridDimX - Number of blocks on the horizontal axis of the grid (for the CUDA kernel)
gridDimY - Number of blocks on the vertical axis of the grid (for the CUDA kernel)
blockDimX - Number of threads on the horizontal axis of a block (for the CUDA kernel)
blockDimY - Number of threads on the vertical axis of a block (for the CUDA kernel)
-
ExecutionConfig
public ExecutionConfig(int gridDimX, int gridDimY, int blockDimX, int blockDimY, int sharedMemBytes)
Convenience constructor for setting the number of blocks and the number of threads on both axes, and the shared memory size.
- Parameters:
gridDimX - Number of blocks on the horizontal axis of the grid (for the CUDA kernel)
gridDimY - Number of blocks on the vertical axis of the grid (for the CUDA kernel)
blockDimX - Number of threads on the horizontal axis of a block (for the CUDA kernel)
blockDimY - Number of threads on the vertical axis of a block (for the CUDA kernel)
sharedMemBytes - Amount of shared memory, in bytes (for the CUDA kernel)
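
To illustrate how the grid and block parameters combine, here is a minimal sketch. `LaunchShape` is a hypothetical stand-in written for this example, not the SystemDS class, and the defaults it uses for omitted arguments (1 for the Y dimensions, 0 bytes of shared memory) are assumptions rather than documented behavior.

```java
/** Hypothetical stand-in mirroring the four constructor shapes; not the SystemDS class. */
final class LaunchShape {
    final int gridDimX, gridDimY, blockDimX, blockDimY, sharedMemBytes;

    // Assumed defaults: one row of blocks, one row of threads, no shared memory.
    LaunchShape(int gridDimX, int blockDimX) {
        this(gridDimX, 1, blockDimX, 1, 0);
    }

    LaunchShape(int gridDimX, int blockDimX, int sharedMemBytes) {
        this(gridDimX, 1, blockDimX, 1, sharedMemBytes);
    }

    LaunchShape(int gridDimX, int gridDimY, int blockDimX, int blockDimY) {
        this(gridDimX, gridDimY, blockDimX, blockDimY, 0);
    }

    LaunchShape(int gridDimX, int gridDimY, int blockDimX, int blockDimY, int sharedMemBytes) {
        this.gridDimX = gridDimX;
        this.gridDimY = gridDimY;
        this.blockDimX = blockDimX;
        this.blockDimY = blockDimY;
        this.sharedMemBytes = sharedMemBytes;
    }

    /** Threads launched in total: (blocks in the grid) * (threads per block). */
    long totalThreads() {
        return (long) gridDimX * gridDimY * blockDimX * blockDimY;
    }
}
```

For example, a 4 x 2 grid of 16 x 8 blocks launches 4 * 2 * 16 * 8 = 1024 threads.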
-
-
Method Detail
-
getConfigForSimpleVectorOperations
public static ExecutionConfig getConfigForSimpleVectorOperations(int numCells)
Use this for simple vector operations, and compute the index in the kernel as:
int index = blockIdx.x * blockDim.x + threadIdx.x
This tries to keep the grid as small as possible.
- Parameters:
numCells - number of cells
- Returns:
execution configuration
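
A one-dimensional configuration of this kind is commonly derived by ceiling division of the cell count by the block size. The sketch below shows that general technique only; `VectorLaunchSketch` and its methods are hypothetical names, and the block size passed in is an illustrative choice, not necessarily the value the library selects.

```java
/** Hypothetical helper illustrating a 1D launch configuration; not part of SystemDS. */
final class VectorLaunchSketch {

    /** Smallest gridDimX such that gridDimX * blockDimX >= numCells (ceiling division). */
    static int gridDimFor(int numCells, int blockDimX) {
        if (numCells <= 0 || blockDimX <= 0) {
            throw new IllegalArgumentException("numCells and blockDimX must be positive");
        }
        // Widen to long so the addition cannot overflow for large numCells.
        return (int) (((long) numCells + blockDimX - 1) / blockDimX);
    }

    /**
     * Simulates the kernel-side index computation on the host and counts the
     * threads whose index falls inside the vector. The last block may contain
     * surplus threads, so a kernel would typically guard with index < numCells.
     */
    static int activeThreads(int numCells, int blockDimX) {
        int gridDimX = gridDimFor(numCells, blockDimX);
        int active = 0;
        for (int blockIdx = 0; blockIdx < gridDimX; blockIdx++) {
            for (int threadIdx = 0; threadIdx < blockDimX; threadIdx++) {
                int index = blockIdx * blockDimX + threadIdx;
                if (index < numCells) {
                    active++;
                }
            }
        }
        return active;
    }
}
```

With 1000 cells and an assumed block size of 256, `gridDimFor` yields 4 blocks (1024 threads), of which 1000 pass the bounds check.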
-
getConfigForSimpleMatrixOperations
public static ExecutionConfig getConfigForSimpleMatrixOperations(int rlen, int clen)
Use this for simple matrix operations, and compute the index in the kernel as:
int index = blockIdx.x * blockDim.x + threadIdx.x
- Parameters:
rlen - number of rows
clen - number of columns
- Returns:
execution configuration
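
Because the kernel works with a single linear index, a matrix kernel usually needs to recover the row and column from it. The sketch below assumes the rlen x clen matrix is stored in row-major order, which this page does not state; `LinearIndexSketch` is a hypothetical name used only for illustration.

```java
/** Hypothetical helper mapping a linear cell index to (row, column); assumes row-major layout. */
final class LinearIndexSketch {

    /** Row of the cell at the given linear index in a matrix with clen columns. */
    static int row(int index, int clen) {
        return index / clen;
    }

    /** Column of the cell at the given linear index in a matrix with clen columns. */
    static int col(int index, int clen) {
        return index % clen;
    }

    /** Inverse mapping: linear index of the cell at (row, col). */
    static int linear(int row, int col, int clen) {
        return row * clen + col;
    }
}
```

In a matrix with 3 columns, index 7 maps to row 2, column 1, since 7 = 2 * 3 + 1.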
-
-