public class ExecutionConfig extends Object
Modifier and Type | Field and Description |
---|---|
int | blockDimX |
int | blockDimY |
int | blockDimZ |
int | gridDimX |
int | gridDimY |
int | gridDimZ |
int | sharedMemBytes |
jcuda.driver.CUstream | stream |
Constructor and Description |
---|
ExecutionConfig(int gridDimX, int blockDimX): Convenience constructor for setting the number of blocks and the number of threads |
ExecutionConfig(int gridDimX, int blockDimX, int sharedMemBytes): Convenience constructor for setting the number of blocks, the number of threads and the shared memory size |
ExecutionConfig(int gridDimX, int gridDimY, int blockDimX, int blockDimY): Convenience constructor for setting the number of blocks and the number of threads on both axes |
ExecutionConfig(int gridDimX, int gridDimY, int blockDimX, int blockDimY, int sharedMemBytes): Convenience constructor for setting the number of blocks and the number of threads on both axes, and the shared memory size |
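The four- and five-argument constructors take one grid size and one block size per axis. The following is a minimal sketch of one way such values could be derived for an rlen x clen matrix. The class and helper names, the 32 x 32 block shape, and the mapping of columns to the horizontal axis are assumptions made for illustration; none of them is taken from ExecutionConfig itself.

```java
import java.util.Arrays;

class MatrixLaunchSketch {
    /** Ceiling division: blocks needed so that blocks * threadsPerBlock >= cells. */
    static int blocksFor(int cells, int threadsPerBlock) {
        if (cells <= 0 || threadsPerBlock <= 0) {
            throw new IllegalArgumentException("cells and threadsPerBlock must be positive");
        }
        // Widen to long so the addition cannot overflow for large inputs.
        return (int) (((long) cells + threadsPerBlock - 1) / threadsPerBlock);
    }

    /** Returns {gridDimX, gridDimY, blockDimX, blockDimY} for an rlen x clen matrix. */
    static int[] dimsFor(int rlen, int clen, int blockDimX, int blockDimY) {
        // Assumption: columns lie on the horizontal (x) axis, rows on the vertical (y) axis.
        int gridDimX = blocksFor(clen, blockDimX);
        int gridDimY = blocksFor(rlen, blockDimY);
        return new int[] {gridDimX, gridDimY, blockDimX, blockDimY};
    }

    public static void main(String[] args) {
        // 1000 rows and 70 columns with a hypothetical 32 x 32 block.
        int[] dims = dimsFor(1000, 70, 32, 32);
        System.out.println(Arrays.toString(dims)); // prints [3, 32, 32, 32]
    }
}
```

The four values come out in the same order as the parameters of ExecutionConfig(int gridDimX, int gridDimY, int blockDimX, int blockDimY), so they could be passed to that constructor directly.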
Modifier and Type | Method and Description |
---|---|
static ExecutionConfig | getConfigForSimpleMatrixOperations(int rlen, int clen): Use this for simple matrix operations and use the following in the kernel: int index = blockIdx.x * blockDim.x + threadIdx.x |
static ExecutionConfig | getConfigForSimpleVectorOperations(int numCells): Use this for simple vector operations and use the following in the kernel: int index = blockIdx.x * blockDim.x + threadIdx.x |
String | toString() |
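The two factory methods above pair a launch configuration with the 1D index formula index = blockIdx.x * blockDim.x + threadIdx.x. As a rough illustration of how that formula covers every cell, the sketch below replays it on the CPU. The class name, the helper names, and the block size of 1024 are hypothetical. The ceiling division is one common way to choose the smallest grid that covers all cells; whether ExecutionConfig computes its grid this way is an assumption.

```java
class VectorLaunchSketch {
    // Hypothetical block size; the real limit depends on the device.
    static final int ASSUMED_BLOCK_DIM_X = 1024;

    /** Smallest number of blocks whose threads cover numCells (ceiling division). */
    static int gridDimXFor(int numCells, int blockDimX) {
        if (numCells <= 0 || blockDimX <= 0) {
            throw new IllegalArgumentException("numCells and blockDimX must be positive");
        }
        // Widen to long so the addition cannot overflow for large inputs.
        return (int) (((long) numCells + blockDimX - 1) / blockDimX);
    }

    /** CPU stand-in for a kernel: counts how often each cell is visited. */
    static int[] simulateKernel(int numCells, int blockDimX) {
        int gridDimX = gridDimXFor(numCells, blockDimX);
        int[] visits = new int[numCells];
        for (int blockIdxX = 0; blockIdxX < gridDimX; blockIdxX++) {
            for (int threadIdxX = 0; threadIdxX < blockDimX; threadIdxX++) {
                // Same arithmetic as the kernel-side index formula.
                int index = blockIdxX * blockDimX + threadIdxX;
                if (index < numCells) { // threads past the last cell do nothing
                    visits[index]++;
                }
            }
        }
        return visits;
    }

    public static void main(String[] args) {
        int gridDimX = gridDimXFor(2500, ASSUMED_BLOCK_DIM_X);
        System.out.println("gridDimX = " + gridDimX); // prints gridDimX = 3
    }
}
```

The bounds check matters because the last block usually has more threads than remaining cells: with 2500 cells and 1024 threads per block, 3 blocks supply 3072 threads, and the final 572 must skip their work.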
public int gridDimX
public int gridDimY
public int gridDimZ
public int blockDimX
public int blockDimY
public int blockDimZ
public int sharedMemBytes
public jcuda.driver.CUstream stream
public ExecutionConfig(int gridDimX, int blockDimX, int sharedMemBytes)
gridDimX
- Number of blocks on the horizontal axis of the grid (for CUDA Kernel)
blockDimX
- Number of threads on the horizontal axis of a block (for CUDA Kernel)
sharedMemBytes
- Amount of shared memory (for CUDA Kernel)
public ExecutionConfig(int gridDimX, int blockDimX)
gridDimX
- Number of blocks on the horizontal axis of the grid (for CUDA Kernel)
blockDimX
- Number of threads on the horizontal axis of a block (for CUDA Kernel)
public ExecutionConfig(int gridDimX, int gridDimY, int blockDimX, int blockDimY)
gridDimX
- Number of blocks on the horizontal axis of the grid (for CUDA Kernel)
gridDimY
- Number of blocks on the vertical axis of the grid (for CUDA Kernel)
blockDimX
- Number of threads on the horizontal axis of a block (for CUDA Kernel)
blockDimY
- Number of threads on the vertical axis of a block (for CUDA Kernel)
public ExecutionConfig(int gridDimX, int gridDimY, int blockDimX, int blockDimY, int sharedMemBytes)
gridDimX
- Number of blocks on the horizontal axis of the grid (for CUDA Kernel)
gridDimY
- Number of blocks on the vertical axis of the grid (for CUDA Kernel)
blockDimX
- Number of threads on the horizontal axis of a block (for CUDA Kernel)
blockDimY
- Number of threads on the vertical axis of a block (for CUDA Kernel)
sharedMemBytes
- Amount of shared memory (for CUDA Kernel)
public static ExecutionConfig getConfigForSimpleVectorOperations(int numCells)
Use this for simple vector operations and use the following in the kernel:
int index = blockIdx.x * blockDim.x + threadIdx.x
This tries to schedule as few grids as possible.
numCells
- number of cells
public static ExecutionConfig getConfigForSimpleMatrixOperations(int rlen, int clen)
Use this for simple matrix operations and use the following in the kernel:
int index = blockIdx.x * blockDim.x + threadIdx.x
rlen
- number of rows
clen
- number of columns
Copyright © 2020 The Apache Software Foundation. All rights reserved.