public class GPUObject extends Object
Modifier and Type | Field and Description |
---|---|
protected boolean | dirty: whether the block attached to this GPUContext is dirty on the device and needs to be copied back to host |
protected boolean | isSparse: whether this block is in sparse format |
protected LongAdder | readLocks: number of read locks on this object (this GPUObject is being used in a current instruction) |
protected boolean | writeLock: whether a write lock is held on this object (this GPUObject is being used in a current instruction) |
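The lock fields above suggest a simple bookkeeping scheme: a concurrent counter for readers and a flag for a single writer. The following is a minimal sketch of that idea, not the GPUObject implementation. The class name `LockStateSketch` is hypothetical, and the exact rules (for example, throwing when a read lock is released without being held) are assumptions made for illustration.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for the lock fields of GPUObject; not the real class.
final class LockStateSketch {
    private final LongAdder readLocks = new LongAdder(); // number of active readers
    private boolean writeLock = false;                    // true while a writer holds the object

    void addReadLock() {
        readLocks.increment();
    }

    void releaseReadLock() {
        // Assumption: releasing a read lock that was never taken is a caller bug.
        if (readLocks.longValue() <= 0) {
            throw new IllegalStateException("no read lock to release");
        }
        readLocks.decrement();
    }

    void addWriteLock() {
        writeLock = true;
    }

    void releaseWriteLock() {
        writeLock = false;
    }

    // Drops all locks at once, regardless of how many readers were registered.
    void resetReadWriteLock() {
        readLocks.reset();
        writeLock = false;
    }

    // Locked means: in use by at least one reader or by a writer.
    boolean isLocked() {
        return writeLock || readLocks.longValue() > 0;
    }
}
```

Under these assumptions, an object with two read locks stays locked after one release and becomes unlocked only after the second release or a call to `resetReadWriteLock()`.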
Modifier and Type | Method and Description |
---|---|
boolean | acquireDeviceModifyDense() |
boolean | acquireDeviceModifySparse() |
boolean | acquireDeviceRead(String opcode) |
boolean | acquireHostRead(String instName): If the data is allocated on the GPU and is dirty, it is copied back to host memory. |
void | addReadLock() |
void | addWriteLock() |
void | allocateAndFillDense(double v): Allocates a dense matrix of the size given by the attached matrix metadata and fills it with a single value. |
void | allocateSparseAndEmpty(): Allocates a sparse and empty GPUObject. This is the result of operations that are both non zero matrices. |
void | clearData(String opcode, boolean eager): Clears the data associated with this GPUObject instance. |
void | clearDensePointer(): Removes the dense pointer and potential soft reference. |
Object | clone() |
static CSRPointer | columnMajorDenseToRowMajorSparse(GPUContext gCtx, jcuda.jcusparse.cusparseHandle cusparseHandle, jcuda.Pointer densePtr, int rows, int cols): Convenience method to convert a column-major dense matrix to a row-major sparse (CSR) matrix on the GPU. Since the allocated matrix is temporary, bookkeeping is not updated. |
protected void | copyFromDeviceToHost(String instName, boolean isEviction, boolean eagerDelete): Copies the data from device to host. |
void | denseColumnMajorToRowMajor(): Convenience method. |
void | denseRowMajorToColumnMajor(): Convenience method. |
void | denseToSparse(): Converts this GPUObject from dense to sparse format. |
jcuda.Pointer | getDensePointer(): Pointer to dense matrix. |
CSRPointer | getJcudaSparseMatrixPtr(): Pointer to sparse matrix. |
long | getNnz(String instName, boolean recomputeDenseNNZ): Being allocated is a prerequisite for computing nnz. |
protected long | getSizeOnDevice() |
CSRPointer | getSparseMatrixCudaPointer(): Convenience method to directly examine the sparse matrix on GPU. |
boolean | isAllocated() |
boolean | isDensePointerNull(): Checks if the dense pointer is null. |
boolean | isDirty(): Whether this block is dirty on the GPU. |
boolean | isLocked() |
boolean | isSparse() |
boolean | isSparseAndEmpty(): Whether this GPUObject is sparse and empty. Being allocated is a prerequisite to being sparse and empty. |
void | releaseInput(): Releases input allocated on GPU. |
void | releaseOutput(): Releases output allocated on GPU. |
void | releaseReadLock() |
void | releaseWriteLock() |
void | resetReadWriteLock() |
void | setDensePointer(jcuda.Pointer densePtr): Convenience method to directly set the dense matrix pointer on GPU. |
void | setSparseMatrixCudaPointer(CSRPointer sparseMatrixPtr): Convenience method to directly set the sparse matrix on GPU. Needed for operations like cusparseDcsrgemm(cusparseHandle, int, int, int, int, int, cusparseMatDescr, int, Pointer, Pointer, Pointer, cusparseMatDescr, int, Pointer, Pointer, Pointer, cusparseMatDescr, Pointer, Pointer, Pointer). |
void | sparseToColumnMajorDense(): More efficient method to convert sparse to dense, but returns the dense matrix in column-major format. |
void | sparseToDense(): Converts sparse to dense (performs a transpose; use sparseToColumnMajorDense if the kernel can deal with column-major format). |
void | sparseToDense(String instructionName): Converts sparse to dense (performs a transpose; use sparseToColumnMajorDense if the kernel can deal with column-major format). Also records per-instruction invocation of sparseToDense. |
static int | toIntExact(long l) |
String | toString() |
static jcuda.Pointer | transpose(GPUContext gCtx, jcuda.Pointer densePtr, int m, int n, int lda, int ldc): Transposes a dense matrix on the GPU by calling the cublasDgeam operation. |
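To make the index arithmetic behind a geam-style transpose concrete, here is a CPU-only sketch. It is not the GPU code path of this class. The class name `TransposeSketch` is hypothetical, and the sketch assumes the usual BLAS conventions: column-major storage, an output of `m` rows by `n` columns, and `lda` and `ldc` as leading dimensions (the stride between consecutive columns) of the input and output. Whether `transpose` passes `ldc` with exactly this meaning is not confirmed by the summary above.

```java
// Hypothetical CPU illustration of C = transpose(A) on column-major arrays.
final class TransposeSketch {
    /**
     * @param a   input matrix A with n rows and m columns, column-major
     * @param m   rows in the output matrix
     * @param n   columns in the output matrix
     * @param lda leading dimension of A (at least n)
     * @param ldc leading dimension of the output (at least m)
     * @return the output matrix C with m rows and n columns, column-major
     */
    static double[] transpose(double[] a, int m, int n, int lda, int ldc) {
        if (lda < n || ldc < m) {
            throw new IllegalArgumentException("leading dimension too small");
        }
        double[] c = new double[ldc * n];
        for (int j = 0; j < n; j++) {       // column of C, row of A
            for (int i = 0; i < m; i++) {   // row of C, column of A
                c[i + j * ldc] = a[j + i * lda];
            }
        }
        return c;
    }
}
```

For a 3 x 2 input stored as `{1, 2, 3, 4, 5, 6}`, calling `TransposeSketch.transpose(a, 2, 3, 3, 2)` yields the 2 x 3 result stored as `{1, 4, 2, 5, 3, 6}`.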
protected boolean dirty
whether the block attached to this GPUContext is dirty on the device and needs to be copied back to host
protected LongAdder readLocks
protected boolean writeLock
protected boolean isSparse
public jcuda.Pointer getDensePointer()
public boolean isDensePointerNull()
public void clearDensePointer()
public void setDensePointer(jcuda.Pointer densePtr)
densePtr
- dense pointer
public static jcuda.Pointer transpose(GPUContext gCtx, jcuda.Pointer densePtr, int m, int n, int lda, int ldc)
gCtx
- a valid GPUContext
densePtr
- Pointer to dense matrix on the GPU
m
- rows in output matrix
n
- columns in output matrix
lda
- rows in input matrix
ldc
- columns in output matrix
public static CSRPointer columnMajorDenseToRowMajorSparse(GPUContext gCtx, jcuda.jcusparse.cusparseHandle cusparseHandle, jcuda.Pointer densePtr, int rows, int cols)
gCtx
- a valid GPUContext
cusparseHandle
- handle to cusparse library
densePtr
- [in] dense matrix pointer on the GPU in row major
rows
- number of rows
cols
- number of columns
public CSRPointer getSparseMatrixCudaPointer()
public void setSparseMatrixCudaPointer(CSRPointer sparseMatrixPtr)
sparseMatrixPtr
- CSR (compressed sparse row) pointer
public void denseToSparse()
public void denseRowMajorToColumnMajor()
public void denseColumnMajorToRowMajor()
public void sparseToDense()
public void sparseToDense(String instructionName)
instructionName
- Name of the instruction for which statistics are recorded in GPUStatistics
public void sparseToColumnMajorDense()
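The difference between `sparseToDense` and `sparseToColumnMajorDense` comes down to which dense layout the CSR entries are written into. The sketch below shows both layouts on the CPU with plain arrays. It is an illustration only: the class `CsrToDenseSketch` and its method names are hypothetical, zero-based CSR indexing is assumed, and none of this is the GPU code used by GPUObject.

```java
// Hypothetical CPU illustration of expanding a CSR matrix into dense arrays.
final class CsrToDenseSketch {
    // Row-major: element (i, j) is stored at index i * cols + j.
    static double[] toRowMajorDense(int rows, int cols, int[] rowPtr, int[] colInd, double[] val) {
        double[] dense = new double[rows * cols];
        for (int i = 0; i < rows; i++) {
            // Entries of row i occupy positions rowPtr[i] .. rowPtr[i + 1] - 1.
            for (int k = rowPtr[i]; k < rowPtr[i + 1]; k++) {
                dense[i * cols + colInd[k]] = val[k];
            }
        }
        return dense;
    }

    // Column-major: element (i, j) is stored at index j * rows + i.
    static double[] toColumnMajorDense(int rows, int cols, int[] rowPtr, int[] colInd, double[] val) {
        double[] dense = new double[rows * cols];
        for (int i = 0; i < rows; i++) {
            for (int k = rowPtr[i]; k < rowPtr[i + 1]; k++) {
                dense[colInd[k] * rows + i] = val[k];
            }
        }
        return dense;
    }
}
```

For the 2 x 3 matrix with rows `(1, 0, 2)` and `(0, 3, 0)`, the CSR arrays are `rowPtr = {0, 2, 3}`, `colInd = {0, 2, 1}` and `val = {1, 2, 3}`. The row-major result is `{1, 0, 2, 0, 3, 0}` and the column-major result is `{1, 0, 0, 3, 2, 0}`; converting one into the other is a transpose of the underlying array, which is the extra step the `sparseToDense` description refers to.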
public boolean isSparse()
public boolean isAllocated()
public void allocateSparseAndEmpty()
Allocates a sparse and empty GPUObject. This is the result of operations that are both non zero matrices.
public void allocateAndFillDense(double v)
v
- value to fill up the dense matrix
public boolean isSparseAndEmpty()
Whether this GPUObject is sparse and empty. Being allocated is a prerequisite to being sparse and empty.
public long getNnz(String instName, boolean recomputeDenseNNZ)
instName
- instruction name
recomputeDenseNNZ
- recompute NNZ if dense
public boolean acquireDeviceRead(String opcode)
public boolean acquireDeviceModifyDense()
public boolean acquireDeviceModifySparse()
public boolean acquireHostRead(String instName)
instName
- name of the instruction
public boolean isLocked()
public void addReadLock()
public void addWriteLock()
public void releaseReadLock()
public void releaseWriteLock()
public void resetReadWriteLock()
public void releaseInput()
public void releaseOutput()
protected long getSizeOnDevice()
public static int toIntExact(long l)
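Only the signature of `toIntExact` is documented here. Judging by the name, it narrows a `long` to an `int` and fails when the value does not fit, much like the JDK's `Math.toIntExact`. The sketch below shows that pattern under this assumption; the class name `ToIntExactSketch` is hypothetical, and the exception type thrown by the real method is not stated in this page.

```java
// Hypothetical illustration of overflow-checked narrowing from long to int.
final class ToIntExactSketch {
    static int toIntExact(long l) {
        // Assumption: out-of-range values are rejected rather than silently truncated.
        if (l < Integer.MIN_VALUE || l > Integer.MAX_VALUE) {
            throw new ArithmeticException("value does not fit in an int: " + l);
        }
        return (int) l;
    }
}
```

A plain cast `(int) l` would instead wrap around for large values, which is risky when the result is used as an array length or a matrix dimension.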
protected void copyFromDeviceToHost(String instName, boolean isEviction, boolean eagerDelete) throws DMLRuntimeException
instName
- opcode of the instruction for fine-grained statistics
isEviction
- is called for eviction
eagerDelete
- whether to perform eager deletion of the device data
DMLRuntimeException
- if error occurs
public void clearData(String opcode, boolean eager) throws DMLRuntimeException
Clears the data associated with this GPUObject instance.
opcode
- opcode of the instruction
eager
- whether to be done synchronously or asynchronously
DMLRuntimeException
- if error occurs
public CSRPointer getJcudaSparseMatrixPtr()
public boolean isDirty()
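The `dirty` flag and `acquireHostRead` together describe a copy-back protocol: device-side writes mark the block dirty, and a host read copies the data back only when the block is allocated on the device and dirty. The following sketch simulates that protocol with two Java arrays. The class `DirtyCopySketch`, the helper `writeOnDevice`, and the meaning of the boolean return value (true only when a copy happened) are assumptions for illustration, not the behaviour of GPUObject.

```java
// Hypothetical simulation of the dirty-flag copy-back protocol; no GPU involved.
final class DirtyCopySketch {
    private double[] host;    // host-side copy of the block
    private double[] device;  // simulated device buffer; null means not allocated
    private boolean dirty;    // true when the device holds changes the host lacks

    DirtyCopySketch(double[] hostData) {
        this.host = hostData.clone();
    }

    // Simulates a device-side write: allocate on first use, modify, mark dirty.
    void writeOnDevice(int index, double value) {
        if (device == null) {
            device = host.clone();
        }
        device[index] = value;
        dirty = true;
    }

    boolean isAllocated() {
        return device != null;
    }

    boolean isDirty() {
        return dirty;
    }

    // Assumption: returns true only when a device-to-host copy was performed.
    boolean acquireHostRead() {
        if (isAllocated() && dirty) {
            host = device.clone();
            dirty = false;
            return true;
        }
        return false;
    }

    double hostValue(int index) {
        return host[index];
    }
}
```

In this model, the host keeps seeing the old value after `writeOnDevice` until `acquireHostRead()` runs, and a second `acquireHostRead()` in a row does no work because the block is no longer dirty.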
Copyright © 2018 The Apache Software Foundation. All rights reserved.