Package org.apache.sysds.runtime.data
Class SparseBlockCSR
- java.lang.Object
- 
- org.apache.sysds.runtime.data.SparseBlock
- 
- org.apache.sysds.runtime.data.SparseBlockCSR
 
 
- 
- All Implemented Interfaces:
- Serializable
 
 public class SparseBlockCSR extends SparseBlock SparseBlock implementation that realizes a traditional 'compressed sparse row' representation, where the entire sparse block is stored as three arrays: ptr of length rlen+1 to store offsets per row, and indexes/values of length nnz to store column indexes and values of non-zero entries. This format is very memory efficient for sparse (but not ultra-sparse) matrices and provides very good performance for common operations, partially due to lower memory bandwidth requirements. However, this format is slow on incremental construction (because it does not allow append/sort per row) without reshifting. Finally, the total nnz is limited to INTEGER_MAX, whereas for SparseBlockMCSR only the nnz per row are limited to INTEGER_MAX. TODO: extensions for faster incremental construction (e.g., max row) TODO more efficient fused setIndexRange impl to avoid repeated copies and updates- See Also:
- Serialized Form
 
- 
- 
Nested Class Summary- 
Nested classes/interfaces inherited from class org.apache.sysds.runtime.data.SparseBlockSparseBlock.Type
 
- 
 - 
Constructor SummaryConstructors Constructor Description SparseBlockCSR(int rlen)SparseBlockCSR(int[] rowPtr, int[] colInd, double[] values, int nnz)SparseBlockCSR(int rlen, int capacity)SparseBlockCSR(int rows, int[] rowInd, int[] colInd, double[] values)Copy constructor for COO representationSparseBlockCSR(int rlen, int capacity, int size)SparseBlockCSR(int rows, int nnz, int[] colInd)Copy constructor for given array of column indexes, which identifies rows by position and implies values of 1.SparseBlockCSR(SparseBlock sblock)Copy constructor sparse block abstraction.SparseBlockCSR(SparseRow[] rows, int nnz)Copy constructor old sparse row representation.
 - 
Method SummaryAll Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description booleanadd(int r, int c, double v)Add a value to a matrix cell (r,c).voidallocate(int r)Allocate the underlying data structure holding non-zero values of row r if necessary.voidallocate(int r, int nnz)Allocate the underlying data structure holding non-zero values of row r if necessary, w/ given size.voidallocate(int r, int ennz, int maxnnz)Allocate the underlying data structure holding non-zero values of row r w/ the specified estimated nnz and max nnz.voidappend(int r, int c, double v)Append a value to the end of the physical representation.booleancheckValidity(int rlen, int clen, long nnz, boolean strict)Validate the correctness of the internal data structures of the different sparse block implementations.voidcompact()voidcompact(int r)Re-allocate physical row if physical size exceeds logical size plus resize factor.voiddeleteIndexRange(int r, int cl, int cu)Deletes all non-zero values of the given column range [cl,cu) in row r.static longestimateSizeInMemory(long nrows, long ncols, double sparsity)Get the estimated in-memory size of the sparse block in CSR with the given dimensions w/o accounting for overallocation.SparseRowget(int r)Get values of row r in the format of a sparse row.doubleget(int r, int c)Get value of matrix cell (r,c).int[]indexes()Get raw access to underlying array of column indices For use in GPU codeint[]indexes(int r)Get the sorted array of column indexes of all non-zero entries in row r.voidinitSparse(int rlen, int nnz, DataInput in)Initializes the CSR sparse block from an ordered input stream of sparse rows (rownnz, jv-pairs*).voidinitUltraSparse(int nnz, DataInput in)Initializes the CSR sparse block from an ordered input stream of ultra-sparse ijv triples.booleanisAllocated(int r)Indicates if the underlying data structure for a given row is already allocated.booleanisContiguous()Indicates if the underlying data structures returned by values and indexes are contiguous arrays, which can be exploited for more efficient operations.booleanisEmpty(int r)Get information if row r is empty, i.e., does not contain non-zero values.booleanisThreadSafe()Indicates if the underlying implementation allows thread-safe row updates if concurrent threads update disjoint rows.intnumRows()Get the number of rows in the sparse block.intpos(int r)Get the starting position of row r in the indexes/values arrays returned by indexes(r) and values(r).intposFIndexGT(int r, int c)Get position of first column index greater than column c in row r.intposFIndexGTE(int r, int c)Get position of first column index greater than or equal column c in row r.intposFIndexLTE(int r, int c)Get position of first column index lower than or equal column c in row r.voidreset()Clears the sparse block by deleting non-zero values.voidreset(int ennz, int maxnnz)Clears the sparse block by deleting non-zero values.voidreset(int r, int ennz, int maxnnz)Clears row r of the sparse block by deleting non-zero values.int[]rowPointers()Get raw access to underlying array of row pointers For use in GPU codebooleanset(int r, int c, double v)Set the value of a matrix cell (r,c).voidset(int r, SparseRow row, boolean deep)Set the values of row r to the given sparse row.voidsetIndexRange(int r, int cl, int cu, double[] v, int[] vix, int vpos, int vlen)Sets a sparse array of non-zeros values and indexes into the column range [cl,cu) in row r.voidsetIndexRange(int r, int cl, int cu, double[] v, int vix, int vlen)Sets a dense array of non-zeros values into the column range [cl,cu) in row r.voidsetIndexRange(int rl, int ru, int cl, int cu, double[] v, int vix, int vlen)Inserts a sorted row-major array of non-zero values into the row and column range [rl,ru) and [cl,cu).voidsetIndexRange(int rl, int ru, int cl, int cu, SparseBlock sb)Inserts a sparse block into the row and column range [rl,ru) and [cl,cu).longsize()Get the number of non-zero values in the sparse block.intsize(int r)Get the number of non-zero values in row r.longsize(int rl, int ru)Get the number of non-zeros values in the row range of [rl, ru).longsize(int rl, int ru, int cl, int cu)Get the number of non-zeros values in the row and column range of [rl/cl, ru/cu);voidsort()Sort all non-zero value/index pairs of the sparse block by row and column index.voidsort(int r)Sort all non-zero value/index pairs of row r column index.StringtoString()double[]values()Get raw access to underlying array of values For use in GPU codedouble[]values(int r)Get the array of all non-zero entries in row r, sorted by their column indexes.- 
Methods inherited from class org.apache.sysds.runtime.data.SparseBlockgetIterator, getIterator, getIterator, isAligned, isAligned
 
- 
 
- 
- 
- 
Constructor Detail- 
SparseBlockCSRpublic SparseBlockCSR(int rlen) 
 - 
SparseBlockCSRpublic SparseBlockCSR(int rlen, int capacity)
 - 
SparseBlockCSRpublic SparseBlockCSR(int rlen, int capacity, int size)
 - 
SparseBlockCSRpublic SparseBlockCSR(int[] rowPtr, int[] colInd, double[] values, int nnz)
 - 
SparseBlockCSRpublic SparseBlockCSR(SparseBlock sblock) Copy constructor sparse block abstraction.- Parameters:
- sblock- sparse block to copy
 
 - 
SparseBlockCSRpublic SparseBlockCSR(SparseRow[] rows, int nnz) Copy constructor old sparse row representation.- Parameters:
- rows- array of sparse rows
- nnz- number of non-zeroes
 
 - 
SparseBlockCSRpublic SparseBlockCSR(int rows, int[] rowInd, int[] colInd, double[] values)Copy constructor for COO representation- Parameters:
- rows- number of rows
- rowInd- row indices
- colInd- column indices
- values- non zero values
 
 - 
SparseBlockCSRpublic SparseBlockCSR(int rows, int nnz, int[] colInd)Copy constructor for given array of column indexes, which identifies rows by position and implies values of 1.- Parameters:
- rows- number of rows
- nnz- number of non-zeros
- colInd- column indexes
 
 
- 
 - 
Method Detail- 
initUltraSparsepublic void initUltraSparse(int nnz, DataInput in) throws IOExceptionInitializes the CSR sparse block from an ordered input stream of ultra-sparse ijv triples.- Parameters:
- nnz- number of non-zeros to read
- in- data input stream of ijv triples, ordered by ij
- Throws:
- IOException- if deserialization error occurs
 
 - 
initSparsepublic void initSparse(int rlen, int nnz, DataInput in) throws IOExceptionInitializes the CSR sparse block from an ordered input stream of sparse rows (rownnz, jv-pairs*).- Parameters:
- rlen- number of rows
- nnz- number of non-zeros to read
- in- data input stream of sparse rows, ordered by i
- Throws:
- IOException- if deserialization error occurs
 
 - 
estimateSizeInMemorypublic static long estimateSizeInMemory(long nrows, long ncols, double sparsity)Get the estimated in-memory size of the sparse block in CSR with the given dimensions w/o accounting for overallocation.- Parameters:
- nrows- number of rows
- ncols- number of columns
- sparsity- sparsity ratio
- Returns:
- memory estimate
 
 - 
rowPointerspublic int[] rowPointers() Get raw access to underlying array of row pointers For use in GPU code- Returns:
- array of row pointers
 
 - 
indexespublic int[] indexes() Get raw access to underlying array of column indices For use in GPU code- Returns:
- array of column indexes
 
 - 
valuespublic double[] values() Get raw access to underlying array of values For use in GPU code- Returns:
- array of values
 
 - 
allocatepublic void allocate(int r) Description copied from class:SparseBlockAllocate the underlying data structure holding non-zero values of row r if necessary.- Specified by:
- allocatein class- SparseBlock
- Parameters:
- r- row index
 
 - 
allocatepublic void allocate(int r, int nnz)Description copied from class:SparseBlockAllocate the underlying data structure holding non-zero values of row r if necessary, w/ given size.- Specified by:
- allocatein class- SparseBlock
- Parameters:
- r- row index
- nnz- number of non-zeros
 
 - 
allocatepublic void allocate(int r, int ennz, int maxnnz)Description copied from class:SparseBlockAllocate the underlying data structure holding non-zero values of row r w/ the specified estimated nnz and max nnz.- Specified by:
- allocatein class- SparseBlock
- Parameters:
- r- row index
- ennz- estimated non-zeros
- maxnnz- max non-zeros
 
 - 
compactpublic void compact(int r) Description copied from class:SparseBlockRe-allocate physical row if physical size exceeds logical size plus resize factor.- Specified by:
- compactin class- SparseBlock
- Parameters:
- r- row index
 
 - 
compactpublic void compact() 
 - 
numRowspublic int numRows() Description copied from class:SparseBlockGet the number of rows in the sparse block.- Specified by:
- numRowsin class- SparseBlock
- Returns:
- number of rows
 
 - 
isThreadSafepublic boolean isThreadSafe() Description copied from class:SparseBlockIndicates if the underlying implementation allows thread-safe row updates if concurrent threads update disjoint rows.- Specified by:
- isThreadSafein class- SparseBlock
- Returns:
- true if thread-safe row updates
 
 - 
isContiguouspublic boolean isContiguous() Description copied from class:SparseBlockIndicates if the underlying data structures returned by values and indexes are contiguous arrays, which can be exploited for more efficient operations.- Specified by:
- isContiguousin class- SparseBlock
- Returns:
- true if underlying data structures are contiguous arrays
 
 - 
isAllocatedpublic boolean isAllocated(int r) Description copied from class:SparseBlockIndicates if the underlying data structure for a given row is already allocated.- Specified by:
- isAllocatedin class- SparseBlock
- Parameters:
- r- row index
- Returns:
- true if already allocated
 
 - 
resetpublic void reset() Description copied from class:SparseBlockClears the sparse block by deleting non-zero values. After this call all size() calls are guaranteed to return 0.- Specified by:
- resetin class- SparseBlock
 
 - 
resetpublic void reset(int ennz, int maxnnz)Description copied from class:SparseBlockClears the sparse block by deleting non-zero values. After this call all size() calls are guaranteed to return 0.- Specified by:
- resetin class- SparseBlock
- Parameters:
- ennz- estimated non-zeros
- maxnnz- max non-zeros
 
 - 
resetpublic void reset(int r, int ennz, int maxnnz)Description copied from class:SparseBlockClears row r of the sparse block by deleting non-zero values. After this call size(r) is guaranteed to return 0.- Specified by:
- resetin class- SparseBlock
- Parameters:
- r- row index
- ennz- estimated non-zeros
- maxnnz- max non-zeros
 
 - 
sizepublic long size() Description copied from class:SparseBlockGet the number of non-zero values in the sparse block.- Specified by:
- sizein class- SparseBlock
- Returns:
- number of non-zero values in sparse block
 
 - 
sizepublic int size(int r) Description copied from class:SparseBlockGet the number of non-zero values in row r.- Specified by:
- sizein class- SparseBlock
- Parameters:
- r- row index starting at 0
- Returns:
- number of non-zero values in row r
 
 - 
sizepublic long size(int rl, int ru)Description copied from class:SparseBlockGet the number of non-zeros values in the row range of [rl, ru).- Specified by:
- sizein class- SparseBlock
- Parameters:
- rl- row lower index
- ru- row upper index
- Returns:
- number of non-zero values in the row range
 
 - 
sizepublic long size(int rl, int ru, int cl, int cu)Description copied from class:SparseBlockGet the number of non-zeros values in the row and column range of [rl/cl, ru/cu);- Specified by:
- sizein class- SparseBlock
- Parameters:
- rl- row lower index
- ru- row upper index
- cl- column lower index
- cu- column upper index
- Returns:
- number of non-zero values in the row and column range
 
 - 
isEmptypublic boolean isEmpty(int r) Description copied from class:SparseBlockGet information if row r is empty, i.e., does not contain non-zero values. Equivalent to size(r)==0. Users should do this check if it is unknown if the underlying row data structure is allocated.- Specified by:
- isEmptyin class- SparseBlock
- Parameters:
- r- row index starting at 0
- Returns:
- true if row does not contain non-zero values
 
 - 
indexespublic int[] indexes(int r) Description copied from class:SparseBlockGet the sorted array of column indexes of all non-zero entries in row r. Note that - for flexibility of the implementing format - the returned array may be larger, where the range for row r is given by [pos(r),pos(r)+size(r)).- Specified by:
- indexesin class- SparseBlock
- Parameters:
- r- row index starting at 0
- Returns:
- sorted array of column indexes
 
 - 
valuespublic double[] values(int r) Description copied from class:SparseBlockGet the array of all non-zero entries in row r, sorted by their column indexes. Note that - for flexibility of the implementing format - the returned array may be larger, where the range for row r is given by [pos(r),pos(r)+size(r)).- Specified by:
- valuesin class- SparseBlock
- Parameters:
- r- row index starting at 0
- Returns:
- array of all non-zero entries in row r sorted by column indexes
 
 - 
pospublic int pos(int r) Description copied from class:SparseBlockGet the starting position of row r in the indexes/values arrays returned by indexes(r) and values(r).- Specified by:
- posin class- SparseBlock
- Parameters:
- r- row index starting at 0
- Returns:
- starting position of row r
 
 - 
setpublic boolean set(int r, int c, double v)Description copied from class:SparseBlockSet the value of a matrix cell (r,c). This might update an existing non-zero value, insert a new non-zero value, or delete a non-zero value.- Specified by:
- setin class- SparseBlock
- Parameters:
- r- row index starting at 0
- c- column index starting at 0
- v- zero or non-zero value
- Returns:
- true, if number of non-zeros changed
 
 - 
addpublic boolean add(int r, int c, double v)Description copied from class:SparseBlockAdd a value to a matrix cell (r,c). This might update an existing non-zero value, or insert a new non-zero value.- Specified by:
- addin class- SparseBlock
- Parameters:
- r- row index starting at 0
- c- column index starting at 0
- v- zero or non-zero value
- Returns:
- true, if number of non-zeros changed
 
 - 
setpublic void set(int r, SparseRow row, boolean deep)Description copied from class:SparseBlockSet the values of row r to the given sparse row. This might update existing non-zero values, insert a new row, or delete a row. NOTE: This method exists for incremental runtime integration and might be deleted in the future.- Specified by:
- setin class- SparseBlock
- Parameters:
- r- row index starting at 0
- row- sparse row
- deep- indicator to create deep copy of sparse row
 
 - 
appendpublic void append(int r, int c, double v)Description copied from class:SparseBlockAppend a value to the end of the physical representation. This should only be used for operations with sequential write pattern or if followed by a sort() operation. Note that this operation does not perform any matrix cell updates.- Specified by:
- appendin class- SparseBlock
- Parameters:
- r- row index starting at 0
- c- column index starting at 0
- v- zero or non-zero value
 
 - 
setIndexRangepublic void setIndexRange(int r, int cl, int cu, double[] v, int vix, int vlen)Description copied from class:SparseBlockSets a dense array of non-zeros values into the column range [cl,cu) in row r. The passed value array may be larger and the relevant range is given by [vix,vix+len).- Specified by:
- setIndexRangein class- SparseBlock
- Parameters:
- r- row index starting at 0
- cl- lower column index starting at 0
- cu- upper column index starting at 0
- v- value array
- vix- start index in value array
- vlen- number of relevant values
 
 - 
setIndexRangepublic void setIndexRange(int r, int cl, int cu, double[] v, int[] vix, int vpos, int vlen)Description copied from class:SparseBlockSets a sparse array of non-zeros values and indexes into the column range [cl,cu) in row r. The passed value array may be larger.- Specified by:
- setIndexRangein class- SparseBlock
- Parameters:
- r- row index starting at 0
- cl- lower column index starting at 0
- cu- upper column index starting at 0
- v- value array
- vix- column index array
- vpos- start index in value and index arrays
- vlen- number of relevant values
 
 - 
setIndexRangepublic void setIndexRange(int rl, int ru, int cl, int cu, double[] v, int vix, int vlen)Inserts a sorted row-major array of non-zero values into the row and column range [rl,ru) and [cl,cu). Note: that this is a CSR-specific method to address performance issues due to repeated re-shifting on update-in-place.- Parameters:
- rl- lower row index, starting at 0, inclusive
- ru- upper row index, starting at 0, exclusive
- cl- lower column index, starting at 0, inclusive
- cu- upper column index, starting at 0, exclusive
- v- right-hand-side dense block
- vix- right-hand-side dense block index
- vlen- right-hand-side dense block value length
 
 - 
setIndexRangepublic void setIndexRange(int rl, int ru, int cl, int cu, SparseBlock sb)Inserts a sparse block into the row and column range [rl,ru) and [cl,cu). Note: that this is a CSR-specific method to address performance issues due to repeated re-shifting on update-in-place.- Parameters:
- rl- lower row index, starting at 0, inclusive
- ru- upper row index, starting at 0, exclusive
- cl- lower column index, starting at 0, inclusive
- cu- upper column index, starting at 0, exclusive
- sb- right-hand-side sparse block
 
 - 
deleteIndexRangepublic void deleteIndexRange(int r, int cl, int cu)Description copied from class:SparseBlockDeletes all non-zero values of the given column range [cl,cu) in row r.- Specified by:
- deleteIndexRangein class- SparseBlock
- Parameters:
- r- row index starting at 0
- cl- lower column index starting at 0
- cu- upper column index starting at 0
 
 - 
sortpublic void sort() Description copied from class:SparseBlockSort all non-zero value/index pairs of the sparse block by row and column index.- Specified by:
- sortin class- SparseBlock
 
 - 
sortpublic void sort(int r) Description copied from class:SparseBlockSort all non-zero value/index pairs of row r column index.- Specified by:
- sortin class- SparseBlock
- Parameters:
- r- row index starting at 0
 
 - 
getpublic double get(int r, int c)Description copied from class:SparseBlockGet value of matrix cell (r,c). In case of non existing values this call returns 0.- Specified by:
- getin class- SparseBlock
- Parameters:
- r- row index starting at 0
- c- column index starting at 0
- Returns:
- value of cell at position (r,c)
 
 - 
getpublic SparseRow get(int r) Description copied from class:SparseBlockGet values of row r in the format of a sparse row. NOTE: This method exists for incremental runtime integration and might be deleted in the future.- Specified by:
- getin class- SparseBlock
- Parameters:
- r- row index starting at 0
- Returns:
- values of row r as a sparse row
 
 - 
posFIndexLTEpublic int posFIndexLTE(int r, int c)Description copied from class:SparseBlockGet position of first column index lower than or equal column c in row r. The position is relative to the indexes/values arrays returned by indexes(r) and values(r). If no such value exists, this call returns -1.- Specified by:
- posFIndexLTEin class- SparseBlock
- Parameters:
- r- row index starting at 0
- c- column index starting at 0
- Returns:
- position of the first column index lower than or equal to column c in row r
 
 - 
posFIndexGTEpublic int posFIndexGTE(int r, int c)Description copied from class:SparseBlockGet position of first column index greater than or equal column c in row r. The position is relative to the indexes/values arrays returned by indexes(r) and values(r). If no such value exists, this call returns -1. Note if CSR the pos(r) is subtracted from the result.- Specified by:
- posFIndexGTEin class- SparseBlock
- Parameters:
- r- row index starting at 0
- c- column index starting at 0
- Returns:
- position of the first column index greater than or equal to column c in row r
 
 - 
posFIndexGTpublic int posFIndexGT(int r, int c)Description copied from class:SparseBlockGet position of first column index greater than column c in row r. The position is relative to the indexes/values arrays returned by indexes(r) and values(r). If no such value exists, this call returns -1.- Specified by:
- posFIndexGTin class- SparseBlock
- Parameters:
- r- row index starting at 0
- c- column index starting at 0
- Returns:
- position of the first column index greater than column c in row r
 
 - 
toStringpublic String toString() - Specified by:
- toStringin class- SparseBlock
 
 - 
checkValiditypublic boolean checkValidity(int rlen, int clen, long nnz, boolean strict)Description copied from class:SparseBlockValidate the correctness of the internal data structures of the different sparse block implementations.- Specified by:
- checkValidityin class- SparseBlock
- Parameters:
- rlen- number of rows
- clen- number of columns
- nnz- number of non zeros
- strict- enforce optional properties
- Returns:
- true if the sparse block is valid wrt the corresponding format such as COO, CSR, MCSR.
 
 
- 
 
-