Package org.apache.sysds.runtime.data
Class SparseBlockCOO
- java.lang.Object
-
- org.apache.sysds.runtime.data.SparseBlock
-
- org.apache.sysds.runtime.data.SparseBlockCOO
-
- All Implemented Interfaces:
Serializable,Block
public class SparseBlockCOO extends SparseBlock
SparseBlock implementation that realizes a traditional 'coordinate matrix' representation, where the entire sparse block is stored as triples in three arrays: row indexes, column indexes, and values, where row indexes and colunm indexes are sorted in order to allow binary search. This format is very memory efficient for ultra-sparse matrices, allows fast incremental construction but has performance drawbacks for row-major access through our sparse block abstraction since there is no constant-time random access to individual rows. Similar to CSR, the nnz is limited to Integer.MAX_VALUE. In contrast to COO matrix formats with three arrays, we use 1+#dims arrays to represent the values and indexes of all dimensions.- See Also:
- Serialized Form
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.sysds.runtime.data.SparseBlock
SparseBlock.Type
-
-
Constructor Summary
Constructors Constructor Description SparseBlockCOO(int rlen)SparseBlockCOO(int rlen, int capacity)SparseBlockCOO(SparseBlock sblock)Copy constructor sparse block abstraction.SparseBlockCOO(SparseRow[] rows, int nnz)Copy constructor old sparse row representation.
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description booleanadd(int r, int c, double v)Add a value to a matrix cell (r,c).voidallocate(int r)Allocate the underlying data structure holding non-zero values of row r if necessary.voidallocate(int r, int nnz)Allocate the underlying data structure holding non-zero values of row r if necessary, w/ given size.voidallocate(int r, int ennz, int maxnnz)Allocate the underlying data structure holding non-zero values of row r w/ the specified estimated nnz and max nnz.voidappend(int r, int c, double v)Append a value to the end of the physical representation.booleancheckValidity(int rlen, int clen, long nnz, boolean strict)Validate the correctness of the internal data structures of the different sparse block implementations.voidcompact(int r)Re-allocate physical row if physical size exceeds logical size plus resize factor.voiddeleteIndexRange(int r, int cl, int cu)Deletes all non-zero values of the given column range [cl,cu) in row r.static longestimateSizeInMemory(long nrows, long ncols, double sparsity)Get the estimated in-memory size of the sparse block in COO with the given dimensions w/o accounting for overallocation.SparseRowget(int r)Get values of row r in the format of a sparse row.doubleget(int r, int c)Get value of matrix cell (r,c).Iterator<IJV>getIterator()Get a non-zero iterator over the entire sparse block.Iterator<IJV>getIterator(int ru)Get a non-zero iterator over the partial sparse block [0,ru).Iterator<IJV>getIterator(int rl, int ru)Get a non-zero iterator over the subblock [rl, ru).int[]indexes()Get raw access to underlying array of column indices For use in GPU codeint[]indexes(int r)Get the sorted array of column indexes of all non-zero entries in row r.booleanisAllocated(int r)Indicates if the underlying data structure for a given row is already allocated.booleanisContiguous()Indicates if the underlying data structures returned by values and indexes are contiguous arrays, which can be exploited for more efficient operations.booleanisEmpty(int r)Get information if row r is empty, i.e., does not contain non-zero values.booleanisThreadSafe()Indicates if the underlying implementation allows thread-safe row updates if concurrent threads update disjoint rows.intnumRows()Get the number of rows in the sparse block.intpos(int r)Get the starting position of row r in the indexes/values arrays returned by indexes(r) and values(r).intposFIndexGT(int r, int c)Get position of first column index greater than column c in row r.intposFIndexGTE(int r, int c)Get position of first column index greater than or equal column c in row r.intposFIndexLTE(int r, int c)Get position of first column index lower than or equal column c in row r.voidreset()Clears the sparse block by deleting non-zero values.voidreset(int ennz, int maxnnz)Clears the sparse block by deleting non-zero values.voidreset(int r, int ennz, int maxnnz)Clears row r of the sparse block by deleting non-zero values.int[]rowIndexes()Get raw access to underlying array of row indices For use in GPU codebooleanset(int r, int c, double v)Set the value of a matrix cell (r,c).voidset(int r, SparseRow row, boolean deep)Set the values of row r to the given sparse row.voidsetIndexRange(int r, int cl, int cu, double[] v, int[] vix, int vpos, int vlen)Sets a sparse array of non-zeros values and indexes into the column range [cl,cu) in row r.voidsetIndexRange(int r, int cl, int cu, double[] v, int vix, int vlen)Sets a dense array of non-zeros values into the column range [cl,cu) in row r.longsize()Get the number of non-zero values in the sparse block.intsize(int r)Get the number of non-zero values in row r.longsize(int rl, int ru)Get the number of non-zeros values in the row range of [rl, ru).longsize(int rl, int ru, int cl, int cu)Get the number of non-zeros values in the row and column range of [rl/cl, ru/cu);voidsort()Sort all non-zero value/index pairs of the sparse block by row and column index.voidsort(int r)Sort all non-zero value/index pairs of row r column index.StringtoString()double[]values()Get raw access to underlying array of values For use in GPU codedouble[]values(int r)Get the array of all non-zero entries in row r, sorted by their column indexes.
-
-
-
Constructor Detail
-
SparseBlockCOO
public SparseBlockCOO(int rlen)
-
SparseBlockCOO
public SparseBlockCOO(int rlen, int capacity)
-
SparseBlockCOO
public SparseBlockCOO(SparseBlock sblock)
Copy constructor sparse block abstraction.- Parameters:
sblock- sparse block to copy
-
SparseBlockCOO
public SparseBlockCOO(SparseRow[] rows, int nnz)
Copy constructor old sparse row representation.- Parameters:
rows- array of sparse rowsnnz- number of non-zeros
-
-
Method Detail
-
estimateSizeInMemory
public static long estimateSizeInMemory(long nrows, long ncols, double sparsity)Get the estimated in-memory size of the sparse block in COO with the given dimensions w/o accounting for overallocation.- Parameters:
nrows- number of rowsncols- number of columnssparsity- sparsity ratio- Returns:
- memory estimate
-
allocate
public void allocate(int r)
Description copied from class:SparseBlockAllocate the underlying data structure holding non-zero values of row r if necessary.- Specified by:
allocatein classSparseBlock- Parameters:
r- row index
-
allocate
public void allocate(int r, int nnz)Description copied from class:SparseBlockAllocate the underlying data structure holding non-zero values of row r if necessary, w/ given size.- Specified by:
allocatein classSparseBlock- Parameters:
r- row indexnnz- number of non-zeros
-
allocate
public void allocate(int r, int ennz, int maxnnz)Description copied from class:SparseBlockAllocate the underlying data structure holding non-zero values of row r w/ the specified estimated nnz and max nnz.- Specified by:
allocatein classSparseBlock- Parameters:
r- row indexennz- estimated non-zerosmaxnnz- max non-zeros
-
compact
public void compact(int r)
Description copied from class:SparseBlockRe-allocate physical row if physical size exceeds logical size plus resize factor.- Specified by:
compactin classSparseBlock- Parameters:
r- row index
-
numRows
public int numRows()
Description copied from class:SparseBlockGet the number of rows in the sparse block.- Specified by:
numRowsin classSparseBlock- Returns:
- number of rows
-
isThreadSafe
public boolean isThreadSafe()
Description copied from class:SparseBlockIndicates if the underlying implementation allows thread-safe row updates if concurrent threads update disjoint rows.- Specified by:
isThreadSafein classSparseBlock- Returns:
- true if thread-safe row updates
-
isContiguous
public boolean isContiguous()
Description copied from class:SparseBlockIndicates if the underlying data structures returned by values and indexes are contiguous arrays, which can be exploited for more efficient operations.- Specified by:
isContiguousin classSparseBlock- Returns:
- true if underlying data structures are contiguous arrays
-
isAllocated
public boolean isAllocated(int r)
Description copied from class:SparseBlockIndicates if the underlying data structure for a given row is already allocated.- Specified by:
isAllocatedin classSparseBlock- Parameters:
r- row index- Returns:
- true if already allocated
-
checkValidity
public boolean checkValidity(int rlen, int clen, long nnz, boolean strict)Description copied from class:SparseBlockValidate the correctness of the internal data structures of the different sparse block implementations.- Specified by:
checkValidityin classSparseBlock- Parameters:
rlen- number of rowsclen- number of columnsnnz- number of non zerosstrict- enforce optional properties- Returns:
- true if the sparse block is valid wrt the corresponding format such as COO, CSR, MCSR.
-
reset
public void reset()
Description copied from class:SparseBlockClears the sparse block by deleting non-zero values. After this call all size() calls are guaranteed to return 0.- Specified by:
resetin classSparseBlock
-
reset
public void reset(int ennz, int maxnnz)Description copied from class:SparseBlockClears the sparse block by deleting non-zero values. After this call all size() calls are guaranteed to return 0.- Specified by:
resetin classSparseBlock- Parameters:
ennz- estimated non-zerosmaxnnz- max non-zeros
-
reset
public void reset(int r, int ennz, int maxnnz)Description copied from class:SparseBlockClears row r of the sparse block by deleting non-zero values. After this call size(r) is guaranteed to return 0.- Specified by:
resetin classSparseBlock- Parameters:
r- row indexennz- estimated non-zerosmaxnnz- max non-zeros
-
size
public long size()
Description copied from class:SparseBlockGet the number of non-zero values in the sparse block.- Specified by:
sizein classSparseBlock- Returns:
- number of non-zero values in sparse block
-
size
public int size(int r)
Description copied from class:SparseBlockGet the number of non-zero values in row r.- Specified by:
sizein classSparseBlock- Parameters:
r- row index starting at 0- Returns:
- number of non-zero values in row r
-
size
public long size(int rl, int ru)Description copied from class:SparseBlockGet the number of non-zeros values in the row range of [rl, ru).- Specified by:
sizein classSparseBlock- Parameters:
rl- row lower indexru- row upper index- Returns:
- number of non-zero values in the row range
-
size
public long size(int rl, int ru, int cl, int cu)Description copied from class:SparseBlockGet the number of non-zeros values in the row and column range of [rl/cl, ru/cu);- Specified by:
sizein classSparseBlock- Parameters:
rl- row lower indexru- row upper indexcl- column lower indexcu- column upper index- Returns:
- number of non-zero values in the row and column range
-
isEmpty
public boolean isEmpty(int r)
Description copied from class:SparseBlockGet information if row r is empty, i.e., does not contain non-zero values. Equivalent to size(r)==0. Users should do this check if it is unknown if the underlying row data structure is allocated.- Specified by:
isEmptyin classSparseBlock- Parameters:
r- row index starting at 0- Returns:
- true if row does not contain non-zero values
-
indexes
public int[] indexes(int r)
Description copied from class:SparseBlockGet the sorted array of column indexes of all non-zero entries in row r. Note that - for flexibility of the implementing format - the returned array may be larger, where the range for row r is given by [pos(r),pos(r)+size(r)).- Specified by:
indexesin classSparseBlock- Parameters:
r- row index starting at 0- Returns:
- sorted array of column indexes
-
values
public double[] values(int r)
Description copied from class:SparseBlockGet the array of all non-zero entries in row r, sorted by their column indexes. Note that - for flexibility of the implementing format - the returned array may be larger, where the range for row r is given by [pos(r),pos(r)+size(r)).- Specified by:
valuesin classSparseBlock- Parameters:
r- row index starting at 0- Returns:
- array of all non-zero entries in row r sorted by column indexes
-
pos
public int pos(int r)
Description copied from class:SparseBlockGet the starting position of row r in the indexes/values arrays returned by indexes(r) and values(r).- Specified by:
posin classSparseBlock- Parameters:
r- row index starting at 0- Returns:
- starting position of row r
-
set
public boolean set(int r, int c, double v)Description copied from class:SparseBlockSet the value of a matrix cell (r,c). This might update an existing non-zero value, insert a new non-zero value, or delete a non-zero value.- Specified by:
setin classSparseBlock- Parameters:
r- row index starting at 0c- column index starting at 0v- zero or non-zero value- Returns:
- true, if number of non-zeros changed
-
set
public void set(int r, SparseRow row, boolean deep)Description copied from class:SparseBlockSet the values of row r to the given sparse row. This might update existing non-zero values, insert a new row, or delete a row. NOTE: This method exists for incremental runtime integration and might be deleted in the future.- Specified by:
setin classSparseBlock- Parameters:
r- row index starting at 0row- sparse rowdeep- indicator to create deep copy of sparse row
-
add
public boolean add(int r, int c, double v)Description copied from class:SparseBlockAdd a value to a matrix cell (r,c). This might update an existing non-zero value, or insert a new non-zero value.- Specified by:
addin classSparseBlock- Parameters:
r- row index starting at 0c- column index starting at 0v- zero or non-zero value- Returns:
- true, if number of non-zeros changed
-
append
public void append(int r, int c, double v)Description copied from class:SparseBlockAppend a value to the end of the physical representation. This should only be used for operations with sequential write pattern or if followed by a sort() operation. Note that this operation does not perform any matrix cell updates.- Specified by:
appendin classSparseBlock- Parameters:
r- row index starting at 0c- column index starting at 0v- zero or non-zero value
-
setIndexRange
public void setIndexRange(int r, int cl, int cu, double[] v, int vix, int vlen)Description copied from class:SparseBlockSets a dense array of non-zeros values into the column range [cl,cu) in row r. The passed value array may be larger and the relevant range is given by [vix,vix+len).- Specified by:
setIndexRangein classSparseBlock- Parameters:
r- row index starting at 0cl- lower column index starting at 0cu- upper column index starting at 0v- value arrayvix- start index in value arrayvlen- number of relevant values
-
setIndexRange
public void setIndexRange(int r, int cl, int cu, double[] v, int[] vix, int vpos, int vlen)Description copied from class:SparseBlockSets a sparse array of non-zeros values and indexes into the column range [cl,cu) in row r. The passed value array may be larger.- Specified by:
setIndexRangein classSparseBlock- Parameters:
r- row index starting at 0cl- lower column index starting at 0cu- upper column index starting at 0v- value arrayvix- column index arrayvpos- start index in value and index arraysvlen- number of relevant values
-
deleteIndexRange
public void deleteIndexRange(int r, int cl, int cu)Description copied from class:SparseBlockDeletes all non-zero values of the given column range [cl,cu) in row r.- Specified by:
deleteIndexRangein classSparseBlock- Parameters:
r- row index starting at 0cl- lower column index starting at 0cu- upper column index starting at 0
-
sort
public void sort()
Description copied from class:SparseBlockSort all non-zero value/index pairs of the sparse block by row and column index.- Specified by:
sortin classSparseBlock
-
sort
public void sort(int r)
Description copied from class:SparseBlockSort all non-zero value/index pairs of row r column index.- Specified by:
sortin classSparseBlock- Parameters:
r- row index starting at 0
-
get
public double get(int r, int c)Description copied from class:SparseBlockGet value of matrix cell (r,c). In case of non existing values this call returns 0.- Specified by:
getin interfaceBlock- Specified by:
getin classSparseBlock- Parameters:
r- row index starting at 0c- column index starting at 0- Returns:
- value of cell at position (r,c)
-
get
public SparseRow get(int r)
Description copied from class:SparseBlockGet values of row r in the format of a sparse row.- Specified by:
getin classSparseBlock- Parameters:
r- row index starting at 0- Returns:
- values of row r as a sparse row
-
posFIndexLTE
public int posFIndexLTE(int r, int c)Description copied from class:SparseBlockGet position of first column index lower than or equal column c in row r. The position is relative to the indexes/values arrays returned by indexes(r) and values(r). If no such value exists, this call returns -1.- Specified by:
posFIndexLTEin classSparseBlock- Parameters:
r- row index starting at 0c- column index starting at 0- Returns:
- position of the first column index lower than or equal to column c in row r
-
posFIndexGTE
public int posFIndexGTE(int r, int c)Description copied from class:SparseBlockGet position of first column index greater than or equal column c in row r. The position is relative to the indexes/values arrays returned by indexes(r) and values(r). If no such value exists, this call returns -1. Note if CSR the pos(r) is subtracted from the result.- Specified by:
posFIndexGTEin classSparseBlock- Parameters:
r- row index starting at 0c- column index starting at 0- Returns:
- position of the first column index greater than or equal to column c in row r
-
posFIndexGT
public int posFIndexGT(int r, int c)Description copied from class:SparseBlockGet position of first column index greater than column c in row r. The position is relative to the indexes/values arrays returned by indexes(r) and values(r). If no such value exists, this call returns -1.- Specified by:
posFIndexGTin classSparseBlock- Parameters:
r- row index starting at 0c- column index starting at 0- Returns:
- position of the first column index greater than column c in row r
-
getIterator
public Iterator<IJV> getIterator()
Description copied from class:SparseBlockGet a non-zero iterator over the entire sparse block. Note that the returned IJV object is reused across next calls and should be directly consumed or deep copied.- Overrides:
getIteratorin classSparseBlock- Returns:
- IJV iterator
-
getIterator
public Iterator<IJV> getIterator(int ru)
Description copied from class:SparseBlockGet a non-zero iterator over the partial sparse block [0,ru). Note that the returned IJV object is reused across next calls and should be directly consumed or deep copied.- Overrides:
getIteratorin classSparseBlock- Parameters:
ru- exclusive upper row index starting at 0- Returns:
- IJV iterator
-
getIterator
public Iterator<IJV> getIterator(int rl, int ru)
Description copied from class:SparseBlockGet a non-zero iterator over the subblock [rl, ru). Note that the returned IJV object is reused across next calls and should be directly consumed or deep copied.- Overrides:
getIteratorin classSparseBlock- Parameters:
rl- inclusive lower row index starting at 0ru- exclusive upper row index starting at 0- Returns:
- IJV iterator
-
toString
public String toString()
- Specified by:
toStringin classSparseBlock
-
rowIndexes
public int[] rowIndexes()
Get raw access to underlying array of row indices For use in GPU code- Returns:
- array of row indices
-
indexes
public int[] indexes()
Get raw access to underlying array of column indices For use in GPU code- Returns:
- array of column indices
-
values
public double[] values()
Get raw access to underlying array of values For use in GPU code- Returns:
- array of values
-
-