Class SparseBlockCSR

  • All Implemented Interfaces:
    Serializable, Block

    public class SparseBlockCSR
    extends SparseBlock
    SparseBlock implementation that realizes a traditional 'compressed sparse row' representation, where the entire sparse block is stored as three arrays: ptr of length rlen+1 to store offsets per row, and indexes/values of length nnz to store column indexes and values of non-zero entries. This format is very memory efficient for sparse (but not ultra-sparse) matrices and provides very good performance for common operations, partially due to lower memory bandwidth requirements. However, this format is slow on incremental construction (because it does not allow append/sort per row) without reshifting. Finally, the total nnz is limited to INTEGER_MAX, whereas for SparseBlockMCSR only the nnz per row are limited to INTEGER_MAX. TODO: extensions for faster incremental construction (e.g., max row) TODO more efficient fused setIndexRange impl to avoid repeated copies and updates
    See Also:
    Serialized Form
    • Constructor Summary

      Constructors 
      Constructor Description
      SparseBlockCSR​(int rlen)  
      SparseBlockCSR​(int[] rowPtr, int[] colInd, double[] values, int nnz)  
      SparseBlockCSR​(int rlen, int capacity)  
      SparseBlockCSR​(int rows, int[] rowInd, int[] colInd, double[] values)
      Copy constructor for COO representation
      SparseBlockCSR​(int rlen, int capacity, int size)  
      SparseBlockCSR​(int rows, int nnz, int[] colInd)
      Copy constructor for given array of column indexes, which identifies rows by position and implies values of 1.
      SparseBlockCSR​(SparseBlock sblock)
      Copy constructor sparse block abstraction.
      SparseBlockCSR​(SparseRow[] rows, int nnz)
      Copy constructor old sparse row representation.
    • Method Summary

      All Methods Static Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      boolean add​(int r, int c, double v)
      Add a value to a matrix cell (r,c).
      void allocate​(int r)
      Allocate the underlying data structure holding non-zero values of row r if necessary.
      void allocate​(int r, int nnz)
      Allocate the underlying data structure holding non-zero values of row r if necessary, w/ given size.
      void allocate​(int r, int ennz, int maxnnz)
      Allocate the underlying data structure holding non-zero values of row r w/ the specified estimated nnz and max nnz.
      void append​(int r, int c, double v)
      Append a value to the end of the physical representation.
      boolean checkValidity​(int rlen, int clen, long nnz, boolean strict)
      Validate the correctness of the internal data structures of the different sparse block implementations.
      void compact()  
      void compact​(int r)
      Re-allocate physical row if physical size exceeds logical size plus resize factor.
      boolean contains​(double pattern)
      Checks if the block contains at least one value of the given pattern.
      void deleteIndexRange​(int r, int cl, int cu)
      Deletes all non-zero values of the given column range [cl,cu) in row r.
      static long estimateSizeInMemory​(long nrows, long ncols, double sparsity)
      Get the estimated in-memory size of the sparse block in CSR with the given dimensions w/o accounting for overallocation.
      SparseRow get​(int r)
      Get values of row r in the format of a sparse row.
      double get​(int r, int c)
      Get value of matrix cell (r,c).
      int[] indexes()
      Get raw access to underlying array of column indices For use in GPU code
      int[] indexes​(int r)
      Get the sorted array of column indexes of all non-zero entries in row r.
      void initSparse​(int rlen, int nnz, DataInput in)
      Initializes the CSR sparse block from an ordered input stream of sparse rows (rownnz, jv-pairs*).
      void initUltraSparse​(int nnz, DataInput in)
      Initializes the CSR sparse block from an ordered input stream of ultra-sparse ijv triples.
      boolean isAllocated​(int r)
      Indicates if the underlying data structure for a given row is already allocated.
      boolean isContiguous()
      Indicates if the underlying data structures returned by values and indexes are contiguous arrays, which can be exploited for more efficient operations.
      boolean isEmpty​(int r)
      Get information if row r is empty, i.e., does not contain non-zero values.
      boolean isThreadSafe()
      Indicates if the underlying implementation allows thread-safe row updates if concurrent threads update disjoint rows.
      int numRows()
      Get the number of rows in the sparse block.
      int pos​(int r)
      Get the starting position of row r in the indexes/values arrays returned by indexes(r) and values(r).
      int posFIndexGT​(int r, int c)
      Get position of first column index greater than column c in row r.
      int posFIndexGTE​(int r, int c)
      Get position of first column index greater than or equal column c in row r.
      int posFIndexLTE​(int r, int c)
      Get position of first column index lower than or equal column c in row r.
      void reset()
      Clears the sparse block by deleting non-zero values.
      void reset​(int ennz, int maxnnz)
      Clears the sparse block by deleting non-zero values.
      void reset​(int r, int ennz, int maxnnz)
      Clears row r of the sparse block by deleting non-zero values.
      int[] rowPointers()
      Get raw access to underlying array of row pointers For use in GPU code
      boolean set​(int r, int c, double v)
      Set the value of a matrix cell (r,c).
      void set​(int r, SparseRow row, boolean deep)
      Set the values of row r to the given sparse row.
      void setIndexRange​(int r, int cl, int cu, double[] v, int[] vix, int vpos, int vlen)
      Sets a sparse array of non-zeros values and indexes into the column range [cl,cu) in row r.
      void setIndexRange​(int r, int cl, int cu, double[] v, int vix, int vlen)
      Sets a dense array of non-zeros values into the column range [cl,cu) in row r.
      void setIndexRange​(int rl, int ru, int cl, int cu, double[] v, int vix, int vlen)
      Inserts a sorted row-major array of non-zero values into the row and column range [rl,ru) and [cl,cu).
      void setIndexRange​(int rl, int ru, int cl, int cu, SparseBlock sb)
      Inserts a sparse block into the row and column range [rl,ru) and [cl,cu).
      long size()
      Get the number of non-zero values in the sparse block.
      int size​(int r)
      Get the number of non-zero values in row r.
      long size​(int rl, int ru)
      Get the number of non-zeros values in the row range of [rl, ru).
      long size​(int rl, int ru, int cl, int cu)
      Get the number of non-zeros values in the row and column range of [rl/cl, ru/cu);
      void sort()
      Sort all non-zero value/index pairs of the sparse block by row and column index.
      void sort​(int r)
      Sort all non-zero value/index pairs of row r column index.
      String toString()  
      double[] values()
      Get raw access to underlying array of values For use in GPU code
      double[] values​(int r)
      Get the array of all non-zero entries in row r, sorted by their column indexes.
    • Constructor Detail

      • SparseBlockCSR

        public SparseBlockCSR​(int rlen)
      • SparseBlockCSR

        public SparseBlockCSR​(int rlen,
                              int capacity)
      • SparseBlockCSR

        public SparseBlockCSR​(int rlen,
                              int capacity,
                              int size)
      • SparseBlockCSR

        public SparseBlockCSR​(int[] rowPtr,
                              int[] colInd,
                              double[] values,
                              int nnz)
      • SparseBlockCSR

        public SparseBlockCSR​(SparseBlock sblock)
        Copy constructor sparse block abstraction.
        Parameters:
        sblock - sparse block to copy
      • SparseBlockCSR

        public SparseBlockCSR​(SparseRow[] rows,
                              int nnz)
        Copy constructor old sparse row representation.
        Parameters:
        rows - array of sparse rows
        nnz - number of non-zeroes
      • SparseBlockCSR

        public SparseBlockCSR​(int rows,
                              int[] rowInd,
                              int[] colInd,
                              double[] values)
        Copy constructor for COO representation
        Parameters:
        rows - number of rows
        rowInd - row indices
        colInd - column indices
        values - non zero values
      • SparseBlockCSR

        public SparseBlockCSR​(int rows,
                              int nnz,
                              int[] colInd)
        Copy constructor for given array of column indexes, which identifies rows by position and implies values of 1.
        Parameters:
        rows - number of rows
        nnz - number of non-zeros
        colInd - column indexes
    • Method Detail

      • initUltraSparse

        public void initUltraSparse​(int nnz,
                                    DataInput in)
                             throws IOException
        Initializes the CSR sparse block from an ordered input stream of ultra-sparse ijv triples.
        Parameters:
        nnz - number of non-zeros to read
        in - data input stream of ijv triples, ordered by ij
        Throws:
        IOException - if deserialization error occurs
      • initSparse

        public void initSparse​(int rlen,
                               int nnz,
                               DataInput in)
                        throws IOException
        Initializes the CSR sparse block from an ordered input stream of sparse rows (rownnz, jv-pairs*).
        Parameters:
        rlen - number of rows
        nnz - number of non-zeros to read
        in - data input stream of sparse rows, ordered by i
        Throws:
        IOException - if deserialization error occurs
      • estimateSizeInMemory

        public static long estimateSizeInMemory​(long nrows,
                                                long ncols,
                                                double sparsity)
        Get the estimated in-memory size of the sparse block in CSR with the given dimensions w/o accounting for overallocation.
        Parameters:
        nrows - number of rows
        ncols - number of columns
        sparsity - sparsity ratio
        Returns:
        memory estimate
      • rowPointers

        public int[] rowPointers()
        Get raw access to underlying array of row pointers For use in GPU code
        Returns:
        array of row pointers
      • indexes

        public int[] indexes()
        Get raw access to underlying array of column indices For use in GPU code
        Returns:
        array of column indexes
      • values

        public double[] values()
        Get raw access to underlying array of values For use in GPU code
        Returns:
        array of values
      • allocate

        public void allocate​(int r)
        Description copied from class: SparseBlock
        Allocate the underlying data structure holding non-zero values of row r if necessary.
        Specified by:
        allocate in class SparseBlock
        Parameters:
        r - row index
      • allocate

        public void allocate​(int r,
                             int nnz)
        Description copied from class: SparseBlock
        Allocate the underlying data structure holding non-zero values of row r if necessary, w/ given size.
        Specified by:
        allocate in class SparseBlock
        Parameters:
        r - row index
        nnz - number of non-zeros
      • allocate

        public void allocate​(int r,
                             int ennz,
                             int maxnnz)
        Description copied from class: SparseBlock
        Allocate the underlying data structure holding non-zero values of row r w/ the specified estimated nnz and max nnz.
        Specified by:
        allocate in class SparseBlock
        Parameters:
        r - row index
        ennz - estimated non-zeros
        maxnnz - max non-zeros
      • compact

        public void compact​(int r)
        Description copied from class: SparseBlock
        Re-allocate physical row if physical size exceeds logical size plus resize factor.
        Specified by:
        compact in class SparseBlock
        Parameters:
        r - row index
      • compact

        public void compact()
      • numRows

        public int numRows()
        Description copied from class: SparseBlock
        Get the number of rows in the sparse block.
        Specified by:
        numRows in class SparseBlock
        Returns:
        number of rows
      • isThreadSafe

        public boolean isThreadSafe()
        Description copied from class: SparseBlock
        Indicates if the underlying implementation allows thread-safe row updates if concurrent threads update disjoint rows.
        Specified by:
        isThreadSafe in class SparseBlock
        Returns:
        true if thread-safe row updates
      • isContiguous

        public boolean isContiguous()
        Description copied from class: SparseBlock
        Indicates if the underlying data structures returned by values and indexes are contiguous arrays, which can be exploited for more efficient operations.
        Specified by:
        isContiguous in class SparseBlock
        Returns:
        true if underlying data structures are contiguous arrays
      • isAllocated

        public boolean isAllocated​(int r)
        Description copied from class: SparseBlock
        Indicates if the underlying data structure for a given row is already allocated.
        Specified by:
        isAllocated in class SparseBlock
        Parameters:
        r - row index
        Returns:
        true if already allocated
      • reset

        public void reset()
        Description copied from class: SparseBlock
        Clears the sparse block by deleting non-zero values. After this call all size() calls are guaranteed to return 0.
        Specified by:
        reset in class SparseBlock
      • reset

        public void reset​(int ennz,
                          int maxnnz)
        Description copied from class: SparseBlock
        Clears the sparse block by deleting non-zero values. After this call all size() calls are guaranteed to return 0.
        Specified by:
        reset in class SparseBlock
        Parameters:
        ennz - estimated non-zeros
        maxnnz - max non-zeros
      • reset

        public void reset​(int r,
                          int ennz,
                          int maxnnz)
        Description copied from class: SparseBlock
        Clears row r of the sparse block by deleting non-zero values. After this call size(r) is guaranteed to return 0.
        Specified by:
        reset in class SparseBlock
        Parameters:
        r - row index
        ennz - estimated non-zeros
        maxnnz - max non-zeros
      • size

        public long size()
        Description copied from class: SparseBlock
        Get the number of non-zero values in the sparse block.
        Specified by:
        size in class SparseBlock
        Returns:
        number of non-zero values in sparse block
      • size

        public int size​(int r)
        Description copied from class: SparseBlock
        Get the number of non-zero values in row r.
        Specified by:
        size in class SparseBlock
        Parameters:
        r - row index starting at 0
        Returns:
        number of non-zero values in row r
      • size

        public long size​(int rl,
                         int ru)
        Description copied from class: SparseBlock
        Get the number of non-zeros values in the row range of [rl, ru).
        Specified by:
        size in class SparseBlock
        Parameters:
        rl - row lower index
        ru - row upper index
        Returns:
        number of non-zero values in the row range
      • size

        public long size​(int rl,
                         int ru,
                         int cl,
                         int cu)
        Description copied from class: SparseBlock
        Get the number of non-zeros values in the row and column range of [rl/cl, ru/cu);
        Specified by:
        size in class SparseBlock
        Parameters:
        rl - row lower index
        ru - row upper index
        cl - column lower index
        cu - column upper index
        Returns:
        number of non-zero values in the row and column range
      • isEmpty

        public boolean isEmpty​(int r)
        Description copied from class: SparseBlock
        Get information if row r is empty, i.e., does not contain non-zero values. Equivalent to size(r)==0. Users should do this check if it is unknown if the underlying row data structure is allocated.
        Specified by:
        isEmpty in class SparseBlock
        Parameters:
        r - row index starting at 0
        Returns:
        true if row does not contain non-zero values
      • indexes

        public int[] indexes​(int r)
        Description copied from class: SparseBlock
        Get the sorted array of column indexes of all non-zero entries in row r. Note that - for flexibility of the implementing format - the returned array may be larger, where the range for row r is given by [pos(r),pos(r)+size(r)).
        Specified by:
        indexes in class SparseBlock
        Parameters:
        r - row index starting at 0
        Returns:
        sorted array of column indexes
      • values

        public double[] values​(int r)
        Description copied from class: SparseBlock
        Get the array of all non-zero entries in row r, sorted by their column indexes. Note that - for flexibility of the implementing format - the returned array may be larger, where the range for row r is given by [pos(r),pos(r)+size(r)).
        Specified by:
        values in class SparseBlock
        Parameters:
        r - row index starting at 0
        Returns:
        array of all non-zero entries in row r sorted by column indexes
      • pos

        public int pos​(int r)
        Description copied from class: SparseBlock
        Get the starting position of row r in the indexes/values arrays returned by indexes(r) and values(r).
        Specified by:
        pos in class SparseBlock
        Parameters:
        r - row index starting at 0
        Returns:
        starting position of row r
      • set

        public boolean set​(int r,
                           int c,
                           double v)
        Description copied from class: SparseBlock
        Set the value of a matrix cell (r,c). This might update an existing non-zero value, insert a new non-zero value, or delete a non-zero value.
        Specified by:
        set in class SparseBlock
        Parameters:
        r - row index starting at 0
        c - column index starting at 0
        v - zero or non-zero value
        Returns:
        true, if number of non-zeros changed
      • add

        public boolean add​(int r,
                           int c,
                           double v)
        Description copied from class: SparseBlock
        Add a value to a matrix cell (r,c). This might update an existing non-zero value, or insert a new non-zero value.
        Specified by:
        add in class SparseBlock
        Parameters:
        r - row index starting at 0
        c - column index starting at 0
        v - zero or non-zero value
        Returns:
        true, if number of non-zeros changed
      • set

        public void set​(int r,
                        SparseRow row,
                        boolean deep)
        Description copied from class: SparseBlock
        Set the values of row r to the given sparse row. This might update existing non-zero values, insert a new row, or delete a row. NOTE: This method exists for incremental runtime integration and might be deleted in the future.
        Specified by:
        set in class SparseBlock
        Parameters:
        r - row index starting at 0
        row - sparse row
        deep - indicator to create deep copy of sparse row
      • append

        public void append​(int r,
                           int c,
                           double v)
        Description copied from class: SparseBlock
        Append a value to the end of the physical representation. This should only be used for operations with sequential write pattern or if followed by a sort() operation. Note that this operation does not perform any matrix cell updates.
        Specified by:
        append in class SparseBlock
        Parameters:
        r - row index starting at 0
        c - column index starting at 0
        v - zero or non-zero value
      • setIndexRange

        public void setIndexRange​(int r,
                                  int cl,
                                  int cu,
                                  double[] v,
                                  int vix,
                                  int vlen)
        Description copied from class: SparseBlock
        Sets a dense array of non-zeros values into the column range [cl,cu) in row r. The passed value array may be larger and the relevant range is given by [vix,vix+len).
        Specified by:
        setIndexRange in class SparseBlock
        Parameters:
        r - row index starting at 0
        cl - lower column index starting at 0
        cu - upper column index starting at 0
        v - value array
        vix - start index in value array
        vlen - number of relevant values
      • setIndexRange

        public void setIndexRange​(int r,
                                  int cl,
                                  int cu,
                                  double[] v,
                                  int[] vix,
                                  int vpos,
                                  int vlen)
        Description copied from class: SparseBlock
        Sets a sparse array of non-zeros values and indexes into the column range [cl,cu) in row r. The passed value array may be larger.
        Specified by:
        setIndexRange in class SparseBlock
        Parameters:
        r - row index starting at 0
        cl - lower column index starting at 0
        cu - upper column index starting at 0
        v - value array
        vix - column index array
        vpos - start index in value and index arrays
        vlen - number of relevant values
      • setIndexRange

        public void setIndexRange​(int rl,
                                  int ru,
                                  int cl,
                                  int cu,
                                  double[] v,
                                  int vix,
                                  int vlen)
        Inserts a sorted row-major array of non-zero values into the row and column range [rl,ru) and [cl,cu). Note: that this is a CSR-specific method to address performance issues due to repeated re-shifting on update-in-place.
        Parameters:
        rl - lower row index, starting at 0, inclusive
        ru - upper row index, starting at 0, exclusive
        cl - lower column index, starting at 0, inclusive
        cu - upper column index, starting at 0, exclusive
        v - right-hand-side dense block
        vix - right-hand-side dense block index
        vlen - right-hand-side dense block value length
      • setIndexRange

        public void setIndexRange​(int rl,
                                  int ru,
                                  int cl,
                                  int cu,
                                  SparseBlock sb)
        Inserts a sparse block into the row and column range [rl,ru) and [cl,cu). Note: that this is a CSR-specific method to address performance issues due to repeated re-shifting on update-in-place.
        Parameters:
        rl - lower row index, starting at 0, inclusive
        ru - upper row index, starting at 0, exclusive
        cl - lower column index, starting at 0, inclusive
        cu - upper column index, starting at 0, exclusive
        sb - right-hand-side sparse block
      • deleteIndexRange

        public void deleteIndexRange​(int r,
                                     int cl,
                                     int cu)
        Description copied from class: SparseBlock
        Deletes all non-zero values of the given column range [cl,cu) in row r.
        Specified by:
        deleteIndexRange in class SparseBlock
        Parameters:
        r - row index starting at 0
        cl - lower column index starting at 0
        cu - upper column index starting at 0
      • sort

        public void sort()
        Description copied from class: SparseBlock
        Sort all non-zero value/index pairs of the sparse block by row and column index.
        Specified by:
        sort in class SparseBlock
      • sort

        public void sort​(int r)
        Description copied from class: SparseBlock
        Sort all non-zero value/index pairs of row r column index.
        Specified by:
        sort in class SparseBlock
        Parameters:
        r - row index starting at 0
      • get

        public double get​(int r,
                          int c)
        Description copied from class: SparseBlock
        Get value of matrix cell (r,c). In case of non existing values this call returns 0.
        Specified by:
        get in interface Block
        Specified by:
        get in class SparseBlock
        Parameters:
        r - row index starting at 0
        c - column index starting at 0
        Returns:
        value of cell at position (r,c)
      • get

        public SparseRow get​(int r)
        Description copied from class: SparseBlock
        Get values of row r in the format of a sparse row. NOTE: This method exists for incremental runtime integration and might be deleted in the future.
        Specified by:
        get in class SparseBlock
        Parameters:
        r - row index starting at 0
        Returns:
        values of row r as a sparse row
      • posFIndexLTE

        public int posFIndexLTE​(int r,
                                int c)
        Description copied from class: SparseBlock
        Get position of first column index lower than or equal column c in row r. The position is relative to the indexes/values arrays returned by indexes(r) and values(r). If no such value exists, this call returns -1.
        Specified by:
        posFIndexLTE in class SparseBlock
        Parameters:
        r - row index starting at 0
        c - column index starting at 0
        Returns:
        position of the first column index lower than or equal to column c in row r
      • posFIndexGTE

        public int posFIndexGTE​(int r,
                                int c)
        Description copied from class: SparseBlock
        Get position of first column index greater than or equal column c in row r. The position is relative to the indexes/values arrays returned by indexes(r) and values(r). If no such value exists, this call returns -1. Note if CSR the pos(r) is subtracted from the result.
        Specified by:
        posFIndexGTE in class SparseBlock
        Parameters:
        r - row index starting at 0
        c - column index starting at 0
        Returns:
        position of the first column index greater than or equal to column c in row r
      • posFIndexGT

        public int posFIndexGT​(int r,
                               int c)
        Description copied from class: SparseBlock
        Get position of first column index greater than column c in row r. The position is relative to the indexes/values arrays returned by indexes(r) and values(r). If no such value exists, this call returns -1.
        Specified by:
        posFIndexGT in class SparseBlock
        Parameters:
        r - row index starting at 0
        c - column index starting at 0
        Returns:
        position of the first column index greater than column c in row r
      • checkValidity

        public boolean checkValidity​(int rlen,
                                     int clen,
                                     long nnz,
                                     boolean strict)
        Description copied from class: SparseBlock
        Validate the correctness of the internal data structures of the different sparse block implementations.
        Specified by:
        checkValidity in class SparseBlock
        Parameters:
        rlen - number of rows
        clen - number of columns
        nnz - number of non zeros
        strict - enforce optional properties
        Returns:
        true if the sparse block is valid wrt the corresponding format such as COO, CSR, MCSR.
      • contains

        public boolean contains​(double pattern)
        Description copied from class: SparseBlock
        Checks if the block contains at least one value of the given pattern. Implementations need to handle NaN patterns as well (note that NaN==NaN yields false).
        Overrides:
        contains in class SparseBlock
        Parameters:
        pattern - checked pattern
        Returns:
        true if pattern appears at least once, otherwise false