Class CompressedMatrixBlock

    • Field Detail

      • debug

        public static boolean debug
        Debugging flag for Compressed Matrices
    • Constructor Detail

      • CompressedMatrixBlock

        public CompressedMatrixBlock()
      • CompressedMatrixBlock

        public CompressedMatrixBlock​(int rl,
                                     int cl)
        Main constructor for building a block from scratch. Use with caution, since it constructs an empty matrix block with nothing inside.
        Parameters:
        rl - number of rows in the block
        cl - number of columns
      • CompressedMatrixBlock

        public CompressedMatrixBlock​(CompressedMatrixBlock that)
        Copy constructor that populates this new CompressedMatrixBlock with references to the same column groups as that CompressedMatrixBlock.
        Parameters:
        that - CompressedMatrixBlock to copy values from
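        The sharing behaviour described above can be illustrated with a minimal sketch. SharedGroupsSketch is a hypothetical stand-in, not the SystemDS class, and each double[] stands in for one column group. Allocating a new list in the copy is an assumption; the description only states that the column groups themselves are shared.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a compressed block; not the SystemDS class.
final class SharedGroupsSketch {
    final int rows;
    final int cols;
    final List<double[]> groups; // each double[] stands in for one column group

    SharedGroupsSketch(int rows, int cols, List<double[]> groups) {
        this.rows = rows;
        this.cols = cols;
        this.groups = groups;
    }

    // Copy constructor: a new list (assumption), holding the same group objects.
    SharedGroupsSketch(SharedGroupsSketch that) {
        this.rows = that.rows;
        this.cols = that.cols;
        this.groups = new ArrayList<>(that.groups);
    }

    public static void main(String[] args) {
        List<double[]> g = new ArrayList<>();
        g.add(new double[] {1.0, 2.0});
        SharedGroupsSketch a = new SharedGroupsSketch(2, 1, g);
        SharedGroupsSketch b = new SharedGroupsSketch(a);
        b.groups.get(0)[0] = 42.0;              // write through the copy
        System.out.println(a.groups.get(0)[0]); // prints 42.0: the group is shared
    }
}
```

        In this sketch a change made to a group through the copy is visible through the original, which is why a shallow copy of this kind is cheap but must be treated as sharing state.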
      • CompressedMatrixBlock

        public CompressedMatrixBlock​(int rl,
                                     int cl,
                                     long nnz,
                                     boolean overlapping,
                                     List<AColGroup> groups)
        Direct constructor with everything.
        Parameters:
        rl - Number of rows in the block
        cl - Number of columns
        nnz - Number of non zeros
        overlapping - If the matrix is overlapping
        groups - The list of column groups
    • Method Detail

      • reset

        public void reset​(int rl,
                          int cl,
                          boolean sp,
                          long estnnz,
                          double val)
        Description copied from class: MatrixBlock
        Internal canonical reset of dense and sparse matrix blocks.
        Overrides:
        reset in class MatrixBlock
        Parameters:
        rl - number of rows
        cl - number of columns
        sp - sparse representation
        estnnz - estimated number of non-zeros
        val - initialization value
      • allocateColGroup

        public void allocateColGroup​(AColGroup cg)
        Allocate the given column group and remove all references to old column groups. This is done by simply allocating a new _colGroups list and adding the given column group.
        Parameters:
        cg - The column group to use afterwards.
      • allocateColGroupList

        public void allocateColGroupList​(List<AColGroup> colGroups)
        Replace the column groups in this CompressedMatrixBlock with the given column groups
        Parameters:
        colGroups - new ColGroups in the MatrixBlock
      • getColGroups

        public List<AColGroup> getColGroups()
        Get the column groups of this CompressedMatrixBlock
        Returns:
        the column groups
      • decompress

        public MatrixBlock decompress()
        Decompress block into a MatrixBlock
        Returns:
        a new uncompressed matrix block containing the contents of this block
      • getColGroupForColumn

        public AColGroup getColGroupForColumn​(int id)
        Get the column group associated with a specific column id. This involves a search, since it is not known in advance which column group contains the column.
        Parameters:
        id - The column id or number we try to find
        Returns:
        The column group for that column
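        The search mentioned above could look roughly like the following sketch. ColumnLookupSketch and its nested Group class are hypothetical, a linear scan is only one possible strategy, and returning null for an unknown column is an assumption that the description does not confirm.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of looking up the group that covers a column id.
final class ColumnLookupSketch {
    // Hypothetical column group: only records which column ids it covers.
    static final class Group {
        final String name;
        final int[] columns;

        Group(String name, int... columns) {
            this.name = name;
            this.columns = columns;
        }
    }

    // Linear scan over all groups and their columns.
    // Returning null when no group covers the id is an assumption.
    static Group groupForColumn(List<Group> groups, int id) {
        for (Group g : groups)
            for (int c : g.columns)
                if (c == id)
                    return g;
        return null;
    }

    public static void main(String[] args) {
        List<Group> groups = Arrays.asList(new Group("A", 0, 3), new Group("B", 1, 2));
        System.out.println(groupForColumn(groups, 2).name); // prints B
    }
}
```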
      • decompress

        public MatrixBlock decompress​(int k)
        Decompress block into a MatrixBlock
        Parameters:
        k - degree of parallelism
        Returns:
        a new uncompressed matrix block containing the contents of this block
      • putInto

        public void putInto​(MatrixBlock target,
                            int rowOffset,
                            int colOffset,
                            boolean sparseCopyShallow)
        Description copied from class: MatrixBlock
        Method for copying this matrix into a target matrix. Note that this method does not maintain the number of non-zero values. The method should output into the allocated block type of the target; therefore, an appropriate block must be allocated before any calls. CSR sparse format is not supported. If allocating into a sparse MCSR block, the rows have to be sorted afterwards with a call to target.sortSparseRows().
        Overrides:
        putInto in class MatrixBlock
        Parameters:
        target - Target MatrixBlock, that can be allocated dense or sparse
        rowOffset - The Row offset to allocate into.
        colOffset - The column offset to allocate into.
        sparseCopyShallow - If the output is sparse, and shallow copy of rows is allowed from this block
      • getCachedDecompressed

        public MatrixBlock getCachedDecompressed()
        Get the cached decompressed matrix (or null if it does not exist). In practice, this means that if some other instruction has materialized the decompressed version, it can be accessed through this method with a guarantee that the call does not go through the entire decompression phase.
        Returns:
        The cached decompressed matrix, or null if it does not exist
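        A minimal sketch of such a cache, assuming it is held through a java.lang.ref.SoftReference (suggested by the name of clearSoftReferenceToDecompressed). DecompressCacheSketch is hypothetical, clone() stands in for real decompression, and whether decompress() populates the cache in the actual class is an assumption.

```java
import java.lang.ref.SoftReference;

// Hypothetical sketch of a softly cached decompressed result.
final class DecompressCacheSketch {
    private final double[] compressedValues; // stand-in for compressed data
    private SoftReference<double[]> cached;  // may be cleared by the GC under memory pressure
    int decompressCalls = 0;                 // counts full decompressions

    DecompressCacheSketch(double[] compressedValues) {
        this.compressedValues = compressedValues;
    }

    // Returns the cached result if present, otherwise decompresses and caches.
    double[] decompress() {
        double[] hit = getCachedDecompressed();
        if (hit != null)
            return hit;
        decompressCalls++;
        double[] out = compressedValues.clone(); // stand-in for real decompression
        cached = new SoftReference<>(out);
        return out;
    }

    // Never decompresses: returns the cached result or null.
    double[] getCachedDecompressed() {
        return cached == null ? null : cached.get();
    }

    void clearSoftReferenceToDecompressed() {
        cached = null;
    }

    public static void main(String[] args) {
        DecompressCacheSketch b = new DecompressCacheSketch(new double[] {1, 2, 3});
        System.out.println(b.getCachedDecompressed() == null); // prints true
        double[] d = b.decompress();
        System.out.println(b.getCachedDecompressed() == d);    // prints true
    }
}
```

        Callers of a cache like this have to handle the null case, since a softly referenced value may disappear at any time once no strong reference to it remains.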
      • recomputeNonZeros

        public long recomputeNonZeros()
        Description copied from class: MatrixBlock
        Recomputes and materializes the number of non-zero values of the entire matrix block.
        Overrides:
        recomputeNonZeros in class MatrixBlock
        Returns:
        number of non-zeros
      • recomputeNonZeros

        public long recomputeNonZeros​(int rl,
                                      int ru,
                                      int cl,
                                      int cu)
        Description copied from class: MatrixBlock
        Recomputes the number of non-zero values of a specified range of the matrix block. NOTE: This call does not materialize the computed result in any form.
        Overrides:
        recomputeNonZeros in class MatrixBlock
        Parameters:
        rl - row lower index, 0-based, inclusive
        ru - row upper index, 0-based, inclusive
        cl - column lower index, 0-based, inclusive
        cu - column upper index, 0-based, inclusive
        Returns:
        the number of non-zero values
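        Because all four bounds are inclusive, loops over the range use <= rather than <. The sketch below shows this on a plain double[][]; NnzRangeSketch is a hypothetical helper and says nothing about how the compressed implementation counts.

```java
// Hypothetical sketch: counting non-zeros in a range with inclusive bounds.
final class NnzRangeSketch {
    // rl, ru, cl, cu are 0-based and inclusive, as in the parameter list above.
    static long countNonZeros(double[][] m, int rl, int ru, int cl, int cu) {
        long nnz = 0;
        for (int r = rl; r <= ru; r++)
            for (int c = cl; c <= cu; c++)
                if (m[r][c] != 0)
                    nnz++;
        return nnz; // returned only; nothing is stored
    }

    public static void main(String[] args) {
        double[][] m = {{1, 0, 2}, {0, 0, 3}, {4, 5, 0}};
        System.out.println(countNonZeros(m, 0, 2, 0, 2)); // prints 5
        System.out.println(countNonZeros(m, 0, 1, 1, 2)); // prints 2
    }
}
```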
      • estimateCompressedSizeInMemory

        public long estimateCompressedSizeInMemory()
        Obtain an upper bound on the memory used to store the compressed block.
        Returns:
        an upper bound on the memory used to store this compressed block considering class overhead.
      • baseSizeInMemory

        public static long baseSizeInMemory()
      • getExactSizeOnDisk

        public long getExactSizeOnDisk()
        Description copied from class: MatrixBlock
        NOTE: The used estimates must be kept consistent with the respective write functions.
        Overrides:
        getExactSizeOnDisk in class MatrixBlock
        Returns:
        exact size on disk
      • append

        public MatrixBlock append​(MatrixBlock[] that,
                                  MatrixBlock ret,
                                  boolean cbind)
        Description copied from class: MatrixBlock
        Append that list of matrices to this matrix. cbind true makes the matrix "wider", while cbind false makes it "taller".
        Overrides:
        append in class MatrixBlock
        Parameters:
        that - a list of matrices to append in order
        ret - the output matrix to modify, (is also returned)
        cbind - if binding on columns or rows
        Returns:
        the ret MatrixBlock object with the appended result
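        The effect of the cbind flag can be sketched on plain double[][] arrays. AppendSketch is hypothetical and ignores the ret output parameter; it only illustrates how the result shape differs between the two modes.

```java
// Hypothetical sketch of column-wise versus row-wise append.
final class AppendSketch {
    // Appends every matrix in 'that' to 'self', in order.
    // Note: returns 'self' itself when 'that' is empty.
    static double[][] append(double[][] self, double[][][] that, boolean cbind) {
        double[][] out = self;
        for (double[][] t : that)
            out = cbind ? bindCols(out, t) : bindRows(out, t);
        return out;
    }

    // "Wider": same number of rows, columns are concatenated.
    static double[][] bindCols(double[][] a, double[][] b) {
        if (a.length != b.length)
            throw new IllegalArgumentException("row counts differ");
        double[][] out = new double[a.length][];
        for (int r = 0; r < a.length; r++) {
            out[r] = new double[a[r].length + b[r].length];
            System.arraycopy(a[r], 0, out[r], 0, a[r].length);
            System.arraycopy(b[r], 0, out[r], a[r].length, b[r].length);
        }
        return out;
    }

    // "Taller": rows of b follow the rows of a.
    static double[][] bindRows(double[][] a, double[][] b) {
        double[][] out = new double[a.length + b.length][];
        for (int r = 0; r < a.length; r++)
            out[r] = a[r].clone();
        for (int r = 0; r < b.length; r++)
            out[a.length + r] = b[r].clone();
        return out;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5}, {6}};
        double[][] wide = append(a, new double[][][] {b, b}, true);
        System.out.println(wide.length + "x" + wide[0].length); // prints 2x4
    }
}
```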
      • isOverlapping

        public boolean isOverlapping()
      • setOverlapping

        public void setOverlapping​(boolean overlapping)
      • slice

        public MatrixBlock slice​(int rl,
                                 int ru,
                                 int cl,
                                 int cu,
                                 boolean deep,
                                 MatrixBlock ret)
        Description copied from interface: CacheBlock
        Slice a sub block out of the current block and write into the given output block. This method returns the passed instance if not null.
        Specified by:
        slice in interface CacheBlock<MatrixBlock>
        Overrides:
        slice in class MatrixBlock
        Parameters:
        rl - row lower
        ru - row upper inclusive
        cl - column lower
        cu - column upper inclusive
        deep - enforce deep-copy
        ret - cache block
        Returns:
        sub-block of cache block
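        With inclusive upper bounds, the sliced block has (ru - rl + 1) rows and (cu - cl + 1) columns. The sketch below shows this on a plain double[][]; SliceSketch is hypothetical, always copies, and assumes that rl and cl are 0-based and inclusive, which the parameter list above does not state explicitly.

```java
// Hypothetical sketch of slicing with inclusive upper bounds.
final class SliceSketch {
    // Assumes rl and cl are 0-based and inclusive; ru and cu are inclusive.
    static double[][] slice(double[][] m, int rl, int ru, int cl, int cu) {
        int rows = ru - rl + 1;
        int cols = cu - cl + 1;
        double[][] out = new double[rows][cols];
        for (int r = rl; r <= ru; r++)
            System.arraycopy(m[r], cl, out[r - rl], 0, cols);
        return out;
    }

    public static void main(String[] args) {
        double[][] m = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        double[][] s = slice(m, 1, 2, 0, 1);
        System.out.println(java.util.Arrays.deepToString(s)); // prints [[4.0, 5.0], [7.0, 8.0]]
    }
}
```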
      • slice

        public void slice​(ArrayList<IndexedMatrixValue> outlist,
                          IndexRange range,
                          int rowCut,
                          int colCut,
                          int blen,
                          int boundaryRlen,
                          int boundaryClen)
        Description copied from class: MatrixValue
        Slice out up to 4 matrix blocks that are separated by the row and column cuts. This is used in the context of Spark execution to distribute sliced-out matrix blocks of the correct block size.
        Overrides:
        slice in class MatrixBlock
        Parameters:
        outlist - The output matrix blocks that are extracted from the matrix
        range - An index range containing overlapping information.
        rowCut - The row to cut and split the matrix.
        colCut - The column to cut and split the matrix.
        blen - The Block size of the output matrices.
        boundaryRlen - The row length of the edge case matrix block, used for the final blocks that do not have enough rows to construct a full block.
        boundaryClen - The column length of the edge case matrix block, used for the final blocks that do not have enough columns to construct a full block.
      • max

        public double max()
        Description copied from class: MatrixBlock
        Wrapper method for reduceall-max of a matrix.
        Overrides:
        max in class MatrixBlock
        Returns:
        the maximum value of all values in the matrix
      • min

        public double min()
        Description copied from class: MatrixBlock
        Wrapper method for reduceall-min of a matrix.
        Overrides:
        min in class MatrixBlock
        Returns:
        the minimum value of all values in the matrix
      • sum

        public double sum()
        Description copied from class: MatrixBlock
        Wrapper method for reduceall-sum of a matrix.
        Overrides:
        sum in class MatrixBlock
        Returns:
        Sum of the values in the matrix.
      • colSum

        public MatrixBlock colSum()
        Description copied from class: MatrixBlock
        Wrapper method for single threaded reduceall-colSum of a matrix.
        Overrides:
        colSum in class MatrixBlock
        Returns:
        A new MatrixBlock containing the column sums of this matrix.
      • sumSq

        public double sumSq()
        Description copied from class: MatrixBlock
        Wrapper method for reduceall-sumSq of a matrix.
        Overrides:
        sumSq in class MatrixBlock
        Returns:
        Sum of the squared values in the matrix.
      • prod

        public double prod()
        Description copied from class: MatrixBlock
        Wrapper method for reduceall-product of a matrix.
        Overrides:
        prod in class MatrixBlock
        Returns:
        the product of all values in the matrix
      • mean

        public double mean()
        Description copied from class: MatrixBlock
        Wrapper method for reduceall-mean of a matrix.
        Overrides:
        mean in class MatrixBlock
        Returns:
        the mean value of all values in the matrix
      • isEmptyBlock

        public boolean isEmptyBlock​(boolean safe)
        Description copied from class: MatrixBlock
        Check whether this MatrixBlock is an empty block. The call can potentially trigger a recomputation of non-zeros if the non-zero count is unknown.
        Overrides:
        isEmptyBlock in class MatrixBlock
        Parameters:
        safe - True if the non-zeros should be counted when the nnz is unknown.
        Returns:
        If the block is empty.
      • leftIndexingOperations

        public MatrixBlock leftIndexingOperations​(ScalarObject scalar,
                                                  int rl,
                                                  int cl,
                                                  MatrixBlock ret,
                                                  MatrixObject.UpdateType update)
        Description copied from class: MatrixBlock
        Explicitly allow left indexing for scalars. Note: This operation is now 0-based. Operations to be performed: 1) result=this; 2) result[row,column] = scalar.getDoubleValue();
        Overrides:
        leftIndexingOperations in class MatrixBlock
        Parameters:
        scalar - scalar object
        rl - row lower
        cl - column lower
        ret - ?
        update - ?
        Returns:
        matrix block
      • ctableOperations

        public void ctableOperations​(Operator op,
                                     double scalar,
                                     MatrixValue that,
                                     CTableMap resultMap,
                                     MatrixBlock resultBlock)
        Description copied from class: MatrixBlock
        D = ctable(A,v2,W), where this <- A; scalarThat <- v2; that2 <- W; result <- D. Inputs: (i1,j1,v1) from input1 (this); (v2) from scalar_input2 (scalarThat); (i3,j3,w) from input3 (that2).
        Overrides:
        ctableOperations in class MatrixBlock
      • ctableOperations

        public void ctableOperations​(Operator op,
                                     double scalar,
                                     double scalar2,
                                     CTableMap resultMap,
                                     MatrixBlock resultBlock)
        Description copied from class: MatrixBlock
        D = ctable(A,v2,w), where this <- A; scalar_that <- v2; scalar_that2 <- w; result <- D. Inputs: (i1,j1,v1) from input1 (this); (v2) from scalar_input2 (scalarThat); (w) from scalar_input3 (scalarThat2).
        Overrides:
        ctableOperations in class MatrixBlock
      • ctableOperations

        public void ctableOperations​(Operator op,
                                     MatrixIndexes ix1,
                                     double scalar,
                                     boolean left,
                                     int brlen,
                                     CTableMap resultMap,
                                     MatrixBlock resultBlock)
        Description copied from class: MatrixBlock
        Specific ctable case of ctable(seq(...),X), where X is the only matrix input. The 'left' input parameter specifies if the seq appeared on the left, otherwise it appeared on the right.
        Overrides:
        ctableOperations in class MatrixBlock
      • ctableOperations

        public void ctableOperations​(Operator op,
                                     MatrixValue that,
                                     double scalar,
                                     boolean ignoreZeros,
                                     CTableMap resultMap,
                                     MatrixBlock resultBlock)
        Description copied from class: MatrixBlock
        D = ctable(A,B,w), where this <- A; that <- B; scalar_that2 <- w; result <- D. Inputs: (i1,j1,v1) from input1 (this); (i1,j1,v2) from input2 (that); (w) from scalar_input3 (scalarThat2). NOTE: This method supports both vectors and matrices. In case of matrices and ignoreZeros=true, we can also use a sparse-safe implementation.
        Overrides:
        ctableOperations in class MatrixBlock
      • ctableSeqOperations

        public MatrixBlock ctableSeqOperations​(MatrixValue thatMatrix,
                                               double thatScalar,
                                               MatrixBlock resultBlock,
                                               boolean updateClen)
        Overrides:
        ctableSeqOperations in class MatrixBlock
        Parameters:
        thatMatrix - matrix value
        thatScalar - scalar double
        resultBlock - result matrix block
        updateClen - when this matrix already has the desired number of columns updateClen can be set to false
        Returns:
        result matrix block
      • randOperationsInPlace

        public MatrixBlock randOperationsInPlace​(RandomMatrixGenerator rgen,
                                                 org.apache.commons.math3.random.Well1024a bigrand,
                                                 long bSeed)
        Description copied from class: MatrixBlock
        Function to generate a matrix of random numbers. This is invoked both from CP as well as from MR. In case of CP, it generates an entire matrix block-by-block. A bigrand is passed so that block-level seeds are generated internally. In case of MR, it generates a single block for given block-level seed bSeed. When pdf="uniform", cell values are drawn from uniform distribution in range [min,max]. When pdf="normal", cell values are drawn from standard normal distribution N(0,1). The range of generated values will always be (-Inf,+Inf).
        Overrides:
        randOperationsInPlace in class MatrixBlock
        Parameters:
        rgen - random matrix generator
        bigrand - random number generator from which block-level seeds are generated internally
        bSeed - seed value
        Returns:
        matrix block
      • randOperationsInPlace

        public MatrixBlock randOperationsInPlace​(RandomMatrixGenerator rgen,
                                                 org.apache.commons.math3.random.Well1024a bigrand,
                                                 long bSeed,
                                                 int k)
        Description copied from class: MatrixBlock
        Function to generate a matrix of random numbers. This is invoked both from CP as well as from MR. In case of CP, it generates an entire matrix block-by-block. A bigrand is passed so that block-level seeds are generated internally. In case of MR, it generates a single block for given block-level seed bSeed. When pdf="uniform", cell values are drawn from uniform distribution in range [min,max]. When pdf="normal", cell values are drawn from standard normal distribution N(0,1). The range of generated values will always be (-Inf,+Inf).
        Overrides:
        randOperationsInPlace in class MatrixBlock
        Parameters:
        rgen - random matrix generator
        bigrand - random number generator from which block-level seeds are generated internally
        bSeed - seed value
        k - ?
        Returns:
        matrix block
      • getUncompressed

        public MatrixBlock getUncompressed()
      • isShallowSerialize

        public boolean isShallowSerialize​(boolean inclConvert)
        Description copied from interface: CacheBlock
        Indicates if the cache block is subject to shallow serialization, which is generally true if in-memory size and serialized size are almost identical, allowing unnecessary deep serialization to be avoided.
        Specified by:
        isShallowSerialize in interface CacheBlock<MatrixBlock>
        Overrides:
        isShallowSerialize in class MatrixBlock
        Parameters:
        inclConvert - if true report blocks as shallow serialize that are currently not amenable but can be brought into an amenable form via toShallowSerializeBlock.
        Returns:
        true if shallow serialized
      • copy

        public void copy​(MatrixValue thatValue)
        Description copied from class: MatrixValue
        Copy that MatrixValue into this MatrixValue. If the MatrixValue is a MatrixBlock evaluate the sparsity of the original matrix, and copy into either a sparse or a dense matrix.
        Overrides:
        copy in class MatrixBlock
        Parameters:
        thatValue - object to copy the values from.
      • copy

        public void copy​(MatrixValue thatValue,
                         boolean sp)
        Description copied from class: MatrixValue
        Copy that MatrixValue into this MatrixValue. But select sparse destination block depending on boolean parameter.
        Overrides:
        copy in class MatrixBlock
        Parameters:
        thatValue - object to copy the values from.
        sp - boolean specifying if output should be forced sparse or dense. (only applicable if the 'that' is a MatrixBlock)
      • copy

        public void copy​(int rl,
                         int ru,
                         int cl,
                         int cu,
                         MatrixBlock src,
                         boolean awareDestNZ)
        Description copied from class: MatrixBlock
        In-place copy of matrix src into the index range of the existing current matrix. Note that removal of existing nnz in the index range and nnz maintenance is only done if 'awareDestNZ=true'.
        Overrides:
        copy in class MatrixBlock
        Parameters:
        rl - row lower index, 0-based
        ru - row upper index, 0-based, inclusive
        cl - column lower index, 0-based
        cu - column upper index, 0-based, inclusive
        src - matrix block
        awareDestNZ - if true, forces (1) removal of existing non-zeros in the index range of the destination if not present in src and (2) internal maintenance of nnz; if false, assumes an empty index range in the destination and does not maintain nnz (the invoker is responsible for recomputing nnz after all copies are done)
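        The two modes of awareDestNZ can be sketched on a dense array. RangeCopySketch is a hypothetical stand-in that tracks nnz in a field; it assumes 0-based inclusive bounds and is not meant to mirror the actual dense, sparse or compressed code paths.

```java
// Hypothetical sketch of an in-place range copy with optional nnz maintenance.
final class RangeCopySketch {
    final double[][] data;
    long nnz = 0;

    RangeCopySketch(int rows, int cols) {
        data = new double[rows][cols];
    }

    // Copies src into [rl..ru] x [cl..cu] (inclusive bounds).
    void copy(int rl, int ru, int cl, int cu, double[][] src, boolean awareDestNZ) {
        for (int r = rl; r <= ru; r++) {
            for (int c = cl; c <= cu; c++) {
                double v = src[r - rl][c - cl];
                if (awareDestNZ) {
                    if (data[r][c] != 0)
                        nnz--; // an existing non-zero is overwritten
                    if (v != 0)
                        nnz++; // a new non-zero is written
                }
                data[r][c] = v;
            }
        }
    }

    // What the caller has to do after copies with awareDestNZ=false.
    long recomputeNonZeros() {
        long count = 0;
        for (double[] row : data)
            for (double v : row)
                if (v != 0)
                    count++;
        nnz = count;
        return nnz;
    }

    public static void main(String[] args) {
        RangeCopySketch b = new RangeCopySketch(3, 3);
        b.copy(0, 1, 0, 1, new double[][] {{1, 0}, {2, 3}}, true);
        System.out.println(b.nnz); // prints 3
        b.copy(0, 1, 0, 1, new double[][] {{0, 0}, {0, 9}}, true);
        System.out.println(b.nnz); // prints 1
    }
}
```

        With awareDestNZ=false the sketch leaves nnz untouched, so the count is stale until recomputeNonZeros() is called once after all copies are done.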
      • clearSoftReferenceToDecompressed

        public void clearSoftReferenceToDecompressed()
      • clearCounts

        public void clearCounts()
      • quickSetValue

        public void quickSetValue​(int r,
                                  int c,
                                  double v)
        Overrides:
        quickSetValue in class MatrixBlock
      • appendValue

        public void appendValue​(int r,
                                int c,
                                double v)
        Description copied from class: MatrixBlock

        Append value is only used when values are appended at the end of each row for the sparse representation.

        This can only be called when the caller knows the access pattern of the block.
        Overrides:
        appendValue in class MatrixBlock
        Parameters:
        r - row
        c - column
        v - value
      • sortSparseRows

        public void sortSparseRows()
        Description copied from class: MatrixBlock
        Sorts all existing sparse rows by column indexes.
        Overrides:
        sortSparseRows in class MatrixBlock
      • sortSparseRows

        public void sortSparseRows​(int rl,
                                   int ru)
        Description copied from class: MatrixBlock
        Sorts all existing sparse rows in range [rl,ru) by column indexes.
        Overrides:
        sortSparseRows in class MatrixBlock
        Parameters:
        rl - row lower bound, inclusive
        ru - row upper bound, exclusive
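        Unlike the inclusive ranges used elsewhere in this class, the range here is half-open, so the loop condition is r < ru. The sketch below sorts hypothetical sparse rows stored as parallel index and value arrays; SparseRowSortSketch and this storage layout are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch: sort sparse rows in [rl, ru) by column index.
final class SparseRowSortSketch {
    // idx[r] holds the column indexes of row r, val[r] the matching values.
    static void sortSparseRows(int[][] idx, double[][] val, int rl, int ru) {
        for (int r = rl; r < ru; r++) { // ru is exclusive
            if (idx[r] == null)
                continue; // empty row
            final int[] ix = idx[r];
            final double[] vs = val[r];
            // Sort positions by column index, then permute both arrays alike.
            Integer[] order = new Integer[ix.length];
            for (int i = 0; i < order.length; i++)
                order[i] = i;
            Arrays.sort(order, Comparator.comparingInt(i -> ix[i]));
            int[] sortedIx = new int[ix.length];
            double[] sortedVs = new double[vs.length];
            for (int i = 0; i < order.length; i++) {
                sortedIx[i] = ix[order[i]];
                sortedVs[i] = vs[order[i]];
            }
            idx[r] = sortedIx;
            val[r] = sortedVs;
        }
    }

    public static void main(String[] args) {
        int[][] idx = {{3, 1, 2}, {2, 0}, {5, 4}};
        double[][] val = {{30, 10, 20}, {20, 0.5}, {50, 40}};
        sortSparseRows(idx, val, 0, 2); // row 2 is outside [0, 2) and stays unsorted
        System.out.println(Arrays.deepToString(idx)); // prints [[1, 2, 3], [0, 2], [5, 4]]
    }
}
```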
      • minNonZero

        public double minNonZero()
        Description copied from class: MatrixBlock
        Utility function for computing the min non-zero value.
        Overrides:
        minNonZero in class MatrixBlock
        Returns:
        minimum non-zero value
      • isInSparseFormat

        public boolean isInSparseFormat()
        Description copied from class: MatrixBlock
        Returns the current representation (true for sparse).
        Overrides:
        isInSparseFormat in class MatrixBlock
        Returns:
        true if sparse
      • evalSparseFormatInMemory

        public boolean evalSparseFormatInMemory()
        Description copied from class: MatrixBlock
        Evaluates if this matrix block should be in sparse format in memory. Note that this call does not change the representation - for this please call examSparsity.
        Overrides:
        evalSparseFormatInMemory in class MatrixBlock
        Returns:
        true if matrix block should be in sparse format in memory
      • evalSparseFormatOnDisk

        public boolean evalSparseFormatOnDisk()
        Description copied from class: MatrixBlock
        Evaluates if this matrix block should be in sparse format on disk. This applies to any serialized matrix representation, i.e., when writing to in-memory buffer pool pages or writing to local fs or hdfs.
        Overrides:
        evalSparseFormatOnDisk in class MatrixBlock
        Returns:
        true if matrix block should be in sparse format on disk
      • examSparsity

        public void examSparsity​(boolean allowCSR,
                                 int k)
        Description copied from class: MatrixBlock
        Evaluates if this matrix block should be in sparse format in memory. Depending on the current representation, the state of the matrix block is changed to the right representation if necessary. Note that, for the duration of the call, this consumes memory for both representations.
        Overrides:
        examSparsity in class MatrixBlock
        Parameters:
        allowCSR - allow CSR format on dense to sparse conversion
        k - parallelization degree
      • denseToSparse

        public void denseToSparse​(boolean allowCSR,
                                  int k)
        Overrides:
        denseToSparse in class MatrixBlock
      • pickValue

        public double pickValue​(double quantile,
                                boolean average)
        Overrides:
        pickValue in class MatrixBlock
      • sumWeightForQuantile

        public double sumWeightForQuantile()
        Description copied from class: MatrixBlock
        In a given two-column matrix, the second column denotes weights. This function computes the total weight.
        Overrides:
        sumWeightForQuantile in class MatrixBlock
        Returns:
        sum weight for quantile
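        In other words, the result is the sum over the second column. A minimal sketch on a plain double[][], where WeightSumSketch is a hypothetical helper and each row is a (value, weight) pair:

```java
// Hypothetical sketch: total weight of a two-column (value, weight) matrix.
final class WeightSumSketch {
    static double sumWeights(double[][] valueWeight) {
        double sum = 0;
        for (double[] row : valueWeight)
            sum += row[1]; // column 1 holds the weight
        return sum;
    }

    public static void main(String[] args) {
        double[][] m = {{10, 1}, {20, 2}, {30, 3}};
        System.out.println(sumWeights(m)); // prints 6.0
    }
}
```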
      • isThreadSafe

        public boolean isThreadSafe()
        Description copied from class: MatrixBlock
        Indicates if concurrent modifications of disjoint rows are thread-safe.
        Overrides:
        isThreadSafe in class MatrixBlock
        Returns:
        true if thread-safe
      • checkNaN

        public void checkNaN()
        Description copied from class: MatrixBlock
        Checks for existing NaN values in the matrix block.
        Overrides:
        checkNaN in class MatrixBlock
      • init

        public void init​(double[][] arr,
                         int r,
                         int c)
        Description copied from class: MatrixBlock
        NOTE: This method is designed only for dense representation.
        Overrides:
        init in class MatrixBlock
        Parameters:
        arr - 2d double array matrix
        r - number of rows
        c - number of columns
      • init

        public void init​(double[] arr,
                         int r,
                         int c)
        Description copied from class: MatrixBlock
        NOTE: This method is designed only for dense representation.
        Overrides:
        init in class MatrixBlock
        Parameters:
        arr - double array matrix
        r - number of rows
        c - number of columns