QDictionary (SystemDS 2.1.0 API)

java.lang.Object
- org.apache.sysds.runtime.compress.colgroup.dictionary.ADictionary
- - org.apache.sysds.runtime.compress.colgroup.dictionary.QDictionary

```
public class QDictionary
extends ADictionary
```
This dictionary class aims to encapsulate the storage and operations over unique floating point values of a column group. The primary reason for its introduction was to provide an entry point for specialization such as shared dictionaries, which require additional information.

Constructor Summary

Constructors
Constructor and Description
`QDictionary(BitmapLossy bm)`

Method Summary

Modifier and Type	Method and Description
`void`	`addMaxAndMin(double[] ret, int[] colIndexes)` This method adds the max and min values contained in the dictionary to corresponding cells in the ret variable.
`void`	`addToEntry(Dictionary d, int fr, int to, int nCol)` Copies and adds the dictionary entry from this dictionary to the d dictionary
`double`	`aggregate(double init, Builtin fn)` Aggregate all the contained values, useful in value only computations where the operation is iterating through all values contained in the dictionary.
`void`	`aggregateCols(double[] c, Builtin fn, int[] colIndexes)` Aggregates the columns into the target double array provided.
`double[]`	`aggregateTuples(Builtin fn, int nCol)` Aggregate all entries in the rows.
`QDictionary`	`apply(ScalarOperator op)` Applies the scalar operation on the dictionary.
`QDictionary`	`applyBinaryRowOpLeft(BinaryOperator op, double[] v, boolean sparseSafe, int[] colIndexes)` Apply binary row operation on this dictionary on the left side.
`QDictionary`	`applyBinaryRowOpRight(BinaryOperator op, double[] v, boolean sparseSafe, int[] colIndexes)` Apply binary row operation on this dictionary on the right side.
`QDictionary`	`applyScalarOp(ScalarOperator op, double newVal, int numCols)` Applies the scalar operation on the dictionary.
`QDictionary`	`clone()` Returns a deep clone of the dictionary.
`QDictionary`	`cloneAndExtend(int len)` Clone the dictionary and extend its size by a given length.
`void`	`colSum(double[] c, int[] counts, int[] colIndexes, boolean square)` Get the column sum of the values contained in the dictionary.
`double[]`	`colSum(int[] counts, int nCol)` Get the column sum of this dictionary only.
`boolean`	`containsValue(double pattern)` Detect if the dictionary contains a specific value.
`MatrixBlockDictionary`	`getAsMatrixBlockDictionary(int nCol)` Get this dictionary as a matrixBlock dictionary.
`long`	`getExactSizeOnDisk()` Calculate the space consumption if the dictionary is stored on disk.
`long`	`getInMemorySize()` Returns the memory usage of the dictionary.
`static long`	`getInMemorySize(int valuesCount)`
`long`	`getNumberNonZeros(int[] counts, int nCol)` Calculate the number of non zeros in the dictionary.
`int`	`getNumberOfValues(int nCol)` Get the number of distinct tuples given that the column group has n columns
`double`	`getScale()`
`String`	`getString(int colIndexes)` Get a string representation of the dictionary, that considers the layout of the data.
`double[]`	`getTuple(int index, int nCol)` Get the values contained in a specific tuple of the dictionary.
`double`	`getValue(int i)` Get the value contained in the dictionary at a specific index.
`byte`	`getValueByte(int i)`
`double[]`	`getValues()` Get all the values contained in the dictionary as a linearized double array.
`byte[]`	`getValuesByte()`
`boolean`	`isLossy()` Specify if the Dictionary is lossy.
`Dictionary`	`makeDoubleDictionary()`
`void`	`preaggValuesFromDense(int numVals, int[] colIndexes, int[] aggregateColumns, double[] b, double[] ret, int cut)` Pre Aggregate values for right Matrix Multiplication.
`static QDictionary`	`read(DataInput in)`
`ADictionary`	`reExpandColumns(int max)` Return a new dictionary that has re-expanded all values, based on the entries already contained.
`ADictionary`	`replace(double pattern, double replace, int nCol, boolean safe)` Make a copy of the values, and replace all values that match pattern with replacement value.
`ADictionary`	`scaleTuples(int[] scaling, int nCol)` Scale all tuples contained in the dictionary by the scaling factor given in the int list.
`ADictionary`	`sliceOutColumnRange(int idxStart, int idxEnd, int previousNumberOfColumns)` Modify the dictionary by removing columns not within the index range.
`ADictionary`	`subtractTuple(double[] tuple)` Allocate a new dictionary where the tuple given is subtracted from all tuples in the previous dictionary.
`double`	`sum(int[] counts, int ncol)` Get the sum of the values contained in the dictionary
`double[]`	`sumAllRowsToDouble(boolean square, int nrColumns)` Method used as a pre-aggregate of each tuple in the dictionary, to single double values.
`double`	`sumRow(int k, boolean square, int nrColumns)` Sum the values at a specific row.
`double`	`sumsq(int[] counts, int ncol)` Get the square sum of the values contained in the dictionary
`void`	`write(DataOutput out)` Write the dictionary to a DataOutput.

Methods inherited from class org.apache.sysds.runtime.compress.colgroup.dictionary.ADictionary
applyBinaryRowOp, getMostCommonTuple

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - QDictionary
```
public QDictionary(BitmapLossy bm)
```
- Method Detail
  - getValues
```
public double[] getValues()
```
    Description copied from class: ADictionary
    
    Get all the values contained in the dictionary as a linearized double array.
    
    Specified by:
    
    getValues in class ADictionary
    
    Returns:
    
    linearized double array
  - getValue
```
public double getValue(int i)
```
    Description copied from class: ADictionary
    
    Get the value contained in the dictionary at a specific index.
    
    Specified by:
    
    getValue in class ADictionary
    
    Parameters:
    
    i - The index to extract the value from
    
    Returns:
    
    The value contained at the index
  - getValueByte
```
public byte getValueByte(int i)
```
  - getValuesByte
```
public byte[] getValuesByte()
```
  - getScale
```
public double getScale()
```
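    The byte accessors and `getScale()` carry no description on this page. One plausible reading, stated here as an assumption, is that each entry is stored as a quantized byte and recovered as byte × scale. The sketch below only illustrates that idea; `QuantizedSketch` and its constructor are hypothetical and not part of SystemDS.

```java
// Hypothetical illustration of byte quantization with a shared scale.
// This is NOT the SystemDS implementation.
final class QuantizedSketch {
    private final byte[] values; // quantized entries
    private final double scale;  // shared scaling factor

    QuantizedSketch(double[] raw) {
        double maxAbs = 0;
        for (double v : raw)
            maxAbs = Math.max(maxAbs, Math.abs(v));
        // Map the largest magnitude to Byte.MAX_VALUE (127); fall back to 1.0 for all-zero input.
        scale = maxAbs == 0 ? 1.0 : maxAbs / Byte.MAX_VALUE;
        values = new byte[raw.length];
        for (int i = 0; i < raw.length; i++)
            values[i] = (byte) Math.round(raw[i] / scale);
    }

    byte getValueByte(int i) { return values[i]; }

    double getScale() { return scale; }

    // Assumed reconstruction rule: quantized byte times the shared scale.
    double getValue(int i) { return values[i] * scale; }
}
```

    Under this assumption the reconstruction is lossy whenever a raw value is not an exact multiple of the scale, which would be consistent with `isLossy()` being part of the interface.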
  - getInMemorySize
```
public long getInMemorySize()
```
    Description copied from class: ADictionary
    
    Returns the memory usage of the dictionary.
    
    Specified by:
    
    getInMemorySize in class ADictionary
    
    Returns:
    
    a long value in number of bytes for the dictionary.
  - getInMemorySize
```
public static long getInMemorySize(int valuesCount)
```
  - aggregate
```
public double aggregate(double init,
                        Builtin fn)
```
    Description copied from class: ADictionary
    
    Aggregate all the contained values, useful in value only computations where the operation is iterating through all values contained in the dictionary.
    
    Specified by:
    
    aggregate in class ADictionary
    
    Parameters:
    
    init - The initial value; for an aggregate such as max, this could be -infinity
    
    fn - The Function to apply to values
    
    Returns:
    
    The aggregated value as a double.
  - aggregateTuples
```
public double[] aggregateTuples(Builtin fn,
                                int nCol)
```
    Description copied from class: ADictionary
    
    Aggregate all entries in the rows.
    
    Specified by:
    
    aggregateTuples in class ADictionary
    
    Parameters:
    
    fn - The aggregate function
    
    nCol - The number of columns contained in the dictionary.
    
    Returns:
    
    Aggregates for this dictionary tuples.
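    As a rough illustration of per-tuple aggregation, the sketch below folds each row of a linearized array with a binary function. It assumes a row-major layout (`values[k * nCol + j]`) and substitutes `java.util.function.DoubleBinaryOperator` for SystemDS's `Builtin`; the class name `TupleAggregateSketch` is hypothetical.

```java
import java.util.function.DoubleBinaryOperator;

// Hypothetical sketch; the row-major layout and the functional interface are assumptions.
final class TupleAggregateSketch {
    static double[] aggregateTuples(double[] values, DoubleBinaryOperator fn, int nCol) {
        int nRows = values.length / nCol;
        double[] ret = new double[nRows];
        for (int k = 0; k < nRows; k++) {
            // Start from the first cell of the tuple, then fold in the remaining cells.
            double acc = values[k * nCol];
            for (int j = 1; j < nCol; j++)
                acc = fn.applyAsDouble(acc, values[k * nCol + j]);
            ret[k] = acc;
        }
        return ret;
    }
}
```

    For example, passing `Math::max` with `nCol = 3` yields one maximum per three-cell tuple.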
  - apply
```
public QDictionary apply(ScalarOperator op)
```
    Description copied from class: ADictionary
    
    Applies the scalar operation on the dictionary. Note that this operation modifies the underlying data, so it normally requires a copy of the original dictionary to preserve old objects.
    
    Specified by:
    
    apply in class ADictionary
    
    Parameters:
    
    op - The operator to apply to the dictionary values.
    
    Returns:
    
    this dictionary with modified values.
  - applyScalarOp
```
public QDictionary applyScalarOp(ScalarOperator op,
                                 double newVal,
                                 int numCols)
```
    Description copied from class: ADictionary
    
    Applies the scalar operation on the dictionary. The returned dictionary should contain a new instance of the underlying data. Therefore it will not modify the previous object.
    
    Specified by:
    
    applyScalarOp in class ADictionary
    
    Parameters:
    
    op - The operator to apply to the dictionary values.
    
    newVal - The value to append to the dictionary.
    
    numCols - The number of columns stored in the dictionary.
    
    Returns:
    
    Another dictionary with modified values.
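    The difference between `apply` (modifies the underlying data) and `applyScalarOp` (returns a new instance and appends `newVal`) can be sketched on plain arrays. This is a minimal illustration under assumptions: `DoubleUnaryOperator` stands in for `ScalarOperator`, the appended tuple is taken to be `numCols` cells all equal to `newVal`, and `ScalarOpSketch` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch contrasting in-place and copying scalar operations.
final class ScalarOpSketch {
    // In-place variant: overwrites the given array and returns the same instance.
    static double[] applyInPlace(double[] values, DoubleUnaryOperator op) {
        for (int i = 0; i < values.length; i++)
            values[i] = op.applyAsDouble(values[i]);
        return values;
    }

    // Copying variant: leaves the input untouched and appends one extra tuple
    // of numCols cells, each set to newVal (assumed to be supplied ready-made).
    static double[] applyWithCopy(double[] values, DoubleUnaryOperator op, double newVal, int numCols) {
        double[] ret = new double[values.length + numCols];
        for (int i = 0; i < values.length; i++)
            ret[i] = op.applyAsDouble(values[i]);
        Arrays.fill(ret, values.length, ret.length, newVal);
        return ret;
    }
}
```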
  - applyBinaryRowOpRight
```
public QDictionary applyBinaryRowOpRight(BinaryOperator op,
                                         double[] v,
                                         boolean sparseSafe,
                                         int[] colIndexes)
```
    Description copied from class: ADictionary
    
    Apply binary row operation on this dictionary on the right side.
    
    Specified by:
    
    applyBinaryRowOpRight in class ADictionary
    
    Parameters:
    
    op - The operation to apply to this dictionary
    
    v - The values to use on the right hand side.
    
    sparseSafe - boolean specifying whether the operation is sparse safe, in which case no extended dictionary needs to be allocated
    
    colIndexes - The column indexes to consider inside v.
    
    Returns:
    
    A new dictionary containing the updated values.
  - applyBinaryRowOpLeft
```
public QDictionary applyBinaryRowOpLeft(BinaryOperator op,
                                        double[] v,
                                        boolean sparseSafe,
                                        int[] colIndexes)
```
    Description copied from class: ADictionary
    
    Apply binary row operation on this dictionary on the left side.
    
    Specified by:
    
    applyBinaryRowOpLeft in class ADictionary
    
    Parameters:
    
    op - The operation to apply to this dictionary
    
    v - The values to use on the left hand side.
    
    sparseSafe - boolean specifying whether the operation is sparse safe, in which case no extended dictionary needs to be allocated
    
    colIndexes - The column indexes to consider inside v.
    
    Returns:
    
    A new dictionary containing the updated values.
  - clone
```
public QDictionary clone()
```
    Description copied from class: ADictionary
    
    Returns a deep clone of the dictionary.
    
    Specified by:
    
    clone in class ADictionary
  - cloneAndExtend
```
public QDictionary cloneAndExtend(int len)
```
    Description copied from class: ADictionary
    
    Clone the dictionary, and extend size of the dictionary by a given length
    
    Specified by:
    
    cloneAndExtend in class ADictionary
    
    Parameters:
    
    len - The length to extend the dictionary by; this value is assumed to be positive.
    
    Returns:
    
    a clone of the dictionary, extended by len.
  - write
```
public void write(DataOutput out)
           throws IOException
```
    Description copied from class: ADictionary
    
    Write the dictionary to a DataOutput.
    
    Specified by:
    
    write in class ADictionary
    
    Parameters:
    
    out - the output sink to write the dictionary to.
    
    Throws:
    
    IOException - if the sink fails.
  - read
```
public static QDictionary read(DataInput in)
                        throws IOException
```
    Throws:
    
    IOException
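    The on-disk layout used by `write` and `read` is not documented on this page. The sketch below shows a generic round trip through `java.io.DataOutput` and `java.io.DataInput` using an invented layout (scale, length, then the raw bytes); the layout, the class `DictIoSketch`, and the helper `roundTrip` are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch; the serialization layout is an assumption, not SystemDS's format.
final class DictIoSketch {
    final double scale;
    final byte[] values;

    DictIoSketch(double scale, byte[] values) {
        this.scale = scale;
        this.values = values;
    }

    // Writes: 8-byte scale, 4-byte length, then one byte per value.
    void write(DataOutput out) throws IOException {
        out.writeDouble(scale);
        out.writeInt(values.length);
        out.write(values);
    }

    // Reads the fields back in the same order they were written.
    static DictIoSketch read(DataInput in) throws IOException {
        double scale = in.readDouble();
        byte[] values = new byte[in.readInt()];
        in.readFully(values);
        return new DictIoSketch(scale, values);
    }

    // Serializes to an in-memory buffer and deserializes again.
    static DictIoSketch roundTrip(DictIoSketch d) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            d.write(new DataOutputStream(buf));
            return read(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```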
  - getExactSizeOnDisk
```
public long getExactSizeOnDisk()
```
    Description copied from class: ADictionary
    
    Calculate the space consumption if the dictionary is stored on disk.
    
    Specified by:
    
    getExactSizeOnDisk in class ADictionary
    
    Returns:
    
    the long count of bytes to store the dictionary.
  - getNumberOfValues
```
public int getNumberOfValues(int nCol)
```
    Description copied from class: ADictionary
    
    Get the number of distinct tuples given that the column group has n columns
    
    Specified by:
    
    getNumberOfValues in class ADictionary
    
    Parameters:
    
    nCol - The number of Columns in the ColumnGroup.
    
    Returns:
    
    the number of value tuples contained in the dictionary.
  - sumAllRowsToDouble
```
public double[] sumAllRowsToDouble(boolean square,
                                   int nrColumns)
```
    Description copied from class: ADictionary
    
    Method used as a pre-aggregate of each tuple in the dictionary to a single double value. Note that if the number of columns is one, the dictionary's actual values are simply returned.
    
    Specified by:
    
    sumAllRowsToDouble in class ADictionary
    
    Parameters:
    
    square - If each entry should be squared.
    
    nrColumns - The number of columns in the ColGroup to know how to get the values from the dictionary.
    
    Returns:
    
    a double array containing the row sums from this dictionary.
  - sumRow
```
public double sumRow(int k,
                     boolean square,
                     int nrColumns)
```
    Description copied from class: ADictionary
    
    Sum the values at a specific row.
    
    Specified by:
    
    sumRow in class ADictionary
    
    Parameters:
    
    k - The row index to sum
    
    square - If each entry should be squared.
    
    nrColumns - The number of columns
    
    Returns:
    
    The sum of the row.
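    The two row-sum methods can be pictured on a linearized array. This is a minimal sketch assuming a row-major layout (`values[k * nrColumns + j]`); `RowSumSketch` is a hypothetical class, and unlike the note on `sumAllRowsToDouble` it always allocates a fresh result array.

```java
// Hypothetical sketch of row sums over a row-major linearized dictionary.
final class RowSumSketch {
    // Sum (or sum of squares) of the cells in tuple k.
    static double sumRow(double[] values, int k, boolean square, int nrColumns) {
        double sum = 0;
        int off = k * nrColumns;
        for (int j = 0; j < nrColumns; j++) {
            double v = values[off + j];
            sum += square ? v * v : v;
        }
        return sum;
    }

    // One aggregated value per tuple.
    static double[] sumAllRowsToDouble(double[] values, boolean square, int nrColumns) {
        int nRows = values.length / nrColumns;
        double[] ret = new double[nRows];
        for (int k = 0; k < nRows; k++)
            ret[k] = sumRow(values, k, square, nrColumns);
        return ret;
    }
}
```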
  - colSum
```
public double[] colSum(int[] counts,
                       int nCol)
```
    Description copied from class: ADictionary
    
    Get the column sum of this dictionary only.
    
    Specified by:
    
    colSum in class ADictionary
    
    Parameters:
    
    counts - the counts of the values contained
    
    nCol - The number of columns contained in each tuple.
    
    Returns:
    
    the colSums of this column group.
  - colSum
```
public void colSum(double[] c,
                   int[] counts,
                   int[] colIndexes,
                   boolean square)
```
    Description copied from class: ADictionary
    
    Get the column sum of the values contained in the dictionary
    
    Specified by:
    
    colSum in class ADictionary
    
    Parameters:
    
    c - The output array allocated to contain the output of all column groups.
    
    counts - The counts of the individual tuples.
    
    colIndexes - The column indexes of the parent column group; these indicate where to put each column sum in the c output.
    
    square - Specify if the values should be squared
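    A count-weighted column sum that scatters into a shared output array might look like the sketch below. It assumes a row-major layout and that the number of columns equals `colIndexes.length`; the extra `values` parameter and the class `ColSumSketch` are hypothetical, added so the example is self-contained.

```java
// Hypothetical sketch of a count-weighted column sum written into a shared output.
final class ColSumSketch {
    static void colSum(double[] values, double[] c, int[] counts, int[] colIndexes, boolean square) {
        int nCol = colIndexes.length;
        int nRows = values.length / nCol;
        for (int k = 0; k < nRows; k++) {
            for (int j = 0; j < nCol; j++) {
                double v = values[k * nCol + j];
                // Each tuple occurs counts[k] times; column j lands at colIndexes[j] in c.
                c[colIndexes[j]] += counts[k] * (square ? v * v : v);
            }
        }
    }
}
```

    Because the method adds into `c` rather than overwriting it, several column groups could accumulate into the same output array.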
  - sum
```
public double sum(int[] counts,
                  int ncol)
```
    Description copied from class: ADictionary
    
    Get the sum of the values contained in the dictionary
    
    Specified by:
    
    sum in class ADictionary
    
    Parameters:
    
    counts - The counts of the individual tuples
    
    ncol - The number of columns contained
    
    Returns:
    
    The sum scaled by the counts provided.
  - sumsq
```
public double sumsq(int[] counts,
                    int ncol)
```
    Description copied from class: ADictionary
    
    Get the square sum of the values contained in the dictionary
    
    Specified by:
    
    sumsq in class ADictionary
    
    Parameters:
    
    counts - The counts of the individual tuples
    
    ncol - The number of columns contained
    
    Returns:
    
    The square sum scaled by the counts provided.
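    "Scaled by the counts" can be read as weighting each tuple by how often it occurs. The sketch below covers both `sum` and `sumsq` under that reading, assuming a row-major layout; `WeightedSumSketch` and the `values` parameter are hypothetical.

```java
// Hypothetical sketch: total sum and squared sum, each tuple weighted by its count.
final class WeightedSumSketch {
    static double sum(double[] values, int[] counts, int ncol) {
        return weighted(values, counts, ncol, false);
    }

    static double sumsq(double[] values, int[] counts, int ncol) {
        return weighted(values, counts, ncol, true);
    }

    private static double weighted(double[] values, int[] counts, int ncol, boolean square) {
        double total = 0;
        int nRows = values.length / ncol;
        for (int k = 0; k < nRows; k++) {
            double rowSum = 0;
            for (int j = 0; j < ncol; j++) {
                double v = values[k * ncol + j];
                rowSum += square ? v * v : v;
            }
            // Tuple k occurs counts[k] times in the column group.
            total += counts[k] * rowSum;
        }
        return total;
    }
}
```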
  - addMaxAndMin
```
public void addMaxAndMin(double[] ret,
                         int[] colIndexes)
```
    Description copied from class: ADictionary
    
    This method adds the max and min values contained in the dictionary to corresponding cells in the ret variable. One use case for this method is the squash operation, to go from an overlapping state to normal compression.
    
    Specified by:
    
    addMaxAndMin in class ADictionary
    
    Parameters:
    
    ret - The double array that contains all columns min and max.
    
    colIndexes - The column indexes contained in this dictionary.
  - getString
```
public String getString(int colIndexes)
```
    Description copied from class: ADictionary
    
    Get a string representation of the dictionary, that considers the layout of the data.
    
    Specified by:
    
    getString in class ADictionary
    
    Parameters:
    
    colIndexes - The number of columns in the dictionary.
    
    Returns:
    
    A string that is nicer to print.
  - makeDoubleDictionary
```
public Dictionary makeDoubleDictionary()
```
  - sliceOutColumnRange
```
public ADictionary sliceOutColumnRange(int idxStart,
                                       int idxEnd,
                                       int previousNumberOfColumns)
```
    Description copied from class: ADictionary
    
    Modify the dictionary by removing columns not within the index range.
    
    Specified by:
    
    sliceOutColumnRange in class ADictionary
    
    Parameters:
    
    idxStart - The column index to start at.
    
    idxEnd - The column index to end at (not inclusive)
    
    previousNumberOfColumns - The number of columns contained in the dictionary.
    
    Returns:
    
    A dictionary containing the sliced out columns values only.
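    Slicing a half-open column range out of every tuple can be sketched as follows, assuming a row-major layout; `SliceSketch` and the `values` parameter are hypothetical.

```java
// Hypothetical sketch: keep columns [idxStart, idxEnd) of every tuple.
final class SliceSketch {
    static double[] sliceOutColumnRange(double[] values, int idxStart, int idxEnd, int previousNumberOfColumns) {
        int nRows = values.length / previousNumberOfColumns;
        int nColOut = idxEnd - idxStart;
        double[] ret = new double[nRows * nColOut];
        for (int k = 0; k < nRows; k++) {
            // Copy the selected cells of tuple k into the narrower output tuple.
            System.arraycopy(values, k * previousNumberOfColumns + idxStart, ret, k * nColOut, nColOut);
        }
        return ret;
    }
}
```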
  - reExpandColumns
```
public ADictionary reExpandColumns(int max)
```
    Description copied from class: ADictionary
    
    Return a new dictionary that has re-expanded all values, based on the entries already contained.
    
    Specified by:
    
    reExpandColumns in class ADictionary
    
    Parameters:
    
    max - The number of output columns possible.
    
    Returns:
    
    The re expanded Dictionary.
  - containsValue
```
public boolean containsValue(double pattern)
```
    Description copied from class: ADictionary
    
    Detect if the dictionary contains a specific value.
    
    Specified by:
    
    containsValue in class ADictionary
    
    Parameters:
    
    pattern - The value to search for
    
    Returns:
    
    true if the value is contained else false.
  - getNumberNonZeros
```
public long getNumberNonZeros(int[] counts,
                              int nCol)
```
    Description copied from class: ADictionary
    
    Calculate the number of non zeros in the dictionary. The number of non zeros should be scaled with the counts given. This gives the exact number of non zero values in the parent column group.
    
    Specified by:
    
    getNumberNonZeros in class ADictionary
    
    Parameters:
    
    counts - The counts of each dictionary entry
    
    nCol - The number of columns in this dictionary
    
    Returns:
    
    The nonZero count
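    The scaling by counts described above can be sketched as: count the non-zero cells in each tuple, then multiply by the number of times that tuple occurs. The layout is assumed to be row-major, and `NnzSketch` with its `values` parameter is hypothetical.

```java
// Hypothetical sketch: non-zero count of the parent column group from its dictionary.
final class NnzSketch {
    static long getNumberNonZeros(double[] values, int[] counts, int nCol) {
        long nnz = 0;
        int nRows = values.length / nCol;
        for (int k = 0; k < nRows; k++) {
            int rowNnz = 0;
            for (int j = 0; j < nCol; j++)
                if (values[k * nCol + j] != 0)
                    rowNnz++;
            // Widen to long before multiplying to avoid int overflow on large counts.
            nnz += (long) rowNnz * counts[k];
        }
        return nnz;
    }
}
```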
  - addToEntry
```
public void addToEntry(Dictionary d,
                       int fr,
                       int to,
                       int nCol)
```
    Description copied from class: ADictionary
    
    Copies and adds the dictionary entry from this dictionary to the d dictionary
    
    Specified by:
    
    addToEntry in class ADictionary
    
    Parameters:
    
    d - the target dictionary
    
    fr - the from index
    
    to - the to index
    
    nCol - the number of columns
  - isLossy
```
public boolean isLossy()
```
    Description copied from class: ADictionary
    
    Specify if the Dictionary is lossy.
    
    Specified by:
    
    isLossy in class ADictionary
    
    Returns:
    
    A boolean
  - getTuple
```
public double[] getTuple(int index,
                         int nCol)
```
    Description copied from class: ADictionary
    
    Get the values contained in a specific tuple of the dictionary.
    
    Specified by:
    
    getTuple in class ADictionary
    
    Parameters:
    
    index - The index where the values are located
    
    nCol - The number of columns contained in this dictionary
    
    Returns:
    
    a materialized double array containing the tuple.
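    Tuple access on a linearized array reduces to index arithmetic. The sketch below assumes a row-major layout in which tuple `index` occupies cells `index * nCol` up to `(index + 1) * nCol`; it also shows how the number of tuples follows from the array length. `TupleSketch` is a hypothetical class.

```java
import java.util.Arrays;

// Hypothetical sketch of tuple lookup in a row-major linearized dictionary.
final class TupleSketch {
    // Number of tuples when each tuple holds nCol cells.
    static int getNumberOfValues(double[] values, int nCol) {
        return values.length / nCol;
    }

    // Materializes tuple `index` as a fresh array.
    static double[] getTuple(double[] values, int index, int nCol) {
        return Arrays.copyOfRange(values, index * nCol, (index + 1) * nCol);
    }
}
```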
  - subtractTuple
```
public ADictionary subtractTuple(double[] tuple)
```
    Description copied from class: ADictionary
    
    Allocate a new dictionary where the tuple given is subtracted from all tuples in the previous dictionary.
    
    Specified by:
    
    subtractTuple in class ADictionary
    
    Parameters:
    
    tuple - a double array representing a tuple; the tuple width is assumed to be the same as this dictionary's.
    
    Returns:
    
    a new instance of dictionary with the tuple subtracted.
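    Subtracting one tuple from every tuple while leaving the original untouched can be sketched like this, assuming a row-major layout and a tuple whose length equals the number of columns; `SubtractSketch` is a hypothetical class.

```java
// Hypothetical sketch: subtract the same tuple from every row of the dictionary.
final class SubtractSketch {
    static double[] subtractTuple(double[] values, double[] tuple) {
        int nCol = tuple.length;
        double[] ret = new double[values.length]; // new allocation, input is not modified
        for (int i = 0; i < values.length; i++)
            ret[i] = values[i] - tuple[i % nCol]; // i % nCol is the column of cell i
        return ret;
    }
}
```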
  - getAsMatrixBlockDictionary
```
public MatrixBlockDictionary getAsMatrixBlockDictionary(int nCol)
```
    Description copied from class: ADictionary
    
    Get this dictionary as a matrixBlock dictionary. This allows us to use optimized kernels coded elsewhere in the system, such as matrix multiplication.
    
    Specified by:
    
    getAsMatrixBlockDictionary in class ADictionary
    
    Parameters:
    
    nCol - The number of columns contained in this column group.
    
    Returns:
    
    A Dictionary containing a MatrixBlock.
  - aggregateCols
```
public void aggregateCols(double[] c,
                          Builtin fn,
                          int[] colIndexes)
```
    Description copied from class: ADictionary
    
    Aggregates the columns into the target double array provided.
    
    Specified by:
    
    aggregateCols in class ADictionary
    
    Parameters:
    
    c - The target double array. It spans the full number of columns, so the colIndexes of this specific dictionary are needed.
    
    fn - The function to apply to individual columns
    
    colIndexes - The mapping to the target columns from the individual columns
  - scaleTuples
```
public ADictionary scaleTuples(int[] scaling,
                               int nCol)
```
    Description copied from class: ADictionary
    
    Scale all tuples contained in the dictionary by the scaling factor given in the int list.
    
    Specified by:
    
    scaleTuples in class ADictionary
    
    Parameters:
    
    scaling - The amount to multiply the given tuples with
    
    nCol - The number of columns contained in this column group.
    
    Returns:
    
    A New dictionary (since we don't want to modify the underlying dictionary)
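    One way to read the scaling is that tuple `k` is multiplied cell by cell with `scaling[k]`. That per-tuple interpretation is an assumption, as is the row-major layout; `ScaleSketch` is a hypothetical class.

```java
// Hypothetical sketch: multiply every cell of tuple k by scaling[k], into a new array.
final class ScaleSketch {
    static double[] scaleTuples(double[] values, int[] scaling, int nCol) {
        double[] ret = new double[values.length];
        int nRows = values.length / nCol;
        for (int k = 0; k < nRows; k++)
            for (int j = 0; j < nCol; j++)
                ret[k * nCol + j] = values[k * nCol + j] * scaling[k];
        return ret;
    }
}
```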
  - preaggValuesFromDense
```
public void preaggValuesFromDense(int numVals,
                                  int[] colIndexes,
                                  int[] aggregateColumns,
                                  double[] b,
                                  double[] ret,
                                  int cut)
```
    Description copied from class: ADictionary
    
    Pre Aggregate values for right Matrix Multiplication.
    
    Specified by:
    
    preaggValuesFromDense in class ADictionary
    
    Parameters:
    
    numVals - The number of values contained in this dictionary
    
    colIndexes - The column indexes that are associated with the parent column group
    
    aggregateColumns - The columns to aggregate; these are preprocessed to remove empty columns from consideration
    
    b - The values in the right hand side matrix
    
    ret - The double array to put in the aggregate.
    
    cut - The number of columns in b.
  - replace
```
public ADictionary replace(double pattern,
                           double replace,
                           int nCol,
                           boolean safe)
```
    Description copied from class: ADictionary
    
    Make a copy of the values, and replace all values that match pattern with replacement value. If needed add a new column index.
    
    Specified by:
    
    replace in class ADictionary
    
    Parameters:
    
    pattern - The value to look for
    
    replace - The value to replace the other value with
    
    nCol - The number of columns contained in the dictionary.
    
    safe - Specifies whether the operation requires considering the addition of a new tuple. This depends on whether the dictionary has allocated the last zero tuple.
    
    Returns:
    
    A new Column Group, reusing the index structure but with new values.
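    The copy-and-replace step can be sketched on a plain array. This minimal version ignores the `safe` flag and the possible extra tuple, and it treats a NaN pattern as matching NaN cells, which is an assumption; `ReplaceSketch` is a hypothetical class.

```java
// Hypothetical sketch: copy the values and substitute every cell equal to `pattern`.
final class ReplaceSketch {
    static double[] replace(double[] values, double pattern, double replacement) {
        double[] ret = new double[values.length];
        boolean nanPattern = Double.isNaN(pattern);
        for (int i = 0; i < values.length; i++) {
            double v = values[i];
            // NaN never equals itself, so it needs an explicit check.
            boolean match = nanPattern ? Double.isNaN(v) : v == pattern;
            ret[i] = match ? replacement : v;
        }
        return ret;
    }
}
```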
