Class QDictionary
- java.lang.Object
  - org.apache.sysds.runtime.compress.colgroup.dictionary.ADictionary
    - org.apache.sysds.runtime.compress.colgroup.dictionary.ACachingMBDictionary
      - org.apache.sysds.runtime.compress.colgroup.dictionary.QDictionary
- All Implemented Interfaces:
Serializable, IDictionary
public class QDictionary extends ACachingMBDictionary
This dictionary class encapsulates the storage of, and operations over, the unique floating point values of a column group. It was introduced primarily to provide an entry point for specializations such as shared dictionaries, which require additional information.
- See Also:
- Serialized Form
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from interface org.apache.sysds.runtime.compress.colgroup.dictionary.IDictionary
IDictionary.DictType
-
-
Field Summary
-
Fields inherited from interface org.apache.sysds.runtime.compress.colgroup.dictionary.IDictionary
LOG
-
-
Method Summary
Modifier and Type, Method, Description
double
aggregate(double init, Builtin fn)
Aggregate all the contained values, useful in value only computations where the operation is iterating through all values contained in the dictionary.
QDictionary
clone()
Returns a deep clone of the dictionary.
int[]
countNNZZeroColumns(int[] counts)
Count the number of non zero values in each column of the dictionary, multiplied with the counts.
static QDictionary
create(byte[] values, double scale, int nCol, boolean check)
MatrixBlockDictionary
createMBDict(int nCol)
boolean
equals(IDictionary o)
Indicate if the other dictionary is equal to this.
IDictionary.DictType
getDictType()
Get the type of this dictionary.
long
getExactSizeOnDisk()
Calculate the space consumption if the dictionary is stored on disk.
long
getInMemorySize()
Returns the memory usage of the dictionary.
static long
getInMemorySize(int valuesCount)
MatrixBlockDictionary
getMBDict()
long
getNumberNonZeros(int[] counts, int nCol)
Calculate the number of non zeros in the dictionary.
int
getNumberOfColumns(int nCol)
Get the number of columns in this dictionary, provided you know the number of values, or rows.
int
getNumberOfValues(int nCol)
Get the number of distinct tuples given that the column group has n columns.
double
getSparsity()
Get the sparsity of the dictionary.
String
getString(int colIndexes)
Get a string representation of the dictionary that considers the layout of the data.
double
getValue(int i)
Get the specific value contained in the dictionary at the given index.
double
getValue(int r, int c, int nCol)
Get the specific value contained in the dictionary at the given row and column.
double[]
getValues()
Get all the values contained in the dictionary as a linearized double array.
static QDictionary
read(DataInput in)
IDictionary
sliceOutColumnRange(int idxStart, int idxEnd, int previousNumberOfColumns)
Modify the dictionary by removing columns not within the index range.
double[]
sumAllRowsToDouble(int nrColumns)
Method used as a pre-aggregate of each tuple in the dictionary, to single double values.
double[]
sumAllRowsToDoubleSq(int nrColumns)
Method used as a pre-aggregate of each tuple in the dictionary, to single double values.
void
write(DataOutput out)
Write the dictionary to a DataOutput.
-
Methods inherited from class org.apache.sysds.runtime.compress.colgroup.dictionary.ACachingMBDictionary
getMBDict
-
Methods inherited from class org.apache.sysds.runtime.compress.colgroup.dictionary.ADictionary
addToEntry, addToEntry, addToEntryVectorized, aggregateCols, aggregateColsWithReference, aggregateRows, aggregateRowsWithDefault, aggregateRowsWithReference, aggregateWithReference, append, applyScalarOp, applyScalarOpAndAppend, applyScalarOpWithReference, applyUnaryOp, applyUnaryOpAndAppend, applyUnaryOpWithReference, binOpLeft, binOpLeftAndAppend, binOpLeftWithReference, binOpRight, binOpRight, binOpRightAndAppend, binOpRightWithReference, cbind, centralMoment, centralMoment, centralMomentWithDefault, centralMomentWithDefault, centralMomentWithReference, centralMomentWithReference, colProduct, colProductWithReference, colSum, colSumSq, colSumSqWithReference, containsValue, containsValueWithReference, correctNan, equals, equals, getNumberNonZerosWithReference, getRow, MMDict, MMDictDense, MMDictScaling, MMDictScalingDense, MMDictScalingSparse, MMDictSparse, multiplyScalar, preaggValuesFromDense, product, productAllRowsToDouble, productAllRowsToDoubleWithDefault, productAllRowsToDoubleWithReference, productWithDefault, productWithReference, putDense, putSparse, reorder, replace, replaceWithReference, rexpandCols, rexpandColsWithReference, rightMMPreAggSparse, scaleTuples, subtractTuple, sum, sumAllRowsToDoubleSqWithDefault, sumAllRowsToDoubleSqWithReference, sumAllRowsToDoubleWithDefault, sumAllRowsToDoubleWithReference, sumSq, sumSqWithReference, TSMMToUpperTriangle, TSMMToUpperTriangleDense, TSMMToUpperTriangleDenseScaling, TSMMToUpperTriangleScaling, TSMMToUpperTriangleSparse, TSMMToUpperTriangleSparseScaling, TSMMWithScaling
-
-
-
-
Method Detail
-
create
public static QDictionary create(byte[] values, double scale, int nCol, boolean check)
-
getValues
public double[] getValues()
Description copied from interface: IDictionary
Get all the values contained in the dictionary as a linearized double array.
- Specified by:
getValues in interface IDictionary
- Overrides:
getValues in class ADictionary
- Returns:
A linearized double array
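As a rough illustration of what a linearized double array could look like for a quantized dictionary, the sketch below expands stored bytes into doubles. It assumes that each byte represents the value `byte * scale`, which the `create(byte[] values, double scale, ...)` signature suggests but this page does not state. The class `QuantizedValuesSketch` and its method are hypothetical and not part of SystemDS.

```java
import java.util.Arrays;

// Hypothetical stand-in, not the SystemDS implementation.
class QuantizedValuesSketch {

    // Assumption: every stored byte represents the double value (byte * scale).
    static double[] dequantize(byte[] values, double scale) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++)
            out[i] = values[i] * scale;
        return out;
    }

    public static void main(String[] args) {
        byte[] stored = {2, -4, 0, 6};
        // prints [1.0, -2.0, 0.0, 3.0]
        System.out.println(Arrays.toString(dequantize(stored, 0.5)));
    }
}
```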
-
getValue
public double getValue(int i)
Description copied from interface: IDictionary
Get the specific value contained in the dictionary at the given index.
- Specified by:
getValue in interface IDictionary
- Overrides:
getValue in class ADictionary
- Parameters:
i - The index to extract the value from
- Returns:
The value contained at the index
-
getValue
public final double getValue(int r, int c, int nCol)
Description copied from interface: IDictionary
Get the specific value contained in the dictionary at the given row and column.
- Specified by:
getValue in interface IDictionary
- Overrides:
getValue in class ADictionary
- Parameters:
r - Row target
c - Col target
nCol - nCol in dictionary
- Returns:
The value
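The relationship between the row, column, and `nCol` arguments can be sketched as a lookup into a linearized array. The example assumes a dense, row-major layout in which tuple `r` occupies positions `r * nCol` to `r * nCol + nCol - 1`; the page does not confirm this layout, and `RowMajorLookupSketch` is a hypothetical name.

```java
// Hypothetical stand-in, not the SystemDS implementation.
class RowMajorLookupSketch {

    // Assumption: values are stored row-major, one tuple of nCol values after another.
    static double getValue(double[] values, int r, int c, int nCol) {
        return values[r * nCol + c];
    }

    public static void main(String[] args) {
        // Two tuples with three columns each: (1, 2, 3) and (4, 5, 6).
        double[] values = {1, 2, 3, 4, 5, 6};
        System.out.println(getValue(values, 1, 2, 3)); // prints 6.0
    }
}
```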
-
getInMemorySize
public long getInMemorySize()
Description copied from interface: IDictionary
Returns the memory usage of the dictionary.
- Returns:
A long value giving the size of the dictionary in bytes.
-
getInMemorySize
public static long getInMemorySize(int valuesCount)
-
aggregate
public double aggregate(double init, Builtin fn)
Description copied from interface: IDictionary
Aggregate all the contained values, useful in value only computations where the operation is iterating through all values contained in the dictionary.
- Specified by:
aggregate in interface IDictionary
- Overrides:
aggregate in class ADictionary
- Parameters:
init - The initial value; for an aggregate such as max, this could be -infinity
fn - The function to apply to the values
- Returns:
The aggregated value as a double.
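The role of the initial value can be illustrated with a plain fold over the dictionary values. The sketch uses `java.util.function.DoubleBinaryOperator` as a stand-in for the `Builtin` function object, whose interface is not described on this page; `AggregateSketch` is a hypothetical name.

```java
import java.util.function.DoubleBinaryOperator;

// Hypothetical stand-in, not the SystemDS implementation.
class AggregateSketch {

    // Folds all values into one double, starting from init.
    static double aggregate(double[] values, double init, DoubleBinaryOperator fn) {
        double acc = init;
        for (double v : values)
            acc = fn.applyAsDouble(acc, v);
        return acc;
    }

    public static void main(String[] args) {
        double[] values = {1.5, -2.0, 7.0};
        // For max, -infinity is a neutral starting point.
        System.out.println(aggregate(values, Double.NEGATIVE_INFINITY, Math::max)); // prints 7.0
    }
}
```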
-
clone
public QDictionary clone()
Description copied from interface: IDictionary
Returns a deep clone of the dictionary.
- Specified by:
clone in interface IDictionary
- Specified by:
clone in class ADictionary
- Returns:
A deep clone
-
write
public void write(DataOutput out) throws IOException
Description copied from interface: IDictionary
Write the dictionary to a DataOutput.
- Parameters:
out - The output sink to write the dictionary to.
- Throws:
IOException - if the sink fails.
-
read
public static QDictionary read(DataInput in) throws IOException
- Throws:
IOException
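A round trip through `DataOutput` and `DataInput` can be sketched as follows. The byte layout used here (scale, then length, then the raw bytes) is invented purely for illustration; the actual serialized form of QDictionary is not documented on this page. `QuantizedIoSketch` and its members are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in, not the SystemDS implementation or its wire format.
class QuantizedIoSketch {
    final byte[] values;
    final double scale;

    QuantizedIoSketch(byte[] values, double scale) {
        this.values = values;
        this.scale = scale;
    }

    // Assumed layout: scale (8 bytes), length (4 bytes), then the values.
    void write(DataOutput out) throws IOException {
        out.writeDouble(scale);
        out.writeInt(values.length);
        out.write(values);
    }

    // Reads back exactly what write() produced.
    static QuantizedIoSketch read(DataInput in) throws IOException {
        double scale = in.readDouble();
        byte[] values = new byte[in.readInt()];
        in.readFully(values);
        return new QuantizedIoSketch(values, scale);
    }

    // Serializes to an in-memory buffer and deserializes again.
    static QuantizedIoSketch roundTrip(QuantizedIoSketch d) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            d.write(new DataOutputStream(buffer));
            return read(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        QuantizedIoSketch copy = roundTrip(new QuantizedIoSketch(new byte[] {1, -2, 3}, 0.25));
        System.out.println(copy.scale + " " + copy.values.length);
    }
}
```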
-
getExactSizeOnDisk
public long getExactSizeOnDisk()
Description copied from interface: IDictionary
Calculate the space consumption if the dictionary is stored on disk.
- Returns:
The number of bytes needed to store the dictionary, as a long.
-
getNumberOfValues
public int getNumberOfValues(int nCol)
Description copied from interface: IDictionary
Get the number of distinct tuples given that the column group has n columns.
- Parameters:
nCol - The number of columns in the ColumnGroup.
- Returns:
The number of value tuples contained in the dictionary.
-
getNumberOfColumns
public int getNumberOfColumns(int nCol)
Description copied from interface: IDictionary
Get the number of columns in this dictionary, provided you know the number of values, or rows.
- Parameters:
nCol - The number of rows/values known inside this dictionary
- Returns:
The number of columns
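For a dense linearized array, the tuple count and the column count are two views of the same length. The sketch below assumes the array holds exactly `rows * columns` entries with no extra padding or default tuple, which this page does not confirm. `DictionaryShapeSketch` is a hypothetical name.

```java
// Hypothetical stand-in, not the SystemDS implementation.
class DictionaryShapeSketch {

    // Number of tuples when the number of columns is known.
    static int numberOfValues(int length, int nCol) {
        return length / nCol;
    }

    // Number of columns when the number of tuples (rows) is known.
    static int numberOfColumns(int length, int nRow) {
        return length / nRow;
    }

    public static void main(String[] args) {
        int length = 6; // e.g. two tuples of three columns
        System.out.println(numberOfValues(length, 3));  // prints 2
        System.out.println(numberOfColumns(length, 2)); // prints 3
    }
}
```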
-
sumAllRowsToDouble
public double[] sumAllRowsToDouble(int nrColumns)
Description copied from interface: IDictionary
Method used as a pre-aggregate of each tuple in the dictionary, to single double values. Note that if the number of columns is one, the dictionary's actual values are simply returned.
- Specified by:
sumAllRowsToDouble in interface IDictionary
- Overrides:
sumAllRowsToDouble in class ADictionary
- Parameters:
nrColumns - The number of columns in the ColGroup, needed to know how to get the values from the dictionary.
- Returns:
A double array containing the row sums from this dictionary.
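A per-tuple pre-aggregate of this kind can be sketched as row sums over a row-major array. The layout is an assumption, and `RowSumSketch` is a hypothetical name.

```java
import java.util.Arrays;

// Hypothetical stand-in, not the SystemDS implementation.
class RowSumSketch {

    // Assumption: values are row-major with nrColumns entries per tuple.
    static double[] sumAllRows(double[] values, int nrColumns) {
        int nRows = values.length / nrColumns;
        double[] sums = new double[nRows];
        for (int r = 0; r < nRows; r++)
            for (int c = 0; c < nrColumns; c++)
                sums[r] += values[r * nrColumns + c];
        return sums;
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 4, 5, 6};
        System.out.println(Arrays.toString(sumAllRows(values, 3))); // prints [6.0, 15.0]
    }
}
```

With a single column, each row sum equals the value itself, which matches the note that the dictionary values are simply returned in that case.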
-
sumAllRowsToDoubleSq
public double[] sumAllRowsToDoubleSq(int nrColumns)
Description copied from interface: IDictionary
Method used as a pre-aggregate of each tuple in the dictionary, to single double values. Note that if the number of columns is one, the dictionary's actual values are simply returned.
- Specified by:
sumAllRowsToDoubleSq in interface IDictionary
- Overrides:
sumAllRowsToDoubleSq in class ADictionary
- Parameters:
nrColumns - The number of columns in the ColGroup, needed to know how to get the values from the dictionary.
- Returns:
A double array containing the row sums from this dictionary.
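The description here is copied from the interface and does not say how this method differs from sumAllRowsToDouble. The sketch below assumes, based only on the `Sq` suffix, that each value is squared before being added to its row total. Both that reading and the row-major layout are assumptions; `RowSumSqSketch` is a hypothetical name.

```java
import java.util.Arrays;

// Hypothetical stand-in, not the SystemDS implementation.
class RowSumSqSketch {

    // Assumption: "Sq" means the sum of squared values per tuple.
    static double[] sumAllRowsSq(double[] values, int nrColumns) {
        int nRows = values.length / nrColumns;
        double[] sums = new double[nRows];
        for (int r = 0; r < nRows; r++) {
            for (int c = 0; c < nrColumns; c++) {
                double v = values[r * nrColumns + c];
                sums[r] += v * v;
            }
        }
        return sums;
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 4, 5, 6};
        System.out.println(Arrays.toString(sumAllRowsSq(values, 3))); // prints [14.0, 77.0]
    }
}
```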
-
getString
public String getString(int colIndexes)
Description copied from interface: IDictionary
Get a string representation of the dictionary that considers the layout of the data.
- Parameters:
colIndexes - The number of columns in the dictionary.
- Returns:
A string that is nicer to print.
-
sliceOutColumnRange
public IDictionary sliceOutColumnRange(int idxStart, int idxEnd, int previousNumberOfColumns)
Description copied from interface: IDictionary
Modify the dictionary by removing columns not within the index range.
- Specified by:
sliceOutColumnRange in interface IDictionary
- Overrides:
sliceOutColumnRange in class ADictionary
- Parameters:
idxStart - The column index to start at.
idxEnd - The column index to end at (not inclusive)
previousNumberOfColumns - The number of columns contained in the dictionary.
- Returns:
A dictionary containing only the values of the sliced out columns.
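The half-open range semantics (start inclusive, end exclusive) can be sketched on a row-major array. This version returns a new array and leaves the input untouched; whether the real method mutates or copies is not clear from this page. `SliceColumnsSketch` is a hypothetical name.

```java
import java.util.Arrays;

// Hypothetical stand-in, not the SystemDS implementation.
class SliceColumnsSketch {

    // Keeps columns idxStart (inclusive) to idxEnd (exclusive) of every tuple.
    static double[] slice(double[] values, int idxStart, int idxEnd, int previousNumberOfColumns) {
        int nRows = values.length / previousNumberOfColumns;
        int newCols = idxEnd - idxStart;
        double[] out = new double[nRows * newCols];
        for (int r = 0; r < nRows; r++)
            System.arraycopy(values, r * previousNumberOfColumns + idxStart, out, r * newCols, newCols);
        return out;
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 4, 5, 6};
        // Keep columns 1 and 2 of a three column dictionary.
        System.out.println(Arrays.toString(slice(values, 1, 3, 3))); // prints [2.0, 3.0, 5.0, 6.0]
    }
}
```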
-
getNumberNonZeros
public long getNumberNonZeros(int[] counts, int nCol)
Description copied from interface: IDictionary
Calculate the number of non zeros in the dictionary. The number of non zeros should be scaled with the counts given. This gives the exact number of non zero values in the parent column group.
- Parameters:
counts - The counts of each dictionary entry
nCol - The number of columns in this dictionary
- Returns:
The non zero count
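Scaling by the counts can be sketched as follows: each tuple contributes its own number of non zero cells, multiplied by how often that tuple occurs in the column group. The row-major layout is an assumption, and `NonZeroCountSketch` is a hypothetical name.

```java
// Hypothetical stand-in, not the SystemDS implementation.
class NonZeroCountSketch {

    // counts[r] is how many rows of the column group map to tuple r.
    static long numberNonZeros(double[] values, int[] counts, int nCol) {
        int nRows = values.length / nCol;
        long nnz = 0;
        for (int r = 0; r < nRows; r++) {
            int rowNnz = 0;
            for (int c = 0; c < nCol; c++)
                if (values[r * nCol + c] != 0)
                    rowNnz++;
            nnz += (long) rowNnz * counts[r];
        }
        return nnz;
    }

    public static void main(String[] args) {
        // Tuples (1, 0, 0) and (2, 3, 0), occurring 10 and 5 times.
        double[] values = {1, 0, 0, 2, 3, 0};
        System.out.println(numberNonZeros(values, new int[] {10, 5}, 3)); // prints 20
    }
}
```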
-
countNNZZeroColumns
public int[] countNNZZeroColumns(int[] counts)
Description copied from interface: IDictionary
Count the number of non zero values in each column of the dictionary, multiplied with the counts.
- Specified by:
countNNZZeroColumns in interface IDictionary
- Overrides:
countNNZZeroColumns in class ADictionary
- Parameters:
counts - The counts to multiply with.
- Returns:
The non zero count of each column in the dictionary.
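The per-column variant can be sketched in the same way. Unlike the documented method, which takes only `counts`, this sketch passes the column count explicitly to stay self-contained; the row-major layout is again an assumption, and `ColumnNonZeroSketch` is a hypothetical name.

```java
import java.util.Arrays;

// Hypothetical stand-in, not the SystemDS implementation.
class ColumnNonZeroSketch {

    // For each column, adds counts[r] for every tuple r with a non zero in that column.
    static int[] countPerColumn(double[] values, int[] counts, int nCol) {
        int nRows = values.length / nCol;
        int[] out = new int[nCol];
        for (int r = 0; r < nRows; r++)
            for (int c = 0; c < nCol; c++)
                if (values[r * nCol + c] != 0)
                    out[c] += counts[r];
        return out;
    }

    public static void main(String[] args) {
        // Tuples (1, 0, 0) and (2, 3, 0), occurring 10 and 5 times.
        double[] values = {1, 0, 0, 2, 3, 0};
        System.out.println(Arrays.toString(countPerColumn(values, new int[] {10, 5}, 3))); // prints [15, 5, 0]
    }
}
```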
-
getDictType
public IDictionary.DictType getDictType()
Description copied from interface: IDictionary
Get the type of this dictionary.
- Returns:
The dictionary type.
-
getSparsity
public double getSparsity()
Description copied from interface: IDictionary
Get the sparsity of the dictionary.
- Specified by:
getSparsity in interface IDictionary
- Overrides:
getSparsity in class ADictionary
- Returns:
A sparsity between 0 and 1
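One common definition of sparsity that yields a value between 0 and 1 is the fraction of non zero cells. The sketch below uses that definition as an assumption; this page does not say how QDictionary computes its result. `SparsitySketch` is a hypothetical name.

```java
// Hypothetical stand-in, not the SystemDS implementation.
class SparsitySketch {

    // Assumption: sparsity = non zero cells / total cells.
    static double sparsity(double[] values) {
        if (values.length == 0)
            return 0.0; // avoid dividing by zero for an empty dictionary
        int nnz = 0;
        for (double v : values)
            if (v != 0)
                nnz++;
        return (double) nnz / values.length;
    }

    public static void main(String[] args) {
        System.out.println(sparsity(new double[] {1, 0, 0, 2})); // prints 0.5
    }
}
```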
-
equals
public boolean equals(IDictionary o)
Description copied from interface: IDictionary
Indicate if the other dictionary is equal to this.
- Parameters:
o - The other dictionary
- Returns:
True if it is equal
-
getMBDict
public MatrixBlockDictionary getMBDict()
- Overrides:
getMBDict in class ADictionary
-
createMBDict
public MatrixBlockDictionary createMBDict(int nCol)
- Specified by:
createMBDict in class ACachingMBDictionary
-
-