Class ColGroupSDC
- All Implemented Interfaces:
Serializable, Cloneable
public class ColGroupSDC extends AMorphingMMColGroup
Column group that sparsely encodes the dictionary values. All values are encoded with indexes except the most common one, which is inferred for every row that is not included in the indexes. This column group is handy when sparse-unsafe operations are executed on very sparse columns: the zeros can then be materialized in the group without any overhead.
- See Also:
Serialized Form
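The encoding idea described above can be sketched as follows. This is a minimal, hypothetical illustration and not the SystemDS implementation: the class `SdcSketch` and all of its members are invented for this example, and it stores the exception values directly instead of going through a dictionary and index mapping.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not the SystemDS ColGroupSDC implementation. */
final class SdcSketch {
    final double defaultValue; // most common value, stored once
    final int[] indexes;       // sorted row positions whose value differs from the default
    final double[] values;     // value stored at each of those positions

    SdcSketch(double defaultValue, int[] indexes, double[] values) {
        this.defaultValue = defaultValue;
        this.indexes = indexes;
        this.values = values;
    }

    /** Encode one column: find the most common value and keep only the exceptions. */
    static SdcSketch encode(double[] column) {
        Map<Double, Integer> counts = new HashMap<>();
        for (double v : column)
            counts.merge(v, 1, Integer::sum);

        double def = 0.0;
        int best = 0;
        for (Map.Entry<Double, Integer> e : counts.entrySet()) {
            if (e.getValue() > best) {
                best = e.getValue();
                def = e.getKey();
            }
        }

        int n = 0;
        for (double v : column)
            if (Double.compare(v, def) != 0)
                n++;

        int[] idx = new int[n];
        double[] val = new double[n];
        int k = 0;
        for (int r = 0; r < column.length; r++) {
            if (Double.compare(column[r], def) != 0) {
                idx[k] = r;
                val[k++] = column[r];
            }
        }
        return new SdcSketch(def, idx, val);
    }

    /** A row that is absent from the index list holds the default value. */
    double get(int row) {
        int pos = Arrays.binarySearch(indexes, row);
        return pos >= 0 ? values[pos] : defaultValue;
    }
}
```

For a column such as `{0, 0, 5, 0, 0, 7, 0}` the sketch stores the default `0` once and only two index/value pairs, so the storage cost follows the number of exceptions rather than the number of rows.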
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.sysds.runtime.compress.colgroup.AColGroup
AColGroup.CompressionType
-
-
Method Summary
Modifier and Type | Method | Description
AColGroup | binaryRowOpLeft(BinaryOperator op, double[] v, boolean isRowSafe) | Perform a binary row operation.
AColGroup | binaryRowOpRight(BinaryOperator op, double[] v, boolean isRowSafe) | Perform a binary row operation.
CM_COV_Object | centralMoment(CMOperator op, int nRows) | Central Moment instruction executed on a column group.
void | computeColSums(double[] c, int nRows) | Compute the column sum.
boolean | containsValue(double pattern) | Detect if the column group contains a specific value.
long | estimateInMemorySize() | Get the upper bound estimate of in memory allocation for the column group.
AColGroup | extractCommon(double[] constV) |
org.apache.sysds.runtime.compress.colgroup.AColGroup.ColGroupType | getColGroupType() |
AColGroup.CompressionType | getCompType() | Obtain the compression type.
double | getCost(ComputationCostEstimator e, int nRows) | Get the computation cost associated with this column group.
int[] | getCounts(int[] counts) |
ADictionary | getDictionary() |
long | getExactSizeOnDisk() | Returns the exact serialized size of column group.
double | getIdx(int r, int colIdx) | Get the value at a colGroup specific row/column index position.
long | getNumberNonZeros(int nRows) | Get the number of nonZeros contained in this column group.
void | readFields(DataInput in) | Deserialize column group from data input.
AColGroup | replace(double pattern, double replace) | Make a copy of the column group values, and replace all values that match pattern with replacement value.
AColGroup | rexpandCols(int max, boolean ignore, boolean cast, int nRows) | Expand the column group to multiple columns.
AColGroup | scalarOperation(ScalarOperator op) | Perform the specified scalar operation directly on the compressed column group, without decompressing individual cells if possible.
AColGroup | subtractDefaultTuple() |
String | toString() |
AColGroup | unaryOperation(UnaryOperator op) |
void | write(DataOutput out) | Serializes column group to data output.
-
Methods inherited from class org.apache.sysds.runtime.compress.colgroup.AMorphingMMColGroup
leftMultByAColGroup, leftMultByMatrixNoPreAgg, tsmmAColGroup
-
Methods inherited from class org.apache.sysds.runtime.compress.colgroup.AColGroupValue
copy, decompressToDenseBlock, decompressToSparseBlock, forceMatrixBlockDictionary, getCachedCounts, getCounts, getNumValues, rightMultByMatrix
-
Methods inherited from class org.apache.sysds.runtime.compress.colgroup.AColGroupCompressed
getMax, getMin, preAggRows, tsmm, unaryAggregateOperations, unaryAggregateOperations
-
Methods inherited from class org.apache.sysds.runtime.compress.colgroup.AColGroup
colSum, decompressToDenseBlock, decompressToSparseBlock, get, getColIndices, getNumCols, shiftColIndices, sliceColumn, sliceColumns
-
Method Detail
-
getCompType
public AColGroup.CompressionType getCompType()
Description copied from class: AColGroup
Obtain the compression type.
- Specified by:
getCompType in class AColGroup
- Returns:
How the elements of the column group are compressed.
-
getColGroupType
public org.apache.sysds.runtime.compress.colgroup.AColGroup.ColGroupType getColGroupType()
-
getIdx
public double getIdx(int r, int colIdx)
Description copied from class: AColGroup
Get the value at a colGroup specific row/column index position.
-
getDictionary
public ADictionary getDictionary()
- Overrides:
getDictionary in class AColGroupValue
-
computeColSums
public void computeColSums(double[] c, int nRows)
Description copied from class: AColGroup
Compute the column sum.
- Overrides:
computeColSums in class AColGroupValue
- Parameters:
c - The array to add the column sum to.
nRows - The number of rows in the column group.
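The `nRows` parameter matters for a sparsely encoded group because rows holding the most common value are not stored. A minimal sketch of how a column sum could be derived under that assumption is shown here; the class `SdcColSumSketch`, its method, and the single-column simplification are hypothetical and not taken from SystemDS.

```java
/** Hypothetical illustration only; not the SystemDS implementation. */
final class SdcColSumSketch {
    /**
     * Adds the sum of one sparsely encoded column to c[colIndex].
     * Rows not listed as exceptions are assumed to hold defaultValue,
     * so their count is nRows minus the number of exceptions.
     */
    static void computeColSum(double[] c, int colIndex, double defaultValue,
                              double[] exceptionValues, int nRows) {
        double sum = defaultValue * (nRows - exceptionValues.length);
        for (double v : exceptionValues)
            sum += v;
        c[colIndex] += sum; // accumulate, matching "the array to add the column sum to"
    }
}
```

The result is added to the existing content of `c` rather than overwriting it, so several column groups can contribute to the same output array.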
-
getCounts
public int[] getCounts(int[] counts)
- Specified by:
getCounts in class AColGroupValue
-
getNumberNonZeros
public long getNumberNonZeros(int nRows)
Description copied from class: AColGroup
Get the number of nonZeros contained in this column group.
- Overrides:
getNumberNonZeros in class AColGroupValue
- Parameters:
nRows - The number of rows in the column group; this is used for groups that do not contain information about how many rows they have.
- Returns:
The nnz.
-
estimateInMemorySize
public long estimateInMemorySize()
Description copied from class: AColGroup
Get the upper bound estimate of in memory allocation for the column group.
- Overrides:
estimateInMemorySize in class AColGroupValue
- Returns:
An upper bound on the number of bytes used to store this ColGroup in memory.
-
scalarOperation
public AColGroup scalarOperation(ScalarOperator op)
Description copied from class: AColGroup
Perform the specified scalar operation directly on the compressed column group, without decompressing individual cells if possible.
- Specified by:
scalarOperation in class AColGroup
- Parameters:
op - operation to perform
- Returns:
version of this column group with the operation applied
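Operating on the compressed form can be pictured as applying the operator to the distinct values and the default value only, while the row indexes stay untouched. The sketch below assumes that layout; `ScalarOpSketch` and its fields are hypothetical names, and `DoubleUnaryOperator` stands in for the SystemDS `ScalarOperator`.

```java
import java.util.function.DoubleUnaryOperator;

/** Hypothetical illustration only; not the SystemDS implementation. */
final class ScalarOpSketch {
    final double defaultValue; // value of every row not listed in indexes
    final double[] dictionary; // distinct non-default values
    final int[] indexes;       // row positions of the non-default values

    ScalarOpSketch(double defaultValue, double[] dictionary, int[] indexes) {
        this.defaultValue = defaultValue;
        this.dictionary = dictionary;
        this.indexes = indexes;
    }

    /** Apply op to the default and to each distinct value; no per-row work is needed. */
    ScalarOpSketch scalarOperation(DoubleUnaryOperator op) {
        double[] d = new double[dictionary.length];
        for (int i = 0; i < d.length; i++)
            d[i] = op.applyAsDouble(dictionary[i]);
        // The index structure is reused as-is; only the values change.
        return new ScalarOpSketch(op.applyAsDouble(defaultValue), d, indexes);
    }
}
```

With a default of `0` and an operation such as `x -> x + 1`, the zeros become ones by changing a single stored value, which is the sparse-unsafe case the class description refers to.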
-
unaryOperation
public AColGroup unaryOperation(UnaryOperator op)
- Specified by:
unaryOperation in class AColGroup
-
binaryRowOpLeft
public AColGroup binaryRowOpLeft(BinaryOperator op, double[] v, boolean isRowSafe)
Description copied from class: AColGroup
Perform a binary row operation.
- Specified by:
binaryRowOpLeft in class AColGroup
- Parameters:
op - The operation to execute
v - The vector of values to apply; should be the same length as the dictionary length.
isRowSafe - True if the binary op is applied to an entire zero row and all results are zero
- Returns:
An updated column group with the new values.
-
binaryRowOpRight
public AColGroup binaryRowOpRight(BinaryOperator op, double[] v, boolean isRowSafe)
Description copied from class: AColGroup
Perform a binary row operation.
- Specified by:
binaryRowOpRight in class AColGroup
- Parameters:
op - The operation to execute
v - The vector of values to apply; should be the same length as the dictionary length.
isRowSafe - True if the binary op is applied to an entire zero row and all results are zero
- Returns:
An updated column group with the new values.
-
write
public void write(DataOutput out) throws IOException
Description copied from class: AColGroup
Serializes column group to data output.
- Overrides:
write in class AColGroupValue
- Parameters:
out - data output
- Throws:
IOException - if IOException occurs
-
readFields
public void readFields(DataInput in) throws IOException
Description copied from class: AColGroup
Deserialize column group from data input.
- Overrides:
readFields in class AColGroupValue
- Parameters:
in - data input
- Throws:
IOException - if IOException occurs
-
getExactSizeOnDisk
public long getExactSizeOnDisk()
Description copied from class: AColGroup
Returns the exact serialized size of column group. This can be used for example for buffer preallocation.
- Overrides:
getExactSizeOnDisk in class AColGroupValue
- Returns:
exact serialized size for column group
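The buffer preallocation use case can be sketched with the standard `java.io` classes. The class `ExactSizeSketch`, its toy wire format (an `int` length followed by the values), and both methods are hypothetical stand-ins; they are not the `ColGroupSDC` serialization format.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical illustration only; the wire format here is invented for the example. */
final class ExactSizeSketch {
    /** Exact number of bytes that serialize() writes: one int header plus the doubles. */
    static long exactSizeOnDisk(double[] values) {
        return Integer.BYTES + (long) values.length * Double.BYTES;
    }

    /** Serialize into a buffer that is allocated once at its final size. */
    static byte[] serialize(double[] values) {
        int size = Math.toIntExact(exactSizeOnDisk(values));
        ByteArrayOutputStream buffer = new ByteArrayOutputStream(size);
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(values.length);
            for (double v : values)
                out.writeDouble(v);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

Because the size is exact rather than an estimate, the backing array never has to grow while writing, and the number of bytes written can be checked against the reported size.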
-
replace
public AColGroup replace(double pattern, double replace)
Description copied from class: AColGroup
Make a copy of the column group values, and replace all values that match pattern with replacement value.
- Overrides:
replace in class AColGroupValue
- Parameters:
pattern - The value to look for
replace - The value to replace the other value with
- Returns:
A new Column Group, reusing the index structure but with new values.
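A minimal sketch of "reusing the index structure but with new values" is given below. `ReplaceSketch` and its fields are hypothetical, and treating a `NaN` pattern as matching `NaN` values is an assumption of this example, not a statement about SystemDS behavior.

```java
/** Hypothetical illustration only; not the SystemDS implementation. */
final class ReplaceSketch {
    final double[] dictionary; // distinct values
    final int[] mapping;       // per-row position into the dictionary

    ReplaceSketch(double[] dictionary, int[] mapping) {
        this.dictionary = dictionary;
        this.mapping = mapping;
    }

    /** Copy the values, swap matching entries, and share the mapping with the original. */
    ReplaceSketch replace(double pattern, double replacement) {
        double[] d = dictionary.clone();
        for (int i = 0; i < d.length; i++) {
            // Assumption: a NaN pattern matches NaN entries, which == alone would never do.
            boolean match = d[i] == pattern || (Double.isNaN(d[i]) && Double.isNaN(pattern));
            if (match)
                d[i] = replacement;
        }
        return new ReplaceSketch(d, mapping); // same mapping instance, new values
    }
}
```

The cost of the operation depends on the number of distinct values, not on the number of rows, since the per-row mapping is never rewritten.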
-
extractCommon
public AColGroup extractCommon(double[] constV)
- Specified by:
extractCommon in class AMorphingMMColGroup
-
subtractDefaultTuple
public AColGroup subtractDefaultTuple()
-
centralMoment
public CM_COV_Object centralMoment(CMOperator op, int nRows)
Description copied from class: AColGroup
Central Moment instruction executed on a column group.
- Overrides:
centralMoment in class AColGroupValue
- Parameters:
op - The Operator to use.
nRows - The number of rows contained in the ColumnGroup.
- Returns:
A Central Moment object.
-
rexpandCols
public AColGroup rexpandCols(int max, boolean ignore, boolean cast, int nRows)
Description copied from class: AColGroup
Expand the column group to multiple columns (one-hot encode the column group).
- Overrides:
rexpandCols in class AColGroupValue
- Parameters:
max - The number of columns to expand to and to cut off values at.
ignore - Whether zero and negative values should be ignored.
cast - Whether the double values contained should be cast to whole numbers.
nRows - The number of rows in the column group.
- Returns:
A new column group containing max number of columns.
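The one-hot expansion can be sketched on a plain array. Everything here is a hypothetical illustration: `RexpandSketch` produces a dense `double[][]` instead of a column group, and the choices to floor values when `cast` is set, to skip values above `max`, and to throw on invalid input are assumptions made for the example rather than documented SystemDS behavior.

```java
/** Hypothetical illustration only; error handling and rounding are assumptions. */
final class RexpandSketch {
    /** Row r gets a 1 in column (value - 1); values are treated as 1-based categories. */
    static double[][] rexpand(double[] column, int max, boolean ignore, boolean cast) {
        double[][] out = new double[column.length][max];
        for (int r = 0; r < column.length; r++) {
            double v = cast ? Math.floor(column[r]) : column[r];
            if (v != Math.rint(v)) // also true for NaN
                throw new IllegalArgumentException("non-integer value at row " + r);
            if (v <= 0) {
                if (ignore)
                    continue; // leave the row all zeros
                throw new IllegalArgumentException("zero or negative value at row " + r);
            }
            if (v > max)
                continue; // cut off: categories beyond max produce an empty row
            out[r][(int) v - 1] = 1.0;
        }
        return out;
    }
}
```

For example, with `max = 3` the value `3` sets the third column of its row, while `5` exceeds the cutoff and leaves its row empty.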
-
getCost
public double getCost(ComputationCostEstimator e, int nRows)
Description copied from class: AColGroup
Get the computation cost associated with this column group.
-
containsValue
public boolean containsValue(double pattern)
Description copied from class: AColGroup
Detect if the column group contains a specific value.
- Overrides:
containsValue in class AColGroupValue
- Parameters:
pattern - The value to look for.
- Returns:
True if the value is contained.
-
toString
public String toString()
- Overrides:
toString in class AColGroupValue
-
-