Class AColGroupValue
- java.lang.Object
-
- org.apache.sysds.runtime.compress.colgroup.AColGroup
-
- org.apache.sysds.runtime.compress.colgroup.AColGroupCompressed
-
- org.apache.sysds.runtime.compress.colgroup.AColGroupValue
-
- All Implemented Interfaces:
Serializable, Cloneable
- Direct Known Subclasses:
AColGroupOffset, AMorphingMMColGroup, APreAgg
public abstract class AColGroupValue extends AColGroupCompressed implements Cloneable
Base class for column groups encoded with a value dictionary. This includes column groups such as DDC, OLE, and RLE.
- See Also:
- Serialized Form
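As a rough illustration of what "encoded with a value dictionary" means, the sketch below stores a single column as a dictionary of distinct values plus one dictionary position per row. The class `DictionaryEncodingSketch` and its fields are hypothetical names chosen for this example; they are not part of SystemDS, and the real column groups use more compact index structures.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical single-column dictionary encoding; not the SystemDS implementation. */
final class DictionaryEncodingSketch {
    final double[] dictionary; // distinct values, in first-seen order
    final int[] rowToEntry;    // for each row, the position of its value in the dictionary

    DictionaryEncodingSketch(double[] column) {
        Map<Double, Integer> seen = new HashMap<>();
        List<Double> distinct = new ArrayList<>();
        rowToEntry = new int[column.length];
        for (int r = 0; r < column.length; r++) {
            Integer id = seen.get(column[r]);
            if (id == null) {
                // First time this value appears: append it to the dictionary.
                id = distinct.size();
                seen.put(column[r], id);
                distinct.add(column[r]);
            }
            rowToEntry[r] = id;
        }
        dictionary = new double[distinct.size()];
        for (int i = 0; i < dictionary.length; i++)
            dictionary[i] = distinct.get(i);
    }

    /** Look up the original value of a row through the dictionary. */
    double get(int row) {
        return dictionary[rowToEntry[row]];
    }
}
```

A column with few distinct values then costs one dictionary entry per distinct value plus one small index per row.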
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.sysds.runtime.compress.colgroup.AColGroup
AColGroup.CompressionType
-
-
Method Summary
| Modifier and Type | Method | Description |
|---|---|---|
| CM_COV_Object | centralMoment(CMOperator op, int nRows) | Central Moment instruction executed on a column group. |
| void | computeColSums(double[] c, int nRows) | Compute the column sum. |
| boolean | containsValue(double pattern) | Detect if the column group contains a specific value. |
| AColGroupValue | copy() | Get a copy of this column group; note this is only a shallow copy. |
| void | decompressToDenseBlock(DenseBlock db, int rl, int ru, int offR, int offC) | Decompress into the DenseBlock. |
| void | decompressToSparseBlock(SparseBlock sb, int rl, int ru, int offR, int offC) | Decompress into the SparseBlock. |
| long | estimateInMemorySize() | Get the upper bound estimate of in memory allocation for the column group. |
| void | forceMatrixBlockDictionary() | |
| int[] | getCachedCounts() | Get the cached counts. |
| int[] | getCounts() | Returns the counts of values inside the dictionary. |
| abstract int[] | getCounts(int[] out) | |
| ADictionary | getDictionary() | |
| long | getExactSizeOnDisk() | Returns the exact serialized size of column group. |
| long | getNumberNonZeros(int nRows) | Get the number of nonZeros contained in this column group. |
| int | getNumValues() | Obtain number of distinct tuples in contained sets of values associated with this column group. |
| void | readFields(DataInput in) | Deserialize column group from data input. |
| AColGroup | replace(double pattern, double replace) | Make a copy of the column group values, and replace all values that match pattern with replacement value. |
| AColGroup | rexpandCols(int max, boolean ignore, boolean cast, int nRows) | Expand the column group to multiple columns. |
| AColGroup | rightMultByMatrix(MatrixBlock right) | Right matrix multiplication with this column group. |
| String | toString() | |
| void | write(DataOutput out) | Serializes column group to data output. |
-
Methods inherited from class org.apache.sysds.runtime.compress.colgroup.AColGroupCompressed
getMax, getMin, preAggRows, tsmm, unaryAggregateOperations, unaryAggregateOperations
-
Methods inherited from class org.apache.sysds.runtime.compress.colgroup.AColGroup
binaryRowOpLeft, binaryRowOpRight, colSum, decompressToDenseBlock, decompressToSparseBlock, get, getColIndices, getCompType, getCost, getIdx, getNumCols, leftMultByAColGroup, leftMultByMatrixNoPreAgg, scalarOperation, shiftColIndices, sliceColumn, sliceColumns, tsmmAColGroup, unaryOperation
-
-
-
-
Method Detail
-
decompressToDenseBlock
public final void decompressToDenseBlock(DenseBlock db, int rl, int ru, int offR, int offC)
Description copied from class: AColGroup
Decompress into the DenseBlock. (no NNZ handling)
- Specified by:
decompressToDenseBlock in class AColGroup
- Parameters:
db - Target DenseBlock
rl - Row to start decompression from
ru - Row to end decompression at
offR - Row offset into the target to decompress
offC - Column offset into the target to decompress
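The way the row range and the two offsets interact can be sketched on plain arrays. This is a minimal model, not the SystemDS code: `DecompressSketch`, the row-major `double[][]` target, and the `rowToTuple` index are hypothetical, and the sketch assumes that `ru` is exclusive and that values are added into the target rather than overwriting it.

```java
/** Hypothetical dictionary-backed group decompressing into a plain double[][] target. */
final class DecompressSketch {
    /**
     * @param target     dense target, indexed as target[row][column]
     * @param colIndexes columns of the original matrix covered by this group
     * @param dict       dictionary tuples, row-major, colIndexes.length values per tuple
     * @param rowToTuple for each row, the tuple it maps to
     * @param rl         first row to decompress
     * @param ru         row to stop at (assumed exclusive)
     * @param offR       row offset applied in the target
     * @param offC       column offset applied in the target
     */
    static void decompressToDense(double[][] target, int[] colIndexes, double[] dict,
            int[] rowToTuple, int rl, int ru, int offR, int offC) {
        final int nCol = colIndexes.length;
        for (int r = rl; r < ru; r++) {
            final int tupleStart = rowToTuple[r] * nCol;
            final double[] out = target[r + offR];
            for (int j = 0; j < nCol; j++)
                out[colIndexes[j] + offC] += dict[tupleStart + j]; // assumption: accumulate
        }
    }
}
```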
-
decompressToSparseBlock
public final void decompressToSparseBlock(SparseBlock sb, int rl, int ru, int offR, int offC)
Description copied from class: AColGroup
Decompress into the SparseBlock. (no NNZ handling) Note that this method is allowed to call append, since it is assumed that the sparse column indexes are sorted afterwards.
- Specified by:
decompressToSparseBlock in class AColGroup
- Parameters:
sb - Target SparseBlock
rl - Row to start decompression from
ru - Row to end decompression at
offR - Row offset into the target to decompress
offC - Column offset into the target to decompress
-
getNumValues
public int getNumValues()
Description copied from class: AColGroup
Obtain number of distinct tuples in contained sets of values associated with this column group. If the column group is uncompressed the number of rows is returned.
- Specified by:
getNumValues in class AColGroup
- Returns:
- the number of distinct sets of values associated with the bitmaps in this column group
-
getDictionary
public ADictionary getDictionary()
-
getCounts
public final int[] getCounts()
Returns the counts of values inside the dictionary. If the counts have already been calculated, the previous counts are returned. Calculating the counts adds overhead, but the result is limited in size to the number of distinct tuples in the dictionary. The returned counts always include the number of zero tuples as well, if there are any, even if they are not materialized.
- Returns:
- The count of each value in the MatrixBlock.
-
getCachedCounts
public final int[] getCachedCounts()
Get the cached counts. If they are not materialized or the garbage collector has removed them, then null is returned.
- Returns:
- The counts or null.
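The relationship between getCounts() and getCachedCounts() can be sketched with a softly referenced array, which the garbage collector may clear under memory pressure. The sketch assumes a `java.lang.ref.SoftReference` backs the cache; `CountsCacheSketch`, `rowToTuple`, and `computations` are hypothetical names introduced only for this example.

```java
import java.lang.ref.SoftReference;

/** Hypothetical counts cache; assumes a SoftReference holds the cached array. */
final class CountsCacheSketch {
    private final int[] rowToTuple; // for each row, the dictionary tuple it maps to
    private final int numValues;    // number of distinct tuples in the dictionary
    private SoftReference<int[]> cachedCounts = null;
    int computations = 0;           // how often the counts were actually recomputed

    CountsCacheSketch(int[] rowToTuple, int numValues) {
        this.rowToTuple = rowToTuple;
        this.numValues = numValues;
    }

    /** Return the cached counts if still available, otherwise compute and cache them. */
    int[] getCounts() {
        int[] counts = getCachedCounts();
        if (counts == null) {
            counts = new int[numValues];
            for (int id : rowToTuple)
                counts[id]++;
            computations++;
            cachedCounts = new SoftReference<>(counts);
        }
        return counts;
    }

    /** Return the cached counts, or null if never computed or already collected. */
    int[] getCachedCounts() {
        return cachedCounts == null ? null : cachedCounts.get();
    }
}
```

Callers that can tolerate a missing result would check getCachedCounts() for null; callers that need the counts would call getCounts() and accept the occasional recomputation.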
-
readFields
public void readFields(DataInput in) throws IOException
Description copied from class: AColGroup
Deserialize column group from data input.
- Overrides:
readFields in class AColGroup
- Parameters:
in - data input
- Throws:
IOException - if IOException occurs
-
write
public void write(DataOutput out) throws IOException
Description copied from class: AColGroup
Serializes column group to data output.
- Overrides:
write in class AColGroup
- Parameters:
out - data output
- Throws:
IOException - if IOException occurs
-
getExactSizeOnDisk
public long getExactSizeOnDisk()
Description copied from class: AColGroup
Returns the exact serialized size of column group. This can be used for example for buffer preallocation.
- Overrides:
getExactSizeOnDisk in class AColGroup
- Returns:
- exact serialized size for column group
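The buffer preallocation use case can be sketched with the standard `java.io` classes. The layout below (an int length followed by the dictionary doubles) is an assumption made for illustration and is not the SystemDS serialization format; `ExactSizeSketch` and `serialize` are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical group whose serialized form is an int length plus its dictionary values. */
final class ExactSizeSketch {
    private final double[] dictionary;

    ExactSizeSketch(double[] dictionary) {
        this.dictionary = dictionary;
    }

    /** Exact number of bytes that write(DataOutput) produces. */
    long getExactSizeOnDisk() {
        return Integer.BYTES + (long) Double.BYTES * dictionary.length;
    }

    void write(DataOutput out) throws IOException {
        out.writeInt(dictionary.length);
        for (double v : dictionary)
            out.writeDouble(v);
    }

    /** Serialize into a buffer sized up front, so it never has to grow. */
    byte[] serialize() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream((int) getExactSizeOnDisk());
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```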
-
getCounts
public abstract int[] getCounts(int[] out)
-
computeColSums
public void computeColSums(double[] c, int nRows)
Description copied from class: AColGroup
Compute the column sum.
- Specified by:
computeColSums in class AColGroup
- Parameters:
c - The array to add the column sum to.
nRows - The number of rows in the column group.
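For a dictionary-encoded group, column sums can be derived from the dictionary and the per-tuple counts without visiting every row. The sketch below assumes a row-major dictionary and precomputed counts; `ColSumSketch` and its parameters are hypothetical, and it adds into `c` as the parameter description suggests.

```java
/** Hypothetical column-sum computation from dictionary tuples and their counts. */
final class ColSumSketch {
    /**
     * @param c          output array over all matrix columns; sums are added to it
     * @param colIndexes columns of the original matrix covered by this group
     * @param dict       dictionary tuples, row-major, colIndexes.length values per tuple
     * @param counts     number of rows mapped to each tuple
     */
    static void computeColSums(double[] c, int[] colIndexes, double[] dict, int[] counts) {
        final int nCol = colIndexes.length;
        for (int k = 0; k < counts.length; k++)
            for (int j = 0; j < nCol; j++)
                c[colIndexes[j]] += counts[k] * dict[k * nCol + j];
    }
}
```

The cost depends on the number of distinct tuples rather than on the number of rows.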
-
copy
public AColGroupValue copy()
Description copied from class: AColGroup
Get a copy of this column group. Note that this is only a shallow copy: only the object wrapping the index structures, column indexes, and dictionaries is copied.
-
containsValue
public boolean containsValue(double pattern)
Description copied from class: AColGroup
Detect if the column group contains a specific value.
- Specified by:
containsValue in class AColGroup
- Parameters:
pattern - The value to look for.
- Returns:
- true if the value is contained.
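With a value dictionary, a membership test only needs to scan the distinct values. The sketch below is a simplified model under that assumption; `ContainsValueSketch` is a hypothetical name, NaN is matched explicitly because `NaN == NaN` is false in Java, and zeros that are implied but not stored in the dictionary are not covered.

```java
/** Hypothetical value lookup over the materialized dictionary values only. */
final class ContainsValueSketch {
    static boolean containsValue(double[] dict, double pattern) {
        final boolean lookForNaN = Double.isNaN(pattern);
        for (double v : dict) {
            // NaN never equals itself, so it needs its own comparison.
            if (lookForNaN ? Double.isNaN(v) : v == pattern)
                return true;
        }
        return false;
    }
}
```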
-
getNumberNonZeros
public long getNumberNonZeros(int nRows)
Description copied from class: AColGroup
Get the number of nonZeros contained in this column group.
- Specified by:
getNumberNonZeros in class AColGroup
- Parameters:
nRows - The number of rows in the column group. This is used for groups that do not contain information about how many rows they have.
- Returns:
- The nnz.
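One way to obtain the number of nonzeros from a dictionary is to count the nonzero cells of each tuple once and weight that by how many rows use the tuple. This is a sketch under that assumption, with hypothetical names; it does not use `nRows`, which a real group might need when zero tuples are not materialized.

```java
/** Hypothetical nonzero count derived from dictionary tuples and their counts. */
final class NnzSketch {
    /**
     * @param dict   dictionary tuples, row-major, nCol values per tuple
     * @param nCol   number of columns in the group
     * @param counts number of rows mapped to each tuple
     */
    static long getNumberNonZeros(double[] dict, int nCol, int[] counts) {
        long nnz = 0;
        for (int k = 0; k < counts.length; k++) {
            int nnzInTuple = 0;
            for (int j = 0; j < nCol; j++)
                if (dict[k * nCol + j] != 0)
                    nnzInTuple++;
            nnz += (long) counts[k] * nnzInTuple;
        }
        return nnz;
    }
}
```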
-
forceMatrixBlockDictionary
public void forceMatrixBlockDictionary()
-
rightMultByMatrix
public final AColGroup rightMultByMatrix(MatrixBlock right)
Description copied from class: AColGroup
Right matrix multiplication with this column group. This method can return null, meaning that the output overlapping group would have been empty.
- Specified by:
rightMultByMatrix in class AColGroup
- Parameters:
right - The MatrixBlock on the right of this matrix multiplication
- Returns:
- The new Column Group or null that is the result of the matrix multiplication.
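A common way to multiply a dictionary-encoded group from the right is to multiply only the dictionary tuples and keep the row-to-tuple mapping unchanged. The sketch below illustrates that idea on plain arrays, including the null result for an empty output; it is an assumption about the general technique, not a description of the SystemDS code, and `RightMultSketch` is a hypothetical name.

```java
/** Hypothetical right multiplication applied to the dictionary only. */
final class RightMultSketch {
    /**
     * @param dict       dictionary tuples, row-major, colIndexes.length values per tuple
     * @param colIndexes columns of the original matrix covered by this group
     * @param right      right-hand matrix, indexed as right[row][column]
     * @return new dictionary with right[0].length values per tuple, or null if all zero
     */
    static double[] rightMultDictionary(double[] dict, int[] colIndexes, double[][] right) {
        final int nCol = colIndexes.length;
        final int numVals = dict.length / nCol;
        final int outCols = right[0].length;
        final double[] out = new double[numVals * outCols];
        for (int k = 0; k < numVals; k++)
            for (int j = 0; j < nCol; j++) {
                final double v = dict[k * nCol + j];
                final double[] rightRow = right[colIndexes[j]]; // only rows for our columns
                for (int c = 0; c < outCols; c++)
                    out[k * outCols + c] += v * rightRow[c];
            }
        for (double v : out)
            if (v != 0)
                return out;
        return null; // the resulting group would be empty
    }
}
```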
-
estimateInMemorySize
public long estimateInMemorySize()
Description copied from class: AColGroup
Get the upper bound estimate of in memory allocation for the column group.
- Overrides:
estimateInMemorySize in class AColGroup
- Returns:
- an upper bound on the number of bytes used to store this ColGroup in memory.
-
replace
public AColGroup replace(double pattern, double replace)
Description copied from class: AColGroup
Make a copy of the column group values, and replace all values that match the pattern with the replacement value.
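Because the values live in the dictionary, a replace can be sketched as copying the dictionary and rewriting matching entries, leaving the original untouched. `ReplaceSketch` is a hypothetical name, and matching NaN patterns explicitly is an assumption of this example.

```java
/** Hypothetical replace that works on a copy of the dictionary values. */
final class ReplaceSketch {
    static double[] replace(double[] dict, double pattern, double replacement) {
        final double[] out = dict.clone(); // the original dictionary stays unchanged
        final boolean matchNaN = Double.isNaN(pattern);
        for (int i = 0; i < out.length; i++)
            if (matchNaN ? Double.isNaN(out[i]) : out[i] == pattern)
                out[i] = replacement;
        return out;
    }
}
```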
-
centralMoment
public CM_COV_Object centralMoment(CMOperator op, int nRows)
Description copied from class: AColGroup
Central Moment instruction executed on a column group.
- Specified by:
centralMoment in class AColGroup
- Parameters:
op - The Operator to use.
nRows - The number of rows contained in the ColumnGroup.
- Returns:
- A Central Moment object.
-
rexpandCols
public AColGroup rexpandCols(int max, boolean ignore, boolean cast, int nRows)
Description copied from class: AColGroup
Expand the column group to multiple columns. (one hot encode the column group)
- Specified by:
rexpandCols in class AColGroup
- Parameters:
max - The number of columns to expand to and cutoff values at.
ignore - If zero and negative values should be ignored.
cast - If the double values contained should be cast to whole numbers.
nRows - The number of rows in the column group.
- Returns:
- A new column group containing max number of columns.
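The one-hot expansion can be sketched on an uncompressed column. The exact semantics here are assumptions made for illustration: a value v sets column v - 1, values above `max` are cut off and leave an all-zero row, zero or negative values are skipped when `ignore` is true and rejected otherwise, and non-integer values are truncated when `cast` is true and rejected otherwise. `RexpandSketch` is a hypothetical name.

```java
/** Hypothetical one-hot expansion of a single column; semantics are assumptions. */
final class RexpandSketch {
    static double[][] rexpandCols(double[] column, int max, boolean ignore, boolean cast) {
        final double[][] out = new double[column.length][max];
        for (int r = 0; r < column.length; r++) {
            double v = column[r];
            if (cast)
                v = (long) v; // truncate toward zero
            else if (v != Math.rint(v))
                throw new IllegalArgumentException("non-integer value in row " + r);
            if (v < 1) {
                if (!ignore)
                    throw new IllegalArgumentException("zero or negative value in row " + r);
                continue; // ignored: row stays all zero
            }
            if (v > max)
                continue; // cut off: row stays all zero
            out[r][(int) v - 1] = 1.0;
        }
        return out;
    }
}
```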
-
-