Class ColGroupDDC
- java.lang.Object
- org.apache.sysds.runtime.compress.colgroup.AColGroup
- org.apache.sysds.runtime.compress.colgroup.AColGroupCompressed
- org.apache.sysds.runtime.compress.colgroup.ADictBasedColGroup
- org.apache.sysds.runtime.compress.colgroup.AColGroupValue
- org.apache.sysds.runtime.compress.colgroup.APreAgg
- org.apache.sysds.runtime.compress.colgroup.ColGroupDDC
- All Implemented Interfaces:
Serializable, IContainADictionary, IMapToDataGroup
public class ColGroupDDC extends APreAgg implements IMapToDataGroup
Class to encapsulate information about a column group that is encoded with dense dictionary encoding (DDC).
- See Also:
- Serialized Form
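To make the idea of dense dictionary encoding concrete, the following is a minimal sketch, not the SystemDS implementation. It assumes a DDC group consists of a dictionary of distinct row tuples plus one dictionary id per row. The class `DdcLayoutSketch` and its fields are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in for a DDC column group; not the SystemDS class.
final class DdcLayoutSketch {
    final double[] dict; // distinct tuples, row-major: (number of tuples) x nCols
    final int[] map;     // map[r] = dictionary tuple id for row r
    final int nCols;

    DdcLayoutSketch(double[] dict, int[] map, int nCols) {
        this.dict = dict;
        this.map = map;
        this.nCols = nCols;
    }

    // Encode a row-major dense block by assigning one id per distinct row tuple.
    static DdcLayoutSketch compress(double[] dense, int nRows, int nCols) {
        Map<List<Double>, Integer> ids = new LinkedHashMap<>();
        int[] map = new int[nRows];
        for (int r = 0; r < nRows; r++) {
            List<Double> tuple = new ArrayList<>(nCols);
            for (int c = 0; c < nCols; c++)
                tuple.add(dense[r * nCols + c]);
            Integer id = ids.get(tuple);
            if (id == null) {
                id = ids.size();
                ids.put(tuple, id);
            }
            map[r] = id;
        }
        double[] dict = new double[ids.size() * nCols];
        for (Map.Entry<List<Double>, Integer> e : ids.entrySet())
            for (int c = 0; c < nCols; c++)
                dict[e.getValue() * nCols + c] = e.getKey().get(c);
        return new DdcLayoutSketch(dict, map, nCols);
    }

    // Rebuild the dense block by copying each row's tuple out of the dictionary.
    double[] decompress() {
        double[] out = new double[map.length * nCols];
        for (int r = 0; r < map.length; r++)
            System.arraycopy(dict, map[r] * nCols, out, r * nCols, nCols);
        return out;
    }
}
```

Because every row stores an id, this layout pays off when the number of distinct tuples is small relative to the number of rows.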
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.sysds.runtime.compress.colgroup.AColGroup
AColGroup.CompressionType
-
-
Method Summary
Modifier and Type | Method | Description
AColGroup | append(AColGroup g) | Append the other column group to this column group.
AColGroup | appendNInternal(AColGroup[] g, int blen, int rlen) |
AColGroup | binaryRowOpLeft(BinaryOperator op, double[] v, boolean isRowSafe) | Perform a binary row operation.
AColGroup | binaryRowOpRight(BinaryOperator op, double[] v, boolean isRowSafe) | Perform a binary row operation.
AColGroupCompressed | combineWithSameIndex(int nRow, int nCol, List<AColGroup> right) | Column-bind (cbind) the list of column groups with this column group.
AColGroupCompressed | combineWithSameIndex(int nRow, int nCol, AColGroup right) | Column-bind (cbind) the given column group to this.
boolean | containsValue(double pattern) | Detect if the column group contains a specific value.
static AColGroup | create(IColIndex colIndexes, IDictionary dict, AMapToData data, int[] cachedCounts) |
long | estimateInMemorySize() | Get the upper bound estimate of in-memory allocation for the column group.
org.apache.sysds.runtime.compress.colgroup.AColGroup.ColGroupType | getColGroupType() |
CompressedSizeInfoColGroup | getCompressionInfo(int nRow) | Get the compression info for this column group.
ICLAScheme | getCompressionScheme() | Get the compression scheme for this column group to enable compression of other data.
AColGroup.CompressionType | getCompType() | Obtain the compression type.
double | getCost(ComputationCostEstimator e, int nRows) | Get the computation cost associated with this column group.
int[] | getCounts(int[] counts) |
IEncode | getEncoding() | Get encoding of this column group.
long | getExactSizeOnDisk() | Returns the exact serialized size of column group.
double | getIdx(int r, int colIdx) | Get the value at a colGroup specific row/column index position.
AMapToData | getMapToData() |
void | leftMMIdentityPreAggregateDense(MatrixBlock that, MatrixBlock ret, int rl, int ru, int cl, int cu) |
void | leftMultByMatrixNoPreAgg(MatrixBlock matrix, MatrixBlock result, int rl, int ru, int cl, int cu) | Left multiply with this column group.
AColGroup | morph(AColGroup.CompressionType ct, int nRow) | Recompress this column group into a new column group of the given type.
void | preAggregateDense(MatrixBlock m, double[] preAgg, int rl, int ru, int cl, int cu) | Pre-aggregate a dense matrix block into a pre-aggregate target (first step of left matrix multiplication).
void | preAggregateSparse(SparseBlock sb, double[] preAgg, int rl, int ru, int cl, int cu) |
void | preAggregateThatDDCStructure(ColGroupDDC that, Dictionary ret) |
void | preAggregateThatSDCSingleZerosStructure(ColGroupSDCSingleZeros that, Dictionary ret) |
void | preAggregateThatSDCZerosStructure(ColGroupSDCZeros that, Dictionary ret) |
static ColGroupDDC | read(DataInput in) |
AColGroup | recompress() | Recompress this column group into a new column group.
void | rightDecompressingMult(MatrixBlock right, MatrixBlock ret, int rl, int ru, int nRows, int crl, int cru) | Right side matrix multiplication, iterating through this column group and adding to the ret.
boolean | sameIndexStructure(AColGroupCompressed that) |
AColGroup | scalarOperation(ScalarOperator op) | Perform the specified scalar operation directly on the compressed column group, without decompressing individual cells if possible.
AColGroup | sliceRows(int rl, int ru) | Slice a range of rows out of the column group and return a new column group containing only the row segment.
void | sparseSelection(MatrixBlock selection, ColGroupUtils.P[] points, MatrixBlock ret, int rl, int ru) |
AColGroup | sparsifyFOR() |
AColGroup[] | splitReshape(int multiplier, int nRow, int nColOrg) | This method returns a list of column groups that are naive splits of this column group as if it is reshaped.
AColGroup[] | splitReshapePushDown(int multiplier, int nRow, int nColOrg, ExecutorService pool) | This method returns a list of column groups that are naive splits of this column group as if it is reshaped.
String | toString() |
AColGroup | unaryOperation(UnaryOperator op) | Perform unary operation on the column group and return a new column group.
void | write(DataOutput out) |
-
Methods inherited from class org.apache.sysds.runtime.compress.colgroup.APreAgg
getPreAggregateSize, leftMultByAColGroup, mmWithDictionary, preAggregate, preAggregateThatIndexStructure, tsmmAColGroup
-
Methods inherited from class org.apache.sysds.runtime.compress.colgroup.AColGroupValue
centralMoment, clear, computeColSums, getCounts, getNumberNonZeros, getNumValues, replace, rexpandCols
-
Methods inherited from class org.apache.sysds.runtime.compress.colgroup.ADictBasedColGroup
copyAndSet, copyAndSet, decompressToDenseBlock, decompressToDenseBlockTransposed, decompressToSparseBlock, decompressToSparseBlockTransposed, getDictionary, getSparsity, reduceCols, rightMultByMatrix
-
Methods inherited from class org.apache.sysds.runtime.compress.colgroup.AColGroupCompressed
getMax, getMin, getSum, isEmpty, preAggRows, sameIndexStructure, tsmm, unaryAggregateOperations, unaryAggregateOperations
-
Methods inherited from class org.apache.sysds.runtime.compress.colgroup.AColGroup
addVector, appendN, colSum, combine, decompressToDenseBlock, decompressToSparseBlock, get, getColIndices, getNumCols, rightMultByMatrix, selectionMultiply, shiftColIndices, sliceColumn, sliceColumns, sortColumnIndexes
-
-
-
-
Method Detail
-
create
public static AColGroup create(IColIndex colIndexes, IDictionary dict, AMapToData data, int[] cachedCounts)
-
sparsifyFOR
public AColGroup sparsifyFOR()
-
getCompType
public AColGroup.CompressionType getCompType()
Description copied from class: AColGroup
Obtain the compression type.
- Specified by:
getCompType in class AColGroup
- Returns:
- How the elements of the column group are compressed.
-
getMapToData
public AMapToData getMapToData()
- Specified by:
getMapToData in interface IMapToDataGroup
-
getIdx
public double getIdx(int r, int colIdx)
Description copied from class: AColGroup
Get the value at a colGroup specific row/column index position.
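As a rough illustration of a lookup at a group-local position, the sketch below assumes a row-major dictionary of tuples and a per-row id array. The class `DdcGetIdxSketch` and its parameters are hypothetical and do not reflect the SystemDS `IDictionary` or `AMapToData` types.

```java
// Illustrative lookup in a DDC-style layout; not the SystemDS implementation.
final class DdcGetIdxSketch {
    // dict: row-major tuples (nVals x nCols); map: one tuple id per row.
    // colIdx is the column position inside the group, not the matrix column.
    static double getIdx(double[] dict, int[] map, int nCols, int r, int colIdx) {
        return dict[map[r] * nCols + colIdx];
    }
}
```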
-
getCounts
public int[] getCounts(int[] counts)
-
leftMultByMatrixNoPreAgg
public void leftMultByMatrixNoPreAgg(MatrixBlock matrix, MatrixBlock result, int rl, int ru, int cl, int cu)
Description copied from class: AColGroup
Left multiply with this column group.
- Specified by:
leftMultByMatrixNoPreAgg in class AColGroup
- Parameters:
matrix - The matrix to multiply with on the left
result - The result to output the values into, always dense for the purpose of the column groups parallelizing
rl - The row to begin the multiplication from on the lhs matrix
ru - The row to end the multiplication at on the lhs matrix
cl - The column to begin the multiplication from on the lhs matrix
cu - The column to end the multiplication at on the lhs matrix
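One way to picture a left multiplication without pre-aggregation is the sketch below. It assumes plain `double[][]` matrices, a row-major dictionary, and half-open ranges `[rl, ru)` and `[cl, cu)`; these are assumptions made for illustration, and `DdcLeftMultSketch` is a hypothetical name.

```java
// Illustrative left multiplication against a DDC-style group; not the SystemDS code.
final class DdcLeftMultSketch {
    // For each left row r and left column c (which is a row of the column group),
    // add left[r][c] times the group's tuple for that row into the result.
    static void leftMultNoPreAgg(double[][] left, double[][] result, double[] dict, int[] map,
            int[] colIndexes, int rl, int ru, int cl, int cu) {
        int nCols = colIndexes.length;
        for (int r = rl; r < ru; r++) {
            for (int c = cl; c < cu; c++) {
                double v = left[r][c];
                if (v == 0)
                    continue; // zero entries contribute nothing
                int off = map[c] * nCols;
                for (int j = 0; j < nCols; j++)
                    result[r][colIndexes[j]] += v * dict[off + j];
            }
        }
    }
}
```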
-
preAggregateDense
public void preAggregateDense(MatrixBlock m, double[] preAgg, int rl, int ru, int cl, int cu)
Description copied from class: APreAgg
Pre-aggregate a dense matrix block into a pre-aggregate target (first step of left matrix multiplication).
- Specified by:
preAggregateDense in class APreAgg
- Parameters:
m - The matrix to preAggregate
preAgg - The preAggregate target
rl - Row lower on the left side matrix
ru - Row upper on the left side matrix
cl - Column lower on the left side matrix (or row lower in the column group)
cu - Column upper on the left side matrix (or row upper in the column group)
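The two-step idea behind pre-aggregation can be sketched as follows: first sum the left matrix values per dictionary id, then multiply the much smaller pre-aggregate with the dictionary. The layout of `preAgg` (row-major, `(ru - rl) x nVals`), the half-open ranges, and the class `DdcPreAggSketch` are assumptions for illustration only.

```java
// Illustrative pre-aggregation for a DDC-style group; not the SystemDS code.
final class DdcPreAggSketch {
    // Step 1: sum left-matrix values by the dictionary id of the matching group row.
    static void preAggregateDense(double[][] m, double[] preAgg, int[] map, int nVals,
            int rl, int ru, int cl, int cu) {
        for (int r = rl; r < ru; r++) {
            int off = (r - rl) * nVals;
            for (int c = cl; c < cu; c++)
                preAgg[off + map[c]] += m[r][c];
        }
    }

    // Step 2: multiply the pre-aggregate (nRowsLeft x nVals) with the dictionary (nVals x nCols).
    static double[] multiplyWithDict(double[] preAgg, double[] dict, int nRowsLeft, int nVals, int nCols) {
        double[] out = new double[nRowsLeft * nCols];
        for (int r = 0; r < nRowsLeft; r++)
            for (int v = 0; v < nVals; v++)
                for (int j = 0; j < nCols; j++)
                    out[r * nCols + j] += preAgg[r * nVals + v] * dict[v * nCols + j];
        return out;
    }
}
```

The inner multiplication then scales with the number of distinct tuples rather than the number of rows.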
-
leftMMIdentityPreAggregateDense
public void leftMMIdentityPreAggregateDense(MatrixBlock that, MatrixBlock ret, int rl, int ru, int cl, int cu)
- Specified by:
leftMMIdentityPreAggregateDense in class APreAgg
-
rightDecompressingMult
public void rightDecompressingMult(MatrixBlock right, MatrixBlock ret, int rl, int ru, int nRows, int crl, int cru)
Description copied from class: AColGroup
Right side matrix multiplication, iterating through this column group and adding to the ret.
- Overrides:
rightDecompressingMult in class AColGroup
- Parameters:
right - Right side matrix to multiply with.
ret - The return matrix to add results to
rl - The row of this column group to multiply from
ru - The row of this column group to multiply to (not inclusive)
nRows - The number of rows in this column group
crl - The right hand side column lower
cru - The right hand side column upper
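A decompressing right multiplication can be pictured as in the sketch below, which reads each row's tuple from the dictionary and accumulates into the output. It assumes `double[][]` matrices and a row-major dictionary; `DdcRightMultSketch` and its signature are hypothetical and omit the `nRows` argument of the real method.

```java
// Illustrative right multiplication for a DDC-style group; not the SystemDS code.
final class DdcRightMultSketch {
    // ret[r][j] += sum over group columns k of value(r, k) * right[colIndexes[k]][j]
    static void rightDecompressingMult(double[][] right, double[][] ret, double[] dict, int[] map,
            int[] colIndexes, int rl, int ru, int crl, int cru) {
        int nCols = colIndexes.length;
        for (int r = rl; r < ru; r++) {
            int off = map[r] * nCols; // start of this row's tuple in the dictionary
            for (int k = 0; k < nCols; k++) {
                double v = dict[off + k];
                for (int j = crl; j < cru; j++)
                    ret[r][j] += v * right[colIndexes[k]][j];
            }
        }
    }
}
```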
-
preAggregateSparse
public void preAggregateSparse(SparseBlock sb, double[] preAgg, int rl, int ru, int cl, int cu)
- Specified by:
preAggregateSparse in class APreAgg
-
preAggregateThatDDCStructure
public void preAggregateThatDDCStructure(ColGroupDDC that, Dictionary ret)
-
preAggregateThatSDCZerosStructure
public void preAggregateThatSDCZerosStructure(ColGroupSDCZeros that, Dictionary ret)
-
preAggregateThatSDCSingleZerosStructure
public void preAggregateThatSDCSingleZerosStructure(ColGroupSDCSingleZeros that, Dictionary ret)
-
sameIndexStructure
public boolean sameIndexStructure(AColGroupCompressed that)
- Specified by:
sameIndexStructure in class AColGroupCompressed
-
getColGroupType
public org.apache.sysds.runtime.compress.colgroup.AColGroup.ColGroupType getColGroupType()
-
estimateInMemorySize
public long estimateInMemorySize()
Description copied from class: AColGroup
Get the upper bound estimate of in-memory allocation for the column group.
- Overrides:
estimateInMemorySize in class AColGroupValue
- Returns:
- an upper bound on the number of bytes used to store this ColGroup in memory.
-
scalarOperation
public AColGroup scalarOperation(ScalarOperator op)
Description copied from class: AColGroup
Perform the specified scalar operation directly on the compressed column group, without decompressing individual cells if possible.
- Specified by:
scalarOperation in class AColGroup
- Parameters:
op - operation to perform
- Returns:
- version of this column group with the operation applied
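For a dictionary-based group, a scalar operation can in principle be applied to the dictionary alone, leaving the per-row ids untouched. The sketch below assumes this and uses `java.util.function.DoubleUnaryOperator` in place of the SystemDS `ScalarOperator`; `DdcScalarOpSketch` is a hypothetical name.

```java
import java.util.function.DoubleUnaryOperator;

// Illustrative scalar operation on a dictionary; not the SystemDS implementation.
final class DdcScalarOpSketch {
    // Apply op to every dictionary value; the row-to-id map can be reused as is.
    static double[] applyToDictionary(double[] dict, DoubleUnaryOperator op) {
        double[] out = new double[dict.length];
        for (int i = 0; i < dict.length; i++)
            out[i] = op.applyAsDouble(dict[i]);
        return out;
    }
}
```

The cost is proportional to the dictionary size, not to the number of rows.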
-
unaryOperation
public AColGroup unaryOperation(UnaryOperator op)
Description copied from class: AColGroup
Perform unary operation on the column group and return a new column group.
- Specified by:
unaryOperation in class AColGroup
- Parameters:
op - The operation to perform
- Returns:
- The new column group
-
binaryRowOpLeft
public AColGroup binaryRowOpLeft(BinaryOperator op, double[] v, boolean isRowSafe)
Description copied from class: AColGroup
Perform a binary row operation.
- Specified by:
binaryRowOpLeft in class AColGroup
- Parameters:
op - The operation to execute
v - The vector of values to apply. Its length should be at least the highest value in the column index
isRowSafe - True if the binary op is applied to an entire zero row and all results are zero
- Returns:
- An updated column group with the new values.
-
binaryRowOpRight
public AColGroup binaryRowOpRight(BinaryOperator op, double[] v, boolean isRowSafe)
Description copied from class: AColGroup
Perform a binary row operation.
- Specified by:
binaryRowOpRight in class AColGroup
- Parameters:
op - The operation to execute
v - The vector of values to apply. Its length should be at least the highest value in the column index
isRowSafe - True if the binary op is applied to an entire zero row and all results are zero
- Returns:
- An updated column group with the new values.
-
write
public void write(DataOutput out) throws IOException
- Overrides:
write in class ADictBasedColGroup
- Throws:
IOException
-
read
public static ColGroupDDC read(DataInput in) throws IOException
- Throws:
IOException
-
getExactSizeOnDisk
public long getExactSizeOnDisk()
Description copied from class: AColGroup
Returns the exact serialized size of column group. This can be used for example for buffer preallocation.
- Overrides:
getExactSizeOnDisk in class ADictBasedColGroup
- Returns:
- exact serialized size for column group
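The buffer preallocation use case can be sketched with a made-up serialization layout. The format below (`[int dictLen][doubles][int mapLen][ints]`) and the class `DdcSerializationSketch` are hypothetical and are not the SystemDS on-disk format; the point is only that the size is computed up front and matches the bytes written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustrative serialization with an exact size; hypothetical format, not SystemDS's.
final class DdcSerializationSketch {
    // Hypothetical layout: [int dictLen][dict doubles][int mapLen][map ints]
    static long exactSizeOnDisk(double[] dict, int[] map) {
        return 4L + 8L * dict.length + 4L + 4L * map.length;
    }

    // Preallocate the buffer with the exact size, then write the fields in order.
    static byte[] write(double[] dict, int[] map) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream((int) exactSizeOnDisk(dict, map));
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(dict.length);
            for (double d : dict)
                out.writeDouble(d);
            out.writeInt(map.length);
            for (int i : map)
                out.writeInt(i);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Read the map back, skipping over the dictionary values.
    static int[] readMap(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int dictLen = in.readInt();
            for (int i = 0; i < dictLen; i++)
                in.readDouble();
            int[] map = new int[in.readInt()];
            for (int i = 0; i < map.length; i++)
                map[i] = in.readInt();
            return map;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```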
-
getCost
public double getCost(ComputationCostEstimator e, int nRows)
Description copied from class: AColGroup
Get the computation cost associated with this column group.
-
containsValue
public boolean containsValue(double pattern)
Description copied from class: AColGroup
Detect if the column group contains a specific value.
- Specified by:
containsValue in class AColGroup
- Parameters:
pattern - The value to look for.
- Returns:
- boolean saying true if the value is contained.
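With a dictionary-based layout, a value check can scan the dictionary instead of every cell. The sketch below assumes every dictionary entry is referenced by at least one row, and treats NaN as matching NaN; both are assumptions, and `DdcContainsSketch` is a hypothetical name.

```java
// Illustrative value check on a dictionary; not the SystemDS implementation.
final class DdcContainsSketch {
    // Assumes each dictionary entry is used by at least one row of the group.
    static boolean containsValue(double[] dict, double pattern) {
        boolean nan = Double.isNaN(pattern);
        for (double v : dict)
            if (nan ? Double.isNaN(v) : v == pattern)
                return true;
        return false;
    }
}
```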
-
sliceRows
public AColGroup sliceRows(int rl, int ru)
Description copied from class: AColGroup
Slice a range of rows out of the column group and return a new column group containing only that row segment. Note that the slice should maintain pointers back to the original dictionaries and only modify index structures.
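The note about keeping pointers to the original dictionary can be sketched as follows: only the per-row id array is copied for the range, while the dictionary reference is shared. The class `DdcSliceSketch` is hypothetical and assumes a half-open range `[rl, ru)`.

```java
import java.util.Arrays;

// Illustrative row slice for a DDC-style group; not the SystemDS implementation.
final class DdcSliceSketch {
    final double[] dict; // shared between the original and its slices
    final int[] map;     // one dictionary id per row

    DdcSliceSketch(double[] dict, int[] map) {
        this.dict = dict;
        this.map = map;
    }

    // Copy only the ids for rows [rl, ru); the dictionary is reused by reference.
    DdcSliceSketch sliceRows(int rl, int ru) {
        return new DdcSliceSketch(dict, Arrays.copyOfRange(map, rl, ru));
    }
}
```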
-
append
public AColGroup append(AColGroup g)
Description copied from class: AColGroup
Append the other column group to this column group. This method tries to combine them and return a new column group containing both. In some cases this is possible in reasonable time; in others it is not. The result contains this column group first, followed by the other column group at higher row indexes. If combining is not possible or would be very inefficient, null is returned.
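One cheap case for appending is when both groups use identical dictionaries, so the row ids can simply be concatenated. The sketch below assumes that case and returns null otherwise, mirroring the null contract described above; the equality check and the class `DdcAppendSketch` are assumptions for illustration, not the SystemDS logic.

```java
import java.util.Arrays;

// Illustrative row-wise append for DDC-style groups; not the SystemDS implementation.
final class DdcAppendSketch {
    // Returns the concatenated row ids, or null when the dictionaries differ.
    static int[] appendMaps(double[] dictA, int[] mapA, double[] dictB, int[] mapB) {
        if (!Arrays.equals(dictA, dictB))
            return null; // combining would require re-encoding, so give up
        int[] out = Arrays.copyOf(mapA, mapA.length + mapB.length);
        System.arraycopy(mapB, 0, out, mapA.length, mapB.length);
        return out;
    }
}
```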
-
getCompressionScheme
public ICLAScheme getCompressionScheme()
Description copied from class: AColGroup
Get the compression scheme for this column group to enable compression of other data.
- Specified by:
getCompressionScheme in class AColGroup
- Returns:
- The compression scheme of this column group
-
recompress
public AColGroup recompress()
Description copied from class: AColGroup
Recompress this column group into a new column group.
- Specified by:
recompress in class AColGroup
- Returns:
- A new or the same column group depending on optimization goal.
-
getCompressionInfo
public CompressedSizeInfoColGroup getCompressionInfo(int nRow)
Description copied from class: AColGroup
Get the compression info for this column group.
- Specified by:
getCompressionInfo in class AColGroup
- Parameters:
nRow - The number of rows in this column group.
- Returns:
- The compression info for this group.
-
getEncoding
public IEncode getEncoding()
Description copied from class: AColGroup
Get encoding of this column group.
- Overrides:
getEncoding in class AColGroup
- Returns:
- The encoding of the index structure.
-
sparseSelection
public void sparseSelection(MatrixBlock selection, ColGroupUtils.P[] points, MatrixBlock ret, int rl, int ru)
-
morph
public AColGroup morph(AColGroup.CompressionType ct, int nRow)
Description copied from class: AColGroup
Recompress this column group into a new column group of the given type.
-
combineWithSameIndex
public AColGroupCompressed combineWithSameIndex(int nRow, int nCol, List<AColGroup> right)
Description copied from class: AColGroup
Column-bind (cbind) the list of column groups with this column group. The list of elements provided in the index of each list is guaranteed to have the same index structures.
- Overrides:
combineWithSameIndex in class AColGroup
- Parameters:
nRow - The number of rows contained in all right and this column group.
nCol - The number of columns to shift the right hand side column groups over when combining; this should only affect the column indexes
right - The right hand side column groups to combine. NOTE: only the index offset of the second nested list should be used. The reason for providing this nested list is to avoid redundant allocations in calling methods.
- Returns:
- A combined compressed column group of the same type as this.
-
combineWithSameIndex
public AColGroupCompressed combineWithSameIndex(int nRow, int nCol, AColGroup right)
Description copied from class: AColGroup
Column-bind (cbind) the given column group to this.
- Overrides:
combineWithSameIndex in class AColGroup
- Parameters:
nRow - The number of rows contained in the right and this column group.
nCol - The number of columns in this.
right - The column group to c-bind.
- Returns:
- a new combined column group.
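When two groups share the same index structure, a column bind only needs to merge the dictionaries tuple by tuple while the row ids stay unchanged. The sketch below assumes both dictionaries are row-major with the same number of tuples, and that tuple i in each belongs to the same row id; `DdcCombineSketch` is a hypothetical name, not the SystemDS helper.

```java
// Illustrative dictionary merge for a cbind of same-index groups; not the SystemDS code.
final class DdcCombineSketch {
    // Concatenate tuple v of dictA with tuple v of dictB for every id v.
    static double[] combineDictionaries(double[] dictA, int nColA, double[] dictB, int nColB) {
        int nVals = dictA.length / nColA;
        if (dictB.length / nColB != nVals)
            throw new IllegalArgumentException("dictionaries must have the same number of tuples");
        int nCol = nColA + nColB;
        double[] out = new double[nVals * nCol];
        for (int v = 0; v < nVals; v++) {
            System.arraycopy(dictA, v * nColA, out, v * nCol, nColA);
            System.arraycopy(dictB, v * nColB, out, v * nCol + nColA, nColB);
        }
        return out;
    }
}
```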
-
splitReshape
public AColGroup[] splitReshape(int multiplier, int nRow, int nColOrg)
Description copied from class: AColGroup
This method returns a list of column groups that are naive splits of this column group as if it is reshaped. This means the rows of the column group are split into x other column groups, where x is the multiplier. The indexes are assigned round robin to each of the output groups, meaning the first index is assigned to the first group, the second to the second, and so on. For instance, if a column group on column 4 is split with a multiplier of 2 and there were 5 columns in total originally, the output is 2 column groups, one at column index 4 and one at 9. If possible, the split column groups should reuse pointers back to the original dictionaries!
- Specified by:
splitReshape in class AColGroup
- Parameters:
multiplier - The number of column groups to split into
nRow - The number of rows in this column group in case the underlying column group does not know
nColOrg - The number of overall columns in the host CompressedMatrixBlock.
- Returns:
- a list of split column groups
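The index arithmetic in the example above (column 4, multiplier 2, 5 original columns giving columns 4 and 9) can be sketched as below. This is one reading of the description, assuming the row count is divisible by the multiplier; `DdcSplitReshapeSketch` and its methods are hypothetical names.

```java
// Illustrative reshape split for a DDC-style group; an interpretation, not the SystemDS code.
final class DdcSplitReshapeSketch {
    // Output column of split k for an original column: col + k * nColOrg.
    static int[] targetColumns(int col, int multiplier, int nColOrg) {
        int[] out = new int[multiplier];
        for (int k = 0; k < multiplier; k++)
            out[k] = col + k * nColOrg;
        return out;
    }

    // Round-robin split of the row ids: row r goes to group r % multiplier at row r / multiplier.
    static int[][] splitMap(int[] map, int multiplier) {
        if (map.length % multiplier != 0)
            throw new IllegalArgumentException("row count must be divisible by the multiplier");
        int[][] out = new int[multiplier][map.length / multiplier];
        for (int r = 0; r < map.length; r++)
            out[r % multiplier][r / multiplier] = map[r];
        return out;
    }
}
```

Each output group can keep referencing the original dictionary, since only the row ids and column indexes change.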
-
splitReshapePushDown
public AColGroup[] splitReshapePushDown(int multiplier, int nRow, int nColOrg, ExecutorService pool) throws Exception
Description copied from class: AColGroup
This method returns a list of column groups that are naive splits of this column group as if it is reshaped. This means the rows of the column group are split into x other column groups, where x is the multiplier. The indexes are assigned round robin to each of the output groups, meaning the first index is assigned to the first group, the second to the second, and so on. For instance, if a column group on column 4 is split with a multiplier of 2 and there were 5 columns in total originally, the output is 2 column groups, one at column index 4 and one at 9. If possible, the split column groups should reuse pointers back to the original dictionaries! This specific variation pushes down the parallelization given via the provided executor service. If not overridden, the default is to call the normal split reshape.
- Overrides:
splitReshapePushDown in class AColGroup
- Parameters:
multiplier - The number of column groups to split into
nRow - The number of rows in this column group in case the underlying column group does not know
nColOrg - The number of overall columns in the host CompressedMatrixBlock
pool - The executor service to submit parallel tasks to
- Returns:
- a list of split column groups
- Throws:
Exception - In case there is an error we throw the exception out instead of handling it
-
toString
public String toString()
- Overrides:
toString in class AColGroupValue
-
-