Class CompressionSettingsBuilder
- java.lang.Object
-
- org.apache.sysds.runtime.compress.CompressionSettingsBuilder
-
public class CompressionSettingsBuilder extends Object
Builder pattern for Compression Settings. See CompressionSettings for details on values.
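The builder style described here can be sketched with a small stand-in. The classes `DemoSettings` and `DemoSettingsBuilder` below are hypothetical and are not the SystemDS classes; the default values are illustrative assumptions. The sketch only shows the shape of the pattern: each setter returns the builder so calls can be chained, and `create()` produces the settings object.

```java
// Hypothetical stand-in for a settings object; not the SystemDS CompressionSettings.
final class DemoSettings {
    final double samplingRatio;
    final int seed;
    final boolean lossy;

    DemoSettings(double samplingRatio, int seed, boolean lossy) {
        this.samplingRatio = samplingRatio;
        this.seed = seed;
        this.lossy = lossy;
    }
}

// Hypothetical stand-in for a fluent builder; defaults are assumptions.
final class DemoSettingsBuilder {
    private double samplingRatio = 1.0;
    private int seed = -1;
    private boolean lossy = false;

    DemoSettingsBuilder setSamplingRatio(double samplingRatio) {
        this.samplingRatio = samplingRatio;
        return this; // returning this is what allows chaining
    }

    DemoSettingsBuilder setSeed(int seed) {
        this.seed = seed;
        return this;
    }

    DemoSettingsBuilder setLossy(boolean lossy) {
        this.lossy = lossy;
        return this;
    }

    // Snapshot the current builder state into a settings object.
    DemoSettings create() {
        return new DemoSettings(samplingRatio, seed, lossy);
    }
}

class BuilderDemo {
    public static void main(String[] args) {
        DemoSettings s = new DemoSettingsBuilder()
            .setSamplingRatio(0.1)
            .setSeed(7)
            .setLossy(false)
            .create();
        System.out.println(s.samplingRatio + " " + s.seed + " " + s.lossy);
    }
}
```

With the real class, the same chaining shape would apply to the setters listed on this page, ending in `create()`.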
-
-
Constructor Summary
Constructor Description
CompressionSettingsBuilder()
-
Method Summary
Modifier and Type Method
Description
CompressionSettingsBuilder addValidCompression(AColGroup.CompressionType cp)
Add a single valid compression type to the EnumSet of valid compressions.
CompressionSettingsBuilder clearValidCompression()
Clear all the compression types allowed in the compression.
CompressionSettingsBuilder copySettings(CompressionSettings that)
Copy the settings from another CompressionSettings object. This modifies this builder, not that.
CompressionSettings create()
Create the CompressionSettings object to use in the compression.
CompressionSettingsBuilder setAllowSharedDictionary(boolean allowSharedDictionary)
Allow the dictionaries to be shared between different column groups.
CompressionSettingsBuilder setCoCodePercentage(double coCodePercentage)
Set the coCode percentage. The effect differs between coCoding strategies, but in general higher values result in more coCoding and lower values in less.
CompressionSettingsBuilder setColumnPartitioner(CoCoderFactory.PartitionerType columnPartitioner)
Set the type of CoCoding partitioner to use for combining columns together.
CompressionSettingsBuilder setCostType(CostEstimatorFactory.CostType costType)
Set the cost type used for estimating the cost of column groups. The default is memory based.
CompressionSettingsBuilder setEstimationType(SampleEstimatorFactory.EstimationType estimationType)
Set the estimation type used for the sampled estimates.
CompressionSettingsBuilder setIsInSparkInstruction()
Inform the compression that it is executed in a Spark instruction.
CompressionSettingsBuilder setLossy(boolean lossy)
Set the compression to use lossy compression.
CompressionSettingsBuilder setMaxColGroupCoCode(int maxColGroupCoCode)
Set the maximum number of columns to CoCode together in the CoCoding strategy.
CompressionSettingsBuilder setMaxSampleSize(int maxSampleSize)
Set the maximum sample size to extract from a given matrix. This overrules the sample percentage if the sample it yields is larger than this maximum bound.
CompressionSettingsBuilder setMinimumCompressionRatio(double ratio)
Set the minimum compression ratio to be achieved by the compression.
CompressionSettingsBuilder setMinimumSampleSize(int minimumSampleSize)
Set the minimum sample size to extract from a given matrix. This overrules the sample percentage if the sample it yields is smaller than this minimum bound.
CompressionSettingsBuilder setSamplingRatio(double samplingRatio)
Set the sampling ratio used to sample the input matrix.
CompressionSettingsBuilder setSDCSortType(InsertionSorterFactory.SORT_TYPE sdcSortType)
Set the sort type to use.
CompressionSettingsBuilder setSeed(int seed)
Set the seed for the compression operation.
CompressionSettingsBuilder setSortValuesByLength(boolean sortValuesByLength)
Set the sortValuesByLength flag.
CompressionSettingsBuilder setTransposeInput(String transposeInput)
Specify if the input matrix should be transposed before compression.
CompressionSettingsBuilder setValidCompressions(EnumSet<AColGroup.CompressionType> validCompressions)
Set the valid compression strategies used for the compression.
-
-
-
Method Detail
-
copySettings
public CompressionSettingsBuilder copySettings(CompressionSettings that)
Copy the settings from another CompressionSettings object. This modifies this builder, not that.
Parameters:
that - The CompressionSettings to copy settings from.
Returns:
The modified CompressionSettingsBuilder (this same object).
-
setLossy
public CompressionSettingsBuilder setLossy(boolean lossy)
Set the compression to use lossy compression.
Parameters:
lossy - A boolean specifying if the compression should be lossy.
Returns:
The CompressionSettingsBuilder
-
setSamplingRatio
public CompressionSettingsBuilder setSamplingRatio(double samplingRatio)
Set the sampling ratio used to sample the input matrix, given as a fraction. The input value should be in the range 0.0 to 1.0.
Parameters:
samplingRatio - The ratio to sample from the input.
Returns:
The CompressionSettingsBuilder
-
setSortValuesByLength
public CompressionSettingsBuilder setSortValuesByLength(boolean sortValuesByLength)
Set the sortValuesByLength flag. This sorts the dictionaries containing the data based on their occurrences in the ColGroup, improving cache efficiency, especially for diverse column groups.
Parameters:
sortValuesByLength - A boolean specifying if the values should be sorted.
Returns:
The CompressionSettingsBuilder
-
setAllowSharedDictionary
public CompressionSettingsBuilder setAllowSharedDictionary(boolean allowSharedDictionary)
Allow the dictionaries to be shared between different column groups.
Parameters:
allowSharedDictionary - A boolean specifying if the dictionary can be shared between column groups.
Returns:
The CompressionSettingsBuilder
-
setTransposeInput
public CompressionSettingsBuilder setTransposeInput(String transposeInput)
Specify if the input matrix should be transposed before compression. This improves cache efficiency while compressing the input matrix.
Parameters:
transposeInput - A string specifying if the input should be transposed before compression; should be one of "auto", "true" or "false".
Returns:
The CompressionSettingsBuilder
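Because transposeInput is passed as a string with three accepted values, a caller may want to validate it before handing it over. The following is a minimal sketch of such a check; the enum `TransposeMode` and the class `TransposeInputSketch` are hypothetical helpers, and how the library itself reacts to other strings is not stated on this page.

```java
// Hypothetical enum mirroring the three documented string values.
enum TransposeMode { AUTO, TRUE, FALSE }

final class TransposeInputSketch {
    // Map a documented string value to a mode, rejecting anything else.
    static TransposeMode parse(String transposeInput) {
        if (transposeInput == null)
            throw new IllegalArgumentException("transposeInput must not be null");
        switch (transposeInput) {
            case "auto":
                return TransposeMode.AUTO;
            case "true":
                return TransposeMode.TRUE;
            case "false":
                return TransposeMode.FALSE;
            default:
                throw new IllegalArgumentException(
                    "Expected \"auto\", \"true\" or \"false\" but got: " + transposeInput);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("auto"));
    }
}
```

The match is case sensitive in this sketch, which is an assumption rather than documented behavior.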
-
setSeed
public CompressionSettingsBuilder setSeed(int seed)
Set the seed for the compression operation.
Parameters:
seed - The seed used in sampling the matrix and general operations in the compression.
Returns:
The CompressionSettingsBuilder
-
setValidCompressions
public CompressionSettingsBuilder setValidCompressions(EnumSet<AColGroup.CompressionType> validCompressions)
Set the valid compression strategies used for the compression.
Parameters:
validCompressions - An EnumSet of CompressionTypes to use in the compression.
Returns:
The CompressionSettingsBuilder
-
addValidCompression
public CompressionSettingsBuilder addValidCompression(AColGroup.CompressionType cp)
Add a single valid compression type to the EnumSet of valid compressions.
Parameters:
cp - The compression type to add to the valid ones.
Returns:
The CompressionSettingsBuilder
-
clearValidCompression
public CompressionSettingsBuilder clearValidCompression()
Clear all the compression types allowed in the compression. After clearing, only the Uncompressed ColGroup type remains allowed, since it is required for the compression to operate.
Returns:
The CompressionSettingsBuilder
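The interplay between clearing and adding valid compression types can be sketched with `java.util.EnumSet`. The enum `DemoCompressionType` and the class `ValidCompressionSketch` are hypothetical stand-ins, not the SystemDS types, and the choice to start with all types enabled is an assumption made for the example.

```java
import java.util.EnumSet;

// Hypothetical stand-in for the real compression type enum.
enum DemoCompressionType { UNCOMPRESSED, DDC, RLE }

final class ValidCompressionSketch {
    // Assumption for this sketch: start with every type allowed.
    private EnumSet<DemoCompressionType> valid = EnumSet.allOf(DemoCompressionType.class);

    // Clearing keeps the uncompressed type, as the description above states.
    ValidCompressionSketch clearValidCompression() {
        valid = EnumSet.of(DemoCompressionType.UNCOMPRESSED);
        return this;
    }

    ValidCompressionSketch addValidCompression(DemoCompressionType cp) {
        valid.add(cp);
        return this;
    }

    // Return a copy so callers cannot mutate the internal set.
    EnumSet<DemoCompressionType> validCompressions() {
        return EnumSet.copyOf(valid);
    }

    public static void main(String[] args) {
        ValidCompressionSketch s = new ValidCompressionSketch()
            .clearValidCompression()
            .addValidCompression(DemoCompressionType.DDC);
        System.out.println(s.validCompressions());
    }
}
```

Clearing first and then adding types one at a time is a convenient way to restrict the compression to a known subset.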
-
setColumnPartitioner
public CompressionSettingsBuilder setColumnPartitioner(CoCoderFactory.PartitionerType columnPartitioner)
Set the type of CoCoding partitioner to use for combining columns together.
Parameters:
columnPartitioner - The strategy to select from PartitionerType.
Returns:
The CompressionSettingsBuilder
-
setMaxColGroupCoCode
public CompressionSettingsBuilder setMaxColGroupCoCode(int maxColGroupCoCode)
Set the maximum number of columns to CoCode together in the CoCoding strategy. Compression time increases with higher numbers.
Parameters:
maxColGroupCoCode - The maximum number of columns to CoCode together.
Returns:
The CompressionSettingsBuilder
-
setCoCodePercentage
public CompressionSettingsBuilder setCoCodePercentage(double coCodePercentage)
Set the coCode percentage. The effect differs between coCoding strategies, but in general higher values result in more coCoding and lower values in less. Note that with high coCoding the compression ratio may be lower.
Parameters:
coCodePercentage - The percentage to set.
Returns:
The CompressionSettingsBuilder
-
setMinimumSampleSize
public CompressionSettingsBuilder setMinimumSampleSize(int minimumSampleSize)
Set the minimum sample size to extract from a given matrix. This overrules the sample percentage if the sample it yields is smaller than this minimum bound.
Parameters:
minimumSampleSize - The minimum sample size to extract.
Returns:
The CompressionSettingsBuilder
-
setMaxSampleSize
public CompressionSettingsBuilder setMaxSampleSize(int maxSampleSize)
Set the maximum sample size to extract from a given matrix. This overrules the sample percentage if the sample it yields is larger than this maximum bound.
Parameters:
maxSampleSize - The maximum sample size to extract.
Returns:
The CompressionSettingsBuilder
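How the sampling ratio and the minimum and maximum sample sizes interact can be illustrated with a small calculation. This is a sketch of the rule as described on this page, not the SystemDS implementation: the class `SampleSizeSketch`, the method `effectiveSampleSize`, the rounding, and the final cap at the number of rows are all assumptions.

```java
final class SampleSizeSketch {
    // Compute a sample size from the ratio, then apply the min and max bounds.
    static int effectiveSampleSize(int nRows, double samplingRatio,
            int minimumSampleSize, int maxSampleSize) {
        if (samplingRatio < 0.0 || samplingRatio > 1.0)
            throw new IllegalArgumentException("samplingRatio must be in [0.0, 1.0]");
        // Sample size requested by the ratio alone (rounding is an assumption).
        int requested = (int) Math.round(nRows * samplingRatio);
        // The maximum overrules a larger request, the minimum a smaller one.
        int bounded = Math.max(minimumSampleSize, Math.min(maxSampleSize, requested));
        // Assumption: never sample more rows than the matrix has.
        return Math.min(bounded, nRows);
    }

    public static void main(String[] args) {
        // 1% of 1,000,000 rows is 10,000, which exceeds the maximum of 5,000.
        System.out.println(effectiveSampleSize(1_000_000, 0.01, 2000, 5000));
        // 1% of 100,000 rows is 1,000, which is below the minimum of 2,000.
        System.out.println(effectiveSampleSize(100_000, 0.01, 2000, 5000));
    }
}
```

In this reading, the ratio only decides the sample size when its result already falls between the two bounds.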
-
setEstimationType
public CompressionSettingsBuilder setEstimationType(SampleEstimatorFactory.EstimationType estimationType)
Set the estimation type used for the sampled estimates.
Parameters:
estimationType - The estimation type to use.
Returns:
The CompressionSettingsBuilder
-
setCostType
public CompressionSettingsBuilder setCostType(CostEstimatorFactory.CostType costType)
Set the cost type used for estimating the cost of column groups. The default is memory based.
Parameters:
costType - The cost type wanted.
Returns:
The CompressionSettingsBuilder
-
setMinimumCompressionRatio
public CompressionSettingsBuilder setMinimumCompressionRatio(double ratio)
Set the minimum compression ratio to be achieved by the compression.
Parameters:
ratio - The ratio to achieve while compressing.
Returns:
The CompressionSettingsBuilder
-
setIsInSparkInstruction
public CompressionSettingsBuilder setIsInSparkInstruction()
Inform the compression that it is executed in a Spark instruction.
Returns:
The CompressionSettingsBuilder
-
setSDCSortType
public CompressionSettingsBuilder setSDCSortType(InsertionSorterFactory.SORT_TYPE sdcSortType)
Set the sort type to use.
Parameters:
sdcSortType - The sort type for the construction of SDC groups.
Returns:
The CompressionSettingsBuilder
-
create
public CompressionSettings create()
Create the CompressionSettings object to use in the compression.
Returns:
The CompressionSettings
-
-