Class CompressedSizeEstimatorExact

  • public class CompressedSizeEstimatorExact
    extends CompressedSizeEstimator
    Exact compressed size estimator (examines entire dataset).
    • Method Detail

      • getColGroupInfo

        public CompressedSizeInfoColGroup getColGroupInfo(int[] colIndexes,
                                                          int estimate,
                                                          int nrUniqueUpperBound)
        Description copied from class: CompressedSizeEstimator
        Extracts the compressed size info for a given list of columns. This method additionally limits the estimated number of unique values, since in some cases the estimate for the combined columns is higher than the estimates for the sub groups of the given colIndexes allow.
        Specified by:
        getColGroupInfo in class CompressedSizeEstimator
        Parameters:
        colIndexes - The columns to extract compression information from
        estimate - An estimate of the number of unique elements in these columns
        nrUniqueUpperBound - The upper bound on the number of unique elements allowed in the estimate. It can be calculated by multiplying together the numbers of unique elements estimated in the sub columns. The bound is flexible: if the sample is small, it can be adjusted manually, as in CoCodeCostMatrixMult.
        Returns:
        The CompressedSizeInfoColGroup for the given column indexes.
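        The role of nrUniqueUpperBound can be illustrated with a minimal, standalone sketch. It does not use the SystemDS API: the class UniqueBoundSketch and its methods upperBound and limitEstimate are hypothetical names, and capping the product at the row count is an added assumption (a column group cannot hold more distinct tuples than it has rows).

```java
// Hypothetical helper, not part of SystemDS: illustrates how an upper bound
// on unique values can be derived from sub-group estimates and applied.
final class UniqueBoundSketch {

    /**
     * Multiplies the unique counts estimated for each sub group.
     * Assumption: the result is capped at nRows, because the number of
     * distinct tuples can never exceed the number of rows.
     */
    public static int upperBound(int[] subGroupUniques, int nRows) {
        long product = 1;
        for (int u : subGroupUniques) {
            product *= Math.max(u, 1); // treat empty sub groups as one value
            if (product >= nRows) {
                return nRows; // stop early, also avoids overflow
            }
        }
        return (int) product;
    }

    /** Limits an estimate so that it never exceeds the upper bound. */
    public static int limitEstimate(int estimate, int nrUniqueUpperBound) {
        return Math.min(estimate, nrUniqueUpperBound);
    }

    public static void main(String[] args) {
        // Two sub groups with 3 and 4 unique values: at most 12 combinations.
        int bound = upperBound(new int[] {3, 4}, 1000);
        // A raw estimate of 20 is reduced to the bound.
        System.out.println(limitEstimate(20, bound)); // prints 12
    }
}
```

        In this sketch an estimate of 20 unique values for two co-coded columns with 3 and 4 unique values is reduced to 12, which is the kind of correction the description above refers to.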
      • getDeltaColGroupInfo

        public CompressedSizeInfoColGroup getDeltaColGroupInfo(int[] colIndexes,
                                                               int estimate,
                                                               int nrUniqueUpperBound)
        Description copied from class: CompressedSizeEstimator
        Extracts the compressed size info for a given list of columns. This method additionally limits the estimated number of unique values, since in some cases the estimate for the combined columns is higher than the estimates for the sub groups of the given colIndexes allow. The difference from getColGroupInfo is that this method extracts the values as delta values from the matrix block input.
        Specified by:
        getDeltaColGroupInfo in class CompressedSizeEstimator
        Parameters:
        colIndexes - The columns to extract compression information from
        estimate - An estimate of the number of unique delta elements in these columns
        nrUniqueUpperBound - The upper bound on the number of unique elements allowed in the estimate. It can be calculated by multiplying together the numbers of unique elements estimated in the sub columns. The bound is flexible: if the sample is small, it can be adjusted manually, as in CoCodeCostMatrixMult.
        Returns:
        The CompressedSizeInfoColGroup for the given column indexes.
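        To show why counting delta values can give a different result than counting raw values, here is a small standalone sketch. It is not the SystemDS implementation: DeltaUniqueSketch and countUnique are hypothetical names, the matrix is a plain double[][], and a delta is assumed to be the difference between a row and the previous row (with the first row kept as is).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of SystemDS: counts distinct row tuples
// over the selected columns, either on raw values or on row-to-row deltas.
final class DeltaUniqueSketch {

    public static int countUnique(double[][] m, int[] colIndexes, boolean delta) {
        Set<List<Double>> seen = new HashSet<>();
        for (int r = 0; r < m.length; r++) {
            List<Double> tuple = new ArrayList<>(colIndexes.length);
            for (int c : colIndexes) {
                // Assumption: the delta is taken against the previous row;
                // the first row (and the non-delta case) subtracts zero.
                double prev = (delta && r > 0) ? m[r - 1][c] : 0.0;
                tuple.add(m[r][c] - prev);
            }
            seen.add(tuple);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        double[][] m = {
            {10, 1},
            {20, 2},
            {30, 3},
            {40, 4},
        };
        int[] cols = {0, 1};
        System.out.println(countUnique(m, cols, false)); // prints 4
        System.out.println(countUnique(m, cols, true));  // prints 1
    }
}
```

        For steadily increasing columns like these, every raw tuple is distinct while all delta tuples are identical, so an estimate passed to getDeltaColGroupInfo would describe the second count rather than the first.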