Class QDictionary

  • All Implemented Interfaces:
    Serializable, IDictionary

    public class QDictionary
    extends ACachingMBDictionary
    This dictionary class encapsulates the storage of, and operations over, the unique floating point values of a column group. It was introduced primarily to provide an entry point for specializations such as shared dictionaries, which require additional information.
    See Also:
    Serialized Form
    • Method Detail

      • create

        public static QDictionary create(byte[] values,
                                         double scale,
                                         int nCol,
                                         boolean check)
      • getValues

        public double[] getValues()
        Description copied from interface: IDictionary
        Get all the values contained in the dictionary as a linearized double array.
        Specified by:
        getValues in interface IDictionary
        Overrides:
        getValues in class ADictionary
        Returns:
        linearized double array
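        The create signature above takes byte values together with a scale, while getValues returns doubles. A minimal sketch of one way such a conversion could work, assuming each byte code is multiplied by the shared scale (the class and method names here are hypothetical and not part of QDictionary):

```java
import java.util.Arrays;

// Illustrative sketch only; not the library implementation.
final class QuantizedValuesSketch {

    /** Expand byte codes to doubles, assuming value = code * scale. */
    static double[] dequantize(byte[] codes, double scale) {
        double[] out = new double[codes.length];
        for (int i = 0; i < codes.length; i++)
            out[i] = codes[i] * scale; // the byte is widened before the multiplication
        return out;
    }

    public static void main(String[] args) {
        byte[] codes = {2, -4, 0, 6};
        // prints [1.0, -2.0, 0.0, 3.0]
        System.out.println(Arrays.toString(dequantize(codes, 0.5)));
    }
}
```

        Under this assumption the byte array stays the compact stored form, and the double array is materialized only on request.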
      • getValue

        public double getValue(int i)
        Description copied from interface: IDictionary
        Get the value contained in the dictionary at the given index.
        Specified by:
        getValue in interface IDictionary
        Overrides:
        getValue in class ADictionary
        Parameters:
        i - The index to extract the value from
        Returns:
        The value contained at the index
      • getValue

        public final double getValue(int r,
                                     int c,
                                     int nCol)
        Description copied from interface: IDictionary
        Get the value contained in the dictionary at the given row and column.
        Specified by:
        getValue in interface IDictionary
        Overrides:
        getValue in class ADictionary
        Parameters:
        r - The target row (tuple) index
        c - The target column index
        nCol - The number of columns in the dictionary
        Returns:
        The value at the given row and column
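        A row/column lookup over a linearized array reduces to simple index arithmetic. The sketch below assumes tuples are stored back to back (row-major); that layout is an assumption made for illustration, and the names are hypothetical:

```java
// Illustrative sketch only; assumes a row-major layout of the linearized values.
final class TupleLookupSketch {

    /** Return column c of tuple r, where every tuple holds nCol values. */
    static double valueAt(double[] values, int r, int c, int nCol) {
        return values[r * nCol + c];
    }
}
```

        With values {1, 2, 3, 4, 5, 6} and nCol = 3, row 1 and column 2 map to linear index 5.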
      • getInMemorySize

        public long getInMemorySize()
        Description copied from interface: IDictionary
        Returns the memory usage of the dictionary.
        Returns:
        The size of the dictionary in memory, in bytes.
      • getInMemorySize

        public static long getInMemorySize(int valuesCount)
      • aggregate

        public double aggregate(double init,
                                Builtin fn)
        Description copied from interface: IDictionary
        Aggregate all the contained values. This is useful in value-only computations, where the operation iterates over every value contained in the dictionary.
        Specified by:
        aggregate in interface IDictionary
        Overrides:
        aggregate in class ADictionary
        Parameters:
        init - The initial value; for an aggregation such as max, this could be -infinity
        fn - The Function to apply to values
        Returns:
        The aggregated value as a double.
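        The role of init is easiest to see as a left fold. This sketch uses java.util.function.DoubleBinaryOperator as a stand-in for the Builtin function type, which is an assumption made so the example runs without the library:

```java
import java.util.function.DoubleBinaryOperator;

// Illustrative sketch only; DoubleBinaryOperator stands in for the Builtin type.
final class AggregateSketch {

    /** Fold fn over all values, starting from init. */
    static double aggregate(double[] values, double init, DoubleBinaryOperator fn) {
        double acc = init;
        for (double v : values)
            acc = fn.applyAsDouble(acc, v);
        return acc;
    }
}
```

        For a max aggregation, starting from Double.NEGATIVE_INFINITY means an empty dictionary yields the identity of max rather than an arbitrary value.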
      • write

        public void write(DataOutput out)
                   throws IOException
        Description copied from interface: IDictionary
        Write the dictionary to a DataOutput.
        Parameters:
        out - the output sink to write the dictionary to.
        Throws:
        IOException - if the sink fails.
      • getExactSizeOnDisk

        public long getExactSizeOnDisk()
        Description copied from interface: IDictionary
        Calculate the space consumption if the dictionary is stored on disk.
        Returns:
        The number of bytes needed to store the dictionary on disk.
      • getNumberOfValues

        public int getNumberOfValues(int nCol)
        Description copied from interface: IDictionary
        Get the number of distinct tuples, given that the column group has nCol columns.
        Parameters:
        nCol - The number of Columns in the ColumnGroup.
        Returns:
        the number of value tuples contained in the dictionary.
      • getNumberOfColumns

        public int getNumberOfColumns(int nCol)
        Description copied from interface: IDictionary
        Get the number of columns in this dictionary, provided you know the number of values, or rows.
        Parameters:
        nCol - The number of rows (tuples) known to be inside this dictionary; despite its name, this argument is a row count
        Returns:
        The number of columns
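        Both counts follow from the length of the linearized value array. A small sketch, assuming the array holds exactly nTuples * nCol entries (helper names are hypothetical):

```java
// Illustrative sketch only; assumes totalValues == nTuples * nCol.
final class DictionaryShapeSketch {

    /** Number of tuples when each tuple spans nCol values. */
    static int numberOfTuples(int totalValues, int nCol) {
        return totalValues / nCol;
    }

    /** Number of columns when the number of tuples (rows) is known. */
    static int numberOfColumns(int totalValues, int nTuples) {
        return totalValues / nTuples;
    }
}
```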
      • sumAllRowsToDouble

        public double[] sumAllRowsToDouble(int nrColumns)
        Description copied from interface: IDictionary
        Pre-aggregate each tuple in the dictionary to a single double value. Note that if the number of columns is one, the dictionary's actual values are simply returned.
        Specified by:
        sumAllRowsToDouble in interface IDictionary
        Overrides:
        sumAllRowsToDouble in class ADictionary
        Parameters:
        nrColumns - The number of columns in the ColGroup, needed to locate each tuple's values in the dictionary.
        Returns:
        a double array containing the row sums from this dictionary.
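        The pre-aggregation can be pictured as one sum per tuple. A minimal sketch over a plain double array, assuming a row-major layout (the class and method names are hypothetical):

```java
// Illustrative sketch only; assumes tuples are stored row-major.
final class RowSumSketch {

    /** Sum the nrColumns values of each tuple into one double per tuple. */
    static double[] sumAllRows(double[] values, int nrColumns) {
        int nTuples = values.length / nrColumns;
        double[] sums = new double[nTuples];
        for (int r = 0; r < nTuples; r++)
            for (int c = 0; c < nrColumns; c++)
                sums[r] += values[r * nrColumns + c];
        return sums;
    }
}
```

        With a single column every tuple sum equals the stored value, which matches the note that the values are returned as they are in that case.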
      • sumAllRowsToDoubleSq

        public double[] sumAllRowsToDoubleSq(int nrColumns)
        Description copied from interface: IDictionary
        Pre-aggregate each tuple in the dictionary to a single double value. Note that if the number of columns is one, the dictionary's actual values are simply returned.
        Specified by:
        sumAllRowsToDoubleSq in interface IDictionary
        Overrides:
        sumAllRowsToDoubleSq in class ADictionary
        Parameters:
        nrColumns - The number of columns in the ColGroup, needed to locate each tuple's values in the dictionary.
        Returns:
        a double array containing the row sums from this dictionary.
      • getString

        public String getString(int colIndexes)
        Description copied from interface: IDictionary
        Get a string representation of the dictionary that takes the layout of the data into account.
        Parameters:
        colIndexes - The number of columns in the dictionary.
        Returns:
        A string formatted for printing.
      • sliceOutColumnRange

        public IDictionary sliceOutColumnRange(int idxStart,
                                               int idxEnd,
                                               int previousNumberOfColumns)
        Description copied from interface: IDictionary
        Modify the dictionary by removing columns not within the index range.
        Specified by:
        sliceOutColumnRange in interface IDictionary
        Overrides:
        sliceOutColumnRange in class ADictionary
        Parameters:
        idxStart - The column index to start at.
        idxEnd - The column index to end at (not inclusive)
        previousNumberOfColumns - The number of columns contained in the dictionary.
        Returns:
        A dictionary containing the sliced out columns values only.
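        Slicing a column range out of a linearized dictionary amounts to copying a contiguous run from each tuple. The sketch below works on a plain double array and returns a new array; the row-major layout and the names are assumptions for illustration:

```java
// Illustrative sketch only; operates on a plain array instead of a dictionary object.
final class SliceColumnsSketch {

    /** Keep columns [idxStart, idxEnd) of every tuple; idxEnd is exclusive. */
    static double[] sliceColumns(double[] values, int idxStart, int idxEnd, int previousNumberOfColumns) {
        int nTuples = values.length / previousNumberOfColumns;
        int width = idxEnd - idxStart;
        double[] out = new double[nTuples * width];
        for (int r = 0; r < nTuples; r++)
            System.arraycopy(values, r * previousNumberOfColumns + idxStart, out, r * width, width);
        return out;
    }
}
```

        Slicing columns 1 to 3 (exclusive) out of the three-column values {1, 2, 3, 4, 5, 6} gives {2, 3, 5, 6}.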
      • getNumberNonZeros

        public long getNumberNonZeros(int[] counts,
                                      int nCol)
        Description copied from interface: IDictionary
        Calculate the number of non zeros in the dictionary. The number of non zeros should be scaled with the counts given. This gives the exact number of non zero values in the parent column group.
        Parameters:
        counts - The counts of each dictionary entry
        nCol - The number of columns in this dictionary
        Returns:
        The nonzero count
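        Scaling by counts means each tuple's nonzeros are weighted by how often that tuple occurs in the column group. A sketch over a plain double array, assuming a row-major layout and one count per tuple (names are hypothetical):

```java
// Illustrative sketch only; counts[r] is how often tuple r occurs in the column group.
final class NonZeroCountSketch {

    /** Total nonzeros in the column group represented by the dictionary. */
    static long numberNonZeros(double[] values, int[] counts, int nCol) {
        long nnz = 0;
        for (int r = 0; r < counts.length; r++) {
            int nonZerosInTuple = 0;
            for (int c = 0; c < nCol; c++)
                if (values[r * nCol + c] != 0)
                    nonZerosInTuple++;
            nnz += (long) nonZerosInTuple * counts[r];
        }
        return nnz;
    }
}
```

        For the tuples (1, 0), (0, 0), (2, 3) with counts 4, 10, 1 the result is 1*4 + 0*10 + 2*1 = 6.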
      • countNNZZeroColumns

        public int[] countNNZZeroColumns(int[] counts)
        Description copied from interface: IDictionary
        Count the number of nonzero values in each column of the dictionary, multiplied by the counts.
        Specified by:
        countNNZZeroColumns in interface IDictionary
        Overrides:
        countNNZZeroColumns in class ADictionary
        Parameters:
        counts - The counts to multiply with.
        Returns:
        The nonzero count of each column in the dictionary.
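        The per-column variant keeps one accumulator per column instead of a single total. In this sketch the column count is derived as values.length / counts.length, which is an assumption made because the signature carries no nCol argument (names are hypothetical):

```java
// Illustrative sketch only; derives the number of columns from the array lengths.
final class ColumnNonZeroSketch {

    /** Count-weighted nonzeros for every column of a row-major dictionary. */
    static int[] countNonZerosPerColumn(double[] values, int[] counts) {
        int nCol = values.length / counts.length;
        int[] nnz = new int[nCol];
        for (int r = 0; r < counts.length; r++)
            for (int c = 0; c < nCol; c++)
                if (values[r * nCol + c] != 0)
                    nnz[c] += counts[r];
        return nnz;
    }
}
```

        For the tuples (1, 0), (0, 0), (2, 3) with counts 4, 10, 1 the per-column result is {5, 1}.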
      • getDictType

        public IDictionary.DictType getDictType()
        Description copied from interface: IDictionary
        Get the type of this dictionary.
        Returns:
        The type of this dictionary.
      • equals

        public boolean equals(IDictionary o)
        Description copied from interface: IDictionary
        Indicate whether the other dictionary is equal to this one.
        Parameters:
        o - The other dictionary
        Returns:
        true if the other dictionary is equal to this one