Interface OffsetFactory


  • public interface OffsetFactory
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Interface Description
      static class  OffsetFactory.OFF_TYPE
      The specific underlying types of offsets.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static org.apache.commons.logging.Log LOG  
    • Method Summary

      Static Methods 
      Modifier and Type Method Description
      static int correctionByte​(int nRows, int size)  
      static int correctionChar​(int nRows, int size)  
      static AOffset createOffset​(int[] indexes)
      Main factory pattern creator for Offsets.
      static AOffset createOffset​(int[] indexes, int apos, int alen)
      Create an Offset based on a subset of the given indexes.
      static AOffset createOffset​(IntArrayList indexes)
      Create the offsets based on our primitive IntArrayList.
      static long estimateInMemorySize​(int size, int nRows)
      The average difference only works assuming a normal distribution of the offsets.
      static AOffset readIn​(DataInput in)
      Read in AOffset from the DataInput.
    • Field Detail

      • LOG

        static final org.apache.commons.logging.Log LOG
    • Method Detail

      • createOffset

        static AOffset createOffset​(int[] indexes)
        Main factory pattern creator for Offsets. Note that this creator is unsafe: it assumes that the input index list is sorted in increasing order and contains no duplicates.
        Parameters:
        indexes - List of indexes, assumed to be sorted and free of duplicates
        Returns:
        AOffset object containing offsets to the next value.
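        Because the creator does not validate its input, a caller may want to check the precondition first. The following is a minimal sketch, not the library's implementation: the class OffsetInputSketch and the helpers isStrictlyIncreasing and toDeltas are hypothetical names introduced only for illustration. It checks that the indexes are sorted without duplicates and shows what "offsets to the next value" could look like as plain differences.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of the documented API.
public class OffsetInputSketch {

    // Returns true if every value is strictly greater than its predecessor,
    // i.e. the array is sorted in increasing order and has no duplicates.
    public static boolean isStrictlyIncreasing(int[] indexes) {
        for (int i = 1; i < indexes.length; i++) {
            if (indexes[i] <= indexes[i - 1])
                return false;
        }
        return true;
    }

    // Illustrative delta encoding (an assumption about the general idea only):
    // the first entry is kept as-is and every following entry stores the
    // distance from the previous index.
    public static int[] toDeltas(int[] indexes) {
        int[] deltas = new int[indexes.length];
        for (int i = 0; i < indexes.length; i++)
            deltas[i] = (i == 0) ? indexes[0] : indexes[i] - indexes[i - 1];
        return deltas;
    }

    public static void main(String[] args) {
        int[] indexes = {2, 5, 6, 40};
        if (!isStrictlyIncreasing(indexes))
            throw new IllegalArgumentException("indexes must be sorted without duplicates");
        System.out.println(Arrays.toString(toDeltas(indexes))); // [2, 3, 1, 34]
    }
}
```

        Only after such a check passes would the array be handed to createOffset(int[] indexes).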
      • createOffset

        static AOffset createOffset​(IntArrayList indexes)
        Create the offsets based on our primitive IntArrayList. Note that this creator is unsafe: it assumes that the input index list is sorted in increasing order and contains no duplicates.
        Parameters:
        indexes - The list of indexes, assumed to be sorted and free of duplicates
        Returns:
        AOffset object containing offsets to the next value.
      • createOffset

        static AOffset createOffset​(int[] indexes,
                                    int apos,
                                    int alen)
        Create an Offset based on a subset of the given indexes. This is useful if the input is created from a CSR matrix, since it avoids reallocating the indexes[] array and instead uses the shared indexes of the entire CSR representation. Note that this creator is unsafe: it assumes that the selected range of the index list is sorted in increasing order and contains no duplicates.
        Parameters:
        indexes - The indexes from which to take the offsets.
        apos - The position to start looking from in the indexes.
        alen - The position in the indexes at which to stop looking.
        Returns:
        A new Offset.
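        To illustrate how apos and alen relate to a CSR layout, here is a rough sketch under stated assumptions: the arrays rowPtr and colIdx, the class CsrRangeSketch, and its methods are hypothetical, and the end position is treated as exclusive, which the description above does not confirm. It derives the range of one row from the shared index array without copying it.

```java
// Hypothetical helper class, not part of the documented API.
public class CsrRangeSketch {

    // In a CSR layout, rowPtr[r] and rowPtr[r + 1] delimit row r inside the
    // shared column index array. Returns {start, end}, end assumed exclusive.
    public static int[] rowRange(int[] rowPtr, int row) {
        return new int[] {rowPtr[row], rowPtr[row + 1]};
    }

    // Checks the precondition on the selected range only, without copying:
    // values in [apos, alen) must be strictly increasing.
    public static boolean isStrictlyIncreasing(int[] indexes, int apos, int alen) {
        for (int i = apos + 1; i < alen; i++) {
            if (indexes[i] <= indexes[i - 1])
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Three rows: row 0 -> {0, 3}, row 1 -> {1, 2, 4}, row 2 -> {0}
        int[] rowPtr = {0, 2, 5, 6};
        int[] colIdx = {0, 3, 1, 2, 4, 0};

        int[] range = rowRange(rowPtr, 1);
        int apos = range[0];
        int alen = range[1];
        System.out.println(apos + " " + alen); // 2 5
        System.out.println(isStrictlyIncreasing(colIdx, apos, alen)); // true
    }
}
```

        The shared array as a whole is not sorted, but each row range is, so the range values are what would be passed as apos and alen.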
      • readIn

        static AOffset readIn​(DataInput in)
                       throws IOException
        Read in AOffset from the DataInput.
        Parameters:
        in - DataInput to read from
        Returns:
        The AOffset data instance
        Throws:
        IOException - If the DataInput fails reading in the variables
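        A DataInput is commonly obtained by wrapping a stream in java.io.DataInputStream. The sketch below shows that pattern with a round trip through a byte array. The layout used here (a length followed by int values) and the class DataInputSketch are purely hypothetical and do not describe the serialized format that readIn expects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper class, not part of the documented API.
public class DataInputSketch {

    // Reads a made-up layout: one int holding the count, then that many ints.
    public static int[] read(DataInput in) throws IOException {
        int n = in.readInt();
        int[] values = new int[n];
        for (int i = 0; i < n; i++)
            values[i] = in.readInt();
        return values;
    }

    // Writes the values in the made-up layout and reads them back.
    public static int[] roundTrip(int[] values) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeInt(values.length);
                for (int v : values)
                    out.writeInt(v);
            }
            DataInput in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return read(in);
        } catch (IOException e) {
            // Mirrors the documented failure mode: reading variables can fail.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(roundTrip(new int[] {2, 5, 6}))); // [2, 5, 6]
    }
}
```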
      • estimateInMemorySize

        static long estimateInMemorySize​(int size,
                                         int nRows)
        The average difference only works assuming a normal distribution of the offsets. For example, with 1000 rows and 100 offsets, the average distance between elements is assumed to be 10. A possible future improvement (TODO) is to add some amount to the size when the average distance is close to the maximum value of the OffsetLists. This would increase the estimated size, better approximate the real compressed size, and handle edge cases better.
        Parameters:
        size - The estimated number of offsets
        nRows - The number of rows.
        Returns:
        The estimated size of an offset given the number of offsets and rows.
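        The average-distance reasoning can be written out as a short sketch. This is not the library's estimator: the class OffsetEstimateSketch and both methods are hypothetical, and the choice between a one-byte and a two-byte offset is an assumption made only to illustrate why a large average distance matters.

```java
// Hypothetical helper class, not part of the documented API.
public class OffsetEstimateSketch {

    // Average distance between consecutive offsets when `size` offsets are
    // spread over `nRows` rows.
    public static double averageDistance(int size, int nRows) {
        if (size <= 0)
            throw new IllegalArgumentException("size must be positive");
        return (double) nRows / size;
    }

    // Assumption for illustration only: distances up to 255 fit an unsigned
    // byte, larger ones are stored in a two-byte char (maximum 65535).
    public static int bytesPerOffset(double avgDistance) {
        return avgDistance <= 255 ? 1 : 2;
    }

    public static void main(String[] args) {
        double avg = averageDistance(100, 1000);
        System.out.println(avg); // 10.0
        System.out.println(bytesPerOffset(avg)); // 1
    }
}
```

        When the average distance approaches the maximum of the chosen width, many individual distances would exceed it, which is the edge case the description above mentions.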
      • correctionByte

        static int correctionByte​(int nRows,
                                  int size)
      • correctionChar

        static int correctionChar​(int nRows,
                                  int size)