Interface OffsetFactory
-
public interface OffsetFactory
-
-
Nested Class Summary
Nested Classes
Modifier and Type / Interface / Description
static class OffsetFactory.OFF_TYPE
The specific underlying types of offsets.
-
Field Summary
Fields
Modifier and Type / Field / Description
static org.apache.commons.logging.Log LOG
-
Method Summary
Static Methods
Modifier and Type / Method / Description
static int correctionByte(int nRows, int size)
static int correctionChar(int nRows, int size)
static AOffset createOffset(int[] indexes)
Main factory pattern creator for Offsets.
static AOffset createOffset(int[] indexes, int apos, int alen)
Create an Offset based on a subset of the indexes given.
static AOffset createOffset(IntArrayList indexes)
Create the offsets based on our primitive IntArrayList.
static long estimateInMemorySize(int size, int nRows)
Avg diff only works assuming a normal distribution of the offsets.
static AOffset readIn(DataInput in)
Read in AOffset from the DataInput.
-
-
-
Method Detail
-
createOffset
static AOffset createOffset(int[] indexes)
Main factory pattern creator for Offsets. Note: this creator is unsafe. It assumes that the input index list contains only strictly increasing values, that is, sorted and without duplicates.
Parameters:
indexes - List of indexes, assumed to be sorted and to have no duplicates.
Returns:
AOffset object containing offsets to the next value.
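Because the creator assumes its precondition rather than enforcing it, a caller may want to check the input first. The following is a minimal sketch of such a check; `OffsetInputCheck` and `isStrictlyIncreasing` are hypothetical names that are not part of OffsetFactory, and the call to `createOffset` is left as a comment because it needs SystemDS on the classpath.

```java
import java.util.Arrays;

// Hypothetical helper, not part of OffsetFactory: checks the documented precondition.
final class OffsetInputCheck {

    /** Returns true if every index is strictly greater than the one before it. */
    static boolean isStrictlyIncreasing(int[] indexes) {
        for (int i = 1; i < indexes.length; i++) {
            if (indexes[i] <= indexes[i - 1]) {
                return false; // out of order, or a duplicate
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] sorted = {1, 4, 7, 20};
        int[] withDuplicate = {1, 4, 4, 20};
        System.out.println(Arrays.toString(sorted) + " valid: " + isStrictlyIncreasing(sorted));
        System.out.println(Arrays.toString(withDuplicate) + " valid: " + isStrictlyIncreasing(withDuplicate));
        // With SystemDS on the classpath, a validated array could then be passed on:
        // AOffset off = OffsetFactory.createOffset(sorted);
    }
}
```

The check is linear in the number of indexes, so it may be worth skipping on hot paths where the input is already known to be sorted.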
-
createOffset
static AOffset createOffset(IntArrayList indexes)
Create the offsets based on our primitive IntArrayList. Note: this creator is unsafe. It assumes that the input index list contains only strictly increasing values, that is, sorted and without duplicates.
Parameters:
indexes - The list of indexes, assumed to be sorted and to have no duplicates.
Returns:
AOffset object containing offsets to the next value.
-
createOffset
static AOffset createOffset(int[] indexes, int apos, int alen)
Create an Offset based on a subset of the indexes given. This is useful if the input is created from a CSR matrix, since it avoids reallocating indexes[] and instead uses the shared indexes from the entire CSR representation. Note: this creator is unsafe. It assumes that the input index list contains only strictly increasing values, that is, sorted and without duplicates.
Parameters:
indexes - The indexes from which to take the offsets.
apos - The position to start looking from in the indexes.
alen - The position to end looking at in the indexes.
Returns:
A new Offset.
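To illustrate how a CSR representation yields the `apos` and `alen` arguments, here is a minimal sketch that derives the range of one row from a row-pointer array. `CsrRowRange`, `rowRange`, and the sample arrays are hypothetical, and the sketch assumes that `alen` is an exclusive end position, which the description above does not state.

```java
// Sketch only: CsrRowRange is a hypothetical helper, not part of SystemDS.
final class CsrRowRange {

    /**
     * Returns {apos, alen} for one CSR row.
     * Assumption: alen is an exclusive end position (not confirmed by the documentation).
     */
    static int[] rowRange(int[] rowPointers, int row) {
        return new int[] {rowPointers[row], rowPointers[row + 1]};
    }

    public static void main(String[] args) {
        // CSR layout of a 3-row sparse matrix:
        // row 0 -> {0, 3}, row 1 -> {} (empty), row 2 -> {1, 2, 4}
        int[] rowPointers = {0, 2, 2, 5};
        int[] sharedIndexes = {0, 3, 1, 2, 4};

        for (int row = 0; row < rowPointers.length - 1; row++) {
            int[] range = rowRange(rowPointers, row);
            if (range[1] > sharedIndexes.length) {
                throw new IllegalStateException("row pointer exceeds index array");
            }
            System.out.println("row " + row + ": apos=" + range[0] + ", alen=" + range[1]);
            // With SystemDS on the classpath, the shared array could be passed without copying:
            // AOffset off = OffsetFactory.createOffset(sharedIndexes, range[0], range[1]);
        }
    }
}
```

Each row only needs two integers to describe its slice, so the index array is never copied per row.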
-
readIn
static AOffset readIn(DataInput in) throws IOException
Read in AOffset from the DataInput.
Parameters:
in - DataInput to read from.
Returns:
The AOffset data instance.
Throws:
IOException - If the DataInput fails reading in the variables.
-
estimateInMemorySize
static long estimateInMemorySize(int size, int nRows)
The average-difference estimate only works assuming a normal, roughly even distribution of the offsets. This means that if we have 1000 rows and 100 offsets, the average distance between elements is assumed to be 10. A possible improvement (TODO) is to add some extra size when the average distance is almost the same as the max value of the OffsetLists. This would increase the estimated size so that it better approximates the real compressed size, and it would handle edge cases better.
Parameters:
size - The estimated number of offsets.
nRows - The number of rows.
Returns:
The estimated size of an offset given the number of offsets and rows.
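The averaging assumption can be made concrete with a small sketch. `OffsetDistanceSketch`, `averageDistance`, and `maxGap` are hypothetical names used for illustration only; they show the arithmetic described above and are not the actual size computation, which this page does not spell out.

```java
// Sketch only: illustrates the average-distance assumption, not the real estimator.
final class OffsetDistanceSketch {

    /** Average distance between offsets if they were spread evenly over nRows. */
    static int averageDistance(int nRows, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        return nRows / size;
    }

    /** Largest gap between neighbouring indexes in an actual, sorted index list. */
    static int maxGap(int[] indexes) {
        int max = 0;
        for (int i = 1; i < indexes.length; i++) {
            max = Math.max(max, indexes[i] - indexes[i - 1]);
        }
        return max;
    }

    public static void main(String[] args) {
        // The example from the description: 1000 rows and 100 offsets.
        System.out.println("average distance: " + averageDistance(1000, 100));

        // A clustered list: the average hides one very large gap.
        int[] clustered = {0, 1, 2, 999};
        System.out.println("average distance: " + averageDistance(1000, clustered.length));
        System.out.println("largest real gap: " + maxGap(clustered));
    }
}
```

When the offsets are clustered, the largest real gap can be far above the average, which is the kind of edge case the estimate handles poorly.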
-
correctionByte
static int correctionByte(int nRows, int size)
-
correctionChar
static int correctionChar(int nRows, int size)
-
-