public class Dictionary extends ADictionary
Constructor and Description |
---|
Dictionary(double[] values) |
Modifier and Type | Method | Description |
---|---|---|
void | addMaxAndMin(double[] ret, int[] colIndexes) | Adds the max and min values contained in the dictionary to the corresponding cells in the ret variable. |
void | addToEntry(Dictionary d, int fr, int to, int nCol) | Copies and adds the dictionary entry from this dictionary to the d dictionary. |
double | aggregate(double init, Builtin fn) | Aggregate all the contained values; useful in value-only computations where the operation iterates through all values contained in the dictionary. |
void | aggregateCols(double[] c, Builtin fn, int[] colIndexes) | Aggregates the columns into the target double array provided. |
double[] | aggregateTuples(Builtin fn, int nCol) | Aggregate all entries in the rows. |
Dictionary | apply(ScalarOperator op) | Applies the scalar operation on the dictionary. |
Dictionary | applyBinaryRowOpLeft(BinaryOperator op, double[] v, boolean sparseSafe, int[] colIndexes) | Apply a binary row operation on this dictionary on the left side. |
Dictionary | applyBinaryRowOpRight(BinaryOperator op, double[] v, boolean sparseSafe, int[] colIndexes) | Apply a binary row operation on this dictionary on the right side. |
Dictionary | applyScalarOp(ScalarOperator op, double newVal, int numCols) | Applies the scalar operation on the dictionary. |
Dictionary | clone() | Returns a deep clone of the dictionary. |
Dictionary | cloneAndExtend(int len) | Clone the dictionary and extend its size by a given length. |
void | colSum(double[] c, int[] counts, int[] colIndexes, boolean square) | Get the column sum of the values contained in the dictionary. |
double[] | colSum(int[] counts, int nCol) | Get the column sum of this dictionary only. |
boolean | containsValue(double pattern) | Detect if the dictionary contains a specific value. |
MatrixBlockDictionary | getAsMatrixBlockDictionary(int nCol) | Get this dictionary as a MatrixBlock dictionary. |
long | getExactSizeOnDisk() | Calculate the space consumption if the dictionary is stored on disk. |
long | getInMemorySize() | Returns the memory usage of the dictionary. |
long | getNumberNonZeros(int[] counts, int nCol) | Calculate the number of nonzeros in the dictionary. |
int | getNumberOfValues(int nCol) | Get the number of distinct tuples given that the column group has n columns. |
String | getString(int colIndexes) | Get a string representation of the dictionary that considers the layout of the data. |
double[] | getTuple(int index, int nCol) | Get the values contained in a specific tuple of the dictionary. |
double | getValue(int i) | Get the specific value contained in the dictionary at the given index. |
double[] | getValues() | Get all the values contained in the dictionary as a linearized double array. |
boolean | isLossy() | Specify if the dictionary is lossy. |
void | preaggValuesFromDense(int numVals, int[] colIndexes, int[] aggregateColumns, double[] b, double[] ret, int cut) | Pre-aggregate values for right matrix multiplication. |
static Dictionary | read(DataInput in) | |
ADictionary | reExpandColumns(int max) | Return a new dictionary that has re-expanded all values, based on the entries already contained. |
ADictionary | replace(double pattern, double replace, int nCol, boolean safe) | Make a copy of the values, and replace all values that match pattern with the replacement value. |
ADictionary | scaleTuples(int[] scaling, int nCol) | Scale all tuples contained in the dictionary by the scaling factors given in the int list. |
ADictionary | sliceOutColumnRange(int idxStart, int idxEnd, int previousNumberOfColumns) | Modify the dictionary by removing columns not within the index range. |
ADictionary | subtractTuple(double[] tuple) | Allocate a new dictionary where the given tuple is subtracted from all tuples in the previous dictionary. |
double | sum(int[] counts, int ncol) | Get the sum of the values contained in the dictionary. |
double[] | sumAllRowsToDouble(boolean square, int nrColumns) | Method used as a pre-aggregate of each tuple in the dictionary, to single double values. |
double | sumRow(int k, boolean square, int nrColumns) | Sum the values at a specific row. |
double | sumsq(int[] counts, int ncol) | Get the square sum of the values contained in the dictionary. |
String | toString() | |
void | write(DataOutput out) | Write the dictionary to a DataOutput. |
applyBinaryRowOp, getMostCommonTuple
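Several of the summarized methods (sum, sumsq, colSum) take a counts array next to the dictionary values. The sketch below shows one plausible reading over plain arrays. It assumes the values are stored tuple by tuple with nCol values per tuple, and that counts[k] is the number of rows that map to tuple k; neither assumption is confirmed on this page. The class CountsSketch and its methods are hypothetical stand-ins, not the SystemDS implementation.

```java
import java.util.Arrays;

// Hypothetical sketch: counts-weighted aggregates over a linearized dictionary.
// Assumption: values holds tuples back to back (nCol values per tuple) and
// counts[k] is the number of rows that reference tuple k.
final class CountsSketch {

    // Sum of all represented cells: every value is weighted by its tuple's count.
    static double sum(double[] values, int[] counts, int nCol) {
        double out = 0;
        for (int k = 0; k < counts.length; k++)
            for (int j = 0; j < nCol; j++)
                out += values[k * nCol + j] * counts[k];
        return out;
    }

    // Same traversal, but each value is squared before it is weighted.
    static double sumsq(double[] values, int[] counts, int nCol) {
        double out = 0;
        for (int k = 0; k < counts.length; k++)
            for (int j = 0; j < nCol; j++) {
                double v = values[k * nCol + j];
                out += v * v * counts[k];
            }
        return out;
    }

    // Per-column sums: position j of the result accumulates column j of every tuple.
    static double[] colSum(double[] values, int[] counts, int nCol) {
        double[] res = new double[nCol];
        for (int k = 0; k < counts.length; k++)
            for (int j = 0; j < nCol; j++)
                res[j] += values[k * nCol + j] * counts[k];
        return res;
    }

    public static void main(String[] args) {
        double[] values = {1, 2, 3, 4}; // two tuples: (1, 2) and (3, 4)
        int[] counts = {3, 1};          // tuple 0 occurs three times, tuple 1 once
        System.out.println(sum(values, counts, 2));                     // 16.0
        System.out.println(sumsq(values, counts, 2));                   // 40.0
        System.out.println(Arrays.toString(colSum(values, counts, 2))); // [6.0, 10.0]
    }
}
```

Under this reading, the aggregates cost one pass over the distinct tuples rather than one pass over every row of the column group.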
public double[] getValues()
ADictionary
getValues
in class ADictionary
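getValues() exposes the dictionary as a linearized double array. As a rough illustration of how tuple-level accessors such as getTuple and getNumberOfValues can be read against that array, the sketch below assumes a row-major layout with nCol consecutive values per tuple. That layout is an assumption, and LayoutSketch is a hypothetical helper rather than part of SystemDS.

```java
import java.util.Arrays;

// Hypothetical sketch of addressing tuples inside a linearized value array.
// Assumption: tuples are stored back to back, nCol values each (row-major).
final class LayoutSketch {

    // Number of tuples that fit in the array for a given tuple width.
    static int numberOfValues(double[] values, int nCol) {
        return values.length / nCol;
    }

    // Copy of the nCol values that make up the tuple at the given index.
    static double[] tuple(double[] values, int index, int nCol) {
        return Arrays.copyOfRange(values, index * nCol, (index + 1) * nCol);
    }

    public static void main(String[] args) {
        double[] values = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};
        System.out.println(numberOfValues(values, 3));            // 2
        System.out.println(Arrays.toString(tuple(values, 1, 3))); // [4.0, 5.0, 6.0]
    }
}
```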
public double getValue(int i)
ADictionary
getValue
in class ADictionary
i - The index to extract the value from.
public long getInMemorySize()
ADictionary
getInMemorySize
in class ADictionary
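The aggregate method documented next folds every stored value into a single double, starting from init. The sketch below shows why the choice of init matters, using java.util.function.DoubleBinaryOperator as a stand-in for the Builtin function object; that substitution and the class AggregateSketch are assumptions made for illustration only.

```java
import java.util.function.DoubleBinaryOperator;

// Hypothetical sketch of a fold-style aggregate over dictionary values.
// DoubleBinaryOperator stands in for the Builtin function type.
final class AggregateSketch {

    // Start from init and combine it with every value in turn.
    static double aggregate(double[] values, double init, DoubleBinaryOperator fn) {
        double ret = init;
        for (double v : values)
            ret = fn.applyAsDouble(ret, v);
        return ret;
    }

    public static void main(String[] args) {
        double[] values = {-3.0, -1.0, -7.0};
        // With -infinity as the start value, max returns the true maximum.
        System.out.println(aggregate(values, Double.NEGATIVE_INFINITY, Math::max)); // -1.0
        // Starting from 0 would wrongly report 0 for an all-negative dictionary.
        System.out.println(aggregate(values, 0.0, Math::max)); // 0.0
    }
}
```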
public double aggregate(double init, Builtin fn)
ADictionary
aggregate
in class ADictionary
init - The initial value; for an aggregate such as max, this could be -infinity.
fn - The function to apply to the values.
public double[] aggregateTuples(Builtin fn, int nCol)
ADictionary
aggregateTuples
in class ADictionary
fn - The aggregate function.
nCol - The number of columns contained in the dictionary.
public Dictionary apply(ScalarOperator op)
ADictionary
apply
in class ADictionary
op - The operator to apply to the dictionary values.
public Dictionary applyScalarOp(ScalarOperator op, double newVal, int numCols)
ADictionary
applyScalarOp
in class ADictionary
op - The operator to apply to the dictionary values.
newVal - The value to append to the dictionary.
numCols - The number of columns stored in the dictionary.
public Dictionary applyBinaryRowOpRight(BinaryOperator op, double[] v, boolean sparseSafe, int[] colIndexes)
ADictionary
applyBinaryRowOpRight
in class ADictionary
op - The operation to apply to this dictionary.
v - The values to use on the right hand side.
sparseSafe - Boolean specifying whether the operation is safe, and therefore does not need to allocate an extended dictionary.
colIndexes - The column indexes to consider inside v.
public Dictionary applyBinaryRowOpLeft(BinaryOperator op, double[] v, boolean sparseSafe, int[] colIndexes)
ADictionary
applyBinaryRowOpLeft
in class ADictionary
op - The operation to apply to this dictionary.
v - The values to use on the left hand side.
sparseSafe - Boolean specifying whether the operation is safe, and therefore does not need to allocate an extended dictionary.
colIndexes - The column indexes to consider inside v.
public Dictionary clone()
ADictionary
clone
in class ADictionary
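clone() is described as a deep clone, so the copy should own its storage and later changes to one dictionary should not show up in the other. The sketch below illustrates that property on a plain array, together with cloneAndExtend from the method summary. It assumes the extra cells are appended at the end and zero-filled, which this page does not state; CloneSketch is a hypothetical helper.

```java
import java.util.Arrays;

// Hypothetical sketch of deep-clone and clone-and-extend on the value array.
// Assumption: extension appends zero-filled cells at the end.
final class CloneSketch {

    // Independent copy of the values; the two arrays share no storage.
    static double[] deepClone(double[] values) {
        return Arrays.copyOf(values, values.length);
    }

    // Copy with len additional cells; Arrays.copyOf pads the tail with 0.0.
    static double[] cloneAndExtend(double[] values, int len) {
        return Arrays.copyOf(values, values.length + len);
    }

    public static void main(String[] args) {
        double[] original = {1.0, 2.0};
        double[] copy = deepClone(original);
        copy[0] = 42.0; // does not affect the original
        System.out.println(original[0]);                                  // 1.0
        System.out.println(Arrays.toString(cloneAndExtend(original, 2))); // [1.0, 2.0, 0.0, 0.0]
    }
}
```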
public Dictionary cloneAndExtend(int len)
ADictionary
cloneAndExtend
in class ADictionary
len - The length to extend the dictionary by; this value is assumed to be positive.
public static Dictionary read(DataInput in) throws IOException
IOException
public void write(DataOutput out) throws IOException
ADictionary
write
in class ADictionary
out - The output sink to write the dictionary to.
IOException - If the sink fails.
public long getExactSizeOnDisk()
ADictionary
getExactSizeOnDisk
in class ADictionary
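write, read and getExactSizeOnDisk belong together: whatever write emits, read has to consume, and the size estimate has to equal the number of bytes written. The sketch below illustrates that contract with an invented format, an int length followed by the raw doubles. The actual SystemDS on-disk layout is not given on this page and may differ, so SerializationSketch and its format are assumptions for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Hypothetical sketch of a write/read/size contract for a value array.
// Assumed format (not the SystemDS format): int length, then each double.
final class SerializationSketch {

    static void write(double[] values, DataOutput out) throws IOException {
        out.writeInt(values.length);
        for (double v : values)
            out.writeDouble(v);
    }

    static double[] read(DataInput in) throws IOException {
        int n = in.readInt();
        double[] values = new double[n];
        for (int i = 0; i < n; i++)
            values[i] = in.readDouble();
        return values;
    }

    // 4 bytes for the length prefix plus 8 bytes per double.
    static long exactSizeOnDisk(double[] values) {
        return 4 + 8L * values.length;
    }

    public static void main(String[] args) throws IOException {
        double[] values = {1.5, -2.0, 0.0};
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        write(values, new DataOutputStream(bos));
        byte[] bytes = bos.toByteArray();
        System.out.println(bytes.length == exactSizeOnDisk(values)); // true
        double[] back = read(new DataInputStream(new ByteArrayInputStream(bytes)));
        System.out.println(Arrays.equals(values, back)); // true
    }
}
```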
public int getNumberOfValues(int nCol)
ADictionary
getNumberOfValues
in class ADictionary
nCol - The number of columns in the ColumnGroup.
public double[] sumAllRowsToDouble(boolean square, int nrColumns)
ADictionary
sumAllRowsToDouble
in class ADictionary
square - If each entry should be squared.
nrColumns - The number of columns in the ColGroup, needed to know how to get the values from the dictionary.
public double sumRow(int k, boolean square, int nrColumns)
ADictionary
sumRow
in class ADictionary
k - The row index to sum.
square - If each entry should be squared.
nrColumns - The number of columns.
public double[] colSum(int[] counts, int nCol)
ADictionary
colSum
in class ADictionary
counts - The counts of the values contained.
nCol - The number of columns contained in each tuple.
public void colSum(double[] c, int[] counts, int[] colIndexes, boolean square)
ADictionary
colSum
in class ADictionary
c - The output array allocated to contain the output of all column groups.
counts - The counts of the individual tuples.
colIndexes - The column indexes of the parent column group; these indicate where to put the column sum in the c output.
square - Specify if the values should be squared.
public double sum(int[] counts, int ncol)
ADictionary
sum
in class ADictionary
counts - The counts of the individual tuples.
ncol - The number of columns contained.
public double sumsq(int[] counts, int ncol)
ADictionary
sumsq
in class ADictionary
counts - The counts of the individual tuples.
ncol - The number of columns contained.
public void addMaxAndMin(double[] ret, int[] colIndexes)
ADictionary
addMaxAndMin
in class ADictionary
ret - The double array that contains the min and max of all columns.
colIndexes - The column indexes contained in this dictionary.
public String getString(int colIndexes)
ADictionary
getString
in class ADictionary
colIndexes - The number of columns in the dictionary.
public ADictionary sliceOutColumnRange(int idxStart, int idxEnd, int previousNumberOfColumns)
ADictionary
sliceOutColumnRange
in class ADictionary
idxStart - The column index to start at.
idxEnd - The column index to end at (not inclusive).
previousNumberOfColumns - The number of columns contained in the dictionary.
public ADictionary reExpandColumns(int max)
ADictionary
reExpandColumns
in class ADictionary
max - The number of output columns possible.
public boolean containsValue(double pattern)
ADictionary
containsValue
in class ADictionary
pattern - The value to search for.
public long getNumberNonZeros(int[] counts, int nCol)
ADictionary
getNumberNonZeros
in class ADictionary
counts - The counts of each dictionary entry.
nCol - The number of columns in this dictionary.
public void addToEntry(Dictionary d, int fr, int to, int nCol)
ADictionary
addToEntry
in class ADictionary
d - The target dictionary.
fr - The from index.
to - The to index.
nCol - The number of columns.
public boolean isLossy()
ADictionary
isLossy
in class ADictionary
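addToEntry, documented just above, is described as copying and adding an entry of this dictionary into a target dictionary. One plausible reading is sketched below on plain arrays: the tuple at index fr of the source is added cell by cell onto the tuple at index to of the target. The row-major layout and the class AddToEntrySketch are assumptions; this is not the SystemDS implementation.

```java
import java.util.Arrays;

// Hypothetical sketch of adding one tuple of a source dictionary onto a
// tuple of a target dictionary. Assumption: nCol values per tuple, row-major.
final class AddToEntrySketch {

    // Adds tuple fr of src onto tuple to of dst, one cell at a time.
    static void addToEntry(double[] src, double[] dst, int fr, int to, int nCol) {
        int srcOff = fr * nCol;
        int dstOff = to * nCol;
        for (int j = 0; j < nCol; j++)
            dst[dstOff + j] += src[srcOff + j];
    }

    public static void main(String[] args) {
        double[] src = {1.0, 2.0, 3.0, 4.0};     // tuples (1, 2) and (3, 4)
        double[] dst = {10.0, 20.0, 30.0, 40.0}; // tuples (10, 20) and (30, 40)
        addToEntry(src, dst, 1, 0, 2);           // add src tuple 1 onto dst tuple 0
        System.out.println(Arrays.toString(dst)); // [13.0, 24.0, 30.0, 40.0]
    }
}
```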
public double[] getTuple(int index, int nCol)
ADictionary
getTuple
in class ADictionary
index - The index where the values are located.
nCol - The number of columns contained in this dictionary.
public ADictionary subtractTuple(double[] tuple)
ADictionary
subtractTuple
in class ADictionary
tuple - A double array representing a tuple; it is assumed that the tuple width is the same as this dictionary's.
public MatrixBlockDictionary getAsMatrixBlockDictionary(int nCol)
ADictionary
getAsMatrixBlockDictionary
in class ADictionary
nCol - The number of columns contained in this column group.
public void aggregateCols(double[] c, Builtin fn, int[] colIndexes)
ADictionary
aggregateCols
in class ADictionary
c - The target double array; this contains the full number of columns, therefore the colIndexes for this specific dictionary are needed.
fn - The function to apply to the individual columns.
colIndexes - The mapping from the individual columns to the target columns.
public ADictionary scaleTuples(int[] scaling, int nCol)
ADictionary
scaleTuples
in class ADictionary
scaling - The amount to multiply the given tuples with.
nCol - The number of columns contained in this column group.
public void preaggValuesFromDense(int numVals, int[] colIndexes, int[] aggregateColumns, double[] b, double[] ret, int cut)
ADictionary
preaggValuesFromDense
in class ADictionary
numVals - The number of values contained in this dictionary.
colIndexes - The column indexes that are associated with the parent column group.
aggregateColumns - The columns to aggregate; these are preprocessed to remove consideration of empty columns.
b - The values in the right hand side matrix.
ret - The double array to put the aggregate in.
cut - The number of columns in b.
public ADictionary replace(double pattern, double replace, int nCol, boolean safe)
ADictionary
replace
in class ADictionary
pattern - The value to look for.
replace - The value to replace the other value with.
nCol - The number of columns contained in the dictionary.
safe - Specify if the operation requires consideration of adding a new tuple. This depends on whether the dictionary has allocated the last zero tuple or not.
Copyright © 2021 The Apache Software Foundation. All rights reserved.