Class FrameBlock
- java.lang.Object
- org.apache.sysds.runtime.matrix.data.FrameBlock
- All Implemented Interfaces:
Externalizable, Serializable, org.apache.hadoop.io.Writable, CacheBlock
public class FrameBlock extends Object implements CacheBlock, Externalizable
- See Also:
- Serialized Form
-
-
Nested Class Summary
Nested Classes: Modifier and Type, Class, Description
static class
FrameBlock.ColumnMetadata
static class
FrameBlock.FrameMapFunction
-
Field Summary
Fields: Modifier and Type, Field, Description
static int
BUFFER_SIZE
-
Constructor Summary
Constructors: Constructor, Description
FrameBlock()
FrameBlock(int ncols, Types.ValueType vt)
FrameBlock(Types.ValueType[] schema)
FrameBlock(Types.ValueType[] schema, String[] names)
FrameBlock(Types.ValueType[] schema, String[][] data)
FrameBlock(Types.ValueType[] schema, String[] names, String[][] data)
FrameBlock(FrameBlock that)
Copy constructor for frame blocks, which uses a shallow copy for the schema (column types and names) but a deep copy for metadata and actual column data.
-
Method Summary
Methods: Modifier and Type, Method, Description
FrameBlock
append(FrameBlock that, FrameBlock ret, boolean cbind)
Appends the given argument frameblock 'that' to this frameblock by creating a deep copy to prevent side effects.
void
appendColumn(boolean[] col)
Append a column of value type BOOLEAN as the last column of the data frame.
void
appendColumn(double[] col)
Append a column of value type DOUBLE as the last column of the data frame.
void
appendColumn(float[] col)
Append a column of value type float as the last column of the data frame.
void
appendColumn(int[] col)
Append a column of value type INT as the last column of the data frame.
void
appendColumn(long[] col)
Append a column of value type LONG as the last column of the data frame.
void
appendColumn(String[] col)
Append a column of value type STRING as the last column of the data frame.
void
appendColumn(Types.ValueType vt, org.apache.sysds.runtime.matrix.data.FrameBlock.Array col)
void
appendColumns(double[][] cols)
Append a set of columns of value type DOUBLE at the end of the frame in order to avoid the repeated allocation of successive appendColumn calls.
void
appendRow(Object[] row)
Append a row to the end of the data frame, where all row fields are boxed objects according to the schema.
void
appendRow(String[] row)
Append a row to the end of the data frame, where all row fields are string encoded.
FrameBlock
binaryOperations(BinaryOperator bop, FrameBlock that, FrameBlock out)
Performs a cell-wise value comparison of two frames (equal, not equal, less than, greater than, less than or equal to, greater than or equal to); the output frame stores a boolean value for each comparison.
void
compactEmptyBlock()
Free unnecessarily allocated empty block.
void
copy(int rl, int ru, int cl, int cu, FrameBlock src)
void
copy(FrameBlock src)
static String
createColName(int i)
static String[]
createColNames(int size)
static String[]
createColNames(int off, int size)
FrameBlock
detectSchemaFromRow(double sampleFraction)
FrameBlock
dropInvalidType(FrameBlock schema)
Drops cell values that do not conform to the data type of their column.
void
ensureAllocatedColumns(int numRows)
Allocate column data structures if necessary, i.e., if schema specified but not all column data structures created yet.
void
ensureColumnCompatibility(int newlen)
Checks for matching column sizes in case of existing columns.
FrameBlock
frameRowReplication(FrameBlock rowToreplicate)
Object
get(int r, int c)
Gets a boxed object of the value in position (r,c).
org.apache.sysds.runtime.matrix.data.FrameBlock.Array
getColumn(int c)
byte[]
getColumnAsBytes(int c)
Object
getColumnData(int c)
FrameBlock.ColumnMetadata[]
getColumnMetadata()
FrameBlock.ColumnMetadata
getColumnMetadata(int c)
String
getColumnName(int c)
Returns the column name for the requested column.
Map<String,Integer>
getColumnNameIDMap()
Creates a mapping from column names to column IDs, i.e., 1-based column indexes.
String[]
getColumnNames()
Returns the column names of the frame block.
String[]
getColumnNames(boolean alloc)
Returns the column names of the frame block.
FrameBlock
getColumnNamesAsFrame()
String
getColumnType(int c)
static FrameBlock.FrameMapFunction
getCompiledFunction(String lambdaExpr, long margin)
DataCharacteristics
getDataCharacteristics()
double
getDouble(int r, int c)
Returns the double value at the passed row and column.
double
getDoubleNaN(int r, int c)
Returns the double value at the passed row and column.
long
getExactSerializedSize()
Get the exact serialized size in bytes of the cache block.
byte[]
getIndexAsBytes(int c, int r)
Get a specific index as bytes; this method is used to parse the strings into Python.
long
getInMemorySize()
Get the in-memory size in bytes of the cache block.
int
getNumColumns()
Get the number of columns of the frame block, that is the number of columns defined in the schema.
int
getNumRows()
Get the number of rows of the frame block.
Iterator<Object[]>
getObjectRowIterator()
Get a row iterator over the frame where all fields are encoded as boxed objects according to their value types.
Iterator<Object[]>
getObjectRowIterator(int[] cols)
Get a row iterator over the frame where all selected fields are encoded as boxed objects according to their value types.
Iterator<Object[]>
getObjectRowIterator(int rl, int ru)
Get a row iterator over the frame where all fields are encoded as boxed objects according to their value types.
Iterator<Object[]>
getObjectRowIterator(int rl, int ru, int[] cols)
Get a row iterator over the frame where all selected fields are encoded as boxed objects according to their value types.
Iterator<Object[]>
getObjectRowIterator(Types.ValueType[] schema)
Get a row iterator over the frame where all fields are encoded as boxed objects according to the value types of the provided target schema.
HashMap<String,Long>
getRecodeMap(int col)
Splits every recode map entry in the column on the delimiter Lop.DATATYPE_PREFIX (entries were generated earlier in the form Code+Lop.DATATYPE_PREFIX+Token) and stores the result in a map from each unique token to its code.
Types.ValueType[]
getSchema()
Returns the schema of the frame block.
FrameBlock
getSchemaTypeOf()
String
getString(int r, int c)
Returns the string of the value at the passed row and column.
Iterator<String[]>
getStringRowIterator()
Get a row iterator over the frame where all fields are encoded as strings independent of their value types.
Iterator<String[]>
getStringRowIterator(int colID)
Get a row iterator over the frame where all selected fields are encoded as strings independent of their value types.
Iterator<String[]>
getStringRowIterator(int[] cols)
Get a row iterator over the frame where all selected fields are encoded as strings independent of their value types.
Iterator<String[]>
getStringRowIterator(int rl, int ru)
Get a row iterator over the frame where all fields are encoded as strings independent of their value types.
Iterator<String[]>
getStringRowIterator(int rl, int ru, int colID)
Get a row iterator over the frame where all selected fields are encoded as strings independent of their value types.
Iterator<String[]>
getStringRowIterator(int rl, int ru, int[] cols)
Get a row iterator over the frame where all selected fields are encoded as strings independent of their value types.
FrameBlock
invalidByLength(MatrixBlock feaLen)
Validates the frame data against a per-attribute length constraint: if the value in any cell is longer than the threshold specified for that attribute, the output frame stores null at that cell position, thus removing the length-violating values.
boolean
isColNameDefault(int i)
boolean
isColNamesDefault()
boolean
isColumnMetadataDefault()
boolean
isColumnMetadataDefault(int c)
boolean
isShallowSerialize()
Indicates if the cache block is subject to shallow serialization, which is generally true if the in-memory size and serialized size are almost identical, allowing unnecessary deep serialization to be avoided.
boolean
isShallowSerialize(boolean inclConvert)
Indicates if the cache block is subject to shallow serialization, which is generally true if the in-memory size and serialized size are almost identical, allowing unnecessary deep serialization to be avoided.
FrameBlock
leftIndexingOperations(FrameBlock rhsFrame, int rl, int ru, int cl, int cu, FrameBlock ret)
FrameBlock
leftIndexingOperations(FrameBlock rhsFrame, IndexRange ixrange, FrameBlock ret)
FrameBlock
map(String lambdaExpr, long margin)
FrameBlock
map(FrameBlock.FrameMapFunction lambdaExpr, long margin)
FrameBlock
mapDist(FrameBlock.FrameMapFunction lambdaExpr)
void
mapInplace(Function<String,String> fun)
void
merge(CacheBlock that, boolean bDummy)
Merge the given block into the current block.
void
merge(FrameBlock that)
static FrameBlock
mergeSchema(FrameBlock temp1, FrameBlock temp2)
void
readExternal(ObjectInput in)
void
readFields(DataInput in)
void
recomputeColumnCardinality()
FrameBlock
removeEmptyOperations(boolean rows, boolean emptyReturn, MatrixBlock select)
<T> FrameBlock
replaceOperations(String pattern, String replacement)
void
reset()
void
reset(int nrow, boolean clearMeta)
void
set(int r, int c, Object val)
Sets the value in position (r,c), where the input is assumed to be a boxed object consistent with the schema definition.
void
setColumn(int c, org.apache.sysds.runtime.matrix.data.FrameBlock.Array column)
void
setColumnMetadata(int c, FrameBlock.ColumnMetadata colmeta)
void
setColumnMetadata(FrameBlock.ColumnMetadata[] colmeta)
void
setColumnNames(String[] colnames)
void
setNumRows(int numRows)
void
setSchema(Types.ValueType[] schema)
Sets the schema of the frame block.
FrameBlock
slice(int rl, int ru, int cl, int cu, boolean deep, CacheBlock retCache)
Right indexing operations to slice a subframe out of this frame block.
FrameBlock
slice(int rl, int ru, int cl, int cu, CacheBlock retCache)
Slice a sub block out of the current block and write into the given output block.
void
slice(ArrayList<Pair<Long,FrameBlock>> outlist, IndexRange range, int rowCut)
FrameBlock
slice(IndexRange ixrange, FrameBlock ret)
void
toShallowSerializeBlock()
Converts a cache block that is not shallow serializable into a form that is shallow serializable.
String
toString()
FrameBlock
valueSwap(FrameBlock schema)
void
write(DataOutput out)
void
writeExternal(ObjectOutput out)
FrameBlock
zeroOutOperations(FrameBlock result, IndexRange range, boolean complementary, int iRowStartSrc, int iRowStartDest, int blen, int iMaxRowsToCopy)
This function zeroes out the data in the slicing window applicable for this block.
-
-
-
Field Detail
-
BUFFER_SIZE
public static final int BUFFER_SIZE
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
FrameBlock
public FrameBlock()
-
FrameBlock
public FrameBlock(FrameBlock that)
Copy constructor for frame blocks, which uses a shallow copy for the schema (column types and names) but a deep copy for metadata and actual column data.
- Parameters:
that - frame block
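The difference between the shallow schema copy and the deep data copy can be illustrated with a small, self-contained sketch. `CopySketch` is a hypothetical stand-in written for this example, not the SystemDS `FrameBlock` class, and it models columns as plain `String[]` arrays, which is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a frame; NOT the SystemDS FrameBlock class.
final class CopySketch {
    final String[] schema;        // column types, shared between copies (shallow)
    final List<String[]> columns; // column data, duplicated per copy (deep)

    CopySketch(String[] schema, List<String[]> columns) {
        this.schema = schema;
        this.columns = columns;
    }

    // Copy constructor: reuse the schema reference, clone every column array.
    CopySketch(CopySketch that) {
        this.schema = that.schema;
        this.columns = new ArrayList<>();
        for (String[] col : that.columns)
            this.columns.add(col.clone());
    }

    public static void main(String[] args) {
        List<String[]> data = new ArrayList<>();
        data.add(new String[] {"a", "b"});
        CopySketch original = new CopySketch(new String[] {"STRING"}, data);
        CopySketch copy = new CopySketch(original);
        copy.columns.get(0)[0] = "changed";
        // The original column data is unaffected by writes to the copy.
        System.out.println(original.columns.get(0)[0]);
        // The schema array is the same object in both instances.
        System.out.println(original.schema == copy.schema);
    }
}
```

In this model, a later change to the shared schema array would be visible through both instances, whereas cell writes stay local to one instance.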
-
FrameBlock
public FrameBlock(int ncols, Types.ValueType vt)
-
FrameBlock
public FrameBlock(Types.ValueType[] schema)
-
FrameBlock
public FrameBlock(Types.ValueType[] schema, String[] names)
-
FrameBlock
public FrameBlock(Types.ValueType[] schema, String[][] data)
-
FrameBlock
public FrameBlock(Types.ValueType[] schema, String[] names, String[][] data)
-
-
Method Detail
-
getNumRows
public int getNumRows()
Get the number of rows of the frame block.
- Specified by:
getNumRows in interface CacheBlock
- Returns:
number of rows
-
getDouble
public double getDouble(int r, int c)
Description copied from interface: CacheBlock
Returns the double value at the passed row and column. If the value is missing, 0 is returned.
- Specified by:
getDouble in interface CacheBlock
- Parameters:
r - row of the value
c - column of the value
- Returns:
double value at the passed row and column
-
getDoubleNaN
public double getDoubleNaN(int r, int c)
Description copied from interface: CacheBlock
Returns the double value at the passed row and column. If the value is missing, NaN is returned.
- Specified by:
getDoubleNaN in interface CacheBlock
- Parameters:
r - row of the value
c - column of the value
- Returns:
double value at the passed row and column
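The only documented difference between getDouble and getDoubleNaN is how a missing value is reported. The following sketch contrasts the two conventions on a single cell. `MissingValueSketch` and its methods are hypothetical names, and representing a missing cell as a `null` boxed `Double` is an assumption of this example.

```java
// Hypothetical helper; a null Double stands in for a "missing" cell (assumption).
final class MissingValueSketch {
    // Mirrors the getDouble contract: a missing value reads as 0.
    static double asDouble(Double cell) {
        return (cell == null) ? 0.0 : cell;
    }

    // Mirrors the getDoubleNaN contract: a missing value reads as NaN.
    static double asDoubleNaN(Double cell) {
        return (cell == null) ? Double.NaN : cell;
    }

    public static void main(String[] args) {
        System.out.println(asDouble(null));    // missing reads as zero
        System.out.println(asDoubleNaN(null)); // missing reads as NaN
        System.out.println(asDouble(2.5));     // present values pass through
    }
}
```

The NaN variant is the one to prefer when a true zero must stay distinguishable from a missing entry, for example before computing a mean.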
-
getString
public String getString(int r, int c)
Description copied from interface: CacheBlock
Returns the string of the value at the passed row and column. If the value is missing or NaN, null is returned.
- Specified by:
getString in interface CacheBlock
- Parameters:
r - row of the value
c - column of the value
- Returns:
string of the value at the passed row and column
-
setNumRows
public void setNumRows(int numRows)
-
getNumColumns
public int getNumColumns()
Get the number of columns of the frame block, that is the number of columns defined in the schema.
- Specified by:
getNumColumns in interface CacheBlock
- Returns:
number of columns
-
getDataCharacteristics
public DataCharacteristics getDataCharacteristics()
- Specified by:
getDataCharacteristics in interface CacheBlock
-
getSchema
public Types.ValueType[] getSchema()
Returns the schema of the frame block.
- Returns:
schema as array of ValueTypes
-
setSchema
public void setSchema(Types.ValueType[] schema)
Sets the schema of the frame block.
- Parameters:
schema - schema as array of ValueTypes
-
getColumnNames
public String[] getColumnNames()
Returns the column names of the frame block. This method allocates default column names if required.
- Returns:
column names
-
getColumnNamesAsFrame
public FrameBlock getColumnNamesAsFrame()
-
getColumnNames
public String[] getColumnNames(boolean alloc)
Returns the column names of the frame block. This method allocates default column names if required.
- Parameters:
alloc - if true, create column names
- Returns:
array of column names
-
getColumnName
public String getColumnName(int c)
Returns the column name for the requested column. This method allocates default column names if required.
- Parameters:
c - column index
- Returns:
column name
-
setColumnNames
public void setColumnNames(String[] colnames)
-
getColumnMetadata
public FrameBlock.ColumnMetadata[] getColumnMetadata()
-
getColumnMetadata
public FrameBlock.ColumnMetadata getColumnMetadata(int c)
-
isColumnMetadataDefault
public boolean isColumnMetadataDefault()
-
isColumnMetadataDefault
public boolean isColumnMetadataDefault(int c)
-
setColumnMetadata
public void setColumnMetadata(FrameBlock.ColumnMetadata[] colmeta)
-
setColumnMetadata
public void setColumnMetadata(int c, FrameBlock.ColumnMetadata colmeta)
-
getColumnNameIDMap
public Map<String,Integer> getColumnNameIDMap()
Creates a mapping from column names to column IDs, i.e., 1-based column indexes.
- Returns:
map of column name keys and id values
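Because the IDs are 1-based while Java arrays are 0-based, an off-by-one error is easy to make when using this map. A minimal sketch of the mapping, assuming the column names are available as a plain `String[]` (the class and method names here are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper illustrating a name-to-ID map with 1-based IDs.
final class ColumnIdSketch {
    static Map<String, Integer> nameToId(String[] names) {
        Map<String, Integer> map = new HashMap<>();
        for (int i = 0; i < names.length; i++)
            map.put(names[i], i + 1); // array position i becomes column ID i + 1
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> ids = nameToId(new String[] {"id", "name", "age"});
        // Subtract 1 again before using an ID as a 0-based array index.
        System.out.println(ids.get("name") - 1);
    }
}
```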
-
ensureAllocatedColumns
public void ensureAllocatedColumns(int numRows)
Allocate column data structures if necessary, i.e., if schema specified but not all column data structures created yet.
- Parameters:
numRows - number of rows
-
ensureColumnCompatibility
public void ensureColumnCompatibility(int newlen)
Checks for matching column sizes in case of existing columns.
- Parameters:
newlen - number of rows to compare with existing number of rows
-
createColNames
public static String[] createColNames(int size)
-
createColNames
public static String[] createColNames(int off, int size)
-
createColName
public static String createColName(int i)
-
isColNamesDefault
public boolean isColNamesDefault()
-
isColNameDefault
public boolean isColNameDefault(int i)
-
recomputeColumnCardinality
public void recomputeColumnCardinality()
-
get
public Object get(int r, int c)
Gets a boxed object of the value in position (r,c).
- Parameters:
r - row index, 0-based
c - column index, 0-based
- Returns:
object of the value at specified position
-
set
public void set(int r, int c, Object val)
Sets the value in position (r,c), where the input is assumed to be a boxed object consistent with the schema definition.
- Parameters:
r - row index
c - column index
val - value to set at specified position
-
reset
public void reset(int nrow, boolean clearMeta)
-
reset
public void reset()
-
appendRow
public void appendRow(Object[] row)
Append a row to the end of the data frame, where all row fields are boxed objects according to the schema.
- Parameters:
row - array of objects
-
appendRow
public void appendRow(String[] row)
Append a row to the end of the data frame, where all row fields are string encoded.
- Parameters:
row - array of strings
-
appendColumn
public void appendColumn(String[] col)
Append a column of value type STRING as the last column of the data frame. The given array is wrapped but not copied and hence might be updated in the future.
- Parameters:
col - array of strings
-
appendColumn
public void appendColumn(boolean[] col)
Append a column of value type BOOLEAN as the last column of the data frame. The given array is wrapped but not copied and hence might be updated in the future.
- Parameters:
col - array of booleans
-
appendColumn
public void appendColumn(int[] col)
Append a column of value type INT as the last column of the data frame. The given array is wrapped but not copied and hence might be updated in the future.
- Parameters:
col - array of ints
-
appendColumn
public void appendColumn(long[] col)
Append a column of value type LONG as the last column of the data frame. The given array is wrapped but not copied and hence might be updated in the future.
- Parameters:
col - array of longs
-
appendColumn
public void appendColumn(float[] col)
Append a column of value type float as the last column of the data frame. The given array is wrapped but not copied and hence might be updated in the future.
- Parameters:
col - array of floats
-
appendColumn
public void appendColumn(double[] col)
Append a column of value type DOUBLE as the last column of the data frame. The given array is wrapped but not copied and hence might be updated in the future.
- Parameters:
col - array of doubles
-
appendColumns
public void appendColumns(double[][] cols)
Append a set of columns of value type DOUBLE at the end of the frame in order to avoid the repeated allocation of successive appendColumn calls. The given array is wrapped but not copied and hence might be updated in the future.
- Parameters:
cols - 2d array of doubles
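The "wrapped but not copied" note on the appendColumn and appendColumns methods means the caller's array and the frame's column share storage. The sketch below reproduces that aliasing with a hypothetical stand-in class (`WrapSketch`, not part of SystemDS) and shows that passing a clone is one way to isolate the caller's array.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in; stores column arrays by reference, as the note describes.
final class WrapSketch {
    private final List<String[]> columns = new ArrayList<>();

    // Wraps the given array without copying it.
    void appendColumn(String[] col) {
        columns.add(col);
    }

    String get(int r, int c) {
        return columns.get(c)[r];
    }

    public static void main(String[] args) {
        String[] source = {"x", "y"};
        WrapSketch frame = new WrapSketch();
        frame.appendColumn(source);         // column 0 aliases the caller's array
        frame.appendColumn(source.clone()); // column 1 owns a private copy
        source[0] = "z";
        System.out.println(frame.get(0, 0)); // reflects the later write to source
        System.out.println(frame.get(0, 1)); // unaffected by the later write
    }
}
```

Cloning costs an extra allocation per column, so it is only worth doing when the caller keeps mutating the source array afterwards.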
-
appendColumn
public void appendColumn(Types.ValueType vt, org.apache.sysds.runtime.matrix.data.FrameBlock.Array col)
-
getColumnData
public Object getColumnData(int c)
-
getColumnType
public String getColumnType(int c)
-
getIndexAsBytes
public byte[] getIndexAsBytes(int c, int r)
Get a specific index as bytes; this method is used to parse the strings into Python. It should only be used in columns where the datatype is String, since in other cases it might be faster to return other types. Note that P
- Parameters:
c - The column index.
r - The row index.
- Returns:
The returned byte array.
-
getColumnAsBytes
public byte[] getColumnAsBytes(int c)
-
getColumn
public org.apache.sysds.runtime.matrix.data.FrameBlock.Array getColumn(int c)
-
setColumn
public void setColumn(int c, org.apache.sysds.runtime.matrix.data.FrameBlock.Array column)
-
getStringRowIterator
public Iterator<String[]> getStringRowIterator()
Get a row iterator over the frame where all fields are encoded as strings independent of their value types.
- Returns:
string array iterator
-
getStringRowIterator
public Iterator<String[]> getStringRowIterator(int[] cols)
Get a row iterator over the frame where all selected fields are encoded as strings independent of their value types.
- Parameters:
cols - column selection, 1-based
- Returns:
string array iterator
-
getStringRowIterator
public Iterator<String[]> getStringRowIterator(int colID)
Get a row iterator over the frame where all selected fields are encoded as strings independent of their value types.
- Parameters:
colID - column selection, 1-based
- Returns:
string array iterator
-
getStringRowIterator
public Iterator<String[]> getStringRowIterator(int rl, int ru)
Get a row iterator over the frame where all fields are encoded as strings independent of their value types.
- Parameters:
rl - lower row index
ru - upper row index
- Returns:
string array iterator
-
getStringRowIterator
public Iterator<String[]> getStringRowIterator(int rl, int ru, int[] cols)
Get a row iterator over the frame where all selected fields are encoded as strings independent of their value types.
- Parameters:
rl - lower row index
ru - upper row index
cols - column selection, 1-based
- Returns:
string array iterator
-
getStringRowIterator
public Iterator<String[]> getStringRowIterator(int rl, int ru, int colID)
Get a row iterator over the frame where all selected fields are encoded as strings independent of their value types.
- Parameters:
rl - lower row index
ru - upper row index
colID - columnID, 1-based
- Returns:
string array iterator
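The string row iterators combine two ideas: every selected field is rendered as a string regardless of its value type, and the column selection is 1-based. The following sketch shows both on a row-major `Object[][]`. That layout, the class name `RowIteratorSketch`, and allocating a fresh array per row are assumptions of this example, not documented behavior of `FrameBlock`.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch of a string row iterator with a 1-based column selection.
final class RowIteratorSketch {
    static Iterator<String[]> stringRows(final Object[][] data, final int[] cols) {
        return new Iterator<String[]>() {
            private int row = 0;

            @Override
            public boolean hasNext() {
                return row < data.length;
            }

            @Override
            public String[] next() {
                if (!hasNext())
                    throw new NoSuchElementException();
                String[] out = new String[cols.length];
                for (int j = 0; j < cols.length; j++) {
                    Object v = data[row][cols[j] - 1]; // 1-based ID to 0-based index
                    out[j] = (v == null) ? null : v.toString();
                }
                row++;
                return out;
            }
        };
    }

    public static void main(String[] args) {
        Object[][] data = { {1, "a", true}, {2, "b", null} };
        Iterator<String[]> it = stringRows(data, new int[] {1, 3});
        while (it.hasNext())
            System.out.println(String.join(",", it.next()));
    }
}
```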
-
getObjectRowIterator
public Iterator<Object[]> getObjectRowIterator()
Get a row iterator over the frame where all fields are encoded as boxed objects according to their value types.
- Returns:
object array iterator
-
getObjectRowIterator
public Iterator<Object[]> getObjectRowIterator(Types.ValueType[] schema)
Get a row iterator over the frame where all fields are encoded as boxed objects according to the value types of the provided target schema.
- Parameters:
schema - target schema of objects
- Returns:
object array iterator
-
getObjectRowIterator
public Iterator<Object[]> getObjectRowIterator(int[] cols)
Get a row iterator over the frame where all selected fields are encoded as boxed objects according to their value types.
- Parameters:
cols - column selection, 1-based
- Returns:
object array iterator
-
getObjectRowIterator
public Iterator<Object[]> getObjectRowIterator(int rl, int ru)
Get a row iterator over the frame where all fields are encoded as boxed objects according to their value types.
- Parameters:
rl - lower row index
ru - upper row index
- Returns:
object array iterator
-
getObjectRowIterator
public Iterator<Object[]> getObjectRowIterator(int rl, int ru, int[] cols)
Get a row iterator over the frame where all selected fields are encoded as boxed objects according to their value types.
- Parameters:
rl - lower row index
ru - upper row index
cols - column selection, 1-based
- Returns:
object array iterator
-
write
public void write(DataOutput out) throws IOException
- Specified by:
write in interface org.apache.hadoop.io.Writable
- Throws:
IOException
-
readFields
public void readFields(DataInput in) throws IOException
- Specified by:
readFields in interface org.apache.hadoop.io.Writable
- Throws:
IOException
-
writeExternal
public void writeExternal(ObjectOutput out) throws IOException
- Specified by:
writeExternal in interface Externalizable
- Throws:
IOException
-
readExternal
public void readExternal(ObjectInput in) throws IOException
- Specified by:
readExternal in interface Externalizable
- Throws:
IOException
-
getInMemorySize
public long getInMemorySize()
Description copied from interface: CacheBlock
Get the in-memory size in bytes of the cache block.
- Specified by:
getInMemorySize in interface CacheBlock
- Returns:
in-memory size in bytes of cache block
-
getExactSerializedSize
public long getExactSerializedSize()
Description copied from interface: CacheBlock
Get the exact serialized size in bytes of the cache block.
- Specified by:
getExactSerializedSize in interface CacheBlock
- Returns:
exact serialized size in bytes of cache block
-
isShallowSerialize
public boolean isShallowSerialize()
Description copied from interface: CacheBlock
Indicates if the cache block is subject to shallow serialization, which is generally true if the in-memory size and serialized size are almost identical, allowing unnecessary deep serialization to be avoided.
- Specified by:
isShallowSerialize in interface CacheBlock
- Returns:
true if shallow serialized
-
isShallowSerialize
public boolean isShallowSerialize(boolean inclConvert)
Description copied from interface: CacheBlock
Indicates if the cache block is subject to shallow serialization, which is generally true if the in-memory size and serialized size are almost identical, allowing unnecessary deep serialization to be avoided.
- Specified by:
isShallowSerialize in interface CacheBlock
- Parameters:
inclConvert - if true, report blocks as shallow serialize that are currently not amenable but can be brought into an amenable form via toShallowSerializeBlock.
- Returns:
true if shallow serialized
-
toShallowSerializeBlock
public void toShallowSerializeBlock()
Description copied from interface: CacheBlock
Converts a cache block that is not shallow serializable into a form that is shallow serializable. This method has no effect if the given cache block is not amenable.
- Specified by:
toShallowSerializeBlock in interface CacheBlock
-
compactEmptyBlock
public void compactEmptyBlock()
Description copied from interface: CacheBlock
Free unnecessarily allocated empty block.
- Specified by:
compactEmptyBlock in interface CacheBlock
-
binaryOperations
public FrameBlock binaryOperations(BinaryOperator bop, FrameBlock that, FrameBlock out)
Performs a cell-wise value comparison of two frames (equal, not equal, less than, greater than, less than or equal to, greater than or equal to); the output frame stores a boolean value for each comparison.
- Parameters:
bop - binary operator
that - frame block of rhs of m * n dimensions
out - output frame block
- Returns:
a boolean frameBlock
-
leftIndexingOperations
public FrameBlock leftIndexingOperations(FrameBlock rhsFrame, IndexRange ixrange, FrameBlock ret)
-
leftIndexingOperations
public FrameBlock leftIndexingOperations(FrameBlock rhsFrame, int rl, int ru, int cl, int cu, FrameBlock ret)
-
slice
public FrameBlock slice(IndexRange ixrange, FrameBlock ret)
-
slice
public FrameBlock slice(int rl, int ru, int cl, int cu, CacheBlock retCache)
Description copied from interface: CacheBlock
Slice a sub block out of the current block and write into the given output block. This method returns the passed instance if not null.
- Specified by:
slice in interface CacheBlock
- Parameters:
rl - row lower
ru - row upper
cl - column lower
cu - column upper
retCache - cache block
- Returns:
sub-block of cache block
-
slice
public FrameBlock slice(int rl, int ru, int cl, int cu, boolean deep, CacheBlock retCache)
Right indexing operations to slice a subframe out of this frame block. Note that the existing column value types are preserved.
- Specified by:
slice in interface CacheBlock
- Parameters:
rl - row lower index, inclusive, 0-based
ru - row upper index, inclusive, 0-based
cl - column lower index, inclusive, 0-based
cu - column upper index, inclusive, 0-based
deep - enforce deep-copy
retCache - cache block
- Returns:
frame block
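Since all four bounds of this slice variant are inclusive and 0-based, the result has `ru - rl + 1` rows and `cu - cl + 1` columns. A minimal sketch of that index arithmetic on a row-major `Object[][]` (the array layout and the class name `SliceSketch` are assumptions of this example):

```java
// Hypothetical sketch of right indexing with inclusive, 0-based bounds.
final class SliceSketch {
    static Object[][] slice(Object[][] data, int rl, int ru, int cl, int cu) {
        // Inclusive bounds: both rl and ru (and cl and cu) are part of the result.
        Object[][] out = new Object[ru - rl + 1][cu - cl + 1];
        for (int r = rl; r <= ru; r++)
            for (int c = cl; c <= cu; c++)
                out[r - rl][c - cl] = data[r][c];
        return out;
    }

    public static void main(String[] args) {
        Object[][] data = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9} };
        Object[][] sub = slice(data, 1, 2, 0, 1); // rows 1..2, columns 0..1
        System.out.println(sub.length + " x " + sub[0].length);
    }
}
```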
-
slice
public void slice(ArrayList<Pair<Long,FrameBlock>> outlist, IndexRange range, int rowCut)
-
append
public FrameBlock append(FrameBlock that, FrameBlock ret, boolean cbind)
Appends the given argument frameblock 'that' to this frameblock by creating a deep copy to prevent side effects. For cbind, the frames are appended column-wise (same number of rows), while for rbind the frames are appended row-wise (same number of columns).
- Parameters:
that - frame block to append to current frame block
ret - frame block to return, can be null
cbind - if true, column append
- Returns:
frame block
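The two append modes differ only in which dimension must match and which one grows. The sketch below illustrates this on row-major `Object[][]` arrays. `AppendSketch` is a hypothetical helper, and throwing `IllegalArgumentException` on a dimension mismatch is a choice made for this example rather than documented `FrameBlock` behavior.

```java
// Hypothetical sketch of cbind (column-wise) and rbind (row-wise) append.
final class AppendSketch {
    static Object[][] append(Object[][] a, Object[][] b, boolean cbind) {
        if (cbind) {
            if (a.length != b.length)
                throw new IllegalArgumentException("cbind requires the same number of rows");
            Object[][] out = new Object[a.length][];
            for (int r = 0; r < a.length; r++) {
                // Each output row is the row of a followed by the row of b.
                out[r] = new Object[a[r].length + b[r].length];
                System.arraycopy(a[r], 0, out[r], 0, a[r].length);
                System.arraycopy(b[r], 0, out[r], a[r].length, b[r].length);
            }
            return out;
        }
        if (a.length > 0 && b.length > 0 && a[0].length != b[0].length)
            throw new IllegalArgumentException("rbind requires the same number of columns");
        Object[][] out = new Object[a.length + b.length][];
        // Rows are cloned so the result shares no row arrays with the inputs.
        for (int r = 0; r < a.length; r++)
            out[r] = a[r].clone();
        for (int r = 0; r < b.length; r++)
            out[a.length + r] = b[r].clone();
        return out;
    }

    public static void main(String[] args) {
        Object[][] a = { {1, "x"}, {2, "y"} };
        Object[][] b = { {true}, {false} };
        System.out.println(append(a, b, true)[0].length); // columns after cbind
        System.out.println(append(a, a, false).length);   // rows after rbind
    }
}
```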
-
copy
public void copy(FrameBlock src)
-
copy
public void copy(int rl, int ru, int cl, int cu, FrameBlock src)
-
getRecodeMap
public HashMap<String,Long> getRecodeMap(int col)
Splits every recode map entry in the column on the delimiter Lop.DATATYPE_PREFIX (entries were generated earlier in the form Code+Lop.DATATYPE_PREFIX+Token) and stores the result in a map from each unique token to its code.
- Parameters:
col - the column number in the frame data that contains the recode map generated earlier
- Returns:
map of token and code for every element in the input column of a frame containing Recode map
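The parsing step described for getRecodeMap can be sketched as follows. The field order (code first, then token) follows the description above and should be checked against real recode data. The delimiter is passed in as a parameter because the actual value of `Lop.DATATYPE_PREFIX` is not given here, and `RecodeMapSketch` is a hypothetical helper.

```java
import java.util.HashMap;

// Hypothetical sketch: parse "code<delim>token" entries into a token-to-code map.
final class RecodeMapSketch {
    static HashMap<String, Long> parse(String[] entries, String delim) {
        HashMap<String, Long> map = new HashMap<>();
        for (String entry : entries) {
            if (entry == null)
                continue; // skip empty cells
            int pos = entry.indexOf(delim);
            if (pos < 0)
                throw new IllegalArgumentException("missing delimiter in: " + entry);
            long code = Long.parseLong(entry.substring(0, pos));
            String token = entry.substring(pos + delim.length());
            map.put(token, code);
        }
        return map;
    }

    public static void main(String[] args) {
        // ":" is a placeholder delimiter chosen for this example only.
        HashMap<String, Long> map = parse(new String[] {"1:red", "2:blue", null}, ":");
        System.out.println(map.get("blue"));
    }
}
```

Splitting on the first occurrence of the delimiter keeps tokens intact even if they contain the delimiter themselves, which is only safe under the code-first layout assumed here.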
-
merge
public void merge(CacheBlock that, boolean bDummy)
Description copied from interface: CacheBlock
Merge the given block into the current block. Both blocks need to be of equal dimensions and contain disjoint non-zero cells.
- Specified by:
merge in interface CacheBlock
- Parameters:
that - cache block
bDummy - ?
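The merge precondition (equal dimensions, disjoint occupied cells) can be made concrete with a small sketch. Here a `null` cell plays the role of an empty cell, which is an assumption for frames, and the sketch chooses to throw when two occupied cells collide. Whether `FrameBlock` itself checks for this is not stated above. `MergeSketch` is a hypothetical helper.

```java
// Hypothetical sketch: merge two equally sized grids whose occupied cells are disjoint.
final class MergeSketch {
    static void merge(Object[][] target, Object[][] that) {
        if (target.length != that.length)
            throw new IllegalArgumentException("row counts differ");
        for (int r = 0; r < target.length; r++) {
            if (target[r].length != that[r].length)
                throw new IllegalArgumentException("column counts differ in row " + r);
            for (int c = 0; c < target[r].length; c++) {
                if (that[r][c] == null)
                    continue; // nothing to merge for this cell
                if (target[r][c] != null)
                    throw new IllegalStateException("cells overlap at (" + r + "," + c + ")");
                target[r][c] = that[r][c];
            }
        }
    }

    public static void main(String[] args) {
        Object[][] a = { {"x", null}, {null, null} };
        Object[][] b = { {null, "y"}, {"z", null} };
        merge(a, b);
        System.out.println(a[0][1] + " " + a[1][0]);
    }
}
```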
-
merge
public void merge(FrameBlock that)
-
zeroOutOperations
public FrameBlock zeroOutOperations(FrameBlock result, IndexRange range, boolean complementary, int iRowStartSrc, int iRowStartDest, int blen, int iMaxRowsToCopy)
This function zeroes out the data in the slicing window applicable for this block.
- Parameters:
result - frame block
range - index range
complementary - ?
iRowStartSrc - ?
iRowStartDest - ?
blen - ?
iMaxRowsToCopy - ?
- Returns:
frame block
-
getSchemaTypeOf
public FrameBlock getSchemaTypeOf()
-
detectSchemaFromRow
public FrameBlock detectSchemaFromRow(double sampleFraction)
-
dropInvalidType
public FrameBlock dropInvalidType(FrameBlock schema)
Drops cell values that do not conform to the data type of their column.
- Parameters:
schema - of the frame
- Returns:
original frame where invalid values are replaced with null
-
invalidByLength
public FrameBlock invalidByLength(MatrixBlock feaLen)
Validates the frame data against a per-attribute length constraint: if the value in any cell is longer than the threshold specified for that attribute, the output frame stores null at that cell position, thus removing the length-violating values.
- Parameters:
feaLen - vector of valid lengths
- Returns:
FrameBlock with invalid values converted into missing values (null)
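The length check described for invalidByLength amounts to a cell-wise comparison against one threshold per column. A minimal sketch follows, assuming string cells in a row-major `String[][]` and thresholds in an `int[]` instead of a `MatrixBlock`. Both representations and the class name `LengthCheckSketch` are assumptions of this example.

```java
// Hypothetical sketch: replace values longer than the per-column limit with null.
final class LengthCheckSketch {
    static String[][] invalidByLength(String[][] data, int[] maxLen) {
        String[][] out = new String[data.length][];
        for (int r = 0; r < data.length; r++) {
            out[r] = new String[data[r].length];
            for (int c = 0; c < data[r].length; c++) {
                String v = data[r][c];
                // Keep values within the limit; existing nulls stay null.
                out[r][c] = (v != null && v.length() > maxLen[c]) ? null : v;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[][] data = { {"ab", "hello"}, {"abcd", "hi"} };
        String[][] checked = invalidByLength(data, new int[] {3, 5});
        System.out.println(checked[1][0]); // "abcd" exceeds the limit of 3
        System.out.println(checked[0][1]); // "hello" is exactly at the limit of 5
    }
}
```

A value whose length equals the threshold is kept, matching the "greater than the specified threshold" wording of the description.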
-
mergeSchema
public static FrameBlock mergeSchema(FrameBlock temp1, FrameBlock temp2)
-
map
public FrameBlock map(String lambdaExpr, long margin)
-
frameRowReplication
public FrameBlock frameRowReplication(FrameBlock rowToreplicate)
-
valueSwap
public FrameBlock valueSwap(FrameBlock schema)
-
map
public FrameBlock map(FrameBlock.FrameMapFunction lambdaExpr, long margin)
-
mapDist
public FrameBlock mapDist(FrameBlock.FrameMapFunction lambdaExpr)
-
getCompiledFunction
public static FrameBlock.FrameMapFunction getCompiledFunction(String lambdaExpr, long margin)
-
replaceOperations
public <T> FrameBlock replaceOperations(String pattern, String replacement)
-
removeEmptyOperations
public FrameBlock removeEmptyOperations(boolean rows, boolean emptyReturn, MatrixBlock select)
-
-