Class LegacyEncoder
- java.lang.Object
-
- org.apache.sysds.runtime.transform.encode.LegacyEncoder
-
- All Implemented Interfaces:
Externalizable, Serializable
- Direct Known Subclasses:
EncoderMVImpute, EncoderOmit
public abstract class LegacyEncoder extends Object implements Externalizable
Base class for all transform encoders providing both a row and block interface for encoding frames to matrices.
- See Also:
  - Serialized Form
-
-
Method Summary
- abstract MatrixBlock apply(FrameBlock in, MatrixBlock out)
  Encode input data blockwise according to existing transform meta data (transform apply).
- abstract void build(FrameBlock in)
  Build the transform meta data for the given block input.
- void buildPartial(FrameBlock in)
  Partial build of internal data structures (e.g., in distributed spark operations).
- abstract MatrixBlock encode(FrameBlock in, MatrixBlock out)
  Block encode: build and apply (transform encode).
- int[] getColList()
- MatrixBlock getColMapping(FrameBlock meta, MatrixBlock out)
  Obtain the column mapping of encoded frames based on the passed meta data frame.
- abstract FrameBlock getMetaData(FrameBlock out)
  Construct a frame block out of the transform meta data.
- int initColList(int[] colList)
- int initColList(org.apache.wink.json4j.JSONArray attrs)
- abstract void initMetaData(FrameBlock meta)
  Sets up the required meta data for a subsequent call to apply.
- boolean isApplicable()
  Indicates if this encoder is applicable, i.e., if there is at least one column to encode.
- int isApplicable(int colID)
  Indicates if this encoder is applicable for the given column ID, i.e., if it is subject to this transformation.
- void mergeAt(LegacyEncoder other, int row, int col)
  Merges another encoder, of a compatible type, in after a certain position.
- void prepareBuildPartial()
  Allocates internal data structures for partial build.
- void readExternal(ObjectInput in)
  Redirects the default java serialization via externalizable to our default hadoop writable serialization for efficient broadcast/rdd deserialization.
- void setColList(int[] colList)
- void shiftCols(int offset)
- LegacyEncoder subRangeEncoder(IndexRange ixRange)
  Returns a new Encoder that only handles a sub range of columns.
- void updateIndexRanges(long[] beginDims, long[] endDims)
  Update index-ranges to after encoding.
- void writeExternal(ObjectOutput os)
  Redirects the default java serialization via externalizable to our default hadoop writable serialization for efficient broadcast/rdd serialization.
-
-
-
Method Detail
-
getColList
public int[] getColList()
-
setColList
public void setColList(int[] colList)
-
initColList
public int initColList(org.apache.wink.json4j.JSONArray attrs)
-
initColList
public int initColList(int[] colList)
-
isApplicable
public boolean isApplicable()
Indicates if this encoder is applicable, i.e., if there is at least one column to encode.
- Returns:
  - true if at least one column to encode
-
isApplicable
public int isApplicable(int colID)
Indicates if this encoder is applicable for the given column ID, i.e., if it is subject to this transformation.
- Parameters:
  colID - column ID
- Returns:
  - true if encoder is applicable for given column
-
encode
public abstract MatrixBlock encode(FrameBlock in, MatrixBlock out)
Block encode: build and apply (transform encode).
- Parameters:
  in - input frame block
  out - output matrix block
- Returns:
  - output matrix block
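The encode contract is build followed by apply on the same input. A minimal sketch of that contract, using simplified stand-in types (ToyFrame, ToyMatrix, and ToyRecode are hypothetical illustrations, not SystemDS classes):

```java
import java.util.*;

// Hypothetical stand-ins for FrameBlock/MatrixBlock (single column only).
class ToyFrame { List<String> col = new ArrayList<>(); }
class ToyMatrix { double[] col; }

// Sketch of the build-then-apply contract behind encode(in, out).
abstract class ToyEncoder {
    abstract void build(ToyFrame in);                     // collect meta data as encoder state
    abstract ToyMatrix apply(ToyFrame in, ToyMatrix out); // encode using that meta data
    ToyMatrix encode(ToyFrame in, ToyMatrix out) {        // encode = build + apply
        build(in);
        return apply(in, out);
    }
}

// Trivial recode-style encoder: maps distinct strings to 1-based codes.
class ToyRecode extends ToyEncoder {
    Map<String, Integer> codes = new LinkedHashMap<>();
    void build(ToyFrame in) {
        for(String s : in.col)
            codes.putIfAbsent(s, codes.size() + 1);
    }
    ToyMatrix apply(ToyFrame in, ToyMatrix out) {
        out.col = new double[in.col.size()];
        for(int i = 0; i < out.col.length; i++)
            out.col[i] = codes.get(in.col.get(i));
        return out;
    }
}
```

The split into build and apply is what makes transform apply possible: with meta data restored via initMetaData, apply alone re-encodes new data consistently.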
-
build
public abstract void build(FrameBlock in)
Build the transform meta data for the given block input. This call modifies and keeps meta data as encoder state.
- Parameters:
  in - input frame block
-
prepareBuildPartial
public void prepareBuildPartial()
Allocates internal data structures for partial build.
-
buildPartial
public void buildPartial(FrameBlock in)
Partial build of internal data structures (e.g., in distributed spark operations).
- Parameters:
  in - input frame block
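The prepareBuildPartial/buildPartial pair supports distributed builds: allocate state once, then fold in one frame block (e.g., one spark partition) at a time. A sketch of the pattern with a hypothetical collector class (not a SystemDS type), using a distinct-value set as the internal state:

```java
import java.util.*;

// Sketch of the partial-build pattern: prepareBuildPartial() allocates
// internal data structures, buildPartial() folds in one block at a time.
class DistinctCollector {
    Set<String> distinct;                    // internal build state

    void prepareBuildPartial() {             // allocate internal data structures
        distinct = new HashSet<>();
    }
    void buildPartial(List<String> block) {  // called once per partition/block
        distinct.addAll(block);
    }
}
```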
-
apply
public abstract MatrixBlock apply(FrameBlock in, MatrixBlock out)
Encode input data blockwise according to existing transform meta data (transform apply).
- Parameters:
  in - input frame block
  out - output matrix block
- Returns:
  - output matrix block
-
subRangeEncoder
public LegacyEncoder subRangeEncoder(IndexRange ixRange)
Returns a new Encoder that only handles a sub range of columns.
- Parameters:
  ixRange - the range (1-based, begin inclusive, end exclusive)
- Returns:
  - an encoder of the same type, just for the sub-range
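The column-selection logic behind such a sub-range encoder can be sketched as filtering the column list against the 1-based, begin-inclusive, end-exclusive range. Re-basing the surviving column IDs so the sub-encoder starts at column 1 is an assumption of this sketch, and SubRange is a hypothetical helper, not a SystemDS class:

```java
import java.util.Arrays;

class SubRange {
    // Keep only columns inside [begin, end), 1-based and end-exclusive,
    // and re-base them so the first column of the range becomes column 1.
    static int[] subRangeCols(int[] colList, long begin, long end) {
        return Arrays.stream(colList)
            .filter(col -> col >= begin && col < end)
            .map(col -> col - (int) begin + 1)  // re-base to the new range
            .toArray();
    }
}
```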
-
mergeAt
public void mergeAt(LegacyEncoder other, int row, int col)
Merges another encoder, of a compatible type, in after a certain position. Resizes as necessary. Encoders are compatible with themselves and EncoderComposite is compatible with every other Encoder.
- Parameters:
  other - the encoder that should be merged in
  row - the row where it should be placed (1-based)
  col - the col where it should be placed (1-based)
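For the column dimension, merging can be pictured as shifting the other encoder's column list by the 1-based placement offset and taking the union. A sketch of just that step (MergeCols is a hypothetical helper; the row dimension and encoder-specific state are ignored here):

```java
import java.util.*;

class MergeCols {
    // Merge another encoder's column list in at a 1-based column offset:
    // shift the incoming columns by (col - 1), then union with our own.
    static int[] mergeAt(int[] cols, int[] otherCols, int col) {
        SortedSet<Integer> merged = new TreeSet<>();
        for(int c : cols) merged.add(c);
        for(int c : otherCols) merged.add(c + col - 1); // apply 1-based offset
        return merged.stream().mapToInt(Integer::intValue).toArray();
    }
}
```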
-
updateIndexRanges
public void updateIndexRanges(long[] beginDims, long[] endDims)
Update index-ranges to after encoding. Note that only Dummycoding changes the ranges.
- Parameters:
  beginDims - begin dimensions of range
  endDims - end dimensions of range
-
getMetaData
public abstract FrameBlock getMetaData(FrameBlock out)
Construct a frame block out of the transform meta data.
- Parameters:
  out - output frame block
- Returns:
  - output frame block
-
initMetaData
public abstract void initMetaData(FrameBlock meta)
Sets up the required meta data for a subsequent call to apply.
- Parameters:
  meta - frame block
-
getColMapping
public MatrixBlock getColMapping(FrameBlock meta, MatrixBlock out)
Obtain the column mapping of encoded frames based on the passed meta data frame.
- Parameters:
  meta - meta data frame block
  out - output matrix
- Returns:
  - matrix with column mapping (one row per attribute)
-
writeExternal
public void writeExternal(ObjectOutput os) throws IOException
Redirects the default java serialization via externalizable to our default hadoop writable serialization for efficient broadcast/rdd serialization.
- Specified by:
  writeExternal in interface Externalizable
- Parameters:
  os - object output
- Throws:
  IOException - if IOException occurs
-
readExternal
public void readExternal(ObjectInput in) throws IOException
Redirects the default java serialization via externalizable to our default hadoop writable serialization for efficient broadcast/rdd deserialization.
- Specified by:
  readExternal in interface Externalizable
- Parameters:
  in - object input
- Throws:
  IOException - if IOException occurs
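The redirect pattern itself is plain Java: the Externalizable hooks delegate to a custom write/read pair, so java serialization and the custom format share one code path. A self-contained sketch using only java.io, where write/readFields stand in for the hadoop Writable serialization (RedirectDemo is a hypothetical illustration, not a SystemDS class):

```java
import java.io.*;

class RedirectDemo implements Externalizable {
    private int[] colList = new int[0];

    public RedirectDemo() {} // Externalizable requires a public no-arg constructor

    // Custom, compact serialization (Writable-style stand-in).
    public void write(DataOutput out) throws IOException {
        out.writeInt(colList.length);
        for(int c : colList) out.writeInt(c);
    }
    public void readFields(DataInput in) throws IOException {
        colList = new int[in.readInt()];
        for(int i = 0; i < colList.length; i++)
            colList[i] = in.readInt();
    }

    // Externalizable hooks just redirect to the custom format
    // (ObjectOutput extends DataOutput, ObjectInput extends DataInput).
    @Override public void writeExternal(ObjectOutput os) throws IOException { write(os); }
    @Override public void readExternal(ObjectInput in) throws IOException { readFields(in); }

    int[] getColList() { return colList; }
    void setColList(int[] c) { colList = c; }
}
```

Because Externalizable bypasses default field-by-field reflection, the object's state is written exactly once in the compact custom format, which is what makes broadcast/RDD (de)serialization cheap.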
-
shiftCols
public void shiftCols(int offset)
-
-