Class ColumnEncoder
- java.lang.Object
- 
- org.apache.sysds.runtime.transform.encode.ColumnEncoder
 
- 
- All Implemented Interfaces:
- Externalizable, Serializable, Comparable<ColumnEncoder>, Encoder
 - Direct Known Subclasses:
- ColumnEncoderBin, ColumnEncoderComposite, ColumnEncoderDummycode, ColumnEncoderFeatureHash, ColumnEncoderPassThrough, ColumnEncoderRecode, ColumnEncoderUDF
 
public abstract class ColumnEncoder extends Object implements Encoder, Comparable<ColumnEncoder>
Base class for all transform encoders, providing both a row and a block interface for encoding frames to matrices.
- See Also:
- Serialized Form
 
- 
- 
Nested Class Summary
Nested Classes
- static class ColumnEncoder.EncoderType
 - 
Field Summary
Fields
- static int APPLY_ROW_BLOCKS_PER_COLUMN
- static int BUILD_ROW_BLOCKS_PER_COLUMN
 - 
Method Summary
- MatrixBlock apply(CacheBlock in, MatrixBlock out, int outputCol): Apply functions are only used in single-threaded or multi-threaded dense contexts.
- MatrixBlock apply(CacheBlock in, MatrixBlock out, int outputCol, int rowStart, int blk)
- void build(CacheBlock in, double[] equiHeightMaxs)
- void build(CacheBlock in, Map<Integer,double[]> equiHeightMaxs)
- void buildPartial(FrameBlock in): Partial build of internal data structures (e.g., in distributed Spark operations).
- int compareTo(ColumnEncoder o)
- List<DependencyTask<?>> getApplyTasks(CacheBlock in, MatrixBlock out, int outputCol)
- Callable<Object> getBuildTask(CacheBlock in)
- List<DependencyTask<?>> getBuildTasks(CacheBlock in)
- int getColID()
- MatrixBlock getColMapping(FrameBlock meta): Obtain the column mapping of encoded frames based on the passed metadata frame.
- long getEstMetaSize()
- int getEstNumDistincts()
- Callable<Object> getPartialBuildTask(CacheBlock in, int startRow, int blockSize, HashMap<Integer,Object> ret)
- Callable<Object> getPartialMergeBuildTask(HashMap<Integer,?> ret)
- Set<Integer> getSparseRowsWZeros()
- boolean isApplicable(): Indicates if this encoder is applicable, i.e., if there is a column to encode.
- boolean isApplicable(int colID): Indicates if this encoder is applicable for the given column ID, i.e., if it is subject to this transformation.
- void mergeAt(ColumnEncoder other): Merges another encoder, of a compatible type, in after a certain position.
- void prepareBuildPartial(): Allocates internal data structures for partial build.
- void readExternal(ObjectInput in): Redirects the default Java serialization via Externalizable to our default Hadoop Writable serialization for efficient broadcast/RDD deserialization.
- void setColID(int colID)
- void setEstMetaSize(long estSize)
- void setEstNumDistincts(int numDistincts)
- void shiftCol(int columnOffset)
- void updateIndexRanges(long[] beginDims, long[] endDims, int colOffset): Updates index ranges to their post-encoding values.
- void writeExternal(ObjectOutput os): Redirects the default Java serialization via Externalizable to our default Hadoop Writable serialization for efficient broadcast/RDD serialization.
 - 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 - 
Methods inherited from interface org.apache.sysds.runtime.transform.encode.Encoder
allocateMetaData, build, getMetaData, initMetaData
 
- 
 
- 
- 
- 
Method Detail
 - 
apply
public MatrixBlock apply(CacheBlock in, MatrixBlock out, int outputCol)
Apply functions are only used in single-threaded or multi-threaded dense contexts, which is why multi-threaded sparse output is not considered here.
 - 
apply
public MatrixBlock apply(CacheBlock in, MatrixBlock out, int outputCol, int rowStart, int blk)
 - 
isApplicable
public boolean isApplicable()
Indicates if this encoder is applicable, i.e., if there is a column to encode.
- Returns:
- true if a colID is set
 
 - 
isApplicable
public boolean isApplicable(int colID)
Indicates if this encoder is applicable for the given column ID, i.e., if it is subject to this transformation.
- Parameters:
- colID - column ID
- Returns:
- true if encoder is applicable for given column
 
 - 
prepareBuildPartial
public void prepareBuildPartial()
Allocates internal data structures for partial build.
- Specified by:
- prepareBuildPartial in interface Encoder
 
 - 
buildPartial
public void buildPartial(FrameBlock in)
Partial build of internal data structures (e.g., in distributed Spark operations).
- Specified by:
- buildPartial in interface Encoder
- Parameters:
- in - input frame block
 
 - 
build
public void build(CacheBlock in, double[] equiHeightMaxs)
 - 
build
public void build(CacheBlock in, Map<Integer,double[]> equiHeightMaxs)
 - 
mergeAt
public void mergeAt(ColumnEncoder other)
Merges another encoder, of a compatible type, in after a certain position. Resizes as necessary. ColumnEncoders are compatible with themselves, and EncoderComposite is compatible with every other ColumnEncoder. MultiColumnEncoders are compatible with every encoder.
- Parameters:
- other - the encoder that should be merged in
 
 - 
updateIndexRanges
public void updateIndexRanges(long[] beginDims, long[] endDims, int colOffset)
Updates index ranges to their post-encoding values. Note that only dummycoding changes the ranges.
- Specified by:
- updateIndexRanges in interface Encoder
- Parameters:
- beginDims - begin dimensions of range
- endDims - end dimensions of range
- colOffset - offset applied to beginDims and endDims
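
The note that only dummycoding changes the ranges follows from how one-hot encoding works in general: a column with k categories occupies k output columns, so a column range that covers it grows by k - 1. The sketch below illustrates that arithmetic only. The class `IndexRangeSketch`, the helper `widenForDummycode`, the `domainSize` parameter, and the exact range check are assumptions made for illustration, not the SystemDS implementation.

```java
public class IndexRangeSketch {

    /**
     * Hypothetical helper: if the dummycoded column lies inside the column
     * range [beginDims[1] + colOffset, endDims[1] + colOffset), widen the
     * end of the range by the extra columns that dummycoding produces.
     * Index 1 of each array is assumed to hold the column dimension.
     */
    public static void widenForDummycode(long[] beginDims, long[] endDims,
                                         int colOffset, int colID, int domainSize) {
        boolean inRange = colID >= beginDims[1] + colOffset
                       && colID < endDims[1] + colOffset;
        if (inRange) {
            // One input column becomes domainSize output columns.
            endDims[1] += domainSize - 1;
        }
    }

    /** Returns the widened end column for a small fixed example. */
    public static long example() {
        long[] begin = {0, 0};
        long[] end = {100, 4};
        // Column 2 has 3 distinct categories in this made-up example.
        widenForDummycode(begin, end, 0, 2, 3);
        return end[1];
    }

    public static void main(String[] args) {
        System.out.println("end column after dummycoding: " + example());
    }
}
```

Encoders that map one input column to one output column (recoding, binning, pass-through) would leave both arrays unchanged under this reading.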
 
 - 
getColMapping
public MatrixBlock getColMapping(FrameBlock meta)
Obtain the column mapping of encoded frames based on the passed metadata frame.
- Parameters:
- meta - metadata frame block
- Returns:
- matrix with column mapping (one row per attribute)
 
 - 
writeExternal
public void writeExternal(ObjectOutput os) throws IOException
Redirects the default Java serialization via Externalizable to our default Hadoop Writable serialization for efficient broadcast/RDD serialization.
- Specified by:
- writeExternal in interface Externalizable
- Parameters:
- os - object output
- Throws:
- IOException - if an IOException occurs
 
 - 
readExternal
public void readExternal(ObjectInput in) throws IOException
Redirects the default Java serialization via Externalizable to our default Hadoop Writable serialization for efficient broadcast/RDD deserialization.
- Specified by:
- readExternal in interface Externalizable
- Parameters:
- in - object input
- Throws:
- IOException - if an IOException occurs
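
The redirection described for writeExternal and readExternal can be sketched with the standard library alone, because ObjectOutput extends DataOutput and ObjectInput extends DataInput. In the minimal sketch below, `DemoEncoder` and its `write`/`readFields` methods are hypothetical stand-ins that mirror the shape of a Hadoop-style Writable; they are not the SystemDS classes, and the two fields are chosen only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

public class ExternalizableSketch {

    /** Hypothetical stand-in for an encoder; not the SystemDS class. */
    public static class DemoEncoder implements Externalizable {
        private int colID = -1;
        private long estMetaSize = 0;

        /** Externalizable requires a public no-arg constructor. */
        public DemoEncoder() {}

        public DemoEncoder(int colID, long estMetaSize) {
            this.colID = colID;
            this.estMetaSize = estMetaSize;
        }

        // Writable-style methods (assumed names) holding the real field logic.
        public void write(DataOutput out) throws IOException {
            out.writeInt(colID);
            out.writeLong(estMetaSize);
        }

        public void readFields(DataInput in) throws IOException {
            colID = in.readInt();
            estMetaSize = in.readLong();
        }

        // Java serialization is redirected to the compact methods above.
        @Override
        public void writeExternal(ObjectOutput os) throws IOException {
            write(os);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            readFields(in);
        }

        public int getColID() { return colID; }
        public long getEstMetaSize() { return estMetaSize; }
    }

    /** Serializes and deserializes an encoder through Java object streams. */
    public static DemoEncoder roundTrip(DemoEncoder enc)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(enc);
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (DemoEncoder) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        DemoEncoder copy = roundTrip(new DemoEncoder(3, 128L));
        System.out.println(copy.getColID() + " " + copy.getEstMetaSize());
    }
}
```

Keeping the field logic in one pair of methods means the same byte layout is used whether the object travels through Java serialization or through a DataOutput-based path.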
 
 - 
getColID
public int getColID()
 - 
setColID
public void setColID(int colID)
 - 
shiftCol
public void shiftCol(int columnOffset)
 - 
setEstMetaSize
public void setEstMetaSize(long estSize)
 - 
getEstMetaSize
public long getEstMetaSize()
 - 
setEstNumDistincts
public void setEstNumDistincts(int numDistincts)
 - 
getEstNumDistincts
public int getEstNumDistincts()
 - 
compareTo
public int compareTo(ColumnEncoder o)
- Specified by:
- compareTo in interface Comparable<ColumnEncoder>
 
 - 
getBuildTasks
public List<DependencyTask<?>> getBuildTasks(CacheBlock in)
 - 
getBuildTask
public Callable<Object> getBuildTask(CacheBlock in)
 - 
getPartialBuildTask
public Callable<Object> getPartialBuildTask(CacheBlock in, int startRow, int blockSize, HashMap<Integer,Object> ret)
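
The signatures of getPartialBuildTask and getPartialMergeBuildTask suggest a block-wise pattern: each task processes the rows from startRow up to startRow + blockSize, stores its partial result in the shared map, and a merge task later combines the partial results. The following is a minimal sketch of that pattern under stated assumptions. `PartialBuildSketch`, its helper methods, the use of a String array as input, and the choice of distinct values as the partial result are all hypothetical; the actual SystemDS tasks operate on CacheBlock and may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PartialBuildSketch {

    /** Hypothetical partial build: collect distinct values of one row block. */
    static Callable<Object> partialBuildTask(String[] column, int startRow,
                                             int blockSize, HashMap<Integer, Object> ret) {
        return () -> {
            int end = Math.min(startRow + blockSize, column.length);
            Set<String> distinct = new HashSet<>();
            for (int i = startRow; i < end; i++) {
                distinct.add(column[i]);
            }
            // HashMap is not thread-safe, so guard the shared result map.
            synchronized (ret) {
                ret.put(startRow, distinct);
            }
            return null;
        };
    }

    /** Hypothetical merge: union all partial sets into the target set. */
    static Callable<Object> mergeTask(HashMap<Integer, ?> ret, Set<String> target) {
        return () -> {
            for (Object partial : ret.values()) {
                for (Object value : (Set<?>) partial) {
                    target.add((String) value);
                }
            }
            return null;
        };
    }

    /** Runs the partial builds on a small pool, then merges once all finish. */
    public static Set<String> distinctValues(String[] column, int blockSize) throws Exception {
        HashMap<Integer, Object> ret = new HashMap<>();
        List<Callable<Object>> tasks = new ArrayList<>();
        for (int start = 0; start < column.length; start += blockSize) {
            tasks.add(partialBuildTask(column, start, blockSize, ret));
        }
        Set<String> result = new TreeSet<>();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            for (Future<Object> f : pool.invokeAll(tasks)) {
                f.get(); // propagate any task failure
            }
            mergeTask(ret, result).call();
        } finally {
            pool.shutdown();
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(distinctValues(new String[] {"a", "b", "a", "c", "b"}, 2));
    }
}
```

The merge runs only after every partial task has completed, which is the kind of ordering a dependency-aware task list such as List<DependencyTask<?>> would need to express.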
 - 
getApplyTasks
public List<DependencyTask<?>> getApplyTasks(CacheBlock in, MatrixBlock out, int outputCol)
 
- 
 
-