Class ColumnEncoderRecode
- java.lang.Object
  - org.apache.sysds.runtime.transform.encode.ColumnEncoder
    - org.apache.sysds.runtime.transform.encode.ColumnEncoderRecode
All Implemented Interfaces:
Externalizable, Serializable, Comparable<ColumnEncoder>, Encoder
public class ColumnEncoderRecode extends ColumnEncoder
See Also:
Serialized Form
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.sysds.runtime.transform.encode.ColumnEncoder
ColumnEncoder.EncoderType
Field Summary
Modifier and Type    Field
static boolean       SORT_RECODE_MAP

Fields inherited from class org.apache.sysds.runtime.transform.encode.ColumnEncoder
APPLY_ROW_BLOCKS_PER_COLUMN, BUILD_ROW_BLOCKS_PER_COLUMN
Constructor Summary
ColumnEncoderRecode()
ColumnEncoderRecode(int colID)
Method Summary
Modifier and Type, Method, Description

void allocateMetaData(FrameBlock meta)
    Pre-allocate a FrameBlock for metadata collection.
void build(CacheBlock in)
    Build the transform metadata for the given block input.
void buildPartial(FrameBlock in)
    Partial build of internal data structures (e.g., in distributed Spark operations).
void computeRCDMapSizeEstimate(CacheBlock in, int[] sampleIndices)
static String constructRecodeMapEntry(String token, Long code)
    Returns the recode map entry, which is the concatenation of token, delimiter, and code.
boolean equals(Object o)
Callable<Object> getBuildTask(CacheBlock in)
HashMap<String,Long> getCPRecodeMaps()
HashSet<Object> getCPRecodeMapsPartial()
FrameBlock getMetaData(FrameBlock meta)
    Construct a frame block out of the transform metadata.
int getNumDistinctValues()
Callable<Object> getPartialBuildTask(CacheBlock in, int startRow, int blockSize, HashMap<Integer,Object> ret)
Callable<Object> getPartialMergeBuildTask(HashMap<Integer,?> ret)
HashMap<String,Long> getRcdMap()
int hashCode()
void initMetaData(FrameBlock meta)
    Construct the recode maps from the given input frame for all columns registered for recode.
void mergeAt(ColumnEncoder other)
    Merges another encoder, of a compatible type, in after a certain position.
void prepareBuildPartial()
    Allocates internal data structures for partial build.
void readExternal(ObjectInput in)
    Redirects the default Java serialization via Externalizable to our default Hadoop Writable serialization for efficient broadcast/RDD deserialization.
void sortCPRecodeMaps()
static String[] splitRecodeMapEntry(String value)
    Splits a recode map entry into its token and code.
void writeExternal(ObjectOutput out)
    Redirects the default Java serialization via Externalizable to our default Hadoop Writable serialization for efficient broadcast/RDD serialization.

Methods inherited from class org.apache.sysds.runtime.transform.encode.ColumnEncoder
apply, apply, build, build, compareTo, getApplyTasks, getBuildTasks, getColID, getColMapping, getEstMetaSize, getEstNumDistincts, getSparseRowsWZeros, isApplicable, isApplicable, setColID, setEstMetaSize, setEstNumDistincts, shiftCol, updateIndexRanges
Method Detail
-
constructRecodeMapEntry
public static String constructRecodeMapEntry(String token, Long code)
Returns the recode map entry, which is the concatenation of token, delimiter, and code.
Parameters:
token - the token that is part of the recode map
code - the code for the token
Returns:
the concatenation of token and code with the delimiter in between
-
splitRecodeMapEntry
public static String[] splitRecodeMapEntry(String value)
Splits a recode map entry into its token and code.
Parameters:
value - concatenation of token and code with the delimiter in between
Returns:
string array of token and code
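
The two static helpers above are inverses of each other. The following is a minimal stand-alone sketch of that round trip, not the SystemDS source: the class name `RecodeEntrySketch`, the method names, and the delimiter value are assumptions made for illustration, since this page does not state the actual delimiter.

```java
// Hypothetical stand-in for constructRecodeMapEntry / splitRecodeMapEntry.
final class RecodeEntrySketch {
    // Assumed delimiter; the real delimiter is not given on this page.
    static final String DELIM = "\u00b7";

    // Join token and code with the delimiter in between.
    static String constructEntry(String token, Long code) {
        return token + DELIM + code.longValue();
    }

    // Split at the LAST delimiter so tokens containing the delimiter survive.
    static String[] splitEntry(String value) {
        int pos = value.lastIndexOf(DELIM);
        return new String[] {
            value.substring(0, pos),
            value.substring(pos + DELIM.length())
        };
    }

    public static void main(String[] args) {
        String entry = constructEntry("red", 3L);
        String[] parts = splitEntry(entry);
        // prints: red -> 3
        System.out.println(parts[0] + " -> " + Long.parseLong(parts[1]));
    }
}
```

Splitting at the last delimiter is a design choice of this sketch: the code is purely numeric, so the final delimiter always separates it from the token, even when the token itself contains the delimiter character.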
-
sortCPRecodeMaps
public void sortCPRecodeMaps()
-
computeRCDMapSizeEstimate
public void computeRCDMapSizeEstimate(CacheBlock in, int[] sampleIndices)
-
build
public void build(CacheBlock in)
Description copied from interface: Encoder
Build the transform metadata for the given block input. This call modifies and keeps metadata as encoder state.
Parameters:
in - input frame block
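
To make the build step concrete, here is a small sketch of what building recode metadata for one column can look like. It is an illustration under stated assumptions, not the library's implementation: the class `RecodeBuildSketch` is hypothetical, the column is modeled as a `List<String>` instead of a `CacheBlock`, codes are assumed to start at 1 in first-seen order, and null or empty cells are assumed to be skipped.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of building a token -> code map for a single column.
final class RecodeBuildSketch {
    // Assign each distinct token the next free code, starting at 1 (assumed).
    static Map<String, Long> build(List<String> column) {
        Map<String, Long> rcdMap = new HashMap<>();
        for (String token : column) {
            if (token == null || token.isEmpty())
                continue; // missing values get no code (assumption)
            if (!rcdMap.containsKey(token))
                rcdMap.put(token, (long) rcdMap.size() + 1);
        }
        return rcdMap;
    }

    public static void main(String[] args) {
        List<String> column = Arrays.asList("a", "b", "a", null, "c");
        Map<String, Long> rcdMap = build(column);
        // The map is kept as state and later used to replace tokens by codes.
        System.out.println(rcdMap.get("a") + " " + rcdMap.get("b") + " " + rcdMap.get("c"));
    }
}
```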
-
getBuildTask
public Callable<Object> getBuildTask(CacheBlock in)
Overrides:
getBuildTask in class ColumnEncoder
-
getPartialBuildTask
public Callable<Object> getPartialBuildTask(CacheBlock in, int startRow, int blockSize, HashMap<Integer,Object> ret)
Overrides:
getPartialBuildTask in class ColumnEncoder
-
getPartialMergeBuildTask
public Callable<Object> getPartialMergeBuildTask(HashMap<Integer,?> ret)
Overrides:
getPartialMergeBuildTask in class ColumnEncoder
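
The signatures of getPartialBuildTask and getPartialMergeBuildTask suggest a block-wise workflow: one task per row block stores its partial result in a shared map keyed by start row, and a final task merges the partial results. The sketch below illustrates that pattern only. The class `PartialBuildSketch`, the `String[]` column, the `HashSet` partial results, and the explicit `rcdMap` argument are assumptions and do not reflect the actual task implementations.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of block-wise partial builds followed by a merge.
final class PartialBuildSketch {
    // Collect the distinct tokens of rows [startRow, startRow + blockSize).
    static Callable<Object> partialBuildTask(String[] column, int startRow,
            int blockSize, HashMap<Integer, Object> ret) {
        return () -> {
            HashSet<String> distinct = new HashSet<>();
            int end = Math.min(startRow + blockSize, column.length);
            for (int i = startRow; i < end; i++)
                distinct.add(column[i]);
            synchronized (ret) { // HashMap is not thread-safe
                ret.put(startRow, distinct);
            }
            return null;
        };
    }

    // Merge all partial sets into one map with codes 1..n (assumed scheme).
    static Callable<Object> partialMergeTask(HashMap<Integer, ?> ret,
            HashMap<String, Long> rcdMap) {
        return () -> {
            for (Object block : ret.values())
                for (Object token : (HashSet<?>) block)
                    rcdMap.putIfAbsent((String) token, (long) rcdMap.size() + 1);
            return null;
        };
    }

    // Run the partial tasks in parallel, then merge on the calling thread.
    static HashMap<String, Long> run(String[] column, int blockSize) {
        HashMap<Integer, Object> ret = new HashMap<>();
        List<Callable<Object>> tasks = new ArrayList<>();
        for (int start = 0; start < column.length; start += blockSize)
            tasks.add(partialBuildTask(column, start, blockSize, ret));
        HashMap<String, Long> rcdMap = new HashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            for (Future<Object> f : pool.invokeAll(tasks))
                f.get(); // propagate failures from the partial tasks
            partialMergeTask(ret, rcdMap).call();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return rcdMap;
    }

    public static void main(String[] args) {
        System.out.println(run(new String[] {"a", "b", "a", "c", "b"}, 2).size());
    }
}
```

In this sketch the codes depend on iteration order of the partial sets, so only the set of tokens and the code range 1..n are stable across runs.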
-
prepareBuildPartial
public void prepareBuildPartial()
Description copied from class: ColumnEncoder
Allocates internal data structures for partial build.
Specified by:
prepareBuildPartial in interface Encoder
Overrides:
prepareBuildPartial in class ColumnEncoder
-
buildPartial
public void buildPartial(FrameBlock in)
Description copied from class: ColumnEncoder
Partial build of internal data structures (e.g., in distributed Spark operations).
Specified by:
buildPartial in interface Encoder
Overrides:
buildPartial in class ColumnEncoder
Parameters:
in - input frame block
-
mergeAt
public void mergeAt(ColumnEncoder other)
Description copied from class: ColumnEncoder
Merges another encoder, of a compatible type, in after a certain position. Resizes as necessary. ColumnEncoders are compatible with themselves, and EncoderComposite is compatible with every other ColumnEncoder. MultiColumnEncoders are compatible with every encoder.
Overrides:
mergeAt in class ColumnEncoder
Parameters:
other - the encoder that should be merged in
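
For two recode encoders on the same column, one plausible reading of this merge is that tokens known only to the other encoder are appended to this encoder's map with fresh codes, while existing codes stay unchanged. The sketch below shows that reading only. The class `RecodeMergeSketch` and the code-assignment rule are assumptions, not the documented behavior of mergeAt.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of merging one recode map into another.
final class RecodeMergeSketch {
    // Add tokens missing from target, giving each the next free code (assumed).
    static void mergeInto(Map<String, Long> target, Map<String, Long> other) {
        for (String token : other.keySet()) {
            if (!target.containsKey(token))
                target.put(token, (long) target.size() + 1);
        }
    }

    public static void main(String[] args) {
        Map<String, Long> target = new HashMap<>();
        target.put("a", 1L);
        target.put("b", 2L);
        Map<String, Long> other = new HashMap<>();
        other.put("b", 1L);
        other.put("c", 2L);
        mergeInto(target, other);
        // "a" and "b" keep their codes; "c" is new and receives code 3.
        System.out.println(target.get("c"));
    }
}
```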
-
getNumDistinctValues
public int getNumDistinctValues()
-
allocateMetaData
public void allocateMetaData(FrameBlock meta)
Description copied from interface: Encoder
Pre-allocate a FrameBlock for metadata collection.
Parameters:
meta - frame block
-
getMetaData
public FrameBlock getMetaData(FrameBlock meta)
Description copied from interface: Encoder
Construct a frame block out of the transform metadata.
Parameters:
meta - output frame block
Returns:
the output frame block
-
initMetaData
public void initMetaData(FrameBlock meta)
Construct the recode maps from the given input frame for all columns registered for recode.
Parameters:
meta - frame block
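
getMetaData and initMetaData together form an export/import pair for the recode map. The following sketch illustrates that round trip under simplifying assumptions: the metadata column is modeled as a `List<String>` instead of a `FrameBlock`, each entry is assumed to be stored as token, delimiter, code, and the class `RecodeMetaSketch` and its delimiter value are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of exporting a recode map to metadata rows and back.
final class RecodeMetaSketch {
    static final String DELIM = "\u00b7"; // assumed delimiter

    // Export: one "token<DELIM>code" string per map entry.
    static List<String> toMeta(Map<String, Long> rcdMap) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, Long> e : rcdMap.entrySet())
            rows.add(e.getKey() + DELIM + e.getValue());
        return rows;
    }

    // Import: rebuild the map, skipping empty metadata cells.
    static Map<String, Long> fromMeta(List<String> rows) {
        Map<String, Long> rcdMap = new HashMap<>();
        for (String row : rows) {
            if (row == null)
                continue;
            int pos = row.lastIndexOf(DELIM);
            String token = row.substring(0, pos);
            long code = Long.parseLong(row.substring(pos + DELIM.length()));
            rcdMap.put(token, code);
        }
        return rcdMap;
    }

    public static void main(String[] args) {
        Map<String, Long> rcdMap = new HashMap<>();
        rcdMap.put("red", 1L);
        rcdMap.put("blue", 2L);
        System.out.println(fromMeta(toMeta(rcdMap)).equals(rcdMap)); // prints: true
    }
}
```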
-
writeExternal
public void writeExternal(ObjectOutput out) throws IOException
Description copied from class: ColumnEncoder
Redirects the default Java serialization via Externalizable to our default Hadoop Writable serialization for efficient broadcast/RDD serialization.
Specified by:
writeExternal in interface Externalizable
Overrides:
writeExternal in class ColumnEncoder
Parameters:
out - object output
Throws:
IOException - if an IOException occurs
-
readExternal
public void readExternal(ObjectInput in) throws IOException
Description copied from class: ColumnEncoder
Redirects the default Java serialization via Externalizable to our default Hadoop Writable serialization for efficient broadcast/RDD deserialization.
Specified by:
readExternal in interface Externalizable
Overrides:
readExternal in class ColumnEncoder
Parameters:
in - object input
Throws:
IOException - if an IOException occurs
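
The Externalizable pattern described for writeExternal and readExternal can be sketched as follows. This is an illustration, not the class's serialization format: `RecodeStateSketch` is a hypothetical class, and the field layout (column ID, entry count, then token/code pairs) is an assumption. Plain `DataOutput`-style writes stand in here for the Hadoop Writable delegation mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;

// Hypothetical encoder state that controls its own binary layout.
final class RecodeStateSketch implements Externalizable {
    private int colID;
    private HashMap<String, Long> rcdMap = new HashMap<>();

    public RecodeStateSketch() {} // public no-arg constructor required by Externalizable

    RecodeStateSketch(int colID) { this.colID = colID; }

    void put(String token, long code) { rcdMap.put(token, code); }
    Long lookup(String token) { return rcdMap.get(token); }
    int getColID() { return colID; }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(colID);
        out.writeInt(rcdMap.size());
        for (Map.Entry<String, Long> e : rcdMap.entrySet()) {
            out.writeUTF(e.getKey());
            out.writeLong(e.getValue());
        }
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        colID = in.readInt();
        int size = in.readInt();
        rcdMap = new HashMap<>();
        for (int i = 0; i < size; i++) {
            String token = in.readUTF();
            rcdMap.put(token, in.readLong());
        }
    }

    // Serialize to bytes and back through standard Java object streams.
    static RecodeStateSketch roundTrip(RecodeStateSketch s) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(s);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(bos.toByteArray()))) {
                return (RecodeStateSketch) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        RecodeStateSketch s = new RecodeStateSketch(2);
        s.put("red", 1L);
        System.out.println(roundTrip(s).lookup("red")); // prints: 1
    }
}
```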