Class TokenizerBuilderNgram
- java.lang.Object
- org.apache.sysds.runtime.transform.tokenize.builder.TokenizerBuilder
- org.apache.sysds.runtime.transform.tokenize.builder.TokenizerBuilderWhitespaceSplit
- org.apache.sysds.runtime.transform.tokenize.builder.TokenizerBuilderNgram
- All Implemented Interfaces:
Serializable
public class TokenizerBuilderNgram extends TokenizerBuilderWhitespaceSplit
- See Also:
- Serialized Form
Field Summary
Fields
Modifier and Type    Field
int    maxGram
int    minGram
org.apache.sysds.runtime.transform.tokenize.builder.TokenizerBuilderNgram.NgramType    ngramType
Fields inherited from class org.apache.sysds.runtime.transform.tokenize.builder.TokenizerBuilderWhitespaceSplit
regex
Constructor Summary
Constructors
Constructor
TokenizerBuilderNgram(int[] idCols, int tokenizeCol, org.apache.wink.json4j.JSONObject params)
Method Summary
Methods
Modifier and Type    Method
void    createInternalRepresentation(FrameBlock in, DocumentRepresentation[] internalRepresentation, int rowStart, int blk)
List<Token>    splitIntoNgrams(Token token, int minGram, int maxGram)
Methods inherited from class org.apache.sysds.runtime.transform.tokenize.builder.TokenizerBuilderWhitespaceSplit
splitToTokens
Methods inherited from class org.apache.sysds.runtime.transform.tokenize.builder.TokenizerBuilder
createInternalRepresentation, getTasks
Method Detail
createInternalRepresentation
public void createInternalRepresentation(FrameBlock in, DocumentRepresentation[] internalRepresentation, int rowStart, int blk)
- Overrides:
createInternalRepresentation in class TokenizerBuilderWhitespaceSplit
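
This page does not describe what splitIntoNgrams(Token token, int minGram, int maxGram) returns. The sketch below illustrates one plausible reading of the signature: for each length n from minGram to maxGram, emit every contiguous character window of length n. It works on plain strings instead of the SystemDS Token class, and the class name NgramSketch and method name charNgrams are hypothetical. It is not the library's implementation, and it does not model the ngramType field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of SystemDS: illustrates character n-gram
// extraction over a single token's text.
public class NgramSketch {

    // Returns all contiguous substrings of `text` whose length lies in
    // [minGram, maxGram], ordered by length and then by start position.
    static List<String> charNgrams(String text, int minGram, int maxGram) {
        if (minGram < 1 || maxGram < minGram) {
            throw new IllegalArgumentException("require 1 <= minGram <= maxGram");
        }
        List<String> ngrams = new ArrayList<>();
        for (int n = minGram; n <= maxGram; n++) {
            // Slide a window of length n over the text; no windows are
            // produced when the text is shorter than n.
            for (int start = 0; start + n <= text.length(); start++) {
                ngrams.add(text.substring(start, start + n));
            }
        }
        return ngrams;
    }

    public static void main(String[] args) {
        // Prints [da, at, ta, dat, ata]
        System.out.println(charNgrams("data", 2, 3));
    }
}
```

Under this reading, a token shorter than minGram yields an empty list. Whether the actual method behaves that way, or how it tracks token start positions, should be checked against the SystemDS source.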