java.lang.Object
- org.apache.sysds.hops.OptimizerUtils

public class OptimizerUtils
extends Object

Nested Class Summary

Nested Classes
Modifier and Type	Class	Description
`static class`	`OptimizerUtils.MemoryManager`	Memory managers (static partitioned, unified)
`static class`	`OptimizerUtils.OptimizationLevel`	Optimization types for compilation. O0 STATIC: decisions for scheduling operations on CP/MR are based on a predefined set of rules, which check whether the dimensions are below a fixed/static threshold (the old method of choosing between CP and MR).

Field Summary

Fields
Modifier and Type	Field	Description
`static boolean`	`ALLOW_ALGEBRAIC_SIMPLIFICATION`
`static boolean`	`ALLOW_AUTO_VECTORIZATION`
`static boolean`	`ALLOW_BRANCH_REMOVAL`	Enables if-else branch removal for constant predicates (original literals or results of constant folding).
`static boolean`	`ALLOW_CODE_MOTION`	Enables a specific rewrite for code motion, i.e., hoisting loop invariant code out of while, for, and parfor loops.
`static boolean`	`ALLOW_COMBINE_FILE_INPUT_FORMAT`	Enables the use of CombineSequenceFileInputFormat with splitsize = 2x hdfs blocksize, if sort buffer size large enough and parallelism not hurt.
`static boolean`	`ALLOW_COMMON_SUBEXPRESSION_ELIMINATION`	Enables common subexpression elimination in dags.
`static boolean`	`ALLOW_COMPRESSION_REWRITE`	Boolean specifying whether compression rewrites are allowed.
`static boolean`	`ALLOW_CONSTANT_FOLDING`	Enables constant folding in dags.
`static boolean`	`ALLOW_EVAL_FCALL_REPLACEMENT`	Replace eval second-order function calls with normal function call if the function name is a known string (after constant propagation).
`static boolean`	`ALLOW_FOR_LOOP_REMOVAL`	Enables the removal of (par)for-loops when from, to, and increment are constants (original literals or results of constant folding) and lead to an empty sequence, i.e., (par)for-loops without a single iteration.
`static boolean`	`ALLOW_INTER_PROCEDURAL_ANALYSIS`	Enables interprocedural analysis between main script and functions as well as functions and other functions.
`static boolean`	`ALLOW_LOOP_UPDATE_IN_PLACE`	Enables a specific rewrite that enables update in place for loop variables that are only read/updated via cp leftindexing.
`static boolean`	`ALLOW_OPERATOR_FUSION`
`static boolean`	`ALLOW_RAND_JOB_RECOMPILE`
`static boolean`	`ALLOW_RUNTIME_PIGGYBACKING`	Enables parfor runtime piggybacking of MR jobs into the packed jobs for scan sharing.
`static boolean`	`ALLOW_SCRIPT_LEVEL_COMPRESS_COMMAND`	Allows the user to insert compress and decompress commands directly in the DML script.
`static boolean`	`ALLOW_SCRIPT_LEVEL_LOCAL_COMMAND`	This variable allows for use of explicit local command, that forces a spark block to be executed and returned as a local block.
`static boolean`	`ALLOW_SIZE_EXPRESSION_EVALUATION`	Enables simple expression evaluation for datagen parameters 'rows', 'cols'.
`static boolean`	`ALLOW_SPLIT_HOP_DAGS`	Enables a specific hop dag rewrite that splits hop dags after csv persistent reads with unknown size in order to allow for recompile.
`static boolean`	`ALLOW_SUM_PRODUCT_REWRITES`	Enables sum product rewrites such as mapmultchains.
`static boolean`	`ALLOW_TRANSITIVE_SPARK_EXEC_TYPE`	Enable transitive spark execution type selection.
`static boolean`	`ALLOW_UNARY_UPDATE_IN_PLACE`	Enables the update-in-place for all unary operators with a single consumer.
`static boolean`	`ALLOW_WORSTCASE_SIZE_EXPRESSION_EVALUATION`	Enables simple expression evaluation for datagen parameters 'rows', 'cols'.
`static boolean`	`ASYNC_TRIGGER_RDD_OPERATIONS`	Enable prefetch and broadcast.
`static long`	`BOOLEAN_SIZE`
`static long`	`BUFFER_POOL_SIZE`	Buffer pool size in bytes
`static long`	`CHAR_SIZE`
`static int`	`DEFAULT_BLOCKSIZE`	Default blocksize if unspecified or for testing purposes
`static int`	`DEFAULT_FRAME_BLOCKSIZE`	Default frame blocksize
`static double`	`DEFAULT_MEM_UTIL_FACTOR`	Default buffer pool sizes for static (15%) and unified (85%) memory
`static OptimizerUtils.OptimizationLevel`	`DEFAULT_OPTLEVEL`	Default optimization level if unspecified
`static double`	`DEFAULT_SIZE`	Default memory size, which is used if the actual estimate cannot be computed, e.g., when input/output dimensions are unknown.
`static double`	`DEFAULT_UMM_UTIL_FACTOR`
`static long`	`DOUBLE_SIZE`
`static boolean`	`FEDERATED_COMPILATION`	Compile federated instructions based on input federation state and privacy constraints.
`static Map<Integer,FEDInstruction.FederatedOutput>`	`FEDERATED_SPECS`
`static long`	`INT_SIZE`
`static double`	`INVALID_SIZE`
`static int`	`IPA_NUM_REPETITIONS`	Number of inter-procedural analysis (IPA) repetitions.
`static long`	`MAX_NNZ_CP_SPARSE`
`static long`	`MAX_NUMCELLS_CP_DENSE`
`static double`	`MEM_UTIL_FACTOR`	Utilization factor used in deciding whether an operation is to be scheduled on CP or MR.
`static OptimizerUtils.MemoryManager`	`MEMORY_MANAGER`	Indicate the current memory manager in effect
`static double`	`PARALLEL_CP_READ_PARALLELISM_MULTIPLIER`	Specifies a multiplier computing the degree of parallelism of parallel text read/write out of the available degree of parallelism.
`static double`	`PARALLEL_CP_WRITE_PARALLELISM_MULTIPLIER`
`static long`	`SAFE_REP_CHANGE_THRES`

Constructor Summary

Constructors
Constructor	Description
`OptimizerUtils()`

Method Summary

Methods
Modifier and Type	Method	Description
`static boolean`	`allowsToFilterEmptyBlockOutputs(Hop hop)`
`static boolean`	`checkSparkBroadcastMemoryBudget(double size)`
`static boolean`	`checkSparkBroadcastMemoryBudget(long rlen, long clen, long blen, long nnz)`
`static boolean`	`checkSparkCollectMemoryBudget(DataCharacteristics dc, long memPinned)`
`static boolean`	`checkSparkCollectMemoryBudget(DataCharacteristics dc, long memPinned, boolean checkBP)`
`static boolean`	`checkSparseBlockCSRConversion(DataCharacteristics dcIn)`
`static CompilerConfig`	`constructCompilerConfig(CompilerConfig cconf, DMLConfig dmlconf)`
`static CompilerConfig`	`constructCompilerConfig(DMLConfig dmlconf)`
`static void`	`disableUMM()`	Disable unified memory manager and fall back to static partitioning.
`static void`	`enableUMM()`	Enable unified memory manager and initialize with the default size (85%).
`static long`	`estimatePartitionedSizeExactSparsity(long rlen, long clen, long blen, double sp)`	Estimates the footprint (in bytes) for a partitioned in-memory representation of a matrix with dimensions=(rlen,clen) and sparsity=sp.
`static long`	`estimatePartitionedSizeExactSparsity(long rlen, long clen, long blen, double sp, boolean outputEmptyBlocks)`
`static long`	`estimatePartitionedSizeExactSparsity(long rlen, long clen, long blen, long nnz)`	Estimates the footprint (in bytes) for a partitioned in-memory representation of a matrix with dimensions=(rlen,clen) and number of non-zeros nnz.
`static long`	`estimatePartitionedSizeExactSparsity(long rlen, long clen, long blen, long nnz, boolean outputEmptyBlocks)`
`static long`	`estimatePartitionedSizeExactSparsity(Hop hop)`	Estimates the footprint (in bytes) for a partitioned in-memory representation of a matrix with the hop's dimensions and number of non-zeros nnz.
`static long`	`estimatePartitionedSizeExactSparsity(DataCharacteristics dc)`	Estimates the footprint (in bytes) for a partitioned in-memory representation of a matrix with the given matrix characteristics
`static long`	`estimatePartitionedSizeExactSparsity(DataCharacteristics dc, boolean outputEmptyBlocks)`
`static long`	`estimateSize(long nrows, long ncols)`	Similar to estimate() except that it provides worst-case estimates when the optimization type is ROBUST.
`static long`	`estimateSize(DataCharacteristics dc)`
`static long`	`estimateSizeEmptyBlock(long nrows, long ncols)`
`static long`	`estimateSizeExactSparsity(long nrows, long ncols, double sp)`	Estimates the footprint (in bytes) for an in-memory representation of a matrix with dimensions=(nrows,ncols) and sparsity=sp.
`static long`	`estimateSizeExactSparsity(long nrows, long ncols, long nnz)`	Estimates the footprint (in bytes) for an in-memory representation of a matrix with dimensions=(nrows,ncols) and number of non-zeros nnz.
`static long`	`estimateSizeExactSparsity(DataCharacteristics dc)`
`static long`	`estimateSizeTextOutput(int[] dims, long nnz, Types.FileFormat fmt)`
`static long`	`estimateSizeTextOutput(long rows, long cols, long nnz, Types.FileFormat fmt)`
`static boolean`	`exceedsCachingThreshold(long dim2, double outMem)`	Indicates if the given matrix characteristics exceed the threshold for caching, i.e., the matrix should be cached.
`static double`	`getBinaryOpSparsity(double sp1, double sp2, Types.OpOp2 op, boolean worstcase)`	Estimates the result sparsity for matrix-matrix binary operations (A op B)
`static double`	`getBinaryOpSparsityConditionalSparseSafe(double sp1, Types.OpOp2 op, LiteralOp lit)`
`static long`	`getBufferPoolLimit()`	Returns buffer pool size as set in the config
`static int`	`getConstrainedNumThreads(int maxNumThreads)`
`static Types.ExecMode`	`getDefaultExecutionMode()`
`static int`	`getDefaultFrameSize()`
`static org.apache.log4j.Level`	`getDefaultLogLevel()`
`static long`	`getDefaultSize()`
`static double`	`getLeftIndexingSparsity(long rlen1, long clen1, long nnz1, long rlen2, long clen2, long nnz2)`
`static double`	`getLocalMemBudget()`	Returns memory budget (according to util factor) in bytes
`static long`	`getMatMultNnz(double sp1, double sp2, long m, long k, long n, boolean worstcase)`
`static double`	`getMatMultSparsity(double sp1, double sp2, long m, long k, long n, boolean worstcase)`	Estimates the result sparsity for Matrix Multiplication A %*% B.
`static long`	`getNnz(long dim1, long dim2, double sp)`
`static long`	`getNumIterations(ForStatementBlock fsb, long defaultValue)`
`static long`	`getNumIterations(ForProgramBlock fpb, long defaultValue)`
`static long`	`getNumIterations(ForProgramBlock fpb, LocalVariableMap vars, long defaultValue)`
`static int`	`getNumMappers()`
`static int`	`getNumReducers(boolean configOnly)`	Returns the number of reducers that potentially run in parallel.
`static OptimizerUtils.OptimizationLevel`	`getOptLevel()`
`static long`	`getOuterNonZeros(long n1, long n2, long nnz1, long nnz2, Types.OpOp2 op)`
`static int`	`getParallelBinaryReadParallelism()`
`static int`	`getParallelBinaryWriteParallelism()`
`static int`	`getParallelTextReadParallelism()`	Returns the degree of parallelism used for parallel text read.
`static int`	`getParallelTextWriteParallelism()`	Returns the degree of parallelism used for parallel text write.
`static double`	`getSparsity(long[] dims, long nnz)`
`static double`	`getSparsity(long dim1, long dim2, long nnz)`
`static double`	`getSparsity(Hop hop)`
`static double`	`getSparsity(DataCharacteristics dc)`
`static double`	`getTotalMemEstimate(Hop[] in, Hop out)`
`static double`	`getTotalMemEstimate(Hop[] in, Hop out, boolean denseOut)`
`static int`	`getTransformNumThreads()`
`static String`	`getUniqueTempFileName()`	Wrapper over internal filename construction for external usage.
`static boolean`	`isBinaryOpConditionalSparseSafe(Types.OpOp2 op)`	Determines if a given binary op is potentially conditional sparse safe.
`static boolean`	`isBinaryOpConditionalSparseSafeExact(Types.OpOp2 op, LiteralOp lit)`	Determines if a given binary op with a scalar literal guarantees an output sparsity that is exactly the same as its matrix input sparsity.
`static boolean`	`isBinaryOpSparsityConditionalSparseSafe(Types.OpOp2 op, LiteralOp lit)`
`static boolean`	`isHybridExecutionMode()`
`static boolean`	`isIndexingRangeBlockAligned(long rl, long ru, long cl, long cu, long blen)`	Indicates if the given indexing range is block aligned, i.e., it does not require global aggregation of blocks.
`static boolean`	`isIndexingRangeBlockAligned(IndexRange ixrange, DataCharacteristics mc)`	Indicates if the given indexing range is block aligned, i.e., it does not require global aggregation of blocks.
`static boolean`	`isMaxLocalParallelism(int k)`
`static boolean`	`isMemoryBasedOptLevel()`
`static boolean`	`isOptLevel(OptimizerUtils.OptimizationLevel level)`
`static boolean`	`isSparkExecutionMode()`
`static boolean`	`isTopLevelParFor()`
`static boolean`	`isUMMEnabled()`	Check if unified memory manager is in effect
`static boolean`	`isValidCPDimensions(long rows, long cols)`	Returns false if dimensions are known to be invalid; otherwise true.
`static boolean`	`isValidCPDimensions(Types.ValueType[] schema, String[] names)`	Returns false if schema and names are not properly specified; otherwise true. Both must have length > 0, and their lengths must be equal.
`static boolean`	`isValidCPDimensions(DataCharacteristics mc)`
`static boolean`	`isValidCPMatrixSize(long rows, long cols, double sparsity)`	Determines if valid matrix size to be represented in CP data structures.
`static void`	`resetDefaultSize()`
`static void`	`resetStaticCompilerFlags()`
`static double`	`rEvalSimpleDoubleExpression(Hop root, HashMap<Long,Double> valMemo)`
`static double`	`rEvalSimpleDoubleExpression(Hop root, HashMap<Long,Double> valMemo, LocalVariableMap vars)`
`static long`	`rEvalSimpleLongExpression(Hop root, HashMap<Long,Long> valMemo)`	Function to evaluate simple size expressions over literals and nrow/ncol.
`static long`	`rEvalSimpleLongExpression(Hop root, HashMap<Long,Long> valMemo, LocalVariableMap vars)`
`static String`	`toMB(double inB)`

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - MEM_UTIL_FACTOR
```
public static double MEM_UTIL_FACTOR
```
    Utilization factor used in deciding whether an operation is to be scheduled on CP or MR. NOTE: it is important that MEM_UTIL_FACTOR + CacheableData.CACHING_BUFFER_SIZE < 1.0.
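    The constraint in the note above can be illustrated with a minimal sketch. The class and method names below are hypothetical, and the fractions 0.7 and 0.15 are illustrative values only; the sketch assumes the budget is a simple fraction of the maximum heap, which may differ from how SystemDS computes it.

```java
// Hypothetical sketch: names and values are illustrative, not SystemDS API.
class MemBudgetSketch {

    /** Local memory budget in bytes, modeled as a fraction of the maximum heap. */
    static double localMemBudget(long maxHeapBytes, double memUtilFactor) {
        return maxHeapBytes * memUtilFactor;
    }

    /** True if operation memory and caching buffer together leave some headroom. */
    static boolean isValidSplit(double memUtilFactor, double cachingBufferFraction) {
        return memUtilFactor + cachingBufferFraction < 1.0;
    }

    public static void main(String[] args) {
        long heap = Runtime.getRuntime().maxMemory();
        System.out.println("budget (bytes): " + localMemBudget(heap, 0.7));
        System.out.println("valid split:    " + isValidSplit(0.7, 0.15));
    }
}
```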
  - DEFAULT_MEM_UTIL_FACTOR
```
public static double DEFAULT_MEM_UTIL_FACTOR
```
    Default buffer pool sizes for static (15%) and unified (85%) memory
  - DEFAULT_UMM_UTIL_FACTOR
```
public static double DEFAULT_UMM_UTIL_FACTOR
```
  - MEMORY_MANAGER
```
public static OptimizerUtils.MemoryManager MEMORY_MANAGER
```
    Indicate the current memory manager in effect
  - BUFFER_POOL_SIZE
```
public static long BUFFER_POOL_SIZE
```
    Buffer pool size in bytes
  - DEFAULT_BLOCKSIZE
```
public static final int DEFAULT_BLOCKSIZE
```
    Default blocksize if unspecified or for testing purposes
    
    See Also:
    
    Constant Field Values
  - DEFAULT_FRAME_BLOCKSIZE
```
public static final int DEFAULT_FRAME_BLOCKSIZE
```
    Default frame blocksize
    
    See Also:
    
    Constant Field Values
  - DEFAULT_OPTLEVEL
```
public static final OptimizerUtils.OptimizationLevel DEFAULT_OPTLEVEL
```
    Default optimization level if unspecified
  - DEFAULT_SIZE
```
public static double DEFAULT_SIZE
```
    Default memory size, which is used if the actual estimate cannot be computed, e.g., when input/output dimensions are unknown. The default is set to a large value so that operations are scheduled on MR, while still avoiding overflows.
  - DOUBLE_SIZE
```
public static final long DOUBLE_SIZE
```
    See Also:
    
    Constant Field Values
  - INT_SIZE
```
public static final long INT_SIZE
```
    See Also:
    
    Constant Field Values
  - CHAR_SIZE
```
public static final long CHAR_SIZE
```
    See Also:
    
    Constant Field Values
  - BOOLEAN_SIZE
```
public static final long BOOLEAN_SIZE
```
    See Also:
    
    Constant Field Values
  - INVALID_SIZE
```
public static final double INVALID_SIZE
```
    See Also:
    
    Constant Field Values
  - MAX_NUMCELLS_CP_DENSE
```
public static final long MAX_NUMCELLS_CP_DENSE
```
    See Also:
    
    Constant Field Values
  - MAX_NNZ_CP_SPARSE
```
public static final long MAX_NNZ_CP_SPARSE
```
  - SAFE_REP_CHANGE_THRES
```
public static final long SAFE_REP_CHANGE_THRES
```
    See Also:
    
    Constant Field Values
  - ALLOW_COMMON_SUBEXPRESSION_ELIMINATION
```
public static boolean ALLOW_COMMON_SUBEXPRESSION_ELIMINATION
```
    Enables common subexpression elimination in dags. There is, however, a potential tradeoff between computation redundancy and data transfer between MR jobs. Since we do not reason about transferred data yet, this rewrite rule is enabled by default.
  - ALLOW_CONSTANT_FOLDING
```
public static boolean ALLOW_CONSTANT_FOLDING
```
    Enables constant folding in dags. Constant folding computes simple expressions of binary operations and literals and replaces the hop sub-DAG with a new literal operator.
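    As a rough illustration of the idea, the following sketch folds binary operations over literals in a tiny expression tree. The `Node`, `Lit`, `Var`, and `Bin` types are hypothetical stand-ins and are not the SystemDS `Hop` classes; the actual rewrite operates on hop DAGs and covers more cases.

```java
// Hypothetical sketch of constant folding on a tiny expression tree.
class ConstantFoldingSketch {
    interface Node {}

    static final class Lit implements Node {
        final double value;
        Lit(double value) { this.value = value; }
    }

    static final class Var implements Node {
        final String name;
        Var(String name) { this.name = name; }
    }

    static final class Bin implements Node {
        final char op;
        final Node left, right;
        Bin(char op, Node left, Node right) { this.op = op; this.left = left; this.right = right; }
    }

    /** Replaces every binary operation whose inputs are both literals by a new literal. */
    static Node fold(Node n) {
        if (!(n instanceof Bin))
            return n;
        Bin b = (Bin) n;
        Node l = fold(b.left);
        Node r = fold(b.right);
        if (l instanceof Lit && r instanceof Lit)
            return new Lit(apply(b.op, ((Lit) l).value, ((Lit) r).value));
        return new Bin(b.op, l, r);
    }

    static double apply(char op, double a, double b) {
        switch (op) {
            case '+': return a + b;
            case '-': return a - b;
            case '*': return a * b;
            case '/': return a / b;
            default: throw new IllegalArgumentException("unsupported op: " + op);
        }
    }

    public static void main(String[] args) {
        // X * (2 + 3): only the literal-only sub-tree is folded.
        Node folded = fold(new Bin('*', new Var("X"), new Bin('+', new Lit(2), new Lit(3))));
        System.out.println(((Bin) folded).right instanceof Lit);
    }
}
```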
  - ALLOW_ALGEBRAIC_SIMPLIFICATION
```
public static boolean ALLOW_ALGEBRAIC_SIMPLIFICATION
```
  - ALLOW_OPERATOR_FUSION
```
public static boolean ALLOW_OPERATOR_FUSION
```
  - ALLOW_BRANCH_REMOVAL
```
public static boolean ALLOW_BRANCH_REMOVAL
```
    Enables if-else branch removal for constant predicates (original literals or results of constant folding).
  - ALLOW_FOR_LOOP_REMOVAL
```
public static boolean ALLOW_FOR_LOOP_REMOVAL
```
    Enables the removal of (par)for-loops when from, to, and increment are constants (original literals or results of constant folding) and lead to an empty sequence, i.e., (par)for-loops without a single iteration.
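    The "empty sequence" condition can be sketched as follows. This is a simplified model with an explicit, constant increment; the helper names are hypothetical, and DML's handling of default increments is not modeled here.

```java
// Hypothetical sketch: decides whether a loop over constant bounds has no iterations.
class LoopRemovalSketch {

    /** Number of iterations of a loop from 'from' to 'to' (inclusive) in steps of 'incr'. */
    static long numIterations(long from, long to, long incr) {
        if (incr == 0)
            throw new IllegalArgumentException("increment must be non-zero");
        // The sequence is empty if the increment points away from the upper bound.
        if ((incr > 0 && from > to) || (incr < 0 && from < to))
            return 0;
        return (to - from) / incr + 1;
    }

    /** A loop without a single iteration is a candidate for removal. */
    static boolean isRemovable(long from, long to, long incr) {
        return numIterations(from, to, incr) == 0;
    }

    public static void main(String[] args) {
        System.out.println(isRemovable(5, 1, 1));
        System.out.println(numIterations(1, 10, 3));
    }
}
```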
  - ALLOW_AUTO_VECTORIZATION
```
public static boolean ALLOW_AUTO_VECTORIZATION
```
  - ALLOW_SIZE_EXPRESSION_EVALUATION
```
public static boolean ALLOW_SIZE_EXPRESSION_EVALUATION
```
    Enables simple expression evaluation for datagen parameters 'rows', 'cols'. Simple expressions are defined as binary operations on literals and nrow/ncol. This applies only to exact size information.
  - ALLOW_WORSTCASE_SIZE_EXPRESSION_EVALUATION
```
public static boolean ALLOW_WORSTCASE_SIZE_EXPRESSION_EVALUATION
```
    Enables simple expression evaluation for datagen parameters 'rows', 'cols'. Simple expressions are defined as binary operations on literals and b(+) or b(*) on nrow/ncol. This applies also to worst-case size information.
  - ALLOW_RAND_JOB_RECOMPILE
```
public static boolean ALLOW_RAND_JOB_RECOMPILE
```
  - ALLOW_RUNTIME_PIGGYBACKING
```
public static boolean ALLOW_RUNTIME_PIGGYBACKING
```
    Enables parfor runtime piggybacking of MR jobs into the packed jobs for scan sharing.
  - ALLOW_INTER_PROCEDURAL_ANALYSIS
```
public static boolean ALLOW_INTER_PROCEDURAL_ANALYSIS
```
    Enables interprocedural analysis between main script and functions as well as functions and other functions. This includes, for example, propagating statistics into functions if it is safe to do so (e.g., if called once).
  - IPA_NUM_REPETITIONS
```
public static int IPA_NUM_REPETITIONS
```
    Number of inter-procedural analysis (IPA) repetitions. If set to >=2, we apply IPA multiple times in order to allow scalar propagation over complex function call graphs and various interactions between constant propagation, constant folding, and other rewrites such as branch removal and the merge of statement block sequences.
  - ALLOW_SUM_PRODUCT_REWRITES
```
public static boolean ALLOW_SUM_PRODUCT_REWRITES
```
    Enables sum product rewrites such as mapmultchains. In the future, this will cover all sum-product related rewrites.
  - ALLOW_SPLIT_HOP_DAGS
```
public static boolean ALLOW_SPLIT_HOP_DAGS
```
    Enables a specific hop dag rewrite that splits hop dags after csv persistent reads with unknown size in order to allow for recompile.
  - ALLOW_LOOP_UPDATE_IN_PLACE
```
public static boolean ALLOW_LOOP_UPDATE_IN_PLACE
```
    Enables a specific rewrite that enables update in place for loop variables that are only read/updated via cp leftindexing.
  - ALLOW_UNARY_UPDATE_IN_PLACE
```
public static boolean ALLOW_UNARY_UPDATE_IN_PLACE
```
    Enables the update-in-place for all unary operators with a single consumer. In this case we do not allocate the output, but directly write the output values back to the input block.
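    The difference between the two strategies can be sketched on plain arrays. This is only an illustration with hypothetical helper names; SystemDS applies the idea to its own matrix blocks, and only when the input has a single consumer, because the input values are overwritten.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical sketch: unary operation with and without output allocation.
class UnaryInPlaceSketch {

    /** Default strategy: allocate a new output and leave the input untouched. */
    static double[] applyCopy(double[] in, DoubleUnaryOperator f) {
        double[] out = new double[in.length];
        for (int i = 0; i < in.length; i++)
            out[i] = f.applyAsDouble(in[i]);
        return out;
    }

    /** Update in place: write the results back into the input array. */
    static double[] applyInPlace(double[] in, DoubleUnaryOperator f) {
        for (int i = 0; i < in.length; i++)
            in[i] = f.applyAsDouble(in[i]);
        return in;
    }

    public static void main(String[] args) {
        double[] a = {1.0, 4.0, 9.0};
        double[] r = applyInPlace(a, Math::sqrt);
        System.out.println(r == a); // same array object, no new allocation
    }
}
```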
  - ALLOW_EVAL_FCALL_REPLACEMENT
```
public static boolean ALLOW_EVAL_FCALL_REPLACEMENT
```
    Replace eval second-order function calls with normal function call if the function name is a known string (after constant propagation).
  - ALLOW_CODE_MOTION
```
public static boolean ALLOW_CODE_MOTION
```
    Enables a specific rewrite for code motion, i.e., hoisting loop invariant code out of while, for, and parfor loops.
  - FEDERATED_COMPILATION
```
public static boolean FEDERATED_COMPILATION
```
    Compile federated instructions based on input federation state and privacy constraints.
  - FEDERATED_SPECS
```
public static Map<Integer,FEDInstruction.FederatedOutput> FEDERATED_SPECS
```
  - PARALLEL_CP_READ_PARALLELISM_MULTIPLIER
```
public static final double PARALLEL_CP_READ_PARALLELISM_MULTIPLIER
```
    Specifies a multiplier for computing the degree of parallelism of parallel text read/write from the available degree of parallelism. Set it to 1.0 to get a number of threads equal to the number of virtual cores.
    
    See Also:
    
    Constant Field Values
  - PARALLEL_CP_WRITE_PARALLELISM_MULTIPLIER
```
public static final double PARALLEL_CP_WRITE_PARALLELISM_MULTIPLIER
```
    See Also:
    
    Constant Field Values
  - ALLOW_COMBINE_FILE_INPUT_FORMAT
```
public static final boolean ALLOW_COMBINE_FILE_INPUT_FORMAT
```
    Enables the use of CombineSequenceFileInputFormat with splitsize = 2x hdfs blocksize, if the sort buffer size is large enough and parallelism is not hurt. This solves two issues: (1) it combines small files (depending on producers), and (2) it reduces task latency of large jobs with many tasks by a factor of 2.
    
    See Also:
    
    Constant Field Values
  - ALLOW_SCRIPT_LEVEL_LOCAL_COMMAND
```
public static boolean ALLOW_SCRIPT_LEVEL_LOCAL_COMMAND
```
    This variable allows for use of explicit local command, that forces a spark block to be executed and returned as a local block.
  - ALLOW_SCRIPT_LEVEL_COMPRESS_COMMAND
```
public static boolean ALLOW_SCRIPT_LEVEL_COMPRESS_COMMAND
```
    Allows the user to insert compress and decompress commands directly in the DML script. This was added to provide a way to test and verify the correct placement of compress and decompress commands.
  - ALLOW_COMPRESSION_REWRITE
```
public static boolean ALLOW_COMPRESSION_REWRITE
```
    Boolean specifying whether compression rewrites are allowed. This is disabled at run time if the IPA for workload-aware compression is activated.
  - ALLOW_TRANSITIVE_SPARK_EXEC_TYPE
```
public static boolean ALLOW_TRANSITIVE_SPARK_EXEC_TYPE
```
    Enable transitive spark execution type selection. This refines the exec-type selection logic of unary aggregates by pushing the unary aggregates, whose inputs are created by spark instructions, to spark execution type as well.
  - ASYNC_TRIGGER_RDD_OPERATIONS
```
public static boolean ASYNC_TRIGGER_RDD_OPERATIONS
```
    Enable prefetch and broadcast. Prefetch asynchronously calls acquireReadAndRelease() to trigger a chain of spark transformations, which would otherwise make the next instruction wait until completion. Broadcast allows asynchronously transferring the data to all the nodes.
- Constructor Detail
  - OptimizerUtils
```
public OptimizerUtils()
```
- Method Detail
  - getOptLevel
```
public static OptimizerUtils.OptimizationLevel getOptLevel()
```
  - isMemoryBasedOptLevel
```
public static boolean isMemoryBasedOptLevel()
```
  - isOptLevel
```
public static boolean isOptLevel(OptimizerUtils.OptimizationLevel level)
```
  - constructCompilerConfig
```
public static CompilerConfig constructCompilerConfig(DMLConfig dmlconf)
```
  - constructCompilerConfig
```
public static CompilerConfig constructCompilerConfig(CompilerConfig cconf,
                                                     DMLConfig dmlconf)
```
  - resetStaticCompilerFlags
```
public static void resetStaticCompilerFlags()
```
  - getDefaultSize
```
public static long getDefaultSize()
```
  - resetDefaultSize
```
public static void resetDefaultSize()
```
  - getDefaultFrameSize
```
public static int getDefaultFrameSize()
```
  - getLocalMemBudget
```
public static double getLocalMemBudget()
```
    Returns memory budget (according to util factor) in bytes
    
    Returns:
    
    local memory budget
  - getBufferPoolLimit
```
public static long getBufferPoolLimit()
```
    Returns buffer pool size as set in the config
    
    Returns:
    
    buffer pool size in bytes
  - isUMMEnabled
```
public static boolean isUMMEnabled()
```
    Check if unified memory manager is in effect
    
    Returns:
    
    true if the unified memory manager is in effect
  - disableUMM
```
public static void disableUMM()
```
    Disable unified memory manager and fall back to static partitioning. Initialize LazyWriteBuffer with the default size (15%).
  - enableUMM
```
public static void enableUMM()
```
    Enable unified memory manager and initialize with the default size (85%).
  - isMaxLocalParallelism
```
public static boolean isMaxLocalParallelism(int k)
```
  - isTopLevelParFor
```
public static boolean isTopLevelParFor()
```
  - checkSparkBroadcastMemoryBudget
```
public static boolean checkSparkBroadcastMemoryBudget(double size)
```
  - checkSparkBroadcastMemoryBudget
```
public static boolean checkSparkBroadcastMemoryBudget(long rlen,
                                                      long clen,
                                                      long blen,
                                                      long nnz)
```
  - checkSparkCollectMemoryBudget
```
public static boolean checkSparkCollectMemoryBudget(DataCharacteristics dc,
                                                    long memPinned)
```
  - checkSparkCollectMemoryBudget
```
public static boolean checkSparkCollectMemoryBudget(DataCharacteristics dc,
                                                    long memPinned,
                                                    boolean checkBP)
```
  - checkSparseBlockCSRConversion
```
public static boolean checkSparseBlockCSRConversion(DataCharacteristics dcIn)
```
  - getNumReducers
```
public static int getNumReducers(boolean configOnly)
```
    Returns the number of reducers that potentially run in parallel. This is either just the configured value (SystemDS config) or the minimum of configured value and available reduce slots.
    
    Parameters:
    
    configOnly - true to return just the configured value
    
    Returns:
    
    number of reducers
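    The two cases described above can be written as a short sketch. The helper name and its parameters are hypothetical; in particular, how the configured value and the available reduce slots are obtained is not shown in this document and is left out here.

```java
// Hypothetical sketch of the selection rule described for getNumReducers.
class ReducerSketch {

    /** Either the configured value alone, or capped by the available reduce slots. */
    static int numReducers(int configured, int availableSlots, boolean configOnly) {
        return configOnly ? configured : Math.min(configured, availableSlots);
    }

    public static void main(String[] args) {
        System.out.println(numReducers(10, 4, true));
        System.out.println(numReducers(10, 4, false));
    }
}
```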
  - getNumMappers
```
public static int getNumMappers()
```
  - getDefaultExecutionMode
```
public static Types.ExecMode getDefaultExecutionMode()
```
  - isSparkExecutionMode
```
public static boolean isSparkExecutionMode()
```
  - isHybridExecutionMode
```
public static boolean isHybridExecutionMode()
```
  - getParallelTextReadParallelism
```
public static int getParallelTextReadParallelism()
```
    Returns the degree of parallelism used for parallel text read. This is computed as the number of virtual cores scaled by the PARALLEL_CP_READ_PARALLELISM_MULTIPLIER. If PARALLEL_READ_TEXTFORMATS is disabled, this method returns 1.
    
    Returns:
    
    degree of parallelism
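    A minimal sketch of this computation is shown below. The helper name is hypothetical, and rounding up with a lower bound of one thread is an assumption; the document only states that the number of virtual cores is scaled by the multiplier and that 1 is returned when parallel text read is disabled.

```java
// Hypothetical sketch: degree of parallelism from virtual cores and a multiplier.
class ParallelismSketch {

    static int degreeOfParallelism(int virtualCores, double multiplier, boolean parallelEnabled) {
        if (!parallelEnabled)
            return 1; // sequential read if the feature is disabled
        // Assumption: round up and never return less than one thread.
        return Math.max(1, (int) Math.ceil(virtualCores * multiplier));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(degreeOfParallelism(cores, 1.0, true));
    }
}
```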
  - getParallelBinaryReadParallelism
```
public static int getParallelBinaryReadParallelism()
```
  - getParallelTextWriteParallelism
```
public static int getParallelTextWriteParallelism()
```
    Returns the degree of parallelism used for parallel text write. This is computed as the number of virtual cores scaled by the PARALLEL_CP_WRITE_PARALLELISM_MULTIPLIER. If PARALLEL_WRITE_TEXTFORMATS is disabled, this method returns 1.
    
    Returns:
    
    degree of parallelism
  - getParallelBinaryWriteParallelism
```
public static int getParallelBinaryWriteParallelism()
```
  - estimateSize
```
public static long estimateSize(DataCharacteristics dc)
```
  - estimateSizeExactSparsity
```
public static long estimateSizeExactSparsity(DataCharacteristics dc)
```
  - estimateSizeExactSparsity
```
public static long estimateSizeExactSparsity(long nrows,
                                             long ncols,
                                             long nnz)
```
    Estimates the footprint (in bytes) for an in-memory representation of a matrix with dimensions=(nrows,ncols) and number of non-zeros nnz.
    
    Parameters:
    
    nrows - number of rows
    
    ncols - number of cols
    
    nnz - number of non-zeros
    
    Returns:
    
    memory footprint
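    To give a feel for how nnz relates to sparsity and to the footprint, here is a deliberately simplified model. It assumes 8-byte values, a CSR-like sparse layout with 4-byte indices, and no object headers, and it simply takes the smaller of the two estimates; the actual SystemDS accounting and its dense/sparse format decision are not reproduced here.

```java
// Hypothetical, simplified size model; not the SystemDS estimator.
class SizeEstimateSketch {

    /** Sparsity as the fraction of non-zero cells. */
    static double sparsity(long nrows, long ncols, long nnz) {
        return (double) nnz / ((double) nrows * ncols);
    }

    /** Dense layout: one 8-byte double per cell. */
    static long denseBytes(long nrows, long ncols) {
        return 8L * nrows * ncols;
    }

    /** CSR-like layout: value (8 B) and column index (4 B) per non-zero, plus row pointers (4 B each). */
    static long sparseBytes(long nrows, long nnz) {
        return 12L * nnz + 4L * (nrows + 1);
    }

    /** Simplification: pick whichever layout is smaller. */
    static long estimate(long nrows, long ncols, long nnz) {
        return Math.min(denseBytes(nrows, ncols), sparseBytes(nrows, nnz));
    }

    public static void main(String[] args) {
        System.out.println(sparsity(1000, 1000, 10000));
        System.out.println(estimate(1000, 1000, 10000));
    }
}
```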
  - estimateSizeExactSparsity
```
public static long estimateSizeExactSparsity(long nrows,
                                             long ncols,
                                             double sp)
```
    Estimates the footprint (in bytes) for an in-memory representation of a matrix with dimensions=(nrows,ncols) and sparsity=sp. This function can be used directly in Hops when the actual sparsity is known, i.e., when sp is guaranteed to give a worst-case estimate (e.g., Rand with a fixed sparsity). In all other cases, estimateSize() must be used so that worst-case estimates are computed whenever applicable.
    
    Parameters:
    
    nrows - number of rows
    
    ncols - number of cols
    
    sp - sparsity
    
    Returns:
    
    memory footprint
  - estimatePartitionedSizeExactSparsity
```
public static long estimatePartitionedSizeExactSparsity(DataCharacteristics dc)
```
    Estimates the footprint (in bytes) for a partitioned in-memory representation of a matrix with the given matrix characteristics
    
    Parameters:
    
    dc - matrix characteristics
    
    Returns:
    
    memory estimate
  - estimatePartitionedSizeExactSparsity
```
public static long estimatePartitionedSizeExactSparsity(DataCharacteristics dc,
                                                        boolean outputEmptyBlocks)
```
  - estimatePartitionedSizeExactSparsity
```
public static long estimatePartitionedSizeExactSparsity(long rlen,
                                                        long clen,
                                                        long blen,
                                                        long nnz)
```
    Estimates the footprint (in bytes) for a partitioned in-memory representation of a matrix with dimensions=(rlen,clen) and number of non-zeros nnz.
    
    Parameters:
    
    rlen - number of rows
    
    clen - number of cols
    
    blen - rows/cols per block
    
    nnz - number of non-zeros
    
    Returns:
    
    memory estimate
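    A partitioned representation splits the matrix into blocks of at most blen x blen cells, so the number of blocks is one input to such an estimate. The sketch below only counts blocks; the helper name is hypothetical, and the per-block memory accounting of SystemDS is not modeled.

```java
// Hypothetical sketch: number of blocks of a blocked (partitioned) matrix.
class BlockCountSketch {

    /** Ceiling division for positive longs. */
    static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    /** Blocks per row dimension times blocks per column dimension. */
    static long numBlocks(long rlen, long clen, long blen) {
        return ceilDiv(rlen, blen) * ceilDiv(clen, blen);
    }

    public static void main(String[] args) {
        System.out.println(numBlocks(2500, 1000, 1000));
    }
}
```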
  - estimatePartitionedSizeExactSparsity
```
public static long estimatePartitionedSizeExactSparsity(long rlen,
                                                        long clen,
                                                        long blen,
                                                        long nnz,
                                                        boolean outputEmptyBlocks)
```
  - estimatePartitionedSizeExactSparsity
```
public static long estimatePartitionedSizeExactSparsity(Hop hop)
```
    Estimates the footprint (in bytes) for a partitioned in-memory representation of a matrix with the hop's dimensions and number of non-zeros nnz.
    
    Parameters:
    
    hop - The hop to extract dimensions and nnz from
    
    Returns:
    
    the memory estimate
  - estimatePartitionedSizeExactSparsity
```
public static long estimatePartitionedSizeExactSparsity(long rlen,
                                                        long clen,
                                                        long blen,
                                                        double sp)
```
    Estimates the footprint (in bytes) for a partitioned in-memory representation of a matrix with dimensions=(rlen,clen) and sparsity=sp.
    
    Parameters:
    
    rlen - number of rows
    
    clen - number of cols
    
    blen - rows/cols per block
    
    sp - sparsity
    
    Returns:
    
    memory estimate
  - estimatePartitionedSizeExactSparsity
```
public static long estimatePartitionedSizeExactSparsity(long rlen,
                                                        long clen,
                                                        long blen,
                                                        double sp,
                                                        boolean outputEmptyBlocks)
```
  - estimateSize
```
public static long estimateSize(long nrows,
                                long ncols)
```
    Similar to estimate() except that it provides worst-case estimates when the optimization type is ROBUST.
    
    Parameters:
    
    nrows - number of rows
    
    ncols - number of cols
    
    Returns:
    
    memory estimate
  - estimateSizeEmptyBlock
```
public static long estimateSizeEmptyBlock(long nrows,
                                          long ncols)
```
  - estimateSizeTextOutput
```
public static long estimateSizeTextOutput(long rows,
                                          long cols,
                                          long nnz,
                                          Types.FileFormat fmt)
```
  - estimateSizeTextOutput
```
public static long estimateSizeTextOutput(int[] dims,
                                          long nnz,
                                          Types.FileFormat fmt)
```
  - getTotalMemEstimate
```
public static double getTotalMemEstimate(Hop[] in,
                                         Hop out)
```
  - getTotalMemEstimate
```
public static double getTotalMemEstimate(Hop[] in,
                                         Hop out,
                                         boolean denseOut)
```
  - isIndexingRangeBlockAligned
```
public static boolean isIndexingRangeBlockAligned(IndexRange ixrange,
                                                  DataCharacteristics mc)
```
    Indicates if the given indexing range is block aligned, i.e., it does not require global aggregation of blocks.
    
    Parameters:
    
    ixrange - indexing range
    
    mc - matrix characteristics
    
    Returns:
    
    true if indexing range is block aligned
  - isIndexingRangeBlockAligned
```
public static boolean isIndexingRangeBlockAligned(long rl,
                                                  long ru,
                                                  long cl,
                                                  long cu,
                                                  long blen)
```
    Indicates if the given indexing range is block aligned, i.e., it does not require global aggregation of blocks.
    
    Parameters:
    
    rl - rows lower
    
    ru - rows upper
    
    cl - cols lower
    
    cu - cols upper
    
    blen - rows/cols per block
    
    Returns:
    
    true if indexing range is block aligned
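    
    To make "block aligned" concrete, here is a minimal sketch of one sufficient condition: the range starts on the first cell of a block in both dimensions, so each output block is copied from a single input block. It assumes 1-based, inclusive index bounds. The class and method names are hypothetical, the upper bounds ru and cu are ignored for simplicity, and the rule actually implemented by SystemDS may be more permissive.

```java
// Hypothetical helper, NOT part of SystemDS. Assumes 1-based index bounds.
final class BlockAlignSketch {

    // True if the 1-based index is the first cell of a block of size blen.
    static boolean startsBlock(long lower, long blen) {
        return (lower - 1) % blen == 0;
    }

    // Simplified alignment test: both lower bounds start a block.
    // Unknown bounds (values below 1) are treated as not aligned.
    static boolean isAligned(long rl, long cl, long blen) {
        return rl >= 1 && cl >= 1
            && startsBlock(rl, blen)
            && startsBlock(cl, blen);
    }

    public static void main(String[] args) {
        System.out.println(isAligned(1001, 2001, 1000));
        System.out.println(isAligned(2, 1, 1000));
    }
}
```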
  - isValidCPDimensions
```
public static boolean isValidCPDimensions(DataCharacteristics mc)
```
  - isValidCPDimensions
```
public static boolean isValidCPDimensions(long rows,
                                          long cols)
```
    Returns false if the dimensions are known to be invalid; otherwise true.
    
    Parameters:
    
    rows - number of rows
    
    cols - number of cols
    
    Returns:
    
    true if dimensions valid
  - isValidCPDimensions
```
public static boolean isValidCPDimensions(Types.ValueType[] schema,
                                          String[] names)
```
    Returns false if schema and names are not properly specified; otherwise true. Both arrays must have length > 0, and their lengths must be equal.
    
    Parameters:
    
    schema - the schema
    
    names - the names
    
    Returns:
    
    false if schema and names are not properly specified
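    
    The stated rule (both lengths greater than 0 and equal) can be sketched as follows. The enum is a hypothetical stand-in for Types.ValueType, and the null checks are an added assumption rather than documented behavior.

```java
// Hypothetical helper, NOT part of SystemDS.
final class SchemaCheckSketch {

    // Stand-in for Types.ValueType; the constants are illustrative only.
    enum ValueType { FP64, INT64, STRING, BOOLEAN }

    // Valid if both arrays are present (assumption), non-empty, and equally long.
    static boolean isValid(ValueType[] schema, String[] names) {
        return schema != null && names != null
            && schema.length > 0
            && schema.length == names.length;
    }

    public static void main(String[] args) {
        ValueType[] schema = { ValueType.FP64, ValueType.STRING };
        String[] names = { "price", "label" };
        System.out.println(isValid(schema, names));
    }
}
```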
  - isValidCPMatrixSize
```
public static boolean isValidCPMatrixSize(long rows,
                                          long cols,
                                          double sparsity)
```
    Determines whether a matrix of the given size can be represented in CP data structures. Note that sparsity needs to be specified as rows*cols if unknown.
    
    Parameters:
    
    rows - number of rows
    
    cols - number of cols
    
    sparsity - the sparsity
    
    Returns:
    
    true if valid matrix size
  - exceedsCachingThreshold
```
public static boolean exceedsCachingThreshold(long dim2,
                                              double outMem)
```
    Indicates if the given matrix characteristics exceed the threshold for caching, i.e., the matrix should be cached.
    
    Parameters:
    
    dim2 - dimension 2
    
    outMem - ?
    
    Returns:
    
    true if the given matrix characteristics exceed threshold
  - getUniqueTempFileName
```
public static String getUniqueTempFileName()
```
    Wrapper over internal filename construction for external usage.
    
    Returns:
    
    unique temp file name
  - allowsToFilterEmptyBlockOutputs
```
public static boolean allowsToFilterEmptyBlockOutputs(Hop hop)
```
  - getConstrainedNumThreads
```
public static int getConstrainedNumThreads(int maxNumThreads)
```
  - getTransformNumThreads
```
public static int getTransformNumThreads()
```
  - getDefaultLogLevel
```
public static org.apache.log4j.Level getDefaultLogLevel()
```
  - getMatMultNnz
```
public static long getMatMultNnz(double sp1,
                                 double sp2,
                                 long m,
                                 long k,
                                 long n,
                                 boolean worstcase)
```
  - getMatMultSparsity
```
public static double getMatMultSparsity(double sp1,
                                        double sp2,
                                        long m,
                                        long k,
                                        long n,
                                        boolean worstcase)
```
    Estimates the result sparsity for Matrix Multiplication A %*% B.
    
    Parameters:
    
    sp1 - sparsity of A
    
    sp2 - sparsity of B
    
    m - nrow(A)
    
    k - ncol(A), nrow(B)
    
    n - ncol(B)
    
    worstcase - true if worst case
    
    Returns:
    
    the sparsity
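    
    For intuition about what such an estimate can look like, here is a minimal sketch of two standard estimators for the sparsity of A %*% B. The average case assumes non-zeros are placed independently and uniformly; the worst case bounds the output by the number of non-empty rows of A times the number of non-empty columns of B. The class and method names are hypothetical, and this is not claimed to be the formula used by getMatMultSparsity.

```java
// Hypothetical helper, NOT part of SystemDS.
final class MatMultSparsitySketch {

    // Average case: an output cell is zero only if all k partial products
    // are zero, each of which is non-zero with probability sp1 * sp2.
    static double averageCase(double sp1, double sp2, long k) {
        return 1.0 - Math.pow(1.0 - sp1 * sp2, k);
    }

    // Worst case: A has at most min(m, nnz1) non-empty rows and B has at most
    // min(n, nnz2) non-empty columns, which bounds the output non-zeros.
    static double worstCase(double sp1, double sp2, long m, long k, long n) {
        double nnz1 = sp1 * m * k;
        double nnz2 = sp2 * k * n;
        return Math.min(1.0, nnz1 / m) * Math.min(1.0, nnz2 / n);
    }

    public static void main(String[] args) {
        System.out.println(averageCase(0.01, 0.01, 1000));
        System.out.println(worstCase(0.01, 0.01, 1000, 1000, 1000));
    }
}
```

    The worst-case bound is never smaller than what the input non-zero counts allow, so it is the safer choice when the estimate feeds a memory budget.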
  - getLeftIndexingSparsity
```
public static double getLeftIndexingSparsity(long rlen1,
                                             long clen1,
                                             long nnz1,
                                             long rlen2,
                                             long clen2,
                                             long nnz2)
```
  - isBinaryOpConditionalSparseSafe
```
public static boolean isBinaryOpConditionalSparseSafe(Types.OpOp2 op)
```
    Determines if a given binary op is potentially conditional sparse safe.
    
    Parameters:
    
    op - the HOP OpOp2
    
    Returns:
    
    true if potentially conditional sparse safe
  - isBinaryOpConditionalSparseSafeExact
```
public static boolean isBinaryOpConditionalSparseSafeExact(Types.OpOp2 op,
                                                           LiteralOp lit)
```
    Determines if a given binary op with a scalar literal guarantees an output sparsity that is exactly the same as its matrix input sparsity.
    
    Parameters:
    
    op - the HOP OpOp2
    
    lit - literal operator
    
    Returns:
    
    true if output sparsity same as matrix input sparsity
  - isBinaryOpSparsityConditionalSparseSafe
```
public static boolean isBinaryOpSparsityConditionalSparseSafe(Types.OpOp2 op,
                                                              LiteralOp lit)
```
  - getBinaryOpSparsityConditionalSparseSafe
```
public static double getBinaryOpSparsityConditionalSparseSafe(double sp1,
                                                              Types.OpOp2 op,
                                                              LiteralOp lit)
```
  - getBinaryOpSparsity
```
public static double getBinaryOpSparsity(double sp1,
                                         double sp2,
                                         Types.OpOp2 op,
                                         boolean worstcase)
```
    Estimates the result sparsity for matrix-matrix binary operations (A op B)
    
    Parameters:
    
    sp1 - sparsity of A
    
    sp2 - sparsity of B
    
    op - binary operation
    
    worstcase - true if worst case
    
    Returns:
    
    result sparsity for matrix-matrix binary operations
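    
    As an illustration of how such an estimate can depend on the operation, here is a minimal sketch for two cell-wise operations. Multiplication yields a non-zero only where both inputs are non-zero; addition can yield a non-zero where either input is non-zero. The enum is a hypothetical stand-in for a subset of Types.OpOp2, the average case assumes independently placed non-zeros, and the formulas are not claimed to match getBinaryOpSparsity.

```java
// Hypothetical helper, NOT part of SystemDS.
final class BinaryOpSparsitySketch {

    // Stand-in for a small subset of Types.OpOp2.
    enum Op { PLUS, MULT }

    static double estimate(double sp1, double sp2, Op op, boolean worstcase) {
        switch (op) {
            case MULT:
                // Non-zero only where both inputs are non-zero.
                return worstcase ? Math.min(sp1, sp2) : sp1 * sp2;
            case PLUS:
                // Non-zero where at least one input is non-zero.
                return worstcase ? Math.min(1.0, sp1 + sp2)
                                 : sp1 + sp2 - sp1 * sp2;
            default:
                // Unknown operation: fall back to a dense output.
                return 1.0;
        }
    }

    public static void main(String[] args) {
        System.out.println(estimate(0.1, 0.2, Op.MULT, false));
        System.out.println(estimate(0.1, 0.2, Op.PLUS, true));
    }
}
```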
  - getOuterNonZeros
```
public static long getOuterNonZeros(long n1,
                                    long n2,
                                    long nnz1,
                                    long nnz2,
                                    Types.OpOp2 op)
```
  - getNnz
```
public static long getNnz(long dim1,
                          long dim2,
                          double sp)
```
  - getSparsity
```
public static double getSparsity(DataCharacteristics dc)
```
  - getSparsity
```
public static double getSparsity(long dim1,
                                 long dim2,
                                 long nnz)
```
  - getSparsity
```
public static double getSparsity(Hop hop)
```
  - getSparsity
```
public static double getSparsity(long[] dims,
                                 long nnz)
```
  - toMB
```
public static String toMB(double inB)
```
  - getNumIterations
```
public static long getNumIterations(ForProgramBlock fpb,
                                    long defaultValue)
```
  - getNumIterations
```
public static long getNumIterations(ForStatementBlock fsb,
                                    long defaultValue)
```
  - getNumIterations
```
public static long getNumIterations(ForProgramBlock fpb,
                                    LocalVariableMap vars,
                                    long defaultValue)
```
  - rEvalSimpleLongExpression
```
public static long rEvalSimpleLongExpression(Hop root,
                                             HashMap<Long,Long> valMemo)
```
    Evaluates simple size expressions over literals and nrow/ncol. Returns the exact result of the expression if known; otherwise returns Long.MAX_VALUE.
    
    Parameters:
    
    root - the root high-level operator
    
    valMemo - ?
    
    Returns:
    
    size expression
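    
    The general technique described here, recursive evaluation with a memo table and a sentinel for unknown results, can be sketched as follows. The Node class is a hypothetical stand-in for a Hop, the memo is assumed to be keyed by a unique node ID, and only '+' and '*' are supported; none of this reflects the actual SystemDS implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, NOT part of SystemDS.
final class SizeExprSketch {

    // Sentinel for "result unknown", mirroring the documented Long.MAX_VALUE.
    static final long UNKNOWN = Long.MAX_VALUE;

    // Minimal expression node: 'L' = literal, '?' = unknown size, else binary op.
    static final class Node {
        final long id;
        final char op;
        final long value;
        final Node left;
        final Node right;

        private Node(long id, char op, long value, Node left, Node right) {
            this.id = id; this.op = op; this.value = value;
            this.left = left; this.right = right;
        }
        static Node lit(long id, long value) { return new Node(id, 'L', value, null, null); }
        static Node unknown(long id) { return new Node(id, '?', 0, null, null); }
        static Node bin(long id, char op, Node l, Node r) { return new Node(id, op, 0, l, r); }
    }

    // Evaluates the expression rooted at n, reusing results cached by node ID.
    static long eval(Node n, Map<Long, Long> memo) {
        Long cached = memo.get(n.id);
        if (cached != null) {
            return cached;
        }
        long ret;
        if (n.op == 'L') {
            ret = n.value;
        } else if (n.op == '?') {
            ret = UNKNOWN;
        } else {
            long l = eval(n.left, memo);
            long r = eval(n.right, memo);
            if (l == UNKNOWN || r == UNKNOWN) {
                ret = UNKNOWN;          // unknown inputs propagate upward
            } else if (n.op == '+') {
                ret = l + r;
            } else if (n.op == '*') {
                ret = l * r;
            } else {
                ret = UNKNOWN;          // unsupported operator
            }
        }
        memo.put(n.id, ret);
        return ret;
    }

    public static void main(String[] args) {
        Node sum = Node.bin(3, '+', Node.lit(1, 3), Node.lit(2, 4));
        Node prod = Node.bin(4, '*', sum, sum);   // shared subexpression
        System.out.println(eval(prod, new HashMap<Long, Long>()));
    }
}
```

    Because the memo is keyed by node ID, a subexpression shared by several parents in the DAG is evaluated only once.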
  - rEvalSimpleLongExpression
```
public static long rEvalSimpleLongExpression(Hop root,
                                             HashMap<Long,Long> valMemo,
                                             LocalVariableMap vars)
```
  - rEvalSimpleDoubleExpression
```
public static double rEvalSimpleDoubleExpression(Hop root,
                                                 HashMap<Long,Double> valMemo)
```
  - rEvalSimpleDoubleExpression
```
public static double rEvalSimpleDoubleExpression(Hop root,
                                                 HashMap<Long,Double> valMemo,
                                                 LocalVariableMap vars)
```
