Package org.apache.sysds.resource.cost
Class CostEstimator
java.lang.Object
  org.apache.sysds.resource.cost.CostEstimator
public class CostEstimator extends Object
Class for estimating the execution time of a program. To estimate the time for a new set of resources, create a new CostEstimator instance.
-
-
Constructor Summary
CostEstimator(Program program, CloudInstance driverNode, CloudInstance executorNode)
-
Method Summary
static double estimateExecutionTime(Program program, CloudInstance driverNode, CloudInstance executorNode)
  Entry point for estimating the execution time of a program.
VarStats getStats(String statsName)
  Intended to be called only when it is certain that the corresponding variable is not a scalar and its statistics are in _stats already.
VarStats getStatsWithDefaultScalar(String statsName)
  Intended to be called when the corresponding variable could be scalar.
double getTimeEstimate()
double getTimeEstimateCPInst(CPInstruction inst)
  Estimates the execution time of a single CP instruction following the formula C(p) = T_w + max(T_r, T_c), where T_w is the instruction write (to mem.) time, T_r the instruction read (to mem.) time, and T_c the instruction compute time.
double getTimeEstimateInst(Instruction inst)
double getTimeEstimateSparkJob(VarStats varToCollect)
void maintainFCallInputStats(FunctionCallCPInstruction finst)
  Creates copies of the VarStats for the function arguments.
void maintainFCallOutputStats(FunctionCallCPInstruction finst, FunctionProgramBlock fpb)
  Creates copies of the VarStats for the function output parameters.
void maintainStats(Instruction inst)
  Keeps the basic-block variable statistics updated and computes I/O cost.
double parseSPInst(SPInstruction inst)
  Parses a Spark instruction and stores the corresponding cost for computing the output variable in the RDD statistics object related to that variable.
void putStats(HashMap<String,VarStats> inputStats)
  Meant to be used for testing purposes.
-
-
-
Constructor Detail
-
CostEstimator
public CostEstimator(Program program, CloudInstance driverNode, CloudInstance executorNode)
-
-
Method Detail
-
estimateExecutionTime
public static double estimateExecutionTime(Program program, CloudInstance driverNode, CloudInstance executorNode) throws CostEstimationException
Entry point for estimating the execution time of a program.
- Parameters:
program - compiled runtime program
driverNode - ?
executorNode - ?
- Returns:
estimated time for execution of the program given the resources set in SparkExecutionContext
- Throws:
CostEstimationException - in case of errors
-
putStats
public void putStats(HashMap<String,VarStats> inputStats)
Meant to be used for testing purposes.
- Parameters:
inputStats - ?
-
getStats
public VarStats getStats(String statsName)
Intended to be called only when it is certain that the corresponding variable is not a scalar and its statistics are in _stats already.
- Parameters:
statsName - the corresponding operand name
- Returns:
VarStats object, if the given key is present in the map holding the current variable statistics
- Throws:
RuntimeException - if the corresponding variable is not in _stats
-
getStatsWithDefaultScalar
public VarStats getStatsWithDefaultScalar(String statsName)
Intended to be called when the corresponding variable could be scalar.
- Parameters:
statsName - the corresponding operand name
- Returns:
VarStats object in any case
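
The difference between the two lookups above can be illustrated with a minimal, self-contained sketch. It does not use the SystemDS classes: `StatsLookupSketch` and its nested `Stats` type are hypothetical stand-ins for CostEstimator and VarStats, and modelling the scalar default as a 1 x 1 entry is an assumption made only for this example.

```java
import java.util.HashMap;
import java.util.Map;

class StatsLookupSketch {
    // Hypothetical stand-in for VarStats; the real class is not reproduced here.
    static final class Stats {
        final String name;
        final long rows;
        final long cols;

        Stats(String name, long rows, long cols) {
            this.name = name;
            this.rows = rows;
            this.cols = cols;
        }
    }

    // Plays the role of the map of current variable statistics.
    private final Map<String, Stats> stats = new HashMap<>();

    void put(Stats s) {
        stats.put(s.name, s);
    }

    // Strict lookup: the caller asserts that the variable is already tracked.
    Stats getStats(String name) {
        Stats s = stats.get(name);
        if (s == null) {
            throw new RuntimeException("No statistics for variable: " + name);
        }
        return s;
    }

    // Lenient lookup: an untracked name is treated as a scalar.
    // Assumption of this sketch: a scalar is modelled as a 1 x 1 entry.
    Stats getStatsWithDefaultScalar(String name) {
        Stats s = stats.get(name);
        return s != null ? s : new Stats(name, 1, 1);
    }

    public static void main(String[] args) {
        StatsLookupSketch lookup = new StatsLookupSketch();
        lookup.put(new Stats("X", 1000, 10));

        System.out.println(lookup.getStats("X").rows);
        System.out.println(lookup.getStatsWithDefaultScalar("alpha").rows);
        try {
            lookup.getStats("alpha");
        } catch (RuntimeException e) {
            System.out.println("strict lookup failed: " + e.getMessage());
        }
    }
}
```

The strict variant fails fast when the statistics are missing, while the lenient variant always returns an object, which matches the "in any case" wording of the return description.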
-
getTimeEstimate
public double getTimeEstimate() throws CostEstimationException
- Throws:
CostEstimationException
-
maintainFCallInputStats
public void maintainFCallInputStats(FunctionCallCPInstruction finst)
Creates copies of the VarStats for the function arguments. Meant to be called before estimating the execution time of the function program block of the corresponding function call instruction; otherwise the relevant statistics would not be available for the estimation.
- Parameters:
finst - ?
-
maintainFCallOutputStats
public void maintainFCallOutputStats(FunctionCallCPInstruction finst, FunctionProgramBlock fpb)
Creates copies of the VarStats for the function output parameters. Meant to be called after estimating the execution time of the function program block of the corresponding function call instruction; otherwise the relevant statistics would not have been created yet.
- Parameters:
finst - ?
fpb - ?
-
maintainStats
public void maintainStats(Instruction inst)
Keeps the basic-block variable statistics updated and computes I/O cost. NOTE: At program execution, files are read only once the matrix is needed, but for cost estimation the point at which the cost is added is not relevant.
- Parameters:
inst - ?
-
getTimeEstimateInst
public double getTimeEstimateInst(Instruction inst) throws CostEstimationException
- Throws:
CostEstimationException
-
getTimeEstimateCPInst
public double getTimeEstimateCPInst(CPInstruction inst) throws CostEstimationException
Estimates the execution time of a single CP instruction following the formula C(p) = T_w + max(T_r, T_c) with:
- T_w - instruction write (to mem.) time
- T_r - instruction read (to mem.) time
- T_c - instruction compute time
- Parameters:
inst - instruction for estimation
- Returns:
estimated time in seconds
- Throws:
CostEstimationException - when the hardware configuration is not sufficient
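
As a worked illustration of the formula C(p) = T_w + max(T_r, T_c), the sketch below evaluates it for two sets of made-up timings. `CpCostSketch` and its `estimate` method are hypothetical names, not part of the SystemDS API, and the input values are illustrative only. One plausible reading of the max term is that reading and computing overlap, so only the slower of the two contributes; the source does not state this explicitly.

```java
class CpCostSketch {
    // C(p) = T_w + max(T_r, T_c), all times in seconds.
    static double estimate(double writeTime, double readTime, double computeTime) {
        return writeTime + Math.max(readTime, computeTime);
    }

    public static void main(String[] args) {
        // Compute-bound case: 0.5 + max(1.0, 2.5) = 3.0
        System.out.println(estimate(0.5, 1.0, 2.5));
        // Read-bound case: 0.5 + max(4.0, 2.5) = 4.5
        System.out.println(estimate(0.5, 4.0, 2.5));
    }
}
```

In both cases the write time is always added in full, while the read and compute times compete for the second term.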
-
parseSPInst
public double parseSPInst(SPInstruction inst) throws CostEstimationException
Parses a Spark instruction and stores the corresponding cost for computing the output variable in the RDD statistics object related to that variable. This method is responsible for initializing the corresponding RDDStats object for each output variable, including for outputs that are explicitly brought back to CP (Spark action within the instruction). It returns the time estimate only for those instructions that bring the output explicitly to CP. For the rest, the estimated time (cost) is stored as part of the corresponding RDD statistics, emulating the lazy evaluation of Spark.
- Parameters:
inst - Spark instruction for parsing
- Returns:
if explicit action, estimated time in seconds, else always 0
- Throws:
CostEstimationException - ?
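
The deferred-cost behaviour described for parseSPInst can be sketched in isolation. The class below is a simplified model, not the SystemDS implementation: `LazySparkCostSketch`, its `parse` method, and the plain map standing in for the per-variable RDDStats objects are all hypothetical, and summing the pending costs of the inputs is an assumption about how deferred costs propagate.

```java
import java.util.HashMap;
import java.util.Map;

class LazySparkCostSketch {
    // Pending (not yet charged) cost per output variable; stand-in for RDDStats.
    private final Map<String, Double> pendingCost = new HashMap<>();

    // Returns the estimated time only when the instruction contains an action;
    // otherwise the cost is stored with the output variable and 0 is returned.
    double parse(String output, double opCost, boolean isAction, String... inputs) {
        double total = opCost;
        for (String in : inputs) {
            // Assumption: deferred costs of the inputs are added to this operation.
            total += pendingCost.getOrDefault(in, 0.0);
        }
        if (isAction) {
            // The output is brought back to CP, so nothing stays pending for it.
            pendingCost.put(output, 0.0);
            return total;
        }
        pendingCost.put(output, total);
        return 0.0;
    }

    public static void main(String[] args) {
        LazySparkCostSketch sketch = new LazySparkCostSketch();
        System.out.println(sketch.parse("A", 2.0, false));      // transformation
        System.out.println(sketch.parse("B", 3.0, false, "A")); // transformation
        System.out.println(sketch.parse("c", 1.5, true, "B"));  // action
    }
}
```

The two transformations report 0 because their cost is only recorded; the action then reports its own cost plus everything that was deferred along its input chain.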
-
getTimeEstimateSparkJob
public double getTimeEstimateSparkJob(VarStats varToCollect)
-
-