Class RDDStats


  • public class RDDStats
    extends Object
    • Constructor Detail

      • RDDStats

        public RDDStats(VarStats sourceStats)
        Initializes an RDD statistics object bound to an existing VarStats object. The number of partitions for the current RDD is adjusted automatically based on the HDFS block size.
        Parameters:
        sourceStats - the variable statistics object to bind to
      • RDDStats

        public RDDStats(long size,
                        int partitions)
        Initializes an RDD statistics object for an intermediate variable that is not bound to a VarStats object. Intended for estimating intermediate shuffles.
        Parameters:
        size - distributed size of the object
        partitions - target number of partitions; pass -1 to fit the partition count to the HDFS block size
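
The `-1` convention for `partitions` can be illustrated with a minimal, standalone sketch. This is not the `RDDStats` implementation: the class `PartitionSketch`, the method `resolvePartitions`, the 128 MiB block size, and the ceiling-division rule are all assumptions made here for illustration. The real class may read the block size from the cluster configuration and apply additional bounds.

```java
public class PartitionSketch {
    // Assumption: 128 MiB, the common Hadoop default; the real value is configurable.
    static final long ASSUMED_HDFS_BLOCK_SIZE = 128L * 1024 * 1024;

    // Hypothetical helper: returns the explicit partition count, or, for -1,
    // one partition per (assumed) HDFS block needed to hold `size` bytes.
    static int resolvePartitions(long size, int partitions) {
        if (partitions != -1) {
            return partitions; // caller supplied an explicit target
        }
        if (size <= 0) {
            return 1; // keep at least one partition for empty inputs
        }
        // Ceiling division: number of blocks required to cover `size` bytes.
        long blocks = (size + ASSUMED_HDFS_BLOCK_SIZE - 1) / ASSUMED_HDFS_BLOCK_SIZE;
        return (int) Math.min(blocks, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024 * 1024;
        System.out.println(resolvePartitions(oneGiB, -1)); // fitted to block size
        System.out.println(resolvePartitions(oneGiB, 16)); // explicit target kept
    }
}
```

Under these assumptions, a 1 GiB object with `partitions = -1` resolves to 8 partitions, while an explicit value such as 16 is returned unchanged.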
    • Method Detail

      • getCost

        public double getCost()
        Intended for use in tests.
        Returns:
        estimated time, in seconds, to generate the current RDD
      • isCollected

        public boolean isCollected()
        Intended for use in tests.
        Returns:
        true if the current RDD is collected, false otherwise