Class LibMatrixDatagen


  • public class LibMatrixDatagen
    extends Object
    • Method Detail

      • isShortcutRandOperation

        public static boolean isShortcutRandOperation​(double min,
                                                      double max,
                                                      double sparsity,
                                                      RandomMatrixGenerator.PDF pdf)
      • updateSeqIncr

        public static double updateSeqIncr​(double seq_from,
                                           double seq_to,
                                           double seq_incr)
      • generateUniqueSeedPath

        public static String generateUniqueSeedPath​(String basedir)
      • setupSeedsForRand

        public static org.apache.commons.math3.random.Well1024a setupSeedsForRand​(long seed)
        A matrix of random numbers is generated using multiple seeds, one per block. These block-level seeds are drawn from a WELL (Well Equidistributed Long-period Linear) generator, Well1024a. For a given seed, this function sets up the generator from which the block-level seeds are drawn. It is invoked from both CP (RandCPInstruction.processInstruction()) and MR (RandMR.java, while setting up the Rand job).
        Parameters:
        seed - seed for random generator
        Returns:
        Well1024a pseudo-random number generator
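
        The per-block seeding idea described above can be sketched as follows. This is an illustration only, not the library's implementation: java.util.Random stands in for Well1024a so that the example has no external dependency, and the class name BlockSeedSketch and the method blockSeeds are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

// Sketch only: java.util.Random is a stand-in for Well1024a (assumption),
// chosen so the example compiles without Apache Commons Math.
final class BlockSeedSketch {

    // Draws one seed per block from a single master generator, so that the
    // whole matrix is reproducible from one user-supplied seed.
    static long[] blockSeeds(long seed, int numBlocks) {
        Random master = new Random(seed); // plays the role of the returned generator
        long[] seeds = new long[numBlocks];
        for (int i = 0; i < numBlocks; i++) {
            seeds[i] = master.nextLong();
        }
        return seeds;
    }

    public static void main(String[] args) {
        long[] first = blockSeeds(42L, 4);
        long[] second = blockSeeds(42L, 4);
        // The same master seed yields the same sequence of block-level seeds.
        System.out.println(Arrays.equals(first, second));
    }
}
```

        Because each block has its own seed, a single block can be regenerated independently of the others, which is what the MR code path relies on.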
      • computeNNZperBlock

        @Deprecated
        public static LongStream computeNNZperBlock​(long nrow,
                                                    long ncol,
                                                    int blen,
                                                    double sparsity)
        Deprecated.
      • createRandomMatrixGenerator

        public static RandomMatrixGenerator createRandomMatrixGenerator​(String pdfStr,
                                                                        int r,
                                                                        int c,
                                                                        int blen,
                                                                        double sp,
                                                                        double min,
                                                                        double max,
                                                                        String distParams)
      • generateRandomMatrix

        public static void generateRandomMatrix​(MatrixBlock out,
                                                RandomMatrixGenerator rgen,
                                                org.apache.commons.math3.random.Well1024a bigrand,
                                                long bSeed)
        Generates a matrix of random numbers. This function is invoked from both CP and MR. In CP, it generates an entire matrix block by block; bigrand is passed so that block-level seeds are generated internally. In MR, it generates a single block for the given block-level seed bSeed. When pdf="uniform", cell values are drawn from a uniform distribution in the range [min,max]. When pdf="normal", cell values are drawn from the standard normal distribution N(0,1), so the range of generated values is always (-Inf,+Inf).
        Parameters:
        out - output matrix block
        rgen - random matrix generator
        bigrand - Well1024a pseudo-random number generator
        bSeed - seed for random generator
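
        As a rough illustration of the two distributions described above, the sketch below fills one block from a block-level seed. It is not the library's code: java.util.Random replaces the actual generator, sparsity handling is omitted, and the names CellValueSketch and fillBlock are hypothetical.

```java
import java.util.Random;

// Sketch only: illustrates the "uniform" and "normal" cases for a single block.
final class CellValueSketch {

    // Fills a block of `cells` values using the given block-level seed.
    static double[] fillBlock(String pdf, int cells, double min, double max, long bSeed) {
        Random rnd = new Random(bSeed); // stand-in generator (assumption)
        double[] out = new double[cells];
        for (int i = 0; i < cells; i++) {
            if ("uniform".equals(pdf)) {
                // nextDouble() returns values in [0,1), so this sketch
                // actually yields values in [min,max).
                out[i] = min + (max - min) * rnd.nextDouble();
            } else if ("normal".equals(pdf)) {
                // Standard normal N(0,1); min and max are ignored here.
                out[i] = rnd.nextGaussian();
            } else {
                throw new IllegalArgumentException("unsupported pdf: " + pdf);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[] block = fillBlock("uniform", 5, -2.0, 3.0, 11L);
        for (double v : block) {
            System.out.println(v);
        }
    }
}
```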
      • generateRandomMatrix

        public static void generateRandomMatrix​(MatrixBlock out,
                                                RandomMatrixGenerator rgen,
                                                org.apache.commons.math3.random.Well1024a bigrand,
                                                long bSeed,
                                                int k)
        Generates a matrix of random numbers. This function is invoked from both CP and MR. In CP, it generates an entire matrix block by block; bigrand is passed so that block-level seeds are generated internally. In MR, it generates a single block for the given block-level seed bSeed. When pdf="uniform", cell values are drawn from a uniform distribution in the range [min,max]. When pdf="normal", cell values are drawn from the standard normal distribution N(0,1), so the range of generated values is always (-Inf,+Inf).
        Parameters:
        out - output matrix block
        rgen - random matrix generator
        bigrand - Well1024a pseudo-random number generator
        bSeed - seed for random generator
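        k - degree of parallelism (presumably the number of threads used for generation; not documented in the source)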
      • generateSequence

        public static void generateSequence​(MatrixBlock out,
                                            double from,
                                            double to,
                                            double incr)
        Generates a sequence according to the given parameters. The generated sequence is always in dense format. The sequence covers the closed interval [from,to]: from is always included, whereas to is included only if (to-from) is an exact multiple of incr. For example, seq(0,1,0.5) generates (0.0 0.5 1.0), whereas seq(0,1,0.6) generates (0.0 0.6) and not (0.0 0.6 1.0).
        Parameters:
        out - output matrix block
        from - lower end point
        to - upper end point
        incr - increment value
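
        The inclusion rule for the end point can be reproduced with a small sketch. It assumes the sequence length is 1 + floor((to-from)/incr), which matches the two examples above but is not confirmed as the library's exact formula; the class SeqSketch and its seq method are hypothetical and return a plain array rather than a MatrixBlock.

```java
import java.util.Arrays;

// Sketch only: mirrors the documented seq(from, to, incr) semantics.
final class SeqSketch {

    // Returns from, from+incr, from+2*incr, ... without overshooting `to`.
    static double[] seq(double from, double to, double incr) {
        if (incr == 0 || (to - from) / incr < 0) {
            throw new IllegalArgumentException("incr must be non-zero and point from 'from' towards 'to'");
        }
        // Assumed length formula: `to` is reached only if (to-from) is an
        // exact multiple of incr.
        int n = 1 + (int) Math.floor((to - from) / incr);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = from + i * incr;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(seq(0, 1, 0.5))); // prints [0.0, 0.5, 1.0]
        System.out.println(Arrays.toString(seq(0, 1, 0.6))); // prints [0.0, 0.6]
    }
}
```

        Computing each element as from + i * incr, rather than by repeated addition, avoids accumulating floating-point rounding error along the sequence.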
      • generateSample

        public static void generateSample​(MatrixBlock out,
                                          long range,
                                          int size,
                                          boolean replace,
                                          long seed)
        Generates a sample of size values drawn from the range [1,range]. The replace flag determines whether sampling is done with or without replacement.
        Parameters:
        out - output matrix block
        range - range upper bound
        size - sample size
        replace - if true, sample with replacement
        seed - seed for random generator
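
        The difference between the two sampling modes can be illustrated with the sketch below. It is not the library's algorithm: sampling without replacement is done here with a partial Fisher-Yates shuffle, which materializes the whole range and therefore only suits small values of range. The class SampleSketch and its sample method are hypothetical, and java.util.SplittableRandom is used as the generator.

```java
import java.util.Arrays;
import java.util.SplittableRandom;

// Sketch only: draws `size` values from [1, range], with or without replacement.
final class SampleSketch {

    static long[] sample(long range, int size, boolean replace, long seed) {
        if (range < 1 || size < 0) {
            throw new IllegalArgumentException("range must be >= 1 and size >= 0");
        }
        if (!replace && size > range) {
            throw new IllegalArgumentException("cannot draw more than range distinct values");
        }
        SplittableRandom rnd = new SplittableRandom(seed);
        long[] out = new long[size];
        if (replace) {
            // Independent draws: duplicates are allowed.
            for (int i = 0; i < size; i++) {
                out[i] = 1 + rnd.nextLong(range);
            }
        } else {
            // Partial Fisher-Yates shuffle over 1..range (assumes range fits in an int).
            int n = Math.toIntExact(range);
            long[] pool = new long[n];
            for (int i = 0; i < n; i++) {
                pool[i] = i + 1L;
            }
            for (int i = 0; i < size; i++) {
                int j = i + rnd.nextInt(n - i);
                long tmp = pool[i];
                pool[i] = pool[j];
                pool[j] = tmp;
                out[i] = pool[i]; // each value can be picked at most once
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sample(10, 5, false, 7L)));
        System.out.println(Arrays.toString(sample(3, 8, true, 7L)));
    }
}
```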