public class RemoteParForSpark
extends Object
This class serves two purposes: (1) isolating Spark imports so that the system can run in
environments where no Spark libraries are available, and (2) following the same
structure as the parfor remote_mr job submission.
NOTE: currently, we still exchange inputs and outputs via HDFS. This covers the general case
(data that already resides in HDFS, in-memory data, and partitioned inputs). It also allows for
pre-aggregation by overwriting partial task results with pre-aggregated results from subsequent
iterations.
TODO: reduceByKey on variable names
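
The overwrite-based pre-aggregation described in the note, and the per-variable reduction named in the TODO, can be illustrated with a minimal plain-Java sketch. It is not taken from this class and uses neither Spark nor HDFS. The names `PartialResultMerge`, `PartialResult`, and `keepLatest` are hypothetical, and the rule "the result from the later iteration wins" is an assumption about what overwriting means here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartialResultMerge {

    /** Hypothetical holder: one task's result for one variable at one iteration. */
    static final class PartialResult {
        final String varName;
        final int iteration;
        final double value;

        PartialResult(String varName, int iteration, double value) {
            this.varName = varName;
            this.iteration = iteration;
            this.value = value;
        }
    }

    /**
     * Groups partial results by variable name and keeps, per name, the result
     * from the latest iteration (assumed to already be pre-aggregated).
     * This mimics a reduce-by-key in which the later value overwrites the earlier one.
     */
    static Map<String, PartialResult> keepLatest(List<PartialResult> results) {
        Map<String, PartialResult> latest = new LinkedHashMap<>();
        for (PartialResult r : results) {
            // On a key collision, keep whichever result has the higher iteration number.
            latest.merge(r.varName, r, (a, b) -> b.iteration >= a.iteration ? b : a);
        }
        return latest;
    }

    public static void main(String[] args) {
        List<PartialResult> results = Arrays.asList(
            new PartialResult("R", 1, 1.0),
            new PartialResult("S", 1, 5.0),
            new PartialResult("R", 2, 3.0)); // overwrites the iteration-1 result for R

        for (PartialResult r : keepLatest(results).values()) {
            System.out.println(r.varName + " = " + r.value + " (iteration " + r.iteration + ")");
        }
    }
}
```

In a Spark job, the same merge function could in principle be passed to `reduceByKey` on a pair RDD keyed by variable name; whether that fits the actual result format of this class is not confirmed by the text above.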