SystemDS Architecture

Algorithms in Apache SystemDS are written in a high-level R-like language called Declarative Machine Learning Language (DML) or a high-level Python-like language called PyDML. SystemDS compiles and optimizes these algorithms into hybrid runtime plans that combine multi-threaded, in-memory operations on a single node (scale-up) with distributed Spark operations on a cluster of nodes (scale-out). The high-level architecture of SystemDS consists of the following components:

Language

DML (with either R- or Python-like syntax) provides linear algebra primitives, a rich set of statistical functions and matrix manipulations, as well as user-defined and external functions, control structures including parfor loops, and recursion. The user provides the DML script through one of the following APIs:
  • Command-line interface (DMLScript)
  • Convenient programmatic interface for Spark users (MLContext)
  • Java Machine Learning Connector API (Connection)
ParserWrapper performs syntactic validation and uses ANTLR to parse the input DML script into a hierarchy of StatementBlock and Statement objects, as defined by the control structures. Another important class of the language component is DMLTranslator, which performs live variable analysis and semantic validation. During that process we also retrieve input data characteristics (i.e., format, number of rows, columns, and non-zero values) as well as infrastructure characteristics, which are used for subsequent optimizations. Finally, we construct directed acyclic graphs (DAGs) of high-level operators (Hop) per statement block.
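To make the idea of live variable analysis concrete, the following is a minimal sketch of the classic backward pass over a straight-line sequence of assignments. It is not the DMLTranslator implementation: the class LiveVariableSketch, the Stmt type, and the liveIn method are hypothetical names introduced for illustration, and control structures such as loops and branches are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy backward live-variable analysis (hypothetical sketch, not SystemDS code). */
class LiveVariableSketch {

    /** A simplified statement: one assigned variable and the variables it reads. */
    static final class Stmt {
        final String target;
        final Set<String> reads;

        Stmt(String target, String... reads) {
            this.target = target;
            this.reads = new TreeSet<>(Arrays.asList(reads));
        }
    }

    /**
     * Returns, for each statement i, the variables that are live immediately before it:
     * liveIn(i) = reads(i) union (liveOut(i) minus target(i)), with liveOut(i) = liveIn(i+1).
     */
    static List<Set<String>> liveIn(List<Stmt> stmts, Set<String> liveAtExit) {
        List<Set<String>> result = new ArrayList<>();
        Set<String> live = new TreeSet<>(liveAtExit);
        for (int i = stmts.size() - 1; i >= 0; i--) {
            Stmt s = stmts.get(i);
            live.remove(s.target);  // the assignment overwrites any earlier value
            live.addAll(s.reads);   // operands must be available before the statement
            result.add(new TreeSet<>(live));
        }
        Collections.reverse(result); // collected back to front, so restore program order
        return result;
    }

    public static void main(String[] args) {
        // Mirrors the DML fragment: Y = X %*% W; Z = Y + b; T = t(X); where only Z is an output.
        List<Stmt> prog = Arrays.asList(
            new Stmt("Y", "X", "W"),
            new Stmt("Z", "Y", "b"),
            new Stmt("T", "X"));
        List<Set<String>> in = liveIn(prog, Collections.singleton("Z"));
        for (int i = 0; i < prog.size(); i++)
            System.out.println("before stmt " + i + ": " + in.get(i));
        // prints:
        // before stmt 0: [W, X, b]
        // before stmt 1: [X, Y, b]
        // before stmt 2: [X, Z]
    }
}
```

In this toy program T is never live after its definition, which is the kind of information a compiler can use to drop unused intermediates or release them early.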

Optimizer

The SystemDS optimizer works over programs of HOP DAGs, where HOPs are operators on matrices or scalars and are categorized according to their access patterns. Examples are matrix multiplications, unary aggregates like rowSums(), binary operations like cell-wise matrix additions, reorganization operations like transpose or sort, and more specific operations. We perform various optimizations on these HOP DAGs, including algebraic simplification rewrites (ProgramRewriter), intra- and inter-procedural analysis (InterProceduralAnalysis) for statistics propagation into functions and over entire programs, and operator ordering of matrix multiplication chains. We compute memory estimates for all HOPs, reflecting the memory requirements of in-memory single-node operations and intermediates. Each HOP DAG is compiled to a DAG of low-level operators (Lop) such as grouping and aggregate, which are backend-specific physical operators. Operator selection picks the best physical operators for a given HOP based on memory estimates, data characteristics, and cluster characteristics. Individual LOPs have corresponding runtime implementations, called instructions, and the optimizer generates an executable runtime program of instructions.
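As an illustration of why the ordering of matrix multiplication chains matters, the sketch below implements the textbook dynamic program that minimizes the number of scalar multiplications for a chain of dense matrices. This is an assumption-laden stand-in rather than the SystemDS rewrite itself, which may use a different cost model (for example, one that accounts for sparsity). The names MatMultChainSketch, Plan, and optimize are hypothetical.

```java
/** Textbook matrix-multiplication chain ordering (hypothetical sketch, not SystemDS code). */
class MatMultChainSketch {

    /** Result holder: minimal scalar-multiplication count and the chosen parenthesization. */
    static final class Plan {
        final long cost;
        final String order;

        Plan(long cost, String order) {
            this.cost = cost;
            this.order = order;
        }
    }

    /**
     * Finds the cheapest evaluation order for names[0] %*% ... %*% names[n-1],
     * where matrix i has shape dims[i] x dims[i+1] (so dims has length n+1).
     */
    static Plan optimize(long[] dims, String[] names) {
        int n = names.length;
        long[][] m = new long[n][n];    // m[i][j]: minimal cost of multiplying matrices i..j
        int[][] split = new int[n][n];  // split[i][j]: best position of the outermost product
        for (int len = 2; len <= n; len++) {
            for (int i = 0; i + len - 1 < n; i++) {
                int j = i + len - 1;
                m[i][j] = Long.MAX_VALUE;
                for (int k = i; k < j; k++) {
                    long c = m[i][k] + m[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1];
                    if (c < m[i][j]) {
                        m[i][j] = c;
                        split[i][j] = k;
                    }
                }
            }
        }
        return new Plan(m[0][n - 1], render(split, names, 0, n - 1));
    }

    /** Rebuilds the parenthesized expression from the recorded split points. */
    private static String render(int[][] split, String[] names, int i, int j) {
        if (i == j)
            return names[i];
        int k = split[i][j];
        return "(" + render(split, names, i, k) + " %*% " + render(split, names, k + 1, j) + ")";
    }

    public static void main(String[] args) {
        // X: 1000x1000, Y: 1000x1000, v: 1000x1
        Plan p = optimize(new long[] {1000, 1000, 1000, 1}, new String[] {"X", "Y", "v"});
        System.out.println(p.order + " cost=" + p.cost);
        // prints: (X %*% (Y %*% v)) cost=2000000
    }
}
```

Evaluating the chain left to right would first compute X %*% Y at a cost of 10^9 scalar multiplications, whereas multiplying by the vector first needs only 2 * 10^6.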

Runtime

We execute the generated runtime program locally in CP (control program), i.e., within a driver process. This driver handles recompilation, executes in-memory single-node CP instructions (CPInstruction, some of which are multi-threaded), maintains an in-memory buffer pool, and launches Spark jobs if the runtime plan contains distributed computations in the form of Spark instructions (SPInstruction). For the Spark backend, we rely on Spark's lazy evaluation and stage construction. CP instructions may also be backed by GPU kernels (GPUInstruction). The multi-level buffer pool caches local matrices in memory, evicts them if necessary, and handles data exchange between the local and distributed runtime backends. The core of SystemDS's runtime instructions is an adaptive matrix block library, which is sparsity-aware and operates on the entire matrix in CP, or on blocks of a matrix in a distributed setting. Further key features include parallel for-loops for task-parallel computations, and dynamic recompilation, which adapts the runtime plan once initially unknown characteristics become available.
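The interplay of memory estimates, backend selection, and dynamic recompilation can be sketched with a toy model. The code below is not taken from SystemDS: the class RecompileSketch, its methods, the dense 8-bytes-per-cell estimate, and the 1 GB budget are all simplifying assumptions chosen for illustration. It shows how an operation with unknown dimensions is conservatively planned for Spark, and how re-running the same decision with sizes observed at runtime can switch it to an in-memory CP operation.

```java
/** Toy model of memory-based backend selection and recompilation (hypothetical, not SystemDS code). */
class RecompileSketch {

    enum ExecType { CP, SPARK }

    /** Worst-case dense size in bytes; a negative dimension means "unknown" and yields infinity. */
    static double denseMemEstimate(long rows, long cols) {
        if (rows < 0 || cols < 0)
            return Double.POSITIVE_INFINITY;
        return 8.0 * rows * cols; // 8 bytes per double, ignoring object headers
    }

    /** Chooses the in-memory backend only if the estimate fits into the given budget. */
    static ExecType select(double memEstimateBytes, double budgetBytes) {
        return memEstimateBytes <= budgetBytes ? ExecType.CP : ExecType.SPARK;
    }

    public static void main(String[] args) {
        double budget = 1e9; // assumed 1 GB budget for a single operation

        // Initial compilation: the row count of a filtered intermediate is unknown (-1).
        ExecType initial = select(denseMemEstimate(-1, 100), budget);

        // Recompilation at runtime: the intermediate has 50,000 rows, i.e. 40 MB dense.
        ExecType recompiled = select(denseMemEstimate(50_000, 100), budget);

        System.out.println(initial + " -> " + recompiled);
        // prints: SPARK -> CP
    }
}
```

Treating unknown sizes as unbounded keeps the initial plan safe, while recompilation recovers the cheaper single-node plan when the data turns out to be small.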

Packages

Package Description
org.apache.sysds.api  
org.apache.sysds.api.jmlc  
org.apache.sysds.api.mlcontext  
org.apache.sysds.common  
org.apache.sysds.conf  
org.apache.sysds.hops  
org.apache.sysds.hops.codegen  
org.apache.sysds.hops.codegen.cplan  
org.apache.sysds.hops.codegen.cplan.cuda  
org.apache.sysds.hops.codegen.cplan.java  
org.apache.sysds.hops.codegen.opt  
org.apache.sysds.hops.codegen.template  
org.apache.sysds.hops.cost  
org.apache.sysds.hops.estim  
org.apache.sysds.hops.fedplanner  
org.apache.sysds.hops.ipa  
org.apache.sysds.hops.recompile  
org.apache.sysds.hops.rewrite  
org.apache.sysds.lops  
org.apache.sysds.lops.compile  
org.apache.sysds.lops.compile.linearization  
org.apache.sysds.parser  
org.apache.sysds.parser.dml  
org.apache.sysds.runtime  
org.apache.sysds.runtime.codegen  
org.apache.sysds.runtime.compress  
org.apache.sysds.runtime.compress.bitmap  
org.apache.sysds.runtime.compress.cocode  
org.apache.sysds.runtime.compress.colgroup  
org.apache.sysds.runtime.compress.colgroup.dictionary  
org.apache.sysds.runtime.compress.colgroup.insertionsort  
org.apache.sysds.runtime.compress.colgroup.mapping  
org.apache.sysds.runtime.compress.colgroup.offset  
org.apache.sysds.runtime.compress.cost  
org.apache.sysds.runtime.compress.estim  
org.apache.sysds.runtime.compress.estim.encoding  
org.apache.sysds.runtime.compress.estim.sample  
org.apache.sysds.runtime.compress.lib  
org.apache.sysds.runtime.compress.readers  
org.apache.sysds.runtime.compress.utils  
org.apache.sysds.runtime.compress.workload  
org.apache.sysds.runtime.controlprogram  
org.apache.sysds.runtime.controlprogram.caching  
org.apache.sysds.runtime.controlprogram.context  
org.apache.sysds.runtime.controlprogram.federated  
org.apache.sysds.runtime.controlprogram.federated.monitoring  
org.apache.sysds.runtime.controlprogram.federated.monitoring.controllers  
org.apache.sysds.runtime.controlprogram.federated.monitoring.models  
org.apache.sysds.runtime.controlprogram.federated.monitoring.repositories  
org.apache.sysds.runtime.controlprogram.federated.monitoring.services  
org.apache.sysds.runtime.controlprogram.paramserv  
org.apache.sysds.runtime.controlprogram.paramserv.dp  
org.apache.sysds.runtime.controlprogram.paramserv.homomorphicEncryption  
org.apache.sysds.runtime.controlprogram.paramserv.rpc  
org.apache.sysds.runtime.controlprogram.parfor  
org.apache.sysds.runtime.controlprogram.parfor.opt  
org.apache.sysds.runtime.controlprogram.parfor.stat  
org.apache.sysds.runtime.controlprogram.parfor.util  
org.apache.sysds.runtime.data  
org.apache.sysds.runtime.functionobjects  
org.apache.sysds.runtime.instructions  
org.apache.sysds.runtime.instructions.cp  
org.apache.sysds.runtime.instructions.cpfile  
org.apache.sysds.runtime.instructions.fed  
org.apache.sysds.runtime.instructions.gpu  
org.apache.sysds.runtime.instructions.gpu.context  
org.apache.sysds.runtime.instructions.spark  
org.apache.sysds.runtime.instructions.spark.data  
org.apache.sysds.runtime.instructions.spark.functions  
org.apache.sysds.runtime.instructions.spark.utils  
org.apache.sysds.runtime.io  
org.apache.sysds.runtime.io.hdf5  
org.apache.sysds.runtime.io.hdf5.message  
org.apache.sysds.runtime.iogen  
org.apache.sysds.runtime.lineage  
org.apache.sysds.runtime.matrix.data  
org.apache.sysds.runtime.matrix.data.sketch  
org.apache.sysds.runtime.matrix.data.sketch.countdistinctapprox  
org.apache.sysds.runtime.matrix.operators  
org.apache.sysds.runtime.meta  
org.apache.sysds.runtime.privacy  
org.apache.sysds.runtime.privacy.finegrained  
org.apache.sysds.runtime.privacy.propagation  
org.apache.sysds.runtime.transform  
org.apache.sysds.runtime.transform.decode  
org.apache.sysds.runtime.transform.encode  
org.apache.sysds.runtime.transform.meta  
org.apache.sysds.runtime.transform.tokenize  
org.apache.sysds.runtime.util  
org.apache.sysds.utils  
org.apache.sysds.utils.stats