Apache SystemML 1.2.0 Release Notes

The Apache SystemML 1.2.0 release was approved on August 24, 2018. It includes the enhancements, features, and additions listed below.

New Builtin Functions/Operations/Scripts/Features

  • Factorization Machines
  • Support for functions with default parameters
  • exists() for checking whether a variable exists
  • Triangular matrix functions: lower.tri() and upper.tri()
  • New n-ary min/max operations
  • as.matrix() over list of scalars
  • Function calls with named function arguments
  • Convolution operations (forward/backward)
  • Maxpooling operations (forward)
  • Support for bias_add and bias_mult
  • Global constants
  • Support for rowProd/colProd
  • DML eval function
  • New data type list for lists and structs
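
As a rough illustration of what the triangular and product builtins above compute, here is a minimal plain-Python sketch. It is not SystemML code and does not use the SystemML API: the helper names (lower_tri, upper_tri, row_prod, col_prod) and the include_diag parameter are hypothetical, chosen only to mirror the usual mathematical meaning of lower.tri(), upper.tri(), rowProd, and colProd. The actual builtins may accept options that are not modeled here.

```python
from math import prod  # Python 3.8+

def lower_tri(m, include_diag=True):
    """Keep entries below (and optionally on) the main diagonal; zero the rest."""
    return [[v if (j < i or (include_diag and j == i)) else 0
             for j, v in enumerate(row)]
            for i, row in enumerate(m)]

def upper_tri(m, include_diag=True):
    """Keep entries above (and optionally on) the main diagonal; zero the rest."""
    return [[v if (j > i or (include_diag and j == i)) else 0
             for j, v in enumerate(row)]
            for i, row in enumerate(m)]

def row_prod(m):
    """Product of the entries in each row (hypothetical stand-in for rowProd)."""
    return [prod(row) for row in m]

def col_prod(m):
    """Product of the entries in each column (hypothetical stand-in for colProd)."""
    return [prod(col) for col in zip(*m)]

X = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

L = lower_tri(X)                      # diagonal included
U = upper_tri(X, include_diag=False)  # strictly upper part
row_products = row_prod(X)
col_products = col_prod(X)
```

The sketch uses nested lists to stay dependency-free; it says nothing about how SystemML represents or executes these operations internally.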

API

  • JMLC API extension for passing multiple DML scripts

Compiler & Runtime

  • Code motion framework
  • Global subexpression elimination
  • Optional rewrite for hoisting loop-invariant operations
  • Improved IPA constant propagation and replacement
  • ParFor Data Partitioning Rewrite on Hops instead of Statements
  • New rewrites for chains of comparisons
  • Extended rewrite framework for codegen plans
  • Improved parfor optimizer rewrite for in-place-update
  • Reworked function block recompilation
  • Support rowMeans in codegen row templates
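
To give a feel for what subexpression elimination means in general, the following is a minimal Python sketch over expression trees encoded as nested tuples. It is an assumption-laden toy, not SystemML's implementation: the tuple encoding, the function name eliminate_common_subexpressions, and the dictionary-based deduplication are all hypothetical, and the real compiler operates on its own operator DAGs.

```python
def eliminate_common_subexpressions(roots):
    """Rebuild expression trees so structurally equal subtrees share one node.

    Expressions are tuples of the form (op, arg1, arg2, ...); leaves are
    plain strings or numbers. Returns the rebuilt roots and the table of
    unique operator nodes.
    """
    canonical = {}  # structural key -> the single shared node object

    def visit(node):
        if not isinstance(node, tuple):   # leaf: variable name or literal
            return node
        op, *args = node
        rebuilt = (op, *[visit(a) for a in args])
        # Reuse the first node seen with this structure, if any.
        return canonical.setdefault(rebuilt, rebuilt)

    return [visit(r) for r in roots], canonical

# Two root expressions that both contain X * Y (three occurrences in total).
roots = [
    ("+", ("*", "X", "Y"), ("*", "X", "Y")),
    ("sum", ("*", "X", "Y")),
]
new_roots, unique_nodes = eliminate_common_subexpressions(roots)
```

After the pass, all three occurrences of X * Y refer to the same node object, so an evaluator that caches results per node would compute the product once across both roots.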

Performance Improvements

  • Improved multi-threading of unary aggregates
  • Reuse of fair scheduler pools in local parfor workers
  • Fixed performance issues in Spark ctable(X,Y) with a large number of distinct values
  • Improved performance of instruction generation
  • Improved performance of sample operations
  • Improved performance of ultra-sparse block operations
  • New native tsmm operator and its integration
  • Multi-threaded unary operations (e.g., exp, log, sigmoid)

Bug Fixes

  • Memory leak in the buffer pool due to missing variable cleanup
  • Missing buffer pool serialization of compressed matrices
  • Compilation failure on inferring the size of reshapes with zero rows/columns
  • Incorrect result for min/max over matrices with NaNs
  • Missing support for external functions with variable number of outputs
  • Reblock ultra-sparse matrix fails with index out of bounds
  • Performance issue in CSE on DAGs with many root nodes (e.g., resnet200)
  • Non-fused bias_add builtin creates incorrect results over sparse inputs
  • Inconsistent namespace naming depending on OS
  • Codegen failing on three-way multi-aggregate
  • Codegen optimizer failing for MLogreg special cases
  • Failing matrix market to binary reblock with zero rows/columns
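
The min/max fix over matrices with NaNs touches a pitfall common to any comparison-based reduction: every comparison against NaN is false, so a naive scan can give order-dependent results. The plain-Python sketch below illustrates the pitfall and one possible remedy. It assumes NaN-propagating semantics (any NaN in the input yields NaN); these notes do not state which semantics SystemML adopted, and the function name nan_propagating_min is hypothetical.

```python
import math

nan = float("nan")

# Python's built-in min() relies on '<' comparisons, which are always
# False when NaN is involved, so the result depends on element order.
naive_nan_first = min([nan, 1.0])   # NaN is kept because 1.0 < nan is False
naive_nan_last = min([1.0, nan])    # 1.0 is kept because nan < 1.0 is False

def nan_propagating_min(values):
    """Minimum that returns NaN as soon as any input is NaN (assumed semantics)."""
    result = math.inf
    for v in values:
        if math.isnan(v):
            return math.nan
        if v < result:
            result = v
    return result

with_nan = nan_propagating_min([1.0, nan, 0.5])
without_nan = nan_propagating_min([3.0, 1.0, 2.0])
```

The explicit isnan check makes the result independent of where the NaN appears; an analogous check applies to max.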

Deprecated/Removed/Cleanup

  • Opt level 4
  • File-based removeEmpty()
  • Cleanup of exception handling in APIs, compiler, and runtime

Experimental

  • Parameter server: local and distributed backends
  • Sparsity estimators

JIRA release notes