SystemDSContext
All operations using SystemDS need a running Java instance.
The connection to it is managed by a SystemDSContext object.
A SystemDSContext object can be created using
from systemds.context import SystemDSContext
sds = SystemDSContext()
When the calculations are finished, the context has to be closed again:
sds.close()
Because closing the context by hand is easy to forget, SystemDSContext
implements the Python context management protocol, which supports the following syntax
with SystemDSContext() as sds:
    # do something with sds, which is a SystemDSContext
    pass
This will automatically close the SystemDSContext
once the with-block is left.
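The context management protocol comes down to two methods, `__enter__` and `__exit__`. The sketch below uses a hypothetical stand-in class, `DemoContext`, not the real SystemDSContext (whose internals are not shown here), to illustrate why the with-form closes the context even when the block raises an exception.

```python
class DemoContext:
    """Hypothetical stand-in for a context that owns an external resource."""

    def __init__(self):
        self.closed = False

    def close(self):
        # A real context would shut down its connection here.
        self.closed = True

    def __enter__(self):
        # The value returned here is bound to the name after `as`.
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        # Returning False lets any exception from the block propagate.
        return False


with DemoContext() as ctx:
    still_open_inside = not ctx.closed

# The context is closed after a normal exit from the block.
normal_exit_closed = ctx.closed

# It is also closed when the block raises.
failing = DemoContext()
try:
    with failing:
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass
failure_exit_closed = failing.closed
```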
Note
Creating a context is expensive, because it may have to start a subprocess that runs a JVM. Try to create a context only once per program, or always keep at least one context open.
- class systemds.context.SystemDSContext(port: int = -1)
A context with a connection to a Java instance with which SystemDS operations are executed. The Java process is started and runs on a random TCP port for instruction parsing.
This class is used as the starting point for all SystemDS execution. It gives the ability to create all the different objects and add them to the execution.
- __init__(port: int = -1)
Starts a new instance of SystemDSContext, which handles the connection to a JVM SystemDS instance. Each new SystemDSContext instance starts a separate JVM.
Standard output and standard error from the JVM are also handled in this class. They fill queues that can be read to get the statements printed by the JVM.
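The queue-filling idea described above can be sketched with the standard library alone. This is an illustration of the general pattern, not the SystemDSContext implementation: a short-lived Python child process stands in for the JVM, and the helper name `enqueue_lines` is hypothetical.

```python
import queue
import subprocess
import sys
import threading


def enqueue_lines(stream, sink):
    """Copy every line from a process stream into a queue (hypothetical helper)."""
    for line in iter(stream.readline, ""):
        sink.put(line.rstrip("\n"))
    stream.close()


# A small Python child process stands in for the JVM here.
child = subprocess.Popen(
    [sys.executable, "-c", "print('first'); print('second')"],
    stdout=subprocess.PIPE,
    text=True,
)

stdout_queue = queue.Queue()
reader = threading.Thread(
    target=enqueue_lines, args=(child.stdout, stdout_queue), daemon=True
)
reader.start()

# Wait for the child to finish and for the reader to consume all output.
child.wait()
reader.join()

# Drain the queue into a list of printed lines.
captured = []
while not stdout_queue.empty():
    captured.append(stdout_queue.get())
```

Reading in a background thread keeps the parent from blocking on the pipe while the child is still running.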
- array(*args: Sequence[Union[DAGNode, str, int, float, bool]]) systemds.operator.nodes.list.List
Create a List object containing the given nodes.
Note that only a sequence is allowed, or a dictionary, not both at the same time.
- Parameters
args – a sequence whose elements are inserted into the list
kwargs – a dictionary whose entries are returned as a dictionary (internally handled as a list)
- Returns
A List
- close()
Close the connection to the java process and do necessary cleanup.
- dict(**kwargs: Dict[str, Union[DAGNode, str, int, float, bool]]) systemds.operator.nodes.list.List
Create a List object containing the given nodes.
Note that only a sequence is allowed, or a dictionary, not both at the same time.
- Parameters
args – a sequence whose elements are inserted into the list
kwargs – a dictionary whose entries are returned as a dictionary (internally handled as a list)
- Returns
A List
- exception_and_close(exception, trace_back_limit: Optional[int] = None)
Prints the exception together with the captured stdout and stderr, and closes the context correctly.
- Parameters
exception – the exception thrown
- federated(addresses: Iterable[str], ranges: Iterable[Tuple[Iterable[int], Iterable[int]]], *args, **kwargs: Dict[str, Union[DAGNode, str, int, float, bool]]) systemds.operator.nodes.matrix.Matrix
Create federated matrix object.
- Parameters
sds_context – the SystemDS context
addresses – addresses of the federated workers
ranges – for each federated worker a pair of begin and end index of their held matrix
args – unnamed params
kwargs – named params
- Returns
The Matrix containing the Federated data.
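To make the shape of the `ranges` argument concrete, the sketch below builds one (begin, end) pair per worker for a matrix split into contiguous row blocks. The helper `row_partition_ranges`, the worker addresses, and the file paths are hypothetical. The index order `[row, col]` and the exclusive end index are assumptions, not confirmed by the text above.

```python
def row_partition_ranges(total_rows, total_cols, workers):
    """Split a matrix into contiguous row blocks, one per worker.

    Each entry is (begin, end) with begin = [row, col] and end = [row, col].
    Assumption: the end index is exclusive.
    """
    if workers <= 0:
        raise ValueError("workers must be positive")
    base, extra = divmod(total_rows, workers)
    ranges = []
    row = 0
    for worker in range(workers):
        # The first `extra` workers hold one additional row each.
        height = base + (1 if worker < extra else 0)
        ranges.append(([row, 0], [row + height, total_cols]))
        row += height
    return ranges


ranges = row_partition_ranges(total_rows=10, total_cols=4, workers=3)

# Hypothetical worker addresses, one per range.
addresses = [f"localhost:{8001 + i}/data/part{i}.csv" for i in range(3)]
```

Under these assumptions, the two lists have matching lengths and could be passed as `sds.federated(addresses, ranges)`.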
- from_numpy(mat: numpy.array, *args: Sequence[Union[DAGNode, str, int, float, bool]], **kwargs: Dict[str, Union[DAGNode, str, int, float, bool]]) systemds.operator.nodes.matrix.Matrix
Generate a DAGNode representing a matrix with data given by a numpy array, which will be sent to SystemDS when needed.
- Parameters
mat – the numpy array
args – unnamed parameters
kwargs – named parameters
- Returns
A Matrix
- from_pandas(df: pandas.core.frame.DataFrame, *args: Sequence[Union[DAGNode, str, int, float, bool]], **kwargs: Dict[str, Union[DAGNode, str, int, float, bool]]) systemds.operator.nodes.frame.Frame
Generate a DAGNode representing a frame with data given by a pandas dataframe, which will be sent to SystemDS when needed.
- Parameters
df – the pandas dataframe
args – unnamed parameters
kwargs – named parameters
- Returns
A Frame
- full(shape: Tuple[int, int], value: Union[float, int]) systemds.operator.nodes.matrix.Matrix
Generates a matrix completely filled with a value
- Parameters
sds_context – SystemDS context
shape – shape (rows and cols) of the matrix TODO tensor
value – the value to fill all cells with
- Returns
the OperationNode representing this operation
- get_stderr(lines: int = -1)
Getter for the stderr of the Java subprocess. The output is taken from the stderr queue and returned in a new list.
- Parameters
lines – the number of lines to try to read from the stderr queue; the default, -1, reads all lines currently in the queue
- get_stdout(lines: int = -1)
Getter for the stdout of the Java subprocess. The output is taken from the stdout queue and returned in a new list.
- Parameters
lines – the number of lines to try to read from the stdout queue; the default, -1, reads all lines currently in the queue
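The meaning of the `lines` argument for both getters can be illustrated with a plain `queue.Queue`. This is a sketch of the described behaviour under the assumption that reading never blocks. The function `read_lines` is hypothetical and is not the library's implementation.

```python
import queue


def read_lines(source, lines=-1):
    """Return up to `lines` entries from a queue in a new list.

    A negative `lines` means "everything that is queued right now".
    """
    result = []
    while lines < 0 or len(result) < lines:
        try:
            result.append(source.get_nowait())
        except queue.Empty:
            # Fewer entries were available than requested.
            break
    return result


output_queue = queue.Queue()
for text in ("a", "b", "c"):
    output_queue.put(text)

first_two = read_lines(output_queue, 2)  # takes the two oldest entries
rest = read_lines(output_queue)          # default -1 takes what is left
empty = read_lines(output_queue)         # nothing queued any more
```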
- list(*args: Sequence[Union[DAGNode, str, int, float, bool]], **kwargs: Dict[str, Union[DAGNode, str, int, float, bool]]) systemds.operator.nodes.list.List
Create a List object containing the given nodes.
Note that only a sequence is allowed, or a dictionary, not both at the same time.
- Parameters
args – a sequence whose elements are inserted into the list
kwargs – a dictionary whose entries are returned as a dictionary (internally handled as a list)
- Returns
A List
- rand(rows: int, cols: int, min: Optional[Union[float, int]] = None, max: Optional[Union[float, int]] = None, pdf: str = 'uniform', sparsity: Optional[Union[float, int]] = None, seed: Optional[Union[float, int]] = None, lamb: Union[float, int] = 1) systemds.operator.nodes.matrix.Matrix
Generates a matrix filled with random values
- Parameters
sds_context – SystemDS context
rows – number of rows
cols – number of cols
min – min value for cells
max – max value for cells
pdf – probability distribution function: "uniform", "normal", or "poisson" distribution
sparsity – fraction of non-zero cells
seed – random seed
lamb – lambda value for the "poisson" distribution
- Returns
- read(path: os.PathLike, **kwargs: Dict[str, Union[DAGNode, str, int, float, bool]]) systemds.operator.operation_node.OperationNode
Read a file from disk. Supported types include CSV, Matrix Market (coordinate), Text (i,j,v), SystemDS Binary, etc. See http://apache.github.io/systemds/site/dml-language-reference#readwrite-built-in-functions for more details.
- Returns
An OperationNode containing the read data. The returned node can be of type Matrix, Frame, or Scalar.
- scalar(v: Dict[str, Union[DAGNode, str, int, float, bool]]) systemds.operator.nodes.scalar.Scalar
Construct a scalar value. It can contain str, float, double, integers and booleans.
- Returns
A Scalar containing the given value.
- seq(start: Union[float, int], stop: Optional[Union[float, int]] = None, step: Union[float, int] = 1) systemds.operator.nodes.matrix.Matrix
Create a single column vector with values from start to stop and an increment of step. If no stop is defined and only one parameter is given, then start will be 0 and the parameter will be interpreted as stop.
- Parameters
sds_context – SystemDS context
start – the starting value
stop – the maximum value
step – the step size
- Returns
the OperationNode representing this operation
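The argument handling described for seq can be mimicked in plain Python. The sketch below is a hypothetical stand-in named `seq_values` that returns a list instead of a SystemDS matrix. It assumes that stop is inclusive, which the phrase "the maximum value" suggests but does not state outright.

```python
import math


def seq_values(start, stop=None, step=1):
    """Plain-Python illustration of the start/stop/step interpretation."""
    if stop is None:
        # A single argument is interpreted as stop; start becomes 0.
        start, stop = 0, start
    if step == 0:
        raise ValueError("step must be non-zero")
    # Assumption: stop itself is included when the steps land on it.
    count = math.floor((stop - start) / step) + 1
    # This sketch yields an empty list when step points away from stop;
    # the library may behave differently in that case.
    return [start + i * step for i in range(max(count, 0))]


single_argument = seq_values(3)
fractional_step = seq_values(1, 2, 0.5)
descending = seq_values(5, 1, -2)
```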
- source(path: str, name: str, print_imported_methods: bool = False) systemds.operator.nodes.source.Source
Import methods from a given dml file.
The import is done through the DML command source and adds all methods defined in the script to the Source object returned in Python. This makes it possible to call the methods directly on the returned object.
In SystemDS, a method called func_01 can then be imported and called using
`res = self.sds.source("PATH_TO_FILE", "UNIQUE_NAME").func_01().compute(verbose = True)`
- Parameters
path – The absolute or relative path to the file to import
name – The name to give the imported file in the script, this name must be unique
print_imported_methods – boolean specifying if the imported methods should be printed.