Package org.apache.sysds.api.mlcontext
Class Matrix
- java.lang.Object
-
- org.apache.sysds.api.mlcontext.Matrix
-
public class Matrix extends Object
Matrix encapsulates a SystemDS matrix. It allows for easy conversion to various other formats, such as RDDs, JavaRDDs, DataFrames, and double[][]s. After script execution, it offers a convenient format for obtaining SystemDS matrix data in Scala tuples.
-
-
Constructor Summary
Constructors
Constructor and Description

Matrix(org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> binaryBlocks, MatrixMetadata matrixMetadata)
Create a Matrix, specifying the SystemDS binary-block matrix and its metadata.

Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
Convert a Spark DataFrame to a SystemDS binary-block representation.

Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, long numRows, long numCols)
Convert a Spark DataFrame to a SystemDS binary-block representation, specifying the number of rows and columns.

Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, MatrixMetadata matrixMetadata)
Convert a Spark DataFrame to a SystemDS binary-block representation.

Matrix(MatrixObject matrixObject, SparkExecutionContext sparkExecutionContext)
-
Method Summary
Modifier and Type, Method, and Description

MatrixMetadata getMatrixMetadata()
Obtain the matrix metadata.

boolean hasBinaryBlocks()
Whether or not this matrix contains data as binary blocks.

boolean hasMatrixObject()
Whether or not this matrix contains data as a MatrixObject.

double[][] to2DDoubleArray()
Obtain the matrix as a two-dimensional double array.

org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> toBinaryBlocks()
Obtain the matrix as a JavaPairRDD<MatrixIndexes, MatrixBlock>.

org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDF()
Obtain the matrix as a DataFrame of doubles with an ID column.

org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFDoubleNoIDColumn()
Obtain the matrix as a DataFrame of doubles with no ID column.

org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFDoubleWithIDColumn()
Obtain the matrix as a DataFrame of doubles with an ID column.

org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFVectorNoIDColumn()
Obtain the matrix as a DataFrame of vectors with no ID column.

org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFVectorWithIDColumn()
Obtain the matrix as a DataFrame of vectors with an ID column.

org.apache.spark.api.java.JavaRDD<String> toJavaRDDStringCSV()
Obtain the matrix as a JavaRDD<String> in CSV format.

org.apache.spark.api.java.JavaRDD<String> toJavaRDDStringIJV()
Obtain the matrix as a JavaRDD<String> in IJV format.

MatrixBlock toMatrixBlock()
Obtain the matrix as a MatrixBlock.

MatrixObject toMatrixObject()
Obtain the matrix as a SystemDS MatrixObject.

org.apache.spark.rdd.RDD<String> toRDDStringCSV()
Obtain the matrix as an RDD<String> in CSV format.

org.apache.spark.rdd.RDD<String> toRDDStringIJV()
Obtain the matrix as an RDD<String> in IJV format.

String toString()
If MatrixObject is available, output MatrixObject.toString().
-
-
-
Constructor Detail
-
Matrix
public Matrix(MatrixObject matrixObject, SparkExecutionContext sparkExecutionContext)
-
Matrix
public Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, MatrixMetadata matrixMetadata)
Convert a Spark DataFrame to a SystemDS binary-block representation.
Parameters:
dataFrame - the Spark DataFrame
matrixMetadata - matrix metadata, such as number of rows and columns
-
Matrix
public Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, long numRows, long numCols)
Convert a Spark DataFrame to a SystemDS binary-block representation, specifying the number of rows and columns.
Parameters:
dataFrame - the Spark DataFrame
numRows - the number of rows
numCols - the number of columns
-
Matrix
public Matrix(org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> binaryBlocks, MatrixMetadata matrixMetadata)
Create a Matrix, specifying the SystemDS binary-block matrix and its metadata.
Parameters:
binaryBlocks - the JavaPairRDD<MatrixIndexes, MatrixBlock> matrix
matrixMetadata - matrix metadata, such as number of rows and columns
-
Matrix
public Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
Convert a Spark DataFrame to a SystemDS binary-block representation.
Parameters:
dataFrame - the Spark DataFrame
-
-
Method Detail
-
toMatrixObject
public MatrixObject toMatrixObject()
Obtain the matrix as a SystemDS MatrixObject.
Returns:
the matrix as a SystemDS MatrixObject
-
to2DDoubleArray
public double[][] to2DDoubleArray()
Obtain the matrix as a two-dimensional double array.
Returns:
the matrix as a two-dimensional double array
-
toJavaRDDStringIJV
public org.apache.spark.api.java.JavaRDD<String> toJavaRDDStringIJV()
Obtain the matrix as a JavaRDD<String> in IJV format.
Returns:
the matrix as a JavaRDD<String> in IJV format
-
toJavaRDDStringCSV
public org.apache.spark.api.java.JavaRDD<String> toJavaRDDStringCSV()
Obtain the matrix as a JavaRDD<String> in CSV format.
Returns:
the matrix as a JavaRDD<String> in CSV format
-
toRDDStringCSV
public org.apache.spark.rdd.RDD<String> toRDDStringCSV()
Obtain the matrix as an RDD<String> in CSV format.
Returns:
the matrix as an RDD<String> in CSV format
-
toRDDStringIJV
public org.apache.spark.rdd.RDD<String> toRDDStringIJV()
Obtain the matrix as an RDD<String> in IJV format.
Returns:
the matrix as an RDD<String> in IJV format
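
The CSV and IJV methods above differ only in how each matrix cell is written as text. The following is a minimal plain-Java sketch of the two layouts, not SystemDS code: the class `MatrixTextFormats` and its methods are hypothetical, and it assumes that CSV emits one comma-separated line per row while IJV emits one space-separated "row column value" line per non-zero cell with 1-based indexes. The exact number formatting that SystemDS produces may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of SystemDS.
public class MatrixTextFormats {

    // CSV layout: one line per matrix row, cells separated by commas.
    static List<String> toCsvLines(double[][] m) {
        List<String> lines = new ArrayList<>();
        for (double[] row : m) {
            StringBuilder sb = new StringBuilder();
            for (int j = 0; j < row.length; j++) {
                if (j > 0) {
                    sb.append(',');
                }
                sb.append(row[j]);
            }
            lines.add(sb.toString());
        }
        return lines;
    }

    // IJV layout (assumed): one "i j v" line per non-zero cell,
    // where i and j are 1-based row and column indexes.
    static List<String> toIjvLines(double[][] m) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                if (m[i][j] != 0.0) {
                    lines.add((i + 1) + " " + (j + 1) + " " + m[i][j]);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        double[][] m = {{1.0, 0.0}, {0.0, 4.5}};
        toCsvLines(m).forEach(System.out::println);
        toIjvLines(m).forEach(System.out::println);
    }
}
```

CSV keeps every cell, so it suits dense matrices. The IJV layout sketched here skips zeros, which keeps the output small for sparse matrices.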
-
toDF
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDF()
Obtain the matrix as a DataFrame of doubles with an ID column.
Returns:
the matrix as a DataFrame of doubles with an ID column
-
toDFDoubleWithIDColumn
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFDoubleWithIDColumn()
Obtain the matrix as a DataFrame of doubles with an ID column.
Returns:
the matrix as a DataFrame of doubles with an ID column
-
toDFDoubleNoIDColumn
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFDoubleNoIDColumn()
Obtain the matrix as a DataFrame of doubles with no ID column.
Returns:
the matrix as a DataFrame of doubles with no ID column
-
toDFVectorWithIDColumn
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFVectorWithIDColumn()
Obtain the matrix as a DataFrame of vectors with an ID column.
Returns:
the matrix as a DataFrame of vectors with an ID column
-
toDFVectorNoIDColumn
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFVectorNoIDColumn()
Obtain the matrix as a DataFrame of vectors with no ID column.
Returns:
the matrix as a DataFrame of vectors with no ID column
-
toBinaryBlocks
public org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> toBinaryBlocks()
Obtain the matrix as a JavaPairRDD<MatrixIndexes, MatrixBlock>.
Returns:
the matrix as a JavaPairRDD<MatrixIndexes, MatrixBlock>
-
toMatrixBlock
public MatrixBlock toMatrixBlock()
Obtain the matrix as a MatrixBlock.
Returns:
the matrix as a MatrixBlock
-
getMatrixMetadata
public MatrixMetadata getMatrixMetadata()
Obtain the matrix metadata.
Returns:
the matrix metadata
-
toString
public String toString()
If MatrixObject is available, output MatrixObject.toString(). If MatrixObject is not available but MatrixMetadata is available, output MatrixMetadata.toString(). Otherwise output Object.toString().
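
The fallback order described for toString() can be sketched in plain Java as follows. This is an illustration of the documented precedence only, not the SystemDS source: the class `ToStringFallback`, the method `describe`, and its parameters are hypothetical stand-ins, and a `null` argument is assumed to mean "not available".

```java
// Hypothetical illustration of the documented precedence; not SystemDS code.
public class ToStringFallback {

    // Returns the first available description, in this order:
    // matrix object, then metadata, then the default object string.
    static String describe(Object matrixObject, Object matrixMetadata, String objectDefault) {
        if (matrixObject != null) {
            return matrixObject.toString();
        }
        if (matrixMetadata != null) {
            return matrixMetadata.toString();
        }
        return objectDefault;
    }

    public static void main(String[] args) {
        System.out.println(describe("matrix object", "metadata", "default"));
        System.out.println(describe(null, "metadata", "default"));
        System.out.println(describe(null, null, "default"));
    }
}
```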
-
hasBinaryBlocks
public boolean hasBinaryBlocks()
Whether or not this matrix contains data as binary blocks.
Returns:
true if data as binary blocks are present, false otherwise.
-
hasMatrixObject
public boolean hasMatrixObject()
Whether or not this matrix contains data as a MatrixObject.
Returns:
true if data as a MatrixObject is present, false otherwise.
-
-