Package org.apache.sysds.runtime.io
Class ReaderTextCSVParallel
- java.lang.Object
  - org.apache.sysds.runtime.io.MatrixReader
    - org.apache.sysds.runtime.io.ReaderTextCSVParallel
public class ReaderTextCSVParallel extends MatrixReader
Parallel version of ReaderTextCSV. To summarize, we do two passes: the first computes the row offsets and the second performs the actual read. We accordingly create count tasks and read tasks and use fixed-size thread pools to execute them. If the target matrix is dense, the inserts are done lock-free. In contrast to the parallel textcell read, we also do lock-free inserts if the matrix is sparse, because splits contain row-partitioned lines and hence there is no danger of lost updates. Note that no sorting of sparse rows is required either, because the data arrives in sorted order per row.
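The two-pass scheme described above can be illustrated with a minimal, self-contained sketch. It is not the SystemDS implementation: the class `TwoPassCsvSketch`, its `read` method, and the in-memory "splits" (lists of whole CSV lines) are hypothetical stand-ins, and the target is a plain `double[][]` rather than a `MatrixBlock`. The point it shows is that once each split knows its row offset, every read task writes to a disjoint row range, so no locking is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; none of these names exist in SystemDS.
public class TwoPassCsvSketch {

    // Each inner list stands in for one file split that contains whole lines.
    public static double[][] read(List<List<String>> splits, int ncol, int numThreads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            // Pass 1: count tasks determine the number of rows in each split.
            List<Callable<Integer>> countTasks = new ArrayList<>();
            for (List<String> split : splits) {
                countTasks.add(() -> split.size());
            }
            List<Future<Integer>> counts = pool.invokeAll(countTasks);

            // A prefix sum over the counts yields the starting row of each split.
            int[] offsets = new int[splits.size()];
            int nrow = 0;
            for (int i = 0; i < counts.size(); i++) {
                offsets[i] = nrow;
                nrow += counts.get(i).get();
            }

            double[][] dest = new double[nrow][ncol];

            // Pass 2: read tasks parse their lines and write directly into dest.
            // Row ranges of different splits are disjoint, so no locks are used.
            List<Callable<Void>> readTasks = new ArrayList<>();
            for (int i = 0; i < splits.size(); i++) {
                final List<String> split = splits.get(i);
                final int offset = offsets[i];
                readTasks.add(() -> {
                    int row = offset;
                    for (String line : split) {
                        String[] cells = line.split(",");
                        for (int c = 0; c < ncol; c++) {
                            dest[row][c] = Double.parseDouble(cells[c].trim());
                        }
                        row++;
                    }
                    return null;
                });
            }
            for (Future<Void> f : pool.invokeAll(readTasks)) {
                f.get(); // rethrows any parsing failure from a read task
            }
            return dest;
        } finally {
            pool.shutdown();
        }
    }
}
```

Because each line is parsed left to right, the values of a row are produced in column order, which is also why a sparse target would not need a per-row sort in a scheme like this.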
Constructor Summary
Constructor    Description
ReaderTextCSVParallel(FileFormatPropertiesCSV props)
Method Summary
Modifier and Type    Method    Description
MatrixBlock    readMatrixFromHDFS(String fname, long rlen, long clen, int blen, long estnnz)
MatrixBlock    readMatrixFromInputStream(InputStream is, long rlen, long clen, int blen, long estnnz)
Constructor Detail
ReaderTextCSVParallel
public ReaderTextCSVParallel(FileFormatPropertiesCSV props)
Method Detail
readMatrixFromHDFS
public MatrixBlock readMatrixFromHDFS(String fname, long rlen, long clen, int blen, long estnnz) throws IOException, DMLRuntimeException
- Specified by:
readMatrixFromHDFS
in class MatrixReader
- Throws:
IOException
DMLRuntimeException
readMatrixFromInputStream
public MatrixBlock readMatrixFromInputStream(InputStream is, long rlen, long clen, int blen, long estnnz) throws IOException, DMLRuntimeException
- Specified by:
readMatrixFromInputStream
in class MatrixReader
- Throws:
IOException
DMLRuntimeException