Class PartitionedBroadcast<T extends CacheBlock>

- java.lang.Object
  - org.apache.sysds.runtime.instructions.spark.data.PartitionedBroadcast<T>

All Implemented Interfaces:
  Serializable

public class PartitionedBroadcast<T extends CacheBlock> extends Object implements Serializable
This class is a wrapper around an array of broadcasts of partitioned matrix/frame blocks, which is required due to the 2GB limitation of Spark's broadcast handling. Without this partitioning of Broadcast<PartitionedBlock> into Broadcast<PartitionedBlock>[], we get a java.lang.IllegalArgumentException: Size exceeds Integer.MAX_VALUE. Despite various JIRAs, this issue still showed up in Spark 2.1.

See Also:
  Serialized Form
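The motivation above can be sketched with a small standalone example; the `MAX_BYTES` constant and `numChunks` helper below are illustrative assumptions, not SystemDS or Spark API:

```java
public class BroadcastChunking {
    // Illustrative cap mirroring Spark's ~2GB (Integer.MAX_VALUE bytes)
    // limit on a single broadcast payload; not an actual SystemDS constant.
    static final long MAX_BYTES = Integer.MAX_VALUE;

    // Split a payload of totalBytes into the minimal number of chunks that
    // each stay under MAX_BYTES -- conceptually what wrapping
    // Broadcast<PartitionedBlock> into Broadcast<PartitionedBlock>[] achieves.
    static int numChunks(long totalBytes) {
        return (int) ((totalBytes + MAX_BYTES - 1) / MAX_BYTES);
    }

    public static void main(String[] args) {
        long payload = 5L * 1024 * 1024 * 1024; // a 5 GB matrix payload
        System.out.println("chunks needed: " + numChunks(payload)); // 3
    }
}
```

Each chunk then becomes one entry of the broadcast array, so no single broadcast ever exceeds the JVM's 2GB byte-array addressing limit.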
-
-
Constructor Summary

Constructors:
  PartitionedBroadcast()
  PartitionedBroadcast(org.apache.spark.broadcast.Broadcast<PartitionedBlock<T>>[] broadcasts, DataCharacteristics dc)
-
Method Summary

All Methods | Static Methods | Instance Methods | Concrete Methods

static int
  computeBlocksPerPartition(long[] dims, int blen)
static int
  computeBlocksPerPartition(long rlen, long clen, long blen)
void
  destroy()
    Cleans up all underlying broadcasts of a partitioned broadcast by forwarding the calls to SparkExecutionContext.cleanupBroadcastVariable.
T
  getBlock(int[] ix)
T
  getBlock(int rowIndex, int colIndex)
org.apache.spark.broadcast.Broadcast<PartitionedBlock<T>>[]
  getBroadcasts()
DataCharacteristics
  getDataCharacteristics()
long
  getNumCols()
int
  getNumColumnBlocks()
int
  getNumRowBlocks()
long
  getNumRows()
T
  slice(long rl, long ru, long cl, long cu, T block)
    Utility for slice operations over partitioned matrices, where the index range can cover multiple blocks.
-
-
-
Constructor Detail
-
PartitionedBroadcast
public PartitionedBroadcast()
-
PartitionedBroadcast
public PartitionedBroadcast(org.apache.spark.broadcast.Broadcast<PartitionedBlock<T>>[] broadcasts, DataCharacteristics dc)
-
-
Method Detail
-
getBroadcasts
public org.apache.spark.broadcast.Broadcast<PartitionedBlock<T>>[] getBroadcasts()
-
getNumRows
public long getNumRows()
-
getNumCols
public long getNumCols()
-
getNumRowBlocks
public int getNumRowBlocks()
-
getNumColumnBlocks
public int getNumColumnBlocks()
-
getDataCharacteristics
public DataCharacteristics getDataCharacteristics()
-
computeBlocksPerPartition
public static int computeBlocksPerPartition(long rlen, long clen, long blen)
-
computeBlocksPerPartition
public static int computeBlocksPerPartition(long[] dims, int blen)
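A plausible reading of this computation is sketched below, assuming dense blocks of 8-byte doubles and the Integer.MAX_VALUE byte cap from the class description; the exact formula used by SystemDS may differ:

```java
public class BlocksPerPartitionSketch {
    // Hypothetical re-derivation: how many blen x blen dense double blocks
    // (8 bytes per cell) fit under Spark's Integer.MAX_VALUE byte limit.
    // Math.min handles matrices smaller than one full block per dimension.
    static int computeBlocksPerPartition(long rlen, long clen, long blen) {
        long blockBytes = Math.min(rlen, blen) * Math.min(clen, blen) * 8;
        return (int) Math.max(1, Integer.MAX_VALUE / blockBytes);
    }

    public static void main(String[] args) {
        // 1000 x 1000 blocks of doubles = 8 MB each -> 268 blocks fit
        System.out.println(computeBlocksPerPartition(10_000, 10_000, 1000));
    }
}
```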
-
getBlock
public T getBlock(int rowIndex, int colIndex)
-
getBlock
public T getBlock(int[] ix)
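Given a blocks-per-partition value, looking up a block across the broadcast array reduces to an index translation. The sketch below illustrates this with 1-based block indexes and row-major linearization; both are assumptions about the layout, not confirmed SystemDS internals:

```java
public class BlockLookupSketch {
    // Map a 1-based (rowIndex, colIndex) block coordinate to the broadcast
    // partition holding it and the block's local offset in that partition.
    static int[] locate(int rowIndex, int colIndex,
                        int numColumnBlocks, int blocksPerPartition) {
        int linear = (rowIndex - 1) * numColumnBlocks + (colIndex - 1);
        return new int[] { linear / blocksPerPartition,   // partition index
                           linear % blocksPerPartition }; // local offset
    }

    public static void main(String[] args) {
        // block (3,2) in a 4-column grid, 5 blocks per partition:
        // linear index 9 -> partition 1, offset 4
        int[] loc = locate(3, 2, 4, 5);
        System.out.println(loc[0] + ", " + loc[1]);
    }
}
```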
-
slice
public T slice(long rl, long ru, long cl, long cu, T block)
Utility for slice operations over partitioned matrices, where the index range can cover multiple blocks. The result is always a single result matrix block. All semantics are equivalent to the core matrix block slice operations.

Parameters:
  rl - row lower bound
  ru - row upper bound
  cl - column lower bound
  cu - column upper bound
  block - block object
Returns:
  block object
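The "index range can cover multiple blocks" case can be illustrated by computing which block indexes a cell range touches; 1-based cell and block indexes are assumed here, mirroring common SystemDS conventions:

```java
public class SliceRangeSketch {
    // For a 1-based cell range [lower, upper] and block size blen, compute
    // the 1-based block indexes the range touches; a slice over a
    // partitioned matrix must visit every block in this range.
    static int[] coveredBlocks(long lower, long upper, int blen) {
        int first = (int) ((lower - 1) / blen) + 1;
        int last  = (int) ((upper - 1) / blen) + 1;
        return new int[] { first, last };
    }

    public static void main(String[] args) {
        // rows 950..1050 with blen=1000 straddle blocks 1 and 2
        int[] range = coveredBlocks(950, 1050, 1000);
        System.out.println(range[0] + ".." + range[1]);
    }
}
```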
-
destroy
public void destroy()
Cleans up all underlying broadcasts of a partitioned broadcast by forwarding the calls to SparkExecutionContext.cleanupBroadcastVariable.