@NotThreadSafe public abstract class SortBuffer extends Object implements DataBuffer
DataBuffer implementation which sorts all appended records only by subpartition index.
Records of the same subpartition keep the appended order.
It maintains a list of MemorySegments as a joint buffer. Data will be appended to the
joint buffer sequentially. When writing a record, an index entry will be appended first. An index
entry consists of 4 fields: 4 bytes for record length, 4 bytes for Buffer.DataType and 8
bytes for address pointing to the next index entry of the same subpartition which will be used to
index the next record to read when coping data from this DataBuffer. For simplicity, no
index entry can span multiple segments. The corresponding record data is seated right after its
index entry and different from the index entry, records have variable length thus may span
multiple segments.
| Modifier and Type | Field and Description |
|---|---|
protected BufferRecycler |
bufferRecycler
BufferRecycler used to recycle freeSegments. |
protected LinkedList<org.apache.flink.core.memory.MemorySegment> |
freeSegments
A list of
MemorySegments used to store data in memory. |
protected static int |
INDEX_ENTRY_SIZE
Size of an index entry: 4 bytes for record length, 4 bytes for data type and 8 bytes for
pointer to next entry.
|
protected boolean |
isFinished
Whether this sort buffer is finished.
|
protected boolean |
isReleased
Whether this sort buffer is released.
|
protected long[] |
lastIndexEntryAddresses
Addresses of the last record's index entry for each subpartition.
|
protected long |
numTotalBytesRead
Total number of bytes already read from this sort buffer.
|
protected long |
readIndexEntryAddress
Index entry address of the current record or event to be read.
|
protected int |
readOrderIndex
Used to index the current available subpartition to read data from.
|
protected int |
recordRemainingBytes
Record bytes remaining after last copy, which must be read first in next copy.
|
ArrayList<org.apache.flink.core.memory.MemorySegment> |
segments
A segment list as a joint buffer which stores all records and index entries.
|
protected int[] |
subpartitionReadOrder
Data of different subpartitions in this sort buffer will be read in this order.
|
| Modifier | Constructor and Description |
|---|---|
protected |
SortBuffer(LinkedList<org.apache.flink.core.memory.MemorySegment> freeSegments,
BufferRecycler bufferRecycler,
int numSubpartitions,
int bufferSize,
int numGuaranteedBuffers,
int[] customReadOrder) |
| Modifier and Type | Method and Description |
|---|---|
boolean |
append(ByteBuffer source,
int targetSubpartition,
Buffer.DataType dataType)
No partial record will be written to this
SortBasedDataBuffer, which means that
either all data of target record will be written or nothing will be written. |
protected int |
copyRecordOrEvent(org.apache.flink.core.memory.MemorySegment targetSegment,
int targetSegmentOffset,
int sourceSegmentIndex,
int sourceSegmentOffset,
int recordLength) |
void |
finish()
Finishes this
DataBuffer which means no record can be appended anymore. |
protected int |
getSegmentIndexFromPointer(long value) |
protected int |
getSegmentOffsetFromPointer(long value) |
boolean |
hasRemaining()
Returns true if not all data appended to this
DataBuffer is consumed. |
boolean |
isFinished()
Whether this
DataBuffer is finished or not. |
boolean |
isReleased()
Whether this
DataBuffer is released or not. |
long |
numTotalBytes()
Returns the total number of bytes written to this
DataBuffer. |
long |
numTotalRecords()
Returns the total number of records written to this
DataBuffer. |
void |
release()
Releases this
DataBuffer which releases all resources. |
protected void |
updateReadSubpartitionAndIndexEntryAddress() |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, waitgetNextBufferprotected static final int INDEX_ENTRY_SIZE
protected final LinkedList<org.apache.flink.core.memory.MemorySegment> freeSegments
MemorySegments used to store data in memory.protected final BufferRecycler bufferRecycler
BufferRecycler used to recycle freeSegments.public final ArrayList<org.apache.flink.core.memory.MemorySegment> segments
protected final long[] lastIndexEntryAddresses
protected long numTotalBytesRead
protected boolean isFinished
protected boolean isReleased
protected final int[] subpartitionReadOrder
protected long readIndexEntryAddress
protected int recordRemainingBytes
protected int readOrderIndex
protected SortBuffer(LinkedList<org.apache.flink.core.memory.MemorySegment> freeSegments, BufferRecycler bufferRecycler, int numSubpartitions, int bufferSize, int numGuaranteedBuffers, @Nullable int[] customReadOrder)
public boolean append(ByteBuffer source, int targetSubpartition, Buffer.DataType dataType) throws IOException
SortBasedDataBuffer, which means that
either all data of target record will be written or nothing will be written.append in interface DataBufferIOExceptionprotected int copyRecordOrEvent(org.apache.flink.core.memory.MemorySegment targetSegment,
int targetSegmentOffset,
int sourceSegmentIndex,
int sourceSegmentOffset,
int recordLength)
protected void updateReadSubpartitionAndIndexEntryAddress()
protected int getSegmentIndexFromPointer(long value)
protected int getSegmentOffsetFromPointer(long value)
public long numTotalRecords()
DataBufferDataBuffer.numTotalRecords in interface DataBufferpublic long numTotalBytes()
DataBufferDataBuffer.numTotalBytes in interface DataBufferpublic boolean hasRemaining()
DataBufferDataBuffer is consumed.hasRemaining in interface DataBufferpublic void finish()
DataBufferDataBuffer which means no record can be appended anymore.finish in interface DataBufferpublic boolean isFinished()
DataBufferDataBuffer is finished or not.isFinished in interface DataBufferpublic void release()
DataBufferDataBuffer which releases all resources.release in interface DataBufferpublic boolean isReleased()
DataBufferDataBuffer is released or not.isReleased in interface DataBufferCopyright © 2014–2025 The Apache Software Foundation. All rights reserved.