public class NettyShuffleMaster extends Object implements ShuffleMaster<NettyShuffleDescriptor>
ShuffleMaster for netty and local file based shuffle implementation.

| Constructor and Description |
|---|
| NettyShuffleMaster(ShuffleMasterContext shuffleMasterContext) |
| Modifier and Type | Method and Description |
|---|---|
| void | close(): Closes this shuffle master service, which should release all resources. |
| org.apache.flink.configuration.MemorySize | computeShuffleMemorySizeForTask(TaskInputsOutputsDescriptor desc): The JM announces the network memory requirement based on the result calculated by this method. |
| CompletableFuture<Collection<PartitionWithMetrics>> | getPartitionWithMetrics(org.apache.flink.api.common.JobID jobId, Duration timeout, Set<ResultPartitionID> expectedPartitions): Retrieves the specified partitions and their metrics (identified by expectedPartitions); the metrics include the sizes of sub-partitions in a result partition. |
| void | notifyPartitionRecoveryStarted(org.apache.flink.api.common.JobID jobId): Notifies that the recovery process of result partitions has started. |
| void | registerJob(JobShuffleContext context): Registers the target job together with the corresponding JobShuffleContext with this shuffle master. |
| CompletableFuture<NettyShuffleDescriptor> | registerPartitionWithProducer(org.apache.flink.api.common.JobID jobID, PartitionDescriptor partitionDescriptor, ProducerDescriptor producerDescriptor): Asynchronously registers a partition and its producer with the shuffle service. |
| void | releasePartitionExternally(ShuffleDescriptor shuffleDescriptor): Releases any external resources occupied by the given partition. |
| void | snapshotState(CompletableFuture<ShuffleMasterSnapshot> snapshotFuture, ShuffleMasterSnapshotContext context): Triggers a snapshot of the shuffle master's state. |
| boolean | supportsBatchSnapshot(): Whether the shuffle master supports taking snapshots in batch scenarios if BatchExecutionOptions.JOB_RECOVERY_ENABLED is true. |
| void | unregisterJob(org.apache.flink.api.common.JobID jobId): Unregisters the target job from this shuffle master, which means the corresponding job has reached a global termination state and all the allocated resources except for the cluster partitions can be cleared. |
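As a rough illustration of the call order the method summary suggests (register a job, register partitions asynchronously, release them, unregister the job, close), here is a minimal sketch. It is not Flink code: `ToyShuffleMaster`, its `partitionCount` helper, and the plain `String` job IDs, partition IDs, and descriptor handles are hypothetical stand-ins for `JobID`, `PartitionDescriptor`, `ProducerDescriptor`, and `NettyShuffleDescriptor`. It mirrors only the method names, not the real implementation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for illustration only; this is NOT Flink's NettyShuffleMaster.
final class ToyShuffleMaster implements AutoCloseable {

    // Assumed bookkeeping: job ID -> handles of the partitions registered for that job.
    private final Map<String, Set<String>> partitionsByJob = new HashMap<>();

    // Mirrors registerJob(JobShuffleContext); the context is reduced to a job ID here.
    void registerJob(String jobId) {
        partitionsByJob.putIfAbsent(jobId, new HashSet<>());
    }

    // Mirrors registerPartitionWithProducer(...): returns a future holding a partition handle.
    CompletableFuture<String> registerPartitionWithProducer(
            String jobId, String partitionId, String producerLocation) {
        CompletableFuture<String> result = new CompletableFuture<>();
        Set<String> partitions = partitionsByJob.get(jobId);
        if (partitions == null) {
            // In this sketch, registering a partition for an unknown job fails the future.
            result.completeExceptionally(
                    new IllegalStateException("Job not registered: " + jobId));
        } else {
            String descriptor = partitionId + "@" + producerLocation;
            partitions.add(descriptor);
            result.complete(descriptor);
        }
        return result;
    }

    // Mirrors releasePartitionExternally(ShuffleDescriptor): forgets the handle.
    void releasePartitionExternally(String descriptor) {
        for (Set<String> partitions : partitionsByJob.values()) {
            partitions.remove(descriptor);
        }
    }

    // Mirrors unregisterJob(JobID): drops all bookkeeping for the job.
    void unregisterJob(String jobId) {
        partitionsByJob.remove(jobId);
    }

    // Helper for the demo only; it has no counterpart in the method summary.
    int partitionCount(String jobId) {
        Set<String> partitions = partitionsByJob.get(jobId);
        return partitions == null ? 0 : partitions.size();
    }

    @Override
    public void close() {
        partitionsByJob.clear();
    }

    public static void main(String[] args) {
        try (ToyShuffleMaster master = new ToyShuffleMaster()) {
            master.registerJob("job-1");
            String descriptor =
                    master.registerPartitionWithProducer("job-1", "p-0", "tm-1").join();
            System.out.println(descriptor + ", partitions: " + master.partitionCount("job-1"));
            master.releasePartitionExternally(descriptor);
            master.unregisterJob("job-1");
        }
    }
}
```

The sketch completes the future immediately; a real shuffle service may complete it later, so callers should compose on the returned `CompletableFuture` rather than assume the descriptor is available synchronously.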
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface ShuffleMaster: restoreState, start

public NettyShuffleMaster(ShuffleMasterContext shuffleMasterContext)

public CompletableFuture<NettyShuffleDescriptor> registerPartitionWithProducer(org.apache.flink.api.common.JobID jobID, PartitionDescriptor partitionDescriptor, ProducerDescriptor producerDescriptor)

Description copied from interface: ShuffleMaster

Asynchronously registers a partition and its producer with the shuffle service.

The returned shuffle descriptor is an internal handle that identifies the partition within the shuffle service. The descriptor should provide enough information to read data from or write data to the partition.

Specified by: registerPartitionWithProducer in interface ShuffleMaster<NettyShuffleDescriptor>

Parameters:
jobID - job ID of the corresponding job which registered the partition
partitionDescriptor - general job graph information about the partition
producerDescriptor - general producer information (location, execution id, connection info)

public void releasePartitionExternally(ShuffleDescriptor shuffleDescriptor)

Description copied from interface: ShuffleMaster

Releases any external resources occupied by the given partition.

This call triggers the release of any resources occupied by the given partition in external systems outside of the producer executor. This is mostly relevant for batch jobs and blocking result partitions. The producer's local resources are managed by ShuffleDescriptor.storesLocalResourcesOn() and ShuffleEnvironment.releasePartitionsLocally(Collection).

Specified by: releasePartitionExternally in interface ShuffleMaster<NettyShuffleDescriptor>

Parameters:
shuffleDescriptor - shuffle descriptor of the result partition to release externally

public org.apache.flink.configuration.MemorySize computeShuffleMemorySizeForTask(TaskInputsOutputsDescriptor desc)

The JobManager (JM) announces the network memory requirement based on the result calculated by this method. The calculation depends on both NettyShuffleEnvironmentOptions.NETWORK_BUFFERS_PER_CHANNEL and NettyShuffleEnvironmentOptions.NETWORK_EXTRA_BUFFERS_PER_GATE, so in fine-grained resource management these configurations must be kept consistent between the JM, the ResourceManager (RM), and the TaskManager (TM). This guarantees that the memory announced matches the memory allocated.

Specified by: computeShuffleMemorySizeForTask in interface ShuffleMaster<NettyShuffleDescriptor>

Parameters:
desc - describes task inputs and outputs information for shuffle memory calculation

Returns:
the shuffle memory size for the given TaskInputsOutputsDescriptor

public CompletableFuture<Collection<PartitionWithMetrics>> getPartitionWithMetrics(org.apache.flink.api.common.JobID jobId, Duration timeout, Set<ResultPartitionID> expectedPartitions)

Description copied from interface: ShuffleMaster

Retrieves the specified partitions and their metrics (identified by expectedPartitions); the metrics include the sizes of sub-partitions in a result partition.

Specified by: getPartitionWithMetrics in interface ShuffleMaster<NettyShuffleDescriptor>

Parameters:
jobId - ID of the target job
timeout - the timeout used for retrieving the specified partitions
expectedPartitions - the set of identifiers for the result partitions whose metrics are to be fetched

public void registerJob(JobShuffleContext context)

Description copied from interface: ShuffleMaster

Registers the target job together with the corresponding JobShuffleContext with this shuffle master. Through the shuffle context, one can obtain basic information such as the job ID and the job configuration. It also enables the ShuffleMaster to notify the JobMaster about lost result partitions, so that the JobMaster can identify and reproduce unavailable partitions earlier.

Specified by: registerJob in interface ShuffleMaster<NettyShuffleDescriptor>

Parameters:
context - the corresponding shuffle context of the target job

public void unregisterJob(org.apache.flink.api.common.JobID jobId)
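
Description copied from interface: ShuffleMaster

Unregisters the target job from this shuffle master, which means the corresponding job has reached a global termination state and all the allocated resources except for the cluster partitions can be cleared.

Specified by: unregisterJob in interface ShuffleMaster<NettyShuffleDescriptor>

Parameters:
jobId - ID of the target job to be unregistered

public boolean supportsBatchSnapshot()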

Description copied from interface: ShuffleMaster

Whether the shuffle master supports taking snapshots in batch scenarios if BatchExecutionOptions.JOB_RECOVERY_ENABLED is true. If it returns true, Flink will call ShuffleMaster.snapshotState(CompletableFuture<ShuffleMasterSnapshot>, ShuffleMasterSnapshotContext) to take a snapshot, and ShuffleMaster.restoreState(List<ShuffleMasterSnapshot>) to restore the state of the shuffle master.

Specified by: supportsBatchSnapshot in interface ShuffleMaster<NettyShuffleDescriptor>

public void snapshotState(CompletableFuture<ShuffleMasterSnapshot> snapshotFuture, ShuffleMasterSnapshotContext context)
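
Description copied from interface: ShuffleMaster

Triggers a snapshot of the shuffle master's state.

Specified by: snapshotState in interface ShuffleMaster<NettyShuffleDescriptor>

public void notifyPartitionRecoveryStarted(org.apache.flink.api.common.JobID jobId)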
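
Description copied from interface: ShuffleMaster

Notifies that the recovery process of result partitions has started.

Specified by: notifyPartitionRecoveryStarted in interface ShuffleMaster<NettyShuffleDescriptor>

Parameters:
jobId - ID of the target job

public void close() throws Exception

Description copied from interface: ShuffleMaster

Closes this shuffle master service, which should release all resources.

Specified by: close in interface AutoCloseable
Specified by: close in interface ShuffleMaster<NettyShuffleDescriptor>

Throws:
Exception

Copyright © 2014–2025 The Apache Software Foundation. All rights reserved.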