public class DefaultVertexParallelismAndInputInfosDecider extends Object implements VertexParallelismAndInputInfosDecider
VertexParallelismAndInputInfosDecider. This implementation will
decide parallelism and JobVertexInputInfos as follows:
1. For job vertices whose inputs are all ALL_TO_ALL edges, evenly distribute data to downstream subtasks, make different downstream subtasks consume roughly the same amount of data.
2. For other cases, evenly distribute subpartitions to downstream subtasks, make different downstream subtasks consume roughly the same number of subpartitions.
| Modifier and Type | Method and Description |
|---|---|
int |
computeSourceParallelismUpperBound(JobVertexID jobVertexId,
int maxParallelism)
Compute source parallelism upper bound for the source vertex.
|
ParallelismAndInputInfos |
decideParallelismAndInputInfosForVertex(JobVertexID jobVertexId,
List<BlockingResultInfo> consumedResults,
int vertexInitialParallelism,
int vertexMinParallelism,
int vertexMaxParallelism)
Decide the parallelism and
JobVertexInputInfos for this job vertex. |
long |
getDataVolumePerTask()
Get the average size of data volume to expect each task instance to process.
|
public ParallelismAndInputInfos decideParallelismAndInputInfosForVertex(JobVertexID jobVertexId, List<BlockingResultInfo> consumedResults, int vertexInitialParallelism, int vertexMinParallelism, int vertexMaxParallelism)
VertexParallelismAndInputInfosDeciderJobVertexInputInfos for this job vertex.decideParallelismAndInputInfosForVertex in interface VertexParallelismAndInputInfosDeciderjobVertexId - The job vertex idconsumedResults - The information of consumed blocking resultsvertexInitialParallelism - The initial parallelism of the job vertex. If it's a positive
number, it will be respected. If it's not set(equals to ExecutionConfig.PARALLELISM_DEFAULT), a parallelism will be automatically decided for
the vertex.vertexMinParallelism - The min parallelism of the job vertex.vertexMaxParallelism - The max parallelism of the job vertex.public int computeSourceParallelismUpperBound(JobVertexID jobVertexId, int maxParallelism)
VertexParallelismAndInputInfosDecidercomputeSourceParallelismUpperBound in interface VertexParallelismAndInputInfosDeciderjobVertexId - The job vertex idmaxParallelism - The max parallelism of the job vertex.public long getDataVolumePerTask()
VertexParallelismAndInputInfosDecidergetDataVolumePerTask in interface VertexParallelismAndInputInfosDeciderCopyright © 2014–2024 The Apache Software Foundation. All rights reserved.