Class JobVertex
- java.lang.Object
  - org.apache.flink.runtime.jobgraph.JobVertex
- All Implemented Interfaces:
Serializable
- Direct Known Subclasses:
InputOutputFormatVertex
public class JobVertex extends Object implements Serializable
The base class for job vertices.
- See Also:
- Serialized Form
-
-
Nested Class Summary
Nested Classes
- static interface JobVertex.FinalizeOnMasterContext
  The context exposes runtime information for finalization.
- static interface JobVertex.InitializeOnMasterContext
-
Field Summary
Fields
- static int MAX_PARALLELISM_DEFAULT
-
Constructor Summary
Constructors
- JobVertex(String name)
  Constructs a new job vertex and assigns it the given name.
- JobVertex(String name, JobVertexID id)
  Constructs a new job vertex and assigns it the given name.
- JobVertex(String name, JobVertexID primaryId, List<OperatorIDPair> operatorIDPairs)
  Constructs a new job vertex and assigns it the given name.
-
Method Summary
All Methods
- void addIntermediateDataSetIdToConsume(IntermediateDataSetID intermediateDataSetId)
- void addOperatorCoordinator(org.apache.flink.util.SerializedValue<OperatorCoordinator.Provider> serializedCoordinatorProvider)
- JobEdge connectNewDataSetAsInput(JobVertex input, DistributionPattern distPattern, ResultPartitionType partitionType)
- JobEdge connectNewDataSetAsInput(JobVertex input, DistributionPattern distPattern, ResultPartitionType partitionType, boolean isBroadcast)
- JobEdge connectNewDataSetAsInput(JobVertex input, DistributionPattern distPattern, ResultPartitionType partitionType, IntermediateDataSetID intermediateDataSetId, boolean isBroadcast)
- void finalizeOnMaster(JobVertex.FinalizeOnMasterContext context)
  A hook that can be overridden by subclasses to implement logic that is called by the master after the job has completed.
- CoLocationGroup getCoLocationGroup()
- org.apache.flink.configuration.Configuration getConfiguration()
  Returns the vertex's configuration object which can be used to pass custom settings to the task at runtime.
- JobVertexID getID()
  Returns the ID of this job vertex.
- List<JobEdge> getInputs()
- org.apache.flink.core.io.InputSplitSource<?> getInputSplitSource()
- List<IntermediateDataSetID> getIntermediateDataSetIdsToConsume()
- Class<? extends TaskInvokable> getInvokableClass(ClassLoader cl)
  Returns the invokable class which represents the task of this vertex.
- String getInvokableClassName()
  Returns the name of the invokable class which represents the task of this vertex.
- int getMaxParallelism()
  Gets the maximum parallelism for the task.
- org.apache.flink.api.common.operators.ResourceSpec getMinResources()
  Gets the minimum resource for the task.
- String getName()
  Returns the name of the vertex.
- int getNumberOfInputs()
  Returns the number of inputs.
- int getNumberOfProducedIntermediateDataSets()
  Returns the number of produced intermediate data sets.
- List<org.apache.flink.util.SerializedValue<OperatorCoordinator.Provider>> getOperatorCoordinators()
- String getOperatorDescription()
- List<OperatorIDPair> getOperatorIDs()
- String getOperatorName()
- String getOperatorPrettyName()
- IntermediateDataSet getOrCreateResultDataSet(IntermediateDataSetID id, ResultPartitionType partitionType)
- int getParallelism()
  Gets the parallelism of the task.
- org.apache.flink.api.common.operators.ResourceSpec getPreferredResources()
  Gets the preferred resource for the task.
- List<IntermediateDataSet> getProducedDataSets()
- String getResultOptimizerProperties()
- SlotSharingGroup getSlotSharingGroup()
  Gets the slot sharing group that this vertex is associated with.
- boolean hasNoConnectedInputs()
- void initializeOnMaster(JobVertex.InitializeOnMasterContext context)
  A hook that can be overridden by subclasses to implement logic that is called by the master when the job starts.
- boolean isAnyOutputBlocking()
- boolean isDynamicParallelism()
- boolean isInputVertex()
- boolean isOutputVertex()
- boolean isParallelismConfigured()
- boolean isStoppable()
- boolean isSupportsConcurrentExecutionAttempts()
- void setDynamicParallelism(int parallelism)
- void setInputSplitSource(org.apache.flink.core.io.InputSplitSource<?> inputSplitSource)
- void setInvokableClass(Class<? extends TaskInvokable> invokable)
- void setMaxParallelism(int maxParallelism)
  Sets the maximum parallelism for the task.
- void setName(String name)
  Sets the name of the vertex.
- void setOperatorDescription(String operatorDescription)
- void setOperatorName(String operatorName)
- void setOperatorPrettyName(String operatorPrettyName)
- void setParallelism(int parallelism)
  Sets the parallelism for the task.
- void setParallelismConfigured(boolean parallelismConfigured)
- void setResources(org.apache.flink.api.common.operators.ResourceSpec minResources, org.apache.flink.api.common.operators.ResourceSpec preferredResources)
  Sets the minimum and preferred resources for the task.
- void setResultOptimizerProperties(String resultOptimizerProperties)
- void setSlotSharingGroup(SlotSharingGroup grp)
  Associates this vertex with a slot sharing group for scheduling.
- void setStrictlyCoLocatedWith(JobVertex strictlyCoLocatedWith)
  Tells this vertex to strictly co-locate its subtasks with the subtasks of the given vertex.
- void setSupportsConcurrentExecutionAttempts(boolean supportsConcurrentExecutionAttempts)
- String toString()
- void updateCoLocationGroup(CoLocationGroupImpl group)
-
-
-
Field Detail
-
MAX_PARALLELISM_DEFAULT
public static final int MAX_PARALLELISM_DEFAULT
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
JobVertex
public JobVertex(String name)
Constructs a new job vertex and assigns it the given name.
- Parameters:
name - The name of the new job vertex.
-
JobVertex
public JobVertex(String name, JobVertexID id)
Constructs a new job vertex and assigns it the given name.
- Parameters:
name - The name of the new job vertex.
id - The ID of the job vertex.
-
JobVertex
public JobVertex(String name, JobVertexID primaryId, List<OperatorIDPair> operatorIDPairs)
Constructs a new job vertex and assigns it the given name.
- Parameters:
name - The name of the new job vertex.
primaryId - The ID of the job vertex.
operatorIDPairs - The operator ID pairs of the job vertex.
-
-
Method Detail
-
getID
public JobVertexID getID()
Returns the ID of this job vertex.
- Returns:
- The ID of this job vertex
-
getName
public String getName()
Returns the name of the vertex.
- Returns:
- The name of the vertex.
-
setName
public void setName(String name)
Sets the name of the vertex.
- Parameters:
name - The new name.
-
getNumberOfProducedIntermediateDataSets
public int getNumberOfProducedIntermediateDataSets()
Returns the number of produced intermediate data sets.
- Returns:
- The number of produced intermediate data sets.
-
getNumberOfInputs
public int getNumberOfInputs()
Returns the number of inputs.
- Returns:
- The number of inputs.
-
getOperatorIDs
public List<OperatorIDPair> getOperatorIDs()
-
getConfiguration
public org.apache.flink.configuration.Configuration getConfiguration()
Returns the vertex's configuration object which can be used to pass custom settings to the task at runtime.
- Returns:
- the vertex's configuration object
-
setInvokableClass
public void setInvokableClass(Class<? extends TaskInvokable> invokable)
-
setParallelismConfigured
public void setParallelismConfigured(boolean parallelismConfigured)
-
isParallelismConfigured
public boolean isParallelismConfigured()
-
setDynamicParallelism
public void setDynamicParallelism(int parallelism)
-
isDynamicParallelism
public boolean isDynamicParallelism()
-
getInvokableClassName
public String getInvokableClassName()
Returns the name of the invokable class which represents the task of this vertex.
- Returns:
- The name of the invokable class, null if not set.
-
getInvokableClass
public Class<? extends TaskInvokable> getInvokableClass(ClassLoader cl)
Returns the invokable class which represents the task of this vertex.
- Parameters:
cl - The classloader used to resolve user-defined classes.
- Returns:
- The invokable class, null if it is not set.
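The lookup described above, resolving a stored class name against a supplied classloader and returning null when no name is set, can be sketched with the standard library alone. This is a minimal illustration and not Flink code: `InvokableLookupSketch`, its `resolve` method, and the use of `Runnable` in place of `TaskInvokable` are all hypothetical stand-ins.

```java
/** Hypothetical stand-in; Runnable plays the role of TaskInvokable. */
final class InvokableLookupSketch {

    /**
     * Resolves className through the given classloader.
     * Returns null if no class name has been set.
     */
    static Class<? extends Runnable> resolve(String className, ClassLoader cl) {
        if (cl == null) {
            throw new NullPointerException("The classloader must not be null.");
        }
        if (className == null) {
            return null; // nothing was set, so there is nothing to resolve
        }
        try {
            // Load via the supplied classloader so user-defined classes are visible.
            return Class.forName(className, true, cl).asSubclass(Runnable.class);
        } catch (ClassNotFoundException e) {
            throw new RuntimeException("The class could not be resolved.", e);
        } catch (ClassCastException e) {
            throw new RuntimeException("The class is not a subclass of Runnable.", e);
        }
    }
}
```

Passing the classloader explicitly matters when the class lives in user code that the caller's own classloader cannot see.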
-
getParallelism
public int getParallelism()
Gets the parallelism of the task.
- Returns:
- The parallelism of the task.
-
setParallelism
public void setParallelism(int parallelism)
Sets the parallelism for the task.
- Parameters:
parallelism - The parallelism for the task.
-
getMaxParallelism
public int getMaxParallelism()
Gets the maximum parallelism for the task.
- Returns:
- The maximum parallelism for the task.
-
setMaxParallelism
public void setMaxParallelism(int maxParallelism)
Sets the maximum parallelism for the task.
- Parameters:
maxParallelism - The maximum parallelism to be set. Must be between 1 and Short.MAX_VALUE + 1.
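The documented range for `maxParallelism` works out to 1 through 32768, since `Short.MAX_VALUE` is 32767. The following sketch checks a value against that range; `MaxParallelismSketch` and its methods are hypothetical helpers for illustration, and the source does not say whether or where `JobVertex` itself enforces the bound.

```java
/** Hypothetical helper that checks the range stated in the parameter description. */
final class MaxParallelismSketch {

    /** Upper bound from the description: Short.MAX_VALUE + 1. */
    static final int UPPER_BOUND = Short.MAX_VALUE + 1;

    /** True if the value lies in the documented range [1, Short.MAX_VALUE + 1]. */
    static boolean isInDocumentedRange(int maxParallelism) {
        return maxParallelism >= 1 && maxParallelism <= UPPER_BOUND;
    }

    /** Returns the value unchanged, or throws if it is outside the documented range. */
    static int requireInDocumentedRange(int maxParallelism) {
        if (!isInDocumentedRange(maxParallelism)) {
            throw new IllegalArgumentException(
                    "maxParallelism must be between 1 and " + UPPER_BOUND
                            + ", but was " + maxParallelism);
        }
        return maxParallelism;
    }
}
```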
-
getMinResources
public org.apache.flink.api.common.operators.ResourceSpec getMinResources()
Gets the minimum resource for the task.
- Returns:
- The minimum resource for the task.
-
getPreferredResources
public org.apache.flink.api.common.operators.ResourceSpec getPreferredResources()
Gets the preferred resource for the task.
- Returns:
- The preferred resource for the task.
-
setResources
public void setResources(org.apache.flink.api.common.operators.ResourceSpec minResources, org.apache.flink.api.common.operators.ResourceSpec preferredResources)
Sets the minimum and preferred resources for the task.
- Parameters:
minResources - The minimum resource for the task.
preferredResources - The preferred resource for the task.
-
getInputSplitSource
public org.apache.flink.core.io.InputSplitSource<?> getInputSplitSource()
-
setInputSplitSource
public void setInputSplitSource(org.apache.flink.core.io.InputSplitSource<?> inputSplitSource)
-
getProducedDataSets
public List<IntermediateDataSet> getProducedDataSets()
-
getOperatorCoordinators
public List<org.apache.flink.util.SerializedValue<OperatorCoordinator.Provider>> getOperatorCoordinators()
-
addOperatorCoordinator
public void addOperatorCoordinator(org.apache.flink.util.SerializedValue<OperatorCoordinator.Provider> serializedCoordinatorProvider)
-
setSlotSharingGroup
public void setSlotSharingGroup(SlotSharingGroup grp)
Associates this vertex with a slot sharing group for scheduling. Different vertices in the same slot sharing group can run one subtask each in the same slot.
- Parameters:
grp - The slot sharing group to associate the vertex with.
-
getSlotSharingGroup
public SlotSharingGroup getSlotSharingGroup()
Gets the slot sharing group that this vertex is associated with. Different vertices in the same slot sharing group can run one subtask each in the same slot.
- Returns:
- The slot sharing group that this vertex is associated with.
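Because each vertex in a group may place one subtask in a given slot, a group's slot demand is driven by its most parallel vertex rather than by the sum of all parallelisms. The sketch below illustrates that arithmetic under a simplifying assumption: subtask i of every vertex in the group is placed in slot i. `SlotSharingSketch` and `slotsNeeded` are hypothetical names, and this is not the scheduler's actual placement logic.

```java
import java.util.Map;

/** Hypothetical model of slot demand for one slot sharing group. */
final class SlotSharingSketch {

    /**
     * Given the parallelism of each vertex in a group, returns how many slots
     * the group needs if subtask i of every vertex shares slot i.
     */
    static int slotsNeeded(Map<String, Integer> parallelismByVertex) {
        int slots = 0;
        for (int parallelism : parallelismByVertex.values()) {
            // A slot holds at most one subtask per vertex, so the widest vertex decides.
            slots = Math.max(slots, parallelism);
        }
        return slots;
    }
}
```

Under this assumption, vertices with parallelism 2, 4, and 1 in one group would occupy 4 slots instead of 7.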
-
setStrictlyCoLocatedWith
public void setStrictlyCoLocatedWith(JobVertex strictlyCoLocatedWith)
Tells this vertex to strictly co-locate its subtasks with the subtasks of the given vertex. Strict co-location implies that the n-th subtask of this vertex will run on the same parallel computing instance (TaskManager) as the n-th subtask of the given vertex.
NOTE: Co-location is only possible between vertices in a slot sharing group.
NOTE: This vertex must (transitively) depend on the vertex to be co-located with. That means that the respective vertex must be a (transitive) input of this vertex.
- Parameters:
strictlyCoLocatedWith - The vertex whose subtasks to co-locate this vertex's subtasks with.
- Throws:
IllegalArgumentException - Thrown if this vertex and the vertex to co-locate with are not in a common slot sharing group.
- See Also:
setSlotSharingGroup(SlotSharingGroup)
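The precondition and the n-th-subtask pairing described above can be modelled in a few lines. This is a simplified sketch, not the Flink implementation: `CoLocationSketch.Vertex`, its fields, and `partnerOf` are hypothetical, and any shared object stands in for a `SlotSharingGroup`.

```java
/** Hypothetical model of the strict co-location contract. */
final class CoLocationSketch {

    static final class Vertex {
        final String name;
        Object slotSharingGroup; // stand-in for a SlotSharingGroup
        Vertex coLocatedWith;

        Vertex(String name) {
            this.name = name;
        }

        /** Rejects the request unless both vertices share one slot sharing group. */
        void setStrictlyCoLocatedWith(Vertex other) {
            if (slotSharingGroup == null || slotSharingGroup != other.slotSharingGroup) {
                throw new IllegalArgumentException(
                        "Both vertices must be in a common slot sharing group.");
            }
            coLocatedWith = other;
        }

        /** Names the subtask that subtask n of this vertex must run alongside, or null. */
        String partnerOf(int subtaskIndex) {
            return coLocatedWith == null ? null : coLocatedWith.name + "[" + subtaskIndex + "]";
        }
    }
}
```

In this model the slot sharing group has to be assigned before co-location is requested, which mirrors the documented `IllegalArgumentException`.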
-
getCoLocationGroup
@Nullable public CoLocationGroup getCoLocationGroup()
-
updateCoLocationGroup
public void updateCoLocationGroup(CoLocationGroupImpl group)
-
getOrCreateResultDataSet
public IntermediateDataSet getOrCreateResultDataSet(IntermediateDataSetID id, ResultPartitionType partitionType)
-
connectNewDataSetAsInput
public JobEdge connectNewDataSetAsInput(JobVertex input, DistributionPattern distPattern, ResultPartitionType partitionType)
-
connectNewDataSetAsInput
public JobEdge connectNewDataSetAsInput(JobVertex input, DistributionPattern distPattern, ResultPartitionType partitionType, boolean isBroadcast)
-
connectNewDataSetAsInput
public JobEdge connectNewDataSetAsInput(JobVertex input, DistributionPattern distPattern, ResultPartitionType partitionType, IntermediateDataSetID intermediateDataSetId, boolean isBroadcast)
-
isInputVertex
public boolean isInputVertex()
-
isStoppable
public boolean isStoppable()
-
isOutputVertex
public boolean isOutputVertex()
-
hasNoConnectedInputs
public boolean hasNoConnectedInputs()
-
setSupportsConcurrentExecutionAttempts
public void setSupportsConcurrentExecutionAttempts(boolean supportsConcurrentExecutionAttempts)
-
isSupportsConcurrentExecutionAttempts
public boolean isSupportsConcurrentExecutionAttempts()
-
isAnyOutputBlocking
public boolean isAnyOutputBlocking()
-
initializeOnMaster
public void initializeOnMaster(JobVertex.InitializeOnMasterContext context) throws Exception
A hook that can be overridden by subclasses to implement logic that is called by the master when the job starts.
- Parameters:
context - Provides contextual information for the initialization.
- Throws:
Exception - The method may throw exceptions, which cause the job to fail immediately.
-
finalizeOnMaster
public void finalizeOnMaster(JobVertex.FinalizeOnMasterContext context) throws Exception
A hook that can be overridden by subclasses to implement logic that is called by the master after the job has completed.
- Parameters:
context - Provides contextual information for the finalization.
- Throws:
Exception - The method may throw exceptions, which cause the job to fail immediately.
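The two hooks follow a template-method shape: the base class provides no-op methods, a subclass overrides them, and the master calls them around the job's execution. The sketch below models that lifecycle with stand-in types. `MasterHookSketch`, `OutputVertex`, and `runJob` are hypothetical, the context parameters are omitted for brevity, and the real master's sequencing and failure handling are more involved than this.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the master-side hook lifecycle. */
final class MasterHookSketch {

    /** Base vertex: both hooks do nothing unless overridden. */
    static class Vertex {
        void initializeOnMaster() throws Exception {}

        void finalizeOnMaster() throws Exception {}
    }

    /** A subclass that records what each hook would do. */
    static final class OutputVertex extends Vertex {
        final List<String> log = new ArrayList<>();

        @Override
        void initializeOnMaster() {
            log.add("prepare output");
        }

        @Override
        void finalizeOnMaster() {
            log.add("commit output");
        }
    }

    /** Calls the hooks around the work; any exception from a hook fails the job. */
    static boolean runJob(Vertex vertex, Runnable work) {
        try {
            vertex.initializeOnMaster(); // when the job starts
            work.run();
            vertex.finalizeOnMaster(); // after the job has completed
            return true;
        } catch (Exception e) {
            return false; // the job is treated as failed
        }
    }
}
```

The direct known subclass `InputOutputFormatVertex` suggests the typical use is setup and teardown for input and output formats, though the source does not spell that out.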
-
getOperatorName
public String getOperatorName()
-
setOperatorName
public void setOperatorName(String operatorName)
-
getOperatorDescription
public String getOperatorDescription()
-
setOperatorDescription
public void setOperatorDescription(String operatorDescription)
-
setOperatorPrettyName
public void setOperatorPrettyName(String operatorPrettyName)
-
getOperatorPrettyName
public String getOperatorPrettyName()
-
getResultOptimizerProperties
public String getResultOptimizerProperties()
-
setResultOptimizerProperties
public void setResultOptimizerProperties(String resultOptimizerProperties)
-
addIntermediateDataSetIdToConsume
public void addIntermediateDataSetIdToConsume(IntermediateDataSetID intermediateDataSetId)
-
getIntermediateDataSetIdsToConsume
public List<IntermediateDataSetID> getIntermediateDataSetIdsToConsume()
-
-