@Internal public class GlobalCommitterOperator<CommT,GlobalCommT> extends AbstractStreamOperator<Void> implements OneInputStreamOperator<CommittableMessage<CommT>,Void>
Implements the GlobalCommitter.
This operator usually trails behind a CommitterOperator. In this case, the global
committer will receive committables from the committer operator through processElement(StreamRecord). Once all committables from all subtasks have been received, the
global committer will commit them. This approach also works for any number of intermediate custom
operators between the committer and the global committer in a custom post-commit topology.
That means that the global committer will not wait for notifyCheckpointComplete(long). In many cases, it receives the callback before the actual
committables anyway. So it would effectively globally commit one checkpoint later.
However, we can leverage the following observation: the global committer will only receive
committables iff the respective checkpoint was completed and upstream committers received the
notifyCheckpointComplete(long). So by waiting for all committables of a given
checkpoint, we implicitly know that the checkpoint was successful and the global committer is
supposed to globally commit.
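The commit-on-input idea described above can be sketched without any Flink dependency. The classes `TaggedCommittable` and `CommitOnInputSketch` below are hypothetical and are not part of Flink. The sketch also assumes, for simplicity, that every upstream subtask contributes exactly one committable per checkpoint, whereas the real operator consumes CommittableMessage records.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a committable tagged with its checkpoint and upstream subtask. */
final class TaggedCommittable {
    final long checkpointId;
    final int subtaskId;
    final String payload;

    TaggedCommittable(long checkpointId, int subtaskId, String payload) {
        this.checkpointId = checkpointId;
        this.subtaskId = subtaskId;
        this.payload = payload;
    }
}

/** Illustrative only: buffers committables per checkpoint and commits once every subtask reported. */
final class CommitOnInputSketch {
    private final int numberOfSubtasks;
    // checkpoint id -> (subtask id -> payload); TreeMap keeps a deterministic commit order
    private final Map<Long, Map<Integer, String>> pending = new HashMap<>();
    private final List<String> globallyCommitted = new ArrayList<>();

    CommitOnInputSketch(int numberOfSubtasks) {
        this.numberOfSubtasks = numberOfSubtasks;
    }

    /** Returns true if this element completed its checkpoint and triggered the global commit. */
    boolean processElement(TaggedCommittable committable) {
        Map<Integer, String> perSubtask =
                pending.computeIfAbsent(committable.checkpointId, id -> new TreeMap<>());
        perSubtask.put(committable.subtaskId, committable.payload);
        if (perSubtask.size() < numberOfSubtasks) {
            return false; // still waiting for committables from other subtasks
        }
        // All subtasks delivered, so the checkpoint must have completed upstream: commit now.
        globallyCommitted.addAll(perSubtask.values());
        pending.remove(committable.checkpointId);
        return true;
    }

    List<String> globallyCommitted() {
        return globallyCommitted;
    }
}
```

With two upstream subtasks, the first committable of a checkpoint is only buffered. The second one completes the set and triggers the global commit, without any call to notifyCheckpointComplete(long).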
Note that committables of checkpoint X are not checkpointed in X because the global committer
is trailing behind the checkpoint. They are replayed from the committer state in case of an
error. The state only includes incomplete checkpoints coming from upstream committers not
receiving notifyCheckpointComplete(long). All committables received are successful.
In rare cases, the GlobalCommitterOperator may not be connected (in)directly to a committer
but instead is connected (in)directly to a writer. In this case, the global committer needs to
perform the 2PC protocol instead of the committer. Thus, we absolutely need to use notifyCheckpointComplete(long) similarly to the CommitterOperator. Hence, commitOnInput is set to false in this case. In particular, the following three prerequisites
must be met:
- No committer is upstream from which we could implicitly infer notifyCheckpointComplete(long) as sketched above.
- The application runs in streaming mode.
- Checkpointing is enabled.
In all other cases (batch or upstream committer or checkpointing is disabled), the global committer commits on input.
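The rule for choosing the commitOnInput flag can be written as a single boolean expression. The helper below is a hypothetical sketch, not Flink code. The class name `CommitModeSketch` and its three parameters are assumptions introduced only to illustrate the cases named above.

```java
/** Illustrative only: derives the commitOnInput flag from the three conditions in the text. */
final class CommitModeSketch {
    private CommitModeSketch() {}

    static boolean commitOnInput(
            boolean streamingMode, boolean checkpointingEnabled, boolean hasUpstreamCommitter) {
        // The global committer must run the 2PC protocol itself only if no committer is
        // upstream, the job runs in streaming mode, and checkpointing is enabled.
        boolean mustWaitForCheckpointCallback =
                !hasUpstreamCommitter && streamingMode && checkpointingEnabled;
        // In all other cases (batch, upstream committer, or no checkpointing), commit on input.
        return !mustWaitForCheckpointCallback;
    }
}
```

For example, `CommitModeSketch.commitOnInput(true, true, false)` describes a streaming job with checkpointing and no upstream committer, the only combination for which this sketch returns false.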
Fields inherited from class AbstractStreamOperator: chainingStrategy, config, lastRecordAttributes1, lastRecordAttributes2, latencyStats, LOG, metrics, output, processingTimeService, stateHandler, stateKeySelector1, stateKeySelector2, timeServiceManager

| Constructor and Description |
|---|
| GlobalCommitterOperator(org.apache.flink.util.function.SerializableSupplier<org.apache.flink.api.connector.sink2.Committer<CommT>> committerFactory, org.apache.flink.util.function.SerializableSupplier<org.apache.flink.core.io.SimpleVersionedSerializer<CommT>> committableSerializerFactory, boolean commitOnInput) |

| Modifier and Type | Method and Description |
|---|---|
| void | initializeState(org.apache.flink.runtime.state.StateInitializationContext context): Stream operators with state which can be restored need to override this hook method. |
| void | notifyCheckpointComplete(long checkpointId) |
| void | processElement(StreamRecord<CommittableMessage<CommT>> element): Processes one element that arrived on this input of the MultipleInputStreamOperator. |
| void | setup(StreamTask<?,?> containingTask, StreamConfig config, Output<StreamRecord<Void>> output): Initializes the operator. |
| void | snapshotState(org.apache.flink.runtime.state.StateSnapshotContext context): Stream operators with state, which want to participate in a snapshot need to override this hook method. |

Methods inherited from class AbstractStreamOperator: close, finish, getChainingStrategy, getContainingTask, getCurrentKey, getExecutionConfig, getInternalTimerService, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getStateKeySelector1, getStateKeySelector2, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, isUsingCustomRawKeyedState, notifyCheckpointAborted, open, prepareSnapshotPreBarrier, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processRecordAttributes, processRecordAttributes1, processRecordAttributes2, processWatermark, processWatermark1, processWatermark2, processWatermarkStatus, processWatermarkStatus1, processWatermarkStatus2, reportOrForwardLatencyMarker, setChainingStrategy, setCurrentKey, setKeyContextElement1, setKeyContextElement2, setMailboxExecutor, setProcessingTimeService, snapshotState, useSplittableTimers

Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface OneInputStreamOperator: setKeyContextElement

Methods inherited from interface StreamOperator: close, finish, getMetricGroup, getOperatorAttributes, getOperatorID, initializeState, open, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState

Methods inherited from interface CheckpointListener: notifyCheckpointAborted

Methods inherited from interface KeyContext: getCurrentKey, setCurrentKey

Methods inherited from interface Input: processLatencyMarker, processRecordAttributes, processWatermark, processWatermarkStatus

Methods inherited from interface KeyContextHandler: hasKeyContext

public GlobalCommitterOperator(org.apache.flink.util.function.SerializableSupplier<org.apache.flink.api.connector.sink2.Committer<CommT>> committerFactory, org.apache.flink.util.function.SerializableSupplier<org.apache.flink.core.io.SimpleVersionedSerializer<CommT>> committableSerializerFactory, boolean commitOnInput)

public void setup(StreamTask<?,?> containingTask, StreamConfig config, Output<StreamRecord<Void>> output)
Description copied from interface: SetupableStreamOperator
Initializes the operator.
Specified by: setup in interface SetupableStreamOperator<Void>
Overrides: setup in class AbstractStreamOperator<Void>

public void snapshotState(org.apache.flink.runtime.state.StateSnapshotContext context) throws Exception
Description copied from class: AbstractStreamOperator
Stream operators with state, which want to participate in a snapshot need to override this hook method.
Specified by: snapshotState in interface StreamOperatorStateHandler.CheckpointedStreamOperator
Overrides: snapshotState in class AbstractStreamOperator<Void>
Parameters: context - context that provides information and means required for taking a snapshot
Throws: Exception

public void initializeState(org.apache.flink.runtime.state.StateInitializationContext context) throws Exception
Description copied from class: AbstractStreamOperator
Stream operators with state which can be restored need to override this hook method.
Specified by: initializeState in interface StreamOperatorStateHandler.CheckpointedStreamOperator
Overrides: initializeState in class AbstractStreamOperator<Void>
Parameters: context - context that allows to register different states.
Throws: Exception

public void notifyCheckpointComplete(long checkpointId) throws Exception
Specified by: notifyCheckpointComplete in interface org.apache.flink.api.common.state.CheckpointListener
Overrides: notifyCheckpointComplete in class AbstractStreamOperator<Void>
Throws: Exception

public void processElement(StreamRecord<CommittableMessage<CommT>> element) throws Exception
Description copied from interface: Input
Processes one element that arrived on this input of the MultipleInputStreamOperator.
This method is guaranteed to not be called concurrently with other methods of the operator.
Specified by: processElement in interface Input<CommittableMessage<CommT>>
Throws: Exception

Copyright © 2014–2025 The Apache Software Foundation. All rights reserved.