Class StoppableKafkaEnumContextProxy
java.lang.Object
org.apache.flink.connector.kafka.dynamic.source.enumerator.StoppableKafkaEnumContextProxy
- All Implemented Interfaces:
AutoCloseable, org.apache.flink.api.connector.source.SplitEnumeratorContext<KafkaPartitionSplit>
@Internal
public class StoppableKafkaEnumContextProxy
extends Object
implements org.apache.flink.api.connector.source.SplitEnumeratorContext<KafkaPartitionSplit>, AutoCloseable
A proxy enumerator context that supports lifecycle management of the underlying threads related to a sub KafkaSourceEnumerator. This is motivated by the need to cancel the periodic partition discovery tasks scheduled by sub Kafka enumerators when they are restarted: after a source restart, the worker thread pool in SourceCoordinatorContext should not contain tasks of inactive KafkaSourceEnumerators.
Because scheduled tasks cannot be cancelled from SourceCoordinatorContext, this enumerator context safely catches exceptions during enumerator restart and uses a closeable proxy scheduler that invokes tasks on the coordinator main thread, preserving the single-threaded property.
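The ownership idea above can be illustrated with a minimal sketch (a hypothetical class, not the actual Flink implementation): periodic work is scheduled on an executor owned by the proxy, so closing the proxy cancels those tasks without shutting down anything shared with the coordinator.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

// Hypothetical sketch, not the actual Flink implementation: periodic work is
// scheduled on an executor owned by the proxy, so closing the proxy cancels
// the tasks without touching the shared coordinator context.
class ClosableSchedulerSketch implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    <T> void schedule(Callable<T> task, BiConsumer<T, Throwable> handler,
                      long initialDelayMs, long periodMs) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                // The real proxy hands the result handler back to the
                // coordinator main thread; here it runs inline for brevity.
                handler.accept(task.call(), null);
            } catch (Throwable t) {
                handler.accept(null, t);
            }
        }, initialDelayMs, periodMs, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        // Cancels pending periodic tasks owned by this proxy only.
        scheduler.shutdownNow();
    }
}
```

Because each sub-enumerator's tasks live in an executor it owns, restarting one sub-enumerator never leaves stale tasks in a shared pool.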
-
Nested Class Summary
Nested Classes:
- static class StoppableKafkaEnumContextProxy.HandledFlinkKafkaException - General exception to signal to internal exception handling mechanisms that a benign error occurred.
- static interface StoppableKafkaEnumContextProxy.StoppableKafkaEnumContextProxyFactory - This factory exposes a way to override the StoppableKafkaEnumContextProxy used in the enumerator.
-
Constructor Summary
Constructors:
- StoppableKafkaEnumContextProxy(String kafkaClusterId, KafkaMetadataService kafkaMetadataService, org.apache.flink.api.connector.source.SplitEnumeratorContext<DynamicKafkaSourceSplit> enumContext, Runnable signalNoMoreSplitsCallback) - Constructor for the enumerator context.
-
Method Summary
- void assignSplits(org.apache.flink.api.connector.source.SplitsAssignment<KafkaPartitionSplit> newSplitAssignments) - Wrap splits with cluster metadata.
- <T> void callAsync(Callable<T> callable, java.util.function.BiConsumer<T, Throwable> handler) - Execute the one-time callables in the coordinator.
- <T> void callAsync(Callable<T> callable, java.util.function.BiConsumer<T, Throwable> handler, long initialDelay, long period) - Schedule a task via the internal thread pool so that the task handler callback executes in the single-threaded source coordinator thread pool, avoiding the need for synchronization.
- void close() - Note that we can't close the source coordinator here, because these contexts can be closed during a metadata change when the coordinator still needs to continue to run.
- int currentParallelism()
- boolean isNoMoreSplits()
- org.apache.flink.metrics.groups.SplitEnumeratorMetricGroup metricGroup()
- java.util.Map<Integer, org.apache.flink.api.connector.source.ReaderInfo> registeredReaders()
- void runInCoordinatorThread(Runnable runnable)
- void sendEventToSourceReader(int subtaskId, org.apache.flink.api.connector.source.SourceEvent event)
- void signalNoMoreSplits(int subtask)
- protected <T> Callable<T> wrapCallAsyncCallable(Callable<T> callable) - Wraps a callable invoked via callAsync so that it executes in the worker thread pool with exception propagation, keeping IO work off the coordinator thread.
- protected <T> java.util.function.BiConsumer<T, Throwable> wrapCallAsyncCallableHandler(java.util.function.BiConsumer<T, Throwable> mainHandler) - Handle an exception propagated by a callable, executed on the coordinator thread.

Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.flink.api.connector.source.SplitEnumeratorContext:
assignSplit, registeredReadersOfAttempts, sendEventToSourceReader, setIsProcessingBacklog
-
Constructor Details
-
StoppableKafkaEnumContextProxy
public StoppableKafkaEnumContextProxy(String kafkaClusterId, KafkaMetadataService kafkaMetadataService, org.apache.flink.api.connector.source.SplitEnumeratorContext<DynamicKafkaSourceSplit> enumContext, @Nullable Runnable signalNoMoreSplitsCallback)
Constructor for the enumerator context.
- Parameters:
kafkaClusterId - the Kafka cluster id, in order to maintain the mapping to the sub KafkaSourceEnumerator
kafkaMetadataService - the Kafka metadata service to facilitate error handling
enumContext - the underlying enumerator context
signalNoMoreSplitsCallback - the callback invoked when signalNoMoreSplits is triggered
-
-
Method Details
-
metricGroup
public org.apache.flink.metrics.groups.SplitEnumeratorMetricGroup metricGroup()
- Specified by:
metricGroup in interface org.apache.flink.api.connector.source.SplitEnumeratorContext<KafkaPartitionSplit>
-
sendEventToSourceReader
public void sendEventToSourceReader(int subtaskId, org.apache.flink.api.connector.source.SourceEvent event)
- Specified by:
sendEventToSourceReader in interface org.apache.flink.api.connector.source.SplitEnumeratorContext<KafkaPartitionSplit>
-
currentParallelism
public int currentParallelism()
- Specified by:
currentParallelism in interface org.apache.flink.api.connector.source.SplitEnumeratorContext<KafkaPartitionSplit>
-
registeredReaders
public java.util.Map<Integer, org.apache.flink.api.connector.source.ReaderInfo> registeredReaders()
- Specified by:
registeredReaders in interface org.apache.flink.api.connector.source.SplitEnumeratorContext<KafkaPartitionSplit>
-
assignSplits
public void assignSplits(org.apache.flink.api.connector.source.SplitsAssignment<KafkaPartitionSplit> newSplitAssignments)
Wrap splits with cluster metadata.
- Specified by:
assignSplits in interface org.apache.flink.api.connector.source.SplitEnumeratorContext<KafkaPartitionSplit>
-
signalNoMoreSplits
public void signalNoMoreSplits(int subtask)
- Specified by:
signalNoMoreSplits in interface org.apache.flink.api.connector.source.SplitEnumeratorContext<KafkaPartitionSplit>
-
callAsync
public <T> void callAsync(Callable<T> callable, java.util.function.BiConsumer<T, Throwable> handler)
Execute the one-time callables in the coordinator.
- Specified by:
callAsync in interface org.apache.flink.api.connector.source.SplitEnumeratorContext<KafkaPartitionSplit>
-
callAsync
public <T> void callAsync(Callable<T> callable, java.util.function.BiConsumer<T, Throwable> handler, long initialDelay, long period)
Schedules the task via the internal thread pool and proxies it so that the task handler callback executes in the single-threaded source coordinator thread pool, avoiding the need for synchronization. Keeping the scheduled task in the internal thread pool also allows us to cancel the task when the context needs to close due to a dynamic enumerator restart.
In the case of KafkaEnumerator partition discovery, the callback modifies KafkaEnumerator object state.
- Specified by:
callAsync in interface org.apache.flink.api.connector.source.SplitEnumeratorContext<KafkaPartitionSplit>
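The proxying described above can be sketched as follows, assuming a hypothetical `CoordinatorCalls` stand-in for the coordinator context's one-shot callAsync (these names are illustrative, not the Flink API): each tick of the proxy's own scheduler re-submits the work as a one-shot call, so the handler still executes on the coordinator thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

// Hypothetical stand-in for the one-shot callAsync of the real coordinator
// context, which runs the handler on the coordinator main thread.
interface CoordinatorCalls {
    <T> void callAsync(Callable<T> callable, BiConsumer<T, Throwable> handler);
}

// Sketch: each tick of the proxy's own scheduler re-submits the work as a
// one-shot callAsync, so the handler still runs on the coordinator thread and
// enumerator state needs no extra synchronization; closing the proxy cancels
// the periodic re-submission without affecting the coordinator.
final class PeriodicProxySketch implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final CoordinatorCalls coordinator;

    PeriodicProxySketch(CoordinatorCalls coordinator) {
        this.coordinator = coordinator;
    }

    <T> void callAsync(Callable<T> callable, BiConsumer<T, Throwable> handler,
                       long initialDelayMs, long periodMs) {
        scheduler.scheduleAtFixedRate(
                () -> coordinator.callAsync(callable, handler),
                initialDelayMs, periodMs, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

The design choice here is that only the scheduling lives in the proxy-owned pool; the handler execution is delegated back to the coordinator, which is what preserves the single-threaded access to enumerator state.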
-
runInCoordinatorThread
public void runInCoordinatorThread(Runnable runnable)
- Specified by:
runInCoordinatorThread in interface org.apache.flink.api.connector.source.SplitEnumeratorContext<KafkaPartitionSplit>
-
isNoMoreSplits
public boolean isNoMoreSplits()
-
close
public void close() throws Exception
Note that we can't close the source coordinator here, because these contexts can be closed during a metadata change while the coordinator still needs to continue running. We can only close the coordinator context at Flink job shutdown, which Flink does for us. That is why this class requires the complexity of internal thread pools.
TODO: Attach Flink JIRA ticket -- discuss with upstream how to cancel scheduled tasks belonging to the enumerator.
- Specified by:
close in interface AutoCloseable
- Throws:
Exception
-
wrapCallAsyncCallable
protected <T> Callable<T> wrapCallAsyncCallable(Callable<T> callable)
Wraps a callable invoked via callAsync so that it executes in the worker thread pool with exception propagation, keeping IO work off the coordinator thread.
-
wrapCallAsyncCallableHandler
protected <T> java.util.function.BiConsumer<T, Throwable> wrapCallAsyncCallableHandler(java.util.function.BiConsumer<T, Throwable> mainHandler)
Handles an exception propagated by a callable, executing on the coordinator thread. Depending on the condition(s), the exception may be swallowed or forwarded. This is the handler for the Kafka topic partition discovery callable.
-
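The swallow-or-forward behavior can be sketched as follows (hypothetical names throughout; the `clusterStillActive` supplier stands in for a KafkaMetadataService lookup): errors from a cluster that has been removed from the metadata are treated as benign and dropped, while everything else reaches the original handler.

```java
import java.util.function.BiConsumer;
import java.util.function.Supplier;

// Sketch with hypothetical names: wrap a handler so that errors coming from a
// Kafka cluster that is no longer active are swallowed as benign, while all
// other outcomes (successes and real errors) are forwarded unchanged.
final class HandlerWrapSketch {
    static <T> BiConsumer<T, Throwable> wrapHandler(
            BiConsumer<T, Throwable> mainHandler,
            Supplier<Boolean> clusterStillActive) { // stand-in for a metadata service check
        return (result, error) -> {
            if (error != null && !clusterStillActive.get()) {
                // Benign: the cluster was deactivated mid-flight (e.g. during
                // a metadata change), so discovery failures are expected.
                return;
            }
            mainHandler.accept(result, error);
        };
    }
}
```

This keeps transient restart noise out of the coordinator's failure path while still surfacing genuine errors from active clusters.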