Class AsyncIntervalJoinOperator<K,T1,T2,OUT>
- java.lang.Object
-
- org.apache.flink.streaming.api.operators.AbstractStreamOperator<OUT>
-
- org.apache.flink.runtime.asyncprocessing.operators.AbstractAsyncKeyOrderedStreamOperator<OUT>
-
- org.apache.flink.runtime.asyncprocessing.operators.AbstractAsyncStateStreamOperator<OUT>
-
- org.apache.flink.runtime.asyncprocessing.operators.AbstractAsyncStateUdfStreamOperator<OUT,ProcessJoinFunction<T1,T2,OUT>>
-
- org.apache.flink.runtime.asyncprocessing.operators.co.AsyncIntervalJoinOperator<K,T1,T2,OUT>
-
- Type Parameters:
K - The type of the key based on which we join elements.
T1 - The type of the elements in the left stream.
T2 - The type of the elements in the right stream.
OUT - The output type created by the user-defined function.
- All Implemented Interfaces:
Serializable, org.apache.flink.api.common.state.CheckpointListener, KeyContext, KeyContextHandler, org.apache.flink.streaming.api.operators.OutputTypeConfigurable<OUT>, StreamOperator<OUT>, StreamOperatorStateHandler.CheckpointedStreamOperator, Triggerable<K,String>, TwoInputStreamOperator<T1,T2,OUT>, UserFunctionProvider<ProcessJoinFunction<T1,T2,OUT>>, YieldingOperator<OUT>, AsyncKeyOrderedProcessing, AsyncKeyOrderedProcessingOperator
@Internal public class AsyncIntervalJoinOperator<K,T1,T2,OUT> extends AbstractAsyncStateUdfStreamOperator<OUT,ProcessJoinFunction<T1,T2,OUT>> implements TwoInputStreamOperator<T1,T2,OUT>, Triggerable<K,String>
An operator to execute time-bounded stream inner joins. This is the async state access version of IntervalJoinOperator.
By using a configurable lower and upper bound, this operator emits exactly those pairs (T1, T2) where T2.ts ∈ [T1.ts + lowerBound, T1.ts + upperBound]. Both the lower and the upper bound can be configured to be either inclusive or exclusive.
As soon as elements are joined, they are passed to a user-defined ProcessJoinFunction.
The basic idea of this implementation is as follows: whenever we receive an element at processElement1(StreamRecord) (a.k.a. the left side), we add it to the left buffer. We then check the right buffer to see whether there are any elements that can be joined. If there are, they are joined and passed to the aforementioned function. The same happens the other way around when receiving an element on the right side.
Whenever a pair of elements is emitted, it is assigned the larger of the two elements' timestamps.
To keep the element buffers from growing indefinitely, a cleanup timer is registered per element. This timer indicates when an element is no longer considered for joining and can be removed from the state.
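The interval condition and the output timestamp rule described above can be illustrated with a minimal, Flink-free sketch. The class and method names (IntervalBoundsSketch, isJoinable, outputTimestamp) are hypothetical and exist only for this illustration; they are not part of the operator's API, and the operator's actual implementation may differ.

```java
// Hypothetical illustration only: not taken from the Flink code base.
final class IntervalBoundsSketch {

    /**
     * Returns true if rightTs lies in the window
     * [leftTs + lowerBound, leftTs + upperBound], where each end of the
     * window is inclusive or exclusive depending on the two flags.
     */
    static boolean isJoinable(
            long leftTs,
            long rightTs,
            long lowerBound,
            long upperBound,
            boolean lowerBoundInclusive,
            boolean upperBoundInclusive) {
        long lo = leftTs + lowerBound;
        long hi = leftTs + upperBound;
        boolean aboveLower = lowerBoundInclusive ? rightTs >= lo : rightTs > lo;
        boolean belowUpper = upperBoundInclusive ? rightTs <= hi : rightTs < hi;
        return aboveLower && belowUpper;
    }

    /** An emitted pair carries the larger of the two element timestamps. */
    static long outputTimestamp(long leftTs, long rightTs) {
        return Math.max(leftTs, rightTs);
    }

    public static void main(String[] args) {
        // Left element at ts=10 with bounds [-2, 1] gives the window [8, 11].
        System.out.println(isJoinable(10, 8, -2, 1, true, true));  // true
        System.out.println(isJoinable(10, 8, -2, 1, false, true)); // false
        System.out.println(outputTimestamp(10, 8));                // 10
    }
}
```

With an exclusive lower bound, a right element whose timestamp sits exactly on the edge of the window (8 in the example) is no longer a match.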
- See Also:
- Serialized Form
-
-
Field Summary
-
Fields inherited from class org.apache.flink.runtime.asyncprocessing.operators.AbstractAsyncStateUdfStreamOperator
declarationContext, userFunction
-
Fields inherited from class org.apache.flink.runtime.asyncprocessing.operators.AbstractAsyncKeyOrderedStreamOperator
asyncExecutionController, currentProcessingContext, declarationManager, environment
-
Fields inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
combinedWatermark, config, lastRecordAttributes1, lastRecordAttributes2, latencyStats, metrics, output, processingTimeService, stateHandler, stateKeySelector1, stateKeySelector2, timeServiceManager
-
-
Constructor Summary
Constructors
Constructor - Description
AsyncIntervalJoinOperator(long lowerBound, long upperBound, boolean lowerBoundInclusive, boolean upperBoundInclusive, org.apache.flink.util.OutputTag<T1> leftLateDataOutputTag, org.apache.flink.util.OutputTag<T2> rightLateDataOutputTag, org.apache.flink.api.common.typeutils.TypeSerializer<T1> leftTypeSerializer, org.apache.flink.api.common.typeutils.TypeSerializer<T2> rightTypeSerializer, ProcessJoinFunction<T1,T2,OUT> udf) - Creates a new AsyncIntervalJoinOperator.
-
Method Summary
All Methods, Instance Methods, Concrete Methods
Modifier and Type, Method - Description
org.apache.flink.api.common.state.v2.MapState<Long,List<IntervalJoinOperator.BufferEntry<T1>>> getLeftBuffer()
org.apache.flink.api.common.state.v2.MapState<Long,List<IntervalJoinOperator.BufferEntry<T2>>> getRightBuffer()
void onEventTime(InternalTimer<K,String> timer) - Invoked when an event-time timer fires.
void onProcessingTime(InternalTimer<K,String> timer) - Invoked when a processing-time timer fires.
void open() - This method is called immediately before any elements are processed. It should contain the operator's initialization logic, e.g. state initialization.
void processElement1(StreamRecord<T1> record) - Process a StreamRecord from the left stream.
void processElement2(StreamRecord<T2> record) - Process a StreamRecord from the right stream.
protected <T> void sideOutput(T value, long timestamp, boolean isLeft) - Writes a skipped late-arriving element to the side output.
Methods inherited from class org.apache.flink.runtime.asyncprocessing.operators.AbstractAsyncStateUdfStreamOperator
close, finish, getUserFunction, initializeState, notifyCheckpointAborted, notifyCheckpointComplete, setOutputType, setup, snapshotState
-
Methods inherited from class org.apache.flink.runtime.asyncprocessing.operators.AbstractAsyncStateStreamOperator
createAsyncExecutionController, getKeySelectorForAsyncKeyedContext
-
Methods inherited from class org.apache.flink.runtime.asyncprocessing.operators.AbstractAsyncKeyOrderedStreamOperator
asyncProcessWithKey, beforeInitializeStateHandler, drainStateRequests, getAsyncKeyedStateBackend, getCurrentKey, getDeclarationManager, getElementOrder, getInternalTimerService, getOrCreateKeyedState, getRecordProcessor, handleAsyncException, isAsyncKeyOrderedProcessingEnabled, newKeySelected, postProcessElement, postProcessWatermark, prepareSnapshotPreBarrier, preProcessWatermark, preserveRecordOrderAndProcess, processNonRecord, processRecordAttributes, processRecordAttributes1, processRecordAttributes2, processWatermark, processWatermark, processWatermark1, processWatermark1Internal, processWatermark2, processWatermark2Internal, processWatermarkInternal, processWatermarkStatus, processWatermarkStatus, reportOrForwardLatencyMarker, setAsyncKeyedContextElement, setKeyContextElement1, setKeyContextElement2
-
Methods inherited from class org.apache.flink.streaming.api.operators.AbstractStreamOperator
getContainingTask, getExecutionConfig, getKeyedStateBackend, getKeyedStateStore, getMetricGroup, getOperatorConfig, getOperatorID, getOperatorName, getOperatorStateBackend, getOrCreateKeyedState, getPartitionedState, getPartitionedState, getProcessingTimeService, getRuntimeContext, getStateKeySelector1, getStateKeySelector2, getTimeServiceManager, getUserCodeClassloader, hasKeyContext1, hasKeyContext2, initializeState, isUsingCustomRawKeyedState, processLatencyMarker, processLatencyMarker1, processLatencyMarker2, processWatermark1, processWatermark2, processWatermarkStatus1, processWatermarkStatus2, setCurrentKey, setMailboxExecutor, setProcessingTimeService, snapshotState, useSplittableTimers
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.flink.streaming.runtime.operators.asyncprocessing.AsyncKeyOrderedProcessing
getRecordProcessor, isAsyncKeyOrderedProcessingEnabled
-
Methods inherited from interface org.apache.flink.streaming.runtime.operators.asyncprocessing.AsyncKeyOrderedProcessingOperator
asyncProcessWithKey, getDeclarationManager, getElementOrder, postProcessElement, preserveRecordOrderAndProcess, setAsyncKeyedContextElement
-
Methods inherited from interface org.apache.flink.api.common.state.CheckpointListener
notifyCheckpointAborted, notifyCheckpointComplete
-
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContext
getCurrentKey, setCurrentKey
-
Methods inherited from interface org.apache.flink.streaming.api.operators.KeyContextHandler
hasKeyContext
-
Methods inherited from interface org.apache.flink.streaming.api.operators.StreamOperator
close, finish, getMetricGroup, getOperatorAttributes, getOperatorID, initializeState, prepareSnapshotPreBarrier, setKeyContextElement1, setKeyContextElement2, snapshotState
-
Methods inherited from interface org.apache.flink.streaming.api.operators.TwoInputStreamOperator
processLatencyMarker1, processLatencyMarker2, processRecordAttributes1, processRecordAttributes2, processWatermark1, processWatermark1, processWatermark2, processWatermark2, processWatermarkStatus1, processWatermarkStatus2
-
Constructor Detail
-
AsyncIntervalJoinOperator
public AsyncIntervalJoinOperator(long lowerBound, long upperBound, boolean lowerBoundInclusive, boolean upperBoundInclusive, org.apache.flink.util.OutputTag<T1> leftLateDataOutputTag, org.apache.flink.util.OutputTag<T2> rightLateDataOutputTag, org.apache.flink.api.common.typeutils.TypeSerializer<T1> leftTypeSerializer, org.apache.flink.api.common.typeutils.TypeSerializer<T2> rightTypeSerializer, ProcessJoinFunction<T1,T2,OUT> udf)
Creates a new AsyncIntervalJoinOperator.
- Parameters:
lowerBound - The lower bound for evaluating if elements should be joined
upperBound - The upper bound for evaluating if elements should be joined
lowerBoundInclusive - Whether or not to include elements where the timestamp matches the lower bound
upperBoundInclusive - Whether or not to include elements where the timestamp matches the upper bound
udf - A user-defined ProcessJoinFunction that gets called whenever two elements of T1 and T2 are joined
-
-
Method Detail
-
open
public void open() throws Exception
Description copied from class: AbstractStreamOperator
This method is called immediately before any elements are processed. It should contain the operator's initialization logic, e.g. state initialization.
The default implementation does nothing.
- Specified by:
open in interface StreamOperator<OUT>
- Overrides:
open in class AbstractAsyncStateUdfStreamOperator<OUT,ProcessJoinFunction<T1,T2,OUT>>
- Throws:
Exception - An exception in this method causes the operator to fail.
-
processElement1
public void processElement1(StreamRecord<T1> record) throws Exception
Process a StreamRecord from the left stream. Whenever a StreamRecord arrives at the left stream, it is added to the left buffer. Possible join candidates for that element are looked up in the right buffer, and if a pair lies within the user-defined boundaries, it is passed to the ProcessJoinFunction.
- Specified by:
processElement1 in interface TwoInputStreamOperator<T1,T2,OUT>
- Parameters:
record - An incoming record to be joined
- Throws:
Exception - Can throw an exception during state access
-
processElement2
public void processElement2(StreamRecord<T2> record) throws Exception
Process a StreamRecord from the right stream. Whenever a StreamRecord arrives at the right stream, it is added to the right buffer. Possible join candidates for that element are looked up in the left buffer, and if a pair lies within the user-defined boundaries, it is passed to the ProcessJoinFunction.
- Specified by:
processElement2 in interface TwoInputStreamOperator<T1,T2,OUT>
- Parameters:
record - An incoming record to be joined
- Throws:
Exception - Can throw an exception during state access
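The buffer-then-probe flow described for processElement1 and processElement2 can be sketched with plain Java collections. This is a simplified stand-in, not the operator's code: it assumes both bounds are inclusive, uses a TreeMap keyed by timestamp in place of the MapState buffers, ignores keys, lateness and cleanup, and all names (TwoSidedBufferSketch, processLeft, processRight, emitted) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: not taken from the Flink code base.
final class TwoSidedBufferSketch {
    private final long lowerBound;
    private final long upperBound;
    // Buffers keyed by element timestamp; several elements may share one timestamp.
    private final TreeMap<Long, List<String>> leftBuffer = new TreeMap<>();
    private final TreeMap<Long, List<String>> rightBuffer = new TreeMap<>();
    final List<String> emitted = new ArrayList<>();

    TwoSidedBufferSketch(long lowerBound, long upperBound) {
        if (lowerBound > upperBound) {
            throw new IllegalArgumentException("lowerBound must not exceed upperBound");
        }
        this.lowerBound = lowerBound;
        this.upperBound = upperBound;
    }

    /** Buffer a left element, then probe the right buffer for partners. */
    void processLeft(String value, long ts) {
        leftBuffer.computeIfAbsent(ts, k -> new ArrayList<>()).add(value);
        // Partners satisfy right.ts in [ts + lowerBound, ts + upperBound].
        Map<Long, List<String>> candidates =
                rightBuffer.subMap(ts + lowerBound, true, ts + upperBound, true);
        for (Map.Entry<Long, List<String>> e : candidates.entrySet()) {
            for (String right : e.getValue()) {
                emit(value, right, Math.max(ts, e.getKey()));
            }
        }
    }

    /** Buffer a right element, then probe the left buffer for partners. */
    void processRight(String value, long ts) {
        rightBuffer.computeIfAbsent(ts, k -> new ArrayList<>()).add(value);
        // Same condition solved for the left side:
        // left.ts in [ts - upperBound, ts - lowerBound].
        Map<Long, List<String>> candidates =
                leftBuffer.subMap(ts - upperBound, true, ts - lowerBound, true);
        for (Map.Entry<Long, List<String>> e : candidates.entrySet()) {
            for (String left : e.getValue()) {
                emit(left, value, Math.max(ts, e.getKey()));
            }
        }
    }

    private void emit(String left, String right, long ts) {
        emitted.add(left + "+" + right + "@" + ts);
    }

    public static void main(String[] args) {
        TwoSidedBufferSketch join = new TwoSidedBufferSketch(-2, 1);
        join.processLeft("a", 10);
        join.processRight("x", 8);  // 8 is in [10 - 2, 10 + 1], joins with "a"
        join.processRight("y", 12); // 12 is outside the window of "a"
        join.processLeft("b", 11);  // window [9, 12] contains "y" only
        System.out.println(join.emitted); // [a+x@10, b+y@12]
    }
}
```

Because each side buffers before probing, a pair is emitted exactly once, by whichever of its two elements arrives second.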
-
sideOutput
protected <T> void sideOutput(T value, long timestamp, boolean isLeft)
Writes a skipped late-arriving element to the side output.
-
onEventTime
public void onEventTime(InternalTimer<K,String> timer) throws Exception
Description copied from interface: Triggerable
Invoked when an event-time timer fires.
- Specified by:
onEventTime in interface Triggerable<K,String>
- Throws:
Exception
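The class description mentions a per-element cleanup timer that marks when an element can no longer take part in a join. One way such a cleanup time could be derived from the bounds is sketched below. This is an assumption made for illustration, reasoned from the interval condition T2.ts ∈ [T1.ts + lowerBound, T1.ts + upperBound]; it is not taken from the operator's source, and the names (CleanupTimeSketch, leftCleanupTime, rightCleanupTime) are hypothetical.

```java
// Hypothetical illustration only: an assumed rule, not the operator's code.
final class CleanupTimeSketch {

    /**
     * A left element at leftTs can match right elements up to
     * leftTs + upperBound. If upperBound is not positive, that window
     * ends no later than leftTs, so the element's own timestamp is used.
     */
    static long leftCleanupTime(long leftTs, long upperBound) {
        return upperBound > 0 ? leftTs + upperBound : leftTs;
    }

    /**
     * A right element at rightTs can match left elements up to
     * rightTs - lowerBound. If lowerBound is not negative, that window
     * ends no later than rightTs, so the element's own timestamp is used.
     */
    static long rightCleanupTime(long rightTs, long lowerBound) {
        return lowerBound < 0 ? rightTs - lowerBound : rightTs;
    }

    public static void main(String[] args) {
        System.out.println(leftCleanupTime(10, 5));   // 15
        System.out.println(leftCleanupTime(10, -3));  // 10
        System.out.println(rightCleanupTime(10, -2)); // 12
        System.out.println(rightCleanupTime(10, 1));  // 10
    }
}
```

Under this rule, once event time passes the computed value, no on-time element from the other stream can still fall inside the window, so the buffered element can be removed.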
-
onProcessingTime
public void onProcessingTime(InternalTimer<K,String> timer) throws Exception
Description copied from interface: Triggerable
Invoked when a processing-time timer fires.
- Specified by:
onProcessingTime in interface Triggerable<K,String>
- Throws:
Exception
-
getLeftBuffer
@VisibleForTesting public org.apache.flink.api.common.state.v2.MapState<Long,List<IntervalJoinOperator.BufferEntry<T1>>> getLeftBuffer()
-
getRightBuffer
@VisibleForTesting public org.apache.flink.api.common.state.v2.MapState<Long,List<IntervalJoinOperator.BufferEntry<T2>>> getRightBuffer()
-
-