Interface KeyedPartitionStream<K,T>
- All Superinterfaces: DataStream
- All Known Subinterfaces: KeyedPartitionStream.ProcessConfigurableAndKeyedPartitionStream<K,T>
@Experimental public interface KeyedPartitionStream<K,T> extends DataStream
This interface represents a kind of partitioned data stream. For this stream, each key is a partition, and the partition to which the data belongs is deterministic.
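The idea that each key is its own partition can be illustrated without Flink at all. The following is a minimal plain-Java sketch, not Flink code: the class `KeyedPartitionSketch` and the method `partitionByKey` are hypothetical names introduced only for this illustration, and a `java.util.function.Function` stands in for a key selector.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Plain-Java model of keyed partitioning (hypothetical helper, not part of Flink). */
final class KeyedPartitionSketch {

    /** Groups records so that every distinct key owns exactly one partition. */
    static <K, T> Map<K, List<T>> partitionByKey(List<T> records, Function<T, K> keySelector) {
        Map<K, List<T>> partitions = new LinkedHashMap<>();
        for (T record : records) {
            // The extracted key alone decides the partition, so the assignment is deterministic.
            K key = keySelector.apply(record);
            partitions.computeIfAbsent(key, k -> new ArrayList<>()).add(record);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<String> words = List.of("apple", "avocado", "banana", "blueberry", "cherry");
        // Key each word by its first character.
        Map<Character, List<String>> byInitial = partitionByKey(words, w -> w.charAt(0));
        System.out.println(byInitial);
    }
}
```

Running the same selector over the same records always yields the same grouping, which is the determinism the description refers to.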
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Interface, Description)
- static interface KeyedPartitionStream.ProcessConfigurableAndKeyedPartitionStream<K,T>
  This interface represents a configurable KeyedPartitionStream.
- static interface KeyedPartitionStream.ProcessConfigurableAndTwoKeyedPartitionStreams<K,T1,T2>
  This class represents a combination of two KeyedPartitionStream.
-
Method Summary
Methods (Modifier and Type, Method, Description)
- BroadcastStream<T> broadcast()
  Transform this stream to a new BroadcastStream.
- <T_OTHER,OUT> NonKeyedPartitionStream.ProcessConfigurableAndNonKeyedPartitionStream<OUT> connectAndProcess(BroadcastStream<T_OTHER> other, TwoInputBroadcastStreamProcessFunction<T,T_OTHER,OUT> processFunction)
  Apply a two input operation to this and other BroadcastStream.
- <T_OTHER,OUT> KeyedPartitionStream.ProcessConfigurableAndKeyedPartitionStream<K,OUT> connectAndProcess(BroadcastStream<T_OTHER> other, TwoInputBroadcastStreamProcessFunction<T,T_OTHER,OUT> processFunction, org.apache.flink.api.java.functions.KeySelector<OUT,K> newKeySelector)
  Apply a two input operation to this and other BroadcastStream.
- <T_OTHER,OUT> NonKeyedPartitionStream.ProcessConfigurableAndNonKeyedPartitionStream<OUT> connectAndProcess(KeyedPartitionStream<K,T_OTHER> other, TwoInputNonBroadcastStreamProcessFunction<T,T_OTHER,OUT> processFunction)
  Apply a two input operation to this and other KeyedPartitionStream.
- <T_OTHER,OUT> KeyedPartitionStream.ProcessConfigurableAndKeyedPartitionStream<K,OUT> connectAndProcess(KeyedPartitionStream<K,T_OTHER> other, TwoInputNonBroadcastStreamProcessFunction<T,T_OTHER,OUT> processFunction, org.apache.flink.api.java.functions.KeySelector<OUT,K> newKeySelector)
  Apply a two input operation to this and other KeyedPartitionStream. The two keyed streams must have the same partitions, otherwise it makes no sense to connect them.
- GlobalStream<T> global()
  Coalesce this stream to a GlobalStream.
- <NEW_KEY> KeyedPartitionStream<NEW_KEY,T> keyBy(org.apache.flink.api.java.functions.KeySelector<T,NEW_KEY> keySelector)
  Transform this stream to a new KeyedPartitionStream.
- <OUT> NonKeyedPartitionStream.ProcessConfigurableAndNonKeyedPartitionStream<OUT> process(OneInputStreamProcessFunction<T,OUT> processFunction)
  Apply an operation to this KeyedPartitionStream.
- <OUT> KeyedPartitionStream.ProcessConfigurableAndKeyedPartitionStream<K,OUT> process(OneInputStreamProcessFunction<T,OUT> processFunction, org.apache.flink.api.java.functions.KeySelector<OUT,K> newKeySelector)
  Apply an operation to this KeyedPartitionStream.
- <OUT1,OUT2> NonKeyedPartitionStream.ProcessConfigurableAndTwoNonKeyedPartitionStream<OUT1,OUT2> process(TwoOutputStreamProcessFunction<T,OUT1,OUT2> processFunction)
  Apply a two output operation to this KeyedPartitionStream.
- <OUT1,OUT2> KeyedPartitionStream.ProcessConfigurableAndTwoKeyedPartitionStreams<K,OUT1,OUT2> process(TwoOutputStreamProcessFunction<T,OUT1,OUT2> processFunction, org.apache.flink.api.java.functions.KeySelector<OUT1,K> keySelector1, org.apache.flink.api.java.functions.KeySelector<OUT2,K> keySelector2)
  Apply a two output operation to this KeyedPartitionStream.
- NonKeyedPartitionStream<T> shuffle()
  Transform this stream to a new NonKeyedPartitionStream, data will be shuffled between these two streams.
- ProcessConfigurable<?> toSink(org.apache.flink.api.connector.dsv2.Sink<T> sink)
-
-
-
Method Detail
-
process
<OUT> KeyedPartitionStream.ProcessConfigurableAndKeyedPartitionStream<K,OUT> process(OneInputStreamProcessFunction<T,OUT> processFunction, org.apache.flink.api.java.functions.KeySelector<OUT,K> newKeySelector)
Apply an operation to this KeyedPartitionStream.
This method is used to avoid a shuffle after applying the process function. For any given record, the new KeySelector must extract the same key as the original KeySelector on this KeyedPartitionStream. Otherwise, the partitioning of the data becomes inconsistent.
- Parameters:
  processFunction - to perform operation.
  newKeySelector - to select the key after process.
- Returns:
  new KeyedPartitionStream with this operation.
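The requirement that the new key selector extracts the same key as the original one can be checked on sample data before wiring up a job. The sketch below is plain Java rather than Flink code: `KeyPreservationCheck` and `preservesKey` are hypothetical names, each `Function` stands in for a key selector or process function, and the model assumes exactly one output per input record, which a real process function does not have to honor.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

/** Hypothetical helper that tests the key-preservation requirement on sample records. */
final class KeyPreservationCheck {

    /**
     * Returns true if, for every sample, the key of the processed record
     * equals the key of the input record.
     */
    static <T, OUT, K> boolean preservesKey(
            List<T> samples,
            Function<T, K> originalKeySelector,
            Function<T, OUT> processFunction,
            Function<OUT, K> newKeySelector) {
        for (T input : samples) {
            K before = originalKeySelector.apply(input);
            K after = newKeySelector.apply(processFunction.apply(input));
            if (!Objects.equals(before, after)) {
                return false; // this record would claim a different partition
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Records are "user:amount" strings; the stream is keyed by user.
        List<String> orders = List.of("alice:3", "bob:5", "alice:7");
        Function<String, String> userOf = o -> o.substring(0, o.indexOf(':'));
        Function<String, String> amountOf = o -> o.substring(o.indexOf(':') + 1);
        // The process step doubles the amount and leaves the user untouched.
        Function<String, String> doubleAmount =
                o -> userOf.apply(o) + ":" + 2 * Integer.parseInt(amountOf.apply(o));

        System.out.println(preservesKey(orders, userOf, doubleAmount, userOf));   // true
        System.out.println(preservesKey(orders, userOf, doubleAmount, amountOf)); // false
    }
}
```

A selector that passes this kind of check on representative data is a candidate for the shuffle-free overload; one that fails it calls for the single-argument overload followed by keyBy.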
-
process
<OUT> NonKeyedPartitionStream.ProcessConfigurableAndNonKeyedPartitionStream<OUT> process(OneInputStreamProcessFunction<T,OUT> processFunction)
Apply an operation to this KeyedPartitionStream.
Generally, applying an operation to a KeyedPartitionStream results in a NonKeyedPartitionStream, and you can manually generate a KeyedPartitionStream again via keyBy partitioning. If you can guarantee that the partition on which the data is processed will not change, you can use process(OneInputStreamProcessFunction, KeySelector) to avoid shuffling.
- Parameters:
  processFunction - to perform operation.
- Returns:
  new NonKeyedPartitionStream with this operation.
-
process
<OUT1,OUT2> KeyedPartitionStream.ProcessConfigurableAndTwoKeyedPartitionStreams<K,OUT1,OUT2> process(TwoOutputStreamProcessFunction<T,OUT1,OUT2> processFunction, org.apache.flink.api.java.functions.KeySelector<OUT1,K> keySelector1, org.apache.flink.api.java.functions.KeySelector<OUT2,K> keySelector2)
Apply a two output operation to this KeyedPartitionStream.
This method is used to avoid a shuffle after applying the process function. For any given record, the two new KeySelectors must each extract the same key as the original KeySelector on this KeyedPartitionStream. Otherwise, the partitioning of the data becomes inconsistent.
- Parameters:
  processFunction - to perform two output operation.
  keySelector1 - to select the key of first output.
  keySelector2 - to select the key of second output.
- Returns:
  new KeyedPartitionStream.ProcessConfigurableAndTwoKeyedPartitionStreams with this operation.
-
process
<OUT1,OUT2> NonKeyedPartitionStream.ProcessConfigurableAndTwoNonKeyedPartitionStream<OUT1,OUT2> process(TwoOutputStreamProcessFunction<T,OUT1,OUT2> processFunction)
Apply a two output operation to this KeyedPartitionStream.
- Parameters:
  processFunction - to perform two output operation.
- Returns:
  new NonKeyedPartitionStream.ProcessConfigurableAndTwoNonKeyedPartitionStream with this operation.
-
connectAndProcess
<T_OTHER,OUT> NonKeyedPartitionStream.ProcessConfigurableAndNonKeyedPartitionStream<OUT> connectAndProcess(KeyedPartitionStream<K,T_OTHER> other, TwoInputNonBroadcastStreamProcessFunction<T,T_OTHER,OUT> processFunction)
Apply a two input operation to this and other KeyedPartitionStream. The two keyed streams must have the same partitions, otherwise it makes no sense to connect them.
Generally, connecting two KeyedPartitionStreams results in a NonKeyedPartitionStream, and you can manually generate a KeyedPartitionStream again via keyBy partitioning. If you can guarantee that the partition on which the data is processed will not change, you can use connectAndProcess(KeyedPartitionStream, TwoInputNonBroadcastStreamProcessFunction, KeySelector) to avoid shuffling.
- Parameters:
  other - KeyedPartitionStream to perform operation with two input.
  processFunction - to perform operation.
- Returns:
  new NonKeyedPartitionStream with this operation.
-
connectAndProcess
<T_OTHER,OUT> KeyedPartitionStream.ProcessConfigurableAndKeyedPartitionStream<K,OUT> connectAndProcess(KeyedPartitionStream<K,T_OTHER> other, TwoInputNonBroadcastStreamProcessFunction<T,T_OTHER,OUT> processFunction, org.apache.flink.api.java.functions.KeySelector<OUT,K> newKeySelector)
Apply a two input operation to this and other KeyedPartitionStream. The two keyed streams must have the same partitions, otherwise it makes no sense to connect them.
This method is used to avoid a shuffle after applying the process function. For any given record, the new KeySelector must extract the same key as the original KeySelectors on these two KeyedPartitionStreams. Otherwise, the partitioning of the data becomes inconsistent.
- Parameters:
  other - KeyedPartitionStream to perform operation with two input.
  processFunction - to perform operation.
  newKeySelector - to select the key after process.
- Returns:
  new KeyedPartitionStream with this operation.
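Why the two keyed streams need the same partitions can be seen in a small model: records only meet when they share a key. The following plain-Java sketch is not Flink code. `CoPartitionedConnectSketch` and `connect` are hypothetical names, each map stands in for a stream already grouped by key, and the pairwise combination is an illustrative assumption (a real two-input process function receives records one at a time from each input).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

/** Plain-Java model of connecting two streams that are partitioned by the same key. */
final class CoPartitionedConnectSketch {

    /** Combines records of both inputs key by key; keys present on one side only never meet. */
    static <K, A, B, OUT> Map<K, List<OUT>> connect(
            Map<K, List<A>> left, Map<K, List<B>> right, BiFunction<A, B, OUT> combine) {
        Map<K, List<OUT>> result = new LinkedHashMap<>();
        for (Map.Entry<K, List<A>> entry : left.entrySet()) {
            List<B> sameKey = right.get(entry.getKey());
            if (sameKey == null) {
                continue; // no record with this key on the other input
            }
            List<OUT> outputs = new ArrayList<>();
            for (A a : entry.getValue()) {
                for (B b : sameKey) {
                    outputs.add(combine.apply(a, b));
                }
            }
            // Every output stays under the key that both inputs share.
            result.put(entry.getKey(), outputs);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> clicks = Map.of("alice", List.of(1, 2), "bob", List.of(3));
        Map<String, List<String>> tiers = Map.of("alice", List.of("gold"), "carol", List.of("basic"));
        System.out.println(connect(clicks, tiers, (click, tier) -> tier + "-" + click));
    }
}
```

In this model only "alice" appears on both sides, so only her records are combined; if the two inputs were keyed differently, matching records would sit in different partitions and could not be processed together.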
-
connectAndProcess
<T_OTHER,OUT> NonKeyedPartitionStream.ProcessConfigurableAndNonKeyedPartitionStream<OUT> connectAndProcess(BroadcastStream<T_OTHER> other, TwoInputBroadcastStreamProcessFunction<T,T_OTHER,OUT> processFunction)
Apply a two input operation to this and other BroadcastStream.
Generally, connecting a KeyedPartitionStream and a BroadcastStream results in a NonKeyedPartitionStream, and you can manually generate a KeyedPartitionStream again via keyBy partitioning. If you can guarantee that the partition on which the data is processed will not change, you can use connectAndProcess(BroadcastStream, TwoInputBroadcastStreamProcessFunction, KeySelector) to avoid shuffling.
- Parameters:
  other - BroadcastStream to perform operation with two input.
  processFunction - to perform operation.
- Returns:
  new stream with this operation.
-
connectAndProcess
<T_OTHER,OUT> KeyedPartitionStream.ProcessConfigurableAndKeyedPartitionStream<K,OUT> connectAndProcess(BroadcastStream<T_OTHER> other, TwoInputBroadcastStreamProcessFunction<T,T_OTHER,OUT> processFunction, org.apache.flink.api.java.functions.KeySelector<OUT,K> newKeySelector)
Apply a two input operation to this and other BroadcastStream.
This method is used to avoid a shuffle after applying the process function. For records from the non-broadcast input, the new KeySelector must extract the same key as the original KeySelector on the KeyedPartitionStream. Otherwise, the partitioning of the data becomes inconsistent. For records from the broadcast input, the output key is taken from the keyed partition itself rather than from the new key selector, so the data they produce does not affect the partitioning.
- Parameters:
  other - BroadcastStream to perform operation with two input.
  processFunction - to perform operation.
  newKeySelector - to select the key after process.
- Returns:
  new KeyedPartitionStream with this operation.
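The treatment of broadcast records described above can be pictured with a small model. This is one reading of that description expressed in plain Java, not Flink code: `BroadcastConnectSketch` and `applyBroadcast` are hypothetical names, and the way a broadcast record is delivered to each partition is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Plain-Java model of a broadcast record being processed in every keyed partition. */
final class BroadcastConnectSketch {

    /**
     * Delivers one broadcast record to every keyed partition. Each output is filed under
     * the key of the partition that produced it; no key selector is consulted.
     */
    static <K, B, OUT> Map<K, List<OUT>> applyBroadcast(
            Iterable<K> partitionKeys, B broadcastRecord, Function<B, OUT> processFunction) {
        Map<K, List<OUT>> outputs = new LinkedHashMap<>();
        for (K key : partitionKeys) {
            OUT out = processFunction.apply(broadcastRecord);
            // The partition's own key is used, so the output cannot change the partitioning.
            outputs.computeIfAbsent(key, k -> new ArrayList<>()).add(out);
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("alice", "bob");
        System.out.println(applyBroadcast(keys, "rate=2", rule -> "applied:" + rule));
    }
}
```

Because the output of the broadcast side inherits the key of the partition it was produced in, only records from the keyed input need to satisfy the key-preservation requirement.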
-
global
GlobalStream<T> global()
Coalesce this stream to a GlobalStream.
- Returns:
  the coalesced global stream.
-
keyBy
<NEW_KEY> KeyedPartitionStream<NEW_KEY,T> keyBy(org.apache.flink.api.java.functions.KeySelector<T,NEW_KEY> keySelector)
Transform this stream to a new KeyedPartitionStream.
- Parameters:
  keySelector - to decide how to map data to partition.
- Returns:
  the transformed stream partitioned by key.
-
shuffle
NonKeyedPartitionStream<T> shuffle()
Transform this stream to a new NonKeyedPartitionStream; data will be shuffled between the two streams.
- Returns:
  the transformed stream after shuffle.
-
broadcast
BroadcastStream<T> broadcast()
Transform this stream to a new BroadcastStream.
- Returns:
  the transformed BroadcastStream.
-
toSink
ProcessConfigurable<?> toSink(org.apache.flink.api.connector.dsv2.Sink<T> sink)
-
-