Class FlinkKafkaProducerBase<IN>

  • Type Parameters:
    IN - Type of the messages to write into Kafka.
    All Implemented Interfaces:
    Serializable, org.apache.flink.api.common.functions.Function, org.apache.flink.api.common.functions.RichFunction, org.apache.flink.streaming.api.checkpoint.CheckpointedFunction, org.apache.flink.streaming.api.functions.sink.SinkFunction<IN>

    @Internal
    public abstract class FlinkKafkaProducerBase<IN>
    extends org.apache.flink.streaming.api.functions.sink.RichSinkFunction<IN>
    implements org.apache.flink.streaming.api.checkpoint.CheckpointedFunction
    Flink Sink to produce data into a Kafka topic.

    Please note that this producer provides at-least-once reliability guarantees when checkpoints are enabled and setFlushOnCheckpoint(true) is set. Otherwise, the producer doesn't provide any reliability guarantees.

    See Also:
    Serialized Form
    • Nested Class Summary

      • Nested classes/interfaces inherited from interface org.apache.flink.streaming.api.functions.sink.SinkFunction

        org.apache.flink.streaming.api.functions.sink.SinkFunction.Context
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected Exception asyncException
      Errors encountered in the async producer are stored here.
      protected org.apache.kafka.clients.producer.Callback callback
      The callback that handles error propagation or logging.
      protected String defaultTopicId
      The name of the default topic this producer is writing data to.
      protected FlinkKafkaPartitioner<IN> flinkKafkaPartitioner
      User-provided partitioner for assigning an object to a Kafka partition for each topic.
      protected boolean flushOnCheckpoint
      If true, the producer will wait until all outstanding records have been sent to the broker.
      static String KEY_DISABLE_METRICS
      Configuration key for disabling the metrics reporting.
      protected boolean logFailuresOnly
      Flag indicating whether to accept failures (and log them), or to fail on failures.
      protected long pendingRecords
      Number of unacknowledged records.
      protected org.apache.flink.util.SerializableObject pendingRecordsLock
      Lock for accessing the pending records.
      protected org.apache.kafka.clients.producer.KafkaProducer<byte[],​byte[]> producer
      KafkaProducer instance.
      protected Properties producerConfig
      User defined properties for the Producer.
      protected KeyedSerializationSchema<IN> schema
      (Serializable) SerializationSchema for turning objects used with Flink into byte[] for Kafka.
      protected Map<String,​int[]> topicPartitionsMap
      Partitions of each topic.
    • Method Summary

      Modifier and Type Method Description
      protected void checkErroneous()  
      void close()  
      protected abstract void flush()
      Flush pending records.
      protected <K,​V>
      org.apache.kafka.clients.producer.KafkaProducer<K,​V>
      getKafkaProducer​(Properties props)
      Used for testing only.
      protected static int[] getPartitionsByTopic​(String topic, org.apache.kafka.clients.producer.KafkaProducer<byte[],​byte[]> producer)  
      static Properties getPropertiesFromBrokerList​(String brokerList)  
      void initializeState​(org.apache.flink.runtime.state.FunctionInitializationContext context)  
      void invoke​(IN next, org.apache.flink.streaming.api.functions.sink.SinkFunction.Context context)
      Called when new data arrives at the sink, and forwards it to Kafka.
      protected long numPendingRecords()  
      void open​(org.apache.flink.configuration.Configuration configuration)
      Initializes the connection to Kafka.
      void setFlushOnCheckpoint​(boolean flush)
      If set to true, the Flink producer will wait for all outstanding messages in the Kafka buffers to be acknowledged by the Kafka producer on a checkpoint.
      void setLogFailuresOnly​(boolean logFailuresOnly)
      Defines whether the producer should fail on errors, or only log them.
      void snapshotState​(org.apache.flink.runtime.state.FunctionSnapshotContext ctx)  
      • Methods inherited from class org.apache.flink.api.common.functions.AbstractRichFunction

        getIterationRuntimeContext, getRuntimeContext, setRuntimeContext
      • Methods inherited from interface org.apache.flink.streaming.api.functions.sink.SinkFunction

        finish, invoke, writeWatermark
    • Field Detail

      • KEY_DISABLE_METRICS

        public static final String KEY_DISABLE_METRICS
        Configuration key for disabling the metrics reporting.
        See Also:
        Constant Field Values
      • producerConfig

        protected final Properties producerConfig
        User defined properties for the Producer.
      • defaultTopicId

        protected final String defaultTopicId
        The name of the default topic this producer is writing data to.
      • schema

        protected final KeyedSerializationSchema<IN> schema
        (Serializable) SerializationSchema for turning objects used with Flink into byte[] for Kafka.
      • flinkKafkaPartitioner

        protected final FlinkKafkaPartitioner<IN> flinkKafkaPartitioner
        User-provided partitioner for assigning an object to a Kafka partition for each topic.
      • topicPartitionsMap

        protected final Map<String,​int[]> topicPartitionsMap
        Partitions of each topic.
      • logFailuresOnly

        protected boolean logFailuresOnly
        Flag indicating whether to accept failures (and log them), or to fail on failures.
      • flushOnCheckpoint

        protected boolean flushOnCheckpoint
        If true, the producer will wait until all outstanding records have been sent to the broker.
      • producer

        protected transient org.apache.kafka.clients.producer.KafkaProducer<byte[],​byte[]> producer
        KafkaProducer instance.
      • callback

        protected transient org.apache.kafka.clients.producer.Callback callback
        The callback that handles error propagation or logging.
      • asyncException

        protected transient volatile Exception asyncException
        Errors encountered in the async producer are stored here.
      • pendingRecordsLock

        protected final org.apache.flink.util.SerializableObject pendingRecordsLock
        Lock for accessing the pending records.
      • pendingRecords

        protected long pendingRecords
        Number of unacknowledged records.
    • Constructor Detail

      • FlinkKafkaProducerBase

        public FlinkKafkaProducerBase​(String defaultTopicId,
                                      KeyedSerializationSchema<IN> serializationSchema,
                                      Properties producerConfig,
                                      FlinkKafkaPartitioner<IN> customPartitioner)
        The main constructor for creating a FlinkKafkaProducer.
        Parameters:
        defaultTopicId - The default topic to write data to
        serializationSchema - A serializable serialization schema for turning user objects into a Kafka-consumable byte[] supporting key/value messages
        producerConfig - Configuration properties for the KafkaProducer. 'bootstrap.servers' is the only required argument.
        customPartitioner - A serializable partitioner for assigning messages to Kafka partitions. Passing null will use Kafka's partitioner.
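
The role of a custom partitioner can be illustrated with a small, JDK-only sketch. The interface `RecordPartitioner` and the class `KeyHashPartitioner` below are hypothetical stand-ins written for this example; they only assume that a partitioner receives the record, its serialized key and value, the target topic, and the array of partition IDs for that topic, and returns one of those IDs.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a partitioner contract; not a Flink type. */
interface RecordPartitioner<T> {
    int partition(T record, byte[] key, byte[] value, String targetTopic, int[] partitions);
}

/** Picks a partition from the hash of the serialized key, so equal keys share a partition. */
class KeyHashPartitioner<T> implements RecordPartitioner<T> {
    @Override
    public int partition(T record, byte[] key, byte[] value, String targetTopic, int[] partitions) {
        if (partitions == null || partitions.length == 0) {
            throw new IllegalArgumentException("No partitions available for topic " + targetTopic);
        }
        // Arrays.hashCode returns 0 for a null array, so records without a key
        // all map to the first entry of the partition array.
        int hash = Arrays.hashCode(key);
        // floorMod keeps the index non-negative even for negative hash codes.
        return partitions[Math.floorMod(hash, partitions.length)];
    }

    public static void main(String[] args) {
        KeyHashPartitioner<String> partitioner = new KeyHashPartitioner<>();
        int[] partitions = {0, 1, 2};
        byte[] key = {10, 20, 30};
        int chosen = partitioner.partition("event", key, new byte[0], "demo-topic", partitions);
        System.out.println("chosen partition: " + chosen);
    }
}
```

Hashing the key is only one possible policy; a round-robin or subtask-index-based choice would satisfy the same contract.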
    • Method Detail

      • setLogFailuresOnly

        public void setLogFailuresOnly​(boolean logFailuresOnly)
        Defines whether the producer should fail on errors, or only log them. If this is set to true, exceptions will only be logged. If it is set to false, exceptions will eventually be thrown and cause the streaming program to fail (and enter recovery).
        Parameters:
        logFailuresOnly - The flag to indicate logging-only on exceptions.
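
The two failure modes can be modelled without Kafka or Flink on the classpath. The class `FailureModeSketch` below is a hypothetical, simplified model, not the connector's source: it assumes that send errors arrive on a callback, are either logged or stored in a field like `asyncException`, and that a stored error is rethrown by a later `checkErroneous()` call. The sketch throws an unchecked exception for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Simplified, JDK-only model of log-only versus fail-on-error handling. */
class FailureModeSketch {
    private final boolean logFailuresOnly;
    // Written by the (simulated) asynchronous send callback, read by the task thread.
    private volatile Exception asyncException;
    final List<String> log = Collections.synchronizedList(new ArrayList<>());

    FailureModeSketch(boolean logFailuresOnly) {
        this.logFailuresOnly = logFailuresOnly;
    }

    /** Stand-in for the send callback; error is null when the send succeeded. */
    void onCompletion(Exception error) {
        if (error == null) {
            return;
        }
        if (logFailuresOnly) {
            log.add("Error while sending record: " + error.getMessage());
        } else if (asyncException == null) {
            asyncException = error; // keep the first failure only
        }
    }

    /** Rethrows a stored failure; intended to run before the next record is handled. */
    void checkErroneous() {
        Exception e = asyncException;
        if (e != null) {
            asyncException = null; // avoid rethrowing the same failure twice
            throw new IllegalStateException("Failed to send data: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        FailureModeSketch lenient = new FailureModeSketch(true);
        lenient.onCompletion(new RuntimeException("broker unavailable"));
        lenient.checkErroneous(); // does not throw; the failure was only logged
        System.out.println(lenient.log);

        FailureModeSketch strict = new FailureModeSketch(false);
        strict.onCompletion(new RuntimeException("broker unavailable"));
        try {
            strict.checkErroneous();
        } catch (IllegalStateException e) {
            System.out.println("rethrown: " + e.getMessage());
        }
    }
}
```

In log-only mode a failed record is lost silently from the job's point of view, which is why that mode does not fit an at-least-once setup.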
      • setFlushOnCheckpoint

        public void setFlushOnCheckpoint​(boolean flush)
        If set to true, the Flink producer will wait for all outstanding messages in the Kafka buffers to be acknowledged by the Kafka producer on a checkpoint. This way, the producer can guarantee that messages in the Kafka buffers are part of the checkpoint.
        Parameters:
        flush - Flag indicating the flushing mode (true = flush on checkpoint)
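
The bookkeeping behind flush-on-checkpoint can be sketched with plain JDK primitives. `PendingRecordsSketch` is a hypothetical model, not the connector's source: it assumes a counter like `pendingRecords` guarded by a lock like `pendingRecordsLock`, incremented on send and decremented on acknowledgement. Where the real producer would flush the Kafka client, this sketch swaps in a monitor wait until the counter reaches zero.

```java
/** Simplified, JDK-only model of flush-on-checkpoint bookkeeping. */
class PendingRecordsSketch {
    private final Object pendingRecordsLock = new Object();
    private final boolean flushOnCheckpoint;
    private long pendingRecords;

    PendingRecordsSketch(boolean flushOnCheckpoint) {
        this.flushOnCheckpoint = flushOnCheckpoint;
    }

    /** Called when a record is handed to the (simulated) producer. */
    void recordSent() {
        if (flushOnCheckpoint) {
            synchronized (pendingRecordsLock) {
                pendingRecords++;
            }
        }
    }

    /** Called from the (simulated) acknowledgement callback. */
    void recordAcknowledged() {
        if (flushOnCheckpoint) {
            synchronized (pendingRecordsLock) {
                pendingRecords--;
                pendingRecordsLock.notifyAll(); // wake a checkpoint that is waiting
            }
        }
    }

    /** Blocks until all records sent so far are acknowledged; a no-op when flushing is off. */
    void snapshotState() {
        if (!flushOnCheckpoint) {
            return;
        }
        synchronized (pendingRecordsLock) {
            while (pendingRecords > 0) {
                try {
                    pendingRecordsLock.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting for acknowledgements", e);
                }
            }
        }
    }

    long numPendingRecords() {
        synchronized (pendingRecordsLock) {
            return pendingRecords;
        }
    }

    public static void main(String[] args) {
        PendingRecordsSketch sink = new PendingRecordsSketch(true);
        sink.recordSent();
        sink.recordSent();
        Thread broker = new Thread(() -> {
            sink.recordAcknowledged();
            sink.recordAcknowledged();
        });
        broker.start();
        sink.snapshotState(); // returns only after both acknowledgements
        System.out.println("pending after checkpoint: " + sink.numPendingRecords());
    }
}
```

Because the checkpoint completes only once nothing is in flight, a record sent before the checkpoint is either acknowledged or replayed after a failure, which is the basis of the at-least-once guarantee.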
      • getKafkaProducer

        @VisibleForTesting
        protected <K,​V> org.apache.kafka.clients.producer.KafkaProducer<K,​V> getKafkaProducer​(Properties props)
        Used for testing only.
      • open

        public void open​(org.apache.flink.configuration.Configuration configuration)
                  throws Exception
        Initializes the connection to Kafka.
        Specified by:
        open in interface org.apache.flink.api.common.functions.RichFunction
        Overrides:
        open in class org.apache.flink.api.common.functions.AbstractRichFunction
        Throws:
        Exception
      • invoke

        public void invoke​(IN next,
                           org.apache.flink.streaming.api.functions.sink.SinkFunction.Context context)
                    throws Exception
        Called when new data arrives at the sink, and forwards it to Kafka.
        Specified by:
        invoke in interface org.apache.flink.streaming.api.functions.sink.SinkFunction<IN>
        Parameters:
        next - The incoming data
        Throws:
        Exception
      • close

        public void close()
                   throws Exception
        Specified by:
        close in interface org.apache.flink.api.common.functions.RichFunction
        Overrides:
        close in class org.apache.flink.api.common.functions.AbstractRichFunction
        Throws:
        Exception
      • flush

        protected abstract void flush()
        Flush pending records.
      • initializeState

        public void initializeState​(org.apache.flink.runtime.state.FunctionInitializationContext context)
                             throws Exception
        Specified by:
        initializeState in interface org.apache.flink.streaming.api.checkpoint.CheckpointedFunction
        Throws:
        Exception
      • snapshotState

        public void snapshotState​(org.apache.flink.runtime.state.FunctionSnapshotContext ctx)
                           throws Exception
        Specified by:
        snapshotState in interface org.apache.flink.streaming.api.checkpoint.CheckpointedFunction
        Throws:
        Exception
      • getPropertiesFromBrokerList

        public static Properties getPropertiesFromBrokerList​(String brokerList)
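
This page gives no description for this method, so the following is only a guess at comparable behaviour: a helper that checks each `host:port` entry of a comma-separated broker list and stores the list under the `bootstrap.servers` key. `BrokerListSketch` and its validation rules are hypothetical and use only the JDK.

```java
import java.util.Properties;

/** Hypothetical helper that turns a broker list into producer properties. */
class BrokerListSketch {
    static Properties propertiesFromBrokerList(String brokerList) {
        for (String entry : brokerList.split(",")) {
            String broker = entry.trim();
            int colon = broker.lastIndexOf(':');
            if (colon <= 0 || colon == broker.length() - 1) {
                throw new IllegalArgumentException("Expected host:port but got '" + broker + "'");
            }
            // A non-numeric port raises NumberFormatException, a subclass of IllegalArgumentException.
            int port = Integer.parseInt(broker.substring(colon + 1));
            if (port < 1 || port > 65535) {
                throw new IllegalArgumentException("Port out of range in '" + broker + "'");
            }
        }
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", brokerList);
        return props;
    }

    public static void main(String[] args) {
        Properties props = propertiesFromBrokerList("localhost:9092,broker2:9093");
        System.out.println(props.getProperty("bootstrap.servers"));
    }
}
```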
      • getPartitionsByTopic

        protected static int[] getPartitionsByTopic​(String topic,
                                                    org.apache.kafka.clients.producer.KafkaProducer<byte[],​byte[]> producer)
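
This method is also undocumented here. One plausible use, given the `topicPartitionsMap` field, is to look up a topic's partition IDs once and cache them as an `int[]`. The sketch below models that idea under stated assumptions: `TopicPartitionsCacheSketch` is hypothetical, the metadata lookup is a plain function instead of a Kafka client call, and sorting the IDs is an assumption made so that every caller sees the same order.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical per-topic cache of sorted partition IDs. */
class TopicPartitionsCacheSketch {
    private final Map<String, int[]> topicPartitionsMap = new HashMap<>();
    private final Function<String, List<Integer>> metadataLookup;
    int lookups; // counts how often the (simulated) metadata source was queried

    TopicPartitionsCacheSketch(Function<String, List<Integer>> metadataLookup) {
        this.metadataLookup = metadataLookup;
    }

    int[] partitionsFor(String topic) {
        int[] cached = topicPartitionsMap.get(topic);
        if (cached == null) {
            lookups++;
            // Sort so the array order does not depend on the order the metadata arrived in.
            cached = metadataLookup.apply(topic).stream()
                    .mapToInt(Integer::intValue)
                    .sorted()
                    .toArray();
            topicPartitionsMap.put(topic, cached);
        }
        return cached;
    }

    public static void main(String[] args) {
        TopicPartitionsCacheSketch cache =
                new TopicPartitionsCacheSketch(topic -> Arrays.asList(2, 0, 1));
        System.out.println(Arrays.toString(cache.partitionsFor("demo-topic")));
        cache.partitionsFor("demo-topic"); // served from the cache
        System.out.println("lookups: " + cache.lookups);
    }
}
```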
      • numPendingRecords

        @VisibleForTesting
        protected long numPendingRecords()