Interface BucketAssigner<IN, BucketID>

  • Type Parameters:
    IN - The type of input elements.
    BucketID - The type of the object returned by getBucketId(Object, BucketAssigner.Context). This type must implement #hashCode() and #equals(Object) correctly. In addition, the Path to the created bucket is the result of this object's #toString(), appended to the basePath specified in the file sink.
    All Superinterfaces:
    Serializable
    All Known Implementing Classes:
    BasePathBucketAssigner, DateTimeBucketAssigner

    @PublicEvolving
    public interface BucketAssigner<IN, BucketID>
    extends Serializable
    A BucketAssigner is used with a file sink to determine the bucket each incoming element should be put into.

    The StreamingFileSink can write to many buckets at a time and is responsible for managing the set of active buckets. Whenever a new element arrives, it asks the BucketAssigner which bucket the element should fall into. The BucketAssigner can, for example, determine buckets based on system time.
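
    The time-based assignment mentioned above might look roughly like the following sketch. It is not the Flink implementation: the nested Context and Assigner interfaces are hypothetical local stand-ins so that the example compiles without Flink on the classpath, and the currentProcessingTime() accessor, the "yyyy-MM-dd--HH" pattern and the UTC zone are assumptions chosen for illustration.

```java
import java.io.Serializable;
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class HourlyBucketSketch {

    /** Hypothetical stand-in for BucketAssigner.Context; only a clock is modelled. */
    interface Context {
        long currentProcessingTime();
    }

    /** Hypothetical stand-in for BucketAssigner, reduced to getBucketId. */
    interface Assigner<IN, BucketID> extends Serializable {
        BucketID getBucketId(IN element, Context context);
    }

    /** Ignores the element and buckets by the hour of the current processing time. */
    static class HourlyAssigner<IN> implements Assigner<IN, String> {
        private static final long serialVersionUID = 1L;

        // Static, so the non-serializable formatter is not part of the serialized assigner.
        private static final DateTimeFormatter FORMAT =
                DateTimeFormatter.ofPattern("yyyy-MM-dd--HH").withZone(ZoneId.of("UTC"));

        @Override
        public String getBucketId(IN element, Context context) {
            return FORMAT.format(Instant.ofEpochMilli(context.currentProcessingTime()));
        }
    }

    public static void main(String[] args) {
        HourlyAssigner<String> assigner = new HourlyAssigner<>();
        Context fixedClock = () -> 0L; // the epoch start, for a reproducible result
        System.out.println(assigner.getBucketId("any element", fixedClock)); // 1970-01-01--00
    }
}
```

    Elements that arrive within the same hour receive the same identifier and therefore land in the same bucket.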

    • Method Detail

      • getBucketId

        BucketID getBucketId(IN element,
                             BucketAssigner.Context context)
        Returns the identifier of the bucket the provided element should be put into.
        Parameters:
        element - The current element being processed.
        context - The context used by the current bucket assigner.
        Returns:
        The identifier of the bucket the element should be put into. The actual path to the bucket results from appending the #toString() of the returned identifier to the base path provided during the initialization of the file sink.
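
        As an illustration of why the identifier needs value-based #equals(Object) and #hashCode() and a path-friendly #toString(), here is a minimal sketch. The RegionDay type, the bucketPath helper, the "/" separator and the "/data/out" base path are all hypothetical and not part of the Flink API; the sink's actual path construction may differ in detail.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class BucketPathSketch {

    /** Hypothetical composite bucket ID with value-based equality and a path-like toString. */
    static final class RegionDay {
        final String region;
        final String day;

        RegionDay(String region, String day) {
            this.region = region;
            this.day = day;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof RegionDay)) return false;
            RegionDay other = (RegionDay) o;
            return region.equals(other.region) && day.equals(other.day);
        }

        @Override
        public int hashCode() {
            return Objects.hash(region, day);
        }

        @Override
        public String toString() {
            return region + "/" + day;
        }
    }

    /** Assumed path rule: base path, a separator, then the identifier's toString(). */
    static String bucketPath(String basePath, Object bucketId) {
        return basePath + "/" + bucketId;
    }

    public static void main(String[] args) {
        // Two equal identifiers must map to one active bucket, not two.
        Map<RegionDay, Integer> elementsPerBucket = new HashMap<>();
        elementsPerBucket.merge(new RegionDay("eu", "2024-01-01"), 1, Integer::sum);
        elementsPerBucket.merge(new RegionDay("eu", "2024-01-01"), 1, Integer::sum);
        System.out.println(elementsPerBucket.size()); // 1
        System.out.println(bucketPath("/data/out", new RegionDay("eu", "2024-01-01")));
    }
}
```

        Without the equals/hashCode overrides, the two RegionDay instances would be treated as distinct keys, which is the failure mode the type parameter documentation warns about.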
      • getSerializer

        org.apache.flink.core.io.SimpleVersionedSerializer<BucketID> getSerializer()
        Returns:
        A SimpleVersionedSerializer capable of serializing and deserializing objects of type BucketID, that is, the type of the objects returned by getBucketId(Object, BucketAssigner.Context).
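
        For a String bucket identifier, such a serializer could be sketched as below. The nested VersionedSerializer interface is a local stand-in that mirrors the shape of SimpleVersionedSerializer (getVersion, serialize, deserialize) so the example runs without Flink; the class names, the version number and the UTF-8 encoding are assumptions for illustration only.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class BucketIdSerializerSketch {

    /** Local stand-in mirroring the shape of SimpleVersionedSerializer. */
    interface VersionedSerializer<E> {
        int getVersion();

        byte[] serialize(E obj) throws IOException;

        E deserialize(int version, byte[] serialized) throws IOException;
    }

    /** Serializes String bucket IDs as UTF-8 bytes and rejects unknown versions. */
    static final class StringIdSerializer implements VersionedSerializer<String> {
        private static final int VERSION = 1;

        @Override
        public int getVersion() {
            return VERSION;
        }

        @Override
        public byte[] serialize(String bucketId) {
            return bucketId.getBytes(StandardCharsets.UTF_8);
        }

        @Override
        public String deserialize(int version, byte[] serialized) throws IOException {
            if (version != VERSION) {
                throw new IOException("Unknown serializer version: " + version);
            }
            return new String(serialized, StandardCharsets.UTF_8);
        }
    }

    /** Serializes and immediately deserializes an ID using the current version. */
    static String roundTrip(String bucketId) {
        StringIdSerializer serializer = new StringIdSerializer();
        try {
            byte[] bytes = serializer.serialize(bucketId);
            return serializer.deserialize(serializer.getVersion(), bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("1970-01-01--00")); // 1970-01-01--00
    }
}
```

        The version passed to deserialize lets a later serializer revision still read identifiers that were written by an earlier one.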