@PublicEvolving
public interface SupportsComputedColumnPushDown
Enables pushing down computed columns into a ScanTableSource.
Computed columns add additional columns to the table's schema. They are defined by logical expressions that reference other physically existing columns.
An example in SQL looks like:
CREATE TABLE t (str STRING, ts AS PARSE_TIMESTAMP(str), i INT) // `ts` is a computed column
By default, if this interface is not implemented, computed columns are added to the physically produced row in a subsequent operation after the source.
However, it might be beneficial to perform the computation as early as possible in order to be close to the actual data generation. Especially in cases where computed columns are used for generating watermarks, a source must push down the computation as deep as possible such that the computation can happen within a source's data partition.
This interface provides a SupportsComputedColumnPushDown.ComputedColumnConverter that needs to be applied to every row during runtime.
Note: The final output data type emitted by a source changes from the physically produced data type to the full data type of the table's schema. For the example above, this means:
ROW<str STRING, i INT> // before conversion
ROW<str STRING, ts TIMESTAMP(3), i INT> // after conversion
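The before/after row types above can be illustrated with a minimal sketch in plain Java. It does not use Flink's classes: `ComputedColumnSketch`, the `Object[]` row representation, and the use of `LocalDateTime.parse` as a stand-in for `PARSE_TIMESTAMP` are all assumptions made for illustration only. In a real source, the conversion would be performed by the ComputedColumnConverter supplied by the planner.

```java
import java.time.LocalDateTime;
import java.util.Arrays;

// Hypothetical stand-in for the conversion step; these are not Flink classes.
final class ComputedColumnSketch {

    // Converts a physical row [str, i] into the full schema row [str, ts, i].
    static Object[] convert(Object[] physicalRow) {
        String str = (String) physicalRow[0];
        Integer i = (Integer) physicalRow[1];
        // Assumption: ISO-8601 parsing stands in for PARSE_TIMESTAMP(str).
        LocalDateTime ts = LocalDateTime.parse(str);
        // The computed column is inserted at its position in the table schema.
        return new Object[] {str, ts, i};
    }

    public static void main(String[] args) {
        Object[] physical = {"2020-01-01T12:00:00", 42}; // ROW<str, i>
        Object[] full = convert(physical);               // ROW<str, ts, i>
        System.out.println(Arrays.toString(full));
    }
}
```

The point of the sketch is only that the emitted row has one more field than the physically produced row, and that the new field sits where the schema declares it.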
Note: If a source implements SupportsProjectionPushDown, the projection must be applied to
the physical data in the first step. The SupportsComputedColumnPushDown (already aware of the
projection) will then use the projected physical data and insert computed columns into the result. In
the example below, the projections [i, d] are derived from the DDL (c requires i)
and query (d and c are required). The pushed converter will rely on this order and
will process [i, d] to produce [d, c].
CREATE TABLE t (i INT, s STRING, c AS i + 2, d DOUBLE);
SELECT d, c FROM t;
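The two-step order described for this example (projection first, then computed columns) can be sketched in plain Java as follows. The class `ProjectedConverterSketch`, its methods, and the `Object[]` row representation are hypothetical and are not part of Flink; the sketch only mirrors the field order [i, s, d] to [i, d] to [d, c].

```java
import java.util.Arrays;

// Hypothetical illustration of the ordering; these are not Flink classes.
final class ProjectedConverterSketch {

    // Step 1: apply the projection to the physical row [i, s, d], keeping [i, d].
    static Object[] project(Object[] physicalRow) {
        return new Object[] {physicalRow[0], physicalRow[2]};
    }

    // Step 2: the converter consumes [i, d] in that order and emits [d, c],
    // where c is computed as i + 2 (from the DDL above).
    static Object[] convert(Object[] projectedRow) {
        int i = (Integer) projectedRow[0];
        Object d = projectedRow[1];
        return new Object[] {d, i + 2};
    }

    public static void main(String[] args) {
        Object[] physical = {1, "a", 2.5};               // i, s, d
        Object[] result = convert(project(physical));    // d, c
        System.out.println(Arrays.toString(result));     // prints [2.5, 3]
    }
}
```

Column `i` is not part of the query result, yet it must survive the projection because `c` depends on it; the converter then drops it from the output.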
| Modifier and Type | Interface and Description |
|---|---|
| static interface | SupportsComputedColumnPushDown.ComputedColumnConverter: Generates and adds computed columns to RowData if necessary. |
| Modifier and Type | Method and Description |
|---|---|
| void | applyComputedColumn(SupportsComputedColumnPushDown.ComputedColumnConverter converter, DataType outputDataType) |
void applyComputedColumn(SupportsComputedColumnPushDown.ComputedColumnConverter converter, DataType outputDataType)
Provides a converter that converts the produced RowData containing the physical fields of the external system into a new RowData with pushed-down computed columns.
Note: Use the passed data type instead of TableSchema.toPhysicalRowDataType() for
describing the final output data type when creating TypeInformation. If the source implements
SupportsProjectionPushDown, the projection is already considered in both the converter
and the given output data type.
Copyright © 2014–2020 The Apache Software Foundation. All rights reserved.