Package org.apache.flink.connector.jdbc
Class JdbcInputFormat
- java.lang.Object
-
- org.apache.flink.api.common.io.RichInputFormat<org.apache.flink.types.Row,org.apache.flink.core.io.InputSplit>
-
- org.apache.flink.connector.jdbc.JdbcInputFormat
-
- All Implemented Interfaces:
Serializable, org.apache.flink.api.common.io.InputFormat<org.apache.flink.types.Row,org.apache.flink.core.io.InputSplit>, org.apache.flink.api.java.typeutils.ResultTypeQueryable<org.apache.flink.types.Row>, org.apache.flink.core.io.InputSplitSource<org.apache.flink.core.io.InputSplit>
@Experimental
public class JdbcInputFormat
extends org.apache.flink.api.common.io.RichInputFormat<org.apache.flink.types.Row,org.apache.flink.core.io.InputSplit>
implements org.apache.flink.api.java.typeutils.ResultTypeQueryable<org.apache.flink.types.Row>

InputFormat to read data from a database and generate Rows. The InputFormat has to be configured using the supplied InputFormatBuilder. A valid RowTypeInfo must be properly configured in the builder, e.g.:

    TypeInformation<?>[] fieldTypes = new TypeInformation<?>[] {
        BasicTypeInfo.INT_TYPE_INFO,
        BasicTypeInfo.STRING_TYPE_INFO,
        BasicTypeInfo.STRING_TYPE_INFO,
        BasicTypeInfo.DOUBLE_TYPE_INFO,
        BasicTypeInfo.INT_TYPE_INFO
    };

    RowTypeInfo rowTypeInfo = new RowTypeInfo(fieldTypes);

    JdbcInputFormat jdbcInputFormat = JdbcInputFormat.buildJdbcInputFormat()
        .setDrivername("org.apache.derby.jdbc.EmbeddedDriver")
        .setDBUrl("jdbc:derby:memory:ebookshop")
        .setQuery("select * from books")
        .setRowTypeInfo(rowTypeInfo)
        .finish();

In order to query the JDBC source in parallel, you need to provide a parameterized query template (i.e. a valid PreparedStatement) and a JdbcParameterValuesProvider which provides binding values for the query parameters. E.g.:

    Serializable[][] queryParameters = new String[2][1];
    queryParameters[0] = new String[]{"Kumar"};
    queryParameters[1] = new String[]{"Tan Ah Teck"};

    JdbcInputFormat jdbcInputFormat = JdbcInputFormat.buildJdbcInputFormat()
        .setDrivername("org.apache.derby.jdbc.EmbeddedDriver")
        .setDBUrl("jdbc:derby:memory:ebookshop")
        .setQuery("select * from books WHERE author = ?")
        .setRowTypeInfo(rowTypeInfo)
        .setParametersProvider(new JdbcGenericParameterValuesProvider(queryParameters))
        .finish();

- See Also:
Row, JdbcParameterValuesProvider, PreparedStatement, DriverManager, Serialized Form
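The parameter matrix passed to the provider does not have to be written by hand. The following is a minimal sketch, not part of the Flink API, of a hypothetical helper (RangeParameters.between) that divides a numeric key range into contiguous sub-ranges, one row of bind values per sub-range. It assumes a query template of the form "select * from books WHERE id BETWEEN ? AND ?", where the id column is also an assumption. The resulting Serializable[][] has the same shape as queryParameters in the example above.

```java
import java.io.Serializable;
import java.util.Arrays;

/** Hypothetical helper, not part of Flink: builds bind values for "... WHERE id BETWEEN ? AND ?". */
final class RangeParameters {

    /**
     * Splits the inclusive range [min, max] into at most numSplits contiguous
     * sub-ranges and returns one {lower, upper} row per sub-range.
     */
    static Serializable[][] between(long min, long max, int numSplits) {
        if (min > max || numSplits < 1) {
            throw new IllegalArgumentException("need min <= max and numSplits >= 1");
        }
        long total = max - min + 1;                    // number of key values to cover
        int splits = (int) Math.min(numSplits, total); // never create an empty sub-range
        long base = total / splits;                    // minimum size of each sub-range
        long remainder = total % splits;               // the first sub-ranges get one extra value

        Serializable[][] rows = new Serializable[splits][];
        long lower = min;
        for (int i = 0; i < splits; i++) {
            long size = base + (i < remainder ? 1 : 0);
            long upper = lower + size - 1;
            rows[i] = new Long[] {lower, upper};       // bind values for the two '?' placeholders
            lower = upper + 1;
        }
        return rows;
    }

    public static void main(String[] args) {
        // Print the bind values that each parallel query would receive.
        for (Serializable[] row : between(1, 10, 3)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

Because the sub-ranges are contiguous and do not overlap, each key value is covered by exactly one row of bind values, so no record would be read twice across the parallel queries.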
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | JdbcInputFormat.JdbcInputFormatBuilder | Builder for JdbcInputFormat.
-
Field Summary
Fields
Modifier and Type | Field
protected Boolean | autoCommit
protected JdbcConnectionProvider | connectionProvider
protected int | fetchSize
protected boolean | hasNext
protected static org.slf4j.Logger | LOG
protected Object[][] | parameterValues
protected String | queryTemplate
protected ResultSet | resultSet
protected int | resultSetConcurrency
protected int | resultSetType
protected org.apache.flink.api.java.typeutils.RowTypeInfo | rowTypeInfo
protected static long | serialVersionUID
protected PreparedStatement | statement
-
Constructor Summary
Constructors
Constructor
JdbcInputFormat()
-
Method Summary
Methods
Modifier and Type | Method | Description
static JdbcInputFormat.JdbcInputFormatBuilder | buildJdbcInputFormat() | A builder used to set parameters to the input format's configuration in a fluent way.
void | close() | Closes all resources used.
void | closeInputFormat() |
void | configure(org.apache.flink.configuration.Configuration parameters) |
org.apache.flink.core.io.InputSplit[] | createInputSplits(int minNumSplits) |
protected Connection | getDbConn() |
org.apache.flink.core.io.InputSplitAssigner | getInputSplitAssigner(org.apache.flink.core.io.InputSplit[] inputSplits) |
org.apache.flink.api.java.typeutils.RowTypeInfo | getProducedType() |
protected PreparedStatement | getStatement() |
org.apache.flink.api.common.io.statistics.BaseStatistics | getStatistics(org.apache.flink.api.common.io.statistics.BaseStatistics cachedStatistics) |
org.apache.flink.types.Row | nextRecord(org.apache.flink.types.Row reuse) | Stores the next resultSet row in a Row.
void | open(org.apache.flink.core.io.InputSplit inputSplit) | Connects to the source database and executes the query in a parallel fashion if this InputFormat is built using a parameterized query (i.e. using a PreparedStatement) and a proper JdbcParameterValuesProvider, in a non-parallel fashion otherwise.
void | openInputFormat() |
boolean | reachedEnd() | Checks whether all data has been read.
-
-
-
Field Detail
-
serialVersionUID
protected static final long serialVersionUID
- See Also:
- Constant Field Values
-
LOG
protected static final org.slf4j.Logger LOG
-
connectionProvider
protected JdbcConnectionProvider connectionProvider
-
queryTemplate
protected String queryTemplate
-
resultSetType
protected int resultSetType
-
resultSetConcurrency
protected int resultSetConcurrency
-
rowTypeInfo
protected org.apache.flink.api.java.typeutils.RowTypeInfo rowTypeInfo
-
statement
protected transient PreparedStatement statement
-
resultSet
protected transient ResultSet resultSet
-
fetchSize
protected int fetchSize
-
autoCommit
protected Boolean autoCommit
-
hasNext
protected boolean hasNext
-
parameterValues
protected Object[][] parameterValues
-
-
Method Detail
-
getProducedType
public org.apache.flink.api.java.typeutils.RowTypeInfo getProducedType()
- Specified by:
getProducedType in interface org.apache.flink.api.java.typeutils.ResultTypeQueryable<org.apache.flink.types.Row>
-
configure
public void configure(org.apache.flink.configuration.Configuration parameters)
- Specified by:
configure in interface org.apache.flink.api.common.io.InputFormat<org.apache.flink.types.Row,org.apache.flink.core.io.InputSplit>
-
openInputFormat
public void openInputFormat()
- Overrides:
openInputFormat in class org.apache.flink.api.common.io.RichInputFormat<org.apache.flink.types.Row,org.apache.flink.core.io.InputSplit>
-
closeInputFormat
public void closeInputFormat()
- Overrides:
closeInputFormat in class org.apache.flink.api.common.io.RichInputFormat<org.apache.flink.types.Row,org.apache.flink.core.io.InputSplit>
-
open
public void open(org.apache.flink.core.io.InputSplit inputSplit) throws IOException
Connects to the source database and executes the query in a parallel fashion if this InputFormat is built using a parameterized query (i.e. using a PreparedStatement) and a proper JdbcParameterValuesProvider, in a non-parallel fashion otherwise.
- Specified by:
open in interface org.apache.flink.api.common.io.InputFormat<org.apache.flink.types.Row,org.apache.flink.core.io.InputSplit>
- Parameters:
inputSplit - which is ignored if this InputFormat is executed as a non-parallel source, a "hook" to the query parameters otherwise (using its splitNumber)
- Throws:
IOException - if there's an error during the execution of the query
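The way a split acts as a "hook" to the query parameters can be illustrated without a database. The following is a minimal sketch under stated assumptions, not the Flink implementation: SplitBindingSketch, splitCount and parametersFor are hypothetical names. It assumes that one split is created per row of parameterValues (and a single split when no parameters are configured), and that the split number is used as the row index when binding values to the PreparedStatement.

```java
import java.util.Arrays;

/** Hypothetical illustration, not part of Flink: maps a split number to its bind values. */
final class SplitBindingSketch {

    /** Assumed split count: one split per parameter row, or a single split without parameters. */
    static int splitCount(Object[][] parameterValues) {
        return parameterValues == null ? 1 : parameterValues.length;
    }

    /** Returns the bind values for the given split, or an empty array for a non-parameterized query. */
    static Object[] parametersFor(Object[][] parameterValues, int splitNumber) {
        if (parameterValues == null) {
            return new Object[0]; // non-parallel case: the split number is ignored
        }
        return parameterValues[splitNumber]; // parallel case: the split number selects the row
    }

    public static void main(String[] args) {
        // Same values as in the class-level example for "select * from books WHERE author = ?".
        Object[][] queryParameters = {
            new String[] {"Kumar"},
            new String[] {"Tan Ah Teck"}
        };
        for (int split = 0; split < splitCount(queryParameters); split++) {
            System.out.println(
                "split " + split + " binds " + Arrays.toString(parametersFor(queryParameters, split)));
        }
    }
}
```

Under these assumptions, the degree of useful parallelism is bounded by the number of parameter rows, since each row corresponds to one execution of the query template.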
-
close
public void close() throws IOException
Closes all resources used.
- Specified by:
close in interface org.apache.flink.api.common.io.InputFormat<org.apache.flink.types.Row,org.apache.flink.core.io.InputSplit>
- Throws:
IOException - Indicates that a resource could not be closed.
-
reachedEnd
public boolean reachedEnd() throws IOException
Checks whether all data has been read.
- Specified by:
reachedEnd in interface org.apache.flink.api.common.io.InputFormat<org.apache.flink.types.Row,org.apache.flink.core.io.InputSplit>
- Returns:
boolean value indicating whether all data has been read.
- Throws:
IOException
-
nextRecord
public org.apache.flink.types.Row nextRecord(org.apache.flink.types.Row reuse) throws IOException
Stores the next resultSet row in a Row.
- Specified by:
nextRecord in interface org.apache.flink.api.common.io.InputFormat<org.apache.flink.types.Row,org.apache.flink.core.io.InputSplit>
- Parameters:
reuse - row to be reused.
- Returns:
row containing the next Row
- Throws:
IOException
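The interplay of reachedEnd(), nextRecord(reuse) and the hasNext field can be sketched with an in-memory stand-in for the ResultSet. This is an illustration only, assuming the look-ahead pattern that the hasNext field suggests (the cursor is advanced once when the reader is opened and again after every record). LookAheadReader and readAll are hypothetical names, rows are modelled as String[] instead of Row, and the actual Flink implementation may differ in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical illustration, not part of Flink: a reader with a one-row look-ahead. */
final class LookAheadReader {
    private final Iterator<String[]> cursor; // stand-in for a JDBC ResultSet
    private boolean hasNext;                 // true while a buffered row is waiting to be returned
    private String[] current;                // the buffered row

    LookAheadReader(List<String[]> rows) {
        this.cursor = rows.iterator();
        advance();                           // position on the first row, as an open() call would
    }

    private void advance() {
        hasNext = cursor.hasNext();
        current = hasNext ? cursor.next() : null;
    }

    boolean reachedEnd() {
        return !hasNext;
    }

    /** Copies the buffered row into reuse (which must not be longer than the row) and advances. */
    String[] nextRecord(String[] reuse) {
        if (!hasNext) {
            return null;
        }
        System.arraycopy(current, 0, reuse, 0, reuse.length);
        advance();
        return reuse;
    }

    /** Drains the reader with the usual loop, reusing a single record object. */
    static List<String> readAll(List<String[]> rows, int arity) {
        LookAheadReader reader = new LookAheadReader(rows);
        String[] reuse = new String[arity];
        List<String> out = new ArrayList<>();
        while (!reader.reachedEnd()) {
            String[] row = reader.nextRecord(reuse);
            out.add(String.join("|", row)); // copy the values out before reuse is overwritten
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"1001", "Java for dummies"},
            new String[] {"1002", "More Java for dummies"});
        System.out.println(readAll(rows, 2));
    }
}
```

Since the same reuse object is handed back on every call, a caller that needs to keep a record beyond the next call has to copy its values first, as readAll does.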
-
getStatistics
public org.apache.flink.api.common.io.statistics.BaseStatistics getStatistics(org.apache.flink.api.common.io.statistics.BaseStatistics cachedStatistics) throws IOException
- Specified by:
getStatistics in interface org.apache.flink.api.common.io.InputFormat<org.apache.flink.types.Row,org.apache.flink.core.io.InputSplit>
- Throws:
IOException
-
createInputSplits
public org.apache.flink.core.io.InputSplit[] createInputSplits(int minNumSplits) throws IOException
- Specified by:
createInputSplits in interface org.apache.flink.api.common.io.InputFormat<org.apache.flink.types.Row,org.apache.flink.core.io.InputSplit>
- Specified by:
createInputSplits in interface org.apache.flink.core.io.InputSplitSource<org.apache.flink.core.io.InputSplit>
- Throws:
IOException
-
getInputSplitAssigner
public org.apache.flink.core.io.InputSplitAssigner getInputSplitAssigner(org.apache.flink.core.io.InputSplit[] inputSplits)
- Specified by:
getInputSplitAssigner in interface org.apache.flink.api.common.io.InputFormat<org.apache.flink.types.Row,org.apache.flink.core.io.InputSplit>
- Specified by:
getInputSplitAssigner in interface org.apache.flink.core.io.InputSplitSource<org.apache.flink.core.io.InputSplit>
-
getStatement
@VisibleForTesting protected PreparedStatement getStatement()
-
getDbConn
@VisibleForTesting protected Connection getDbConn()
-
buildJdbcInputFormat
public static JdbcInputFormat.JdbcInputFormatBuilder buildJdbcInputFormat()
A builder used to set parameters to the input format's configuration in a fluent way.
- Returns:
builder
-
-