@PublicEvolving public class BlockSplittingRecursiveEnumerator extends NonSplittingRecursiveEnumerator
This FileEnumerator enumerates all files under the given paths recursively and creates a
separate split for each file block.
Please note that file blocks are only exposed by some file systems, such as HDFS. For file systems that do not expose block information, the enumerator does not create multiple splits per file but keeps each file as a single source split.
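The per-block splitting described above can be illustrated with a minimal, self-contained sketch. It is not Flink's implementation: the class `BlockSplitSketch`, the method `splitByBlocks`, and the use of a single fixed block size are assumptions made for illustration (a real file system reports individual block locations). A non-positive block size stands in for a file system that exposes no block information.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: plain Java types stand in for Flink's FileStatus and FileSourceSplit.
final class BlockSplitSketch {

    /**
     * Returns {offset, length} pairs covering the file.
     * Assumption: all blocks share one size; blockSize <= 0 means "no block information".
     */
    static List<long[]> splitByBlocks(long fileLength, long blockSize) {
        List<long[]> ranges = new ArrayList<>();
        if (blockSize <= 0 || fileLength <= blockSize) {
            // No block information, or the file fits in one block: keep it as one split.
            ranges.add(new long[] {0L, fileLength});
            return ranges;
        }
        for (long offset = 0; offset < fileLength; offset += blockSize) {
            // The last range may be shorter than a full block.
            ranges.add(new long[] {offset, Math.min(blockSize, fileLength - offset)});
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (long[] range : splitByBlocks(300L, 128L)) {
            System.out.println("offset=" + range[0] + ", length=" + range[1]);
        }
    }
}
```

In this sketch, a 300-byte file with 128-byte blocks yields three ranges, while the same file without block information stays a single range.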
Files with suffixes corresponding to known compression formats (for example '.gzip', '.bz2',
...) will not be split. See StandardDeCompressors for a list of known formats and
suffixes.
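As a rough illustration of the suffix rule, the sketch below checks a file name against a list of non-splittable suffixes. The class `SplittableSuffixSketch`, its method, and the plain `String` file name are hypothetical stand-ins and do not mirror the actual `isFileSplittable` implementation. The suffix list is taken from the examples in the text, not from StandardDeCompressors.

```java
// Hypothetical sketch: a file name (String) stands in for org.apache.flink.core.fs.Path.
final class SplittableSuffixSketch {

    /** Returns false if the file name ends with any of the given suffixes. */
    static boolean isSplittable(String fileName, String[] nonSplittableSuffixes) {
        for (String suffix : nonSplittableSuffixes) {
            if (fileName.endsWith(suffix)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Example suffixes only; the authoritative list lives in StandardDeCompressors.
        String[] suffixes = {".gzip", ".bz2"};
        System.out.println(isSplittable("events.bz2", suffixes));
        System.out.println(isSplittable("events.csv", suffixes));
    }
}
```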
The default instantiation of this enumerator filters files with the common hidden file prefixes '.' and '_'. A custom file filter can be specified.
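The hidden-file rule and the idea of a custom filter can be sketched with `java.util.function.Predicate`. This is an illustration under assumptions, not the library's default filter: the predicates operate on plain file names rather than on `org.apache.flink.core.fs.Path`, and the names `HiddenFileFilterSketch`, `NOT_HIDDEN`, and `VISIBLE_CSV` are hypothetical. Treating an empty name as not hidden is also a choice made only for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch: predicates over file names instead of Flink Path objects.
final class HiddenFileFilterSketch {

    /** Accepts names that do not start with '.' or '_'. */
    static final Predicate<String> NOT_HIDDEN =
            name -> name.isEmpty() || (name.charAt(0) != '.' && name.charAt(0) != '_');

    /** Example of a custom filter: visible files with a ".csv" suffix only. */
    static final Predicate<String> VISIBLE_CSV =
            NOT_HIDDEN.and(name -> name.endsWith(".csv"));

    /** Keeps only the names accepted by the given filter. */
    static List<String> select(List<String> names, Predicate<String> filter) {
        return names.stream().filter(filter).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(".tmp", "_SUCCESS", "part-0.csv", "notes.txt");
        System.out.println(select(names, NOT_HIDDEN));
        System.out.println(select(names, VISIBLE_CSV));
    }
}
```

With the actual class, a comparable predicate over `Path` would be passed as the `fileFilter` constructor argument.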
FileEnumerator.Provider

| Constructor and Description |
|---|
| BlockSplittingRecursiveEnumerator()<br>Creates a new enumerator that enumerates all files except hidden files. |
| BlockSplittingRecursiveEnumerator(java.util.function.Predicate<org.apache.flink.core.fs.Path> fileFilter, String[] nonSplittableFileSuffixes)<br>Creates a new enumerator that uses the given predicate as a filter for file paths, and avoids splitting files with the given suffixes (typically to avoid splitting compressed files). |

| Modifier and Type | Method and Description |
|---|---|
| protected void | convertToSourceSplits(org.apache.flink.core.fs.FileStatus file, org.apache.flink.core.fs.FileSystem fs, List<FileSourceSplit> target) |
| protected boolean | isFileSplittable(org.apache.flink.core.fs.Path filePath) |
enumerateSplits, getNextId

public BlockSplittingRecursiveEnumerator()
The enumerator does not split files that have a suffix corresponding to a known
compression format (for example '.gzip', '.bz2', '.xz', '.zip', ...). See StandardDeCompressors for details.
public BlockSplittingRecursiveEnumerator(java.util.function.Predicate<org.apache.flink.core.fs.Path> fileFilter, String[] nonSplittableFileSuffixes)
protected void convertToSourceSplits(org.apache.flink.core.fs.FileStatus file,
org.apache.flink.core.fs.FileSystem fs,
List<FileSourceSplit> target)
throws IOException
protected boolean isFileSplittable(org.apache.flink.core.fs.Path filePath)
Copyright © 2014–2022 The Apache Software Foundation. All rights reserved.