| Modifier and Type | Method and Description |
|---|---|
| RecordReader | BaseInputFormat.createReader(InputSplit split) |
| RecordReader | InputFormat.createReader(InputSplit split)<br>Creates a reader from an input split |
| RecordReader | InputFormat.createReader(InputSplit split, Configuration conf)<br>Creates a reader from an input split |
| Modifier and Type | Method and Description |
|---|---|
| RecordReader | LineInputFormat.createReader(InputSplit split) |
| RecordReader | CSVInputFormat.createReader(InputSplit split) |
| RecordReader | ListStringInputFormat.createReader(InputSplit split)<br>Creates a reader from an input split |
| RecordReader | SVMLightInputFormat.createReader(InputSplit split) |
| RecordReader | MatlabInputFormat.createReader(InputSplit split) |
| RecordReader | LineInputFormat.createReader(InputSplit split, Configuration conf) |
| RecordReader | CSVInputFormat.createReader(InputSplit split, Configuration conf) |
| RecordReader | ListStringInputFormat.createReader(InputSplit split, Configuration conf)<br>Creates a reader from an input split |
| RecordReader | LibSvmInputFormat.createReader(InputSplit split, Configuration conf) |
| RecordReader | SVMLightInputFormat.createReader(InputSplit split, Configuration conf) |
| RecordReader | MatlabInputFormat.createReader(InputSplit split, Configuration conf) |
| Modifier and Type | Method and Description |
|---|---|
| static void | RecordReaderConverter.convert(RecordReader reader, RecordWriter writer)<br>Write all values from the specified record reader to the specified record writer. |
| static void | RecordReaderConverter.convert(RecordReader reader, RecordWriter writer, boolean closeOnCompletion)<br>Write all values from the specified record reader to the specified record writer. |
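
The copy loop that `convert` describes can be pictured with plain Java types. The sketch below is not the DataVec implementation: `ConvertSketch` is a hypothetical class, and `Iterator` and `Consumer` stand in for `RecordReader` and `RecordWriter`. The `closeOnCompletion` flag of the three-argument overload is not modeled here.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Standalone sketch: Iterator and Consumer are stand-ins for
// RecordReader and RecordWriter (assumption, not the DataVec types).
class ConvertSketch {

    // Copies every remaining record from the reader to the writer
    // and returns how many records were written.
    static <T> int convert(Iterator<T> reader, Consumer<T> writer) {
        int written = 0;
        while (reader.hasNext()) {
            writer.accept(reader.next());
            written++;
        }
        return written;
    }

    public static void main(String[] args) {
        List<List<String>> records = List.of(List.of("1", "a"), List.of("2", "b"));
        List<List<String>> sink = new ArrayList<>();
        int n = convert(records.iterator(), sink::add);
        System.out.println(n + " records copied: " + sink);
    }
}
```
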
| Modifier and Type | Method and Description |
|---|---|
| void | RecordListener.recordRead(RecordReader reader, Object record)<br>Event listener for each record to be read. |
| Modifier and Type | Method and Description |
|---|---|
| void | LogRecordListener.recordRead(RecordReader reader, Object record) |
| Modifier and Type | Interface and Description |
|---|---|
| interface | SequenceRecordReader<br>A sequence of records. |
| Modifier and Type | Class and Description |
|---|---|
| class | BaseRecordReader<br>Manages record listeners. |
| Modifier and Type | Method and Description |
|---|---|
| RecordReader | RecordReaderFactory.create(URI uri)<br>Creates an instance of RecordReader |
| Modifier and Type | Class and Description |
|---|---|
| class | ComposableRecordReader<br>RecordReader for each pipeline. |
| class | ConcatenatingRecordReader<br>Combine multiple readers into a single reader. |
| class | FileRecordReader<br>File reader/writer |
| class | LineRecordReader<br>Reads files line by line |
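
To illustrate what "combine multiple readers into a single reader" means, here is a minimal sketch using plain iterators. `ConcatSketch` is a hypothetical name, `Iterator` stands in for `RecordReader`, and draining the sources strictly in the order given is an assumption rather than documented DataVec behavior.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Standalone sketch: Iterator is a stand-in for RecordReader (assumption).
class ConcatSketch {

    // Presents several iterators as one, draining them in the order given.
    @SafeVarargs
    static <T> Iterator<T> concat(Iterator<T>... sources) {
        Iterator<Iterator<T>> outer = Arrays.asList(sources).iterator();
        return new Iterator<T>() {
            private Iterator<T> current = null;

            @Override
            public boolean hasNext() {
                // Skip exhausted or empty sources until one has a record.
                while ((current == null || !current.hasNext()) && outer.hasNext()) {
                    current = outer.next();
                }
                return current != null && current.hasNext();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    public static void main(String[] args) {
        Iterator<String> all = concat(
                List.of("a1", "a2").iterator(),
                List.of("b1").iterator());
        while (all.hasNext()) {
            System.out.println(all.next());
        }
    }
}
```
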
| Constructor and Description |
|---|
| ComposableRecordReader(RecordReader... readers) |
| ConcatenatingRecordReader(RecordReader... readers) |
| Modifier and Type | Class and Description |
|---|---|
| class | CollectionRecordReader<br>Collection record reader. |
| class | CollectionSequenceRecordReader<br>Collection record reader for sequences. |
| class | ListStringRecordReader<br>Iterates through a list of strings and returns a record. |
| Modifier and Type | Class and Description |
|---|---|
| class | CSVLineSequenceRecordReader<br>Used for loading univariate (single-valued) sequences from a CSV, where each line in the CSV represents an independent sequence and each sequence has exactly 1 value per time step. For example, a CSV file with content: |
| class | CSVMultiSequenceRecordReader<br>Used to read CSV-format time series (sequence) data where there are multiple independent sequences in each file. The assumption is that each sequence is separated by some delimiter, for example a blank line between sequences, or some other line that can be detected by a regex. Note that the number of columns (i.e., number of lines in the CSV per sequence) must be the same for all sequences. It supports 3 CSVMultiSequenceRecordReader.Modes:<br>(a) CONCAT: the output is a univariate (single column) sequence with the values from all lines<br>(b) EQUAL_LENGTH: requires that all lines have the exact same number of tokens<br>(c) PAD: for any shorter lines (fewer tokens), a user-specified padding Writable value is used to make them the same length as the other sequences<br>Example: Input data: |
| class | CSVNLinesSequenceRecordReader<br>A CSV sequence record reader where:<br>(a) all time series are in a single file<br>(b) each time series is of the same length (specified in constructor)<br>(c) no delimiter is used between time series<br>For example, with nLinesPerSequence=10, lines 0 to 9 are the first time series, 10 to 19 are the second, and so on. |
| class | CSVRecordReader<br>Simple CSV record reader. |
| class | CSVRegexRecordReader<br>A CSVRecordReader that can split each column into additional columns using regexes. |
| class | CSVSequenceRecordReader<br>CSV sequence record reader. This reader is intended to read sequences of data in CSV format, where each sequence is defined in its own file (and there are multiple files). Each line in the file represents one time step. |
| class | CSVVariableSlidingWindowRecordReader<br>A sliding window of variable size across an entire CSV. |
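
The fixed-length grouping described for CSVNLinesSequenceRecordReader (with nLinesPerSequence=10, lines 0 to 9 form the first series, 10 to 19 the second) can be sketched without the library. `NLinesSketch` and `group` are hypothetical names, and rejecting a trailing partial sequence is an assumption made for this example, not documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone sketch of fixed-length sequence grouping; not the DataVec class.
class NLinesSketch {

    // Splits lines into consecutive sequences of nLinesPerSequence lines each.
    static List<List<String>> group(List<String> lines, int nLinesPerSequence) {
        if (nLinesPerSequence <= 0) {
            throw new IllegalArgumentException("nLinesPerSequence must be positive");
        }
        // Assumption for this sketch: a trailing partial sequence is an error.
        if (lines.size() % nLinesPerSequence != 0) {
            throw new IllegalArgumentException("Line count is not a multiple of " + nLinesPerSequence);
        }
        List<List<String>> sequences = new ArrayList<>();
        for (int start = 0; start < lines.size(); start += nLinesPerSequence) {
            sequences.add(new ArrayList<>(lines.subList(start, start + nLinesPerSequence)));
        }
        return sequences;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            lines.add(i + ",0.5");
        }
        List<List<String>> sequences = group(lines, 10);
        System.out.println(sequences.size() + " sequences; second starts with " + sequences.get(1).get(0));
    }
}
```
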
| Modifier and Type | Class and Description |
|---|---|
| class | FileBatchRecordReader<br>Reads the files contained in a FileBatch using the specified RecordReader. Specifically, the record(URI, DataInputStream) method of the underlying reader is used to load files. For example, if the FileBatch was constructed using image files (png, jpg etc), FileBatchRecordReader could be used with ImageRecordReader. |
| class | FileBatchSequenceRecordReader<br>Reads the files contained in a FileBatch using the specified SequenceRecordReader. Specifically, the SequenceRecordReader.sequenceRecord(URI, DataInputStream) method of the underlying sequence reader is used to load files. For example, if the FileBatch was constructed using CSV sequence files (each file represents one example), FileBatchSequenceRecordReader could be used with CSVSequenceRecordReader. |
| Constructor and Description |
|---|
| FileBatchRecordReader(RecordReader rr, FileBatch fileBatch) |
| Modifier and Type | Class and Description |
|---|---|
| class | InMemoryRecordReader<br>This is a RecordReader primarily meant for unit tests. |
| class | InMemorySequenceRecordReader<br>This is a SequenceRecordReader primarily meant for unit tests. |
| Modifier and Type | Class and Description |
|---|---|
| class | JacksonLineRecordReader<br>JacksonLineRecordReader will read a single file line-by-line when .next() is called. |
| class | JacksonLineSequenceRecordReader |
| class | JacksonRecordReader<br>RecordReader using Jackson. Design for this record reader:<br>- Support for JSON, XML and YAML: one record per file only, via Jackson ObjectMapper: FieldSelection. |
| Modifier and Type | Class and Description |
|---|---|
| class | LibSvmRecordReader<br>Record reader for libsvm format, which is closely related to SVMLight format. |
| class | MatlabRecordReader<br>Matlab record reader |
| class | SVMLightRecordReader<br>Record reader for SVMLight format, which can generally be described as LABEL INDEX:VALUE INDEX:VALUE ... |
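
As an illustration of the `LABEL INDEX:VALUE INDEX:VALUE ...` layout, the following sketch parses one such line into a label and a sparse index-to-value map. `SvmLightLineSketch` is a hypothetical class, not part of DataVec, and it deliberately ignores extras such as trailing comments that real SVMLight files may contain.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Standalone sketch of parsing one "LABEL INDEX:VALUE ..." line; not the DataVec reader.
class SvmLightLineSketch {
    final String label;
    final Map<Integer, Double> features;

    SvmLightLineSketch(String label, Map<Integer, Double> features) {
        this.label = label;
        this.features = features;
    }

    static SvmLightLineSketch parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("Empty line");
        }
        String[] tokens = trimmed.split("\\s+");
        // Preserve the order in which features appear on the line.
        Map<Integer, Double> features = new LinkedHashMap<>();
        for (int i = 1; i < tokens.length; i++) {
            String[] pair = tokens[i].split(":", 2);
            if (pair.length != 2) {
                throw new IllegalArgumentException("Expected INDEX:VALUE but got: " + tokens[i]);
            }
            features.put(Integer.parseInt(pair[0]), Double.parseDouble(pair[1]));
        }
        return new SvmLightLineSketch(tokens[0], features);
    }

    public static void main(String[] args) {
        SvmLightLineSketch row = parse("1 3:0.5 7:2.0");
        System.out.println("label=" + row.label + " features=" + row.features);
    }
}
```
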
| Modifier and Type | Class and Description |
|---|---|
| class | RegexLineRecordReader<br>Read a file, one line at a time, and split it into fields using a regex. |
| class | RegexSequenceRecordReader<br>Read an entire file (as a sequence), one line at a time, and split each line into fields using a regex. |
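
One way to picture splitting a line into fields with a regex is to treat each capture group as one field. The sketch below uses only `java.util.regex`; `RegexLineSketch` is a hypothetical class, and both the capture-group convention and the sample log pattern are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Standalone sketch: each capture group of the pattern becomes one field (assumption).
class RegexLineSketch {

    static List<String> toFields(Pattern pattern, String line) {
        Matcher matcher = pattern.matcher(line);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Line does not match pattern: " + line);
        }
        List<String> fields = new ArrayList<>();
        // Group 0 is the whole match, so fields start at group 1.
        for (int g = 1; g <= matcher.groupCount(); g++) {
            fields.add(matcher.group(g));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical log layout: date, level, free-text message.
        Pattern pattern = Pattern.compile("(\\d{4}-\\d{2}-\\d{2}) (\\w+) (.*)");
        System.out.println(toFields(pattern, "2019-01-15 WARN disk almost full"));
    }
}
```
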
| Modifier and Type | Class and Description |
|---|---|
| class | TransformProcessRecordReader<br>Wraps a RecordReader with a TransformProcess so that every Record returned by the RecordReader has the transform process applied before being returned. |
| class | TransformProcessSequenceRecordReader<br>Wraps a SequenceRecordReader with a TransformProcess so that every Record returned from the SequenceRecordReader is transformed before being returned. |
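
The wrapping pattern described above (apply a transform to each record just before it is returned) can be sketched with plain Java types. `TransformingReaderSketch` is a hypothetical class; `Iterator` stands in for `RecordReader` and `Function` for `TransformProcess`, so none of this reflects the actual DataVec signatures.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Standalone sketch: Iterator and Function stand in for
// RecordReader and TransformProcess (assumption).
class TransformingReaderSketch {

    // Returns a view of the reader in which every record is transformed lazily.
    static <T, R> Iterator<R> wrap(Iterator<T> reader, Function<T, R> transform) {
        return new Iterator<R>() {
            @Override
            public boolean hasNext() {
                return reader.hasNext();
            }

            @Override
            public R next() {
                // The transform runs only when a record is actually requested.
                return transform.apply(reader.next());
            }
        };
    }

    public static void main(String[] args) {
        List<List<String>> records = List.of(List.of("a", "1"), List.of("b", "2"));
        // Hypothetical transform: keep only the first column, upper-cased.
        Iterator<String> it = wrap(records.iterator(), r -> r.get(0).toUpperCase());
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```
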
| Modifier and Type | Field and Description |
|---|---|
| protected RecordReader | TransformProcessRecordReader.recordReader |
| Constructor and Description |
|---|
| TransformProcessRecordReader(RecordReader recordReader, TransformProcess transformProcess) |
| Modifier and Type | Method and Description |
|---|---|
| static List<String> | TransformProcess.inferCategories(RecordReader recordReader, int columnIndex)<br>Infer the categories for the given record reader for a particular column. Note that each "column index" is a column in the context of: List |
| static Map<Integer,List<String>> | TransformProcess.inferCategories(RecordReader recordReader, int[] columnIndices)<br>Infer the categories for the given record reader for a particular set of columns (this is more efficient than TransformProcess.inferCategories(RecordReader, int) if you have more than one column you plan on inferring categories for). Note that each "column index" is a column in the context of: List |
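
The efficiency remark for the multi-column overload comes down to reading the data once instead of once per column. A minimal sketch of that idea follows; `InferCategoriesSketch` is a hypothetical class, records are modeled as `List<String>`, and returning the categories in sorted order is an assumption made for determinism, not a documented guarantee.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Standalone sketch of collecting distinct column values; not the DataVec implementation.
class InferCategoriesSketch {

    // Single-column variant, expressed via the multi-column one.
    static List<String> inferCategories(Iterable<List<String>> records, int columnIndex) {
        return inferCategories(records, new int[] {columnIndex}).get(columnIndex);
    }

    // Multi-column variant: one pass over the records covers all requested columns.
    static Map<Integer, List<String>> inferCategories(Iterable<List<String>> records, int[] columnIndices) {
        Map<Integer, TreeSet<String>> seen = new LinkedHashMap<>();
        for (int c : columnIndices) {
            seen.put(c, new TreeSet<>());
        }
        for (List<String> record : records) {
            for (int c : columnIndices) {
                seen.get(c).add(record.get(c));
            }
        }
        Map<Integer, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, TreeSet<String>> e : seen.entrySet()) {
            result.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> rows = List.of(
                List.of("red", "S"), List.of("blue", "M"), List.of("red", "M"));
        System.out.println(inferCategories(rows, new int[] {0, 1}));
    }
}
```
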
| Modifier and Type | Method and Description |
|---|---|
| void | Vectorizer.fit(RecordReader reader)<br>Fit based on a record reader |
| void | Vectorizer.fit(RecordReader reader, Vectorizer.RecordCallBack callBack)<br>Fit based on a record reader |
| VECTOR_TYPE | Vectorizer.fitTransform(RecordReader reader)<br>Fit based on a record reader |
| VECTOR_TYPE | Vectorizer.fitTransform(RecordReader reader, Vectorizer.RecordCallBack callBack)<br>Fit based on a record reader |
Copyright © 2019. All rights reserved.