public class CSVMultiSequenceRecordReader extends CSVRecordReader implements SequenceRecordReader
Modes: see CSVMultiSequenceRecordReader.Mode.

Example input:
a,b,c
1,2
A,B,C
D,E,F
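
Sequences in a single file are delimited by lines that match a separator regex (for example "^$" for an empty line). The following is a minimal, library-independent sketch of that grouping step. It assumes an empty line sits between `1,2` and `A,B,C` in the input above, which this page does not show. `SequenceGrouping` and `groupBySeparator` are hypothetical names, not part of DataVec, and the sketch says nothing about how the reader itself interprets each line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of DataVec): groups raw lines into sequences,
// starting a new group whenever a line matches the separator regex.
final class SequenceGrouping {

    static List<List<String>> groupBySeparator(List<String> lines, String separatorRegex) {
        List<List<String>> sequences = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String line : lines) {
            if (line.matches(separatorRegex)) {
                // Separator line: close the current group, ignoring empty groups
                if (!current.isEmpty()) {
                    sequences.add(current);
                    current = new ArrayList<>();
                }
            } else {
                current.add(line);
            }
        }
        // The last group has no trailing separator
        if (!current.isEmpty()) {
            sequences.add(current);
        }
        return sequences;
    }

    public static void main(String[] args) {
        // Assumed input: the example lines with an empty separator line in the middle
        List<String> lines = Arrays.asList("a,b,c", "1,2", "", "A,B,C", "D,E,F");
        System.out.println(groupBySeparator(lines, "^$"));
    }
}
```

With "^$" as the separator, the five assumed lines fall into two groups of two lines each.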
See also: CSVLineSequenceRecordReader for the edge case - a univariate version, Serialized Form

| Modifier and Type | Class and Description |
|---|---|
| static class | CSVMultiSequenceRecordReader.Mode |
Inherited fields:

DEFAULT_DELIMITER, DEFAULT_QUOTE, DELIMITER, QUOTE, SKIP_NUM_LINES, skipNumLines

charset, conf, initialized, lineIndex, locations, splitIndex

inputSplit, listeners, streamCreatorFn

APPEND_LABEL, LABELS, NAME_SPACE

| Constructor and Description |
|---|
| CSVMultiSequenceRecordReader(int skipNumLines, char elementDelimiter, char quote, String sequenceSeparatorRegex, CSVMultiSequenceRecordReader.Mode mode, Writable padValue)<br>Create a sequence reader with the specified number of lines to skip, element delimiter and quote character. |
| CSVMultiSequenceRecordReader(String sequenceSeparatorRegex, CSVMultiSequenceRecordReader.Mode mode)<br>Create a sequence reader using the default value for skip lines (0), the default delimiter (',') and the default quote character ('"'). Note that this constructor cannot be used with CSVMultiSequenceRecordReader.Mode.PAD as the padding value cannot be specified. |
| CSVMultiSequenceRecordReader(String sequenceSeparatorRegex, CSVMultiSequenceRecordReader.Mode mode, Writable padValue)<br>Create a sequence reader using the default value for skip lines (0), the default delimiter (',') and the default quote character ('"'). |
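
The constructors tie the padding value to the mode: it is only allowed with CSVMultiSequenceRecordReader.Mode.PAD and should be null otherwise. The sketch below shows one way such a rule could be checked, together with a simple end-padding step. It is an illustration only. `PadSketch`, `checkPadValue`, `padToLongest` and the `NO_PADDING` constant are hypothetical, plain strings stand in for `Writable` values, and padding at the end of the shorter rows is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not DataVec code.
final class PadSketch {

    // PAD mirrors the mode named in the constructor docs;
    // NO_PADDING is a placeholder standing in for any other mode.
    enum Mode { NO_PADDING, PAD }

    // Enforces the documented rule: a pad value is required for PAD and must be null otherwise.
    static void checkPadValue(Mode mode, String padValue) {
        if (mode == Mode.PAD && padValue == null) {
            throw new IllegalArgumentException("Mode.PAD requires a padding value");
        }
        if (mode != Mode.PAD && padValue != null) {
            throw new IllegalArgumentException("padValue must be null unless mode is PAD");
        }
    }

    // Appends padValue to every row until all rows are as long as the longest one.
    static List<List<String>> padToLongest(List<List<String>> rows, String padValue) {
        int max = 0;
        for (List<String> row : rows) {
            max = Math.max(max, row.size());
        }
        List<List<String>> out = new ArrayList<>();
        for (List<String> row : rows) {
            List<String> padded = new ArrayList<>(row);
            while (padded.size() < max) {
                padded.add(padValue);
            }
            out.add(padded);
        }
        return out;
    }

    public static void main(String[] args) {
        checkPadValue(Mode.PAD, "0");
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("a", "b", "c"),
                Arrays.asList("1", "2"));
        System.out.println(padToLongest(rows, "0"));
    }
}
```

Calling `checkPadValue(Mode.PAD, null)` in this sketch throws, which corresponds to the note that the two-argument constructor cannot be combined with the PAD mode.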
| Modifier and Type | Method and Description |
|---|---|
| boolean | batchesSupported()<br>Returns true if the next(int) signature is supported by this RecordReader implementation. |
| List<SequenceRecord> | loadSequenceFromMetaData(List<RecordMetaData> recordMetaDatas)<br>Load multiple sequence records from the given list of RecordMetaData instances. |
| SequenceRecord | loadSequenceFromMetaData(RecordMetaData recordMetaData)<br>Load a single sequence record from the given RecordMetaData instance. Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using SequenceRecordReader.loadSequenceFromMetaData(List). |
| SequenceRecord | nextSequence()<br>Similar to SequenceRecordReader.sequenceRecord(), but returns a Record object that may include metadata such as the source of the data. |
| List<List<Writable>> | sequenceRecord()<br>Returns a sequence record. |
| List<List<Writable>> | sequenceRecord(URI uri, DataInputStream dataInputStream)<br>Load a sequence record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream. |
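
The rule that implementations should not close the DataInputStream means the caller keeps ownership of the stream. A minimal sketch of that contract, using only the JDK: `StreamOwnershipSketch` and `readLines` are hypothetical stand-ins for a reader method and do not reproduce the DataVec implementation.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of "read from the stream, but leave it open".
final class StreamOwnershipSketch {

    // Consumes all lines from the stream without closing it.
    static List<String> readLines(DataInputStream in) throws IOException {
        // Deliberately not wrapped in try-with-resources:
        // closing the reader would also close the caller's stream.
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "a,b,c\n1,2\n".getBytes(StandardCharsets.UTF_8);
        // The caller opened the stream, so the caller closes it.
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            System.out.println(readLines(in));
        }
    }
}
```

Keeping the close in the caller's try-with-resources block lets the same stream be handed to other code afterwards without surprises.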
Inherited methods:

hasNext, initialize, loadFromMetaData, loadFromMetaData, next, next, nextRecord, onLocationOpen, parseLine, readStringLine, record, reset

close, closeIfRequired, getConf, getIterator, getLabels, initialize, resetSupported, setConf

getListeners, invokeListeners, setListeners, setListeners

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

getLabels, getListeners, hasNext, initialize, initialize, loadFromMetaData, loadFromMetaData, next, next, nextRecord, record, reset, resetSupported, setListeners, setListeners

getConf, setConf

public CSVMultiSequenceRecordReader(String sequenceSeparatorRegex, CSVMultiSequenceRecordReader.Mode mode)

Create a sequence reader using the default value for skip lines (0), the default delimiter (',') and the default quote character ('"'). Note that this constructor cannot be used with CSVMultiSequenceRecordReader.Mode.PAD as the padding value cannot be specified.

Parameters:
sequenceSeparatorRegex - The sequence separator regex. Use "^$" for "sequences are separated by an empty line"
mode - Mode: see CSVMultiSequenceRecordReader javadoc

public CSVMultiSequenceRecordReader(String sequenceSeparatorRegex, CSVMultiSequenceRecordReader.Mode mode, Writable padValue)

Create a sequence reader using the default value for skip lines (0), the default delimiter (',') and the default quote character ('"').

Parameters:
sequenceSeparatorRegex - The sequence separator regex. Use "^$" for "sequences are separated by an empty line"
mode - Mode: see CSVMultiSequenceRecordReader javadoc
padValue - Padding value for padding short sequences. Only used/allowable with CSVMultiSequenceRecordReader.Mode.PAD, should be null otherwise

public CSVMultiSequenceRecordReader(int skipNumLines, char elementDelimiter, char quote, String sequenceSeparatorRegex, CSVMultiSequenceRecordReader.Mode mode, Writable padValue)

Create a sequence reader with the specified number of lines to skip, element delimiter and quote character.

Parameters:
skipNumLines - Number of lines to skip
elementDelimiter - Delimiter for elements - i.e., ',' if lines are comma separated
sequenceSeparatorRegex - The sequence separator regex. Use "^$" for "sequences are separated by an empty line"
mode - Mode: see CSVMultiSequenceRecordReader javadoc
padValue - Padding value for padding short sequences. Only used/allowable with CSVMultiSequenceRecordReader.Mode.PAD, should be null otherwise

public List<List<Writable>> sequenceRecord()

Description copied from interface: SequenceRecordReader
Returns a sequence record.

Specified by: sequenceRecord in interface SequenceRecordReader

public SequenceRecord nextSequence()

Description copied from interface: SequenceRecordReader
Similar to SequenceRecordReader.sequenceRecord(), but returns a Record object that may include metadata such as the source of the data.

Specified by: nextSequence in interface SequenceRecordReader

public List<List<Writable>> sequenceRecord(URI uri, DataInputStream dataInputStream) throws IOException

Description copied from interface: SequenceRecordReader
Load a sequence record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.

Specified by: sequenceRecord in interface SequenceRecordReader
Throws: IOException - if error occurs during reading from the input stream

public SequenceRecord loadSequenceFromMetaData(RecordMetaData recordMetaData) throws IOException

Description copied from interface: SequenceRecordReader
Load a single sequence record from the given RecordMetaData instance. Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using SequenceRecordReader.loadSequenceFromMetaData(List).

Specified by: loadSequenceFromMetaData in interface SequenceRecordReader
Parameters: recordMetaData - Metadata for the sequence record that we want to load from
Throws: IOException - If I/O error occurs during loading

public List<SequenceRecord> loadSequenceFromMetaData(List<RecordMetaData> recordMetaDatas) throws IOException

Description copied from interface: SequenceRecordReader
Load multiple sequence records from the given list of RecordMetaData instances.

Specified by: loadSequenceFromMetaData in interface SequenceRecordReader
Parameters: recordMetaDatas - Metadata for the records that we want to load from
Throws: IOException - If I/O error occurs during loading

public boolean batchesSupported()

Description copied from interface: RecordReader
Returns true if the next(int) signature is supported by this RecordReader implementation.

Specified by: batchesSupported in interface RecordReader
Overrides: batchesSupported in class CSVRecordReader

Copyright © 2020. All rights reserved.