public class CSVVariableSlidingWindowRecordReader extends CSVRecordReader implements SequenceRecordReader
| Modifier and Type | Field and Description |
|---|---|
| static String | LINES_PER_SEQUENCE |

| Inherited Fields (grouped by declaring type) |
|---|
| DEFAULT_DELIMITER, DEFAULT_QUOTE, DELIMITER, QUOTE, SKIP_NUM_LINES, skipNumLines |
| charset, conf, initialized, lineIndex, locations, splitIndex |
| inputSplit, listeners, streamCreatorFn |
| APPEND_LABEL, LABELS, NAME_SPACE |

| Constructor and Description |
|---|
| CSVVariableSlidingWindowRecordReader(): No-arg constructor with the default number of lines per sequence (10) |
| CSVVariableSlidingWindowRecordReader(int maxLinesPerSequence) |
| CSVVariableSlidingWindowRecordReader(int maxLinesPerSequence, int stride) |
| CSVVariableSlidingWindowRecordReader(int maxLinesPerSequence, int skipNumLines, int stride, String delimiter) |
| CSVVariableSlidingWindowRecordReader(int maxLinesPerSequence, int stride, String delimiter) |
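
To make the maxLinesPerSequence, skipNumLines, stride and delimiter parameters concrete, the following stand-alone sketch applies a fixed-size sliding window to a list of CSV lines using only the Java standard library. It is not the DataVec implementation: the class SlidingWindowSketch and its windows method are hypothetical, and the assumption that every window holds exactly linesPerWindow lines is a simplification. The actual reader may handle the start and end of the file differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; this class is not part of DataVec.
class SlidingWindowSketch {

    /**
     * Cuts CSV lines into windows of linesPerWindow lines each.
     * The parameter names mirror the constructor arguments documented above;
     * the fixed-size windowing rule is an assumption made for this sketch.
     */
    static List<List<List<String>>> windows(List<String> lines, int linesPerWindow,
                                            int skipNumLines, int stride, String delimiter) {
        if (linesPerWindow < 1 || stride < 1 || skipNumLines < 0) {
            throw new IllegalArgumentException("linesPerWindow and stride must be >= 1, skipNumLines >= 0");
        }
        String splitRegex = Pattern.quote(delimiter); // treat the delimiter literally
        List<List<List<String>>> result = new ArrayList<>();
        // Lines are skipped once at the start, not once per window.
        for (int start = skipNumLines; start + linesPerWindow <= lines.size(); start += stride) {
            List<List<String>> window = new ArrayList<>();
            for (String line : lines.subList(start, start + linesPerWindow)) {
                window.add(Arrays.asList(line.split(splitRegex, -1)));
            }
            result.add(window);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a,b", "1,2", "3,4", "5,6", "7,8");
        // Skip the header line, use 2 lines per window, advance 1 line at a time.
        for (List<List<String>> window : windows(lines, 2, 1, 1, ",")) {
            System.out.println(window);
        }
    }
}
```

In this sketch a stride of 1 makes consecutive windows overlap by linesPerWindow - 1 lines, so the five input lines above (one of them a skipped header) yield three windows; a larger stride reduces the overlap and the number of windows.
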
| Modifier and Type | Method and Description |
|---|---|
| boolean | hasNext(): Whether there are any more records. |
| void | initialize(Configuration conf, InputSplit split): Called once at initialization. |
| List<Record> | loadFromMetaData(List<RecordMetaData> recordMetaDatas): Load multiple records from the given list of RecordMetaData instances. |
| Record | loadFromMetaData(RecordMetaData recordMetaData): Load a single record from the given RecordMetaData instance. Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List). |
| List<SequenceRecord> | loadSequenceFromMetaData(List<RecordMetaData> recordMetaDatas): Load multiple sequence records from the given list of RecordMetaData instances. |
| SequenceRecord | loadSequenceFromMetaData(RecordMetaData recordMetaData): Load a single sequence record from the given RecordMetaData instance. Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using SequenceRecordReader.loadSequenceFromMetaData(List). |
| SequenceRecord | nextSequence(): Similar to SequenceRecordReader.sequenceRecord(), but returns a SequenceRecord object, which may include metadata such as the source of the data. |
| void | reset(): Reset the record reader iterator. |
| List<List<Writable>> | sequenceRecord(): Returns a sequence record. |
| List<List<Writable>> | sequenceRecord(URI uri, DataInputStream dataInputStream): Load a sequence record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream. |
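
The note on the loadFromMetaData and loadSequenceFromMetaData methods, that loading several records at once is more efficient for non-splittable text data, can be illustrated with a small stand-alone sketch. Text has to be scanned from the beginning to reach a given line, so fetching all requested lines in one pass avoids rescanning once per record. The class BatchLoadSketch and its loadLines method are hypothetical and use only the Java standard library; they do not reproduce how DataVec resolves RecordMetaData.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Map;
import java.util.Scanner;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; this class is not part of DataVec.
class BatchLoadSketch {

    /**
     * Returns the requested lines (keyed by zero-based line index) using a
     * single scan over the text, stopping after the last requested line.
     */
    static Map<Integer, String> loadLines(String text, Set<Integer> wantedIndices) {
        Map<Integer, String> found = new TreeMap<>();
        if (wantedIndices.isEmpty()) {
            return found;
        }
        int lastWanted = Collections.max(wantedIndices);
        try (Scanner scanner = new Scanner(text)) {
            int index = 0;
            while (index <= lastWanted && scanner.hasNextLine()) {
                String line = scanner.nextLine();
                if (wantedIndices.contains(index)) {
                    found.put(index, line);
                }
                index++;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String csv = "1,2\n3,4\n5,6\n7,8\n";
        // One pass over the text serves both requested lines.
        System.out.println(loadLines(csv, new TreeSet<>(Arrays.asList(1, 3))));
    }
}
```

Calling a single-line loader once per index would instead restart the scan for every record, which is the cost the note above warns about.
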
| Inherited Methods (grouped by declaring type) |
|---|
| batchesSupported, next, next, nextRecord, onLocationOpen, parseLine, readStringLine, record |
| close, closeIfRequired, getConf, getIterator, getLabels, initialize, resetSupported, setConf |
| getListeners, invokeListeners, setListeners, setListeners |
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| batchesSupported, getLabels, getListeners, initialize, next, next, nextRecord, record, resetSupported, setListeners, setListeners |
| getConf, setConf |

public static final String LINES_PER_SEQUENCE

public CSVVariableSlidingWindowRecordReader()

public CSVVariableSlidingWindowRecordReader(int maxLinesPerSequence)
Parameters:
maxLinesPerSequence - Number of lines in each sequence; the default delimiter (,) is used between entries in the same line

public CSVVariableSlidingWindowRecordReader(int maxLinesPerSequence, int stride)
Parameters:
maxLinesPerSequence - Number of lines in each sequence; the default delimiter (,) is used between entries in the same line
stride - Number of lines between records (to advance the window by more than 1 line)

public CSVVariableSlidingWindowRecordReader(int maxLinesPerSequence, int stride, String delimiter)
Parameters:
maxLinesPerSequence - Number of lines in each sequence
stride - Number of lines between records (to advance the window by more than 1 line)
delimiter - Delimiter between entries in the same line, for example ","

public CSVVariableSlidingWindowRecordReader(int maxLinesPerSequence, int skipNumLines, int stride, String delimiter)
Parameters:
maxLinesPerSequence - Number of lines in each sequence
skipNumLines - Number of lines to skip at the start of the file (only skipped once, not per sequence)
stride - Number of lines between records (to advance the window by more than 1 line)
delimiter - Delimiter between entries in the same line, for example ","

public void initialize(Configuration conf, InputSplit split) throws IOException, InterruptedException
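Description copied from interface: RecordReader
Called once at initialization.
Specified by: initialize in interface RecordReader
Overrides: initialize in class CSVRecordReader
Parameters:
conf - a configuration for initialization
split - the split that defines the range of records to read
Throws: IOException, InterruptedException

public boolean hasNext()
Description copied from interface: RecordReader
Whether there are any more records.
Specified by: hasNext in interface RecordReader
Overrides: hasNext in class CSVRecordReader

public List<List<Writable>> sequenceRecord()
Description copied from interface: SequenceRecordReader
Returns a sequence record.
Specified by: sequenceRecord in interface SequenceRecordReader

public List<List<Writable>> sequenceRecord(URI uri, DataInputStream dataInputStream) throws IOException
Description copied from interface: SequenceRecordReader
Load a sequence record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
Specified by: sequenceRecord in interface SequenceRecordReader
Throws: IOException - if an error occurs during reading from the input stream

public SequenceRecord nextSequence()
Description copied from interface: SequenceRecordReader
Similar to SequenceRecordReader.sequenceRecord(), but returns a SequenceRecord object, which may include metadata such as the source of the data.
Specified by: nextSequence in interface SequenceRecordReader

public SequenceRecord loadSequenceFromMetaData(RecordMetaData recordMetaData) throws IOException
Description copied from interface: SequenceRecordReader
Load a single sequence record from the given RecordMetaData instance. Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using SequenceRecordReader.loadSequenceFromMetaData(List).
Specified by: loadSequenceFromMetaData in interface SequenceRecordReader
Parameters:
recordMetaData - Metadata for the sequence record that we want to load from
Throws: IOException - if an I/O error occurs during loading

public List<SequenceRecord> loadSequenceFromMetaData(List<RecordMetaData> recordMetaDatas) throws IOException
Description copied from interface: SequenceRecordReader
Load multiple sequence records from the given list of RecordMetaData instances.
Specified by: loadSequenceFromMetaData in interface SequenceRecordReader
Parameters:
recordMetaDatas - Metadata for the records that we want to load from
Throws: IOException - if an I/O error occurs during loading

public Record loadFromMetaData(RecordMetaData recordMetaData)
Description copied from interface: RecordReader
Load a single record from the given RecordMetaData instance. Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List).
Specified by: loadFromMetaData in interface RecordReader
Overrides: loadFromMetaData in class CSVRecordReader
Parameters:
recordMetaData - Metadata for the record that we want to load from

public List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas)
Description copied from interface: RecordReader
Load multiple records from the given list of RecordMetaData instances.
Specified by: loadFromMetaData in interface RecordReader
Overrides: loadFromMetaData in class CSVRecordReader
Parameters:
recordMetaDatas - Metadata for the records that we want to load from

public void reset()
Description copied from interface: RecordReader
Reset the record reader iterator.
Specified by: reset in interface RecordReader
Overrides: reset in class CSVRecordReader

Copyright © 2019. All rights reserved.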