public class StringListToCountsNDArrayTransform extends BaseTransform
| Modifier and Type | Field and Description |
|---|---|
| protected boolean | binary |
| protected int | columnIdx |
| protected String | columnName |
| protected String | delimiter |
| protected boolean | ignoreUnknown |
| protected Map<String,Integer> | map |
| protected String | newColumnName |
| protected List<String> | vocabulary |

Fields inherited from class BaseTransform: inputSchema

| Constructor and Description |
|---|
| StringListToCountsNDArrayTransform(String columnName, List<String> vocabulary, String delimiter, boolean binary, boolean ignoreUnknown) |
| StringListToCountsNDArrayTransform(String columnName, String newColumnName, List<String> vocabulary, String delimiter, boolean binary, boolean ignoreUnknown) |

| Modifier and Type | Method and Description |
|---|---|
| String | columnName(): Returns a singular column name this op is meant to run on |
| String[] | columnNames(): Returns column names this op is meant to run on |
| protected Collection<Integer> | getIndices(String text) |
| protected org.nd4j.linalg.api.ndarray.INDArray | makeBOWNDArray(Collection<Integer> indices) |
| List<Writable> | map(List<Writable> writables): Transform a writable into another writable |
| Object | map(Object input): Transform an object into another object |
| Object | mapSequence(Object sequence): Transform a sequence |
| String | outputColumnName(): The output column name after the operation has been applied |
| String[] | outputColumnNames(): The output column names. This will often be the same as the input |
| static List<String> | readVocabFromFile(String path) |
| void | setInputSchema(Schema inputSchema): Set the input schema. |
| String | toString() |
| Schema | transform(Schema inputSchema): Get the output schema for this transformation, given an input schema |

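
The summary lists `readVocabFromFile(String path)` but this page does not describe the file layout it expects. The following is a minimal sketch under the assumption that the vocabulary file holds one token per line in UTF-8; that layout is a guess, not something this page confirms. `VocabFileSketch` and `readVocab` are hypothetical names, and the code uses only the JDK rather than DataVec.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a vocabulary loader; not the DataVec implementation.
class VocabFileSketch {

    // Assumption: one token per line, UTF-8 encoded. Line order defines token order.
    static List<String> readVocab(String path) throws IOException {
        return Files.readAllLines(Paths.get(path), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Write a small temporary vocabulary file, then read it back.
        Path tmp = Files.createTempFile("vocab", ".txt");
        Files.write(tmp, Arrays.asList("cat", "dog", "fish"), StandardCharsets.UTF_8);

        List<String> vocab = readVocab(tmp.toString());
        System.out.println(vocab); // prints [cat, dog, fish]

        Files.delete(tmp);
    }
}
```

A list loaded this way has the same type as the `vocabulary` argument of either constructor.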
Methods inherited from class BaseTransform: getInputSchema, mapSequence

protected final String columnName
protected final String newColumnName
protected final String delimiter
protected final boolean binary
protected final boolean ignoreUnknown
protected int columnIdx

public StringListToCountsNDArrayTransform(String columnName, List<String> vocabulary, String delimiter, boolean binary, boolean ignoreUnknown)
Parameters:
columnName - The name of the column to convert
vocabulary - The possible tokens that may be present.
delimiter - The delimiter for the Strings to convert
ignoreUnknown - Whether to ignore unknown tokens

public StringListToCountsNDArrayTransform(String columnName, String newColumnName, List<String> vocabulary, String delimiter, boolean binary, boolean ignoreUnknown)
Parameters:
columnName - The name of the column to convert
vocabulary - The possible tokens that may be present.
delimiter - The delimiter for the Strings to convert
ignoreUnknown - Whether to ignore unknown tokens

public static List<String> readVocabFromFile(String path) throws IOException
Throws:
IOException

public Schema transform(Schema inputSchema)
Description copied from interface: ColumnOp
Get the output schema for this transformation, given an input schema

public void setInputSchema(Schema inputSchema)
Description copied from interface: ColumnOp
Set the input schema.
Specified by: setInputSchema in interface ColumnOp
Overrides: setInputSchema in class BaseTransform

public String toString()
Overrides: toString in class BaseTransform

protected Collection<Integer> getIndices(String text)
protected org.nd4j.linalg.api.ndarray.INDArray makeBOWNDArray(Collection<Integer> indices)
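
Neither `getIndices` nor `makeBOWNDArray` is described on this page. Read together with the constructor parameters, one plausible interpretation is a two-step bag-of-words encoding: split the text on `delimiter` and look up each token's vocabulary index, then accumulate those indices into a vector as long as the vocabulary. The sketch below illustrates that reading in plain Java. It is not the DataVec source, and it makes several labeled assumptions: a `double[]` stands in for `INDArray`, the delimiter is treated as a literal string, `binary` means each token is counted at most once, and an unknown token raises an exception when `ignoreUnknown` is false. `BowCountsSketch`, `makeBow`, and `encode` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in, not the DataVec class: a double[] replaces INDArray.
class BowCountsSketch {
    private final Map<String, Integer> index = new HashMap<>();
    private final int vocabSize;
    private final String delimiter;
    private final boolean binary;
    private final boolean ignoreUnknown;

    BowCountsSketch(List<String> vocabulary, String delimiter, boolean binary, boolean ignoreUnknown) {
        // Each token's position in the vocabulary becomes its vector index.
        for (int i = 0; i < vocabulary.size(); i++) {
            index.put(vocabulary.get(i), i);
        }
        this.vocabSize = vocabulary.size();
        this.delimiter = delimiter;
        this.binary = binary;
        this.ignoreUnknown = ignoreUnknown;
    }

    // Step 1: map tokens to vocabulary indices.
    Collection<Integer> getIndices(String text) {
        Collection<Integer> indices;
        if (binary) {
            indices = new LinkedHashSet<Integer>(); // a set keeps each index once
        } else {
            indices = new ArrayList<Integer>();     // a list keeps repeats for counting
        }
        if (text == null || text.isEmpty()) {
            return indices;
        }
        // Assumption: the delimiter is a literal string, hence Pattern.quote.
        for (String token : text.split(Pattern.quote(delimiter))) {
            Integer idx = index.get(token);
            if (idx == null) {
                if (ignoreUnknown) {
                    continue;
                }
                // Assumption: unknown tokens are an error unless ignored.
                throw new IllegalStateException("Unknown token: " + token);
            }
            indices.add(idx);
        }
        return indices;
    }

    // Step 2: accumulate the indices into a vector of length vocabSize.
    double[] makeBow(Collection<Integer> indices) {
        double[] counts = new double[vocabSize];
        for (int idx : indices) {
            counts[idx] += 1.0;
        }
        return counts;
    }

    double[] encode(String text) {
        return makeBow(getIndices(text));
    }

    public static void main(String[] args) {
        List<String> vocab = Arrays.asList("cat", "dog", "fish");
        BowCountsSketch counts = new BowCountsSketch(vocab, ",", false, true);
        BowCountsSketch flags = new BowCountsSketch(vocab, ",", true, true);
        System.out.println(Arrays.toString(counts.encode("dog,cat,dog,bird"))); // [1.0, 2.0, 0.0]
        System.out.println(Arrays.toString(flags.encode("dog,cat,dog,bird")));  // [1.0, 1.0, 0.0]
    }
}
```

In this sketch the only difference between the binary and count modes is the collection type used in step 1, which keeps step 2 identical for both.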
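
public List<Writable> map(List<Writable> writables)
Description copied from interface: Transform
Transform a writable into another writable
Parameters:
writables - the record to transform

public Object map(Object input)
Parameters:
input - the record to transform

public String outputColumnName()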
public String[] outputColumnNames()
public String[] columnNames()
public String columnName()
Copyright © 2017. All rights reserved.