public class FirstDigitTransform extends BaseTransform
Two FirstDigitTransform.Modes are supported, which determine how non-numerical entries are handled (see FirstDigitTransform.Mode below).
FirstDigitTransform can be used (combined with CategoricalToOneHotTransform and Reductions) to implement Benford's law.

| Modifier and Type | Class and Description |
|---|---|
| static class | FirstDigitTransform.Mode: Mode determines how non-numerical entries should be handled. EXCEPTION_ON_INVALID: output has 10 category values ("0", ..., "9"), and any non-numerical values result in an exception. INCLUDE_OTHER_CATEGORY: output has 11 category values ("0", ..., "9", "Other"), and all non-numerical values are mapped to "Other". |
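
The two modes described above, and the link to Benford's law, can be illustrated with a small standalone sketch. It does not call the DataVec API: the class `FirstDigitSketch`, its nested `Mode` enum, and the helpers `firstDigit`, `countCategories`, and `benfordExpected` are hypothetical names introduced only for illustration. The sketch assumes that a value counts as numerical when it parses as a finite double, and that the category is the first ASCII digit character in the string (so a leading sign is skipped). The actual transform may differ on these edge cases.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FirstDigitSketch {
    // Hypothetical stand-in for FirstDigitTransform.Mode
    enum Mode { EXCEPTION_ON_INVALID, INCLUDE_OTHER_CATEGORY }

    // Mirrors the "Other" category named in the mode description
    static final String OTHER_CATEGORY = "Other";

    // Maps a raw string to a category "0".."9", or handles it according to the mode.
    // Assumption: "numerical" means the string parses as a finite double.
    static String firstDigit(String raw, Mode mode) {
        String s = raw == null ? "" : raw.trim();
        boolean numeric;
        try {
            double d = Double.parseDouble(s);
            numeric = !Double.isNaN(d) && !Double.isInfinite(d);
        } catch (NumberFormatException e) {
            numeric = false;
        }
        if (numeric) {
            // Assumption: take the first ASCII digit character, skipping any sign
            for (char c : s.toCharArray()) {
                if (c >= '0' && c <= '9') {
                    return String.valueOf(c);
                }
            }
        }
        if (mode == Mode.INCLUDE_OTHER_CATEGORY) {
            return OTHER_CATEGORY;
        }
        throw new IllegalArgumentException("Non-numerical value: \"" + raw + "\"");
    }

    // Counts how often each category occurs in a column of raw string values.
    static Map<String, Integer> countCategories(List<String> column, Mode mode) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String value : column) {
            counts.merge(firstDigit(value, mode), 1, Integer::sum);
        }
        return counts;
    }

    // Benford's law: expected share of values with leading digit d, for d in 1..9.
    static double benfordExpected(int d) {
        if (d < 1 || d > 9) {
            throw new IllegalArgumentException("d must be in 1..9");
        }
        return Math.log10(1.0 + 1.0 / d);
    }

    public static void main(String[] args) {
        List<String> column = List.of("3.1415", "-7.123", "120", "n/a", "19.5");
        System.out.println(countCategories(column, Mode.INCLUDE_OTHER_CATEGORY));
        for (int d = 1; d <= 9; d++) {
            System.out.printf("%d: %.3f%n", d, benfordExpected(d));
        }
    }
}
```

Comparing the observed category counts with `benfordExpected` is the kind of check that the one-hot and reduction steps mentioned above would compute over a full dataset.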

| Modifier and Type | Field and Description |
|---|---|
| protected String | inputColumn |
| protected FirstDigitTransform.Mode | mode |
| static String | OTHER_CATEGORY |
| protected String | outputColumn |

Fields inherited from class BaseTransform: inputSchema

| Constructor and Description |
|---|
| FirstDigitTransform(String inputColumn, String outputColumn, FirstDigitTransform.Mode mode) |

| Modifier and Type | Method and Description |
|---|---|
| String | columnName(): Returns the single column name this op is meant to run on |
| String[] | columnNames(): Returns the column names this op is meant to run on |
| List<Writable> | map(List<Writable> writables): Transform a writable into another writable |
| Object | map(Object input): Transform an object into another object |
| Object | mapSequence(Object sequence): Transform a sequence |
| String | outputColumnName(): The output column name after the operation has been applied |
| String[] | outputColumnNames(): The output column names. This will often be the same as the input |
| void | setInputSchema(Schema schema): Set the input schema |
| String | toString() |
| Schema | transform(Schema inputSchema) |

Methods inherited from class BaseTransform: getInputSchema, mapSequence

public static final String OTHER_CATEGORY

protected String inputColumn

protected String outputColumn

protected FirstDigitTransform.Mode mode

public FirstDigitTransform(String inputColumn, String outputColumn, FirstDigitTransform.Mode mode)
Parameters:
inputColumn - Input column name
outputColumn - Output column name. If same as input, input column is replaced
mode - See FirstDigitTransform.Mode

public List<Writable> map(List<Writable> writables)
Description copied from interface: Transform
Parameters:
writables - the record to transform

public Object map(Object input)
Description copied from interface: Transform
Parameters:
input - the record to transform

public Object mapSequence(Object sequence)
Description copied from interface: Transform

public String toString()
Overrides: toString in class BaseTransform

public String outputColumnName()
Description copied from interface: ColumnOp

public String[] outputColumnNames()
Description copied from interface: ColumnOp

public String[] columnNames()
Description copied from interface: ColumnOp

public String columnName()
Description copied from interface: ColumnOp

public void setInputSchema(Schema schema)
Description copied from interface: ColumnOp
Specified by: setInputSchema in interface ColumnOp
Overrides: setInputSchema in class BaseTransform

Copyright © 2020. All rights reserved.