Class VectorIndexer
- java.lang.Object
-
- org.apache.flink.ml.feature.vectorindexer.VectorIndexer
-
- All Implemented Interfaces:
Serializable, org.apache.flink.ml.api.Estimator<VectorIndexer,VectorIndexerModel>, org.apache.flink.ml.api.Stage<VectorIndexer>, org.apache.flink.ml.common.param.HasHandleInvalid<VectorIndexer>, org.apache.flink.ml.common.param.HasInputCol<VectorIndexer>, org.apache.flink.ml.common.param.HasOutputCol<VectorIndexer>, VectorIndexerModelParams<VectorIndexer>, VectorIndexerParams<VectorIndexer>, org.apache.flink.ml.param.WithParams<VectorIndexer>
public class VectorIndexer extends Object implements org.apache.flink.ml.api.Estimator<VectorIndexer,VectorIndexerModel>, VectorIndexerParams<VectorIndexer>
An Estimator which implements the vector indexing algorithm.
A vector indexer maps each column of the input vector to either a continuous or a categorical feature, depending on the number of distinct values in that column. If the number of distinct values in a column is greater than the maxCategories parameter, the corresponding output column is left unchanged. Otherwise, the column is transformed into categorical indices. For categorical outputs, the indices are in [0, numDistinctValuesInThisColumn]: the distinct values occupy 0 through numDistinctValuesInThisColumn - 1, and the last index is used only for the invalid-entry bucket of the `keep` option.
Within a categorical column, the output model assigns indices to the distinct values in ascending order, except that 0.0 is always mapped to 0 (to preserve sparsity). Two examples:
- If one column contains {-1.0, 1.0}, then -1.0 is encoded as 0 and 1.0 is encoded as 1.
- If one column contains {-1.0, 0.0, 1.0}, then -1.0 should be encoded as 1, 0.0 should be encoded as 0 and 1.0 should be encoded as 2.
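The mapping rule and the two examples above can be sketched in plain Java. This is only an illustration of the documented rule, not the Flink ML implementation: the class name `ColumnIndexSketch` and the method `buildMapping` are hypothetical, a column is represented as a bare `double[]`, and the value 20 passed as maxCategories is an arbitrary choice for the demonstration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.TreeSet;

// Hypothetical helper illustrating the documented rule; not part of Flink ML.
final class ColumnIndexSketch {

    /**
     * Builds a value-to-index mapping for one column.
     * Returns Optional.empty() when the column has more than maxCategories
     * distinct values, meaning it is treated as continuous and left unchanged.
     */
    static Optional<Map<Double, Integer>> buildMapping(double[] column, int maxCategories) {
        // Collect distinct values in ascending order.
        TreeSet<Double> distinct = new TreeSet<>();
        for (double v : column) {
            distinct.add(v == 0.0 ? 0.0 : v); // treat -0.0 and 0.0 as the same value
        }
        if (distinct.size() > maxCategories) {
            return Optional.empty();
        }

        Map<Double, Integer> mapping = new LinkedHashMap<>();
        int next = 0;
        // 0.0 always takes index 0 when it is present.
        if (distinct.contains(0.0)) {
            mapping.put(0.0, next++);
        }
        // The remaining values receive indices in ascending order.
        for (double v : distinct) {
            if (v != 0.0) {
                mapping.put(v, next++);
            }
        }
        return Optional.of(mapping);
    }

    public static void main(String[] args) {
        System.out.println(buildMapping(new double[] {-1.0, 1.0}, 20).get());      // {-1.0=0, 1.0=1}
        System.out.println(buildMapping(new double[] {-1.0, 0.0, 1.0}, 20).get()); // {0.0=0, -1.0=1, 1.0=2}
        // Three distinct values exceed maxCategories = 2, so no mapping is built.
        System.out.println(buildMapping(new double[] {1.0, 2.0, 3.0}, 2).isPresent()); // false
    }
}
```

Returning an empty `Optional` for a continuous column is a design choice of this sketch; it keeps the "leave the column unchanged" case distinct from a column that has a (possibly small) categorical mapping.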
The `keep` option of HasHandleInvalid means that invalid entries are put in a special bucket, whose index is the number of distinct values in this column.
- See Also:
- Serialized Form
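The `keep` behaviour described in the class comment can be sketched as a lookup with a fallback index. This is a minimal illustration, assuming that an "invalid entry" is a value absent from the fitted mapping of a categorical column; the class `KeepInvalidSketch` and the method `encodeKeep` are hypothetical names, and the other HasHandleInvalid options are not modelled here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper illustrating the `keep` option; not part of Flink ML.
final class KeepInvalidSketch {

    /**
     * Returns the index of value in the fitted mapping, or mapping.size()
     * (the number of distinct values in the column) when the value is not
     * in the mapping. Assumption: "not in the mapping" is what makes an
     * entry invalid.
     */
    static int encodeKeep(Map<Double, Integer> mapping, double value) {
        double key = (value == 0.0) ? 0.0 : value; // treat -0.0 and 0.0 alike
        Integer index = mapping.get(key);
        return index != null ? index : mapping.size();
    }

    public static void main(String[] args) {
        // Mapping for a column that contained {-1.0, 0.0, 1.0}.
        Map<Double, Integer> mapping = new LinkedHashMap<>();
        mapping.put(0.0, 0);
        mapping.put(-1.0, 1);
        mapping.put(1.0, 2);

        System.out.println(encodeKeep(mapping, 1.0)); // 2
        System.out.println(encodeKeep(mapping, 7.5)); // 3 (special bucket)
    }
}
```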
-
Field Summary
-
Fields inherited from interface org.apache.flink.ml.common.param.HasHandleInvalid
ERROR_INVALID, HANDLE_INVALID, KEEP_INVALID, SKIP_INVALID
-
Fields inherited from interface org.apache.flink.ml.feature.vectorindexer.VectorIndexerParams
MAX_CATEGORIES
-
Constructor Summary
- VectorIndexer()
-
Method Summary
Modifier and Type, Method:
- VectorIndexerModel fit(org.apache.flink.table.api.Table... inputs)
- Map<org.apache.flink.ml.param.Param<?>,Object> getParamMap()
- static VectorIndexer load(org.apache.flink.table.api.bridge.java.StreamTableEnvironment tEnv, String path)
- void save(String path)
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.flink.ml.feature.vectorindexer.VectorIndexerParams
getMaxCategories, setMaxCategories
-
Method Detail
-
fit
public VectorIndexerModel fit(org.apache.flink.table.api.Table... inputs)
- Specified by:
fit in interface org.apache.flink.ml.api.Estimator<VectorIndexer,VectorIndexerModel>
-
save
public void save(String path) throws IOException
- Specified by:
save in interface org.apache.flink.ml.api.Stage<VectorIndexer>
- Throws:
IOException
-
load
public static VectorIndexer load(org.apache.flink.table.api.bridge.java.StreamTableEnvironment tEnv, String path) throws IOException
- Throws:
IOException
-
getParamMap
public Map<org.apache.flink.ml.param.Param<?>,Object> getParamMap()
- Specified by:
getParamMap in interface org.apache.flink.ml.param.WithParams<VectorIndexer>
-