@Reference(type=Inproceedings, author={"Amirthalingam Ramanan","Mahesan Niranjan"}, title="Resource-Allocating Codebook for Patch-based Face Recognition", year="2009", booktitle="IIS", url="http://eprints.ecs.soton.ac.uk/21401/")
public class IntRAC extends Object implements SpatialClusters<int[]>, SpatialClusterer<IntRAC,int[]>, CentroidsProvider<int[]>, HardAssigner<int[],float[],IntFloatPair>
During training, data points are selected at random. The first data point is chosen as a centroid. Each subsequent data point becomes a new centroid if it lies outside the threshold distance of every current centroid. Because this makes it difficult to guarantee a particular number of clusters, a minimisation function is provided that estimates the threshold required for a given K.
This implementation supports int[] cluster centroids.
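The training procedure described above can be sketched in a few lines of plain Java. This is a minimal stand-alone illustration, not the IntRAC source: the class name `RacSketch` and its methods are hypothetical, the comparison against a squared Euclidean distance is an assumption suggested by the `radiusSquared` constructor parameter, and the random selection of points is left out so that points are taken in the order they are passed in.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a resource-allocating codebook; not the IntRAC implementation. */
class RacSketch {
    // Assumption: the threshold is compared against squared Euclidean distance.
    private final double thresholdSquared;
    private final List<int[]> codebook = new ArrayList<>();

    RacSketch(double thresholdSquared) {
        this.thresholdSquared = thresholdSquared;
    }

    /** Squared Euclidean distance between two equal-length vectors. */
    static double distanceSquared(int[] a, int[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = (double) a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /**
     * Returns the index of the nearest centroid. If the codebook is empty, or the
     * nearest centroid is further away than the threshold, the point itself is
     * stored as a new centroid and its index is returned.
     */
    int assign(int[] point) {
        int best = -1;
        double bestDistance = Double.MAX_VALUE;
        for (int i = 0; i < codebook.size(); i++) {
            double d = distanceSquared(point, codebook.get(i));
            if (d < bestDistance) {
                bestDistance = d;
                best = i;
            }
        }
        if (best < 0 || bestDistance > thresholdSquared) {
            codebook.add(point.clone());
            return codebook.size() - 1;
        }
        return best;
    }

    /** Number of centroids created so far. */
    int size() {
        return codebook.size();
    }
}
```

Feeding every training point through `assign` in turn builds the codebook; the number of centroids that results depends on the threshold and on the order in which points arrive.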
In terms of implementation, this class is at once a clusterer, an assigner and
the result of the clustering. This is because the RAC algorithm never ends:
if a new point is assigned through the HardAssigner interface and that point
is more than the threshold distance from every existing centroid, a new
centroid is created for the point. If this behaviour is undesirable, the
results of clustering can be "frozen" by manually constructing an assigner
that takes a CentroidsProvider (or the centroids returned by calling
getCentroids()) as an argument.
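The idea of freezing a codebook can be illustrated with a stand-alone nearest-centroid assigner that works on a snapshot of the centroids. The class `FrozenAssignerSketch` below is hypothetical and deliberately avoids any library types; in practice the snapshot would be the array returned by getCentroids(), and an existing assigner class from the library would normally be used instead of hand-written code.

```java
/** Hypothetical frozen nearest-centroid assigner; it never adds new centroids. */
class FrozenAssignerSketch {
    private final int[][] centroids;

    /** Takes a deep copy so that later growth of the source codebook has no effect. */
    FrozenAssignerSketch(int[][] centroids) {
        if (centroids.length == 0) {
            throw new IllegalArgumentException("at least one centroid is required");
        }
        this.centroids = new int[centroids.length][];
        for (int i = 0; i < centroids.length; i++) {
            this.centroids[i] = centroids[i].clone();
        }
    }

    /** Returns the index of the closest centroid, however far away it is. */
    int assign(int[] point) {
        int best = 0;
        double bestDistance = Double.MAX_VALUE;
        for (int i = 0; i < centroids.length; i++) {
            double sum = 0;
            for (int j = 0; j < point.length; j++) {
                double d = (double) point[j] - centroids[i][j];
                sum += d * d;
            }
            if (sum < bestDistance) {
                bestDistance = sum;
                best = i;
            }
        }
        return best;
    }

    /** The number of centroids; fixed at construction time. */
    int size() {
        return centroids.length;
    }
}
```

Unlike the growing behaviour described above, a point far from every centroid is still mapped to its nearest one, so the value reported by `size()` stays constant.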
| Modifier and Type | Field |
|---|---|
| protected ArrayList<int[]> | codebook |
| protected static int[][] | distances |
| protected int | nDims |
| protected double | threshold |
| protected long | totalSamples |

CLUSTER_HEADER

| Constructor | Description |
|---|---|
| IntRAC() | Sets the threshold to 128 |
| IntRAC(double radiusSquared) | Define the threshold at which point a new cluster will be made. |
| IntRAC(int[][] bKeys, int subSamples, int nClusters) | Iteratively select subSamples from bKeys and try to choose a threshold which results in nClusters. |
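To make the idea of choosing a threshold for a target number of clusters concrete, the sketch below searches for a threshold by bisection. This is not the library's optimiser: the class `ThresholdSearchSketch` and its methods are hypothetical, subsampling is omitted, distances are assumed to be squared Euclidean, and bisection relies on the assumption that the cluster count tends to fall as the threshold grows, which is not strictly guaranteed for every ordering of the data.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical bisection search for a threshold giving roughly targetK clusters. */
class ThresholdSearchSketch {
    static double distanceSquared(int[] a, int[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = (double) a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Counts the centroids created when samples are processed in the given order. */
    static int countClusters(int[][] samples, double thresholdSquared) {
        List<int[]> codebook = new ArrayList<>();
        for (int[] point : samples) {
            boolean covered = false;
            for (int[] centroid : codebook) {
                if (distanceSquared(point, centroid) <= thresholdSquared) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                codebook.add(point);
            }
        }
        return codebook.size();
    }

    /** Returns the tried threshold whose cluster count came closest to targetK. */
    static double estimateThreshold(int[][] samples, int targetK, int maxIterations) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("samples must not be empty");
        }
        double lo = 0;
        double hi = 0;
        // With this upper bound every sample lies within reach of the first one.
        for (int[] point : samples) {
            hi = Math.max(hi, distanceSquared(samples[0], point));
        }
        double best = hi;
        int bestGap = Math.abs(countClusters(samples, hi) - targetK);
        for (int i = 0; i < maxIterations; i++) {
            double mid = (lo + hi) / 2;
            int k = countClusters(samples, mid);
            int gap = Math.abs(k - targetK);
            if (gap < bestGap) {
                bestGap = gap;
                best = mid;
            }
            if (k == targetK) {
                return mid;
            }
            if (k > targetK) {
                lo = mid; // too many clusters: widen the radius
            } else {
                hi = mid; // too few clusters: shrink the radius
            }
        }
        return best;
    }
}
```

The returned value could then be passed to a constructor that accepts a squared radius. Because the count is not perfectly monotone in the threshold, the result is only an estimate and may not hit the target exactly.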
| Modifier and Type | Method | Description |
|---|---|---|
| String | asciiHeader() | |
| int | assign(int[] data) | Assign a single point to a cluster. |
| int[] | assign(int[][] data) | Assign data to a cluster. |
| IntFloatPair | assignDistance(int[] data) | Assign a single point to a cluster. |
| void | assignDistance(int[][] data, int[] indices, float[] distances) | Assign data to clusters. |
| byte[] | binaryHeader() | |
| protected static double | calculateThreshold(int[][] samples, int nClusters) | |
| IntRAC | cluster(DataSource<int[]> data) | Perform clustering with data from a data source. |
| IntRAC | cluster(int[][] data) | Perform clustering on the given data. |
| HardAssigner<int[],?,?> | defaultHardAssigner() | Get the default hard assigner for this clusterer. |
| int[][] | getCentroids() | |
| int | numClusters() | Get the number of clusters. |
| int | numDimensions() | Get the data dimensionality |
| int[][] | performClustering(int[][] data) | |
| void | readASCII(Scanner in) | |
| void | readBinary(DataInput dis) | |
| int | size() | The number of centroids; this potentially grows as assignments are made. |
| void | writeASCII(PrintWriter writer) | |
| void | writeBinary(DataOutput dos) | |
protected double threshold

protected int nDims

protected static int[][] distances

protected long totalSamples

public IntRAC()

public IntRAC(double radiusSquared)
Parameters:
radiusSquared -

public IntRAC(int[][] bKeys, int subSamples, int nClusters)
Parameters:
bKeys - All keys to be trained against
subSamples - number of subsamples to select from bKeys each iteration
nClusters - number of clusters to aim for

protected static double calculateThreshold(int[][] samples, int nClusters) throws org.apache.commons.math.MaxIterationsExceededException, org.apache.commons.math.FunctionEvaluationException
Throws:
org.apache.commons.math.MaxIterationsExceededException
org.apache.commons.math.FunctionEvaluationException

public IntRAC cluster(int[][] data)
Description copied from interface: SpatialClusterer
Specified by:
cluster in interface SpatialClusterer<IntRAC,int[]>
Parameters:
data - the data.

public IntRAC cluster(DataSource<int[]> data)
Description copied from interface: SpatialClusterer
DataSource could potentially be backed by disk rather than in memory.
Specified by:
cluster in interface SpatialClusterer<IntRAC,int[]>
Parameters:
data - the data.

public int numClusters()
Description copied from interface: SpatialClusters
Specified by:
numClusters in interface SpatialClusters<int[]>

public int numDimensions()
Description copied from interface: SpatialClusters
Specified by:
numDimensions in interface Assigner<int[]>
numDimensions in interface SpatialClusters<int[]>

public int[] assign(int[][] data)
Description copied from interface: HardAssigner
Specified by:
assign in interface HardAssigner<int[],float[],IntFloatPair>
Parameters:
data - the data.

public int assign(int[] data)
Description copied from interface: HardAssigner
Specified by:
assign in interface HardAssigner<int[],float[],IntFloatPair>
Parameters:
data - datum to assign.

public String asciiHeader()
Specified by:
asciiHeader in interface ReadableASCII
asciiHeader in interface WriteableASCII

public byte[] binaryHeader()
Specified by:
binaryHeader in interface ReadableBinary
binaryHeader in interface WriteableBinary

public void readASCII(Scanner in) throws IOException
Specified by:
readASCII in interface ReadableASCII
Throws:
IOException

public void readBinary(DataInput dis) throws IOException
Specified by:
readBinary in interface ReadableBinary
Throws:
IOException

public void writeASCII(PrintWriter writer) throws IOException
Specified by:
writeASCII in interface WriteableASCII
Throws:
IOException

public void writeBinary(DataOutput dos) throws IOException
Specified by:
writeBinary in interface WriteableBinary
Throws:
IOException

public int[][] getCentroids()
Specified by:
getCentroids in interface CentroidsProvider<int[]>

public void assignDistance(int[][] data, int[] indices, float[] distances)
Description copied from interface: HardAssigner
Specified by:
assignDistance in interface HardAssigner<int[],float[],IntFloatPair>
Parameters:
data - the data.
indices - the cluster index for each data point.
distances - the distance to the closest cluster for each data point.

public IntFloatPair assignDistance(int[] data)
Description copied from interface: HardAssigner
Specified by:
assignDistance in interface HardAssigner<int[],float[],IntFloatPair>
Parameters:
data - point to assign.

public HardAssigner<int[],?,?> defaultHardAssigner()
Description copied from interface: SpatialClusters
Specified by:
defaultHardAssigner in interface SpatialClusters<int[]>

public int size()
Specified by:
size in interface HardAssigner<int[],float[],IntFloatPair>
See Also:
HardAssigner.size()

public int[][] performClustering(int[][] data)
Specified by:
performClustering in interface Clusterer<int[][]>