Package de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.multisource.dispatchers.clustermerge
Class ClustererSmile
java.lang.Object
de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.multisource.dispatchers.clustermerge.ClustererSmile
All Implemented Interfaces:
Clusterer
Clusterer based on the SMILE library.
Field Summary
Modifier and Type | Field | Description
private static final org.slf4j.Logger | LOGGER |
private int | numberOfExamplesPerThread | Number of examples to process in each thread for computing the distance matrix.
private int | numberOfThreads | Number of threads to compute the distance matrix.
private boolean | useBLAS | If true, uses the BLAS component to calculate the distance matrix (this might not be numerically stable).
Constructor Summary
Constructor | Description
ClustererSmile() |
ClustererSmile(int numberOfThreads) | Clusterer based on the SMILE library.
ClustererSmile(int numberOfThreads, int numberOfExamplesPerThread) | Clusterer based on the SMILE library.
ClustererSmile(int numberOfThreads, int numberOfExamplesPerThread, boolean useBLAS) | Clusterer based on the SMILE library.
Method Summary
Modifier and Type | Method | Description
private smile.clustering.linkage.Linkage | getLinkage(int exampleSize, float[] proximity, ClusterLinkage linkage) | Return the linkage which can be used to calculate the hierarchical clustering.
float[] | getProximity(double[][] features, ClusterDistance distance) | Calculates the proximity which is the lower triangular part of the distance matrix.
private static smile.math.distance.Distance<double[]> | getSmileDistanceFunction(ClusterDistance distance) |
static float[] | proximity(double[][] data, ClusterDistance clusterDistance) |
static float[] | proximityEuclideanParallel(double[][] data, int numberOfThreads, int numberOfExamplesPerThread, boolean squared) |
static float[] | proximityParallel(double[][] data, ClusterDistance clusterDistance) |
 | run(double[][] features, ClusterLinkage linkage, ClusterDistance distance) |
Field Details
LOGGER
private static final org.slf4j.Logger LOGGER
-
numberOfThreads
private int numberOfThreads
Number of threads used to compute the distance matrix.
-
numberOfExamplesPerThread
private int numberOfExamplesPerThread
Number of examples to process in each thread when computing the distance matrix. For example, with a value of 10, the distances for a batch of 10 examples are computed in one thread.
-
useBLAS
private boolean useBLAS
If true, uses the BLAS component to calculate the distance matrix (this might not be numerically stable).
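The stability caveat can be illustrated with a small sketch. It assumes that a BLAS-based path rewrites the squared Euclidean distance as ||x||^2 + ||y||^2 - 2<x, y>, so that the cross terms can come from a matrix product; the source does not confirm that this is what ClustererSmile does. The class and method names below are hypothetical.

```java
// Sketch only: compares the direct and the expanded form of the squared
// Euclidean distance. The expanded form is an ASSUMPTION about how a
// BLAS-backed implementation might work, not the documented behaviour.
final class GramDistanceSketch {

    // Direct form: sum of squared coordinate differences, never negative.
    static double squaredDirect(double[] x, double[] y) {
        double sum = 0.0;
        for (int k = 0; k < x.length; k++) {
            double d = x[k] - y[k];
            sum += d * d;
        }
        return sum;
    }

    // Expanded form: ||x||^2 + ||y||^2 - 2 * <x, y>.
    // Subtracting large, nearly equal numbers can lose precision and may
    // even yield a slightly negative result for nearly identical vectors.
    static double squaredExpanded(double[] x, double[] y) {
        double xx = 0.0;
        double yy = 0.0;
        double xy = 0.0;
        for (int k = 0; k < x.length; k++) {
            xx += x[k] * x[k];
            yy += y[k] * y[k];
            xy += x[k] * y[k];
        }
        return xx + yy - 2.0 * xy;
    }

    // Clamp at zero before taking the square root so rounding errors
    // cannot produce NaN.
    static double distanceExpanded(double[] x, double[] y) {
        return Math.sqrt(Math.max(0.0, squaredExpanded(x, y)));
    }

    public static void main(String[] args) {
        // Two nearly identical vectors with large coordinates.
        double[] a = {1e8, 1.0};
        double[] b = {1e8, 1.0 + 1e-4};
        System.out.println("direct:   " + squaredDirect(a, b));
        System.out.println("expanded: " + squaredExpanded(a, b));
    }
}
```

Running main prints both values side by side; any difference between them is rounding error introduced by the expanded form.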
Constructor Details
ClustererSmile
public ClustererSmile() -
ClustererSmile
public ClustererSmile(int numberOfThreads)
Clusterer based on the SMILE library.
Parameters:
numberOfThreads - number of threads to use (-1 to use all processors, 0 to use no threads)
-
ClustererSmile
public ClustererSmile(int numberOfThreads, int numberOfExamplesPerThread)
Clusterer based on the SMILE library.
Parameters:
numberOfThreads - number of threads to use (-1 to use all processors, 0 to use no threads)
numberOfExamplesPerThread - number of examples to compute in each batch/thread
-
ClustererSmile
public ClustererSmile(int numberOfThreads, int numberOfExamplesPerThread, boolean useBLAS)
Clusterer based on the SMILE library.
Parameters:
numberOfThreads - number of threads to use (-1 to use all processors, 0 to use no threads)
numberOfExamplesPerThread - number of examples to compute in each batch/thread
useBLAS - if true, uses the BLAS component to calculate the distance matrix (this might not be numerically stable)
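The special values of numberOfThreads can be read as a small mapping from the parameter to a worker count. The following is a minimal sketch of that mapping; the helper name resolveWorkerCount is hypothetical, and rejecting values below -1 is an assumption of this sketch rather than documented behaviour.

```java
// Sketch only: one possible interpretation of the numberOfThreads parameter.
final class ThreadCountSketch {

    // -1  -> use all available processors
    //  0  -> no worker threads (compute in the calling thread)
    // n>0 -> use exactly n threads
    static int resolveWorkerCount(int numberOfThreads) {
        if (numberOfThreads == -1) {
            return Runtime.getRuntime().availableProcessors();
        }
        if (numberOfThreads < -1) {
            // Assumption: other negative values are treated as invalid.
            throw new IllegalArgumentException(
                    "numberOfThreads must be -1, 0, or positive: " + numberOfThreads);
        }
        return numberOfThreads;
    }

    public static void main(String[] args) {
        System.out.println("-1 resolves to " + resolveWorkerCount(-1) + " worker(s)");
        System.out.println(" 0 resolves to " + resolveWorkerCount(0) + " worker(s)");
        System.out.println(" 4 resolves to " + resolveWorkerCount(4) + " worker(s)");
    }
}
```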
Method Details
run
-
getLinkage
private smile.clustering.linkage.Linkage getLinkage(int exampleSize, float[] proximity, ClusterLinkage linkage)
Returns the linkage that can be used to calculate the hierarchical clustering.
Parameters:
exampleSize - the number of examples
proximity - the lower triangle of the distance matrix (linearized)
linkage - the linkage method (e.g. single or complete linkage)
Returns:
the linkage
-
getProximity
Calculates the proximity, which is the lower triangular part of the distance matrix. The returned float array therefore contains all pairwise distances.
Parameters:
features - the features to use
distance - the distance metric to use
Returns:
the proximity
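To make the idea of a linearized lower triangle concrete, here is a minimal sketch that packs all pairwise Euclidean distances into one float array. The row-wise layout without the diagonal, and the class and method names, are assumptions of this sketch; the layout that ClustererSmile or SMILE actually expects may differ (for example, it may be column-wise or include the diagonal).

```java
import java.util.Arrays;

// Sketch only: packs the strict lower triangle of a distance matrix
// row by row: (1,0), (2,0), (2,1), (3,0), ...
final class PackedProximitySketch {

    // Position of the pair (i, j), i != j, in the packed array.
    static int index(int i, int j) {
        if (i == j) {
            throw new IllegalArgumentException("the diagonal is not stored");
        }
        int row = Math.max(i, j);
        int col = Math.min(i, j);
        return row * (row - 1) / 2 + col;
    }

    static double euclidean(double[] x, double[] y) {
        double sum = 0.0;
        for (int k = 0; k < x.length; k++) {
            double d = x[k] - y[k];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // n examples yield n * (n - 1) / 2 pairwise distances.
    static float[] proximity(double[][] data) {
        int n = data.length;
        float[] packed = new float[n * (n - 1) / 2];
        for (int i = 1; i < n; i++) {
            for (int j = 0; j < i; j++) {
                packed[index(i, j)] = (float) euclidean(data[i], data[j]);
            }
        }
        return packed;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {3, 4}, {6, 8}};
        System.out.println(Arrays.toString(proximity(points)));
    }
}
```

Storing only one triangle halves the memory needed for a symmetric matrix, which matters because the number of pairs grows quadratically with the number of examples.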
-
getSmileDistanceFunction
private static smile.math.distance.Distance<double[]> getSmileDistanceFunction(ClusterDistance distance)
-
proximity
-
proximityParallel
-
proximityEuclideanParallel
public static float[] proximityEuclideanParallel(double[][] data, int numberOfThreads, int numberOfExamplesPerThread, boolean squared)
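No description is given for this method, so the following is only a sketch of how a batched, multi-threaded Euclidean proximity computation with these four parameters could look. It is not the MELT implementation: the class name, the row-wise packed layout without the diagonal, and the choice to run sequentially when numberOfThreads is 0 are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch only: splits the rows of the distance matrix into batches and
// computes each batch in its own task.
final class ParallelProximitySketch {

    static float[] proximityEuclidean(double[][] data, int numberOfThreads,
                                      int numberOfExamplesPerThread, boolean squared) {
        if (numberOfExamplesPerThread < 1) {
            throw new IllegalArgumentException("batch size must be positive");
        }
        int n = data.length;
        float[] packed = new float[n * (n - 1) / 2];

        // One task per batch of rows; tasks write to disjoint array positions.
        List<Runnable> tasks = new ArrayList<>();
        for (int start = 1; start < n; start += numberOfExamplesPerThread) {
            final int from = start;
            final int to = Math.min(n, start + numberOfExamplesPerThread);
            tasks.add(() -> fillRows(data, packed, from, to, squared));
        }

        if (numberOfThreads == 0) {
            // Assumption: 0 means "no threads", so run in the calling thread.
            for (Runnable task : tasks) {
                task.run();
            }
            return packed;
        }

        int workers = numberOfThreads == -1
                ? Runtime.getRuntime().availableProcessors()
                : numberOfThreads;
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (Runnable task : tasks) {
                futures.add(pool.submit(task));
            }
            for (Future<?> future : futures) {
                future.get(); // wait for completion and surface task failures
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
        return packed;
    }

    // Fills rows [from, to) of the strict lower triangle (row-wise packed).
    private static void fillRows(double[][] data, float[] packed,
                                 int from, int to, boolean squared) {
        for (int i = from; i < to; i++) {
            for (int j = 0; j < i; j++) {
                double sum = 0.0;
                for (int k = 0; k < data[i].length; k++) {
                    double d = data[i][k] - data[j][k];
                    sum += d * d;
                }
                packed[i * (i - 1) / 2 + j] = (float) (squared ? sum : Math.sqrt(sum));
            }
        }
    }
}
```

A call such as `ParallelProximitySketch.proximityEuclidean(data, -1, 10, false)` would use all processors with batches of 10 rows. Later rows contain more pairs than earlier ones, so equally sized row batches do not carry equal work; a real implementation might balance this differently.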
-