java.lang.Object
de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.multisource.dispatchers.clustermerge.ClustererSmile
All Implemented Interfaces:
Clusterer

public class ClustererSmile extends Object implements Clusterer
Clusterer based on the SMILE library.
  • Field Details

    • LOGGER

      private static final org.slf4j.Logger LOGGER
    • numberOfThreads

      private int numberOfThreads
      Number of threads to compute the distance matrix.
    • numberOfExamplesPerThread

      private int numberOfExamplesPerThread
      Number of examples to process in each thread when computing the distance matrix. For example, if set to 10, the distances for each batch of 10 examples are computed in one thread.
    • useBLAS

      private boolean useBLAS
      If true, uses the BLAS component to calculate the distance matrix (this might not be numerically stable).
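The stability note on useBLAS can be illustrated with a small sketch. It assumes (this page does not confirm it) that a BLAS-backed computation obtains squared Euclidean distances from the expansion ||a - b||^2 = ||a||^2 + ||b||^2 - 2(a . b), which maps onto a matrix product but can lose precision through cancellation when two vectors are nearly identical. The class GramDistanceSketch and its methods are hypothetical names for illustration only, not part of ClustererSmile.

```java
final class GramDistanceSketch {

    /** Direct evaluation: sum of squared coordinate differences. */
    static double squaredDirect(double[] a, double[] b) {
        double sum = 0.0;
        for (int k = 0; k < a.length; k++) {
            double d = a[k] - b[k];
            sum += d * d;
        }
        return sum;
    }

    /**
     * Expanded evaluation: ||a||^2 + ||b||^2 - 2 * (a . b).
     * This is the form a matrix product can compute in bulk (assumed here,
     * not taken from the ClustererSmile source).
     */
    static double squaredExpanded(double[] a, double[] b) {
        double aa = 0.0;
        double bb = 0.0;
        double ab = 0.0;
        for (int k = 0; k < a.length; k++) {
            aa += a[k] * a[k];
            bb += b[k] * b[k];
            ab += a[k] * b[k];
        }
        // Cancellation may leave a tiny negative value; clamp it so that
        // a later square root stays defined.
        return Math.max(0.0, aa + bb - 2.0 * ab);
    }

    public static void main(String[] args) {
        // Two nearly identical vectors with large coordinates.
        double[] a = {1e8, 1.0};
        double[] b = {1e8, 1.0001};
        System.out.println("direct:   " + squaredDirect(a, b));
        System.out.println("expanded: " + squaredExpanded(a, b));
    }
}
```

For well-separated vectors with small coordinates the two evaluations agree; the difference shows up when large norms nearly cancel, as in the two vectors used in main.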
  • Constructor Details

    • ClustererSmile

      public ClustererSmile()
    • ClustererSmile

      public ClustererSmile(int numberOfThreads)
      Clusterer based on the SMILE library.
      Parameters:
      numberOfThreads - number of threads to use (-1 to use all processors and 0 to use no threads)
    • ClustererSmile

      public ClustererSmile(int numberOfThreads, int numberOfExamplesPerThread)
      Clusterer based on the SMILE library.
      Parameters:
      numberOfThreads - number of threads to use (-1 to use all processors and 0 to use no threads)
      numberOfExamplesPerThread - number of examples to compute in each batch/thread
    • ClustererSmile

      public ClustererSmile(int numberOfThreads, int numberOfExamplesPerThread, boolean useBLAS)
      Clusterer based on the SMILE library.
      Parameters:
      numberOfThreads - number of threads to use (-1 to use all processors and 0 to use no threads)
      numberOfExamplesPerThread - number of examples to compute in each batch/thread
      useBLAS - if true, uses the BLAS component to calculate the distance matrix (this might not be numerically stable)
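The constructor parameters above follow a simple convention, sketched here under stated assumptions: -1 is mapped to the number of available processors, 0 means the work stays on the calling thread, and the examples are split into contiguous batches of numberOfExamplesPerThread. The class ThreadBatchSketch, its helper methods, and the rejection of values below -1 are hypothetical choices for this illustration and are not taken from the ClustererSmile source.

```java
import java.util.ArrayList;
import java.util.List;

final class ThreadBatchSketch {

    /**
     * Maps the documented convention to a worker count:
     * -1 means all processors, 0 means no extra threads.
     * Rejecting values below -1 is an assumption of this sketch.
     */
    static int resolveWorkers(int numberOfThreads) {
        if (numberOfThreads == -1) {
            return Runtime.getRuntime().availableProcessors();
        }
        if (numberOfThreads < -1) {
            throw new IllegalArgumentException("numberOfThreads must be >= -1");
        }
        return numberOfThreads;
    }

    /**
     * Splits the example indices 0..n-1 into contiguous half-open ranges
     * [start, end) holding at most examplesPerThread examples each.
     */
    static List<int[]> batches(int n, int examplesPerThread) {
        if (examplesPerThread <= 0) {
            throw new IllegalArgumentException("examplesPerThread must be positive");
        }
        List<int[]> ranges = new ArrayList<>();
        for (int start = 0; start < n; start += examplesPerThread) {
            ranges.add(new int[] {start, Math.min(start + examplesPerThread, n)});
        }
        return ranges;
    }

    public static void main(String[] args) {
        System.out.println("workers: " + resolveWorkers(-1));
        for (int[] range : batches(25, 10)) {
            System.out.println("batch [" + range[0] + ", " + range[1] + ")");
        }
    }
}
```

Smaller batches spread the work more evenly across threads at the cost of more scheduling overhead; larger batches do the opposite.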
  • Method Details

    • run

      public MergeOrder run(double[][] features, ClusterLinkage linkage, ClusterDistance distance)
      Specified by:
      run in interface Clusterer
    • getLinkage

      private smile.clustering.linkage.Linkage getLinkage(int exampleSize, float[] proximity, ClusterLinkage linkage)
      Return the linkage which can be used to calculate the hierarchical clustering.
      Parameters:
      exampleSize - the number of examples
      proximity - the lower triangle of the distance matrix (linearized)
      linkage - the linkage method (e.g. single or complete linkage)
      Returns:
      linkage
    • getProximity

      public float[] getProximity(double[][] features, ClusterDistance distance)
      Calculates the proximity, which is the lower triangular part of the distance matrix in linearized form. The returned float array therefore contains all pairwise distances.
      Parameters:
      features - the features to use
      distance - the distance metric to use
      Returns:
      the proximity
    • getSmileDistanceFunction

      private static smile.math.distance.Distance<double[]> getSmileDistanceFunction(ClusterDistance distance)
    • proximity

      public static float[] proximity(double[][] data, ClusterDistance clusterDistance)
    • proximityParallel

      public static float[] proximityParallel(double[][] data, ClusterDistance clusterDistance)
    • proximityEuclideanParallel

      public static float[] proximityEuclideanParallel(double[][] data, int numberOfThreads, int numberOfExamplesPerThread, boolean squared)
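The linearized lower triangle returned by getProximity and the proximity methods can be pictured with a minimal sketch. It assumes a row-wise layout that excludes the diagonal and uses the Euclidean distance; the layout actually used by ClustererSmile or expected by SMILE may differ (for example by including the diagonal or by packing column-wise). LowerTriangleSketch and its methods are hypothetical names for illustration.

```java
final class LowerTriangleSketch {

    /** Euclidean distance between two vectors of equal length. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int k = 0; k < a.length; k++) {
            double d = a[k] - b[k];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Packs the strict lower triangle row by row:
     * (1,0), (2,0), (2,1), (3,0), ... giving n * (n - 1) / 2 entries.
     */
    static float[] proximity(double[][] data) {
        int n = data.length;
        float[] result = new float[n * (n - 1) / 2];
        int pos = 0;
        for (int i = 1; i < n; i++) {
            for (int j = 0; j < i; j++) {
                result[pos++] = (float) euclidean(data[i], data[j]);
            }
        }
        return result;
    }

    /** Position of the pair (i, j) with i > j in the packed array. */
    static int index(int i, int j) {
        return i * (i - 1) / 2 + j;
    }

    public static void main(String[] args) {
        double[][] data = {{0, 0}, {3, 4}, {0, 8}};
        float[] p = proximity(data);
        // Distance between example 2 and example 1.
        System.out.println(p[index(2, 1)]);
    }
}
```

Storing only one triangle halves the memory of a full n x n matrix, which matters because the number of pairwise distances grows quadratically with the number of examples.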