Class MultiSourceDispatcherIncrementalMerge

java.lang.Object
de.uni_mannheim.informatik.dws.melt.matching_base.multisource.MatcherMultiSourceURL
de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.multisource.dispatchers.MultiSourceDispatcherIncrementalMerge
All Implemented Interfaces:
IMatcherMultiSourceCaller, MultiSourceDispatcher
Direct Known Subclasses:
MultiSourceDispatcherIncrementalMergeByCluster, MultiSourceDispatcherIncrementalMergeByOrder

public abstract class MultiSourceDispatcherIncrementalMerge extends MatcherMultiSourceURL implements MultiSourceDispatcher, IMatcherMultiSourceCaller
Matches multiple ontologies / knowledge graphs with an incremental merge approach: two ontologies are merged, the resulting union is then possibly merged with another ontology, and so on. The order in which they are merged is defined by subclasses.
  • Field Details

    • LOGGER

      private static final org.slf4j.Logger LOGGER
    • matcherSupplier

      private Supplier<Object> matcherSupplier
    • numberOfThreads

      private int numberOfThreads
    • addingInformationToUnion

      private boolean addingInformationToUnion
    • removeUnusedJenaModels

      private boolean removeUnusedJenaModels
    • copyMode

      private CopyMode copyMode
    • intermediateAlignments

      private List<Alignment> intermediateAlignments
    • mergeOrderFileCache

      private FileCache<MergeOrder> mergeOrderFileCache
    • serializedTreeFile

      private File serializedTreeFile
    • goldStandard

      private Map<Map.Entry<String,String>,Alignment> goldStandard
    • idExtractor

      private DatasetIDExtractor idExtractor
  • Constructor Details

    • MultiSourceDispatcherIncrementalMerge

      public MultiSourceDispatcherIncrementalMerge(Object oneToOneMatcher, boolean addInformationToUnion)
      Constructor which expects the actual one-to-one matcher and a boolean indicating whether information should be added to the union.
      Parameters:
      oneToOneMatcher - the one-to-one matcher
      addInformationToUnion - if true, all information from matched entities is added to the union.
    • MultiSourceDispatcherIncrementalMerge

      public MultiSourceDispatcherIncrementalMerge(Object oneToOneMatcher)
    • MultiSourceDispatcherIncrementalMerge

      public MultiSourceDispatcherIncrementalMerge(Supplier<Object> matcherSupplier, boolean addInformationToUnion)
      Constructor which expects a supplier of one-to-one matchers and a boolean indicating whether information should be added to the union.
      Parameters:
      matcherSupplier - a function which returns a newly configured matcher every time it is called.
      addInformationToUnion - if true, all information from matched entities is added to the union.
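
      The supplier contract described above (a new matcher on every call) can be sketched in plain Java. This is only an illustration: StatefulMatcherSketch and MatcherSupplierSketch are hypothetical names and not part of MELT, and the stand-in class does no real matching.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a stateful one-to-one matcher; NOT a MELT class.
class StatefulMatcherSketch {
    private int calls = 0;

    // Pretends to match and counts how often this instance was used.
    int match() {
        return ++calls;
    }
}

class MatcherSupplierSketch {
    // Returns a supplier that creates a new matcher instance on every get() call,
    // so no state is shared between two invocations.
    static Supplier<Object> freshMatcherSupplier() {
        return StatefulMatcherSketch::new;
    }
}
```

      A supplier built this way could be handed to the Supplier<Object> constructor of a concrete subclass. Returning the same cached instance from the supplier would defeat its purpose.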
    • MultiSourceDispatcherIncrementalMerge

      public MultiSourceDispatcherIncrementalMerge(Supplier<Object> matcherSupplier)
  • Method Details

    • match

      public URL match(List<URL> models, URL inputAlignment, URL parameters) throws Exception
      Description copied from class: MatcherMultiSourceURL
      Matches multiple ontologies/knowledge graphs together.
      Specified by:
      match in class MatcherMultiSourceURL
      Parameters:
      models - the ontologies/knowledge graphs as URLs
      inputAlignment - the input alignment as URL (alignment API format)
      parameters - the URL of the parameters file. Supported formats are currently JSON and YAML.
      Returns:
      an alignment as a URL (most often a file URL); the format is again the alignment API format.
      Throws:
      Exception - in case something went wrong
    • needsTransitiveClosureForEvaluation

      public boolean needsTransitiveClosureForEvaluation()
      Description copied from class: MatcherMultiSourceURL
      Returns whether the matcher needs a transitive closure for evaluation. For example, some matchers only produce the correspondences A-B and B-C; if the test case asks for A-C, that correspondence is only found when the transitive closure is computed.
      Specified by:
      needsTransitiveClosureForEvaluation in interface IMatcherMultiSourceCaller
      Overrides:
      needsTransitiveClosureForEvaluation in class MatcherMultiSourceURL
      Returns:
      true if the transitive closure is needed, false otherwise
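
      To illustrate what a transitive closure over correspondences means, the following minimal sketch groups entities with a union-find structure and lists every implied pair. It is not MELT's implementation: the class name, the use of plain strings for entities, and the "a|b" pair encoding are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class TransitiveClosureSketch {
    // Finds the representative of x, registering x on first sight (with path compression).
    static String find(Map<String, String> parent, String x) {
        parent.putIfAbsent(x, x);
        String p = parent.get(x);
        if (!p.equals(x)) {
            p = find(parent, p);
            parent.put(x, p);
        }
        return p;
    }

    // Returns all unordered pairs implied by the correspondences, encoded as "a|b" with a before b.
    static Set<String> closure(List<String[]> correspondences) {
        Map<String, String> parent = new HashMap<>();
        for (String[] c : correspondences) {
            String a = find(parent, c[0]);
            String b = find(parent, c[1]);
            if (!a.equals(b)) {
                parent.put(a, b); // union the two clusters
            }
        }
        // Group every entity under its representative.
        Map<String, List<String>> clusters = new TreeMap<>();
        for (String entity : new ArrayList<>(parent.keySet())) {
            clusters.computeIfAbsent(find(parent, entity), k -> new ArrayList<>()).add(entity);
        }
        // Emit every pair inside each cluster.
        Set<String> pairs = new TreeSet<>();
        for (List<String> members : clusters.values()) {
            Collections.sort(members);
            for (int i = 0; i < members.size(); i++) {
                for (int j = i + 1; j < members.size(); j++) {
                    pairs.add(members.get(i) + "|" + members.get(j));
                }
            }
        }
        return pairs;
    }
}
```

      For the correspondences A-B and B-C, the closure contains A|B, B|C and the implied pair A|C.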
    • match

      public AlignmentAndParameters match(List<Set<Object>> models, Object inputAlignment, Object parameters) throws Exception
      Description copied from interface: IMatcherMultiSourceCaller
      Matches multiple ontologies / knowledge graphs together.
      Specified by:
      match in interface IMatcherMultiSourceCaller
      Parameters:
      models - a list of sets of objects, where each set contains different representations of the same ontology / knowledge graph.
      inputAlignment - this object represents the input alignment.
      parameters - object representing additional parameters. Only add to this object and do not create a new object (e.g. parameters = new ...()), because otherwise the parameters are lost (Java is call by value). Sensible classes are Properties, Map<String, Object>, or any similar data structure. Some already specified keys (strings) can be found in ParameterConfigKeys.
      Returns:
      the resulting alignment of the matching process.
      Throws:
      Exception - in case of any errors
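
      The warning about the parameters object follows from Java passing references by value. The following sketch contrasts the two behaviours; the class name and the key "hypotheticalKey" are invented for illustration and are not keys from ParameterConfigKeys.

```java
import java.util.Properties;

class ParameterPassingSketch {
    // Mutates the object the caller passed in: the new entry is visible to the caller.
    static void addToParameters(Object parameters) {
        if (parameters instanceof Properties) {
            ((Properties) parameters).setProperty("hypotheticalKey", "value");
        }
    }

    // Rebinds only the local variable: the caller's object stays unchanged
    // and the new entry is lost when the method returns.
    static void replaceParameters(Object parameters) {
        parameters = new Properties();
        ((Properties) parameters).setProperty("hypotheticalKey", "value");
    }
}
```

      After replaceParameters the caller's Properties object is still empty, whereas after addToParameters it contains the new key.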
    • runSequential

      private AlignmentAndParameters runSequential(MergeOrder mergeOrder, List<Set<Object>> models, Object inputAlignment, Properties p) throws MatchingException, Exception
      Throws:
      MatchingException
      Exception
    • runParallel

      private AlignmentAndParameters runParallel(MergeOrder mergeOrder, List<Set<Object>> models, Object inputAlignment, Properties p) throws MatchingException
      Throws:
      MatchingException
    • addDistance

      private static Properties addDistance(Properties p, double distance, double normalizedDistance)
    • getMergeTree

      public abstract MergeOrder getMergeTree(List<Set<Object>> models, Object parameters)
      Returns the merging tree (which ontologies are merged in which order). Have a look at the return description to see the merging tree format.
      Parameters:
      models - the models
      parameters - object representing additional parameters.
      Returns:
      the merging tree. For n models, this is an (n-1) by 2 matrix where row i describes the merging of clusters at step i of the clustering. If an element j in the row is less than n, then observation j was merged at this stage. If j ≥ n, then the merge was with the cluster formed at the (earlier) stage j-n of the algorithm.
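
      The matrix encoding in the return description can be decoded as sketched below. The sketch assumes the tree is available as a plain int[][] purely for illustration; the actual return type is MergeOrder, whose accessors are not shown in this section. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class MergeTreeSketch {
    // Expands each row of an (n-1) x 2 merge matrix into the set of model indices
    // contained in the cluster that the row creates.
    static List<Set<Integer>> clustersPerStep(int[][] tree, int n) {
        List<Set<Integer>> formed = new ArrayList<>();
        for (int[] row : tree) {
            Set<Integer> cluster = new TreeSet<>();
            for (int j : row) {
                if (j < n) {
                    cluster.add(j);                    // j is an original model
                } else {
                    cluster.addAll(formed.get(j - n)); // j refers to the cluster of step j-n
                }
            }
            formed.add(cluster);
        }
        return formed;
    }
}
```

      For n = 4 and the rows {0,1}, {2,3}, {4,5}, the three steps produce the clusters {0,1}, {2,3} and {0,1,2,3}: the last row merges the clusters formed at steps 0 and 1.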
    • isLeftModelGreater

      protected boolean isLeftModelGreater(Set<Object> leftOntology, Set<Object> rightOntology, Properties p) throws TypeTransformationException
      Throws:
      TypeTransformationException
    • callClearIndex

      private void callClearIndex()
    • getNumberOfThreads

      public int getNumberOfThreads()
      Returns the number of threads which are used during merge. A number equal to one means sequential processing and greater than one means parallel processing.
      Returns:
      the number of threads used.
    • setNumberOfThreads

      public void setNumberOfThreads(int numberOfThreads)
      Sets the number of threads which are used during merge. A number equal to one means sequential processing and greater than one means parallel processing with the specified number of threads.
      Parameters:
      numberOfThreads - the number of threads to use. Values greater than or equal to one are allowed.
    • setNumberOfThreadsToCpuCores

      public void setNumberOfThreadsToCpuCores()
      Sets the number of threads which are used during merge to the number of available CPU cores. A number equal to one means sequential processing and greater than one means parallel processing with the specified number of threads.
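
      The thread-count semantics described for these setters (one means sequential, more than one means a pool of that size) can be sketched with the standard library. This is not the dispatcher's internal code: the class and method names are hypothetical, and using Runtime.availableProcessors() as the source of the core count is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

class ThreadCountSketch {
    // Assumed source of the CPU core count.
    static int cpuCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Runs the tasks sequentially for one thread, otherwise on a fixed-size pool.
    // Results are returned in task order in both cases.
    static List<Integer> runTasks(List<Supplier<Integer>> tasks, int numberOfThreads) {
        if (numberOfThreads < 1) {
            throw new IllegalArgumentException("numberOfThreads must be >= 1");
        }
        List<Integer> results = new ArrayList<>();
        if (numberOfThreads == 1) {
            for (Supplier<Integer> task : tasks) {
                results.add(task.get());
            }
            return results;
        }
        ExecutorService pool = Executors.newFixedThreadPool(numberOfThreads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (Supplier<Integer> task : tasks) {
                Callable<Integer> callable = task::get;
                futures.add(pool.submit(callable));
            }
            for (Future<Integer> future : futures) {
                results.add(future.get()); // waits for each task in submission order
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Parallel execution failed", e);
        } finally {
            pool.shutdown();
        }
        return results;
    }
}
```

      Calling runTasks(tasks, ThreadCountSketch.cpuCores()) mirrors the idea behind setNumberOfThreadsToCpuCores(): on a single-core machine it degrades to sequential processing.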
    • isAddingInformationToUnion

      public boolean isAddingInformationToUnion()
      Returns true if all information / triples are added to the union. If set to false, only the information of non-matched entities is added to the union.
      Returns:
      true if all information / triples are added to the union
    • setAddingInformationToUnion

      public void setAddingInformationToUnion(boolean addInformationToUnion)
      Sets whether all information / triples are added to the union. If set to false, only the information of non-matched entities is added to the union.
      Parameters:
      addInformationToUnion - true if all information / triples are added to the union
    • isRemoveUnusedJenaModels

      public boolean isRemoveUnusedJenaModels()
      Returns true if unused OntModels are removed.
      Returns:
      true if unused OntModels are removed
    • setRemoveUnusedJenaModels

      public void setRemoveUnusedJenaModels(boolean removeUnusedJenaModels)
      If set to true, OntModel/Model instances which are no longer needed are removed. This helps to match a large number of KGs in a memory-friendly way.
      Parameters:
      removeUnusedJenaModels - if true, unused OntModels will be removed
    • getIntermediateAlignments

      public List<Alignment> getIntermediateAlignments()
      Returns the intermediate alignments. This only works if setSavingIntermediateAlignments(boolean) is called with true before the match method is called.
      Returns:
      a list of intermediate alignments or null if setSavingIntermediateAlignments(boolean) was set to false (the default).
    • isSavingIntermediateAlignments

      public boolean isSavingIntermediateAlignments()
      Returns true if intermediate alignments are stored.
      Returns:
      true if intermediate alignments are stored
    • setSavingIntermediateAlignments

      public void setSavingIntermediateAlignments(boolean intermediateAlignmentsNew)
      Set to true if the intermediate alignments should be stored.
      Parameters:
      intermediateAlignmentsNew - true if the intermediate alignments should be stored
    • getCopyMode

      public CopyMode getCopyMode()
      Returns the used copy mode.
      Returns:
      the used copy mode
    • setCopyMode

      public void setCopyMode(CopyMode copyMode)
      Sets the copy mode which is used during the merging. Defaults to None.
      Parameters:
      copyMode - new copy mode to use
    • getCacheFile

      public File getCacheFile()
      Returns the cache file which is used to store the merge tree. If no cache file is set, null is returned.
      Returns:
      the cache file or null
    • setCacheFile

      public void setCacheFile(File cacheFile)
      Sets the cache file which should be used to cache the merge tree. Only use this method if the same models are merged over and over again, so that the merge tree does not need to be computed multiple times. If the file does not exist, the newly computed merge tree is stored in this file. In all other cases, the file will not be modified. The cache file needs to be deleted to compute the merge tree again.
      Parameters:
      cacheFile - the file for storing the merge tree.
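
      The caching behaviour described above (compute and store if the file is missing, otherwise load and leave the file untouched) can be sketched as follows. This is a generic illustration using Java serialization; it is not MELT's FileCache class, and the class and method names are hypothetical.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Supplier;

class FileCacheSketch {
    // Loads the value from cacheFile if it exists; otherwise computes it and,
    // if a cache file was given, stores the result there for later runs.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T loadOrCompute(File cacheFile, Supplier<T> compute) {
        if (cacheFile != null && cacheFile.exists()) {
            try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(cacheFile))) {
                return (T) in.readObject(); // existing file is read, never modified
            } catch (IOException | ClassNotFoundException e) {
                throw new IllegalStateException("Could not read cache file " + cacheFile, e);
            }
        }
        T value = compute.get();
        if (cacheFile != null) {
            try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(cacheFile))) {
                out.writeObject(value);
            } catch (IOException e) {
                throw new IllegalStateException("Could not write cache file " + cacheFile, e);
            }
        }
        return value;
    }
}
```

      A second call with the same file returns the stored value and never invokes the supplier, which is why the file has to be deleted to force a recomputation.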
    • getSerializedTreeFile

      public File getSerializedTreeFile()
      Returns the file where the serialized merge tree is stored.
      Returns:
      the file where the serialized merge tree is stored.
    • setSerializedTreeFile

      public void setSerializedTreeFile(File serializedTreeFile)
      Sets the file where the serialized merge tree is stored. Set this to a non-null value to write the serialized merge tree to file.
      Parameters:
      serializedTreeFile - the file where the serialized merge tree is stored.
    • getMatcherSupplier

      public Supplier<Object> getMatcherSupplier()
      Returns the function which returns a newly configured matcher object. If it is not null, it will be used instead of the matcher object.
      Returns:
      the matcher supplier function
    • setMatcherSupplier

      public void setMatcherSupplier(Supplier<Object> matcherSupplier)
      If a matcher supplier is set, then this will be used instead of the matcher object.
      Parameters:
      matcherSupplier - the matcher supplier (when called, this function should return a new configured matcher object)
    • setGoldStandard

      public void setGoldStandard(Object alignment, DatasetIDExtractor idExtractor) throws TypeTransformationException
      Throws:
      TypeTransformationException
    • setGoldStandard

      public void setGoldStandard(Alignment alignment, DatasetIDExtractor idExtractor)