All Implemented Interfaces:
IMatcher<org.apache.jena.ontology.OntModel,​Alignment,​Properties>, eu.sealsproject.platform.res.domain.omt.IOntologyMatchingToolBridge, eu.sealsproject.platform.res.tool.api.IPlugin, eu.sealsproject.platform.res.tool.api.IToolBridge

public class BackgroundMatcher
extends MatcherYAAAJena
Template matcher where the background knowledge and the exploitation strategy (represented as ImplementedBackgroundMatchingStrategies) can be plugged in. This matcher can be used as a matching component. It is sensible to run a simple string matcher before this matcher: filtering out simple matches first improves performance. If you want a pre-packaged stand-alone background-based matching system, you can try out BackgroundMatcherStandAlone.
This matcher relies on a similarity metric that is implemented within the background source and used in compare(String, String).
  • Field Details

    • linker

      private final LabelToConceptLinker linker
      Linker used to link labels to concepts.
    • alignment

      private Alignment alignment
      Alignment
    • ontology1

      private org.apache.jena.ontology.OntModel ontology1
      Ontologies
    • ontology2

      private org.apache.jena.ontology.OntModel ontology2
    • knowledgeSource

      private ExternalResourceWithSynonymCapability knowledgeSource
      the knowledgeSource to be used
    • LOGGER

      private static final org.slf4j.Logger LOGGER
      Logger
    • strategy

      Matching strategy.
    • threshold

      private double threshold
      The minimal confidence threshold that is required for a match.
    • isAllowForCumulativeMatches

      private boolean isAllowForCumulativeMatches
      If something has been matched in an earlier step, allow for it to be matched again. Default: false.
    • isVerboseLoggingOutput

      private boolean isVerboseLoggingOutput
      Log every match. Do not use in performance-optimized settings.
    • valueExtractor

      private final TextExtractor valueExtractor
      The value extractor used to obtain labels for resources.
    • multiConceptLinkerUpperLimit

      private int multiConceptLinkerUpperLimit
      If a concept cannot be linked as a full string, the longest substrings are matched. This is expensive and should not be performed on lengthy description texts. This variable represents the number of tokens within a label up to which multi-linking will be performed. The limit is inclusive: linking will not be performed if |tokens in label| > multiConceptLinkerUpperLimit.
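The inclusive limit described above can be illustrated with a minimal sketch. The class `MultiLinkLimitSketch`, the method `shouldMultiLink`, and the whitespace tokenization are assumptions made for illustration; they are not part of the documented API, and the matcher's actual tokenization may differ.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of BackgroundMatcher.
class MultiLinkLimitSketch {

    // Assumption: a label is tokenized by splitting on whitespace.
    static List<String> tokens(String label) {
        return Arrays.asList(label.trim().split("\\s+"));
    }

    // Multi-linking is attempted only if the token count does not exceed
    // the limit. The limit itself is still allowed (inclusive).
    static boolean shouldMultiLink(String label, int multiConceptLinkerUpperLimit) {
        return tokens(label).size() <= multiConceptLinkerUpperLimit;
    }

    public static void main(String[] args) {
        // Three tokens with a limit of three: linking is attempted.
        System.out.println(shouldMultiLink("european union law", 3));
        // Five tokens with a limit of three: linking is skipped.
        System.out.println(shouldMultiLink("a very long description text", 3));
    }
}
```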
    • isSynonymyConfidenceAvailable

      private final boolean isSynonymyConfidenceAvailable
      If true, there is a confidence score for each synonymy relation.
  • Constructor Details

    • BackgroundMatcher

      public BackgroundMatcher​(SemanticWordRelationDictionary knowledgeSourceToBeUsed, ImplementedBackgroundMatchingStrategies strategy, double threshold)
      Main Constructor
      Parameters:
      knowledgeSourceToBeUsed - Specify the knowledgeSource to be used.
      strategy - The knowledgeSource strategy that shall be applied.
      threshold - The minimal threshold that is required for a match.
    • BackgroundMatcher

      public BackgroundMatcher​(SemanticWordRelationDictionary knowledgeSourceToBeUsed)
      Convenience default constructor. Threshold: 0.0 and Strategy: Synonymy are assumed.
      Parameters:
      knowledgeSourceToBeUsed - The knowledge source that is to be used.
    • BackgroundMatcher

      public BackgroundMatcher​(SemanticWordRelationDictionary knowledgeSourceToBeUsed, ImplementedBackgroundMatchingStrategies strategy)
      Convenience default constructor. Threshold: 0.0 is assumed.
      Parameters:
      knowledgeSourceToBeUsed - The knowledge source that is to be used.
      strategy - The strategy that shall be applied.
  • Method Details

    • match

      public Alignment match​(org.apache.jena.ontology.OntModel sourceOntology, org.apache.jena.ontology.OntModel targetOntology, Alignment inputAlignment, Properties p) throws Exception
      Description copied from class: MatcherYAAAJena
      Aligns two ontologies specified via a Jena OntModel, with an input alignment as Alignment object, and returns the mapping of the resulting alignment. Note: This method might be called multiple times in a row when using the evaluation framework. Make sure to return a mapping which is specific to the given inputs.
      Specified by:
      match in interface IMatcher<org.apache.jena.ontology.OntModel,​Alignment,​Properties>
      Specified by:
      match in class MatcherYAAAJena
      Parameters:
      sourceOntology - This OntModel represents the source ontology.
      targetOntology - This OntModel represents the target ontology.
      inputAlignment - This mapping represents the input alignment.
      p - Additional properties.
      Returns:
      The resulting alignment of the matching process.
      Throws:
      Exception - Any exception which occurs during matching.
    • addAlignmentExtensions

      private void addAlignmentExtensions()
      Adds extension values.
    • getConfigurationListing

      private String getConfigurationListing()
      Get configuration of matcher as string output.
      Returns:
      The configuration as string.
    • match

      private void match​(org.apache.jena.util.iterator.ExtendedIterator<? extends org.apache.jena.ontology.OntResource> sourceOntologyIterator_1, org.apache.jena.util.iterator.ExtendedIterator<? extends org.apache.jena.ontology.OntResource> targetOntologyIterator_2)
      Given two iterators, match the resources covered by them.
      Parameters:
      sourceOntologyIterator_1 - iterator 1 must be that of the source ontology
      targetOntologyIterator_2 - iterator 2 must be that of the target ontology
    • performFullStringSynonymyMatching

      private void performFullStringSynonymyMatching​(Map<String,​Set<String>> uri2labelMap_1, Map<String,​Set<String>> uri2labelMap_2)
      Filter out token synonymy utilizing a synonymy strategy. Note that the method accepts a HashMap of Uri -> set(LINKS) rather than Uri -> set(labels).
      Parameters:
      uri2labelMap_1 - URI2labels map of the source ontology.
      uri2labelMap_2 - URI2labels map of the target ontology.
    • fullMatchUsingDictionaryWithLinks

      public org.javatuples.Pair<Boolean,​Double> fullMatchUsingDictionaryWithLinks​(Set<String> set1, Set<String> set2)
      Determines whether two sets of links match using the internal knowledgeSource. Note that no linking is performed; the sets are expected to already contain links.
      Parameters:
      set1 - Set of links 1.
      set2 - Set of links 2.
      Returns:
      Pair where (1) is a boolean indicating whether there is a match and (2) is the match confidence.
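To make the shape of the returned pair concrete, here is a minimal sketch. It assumes that two link sets match when any pair of links scores above the threshold and that the best such score is reported as the confidence; the documentation does not state the actual aggregation rule. `LinkSetMatchSketch`, `exactSimilarity`, and the use of `Map.Entry` in place of `org.javatuples.Pair` are illustrative choices, not the matcher's code.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;
import java.util.Set;
import java.util.function.ToDoubleBiFunction;

// Hypothetical sketch; the real method delegates to the configured knowledge source.
class LinkSetMatchSketch {

    // Stand-in similarity: 1.0 for identical links, 0.0 otherwise (assumption).
    static double exactSimilarity(String link1, String link2) {
        return link1.equals(link2) ? 1.0 : 0.0;
    }

    // Returns (matched, confidence). The confidence is 0.0 when nothing matched.
    static Map.Entry<Boolean, Double> fullMatch(Set<String> links1, Set<String> links2,
            ToDoubleBiFunction<String, String> similarity, double threshold) {
        boolean matched = false;
        double best = 0.0;
        for (String l1 : links1) {
            for (String l2 : links2) {
                double score = similarity.applyAsDouble(l1, l2);
                // Assumption: a score must be strictly above the threshold.
                if (score > threshold) {
                    matched = true;
                    best = Math.max(best, score);
                }
            }
        }
        return new SimpleEntry<>(matched, best);
    }
}
```

Callers would read the boolean first and consult the confidence only when a match was reported.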
    • performTokenBasedSynonymyMatching

      private void performTokenBasedSynonymyMatching​(Map<String,​Set<String>> uri2labelMap_1, Map<String,​Set<String>> uri2labelMap_2)
      Match based on token equality and synonymy.
      Parameters:
      uri2labelMap_1 - source uri2labels map
      uri2labelMap_2 - target uri2labels map
    • isTokenSetSynonymous

      org.javatuples.Pair<Boolean,​Double> isTokenSetSynonymous​(List<Set<String>> tokenList1, List<Set<String>> tokenList2)
      Checks whether the two lists are synonymous. This means that each component of one list can be found in the other list OR is synonymous to one component in the other list.
      Parameters:
      tokenList1 - List of words
      tokenList2 - List of words
      Returns:
      true if synonymous, else false
    • isTokenSynonymous

      public org.javatuples.Pair<Boolean,​Double> isTokenSynonymous​(Set<String> set1, Set<String> set2)
      Compare the two sets for synonymous terms.
      Parameters:
      set1 - Set of tokens 1
      set2 - Set of tokens 2
      Returns:
      true if the term of a set has a synonymous or equal counterpart in the other set. This is tested both ways (set1 -> set2 and set2 -> set1).
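The two-way test can be sketched as follows. This is an illustration under stated assumptions: every token of each set must have an equal or synonymous counterpart in the other set, synonyms come from a plain in-memory map rather than the configured knowledge source, and the confidence part of the returned pair is omitted. `TokenSynonymySketch` and its methods are hypothetical names.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a bidirectional token synonymy check.
class TokenSynonymySketch {

    // True if the token is contained in the other set or is a synonym of one of its members.
    static boolean hasCounterpart(String token, Set<String> others, Map<String, Set<String>> synonyms) {
        if (others.contains(token)) {
            return true;
        }
        for (String other : others) {
            // The dictionary is consulted in both directions (assumption).
            if (synonyms.getOrDefault(token, Collections.emptySet()).contains(other)
                    || synonyms.getOrDefault(other, Collections.emptySet()).contains(token)) {
                return true;
            }
        }
        return false;
    }

    // True if every token in 'from' has a counterpart in 'to'.
    static boolean covers(Set<String> from, Set<String> to, Map<String, Set<String>> synonyms) {
        for (String token : from) {
            if (!hasCounterpart(token, to, synonyms)) {
                return false;
            }
        }
        return true;
    }

    // Tested both ways: set1 -> set2 and set2 -> set1.
    static boolean isTokenSynonymous(Set<String> set1, Set<String> set2, Map<String, Set<String>> synonyms) {
        return covers(set1, set2, synonyms) && covers(set2, set1, synonyms);
    }
}
```

Testing in both directions prevents a short token set from matching a longer one that merely contains it.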
    • performLongestStringSynonymyMatching

      private void performLongestStringSynonymyMatching​(Map<String,​Set<String>> uri2labelMap_1, Map<String,​Set<String>> uri2labelMap_2)
      Match by determining multiple concepts for a label.
      Parameters:
      uri2labelMap_1 - URI2label map 1.
      uri2labelMap_2 - URI2label map 2.
    • isLinkListSynonymous

      private org.javatuples.Pair<Boolean,​Double> isLinkListSynonymous​(List<Set<String>> list_1, List<Set<String>> list_2)
      Given two lists of links, this method checks whether those are synonymous.
      Parameters:
      list_1 - List of links 1.
      list_2 - List of links 2.
      Returns:
      True if the links are synonymous, else false.
    • isLinkSetSynonymous

      private org.javatuples.Pair<Boolean,​Double> isLinkSetSynonymous​(Set<String> set_1, Set<String> set_2)
      All components of set_1 have to be synonymous to components in set_2.
      Parameters:
      set_1 - Set 1.
      set_2 - Set 2.
      Returns:
      True if synonymous, else false.
    • convertToUriLinksMap

      private Map<String,​List<Set<String>>> convertToUriLinksMap​(Map<String,​Set<String>> uris2labels, boolean isSourceOntology)
      This method converts a URIs -> labels HashMap to a URIs -> List<nlinks>. Mapped entries are ignored.
      Parameters:
      uris2labels - URIs to labels map.
      isSourceOntology - True if the map refers to the source ontology.
      Returns:
      Map URI -> tokens
    • setContainsSynonym

      private boolean setContainsSynonym​(String word, HashSet<String> set)
      Check whether the specified word is synonymous to a word in the given set.
      Parameters:
      word - Word to be checked.
      set - Set containing the words.
      Returns:
      true if synonymous.
    • convertToUriLinkMap

      private HashMap<String,​Set<String>> convertToUriLinkMap​(Map<String,​Set<String>> uri2labels, boolean isSourceOntology)
      This method transforms the uri2labels into a uri2links HashMap. Thereby, the linking function is called only once. Furthermore, concepts that cannot be linked are not included in the resulting HashMap. Mapped entries are not linked.
      Parameters:
      uri2labels - Input HashMap URI -> labels
      isSourceOntology - True if the map refers to the source ontology.
      Returns:
      HashMap URI -> links
    • convertToUriTokenMap

      private Map<String,​List<Set<String>>> convertToUriTokenMap​(Map<String,​Set<String>> uris2labels, boolean isSourceOntology)
      This method converts a URIs -> labels HashMap to a URIs -> tokens HashMap. Mapped entries are ignored.
      Parameters:
      uris2labels - URIs to labels map.
      isSourceOntology - True if the map refers to the source ontology.
      Returns:
      Map: URI -> tokens
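A rough sketch of such a conversion is shown below. It assumes that "mapped entries" are URIs that already occur in the alignment (represented here by a plain set, `alreadyMappedUris`) and that labels are tokenized by lowercasing and splitting on whitespace. Both are simplifications for illustration; `UriTokenMapSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: URI -> labels becomes URI -> list of token sets (one set per label).
class UriTokenMapSketch {

    static Map<String, List<Set<String>>> convert(Map<String, Set<String>> uris2labels,
            Set<String> alreadyMappedUris) {
        Map<String, List<Set<String>>> result = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : uris2labels.entrySet()) {
            // Entries that are already mapped are ignored (assumption: looked up by URI).
            if (alreadyMappedUris.contains(entry.getKey())) {
                continue;
            }
            List<Set<String>> tokenSets = new ArrayList<>();
            for (String label : entry.getValue()) {
                // Assumption: simple lowercase plus whitespace tokenization.
                tokenSets.add(new HashSet<>(Arrays.asList(label.toLowerCase().trim().split("\\s+"))));
            }
            result.put(entry.getKey(), tokenSets);
        }
        return result;
    }
}
```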
    • tokenizeAndFilter

      public static HashSet<String> tokenizeAndFilter​(String label)
      Tokenizes a label and filters out stop words.
      Parameters:
      label - The label to be tokenized.
      Returns:
      Tokenized label.
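As an illustration of tokenizing a label and filtering stop words, consider the following sketch. The tokenization rules (camelCase splitting, splitting on non-alphanumeric characters, lowercasing) and the small stop word list are assumptions; the documentation does not specify which rules or stop words the method uses.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; rules and stop words are illustrative assumptions.
class TokenizeSketch {

    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "of", "and", "or", "in"));

    static HashSet<String> tokenizeAndFilter(String label) {
        HashSet<String> result = new HashSet<>();
        // Insert a space at lower-to-upper camelCase boundaries.
        String spaced = label.replaceAll("([a-z])([A-Z])", "$1 $2");
        // Split on any run of non-alphanumeric characters (spaces, underscores, ...).
        for (String token : spaced.split("[^A-Za-z0-9]+")) {
            String lower = token.toLowerCase();
            if (!lower.isEmpty() && !STOP_WORDS.contains(lower)) {
                result.add(lower);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokenizeAndFilter("headOf_the Department"));
    }
}
```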
    • mappingExistsForSourceURI

      private boolean mappingExistsForSourceURI​(String uri)
      Checks whether there exists a mapping cell where the URI is used as source.
      Parameters:
      uri - URI for which the check shall be performed.
      Returns:
      True if at least one mapping cell exists, else false.
    • mappingExistsForTargetURI

      private boolean mappingExistsForTargetURI​(String uri)
      Checks whether there exists a mapping cell where the URI is used as target.
      Parameters:
      uri - URI for which the check shall be performed.
      Returns:
      True if at least one mapping cell exists, else false.
    • compareScore

      private double compareScore​(String lookupTerm1, String lookupTerm2)
    • compare

      private boolean compare​(String lookupTerm1, String lookupTerm2)
      The compare method compares two concepts that are available in a background knowledge source. The concepts will be compared using the specified strategy and the method will return true if the determined similarity is above the specified minimal threshold.
      Parameters:
      lookupTerm1 - Term 1.
      lookupTerm2 - Term 2.
      Returns:
      True if the similarity is larger than the minimal threshold, else false.
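The relationship between a similarity score and the boolean decision can be sketched as below. This is a simplified model, not the matcher's implementation: scores come from a hypothetical in-memory table instead of the background knowledge source, and `CompareSketch`, `putScore`, and `key` are names invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a score lookup followed by a threshold decision.
class CompareSketch {

    private final Map<String, Double> scores = new HashMap<>();
    private final double threshold;

    CompareSketch(double threshold) {
        this.threshold = threshold;
    }

    // Order-independent key so that (a, b) and (b, a) share one score.
    private static String key(String term1, String term2) {
        return term1.compareTo(term2) <= 0 ? term1 + "|" + term2 : term2 + "|" + term1;
    }

    void putScore(String term1, String term2, double score) {
        scores.put(key(term1, term2), score);
    }

    // Unknown pairs receive a similarity of 0.0 (assumption).
    double compareScore(String lookupTerm1, String lookupTerm2) {
        return scores.getOrDefault(key(lookupTerm1, lookupTerm2), 0.0);
    }

    // True only if the similarity is strictly larger than the threshold.
    boolean compare(String lookupTerm1, String lookupTerm2) {
        return compareScore(lookupTerm1, lookupTerm2) > threshold;
    }
}
```

Under this reading, a threshold of 0.0 (the default assumed by the convenience constructors) accepts any pair with a positive score.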
    • getMatcherName

      public String getMatcherName()
      Get the name of the matcher.
      Returns:
      A textual representation of the matcher.
    • getStrategy

    • setStrategy

      public void setStrategy​(ImplementedBackgroundMatchingStrategies strategy)
    • getKnowledgeSource

      public ExternalResourceWithSynonymCapability getKnowledgeSource()
    • setKnowledgeSource

      public void setKnowledgeSource​(ExternalResourceWithSynonymCapability knowledgeSource)
    • getThreshold

      public double getThreshold()
    • setThreshold

      public void setThreshold​(double threshold)
    • isAllowForCumulativeMatches

      public boolean isAllowForCumulativeMatches()
    • setAllowForCumulativeMatches

      public void setAllowForCumulativeMatches​(boolean allowForCumulativeMatches)
    • isVerboseLoggingOutput

      public boolean isVerboseLoggingOutput()
    • setVerboseLoggingOutput

      public void setVerboseLoggingOutput​(boolean verboseLoggingOutput)
    • getMultiConceptLinkerUpperLimit

      public int getMultiConceptLinkerUpperLimit()
    • setMultiConceptLinkerUpperLimit

      public void setMultiConceptLinkerUpperLimit​(int multiConceptLinkerUpperLimit)
    • isSynonymyConfidenceAvailable

      public boolean isSynonymyConfidenceAvailable()