java.lang.Object
de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.external.SemanticWordRelationDictionary
de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.external.wikidata.WikidataKnowledgeSource
All Implemented Interfaces:
ExternalResource, ExternalResourceWithHypernymCapability, ExternalResourceWithSynonymCapability, HypernymCapability, SynonymCapability

public class WikidataKnowledgeSource extends SemanticWordRelationDictionary
  • Field Details

    • synonymyBuffer

      Buffer for repeated synonymy requests.
    • hypernymyBuffer

      ConcurrentMap<String,HashSet<String>> hypernymyBuffer
      Buffer for repeated hypernymy requests.
    • askBuffer

      Buffer for (expensive) ask queries.
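The buffers above store the answers to repeated requests so that an expensive query only runs once per distinct input. The following is a minimal sketch of that idea, not the class's actual code: `AskBufferSketch` and the `expensiveAsk` predicate are hypothetical names, and the remote ASK query is replaced by a local stand-in.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Predicate;

/** Hypothetical illustration of an in-memory ask buffer; not the MELT implementation. */
final class AskBufferSketch {

    /** Maps a query string to its previously computed boolean answer. */
    private final ConcurrentMap<String, Boolean> askBuffer = new ConcurrentHashMap<>();

    /** Stand-in for the expensive remote ASK query (assumption). */
    private final Predicate<String> expensiveAsk;

    AskBufferSketch(Predicate<String> expensiveAsk) {
        this.expensiveAsk = expensiveAsk;
    }

    /** Returns the buffered answer and runs the expensive query only on a miss. */
    boolean ask(String query) {
        return askBuffer.computeIfAbsent(query, expensiveAsk::test);
    }

    public static void main(String[] args) {
        AskBufferSketch buffer = new AskBufferSketch(query -> {
            System.out.println("Running expensive query: " + query);
            return query.startsWith("ASK");
        });
        System.out.println(buffer.ask("ASK { ?s ?p ?o }")); // runs the stand-in query
        System.out.println(buffer.ask("ASK { ?s ?p ?o }")); // answered from the buffer
    }
}
```

A `ConcurrentMap` is used here because the documented `hypernymyBuffer` field has that type; whether the other buffers use the same type is an assumption.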
    • linker

      private WikidataLinker linker
      Linker for the Wikidata knowledge source.
    • ENDPOINT_URL

      private static final String ENDPOINT_URL
      The public SPARQL endpoint.
    • LOGGER

      private static final org.slf4j.Logger LOGGER
      Default logger.
    • knowledgeSourceName

      private String knowledgeSourceName
      Name of the instance.
    • persistenceService

      private PersistenceService persistenceService
      Service responsible for disk buffers.
    • isDiskBufferEnabled

      private boolean isDiskBufferEnabled
      If the disk-buffer is disabled, no buffers are read/written from/to the disk. Default: true.
    • IS_HYPERNYM_LEVEL_1_NO_CLOSE

      private static final String IS_HYPERNYM_LEVEL_1_NO_CLOSE
      Query fragment, add '}' to make it usable. Replace <subconcept> and <superconcept>.
    • IS_HYPERNYM_LEVEL_2_NO_CLOSE

      private static final String IS_HYPERNYM_LEVEL_2_NO_CLOSE
      Query fragment, add '}' to make it usable. Replace <subconcept> and <superconcept>.
    • IS_HYPERNYM_LEVEL_3_NO_CLOSE

      private static final String IS_HYPERNYM_LEVEL_3_NO_CLOSE
      Query fragment, add '}' to make it usable. Replace <subconcept> and <superconcept>.
  • Constructor Details

    • WikidataKnowledgeSource

      public WikidataKnowledgeSource()
      Default constructor. The disk buffer is enabled (the default of isDiskBufferEnabled is true).
    • WikidataKnowledgeSource

      public WikidataKnowledgeSource(boolean isDiskBufferEnabled)
      Constructor that allows the disk buffer to be switched on or off.
      Parameters:
      isDiskBufferEnabled - True if the buffer shall be enabled.
  • Method Details

    • initializeBuffers

      private void initializeBuffers()
      Initialize buffers (either on disk or in memory).
    • isInDictionary

      public boolean isInDictionary(String word)
      Test whether the given word can be mapped (1-1) to a Wikidata concept (no smart mechanisms applied). The assumed default language is English.
      Parameters:
      word - The word to be looked for.
      Returns:
      True if the word can be found in the dictionary.
    • isInDictionary

      public boolean isInDictionary(String word, Language language)
      Test whether the given word can be mapped (1-1) to a Wikidata concept (no smart mechanisms applied).
      Parameters:
      word - The word to be used for the concept lookup.
      language - The language of the word
      Returns:
      True if the specified word is found in the knowledge resource.
    • isStrongFormSynonymous

      public boolean isStrongFormSynonymous(String link1, String link2)
      Checks for synonymy by determining whether link1 is contained in the set of synonymous words of link2 or vice versa.
      Specified by:
      isStrongFormSynonymous in interface SynonymCapability
      Overrides:
      isStrongFormSynonymous in class SemanticWordRelationDictionary
      Parameters:
      link1 - Word 1
      link2 - Word 2
      Returns:
      True if the given words are synonymous, else false.
    • isInDictionaryWithLabelAskQuery

      private boolean isInDictionaryWithLabelAskQuery(String word, Language language)
      Ask query with label.
      Parameters:
      word - The concept label that shall be looked up.
      language - The language of the label.
      Returns:
      True, if a concept has the label as rdfs:label.
    • isInDictionaryWithAltLabelAskQuery

      private boolean isInDictionaryWithAltLabelAskQuery(String word, Language language)
      Ask query with altLabel.
      Parameters:
      word - The concept label that shall be looked up.
      language - The language of the label.
      Returns:
      True, if a concept has the label as skos:altLabel.
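The two ask-query helpers above differ only in the property that is checked (rdfs:label or skos:altLabel). The sketch below shows how such an ASK query string could be assembled. It is an illustration under assumptions: the class and method names are hypothetical, the language is passed as a plain language code rather than the `Language` type, and the query text actually used by the class is not shown in this documentation.

```java
/** Hypothetical sketch of building a label-based SPARQL ASK query; not the MELT implementation. */
final class LabelAskQuerySketch {

    static final String RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label";
    static final String SKOS_ALT_LABEL = "http://www.w3.org/2004/02/skos/core#altLabel";

    /** Escapes backslashes and double quotes so the word fits into a quoted SPARQL literal. */
    static String escapeLiteral(String text) {
        return text.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Builds an ASK query that is true if any concept carries the word via the given property. */
    static String buildAskQuery(String propertyUri, String word, String languageCode) {
        return "ASK { ?concept <" + propertyUri + "> \""
                + escapeLiteral(word) + "\"@" + languageCode + " }";
    }

    public static void main(String[] args) {
        System.out.println(buildAskQuery(RDFS_LABEL, "cat", "en"));
        System.out.println(buildAskQuery(SKOS_ALT_LABEL, "house cat", "en"));
    }
}
```

Only quotes and backslashes are escaped in this sketch; other special characters such as line breaks would need additional handling.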
    • getSynonymsLexical

      public Set<String> getSynonymsLexical(String linkedConcept)
      Description copied from class: SemanticWordRelationDictionary
      Retrieves a list of synonyms independently of the word sense. The assumed language is English.
      Specified by:
      getSynonymsLexical in interface SynonymCapability
      Specified by:
      getSynonymsLexical in class SemanticWordRelationDictionary
      Parameters:
      linkedConcept - The linked concept for which synonyms shall be retrieved.
      Returns:
      A set of synonyms in word form (not links).
    • getSynonyms

      public HashSet<String> getSynonyms(String linkedConcept, Language language)
      Language-bound synonymy retrieval.
      Parameters:
      linkedConcept - The linked concept for which synonyms shall be retrieved.
      language - The language of the synonyms.
      Returns:
      A set of synonyms (string).
    • getConceptLinks

      public ArrayList<String> getConceptLinks(String[] conceptsToBeLinked)
      Looks up the links for multiple words.
      Parameters:
      conceptsToBeLinked - An array of concepts that shall be linked.
      Returns:
      A list of links that were found for the given concepts. Concepts that could not be linked are ignored. If none of the given concepts can be linked, the resulting ArrayList will be empty.
    • getClosestCommonHypernym

      public org.javatuples.Pair<Set<String>,Integer> getClosestCommonHypernym(List<String> links, int limitOfHops)
      Determine the closest common hypernym.
      Parameters:
      links - The linked concepts for which the closest common hypernym shall be found.
      limitOfHops - This is an expensive operation. You can limit the number of upward hops to perform.
      Returns:
      The closest common hypernym together with the upwards depth. This is represented as a pair:
      [0] Set of common concepts (String)
      [1] The depth as integer. If there is a direct hyperconcept, the depth will be equal to 1.
      If multiple candidates apply, all are returned. If there is no closest common hypernym, null will be returned.
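The search described above can be pictured as a level-by-level walk up the hierarchy that stops as soon as all concepts share at least one hypernym. The sketch below illustrates this on an in-memory map instead of Wikidata. Several points are assumptions: the class name is hypothetical, `Map.Entry` stands in for the `Pair` return type, and the given concepts themselves do not count as their own hypernyms.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a hop-limited closest-common-hypernym search on a toy graph. */
final class ClosestCommonHypernymSketch {

    /** Returns (common hypernyms, depth), or null if none is found within limitOfHops. */
    static Map.Entry<Set<String>, Integer> closestCommonHypernym(
            Map<String, Set<String>> directHypernyms, List<String> links, int limitOfHops) {
        if (links == null || links.isEmpty()) {
            return null;
        }
        Set<String> distinctLinks = new HashSet<>(links);
        Map<String, Set<String>> reached = new HashMap<>();  // all hypernyms found so far
        Map<String, Set<String>> frontier = new HashMap<>(); // concepts added in the last hop
        for (String link : distinctLinks) {
            reached.put(link, new HashSet<>());
            frontier.put(link, Set.of(link));
        }
        for (int depth = 1; depth <= limitOfHops; depth++) {
            // Move every concept one level upwards.
            for (String link : distinctLinks) {
                Set<String> next = new HashSet<>();
                for (String concept : frontier.get(link)) {
                    next.addAll(directHypernyms.getOrDefault(concept, Set.of()));
                }
                reached.get(link).addAll(next);
                frontier.put(link, next);
            }
            // Intersect the hypernym sets of all concepts.
            Set<String> common = null;
            for (String link : distinctLinks) {
                if (common == null) {
                    common = new HashSet<>(reached.get(link));
                } else {
                    common.retainAll(reached.get(link));
                }
            }
            if (!common.isEmpty()) {
                return Map.entry(common, depth);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> graph = Map.of(
                "cat", Set.of("feline"), "feline", Set.of("mammal"),
                "dog", Set.of("canine"), "canine", Set.of("mammal"));
        System.out.println(closestCommonHypernym(graph, List.of("cat", "dog"), 3));
    }
}
```

Because each additional hop widens every frontier, the hop limit bounds the cost of the search, which matches the documented warning that the operation is expensive.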
    • addOrPut

      private static void addOrPut(HashMap<String,HashSet<String>> map, String key, HashSet<String> setToAdd)
      Helper method.
      Parameters:
      map - The map to which shall be added or put.
      key - Key for the map.
      setToAdd - What shall be added.
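An "add or put" helper of this shape typically merges a set into an existing map entry or creates the entry if the key is new. The sketch below shows one way to write it with the documented parameter types; the class name is hypothetical and the behavior (merging into a fresh set rather than storing the passed set directly) is an assumption, not the confirmed implementation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;

/** Hypothetical sketch of an add-or-put helper for a map of sets. */
final class AddOrPutSketch {

    /** Adds all elements to the set stored under key, creating the set if the key is new. */
    static void addOrPut(HashMap<String, HashSet<String>> map, String key, HashSet<String> setToAdd) {
        if (map == null || setToAdd == null) {
            return; // nothing to do
        }
        map.computeIfAbsent(key, k -> new HashSet<>()).addAll(setToAdd);
    }

    public static void main(String[] args) {
        HashMap<String, HashSet<String>> map = new HashMap<>();
        addOrPut(map, "cat", new HashSet<>(List.of("feline")));
        addOrPut(map, "cat", new HashSet<>(List.of("mammal")));
        System.out.println(map.get("cat").size()); // both hypernyms are stored under "cat"
    }
}
```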
    • determineCommonConcepts

      static Set<String> determineCommonConcepts(HashMap<String,HashSet<String>> data)
      Helper method. Given a map of concepts, the common concepts are determined and returned. Package-private for easier testing.
      Parameters:
      data - The data structure in which it shall be checked whether there are common concepts.
      Returns:
      Common concepts. The set is empty if there are none.
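Determining the common concepts amounts to intersecting all value sets of the map. A minimal sketch under that reading follows; the class name is hypothetical and the real method may differ in details such as how an empty map is treated.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: intersection of all value sets of a map. */
final class CommonConceptsSketch {

    /** Returns the concepts contained in every value set; empty if there are none. */
    static Set<String> determineCommonConcepts(HashMap<String, HashSet<String>> data) {
        Set<String> common = null;
        for (HashSet<String> concepts : data.values()) {
            if (common == null) {
                common = new HashSet<>(concepts); // start with a copy of the first set
            } else {
                common.retainAll(concepts);       // keep only shared concepts
            }
        }
        return common == null ? new HashSet<>() : common;
    }

    public static void main(String[] args) {
        HashMap<String, HashSet<String>> data = new HashMap<>();
        data.put("cat", new HashSet<>(List.of("feline", "mammal")));
        data.put("dog", new HashSet<>(List.of("canine", "mammal")));
        System.out.println(determineCommonConcepts(data));
    }
}
```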
    • buildHypernymDepthQuery

      static String buildHypernymDepthQuery(String superconcept, String subconcept, int depth)
      Builds a query that checks whether one Wikidata URI is a subclass/instance of the other.
      Parameters:
      superconcept - URI of the superconcept.
      subconcept - URI of the subconcept.
      depth - The depth.
      Returns:
      Query as String.
    • replaceConceptsAndCompleteQuery

      private static String replaceConceptsAndCompleteQuery(String template, String superConcept, String subConcept)
      Helper method. Only to be used in buildHypernymDepthQuery(String, String, int).
      Parameters:
      template - Template to be used.
      superConcept - Super concept.
      subConcept - Sub concept.
      Returns:
      Complete query.
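The query fragments of this class are documented as templates that lack the closing '}' and contain the placeholders <subconcept> and <superconcept>. The sketch below shows how such a template could be completed. The template text, the class name, and the decision to wrap the URIs in angle brackets are assumptions made for illustration; the actual fragments are not reproduced in this documentation.

```java
/** Hypothetical sketch of completing an unclosed query template with two placeholders. */
final class QueryTemplateSketch {

    /** Illustrative template only (assumption); note the missing closing brace. */
    static final String EXAMPLE_TEMPLATE_NO_CLOSE =
            "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
            + "ASK { <subconcept> wdt:P31|wdt:P279 <superconcept> ";

    /** Replaces both placeholders with bracketed URIs and appends the closing brace. */
    static String replaceConceptsAndCompleteQuery(String template, String superConcept, String subConcept) {
        return template
                .replace("<subconcept>", "<" + subConcept + ">")
                .replace("<superconcept>", "<" + superConcept + ">")
                + "}";
    }

    public static void main(String[] args) {
        System.out.println(replaceConceptsAndCompleteQuery(
                EXAMPLE_TEMPLATE_NO_CLOSE, "http://example.org/super", "http://example.org/sub"));
    }
}
```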
    • isHypernym

      public boolean isHypernym(String superConcept, String subConcept, int depth)
      Determines whether the specified superConcept is a hypernym of the specified subConcept at the given depth.
      Overrides:
      isHypernym in class SemanticWordRelationDictionary
      Parameters:
      superConcept - URI or link.
      subConcept - URI or link.
      depth - The desired depth (an integer: 1, 2, or 3).
      Returns:
      True if it is a hypernym, else false.
    • buildInstanceOfSublcassOfCleanQuery

      static String buildInstanceOfSublcassOfCleanQuery(String wikidataUri, int depth)
      The resulting query follows the hierarchy upwards up to the given depth. wdt:P31 (instance of) and wdt:P279 (subclass of) are never mixed: the query follows only instance-of relations upwards and, in a UNION, only subclass-of relations upwards. The "instance of" of a superclass therefore cannot be found with this query!

      DEV remark: This is a bit too involved for an easy-to-understand API. This is currently not used. Look at the unit test to better understand what the query does.

      Parameters:
      wikidataUri - The wikidata URI.
      depth - The desired depth.
      Returns:
      The query as String.
    • getHypernyms

      public HashSet<String> getHypernyms(String linkedConcept)
      This will return the direct hypernyms as String.
      Specified by:
      getHypernyms in class SemanticWordRelationDictionary
      Parameters:
      linkedConcept - The linked concept for which hypernyms shall be retrieved. The linked concept is a URI or a multilink.
      Returns:
      The found hypernyms as links (URIs). If you plan to use the lexical representation immediately, use getHypernymsLexical(String, Language). If nothing was found, an empty set is returned.
    • getHypernymsLexical

      public HashSet<String> getHypernymsLexical(String linkedConcept)
      Uses wdt:P31 (instance of) as well as wdt:P279 (subclass of).
      Parameters:
      linkedConcept - The concept that has already been linked (URI). The assumed language is English.
      Returns:
      A set of links.
    • getHypernymsLexical

      public HashSet<String> getHypernymsLexical(String linkedConcept, Language language)
      Uses wdt:P31 (instance of) as well as wdt:P279 (subclass of).
      Parameters:
      linkedConcept - The concept that has already been linked (URI).
      language - Language of the strings.
      Returns:
      A set of links.
    • getLabelsForLink

      public HashSet<String> getLabelsForLink(String linkedConcept)
      Given a URI, obtain the written representations, i.e., the labels.
      Parameters:
      linkedConcept - The URI for which labels shall be obtained.
      Returns:
      A set of labels.
    • getLabelsForLink

      public HashSet<String> getLabelsForLink(String linkedConcept, Language language)
      Given a linked concept, retrieve all labels (rdfs:label, skos:altLabel).
      Parameters:
      linkedConcept - The link to the concept (URI).
      language - Desired language for the labels.
      Returns:
      Set of labels, all in the specified language.
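Retrieving all labels of a concept in one language can be expressed as a single SPARQL query over rdfs:label and skos:altLabel with a language filter. The sketch below builds such a query string. It is an assumption-laden illustration: the class and method names are hypothetical, the language is a plain language code instead of the `Language` type, and the query used internally by the class may look different.

```java
/** Hypothetical sketch of a label query for one concept and one language. */
final class LabelsQuerySketch {

    /** Builds a SELECT query for all rdfs:label and skos:altLabel values in the given language. */
    static String buildLabelsQuery(String conceptUri, String languageCode) {
        return "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
                + "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
                + "SELECT DISTINCT ?label WHERE {\n"
                + "  <" + conceptUri + "> rdfs:label|skos:altLabel ?label .\n"
                + "  FILTER(LANG(?label) = \"" + languageCode + "\")\n"
                + "}";
    }

    public static void main(String[] args) {
        System.out.println(buildLabelsQuery("http://example.org/concept", "en"));
    }
}
```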
    • close

      public void close()
      Description copied from class: SemanticWordRelationDictionary
      Closing open resources.
      Specified by:
      close in class SemanticWordRelationDictionary
    • getLinker

      public LabelToConceptLinker getLinker()
      Description copied from interface: ExternalResource
      Returns the linker instance for this particular resource.
      Specified by:
      getLinker in interface ExternalResource
      Specified by:
      getLinker in class SemanticWordRelationDictionary
      Returns:
      The specific linker used to link words to concepts.
    • getName

      public String getName()
      Description copied from interface: ExternalResource
      Obtain the name of the resource.
      Specified by:
      getName in interface ExternalResource
      Specified by:
      getName in class SemanticWordRelationDictionary
      Returns:
      Name of the resource.
    • isDiskBufferEnabled

      public boolean isDiskBufferEnabled()
    • commitAll

      private void commitAll(PersistenceService.PreconfiguredPersistences persistence)
      Commits the transaction of the given persistence.
      Parameters:
      persistence - Persistence to be committed.
    • commitAll

      private void commitAll()
      Commit data changes if active.
    • setDiskBufferEnabled

      public void setDiskBufferEnabled(boolean diskBufferEnabled)
      Note that when you disable your buffer during runtime, the buffer will be reinitialized.
      Parameters:
      diskBufferEnabled - True for enablement, else false.