Class WikidataKnowledgeSource
java.lang.Object
de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.external.SemanticWordRelationDictionary
de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.external.wikidata.WikidataKnowledgeSource
- All Implemented Interfaces:
ExternalResource, ExternalResourceWithHypernymCapability, ExternalResourceWithSynonymCapability, HypernymCapability, SynonymCapability
-
Field Summary
Modifier and Type | Field | Description
(package private) ConcurrentMap<String,Boolean> | askBuffer | Buffer for (expensive) ask queries.
private static final String | ENDPOINT_URL | The public SPARQL endpoint.
(package private) ConcurrentMap<String,HashSet<String>> | hypernymyBuffer | Buffer for repeated hypernymy requests.
private static final String | IS_HYPERNYM_LEVEL_1_NO_CLOSE | Query fragment, add '}' to make it usable.
private static final String | IS_HYPERNYM_LEVEL_2_NO_CLOSE | Query fragment, add '}' to make it usable.
private static final String | IS_HYPERNYM_LEVEL_3_NO_CLOSE | Query fragment, add '}' to make it usable.
private boolean | isDiskBufferEnabled | If the disk-buffer is disabled, no buffers are read/written from/to the disk.
private String | knowledgeSourceName | Name of the instance.
private WikidataLinker | linker | Linker for the Wikidata knowledge source.
private static final org.slf4j.Logger | LOGGER | Default logger
private PersistenceService | persistenceService | Service responsible for disk buffers.
(package private) ConcurrentMap<String,HashSet<String>> | synonymyBuffer | Buffer for repeated synonymy requests.
-
Constructor Summary
Constructor | Description
WikidataKnowledgeSource() | Constructor
WikidataKnowledgeSource(boolean isDiskBufferEnabled) | Constructor
-
Method Summary
Modifier and Type | Method | Description
private static void | addOrPut(HashMap<String,HashSet<String>> map, String key, HashSet<String> setToAdd) | Helper method.
(package private) static String | buildHypernymDepthQuery(String superconcept, String subconcept, int depth) | Checks whether one Wikidata URI is a subclass/instance of the other.
(package private) static String | buildInstanceOfSublcassOfCleanQuery(String wikidataUri, int depth) | The resulting query follows the hierarchy upwards up to the given depth.
void | close() | Closing open resources.
private void | commitAll() | Commit data changes if active.
private void | commitAll(PersistenceService.PreconfiguredPersistences persistence) | Transaction commit
 | determineCommonConcepts | Helper method.
org.javatuples.Pair<Set<String>,Integer> | getClosestCommonHypernym(List<String> links, int limitOfHops) | Determine the closest common hypernym.
 | getConceptLinks(String[] conceptsToBeLinked) | For multiple words look for all links.
 | getHypernyms(String linkedConcept) | This will return the direct hypernyms as String.
 | getHypernymsLexical(String linkedConcept) | Uses wdt:P31 (instance of) as well as wdt:P279 (subclass of).
 | getHypernymsLexical(String linkedConcept, Language language) | Uses wdt:P31 (instance of) as well as wdt:P279 (subclass of).
 | getLabelsForLink(String linkedConcept) | Given a URI, obtain the written representations, i.e., the labels.
 | getLabelsForLink(String linkedConcept, Language language) | Given a linked concept, retrieve all labels (rdfs:label, skos:altLabel).
 | getLinker() | Returns the linker instance for this particular resource.
 | getName() | Obtain the name of the resource.
 | getSynonyms(String linkedConcept, Language language) | Language-bound synonymy retrieval.
 | getSynonymsLexical(String linkedConcept) | Retrieves a list of synonyms independently of the word sense.
private void | initializeBuffers() | Initialize buffers (either on-disk or memory).
boolean | isDiskBufferEnabled() |
boolean | isHypernym(String superConcept, String subConcept, int depth) | Determine whether the specified superConcept is actually a superConcept given the specified subConcept.
boolean | isInDictionary(String word) | Test whether the given word can be mapped (1-1) to a Wikidata concept (no smart mechanisms applied).
boolean | isInDictionary(String word, Language language) | Test whether the given word can be mapped (1-1) to a Wikidata concept (no smart mechanisms applied).
private boolean | isInDictionaryWithAltLabelAskQuery(String word, Language language) | Ask query with altLabel.
private boolean | isInDictionaryWithLabelAskQuery(String word, Language language) | Ask query with label.
boolean | isStrongFormSynonymous(String link1, String link2) | Checks for synonymy by determining whether link1 is contained in the set of synonymous words of link2 or vice versa.
private static String | replaceConceptsAndCompleteQuery(String template, String superConcept, String subConcept) | Helper method.
void | setDiskBufferEnabled(boolean diskBufferEnabled) | Note that when you disable your buffer during runtime, the buffer will be reinitialized.
Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.external.SemanticWordRelationDictionary
isHypernym, isHypernymous, isSynonymous, isSynonymousOrHypernymous
-
Field Details
-
synonymyBuffer
ConcurrentMap<String,HashSet<String>> synonymyBuffer
Buffer for repeated synonymy requests.
-
hypernymyBuffer
ConcurrentMap<String,HashSet<String>> hypernymyBuffer
Buffer for repeated hypernymy requests.
-
askBuffer
ConcurrentMap<String,Boolean> askBuffer
Buffer for (expensive) ask queries.
-
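As an illustration of how a buffer such as askBuffer can avoid repeating expensive ask queries, here is a minimal sketch. It is not the class's actual implementation: the class name AskBufferSketch and the Predicate standing in for the remote SPARQL ASK call are assumptions made for this example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Predicate;

/** Hypothetical sketch: memoizes the boolean answers of an expensive ask-style lookup. */
class AskBufferSketch {

    // Same shape as the documented field: query string -> boolean answer.
    private final ConcurrentMap<String, Boolean> askBuffer = new ConcurrentHashMap<>();

    // Assumption: stands in for the real (remote, expensive) ASK query execution.
    private final Predicate<String> remoteAsk;

    AskBufferSketch(Predicate<String> remoteAsk) {
        this.remoteAsk = remoteAsk;
    }

    /** Returns the buffered answer if present; otherwise runs the query once and stores the result. */
    boolean ask(String query) {
        return askBuffer.computeIfAbsent(query, remoteAsk::test);
    }
}
```

Repeated calls with the same query string are answered from the map, so the remote call runs at most once per distinct query.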
linker
Linker for the Wikidata knowledge source.
-
ENDPOINT_URL
The public SPARQL endpoint.
-
LOGGER
private static final org.slf4j.Logger LOGGER
Default logger
-
knowledgeSourceName
Name of the instance.
-
persistenceService
Service responsible for disk buffers.
-
isDiskBufferEnabled
private boolean isDiskBufferEnabled
If the disk-buffer is disabled, no buffers are read/written from/to the disk. Default: true.
-
IS_HYPERNYM_LEVEL_1_NO_CLOSE
Query fragment, add '}' to make it usable. Replace <subconcept> and <superconcept>.
-
IS_HYPERNYM_LEVEL_2_NO_CLOSE
Query fragment, add '}' to make it usable. Replace <subconcept> and <superconcept>.
-
IS_HYPERNYM_LEVEL_3_NO_CLOSE
Query fragment, add '}' to make it usable. Replace <subconcept> and <superconcept>.
-
-
Constructor Details
-
WikidataKnowledgeSource
public WikidataKnowledgeSource()
Constructor
-
WikidataKnowledgeSource
public WikidataKnowledgeSource(boolean isDiskBufferEnabled)
Constructor
- Parameters:
isDiskBufferEnabled - True if the buffer shall be enabled.
-
-
Method Details
-
initializeBuffers
private void initializeBuffers()
Initialize buffers (either on-disk or memory).
-
isInDictionary
Test whether the given word can be mapped (1-1) to a Wikidata concept (no smart mechanisms applied). The assumed default language is English.
- Parameters:
word - The word to be looked for.
- Returns:
True if the word can be found in the dictionary.
-
isInDictionary
Test whether the given word can be mapped (1-1) to a Wikidata concept (no smart mechanisms applied).
- Parameters:
word - The word to be used for the concept lookup.
language - The language of the word.
- Returns:
True if the specified word is found in the knowledge resource.
-
isStrongFormSynonymous
Checks for synonymy by determining whether link1 is contained in the set of synonymous words of link2 or vice versa.
- Specified by:
isStrongFormSynonymous in interface SynonymCapability
- Overrides:
isStrongFormSynonymous in class SemanticWordRelationDictionary
- Parameters:
link1 - Word 1
link2 - Word 2
- Returns:
True if the given words are synonymous, else false.
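The symmetric containment check described for isStrongFormSynonymous can be sketched as follows. This is only an illustration under assumptions: the synonym sets are supplied as an in-memory map (the real class retrieves them from Wikidata), and the class name StrongFormSynonymySketch is hypothetical.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a symmetric "contained in the other's synonym set" check. */
class StrongFormSynonymySketch {

    /**
     * @param synonyms assumption: maps each entry to its set of synonymous words
     * @return true if link1 is in the synonym set of link2 or vice versa
     */
    static boolean isStrongFormSynonymous(String link1, String link2,
                                          Map<String, Set<String>> synonyms) {
        if (link1 == null || link2 == null) {
            return false;
        }
        Set<String> synonymsOfLink1 = synonyms.getOrDefault(link1, Set.of());
        Set<String> synonymsOfLink2 = synonyms.getOrDefault(link2, Set.of());
        // One direction is enough: the relation is treated as symmetric.
        return synonymsOfLink2.contains(link1) || synonymsOfLink1.contains(link2);
    }
}
```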
-
isInDictionaryWithLabelAskQuery
Ask query with label.
- Parameters:
word - The concept label that shall be looked up.
language - The language of the label.
- Returns:
True, if a concept has the label as rdfs:label.
-
isInDictionaryWithAltLabelAskQuery
Ask query with altLabel.
- Parameters:
word - The concept label that shall be looked up.
language - The language of the label.
- Returns:
True, if a concept has the label as skos:altLabel.
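To make the two ask queries more concrete, the following sketch builds a SPARQL ASK string for rdfs:label and for skos:altLabel. The query text is an assumption and may differ from what the class actually sends; the class name LabelAskQuerySketch and the plain String language code (instead of the Language type) are hypothetical simplifications.

```java
/** Hypothetical sketch: builds ASK queries that test for a label in a given language. */
class LabelAskQuerySketch {

    private static final String PREFIXES =
            "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
          + "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n";

    /** Builds an ASK query for the given label property, word, and language code (e.g. "en"). */
    static String buildAskQuery(String property, String word, String languageCode) {
        // Escape backslashes and double quotes so the word is a valid SPARQL string literal.
        String escaped = word.replace("\\", "\\\\").replace("\"", "\\\"");
        return PREFIXES + "ASK WHERE { ?concept " + property
                + " \"" + escaped + "\"@" + languageCode + " . }";
    }

    static String labelAsk(String word, String languageCode) {
        return buildAskQuery("rdfs:label", word, languageCode);
    }

    static String altLabelAsk(String word, String languageCode) {
        return buildAskQuery("skos:altLabel", word, languageCode);
    }
}
```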
-
getSynonymsLexical
Description copied from class: SemanticWordRelationDictionary
Retrieves a list of synonyms independently of the word sense. The assumed language is English.
- Specified by:
getSynonymsLexical in interface SynonymCapability
- Specified by:
getSynonymsLexical in class SemanticWordRelationDictionary
- Parameters:
linkedConcept - The linked concept for which synonyms shall be retrieved.
- Returns:
A set of synonyms in word form (not links).
-
getSynonyms
Language-bound synonymy retrieval.
- Parameters:
linkedConcept - The linked concept for which synonyms shall be retrieved.
language - The language of the synonyms.
- Returns:
A set of synonyms (string).
-
getConceptLinks
For multiple words look for all links.
- Parameters:
conceptsToBeLinked - An array of concepts that shall be linked.
- Returns:
A list of links that were found for the given concepts. Concepts that could not be linked are ignored. If none of the given concepts can be linked, the resulting ArrayList will be empty.
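The documented contract (unlinkable concepts are skipped, and the result is empty rather than null) could look roughly like this. The linker is modeled as a plain Function that returns null when no link is found; that convention and the class name ConceptLinksSketch are assumptions for this sketch, not the behavior of WikidataLinker.

```java
import java.util.ArrayList;
import java.util.function.Function;

/** Hypothetical sketch: collects links for several concepts and ignores unlinkable ones. */
class ConceptLinksSketch {

    /**
     * @param linker assumption: returns a link for a concept, or null if it cannot be linked
     */
    static ArrayList<String> getConceptLinks(String[] conceptsToBeLinked,
                                             Function<String, String> linker) {
        ArrayList<String> result = new ArrayList<>();
        if (conceptsToBeLinked == null) {
            return result;
        }
        for (String concept : conceptsToBeLinked) {
            String link = linker.apply(concept);
            if (link != null) {
                result.add(link); // keep only concepts that could be linked
            }
        }
        return result;
    }
}
```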
-
getClosestCommonHypernym
public org.javatuples.Pair<Set<String>,Integer> getClosestCommonHypernym(List<String> links, int limitOfHops)
Determine the closest common hypernym.
- Parameters:
links - The linked concepts for which the closest common hypernym shall be found.
limitOfHops - This is an expensive operation. You can limit the number of upward hops to perform.
- Returns:
The closest common hypernym together with the upwards-depth.
This is represented as a pair:
[0] Set of common concepts (String)
[1] The depth as integer. If there is a direct hyperconcept, the depth will be equal to 1.
If multiple candidates apply, all are returned. If there is no closest common hypernym, null will be returned.
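One way to picture the search is a level-by-level upward expansion that stops at the first depth where all inputs share an ancestor. The sketch below works on an in-memory map of direct hypernyms and returns a Map.Entry instead of org.javatuples.Pair. Both choices, the class name ClosestCommonHypernymSketch, and the decision not to count the input concepts themselves as candidates are assumptions; the actual method queries Wikidata and may differ in these details.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: level-wise upward search for the closest common hypernym. */
class ClosestCommonHypernymSketch {

    /** Returns (common concepts, depth), or null if nothing is found within limitOfHops. */
    static Map.Entry<Set<String>, Integer> closestCommonHypernym(
            List<String> links, int limitOfHops, Map<String, Set<String>> directHypernyms) {
        if (links == null || links.isEmpty()) {
            return null;
        }
        List<Set<String>> reached = new ArrayList<>();  // all ancestors seen so far, per link
        List<Set<String>> frontier = new ArrayList<>(); // concepts to expand next, per link
        for (String link : links) {
            reached.add(new HashSet<>());
            frontier.add(new HashSet<>(Set.of(link)));
        }
        for (int depth = 1; depth <= limitOfHops; depth++) {
            for (int i = 0; i < links.size(); i++) {
                Set<String> next = new HashSet<>();
                for (String concept : frontier.get(i)) {
                    next.addAll(directHypernyms.getOrDefault(concept, Set.of()));
                }
                next.removeAll(reached.get(i)); // do not expand a concept twice
                reached.get(i).addAll(next);
                frontier.set(i, next);
            }
            // Intersect the ancestor sets of all links.
            Set<String> common = new HashSet<>(reached.get(0));
            for (Set<String> ancestors : reached) {
                common.retainAll(ancestors);
            }
            if (!common.isEmpty()) {
                return Map.entry(common, depth); // all candidates at the first matching depth
            }
        }
        return null;
    }
}
```

The hop limit bounds the number of expansion rounds, which matters because each round corresponds to additional (expensive) lookups in a real setting.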
-
addOrPut
private static void addOrPut(HashMap<String, HashSet<String>> map, String key, HashSet<String> setToAdd)
Helper method.
- Parameters:
map - The map to which shall be added or put.
key - Key for the map.
setToAdd - What shall be added.
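Judging by its name and parameters, addOrPut merges a set into the entry for a key and creates the entry if it is missing. A minimal sketch of that reading, which is an assumption and not the actual private implementation:

```java
import java.util.HashMap;
import java.util.HashSet;

/** Hypothetical sketch of an "add to the existing set, or put a new one" helper. */
class AddOrPutSketch {

    static void addOrPut(HashMap<String, HashSet<String>> map, String key,
                         HashSet<String> setToAdd) {
        // Create an empty set for unseen keys, then merge the new values into it.
        map.computeIfAbsent(key, k -> new HashSet<>()).addAll(setToAdd);
    }
}
```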
-
determineCommonConcepts
Helper method. Given a map of concepts, the common concepts are determined and returned. Package modifier for better testing.
- Parameters:
data - The data structure in which it shall be checked whether there are common concepts.
- Returns:
Common concepts. The set is empty if there are none.
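The parameter type of determineCommonConcepts is not shown here, so the following sketch assumes a map from each input concept to the set of concepts reached from it and computes the intersection of all value sets. The class name and the signature are hypothetical.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: intersects all value sets of a map. */
class CommonConceptsSketch {

    /** Returns the concepts present in every value set; empty if there are none. */
    static Set<String> determineCommonConcepts(Map<String, ? extends Set<String>> data) {
        Set<String> common = null;
        for (Set<String> concepts : data.values()) {
            if (common == null) {
                common = new HashSet<>(concepts); // start with a copy of the first set
            } else {
                common.retainAll(concepts);       // keep only what is also in this set
            }
        }
        return common == null ? new HashSet<>() : common;
    }
}
```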
-
buildHypernymDepthQuery
Checks whether one Wikidata URI is a subclass/instance of the other.
- Parameters:
superconcept - URI of the superconcept.
subconcept - URI of the subconcept.
depth - The depth.
- Returns:
Query as String.
-
replaceConceptsAndCompleteQuery
private static String replaceConceptsAndCompleteQuery(String template, String superConcept, String subConcept)
Helper method. Only to be used in buildHypernymDepthQuery(String, String, int).
- Parameters:
template - Template to be used.
superConcept - Super concept.
subConcept - Sub concept.
- Returns:
Complete query.
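The query fragments are documented as needing their <subconcept> and <superconcept> placeholders replaced and a closing '}' appended. The sketch below shows that completion step on a made-up fragment. The fragment text is an assumption and not the value of any IS_HYPERNYM_LEVEL_n_NO_CLOSE constant; the class name and the choice to wrap the inserted URIs in angle brackets are likewise hypothetical.

```java
/** Hypothetical sketch: fills a query fragment's placeholders and closes the query. */
class HypernymQueryTemplateSketch {

    // Assumption: illustrative fragment only, deliberately missing its final '}'.
    static final String LEVEL_1_NO_CLOSE =
            "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
          + "ASK { { <subconcept> wdt:P31 <superconcept> } "
          + "UNION { <subconcept> wdt:P279 <superconcept> } ";

    static String replaceConceptsAndCompleteQuery(String template, String superConcept,
                                                  String subConcept) {
        return template
                .replace("<subconcept>", "<" + subConcept + ">")
                .replace("<superconcept>", "<" + superConcept + ">")
                + "}"; // the fragment only becomes a usable query once it is closed
    }
}
```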
-
isHypernym
Determine whether the specified superConcept is actually a superConcept given the specified subConcept.
- Overrides:
isHypernym in class SemanticWordRelationDictionary
- Parameters:
superConcept - URI or link.
subConcept - URI or link.
depth - The desired depth (integer in the range [1, 2, 3]).
- Returns:
True if it is a hypernym, else false.
-
buildInstanceOfSublcassOfCleanQuery
The resulting query follows the hierarchy upwards up to the given depth. wdt:P31 (instance of) and wdt:P279 (subclass of) are never mixed: only instance-of chains are followed upwards, in UNION with only subclass-of chains. The "instance-of" of a superclass cannot be found with this query!
DEV remark: This is a bit too involved for an easy-to-understand API. This is currently not used. Look at the unit test to better understand what the query does.
- Parameters:
wikidataUri - The Wikidata URI.
depth - The desired depth.
- Returns:
The query as String.
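A query with the "no mixture" property described for buildInstanceOfSublcassOfCleanQuery could be assembled as below: one UNION branch per property and hop count, where each branch repeats a single property in a SPARQL sequence path. This is a sketch under assumptions. The query text the method really produces is not shown here, and the class name and the variable ?c are hypothetical.

```java
import java.util.Collections;

/** Hypothetical sketch: upward query that never mixes wdt:P31 and wdt:P279 in one path. */
class CleanUpwardQuerySketch {

    static String build(String wikidataUri, int depth) {
        if (depth < 1) {
            throw new IllegalArgumentException("depth must be >= 1");
        }
        StringBuilder sb = new StringBuilder(
                "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
              + "SELECT DISTINCT ?c WHERE {\n");
        boolean first = true;
        for (String property : new String[] {"wdt:P31", "wdt:P279"}) {
            for (int hops = 1; hops <= depth; hops++) {
                if (!first) {
                    sb.append("  UNION\n");
                }
                first = false;
                // A path such as wdt:P279/wdt:P279 follows the same property 'hops' times.
                String path = String.join("/", Collections.nCopies(hops, property));
                sb.append("  { <").append(wikidataUri).append("> ")
                  .append(path).append(" ?c }\n");
            }
        }
        return sb.append("}").toString();
    }
}
```

Because every path repeats one property only, a result such as the instance-of of a superclass (P279 followed by P31) is never produced, which matches the limitation stated above.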
-
getHypernyms
This will return the direct hypernyms as String.
- Specified by:
getHypernyms in class SemanticWordRelationDictionary
- Parameters:
linkedConcept - The linked concept for which hypernyms shall be retrieved. The linked concept is a URI or a multilink.
- Returns:
The found hypernyms as links (URIs). If the lexical representation is needed immediately, use getHypernymsLexical(String, Language). If nothing was found, an empty set is returned.
-
getHypernymsLexical
Uses wdt:P31 (instance of) as well as wdt:P279 (subclass of).
- Parameters:
linkedConcept - The concept that has already been linked (URI). The assumed language is English.
- Returns:
A set of links.
-
getHypernymsLexical
Uses wdt:P31 (instance of) as well as wdt:P279 (subclass of).
- Parameters:
linkedConcept - The concept that has already been linked (URI).
language - Language of the strings.
- Returns:
A set of links.
-
getLabelsForLink
Given a URI, obtain the written representations, i.e., the labels.
- Parameters:
linkedConcept - The URI for which labels shall be obtained.
- Returns:
A set of labels.
-
getLabelsForLink
Given a linked concept, retrieve all labels (rdfs:label, skos:altLabel).
- Parameters:
linkedConcept - The link to the concept (URI).
language - Desired language for the labels.
- Returns:
Set of labels, all in the specified language.
-
close
public void close()
Description copied from class: SemanticWordRelationDictionary
Closing open resources.
- Specified by:
close in class SemanticWordRelationDictionary
-
getLinker
Description copied from interface: ExternalResource
Returns the linker instance for this particular resource.
- Specified by:
getLinker in interface ExternalResource
- Specified by:
getLinker in class SemanticWordRelationDictionary
- Returns:
The specific linker used to link words to concepts.
-
getName
Description copied from interface: ExternalResource
Obtain the name of the resource.
- Specified by:
getName in interface ExternalResource
- Specified by:
getName in class SemanticWordRelationDictionary
- Returns:
Name of the resource.
-
isDiskBufferEnabled
public boolean isDiskBufferEnabled()
-
commitAll
Transaction commit
- Parameters:
persistence - Persistence to be committed.
-
commitAll
private void commitAll()
Commit data changes if active.
-
setDiskBufferEnabled
public void setDiskBufferEnabled(boolean diskBufferEnabled)
Note that when you disable your buffer during runtime, the buffer will be reinitialized.
- Parameters:
diskBufferEnabled - True for enablement, else false.
-