Class WiktionaryKnowledgeSource
java.lang.Object
de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.external.SemanticWordRelationDictionary
de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.external.wiktionary.WiktionaryKnowledgeSource
All Implemented Interfaces:
ExternalResource, ExternalResourceWithHypernymCapability, ExternalResourceWithSynonymCapability, HypernymCapability, SynonymCapability
Class utilizing DBnary, a SPARQL endpoint for Wiktionary.
Alternatively, TDB1 can be used as offline storage.
-
Field Summary
| Modifier and Type | Field | Description |
| --- | --- | --- |
| private ConcurrentMap<String,Boolean> | askBuffer | Buffer for ask queries. |
| private static final String | ENDPOINT_URL | The public SPARQL endpoint. |
| private ConcurrentMap<String,HashSet<String>> | hypernymyBuffer | Buffer for hypernymy. |
| private boolean | isDiskBufferEnabled | True if buffers shall be written to disk. |
| private boolean | isUseTdb | True if a TDB source shall be used rather than an online SPARQL endpoint. |
| private WiktionaryLinker | linker | The linker that links input strings to terms. |
| private static final org.slf4j.Logger | LOGGER | Logger for this class. |
| private PersistenceService | persistenceService | Service responsible for disk buffers. |
| private ConcurrentMap<String,HashSet<String>> | synonymyBuffer | Buffer for synonyms. |
| private org.apache.jena.query.Dataset | tdbDataset | The TDB dataset into which the DBnary data set was loaded. |
| private ConcurrentMap<String,HashSet<String>> | translationBuffer | |
| private ConcurrentMap<String,HashSet<String>> | translationOfBuffer | |
-
Constructor Summary
| Constructor | Description |
| --- | --- |
| WiktionaryKnowledgeSource() | Constructor for Wiktionary online (SPARQL endpoint) access. |
| WiktionaryKnowledgeSource(boolean isDiskBufferEnabled) | Constructor |
| WiktionaryKnowledgeSource(String tdbDirectoryPath) | Constructor for DBnary TDB access. |
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | close() | De-constructor; call before ending the program. |
| private void | commit(PersistenceService.PreconfiguredPersistences persistence) | Commit persistence. |
| private void | commitAll() | Commit data changes if active. |
| (package private) static String | encodeWord(String word) | Encodes words so that they can be looked up in the Wiktionary dictionary. |
| | getHypernyms(String linkedConcept) | Obtain hypernyms for the given concept. |
| | getHypernyms(String linkedConcept, Language language) | Obtain hypernyms for the given concept. |
| private static String | getLemmaFromURI(String uri) | Given a resource URI, this method will transform it to a lemma. |
| | getLinker() | Returns the linker instance for this particular resource. |
| | getName() | Obtain the name of the resource. |
| HashSet<String> | getNormalizedTranslations(String linkedConcept, Language sourceLanguage, Language targetLanguage) | Looks for translations of the given string. |
| | getSynonyms(String word, Language language) | Retrieves the synonyms of a particular word in a particular language. |
| | getSynonymsEncoded(String linkedConcept) | |
| | getSynonymsLexical(String linkedConcept) | Retrieves a list of synonyms independently of the word sense. |
| HashSet<String> | getTranslation(String linkedConcept, Language sourceLanguage, Language targetLanguage) | Obtain the translations for the linked concept. |
| | getTranslationOf(String translationString, Language languageOfTranslation) | Given a translation, find concepts which state that the given translation is their translation. |
| private void | initialize() | Helper function for constructor-independent actions. |
| boolean | isInDictionary(String word) | |
| boolean | isInDictionary(String word, Language language) | Language dependent query for existence in the DBnary dictionary. |
| boolean | isStrongFormSynonymous(String link1, String link2) | Checks for synonymy by determining whether link1 is contained in the set of synonymous words of link2 or vice versa. |
| boolean | isTranslationDerived(String word_1, Language language_1, String word_2, Language language_2) | Checks whether the two words are translations of the same word (this mechanism uses another language as common denominator). |
| boolean | isTranslationLinked(String linkedConceptToBeTranslated, Language language_1, String linkedConcept_2, Language language_2) | Checks whether linkedConceptToBeTranslated can be translated to linkedConcept_2. |
| boolean | isTranslationNonLinked(String linkedConceptToBeTranslated, Language language_1, String nonlinkedConcept_2, Language language_2) | Checks whether linkedConceptToBeTranslated can be translated to the non-linked concept nonlinkedConcept_2. |
| boolean | isUseTdb() | |
| static String | normalizeForTranslations(String stringToBeNormalized) | |
| | normalizeForTranslations(HashSet<String> setToBeNormalized) | Normalization function for translations. |

Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.external.SemanticWordRelationDictionary
isHypernym, isHypernym, isHypernymous, isSynonymous, isSynonymousOrHypernymous
-
Field Details
-
LOGGER
private static final org.slf4j.Logger LOGGER
Logger for this class.
-
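persistenceService
private PersistenceService persistenceService
Service responsible for disk buffers.
-
synonymyBuffer
private ConcurrentMap<String,HashSet<String>> synonymyBuffer
Buffer for synonyms.
-
hypernymyBuffer
private ConcurrentMap<String,HashSet<String>> hypernymyBuffer
Buffer for hypernymy.
-
askBuffer
private ConcurrentMap<String,Boolean> askBuffer
Buffer for ask queries.
-
translationBuffer
private ConcurrentMap<String,HashSet<String>> translationBuffer
-
translationOfBuffer
private ConcurrentMap<String,HashSet<String>> translationOfBuffer
-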
tdbDataset
private org.apache.jena.query.Dataset tdbDataset
The TDB dataset into which the DBnary data set was loaded.
-
ENDPOINT_URL
private static final String ENDPOINT_URL
The public SPARQL endpoint.
-
isUseTdb
private boolean isUseTdb
True if a TDB source shall be used rather than an online SPARQL endpoint.
-
isDiskBufferEnabled
private boolean isDiskBufferEnabled
True if buffers shall be written to disk.
-
linker
private WiktionaryLinker linker
The linker that links input strings to terms.
-
-
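The buffer fields above are ConcurrentMap instances keyed by String. As a rough illustration of how such a buffer can avoid repeated endpoint or TDB queries, here is a minimal sketch. The class AskBufferSketch, its backend predicate, and the call counter are hypothetical stand-ins and are not part of WiktionaryKnowledgeSource; how the real class fills and persists its buffers is not shown on this page.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

/** Hypothetical stand-in: memoizes boolean ASK-style lookups in a ConcurrentMap. */
class AskBufferSketch {

    // Same shape as the askBuffer field: query key -> cached boolean answer.
    private final ConcurrentMap<String, Boolean> askBuffer = new ConcurrentHashMap<>();

    // Stands in for the expensive lookup (SPARQL endpoint or TDB); an assumption.
    private final Predicate<String> backend;

    // Counts how often the backend was actually consulted.
    private final AtomicInteger backendCalls = new AtomicInteger();

    AskBufferSketch(Predicate<String> backend) {
        this.backend = backend;
    }

    /** Returns the cached answer if present, otherwise asks the backend once and caches it. */
    boolean ask(String key) {
        return askBuffer.computeIfAbsent(key, k -> {
            backendCalls.incrementAndGet();
            return backend.test(k);
        });
    }

    int backendCalls() {
        return backendCalls.get();
    }
}
```

With this sketch, asking twice for the same key consults the backend only once; keys that differ in case are cached separately.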
Constructor Details
-
WiktionaryKnowledgeSource
public WiktionaryKnowledgeSource()
Constructor for Wiktionary online (SPARQL endpoint) access. By default, a disk buffer is enabled.
-
WiktionaryKnowledgeSource
public WiktionaryKnowledgeSource(boolean isDiskBufferEnabled)
Constructor.
Parameters:
isDiskBufferEnabled - True if buffers shall be written to disk.
-
WiktionaryKnowledgeSource
public WiktionaryKnowledgeSource(String tdbDirectoryPath)
Constructor for DBnary TDB access.
Parameters:
tdbDirectoryPath - Path to the Wiktionary TDB directory.
-
-
Method Details
-
initialize
private void initialize()
Helper function for constructor-independent actions.
-
close
public void close()
De-constructor; call before ending the program.
Specified by:
close in class SemanticWordRelationDictionary
-
isInDictionary
-
isInDictionary
public boolean isInDictionary(String word, Language language)
Language dependent query for existence in the DBnary dictionary. Note that the lookup is case-sensitive: (Katze, deu) can be found, whereas (katze, deu) will not return any results.
Parameters:
word - The word to be looked for.
language - The language of the word.
Returns:
boolean indicating whether the word exists in the dictionary in the corresponding language.
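To make the case-sensitivity note concrete, the following minimal sketch mimics an exact-match lookup over an in-memory set and adds a caller-side fallback that retries with a capitalized first letter. CaseSensitiveLookupSketch and both of its methods are hypothetical; the fallback is an assumption about what a caller might do and is not something the class itself is documented to provide. Languages are plain strings here instead of the Language type.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory stand-in for a case-sensitive dictionary lookup. */
class CaseSensitiveLookupSketch {

    // Entries are stored as "word@language"; purely illustrative.
    private final Set<String> entries = new HashSet<>();

    void add(String word, String language) {
        entries.add(word + "@" + language);
    }

    /** Exact, case-sensitive lookup, mirroring the behavior described above. */
    boolean isInDictionary(String word, String language) {
        return entries.contains(word + "@" + language);
    }

    /** Caller-side fallback (an assumption): also try the word with a capitalized first letter. */
    boolean isInDictionaryAnyCase(String word, String language) {
        if (word == null || word.isEmpty()) {
            return false;
        }
        if (isInDictionary(word, language)) {
            return true;
        }
        String capitalized = Character.toUpperCase(word.charAt(0)) + word.substring(1);
        return isInDictionary(capitalized, language);
    }
}
```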
-
isStrongFormSynonymous
public boolean isStrongFormSynonymous(String link1, String link2)
Checks for synonymy by determining whether link1 is contained in the set of synonymous words of link2 or vice versa.
Specified by:
isStrongFormSynonymous in interface SynonymCapability
Overrides:
isStrongFormSynonymous in class SemanticWordRelationDictionary
Parameters:
link1 - Word 1
link2 - Word 2
Returns:
True if the given words are synonymous, else false.
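The check described above can be sketched as a symmetric membership test over synonym sets. This is a minimal illustration under stated assumptions: StrongFormSynonymySketch and its in-memory synonym map are hypothetical, and the null handling is a choice made for the sketch rather than documented behavior of the class.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a strong-form synonymy check over an in-memory synonym map. */
class StrongFormSynonymySketch {

    // Maps a link to its set of synonymous words; stands in for the real synonym lookup.
    private final Map<String, Set<String>> synonyms;

    StrongFormSynonymySketch(Map<String, Set<String>> synonyms) {
        this.synonyms = synonyms;
    }

    /** True if link1 is among the synonyms of link2, or link2 is among the synonyms of link1. */
    boolean isStrongFormSynonymous(String link1, String link2) {
        if (link1 == null || link2 == null) {
            return false; // assumption: null inputs are never synonymous
        }
        Set<String> synonymsOfLink1 = synonyms.getOrDefault(link1, Collections.emptySet());
        Set<String> synonymsOfLink2 = synonyms.getOrDefault(link2, Collections.emptySet());
        return synonymsOfLink2.contains(link1) || synonymsOfLink1.contains(link2);
    }
}
```

Because either direction suffices, the check succeeds even when only one of the two entries lists the other as a synonym.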
-
getSynonymsEncoded
-
getSynonymsLexical
Description copied from class: SemanticWordRelationDictionary
Retrieves a list of synonyms independently of the word sense. The assumed language is English.
Specified by:
getSynonymsLexical in interface SynonymCapability
Specified by:
getSynonymsLexical in class SemanticWordRelationDictionary
Parameters:
linkedConcept - The linked concept for which synonyms shall be retrieved.
Returns:
A set of synonyms in word form (not links).
-
getSynonyms
Retrieves the synonyms of a particular word in a particular language.
Parameters:
word - Word for which the synonyms shall be retrieved.
language - Language of the word.
Returns:
Set of synonyms.
-
getLemmaFromURI
private static String getLemmaFromURI(String uri)
Given a resource URI, this method will transform it to a lemma.
Parameters:
uri - Resource URI to be transformed.
Returns:
Lemma.
-
encodeWord
static String encodeWord(String word)
Encodes words so that they can be looked up in the Wiktionary dictionary.
Parameters:
word - Word to be encoded.
Returns:
Encoded word.
-
getHypernyms
Obtain hypernyms for the given concept. The assumed language is English.
Specified by:
getHypernyms in class SemanticWordRelationDictionary
Parameters:
linkedConcept - The linked concept for which hypernyms shall be retrieved.
Returns:
A set of hypernyms.
-
getHypernyms
Obtain hypernyms for the given concept.
Parameters:
linkedConcept - The linked concept for which hypernyms shall be retrieved.
language - The desired language of the hypernyms.
Returns:
A set of hypernyms.
-
getTranslation
public HashSet<String> getTranslation(String linkedConcept, Language sourceLanguage, Language targetLanguage)
Obtain the translations for the linked concept.
Parameters:
linkedConcept - The concept that was linked.
sourceLanguage - Language of the linked concept.
targetLanguage - Language to which the concept shall be translated.
Returns:
The translations as words (the results are not linked concepts).
-
getTranslationOf
Given a translation, find concepts which state that the given translation is their translation.
Parameters:
translationString - The translation (textual string).
languageOfTranslation - The language of the translationString.
Returns:
A set of concepts that list translationString as their translation.
-
isTranslationDerived
public boolean isTranslationDerived(String word_1, Language language_1, String word_2, Language language_2)
Checks whether the two words are translations of the same word (this mechanism uses another language as common denominator).
Parameters:
word_1 - Word 1 (does not have to be linked).
language_1 - Language 1.
word_2 - Word 2 (does not have to be linked).
language_2 - Language 2.
Returns:
True if a translation can be derived, else false.
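One way to read the "common denominator" idea is as a pivot lookup: collect the concepts that list word_1 as a translation, collect those that list word_2, and test whether the two sets overlap. The sketch below illustrates that reading only. TranslationDerivedSketch, its reverse index, and the use of plain strings for languages are assumptions; the actual implementation may differ.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: two words are translation-derived if they share a pivot concept. */
class TranslationDerivedSketch {

    // Reverse index "translation@language" -> concepts that list it as their translation.
    private final Map<String, Set<String>> translationOf;

    TranslationDerivedSketch(Map<String, Set<String>> translationOf) {
        this.translationOf = translationOf;
    }

    /** Stand-in for a getTranslationOf-style lookup over the in-memory index. */
    Set<String> getTranslationOf(String translation, String language) {
        return translationOf.getOrDefault(translation + "@" + language, Collections.emptySet());
    }

    /** True if at least one pivot concept lists both words as translations. */
    boolean isTranslationDerived(String word1, String language1, String word2, String language2) {
        Set<String> common = new HashSet<>(getTranslationOf(word1, language1));
        common.retainAll(getTranslationOf(word2, language2));
        return !common.isEmpty();
    }
}
```

In this sketch, German "Katze" and French "chat" count as translation-derived when both are listed as translations of the same pivot concept, for example English "cat".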
-
isTranslationLinked
public boolean isTranslationLinked(String linkedConceptToBeTranslated, Language language_1, String linkedConcept_2, Language language_2)
Checks whether linkedConceptToBeTranslated can be translated to linkedConcept_2. Note that BOTH concepts have to be linked.
Parameters:
linkedConceptToBeTranslated - Linked concept.
language_1 - Language of linkedConceptToBeTranslated.
linkedConcept_2 - Linked concept.
language_2 - Language of linkedConcept_2.
Returns:
True if a translation from linkedConceptToBeTranslated to linkedConcept_2 is possible, else false.
-
isTranslationNonLinked
public boolean isTranslationNonLinked(String linkedConceptToBeTranslated, Language language_1, String nonlinkedConcept_2, Language language_2)
Checks whether linkedConceptToBeTranslated can be translated to the non-linked concept nonlinkedConcept_2. Note that the first concept has to be linked.
Parameters:
linkedConceptToBeTranslated - The linked concept.
language_1 - Language of linkedConceptToBeTranslated.
nonlinkedConcept_2 - Concept not linked (just a string).
language_2 - Language of nonlinkedConcept_2.
Returns:
True if a translation from linkedConceptToBeTranslated to nonlinkedConcept_2 is possible, else false.
-
getNormalizedTranslations
public HashSet<String> getNormalizedTranslations(String linkedConcept, Language sourceLanguage, Language targetLanguage)
Looks for translations of the given string. The translations are non-aggressively normalized (lower-case etc.) and returned.
Parameters:
linkedConcept - The linked concept for which translations shall be obtained.
sourceLanguage - Source language.
targetLanguage - Target language.
Returns:
The result is not a linked concept but instead a word that was normalized.
-
isUseTdb
public boolean isUseTdb()
-
normalizeForTranslations
Normalization function for translations.
Parameters:
setToBeNormalized - Set whose strings shall be normalized.
Returns:
HashSet with normalized strings.
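The page only states that translations are normalized non-aggressively ("lower-case etc."), so the exact steps are not documented here. As a rough illustration, the sketch below lower-cases, trims, and collapses internal whitespace. Only lower-casing is taken from the description; the other two steps, the class TranslationNormalizationSketch, and its method names are assumptions.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of a non-aggressive normalization for translation strings. */
class TranslationNormalizationSketch {

    /** Lower-cases (per the description); trimming and whitespace collapsing are assumptions. */
    static String normalize(String stringToBeNormalized) {
        if (stringToBeNormalized == null) {
            return null;
        }
        return stringToBeNormalized
                .trim()
                .toLowerCase(Locale.ROOT)
                .replaceAll("\\s+", " ");
    }

    /** Applies normalize to every element; duplicates after normalization collapse in the set. */
    static HashSet<String> normalizeAll(Set<String> setToBeNormalized) {
        HashSet<String> result = new HashSet<>();
        for (String s : setToBeNormalized) {
            String normalized = normalize(s);
            if (normalized != null) {
                result.add(normalized);
            }
        }
        return result;
    }
}
```

Normalizing both sides of a comparison this way lets strings that differ only in case or spacing match, while leaving the words themselves untouched.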
-
normalizeForTranslations
-
commit
private void commit(PersistenceService.PreconfiguredPersistences persistence)
Commit persistence.
Parameters:
persistence - The persistence that is to be committed.
-
commitAll
private void commitAll()
Commit data changes if active.
-
getLinker
Description copied from interface: ExternalResource
Returns the linker instance for this particular resource.
Specified by:
getLinker in interface ExternalResource
Specified by:
getLinker in class SemanticWordRelationDictionary
Returns:
The specific linker used to link words to concepts.
-
getName
Description copied from interface: ExternalResource
Obtain the name of the resource.
Specified by:
getName in interface ExternalResource
Specified by:
getName in class SemanticWordRelationDictionary
Returns:
Name of the resource.
-