Class LabelToConceptLinkerEmbeddings
java.lang.Object
de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.external.embeddings.LabelToConceptLinkerEmbeddings
- All Implemented Interfaces:
LabelToConceptLinker
- Direct Known Subclasses:
BabelNetEmbeddingLinker, WebIsAlodEmbeddingLinker, WiktionaryEmbeddingLinker, WordNetEmbeddingLinker
LabelToConceptLinker with some additional functions required for embedding approaches.
-
Field Summary
- private static final org.slf4j.Logger LOGGER: Default logger.
- lookupMap: Data lookup.
- (package private) LinkedList<StringModifier> stringModificationSequence: The list of operations that is performed to find a concept in the dictionary.
-
Constructor Summary
- LabelToConceptLinkerEmbeddings(File entityFile): Constructor.
- LabelToConceptLinkerEmbeddings(String filePathToEntityFile): Constructor.
-
Method Summary
- linkLabelToTokensLeftToRight(String labelToBeLinked): Splits labelToBeLinked into n-grams of unbounded size and tries to link the components.
- linkToPotentiallyMultipleConcepts(String labelToBeLinked): Tries to link labelToBeLinked to one concept if possible.
- linkToSingleConcept(String labelToBeLinked): Queries for a concept and returns a link that represents an entity in the background knowledge source, such as the SemanticWordRelationDictionary.
- abstract String normalize(String stringToBeNormalized): Normalization.
- readFileIntoMap(File file): Reads the concepts/entities from the file into a map.
- void setStringModificationSequence(LinkedList<StringModifier> stringModificationSequence)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.external.LabelToConceptLinker
getNameOfLinker, setNameOfLinker
-
Field Details
-
LOGGER
private static final org.slf4j.Logger LOGGER
Default logger.
-
stringModificationSequence
LinkedList<StringModifier> stringModificationSequence
The list of operations that is performed to find a concept in the dictionary.
-
lookupMap
Data lookup.
-
-
Constructor Details
-
LabelToConceptLinkerEmbeddings
Constructor.
- Parameters:
entityFile - File containing the entities that are available in the background knowledge source. One entity per line. UTF-8 encoded.
-
LabelToConceptLinkerEmbeddings
Constructor.
- Parameters:
filePathToEntityFile - The file path to the entity file as a string.
-
-
Method Details
-
normalize
Normalization.
- Parameters:
stringToBeNormalized - The String that shall be normalized.
- Returns:
Normalized version of the String.
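Because normalize is abstract, each subclass decides how a label is mapped to a key of its entity file. The following is only a minimal sketch of what such a normalization could look like. The scheme (trim, lower-case, whitespace to underscore) and the class name NormalizeSketch are assumptions for illustration and are not taken from MELT.

```java
import java.util.Locale;

final class NormalizeSketch {

    // Hypothetical normalization scheme, NOT MELT's implementation:
    // trim, lower-case, and collapse whitespace runs into one underscore.
    static String normalize(String stringToBeNormalized) {
        if (stringToBeNormalized == null) {
            return null;
        }
        return stringToBeNormalized
                .trim()
                .toLowerCase(Locale.ROOT)
                .replaceAll("\\s+", "_");
    }

    public static void main(String[] args) {
        System.out.println(normalize("  European  Union ")); // prints european_union
    }
}
```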
-
linkToSingleConcept
Description copied from interface: LabelToConceptLinker
Queries for a concept and returns a link that represents an entity in the background knowledge source, such as the SemanticWordRelationDictionary. Note that the link may not always be something intuitive such as a URI but may also be an artificial identifier that is understood by the corresponding background knowledge source.
- Specified by:
linkToSingleConcept in interface LabelToConceptLinker
- Parameters:
labelToBeLinked - The label which shall be linked to a single concept.
- Returns:
Concept or null if no link could be found.
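The fields stringModificationSequence and lookupMap suggest that a label is looked up by trying a list of string modifications one after another. A rough sketch of that idea follows. It uses java.util.function.UnaryOperator<String> as a stand-in for StringModifier and assumes a Map<String, String> from lookup key to concept identifier; both are assumptions, since this page does not show the actual types or the real lookup logic.

```java
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

final class SingleConceptLinkSketch {

    // Assumed shape: lookup key -> concept identifier.
    private final Map<String, String> lookupMap;
    // Stand-in for LinkedList<StringModifier>; the real interface is not shown here.
    private final List<UnaryOperator<String>> stringModificationSequence;

    SingleConceptLinkSketch(Map<String, String> lookupMap,
                            List<UnaryOperator<String>> stringModificationSequence) {
        this.lookupMap = lookupMap;
        this.stringModificationSequence = stringModificationSequence;
    }

    // Returns the first concept found, or null if no modification yields a hit.
    String linkToSingleConcept(String labelToBeLinked) {
        if (labelToBeLinked == null || labelToBeLinked.trim().isEmpty()) {
            return null;
        }
        for (UnaryOperator<String> modifier : stringModificationSequence) {
            String candidate = modifier.apply(labelToBeLinked);
            String concept = lookupMap.get(candidate);
            if (concept != null) {
                return concept;
            }
        }
        return null;
    }
}
```

Ordering the modifiers from least to most aggressive keeps exact matches preferred over fuzzier ones.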
-
linkLabelToTokensLeftToRight
Splits labelToBeLinked into n-grams of unbounded size and tries to link the components. This corresponds to a MAXGRAM_LEFT_TO_RIGHT_TOKENIZER or NGRAM_LEFT_TO_RIGHT_TOKENIZER OneToManyLinkingStrategy.
- Parameters:
labelToBeLinked - Label that shall be linked.
- Returns:
A set of concept URIs that were found.
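A max-gram left-to-right strategy can be sketched as follows: starting at the first token, try the longest remaining n-gram, shrink it until a lookup succeeds, then continue after the matched span. This is a self-contained illustration and not MELT's tokenizer. The class name, the whitespace tokenization, the Map<String, String> lookup, and the choice to skip tokens that cannot be linked are all assumptions.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class LeftToRightLinkSketch {

    // Greedy max-gram linking; 'lookup' maps an n-gram to a concept identifier.
    static Set<String> link(String labelToBeLinked, Map<String, String> lookup) {
        Set<String> result = new LinkedHashSet<>();
        if (labelToBeLinked == null || labelToBeLinked.trim().isEmpty()) {
            return result;
        }
        String[] tokens = labelToBeLinked.trim().split("\\s+");
        int start = 0;
        while (start < tokens.length) {
            boolean linked = false;
            // Try the longest n-gram first, then shrink from the right.
            for (int end = tokens.length; end > start; end--) {
                String candidate = String.join(" ", Arrays.copyOfRange(tokens, start, end));
                String concept = lookup.get(candidate);
                if (concept != null) {
                    result.add(concept);
                    start = end; // continue after the matched span
                    linked = true;
                    break;
                }
            }
            if (!linked) {
                start++; // assumption: skip a token that cannot be linked
            }
        }
        return result;
    }
}
```

With the entries "european union" and "law" in the lookup, the label "european union law" is linked as the bigram followed by the unigram, rather than as three separate tokens.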
-
linkToPotentiallyMultipleConcepts
Description copied from interface: LabelToConceptLinker
This method tries to link labelToBeLinked to one concept if possible. If it fails, it will try to link it to multiple concepts.
- Specified by:
linkToPotentiallyMultipleConcepts in interface LabelToConceptLinker
- Parameters:
labelToBeLinked - The label which shall be linked.
- Returns:
One or multiple linked concepts in a set. Null if it could not fully link the label.
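The documented fallback (one concept first, several concepts otherwise, null when the label cannot be fully linked) can be illustrated with a small sketch. The two linking strategies are passed in as functions so that the control flow stands alone. The class name and both parameters are hypothetical, and the multi-concept function is assumed to return null or an empty set when it cannot fully link the label.

```java
import java.util.Collections;
import java.util.Set;
import java.util.function.Function;

final class MultiConceptLinkSketch {

    // singleLinker: label -> concept or null (hypothetical parameter).
    // multiLinker: label -> concepts, or null/empty if not fully linkable (hypothetical parameter).
    static Set<String> linkToPotentiallyMultipleConcepts(
            String labelToBeLinked,
            Function<String, String> singleLinker,
            Function<String, Set<String>> multiLinker) {
        if (labelToBeLinked == null || labelToBeLinked.trim().isEmpty()) {
            return null;
        }
        // Step 1: prefer a single concept for the whole label.
        String single = singleLinker.apply(labelToBeLinked);
        if (single != null) {
            return Collections.singleton(single);
        }
        // Step 2: fall back to linking the label to several concepts.
        Set<String> multiple = multiLinker.apply(labelToBeLinked);
        if (multiple == null || multiple.isEmpty()) {
            return null;
        }
        return multiple;
    }
}
```

Callers therefore need a null check before iterating over the result.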
-
readFileIntoMap
Reads the concepts/entities from the file into a map.
- Parameters:
file - The file must be UTF-8 encoded.
- Returns:
The contents of the file as Map.
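Reading a UTF-8 entity file with one entity per line can be sketched as below. The key/value layout of the real map is not documented on this page, so this example assumes a lower-cased key that points to the original entity string. The class name and the use of java.nio.file.Path instead of File are likewise assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class EntityFileSketch {

    // Reads one entity per line; blank lines are ignored.
    // Assumed layout: lower-cased entity -> entity as written in the file.
    static Map<String, String> readFileIntoMap(Path file) throws IOException {
        Map<String, String> result = new HashMap<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String entity = line.trim();
                if (entity.isEmpty()) {
                    continue;
                }
                result.put(entity.toLowerCase(Locale.ROOT), entity);
            }
        }
        return result;
    }
}
```

Passing the charset explicitly avoids depending on the platform default encoding.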
-
getStringModificationSequence
-
setStringModificationSequence
-
getLookupMap
-