java.lang.Object
de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.external.embeddings.LabelToConceptLinkerEmbeddings
All Implemented Interfaces:
LabelToConceptLinker
Direct Known Subclasses:
BabelNetEmbeddingLinker, WebIsAlodEmbeddingLinker, WiktionaryEmbeddingLinker, WordNetEmbeddingLinker

public abstract class LabelToConceptLinkerEmbeddings extends Object implements LabelToConceptLinker
A LabelToConceptLinker with additional functionality required for embedding-based approaches.
  • Field Details

    • LOGGER

      private static final org.slf4j.Logger LOGGER
      Default logger
    • stringModificationSequence

      LinkedList<StringModifier> stringModificationSequence
      The sequence of string modifications that are applied to a label in order to find a concept in the dictionary.
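      How such a sequence might be used can be sketched as follows. This is a minimal illustration, not the MELT implementation: it assumes that each modifier is tried in order and that the first modified label found in the lookup wins. The class ModifierSequenceSketch, the method lookupWithModifiers, and the use of UnaryOperator<String> in place of StringModifier are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch; UnaryOperator<String> stands in for MELT's StringModifier.
class ModifierSequenceSketch {

    /**
     * Applies each modifier to the label in list order and returns the first
     * concept found in the lookup, or null if no modified form matches.
     */
    static String lookupWithModifiers(String label,
                                      List<UnaryOperator<String>> modifiers,
                                      Map<String, String> lookup) {
        for (UnaryOperator<String> modifier : modifiers) {
            String concept = lookup.get(modifier.apply(label));
            if (concept != null) {
                return concept;
            }
        }
        return null;
    }
}
```

      Ordering the modifiers from least to most aggressive keeps the match as close as possible to the original label.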
    • lookupMap

      Map<String,String> lookupMap
      Lookup map holding the entities read from the entity file.
  • Constructor Details

    • LabelToConceptLinkerEmbeddings

      public LabelToConceptLinkerEmbeddings(File entityFile)
      Constructor
      Parameters:
      entityFile - File containing the entities that are available in the background knowledge source. One entity per line. UTF-8 encoded.
    • LabelToConceptLinkerEmbeddings

      public LabelToConceptLinkerEmbeddings(String filePathToEntityFile)
      Constructor
      Parameters:
      filePathToEntityFile - The file path to the entity file as string.
  • Method Details

    • normalize

      public abstract String normalize(String stringToBeNormalized)
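      Normalizes the given string. Each concrete linker defines the normalization that fits its background knowledge source.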
      Parameters:
      stringToBeNormalized - The String that shall be normalized.
      Returns:
      Normalized version of the String.
    • linkToSingleConcept

      public String linkToSingleConcept(String labelToBeLinked)
      Description copied from interface: LabelToConceptLinker
      Queries for a concept and returns a link that represents an entity in the background knowledge source such as the SemanticWordRelationDictionary. Note that the link may not always be something intuitive such as a URI but may also be an artificial identifier that is understood by the corresponding background knowledge source.
      Specified by:
      linkToSingleConcept in interface LabelToConceptLinker
      Parameters:
      labelToBeLinked - The label which shall be linked to a single concept.
      Returns:
      Concept or null if no link could be found.
    • linkLabelToTokensLeftToRight

      private Set<String> linkLabelToTokensLeftToRight(String labelToBeLinked)
      Splits labelToBeLinked into n-grams of unbounded size and tries to link the individual components. This corresponds to a MAXGRAM_LEFT_TO_RIGHT_TOKENIZER or NGRAM_LEFT_TO_RIGHT_TOKENIZER OneToManyLinkingStrategy.
      Parameters:
      labelToBeLinked - Label that shall be linked.
      Returns:
      A set of concept URIs that were found.
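      The left-to-right strategy can be illustrated with a greedy longest-match loop. This is a sketch under assumptions, not the library's code: tokens are split on whitespace, lowercased, and joined with "_" before lookup, and null is returned when any token stays unlinked. MaxGramLinkerSketch and linkLeftToRight are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a max-gram, left-to-right linking strategy.
class MaxGramLinkerSketch {

    /**
     * At each position, tries the longest remaining token span first and
     * shrinks it from the right until the lookup succeeds. Returns null if
     * some token cannot be covered by any concept.
     */
    static Set<String> linkLeftToRight(String label, Map<String, String> lookup) {
        String[] tokens = label.trim().toLowerCase().split("\\s+");
        Set<String> result = new HashSet<>();
        int start = 0;
        while (start < tokens.length) {
            int end = tokens.length;
            String concept = null;
            while (end > start) {
                // Assumption: multi-token keys are joined with underscores.
                String candidate = String.join("_", Arrays.copyOfRange(tokens, start, end));
                concept = lookup.get(candidate);
                if (concept != null) {
                    break;
                }
                end--;
            }
            if (concept == null) {
                return null; // the token at 'start' could not be linked
            }
            result.add(concept);
            start = end; // continue after the matched span
        }
        return result;
    }
}
```

      With a lookup containing "hot_dog", "hot", and "stand", the label "Hot dog stand" links to the concepts for "hot_dog" and "stand", because the longer span is preferred over "hot" alone.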
    • linkToPotentiallyMultipleConcepts

      public Set<String> linkToPotentiallyMultipleConcepts(String labelToBeLinked)
      Description copied from interface: LabelToConceptLinker
      This method tries to link labelToBeLinked to one concept if possible. If it fails, it will try to link it to multiple concepts.
      Specified by:
      linkToPotentiallyMultipleConcepts in interface LabelToConceptLinker
      Parameters:
      labelToBeLinked - The label which shall be linked.
      Returns:
      One or multiple linked concepts in a set. Null if it could not fully link the label.
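      The contract described above (single concept first, multiple concepts as a fallback, null on failure) can be sketched as follows. The two linkers are passed in as functions so that the example stays self-contained; FallbackLinkingSketch and its parameter names are hypothetical, and returning null for blank input is an assumption.

```java
import java.util.Collections;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch of the "single first, then multiple" linking contract.
class FallbackLinkingSketch {

    static Set<String> linkToPotentiallyMultiple(String label,
                                                 Function<String, String> singleLinker,
                                                 Function<String, Set<String>> multiLinker) {
        if (label == null || label.trim().isEmpty()) {
            return null; // assumption: blank labels cannot be linked
        }
        // First attempt: link the whole label to exactly one concept.
        String single = singleLinker.apply(label);
        if (single != null) {
            return Collections.singleton(single);
        }
        // Fallback: link to several concepts; null means not fully linked.
        return multiLinker.apply(label);
    }
}
```

      Callers should check the result for null before iterating, since null signals that the label could not be linked completely.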
    • readFileIntoMap

      @NotNull private Map<String,String> readFileIntoMap(File file)
      Reads the concepts/entities from the given file into a Map.
      Parameters:
      file - The file must be UTF-8 encoded.
      Returns:
      The contents of the file as Map.
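      Reading a UTF-8 entity file with one entity per line might look like the sketch below. It is an illustration only: the choice of the normalized line as key and the original line as value is an assumption, as is skipping blank lines. EntityFileSketch, readIntoMap, and the normalizer parameter are hypothetical.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch of loading an entity file into a lookup map.
class EntityFileSketch {

    /** Reads one entity per line; key = normalized line, value = original line. */
    static Map<String, String> readIntoMap(Reader source, UnaryOperator<String> normalizer) {
        Map<String, String> result = new HashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // assumption: blank lines carry no entity
                }
                result.put(normalizer.apply(line), line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    /** Convenience overload that opens the file as UTF-8. */
    static Map<String, String> readFileIntoMap(File file, UnaryOperator<String> normalizer) {
        try {
            return readIntoMap(Files.newBufferedReader(file.toPath(), StandardCharsets.UTF_8), normalizer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

      Keying the map by the normalized form lets a lookup succeed for any label that normalizes to the same string as a stored entity.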
    • getStringModificationSequence

      public LinkedList<StringModifier> getStringModificationSequence()
    • setStringModificationSequence

      public void setStringModificationSequence(LinkedList<StringModifier> stringModificationSequence)
    • getLookupMap

      public Map<String,String> getLookupMap()