java.lang.Object

eu.sealsproject.platform.res.tool.impl.AbstractPlugin

de.uni_mannheim.informatik.dws.melt.matching_ml.python.nlptransformers.LLMChooseGivenEntityFilter

All Implemented Interfaces: Filter, IMatcher<org.apache.jena.ontology.OntModel,Alignment,Properties>, eu.sealsproject.platform.res.domain.omt.IOntologyMatchingToolBridge, eu.sealsproject.platform.res.tool.api.IPlugin, eu.sealsproject.platform.res.tool.api.IToolBridge

public class LLMChooseGivenEntityFilter extends LLMBase implements Filter

Given a source entity, this filter asks the LLM which target entity is the best match among the candidates that the input alignment contains for that entity. The same is then done in the reverse direction, from each target entity to its source candidates.
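The two passes described above can be pictured with a minimal, library-free sketch. It is not the class's implementation: the `chooser` function is a hypothetical stand-in for the LLM call, entities are plain strings, and the names `BidirectionalChoiceSketch`, `pair`, and `chooseBest` are invented for illustration. How the filter combines the results of the two directions is not stated here, so the sketch simply returns each direction's choices separately.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiFunction;

public class BidirectionalChoiceSketch {

    /** Creates a (source, target) pair; a stand-in for a correspondence. */
    public static Map.Entry<String, String> pair(String source, String target) {
        return new AbstractMap.SimpleImmutableEntry<>(source, target);
    }

    /**
     * Groups the pairs by one side (source or target) and lets the chooser
     * pick one candidate index per group. The chooser is a hypothetical
     * stand-in for the LLM.
     */
    public static Set<Map.Entry<String, String>> chooseBest(
            List<Map.Entry<String, String>> alignment,
            boolean bySource,
            BiFunction<String, List<String>, Integer> chooser) {
        // given entity -> all candidates the alignment lists for it
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (Map.Entry<String, String> p : alignment) {
            String given = bySource ? p.getKey() : p.getValue();
            String candidate = bySource ? p.getValue() : p.getKey();
            groups.computeIfAbsent(given, k -> new ArrayList<>()).add(candidate);
        }
        Set<Map.Entry<String, String>> chosen = new LinkedHashSet<>();
        for (Map.Entry<String, List<String>> group : groups.entrySet()) {
            int index = chooser.apply(group.getKey(), group.getValue());
            String best = group.getValue().get(index);
            // Always store the pair as (source, target), whatever the direction.
            chosen.add(bySource ? pair(group.getKey(), best) : pair(best, group.getKey()));
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> alignment = Arrays.asList(
                pair("src:Author", "tgt:Writer"),
                pair("src:Author", "tgt:Book"),
                pair("src:Paper", "tgt:Book"));
        // Dummy chooser that always selects the first candidate.
        BiFunction<String, List<String>, Integer> first = (given, candidates) -> 0;
        System.out.println("source -> target: " + chooseBest(alignment, true, first));
        System.out.println("target -> source: " + chooseBest(alignment, false, first));
    }
}
```

In the real filter the choice comes from the model's answer to the prompt rather than from a fixed index.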

Field Summary

Fields

| Modifier and Type | Field |
| --- | --- |
| static final String | BINARY_ADDITIONAL_CONFIDENCE_KEY |
| private static final org.slf4j.Logger | LOGGER |
| private static final String | NEWLINE |
| protected boolean | useNumbers |

Fields inherited from class de.uni_mannheim.informatik.dws.melt.matching_ml.python.nlptransformers.LLMBase
debugFile, loadingArguments, promt, wordForcer, wordStopper

Fields inherited from class de.uni_mannheim.informatik.dws.melt.matching_ml.python.nlptransformers.TransformersBase
cudaVisibleDevices, extractor, modelName, multipleTextsToMultipleExamples, multiProcessing, trainingArguments, transformersCache, usingTensorflow

Fields inherited from class de.uni_mannheim.informatik.dws.melt.matching_base.MatcherFile
FILE_PREFIX, FILE_SUFFIX

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| LLMChooseGivenEntityFilter(TextExtractorMap extractor, String modelName, String promt) | Constructor with all required parameters and default values for optional parameters (can be changed by setters). |
| LLMChooseGivenEntityFilter(TextExtractor extractor, String modelName, String promt) | Constructor with all required parameters and default values for optional parameters (can be changed by setters). |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| private List<Set<String>> | generateWordsToDetect(int maxNumberOfCandidates) | |
| private String | getEnumerationLabel(int i) | |
| private List<Correspondence> | getList(Iterable<Correspondence> a) | |
| protected String | getOneTextualRepresentation(org.apache.jena.rdf.model.Resource r, Map<org.apache.jena.rdf.model.Resource,String> cache) | |
| Alignment | match(org.apache.jena.ontology.OntModel source, org.apache.jena.ontology.OntModel target, Alignment inputAlignment, Properties properties) | Aligns two ontologies specified via a Jena OntModel, with an input alignment as Alignment object, and returns the mapping of the resulting alignment. |

Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_ml.python.nlptransformers.LLMBase
addGenerationArgument, addLoadingArgument, addLoadingArguments, addTrainingArgument, getDebugFile, getGenerationArguments, getLoadingArguments, getPromt, includeMoreVariations, includeMoreVariations, isWordForcer, isWordStopper, predictConfidences, setDebugFile, setGenerationArguments, setLoadingArguments, setPromt, setTrainingArguments, setWordForcer, setWordStopper

Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_ml.python.nlptransformers.TransformersBase
getCudaVisibleDevices, getCudaVisibleDevicesButOnlyOneGPU, getExamplesForBatchSizeOptimization, getExtractor, getExtractorMap, getModelName, getMultiProcessing, getTextualRepresentation, getTrainingArguments, getTransformersCache, isMultipleTextsToMultipleExamples, isOptimizeForMixedPrecisionTraining, isUsingTensorflow, setCudaVisibleDevices, setCudaVisibleDevices, setExtractor, setExtractorMap, setModelName, setMultipleTextsToMultipleExamples, setMultiProcessing, setOptimizeForMixedPrecisionTraining, setTransformersCache, setUsingTensorflow, writeExamplesToFile

Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_jena.MatcherYAAAJena
getModelSpec, match, readOntology

Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_jena.MatcherYAAA
match

Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_base.MatcherFile
match

Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_base.MatcherURL
align, align, canExecute, getType

Methods inherited from class eu.sealsproject.platform.res.tool.impl.AbstractPlugin
getId, getVersion, setId, setVersion

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface eu.sealsproject.platform.res.tool.api.IPlugin
getId, getVersion

Field Details
- NEWLINE
  
  private static final String NEWLINE
- LOGGER
  
  private static final org.slf4j.Logger LOGGER
- useNumbers
  
  protected boolean useNumbers
- BINARY_ADDITIONAL_CONFIDENCE_KEY
  
  public static final String BINARY_ADDITIONAL_CONFIDENCE_KEY
Constructor Details
- LLMChooseGivenEntityFilter
  
  public LLMChooseGivenEntityFilter(TextExtractorMap extractor, String modelName, String promt)
  
  Constructor with all required parameters and default values for optional parameters (can be changed by setters). It uses the system's default temp directory to store the files with texts generated from the knowledge graphs. PyTorch is used instead of TensorFlow, and all visible GPUs are used for prediction.
  
  Parameters:
  
  extractor - the extractor that selects which text is used for each resource.
  
  modelName - the model name, which can be a model id (a model hosted on huggingface.co) or a path to a directory containing a model and tokenizer (see the first parameter, pretrained_model_name_or_path, of the from_pretrained function in the Hugging Face library). A path should be absolute; it can be generated with, for example, FileUtil.getCanonicalPathIfPossible(java.io.File).
  
  promt - the prompt to use for the LLM. Use the text {candidates} to insert the textual representation of all candidates with alphabetical enumeration.
- LLMChooseGivenEntityFilter
  
  public LLMChooseGivenEntityFilter(TextExtractor extractor, String modelName, String promt)
  
  Constructor with all required parameters and default values for optional parameters (can be changed by setters). It uses the system's default temp directory to store the files with texts generated from the knowledge graphs. PyTorch is used instead of TensorFlow, and all visible GPUs are used for prediction.
  
  Parameters:
  
  extractor - the extractor that selects which text is used for each resource.
  
  modelName - the model name, which can be a model id (a model hosted on huggingface.co) or a path to a directory containing a model and tokenizer (see the first parameter, pretrained_model_name_or_path, of the from_pretrained function in the Hugging Face library). A path should be absolute; it can be generated with, for example, FileUtil.getCanonicalPathIfPossible(java.io.File).
  
  promt - the prompt to use for the LLM. Use {left} and {right} to insert the text representation of the left and right concept.
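
The following library-free sketch illustrates two of the string arguments described above: an absolute model path and a prompt containing the {candidates} placeholder. The exact text the filter substitutes for {candidates} is not documented here, so the "a) ..." line format, the numeric variant (suggested only by the useNumbers field), and the helper names `enumerationLabel`, `fillPrompt`, and `absoluteModelPath` are assumptions made for illustration. The path helper uses plain java.io.File and is not the FileUtil method itself.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class ConstructorArgumentsSketch {

    /**
     * Label for the zero-based index i: "a", "b", ... or "1", "2", ...
     * Assumed format; the letter variant only covers the first 26 candidates.
     */
    public static String enumerationLabel(int i, boolean useNumbers) {
        return useNumbers ? Integer.toString(i + 1) : String.valueOf((char) ('a' + i));
    }

    /** Replaces {candidates} with one enumerated line per candidate. */
    public static String fillPrompt(String prompt, List<String> candidates, boolean useNumbers) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < candidates.size(); i++) {
            sb.append(enumerationLabel(i, useNumbers))
              .append(") ")
              .append(candidates.get(i))
              .append("\n");
        }
        return prompt.replace("{candidates}", sb.toString());
    }

    /** Canonical path if it can be resolved, otherwise the absolute path. */
    public static String absoluteModelPath(File modelDir) {
        try {
            return modelDir.getCanonicalPath();
        } catch (IOException e) {
            return modelDir.getAbsolutePath();
        }
    }

    public static void main(String[] args) {
        String prompt = "Which of the following is the best match?\n{candidates}Answer:";
        System.out.println(fillPrompt(prompt, Arrays.asList("Writer", "Book"), false));
        // "models/my-model" is a hypothetical local directory.
        System.out.println(absoluteModelPath(new File("models/my-model")));
    }
}
```

The resulting path string would be passed as modelName and the unfilled prompt as promt; the filter performs the substitution itself.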
Method Details
- match
  
  public Alignment match(org.apache.jena.ontology.OntModel source, org.apache.jena.ontology.OntModel target, Alignment inputAlignment, Properties properties) throws Exception
  
  Description copied from class: MatcherYAAAJena
  
  Aligns two ontologies specified via a Jena OntModel, with an input alignment as Alignment object, and returns the mapping of the resulting alignment. Note: This method might be called multiple times in a row when using the evaluation framework. Make sure to return a mapping which is specific to the given inputs.
  
  Specified by:
  
  match in interface IMatcher<org.apache.jena.ontology.OntModel,Alignment,Properties>
  
  Specified by:
  
  match in class MatcherYAAAJena
  
  Parameters:
  
  source - This OntModel represents the source ontology.
  
  target - This OntModel represents the target ontology.
  
  inputAlignment - This mapping represents the input alignment.
  
  properties - Additional properties.
  
  Returns:
  
  The resulting alignment of the matching process.
  
  Throws:
  
  Exception - Any exception which occurs during matching.
- getEnumerationLabel
  
  private String getEnumerationLabel(int i)
- getList
  
  private List<Correspondence> getList(Iterable<Correspondence> a)
- generateWordsToDetect
  
  private List<Set<String>> generateWordsToDetect(int maxNumberOfCandidates)
- getOneTextualRepresentation
  
  protected String getOneTextualRepresentation(org.apache.jena.rdf.model.Resource r, Map<org.apache.jena.rdf.model.Resource,String> cache)
