Class LLMChooseGivenEntityFilter
java.lang.Object
eu.sealsproject.platform.res.tool.impl.AbstractPlugin
de.uni_mannheim.informatik.dws.melt.matching_base.MatcherURL
de.uni_mannheim.informatik.dws.melt.matching_base.MatcherFile
de.uni_mannheim.informatik.dws.melt.matching_jena.MatcherYAAA
de.uni_mannheim.informatik.dws.melt.matching_jena.MatcherYAAAJena
de.uni_mannheim.informatik.dws.melt.matching_ml.python.nlptransformers.TransformersBase
de.uni_mannheim.informatik.dws.melt.matching_ml.python.nlptransformers.LLMBase
de.uni_mannheim.informatik.dws.melt.matching_ml.python.nlptransformers.LLMChooseGivenEntityFilter
All Implemented Interfaces:
Filter, IMatcher<org.apache.jena.ontology.OntModel, Alignment, Properties>, eu.sealsproject.platform.res.domain.omt.IOntologyMatchingToolBridge, eu.sealsproject.platform.res.tool.api.IPlugin, eu.sealsproject.platform.res.tool.api.IToolBridge
For each source entity, this filter asks the LLM which target entity is the best match, choosing among the candidates given in the input alignment.
Afterwards, the same is done in the reverse direction (for each target entity, the best source entity is chosen).
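The two-direction selection described above can be illustrated with a minimal, self-contained sketch. It does not use the MELT classes: `Pair`, `chooseOneDirection`, `filter`, and `toyChooser` are hypothetical names, and the chooser is a simple stand-in for the LLM call. How the real filter combines both directions is not stated here, so the sketch assumes that a pair is kept only if it is chosen in both directions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.BiFunction;

class ChooseFilterSketch {

    /** Hypothetical stand-in for one correspondence of the input alignment. */
    record Pair(String source, String target) {}

    /**
     * Groups the candidates per entity and lets the chooser pick one of them.
     * The chooser stands in for the LLM: it receives the entity and its
     * candidates and returns the index of the best candidate.
     */
    static Set<Pair> chooseOneDirection(List<Pair> alignment, boolean sourceToTarget,
            BiFunction<String, List<String>, Integer> chooser) {
        Map<String, List<String>> candidates = new LinkedHashMap<>();
        for (Pair p : alignment) {
            String key = sourceToTarget ? p.source() : p.target();
            String value = sourceToTarget ? p.target() : p.source();
            candidates.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        }
        Set<Pair> chosen = new LinkedHashSet<>();
        for (Map.Entry<String, List<String>> e : candidates.entrySet()) {
            String best = e.getValue().get(chooser.apply(e.getKey(), e.getValue()));
            chosen.add(sourceToTarget ? new Pair(e.getKey(), best) : new Pair(best, e.getKey()));
        }
        return chosen;
    }

    /** Assumption of this sketch: keep a pair only if both directions chose it. */
    static Set<Pair> filter(List<Pair> alignment,
            BiFunction<String, List<String>, Integer> chooser) {
        Set<Pair> kept = chooseOneDirection(alignment, true, chooser);
        kept.retainAll(chooseOneDirection(alignment, false, chooser));
        return kept;
    }

    /** Toy chooser: prefers a case-insensitive exact match, else the first candidate. */
    static int toyChooser(String entity, List<String> candidates) {
        for (int i = 0; i < candidates.size(); i++) {
            if (candidates.get(i).equalsIgnoreCase(entity)) {
                return i;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        List<Pair> alignment = List.of(
                new Pair("Person", "person"),
                new Pair("Person", "People"),
                new Pair("Human", "person"));
        System.out.println(filter(alignment, ChooseFilterSketch::toyChooser));
    }
}
```

In the real class, the choice is made by the language model and the candidates are the textual representations produced by the configured extractor.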
-
Field Summary
Modifier and Type  Field  Description
static final String  BINARY_ADDITIONAL_CONFIDENCE_KEY
private static final org.slf4j.Logger  LOGGER
private static final String  NEWLINE
protected boolean  useNumbers
Fields inherited from class de.uni_mannheim.informatik.dws.melt.matching_ml.python.nlptransformers.LLMBase
debugFile, loadingArguments, promt, wordForcer, wordStopper
Fields inherited from class de.uni_mannheim.informatik.dws.melt.matching_ml.python.nlptransformers.TransformersBase
cudaVisibleDevices, extractor, modelName, multipleTextsToMultipleExamples, multiProcessing, trainingArguments, transformersCache, usingTensorflow
Fields inherited from class de.uni_mannheim.informatik.dws.melt.matching_base.MatcherFile
FILE_PREFIX, FILE_SUFFIX
-
Constructor Summary
Constructor  Description
LLMChooseGivenEntityFilter(TextExtractorMap extractor, String modelName, String promt)
Constructor with all required parameters and default values for optional parameters (can be changed by setters).
LLMChooseGivenEntityFilter(TextExtractor extractor, String modelName, String promt)
Constructor with all required parameters and default values for optional parameters (can be changed by setters).
-
Method Summary
Modifier and Type  Method  Description
generateWordsToDetect(int maxNumberOfCandidates)
private String  getEnumerationLabel(int i)
private List<Correspondence>  getList
protected String  getOneTextualRepresentation(org.apache.jena.rdf.model.Resource r, Map<org.apache.jena.rdf.model.Resource, String> cache)
Alignment  match(org.apache.jena.ontology.OntModel source, org.apache.jena.ontology.OntModel target, Alignment inputAlignment, Properties properties)
Aligns two ontologies specified via a Jena OntModel, with an input alignment as Alignment object, and returns the mapping of the resulting alignment.
Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_ml.python.nlptransformers.LLMBase
addGenerationArgument, addLoadingArgument, addLoadingArguments, addTrainingArgument, getDebugFile, getGenerationArguments, getLoadingArguments, getPromt, includeMoreVariations, includeMoreVariations, isWordForcer, isWordStopper, predictConfidences, setDebugFile, setGenerationArguments, setLoadingArguments, setPromt, setTrainingArguments, setWordForcer, setWordStopper
Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_ml.python.nlptransformers.TransformersBase
getCudaVisibleDevices, getCudaVisibleDevicesButOnlyOneGPU, getExamplesForBatchSizeOptimization, getExtractor, getExtractorMap, getModelName, getMultiProcessing, getTextualRepresentation, getTrainingArguments, getTransformersCache, isMultipleTextsToMultipleExamples, isOptimizeForMixedPrecisionTraining, isUsingTensorflow, setCudaVisibleDevices, setCudaVisibleDevices, setExtractor, setExtractorMap, setModelName, setMultipleTextsToMultipleExamples, setMultiProcessing, setOptimizeForMixedPrecisionTraining, setTransformersCache, setUsingTensorflow, writeExamplesToFile
Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_jena.MatcherYAAAJena
getModelSpec, match, readOntology
Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_jena.MatcherYAAA
match
Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_base.MatcherFile
match
Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_base.MatcherURL
align, align, canExecute, getType
Methods inherited from class eu.sealsproject.platform.res.tool.impl.AbstractPlugin
getId, getVersion, setId, setVersion
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface eu.sealsproject.platform.res.tool.api.IPlugin
getId, getVersion
-
Field Details
-
NEWLINE
-
LOGGER
private static final org.slf4j.Logger LOGGER -
useNumbers
protected boolean useNumbers -
BINARY_ADDITIONAL_CONFIDENCE_KEY
-
-
Constructor Details
-
LLMChooseGivenEntityFilter
Constructor with all required parameters and default values for optional parameters (which can be changed via setters). It uses the system's default temporary directory to store the files with texts generated from the knowledge graphs. PyTorch is used instead of TensorFlow, and all visible GPUs are used for prediction.
Parameters:
extractor - the extractor that selects which text should be used for each resource.
modelName - the model name, which can be a model id (a model hosted on huggingface.co) or a path to a directory containing a model and tokenizer (see the first parameter, pretrained_model_name_or_path, of the from_pretrained function in the Hugging Face library). A path should be absolute; it can be generated with, e.g., FileUtil.getCanonicalPathIfPossible(java.io.File).
promt - the prompt to use for the LLM. Use the text {candidates} to insert the text representation of all candidates with alphabetical enumerations.
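The {candidates} placeholder can be pictured with the following minimal sketch. It is not the library's implementation: `enumerationLabel` and `fillPrompt` are hypothetical helpers, and the exact label format ("a) ", "b) ", one candidate per line) is an assumption made for illustration only.

```java
import java.util.List;

class CandidatePromptSketch {

    /** Hypothetical label scheme: letters "a", "b", ... or numbers "1", "2", ... */
    static String enumerationLabel(int i, boolean useNumbers) {
        if (useNumbers) {
            return Integer.toString(i + 1);
        }
        if (i < 0 || i >= 26) {
            throw new IllegalArgumentException("this sketch supports only 26 letter labels");
        }
        return String.valueOf((char) ('a' + i));
    }

    /** Replaces {candidates} in the template with an enumerated candidate list. */
    static String fillPrompt(String template, List<String> candidateTexts, boolean useNumbers) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < candidateTexts.size(); i++) {
            if (i > 0) {
                sb.append('\n');
            }
            // Assumed layout: "<label>) <text>", one candidate per line.
            sb.append(enumerationLabel(i, useNumbers)).append(") ").append(candidateTexts.get(i));
        }
        return template.replace("{candidates}", sb.toString());
    }

    public static void main(String[] args) {
        String template = "Which option fits best?\n{candidates}\nAnswer:";
        System.out.println(fillPrompt(template, List.of("heart", "lung"), false));
    }
}
```

Enumerating the candidates with short labels lets the model answer with a single label, which is easier to detect in the generated text than a full entity description.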
-
LLMChooseGivenEntityFilter
Constructor with all required parameters and default values for optional parameters (which can be changed via setters). It uses the system's default temporary directory to store the files with texts generated from the knowledge graphs. PyTorch is used instead of TensorFlow, and all visible GPUs are used for prediction.
Parameters:
extractor - the extractor that selects which text should be used for each resource.
modelName - the model name, which can be a model id (a model hosted on huggingface.co) or a path to a directory containing a model and tokenizer (see the first parameter, pretrained_model_name_or_path, of the from_pretrained function in the Hugging Face library). A path should be absolute; it can be generated with, e.g., FileUtil.getCanonicalPathIfPossible(java.io.File).
promt - the prompt to use for the LLM. Use {left} and {right} to insert the text representation of the left and right concept.
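Since a local model directory should be passed as an absolute path, the following sketch shows one way to obtain such a path with the standard library only. `canonicalPathIfPossible` is a hypothetical helper; that it behaves like FileUtil.getCanonicalPathIfPossible(java.io.File) is an assumption based on the method name, not something confirmed here.

```java
import java.io.File;
import java.io.IOException;

class ModelPathSketch {

    /**
     * Returns the canonical path of the file, or the absolute path if the
     * canonical path cannot be resolved. Both results are absolute paths.
     */
    static String canonicalPathIfPossible(File file) {
        try {
            return file.getCanonicalPath();
        } catch (IOException e) {
            return file.getAbsolutePath();
        }
    }

    public static void main(String[] args) {
        // "models/my-model" is a placeholder directory name for illustration.
        String modelName = canonicalPathIfPossible(new File("models/my-model"));
        System.out.println(modelName);
    }
}
```

The resulting string can then be passed as the modelName argument in place of a model id.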
-
-
Method Details
-
match
public Alignment match(org.apache.jena.ontology.OntModel source, org.apache.jena.ontology.OntModel target, Alignment inputAlignment, Properties properties) throws Exception
Description copied from class: MatcherYAAAJena
Aligns two ontologies specified via a Jena OntModel, with an input alignment as Alignment object, and returns the mapping of the resulting alignment. Note: This method might be called multiple times in a row when using the evaluation framework. Make sure to return a mapping which is specific to the given inputs.
Specified by:
match in interface IMatcher<org.apache.jena.ontology.OntModel, Alignment, Properties>
Specified by:
match in class MatcherYAAAJena
Parameters:
source - This OntModel represents the source ontology.
target - This OntModel represents the target ontology.
inputAlignment - This mapping represents the input alignment.
properties - Additional properties.
Returns:
The resulting alignment of the matching process.
Throws:
Exception - Any exception which occurs during matching.
-
getEnumerationLabel
-
getList
-
generateWordsToDetect
-
getOneTextualRepresentation
-