Class MachineLearningScikitFilter
java.lang.Object
eu.sealsproject.platform.res.tool.impl.AbstractPlugin
de.uni_mannheim.informatik.dws.melt.matching_base.MatcherURL
de.uni_mannheim.informatik.dws.melt.matching_base.MatcherFile
de.uni_mannheim.informatik.dws.melt.matching_jena.MatcherYAAA
de.uni_mannheim.informatik.dws.melt.matching_jena.MatcherYAAAJena
de.uni_mannheim.informatik.dws.melt.matching_ml.python.MachineLearningScikitFilter
- All Implemented Interfaces:
Filter, IMatcher<org.apache.jena.ontology.OntModel, Alignment, Properties>, eu.sealsproject.platform.res.domain.omt.IOntologyMatchingToolBridge, eu.sealsproject.platform.res.tool.api.IPlugin, eu.sealsproject.platform.res.tool.api.IToolBridge
This filter learns and applies a classifier given a training sample and an existing alignment.
-
Field Summary
Fields
confidenceNames - Which additional confidences should be used to train the classifier.
private int crossValidationNumber - Number of cross-validation folds to execute.
private static final org.slf4j.Logger LOGGER - Default logger.
private int numberOfParallelJobs - Number of jobs to execute in parallel.
private MatcherYAAAJena trainingGenerator - Generator for training data.
Fields inherited from class de.uni_mannheim.informatik.dws.melt.matching_base.MatcherFile
FILE_PREFIX, FILE_SUFFIX
-
Constructor Summary
Constructors
MachineLearningScikitFilter()
MachineLearningScikitFilter(Alignment trainingAlignment)
MachineLearningScikitFilter(Alignment trainingAlignment, int crossValidationNumber, int numberOfParallelJobs)
MachineLearningScikitFilter(MatcherYAAAJena trainingGenerator)
MachineLearningScikitFilter(MatcherYAAAJena trainingGenerator, int crossValidationNumber, int numberOfParallelJobs)
MachineLearningScikitFilter(MatcherYAAAJena trainingGenerator, List<String> confidenceNames)
MachineLearningScikitFilter(MatcherYAAAJena trainingGenerator, List<String> confidenceNames, int crossValidationNumber, int numberOfParallelJobs) - Constructor
-
Method Summary
Methods
static Alignment applyStoredMLModel(File modelFile, Alignment predictAlignment, List<String> confidenceNames) - Loads a machine learning model from a file (trained with trainAndStoreMLModel) and applies it to the alignment, which is then filtered.
private static Alignment filterAlignment(Alignment fullAlignment, List<Correspondence> orderedAlignment, List<Integer> predictions)
Alignment match(org.apache.jena.ontology.OntModel source, org.apache.jena.ontology.OntModel target, Alignment inputAlignment, Properties properties) - Aligns two ontologies specified via a Jena OntModel, with an input alignment as Alignment object, and returns the mapping of the resulting alignment.
static Alignment trainAndApplyMLModel(Alignment trainAlignment, Alignment predictAlignment, List<String> confidenceNames, int crossValidationNumber, int numberOfParallelJobs) - Trains a machine learning model in Python and applies it to predictAlignment to filter it.
static List<String> trainAndStoreMLModel(Alignment alignment, File modelFile, List<String> confidenceNames, int crossValidationNumber, int numberOfParallelJobs) - Trains a machine learning model in Python based on the given alignment and then stores the best model in a file.
private static void writeDataset(List<Correspondence> alignment, File file, boolean includeTarget, List<String> confidenceNames) - Writes the given alignment to a file.
Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_jena.MatcherYAAAJena
getModelSpec, match, readOntology
Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_jena.MatcherYAAA
match
Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_base.MatcherFile
match
Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_base.MatcherURL
align, align, canExecute, getType
Methods inherited from class eu.sealsproject.platform.res.tool.impl.AbstractPlugin
getId, getVersion, setId, setVersion
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface eu.sealsproject.platform.res.tool.api.IPlugin
getId, getVersion
-
Field Details
-
LOGGER
private static final org.slf4j.Logger LOGGER
Default logger.
-
trainingGenerator
Generator for training data. Correspondences with an equivalence relation form the positive class; correspondences with any other relation form the negative class.
-
confidenceNames
Which additional confidences should be used to train the classifier.
-
crossValidationNumber
private int crossValidationNumber
Number of cross-validation folds to execute.
-
numberOfParallelJobs
private int numberOfParallelJobs
Number of jobs to execute in parallel.
-
-
Constructor Details
-
MachineLearningScikitFilter
public MachineLearningScikitFilter()
-
MachineLearningScikitFilter
-
MachineLearningScikitFilter
public MachineLearningScikitFilter(Alignment trainingAlignment, int crossValidationNumber, int numberOfParallelJobs)
-
MachineLearningScikitFilter
-
MachineLearningScikitFilter
-
MachineLearningScikitFilter
public MachineLearningScikitFilter(MatcherYAAAJena trainingGenerator, int crossValidationNumber, int numberOfParallelJobs)
-
MachineLearningScikitFilter
public MachineLearningScikitFilter(MatcherYAAAJena trainingGenerator, List<String> confidenceNames, int crossValidationNumber, int numberOfParallelJobs)
Constructor
- Parameters:
trainingGenerator - generator for training data.
confidenceNames - confidence names to use.
crossValidationNumber - number of cross-validation folds to execute.
numberOfParallelJobs - number of jobs to execute in parallel.
-
-
Method Details
-
match
public Alignment match(org.apache.jena.ontology.OntModel source, org.apache.jena.ontology.OntModel target, Alignment inputAlignment, Properties properties) throws Exception
Description copied from class: MatcherYAAAJena
Aligns two ontologies specified via a Jena OntModel, with an input alignment as Alignment object, and returns the mapping of the resulting alignment. Note: This method might be called multiple times in a row when using the evaluation framework. Make sure to return a mapping which is specific to the given inputs.
- Specified by:
match in interface IMatcher<org.apache.jena.ontology.OntModel, Alignment, Properties>
- Specified by:
match in class MatcherYAAAJena
- Parameters:
source - This OntModel represents the source ontology.
target - This OntModel represents the target ontology.
inputAlignment - This mapping represents the input alignment.
properties - Additional properties.
- Returns:
The resulting alignment of the matching process.
- Throws:
Exception - Any exception which occurs during matching.
-
trainAndApplyMLModel
public static Alignment trainAndApplyMLModel(Alignment trainAlignment, Alignment predictAlignment, List<String> confidenceNames, int crossValidationNumber, int numberOfParallelJobs)
Trains a machine learning model in Python and applies it to predictAlignment to filter it.
- Parameters:
trainAlignment - Correspondences with an EQUIVALENCE relation are treated as positives. All other relations are treated as negatives.
predictAlignment - the alignment to filter.
confidenceNames - the confidence names of the alignment to use (leave empty to use all additional confidences from trainAlignment).
crossValidationNumber - the number of folds to use for cross-validation.
numberOfParallelJobs - number of parallel jobs.
- Returns:
the filtered alignment
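To make the crossValidationNumber parameter concrete, the following is a minimal sketch of how a sample count divides into folds under a plain, unstratified split. The actual splitting happens on the Python side and is not specified on this page; the class FoldSizes and its method are hypothetical illustrations, not part of MELT.

```java
// Hypothetical illustration, not part of MELT: sizes of the folds when
// sampleCount training examples are split into a given number of folds.
class FoldSizes {
    static int[] foldSizes(int sampleCount, int folds) {
        if (folds < 2 || folds > sampleCount) {
            throw new IllegalArgumentException("folds must be between 2 and sampleCount");
        }
        int[] sizes = new int[folds];
        for (int i = 0; i < folds; i++) {
            // The first (sampleCount % folds) folds receive one extra example.
            sizes[i] = sampleCount / folds + (i < sampleCount % folds ? 1 : 0);
        }
        return sizes;
    }
}
```

Each fold serves once as the validation set while the remaining folds are used for training, so a larger crossValidationNumber means more training runs per candidate model.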
-
trainAndStoreMLModel
public static List<String> trainAndStoreMLModel(Alignment alignment, File modelFile, List<String> confidenceNames, int crossValidationNumber, int numberOfParallelJobs)
Trains a machine learning model in Python based on the given alignment and then stores the best model in a file.
- Parameters:
alignment - Correspondences with an EQUIVALENCE relation are treated as positives. All other relations are treated as negatives.
modelFile - the file in which to store the best model.
confidenceNames - the confidence names of the alignment to use (leave empty to use all additional confidences from alignment).
crossValidationNumber - the number of folds to use for cross-validation.
numberOfParallelJobs - number of parallel jobs.
- Returns:
the confidence names that were used (can be passed directly as confidenceNames to applyStoredMLModel)
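The "leave empty to use all" rule and the returned name list can be sketched as follows. This is an assumption about the behaviour, not the MELT implementation: ConfidenceNameResolver is a hypothetical helper, each correspondence is modelled as a plain map from confidence name to value, and the sorted order is chosen here only to make the result reproducible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper, not part of MELT: decides which confidence names to train on.
class ConfidenceNameResolver {
    static List<String> resolve(List<String> requested,
                                List<Map<String, Double>> additionalConfidences) {
        if (requested != null && !requested.isEmpty()) {
            return new ArrayList<>(requested); // keep the caller's order
        }
        // Nothing requested: collect every name that occurs in the training data.
        TreeSet<String> all = new TreeSet<>(); // sorted order is an assumption
        for (Map<String, Double> confidences : additionalConfidences) {
            all.addAll(confidences.keySet());
        }
        return new ArrayList<>(all);
    }
}
```

Returning the resolved list is what lets a caller reuse exactly the same names, in the same order, when the stored model is applied later.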
-
applyStoredMLModel
public static Alignment applyStoredMLModel(File modelFile, Alignment predictAlignment, List<String> confidenceNames)
Load a machine learning model from a file (trained/generated with trainAndStoreMLModel) and apply it to the alignment which is then filtered.
- Parameters:
modelFile - the file to load the ML model.
predictAlignment - the alignment which should be filtered.
confidenceNames - the confidence names of the alignment to use (have to be the same as in training - order has to be the same).
- Returns:
the filtered alignment.
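The requirement that confidenceNames match the training order follows from how a feature vector is built: position i of the vector corresponds to name i of the list. A minimal sketch, assuming confidences are available as a name-to-value map; FeatureVectorBuilder and the 0.0 default for missing values are hypothetical and not taken from MELT.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of MELT: turns named confidences into an ordered vector.
class FeatureVectorBuilder {
    static double[] toFeatures(Map<String, Double> confidences, List<String> confidenceNames) {
        double[] features = new double[confidenceNames.size()];
        for (int i = 0; i < features.length; i++) {
            // Column i always belongs to confidenceNames.get(i).
            // Missing values default to 0.0 here; that default is an assumption.
            features[i] = confidences.getOrDefault(confidenceNames.get(i), 0.0);
        }
        return features;
    }
}
```

If the list passed at prediction time were reordered, the same values would land in different columns and the stored model would read them as different features.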
-
filterAlignment
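The Method Summary lists filterAlignment(Alignment fullAlignment, List<Correspondence> orderedAlignment, List<Integer> predictions) without a description. One plausible reading, given the 0/1 labels used elsewhere on this page, is that correspondence i is kept only when prediction i is 1. The sketch below illustrates that reading with a generic list; PredictionFilter is a hypothetical stand-in and the behaviour is an assumption, not the documented implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not part of MELT: keeps the items predicted as positive.
class PredictionFilter {
    static <T> List<T> keepPositives(List<T> ordered, List<Integer> predictions) {
        if (ordered.size() != predictions.size()) {
            throw new IllegalArgumentException("One prediction per correspondence is required.");
        }
        List<T> kept = new ArrayList<>();
        for (int i = 0; i < ordered.size(); i++) {
            if (predictions.get(i) == 1) { // 1 = positive class (assumption)
                kept.add(ordered.get(i));
            }
        }
        return kept;
    }
}
```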
-
writeDataset
private static void writeDataset(List<Correspondence> alignment, File file, boolean includeTarget, List<String> confidenceNames) throws IOException
Writes the given alignment to a file.
- Parameters:
alignment - Dataset to write. Correspondences with an EQUIVALENCE relation are treated as positives. All other relations are treated as negatives.
file - File to write.
includeTarget - If true, the label (0 for negatives, 1 for positives) will be persisted.
confidenceNames - the confidence names of the alignment to use.
- Throws:
IOException - Exception in case of problems while writing.
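The effect of includeTarget can be sketched for a single row. This sketch assumes a comma-separated layout with the label in the last column; the real file format is not specified on this page. DatasetRowSketch is hypothetical, the relation is modelled as a plain string, and file I/O is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not part of MELT: builds one dataset row for a correspondence.
class DatasetRowSketch {
    static String toRow(Map<String, Double> confidences, String relation,
                        List<String> confidenceNames, boolean includeTarget) {
        List<String> cells = new ArrayList<>();
        for (String name : confidenceNames) {
            // Missing confidences are written as 0.0 (assumption).
            cells.add(String.valueOf(confidences.getOrDefault(name, 0.0)));
        }
        if (includeTarget) {
            // EQUIVALENCE is the positive class (1); every other relation is negative (0).
            cells.add("EQUIVALENCE".equals(relation) ? "1" : "0");
        }
        return String.join(",", cells);
    }
}
```

Training data would be written with includeTarget set to true, while the alignment to be filtered would be written without the label column because its labels are what the model predicts.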
-