Class MultiSourceDispatcherIncrementalMerge
java.lang.Object
de.uni_mannheim.informatik.dws.melt.matching_base.multisource.MatcherMultiSourceURL
de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.multisource.dispatchers.MultiSourceDispatcherIncrementalMerge
- All Implemented Interfaces:
IMatcherMultiSourceCaller
,MultiSourceDispatcher
- Direct Known Subclasses:
MultiSourceDispatcherIncrementalMergeByCluster
,MultiSourceDispatcherIncrementalMergeByOrder
public abstract class MultiSourceDispatcherIncrementalMerge
extends MatcherMultiSourceURL
implements MultiSourceDispatcher, IMatcherMultiSourceCaller
Matches multiple ontologies / knowledge graphs with an incremental merge approach.
Two ontologies are merged first; the resulting union is then merged with another ontology, and so on.
The order in which they are merged is defined by subclasses.
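The incremental merge idea can be illustrated with a minimal, self-contained sketch. It is not the MELT implementation: each ontology is reduced to a set of entity names, "matching" is plain string equality, and the merge order is simply the list order. All of these are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class IncrementalMergeSketch {

    /** Result of the toy merge: the final union and all matched names. */
    public static final class Result {
        public final Set<String> union = new LinkedHashSet<>();
        public final List<String> matches = new ArrayList<>();
    }

    /**
     * Merges the given "ontologies" one after another into a growing union.
     * Hypothetical stand-in for a real matcher: an entity counts as matched
     * if an entity with the same name is already part of the union.
     */
    public static Result mergeIncrementally(List<Set<String>> ontologies) {
        Result result = new Result();
        for (Set<String> ontology : ontologies) {
            for (String entity : ontology) {
                // Set.add returns false if the entity is already in the union
                if (!result.union.add(entity)) {
                    result.matches.add(entity);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = mergeIncrementally(List.of(
                Set.of("Person", "City"),
                Set.of("Person", "Country"),
                Set.of("City", "River")));
        System.out.println("union size: " + r.union.size());
        System.out.println("matches: " + r.matches);
    }
}
```

In the real class, the order is not fixed like this but is supplied by the subclass through the merge tree.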
-
Field Summary
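Modifier and Type    Field
private boolean addingInformationToUnion
private CopyMode copyMode
private DatasetIDExtractor idExtractor
private static final org.slf4j.Logger LOGGER
private FileCache<MergeOrder> mergeOrderFileCache
private int numberOfThreads
private boolean removeUnusedJenaModels
private File serializedTreeFile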
-
Constructor Summary
Constructor    Description
MultiSourceDispatcherIncrementalMerge(Object oneToOneMatcher)
MultiSourceDispatcherIncrementalMerge(Object oneToOneMatcher, boolean addInformationToUnion)
    Constructor which expects the actual one-to-one matcher and a boolean indicating whether information should be added to the union.
MultiSourceDispatcherIncrementalMerge(Supplier<Object> matcherSupplier)
MultiSourceDispatcherIncrementalMerge(Supplier<Object> matcherSupplier, boolean addInformationToUnion)
    Constructor which expects a supplier of one-to-one matchers and a boolean indicating whether information should be added to the union.
-
Method Summary
Modifier and Type    Method    Description
private static Properties addDistance(Properties p, double distance, double normalizedDistance)
private void callClearIndex()
getCacheFile()
    Returns the cache file which is used to store the merge tree.
getCopyMode()
    Returns the copy mode in use.
getIntermediateAlignments()
    Returns the intermediate alignments.
getMatcherSupplier()
    Returns the function which returns a new configured matcher object.
abstract MergeOrder getMergeTree(List<Set<Object>> models, Object parameters)
    Returns the merging tree (which ontologies are merged in which order).
int getNumberOfThreads()
    Returns the number of threads which are used during merge.
getSerializedTreeFile()
    Returns the file where the serialized merge tree is stored.
boolean isAddingInformationToUnion()
    Returns true if all information / triples are added to the union.
protected boolean isLeftModelGreater(Set<Object> leftOntology, Set<Object> rightOntology, Properties p)
boolean isRemoveUnusedJenaModels()
    Returns true if OntModels are removed.
boolean isSavingIntermediateAlignments()
    Returns true if intermediate alignments are stored.
match (URL-based variant specified in MatcherMultiSourceURL)
    Matches multiple ontologies/knowledge graphs together.
AlignmentAndParameters match(List<Set<Object>> models, Object inputAlignment, Object parameters)
    Matches multiple ontologies / knowledge graphs together.
boolean needsTransitiveClosureForEvaluation()
    Returns whether the matcher needs a transitive closure for evaluation.
private AlignmentAndParameters runParallel(MergeOrder mergeOrder, List<Set<Object>> models, Object inputAlignment, Properties p)
private AlignmentAndParameters runSequential(MergeOrder mergeOrder, List<Set<Object>> models, Object inputAlignment, Properties p)
void setAddingInformationToUnion(boolean addInformationToUnion)
    Sets whether all information / triples are added to the union.
void setCacheFile(File cacheFile)
    Sets the cache file which should be used to cache the merge tree.
void setCopyMode(CopyMode copyMode)
    Sets the copy mode which is used during the merging.
void setGoldStandard(Alignment alignment, DatasetIDExtractor idExtractor)
void setGoldStandard(Object alignment, DatasetIDExtractor idExtractor)
void setMatcherSupplier(Supplier<Object> matcherSupplier)
    If a matcher supplier is set, then this will be used instead of the matcher object.
void setNumberOfThreads(int numberOfThreads)
    Sets the number of threads which are used during merge.
void setNumberOfThreadsToCpuCores()
    Sets the number of threads which are used during merge to the number of available CPU cores.
void setRemoveUnusedJenaModels(boolean removeUnusedJenaModels)
    If set to true, this removes OntModel/Model instances which are not needed anymore.
void setSavingIntermediateAlignments(boolean intermediateAlignmentsNew)
    Set to true if the intermediate alignments should be stored.
void setSerializedTreeFile(File serializedTreeFile)
    Sets the file where the serialized merge tree is stored.
-
Field Details
-
LOGGER
private static final org.slf4j.Logger LOGGER -
matcherSupplier
-
numberOfThreads
private int numberOfThreads -
addingInformationToUnion
private boolean addingInformationToUnion -
removeUnusedJenaModels
private boolean removeUnusedJenaModels -
copyMode
-
intermediateAlignments
-
mergeOrderFileCache
-
serializedTreeFile
-
goldStandard
-
idExtractor
-
-
Constructor Details
-
MultiSourceDispatcherIncrementalMerge
Constructor which expects the actual one-to-one matcher and a boolean indicating whether information should be added to the union.
- Parameters:
oneToOneMatcher
- the one-to-one matcher
addInformationToUnion
- if true, all information from matched entities is contained in the union.
-
MultiSourceDispatcherIncrementalMerge
-
MultiSourceDispatcherIncrementalMerge
public MultiSourceDispatcherIncrementalMerge(Supplier<Object> matcherSupplier, boolean addInformationToUnion)
Constructor which expects a supplier of one-to-one matchers and a boolean indicating whether information should be added to the union.
- Parameters:
matcherSupplier
- a function which returns a new configured matcher every time it is called.
addInformationToUnion
- if true, all information from matched entities is contained in the union.
-
MultiSourceDispatcherIncrementalMerge
-
-
Method Details
-
match
Description copied from class: MatcherMultiSourceURL
Matches multiple ontologies/knowledge graphs together.
- Specified by:
match
in class MatcherMultiSourceURL
- Parameters:
models
- the ontologies/knowledge graphs as URLs
inputAlignment
- the input alignment as URL (alignment API format)
parameters
- the URL of the parameters file. Supported formats are currently JSON and YAML.
- Returns:
- an alignment as URL (most often a file URL); the format is again the alignment API format.
- Throws:
Exception
- in case something went wrong
-
needsTransitiveClosureForEvaluation
public boolean needsTransitiveClosureForEvaluation()
Description copied from class: MatcherMultiSourceURL
Returns whether the matcher needs a transitive closure for evaluation. For example, if a matcher only produces the correspondences A-B and B-C and the test case asks for A-C, then A-C is only found if the transitive closure is computed.
- Specified by:
needsTransitiveClosureForEvaluation
in interface IMatcherMultiSourceCaller
- Overrides:
needsTransitiveClosureForEvaluation
in class MatcherMultiSourceURL
- Returns:
- true if the transitive closure is needed, false otherwise
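To make the A-B, B-C, A-C example concrete, the following sketch computes a transitive closure over correspondences with a simple union-find structure. It is only an illustration: entities are plain strings, and the class and method names are hypothetical and not part of MELT.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TransitiveClosureSketch {

    // maps every entity to its parent; a root is its own parent
    private final Map<String, String> parent = new HashMap<>();

    /** Finds the representative of the cluster that contains the entity. */
    private String find(String entity) {
        parent.putIfAbsent(entity, entity);
        String root = entity;
        while (!parent.get(root).equals(root)) {
            root = parent.get(root);
        }
        parent.put(entity, root); // shorten the path for later lookups
        return root;
    }

    /** Records a correspondence between two entities. */
    public void addCorrespondence(String left, String right) {
        parent.put(find(left), find(right));
    }

    /** Returns all pairs "a=b" (with a before b) implied by the recorded correspondences. */
    public Set<String> closure() {
        List<String> entities = new ArrayList<>(parent.keySet());
        Set<String> pairs = new TreeSet<>();
        for (String a : entities) {
            for (String b : entities) {
                if (a.compareTo(b) < 0 && find(a).equals(find(b))) {
                    pairs.add(a + "=" + b);
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        TransitiveClosureSketch sketch = new TransitiveClosureSketch();
        sketch.addCorrespondence("A", "B");
        sketch.addCorrespondence("B", "C");
        System.out.println(sketch.closure());
    }
}
```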
-
match
public AlignmentAndParameters match(List<Set<Object>> models, Object inputAlignment, Object parameters) throws Exception
Description copied from interface: IMatcherMultiSourceCaller
Matches multiple ontologies / knowledge graphs together.
- Specified by:
match
in interface IMatcherMultiSourceCaller
- Parameters:
models
- a list of sets of objects, where each set contains different representations of the same ontology / knowledge graph.
inputAlignment
- this object represents the input alignment.
parameters
- object representing additional parameters. Only add to this object and do not assign a new object to it, as in parameters = new ...(), because otherwise the parameters are lost (Java is call by value). Sensible classes are Properties, Map<String, Object>, or any similar data structure. Some already specified keys (strings) can be found at ParameterConfigKeys.
- Returns:
- the resulting alignment of the matching process.
- Throws:
Exception
- in case of any errors
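The warning about the parameters object follows from Java passing object references by value. The sketch below shows the difference between adding to the passed object and reassigning the local variable. The class, method names, and keys are hypothetical and are not taken from ParameterConfigKeys.

```java
import java.util.Properties;

public class ParameterPassingSketch {

    /** Correct: mutates the object the caller passed in. */
    public static void addToParameters(Properties parameters) {
        parameters.setProperty("exampleKey", "visible to caller");
    }

    /** Wrong: only rebinds the local variable; the caller's object is unchanged. */
    public static void replaceParameters(Properties parameters) {
        parameters = new Properties();
        parameters.setProperty("lostKey", "never seen by caller");
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        addToParameters(p);
        replaceParameters(p);
        System.out.println("exampleKey present: " + p.containsKey("exampleKey"));
        System.out.println("lostKey present: " + p.containsKey("lostKey"));
    }
}
```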
-
runSequential
private AlignmentAndParameters runSequential(MergeOrder mergeOrder, List<Set<Object>> models, Object inputAlignment, Properties p) throws MatchingException, Exception
- Throws:
MatchingException
Exception
-
runParallel
private AlignmentAndParameters runParallel(MergeOrder mergeOrder, List<Set<Object>> models, Object inputAlignment, Properties p) throws MatchingException
- Throws:
MatchingException
-
addDistance
-
getMergeTree
Returns the merging tree (which ontologies are merged in which order). Have a look at the return description to see the merging tree format.
- Parameters:
models
- the models
parameters
- object representing additional parameters.
- Returns:
- the merging tree. For n models, this is an (n-1) by 2 matrix where row i describes the merging of clusters at step i of the clustering. If an element j in the row is less than n, then observation j was merged at this stage. If j ≥ n, then the merge was with the cluster formed at the (earlier) stage j-n of the algorithm.
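The matrix format described in the return value can be decoded as in the following sketch. It uses a plain int[][] as a stand-in for the merge tree; how MergeOrder exposes the matrix is not shown here, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeTreeSketch {

    /**
     * Translates an (n-1) x 2 merge matrix into readable steps.
     * An index below n refers to an input model; an index j >= n refers to
     * the cluster created in step j - n.
     */
    public static List<String> describe(int[][] tree, int n) {
        List<String> steps = new ArrayList<>();
        for (int i = 0; i < tree.length; i++) {
            steps.add("step " + i + ": " + label(tree[i][0], n) + " + " + label(tree[i][1], n));
        }
        return steps;
    }

    private static String label(int index, int n) {
        return index < n ? "model " + index : "cluster of step " + (index - n);
    }

    public static void main(String[] args) {
        // three models: first merge models 0 and 1, then merge model 2 with that cluster
        int[][] tree = { {0, 1}, {2, 3} };
        describe(tree, 3).forEach(System.out::println);
    }
}
```

With three models, the entry 3 in the second row is not a model index (3 ≥ n) and therefore points back to the cluster formed in step 0.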
-
isLeftModelGreater
protected boolean isLeftModelGreater(Set<Object> leftOntology, Set<Object> rightOntology, Properties p) throws TypeTransformationException
- Throws:
TypeTransformationException
-
callClearIndex
private void callClearIndex() -
getNumberOfThreads
public int getNumberOfThreads()
Returns the number of threads which are used during merge. A number equal to one means sequential processing, and a number greater than one means parallel processing.
- Returns:
- the number of threads used.
-
setNumberOfThreads
public void setNumberOfThreads(int numberOfThreads)
Sets the number of threads which are used during merge. A number equal to one means sequential processing, and a number greater than one means parallel processing with the specified number of threads.
- Parameters:
numberOfThreads
- the number of threads to use. Values greater than or equal to one are allowed.
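The thread settings can be pictured with the sketch below, which switches between a sequential loop and a fixed thread pool depending on the configured number. This is not the class's actual code: rejecting values below one with an exception, reading the core count via Runtime.availableProcessors(), and the squaring workload are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadCountSketch {

    private int numberOfThreads = 1;

    public void setNumberOfThreads(int numberOfThreads) {
        if (numberOfThreads < 1) {
            // assumption: invalid values are rejected
            throw new IllegalArgumentException("Number of threads must be at least one.");
        }
        this.numberOfThreads = numberOfThreads;
    }

    public void setNumberOfThreadsToCpuCores() {
        // assumption: "available CPU cores" is taken from the JVM
        setNumberOfThreads(Runtime.getRuntime().availableProcessors());
    }

    public int getNumberOfThreads() {
        return numberOfThreads;
    }

    /** Squares the inputs: sequentially for one thread, in a thread pool otherwise. */
    public List<Integer> squareAll(List<Integer> inputs) {
        List<Integer> results = new ArrayList<>();
        if (numberOfThreads == 1) {
            for (int value : inputs) {
                results.add(value * value);
            }
            return results;
        }
        ExecutorService pool = Executors.newFixedThreadPool(numberOfThreads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int value : inputs) {
                futures.add(pool.submit(() -> value * value));
            }
            for (Future<Integer> future : futures) {
                results.add(future.get()); // keeps the input order
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Parallel processing failed", e);
        } finally {
            pool.shutdown();
        }
        return results;
    }
}
```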
-
setNumberOfThreadsToCpuCores
public void setNumberOfThreadsToCpuCores()
Sets the number of threads which are used during merge to the number of available CPU cores. A number equal to one means sequential processing, and a number greater than one means parallel processing with that number of threads.
-
isAddingInformationToUnion
public boolean isAddingInformationToUnion()
Returns true if all information / triples are added to the union. If set to false, only the information of non-matched entities is added to the union.
- Returns:
- true if all information / triples are added to the union
-
setAddingInformationToUnion
public void setAddingInformationToUnion(boolean addInformationToUnion)
Sets whether all information / triples are added to the union. If set to false, only the information of non-matched entities is added to the union.
- Parameters:
addInformationToUnion
- true if all information / triples are added to the union
-
isRemoveUnusedJenaModels
public boolean isRemoveUnusedJenaModels()
Returns true if unused OntModels are removed.
- Returns:
- true if unused OntModels are removed
-
setRemoveUnusedJenaModels
public void setRemoveUnusedJenaModels(boolean removeUnusedJenaModels)
If set to true, this removes OntModel/Model instances which are not needed anymore. This helps to match a large number of KGs in a memory-friendly way.
- Parameters:
removeUnusedJenaModels
- if true, unused OntModels will be removed
-
getIntermediateAlignments
Returns the intermediate alignments. This only works if setSavingIntermediateAlignments(boolean) is set to true before the match method is called.
- Returns:
- a list of intermediate alignments, or null if setSavingIntermediateAlignments(boolean) was set to false (the default).
-
isSavingIntermediateAlignments
public boolean isSavingIntermediateAlignments()
Returns true if intermediate alignments are stored.
- Returns:
- true if intermediate alignments are stored
-
setSavingIntermediateAlignments
public void setSavingIntermediateAlignments(boolean intermediateAlignmentsNew)
Set to true if the intermediate alignments should be stored.
- Parameters:
intermediateAlignmentsNew
- true if the intermediate alignments should be stored
-
getCopyMode
Returns the copy mode in use.
- Returns:
- the copy mode in use
-
setCopyMode
Sets the copy mode which is used during the merging. Defaults to None.
- Parameters:
copyMode
- new copy mode to use
-
getCacheFile
Returns the cache file which is used to store the merge tree. If no cache file is set, null is returned.
- Returns:
- the cache file or null
-
setCacheFile
Sets the cache file which should be used to cache the merge tree. Only use this method if the same models are merged over and over again, so that the merge tree does not need to be computed multiple times. If the file does not exist, the newly computed merge tree is stored in this file. In all other cases, the file will not be modified. The cache file needs to be deleted to compute the merge tree again.
- Parameters:
cacheFile
- the file for storing the merge tree.
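The caching behavior described above (compute and store if the file is missing, otherwise reuse and never modify) can be sketched as follows. This is a simplified illustration that caches a string instead of a merge tree; the class and method names are hypothetical and do not reflect the FileCache implementation.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.function.Supplier;

public class MergeTreeCacheSketch {

    /**
     * Loads the value from the cache file if it exists; otherwise computes it,
     * writes it to the file, and returns it. An existing file is never modified.
     */
    public static String loadOrCompute(File cacheFile, Supplier<String> computation) {
        if (cacheFile == null) {
            return computation.get(); // no caching configured
        }
        try {
            if (cacheFile.exists()) {
                return new String(Files.readAllBytes(cacheFile.toPath()), StandardCharsets.UTF_8);
            }
            String computed = computation.get();
            Files.write(cacheFile.toPath(), computed.getBytes(StandardCharsets.UTF_8));
            return computed;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

As in the description above, a stale result stays in place until the file is deleted, which is why caching only makes sense when the same models are merged repeatedly.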
-
getSerializedTreeFile
Returns the file where the serialized merge tree is stored.
- Returns:
- the file where the serialized merge tree is stored.
-
setSerializedTreeFile
Sets the file where the serialized merge tree is stored. Set this to a non-null value to write the serialized merge tree to a file.
- Parameters:
serializedTreeFile
- the file where the serialized merge tree is stored.
-
getMatcherSupplier
Returns the function which returns a new configured matcher object. If it is not null, it will be used instead of the matcher object.
- Returns:
- the matcher supplier function
-
setMatcherSupplier
If a matcher supplier is set, then this will be used instead of the matcher object.
- Parameters:
matcherSupplier
- the matcher supplier (when called, this function should return a new configured matcher object)
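A supplier in this sense is simply a java.util.function.Supplier that creates a fresh matcher on every call. The sketch below contrasts a shared instance with supplied instances. ToyMatcher is a hypothetical class, and the motivation shown (avoiding shared mutable state) is an assumption, since the documentation above does not state why a supplier may be preferable.

```java
import java.util.function.Supplier;

public class MatcherSupplierSketch {

    /** Hypothetical matcher with mutable state that should not be shared. */
    public static class ToyMatcher {
        private int callCount = 0;

        public int match() {
            return ++callCount;
        }
    }

    /** Returns a supplier that creates a new ToyMatcher on every call. */
    public static Supplier<Object> freshMatchers() {
        return ToyMatcher::new;
    }

    public static void main(String[] args) {
        // a single shared instance accumulates state across calls
        ToyMatcher shared = new ToyMatcher();
        shared.match();
        System.out.println("shared call count: " + shared.match());

        // each supplied instance starts from a clean state
        Supplier<Object> supplier = freshMatchers();
        ToyMatcher first = (ToyMatcher) supplier.get();
        ToyMatcher second = (ToyMatcher) supplier.get();
        System.out.println("same instance: " + (first == second));
        System.out.println("fresh call count: " + second.match());
    }
}
```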
-
setGoldStandard
public void setGoldStandard(Object alignment, DatasetIDExtractor idExtractor) throws TypeTransformationException
- Throws:
TypeTransformationException
-
setGoldStandard
-