Class FamerClustering
java.lang.Object
de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.multisource.clustering.FamerClustering
- All Implemented Interfaces:
Filter, IMatcherMultiSource<Object,Alignment,Object>
public class FamerClustering
extends Object
implements IMatcherMultiSource<Object,Alignment,Object>, Filter
A filter for multi source matching.
It filters the input alignment by analyzing the structure of the correspondences.
E.g. if many entities are fully connected, then this indicates that all of those correspondences are correct.
More information on the available algorithms, one of which is chosen in the constructor, can be found at Scalable Matching and Clustering of Entities with FAMER.
The source code can be found at gitlab.
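The idea that a fully connected group of entities signals correct correspondences can be illustrated without FAMER itself. The following is a minimal sketch in plain Java and is not the algorithm used by this class: CliqueCheck, adjacency and isFullyConnected are hypothetical names, and correspondences are modeled as pairs of URI strings rather than MELT Correspondence objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of MELT or FAMER.
final class CliqueCheck {

    /** Builds an undirected adjacency map from pairs of entity URIs. */
    static Map<String, Set<String>> adjacency(List<String[]> correspondences) {
        Map<String, Set<String>> adj = new HashMap<>();
        for (String[] c : correspondences) {
            adj.computeIfAbsent(c[0], k -> new HashSet<>()).add(c[1]);
            adj.computeIfAbsent(c[1], k -> new HashSet<>()).add(c[0]);
        }
        return adj;
    }

    /** Returns true if every pair of distinct entities is linked by a correspondence. */
    static boolean isFullyConnected(Set<String> entities, Map<String, Set<String>> adj) {
        for (String a : entities) {
            Set<String> neighbours = adj.getOrDefault(a, Collections.emptySet());
            for (String b : entities) {
                if (!a.equals(b) && !neighbours.contains(b)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String[]> pairs = new ArrayList<>();
        pairs.add(new String[] {"http://a.example/x", "http://b.example/x"});
        pairs.add(new String[] {"http://b.example/x", "http://c.example/x"});
        pairs.add(new String[] {"http://a.example/x", "http://c.example/x"});
        Set<String> group = new HashSet<>(Arrays.asList(
                "http://a.example/x", "http://b.example/x", "http://c.example/x"));
        // All three pairs are present, so the group forms a clique.
        System.out.println(isFullyConnected(group, adjacency(pairs)));
    }
}
```

A group that passes this check has a correspondence between every pair of its members, which is the structural signal the class description refers to.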
-
Field Summary
Modifier and Type  Field  Description
private boolean addCorrespondences
private org.gradoop.famer.clustering.parallelClustering.AbstractParallelClustering clusteringAlgorithm
private DatasetIDExtractor datsetIdExtractor
private static final org.slf4j.Logger LOGGER
private static Pattern NON_DIGIT
private int parallelism
private boolean removeCorrespondences
-
Constructor Summary
Constructor  Description
FamerClustering(DatasetIDExtractor datsetIdExtractor)
FamerClustering(DatasetIDExtractor datsetIdExtractor, org.gradoop.famer.clustering.parallelClustering.AbstractParallelClustering clusteringAlgorithm)
FamerClustering(DatasetIDExtractor datsetIdExtractor, org.gradoop.famer.clustering.parallelClustering.AbstractParallelClustering clusteringAlgorithm, boolean addCorrespondences, boolean removeCorrespondences)
-
Method Summary
Modifier and Type  Method  Description
private static Map<String,Set<Long>> getClusteringFromLogicalGraphClip(org.gradoop.flink.model.impl.epgm.LogicalGraph clusteredGraph)
private static Map<String,Set<Long>> getClusteringFromLogicalGraphWithLong(org.gradoop.flink.model.impl.epgm.LogicalGraph clusteredGraph)
private static Map<String,Set<Long>> getClusteringFromLogicalGraphWithString(org.gradoop.flink.model.impl.epgm.LogicalGraph clusteredGraph)
static Map<String,Set<Long>> getClusters(Alignment alignment, org.gradoop.famer.clustering.parallelClustering.AbstractParallelClustering clusteringAlgorithm, DatasetIDExtractor datsetIdExtractor)
Computes a map between URIs and the corresponding cluster IDs.
static Map<String,Set<Long>> getClusters(Alignment alignment, org.gradoop.famer.clustering.parallelClustering.AbstractParallelClustering clusteringAlgorithm, DatasetIDExtractor datsetIdExtractor, int parallelism)
Computes a map between URIs and the corresponding cluster IDs.
private static LogicalGraphAndSourceIds getLogicalGraphFromAlignment(Alignment a, DatasetIDExtractor datsetIdExtractor, int parallelism)
int getParallelism()
static boolean instanceOfOne(Object o, Class<?>... classes)
Alignment match(List<Object> models, Alignment inputAlignment, Object parameters)
Matches multiple ontologies / knowledge graphs together.
processAlignment(Alignment inputAlignment)
void setParallelism(int parallelism)
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface de.uni_mannheim.informatik.dws.melt.matching_base.multisource.IMatcherMultiSource
needsTransitiveClosureForEvaluation
-
Field Details
-
LOGGER
private static final org.slf4j.Logger LOGGER -
datsetIdExtractor
private DatasetIDExtractor datsetIdExtractor -
clusteringAlgorithm
private org.gradoop.famer.clustering.parallelClustering.AbstractParallelClustering clusteringAlgorithm -
addCorrespondences
private boolean addCorrespondences -
removeCorrespondences
private boolean removeCorrespondences -
parallelism
private int parallelism -
NON_DIGIT
private static Pattern NON_DIGIT -
-
Constructor Details
-
FamerClustering
public FamerClustering(DatasetIDExtractor datsetIdExtractor, org.gradoop.famer.clustering.parallelClustering.AbstractParallelClustering clusteringAlgorithm, boolean addCorrespondences, boolean removeCorrespondences) -
FamerClustering
public FamerClustering(DatasetIDExtractor datsetIdExtractor, org.gradoop.famer.clustering.parallelClustering.AbstractParallelClustering clusteringAlgorithm) -
FamerClustering
public FamerClustering(DatasetIDExtractor datsetIdExtractor) -
-
Method Details
-
match
public Alignment match(List<Object> models, Alignment inputAlignment, Object parameters) throws Exception
Description copied from interface: IMatcherMultiSource
Matches multiple ontologies / knowledge graphs together.
- Specified by:
match in interface IMatcherMultiSource<Object,Alignment,Object>
- Parameters:
models - a list of ontologies / knowledge graphs in the desired format.
inputAlignment - this object represents the input alignment.
parameters - an object representing additional parameters. Only add to this object; do not create a new object with parameters = new ...(), because the new object would not be visible to the caller and the parameters would be lost (Java is call by value). Sensible classes are Properties, Map<String, Object>, or any similar data structure. Some already specified keys (strings) can be found in ParameterConfigKeys.
- Returns:
- the resulting alignment of the matching process.
- Throws:
Exception - in case of any errors
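The note about the parameters object follows from Java passing references by value. The sketch below shows the difference between adding to the passed-in object and rebinding the local variable. ParameterPassingDemo and the key "exampleKey" are hypothetical and are not taken from ParameterConfigKeys.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; it does not use any MELT types.
final class ParameterPassingDemo {

    // Correct: mutates the object that the caller also references.
    static void addToParameters(Map<String, Object> parameters) {
        parameters.put("exampleKey", "exampleValue");
    }

    // Wrong: rebinds only the local variable, so the caller's map stays untouched.
    static void replaceParameters(Map<String, Object> parameters) {
        parameters = new HashMap<>();
        parameters.put("exampleKey", "exampleValue");
    }

    public static void main(String[] args) {
        Map<String, Object> kept = new HashMap<>();
        addToParameters(kept);
        System.out.println(kept.containsKey("exampleKey")); // prints true

        Map<String, Object> lost = new HashMap<>();
        replaceParameters(lost);
        System.out.println(lost.containsKey("exampleKey")); // prints false
    }
}
```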
-
processAlignment
-
getParallelism
public int getParallelism() -
setParallelism
public void setParallelism(int parallelism) -
getClusters
public static Map<String,Set<Long>> getClusters(Alignment alignment, org.gradoop.famer.clustering.parallelClustering.AbstractParallelClustering clusteringAlgorithm, DatasetIDExtractor datsetIdExtractor)
Computes a map between URIs and the corresponding cluster IDs.
- Parameters:
alignment - the alignment
clusteringAlgorithm - the clustering algorithm to use. The ClusteringOutputType does not matter, but for best performance choose ClusteringOutputType.GRAPH.
datsetIdExtractor - the dataset ID extractor to use. It receives a URI and returns the corresponding data source ID.
- Returns:
- a map between URIs and the corresponding cluster IDs
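A caller will often want the members of each cluster rather than the cluster IDs of each URI. The following is a minimal sketch that inverts a map of the returned type Map<String,Set<Long>>. ClusterMapInverter and urisByCluster are hypothetical names, and the input is built by hand here instead of being obtained from getClusters. Because the value type is a set, the sketch allows a URI to carry more than one cluster ID.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper; works on any map shaped like the return value of getClusters.
final class ClusterMapInverter {

    /** Groups URIs by cluster ID; sorted collections keep the output deterministic. */
    static Map<Long, Set<String>> urisByCluster(Map<String, Set<Long>> uriToClusterIds) {
        Map<Long, Set<String>> result = new TreeMap<>();
        for (Map.Entry<String, Set<Long>> entry : uriToClusterIds.entrySet()) {
            for (Long clusterId : entry.getValue()) {
                result.computeIfAbsent(clusterId, k -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hand-made sample data standing in for a real clustering result.
        Map<String, Set<Long>> sample = new HashMap<>();
        sample.put("http://a.example/x", new HashSet<>(Arrays.asList(1L)));
        sample.put("http://b.example/x", new HashSet<>(Arrays.asList(1L)));
        sample.put("http://c.example/y", new HashSet<>(Arrays.asList(2L)));
        System.out.println(urisByCluster(sample));
    }
}
```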
-
getClusters
public static Map<String,Set<Long>> getClusters(Alignment alignment, org.gradoop.famer.clustering.parallelClustering.AbstractParallelClustering clusteringAlgorithm, DatasetIDExtractor datsetIdExtractor, int parallelism)
Computes a map between URIs and the corresponding cluster IDs.
- Parameters:
alignment - the alignment
clusteringAlgorithm - the clustering algorithm to use. The ClusteringOutputType does not matter, but for best performance choose ClusteringOutputType.GRAPH.
datsetIdExtractor - the dataset ID extractor to use. It receives a URI and returns the corresponding data source ID.
parallelism - the parallelism for the local Flink environment (can be set to -1 for the default, which is the number of processors).
- Returns:
- a map between URIs and the corresponding cluster IDs
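The -1 default for parallelism can be pictured as a fallback to the processor count reported by the JVM. The sketch below is an assumption about how such a value could be resolved, not the code used by this class: ParallelismDefaults and resolve are hypothetical names, and rejecting other non-positive values is a choice made for this sketch only.

```java
// Hypothetical helper; the real handling inside FamerClustering may differ.
final class ParallelismDefaults {

    /** Maps -1 to the number of available processors and passes positive values through. */
    static int resolve(int requested) {
        if (requested == -1) {
            return Runtime.getRuntime().availableProcessors();
        }
        if (requested < 1) {
            // Assumption of this sketch: any other non-positive value is treated as an error.
            throw new IllegalArgumentException("parallelism must be -1 or positive: " + requested);
        }
        return requested;
    }

    public static void main(String[] args) {
        System.out.println(resolve(-1)); // machine dependent
        System.out.println(resolve(4));  // prints 4
    }
}
```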
-
instanceOfOne
-
getClusteringFromLogicalGraphWithString
private static Map<String,Set<Long>> getClusteringFromLogicalGraphWithString(org.gradoop.flink.model.impl.epgm.LogicalGraph clusteredGraph) throws Exception
- Throws:
Exception
-
getClusteringFromLogicalGraphClip
private static Map<String,Set<Long>> getClusteringFromLogicalGraphClip(org.gradoop.flink.model.impl.epgm.LogicalGraph clusteredGraph) throws Exception
- Throws:
Exception
-
getClusteringFromLogicalGraphWithLong
private static Map<String,Set<Long>> getClusteringFromLogicalGraphWithLong(org.gradoop.flink.model.impl.epgm.LogicalGraph clusteredGraph) throws Exception
- Throws:
Exception
-
getLogicalGraphFromAlignment
private static LogicalGraphAndSourceIds getLogicalGraphFromAlignment(Alignment a, DatasetIDExtractor datsetIdExtractor, int parallelism)
-