Class ConfidenceFinder
java.lang.Object
de.uni_mannheim.informatik.dws.melt.matching_eval.paramtuning.ConfidenceFinder
This class offers static functionality to analyze and optimize matchers in terms of their confidences (and
confidence thresholds).
-
Field Summary
-
Constructor Summary
-
Method Summary
Modifier and Type, Method
    Description

private static double divideWithTwoDenominators(double numerator, double denominatorOne, double denominatorTwo)
    Simple division that is to be performed.
static double getBestConfidenceForFmeasure(ExecutionResult executionResult)
    Given an ExecutionResult, this method determines the best cutting point in order to optimize the F1-score.
static double getBestConfidenceForFmeasure(Alignment reference, Alignment systemAlignment, GoldStandardCompleteness gsCompleteness)
    If this method takes too long, you can use the more efficient method getBestConfidenceForFmeasure(Alignment, Alignment, GoldStandardCompleteness, int) and set a decimal precision (e.g. 1 or 2).
static double getBestConfidenceForFmeasure(Alignment reference, Alignment systemAlignment, GoldStandardCompleteness gsCompleteness, int decimalPrecision)
    Given two alignments, this method determines the best cutting point (main confidence in correspondences) in order to optimize the F1-score.
static double getBestConfidenceForFmeasureBeta(ExecutionResult executionResult, double beta)
    Given an ExecutionResult, this method determines the best cutting point in order to optimize the F_beta-score (beta is given as a parameter).
static double getBestConfidenceForFmeasureBeta(Alignment reference, Alignment systemAlignment, GoldStandardCompleteness gsCompleteness, double beta)
    Given two alignments, this method determines the best cutting point (main confidence in correspondences) in order to optimize the F_beta-score (beta is given as a parameter).
static double getBestConfidenceForPrecision(ExecutionResult executionResult)
    Given an ExecutionResult, this method determines the best cutting point in order to optimize the precision.
static double getBestConfidenceForPrecision(Alignment reference, Alignment systemAlignment, GoldStandardCompleteness gsCompleteness)
    Given two alignments, this method determines the best cutting point (main confidence in correspondences) in order to optimize the precision.
static ExecutionResultSet getConfidenceResultSet(ExecutionResult executionResult)
private static double getFbetaMeasure(double precision, double recall, double beta)
getOccurringConfidences(Alignment a, double begin, double end)
getOccurringConfidences(Alignment alignment, int decimalPrecision)
    If you require a precise solution, set the decimalPrecision to a negative number.
getOccurringConfidences(Alignment a, int decimalPrecision, double begin, double end)
getSteps(double start, double end, double stepWidth)
-
Field Details
-
LOGGER
private static final org.slf4j.Logger LOGGER
-
-
Constructor Details
-
ConfidenceFinder
public ConfidenceFinder()
-
-
Method Details
-
getSteps
-
getOccurringConfidences
-
getOccurringConfidences
-
getOccurringConfidences
If you require a precise solution, set the decimalPrecision to a negative number.
Parameters:
alignment - The alignment.
decimalPrecision - The desired decimal precision. Use a negative number for optimal precision.
Returns:
Set of occurring confidence values.
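The effect of decimalPrecision can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the class OccurringConfidencesSketch, the use of a plain list of doubles instead of an Alignment, and the half-up rounding mode are all assumptions made for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of MELT.
final class OccurringConfidencesSketch {

    // Collects the distinct confidence values in ascending order.
    // A negative decimalPrecision keeps the values unrounded.
    static Set<Double> occurringConfidences(List<Double> confidences, int decimalPrecision) {
        Set<Double> result = new TreeSet<>();
        for (double confidence : confidences) {
            if (decimalPrecision < 0) {
                result.add(confidence);
            } else {
                // Assumption: round half up; the library may round differently.
                BigDecimal rounded = BigDecimal.valueOf(confidence)
                        .setScale(decimalPrecision, RoundingMode.HALF_UP);
                result.add(rounded.doubleValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Double> confidences = List.of(0.912, 0.914, 0.55, 0.551);
        System.out.println(occurringConfidences(confidences, 2));
        System.out.println(occurringConfidences(confidences, -1));
    }
}
```

With a precision of 2, nearby confidences collapse into the same candidate value, so fewer thresholds have to be evaluated; with a negative precision, every distinct confidence stays a candidate.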
-
getOccurringConfidences
-
getBestConfidenceForFmeasure
Given an ExecutionResult, this method determines the best cutting point in order to optimize the F1-score.
Parameters:
executionResult - The execution result for which the optimal confidence threshold shall be determined.
Returns:
The confidence threshold that yields the best F1 measure. All correspondences with a confidence lower than the returned value should be discarded. You can pass the result directly to ConfidenceFilter, which cuts correspondences with a confidence less than the threshold.
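The meaning of the returned threshold can be sketched as follows. This is an illustration only and does not use ConfidenceFilter itself: the class ThresholdFilterSketch and the representation of an alignment as a map from a correspondence key to its confidence are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of MELT.
final class ThresholdFilterSketch {

    // Keeps every correspondence whose confidence is at least the threshold
    // and discards those with a lower confidence.
    static Map<String, Double> filter(Map<String, Double> alignment, double threshold) {
        Map<String, Double> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Double> entry : alignment.entrySet()) {
            if (entry.getValue() >= threshold) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Double> alignment = new LinkedHashMap<>();
        alignment.put("a", 0.9);
        alignment.put("b", 0.5);
        alignment.put("c", 0.7);
        System.out.println(filter(alignment, 0.7));
    }
}
```

A correspondence whose confidence equals the threshold is kept; only strictly lower confidences are cut.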
-
getBestConfidenceForFmeasure
public static double getBestConfidenceForFmeasure(Alignment reference, Alignment systemAlignment, GoldStandardCompleteness gsCompleteness)
If this method takes too long, you can use the more efficient method getBestConfidenceForFmeasure(Alignment, Alignment, GoldStandardCompleteness, int) and set a decimal precision (e.g. 1 or 2).
Parameters:
reference - The reference alignment.
systemAlignment - The system alignment.
gsCompleteness - The gold standard completeness.
Returns:
The optimal confidence.
-
getBestConfidenceForFmeasure
public static double getBestConfidenceForFmeasure(Alignment reference, Alignment systemAlignment, GoldStandardCompleteness gsCompleteness, int decimalPrecision)
Given two alignments, this method determines the best cutting point (main confidence in correspondences) in order to optimize the F1-score.
Parameters:
reference - The reference alignment to use.
systemAlignment - The system alignment.
gsCompleteness - The completeness of the gold standard. If the reference alignment is a subset of the overall reference alignment AND the alignment is one-to-one, use GoldStandardCompleteness.PARTIAL_SOURCE_COMPLETE_TARGET_COMPLETE.
decimalPrecision - The decimal precision of the confidences. A low precision (such as 2) improves the runtime performance but may lead to suboptimal results. If you require an optimal solution, set the decimal precision to a negative number.
Returns:
The confidence threshold that yields the best F1 measure. All correspondences with a confidence lower than the returned value should be discarded. You can pass the result directly to ConfidenceFilter, which cuts correspondences with a confidence less than the threshold.
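The idea of searching for the best cutting point can be sketched in a few lines. This is a simplified illustration, not the library's algorithm: the class BestF1ThresholdSketch is hypothetical, correspondences are reduced to string keys, the gold standard is assumed to be complete, and ties are resolved in favor of the lowest threshold.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of MELT.
final class BestF1ThresholdSketch {

    // F1 of the system alignment after discarding confidences below the threshold.
    static double f1At(Set<String> reference, Map<String, Double> system, double threshold) {
        int tp = 0;
        int fp = 0;
        for (Map.Entry<String, Double> entry : system.entrySet()) {
            if (entry.getValue() < threshold) {
                continue; // discarded by the cut
            }
            if (reference.contains(entry.getKey())) {
                tp++;
            } else {
                fp++;
            }
        }
        int fn = reference.size() - tp;
        int denominator = 2 * tp + fp + fn;
        return denominator == 0 ? 0.0 : (2.0 * tp) / denominator;
    }

    // Tries every occurring confidence as a threshold and returns the best one.
    static double bestThreshold(Set<String> reference, Map<String, Double> system) {
        Set<Double> candidates = new TreeSet<>(system.values());
        double best = 0.0;
        double bestF1 = -1.0;
        for (double candidate : candidates) {
            double f1 = f1At(reference, system, candidate);
            if (f1 > bestF1) { // assumption: on ties keep the lower threshold
                bestF1 = f1;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Set<String> reference = Set.of("a", "b", "c");
        Map<String, Double> system = Map.of("a", 0.9, "b", 0.8, "x", 0.6, "c", 0.5, "y", 0.3);
        System.out.println(bestThreshold(reference, system));
    }
}
```

Because every distinct confidence is a candidate, the cost grows with the number of distinct values, which is why rounding the confidences to a low decimal precision shortens the search.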
-
getBestConfidenceForFmeasureBeta
Given an ExecutionResult, this method determines the best cutting point in order to optimize the F_beta-score (beta is given as a parameter).
Parameters:
executionResult - The execution result for which the optimal confidence threshold shall be determined.
beta - The beta value for the F_beta measure.
Returns:
The confidence threshold that yields the best F_beta measure. All correspondences with a confidence lower than the returned value should be discarded. You can pass the result directly to ConfidenceFilter, which cuts correspondences with a confidence less than the threshold.
-
getBestConfidenceForFmeasureBeta
public static double getBestConfidenceForFmeasureBeta(Alignment reference, Alignment systemAlignment, GoldStandardCompleteness gsCompleteness, double beta)
Given two alignments, this method determines the best cutting point (main confidence in correspondences) in order to optimize the F_beta-score (beta is given as a parameter).
Parameters:
reference - The reference alignment to use.
systemAlignment - The system alignment.
gsCompleteness - The completeness of the gold standard. If the reference alignment is a subset of the overall reference alignment AND the alignment is one-to-one, use GoldStandardCompleteness.PARTIAL_SOURCE_COMPLETE_TARGET_COMPLETE.
beta - The beta value for the F_beta measure.
Returns:
The confidence threshold that yields the best F_beta measure. All correspondences with a confidence lower than the returned value should be discarded. You can pass the result directly to ConfidenceFilter, which cuts correspondences with a confidence less than the threshold.
-
getBestConfidenceForPrecision
Given an ExecutionResult, this method determines the best cutting point in order to optimize the precision.
Parameters:
executionResult - The execution result for which the optimal confidence threshold shall be determined.
Returns:
The confidence threshold that yields the best precision. All correspondences with a confidence lower than the returned value should be discarded. You can pass the result directly to ConfidenceFilter, which cuts correspondences with a confidence less than the threshold.
-
getBestConfidenceForPrecision
public static double getBestConfidenceForPrecision(Alignment reference, Alignment systemAlignment, GoldStandardCompleteness gsCompleteness)
Given two alignments, this method determines the best cutting point (main confidence in correspondences) in order to optimize the precision.
Parameters:
reference - The reference alignment to use.
systemAlignment - The system alignment.
gsCompleteness - The completeness of the gold standard. If the reference alignment is a subset of the overall reference alignment AND the alignment is one-to-one, use GoldStandardCompleteness.PARTIAL_SOURCE_COMPLETE_TARGET_COMPLETE.
Returns:
The confidence threshold that yields the best precision. All correspondences with a confidence lower than the returned value should be discarded. You can pass the result directly to ConfidenceFilter, which cuts correspondences with a confidence less than the threshold.
-
getConfidenceResultSet
-
divideWithTwoDenominators
private static double divideWithTwoDenominators(double numerator, double denominatorOne, double denominatorTwo)
Divides the numerator by the sum of the two denominators, i.e. numerator / (denominatorOne + denominatorTwo).
Parameters:
numerator - Numerator of the fraction.
denominatorOne - First summand of the denominator.
denominatorTwo - Second summand of the denominator.
Returns:
Result as double.
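A division of this shape matches how precision and recall are usually computed, for example TP / (TP + FP). The following sketch is an assumption about the behavior, not the private method's actual code; the class DivisionSketch and the choice to return 0.0 when both denominators sum to zero are hypothetical.

```java
// Hypothetical helper, not part of MELT.
final class DivisionSketch {

    // Divides the numerator by the sum of the two denominators.
    static double divideWithTwoDenominators(double numerator, double denominatorOne, double denominatorTwo) {
        double sum = denominatorOne + denominatorTwo;
        // Assumption: return 0.0 instead of NaN or Infinity when the sum is zero.
        return sum == 0.0 ? 0.0 : numerator / sum;
    }

    public static void main(String[] args) {
        // Precision for 3 true positives and 1 false positive.
        System.out.println(divideWithTwoDenominators(3, 3, 1));
    }
}
```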
-
getFbetaMeasure
private static double getFbetaMeasure(double precision, double recall, double beta)
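The page gives no description for this method. For orientation, the standard definition of the F_beta measure is (1 + beta^2) * precision * recall / (beta^2 * precision + recall). The sketch below implements that textbook formula; the class FBetaSketch and the handling of a zero denominator are assumptions, and the private method may differ in its details.

```java
// Hypothetical helper, not part of MELT.
final class FBetaSketch {

    // Textbook F_beta: beta > 1 weights recall higher, beta < 1 weights precision higher.
    static double fBeta(double precision, double recall, double beta) {
        double betaSquared = beta * beta;
        double denominator = betaSquared * precision + recall;
        // Assumption: define the score as 0.0 when precision and recall are both zero.
        return denominator == 0.0 ? 0.0 : (1 + betaSquared) * precision * recall / denominator;
    }

    public static void main(String[] args) {
        System.out.println(fBeta(0.5, 1.0, 1.0)); // F1
        System.out.println(fBeta(1.0, 0.5, 2.0)); // F2, recall-weighted
    }
}
```

With beta = 1 the formula reduces to the F1-score, the harmonic mean of precision and recall.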
-