Class TextExtractorVerbalizedRDF
java.lang.Object
de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.util.textExtractors.TextExtractorRDFBase
de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.util.textExtractors.TextExtractorVerbalizedRDF
All Implemented Interfaces:
TextExtractor
This extractor creates only one text per resource which describes it by verbalizing
each statement where the resource is in the subject position.
For each RDFNode (subject, predicate, object), the corresponding label is used as its textual representation.
If the object is a literal, the lexical form is used.
Statements that only carry label information are dropped, because that information is already included in the other statements.
An example for a simple class: A subclass of B. A disjoint with X.
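The verbalization described above can be sketched in plain Java. This is a minimal, library-free illustration and not the MELT implementation: the class VerbalizationSketch, the LabeledStatement type, and the exact sentence format are assumptions made for this example. The real extractor works on Jena statements and resolves the labels itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: turns already-labeled statements into one text. */
final class VerbalizationSketch {

    /** One statement reduced to text: labels for resources, lexical form for literals. */
    static final class LabeledStatement {
        final String subject;
        final String predicate;
        final String object;
        final boolean labelStatement; // true if the predicate is a label property

        LabeledStatement(String subject, String predicate, String object, boolean labelStatement) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
            this.labelStatement = labelStatement;
        }
    }

    /** Builds a single text with one sentence per statement, skipping label statements. */
    static String verbalize(List<LabeledStatement> statements) {
        List<String> sentences = new ArrayList<>();
        for (LabeledStatement s : statements) {
            if (s.labelStatement) {
                continue; // the label already serves as the text for the subject
            }
            sentences.add(s.subject + " " + s.predicate + " " + s.object + ".");
        }
        return String.join(" ", sentences);
    }

    public static void main(String[] args) {
        List<LabeledStatement> statements = Arrays.asList(
                new LabeledStatement("A", "label", "A", true),
                new LabeledStatement("A", "subclass of", "B", false),
                new LabeledStatement("A", "disjoint with", "X", false));
        System.out.println(verbalize(statements));
    }
}
```

The label statement is skipped because its content already appears as the subject text in the remaining sentences.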
Field Summary
Modifier and type, followed by the field name:
protected boolean includeQuotes
private static Set<org.apache.jena.rdf.model.Property> LABEL_PROPERTIES
private static final TextExtractorOnlyLabel labelExtractor
protected boolean lineByLineTranslation
Fields inherited from class de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.util.textExtractors.TextExtractorRDFBase
SKIP_DEFINITIONS, SKIP_DEFINITIONS_AND_LONG_LITERALS, SKIP_DEFINITIONS_AND_SHORTEN_LONG_LITERALS, statementProcessor
Constructor Summary
TextExtractorVerbalizedRDF()
TextExtractorVerbalizedRDF(boolean lineByLineTranslation, boolean includeQuotes)
Method Summary
Modifier and type (where given), followed by the method and its description:
extract(org.apache.jena.rdf.model.Resource r) - Given a Jena resource, this method extracts textual/string representations from it.
boolean isIncludeQuotes()
boolean isLineByLineTranslation()
protected String optionallyQuote(String text)
setIncludeQuotes(boolean includeQuotes)
setLineByLineTranslation(boolean lineByLineTranslation)
Methods inherited from class de.uni_mannheim.informatik.dws.melt.matching_jena_matchers.util.textExtractors.TextExtractorRDFBase
getStatementProcessor, setStatementProcessor
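Field Details
labelExtractor
private static final TextExtractorOnlyLabel labelExtractor
lineByLineTranslation
protected boolean lineByLineTranslation
includeQuotes
protected boolean includeQuotes
LABEL_PROPERTIES
private static Set<org.apache.jena.rdf.model.Property> LABEL_PROPERTIES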
Constructor Details
TextExtractorVerbalizedRDF
public TextExtractorVerbalizedRDF(boolean lineByLineTranslation, boolean includeQuotes)
Parameters:
lineByLineTranslation - if set to true, the subject is repeated in every sentence, e.g. A subclass of B. A disjoint with X.
includeQuotes - if true, the subject and object (represented by their labels) are quoted so that they are clearly separated from the rest of the sentence.
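To make the effect of the two flags concrete, here is a small plain-Java sketch. It is only an illustration under stated assumptions: the class VerbalizationFlagsSketch and its methods are hypothetical, the use of double quotes is a guess, and the compact form used when lineByLineTranslation is false (subject once, clauses joined by commas) is an assumption, since the documentation above only describes the behavior for true.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

/** Hypothetical sketch of how the two constructor flags could shape the text. */
final class VerbalizationFlagsSketch {

    /** Wraps the text in double quotes only if quoting is enabled (quote style is assumed). */
    static String optionallyQuote(String text, boolean includeQuotes) {
        return includeQuotes ? "\"" + text + "\"" : text;
    }

    /** Each entry of predicateObjectPairs is {predicate label, object label}. */
    static String verbalize(String subject, List<String[]> predicateObjectPairs,
                            boolean lineByLineTranslation, boolean includeQuotes) {
        if (predicateObjectPairs.isEmpty()) {
            return "";
        }
        String quotedSubject = optionallyQuote(subject, includeQuotes);
        if (lineByLineTranslation) {
            // The subject is repeated at the start of every sentence.
            StringJoiner sentences = new StringJoiner(" ");
            for (String[] pair : predicateObjectPairs) {
                sentences.add(quotedSubject + " " + pair[0] + " "
                        + optionallyQuote(pair[1], includeQuotes) + ".");
            }
            return sentences.toString();
        }
        // Assumed compact form: the subject appears once, clauses are comma-separated.
        StringJoiner clauses = new StringJoiner(", ", quotedSubject + " ", ".");
        for (String[] pair : predicateObjectPairs) {
            clauses.add(pair[0] + " " + optionallyQuote(pair[1], includeQuotes));
        }
        return clauses.toString();
    }

    public static void main(String[] args) {
        List<String[]> pairs = Arrays.asList(
                new String[] {"subclass of", "B"},
                new String[] {"disjoint with", "X"});
        System.out.println(verbalize("A", pairs, true, false));
        System.out.println(verbalize("A", pairs, true, true));
        System.out.println(verbalize("A", pairs, false, false));
    }
}
```

With the actual class, the same choices are made through the two-argument constructor or later through setLineByLineTranslation and setIncludeQuotes.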
TextExtractorVerbalizedRDF
public TextExtractorVerbalizedRDF()
Method Details
extract
Description copied from interface: TextExtractor
Given a Jena resource, this method extracts textual/string representations from it.
Parameters:
r - the Jena resource, which also allows traversing the whole RDF graph
Returns:
a set of textual representations of the given resource.
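optionallyQuote
protected String optionallyQuote(String text)
isLineByLineTranslation
public boolean isLineByLineTranslation()
setLineByLineTranslation
setLineByLineTranslation(boolean lineByLineTranslation)
isIncludeQuotes
public boolean isIncludeQuotes()
setIncludeQuotes
setIncludeQuotes(boolean includeQuotes)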