Class AlignmentParser
java.lang.Object
de.uni_mannheim.informatik.dws.melt.yet_another_alignment_api.AlignmentParser
The AlignmentParser can parse XML files following the convention described in the
Alignment Format. Note that currently EDOAL is not
supported.
- Author:
- Sven Hertling, Jan Portisch
-
Field Summary
Modifier and Type | Field | Description
private static final org.slf4j.Logger | LOGGER | Default logger.
private static final ThreadLocal<SAXParser> | threadLocal |
-
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
static InputStream | getInputStreamFromURL(URL url) |
private static String | getRecordEntry(org.apache.commons.csv.CSVRecord record, int index) |
private static String | getRecordEntry(org.apache.commons.csv.CSVRecord record, String name) |
static Alignment | parse | Parses the given file as alignment.
static Alignment | parse(InputStream s) |
static void | parse(InputStream s, Alignment m) |
static Alignment | parse | Reference the alignment file to be parsed using a String URI.
static Alignment | parse | Reference the alignment file to be parsed using a String URI.
static Alignment | parse |
static Alignment | parseCSV | Parse alignment from CSV (comma separated file) with header (source, target, confidence, relation).
static Alignment | parseCSVWithoutHeader(File file) | Parse alignment from CSV (comma separated file) without header.
static Alignment | parseCSVWithoutHeader(File file, char delimiter) | Parse alignment from CSV (comma separated file) without header.
static Alignment | parseFromText(String text) | Parses an alignment based on the textual representation.
static Alignment | parseTSV | Parse alignment from TSV (tab separated file) without header. The columns are source \t target \t confidence \t relation.
-
Field Details
-
LOGGER
private static final org.slf4j.Logger LOGGER
Default logger.
-
threadLocal
-
-
Constructor Details
-
AlignmentParser
public AlignmentParser()
-
-
Method Details
-
parseFromText
Parses an alignment based on the textual representation.
- Parameters:
text
- the XML as a string
- Returns:
- Parsed alignment instance.
- Throws:
SAXException
- Parsing exception.
IOException
- IO exception.
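
As a rough illustration of what parsing the textual representation involves, the following stand-alone sketch reads Alignment Format XML from a string with the JDK's SAX parser and collects one entry per Cell. The class `AlignmentTextSketch` and the method `readCells` are hypothetical names chosen for this example and are not part of the library; the element names (`Cell`, `entity1`, `entity2`, `relation`, `measure`) are assumed from the commonly used Alignment Format layout.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-alone sketch; this is not the AlignmentParser implementation.
class AlignmentTextSketch {

    /** Returns one "source|target|relation|confidence" string per Cell element. */
    static List<String> readCells(String xml)
            throws SAXException, IOException, ParserConfigurationException {
        List<String> cells = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private String source, target, relation, measure;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                text.setLength(0); // start collecting character data for this element
                if (qName.equals("entity1")) source = atts.getValue("rdf:resource");
                if (qName.equals("entity2")) target = atts.getValue("rdf:resource");
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if (qName.equals("relation")) relation = text.toString().trim();
                if (qName.equals("measure")) measure = text.toString().trim();
                if (qName.equals("Cell")) {
                    cells.add(source + "|" + target + "|" + relation + "|" + measure);
                }
            }
        };
        // The default factory is not namespace-aware, so qName is the raw prefixed name.
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return cells;
    }
}
```

With the library itself, the equivalent call would be `AlignmentParser.parseFromText(text)`, which returns an `Alignment` rather than a list of strings.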
-
parse
Reference the alignment file to be parsed using a String URI.
- Parameters:
uri
- URI as String. Alternatively a file path can be specified.
- Returns:
- Parsed alignment instance.
- Throws:
SAXException
- Parsing exception.
IOException
- IO exception.
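
The page does not describe how the method distinguishes a URI from a file path. One plausible way to accept both is sketched below; `UriOrPathSketch` and `toUri` are hypothetical names, and the rule that a missing or single-letter scheme means "file path" is an assumption of this example, not documented library behaviour.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch; the real resolution logic of parse(String) may differ.
class UriOrPathSketch {

    /** Returns a URI for the input, treating scheme-less input as a local file path. */
    static URI toUri(String uriOrPath) {
        try {
            URI uri = new URI(uriOrPath);
            // A one-letter "scheme" is most likely a Windows drive letter such as C:/...
            if (uri.getScheme() != null && uri.getScheme().length() > 1) {
                return uri;
            }
        } catch (URISyntaxException e) {
            // Not a syntactically valid URI (e.g. contains spaces): fall back to a file path.
        }
        return new File(uriOrPath).toURI();
    }
}
```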
-
parse
Reference the alignment file to be parsed using a String URI.
- Parameters:
uri
- URI as String.
- Returns:
- Parsed alignment instance.
- Throws:
SAXException
- Parsing exception.
IOException
- IO exception.
-
parse
- Throws:
SAXException
IOException
-
parse
Parses the given file as alignment.
- Parameters:
fileToBeParsed
- The file that shall be parsed.
- Returns:
- Parsed alignment instance.
- Throws:
SAXException
- A SAXException.
IOException
- An IOException.
-
parse
- Throws:
SAXException
IOException
-
parse
- Throws:
SAXException
IOException
-
getInputStreamFromURL
- Throws:
IOException
-
parseCSV
Parse alignment from CSV (comma separated file) with header (source, target, confidence, relation). The extensions are not parsed.
- Parameters:
file
- the file to read from
- Returns:
- the parsed alignment
- Throws:
IOException
- thrown if some io error occurs.
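
To make the header-based layout concrete, the sketch below looks up the four columns by header name, so their order in the file does not matter. It uses plain string splitting without quoting support, whereas the library's signatures suggest it relies on Apache Commons CSV; `CsvHeaderSketch` and `read` are hypothetical names for this example only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of header-based CSV reading; no quoting or escaping is handled.
class CsvHeaderSketch {

    /** Returns each data row as {source, target, confidence, relation}; missing values are null. */
    static List<String[]> read(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        if (lines.isEmpty()) {
            return rows;
        }
        // Map each header name to its column position.
        String[] header = lines.get(0).split(",", -1);
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < header.length; i++) {
            index.put(header[i].trim(), i);
        }
        String[] names = {"source", "target", "confidence", "relation"};
        for (String line : lines.subList(1, lines.size())) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = line.split(",", -1);
            String[] row = new String[names.length];
            for (int j = 0; j < names.length; j++) {
                Integer i = index.get(names[j]);
                row[j] = (i != null && i < fields.length) ? fields[i].trim() : null;
            }
            rows.add(row);
        }
        return rows;
    }
}
```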
-
getRecordEntry
-
getRecordEntry
-
parseCSVWithoutHeader
Parse alignment from CSV (comma separated file) without header. The order is: (source, target, confidence, relation). The extensions are not parsed.
- Parameters:
file
- the file to read from
- Returns:
- the parsed alignment
- Throws:
IOException
- thrown if some io error occurs.
-
parseCSVWithoutHeader
Parse alignment from CSV (comma separated file) without header. The order is: (source, target, confidence, relation). The extensions are not parsed.
- Parameters:
file
- the file to read from
delimiter
- the delimiter to use
- Returns:
- the parsed alignment
- Throws:
IOException
- thrown if some io error occurs.
-
parseTSV
Parse alignment from TSV (tab separated file) without header. The columns are source \t target \t confidence \t relation. The relation and confidence can be left out as long as the position stays the same (e.g. relation is always on index 3). The files are usually generated by the PARIS matcher.
- Parameters:
file
- the file to read from
- Returns:
- the parsed alignment
- Throws:
IOException
- thrown if some io error occurs.
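
The positional rule for TSV files (confidence at index 2, relation at index 3, either one optional) can be illustrated with the sketch below. `TsvSketch`, `Row` and `read` are hypothetical names, and the fallback values `1.0` and `"="` for omitted fields are assumptions of this example; the page does not state which defaults the library applies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of positional TSV reading; defaults are assumed, not documented.
class TsvSketch {

    static final class Row {
        final String source;
        final String target;
        final double confidence;
        final String relation;

        Row(String source, String target, double confidence, String relation) {
            this.source = source;
            this.target = target;
            this.confidence = confidence;
            this.relation = relation;
        }
    }

    /** Reads lines of the form source \t target [\t confidence [\t relation]]. */
    static List<Row> read(List<String> lines) {
        List<Row> rows = new ArrayList<>();
        for (String line : lines) {
            // Limit -1 keeps empty fields, so an omitted confidence still leaves relation at index 3.
            String[] f = line.split("\t", -1);
            if (f.length < 2) {
                continue; // source and target are required
            }
            double confidence = (f.length > 2 && !f[2].trim().isEmpty())
                    ? Double.parseDouble(f[2].trim())
                    : 1.0; // assumed default
            String relation = (f.length > 3 && !f[3].trim().isEmpty())
                    ? f[3].trim()
                    : "="; // assumed default
            rows.add(new Row(f[0].trim(), f[1].trim(), confidence, relation));
        }
        return rows;
    }
}
```

A line such as `e\tf\t\t>` keeps the relation at index 3 even though the confidence field is empty, which is the situation the description refers to.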
-