java.lang.Object
de.uni_mannheim.informatik.dws.melt.yet_another_alignment_api.AlignmentParser

public class AlignmentParser extends Object
The AlignmentParser parses XML files that follow the conventions of the Alignment Format. EDOAL is currently not supported.
Author:
Sven Hertling, Jan Portisch
  • Field Details

    • LOGGER

      private static final org.slf4j.Logger LOGGER
      Default logger.
    • threadLocal

      private static final ThreadLocal<SAXParser> threadLocal
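      The field type suggests that one SAXParser is kept per thread, since JAXP parser instances are not guaranteed to be thread-safe. The following is a minimal sketch of that general pattern using only the JDK. The class name PerThreadSaxParser and its configuration are assumptions for illustration and are not taken from the MELT sources.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

// Hypothetical illustration class, not part of MELT.
public class PerThreadSaxParser {

    // Each thread lazily creates its own parser on first access,
    // so no parser instance is ever shared between threads.
    private static final ThreadLocal<SAXParser> PARSER = ThreadLocal.withInitial(() -> {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // assumed setting for this sketch
            return factory.newSAXParser();
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Could not create SAX parser", e);
        }
    });

    /** Returns the parser that belongs to the calling thread. */
    public static SAXParser get() {
        return PARSER.get();
    }
}
```

      Repeated calls to get() from the same thread return the same parser instance, which avoids the cost of creating a new parser for every file.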
  • Constructor Details

    • AlignmentParser

      public AlignmentParser()
  • Method Details

    • parseFromText

      public static Alignment parseFromText(String text) throws SAXException, IOException
      Parses an alignment from its textual (XML) representation.
      Parameters:
      text - The alignment XML as a string.
      Returns:
      Parsed alignment instance.
      Throws:
      SAXException - Parsing exception.
      IOException - IO exception.
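      To make the expected input concrete, the sketch below reads a small Alignment Format document from a string with the JDK's SAX parser and collects the source, target, confidence and relation of each Cell. It is an independent illustration, not the MELT implementation: the class AlignmentTextSketch, the method readCells and the example URIs are all hypothetical, and only the element names (Cell, entity1, entity2, measure, relation) come from the Alignment Format.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class, not part of MELT.
public class AlignmentTextSketch {

    /** Returns one {source, target, confidence, relation} array per Cell element. */
    public static List<String[]> readCells(String xml) throws SAXException, IOException {
        List<String[]> cells = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private String[] current;                          // cell being read, or null
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                text.setLength(0);                             // start collecting fresh text
                if (qName.equals("Cell")) {
                    current = new String[4];
                } else if (current != null && qName.equals("entity1")) {
                    current[0] = atts.getValue("rdf:resource");
                } else if (current != null && qName.equals("entity2")) {
                    current[1] = atts.getValue("rdf:resource");
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (current == null) {
                    return;                                    // outside of any Cell
                }
                if (qName.equals("measure")) {
                    current[2] = text.toString().trim();
                } else if (qName.equals("relation")) {
                    current[3] = text.toString().trim();
                } else if (qName.equals("Cell")) {
                    cells.add(current);
                    current = null;
                }
            }
        };
        try {
            // The default factory is not namespace-aware, so qName is the raw tag name.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No SAX parser available", e);
        }
        return cells;
    }

    /** A minimal example document with a single correspondence (made-up URIs). */
    public static final String SAMPLE =
        "<rdf:RDF xmlns=\"http://knowledgeweb.semanticweb.org/heterogeneity/alignment\"\n"
      + "         xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n"
      + "  <Alignment>\n"
      + "    <map>\n"
      + "      <Cell>\n"
      + "        <entity1 rdf:resource=\"http://example.org/a#Person\"/>\n"
      + "        <entity2 rdf:resource=\"http://example.org/b#Human\"/>\n"
      + "        <measure>0.9</measure>\n"
      + "        <relation>=</relation>\n"
      + "      </Cell>\n"
      + "    </map>\n"
      + "  </Alignment>\n"
      + "</rdf:RDF>\n";
}
```

      With the library on the classpath, the equivalent call would presumably be AlignmentParser.parseFromText(xml), which returns an Alignment instead of raw string arrays.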
    • parse

      public static Alignment parse(String uri) throws SAXException, IOException
      Parses the alignment file referenced by a String URI.
      Parameters:
      uri - URI as String. Alternatively, a file path can be specified.
      Returns:
      Parsed alignment instance.
      Throws:
      SAXException - Parsing exception.
      IOException - IO exception.
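      This page does not say how the method tells a URI from a file path. One possible approach, shown purely as an assumption, is to accept the argument as a URI only when it has a scheme of more than one letter and to treat everything else as a file path. The class LocationSketch and the method toUri are hypothetical names.

```java
import java.io.File;
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical illustration class, not part of MELT.
public class LocationSketch {

    /**
     * Interprets the argument as a URI if it parses and has a scheme longer
     * than one character, otherwise as a path in the local file system.
     * The length check keeps Windows drive letters such as "C:" from being
     * mistaken for a URI scheme.
     */
    public static URI toUri(String location) {
        try {
            URI candidate = new URI(location);
            String scheme = candidate.getScheme();
            if (scheme != null && scheme.length() > 1) {
                return candidate;
            }
        } catch (URISyntaxException e) {
            // Not a valid URI (for example, it contains spaces): treat it as a file path.
        }
        return new File(location).toURI();
    }
}
```

      Under this heuristic, "http://example.org/alignment.rdf" is kept as is, while "alignment.rdf" or "my alignment.rdf" resolve to file: URIs relative to the working directory.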
    • parse

      public static Alignment parse(URI uri) throws SAXException, IOException
      Parses the alignment file referenced by a URI.
      Parameters:
      uri - URI of the alignment file.
      Returns:
      Parsed alignment instance.
      Throws:
      SAXException - Parsing exception.
      IOException - IO exception.
    • parse

      public static Alignment parse(URL url) throws SAXException, IOException
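      Parses the alignment file referenced by a URL.
      Parameters:
      url - URL of the alignment file.
      Returns:
      Parsed alignment instance.
      Throws:
      SAXException - Parsing exception.
      IOException - IO exception.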
    • parse

      public static Alignment parse(File fileToBeParsed) throws SAXException, IOException
      Parses the given file as alignment.
      Parameters:
      fileToBeParsed - The file that shall be parsed.
      Returns:
      Parsed alignment instance.
      Throws:
      SAXException - Parsing exception.
      IOException - IO exception.
    • parse

      public static Alignment parse(InputStream s) throws SAXException, IOException
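      Parses an alignment from the given input stream.
      Parameters:
      s - Input stream that provides the alignment XML.
      Returns:
      Parsed alignment instance.
      Throws:
      SAXException - Parsing exception.
      IOException - IO exception.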
    • parse

      public static void parse(InputStream s, Alignment m) throws SAXException, IOException
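      Parses the given input stream into the supplied alignment instance.
      Parameters:
      s - Input stream that provides the alignment XML.
      m - Alignment instance that receives the parsed content.
      Throws:
      SAXException - Parsing exception.
      IOException - IO exception.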
    • getInputStreamFromURL

      public static InputStream getInputStreamFromURL(URL url) throws IOException
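      Returns an input stream for the content at the given URL.
      Parameters:
      url - URL to read from.
      Returns:
      Input stream for the URL content.
      Throws:
      IOException - IO exception.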
    • parseCSV

      public static Alignment parseCSV(File file) throws IOException
      Parses an alignment from a CSV (comma-separated values) file with the header (source, target, confidence, relation). Extensions are not parsed.
      Parameters:
      file - The file to read from.
      Returns:
      The parsed alignment.
      Throws:
      IOException - Thrown if an I/O error occurs.
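      As an illustration of the expected file layout, the sketch below reads such a CSV with plain JDK code and looks each column up by its header name. It is a simplified stand-in: the library itself appears to rely on Apache Commons CSV (see getRecordEntry below), whereas this sketch splits on commas and does not handle quoted fields. The class CsvHeaderSketch, the method read and the decision to tolerate reordered columns are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not part of MELT.
public class CsvHeaderSketch {

    private static final String[] WANTED = {"source", "target", "confidence", "relation"};

    /**
     * Reads rows of {source, target, confidence, relation}; the first line is the header.
     * A column that is missing from the header or from a row yields null.
     */
    public static List<String[]> read(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        if (lines.isEmpty()) {
            return rows;
        }
        // Remember the position of every header name.
        List<String> header = new ArrayList<>();
        for (String name : lines.get(0).split(",", -1)) {
            header.add(name.trim());
        }
        for (String line : lines.subList(1, lines.size())) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = line.split(",", -1); // -1 keeps trailing empty fields
            String[] row = new String[WANTED.length];
            for (int i = 0; i < WANTED.length; i++) {
                int col = header.indexOf(WANTED[i]);
                row[i] = (col >= 0 && col < fields.length) ? fields[col].trim() : null;
            }
            rows.add(row);
        }
        return rows;
    }
}
```

      The lines could come from java.nio.file.Files.readAllLines(file.toPath()). Looking columns up by name rather than by position means that extra columns are simply ignored, which matches the note that extensions are not parsed.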
    • getRecordEntry

      private static String getRecordEntry(org.apache.commons.csv.CSVRecord record, String name)
    • getRecordEntry

      private static String getRecordEntry(org.apache.commons.csv.CSVRecord record, int index)
    • parseCSVWithoutHeader

      public static Alignment parseCSVWithoutHeader(File file) throws IOException
      Parses an alignment from a CSV (comma-separated values) file without a header. The column order is (source, target, confidence, relation). Extensions are not parsed.
      Parameters:
      file - The file to read from.
      Returns:
      The parsed alignment.
      Throws:
      IOException - Thrown if an I/O error occurs.
    • parseCSVWithoutHeader

      public static Alignment parseCSVWithoutHeader(File file, char delimiter) throws IOException
      Parses an alignment from a delimiter-separated file without a header, using the given delimiter instead of a comma. The column order is (source, target, confidence, relation). Extensions are not parsed.
      Parameters:
      file - The file to read from.
      delimiter - The delimiter that separates the columns.
      Returns:
      The parsed alignment.
      Throws:
      IOException - Thrown if an I/O error occurs.
    • parseTSV

      public static Alignment parseTSV(File file) throws IOException
      Parses an alignment from a TSV (tab-separated values) file without a header. The columns are source \t target \t confidence \t relation. The relation and the confidence can be left out as long as the positions stay the same (e.g. the relation is always at index 3). Such files are usually generated by the PARIS matcher.
      Parameters:
      file - The file to read from.
      Returns:
      The parsed alignment.
      Throws:
      IOException - Thrown if an I/O error occurs.
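      The positional rule can be sketched as follows: a row may stop after the target, or leave the confidence field empty while still providing the relation at index 3. The sketch uses only the JDK. The class TsvRowSketch, the method parseRow and especially the fallback values (confidence 1.0, relation "=") are assumptions, since this page does not state which defaults the library applies.

```java
// Hypothetical illustration class, not part of MELT.
public class TsvRowSketch {

    // Assumed defaults for omitted columns; the real defaults are not documented here.
    public static final String DEFAULT_CONFIDENCE = "1.0";
    public static final String DEFAULT_RELATION = "=";

    /** Splits one TSV line into {source, target, confidence, relation}. */
    public static String[] parseRow(String line) {
        String[] fields = line.split("\t", -1); // -1 keeps empty fields in place
        if (fields.length < 2) {
            throw new IllegalArgumentException("Expected at least source and target: " + line);
        }
        // Confidence is always read from index 2 and relation from index 3,
        // so an empty confidence field does not shift the relation.
        String confidence = (fields.length > 2 && !fields[2].trim().isEmpty())
                ? fields[2].trim() : DEFAULT_CONFIDENCE;
        String relation = (fields.length > 3 && !fields[3].trim().isEmpty())
                ? fields[3].trim() : DEFAULT_RELATION;
        return new String[] {fields[0].trim(), fields[1].trim(), confidence, relation};
    }
}
```

      For example, the line "a \t b \t \t <" (with an empty third field) keeps "<" as the relation and falls back to the assumed default confidence.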