AnCoraPipe: A tool for multilevel annotation

Provided by: STEL
Learn more at: https://stel2.ub.edu
Transcript and Presenter's Notes

1
AnCoraPipe A tool for multilevel annotation
  • Manu Bertran, Bàrbara Soriano, Oriol Borrega,
    Marta Recasens
  • Universitat de Barcelona
  • CBA 2008

2
Contents
  • Data format
  • Annotation interface
    • Installation
    • Description
  • Future improvements

3
Data format
  • Data are stored in UTF-8 encoded XML format.
  • Design principles:
    • Reduced inventory of node names.
    • Attributes are atomic.
    • Attributes describe only the node they depend on.
    • There is no redundancy in the data.
  • Adding new annotation levels/values is fast and
    easy.
  • Annotation time has been reduced by a full 50%.
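
The design principles above can be illustrated with a minimal sketch. The XML fragment, the element and attribute names (sentence, node, word, pos, and so on) and the helper functions are all hypothetical and chosen for illustration; they are not the actual AnCoraPipe schema or code. "Atomic" is assumed here to mean one value per attribute, with no separator characters.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: names and values are illustrative only,
# not the real AnCoraPipe schema.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<sentence>
  <node cat="np" func="subject">
    <node word="La" pos="determiner"/>
    <node word="casa" pos="noun"/>
  </node>
  <node cat="vp">
    <node word="és" pos="verb"/>
  </node>
</sentence>"""

# Characters assumed to signal several values packed into one attribute.
SEPARATORS = set(" |,;")


def element_names(root):
    """Return the set of element names used in the tree."""
    return {el.tag for el in root.iter()}


def non_atomic_attributes(root):
    """List (element, attribute, value) triples whose value looks composite."""
    return [
        (el.tag, name, value)
        for el in root.iter()
        for name, value in el.attrib.items()
        if any(ch in SEPARATORS for ch in value)
    ]


def add_level(root, attribute, values_by_word):
    """Add a new annotation level as one extra attribute on matching nodes."""
    for el in root.iter("node"):
        word = el.get("word")
        if word in values_by_word:
            el.set(attribute, values_by_word[word])


# Parse from UTF-8 bytes, matching the encoding named in the declaration.
root = ET.fromstring(SAMPLE.encode("utf-8"))
print(sorted(element_names(root)))
print(non_atomic_attributes(root))

# A new level ("ne" is a made-up name) touches only the nodes it describes.
add_level(root, "ne", {"casa": "other"})
```

Under these assumptions, a new annotation level is just one more attribute on the nodes it concerns, so existing levels do not need to change.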

4
Annotation interface
  • Installation requirements:
    • Java 1.5 or higher.
    • SWT Java graphics library (included in our package
      for Windows XP).
    • Otherwise, the graphics library can be obtained
      with the free Eclipse package.

5
Annotation interface
  • Description
  • The interface is organized in a series of screens
    where specific data for each annotation level are
    shown.
  • The interface highlights all nodes capable of
    being annotated, and the sentences which have not
    been marked yet, in order to make the annotators'
    work easier.
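
As a rough sketch of the kind of check behind such highlighting, the code below lists the sentences that still lack a value for a given annotation level. It is an assumption-laden illustration, not the tool's own (Java) implementation: the document layout, the `id`, `word`, and `pos` attributes, and both function names are hypothetical, and an "annotatable" node is assumed to be any node carrying a word.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-sentence document; names are illustrative only.
DOC = b"""<document>
  <sentence id="s1">
    <node word="Plou" pos="verb"/>
  </sentence>
  <sentence id="s2">
    <node word="La" pos="determiner"/>
    <node word="casa"/>
  </sentence>
</document>"""


def annotatable_nodes(sentence):
    """Nodes carrying a word; assumed here to be the annotatable units."""
    return [n for n in sentence.iter("node") if n.get("word") is not None]


def pending_sentences(root, level):
    """Ids of sentences with at least one annotatable node missing `level`."""
    return [
        s.get("id")
        for s in root.iter("sentence")
        if any(n.get(level) is None for n in annotatable_nodes(s))
    ]


root = ET.fromstring(DOC)
print(pending_sentences(root, "pos"))
```

An interface could use a list like this to decide which sentences to flag for the annotator at each level.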

6
Annotation interface
  • The system allows for the addition of external
    tools for specific annotation levels:
    • WordNet
    • Coreference

7
Future improvements
  • Making the tool available over the Internet and
    adapting it to Linux and Mac environments.
  • Implementing corpus query methods from the
    interface.
  • Implementing statistical corpus description
    methods.
  • Adding tools to handle verbal and nominal
    lexicons.
  • Adding semiautomatic methods and machine learning
    functions for the partial annotation of corpora.

8
  • http://clic.ub.edu/
  • http://clic.ub.edu/ancora/
  • http://clic.ub.edu/mbertran/tbfeditor/