European Language Resources Association: ELRA's Services 15 Years on... Sharing and Anticipating the Community. Victoria Arranz & Khalid Choukri

Learn more at: http://www.lrec-conf.org


European Language Resources Association
ELRA's Services 15 Years on... Sharing
and Anticipating the Community
Victoria Arranz & Khalid Choukri
ELRA/ELDA, 55 Rue Brillat-Savarin, F-75013 Paris, France
Tel. +33 1 43 13 33 33 -- Fax +33 1 43 13 33 30
Email: {arranz, choukri}@elda.org
http://www.elra.info/ or http://www.elda.org/
  • Before ELRA was established
  • once upon a time
  • rationale behind its foundation and its Mission(s)
  • ELRA Activities
  • Identification and Distribution
  • Production of LRs
  • Evaluation of Human Language Technologies
  • Dissemination
  • New Visions
  • Large International Cooperation
  • Advocating for a Backbone of Language Resources
    and HLT evaluation, Open and Shared

ELRA's Foundation and Mission
  • Created in February 1995
  • Funding from the European Commission: 3 years
  • Main rationale: bring into focus the need for a
    mutual exchange and use of LRs
  • A Repository Center
  • Technical and logistic issues
  • Commercial issues (prices, fees, royalties)
  • Legal issues (licensing, IPR)
  • Information Dissemination
  • Infrastructure for the evaluation of Human
    Language Technologies: providing resources, tools,
    methodologies, logistics, exit strategies /
    capitalization on evaluation packages
  • Operational body: ELDA

Identification and Distribution
  • LR licensing: priority to simplify the
    relationship between providers and users ->
    drafted generic contracts
  • Contracts
  • establish usage: research / technology
  • protect data owners and their LRs
  • available on www.elda.org/article1.html
  • designed before CC licenses: future mergers or
    joint design?
  • 500 have been signed

Contract model
LREC-2010 Workshop on Legal Issues
Identification and Distribution
  • More than 1,000 LRs catalogued and available:
    ELRA Catalogue of Language Resources

Number of LRs within the ELRA Catalogue over the
Distribution of Resources vs Usage
  • ELRA has distributed over 3,500 LRs
  • 48% research in academia
  • 37% research and technology development in
  • 16% evaluation
  • A further 1,500 copies distributed within
    evaluation campaigns

The Universal Catalogue & the ELRA Catalogue
  • Over 1,700 LRs compiled in the Universal
    Catalogue (http://universal.elra.org)
  • Antechamber of the ELRA Catalogue
  • Window-shopping nature: lets users learn about
    existing LRs. Future availability? The ELDA team
    helps to clarify the legal situation
  • New feature: simplified collaboration form
    (following users' feedback)
  • Also related: the LREC Map initiative, an LR
    identification tool used at LREC submission time
    (ELRA, FlaReNet). See:
  • Calzolari, N., Soria, C., Del Gratta, R.,
    Goggi, S., Quochi, V., Russo, I., Choukri, K.,
    Mariani, J. and Piperidis, S. The LREC 2010
    Resource Map. LREC 2010.

Identification and Distribution The ELRA
  • Two interesting novelties
  • ELRA's involvement in evaluation
  • Distribution of evaluation packages (with the
    definition of a new type of use/agreement, the
    Evaluation Packages End-User Agreement, and a
    new pricing policy)
  • Technology evaluation: products, systems and
  • ELRA's Catalogue of LRs for R&D
  • Easy and fast access to LRs dedicated to
    academic research at an affordable price

Production of LRs
  • Production or commissioning of production
  • Production
  • Within the framework of European and
    international projects: NEMLAR, Neologos,
    OrienTel, Speecon, C-ORAL-ROM, CHIL, TC-STAR,
  • In support of companies or institutions
    (sometimes confidential)
  • ELRA's advisory role
  • PCom
  • VCom

Production of LRs
  • Current technological development demands more
    ambitious resources: size, type of linguistic
    information, quality of the end result
  • These are the main objectives for ELRA and have
  • LRs compiled in more than 25 languages
  • High-quality LRs: strict validation
  • Involved in every stage of production

Production of LRs
  • Production through ELDA:
  • (i) speech data for a variety of languages
    (e.g., Hindi, Korean, Colloquial Arabic(s),
    Canadian French, US Spanish, etc.),
  • (ii) Broadcast News Speech Corpus for Arabic,
    French, Spanish, etc.,
  • (iii) corpora for languages such as Catalan,
    Kazakh, Romanian, Turkish, etc.,
  • (iv) aligned textual corpora for Machine
    Translation in languages such as Arabic,
    Chinese, English, French, German, Spanish, etc.,
  • (v) video annotations with audio transcriptions,
  • (vi) collections of SMS data,
  • (vii) recordings of Wizard-of-Oz based data for
    dialogue systems, etc., covering different types
    of LRs and different technologies

HLT Evaluation
  • ELRA has ensured infrastructure for technology
    evaluation, also with web-based service platforms
  • Through participation in European projects
    (CHIL, TC-STAR, CLEF) and in French national
    programmes (Technolangue)
  • Collaborative and customized services for HLT
  • on-demand evaluation services
  • customized LRs for laboratories and/or companies
  • Important end result: evaluation packages
    compiled and made available
  • they contain required DBs, tools, methodologies
    and protocols to conduct comparable experiments

HLT Evaluation
  • More than 20 technologies and over 40 evaluation
    packages are available in the ELRA Catalogue.
  • Some covered technologies:
  • Text processing: Information Retrieval, Question
    Answering, Machine Translation, Automatic
    Summarization, Parsing, Multilingual Text
    Alignment, Terminology Extraction,
  • Speech processing: Automatic Speech Recognition,
    Speech Synthesis, Speech Translation, Broadcast
    News Transcription, Acoustic Person Tracking,
    Acoustic Speaker Identification, Speech Activity
  • Multi-modal interfaces: Multimodal Person
    Tracking, Audiovisual Speech Recognition,
    Multimodal Person Identification.
  • Reference Portal for HLT Evaluation

Dissemination
  • ELRA has increased its activities for the
    dissemination of information on LRs
  • Speakers Corner for researchers and
    developers in the field
  • Events
  • Language Resources and Evaluation Conference (7th
    edition): http://www.lrec-conf.org
  • LangTech
  • European LR and Technologies Forum (within
  • MEDAR Conferences
  • Workshops on less-resourced languages (within
  • Language Resources and Evaluation Journal
    (Springer): http://www.springerlink.com/
  • Newsletter
  • Members News
  • BLARK web site (http://www.blark.org/), other web

Today's Context...a New Landscape?
  • New Visions... Part of ELRA's success and
    evolution has involved facing and anticipating
    the realities and needs of the community
  • History
  • Why the community established agencies like LDC,
    ELRA and some others
  • How things evolved and saw new players (BAS,
    CSLU, LDC India, GSK, etc.) emerging
  • What their missions were and what they are today
  • Is there a role for such organizations?
  • Impact of new Instruments on the field
  • How did the web and Internet shake up the whole
  • (internet .... web ............. web 2.0)
  • New facilities (personal pages, free/easy web
  • Ease of use of new media, new means to store
    and share resources
  • But did our LR consumption/behaviour change
    that much?

New Layered Approach with Distributed Services
New Visions...
  • This new vision is being implemented in
    collaboration with a number of experts
  • International initiatives working to design this
    new vision
  • PANACEA (www.panacea-lr.eu)
  • Defining new legal frameworks...
  • Cost-effectiveness in LR production and
  • Web-based factories
  • META-NET (www.meta-net.eu/)
  • NoE towards an open, integrated, secured and
    interoperable exchange
  • Towards securing the largest Global Catalogue of
    LRs: LDC, NICT, OLAC, etc. harmonise their
    catalogues with the Universal Catalogue. The LREC
    Map and other future Maps (from other conferences)

Concluding Remarks
  • Overview of the latest developments in ELRA's
    services around:
  • Identification & Distribution
  • Evaluation
  • Production
  • Dissemination
  • From early archiving and distribution to LR
    identification, collection, validation and
    distribution platform....
  • ...with clear and well-established legal
  • ...enhancing work on evaluation (with new
    techniques, covering more technologies and
    languages, providing more evaluation packages,
    setting up the HLT Eval portal....)
  • ... increasing work on LR production and its
  • ...big push to dissemination...

Concluding Remarks
  • ...encouraging international cooperation...
  • ...after years of consolidation...ELRA is looking
    forward to the new challenges emerging from new
  • ...a new interoperable exchange is certainly on

Thank you for your attention