Diapositive 1 - PowerPoint PPT Presentation

About This Presentation
Title:

Diapositive 1

Description:

European Language Resources Association: ELRA's Services 15 Years on... Sharing and Anticipating the Community. Victoria Arranz & Khalid Choukri – PowerPoint PPT presentation

Slides: 24
Provided by: Khal47
Learn more at: http://www.lrec-conf.org

Transcript and Presenter's Notes

Title: Diapositive 1


1
European Language Resources Association
ELRA's Services 15 Years on... Sharing
and Anticipating the Community
Victoria Arranz & Khalid Choukri
ELRA/ELDA, 55 Rue Brillat-Savarin, F-75013 Paris, France
Tel. +33 1 43 13 33 33 -- Fax +33 1 43 13 33 30
Email: arranz, choukri_at_elda.org
http://www.elra.info/ or http://www.elda.org/
2
Overview
  • Before ELRA was established
  • Once upon a time
  • Rationale behind its foundation and its mission(s)
  • ELRA Activities
  • Identification and Distribution
  • Production of LRs
  • Evaluation of Human Language Technologies
  • Dissemination
  • New Visions
  • Large International Cooperation
  • Advocating for a Backbone of Language Resources
    and HLT evaluation, Open and Shared

3
ELRA's Foundation & Mission
  • Created in February 1995
  • Funding from the European Commission (3 years)
  • Main rationale: bring into focus the need for a
    mutual exchange and use of LRs
  • A Repository Center
  • Technical & logistic issues
  • Commercial issues (prices, fees, royalties)
  • Legal issues (licensing, IPR)
  • Information dissemination
  • Infrastructure for the evaluation of Human
    Language Technologies: providing resources, tools,
    methodologies, logistics; exit strategies /
    capitalization on evaluation packages
  • Operational body: ELDA

4
Activities
5
Identification and Distribution
  • LR licensing: priority to simplify the
    relationship between providers and users ->
    drafted generic contracts
  • Contracts
  • establish usage: research / technology
    development
  • protect data owners and their LRs
  • available on www.elda.org/article1.html
  • designed before CC licenses: future merging or
    joint design?
  • 500 have been signed

Contract model
LREC-2010 Workshop on Legal Issues
6
Identification and Distribution
  • More than 1,000 LRs catalogued and available:
    ELRA Catalogue of Language Resources
    http://catalog.elra.org

Number of LRs within the ELRA Catalogue over the
years
7
Distribution of Resources vs Usage
  • ELRA has distributed over 3,500 LRs
  • 48% research in academia
  • 37% research and technology development in
    industry
  • 16% evaluation
  • A further 1,500 copies distributed within
    evaluation campaigns

8
Distribution of Resources vs Usage
9
The Universal Catalogue & the ELRA Catalogue
10
The Universal Catalogue & the ELRA Catalogue
  • Over 1,700 LRs compiled in the Universal
    Catalogue: http://universal.elra.org
  • Antechamber of the ELRA Catalogue
  • Window-shopping nature: lets users find out that
    LRs exist. Future availability? The ELDA team
    helps clarify the legal situation
  • New feature: simplified collaboration form
    (following users' feedback)
  • Also related: the LREC Map initiative, an LR
    identification tool used at LREC submission time
    (ELRA & FlaReNet). See:
  • Calzolari, N., Soria, C., Del Gratta, R.,
    Goggi, S., Quochi, V., Russo, I., Choukri, K.,
    Mariani, J. and Piperidis, S. The LREC 2010
    Resource Map. LREC 2010.

11
Identification and Distribution: The ELRA
Catalogue
  • Two interesting novelties
  • ELRA's involvement in evaluation
  • Distribution of evaluation packages (with
    definition of a new type of use/agreement, the
    Evaluation Packages End-User Agreement, and a new
    pricing policy)
  • Technology evaluation: products, systems and
    applications
  • ELRA's Catalogue of LRs for R&D
  • Easy and fast access to LRs dedicated to
    academic research at an affordable price:
    http://catalogue.elra.info/retd

12
Production of LRs
  • Production or commissioning production
  • Production
  • Within the framework of European and
    international projects: NEMLAR, Neologos,
    OrienTel, Speecon, C-ORAL-ROM, CHIL, TC-STAR,
    ESTER, MEDIA, MEDAR, PASSAGE, etc.
  • In support of companies or institutions
    (sometimes confidential)
  • ELRA's advisory role
  • PCom
  • VCom

13
Production of LRs
  • Current technological development demands more
    ambitious resources: size, type of linguistic
    information, quality of the end result
  • These are the main objectives for ELRA and have
    triggered:
  • LRs compiled in more than 25 languages
  • High-quality LRs: strict validation
  • Involved in every stage of production

14
Production of LRs
  • Production through ELDA:
  • (i) speech data for a variety of languages
    (e.g., Hindi, Korean, Colloquial Arabic(s),
    Canadian French, US Spanish, etc.),
  • (ii) Broadcast News Speech Corpus for Arabic,
    French, Spanish, etc.,
  • (iii) corpora for languages such as Catalan,
    Kazakh, Romanian, Turkish, etc.,
  • (iv) aligned textual corpora for Machine
    Translation in languages such as Arabic, Chinese,
    English, French, German, Spanish, etc.,
  • (v) video annotations with audio transcriptions,
  • (vi) collections of SMS data,
  • (vii) recordings of Wizard-of-Oz based data for
    dialogue systems, etc., covering different types
    of LRs and different technologies

15
HLT Evaluation
  • ELRA has ensured infrastructure for technology
    evaluation, also with web-based service platforms
  • Through participation in European projects
    (CHIL, TC-STAR, CLEF) and in French national
    programmes (Technolangue)
  • Collaborative and customized services for HLT
    Evaluation
  • on-demand evaluation services
  • customized LRs for laboratories and/or companies
  • Important end result: evaluation packages
    compiled and made available
  • they contain required DBs, tools, methodologies
    and protocols to conduct comparable experiments

16
HLT Evaluation
  • More than 20 technologies and over 40 evaluation
    packages are available in the ELRA Catalogue.
  • Some covered technologies:
  • Text processing: Information Retrieval, Question
    Answering, Machine Translation, Automatic
    Summarization, Parsing, Multilingual Text
    Alignment, Terminology Extraction, etc.
  • Speech processing: Automatic Speech Recognition,
    Speech Synthesis, Speech Translation, Broadcast
    News Transcription, Acoustic Person Tracking,
    Acoustic Speaker Identification, Speech Activity
    Detection, etc.
  • Multi-modal interfaces: Multimodal Person
    Tracking, Audiovisual Speech Recognition,
    Multimodal Person Identification.
  • Reference Portal for HLT Evaluation:
    http://hlt-evaluation.org

17
Dissemination
  • ELRA has increased its activities for the
    dissemination of information on LRs
  • Speakers' Corner for researchers and
    developers in the field
  • Events
  • Language Resources and Evaluation Conference (7th
    edition): http://www.lrec-conf.org
  • LangTech
  • European LR and Technologies Forum (within
    FlaReNet)
  • MEDAR Conferences
  • Workshops on less-resourced languages (within
    LTC09)
  • Language Resources and Evaluation Journal
    (Springer): http://www.springerlink.com/
  • Newsletter
  • Members' News
  • BLARK web site (http://www.blark.org/) and other
    web sites

18
Today's Context...a New Landscape?
  • New Visions... Part of ELRA's success and
    evolution has implied facing and anticipating
    the realities/needs of the community
  • History
  • Why the community established agencies like LDC,
    ELRA and some others
  • How things evolved and saw new players (BAS,
    CSLU, LDC India, GSK, etc.) emerging
  • What their missions were and what they are today
  • Is there a role for such organizations?
  • Impact of new Instruments on the field
  • How did the web and the Internet shake up the
    whole structure?
  • (Internet ... web ... web 2.0)
  • New facilities (personal pages, free/easy web
    hosting)
  • Ease of use of new media, new means to store and
    share resources
  • But did our LR consumption/behaviour change
    that much?

19
New Layered Approach with Distributed Services
Functionalities
20
New Visions...
  • This new vision is being implemented in
    collaboration with a number of experts
  • International initiatives working to design this
    new vision
  • PANACEA (www.panacea-lr.eu)
  • Defining new legal frameworks...
  • Cost-effectiveness in LR production and
    automation
  • Web-based factories
  • META-NET (www.meta-net.eu/)
  • NoE towards an open, integrated, secured and
    interoperable exchange
  • Towards securing the largest Global Catalogue of
    LRs: LDC, NICT, OLAC, etc. harmonise their
    catalogues with the Universal Catalogue. The LREC
    Map and other future Maps (from other conferences)

21
Concluding Remarks
  • Overview of the latest developments in ELRA's
    services around:
  • Identification & Distribution
  • Evaluation
  • Production
  • Dissemination
  • From early archiving and distribution to LR
    identification, collection, validation and
    distribution platform....
  • ...with clear and well-established legal
    frameworks...
  • ...enhancing work on evaluation (with new
    techniques, covering more technologies and
    languages, providing more evaluation packages,
    setting up the HLT Eval portal....)
  • ... increasing work on LR production and its
    coverage
  • ...big push to dissemination...

22
Concluding Remarks
  • ...encouraging international cooperation...
  • ...after years of consolidation...ELRA is looking
    forward to the new challenges emerging from new
    trends...
  • ...a new interoperable exchange is certainly in
    sight.

23
Thank you for your attention