Resource Discovery (metadata and searching)

Provided by: scottf3
Learn more at: http://emeld.org

1
Resource Discovery (metadata and searching)
  • Working Group Report

2
Issues discussed
  • What kinds of resources should EMELD provide
    search services for?
  • What should the design be for an EMELD search
    interface?
  • How can EMELD get good metadata into its search
    database?
  • What level of metadata should be exposed?

3
What resources?
  • Anything that might be of value to a linguist
    working on endangered languages.
  • Language data
  • Tools
  • Advice (including reviews)
  • People
  • "Gateway" websites

4
What resources?
  • But there's no reason to rely on this working
    group to decide "what".
  • A questionnaire distributed via Linguist could
    ask the community instead.

5
What resources?
  • Two kinds of best practice resources
  • Resources with best practice metadata
  • These resources can be discovered
  • Non-digital resources encouraged
  • Digital resources discouraged, but allowed

6
What resources?
  • Best practice digital resources
  • All digital resources encouraged to be of this
    type
  • Benefits
  • Enhanced search features (due to document
    interoperability)
  • Special "BP globe of approval"

7
What resources?
  • Side Note
  • Best Practice "approval" system should be tied
    into a larger system through which digital
    resources could be listed as "publications"
  • A topic for another working group? (Perhaps OLAC?)

8
What resources?
  • Issues which need to be addressed
  • Metadata for resources interesting to linguists
    but which are not linguistic data
  • Needed: Best practice metadata standards for
  • Tools
  • Advice
  • People
  • ...
  • Test: EMELD could see how it would classify
    everything in BPU.

9
How to search?
  • Assumption: Metadata and data are distributed
  • Query Language
  • Metadata: OLAC standard
  • Data from interoperable documents: A new standard

10
How to search?
  • Resource Query Language: Ideal
  • A generalized query protocol used across the
    linguistics community
  • A series of "methods" (to be defined) can be
    called on these resources to retrieve structured
    linguistic data matching query parameters

11
How to search?
  • Problems implementing the ideal
  • No clear sense as to what "methods" are needed.
  • One solution: Examine results from questionnaire

12
How to search?
  • Problems implementing the ideal
  • Very few repositories allow their data to be
    accessed in a generalized way
  • First step: Encourage documentation of repository
    data access systems and develop a metadata
    standard for this

13
How to search?
  • Long term implementation issues
  • An OLAC Query Language Protocol
  • A well-defined linguistic query language
  • A system for "packaging" queries
  • Linguistic data search registry
  • Linguistic sites register that they are data
    access sites
  • They also register implemented search methods
  • EMELD will archive best-practice documents for
    data access for data creators not capable of
    implementing the query protocol
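
The search registry described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class names, the site names ("archive-a", "archive-b") and the method names ("find_by_language", "find_by_gloss") are all hypothetical, since the slides say the "methods" are still to be defined.

```python
from dataclasses import dataclass, field


@dataclass
class SiteEntry:
    """One registered data access site and the search methods it implements."""
    name: str
    methods: set = field(default_factory=set)


class SearchRegistry:
    """Hypothetical registry: sites register, then list their search methods."""

    def __init__(self):
        self._sites = {}

    def register_site(self, name):
        # Registering twice is harmless; the existing entry is kept.
        return self._sites.setdefault(name, SiteEntry(name))

    def register_method(self, site_name, method_name):
        if site_name not in self._sites:
            raise KeyError(f"unregistered site: {site_name}")
        self._sites[site_name].methods.add(method_name)

    def sites_supporting(self, method_name):
        # A search service would use this to decide where to send a query.
        return sorted(
            entry.name
            for entry in self._sites.values()
            if method_name in entry.methods
        )


# Hypothetical sites and method names, for illustration only.
registry = SearchRegistry()
registry.register_site("archive-a")
registry.register_site("archive-b")
registry.register_method("archive-a", "find_by_language")
registry.register_method("archive-b", "find_by_language")
registry.register_method("archive-b", "find_by_gloss")
```

With such a registry, a search service could route each packaged query only to the sites that have registered the corresponding method, e.g. `registry.sites_supporting("find_by_gloss")`.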

14
How to search?
  • Pilot project
  • Take some small subset of resources
  • Data input via FIELD
  • Nijmegen? SIL? AIATSIS? AILLA?
  • Take FIELD search out of FIELD
  • Search over that small set of resources
  • Ideally, keep both resources in separate
    databases to begin to develop query interchange
    protocol

15
How to search?
  • Another project: Grammatical thesaurus
  • Develop a grammatical thesaurus that gives common
    synonyms for a given grammatical term (e.g., oral
    stop, plosive)
  • This could then be used to allow a user's search
    to be expanded to include synonyms for a given
    term.
  • In all likelihood, there are other applications
    of this.
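
As a rough illustration of thesaurus-based query expansion, a minimal sketch might look like the following. The data structure and function names are assumptions, and the only synonym pair taken from the slides is "oral stop" / "plosive"; a real thesaurus would need to be compiled by linguists.

```python
from collections import defaultdict

# Each set groups terms treated as synonyms. Only this one pair comes
# from the slides; any further entries would be hypothetical.
SYNONYM_SETS = [
    {"oral stop", "plosive"},
]


def build_index(synonym_sets):
    """Map every (lowercased) term to the full set of its synonyms."""
    index = defaultdict(set)
    for group in synonym_sets:
        normalized = {term.lower() for term in group}
        for term in normalized:
            index[term] |= normalized
    return index


def expand_query(term, index):
    """Return the search term plus all known synonyms, sorted."""
    key = term.strip().lower()
    # Terms not in the thesaurus are searched as-is.
    return sorted(index.get(key, {key}))


index = build_index(SYNONYM_SETS)
print(expand_query("Plosive", index))    # ['oral stop', 'plosive']
print(expand_query("ejective", index))   # ['ejective']
```

The expanded list can then be passed to whatever search backend is in use, so that a search for "plosive" also finds resources described with "oral stop".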

16
How to search?
  • Search interface
  • EMELD should implement a VISER-like service for
    access to its database
  • There are two distinct kinds of searches
  • Resource location
  • Resource data search

17
How to search?
  • Search interface
  • The details of the search interface implemented
    by EMELD are hard to conceive of until more
    resources can be accessed through it
  • A questionnaire can help with this area too.
  • EMELD could ask people to try the search and
    evaluate it
  • Starting with the people in this room

18
Getting the data
  • Sticks
  • EMELD Ambassadors
  • Assisted by Linguist Spider

19
Getting the data
  • Carrots
  • Support harvesting metadata in document headers
    for submitted URLs.
  • Resources with best practice metadata can be
    assigned a standard EMELD URI for use as a
    reference
  • These resources could be posted and advertised
    on Linguist (but consult Baden first)

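
Harvesting metadata from document headers could be sketched as below. This assumes, purely for illustration, that submitted pages embed Dublin Core style metadata in HTML `<meta name="DC.…">` tags; the slides do not specify a header format. Fetching the submitted URL is left out, so the function works on HTML text that has already been retrieved.

```python
from html.parser import HTMLParser


class MetaHarvester(HTMLParser):
    """Collect <meta name="DC.*" content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.records = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        content = attr.get("content")
        # Assumption: metadata elements use a "DC." name prefix.
        if name.lower().startswith("dc.") and content is not None:
            element = name[3:].lower()
            self.records.setdefault(element, []).append(content)


def harvest(html_text):
    """Return a dict mapping metadata element names to lists of values."""
    parser = MetaHarvester()
    parser.feed(html_text)
    parser.close()
    return parser.records


# Made-up sample page, for illustration only.
SAMPLE = (
    '<html><head>'
    '<meta name="DC.title" content="Sample wordlist">'
    '<meta name="DC.language" content="en">'
    '<meta name="viewport" content="width=device-width">'
    '</head><body></body></html>'
)
print(harvest(SAMPLE))
```

Values are kept as lists because a metadata element may legitimately be repeated in one header.
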
20
Getting the data
  • Juiciest Carrots (Best Practice resources only)
  • "Preferred" EMELD URIs
  • Marked as such in a search
  • Could undergo "advanced" search techniques
  • Be peer-reviewed and vetted by LDRA (Linguistic
    Digital Resource Association). This organization
    does not exist, as far as I know.

21
Granularity
  • Right now there are no recommendations for the
    granularity of exposed metadata records
  • Large archives, for example, have hierarchical
    structure, one level of which must be isolated
    (the IMDI session, for example)
  • Cutting-edge archives don't work well with the
    resource-object model. Their resources are
    "created" based on the user's needs

22
Granularity
  • The lack of recommendations on this issue
    inhibits metadata creation
  • Granularity makes a big difference as to what
    content is searchable
  • Two different audiences in need of advice
  • "Real" archives (a.k.a. trusted repositories)
  • Individuals
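
To make the effect of granularity concrete, the sketch below exposes one metadata record per node at a chosen depth of a hierarchical archive. The nested-dict layout, the level numbering, and the sample names are assumptions for illustration; they are not IMDI or OLAC structures.

```python
def count_leaves(node):
    """Count the leaf items (e.g. files) beneath a node."""
    children = node.get("children", [])
    if not children:
        return 1
    return sum(count_leaves(child) for child in children)


def records_at_level(node, level, depth=0, path=()):
    """Yield one record per node found at the requested depth."""
    here = path + (node["name"],)
    if depth == level:
        yield {"id": "/".join(here), "items": count_leaves(node)}
        return
    for child in node.get("children", []):
        yield from records_at_level(child, level, depth + 1, here)


# Hypothetical archive: archive > corpus > session > file.
ARCHIVE = {
    "name": "archive",
    "children": [{
        "name": "corpus1",
        "children": [
            {"name": "session1",
             "children": [{"name": "a.wav"}, {"name": "a.txt"}]},
            {"name": "session2",
             "children": [{"name": "b.wav"}]},
        ],
    }],
}

# Exposing at level 0 gives a single record for the whole archive;
# exposing at level 2 gives one record per session.
print(list(records_at_level(ARCHIVE, 0)))
print(list(records_at_level(ARCHIVE, 2)))
```

The same holdings yield one searchable record or several depending on the level chosen, which is why a shared recommendation matters.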

23
Granularity
  • Recommendation: EMELD should encourage IMDI and
    OLAC to devise best-practice recommendations for
    granularity

24
The questionnaire
  • Two broad kinds of questions
  • What kinds of things would you like?
  • What kinds of things would you hate? (Dafydd's
    Corollary)

25
The questionnaire
  • Part One: Search capabilities
  • How do you want to conduct your search
    (google-style, directory-style, pull-down
    menus...)?
  • What kinds of searches are you doing already on
    other sites?
  • Search within results? (We wanted this.)
  • Thesaurus-based search

26
The questionnaire
  • Part Two: Search content
  • Free entry (like Google)
  • Feature-based entry
  • Statistical questions
  • Phonetic characters
  • Geographical search
  • Time search
  • ...

27
The questionnaire
  • Part Three: Results
  • Google-like results
  • Journal abstract search-like results
  • Restricted results (only return web sites, .pdf
    documents, ...)
  • ...

28
The questionnaire
  • Format
  • Online submission
  • Combination multiple choice (for the uncreative)
    and free form (for the creative)
  • Encourage people to envision the search of the
    year 2503