MetaLib Technical Notes. Sue Dentinger, UW Madison Library Technology Group. 3/29/06 WAAL presentation

1
MetaLib Technical Notes
Sue Dentinger, UW Madison Library Technology Group
3/29/06 WAAL presentation
2
Outline
  • MetaLib: Both a Content Management System and a
    Metasearch Engine
  • What does a Metasearch look like in MetaLib?
  • What is a Metasearch?
  • Types of Metasearch Connections to Various
    Sources
  • Metasearch Pros and Cons
  • Metasearch/Federated Searching Differences
  • Example Federated Search
  • Federated Search Pros and Cons
  • Conclusion
  • Articles Cited

3
MetaLib: Both a Content Management System (CMS)
and a Metasearch Engine
  • CMS (Oracle-based, menu-driven, web interface
    provided for resource administration)
  • Web-based public interface you can tailor to some
    degree with logos/graphics/fonts
  • Or develop your own interface using the XML
    gateway API

4
MetaLib Admin
5
We already saw the MetaLib interface. What does
MetaLib look like as a metasearch?
6
During a Metasearch
7
Metasearch Results Page
8
One View of Results from a Metasearch
9
What is a Metasearch?
  • A metasearch searches external resources in
    real time.
  • Metasearching means dynamically searching 2 or
    more resources.
  • Search occurs in a unified environment. You don't
    control the incoming data sources.
  • The interface controls how results are presented.
  • Each source searched goes through a program
    (often ExLibris supplied, or you write your own)
    that pre-establishes the search mechanism.
  • The search method is predefined using, for
    example, Z39.50, XML gateway, Search and Link, or
    screen scraping.
  • The patron doesn't really know/care which method
    is used.
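
The real-time fan-out described above can be sketched roughly as follows. This is only an illustration, not MetaLib code: the three connector functions and their names are hypothetical stand-ins for the per-source programs, and the timeout value is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor, wait
import time

# Hypothetical connectors: each stands in for one per-source program
# (Z39.50 client, XML gateway call, screen scraper, ...).
def catalog_connector(query):
    return [f"catalog record about {query}"]

def article_db_connector(query):
    return [f"article about {query}", f"review of {query}"]

def slow_connector(query):
    time.sleep(1.5)  # simulates a source that answers too late
    return [f"late record about {query}"]

def metasearch(query, connectors, timeout=0.3):
    """Send one query to every source at once; keep what returns in time."""
    pool = ThreadPoolExecutor(max_workers=len(connectors))
    futures = {pool.submit(fn, query): name for name, fn in connectors.items()}
    done, not_done = wait(futures, timeout=timeout)
    pool.shutdown(wait=False)  # do not hold the patron up for slow sources
    results = {futures[f]: f.result() for f in done}
    timed_out = sorted(futures[f] for f in not_done)
    return results, timed_out

CONNECTORS = {
    "catalog": catalog_connector,
    "articles": article_db_connector,
    "slow": slow_connector,
}
results, timed_out = metasearch("metadata", CONNECTORS)
print(results)
print("timed out:", timed_out)
```

The patron sees one search box; which connector handled which source stays hidden behind the `metasearch` call.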

10
Types of Metasearch Connections to Various
Sources
  • Z39.50: the vendor must run a Z39.50 server.
    MetaLib is the client (like EndNote).
  • Z39.50 uses MARC, SUTRS (Simple Unstructured Text
    Record Syntax) or other underlying data formats.
  • XML gateway (SRU/SRW: Search/Retrieve via URL /
    Search/Retrieve Web Service). (Best choice)
  • SRU/SRW has 3 basic operations: Explain
    (structure), Scan, Search-Retrieve.
  • SRU/SRW may use Common Query Language (CQL).
  • SRU/SRW is more efficient than Z39.50, returning
    XML results.
  • HTTP requests, search and link resources: the
    base URL and request are sent to the source and
    only hits are returned; the user goes to the
    native interface to see results.
  • Format conversion tables are often used to read
    results back into MetaLib.
  • External program (often screen scraping).
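
As a rough illustration of the SRU side of the XML gateway option, the three basic operations are plain HTTP GET requests with a few standard parameters. The sketch below only builds the URLs. The server address is hypothetical, the CQL index names (`dc.title`, `dc.creator`) assume the target supports the Dublin Core context set, and none of this is MetaLib's own configuration.

```python
from urllib.parse import urlencode

BASE_URL = "http://sru.example.org/catalog"  # hypothetical SRU server

def sru_url(operation, **params):
    """Build an SRU 1.1 request URL for one of the three basic operations."""
    query = {"operation": operation, "version": "1.1", **params}
    return f"{BASE_URL}?{urlencode(query)}"

# Explain: ask the server to describe itself (indexes, schemas).
explain_url = sru_url("explain")

# Scan: browse the terms in an index near a starting point.
scan_url = sru_url("scan", scanClause='dc.title = "meta"')

# Search-Retrieve: run a CQL query and ask for the first 10 records.
search_url = sru_url(
    "searchRetrieve",
    query='dc.title = "metasearch" and dc.creator = "sadeh"',
    startRecord=1,
    maximumRecords=10,
)
print(search_url)
```

The response to each request is an XML document, which is what makes the results easy to convert into a local format.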

11
Z39.50 Example
Format conversion program: specifies the program
used to convert incoming records to MetaLib's
internal format. ExLibris supplied, or you can
write your own.
Record type: the incoming record format.
12
Z39.50: Specify what/how to search
Z39.50 attributes: u = use attribute (specifies the
field to search), s = structure (phrase or word
search), t = truncation (left, right, both)
EEEK!
Term transformations convert the user's MetaLib
query into the target database's search format by
applying pre-programmed rules.
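
To make the use/structure/truncation attributes a little less "EEEK", here is a small sketch that turns a user's term into a Z39.50 query string. It is written in Prefix Query Format (PQF), the notation used by the YAZ toolkit, not MetaLib's own configuration syntax. The numeric values are from the Bib-1 attribute set (type 1 = use, 4 = structure, 5 = truncation); the function name and the short lookup tables are illustrative assumptions.

```python
# A few Bib-1 attribute values (not the full set).
USE = {"title": 4, "subject": 21, "author": 1003, "any": 1016}
STRUCTURE = {"phrase": 1, "word": 2}
TRUNCATION = {"right": 1, "left": 2, "both": 3, "none": 100}

def to_pqf(term, use="any", structure="word", truncation="none"):
    """Translate a user's term into a PQF query with Bib-1 attributes."""
    escaped = term.replace('"', '\\"')
    return (
        f"@attr 1={USE[use]} "
        f"@attr 4={STRUCTURE[structure]} "
        f"@attr 5={TRUNCATION[truncation]} "
        f'"{escaped}"'
    )

pqf = to_pqf("library automation", use="title",
             structure="phrase", truncation="right")
print(pqf)  # @attr 1=4 @attr 4=1 @attr 5=1 "library automation"
```

A term transformation plays a similar role: the patron types a plain query and a pre-programmed rule rewrites it into whatever form the target expects.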
13
Metasearch Pros
  • Data comes from original sources.
  • Data is as fresh as can be.
  • Sources can have varying underlying formats.
  • Different types of sources, all relating to a
    topic, can be hand-selected for specific purposes.
  • Deduping, sorting, presenting unified results all
    in one interface across disparate sources.
  • User or library can limit incoming sources to the
    most pertinent, controlling quality/content.
  • Roy Tennant: "It's not just what you search but
    what you don't search that counts."
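
The deduping and sorting step can be sketched as below. This is a minimal illustration, not MetaLib's merge algorithm: the record fields (`title`, `year`), the matching rule (normalized title plus year), and the sort order (newest first) are all assumptions.

```python
import re

def dedupe_key(record):
    """Normalize the title so trivially different copies match."""
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return (title, record.get("year"))

def merge_results(per_source):
    """Merge records from all sources, dedupe, and sort newest first."""
    merged = {}
    for source, records in per_source.items():
        for rec in records:
            entry = merged.setdefault(dedupe_key(rec), {**rec, "sources": []})
            entry["sources"].append(source)  # remember every source that had it
    return sorted(
        merged.values(),
        key=lambda r: (-(r.get("year") or 0), r["title"].lower()),
    )

per_source = {
    "catalog": [{"title": "Metasearch Basics", "year": 2005}],
    "articles": [
        {"title": "metasearch basics!", "year": 2005},
        {"title": "Semantic Web", "year": 2006},
    ],
}
unified = merge_results(per_source)
for rec in unified:
    print(rec["year"], rec["title"], rec["sources"])
```

Because every record has to pass through this step at query time, the number of records taken from each source has a direct effect on response time.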

14
Metasearch Cons
  • Real time is s-l-o-w-e-r
  • Results coming back to the web interface must be
    limited to a set number of records per source.
  • Sorting/deduping 10,000 records from 5 sources
    would take too long.

15
Metasearch/Federated Searching Differences
  • In the past: metasearching = federated searching
    = cross-database searching = parallel searching
    = broadcast searching = integrated searching!
  • More recently, some library leaders (Roy Tennant
    of CDL, Tamar Sadeh of ExLibris, and others) are
    making a finer distinction between a metasearch,
    a federated search and cross-database searching.

16
What's the Difference Between a Metasearch and a
Federated Search?
  • Metasearch uses just-in-time (real-time)
    processing for all kinds of data sources.
  • Metasearch systems are not databases of the data,
    but hold structural info on retrieving from many
    sources.
  • Federated search uses just-in-case processing: a
    pre-populated underlying repository of data,
    pre-harvested and ranked or sorted in a
    pre-specified order.

17
Metasearch 1
  • The ideal solution is for metasearch systems to
    receive resource-specific info at the time of the
    actual interaction and figure out the flow of the
    interaction based on a picture or design of the
    information itself.
  • Tamar Sadeh, Google Scholar versus Metasearch
    Systems. Jan 11, 2006.
    http://library.cern.ch/HEPLW/12/papers/1
  • The concept of a metasearch is based on Tim
    Berners-Lee's concept of the semantic web, a W3C
    project.

18
Metasearch 2
  • Semantics: meaning. If a computer
    understands the semantics of a document, it
    understands the meaning, rather than just
    interpreting a series of characters.
  • Semantic web: a project of the W3C in which
    automated methods based on quality metadata are
    envisaged to replace much human searching of the
    web. Relies on ontologies, XML and RDF.
  • http://www.webindexing.biz/Webbook2Ed/glossary.htm

19
Example of a Federated Search?
Google Scholar, Elsevier's SCIRUS, OVID, ...
20
Federated Search
  • Searches a repository or index of objects
    populated earlier from multiple data sources.
  • Presents unified interface.
  • Just-in-case processing: a pre-processing ranking
    algorithm, often based on number of times cited or
    other criteria. It can be applied to data
    elements unrelated to any future query.
  • Endeca, used at NCSU, is called a web navigation
    system, but could be considered closer to a
    federated search of a pre-populated database.
  • FAST, used at Elsevier, uses similar
    pre-processing.
  • Predetermined rank can be used to better evaluate
    relevance of an item retrieved in a query.
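
Just-in-case processing can be sketched as an index that is ranked once at harvest time, before any query exists. This is a toy illustration, not how Google Scholar, Endeca or FAST actually work: the record fields and the rank-by-citation-count rule are assumptions drawn from the bullet above.

```python
from collections import defaultdict

def build_index(harvested):
    """Harvest time: rank every record once, before any query exists."""
    ranked = sorted(harvested, key=lambda r: r["times_cited"], reverse=True)
    index = defaultdict(list)
    for rec in ranked:
        for word in set(rec["title"].lower().split()):
            index[word].append(rec)  # postings stay in precomputed rank order
    return index

def search(index, word):
    """Query time: a lookup only; no external source is contacted."""
    return index.get(word.lower(), [])

harvested = [
    {"title": "Metasearch in libraries", "times_cited": 3},
    {"title": "Metasearch standards", "times_cited": 42},
    {"title": "Semantic web primer", "times_cited": 17},
]
index = build_index(harvested)
hits = search(index, "Metasearch")
print([h["title"] for h in hits])
```

The query itself does almost no work, which is where the speed comes from; the cost is that results are only as fresh as the last harvest and arrive in the one order chosen in advance.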

21
Federated Search Pros
  • In general just-in-case pre-processing will have
    much better performance. VERY FAST!
  • Can provide an initial sort and often relevance
    ranking.

22
Federated Search Cons
  • Searches not done in real time.
  • Often content in repository is not as fresh as it
    should or could be.
  • Can have months of delay.
  • Content provider must maintain underlying
    database and constantly feed it.
  • Pre-sorted, so almost always no ability to
    re-sort on any other criteria.

23
Conclusion
  • Behind the scenes MetaLib is doing a lot of
    real-time work!
  • ExLibris continually updates connection programs
    for many vendors so we don't have to. Regular
    updates.
  • We do need to tweak/maintain our local access to
    each vendor.
  • What to answer when people ask, "Why is
    metasearching so slow compared to Google?"
  • Comparing apples and oranges!
  • Google is not just articles, but blogs, comments,
    unfiltered web.
  • Google is not real-time.
  • Google cannot be sorted differently.
  • Google is too humongous! Harder to do nuanced
    searching. BUT....

24
Conclusion cont
  • Google Scholar or federated searching, on the
    other hand, gives metasearching a run for its
    money despite current drawbacks in content and
    freshness.
  • Real advantage in speed and numbers of records
    processed ahead.
  • Real problems in data harvested and which
    articles are most relevant for whom and when.
  • In long run, the concept of metasearching better
    suits the concept of the semantic web.

25
Articles Cited
  • California Digital Library Glossary Definition.
    http://www.cdlib.org/inside/diglib/glossary/
  • Sadeh, Tamar. Google Scholar versus Metasearch
    Systems. In High Energy Physics Libraries
    Webzine. Issue 12, Feb. 2006.
    http://library.cern.ch/HEPLW/12/papers/1
  • Lease Morgan, Eric. SRW/U in Five Hundred Words.
    http://www.loc.gov/z3950/agency/zing/srw/brief.html
  • SRW: Search/Retrieve Web Service. Z39.50
    International: Next Generation.
    http://www.loc.gov/z3950/agency/zing/srw/z3950.html
  • Jermey, Jon and Browne, Glenda. Website Indexing:
    enhancing access to information within websites.
    Glossary.
    http://www.webindexing.biz/Webbook2Ed/glossary.htm

26
Questions?
  • Continued Usability Studies
  • Custom Search Launch
  • My Space Login
  • Campus Marketing and Instruction

Contact Information
Todd Bruns tbruns_at_library.wisc.edu
Sue Dentinger sdentinger_at_library.wisc.edu
Amy Kindschi kindschi_at_engr.wisc.edu