Title: Qualified Dublin Core Using RDF for SciTech Journal Articles DC2001 International Conference on Dubl
1 Qualified Dublin Core Using RDF for Sci-Tech Journal Articles
DC-2001 International Conference on Dublin Core and Metadata Applications, October 22-26, 2001
National Institute of Informatics, Tokyo, Japan
http://dli.grainger.uiuc.edu/Publications/DC2001/
- Thomas G. Habing (thabing@uiuc.edu)
- Timothy W. Cole (t-cole3@uiuc.edu)
- William H. Mischo (w-mischo@uiuc.edu)
- University of Illinois at Urbana-Champaign
2 History and Objectives of the Testbed
- Funded 1994-98 under DLI-I (NSF/NASA/DARPA).
- Continued 1998-2001 under CNRI's D-Lib Test Suite.
- Construct large-scale, multi-publisher, markup-based full-text journal testbed.
- Investigate processing, indexing, normalization, retrieval, rendering and linking.
- Study end-user searching behavior and needs.
3 Description of Testbed
- Testbed contains 65,000 articles from 50 journals.
- Received from publishers as SGML (various DTDs).
- Converted to well-formed XML.
- Content support from AIP, APS, ASCE, IEE, ASM, ACM, Elsevier.
- Additional support from IEEE, NRL, NTT Learning Systems.
4 Usage of Metadata in Illinois Testbed
- Facilitate resource discovery across heterogeneous sources through normalization.
- Common, easily displayable search results.
- Add value to the original object: reference linking, links to alternate formats and A&I services.
- Data exchange, as with the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).
7 Metadata Extraction Process
- Metadata is derived from full-text using XSLT.
- One-to-one mappings.
  - select="//titlegrp/title" maps to <dc:title>.
- Complex mappings.
  - Tables of Contents, Literal Markup such as MathML.
- Advanced XSLT techniques.
  - JavaScript functions are used for some formatting.
  - The document(url) function is used to merge XML from other sources, such as CrossRef, into the metadata.
- See paper for sample XSLT code.
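
The one-to-one mapping on this slide (`//titlegrp/title` to `dc:title`) can be illustrated outside XSLT. The following is a minimal Python sketch, not the testbed's stylesheet: only the `titlegrp/title` path and the `dc:title` target come from the slide, while the sample article markup and the function name `extract_dc_title` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace

# Hypothetical article markup; only the titlegrp/title path is taken from the slide.
SAMPLE_ARTICLE = """
<article>
  <front>
    <titlegrp><title>A Title</title></titlegrp>
  </front>
</article>
"""

def extract_dc_title(article_xml):
    """Map the first //titlegrp/title element to a dc:title element (or None)."""
    root = ET.fromstring(article_xml)
    source = root.find(".//titlegrp/title")
    if source is None:
        return None
    dc_title = ET.Element("{%s}title" % DC_NS)
    # itertext() flattens any inline markup inside the source title
    dc_title.text = "".join(source.itertext()).strip()
    return dc_title

ET.register_namespace("dc", DC_NS)
title = extract_dc_title(SAMPLE_ARTICLE)
print(ET.tostring(title, encoding="unicode"))
```

Complex mappings (tables of contents, MathML) would need more than a single path lookup and are deliberately left out of this sketch.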
8 Other uses of XSLT
- Dumb-down to unqualified DC.
- Transform metadata to HTML for display.
- Generate RDF triples for use in an RDBMS.
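
To make the last bullet concrete, here is a hedged Python sketch of flattening a simple RDF/XML description into (subject, predicate, object) rows and loading them into a relational table. The presentation does this with XSLT; the sample record, the `example.org` URIs, the `triples` table layout and the function name `simple_triples` are all assumptions, and the sketch handles only flat `rdf:Description` blocks, not nested descriptions or containers.

```python
import sqlite3
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical record; the subject and type URIs are placeholders.
SAMPLE_RDF = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/article/1">
    <dc:title>A Title</dc:title>
    <dc:type rdf:resource="http://example.org/types/article"/>
  </rdf:Description>
</rdf:RDF>
"""

def simple_triples(rdf_xml):
    """Yield (subject, predicate, object) for flat rdf:Description blocks only."""
    root = ET.fromstring(rdf_xml)
    for desc in root.findall("{%s}Description" % RDF_NS):
        subject = desc.get("{%s}about" % RDF_NS)
        for prop in desc:
            # ElementTree tags look like '{namespace}local'; join them into one URI
            predicate = prop.tag[1:].replace("}", "", 1)
            resource = prop.get("{%s}resource" % RDF_NS)
            obj = resource if resource is not None else (prop.text or "").strip()
            yield subject, predicate, obj

# Assumed three-column layout for the triple store
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", simple_triples(SAMPLE_RDF))
rows = conn.execute(
    "SELECT predicate, object FROM triples ORDER BY predicate"
).fetchall()
for row in rows:
    print(row)
```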
9 Dumb-down XSLT
- <xsl:variable name="DCQ" select="document('dcq.rdfs')"/>
- <xsl:template name="dumb_down_dcq">
    <xsl:variable name="SubPropertyOf"
        select="$DCQ//rdf:Property[@rdf:ID=local-name(current())]/rdfs:subPropertyOf/@rdf:resource"/>
    <xsl:variable name="DCTag"
        select="substring-after($SubPropertyOf,'&xmlns_dc;')"/>
    <xsl:if test="$DCTag">
      <xsl:call-template name="dumb_down">
        <xsl:with-param name="Tag" select="$DCTag"/>
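
The lookup performed by the fragment above (find the schema's `rdf:Property` whose `rdf:ID` matches the qualified element, then follow `rdfs:subPropertyOf` to the unqualified DC element) can be sketched in Python as follows. This is an illustration under assumptions, not the authors' stylesheet: the schema excerpt is a hypothetical stand-in for `dcq.rdfs`, and the function names are invented for the example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical schema excerpt standing in for dcq.rdfs.
SAMPLE_SCHEMA = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Property rdf:ID="isPartOf">
    <rdfs:subPropertyOf rdf:resource="http://purl.org/dc/elements/1.1/relation"/>
  </rdf:Property>
  <rdf:Property rdf:ID="abstract">
    <rdfs:subPropertyOf rdf:resource="http://purl.org/dc/elements/1.1/description"/>
  </rdf:Property>
</rdf:RDF>
"""

def build_dumb_down_table(schema_xml):
    """Return {qualified property ID: unqualified DC element name}."""
    table = {}
    root = ET.fromstring(schema_xml)
    for prop in root.iter("{%s}Property" % RDF_NS):
        parent = prop.find("{%s}subPropertyOf" % RDFS_NS)
        if parent is None:
            continue
        resource = parent.get("{%s}resource" % RDF_NS, "")
        # Keep only properties that refine an element in the DC namespace
        if resource.startswith(DC_NS):
            table[prop.get("{%s}ID" % RDF_NS)] = resource[len(DC_NS):]
    return table

def dumb_down(qualifier, table):
    """Return the unqualified DC element for a qualifier, or None if unknown."""
    return table.get(qualifier)

table = build_dumb_down_table(SAMPLE_SCHEMA)
print(dumb_down("isPartOf", table))
```

Returning `None` for an unknown property mirrors the `xsl:if` guard in the fragment: elements with no DC parent in the schema are not emitted as unqualified DC.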
10 Local Extensions to DCQ
- Qualified DC was not adequate for our needs.
- Various DC working groups provided some guidance.
- We extended DCQ in three areas:
  - Citation-related extensions.
  - Agent-related (creator) extensions.
  - Type and encoding scheme extensions.
11 Citation-related Extensions
- <uiLib:citation>A. Author. "A Title" Some Jrnl.
- <dc:identifier> <uiLib:OpenURL-OBJECT-METADATA-ZONE>
    <rdf:value>genre=article&amp;aulast=Author
- <dcq:isPartOf> <rdf:Description rdf:ID="JournalIssue">
    <dc:identifier><uiLib:ISSN>
      <rdf:value>1234-5678</rdf:value>
    </uiLib:ISSN></dc:identifier>
    <dc:title>Some Journal</dc:title>
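
The OpenURL object-metadata-zone value shown on this slide is a query string that must be XML-escaped before it is placed inside `rdf:value`. A minimal Python sketch of that step follows; only the `genre` and `aulast` keys and their values come from the slide, and the function name `build_object_metadata_zone` is hypothetical.

```python
from urllib.parse import urlencode, parse_qsl
from xml.sax.saxutils import escape

def build_object_metadata_zone(fields):
    """Encode (key, value) pairs as an OpenURL-style query string."""
    return urlencode(fields)

# Keys and values taken from the slide's example
zone = build_object_metadata_zone([("genre", "article"), ("aulast", "Author")])

# '&' must become '&amp;' when the string is embedded in XML
escaped = escape(zone)

print(zone)
print(escaped)
```

Reading the value back is the reverse: an XML parser unescapes the text, and `parse_qsl` splits it into key/value pairs again.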
12 Agent-related Extensions
- Based on DC Agent Qualifiers, Working Draft, 1999.
- <dc:creator><rdf:Seq><rdf:li>
    <dca:Person rdf:ID="AUTHOR-1">
      <dca:agentname><dca:FNF>
        <rdf:value>Author, A. N.</rdf:value>
      </dca:FNF></dca:agentname>
      <dca:agentaffiliation>Big University</dca:agentaffiliation>
      <dca:agentidentifier rdf:resource="mailto:ana@big.edu"/>
13 Type and Encoding Extensions
- Extensions to DCMI Type Vocabulary.
  - <dc:type rdf:resource="http://dli.grainger.uiuc.edu/uiLibbook"/>
  - <dc:type rdf:resource="http://dli.grainger.uiuc.edu/uiLibinjrnl"/>
- Additional Encoding Schemes.
  - PACS, ACMCCS, ISSN, CODEN, ACM_JRNL_CODE.
14 Conclusions
- Using DCQ/RDF for sci-tech journal articles is viable.
- Steep learning curve for RDF.
- Dumbing-down DCQ/RDF is complex.
- Cannot ignore non-DC tags; RDF Schema is required.
- DCQ is missing many properties and types required for complete serials descriptions.
- Utility of RDF remains uncertain.