Title: Evaluating content-oriented XML retrieval: The INEX initiative
1Evaluating content-oriented XML retrieval: The INEX initiative
- Mounia Lalmas
- Queen Mary University of London
- http://qmir.dcs.qmul.ac.uk
2Outline
- Information retrieval
- XML retrieval
- Evaluating information retrieval
- Evaluating XML retrieval: INEX
3Information retrieval
- Example of a user information need (e.g. on the WWW)
- Find all documents about sailing charter agencies that (1) offer sailing boats in the Greek islands, and (2) are registered with the RYA. The documents should contain boat specification, price per week, e-mail and other contact details.
- A formal representation of an information need constitutes a query
4Information retrieval
- IR is concerned with the representation, storage and organisation of, and access to, repositories of information, usually in the form of documents.
- Primary goal of an IR system
- Retrieve all the documents which are relevant (useful) to a user query, while retrieving as few non-relevant documents as possible.
5Conceptual model for IR
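[Diagram] Documents → Indexing → Document representation; Query → Formulation → Query representation; both representations feed the Retrieval function, which produces the Retrieval results; Relevance feedback leads from the results back to the query.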
6XML Retrieval
- Traditional IR is about finding relevant documents to a user's information need, e.g. an entire book.
- XML allows users to retrieve document components that are more focussed on their information needs, e.g. a chapter of a book instead of an entire book.
- The structure of documents is exploited to identify which document components (XML elements) to retrieve.
7XML: eXtensible Mark-up Language
- Meta-language (user-defined tags) currently being adopted as the document format language by W3C
- Used to describe content and structure (and not layout)
- Grammar described in DTD (used for validation)
<lecture>
  <title>Structured Document Retrieval</title>
  <author>
    <fnm>Smith</fnm>
    <snm>John</snm>
  </author>
  <chapter>
    <title>Introduction into XML retrieval</title>
    <paragraph>...</paragraph>
  </chapter>
</lecture>
<!ELEMENT lecture (title, author, chapter)>
<!ELEMENT author (fnm, snm)>
<!ELEMENT fnm (#PCDATA)>
8XML: eXtensible Mark-up Language
- Use of XPath notation to refer to the XML structure
chapter/title: title is a direct sub-component of chapter
//title: any title
chapter//title: title is a direct or indirect sub-component of chapter
chapter/paragraph[2]: any direct second paragraph of any chapter
chapter/*: all direct sub-components of a chapter
<lecture>
  <title>Structured Document Retrieval</title>
  <author>
    <fnm>Smith</fnm>
    <snm>John</snm>
  </author>
  <chapter>
    <title>Introduction into SDR</title>
    <paragraph>...</paragraph>
  </chapter>
</lecture>
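As a rough illustration, the path expressions on this slide can be tried out with Python's standard-library xml.etree.ElementTree, which supports a limited subset of XPath. This is only a sketch: the two paragraph texts are placeholders (the slide elides them), the helper name texts is made up, and the paths are evaluated relative to the lecture root element, so //title is written .//title.

```python
import xml.etree.ElementTree as ET

# The lecture example from the slide; paragraph texts are placeholders.
LECTURE = """
<lecture>
  <title>Structured Document Retrieval</title>
  <author><fnm>Smith</fnm><snm>John</snm></author>
  <chapter>
    <title>Introduction into SDR</title>
    <paragraph>first paragraph</paragraph>
    <paragraph>second paragraph</paragraph>
  </chapter>
</lecture>
"""

root = ET.fromstring(LECTURE)


def texts(path):
    """Return the text of every element matched by path, relative to <lecture>."""
    return [element.text for element in root.findall(path)]


direct_titles = texts("chapter/title")            # title directly under chapter
all_titles = texts(".//title")                    # any title below the root
nested_titles = texts("chapter//title")           # title anywhere under chapter
second_paragraphs = texts("chapter/paragraph[2]") # second paragraph child of a chapter
chapter_children = [e.tag for e in root.findall("chapter/*")]  # all direct children

print(direct_titles, all_titles, nested_titles, second_paragraphs, chapter_children)
```

A full XPath engine would accept the expressions exactly as written on the slide; ElementTree is used here only because it needs no extra installation.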
9Queries
- Content-only (CO) queries
- Standard IR queries, but here we are retrieving document components
- E.g. "London tube strikes"
- Content-and-structure (CAS) queries
- Put constraints on which types of components are to be retrieved
- E.g. "Sections of an article in the Times about congestion charges"
- E.g. "Articles that contain sections about congestion charges in London, and that contain a picture of Ken Livingstone"
10Conceptual model for XML retrieval
[Diagram labels: Structured documents; Content + structure; tf, idf, acc; Inverted file + structure index; Matching content + structure; Presentation of related components]
11Example of XML approaches
- The representation of a composite element (e.g. article and section) is defined as the aggregated representation of its sub-elements
[Diagram: a section Sec3 containing paragraphs p1 and p2]
- p1 is about XML, retrieval
- p2 is about XML, authoring
- Sec3 is then also about XML (in fact very much about XML), retrieval, authoring
12Example of XML approaches
- Document: ? t1, ? t2, ? t3
- Title: 0.9 t1, 0.4 t2
- Section_1: 0.5 t1
- Section_2: 0.2 t1, 0.7 t3
- ? = aggregated weight of ti in Document, based on the instances of ti in the sub-elements (Title, Section_1 and Section_2)
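The slide does not say how the aggregated weights are computed. The following is a minimal sketch under two loudly stated assumptions: the weights are grouped as Title (0.9 t1, 0.4 t2), Section_1 (0.5 t1) and Section_2 (0.2 t1, 0.7 t3), and aggregation is a plain unweighted average over the sub-elements, with a missing term counting as 0. The function name aggregate is hypothetical; real approaches may weight sub-elements differently.

```python
# Term weights per sub-element, as read from the example on the slide.
SUB_ELEMENTS = {
    "Title": {"t1": 0.9, "t2": 0.4},
    "Section_1": {"t1": 0.5},
    "Section_2": {"t1": 0.2, "t3": 0.7},
}


def aggregate(sub_elements):
    """Average each term's weight over all sub-elements (assumed scheme, not from the slide)."""
    terms = sorted({term for weights in sub_elements.values() for term in weights})
    count = len(sub_elements)
    return {
        term: sum(weights.get(term, 0.0) for weights in sub_elements.values()) / count
        for term in terms
    }


document = aggregate(SUB_ELEMENTS)
for term, weight in document.items():
    print(f"{term}: {weight:.3f}")
```

Under this assumed scheme t1 ends up with the largest weight in Document because it occurs in all three sub-elements.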
13Evaluation
- The goal of an IR system
- retrieve as many relevant documents as possible and as few non-relevant documents as possible
- Comparative evaluation of technical performance of IR systems: effectiveness
- ability of the IR system to retrieve relevant documents and suppress non-relevant documents
- Effectiveness
- combination of recall and precision
14Relevance
- A document is relevant if it has significant and demonstrable bearing on the matter at hand.
- Common assumptions
- Objectivity
- Topicality
- Binary nature
- Independence
15Recall / Precision
16Recall / Precision
- relevant documents for a given query
- d3, d5, d9, d25, d39, d44, d56, d71, d89, d123
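To make recall and precision concrete, here is a small sketch that uses the ten relevant documents listed above. The ranked list is hypothetical (the system output shown on the original slide did not survive), as are the non-relevant identifiers d84, d6 and d8 and the function name precision_recall_at. Precision is the fraction of retrieved documents that are relevant; recall is the fraction of relevant documents that have been retrieved.

```python
# Relevant documents for the query, as listed on the slide.
RELEVANT = {"d3", "d5", "d9", "d25", "d39", "d44", "d56", "d71", "d89", "d123"}

# Hypothetical ranked output of a system (not from the slide).
RANKING = ["d123", "d84", "d56", "d6", "d8", "d9"]


def precision_recall_at(ranking, relevant, k):
    """Return (precision, recall) over the top-k ranked documents."""
    retrieved = ranking[:k]
    hits = sum(1 for doc in retrieved if doc in relevant)
    return hits / k, hits / len(relevant)


for k in (1, 3, 6):
    precision, recall = precision_recall_at(RANKING, RELEVANT, k)
    print(f"top {k}: precision={precision:.2f} recall={recall:.2f}")
```

With this made-up ranking, recall grows as more documents are inspected while precision falls, which is the usual trade-off behind recall/precision curves.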
17Comparison of systems
18Test collection
- Document collection: the documents themselves
- depend on the task, e.g. evaluating web retrieval requires a collection of HTML documents.
- Queries / requests
- simulate real user information needs.
- Relevance judgements
- stating for a query the relevant documents.
- See TREC
19Evaluation of XML retrieval: INEX
- Evaluating the effectiveness of content-oriented XML retrieval approaches
- Collaborative effort: participants contribute to the development of the collection
- queries
- relevance assessments
- Similar methodology as for TREC, but adapted to XML retrieval.
20INEX Test Collection
- The INEX test collection (2002)
- Documents (500MB), which consist of 12,107 articles in XML format from the IEEE Computer Society
- 30 CO and 30 CAS queries
- Relevance assessments per retrieved component, by participating groups
- Relevance defined in terms of relevance and coverage
- Participants: 36 active groups worldwide
- In 2003, INEX has 36 CO and 30 CAS queries
- Same document collection
- CAS queries are defined according to a subset of XPath.
- Relevance assessments per retrieved component, by participating groups
- Relevance defined in terms of exhaustivity and specificity
- Participants: 40 active groups worldwide
- INEX 2004 is just starting
21Example of CO topic
- <inex_topic topic_id="126" query_type="CO" ct_no="25">
- <title>Open standards for digital video in distance learning</title>
- <description>Open technologies behind media streaming in distance learning projects</description>
- <narrative>I am looking for articles/components discussing methodologies of digital video production and distribution that respect free access to media content through internet or via CD-ROMs or DVDs in connection to the learning process. Discussions of open versus proprietary standards of storing and sending digital video will be appreciated.</narrative>
- <keywords>media streaming, video streaming, audio streaming, digital video, distance learning, open standards, free access</keywords>
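A topic in this format can be read with any XML parser. The sketch below uses Python's standard-library xml.etree.ElementTree on an abridged copy of the CO topic above. The closing </inex_topic> tag is assumed (the slide does not show it), and the function name parse_topic and the returned dictionary layout are illustrative only, not part of INEX.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the CO topic; the closing tag is assumed, not shown on the slide.
TOPIC = """
<inex_topic topic_id="126" query_type="CO" ct_no="25">
  <title>Open standards for digital video in distance learning</title>
  <keywords>media streaming, video streaming, audio streaming, digital video,
  distance learning, open standards, free access</keywords>
</inex_topic>
"""


def parse_topic(xml_text):
    """Extract identifier, query type, title and keyword list from one topic."""
    root = ET.fromstring(xml_text)
    raw_keywords = root.findtext("keywords", default="")
    keywords = [k.strip() for k in raw_keywords.split(",") if k.strip()]
    return {
        "topic_id": root.get("topic_id"),
        "query_type": root.get("query_type"),
        "title": root.findtext("title", default="").strip(),
        "keywords": keywords,
    }


topic = parse_topic(TOPIC)
print(topic["topic_id"], topic["query_type"], topic["title"])
print(topic["keywords"])
```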
22Example of CAS topic
- <title>//article[about(.,'formal methods verify correctness aviation systems')]/body//*[about(.,'case study application model checking theorem proving')]</title>
- <description>Find documents discussing formal methods to verify correctness of aviation systems. From those articles extract parts discussing a case study of using model checking or theorem proving for the verification.</description>
- <narrative>To be considered relevant a document must be about using formal methods to verify correctness of aviation systems, such as flight traffic control systems, airplane- or helicopter-parts. From those documents a body-part must be returned (I do not want the whole body element, I want something smaller). That part should be about a case study of applying a model checker or a theorem prover to the verification.</narrative>
- <keywords>SPIN, SMV, PVS, SPARK, CWB</keywords>
23Relevance in XML
- An element is relevant if it has significant and demonstrable bearing on the matter at hand
- Common assumptions in IR
- Objectivity
- Topicality
- Binary nature ?
- Independence ?
24Relevance in XML
- Exhaustivity
- how exhaustively a document component discusses the topic of request
- Specificity
- how focused the component is on the topic of request (i.e. discusses no other, irrelevant topics)
- 4-graded: 0, 1, 2, 3
- needed because of the structure
- Relevance: (3,3), (2,3), (1,1), (0,0), etc.
25Relevance assessment task
- Exhaustivity
- Element → parent element, children elements
- Consistency
- Parent of a relevant element must also be relevant, although to a different extent
- Exhaustivity increases going up
- Specificity decreases going up
- Use of an online interface
- Assessing a query takes a week!
- Average: 2 topics per participant
- Only participants that complete the assessment task have access to the collection
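The consistency rule on this slide can be sketched as a simple check over assessed elements. Everything concrete here is an assumption made for illustration: the element paths (borrowed from the doc23 / sec3 / p2 / p4 example used for the metrics), the grades, the function names, and the reading that exhaustivity may not drop when moving from a child to its parent. Only the rule that the parent of a relevant element must itself be relevant is stated outright in the slide.

```python
# Hypothetical assessments: element path -> (exhaustivity, specificity), each 0..3.
ASSESSMENTS = {
    "doc23": (3, 1),
    "doc23/sec3": (3, 2),
    "doc23/sec3/p2": (1, 3),
    "doc23/sec3/p4": (2, 3),
}


def parent_of(path):
    """Return the parent path, or None for a top-level element."""
    head, separator, _ = path.rpartition("/")
    return head if separator else None


def inconsistencies(assessments):
    """List (path, reason) pairs that violate the assumed consistency rules."""
    problems = []
    for path, (exhaustivity, _specificity) in assessments.items():
        parent = parent_of(path)
        if parent is None or parent not in assessments:
            continue
        parent_exhaustivity = assessments[parent][0]
        if exhaustivity > 0 and parent_exhaustivity == 0:
            problems.append((path, "parent of a relevant element is not relevant"))
        elif parent_exhaustivity < exhaustivity:
            problems.append((path, "parent is less exhaustive than its child"))
    return problems


print(inconsistencies(ASSESSMENTS))

# Lowering the exhaustivity of sec3 below that of p4 breaks the assumed rule.
broken = {**ASSESSMENTS, "doc23/sec3": (1, 2)}
print(inconsistencies(broken))
```

A check of this kind could run inside an online assessment interface to flag inconsistent judgements as they are entered.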
26Metrics
- Recall/precision can be used but must take into consideration
- near misses (we do not retrieve the best component, e.g. p4, but one near enough, e.g. p2)
- overlap (we retrieve a component, e.g. doc23, and one of its sub-components, e.g. sec3)
[Diagram: doc23 contains sec3, which contains p2 and p4]
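Overlap is easy to detect when results are identified by their element paths. The sketch below assumes paths written as slash-separated strings such as doc23/sec3/p2, which is an illustrative convention and not the INEX result format; doc7/sec1 is a made-up extra result. The filter remove_overlap shows one simple way of handling overlap (keep only the highest-ranked element on each path) and is not the metric used by INEX.

```python
def is_ancestor(a, b):
    """True if element path a is a proper ancestor of element path b."""
    return b.startswith(a + "/")


def overlapping_pairs(results):
    """Return all pairs of results where one element contains the other."""
    pairs = []
    for index, first in enumerate(results):
        for second in results[index + 1:]:
            if is_ancestor(first, second) or is_ancestor(second, first):
                pairs.append((first, second))
    return pairs


def remove_overlap(results):
    """Keep a result only if no ancestor or descendant is ranked above it."""
    kept = []
    for result in results:
        if not any(is_ancestor(k, result) or is_ancestor(result, k) for k in kept):
            kept.append(result)
    return kept


# Hypothetical ranked run: doc23 is returned together with its own sub-components.
RUN = ["doc23", "doc23/sec3", "doc7/sec1", "doc23/sec3/p2"]

print(overlapping_pairs(RUN))
print(remove_overlap(RUN))
```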
27Conclusion
- XML retrieval is not just about the effective retrieval of XML documents, but also about how to evaluate the effectiveness
- INEX 2004
- More rigorous query topic format (e.g. parser)
- New metrics (e.g. not based on precision/recall)
- Tracks
- Relevance feedback
- Interactive
- Heterogeneous collection
- Natural language query
28Thank you
- http://inex.is.informatik.uni-duisburg.de:2004/