Positional Grouping in XQuery

1
Positional Grouping in XQuery
  • Michael Kay

2
Definitions
  • Grouping
    • a function that takes a sequence as input, and
      produces a sequence of sequences as output
  • Value-based Grouping
    • assignment of items to groups is based on
      properties of the item
  • Positional Grouping
    • identification of structure based on patterns in
      the sequence of items

3
Requirements
  • Based on use cases
  • Real-life problems gathered from xsl-list
    1999-2005
  • Some were captured in the XSLT 2.0 requirements
    (2001), others came later
  • No attempt to define any theoretical notion of
    completeness, e.g. a class of grammars

4
Use Case 1: Headings and Paragraphs
Input:
<body>
  <h2>Morning</h2>
  <p>Got up</p>
  <p>Made tea</p>
  <h2>Afternoon</h2>
  <p>Had lunch</p>
  <p>Fed the cat</p>
  <p>Posted a letter</p>
</body>
Output:
<chapter>
  <section title="Morning">
    <para>Got up</para>
    <para>Made tea</para>
  </section>
  <section title="Afternoon">
    <para>Had lunch</para>
    <para>Fed the cat</para>
    <para>Posted a letter</para>
  </section>
</chapter>

5
Use Case 2: Adjacent Bullets
Input:
<p/>
<q/>
<bullet>one</bullet>
<bullet>two</bullet>
<x/>
<y/>
Output:
<p/>
<q/>
<list>
  <bullet>one</bullet>
  <bullet>two</bullet>
</list>
<x/>
<y/>

6
Use Case 3: Term Definition Lists
Input:
<dt>XML</dt>
<dd>Extensible Markup Language</dd>
<dt>XSLT</dt>
<dt>XSL Transformations</dt>
<dd>A language for transforming XML</dd>
<dd>A specification produced by W3C</dd>
Output:
<term>
  <dt>XML</dt>
  <dd>Extensible Markup Language</dd>
</term>
<term>
  <dt>XSLT</dt>
  <dt>XSL Transformations</dt>
  <dd>A language for transforming XML</dd>
  <dd>A specification produced by W3C</dd>
</term>

7
Use Case 4: Continuation Markers
Input:
<in cont="yes"> One way to</in>
<in cont="yes"> understand positional grouping is</in>
<in> as an exercise in parsing.</in>
<in cont="yes"> To get from a sequence of items</in>
<in cont="yes"> to a tree, we could use</in>
<in> some kind of grammar.</in>
Output:
<para>One way to understand positional grouping is as an exercise in parsing.</para>
<para>To get from a sequence of items to a tree, we could use some kind of grammar.</para>

8
Use Case 5: Page ranges
Input:
4, 6, 9, 11, 12, 13, 18, 20, 21
Output:
4, 6, 9, 11-13, 18, 20-21

9
Use Case 6: Arrange in rows
Input:
"Green", "Pink", "Lilac", "Turquoise", "Peach", "Opal", "Champagne"
Output:
<table>
  <tr> <td>Green</td> <td>Pink</td> <td>Lilac</td> </tr>
  <tr> <td>Turquoise</td> <td>Peach</td> <td>Opal</td> </tr>
  <tr> <td>Champagne</td> <td> </td> <td> </td> </tr>
</table>

10
Use Case 7: Level Numbers
Input:
<data>
  <gedcom level="0"/>
  <indi level="1"/>
  <name level="2"/>
  <first level="3">Michael</first>
  <last level="3">Kay</last>
  <email level="2">mike@saxonica.com</email>
  <indi level="1"/>
  <name level="2"/>
  <first level="3">Norm</first>
  <last level="3">Walsh</last>
  <email level="2">norm@nwalsh.com</email>
</data>
Output:
<gedcom>
  <indi>
    <name>
      <first>Michael</first>
      <last>Kay</last>
    </name>
    <email>mike@saxonica.com</email>
  </indi>
  <indi>
    <name>
      <first>Norm</first>
      <last>Walsh</last>
    </name>
    <email>norm@nwalsh.com</email>
  </indi>
</gedcom>

11
XQuery 1.0 Solutions
  • Head/tail recursion
  • Positional indexing

12
"Headings and Paragraphs" using head/tail recursion
declare function local:section($e as element(H2)) {
  <section>{
    local:nextPara($e/following-sibling::*[1][self::P])
  }</section>
};

declare function local:nextPara($p as element(P)?) {
  if ($p)
  then ($p, local:nextPara($p/following-sibling::*[1][self::P]))
  else ()
};

<out>{
  for $h in doc('doc.xml')//BODY/H2
  return local:section($h)
}</out>

13
"Term Definition Lists" using positional indexing
let $s := (for $e at $p in $input
           where $e[self::dt and
                    not(preceding-sibling::*[1][self::dt])]
           return $p,
           count($input) + 1)
for $i in 1 to count($s) - 1
return
  <term>{
    for $j in $s[$i] to $s[$i + 1] - 1
    return $input[$j]
  }</term>

14
XSLT 2.0 approach
  • <xsl:for-each-group> handles both value-based and
    positional grouping
  • But the two are largely distinct
  • Three varieties of positional grouping
    • group-starting-with
    • group-ending-with
    • group-adjacent

15
Identifying Breaks
  • All use cases have these properties
    • the input sequence is the concatenation of the output
      sequences (the groups)
    • "breaks" depend only on
      • the item before the break
      • the item after the break
      • position (use case "arrange in rows" only)

16
Conceptual approach to solution
  • Higher order function

partition(
  $population as item()*,
  $break-function as function($after as item(),
                              $before as item(),
                              $position as xs:integer) as xs:boolean,
  $action as function($group as item()*) as item()*
) as item()*

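The higher-order function above can be modelled outside XQuery. What follows is a minimal Python sketch, not the presenter's implementation. The names `partition` and `is_break` are illustrative. It assumes the break function is consulted only between two adjacent items, and that `position` is the 1-based position of the item that follows the candidate break.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def partition(
    population: Sequence[T],
    is_break: Callable[[T, T, int], bool],
    action: Callable[[List[T]], R],
) -> List[R]:
    """Split population into groups and apply action to each group.

    is_break(after, before, position) receives the item the break would
    come after, the item it would come before, and the 1-based position
    of that second item (an assumption of this sketch).
    """
    results: List[R] = []
    group: List[T] = []
    for index, item in enumerate(population):
        if group and is_break(group[-1], item, index + 1):
            results.append(action(group))
            group = []
        group.append(item)
    if group:
        results.append(action(group))
    return results


# "Adjacent bullets": break everywhere except between two bullets.
tags = ["p", "q", "bullet", "bullet", "x", "y"]
groups = partition(
    tags,
    lambda after, before, position: not (after == "bullet" and before == "bullet"),
    list,
)
print(groups)
```

Each item is visited once, and the concatenation of the groups is the original sequence.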
17
Syntactic realisation
partition $g in population
  break after $a before $b at $p
  where condition
  return action

18
Solution to Use Case 1: Headings and Paragraphs
partition $section in body/*
  break before $e where $e[self::h2]
  return
    <section title="{$section/h2}">{
      for $p in $section/p
      return <para>{$p}</para>
    }</section>

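For comparison, the same "break before every h2" rule can be tried on the sample input with Python's standard `xml.etree.ElementTree`. This is a sketch under assumptions, not a translation of the proposed syntax: `to_chapter` is a hypothetical helper, and any paragraphs that precede the first heading are simply dropped.

```python
import xml.etree.ElementTree as ET

SOURCE = """<body>
  <h2>Morning</h2>
  <p>Got up</p>
  <p>Made tea</p>
  <h2>Afternoon</h2>
  <p>Had lunch</p>
  <p>Fed the cat</p>
  <p>Posted a letter</p>
</body>"""


def to_chapter(body: ET.Element) -> ET.Element:
    """Wrap each h2 and the p elements that follow it in a section."""
    chapter = ET.Element("chapter")
    section = None
    for child in body:
        if child.tag == "h2":
            # Break before every h2: a new group (section) starts here.
            section = ET.SubElement(chapter, "section", title=child.text or "")
        elif child.tag == "p" and section is not None:
            ET.SubElement(section, "para").text = child.text
    return chapter


chapter = to_chapter(ET.fromstring(SOURCE))
print(ET.tostring(chapter, encoding="unicode"))
```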
19
Solution to Use Case 2: Adjacent Bullets
partition $children in *
  break after $a before $b
  where not($a[self::bullet] and $b[self::bullet])
  return
    if ($children/self::bullet)
    then <list>{$children}</list>
    else $children

20
Use Case 3: Term Definition Lists
Input:
<dt>XML</dt>
<dd>Extensible Markup Language</dd>
<dt>XSLT</dt>
<dt>XSL Transformations</dt>
<dd>A language for transforming XML</dd>
<dd>A specification produced by W3C</dd>
Output:
<term>
  <dt>XML</dt>
  <dd>Extensible Markup Language</dd>
</term>
<term>
  <dt>XSLT</dt>
  <dt>XSL Transformations</dt>
  <dd>A language for transforming XML</dd>
  <dd>A specification produced by W3C</dd>
</term>
Solution:
partition $term in *
  break after $a before $b
  where ($a[self::dd] and $b[self::dt])
  return <term>{$term}</term>

21
Use Case 4: Continuation Markers
Input:
<in cont="yes">One way to</in>
<in cont="yes"> understand positional grouping is</in>
<in> as an exercise in parsing.</in>
<in cont="yes">To get from a sequence of items</in>
<in cont="yes"> to a tree, we could use</in>
<in> some kind of grammar.</in>
Output:
<para>One way to understand positional grouping is as an exercise in parsing.</para>
<para>To get from a sequence of items to a tree, we could use some kind of grammar.</para>
Solution:
partition $para in ./in
  break after $a where not($a/@cont = "yes")
  return <para>{$para}</para>

22
Use Case 5: Page ranges
Input:
4, 6, 9, 11, 12, 13, 18, 20, 21
Output:
4, 6, 9, 11-13, 18, 20-21
Solution:
partition $range in $page-numbers
  break after $a before $b
  where ($b != $a + 1)
  return
    if (count($range) = 1)
    then $range
    else concat($range[1], "-", $range[last()])

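The break rule for this use case (start a new group whenever the next number is not the previous number plus one) is easy to check by hand. A minimal Python sketch follows, with `page_ranges` as an illustrative name; it assumes the page numbers arrive already sorted, as in the sample input.

```python
from typing import List


def page_ranges(pages: List[int]) -> str:
    """Collapse runs of consecutive page numbers into 'first-last'."""
    groups: List[List[int]] = []
    for page in pages:
        # Break after a, before b, where b != a + 1.
        if groups and page == groups[-1][-1] + 1:
            groups[-1].append(page)
        else:
            groups.append([page])
    return ", ".join(
        str(g[0]) if len(g) == 1 else f"{g[0]}-{g[-1]}" for g in groups
    )


print(page_ranges([4, 6, 9, 11, 12, 13, 18, 20, 21]))
# prints: 4, 6, 9, 11-13, 18, 20-21
```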
23
Use Case 6: Arrange in rows
Input:
"Green", "Pink", "Lilac", "Turquoise", "Peach", "Opal", "Champagne"
Output:
<table> <tr>...</tr> <tr>...</tr> <tr>...</tr> </table>
Solution:
partition $rows in $colours
  break at $p where (($p - 1) mod 3 = 0)
  return
    <tr>{
      for $i in $rows
      return <td>{$i}</td>
    }</tr>

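This is the one use case where the break depends on position alone. A minimal Python sketch of the same rule follows; `rows_of` and `to_table` are hypothetical names, and the row width of 3 is taken from the example. Like the expression above, it does not pad the last row with empty cells, so padding would be a separate step.

```python
from typing import List
from xml.sax.saxutils import escape

COLOURS = ["Green", "Pink", "Lilac", "Turquoise", "Peach", "Opal", "Champagne"]


def rows_of(items: List[str], width: int = 3) -> List[List[str]]:
    """Start a new row before every position p where (p - 1) mod width = 0."""
    rows: List[List[str]] = []
    for position, item in enumerate(items, start=1):
        if (position - 1) % width == 0:
            rows.append([])
        rows[-1].append(item)
    return rows


def to_table(items: List[str], width: int = 3) -> str:
    """Render the rows as a simple table, one tr per group."""
    lines = ["<table>"]
    for row in rows_of(items, width):
        cells = "".join(f"<td>{escape(cell)}</td>" for cell in row)
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


print(to_table(COLOURS))
```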
24
Use Case 7: Level Numbers
declare function f:group($items as element()*,
                         $level as xs:integer) as element()* {
  partition $group in $items
    break before $b where $b/@level = $level
    return
      element {$group[1]/node-name()} {
        f:group(remove($group, 1), $level + 1)
      }
};

f:group(/data/*, 0)

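The recursion may be easier to follow on plain data. In this Python sketch each input item is a `(name, level)` pair and the result is a nested `(name, children)` structure. Both representations are assumptions made for illustration, as are the names `group` and `build`. As in the function above, only element names are carried over, not text content.

```python
from typing import List, Tuple

Item = Tuple[str, int]  # (element name, level number)


def group(items: List[Item], level: int) -> list:
    """Turn a flat list of levelled items into nested (name, children) pairs."""
    result = []
    current: List[Item] = []
    for item in items:
        # Break before every item whose level equals the current level.
        if item[1] == level and current:
            result.append(build(current, level))
            current = []
        current.append(item)
    if current:
        result.append(build(current, level))
    return result


def build(members: List[Item], level: int) -> tuple:
    """The first member names the element; the rest are grouped one level down."""
    return (members[0][0], group(members[1:], level + 1))


FLAT = [
    ("gedcom", 0),
    ("indi", 1), ("name", 2), ("first", 3), ("last", 3), ("email", 2),
    ("indi", 1), ("name", 2), ("first", 3), ("last", 3), ("email", 2),
]
tree = group(FLAT, 0)
print(tree)
```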
25
Performance
  • Algorithm is intrinsically O(n)
  • Memory usage
    • naive implementation uses memory proportional to
      size of largest group
    • smart implementation can be fully streamed
  • Almost inevitably better than the XQuery 1.0
    solutions
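The single-pass behaviour can be illustrated with a Python generator. This sketch corresponds to the naive case in the list above: it buffers one group at a time, so memory is proportional to the largest group, while the input itself is never held in full. The name `partition_stream` and the `(after, before, position)` argument order are assumptions of the sketch.

```python
from itertools import count, islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def partition_stream(
    items: Iterable[T],
    is_break: Callable[[T, T, int], bool],
) -> Iterator[List[T]]:
    """Yield groups one at a time, reading each input item exactly once."""
    group: List[T] = []
    for position, item in enumerate(items, start=1):
        if group and is_break(group[-1], item, position):
            yield group
            group = []
        group.append(item)
    if group:
        yield group


# Works on an unbounded input because only the current group is buffered.
first_three = list(
    islice(partition_stream(count(1), lambda a, b, p: (p - 1) % 3 == 0), 3)
)
print(first_three)
```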

26
Conclusions
  • Need exists for both value-based and positional
    grouping
  • Positional grouping use-cases can be solved by
    identifying breaks in terms of (before, after,
    position)
  • Conceptual approach based on higher-order
    functions, realised in concrete syntax