1
Overview of SP-GiST
  • Walid G. Aref
  • Department of Computer Science
  • Purdue University

2
Indexing
  • With the emergence of non-traditional database
    applications, the need for non-traditional types
    of indexes is inevitable
  • Example applications

3
Challenges in Indexing
  • Current database systems support:
      • B-trees and hash tables
  • Very few systems support R-trees
  • Even fewer systems support a variant of the region
    quadtree

4
What is Wrong?
  • Building and integrating an index type into the
    database system is an overwhelming task:
      • Integration with the query optimizer: when to
        use the index and when not, cost model,
        selectivity estimation
      • Providing query operators that utilize the index
      • Concurrency control and recovery techniques
  • Very few index structures or research efforts
    address all of these issues

5
SP-GiST: Space-Partitioning Generalized Search Trees
  • Software engineering solution to support a wide
    class of indexes inside a database management
    system
  • GiST: supports B-tree-like indexes, e.g., R-trees
  • SP-GiST: supports the class of space-partitioning
    trees, e.g., variants of the quadtree, variants
    of the trie, and the k-d tree

6
SP-GiST: Space-Partitioning Generalized Search Trees
  • The framework provides the basic services inside
    a database system, e.g.,
      • Concurrency control and recovery
      • Query operators
      • Bulk-loading and insertion
      • Integration with the query optimizer: costing,
        selectivity estimation
      • Node clustering
  • SP-GiST supports the class of space-partitioning
    (SP) trees
      • Suitable for emerging database applications
      • An extensible index structure that can be
        instantiated to realize any member of the class
        of SP trees
  • Example index structures realizable by SP-GiST:
      • Disk-based versions of variants of the trie, all
        variants of quadtrees and octrees, the k-d tree,
        and the bin-tree

7
SP-GiST Extensible Interfaces
  • Internal Methods
      • Supported by the DBMS
      • Reflect similarities among the various SP-trees,
        e.g., the insert, delete, search, and bulk-load
        algorithms
  • Interface Parameters and External Methods
      • Extensible index interfaces
      • Reflect structural and behavioral differences
        among the various SP-trees
      • Must be supplied by the user to instantiate a
        new type of SP-tree index

8
Examples of SP Trees
  • Data-driven SP trees: space is decomposed based
    on the input data
      • Example: k-d tree
  • Space-driven SP trees: space decomposition is
    independent of the order of data insertion
      • Example: trie

9
Examples of Index Realization inside SP-GiST

10
Main Characteristics of Space-Partitioning Trees
  • Decompose the space recursively into a fixed
    number of disjoint partitions
  • There are two types of space-partitioning trees:
      • Space-driven space-partitioning trees,
        e.g., the trie and the region quadtree
      • Data-driven space-partitioning trees,
        e.g., the point quadtree and the k-d tree
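
The difference between the two types can be sketched in a few lines. This is only an illustration, not code from SP-GiST: the helper names (`quadrant`, `space_driven_split`, `data_driven_split`) and the rectangle layout are assumptions made for the example. Both variants split a 2-D cell into four disjoint partitions; they differ only in where the split point comes from.

```python
from typing import Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), assumed layout


def quadrant(split: Point, p: Point) -> int:
    """Return 0..3: which of the four disjoint partitions around `split` holds p."""
    return (1 if p[0] >= split[0] else 0) + (2 if p[1] >= split[1] else 0)


def space_driven_split(cell: Rect) -> Point:
    """Region-quadtree style: always split at the centre of the cell,
    regardless of which data has been inserted."""
    xmin, ymin, xmax, ymax = cell
    return ((xmin + xmax) / 2, (ymin + ymax) / 2)


def data_driven_split(stored_point: Point) -> Point:
    """Point-quadtree style: split at a data point stored in the node,
    so the decomposition depends on the input data."""
    return stored_point


cell = (0.0, 0.0, 8.0, 8.0)
p = (5.0, 3.0)
space_q = quadrant(space_driven_split(cell), p)      # split point is (4, 4)
data_q = quadrant(data_driven_split((6.0, 1.0)), p)  # split point is (6, 1)
```

The same point lands in a different partition in each case because only the data-driven split depends on what was inserted earlier.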

11
Index Realization using SP-GiST
  • Interface Parameters
      • Key type: the type of the data at the leaf level
        of the tree, e.g., word, point
      • Number of space partitions: e.g., 4 for a
        quadtree, 26 for a trie
      • Node predicate: the predicate to use when
        navigating the SP-tree, e.g., letter a,
        (x,y) inside node.rectangle
      • Bucket size: the maximum number of items that a
        leaf node can hold

12
Index Realization using SP-GiST
  • Resolution: the maximum number of space
    decompositions, set based on the space and the
    granularity required

13
Interface Parameters for SP-GiST
  • PathShrink: NeverShrink, LeafShrink, or TreeShrink

14
Interface Parameters for SP-GiST
  • NodeShrink: determines whether empty partitions
    are kept in the tree
      • E.g., tree vs. forest
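
One way to picture the interface parameters listed on the preceding slides is as a small configuration record, instantiated once per index type. The sketch below is hypothetical: the class name `InterfaceParams`, the field names, and the predicate helpers are invented for illustration and are not the SP-GiST API. Only the partition counts (26 for a trie, 4 for a quadtree), the example predicates, and the three PathShrink values come from the slides; the bucket sizes, resolutions, and shrink settings are arbitrary placeholder values.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable


class PathShrink(Enum):
    NEVER_SHRINK = "NeverShrink"
    LEAF_SHRINK = "LeafShrink"
    TREE_SHRINK = "TreeShrink"


@dataclass(frozen=True)
class InterfaceParams:  # hypothetical name, not part of SP-GiST
    key_type: type                              # type of data at the leaf level
    num_space_partitions: int                   # fan-out of each decomposition
    node_predicate: Callable[[Any, Any], bool]  # used when navigating the tree
    bucket_size: int                            # max items a leaf node can hold
    resolution: int                             # max number of decompositions
    path_shrink: PathShrink
    node_shrink: bool  # assumed polarity: True means empty partitions are dropped


def same_letter(letter: str, node_letter: str) -> bool:
    """Trie predicate, e.g. 'letter a'."""
    return letter == node_letter


def inside_rectangle(point, rect) -> bool:
    """Quadtree predicate: (x, y) inside node.rectangle."""
    x, y = point
    xmin, ymin, xmax, ymax = rect
    return xmin <= x < xmax and ymin <= y < ymax


# Placeholder values except for the partition counts and predicates.
trie_params = InterfaceParams(
    key_type=str, num_space_partitions=26, node_predicate=same_letter,
    bucket_size=1, resolution=32,
    path_shrink=PathShrink.TREE_SHRINK, node_shrink=True)

quadtree_params = InterfaceParams(
    key_type=tuple, num_space_partitions=4, node_predicate=inside_rectangle,
    bucket_size=4, resolution=16,
    path_shrink=PathShrink.LEAF_SHRINK, node_shrink=False)
```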

15
Index Realization using SP-GiST
  • External Methods
      • Consistent: a boolean function that guides the
        search in the tree
      • PickSplit: defines how nodes in the tree are
        split
      • Cluster: defines how tree nodes are clustered
        into disk pages (SP-GiST provides a default
        clustering algorithm)
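
The three external methods can be sketched as an abstract interface with one concrete instantiation for a trie over words. All signatures here are assumptions made for illustration; the slides do not give the real ones. The `cluster` default below simply packs nodes into pages in order and is a stand-in, not SP-GiST's actual default clustering algorithm.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ExternalMethods(ABC):  # hypothetical interface, not the SP-GiST API
    @abstractmethod
    def consistent(self, predicate, query, level: int) -> bool:
        """Boolean test: can `query` be found under the entry with `predicate`?"""

    @abstractmethod
    def pick_split(self, items: list, level: int) -> dict:
        """Distribute the items of an overflowing node over partitions."""

    def cluster(self, nodes: list, page_capacity: int) -> List[list]:
        """Placeholder default: fill disk pages sequentially."""
        return [nodes[i:i + page_capacity]
                for i in range(0, len(nodes), page_capacity)]


class TrieMethods(ExternalMethods):
    @staticmethod
    def _letter(word: str, level: int) -> str:
        # "" marks a word that ends before this level.
        return word[level] if level < len(word) else ""

    def consistent(self, predicate: str, query: str, level: int) -> bool:
        return self._letter(query, level) == predicate

    def pick_split(self, items: List[str], level: int) -> Dict[str, List[str]]:
        parts: Dict[str, List[str]] = {}
        for word in items:
            parts.setdefault(self._letter(word, level), []).append(word)
        return parts


methods = TrieMethods()
parts = methods.pick_split(["star", "space", "spade", "blue"], level=0)
pages = methods.cluster(["n1", "n2", "n3"], page_capacity=2)
```

A quadtree instantiation would keep the same interface and swap in a rectangle-containment test for `consistent` and a four-way spatial split for `pick_split`.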

16
SP-GiST Internal Methods
  • Internal methods are provided by the SP-GiST
    framework and do not need recoding
  • Search: traverses the tree using the Consistent
    external method
  • Insert and Delete: build the tree using the
    PickSplit and Cluster external methods
  • Bulk Load: uses a general bulk-loading algorithm
    (ICDE 2004), Direct Buffering Bulk Loading
    (DBDL), that can bulk load any SP-tree
  • Bulk Insert: uses a general bulk-insertion
    algorithm (ICDE 2004), Buffer Tree Bulk Insertion
    (BTBI), that can bulk insert a group of objects
    into any SP-tree
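
To show why the internal methods need no recoding, here is a minimal in-memory sketch of a generic search and insert that know nothing about the index type and rely only on user-supplied `consistent` and `pick_split` callables. It is an assumption-laden toy: the node classes, function signatures, and trie helpers are invented for the example, and clustering, deletion, bulk operations, and disk pages are all omitted.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Leaf:
    items: List[Any] = field(default_factory=list)


@dataclass
class Inner:
    children: Dict[Any, Any] = field(default_factory=dict)  # predicate -> node


def insert(node, key, level, bucket_size, resolution, consistent, pick_split):
    """Generic insert: returns the (possibly replaced) subtree root."""
    args = (bucket_size, resolution, consistent, pick_split)
    if isinstance(node, Leaf):
        node.items.append(key)
        # Split only on overflow, and never beyond the resolution limit.
        if len(node.items) <= bucket_size or level >= resolution:
            return node
        inner = Inner()
        for pred, group in pick_split(node.items, level).items():
            child = Leaf()
            for item in group:
                child = insert(child, item, level + 1, *args)
            inner.children[pred] = child
        return inner
    for pred, child in node.children.items():
        if consistent(pred, key, level):
            node.children[pred] = insert(child, key, level + 1, *args)
            return node
    # No existing partition fits: ask pick_split for the key's partition.
    pred = next(iter(pick_split([key], level)))
    node.children[pred] = Leaf([key])
    return node


def search(node, query, level, consistent):
    """Generic exact-match search guided by `consistent`."""
    if isinstance(node, Leaf):
        return [k for k in node.items if k == query]
    hits = []
    for pred, child in node.children.items():
        if consistent(pred, query, level):
            hits.extend(search(child, query, level + 1, consistent))
    return hits


# Trie-specific external methods (illustrative).
def letter(word: str, level: int) -> str:
    return word[level] if level < len(word) else ""


def trie_consistent(pred: str, query: str, level: int) -> bool:
    return letter(query, level) == pred


def trie_pick_split(items: List[str], level: int) -> Dict[str, List[str]]:
    parts: Dict[str, List[str]] = {}
    for w in items:
        parts.setdefault(letter(w, level), []).append(w)
    return parts


root: Any = Leaf()
for w in ["star", "space", "spade", "blue"]:
    root = insert(root, w, 0, 2, 8, trie_consistent, trie_pick_split)
found = search(root, "spade", 0, trie_consistent)
```

Swapping in different `consistent` and `pick_split` functions would give a different index type while `insert` and `search` stay unchanged, which is the division of labor the slide describes.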

17
SP-GiST Details
  • More details:
      • Ilyas and Aref, JIIS 2001, SSDBM 2001
      • www.cs.purdue.edu/faculty/aref.html