Title: Rethinking Database System Architecture: Towards a Self-Tuning RISC-style Database System
1Rethinking Database System Architecture: Towards a Self-Tuning RISC-style Database System
- Surajit Chaudhuri, Microsoft Research, Redmond, USA
- Gerhard Weikum, University of the Saarland, Saarbruecken, Germany
2Conclusion
Problem: DBMS technology is packaged monolithically, with too many features and too much complexity
- Solution: RISC-style simplification and componentization
  - break up DBMS into layered packages with narrow APIs and self-tuning capabilities
  - compose appropriate packages into broader range of IT applications
Think globally, fix locally
3Outline
- Analysis
- Role Models for New Departure
- Proposal
4Passing of a Dream
[Figure: Old World vs. New World]
- Old World: DBMS at center of the universe
- New World: multi-tier architecture with many custom data managers
- Diagram labels: dot com, inventory, payroll, <XML>, Web server, DBMS, order entry, ERP, Mining
5Why Did This Happen?
- Universality of DBMS was a leap of faith
- SQL is unnatural and complex
- Yet another failed example of transparency trap
- Featurism has turned into a curse
- Excessive bundling
- Performance is unpredictable
- (Auto-) Tuning is a nightmare
- Unacceptable GPR (gain/pain ratio) for app system architects
6Example of Poor GPR: DBMS Query Processor
- Yet another indexing smart added
- Yet another join method added
- Yet another transformation rule added
- Optimizer designers will admit:
  - It is unpredictable
  - Hard to abstract principles
- ERP/Mining/etc attempt to outsmart QP
- Turning into black magic
- Cannot educate next generation of engineers
7Role Models for New Departure
- Ex. 1: Aircraft with many subsystems (engine, fuselage, electrical control, etc.)
- Ex. 2: RISC hardware
- No single engineer understands entire system
- Local theories for individual subsystems and reasonable understanding of interactions
- Few points of interaction with stable and narrow interfaces
- Built-in system support for debugging subcomponents (incl. performance)
8RISC Philosophy for DBMS
- DBMS technology must be packaged as components with simplified functionality
- Enforce:
  - Layered approach
  - Strong limits on interaction (narrow APIs)
  - Multiple consumers for a component
- Components must have manageable complexity to be desirable for their potential consumers
- Encapsulation must include predictable performance and self-tuning
9Why Predictability is Crucial
From best-effort to guaranteed performance
"Our ability to analyze and predict the performance of the enormously complex software systems ... are painfully inadequate" (PITAC Report)
- Downtime is very expensive (100K/min)
- Very slow servers are like unavailable servers
- Tuning for peak load requires predictability of workload × config → performance function
- Self-tuning requires mathematical models
- Feasible at component scale
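To make the idea of a workload × config → performance function concrete, here is a minimal sketch that assumes a single component behaves like an M/M/1 queue. The queueing model, the function names, and the numbers are illustrative assumptions and do not come from the slides; the arrival rate stands in for the workload and the service rate for the configuration.

```python
import math


def predicted_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean response time of an M/M/1 queue: 1 / (mu - lambda).

    arrival_rate (lambda) plays the role of the workload and service_rate (mu)
    the role of the configuration. Returns infinity when the server is
    saturated, i.e. when no steady state exists.
    """
    if arrival_rate >= service_rate:
        return math.inf
    return 1.0 / (service_rate - arrival_rate)


def min_service_rate(arrival_rate: float, target_response_time: float) -> float:
    """Smallest service rate that meets the response-time target at this load."""
    if target_response_time <= 0:
        raise ValueError("target_response_time must be positive")
    # Solve 1 / (mu - lambda) = target for mu.
    return arrival_rate + 1.0 / target_response_time


if __name__ == "__main__":
    peak_load = 80.0  # requests per second (hypothetical peak workload)
    for mu in (90.0, 100.0, 120.0):
        print(f"service rate {mu}: {predicted_response_time(peak_load, mu):.3f} s")
    print("rate needed for 50 ms:", min_service_rate(peak_load, 0.05))
```

With a closed-form model like this, tuning for peak load becomes a calculation rather than trial and error, which is only realistic when the component is small enough for such a model to hold.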
10Check Availability
(Look-Up Will Take 8-25 Seconds)
Internal Server Error. Our system administrator
has been notified. Please try later again.
11RISC-style Engine (Components)
- Design principles for components:
  - include only functionality that is self-tuning
  - apply Occam's razor for internal alternatives
- Level 1 (base layer): SPJ (select-project-join) only
  - only B-trees, with automatic index selection built-in
  - API includes prioritization and exec. time prediction
- Level 2: Support for aggregation
  - Uses level 1 with narrow API
  - Self-tuning for aggregation considerations
- Level 3: Full-fledged SQL
- Layering sacrifices performance for manageability
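The layering above can be sketched in a few lines of code. The following is a toy illustration under stated assumptions, not the engine the slides propose: `Level1Engine`, `Level2Engine`, `spj`, `predict_exec_time`, and `sum_by` are hypothetical names, tables are in-memory lists of dictionaries, a dictionary lookup stands in for a B-tree index, the time prediction is a crude constant-cost-per-row model, and prioritization is left out.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple

Row = Dict[str, Any]


class Level1Engine:
    """Hypothetical base layer: select-project-join (SPJ) over in-memory tables."""

    def __init__(self, tables: Dict[str, List[Row]], seconds_per_row: float = 1e-6):
        self.tables = tables
        self.seconds_per_row = seconds_per_row  # assumed constant cost per row read

    def predict_exec_time(self, table: str,
                          join: Optional[Tuple[str, str]] = None) -> float:
        """Predict execution time from the number of rows the query reads."""
        rows = len(self.tables[table])
        if join is not None:
            rows += len(self.tables[join[0]])
        return rows * self.seconds_per_row

    def spj(self, table: str, where: Callable[[Row], bool],
            columns: Sequence[str],
            join: Optional[Tuple[str, str]] = None) -> List[Row]:
        """Optionally equi-join, keep rows matching `where`, then project."""
        rows = self.tables[table]
        if join is not None:
            other, key = join
            # Dictionary lookup on the join key stands in for an index.
            index: Dict[Any, List[Row]] = {}
            for r in self.tables[other]:
                index.setdefault(r[key], []).append(r)
            rows = [{**l, **r} for l in rows for r in index.get(l[key], [])]
        return [{c: r[c] for c in columns} for r in rows if where(r)]


class Level2Engine:
    """Hypothetical aggregation layer that uses level 1 only through its API."""

    def __init__(self, base: Level1Engine):
        self.base = base

    def sum_by(self, table: str, group_col: str, value_col: str,
               where: Optional[Callable[[Row], bool]] = None,
               join: Optional[Tuple[str, str]] = None) -> Dict[Any, float]:
        """Sum `value_col` per distinct value of `group_col`."""
        predicate = where if where is not None else (lambda r: True)
        totals: Dict[Any, float] = {}
        for r in self.base.spj(table, predicate, [group_col, value_col], join):
            totals[r[group_col]] = totals.get(r[group_col], 0) + r[value_col]
        return totals


if __name__ == "__main__":
    l1 = Level1Engine({
        "orders": [{"cust": 1, "amount": 10.0}, {"cust": 2, "amount": 5.0},
                   {"cust": 1, "amount": 2.5}],
        "customers": [{"cust": 1, "region": "EU"}, {"cust": 2, "region": "US"}],
    })
    l2 = Level2Engine(l1)
    print(l2.sum_by("orders", "region", "amount", join=("customers", "cust")))
    # prints {'EU': 12.5, 'US': 5.0}
```

The point of the sketch is the shape of the dependency: level 2 never touches the tables directly, so level 1 can be tuned, measured, and replaced behind two calls.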
12RISC in the Large
- Composition principles for IT solutions in the large:
  - Choose least-complexity components
  - IT solution can rely on predictable/guaranteed performance of components
- Use level 1 engine (SPJ, or merely record and index managers) for MP3 repository, simple E-service etc.
- Use level 2 engine (SPJ + aggregation) for OLAP or ERP
- Use level 3 engine (SQL) for full-fledged DW, legacy apps
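The "choose least-complexity components" rule can be expressed as a small selection function. This is a sketch only: the feature names and the `LEVELS` table are assumptions made for illustration, loosely following the three engine levels named on the slides.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical feature sets per engine level, ordered from simplest to richest.
LEVELS: List[Tuple[int, FrozenSet[str]]] = [
    (1, frozenset({"select", "project", "join"})),
    (2, frozenset({"select", "project", "join", "aggregate"})),
    (3, frozenset({"select", "project", "join", "aggregate", "full_sql"})),
]


def least_complex_level(required: Iterable[str]) -> int:
    """Return the lowest engine level whose features cover all requirements."""
    needed = set(required)
    for level, features in LEVELS:
        if needed <= features:
            return level
    unsupported = sorted(needed - LEVELS[-1][1])
    raise ValueError(f"no engine level supports: {unsupported}")


if __name__ == "__main__":
    print(least_complex_level(["select", "join"]))       # MP3 repository style
    print(least_complex_level(["select", "aggregate"]))  # OLAP style
    print(least_complex_level(["full_sql"]))             # legacy application
```

Scanning the levels in ascending order means an application never picks up more engine than its requirements demand.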
13Implications of RISC Approach
- Need for Universal Glue for components
- COM/Universal Runtime and EJB
- Simplicity is key
- Eliminate all second-order optimizations
- Restrict alternatives
- Not yet another join method or transformation rule
- Don't abuse extensibility!
14Road Map
- Demonstrate plug and play light-weight data servers for various scenarios (API and guaranteed performance):
- MP3 repositories
- OLAP server
- Metadata manager
- Open source bazaar?
15Potential Caveats and Rebuttals
- We've been down this road before!
  - But we now have better understanding of the appropriate components and APIs.
- We will lose performance!
  - But we win in terms of predictability and overall GPR.
- There is no business incentive!
  - As industries mature, predictability and manageability do matter for long-term benefit.