Title: Scheduling in Staged-DB Systems
1. Scheduling in Staged-DB Systems
- Nicolas Bonvin, Rammohan Narendula, and Surender Reddy Yerva
2. Organization
- What is Staged-DB?
- Scheduling in Staged-DB
- Our Contribution
- Scheduling in Execution Phase
- System Modeling
- System Design Details
- Performance Study
- Future Work
3. Motivation
- Response time: time needed to produce the first page as output
- Big advantage for the overlapping case ('1')
4. Query Lifetime in DBMS
- Query → PARSER → query tree → OPTIMIZER (catalogs and statistics) → query plan (operators) → EXECUTION (data) → Answer
- EXECUTION (disk I/O): 90% of time
5. DB Paradigm So Far
- Query → Query Execution Plan (tree of operators)
- Multiple queries
- Each query handled by a DIFFERENT THREAD
- No cross communication/sharing across threads → sharing opportunity is missed
[Figure: one query, multiple operators]
6. Staged-DB Paradigm
- DB is remodeled as various stages
- Stage
- Common execution logic grouped into a stage
- Each operator in QEP can be seen as a stage
- A query passes through all the needed stages to produce an output
- Common data needs → detected by the stage
[Figure: DBMS vs. StagedDB; one operator, multiple queries]
7. Staged Database Systems
[Figure: queries in a conventional DBMS]
- DB → stages; execution stage → microEngines
- Each stage has a queue; each microEngine also has a request queue
- High concurrency → locality across requests
8. Scheduling in Staged-DB
- Scheduling at different levels
- Stages (Parser, Optimizer, Execution)
- Across microEngines (the execution engine has SCAN, JOIN, etc. microEngines)
- Within a microEngine
- We consider only scheduling across microEngines
- Scheduling Policies
- Round-Robin
- Heavy Load First
- Light Load First
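The three policies above can be sketched as functions that choose which microEngine runs next. This is only an illustration, not the authors' scheduler: the function names are hypothetical, "load" is assumed to mean the number of packets waiting in an engine's input queue, and engines with an empty queue are assumed to be skipped.

```python
from typing import Dict, List, Optional


def _ready(queue_lengths: Dict[str, int]) -> List[str]:
    """Engines with at least one pending packet, in a stable (sorted) order."""
    return sorted(name for name, n in queue_lengths.items() if n > 0)


def pick_round_robin(queue_lengths: Dict[str, int],
                     last: Optional[str]) -> Optional[str]:
    """Cycle through ready engines in name order, starting after `last`."""
    ready = _ready(queue_lengths)
    if not ready:
        return None
    for name in ready:
        if last is None or name > last:
            return name
    return ready[0]  # wrap around to the first ready engine


def pick_heavy_load_first(queue_lengths: Dict[str, int]) -> Optional[str]:
    """Choose the ready engine with the longest input queue (assumed load measure)."""
    ready = _ready(queue_lengths)
    return max(ready, key=lambda n: queue_lengths[n]) if ready else None


def pick_light_load_first(queue_lengths: Dict[str, int]) -> Optional[str]:
    """Choose the ready engine with the shortest non-empty input queue."""
    ready = _ready(queue_lengths)
    return min(ready, key=lambda n: queue_lengths[n]) if ready else None
```

With queue lengths `{"scan": 3, "join": 1, "sort": 0}`, Heavy Load First selects `scan`, Light Load First selects `join`, and Round-Robin alternates between `join` and `scan` while `sort` stays idle.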
9. Detailed System Design
- Based on the discrete event simulation technique
- All computation, data needs, and dependencies are modeled using events
- System components
- Global System Queue
- Dispatcher
- Operator (or) mEngine
- Global Scheduler
- Main Memory
- Overlap Detector
10. Global System Queue
- Event format: eventId | componentId | functionId | firingTime | packet
[Figure: global system queue; labels: Query Arrival event, Dispatcher, Scheduler, Engine Insert, Memory, Disk-Fetch]
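A global event queue of this kind is commonly built as a priority queue ordered by firing time. The sketch below uses the event fields named on the slide (eventId, componentId, functionId, firingTime, packet); the class names, the `schedule` and `run` methods, and the handler callback are assumptions made for illustration, not the authors' code.

```python
import heapq
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass(order=True)
class Event:
    # Only firing_time and event_id take part in ordering; event_id breaks ties.
    firing_time: float
    event_id: int
    component_id: str = field(compare=False)
    function_id: str = field(compare=False)
    packet: Any = field(compare=False, default=None)


class GlobalSystemQueue:
    """Hypothetical discrete event queue: always fires the earliest event next."""

    def __init__(self) -> None:
        self._heap: List[Event] = []
        self._next_id = 0
        self.now = 0.0  # current simulated time

    def schedule(self, firing_time: float, component_id: str,
                 function_id: str, packet: Any = None) -> Event:
        ev = Event(firing_time, self._next_id, component_id, function_id, packet)
        self._next_id += 1
        heapq.heappush(self._heap, ev)
        return ev

    def run(self, handler: Callable[["GlobalSystemQueue", Event], None]) -> List[Event]:
        """Pop events in firing-time order; the handler may schedule new events."""
        fired = []
        while self._heap:
            ev = heapq.heappop(self._heap)
            self.now = ev.firing_time
            handler(self, ev)
            fired.append(ev)
        return fired
```

A handler would typically dispatch on `component_id` and `function_id`, for example turning a query-arrival event into an engine-insert event scheduled a little later in simulated time.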
11. mEngine
- Input packet queue: request packets arrive from the parent node or the dispatcher
- Packet format: queryId list | queryPlans | pageId | contextInfo
- Events: Engine Insert, Engine Execution Begin, Engine Execution End
- Actions: insert packet; call overlap detector; pick packet from the queue; send packet to child OR execute and produce output; insert event into the event queue for the scheduler
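One way to picture an mEngine's input queue together with overlap detection is sketched below. The packet fields follow the slide (queryId list, queryPlans, pageId, contextInfo), but the merge rule is an assumption: a new packet that asks for a page already requested by a queued packet is folded into that packet, so the page is processed once for several queries. The class and method names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Optional


@dataclass
class Packet:
    query_ids: List[int]
    page_id: int
    query_plans: dict = field(default_factory=dict)
    context_info: dict = field(default_factory=dict)


class MEngine:
    """Hypothetical microEngine holding a FIFO input packet queue."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.input_queue: Deque[Packet] = deque()

    def insert(self, packet: Packet) -> bool:
        """Engine Insert: return True if the packet overlapped a queued one."""
        for queued in self.input_queue:
            if queued.page_id == packet.page_id:  # assumed overlap criterion
                for qid in packet.query_ids:
                    if qid not in queued.query_ids:
                        queued.query_ids.append(qid)
                queued.query_plans.update(packet.query_plans)
                return True
        self.input_queue.append(packet)
        return False

    def pick(self) -> Optional[Packet]:
        """Engine Execution Begin: take the first packet in the queue, if any."""
        return self.input_queue.popleft() if self.input_queue else None
```

If queries 1 and 2 both request page 10 from the same engine, the second insert is merged into the first, and `pick()` returns a single packet whose `query_ids` is `[1, 2]`.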
12. mEngines
- Join
- Sort
- Aggregation
- Scan
- Wait and Scan
- Index Scan
13. Overlap detection
- With memory
- With input queue
- Two types
- Linear
- Spike
14. Memory Manager
- Pinning and unpinning
- Put()
- pageExists()
- consumePage()
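The slide names the operations but not their semantics, so the following is a guess at how they might fit together. Assumptions: `Put()` stores a page pinned once per expected consumer, `consumePage()` returns the page and removes one pin, and a page is dropped when its last pin is released. The camelCase method names mirror the slide; the `capacity` and `consumers` parameters are hypothetical.

```python
from typing import Any, Dict


class MemoryManager:
    """Hypothetical page store using pin counts (semantics assumed, see above)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity              # maximum number of resident pages
        self._pages: Dict[int, Any] = {}      # page_id -> page contents
        self._pins: Dict[int, int] = {}       # page_id -> outstanding pin count

    def pageExists(self, page_id: int) -> bool:
        return page_id in self._pages

    def put(self, page_id: int, data: Any, consumers: int = 1) -> bool:
        """Store a page pinned `consumers` times; return False if memory is full."""
        if page_id in self._pages:
            self._pins[page_id] += consumers  # more queries want this page
            return True
        if len(self._pages) >= self.capacity:
            return False
        self._pages[page_id] = data
        self._pins[page_id] = consumers
        return True

    def consumePage(self, page_id: int) -> Any:
        """Return the page and unpin it once; free it when no pins remain."""
        data = self._pages[page_id]
        self._pins[page_id] -= 1
        if self._pins[page_id] == 0:
            del self._pages[page_id]
            del self._pins[page_id]
        return data
```

Under these assumptions, a page shared by two overlapping queries stays resident until both have consumed it, which fits the higher memory consumption reported for the overlapping case.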
15. Performance study
- 5 queries
- 5 runs
- Uniform arrival rate
16. Effect of Overlapping
- Response time: time needed to produce the first page as output
- Big advantage for the overlapping case ('1')
17. Effect of Overlapping
- Memory consumption: maximum number of pages consumed in memory during the lifetime of the query
- Higher memory consumption with overlapping!
18. Effect of Overlapping
- Throughput: number of queries completed in a unit of time
- Clear advantage with overlap detection!
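The three metrics defined on the "Effect of Overlapping" slides can be written down as small functions. The function names and the shape of the inputs (timestamps and per-step page counts taken from a simulation trace) are assumptions for illustration; the slides do not describe how the measurements were collected.

```python
from typing import Iterable, Sequence


def response_time(arrival_time: float, first_page_time: float) -> float:
    """Time from query arrival until its first output page is produced."""
    return first_page_time - arrival_time


def memory_consumption(pages_in_memory: Sequence[int]) -> int:
    """Maximum number of pages held in memory over the query's lifetime."""
    return max(pages_in_memory, default=0)


def throughput(completion_times: Iterable[float],
               window_start: float, window_end: float) -> float:
    """Queries completed per unit of time within [window_start, window_end)."""
    done = sum(1 for t in completion_times if window_start <= t < window_end)
    return done / (window_end - window_start)
```

For example, a query that arrives at time 2.0 and emits its first page at 5.5 has a response time of 3.5 time units.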
19. Comparing scheduling policies
- Mean response time
- Round Robin seems to perform a little better
20. Comparing scheduling policies
- Memory consumption
- No differences!
21. Future Work
- A few more interesting global scheduling policies are possible.
- The system did not consider a local scheduling policy to pick one packet among many in the input packet queue for processing next. It picks the first packet in the queue at the moment.
- Regarding implementation, experimentation should be done with more mEngines and benchmark-style input queries.