Exploiting Asynchronous IO using the Asynchronous Iterator Model

1
Exploiting Asynchronous IO using the Asynchronous
Iterator Model
  • Suresh Iyengar, S. Sudarshan, Santosh Kumar, Raja Agrawal
  • IIT Bombay

Current affiliations: Microsoft Hyderabad, Guruji.com, SAP
2
Agenda
  • AIO Background
  • Exploiting AIO in query processing
  • Asynchronous Iterator model
  • Asynchronous Index Nested Loops Join
  • Asynchronous versions of other operators
  • Performance results
  • Related Work
  • Conclusion

3
IO Processing Traditional way
[Diagram: Application -> Read() system call -> Kernel initiates IO -> context switch -> read response (data). Application blocked!]
  • CPU is idle most of the time waiting for an IO
    completion

4
IO Processing Async. way
[Diagram: Application -> AIO Read() system call -> Kernel initiates IO -> read response -> notify (data). Meanwhile the application does other work!]
5
IO Processing Async. Way
  • Asynchronous approach
  • Overlap of CPU and IO processing
  • Application can generate multiple IO requests
  • Allows IO subsystem to reorder access to data on
    disk
  • Important in RAID environments

6
Asynchronous IO Interface
Linux 2.6 kernel
aio structure: ( File descriptor, offset, buffer, numBytes, ... )
  • aio_read ( aio structure ): request an AIO read operation
  • aio_error ( aio structure ): check the status of an AIO request
  • lio_listio ( array of aio structures ): initiate a list of AIO operations
  • We use list AIO in our implementation
  • Can initiate multiple IO read operations in one
    system call

7
Handling AIO completion
  • Signal-based handler
      • A signal is generated on IO completion
  • Callback using interrupts
      • An interrupt is generated on IO completion
  • Both of the above methods involve concurrent access to the
    completion handler and shared data structures
  • Polling
      • Store IO requests in a pending queue and poll
        periodically for completion
  • Our experiments show polling beats the
    signal/interrupt-based approaches
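
The polling option can be sketched as follows. This is a minimal illustration, not the authors' implementation: Python has no binding for aio_read or lio_listio in its standard library, so a thread pool running os.pread stands in for the kernel AIO facility, and the names PAGE_SIZE, issue_reads, poll_completed and read_all are hypothetical.

```python
import os
from collections import deque
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 4096  # assumed page size for this sketch


def issue_reads(pool, fd, offsets):
    """Submit one read per offset; return the pending queue of requests."""
    return deque((off, pool.submit(os.pread, fd, PAGE_SIZE, off)) for off in offsets)


def poll_completed(pending):
    """Move every finished request out of the pending queue; never blocks."""
    done = []
    for _ in range(len(pending)):
        off, fut = pending.popleft()
        if fut.done():
            done.append((off, fut.result()))
        else:
            pending.append((off, fut))  # still in flight, check again later
    return done


def read_all(path, offsets):
    """Issue all reads up front, then poll until each one has completed."""
    results = {}
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=4) as pool:
            pending = issue_reads(pool, fd, offsets)
            while pending:
                for off, data in poll_completed(pending):
                    results[off] = data
                # other useful work would go here between polls
    finally:
        os.close(fd)
    return results
```

Because only the polling thread touches the pending queue and the result map, no locking is needed, which is the concurrency concern the slide raises for the signal and interrupt variants.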

8
Demand-Driven Iterator
  • Open()
  • Next()
  • Close()

[Diagram: NLJ node over two scan nodes reading Table A and Table B. Blocking call!]
  • Bottom level nodes perform operations such as
    sequential scans or index scans.
  • Upper level nodes are join nodes or other
    operator nodes such as sort or aggregate.
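
A minimal sketch of the demand-driven model, assuming tuples are Python tuples held in memory; the class names SeqScan and NestedLoopsJoin and the helper run are illustrative, not taken from PostgreSQL.

```python
class SeqScan:
    """Bottom-level node: returns the rows of one table in order."""

    def __init__(self, rows):
        self.rows = rows

    def open(self):
        self.pos = 0

    def next(self):
        if self.pos >= len(self.rows):
            return None  # end of input
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self):
        self.pos = None


class NestedLoopsJoin:
    """Upper-level node: pulls from its children one tuple at a time."""

    def __init__(self, outer, inner, pred):
        self.outer, self.inner, self.pred = outer, inner, pred

    def open(self):
        self.outer.open()
        self.inner.open()
        self.outer_row = self.outer.next()

    def next(self):
        while self.outer_row is not None:
            inner_row = self.inner.next()  # in a real system this may block on IO
            if inner_row is None:
                # inner input exhausted: advance the outer and rescan the inner
                self.outer_row = self.outer.next()
                self.inner.close()
                self.inner.open()
                continue
            if self.pred(self.outer_row, inner_row):
                return self.outer_row + inner_row
        return None

    def close(self):
        self.outer.close()
        self.inner.close()


def run(plan):
    """Drive a plan from the root until it reports end of input."""
    plan.open()
    out = []
    while (row := plan.next()) is not None:
        out.append(row)
    plan.close()
    return out
```

Every next() call here returns either a tuple or end of input, so a node that has to wait for disk has no choice but to block its caller.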

9
Agenda
  • AIO Background
  • Exploiting AIO in query processing
  • Asynchronous Iterator model
  • Asynchronous Index Nested Loop (INL) Joins
  • Asynchronous versions of other operators
  • Performance results
  • Related Work
  • Conclusion

10
Asynchronous Iterator
  • Open()
  • Next()
  • Close()

[Diagram: NLJ node over two scan nodes reading Table A and Table B. Non-blocking call!]
  • I don't have the tuple available in
    memory!!
  • Issue AIO read operation
  • Return LATER
11
Asynchronous Iterator Model (AIM)
  • Allow a node to return a status LATER to the
    parent instead of blocking for IO completion
  • The parent operator could
      • Perform other work, such as fetching data from
        another input
      • Simply return a LATER status to its parent node
      • Or just loop, reinvoking the child operator till
        it returns a tuple
          • E.g. root of the execution plan tree
  • Exact action depends on operator
      • Asynchronous versions of different operators
      • Focus on Asynchronous Indexed Nested Loops join
12
Asynchronous INL Joins
  • Original state of Indexed Nested Loops (INL) node
      • Left and right subplans and qualifier lists
  • Augmented state for async INL node
      • An array of outer tuples, each having a queue of
        matching inner TIDs
          • AIO may have been issued for some already, others
            later
      • A workqueue for outer slots which already have
        AIO issued for their matching inner TIDs
      • An IO queue recording all pending AIO requests
        made by the node
          • Used to poll for completion of AIO requests
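
One possible shape for this state, written as plain data classes. All field and class names here are hypothetical; the slide names the pieces but not how they are laid out in the modified PostgreSQL code.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class OuterSlot:
    outer_tuple: tuple
    # matching inner TIDs that have not been joined yet
    inner_tids: deque = field(default_factory=deque)
    # subset of inner_tids for which an AIO request was already issued
    issued: set = field(default_factory=set)


@dataclass
class AsyncINLState:
    # original INL state
    left_plan: object = None
    right_plan: object = None
    qualifiers: list = field(default_factory=list)
    # augmented state for the async version
    outer_slots: list = field(default_factory=list)    # array of OuterSlot
    workqueue: deque = field(default_factory=deque)    # indexes of slots with AIO issued
    io_queue: deque = field(default_factory=deque)     # pending AIO requests, polled later
```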

13
Asynchronous INL Join (contd.)
  • We divide the async INL join operations into two
    stages
  • Stage 1: Fetch outer tuples and issue AIO
    requests
  • Stage 2: Check for AIO completion, process AIO
    results and return join results
  • Stages are interleaved
  • Stage 1 may be in progress for some tuples, and
    Stage 2 for others

14
Asynchronous INL Join (contd.)
Stage 1
  • Fetch outer tuples
  • For each outer tuple
      • Find the matching inner TIDs
      • Put the outer tuple in workqueue
  • Issue LIST AIO for matching inner TIDs of all
    outer tuples in workqueue (subject to BATCH_SIZE)
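
Stage 1 might look roughly like this. It is a sketch under assumptions: index_lookup and submit_list_aio are hypothetical callbacks standing in for the index probe and for a single lio_listio call, and each workqueue slot is a small dictionary rather than the node's real state.

```python
from collections import deque


def stage1(outer_iter, index_lookup, workqueue, io_queue, batch_size, submit_list_aio):
    """Fetch outer tuples, find their inner TIDs, and issue one list-AIO call."""
    budget = batch_size - len(io_queue)  # room left for outstanding requests
    batch = []

    def take(slot):
        # move TIDs from the slot into the batch while there is room
        while slot["unissued"] and len(batch) < budget:
            batch.append(slot["unissued"].popleft())

    for slot in workqueue:  # TIDs left over from an earlier call go first
        take(slot)
    while len(batch) < budget:
        outer = next(outer_iter, None)
        if outer is None:
            break  # outer input exhausted
        tids = list(index_lookup(outer))
        if not tids:
            continue  # no matching inner tuple, nothing to fetch
        slot = {"outer": outer, "tids": deque(tids), "unissued": deque(tids)}
        workqueue.append(slot)
        take(slot)
    if batch:
        submit_list_aio(batch)  # one call covers the whole list
        io_queue.extend(batch)
    return batch
```

An outer tuple with more matches than the remaining budget keeps its extra TIDs in "unissued", which is one way to read the remark that AIO may have been issued for some TIDs already and for others later.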
15
Asynchronous INL Join (contd.)
  • Rules
  • Batch size
      • BATCH_SIZE = max number of outstanding AIO
        requests
      • Why? OS limits, efficiency issues
      • We set the MAX_BATCH_SIZE per node to 200 in our
        experiments
      • Scale BATCH_SIZE in powers of 2 till
        MAX_BATCH_SIZE so that async INL can output
        tuples quickly at the onset
      • Case where outer tuple matches a large number of
        inner tuples is handled appropriately
  • Keeping the AIO queue filled
      • We issue further AIO requests (fetching outer
        tuples as required) if 10% of earlier AIO
        requests have completed
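
The two rules can be written down as small helpers. The starting batch size of 1 and the reading of the refill threshold as 10 percent of issued requests are assumptions of this sketch; only MAX_BATCH_SIZE = 200 and the doubling are stated on the slide.

```python
MAX_BATCH_SIZE = 200   # per-node cap used in the experiments
REFILL_PERCENT = 10    # assumed: refill once this share of requests has completed


def next_batch_size(current, max_batch=MAX_BATCH_SIZE):
    """Double the batch size until it reaches the cap."""
    return min(current * 2, max_batch)


def should_refill(completed, issued, percent=REFILL_PERCENT):
    """True when enough earlier requests have completed to issue more."""
    return issued > 0 and completed * 100 >= percent * issued


# Ramp from a small first batch up to the cap.
ramp = [1]
while ramp[-1] < MAX_BATCH_SIZE:
    ramp.append(next_batch_size(ramp[-1]))
```

Starting small means the first few inner tuples arrive after only a handful of reads, so the join can hand results to its parent early instead of waiting for 200 requests to be prepared.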

16
Asynchronous INL Join (contd.)
Stage 2
  • For each outer tuple in workqueue
      • Check if any matching inner TIDs are present in
        memory
      • Present? No: continue with the next outer tuple
      • Present? Yes:
          • Remove that inner TID from the outer tuple's TID
            array
          • Perform join and add to result
          • If join result found, break from loop
  • Update workqueue
Next page ..
17
Asynchronous INL Join (contd.)
Prev page..
  • Any join results?
      • Yes: return result to parent node
      • No: poll for AIO completion
  • Is a tuple found, or can the parent node not handle LATER?
      • Yes: back to start of Stage 2
      • No: are there no outstanding outer tuples and has the end of
        the outer tuples been reached?
          • Yes: tupStat = END_OF_RESULT, result = NULL
          • No: tupStat = LATER, result = NULL
  • Return result and tupStat to parent node
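
One reading of the Stage 2 flowchart, as a runnable sketch. The flowchart's branch labels are ambiguous in this transcript, so the ordering below (in particular, testing for end of result before polling) is an assumption. FakeAIO is a hypothetical stand-in for the IO subsystem, and every TID in the workqueue is assumed to have been issued already by Stage 1.

```python
from collections import deque

OK, LATER, END_OF_RESULT = "OK", "LATER", "END_OF_RESULT"


class FakeAIO:
    """Stand-in for the IO subsystem: a request completes after `latency` polls."""

    def __init__(self, table, latency=1):
        self.table = table        # TID -> inner tuple "on disk"
        self.latency = latency
        self.in_flight = deque()
        self.memory = {}          # TID -> inner tuple now in the buffer

    def issue(self, tids):
        self.in_flight.extend([tid, self.latency] for tid in tids)

    def poll(self):
        """Advance the oldest request; return True if it completed."""
        if not self.in_flight:
            return False
        head = self.in_flight[0]
        head[1] -= 1
        if head[1] > 0:
            return False
        self.in_flight.popleft()
        self.memory[head[0]] = self.table[head[0]]
        return True


def stage2(workqueue, aio, outer_done, parent_handles_later):
    """Return (tupStat, result) for one call from the parent node."""
    while True:
        result = None
        for outer, tids in workqueue:
            ready = next((t for t in tids if t in aio.memory), None)
            if ready is not None:
                tids.remove(ready)                  # drop the TID from the slot
                result = outer + aio.memory[ready]  # perform the join
                break
        # update workqueue: drop outer tuples with no TIDs left to join
        workqueue[:] = [slot for slot in workqueue if slot[1]]
        if result is not None:
            return OK, result
        if not workqueue and outer_done:
            return END_OF_RESULT, None
        found = aio.poll()
        if found or not parent_handles_later:
            continue                                # back to start of Stage 2
        return LATER, None
```

When the parent cannot handle LATER (for example the root of the plan), the loop keeps polling until a tuple arrives, which matches the "just loop" behaviour described for the Asynchronous Iterator Model.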
18
Async. versions of other operators
  • Async Sequential scan
      • Check if next tuple is in the in-memory buffer
      • If it's present, return the tuple
      • Else initiate an async read, set tupStat = LATER
        and return
  • Out of order sequential scan
      • Start returning the tuples of a particular
        relation which are already in memory,
        even if out of order
      • Concurrently, issue AIO for other tuples
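
Both scan variants can be sketched over a page-granularity buffer. The dictionary buffer, the issue_read callback and the helper out_of_order_pages are assumptions of this illustration, not part of the PostgreSQL buffer manager.

```python
class AsyncSeqScan:
    """Sequential scan that returns LATER instead of blocking on a missing page."""

    def __init__(self, pages, buffer, issue_read):
        self.pages = pages            # page ids in scan order
        self.buffer = buffer          # page id -> list of tuples currently in memory
        self.issue_read = issue_read  # callback that starts an async read of a page

    def open(self):
        self.page_idx = 0
        self.tuple_idx = 0
        self.requested = set()

    def next(self):
        while self.page_idx < len(self.pages):
            page = self.pages[self.page_idx]
            if page not in self.buffer:
                if page not in self.requested:
                    self.issue_read(page)      # initiate the async read once
                    self.requested.add(page)
                return "LATER", None
            tuples = self.buffer[page]
            if self.tuple_idx < len(tuples):
                row = tuples[self.tuple_idx]
                self.tuple_idx += 1
                return "OK", row
            self.page_idx += 1                 # page exhausted, move on
            self.tuple_idx = 0
        return "END_OF_RESULT", None

    def close(self):
        pass


def out_of_order_pages(pages, buffer):
    """Scan order for the out-of-order variant: pages already in memory first."""
    return sorted(pages, key=lambda p: p not in buffer)
```

Passing out_of_order_pages(pages, buffer) as the page list gives the out-of-order behaviour: resident pages are consumed first, and the scan only starts returning LATER once it reaches pages that still have to be read.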

19
Async. versions of other operators
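[Diagram: Merge Join over two sort nodes, each over a Seq scan (on T1 and T2). One Seq scan initiates an AIO read and returns LATER; the sort above it passes LATER up; the Merge Join reacts: "I can start the sorting of the other input!"]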
20
Performance Results
  • Experiments with TPC-H database with scale
    factors of 1 and 10 in three different setups
  • Core 2 duo P4 with
      • 1 GB RAM and TPC-H 1 GB database (single disk)
      • 1 GB RAM and TPC-H 10 GB database (single disk)
      • 3.2 GB RAM and TPC-H 10 GB database (4 disks /
        RAID 10)
  • We use PostgreSQL 8.1.3 as the code base
  • Compare it with our modified version of the same
    code base, incorporating the asynchronous iterator
    model
      • with async INL and async seq. scan

21
Performance Results 1GB RAM
Query 1a: select l_orderkey, l_quantity from
orders, lineitem where o_orderkey = l_orderkey
and l_orderkey1002 and l_linestatus = 'F'
[Charts: TPCH 1 GB; TPCH 10 GB]
22
Performance Results 1 GB RAM
Query 2a: select l_orderkey, l_quantity from
orders, lineitem, customer where
o_orderkey = l_orderkey and o_custkey = c_custkey
and l_orderkey1002 and l_linestatus = 'F'
[Charts: TPCH 1 GB; TPCH 10 GB]
23
Performance Results 1GB RAM
Query 2a: Join of orders, lineitem and customer
with filter (TPCH 1 GB)
[Chart annotation: startup effect]
24
Performance Results 1 GB RAM
Query 2b: select l_orderkey, l_quantity from
myorders, lineitem, customer where
o_orderkey = l_orderkey and o_custkey = c_custkey
-- No tight selection
[Charts: TPCH 1 GB; TPCH 10 GB (1 GB RAM)]
25
Performance Results 3.2 GB RAID
TPC-H 10 GB / 3.2 GB RAM / 4 disks RAID 10
[Chart: Query 1a, join of orders and lineitem with filter]
[Chart: Query 2a, join of orders, lineitem and customer with filter]
26
Performance Results 3.2 GB RAID
TPC-H 10 GB / 3.2 GB RAM / 4 disks RAID 10
[Chart: Query 1b, join of myorders and lineitem]
[Chart: Query 2b, join of myorders, lineitem and customer]
27
Performance Results
TPC-H Q12: select l_shipmode, sum(...) from
orders, lineitem where o_orderkey = l_orderkey
and <several selection> group by l_shipmode
order by l_shipmode

Setup                                        Original INL   Async INL   Gain
TPCH 1 GB, 1 GB RAM                          64.7 sec       48 sec      25%
TPCH 10 GB, 1 GB RAM                         687 sec        431 sec     37%
TPCH 10 GB, RAID 10 (4 disks), 3.2 GB RAM    164 sec        147 sec     10%
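
The Gain column appears to be the relative reduction in elapsed time, truncated to a whole percent. That definition is an assumption, but it reproduces all three listed values, as this short check shows (gain_percent is an illustrative helper name).

```python
def gain_percent(original_sec, async_sec):
    """Relative reduction in elapsed time, in percent."""
    return 100.0 * (original_sec - async_sec) / original_sec


# (original INL seconds, async INL seconds) for the three setups in the table
runs = [(64.7, 48.0), (687.0, 431.0), (164.0, 147.0)]
gains = [int(gain_percent(orig, asyn)) for orig, asyn in runs]
```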
28
Related Work
  • Graefe's generalized spool iterator (Graefe,
    BTW 2003)
      • Pre-fetches multiple outer tuples
      • Issues AIO for matching inner TIDs
      • Can be replenished when empty or when one tuple
        is joined

[Diagram: INL, Spool operator, Index lookup, scan]
29
Related Work
  • AIO used in database products
      • Microsoft SQL Server, IBM DB2, Oracle
      • No public documentation on how these systems use
        AIO
  • Asynchronous iteration for evaluating web queries
    (R. Goldman and J. Widom, SIGMOD 2000)
      • They report results only on web queries

30
Conclusion
  • Proposed the Asynchronous Iterator Model (AIM)
  • Presented asynchronous versions of INL and some
    other operators
  • Showed gains of over 50% in some cases
  • AIM can be useful in web-service access and in
    data integration systems like IBM DataJoiner
  • Future work
      • Implementing async versions of the index lookup,
        subplan, sort and merge operators
      • Performing async IO in the presence of ordering
        constraints

31
Thank You
Questions?

32
Plans
  • Query 1a
  • Seq scan on lineitem, probe on orders
  • Merge Join
  • -> Index Scan on orders
  • -> Sort lineitem
  • -> Seq Scan on lineitem
  • Query 2a
  • Nested Loop
  • -> Nested Loop
  • -> Seq Scan on lineitem
  • -> Index Scan on orders
  • -> Index Scan on customer

33
Plans
  • Query 2a: Merge Join
  • -> Sort orders
  • -> Merge Join
  • -> Index Scan on orders
  • -> Sort on lineitem
  • -> Seq Scan on lineitem
  • -> Index Scan on customer
  • Query 2b
  • Nested Loop
  • -> Nested Loop
  • -> Seq Scan on lineitem
  • -> Index Scan on myorders
  • -> Index Scan on customer