1
Query Processing and Networking Infrastructures
  • Day 2 of 2
  • Joe Hellerstein
  • UC Berkeley
  • September 27, 2002

2
Outline
  • Day 1: Query Processing Crash Course
  • Intro
  • Queries as indirection
  • How do relational databases run queries?
  • How do search engines run queries?
  • Scaling up cluster parallelism and distribution
  • Day 2: Research Synergies w/ Networking
  • Queries as indirection, revisited
  • Useful (?) analogies to networking research
  • Some of our recent research at the seams
  • Some of your research?
  • Directions and collective discussion

3
Indirections
4
Standard Spatial Indirection
  • Allows referent to move without changes to referrers
  • Doesn't matter where the object is, we find it.
  • Alternative: copying
  • Works if updates are managed carefully, or don't exist

5
Temporal Indirection
  • Asynchronous communication is indirection in time
  • Doesn't matter when the object arrives, you find it
  • Analogy to space
  • Sender → referrer
  • Recipient → referent

6
Generalizing
  • Indirection in Space
  • x-to-one or x-to-many?
  • Physical or Logical mapping?
  • Indirection in Time
  • Persistence model: storage or re-xmission
  • Persistence role: sender or receiver

7
Indirection in Space, Redux
  • One-to-one, one-to-many, many-to-many?
  • Standard relational issue
  • E.g. virtual address is many-to-one
  • E.g. email distribution list is one-to-many
  • Physical or logical?
  • Physical: a mapping table
  • E.g. page tables, mailing list, DNS, multicast group lists
  • Logical
  • E.g. queries, subscriptions, interests

8
Indirection in Time, Redux
  • Persistence model: storage or re-xmission
  • Storage: e.g. DB, heap, stack, NW buffer, mail queue
  • Re-xmission: e.g. polling, retries. ("Joe is so persistent")
  • Persistence of put or get
  • Put: e.g. DB insert, email, retry
  • Get: e.g. subscription, polling

9
Examples: Storage Systems
  • Virtual Memory System
  • Space: 1-to-1, physical
  • Time: synchronous (no indirection)
  • Database System
  • Space: many-to-many, logical
  • Time: synchronous (no indirection)
  • Broadcast Disks
  • Space: 1-to-1
  • Time: re-xmitted put

10
Examples: Split-Phase APIs (polling vs. callbacks sketched below)
  • Polling
  • Space: no indirection
  • Time: re-xmitted get
  • Callbacks
  • Space: no indirection
  • Time: stored get
  • Active Messages
  • Space: no indirection
  • Time: stored get
  • App stores a get with the putter, which tags it on messages

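A minimal sketch (not from the talk; names are hypothetical) contrasting the two time-indirection styles above: polling persists by re-transmitting the get, while a callback is a get stored with the putter.

import queue

mailbox = queue.Queue()           # storage-based persistence (a "NW buffer")

# Polling: a re-xmitted get -- the getter persists by retrying.
def poll(mb, attempts=10):
    for _ in range(attempts):
        try:
            return mb.get(timeout=0.1)
        except queue.Empty:
            continue              # persistence via re-transmission of the get
    return None

# Callback: a stored get -- registered once, invoked by the putter.
callbacks = []
def subscribe(cb):
    callbacks.append(cb)          # the get is stored with the putter

def put(item):
    mailbox.put(item)             # satisfies pollers (storage)
    for cb in callbacks:          # satisfies stored gets
        cb(item)

subscribe(lambda x: print("callback got", x))
put("hello")
print("poller got", poll(mailbox))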
11
Examples: Communication
  • Email
  • Space: one-to-many, physical
  • Mapping is one-to-many, delivery is one-to-one (copies)
  • Time: stored put
  • Multicast
  • Space: one-to-many, physical
  • Both mapping and delivery are one-to-many
  • Time: roughly synchronous?

12
Examples: Distributed APIs
  • RPC
  • Space: 1-to-1, physical
  • Can be 1-to-many
  • Time: synchronous (no indirection)
  • Messaging systems
  • Space: 1-to-1, physical
  • Often 1-to-many
  • Time: depends!
  • Transactional messaging is a stored put
  • Exactly-once transmission guaranteed
  • Other schemes are re-xmitted put
  • At-least-once transmission; idempotency of the message becomes important! (sketch below)

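With at-least-once delivery (a re-xmitted put), the receiver may see duplicates, so handling must be idempotent. A minimal sketch (hypothetical names), deduplicating on a message id:

seen = set()                      # message ids already applied

def handle(msg_id, payload, apply):
    if msg_id in seen:            # duplicate from a re-xmitted put
        return                    # safe to drop: effect already applied
    seen.add(msg_id)
    apply(payload)

balance = [100]
def deposit(amount):
    balance[0] += amount

handle("m1", 25, deposit)
handle("m1", 25, deposit)         # retry of the same message
print(balance[0])                 # 125, not 150: the deposit isn't doubled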
13
Examples: Logic-based APIs (contrast sketched below)
  • Publish-Subscribe
  • Space: one-to-many, logical
  • Time: stored receiver
  • Tuplespaces
  • Space: one-to-many, logical
  • Time: stored sender

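A toy contrast (my sketch, not from the talk): in pub/sub the get persists and data is transient; in a tuplespace the put persists and getters may arrive later.

# Pub/sub: stored receiver -- subscriptions persist, data flows through.
subs = []
def subscribe(pred, cb):
    subs.append((pred, cb))                # the stored get

def publish(item):
    for pred, cb in subs:
        if pred(item):
            cb(item)                       # matched in flight, then gone

# Tuplespace: stored sender -- tuples persist, readers come later.
space = []
def out(tup):
    space.append(tup)                      # the stored put

def rd(pred):
    return next((t for t in space if pred(t)), None)

subscribe(lambda x: x > 10, lambda x: print("pub/sub match:", x))
publish(42)                                # subscriber had to exist already
out(42)                                    # tuple waits in the space
print("tuplespace match:", rd(lambda t: t > 10))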
14
Indirection Summary
  • 2 binary indirection variables for space, 2 for time
  • Can have indirection in one without the other
  • Leads to 24 indirection options (counted in the sketch below)
  • 16 joint space/time indirections, 4 space-only, 4 time-only
  • And few lessons about the tradeoffs!
  • Note issues here in both performance and SW engineering
  • E.g. Are tuplespaces better than pub/sub?
  • Not a unidimensional question!

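The arithmetic behind the 24, checked in a few lines (my sketch; the labels for the four binary axes follow slides 6-8):

from itertools import product

# Space: x-to-one vs. x-to-many, physical vs. logical mapping.
# Time: storage vs. re-xmission persistence, put vs. get role.
space_settings = list(product(["x-to-one", "x-to-many"],
                              ["physical", "logical"]))
time_settings = list(product(["storage", "re-xmission"], ["put", "get"]))

joint = len(space_settings) * len(time_settings)   # 16 joint options
space_only = len(space_settings)                   # 4 space-only
time_only = len(time_settings)                     # 4 time-only
print(joint + space_only + time_only)              # 24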
15
Rendezvous
  • Indirection on both sender and receiver side
  • In time and/or space on each side
  • Most general: neither sender nor receiver knows where or when rendezvous will happen!
  • Each chases a reference for where
  • Each must persist for when

16
Join as Rendezvous
  • Recall the pipelining hash join (sketched below)
  • Combine all blue and gray tuples that match
  • A batch rendezvous
  • In space: the data items were not stored in a fixed location; they're copied into the HT
  • In time: both sides do a put-persist inside the join algorithm, via storage
  • A hint of things to come
  • In parallel DBs, the hash table is
    content-addressed (via the exchange routing
    function)
  • What if hash table is distributed?
  • If a tuple in the join is doing get, then is
    there a distinction between sender/recipient?
    Between query and data?

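A minimal sketch of the pipelining (symmetric) hash join as rendezvous, assuming an interleaved stream of ('L', tuple) / ('R', tuple) arrivals (my formulation, not code from the talk):

from collections import defaultdict

def symmetric_hash_join(events, key_l, key_r):
    # Each arriving tuple is copied into its side's hash table (a put
    # that persists) and immediately probes the other side (a get):
    # a batch rendezvous in both time and space.
    ht_l, ht_r = defaultdict(list), defaultdict(list)
    for side, tup in events:
        build, probe, key = (ht_l, ht_r, key_l) if side == 'L' else \
                            (ht_r, ht_l, key_r)
        k = key(tup)
        build[k].append(tup)            # persist for tuples yet to arrive
        for match in probe.get(k, []):  # rendezvous with earlier arrivals
            yield (tup, match) if side == 'L' else (match, tup)

# Matches are found no matter which side shows up first.
events = [('L', ('joe', 1)), ('R', (1, 'db')), ('L', ('mehul', 1))]
for pair in symmetric_hash_join(events, key_l=lambda t: t[1],
                                        key_r=lambda t: t[0]):
    print(pair)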
17
Some resonances
  • We said that query systems are an indirection
    mechanism.
  • Logical, many-to-many, but synchronous
  • Query-response
  • And some dataflow techniques inside query engines
    seem to provide useful indirection mechanisms
  • If we add a network into the picture, life gets
    very interesting
  • Indirection in space: very useful
  • Indirection in time: critical
  • Rendezvous is a basic operation

18
More Resonance
19
More Interaction: a CS262 Experiment w/ Eric Brewer
  • Merged OS & DBMS grad class, over a year
  • Eric/Joe, point/counterpoint
  • Some tie-ins were obvious
  • memory mgmt, storage, scheduling, concurrency
  • Surprising: QP and networks go well side by side
  • E.g. eddies and TCP Congestion Control
  • Both use back-pressure and simple Control Theory
    to learn in an unpredictable dataflow
    environment

20
Scout
  • Paths: the key to a comm-centric OS
  • "Making Paths Explicit in the Scout Operating System," David Mosberger and Larry L. Peterson, OSDI '96.

[Figure 3: Example Router Graph]
21
CLICK
  • A NW router is a query plan!
  • With a twist: flow-based context
  • An opportunity for autonomous query
    optimization

22
Revisiting a NW Classic with DB Goggles
23
Clark & Tennenhouse, SIGCOMM '90
  • "Architectural Considerations for a New Generation of Protocols"
  • Love it for two reasons
  • Tries to capture the essence of what networks do
  • Great for people who need the 10,000-foot view!
  • I'm a fan of doing this (witness last week)
  • Tries to move the community up the food chain
  • Resonances everywhere!!

24
C&T Overview (for amateurs like me)
  • Core function of protocols: data xfer
  • Data Manipulation
  • buffer, checksum, encryption, xfer to/from app
    space, presentation
  • Transfer Control
  • flow/congestion ctl, detecting transmission
    problems, acks, muxing, timestamps, framing

25
C&T's Wacky Ideas
  • Thesis: nets are good at xfer control, not so good at data manipulation
  • Some C&T wacky ideas for better data manipulation
  • Xfer semantic units, not packets (ALF)
  • Auto-rewrite layers to flatten them (ILP)
  • Minimize cross-layer ordering constraints
  • Control delivery in parallel via packet content

26
DB People Should Be Experts!
  • BUT remember
  • Basic Internet assumption: "a network of unknown topology and with an unknown, unknowable and constantly changing population of competing conversations" (Van Jacobson)
  • Spoils the whole optimize-then-execute architecture of query optimization
  • What happens when d(environment)/dt < query length?
  • What about the competing conversations?
  • How do we handle the unknown topology?
  • What about partial failure?
  • Ideally, we'd like
  • the semantics and optimization of DB dataflow
  • with the agility and efficiency of NW dataflow

27
The Cosmic Convergence
[Diagram: DATABASE RESEARCH (data models, query optimization, data scalability) and NETWORKING RESEARCH (adaptivity, federated control, geographic scalability) converge on shared topics: adaptive query processing; continuous queries and streams; P2P query engines; sensor query engines; XML routing; router toolkits; content addressing and DHTs; directed diffusion.]
28
What does the QP perspective add?
  • In terms of high-level languages?
  • In terms of a reusable set of operators?
  • In terms of optimization opportunities?
  • In terms of batch-I/O tricks?
  • In terms of approximate answers?
  • A safe route to Active Networks?
  • Not computationally complete
  • Optimizable and reconfigurable -- data
    independence applies
  • Fun to be had here!
  • Addressing a few fronts at Berkeley

29
Some of our work at the seams
  • Starting with centralized engine for remote data
    sets and streams
  • Telegraph: eddies, SteMs, FLuX
  • Deep Web, filesharing systems, sensor streams
  • More recently, querying sensor networks
  • TinyDB/TAG: in-network queries
  • And DHT-based overlay networks
  • PIER

30
Telegraph Overview
31
Telegraph An Adaptive Dataflow System
  • Themes: Adaptivity and Sharing
  • Adaptivity encapsulated in operators
  • Eddies for order of operations
  • State Modules (SteMs) for transient state
  • FLuX for parallel load-balance and availability
  • Work- and state-sharing across flows
  • Unlike traditional relational schemes, try to
    share physical structures
  • Franklin, Hellerstein, Hong and students (to
    follow)

32
Telegraph Architecture
[Architecture diagram: request parsing and metadata (SQL, explicit dataflows, XML, catalog); online query processing modules (Join, Select, Project, Group/Aggregate, Transitive Closure, DupElim); adaptive routing and optimization operators (Juggle, Eddy, FLuX, SteM); inter-module communication and scheduling (Fjords); ingress adapters (File Reader, Sensor Proxy, P2P Proxy, TeSS).]
33
Continuous Adaptivity: Eddies
[Diagram: an eddy routing tuples among query operators]
  • A little more state per tuple
  • Ready/done bits (extensible a la Volcano/Starburst)
  • Minimal state in the Eddy itself
  • Queue, plus routing parameters being learned
  • Decisions: which tuple in the queue goes to which operator
  • Query processing = dataflow routing!! (toy sketch below)
  • Ron Avnur

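A toy eddy (my sketch; the real policy uses lottery scheduling and back-pressure, simplified here to observed pass rates). Per-tuple done bits track which commuting filters have been satisfied; the eddy itself keeps only routing statistics:

class Eddy:
    def __init__(self, ops):
        self.ops = ops                 # commuting filter operators
        self.seen = [1] * len(ops)     # tuples routed to op i
        self.passed = [1] * len(ops)   # tuples op i let through

    def route(self, tup):
        done = [False] * len(self.ops)         # per-tuple done bits
        while not all(done):
            pending = [i for i, d in enumerate(done) if not d]
            # Route to the most selective pending op: it kills tuples early.
            i = min(pending, key=lambda j: self.passed[j] / self.seen[j])
            self.seen[i] += 1
            if self.ops[i](tup):
                self.passed[i] += 1
                done[i] = True                 # set the done bit
            else:
                return None                    # tuple rejected, drop it
        return tup                             # satisfied every operator

eddy = Eddy([lambda t: t % 2 == 0, lambda t: t > 10])
print([t for t in range(20) if eddy.route(t) is not None])  # evens > 10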
34
Two Key Observations
  • Break the set-oriented boundary
  • Usual DB model: algebra expressions, e.g. (R ⋈ S) ⋈ T
  • Common DB implementation: pipelining operators!
  • Subexpressions needn't be materialized
  • Typical implementation is more flexible than the algebra
  • We can reorder in-flight operators
  • Don't rewrite the graph. Impose a router
  • Graph edge = routing constraint; absence of an edge leaves the router free
  • Observe operator consumption/production rates
  • Consumption = cost. Production = cost × selectivity
  • Could break these down per values of tuples
  • So fun!
  • Simple, incremental, general
  • Brings all of query optimization online
  • And hence a bridge to ML, Control Theory, Queuing Theory

35
State Modules (SteMs)
[Diagram: static dataflow vs. eddy vs. eddy + SteMs]
  • Goal: further adaptivity through competition
  • Multiple mirrored sources (AMs)
  • Handle rate changes, failures, parallelism
  • Multiple alternate operators
  • Join = Routing + State
  • SteM operator manages the tradeoffs
  • State Module: unifies caches, rendezvous buffers, join state
  • Competitive sources/operators share building/probing of SteMs
  • Join algorithm hybridization!
  • Eddies + SteMs tackle the full (single-site) query optimization problem online
  • Vijayshankar Raman, Amol Deshpande

36
FLuX: Routing Across a Cluster
  • Fault-tolerant, Load-balancing eXchange
  • Continuous/long-running flows need high availability
  • Big flows need parallelism
  • Adaptive load-balancing required
  • FLuX operator = Exchange plus:
  • Adaptive flow partitioning (River)
  • Transient state replication & migration
  • Replication & checkpointing for SteMs
  • Note: set-based, not sequence-based!
  • Needs to be extensible to different ops
  • Content-sensitivity
  • History-sensitivity
  • Dataflow semantics
  • Optimize based on edge semantics
  • Networking tie-in again
  • At-least-once delivery?
  • Exactly-once delivery?
  • In/Out of order?
  • Mehul Shah

37
Continuously Adaptive Continuous Queries (CACQ)
  • Continuous Queries clearly need all this stuff!
  • Natural application of Telegraph infrastructure
  • 4 ideas in CACQ
  • Use eddies to allow reordering of ops
  • But one eddy will serve for all queries
  • Queries are data: join them with a Grouped Filter (sketch below)
  • A la stored get!
  • This idea extended in PSOUP (Chandrasekaran & Franklin)
  • Explicit tuple lineage
  • Mark each tuple with per-op ready/done bits
  • Mark each tuple with per-query completed bits
  • Joins via SteMs, shared across all queries
  • Note mixed-lineage tuples in a SteM, i.e. shared state is not shared algebraic expressions!
  • Delete a tuple from the flow only if it matches no query
  • Sam Madden, Mehul Shah, Vijayshankar Raman,
    Sirish Chandrasekaran

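A toy grouped filter (my sketch of the idea, not CACQ code): many queries' predicates over one attribute are evaluated together, and each tuple leaves carrying the set of query ids it satisfied, i.e. its per-query lineage:

class GroupedFilter:
    def __init__(self):
        self.eq = {}      # value -> query ids with an equality predicate
        self.gt = []      # (threshold, query id) range predicates

    def add_eq(self, qid, value):
        self.eq.setdefault(value, set()).add(qid)

    def add_gt(self, qid, threshold):
        self.gt.append((threshold, qid))

    def matches(self, value):
        qids = set(self.eq.get(value, ()))   # one hash lookup covers all
        qids |= {q for th, q in self.gt      # equality queries; a real impl
                 if value > th}              # would index ranges too
        return qids

gf = GroupedFilter()
gf.add_eq("q1", 80)      # q1: port = 80
gf.add_gt("q2", 1023)    # q2: port > 1023
for port in (80, 8080, 22):
    print(port, "->", gf.matches(port) or "drop tuple")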
38
Sensor QP: TinyDB/TAG
39
Wireless Sensor Networks
[Spectrum diagram: from Palm devices running Linux down to Smart Dust motes running TinyOS]
  • A spectrum of devices
  • Varying degrees of power and network constraints
  • Fun is on the small side!
  • Our current platform: Mica motes and TinyOS
  • 4 MHz Atmel CPU, 4 KB RAM, 40 kbit/s radio, 512 KB EEPROM, 128 KB Flash
  • Sensors: temp, light, accelerometer, magnetometer, mic, etc.
  • Wireless, single-ported, multi-hop ad-hoc network
  • Spanning-tree communication through root

40
TinyDB
  • A query/trigger engine for motes
  • Declarative (SQL-like) language for
    optimizability
  • Data independence arguments in spades here!
  • Non-programmers can deal with it
  • Lots of challenges at the seams of queries and
    routing
  • Query plans over dynamic multi-hop network
  • With power and bandwidth consumption as key
    metrics
  • Sam Madden (w/Hellerstein, Hong, Franklin)

41
Focus: Hierarchical Aggregation
  • Aggregation is natural in sensornets
  • The big picture is typically what's interesting
  • Aggregation can smooth noise and loss
  • E.g. signal-processing aggs like wavelets
  • Provides data reduction
  • Power/network reduction: in-network aggregation
  • Hierarchical version of parallel aggregation (sketched below)
  • Tricky design space
  • power vs. quality
  • topology selection
  • value-based routing
  • dynamic environment requires adaptivity

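A minimal sketch of in-network aggregation up the spanning tree (my formulation): each mote merges its own reading with its children's partial state records and forwards a single record to its parent, so AVG costs one small message per mote:

def avg_init(value):
    return (value, 1)            # partial state record for AVG: (sum, count)

def avg_merge(a, b):
    return (a[0] + b[0], a[1] + b[1])

def aggregate(node, children, readings):
    # Merge this mote's reading with the partial states of its subtree,
    # then hand one record upward instead of forwarding raw readings.
    state = avg_init(readings[node])
    for child in children.get(node, ()):
        state = avg_merge(state, aggregate(child, children, readings))
    return state

children = {0: [1, 2], 1: [3, 4]}              # a 5-mote tree rooted at 0
readings = {0: 10, 1: 20, 2: 30, 3: 40, 4: 50}
s, c = aggregate(0, children, readings)
print("avg =", s / c)                          # 30.0, computed in-network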
42
TinyDB Sample Apps
  • Habitat Monitoring: "what is the average humidity in the populated petrel burrows on Great Duck Island right now?"
  • Smart Office: "find me the conference rooms that have been reserved but unoccupied for 5 minutes."
  • Home Automation: "lower the blinds when light intensity is above a threshold."

43
Performance in SensorNets
  • Power consumption
  • Communication >> Computation
  • METRIC: radio wake time
  • Send > Receive
  • METRIC: messages generated
  • "Run for 5 years" vs. "burn power for critical events" vs. "run my experiment"
  • Bandwidth constraints
  • Internal >> External
  • Volume >> surface area
  • Result quality
  • Noisy sensors
  • Discrete sampling of continuous phenomena
  • Lossy communication channel

44
TinyDB
  • SQL-like language for specifying continuous
    queries and triggers
  • Schema management, etc.
  • Proxy on desktop, small query engine per mote
  • Plug and play (query snooping)
  • To keep the engine tiny, use an eddy-style arch
  • One explicit copy of each iterator's code image
  • Adaptive dataflow in network
  • Alpha available for download on SourceForge

45
Some of the Optimization Issues
  • Extensible Aggregation API
  • Init(), Iter(), SplitFlow(), Close() (one possible shape sketched below)
  • Properties
  • Amount of intermediate state
  • Duplicate sensitivity
  • Monotonicity
  • Exemplary vs. Summary
  • Hypothesis Testing
  • Snooping and Suppression
  • Compression, Presumption, Interpolation
  • Generally, QP and NW issues intertwine!

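The talk names the API calls but not their signatures, so the following is one plausible reading for an AVG aggregate. The SplitFlow behavior shown, dividing a partial state record across parents so a duplicate-sensitive aggregate survives multi-path routing, is my guess:

class Avg:
    # Partial state record: (sum, count). Properties per the slide:
    # constant-size state, duplicate sensitive, summary (not exemplary).
    def init(self, value):
        return (float(value), 1.0)

    def iter(self, state, other):           # fold in another partial record
        return (state[0] + other[0], state[1] + other[1])

    def split_flow(self, state, k=2):       # divide state across k parents
        s, c = state                        # so re-merging downstream does
        return [(s / k, c / k)] * k         # not double-count

    def close(self, state):                 # evaluate the final answer
        s, c = state
        return s / c

a = Avg()
st = a.iter(a.init(10), a.init(30))
print(a.close(st))                          # 20.0
left, right = a.split_flow(st)              # route halves via two parents
print(a.close(a.iter(left, right)))         # still 20.0 after re-merge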
46
PIER: Querying the Internet
47
Querying the Internet
  • As opposed to querying over the Internet
  • Have to deal with Internet realities
  • Scale, dynamics, federated admin, partial
    failure, etc.
  • Standard distributed DBs won't work
  • Applications
  • Start with real-time, distributed network
    monitoring
  • Traffic monitoring, intrusion/spam detection,
    software deployment detection (e.g. via TBIT),
    etc.
  • Use PIER's SQL as a workload generator for networks?
  • Virtual tables determine load produced by each
    site
  • Queries become a way of specifying site-to-site
    communication
  • Move to infect the network more deeply?
  • E.g. Indirection schemes like i3, rendezvous
    mechanisms, etc.
  • Overlays only?

48
And p2p QP, Obviously
  • Gnutella done right
  • And it's so easy! ;-)
  • Crawler-free web search
  • Bring WYGIWIGY queries to the people
  • Ranking, recommenders, etc.
  • Got to be more fun here
  • If p2p takes off in a big way, queries have to be a big piece
  • Why p2p DB, anyway?
  • No good reason I can think of! ;-)
  • Focus on the grassroots nature of p2p
  • Schema integration and transactions and ??
  • No! Work with what you've got! Query the data that's out there
  • Nothing complicated for users will fly
  • Avoid the DB word: P2P QP, not P2P DB

49
Approach: Leverage DHTs
  • Distributed Hash Tables
  • Family of distributed content-routing schemes
  • CAN, Chord, Pastry, Tapestry, etc.
  • Internet-scale hash table
  • A la a wide-area, adaptive Exchange routing table
  • With some notion of storage
  • Leverage DHTs aggressively
  • As distributed indexes on stored data
  • As state modules for query processing
  • E.g. use DHTs as the hash tables in a hash join (sketch below)
  • As rendezvous points for exchanging info
  • E.g. Bloom Filters

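A toy version of "DHT as the hash table in a hash join" (my sketch; node_for stands in for the DHT's content routing): each relation publishes its tuples at the node owning the hash of the join key, and matching tuples rendezvous there:

import hashlib

NODES = 8
def node_for(key):
    # Stand-in for DHT content routing: key -> owning node.
    return hashlib.sha1(str(key).encode()).digest()[0] % NODES

buckets = {n: {"R": [], "S": []} for n in range(NODES)}
R = [("alice", 1), ("bob", 2)]              # (name, dept)
S = [(1, "db"), (2, "nw"), (1, "os")]       # (dept, topic)

# "put" each tuple at the node owning hash(join key): the DHT plays
# the role of a distributed, shared join hash table (a SteM).
for t in R:
    buckets[node_for(t[1])]["R"].append(t)
for t in S:
    buckets[node_for(t[0])]["S"].append(t)

# Each node joins only the tuples that rendezvoused locally.
for n in range(NODES):
    for name, d in buckets[n]["R"]:
        for d2, topic in buckets[n]["S"]:
            if d == d2:
                print(name, topic)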
50
PIER: P2P Information Exchange and Retrieval
  • Relational-style query executor
  • With front-ends for SQL and catalogs
  • Standard and continuous queries
  • With access to DHT APIs
  • Currently CAN and Chord, working on Tapestry
  • Common DHT API would help
  • Currently simulating queries running on 10s of
    thousands of nodes
  • Look ma, it scales!
  • Widest-scale relational engine ever, looks
    feasible
  • Most of the simulator code will live on in
    implementation
  • On Millennium and PlanetLab this fall/winter
  • Ryan Huebsch and Boon Thau Loo (w/Hellerstein,
    Shenker, Stoica)

51
PIER Challenges
  • How does this batch workload stress DHTs?
  • How does republishing of soft-state interact with
    dataflow?
  • And semantics of query answers
  • Materialization/precomputation/caching
  • Physical tuning meets SteMs meets materialized
    views
  • How to do query optimization in this context
  • Distributed eddies!
  • Partial failure is a reality
  • At storage nodes, query execution nodes?
  • Impact on results, mitigation
  • What about aggregation?
  • Similarities/difference with TAG?
  • With Astrolabe (Birman et al.)?
  • The usual CQ and data stream query issues,
    distributed
  • Analogous to work in Telegraph, and at Brown,
    Wisconsin, Stanford

52
All together now?
  • I thought about changing the names
  • Telegraph + TeleTiny?
  • The group didn't like the branding
  • "Teletubby!"
  • Seriously: integration?
  • It's a plausible need
  • Sensor data + map data + historical sensor logs
  • Filesharing + Web
  • We have done both of these cheesily
  • But fun questions of doing it right
  • E.g. pushing predicates and data into sensor net
    or not?

53
References & Resources
54
Database Texts
  • Undergrad textbooks
  • Ramakrishnan & Gehrke, Database Management Systems
  • Silberschatz, Korth, Sudarshan, Database System Concepts
  • Garcia-Molina, Ullman, Widom, Database Systems: The Complete Book
  • O'Neil & O'Neil, Database: Principles, Programming, and Performance
  • Abiteboul, Hull, Vianu, Foundations of Databases
  • Graduate texts
  • Stonebraker & Hellerstein, Readings in Database Systems (a.k.a. The Red Book)
  • Brewer & Hellerstein, Readings book (e-book?) in progress. Fall 2003?

55
Research Links
  • DB group at Berkeley: db.cs.berkeley.edu
  • GiST: gist.cs.berkeley.edu
  • Telegraph: telegraph.cs.berkeley.edu
  • TinyDB: telegraph.cs.berkeley.edu/tinydb and berkeley.intel-research.net/tinydb
  • Red Book: redbook.cs.berkeley.edu