Wireless Sensor Networks and Query Processing

Provided by: CIT788

1
Wireless Sensor Networks and Query Processing
  • WSN: Wireless Sensor Networks
  • Routing Problems
  • Routing Algorithms
  • Real-time Query Processing
  • Sensor Selection and Data Aggregation

2
A typical Wireless Sensor Network
[Figure: sensor nodes (SN) and gateways (GW); link labels: Bluetooth, WLAN, GPRS, Ethernet]
  • Integration of Sensor Nodes (SN) and Gateways (GW)

3
MANET: Mobile Ad-hoc Networking
4
Why Wireless Sensor Networks?
  • Ease of deployment
  • Speed of deployment
  • Decreased dependence on infrastructure
  • Self-adaptive and self-organizing
  • Sensors are cheap devices and can be deployed in
    large numbers
  • Sensors can work in harsh environmental
    conditions, e.g., deserts
  • Sensors can work continuously for monitoring and
    surveillance purposes
  • Connected to the rest of the system through a
    gateway
  • From the gateway, various functions and queries
    may be submitted into the system to access the
    sensor data

5
Today's Wireless Sensor Networks (WSN)
  • First generation of WSNs is available
  • Diverse sensor nodes, several gateways
  • Even with special sensors: cameras, body
    temperature
  • Basic software
  • Routing, energy conservation, management
  • Several prototypes for different applications
  • Environmental monitoring, industrial automation,
    wildlife monitoring
  • Many see new possibilities for monitoring,
    surveillance, protection
  • Sensor networks as a cheap and flexible new
    means for surveillance (i.e., security)
  • Monitoring and protection of goods
  • Chemicals, food, vehicles (car parks), machines,
    containers, ...
  • Large application area besides military
  • Law enforcement, disaster recovery, industry,
    private homes, ...

6
Mobile ad-hoc networks (MANET)
  • Network without infrastructure
  • Use components of participants for networking
  • Examples
  • Single-hop: all partners max. one hop apart
  • Bluetooth piconet, PDAs in a room, gaming
    devices
  • Multi-hop: cover larger distances, circumvent
    obstacles
  • Bluetooth scatternet, police network,
    car-to-car networks
  • MANET (Mobile Ad-hoc Networking) group
  • Dynamic network topology
  • Mobile nodes

7
Many Variations
  • Fully Symmetric Environment
  • All nodes have identical capabilities and
    responsibilities
  • Asymmetric Capabilities
  • Transmission ranges and radios may differ
  • Battery life at different nodes may differ
  • Processing capacity may be different at different
    nodes
  • Speed of movement (fixed and mobile)
  • Asymmetric Responsibilities
  • Only some nodes may route packets
  • Some nodes may act as leaders of nearby nodes
    (e.g., cluster head)

8
Many Variations
  • Traffic characteristics may differ in different
    mobile ad hoc networks
  • Bit rate
  • Timeliness constraints
  • Reliability requirements
  • Unicast / multicast / geocast
  • May co-exist (and co-operate) with an
    infrastructure-based network

9
Many Variations
  • Mobility patterns may be different
  • People sitting at an airport lounge
  • New York taxi cabs
  • Kids playing
  • Military movements
  • Mobility characteristics
  • Speed
  • Predictability
  • Direction of movement
  • Pattern of movement
  • Uniformity (or lack thereof) of mobility
    characteristics among different nodes

10
Wireless Sensor Networks: Challenges
  • Long-lived, autonomous networks
  • Use environmental energy sources
  • Embed and forget
  • Self-healing
  • Self-configuring networks
  • Routing
  • Data aggregation
  • Localization
  • Managing wireless sensor networks
  • Tools for access and programming
  • Update distribution
  • Scalability, Quality of Service

11
Routing Problem
  • Routing: finding a route to send data from the
    source to the destination
  • Highly dynamic network topology
  • Device mobility plus varying channel quality
  • Separation and merging of networks possible
  • Asymmetric connections possible

[Figure: Changing topology. Nodes N1 to N7 shown at time t1 and at time t2; legend: good link, weak link]
12
Mobile Ad Hoc Networks
  • May need to traverse multiple links to reach a
    destination

13
Mobile Ad Hoc Networks
  • Mobility causes route changes

14
Routing Problems
  • Asymmetric links
  • A path from node A to B does not imply that
    node B can use the same path to send packets to
    node A
  • Redundant links
  • Multiple paths from A to B: which one is the best
    (e.g., minimizing the hop count) and should be
    chosen?
  • Interference
  • Collision: neighboring nodes send packets at the
    same time
  • Collision -> retransmission (MAC)
  • Dynamic topology
  • Changing link quality due to movement
  • Need to find a new path every short period of
    time, because the old one no longer works
  • Update of path information in the intermediate
    nodes
  • No node has complete information on the
    status of all the nodes in the system
  • Transmission delay is changing
  • Difficult for load balancing and traffic
    control

15
Routing Problems
  • Routing Problem
  • To find a route to connect the source node (S) to
    the destination node (D) through a sequence of
    relay nodes
  • The route may be just for a one-time connection
    or for a period of time (continuous monitoring)
  • Issues in routing algorithms
  • Minimize message overhead (no. of messages)
  • On-demand algorithms
  • Minimize the searching delay
  • Table-driven algorithms
  • Route maintenance
  • Minimize energy consumption rate
  • Power-aware routing algorithms (choosing
    high-energy nodes as relay nodes)
  • Switching some of the mobile hosts to doze mode
    to conserve energy

16
Routing Methods
  • Two types of routing algorithms
  • On-demand protocols (reactive)
  • A route is searched for upon receipt of a
    connection request
  • Table-driven protocols (proactive)
  • The topology of the whole network is maintained
  • When a connection is needed, the source node can
    select the route from its memory directly

17
Routing Methods
  • Latency of route discovery
  • Proactive protocols may have lower latency since
    routes are maintained at all times
  • Reactive protocols may have higher latency
    because a route from X to Y will be found only
    when X attempts to send to Y
  • Overhead of route discovery/maintenance
  • Reactive protocols may have lower overhead since
    routes are determined only if needed
  • Proactive protocols can (but not necessarily)
    result in higher overhead due to continuous route
    updating
  • Which approach achieves a better trade-off
    depends on the traffic and mobility patterns

18
Routing Algorithms for Ad Hoc Networks
  • Flooding
  • Dynamic Source Routing (DSR)
  • Location-Aided Routing (LAR)
  • Power-Aware Routing (PAR)
  • Least Interference Routing (LIR)

19
Flooding for Data Delivery
  • Sender S broadcasts data packet P to all its
    neighbors
  • Each node receiving P forwards P to its neighbors
  • Sequence numbers used to avoid the possibility of
    forwarding the same packet more than once
  • Packet P reaches destination D provided that D is
    reachable from sender S
  • Node D does not forward the packet
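
The flooding rules above can be sketched in a few lines of Python. This is a minimal simulation under stated assumptions, not the protocol itself: the `links` topology is a made-up example (not the network drawn on the following slides), collisions and packet loss are ignored, and the name `flood` is hypothetical.

```python
from collections import deque


def flood(links, source, dest, seq_no=1):
    """Return the set of nodes that receive the packet flooded by source.

    links maps a node to the neighbors inside its transmission range.
    Assumption: no collisions or losses, so every broadcast is heard.
    """
    # Sequence numbers each node has already handled (duplicate filter).
    seen = {source: {seq_no}}
    received = set()
    pending = deque([source])  # nodes that still have to broadcast
    while pending:
        node = pending.popleft()
        for neighbor in links.get(node, ()):
            if seq_no in seen.setdefault(neighbor, set()):
                continue  # same packet heard again: do not forward twice
            seen[neighbor].add(seq_no)
            received.add(neighbor)
            if neighbor != dest:  # the destination does not forward
                pending.append(neighbor)
    return received


# Hypothetical topology: N is reachable only through D, Z is isolated.
links = {
    "S": ["A", "B"],
    "A": ["S", "D"],
    "B": ["S", "D"],
    "D": ["A", "B", "N"],
    "N": ["D"],
    "Z": [],
}
reached = flood(links, "S", "D")
```

In this example `reached` contains A, B and D only: Z is unreachable from S, and N sits behind the destination, which does not forward.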

20
Flooding for Data Delivery
[Figure: example network of nodes A to N, S, Y and Z. Legend: a link represents that the connected nodes are within each other's transmission range; a marked node represents a node that has received packet P]
21
Flooding for Data Delivery
[Figure: example network of nodes A to N, S, Y and Z; broadcast transmission of packet P. Legend: represents transmission of packet P; represents a node that receives packet P for the first time]
22
Flooding for Data Delivery
[Figure: example network of nodes A to N, S, Y and Z]
  • Node H receives packet P from two neighbors:
    potential for collision

23
Flooding for Data Delivery
[Figure: example network of nodes A to N, S, Y and Z]
  • Node C receives packet P from G and H, but does
    not forward it again, because node C has already
    forwarded packet P once

24
Flooding for Data Delivery
[Figure: example network of nodes A to N, S, Y and Z]
  • Nodes J and K both broadcast packet P to node D
  • Since nodes J and K are hidden from each other,
    their transmissions may collide
  • => Packet P may not be delivered to node D at
    all, despite the use of flooding

25
Flooding for Data Delivery
[Figure: example network of nodes A to N, S, Y and Z]
  • Node D does not forward packet P, because node D
    is the intended destination of packet P

26
Flooding for Data Delivery
[Figure: example network of nodes A to N, S, Y and Z]
  • Flooding completed
  • Nodes unreachable from S do not receive packet P
    (e.g., node Z)
  • Nodes for which all paths from S go through the
    destination D also do not receive packet P
    (example: node N)

27
Flooding for Data Delivery
[Figure: example network of nodes A to N, S, Y and Z]
  • Flooding may deliver packets to too many nodes
    (in the worst case, all nodes reachable from the
    sender may receive the packet)

28
Flooding for Data Delivery: Advantages
  • Simplicity
  • May be more efficient than other protocols when
    the rate of information transmission is low
    enough that the overhead of explicit route
    discovery/maintenance incurred by other protocols
    is relatively higher
  • This scenario may occur, for instance, when nodes
    transmit small data packets relatively
    infrequently, and many topology changes occur
    between consecutive packet transmissions
  • Potentially higher reliability of data delivery
  • Because packets may be delivered to the
    destination on multiple paths

29
Flooding for Data Delivery: Disadvantages
  • Potentially, very high overhead
  • Data packets may be delivered to too many nodes
    who do not need to receive them
  • Potentially lower reliability of data delivery
  • Flooding uses broadcasting -- hard to implement
    reliable broadcast delivery without significantly
    increasing overhead
  • In our example, nodes J and K may transmit to
    node D simultaneously, resulting in loss of the
    packet
  • In this case, destination would not receive the
    packet at all

30
Flooding of Control Packets
  • Many protocols perform (potentially limited)
    flooding of control packets, instead of data
    packets
  • The control packets are used to discover routes
  • Discovered routes are subsequently used to send
    data packet(s)
  • Overhead of control packet flooding is amortized
    over data packets transmitted between consecutive
    control packet floods

31
Dynamic Source Routing (DSR)
  • DSR consists of two steps
  • Route discovery: a node tries to discover a route
    to a destination only if it has something to send
    to that destination
  • Route maintenance: if a node detects that the
    current route has changed, it needs to find a new
    route
  • In route discovery, if node S wants to send a
    packet to node D, but does not know a route to D,
    node S initiates a route discovery (small-size
    message)
  • Source node S floods a Route Request (RREQ)
  • Each node appends its own identifier when
    forwarding the RREQ
  • If a node has already received the request, it
    will drop the request
32
Route Discovery in DSR
[Figure: example network of nodes A to N, S, Y and Z. Legend: a marked node represents a node that has received RREQ for D from S]
33
Route Discovery in DSR
[Figure: example network of nodes A to N, S, Y and Z; S broadcasts the RREQ carrying the identifier list [S]. Legend: represents transmission of RREQ; [X,Y] represents the list of identifiers appended to the RREQ]
34
Route Discovery in DSR
[Figure: example network of nodes A to N, S, Y and Z; RREQ copies carry the identifier lists [S,E] and [S,C]]
  • Node H receives packet RREQ from two neighbors:
    potential for collision

35
Route Discovery in DSR
[Figure: example network of nodes A to N, S, Y and Z; RREQ copies carry the identifier lists [S,E,F] and [S,C,G]]
  • Node C receives RREQ from G and H, but does not
    forward it again, because node C has already
    forwarded RREQ once

36
Route Discovery in DSR
[Figure: example network of nodes A to N, S, Y and Z; RREQ copies carry the identifier lists [S,E,F,J] and [S,C,G,K]]
  • Nodes J and K both broadcast RREQ to node D
  • Since nodes J and K are hidden from each other,
    their transmissions may collide

37
Route Discovery in DSR
[Figure: example network of nodes A to N, S, Y and Z; an RREQ copy carries the identifier list [S,E,F,J,M]]
  • Node D does not forward RREQ, because node D
    is the intended target of the route discovery

38
Route Discovery in DSR
  • Destination D, on receiving the first RREQ, sends
    a Route Reply (RREP)
  • RREP is sent on a route obtained by reversing the
    route appended to received RREQ
  • RREP includes the route from S to D on which RREQ
    was received by node D
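
Route discovery and the reversed-route reply can be sketched as follows. This is an illustrative simulation, not the DSR specification: the `links` table is an assumed subset of the example network, chosen only to be consistent with the identifier lists shown on the slides, all hops are assumed to take equal time (so a FIFO queue decides which RREQ arrives first), links are assumed bidirectional, and the function names are hypothetical.

```python
from collections import deque


def discover_route(links, source, target):
    """Flood an RREQ and return the route carried by the first copy
    that reaches target, or None if target is unreachable."""
    handled = {source}          # nodes that already forwarded this RREQ
    pending = deque([[source]])  # each entry: identifier list of one RREQ copy
    while pending:
        route = pending.popleft()
        for neighbor in links.get(route[-1], ()):
            if neighbor == target:
                # The target does not forward; it answers with an RREP.
                return route + [target]
            if neighbor in handled:
                continue  # duplicate request is dropped
            handled.add(neighbor)
            pending.append(route + [neighbor])  # append own identifier
    return None


def reply_path(route):
    """Path of the RREP: the discovered route reversed
    (valid only if every link is bidirectional)."""
    return list(reversed(route))


# Assumed links, consistent with the lists [S,E,F,J] and [S,C,G,K].
links = {
    "S": ["E", "C"], "E": ["S", "F"], "C": ["S", "G"],
    "F": ["E", "J"], "G": ["C", "K"],
    "J": ["F", "D"], "K": ["G", "D"], "D": ["J", "K"],
}
route = discover_route(links, "S", "D")
rrep_hops = reply_path(route)
```

With these assumed links the first RREQ to arrive carries S, E, F, J, so `route` is S, E, F, J, D and the RREP travels D, J, F, E, S.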

39
Route Reply in DSR
[Figure: example network of nodes A to N, S, Y and Z; the RREP carries the route [S,E,F,J,D]. Legend: represents RREP control message]
40
Route Reply in DSR
  • Route Reply can be sent by reversing the route in
    Route Request (RREQ) only if links are guaranteed
    to be bi-directional
  • To ensure this, RREQ should be forwarded only if
    it is received on a link that is known to be
    bi-directional
  • If unidirectional (asymmetric) links are allowed,
    then RREP may need a route discovery for S from
    node D
  • Unless node D already knows a route to node S
  • If a route discovery is initiated by D for a
    route to S, then the Route Reply is piggybacked
    on the Route Request from D

41
Dynamic Source Routing (DSR)
  • Node S on receiving RREP, caches the route
    included in the RREP
  • When node S sends a data packet to D, the entire
    route is included in the packet header
  • Hence the name source routing
  • Intermediate nodes use the source route included
    in a packet to determine to whom a packet should
    be forwarded
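
A small sketch of how an intermediate node might use the source route in the header to pick the next hop. The packet layout (a dictionary with `route` and `payload` keys) and the function name `next_hop` are assumptions made for illustration; the slides do not define a packet format.

```python
def next_hop(source_route, current_node):
    """Return the node that current_node forwards to,
    or None if current_node is the final destination."""
    position = source_route.index(current_node)
    if position == len(source_route) - 1:
        return None
    return source_route[position + 1]


# Hypothetical data packet carrying the whole route in its header.
packet = {"route": ["S", "E", "F", "J", "D"], "payload": "reading"}

# Follow the packet hop by hop from the source to the destination.
hops = []
node = packet["route"][0]
while node is not None:
    hops.append(node)
    node = next_hop(packet["route"], node)
```

No routing table is consulted at E, F or J: each forwarding decision comes from the header alone, which is also why the header grows with the route length.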

42
Data Delivery in DSR
[Figure: example network of nodes A to N, S, Y and Z; the DATA packet carries the route [S,E,F,J,D] in its header]
Packet header size grows with route length
43
Dynamic Source Routing: Advantages
  • Routes maintained only between nodes who need to
    communicate
  • reduces overhead of route maintenance
  • Route caching can further reduce route discovery
    overhead
  • A single route discovery may yield many routes to
    the destination, due to intermediate nodes
    replying from local caches

44
Dynamic Source Routing: Disadvantages
  • Packet header size grows with route length due to
    source routing
  • Flood of route requests may potentially reach all
    nodes in the network
  • Care must be taken to avoid collisions between
    route requests propagated by neighboring nodes
  • Insertion of random delays before forwarding RREQ
  • Increased contention if too many route replies
    come back due to nodes replying using their local
    cache
  • Route Reply Storm problem
  • Reply storm may be eased by preventing a node
    from sending RREP if it hears another RREP with a
    shorter route

45
Enhancement to routing
  • There may be multiple routes from the source node
    to the destination node. How to choose the route?
  • Interference
  • The number of neighboring nodes
  • If the number of neighboring nodes is larger, the
    probability of having conflict in transmission is
    higher. Therefore, more re-transmission and
    greater waste in bandwidth
  • Energy level of the intermediate nodes
  • Eliminate those nodes with energy level below a
    threshold value
  • Location area
  • Estimate the possible region of the destination
    node
  • Broadcast the packets to the estimated region
  • i.e., LAR
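
One possible way to combine the interference and energy criteria above when several routes are known. The scoring rule is an assumption made for this sketch (the slides name the criteria but not a formula): routes with a relay below the energy threshold are eliminated, and among the rest the route whose relays have the fewest neighbors in total wins, with hop count as tie-breaker. All names and numbers are hypothetical.

```python
def choose_route(routes, energy, neighbor_count, min_energy):
    """Return the preferred route, or None if no route qualifies."""

    def relays(route):
        return route[1:-1]  # intermediate nodes only

    # Energy criterion: drop routes using a relay below the threshold.
    usable = [
        r for r in routes
        if all(energy[n] >= min_energy for n in relays(r))
    ]
    if not usable:
        return None
    # Interference criterion: fewer neighbors means fewer collisions.
    return min(
        usable,
        key=lambda r: (sum(neighbor_count[n] for n in relays(r)), len(r)),
    )


# Hypothetical candidate routes and node state.
routes = [["S", "E", "F", "J", "D"], ["S", "C", "G", "K", "D"]]
energy = {"E": 0.9, "F": 0.2, "J": 0.8, "C": 0.7, "G": 0.6, "K": 0.5}
neighbor_count = {"E": 2, "F": 3, "J": 3, "C": 4, "G": 3, "K": 3}

strict = choose_route(routes, energy, neighbor_count, min_energy=0.3)
relaxed = choose_route(routes, energy, neighbor_count, min_energy=0.1)
```

With the stricter threshold node F is excluded, so the route through C, G, K is chosen; with the relaxed threshold the route through E, F, J wins because its relays have fewer neighbors.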

46
Location-Aided Routing (LAR)
  • Exploits location information to limit scope of
    route request flood
  • Location information may be obtained using GPS
  • Expected Zone is determined as a region that is
    expected to hold the current location of the
    destination
  • Expected region determined based on potentially
    old location information, and knowledge of the
    destination's speed
  • Route requests limited to a Request Zone that
    contains the Expected Zone and location of the
    sender node

47
Expected Zone in LAR
X = last known location of node D, at time t0
Y = location of node D at current time t1, unknown to node S
r = (t1 - t0) * estimate of D's speed
[Figure: Expected Zone, the region within distance r of X in which Y is expected to lie]
48
Request Zone in LAR
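[Figure: Network Space containing the Request Zone. The Request Zone contains sender S and the Expected Zone (radius r around X, containing Y); nodes A and B are also shown]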
49
LAR
  • Only nodes within the request zone forward route
    requests
  • Node A does not forward RREQ, but node B does
    (see previous slide)
  • Request zone explicitly specified in the route
    request
  • Each node must know its physical location to
    determine whether it is within the request zone
  • If route discovery using the smaller request zone
    fails to find a route, the sender initiates
    another route discovery (after a timeout) using a
    larger request zone
  • the larger request zone may be the entire network
  • Rest of route discovery protocol similar to DSR
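
The zone test each node performs can be sketched as below. The slides only require that the request zone contain the expected zone and the sender; taking it to be the smallest axis-aligned rectangle with that property is an assumption of this sketch, as are the coordinates, the speed value and all function names.

```python
def expected_zone_radius(speed, t0, t1):
    """r = (t1 - t0) * estimated speed of the destination."""
    return speed * (t1 - t0)


def request_zone(sender, last_known, radius):
    """Smallest axis-aligned rectangle holding the sender and the
    circle of the given radius around the last known location.
    Returned as (x_min, y_min, x_max, y_max)."""
    sx, sy = sender
    x, y = last_known
    return (
        min(sx, x - radius), min(sy, y - radius),
        max(sx, x + radius), max(sy, y + radius),
    )


def in_zone(zone, position):
    """True if a node at position should forward the route request."""
    x_min, y_min, x_max, y_max = zone
    px, py = position
    return x_min <= px <= x_max and y_min <= py <= y_max


# Hypothetical numbers: D was last seen at (10, 10) one time unit ago.
r = expected_zone_radius(speed=2, t0=0, t1=1)
zone = request_zone(sender=(0, 0), last_known=(10, 10), radius=r)
node_b_forwards = in_zone(zone, (5, 6))    # inside the request zone
node_a_forwards = in_zone(zone, (-3, 4))   # outside the request zone
```

If discovery with this zone times out, the sender can call `request_zone` again with a larger radius, up to the whole network space.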

50
Energy-aware routing
  • Only sensors with sufficient energy forward data
    for other nodes
  • Example: routing via nodes with enough solar
    power is considered free
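
One way to read "considered free" is as a relay cost of zero for nodes with enough solar power, and a positive cost for battery-powered relays. The sketch below runs a standard Dijkstra search over such node costs. The cost values, the topology and the name `cheapest_route` are assumptions for illustration only.

```python
import heapq


def cheapest_route(links, relay_cost, source, dest):
    """Return (total_cost, route) minimizing the summed relay costs,
    or None if dest is unreachable. The destination itself is free."""
    best = {source: 0}
    heap = [(0, [source])]
    while heap:
        cost, route = heapq.heappop(heap)
        node = route[-1]
        if node == dest:
            return cost, route
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor in links.get(node, ()):
            step = 0 if neighbor == dest else relay_cost[neighbor]
            new_cost = cost + step
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, route + [neighbor]))
    return None


# Hypothetical network: B and C are solar powered (cost 0), A is not.
links = {"S": ["A", "B"], "A": ["D"], "B": ["C"], "C": ["D"]}
relay_cost = {"A": 1, "B": 0, "C": 0}
result = cheapest_route(links, relay_cost, "S", "D")
```

Here the three-hop route through the solar nodes B and C is preferred over the two-hop route through A, because it consumes no battery energy at the relays.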

51
System Monitoring and Surveillance
  • Wireless sensor systems
  • Need to monitor the occurrences of (simple)
    events in the system environment
  • E.g., when the temperature is higher than 50 °C
  • E.g., the max and min pressure in a day
  • Complex events
  • The occurrences of multiple simple events at the
    same time
  • The maximum temperatures of two rooms when the
    pressure is higher than 1000 mmHg
  • The light intensity at the arrival time of a bird
52
System Monitoring
  • Continuous monitoring queries (CMQs)
  • Submitted to monitor the events occurring in the
    system environment for a period of time (begin
    time and end time)
  • A condition is defined. Once the condition is
    satisfied, an alert is sent to the user
  • Based on the attributes defined in the condition,
    a set of data items are identified as input to
    the query
  • Access to a set of data items (pre-defined)
  • The data items are generated by sensor nodes
    distributed in the system environment
  • Sensor nodes
  • Each sensor node may be equipped with multiple
    sensors to capture different signals of the
    system environment
  • Fixed sampling frequency
  • Communicate through low bandwidth wireless
    network

53
In-Network Processing
  • Processing of queries (two approaches)
  • (1) Send sensor data to a centralized server for
    processing
  • (2) Process the queries at the sensor nodes
  • In-network processing
  • A query is divided into a set of sub-queries
  • Each sub-query is processed at the sensor node
    (called a participating node) that is responsible
    for generating its required data items
  • A coordinator node (one of the participating
    sensor nodes) is responsible for aggregating the
    results from the sensor nodes
  • Example: get the average temperature of sensors A
    to D for the next 10 min if they are higher
    than 100 °F
  • No need to transmit a large volume of data to a
    centralized server for processing
  • Issues: routing and aggregation

54
System Architecture
  • MSPU: mobile sensor processing units
  • Base station: connects with the MSPUs through a
    wireless link
  • Back-end server: maintains a database, and
    provides an interface for submitting CMQs and
    displaying query results, including performance
    statistics

55
Continuous Monitoring Query
  • CMQi consists of a set of sub-queries, SCMQi,1,
    SCMQi,2, ..., SCMQi,n, defined according to the
    distribution of the required nodes of the query
  • One of the nodes is the coordinator node and the
    others are participating nodes
  • Each sub-query contains a selection condition to
    process on the sensor data from its node
  • A CMQ contains an aggregation condition for
    execution,
  • i.e., to have the results from all the
    sub-queries
  • Calculating the maximum value requires at least
    two inputs

56
Execution of CMQs
  • Step 1
  • Evaluation on the sensor data items generated by
    a sensor node using the selection condition
    defined in the sub-query
  • Step 2
  • Sending sub-query evaluation results to the
    coordinator node for evaluation if the
    aggregation conditions have been satisfied
  • Report the query result to the client as a
    function of time during the activation period of
    the query

57
Execution of CMQs
58
An Example of a CMQ
  • Get the maximum temperature of sensors A and B,
    if they are higher than 100 °F, from now until
    15 min later
  • CMQ1 (SCMQ1,1, SCMQ1,2, Operation1, 12:00,
    12:15)
  • SCMQ1,1: if temperature T1 of sensor data from
    MSPU1 > 100 °F, return the temperature
  • SCMQ1,2: if temperature T2 of sensor data from
    MSPU2 > 100 °F, return the temperature
  • Aggregate condition1: the outputs from both
    SCMQ1,1 and SCMQ1,2 are data values
  • Aggregate operation1: IF T1 > T2, return T1 ELSE
    return T2
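
The example query can be written out as a short sketch. The function names are hypothetical, a sub-query that does not satisfy its selection condition is represented by `None` (an assumption of this sketch), and the begin and end times of the query are left out.

```python
def sub_query(temperature_f, threshold_f=100.0):
    """Selection condition of SCMQ1,1 and SCMQ1,2: return the reading
    if it is higher than the threshold, otherwise None."""
    return temperature_f if temperature_f > threshold_f else None


def aggregate_max(t1, t2):
    """Aggregate condition: both sub-queries returned data values.
    Aggregate operation: IF T1 > T2 return T1 ELSE return T2."""
    if t1 is None or t2 is None:
        return None  # condition not satisfied, no result reported
    return t1 if t1 > t2 else t2


# Hypothetical readings from MSPU1 and MSPU2 at one sampling instant.
both_hot = aggregate_max(sub_query(104.0), sub_query(101.5))
one_cool = aggregate_max(sub_query(104.0), sub_query(98.0))
```

`both_hot` evaluates to 104.0, while `one_cool` is `None` because the second sub-query did not pass its selection condition.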

59
Temporal Consistency
  • The sensor nodes follow their pre-set frequency
    (period) to generate sensor data values
  • A sensor data value becomes invalid once a new
    version is generated
  • Data version x is valid if creation time of x +
    generation period > current time
  • The main purpose of a CMQ is to monitor system
    environment
  • Not to miss the occurrences of any such events
  • Require continuous evaluation on sensor data
  • Ensure that all results generated from the CMQ
    are correct (consistent with the real situation
    in the monitoring environment)
  • Require each evaluation on temporally consistent
    data such that they are valid at the same time
    point

60
Temporal Inconsistency Problem
  • If MSPU1 is assigned to be the coordinator node,
    MSPU2 will forward its sub-query results to MSPU1
  • Due to communication delay, the set of sub-query
    results from MSPU2 received by MSPU1 will be
    shifted by the transmission delay
  • The generated query results may become incorrect
  • Incorrect light intensity at the arrival time of
    a bird
  • Incorrect maximum temperature of the two rooms

61
Temporal Inconsistency Problem
62
Temporal Inconsistency Problem
  • Time-stamping technique
  • Using time-stamp to label the validity of a data
    version
  • From lower valid time (LVT) to upper valid time
    (UVT)
  • Relative consistency
  • The intersection of the validity intervals of all
    the accessed data items is non-empty
  • The data versions are not too old (currency
    requirement)
  • Buffering
  • The coordinator node buffers the received
    sub-query results
  • Evaluation follows the relative consistency
    requirement
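
The relative consistency check can be sketched as an interval intersection test. Assumptions of this sketch: validity intervals are half-open, from LVT (inclusive) to UVT (exclusive), and the currency requirement is expressed as a maximum age measured from the LVT. The function name and the time values are hypothetical.

```python
def relatively_consistent(intervals, now, max_age):
    """intervals: one (lvt, uvt) pair per accessed data version.

    True if all validity intervals share a common time point and
    no version is older than max_age at time now."""
    latest_start = max(lvt for lvt, _ in intervals)
    earliest_end = min(uvt for _, uvt in intervals)
    overlap = latest_start < earliest_end  # non-empty intersection
    fresh = all(now - lvt <= max_age for lvt, _ in intervals)
    return overlap and fresh


# Versions valid during [0, 5) and [3, 8): both valid during [3, 5).
ok = relatively_consistent([(0, 5), (3, 8)], now=6, max_age=10)
# Versions valid during [0, 5) and [5, 10): never valid together.
disjoint = relatively_consistent([(0, 5), (5, 10)], now=6, max_age=10)
# Overlapping intervals, but the data is too old by now.
stale = relatively_consistent([(0, 5), (3, 8)], now=20, max_age=10)
```

A coordinator that buffers sub-query results could run such a check over the buffered versions before each evaluation and skip combinations for which it returns False.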

63
Aggregation Problem
  • How to aggregate the sub-query results from the
    participating nodes?
  • Objectives
  • To minimize the aggregation cost (data
    communication cost)
  • Fault tolerance to message loss
  • Minimize the processing cost at the coordinator
    node
  • Centralized aggregation
  • Select a coordinator node which is close to all
    the participating nodes

64
Periodic Pushing
  • The latest generated sub-query results form a
    message and are forwarded to the coordinator node
    every fixed submission period
  • Each message contains several sensor data
    versions of a data item to minimize the message
    overhead
  • Results are time-stamped to indicate their
    validity intervals
  • Evaluation at the coordinator node follows the
    time-stamps by searching the received data at the
    buffer
  • Message loss can easily be detected

65
Periodic Pushing
66
Conditional Pushing
  • Aims to reduce the sizes of data versions for
    aggregation
  • The scheme is the same as periodic pushing except
    the data values are compressed
  • Successive data versions with the same value are
    compressed
  • Although the redundancy in data values within a
    message is eliminated, the redundancy in
    successive messages cannot be eliminated
  • The amount of bandwidth saved depends on how the
    data values from the sensor node change
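
The slides do not say how successive equal values are compressed; run-length encoding is one simple possibility and is used here purely as an assumption. The function names and sample values are hypothetical.

```python
def compress(values):
    """Collapse successive equal values into (value, count) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([value, 1])  # start a new run
    return [tuple(run) for run in runs]


def decompress(runs):
    """Rebuild the original sequence of data versions."""
    return [value for value, count in runs for _ in range(count)]


# Six data versions of one item collected during a submission period.
versions = [101, 101, 101, 103, 103, 101]
message = compress(versions)
restored = decompress(message)
```

The six versions shrink to three pairs. A sensor whose value changes at every sample would gain nothing, which matches the remark that the saving depends on how the data values change.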

67
Conditional Pushing
68
Sequential Pushing
  • The sensor nodes of a CMQ are assumed to be close
    to each other and they can directly communicate
    with each other
  • The submission of sub-query results is triggered
    by a triggering node which is one of the
    participating nodes of the CMQ
  • The determination of which participating node to
    be the triggering node is based on which one has
    the least number of satisfied results in
    evaluation
  • The pushing of sub-query results follows a
    sequential order according to the evaluation
    results
  • Partial processing of the operation, which is
    originally to be performed at the coordinator
    node, is performed on its way
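
A rough sketch of sequential pushing for a maximum query. Several details are assumptions, since the slides describe the scheme only in outline: the first node in `sequence` is the triggering node, the last one acts as coordinator, each forward counts as one message, and a node whose sub-query is false stops the chain. All names are hypothetical.

```python
def sequential_push(sequence, readings, threshold):
    """Pass a partial maximum along the node sequence.

    Returns (result, messages). result is None as soon as one
    sub-query is false, because the aggregation needs all results."""
    partial = None
    messages = 0
    for node in sequence:
        value = readings[node]
        if value <= threshold:
            return None, messages  # false sub-query: stop pushing
        # Partial processing of the aggregate on the way.
        partial = value if partial is None else max(partial, value)
        if node != sequence[-1]:
            messages += 1  # push the partial result to the next node
    return partial, messages


sequence = ["A", "B", "C"]
all_true = sequential_push(sequence, {"A": 104, "B": 101, "C": 107}, 100)
first_false = sequential_push(sequence, {"A": 98, "B": 101, "C": 107}, 100)
```

When the triggering node's own sub-query is false, no message is sent at all, which illustrates why placing the node least likely to be satisfied first keeps the message count low.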

69
Sequential Pushing
70
Sequential Pushing
  • Due to dynamic properties of sensor data, the
    probability of satisfying the condition in a
    sub-query at a node may change with time
  • Reordering of the nodes
  • Assigns the false node to be the first node in
    the sequence.
  • All the nodes following the false node will
    remain in the same relative order to each other.
  • All the nodes in front of the false node remain
    in their original relative order. They rejoin the
    node sequence by putting them after the last node
    of the original sequence
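
The reordering rule above amounts to rotating the node sequence so that it starts at the false node. A minimal sketch, with a hypothetical function name and node labels:

```python
def reorder(sequence, false_node):
    """Rotate the sequence so that false_node comes first.

    Nodes after the false node keep their relative order; nodes that
    were in front of it rejoin after the last node, in original order."""
    position = sequence.index(false_node)
    return sequence[position:] + sequence[:position]


new_sequence = reorder(["A", "B", "C", "D"], "C")
```

Here `new_sequence` is C, D, A, B: D still follows C, and A and B keep their order behind the former last node D.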

71
Sequential Pushing
72
SeqPush Vs Centralized Scheme
  • The total number of messages is normally smaller
    especially for the case where the probability of
    satisfying the aggregation conditions in all the
    sub-queries is not high
  • The processing workload at the coordinator node
    is lower as the participating nodes are
    responsible for partial computation of the
    aggregation operation
  • The processing cost in searching for relatively
    consistent sensor data values will be lower due
    to a false result from a sub-query

73
References
  • Schiller 8.3
  • David B. Johnson and David A. Maltz, Dynamic
    Source Routing in Ad Hoc Wireless Networks
    (DSR), in Mobile Computing, 1996.
  • Young-Bae Ko and Nitin H. Vaidya, Location-aided
    Routing (LAR) in Mobile Ad Hoc Networks, in
    Proceedings of 1998 ACM International Conference
    on Mobile Computing and Networking
  • Y. Yao and J. E. Gehrke, Query Processing in
    Sensor Networks, in Proceedings of the First
    Biennial Conference on Innovative Data Systems
    Research (CIDR 2003), Asilomar, California,
    January 2003