DEMS: A Data Mining Based Technique to Handle Missing Data in Mobile Sensor Network Applications - PowerPoint PPT Presentation

About This Presentation
Title:

DEMS: A Data Mining Based Technique to Handle Missing Data in Mobile Sensor Network Applications

Description:

DEMS: A Data Mining Based Technique to Handle Missing Data in Mobile Sensor Network Applications Le Gruenwald Md. Shiblee Sadik Rahul Shukla Hanqing Yang – PowerPoint PPT presentation

Number of Views:545
Avg rating:3.0/5.0
Slides: 49
Provided by: mssad
Category:

less

Transcript and Presenter's Notes

Title: DEMS: A Data Mining Based Technique to Handle Missing Data in Mobile Sensor Network Applications


1
DEMS A Data Mining Based Technique to Handle
Missing Data in Mobile Sensor Network
Applications
  • Le Gruenwald Md. Shiblee Sadik Rahul Shukla
    Hanqing Yang
  • School of Computer Science
  • University of Oklahoma
  • Norman, Oklahoma, USA
  • ggruenwald_at_ou.edu

2
Outline
  • Research objective
  • Current approaches
  • The proposed approach DEMS
  • Performance Evaluation
  • Conclusions and future work

3
Mobile Sensor Networks
  • A typical mobile sensor network
  • Sensor nodes are provided with motion
    capabilities
  • Sensor nodes can relocate themselves
  • Sensor nodes may move continuously/randomly
  • Sensor nodes may move periodically to make up for
    lost/missing sensors
  • Sensor nodes send data to a base station.

4
Missing Sensor Data
  • Missing sensor data sensor readings that fail
    to reach the base station or are corrupted when
    reaching the base station
  • Reasons for missing sensor data
  • Power shortage (sensor nodes are battery-powered)
  • Mal-functioning of sensor nodes (hardware
    failure)
  • Networking issues
  • Connection failures
  • Data package collision
  • etc.

5
Research Objective
  • Goal Develop an effective algorithm to estimate
    missing sensors readings in a mobile sensor
    network application.

6
Research Issues
  • Issues common with static sensor networks
  • Infiniteness, fast arrival rate, concept drifts
  • Additional issues due to mobility of mobile
    sensors
  • Spatial relations
  • The spatial relation between two sensors
    readings is distorted by the mobility of mobile
    sensors
  • Temporal relations
  • The history data of a mobile sensor that are
    generated at different locations may not
    necessarily possess the temporal relationships
    with the data in the current round of sensor
    readings
  • Frequent power failure
  • Power outage is more common in mobile sensor
    network compared to static sensor network because
    mobility requires excessive power.

7
Current Approaches
Statistics based techniques
Fig 1. A taxonomy of techniques for handling
missing data
8
The Proposed Approach DEMS
  • DEMS Data Estimation for Mobile Sensors
  • Based on two important concepts
  • Virtual Static Sensor (VSS)
  • A fictitious static sensor which mimics a real
    static sensor
  • helps reconstruct the spatial and temporal
    relations among the sensors readings
  • Association Rule Mining
  • A popular method of discovering relationships
    among different items
  • helps explore the relationships among sensors
    readings.

9
DEMS Components
  • DEMS has three major components
  • Mapping Real Mobile Sensor (RMS) to Virtual
    Static Sensor (VSS)
  • Divides the entire area of coverage into small
    hexagons
  • A hexagon the coverage area of a VSS with VSS
    being at the center of the hexagon
  • Converts RMS readings into VSS readings
  • Association rule mining
  • Constructs a novel data structure called
    MASTER-tree to capture the association rules
    among VSSs
  • Updates MASTER-trees to capture the most recent
    association rules among VSSs
  • Data estimation
  • Uses the most recent association rules to
    estimate a missing VSS reading
  • Uses the estimated value of the missing VSS
    reading as the value of the missing RMS reading.

10
DEMS Mapping RSS to VSS
  • What is VSS?
  • A VSS is a fictitious static sensor
  • A VSS reading is based on one or more RMSs
    readings
  • A VSS has a unique identifier and has a unique
    area of coverage
  • Why do we need VSS?
  • Each VSS has a fixed location hence the spatial
    relations among VSSs readings can be obtained
  • Each VSS reading is generated from a fixed
    location hence history readings might have
    strong temporal relations with the current
    reading.

11
DEMS Mapping RSS to VSS (Cont.)
  • How to construct a VSS?
  • We divide the entire monitoring area into small
    hexagons
  • A virtual static sensor is the center of a
    hexagon
  • Each hexagon is a coverage area of a virtual
    static sensor.

12
DEMS Mapping RMS to VSS
  • Goal map RMSs readings to VSSs readings so
    that spatial and temporal relations among the
    sensor readings can be restored.
  • Two types of mapping
  • Mapping of a non-missing RMS to VSS
  • Mapping of a missing RMS to VSS

13
DEMS Mapping of a non-missing RMS to VSS
  • If a VSS contains one RMS within its coverage
    area, the RMSs reading is used as the VSS
    reading
  • If a VSS contains more than one RMSs, the average
    of the RMSs readings is used as the VSS reading
  • If a VSS contains no RMS, the VSS is called
    inactive.

14
DEMS Mapping of a missing RMS to VSS
  • Why mapping of a missing RMS is difficult?
  • RMS location is the key to RMS to VSS mapping
  • If a RMS is missing, it is very likely that its
    data and location would be missing together
  • Hence mapping of a missing RMS to VSS requires
    intelligence
  • The solution
  • A missing RMS is mapped to a VSS using a
    trajectory mining approach for location
    prediction Morzy, 2007.

15
DEMS Mapping of a missing RMS to VSS (cont.)
  • What is a trajectory?
  • A trajectory is the sequence of hexagons that a
    mobile sensor traverses
  • If a mobile sensor is not missing, it reports its
    location and the location is contained by one
    hexagon
  • Hence the sequence of hexagons is called a
    trajectory.

(V14,V9,V11,V4,V3,V10) is the trajectory of M1
16
DEMS Mapping of a missing RMS to VSS (cont.)
  • Each RMS has a trajectory
  • DEMS periodically stores the trajectories
    (collected from all RMSs) into a frequency
    pattern tree
  • Frequency pattern tree
  • It has a root labeled null
  • Each node consists of an ID (hexagon ID) and
    count (number of times it appears in the
    trajectories)

Example 5 trajectories 1. (V14, V9, V11, V4,
V2, V8, V1) 2. (V14, V9, V11, V4, V3, V10, V1) 3.
(V14, V9, V5, V4, V3, V10, V8) 4. (V14, V9, V11,
V4, V3, V10, V1, V8) 5. (V2, V3, V6, V10, V8, V1)
Fig 5. A frequency pattern tree
17
DEMS Mapping of a missing RMS to VSS (cont.)
  • If a RMS is missing, it is mapped to a VSS from
    the frequency pattern tree and its own trajectory
  • Consider the last known trajectory of M1
    (V14,V9,V11,V4)

V3 Predicted next hexagon in trajectory of M1
Fig 7. Frequency pattern tree
18
DEMS Mapping RMS to VSS (cont.)
Procedure mapReal2Virtual(RealSensorData listRSData, VirtualSensorData listVSData) Procedure mapReal2Virtual(RealSensorData listRSData, VirtualSensorData listVSData)
1 for each real sensor rs
2 if(rs is not missing)
3 location ? listRSData(rs).Location
4 vs ? findVirtualSensor(location)
5 listVSData(vs).addReading(listRSData(rs).Reading)
6 else
7 location ? predictLocation(rs)
8 vs ? findVirtualSensor(location)
9 listVSData(vs).status?missing
10 end loop
11 for each virtual static sensor vs
12 if(listVSData(vs) has data)
13 listVSData(vs).status?active
14 listVSData(vs).reading?average(listVSData(vs).Readings)
15 else
16 if(listVSData(vs).status is not missing)
17 listVSData(vs).status ?inactive
18 end loop
end procedure end procedure
Fig 8. Mapping algorithm
19
DEMS Components
  • DEMS has three major components
  • Real Mobile Sensor (RMS) to Virtual Static Sensor
    (VSS)
  • Divides the entire area of coverage into small
    hexagons,
  • Each hexagon is the coverage area of a virtual
    static sensor where the virtual static sensor is
    assumed to be sitting in the middle of the
    hexagon,
  • Converts RMS readings into VSS readings.
  • Association rule mining
  • Constructs a novel data structure called
    MASTER-tree to capture the association rules
    among VSSs
  • Updates MASTER-trees to capture the most recent
    association rules among VSSs.
  • Data estimation
  • Uses the most recent association rules to
    estimate a missing VSS reading,
  • Uses the missing VSS reading as missing RMS
    reading.

20
DEMS Association Rule Mining
  • Goal mine and represent the potential
    association rules among the VSS readings.
  • We propose a novel data structure (called
    MASTER-tree) to mine and represent the
    association rules among VSS readings
  • MASTER-tree basics
  • A MASTER-tree is capable of mining any kind of
    association rules among any number of VSSs
  • A MASTER-tree represents potential association
    rules among the VSS readings
  • A path in MASTER-tree represents a potential
    association rule.

Fig 8. A MASTER-tree
21
DEMS Association Rule Mining (cont.)
  • The potential number of association rules among
    VSSs grows exponentially with the number of VSSs
  • To restrict the number of association rules, DEMS
    clusters the VSSs into small groups and
    constructs one MASTER-tree for each group
  • DEMS uses Agglomerative clustering
  • Agglomerative clustering starts with every VSS as
    an individual cluster
  • At each step it merges two closest clusters based
    on their pair-wise distances into one if the
    total number of VSSs in the new cluster does not
    exceed a user-defined maximum number of VSSs in
    one cluster.

22
DEMS Association Rule Mining (cont.)
Details
23
DEMS The MASTER-tree Projection Module (cont.)
24
DEMS Association Rule Mining (cont.)
5
V21, 10, V11, 10 ? V311, 20 Support
40 Confidence 100
V211, 20, V311, 20 ? V11, 20 Support
60 Confidence 66
(2, 31, 485, 7657, 121937)
Fig 11. MASTER-tree with count
25
DEMS Association Rule Mining (cont.)
  • Let
  • The minimum support 50
  • The minimum confidence 50
  • A typical association rule becomes
  • V211, 20, V311, 20 ? V11, 20
  • The rule meaning if the VSS reading of V2 is
    within 10 to 20 and the VSS reading for V3 is
    within 10 to 20, the VSS reading for V1 is most
    likely within 0 to 20.
  • There exists a path from the root node to V11,
    20 via V211, 20 and V311, 20 in the
    Master-tree.

26
DEMS Components
  • DEMS composed of three major components
  • Real Mobile Sensor (RMS) to Virtual Static Sensor
    (VSS)
  • Divides the entire area of coverage into small
    hexagons,
  • Each hexagon is the coverage area of a virtual
    static sensor where the virtual static sensor is
    assumed to be sitting in the middle of the
    hexagon,
  • Converts RMS readings into VSS readings.
  • Association rule mining
  • Construct a novel data structure called
    MASTER-tree to capture the association rules
    among VSSs,
  • Update MASTER-trees to capture most recent
    association rules among VSSs.
  • Data estimation
  • Uses the most recent association rules to
    estimate a missing VSS reading
  • Uses the estimated value of the missing VSS
    reading as the estimated value of the missing RMS
    reading.

27
DEMS Data Estimation
  • Goal estimate the missing VSS reading.
  • The data estimation modules estimates the missing
    VSS
  • The estimated reading for the missing VSS is used
    as the estimated reading for the missing RMS.

28
DEMS Data Estimation (cont.)
Fig 12. Flowchart of the data estimation module
(A step by step example)
29
Performance Evaluation
  • Simulation Model
  • We simulate the missing data for our datasets
  • A sensor is missing randomly (approximately
    5-10) for a consecutive random number (10 - 20)
    of rounds
  • Data and location both are missing for a missing
    sensor
  • We use DEMS, TinyDB, SPIRIT and Average method to
    estimate missing readings
  • TinyDB
  • An average based technique which estimates the
    missing data by taking the average of the
    readings from other sensor readings in the
    current round.
  • SPIRIT
  • An auto-regression based technique which
    estimates the missing data based on the readings
    in the previous rounds
  • Average
  • The average of other sensor readings is used as
    the estimated reading
  • We compare the techniques based on mean absolute
    error (MAE)
  • MAE Sestimation error/number of estimations.

30
Performance Evaluation (cont.)
  • Datasets
  • DAPPLE Project Dataset A real life dataset
  • The carbon monoxide (CO) readings in the range
    0, 6 were collected over a period of two weeks
    around Marylebone Road in London
  • The mobile sensors monitoring the atmospheric CO
    level are attached to PDAs which store these
    readings
  • We chose Thursday, 20th May 2004, when three
    sensors were simultaneously recording for about
    32 minutes, resulting in 600 rounds (after
    disregarding the missing rounds) of CO readings
  • Factory Floor Temperature Dataset A synthetic
    dataset
  • A simulation of a mobile sensor network for
    monitoring factory floor temperatures
  • Machines are placed on a floor
  • Some machines are turned on for a number of
    rounds the temperatures on these machines reach
    a high constant temperature and heat disperse on
    the floor.
  • 100 mobile sensors were roaming around in random
    directions to monitor the factory floor and
    report the temperature readings in the range 0,
    100C from different locations.
  • The mobile sensor readings were sampled once per
    hour the total rounds of readings are 5000 from
    100 mobile sensors.

31
Performance Evaluation (cont.)
Fig 13. Impacts of number of rounds on MAE for
DAPPLE project dataset
Table 1. Average MAE for DAPPLE project dataset
Approach Average MAE
DEMS 0
Average 1.2717
TinyDB 0.6331
SPIRIT 0.9437
32
Performance Evaluation (cont.)
Fig 14. Impacts of number of rounds on MAE for
factory floor dataset
Table 2. Average MAE for factory floor dataset
Approach Average MAE
DEMS 2.2538
Average 14.778
TinyDB 6.9621
SPIRIT 4.7472
33
Conclusions and Future Work
  • We proposed DEMS
  • A novel data estimation technique for mobile
    sensor networks based on data mining and virtual
    static sensor concepts
  • Estimates missing sensor data with high accuracy
  • Future work Extend DEMS to include
  • Multiple base stations
  • De-synchronized mobile sensor networks
  • Cluster sensor networks.

34
Thanks
  • Questions?

35
MASTER-tree Construction
Back
36
MASTER-tree projection and data estimation An
example
  • Simulation

37
Assume
  • Three Node (A, B, C)
  • One dimension of Data (Temperature)
  • Upper bound 30 lower bound 0, cell size 10
  • dis(A,B) 4, dis(A,C) 3 and dis(B,C) 5
  • MCSS 10
  • minSup 25
  • minConf 75

C
A
B
38
Pattern trees
Ø
Ø
Ø
A
C
B
C
B
A
A
B
C
C
B
B
B
A
A
B
C
C
B
A
C
Pattern tree for A
Pattern tree for B
Pattern tree for C
Ø
C
A
B
A
C
B
A
B
C
C
A
C
B
A
B
Final MASTER tree without GS
39
Data Sequence
A 4 14 11 18 6 8
B 8 18 15 22 10 12
C 7 17 14 21 9 ?
40
Ø
B
A
C
A
B
C
B
C
A
A
B
C
A 4 14 11 18 6 8
B 8 18 15 22 10 12
C 7 17 14 21 9 ?
41
Ø
B
A
C
A
B
C
A
C
B
B
C
A
B
C
A
A
B
C
C
B
A
A 4 14 11 18 6 8
B 8 18 15 22 10 12
C 7 17 14 21 9 ?
42
Ø
B
A
C
A
B
C
A
C
B
B
C
A
B
C
A
A
B
C
C
B
A
A 4 14 11 18 6 8
B 8 18 15 22 10 12
C 7 17 14 21 9 ?
43
Ø
B
A
C
A
B
C
A
C
B
B
C
A
A
C
B
B
C
A
A
A
B
C
C
B
A
C
B
A
A 4 14 11 18 6 8
B 8 18 15 22 10 12
C 7 17 14 21 9 ?
44
Ø
B
A
C
A
B
C
A
C
B
B
C
A
A
C
B
B
C
A
A
A
B
C
C
B
A
C
B
A
C
A
A 4 14 11 18 6 8
B 8 18 15 22 10 12
C 7 17 14 21 9 ?
45
Ø
B
A
C
A
B
C
A
C
B
B
C
A
A
C
B
B
C
A
A
A
B
C
C
B
A
C
B
A
C
A
A 4 14 11 18 6 8
B 8 18 15 22 10 12
C 7 17 14 21 9 ?
46
MCSS 10
Ø
Rule Ø ?C 0, 29 Supp 100 Conf 100
B
A
2
2
1
A
B
C
A
C
B
B
C
A
A
C
B
B
C
A
A
A
B
C
C
B
A
C
B
A
C
A
A 4 14 11 18 6 8
B 8 18 15 22 10 12
C 7 17 14 21 9 ?
47
MCSS 10
Ø
Rule Ø ?C 0, 19 Supp 80 Conf 80
B
A
2
2
1
A
B
C
A
C
B
B
C
A
A
C
B
B
C
A
A
A
B
C
C
B
A
C
B
A
C
A
A 4 14 11 18 6 8
B 8 18 15 22 10 12
C 7 17 14 21 9 ?
48
MCSS 10
Back to presentation
Ø
Rule Ø ?C 0, 29 Supp 80 Conf 80
B
2
3
2
2
1
Rule A ?C 0, 9 Supp 40 Conf 100
A
B
C
A
C
B
B
2
A
A
C
B
B
C
A
A
A
B
C
C
B
A
C
B
A
C
A
A 4 14 11 18 6 8
B 8 18 15 22 10 12
C 7 17 14 21 9 ?
Write a Comment
User Comments (0)
About PowerShow.com