Provided by: mte6
Learn more at: http://www.it.iitb.ac.in
RFID Data Management
  • Kamlesh Laddhad (05329014)
  • Karthik B. (05329021)
  • Guide: Prof. Bernard Menezes

  • Introduction to RFID Technology.
  • Issues with RFID Technology.
  • RFID Data Characteristics.
  • Data Warehousing.
  • Expressive Temporal Model: Dynamic Relationship
    ER Model
  • RFID - Cuboids.
  • Use of Bitmap Datatype.
  • Data Cleaning.
  • Extensible Sensor stream Processing (ESP)
  • Statistical sMoothing for Unreliable RFid (SMURF)
  • Future Plans.

  • Radio Frequency Identification (RFID)
  • It is an Automatic Identification and Data
    Capture technology.
  • Fast
  • No contact or line of sight needed.
  • Uses radio-frequency waves to transfer data.
  • Components
  • Tag: a small, low-cost device that can hold a
    limited amount of data.
  • Associated with objects, such as pallets, cases,
    and even individual items.
  • Reader: recognizes the presence of a tag and reads
    the information stored on it.
  • A unique electronic product code (EPC) is associated
    with each tag.
  • By placing RFID tag readers at various locations,
    one can track the movement of objects through
    supply chain networks.

Applications and Adoptions
  • Supply Chain Management: real-time inventory
  • US Department of Defense: shipments to armed
  • Retail: active shelves monitor product
  • Wal-Mart, Albertsons: major retail stores
  • Access control: toll collection, transportation.
  • Airline luggage management
  • British Airways: 20 million bags a year
  • Implemented to reduce lost/misplaced luggage
  • Anti-counterfeiting and security
  • Food and Drug Administration: to reduce
    counterfeiting in the pharmaceutical supply chain

Prospects for RFID Research
  • The physics of building tags and readers
  • Tags have few gates: apart from basic operation,
    they have very little computing power.
  • Radio-frequency has some issues with operating in
    certain physical mediums.
  • The privacy and safety issues
  • Complex encryption schemes are not possible on
    RFID tags.
  • Counterfeiting by means of either illegitimate
    readers or spoofed tags is possible.
  • Reader-tag communication is wireless: third
    parties can eavesdrop on signals.
  • Software architecture to collect, filter,
    organize, and answer online queries
  • The number of tags is proportional to the number of items being
  • The number of readers is proportional to the number of traceable
    strategic locations/areas
  • Each reader picks up tag signals on continuous
  • Data generated by RFID systems is enormous
  • E.g. Wal-Mart is expected to generate 7 terabytes
    of RFID data per day.
  • Our focus: the third stream.

Data Warehousing Techniques
Data Management Challenges
  • Data explosion: an example
  • A retailer with 3,000 stores, selling 10,000
    items a day per store.
  • Each item moves 10 times on average before being
  • Movement recorded as (EPC, location, second)
  • Data volume: 300 million tuples per day.
  • Example OLAP query: average time for items to
    move from warehouse to checkout counter in March
  • Costly to answer if there are a billion tuples
    for March 2006.
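
The daily volume follows directly from the three figures on this slide. A quick Python check is sketched here; the variable names are illustrative, and extrapolating the same rate over the 31 days of March is an added assumption rather than a figure from the slides:

```python
# Figures taken from the slide; variable names are illustrative.
stores = 3_000
items_sold_per_store_per_day = 10_000
moves_per_item = 10

# One (EPC, location, second) tuple is recorded per movement.
tuples_per_day = stores * items_sold_per_store_per_day * moves_per_item

# Assumption: the same rate holds for every one of the 31 days in March.
tuples_in_march = tuples_per_day * 31

print(f"{tuples_per_day:,} tuples per day")
print(f"{tuples_in_march:,} tuples in March")
```

Under that assumption a month of raw readings runs into the billions of tuples, which is why a full scan per OLAP query is costly.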

Data Characteristics
  • Temporal and history oriented
  • Applications dynamically generate observations
  • Objects' locations and the containment relationships
    among objects change
  • Need: an expressive data model.
  • Inaccurate data and implicit semantics
  • False positive: non-existing tag incorrectly
  • False negative: reader missed a tag which was in
    its vicinity.
  • Noisy data: duplicate readings (redundancy),
    i.e. the same tag read more than once.
  • Need: automated data filtering and
  • Streaming and large volume
  • Objects stay in place for long durations; readers
    record them periodically. Large data keeps
  • We need to preserve this data for tracking and
  • Need: a scalable storage scheme and compression
    techniques to reduce data.
  • Data Granularity
  • Data collection granularity needs to be decided
  • Differs across applications.

Warehousing Helps!!
  • Lossless compression
  • Remove redundancy: (r1,l1,t1) (r1,l1,t2) ...
    (r1,l1,t10) => (r1,l1,t1,t10)
  • Group objects that move and stay together.
  • Data cleaning: multi-reading, missed-reading,
    error-reading, bulky movement.
  • Data mining: find trends, outliers, frequent,
    sequential, flow patterns.
  • Multi-dimensional summary: product, location,
    time, …
  • Store manager: check item movements from the
    backroom to different shelves in his store
  • Region manager: collapse intra-store movements
    and look at distribution centers, warehouses, and
  • Query processing
  • Support for OLAP: roll-up, drill-down, slice, and
  • Path query: new to RFID warehouses, about the
    structure of paths
  • Which products that go through quality control
    have shorter paths?
  • What locations are common to the paths of a set
    of defective auto-parts?
  • Identify containers at a port that have deviated
    from their historic paths
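
The redundancy-removal step listed on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming readings arrive as (EPC, location, time) tuples sorted by EPC and then by time; the function name and sample data are hypothetical:

```python
from itertools import groupby

def collapse_readings(readings):
    """Collapse runs of readings into (epc, location, time_in, time_out).

    Assumes `readings` is sorted by EPC and then by time, so that all
    readings of one uninterrupted stay are adjacent.
    """
    stays = []
    # groupby only merges *consecutive* equal keys, so an item that leaves
    # a location and later returns produces two separate stay records.
    for (epc, location), run in groupby(readings, key=lambda r: (r[0], r[1])):
        times = [t for _, _, t in run]
        stays.append((epc, location, times[0], times[-1]))
    return stays

# Ten readings of r1 at l1 (t = 1..10), then two readings at l2.
readings = [("r1", "l1", t) for t in range(1, 11)]
readings += [("r1", "l2", 11), ("r1", "l2", 12)]

stays = collapse_readings(readings)
print(stays)
```

Twelve raw tuples shrink to two stay records here; the saving grows with the time an object remains at one location.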

Dynamic Relationship ER Model
  • Proposed by Wang and Liu from Siemens.
  • RFID entities are static and are not altered.
  • RFID relationships are dynamic and change all the
  • Two types of dynamic relationships added
  • Event-based dynamic relationship: a timestamp
    attribute is added to represent the time at which
    the event occurred.
  • State-based dynamic relationship: tstart and tend
    attributes are added to represent the lifespan of a

  • Static entity tables
  • OBJECT (object_epc, name, description)
  • LOCATION (location_id, name, owner)
  • SENSOR (sensor_epc, name, description)
  • TRANSACTION (transaction_id, transaction_type)
  • Dynamic relationship tables
  • OBSERVATION (sensor_epc, value, timestamp)
  • OBJECTLOCATION (epc, location_id, tstart, tend)
  • TRANSACTIONITEM (transaction_id, epc, timestamp)
  • CONTAINMENT (epc, parent_epc, tstart, tend)
  • SENSORLOCATION (sensor_epc, location_id, position,
    tstart, tend)

  • Missing RFID Object Detection
  • Find when and where the object with EPC 'MEPC'
    was lost.
  • select location_id, tstart, tend from
    objectlocation where epc = 'MEPC' and tstart = (
    select max(o.tstart) from objectlocation o where
    o.epc = 'MEPC' )
  • Check if there are missing objects at current
    location C, knowing that all objects were
    complete at previous location L at time T.
  • select l.epc from objectlocation l where
    l.location_id = 'L' and l.tstart <= 'T' and
    l.tend >= 'T' and l.epc not in ( select c.epc
    from objectlocation c where c.location_id = 'C' )
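
Both detection queries can be tried against an in-memory SQLite database. The sketch below is only an illustration: the integer timestamps, the sample rows, and the use of SQLite are assumptions, and only the OBJECTLOCATION columns come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objectlocation "
    "(epc TEXT, location_id TEXT, tstart INTEGER, tend INTEGER)"
)
# Hypothetical data: three items were at L, only two arrived at C.
conn.executemany(
    "INSERT INTO objectlocation VALUES (?, ?, ?, ?)",
    [
        ("E1", "L", 0, 10), ("E2", "L", 0, 10), ("E3", "L", 0, 10),
        ("E1", "C", 12, 20), ("E2", "C", 12, 20),
    ],
)

# Query 1: last known location of a given EPC.
last_seen = conn.execute(
    """
    SELECT location_id, tstart, tend FROM objectlocation
    WHERE epc = ? AND tstart = (
        SELECT max(o.tstart) FROM objectlocation o WHERE o.epc = ?)
    """,
    ("E3", "E3"),
).fetchall()

# Query 2: items present at L at time T that never showed up at C.
T = 5
missing = [
    row[0]
    for row in conn.execute(
        """
        SELECT l.epc FROM objectlocation l
        WHERE l.location_id = 'L' AND l.tstart <= ? AND l.tend >= ?
          AND l.epc NOT IN (
              SELECT c.epc FROM objectlocation c WHERE c.location_id = 'C')
        ORDER BY l.epc
        """,
        (T, T),
    )
]
print(last_seen, missing)
```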

  • RFID Object Moving Time Inquiry
  • How long does it take to supply object 'OEPC' from
    location S to location E?
  • select (e.tstart - s.tstart) as supplying_time from
    objectlocation e, objectlocation s where e.epc =
    'OEPC' and s.epc = 'OEPC' and s.location_id = 'S'
    and e.location_id = 'E'
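
The moving-time query is a self-join on OBJECTLOCATION. A small runnable sketch follows, assuming integer timestamps and hypothetical sample rows in an in-memory SQLite database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objectlocation "
    "(epc TEXT, location_id TEXT, tstart INTEGER, tend INTEGER)"
)
# Hypothetical path of one object: S -> M -> E.
conn.executemany(
    "INSERT INTO objectlocation VALUES (?, ?, ?, ?)",
    [("OEPC", "S", 100, 160), ("OEPC", "M", 200, 260), ("OEPC", "E", 400, 450)],
)

# Self-join: one row alias for the start location, one for the end location.
(supplying_time,) = conn.execute(
    """
    SELECT e.tstart - s.tstart AS supplying_time
    FROM objectlocation e JOIN objectlocation s ON e.epc = s.epc
    WHERE e.epc = 'OEPC' AND s.location_id = 'S' AND e.location_id = 'E'
    """
).fetchone()
print(supplying_time)
```

The query measures arrival at S to arrival at E; measuring from departure would use s.tend instead, which is a modelling choice the slides leave open.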

Compression Idea
  • Bulky object movements
  • Objects often move and stay together through the
    supply chain.
  • If 1000 packs of product P stay together at the
    distribution center, register a single record.
  • (GID, distribution center, time_in, time_out).
  • GID is a generalized identifier that represents
    the 1000 packs that stayed together at the
    distribution center
  • Analysis usually takes place at a much higher
    level of abstraction than the one present in raw
    RFID data
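
As a rough illustration of the grouping idea, the Python sketch below replaces item-level stays with one record per group. The representation of stays, the function name, and the "g0", "g1" identifiers are all assumptions made for this example:

```python
from collections import defaultdict

def group_stays(stays):
    """Group item-level stays that share location and time interval.

    `stays` holds (epc, location, time_in, time_out) tuples. Returns
    (stay_table, gid_members): one (gid, location, time_in, time_out)
    record per group, plus a mapping from each gid to its EPCs.
    """
    groups = defaultdict(list)
    for epc, location, time_in, time_out in stays:
        groups[(location, time_in, time_out)].append(epc)

    stay_table, gid_members = [], {}
    # Keys are unique, so sorting the items only ever compares the keys.
    for n, (key, epcs) in enumerate(sorted(groups.items())):
        gid = f"g{n}"  # illustrative identifier scheme
        stay_table.append((gid, *key))
        gid_members[gid] = sorted(epcs)
    return stay_table, gid_members

# 1000 packs that stayed together, plus one unrelated item.
stays = [(f"P{i:04d}", "distribution center", 1, 50) for i in range(1000)]
stays.append(("Q0001", "store", 60, 90))

stay_table, gid_members = group_stays(stays)
print(stay_table)
```

The 1001 item-level records collapse to two group records, with the EPC lists kept once on the side.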

RFID Cuboids
  • Fact table: (EPC, location, time_in, time_out).
  • In supply chain: items travel through a series of
    locations.
  • Query: what is the average time that product P
    stays at a store in location A?
  • Traditional cubes miss the path structure of the
  • Stay table: (GIDs, location, time_in, time_out)
  • Records information on items that stay together
    at a given location
  • If recording transitions instead: difficult to answer
    queries, lots of intersections needed
  • Map table: (GID, <GID1, ..., GIDn>)
  • Links together stages that belong to the same
    path. Provides additional compression and query
    processing efficiency
  • High-level GID points to lower-level GIDs
  • If saving complete EPC lists: high cost of I/O to
    retrieve long lists, costly query processing
  • Information table: (EPC list, attribute
    1, ..., attribute n)
  • Records path-independent attributes of the items,
    e.g., color, manufacturer, price.

EPC Overview
  • Electronic product code
  • Standard naming scheme, proposed by Auto-Id
  • An EPC uniquely identifies an item.
  • Format: <Header, Manager No., Object Class,
    Serial No.>
  • Header: identifies the length, type, structure,
    version and generation of the EPC.
  • Manager Number: identifies an organizational
  • Object Class: identifies a class, or type of
  • Serial Number: specific instance of the Object
    Class being tagged.
  • We will refer to
  • <Header, Manager No., Object Class> as the prefix
  • <Serial No.> as the suffix

Use of Bitmap Datatype
  • Observation: items move together.
  • Groups of items in the same proximity - e.g. on a
    shelf, on a shipment
  • Groups of items with same property - e.g. Same
  • Use a bitmap type for modeling a collection of
    EPCs that can occur in item tracking
  • Instead of storing a tuple per item, store a tuple
    for all the items having the same prefix.
  • New extra fields instead of epc
  • <Len, Suffix_length, Prefix, suffix_start,
    Suffix_end, bitmap>

Example Product Inventory
  • With EPC Collections
  • With epc_bitmaps

Use of Bitmap Datatype
  • Header (2 bits), EPC_Manager (21 bits), Object_Class (17 bits)
  • 0x4AA890001F62C160
  • …
  • 0x4AA890001FA0B38E
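
To see why the two EPCs on this slide can share one bitmap, the sketch below splits each into prefix and suffix. The 2/21/17-bit widths come from the slide; treating the remaining 24 bits of a 64-bit EPC as the serial number is an assumption, and the function names are hypothetical:

```python
# Header, manager and object-class widths are from the slide. The 24-bit
# serial number is an assumption: the remainder of a 64-bit EPC.
HEADER_BITS, MANAGER_BITS, CLASS_BITS, SERIAL_BITS = 2, 21, 17, 24

def split_epc(epc):
    """Split a 64-bit EPC into (prefix, suffix)."""
    suffix = epc & ((1 << SERIAL_BITS) - 1)
    prefix = epc >> SERIAL_BITS
    return prefix, suffix

def prefix_fields(prefix):
    """Break a 40-bit prefix into (header, manager, object_class)."""
    object_class = prefix & ((1 << CLASS_BITS) - 1)
    manager = (prefix >> CLASS_BITS) & ((1 << MANAGER_BITS) - 1)
    header = prefix >> (CLASS_BITS + MANAGER_BITS)
    return header, manager, object_class

first = split_epc(0x4AA890001F62C160)
last = split_epc(0x4AA890001FA0B38E)
print(hex(first[0]), hex(first[1]))
print(hex(last[0]), hex(last[1]))
```

Both EPCs yield the same 40-bit prefix, so only their suffixes need to be represented as bit positions.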

Bitmap Operations
  • To use such a datatype in SQL, we need
    operations on bitmaps.
  • Conversion and counting operations: epc2Bmap,
    bmap2Epc and bmap2Count
  • Pairwise logical operations: bmapAnd, bmapOr,
    bmapMinus, and bmapXor
  • Maintenance operations: bmapInsert and bmapDelete
  • Membership testing operation: bmapExists
  • Comparison operation: bmapEqual
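
The semantics of these operations can be mimicked with Python integers used as bitmaps. This is a toy model, not the implementation from the bitmap-datatype proposal: the function names only echo the slide, and mapping bit k to the EPC whose suffix is suffix_start + k is an assumption.

```python
# Toy model: one bitmap per shared prefix; bit k stands for the EPC whose
# suffix is suffix_start + k. A Python int serves as the bitmap.

def epc2bmap(suffixes, suffix_start=0):
    bmap = 0
    for suffix in suffixes:
        bmap |= 1 << (suffix - suffix_start)
    return bmap

def bmap2epc(bmap, suffix_start=0):
    return [suffix_start + k for k in range(bmap.bit_length()) if (bmap >> k) & 1]

def bmap2count(bmap):
    return bin(bmap).count("1")

def bmap_and(a, b):
    return a & b

def bmap_or(a, b):
    return a | b

def bmap_minus(a, b):
    # Bits set in a but not in b.
    return a & ~b

def bmap_xor(a, b):
    return a ^ b

def bmap_exists(bmap, suffix, suffix_start=0):
    return bool((bmap >> (suffix - suffix_start)) & 1)

# Items added to a shelf between two inventory snapshots.
shelf_t1 = epc2bmap([3, 4, 7])
shelf_t2 = epc2bmap([3, 4, 7, 9, 12])
added = bmap2epc(bmap_minus(shelf_t2, shelf_t1))
print(added)
```

Set difference, intersection and counting each become a single pass over machine words instead of a join over item-level rows.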

Use of these operations in SQL
  • Items added to a given shelf between time t1 and
    t2
  • SELECT bmap2Epc(bmapMinus(s2.item_bmap,
    s1.item_bmap)) FROM Shelf_Inventory s1,
    Shelf_Inventory s2 WHERE s1.shelf_id = <sid1> AND
    s1.shelf_id = s2.shelf_id AND s1.time = <t1> AND
    s2.time = <t2>
  • Book store categorizes books in various
  • The following query determines the shelves where
    books with the properties 'Adventure' and 'Romance'
    are currently present in the store.
  • SELECT s.shelf_id FROM Shelf_Inventory s WHERE
    bmap2Count(bmapAnd( s.item_bmap, ( SELECT
    bmapAnd(p.Adventure, p.Romance) FROM
    Property_Inventory p ) ) ) > 0 AND

Road Ahead
  • Extension to bitmap proposal
  • Bitmap datatype is more appropriate for initial
    bulk-load and batch updates.
  • It performs badly for incremental updates.
  • A hybrid scheme for incremental updates
  • Maintain periodic inventory checkpoints using
  • For changes occurring between checkpoints,
    maintain a traditional item-level table.
  • Answer queries by merging the latest checkpoint
    bitmap with the corresponding duration's
    item-level data.
  • The epc_suffix in the collection may not be
  • The bitmap will be sparse: lots of zeros.
  • Compress this using some encoding scheme
  • Good for initial bulk loading and batch updates
  • May reduce efficiency of bitmap operations.
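
One possible reading of the hybrid scheme is sketched below in Python: the current inventory is the latest checkpoint bitmap with the item-level changes since that checkpoint replayed on top. The change-log format and the function name are assumptions for this sketch, not part of the proposal.

```python
def current_inventory(checkpoint_bmap, changes):
    """Merge a checkpoint bitmap with later item-level changes.

    `checkpoint_bmap` is an int whose bit k marks the item with suffix k.
    `changes` is a time-ordered list of (op, suffix) pairs recorded since
    the checkpoint, where op is "insert" or "delete". Both representations
    are assumptions made for this sketch.
    """
    bmap = checkpoint_bmap
    for op, suffix in changes:
        if op == "insert":
            bmap |= 1 << suffix
        elif op == "delete":
            bmap &= ~(1 << suffix)
        else:
            raise ValueError(f"unknown operation: {op}")
    return bmap

checkpoint = 0b10110  # items 1, 2 and 4 were present at the checkpoint
changes = [("insert", 0), ("delete", 2), ("insert", 7)]

inventory = current_inventory(checkpoint, changes)
items = [k for k in range(inventory.bit_length()) if (inventory >> k) & 1]
print(items)
```

The item-level table stays small because it is emptied at every checkpoint, while the bitmap is rewritten only in batch.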

Open Problems
  • Efficient methods for data mining problems
  • Trend analysis
  • Outlier detection
  • Path clustering
  • We will try exploring data mining applications to
    RFID data.

RFID Data Cleaning
Issues in Data Cleaning
  • Lack of Completeness
  • RFID readers capture only 60-70% of all tags that
    are in the vicinity
  • Smoothing of data is done to rectify the loss of
    intermediate messages
  • Temporal nature of data, or tag dynamics
  • RFID tags are in motion, and that is what makes
    them more difficult to handle
  • But motion of a tag causes dropping of messages
  • RFID data streams are very fast and are huge in
  • Hence filtering is important before sending them
    to the database

Current Strategies
  • Temporal Granule
  • Based on the fact that tag data do not differ
    much over a small time period
  • Data can be clubbed on a small time frame
  • Spatial Granule
  • Similarly, data from physically close readers are
    also homogeneous

Stages of ESP
  • Point: operates over a single value in a sensor
    stream, filtered by a predicate in the WHERE
  • Smooth: granularity defined by applications to
    correct for missed readings temporally (over one
    input only); uses an aggregate function over the
  • Merge: granularity specified by the application
    to correct for missed readings spatially; grouped
    by the specified spatial granule.

Stages of ESP (contd.)
  • Arbitrate: deals with conflicts between different
    spatial granules; groups by spatial granule
    first and then uses the HAVING construct to determine
    those conflicts
  • Virtualize: used for combining data streams from
    different sources, which could also be different
    devices; a join construct is used to combine the
    different data streams, which are then filtered using
    some predicate
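
ESP expresses its stages as declarative queries; the Python below only imitates the logic of Smooth, Merge and Arbitrate on a tiny stream, to show how a missed reading and a cross-read get corrected. The data layout, the count-based aggregate, and the "highest count wins" arbitration rule are assumptions for this sketch.

```python
from collections import Counter

def smooth(readings, window):
    """Smooth: per epoch t, count reads per (reader, tag) in (t - window, t]."""
    first = min(e for e, _, _ in readings)
    last = max(e for e, _, _ in readings)
    return {
        t: Counter((reader, tag) for e, reader, tag in readings if t - window < e <= t)
        for t in range(first, last + 1)
    }

def merge(smoothed, granule_of):
    """Merge: add up the counts of all readers in the same spatial granule."""
    merged = {}
    for t, counts in smoothed.items():
        per_granule = Counter()
        for (reader, tag), n in counts.items():
            per_granule[(granule_of[reader], tag)] += n
        merged[t] = per_granule
    return merged

def arbitrate(merged):
    """Arbitrate: assign each tag to the granule with the highest count."""
    result = {}
    for t, counts in merged.items():
        best = {}
        for (granule, tag), n in counts.items():
            if tag not in best or n > best[tag][1]:
                best[tag] = (granule, n)
        result[t] = {tag: granule for tag, (granule, _) in best.items()}
    return result

# Tag x sits on shelf A: r1 misses it at epoch 3, while r2 cross-reads it.
readings = [(1, "r1", "x"), (2, "r1", "x"), (3, "r2", "x"), (4, "r1", "x")]
granule_of = {"r1": "shelf A", "r2": "shelf B"}

located = arbitrate(merge(smooth(readings, window=3), granule_of))
print(located)
```

With a three-epoch window the tag is reported on shelf A at every epoch, including epoch 3 where the raw stream only contains the stray read from shelf B.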

Smooth stage
  • False positives (erroneous readings): reporting
    objects that are not actually present
  • False negatives (missed readings): not reporting
    objects that actually are present

False Positives and False Negatives [Jeff06]
Tag List
  • The reader has an internal table called the Tag
    List.
  • An epoch is the smallest unit of interaction
    between the reader and the middleware.
  • Every epoch consists of a certain number of
    interrogation cycles.
  • Interrogation Cycle is one run of the reader
    protocol to determine all tags
  • At every epoch the reader sends the tag list to
    the middleware.

SMURF: Per-tag Cleaning
  • SMURF uses statistical methods to reduce the
    false negatives and false positives occurring in
    the RFID stream.
  • The goal here is twofold: first, to determine
    the statistical window size, and second, to
    ensure that the transition of the tags is
  • To determine the window size, we need to fit a
    probability distribution to the sample size
  • And to detect the transition of the tag out of
    the reader's vicinity, we define a 98% confidence
    interval within that probability distribution
    function on the sample size $|S_i|$.

SMURF: Per-tag Cleaning (contd.)
  • Using the tag list, the per-epoch sampling
    probability $p_{i,t}$ is determined:
    $p_{i,t}$ = (number of times the tag was read in an epoch) /
    (interrogation cycles per epoch)
  • We average this over the sample $S_i$ to get
    the average read rate $p_i^{avg}$ for a tag $i$.
  • If the same probability $p_i^{avg}$ is assumed for each
    epoch throughout the window, then each successful
    observation is like a Bernoulli trial.

SMURF: Per-tag Cleaning (contd.)
  • So $|S_i|$ is a binomial random variable with mean
    $\mu = w_i p_i^{avg}$ and variance
    $\sigma^2 = w_i p_i^{avg} (1 - p_i^{avg})$.
  • Now using this we can express the window size as
    a limit,
  • If the current window size is less than the
    calculated one then the window size is adjusted
  • Similarly, using the Central Limit Theorem for
    transition detection, we get $\left| |S_i| - \mu \right| > 2\sigma$
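
The two per-tag checks can be sketched in Python as follows. The completeness bound used here, $w_i \ge \lceil \ln(1/\delta) / p_i^{avg} \rceil$, is the one given in the SMURF paper (Jeffery et al., VLDB 2006) and is not spelled out on the slide; the default $\delta = 0.05$, the function names, and the choice to jump straight to the required window size are assumptions of this sketch.

```python
import math

def average_read_rate(reads_per_epoch, cycles_per_epoch):
    """p_avg: mean of p_t = reads / interrogation cycles over the sample."""
    rates = [reads / cycles_per_epoch for reads in reads_per_epoch]
    return sum(rates) / len(rates)

def required_window(p_avg, delta=0.05):
    """Window (in epochs) that reads a present tag with probability > 1 - delta."""
    return math.ceil(math.log(1 / delta) / p_avg)

def transition_detected(num_reads, window, p_avg):
    """Two-sigma test on the binomial model of the number of reads."""
    mean = window * p_avg
    std = math.sqrt(window * p_avg * (1 - p_avg))
    return abs(num_reads - mean) > 2 * std

def adapt_window(window, num_reads, p_avg, delta=0.05):
    """One adaptation step: grow for completeness, halve on a transition."""
    needed = required_window(p_avg, delta)
    if window < needed:
        return needed  # simplification: jump straight to the required size
    if transition_detected(num_reads, window, p_avg):
        return max(1, window // 2)
    return window

p_avg = average_read_rate([5, 3, 4], cycles_per_epoch=10)
print(p_avg, required_window(p_avg))
```

For example, with a read rate of 0.5 a window of 3 epochs is too small and grows to 6, while a 20-epoch window that saw only 2 reads (10 expected) signals a transition and is halved.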

Normal Sliding window….
  • Epoch based mid-point sliding window
  • Emits a reading with an epoch value corresponding
    to the middle of the window

Ensuring Completeness
  • In the first window, $p_i^{avg}$ demands a larger
  • Thus window size is increased

Transition Detection
  • In the first window the number of readings
    decreases significantly (and statistically)
  • Thus a transition is likely to have occurred so
    window is halved

SMURF: Multi-tag Aggregate Cleaning
  • Similar to per-tag cleaning, the window for
    multi-tag cleaning is determined from $p^{avg}$,
    the average per-epoch sampling probability
    over all observed tags.
  • To detect a transition in the population count, we
    estimate the population count over two windows,
    $(t - w_i, t)$ and $(t - w_i/2, t)$, with true populations
    $N_w$ and $N_{w'}$.
  • Thus, for a transition to have happened, we need
    the difference between the two estimates to
    exceed the limit $2(\sigma_w + \sigma_{w'})$.

SMURF: Multi-tag Aggregate Cleaning (contd.)
  • To calculate the estimate of population count, we
    use π-estimators. The estimated population count
    is given by
  • Similarly, by π-estimators, and assuming
    independence across different tags, the variance
    of the estimate is estimated as
  • Here $\pi_i$ is the probability of reading tag $i$ at
    least once during the whole window, given by
    $\pi_i = 1 - (1 - p_i^{avg})^{w}$
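
A sketch of the aggregate estimate in Python is given below. It uses the standard π-estimator (Horvitz-Thompson) forms, $\hat{N} = \sum_i 1/\pi_i$ and $\hat{\sigma}^2 = \sum_i (1 - \pi_i)/\pi_i^2$ under independent reads; these forms are assumed here because the slide's own formulas are not given, and the function names and sample values are hypothetical.

```python
import math

def read_probability(p_avg, window):
    """pi_i: chance of reading a tag at least once in `window` epochs."""
    return 1 - (1 - p_avg) ** window

def estimate_population(p_avgs, window):
    """pi-estimator of the tag count and of its variance.

    `p_avgs` holds the average read rate of every tag that was observed
    at least once in the window.
    """
    pis = [read_probability(p, window) for p in p_avgs]
    count = sum(1 / pi for pi in pis)
    variance = sum((1 - pi) / pi ** 2 for pi in pis)
    return count, variance

def population_changed(p_avgs_full, p_avgs_half, window):
    """Compare the estimates over (t - w, t) and (t - w/2, t)."""
    n_full, var_full = estimate_population(p_avgs_full, window)
    n_half, var_half = estimate_population(p_avgs_half, window // 2)
    limit = 2 * (math.sqrt(var_full) + math.sqrt(var_half))
    return abs(n_full - n_half) > limit

# Four observed tags, each read with probability 0.5 in a one-epoch window:
# every observed tag stands for two tags in the population estimate.
count, variance = estimate_population([0.5] * 4, window=1)
print(count, variance)
```

If 100 tags are seen over the full window but only 10 over its second half, the two estimates differ by far more than the limit, which signals that tags have left.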

The Road ahead…
  • RFID applications do not tolerate delays in
    data delivery
  • Data is present either in the cache or in the
    database: data in the database increases
    processing time, and data in the cache cannot be
    queried with SQL-like queries
  • Anomaly detection is also an
    important part of object tracking
  • Issues like untraceability, forward security, and
    database desynchronization are still not
    completely resolved.
  • One more serious problem with RFID is
  • In the next stage we expect to look into some of
    these issues

Thank You.
  • Xiaolei Li, Hector Gonzalez, Jiawei Han and Diego
    Klabjan. Warehousing and analyzing massive RFID
    data sets. ICDE, 2006.
  • Fusheng Wang and Peiya Liu. Temporal management
    of RFID data. VLDB, 2005.
  • Timothy Chorma, Ying Hu, Seema Sundara and
    Jagannathan Srinivasan. Supporting RFID-based
    item tracking applications in Oracle DBMS using a
    bitmap datatype. VLDB, 2005.

  • Minos Garofalakis, Shawn R. Jeffery and Michael
    J. Franklin. Adaptive cleaning for RFID data
    streams. VLDB, 2006.
  • Michael J. Franklin, Wei Hong, Shawn R. Jeffery,
    Gustavo Alonso and Jennifer Widom. Declarative support
    for sensor data cleaning. Pervasive, 2006.
  • Sridhar Ramachandran, Sudarshan S. Chawathe,
    Venkat Krishnamurthy and Sanjay E. Sarma.
    Managing RFID data. VLDB, 2004.