Semantically Enhanced and Efficient Enforcement of Mobile Consumers Privacy Preferences

1
Semantically Enhanced and Efficient Enforcement
of Mobile Consumers Privacy Preferences
Nabil R. Adam, Mahmoud Youssef and Vijay Atluri
Center for Information Management, Integration and
Connectivity (CIMIC), Rutgers University
Presentation at SAP Research Labs, Palo Alto, 4/5/2005
2
Outline
  • Introduction
  • Research Problem
  • Part I: A Solution with Focus on Efficiency
  • Controlling Information Flow
  • Access Control Model
  • Evaluation and Enforcement Mechanism
  • Comments on the System Design
  • Performance Evaluation Study
  • Summary (Part I)
  • Part II: A Solution with Focus on Expressiveness
  • Another Privacy Criterion
  • Quick Look at Description Logics, OWL, and RDF
  • Preferences Ontology
  • Query Processing
  • Implementation
  • Performance Evaluation Study
  • Summary (Part II)

3
Scenario: Location-based Advertising
  • With the availability of positioning and tracking
    technology, it is possible to track different
    entities, e.g., vehicles, containers, and individuals.
  • The Location Service (LS) aggregates location
    information.
  • Location information is managed as a Moving Objects
    Database (MOD).
  • Merchants customize offers based on consumer
    profile and location.

4
The Moving Objects Problem
  • Moving objects need special data modeling due to:
  • The rate of updates
  • Too many customers send continuous updates.
  • Traditional databases are not designed for
    intensive updates.
  • The same problem exists in the RFID domain.
  • The type of queries
  • Queries usually need to address the future.
  • Queries submitted by the consumers are also
    moving.

[Figure: two R-tree configurations over objects 1-8, with bounding rectangles R1-R4 in panel (a) and R1-R3 in panel (b)]
The structure of the R-tree has to change
drastically due to the movement of the objects.
5
The Moving Objects Spatio-temporal Model (MOST)
  • In MOST (Sistla et al., 1997):
  • A database attribute that changes continuously
    is treated as a dynamic attribute.
  • This requires fewer updates.
  • Linear change is assumed.
  • Moving-object (MO) indexing schemes index objects
    in the projections, in the d-dimensional space, or
    in a transformed space.
  • We use projected trajectories in our computations.
  • How good is the linearity assumption?
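
The idea of a dynamic attribute can be illustrated with a minimal sketch. It assumes, as the slide does, that the value changes linearly between explicit updates; the class name DynamicAttribute and its fields are hypothetical and are not taken from the MOST paper.

```python
from dataclasses import dataclass


@dataclass
class DynamicAttribute:
    """A value assumed to change linearly since its last explicit update."""
    value_at_update: float  # value recorded at the last update
    update_time: float      # time of that update
    rate: float             # assumed constant rate of change

    def value_at(self, t: float) -> float:
        # Linear extrapolation; no database update is needed while
        # the object keeps moving at the recorded rate.
        return self.value_at_update + self.rate * (t - self.update_time)


# A moving object described by two dynamic attributes (x and y position).
x = DynamicAttribute(value_at_update=10.0, update_time=0.0, rate=2.0)
y = DynamicAttribute(value_at_update=5.0, update_time=0.0, rate=-1.0)
position_at_3 = (x.value_at(3.0), y.value_at(3.0))
```

An update is only required when the rate changes, which is why the linearity assumption reduces the update load; how well it holds for real consumer motion is the open question raised above.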

[Figure: projected trajectory of object O1 on axes X, Y, and T (time), with tnow marked]
6
The Tradeoff Between Personalization and Privacy
  • Personalization involves collecting profile and
    location information, which raises privacy concerns.
  • Studies by Chellappa et al., Harn et al., and
    Spiekermann et al. show that:
  • Consumers do not opt in to online services when
    they do not trust merchants with their profiles.
  • Consumers are willing to trade their information
    for convenience with trusted vendors,
  • even among the privacy-concerned consumers.

7
The Tradeoff Between Personalization and Privacy
(Cont'd)
  • An analysis of over 120 surveys (Westin, 2002) shows:
  • The attitude of 3/4 of American consumers towards
    privacy has changed from a modest to a
    high-intensity matter.
  • There are three segments of consumers:
    fundamentalists, unconcerned, and pragmatic.
  • The size of the pragmatic group is 125 million.
  • The challenge to businesses is to address the
    needs of this group:
  • Provide convenience
  • Protect privacy

8
One Trusted Third Party
The Problem: The consumer has to trust too many
merchants with her profile.
A Proposed Solution: The consumer trusts only
one third party.
It stands to reason that the LS assumes that
role.
9
Basic Approaches to Privacy Protection
  • Device-based approach (e.g., Schilit et al.,
    2003)
  • Advantage: the consumer does not have to trust anyone.
  • Limitations:
  • The consumer receives all the messages. Who is being
    charged for transmission?
  • Too much load on the network.
  • Requires powerful devices.
  • Trusted third party
  • Anonymity approaches: the Anonymizer project
  • Do not support identity-based analysis (e.g.,
    purchase history).
  • Consumers still have to trust the anonymizer.
  • Our proposed approach: access control with
    controlled information flow

10
The Environment: The Players
  • The players in this environment are:
  • The LS: maintains consumer information, enforces
    their privacy policies, and provides answers to
    queries
  • Information requester: a merchant or a marketing
    intermediary
  • Location information providers: e.g., the
    wireless networks
  • Information owners: the consumers

11
The Research Problem: Policy Requirements
  • Preventing unauthorized sharing of consumer
    information among information requesters
  • Consider the spam problem
  • Preventing misuse of permitted access to consumer
    information
  • If access policies are based on merchant
    identity, merchants can violate consumer
    preferences in terms of time and location.
  • Access policies need to have spatio-temporal
    constraints

12
The Research Problem: User Interface Requirements
  • Consumers need a user-friendly approach to
    defining policy rules.
  • Access rules should be definable at different
    granularities.
  • However, such a representation creates
    granularity conflicts.
  • Example:
  • R1: (Hilton, c1_info, read, -), <EssexCounty,
    all_time>
  • R2: (Hotels, c1_info, read, +), <NJ, week_days>

13
The Research Problem: Policy Enforcement
Requirements
  • Two capabilities are required:
  • Addressing the impact of consumer motion and its
    interaction with the spatio-temporal constraints.
  • Spatio-temporal conflict: the location query may
    intersect with the spatio-temporal constraints of
    more than one access rule, e.g.,
  • during the time interval of the query, the
    customer will pass through two locations (Hudson
    County and NYC) for which she has different
    permissions.
  • Translating between geospatial coordinates, as
    expressed in the MOD, and civil names, as
    expressed in the constraints, e.g.,
  • MOD: current location of customer C1 is (74.32145,
    40.75321)
  • Access rule: (Hotels, No Access), <New York City,
    All times>

14
The Research Problem: Scalability and Efficiency
Requirements
  • The system has to accommodate growth in the
    number of consumers and merchants
  • without adversely impacting the overall performance
    of query processing.

15
Summary of The Challenges
  • How to prevent the illegal sharing of consumer
    information?
  • How to efficiently resolve spatio-temporal and
    granularity conflicts?
  • How to efficiently compute the interaction among
    the spatio-temporal constraints and the location
    information?
  • How to translate between geospatial coordinates
    and civil names?

16
Part I: A Solution with Focus on Efficiency
17
Overview of the Proposed Solutions
  • 1. Control information flow to merchants
  • 2. Develop an access control model that allows:
  • Specification of spatio-temporal policies
  • Example: merchant Hilton has access to my
    information when I am outside New Jersey on
    weekdays.
  • Representation of merchants, location, and time
    at different levels of granularity.
  • 3. Efficient enforcement of access control
  • Turn the problem into a string search problem

18
1. Controlling Information Flow
  • Solution:
  • Merchants send the information related to a specific
    offer, along with a query, to the LS.
  • The LS runs the query, producing a list of the IDs
    of consumers who satisfy the merchant's criteria.
  • The LS enforces the access control, which filters
    the IDs.
  • The filtered IDs are then forwarded with the
    advertisement to the wireless networks for
    delivery to the consumers.
  • The wireless network sends the offers to the
    consumer devices and reports back to the LS.
  • The LS then sends pseudonyms to the merchants.
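
The flow above can be sketched as a single function on the LS side. This is only an illustration under stated assumptions: the function process_offer, its parameters, and the use of random tokens as pseudonyms are all hypothetical, and the slides do not say how pseudonyms are generated.

```python
import secrets


def process_offer(offer, query, consumers, is_permitted, deliver):
    """Hypothetical LS pipeline: query, filter, deliver, answer with pseudonyms."""
    # Step 1: run the merchant's query over the consumer data held by the LS.
    matching_ids = [cid for cid, profile in consumers.items() if query(profile)]
    # Step 2: enforce access control; only permitted IDs survive.
    permitted_ids = [cid for cid in matching_ids
                     if is_permitted(cid, offer["merchant"])]
    # Step 3: hand the offer and filtered IDs to the wireless network,
    # which reports back the IDs it delivered to.
    delivered_ids = deliver(offer, permitted_ids)
    # Step 4: the merchant receives one fresh pseudonym per delivery,
    # never a real consumer ID (random tokens are an assumption here).
    return [secrets.token_hex(8) for _ in delivered_ids]


consumers = {
    "c1": {"interest": "lodging"},
    "c2": {"interest": "lodging"},
    "c3": {"interest": "books"},
}
sent_to = []


def deliver(offer, ids):
    # Stand-in for the wireless network.
    sent_to.extend(ids)
    return list(ids)


pseudonyms = process_offer(
    {"merchant": "Hilton", "text": "weekend discount"},
    lambda profile: profile["interest"] == "lodging",
    consumers,
    lambda cid, merchant: cid != "c2",  # toy policy: c2 denies this merchant
    deliver,
)
```

The point of the design is visible in the return value: the merchant learns how many consumers were reached, but the real IDs never leave the LS and the wireless network.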

19
2. The Proposed Access Control Model
  • An access rule consists of an authorization
    triple and a constraint:
  • (s, o, +/-), <stc>
  • where
  • s ∈ S is a subject, i.e., a merchant at some
    granularity.
  • o ∈ O is an object, i.e., a consumer ID × {l, p},
    where l and p are location and profile information.
  • +/- is a flag, i.e., grant/deny.
  • stc is a spatio-temporal constraint consisting
    of a civil location and a time interval.
  • Spatial and temporal constraints are generalized
    to stc.
  • The only access mode is read,
  • so there is no need to represent it in the model.
  • Generic access rule:
  • (s, ID × lp, +/-), <stc>
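
As a rough illustration, an access rule of this shape could be held in a small immutable record. The class AccessRule, its field names, and the textual rendering (including the "." between the consumer ID and the information type) are assumptions made for this sketch, not the authors' representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRule:
    subject: str      # merchant, or a merchant category at a coarser granularity
    consumer_id: str  # the information owner
    info: str         # "l" (location), "p" (profile), or "lp" (both)
    grant: bool       # True for grant, False for deny
    location: str     # civil location part of the constraint
    time: str         # time interval part of the constraint

    def __str__(self) -> str:
        flag = "+" if self.grant else "-"
        return (f"({self.subject}, {self.consumer_id}.{self.info}, {flag}), "
                f"<{self.location}, {self.time}>")


# Hypothetical rule: Hilton is denied consumer c1's location and profile
# information whenever c1 is in Essex County.
r1 = AccessRule("Hilton", "c1", "lp", False, "EssexCounty", "all_time")
rendered = str(r1)
```

No access mode field is needed because read is the only mode in the model.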

20
2.1 Model Components Representation
  • All components are represented as hierarchies
    (except the ID and the flag).
  • These hierarchies hold several properties:
  • At every level in a hierarchy, the nodes are an
    exact decomposition of their parents (i.e., the
    parent is the union of its children and the
    children are disjoint). Thus:
  • the root always represents All Members;
  • the leaves are the members at their most specific
    representation.
  • No multiple inheritance.

[Figure: Subject Hierarchy]
21
2.2 Order of Hierarchies and Precedence
  • We adopt the following order:
  • ID, Object, Subject, Location, Time, Flag
  • The order among hierarchies implies precedence.
  • Precedence has no impact on the model's behavior as
    long as the same order is followed in the
    specification and evaluation of access rules.
  • However, it affects the notion of relative
    specificity, as we will see later.

22
2.3 The System State
  • The system state includes partial instantiations
    of the hierarchies.
  • Each instance of a hierarchy includes only the
    nodes that have access rules defined on them.
  • The instances belonging to the same consumer can
    be seen as a tree

23
2.4 Conflict Resolution
  • For spatio-temporal conflicts: denial precedes
    grants,
  • i.e., being conservative.
  • For granularity conflicts: inheritance with
    overriding.
  • Nodes not in the instance are
  • assumed to virtually exist, and
  • inherit permissions from the nearest existing
    ancestor.
  • Nodes in the instance:
  • More specific rules override less specific ones.
  • The semantics must be conveyed to the consumer.
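
Both resolution policies can be sketched in a few lines. The sketch assumes a single subject hierarchy stored as a child-to-parent map; the names PARENT, effective_flag, and combine are hypothetical, and treating "no rule found" as a denial is an assumption, since the slides do not state a default.

```python
# Hypothetical fragment of the subject hierarchy (child -> parent).
PARENT = {"Hilton": "Hotels", "Marriott": "Hotels", "Hotels": "AllMerchants"}


def effective_flag(node, rules, parent=PARENT):
    """Inheritance with overriding: the nearest node with a rule wins."""
    while node is not None:
        if node in rules:
            return rules[node]   # a more specific rule overrides ancestors
        node = parent.get(node)  # otherwise inherit from the next ancestor
    return None                  # no rule anywhere on the path to the root


def combine(flags):
    """Denial precedes grants; a missing flag is treated as a denial."""
    flags = list(flags)
    return bool(flags) and all(flag is True for flag in flags)


# Hotels in general are granted, but Hilton is explicitly denied.
rules = {"Hotels": True, "Hilton": False}
hilton = effective_flag("Hilton", rules)      # explicit, more specific rule
marriott = effective_flag("Marriott", rules)  # virtual node, inherits from Hotels
```

Here Marriott has no rule of its own, so it behaves as a virtual node and picks up the grant defined on Hotels, while Hilton's own denial overrides that grant.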

24
2.4.1 Relative Specificity Among Rules
  • For two rules R1 and R2 for the same consumer, R1
    is more specific than R2 if:
  • R1 has a more specific object than R2, i.e., R2
    has lp and R1 has l or p; or
  • R1 and R2 have the same object AND R1 has a more
    specific subject; or
  • R1 and R2 have the same object and subject AND R1
    has a more specific location; or
  • R1 and R2 have the same object, subject, and
    location AND R1 has a more specific time.
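
The definition amounts to a comparison in precedence order: the first component on which the two rules differ decides. A minimal sketch follows, assuming rules are tuples ordered as (object, subject, location, time) and that "more specific" means "proper descendant in the hierarchy"; the PARENT map and function names are hypothetical.

```python
# Hypothetical child -> parent map covering all four hierarchies.
PARENT = {
    "l": "lp", "p": "lp",
    "Hilton": "Hotels", "Hotels": "AllMerchants",
    "EssexCounty": "NJ", "NJ": "USA",
    "week_days": "all_time",
}


def is_descendant(node, ancestor, parent=PARENT):
    """True if node lies strictly below ancestor in its hierarchy."""
    node = parent.get(node)
    while node is not None:
        if node == ancestor:
            return True
        node = parent.get(node)
    return False


def more_specific(r1, r2, parent=PARENT):
    """Compare (object, subject, location, time) tuples in precedence order."""
    for c1, c2 in zip(r1, r2):
        if c1 != c2:
            # The first differing component decides the outcome.
            return is_descendant(c1, c2, parent)
    return False  # identical rules: neither is more specific


r_hilton = ("lp", "Hilton", "EssexCounty", "all_time")
r_hotels = ("lp", "Hotels", "NJ", "week_days")
r_location_only = ("l", "Hotels", "NJ", "week_days")
```

Because the object is compared first, r_location_only is more specific than r_hilton even though its subject is coarser, which is how the order of hierarchies influences relative specificity.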

25
2.5 Advantages of the Model
  • Support for more efficient search:
  • Overriding motivates starting the search from
    the most specific representation.
  • We exploit that by adaptively searching for the
    most specific rule that matches a search key.
  • The system size is kept small,
  • since instances are partial instantiations.
  • Component representation is granular:
  • this streamlines the user interface, and
  • provides support for aggregate queries.

26
3. Evaluation of Access Control
  • Evaluation involves:
  • Composing search keys
  • Matching them against the access rules.
  • Each consumer in the query result can generate
    multiple search keys,
  • based on the intersection between the query and
    the consumer's motion line.
  • Granular representation is another source of
    search keys.
  • Definition: a spatio-temporal window is a
    combination of a time leaf and a location leaf.

27
3.1 The Evaluation Procedure
  • The evaluation proceeds as follows:
  • For each consumer and for each spatio-temporal
    window that the consumer passes through, a search
    key is created.
  • For each of the created keys, an adaptive search
    operation is performed and a flag is retrieved.
  • The flags that belong to the same consumer are
    combined using the denial-precedes-grants rule.
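
The three steps can be sketched as a loop over consumers and windows. In this sketch a plain dictionary lookup stands in for the adaptive search, and the function evaluate, the key layout, and the default-deny behavior are assumptions for illustration only.

```python
def evaluate(windows_by_consumer, merchant, lookup):
    """Return the consumers the merchant may reach for this query."""
    permitted = []
    for consumer_id, windows in windows_by_consumer.items():
        # One search key per spatio-temporal window the consumer passes through.
        flags = [lookup((consumer_id, merchant, location, time))
                 for location, time in windows]
        # Denial precedes grants: a single denied window blocks the consumer.
        if flags and all(flags):
            permitted.append(consumer_id)
    return permitted


# Toy rule table; a real system would search encoded rules adaptively.
RULES = {
    ("c1", "Hotels", "HudsonCounty", "weekday"): True,
    ("c1", "Hotels", "NYC", "weekday"): False,
    ("c2", "Hotels", "HudsonCounty", "weekday"): True,
}
windows = {
    "c1": [("HudsonCounty", "weekday"), ("NYC", "weekday")],  # crosses into NYC
    "c2": [("HudsonCounty", "weekday")],
}
result = evaluate(windows, "Hotels", lambda key: RULES.get(key, False))
```

Consumer c1 is excluded because one of the windows on her motion line is denied, which mirrors the Hudson County and NYC example of a spatio-temporal conflict.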

Check YAA05 for detailed computations
28
3.2 Components of the Evaluation and Enforcement
Mechanism
  • A spatio-temporal module:
  • Provides computations for the interaction between
    moving objects and consumer location information.
  • Translates geospatial coordinates to civil names.
  • Built on top of Oracle Spatial using the Oracle
    precompiler (Pro*C/C++) and PL/SQL.
  • An encoder:
  • Encodes both access rules and search keys into
    equal-length alphabetical strings.
  • The ASM-trie (the Adaptive Search Multiway trie):
  • Performs the adaptive search on specially encoded
    strings.

29
3.2.1 The Encoder
  • In an access rule (or search key), each hierarchy
    substring is drawn from a table that encodes that
    hierarchy.
  • Depending on the maximum cardinality of children,
    one or more letters are used for each level, e.g.,
    one letter for region and two letters for state.
  • There is no one-to-one relationship between nodes
    in an access rule and the nodes in the ASM-trie.
  • Adaptive search is not just backtracking.

30
The Encoder's Support for the Adaptive Search
  • The letter "a" is used as padding to give equal
    length to all substrings.
  • This way, it also represents the parent node in
    the access rule.
  • The letter "a" is never used in encoding a child.
  • The ID substring is encoded in uppercase to
    indicate that adaptive search is not supported.
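
A minimal sketch of the padding idea for a single hierarchy follows. It assumes a two-level location hierarchy with one letter for the region and two for the state, as in the example on the previous slide; the functions encode and generalize, and the specific letter codes, are hypothetical. The uppercase ID substring is left out.

```python
PAD = "a"  # padding letter; never used inside a child's code


def encode(level_codes, widths):
    """Concatenate per-level codes, padding the unspecified deeper levels."""
    parts = []
    for depth, width in enumerate(widths):
        if depth < len(level_codes):
            code = level_codes[depth]
            if len(code) != width or PAD in code:
                raise ValueError(f"bad code {code!r} at level {depth}")
            parts.append(code)
        else:
            parts.append(PAD * width)  # padding stands for "the parent itself"
    return "".join(parts)


def generalize(encoded, widths):
    """Replace the deepest specified level with padding (one step up)."""
    end = len(encoded)
    for width in reversed(widths):
        start = end - width
        if encoded[start:end] != PAD * width:
            return encoded[:start] + PAD * width + encoded[end:]
        end = start
    return encoded  # all padding: already the root


WIDTHS = (1, 2)                          # region: 1 letter, state: 2 letters
state_key = encode(["c", "bd"], WIDTHS)  # a state inside region "c"
region_key = encode(["c"], WIDTHS)       # the region itself
```

Since every string has the same length, moving from a node to its parent is a fixed-position substitution, which is what makes a string search structure a natural fit for enforcement.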

31
3.2.2 The ASM-trie
  • The ASM-trie is a main-memory structure that
    supports adaptive search.
  • In the ASM-trie, each node includes:
  • 27 pointers-to-node to represent the alphabet and
    a null character,
  • with letters implied by their order (radix) (e.g.,
    0 = null, 1 = a, 2 = b, ...),
  • a pointer to its parent for adaptive search,
  • a pointer to the previous letter for backward
    traversal, and
  • a Boolean variable to indicate whether adaptive
    search is supported at this level.
  • For the Insert and Search algorithms, see
    YAA05.
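
The following is a deliberately simplified stand-in, not the ASM-trie from YAA05. It keeps the 27-slot radix layout described above but, instead of following parent and previous-letter pointers, it simply restarts from the root with a generalized key after each miss. It handles one lowercase hierarchy only, and all names are hypothetical.

```python
PAD = "a"


class Node:
    def __init__(self):
        self.children = [None] * 27  # slot 0 = null, 1 = 'a', ..., 26 = 'z'
        self.flag = None             # grant/deny if a rule ends at this node


def slot(ch):
    return ord(ch) - ord("a") + 1


class Trie:
    def __init__(self):
        self.root = Node()

    def insert(self, key, flag):
        node = self.root
        for ch in key:
            i = slot(ch)
            if node.children[i] is None:
                node.children[i] = Node()
            node = node.children[i]
        node.flag = flag

    def exact(self, key):
        node = self.root
        for ch in key:
            node = node.children[slot(ch)]
            if node is None:
                return None
        return node.flag


def generalize(key, widths):
    """Replace the deepest specified level with padding (one step up)."""
    end = len(key)
    for width in reversed(widths):
        start = end - width
        if key[start:end] != PAD * width:
            return key[:start] + PAD * width + key[end:]
        end = start
    return key


def adaptive_search(trie, key, widths):
    """Return the flag of the most specific stored rule covering key."""
    while True:
        flag = trie.exact(key)
        if flag is not None:
            return flag
        parent_key = generalize(key, widths)
        if parent_key == key:  # reached the root without finding a rule
            return None
        key = parent_key


WIDTHS = (1, 2)
trie = Trie()
trie.insert("caa", True)   # region "c": grant
trie.insert("cbd", False)  # state "bd" in region "c": deny
```

The restart-from-root strategy costs more per miss than the pointer-based traversal the slides describe, but it shows the intended outcome: a key with no rule of its own resolves to the nearest ancestor that has one.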

32
Performance Evaluation Study
  • ASM-trie vs. main-memory trie with linear scan
    vs. Oracle linear scan.
  • Machine: Xeon 2.4 GHz with 2 GB RAM.
  • 100 search key sets and 30 data replicas.
  • The ASM-trie had a constant search time, around
    32,000 keys/sec.
  • The ASM-trie exhibited linear space utilization,
    around 1,200 access rules per MB.
  • The difference between the ASM-trie and the
    regular trie can be attributed to the adaptive
    search.

[Figure legend: ASM-trie, main-memory linear scan, Oracle linear scan]
33
Comments on the Design
  • The choice of a memory-resident approach:
  • The limit on main memory size should not affect
    the scalability of the LS, for several reasons:
  • The LS is implemented as a distributed system
    where every node is responsible for a specific
    service area.
  • 64-bit processors are becoming commonplace.
  • New directions in implementing large-scale
    services, e.g., Google,
  • rely on multiple cheap servers, with
  • all the data indexed in memory.
  • This year sees a first conference on data
    management on new hardware.

34
Summary (Part I)
  • Contribution
  • An access control model for moving objects and
    consumer profiles that supports granular
    representation.
  • An efficient enforcement mechanism that utilizes
    a new data structure, the ASM-trie.
  • A design of information flow that prevents
    merchants from sharing consumer information.
  • Future work
  • Disk-based ASM-trie

35
Part II: A Solution with Focus on Expressiveness
36
Another Criterion for Privacy
  • Why do customers accept receiving advertisements?
    Because the offers:
  • 1. Provide the convenience of being timely and
    location-based.
  • 2. Are related to their interests.
  • 3. Have incentives.
  • The current privacy policy considers 1 and 2, but
    not 3.
  • Can we add incentives to the privacy criteria?
  • Yes, but this type of domain is difficult to
    model with data structures like the
    hierarchies in Part I.
  • In general, it is difficult to model exceptions
    in such hierarchies.
  • Consider NYC (a city that is composed of five
    counties). It violates the hierarchy's structural
    properties.

37
KR Techniques
  • Knowledge Representation (KR) techniques:
  • Modeling approaches based on KR techniques are
    more expressive.
  • KR techniques can be broadly classified into
    logic-based and non-logic-based.
  • Description Logics (DLs) are a class of
    logic-based KR formalisms that has recently been
    used as the basis for designing the Web Ontology
    Language (OWL).
  • We propose a solution based on
  • modeling incentives and the other preferences as
    an ontology, and
  • enforcing these preferences using DL reasoning
    techniques.

38
Overview
  • 1. A brief overview of DLs
  • 2. Preferences Ontology
  • 3. Query Processing
  • 4. Implementation
  • 5. Performance Evaluation Study

39
1. DLs: A Brief Overview
  • The basic building block of KR in DLs is
    the concept, defined as a set of individuals.
  • Concepts and the IS-A relationship are used to
    build hierarchical terminologies (taxonomies).
  • Terminologies are the intensional knowledge.
  • The extensional knowledge comes from assertions
    about individuals.
  • In addition to IS-A, DLs can represent other
    types of relationships, called roles.

40
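1.1 A Minimal DL Language and its Interpretation
DLs have a well-defined model-theoretic
interpretation. The following is the
interpretation of the AL language, where \mathcal{I} is an
interpretation with domain \Delta^{\mathcal{I}}:
\[
\begin{aligned}
\top^{\mathcal{I}} &= \Delta^{\mathcal{I}} && \text{(the universal concept, thing)} \\
\bot^{\mathcal{I}} &= \emptyset && \text{(bottom concept, nothing)} \\
(\neg A)^{\mathcal{I}} &= \Delta^{\mathcal{I}} \setminus A^{\mathcal{I}} && \text{(atomic negation)} \\
(C \sqcap D)^{\mathcal{I}} &= C^{\mathcal{I}} \cap D^{\mathcal{I}} && \text{(conjunction)} \\
(C \sqcup D)^{\mathcal{I}} &= C^{\mathcal{I}} \cup D^{\mathcal{I}} && \text{(disjunction)} \\
(\forall R.C)^{\mathcal{I}} &= \{\, a \in \Delta^{\mathcal{I}} \mid \forall b.\ (a, b) \in R^{\mathcal{I}} \rightarrow b \in C^{\mathcal{I}} \,\} && \text{(value restriction)} \\
(\exists R.\top)^{\mathcal{I}} &= \{\, a \in \Delta^{\mathcal{I}} \mid \exists b.\ (a, b) \in R^{\mathcal{I}} \,\} && \text{(limited existential quantification)}
\end{aligned}
\]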
41
1.2 SHIQ(D) and OWL
  • SHIQ(D) is equivalent to AL plus full concept
    negation, transitive roles, qualified cardinality
    restrictions, role hierarchies, inverse roles,
    and datatypes.
  • SHIQ(D) has a good balance between
    expressiveness and computational efficiency
    (computability and decidability)
  • SHIQ(D) is almost equivalent to OWL
  • For an excellent reference on DLs, see the
    Description Logic Handbook (DLHB).

42
1.3 Important Features of DLs
  • Two types of terminological axioms: inclusion
    (of the form C ⊑ D) and equality (of the form C ≡ D).
  • A definition is an equality with an atomic
    left-hand side.
  • A finite set of definitions T is called a
    terminology or TBox.
  • A finite set of assertions about individuals is
    called an ABox.
  • The open-world semantics
  • The unique name assumption

43
1.4 Reasoning in DLs
  • Assume a knowledge base K, concepts C and D,
    and an individual a.
  • TBox reasoning includes:
  • Class subsumption queries: determine whether C is
    a subclass of D with respect to K.
  • Class hierarchy queries: given a class C, return
    all or the most specific (most general)
    superclasses (subclasses) of C in K.
  • Class satisfiability queries: given a class C,
    determine whether C is satisfiable (consistent)
    with respect to K.
  • ABox reasoning includes:
  • Ground: determine whether a given individual a is
    an instance of C.
  • Open: determine all the individuals in K that are
    instances of C.
  • All-classes: given an individual a, determine all
    the classes in K that a is an instance of.
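
To make the query types concrete, here is a toy sketch over told (explicitly stated) subclass axioms only. It is far weaker than a DL reasoner: there are no roles, no negation, and no satisfiability checking, and a False result only means "not entailed by the stated axioms". The dictionaries and function names are hypothetical; the class names reuse incentive types that appear later in the talk.

```python
def superclasses(c, tbox):
    """All classes reachable from c through told subclass axioms."""
    seen, stack = set(), [c]
    while stack:
        for parent in tbox.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_subclass(c, d, tbox):
    """Class subsumption query: is C a subclass of D?"""
    return c == d or d in superclasses(c, tbox)


def is_instance(a, c, tbox, abox):
    """Ground query: is individual a an instance of C?"""
    return any(is_subclass(k, c, tbox) for k in abox.get(a, ()))


def instances_of(c, tbox, abox):
    """Open query: all individuals that are instances of C."""
    return {a for a in abox if is_instance(a, c, tbox, abox)}


def classes_of(a, tbox, abox):
    """All-classes query: every class that individual a belongs to."""
    result = set()
    for k in abox.get(a, ()):
        result |= {k} | superclasses(k, tbox)
    return result


TBOX = {  # class -> directly stated superclasses
    "InstantRefund": ["Monetary"],
    "Monetary": ["IncentiveType"],
    "Coupon": ["IncentiveType"],
}
ABOX = {"offer1": ["InstantRefund"], "offer2": ["Coupon"]}  # hypothetical individuals
```

A real reasoner answers the same questions for defined classes with role restrictions, which is what the preference ontology in the following slides relies on.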

44
2. Preferences Ontology
  • The ontology includes six taxonomies:
  • IncentiveType,
  • IncentiveValue,
  • Location,
  • Time,
  • Products,
  • Merchants.
  • Both consumer preferences and merchant queries
    are subsumed by a class called CPP (Consumer
    Privacy Preferences).

45
Modeling Promotions
  • Promotion Techniques
  • Price reduction,
  • Happy hour (i.e., price reduction for a short
    time),
  • No payment for a specific period,
  • Payment in installments,
  • More items for free,
  • Bundle (which could be homogeneous: a reduced
    price for the second item, or heterogeneous:
    another product at a reduced price),
  • Premium (i.e., a free non-related product or
    service, e.g., free miles),
  • Prize,
  • Contest (i.e., based on a skill),
  • Sweepstakes (i.e., based on chance), and
  • Rebates or refund (i.e., cash refund, coupon
    refund, or escalating refund).
  • We analyzed these techniques and found that a
    promotion includes:
  • Incentive type → IncentiveType taxonomy
  • Incentive value → IncentiveValue taxonomy
  • Conditions → property restrictions on the
    IncentiveType taxonomy

46
2.1 Incentive Type Taxonomy
IncentiveType ⊑ ⊤
Monetary ⊑ IncentiveType
Coupon ⊑ IncentiveType
TimeSlack ⊑ IncentiveType
ExtraItems ⊑ IncentiveType
PayOnInstallments ⊑ IncentiveType
InstantRefund ⊑ Monetary
DelayedRefund ⊑ Monetary
  • For each of these subclasses, an object property
    is defined.
  • Promotion conditions are expressed as property
    restrictions on the class IncentiveType and its
    subclasses.
  • Example: product condition
  • Property: hasProduct
  • Range: Products_Services
  • Restriction: allValuesFrom AllProduct and with
    cardinality  1.

47
2.2 Incentive Value Taxonomy
  • The IncentiveValue taxonomy includes five
    subclasses:
  • PercentageReduction,
  • ScalarReduction,
  • Price,
  • TimeSlack, and
  • NumberOfInstallments
  • The taxonomy also includes five datatype
    properties: hasPercentageValue, hasScalarValue,
    etc.
  • The range of these properties is the XML Schema
    integer datatype.
  • Example:
  • ≥20 IncentiveValue.hasPercentageValue

48
2.3 Location Taxonomy
  • The main class in the location taxonomy is
    AllLocations, whose semantics is the set of
    all cities.
  • Since we are using class subsumption for
    reasoning, we represented cities as primitive
    classes instead of individuals.
  • Example

49
2.4 Products and Services Taxonomy
  • We used the United Nations Standard Products and
    Services Code (UNSPSC).
  • UNSPSC provides a five-level taxonomy: Segment,
    Family, Class, Commodity, and Business Function.
  • Imported from XML into OWL.

50
2.5 Time Taxonomy
  • The main class in the time taxonomy is AllTimes,
    whose semantics is the set of all hours (in
    one year).
  • You can express things like Labor Day even though
    it does not have a specific date.

51
2.6 Merchants Taxonomy
  • The merchant taxonomy is similar to the industry
    hierarchy in Part I.
  • Compatible with the Census Bureau's
    classification of industries.
  • Related to the Products and Services taxonomy
    through two properties: hasProduct and hasService.

52
2.7 Consumer Preferences
  • ConsPref ≡ CPP ⊓
    (∃hasLocation.L1 ⊓
    ∃hasTime.T1 ⊓
    ∃hasProduct.P1 ⊓
    ∃hasMerchant.M1 ⊓
    ∃hasIncentiveType.IT1 ⊓
    ∃hasIncentiveValue.IV1)

Naming: in ConsPref_12345_1, 12345 is the consumer ID and 1 is the preference number.
  • ConsPref_12345_1 ≡ CPP ⊓
    (∃hasLocation.USA_CA_Cities ⊓
    ∃hasTime.WeekEnd ⊓
    ∃hasService.Lodging ⊓
    ∃hasMerchant.Hotels ⊓
    ∃hasIncentiveType.Percentage ⊓
    ∃ ≥20 hasIncentiveValue)

53
2.8 Merchant Queries
  • MerchQuery ≡ CPP ⊓
    (∃hasLocation.L2 ⊓
    ∃hasTime.T2 ⊓
    ∃hasProduct.P2 ⊓
    ∃hasMerchant.M2 ⊓
    ∃hasIncentiveType.IT2 ⊓
    ∃hasIncentiveValue.IV2)

Naming: in MerchQuery_Hilton_1, Hilton is the merchant ID and 1 is the query number.
  • MerchQuery_Hilton_1 ≡ CPP ⊓
    (∃hasLocation.SanJose ⊓
    ∃hasTime.Saturday ⊓
    ∃hasService.Lodging ⊓
    ∃hasMerchant.Hilton ⊓
    ∃hasIncentiveType.Percentage ⊓
    ∃ ≥25 hasIncentiveValue)

54
3. Query Processing
  • Queries are processed by computing the
    intersection between ConsPref_CID_n (P) and
    MerchQuery_MID_m (Q):
  • Exact match: P ≡ Q → Permission Grant
  • Disjoint: P ⊓ Q ≡ ⊥ → Permission Deny
  • Subsumes: Q ⊑ P → Permission Grant
  • Subsumed: P ⊑ Q → Permission is undetermined
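
The four outcomes can be illustrated with a set-based stand-in for DL subsumption. The sketch assumes each property takes exactly one value, so a preference or query behaves like a product of value sets and subsumption reduces to per-property set inclusion. The function classify and the data are hypothetical, and returning "undetermined" for a partial overlap is an assumption, since the slide does not cover that case.

```python
def classify(p, q):
    """Compare preference P and query Q, each a dict: property -> set of leaves."""
    if p == q:
        return "grant"          # exact match
    if any(not (p[k] & q[k]) for k in p):
        return "deny"           # disjoint on some property
    if all(q[k] <= p[k] for k in p):
        return "grant"          # P subsumes Q
    # P subsumed by Q is undetermined; partial overlap is treated the same way
    # here, which is an assumption of this sketch.
    return "undetermined"


preference = {
    "location": {"SanJose", "LosAngeles"},
    "time": {"Saturday", "Sunday"},
}
narrow_query = {"location": {"SanJose"}, "time": {"Saturday"}}
far_query = {"location": {"Newark"}, "time": {"Saturday"}}
broad_query = {
    "location": {"SanJose", "LosAngeles", "Fresno"},
    "time": {"Saturday", "Sunday"},
}
```

The narrow query falls entirely inside what the consumer allowed, so it is granted; the broad query would also reach situations the consumer did not approve, so it cannot be decided from subsumption alone.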

55
3.1 Query Processing Algorithm
56
4.1 Semantic Enforcement Mechanism
57
4.2 Tools Used
  • Protégé: an excellent KR editor with an OWL plugin
  • Uses Jena for persistent storage
  • Uses the DIG interface to communicate with
    reasoners
  • Has visualization tools, query tools, etc.
  • Racer (or FaCT): a DL reasoner that supports the
    DIG interface
  • OWL API: converts OWL ontologies to DIG format
  • The DIG interface does not support datatype
    properties.
  • How to get around it?

58
4.3 Getting Around DIG Problems
  • In the IncentiveValue taxonomy, we modeled the
    values as a set of recursive subclasses that
    represent value levels.
  • For example, for the percentage subclass, the
    first class is GreaterThan5, the second is
    GreaterThan10, and so forth.
  • We used the complement of the incentive value
    submitted by the merchant.
  • For the scalar subclass (which is infinite), we
    introduced an artificial ceiling and used unequal
    steps (e.g., 50, 100, 200, 500).
  • We followed a similar approach with the other
    taxonomies.
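
A rough sketch of how numeric values might be mapped onto such level classes follows. It assumes percentage levels at every multiple of 5 below 100 and reads GreaterThanN as strictly greater than N; both are assumptions, as are the function names. The use of the complement of the merchant's value is not modeled here.

```python
from bisect import bisect_left

PREFIX = "GreaterThan"
PERCENT_LEVELS = list(range(5, 100, 5))  # assumed: 5, 10, ..., 95
SCALAR_LEVELS = [50, 100, 200, 500]      # unequal steps with a ceiling


def level_class(value, levels):
    """Most specific GreaterThanN class containing value, or None."""
    count_below = bisect_left(levels, value)  # levels strictly below value
    if count_below == 0:
        return None
    return f"{PREFIX}{levels[count_below - 1]}"


def is_subclass(specific, general):
    """GreaterThanM is a subclass of GreaterThanN exactly when M >= N."""
    return int(specific[len(PREFIX):]) >= int(general[len(PREFIX):])


offer_class = level_class(25, PERCENT_LEVELS)  # a 25% reduction
```

Because each level is nested inside the previous one, comparing two incentive values becomes an ordinary class subsumption check, which the reasoner can perform without datatype support.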

59
5. Performance of matching using RACER for query
sizes from 15 to 90
60
Summary (Part II)
  • Performance
  • Query size is more significant than ontology size
    for the reasoner
  • Contribution
  • Promotion Model
  • Preference Ontology
  • Enforcement Mechanism
  • Algorithm for semantic-based query processing
  • Future work
  • Combining the two approaches in one solution that
    provides efficiency and expressiveness.
  • Examining other query processing approaches,
    e.g., using an instance store, and asserting
    multiple queries before processing them.