Confidentiality, Privacy and Trust 1 - PowerPoint PPT Presentation



Slides: 39
Provided by: chrisc8


Transcript and Presenter's Notes

Title: Confidentiality, Privacy and Trust 1

Confidentiality, Privacy and Trust - 1
  • Prof. Bhavani Thuraisingham
  • The University of Texas at Dallas
  • October 6, 2008

CPT: Confidentiality, Privacy and Trust
  • Before I as a user of Organization A send data
    about me to organization B, I read the privacy
    policies enforced by organization B
  • If I agree to the privacy policies of
    organization B, then I will send data about me to
    organization B
  • If I do not agree with the policies of
    organization B, then I can negotiate with
    organization B
  • Even if the web site states that it will not
    share private information with others, do I trust
    the web site?
  • Note: while confidentiality is enforced by the
    organization, privacy is determined by the user.
    Therefore, for confidentiality, the organization
    will determine whether a user can have the data.
    If so, then the organization can further
    determine whether the user can be trusted

What is Privacy?
  • Medical community
  • Privacy is about a patient determining what
    patient/medical information the doctor should
    release about him/her
  • Financial community
  • A bank customer determines what financial
    information the bank should release about him/her
  • Government community
  • FBI would collect information about US citizens.
    However FBI determines what information about a
    US citizen it can release to say the CIA

Some Privacy concerns
  • Medical and Healthcare
  • Employers, marketers, or others knowing of
    private medical concerns
  • Security
  • Allowing access to individuals' travel and
    spending data
  • Allowing access to web surfing behavior
  • Marketing, Sales, and Finance
  • Allowing access to individuals' purchases

Data Mining as a Threat to Privacy
  • Data mining gives us facts that are not obvious
    to human analysts of the data
  • Can general trends across individuals be
    determined without revealing information about
  • Possible threats
  • Combine collections of data and infer information
    that is private
  • Disease information from prescription data
  • Military action from pizza delivery to the Pentagon
  • Need to protect the associations and correlations
    between the data that are sensitive or private

Some Privacy Problems and Potential Solutions
  • Problem: Privacy violations that result due to
    data mining
  • Potential solution: Privacy-preserving data
  • Problem: Privacy violations that result due to
    the inference problem
  • Inference is the process of deducing sensitive
    information from the legitimate responses
    received to user queries
  • Potential solution: Privacy constraint processing
  • Problem: Privacy violations due to un-encrypted
  • Potential solution: Encryption at different
  • Problem: Privacy violation due to poor system
  • Potential solution: Develop methodology for
    designing privacy-enhanced systems

Privacy Constraint/Policy Processing
  • Privacy constraints processing
  • Based on prior research in security constraint
  • Simple constraint: an attribute of a document is
  • Content-based constraint: If a document contains
    information about X, then it is private
  • Association-based constraint: Two or more
    documents taken together are private; individually
    each document is public
  • Release constraint: After X is released, Y becomes
  • Augment a database system with a privacy
    controller for constraint processing
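
The constraint types listed above can be illustrated with a small sketch. This is not the privacy controller from the slides; it is a minimal toy, assuming that the truncated "simple" and "release" bullets both end in "private". The class name `PrivacyController`, its fields, and the document identifiers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PrivacyController:
    """Toy privacy controller that checks documents against four constraint types."""

    private_attributes: set = field(default_factory=set)  # simple constraints
    private_topics: set = field(default_factory=set)      # content-based constraints
    private_groups: list = field(default_factory=list)    # association-based constraints
    release_rules: dict = field(default_factory=dict)     # release constraints: X -> Y
    released: set = field(default_factory=set)            # documents released so far

    def may_release(self, doc_id, attributes=(), topics=()):
        # Simple constraint (assumed reading): a private attribute blocks release.
        if self.private_attributes & set(attributes):
            return False
        # Content-based constraint: information about a private topic blocks release.
        if self.private_topics & set(topics):
            return False
        # Association-based constraint: never let every member of a group out.
        for group in self.private_groups:
            if doc_id in group and (group - {doc_id}) <= self.released:
                return False
        # Release constraint (assumed reading): Y is private once X is released.
        for trigger, target in self.release_rules.items():
            if target == doc_id and trigger in self.released:
                return False
        return True

    def release(self, doc_id, attributes=(), topics=()):
        allowed = self.may_release(doc_id, attributes, topics)
        if allowed:
            self.released.add(doc_id)
        return allowed


controller = PrivacyController(
    private_attributes={"diagnosis"},
    private_topics={"X"},
    private_groups=[frozenset({"doc1", "doc2"})],
    release_rules={"docA": "docB"},
)
```

With this setup, `doc1` can be released on its own, but a later request for `doc2` is refused because the two documents together are private.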

Inference/Privacy Control
[Figure: architecture diagram, "Technology By UTD".
Components: Interface to the Semantic Web; Inference
Engine/Rules Processor; Privacy Policies, Ontologies,
Rules; OWL/RDF Documents, Web Pages, Databases;
OWL/RDF Data Management]

Semantic Model for Privacy Control
[Figure: semantic graph about Patient John with
nodes "Has disease", "John's address" and "Travels
frequently". Dark lines/boxes contain private
information]

Privacy Preserving Data Mining
  • Prevent useful results from mining
  • Introduce cover stories to give false results
  • Only make a sample of data available so that an
    adversary is unable to come up with useful rules
    and predictive functions
  • Randomization
  • Introduce random values into the data and/or
  • Challenge is to introduce random values without
    significantly affecting the data mining results
  • Give range of values for results instead of exact
  • Secure Multi-party Computation
  • Each party knows its own inputs; encryption
    techniques are used to compute final results
  • Rules, predictive functions
  • Approach: Only make a sample of data available
  • Limits the ability to learn a good classifier
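
The randomization idea above can be sketched as follows. This is only an illustration under stated assumptions: the data is synthetic, the noise is zero-mean Gaussian, and the helper names `randomize` and `mean` are made up for this example. It shows that individual values are masked while an aggregate (here the mean) stays close to the true value.

```python
import random


def randomize(values, noise_scale, rng):
    """Return a perturbed copy of values with zero-mean Gaussian noise added."""
    return [v + rng.gauss(0.0, noise_scale) for v in values]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


rng = random.Random(0)                                # fixed seed for repeatability
ages = [rng.randint(20, 80) for _ in range(10_000)]   # synthetic "private" data
perturbed = randomize(ages, noise_scale=10.0, rng=rng)

true_mean = mean(ages)
estimated_mean = mean(perturbed)
```

Because the noise averages out over many records, `estimated_mean` is a reasonable stand-in for `true_mean`, while any single perturbed record reveals little about the original value. Larger `noise_scale` values give more masking at the cost of less accurate mining results.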

Platform for Privacy Preferences (P3P) What is
  • P3P is an emerging industry standard that enables
    web sites to express their privacy practices in a
    standard format
  • The format of the policies can be automatically
    retrieved and understood by user agents
  • It is a product of the W3C (World Wide Web
    Consortium)
  • When a user enters a web site, the privacy
    policies of the web site are conveyed to the user.
    If the privacy policies are different from user
    preferences, the user is notified. The user can
    then decide how to proceed

Platform for Privacy Preferences (P3P)
  • Several major corporations are working on P3P
    standards including
  • Microsoft
  • IBM
  • HP
  • NEC
  • Nokia
  • NCR
  • Web sites have also implemented P3P
  • Semantic web group has adopted P3P

Platform for Privacy Preferences (P3P)
  • Initial version of P3P used RDF to specify
    policies. Recent version has migrated to XML
  • P3P policies use XML with namespaces for
    encoding policies
  • P3P has its own statements and data types
    expressed in XML. P3P schemas utilize XML schemas
  • P3P specification released in January 2005 uses
    catalog shopping example to explain concepts. P3P
    is an international standard and is an ongoing
  • Example: Catalog shopping
  • Your name will not be given to a third party but
    your purchases will be given to a third party

P3P and Legal Issues
  • P3P does not replace laws
  • P3P works together with the law
  • What happens if the web sites do not honor their
    P3P policies?
  • Then appropriate legal actions will have to be
  • XML is the technology to specify P3P policies
  • Policy experts will have to specify the policies
  • Technologists will have to develop the
  • Legal experts will have to take actions if the
    policies are violated

Privacy for Assured Information Sharing
[Figure: federation diagram. Boxes: Data/Policy for
Federation; Data/Policy for Agency A; Data/Policy
for Agency B; Data/Policy for Agency C]

Key Points
  • 1. There is no universal definition for privacy;
    each organization must define what it means by
    privacy and develop appropriate privacy policies
  • 2. Technology alone is not sufficient for privacy.
    We need technologists, policy experts, legal
    experts and social scientists to work on privacy
  • 3. Some well known people have said "Forget about
    privacy". Therefore, should we pursue research on
  • Interesting research problems; there is a need to
    continue with research
  • Something is better than nothing
  • Try to prevent privacy violations and if
    violations occur then prosecute
  • 4. We need to tackle privacy from all directions

Application Specific Privacy?
  • Examining privacy may make sense for healthcare
    and financial applications
  • Does privacy work for Defense and Intelligence
  • Is it meaningful to have privacy for
    surveillance and geospatial applications
  • Once the image of my house is on Google Earth,
    then how much privacy can I have?
  • I may want my location to be private, but does it
    make sense if a camera can capture a picture of
  • If there are sensors all over the place, is it
    meaningful to have privacy preserving
  • This suggests that we need application specific
  • It is not meaningful to examine PPDM for every
    data mining algorithm and for every application

Data Mining and Privacy: Friends or Foes?
  • They are neither friends nor foes
  • Need advances in both data mining and privacy
  • Need to design flexible systems
  • For some applications one may have to focus
    entirely on pure data mining while for some
    others there may be a need for privacy-preserving
    data mining
  • Need flexible data mining techniques that can
    adapt to the changing environments
  • Technologists, legal specialists, social
    scientists, policy makers and privacy advocates
    MUST work together

Popular Social Networks
  • Facebook - A social networking website.
    Initially the membership was restricted to
    students of Harvard University. It was originally
    based on what first-year students were given,
    called the face book, which was a way to get to
    know other students on campus. As of July 2007,
    there were over 34 million active members
    worldwide. From September 2006 to September 2007
    it increased its ranking from 60th to 6th most
    visited web site, and was the number one site for
    photos in the United States.
  • Twitter - A free social networking and
    micro-blogging service that allows users to send
    updates (text-based posts, up to 140 characters
    long) via SMS, instant messaging, email, to the
    Twitter website, or an application/widget within
    a space of your choice, like MySpace, Facebook, a
    blog, or an RSS aggregator/reader.
  • MySpace - A popular social networking website
    offering an interactive, user-submitted network
    of friends, personal profiles, blogs, groups,
    photos, music and videos internationally.
    According to Alexa Internet, MySpace is currently
    the world's sixth most popular English-language
    website and the sixth most popular website in any
    language, and the third most popular website in
    the United States, though it has topped the chart
    on various weeks. As of September 7, 2007, there
    are over 200 million accounts.

Social Networks: More formal definition
  • A structural approach to understanding social
  • Networks consist of Actors and the Ties between
  • We represent social networks as graphs whose
    vertices are the actors and whose edges are the
  • Edges are usually weighted to show the strength
    of the tie.
  • In the simplest networks, an Actor is an
    individual person.
  • A tie might be "is acquainted with". Or it might
    represent the amount of email exchanged between
    persons A and B.
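
As a rough illustration of the definition above, a social network can be held as a weighted, undirected graph. The class `SocialNetwork` and the sample e-mail counts are invented for this sketch; here the weight of a tie stands for the amount of email exchanged between two persons.

```python
from collections import defaultdict


class SocialNetwork:
    """Undirected weighted graph: vertices are actors, edges are ties."""

    def __init__(self):
        self._ties = defaultdict(dict)

    def add_tie(self, a, b, weight=1.0):
        # Store the tie in both directions because it is undirected.
        self._ties[a][b] = weight
        self._ties[b][a] = weight

    def strength(self, a, b):
        # A missing tie has strength 0.
        return self._ties.get(a, {}).get(b, 0.0)

    def neighbors(self, actor):
        return sorted(self._ties.get(actor, {}))


net = SocialNetwork()
net.add_tie("A", "B", weight=42)  # e.g. 42 emails exchanged between A and B
net.add_tie("B", "C", weight=3)
```

An unweighted tie such as "is acquainted with" is simply the special case where every weight is 1.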

Social Network Examples
  • Effects of urbanization on individual well-being
  • World political and economic system
  • Community elite decision-making
  • Social support, Group problem solving
  • Diffusion and adoption of innovations
  • Belief systems, Social influence
  • Markets, Sociology of science
  • Exchange and power
  • Email, Instant messaging, Newsgroups
  • Co-authorship, Citation, Co-citation
  • SocNet software, Friendster
  • Blogs and diaries, Blog quotes and links

  • Sociograms were invented in 1933 by Moreno.
  • In a sociogram, the actors are represented as
    points in a two-dimensional space. The location
    of each actor is significant. E.g. a central
    actor is plotted in the center, and others are
    placed in concentric rings according to
    distance from this actor.
  • Actors are joined with lines representing ties,
    as in a social network. In other words a social
    network is a graph, and a sociogram is a
    particular 2D embedding of it.
  • These days, sociograms are rarely used (most
    examples on the web are not sociograms at all,
    but networks). But methods like MDS
    (Multi-Dimensional Scaling) can be used to lay
    out Actors, given a vector of attributes about
  • Social Networks were studied early by researchers
    in graph theory (Harary et al. 1950s). Some
    social network properties can be computed
    directly from the graph.
  • Others depend on an adjacency matrix
    representation (Actors index rows and columns of
    a matrix, matrix elements represent the tie
    strength between them).
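
The adjacency matrix representation mentioned above might look like the following sketch. The actors, ties, and helper names are hypothetical; rows and columns are indexed by actor, and each entry holds the tie strength (0 when there is no tie).

```python
def adjacency_matrix(actors, ties):
    """Build a symmetric matrix whose entry [i][j] is the tie strength."""
    index = {actor: i for i, actor in enumerate(actors)}
    matrix = [[0.0] * len(actors) for _ in actors]
    for a, b, strength in ties:
        i, j = index[a], index[b]
        matrix[i][j] = strength
        matrix[j][i] = strength
    return matrix


def degree(matrix, i):
    """Number of ties of actor i (non-zero entries in row i)."""
    return sum(1 for value in matrix[i] if value != 0)


def total_strength(matrix, i):
    """Sum of tie strengths of actor i (row sum)."""
    return sum(matrix[i])


actors = ["A", "B", "C", "D"]
ties = [("A", "B", 2.0), ("A", "C", 1.0), ("C", "D", 5.0)]
matrix = adjacency_matrix(actors, ties)
degrees = [degree(matrix, i) for i in range(len(actors))]
```

Simple properties such as degree fall out of row operations on the matrix, which is one reason this representation is convenient for analysis.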

Social Network Analysis of 9/11 Terrorists
Early in 2000, the CIA was informed of two
terrorist suspects linked to al-Qaeda. Nawaf
Alhazmi and Khalid Almihdhar were photographed
attending a meeting of known terrorists in
Malaysia. After the meeting they returned to Los
Angeles, where they had already set up
residence in late 1999.

Social Network Analysis of 9/11 Terrorists
  • What do you do with these suspects? Arrest or
    deport them immediately? No, we need to use them
    to discover more of the al-Qaeda network.
  • Once suspects have been discovered, we can use
    their daily activities to uncloak their network.
    Just like they used our technology against us, we
    can use their planning process against them.
    Watch them, and listen to their conversations to
  • who they call / email
  • who visits with them locally and in other cities
  • where their money comes from
  • The structure of their extended network begins to
    emerge as data is discovered via surveillance.

Social Network Analysis of 9/11 Terrorists
A suspect being monitored may have many contacts
-- both accidental and intentional. We must
always be wary of 'guilt by association'.
Accidental contacts, like the mail delivery
person, the grocery store clerk, and neighbor may
not be viewed with investigative interest.

Intentional contacts are like the late
afternoon visitor, whose car license plate is
traced back to a rental company at the airport,
where we discover he arrived from Toronto (got to
notify the Canadians) and his name matches a cell
phone number (with a Buffalo, NY area code) that
our suspect calls regularly. This intentional
contact is added to our map and we start tracking
his interactions -- where do they lead? As data
comes in, a picture of the terrorist organization
slowly comes into focus.

How do investigators know whether they are on to
something big? Often they don't. Yet in this case
there was another strong clue that Alhazmi and
Almihdhar were up to no good -- the attack on the
USS Cole in October of 2000. One of the chief
suspects in the Cole bombing, Khallad, was also
present along with Alhazmi and Almihdhar at the
terrorist meeting in Malaysia in January 2000.

Once we have their direct links, the next step is
to find their indirect ties -- the 'connections
of their connections'. Discovering the nodes and
links within two steps of the suspects usually
starts to reveal much about their network. Key
individuals in the local network begin to stand
out. In viewing the network map in Figure 2, most
of us will focus on Mohammed Atta because we now
know his history. The investigator uncloaking
this network would not be aware of Atta's
eventual importance. At this point he is just
another node to be investigated.
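
Finding the nodes "within two steps" of known suspects is a bounded breadth-first search. The sketch below assumes the network is already available as an adjacency list; the node labels (`S1`, `P1`, and so on) are placeholders and do not correspond to anyone in the case described.

```python
from collections import deque


def within_steps(graph, seeds, max_steps):
    """Return {node: distance} for all nodes at most max_steps from any seed."""
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if distance[node] == max_steps:
            continue  # do not expand beyond the step limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


# Placeholder network: S1 and S2 are the two original suspects.
graph = {
    "S1": ["S2", "P1"],
    "S2": ["S1", "P2"],
    "P1": ["S1", "P3"],
    "P2": ["S2"],
    "P3": ["P1", "P4"],
    "P4": ["P3"],
}
reachable = within_steps(graph, seeds=["S1", "S2"], max_steps=2)
```

Here `P1` and `P2` are direct links, `P3` is a "connection of a connection", and `P4` lies three steps away, so it is left out of the map.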

Social Network Analysis of 9/11 Terrorists
Figure 2 shows the two suspects and

Social Network Analysis of 9/11 Terrorists
  • We now have enough data for two key conclusions
  • All 19 hijackers were within 2 steps of the two
    original suspects uncovered in 2000!
  • Social network metrics reveal Mohammed Atta
    emerging as the local leader
  • With hindsight, we have now mapped enough of the
    9-11 conspiracy to stop it. Again, the
    investigators are never sure they have uncovered
    enough information while they are in the process
    of uncloaking the covert organization. They also
    have to contend with superfluous data. This data
    was gathered after the event, so the
    investigators knew exactly what to look for.
    Before an event it is not so easy.
  • As the network structure emerges, a key dynamic
    that needs to be closely monitored is the
    activity within the network. Network activity
    spikes when a planned event approaches. Is there
    an increase of flow across known links? Are new
    links rapidly emerging between known nodes? Are
    money flows suddenly going in the opposite
    direction? When activity reaches a certain
    pattern and threshold, it is time to stop
    monitoring the network, and time to start
    removing nodes.
  • The author argues that this bottom-up approach of
    uncloaking a network is more effective than a top
    down search for the terrorist needle in the
    public haystack -- and it is less invasive of the
    general population, resulting in far fewer "false

Social Network Analysis of Steroid Usage in
Baseball
When the Mitchell Report on steroid use in Major
League Baseball (MLB) was published, people were
surprised at who and how many players were
mentioned. The diagram below shows a human
network created from data found in the Mitchell
Report. Baseball players are shown as green
nodes. Those who were found to be providers of
steroids and other illegal performance enhancing
substances appear as red nodes. The links reveal
the flow of chemicals -- from provider to player.

Knowledge Sharing in Organizations Finding

Knowledge Sharing Network Finding Experts
Organizational leaders are preparing for the
potential loss of expertise and knowledge flow
due to turnover, downsizing, outsourcing, and the
coming retirements of the baby boom generation.
The model network (previous chart) is used to
illustrate the knowledge continuity analysis
process.

Each node in this sample network (previous chart)
represents a person who works in a knowledge
domain. Some people have more / different
knowledge than others. Employees who will retire
in 2 years or less have their nodes colored red.
Those who will retire in 3-4 years are colored
yellow. Those retiring in 5 years or later are
colored green.

A gray, directed line is drawn from the seeker of
knowledge to the source of expertise. A --> B
indicates that A seeks expertise / advice from B.
Those with many arrows pointing to them are
sought often for assistance.

The top subject matter experts -- SMEs -- in this
group are nodes 29, 46, 100, 41, 36 and 55. The
SMEs were discovered using a network metric in
InFlow that is similar to how the Google search
engine ranks web pages -- using both direct and
indirect links. Of the top six SMEs in this
group, half are colored red (100) or yellow (46,
55). The loss of person 46 has the greatest
potential for knowledge loss. 90% of the network
is within 3 steps of accessing this key knowledge
source.
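
The text only says the InFlow metric is "similar to" how Google ranks web pages, so the following is a generic PageRank-style sketch, not InFlow's actual algorithm. Edges run from the seeker of knowledge to the source of expertise; the node names and parameter values are assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """PageRank-style score over directed (seeker, source) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for seeker, source in edges:
        out_links[seeker].append(source)

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for node in nodes:
            # A node that seeks advice from nobody spreads its score evenly.
            targets = out_links[node] or nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# A, C and D all seek expertise from B; D also asks E.
edges = [("A", "B"), ("C", "B"), ("D", "B"), ("D", "E")]
scores = pagerank(edges)
top_expert = max(scores, key=scores.get)
```

Scores account for indirect links as well: advice sought from a highly ranked person counts for more than advice sought from a peripheral one.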

Social Networks: Security and Privacy Issues
European Network and Information Security Agency
  • The European Network and Information Security
    Agency (ENISA) has released its first issue paper,
    "Security Issues and Recommendations for Online
    Social Networks".
  • Four groups of threats: privacy related threats,
    variants of traditional network and information
    security threats, identity related threats,
    social threats.
  • Recommendations are given for governments
    (oversight and adaption of existing data
    protection legislation), companies that run such
    networks, technology developers, and research and
    standardisation bodies.
  • One concern is the recommendation to use automated
    filters against "offensive, litigious or illegal
    content". This brings potential freedom of speech
    issues. European Digital Rights has started a
    campaign against a similar recommendation by the
    Council of Europe. The issue of portability of
    profiles and social graphs is also addressed.
    However, what is missing is that information
    about social links is not about only one user,
    but also the others to whom he or she is linked.
    They have to agree if this information is moved
    to different platforms.

Social Networks Security and Privacy Issues
Microsoft Recommendations
  • Online communities require you to provide
    personal information. Profiles are public.
    Comments you post are permanently recorded on the
    community site. You might even mention when you
    plan to be out of town.
  • E-mail and phishing scammers count on the
    appealing sense of trust that is often fostered
    in online communities to steal your personal
    information. The more you reveal in profiles and
    posts, the more vulnerable you are to scams,
    spam, and identity theft.
  • Here are some features to look for when you're
    considering joining an online community:
  • Privacy policies that explain exactly what
    information the service will collect and how it
    might be used.
  • User guidelines that outline a basic code of
    conduct for users on their sites. Sites have the
    option to penalize reported violators with
    account suspension or termination.
  • Special provisions for children and their
    parents, such as family-friendly options geared
    towards protecting children under a certain age.
  • Password protection to help keep your account
    secure.
  • E-mail address hiding, which lets you display
    only part of your e-mail address on the site's
    membership lists.
  • Filtering options: offered on blogging sites,
    these tools let you choose which subscribers can
    see what you've written.

Role of Semantic Web
  • FOAF (Friend of a Friend)
  • Social Graph represented in RDF
  • Use the reasoning tools and analyze the social
    network for suspicious events
  • Protect the privacy of individuals

FOAF
  • FOAF (an acronym of Friend of a Friend) is a
    machine-readable ontology describing persons,
    their activities and their relations to other
    people and objects. Anyone can use FOAF to
    describe him or herself. FOAF allows groups of
    people to describe social networks without the
    need for a centralised database.
  • FOAF's descriptive vocabulary is expressed using
    RDF (Resource Description Framework) and OWL (Web
    Ontology Language).
  • Computers may use these FOAF profiles to find,
    for example, all people living in Europe, or to
    list all people both you and a friend of yours
    know. This is accomplished by defining
    relationships between people. Each profile has a
    unique identifier (such as the person's e-mail
    addresses, a URI of the homepage or weblog of the
    person), which is used when defining these

FOAF
  • The FOAF project, which defines and extends the
    vocabulary of a FOAF profile, was started in 2000
    by and . It can be considered the first Social
    Semantic Web application, in that it combines RDF
    technology with 'Social Web' concerns.
  • Tim Berners-Lee in a recent essay redefined the
    Semantic web concept into something he calls the
    Giant Global Graph, where relationships transcend
    networks/documents. He considers the GGG to be on
    equal grounds with Internet and World Wide Web,
    stating that "I express my network in a FOAF
    file, and that is a start of the revolution."

FOAF
  • The following FOAF profile (written in XML
    format) states that Jimmy Wales is the name of
    the person described here. His e-mail address,
    homepage and depiction are resources, which means
    that each of them can be described using RDF as
    well. He has Wikipedia as an interest, and knows
    Angela Beesley (which is the name of a 'Person'
  • xmlnsrdf"http//
    -ns" xmlnsrdfs"http//
    Jimmy Wales rdfresource"" /
    .com/" / Jimbo" / rdfresource"http//"
    rdfslabel"Wikipedia" /
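
Since the profile above did not survive extraction, here is a separate, hedged sketch of what building and reading a small FOAF profile can look like. It uses only Python's standard library and the RDF and FOAF namespace URIs; the people ("Alice Example", "Bob Example") and the mailbox are hypothetical and are not the values from the original slide.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"
ET.register_namespace("rdf", RDF)
ET.register_namespace("foaf", FOAF)


def build_profile(name, mbox, knows):
    """Serialize a minimal FOAF profile as RDF/XML text."""
    root = ET.Element(f"{{{RDF}}}RDF")
    person = ET.SubElement(root, f"{{{FOAF}}}Person")
    ET.SubElement(person, f"{{{FOAF}}}name").text = name
    # The mailbox is a resource, so it is given as an rdf:resource attribute.
    ET.SubElement(person, f"{{{FOAF}}}mbox", {f"{{{RDF}}}resource": mbox})
    for friend in knows:
        link = ET.SubElement(person, f"{{{FOAF}}}knows")
        friend_node = ET.SubElement(link, f"{{{FOAF}}}Person")
        ET.SubElement(friend_node, f"{{{FOAF}}}name").text = friend
    return ET.tostring(root, encoding="unicode")


def known_people(profile_xml):
    """List the names of everyone the profile owner says they know."""
    root = ET.fromstring(profile_xml)
    path = f"{{{FOAF}}}Person/{{{FOAF}}}knows/{{{FOAF}}}Person/{{{FOAF}}}name"
    return [node.text for node in root.findall(path)]


profile = build_profile(
    "Alice Example", "mailto:alice@example.org", ["Bob Example"]
)
friends = known_people(profile)
```

A real application would more likely use an RDF library that understands the full data model; plain XML handling is used here only to keep the sketch self-contained.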

Confidentiality, Privacy and Trust (CPT)
  • How can CPT be incorporated into FOAF?
  • Assignment 3