Scalable Trusted Computing: Engineering challenge, or something more fundamental?

1
Scalable Trusted Computing: Engineering challenge,
or something more fundamental?
  • Ken Birman
  • Cornell University

2
Cornell Quicksilver Project
  • Krzys Ostrowski: the key player
  • Ken Birman, Danny Dolev: collaborators and
    research supervisors
  • Mahesh Balakrishnan, Maya Haridasan, Tudor
    Marian, Amar Phanishayee, Robbert van Renesse,
    Einar Vollset, Hakim Weatherspoon: offered
    valuable comments and criticisms

3
Trusted Computing
  • A vague term with many meanings
  • For individual platforms, integrity of the
    computing base
  • Availability and exploitation of TPM h/w
  • Proofs of correctness for key components
  • Security policy specification, enforcement
  • Scalable trust issues arise mostly in distributed
    settings

4
System model
  • A world of
  • Actors: Sally, Ted, ...
  • Groups: Sally_Advisors = {Ted, Alice, ...}
  • Objects: travel_plans.html, investments.xls
  • Actions: Open, Edit, ...
  • Policies
  • (Actor, Object, Action) → {Permit, Deny}
  • Places: Ted_Desktop, Sally_Phone, ...

5
Rules
  • If Emp.place ∈ Secure_Places and Emp ∈
    Client_Advisors then Allow Open
    Client_Investments.xls
  • Can Ted, working at Ted_Desktop, open
    Sally_Investments.xls?
  • ... yes, if Ted_Desktop ∈ Secure_Places
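
The rule above can be read as a function from a request to Permit or Deny. The following is a minimal sketch, not the authors' implementation: the `Request` type, the `decide` function, the in-memory `GROUPS` and `SECURE_PLACES` tables, and the default-deny fallback are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical in-memory policy state; names mirror the slide examples.
GROUPS = {"Sally_Advisors": {"Ted", "Alice"}}
SECURE_PLACES = {"Ted_Desktop"}


@dataclass(frozen=True)
class Request:
    actor: str
    place: str
    action: str
    obj: str


def decide(req: Request) -> str:
    """Return "Permit" or "Deny" for one request."""
    suffix = "_Investments.xls"
    # Rule from the slide: an advisor of the client, working at a secure
    # place, may Open <Client>_Investments.xls.
    if req.action == "Open" and req.obj.endswith(suffix):
        client = req.obj[: -len(suffix)]
        advisors = GROUPS.get(f"{client}_Advisors", set())
        if req.actor in advisors and req.place in SECURE_PLACES:
            return "Permit"
    # Assumption: anything not explicitly allowed is denied.
    return "Deny"
```

With this state, Ted at Ted_Desktop is permitted to open Sally_Investments.xls, while the same request from Sally_Phone is denied because that place is not in `SECURE_PLACES`.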

6
Miscellaneous stuff
  • Policy changes all the time
  • Like a database receiving updates
  • E.g. as new actors are added, old ones leave the
    system, etc
  • ... and they have a temporal scope
  • Starting at time t=19 and continuing until now,
    Ted is permitted to access Sally's file
    investments.xls

7
Order dependent decisions
  • Consider rules such as
  • Only one person can use the cluster at a time.
  • The meeting room is limited to three people
  • While people lacking clearance are present, no
    classified information can be exposed
  • These are sensitive to the order in which
    conflicting events occur
  • Central clearinghouse decides what to allow
    based on order in which it sees events
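
To illustrate why order matters, here is a small sketch of the "meeting room is limited to three people" rule. The `admit_in_order` function and the event tuples are hypothetical; the point is only that the same set of events, seen in a different order, yields a different decision.

```python
def admit_in_order(events, capacity=3):
    """Process enter/leave events in the order given; return who was admitted."""
    present, admitted = set(), []
    for kind, person in events:
        if kind == "enter" and len(present) < capacity:
            present.add(person)
            admitted.append(person)
        elif kind == "leave":
            present.discard(person)
    return admitted


events = [("enter", "Ann"), ("enter", "Bob"), ("enter", "Cy"), ("enter", "Dee")]
# Same events, but the clearinghouse happens to see Dee's request first.
reordered = [events[3]] + events[:3]

first_run = admit_in_order(events)      # Dee is turned away
second_run = admit_in_order(reordered)  # Cy is turned away
```

Two enforcement points that see these events in different orders would disagree about who is in the room, which is why a single agreed order is needed.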

8
Goal: Enforce policy
[Figure labels: "investments.xls", "Read", "(data)", "Policy Database"]
9
... reduction to a proof
  • Each time an action is attempted, system must
    develop a proof either that the action should be
    blocked or allowed
  • For example, might use the BAN logic
  • For the sake of argument, let's assume we know
    how to do all this on a single machine

10
Implications of scale
  • We'll be forced to replicate and decentralize the
    policy enforcement function
  • For ownership: allows local policy to be stored
    close to the entity that owns it
  • For performance and scalability
  • For fault-tolerance

11
Decentralized policy enforcement
[Figure "Original Scheme", labels: "investments.xls", "Read", "(data)", "Policy Database"]
12
Decentralized policy enforcement
[Figure "New Scheme", labels: "investments.xls", "Read", "(data)", "Policy DB 1", "Policy DB 2"]
13
So how do we decentralize?
  • Consistency: the bane of decentralization
  • We want a system to behave as if all decisions
    occur in a single rules database
  • Yet want the decisions to actually occur in a
    decentralized way: a replicated policy database
  • System needs to handle concurrent events in a
    consistent manner

14
So how do we decentralize?
  • More formally
  • Analogy: database 1-copy serializability

Any run of the decentralized system should be
indistinguishable from some run of a centralized
system
15
But this is a familiar problem!
  • Database researchers know it as the atomic commit
    problem.
  • Distributed systems people call it:
  • State machine replication
  • Virtual synchrony
  • Paxos-style replication
  • ... and because of this we know a lot about the
    question!

16
replicated data with abcast
  • Closely related to the atomic broadcast problem
    within a group
  • Abcast sends a message to all the members of a
    group
  • Protocol guarantees order, fault-tolerance
  • Solves consensus
  • Indeed, a dynamic policy repository would need
    abcast if we wanted to parallelize it for speed
    or replicate it for fault-tolerance!
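
A minimal sketch of why ordered delivery matters for a replicated policy repository. The `PolicyReplica` class and the `abcast` function are stand-ins invented for illustration: `abcast` here is just a loop that delivers the same updates in the same order, and it provides none of the fault-tolerance a real atomic broadcast protocol would.

```python
class PolicyReplica:
    """One copy of the policy database, updated deterministically."""

    def __init__(self):
        self.rules = {}

    def apply(self, update):
        op, key, value = update
        if op == "set":
            self.rules[key] = value
        elif op == "delete":
            self.rules.pop(key, None)


def abcast(replicas, updates):
    """Stand-in for atomic broadcast: same updates, same order, everywhere."""
    for update in updates:
        for replica in replicas:
            replica.apply(update)


key = ("Ted", "investments.xls", "Open")
updates = [("set", key, "Permit"), ("set", key, "Deny")]

a, b = PolicyReplica(), PolicyReplica()
abcast([a, b], updates)

# A replica that sees the same updates in a different order diverges.
c = PolicyReplica()
for update in reversed(updates):
    c.apply(update)
```

Replicas `a` and `b` end in the same state, while `c` ends with the opposite decision for Ted, even though it received exactly the same set of updates.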

17
A slight digression
  • Consensus is a classical problem in distributed
    systems
  • N processes
  • They start execution with inputs ∈ {0,1}
  • Asynchronous, reliable network
  • At most 1 process fails by halting (crash)
  • Goal: protocol whereby all decide same value v,
    and v was an input

18
Distributed Consensus
"Jenkins, if I want another yes-man, I'll build
one!"
Lee Lorenz, Brent Sheppard
19
Asynchronous networks
  • No common clocks or shared notion of time (local
    ideas of time are fine, but different processes
    may have very different clocks)
  • No way to know how long a message will take to
    get from A to B
  • Messages are never lost in the network

20
Fault-tolerant protocol
  • Collect votes from all N processes
  • At most one is faulty, so if one doesn't respond,
    count that vote as 0
  • Compute majority
  • Tell everyone the outcome
  • They decide (they accept outcome)
  • ... but this has a problem! Why?

21
What makes consensus hard?
  • Fundamentally, the issue revolves around
    membership
  • In an asynchronous environment, we can't detect
    failures reliably
  • A faulty process stops sending messages, but a
    slow message might confuse us
  • Yet when the vote is nearly a tie, this confusing
    situation really matters
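
The near-tie case can be made concrete with a small sketch of the vote-collecting protocol from the previous slide. The `naive_decide` function and the `responded` set are hypothetical; `responded` models which votes reached the collector before it gave up waiting, which in an asynchronous network says nothing about whether the sender actually crashed.

```python
def naive_decide(votes, responded):
    """Majority vote in which a missing response is counted as 0."""
    counted = [v if i in responded else 0 for i, v in enumerate(votes)]
    return 1 if sum(counted) * 2 > len(counted) else 0


votes = [1, 1, 1, 0, 0]  # true inputs: a 3-2 majority for 1

# Every vote arrives in time.
all_in = naive_decide(votes, responded={0, 1, 2, 3, 4})

# Process 0 is merely slow, but the collector treats it as crashed.
one_slow = naive_decide(votes, responded={1, 2, 3, 4})
```

The outcome flips from 1 to 0 purely because of message timing, so two observers with different timing could reach different decisions.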

22
Some bad news
  • FLP result shows that, in an asynchronous system,
    fault-tolerant consensus protocols always have
    non-terminating runs.
  • All of the mechanisms we discussed are equivalent
    to consensus
  • Impossibility of non-blocking commit is a similar
    result from database community

23
But how bad is this news?
  • In practice, these impossibility results don't
    hold up so well
  • Both define "impossible" to mean "not always
    possible"
  • In fact, with probabilities, the FLP scenario is
    of probability zero
  • ... must ask: does a probability zero result even
    hold in a real system?
  • Indeed, people build consensus-based systems all
    the time

24
Solving consensus
  • Systems that solve consensus often use a
    membership service
  • This GMS functions as an oracle, a trusted status
    reporting function
  • Then consensus protocol involves a kind of
    2-phase protocol that runs over the output of the
    GMS
  • It is known precisely when such a solution will
    be able to make progress

25
More bad news
  • Consensus protocols don't scale!
  • Isis (virtual synchrony) new view protocol
  • Selects a leader; normally 2-phase, 3 if leader
    dies
  • Each phase is a 1-to-n multicast followed by an
    n-to-1 convergecast (can tolerate n/2-1 failures)
  • Paxos decree protocol
  • Basic protocol has no leader and could have
    rollbacks with probability linear in n
  • Faster-Paxos is isomorphic to the Isis view
    protocol (!)
  • ... both are linear in group size.
  • Regular Paxos might be O(n^2) because of rollbacks
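
As a rough back-of-the-envelope sketch of the "linear in group size" claim, the message count for a leader-based view change can be tallied as follows. The function names are hypothetical, and the count assumes the leader is one of the n members and that each multicast is sent as n - 1 point-to-point messages (hardware multicast would reduce the first term).

```python
def messages_per_phase(n):
    """One multicast from the leader to the other n - 1 members,
    then one reply from each of them back to the leader."""
    return (n - 1) + (n - 1)


def view_change_messages(n, leader_failed=False):
    """Two phases normally, three if the leader dies."""
    phases = 3 if leader_failed else 2
    return phases * messages_per_phase(n)
```

For a 5-member group this gives 16 messages in the normal case and 24 when the leader fails; the total grows linearly with n, and all of the replies converge on a single leader.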

26
Work-arounds?
  • Only run the consensus protocol in the group
    membership service or GMS
  • It has a small number of members, like 3-5
  • They run a protocol like the Isis one
  • Track membership (and other global state) on
    behalf of everything in the system as a whole
  • Scalability of consensus won't matter

27
But this is centralized
  • Recall our earlier discussion
  • Any central service running on behalf of the
    whole system will become burdened if the system
    gets big enough
  • Can we decentralize our GMS service?

28
GMS in a large system
Global events are inputs to the GMS
Output is the official record of events that
mattered to the system
GMS
29
Hierarchical, federated GMS
  • Quicksilver V2 (QS2) constructs a hierarchy of
    GMS state machines
  • In this approach, each event is associated with
    some GMS that owns the relevant official record

[Figure: a hierarchy of GMS state machines labeled GMS0, GMS1, GMS2]
30
Delegation of roles
  • One (important) use of the GMS is to track
    membership in our rule enforcement subsystem
  • But delegate responsibility for classes of
    actions to subsystems that can own and handle
    them locally
  • GMS reports the delegation events
  • In effect, it tells nodes in the system about the
    system configuration and about their roles
  • And as conditions change, it reports new events
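
One way to picture ownership and delegation is the sketch below. It is an illustration only, not the QS2 design: the `GMS` class, its `owns`/`delegates`/`record` fields, and the `door_access` category are all hypothetical.

```python
class GMS:
    """A membership service that owns some categories of events."""

    def __init__(self, name, owns=()):
        self.name = name
        self.owns = set(owns)
        self.delegates = {}  # category -> GMS now responsible for it
        self.record = []     # this GMS's official record of events

    def delegate(self, category, child):
        """Hand a category to a child and report the delegation as an event."""
        self.owns.discard(category)
        child.owns.add(category)
        self.delegates[category] = child
        self.record.append(("delegate", category, child.name))

    def submit(self, category, event):
        """Route an event to the owner of its category; return the owner's name."""
        if category in self.owns:
            self.record.append((category, event))
            return self.name
        if category in self.delegates:
            return self.delegates[category].submit(category, event)
        raise KeyError(f"no owner for category {category!r}")


gms0 = GMS("GMS0", owns={"membership", "door_access"})
gms1 = GMS("GMS1")
gms0.delegate("door_access", gms1)
owner = gms0.submit("door_access", "badge 1234 entered")
```

After the delegation, GMS0's record contains only the delegation event; the door event itself is recorded by GMS1. A node that has already learned of the delegation could call `gms1.submit` directly and bypass GMS0 entirely.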

31
Delegation
"In my capacity as President of the United States,
I authorize John Pigg to oversee this nation's
banks"
"Thank you, sir! You can trust me"
32
Delegation
GMS0
GMS1
Policy subsystem
33
Delegation example
  • IBM might delegate the handling of access to its
    Kingston facility to the security scanners at the
    doors
  • Events associated with Kingston access don't need
    to pass through the GMS
  • Instead, they exist entirely within the group
    of security scanners

34
giving rise to pub/sub groups
  • Our vision spawns lots and lots of groups that
    own various aspects of trust enforcement
  • The scanners at the doors
  • The security subsystems on our desktops
  • The key management system for a VPN
  • etc
  • A nice match with publish-subscribe

35
Publish-subscribe in a nutshell
  • Publish(topic, message)
  • Subscribe(topic, handler)
  • Basic idea
  • Platform invokes handler(message) each time a
    topic match arises
  • Fancier versions also support history mechanisms
    (lets joining process catch up)
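
The two calls above can be sketched in a few lines. This is a single-process toy, assuming an in-memory `PubSub` class and a `replay` flag invented here to stand in for the history mechanism; it is not the Quicksilver API.

```python
from collections import defaultdict


class PubSub:
    def __init__(self):
        self.handlers = defaultdict(list)  # topic -> subscribed handlers
        self.history = defaultdict(list)   # topic -> messages seen so far

    def publish(self, topic, message):
        """Record the message and invoke every handler subscribed to the topic."""
        self.history[topic].append(message)
        for handler in list(self.handlers[topic]):
            handler(message)

    def subscribe(self, topic, handler, replay=False):
        """Register a handler; optionally replay past messages so it can catch up."""
        if replay:
            for message in self.history[topic]:
                handler(message)
        self.handlers[topic].append(handler)


bus = PubSub()
seen = []
bus.publish("rules", "rule 1")                     # published before anyone joins
bus.subscribe("rules", seen.append, replay=True)   # joiner catches up
bus.publish("rules", "rule 2")                     # delivered live
```

The late subscriber ends up with both messages, in publication order.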

36
Publish-subscribe in a nutshell
  • Concept first mentioned by Willy Zwaenepoel in a
    paper on multicast in the V system
  • First implementation was Frank Schmuck's Isis
    news tool
  • Later re-invented in TIB message bus
  • Also known as event notification; very popular

37
Other kinds of published events
  • Changes in the user set
  • For example, IBM hired Sally. Jeff left his job
    at CIA. Halliburton snapped him up
  • Or the group set
  • Jeff will be handling the Iraq account
  • Or the rules
  • Jeff will have access to the secret archives
  • Sally is no longer allowed to access them

38
But this raises problems
  • If actors only have partial knowledge
  • E.g. the Cornell library door access system only
    knows things normally needed by that door
  • then we will need to support out-of-band
    interrogation of remote policy databases in some
    cases
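
A minimal sketch of partial knowledge with an out-of-band fallback, assuming a hypothetical `PolicyDB` class in which each database holds a subset of rules and may forward unknown requests to a remote database. A real deployment would also need authentication and failure handling, which are omitted here.

```python
class PolicyDB:
    def __init__(self, rules, remote=None):
        self.rules = dict(rules)  # (actor, object, action) -> "Permit" / "Deny"
        self.remote = remote      # database to interrogate when we lack a rule
        self.remote_queries = 0

    def decide(self, actor, obj, action):
        key = (actor, obj, action)
        if key in self.rules:
            return self.rules[key]  # answered from local knowledge
        if self.remote is not None:
            self.remote_queries += 1  # out-of-band interrogation
            return self.remote.decide(actor, obj, action)
        return "Deny"  # assumption: default-deny when nobody knows


central = PolicyDB({
    ("Sally", "library_door", "Open"): "Permit",
    ("Ted", "library_door", "Open"): "Permit",
})
# The door only holds the rules it normally needs.
door = PolicyDB({("Sally", "library_door", "Open"): "Permit"}, remote=central)

local_answer = door.decide("Sally", "library_door", "Open")
remote_answer = door.decide("Ted", "library_door", "Open")
```

Sally's request is answered locally; Ted's requires one query to the central database.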

39
A Scalable Trust Architecture
[Figure: enterprise policy system for some company or entity. Labels: "GMS hierarchy tracks configuration events" (three GMS boxes), "Pub/sub framework", "Role delegation", "Slave system applies policy", "Knowledge limited to locally useful policy", "Master enterprise policy DB", "Central database tracks overall policy"]
40
A Scalable Trust Architecture
  • Enterprises talk to one another when decisions
    require non-local information

[Figure labels: "PeopleSoft", "Inquiry", "FBI", "(policy)", "Cornell University"]
41
www.zombiesattackithaca.com
42
Open questions?
  • Minimal trust
  • A problem reminiscent of zero-knowledge
  • Example
  • FBI is investigating reports of zombies in
    Cornell's Mann Library. Mulder is assigned to the
    case.
  • The Cornell Mann Library must verify that he is
    authorized to study the situation
  • But does FBI need to reveal to Cornell that the
    Cigarette Man actually runs the show?

43
Other research questions
  • Pub-sub systems are organized around topics, to
    which applications subscribe
  • But in a large-scale security policy system, how
    would one structure these topics?
  • Topics are like file names: paths
  • But we still would need an agreed-upon layout
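
If topics are treated as paths, subscriptions can use path patterns. The sketch below is one possible convention, not something the slides specify: the `topic_matches` function, the `*` and trailing `**` wildcards, and the `enterprise/site/subsystem` layout in the examples are all assumptions.

```python
def topic_matches(pattern, topic):
    """Match a path-like topic against a pattern.

    "*" matches exactly one segment; a trailing "**" matches any
    remaining segments, including none.
    """
    p = pattern.strip("/").split("/")
    t = topic.strip("/").split("/")
    for i, seg in enumerate(p):
        if seg == "**":
            return i == len(p) - 1  # only supported as the last segment
        if i >= len(t) or (seg != "*" and seg != t[i]):
            return False
    return len(p) == len(t)


# Hypothetical layout: enterprise/site/subsystem
all_library = topic_matches("cornell/library/**", "cornell/library/mann/doors")
any_site_doors = topic_matches("cornell/*/doors", "cornell/library/doors")
too_deep = topic_matches("cornell/*/doors", "cornell/library/mann/doors")
```

Whatever convention is chosen, publishers and subscribers still have to agree on what each path level means.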

44
Practical research question
  • State transfer is the problem of initializing a
    database or service when it joins the system
    after an outage
  • How would we implement a rapid and secure state
    transfer, so that a joining security policy
    enforcement module can quickly come up to date?
  • Once it's online, the pub-sub system reports
    updates on topics that matter to it
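
One common shape for state transfer is "snapshot plus numbered updates". The sketch below assumes a hypothetical `PolicyService` class in which every update carries a version number and updates arrive in order without gaps; it does not address the security of the transfer, which is the open question posed above.

```python
import copy


class PolicyService:
    def __init__(self):
        self.rules = {}
        self.version = 0  # version of the last update applied

    def apply(self, version, key, value):
        """Apply one numbered update; skip anything the state already reflects."""
        if version <= self.version:
            return False
        self.rules[key] = value
        self.version = version
        return True

    def snapshot(self):
        return self.version, copy.deepcopy(self.rules)

    @classmethod
    def from_snapshot(cls, snap):
        service = cls()
        service.version, service.rules = snap[0], copy.deepcopy(snap[1])
        return service


primary = PolicyService()
primary.apply(1, "Ted", "Permit")
primary.apply(2, "Sally", "Deny")
snap = primary.snapshot()           # taken while the joiner is offline
primary.apply(3, "Jeff", "Permit")  # update that arrives after the snapshot

joiner = PolicyService.from_snapshot(snap)
# Updates may overlap with the snapshot; version numbers make replay idempotent.
replayed = [joiner.apply(v, k, val)
            for v, k, val in [(2, "Sally", "Deny"), (3, "Jeff", "Permit")]]
```

The joiner ignores update 2, which the snapshot already covers, applies update 3, and ends in the same state as the primary.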

45
Practical research question
  • Designing secure protocols for inter-enterprise
    queries
  • This could draw on the secured Internet
    transaction architecture
  • A hierarchy of credential databases
  • Used to authenticate enterprises to one another
    so that they can share keys
  • They employ the keys to secure queries

46
Recap?
  • We've suggested that scalable trust comes down to
    emulation of a trusted single-node rule
    enforcement service by a distributed service
  • And that service needs to deal with dynamics such
    as changing actor set, object set, rule set,
    group membership

47
Recap?
  • Concerns that any single node
  • Would be politically unworkable
  • Would impose a maximum capacity limit
  • Won't be fault-tolerant
  • ... pushed for a decentralized alternative
  • Needed to make a decentralized service emulate a
    centralized one

48
Recap?
  • This led us to recognize that our problem is an
    instance of an older problem: replication of a
    state machine or an abstract data type
  • The problem reduces to consensus and hence is
    impossible
  • ... but we chose to accept "Mission Impossible V"

49
Impossible? Who cares!
  • We decided that the impossibility results were
    irrelevant to real systems
  • Federation addressed by building a hierarchy of
    GMS services
  • Each supported by a group of servers
  • Each GMS owns a category of global events
  • Now can create pub/sub topics for the various
    forms of information used in our decentralized
    policy database
  • enabling decentralized policy enforcement

50
QS2: A work in progress
  • We're building Quicksilver, V2 (aka QS2)
  • Under development by Krzys Ostrowski at Cornell,
    with help from Ken Birman, Danny Dolev (HUJI)
  • Some parts already exist and can be downloaded
    now
  • Quicksilver Scalable Multicast (QSM).
  • Focus is on reliable and scalable message
    delivery even with huge numbers of groups or
    severe stress on the system

51
Quicksilver Architecture
  • Our solution
  • Assumes low latencies, IP multicast
  • A layered platform, native hosting on .NET

Applications (any language)
Quicksilver pub-sub API
our platform
GMS
Strongly-typed .NET group endpoints
Properties Framework endows groups with stronger
properties
Quicksilver Scalable Multicast (C# / .NET)
52
Quicksilver: Major ideas
  • Maps overlapping groups down to regions
  • Engineering challenge: application may belong to
    thousands of groups; efficiency of mapping is key
  • Multicast is done by IP multicast, per-region
  • Discovers failures using circulating tokens
  • Local repair avoids overloading sender
  • Eventually will support strong reliability model
    too
  • Novel rate limited sending scheme

53
Members of a region have similar group
membership
QSM runs protocols that aggregate over regions,
improving scalability
In traditional group multicast systems, groups
run independently
Hierarchical aggregation used for groups that
span multiple regions
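
The idea of mapping overlapping groups down to regions can be sketched as follows. This is a simplification and an assumption, not QSM's actual algorithm: the slides say members of a region have "similar" group membership, while the hypothetical `regions_from_membership` function below groups only nodes whose membership is identical.

```python
from collections import defaultdict


def regions_from_membership(membership):
    """Put nodes with exactly the same set of groups into one region."""
    regions = defaultdict(list)
    for node, groups in sorted(membership.items()):
        regions[frozenset(groups)].append(node)
    return dict(regions)


def regions_for_group(regions, group):
    """Regions a multicast to `group` must reach."""
    return [nodes for groups, nodes in regions.items() if group in groups]


membership = {
    "n1": {"A", "B"},
    "n2": {"A", "B"},
    "n3": {"B"},
    "n4": {"A", "B"},
    "n5": {"B"},
}
regions = regions_from_membership(membership)
targets_for_b = regions_for_group(regions, "B")
```

Five nodes in two overlapping groups collapse into two regions; a multicast to group B is sent once per region rather than once per group member, and group A maps onto a single region.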
55
Connections to type theory
  • We're developing a new high-level language for
    endowing groups with types
  • Such as security or reliability properties
  • Internally, QS2 will compile from this language
    down to protocols that amortize costs across
    groups
  • Externally, we are integrating QS2 types with
    types in the operating system / runtime
    environment (right now, Windows .NET)
  • Many challenging research topics in this area!
  • http://www.cs.cornell.edu/projects/quicksilver/

56
Open questions?
  • Not all policy databases are amenable to a
    decentralized enforcement
  • Must have enough information at the point of
    enforcement to construct proofs
  • Is this problem tractable? Complexity?
  • More research is needed on the question of
    federation of policy databases with minimal
    disclosure

57
Open questions?
  • We lack a constructive logic of distributed,
    fault-tolerant systems
  • Part of the issue is exemplified by the FLP
    problem: logic has yet to deal with the
    pragmatics of real-world systems
  • Part of the problem resides in type theory: we
    lack true distributed type mechanisms