Scalable Trusted Computing: Engineering challenge, or something more fundamental?

About This Presentation

Description:

Scalable Trusted Computing: Engineering challenge, or something more fundamental? Ken Birman, Cornell University

Slides: 58

Provided by: csCornell96

Learn more at: http://www.cs.cornell.edu

Transcript and Presenter's Notes

Title: Scalable Trusted Computing: Engineering challenge, or something more fundamental?

1
Scalable Trusted Computing: Engineering challenge,
or something more fundamental?

Ken Birman
Cornell University

2
Cornell Quicksilver Project

Krzys Ostrowski: the key player
Ken Birman, Danny Dolev: collaborators and
research supervisors
Mahesh Balakrishnan, Maya Haridasan, Tudor
Marian, Amar Phanishayee, Robbert van Renesse,
Einar Vollset, Hakim Weatherspoon: offered
valuable comments and criticisms

3
Trusted Computing

A vague term with many meanings
For individual platforms, integrity of the
computing base
Availability and exploitation of TPM h/w
Proofs of correctness for key components
Security policy specification, enforcement
Scalable trust issues arise mostly in distributed
settings

4
System model

A world of
Actors: Sally, Ted, ...
Groups: Sally_Advisors = {Ted, Alice, ...}
Objects: travel_plans.html, investments.xls
Actions: Open, Edit, ...
Policies:
(Actor, Object, Action) → {Permit, Deny}
Places: Ted_Desktop, Sally_Phone, ...

5
Rules

If Emp.place ∈ Secure_Places and Emp ∈
Client_Advisors then Allow Open
Client_Investments.xls
Can Ted, working at Ted_Desktop, open
Sally_Investments.xls?
... yes, if Ted_Desktop ∈ Secure_Places
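
The rule on this slide can be sketched as a small membership check. This is only an illustration: the set contents and the function name `may_open` are assumptions, and the group membership is taken from the System model slide.

```python
# Hypothetical data; the slides give only the rule in prose.
SECURE_PLACES = {"Ted_Desktop"}
GROUPS = {"Sally_Advisors": {"Ted", "Alice"}}


def may_open(emp, place, client):
    """Allow Open Client_Investments.xls only if the employee is at a
    secure place and belongs to the client's advisor group."""
    advisors = GROUPS.get(f"{client}_Advisors", set())
    return place in SECURE_PLACES and emp in advisors


# Ted at his desktop may open Sally's file; Ted at an unlisted place may not.
print(may_open("Ted", "Ted_Desktop", "Sally"))
print(may_open("Ted", "Sally_Phone", "Sally"))
```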

6
Miscellaneous stuff

Policy changes all the time
Like a database receiving updates
E.g. as new actors are added, old ones leave the
system, etc.
... and they have a temporal scope
Starting at time t=19 and continuing until now,
Ted is permitted to access Sally's file
investments.xls
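
One way to picture a policy with temporal scope is a grant record carrying a validity interval. The sketch below is an assumption about representation, not something the slides specify: the `Grant` class, its field names, and the use of integer timestamps are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Grant:
    actor: str
    obj: str
    action: str
    start: int                 # first time at which the grant holds
    end: Optional[int] = None  # None means "continuing until now"


def permitted(grants, actor, obj, action, t):
    """True if some grant covers (actor, obj, action) at time t."""
    return any(
        g.actor == actor and g.obj == obj and g.action == action
        and g.start <= t and (g.end is None or t < g.end)
        for g in grants
    )


grants = [Grant("Ted", "investments.xls", "Open", start=19)]
print(permitted(grants, "Ted", "investments.xls", "Open", 25))
print(permitted(grants, "Ted", "investments.xls", "Open", 10))
```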

7
Order dependent decisions

Consider rules such as
Only one person can use the cluster at a time.
The meeting room is limited to three people
While people lacking clearance are present, no
classified information can be exposed
These are sensitive to the order in which
conflicting events occur
Central clearinghouse decides what to allow
based on order in which it sees events
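
A minimal sketch of such a clearinghouse, using the three-person meeting room rule. The class and method names are hypothetical; the point is only that the same set of requests yields different decisions depending on the order in which the clearinghouse sees them.

```python
class Clearinghouse:
    """Decides enter/leave requests strictly in the order it sees them."""

    def __init__(self, capacity=3):
        self.capacity = capacity
        self.inside = set()

    def submit(self, event, person):
        if event == "enter":
            allowed = person in self.inside or len(self.inside) < self.capacity
            if allowed:
                self.inside.add(person)
            return allowed
        if event == "leave":
            self.inside.discard(person)
            return True
        raise ValueError(f"unknown event: {event}")


def run(order):
    """Feed a sequence of 'enter' requests to a fresh clearinghouse."""
    ch = Clearinghouse(capacity=3)
    return {person: ch.submit("enter", person) for person in order}


# Same four requests, two arrival orders, different people turned away.
print(run(["Sally", "Ted", "Alice", "Jeff"]))
print(run(["Sally", "Ted", "Jeff", "Alice"]))
```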

8
Goal: Enforce policy
[Figure labels: investments.xls, Read, (data), Policy Database]
9
reduction to a proof

Each time an action is attempted, system must
develop a proof either that the action should be
blocked or allowed
For example, might use the BAN logic
For the sake of argument, let's assume we know
how to do all this on a single machine

10
Implications of scale

We'll be forced to replicate and decentralize the
policy enforcement function
For ownership: allows local policy to be stored
close to the entity that owns it
For performance and scalability
For fault-tolerance

11
Decentralized policy enforcement
[Figure, original scheme: investments.xls, Read, (data), Policy Database]
12
Decentralized policy enforcement
[Figure, new scheme: investments.xls, Read, (data), Policy DB 1, Policy DB 2]
13
So how do we decentralize?

Consistency: the bane of decentralization
We want a system to behave as if all decisions
occur in a single rules database
Yet want the decisions to actually occur in a
decentralized way: a replicated policy database
System needs to handle concurrent events in a
consistent manner

14
So how do we decentralize?

More formally
Analogy: database 1-copy serializability

Any run of the decentralized system should be
indistinguishable from some run of a centralized
system
15
But this is a familiar problem!

Database researchers know it as the atomic commit
problem.
Distributed systems people call it
State machine replication
Virtual synchrony
Paxos-style replication
and because of this we know a lot about the
question!

16
replicated data with abcast

Closely related to the atomic broadcast problem
within a group
Abcast sends a message to all the members of a
group
Protocol guarantees order, fault-tolerance
Solves consensus
Indeed, a dynamic policy repository would need
abcast if we wanted to parallelize it for speed
or replicate it for fault-tolerance!
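
To see why ordering matters for a replicated policy repository, consider replicas that each apply a stream of updates deterministically. This is not an abcast implementation, only a sketch of the property abcast provides: if every replica sees the same sequence, all end in the same state, while replicas that see conflicting updates in different orders can diverge. The update format and function names are assumptions.

```python
def apply_update(state, update):
    """Apply one policy update and return the new state (no mutation)."""
    op, key, value = update
    new_state = dict(state)
    if op == "set":
        new_state[key] = value
    elif op == "del":
        new_state.pop(key, None)
    else:
        raise ValueError(f"unknown op: {op}")
    return new_state


def replay(updates):
    """State of a replica that applied the updates in the given order."""
    state = {}
    for update in updates:
        state = apply_update(state, update)
    return state


key = ("Ted", "investments.xls", "Open")
permit = ("set", key, "Permit")
deny = ("set", key, "Deny")

# Same delivery order at both replicas: identical state.
print(replay([permit, deny]) == replay([permit, deny]))
# Different delivery orders: the replicas disagree about Ted's access.
print(replay([permit, deny]), replay([deny, permit]))
```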

17
A slight digression

Consensus is a classical problem in distributed
systems
N processes
They start execution with inputs ∈ {0,1}
Asynchronous, reliable network
At most 1 process fails by halting (crash)
Goal: a protocol whereby all decide the same value v,
and v was an input

18
Distributed Consensus
Jenkins, if I want another yes-man, I'll build
one!
Lee Lorenz, Brent Sheppard
19
Asynchronous networks

No common clocks or shared notion of time (local
ideas of time are fine, but different processes
may have very different clocks)
No way to know how long a message will take to
get from A to B
Messages are never lost in the network

20
Fault-tolerant protocol

Collect votes from all N processes
At most one is faulty, so if one doesn't respond,
count that vote as 0
Compute majority
Tell everyone the outcome
They decide (they accept outcome)
but this has a problem! Why?

21
What makes consensus hard?

Fundamentally, the issue revolves around
membership
In an asynchronous environment, we can't detect
failures reliably
A faulty process stops sending messages, but a
slow message might confuse us
Yet when the vote is nearly a tie, this confusing
situation really matters
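
The near-tie case can be made concrete with the vote-counting rule from the previous slide (a missing vote counts as 0). In this sketch the function name and the five-process scenario are assumptions; it only illustrates that the outcome flips depending on whether a slow but correct process is treated as failed.

```python
def naive_decide(votes_received, n):
    """Majority vote over n processes; any vote that has not arrived
    when the collector stops waiting is counted as 0."""
    ones = sum(votes_received.values())
    zeros = n - ones  # explicit 0 votes plus missing votes
    return 1 if ones > zeros else 0


# True inputs: three processes vote 1, two vote 0.
all_votes = {"p1": 1, "p2": 1, "p3": 1, "p4": 0, "p5": 0}

# p3 is merely slow, but the collector gives up on it.
without_p3 = {pid: v for pid, v in all_votes.items() if pid != "p3"}

print(naive_decide(all_votes, n=5))
print(naive_decide(without_p3, n=5))
```

Because the collector cannot tell a crashed process from a slow one, it has no safe point at which to stop waiting, and the two choices above lead to different decisions.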

22
Some bad news

FLP result shows that fault-tolerant consensus
protocols always have non-terminating runs.
All of the mechanisms we discussed are equivalent
to consensus
Impossibility of non-blocking commit is a similar
result from database community

23
But how bad is this news?

In practice, these impossibility results don't
hold up so well
Both define "impossible" to mean "not always possible"
In fact, with probabilities, the FLP scenario is
of probability zero
... must ask: does a probability-zero result even
hold in a real system?
Indeed, people build consensus-based systems all
the time

24
Solving consensus

Systems that solve consensus often use a
membership service
This GMS functions as an oracle, a trusted status
reporting function
Then consensus protocol involves a kind of
2-phase protocol that runs over the output of the
GMS
It is known precisely when such a solution will
be able to make progress

25
More bad news

Consensus protocols don't scale!
Isis (virtual synchrony) new view protocol
Selects a leader; normally 2-phase, 3 if leader
dies
Each phase is a 1-to-n multicast followed by an n-to-1
convergecast (can tolerate n/2-1 failures)
Paxos decree protocol
Basic protocol has no leader and could have
rollbacks with probability linear in n
Faster-Paxos is isomorphic to the Isis view
protocol (!)
... both are linear in group size.
Regular Paxos might be O(n^2) because of rollbacks

26
Work-arounds?

Only run the consensus protocol in the group
membership service or GMS
It has a small number of members, like 3-5
They run a protocol like the Isis one
Track membership (and other global state) on
behalf of everything in the system as a whole
Scalability of consensus won't matter

27
But this is centralized

Recall our earlier discussion
Any central service running on behalf of the
whole system will become burdened if the system
gets big enough
Can we decentralize our GMS service?

28
GMS in a large system
Global events are inputs to the GMS
Output is the official record of events that
mattered to the system
GMS
29
Hierarchical, federated GMS

Quicksilver V2 (QS2) constructs a hierarchy of
GMS state machines
In this approach, each event is associated with
some GMS that owns the relevant official record

GMS0
GMS2
GMS1
30
Delegation of roles

One (important) use of the GMS is to track
membership in our rule enforcement subsystem
But delegate responsibility for classes of
actions to subsystems that can own and handle
them locally
GMS reports the delegation events
In effect, it tells nodes in the system about the
system configuration about their roles
And as conditions change, it reports new events

31
Delegation
In my capacity as President of the United States,
I authorize John Pigg to oversee this nation's
banks
Thank you, sir! You can trust me
32
Delegation
GMS0
GMS1
Policy subsystem
33
Delegation example

IBM might delegate the handling of access to its
Kingston facility to the security scanners at the
doors
Events associated with Kingston access don't need
to pass through the GMS
Instead, they exist entirely within the group
of security scanners
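
A rough sketch of the delegation idea: nodes learn from the GMS which group owns each class of event, then send events straight to that owner. The `Group` class, the `deliver` helper, and the event-class strings are hypothetical names chosen for illustration; QS2's actual interfaces are not described in these slides.

```python
class Group:
    """A GMS or a delegated subsystem that keeps the official record
    for the event classes it owns."""

    def __init__(self, name):
        self.name = name
        self.record = []

    def handle(self, event):
        self.record.append(event)


def deliver(delegations, root, event_class, event):
    """Send an event to the group that owns its class, falling back to
    the root GMS when nothing has been delegated."""
    owner = delegations.get(event_class, root)
    owner.handle(event)
    return owner.name


gms = Group("GMS0")
scanners = Group("Kingston_scanners")
delegations = {"kingston_access": scanners}  # as reported by the GMS

print(deliver(delegations, gms, "kingston_access", "Sally badge-in"))
print(deliver(delegations, gms, "new_hire", "IBM hired Sally"))
print(gms.record)  # the Kingston access event never reached GMS0
```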

34
giving rise to pub/sub groups

Our vision spawns lots and lots of groups that
own various aspects of trust enforcement
The scanners at the doors
The security subsystems on our desktops
The key management system for a VPN
etc
A nice match with publish-subscribe

35
Publish-subscribe in a nutshell

Publish(topic, message)
Subscribe(topic, handler)
Basic idea
Platform invokes handler(message) each time a
topic match arises
Fancier versions also support history mechanisms
(letting a joining process catch up)
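
The two calls above can be sketched in a few lines as an in-process toy. The `PubSub` class and the `replay` flag are assumptions made for illustration; a real platform would deliver messages across machines and add the ordering and reliability guarantees discussed earlier.

```python
from collections import defaultdict


class PubSub:
    """Toy in-process publish-subscribe bus with per-topic history."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self._history = defaultdict(list)

    def publish(self, topic, message):
        self._history[topic].append(message)
        for handler in list(self._handlers[topic]):
            handler(message)

    def subscribe(self, topic, handler, replay=False):
        # With replay=True a late joiner first receives past messages.
        if replay:
            for message in self._history[topic]:
                handler(message)
        self._handlers[topic].append(handler)


bus = PubSub()
early, late = [], []
bus.subscribe("users", early.append)
bus.publish("users", "IBM hired Sally")
bus.subscribe("users", late.append, replay=True)
bus.publish("users", "Jeff left his job at CIA")
print(early)
print(late)
```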

36
Publish-subscribe in a nutshell

Concept first mentioned by Willy Zwaenepoel in a
paper on multicast in the V system
First implementation was Frank Schmuck's Isis
news tool
Later re-invented in TIB message bus
Also known as event notification; very popular

37
Other kinds of published events

Changes in the user set
For example, IBM hired Sally. Jeff left his job
at CIA. Halliburton snapped him up
Or the group set
Jeff will be handling the Iraq account
Or the rules
Jeff will have access to the secret archives
Sally is no longer allowed to access them

38
But this raises problems

If actors only have partial knowledge
E.g. the Cornell library door access system only
knows things normally needed by that door
then we will need to support out-of-band
interrogation of remote policy databases in some
cases

39
A Scalable Trust Architecture
GMS hierarchy tracks configuration events
GMS
GMS
GMS
Pub/sub framework
Role delegation
Slave system applies policy
Master enterprise policy DB
Knowledge limited to locally useful policy
Central database tracks overall policy
Enterprise policy system for some company or
entity
40
A Scalable Trust Architecture

Enterprises talk to one another when decisions
require non-local information

PeopleSoft
Inquiry
FBI
(policy)
Cornell University
41
www.zombiesattackithaca.com
42
Open questions?

Minimal trust
A problem reminiscent of zero-knowledge
Example
FBI is investigating reports of zombies in
Cornell's Mann Library. Mulder is assigned to the
case.
The Cornell Mann Library must verify that he is
authorized to study the situation
But does FBI need to reveal to Cornell that the
Cigarette Man actually runs the show?

43
Other research questions

Pub-sub systems are organized around topics, to
which applications subscribe
But in a large-scale security policy system, how
would one structure these topics?
Topics are like file names: paths
But we still would need an agreed upon layout
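
If topics are treated as paths, one possible convention is that a subscription to a path prefix receives everything published beneath it. The slash-separated layout and the example topic names below are purely hypothetical; the slides leave the layout as an open question.

```python
def topic_matches(subscription, topic):
    """True if the topic lies at or below the subscribed path.
    Matching is per path component, not per character."""
    sub_parts = subscription.strip("/").split("/")
    topic_parts = topic.strip("/").split("/")
    return topic_parts[:len(sub_parts)] == sub_parts


print(topic_matches("ibm/kingston", "ibm/kingston/doors/east"))
print(topic_matches("ibm/kingston", "ibm/kingstonx/doors"))
print(topic_matches("ibm/kingston/doors", "ibm/kingston"))
```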

44
Practical research question

State transfer is the problem of initializing a
database or service when it joins the system
after an outage
How would we implement a rapid and secure state
transfer, so that a joining security policy
enforcement module can quickly come up to date?
Once it's online, the pub-sub system reports
updates on topics that matter to it

45
Practical research question

Designing secure protocols for inter-enterprise
queries
This could draw on the secured Internet
transaction architecture
A hierarchy of credential databases
Used to authenticate enterprises to one another
so that they can share keys
They employ the keys to secure queries

46
Recap?

We've suggested that scalable trust comes down to
emulation of a trusted single-node rule
enforcement service by a distributed service
And that service needs to deal with dynamics such
as changing actor set, object set, rule set,
group membership

47
Recap?

Concerns that any single node
Would be politically unworkable
Would impose a maximum capacity limit
Won't be fault-tolerant
pushed for a decentralized alternative
Needed to make a decentralized service emulate a
centralized one

48
Recap?

This led us to recognize that our problem is an
instance of an older problem: replication of a
state machine or an abstract data type
The problem reduces to consensus and hence is
impossible
but we chose to accept Mission Impossible V

49
Impossible? Who cares!

We decided that the impossibility results were
irrelevant to real systems
Federation addressed by building a hierarchy of
GMS services
Each supported by a group of servers
Each GMS owns a category of global events
Now can create pub/sub topics for the various
forms of information used in our decentralized
policy database
enabling decentralized policy enforcement

50
QS2 A work in progress

We're building Quicksilver, V2 (aka QS2)
Under development by Krzys Ostrowski at Cornell,
with help from Ken Birman, Danny Dolev (HUJI)
Some parts already exist and can be downloaded
now
Quicksilver Scalable Multicast (QSM).
Focus is on reliable and scalable message
delivery even with huge numbers of groups or
severe stress on the system

51
Quicksilver Architecture

Our solution
Assumes low latencies, IP multicast
A layered platform, native hosting on .NET

Applications (any language)
Quicksilver pub-sub API
our platform
GMS
Strongly-typed .NET group endpoints
Properties Framework endows groups with stronger
properties
Quicksilver Scalable Multicast (C / .NET)
52
Quicksilver Major ideas

Maps overlapping groups down to regions
Engineering challenge: an application may belong to
thousands of groups, so efficiency of mapping is key
Multicast is done by IP multicast, per-region
Discovers failures using circulating tokens
Local repair avoids overloading sender
Eventually will support strong reliability model
too
Novel rate limited sending scheme

53
Members of a region have similar group
membership
QSM runs protocols that aggregate over regions,
improving scalability
In traditional group multicast systems, groups
run independently
Hierarchical aggregation used for groups that
span multiple regions
54
(No Transcript)
55
Connections to type theory

We're developing a new high-level language for
endowing groups with types
Such as security or reliability properties
Internally, QS2 will compile from this language
down to protocols that amortize costs across
groups
Externally, we are integrating QS2 types with
types in the operating system / runtime
environment (right now, Windows .NET)
Many challenging research topics in this area!
http://www.cs.cornell.edu/projects/quicksilver/

56
Open questions?

Not all policy databases are amenable to a
decentralized enforcement
Must have enough information at the point of
enforcement to construct proofs
Is this problem tractable? Complexity?
More research is needed on the question of
federation of policy databases with minimal
disclosure

57
Open questions?

We lack a constructive logic of distributed,
fault-tolerant systems
Part of the issue is exemplified by the FLP
problem: logic has yet to deal with the
pragmatics of real-world systems
Part of the problem resides in type theory: we
lack true distributed type mechanisms
