Title: Linking FUN and netWORKing: Personal speculations on making an impact
1 Linking FUN and netWORKing: Personal speculations
on making an impact
- Fritz Henglein
- DIKU, University of Copenhagen
- henglein_at_diku.dk
2 Future IT
- mobile, networked computing devices with and
without human interfaces (cell phones, PDAs,
terminals, hearing aids, light switches, RFID
chips, injection controllers, routers,
compute/storage servers)
3 Future IT-system demands
- All data accessible and updatable anywhere
(office, hallway, car), anytime (7/24), by anybody
with credentials, on any hardware platform
(mainframe, PC, PDA, cell phone), and quickly
(real-time)
- High fault tolerance: accessible even if a
portion of the computers and network is down.
4 Resultant future challenges
- Key problems:
- Providing reliable service using unreliable
network components
- Data become ever more intelligent (e.g., original
HTML → HTML w/ JavaScript → ActiveX controls/Java
applets): mobile data → mobile code
- Demands will be service-oriented, not
device-oriented (phone conversation, not a
telephone; TV transmission, not a TV set; file
service, not a file server): decoupling of message
from (particular) messenger
5 A SWOT analysis of FUN
- SWOT: Strengths, Weaknesses, Opportunities,
Threats
- Or:
- What are we good at?
- What are we not so good at?
- What are the others not so good at?
- What are the others good at?
6 Strengths
- Prediction: figuring out what might happen
before it happens, e.g., "this is a worm, this is
a virus, this is worthless."
- Type systems, program analysis, proof-carrying
code, model checking, theorem proving
- Value-oriented programming (VOP): good for
distributed programming (caching, sharing,
replication, coalescing, asynchronous computing,
concurrency/transactions)
- Developing simple (well-defined, abstract,
general) programming models
7 Weaknesses
- Models (theory and technology) for concurrency,
distribution, persistent storage
- A knack for backpedaling (figuring out how things
should have been done instead of figuring out new
applications)
- Obsession with perfection (generality,
correctness)
- Little concern for the masses
8 Opportunities
- SQL is not (yet) entrenched in programming
languages: high impedance
- Small networked devices operating as
swarms/peer-to-peer systems require new (small)
operating systems (opportunity for new designs)
- Importance of predictability, in particular
security, combined with increased data
intelligence
- Truly mobility-transparent programming languages
still missing
- Highly distributed reliable and predictable
systems, in particular peer-to-peer systems, are
very hard to build
9 Threats
- Programming is a network sport (value of network
O(n^2) or O(n log n)): the masses rule
- Conventional wisdom and technology flow in the
other direction: object-oriented, stateful
modeling, design and implementation technology
- Gigantic established update-oriented
infrastructure: hardware and OS designs are
stateful and create impedance for functional
programming
10 The Way Ahead
- Focus on exploiting quality criteria of
application
- Application-centered, not language-centered
approach: "Let's solve a difficult (new) problem!"
(not "Let's build a new general-purpose
programming language")
- Embrace and replace: leveraging existing
(competing) technology whilst working on making
it obsolete (e.g., SQL, call-outs, locks,
sockets, SOAP, UDDI...)
- KISS: good and user-friendly plus workarounds
instead of perfect.
11 The Way Ahead
- Find new chokepoint middleware (mitigate
OS/hardware impedance)
- Formalizing and integrating concurrency,
distribution, I/O, mobility, typing of
updatability
- Leapfrogging existing technologies and
applications
- real-time communication
- one-to-one (telephony),
- one-to-many (tv),
- many-to-many (games)
- archival communication (storing things)
- many-to-many (library)
12 The Way Ahead
- Maybe find collaborative applications of value
for academic world to reduce risk to innovation
by involving best-practice-oriented actors
- technical report management
- journal management
- the universal research library
- e-learning platform
- maybe grid computing,
- some area where P2P is valuable
- inter-enterprise business processes, incl.
digital government.
13 VOP: Value-oriented programming
- Programming with
- arbitrarily large values (immutable objects),
stored not only in RAM, but also on disk and on
the net
- location-independent value references (short,
probabilistically unique identifiers of values,
wherever they are stored); can be thought of as
light-weight proxies for actual (big) values
- plus small stateful cells (mutable objects) and
cell references, incl.
- wait-free registers with consensus number
infinity (e.g., compare-and-swap registers)
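A minimal Python sketch of the value / value-reference idea above. It assumes (this is not stated on the slides) that a value reference is the SHA-256 digest of the value's bytes and that each store is an in-memory dictionary; `ValueStore`, `save`, and `load` are illustrative names, not the Plan-X API.

```python
import hashlib


class ValueStore:
    """Toy store for immutable values, keyed by a hash of their content."""

    def __init__(self):
        self._values = {}  # value reference -> bytes

    def save(self, value: bytes) -> str:
        # Assumption: the reference is the SHA-256 digest of the content,
        # which makes it short, probabilistically unique, and
        # independent of where the value is stored.
        ref = hashlib.sha256(value).hexdigest()
        self._values.setdefault(ref, value)  # identical values are stored once
        return ref

    def load(self, ref: str) -> bytes:
        return self._values[ref]

    def __len__(self):
        return len(self._values)


store_a = ValueStore()
store_b = ValueStore()  # stands in for a store on a different machine

book = b"<book><title>Mit smukke lig</title></book>"
ref1 = store_a.save(book)
ref2 = store_b.save(book)
ref3 = store_a.save(book)  # saving again coalesces with the first copy
```

Because the reference depends only on the content, both stores hand out the same reference, and either of them can serve a later `load`.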
14 Benefits/goals
- Value references
- efficient sharing of immutable data
- location-independent
- efficient message passing
- Arbitrarily large values
- programmatic support of efficient atomic update:
build new (global) state as value, then perform
update atomically by assigning value reference of
new value to register holding present state.
- Small registers
- guaranteeing atomic update, with no (or minimal)
locking
- wait-freeness ensures consensus and progress (no
process gets blocked forever or for too long) of
each client, even in the face of partial failures
elsewhere
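The "build new state as a value, then swap the reference" recipe can be sketched as follows. This is a hedged illustration only: `CasRegister` and `atomic_update` are hypothetical names, the lock merely emulates the atomicity a real compare-and-swap instruction provides, and the retry loop shown here is lock-free in spirit rather than wait-free.

```python
import threading


class CasRegister:
    """Small mutable cell offering only read and compare-and-swap."""

    def __init__(self, initial):
        self._ref = initial
        self._lock = threading.Lock()  # emulates hardware CAS atomicity

    def read(self):
        return self._ref

    def compare_and_swap(self, expected, new):
        with self._lock:
            if self._ref is expected:  # compare references, not contents
                self._ref = new
                return True
            return False


def atomic_update(register, transform):
    """Build the new state as a fresh value, then publish it with one CAS."""
    while True:
        old = register.read()
        new = transform(old)  # must not mutate `old`
        if register.compare_and_swap(old, new):
            return new
        # Another client published first; retry against its state.


state = CasRegister(())  # the global state is an immutable tuple


def worker(n):
    for i in range(100):
        atomic_update(state, lambda old: old + ((n, i),))


threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

No update is lost even though the workers never hold a lock around their own computation; a failed swap simply discards the unpublished value.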
15 Universal references
[Figure: a "book" value with fields "author" (Susanne Staun) and "title" (Mit smukke lig); labels: RAM, disk storage. Annotation: Never loaded from disk or network!]
16 Rationale for Plan-X
- Provide programming model for viewing the
network as a single computer.
- Provide configurable software platform
(reflective middleware) that provides services
for writing applications in this model.
- Encapsulate peer-to-peer routing, caching,
replication, etc., in middleware (happens behind
the scenes)
- Goal: making development of distributed, mobile
applications (almost) as simple as development of
single-computer systems.
17 Development model
main() app() blob()
18 Execution model
main() app() blob()
19 XML Store
- Peer-to-peer persistence manager for XML elements
with simple load/store/exec interface
- Peer-to-peer architecture
- Global name server for binding and rebinding
value references to human-readable names
- Rebinding: bindings are updated atomically.
- Presently no implementation of cells
20 Plan-X: P2P-based middleware for distributed,
mobile computing
- Routing: p2p communication/name service
- Distributed garbage collection
- Code as data, remote execution
- Data synchronization and merging
- XMLStore decorators: request buffering,
asynchronous access, caching, socket adapters,
replication
- XMLStore core components: disk driver, in-memory
implementation
21 Scenario 1: browser/server based communication
[Figure: client and server connected through the XML Value Store. Labels: "put XML doc. → ref X1F11F", "get X1F11F", "display X1F11F".]
Most of XML doc may already be stored locally
Transfers only those parts that are not already
in store
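How "transfers only those parts that are not already in store" can fall out of value references is sketched below. The encoding is an assumption made for illustration: each XML element is stored as a node containing the references of its children, references are SHA-256 digests, and plain dictionaries stand in for the client and server stores. `save_tree` and `fetch` are hypothetical helpers.

```python
import hashlib
import json


def save_tree(store, node):
    """Store a nested document and return the reference of its root.

    Leaves are strings; elements are (tag, [children]) tuples. Each node is
    stored under the hash of its tag plus its children's references.
    """
    if isinstance(node, str):
        payload = json.dumps(["text", node])
    else:
        tag, children = node
        payload = json.dumps(["elem", tag, [save_tree(store, c) for c in children]])
    ref = hashlib.sha256(payload.encode()).hexdigest()
    store[ref] = payload
    return ref


def fetch(ref, local, remote):
    """Copy the value behind `ref` into `local`; return how many nodes moved."""
    if ref in local:
        return 0  # the whole subtree is already present locally
    payload = remote[ref]  # simulated network transfer of one node
    moved = 1
    decoded = json.loads(payload)
    if decoded[0] == "elem":
        for child_ref in decoded[2]:
            moved += fetch(child_ref, local, remote)
    local[ref] = payload  # stored last: a present node implies present children
    return moved


server, client = {}, {}
v1 = save_tree(server, ("doc", [("ch", ["one"]), ("ch", ["two"])]))
first = fetch(v1, client, server)   # cold client: all 5 nodes are transferred
v2 = save_tree(server, ("doc", [("ch", ["one"]), ("ch", ["two, revised"])]))
second = fetch(v2, client, server)  # only the 3 changed nodes are transferred
```

The unchanged first chapter keeps its reference across versions, so the client recognizes it without any comparison of contents.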
22 Scenario 2: Replicated backup on the net
XML Store: files and programs, replicated,
multiple versions
- The clients themselves are XML Store peers!
- Each file's contents are stored only x (e.g., 10) times
23 Scenario 3: Dynamic application partitioning
Network
p calls f via the net
proc p(f)... call f(x,y) ...
proc f(u,v)... call f(u-1,v) ...
24 Dynamic application partitioning...
Network
p calls f locally
proc p(f)... call f(x,y) ...
proc f(u,v)... call f(u-1,v) ...
proc f(u,v)... call f(u-1,v) ...
f is replicated if there is enough space on the
phone while p is executing
25 Dynamic application partitioning...
Network
proc p(f)... call f(x,y) ...
proc f(u,v)... call f(u-1,v) ...
proc f(u,v)... call f(u-1,v) ...
DELETED
f is deleted by the phone system while p is
executing; p then calls f via the net again
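The three partitioning slides can be mimicked with a small dispatcher, sketched here under simplifying assumptions: two dictionaries stand in for the code held by the phone and by the server, a direct function call stands in for the network call, and `make_caller` is a hypothetical helper. The body of `f` is a placeholder; only the dispatch matters.

```python
def make_caller(name, phone_code, server_code, log):
    """Return a handle that runs `name` locally when a replica exists,
    and otherwise falls back to the (simulated) remote copy."""

    def call(*args):
        if name in phone_code:
            log.append("local")
            return phone_code[name](*args)
        log.append("remote")
        return server_code[name](*args)  # stands in for a call via the net

    return call


def p(f, x, y):
    # Mirrors `proc p(f) ... call f(x,y) ...`: p only holds a handle to f.
    return f(x, y)


log = []
server_code = {"f": lambda u, v: u * v}  # placeholder body for f
phone_code = {}
f = make_caller("f", phone_code, server_code, log)

r1 = p(f, 2, 10)                     # no replica on the phone: call via the net
phone_code["f"] = server_code["f"]   # f is replicated onto the phone
r2 = p(f, 2, 10)                     # same call site, now served locally
del phone_code["f"]                  # the phone system deletes f
r3 = p(f, 2, 10)                     # p calls f via the net again
```

The caller `p` is unchanged throughout; replication and deletion are safe to do at any time because `f` is an immutable value.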
26 More info
- Website: www.plan-x.org
- Courses/seminars on distribution and mobility,
XML, peer-to-peer systems
- Student projects: XMLStore, garbage collection,
plagiarism checking, data synchronization, etc.
- Java source for (P2P) XMLStore ready to play with
- Email: henglein_at_diku.dk
27 OOP, or rather imperative programming
- Basic model of programming
- primitive in-place update operations:
obj.field = obj2ref
- compound update operations: controlled sequential
execution of updates, e.g.,
for (int i = 0; i < arr.size; i++) arr[i] = newVal(i)
28 Imperative programming theme
- Goal: global state transition from State_0 to
State_n; State_0 is destroyed.
- Implementation (ephemeral state updates): a
sequence State_0 → ... → State_i → ... → State_n
of primitive state transitions, where
- each primitive update destroys the previous state
29 Consequence 1
- software component interfaces are state-oriented
and stateful
- which operations are available depends on history
of operations executed in the past
- responses from components depend on history of
operations executed
- Example: Unix file I/O
- NB: Operations on such components are not
necessarily atomic (or even recoverable)
30 Copy-and-update programming
- Note
- data get copied
- they are not always coherent
- they get copied again
input(f)
process(s)
output(f)
31 Why (and when) it works (well)
- no concurrent access to file
- sequential and synchronous programming (control
over sequence of state changes)
- no partial failures: atomic abort due to single
point of failure (single-process execution on
single processor)
- no replication of stateful data
- random access to location of data (rapid access
no matter where they are stored)
32 Consequence 2
- Software/hardware component APIs are
copy-oriented: data referenced by a pointer get
copied before being manipulated to ensure
integrity
- Example: Modern operating systems are based on
separation of address spaces; they require copying
of data or delegation of tasks ("ask the other
process to do something for me")
33 Properties of distributed (mobile) systems
- Partial failures
- can't even distinguish network failures from
computing node failures
- Concurrency
- Difficult (exact) synchronization of processes
- Widely varying access latency
- RPC may block for an arbitrarily long time
34 Techniques for battling these problems
- Caching, replication, memoization
- (buffered) asynchronous message passing
- relaxed or indeterminate semantics
- time-outs
- observational differences between processes
running on same machine or on different machines
Not good for mobile code!
35 Imperative programming: Problems
- caching and replication require heavy coherence
protocols, or different states are observable by
clients and users
- e.g., file save under NFS (wait for 30 seconds!!)
- atomic (commit of compound) update is difficult
to achieve in the presence of partial failures
- rollback is not naturally supported, but
normally required in situations where (atomic)
updates can fail
- coalescing identical data (storing data only
once) cannot be done (easily)
36 Imperative programming: Problems...
- programming is mostly synchronous to control
degree of nondeterminism due to concurrency
- access to storage locations is not random (no
modern file system does what's shown before)
- access to updateable objects is typically
location-based: mobile objects are not
naturally supported
- lots of data stored multiple times
37 ...but, of course
- Updateable objects are excellent for propagating
information to an arbitrary number of clients (to
any caller of the object; needn't even know or
keep track of the number or identity of callers)
38 Central problem
- ...not reading (loading)
- ...not writing (saving, allocating)
- but updating (overwriting)
Breaks commuting
Note: The more updating, the fewer operations
commute and the more their execution needs to be
controlled (synchronized).
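A small sketch of what "breaks commuting" means in practice. It assumes a content-addressed store (SHA-256 digests as references, a dictionary as the store) for the saves and a single variable as the updateable cell; `run_saves` and `run_updates` are illustrative names.

```python
import hashlib
from itertools import permutations


def run_saves(values):
    """Apply a sequence of saves to an empty content-addressed store."""
    store = {}
    for v in values:
        store[hashlib.sha256(v).hexdigest()] = v
    return store


def run_updates(values):
    """Apply a sequence of overwrites to a single mutable cell."""
    cell = None
    for v in values:
        cell = v  # each update destroys the previous state
    return cell


ops = [b"a", b"b", b"c"]

# Every ordering of the saves ends in the same store: saves commute.
save_outcomes = {frozenset(run_saves(order).items()) for order in permutations(ops)}

# The final cell content depends on which update ran last: updates do not commute.
update_outcomes = {run_updates(order) for order in permutations(ops)}
```

With three operations, the saves have one possible outcome and the updates have three, which is why only the updates need their order controlled.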
39 Updates vs. replication
- Deadlocks with replication: O(tps^2 · ta · a^5 ·
N^3 / (4 · o^2)), where N = number of replica
managers, a = average number of updates per
transaction (Gray, Helland, O'Neil, Shasha, 200x)
- GHOS focused on N^3, but notice a^5!
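To see why the a^5 term deserves attention, the expression can be evaluated for a few inputs. The slide defines only N and a; reading tps as transactions per second, ta as the time per action, and o as the number of objects in the database is an assumption here, and all input numbers below are made up purely to compare ratios.

```python
def relative_deadlock_rate(tps, action_time, actions, nodes, db_size):
    """Evaluate tps^2 * ta * a^5 * N^3 / (4 * o^2) from the slide.

    Parameter meanings other than `actions` (a) and `nodes` (N) are assumed.
    """
    return tps**2 * action_time * actions**5 * nodes**3 / (4 * db_size**2)


# Arbitrary baseline, used only as a reference point for the ratios.
base = relative_deadlock_rate(tps=100, action_time=0.01, actions=5, nodes=2, db_size=10_000)

# Doubling the number of replica managers N.
more_nodes = relative_deadlock_rate(tps=100, action_time=0.01, actions=5, nodes=4, db_size=10_000)

# Doubling the number of updates per transaction a.
more_updates = relative_deadlock_rate(tps=100, action_time=0.01, actions=10, nodes=2, db_size=10_000)

growth_from_nodes = more_nodes / base      # 2^3 = 8
growth_from_updates = more_updates / base  # 2^5 = 32
```

Doubling the replicas multiplies the expression by 8, but doubling the updates per transaction multiplies it by 32.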
40 XML Store: Basic interface
- Load value: Value load(ValueRef vr)
- Save value: ValueRef save(Value v)
- (That's it)
- Security/authentication not addressed yet
- extended access control based interface
- encrypted storage
41 XML Store: General interface
- Equip XML stores with the ability to receive and
apply any function to themselves, including any
values they store.
- Called the Visitor pattern in OO design
- Corresponds to unique homomorphism/type
elimination rule (fold) known from algebraic
datatypes/type theory (or to apply)
- Lets XML stores receive not only single
commands for execution (like load, save),
but whole programs.
- Allows implementation of
- query languages
- general remote processing (processing inside
the XML Store, e.g. for grid computing)
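A sketch of the general interface: besides load and save, the store accepts a whole program and applies it to itself. Everything here is an assumption for illustration, including the names `XmlStoreSketch`, `refs`, `execute`, and `titles_by`, the use of tuples as values, and the second (placeholder) record; in a distributed setting the program would itself be shipped as a value rather than passed as a Python callable.

```python
import hashlib


class XmlStoreSketch:
    """Toy store with load/save plus a generic `execute` entry point."""

    def __init__(self):
        self._values = {}

    def save(self, value: tuple) -> str:
        ref = hashlib.sha256(repr(value).encode()).hexdigest()
        self._values[ref] = value
        return ref

    def load(self, ref: str) -> tuple:
        return self._values[ref]

    def refs(self):
        return list(self._values)

    def execute(self, program):
        """Run a client-supplied function inside the store; return its result."""
        return program(self)


def titles_by(author):
    """Build a tiny query program that runs inside the store, next to the data."""

    def program(store):
        values = (store.load(ref) for ref in store.refs())
        return sorted(title for (who, title) in values if who == author)

    return program


store = XmlStoreSketch()
store.save(("Susanne Staun", "Mit smukke lig"))
store.save(("A. N. Other", "Placeholder title"))  # made-up record for contrast

result = store.execute(titles_by("Susanne Staun"))
```

Only the query result leaves the store; the stored values themselves are never shipped to the client.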
42 Program code as values
- Program code (software) = value: code can be
stored in the XML store.
- Remote execution then involves passing a value
reference for the code to the receiver. If the
receiver already has the corresponding value
(code), e.g. due to caching in the XML store, no
further communication is necessary; otherwise the
value is requested (pulled in) by the receiver.
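The pull-on-demand protocol described above, as a toy. Assumptions: code values are Python source strings that define a function called `main`, references are SHA-256 digests, and reading the sender's dictionary stands in for a network request. `Peer` is a hypothetical class, and running received source with `exec` ignores security entirely.

```python
import hashlib


class Peer:
    """Toy peer with a local value cache; `fetches` counts simulated pulls."""

    def __init__(self):
        self.cache = {}
        self.fetches = 0

    def save(self, source: str) -> str:
        ref = hashlib.sha256(source.encode()).hexdigest()
        self.cache[ref] = source
        return ref

    def run(self, code_ref, sender, *args):
        """Execute the code behind `code_ref`, pulling it from `sender` if absent."""
        if code_ref not in self.cache:
            self.cache[code_ref] = sender.cache[code_ref]  # simulated network pull
            self.fetches += 1
        namespace = {}
        exec(self.cache[code_ref], namespace)  # assumption: the code defines `main`
        return namespace["main"](*args)


alice, bob = Peer(), Peer()
ref = alice.save("def main(x):\n    return x * 2\n")

first = bob.run(ref, alice, 21)   # bob lacks the code and pulls it once
second = bob.run(ref, alice, 5)   # served from bob's cache, no communication
```

After the first call only the short reference crosses the wire; the cached copy can never be stale because the code is an immutable value.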
43 Basic P2P XML Store architecture
- Base configuration: each peer is a single
component made up of
- raw disk manager
- network proxy for group of remote XML-store peers
- group communication/routing amongst peers,
presently based on
- IP-multicast (Pedersen/Tejlgaard 2002), or
- Chord-routing protocol (Baumann/Fennestad/Thorn
2002)
- Kademlia-routing protocol (Ambus 2003)
44 Configurable XML Store architecture
- Goal: Client applications can construct XML
Stores by composing them from
- primitive XML stores (disk manager, in-RAM
manager, adapters to databases, file managers
etc.), and
- XML store constructors (decorators)
- caching reads and writes
- asynchronous load/save
- buffered load/save requests
- replicating store
- encryption/decryption
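The composition of primitive stores and decorators might look like the sketch below. All class names (`MemoryStore`, `CachingStore`, `ReplicatingStore`) and the SHA-256 reference scheme are assumptions for illustration, not the actual XMLStore classes; the wiring at the end follows the "replicator over caches" example.

```python
import hashlib


def ref_of(value: bytes) -> str:
    return hashlib.sha256(value).hexdigest()


class MemoryStore:
    """Primitive store holding values in RAM; counts loads for the demo."""

    def __init__(self):
        self.values = {}
        self.loads = 0

    def save(self, value: bytes) -> str:
        ref = ref_of(value)
        self.values[ref] = value
        return ref

    def load(self, ref: str) -> bytes:
        self.loads += 1
        return self.values[ref]


class CachingStore:
    """Decorator: remembers saved and loaded values in front of another store."""

    def __init__(self, inner):
        self.inner = inner
        self.cache = {}

    def save(self, value):
        ref = self.inner.save(value)
        self.cache[ref] = value
        return ref

    def load(self, ref):
        if ref not in self.cache:
            self.cache[ref] = self.inner.load(ref)
        return self.cache[ref]


class ReplicatingStore:
    """Decorator: writes to every replica, reads from the first that answers."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def save(self, value):
        refs = {replica.save(value) for replica in self.replicas}
        (ref,) = refs  # every replica derives the same reference from the content
        return ref

    def load(self, ref):
        for replica in self.replicas:
            try:
                return replica.load(ref)
            except KeyError:
                continue
        raise KeyError(ref)


backends = [MemoryStore() for _ in range(3)]
store = ReplicatingStore(CachingStore(b) for b in backends)
ref = store.save(b"<doc/>")
value = store.load(ref)  # answered by the first cache, no backend load needed
```

Since values never change, the caching decorator needs no invalidation logic, which is what keeps each decorator this small.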
45 Example: replicated XML Store with in-memory
caching
client application
replicator
cache
cache
cache
46 Example: XML Store with caching and asynchronous
writing to disk
client application
async store
cache
47 Ephemeral in-memory XML Store
client application
async store
in-memory store
ephemeral XML store: new XMLStore ... release
XMLStore
48 A simple challenge
- Write a little program that implements a
dictionary, e.g. for looking up phone numbers,
and inserting and updating records.
- It should work on the net (concurrent access).
- It should work for a while (also after the
machine has been taken down and restarted).
- Surprisingly more complex to program than the
routines you learned in algorithm class...
49 Summary
- Value-oriented model for manipulating
semistructured data
- supports light-weight caching, replication,
asynchronous computing in the XML middleware
- Configurable XML middleware (client can order the
properties one wants from the XML store)
- Separation of program logic (in the client code)
from generic deal
- Encourages clients to write transaction-safe code
programmatically