Title: Computer Science 425: Distributed Systems (CS 425 / CSE 424 / ECE 428)
1. Computer Science 425: Distributed Systems
CS 425 / CSE 424 / ECE 428
Indranil Gupta (Indy) August 23, 2007
2. Our Only Goal Today
- To define the term "Distributed System"
3. Can you name some examples of Operating Systems?
- Linux, WinXP, Unix, FreeBSD, Mac
- 2K, Aegis, Scout, Hydra, Mach, SPIN
- OS/2, Express, Flux, Hope, Spring
- AntaresOS, EOS, LOS, SQOS, LittleOS, TINOS
- PalmOS, WinCE, TinyOS
5. What is an Operating System?
- User interface to hardware (device driver)
- Provides abstractions (processes, file system)
- Resource manager (scheduler)
- Means of communication (networking)
7. FOLDOC definition
(FOLDOC: Free On-Line Dictionary of Computing)
- The low-level software which handles the
interface to peripheral hardware, schedules
tasks, allocates storage, and presents a default
interface to the user when no application program
is running.
- The OS may be split into a kernel which is always
present and various system programs which use
facilities provided by the kernel to perform
higher-level house-keeping tasks, often acting as
servers in a client-server relationship.
- Some would include a graphical user interface and
window system as part of the OS, others would
not. The operating system loader, BIOS, or other
firmware required at boot time or when installing
the operating system would generally not be
considered part of the operating system, though
this distinction is unclear in the case of a
rommable operating system such as RISC OS.
- The facilities an operating system provides and
its general design philosophy exert an extremely
strong influence on programming style and on the
technical cultures that grow up around the
machines on which it runs.
8. Can you name some examples of Distributed Systems?
- Client-Server
- The Web
- The Internet
- An ad-hoc network
- A sensor network
- DNS
- Kazaa (peer to peer overlays)
- (The Solar System?)
- (Society?)
- (Food Chain?)
10. What is a Distributed System?
11. FOLDOC definition
- A collection of (probably heterogeneous)
automata whose distribution is transparent to the
user so that the system appears as one local
machine. This is in contrast to a network, where
the user is aware that there are several
machines, and their location, storage
replication, load balancing and functionality is
not transparent. Distributed systems usually use
some kind of client-server organization.
12. Textbook definitions
- "A distributed system is a collection of
independent computers that appear to the users of
the system as a single computer." (Andrew Tanenbaum)
- "A distributed system is several computers doing
something together. Thus, a distributed system
has three primary characteristics: multiple
computers, interconnections, and shared state." (Michael Schroeder)
13. Unsatisfactory
- Why are these definitions short?
- Why do these definitions look inadequate to us?
- Because we are interested in the insides of a
distributed system:
- design and implementation
- maintenance
- study
- algorithmics (protocols)
14.
- "I shall not today attempt further to define the
kinds of material I understand to be embraced
within that shorthand description; and perhaps I
could never succeed in intelligibly doing so. But
I know it when I see it, and the motion picture
involved in this case is not that."
- Potter Stewart, Associate Justice, US Supreme
Court (talking about his interpretation of a
technical term laid down in the law, case
Jacobellis v. Ohio, 1964)
15. Which is a Distributed System: (A) or (B)?
(A) Plants and animals interacting in the food chain
16. (B) The Internet (Internet Mapping Project, color-coded by ISPs)
17. A working definition for us
- A distributed system is a collection of
entities, each of which is autonomous,
programmable, asynchronous and failure-prone, and
which communicate through an unreliable
communication medium.
- Our interest in distributed systems involves
design and implementation, maintenance, study,
and algorithmics.
- Entity = a process on a device (PC, PDA)
- Communication medium = wired or wireless network
18. A range of interesting problems for Distributed
System designers
- Basic concepts: asynchrony, consensus, ...
- Routing: IP, BGP
- Large-scale systems: the Grid, Gnutella, Kazaa
- Distributed file systems: NFS, AFS
- Protocols, e.g., multicast: IP multicast, SRM, RMTP
- Coordination: SETI@home, multiplayer online games
- Storage and databases
- Security
19. Distributed Systems Design Goals
- Common goals:
- Heterogeneity: can the system handle different
types of PCs and devices?
- Robustness: is the system resilient to host
crashes and failures, and to the network dropping
messages?
- Availability: are data and services always there
for clients?
- Transparency: can the system hide its internal
workings from the users?
- Concurrency: can the server handle multiple
clients simultaneously?
- Efficiency: is it fast enough?
- Scalability: can it handle 100 million nodes?
(nodes = clients and/or servers)
- Security: can the system withstand hacker
attacks?
- Openness: is the system extensible?
20. Distributed System Example -- the Internet
21. The Internet
- A vast interconnected collection of computer
networks of many types.
- Intranets: subnetworks operated by companies and
organizations.
- ISPs: companies that provide modem links and
other types of connections to users.
- Intranets are linked by backbones: network links
of large bandwidth, such as satellite
connections, fiber optic cables, and other
high-bandwidth circuits.
22. A Typical Intranet
23. Intranets
- Composed of several local area networks (LANs)
linked by backbone connections.
- Connected to the Internet via one or more routers.
- A firewall protects an intranet (from
the outside Internet) by preventing unauthorized
messages from leaving or entering; it is implemented by
filtering incoming and outgoing messages.
24. Internet Apps: Their Protocols and Transport
Protocols
Application              Application layer protocol          Underlying transport protocol
e-mail                   smtp [RFC 821]                      TCP
remote terminal access   telnet [RFC 854]                    TCP
Web                      http [RFC 2068]                     TCP
file transfer            ftp [RFC 959]                       TCP
streaming multimedia     proprietary (e.g., RealNetworks)    TCP or UDP
remote file server       NFS                                 TCP or UDP
Internet telephony       proprietary (e.g., Skype)           typically UDP
Implemented via network sockets: the basic
primitive that allows machines to send messages
to each other.
TCP = Transmission Control Protocol; UDP = User
Datagram Protocol
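To make the socket primitive concrete, here is a minimal sketch of two endpoints exchanging one message over TCP. It is an illustration only: both endpoints run on the same machine (127.0.0.1), and the helper names `run_echo_server` and `echo_once` are made up for this example rather than taken from the course material.

```python
import socket
import threading


def run_echo_server(server_sock: socket.socket) -> None:
    """Accept one connection and send back whatever the client sent."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


def echo_once(message: bytes) -> bytes:
    """Send `message` to a local echo server over TCP and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        # Port 0 asks the OS for any free port (a choice made for this demo).
        server_sock.bind(("127.0.0.1", 0))
        server_sock.listen(1)
        port = server_sock.getsockname()[1]

        server = threading.Thread(target=run_echo_server, args=(server_sock,))
        server.start()

        # The client side: create a socket, connect, send, receive.
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            reply = client.recv(1024)

        server.join()
    return reply
```

Calling `echo_once(b"hello")` sends the bytes to the server thread and returns what comes back. A real application protocol (smtp, http, ftp) is a set of rules about what bytes to put into such a connection.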
25. WWW: the HTTP Protocol
- HTTP: hypertext transfer protocol
- WWW's application layer protocol
- client/server model
- client: browser that requests, receives, and
displays WWW objects
- server: WWW server stores the website, and sends
objects in response to requests
- http1.0: RFC 1945
- http1.1: RFC 2068
[Figure: a PC running Explorer and a Mac running Navigator each send an http request to, and receive an http response from, a server running the cnn.com Web server]
26. The HTTP Protocol: More
- http: TCP transport service
- client initiates a TCP connection (creates
socket) to server, port 80
- server accepts the TCP connection from client
- http messages (application-layer protocol
messages) exchanged between browser (http client)
and WWW server (http server)
- TCP connection closed
- http is stateless
- server maintains no information about past client
requests
Aside:
- Protocols that maintain state are complex!
- past history (state) must be maintained
- if server/client crashes, their views of state
may be inconsistent, and hence must be
reconciled.
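The aside on state can be illustrated with a toy sketch. Everything here is hypothetical (the `stateless_handle` function, the `StatefulServer` class, the `documents` table); it is not how a real web server is written, only a way to see why a crash is harmless for a stateless handler and troublesome for a stateful one.

```python
def stateless_handle(request_path: str, documents: dict) -> str:
    """Reply depends only on the current request, never on earlier ones."""
    if request_path in documents:
        return "200 OK: " + documents[request_path]
    return "404 Not Found"


class StatefulServer:
    """Remembers how many requests each client has made (per-client state)."""

    def __init__(self) -> None:
        self.request_counts = {}

    def handle(self, client_id: str) -> int:
        self.request_counts[client_id] = self.request_counts.get(client_id, 0) + 1
        return self.request_counts[client_id]


documents = {"/index.html": "<html>hello</html>"}

# Stateless: the same request gets the same answer, before or after a restart.
first = stateless_handle("/index.html", documents)
second = stateless_handle("/index.html", documents)

# Stateful: the client has made two requests, then the server crashes.
server = StatefulServer()
server.handle("client-1")
server.handle("client-1")
server = StatefulServer()                      # simulated crash and restart
count_after_crash = server.handle("client-1")  # server's view restarts at 1
```

After the simulated restart the client believes it has made three requests while the server believes it has seen one. That mismatch is the inconsistency that a stateful protocol would have to reconcile.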
27. HTTP Example
- Suppose user enters URL www.cs.uiuc.edu/
(contains text, references to 10 jpeg images)
1a. http client initiates a TCP connection to
http server (process) at www.cs.uiuc.edu. Port 80
is default for http server.
1b. http server at host www.cs.uiuc.edu, waiting
for a TCP connection at port 80, accepts the
connection, notifying client
2. http client sends an http request message
(containing URL) into TCP connection socket
3. http server receives request message, forms a
response message containing requested object
(index.html), sends message into socket
28. HTTP Example (cont.)
4. http server closes the TCP connection.
5. http client receives a response message
containing html file, displays html. Parsing the
html file, it finds 10 referenced jpeg objects
6. Steps 1-5 are then repeated for each of 10
jpeg objects
- For fetching referenced objects, have 2 options:
- non-persistent connection: only one object
fetched per TCP connection
- some browsers create multiple TCP connections
simultaneously, one per object
- persistent connection: multiple objects
transferred within one TCP connection
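The two options can be compared with a rough back-of-the-envelope count. The cost model below is an assumption added for illustration and is not from the slides: one round trip to set up a TCP connection, one round trip per request/response, objects fetched one after another (no parallel connections, no pipelining), and transmission time ignored.

```python
def connections_needed(num_objects: int, persistent: bool) -> int:
    """Number of TCP connections opened to fetch all objects."""
    return 1 if persistent else num_objects


def serial_round_trips(num_objects: int, persistent: bool) -> int:
    """Round trips under the simplified cost model described above."""
    if persistent:
        # One TCP setup, then one request/response per object.
        return 1 + num_objects
    # Each object pays for its own TCP setup plus its request/response.
    return 2 * num_objects


# The example page: index.html plus 10 referenced jpeg images.
NUM_OBJECTS = 1 + 10

non_persistent = (connections_needed(NUM_OBJECTS, False),
                  serial_round_trips(NUM_OBJECTS, False))   # (11, 22)
persistent = (connections_needed(NUM_OBJECTS, True),
              serial_round_trips(NUM_OBJECTS, True))        # (1, 12)
```

Under these assumptions the non-persistent approach opens 11 connections and spends 22 round trips, while the persistent approach opens 1 connection and spends 12. Browsers that open several non-persistent connections at once trade extra connections for lower elapsed time, which this serial model does not capture.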
29. Trying Out HTTP (Client Side) for Yourself
1. Telnet to your favorite WWW server:
    telnet www.cnn.com 80
Opens TCP connection to port 80 (default http
server port) at www.cnn.com. Anything typed in is
sent to port 80 at www.cnn.com.
2. Type in a GET http request:
    GET /index.html HTTP/1.0
By typing this in (hit carriage return twice),
you send this minimal (but complete) GET request
to the http server.
3. Look at the response message sent by the http server!
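The same experiment can be sketched in a few lines of Python instead of telnet. The function names (`build_get_request`, `fetch`, `status_line`) are invented for this example. Whether a given server still answers a bare HTTP/1.0 request without a Host header is not guaranteed, so treat `fetch` as something to try rather than something certain to succeed.

```python
import socket


def build_get_request(path: str) -> bytes:
    """Build the minimal GET request from step 2.

    The two CRLF pairs correspond to hitting carriage return twice.
    """
    return f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")


def fetch(host: str, path: str, port: int = 80, timeout: float = 5.0) -> bytes:
    """Open a TCP connection, send the request, read until the server closes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_get_request(path))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:          # empty read: the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


def status_line(response: bytes) -> str:
    """Return the first line of an http response, e.g. the status line."""
    return response.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
```

For example, `status_line(fetch("www.cnn.com", "/index.html"))` would show the first line of whatever the server sends back. The host and path are the ones used on this slide and may behave differently today.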
30. Does our Working Definition work for the http
Web?
- A distributed system is a collection of
entities, each of which is autonomous,
programmable, asynchronous and failure-prone, and
that communicate through an unreliable
communication medium.
- Our interest in distributed systems involves
design and implementation, maintenance, study,
and algorithmics.
- Entity = a process on a device (PC, PDA)
- Communication medium = wired or wireless network
31. Important Distributed Systems Issues
- No global clock: no single global notion of the
correct time (asynchrony)
- Unpredictable failures of components: lack of
response may be due to either failure of a
network component, network path being down, or a
computer crash (failure-prone, unreliable)
- Highly variable bandwidth: from 16 Kbps to Tbps
- Possibly large and variable latency: a few ms to
several seconds
- Large numbers of hosts: 2 to several million
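The point about lack of response can be seen in a small simulation. This is a sketch with hypothetical names (`ping`, the `server_behavior` strings) that uses threads in one process to stand in for remote machines: a caller waiting with a timeout observes exactly the same thing whether the other side has crashed or is merely slow.

```python
import queue
import threading
import time


def ping(server_behavior: str, timeout: float) -> str:
    """Wait up to `timeout` seconds for a reply from a simulated server.

    server_behavior is "healthy", "slow", or "crashed" (labels made up
    for this sketch).
    """
    replies = queue.Queue()

    def server() -> None:
        if server_behavior == "crashed":
            return                      # never replies
        if server_behavior == "slow":
            time.sleep(timeout * 3)     # replies, but long after the caller gave up
        replies.put("pong")

    threading.Thread(target=server, daemon=True).start()
    try:
        return replies.get(timeout=timeout)
    except queue.Empty:
        return "no response"
```

`ping("slow", 0.05)` and `ping("crashed", 0.05)` both return `"no response"`, so the caller cannot tell the two cases apart from the timeout alone. Picking a longer timeout only moves the boundary; with large and variable latency there is no value that is always right.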
32. Important Issues
- If you're already complaining that the list of
topics we've discussed so far has been
perplexing...
- You're right!
- It was meant to be (perplexing)
- The goal for the rest of the course: see enough
examples and learn enough concepts so these
topics and issues will make sense
- We will revisit many of these slides in the very
last lecture of the course!
33. Concepts?
- Which of the following inventions do you think is
the most important?
- The PDA
- The PC
- The transistor
- Which of the following inventions do you think is
the most important?
- Email
- The Web
- TCP/IP
- What lies beneath? Concepts!
34. How will you Learn?
- Take a look at handout "Course Information and
Schedule"
- Text: Coulouris, Dollimore and Kindberg (4th
edition)
- Lectures
- Homeworks
- Approx. one every two weeks
- Solutions need to be typed; figures can be
hand-drawn
- Programming assignments
- Incremental, in 3-4 stages
- We will build a peer-to-peer system!
- Exams/quizzes
- Midterm and final
35. On the Textbook
- Text: Coulouris, Dollimore and Kindberg (4th
edition). White book.
- The 3rd edition will suffice for most material
too. However, we will refer to section, chapter,
and problem numbers only in the 4th edition.
- The 3rd edition may have a different numbering
for some HW problems (that we give from the
textbook). Make sure you solve the right problem;
the responsibility is yours (no points for
solving the wrong problem!)
36. What assistance is available to you?
- Lectures
- lecture slides will be placed online at course
website
- Tentative version before lecture
- Final version after lecture
- Homeworks: office hours to help you (without
giving you the solution)
- Programming Assignments (MPs): program templates
will be given to you. C (or C++) programming.
- An additional assignment may be given that is
open-ended, and can turn into a research project.
- Course prerequisite: Operating Systems/Systems
Programming (CS 241 or CS 423 or instructor
permission)
37. If you're still thinking, "Everything you've
said so far is so boring..."
- CS425 is about enjoying distributed systems,
learning a few new things, and designing some
cool new systems that you can boast about to your
friends (and job interviewers).
- We're here to help you achieve all these things.
38. You can meet us anytime
- In person (Office Hours)
- Myself (Indy): 3112 Siebel Center
- Every Tuesday 3.30 PM to 5 PM, and every Thursday
2 PM to 3.30 PM
- TA, Ramses Morales: 207 Siebel Center
- Every Monday 10.30 AM to 12 PM, and every
Wednesday 12.30 PM to 2 PM
- Virtually
- Newsgroup class.cs425 (most preferable,
monitored daily): 24 hour turnaround time for
questions!
- Email (turnaround time may be longer than
newsgroup)
39. Readings
- For today's lecture: Chapter 1
- For next Tuesday's lecture:
- Read sections 11.1-11.4
- Fill out and return Student InfoSheet
40. Have a Good Weekend!