Title: Revealing the behind-the-scenes magic that occurs when you surf the Web
1. How the Web Is Spun
- Revealing the behind-the-scenes magic that occurs when you surf the Web
- Theron S. Welch
2. Presentation Goals
- Demystify the internals of the Web
- Clearly explain the basics of HTTP and TCP/IP and their differences
- Explain why this foundation is important
3. What is the World Wide Web?
- Everybody knows how to use a web browser!
- But realize that the Web is simply a convenient way to view and share documents using a technology known as HTTP.
4. More and more acronyms
- Essential technologies: HTTP, HTML, TCP/IP
- Newer technologies: ASP, JSP, XML, SOAP, and much more
5. HTTP: Hypertext Transfer Protocol
- Question: What is HTTP's purpose?
- Answer: Transferring hypertext documents
- HTTP can transfer other types of files too
6. Hypertext Files
- They are also known as HTML files (Hypertext Markup Language files)
- They are simple ASCII files. ASCII is about as platform independent as you can get!
- Markup means that the text is marked up with special tags
- HTML tags describe the formatting of a document
7. Sample HTML File
8. Sample HTML file in a browser
9. Common HTML Tags
- <A> - Anchor tag. Use this tag to hyperlink to another resource on the web. Example: <A HREF="www.geographix.com">GGX Website</A>
- <IMG> - Image tag. Use the image tag to insert images (gifs or jpgs) into HTML documents. Example: <IMG SRC="mygraphic.jpg">
- Many other tags exist. You can also add scripts to web pages.
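To make the two tags concrete, here is a minimal sketch that reads a small HTML page and lists the anchor and image targets it finds. It uses Python's standard `html.parser` module. The class name `LinkAndImageCollector` and the sample page are illustrative assumptions, not part of the original slides.

```python
from html.parser import HTMLParser


class LinkAndImageCollector(HTMLParser):
    """Collects the targets of <A HREF=...> and <IMG SRC=...> tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        attributes = dict(attrs)
        if tag == "a" and "href" in attributes:
            self.links.append(attributes["href"])
        elif tag == "img" and "src" in attributes:
            self.images.append(attributes["src"])


# Hypothetical sample page built from the two examples on the slide.
SAMPLE_PAGE = """
<HTML>
<BODY>
<A HREF="www.geographix.com">GGX Website</A>
<IMG SRC="mygraphic.jpg">
</BODY>
</HTML>
"""

collector = LinkAndImageCollector()
collector.feed(SAMPLE_PAGE)
print(collector.links)
print(collector.images)
```

A browser does roughly the same walk over the tags, then fetches each image it finds and makes each anchor clickable.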
10. HTTP
- HTTP has various methods (functions) that clients can use to request information from servers
- A web browser uses HTTP to GET a web page from a web server.
- A web server is a machine running software that is listening for HTTP requests from clients
- HTTP is a text-based protocol: it sends ASCII requests and responses over the Internet
11. HTTP Method: GET
- GET is by far the most commonly used method in HTTP
- GET is used to get resources from a server
12. A closer look at GET
Request line
- GET /index.htm HTTP/1.0
Headers
- From: twelch@geographix.com
Marks the end of a request
- <blank line>
13. Some Basic Rules
- Method names are written in upper case (as in GET)
- Headers are optional
- A blank line marks the end of a request
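The rules above can be sketched in a few lines of Python. This is an illustration only: the helper name `build_get_request` is an assumption made for this example, and it covers just the simple HTTP/1.0 case shown on the previous slide.

```python
def build_get_request(path, headers=None):
    """Return the bytes of a minimal HTTP/1.0 GET request."""
    # Request line: method in upper case, then the path, then the version.
    lines = [f"GET {path} HTTP/1.0"]
    # Headers are optional; each one is written as "Name: value".
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Every line ends with CR LF, and an empty line ends the request.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_get_request("/index.htm", {"From": "twelch@geographix.com"})
print(request)
```

Because the protocol is plain ASCII, the whole request is just a short string that can be printed or typed by hand.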
14. Typical Response from a server
Response line
- HTTP/1.0 200 OK
Headers
- Server: Microsoft-IIS/3.0
- Date: Thu, 19 Apr 2001 15:22:46 GMT
- Content-Type: text/html
- Last-Modified: Thu, 19 Apr 2001 00:07:09 GMT
- Content-Length: 7118
End of server's response headers
- <blank line>
Document that was requested follows
- <Content>
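As a rough sketch of how a client might take such a response apart, the function below splits the raw bytes at the blank line, then reads the response line and the headers. The name `parse_response` is hypothetical. The sample body is a short placeholder, so its Content-Length is 13 rather than the 7118 on the slide.

```python
def parse_response(raw):
    """Split raw HTTP response bytes into status, headers, and body."""
    # The blank line (CR LF CR LF) separates the headers from the content.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    # Response line: version, numeric status code, reason phrase.
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only; values such as dates contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


# Placeholder response modeled on the slide, with a shortened body.
RAW = (
    b"HTTP/1.0 200 OK\r\n"
    b"Server: Microsoft-IIS/3.0\r\n"
    b"Date: Thu, 19 Apr 2001 15:22:46 GMT\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<HTML></HTML>"
)

version, code, reason, headers, body = parse_response(RAW)
print(version, code, reason)
print(headers["content-type"], len(body))
```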
15. Connections (TCP)
- The client makes the connection to the server
- The client uses the connection to send a request
- The server sends the requested data on the connection
- The server closes the connection
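The four steps above can be sketched with Python's standard `socket` module. To keep the example self-contained, it starts a tiny web server on the local machine instead of contacting a real site. `HelloHandler`, `fetch`, and the page content are assumptions made for this demo.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Tiny local server used only so the client below has something to call."""

    def do_GET(self):
        body = b"<HTML>hello</HTML>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(host, port, path="/"):
    # 1. The client makes the connection to the server.
    with socket.create_connection((host, port), timeout=5) as conn:
        # 2. The client uses the connection to send a request.
        conn.sendall(f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii"))
        # 3. The server sends the requested data on the connection.
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                # 4. An empty read means the server closed the connection.
                break
            chunks.append(chunk)
    return b"".join(chunks)


# Port 0 asks the operating system for any free local port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    response = fetch("127.0.0.1", server.server_address[1])
finally:
    server.shutdown()
    server.server_close()

print(response.decode("ascii"))
```

The client never needs to know how long the document is: with HTTP 1.0 it simply reads until the server hangs up.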
16. Connection management
- HTTP 1.0 required one connection per web resource
- One connection per resource is very inefficient
17. HTTP 1.1
- Can maintain connections to improve performance and reduce traffic
- Is an option selected by default in Microsoft Internet Explorer 5.0
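A minimal sketch of a maintained connection, assuming Python's standard `http.client` and `http.server` modules: two resources are requested over a single TCP connection to a local test server. The handler name and the paths `/index.htm` and `/logo.gif` are hypothetical. The local port number is recorded after each request; if it stays the same, the connection was reused.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server keep the connection open

    def do_GET(self):
        body = f"you asked for {self.path}".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # With a maintained connection the client needs the length to know
        # where one response ends and the next begins.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


server = ThreadingHTTPServer(("127.0.0.1", 0), KeepAliveHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
bodies = []
local_ports = set()
try:
    for path in ("/index.htm", "/logo.gif"):
        conn.request("GET", path)
        reply = conn.getresponse()
        bodies.append(reply.read())
        # Same local port on both requests means the same TCP connection.
        local_ports.add(conn.sock.getsockname()[1])
finally:
    conn.close()
    server.shutdown()
    server.server_close()

print(bodies)
print(len(local_ports))
```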
18. Other Important HTTP Commands
- HEAD: similar to GET but no content is sent
- POST: used to send data to a server. Message boards, shopping carts, etc.
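The difference between the two commands can be seen against a small local test server. This is a sketch under assumptions: `DemoHandler`, `send`, the path `/board`, and the form data `message=hello` are invented for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<HTML>hello</HTML>"


class DemoHandler(BaseHTTPRequestHandler):
    def _send_page_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):
        self._send_page_headers()
        self.wfile.write(PAGE)

    def do_HEAD(self):
        # Same headers as GET, but no content follows.
        self._send_page_headers()

    def do_POST(self):
        # Read the data the client sent, then acknowledge it.
        length = int(self.headers.get("Content-Length", 0))
        received = self.rfile.read(length)
        reply = b"received: " + received
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def send(port, method, path, body=None):
    """Send one request and return (status, Content-Length header, content)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, path, body=body)
        reply = conn.getresponse()
        return reply.status, reply.getheader("Content-Length"), reply.read()
    finally:
        conn.close()


server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    port = server.server_address[1]
    head_result = send(port, "HEAD", "/index.htm")
    post_result = send(port, "POST", "/board", body=b"message=hello")
finally:
    server.shutdown()
    server.server_close()

print(head_result)
print(post_result)
```

HEAD reports the size of the page in its headers while delivering no content, which is handy for checking whether a page has changed.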
19. TCP/IP
- Where HTTP is simple, TCP/IP is very complex
- HTTP uses TCP/IP to connect and talk to a server; TCP/IP functions on a lower layer
- TCP and IP are independent of each other but typically work together
- The phrase TCP/IP refers to a large suite of protocols, the most common being TCP and IP
20. TCP: Transmission Control Protocol
- Provides reliable transport service to HTTP (and other higher level protocols)
- TCP assigns a port for each connection
21. Ports (TCP)
- Q: What the heck is a port? Is it a physical entry on a network card?
- A: No. A port is just a 16-bit number that functions as a key
- 16-bit means 2 to the 16th power, or 65,536 available numbers
- Used to identify which software application a packet of data should go to
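The 16-bit arithmetic can be checked in a couple of lines. The helper name `pack_port` is an assumption for this sketch. It encodes a port number as two bytes in network byte order, which is how a port field is laid out in a TCP header.

```python
import struct

PORT_COUNT = 2 ** 16        # number of values a 16-bit field can hold
MAX_PORT = PORT_COUNT - 1   # ports are numbered starting from 0


def pack_port(port):
    """Encode a port number as 2 bytes in network byte order."""
    if not 0 <= port <= MAX_PORT:
        raise ValueError(f"port {port} does not fit in 16 bits")
    # "!H" means network (big-endian) byte order, unsigned 16-bit integer.
    return struct.pack("!H", port)


print(PORT_COUNT)
print(pack_port(80))
```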
22. Well-known Ports
Many TCP/IP applications are so common they have their own well-known port numbers. For example:
- FTP: 21
- HTTP: 80
- Telnet: 23
- SMTP: 25
- Microsoft Active Directory: 3268
- Sybase SQL Anywhere: 1498
23. Port 80: HTTP's port
- Web servers listen on port 80 for incoming connections. In other words, when a request to connect comes in on port 80, TCP ensures that it goes to the web server that is listening on that port.
- Web servers don't have to listen on 80, but it's the established convention. Hence, it's a well-known port number.
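To illustrate how the port number decides which program receives a connection, the sketch below opens two listeners on the local machine and connects to each. It deliberately avoids port 80 (binding to it usually needs administrator rights) and lets the operating system pick free ports. The function names and greeting strings are assumptions for this demo.

```python
import socket


def open_listener():
    """Listen on a free local port chosen by the operating system."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 means "pick any free port"
    listener.listen(1)
    listener.settimeout(5)
    return listener


def connect_and_read(listener, greeting):
    """Connect to the listener's port and return what that listener sends."""
    port = listener.getsockname()[1]
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        # TCP hands the connection to whichever socket listens on that port.
        served, _ = listener.accept()
        with served:
            served.sendall(greeting)
        # Read until the listener's side closes the connection.
        data = b""
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            data += chunk
        return data


web_like = open_listener()
mail_like = open_listener()
try:
    ports = {web_like.getsockname()[1], mail_like.getsockname()[1]}
    from_web = connect_and_read(web_like, b"web server here")
    from_mail = connect_and_read(mail_like, b"mail server here")
finally:
    web_like.close()
    mail_like.close()

print(from_web, from_mail)
```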
24. Hackers
- Hackers typically break into networks by figuring out what ports are open.
- A receptive port that exposes a weakness can allow access.
- FYI: Watch out for File and Printer sharing!
25. IP: Internet Protocol
- Deals with identifying computers on the Internet (IP addresses)
- Moves TCP data around the Internet
26. IP Addresses
- Example: www.microsoft.com is also known as 207.46.131.199, which is an IP address
- This notation is called dotted-decimal notation: it writes the 4 bytes of the address as decimal numbers separated by dots.
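A short sketch using Python's standard `ipaddress` module shows the 4 bytes behind the dotted-decimal form, using the example address from the slide. The variable names are illustrative.

```python
import ipaddress

address = ipaddress.IPv4Address("207.46.131.199")

octets = list(address.packed)  # the 4 bytes behind the dotted notation
as_integer = int(address)      # the same 32 bits read as one number

print(octets)
print(as_integer)
# Joining the bytes with dots gives back the dotted-decimal form.
print(".".join(str(byte) for byte in octets))
```

Each of the four numbers is one byte, which is why none of them can exceed 255.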
27. Typical Web Browser
28. Not-so-typical Web Browser
WebSnatcher rendering the same site
29. The Ultimate Basic Browser: Telnet
Instead of clicking on hyperlinks in a web browser, why not just type in the HTTP commands yourself in Telnet? :-)
Type in a web address and a port number
30. Telnet
The screen will go blank. After typing GET / HTTP/1.0 <ENTER> <ENTER>, you'll see something like this:
31. Telnet
Here's what an error looks like as returned from a web server.
32. What a typical browser has to do
- Formatting pages
- Rendering graphics, animation
- Supporting audio/video plug-ins
- Supporting Java, scripting languages
- And much more
33. What a typical server has to do
- Connection management
- Stress handling
34. New Technologies and Directions
- ASP: Active Server Pages
- JSP: Java Server Pages
- XML: Extensible Markup Language
- Web Services with SOAP
35. Constants
- New technologies come and go
- Only a few new technologies will make a significant impact
- HTTP has had explosive success
- TCP/IP has stood the test of time: about 30 years or so. Wow!
- HTTP and TCP/IP will remain with us
36. Excellent Resources on the Web
- www.w3c.org - The World Wide Web Consortium
- www.faqs.org - The Internet FAQ Consortium
- Don't forget the RFCs!
37. Bibliography
- TCP/IP Illustrated, Volume 1, W. Richard Stevens, 1994, Addison Wesley
- TCP/IP Illustrated, Volume 3, W. Richard Stevens, 1996, Addison Wesley
- Network Programming for Microsoft Windows, Anthony Jones and Jim Ohlund, 1999, Microsoft Press