What is a computer that handles requests for Web pages?

How The Web Works

TCP/IP Standards = Foundation

TCP/IP standards are the packet foundation, allowing one computer to communicate with another computer on the internet. The services we use are built on top of this basic TCP/IP capability: email, the web, Skype, Google Hangouts, and many different chat services.
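As a rough illustration of a service built on top of TCP/IP, here is a minimal sketch of a toy service that runs over a TCP connection on the local machine. The uppercase-echo behavior and the names `echo_once`, `server_sock`, and `reply` are made up for this example; real services such as email or the web define much richer protocols on the same foundation.

```python
import socket
import threading

def echo_once(server_sock: socket.socket) -> None:
    """Accept one connection and send back the received bytes, uppercased."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())

# Port 0 asks the operating system to pick any free port.
server_sock = socket.create_server(("127.0.0.1", 0))
port = server_sock.getsockname()[1]
worker = threading.Thread(target=echo_once, args=(server_sock,))
worker.start()

# The "client" side: connect over TCP, send bytes, read the reply.
with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
    client.sendall(b"hello over tcp")
    reply = client.recv(1024)

worker.join()
server_sock.close()
print(reply)
```

Both sides only see a stream of bytes; what those bytes mean is decided by the service layered on top.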

The World Wide Web

World Wide Web
Created by Sir Tim Berners-Lee, working at CERN; browsers available by 1993
"The Web" is made of a set of open standards:
--TCP/IP -- underlying networking
--HTML -- web page format
--HTTP -- web connection protocol to get a page
--JPEG, PNG -- image formats
--JavaScript -- web page programming language
Remarkably for such a world-changing invention, there's not a single vendor-specific or proprietary part of it.
It's all open standards. Not a coincidence!

The familiar "web" of connected web pages runs on top of the basic TCP/IP "phone system". Chat programs, email, and so on are other services, distinct from the web, which also run on top of the basic connectivity provided by TCP/IP. The web was created by Tim Berners-Lee [now Sir Tim Berners-Lee], working at the physics research facility CERN in Switzerland. Browsers were available in 1993, and the web, URLs, etc. were becoming broadly popular by 1995.

Study question: why did something as important as the web not come out of a computer company like IBM, Microsoft, or Apple? The web is a free and open standard [like TCP/IP], and for the most part it is not locked in to any particular vendor. This freedom is a vital part of the web's success: openness leads to participation, so the lock-in choices get shunned.

1. A URL

URL -- Uniform Resource Locator
A URL is the address of some information on the web
e.g. http://web.stanford.edu/class/cs101
http -- system/scheme to use
web.stanford.edu -- domain name of server computer
-recall "domain name" from the previous lecture
/class/cs101 -- "path", particular page on that server

http://web.stanford.edu/class/cs101

A visit to a web page begins with a URL [Uniform Resource Locator] that points to that web page. Of course you've seen a million URLs over the years, but we'll look at the parts:

The http at the start is the networking scheme to use, and "http" and its secure variant "https" are by far the most common. In the future, if there were some new networking scheme, the URL syntax could still support it by starting with a different word before the colon.

After the // we have the web.stanford.edu which is the domain name of the computer on the internet that has this web page -- the web server. For the browser to request this web page, it will make a TCP/IP connection to that computer.

After the domain name we have the /class/cs101 path which indicates essentially which directory and file we want specifically from that web server.
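The three parts described above can be pulled out of a URL with Python's standard `urllib.parse` module. This is only a sketch; the variable names are chosen for illustration, and the URL is the example from these notes with its "http" scheme written out.

```python
from urllib.parse import urlsplit

# The example URL from the notes, with its "http" scheme written out.
url = "http://web.stanford.edu/class/cs101"

parts = urlsplit(url)
scheme = parts.scheme      # networking scheme, the word before the colon
domain = parts.hostname    # domain name of the web server
path = parts.path          # which page we want on that server

print(scheme)   # http
print(domain)   # web.stanford.edu
print(path)     # /class/cs101
```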

2. Web Browser "Client"

"HTTP" is the protocol of the web
HTTP has 2 parties: client and server
Client is the browser program, e.g. Firefox, Chrome
In the browser, the user types in a url, hits return
Browser sends a request to server
Browser gets back HTML response, displays it
Basically request/response
Browser keeps history, back-button

The Web Browser is the familiar computer program, such as Firefox, that you run on your local computer to access the web. In short, you type URLs into your browser or click a link, and the browser requests and displays those pages for you. The browser also keeps track of your history of web pages so it can implement the back-button for you.

In networking terminology, the browser is the "client" which makes requests and displays what it gets back. The "server" is the other side of the request/response, servicing requests it gets. This is all done with TCP/IP packets between the browser and the server.
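To make the "request" half concrete, here is a sketch of the text a browser might send for a simple page request. The helper name `build_get_request` is hypothetical, and a real browser sends many more header lines than the two shown here.

```python
def build_get_request(host: str, path: str) -> bytes:
    """Return the bytes of a minimal HTTP/1.1 GET request (illustrative helper)."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, path, protocol version
        f"Host: {host}",          # which site on the server we are asking for
        "Connection: close",      # ask the server to close the connection afterwards
        "",                       # blank line marks the end of the headers
        "",
    ]
    # HTTP separates lines with carriage return + line feed.
    return "\r\n".join(lines).encode("ascii")

request = build_get_request("web.stanford.edu", "/class/cs101")
print(request.decode("ascii"))
```

The server's response has a similar shape: a status line, some header lines, a blank line, and then the page content.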

3. Web Server

Web Server: computer on the internet, has content, responds to HTTP requests
Web server program runs on the server computer, handles HTTP requests
-Apache, nginx are popular open source web server programs
Web Server...
-Must be running all the time, be on the internet
-Needs a fixed, known IP address, a domain name
-Stores a bunch of files [HTML, JPEG, ..]
-Gets HTTP requests [from browsers]
-Sends back HTTP responses [HTML, JPEG, ..]

The other side of the conversation is the web server -- a machine which hosts a set of web pages, and waits for requests to come in for those pages. The phrase "web server" can refer to the physical machine, or it can mean the program that responds to requests. Below I'll use the phrases "web server machine" or "web server program" to distinguish those two cases.

The web server machine needs to be switched on, ready, and connected to the internet at all times. It is essentially waiting for an incoming request which could happen at any time. In contrast, you can switch on your laptop, do some browsing, and switch it off.

The web server program runs all the time, handling any incoming requests. For simple web pages, the web server program identifies a directory [aka a folder] as the web-root of the files to serve. The "path" part of the URL maps into the web-root directory. So the URL http://example.com/a.html means to get the file a.html from the web-root directory. The web-root can itself contain directories, so http://example.com/class/cs101/b.html refers to a "class" directory in the web-root directory, in turn containing a "cs101" directory, which contains a b.html file.
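The mapping from URL path to file can be sketched in a few lines. The function name `resolve_in_web_root` and the web-root `/var/www` are assumptions for illustration; real web server programs such as Apache or nginx have their own configuration and do considerably more checking.

```python
from pathlib import PurePosixPath

def resolve_in_web_root(web_root: str, url_path: str) -> str:
    """Map the path part of a URL to a file path under the web-root (illustrative)."""
    root = PurePosixPath(web_root)
    # Drop the leading "/" so the URL path is treated as relative to the web-root.
    relative = PurePosixPath(url_path.lstrip("/"))
    # Refuse ".." so a request cannot climb out of the web-root.
    if ".." in relative.parts:
        raise ValueError("path escapes the web-root")
    return str(root / relative)

print(resolve_in_web_root("/var/www", "/a.html"))              # /var/www/a.html
print(resolve_in_web_root("/var/www", "/class/cs101/b.html"))  # /var/www/class/cs101/b.html
```

The check for ".." hints at why servers are careful here: without it, a crafted path could point at files outside the directory the server is meant to share.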

Put It Together -- HTTP Request / Response

The HTTP [Hyper Text Transfer Protocol] standard describes how a browser makes a request to the web server program. ["Hypertext" is the idea of links within documents pointing to other documents. This idea long predates the web.] If HTML describes the code for a web page, HTTP is the protocol for getting a web page from the server.
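The whole request/response cycle can be tried on one machine. The sketch below uses Python's built-in `http.server` as a stand-in web server program and `http.client` as a stand-in browser; the function name `fetch_from_local_server`, the file name, and the page text are made up for the example, and the "web-root" is just a temporary directory.

```python
import functools
import http.client
import http.server
import pathlib
import tempfile
import threading

def fetch_from_local_server(file_name: str, html_text: str) -> tuple:
    """Serve one file from a temporary web-root and fetch it over HTTP."""
    with tempfile.TemporaryDirectory() as web_root:
        # Put a single file in the web-root directory.
        pathlib.Path(web_root, file_name).write_text(html_text, encoding="utf-8")

        # Server side: a handler that serves files out of web_root.
        handler = functools.partial(
            http.server.SimpleHTTPRequestHandler, directory=web_root
        )
        # Port 0 asks the operating system for any free port.
        server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
        port = server.server_address[1]
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            # Client side: send a GET request and read the response.
            conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request("GET", "/" + file_name)
            response = conn.getresponse()
            result = (response.status, response.read().decode("utf-8"))
            conn.close()
            return result
        finally:
            server.shutdown()
            server.server_close()

status, body = fetch_from_local_server("a.html", "<p>Hello from the web-root</p>")
print(status)
print(body)
```

A status of 200 means the server found the file and sent it back; the body is the HTML text that a browser would then display.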

Link with the "a" tag: <a href="http://codingbat.com">codingbat site</a>

"img" tag to load an image: <img src="monkey.jpg">

"monkey.jpg" must be sitting in the same folder as the HTML file

Below is the HTML code/tags to produce the page above:

<html> starts the whole thing. The <head> section with <title> sets the title used at the top of the window. Inside <body> is the regular HTML content of the page.

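<html>
<head>
<title>PAGE TITLE HERE</title>
<style>body { max-width:700px; }</style>
</head>
<body>

<h1>A Heading</h1>

<p>This is the first paragraph.</p>

<p>This is a second paragraph, including a link to the
<a href="http://codingbat.com">codingbat site</a></p>

<p>An image is done with "img" tag which includes a "src" url
pointing to the image data file, like this
<img src="monkey.jpg"></p>

</body>
</html>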

A web page is written in a plain text code called HTML [Hyper Text Markup Language]. Basically, HTML adds "markup" commands within plain text. The markup indicates that parts of the text should be a heading, or bold, or a link, and so on.
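One way to see that HTML is just plain text plus markup is to separate the two. The sketch below uses Python's standard `html.parser` module; the class name `TagCollector` and the small sample page are made up for illustration.

```python
from html.parser import HTMLParser

class TagCollector(HTMLParser):
    """Record each opening tag and each piece of plain text in an HTML string."""

    def __init__(self):
        super().__init__()
        self.tags = []   # markup: names of opening tags, in order
        self.text = []   # plain text found between the tags

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        # Skip the whitespace-only runs between tags.
        if data.strip():
            self.text.append(data.strip())

sample = """<html>
<head><title>PAGE TITLE HERE</title></head>
<body>
<h1>A Heading</h1>
<p>This is the first paragraph.</p>
<p>Some <b>bold</b> text.</p>
</body>
</html>"""

collector = TagCollector()
collector.feed(sample)
print(collector.tags)
print(collector.text)
```

The first list is the markup (`h1` for a heading, `p` for a paragraph, `b` for bold), and the second is the ordinary text that the markup is applied to.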

You do not need to know much HTML markup for this course, just a few tags so you understand the basic idea. You should not be intimidated about producing an HTML page to show some information. Creating a basic looking HTML page is not difficult, although of course a complex page like the nytimes.com front page is a lot of work. You can write HTML by hand, just typing in the text including the HTML tags, or use a program that looks more like a word-processor to you, but which then generates HTML for you.

HTML Edit Demo

On his laptop, Nick edits the file network-html-sample.html and drags the file onto Firefox to display it. You can View Source on this page to see its underlying HTML. Key steps:

In the editor, make a change, save the file
In the browser, click "reload" to see the changes immediately
This is the way to edit html, see results quickly

View Source

When you are visiting any web page, you can use the View Source command in your browser to see the underlying HTML code for the web page you see -- you will see <p> tags for paragraphs,