What is a computer that handles requests for Web pages?

How The Web Works

TCP/IP Standards = Foundation

TCP/IP standards are the packet foundation, allowing a computer to communicate with another computer on the internet. The services we use are built on top of the basic TCP/IP capability: email, the web, Skype, Google Hangouts, and many different chat services.

The World Wide Web

  • World Wide Web
  • 1993 Sir Tim Berners-Lee, working at CERN
  • "The Web" is made of a set of open standards:
  • --TCP/IP -- underlying networking
  • --HTML -- web page format
  • --HTTP -- web connection protocol to get a page
  • --JPEG, PNG -- image formats
  • --Javascript -- web page programming language
  • Remarkably for such a world-changing invention, there's not a single vendor-specific or proprietary part of it.
  • It's all open standards. Not a coincidence!

The familiar "web" of connected web pages runs on top of the basic TCP/IP phone system. Chat programs, email, .. these are other services, distinct from the web, which also run on top of the basic connectivity provided by TCP/IP. The web was created by Tim Berners-Lee working at the physics research facility CERN in Switzerland (now Sir Tim Berners-Lee). Browsers were available in 1993, and the web, urls etc. were becoming broadly popular by 1995.

Study question: why did something as important as the web not come out of a computer company like IBM or Microsoft or Apple or whatever? The web is a free and open standard (like TCP/IP), and for the most part is not locked-in to any particular vendor, and this freedom is a vital part of the web's success. Openness leads to participation, so the lock-in choices get shunned.

1. A URL

  • URL Uniform Resource Locator
  • A URL is the address of some information on the web
  • e.g. http://web.stanford.edu/class/cs101
  • http: -- system/scheme to use
  • web.stanford.edu -- domain name of server computer
    -recall "domain name" prev lecture
  • /class/cs101 -- "path", particular page on that server

http://web.stanford.edu/class/cs101

A visit to a web page begins with a URL (Uniform Resource Locator) that points to that web page. Of course you've seen a million URLs over the years, but we'll look at the parts:

The http at the start is the networking scheme to use, and "http" and its secure variant "https" are by far the most common. In the future, if there were some new networking scheme, the URL syntax could still support it by starting with a different word before the colon.

After the // we have the web.stanford.edu which is the domain name of the computer on the internet that has this web page -- the web server. For the browser to request this web page, it will make a TCP/IP connection to that computer.

After the domain name we have the /class/cs101 path which indicates essentially which directory and file we want specifically from that web server.
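
As a small illustration of the three parts described above, here is a sketch that splits the example URL using Python's standard urllib.parse module. The choice of Python is an assumption made for this example; the notes themselves do not prescribe a language.

```python
from urllib.parse import urlsplit

url = "http://web.stanford.edu/class/cs101"
parts = urlsplit(url)

print(parts.scheme)    # the scheme before the colon: http
print(parts.hostname)  # the domain name of the server: web.stanford.edu
print(parts.path)      # the path on that server: /class/cs101
```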

2. Web Browser "Client"

  • "HTTP" is the protocol of the web
  • HTTP has 2 parties: client and server
  • Client is the browser program, e.g. Firefox, Chrome
  • In the browser, the user types in a url, hits return
  • Browser sends a request to server
  • Browser gets back HTML response, displays it
  • Basically request/response
  • Browser keeps history, back-button

The Web Browser is the familiar computer program, such as Firefox, that you run on your local computer to access the web. In short, you type URLs into your browser or click a link, and the browser requests and displays those pages for you. The browser also keeps track of your history of web pages so it can implement the back-button for you.

In networking terminology, the browser is the "client" which makes requests and displays what it gets back. The "server" is the other side of the request/response, servicing requests it gets. This is all done with TCP/IP packets between the browser and the server.

3. Web Server

  • Web Server: computer on the internet, has content, responds to HTTP requests
  • Web server program runs on the server computer, handles HTTP requests
    -apache, nginx are popular open source web server programs
  • Web Server...
  • -Must be running all the time, be on the internet
  • -Needs a fixed, known IP address, a domain name
  • -Stores a bunch of files (HTML, JPEG, ..)
  • -Gets HTTP requests (from browsers)
  • -Sends back HTTP responses (HTML, JPEG, ..)

The other side of the conversation is the web server -- a machine which hosts a set of web pages, and waits for requests to come in for those pages. The phrase "web server" can refer to the physical machine, or it can mean the program that responds to requests. Below I'll use the phrases "web server machine" or "web server program" to distinguish those two cases.

The web server machine needs to be switched on, ready, and connected to the internet at all times. It is essentially waiting for an incoming request which could happen at any time. In contrast, you can switch on your laptop, do some browsing, and switch it off.

The web server program runs all the time, handling any incoming requests. For simple web pages, the web server program identifies a directory (aka a folder) as the web-root of the files to serve. The "path" part of the url maps into the web-root directory. So the url http://example.com/a.html means to get the file a.html from the web-root directory. The web-root can itself contain directories, so http://example.com/class/cs101/b.html refers to a "class" directory in the web-root directory, in turn containing a "cs101" directory, which contains a b.html file.
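
The path-to-file mapping described above can be sketched roughly as follows. This is a simplified illustration, not how apache or nginx actually implement it: the web-root "/var/www" and the function name file_for_url are hypothetical, and a real server program handles many more cases.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

WEB_ROOT = PurePosixPath("/var/www")  # hypothetical web-root directory


def file_for_url(url, web_root=WEB_ROOT):
    """Map the path part of a URL to a file location under the web-root."""
    path = urlsplit(url).path
    # Drop empty and "." segments left over from leading or doubled slashes.
    segments = [s for s in path.split("/") if s not in ("", ".")]
    # Refuse ".." so a request cannot climb out of the web-root.
    if ".." in segments:
        raise ValueError("path escapes the web-root")
    return web_root.joinpath(*segments)


print(file_for_url("http://example.com/a.html"))
print(file_for_url("http://example.com/class/cs101/b.html"))
```

The second call walks into the "class" directory, then "cs101", then names b.html, mirroring the directory nesting in the URL path.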

Put It Together -- HTTP Request / Response

The HTTP (Hyper Text Transfer Protocol) standard describes how a browser makes a request to the web server program. ("Hypertext" is the idea of links within documents pointing to other documents. This idea long predates the web.) If HTML describes the code for a web page, HTTP is the protocol for getting a web page from the server.


  • You use this all the time!
  • HTTP request/response system
  • Browser has url
  • 1. Browser sends request to server named in url
    -request includes the path
    -e.g. "/class/cs101/syllabus.html"
    -TCP/IP provides the packet-service
  • 2. Server gets request
    -looks up that path in its resources
    -sends back response HTML
  • 3. Browser gets HTML, displays it
  • Notes:
  • Server on all the time, has IP addr
  • Server stores HTML, JPEG etc. data
  • Server sends back "404" error if no such resource
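
The three steps above can be imitated without any networking. The sketch below is a toy model under stated assumptions: the PAGES dictionary stands in for the server's stored files, the function names are made up for illustration, and only a minimal subset of HTTP/1.1 text is produced. A real browser and server exchange this kind of text over a TCP/IP connection.

```python
# Hypothetical server content: maps a path to the HTML stored for it.
PAGES = {
    "/class/cs101/syllabus.html": "<h1>Syllabus</h1>",
}


def build_request(host, path):
    """Step 1: the text a browser might send (minimal HTTP/1.1 GET)."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"


def handle_request(request_text, pages=PAGES):
    """Step 2: read the path from the request line and look it up."""
    request_line = request_text.split("\r\n", 1)[0]
    method, path, version = request_line.split(" ")
    if path in pages:
        status, body = "200 OK", pages[path]
    else:
        # No such resource: answer with the "404" error.
        status, body = "404 Not Found", "<h1>Not Found</h1>"
    return (
        f"HTTP/1.1 {status}\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body.encode())}\r\n"
        "\r\n"
        f"{body}"
    )


request = build_request("web.stanford.edu", "/class/cs101/syllabus.html")
response = handle_request(request)

# Step 3: the browser would now render the HTML after the blank line.
print(response.split("\r\n")[0])
```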

The server just sends back the HTML or whatever data to your browser. Your browser then "renders" this data into a window. This is why the View Source command works -- it just shows you the HTML of the server response, which is what the browser was using anyway.

The approximate appearance of the HTML is specified in the standard, but not the exact details; the appearance can vary with how wide your browser window is, what fonts your machine has etc. If you want to send a document and specify exactly how it looks, where the line breaks are etc. use PDF (Portable Document Format, owned by Adobe but also a free standard).

"Dynamic" Web Applications

  • Simplest case -- server holds unchanging files
  • "Web application" .. page contents are dynamic
  • e.g. GMail inbox
  • A program on the server runs to produce the page

The above HTTP request/response sequence is the major pattern of the web. For the simple case outlined above, we have static, unchanging content. Each web page corresponds basically to an HTML file stored on the server, and the contents of the file do not change quickly.

A more complex web site will have some pages which are "dynamic" -- the HTML for them is computed on the fly. With a static web site, each web page corresponds roughly to a file stored on the server. With a dynamic web site, a web page corresponds to a program on the server. A request for that web page runs the corresponding program code on the server. That code essentially runs a series of print statements (like the print we have used) to dynamically produce the HTML which is sent back as the response. The program could do anything -- look at various data sources, putting together any sort of HTML response page.

  • Web application example
  • www.google.com/trends
  • Type in a word or two: volcano, primary (4 year cycle), oscar (1 year cycle)
  • HTTP request as usual, "submit" button
  • Key: server runs a program to compute the HTML on the fly
  • Program basically uses print as we have seen .. producing HTML

A dynamic web site with a very simple interface is www.google.com/trends -- which graphs the frequency that different words appear in google searches. The front page shows an HTML form which includes fields where you can type in information. Clicking the button in the form (or sometimes typing return) "submits" the form to the server -- sending a request to the server which includes the values typed into the fields. This runs a program on the server which takes in the inputs from the form, and looks up information stored in files or databases on the server. The program puts all this information together and dynamically produces the HTML and images etc. for the result -- basically using print to produce HTML. Note that this still fits within the request/response pattern, but now the response is a one-off, computed on the fly just for this request.

The site sfbay.craigslist.org/ is another nice simple example of a dynamic form/response website. Here's how it works at a very high level. On the craigslist servers are files that store all the current listings .. say 1 million listings. When you submit your search term, code runs on the craigslist server, pulling out the listings that include that word, and printing out HTML to show you the first 100 matches. This program will use a for-loop, if-statement, and print, just as we have been doing.
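
To make the for-loop, if-statement, and print idea concrete, here is a minimal sketch of such a search page. It is not how craigslist or Google Trends actually work: the LISTINGS data and the search_page function are invented for illustration, and the example is written in Python as an assumption.

```python
from html import escape

# Hypothetical stand-in for the listings stored on the server.
LISTINGS = ["red bicycle", "oak desk", "mountain bicycle", "desk lamp"]


def search_page(term, listings=LISTINGS, limit=100):
    """Produce the HTML for a results page, computed for this one request."""
    lines = ["<html>", "<body>", f"<h1>Results for {escape(term)}</h1>", "<ul>"]
    shown = 0
    for listing in listings:                    # the for-loop
        if term.lower() in listing.lower():     # the if-statement
            lines.append(f"<li>{escape(listing)}</li>")
            shown += 1
            if shown == limit:                  # stop after the first `limit` matches
                break
    lines += ["</ul>", "</body>", "</html>"]
    return "\n".join(lines)


print(search_page("bicycle"))                   # the print
```

The escape calls keep characters such as < in the search term or a listing from being interpreted as HTML tags in the response.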

Later Topics: Tracking and Privacy

In security lectures, we'll talk about logging and blocking HTTP packets.

Web Page - HTML

Here is a simple web page with a few elements...


A Heading

This is the first paragraph.

This is a second paragraph, including a link to the codingbat site

An image is done with "img" tag which includes a "src" url pointing to the image data file, like this

[image: monkey.jpg]


  • HTML text code describes a web page
  • You should know a little bit of HTML and not be intimidated by it
  • Plain text with "tags" to mark text as bold etc. within brackets < ... >
  • "h1" or "h2" or "h3" are headings in big font, e.g.
      <h1>A Heading</h1>
  • "p" tag introduces a paragraph of text (starting on a new line)
  • "b" tags to mark bold: like <b>this</b>
  • "a" tags to mark a url:
      <a href="...">codingbat site</a>
  • "img" tag to load in image:
      <img src="monkey.jpg">
  • "monkey.jpg" must be sitting in the same folder as the HTML file
  • Below is the HTML code/tags to produce the page above:

The <html> tag starts the whole thing. The <head> section with <title> sets the title used at the top of the window. Inside <body> is the regular HTML content of the page.

    <html>
    <head>
    <title>PAGE TITLE HERE</title>
    </head>
    <body>
    <h1>A Heading</h1>
    <p>This is the first paragraph.</p>
    <p>This is a second paragraph, including a link to the <a href="...">codingbat site</a></p>
    <p>An image is done with "img" tag which includes a "src" url pointing to the image data file, like this</p>
    <img src="monkey.jpg">
    </body>
    </html>

A web page is written in a plain text code called HTML (Hyper Text Markup Language). Basically, HTML adds "markup" commands <...> within plain text. The markup indicates that parts of text should be a heading, or bold, or a url, and so on.

You do not need to know much HTML markup for this course, just a few tags so you understand the basic idea. You should not be intimidated about producing an HTML page to show some information. Creating a basic looking HTML page is not difficult, although of course a complex page like the nytimes.com front page is a lot of work. You can write HTML by hand, just typing in the text including the HTML tags, or use a program that looks more like a word-processor to you, but which then generates HTML for you.

HTML Edit Demo

On his laptop, Nick edits the file network-html-sample.html and drags the file onto Firefox to display it. You can View Source on this page to see its underlying HTML. Key steps:

  • In the editor, make a change, save the file
  • In the browser, click "reload" to see the changes immediately
  • This is the way to edit html, see results quickly

View Source

When you are visiting any web page, you can use the View Source command in your browser to see the underlying HTML code for the web page you see -- you will see <p> tags for paragraphs, <img> tags for images. Check out the source code of NYTimes.com .. it's quite a mess, but it's just HTML, rendered by your browser.
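
If you want to see which tags a page's source contains without reading it by eye, a short script can count them. The sketch below uses Python's standard html.parser module on a small made-up sample string; the TagCounter class name and the sample text are assumptions for illustration, not part of the course materials.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count how many times each opening tag appears in some HTML text."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


# Made-up sample resembling the simple page shown earlier.
sample = (
    "<html><body><h1>A Heading</h1>"
    "<p>This is the first paragraph.</p>"
    '<p>A second paragraph with a <a href="#">link</a></p>'
    '<p><img src="monkey.jpg"></p>'
    "</body></html>"
)

counter = TagCounter()
counter.feed(sample)
print(counter.counts["p"], counter.counts["img"])  # paragraph tags, image tags
```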

HTML 5

The latest version of HTML, HTML5, is becoming very popular, adding needed features to make better web pages and better dynamic web pages. Older versions of HTML lacked some features, so web pages did not look or work so well, but that's been largely fixed.

Web Design Philosophy

Someday, you will be tasked with organizing some important web content. The most important thing to know about web design is in this comic: XKCD on web design

There are two points of view: the users of your site have interests, common questions. Very often a user visits a site with a question they want answered. The organization creating the web site has a different set of interests. The creators might care about the org-chart of what division is providing what, and who runs it, or just generally promoting how shiny and awesome their organization and their management are. The old joke is that lame web sites end up looking like the org-charts of their producing organizations.

If you want to make a popular web site, concentrate on the questions and interests of your visitors. Sounds obvious, but it's easy to make the other sort of site: one that looks good to the hierarchy and treats visitors to your site as captives, to be shown little graphics or videos that are essentially advertisements or advocacy. Instead, the most important question for the web design is: what are the most common questions/interests visitors will have, and how can we make those answers conveniently available?

What is a computer that serves web pages?

A web server is software and hardware that uses HTTP (Hypertext Transfer Protocol) and other protocols to respond to client requests made over the World Wide Web. The main job of a web server is to display website content through storing, processing and delivering webpages to users.

What is a computer that makes a request?

A server is a software or hardware device that accepts and responds to requests made over a network. The device that makes the request, and receives a response from the server, is called a client.

How does a web server handle requests?

As a quick summary, the HTTP/1.1 protocol works as follows: the client (usually a browser) opens a connection to the server and sends a request. The server processes the request, generates a response, and closes the connection if it finds a Connection: Close header.

Which client software is used to request and display webpages?

A Web browser is a program that your computer runs to communicate with Web servers on the Internet, which enables you to download and display the Web pages that you request.