10 - The World Wide Web
Some terms
• Hypertext
• Software system that associates text or images on a computer display with
other documents, images, or files
• Uses links or hyperlinks that are clicked on or tapped on
• First demonstrated in 1968 (pre-internet)
• The World Wide Web (WWW, "the Web")
• Internet "app" that uses hypertext documents to
• Navigate to other hypertext documents
• Link to other files
• Examples: images, audio files, video files, programs, text files
• For displaying, playing, or downloading
• Hypertext Transfer Protocol (HTTP)
• Dependent on TCP/IP
• Defines method of transferring data between clients and servers on the Web
• HTTPS is a related protocol for secure (encrypted) data transfers
• Website
• A collection of related hypertext documents, programs, and media files
• Accessible via HTTP or HTTPS
• Web server
• A server that stores files for websites and provides access to them over the Internet
• Uniform Resource Locator (URL)
• Identifies (in a text string)
• A web resource (a website or some website file)
• The protocol to be used for accessing the resource
• Example:
https://www.somesite.com:8080/lookup.php?id=1484&lang=en#desc
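As a rough illustration of how the example URL breaks down into its parts, here is a minimal sketch using Python's standard urllib.parse module. The function name describe_url is made up for this example; it is not part of any library.

```python
from urllib.parse import urlsplit, parse_qs


def describe_url(url: str) -> dict:
    """Split a URL into the pieces a browser uses to locate a resource."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,        # how to access the resource
        "host": parts.hostname,          # which server to contact
        "port": parts.port,              # which port on that server
        "path": parts.path,              # which file or program on the server
        "query": parse_qs(parts.query),  # extra data passed to the resource
        "fragment": parts.fragment,      # a location inside the document
    }


info = describe_url("https://www.somesite.com:8080/lookup.php?id=1484&lang=en#desc")
for name, value in info.items():
    print(f"{name}: {value}")
```

Only the protocol, host, and path are needed in most URLs; the port, query string, and fragment are optional.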
• Spaces and other special characters not allowed in URLs
• Substitute a percent sign (%) followed by the two-digit hex value of each
byte of the character's UTF-8 encoding
http://sites.cs.queensu.ca/courses/cisc181/download%20this.txt
• The file is stored on the server as "download this.txt"; the space in its
name appears as %20 in the URL
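The substitution described above can be tried out with Python's standard urllib.parse module. This is a small sketch, not a full treatment of URL encoding rules; the file name with an accented character is an invented example.

```python
from urllib.parse import quote, unquote

# Encode a file name for use in a URL: the space becomes %20.
encoded = quote("download this.txt")
print(encoded)

# Decode it again to recover the name as stored on the server.
decoded = unquote("download%20this.txt")
print(decoded)

# A non-ASCII character is encoded as one %XX group per UTF-8 byte.
accented = quote("café.txt")  # hypothetical file name
print(accented)
```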
• Hypertext Markup Language (HTML)
• A markup language, not a programming language
• Conveys
• Logical structure of a web document
• Text and media content of a web document
• Hypertext links in a web document
• HTML documents may contain JavaScript programs
• Web browser
• Client app that communicates with web servers using HTTP or HTTPS and
URLs
• Displays web pages by interpreting HTML and related styling information
• Compiles and executes JavaScript programs embedded in web pages
• Sends web form data entered by a user to a web server
• Downloads files from or uploads files to web servers
• Client-side
• Web-related operations performed on a client machine
• Example: JavaScript program execution is a client-side operation
• Uses the client machine's resources
• Server-side
• Web-related operations performed on a server
• Example: Making updates to an online database is a server-side operation
• Uses the server's resources
• Client browser (and user) only see output from server processes
WWW history
• Tim Berners-Lee (now Sir Tim)
• English computer scientist and physicist
• Working at CERN in Switzerland in late 1980s
and early 1990s
• Developed underlying web technologies
• HTTP
• HTML
• First web browser
• WorldWideWeb (note: no spaces!)
• Submitted proposal for what became the Web
on March 12, 1989 (the Web's "birthday")
Sir Timothy John Berners-Lee
[Screenshot: the original WorldWideWeb program, a hypermedia browser/editor by Tim Berners-Lee at CERN]
• Mosaic browser (June 1993)
• First widely used web browser
• Developed at the National Center for Supercomputing Applications (NCSA),
Illinois, US
• Built on UNIX, ported to Microsoft Windows 3.1 and Apple Macintosh System
7 in September 1993
• Development ceased in 1997
• Netscape Navigator browser (December 1994)
• Created by part of the Mosaic development team
• Microsoft Internet Explorer (IE) browser (August 1995)
• Released with early upgrade to Windows 95
• JavaScript
• Developed by Netscape
• Released as part of Netscape Navigator in September 1995
• Microsoft released mostly compatible JScript with IE 3.0 in August 1996
• The First Browser War (1995 – 2001)
• Netscape lost market share to IE over time
• IE's inclusion in Windows mostly responsible
• Windows had over 90% share of the desktop OS market
• Netscape code became open source in March 1998
• Netscape bought by AOL in November 1998
• Last Netscape release based on original code: August 2002
• Firefox browser (September 2002 as Phoenix, November 2004 as Mozilla Firefox)
• Developed as part of the Mozilla Project
• Mozilla: Free software community formed by members of Netscape
• Later Netscape releases based on Firefox code
• Last such release February 2008
• Versions produced for all popular OSs
• Safari browser (January 2003)
• Developed by Apple for inclusion on Macintosh computers
• Version for Windows released in 2007, abandoned in 2012
• Google Chrome browser (September 2008)
• First released for Windows XP
• Later for Unix-like OSs (macOS, Linux)
• Part of code base released as open source project called Chromium
• Microsoft Edge browser (July 2015)
• Microsoft's intended replacement for IE
• Original code replaced with Chromium-based browser (January 2020)
• Versions released for macOS, Linux
• Mobile browsers are web browsers released for use on mobile devices
• Most are based on one of the popular personal computer browsers
• The Second Browser War (2004 – 2017)
• IE became much criticized for not adhering to emerging web standards
• Newer browsers eroded IE's dominance
• IE slipped below 50% market share in October 2010
• By May 2012, Google Chrome overtook IE as the world's most used browser
• Former Mozilla chief technical officer declared Google Chrome winner of the
Second Browser War in May 2017
• Google Chrome's popularity remains well above all its competitors
Web standards
• First Browser War led to multiple versions of HTML
• Web developers faced with
• Building browser-detection techniques into their sites
• Maintaining multiple versions of sites
• World Wide Web Consortium (W3C)
• Founded by Tim Berners-Lee, then at MIT, in 1994
• Established uniform standards for HTML
• Major browsers became W3C standards compliant over time
• The HTML standard and related core web standards are now maintained by the
Web Hypertext Application Technology Working Group (WHATWG)
• Founded in 2004 by major browser makers (but not Microsoft)
• Concerned W3C wasn't adequately maintaining web standards
• Now controlled by a "steering group": Apple, Mozilla, Google, and Microsoft
• W3C became concerned about competing HTML standards
• Ceded control of HTML development to WHATWG in May 2019
Server-side programs
• Programs written to be compiled and executed on a web server
• Users (and their browsers) only see the output of such programs
• Source code remains hidden on the server
• Server knows to compile and run files with specific filename extensions
• Example: A file with a name ending in .php will be compiled and run as a
PHP program
• Any output from the program, and any HTML embedded in it, is sent to
the user's browser
• Example: A web page that displays the user's IP address
• Stored on the web server at a (fictitious) website, www.bogus.com:
• A short PHP program file called my_ip.php
<?php
// Read the visitor's IP address from the request information PHP provides.
$user_ip_address = $_SERVER['REMOTE_ADDR'];
?><!DOCTYPE html>
<html lang="en">
<head>
<title>Your IP address</title>
</head>
<body>
<h1>Your IP address is <?=$user_ip_address?>.</h1>
</body>
</html>
• The file my_ip.php is mostly HTML. Server-side programs, like their
client-side JavaScript counterparts, are usually a mix of HTML and
programming language code. The actual PHP here is the part between <?php
and ?>, plus the <?=$user_ip_address?> expression.
[Diagram: the user's browser requests https://www.bogus.com/my_ip.php; the web server runs my_ip.php through its PHP interpreter and sends the resulting HTML document back to the browser]
1. The user enters www.bogus.com/my_ip.php into browser's address bar
2. The user's browser sends the request to the web server at bogus.com
3. The web server retrieves the my_ip.php file and passes it through its PHP
interpreter (compiler and executor), which produces a pure HTML
document
4. The HTML document is sent to the user's browser
5. The user's browser interprets the HTML and displays the user's IP address
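The five steps above can be imitated with Python's built-in http.server module. This is only a sketch of the same idea in a different language, not how a PHP-enabled web server is actually built; the names render_ip_page, MyIPHandler, and run, and the port number 8000, are assumptions made for the example.

```python
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_ip_page(ip_address: str) -> str:
    """Server-side step: turn program data into a pure HTML document."""
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        "<head>\n<title>Your IP address</title>\n</head>\n"
        "<body>\n"
        f"<h1>Your IP address is {escape(ip_address)}.</h1>\n"
        "</body>\n</html>\n"
    )


class MyIPHandler(BaseHTTPRequestHandler):
    """Answers every GET request with a page showing the client's IP address."""

    def do_GET(self):
        # client_address is a (host, port) pair supplied by http.server.
        body = render_ip_page(self.client_address[0]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)  # only this output reaches the browser


def run(port: int = 8000) -> None:
    """Start the server on this machine; stop it with Ctrl+C."""
    HTTPServer(("127.0.0.1", port), MyIPHandler).serve_forever()
```

Calling run() and then browsing to http://127.0.0.1:8000/ should show the page. As with the PHP version, the browser receives only the generated HTML, never the program's source code.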
• Server-side programs have full access to the server's file system
• This makes them very useful, but potentially dangerous. Example:
• PHP program written to let users upload data files to the server for sharing
• Programmer neglected to restrict the kinds of files being uploaded
• Hacker uploads a malevolent PHP script
• Hacker browses to the uploaded file, which is compiled and executed
• Chaos ensues on the server
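One common precaution against the scenario above is to accept only file names whose extension appears on an allow-list. The sketch below shows that single check in Python. The function name and the list of permitted extensions are assumptions for illustration, and a real site would normally combine this with other measures, such as storing uploads where the server will not execute them.

```python
from pathlib import PurePosixPath

# Hypothetical allow-list: only plain data files may be uploaded.
ALLOWED_EXTENSIONS = {".csv", ".txt", ".json"}


def is_allowed_upload(filename: str) -> bool:
    """Return True only for names with exactly one, permitted, extension."""
    # Discard any directory part the client may have included in the name.
    name = PurePosixPath(filename.replace("\\", "/")).name
    suffixes = [suffix.lower() for suffix in PurePosixPath(name).suffixes]
    # Rejecting multiple extensions blocks tricks such as "shell.php.txt".
    return len(suffixes) == 1 and suffixes[0] in ALLOWED_EXTENSIONS


for candidate in ["results.csv", "shell.php", "shell.php.txt", "../../evil.php"]:
    print(candidate, is_allowed_upload(candidate))
```

The check is deliberately strict: a legitimate name such as data.v2.csv would also be refused, which is usually an acceptable trade-off for an upload form.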
HTML
<!DOCTYPE html>
<html>
<head>
<title>Hello, world!</title>
</head>
<body>
<p>Hello, <a href="https://en.wikipedia.org/wiki/Earth">world</a>!</p>
</body>
</html>
[Browser screenshot: a tab titled "Hello, world!" showing the rendered page text "Hello, world!"]
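To see the structure and the hypertext link that this document's tags describe, the sketch below reads the same HTML with Python's standard html.parser module. The class name OutlineParser and the function outline are invented for this example; a real browser does far more than this.

```python
from html.parser import HTMLParser

DOCUMENT = """<!DOCTYPE html>
<html>
<head>
<title>Hello, world!</title>
</head>
<body>
<p>Hello, <a href="https://en.wikipedia.org/wiki/Earth">world</a>!</p>
</body>
</html>
"""


class OutlineParser(HTMLParser):
    """Records each opening tag and the target of each hypertext link."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "a":  # <a> is the anchor (link) element
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def outline(html_text: str):
    """Return (list of opening tags in order, list of link targets)."""
    parser = OutlineParser()
    parser.feed(html_text)
    return parser.tags, parser.links


tags, links = outline(DOCUMENT)
print(tags)
print(links)
```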
Web forms
• Hypertext links provide only rudimentary communication between web clients
and web servers
• Requests to view other pages, download or upload files
• Collecting input data directly from a user and sending it to a server requires
HTML forms
• Supported by the <form> paired tags and various input tags
• Allow designers to build user interfaces into web pages
• User fills in a form in the main browser window then clicks or presses on a
submit button to send the form data
• Server receives and processes the input, may use it to update a database
• Server will likely send another page to acknowledge receiving the data
<!DOCTYPE html>
<html>
<head>
<title>Name and DOB</title>
</head>
<body>
<form action="store_info.php" method="post">
<p>Given name: <input type="text" name="gname"></p>
<p>Surname: <input type="text" name="sname"></p>
<p>Date of birth: <input type="date" name="dob"></p>
<p><input type="submit"></p>
</form>
</body>
</html>
[Screenshots: the browser renders the form with "Given name" and "Surname" text boxes, a "Date of birth" field with a calendar picker, and a "Submit Query" button. In the example, the user enters Jessie and Wu and picks a date. After submitting, the address bar shows store_info.php; when the form data is sent as part of the URL (as with method="get"), it shows store_info.php?gname=Jessie&sname=Wu&dob=2008-05-14]
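When the form is submitted, the browser packs the field names and values into a single text string, and the server unpacks it again. The sketch below reproduces that round trip with Python's standard urllib.parse module, using the field names from the form and the sample values Jessie, Wu, and 2008-05-14. The function names are made up for illustration; store_info.php itself is not shown in these slides.

```python
from urllib.parse import urlencode, parse_qs


def encode_form(fields: dict) -> str:
    """Browser side: pack form fields into name=value pairs joined by &."""
    return urlencode(fields)


def decode_form(body: str) -> dict:
    """Server side: unpack the string into one value per field name."""
    return {name: values[0] for name, values in parse_qs(body).items()}


form_data = {"gname": "Jessie", "sname": "Wu", "dob": "2008-05-14"}
encoded = encode_form(form_data)
print(encoded)
print(decode_form(encoded))
```

With method="post" this string travels in the body of the request; with method="get" it is appended to the URL after a question mark.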
Website forgetfulness
• Websites are forgetful
• Server responds to request from a client then returns to previous state
• Must do this to handle requests from other clients
• No ongoing connection between a client and the server
• Yet, when buying something online…
• User signs into vendor's website
• Site presents menu of choices; user selects "Shop"
• Site presents menu of product categories; user selects one
• Site presents items in that category; user selects one
• Etc.
Cookies
• Technology invented in 1994 by Lou Montulli at Netscape
• Small amounts of data sent from servers along with web pages
• Transaction, or session, goes like this:
• User requests a web page
• Server responds with a page and a cookie containing initial request details
• Client displays the page for the user and stores the cookie
• User continues the session with another page request; cookie sent as well
• Server uses cookie data to construct a page to send back to the user that
continues the session; also sends another cookie
• Client displays the page for the user and stores the cookie
• Etc.
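The exchange above can be simulated without a network using Python's standard http.cookies module. This sketch assumes one common variant of the scheme: the cookie carries only a random session ID, and the server keeps the session details in its own memory under that ID. All function names and the SESSIONS dictionary are invented for the example.

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}  # server-side memory: session ID -> pages requested so far


def make_set_cookie_header(session_id: str) -> str:
    """Build the header line a server sends to hand the client a cookie."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["path"] = "/"
    return cookie.output(header="Set-Cookie:")


def read_cookie_header(header_value: str) -> dict:
    """Parse the cookie text a client sends back with its next request."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    return {name: morsel.value for name, morsel in cookie.items()}


def handle_request(path: str, cookie_header: str = ""):
    """Return (page text, Set-Cookie header) for one page request."""
    session_id = read_cookie_header(cookie_header).get("session_id")
    if session_id not in SESSIONS:  # first request: start a new session
        session_id = secrets.token_hex(8)
        SESSIONS[session_id] = []
    SESSIONS[session_id].append(path)
    page = "Pages visited this session: " + ", ".join(SESSIONS[session_id])
    return page, make_set_cookie_header(session_id)


# First request: no cookie yet, so the server issues one.
page, header = handle_request("/shop")
stored_cookie = header.split(": ", 1)[1].split(";", 1)[0]  # client keeps name=value

# Second request: the client sends the stored cookie back.
page, header = handle_request("/shop/books", stored_cookie)
print(page)
```

Without the cookie, the second request would look to the server like a brand-new visitor.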
• Session cookies
• Associated with specific transactions
• Essential to those transactions
• Expire and are deleted on log out or after specified length of time
• Persistent cookies
• Survive beyond the current session; kept until an expiry date set by the server
• Can usually be removed if the user knows how
• May be useful for allowing a user to reconnect to a site automatically
• Tracking cookies
• Persistent cookies used to track a user's browsing patterns
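In the cookie text itself, the difference between a session cookie and a persistent cookie comes down to whether a lifetime is attached. The sketch below builds both kinds with Python's standard http.cookies module. The function name, cookie names, and the 30-day lifetime are assumptions chosen for illustration.

```python
from http.cookies import SimpleCookie


def build_cookie(name: str, value: str, max_age_seconds=None) -> str:
    """Return cookie text; add a lifetime only when one is given."""
    cookie = SimpleCookie()
    cookie[name] = value
    if max_age_seconds is not None:
        # Max-Age tells the browser how long to keep the cookie.
        cookie[name]["max-age"] = max_age_seconds
    return cookie.output(header="").strip()


# Session cookie: no lifetime, so the browser discards it when the session ends.
print(build_cookie("cart", "42"))

# Persistent cookie: kept for 30 days, even if the browser is closed.
print(build_cookie("remember_me", "yes", 30 * 24 * 3600))
```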
• Third-party cookies
• Maintained by a company (the third party) other than the visited site's
owners
• A type of tracking cookie
• Third party company distributes its cookies through multiple sites
• Tracks a user's browsing and shopping habits
• Can use collected data for targeted advertising
• Often operate without users' knowledge or consent
• Some browsers now disable third-party cookies by default
Summary
• We looked at
• Web terminology
• Web history
• Web standards organizations
• The size of the Web
• Server-side programs and databases
• HTML tags and elements
• HTML forms
• Content management systems
• Cookies
• … and how they can track us
• Modern apps and how they relate to the Web