the anatomy of a web page

duluth quantum computing project



artist @ work


in progress ...

When we refer to a "web page" we usually mean a document written in Hypertext Markup Language (HTML) and viewed by means of a special portal called a "web browser". A web page is usually addressed via a unique Uniform Resource Locator (URL) that describes its location in that vast interstellar universe of the World Wide Web (WWW). The content of a web page in 2016 is often a rich mixture of text, pictures, video, audio and animation. We expect the "document" to be responsive to our touch, keystrokes, gestures and/or voice. A web page often remembers us ::: recalls our interests ::: records our histories.

A web page can be a deeply layered world. At its simplest it may look like :::

<!DOCTYPE html>
<html lang="en">
    <meta charset="utf-8">
    <title>My Title</title>
    <h1>My Title</h1>
    <p>My content</p>

example #1a ::: raw code

The content (what we read at the surface) is merely :::

My Title

My content

example #1b ::: as it would appear in the browser

The extra text in example #1a is the HTML markup. HTML is simply a set of tags and attributes that add meta-information ::: meta-data to the source content. HTML has evolved over many years.

In 1980, a physicist named Tim Berners-Lee was working as a contractor at CERN (the world's largest particle physics laboratory, based near Geneva, Switzerland). He envisioned a system for scientists at CERN to share documents. By 1991 he had built a rudimentary browser and had created an early form of HTML that included 18 tags that could be used to add context to content. HTML belongs to a language group that originated with something called SGML (Standard Generalized Markup Language), a very early standard for representing paper documents in a digital form. This family also includes XML (Extensible Markup Language), a later, stricter distillation of SGML that was developed to provide a semantic layer for machine-based exchange of data. HTML is not a subset of XML ::: early versions of HTML were defined as applications of SGML, and XHTML is the variant of HTML that follows XML's stricter rules. HTML is now in its fifth version (HTML5). It is a web standard published by an international standards body called the World Wide Web Consortium (W3C) and also developed as a living standard by the WHATWG.

HTML instigated a document flow tsunami ::: the information superhighway. It is at its core a very simple concept. HTML defines a set of elements and attributes that are used to provide structural meta-information for what would otherwise be a blob of raw text. It helps the browser ::: the document interpreter understand the relevance of various sections of the document. It marks up a hierarchy of headings (h1, h2, h3, h4, h5, h6) and denotes paragraphs (p). It signifies links, video, audio, images. There are many tutorials on HTML5 available online. For example, start here.
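
A minimal sketch of that structural markup ::: a heading hierarchy, paragraphs, a link and some media. The URL, the file names and the text are placeholders made up for illustration, not taken from a real page.

```html
<!DOCTYPE html>
<html lang="en">
    <meta charset="utf-8">
    <title>Headings, paragraphs, links, media</title>

    <!-- headings mark up the hierarchy of the document -->
    <h1>Main heading</h1>
    <p>A paragraph introducing the page.</p>

    <h2>A section</h2>
    <p>A paragraph with a <a href="https://example.com/">link to another page</a>.</p>

    <h3>A subsection</h3>
    <!-- hypothetical file names ::: replace with real media -->
    <img src="photo.jpg" alt="Short description of the photo">
    <audio src="clip.mp3" controls></audio>
    <video src="movie.mp4" controls></video>
</html>
```

Saved as a file ending in .html and opened in a browser, this should show three levels of heading, two paragraphs, and placeholders for the media until real files exist at those names.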

Nested boxes

At its core HTML is a set of boxes and links. It is a markup language for chunking content ::: for adding semantic / structural context and for defining links / references to other resources. Those elements ::: those boxes are bracketed by tags. For example, a paragraph might be marked up as follows ::: <p>paragraph content here</p>. In this example ::: <p> is called the opening tag and </p> is called the closing tag. Learn more at the HTML reference / tutorials ::: MDN. These elements / boxes can contain other boxes, text, audio, images, video. We live in a media-rich world. Links can transport us to locations within the document and to locations outside. Connections between resources (web pages, content chunks, data sets, media streams) can be made via a simple hyperlink ::: (click to navigate to the linked resource) but can also be like pipelines opened to a remote service, data source, media origin. Browsers are like spiders spinning, connecting; sky travel portals; radios receiving.
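
As a rough illustration of boxes inside boxes, the fragment below nests a heading, a paragraph and a list inside a section, and uses two links ::: one to a location within the document and one to a location outside. The id values and the URL are hypothetical names chosen for this sketch.

```html
<!-- an outer box (section) containing other boxes -->
<section id="reading">
    <h2>Reading list</h2>
    <p>A paragraph box with <em>emphasized text</em> nested inside it.</p>
    <ul>
        <li>A link to <a href="#notes">a location within this document</a></li>
        <li>A link to <a href="https://example.com/">a location outside it</a></li>
    </ul>
</section>

<!-- the box that the internal link above points to -->
<section id="notes">
    <h2>Notes</h2>
    <p>The target of the internal link.</p>
</section>
```

The href value #notes matches the id of the second section, which is how the browser knows where to jump within the page.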

Reveal / conceal ::: subterranean gears ::: layers

Look deeper into the looking glass .... Behind the scenes ::: below the surface a lot is happening in the context of a "web page". There is content and markup and links to styling and behavior information. There is meta content that aids the browser in interpreting the HTML document. There is meta information that explains the subject / the content to search engines and social media sites. There could be code that listens and responds to events that happen as the user interacts with the page.
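
One way those layers might sit together in a single page is sketched below. It is an assumption-laden example, not a prescription ::: style.css is a hypothetical stylesheet, the id values are invented, and the description and Open Graph title are placeholder text.

```html
<!DOCTYPE html>
<html lang="en">
<head>
    <!-- meta content that helps the browser interpret the document -->
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <title>My Title</title>

    <!-- meta information for search engines and social media sites -->
    <meta name="description" content="A one-sentence summary of the page.">
    <meta property="og:title" content="My Title">

    <!-- link to styling information (hypothetical file) -->
    <link rel="stylesheet" href="style.css">
</head>
<body>
    <!-- the content layer ::: what we read at the surface -->
    <h1>My Title</h1>
    <p id="message">My content</p>
    <button id="reveal" type="button">Reveal</button>

    <!-- behavior ::: code that listens and responds to a user event -->
    <script>
        var button = document.getElementById("reveal");
        var message = document.getElementById("message");
        button.addEventListener("click", function () {
            message.textContent = "You clicked the button.";
        });
    </script>
</body>
</html>
```

Only the heading, the paragraph and the button appear at the surface. Everything in the head and the script works below it, and clicking the button replaces the paragraph text.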
