HTML

Sir Tim Berners-Lee, knighted for his efforts to create the World Wide Web.
Hyper-Text Markup Language (HTML) was developed by Sir Tim Berners-Lee while working at CERN (the European particle physics lab) to share scientific papers. It drew heavily on ideas from Standard Generalized Markup Language (SGML), with the addition of the anchor tag, which allows documents to link to one another (the basis for hyper-text). It also briefly flirted with XML-based formats (XHTML), which bear much similarity to SGML but apply the rules more strictly. Like all markup languages, HTML provides additional structure and presentation to plain text.

HTML Version 5

The current version - HTML5 - has broken with both these roots; it does not conform to either the SGML or XML standards, though it retains many similarities (in fact, it can often still be parsed by a lenient SGML or XML parser - but definitely not a strict one). There have been many changes to HTML since Tim Berners-Lee’s first draft, including many tags that have been added or deprecated. The authoritative source of which tags are current is the HTML5 standard, which is published by the World Wide Web Consortium.

The W3C

The World Wide Web Consortium, or W3C as it is often called, is the standards-generating body for most Web technologies. Their website is http://www.w3.org. Membership in the W3C includes many of the stakeholders in the industry - browser manufacturers, web developers, researchers, graphic designers, etc. The W3C provides standards for a family of technologies which they refer to as the Open Web Platform. Rather than being a single monolithic standard, the Open Web Platform is broken into technology categories, which, while interdependent to a certain degree, can be developed separately.

The Standards Adoption Process

These categories are organized around working groups - W3C members and invited stakeholders and experts who are responsible for developing the standards. The process begins with interest in a topic, which is reflected by member submissions. When significant interest is generated, a workshop may be organized around the topic. A Working Group is then formed to develop standards for the topic. The Working Group develops standards through an iterative process of publication and public review:

  1. Publication of a First Public Working Draft
  2. Publication of subsequent Public Working Drafts
  3. Publication of a Candidate Recommendation
  4. Publication of a Proposed Recommendation
  5. Publication of a W3C Recommendation
  6. Possibly, Publication as an Edited Recommendation

It is not uncommon for more active browser manufacturers to adopt or even pre-empt working drafts (usually because they are active in the Working Group and trying out ideas). Once a standard reaches Candidate Recommendation, most browser manufacturers begin to implement at least the easier parts. A full W3C Recommendation is generally adopted into browsers, or at least placed on the manufacturers' roadmaps for adoption. Note, however, that even at this stage the standard remains a recommendation: there is nothing (other than market pressure) that can force a manufacturer to follow it.

The HTML5 Standard

HTML5 became a W3C Recommendation on October 28, 2014, and can be found here: https://www.w3.org/TR/2014/REC-html5-20141028/. Much of the same information is also presented in wiki form by the W3C community - of particular interest is the HTML Elements page: https://www.w3.org/community/webed/wiki/HTML/Elements.

HTML Elements

HTML markup is organized into elements which correspond to specific functional roles: defining page structure, declaring semantic meaning, organizing content, linking to external resources, and so on. Each element is written in the page using a corresponding tag with the following structure:

<tagname attributes>content</tagname>

Note the use of angle brackets <> to mark the opening and closing tags that bracket the content. In this structure, tagname is the name of the element being created, e.g. a for an anchor tag; attributes are a series of key-value pairs, e.g. href="https://www.google.com" indicating that this anchor tag links to Google's website; and content is the text (or nested HTML elements) that forms the children of the element. The content should be followed by a closing tag, which repeats tagname preceded by a forward slash. Some elements have no content and can be written in an abbreviated self-closing style, e.g. <br/> is the line-break tag. All opening tags should be matched by a closing tag or be written in self-closing form (just as all brackets in programming must be matched).
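
As a small illustrative sketch of this structure, the fragment below shows an anchor element with one attribute, a pair of nested elements, and a self-closing line break. The link text and paragraph wording are made up for the example.

```html
<!-- An anchor element: tag name "a", one attribute (href), and text content. -->
<a href="https://www.w3.org">Visit the W3C</a>

<!-- Elements can nest: the <em> element is a child of the <p> element,
     and it is closed before its parent is closed. -->
<p>This paragraph contains <em>emphasized</em> text.</p>

<!-- A line break has no content, so it needs no separate closing tag.
     HTML5 accepts both <br> and the self-closing form <br/>. -->
<p>First line<br/>Second line</p>
```

Note that each closing tag appears in the reverse order of its opening tag, much like matched brackets in a program.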

Browsers are designed to be very forgiving of syntax errors - much more so than any compiler or interpreter. Nonetheless, you should strive to write only valid HTML, as invalid HTML can be interpreted in different ways by different browsers. Use of a validation tool or HTML linter can be extremely helpful.
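
To make the distinction concrete, here is a minimal sketch of a complete page that could be pasted into a validation tool (for example, the W3C Markup Validation Service). The title and paragraph text are placeholders invented for this example, and the invalid markup is kept inside a comment so that the page itself remains valid.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8"/>
    <title>Validation example</title>
  </head>
  <body>
    <!-- Invalid (shown inside a comment so this page itself stays valid):
         <p>Some <em>mis-nested <strong>text</em> here</strong></p>
         The <em> and <strong> tags overlap instead of nesting, so each
         browser must guess at the intended structure. -->

    <!-- Valid: each element is closed before its parent is closed. -->
    <p>Some <em>properly <strong>nested</strong></em> text here</p>
  </body>
</html>
```

A browser will usually render both forms without complaint, which is exactly why a validator is useful: it reports the overlap that the browser silently repairs.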