HTML5 Tutorial
- Semantic Elements - 2020
This chapter shows how the pre-HTML5 file, Sample.html, can be transformed into an HTML5 file, SampleHTML5.html, using each of the new semantic elements listed below.
- <header>
The header element represents a group of introductory or navigational aids. A header element is intended to usually contain the section's heading (an h1-h6 element or an hgroup element), but this is not required. The header element can also be used to wrap a section's table of contents, a search form, or any relevant logos.
header example
- <article>
The article element represents a component of a page that consists of a self-contained composition in a document, page, application, or site and that is intended to be independently distributable or reusable, e.g. in syndication. This could be a forum post, a magazine or newspaper article, a Web log entry, a user-submitted comment, an interactive widget or gadget, or any other independent item of content.
article example
- <time>
The time element represents either a time on a 24-hour clock, or a precise date in the proleptic Gregorian calendar, optionally with a time and a time-zone offset.
- <mark>
The mark element represents a run of text in one document marked or highlighted for reference purposes.
time example
- <nav>
The nav element represents a section of a page that links to other pages or to parts within the page: a section with navigation links. Not all groups of links on a page need to be in a nav element - only sections that consist of major navigation blocks are appropriate for the nav element. In particular, it is common for footers to have a short list of links to various pages of a site, such as the terms of service, the home page, and a copyright page. The footer element alone is sufficient for such cases, without a nav element.
nav example
- <section>
The section element represents a generic document or application section. A section, in this context, is a thematic grouping of content, typically with a heading. Examples of sections would be chapters, the various tabbed pages in a tabbed dialog box, or the numbered sections of a thesis. A Web site's home page could be split into sections for an introduction, news items, contact information.
It's used in the footer example.
- <aside>
The aside element represents a section of a page that consists of content that is tangentially related to the content around the aside element, and which could be considered separate from that content. Such sections are often represented as sidebars in printed typography. The element can be used for typographical effects like pull quotes or sidebars, for advertising, for groups of nav elements, and for other content that is considered separate from the main content of the page.
- <hgroup>
The hgroup element represents the heading of a section. The element is used to group a set of h1-h6 elements when the heading has multiple levels, such as subheadings, alternative titles, or taglines.
It's used in the header example.
- <footer>
The footer element represents a footer for its nearest ancestor sectioning content or sectioning root element. A footer typically contains information about its section such as who wrote it, links to related documents, copyright data, and the like. Footers don't necessarily have to appear at the end of a section, though they usually do. When the footer element contains entire sections, they represent appendices, indexes, long colophons, verbose license agreements, and other such content.
footer example
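To see how the elements in the list above fit together, here is a minimal sketch of a page that uses most of them. The headings, link targets, and text are placeholders invented for illustration; they are not taken from Sample.html. Heading levels follow rank (h1, h2, h3), which works whether or not a consumer applies the HTML5 outline algorithm discussed later.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Semantic elements sketch</title>
</head>
<body>
  <!-- Introductory content for the whole page -->
  <header>
    <h1>Site title (placeholder)</h1>
    <!-- Major navigation block -->
    <nav>
      <ul>
        <li><a href="#">home</a></li>
        <li><a href="#">about us</a></li>
      </ul>
    </nav>
  </header>

  <!-- Self-contained, independently reusable content -->
  <article>
    <header>
      <h2>Article title (placeholder)</h2>
      <p>Posted on <time datetime="2010-06-03">June 3, 2010</time></p>
    </header>
    <section>
      <h3>First part</h3>
      <p>Body text with a <mark>highlighted</mark> phrase.</p>
    </section>
    <!-- Tangentially related content, such as a pull quote -->
    <aside>
      <p>A pull quote or other side note.</p>
    </aside>
  </article>

  <!-- Footer for the page as a whole -->
  <footer>
    <p>Copyright notice and a few short links go here.</p>
  </footer>
</body>
</html>
```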
Let's look at the pre-HTML5 example, Sample.html.
<div id="header">
  <h1>MyHTML</h1>
  <p class="tagline">Open Source Web Animation</p>
</div>
...
<div class="entry">
  <h2>What is HTML5? </h2>
</div>
...
<div class="entry">
  <h2>What's wrong with HTML4?</h2>
</div>
There is nothing wrong with this HTML. It is valid HTML5, so we could keep it as is. HTML5, however, provides some additional semantic elements for headers and sections.
First, get rid of that <div id="header">. This is a very common pattern, but it means nothing to a machine. The div element has no defined semantics, and the id attribute has no defined semantics, either.
HTML5 defines a <header> element for this purpose.
<header>
  <h1>MyHTML</h1>
  <p class="tagline">Open Source Web Animation</p>
</header>
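One practical consequence of dropping the id: any style rule that targeted #header no longer matches. The sketch below shows the kind of change this implies. The stylesheet of Sample.html is not shown in this chapter, so both rules and their property values are hypothetical, and the second rule assumes the page header is a direct child of <body>.

```html
<style>
  /* Before: matched <div id="header"> (hypothetical rule) */
  #header { border-bottom: 1px solid #888; }

  /* After: an element selector matches the new <header> element.
     The child combinator limits it to the page-level header,
     so a <header> nested inside an <article> is not affected. */
  body > header { border-bottom: 1px solid #888; }
</style>
```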
This tells user agents that it is a header. But what about that tagline? It is another common pattern which, up until now, had no standard markup. It's a difficult thing to mark up. A tagline is like a subheading, but it's "attached" to the primary heading. That is, it's a subheading that doesn't create its own section.
Heading elements like <h1> and <h2> give our page structure. Taken together, they create an outline that we can use to visualize (or navigate) our page. In HTML 4, <h1>-<h6> elements were the only way to create a document outline.
What does the outline of our sample look like?
MyHTML (h1)
 |
 +--What is HTML5? (h2)
 |
 +--What's wrong with HTML4? (h2)
That's fine, but it means that there's no way to mark up the tagline "Open Source Web Animation." If we tried to mark it up as an <h2>, it would add a phantom node to the document outline:
MyHTML (h1)
 |
 +--Open Source Web Animation (h2)
 |
 +--What is HTML5? (h2)
 |
 +--What's wrong with HTML4? (h2)
But that's not the structure of the document. The tagline does not represent a section; it's just a subheading. Perhaps we could mark up the tagline as an <h2> and mark up each article title as an <h3>? No, that's even worse:
MyHTML (h1)
 |
 +--Open Source Web Animation (h2)
     |
     +--What is HTML5? (h3)
     |
     +--What's wrong with HTML4? (h3)
Now we still have a phantom node in our document outline, but it has "stolen" the children that rightfully belong to the root node. And herein lies the problem: HTML 4 does not provide a way to mark up a subheading without adding it to the document outline. No matter how we try to shift things around, "Open Source Web Animation" is going to end up in that graph. And that's why we end up with semantically meaningless markup like <p class="tagline">.
Here comes <hgroup> to the rescue. HTML5 provides a solution for this: the <hgroup> element. The <hgroup> element acts as a wrapper for two or more related heading elements. What does "related" mean? It means that, taken together, they only create a single node in the document outline.
So, here is the updated markup for the example above:
<header>
  <hgroup>
    <h1>MyHTML</h1>
    <h2>Open Source Web Animation</h2>
  </hgroup>
</header>
...
<div class="entry">
  <h2>What is HTML5? </h2>
</div>
...
<div class="entry">
  <h2>What's wrong with HTML4?</h2>
</div>
Note that we also use a style sheet rule for the new <mark> element:
mark { display:inline-block; background:#ff8; border:1px dotted #888 }
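The rule above only has a visible effect once some text is wrapped in <mark>. Here is a small sketch of how that might look; the sentence being highlighted is made up for illustration.

```html
<style>
  /* Same rule as above: a pale yellow box with a dotted border */
  mark { display:inline-block; background:#ff8; border:1px dotted #888 }
</style>

<p>
  HTML5 adds elements such as <mark>header</mark> and
  <mark>article</mark> that describe what a block of content is for.
</p>
```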
With the <hgroup> in place, the outline becomes:
MyHTML (h1 of its hgroup)
 |
 +--What is HTML5? (h2)
 |
 +--What's wrong with HTML4? (h2)
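Later revisions of the HTML specification changed the content model of <hgroup>: it now groups a single heading with one or more <p> elements rather than several headings. The sketch below shows the same header written under that assumption. It reuses the tagline class from the original markup and still contributes a single heading to the outline.

```html
<header>
  <hgroup>
    <h1>MyHTML</h1>
    <!-- The tagline is a paragraph, so it adds no heading of its own -->
    <p class="tagline">Open Source Web Animation</p>
  </hgroup>
</header>
```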
Let's look at our Sample.html again.
<div class="entry">
  <p class="post-date">June 3, 2010</p>
  <h2>
    <a href="#" rel="bookmark" title="link to my homepage">What is HTML5? </a>
  </h2>
  <p>HTML5 is the next generation of HTML.....</p>
</div>
This is valid HTML5, but what can we do with this markup in HTML5? As we might guess, HTML5 provides a more specific element for the common case of marking up an article on a page: the <article> element.
<article>
  <header>
    <p class="post-date">June 3, 2010</p>
    <h1>
      <a href="#" rel="bookmark" title="link to my homepage">What is HTML5? </a>
    </h1>
  </header>
  <p>HTML5 is the next generation of HTML.....</p>
</article>
Here, we changed the <h2> element to an <h1> and wrapped it inside a <header> element. The purpose of the <header> element is to wrap all the elements that form the article's header (in this case, the article's publication date and title). But shouldn't we only have one <h1> per document? Won't this screw up the document outline? No, but to understand why not, we need to back up a step.
In HTML 4, the only way to create a document outline was with the <h1>-<h6> elements. If we only wanted one root node in our outline, we had to limit ourselves to one <h1> in our markup. But the HTML5 specification defines an algorithm for generating a document outline that incorporates the new semantic elements in HTML5. The HTML5 algorithm says that an <article> element creates a new section, that is, a new node in the document outline. And in HTML5, each section can have its own <h1> element.
This is a drastic change from HTML 4, and here's why it's a good thing. Many web pages are really generated by templates. A bit of content is taken from one source and inserted into the page up here; a bit of content is taken from another source and inserted into the page down there. Many tutorials are structured the same way. "Here's some HTML markup. Just copy it and paste it into our page." That's fine for small bits of content, but what if the markup we're pasting is an entire section? In that case, the tutorial will read something like this: "Here's some HTML markup. Just copy it, paste it into a text editor, fix the heading tags so they match the nesting level of the corresponding heading tags in the page we're pasting it into."
Let me put it another way. HTML 4 has no generic heading element. It has six strictly numbered heading elements, <h1>-<h6>, which must be nested in exactly that order. That kind of sucks, especially if our page is "assembled" instead of "authored." And this is the problem that HTML5 solves with the new sectioning elements and the new rules for the existing heading elements. If we're using the new sectioning elements, I can give us the article markup above, and we can copy it and paste it anywhere in our page without modification. The fact that it contains an <h1> element is not a problem, because the entire thing is contained within an <article>. The <article> element defines a self-contained node in the document outline, the <h1> element provides the title for that outline node, and all the other sectioning elements on the page will remain at whatever nesting level they were at before.
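To make the copy-and-paste argument concrete, here is a sketch of the same article dropped, unchanged, into two different pages. The surrounding "News" section and its heading are hypothetical. One caveat worth keeping in mind: browsers and screen readers generally report heading levels from the element name (h1 to h6) rather than computing the outline described above, so the nested <h1> may still be announced as a top-level heading.

```html
<!-- Page A: the article sits at the top level of the page -->
<body>
  <h1>MyHTML</h1>
  <article>
    <h1>What is HTML5?</h1>
    <p>HTML5 is the next generation of HTML.....</p>
  </article>
</body>

<!-- Page B: the same article, pasted one level deeper inside a section -->
<body>
  <h1>MyHTML</h1>
  <section>
    <h1>News</h1>
    <article>
      <h1>What is HTML5?</h1>
      <p>HTML5 is the next generation of HTML.....</p>
    </article>
  </section>
</body>
```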
Let's look at our Sample.html once more.
<div class="entry">
  <p class="post-date">June 3, 2010</p>
  <h2>
    <a href="#" rel="bookmark" title="link to my homepage">
      What is HTML5?
    </a>
  </h2>
</div>
The publication date here has no semantic markup to back it up, so authors resort to generic markup with custom class attributes. Again, this is valid HTML5. We're not required to change it. But HTML5 does provide a specific solution for this case: the <time> element.
<time datetime="2010-06-03" pubdate>June 3, 2010</time>
There are three parts to a <time> element:
- A machine-readable timestamp
- Human-readable text content
- An optional pubdate flag
<time datetime="2010-06-03" pubdate>June 3, 2010</time>
If we want to include a time too, add the letter T after the date, then the time in 24-hour format, then a timezone offset.
<time datetime="2010-06-03T13:59:47-04:00" pubdate>
  June 3, 2010 1:59pm EDT
</time>
Notice I changed the text content (the stuff between <time> and </time>) to match the machine-readable timestamp. This is not actually required. The text content can be anything we like, as long as we provide a machine-readable date/timestamp in the datetime attribute. So this is valid HTML5:
<time datetime="2010-06-03">last Thursday</time>
This is also valid HTML5:
<time datetime="2010-06-03"></time>
The final piece of the puzzle here is the pubdate attribute. It's a boolean attribute, so just add it if we need it, like this:
<time datetime="2010-06-03" pubdate>June 3, 2010</time>
If we dislike "naked" attributes, this is also equivalent:
<time datetime="2010-06-03" pubdate="pubdate">June 3, 2010</time>
What does the pubdate attribute mean? It means one of two things. If the <time> element is in an <article> element, it means that this timestamp is the publication date of the article. If the <time> element is not in an <article> element, it means that this timestamp is the publication date of the entire document. (The pubdate attribute was dropped from later versions of the HTML specification, so current validators reject it; the datetime attribute is unaffected.)
Here's the entire article, reformulated to take full advantage of HTML5:
<article>
  <header>
    <time datetime="2010-06-03" pubdate>June 3, 2010</time>
    <h1>
      <a href="#" rel="bookmark" title="link to my homepage">
        What is HTML5?
      </a>
    </h1>
  </header>
  <p>HTML5 is the next generation of HTML, ....</p>
</article>
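Because the datetime attribute is machine-readable, a script can work with it directly. The sketch below is an assumption about how one might consume it, not part of Sample.html: it finds every <time> element that carries a datetime attribute and stores a locale-formatted version of the timestamp in the element's title attribute, leaving the visible text alone.

```html
<p>
  Posted
  <time datetime="2010-06-03T13:59:47-04:00">June 3, 2010 1:59pm EDT</time>
</p>

<script>
  // Select only <time> elements that carry a machine-readable timestamp.
  document.querySelectorAll('time[datetime]').forEach(function (el) {
    // getAttribute returns the raw string, e.g. "2010-06-03T13:59:47-04:00".
    var parsed = new Date(el.getAttribute('datetime'));
    // Skip values that Date cannot parse, such as a bare time like "13:59".
    if (!isNaN(parsed.getTime())) {
      el.title = parsed.toLocaleString();
    }
  });
</script>
```

The human-readable text stays under the author's control, which matches the point above that the text content can be anything we like.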
One of the most important parts of any web site is the navigation bar. Our Sample.html has a navigation bar in the header that includes links to different sections of our hypothetical site: "home," "about us," and "sitemap." This is how the navigation bar was originally marked up:
<div id="nav">
  <ul>
    <li><a href="#">home</a></li>
    <li><a href="#">about us</a></li>
    <li><a href="#">sitemap</a></li>
  </ul>
</div>
Again, this is valid HTML5. But while it's marked up as a list of three items, there is nothing about the list that tells us that it's part of the site navigation. Visually, we could guess that from the fact that it's part of the page header, and by reading the text of the links. But semantically, there is nothing to distinguish this list of links from any other.
This matters to people who rely on a screen reader. To navigate quickly, they might tell the screen reader to jump to the navigation bar and start reading. To browse quickly, they might tell it to jump over the navigation bar and start reading the main content. Either way, being able to determine navigation links programmatically is important.
So while there's nothing wrong with using <div id="nav"> to mark up our site navigation, there's nothing particularly right about it either. It's suboptimal in ways that affect real people. HTML5 provides a semantic way to mark up navigation sections: the <nav> element.
<nav>
  <ul>
    <li><a href="#">home</a></li>
    <li><a href="#">about us</a></li>
    <li><a href="#">sitemap</a></li>
  </ul>
</nav>
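As noted in the element list at the top, not every group of links needs a <nav>. The sketch below pairs the main navigation bar with a short footer list that is left as plain links. The aria-label value and the footer link names are illustrative assumptions.

```html
<header>
  <!-- Major navigation block: gets a <nav>.
       The label helps tell navs apart when a page has several. -->
  <nav aria-label="Main">
    <ul>
      <li><a href="#">home</a></li>
      <li><a href="#">about us</a></li>
      <li><a href="#">sitemap</a></li>
    </ul>
  </nav>
</header>

<footer>
  <!-- A short list of secondary links: the <footer> alone is sufficient -->
  <p>
    <a href="#">terms of service</a> |
    <a href="#">copyright</a>
  </p>
</footer>
```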
The footer was originally marked up like this:
<div id="footer">
  <p>————————— </p>
  <p>© 2012 <a href="#">BoGoToBoGo KiHyuck Hong</a></p>
</div>
This is valid HTML5. But HTML5 provides a more specific element for this: the <footer> element.
<footer>
  <p>————————— </p>
  <p>© 2012 <a href="#">BoGoToBoGo KiHyuck Hong</a></p>
</footer>
What's appropriate to put in a <footer> element? Probably whatever we're putting in a <div id="footer"> now. OK, that's a circular answer. But really, that's it. The HTML5 specification says, "A footer typically contains information about its section such as who wrote it, links to related documents, copyright data, and the like." That's what this example page has: a short copyright statement and a link to an about-the-author page. Looking around at some popular sites, I see lots of footer potential.
- CNN has a footer that contains a copyright statement, links to translations, and links to terms of service, privacy, "about us," "contact us," and "help." All totally <footer> material.
- Google has a famously sparse home page, but at the bottom of it are links to "Advertising Programs," "Business Solutions," "About Google," a copyright statement, and a link to Google's privacy policy. All of that could be wrapped in a <footer>.
Larger footers with several groups of links work too. Consider this footer markup from the W3C site:
<div id="w3c_footer">
  <div class="w3c_footer-nav">
    <h3>Navigation</h3>
    <ul>
      <li><a href="/">Home</a></li>
      <li><a href="/standards/">Standards</a></li>
      <li><a href="/participate/">Participate</a></li>
      <li><a href="/Consortium/membership">Membership</a></li>
      <li><a href="/Consortium/">About W3C</a></li>
    </ul>
  </div>
  <div class="w3c_footer-nav">
    <h3>Contact W3C</h3>
    <ul>
      <li><a href="/Consortium/contact">Contact</a></li>
      <li><a href="/Help/">Help and FAQ</a></li>
      <li><a href="/Consortium/sup">Donate</a></li>
      <li><a href="/Consortium/siteindex">Site Map</a></li>
    </ul>
  </div>
  <div class="w3c_footer-nav">
    <h3>W3C Updates</h3>
    <ul>
      <li><a href="http://twitter.com/W3C">Twitter</a></li>
      <li><a href="http://identi.ca/w3c">Identi.ca</a></li>
    </ul>
  </div>
  <p class="copyright">Copyright © 2010 W3C</p>
</div>
To convert this to semantic HTML5, I would make the following changes:
- Convert the outer <div id="w3c_footer"> to a <footer> element.
- Convert the first two instances of <div class="w3c_footer-nav"> to <nav> elements, and the third instance to a <section> element.
- Convert the <h3> headings to <h1>, since they'll now each be inside sectioning elements. The <nav> element creates a section in the document outline, just like the <article> element.
<footer>
  <nav>
    <h1>Navigation</h1>
    <ul>
      <li><a href="/">Home</a></li>
      <li><a href="/standards/">Standards</a></li>
      <li><a href="/participate/">Participate</a></li>
      <li><a href="/Consortium/membership">Membership</a></li>
      <li><a href="/Consortium/">About W3C</a></li>
    </ul>
  </nav>
  <nav>
    <h1>Contact W3C</h1>
    <ul>
      <li><a href="/Consortium/contact">Contact</a></li>
      <li><a href="/Help/">Help and FAQ</a></li>
      <li><a href="/Consortium/sup">Donate</a></li>
      <li><a href="/Consortium/siteindex">Site Map</a></li>
    </ul>
  </nav>
  <section>
    <h1>W3C Updates</h1>
    <ul>
      <li><a href="http://twitter.com/W3C">Twitter</a></li>
      <li><a href="http://identi.ca/w3c">Identi.ca</a></li>
    </ul>
  </section>
  <p class="copyright">Copyright © 2010 W3C</p>
</footer>
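A <footer> does not have to belong to the whole page. Since it applies to its nearest ancestor sectioning element, an <article> can carry its own. Here is a minimal sketch of that, with a made-up byline and link; the page-level footer reuses the copyright line from the sample.

```html
<article>
  <h1>What is HTML5?</h1>
  <p>HTML5 is the next generation of HTML.....</p>
  <!-- This footer describes the article, not the page -->
  <footer>
    <p>
      Written by <a href="#">the author</a>. Posted
      <time datetime="2010-06-03">June 3, 2010</time>.
    </p>
  </footer>
</article>

<!-- The page-level footer is a separate element outside the article -->
<footer>
  <p>© 2012 <a href="#">BoGoToBoGo KiHyuck Hong</a></p>
</footer>
```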