So what's wrong with HTML 5?

As it sits right now, HTML 5 is quickly becoming a sick buzzword akin to "Web 2.0" - typically used by people who don't know enough about what it actually is, why to use it, or most importantly when NOT to use it! The comparison to "Web 2.0" fits WAY too well: while there is a legitimate HTML 5, most of what people call HTML 5... well... just plain isn't. The new CSS3 effects, media queries and responsive layout? That's NOT HTML! The new scripting stuff? NOT HTML!!! Take away all the actually 'cool' stuff that people who don't know any better are running around calling HTML 5, and what are you left with?

Not much. The Emperor is standing bare for the world to see.

Some history

"Dude, you're a guy, why is your name hyphenated?" Tim Berners-Lee with a NeXT workstation. Picture from Plyojump's web history article.

When Tim Berners-Lee created HTML all those years ago, the original intent was to say what things ARE, or would be when professionally written, so that the "user agent" - what today is best known as a browser - could best determine how to show that content within the capabilities of the target device. What today we call "semantic markup" is just what HTML is supposed to have been used for all along. This is because at the time there was a whole slew of possible different devices on which a document could be shown, ranging from teletype to the 22x23 character VIC-20 screen to the 1120x832 NeXT workstation TBL was working on at the time... nothing like today where we have all these displays of uniform size and resolution... NOT.

Sadly, that entire intent was thrown out the window during the browser wars of the late 1990s, when browser makers started making up their own extensions to HTML to let 'designers' say what things look like. This destroyed the accessibility of pages built using these additions and made people stop thinking about what tags actually meant - instead the entire focus became what the tags looked like, defeating the entire reason HTML even existed in the first place! Eventually, being the toothless wimps they are, the W3C adopted many of these into HTML 3.2.

Then the W3C went all schizophrenic. They created two HTMLs (three if you count Frameset) - "4 Transitional", where everything in 3.2 and a lot of newer proprietary tags were allowed into the specification, and "4 Strict", which was meant to drive HTML back to the original purpose of saying what things are, so CSS could be used to say what things look like for specific targets. Transitional was to allow old pages to add the new HTML 4 stuff; Strict was for the creation of new, better, faster, and more accessible websites.

STRICT wasn't just about the removal of presentation from markup - it was also about removing redundancies. APPLET was dropped and proprietary tags and attributes like EMBED or BGSOUND were rejected in favor of OBJECT. (and we were told IMG was on the chopping block as also redundant to OBJECT) MENU and DIR were dropped in favor of UL. Even the tags removed for being presentational like CENTER and FONT were removed more for being redundant to CSS, as they held no semantic meaning. (even B and I at least have the semantic meaning of "what it would be in a professionally written document").

OBJECT also existed for a very important reason - to allow the market and site developers to choose what formats to support; if we had PROPER OBJECT implementation and proper sandboxed plugin support, we could be using all sorts of fun formats like JPEG2000... Which, well... we'll get to that in a moment.
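To put that in concrete terms, here is a minimal sketch of the nested-OBJECT fallback the HTML 4.01 specification describes. The file names are made up for illustration, and how well any given browser actually honours this is exactly the "PROPER implementation" problem being complained about, so treat it as what was supposed to work rather than what reliably does.

```html
<!-- Hypothetical file names. The outer OBJECT is tried first; if the
     format is unsupported, the browser is supposed to fall through to
     the nested OBJECT, and finally to the plain text. -->
<object data="harbour.jp2" type="image/jp2">
  <object data="harbour.jpg" type="image/jpeg">
    A photograph of the harbour at dusk.
  </object>
</object>
```

The page author picks the formats and the order; the browser just walks down the chain until it finds something it can render.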

Bottom line: Transitional quite literally means "in transition from 1997 to 1998 coding practices", while STRICT meant following all the good development and accessibility practices HTML was supposed to be about. Transitional for old sites, Strict for new...

So naturally what did people write brand-new pages as in the decade that followed? Why transitional of course since few developers were willing to extract their cranium from 1997's rectum. Didn't matter if the new technology was more accessible, used less bandwidth, was actually easier to work with, or could prepare you for future possibilities - as always people just wanted to sleaze along with what they were doing, regardless of how much it costs in the long run.

Along comes HTML 5

... and it's a sorry state of affairs. Instead of further trying to drag developers out of the dark ages of computing it gives them a gentle pat on the back, re-introduces old redundancies, creates all new redundancies, and adds a bunch of flashy 'sounds good' nonsense that is more fiction than fact. In many ways it's just the W3C throwing its hands in the air and saying "oh well, just go ahead and sleaze things out any old way. We don't care anymore!"

Redundancies

The re-introduction of redundancies is particularly annoying - EMBED for example was not allowed into HTML 4 for being redundant to OBJECT. Now with HTML 5 we not only get EMBED in the specification, we get VIDEO and AUDIO which are ALSO redundant to why OBJECT exists. They aren't doing a single blasted thing with these new tags that couldn't have been handled by OBJECT. Instead of riding browser makers' asses about implementing OBJECT properly, they create redundant tags and allow in an old tag for no fathomable reason. WORSE, with AUDIO and VIDEO you are at the mercy of browser makers for supported formats. That's called "vendor lock-in", yet laughably they have sold it as fighting vendor lock-in, all because Flash won the media format wars of Quicktime vs. WMP vs. Realplayer from around the same time period as the browser wars.

You'd think all the parties behind the VIDEO tag were the ones who lost that war, or had formats nobody would use by choice.

Even the new structural tags are redundant if you bother using numbered headings and horizontal rules for what they mean in professional writing. A heading is the start of a subsection; if the headings have 'levels' as conveyed by size or indentation, each one is the start of a subsection of the 'higher level' heading before it. Therein an H1 is the heading under which everything on the page is a subsection (like the site title, much like a newspaper title or book title is at the top of EVERY page), an H2 is the start of a main subsection, an H3 is the start of a subsection of the H2 before it, and so forth... A horizontal rule means a change in topic/section where a text heading is unwanted/unwarranted/undesired, and anything after the last HR is the footer. Again, use tags for what they mean - H1 through H6 don't mean "big text", nor does an HR mean "draw a line on the screen".
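As a rough sketch of that structure in practice - the headings and text here are invented placeholders, not taken from any real page - the outline falls out of nothing but H1 through H3 and a single HR:

```html
<!-- Site title: everything on the page is a subsection of this -->
<h1>Example Site</h1>

<!-- Start of the main content section -->
<h2>Why Numbered Headings Matter</h2>
<p>Opening paragraph of the article.</p>

<!-- Subsection of the H2 before it -->
<h3>A Closer Look</h3>
<p>Supporting detail.</p>

<!-- Another main section, a sibling of the first H2 -->
<h2>Related Articles</h2>
<ul>
  <li><a href="/articles/one">Article One</a></li>
  <li><a href="/articles/two">Article Two</a></li>
</ul>

<!-- Change of topic where a heading is unwanted; what follows
     the last HR is the footer -->
<hr>
<p>Copyright notice and contact details.</p>
```

No HEADER, SECTION or FOOTER in sight, and the heading levels alone still give a heading navigator a usable tree.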

Which means if we have all that to mark the start of sections, changes in topic, or even the final page footer, what legitimate purpose does HTML 5's HEADER, SECTION and FOOTER have other than being redundant pointless code bloat? The only possible explanation would be a certain group of developers interested in what's called "data scraping" -- and let's be perfectly blunt about this and use their REAL name: CONTENT THIEVES.

Likewise, if numbered headings when used properly make a nice outline/navigable tree of the document - one that browsers like "real" Opera (as opposed to the pathetic crippleware that is Chrome with the Opera logo slapped on it) and screen readers like JAWS are able to leverage to navigate from the top of the document quickly to the start of content - what purpose does 5's NAV tag serve other than, once again, being pointless code-bloat? Remember, NAV exists to say "this is a navigation block you can skip to get to the page's content" - it's shocking how many people using HTML 5 seem to be unaware of that and think it means something completely different, usually a meaning pulled out of their backside! In particular you will often see HTML 5 users just putting NAV around a bunch of anchors with no list, thinking it replaces lists -- 100% grade A farm fresh manure and a fiction woven by the ignorant. Then they wonder why screen readers like JAWS treat their menu as a run-on sentence. NOT that I know anything about run-on sentences...
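For the record, this is roughly what the menu should look like. It's a minimal sketch with made-up link targets and a made-up ID; the point is only that the list carries the structure, whether or not a NAV gets wrapped around it.

```html
<!-- The menu is a list of links, with or without HTML 5 -->
<ul id="mainMenu">
  <li><a href="/">Home</a></li>
  <li><a href="/articles">Articles</a></li>
  <li><a href="/contact">Contact</a></li>
</ul>

<!-- If you insist on NAV, it goes AROUND the list, not instead of it -->
<nav>
  <ul>
    <li><a href="/">Home</a></li>
    <li><a href="/articles">Articles</a></li>
    <li><a href="/contact">Contact</a></li>
  </ul>
</nav>
```

Strip the UL and LI out of the second version and you are back to the bare anchors described above: a run of links with nothing to separate one from the next.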

Then there's HGROUP - putting multiple headings together as a single entity - TO WHAT END? If a heading is the start of a subsection why would you pair them together? I realize a lot of people pair them together as title and tagline type relationships, but in that case it's ONE heading and should probably be something like a SPAN or SMALL inside one numbered heading. HGROUP was such a poster child for how the people who came up with HTML 5 didn't even understand how to use HTML 4, even they ended up striking it from the specification sometime early 2013. That it even existed in the first place simply proved that the people who created HTML 5 didn't know enough about HTML 4 to be creating its successor!!!
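The title-and-tagline case, sketched with placeholder text, needs nothing more than this:

```html
<!-- One heading, one subsection start; the tagline is part of it -->
<h1>
  Example Site
  <small>A tagline that belongs to the same heading</small>
</h1>
```

One heading in the outline, with the tagline available as a styling hook, and no second numbered heading pretending to start a subsection that doesn't exist.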

ASIDE -- pointless or crippled?

IF we go with the only dictionary meaning of the word that would make the least bit of sense, a literary aside breaking the fourth wall and/or a subset of text related to yet supplementary to the content it is near - usage cases would be so extremely rare it ends up about as useful as ADDRESS is with its current definition in the spec. Sadly, people seem to be unaware what an aside is from a literary standpoint (the only one from which semantics means a blasted thing) and just seem to want to use it for sidebars - reducing its meaning to basically the same as the CENTER tag, tables for layout, or presentational use of classes like the OOCSS nutjobs would have you do... which is why I see no real reason for this tag to even exist... That is, unless you write a LOT of slashfic screenplays starring Ferris Bueller.

So... WHY?

Some would argue these new structural tags are "to avoid using DIV". Tags like DIV and SPAN are great as they don't apply any extra meaning to the content other than "this stuff may receive style" -- that's when you add them, and really they should not be added to your code until AFTER you've exhausted styling for your existing semantic tags.

These new "structural" tags really serve no legitimate purpose other than making tags for things people were abusing DIV for... things that quite often weren't needed in the markup in the first place.

It also goes hand in hand with throwing the concept of structural rules in the trash and pissing on the very idea of "logical document structure" - they now say "start over at H1 when you open an ARTICLE or SECTION" - that provides structure to the page HOW exactly? Much less if people are too stupid to use the ridiculously simple numbered headings we already had properly, do you REALLY think throwing more tags at it is going to help?

Seriously, when most of the people still vomiting up HTML 3.2, and slapping either 4 Transitional or 5 lip-service on it, are blissfully unaware of the simplest of tags and attributes - like LEGEND, FIELDSET, LABEL, TH, THEAD, TBODY, SCOPE, AXIS - and cannot be bothered to use the tags they do know - like H1, H2, P, etc, etc - properly, is throwing more tags at doing THE EXACT SAME THINGS really going to make things any better?

Not so much...

It's also resulting in broken sites even in "supported" browsers - see how they now say inline-level tags can wrap block-level ones; that would be fine if it didn't introduce margin, padding, float and all sorts of other positioning bugs in FF and Safari. Just because some people were too stupid to bother learning "Hey, you can't put an H1 inside an A" is no reason to throw that rule out the window; particularly when said rule was part of logical document structure!
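The old arrangement still works everywhere. A small sketch, with an invented URL and text, of a linked heading done the HTML 4 Strict way:

```html
<!-- Inline-level A inside block-level H2: valid in HTML 4.01 Strict
     and in HTML 5 alike -->
<h2><a href="/articles/example">Example Article Title</a></h2>
<p>
  Teaser text for the article, followed by its own
  <a href="/articles/example">read more</a> link.
</p>
```

If the whole block really has to be clickable, that is a presentation want and arguably belongs in the CSS, not in the nesting order of the tags.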

Tags that shouldn't even exist!

Moving on, we have the tags that, to be frank, have no business even EXISTING as tags. I am of course referring to the tags that only serve a purpose when supported by JavaScript; the likes of CANVAS and PROGRESS. If they only serve a purpose for JavaScript, why the blue blazes do they even HAVE a tag?!? Canvas in particular - mind you, I LIKE CANVAS - as a JavaScript feature! -- But there is no reason for it to even have a tag, and frankly I'm not sure why they didn't just make adding a context part of EVERY tag, behaving like a background-image or something similar. Would be far more useful that way, and again wouldn't even have its own tag. Hell, the CSS3 linear-gradient and its kin are basically background-image, why not CANVAS? The only reason I could see for it to have its own tag is for fallbacks with scripting off, but isn't that NOSCRIPT's job? Much less capabilities detection in the scripting? Again, a pointless redundancy.

Doctor Evil "Specification"

Shuffling bloat around instead of removing it?

I'm not even wild about the removal of versioning and the lip-service doctype. This whole notion of a "living document" just sets my teeth on edge, and is why when I say "specification" in regards to HTML 5 I now make air-quotes with my fingers like a second rate Doctor Evil... All the minor reductions in code-size in HEAD don't seem to offset the bloat that is the bread and butter of HTML 5 inside BODY. Really they seem to serve no real legitimate purpose other than making things uselessly vague and ignoring the other specifications (like SGML and XML) on which HTML and XHTML are based.

Cherries - Ripe on the outside, but inside it's the pits

Even some of the seemingly useful stuff is questionable if you really look at them; much of it stems from legitimizing bad/broken development practices - see PLACEHOLDER. When people were doing it with JavaScript, it was a useless wreck known as "False Simplicity" - that continues today, with many people using it instead of a proper LABEL.

Though even I admit that is NOT the HTML 5 specification's fault!

The "specification" says - and I quote:

The placeholder attribute should not be used as a replacement for a label

W3C HTML 5 specification for PLACEHOLDER attribute

But it just seems to prove that the people who are embracing HTML 5 don't seem to want to take the time to understand what they are trying to use, and that the specification itself does little but add more garbage for the people who never embraced HTML 4 Strict to abuse. The handful of alleged "real world" usage scenarios for that attribute are dubious at best, and in many cases just end up confusing users.
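What the quoted rule looks like when actually followed is something along these lines - a minimal sketch where the form action, field name and ID are invented for the example:

```html
<form action="/search" method="get">
  <fieldset>
    <legend>Site Search</legend>
    <!-- The LABEL says what the field is and stays visible while typing -->
    <label for="searchTerms">Search terms:</label>
    <!-- The placeholder is only a hint at the expected format -->
    <input type="text" id="searchTerms" name="q"
           placeholder="e.g. semantic markup">
    <input type="submit" value="Search">
  </fieldset>
</form>
```

Delete the placeholder and the form is still perfectly usable; delete the LABEL instead and the user loses the prompt the moment they start typing.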

"ARIA roles" are another aspect that, to be frank, just strikes me as pointless code bloat. It's like they gave the micro-formats nutters their own little corner to go spank it in with this one. The only legitimate purpose I can see for it is satiating the wants of data scrapers -- a group that I thought most site owners were trying to keep out so as to prevent content theft! Again, like microformats, sooner or later your CDATA has to be allowed to do what text does, and adding extra metadata and identifiers to it just has diminishing returns. Bottom line, you'll have a hard time convincing me that "ARIA roles" actually exist for visitors to websites to use, or that there's even a meaningful way to leverage them for such. It's just code bloat in the name of "We have to slap more meaning on everything".

Or the "data-" attributes for passing data to JavaScript... said data only serving a purpose in JS, why isn't it... oh I don't know... IN THE SCRIPTING?!? This is being abused as code bloat for people who don't grasp how to use JavaScript/ECMAScript properly; though to put that in perspective I say the same thing about the various onevent attributes that IMHO should be stricken from the HTML spec as redundant to attachEvent / addEventListener. I'm not saying the data- attributes can't be handy, but people are throwing them at everything willy-nilly.
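A quick sketch of what "in the scripting" means here. The form, its ID and the message text are all hypothetical, and legacy IE would need the attachEvent equivalent, which is left out to keep this short:

```html
<form id="deleteForm" action="/items/delete" method="post">
  <input type="hidden" name="id" value="42">
  <input type="submit" value="Delete">
</form>

<script>
// The message only matters to the script, so it lives in the script
// rather than in a data- attribute on the form.
var confirmMessage = 'Really delete this item?';

// Attached from the script instead of an onsubmit attribute in the markup.
document.getElementById('deleteForm').addEventListener('submit', function (event) {
  if (!window.confirm(confirmMessage)) {
    event.preventDefault(); // user cancelled, so stop the submit
  }
}, false);
</script>
```

With scripting off the form still submits normally, and the markup carries nothing that exists purely for the script's benefit.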

So it's all bad? Not really!

I'm not saying everything in HTML 5 is rubbish - MANIFEST for example is great if you are developing web applications. Some of the form attributes are handy too and gracefully degrade decently. The new TYPE attributes thankfully seem to degrade to TYPE="TEXT" so it's pretty much safe to use them so long as you are remembering that when processing the response server-side. The reduced doctype is a step too far, but it was in the right direction -- I more complain about the lack of versioning.

Something I REALLY like is how certain properties like "type" have been made optional in places where the tag or another attribute is already saying what it is -- like type="text/css" on a rel="stylesheet", or type="text/javascript" on a SCRIPT tag. The shortened doctype, header declarations and clearer, concise language and character encoding declarations are MORE than welcome! They even go a bit more into defining the role of certain tags and attributes instead of blindly hoping the people writing markup actually know something about professional writing!
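Put together, those reductions look something like this. It's a bare sketch and the file names are placeholders:

```html
<!DOCTYPE html>
<html lang="en">
<head>
<!-- Short character encoding declaration -->
<meta charset="utf-8">
<title>Example Page</title>
<!-- No type="text/css" needed; rel="stylesheet" already says what it is -->
<link rel="stylesheet" href="screen.css" media="screen">
<!-- No type="text/javascript" needed on SCRIPT -->
<script src="site.js"></script>
</head>
<body>
<h1>Example Page</h1>
</body>
</html>
```

Compare that to the HTML 4.01 Strict equivalent with its full public identifier, system URL, http-equiv content-type META and type attributes, and the savings in HEAD are obvious.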

But the gems are few and far between in this slag-heap. For every good change there's a slew of pointless redundancies, actively encouraged code bloat and broken methodologies, honestly leaving me feeling like they want to drag things back to the WORST of pre-Strict browser-wars era development practices.

In summary, that is why I and many other developers are saying that you should build and design using either HTML 4.01 STRICT or XHTML 1.0 STRICT, and if you have to / want to use the handful of useful bits in HTML 5, live with the unrecognized tags/attributes during validation and slap the HTML 5 doctype on it at the last moment. In many ways I think we need an "HTML 5 STRICT" to axe all the pointless redundancies and backwards thinking while including JUST the actual improvements... that or it's just time to make a whole new specification that addresses the concerns of people making websites, instead of the whims of the browser makers. One that might actually, I don't know, be authoritative instead of documentative - you know, a SPECIFICATION!

Clarifying the meanings and having a version written in plain English wouldn't hurt either. REALLY doesn't help anyone that the "specification" is written not in a manner meant for digestion by people who write websites, but for people who write BROWSERS. If that's not back-assward, what is?

Of course, the 500 pound gorillas at the W3C are the browser makers, so go figure... They could really give a flying purple fish about the people making websites, they just want a flashy new buzzword to try and dupe people into using their particular flavor of browser since, well... Joe forbid they be expected to win us over with innovative user interface features -- As proven by most every browser dragging their core UI features back to the worst of IE 4 Mac circa 1998. See the giant middle finger Opera flipped at its userbase during their conversion to Blink. Opera with WebKit/Blink would have been cool, but just slapping the logo on Chromium was a backhanded slap to the loyal fans.
