What's wrong with YOUR website - Part 2, Markup
You thought part one was long? Hold onto your hats, folks... We're just getting warmed up!
It is somewhat shocking to still come across pages written without a proper DOCTYPE. Despite what some of the HTML 5 junkies will tell you, knowing what version of HTML a site is built with is important... but more so is the fact that Internet Explorer uses the DOCTYPE to figure out if a page was written for older browsers or newer browsers. Specifically, without one it switches into a 'quirks mode' that changes how padding, margins, borders, widths and heights are applied to elements. (This is called the 'box model'.) Since quirks mode does not render the same way as standards mode, and is not consistent from browser to browser, sites lacking a DOCTYPE will struggle with CSS hacks and broken layouts across different browsers.
Which is why, even with their noodle-doodle "let's get rid of versioning" nonsense, the HTML 5 folks at the WHATWG admit you need one, and came up with a minimalistic lip-service one for that reason.
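As a rough sketch, this is about the least a page needs in order to stay out of quirks mode. It uses the short HTML 5 DOCTYPE; the title and paragraph text are placeholders, not anything from a real site.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Page Title - Site Title</title>
</head>
<body>
<p>Placeholder content.</p>
</body>
</html>
```

If you are writing HTML 4.01 Strict instead, the first line would be the longer `<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">`. Either way it has to be the very first thing in the file.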
This is usually the result of people who have no business editing HTML... cutting and pasting together different pages. You should only have ONE instance of opening and closing BODY, and only one DOCTYPE. If you see them opened more than once, that's invalid and likely a cause of problems.
There are a LOT of META tags that nothing out there actually uses. To be frank, I consider there to be only three or four legitimate META tags worth using. It is easier to list the legitimate ones than to try to list all the possible garbage.
Apart from that, there's usually little legitimate reason to be using META tags... making all the other ones people pull out of their backsides little more than pointless bloat. There ARE exceptions like certain (but not all) "openGraph" values or tracking software specific ones. Really though, if you need extra scripted tracking software on your page, there's probably something wrong with either your server or your knowledge of how to manage a website. Yes, Google Analytics, I'm looking at you!
2.4 Overstuffed keywords
A lot of people say the keywords META is now ignored completely; my own results beg to differ. I think the reason people think it's ignored is that few if any people use it correctly.
First off, it's called keywords. NOT keyphrases, not keysentences, not keyparagraphs -- keyWORDS!!! It should be 8 or 9 single words (I make an exception for unique multi-word proper nouns -- like the names of states, provinces, or even software) that exist in the BODY of your document and that you want a slight ranking boost on. Preferably it should come in at under 127 characters... and some sites like SEOWorkers.com suggest even less than that.
It should also be thought of as a word jumble, so there is no need to spell out every combination yourself. You'll often see endless pointless messes like this, where people try to come up with every possible combination:
<meta name="keywords" content="babysitting keene new hampshire, babysitting winchester new hampshire, babysitting chesterfield new hampshire, babysitting walpole new hampshire" >
Pointlessly redundant -- THIS would be functionally identical:
<meta name="keywords" content="babysitting,new hampshire,keene,winchester,chesterfield,walpole" >
Of course, if any of those do not exist inside the document BODY they have no business in your keywords META, and could make the whole thing be ignored; or worse, result in the engines slapping you down for trying to game the system!
I would also point out that it's a comma delimited list. I've seen people go and use all sorts of goofy characters like vertical bars, asterisks, semicolons... and they're all gibberish that basically flushes your chances of a keywords META doing anything.
2.5 Nonsensical description
The description META exists to be a short description of your site, shown as a tooltip or as the text below your SERP listing. That's it, that's what it is for. Natural language text to tell people viewing your listing in the search engines what the site is about. It is NOT a place to endlessly stuff keywords, and it is not a place to put oddball random information that doesn't tell people what the site is about. It's pretty simple; I'm shocked how many people screw it up. It is also recommended you keep it below 127 characters -- basically a sentence or two.
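To make that concrete, here is a sketch of a description for the hypothetical babysitting site from the keywords example. The wording is made up for illustration; the point is one plain sentence, well under 127 characters, that a person could read in a search listing.

```html
<meta
	name="description"
	content="Babysitting services in Keene, Winchester, Chesterfield and Walpole, New Hampshire."
>
```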
The purpose of the title tag is to be the text shown in a taskbar, on a window frame, on a tab, and on your SERP listing. That is IT. It is NOT for stuffing with keywords, it is not for some massive tagline. It's for saying what page on the site it is.
This is why I at least prefer to say the page title, followed by a hyphen, followed by the site title -- as then it gives me useful text on tabs and/or the task bar, making it easier to navigate. THAT'S WHAT IT'S FOR!!!
Because it's used for that, there is often little reason to use more than 70 characters, and I like to set 64 as my upper limit since it's a multiple of 8 (that's the programmer in me). As such, two paragraphs stuffed in there don't get used by a blasted thing.
Oh, and minor nitpick: what the blazes is with using vertical bars, or as some people call them, "pipes", in titles?!? Do the browsers use that to separate the browser name? No... they use a hyphen. Where did this practice come from? To be frank, it looks like ass... and not hot juicy playboy ass, but 400 pounds of cottage cheese ass.
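A quick sketch of the "page title, hyphen, site title" pattern described above. The site name is invented for the example; the whole thing comes in well under the 64 character limit.

```html
<title>About Us - Example Babysitting Co.</title>
```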
2.7a DIV for nothing
2.7b Classes for nothing
I see this all the time, where people slap DIV around elements that don't need them... and classes on elements that don't need them. A great example of this is your average main menu on a site, where you'll see bloat like this:
<div id="mainMenu">
	<ul class="mainMenuUl">
		<li class="mainMenuLi">
			<a href="/" class="mainMenuA">Home</a>
		</li>
		<li class="mainMenuLi">
			<a href="/forums" class="mainMenuA">Forums</a>
		</li>
		<li class="mainMenuLi">
			<a href="/links" class="mainMenuA">Links</a>
		</li>
		<li class="mainMenuLi">
			<a href="/about" class="mainMenuA">About Us</a>
		</li>
	</ul>
</div>
The main reason to use classes is to target those elements with CSS... but CSS has this thing called 'inheritance', which means that if you have a container with a perfectly good class or ID on it, and all the children inside it are getting the same class, NONE of those child elements need classes. Likewise a UL is a perfectly good block-level container, which means in the majority of cases there is no legitimate reason to waste code on putting a DIV around it!
There is usually little reason for the above code to be much more than:
<ul id="mainMenu">
	<li><a href="/">Home</a></li>
	<li><a href="/forums">Forums</a></li>
	<li><a href="/links">Links</a></li>
	<li><a href="/about">About Us</a></li>
</ul>
What was .mainMenuUl you'd access as #mainMenu, what was .mainMenuLi becomes #mainMenu LI, and what was .mainMenuA becomes #mainMenu A -- pretty simple, no matter how much the ignorant fools coding WordPress claim otherwise. You want to recoil in horror, look at what turdpress does to menus.
An even bigger laugh is some people actually advocate the idea of slapping classes EVERYWHERE -- even some tools from alleged experts like Google's "page speed" penalize you for using inheritance over targeting, as if somehow more code makes a page load faster? When it comes to the Internet, when you hear things like more markup being faster, your common sense should tingle.
Yes, common sense, it's a superpower... Though when it comes to G's "Page Speed" we are talking about the tool that calls a 75k page built from 24 files worse in terms of "speed" than a megabyte sized page built from 100 or more files -- a laugh when their own waterfall breakdown shows the smaller page loading in a tenth the time... Hey, there's a good topic for another article, "what's wrong with tools like Google 'page speed' or 'YSlow'"
Similarly you'll often see people nesting tags like DIV because they seem to be under the delusion you can only set one class per element or one CSS property per element... or at least, that's the only explanation I can come up with for the noodle-doodle and bizarre code bloat that fills many pages out there.
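A small sketch of that nesting habit and its fix. The class names are made up for the example, and the comments only label the two variants. The class attribute takes a space-separated list, so one element can carry all of them.

```html
<!-- bloated: one wrapper per class -->
<div class="sidebar">
	<div class="news">
		<div class="featured">
			<p>Some content here</p>
		</div>
	</div>
</div>

<!-- leaner: one element, several classes -->
<div class="sidebar news featured">
	<p>Some content here</p>
</div>
```

Whether the three classes are even needed is a separate question, but at least the second version is not paying for two extra DIV to carry them.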
The TITLE attribute (not the tag, the attribute) should be used when the content of a tag like Anchor does not convey the full meaning... but given that's its purpose, doesn't that just mean there's something wrong with the content of the tag?
There is NO reason to make a TITLE that's identical to the content of a tag, which is why idiocy like this:
<a href="/" title="home">Home</a>
...or even this:
<a href="/" title="home"><img src="images/home.png" alt="home" /></a>
Is just a waste of your time typing and bandwidth. If it is identical to the contents, there is no real reason to have one! That or there's something horribly wrong with your contents.
TITLE is also not for stuffing with keywords since it's supposed to describe the contents of the element. You'll see that all the time and said black hat SEO hoodoo-voodoo is just begging for your site to be slapped down by the search engines.
2.9 Improper use of numbered headings
The entire purpose of a heading, from a grammatical and literary viewpoint -- as opposed to the typographical sense -- is to indicate the start of a section or subsection of a page. That's just common sense of what a heading IS. As such, lower order headings mean the start of subsections of the heading preceding them. That's how it is in essays, technical documents, books, etc, etc...
You see, typography is all about presentation -- and presentation has NO BUSINESS IN YOUR MARKUP!!!
As such, skipping over heading numbers -- such as an H1 followed by an H3 -- makes no sense. You would be missing the H2 that the H3 is a subsection of. Using an H1 as your primary article title instead of the page title/logo/banner makes no sense -- as it's very likely all the other H2 on the page are actually subsections of that H1. Using an H1 paired with an H2 for a page title and tagline is also gibberish! That tagline is NOT the start of a new subsection. That's why using multiple H1 makes no sense! (sorry HTML 5). It is also why having lower order (higher numbered) headings like H2 or H3 preceding the H1 is equally nonsensical.
Headings are moronically simple -- I don't understand why people have such a hard time with them excepting perhaps copypasta from other people who don't get it, choosing them based on their default appearance, or a general ignorance of how to write.
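As a sketch of the structure described in this section, with invented heading text: one H1 for the site, H2 for each major section of the page, H3 for subsections of the H2 before them, and no skipped levels.

```html
<h1>Site Title</h1>

<h2>Article Title</h2>
<h3>First subsection of the article</h3>
<h3>Second subsection of the article</h3>

<h2>Recent Posts</h2>
<h3>Title of a recent post</h3>
```

Read only the headings, top to bottom, and you should get a sensible table of contents. If you don't, the numbers are wrong.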
2.10 Tables for layout
Tables are for the markup of tabular data, which is to say data where the columns are all the same type of data and the rows are related. A spreadsheet is a good example of this, as is the index of threads on a forum.
Using tables just to make columns is abuse of the tag and non-semantic markup. It tends to result in bloated code, harder to maintain code, and an inability to change the layout on the fly since you've put the presentation in the markup -- the antithesis of accessibility, fluid design and the 'new' idea of 'responsive layout'.
WORSE is when people just use tables for no good reason -- an example of this is a table where there's only one TD per TR, or only one TR... at which point why is it in a table again?
Though I'm NOT saying don't ever use tables... As I'm about to explain.
2.11 Non table elements for tabular data
See how above I listed forum indexes as an example of tabular data? You have the status icon, the title and description, number of threads, number of posts, and last post on each row. That's fields in a row -- that's TABULAR DATA.
For some weird reason the die-hard 'never use tables' weirdos have gotten it into their heads that 'tables are evil'... which is 100% grade A farm fresh cow pies. Handy if you want to fertilize a rose, otherwise... not so much. You can see it in things like the default skin for vBulletin 4, which abuses nested lists (when it's not short list-type items), endless
2.12a There are more tags for tables than just
You'll often see code like this:
<tr>
	<td colspan="5"><big><strong>Shopping Cart</strong></big></td>
</tr><tr>
	<td class="header"></td>
	<td class="header"><strong>Title</strong></td>
	<td class="header"><strong>Threads</strong></td>
	<td class="header"><strong>Posts</strong></td>
	<td class="header"><strong>Last Post</strong></td>
</tr>
Which is laughably bad in the worst of times -- not only should NONE of those elements be receiving "more emphasis" with the STRONG tag, none of those should even be TD!!! The first TD is quite obviously a table CAPTION, the latter ones are all table headings, and there are tags for doing that! Specifically TH. Much less the omission of 'scope', which builds the relationship between a heading and the row or column it is associated with. Also, if you have a section of column headings, it should probably go inside a THEAD tag, just as the content should go into TBODY.
<caption>Shopping Cart</caption>
<thead>
	<tr>
		<td></td>
		<th scope="col">Title</th>
		<th scope="col">Threads</th>
		<th scope="col">Posts</th>
		<th scope="col">Last Post</th>
	</tr>
</thead><tbody>
Is how that should probably be handled. It is shocking how many people are blissfully unaware of CAPTION... and their code is bloated and harder to maintain because of it!
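Putting those tags together, a complete forum index table might look something like the sketch below. The caption, forum name, counts, date and image path are all invented for illustration. The one addition beyond what was discussed above is `scope="row"` on the forum title, which ties the rest of the row to it.

```html
<table>
	<caption>Forum Index</caption>
	<thead>
		<tr>
			<td></td>
			<th scope="col">Title</th>
			<th scope="col">Threads</th>
			<th scope="col">Posts</th>
			<th scope="col">Last Post</th>
		</tr>
	</thead><tbody>
		<tr>
			<td><img src="images/newPosts.png" alt="New posts"></td>
			<th scope="row">General Discussion</th>
			<td>120</td>
			<td>1,482</td>
			<td>Yesterday, 4:15 PM</td>
		</tr>
	</tbody>
</table>
```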
2.12b There are more tags for forms than
Just as people seem unaware of table elements like CAPTION, it seems there's an equal amount of ignorance in regards to forms for tags like FIELDSET. The net result is forms that are often inaccessible or confusing to use on mobile, screen readers, or just in general.
FIELDSET is for grouping related tags the user will be interacting with to enter values or make choices. 99% of the time someone puts a DIV around or inside a form as a wrapper, they should be using a FIELDSET or even multiple fieldsets.
LEGEND is to a FIELDSET what CAPTION is to a table -- it summarizes what the FIELDSET IS. Unfortunately the LEGEND tag is such a colossal pain in the backside to style that it ends up rendered near useless. Like a great many things, browser makers have never been on the same page in terms of how LEGEND is supposed to accept styling, and all went their own directions with it.
As such many developers have taken to using an appropriate level numbered heading tag instead as it makes a bit of sense to use headings to mark sections inside the form.
LABEL should go around the text describing what the TEXTAREA is. It should also have a FOR attribute pointing at the ID of the TEXTAREA it is describing. You see the omission of labels and abuse of tags like P inside forms ALL the time, when using the proper tags is so simple... despite that simplicity I'd say well over half the people out there writing websites are unaware of how to build a form properly!
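A minimal sketch of a form built that way. The action URL, field names and IDs are hypothetical. Every field gets a LABEL whose FOR matches the field's ID, and the FIELDSET does the job a wrapping DIV usually gets abused for.

```html
<form action="/contact" method="post">
	<fieldset>
		<legend>Contact Us</legend>
		<label for="contactName">Name:</label>
		<input type="text" name="name" id="contactName"><br>
		<label for="contactEmail">E-Mail:</label>
		<input type="text" name="email" id="contactEmail"><br>
		<label for="contactMessage">Message:</label>
		<textarea name="message" id="contactMessage" rows="8" cols="40"></textarea><br>
		<input type="submit" value="Send">
	</fieldset>
</form>
```

Clicking the label text now focuses the matching field, and a screen reader can announce what each field is for.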
I grouped those together as 2.12a and 2.12b because they are the same thing -- ignorance of what elements HTML even has -- despite being for two different parts of coding a page.
2.13 Non-semantic markup
The entire point of HTML back when Tim Berners-Lee first created it was device independent delivery of content. Regardless of your device's capabilities -- from braille to 1152x864 resolution displays -- HTML was for saying what things WERE (paragraphs, headings, lists) or would be if printed (bold and italic), and then letting the user agent (browser) best determine how to show it.
We got away from that during the late 90's/early part of this century browser wars, where browser makers started just adding presentational garbage to HTML to cater to the crowd who didn't give a flying purple fish about accessibility, and instead were solely concerned with "what does it look like in IE on a desktop display".
HTML 4 Strict was an attempt to return to HTML's roots by re-emphasizing HTML's role of saying what things are. CSS was then added to let you customize the appearance for all the different possible targets, while still allowing "graceful degradation" of the markup when CSS was unavailable.
This "separation of presentation from content" can result in more accessible websites that load faster, use less bandwidth and make better use of caching models... and I'd say 80% or more of people crapping out websites out there JUST DON'T GET IT.
Semantics is easy -- just say what things are.
P is for Paragraph, UL is for unordered bullet point type lists, OL is for ordered bullet point type lists (and when I say bullet point I am not referring to them HAVING a bullet before them!), H1 through H6 are for headings (see section 2.9), C is for Cow (moo), D is for Dog (woof-woof), etc, etc.. Even B and I serve a semantic purpose -- when something WOULD BE bold or italic in writing, like a book title or company name. That's why B and I are still in the specification alongside EM and STRONG, since those mean something entirely different! (emphasis and 'more' emphasis!). Notice I say WOULD BE, not SHOULD BE -- because the target device may be incapable of showing bold or italic, and may convey it in some other manner.
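A short sketch of that distinction, with made-up sentence content: I for a book title, B for a company name, EM and STRONG for emphasis and 'more' emphasis.

```html
<p>
	I just finished <i>Moby-Dick</i>, which I ordered from <b>Example Books Inc.</b>,
	and you <em>really</em> should read it.
	<strong>Do not skip the whaling chapters.</strong>
</p>
```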
Screwing up semantics seems to be the order of the day -- with people wrapping paragraphs around non-paragraph elements (like INPUT), improper use of headings (again, see section 2.9), putting lists around tabular data or elements that are too big to be 'bullet points' or selections and quite possibly even have an existing semantic structure (like heading tags)... improper use of semantics is as bad as not bothering to use them at all! PARTICULARLY if you do so in a manner that just makes more markup for no good reason.
This isn't exactly rocket science!
-- Wernher von Braun
It's not rocket science -- yet people seem to treat it like it is. Land's sake, even rocket science isn't rocket science...
2.14 "Risky" Comment Placement
If you tried explaining this one to people who know 'real' programming languages you'd get this incredulous look as if you were making things up... Then you explain "It's mostly an IE problem" and they go "Oh... That explains it."
As STUPID as it sounds, in IE and some versions of gecko based browsers like Firefox, COMMENTS can actually trip rendering bugs if they appear between sibling inline-blocks, floats, negative margins or positioned (relative or absolute) elements. A lot of times people hack around these problems with extra CSS and IE conditionals when the solution is to change your commenting practices.
The most common reason people put comments in that could cause this is before the start of a section or to label the closing tag... For example:
<!-- start content -->
<div id="c128v4">
	Some content here
</div>
<!-- end content -->
Even that fat bloated overpriced steaming dung-heap known as Dreamweaver generates code like this... How can you avoid this bug?
First, STOP using pointless comments to make up for bad coding practices... that opening comment is just idiocy since instead of wasting a comment on it, how about you use an ID or comment that actually *SHOCK* describes what the element is.
Second, the closing comment does help, but </div> already means it's being closed, so lose the word 'end', then move the comment INSIDE the tag. Literally if you just reverse the order of the comment and the tag, the problem will NEVER creep up.
So 'fixed' that would read:
<div id="content"> Some Content Here <!-- #content --></div>
Which works out to a fraction of the code as well. Notice I say # so I know it's an ID and not a class. Using the same selector methodology as CSS makes it consistent and easier to follow.
2.15 Code Bloat
I half laugh and half cry whenever I see people blowing tens or even hundreds of K of markup on delivering one or two K of plaintext and maybe a dozen content (non-theme) images. The vast majority of pages should only need ten to twenty K of HTML for every five to ten k of actual content text... I have to ask - are these developers charging by the K-LoC like it's still 1978 or something?
HTML is so simple that, done properly, your text should outnumber the tags by a factor of 4:1 or more... If your code is bigger than your content once you break around 2.5k (the amount that SHOULD be in all well formed documents for HEAD, DOCTYPE, etc), the simple fact is you MUST be doing something wrong!
What sort of things? Things like putting appearance in the markup, static scripting or style in the markup where it can't be cached if the same page is reloaded/updated. Things like having dozens of separate files dragging performance down thanks to handshaking.