Have you ever wondered if it’s better to “close” a br
or input
tag like
<br />
or if it’s better to just write <br>
in HTML5? Or why it’s not correct
to write <script src="script.js" />
? Well so have I, and
my findings on the subject were a lot more interesting than I anticipated
(if for some strange reason you find stuff like this interesting).
If you are not interested in the whole story, just jump to the section «validity» to get your answer.
Void elements
Void elements are a special kind of element that must not have content.
That’s a big difference to other elements that can be empty but can also contain
other elements and text (such as <div>
s).
The most known void elements are:
<br>
<hr>
<img>
<input>
<link>
<meta>
The lesser known are:
<area>
<base>
<col>
<command>
<embed>
<keygen>
<param>
<source>
<track>
<wbr>
That’s it. Those are all of the existing void elements.
It is not, and has never been, valid HTML to write <br></br>
, since this would
imply that the br
element accepts content (writing <br>Hello!</br>
has
absolutely no meaning). However, it is very common to see both <br>
and <br />
.
Although most people know that in XHTML it is mandatory to write <br />
the
rules for HTML are less obvious.
History
To completely understand the rules of void elements a bit of history is necessary.
HTML, XML and X(HT)ML are all based on SGML, the Standard Generalized Markup Language which has been crafted in 1986.
HTML and XML derived directly from SGML. XML is a more restrictive subset of SGML and that’s what XHTML is based on.
XHTML is basically the same as HTML but based on XML.
So far so good? Then lets get to the interesting part:
SGML has a feature called NET (Null End Tag).
This is a short notation to avoid having to close a tag when the content
of your element is simple text. With NET you can write <quote/Quoted text/
instead of <quote>Quoted text</quote>
.
As a side note, elements that do not contain any text, can be written as <quote//
which is called SHORTTAG NETENABL IMMEDNET
and is the same as <quote></quote>
.
Now, by that logic, if a void element does not have a closing tag, <br/
would
be interpreted as <br>
, and <br/>
would be interpreted as <br>>
which is
obviously incorrect syntax. If you’re like me, you’re probably thinking «This
is insane!». Unfortunately the authors of the HTML4 specification didn’t think
so, which is why this is part of the specification. Apparently, the browser
vendors at the time were not convinced as well, which resulted in very poor
browser support (which, in this case, is arguably not a bad thing).
XML (and thus XHTML) recognized the madness of such a syntax, and did not include
the NET or the SHORTTAG NETENABL IMMEDNET «features», but provided a sane
syntax for void elements, namely the Empty-Element tag
that looks like this: <br />
. It seems very natural which is why
most developers thought it was the right way to write it.
Luckily HTML evolves and the people at the World Wide Web Consortium (who are drafting and setting the standards throughout the web) are learning from their past mistakes as well. Which is why HTML5 makes a lot more sense.
Right in the introduction of the new HTML5 syntax, the W3C says:
HTML 5 defines an HTML syntax that is compatible with HTML 4 and XHTML 1 documents published on the Web, but is not compatible with the more esoteric SGML features of HTML 4, such as the NET syntax (i.e. <em/content/).
Yay for HTML5!
(I think they should have kept the cool SHORTTAG feature (<strong>Hell yea</>
)
but hey… at least HTML is not a complete mess any more)
Validity
So back to the question of validity, the current HTML5 specification for void elements is as follows:
Start tags consist of the following parts, in exactly the following order:
- A “<” character.
- The element’s tag name.
- Optionally, one or more attributes, each of which must be preceded by one or more space characters.
- Optionally, one or more space characters.
- Optionally, a “/” character, which may be present only if the element is a void element.
- A “>” character.
This means that the /
character has been rendered optional in HTML5, but
it doesn’t add any meaning. There is absolutely no difference between <br>
and <br />
.
Correct ness
Well, for those of you who are really addicted to X(HT)ML, you might think, «yeah,
it’s optional, but <br />
is still ‘more correct’», but I have to tell you:
it is not. Actually, one might argue that adding /
to a void tag is an ignored
syntax error. The possibility to write it has mostly been added for compatibility
reasons and every browser and parser should not handle <br>
and <br />
any
differently.
Google’s styleguide on that subject is also very clear that you should indeed not close void tags.
Theo retical disad vantages
Of course, not closing void tags has its disadvantages as well, but I think
that they do not outweigh the advantage of having clean and terse void tags like <meta>
.
The first disadvantage of not closing void tags is that users have to have knowledge
of the existing void tags. If, for example, you don’t know what a <img>
element
is, you might be confused if you can’t find any closing tags for it. But the list
of void tags is very short and normally it’s quite obvious which tags are void tags.
The second disadvantage is that it gets more complicated for editors to get it
right. They need to have a knowledge of void tags and a complete list to provide
proper highlighting and code completion. If you write <input>
in an editor,
it has to know that there will never be a </input>
following that.
But it’s very easy to implement, and I don’t know any editor that doesn’t get this right, so it’s not a real disadvantage.
My opinion on void tags
I think that the whole concept of void tags could be avoided completely by using
the content of some tags instead of defining additional attributes. Let’s take
the <img>
tag for example. It has a mandatory alt
attribute, and for good
reason: people who can’t see the image (either because they are physically
incapable or because their device can’t display images) should at least know
what image they could see there (and if you’re adding an img
tag solely for
design purposes then you’re doing something wrong anyway). So my question is:
why isn’t the content of the image tag the alternative tag? It seems rather
obvious to me to write <img src="doge.png">Image of doge</img>
. The same goes
for <meta>
tags which even have a content
attribute! Why not just use the
actual element content for that? <input value="Value content">
should be
<input>Value content</input>
as is the case with <textarea>
, etc…
So really there are only a few void tags that should exist anyway, but obviously the W3C has to take backwards compatibility into account which makes changes of this kind much more difficult.
Final thoughts: the <script>
tag
The script tag has really been bothering me because it is such a verbose tag for
such a simple directive. It seems wrong to write <script src="my-script.js"></script>
since the content of this script
tag has no logical correlation to my-script.js
(and
the html specification
disallows to add both, content and the src
attribute).
The problem is, that <script>
is not a void tag since you can inline JavaScript
on your page and there are no “optional void tags”.
Using the <link>
tag would have been perfect since it’s already used for other
imports and provides all the attributes necessary to include external
files. Of course, as so often in the web, the reason it is not used is backwards
compatibility, since you would exclude all old browsers that don’t support that
syntax.