@cmalven
Last active April 27, 2024 10:17
Shortest (useful) HTML5 Document
<!-- http://www.brucelawson.co.uk/2010/a-minimal-html5-document/ -->
<!doctype html>
<html lang=en>
<head>
<meta charset=utf-8>
<title>blah</title>
</head>
<body>
<p>I'm the content</p>
</body>
</html>
@hh-lohmann

Modern authoring tools, IDEs, and editors will surely assume UTF-8 by default. But imagine something like Leftpad: a package deep inside the dependency graph of your tool and module chain, outside your awareness, that breaks everything at the higher levels just because you had no charset declaration. Maybe it was pulled in by another dependency you don't even know you have, written before the whole world was UTF-8, or it defaults to ISO-8859-1 for some other reason. (The parallel to the Leftpad disaster is being killed by something you didn't even know about.) Setting a charset declaration before trouble arises is trivial; finding out that a missing charset declaration was the cause can be very time-consuming. Note that such a dev-tool dependency can hit you even if you never need a single non-ASCII character.

Small companies (maybe your clients) often have no control over the outdated server space they bought years ago from someone who bought it from someone else, and so on. You may have to reverse-engineer which HTTP headers are sent, and you cannot change them, so it's quite nice that a charset declaration in your code keeps you safe regardless. A server may bite you only years later, when a new requirement comes up, e.g. a French version, and out of the blue strange replacement characters appear just because of quotation marks that are not in your implicit default charset.

Anyway, you will always have better arguments for stakeholders when things break even though you adhered to standards.
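
The quotation-mark breakage described above can be reproduced in a few lines. This is only a sketch under assumptions: the sample French text is made up, and the "consumer" is simulated by decoding UTF-8 bytes as ISO-8859-1, which is one plausible wrong default rather than what any particular server or tool does.

```python
# Hypothetical sample text: the curly apostrophes and accented letters are non-ASCII.
text = "C’est l’été"

# What is actually stored on disk and sent over the wire: UTF-8 bytes.
utf8_bytes = text.encode("utf-8")

# A consumer that wrongly assumes ISO-8859-1 does not fail loudly.
# Every byte maps to some character, so the result is silent mojibake.
misread = utf8_bytes.decode("iso-8859-1")

# Pure ASCII content survives the wrong guess, which is why the problem
# can stay hidden until non-ASCII text shows up.
ascii_only = "It's done."
ascii_survives = ascii_only.encode("utf-8").decode("iso-8859-1") == ascii_only

# The damage is reversible only if you know which wrong encoding was applied.
repaired = misread.encode("iso-8859-1").decode("utf-8")
```

Here `misread` is longer than `text` because each multi-byte UTF-8 sequence turns into several Latin-1 characters, while `repaired` equals the original again.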

@alemens
alemens commented Oct 24, 2022

Why does everyone include <meta charset="utf-8">?

UTF-8 is the only valid encoding for HTML5 documents. That means if you have <!DOCTYPE html> at the top of an HTML file, the charset is implied.

@punund
punund commented Sep 15, 2023

if we're talking about a minimal HTML document, it's certainly not required - but doesn't really seem like a good idea either way.

Now imagine serving files with different encodings off the same web server.
Encoding is inalienable from the document itself. You can't usefully change the encoding without changing the text.
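
To illustrate why the declaration has to travel with the document, here is a rough sketch of a consumer that picks the encoding per file. It is a heavily simplified stand-in for the encoding prescan that browsers perform, not the real algorithm; the function names, the regular expression, and the `windows-1252` fallback are all assumptions made for this example.

```python
import re

# Simplified pattern: find a charset value inside a <meta ...> tag.
# The real HTML prescan handles many more cases than this.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE
)

def declared_charset(raw: bytes, fallback: str = "windows-1252") -> str:
    """Return the charset declared in the first 1024 bytes, else the assumed fallback."""
    match = META_CHARSET.search(raw[:1024])
    return match.group(1).decode("ascii").lower() if match else fallback

def decode_document(raw: bytes) -> str:
    """Decode using the document's own declaration (or the assumed fallback)."""
    return raw.decode(declared_charset(raw), errors="replace")

# Two hypothetical files served by the same server, each in a different encoding.
utf8_doc = "<!doctype html><meta charset=utf-8><title>été</title>".encode("utf-8")
latin_doc = '<!doctype html><meta charset="iso-8859-1"><title>été</title>'.encode("iso-8859-1")

# The same text without a declaration: the consumer has to guess.
undeclared_doc = "<!doctype html><title>été</title>".encode("utf-8")
```

With the declaration in place, both `utf8_doc` and `latin_doc` decode to the same title even though their bytes differ; `undeclared_doc` falls back to the guessed encoding and comes out garbled.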
