How many of us have used a meta tag to define content type and default character sets? The tag may appear something like this:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
But do we REALLY understand what is going on? This tag is important when a webpage is being opened locally. It instructs the browser as to what character encoding to use to display the page. This may override the platform default.
But what about when a page is being viewed over HTTP? Well, the tag only matters if the HTTP response headers fail to designate a character encoding. What if the response headers DO include a character set? AHHH! Then the meta tag is (are you ready for this?)… IGNORED!
Let’s say you designate a page, via meta tag, to have a character set of UTF-8, but your web server is sending a Content-Type response header that specifies Windows-1252. Your page is going to be decoded and displayed as Windows-1252!
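To make the pecking order concrete, here is a rough sketch in Python of how the decision goes. It is NOT how any browser actually implements it: the function names are made up for illustration, the regular expressions are a simplification of real parsing, byte order marks are left out entirely, and the windows-1252 fallback is just an assumption (the real fallback depends on the browser and locale).

```python
import re

# Hypothetical helper names; a simplified model, not real browser logic.

def charset_from_content_type(value):
    """Return the charset parameter of a Content-Type header value, or None."""
    if not value:
        return None
    match = re.search(r'charset\s*=\s*["\']?([\w.:-]+)', value, re.IGNORECASE)
    return match.group(1).lower() if match else None


def charset_from_meta(html):
    """Return the charset declared in a meta tag, or None (simplified scan)."""
    match = re.search(r'<meta[^>]*charset\s*=\s*["\']?([\w.:-]+)', html, re.IGNORECASE)
    return match.group(1).lower() if match else None


def effective_charset(header_value, html, fallback="windows-1252"):
    """The HTTP header wins; the meta tag only counts when the header is silent."""
    return (
        charset_from_content_type(header_value)
        or charset_from_meta(html)
        or fallback  # assumed default, for illustration only
    )


PAGE = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />'

# Header names a charset: the meta tag is ignored.
print(effective_charset("text/html; charset=Windows-1252", PAGE))
# Header has no charset: the meta tag decides.
print(effective_charset("text/html", PAGE))
```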
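And guess what? Your page, viewed over HTTP, just may still appear correct, giving you the impression that the meta tag is working! That can happen when the file really is saved in the encoding the server announces, or when it contains nothing but plain ASCII characters, which are the same bytes in both encodings. Then you force your browser to actually display in UTF-8 and that beautiful page suddenly becomes what is referred to as “mojibake!”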
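If you want to see mojibake happen without touching a browser, a few lines of Python will do it. This is only a small demonstration: the misdecode helper is a made-up name, and it simply encodes text one way and decodes it the other, which is roughly what happens when the declared character set and the real one disagree.

```python
def misdecode(text, actual="utf-8", assumed="windows-1252"):
    """Encode text in its real encoding, then decode it with the wrong one."""
    return text.encode(actual).decode(assumed, errors="replace")


# Plain ASCII survives, which is why a page can look fine by accident.
print(misdecode("Hello"))

# A UTF-8 "é" is two bytes; read as Windows-1252 they become two characters.
print(misdecode("café"))

# The other direction: Windows-1252 bytes forced to display as UTF-8.
print(misdecode("café", actual="windows-1252", assumed="utf-8"))
```

The first call prints the text unchanged, the second prints "cafÃ©", and the third ends in the replacement character because a lone Windows-1252 "é" byte is not valid UTF-8.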
There are at least a couple of ways to get this all sorted out. If you are coding in PHP, one way would be to set the response header in the code for each page. Here is an example PHP header:
header('Content-Type: text/html; charset=utf-8');
This call needs to appear in the PHP BEFORE any output is sent, even a single character of HTML or stray whitespace.
Another way, if you have access to your server settings, is to specify a default character set there.
Still another way, with Apache servers, is to specify a default character set in your .htaccess file:
AddDefaultCharset UTF-8
So…. knowing all this, just HOW do you go about confirming that the character set you want is the character set actually being used? With Firefox/Waterfox/SeaMonkey, bring up the page in question. Up in the URL display area, to the left of the URL, click on the little circle with the upside-down “!”. There will be information on whether or not the connection is secure, then a “>”. Click on that. Click on “More information”, then the “General” tab. This will display the text encoding AND the meta tags. If they don’t agree, the response header being sent isn’t what you want it to be. This applies to Waterfox in Linux, also.
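If you would rather check from a script than click through browser dialogs, something along these lines should work with nothing but the Python standard library. Treat it as a sketch under a few assumptions: check_url, meta_charset and verdict are made-up names, the meta tag is found with a simple regular expression instead of a real HTML parser, and only the first part of the page is examined.

```python
import re
from urllib.request import urlopen

# Simplified pattern; covers both <meta charset=...> and the http-equiv form.
META_RE = re.compile(rb'<meta[^>]*charset\s*=\s*["\']?([\w.:-]+)', re.IGNORECASE)


def meta_charset(body):
    """Return the charset named in a meta tag near the top of the page, or None."""
    match = META_RE.search(body[:1024])
    return match.group(1).decode("ascii").lower() if match else None


def verdict(header_charset, meta):
    """Summarise which declaration the browser will end up honoring."""
    if header_charset is None:
        if meta:
            return "no charset in header; meta tag decides"
        return "no charset declared anywhere"
    if meta is None or header_charset == meta:
        return "header charset applies"
    return "mismatch: header wins over meta tag"


def check_url(url):
    """Fetch a page and compare the header charset with the meta tag charset."""
    with urlopen(url) as response:
        # Charset from the Content-Type response header, lowercased, or None.
        header_charset = response.headers.get_content_charset()
        body = response.read(4096)
    meta = meta_charset(body)
    return header_charset, meta, verdict(header_charset, meta)
```

Calling check_url with the address of your own page returns the header character set, the meta tag character set, and a one-line verdict. A "mismatch" verdict means the server is sending something other than what your meta tag claims.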
Google Chrome USED to allow the option to see what the character set REALLY is, but they removed it. Fortunately, there is an extension that does it for you. The extension is simply named “Charset”, and it allows you not only to see what the actual character set is for a given page, but also to change it. The results may be an eye opener. BTW, this applies to Linux Chromium as well.
What about IE/Edge? You’re on your own! I won’t touch those monstrosities! LOL!!