day macha
Unicode and UTF-8 Explained

I spent some time reading this excellent but lengthy guide to Unicode and UTF-8 and playing with Unicode a little. Here’s a hopefully clearer and gently paced guide telling you what you need to know, especially if you’re working with a programming language that does not automatically handle Unicode (it’s a good overview for non-programmers, too).


Before Unicode

Everything in a computer, as you know, is stored as noughts and ones. These noughts and ones can represent numbers: 01000001 in binary is 65. Those numbers are in turn used to represent letters: 65 (01000001) is the uppercase letter A, for example.
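
If you want to check this yourself, here is a minimal Python sketch of the same round trip: bits to number to letter, and back again. The variable names are just illustrative, and Python is my choice here, not something the rest of this guide depends on.

```python
# The bit pattern from the text, written as a string of noughts and ones.
bits = "01000001"

# Interpret the bits as a base-2 number.
number = int(bits, 2)

# Look up the character that this number stands for.
letter = chr(number)

# Go the other way: from the letter back to its number, shown as 8 bits.
round_trip = format(ord(letter), "08b")

print(number)      # 65
print(letter)      # A
print(round_trip)  # 01000001
```

The leading nought in 01000001 does not change the value; it only pads the number out to a full 8 bits (one byte).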
