yehudakatz.com
21 Mar '12, 7am
@myfear I also recently read encodings unabridged by Yehuda Katz. Learned more in 10 mins than in 10 yrs. Click!
This means that for most of the Western world, it is a good idea to use Unicode as the “one true character set” inside programming languages. Programmers can then treat Strings as simple sequences of Unicode code points (several code points may combine into a single character: the combining diaeresis ¨, for example, can be applied to other code points to form characters like ü). In the Asian world, while this can sometimes be a good strategy, it is often significantly simpler to keep the original encoding and to handle the merging of Strings in different encodings manually, at a point where an appropriate decision about the fidelity tradeoffs can be made.
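The point about several code points adding up to one character can be illustrated with a small sketch. It uses Python rather than the language of the original article, and the variable names are chosen here for illustration only. It assumes the ü example from the excerpt: the letter u followed by the combining diaeresis (U+0308), compared with the single precomposed code point U+00FC.

```python
import unicodedata

# "u" followed by U+0308 COMBINING DIAERESIS:
# two code points that render as one character.
decomposed = "u\u0308"

# U+00FC LATIN SMALL LETTER U WITH DIAERESIS:
# the same visible character as a single code point.
precomposed = "\u00fc"

# len() on a Python str counts code points, not visible characters.
decomposed_length = len(decomposed)
precomposed_length = len(precomposed)

# The two strings look identical but do not compare equal as code point sequences.
equal_before = decomposed == precomposed

# NFC normalization composes the pair into the precomposed code point.
composed = unicodedata.normalize("NFC", decomposed)
equal_after = composed == precomposed

print(decomposed_length, precomposed_length, equal_before, equal_after)
```

Treating a String as a sequence of code points is therefore not quite the same as treating it as a sequence of characters; comparing or counting characters generally calls for normalizing first.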
Full article:
http://yehudakatz.com/2010/05/17/encodings-unabridged/