ascii & unicode reference jbwyatt.com

ASCII

ASCII (American Standard Code for Information Interchange) is a character code normally stored one character per byte. Standard ASCII uses only the low 7 bits and can represent 128 (2^7) characters.

The first 32 characters (00-1F hex) are not printable; they are control characters, as is DEL (7F hex). The remaining 95 characters (20-7E hex), counting the space, are printable.
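The split between control and printable codes can be checked with a short Python sketch. It assumes the usual convention that DEL (7F hex) counts as a control character and the space (20 hex) counts as printable; the helper name is_control is made up for this illustration.

```python
# Classify the 128 standard ASCII codes by range.
# Assumption: DEL (0x7F) is a control character, space (0x20) is printable.
def is_control(code: int) -> bool:
    return code < 0x20 or code == 0x7F

control = [c for c in range(128) if is_control(c)]
printable = [c for c in range(128) if not is_control(c)]

print(len(control), len(printable))   # 33 95
print(repr(chr(printable[0])), repr(chr(printable[-1])))   # ' ' '~'
```

Under that convention the 128 codes divide into 33 control codes and 95 printable ones, running from the space to the tilde.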

UTF-8 (8-bit UCS/Unicode Transformation Format) is a variable-length character encoding for Unicode. It is able to represent any character in the Unicode standard, yet is backwards compatible with ASCII.
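To see the variable length and the ASCII compatibility in practice, here is a small Python sketch. The sample characters are chosen only for illustration and are written as escapes so the snippet does not depend on the file's own encoding.

```python
# Sample characters (illustrative choices, not from the page):
# 'D', e-acute, a CJK ideograph, and an emoji.
samples = ["D", "\u00e9", "\u4e2d", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex()}")

# Pure ASCII text produces identical bytes in ASCII and UTF-8.
ascii_bytes = "Hello".encode("ascii")
utf8_bytes = "Hello".encode("utf-8")
print(ascii_bytes == utf8_bytes)   # True
```

The four samples take 1, 2, 3, and 4 bytes respectively, while any text made up only of ASCII characters is byte-for-byte the same in both encodings.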

ASCII Encoding Example:
The hex value for the letter 'D' is 44. In decimal, 'D' is 68 (4x16 + 4x1).
In binary, it is 01000100.
44 (hex) = 01000100 (binary) = 0 + 2^6 + 0 + 0 + 0 + 2^2 + 0 + 0 = 64 + 4 = 68 (decimal)
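The same conversion can be reproduced in Python as a quick check. This is only a sketch using the built-in ord and format functions; the variable names are illustrative.

```python
ch = "D"
code = ord(ch)                  # decimal code of the character
hex_str = format(code, "02X")   # two hex digits
bin_str = format(code, "08b")   # eight binary digits

# Rebuild the decimal value by summing the power of two for each set bit.
total = sum(2 ** i for i, bit in enumerate(reversed(bin_str)) if bit == "1")

print(code, hex_str, bin_str, total)   # 68 44 01000100 68
```

Only bits 6 and 2 are set, so the sum is 2^6 + 2^2 = 64 + 4 = 68.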



Extended ASCII character set: codes 128-255 (80-FF hex), which have the high bit (bit 7) set to one.
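A byte belongs to the extended range exactly when bit 7 is set, which a bit mask can test. The Python sketch below also shows that the meaning of such a byte depends on which extended table is used. The page does not say which table it shows, so Latin-1 and code page 437 are assumed here purely as two common examples.

```python
raw = bytes([0x44, 0xE9])   # 'D' plus one byte from the extended range

for b in raw:
    high_bit_set = bool(b & 0x80)   # mask off everything except bit 7
    print(f"{b:#04x} high bit set: {high_bit_set}")

# Assumption: Latin-1 and CP437 stand in for "extended ASCII" here.
latin1 = raw.decode("latin-1")
cp437 = raw.decode("cp437")
print(latin1[0] == cp437[0])   # True: the 7-bit code is the same in both
print(latin1[1] == cp437[1])   # False: the extended byte differs
```

The 7-bit character decodes identically everywhere, while the byte with the high bit set maps to a different character in each table.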


Unicode

Unicode Reference

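As the need to represent more characters (Chinese, Thai, Japanese) grows, a larger code space is required. This is why ASCII is being replaced by Unicode. Unicode was originally designed as a 16-bit code, which can represent 2^16 = 65,536 different characters. The current standard extends the code space to 1,114,112 code points (0 to 10FFFF hex), which needs 21 bits.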

32-bit codes (UCS-4 / UTF-32) store every character in a fixed 4 bytes. 32 bits could distinguish over 4 billion (2^32) values, although Unicode limits valid code points to 10FFFF hex.
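The sizes involved can be inspected from Python, as a rough sketch. It relies on sys.maxunicode, which in Python 3.3 and later reports the highest Unicode code point, and on the standard utf-16-be and utf-32-be codecs; the sample characters are illustrative.

```python
import sys

print(2 ** 16, 2 ** 32)        # 65536 4294967296
print(hex(sys.maxunicode))     # 0x10ffff
print(sys.maxunicode + 1)      # 1114112 code points in total

bmp_char = "\u4e2d"            # a CJK ideograph that fits in 16 bits
astral_char = "\U0001F600"     # an emoji above FFFF hex

utf16_bmp = bmp_char.encode("utf-16-be")
utf16_astral = astral_char.encode("utf-16-be")
utf32_astral = astral_char.encode("utf-32-be")
print(len(utf16_bmp), len(utf16_astral), len(utf32_astral))   # 2 4 4
```

A character above FFFF hex no longer fits in one 16-bit unit, so UTF-16 uses two units for it, while a 32-bit encoding always uses exactly 4 bytes per character.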


Reference: Brief History of Character Codes ...