www.yocto-web-development.com

A Place To Learn And Find Everything Necessary For Best Web Business Development

HTML Character Sets List


To display an HTML page correctly, the browser must know which character set the page uses.

The character set of the early World Wide Web was ASCII. ASCII supports the digits 0-9, the uppercase and lowercase English alphabet, and some special characters.
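The limits of ASCII can be checked with a minimal sketch, assuming Python 3. The helper name `fits_in_ascii` and the sample strings are hypothetical and used for illustration only.

```python
# ASCII defines 128 characters (code points 0-127), one byte each.
sample = "Hello 123!"
ascii_bytes = sample.encode("ascii")  # works: every character is in ASCII
print(list(ascii_bytes[:5]))          # [72, 101, 108, 108, 111]


def fits_in_ascii(text: str) -> bool:
    """Return True if every character of text can be encoded as ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(fits_in_ascii("Hello 123!"))  # True
print(fits_in_ascii("café"))        # False: "é" is not part of ASCII
```

Any text with accented or non-Latin letters fails this check, which is why larger character sets were needed.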

Because many languages use characters that are not part of ASCII, browsers long used ISO-8859-1 as their default character set. The current HTML standard expects pages to be encoded as UTF-8.

If a web page uses a character set other than the browser's default, the character set should be specified in a <meta> tag in the page's <head>.
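What goes wrong without a correct declaration can be sketched in Python 3. The page content below is a hypothetical example; `<meta charset="utf-8">` is the HTML5 form of the declaration, and the snippet only simulates a browser decoding the same bytes under two different assumptions.

```python
# Hypothetical page, for illustration only.
page = '<meta charset="utf-8"><p>Café</p>'
raw = page.encode("utf-8")        # the bytes a server would send

right = raw.decode("utf-8")       # decoded with the declared character set
wrong = raw.decode("iso-8859-1")  # decoded with a wrongly assumed character set

print(right)  # <meta charset="utf-8"><p>Café</p>
print(wrong)  # <meta charset="utf-8"><p>CafÃ©</p>
```

The two bytes that encode "é" in UTF-8 are read as two separate ISO-8859-1 characters, so the text is displayed incorrectly even though no byte was lost.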



ISO HTML Character Sets List

The International Organization for Standardization (ISO) defines the standard character sets for different alphabets and languages.

The character sets in use around the world are listed below:

| Character set | Description | Covers |
|---|---|---|
| ISO-8859-1 | Latin alphabet part 1 | North America, Western Europe, Latin America, the Caribbean, Canada, Africa |
| ISO-8859-2 | Latin alphabet part 2 | Eastern Europe |
| ISO-8859-3 | Latin alphabet part 3 | SE Europe, Esperanto, miscellaneous others |
| ISO-8859-4 | Latin alphabet part 4 | Scandinavia/Baltics (and others not in ISO-8859-1) |
| ISO-8859-5 | Latin/Cyrillic part 5 | Languages that use a Cyrillic alphabet, such as Bulgarian, Belarusian, Russian and Macedonian |
| ISO-8859-6 | Latin/Arabic part 6 | Languages that use the Arabic alphabet |
| ISO-8859-7 | Latin/Greek part 7 | The modern Greek language, as well as mathematical symbols derived from Greek |
| ISO-8859-8 | Latin/Hebrew part 8 | Languages that use the Hebrew alphabet |
| ISO-8859-9 | Latin 5 part 9 | The Turkish language. Same as ISO-8859-1 except that Turkish characters replace Icelandic ones |
| ISO-8859-10 | Latin 6 (Lappish, Nordic, Eskimo) | The Nordic languages |
| ISO-8859-15 | Latin 9 (aka Latin 0) | Similar to ISO-8859-1, but replaces some less common symbols with the euro sign and some other missing characters |
| ISO-2022-JP | Latin/Japanese part 1 | The Japanese language |
| ISO-2022-JP-2 | Latin/Japanese part 2 | The Japanese language |
| ISO-2022-KR | Latin/Korean part 1 | The Korean language |
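The practical consequence of the list above is that the same byte means different things in different character sets. A minimal sketch, assuming Python 3 and its built-in codecs (the variable names and the sample word are illustrative):

```python
# One byte, two character sets: 0xA4 is the generic currency sign in
# ISO-8859-1 and the euro sign in ISO-8859-15.
raw = bytes([0xA4])
print(raw.decode("iso-8859-1"))   # ¤
print(raw.decode("iso-8859-15"))  # €

# Cyrillic text fits in ISO-8859-5 at one byte per character,
# but the same bytes do not decode to the same text as ISO-8859-1.
word = "Привет"
raw_cyr = word.encode("iso-8859-5")
print(len(raw_cyr))                          # 6
print(raw_cyr.decode("iso-8859-5") == word)  # True
print(raw_cyr.decode("iso-8859-1") == word)  # False
```

Because each of these character sets has room for only 256 characters, a single page cannot mix, for example, Cyrillic and Greek text in one of them.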



The Unicode Standard

Because the character sets listed above are limited in size and are not compatible with one another in multilingual environments, the Unicode Consortium developed the Unicode Standard.

The Unicode Standard aims to cover almost all the characters, punctuation marks, and symbols in the world.

Unicode enables processing, storage and interchange of text data no matter what the platform, no matter what the program, no matter what the language.



The Unicode Consortium

The Unicode Consortium develops the Unicode Standard. Its goal is to replace the existing character sets with its standard Unicode Transformation Formats (UTF).

The Unicode Standard has become a success and is implemented in XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, WML, etc. The Unicode Standard is also supported in many operating systems and all modern browsers.

The Unicode Consortium cooperates with the leading standards development organizations, like ISO, W3C, and ECMA.

Unicode can be implemented by different character encodings. The most commonly used encodings are UTF-8 and UTF-16:

| Character set | Description |
|---|---|
| UTF-8 | A character in UTF-8 can be from 1 to 4 bytes long. UTF-8 can represent any character in the Unicode Standard. UTF-8 is backwards compatible with ASCII. UTF-8 is the preferred encoding for e-mail and web pages |
| UTF-16 | 16-bit Unicode Transformation Format is a variable-length character encoding for Unicode, capable of encoding the entire Unicode repertoire. UTF-16 is used in major operating systems and environments, like Microsoft Windows 2000/XP/2003/Vista/CE and the Java and .NET byte code environments |
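The variable lengths described above can be observed directly. This is a minimal sketch, assuming Python 3; the helper name `utf_lengths` and the sample characters are hypothetical choices for illustration.

```python
def utf_lengths(ch: str) -> tuple[int, int]:
    """Return (bytes in UTF-8, bytes in UTF-16) for a single character."""
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")  # big-endian, without a byte order mark
    return len(utf8), len(utf16)


for ch in ["A", "é", "€", "😀"]:
    n8, n16 = utf_lengths(ch)
    print(f"U+{ord(ch):04X}: {n8} byte(s) in UTF-8, {n16} byte(s) in UTF-16")
# U+0041: 1 byte(s) in UTF-8, 2 byte(s) in UTF-16
# U+00E9: 2 byte(s) in UTF-8, 2 byte(s) in UTF-16
# U+20AC: 3 byte(s) in UTF-8, 2 byte(s) in UTF-16
# U+1F600: 4 byte(s) in UTF-8, 4 byte(s) in UTF-16

# Backwards compatibility: ASCII text has identical bytes in UTF-8.
print("Hello".encode("utf-8") == "Hello".encode("ascii"))  # True
```

UTF-16 uses two bytes for most characters and four bytes (a surrogate pair) for characters beyond U+FFFF, which is why it is described as variable-length.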

Tip: The first 256 code points of Unicode correspond to the 256 characters of ISO-8859-1.
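This correspondence can be verified with a short check, assuming Python 3 and its built-in ISO-8859-1 codec (the variable names are illustrative):

```python
# Decode every possible byte value (0-255) as ISO-8859-1 and compare each
# resulting character's Unicode code point with the original byte value.
all_bytes = bytes(range(256))
decoded = all_bytes.decode("iso-8859-1")

matches = all(ord(ch) == i for i, ch in enumerate(decoded))
print(matches)  # True
```

Note that this is a statement about code points, not about bytes: in UTF-8, only the first 128 of these characters (the ASCII range) are stored as the same single byte.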

Tip: All HTML 4 processors already support UTF-8, and all XHTML and XML processors support UTF-8 and UTF-16!
