Sunday, January 2, 2011

Fonts, Unicode and Sourashtra

This is a short article on my understanding of fonts, Unicode and the Sourashtra language in Unicode. It should give you a basic understanding of fonts and Unicode.

Language: Language evolved so that people could communicate. At first, people drew pictures or made sounds (which had no predefined meaning) to express what they thought or felt; slowly they began using particular sounds for a specific thing or feeling. But sound alone did not help much when the two people communicating were not in the same place. So people started assigning symbols to a sound or to a group of sounds (as with Chinese characters). These symbols were gradually improved and standardized, and the letters (the alphabet of a language) we use today are the latest symbols for sounds.

There is also another theory on the development of symbols for sounds. It says that predefined sounds and their corresponding symbols evolved simultaneously.

Sound: When we say ambo in Sourashtra, it is a combination of different sounds, and it always has the same meaning in Sourashtra. The word ambo is a combination of the sounds a (ꢂ/அ), m (ꢪ꣄/ம்) and bo (ꢨꣁ/பொ3). A sound is unique, i.e. the same sound does not vary from one word to another. So it is the combination of sounds that carries meaning in a language.

Symbol (Script): As long as we speak or discuss in person, we don't have any problem. If we want to communicate with someone who is not present, we need some mechanism to make this possible. Symbols fulfil this purpose by representing sounds. Whether a symbol stands for one sound or for more depends on the language. For example, the letter ka (க) in Tamizh stands for more than one sound. But native speakers of a language have no problem with its sounds and symbols.

The Sourashtra language has a unique symbol for each sound. You can see the sounds and symbols of Sourashtra in YouTube videos, on sourashtra.info, or on other websites related to the Sourashtra language. Explaining them is beyond the scope of this short article; I will do so in later posts.

Medium used to write symbols: A script (from here on, let us call the symbols a script) can be written the traditional way, on paper, walls, sand or rock, or with electronic devices. With the traditional method, i.e. writing on paper, there is no problem, because we simply draw (writing is also a kind of drawing) whatever script we see.

The problem comes when we use electronic devices to read or write letters, because electronic devices do not recognize letters. All they know is numbers. Let us consider the computer, which is better known than the other electronic devices used for reading and writing.

To bridge numbers and letters, the computer assigns a number, ranging from 0 to 255, to each letter. If the computer reads the number 65, it displays the capital letter A; vice versa, if we want to store the capital letter A, it stores the number 65 in its memory (memory: a place to store data). This is how the computer is designed.
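The letter-to-number mapping can be tried out with a few lines of Python. This is only a small illustrative sketch; the function names are made up for this example, and it relies on Python's built-in ord() and chr().

```python
# Each character is stored as a number; ord() and chr() convert between the two.
def letter_to_number(letter):
    return ord(letter)

def number_to_letter(number):
    return chr(number)

print(letter_to_number("A"))   # 65
print(number_to_letter(97))    # a

# Encoding text as bytes shows the numbers that are actually stored.
print(list("ABCD".encode("ascii")))  # [65, 66, 67, 68]
```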

Font: Now we can understand how a character is displayed on the monitor: the computer reads its memory and displays the character ‘A’ if the number in memory is 65, and ‘a’ if the number is 97. The next question is what the letter ‘A’ should look like. The pronunciation of the letter A does not change, but its shape may vary slightly. Some people like to write A with straight lines; others may like to write it with curved or zig-zag lines. The pictures below all show the letter ‘A’, with slight variation between them.





Here we turn to fonts to get different shapes for a letter. A font has one fixed shape for each letter. If you type ‘ABCD’ and select a font, the characters are displayed with the shapes defined in that font. Common font names you may have heard are ‘Times New Roman’, ‘Arial’, ‘Verdana’ and ‘Serif’. So once you have written a document, you can change its font to get a different style. The letters typed do not change; only their shape changes when the font is changed.

Font for non-English letters: As I said earlier, the computer has 256 places for characters in total; these include punctuation (comma, full stop, etc.), the English alphabet and some places for special instructions to the computer that are not visible to us. Now if we want to display the character a (a in ambo; see the image below) for the Sourashtra language, we need a number to hold this letter.


Since all the numbers from 0 to 255 are already taken by English letters and other special instructions, a workaround is used to solve this problem: for number 65, do not display the English character A; instead, display the Sourashtra character a (ꢂ/அ). You might have observed this when browsing some websites: at first you see English or English-like letters that make no sense, and after you install a particular font, the page changes to the intended language (Tamil, Telugu, Kannada, etc.).
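The workaround can be pictured with a small Python sketch. The two glyph tables below are hypothetical and hold only a couple of entries; the real key assignments of any particular Sourashtra font are not reproduced here. The point is only that the stored number stays the same while the selected font decides which shape is shown.

```python
# Hypothetical glyph tables: the same stored number maps to a different shape
# depending on which font is selected. Placing the Sourashtra letter a at
# number 65 follows the example in the text, not any real font's layout.
ENGLISH_FONT = {65: "A", 66: "B"}
LEGACY_SOURASHTRA_FONT = {65: "\uA882"}  # Sourashtra letter a in the place of 'A'

def render(stored_numbers, font):
    # Use '?' when the font has no shape for a number.
    return "".join(font.get(n, "?") for n in stored_numbers)

stored = [65]  # what is actually kept in memory
print(render(stored, ENGLISH_FONT))            # A
print(render(stored, LEGACY_SOURASHTRA_FONT))  # the Sourashtra letter a
```

This also shows why such a document looks like meaningless English when the special font is missing: the stored numbers are still those of English letters.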

Using the above method, i.e. displaying a non-English character in an English letter’s place, a few fonts have also been developed for the Sourashtra language. They are:

1. Sanghudari Font - http://www.palkar.org/dwnload.shtml

2. Kuber Font - http://www.palkar.org/dwnload.shtml

3. Suresh(u) Font - http://www.sourashtra.info/fonts/SURESHu.ttf (This is the latest font of this type, with all the symbols of the Sourashtra language.)

You can view the keyboard mapping, i.e. which keyboard key corresponds to which Sourashtra letter, at: http://www.sourashtra.info/journaldocuments/5.pdf

Note: If a document is written using one of the above Sourashtra fonts and you have not installed that font on your computer, you will see English characters instead. Most people will have experienced this.

Unicode: As the number of computer users increased, standards were developed to support multiple languages without confusion. The Unicode Consortium (http://unicode.org/) is one such organization; its specification for software internationalization has been successfully adopted in all major operating systems, search engines, applications, and the Web.

One place in the computer's memory holds a number from 0 to 255. In the Unicode method as described here, two consecutive numbers are used together to represent one character, which gives 256 * 256 (65536) combinations and so room for 65536 characters. If two consecutive places in memory hold 65 and 65, they do not represent ‘AA’ but a single character, a CJK (Chinese, Japanese, Korean) ideograph, number 16705 = 65 * 256 + 65. With this method we can hold 65536 characters in a single font and do not need to change the font name to get the characters of a different language, i.e. a single font can show English, Sourashtra and other alphabets. (Unicode today actually extends beyond this two-number range, with room for 1,114,112 numbers, but the idea is the same.) You can ignore this paragraph if it is boring or confusing; just remember that Unicode gives every character of every language its own number and can show multiple languages in the same font.
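The arithmetic of combining two numbers can be checked with a short Python sketch. The helper name combine is made up for this illustration, and treating the first number as the high part is an assumption that matches the 65 * 256 + 65 calculation.

```python
def combine(first, second):
    # The first number counts in steps of 256; the second fills in the rest.
    return first * 256 + second

print(256 * 256)        # 65536 possible combinations
code = combine(65, 65)
print(code)             # 16705
print(hex(code))        # 0x4141

# Python's own conversion of two stored bytes gives the same number.
print(int.from_bytes(bytes([65, 65]), "big"))  # 16705

# Read as one two-number character (UTF-16, big-endian), the pair 65, 65
# is a single character, not 'AA'.
print(bytes([65, 65]).decode("utf-16-be") == chr(code))  # True
```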

Since each language has its own positions in Unicode, a font may or may not contain the characters of every language, i.e. it is not mandatory to include all languages’ characters when developing a font. A Unicode font for Sourashtra may contain only English and Sourashtra letters. Later, characters of other languages (Tamil, Telugu, Kannada, etc.) can be added in their allocated Unicode ranges if required. If a font does not define a character for a Unicode number, that character is displayed as a square or a question mark. So if you open a Unicode Sourashtra document without having installed a Unicode font for Sourashtra, you will see squares in place of those characters.
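The square-for-missing-character behaviour can be imitated in a few lines of Python. The "font" here is hypothetical: it is assumed to cover only the English capital letters and the Sourashtra block (U+A880 to U+A8DF in the Unicode charts), and real fonts and rendering software are more involved than this.

```python
# Hypothetical font coverage: English capitals plus the Sourashtra block.
def font_covers(code_point):
    return 65 <= code_point <= 90 or 0xA880 <= code_point <= 0xA8DF

def display(text):
    # Characters the font does not define are shown as a square (U+25A1).
    return "".join(ch if font_covers(ord(ch)) else "\u25A1" for ch in text)

# English A, Sourashtra a, Tamil a: only the Tamil letter is missing
# from this font, so it is the one replaced by a square.
print(display("A\uA882\u0B85"))
```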

Advantages of a Unicode font:

  • Each character has its own number.
  • Websites are moving their data to Unicode characters.
  • There is no confusion, on seeing a document, about whether it is written in English or in a non-English language.
  • Multiple languages are supported.
  • Searching and processing data become easier.

Sourashtra language in Unicode: Jeyakumar Chinnakkonda Krishnamoorty, in association with Michael Everson, took the steps to have the Sourashtra language included in Unicode. Without them, the progress of the Sourashtra language on the Internet would not have been possible.

The Sourashtra language was included in Unicode version 5.1, which was released in April 2008. Version 5.1 extends support for languages in Africa, India, Indonesia, Myanmar, and Vietnam, with the addition of the Cham, Lepcha, Ol Chiki, Rejang, Saurashtra, Sundanese, and Vai scripts. The range allocated for the Sourashtra language in Unicode is from number 43136 to 43231 (U+A880 to U+A8DF). You can check the characters at http://www.unicode.org/charts/PDF/UA880.pdf.
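The allocated range can be inspected with Python's standard unicodedata module. This is only a sketch: START, END and describe are names chosen for this example, the block limits are the ones given in the Unicode chart linked above (UA880), and the names printed depend on the Unicode version bundled with your Python.

```python
import unicodedata

# Limits of the block as shown in the Unicode chart (U+A880 to U+A8DF).
START, END = 0xA880, 0xA8DF
print(START, END)  # 43136 43231

def describe(code_point):
    # Positions with no character assigned have no name; use a placeholder.
    return unicodedata.name(chr(code_point), "<unassigned>")

print(describe(0xA882))  # SAURASHTRA LETTER A

# Print the first few positions of the block with their official names.
for code_point in range(START, START + 8):
    print(code_point, hex(code_point), describe(code_point))
```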

The following fonts include the Sourashtra Unicode characters:

1. Code2000 ( http://www.code2000.net/code2000_page.htm )

2. Sourashtra (http://www.sourashtra.info/fonts/sourashtra.otf)

For further reading:

  1. http://en.wikipedia.org/wiki/Font
  2. http://en.wikipedia.org/wiki/Unicode
  3. http://www.alanwood.net/demos/ansi.html
  4. http://www.alanwood.net/unicode/saurashtra.html
  5. http://www.unicode.org/charts/
  6. Sourashtra language in Unicode: http://std.dkuug.dk/jtc1/sc2/wg2/docs/n2969.pdf

2 comments:

  1. ꢱꣃꢬꢵꢰ꣄ꢜ꣄ꢬ ꢭꢶꢦꢶꢪꢸ ꢭꢶꢒ꣄ꢒꢡ꣄ꢡꢾ ꢂꢪ꣄ꢒꣁ ꢕꢾꢥꢪ꣄ ꢣꢿꢫꢶ ꣎
    ꢒꢳꢵꢥꢵꢡ꣄ꢡꢾꢥꢸ ꢱꢶꢒ꣄ꢒꢸꢭ꣄ꢥꣂ꣎
    ꢂꢡ꣄ꢡꢾꢖ꣄ꢔꢸꢜ꣄ ꢂꢮ꣄ꢬꢾ ꢲꢵꢭ꣄ ꢱꢶꢒ꣄ꢒꢵꢥꢵ ꢪꢾꢥꢾꢡꢶ, ꢡꢸꢪꢶ ꢡꢸꢬꢾ
    ꢨꢾꢜꢵ ꢨꢾꢜꢶꢥꢸꢒꢸ, ꢥꢡ꣄ꢫꢵ ꢥꢡꢶꢥꢶꢥꢸꢒꢸ ꢱꢶꢒ꣄ꢒꢸꢭ꣄ꢭꢸꢮꣂ ꢪꢾꢥꢶ ꢱꢖ꣄ꢔꢸꢮꣂ ꣎

    ꢂꢮ꣄ꢬꢾ ꢩꢵꢰꢵꢪ꣄ ꢭꢶꢒ꣄ꢒꢬꢶꢫꣁ ꢱꢾꢬ꣄ꢒꣁ ꢬꢴꢥꣂ꣎
    ꢆꢗ꣄ꢗꢬꢶꢦ꣄ꢦꢸ ꢤꢵꢥꢸꢒꢸ ꢂꢒ꣄ꢰꢬ꣄ꢥꢸ ꢡꢔꢶ ꢭꢶꢒ꣄ꢒꢸꢥꣂ ꣎
    ꢡ꣄ꢫꢾ ꢮꢿꢳꢹꢱ꣄ ꢡꢾꢭ꣄ꢫꣁ ꢔꢵꢪꢸ ꢮꢡ꣄ꢡꣁ ꢒꣂꢥꢒ꣄ ꢯꢿ ꢪꢾꢥꢡ꣄ꢡꢾ ꢒꢳꢵꢥ꣄ ꢂꢮꢫꢶ.
    ꢂꢱ꣄ꢒꢶ ꢔꢵꢪ꣄ꢪꢸ ꢂꢪ꣄ꢬꢾ ꢡꢾꢥꢸ ꢏꢠ꣄ꢜꢿꢱ꣄ ꢮꢶꢤꢪ꣄ ꢒꢥ꣄ ꢮꢡ꣄ꢡꣁ ꢒꢾꢬꢬꢶꢫꣁ ꢥꢴꢷ ꢪꢾꢥꢡ꣄ꢡꢾ ꢡꢸꢪꢶ ꢙꢥ꣄ꢭꢸꢥꣂ ꣎

    ꢗꣁꢒꢜ꣄ ꢒꢵꢪ꣄ ꢒꢾꢬꢡ꣄ꢡꢾꢖ꣄ꢒꣁ ꢃꢣꢬꢮꢸ ꢣꢾꢥꣂ ꣎
