I originally thought using fonts would be pretty simple. However, proper handling of fonts has ended up being a significant effort in Windward Reports (our XML and SQL Reporting system). If you're going to do much more than place a line of text in a form, then the details start to matter.

Fonts & Glyphs
So what is a font? Fundamentally, a font is a collection of glyphs. A glyph is the drawn shape of a character, such as the letter A. A font is then the set of glyphs for all the characters that font supports. In Helvetica the glyphs look one way; in Times Roman they look another.

Now we need to introduce the concept of code pages. A code page is a mapping from a character number to a specific character, and through the font to a specific glyph. Programs originally stored each character as a single byte. Then for Asian character sets there were the DBCS (double-byte character set) systems, where some characters took 1 byte and some took 2. Programs today mostly use Unicode, and web pages tend to use UTF-8, a Unicode encoding in which each character takes from 1 to 4 bytes.

Why bring up encoding? Because each font has an encoding, and character number 178 can return a very different glyph depending on the code page used by the font. Most font files use Unicode, so you have a standard there, but many programs still use specific code pages, and that code page then has to be mapped to the font. This is what happens when you display ABC and the font is Wingdings: you get symbols instead of letters. So point one is that you need to make sure the encoding you use matches, or is mapped to, the encoding of the fonts you use.

And it gets even more complex. Unicode assigns no standard meaning to the characters with the values 0xE000 to 0xF8FF (the Private Use Area). Each font can make those anything it wants (one use is to add the Klingon script). So a character with a value in this range is by definition tied to the font file used to display that character. This is how most symbol-type fonts work.
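To make the encoding points concrete, here is a minimal Python sketch. It is not taken from Windward Reports. It decodes byte 178 under three code pages, counts UTF-8 bytes per character, and tests whether a character falls in the 0xE000 to 0xF8FF range. The function names, the choice of code pages, and the sample code point U+F041 are assumptions made for illustration.

```python
import unicodedata

# Byte value 178 (0xB2), the "character number 178" from the text.
RAW = bytes([178])

# Three single-byte code pages, chosen only for illustration.
CODE_PAGES = ["cp1252", "cp1251", "cp437"]


def decode_under_code_pages(raw, code_pages):
    """Return {code page name: decoded text} for the same raw bytes."""
    return {cp: raw.decode(cp) for cp in code_pages}


def utf8_length(ch):
    """Number of bytes UTF-8 needs to encode a single character."""
    return len(ch.encode("utf-8"))


def is_private_use(ch):
    """True if the character lies in the range U+E000..U+F8FF."""
    return 0xE000 <= ord(ch) <= 0xF8FF


# Same byte, different character depending on the code page.
for cp, text in decode_under_code_pages(RAW, CODE_PAGES).items():
    print(f"{cp}: U+{ord(text):04X} {unicodedata.name(text, 'UNNAMED')}")

# UTF-8 is variable length: 1 to 4 bytes per character.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s) in UTF-8")

# U+F041 is an arbitrary sample from the private range; what it
# looks like depends entirely on the font used to draw it.
print(is_private_use("\uf041"), is_private_use("A"))
```

Running it shows that one stored byte can become a superscript two, a Cyrillic letter, or a shading block depending on the code page assumed. That is the mismatch to guard against before the text ever reaches a font.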
Read more: Windward Wrocks