Opened 12 years ago

Last modified 12 years ago

#1039 new enhancement

Support Unicode combining characters

Reported by: Philip Taylor
Owned by:
Priority: Should Have
Milestone: Backlog
Component: Core engine
Keywords:
Cc:
Patch:

Description

See forum discussion. We ought to support names with a/i/u/y-macron-acute characters, which means the font builder tool and font renderer need to support multi-codepoint sequences involving combining characters. (For e/o-macron-acute I guess we should stick with the single-codepoint NFC version.)
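To illustrate why the a/i/u/y cases differ from e/o, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `nfc_codepoints` is made up for this example; it only shows which macron-acute combinations collapse to a single codepoint under NFC and which remain a two-codepoint sequence.

```python
import unicodedata

COMBINING_MACRON = "\u0304"
COMBINING_ACUTE = "\u0301"

def nfc_codepoints(base):
    """Return the NFC form of base + macron + acute as a list of U+XXXX strings."""
    composed = unicodedata.normalize(
        "NFC", base + COMBINING_MACRON + COMBINING_ACUTE
    )
    return ["U+%04X" % ord(ch) for ch in composed]

if __name__ == "__main__":
    for base in "aeiouy":
        # e and o compose fully; a, i, u and y keep a trailing combining acute.
        print(base, nfc_codepoints(base))
```

For e and o the result is one codepoint (U+1E17, U+1E53); for a, i, u and y it is the precomposed macron letter followed by U+0301, which is the multi-codepoint case the font tools would need to handle.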

Attachments (1)

pango-test.py (1.1 KB) - added by Philip Taylor 12 years ago.
pango layout test script


Change History (4)

comment:1 by philip, 12 years ago

(In [10796]) Add a hack for a-macron-acute characters (see #1039)

by Philip Taylor, 12 years ago

Attachment: pango-test.py added

pango layout test script

comment:2 by Philip Taylor, 12 years ago

Milestone: Alpha 8 → Backlog

This seems non-trivial.

First problem: To correctly handle the OpenType layout tables (GSUB/GPOS/etc), we need to use Pango (since Cairo and FreeType don't support that). We can tell Pango to render a string like "{a-macron} {combining-acute}" then find the glyph indexes and offsets (using ctypes since the Python API doesn't expose that data). The attached script does that, which seems to work. Then we need to update all the font rendering stuff to associate multi-codepoint sequences with the pre-composed glyph sequence. I think that's all technically possible; it's just quite a pain to get everything integrated and working.

Second problem: Pango can only render with fonts that are installed into the system; it can't render directly from a .ttf file. That's not too bad, it just means users of the font-builder script need to manually install the fonts first.

Third problem: The Pagella font doesn't have the necessary OpenType tables for a-macron-acute, so the two accents get rendered at the same location. The least-hacky solution would be to modify the font to add the appropriate positioning data for this case, though then it's a bit irritating since you have to manually install that customised font to test it.

For now, I've gone with the simplest hack of just adding the combining acute character into the font and forcing the script to lift it upwards a bit so that it happens to line up adequately over an a-macron character. That seems to work for now, but the font system really ought to be cleaned up later.
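As a rough illustration of the "associate multi-codepoint sequences with the pre-composed glyph sequence" step from the first problem, the sketch below groups each base character with its trailing combining marks and looks the group up in a table. Everything here is hypothetical: the `GLYPHS` table, the glyph names and the offsets are invented for the example and are not taken from the font builder or from the attached script, and the grouping is a simplification rather than full Unicode grapheme cluster segmentation.

```python
import unicodedata

def split_clusters(text):
    """Group each base character with the combining marks that follow it."""
    clusters = []
    for ch in unicodedata.normalize("NFC", text):
        # unicodedata.combining() returns a non-zero class for combining marks.
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

# Hypothetical table: cluster -> list of (glyph name, x offset, y offset).
# The offsets stand in for data that a layout engine would supply.
GLYPHS = {
    "\u0101\u0301": [("amacron", 0, 0), ("acutecomb", 0, 3)],
}

def glyphs_for(cluster):
    """Return the glyph sequence for a cluster, falling back to one glyph per codepoint."""
    if cluster in GLYPHS:
        return GLYPHS[cluster]
    return [("uni%04X" % ord(ch), 0, 0) for ch in cluster]

if __name__ == "__main__":
    for cluster in split_clusters("ma\u0304\u0301n"):
        print(repr(cluster), glyphs_for(cluster))
```

The point of the table is that the renderer would key on the whole cluster instead of on individual codepoints, so the positioning decided at font-build time can be reused at draw time.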

comment:3 by Philip Taylor, 12 years ago

(Forgot one idea: Rather than making the font renderer support multi-codepoint sequences natively, we could just map each multi-codepoint sequence to a single PUA (Private Use Area) codepoint (U+E000 etc). Store that PUA codepoint in the font texture like any other normal glyph, and have the game engine translate sequences into PUA codepoints before passing them to the font renderer.)
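A minimal sketch of that translation step, assuming the allocation starts at U+E000 (the first codepoint of the BMP Private Use Area) and covers only the four lowercase sequences discussed in this ticket. The names `SEQUENCES`, `SEQ_TO_PUA` and `to_renderer_string` are invented for the example; the real engine would do this in its own string handling code.

```python
import unicodedata

PUA_START = 0xE000  # assumed starting point for the allocation

# NFC forms of a/i/u/y + macron + acute (lowercase only in this sketch).
SEQUENCES = [
    "\u0101\u0301",  # a-macron + combining acute
    "\u012B\u0301",  # i-macron + combining acute
    "\u016B\u0301",  # u-macron + combining acute
    "\u0233\u0301",  # y-macron + combining acute
]

# Each sequence gets one private-use codepoint, in list order.
SEQ_TO_PUA = {seq: chr(PUA_START + i) for i, seq in enumerate(SEQUENCES)}

def to_renderer_string(text):
    """Normalise to NFC, then replace known sequences with their PUA codepoint."""
    text = unicodedata.normalize("NFC", text)
    for seq, pua in SEQ_TO_PUA.items():
        text = text.replace(seq, pua)
    return text

if __name__ == "__main__":
    converted = to_renderer_string("a\u0304\u0301")
    print(["U+%04X" % ord(ch) for ch in converted])
```

Normalising first means the input can arrive either decomposed or partly composed and still match the table. The font builder would need the same table so that the glyph stored at each PUA codepoint is the matching pre-composed image.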
