Status of support for Romanian


Status of support for Romanian

by Vasile Gaburici


The text below is also part of the archive with code and images available
at: http://www.cs.umd.edu/~gaburici/ro-fonts-test.tgz

Here is the summary of the tests (as of July 18, 2008 on Fedora 9):

== Quickie 1: Test for Unicode code points relevant for Romanian ==

Issues (and suggested fixes):

* All PS1 (non-OTF) fonts need an additional mapping from U+21A/B to
   U+162/3.

   Background: Most PS1 (non-OpenType) fonts that ship with Linux distros
   lack the code points U+21A/B, which are the proper code points for
   Romanian t with comma below. Thankfully, Adobe once decided that t with
   cedilla is not used in any language, so the proper glyphs are usually
   present at U+162/3, which used to be the unified code point for t
   cedilla and t comma below before Unicode 3.0. The Uniscribe renderer
   automatically handles this issue by remapping U+21A/B to U+162/3 when
   the former glyphs are missing. Unfortunately, Pango/fontconfig doesn't
   do this, so most new Romanian documents can only be displayed with a
   very narrow font selection.

   Proposed solution: adopt the Uniscribe solution; editing the PS1 fonts
   is pointless and would violate the license for the commercial ones.

* Liberation fonts lack "s with comma" completely -- the glyph needs to be
   added to the fonts. Hopefully, they'll be made into OT fonts at the same
   time (see further discussion why this would be good). "T with comma" can
   be copied from U+162/3.
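
To make the proposed Uniscribe-style fallback concrete, here is a minimal
sketch in Python. It is not Uniscribe's or Pango's actual code: the function
name, the `font_coverage` set (a stand-in for a font's character map) and the
sample coverage are all assumptions made for illustration. It only shows the
rule described above: use U+162/3 when, and only when, U+21A/B is missing.

```python
# Comma-below code point -> legacy cedilla slot that PS1 fonts usually fill
# with a comma-below glyph (see the background above).
COMMA_TO_CEDILLA = {
    0x021A: 0x0162,  # capital T
    0x021B: 0x0163,  # small t
}


def remap_missing_t_comma(text, font_coverage):
    """Return text with U+021A/B replaced by U+0162/3, but only for
    characters the font lacks and whose fallback the font does have.

    font_coverage is a hypothetical set of code points the font covers.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        fallback = COMMA_TO_CEDILLA.get(cp)
        if (fallback is not None
                and cp not in font_coverage
                and fallback in font_coverage):
            out.append(chr(fallback))
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical PS1-like font: ASCII plus the two legacy code points.
ps1_like = set(range(0x20, 0x7F)) | {0x0162, 0x0163}

remapped = remap_missing_t_comma("\u021bar", ps1_like)
print([hex(ord(c)) for c in remapped])  # ['0x163', '0x61', '0x72']
```

A font that does cover U+21A/B is left alone, so well-equipped OT fonts are
not affected by the fallback.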

== Quickie 2: Test for localized OT forms (GSUB/latn/ROM/{locl,ccmp}) ==

The Adobe/Linotype/Vista industry standard seems to be that activating
ROM/locl should map "s with cedilla" to "s with comma". Since in Adobe
(Pro) OT fonts U+162/3 is by default mapped to "t with comma", activating
this optional mapping for s renders old, pre-Unicode 3.0 Romanian texts
with comma below both s and t. Thankfully Pango already handles this!
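
The effect of that mapping can be sketched at the character level in Python.
This is only an illustration under stated assumptions: a real `locl` feature
substitutes glyphs inside the font and leaves the text untouched, and the
names `LOCL_ROM` and `emulate_rom_locl` are made up for this example.

```python
# Character-level picture of what ROM/locl achieves in Adobe-style fonts:
# s with cedilla is shown as s with comma below. U+0162/3 needs no entry
# because those fonts already draw it with a comma by default.
LOCL_ROM = {
    "\u015e": "\u0218",  # S cedilla -> S comma below
    "\u015f": "\u0219",  # s cedilla -> s comma below
}


def emulate_rom_locl(text, lang):
    """Apply the s-cedilla -> s-comma mapping only for Romanian text.

    lang is a language tag such as "ro", "ro-RO" or "ro_RO".
    """
    primary = lang.lower().replace("_", "-").split("-")[0]
    if primary != "ro":
        return text  # feature not activated for other languages
    return "".join(LOCL_ROM.get(ch, ch) for ch in text)


legacy = "\u015fi"  # "si" with s cedilla, as in pre-Unicode 3.0 texts
print([hex(ord(c)) for c in emulate_rom_locl(legacy, "ro-RO")])
print([hex(ord(c)) for c in emulate_rom_locl(legacy, "en")])
```

The language check is the important part: the same code points must keep
their cedilla when the text is tagged as Turkish, for instance.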

Funnily enough, the MS Uniscribe from XP SP3 doesn't turn on ROM/locl for
the Romanian locale (or at least I don't know how to convince it).

The OT SIL fonts take a slightly different approach, but work with Pango
nonetheless. First, they have proper cedilla variants for both s and t.
Second, they don't have a ROM/locl feature, but a ROM/ccmp feature which
remaps both cedilla variants to comma-below counterparts. I'm not sure
this approach is entirely correct because the OpenType spec on MS.com says
that ccmp should not be language sensitive. YMMV, I'm no expert on this.
[http://www.microsoft.com/typography/OTSPEC/features_ae.htm#ccmp]

Issues (and suggested fixes):

* DejaVu fonts (which are already OT) do not have a ROM/locl feature.
   From what I've seen on typophile.com, this is a two-line fix in FontLab.

* Pango does not allow an application to set the OT features language
   independently from the UI language. For instance, there's currently no
   way to run gedit with the English UI but force Romanian localized
   rendering for U+015E/F, which is silly because those code points mean
   nothing in English. Hopefully somebody there will grok it. See my
   comments on the
   Pango bug at http://bugzilla.gnome.org/show_bug.cgi?id=442786.

Finally, if this doesn't make any sense to you, there's some further
background info on the Wikipedia page for the Romanian alphabet.

Much ado for some commas ;)
Vasile

P.S. The Linux console fonts also lack U+219-B code points, but few care
about the console fonts these days...
_______________________________________________
gtk-i18n-list mailing list
gtk-i18n-list@...
http://mail.gnome.org/mailman/listinfo/gtk-i18n-list

Re: Status of support for Romanian

by Vasile Gaburici


  Here's a short "appendix". I did a quick test in OpenOffice, since
it doesn't use Pango but ICU (as far as I know). It's less impressive than
Pango:

* No support for GSUB/latn/ROM/locl (or ccmp). I guess it needs a special
   CTL (Complex text layout) module for Romanian, but I don't think there
   is any.

* Fonts steal glyphs from similar fonts, but the fallback sometimes runs
   out of luck and seems to choose inappropriately (a sans font replaces a
   serif one). Here is a little demo:
   http://www.cs.umd.edu/~gaburici/oo-ro-font-test.tgz. I cannot find a way
   to turn off fallback substitution either.

I guess this stuff is mostly Sun's problem, but if you have any comments...

Thanks,
Vasile
