Problem with asian language fonts

Problem with asian language fonts

by Rakesh Kumar S

Hi

I have a problem displaying Asian-language text in FOP.

1. I am using the Castor framework to convert my objects to XML.
2. I pass this XML and the input XSL to the transformer and write out the PDF.

In the DB the characters are stored in encoded form, and I am confused about where I should convert them back into Chinese characters.

Should this be done by the Castor framework, which transforms my objects into XML? Should I introduce the encoding there?
Or
should I keep the characters as encoded strings in the XML and convert them into Chinese characters while rendering the PDF?

transformer.setOutputProperty(OutputKeys.ENCODING,"ISO-8859-1");
transformer.transform(source, new StreamResult(outTransform));

I am using the Arial Unicode MS font for display.

Thanks,
Rakesh Kumar S

**************** CAUTION - Disclaimer *****************
This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended solely
for the use of the addressee(s). If you are not the intended recipient, please
notify the sender by e-mail and delete the original message. Further, you are not
to copy, disclose, or distribute this e-mail or its contents to any other person and
any such actions are unlawful. This e-mail may contain viruses. Infosys has taken
every reasonable precaution to minimize this risk, but is not liable for any damage
you may sustain as a result of any virus in this e-mail. You should carry out your
own virus checks before opening the e-mail or attachment. Infosys reserves the
right to monitor and review the content of all messages sent to or from this e-mail
address. Messages sent to or from this e-mail address may be stored on the
Infosys e-mail system.
***INFOSYS******** End of Disclaimer ********INFOSYS***

---------------------------------------------------------------------
To unsubscribe, e-mail: fop-users-unsubscribe@...
For additional commands, e-mail: fop-users-help@...


Re: Problem with asian language fonts

by cbowditch

Rakesh Kumar S wrote:

> Hi
>
> I have a problem displaying Asian-language text in FOP.
>
> 1. I am using the Castor framework to convert my objects to XML.
> 2. I pass this XML and the input XSL to the transformer and write out the PDF.
>
> In the DB the characters are stored in encoded form, and I am confused about where I should convert them back into Chinese characters.
>
> Should this be done by the Castor framework, which transforms my objects into XML? Should I introduce the encoding there?
> Or
> should I keep the characters as encoded strings in the XML and convert them into Chinese characters while rendering the PDF?

You need to make sure that every part of your processing chain that
converts bytes to strings, or vice versa, does the conversion using UTF-8.
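
As a minimal sketch of what that means at a single byte/string boundary in Java (the class and method names here are hypothetical, not part of Castor or FOP), the charset is passed explicitly on both sides instead of relying on the platform default:

```java
import java.nio.charset.StandardCharsets;

class Utf8Boundaries {
    // Encode text to bytes with an explicit charset instead of the platform default.
    static byte[] toBytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decode bytes back to text with the same explicit charset.
    static String fromBytes(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u5317\u65B9\u8BDD"; // the three characters from the sample address line
        byte[] utf8 = toBytes(original);

        // Matching charsets on both sides: the text survives the round trip.
        System.out.println(original.equals(fromBytes(utf8))); // prints true

        // Mismatched charsets: decoding UTF-8 bytes as ISO-8859-1 corrupts the text.
        String wrong = new String(utf8, StandardCharsets.ISO_8859_1);
        System.out.println(original.equals(wrong)); // prints false
    }
}
```

The same idea applies to readers, writers, and JDBC connection settings: wherever a charset can be specified, specify it rather than inheriting a default.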

>
> transformer.setOutputProperty(OutputKeys.ENCODING,"ISO-8859-1");

This won't work, as ISO-8859-1 doesn't include Chinese characters, only
Western European ones.
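
A hedged sketch of the difference, using only the JDK's built-in JAXP classes (the helper names are made up for illustration, and an identity transform stands in for the real stylesheet): it first asks each charset whether it can represent the text, then serializes with the output encoding set to UTF-8. When the transform result goes to FOP as SAX events rather than to a byte stream, this property may not matter at all.

```java
import java.io.ByteArrayOutputStream;
import java.io.StringReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class OutputEncodingCheck {
    // True if the named charset has a byte representation for every character in the text.
    static boolean canEncode(String charsetName, String text) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    // Runs an identity transform and serializes the result as UTF-8 bytes.
    static byte[] serializeAsUtf8(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        String text = "\u5317\u65B9\u8BDD";
        System.out.println(canEncode("ISO-8859-1", text)); // prints false
        System.out.println(canEncode("UTF-8", text));      // prints true

        byte[] bytes = serializeAsUtf8("<addressline2>" + text + "</addressline2>");
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```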

> transformer.transform(source, new StreamResult(outTransform));
>
> I am using the Arial Unicode MS font for display.

Regards,

Chris

<snip/>





RE: Problem with asian language fonts

by Rakesh Kumar S

Hi,

Which encoding supports both Asian-language and Western characters?

Thanks,
Rakesh Kumar S

________________________________________
From: Chris Bowditch [bowditch_chris@...]
Sent: Tuesday, July 08, 2008 5:12 PM
To: fop-users@...
Subject: Re: Problem with asian language fonts

<snip/>


Re: Problem with asian language fonts

by Jean-François El Fouly

Rakesh Kumar S wrote:
> Hi,
>
> Which encoding supports both Asian-language and Western characters?
>
> Thanks,
> Rakesh Kumar S
>
Any Unicode-based encoding will do the job. One of the UTF-16 variants
(big-endian or little-endian) is probably your best choice: UTF-8 is a
variable-length encoding that uses 3 bytes for most Asian characters,
while UTF-16 uses 2 bytes for every character in the Basic Multilingual
Plane. Characters outside that plane take 4 bytes in either encoding.
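
To make the size comparison concrete, here is a small sketch (hypothetical class name) that counts the encoded bytes of a Latin letter, a common Chinese character, and a supplementary-plane ideograph in both encodings:

```java
import java.nio.charset.StandardCharsets;

class EncodingSizes {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the string occupies as big-endian UTF-16 (no byte order mark).
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String latin = "A";
        String cjk = "\u5317";                 // U+5317, in the Basic Multilingual Plane
        String supplementary = "\uD840\uDC00"; // U+20000, outside the Basic Multilingual Plane
        for (String s : new String[] {latin, cjk, supplementary}) {
            System.out.println(utf8Bytes(s) + " bytes in UTF-8, " + utf16Bytes(s) + " bytes in UTF-16");
        }
    }
}
```

The three lines printed are 1 and 2 bytes, 3 and 2 bytes, and 4 and 4 bytes respectively, so UTF-16 is smaller only when the text is dominated by characters in the middle category.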



Re: Problem with asian language fonts

by cbowditch

Rakesh Kumar S wrote:

> Hi,
>
> Which encoding supports both Asian-language and Western characters?

UTF-8

<snip/>

Chris





RE: Problem with asian language fonts

by Pascal Sancho

Hi Rakesh,

In well-formed XML, you may use any encoding you want.
If your text nodes contain characters that the chosen encoding cannot represent, then you have to write them as character references.
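
For illustration only (the class name is hypothetical; the element name is borrowed from the XML posted later in this thread), this sketch parses a document that is pure US-ASCII on disk and shows that the parser hands back the real characters:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class CharacterReferenceDemo {
    // Parses XML bytes and returns the text content of the root element.
    static String rootText(byte[] xmlBytes) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Pure ASCII bytes: the Chinese characters are written as numeric character references.
        String asciiXml = "<?xml version=\"1.0\" encoding=\"US-ASCII\"?>"
                + "<addressline2>&#21271;&#26041;&#35805;</addressline2>";
        String text = rootText(asciiXml.getBytes(StandardCharsets.US_ASCII));
        System.out.println(text.equals("\u5317\u65B9\u8BDD")); // prints true
    }
}
```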

To choose the character encoding, you should consider:
 - environment (which encodings your system and your applications support)
 - human readability (not easy when too many character encodings are in play)
 - file size:
    US text in UTF-8 or US-ASCII is about 1 byte-per-char
    Asian text can be:
      - about 3 or 4 byte-per-char in UTF-8
      - about 2 byte-per-char in UTF-16
      - about 8 byte-per-char in US-ASCII (using character references, like &#xF900;)

In your case, I think the best choice should be UTF-16.

See http://en.wikipedia.org/wiki/Character_encoding.

Note that the XML Recommendation [1] says that all XML processors must accept the UTF-8 and UTF-16 encodings.

[1] http://www.w3.org/TR/2000/REC-xml-20001006#charsets

HTH,

Pascal


> -----Original Message-----
> From: Rakesh Kumar S [mailto:Rakesh_Kumar06@...]
> Sent: Tuesday, 8 July 2008 14:03
>
> Hi,
>
> Which encoding supports both Asian-language and Western characters?
>
> Thanks,
> Rakesh Kumar S




RE: Problem with asian language fonts

by Pascal Sancho

Re-Hi,
Here is a very good tool to play with character encoding...
http://www.babelstone.co.uk/Software/BabelPad.html

Pascal

> -----Original Message-----
> From: Pascal Sancho
> Sent: Tuesday, 8 July 2008 15:47


Problem with Asian Language fonts

by Rakesh Kumar S

Hi Guys,

I have a problem displaying Asian-language text in PDF. Ours is a Java application using Hibernate, and we have a reporting module that needs to print reports as PDFs.
We are using Apache FOP for this purpose.

Our application allows users to enter text in Japanese, Chinese, and Korean.
That text is then stored in the DB.

When it is stored in the DB, it is converted into encoded values.

Converting to PDF takes two steps:

1. Convert the object into XML using the Castor framework.
2. Convert the XML into a PDF using FOP.

My problem is that the characters also appear as encoded text in the PDF. I am using UTF-8 encoding and the Arial Unicode MS font, which covers Asian scripts.
Please find the PDF attached.
Could someone guide me on how to overcome this problem?

Where should this conversion actually happen: while I generate the XML, or while I convert it into PDF?

Please find the PDF and the generated XML attached.
Please do guide me, as I am totally stuck and running out of options.


Thanks,
Rakesh Kumar S



<?xml version="1.0" encoding="UTF-8"?>
<candidates>
<candidate-data>
<candidatename>Raja Raja</candidatename>
<addressline1>Raja</addressline1>
<addressline2>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline2>
<addressline3>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline3>
<addressline4>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline4>
<addressline5>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline5>
<addressline6>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline6>
</candidate-data>
<candidate-data>
<candidatename>Rakesh Rakesh</candidatename>
<addressline1>Raja</addressline1>
<addressline2>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline2>
<addressline3>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline3>
<addressline4>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline4>
<addressline5>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline5>
<addressline6>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline6>
</candidate-data>
</candidates>


Attachment: Asian-Language.pdf (27K)

RE: Problem with Asian Language fonts

by Rakesh Kumar S

The XML structure is like this:

<?xml version="1.0" encoding="UTF-8"?>
<candidates>
<candidate-data>
<candidatename>Raja Raja</candidatename>
<addressline1>Raja</addressline1>
<addressline2>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline2>
<addressline3>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline3>
<addressline4>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline4>
<addressline5>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline5>
<addressline6>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline6>
</candidate-data>
<candidate-data>
<candidatename>Rakesh Rakesh</candidatename>
<addressline1>Raja</addressline1>
<addressline2>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline2>
<addressline3>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline3>
<addressline4>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline4>
<addressline5>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline5>
<addressline6>&#21271;&#26041;&#35805;/&#21271;&#26041;&#35441;</addressline6>
</candidate-data>
</candidates>

Where am I making the mistake?

________________________________________
From: Rakesh Kumar S [Rakesh_Kumar06@...]
Sent: Wednesday, July 09, 2008 4:34 PM
To: fop-users@...
Subject: Problem with Asian Language fonts

<snip/>


RE: Problem with Asian Language fonts

by Alias John Brown


"Rakesh Kumar S" wrote:
>
> Hi Guys,
[snip]
>
> My problem is that the characters also appear as encoded text in the PDF. I am using UTF-8 encoding and the Arial Unicode MS font, which covers Asian scripts.
> Please find the PDF attached.
> Could someone guide me on how to overcome this problem?
>

Looking at candidade.xml.txt, I see that you have:
北方话/北方話

I believe that this should be
北方话/北方話

Hmm, the browser or Hotmail has magically replaced the text that I typed
with the Chinese/Japanese/Korean/whatever characters, but I simply wanted
to say don't use &amp;. Write & followed by #nnnnn.

When you write &amp; you are saying that you want the literal symbol "&",
and you do not want the & to be treated as a special symbol, i.e., part of a
character entity. Therefore you end up with "&" followed by "#21271"
instead of the character that &#21271; represents.
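
The difference is easy to see by parsing both forms. This is only a sketch with a made-up class name and a throwaway element, using the JDK's standard DOM parser:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class DoubleEscapeDemo {
    // Parses an XML string and returns the text content of its root element.
    static String parseText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The ampersand itself is escaped, so the parser yields the literal text "&#21271;".
        String escapedTwice = parseText("<a>&amp;#21271;</a>");
        // A plain character reference, so the parser yields the single character U+5317.
        String escapedOnce = parseText("<a>&#21271;</a>");

        System.out.println(escapedTwice.equals("&#21271;")); // prints true
        System.out.println(escapedOnce.equals("\u5317"));    // prints true
    }
}
```

FOP sees whatever the parser yields, which is why the doubly escaped form shows up in the PDF as the raw digits.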




Re: Problem with Asian Language fonts

by Alias John Brown

My XML is being totally scrambled. I meant to write:

Don't use the character entity "& a m p ;", which means that
you want a literal ampersand character to appear in the output.

That is, instead of "& a m p ; # 2 1 2 7 1 ;", write "& # 2 1 2 7 1 ;"





RE: Problem with Asian Language fonts

by Rakesh Kumar S

Thanks a lot, John.
It worked fine.

Do I now have to do this conversion every time?
The XML stores this as &amp;, but I need it as & when converting to PDF.

________________________________________
From: news [news@...] On Behalf Of John Brown [johnbrown105@...]
Sent: Wednesday, July 09, 2008 6:23 PM
To: fop-users@...
Subject: Re: Problem with Asian Language fonts

<snip/>


Re: Problem with Asian Language fonts

by Abel Braaksma

Rakesh Kumar S wrote:
> Thanks a lot, John.
> It worked fine.
>
> Do I now have to do this conversion every time?
> The XML stores this as &amp;, but I need it as & when converting to PDF.

XML does not store anything (XML is a meta-language and takes no
actions of its own). Your problem occurs either when you store the data in
the database (you say that you escape the data) or when you retrieve it
(using Castor, as you stated in your original question). At some point,
you have a character, say "€" (euro symbol); it gets escaped to &#8364;,
which gets escaped to &amp;#8364; (a second escape, which is not proper).
It may even be escaped one more time and stored in the database as
&amp;amp;#8364;.
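
A hedged sketch of how the second level of escaping can appear: if a string that was already escaped by hand is given to an XML serializer as ordinary text, the serializer (correctly) escapes its ampersand again. The class below is hypothetical and uses the JDK's DOM and JAXP classes in place of Castor, whose internals are not shown in this thread.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class EscapeOnce {
    // Wraps the text in an <addressline2> element and lets the serializer do the escaping.
    static String toXml(String text) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element element = doc.createElement("addressline2");
        element.setTextContent(text);
        doc.appendChild(element);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Text that was already escaped by hand gets its ampersand escaped again.
        System.out.println(toXml("&#21271;"));
        // Raw text is left for the serializer to handle, so there is only one level.
        System.out.println(toXml("\u5317"));
    }
}
```

The first call produces the doubly escaped form seen in the posted XML; the second keeps the character intact.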

To get out of this mess, do the following:

1. Always check your data using a text editor (DO NOT USE A BROWSER TO
VIEW XML!)
2. Make sure you do not escape anything by hand anymore (!!!)
3. Before storing it in the database, use XSLT (or a tool) to serialize the
XML with encoding "US-ASCII". This will effectively escape all
higher characters (above ASCII 127) as character references.
4. When retrieving it from the database, either do nothing (XML
with the encoding US-ASCII should be just fine for Apache FOP or any
other XML-capable process), or transform it to XML with encoding UTF-8
for readability. Most (XSLT/tool) processors will convert the character
references to their UTF-8 character counterparts, but they are not
required to do so!

Regardless: do not escape by hand, only use XML tools, and set the
encoding to something your database can store. That way, you do not have
to worry about silly double or triple up or down conversions.
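
Steps 3 and 4 can be sketched with the JDK's own serializer and parser. The class and method names are hypothetical, and the byte array stands in for whatever column the database provides; the point is only that the tools produce and consume the character references, so no hand escaping is involved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class AsciiRoundTrip {
    // Step 3: serialize with encoding US-ASCII; the serializer writes
    // character references for everything outside ASCII.
    static byte[] toAsciiXml(String text) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element element = doc.createElement("addressline2");
        element.setTextContent(text);
        doc.appendChild(element);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "US-ASCII");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    // Step 4: parse the stored bytes; the parser resolves the references.
    static String fromXml(byte[] xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String original = "\u5317\u65B9\u8BDD";
        byte[] stored = toAsciiXml(original);
        System.out.println(new String(stored, "US-ASCII"));
        System.out.println(original.equals(fromXml(stored))); // prints true
    }
}
```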

HTH,

Cheers,
-- Abel --
