Forum OpenACS Q&A: Character encoding of ns_xml

Posted by Taka Chan on
I have tried to use the ns_xml module, but I ran into a character
encoding problem.  I have an XML file, which is encoded in Big5.

When I copy some of the nodes from the source parsed tree to a new
parsed tree, then render the new tree and save it to a text file,
the content of the file is as follows:

<category name="waste_recycling"/>
<release_date>2001-05-26</release_date>
<expiry_date>2001-12-26</expiry_date>
<keywords>&#x53EF;&#x6301;&#x7E8C;&#x767C;&#x5C55;;&#x5EE2;&#x68C4;&#x7269;&#x8CC7;&#x6E90;&#x56DE;&#x6536;;&#x7DA0;&#x9818;&#x806F;&#x76DF;
</keywords>

All of the Chinese characters have been converted to hexadecimal
character references, while the ISO characters are not affected.  When
I open this file in IE, it displays the Chinese characters properly.
But since this XML file might be read by humans, how can I make the
output file contain the Chinese characters themselves rather than
hexadecimal codes?

Posted by Yon Derek on
You probably can't, unless you're willing to modify ns_xml. If you are, then since ns_xml is just a thin wrapper around libxml (http://xmlsoft.org/), step one would be to figure out how to do it in pure libxml, and step two would be to add the relevant capability to ns_xml (which would probably be trivial once you have figured out how to make libxml do what you want).
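
To show what is probably going on, here is a minimal sketch. It does not use ns_xml or libxml at all; it uses Python's standard xml.etree.ElementTree, on the assumption that the same rule applies: a serializer that writes ASCII has to escape every non-ASCII character as a numeric reference, while one told to write Big5 can emit the characters directly. The element name and text come from the post above; the variable names are just for illustration.

```python
import xml.etree.ElementTree as ET

# Build a tiny tree holding the first keyword from the post.
root = ET.Element("keywords")
root.text = "\u53ef\u6301\u7e8c\u767c\u5c55"

# The default output encoding is US-ASCII, so each non-ASCII
# character is written as a numeric character reference.
as_ascii = ET.tostring(root)

# Requesting Big5 output writes the characters themselves as
# Big5 bytes and adds a matching XML declaration.
as_big5 = ET.tostring(root, encoding="big5")

print(as_ascii)
print(as_big5.decode("big5"))
```

On the libxml side, the place to start looking is probably xmlSaveFileEnc, which takes the name of an output encoding. Whether ns_xml exposes it, and whether your libxml build has Big5 conversion support, is something I have not checked.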