[4suite] iso8859-1

Scott Rohde rohde at onthejob.net
Mon Sep 25 10:48:13 MDT 2000


On Mon, Sep 25, 2000 at 05:06:14PM +0200, Alexandre Fayolle wrote:
> Hi, 
> 
> We are in semi-deep trouble with character encoding. What is the
> recommended way to encode strings in UTF-8 so that they can be passed to
> Document.createTextNode? 
> 
> Actually passing an iso8859-1 string works fine, until we try to
> (Pretty)Print it, which fails miserably. 

Printing isn't the only problem with passing an ISO-8859-1 string.
XPath functions also won't operate correctly on strings that include
non-ASCII ISO-8859-1 characters: string-length(), for example, treats
such characters as having length four.
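
To illustrate why the encoding matters, here is a minimal sketch. It
assumes a current Python 3 interpreter (where str holds characters and
bytes holds encoded data) rather than the 4Suite API, and the helper
names latin1_to_utf8 and is_valid_utf8 are hypothetical, introduced
only for this example. It shows that the same text has different byte
lengths in the two encodings, and that raw ISO-8859-1 bytes are not
valid UTF-8 until they are re-encoded.

```python
# Sketch only: assumes Python 3; helper names are hypothetical.
text = "le mois d'ao\u00fbt est chaud"    # \u00fb is u-circumflex

latin1_bytes = text.encode("iso-8859-1")  # one byte per character
utf8_bytes = text.encode("utf-8")         # u-circumflex takes two bytes


def latin1_to_utf8(data: bytes) -> bytes:
    """Decode ISO-8859-1 bytes to text, then encode that text as UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Character count versus byte counts in each encoding.
print(len(text), len(latin1_bytes), len(utf8_bytes))

# Raw ISO-8859-1 bytes fail UTF-8 validation; the converted bytes pass.
print(is_valid_utf8(latin1_bytes), is_valid_utf8(latin1_to_utf8(latin1_bytes)))
```

Any code that counts bytes instead of characters, or that reads
ISO-8859-1 bytes as if they were UTF-8, will give wrong lengths or
fail outright, so it seems safest to convert at the point where the
string enters the DOM.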

Scott Rohde
On The Job Consulting, Inc.




> Attached to the mail is a sample
> file demonstrating the workaround we found, but we think it is very
> cumbersome to use.
> 
> Any suggestion welcome.
> 
> -- 
> Alexandre Fayolle
> http://www.logilab.com - "Mais où est donc Ornicar ?" - 
> LOGILAB, Paris (France).
> 

> from xml.dom.ext.reader import Sax2
> from xml.dom.ext import Print   
> from xml.parsers.xmlproc.charconv import iso8859_to_utf8
> 
> d= Sax2.FromXml('''<?xml version="1.0" encoding="iso-8859-1"?> 
> <document>élévation</document>''')
> 
> Print(d)
> Print(d,encoding='iso-8859-1')
> 
> c=d.createElementNS('','created-child')
> d.documentElement.appendChild(c)
> # comment next line and uncomment the line after to see bug
> t=d.createTextNode(iso8859_to_utf8("le mois d'août est chaud"))
> #t=d.createTextNode("le mois d'août est chaud")
> c.appendChild(t)
> Print(d)
> Print(d,encoding='iso-8859-1')
> 



