[4suite] HTML -> XHTML

Mike Olson Mike.Olson at fourthought.com
Fri Sep 29 12:24:05 MDT 2000


Nicolas Chauvat wrote:
> 
> Hello,
> 
> 
> What's the preferred way to do this ?

Have you tried something like this?


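# Python 2, with 4DOM installed
import cStringIO

from xml.dom import ext
from xml.dom.ext.reader import HtmlLib

# Sample HTML; the script body contains characters that are special in XML
d = """<HTML><HEAD>
<SCRIPT>
         if (1<2) document.write('<B>Hello</B>');
</SCRIPT>
</HEAD>
<BODY>
</BODY>
</HTML>
"""

# FromHtmlStream reads from a file-like object, so wrap the string
s = cStringIO.StringIO(d)

# Parse the HTML into a DOM document
doc = HtmlLib.FromHtmlStream(s)

# Serialize the document to stdout
ext.PrettyPrint(doc)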


Note that it doesn't print out XHTML, and we probably won't add this to
ext.Printer*, because DOM Level 3 has all sorts of specifications for
serialization and deserialization and we'll be migrating to that in the
near future.

A quick stylesheet can turn the output of this into valid XML (there is no
easy way to print an HTML document out as XML in 4DOM; I have ideas if you
are interested).
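
For anyone who wants to see what "printing HTML out as XML" involves, here
is a rough sketch. It is not 4DOM code and not the stylesheet approach; it
assumes a current Python 3 and uses only the standard library (html.parser
and xml.sax.saxutils). The names XhtmlWriter, html_to_xml and sample are
made up for illustration. It lowercases tag names, quotes every attribute,
closes elements left open, writes empty elements as <br />, and escapes
text such as the "1<2" in the script above. Comments and doctypes are
dropped, and it assumes the input has a single root element.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr
import xml.dom.minidom

# HTML elements that never have content; written as <br /> style empty tags.
VOID = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}


class XhtmlWriter(HTMLParser):
    """Hypothetical helper: re-emit parsed HTML as well-formed XML text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # pieces of output text
        self.open_tags = []  # stack of elements still waiting for an end tag

    def handle_starttag(self, tag, attrs):
        # html.parser has already lowercased tag and attribute names.
        parts = [tag]
        for name, value in attrs:
            # A minimized attribute (e.g. "checked") gets its own name as value.
            parts.append("%s=%s" % (name, quoteattr(name if value is None else value)))
        if tag in VOID:
            self.out.append("<%s />" % " ".join(parts))
        else:
            self.out.append("<%s>" % " ".join(parts))
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID or tag not in self.open_tags:
            return  # redundant or stray end tag: ignore it
        # Close any elements left open inside this one, then the element itself.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append("</%s>" % top)
            if top == tag:
                break

    def handle_data(self, data):
        # Escape &, < and > so script bodies and plain text are legal XML.
        self.out.append(escape(data))

    def result(self):
        self.close()  # flush anything the parser is still buffering
        while self.open_tags:
            self.out.append("</%s>" % self.open_tags.pop())
        return "".join(self.out)


def html_to_xml(text):
    writer = XhtmlWriter()
    writer.feed(text)
    return writer.result()


sample = """<HTML><HEAD>
<SCRIPT>
         if (1<2) document.write('<B>Hello</B>');
</SCRIPT>
</HEAD>
<BODY>
</BODY>
</HTML>
"""

if __name__ == "__main__":
    xml_text = html_to_xml(sample)
    # An XML parser accepts the result, which it would not do for the input.
    xml.dom.minidom.parseString(xml_text)
    print(xml_text)
```

Escaping the script body keeps the output well-formed, but an HTML browser
would then see the escaped text rather than the original script, so a real
solution would probably need a CDATA section or an external script file.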

Mike




> 
> --
> Nicolas Chauvat
> 
> http://www.logilab.com - "Mais où est donc Ornicar ?" - LOGILAB, Paris (France)

-- 
Mike Olson				 Principal Consultant
mike.olson at fourthought.com               (303)583-9900 x 102
Fourthought, Inc.                         http://Fourthought.com 
Software-engineering, knowledge-management, XML, CORBA, Linux, Python


