"PeterPan" <<zenothing[at]hotmail.com>> wrote:
> Sorry, the description of open is a bit confusing.
>
> open!=0 means: the tag itself is not of the form "<p/>" and no
> paired "</p>" exists.
Ok. Then I think it's best not to print "/>" for all non-open tags.
It's confusing.
Btw, if you just want to extract stuff out of an xml/html page, then
maybe Parser.XML.SloppyDOM is an alternative. It provides a function
simple_path which lets you pick out nodes and subtrees very
conveniently using an XPath subset.
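
For instance, here is a minimal, untested sketch. The sample page and the
path are made up for illustration, and I'm assuming that parse() takes the
page source as a string and that a nonzero second argument to simple_path
returns the match formatted as XML text:

```pike
int main()
{
  // Made-up sample page, for illustration only.
  string page =
    "<html><body><p>first</p><p>second</p></body></html>";

  // Build the (lazily evaluated) node tree from the source string.
  Parser.XML.SloppyDOM.Document doc = Parser.XML.SloppyDOM.parse (page);

  // Pick out the second <p> element using the XPath subset. The
  // nonzero second argument is assumed to request XML-formatted text
  // instead of node objects.
  write ("%O\n", doc->simple_path ("html/body/p[2]", 1));
  return 0;
}
```

Check the module documentation for exactly which path steps the subset
supports before relying on anything fancier than plain child steps.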
> By the way, something more about the compatibility of SGML (or HTML):
>
> <a href=http://www.somethin.com/?q=abc&arg1=efg&arg2=hij>haha</a>
>
> In a real browser this is ok, but SGML recognizes it as:
>
> ({ /* 1 element */
> SGMLatom(<a ="hij" href="http://www.somethin.com/?q"/>
> "haha")
> })
Although that's incorrect HTML (at least - I don't know about SGML
really), the parser could indeed treat it better. I've got a fix, but
I won't risk 7.8 stability and compatibility with it, so it'll go into
7.9 as well.
It's a bit odd that such a fairly obvious case has gone by undetected
for so long.