> Ok. Then I think it's best to not print "/>" for all non-open tags.
> It's confusing.
Do you mean never print "/>" in any case?
For "<haha><p><x/></haha></p>", that would give:
({ /* 2 elements */
SGMLatom(<haha>
SGMLatom(<p>
SGMLatom(<x>))),
SGMLatom(</p>)
})
Or do you mean never print "/>" except for <x/>? Like this:
({ /* 2 elements */
SGMLatom(<haha>
SGMLatom(<p>
SGMLatom(<x/>))),
SGMLatom(</p>)
})
I'm puzzled because my code is
res="<"+res+(open?">":"/>");
so "/>" is only ever printed for non-open tags. If you ask to "not print
'/>' for all non-open tags", the result is no "/>" at all.
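
To make the question concrete, here is a minimal Pike sketch of the two
behaviours. current_format is just the line above wrapped in a function;
variant_format and its self_closed flag are hypothetical (assumed for
illustration only), not something Parser.SGML has today.

```pike
// Current behaviour: the line quoted above, wrapped in a function.
// "open" is non-zero when the tag is neither "<p/>" nor paired with "</p>".
string current_format(string res, int open)
{
  return "<" + res + (open ? ">" : "/>");
}

// Hypothetical variant: print "/>" only when the source really had "<x/>".
// The self_closed flag is an assumption; the existing code has no such flag.
string variant_format(string res, int self_closed)
{
  return "<" + res + (self_closed ? "/>" : ">");
}

int main()
{
  write("%s\n", current_format("p", 1));  // <p>
  write("%s\n", current_format("x", 0));  // <x/>
  write("%s\n", variant_format("p", 0));  // <p>
  write("%s\n", variant_format("x", 1));  // <x/>
  return 0;
}
```

With only the single "open" flag there is no way to tell "<x/>" from a tag
that was closed by a matching end tag, which is why I ask which output
you want.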
--------------------------------------------------
From: "Martin Stjernholm" <<mast[at]lysator.liu.se>>
Sent: Sunday, April 05, 2009 7:29 PM
To: "PeterPan" <<zenothing[at]hotmail.com>>
Cc: <<pike[at]roxen.com>>
Subject: Re: submit modify of Parser.SGML
> "PeterPan" <<zenothing[at]hotmail.com>> wrote:
>
>> Sorry, the description of open is a bit confusing.
>>
>> open!=0 means: the tag itself is not of the form "<p/>" and no paired
>> "</p>" exists.
>
> Ok. Then I think it's best to not print "/>" for all non-open tags.
> It's confusing.
>
> Btw, if you just want to extract stuff out of an xml/html page, then
> maybe Parser.XML.SloppyDOM is an alternative. It provides a function
> simple_path which lets you pick out nodes and subtrees very
> conveniently using an XPath subset.
>
>> By the way, something more about the compatibility of SGML (or HTML):
>>
>> <a href=http://www.somethin.com/?q=abc&arg1=efg&arg2=hij>haha</a>
>>
>> In a real browser this is ok, but SGML recognizes it as:
>>
>> ({ /* 1 element */
>> SGMLatom(<a ="hij" href="http://www.somethin.com/?q"/>
>> "haha")
>> })
>
> Although that's incorrect HTML (at least - I don't know about SGML
> really), the parser could indeed treat it better. I've got a fix, but
> I won't risk 7.8 stability and compatibility with it, so it'll go into
> 7.9 as well.
>
> It's a bit odd that such a fairly obvious case has gone by undetected
> for so long.
>
>
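
On the unquoted href example quoted above: a minimal Pike sketch of a more
lenient rule, assuming an unquoted attribute value simply runs up to the
next whitespace or ">" (which is how browsers treat it).
read_unquoted_attr is a hypothetical helper for illustration; it is not
the fix mentioned above and not part of Parser.SGML.

```pike
// Hypothetical helper: read one name=value pair with an unquoted value.
// The name stops at "=", whitespace or ">"; the value stops at
// whitespace or ">", so "?", "=" and "&" stay inside the value.
array(string) read_unquoted_attr(string s)
{
  string name, value;
  if (sscanf(s, "%[^= \t\r\n>]=%[^ \t\r\n>]", name, value) == 2)
    return ({ name, value });
  return ({ });  // no name=value pair found
}

int main()
{
  string tail = "href=http://www.somethin.com/?q=abc&arg1=efg&arg2=hij>haha</a>";
  array(string) attr = read_unquoted_attr(tail);
  if (sizeof(attr))
    write("%s = \"%s\"\n", attr[0], attr[1]);
  return 0;
}
```

The point is only that the whole URL, including "&arg1=efg&arg2=hij",
ends up as the value of href instead of being split into extra attributes.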