roxen.lists.pike.general

Subject: Re: RFC: New Parser class for parsing structured datafiles (non-XML)
Author: Stephen R. van den Berg <srb[at]cuci[dot]nl>
Date: 28-01-2009
Martin Stjernholm, Roxen IS @ Pike importmöte för mailinglistan wrote:
>Although I haven't encountered that "MT940" format, I'm sure it's
>common enough.

Erm, actually, the csv and mt940 were example use cases, not actual
proposed classes.  The purpose of this parser is to make it easy to
parse line/block-oriented data formats.

>At least a parser for csv is, preferably with options to control quote
>chars and quote escape sequences, as I cannot possibly imagine that
>there is a single universal de-facto standard when it comes to such
>details.

Well, it was probably never their intention, but the single universal
de-facto standard for CSV is actually being dictated by MS-Excel.
I.e. if a CSV file produced by MS-Excel is not read back properly, or
if you produce CSV that MS-Excel can't read, 99.9% of all people will
consider your implementation broken.  So the CSV format adhered to in this
example is the MS-Excel-compatible one (it accepts a tiny bit more than
MS-Excel actually produces); it is hardly possible to get closer to a
de-facto standard than that.
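
To illustrate what "MS-Excel compatible" usually means in practice, here is a
minimal sketch in Python (not the proposed Pike class).  It assumes the
commonly observed Excel conventions: comma separators, double-quoted fields,
a literal quote written as two quotes, and newlines allowed inside quoted
fields.  Python's built-in "excel" dialect is used as a stand-in, and the
sample data is made up.

```python
import csv
import io

# Made-up sample in Excel-style CSV: the second row has a field with a
# comma and a field with doubled quotes; the third row has a field with
# an embedded newline.
raw = 'name,comment\r\n"Berg, S.","said ""hi"""\r\n"multi\nline",x\r\n'

# newline="" keeps line endings untouched so the csv module sees them as-is.
rows = list(csv.reader(io.StringIO(raw, newline=""), dialect="excel"))

# Write the rows back with the same dialect to check the round trip.
out = io.StringIO(newline="")
csv.writer(out, dialect="excel").writerows(rows)

print(rows)
print(out.getvalue() == raw)
```

The round-trip check mirrors the point above: what gets read should be
writable again in a form the spreadsheet still accepts.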

>I think the obvious locations would be just Parser.CSV and
>Parser.MT940. Would that work?

I could create two classes like that, which would then use a base
class Parser.Structured to actually implement them, but that was not
my primary concern.

>> this can already be done by Parser.LR and I'm just being silly for
>> not reusing that?

>But then you'd have to write a bit of code first, right? To just parse
>a csv blob in a single call is convenient.

Yes.  Though most of the convenience stems from the fact that it is easy
to specify the named and nested structure of the file (without writing code;
you just fill in the format array).
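
A rough sketch of the idea of a format array driving the parse, written in
Python rather than Pike.  Everything in it is hypothetical: the layout of
FORMAT, the parse() helper and the sample lines are invented for
illustration and are neither the proposed class's actual format nor real
MT940.

```python
import re

# Hypothetical format description: each entry is (name, regex) for one
# line, or (name, [sub-entries]) for a nested block that may repeat.
FORMAT = [
    ("header", r"H;(?P<account>\w+)"),
    ("entries", [
        ("entry", r"E;(?P<date>\d{6});(?P<amount>-?\d+)"),
    ]),
    ("trailer", r"T;(?P<count>\d+)"),
]

def parse(lines, fmt, pos=0):
    """Match lines against fmt in order; return (record, next position)."""
    record = {}
    for name, spec in fmt:
        if isinstance(spec, list):
            # Nested block: repeat the sub-format for as long as it matches.
            items = []
            while True:
                start = pos
                try:
                    item, pos = parse(lines, spec, pos)
                except ValueError:
                    break
                if pos == start:  # guard against a sub-format that consumes nothing
                    break
                items.append(item)
            record[name] = items
        else:
            m = re.fullmatch(spec, lines[pos]) if pos < len(lines) else None
            if m is None:
                raise ValueError(f"line {pos + 1}: expected {name}")
            record[name] = m.groupdict()
            pos += 1
    return record, pos

# Invented sample input: one header, two entries, one trailer.
data = ["H;NL01", "E;090128;1500", "E;090129;-250", "T;2"]
record, end = parse(data, FORMAT)
print(record)
```

The point of the design is that supporting a new file layout means editing
FORMAT only; the parse() function stays the same.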
-- 
Sincerely,
           Stephen R. van den Berg.

"The difficult we do today; the impossible takes a little longer."