[fpc-devel] XML Parser problems with C-Data and Character Encoding

michael.vancanneyt at wisa.be
Thu Sep 27 15:50:33 CEST 2012



On Thu, 27 Sep 2012, Andrew Brunner wrote:

> On 09/27/2012 04:19 AM, Mattias Gaertner wrote:
>> Have you tried setting the right encoding in the xml?
>> http://mantis.freepascal.org/view.php?id=22990
>
> I have, and it did work (thanks Sergei :-)
>> Maybe you can explain what you are trying to do?
>
> I have a cloud social virtual operating system and each read/write operation 
> is done via XML.  So adding content encoding mechanism for comparing each 
> byte is extremely costly from a client/server standpoint.  Just imagine 1M+ 
> users and having to encode/decode each xml fragment just to get the parser to 
> parse the data - unwanted latency.
>> AFAIK such statements have seldom a positive effect on volunteer projects.
>
> My frustration is not with FPC team, because they are drawing code from 
> projects like firefox.

To my knowledge, we are not.

>  I am extremely sensitive towards wasted cpu cycles as 
> efficiency in scale is maximized by reducing things like byte encoding.  Some 
> of my stream fragments can be as large as 1.4MB deflated from 8MB.  Multiply 
> that number by say 100 concurrent users on that 1 node and you'll begin to 
> see my frustration.

If you are sensitive to that, drop XML and use JSON. It parses much
faster and carries far less overhead (smaller packet size).
It was invented exactly because XML became too heavyweight.
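
As a rough illustration of the packet-size point only (it does not measure
parse speed), here is a minimal sketch in Python rather than Free Pascal.
The record and its field names are hypothetical, and the XML layout (one
child element per field) is just one of several possible mappings:

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record; the field names are made up for illustration.
record = {"id": 42, "user": "andrew", "folder": "inbox", "unread": 7}

# Compact JSON encoding of the record.
as_json = json.dumps(record, separators=(",", ":")).encode("utf-8")

# Equivalent XML: one child element per field, no XML declaration.
root = ET.Element("record")
for key, value in record.items():
    ET.SubElement(root, key).text = str(value)
as_xml = ET.tostring(root, encoding="unicode").encode("utf-8")

print(len(as_json), "bytes as JSON")
print(len(as_xml), "bytes as XML")
```

The difference comes mostly from XML repeating each field name in a closing
tag; how much it matters depends on the ratio of markup to payload.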

I did some benchmarking. Using XML (well, SOAP) makes a typical application 
6 times slower than a comparable binary transmission mechanism.

> To me, an XML parser is just looking for "<>" etc.  Once it hits a CDATA 
> section it should only look for ]].  Therefore I was surprised to learn that 
> it required the encoding tag (which in itself just increased the average 
> network packet size) that I must transmit from client to my server nodes.

You cannot just choose which parts of XML to use, and which not.
XML is highly standardized, and so are XML parsers.
That is why people have standards.
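
To make the CDATA point concrete: a CDATA section ends at "]]>", and its
content is still character data in the document's encoding, so a conforming
parser has to decode it. The sketch below uses Python's expat-based parser,
not the FPC one, purely as an illustration; the sample payload is an
assumption. Without a declaration the parser assumes UTF-8 and rejects the
ISO-8859-1 byte inside the CDATA section:

```python
import xml.etree.ElementTree as ET

# "café" encoded as ISO-8859-1: the é becomes the single byte 0xE9,
# which is not a valid UTF-8 sequence on its own.
payload = "caf\u00e9".encode("iso-8859-1")

without_decl = b"<msg><![CDATA[" + payload + b"]]></msg>"
with_decl = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + without_decl

def try_parse(data):
    """Return the element text, or None if the parser rejects the document."""
    try:
        return ET.fromstring(data).text
    except ET.ParseError:
        return None

print(try_parse(without_decl))  # rejected: bytes are not valid UTF-8
print(try_parse(with_decl))     # accepted: declared encoding is used
```

The alternative to sending the declaration is to send the content as UTF-8
in the first place, which is the default when no encoding is declared.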

Michael.


