[fpc-devel] XML Components

Michael Van Canneyt michael at freepascal.org
Fri Nov 2 17:49:21 CET 2012



On Fri, 2 Nov 2012, Andrew Brunner wrote:

> So where in the specs does it say that parsers must reject certain byte sequences between cdata tags excepting XML tags.
>
> If this is supported by specs it would help shape a viable solution.

Where did you get the idea that it is supported?

The specs list the allowed characters. Section 2.2:

[Definition: A character is an atomic unit of text as specified by ISO/IEC 10646:2000 [ISO/IEC 10646].]
Legal characters are tab, carriage return, line feed, and the legal characters of Unicode and ISO/IEC 10646.

Therefore, any character not in the list is not a legal character and should be rejected.
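
As a rough illustration of that rule (a sketch in Python, not the FPC parser code): the ranges below are the Char production from section 2.2 of the XML 1.0 spec, while the function names is_xml_char and first_illegal_char are made up for this example.

```python
def is_xml_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # up to the start of the surrogate block
        or 0xE000 <= cp <= 0xFFFD      # excludes surrogates, U+FFFE and U+FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )


def first_illegal_char(text: str):
    """Return (index, code point) of the first illegal character, or None."""
    for index, ch in enumerate(text):
        if not is_xml_char(ord(ch)):
            return index, ord(ch)
    return None
```

A strict parser would presumably stop with an error at the reported position instead of skipping the character, inside CDATA sections as well as elsewhere.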

Speaking from painful experience: relaxing this rule and silently ignoring illegal
characters will at some point lead to problems when you encounter a system that
enforces the rules more strictly.

You will then wonder how it can be that the XML is considered invalid when your own 
XML code handles it so nicely. Whereas now, you already know.

To show that this is not just idle talk: I ran your XML through the MS-XML parser,
and it, too, complained about an illegal character in the input.
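
The same kind of cross-check can be sketched with another strict parser. The example below uses Python's expat-based xml.etree.ElementTree as a stand-in; it is not MS-XML and not the original input. The helper name parser_accepts and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def parser_accepts(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text as well-formed."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # Raised for any well-formedness error, including illegal characters.
        return False
    return True


# A CDATA section with ordinary text, and one with the control character U+0001.
clean_doc = "<doc><![CDATA[plain text]]></doc>"
dirty_doc = "<doc><![CDATA[bad \x01 byte]]></doc>"
```

Here parser_accepts(clean_doc) should be true and parser_accepts(dirty_doc) false: wrapping the character in CDATA does not make it legal.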

Forewarned is forearmed.

Michael.


