[fpc-pascal] XMLWrite looses data

Graeme Geldenhuys mailinglists at geldenhuys.co.uk
Mon Mar 24 21:12:50 CET 2014


On 2014-03-24 13:58, Mattias Gaertner wrote:
> 
> Yes, XSL is XML.

Thought so - thanks for confirming.


> The parser converts &#*; to Unicode characters when
> reading. AFAIR some xsl parsers like xsltproc do the same.
> If you want xslt to output '&#xA0;' you can use '&amp;#xA0;'

Thanks for that info, it helped find the problem (though no solution
yet). The character isn't actually a unicode character, it is simply a
"no-break space" character at position $A0 in the extended ASCII
(Latin-1) chart. It was escaped using hex value notation, instead of the
more popular decimal notation.

=======[ charmap details ]============
U+00A0 NO-BREAK SPACE
UTF-8: 0xC2 0xA0
UTF-16: 0x00A0

C octal escaped UTF-8: \302\240
XML decimal entity: &#160;
=========================
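The charmap values above can be cross-checked with a few lines of code. This is only a sketch in Python (not Free Pascal), chosen because its standard library makes the check short; the variable names are purely illustrative.

```python
import unicodedata

nbsp = "\u00a0"  # the character at code point $A0

name = unicodedata.name(nbsp)        # official Unicode character name
utf8 = nbsp.encode("utf-8")          # UTF-8 byte sequence
utf16 = nbsp.encode("utf-16-be")     # single UTF-16 code unit, big-endian
decimal_ref = "&#%d;" % ord(nbsp)    # XML decimal character reference
hex_ref = "&#x%X;" % ord(nbsp)       # XML hex character reference

print(name, utf8, utf16, decimal_ref, hex_ref)
```

Both the decimal and the hex reference name the same code point, so an XML parser treats them identically.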

But I now see what happened. When I enabled "show hidden characters"
like spaces and tabs in my editor, I noticed that the no-break space
character is still there, but in the resaved output file it is simply
not escaped any more.
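The same read/write behaviour can be reproduced outside fcl-xml. The following is a minimal sketch using Python's xml.etree.ElementTree, not the fcl-xml API; the <date> element is a made-up stand-in for the real XSL content. It shows that the parser resolves the character reference on read, and that whether it is escaped again on write depends only on the output encoding.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment with the no-break space escaped in hex notation.
source = "<date>24&#xA0;Mar&#xA0;2014</date>"
root = ET.fromstring(source)

# After parsing, the text holds the real U+00A0 character; the
# information that it was written as a reference is gone.
parsed_text = root.text

# Serialising to a Unicode string writes the character unescaped.
as_unicode = ET.tostring(root, encoding="unicode")

# Serialising to an ASCII-only encoding forces a character reference,
# because U+00A0 cannot be represented in that encoding.
as_ascii = ET.tostring(root, encoding="us-ascii")

print(repr(parsed_text))
print(repr(as_unicode))
print(as_ascii)
```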

How is the fcl-xml package supposed to handle escaped characters which
will form part of the data the XSL will generate? Is fcl-xml supposed to
write them back as escaped characters, or as a normal un-escaped character?

I tried using the decimal notation too: &#160;
And that produced the same result as the original.

Note:
When we process an XML file with our XSL file, we want the resulting
output to have a no-break character - we don't want to display the text
'&#xA0;' - which I think is what your suggestion with the &amp; will produce.
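The difference between the two spellings can be seen by parsing both. Again this is a sketch in Python's xml.etree.ElementTree rather than fcl-xml, and the <t> element is hypothetical: an escaped ampersand yields the literal text of the reference, while a real character reference yields the no-break space itself.

```python
import xml.etree.ElementTree as ET

# Ampersand escaped: the parser sees an '&' followed by plain text.
escaped_amp = ET.fromstring("<t>dd&amp;#xA0;MMM</t>").text

# Real character reference: the parser produces U+00A0.
char_ref = ET.fromstring("<t>dd&#xA0;MMM</t>").text

print(repr(escaped_amp))
print(repr(char_ref))
```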

To put this in context, in case my original XSL snippet wasn't clear.
That snippet generates a date string in the format 'dd MMM yyyy' and the
spaces between those elements are not normal spaces, but no-break
spaces, so that the whole text stays together (and wouldn't word-wrap in the
middle).


The current resaved XSL file still works, but not being able to
physically see the no-break space characters could cause us problems
months down the line when we re-edit those files. Hence the reason they
were escaped (to make them clearly visible to the developer).
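One possible way to keep the characters visible after a resave is to re-escape them in the serialised text. The sketch below does this in Python, not with fcl-xml; serialize_with_visible_nbsp is a hypothetical helper, and it assumes the no-break spaces only occur in text or attribute content (inside comments or CDATA sections a reference would not be equivalent).

```python
import xml.etree.ElementTree as ET

def serialize_with_visible_nbsp(root):
    """Serialise root, writing every U+00A0 as the reference &#xA0;."""
    text = ET.tostring(root, encoding="unicode")
    # Assumption: U+00A0 appears only in text or attribute content.
    return text.replace("\u00a0", "&#xA0;")

# Hypothetical fragment standing in for the real XSL content.
root = ET.fromstring("<date>24&#xA0;Mar&#xA0;2014</date>")
resaved = serialize_with_visible_nbsp(root)

# Re-parsing the resaved text gives back the same characters.
round_trip = ET.fromstring(resaved).text

print(resaved)
```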


Regards,
  - Graeme -

-- 
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/


