[fpc-devel] fpdoc and unicode characters

Michael Van Canneyt michael at freepascal.org
Thu Aug 14 16:41:13 CEST 2008


> 
> 
> On Thu, 14 Aug 2008, Graeme Geldenhuys wrote:
> 
> > On Thu, Aug 14, 2008 at 4:12 PM, "Vinzent Höfler"
> <JeLlyFish.software at gmx.net> wrote:
> >
> > I suspect those entities get parsed by the DOM-unit as entities (which is generally the right thing to do) and simply get lost in the transformation back to the byte stream (a.k.a. AnsiString).
> >
> >> &#x2026;
> >>
> >> should be treated exactly the same as
> >>
> >> &lt;    or   &gt;    or   &amp;
> >>
> >> when generating HTML output from fpdoc.
> >
> > Yes. But the latter have a character code below 256 (even below 128, so they're plain 7-bit ASCII).
> >
> 
> 
> I think you have a point with the DOM-unit parsing the documentation
> content. So we can safely say, the actual content is NOT copied as-is!
> 
> 
> &#x2026;
> 
> If the above was interpreted as-is (with the rest of the content), it
> would be 8 ASCII characters, all with character codes below 256. No
> issues then!
> 
> So yes, I think I agree with you. Somewhere the above is being parsed,
> found to represent a character code above 255, and simply replaced
> with a ? character.  I simply found this confusing, because Michael is
> well versed in fpdoc, and when he said the content (not the XML tags)
> is copied as-is, I would not have envisioned any issues with
> Unicode-escaped characters.
>
> 
> Now we know better!  :-)

One test would be to add the cwstring unit to the fpdoc project and
try again, to see what it does.
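
To illustrate the suspected mechanism, here is a minimal sketch in Python.
It is not fpdoc's code and not the FPC DOM unit: it only shows that once a
parser resolves the character reference to a single Unicode character,
converting that character to a single-byte encoding which cannot represent
it yields a question mark. Latin-1 is an assumption standing in for
whatever 8-bit encoding the AnsiString conversion ends up using, and the
function names are hypothetical.

```python
import html


def resolve_entities(text: str) -> str:
    """Resolve character references, as an XML/HTML parser would."""
    return html.unescape(text)


def to_single_byte(text: str, encoding: str = "latin-1") -> bytes:
    """Convert to a single-byte encoding; unrepresentable characters become '?'."""
    return text.encode(encoding, errors="replace")


raw = "Wait&#x2026; &lt;done&gt;"
parsed = resolve_entities(raw)

# Unparsed, the reference is just eight plain ASCII characters.
print(to_single_byte(raw))
# Parsed, U+2026 has no Latin-1 equivalent and is substituted.
print(to_single_byte(parsed))
```

The characters behind &lt;, &gt; and &amp; survive the same round trip
because their codes are below 128; only the character above 255 is lost.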

Michael.

