[fpc-devel] fpdoc and unicode characters

Graeme Geldenhuys graemeg.lists at gmail.com
Thu Aug 14 13:01:34 CEST 2008


In researching how to type Unicode characters on different platforms,
I came across an interesting argument regarding Unicode characters and
HTML.  The argument might apply to fpdoc documentation (xml) files as
well—hence the reason for this post.

With W3C embracing UTF-8 as the de facto standard for HTML pages, do
we still need to escape characters like the right single quotation
mark [’ U+2019] as [&rsquo;], etc.? Unicode has been around for some
time now, so surely all
half-decent software should be able to read and display the actual
character correctly by now (sensitive subject for FPC and Delphi at
the moment), instead of having to bother with the escaped version.

How does this argument fit with XML, which also uses UTF-8 as the de
facto standard encoding? And seeing that fpdoc uses XML for the
documentation files, can I use the actual Unicode characters in my
fpdoc documentation, or must I still stick with the—what now seems to
be outdated—escaped method?
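
As a rough illustration of the XML side of the question, the sketch below uses Python's standard xml.etree.ElementTree parser (not fpdoc itself, whose behaviour is not tested here) on a hypothetical <p> element. It assumes a UTF-8 encoded document without a DTD. Under those assumptions, a literal U+2019 and the numeric reference &#x2019; parse to the same text, while the HTML-style named entity &rsquo; is rejected because XML predefines only amp, lt, gt, apos and quot.

```python
import xml.etree.ElementTree as ET

HEADER = b'<?xml version="1.0" encoding="UTF-8"?>'

# The same hypothetical paragraph written three ways.
literal = HEADER + "<p>don\u2019t</p>".encode("utf-8")  # raw UTF-8 character
numeric = HEADER + b"<p>don&#x2019;t</p>"               # numeric character reference
named = HEADER + b"<p>don&rsquo;t</p>"                  # HTML named entity, no DTD

literal_text = ET.fromstring(literal).text
numeric_text = ET.fromstring(numeric).text

# Without a DTD declaring it, &rsquo; is an undefined entity in XML.
try:
    ET.fromstring(named)
    named_ok = True
except ET.ParseError:
    named_ok = False

print(literal_text == numeric_text)  # both forms yield the same string
print(named_ok)
```

Whether fpdoc's own XML reader and its output writers behave the same way would still need to be checked against fpdoc directly.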

These are the characters I was interested in.
— (U+2014): em dash
… (U+2026): horizontal ellipsis
’ (U+2019): right single quotation mark
“ (U+201C): left double quotation mark
” (U+201D): right double quotation mark
― (U+2015): quotation dash (introducing quoted text)
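
For anyone who does end up needing the escaped form, here is a minimal sketch, assuming Python 3.8 or later, that lists each of the code points above with its official Unicode name, its XML numeric character reference and its UTF-8 bytes. The helper name describe is made up for this example.

```python
import unicodedata

# The six code points from the list above.
CODE_POINTS = [0x2014, 0x2026, 0x2019, 0x201C, 0x201D, 0x2015]


def describe(cp):
    """Return name, numeric reference and UTF-8 bytes for one code point."""
    ch = chr(cp)
    return {
        "char": ch,
        "name": unicodedata.name(ch),
        "reference": f"&#x{cp:04X};",          # valid in both XML and HTML
        "utf8": ch.encode("utf-8").hex(" "),   # space-separated hex bytes
    }


rows = [describe(cp) for cp in CODE_POINTS]
for row in rows:
    print(f'{row["char"]}  {row["reference"]}  {row["utf8"]}  {row["name"]}')
```

Note that the official Unicode name of U+2015 is HORIZONTAL BAR; "quotation dash" is one of its common uses.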

Just for completeness, and the curious among you, here are some
methods of entering the Unicode characters on different platforms:

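Windows (Win2K and up I guess):
 ALT+<+><hex number> on the numeric keypad (this needs the
 EnableHexNumpad registry setting; plain ALT+<number> takes a
 decimal code)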

Linux (GTK2 only):
 * Press Ctrl+Shift+U, then type the desired hexadecimal code. The benefit
   of this approach is that you can correct the hexadecimal code if you
   typed it wrongly. (I never knew this handy little trick!)
 * The Compose key can also be used, but I don't know all the key
   combinations. The previous option always works.

No idea about the other platforms or desktop environments like KDE.

 - Graeme -

fpGUI - a cross-platform Free Pascal GUI toolkit
