[fpc-devel] Unicode support (yet again)

Felipe Monteiro de Carvalho felipemonteiro.carvalho at gmail.com
Fri Sep 16 13:52:52 CEST 2011


On Fri, Sep 16, 2011 at 12:38 PM, Marco van de Voort <marcov at stack.nl> wrote:
> In the UTF8 RTL, all "string"s _ARE_ utf8, unless specified otherwise (by
> naming them unicodestring or ansistring(..encoding) or shortstrings).

This is somewhat interesting, but then Lazarus and fpvectorial would
only work in the UTF-8 RTL. They would not work with the Unicode RTL
and they would not be compatible with libraries written to work only
for the Unicode RTL. I think it is more stable in the long run if the
LCL, fpvectorial and fpspreadsheet simply stop using TStringList
and use a UTF-8 variant from libutf8 instead.

> I hope though that Lazarus in time will see the light and change the Windows
> port to the UTF16 RTL, since when the manual conversions are removed, the
> places where encoding matters decreases significantly. (and the places where
> the automatic ones happen can vary without codechanges)

I am totally against using a string type that changes from platform to platform.

> I don't see the point of that. I don't see the reason to move the
> workarounds of Lazarus manual UTF8 conventions into the FPC repository that
> doesn't support those conventions.  Specially since it is only for the 2.6
> series that is already branched, because after that new solutions will
> remain available.

For me it is completely the opposite: this is mostly useful for 2.8+.

The LCL and fpvectorial (and fpspreadsheet, etc.) should work with
the Unicode RTL. fpvectorial and fpspreadsheet should theoretically even
be able to work with Delphi if necessary one day. So being locked into a
UTF-8 RTL which other people are not using is not a long-term
solution.

fpvectorial and fpspreadsheet use TStringList to pre-parse text which
is separated into lines. Therefore they cannot use a UTF-16 TStringList,
much less a TStringList whose string encoding is unknown. It must be a
raw UTF8String TStringList (raw as in supporting zero conversions;
conversions could also be added, but zero is a must), no more, no less.
If the RTL cannot provide this class, they need to use another one. I
don't want to duplicate code in the LCL, in fpvectorial and in
fpspreadsheet, so I propose that I be allowed to start a libutf8 which
will be a dependency for all 3.
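
To make the "zero conversions" requirement concrete, here is a minimal
sketch of a line splitter that only ever copies bytes. The names
TUtf8Lines and SplitUtf8Lines are hypothetical; they are not part of the
RTL or of any existing libutf8. It relies on the fact that in UTF-8 the
bytes $0A and $0D never occur inside a multi-byte sequence, so scanning
byte by byte is safe. The input is built from raw bytes so that the
example does not depend on the source file encoding.

```pascal
program RawUtf8Lines;
{$mode objfpc}{$H+}
{$ASSERTIONS ON}

type
  // Hypothetical type for illustration only.
  TUtf8Lines = array of UTF8String;

// Hypothetical helper: splits on LF, drops the CR of a CRLF pair,
// and otherwise copies the bytes untouched (no encoding conversion).
function SplitUtf8Lines(const S: UTF8String): TUtf8Lines;
var
  i, start, len, n: Integer;
begin
  Result := nil;
  n := 0;
  start := 1;
  // Run one step past the end so an unterminated last line is flushed.
  for i := 1 to Length(S) + 1 do
    if (i > Length(S)) or (S[i] = #10) then
    begin
      len := i - start;
      if (i > Length(S)) and (len = 0) then
        Break; // nothing left after a final LF
      if (len > 0) and (S[i - 1] = #13) then
        Dec(len);
      SetLength(Result, n + 1);
      Result[n] := Copy(S, start, len);
      Inc(n);
      start := i + 1;
    end;
end;

const
  // "caf" + C3 A9 (e-acute in UTF-8) + CR LF + "second"
  RawBytes: array[0..12] of Byte =
    ($63, $61, $66, $C3, $A9, $0D, $0A, $73, $65, $63, $6F, $6E, $64);

var
  Buf: UTF8String;
  Lines: TUtf8Lines;
begin
  SetLength(Buf, Length(RawBytes));
  Move(RawBytes[0], Buf[1], Length(RawBytes));

  Lines := SplitUtf8Lines(Buf);
  Assert(Length(Lines) = 2);
  Assert(Length(Lines[0]) = 5);        // 5 bytes, not 4 characters
  Assert(Ord(Lines[0][5]) = $A9);      // second byte of the e-acute survives
  Assert(Lines[1] = 'second');
  WriteLn(Length(Lines), ' lines, first line is ', Length(Lines[0]), ' bytes');
end.
```

A real class would of course wrap this in a TStringList-like interface;
the point of the sketch is only that lengths and indexes stay in bytes
and nothing is re-encoded along the way.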

-- 
Felipe Monteiro de Carvalho


