cpstrnew branch (was Re: [fpc-devel] Freepascal 2.4.0rc1 released)

Marco van de Voort marcov at stack.nl
Wed Nov 11 16:31:13 CET 2009


In our previous episode, Tomas Hajny said:
> >
> > We might implement 1 or 2 of those. Most of them will however be
> > handled by libiconv, the Windows code page conversion APIs, or some
> > other external library (just like with the current widestring manager).
> 
> Nevertheless: is e.g. ISO 8859-2 character set referenced the same way
> under different platforms (in the new concept), or would the new codepage
> number contain different values depending on the host platform? Does
> libiconv allow referencing the character sets using some numeric
> identifier at all? If yes, where are these identifiers defined? As an
> example, MS Windows addresses ISO 8859-2 as codepage number 28592 whereas
> OS/2 uses codepage number 912.

Yes, this is a problem. When I wrote the unicode document I thought about
this too, and no solution is perfect. (Using Windows codepage numbers
everywhere is strange for users on other platforms, but you don't want to
break Delphi compatibility either.)

So I came up with a compromise (solution 3 below).

There are three solutions:

1 Delphi compatible: always use Windows encodings.
2 Define a handful of constants that map to the encoding on the given
  platform.  FPC_CODEPAGE_8859_2 =...
3 A mix of 1 and a handful of our own constants:

Take a range of, say, 30-50 values that are free in the Windows range.
Have a per-platform table that maps these values to native codepage
numbers. The indexes into this table get nice names like in option 2.

This way, the encoding translation routine can do:

if (encoding>=fpc_encoding_low) and (encoding<=fpc_encoding_high) then
   begin
     // FPC-specific constant: index into the per-platform table
     encoding:=fpc_encodingtable[encoding-fpc_encoding_low]; // cheap lookup
   end
else
   begin
     // anything else is treated as a Windows codepage number
     encoding:=windowsencoding2nativeencoding[encoding];
   end;
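
To make the idea concrete, here is a minimal, self-contained sketch of
solution 3. Everything in it is an assumption for illustration only: the
reserved range 64000..64001, the names FPC_CODEPAGE_8859_1,
FpcEncodingTable, WindowsEncodingToNative and TranslateEncoding, and the
choice to pass unknown values through unchanged. The table is filled in the
way an OS/2 build might do it (912 for ISO 8859-2 as mentioned above, and
IBM codepage 819 for ISO 8859-1); none of this is existing RTL code.

```pascal
program EncodingMapSketch;
{$mode objfpc}

const
  // Hypothetical reserved range. Real values would have to be picked from
  // numbers that are actually unused in the Windows codepage space.
  FPC_ENCODING_LOW  = 64000;
  FPC_ENCODING_HIGH = 64001;

  // Named constants as in option 2; each one is an offset into the table.
  FPC_CODEPAGE_8859_1 = FPC_ENCODING_LOW + 0;
  FPC_CODEPAGE_8859_2 = FPC_ENCODING_LOW + 1;

  // Per-platform table, here with OS/2-style native codepage numbers.
  FpcEncodingTable: array[0..FPC_ENCODING_HIGH - FPC_ENCODING_LOW] of Word =
    (819, 912);

// Stand-in for the Windows -> native mapping. A real implementation would
// need a complete table (or a call into the platform's conversion library).
function WindowsEncodingToNative(Encoding: Word): Word;
begin
  case Encoding of
    28591: Result := 819;   // Windows number for ISO 8859-1
    28592: Result := 912;   // Windows number for ISO 8859-2
  else
    Result := Encoding;     // placeholder: unknown values pass through
  end;
end;

function TranslateEncoding(Encoding: Word): Word;
begin
  if (Encoding >= FPC_ENCODING_LOW) and (Encoding <= FPC_ENCODING_HIGH) then
    Result := FpcEncodingTable[Encoding - FPC_ENCODING_LOW]  // cheap lookup
  else
    Result := WindowsEncodingToNative(Encoding);
end;

begin
  // Both spellings of ISO 8859-2 end up at the same native codepage (912).
  WriteLn(TranslateEncoding(FPC_CODEPAGE_8859_2));
  WriteLn(TranslateEncoding(28592));
end.
```

The range check uses inclusive bounds so that the first and last reserved
values are usable as table indexes.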

Delphi users would only have to define the FPC constants of (2) as their
respective Windows codepages to keep the code working.
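
A rough sketch of what such a compatibility definition could look like in
shared source code. The constant name follows option 2 above; the FPC-side
value 64001 is only a stand-in for whatever the RTL would provide under this
proposal, while 28592 is the Windows codepage for ISO 8859-2 mentioned
earlier in the thread.

```pascal
program DelphiShimSketch;
{$IFDEF FPC}
  {$mode objfpc}
{$ELSE}
  {$APPTYPE CONSOLE}
{$ENDIF}

const
{$IFDEF FPC}
  // Stand-in value: under the proposal this would come from the FPC RTL.
  FPC_CODEPAGE_8859_2 = 64001;
{$ELSE}
  // Delphi: define the FPC name as the plain Windows codepage number.
  FPC_CODEPAGE_8859_2 = 28592;
{$ENDIF}

begin
  // Code below this point can use the same constant on both compilers.
  WriteLn(FPC_CODEPAGE_8859_2);
end.
```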


