[fpc-devel] ccharset.pas, charset.pas and strings/unicode ?

Sven Barth pascaldragon at googlemail.com
Wed Apr 6 14:40:53 CEST 2011


On 06.04.2011 08:30, Skybuck Flying wrote:
> Hello,
>
> I am having momentarily confusion about the situation with ccharset.pas
> and charset.pas and strings, ansistrings and unicode in general... ?!?
>
> So some questions about this:
>
> I in particularly do not understand the following uses clausule:
>
> {$ifdef VER2_2}ccharset{$else VER2_2}charset{$endif VER2_2},
>
> Somewhere it says something about bootstrapping and stuff like that...
> it seems to have something to do with unicode mappings...
>
> It also said that this wasn't necessary anymore beyond version 2.2.2 or
> something ?
>

Something like this is normally done when code is added to the RTL (in
this case the unit "charset") that is used by the compiler as well. The
compiler must first be built with an older compiler (and its older RTL),
and that older compiler does not yet know about the "charset" unit.
Therefore the unit is copied to the compiler's directory with a "c"
prefix (in this case "ccharset") until a release is made which contains
the new unit. The unit you are looking for is in rtl/inc now, so that
ifdef construct (and the ccharset unit) could be removed now.
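
As a rough sketch of how that conditional behaves (the program name is
made up for illustration, and under a 2.2.x compiler it assumes that
ccharset.pas from the compiler directory is in the unit search path):

```pascal
program bootstrapsketch;  { hypothetical name, for illustration only }

{ VER2_2 is predefined by every 2.2.x compiler. Its RTL has no "charset"
  unit yet, so the private copy "ccharset" is used while bootstrapping.
  Newer compilers take "charset" from the RTL instead. }
uses
  {$ifdef VER2_2}ccharset{$else}charset{$endif};

begin
  {$ifdef VER2_2}
  WriteLn('2.2.x compiler: using the bootstrap copy ccharset');
  {$else}
  WriteLn('newer compiler: using charset from the RTL');
  {$endif}
end.
```

So the condition only ever selects ccharset when the compiler doing the
build is a 2.2.x one.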

Something similar was done a few days ago with the new "windirs" unit 
which was added as "cwindirs" to the compiler as well.

> This seems to me like a little unicode-hack to get unicode into the
> compiler or something ?
>
> What the hell is this ? =D
>
> Anyway some questions about the free pascal 2.4.2 sources in relation to
> Delphi XE situation:
>
> In the latest Delphi versions "string" is now considered a Unicode string.
>
> What's the situation with the "options.pas" in the compiler folder ?
>
> Lot's of string stuff and character stuff going on there... ansistring
> versus unicodestring, ansichar versus unicodechar ?
>

Options.pas has nothing to do with the different string types. It parses
the command line arguments and the configuration file and sets up the
initial defines based on those arguments and files. In most cases you
don't need to touch options.pas at all.

> Seems a bit conflicting for what I am trying to do... which is use some
> of this code in Delphi...
>
> So I am getting all kinds of typecast/implicit string cast warnings and
> errors and stuff and potential data loss
> from "string" to "ansistring"... a bit too whacky for my taste but ok...
>
> So to get some sense into all of this let me ask you a simple question:
>
> 1. What type of strings does free pascal use ? Especially in options.pas ?
>
> Are these "string" types considered to be AnsiStrings or UnicodeStrings ???
>
> And what about "char" types ? Are those AnsiChar or UnicodeChar ???
>
> (probably also know as widechar,widestrong...)
>

The compiler itself mostly uses ShortString and pointers to ShortString,
as they don't have reference counting and thus are faster to handle.
In some rare cases AnsiString (aka String) is used, and WideString is,
as far as I'm aware, never used.

The string types supported by FPC, though, are ShortString, AnsiString,
WideString (non reference counted 2 Byte String for Windows
compatibility) and UnicodeString (reference counted 2 Byte String). On
all platforms except Windows (Win32, Win64, WinCE) a WideString is an
alias for UnicodeString.
In mode Delphi "String" is an alias for "AnsiString"; in all other modes
(unless $H+ is given) "String" is an alias for "ShortString".
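
A small sketch of the $H switch (the program and type names are invented
for the example; it assumes mode objfpc without the unicodestrings
modeswitch):

```pascal
program stringalias;  { hypothetical name, for illustration only }
{$mode objfpc}

{$H-}
type
  TStringWithoutH = string;  { with $H- "string" means ShortString }

{$H+}
type
  TStringWithH = string;     { with $H+ "string" means AnsiString }

begin
  { A ShortString is a length byte plus up to 255 characters. }
  WriteLn('SizeOf with $H-: ', SizeOf(TStringWithoutH));
  { An AnsiString variable is only a pointer to heap data. }
  WriteLn('SizeOf with $H+: ', SizeOf(TStringWithH));
  WriteLn('Pointer size:    ', SizeOf(Pointer));
end.
```

The first size is that of a full ShortString buffer, while the second
matches the pointer size, because the AnsiString data lives on the heap.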

> (I have in principle done no real programming yet with the newer Delphi
> versions with the unicode stuff in it...
> so this is new stuff for me... and now a bit confusion unfortunately...
> and perhaps even unavoidable confusion...
> because this "reinterpretation" that "new-borland" did is now
> conflicting and causing interpretation issue's :(
> so it depends on the compiler... and I don't know what free pascal
> does... so that's why I ask here...)
>
> Also there is something I don't understand about the conditional way above:
>
> It reads in away:
>
> IF VERSION IS 2.2 THEN USE CCHARSET ELSE CHARSET
>
> The thing is: I am using 2.4.2 and CHARSET is missing from 2.4.2

This condition is the correct one. CCharSet should probably be removed,
as all compilers from 2.4.0 on use CharSet from the RTL directory.

>
> So perhaps this conditional was ment to read something like:
>
> if Version > 2.2 then use CCHARSET else CHARSET; ???
>
> So for 2.4.2 I must probably use CCHARSET.pas the thing with the
> confusing strings remains though ;)
>
> So for messy posting... but this is messy ! ;) =D

No, it's not ;)

Regards,
Sven


