[fpc-pascal] Default source encoding
Jonas Maebe
jonas.maebe at elis.ugent.be
Thu Mar 31 14:38:35 CEST 2016
On 31/03/16 14:12, Mattias Gaertner wrote:
> On Thu, 31 Mar 2016 13:52:54 +0200
> Jonas Maebe <jonas.maebe at elis.ugent.be> wrote:
>
>> On 31/03/16 13:46, Mattias Gaertner wrote:
>>
>>> According to
>>> http://wiki.freepascal.org/index.php?title=FPC_Unicode_support#String_constants
>>>
>>> "the constant strings are assumed to have code page 28591 (ISO 8859-1
>>> Latin 1; Western European)."
>>>
>>> Is this true?
>>
>> Yes.
>
> What happens on a Russian system cp1251 with a cp1251 AnsiString
> literal?
>
> writeln('Привет');
There are two separate things:
a) the code page that the compiler uses *if* it has to convert a string
at compile time to a different code page (e.g. because you assign the
string constant to an ansistring(1251), or to a unicodestring)
b) whether or not it will in fact convert a string at compile time to a
different code page
a) is what I was talking about above.
For b), the conditions are described in the section linked above.
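
As a rough illustration of the difference between a) and b), here is a minimal sketch. It assumes a source file without a {$codepage} directive and without -Fc, so the source file code page is CP_ACP. The program name, the type name TCp1251String and the variable names are made up for this example. The printed values are not listed here because they depend on the compiler version and the system.

```pascal
program ConstConvSketch;
{$mode objfpc}{$h+}
{ Assumption: no $codepage directive and no -Fc option, so the source
  file code page is CP_ACP. }

type
  { Hypothetical type name for an AnsiString with a fixed code page. }
  TCp1251String = type AnsiString(1251);

var
  s1251: TCp1251String;
  u: UnicodeString;
  raw: RawByteString;
begin
  { The targets have a fixed code page: if the compiler converts the
    constant at compile time, a) decides which code page it assumes
    the constant to be in. }
  s1251 := 'constant';
  u := 'constant';

  { The target is a RawByteString: no compile-time conversion is
    needed, so the constant keeps the code page it was stored with. }
  raw := 'constant';

  writeln('ansistring(1251) code page: ', StringCodePage(s1251));
  writeln('rawbytestring code page:    ', StringCodePage(raw));
  writeln('unicodestring length:       ', Length(u));
end.
```

With a pure ASCII constant such as this one, the choice of code page makes no visible difference. It only starts to matter once the constant contains characters outside ASCII.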
So, in this case: if the source file code page is CP_ACP (i.e., no
explicit code page specified), then writeln('constant') will call either
writeln(shortstring) or writeln(rawbytestring) (I'm not sure which one
by heart, it may depend on the state of {$h+}), and hence the described
rules for assigning a constant string to a shortstring/rawbytestring apply.
Therefore, no *compile time* conversion of the string type will happen
in this case, since the code page of the string constant and that of the
called helper match, or because the called helper uses rawbytestring.
This means that the string constant will be stored unmodified in the
binary with code page CP_ACP (a situation that can never happen in
Delphi versions that support code page-aware strings, but which is the
default in FPC because it matches the behaviour of previous FPC and
Delphi versions), and the string constant will be interpreted at run
time using whatever the actual value of DefaultSystemCodePage is at that
time.
So with DefaultSystemCodePage = 1251, a string constant encoded in
cp1251, and with source file code page = CP_ACP, the result of writeln
will be correct. Running such a program with a different
DefaultSystemCodePage may result in errors (depending on how much the
actual code page differs from cp1251 for the printed characters).
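
A minimal sketch of that scenario is given below. It assumes the file is saved in cp1251 and compiled without a {$codepage} directive or -Fc. The program name is made up for this example. Whether the output is readable depends on the DefaultSystemCodePage of the machine it runs on, so no expected output is given.

```pascal
program AcpConstSketch;
{$mode objfpc}{$h+}
{ Assumptions: this file is saved in cp1251, and there is no $codepage
  directive or -Fc option, so the source file code page is CP_ACP. }
begin
  { Shows which code page the constant below is interpreted in at run time. }
  writeln('DefaultSystemCodePage = ', DefaultSystemCodePage);

  { The bytes of the constant are stored unmodified in the binary and
    are interpreted at run time using DefaultSystemCodePage. }
  writeln('Привет');
end.
```

If DefaultSystemCodePage is 1251 at run time, the bytes are interpreted as intended. With any other value the same bytes are interpreted in that other code page.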
Jonas