[fpc-devel] fpc compiler wrong encoding in console output on Windows
Ondrej Pokorny
lazarus at kluug.net
Wed Aug 30 23:38:20 CEST 2023
On 30.08.2023 at 20:21, Ondrej Pokorny via fpc-devel wrote:
> On 30.08.2023 17:35, Tomas Hajny via fpc-devel wrote:
>> On 2023-08-30 17:23, Ondrej Pokorny via fpc-devel wrote:
>>> Sorry to bother you with something as trivial: is your t2.pas file
>>> really encoded in UTF-8?
>>>
>>> Because if I compile an ANSI file with the {$codepage utf8}
>>> declaration, then I get "correct" output. But obviously this is very
>>> wrong.
>>>
>>> You can try yourself with the attached files. So maybe this is your
>>> mistake?
>>
>> Well, you're right, this was indeed my mistake, shame on me. :-( Then
>> I can confirm that the compiler behaviour is indeed wrong (although I
>> have no clue why it behaves that way).
>
> Having seen the outputs, I think that the compiler just ignores the
> source file encoding for {$MESSAGE} and {$NOTE}. It always reads them
> as ANSI and then converts them to DOS-whatever.
>
> That would explain why the UTF-8 byte stream ends up encoded in the DOS CP.
>
> So the fix should be quite easy - when {$MESSAGE} or {$NOTE} is read
> into a string, set the correct codepage of the string.
I was correct in my assumption and I was able to fix it:
https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/482
On the other hand, the $CODEPAGE docs:
https://www.freepascal.org/docs-html/prog/progsu87.html#x95-940001.3.4
state that only string literals follow $CODEPAGE and that the actual
code must be in US-ASCII.
But you know: Delphi compatibility :) ...and {$note ä} does not trigger
an "illegal character" compiler error the way this does:
var
  ä: string;
so one would expect {$note ä} to show up correctly.
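For anyone who wants to reproduce this, a minimal test file could look roughly like the sketch below. The program name is made up for illustration, and the file has to be saved as real UTF-8 (not ANSI) for the test to mean anything. With a compiler that lacks the fix, the note text would be expected to come out garbled on a Windows console; with the fix it should show the ä as written.

```pascal
{$codepage utf8}   // declares that this source file is encoded in UTF-8
program notetest;  // hypothetical name, not one of the files from this thread

// Non-ASCII text inside a note directive. The compiler should print it
// using the declared source encoding instead of treating it as ANSI.
{$note ä}

begin
end.
```

Compiling this with a plain "fpc notetest.pas" in a Windows console and looking at the Note line should be enough to see whether the encoding is handled correctly.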
Ondrej