[fpc-devel] fpc compiler wrong encoding in console output on Windows

Ondrej Pokorny lazarus at kluug.net
Wed Aug 30 23:38:20 CEST 2023


On 30.08.2023 at 20:21, Ondrej Pokorny via fpc-devel wrote:
> On 30.08.2023 17:35, Tomas Hajny via fpc-devel wrote:
>> On 2023-08-30 17:23, Ondrej Pokorny via fpc-devel wrote:
>>> Sorry to bother you with something so trivial: is your t2.pas file
>>> really encoded in UTF-8?
>>>
>>> Because if I compile an ANSI file with the {$codepage utf8}
>>> declaration, then I get "correct" output. But obviously this is very
>>> wrong.
>>>
>>> You can try yourself with the attached files. So maybe this is your 
>>> mistake?
>>
>> Well, you're right, this was indeed my mistake, shame on me. :-( Then 
>> I can confirm that the compiler behaviour is indeed wrong (although I 
>> have no clue why it behaves that way).
>
> Having seen the outputs, I think that the compiler just ignores the 
> source file encoding for {$MESSAGE} and {$NOTE}. It always reads them 
> as ANSI and then converts them to DOS-whatever.
>
> That would explain why the UTF-8 byte stream is encoded into the DOS CP.
>
> So the fix should be quite easy - when {$MESSAGE} or {$NOTE} is read 
> into a string, set the correct codepage of the string.

I was correct in my assumption and I was able to fix it: 
https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/482

On the other hand, the $CODEPAGE docs: 
https://www.freepascal.org/docs-html/prog/progsu87.html#x95-940001.3.4
state that only literal strings follow $CODEPAGE and that the 
actual code must be in US-ASCII.

But you know: Delphi compatibility :) ...and there is no "illegal 
character" compiler error as there is for:

var
   ä: string;

so one would expect {$note ä} to show up correctly.
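
For reference, a minimal sketch of the kind of test file this is about. The 
file name t_note.pas is hypothetical (it is not the t2.pas from earlier in 
the thread), and the file is assumed to be saved as UTF-8:

```pascal
program t_note;

{ Assumption: this file is saved as UTF-8, matching the directive below. }
{$codepage utf8}

{ User-defined note containing a non-ASCII character. }
{$note ä}

begin
end.
```

If my reading of the old behaviour is right, compiling this on a Windows 
console without the fix shows the note text garbled, because the UTF-8 bytes 
are treated as ANSI before being converted to the console codepage. With the 
merge request applied, the note should show ä.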

Ondrej


