[fpc-pascal] String literals and code page of .pas source file

Michael Van Canneyt michael at freepascal.org
Mon Sep 14 16:09:22 CEST 2020



On Mon, 14 Sep 2020, Tomas Hajny via fpc-pascal wrote:

>>> application (let's say notepad.exe) will result in garbage.  I don't 
>>> say that it is necessarily bad, but it should be documented at least 
>>> if we want to keep it that way.
>> 
>> I would definitely keep it that way.
>> 
>> As I see it: Redirection or not should not matter, the system should
>> assume console output.
>> Things like 'tee' make this concept dubious in any case:
>> 
>> If you pipe output to a program, you don't expect the codepage to change
>> because of the redirection.
>
> No problem, but I'd suggest documenting it at least.

Document what exactly? That redirecting does not change the codepage?


>>>> I think it will differ since Crt is not codepage aware. If you want it to
>>>> work the same you'll have to make Crt codepage (and hence unicode) aware.
>>> 
>>> As mentioned by me, Crt is currently more codepage aware than the 
>>> System unit output as far as output to console is concerned, because 
>>> Crt provides correct output even for shortstrings (unlike the System 
>>> unit).
>> 
>> I would need to check the details, but that sounds more like
>> accidental behaviour as opposed to intended behaviour to me :-)
>
>
> Not really accidental:
>
> r3606 | florian | 2006-05-20 23:42:58 +0200 (Sat, 20 May 2006) | 2 lines
> * fix from Maxim Ganetsky to fix CRT output with non latin code pages, 
> should fix #6785
>
> (there were additional changes performed later, but the primary change 
> was this one)

Does this handle UTF-8?

Judging by the sources, I would think not:

Interface

{$mode fpc} // Shortstring is assumed
{$i crth.inc}

Const
   { Controlling consts }
   Flushing     = false;               {if true then don't buffer output}
   ConsoleMaxX  = 1024;
   ConsoleMaxY  = 1024;
   ScreenHeight : longint = 25;
   ScreenWidth  : longint = 80;

Type
   TCharAttr=packed record
     ch   : char;
     attr : byte;
   end;
   TConsoleBuf=Array[0..ConsoleMaxX*ConsoleMaxY-1] of TCharAttr;
   PConsoleBuf=^TConsoleBuf;

var
   ConsoleBuf : PConsoleBuf;

Since every screen position holds only a single char (one byte), there is no
way this can handle UTF-8, where a single character can take up to four bytes.
Other "real" single-byte codepages may well work, yes.
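
To make the size argument concrete, here is a minimal sketch. It is not taken
from the Crt sources: the program name, the EAcute constant and the Cell
variable are made up for the example, and only the TCharAttr record is copied
from the declaration above. The UTF-8 bytes are written out explicitly so the
result does not depend on the code page of the source file.

```pascal
program cellsize;

{$mode fpc} // Shortstring and 1-byte char, as in the Crt interface

type
  { Same layout as the record declared in the Crt interface above }
  TCharAttr = packed record
    ch   : char;
    attr : byte;
  end;

const
  { Hypothetical test data: U+00E9 (e with acute accent) encoded as UTF-8 }
  EAcute : string = #$C3#$A9;

var
  Cell : TCharAttr;

begin
  { One screen position has room for exactly one byte of character data }
  writeln('SizeOf(TCharAttr.ch): ', SizeOf(Cell.ch));
  { The single encoded character needs two bytes }
  writeln('Length(EAcute)      : ', Length(EAcute));
  { Storing it in a cell keeps only the lead byte; the second byte would
    have to go into the next cell, where it is treated as another character }
  Cell.ch := EAcute[1];
  Cell.attr := 7;
  writeln('Byte stored in cell : ', Ord(Cell.ch));
end.
```

This should report 1 byte for the cell's character field, 2 bytes for the
encoded character, and 195 ($C3) as the only byte that ends up in the cell.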


Michael.

