[fpc-devel] Re: EBCDIC ( was On a port of Free Pascal to the IBM 370)

Hans-Peter Diettrich DrDiettrich1 at aol.com
Tue Jan 31 02:44:36 CET 2012


Sven Barth wrote:
> On 30.01.2012 20:31, steve smithers wrote:
>>> Hans-Peter Diettrich wrote on Mon, 30 Jan 2012 17:40:27 +0100
>>> Existing source code frequently assumes ASCII encoding. The obvious are
>>> upper/lowercase conversions, by and/or or add/sub constant values to the
>>> characters. It will be hell to find and fix all such code in the
>>> compiler and RTL, even if only the constants have to be modified for
>>> EBCDIC. Even code with the assumed order of common characters (' ' < '0'
>>> < 'A' < 'a') has to be found and fixed manually - how would you even
>>> *find* code with such implicit assumptions?
>>
>> It does indeed.  I am aware of the problems inherent in this.  But the
>> RTL has to be more or less rewritten anyway to support OS.  OS is a very
>> different animal to Windows or Linux.
> 
> The RTL consists of two parts (though the border is not easily visible):
> a platform independent one and a platform dependent one. A port to a
> different target normally only includes touching the platform dependent
> one, but a port to 370 also requires touching the platform independent
> one. This is what DoDi talks about.

It's not anything the compiler could solve. Consider what happens with,
e.g.:
   for c := 'A' to 'Z' do ...
   for c := '0' to 'Z' do ...
(where the literals 'A' etc. could be named constants, or computed values)

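With EBCDIC encoding the second loop will never be entered, because '0'
(0xF0) sorts after 'Z' (0xE9)! The first loop still runs, but it also
visits the non-letter codes in the gaps between 'I' and 'J' and between
'R' and 'S'.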
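
The effect can be checked without a mainframe. What follows is only a
sketch, in Python rather than Pascal, because Python ships an EBCDIC
codec; it assumes code page 037 ("cp037") as the EBCDIC variant, and the
helper names ebcdic and char_range are made up for this illustration.

```python
# Sketch only: "cp037" is an assumed stand-in for the target's EBCDIC
# code page; ebcdic() and char_range() are hypothetical helper names.

def ebcdic(ch):
    """Return the cp037 code of a single character."""
    return ch.encode("cp037")[0]

def char_range(lo, hi):
    """Codes a Pascal `for c := lo to hi` would visit under cp037."""
    return list(range(ebcdic(lo), ebcdic(hi) + 1))

a_to_z = char_range("A", "Z")
zero_to_z = char_range("0", "Z")

# Codes inside 'A'..'Z' that are not uppercase letters at all.
letters = {ebcdic(c) for c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ"}
non_letters = [cp for cp in a_to_z if cp not in letters]

print(len(a_to_z), len(non_letters))   # 41 codes visited, 15 not letters
print(len(zero_to_z))                  # 0, the loop body never runs
print(sorted(" 0Aa", key=ebcdic))      # [' ', 'a', 'A', '0']
```

The last line shows the collating order under that code page: lowercase
sorts before uppercase, and the digits come last, the reverse of what
ASCII-minded code expects.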

> @other devs: Could the code page aware AnsiString type be of any help here?

Only on the I/O side, when files are read or written, or when strings
(filenames!) are sent or received via the OS API. The latter reminds me
of the Windows OEM charset, used in console I/O, which could be
repurposed to mean EBCDIC on IBM consoles.
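
As a rough sketch of what "only on the I/O side" means: strings stay in
the internal encoding and are converted exactly once at the boundary.
This is again Python for illustration, not FPC code; HOST_CODEC, to_host
and from_host are hypothetical names, and cp037 is an assumed code page.

```python
# Sketch only: HOST_CODEC, to_host() and from_host() are hypothetical.
HOST_CODEC = "cp037"  # assumed EBCDIC variant; a real system may differ

def to_host(text):
    """Encode an internal string before handing it to the OS or a file."""
    return text.encode(HOST_CODEC)

def from_host(raw):
    """Decode bytes received from the OS or read from a file."""
    return raw.decode(HOST_CODEC)

name = "SYS1.PROCLIB"      # example name, for illustration only
wire = to_host(name)       # EBCDIC bytes as the OS API would see them
print(wire.hex())
print(from_host(wire) == name)
```

Such a conversion does nothing for comparisons or loops over single
characters inside the program; those still run in the internal encoding.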

Unfortunately the encoding is attached only to *strings*, not to
single characters. New types like EBCDICchar could be introduced,
distinct from AnsiChar, together with a directive telling the compiler
"literals are EBCDIC" or "Char is EBCDICchar".

DoDi



