[fpc-devel] ansistrings and widestrings

Alexey Barkovoy clootie at ixbt.com
Sun Jan 9 14:46:58 CET 2005


----- Original Message ----- 
From: "Marco van de Voort" <marcov at stack.nl>
To: "FPC developers' list" <fpc-devel at lists.freepascal.org>
Sent: Sunday, January 09, 2005 2:53 PM
Subject: Re: [fpc-devel] ansistrings and widestrings


>> This is the level where multibyte characters can come in, so that just a
>> Character can be different from any fixed-size data type, and that the
>> same Character can have multiple representations - remember your umlaut
>> example? Nonetheless the rules on the Character level at least are quite
>> well defined, so that it's possible to implement according standard
>> procedures for comparison and conversion.
>
>> Of course these procedures
>> require parameters like the language and the encoding of the characters,
>> so that IMO exchangeable and configurable classes are the best containers
>> for characters.
>
> The problem with string-classes is that you lose all automatism. This
> complicates each and every operation where new strings are created from old
> ones. This is what Peter was hinting at.

So it seems the best approach here is to leave the compiler-generated code for
equality and comparison as a plain binary comparison of bytes (btw, that's the
way Delphi does it) and to introduce a set of string handling functions that
are aware of the language-dependent encoding.

For the current compiler implementation this means changing

Type
  TWide2AnsiMove=procedure(source:pwidechar;dest:pchar;len:SizeInt);
  TAnsi2WideMove=procedure(source:pchar;dest:pwidechar;len:SizeInt);

to

Type
  // Length parameters are the number of CHARS, not bytes
  TWide2AnsiMove=function(source:pwidechar; srclen:SizeInt;
                          dest:pansichar; destlen:SizeInt): SizeInt;
  TAnsi2WideMove=function(source:pansichar; srclen:SizeInt;
                          dest:pwidechar; destlen:SizeInt): SizeInt;

These functions should return the actual number of characters written to the
output. Returning ZERO should indicate insufficient destination size. On
Windows, WideCharToMultiByte can report the number of characters needed in the
output buffer, but LIBICONV (http://www.gnu.org/software/libiconv/ - a library
suited for all UNIXes) doesn't allow this. So the common solution (if the
result of the conversion is to be stored in an AnsiString or WideString) is
simply to enlarge the output buffer until TWide2AnsiMove / TAnsi2WideMove
returns a non-zero value.
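
A minimal sketch of that retry loop, assuming the proposed signature above.
Utf8LikeMove and WideToAnsi are hypothetical names invented for this
illustration, not existing RTL routines; the stand-in converter only emits
one- or two-byte UTF-8 sequences and replaces everything else with '?'. The
empty-string shortcut is also an assumption, added because a zero result would
otherwise be ambiguous for empty input.

```pascal
program GrowBufferSketch;
{$mode objfpc}{$H+}

type
  // Proposed signature from this message; lengths are counted in CHARS.
  TWide2AnsiMove = function(source: PWideChar; srclen: SizeInt;
                            dest: PAnsiChar; destlen: SizeInt): SizeInt;

// Hypothetical stand-in converter (not an RTL routine).
// Returns the number of output chars, or 0 if dest is too small.
function Utf8LikeMove(source: PWideChar; srclen: SizeInt;
                      dest: PAnsiChar; destlen: SizeInt): SizeInt;
var
  i, o: SizeInt;
  c: Word;
begin
  o := 0;
  for i := 0 to srclen - 1 do
  begin
    c := Ord(source[i]);
    if (c >= $80) and (c < $800) then
    begin
      // two-byte sequence
      if o + 2 > destlen then
        Exit(0);
      dest[o]     := AnsiChar(Byte($C0 or (c shr 6)));
      dest[o + 1] := AnsiChar(Byte($80 or (c and $3F)));
      Inc(o, 2);
    end
    else
    begin
      // one byte: ASCII as-is, anything above $7FF becomes '?'
      if o + 1 > destlen then
        Exit(0);
      if c < $80 then
        dest[o] := AnsiChar(Byte(c))
      else
        dest[o] := '?';
      Inc(o);
    end;
  end;
  Result := o;
end;

// Enlarge the output buffer until the converter reports success.
function WideToAnsi(const ws: WideString; conv: TWide2AnsiMove): AnsiString;
var
  cap, n: SizeInt;
begin
  Result := '';
  if Length(ws) = 0 then
    Exit;                      // assumption: skip the call for empty input
  cap := Length(ws);           // first guess: one output char per input char
  repeat
    SetLength(Result, cap);
    n := conv(PWideChar(ws), Length(ws), PAnsiChar(Result), cap);
    if n = 0 then
      cap := cap * 2;          // too small: double the buffer and retry
  until n <> 0;
  SetLength(Result, n);        // trim to the size actually used
end;

var
  w: WideString;
  a: AnsiString;
begin
  w := 'Gr';
  w := w + WideChar($00FC) + 'n';   // "Gr" + u-umlaut + "n"
  a := WideToAnsi(w, @Utf8LikeMove);
  WriteLn(Length(w), ' wide chars -> ', Length(a), ' ansi chars');
end.
```

In this sketch the first attempt with a 4-char buffer fails because the umlaut
needs two output chars, so the loop retries once with 8 and then trims the
result to the 5 chars actually written. Doubling the capacity is just one
possible growth policy.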




