[fpc-devel] ansistrings and widestrings
clootie at ixbt.com
Sun Jan 9 14:46:58 CET 2005
----- Original Message -----
From: "Marco van de Voort" <marcov at stack.nl>
To: "FPC developers' list" <fpc-devel at lists.freepascal.org>
Sent: Sunday, January 09, 2005 2:53 PM
Subject: Re: [fpc-devel] ansistrings and widestrings
>> This is the level where multibyte characters can come in, so that just a
>> Character can be different from any fixed-size data type, and that the
>> same Character can have multiple representations - remember your umlaut
>> example? Nonetheless the rules on the Character level at least are quite
>> well defined, so that it's possible to implement corresponding standard
>> procedures for comparison and conversion.
>> Of course these procedures
>> require parameters like the language and the encoding of the characters,
>> so that IMO exchangeable and configurable classes are the best containers
>> for characters.
> The problem with string-classes is that you lose all automatism. This
> complicates each and every operation where new strings are created from old
> ones. This is what Peter was hinting at.
So the best approach here seems to be to leave the compiler-generated code for
equality and comparison as a plain binary comparison of bytes (btw, this is what
Delphi does) and to introduce a set of string handling functions that are aware
of the language-dependent encoding.
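To illustrate the split between binary and encoding-aware comparison, here is a minimal sketch using two existing SysUtils routines: CompareStr (ordinal, byte by byte) and AnsiCompareStr (routed through whatever string manager and locale are installed). The program name and the sample strings are only for illustration, and the second result is deliberately left unstated because it depends on the platform.

```pascal
program CompareDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;

var
  a, b: AnsiString;
begin
  a := 'Zebra';
  b := 'apple';

  // Ordinal comparison: 'Z' (90) is less than 'a' (97), so the result is
  // negative regardless of language or locale.
  WriteLn('binary, a before b: ', CompareStr(a, b) < 0);

  // Locale-aware comparison: the sign of the result depends on the active
  // locale and on the installed string manager, so no output is assumed here.
  WriteLn('locale-aware result: ', AnsiCompareStr(a, b));
end.
```

The compiler-generated operators would stay in the first category, while anything that needs linguistic ordering would go through explicit calls of the second kind.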
For the current compiler implementation this means changing the declarations of
// Length parameters are numbers of CHARS, not bytes
TWide2AnsiMove=function(source:pwidechar; srclen:SizeInt; dest:pansichar;
TAnsi2WideMove=function(source:pansichar; srclen:SizeInt; dest:pwidechar;
These functions should return the actual number of characters written to the
output. Returning ZERO should indicate insufficient destination size. On Windows,
WideCharToMultiByte can return the number of characters needed in the output
buffer, but LIBICONV (http://www.gnu.org/software/libiconv/ - a library suited
for all Unixes) doesn't allow this. So the common solution (if the result of the
conversion will be stored in an AnsiString or WideString) is just to enlarge the
output buffer until TWide2AnsiMove / TAnsi2WideMove returns a non-zero value.
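The enlarge-and-retry idea could be sketched roughly as follows. Everything here is an assumption made for illustration: the type TWide2AnsiMoveSketch (in particular its trailing destlen parameter and SizeInt result), the toy converter ToyWide2Ansi, and the helper WideToAnsiGrowing are hypothetical names, not the actual RTL declarations.

```pascal
program GrowBufferDemo;
{$mode objfpc}{$H+}

type
  // Hypothetical callback shape. The destlen parameter and the SizeInt
  // result are assumptions made for this sketch only.
  TWide2AnsiMoveSketch = function(source: PWideChar; srclen: SizeInt;
    dest: PAnsiChar; destlen: SizeInt): SizeInt;

// Toy converter standing in for a real one: copies code points below 128
// and replaces everything else with '?'. Returns 0 if dest is too small.
function ToyWide2Ansi(source: PWideChar; srclen: SizeInt;
  dest: PAnsiChar; destlen: SizeInt): SizeInt;
var
  i: SizeInt;
begin
  if destlen < srclen then
    Exit(0);
  for i := 0 to srclen - 1 do
    if Ord(source[i]) < 128 then
      dest[i] := AnsiChar(Ord(source[i]))
    else
      dest[i] := '?';
  Result := srclen;
end;

// Enlarges the output buffer until the converter reports a non-zero count.
function WideToAnsiGrowing(const S: WideString;
  Move: TWide2AnsiMoveSketch): AnsiString;
var
  Cap, Written: SizeInt;
begin
  Result := '';
  if Length(S) = 0 then
    Exit; // nothing to convert; also avoids retrying forever on a 0 result
  Cap := Length(S); // first guess: one output char per input char
  repeat
    SetLength(Result, Cap);
    Written := Move(PWideChar(S), Length(S), PAnsiChar(Result), Cap);
    if Written = 0 then
      Cap := Cap * 2; // buffer too small: double it and try again
  until Written <> 0;
  SetLength(Result, Written); // trim to the number of chars actually written
end;

begin
  WriteLn(WideToAnsiGrowing('hello', @ToyWide2Ansi));
end.
```

Because zero is the only failure signal in this scheme, a converter that fails for a reason other than buffer size would make the loop retry indefinitely, so a real implementation would probably want an upper bound on Cap or a separate error code.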