[fpc-pascal] UTF-8 versions of Copy() and Length()

Rimgaudas Laucius rimga at ktl.mii.lt
Sat May 19 11:46:53 CEST 2007


----- Original Message ----- 
From: "Graeme Geldenhuys" <graemeg.lists at gmail.com>
To: "FPC-Pascal users discussions" <fpc-pascal at lists.freepascal.org>
Sent: Saturday, May 19, 2007 11:58 AM
Subject: Re: [fpc-pascal] UTF-8 versions of Copy() and Length()


> On 5/19/07, Daniël Mantione <daniel.mantione at freepascal.org> wrote:
>> > Does FPC have UTF-8 versions of the Copy() and Length() functions?
>>
>> They don't exist. FPC has been designed to either use the system encoding
>> (which can be utf8). In this case, the string routines from sysutils do
>> what you want. The other option is to use widestrings;
>> length(utf8decode(s)) will return the length of an utf-8 string.
>
> Sorry, I'm very new to Unicode support.  Wouldn't it be useful to have
> UTF-8 and UTF-16 (and all the other encodings) functions in FPC?  For
> example the Lazarus LCL (LCLProc unit) has loads of such functions.
>

You can find information on the Unicode standard at 
http://www.unicode.org/versions/Unicode4.0.0/. 
http://www.unicode.org/versions/Unicode4.0.0/ch02.pdf describes the Unicode 
encoding forms.

It is not very useful to have separate functions for both encodings, because 
the encodings are interconvertible and it is more effective to use UTF-16 
for data processing. Actually, UTF-8 is best suited to storing external 
data, because it is more compact for mostly-ASCII text. It expresses 
characters outside ASCII as sequences of two to four 8-bit code units, 
while UTF-16 expresses most of them as a single 16-bit code unit (characters 
outside the Basic Multilingual Plane need a surrogate pair of two code 
units). Thus processing of internal data (iterating, counting, etc.) is 
usually easier and more efficient in UTF-16.
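
As a rough sketch of the "convert, process, convert back" approach, building 
on the length(utf8decode(s)) suggestion quoted above: the helper names 
Utf8Length and Utf8Copy below are hypothetical and not part of the FPC RTL; 
only UTF8Decode, UTF8Encode, Copy and Length are standard routines. The 
sketch assumes the input holds valid UTF-8 and counts UTF-16 code units, so 
a character outside the Basic Multilingual Plane counts as two.

```pascal
program Utf8Demo;
{$mode objfpc}{$H+}

// Hypothetical helper (not in the RTL): length of a UTF-8 string,
// measured in UTF-16 code units after decoding.
function Utf8Length(const S: AnsiString): Integer;
begin
  Result := Length(UTF8Decode(S));
end;

// Hypothetical helper (not in the RTL): Copy() applied to the decoded
// string, with the result encoded back to UTF-8.
// Index and Count are in UTF-16 code units, not bytes.
function Utf8Copy(const S: AnsiString; Index, Count: Integer): AnsiString;
begin
  Result := UTF8Encode(Copy(UTF8Decode(S), Index, Count));
end;

var
  S: AnsiString;
begin
  // "naive" with i-diaeresis (U+00EF), written as raw UTF-8 bytes so the
  // example does not depend on the encoding of the source file.
  S := 'na' + #$C3#$AF + 've';

  WriteLn(Length(S));                  // 6: number of bytes
  WriteLn(Utf8Length(S));              // 5: number of UTF-16 code units
  WriteLn(Length(Utf8Copy(S, 1, 3)));  // 4: first three characters, in bytes
end.
```

Each call decodes the whole string, so for repeated operations it would 
probably be cheaper to decode once, keep the data as a WideString, and 
encode back to UTF-8 only when writing it out.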




> The Length function is easy to get around, but the Copy, Pos ,etc
> functions are not.
>
>
>
> -- 
> Graeme Geldenhuys
>
> General error, hit any user to continue.
> _______________________________________________
> fpc-pascal maillist  -  fpc-pascal at lists.freepascal.org
> http://lists.freepascal.org/mailman/listinfo/fpc-pascal
> 




