[fpc-devel] Unicode proceedings

Sven Barth pascaldragon at googlemail.com
Fri Nov 18 14:57:20 CET 2011


Seems like the message you quote here went to you personally as well 
(that would explain why you sent this answer to me directly first...)

So here is the original mail I wrote:

=== original mail begin ===

On 18.11.2011 10:22, Michael Schnell wrote:
 > On 11/17/2011 02:44 PM, Sven Barth wrote:
 >>
 >> One could implement a similar type for something like this (maybe even
 >> use the mentioned TBytes) and define operator overloads for it (at
 >> least for "+").
 >>
 > Why should one do this, regarding that a normal string type provides
 > exactly what very often is requested: reference counting, lazy copy,
 > dynamic length, extracting single bytes at any location, finding a
 > certain byte or sequence of bytes, concatenation, deletion of a number
 > of bytes at a given position, plus proven functionality and performance.

Because then you don't need to rely on the assumption that SizeOf(Char) = 1. 
Now imagine you have an application that uses strings as buffers and you 
port it from, let's say, Delphi 7 to Delphi 2009. Have fun finding the 
bugs if you don't remember that you used a String as a buffer.

 > OTOH some important feature that very often is necessary with byte
 > buffers is lacking with strings: To do a FiFo you need a _fast_ deletion
 > from position 1. AFAIK, here strings do a copy of the complete content.
 > To avoid this, an implementation would require handling the content in
 > chunks and managing the appropriate pointers. Providing such a type and the
 > appropriate compiler magic (for reference counting and operations like
 > '+') would be really nice. Regarding the current discussion this even
 > might be a dedicated string type (additional to the "General" and the
 > multiple "Strictly encoded" variants). Of course such a specialized FiFo
 > aware type would perform worse than an ordinary string type in most
 > other operations.

Why implement another String type to ease something strings weren't 
intended for? Implement your own type that handles all this and then you 
don't even have a problem should FPC decide to switch String from 
AnsiString to UnicodeString.

Regards,
Sven

=== original mail end ===

On 18.11.2011 14:38, Michael Schnell wrote:
> On 11/18/2011 01:37 PM, Sven Barth wrote:
>>
>> Because then you don't need to rely on the assumption that SizeOf(Char) =
>> 1. Now imagine you have an application that uses strings as buffers
>> and you port it from, let's say, Delphi 7 to Delphi 2009. Have fun finding
>> the bugs if you don't remember that you used a String as a buffer.
> I don't see what you mean here. My suggestion regarding "new string
> types" was to provide all three types RawByteString, RawWordString,
> RawDWordString, for this purpose.

It might indeed work as long as you don't do the following to move the 
received data to the buffer string (the same applies to an operation 
in the other direction). Here LengthOfData is a byte count, while 
SetLength counts characters:

SetLength(YourStringBuffer, LengthOfData);
Move(YourBinaryData^, YourStringBuffer[1], LengthOfData);
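
To illustrate the pitfall: SetLength counts elements while Move counts bytes, so with a two-byte element type the string ends up twice as large (in bytes) as the data copied into it. A minimal sketch of a variant that does not depend on SizeOf(Char), assuming a TBytes buffer as mentioned earlier in the thread (CopyToBuffer is a hypothetical helper name, not an existing RTL routine):

```pascal
program BufferCopy;
{$mode objfpc}{$H+}
uses
  SysUtils; // declares TBytes (array of Byte)

// Hypothetical helper: copies Count raw bytes into a byte buffer.
// The element size is always 1, so no Char size is involved.
procedure CopyToBuffer(Data: PByte; Count: SizeInt; out Buf: TBytes);
begin
  SetLength(Buf, Count);
  if Count > 0 then
    Move(Data^, Buf[0], Count);
end;

var
  Src: array[0..3] of Byte = (1, 2, 3, 4);
  Buf: TBytes;
begin
  CopyToBuffer(@Src[0], Length(Src), Buf);
  WriteLn('buffer length in bytes: ', Length(Buf));
end.
```

If a string type has to be used anyway, the byte count passed to Move would need to be derived from the element size of that string type rather than from its Length alone.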

>>
>> Why implement another String type to ease something strings weren't
>> intended for?
> In C it is very common to use an int for a pointer. :)

And it's also common to use the Object part in TStrings for integers, but 
that doesn't make it suitable for that task.

>> Implement your own type that handles all this and then you don't even
>> have a problem should FPC decide to switch String from AnsiString to
>> UnicodeString.
>>
> Why invent yet another type when an existing one is perfectly suitable?

The point is: it's not perfectly suitable. While things like reference 
counting and copy-on-write are nice, a type used as a buffer should 
have fast insert (and maybe delete) operations. You can easily write a 
custom type once that handles this well, while with Strings you need 
to do the SetLength magic every time you want to do this.

>
> I don't see how a user can define a type that provides reference
> counting and lazy copy. I gather that this needs compiler magic.

COM-style interfaces would be a possibility...
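
A rough sketch of what that could look like, assuming a hypothetical IByteBuffer interface and TByteBuffer class (neither exists in the RTL or FCL). The compiler manages the reference count of the interface variables; note that this gives shared ownership only, so lazy copy would still have to be written by hand:

```pascal
program RefCountedBuffer;
{$mode objfpc}{$H+}
{$interfaces com}

type
  // Hypothetical interface, for illustration only.
  IByteBuffer = interface
    procedure Append(const Data; Count: SizeInt);
    function Size: SizeInt;
  end;

  // TInterfacedObject supplies the reference counting.
  TByteBuffer = class(TInterfacedObject, IByteBuffer)
  private
    FData: array of Byte;
  public
    procedure Append(const Data; Count: SizeInt);
    function Size: SizeInt;
  end;

procedure TByteBuffer.Append(const Data; Count: SizeInt);
var
  OldLen: SizeInt;
begin
  if Count <= 0 then
    Exit;
  OldLen := Length(FData);
  SetLength(FData, OldLen + Count);
  Move(Data, FData[OldLen], Count);
end;

function TByteBuffer.Size: SizeInt;
begin
  Result := Length(FData);
end;

var
  A, B: IByteBuffer;
  Raw: array[0..2] of Byte = (10, 20, 30);
begin
  A := TByteBuffer.Create;
  A.Append(Raw, SizeOf(Raw));
  B := A;     // second reference to the same object, nothing is copied
  A := nil;   // the object stays alive because B still refers to it
  WriteLn('size seen through B: ', B.Size);
  B := nil;   // drop our last reference; no explicit Free is needed
end.
```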

>
> With this in mind I feel that (regarding the proceedings regarding the
> new string type definition) it would be a nice thing to provide
> RawByteString, RawWordString, RawDWordString that don't ever trigger a
> conversion and (at a later time) additionally consider adding FiFo String
> types that are optimized for deleting from position 0.

I see no need to introduce a special string type that has optimized FiFo 
operations. One can easily write that once by hand and maybe include 
that in the FCL, so there is no need for additional compiler magic.
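
As a minimal sketch of such a hand-written type (TByteFifo and its routines are hypothetical names, not something that exists in the FCL): a record with a dynamic byte array and two indices makes removal from the front an index update, and unread bytes are only moved when the array runs out of room at the end.

```pascal
program ByteFifoDemo;
{$mode objfpc}{$H+}

type
  // Hypothetical FIFO type, for illustration only.
  TByteFifo = record
    Data: array of Byte;
    Head: SizeInt; // index of the first unread byte
    Tail: SizeInt; // index one past the last stored byte
  end;

procedure FifoInit(out F: TByteFifo);
begin
  F.Data := nil;
  F.Head := 0;
  F.Tail := 0;
end;

procedure FifoPush(var F: TByteFifo; const Src; Count: SizeInt);
begin
  if Count <= 0 then
    Exit;
  if F.Tail + Count > Length(F.Data) then
  begin
    // Reclaim the space in front of Head before growing the array.
    if F.Head > 0 then
    begin
      if F.Tail > F.Head then
        Move(F.Data[F.Head], F.Data[0], F.Tail - F.Head);
      Dec(F.Tail, F.Head);
      F.Head := 0;
    end;
    if F.Tail + Count > Length(F.Data) then
      SetLength(F.Data, (F.Tail + Count) * 2);
  end;
  Move(Src, F.Data[F.Tail], Count);
  Inc(F.Tail, Count);
end;

// Copies up to MaxCount bytes into Dest and returns the number copied.
function FifoPop(var F: TByteFifo; out Dest; MaxCount: SizeInt): SizeInt;
begin
  Result := F.Tail - F.Head;
  if Result > MaxCount then
    Result := MaxCount;
  if Result <= 0 then
  begin
    Result := 0;
    Exit;
  end;
  Move(F.Data[F.Head], Dest, Result);
  Inc(F.Head, Result); // removing from the front moves no bytes
end;

var
  Fifo: TByteFifo;
  InBuf: array[0..3] of Byte = (1, 2, 3, 4);
  OutBuf: array[0..1] of Byte;
  N: SizeInt;
begin
  FifoInit(Fifo);
  FifoPush(Fifo, InBuf, SizeOf(InBuf));
  N := FifoPop(Fifo, OutBuf, SizeOf(OutBuf));
  WriteLn(N, ' bytes read, first = ', OutBuf[0]);
end.
```

This version has no reference counting or copy-on-write; whether those are worth adding depends on how the buffer is shared.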

Regards,
Sven


