[fpc-devel] String handling in trunk (was utf8 in 2.6.0)
Sven Barth
pascaldragon at googlemail.com
Sat Jan 5 14:58:19 CET 2013
On 05.01.2013 14:16, Michael Van Canneyt wrote:
>
>
> On Sat, 5 Jan 2013, Jonas Maebe wrote:
>
>>
>> On 05 Jan 2013, at 13:10, Michael Van Canneyt wrote:
>>
>>> On Sat, 5 Jan 2013, Jonas Maebe wrote:
>>>
>>>>
>>>> On 05 Jan 2013, at 12:53, Paul Ishenin wrote:
>>>>
>>>>> ResourceStrings are stored as AnsiString type with codepage 0 (as far
>>>>> as I remember). Delphi now stores ResourceStrings as UnicodeString type.
>>>>> I think FPC will follow this in the m_default_unicodestring modeswitch.
>>>>
>>>> It would probably even be better to always do that. At least I don't
>>>> see a downside, other than slightly larger binaries (and that's not an
>>>> issue in this case as far as I'm concerned; maintaining two separate
>>>> resourcestring systems/handlers is just not worth the trouble).
>>>
>>> But it means that for
>>>
>>> Resourcestring
>>> AString = 'Something';
>>>
>>> Var
>>> S : Ansistring;
>>>
>>> begin
>>> S:=AString;
>>> end.
>>>
>>> a conversion will always happen.
>>>
>>> I do not think this is a good idea given that currently, String =
>>> Ansistring.
>>
>> String will always be shortstring or ansistring in the syntax modes in
>> which that is currently the case. And yes, it will involve a
>> conversion in that case. Just like every single constant string
>> assignment to an ansistring in 2.6.x in case the constant string
>> contains non-ASCII characters and is part of a {$codepage xxx} file
>> (because those strings are all stored as unicodestring in the program
>> there).
>
> Judging by all the code that I have written over 14 years, not a single
> conversion would ever have been necessary.
> This system would force one on me for every single use.
>
> I do not think that the support of both ansi/unicode string resources is
> such a burden that it justifies that.
>
> I admittedly have limited knowledge of compiler internals, but I cannot
> imagine that being able to store them in 2 formats (ansi and some form
> of unicode) is more than a matter of maintaining 1 flag per string, and
> writing a word instead of a byte.
>
> All the other code, needed for conversions depending on codepage and
> whatnot settings, is necessary anyway.

You are also forgetting the code needed to translate resourcestrings at
runtime. Currently the resourcestring-related code in
rtl/objpas/objpas.pp only handles AnsiString, so it would need to be
adjusted to handle UnicodeString as well. For example, there would need
to be a "SetResourceStrings" overload that takes a UnicodeString-based
TResourceIterator.
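
To illustrate the runtime translation hook mentioned above, here is a minimal
sketch. It assumes the AnsiString-based TResourceIterator signature as I
recall it from objpas.pp (Name, Value, Hash, arg), so please check it against
your RTL version. The TUnicodeResourceIterator type is purely hypothetical: it
only shows what a UnicodeString-based counterpart might look like and is not an
existing RTL declaration. The ToGerman callback and the SGreeting string are
made up for the example.

```pascal
program rstranslate;
{$mode objfpc}{$H+}

resourcestring
  SGreeting = 'Hello';

type
  // HYPOTHETICAL: a possible UnicodeString-based iterator type. This is not
  // part of the RTL; it is declared here only to show the shape such an
  // overload's callback could take. It is not used below.
  TUnicodeResourceIterator = function(Name: AnsiString; Value: UnicodeString;
    Hash: LongInt; arg: Pointer): UnicodeString;

// Callback matching the existing AnsiString-based TResourceIterator
// (signature assumed from objpas.pp). It is called once per resourcestring
// and returns the text that should replace the default value.
function ToGerman(Name, Value: AnsiString; Hash: LongInt;
  arg: Pointer): AnsiString;
begin
  if Value = 'Hello' then
    Result := 'Hallo'
  else
    Result := Value; // no translation known: keep the default text
end;

begin
  WriteLn(SGreeting);                  // default value
  SetResourceStrings(@ToGerman, nil);  // walk all resourcestring tables
  WriteLn(SGreeting);                  // value after translation
end.
```

Every value passes through the callback as an AnsiString here, which is the
part that would need a second code path if resourcestrings could also be stored
as UnicodeString.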

Regards,
Sven