[fpc-devel] TRegistry and Unicode

Bart bartjunk64 at gmail.com
Wed Mar 6 21:21:55 CET 2019

On Wed, Mar 6, 2019 at 8:39 PM Michael Van Canneyt
<michael at freepascal.org> wrote:

> Not sure I follow you ?
> If somewhere there is a warning about conversion of unicodestring to
> ansistring (often abused as single-byte string) then this must be looked at and somehow
> fixed.
> This can mean changing the single-byte string type to UTF8String and doing a UTF8Decode/Encode.
> Needs to be checked on a case-by-case basis.

Initially I replied to you saying it (having overloads with plain string
and unicodestring) wasn't backwards compatible.
I then tried to point out to you that the past (the "back" in backwards
compatibility) was that string was a single-byte encoded type, and on
Windows each character was just one byte. Since there are only 256
possible characters this way, codepages exist, and you cannot represent
e.g. a Russian character in a Western European codepage.
The main point here is that each single byte represented a character.
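
To make the codepage limitation concrete, here is a small sketch in Python.
The codepages cp1252 (Western European) and cp1251 (Cyrillic) and the helper
name fits_codepage are assumptions chosen for illustration; they do not come
from TRegistry or from this thread.

```python
# Illustration only: cp1252 and cp1251 are example codepages picked for
# this sketch, and fits_codepage is a hypothetical helper.

def fits_codepage(text: str, codepage: str) -> bool:
    """Return True if every character of text exists in the codepage."""
    try:
        text.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


russian = "\u0416"  # CYRILLIC CAPITAL LETTER ZHE
german = "\u00e4"   # LATIN SMALL LETTER A WITH DIAERESIS

# The Russian letter has a slot in the Cyrillic codepage ...
print(fits_codepage(russian, "cp1251"))
# ... but none in the Western European one.
print(fits_codepage(russian, "cp1252"))
# In a single-byte codepage, one character occupies exactly one byte.
print(len(german.encode("cp1252")))
```

Swapping in other codepage names shows the same pattern: a character either
has one byte assigned to it in a given codepage, or it cannot be stored at all.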

Now Lazarus from the start has treated AnsiString as if it were UTF8.
For people who are not used to this, this is a really big difference.

Using UTF8String in TRegistry instead of String forces users to
consider the fact that returned strings are now always UTF-8 encoded,
even if they (probably most of them) do not need that because what
they retrieve from and put in the registry fits into their codepage.
And somebody out there will have code that checks if
ReturnedString[Index] = #$E4 ("ä" in my codepage), with the
source file in ANSI (system codepage), and now this will fail, because
that character will now be made up of 2 bytes.
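
The failure mode can be sketched at the byte level. The snippet below is
Python, not Pascal, and it assumes cp1252 as the system codepage; the helper
first_byte_is_e4 is hypothetical and only mimics a byte-wise comparison like
ReturnedString[Index] = #$E4.

```python
# Illustration only: cp1252 stands in for "my codepage", and
# first_byte_is_e4 is a hypothetical stand-in for the Pascal check.

A_UMLAUT = "\u00e4"  # LATIN SMALL LETTER A WITH DIAERESIS

ansi_bytes = A_UMLAUT.encode("cp1252")  # single-byte system codepage
utf8_bytes = A_UMLAUT.encode("utf-8")   # what a UTF8String would hold


def first_byte_is_e4(data: bytes) -> bool:
    """Compare the first byte against $E4, as the old code would."""
    return data[0] == 0xE4


print(ansi_bytes.hex(), first_byte_is_e4(ansi_bytes))
print(utf8_bytes.hex(), first_byte_is_e4(utf8_bytes))
```

With the codepage encoding the check matches on a single byte; with UTF-8 the
same character is the two-byte sequence C3 A4, so a comparison against $E4 at
that index no longer matches.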

All that led me to say that you cannot have Unicode in (classic)
1-byte encoded strings (using the default system codepage).

Using UnicodeString/String overloads means that "old style" programs
still function as they did before, and anyone who needs it can use the
UnicodeString variant directly.
And if your system codepage happened to be UTF8 in the first place, you
don't care either way.

Anyway, I am starting to repeat my previous arguments.
So either I am unable to make myself clear, or I'm just plain wrong.

