[fpc-devel] Trying to understand the wiki-Page "FPC Unicode support"
Hans-Peter Diettrich
DrDiettrich1 at aol.com
Wed Dec 3 00:52:45 CET 2014
Michael Schnell wrote:
> On 11/29/2014 07:55 AM, Jonas Maebe wrote:
>> Exactly the same goes for converting strings with code page CP_NONE to
>> a different code page: your program is broken when it tries to do that,
>
> While accessing an array beyond its bounds is not detectable at compile
> time, and is technically not detectable at runtime when range checking
> is switched off, so that *undefined* behaviour can't be avoided, the
> attempt to convert strings with code page CP_NONE to a different code
> page is easily detectable by the compiler, as we have predefined string
> variable type "brands" here. Thus, if the outcome is *defined* *to* *be*
> *undefined*, it can and should result in a compiler error message.
You forget that Jonas refers to *dynamic* string encodings, unknown at
compile time. At runtime the dynamic encoding of every string is stored
together with the string data, like the size of dynamic arrays is stored
together with the array data.
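
To illustrate the point about dynamic encodings, here is a toy model in
Python (not Pascal, and not the actual RTL layout). The class AnsiStr and
the function describe are hypothetical names invented for this sketch;
the only assumption taken from real systems is the pair of Windows code
page numbers. It shows that two values of the same declared type can
carry different encodings, which only the header stored next to the data
reveals at run time:

```python
from dataclasses import dataclass

CP_UTF8 = 65001   # Windows code page number for UTF-8
CP_1252 = 1252    # Windows code page number for Western European


@dataclass
class AnsiStr:
    """Toy model: payload bytes plus the header field kept next to them."""
    data: bytes
    code_page: int   # dynamic encoding, known only at run time

    @property
    def length(self) -> int:
        # Like the size of a dynamic array, the length travels with the data.
        return len(self.data)


def describe(s: AnsiStr) -> str:
    # The declared (static) type is identical for every argument;
    # only the stored header says how the bytes are to be interpreted.
    return f"{s.length} bytes, code page {s.code_page}"


a = AnsiStr("\u00e9".encode("utf-8"), CP_UTF8)
b = AnsiStr("\u00e9".encode("cp1252"), CP_1252)
print(describe(a))
print(describe(b))
```

Nothing in the signature of describe tells a compiler which code page
will arrive; that information exists only in the value itself.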
In Delphi *no* string can have a dynamic encoding of CP_NONE or CP_ACP,
so that nothing can be broken. In fact all CP_xxx constants are private
in System.pas; they are not available to user or library code.
SetCodePage (i.e. the RTL/OS function for casting AnsiString into
UnicodeString) replaces 0 (CP_ACP) by DefaultSystemCodePage before a
conversion, and returns an empty string for an unknown target codepage,
like $FFFF (CP_NONE).
For the curious: for the exact behaviour of SetCodePage see
MultiByteToWideChar (on Windows) and UnicodeFromLocaleChars (on POSIX),
which are ultimately used by Delphi to perform (the first step of) an
encoding conversion.
For MultiByteToWideChar see the list of allowed CP_xxx constants, as
#defined in windows.h, how they are replaced, and what shit may happen
to your strings when using them. The function returns 0 if it does not
succeed; since this result is used to determine the required buffer size
(length of the resulting string), the resulting string then is empty.
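
The behaviour described above can be sketched as a small Python model.
This is an illustration under stated assumptions, not the Delphi RTL or
the Windows API: required_length and to_unicode are hypothetical
stand-ins, the KNOWN table is invented for the example, and the value
chosen for DEFAULT_SYSTEM_CODE_PAGE is arbitrary. The length is counted
in Python characters rather than UTF-16 code units.

```python
CP_ACP = 0                        # "use the system default"
CP_NONE = 0xFFFF                  # not a valid code page
DEFAULT_SYSTEM_CODE_PAGE = 1252   # assumed value for this sketch

# Hypothetical table mapping a few code page numbers to Python codec names.
KNOWN = {1252: "cp1252", 65001: "utf-8"}


def required_length(data: bytes, code_page: int) -> int:
    """Stand-in for the sizing call: returns 0 when it does not succeed."""
    codec = KNOWN.get(code_page)
    if codec is None:
        return 0
    return len(data.decode(codec, errors="replace"))


def to_unicode(data: bytes, code_page: int) -> str:
    # Step 1: CP_ACP is replaced by the default system code page.
    if code_page == CP_ACP:
        code_page = DEFAULT_SYSTEM_CODE_PAGE
    # Step 2: the sizing result is used as the buffer length.
    n = required_length(data, code_page)
    if n == 0:
        # A zero-length buffer yields an empty string; no error is raised.
        return ""
    return data.decode(KNOWN[code_page], errors="replace")


print(repr(to_unicode(b"abc", CP_ACP)))
print(repr(to_unicode(b"abc", CP_NONE)))
```

In this model a failed conversion and a genuinely empty input are
indistinguishable to the caller, which is the silent data loss the
paragraph above warns about.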
DoDi