[fpc-pascal] Re: RE : JSON and UTF8

Wed Jul 11 22:22:58 CEST 2012

On 7/11/2012 01:36, Reinier Olislagers wrote:
> On 11-7-2012 4:19, waldo kitty wrote:
>> On 7/10/2012 07:00, Luiz Americo Pereira Camara wrote:
>>> With the old behavior, on a system with a system code page <> UTF8,
>>> if I try to show the parsed value of "\u4E01" in e.g. an LCL app,
>>> I will get garbage.
>>>
>>> I would expect it to work correctly in any environment
>>
>> this means that some environments will end up with "garbage" for those
>> UTF-8 characters that cannot be translated back to the local codepage...
>
> And your point is?

that there will be data loss when converting from UTF-8 to codepages with fewer
characters than UTF-8 has... that is what i stated, and i thought it answered
the poster's question/problem of conversion...
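
a minimal sketch of the loss i mean, in python rather than FPC so as not to
guess at the fpjson API. it parses the "\u4E01" escape from earlier in the
thread, then encodes the result both as UTF-8 and as a legacy codepage. cp1252
is only an assumed stand-in for "a system codepage that is not UTF-8", and
to_codepage is a hypothetical helper name, not anything from the JSON library:

```python
import json


def to_codepage(text, codepage):
    """Encode text to a legacy codepage; unmappable characters become '?'."""
    return text.encode(codepage, errors="replace")


# Parse the JSON string containing the escape discussed in the thread.
parsed = json.loads('"\\u4E01"')

# UTF-8 can represent every Unicode character, so this round trip is lossless.
utf8_bytes = parsed.encode("utf-8")

# cp1252 (assumed example codepage) has no slot for U+4E01, so the
# character is replaced and the original value cannot be recovered.
legacy_bytes = to_codepage(parsed, "cp1252")

print("utf-8 round trip intact:", utf8_bytes.decode("utf-8") == parsed)
print("cp1252 round trip intact:", legacy_bytes.decode("cp1252") == parsed)
```

the character survives as long as it stays in UTF-8; it is only the step into
the smaller codepage that throws it away, whoever performs that step.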

> This change moves the conversion problem from within the JSON library to
> your responsibility. Please note that these same characters in a JSON
> string would have ended up as garbage already in the old situation.

and they'll still end up as garbage anyway... especially if they do not exist in
the current codepage and there's no multi-character format to represent them...

> In the new situation, you have some more control over it. If you don't
> like the new behaviour, you can always set the conversion flag in the
> constructor to false instead of its default true... and move back to the
> earlier "conversion in a black box" method.
>
> However, in the old situation, even if you had no need to convert to the
> system codepage, you could, as you say, lose information because the
> conversion is lossy. That has been fixed (and correctly so, IMO).
> For instance, I can now get a UTF8 JSON string, and write it out to a
> UTF8 File, process it with the FPC UTF8 functions, show it in a Lazarus
> TEdit or grid, etc.

maybe we're making the same basic point to each other? ;)