[fpc-pascal] Parse unicode scalar

Martin Frb lazarus at mfriebe.de
Sun Jul 2 19:38:57 CEST 2023

On 02/07/2023 19:20, Nikolay Nikolov via fpc-pascal wrote:
> On 7/2/23 16:30, Hairy Pixels via fpc-pascal wrote:
>> I'm interested in parsing unicode scalars (I think they're called) to 
>> byte sized values but I'm not sure where to start. First thing I did 
>> was choose the unicode scalar U+1F496 (💖).
> There's no such thing as "unicode scalar" in Unicode terminology:
> https://unicode.org/glossary/
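There seems to be: that glossary has an entry for "Unicode Scalar Value"
(any code point except the surrogate code points).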

>> Next I cheated and ask ChatGPT. :) Amazingly from my question it was 
>> able to tell me the scaler is comprised of these 4 bytes:
>>   240 159 146 150

Those are the bytes of the UTF-8 encoded representation of that value.

You can look up the encodings at https://www.compart.com/en/unicode/U+0041
(replace 0041 with the hex value of whatever code point interests you)

>> The question is, how was 1F496 decomposed into 4 bytes?
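
Assuming the 4 bytes are plain UTF-8 (which matches the values above), the
decomposition works roughly like this: a code point above U+FFFF is treated
as a 21-bit number and split into groups of 3 + 6 + 6 + 6 bits. The first
byte is 11110xxx and each following byte is 10xxxxxx:

  U+1F496 = 000 011111 010010 010110
  11110 000 = $F0 = 240
  10 011111 = $9F = 159
  10 010010 = $92 = 146
  10 010110 = $96 = 150

A minimal sketch of that in FPC follows. EncodeUtf8 and TUtf8Bytes are
names made up for this example, not RTL routines, and returning an empty
array for invalid input is just a choice made here for brevity.

```pascal
program Utf8Sketch;

{$mode objfpc}{$H+}

type
  TUtf8Bytes = array of Byte;

{ Hypothetical helper, not part of the RTL: encodes one Unicode scalar
  value as 1 to 4 UTF-8 bytes. Returns an empty array for surrogates
  and for values above $10FFFF. }
function EncodeUtf8(CodePoint: Cardinal): TUtf8Bytes;
begin
  Result := nil;
  if CodePoint <= $7F then
  begin
    { 0xxxxxxx }
    SetLength(Result, 1);
    Result[0] := Byte(CodePoint);
  end
  else if CodePoint <= $7FF then
  begin
    { 110xxxxx 10xxxxxx }
    SetLength(Result, 2);
    Result[0] := Byte($C0 or (CodePoint shr 6));
    Result[1] := Byte($80 or (CodePoint and $3F));
  end
  else if CodePoint <= $FFFF then
  begin
    { Surrogates are not scalar values, so they are rejected. }
    if (CodePoint >= $D800) and (CodePoint <= $DFFF) then
      Exit;
    { 1110xxxx 10xxxxxx 10xxxxxx }
    SetLength(Result, 3);
    Result[0] := Byte($E0 or (CodePoint shr 12));
    Result[1] := Byte($80 or ((CodePoint shr 6) and $3F));
    Result[2] := Byte($80 or (CodePoint and $3F));
  end
  else if CodePoint <= $10FFFF then
  begin
    { 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx }
    SetLength(Result, 4);
    Result[0] := Byte($F0 or (CodePoint shr 18));
    Result[1] := Byte($80 or ((CodePoint shr 12) and $3F));
    Result[2] := Byte($80 or ((CodePoint shr 6) and $3F));
    Result[3] := Byte($80 or (CodePoint and $3F));
  end;
end;

var
  Bytes: TUtf8Bytes;
  I: Integer;
begin
  Bytes := EncodeUtf8($1F496);
  { Print each byte in decimal, separated by spaces. }
  for I := 0 to High(Bytes) do
    Write(Bytes[I], ' ');
  WriteLn;
end.
```

For $1F496 this should print 240 159 146 150, the same bytes quoted above.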
