[fpc-pascal] Parse unicode scalar

Tomas Hajny xhajt03 at hajny.biz
Mon Jul 3 09:04:41 CEST 2023


On 3 July 2023 8:42:05 +0200, Hairy Pixels via fpc-pascal <fpc-pascal at lists.freepascal.org> wrote:
>> On Jul 3, 2023, at 12:04 PM, Mattias Gaertner via fpc-pascal <fpc-pascal at lists.freepascal.org> wrote:
>> 
>> No, the header of a codepoint to figure out the length.
>
>So the smallest character UTF-8 can represent is 2 bytes? 1 for the header and 1 for the character?
>
>ASCII #100 is the same character in UTF-8 but it needs a header byte, so 2 bytes?

No. There is no separate header byte in this case: the "header" is simply the highest bit of that one byte being 0, so ASCII #100 is still a single byte in UTF-8.
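
To illustrate, here is a minimal sketch of how the length can be read from the high bits of the first byte. The function name Utf8SequenceLength is made up for this example and is not an RTL routine. It only inspects the lead byte and does not validate continuation bytes, overlong forms or out-of-range lead bytes such as $C0, $C1 or $F5..$F7.

```pascal
program Utf8LeadByte;

{$mode objfpc}{$H+}

{ Hypothetical helper, not part of the RTL.
  Returns the total number of bytes in the UTF-8 sequence that starts with
  Lead, or 0 if Lead cannot start a sequence (a continuation byte 10xxxxxx
  or a byte of the form 11111xxx). }
function Utf8SequenceLength(Lead: Byte): Integer;
begin
  if (Lead and $80) = $00 then
    Result := 1        { 0xxxxxxx: ASCII, the byte is the whole character }
  else if (Lead and $E0) = $C0 then
    Result := 2        { 110xxxxx }
  else if (Lead and $F0) = $E0 then
    Result := 3        { 1110xxxx }
  else if (Lead and $F8) = $F0 then
    Result := 4        { 11110xxx }
  else
    Result := 0;       { not a valid lead byte }
end;

begin
  WriteLn(Utf8SequenceLength(Ord(#100)));  { ASCII #100, 'd' }
  WriteLn(Utf8SequenceLength($C3));        { lead byte of U+00E9 (C3 A9) }
  WriteLn(Utf8SequenceLength($E2));        { lead byte of U+20AC (E2 82 AC) }
  WriteLn(Utf8SequenceLength($F0));        { lead byte of U+1F600 (F0 9F 98 80) }
end.
```

So the "header" is always part of the first byte itself; for ASCII it is just the single 0 bit, and the remaining 7 bits are the character.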

Tomas


