[fpc-pascal] Parse unicode scalar
Nikolay Nikolov
nickysn at gmail.com
Tue Jul 4 14:52:18 CEST 2023
On 7/4/23 09:12, Hairy Pixels via fpc-pascal wrote:
>
>> On Jul 4, 2023, at 12:38 PM, Nikolay Nikolov via fpc-pascal <fpc-pascal at lists.freepascal.org> wrote:
>>
>> For console apps that use the Unicode KVM video unit, I've introduced two functions for determining the display width of a Unicode string in the video unit:
>>
>> function ExtendedGraphemeClusterDisplayWidth(const EGC: UnicodeString): Integer;
>> { Returns the number of display columns needed for the given extended grapheme cluster }
>>
>> function StringDisplayWidth(const S: UnicodeString): Integer;
>> { Returns the number of display columns needed for the given string }
>>
>> Remember, the display width is different from the number of graphemes, because East Asian double-width characters take two columns each.
>>
>> And these work with UnicodeString, which is UTF-16, not UTF-8. But Free Pascal can convert between the two.
> Is there an example snippet of how all this works? It's too low level for newbies to understand. :)
Rendering Unicode to the screen is not for newbies :)
Using Unicode (where another library, like GTK, Qt or the console,
deals with the rendering) is another matter. What is it that you need to do? From
your emails I get the impression you're writing a parser for a language.
For that, you don't usually need this sort of "length". If you're making
a GUI app, e.g. with the LCL, there should be a way to determine the
display width of the text in a control. Generally, you should use your GUI or
TUI toolkit. The Unicode version of Free Vision is for fullscreen TUI
apps, like the console IDE (which does not yet support Unicode). If
that's what you want, here's a starting point:
https://wiki.freepascal.org/Free_Vision#Unicode_version
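
For the two video unit functions quoted above, a minimal sketch of a call
could look like the following. It assumes an FPC version whose video unit
already contains StringDisplayWidth (older releases may not have it); the
program name and the sample text are made up for illustration. The sample
string is written with character codes so that the encoding of the source
file does not matter.

```pascal
program DisplayWidthSketch;  { hypothetical name, illustration only }
{$mode objfpc}{$H+}

uses
  video;  { assumption: a video unit that includes the Unicode additions }

var
  S: UnicodeString;
  Utf8Text: RawByteString;
begin
  { Two CJK ideographs (U+65E5, U+672C) followed by 'abc', as UTF-16 }
  S := #$65E5#$672C'abc';

  { UTF-16 code units in the string; this is not the width on screen }
  WriteLn('UTF-16 code units: ', Length(S));

  { Columns needed on a terminal, as reported by the video unit }
  WriteLn('Display columns:   ', StringDisplayWidth(S));

  { Converting between UTF-16 and UTF-8 with the RTL helpers }
  Utf8Text := UTF8Encode(S);
  WriteLn('UTF-8 bytes:       ', Length(Utf8Text));
  if UTF8Decode(Utf8Text) = S then
    WriteLn('UTF-8 round trip preserved the text');
end.
```

The point of the sketch is only that the three numbers can differ: code
units, bytes and display columns each measure something else. It does not
initialise the video driver or draw anything.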
Nikolay