[fpc-devel] Unicode resource strings
Mark Morgan Lloyd
markMLl.fpc-devel at telemetry.co.uk
Mon Aug 20 23:00:47 CEST 2012
Hans-Peter Diettrich wrote:
> Mark Morgan Lloyd schrieb:
>
>> I've got a couple of terminal emulators using WideChar and WideString
>> for internal manipulation. What /should/ I be using? And where does that
>> leave things like Sorokin's regex unit, which similarly use WideChar
>> and WideString?
>
> Depends on which libraries you use. AFAIK SBCS RegEx works for both Ansi
> and UTF-8 strings, so a UTF-16 library is optional. For the
> terminal emulators I'd think it's sufficient to introduce an
> internal string type that allows switching between UTF-8 and UTF-16, so
> that the (different?) behaviour can be tested. If there are
> differences, this indicates that the WideString emulators *only* handle
> Unicode BMP characters, not surrogate pairs, and you have to decide
> whether this restriction is okay for you.
I think I need to clarify. The terminal emulators do not handle a standard
encoding such as UTF-8. They accept a non-standard byte sequence over, for
example, a telnet or serial connection and convert it to a particular set
of characters, emulating e.g. an IBM Selectric APL golfball.
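The byte-to-character conversion described above can be sketched as a
simple lookup table. The following is a minimal illustration in Python,
chosen only for brevity. The byte values in `APL_TABLE` and the name
`decode_terminal_bytes` are hypothetical: they are not the real Selectric
APL codes, and the actual emulators may work quite differently.

```python
# Hypothetical table: the byte values below are illustrative only and are
# NOT the real IBM Selectric APL golfball codes.
APL_TABLE = {
    0x41: "\u237A",  # APL FUNCTIONAL SYMBOL ALPHA
    0x42: "\u22A5",  # UP TACK
    0x43: "\u2229",  # INTERSECTION
    0x4C: "\u2395",  # APL FUNCTIONAL SYMBOL QUAD
}

REPLACEMENT = "\uFFFD"  # Unicode REPLACEMENT CHARACTER for unmapped bytes


def decode_terminal_bytes(data: bytes, table=APL_TABLE) -> str:
    """Map each incoming byte to one character via the lookup table."""
    return "".join(table.get(b, REPLACEMENT) for b in data)


def all_in_bmp(text: str) -> bool:
    """True if every character fits in a single 16-bit code unit."""
    return all(ord(ch) <= 0xFFFF for ch in text)
```

Every code point in this sample table lies in the BMP, so each decoded
character would fit in a single 16-bit WideChar. If a real character set
is similarly confined to the BMP, the surrogate-pair question does not
arise for the emulator's own output.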
Sorokin's regex unit is a separate issue: it concerns FPC's regexpr
package, which uses WideChar. I don't know whether this would be
problematic on Windows.
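Whether the BMP-only restriction raised in the quoted reply matters for a
given text can be checked by counting UTF-16 code units. This is a rough
sketch in Python, not a statement about how regexpr behaves; the helper
names `utf16_units` and `needs_surrogates` are made up for illustration.

```python
def utf16_units(ch: str) -> int:
    """Number of 16-bit code units needed to encode ch as UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


def needs_surrogates(text: str) -> bool:
    """True if any character lies outside the BMP (above U+FFFF)."""
    return any(ord(ch) > 0xFFFF for ch in text)
```

A character such as U+2395 needs one code unit, while U+1D11E (MUSICAL
SYMBOL G CLEF) needs two. Code that treats each 16-bit WideChar as a whole
character would see the latter as two separate items.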
--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk
[Opinions above are the author's, not those of his employers or colleagues]