[fpc-devel] Unicode resource strings

Hans-Peter Diettrich DrDiettrich1 at aol.com
Wed Aug 22 14:38:57 CEST 2012


Marco van de Voort wrote:
> In our previous episode, Hans-Peter Diettrich said:
>>> utf8/16 -> ansi are a bit more involved. (since mapping many chars to few,
>>> naive implementation requiring large lookup sets)
>> A single 256 element array can be used for both directions. In Ansi to 
>> Unicode the char value is used to index the array of Unicode values, 
>> otherwise the given Unicode value is searched in the array.
> 
> That is an option also of course, but O(n).

I'm not sure whether this is a valid argument here. With a constant n=256 
the search is bounded, which is equivalent to O(1); it may even come down 
to a single machine instruction. Effectively the array size is only 128, 
because ASCII maps 1:1 to Unicode.

P.S.: The above applies to SBCS only; MBCS require more complex solutions.
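
To illustrate the single-table idea, here is a minimal sketch in Python 
(not FPC code). ISO-8859-15 is only an example code page, and the names 
TABLE, ansi_to_unicode and unicode_to_ansi are hypothetical. The table 
is filled from Python's own codec, so no mapping data is made up here.

```python
# Sketch: one 256-entry table serves both directions for an SBCS.
# Assumption: ISO-8859-15 as the example code page.
CODEPAGE = "iso8859_15"

# TABLE[byte] holds the Unicode code point for that byte value.
TABLE = [ord(bytes([b]).decode(CODEPAGE)) for b in range(256)]


def ansi_to_unicode(byte):
    """Ansi -> Unicode: the char value indexes the table directly."""
    return TABLE[byte]


def unicode_to_ansi(code_point, fallback=ord("?")):
    """Unicode -> Ansi: search the same table for the Unicode value."""
    if code_point < 128:
        # ASCII maps 1:1, so no search is needed.
        return code_point
    # Only the upper 128 entries are scanned; the bound is a constant.
    for byte in range(128, 256):
        if TABLE[byte] == code_point:
            return byte
    # Code point has no representation in this code page.
    return fallback
```

The reverse direction is a linear scan, but over at most 128 entries, 
which is the constant bound argued for above.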

> Probably the better solution is
> what was mentioned before, have a set of ranges and smaller lookup tables for
> these ranges.
> 
> This lowers the set size at the expense of a few (constant time)
> comparisons.

See above :-)
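
For comparison, the range-based variant could look roughly like the 
following Python sketch. This is only one possible reading of "ranges 
plus smaller lookup tables": the grouping rule (max_gap) and all names 
are assumptions, and ISO-8859-15 again serves as the example code page.

```python
# Sketch: Unicode -> Ansi via a few ranges, each with a small table.
# Assumption: ISO-8859-15 as the example code page.
CODEPAGE = "iso8859_15"


def build_ranges(max_gap=16):
    """Group the non-ASCII code points into runs with small gaps.

    Returns a list of (first_code_point, table) where
    table[cp - first_code_point] is the byte value, or None for a gap.
    """
    pairs = sorted(
        (ord(bytes([b]).decode(CODEPAGE)), b) for b in range(128, 256)
    )
    ranges = []
    for cp, byte in pairs:
        if ranges:
            first, table = ranges[-1]
            last = first + len(table) - 1
            if cp - last <= max_gap:
                # Pad the gap with None, then store the byte at cp - first.
                table.extend([None] * (cp - first - len(table)))
                table.append(byte)
                continue
        ranges.append((cp, [byte]))
    return ranges


RANGES = build_ranges()


def unicode_to_ansi_ranged(code_point, fallback=ord("?")):
    """A few range comparisons, then one direct index into a small table."""
    if code_point < 128:
        return code_point
    for first, table in RANGES:
        if first <= code_point < first + len(table):
            byte = table[code_point - first]
            return fallback if byte is None else byte
    return fallback
```

Note that this needs a separate structure for the reverse direction, 
while the forward direction still uses the plain 256-entry array.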

In any case, a single lookup table for both directions reduces both the 
memory requirements and the error-prone implementation effort.

DoDi



