[fpc-devel] Memory consumed by strings
Mattias Gaertner
nc-gaertnma at netcologne.de
Sun Nov 23 13:34:29 CET 2008
On Sun, 23 Nov 2008 14:11:50 +0200
listmember <listmember at letterboxes.org> wrote:
>[...]
> > For very large projects, that should probably be done anyway at some
> > point. But even in that case, using a more memory-efficient string
> > type enables you to keep more data in memory and hence potentially
> > obtain better performance.
>
> The last time I joined a relevant discussion, I was told worrying
> about native UCS-4 string-type would be pointless simply because that
> sort of thing is really needed for word processors only.
>
> Now, I have been informed that Lazarus (and perhaps other IDEs) use
> upwards of 50 MB string space just to do one of their basic
> operations.
>
> That leaves me wondering how much do we lose performance-wise in
> endlessly decompressing UTF-8 data, instead of using, say, UCS-4
> strings.
I'm wondering what you mean by 'endlessly decompressing UTF-8
data'.
You have to make a compromise between memory, ease of use and
compatibility. There is no solution without drawbacks.
If you want to process large 8-bit text files, then UTF-8 is better.
If you want to paint glyphs, then normalized UTF-32 is better.
If you want Unicode with moderate memory overhead, reasonable ease of
use, and compiler support for some compatibility, then UTF-16 is better.
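
To put rough numbers on the memory side of that compromise, here is a
minimal sketch in Python (not FPC code). The helper name encoded_sizes
and the sample strings are made up for illustration; it only counts the
bytes of the encoded text and ignores any per-string header a real
string type would add.

```python
def encoded_sizes(text):
    """Return the number of bytes needed to store text in each encoding.

    The "-le" codec variants are used so that no byte order mark is
    added to the count.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


if __name__ == "__main__":
    # Hypothetical samples: pure ASCII source code and mixed-script text.
    samples = [
        "begin WriteLn('Hello'); end.",
        "Grüße, 世界",
    ]
    for sample in samples:
        sizes = encoded_sizes(sample)
        print(repr(sample), len(sample), "code points:", sizes)
```

For pure ASCII text such as most source code, UTF-16 takes twice and
UTF-32 four times the bytes of UTF-8. For text dominated by CJK
characters the gap between UTF-8 and UTF-16 narrows or reverses.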
Mattias