[fpc-devel] TFieldDef.Size vs TField.Size

Thu Feb 24 11:32:35 CET 2011

On Thursday 24 February 2011 11:05:50 michael.vancanneyt at wisa.be wrote:
> On Thu, 24 Feb 2011, Martin Schreiber wrote:
> > On Thursday 24 February 2011 10:16:43 michael.vancanneyt at wisa.be wrote:
> >>>> But here you implicitly assume that you have a fixed number of bytes
> >>>> per character. You should always be explicit about such things, since
> >>>> this is a non-trivial assumption.
> >>>
> >>> I don't understand.
> >>
> >> "tmsebufdataset stores string data as UnicodeString instead to use a
> >> fixed record layout."
> >>
> >> If you say "fixed record layout", this means you assume that each
> >> character uses the same amount of bytes, and that the size of the string
> >> is limited, otherwise I fail to see how you can have fixed record
> >> layout.
> >
> > I really should learn English, but I fear i am too old for the task...
> > What I meant:
> > Original tbufdataset uses a fixed record layout where for every
> > stringfield memory is reserved to hold the maximum possible amount of
> > bytes of the field. MSEgui tmsebufdataset has a fixed record layout too
> > but stores a UnicodeString which actually is a pointer and uses
> > sizeof(pointer) memory space in record layout.
>
> The only drawback from this system is memory fragmentation for all the
> strings. This is the advantage of the TBufDataset.
>
> In one of the forms of our application, the users load up to 200.000
> records. each containing at minimum 3-4 strings (don't ask why..., they
> just do) with your system, much more memory would be allocated.
>
tmsebufdataset has been tested with more than 1 million records and is still
usable. Today varchar fields with large maximum lengths (>32000 characters)
are common, and they waste a lot of space in the FPC record layout. Postgres
and SQLite3 users often use unlimited "TEXT" fields, which must be handled as
blobs. Because MSEgui datasets store strings as UnicodeString, MSEgui can
handle such unlimited-length text fields as normal text fields.
I must say, though, that handling UnicodeString in datasets is horribly
complicated; implementing tmsebufdataset gave me many gray hairs and
headaches...
Also, the UnicodeString memory overhead is huge, especially on 64 bit, so I
don't like the plans to add even more fields to the string header for the new
encoding-aware string type.
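
To make the layout difference concrete, here is a minimal sketch. The record
types, the field names and the 32000 byte length are invented for the
illustration; they are not the actual buffer layouts of tbufdataset or
tmsebufdataset.

```pascal
program reclayout;
{$mode objfpc}{$H+}

const
  // Hypothetical declared maximum length of a varchar field, in bytes.
  MaxNameBytes = 32000;

type
  // Assumed "reserve the maximum" layout: every record carries the full
  // buffer, even if the stored string is only a few characters long.
  TInlineRec = packed record
    Id: LongInt;
    Name: array[0..MaxNameBytes - 1] of AnsiChar;
  end;

  // Assumed "store a string reference" layout: the record holds only a
  // pointer-sized slot, the character data lives on the heap.
  TRefRec = packed record
    Id: LongInt;
    Name: UnicodeString;
  end;

var
  r: TRefRec;
begin
  WriteLn('inline buffer record: ', SizeOf(TInlineRec), ' bytes');
  WriteLn('string reference record: ', SizeOf(TRefRec), ' bytes');
  WriteLn('pointer size: ', SizeOf(Pointer), ' bytes');

  // The slot size does not depend on the length of the stored text.
  r.Id := 1;
  r.Name := 'short text';
  WriteLn('record size after assignment: ', SizeOf(r), ' bytes');
end.
```

The sketch shows only the per-record slot size. It does not measure the heap
block that each non-empty UnicodeString allocates (string header plus
character data), which is where the fragmentation and the 64 bit overhead
discussed above come from.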

Martin