[fpc-devel] TAVLTree(avl_tree.pp) thread safety, fcl-xml(DOM) is also concerned.

Sergei Gorelkin sergei_gorelkin at mail.ru
Thu Aug 7 14:14:12 CEST 2008


Mattias Gärtner wrote:
> Zitat von Burkhard Carstens <fpc at bcsoft.de>:
> 
>> Am Mittwoch, 6. August 2008 21:37 schrieb Sergei Gorelkin:
>>> Since this topic is touched, I would like to vote for removing
>>> avl_tree dependency from the DOM altogether. The reason is that the
>>> avl tree of child nodes is useless for any real-world XML data
>>> because it does not handle duplicate names. The tree is useful only
>>> for the one particular case of XML configs where each node name is
>>> unique. OTOH, it causes considerable performance penalties (each
>>> insert results in two searches, first one for checking the existence,
>>> second one for insert itself; each search is a number of calls to
>>> Node.GetNodeName, which, in case of Windows, is a Widestring copy,
>>> etc.)
> 
> The widestrings version is slow under linux too. Last time I tested (a few
> months ago) it was about 2 to 10 times slower than the ansistring version.
> Mainly because of the parser.
> 
This will change soon. I've almost completed the improved parser which 
is very competitive with ansistring one (about 3 times faster than 
current implementation, twice faster than MSXML parser).

> 
>>> Therefore I'd suggest to remove the avltree from DOM.
>> This would be great to have in 2.2.2.
>>
>>> In order to
>>> keep configs at the reasonable speed, it is possible to implement
>>> indexing within TXMLConfig class itself, preferably hashmap-based.
>> This would be a performance tweak which (from my POV) does not need to
>> be in 2.2.2, although it would be nice.
> 
> The performance penalty without the trees can be quite big for things like
> txmlconfig. For example for lazarus project files it can be a hundred times
> slower without the trees.
> Note: lazarus itself uses its own ansistring version of xml parser, so this not
> a direct problem for lazarus.
> 
> 
>>> Alternatively, it is possible to use Registry-like access
>>> (OpenKey/CloseKey methods) that I had already implemented in newer
>>> xmlconf.pp unit. The advantage is that the searches are done starting
>>> from the 'opened key', not from the root, so linear search isn't too
>>> slow.
>> If I understand correctly, this would change the usage of xmlconfig
>> which would require changes to any existing code using it? If so, it's
>> probably not an option for 2.2.2, maybe even not for 2.2.4 ..
> 
> I second that. The main advantage of TXMLConfig is it's easy usage.
> I guess you can gain some speed by implementing some kind of caching for paths.
> Maybe I will do that for the lazarus ansistring version. But IMO this is too
> risky for fpc 2.2.2 which is released soon.
> 
xmlconf is source-level compatible with xmlcfg, you can replace it in 
uses clause and the code should remain working (although slower). Later 
on, you can start replacing things like

s1 := cfg.readstring('foo/bar/value1');
s2 := cfg.readstring('foo/bar/value2');

with

cfg.OpenKey('foo/bar');
s1 := cfg.readstring('value1');
s2 := cfg.readstring('value2');
cfg.CloseKey;

In this simplest example, the code becomes more complicated. However, 
most real projects compose the paths from parts, and they no longer need 
to do that.

The above also means that the key caching implemented in xmlconf may be 
back-ported to xmlcfg without breaking the existing user code.

Sergei



More information about the fpc-devel mailing list