[fpc-pascal] getting XML element's value using dom.pp unit

Bisma Jayadi bisma at brawijaya.ac.id
Thu Dec 8 10:02:29 CET 2005


 > i think you are going to find that "my node value" is not the
 > NodeValue() of <anode>, but is the NodeValue() of a _child_ of
 > <anode>.

Thanks Tony... using your information, I've found what the problem is. The 
Delphi XML (DOM) component and the FPC DOM unit model an XML node differently. 
I used to be a Delphi programmer and am now trying to migrate to Lazarus/FPC. :)

Here's the problem, illustrated with the XML I'm working with.

<?xml version="1.0" encoding="UTF-8"?>
<BAIS_XML Version='1.0'>
   <RESPOND ID="BAIS">
     <RESPOND_TIME>2005.01.20 12:26:58</RESPOND_TIME>
   </RESPOND>
   <REQUEST ID="SI1">
     <REQUEST_TIME>2005.01.20 12:26:58</REQUEST_TIME>
   </REQUEST>
   <CONTENT ID="L1">
     <USER ID="simba">
       <GROUP ID="1">Operator bank</GROUP>
       <NAME>Bisma Jayadi</NAME>
       <ALIAS>simba</ALIAS>
       <PASSPORT>FB0woDvKVE4AFQW29a9E</PASSPORT>
     </USER>
     <LOCATION ID="1">
       <CODE>UPPTI.1</CODE>
       <NAME>Komputer Bisma</NAME>
     </LOCATION>
   </CONTENT>
</BAIS_XML>

To get the value of the "RESPOND_TIME" element in Delphi, I use this code...

DELPHI CODE SNIPPET #1:
n := doc.DocumentElement.ChildNodes[0].ChildNodes[0].NodeName;
v := doc.DocumentElement.ChildNodes[0].ChildNodes[0].NodeValue;

which works this way...

The DocumentElement points to the "BAIS_XML" element (root), the first 
ChildNodes[0] points to the "RESPOND" element, and the next ChildNodes[0] 
points to the "RESPOND_TIME" element. So NodeName returns 'RESPOND_TIME' (into 
the "n" var) and NodeValue returns '2005.01.20 12:26:58' (into the "v" var). 
The node's name and value are exposed at the same tree depth.

Following the way Delphi addresses XML elements, I wrote "similar" code with 
FPC's dom.pp unit...

FPC CODE SNIPPET #1:
n := doc.DocumentElement.FirstChild.FirstChild.NodeName;
v := doc.DocumentElement.FirstChild.FirstChild.NodeValue;

Here I _assumed_ that DocumentElement points to the "BAIS_XML" element (root), 
the first FirstChild points to the "RESPOND" element, and the next FirstChild 
points to the "RESPOND_TIME" element. So I expected (as in Delphi) the "n" var 
to hold 'RESPOND_TIME' and the "v" var to hold '2005.01.20 12:26:58'. Instead, 
I got an empty string in the "v" var.

After I changed the code to this...

FPC CODE SNIPPET #2:
n := doc.DocumentElement.FirstChild.FirstChild.NodeName;
v := doc.DocumentElement.FirstChild.FirstChild.FirstChild.NodeValue;

I got exactly what I wanted! :) The "n" var holds 'RESPOND_TIME' and the "v" 
var holds '2005.01.20 12:26:58'. :)

When I tried this code...

FPC CODE SNIPPET #3:
n := doc.DocumentElement.FirstChild.FirstChild.FirstChild.NodeName;
v := doc.DocumentElement.FirstChild.FirstChild.FirstChild.NodeValue;

I got the '#text' string in the "n" variable. This confirms Tony's information 
that the node value I'm looking for actually belongs to a child node (with 
"#text" as the node's name). :)
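
The element/text-node split described above can be checked with a small 
program. This is a minimal sketch, assuming FPC's DOM and XMLRead units and a 
trimmed-down copy of the XML above (the SampleXml constant and the variable 
names are illustrative; the sample is written without whitespace between tags 
so that whitespace handling does not affect the walk):

```pascal
program TextNodeDemo;
{$mode objfpc}{$H+}

uses
  Classes, DOM, XMLRead;

const
  // Illustrative, shortened copy of the document from this mail.
  SampleXml =
    '<BAIS_XML Version="1.0"><RESPOND ID="BAIS">' +
    '<RESPOND_TIME>2005.01.20 12:26:58</RESPOND_TIME>' +
    '</RESPOND></BAIS_XML>';

var
  doc: TXMLDocument;
  src: TStringStream;
  elem, child: TDOMNode;
begin
  src := TStringStream.Create(SampleXml);
  try
    ReadXMLFile(doc, src);
    try
      // BAIS_XML -> RESPOND -> RESPOND_TIME
      elem := doc.DocumentElement.FirstChild.FirstChild;
      // The character data lives one level deeper, in a text node.
      child := elem.FirstChild;

      WriteLn('element name: ', elem.NodeName);
      WriteLn('element is ELEMENT_NODE: ', elem.NodeType = ELEMENT_NODE);
      WriteLn('child name: ', child.NodeName);
      WriteLn('child is TEXT_NODE: ', child.NodeType = TEXT_NODE);
      WriteLn('child value: ', child.NodeValue);
    finally
      doc.Free;
    end;
  finally
    src.Free;
  end;
end.
```

Checking NodeType against the ELEMENT_NODE and TEXT_NODE constants from the DOM 
unit is a more robust test than comparing NodeName with '#text'.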

Interestingly... the _correct_ FPC code (#2) is also accepted and works well in 
Delphi! And FPC code #3, run in Delphi, also leaves var "n" filled with 
'#text'. :)

As an ex-Delphi programmer, I find the FPC dom.pp unit's behavior pretty weird. 
Still, this experience taught me a new lesson about XML. Thank you, Tony. :)

But then I question the _correct_ DOM framework itself. Why does an element 
need a "#text" node that is obviously invisible in the document? Under this 
framework it is logically possible for a single element to have multiple 
values/children. Say the first child is "#text" and the second is "#image". 
But how would we write (and distinguish) them in the file when the "#text" 
node (and the "#image") is invisible? To me, the "#text" node is useless and 
confusing!
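
On the question of how several children are written in the file: in the W3C 
DOM, the character data between tags becomes "#text" nodes, while child 
elements are written with their own tags, so the two stay distinguishable. A 
minimal sketch, assuming FPC's DOM and XMLRead units; the NOTE/NAME sample is 
made up for illustration and is not part of the document above:

```pascal
program MixedContentDemo;
{$mode objfpc}{$H+}

uses
  Classes, DOM, XMLRead;

const
  // Hypothetical mixed content: text, a child element, then more text.
  SampleXml = '<NOTE>Call <NAME>Bisma</NAME> today</NOTE>';

var
  doc: TXMLDocument;
  src: TStringStream;
  node: TDOMNode;
begin
  src := TStringStream.Create(SampleXml);
  try
    ReadXMLFile(doc, src);
    try
      // Walk the direct children of NOTE in document order.
      node := doc.DocumentElement.FirstChild;
      while node <> nil do
      begin
        if node.NodeType = TEXT_NODE then
          WriteLn('text node: "', node.NodeValue, '"')
        else
          WriteLn('element:   <', node.NodeName, '>');
        node := node.NextSibling;
      end;
    finally
      doc.Free;
    end;
  finally
    src.Free;
  end;
end.
```

The loop should report three children of NOTE: a text node, the NAME element, 
and a second text node. Without separate text nodes there would be no way to 
say where each piece of text sits relative to the NAME element.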

Interestingly, the Delphi DOM component avoids the confusion by letting you 
access a node's value at the same depth as its name. It simply maps the XML 
document structure onto an equivalent tree hierarchy based on the _visible_ 
text. Delphi keeps the _invisible_ nodes invisible, but still accessible. No 
matter which "framework" you use, the _correct_ one or the visible document 
tree hierarchy, the Delphi DOM works as you expect. IMO, the Delphi way is a 
lot simpler, clearer, and more straightforward. :)
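
The same-depth convenience can be approximated in FPC with a small wrapper. 
This is only a sketch under stated assumptions: ElementText is a hypothetical 
helper, not part of dom.pp, and it deliberately handles only the simple case 
where the element's first child is a text node:

```pascal
program ElementTextDemo;
{$mode objfpc}{$H+}

uses
  Classes, DOM, XMLRead;

const
  // Illustrative, shortened copy of the document from this mail.
  SampleXml =
    '<BAIS_XML Version="1.0"><RESPOND ID="BAIS">' +
    '<RESPOND_TIME>2005.01.20 12:26:58</RESPOND_TIME>' +
    '</RESPOND></BAIS_XML>';

// Hypothetical helper: returns the value of the first child when that
// child is a text node, and an empty string in every other case.
function ElementText(Node: TDOMNode): DOMString;
begin
  Result := '';
  if (Node <> nil) and (Node.FirstChild <> nil) and
     (Node.FirstChild.NodeType = TEXT_NODE) then
    Result := Node.FirstChild.NodeValue;
end;

var
  doc: TXMLDocument;
  src: TStringStream;
  respond: TDOMNode;
begin
  src := TStringStream.Create(SampleXml);
  try
    ReadXMLFile(doc, src);
    try
      respond := doc.DocumentElement.FirstChild;
      // RESPOND_TIME holds text, so the helper returns it directly.
      WriteLn('RESPOND_TIME: ', ElementText(respond.FirstChild));
      // RESPOND holds only an element, so the helper returns ''.
      WriteLn('RESPOND is empty: ', ElementText(respond) = '');
    finally
      doc.Free;
    end;
  finally
    src.Free;
  end;
end.
```

Recent FPC releases also expose TDOMNode.TextContent, which concatenates the 
text of all descendants; whether it is available depends on the FPC version in 
use.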

-Bee-

has Bee.ography at
http://beeography.blogsome.com

PS: Sorry for the long email... I'm really excited about this topic because I 
also had lots of XML experience in Delphi, errr... using the Delphi "way". :)


