[fpc-pascal] XML read

Sat Jun 11 10:52:05 CEST 2011

On Fri, 10 Jun 2011, Marcos Douglas wrote:

> Hi,
> What is the simplest way to get all the text in each paragraph of the
> XML below, using fcl-xml?
> For example:
>
> Page 1
> -------------------
> Para 1
> -1524
> -888
> -14/06/06
> Para 2
> -TEXT 2
>
> Page 2
> -------------------
> Para 1
> -TEXT page 2
>
> I need each paragraph in a TStringList.

There is no "simple" way. You must walk the nodes and append the text of all <Text> nodes.
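
A minimal sketch of such a walk, assuming the fcl-xml units DOM and XMLRead. The file name document.xml and the procedure names CollectText and Walk are hypothetical; only the element names come from the XML quoted in this message. Each <Para> is collected into its own TStringList and printed in the layout asked for.

```pascal
program ParaWalk;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, XMLRead;

// Append the content of every <Text> element found below Node to Lines.
procedure CollectText(Node: TDOMNode; Lines: TStrings);
var
  Child: TDOMNode;
begin
  Child := Node.FirstChild;
  while Assigned(Child) do
  begin
    if Child.NodeName = 'Text' then
      Lines.Add(UTF8Encode(Child.TextContent))
    else
      CollectText(Child, Lines);
    Child := Child.NextSibling;
  end;
end;

// Walk the tree in document order. ParaNo restarts at every <Page>.
procedure Walk(Node: TDOMNode; var ParaNo: Integer);
var
  Child: TDOMNode;
  Para: TStringList;
  i: Integer;
begin
  Child := Node.FirstChild;
  while Assigned(Child) do
  begin
    if Child.NodeName = 'Para' then
    begin
      Inc(ParaNo);
      Para := TStringList.Create;
      try
        CollectText(Child, Para);   // one TStringList per paragraph
        WriteLn('Para ', ParaNo);
        for i := 0 to Para.Count - 1 do
          WriteLn('-', Para[i]);
      finally
        Para.Free;
      end;
    end
    else
    begin
      if (Child is TDOMElement) and (Child.NodeName = 'Page') then
      begin
        ParaNo := 0;
        WriteLn('Page ', UTF8Encode(TDOMElement(Child).GetAttribute('number')));
      end;
      Walk(Child, ParaNo);          // descend into any other node
    end;
    Child := Child.NextSibling;
  end;
end;

var
  Doc: TXMLDocument;
  ParaNo: Integer;
begin
  ReadXMLFile(Doc, 'document.xml'); // hypothetical file name
  try
    ParaNo := 0;
    Walk(Doc.DocumentElement, ParaNo);
  finally
    Doc.Free;
  end;
end.
```

The sketch frees each list after printing it; to keep the paragraphs, the lists could instead be stored in a container owned by the caller.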

Or you can try XPath; this should give you all the <Text> nodes, but I am not
familiar with the exact XPath syntax.
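
A sketch of the XPath route, assuming the XPath unit from fcl-xml and its EvaluateXPathExpression function. The expressions '//Para' and 'Line/Text' are my reading of the quoted XML, and document.xml and ParaToLines are hypothetical names, so treat this as a starting point rather than a tested solution.

```pascal
program ParaXPath;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, DOM, XMLRead, XPath;

// Fill Lines with the content of each <Line>/<Text> below one <Para> node.
procedure ParaToLines(Para: TDOMNode; Lines: TStrings);
var
  Res: TXPathVariable;
  Nodes: TNodeSet;
  i: Integer;
begin
  // Relative path, evaluated with the paragraph as context node.
  Res := EvaluateXPathExpression('Line/Text', Para);
  try
    Nodes := Res.AsNodeSet;         // owned by Res, do not free separately
    if Assigned(Nodes) then
      for i := 0 to Nodes.Count - 1 do
        Lines.Add(UTF8Encode(TDOMNode(Nodes[i]).TextContent));
  finally
    Res.Free;
  end;
end;

var
  Doc: TXMLDocument;
  Paras: TXPathVariable;
  Nodes: TNodeSet;
  Lines: TStringList;
  i, j: Integer;
begin
  ReadXMLFile(Doc, 'document.xml'); // hypothetical file name
  try
    // Select every <Para> element in the document.
    Paras := EvaluateXPathExpression('//Para', Doc.DocumentElement);
    try
      Nodes := Paras.AsNodeSet;
      if Assigned(Nodes) then
        for i := 0 to Nodes.Count - 1 do
        begin
          Lines := TStringList.Create;
          try
            ParaToLines(TDOMNode(Nodes[i]), Lines);
            WriteLn('Para');
            for j := 0 to Lines.Count - 1 do
              WriteLn('-', Lines[j]);
          finally
            Lines.Free;
          end;
        end;
    finally
      Paras.Free;
    end;
  finally
    Doc.Free;
  end;
end.
```

This version does not group paragraphs by page; a first query for the <Page> elements, followed by a relative 'Content/Para' query on each page, would presumably add that grouping.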

Michael.
>
> <?xml version="1.0" encoding="UTF-8"?>
> <Document filename="127.pdf" pageCount="7" filesize="141859"
> linearized="false" pdfVersion="1.5">
> <Pages>
> <Page number="1" width="612" height="1008">
> <Options>granularity=line</Options>
> <Content granularity="line" dehyphenation="false" dropcap="false"
> font="false" geometry="false" shadow="false" sub="false" sup="false">
> <Para>
> <Line>
>  <Text>1524</Text>
> </Line>
> <Line>
>  <Text>888</Text>
> </Line>
> <Line>
>  <Text>14/06/06</Text>
> </Line>
> </Para>
> <Para>
> <Line>
>  <Text>TEXT 2</Text>
> </Line>
> </Para>
> </Content>
> </Page>
> <Page number="2" width="612" height="1008">
> <Options>granularity=line</Options>
> <Content granularity="line" dehyphenation="false" dropcap="false"
> font="false" geometry="false" shadow="false" sub="false" sup="false">
> <Para>
> <Line>
>  <Text>TEXT page 2</Text>
> </Line>
> </Para>
> </Content>
> </Page>
> </Pages>
> </Document>
>
>
> Thanks,
> Marcos Douglas