[fpc-pascal] How to split file of whitespace separated numbers?

Bo Berglund bo.berglund at gmail.com
Sat Dec 31 08:48:26 CET 2016


On Fri, 23 Dec 2016 11:53:58 +0000, Graeme Geldenhuys
<mailinglists at geldenhuys.co.uk> wrote:

>That problem is perfectly suited for regular expressions. And a rather
>simple one at that. The FPC's FCL packages include a regex unit too
>which should suit your needs.

I was away over Xmas, so I have not seen all this regexp discussion of
my problem until now.

In the past I have come across solutions using regular expressions,
for example in shell scripts or similar. In most cases I could see
that they worked, but I had a hard time understanding *how* they
worked. The syntax is too dense for me.

The actual problem was that a data processing program, which I did not
write myself, took an extremely long time just to load the input
data file, so I was looking for alternative ways to read that data in.

The program was written originally using RAD Studio 2007 by someone
else and I "ported" it to RAD Studio XE5 a few years ago. But I did
not get into the working code, just making the transfer to Unicode and
updating the GUI. All processing code was untouched (except for
changing string to ansistring where needed).
It is a very math intensive processing package and I am no
mathematician...

Anyway, the original author was a scientist rather than a programmer,
so things like file I/O were not optimal. This shows up when reading
the actual data files, which can be hundreds of Mbytes in size.
In his code it takes minutes to do!
And this was the cause of my original question. Since it seemed rather
general in nature I posted both here and in the Embarcadero forum...

Now I am down to just seconds using the ReadLn +
StringList.DelimitedText way to parse the data.
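
For anyone finding this thread later, here is a minimal sketch of the
ReadLn + DelimitedText approach in Free Pascal. It is not the code
from the actual program. The names ParseLine and TDoubleArray are made
up for illustration, and it assumes the numbers use '.' as the decimal
separator and that each line holds only whitespace separated numbers.

```pascal
program SplitNumbersDemo;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

type
  TDoubleArray = array of Double;   // hypothetical helper type

// Splits one line on whitespace and converts each field to a Double.
// Fields is passed in so the same TStringList is reused for every line.
function ParseLine(const Line: string; Fields: TStringList;
  const Fmt: TFormatSettings): TDoubleArray;
var
  i, n: Integer;
begin
  Result := nil;
  Fields.DelimitedText := Line;
  SetLength(Result, Fields.Count);
  n := 0;
  for i := 0 to Fields.Count - 1 do
    if Fields[i] <> '' then          // defensively skip empty fields
    begin
      // Raises EConvertError if a field is not a valid number.
      Result[n] := StrToFloat(Fields[i], Fmt);
      Inc(n);
    end;
  SetLength(Result, n);
end;

var
  F: TextFile;
  Line: string;
  Fields: TStringList;
  Fmt: TFormatSettings;
  Row: TDoubleArray;
  i: Integer;
begin
  if ParamCount < 1 then
  begin
    WriteLn('Usage: splitnumbersdemo <textfile>');
    Halt(1);
  end;

  // Assumption: the data always uses '.' regardless of the system locale.
  Fmt := DefaultFormatSettings;
  Fmt.DecimalSeparator := '.';

  Fields := TStringList.Create;
  try
    Fields.Delimiter := ' ';
    Fields.StrictDelimiter := False;  // whitespace also separates fields

    AssignFile(F, ParamStr(1));
    Reset(F);
    try
      while not Eof(F) do
      begin
        ReadLn(F, Line);
        Row := ParseLine(Line, Fields, Fmt);
        // Echo the parsed values to show that the split worked.
        for i := 0 to High(Row) do
          Write(Row[i]:0:6, ' ');
        WriteLn;
      end;
    finally
      CloseFile(F);
    end;
  finally
    Fields.Free;
  end;
end.
```

Reusing one TStringList for all lines avoids creating and freeing an
object per line, which matters when the file has millions of lines.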

My goal now is to simply create a utility that transforms these files
into binary format instead and add code to load the data into dynamic
float arrays.
In my tests the conversion took a few seconds, and once the binary
files have been created, loading the binary data takes a fraction of
a second.
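
A rough sketch of what such a binary save and load could look like in
Free Pascal follows. The file layout here (a 32-bit element count
followed by raw Double values in native byte order) is only an
assumption for illustration, not the format of the actual utility, and
SaveDoubles, LoadDoubles and TDoubleArray are hypothetical names.

```pascal
program BinaryFloatsDemo;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

type
  TDoubleArray = array of Double;   // hypothetical helper type

// Writes the element count, then the raw array contents in one call.
procedure SaveDoubles(const FileName: string; const Data: TDoubleArray);
var
  S: TFileStream;
  Count: LongInt;
begin
  S := TFileStream.Create(FileName, fmCreate);
  try
    Count := Length(Data);
    S.WriteBuffer(Count, SizeOf(Count));
    if Count > 0 then
      // Note: the byte count must fit in a LongInt, so very large
      // arrays would have to be written in several chunks.
      S.WriteBuffer(Data[0], Count * SizeOf(Double));
  finally
    S.Free;
  end;
end;

// Reads the count, sizes the dynamic array, then reads all values at once.
function LoadDoubles(const FileName: string): TDoubleArray;
var
  S: TFileStream;
  Count: LongInt;
begin
  Result := nil;
  S := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    S.ReadBuffer(Count, SizeOf(Count));
    // Reject a header that does not match the remaining file size.
    if (Count < 0) or
       (Int64(Count) * SizeOf(Double) > S.Size - S.Position) then
      raise Exception.Create('Corrupt or truncated data file');
    SetLength(Result, Count);
    if Count > 0 then
      S.ReadBuffer(Result[0], Count * SizeOf(Double));
  finally
    S.Free;
  end;
end;

var
  Original, Loaded: TDoubleArray;
  TmpName: string;
  i: Integer;
begin
  // Round trip a few sample values through a temporary file.
  SetLength(Original, 3);
  Original[0] := 1.5;
  Original[1] := -2.25;
  Original[2] := 1.0e6;

  TmpName := GetTempFileName;
  SaveDoubles(TmpName, Original);
  Loaded := LoadDoubles(TmpName);
  DeleteFile(TmpName);

  for i := 0 to High(Loaded) do
    WriteLn(Loaded[i]:0:6);
end.
```

The load is fast because no text parsing is needed: the whole array is
read with a single ReadBuffer call. The price is that such a file is
not portable between machines with different byte order.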

So I am pretty much done with this problem (without resorting to
regexp usage).

Thanks anyway for pointing out an alternate path!

-- 
Bo Berglund
Developer in Sweden



