[fpc-pascal] Firebird: bulk insert performance: suggestions?

Reinier Olislagers reinierolislagers at gmail.com
Fri Sep 7 13:03:33 CEST 2012


For my Dutch postcode program https://bitbucket.org/reiniero/postcode
with an embedded Firebird 2.5 database, I allow users to read in a CSV
file with new or updated postcode data.
I use sqldb, FPC x86.
I'd like to get your suggestions on speed improvements.


I try to get the data into a temporary table as quickly as possible.
Later on, a stored procedure will normalize the data and insert
to/update various tables (with postcode, city, street information, etc).

Because I also allow querying information, I set up 2
connections+transactions: for reading and writing in my database class
constructor, and destroy them in the destructor.
However, (currently) my application controls the database and I know
that querying and bulk inserts at the same time is impossible.

The write transaction has this code:
FWriteTransaction.Params.Add('isc_tpb_concurrency');
FWriteTransaction.Params.Add('isc_tpb_write');
FWriteTransaction.Params.Add('isc_tpb_no_auto_undo'); //disable
transaction-level undo log, handy for getting max throughput when
performing a batch update

My code loads an ANSI CSV file into a csvdocument in memory (about
50meg), then goes through it, and calls an insert procedure for each
record (converting the field contents to UTF8):
FDBLayer.BulkInsertUpdateRecord(
  SysToUTF8(Postcodes.Cells[ProvinceField,LineNum]),
  SysToUTF8(Postcodes.Cells[CityField,LineNum]),
  SysToUTF8(Postcodes.Cells[PostcodeField,LineNum]),
  SysToUTF8(Postcodes.Cells[StreetField,LineNum]),
  StrToInt(Postcodes.Cells[NumberLowestField,LineNum]),
  StrToInt(Postcodes.Cells[NumberHighestField,LineNum]),
  Even,
  Latitude,
  Longitude);


Relevant snippets from the insert procedure:
  QuerySQL='INSERT INTO BULKINSERTDATA '+
'(PROVINCENAME,CITYNAME,POSTCODE,STREETNAME,LOW,HIGH,EVEN,LATITUDE,LONGITUDE)
'+
    'VALUES ( '+
':PROVINCENAME,:CITYNAME,:POSTCODE,:STREETNAME,:LOW,:HIGH,:EVEN,:LATITUDE,:LONGITUDE)';
then the transaction is started (if it is inactive) and the query
parameters are filled (using Query.Params.ParamByName, but I don't
suppose that would be a big slowdown??); finally the SQL is executed.
The transaction is left open.

Currently, after every 100 records, the transaction is committed:
if (linenum mod 100=0) then
  FDBLayer.BulkInsertCommit(false);
IIRC, advice on the Firebird list is to play with this interval; any
suggestions? Given the aggressive nature of the transaction parameters,
I might even dispense with it.

Finally, once done, the transaction is committed, and the xtored
procedure that does subsequent updates is called.

Thanks,
Reinier



More information about the fpc-pascal mailing list