[fpc-pascal] Firebird: bulk insert performance: suggestions?
michael.vancanneyt at wisa.be
michael.vancanneyt at wisa.be
Fri Sep 7 13:12:50 CEST 2012
On Fri, 7 Sep 2012, Reinier Olislagers wrote:
> For my Dutch postcode program https://bitbucket.org/reiniero/postcode
> with an embedded Firebird 2.5 database, I allow users to read in a CSV
> file with new or updated postcode data.
> I use sqldb, FPC x86.
> I'd like to get your suggestions on speed improvements.
>
>
> I try to get the data into a temporary table as quickly as possible.
> Later on, a stored procedure will normalize the data and insert
> to/update various tables (with postcode, city, street information, etc).
>
> Because I also allow querying information, I set up 2
> connections+transactions: for reading and writing in my database class
> constructor, and destroy them in the destructor.
> However, (currently) my application controls the database and I know
> that querying and bulk inserts at the same time is impossible.
>
> The write transaction has this code:
> FWriteTransaction.Params.Add('isc_tpb_concurrency');
> FWriteTransaction.Params.Add('isc_tpb_write');
> FWriteTransaction.Params.Add('isc_tpb_no_auto_undo'); //disable
> transaction-level undo log, handy for getting max throughput when
> performing a batch update
>
> My code loads an ANSI CSV file into a csvdocument in memory (about
> 50meg), then goes through it, and calls an insert procedure for each
> record (converting the field contents to UTF8):
> FDBLayer.BulkInsertUpdateRecord(
> SysToUTF8(Postcodes.Cells[ProvinceField,LineNum]),
> SysToUTF8(Postcodes.Cells[CityField,LineNum]),
> SysToUTF8(Postcodes.Cells[PostcodeField,LineNum]),
> SysToUTF8(Postcodes.Cells[StreetField,LineNum]),
> StrToInt(Postcodes.Cells[NumberLowestField,LineNum]),
> StrToInt(Postcodes.Cells[NumberHighestField,LineNum]),
> Even,
> Latitude,
> Longitude);
>
>
> Relevant snippets from the insert procedure:
> QuerySQL='INSERT INTO BULKINSERTDATA '+
> '(PROVINCENAME,CITYNAME,POSTCODE,STREETNAME,LOW,HIGH,EVEN,LATITUDE,LONGITUDE)
> '+
> 'VALUES ( '+
> ':PROVINCENAME,:CITYNAME,:POSTCODE,:STREETNAME,:LOW,:HIGH,:EVEN,:LATITUDE,:LONGITUDE)';
> then the transaction is started (if it is inactive) and the query
> parameters are filled (using Query.Params.ParamByName, but I don't
> suppose that would be a big slowdown??); finally the SQL is executed.
> The transaction is left open.
Do you prepare the query before you start the batch ?
If not, it is prepared on every insert, which is inherently slower.
>
> Currently, after every 100 records, the transaction is committed:
> if (linenum mod 100=0) then
> FDBLayer.BulkInsertCommit(false);
> IIRC, advice on the Firebird list is to play with this interval; any
> suggestions? Given the aggressive nature of the transaction parameters,
> I might even dispense with it.
I once did tests with that (600.000 records) and did not notice any influence
of the transaction control.
Michael.
More information about the fpc-pascal
mailing list