[fpc-pascal]Databases and FPC

Tue May 6 08:30:59 CEST 2003

Hello, James!

I accidentally came across your question below. If I understand correctly,
you have a relatively simple data structure, but you are afraid of a
"bottle-neck" when accessing it?

Well, I have also been programming for decades (I started with TurboPascal 3
for CP/M !), and I have worked with flat files (file of record) all that time.

Let me tell you my opinion:
---------------------------
The major questions are:
- How many records are assumed to be in the file?
- What kind of file?
- Sorted/unsorted file?

Kind of file:
-------------
The poorest thing you can do is to use a TEXT file. Doing so increases the
processing time heavily, because you have to parse each line again and
again to pick out the particular "fields". -- The only advantage is that you
can read it directly with any ASCII editor.
--> So I would use a file of record. It needs about the same space but is
    quicker to handle. The price is that you need a small "reader program"
    to look at its content.
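
As a rough sketch, a file of record and the small "reader program" could
look like the following. The record layout (TNickRec), the helper MakeRec
and the file name nicks.dat are made up for illustration; the real fields
depend on what the services need to store.

```pascal
program RecFileDemo;
{$mode objfpc}

type
  { Hypothetical layout. Only fixed-size fields, so every record
    occupies the same number of bytes in the file. }
  TNickRec = packed record
    Nick: string[30];        { short string, stored inline }
    Password: string[30];
    Registered: LongInt;     { e.g. a date as YYYYMMDD }
  end;
  TNickFile = file of TNickRec;

{ Builds a record with all unused bytes set to zero. }
function MakeRec(const ANick, APassword: ShortString; ADate: LongInt): TNickRec;
begin
  FillChar(Result, SizeOf(Result), 0);
  Result.Nick := ANick;
  Result.Password := APassword;
  Result.Registered := ADate;
end;

var
  F: TNickFile;
  R: TNickRec;
begin
  Assign(F, 'nicks.dat');            { assumed file name }

  Rewrite(F);                        { create (or truncate) the file }
  R := MakeRec('james', 'secret', 20030505);
  Write(F, R);
  R := MakeRec('rainer', 'geheim', 20030506);
  Write(F, R);
  Close(F);

  Reset(F);                          { the "reader" part: no parsing needed }
  while not Eof(F) do
  begin
    Read(F, R);
    WriteLn(R.Nick, ' registered ', R.Registered);
  end;
  Close(F);
end.
```

Each Read delivers a complete record with its fields already separated,
which is where the speed advantage over a text file comes from.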

Frequent changes to the file are another weak point of text files: you
typically have to re-write the whole file with every modification (unless
you open it as an untyped file and do _really_ everything by hand). If
multiple programs work with the file at the same time, this gets you into
trouble.
--> File of record allows positioning and changing records one by one, but you
    still have the problem of locking...
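
Changing one record in place could be sketched like this. TNickRec,
UpdateDate, PutRec and the file name are hypothetical names, and the
sketch deliberately ignores locking, so it is only safe while a single
program touches the file.

```pascal
program SeekDemo;
{$mode objfpc}

type
  TNickRec = packed record           { hypothetical layout }
    Nick: string[30];
    Registered: LongInt;
  end;
  TNickFile = file of TNickRec;

{ Appends one record at the current file position. }
procedure PutRec(var F: TNickFile; const ANick: ShortString; ADate: LongInt);
var
  R: TNickRec;
begin
  FillChar(R, SizeOf(R), 0);
  R.Nick := ANick;
  R.Registered := ADate;
  Write(F, R);
end;

{ Rewrites only the record at position Index (zero-based). }
procedure UpdateDate(var F: TNickFile; Index: Int64; NewDate: LongInt);
var
  R: TNickRec;
begin
  Seek(F, Index);
  Read(F, R);                        { Read moves the position forward... }
  R.Registered := NewDate;
  Seek(F, Index);                    { ...so step back before writing }
  Write(F, R);
end;

var
  F: TNickFile;
  R: TNickRec;
begin
  Assign(F, 'nicks_seek.dat');       { assumed file name }
  Rewrite(F);
  PutRec(F, 'alice', 20030101);
  PutRec(F, 'james', 20030505);
  PutRec(F, 'rainer', 20030506);
  Close(F);

  Reset(F);                          { default FileMode allows read and write }
  UpdateDate(F, 1, 20031231);        { touch only the second record }
  Seek(F, 1);
  Read(F, R);
  WriteLn(R.Nick, ' ', R.Registered);
  Close(F);
end.
```

Only SizeOf(TNickRec) bytes are written; the rest of the file is left
untouched.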

Sorted/unsorted:
----------------
When you use an unsorted file, you have to read it sequentially every time you
search for something. In the worst case you will have to read the whole file
every time! - On the other hand, adding a record is a simple 'append', so it
is very quick.
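
For the unsorted case, both operations might look roughly like this
(FindNickLinear, AppendRec, TNickRec and the file name are illustrative
names only):

```pascal
program AppendDemo;
{$mode objfpc}

type
  TNickRec = packed record           { hypothetical layout }
    Nick: string[30];
    Registered: LongInt;
  end;
  TNickFile = file of TNickRec;

{ Sequential search: returns the record number, or -1 if not found.
  In the worst case every record in the file is read. }
function FindNickLinear(var F: TNickFile; const Key: ShortString): Int64;
var
  R: TNickRec;
begin
  Result := -1;
  Seek(F, 0);
  while not Eof(F) do
  begin
    Read(F, R);
    if R.Nick = Key then
      Exit(FilePos(F) - 1);          { position is already one past the hit }
  end;
end;

{ Append: jump to the end and write, no other record is touched. }
procedure AppendRec(var F: TNickFile; const ANick: ShortString; ADate: LongInt);
var
  R: TNickRec;
begin
  FillChar(R, SizeOf(R), 0);
  R.Nick := ANick;
  R.Registered := ADate;
  Seek(F, FileSize(F));
  Write(F, R);
end;

var
  F: TNickFile;
begin
  Assign(F, 'nicks_unsorted.dat');   { assumed file name }
  Rewrite(F);                        { start with an empty file }
  Close(F);

  Reset(F);
  AppendRec(F, 'zed', 20030301);
  AppendRec(F, 'james', 20030505);
  AppendRec(F, 'alice', 20030101);
  WriteLn('james is record ', FindNickLinear(F, 'james'));
  WriteLn('nobody is record ', FindNickLinear(F, 'nobody'));
  Close(F);
end.
```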

Using a sorted file allows you to search extremely efficiently *), but you
cannot simply append a new record; it must be inserted at the right point,
which means moving all records behind it. This costs a lot of time/cpu power
and you will usually have to lock out all other tasks for that time. So this
will be a big bottle-neck.

*) You start searching in the middle. If the record found there is smaller
than the one you are looking for, you continue in the middle of the upper
half, otherwise in the middle of the lower half. You repeat that until you
have found what you are searching for or no records remain. For 10,000
records this takes at most 14 reads. EXTREMELY quick, but needs a sorted file.
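
The search described in *) is a binary search. A minimal sketch on a file
of record, assuming the file is sorted ascending by Nick (TNickRec,
FindNick, PutRec and the file name are hypothetical):

```pascal
program BinSearchDemo;
{$mode objfpc}

type
  TNickRec = packed record           { hypothetical layout }
    Nick: string[30];
    Registered: LongInt;
  end;
  TNickFile = file of TNickRec;

{ Binary search in a file sorted ascending by Nick.
  Returns the record number, or -1 if the key is not present. }
function FindNick(var F: TNickFile; const Key: ShortString): Int64;
var
  Lo, Hi, Mid: Int64;
  R: TNickRec;
begin
  Result := -1;
  Lo := 0;
  Hi := FileSize(F) - 1;
  while Lo <= Hi do
  begin
    Mid := Lo + (Hi - Lo) div 2;     { start in the middle }
    Seek(F, Mid);
    Read(F, R);
    if R.Nick = Key then
      Exit(Mid)
    else if R.Nick < Key then
      Lo := Mid + 1                  { found record is smaller: upper half }
    else
      Hi := Mid - 1;                 { found record is larger: lower half }
  end;
end;

procedure PutRec(var F: TNickFile; const ANick: ShortString; ADate: LongInt);
var
  R: TNickRec;
begin
  FillChar(R, SizeOf(R), 0);
  R.Nick := ANick;
  R.Registered := ADate;
  Write(F, R);
end;

var
  F: TNickFile;
begin
  Assign(F, 'nicks_sorted.dat');     { assumed file name }
  Rewrite(F);
  PutRec(F, 'alice', 20030101);      { written in sorted order on purpose }
  PutRec(F, 'bob', 20030202);
  PutRec(F, 'james', 20030505);
  PutRec(F, 'rainer', 20030506);
  PutRec(F, 'zed', 20030301);
  Close(F);

  Reset(F);
  WriteLn('james  -> ', FindNick(F, 'james'));
  WriteLn('zed    -> ', FindNick(F, 'zed'));
  WriteLn('nobody -> ', FindNick(F, 'nobody'));
  Close(F);
end.
```

Note that the comparison is a plain byte-wise string compare, so the file
must have been sorted with the same rule (for IRC nicks one would probably
want a case-insensitive one).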

My idea:
--------
If you
a.) do not have too many records (let's say a file with not more than
    256MB) and
b.) do not have to access the data from other "normal" programs at the
    same time...
... I would suggest a completely different way of handling:

Use a machine with a lot of memory; ideally the same machine where your IRC
application is running, too.
Write a "server" part which loads the whole "database file" into a dynamic
list on the heap, sorting it while loading. Then it offers some kind of
communication to your IRC applications, e.g. a Unix socket, where it accepts
queries and returns the results.

Features:
o  The data have to be loaded only ONCE. This is the most time consuming
   part.
o  You access your data in RAM. This is much quicker than any other way.
   Also, inserting a new record is extremely quick.
o  You have exactly ONE process that plays with the data file, so you have
   no problems with file/record locking and so on.
o  Modifications can be written back on some trigger. (If the data are not
   that important, you may write back to the file only from time to time,
   or only on shutdown of the program.)
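
The data-handling half of such a server could be sketched as follows. This
is only an illustration under several assumptions: a dynamic array stands
in for the "dynamic list", the names (Recs, FindPos, InsertRec, LoadAll,
SaveAll, TNickRec) and the file name are invented, and the socket part is
left out completely.

```pascal
program MemStoreDemo;
{$mode objfpc}

type
  TNickRec = packed record           { hypothetical layout }
    Nick: string[30];
    Registered: LongInt;
  end;
  TNickFile = file of TNickRec;

var
  Recs: array of TNickRec;           { whole "database", kept sorted by Nick }

{ Binary search in RAM. Returns the index of Key, or the index where
  Key would have to be inserted when Found is False. }
function FindPos(const Key: ShortString; out Found: Boolean): Integer;
var
  Lo, Hi, Mid: Integer;
begin
  Found := False;
  Lo := 0;
  Hi := High(Recs);
  while Lo <= Hi do
  begin
    Mid := Lo + (Hi - Lo) div 2;
    if Recs[Mid].Nick = Key then
    begin
      Found := True;
      Exit(Mid);
    end
    else if Recs[Mid].Nick < Key then
      Lo := Mid + 1
    else
      Hi := Mid - 1;
  end;
  Result := Lo;
end;

{ Inserts R at its sorted position, or overwrites an existing entry. }
procedure InsertRec(const R: TNickRec);
var
  P, I: Integer;
  Found: Boolean;
begin
  P := FindPos(R.Nick, Found);
  if not Found then
  begin
    SetLength(Recs, Length(Recs) + 1);
    for I := High(Recs) downto P + 1 do
      Recs[I] := Recs[I - 1];        { shift the tail up by one, in RAM }
  end;
  Recs[P] := R;
end;

{ Loads the file ONCE; the records end up sorted regardless of file order. }
procedure LoadAll(const FileName: ShortString);
var
  F: TNickFile;
  R: TNickRec;
begin
  SetLength(Recs, 0);
  Assign(F, FileName);
  Reset(F);
  while not Eof(F) do
  begin
    Read(F, R);
    InsertRec(R);
  end;
  Close(F);
end;

{ Write-back, e.g. on a timer or on shutdown. }
procedure SaveAll(const FileName: ShortString);
var
  F: TNickFile;
  I: Integer;
begin
  Assign(F, FileName);
  Rewrite(F);
  for I := 0 to High(Recs) do
    Write(F, Recs[I]);
  Close(F);
end;

var
  R: TNickRec;
  I: Integer;
begin
  FillChar(R, SizeOf(R), 0);
  R.Nick := 'zed';    R.Registered := 20030301; InsertRec(R);
  R.Nick := 'james';  R.Registered := 20030505; InsertRec(R);
  R.Nick := 'rainer'; R.Registered := 20030506; InsertRec(R);

  SaveAll('nicks_mem.dat');          { assumed file name }
  LoadAll('nicks_mem.dat');
  for I := 0 to High(Recs) do
    WriteLn(Recs[I].Nick, ' ', Recs[I].Registered);
end.
```

Inserting still shifts part of the array, but that happens in memory and
not on disk, and because only this one process owns the file, no locking
is needed around SaveAll.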

What do you think?

 Kind regards

  Ing. Rainer Hantsch

.---------------------------------------------------------------------.
|      \\|//              Ing. Rainer HANTSCH  -  Hardware + Software |
|      (o o)              Forget Windoze! -- We focus on L-I-N-U-X... |
|--oOOo-(_)-oOOo------------------------------------------------------|
| Ing. Rainer HANTSCH |  mail: office at hantsch.co.at                   |
| Khunngasse 21/20    |   www: http://www.hantsch.co.at               |
| A-1030 Vienna       |   tel: +43-1-79885380    fax: +43-1-798853818 |
| ** A u s t r i a ** | handy: +43-664-9194382   UID-Nr: ATU 11134002 |
'---------------------------------------------------------------------'

--

On Mon, 5 May 2003, James Mills wrote:
| Hi,
|
| I dunno if any of you have ever written an entire IRC Services in FPC
| before, but for the past few months I've been porting my 5 year old
| windows (delphi) version to linux.
|
| I'm wondering about databases however... The old windows version used to
| use a flat-file database to store nickname and channel registrations,
| the port I'm writing also uses the same thing. However this obviously
| will be slow if say an IRC network has 10,000 nicks/channels etc...
|
| Is there anything anyone would suggest I do ? I have only a couple of
| ideas, perhaps you might have a few more than I...
|
| cheers
| James