[fpc-pascal] best? safest? fastest?

waldo kitty wkitty42 at windstream.net
Fri Aug 8 09:09:04 CEST 2014


On 8/7/2014 4:35 AM, Mark Morgan Lloyd wrote:
> waldo kitty wrote:
>> On 8/6/2014 4:08 AM, Mark Morgan Lloyd wrote:
>>> I'd be inclined to start off using your method 1, i.e. text manipulation until
>>> the format is consistent.
>>
>> i don't understand "until the format is consistent"... the format has been in
>> use since the 60s at least (AFAIK) ;)
>
> What I mean is, while you're doing the initial processing to e.g. add century
> digits to the date and possibly to check number of decimal places etc.

oh, ok... yeah, that's all done before we reach this stage... as i posted 
previously, the format is already laid out so the string spacing is already a 
known factor... it is only in a couple of places that one might run into 
historical TLE records that are space filled instead of zero filled in the 
numeric fields... the NORAD, COSPAR and Epoch fields are the only ones where 
one might find that...
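
for illustration, the fix-up is roughly the sketch below... this is not the 
actual code from my tool, just the idea, and the column numbers in the usage 
comments are from memory of the standard TLE layout, so check them against the 
spec before trusting them (it also assumes objfpc or delphi mode for Result):

{ replace embedded spaces with zeros inside one fixed-width field of a }
{ TLE line... FirstCol and LastCol are 1-based column numbers          }
function ZeroFillField(const Line: string; FirstCol, LastCol: integer): string;
var
  i: integer;
begin
  Result := Line;
  for i := FirstCol to LastCol do
    if (i <= Length(Result)) and (Result[i] = ' ') then
      Result[i] := '0';
end;

{ assumed usage, e.g.:                                                 }
{   Line1 := ZeroFillField(Line1, 19, 32);   epoch field of line 1     }
{   Line1 := ZeroFillField(Line1,  3,  7);   NORAD catalog number      }

{ and the commonly used two-digit epoch year pivot (one of those well   }
{ known date math flaws): 57..99 -> 1957..1999, 00..56 -> 2000..2056    }
function EpochYearToFull(TwoDigit: integer): integer;
begin
  if TwoDigit >= 57 then
    Result := 1900 + TwoDigit
  else
    Result := 2000 + TwoDigit;
end;

the COSPAR field needs a little more care since the piece part is letters and 
is legitimately space padded on the right...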

>> FWIW: i have taken some time and reworked things to be math based while still
>> taking the required text format into account... i've seen a very nice increase
>> in processing speed and now need to just make sure that i don't run into any
>> of the basic and well known flaws that math processing of date strings seem to
>> have ;)
>>
>>> Flatten the original record and save it in a database, create a new flat text
>>
>> this appears that you are speaking of a sql database or similar? that may be a
>> later feature but for now, everything is/has to be done with the raw TLE files...
>
> Databases, even for plain-text records, can be incredibly useful.

agreed very much... however, in this case, even with the old convoluted string 
stuffing that was being done, processing 80000+ records takes only a minute or 
two, whereas the previous tool was limited by the old 64k memory limit, took 
over 45 minutes on the same files, and produced only 10% of the total output 
of the new tool i've written... it is very late and my math may be off, but 
output file sizes of 274K compared to 2.5Meg is 10%, right?
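
(quick sanity check: calling 2.5Meg 2560K, 274 / 2560 comes out to about 
0.107, so yes, roughly 10 or 11 percent.)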

but as previously noted, a real database is a possibility for the future... 
currently, my personal processing uses a text file containing all the latest 
TLEs that i have been able to accumulate... granted, the historical ones are not 
available in this format but then again, i don't have room for 9 million 
records like other systems ;)

>> i'm not sure what you mean by "flatten", either... currently i break down the
>> TLEs into their major records for storage in the in-memory ""database""... the
>> processing i posted is done before that storage takes place...
>
> I was thinking that the first thing you could do was convert the two lines into
> a single one for processing, but on reflection it would be better to save the
> original with as little modification as possible- possibly with any accession
> info you had (i.e. what body had provided that particular TLE).

when i move to incorporate ""real"" database capabilities, something like this 
will be done... that's understood :)  the reason i hesitate to do it now is 
the processing speed that has been achieved compared to the old tool written 
by someone else... adding database support would really slow all the 
processing down compared to what has been achieved at this point... granted, 
flat file text processing is ""slow"" and ""ugly"" but in this case, it is a 
major GoodThing<tm>  :)

>> the goal of the program is to build the in-memory database from all specified
>> TLE files and then to write out new TLE files which may be filtered on a
>> selection property so that only certain matching TLE records are saved...
>
> The problem there being that once the program stops you've then got to rebuild
> the next time.

that's not really a problem in this case... when i allow others to use the 
tool (which i still consider alpha but which may be beta or gamma level to 
others), they may choose to use it the same way that i do... or not... as 
noted above, i maintain a master ""database"" file which is loaded first, then 
the latest live data, and finally all the "just in" update files... from these 
the desired output files are built... every once in a while i output an 
updated ""master database""... out of over 40000 possible records, i think i'm 
missing only a thousand or so and those are unlikely to ever be filled due to 
their military aspect and the lack of publicly available data on their orbital 
activities... so back to your statement above, i guess i'm already maintaining 
a fairly up-to-date ""master database"" which is quickly loaded and then 
updated by the other files being processed...
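
for the curious, the merge boils down to something like the sketch below... it 
is only an illustration of the keep-the-newest-epoch idea, not my actual code, 
and the type and field names are invented for the example... it keys on the 
NORAD catalog number and uses the generic map from FPC's fgl unit:

program tlemerge_sketch;
{$mode objfpc}{$H+}

uses
  fgl;

type
  { one parsed TLE... these field names are made up for the example }
  TTle = record
    Name, Line1, Line2: string;
    Epoch: double;          { epoch already converted to a comparable number }
  end;

  { NORAD catalog number -> newest TLE seen so far }
  TTleMap = specialize TFPGMap<string, TTle>;

{ add a record, or replace the stored one if this epoch is newer }
procedure AddOrUpdate(Map: TTleMap; const CatNo: string; const Rec: TTle);
var
  idx: integer;
begin
  idx := Map.IndexOf(CatNo);
  if idx < 0 then
    Map.Add(CatNo, Rec)
  else if Rec.Epoch > Map.Data[idx].Epoch then
    Map.Data[idx] := Rec;   { newer epoch wins }
end;

var
  Map: TTleMap;
  R: TTle;
begin
  Map := TTleMap.Create;
  try
    Map.Sorted := True;     { binary search on the catalog number }
    { parse the master file, then the live data, then the "just in" files, }
    { calling AddOrUpdate for every record found in each of them...        }
    R.Name := 'TEST OBJECT'; R.Line1 := ''; R.Line2 := ''; R.Epoch := 14220.5;
    AddOrUpdate(Map, '00005', R);
    WriteLn(Map.Count, ' record(s) in memory');
  finally
    Map.Free;
  end;
end.

the real tool would then write the filtered output files from whatever is left 
in the map...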

> What I normally do when handling large bodies of tabular info is
> to either have a series of database tables or a series of text files, where
> ideally the text files are absolutely predictable (all fields a known length and
> appropriately padded).

i guess that's what i'm doing as described above ;)

> What I'm normally looking for is rate-of-change over multiple records with
> irregular timestamps, which is an awkward job however it's done.

i do have extensive logs which i regularly review... one night's processing 
generates some 45Meg or so for roughly 60 output files ;)


-- 
  NOTE: No off-list assistance is given without prior approval.
        Please *keep mailing list traffic on the list* unless
        private contact is specifically requested and granted.


