<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">2014-02-06 waldo kitty <span dir="ltr">&lt;<a href="mailto:wkitty42@windstream.net" target="_blank">wkitty42@windstream.net</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
On 2/5/2014 3:57 AM, Frederic Da Vitoria wrote:<br>
[...]<div class="im"><br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Once again I did not test this, but it seems to me that if Compare returned -1<br>
instead of 0, any duplicate would be inserted after because it would never be<br>
considered as equal to any other. But since you still want your collection to be<br>
able to choose between skipping duplicates or keeping them, the Compare<br>
modification would have to be slightly more subtle, something like:<br>
<br>
if (result = 0) and Duplicates<br>
then result := -1<br>
<br>
at the end of the Compare function.<br>
</blockquote>
<br></div>
i tried this and it kinda works... it keeps the entries in their original order but...<br>
<br>
1. the final logging of each item's "record number" (position in the collection) is -1<br>
2. it doesn't help to sort them in order by a different field<br>
<br>
i'm unsure what to do or how to handle this so that there is a secondary (sub) sorting order so that the main key is the master sort and then a secondary "key" is used when duplicates are allowed... ideally, the secondary key would retain the original order in the case that the "secondary key" is exactly the same as a previous "secondary key"... but this is also problematic...<br>
<br>
to try to clarify: sometimes there are records released with exactly the same time stamp (epoch in my code i posted) but slightly different data within the record... there is another field that might be used to differentiate those BUT the records come from numerous locations... they may or may not use this "tertiary key" and if they do, their numbering in this "tertiary key" may not be the same as any other system's count for this "tertiary key"... this is a problem i don't know how to solve as there is no coordination between locations and no "master" coordinator for this "tertiary key"... it becomes even more apparent because my flow doesn't take any certain record containers before any others... they are read and processed as they appear (OS ordering actually) which may cause "newer" records with an identical time stamp to be processed after others... in my current design, i'm using "first come, first served" meaning that the first record processed is the one that is retained... with dupes, this doesn't matter so much but it does all still come into play...</blockquote>
</div><br></div><div class="gmail_extra">Then my trick does not work for you, because it hides the fact that the records are identical. Something has to be responsible for assigning the secondary key. If I understand you correctly, there is already a field that could serve as a tertiary key, but it doesn't really work because the different sources don't fill it consistently. If I were you, I'd keep that tertiary key data (I guess it is meaningful, so you can't remove it) and generate my own secondary key inside the TSortedCollection descendant. I'd use 2 compare functions:<br>
- one that works like your current one but isn't declared as the Compare override (you could call it CheckPrimaryExists), returning 0 if the primary key already exists<br> - and one that compares both the primary and the secondary key, as Jim suggested.<br>
<br>The algorithm (when duplicates are allowed) would be something like:<br></div><div class="gmail_extra">if primary key exists<br></div><div class="gmail_extra"> then set secondary key to a new number<br></div><div class="gmail_extra">
insert the data<br><br></div><div class="gmail_extra">Note that, depending on the total number of rows, you could use one general counter for the secondary key; there is no need to look up the last secondary key used for the same primary key.</div>
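<div class="gmail_extra"><br>The steps above could be sketched roughly as follows. This is untested against your code and rests on several assumptions: TRec, Seq, NextSeq and AddRec are hypothetical names of mine, the primary key is taken to be a LongInt epoch, and a single Compare over (primary, secondary) is used. The first record for a given primary key keeps secondary key 0, so searching for (primary, 0) tells whether that primary key already exists.<br></div>

```pascal
program SecondaryKeyDemo;

uses
  Objects;

type
  PRec = ^TRec;
  TRec = record
    Epoch: LongInt;  // primary key (assumed to be the time stamp)
    Seq: LongInt;    // secondary key, generated on insert (hypothetical field)
    Data: string;
  end;

  // Hypothetical TSortedCollection descendant that assigns the secondary key.
  TRecCollection = object(TSortedCollection)
    NextSeq: LongInt;  // one general counter shared by all primary keys
    function Compare(Key1, Key2: Pointer): Sw_Integer; virtual;
    procedure FreeItem(Item: Pointer); virtual;
    procedure AddRec(AEpoch: LongInt; const AData: string);
  end;

// Orders by primary key first, then by secondary key.
function TRecCollection.Compare(Key1, Key2: Pointer): Sw_Integer;
var
  A, B: PRec;
begin
  A := PRec(Key1);
  B := PRec(Key2);
  if A^.Epoch < B^.Epoch then Compare := -1
  else if A^.Epoch > B^.Epoch then Compare := 1
  else if A^.Seq < B^.Seq then Compare := -1
  else if A^.Seq > B^.Seq then Compare := 1
  else Compare := 0;
end;

// Items are plain records, so dispose them through the typed pointer.
procedure TRecCollection.FreeItem(Item: Pointer);
begin
  if Item <> nil then
    Dispose(PRec(Item));
end;

procedure TRecCollection.AddRec(AEpoch: LongInt; const AData: string);
var
  P: PRec;
  Idx: Sw_Integer;
begin
  New(P);
  P^.Epoch := AEpoch;
  P^.Seq := 0;
  P^.Data := AData;
  // If (AEpoch, 0) is already stored, the primary key exists:
  // take the next value of the general counter as secondary key.
  if Search(P, Idx) then
  begin
    Inc(NextSeq);
    P^.Seq := NextSeq;
  end;
  // (Epoch, Seq) is now unique, so Insert keeps the record
  // even though Duplicates is left at False.
  Insert(P);
end;

var
  C: TRecCollection;
  I: Sw_Integer;
begin
  C.Init(16, 16);
  C.NextSeq := 0;
  C.AddRec(200, 'B first');
  C.AddRec(100, 'A');
  C.AddRec(200, 'B second');
  C.AddRec(300, 'C first');
  C.AddRec(300, 'C second');
  for I := 0 to C.Count - 1 do
    with PRec(C.At(I))^ do
      WriteLn(Epoch, ' / ', Seq, '  ', Data);
  C.Done;
end.
```

<div class="gmail_extra">Because the counter only ever grows, records sharing a time stamp stay in the order in which they were processed, which matches your "first come, first served" rule.<br></div>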
<div class="gmail_extra"><br></div><div class="gmail_extra">You'd get something like (primary key / secondary key)<br></div><div class="gmail_extra">A / 0<br></div><div class="gmail_extra">B / 0<br></div><div class="gmail_extra">
B / 1 (duplicate detected, first secondary key generated)<br></div><div class="gmail_extra">C / 0<br></div><div class="gmail_extra">C / 2 (duplicate detected, second secondary key generated; note that there is no C / 1, because the counter is shared by all primary keys)<br>
...<br>
<br></div><div class="gmail_extra">... and now that I think of it, you don't need 2 compare functions: the second one should work for both uses.<br></div><div class="gmail_extra"><br>-- <br>Frederic Da Vitoria<br>(davitof)<br>
<br>Member of April - "promoting and defending free software" - <a href="http://www.april.org" target="_blank">http://www.april.org</a><br>
</div></div>