[fpc-pascal] Implementing AggPas with PtcGraph
Nikolay Nikolov
nickysn at gmail.com
Thu Jun 22 01:46:50 CEST 2017
On 06/22/2017 02:42 AM, Nikolay Nikolov wrote:
>
>
> On 06/22/2017 01:21 AM, James Richters wrote:
>>> putimage can be accelerated, although it would still have to do a
>>> memory copy.
>> Like this?
>> https://github.com/Zaaphod/ptcpas/compare/Zaaphod_Custom?expand=1#diff-fb31461e009ff29fda5c35c5115978b4
>>
>>
>> This is amazingly faster. I ran a test of just ptcgraph.putimage()
>> in a loop, putting the same image over and over 1000 times and timing
>> it. The original ptcgraph.putimage() took 18.017 seconds. After I
>> applied this, the same loop took 1.056 seconds. Quite an
>> improvement! It's still nowhere near as fast as just drawing stuff
>> with ptcgraph directly, but for doing a memory copy of the entire
>> screen, it's very fast.
> Yes, that's a good start. That was exactly what I meant :)
>>
>> I have an idea on how I could speed it up even further....
>> If I set up a second array with 1 bit per pixel, then (somehow)
>> aggpas could set bits in this array to 1 whenever it changed the
>> corresponding pixel. Then, by analyzing the 'pixel changed' array one
>> word at a time (or maybe a longword or qword at a time), I could just
>> skip over all the words that = 0, and when I come across a word that
>> <> 0 I could do a binary search of that word to only change the pixels
>> that need to be changed. If very little on the screen has changed,
>> this would be quite a bit faster because the pixel changed array
>> would be 1/16 the size of the full buffer.
>>
>> The only way this would be of any benefit though is if aggpas set the
>> bits in the 'pixel changed' array while it was changing the pixels of
>> the buffer, because at that time it already has the array position
>> and the fact that something changed available. If I had to analyze
>> the buffer separately and create the 'pixels changed' array, it would
>> take too long.
> That sounds like a little bit of a special case - it'll work where
> you're using putimage for a large area, that has very few pixels set.
> Perhaps just reimplementing the general algorithm in inline asm, by
> using SSE (or MMX) vector instructions would be the fastest, but maybe
> it's not worth the pain and the pascal implementation is fast enough
> for you. Just experiment and see what works best :)
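For reference, the 'pixel changed' idea quoted above could be sketched roughly like this. It is only an illustration under assumptions: the buffer sizes, SetPixel and FlushDirty are hypothetical names, not part of aggpas or ptcgraph, and it scans each non-zero longword bit by bit rather than doing a binary search.

```pascal
program DirtyBitsDemo;
{$mode objfpc}

const
  PixelCount = 128;                 { hypothetical tiny "screen" }
  WordCount  = PixelCount div 32;   { one LongWord tracks 32 pixels }

var
  { Globals are zero-initialised, so both buffers start out black
    and no pixel is marked as changed. }
  Src, Dst: array[0..PixelCount - 1] of Word;   { 16 bits per pixel }
  Dirty: array[0..WordCount - 1] of LongWord;   { 1 bit per pixel }

{ Hypothetical hook: the renderer marks a pixel as changed at the
  moment it writes it, when the index is already known. }
procedure SetPixel(Index: Integer; Color: Word);
begin
  Src[Index] := Color;
  Dirty[Index shr 5] := Dirty[Index shr 5] or (LongWord(1) shl (Index and 31));
end;

{ Copy only the changed pixels, clear their flags and return how
  many pixels were copied. }
function FlushDirty: Integer;
var
  w, b: Integer;
  Bits: LongWord;
begin
  Result := 0;
  for w := 0 to WordCount - 1 do
  begin
    Bits := Dirty[w];
    if Bits = 0 then
      Continue;                     { 32 unchanged pixels skipped at once }
    for b := 0 to 31 do
      if (Bits and (LongWord(1) shl b)) <> 0 then
      begin
        Dst[w * 32 + b] := Src[w * 32 + b];
        Inc(Result);
      end;
    Dirty[w] := 0;
  end;
end;

begin
  SetPixel(3, $FFFF);
  SetPixel(70, $1234);
  WriteLn(FlushDirty, ' ', Dst[70]);   { prints: 2 4660 }
end.
```

Whether this beats a plain full-buffer copy depends on how sparse the changes are, so it would need timing against the accelerated putimage.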
Btw, I looked at your code again and saw a quick and cheap optimization:
move the case statement (case BitBlt of) outside the inner loop
(for i:=X to X1 do), so that the value of BitBlt is checked once per row
instead of once per pixel.
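A minimal sketch of that change, not the actual ptcgraph code: TRow, BlitRow and the Mode* constants are made-up stand-ins (the graph unit defines its own put-mode constants), and only the loop structure is the point.

```pascal
program HoistCaseDemo;
{$mode objfpc}

const
  { Hypothetical blit modes, for illustration only. }
  ModeCopy = 0;
  ModeXor  = 1;
  ModeOr   = 2;
  ModeAnd  = 3;

type
  TRow = array[0..7] of Word;

{ Combine pixels X..X1 of one source row into the destination row.
  The case statement sits outside the pixel loop, so Mode is tested
  once per row and each branch runs its own tight loop. }
procedure BlitRow(var Dst: TRow; const Src: TRow; X, X1, Mode: Integer);
var
  i: Integer;
begin
  case Mode of
    ModeXor: for i := X to X1 do Dst[i] := Dst[i] xor Src[i];
    ModeOr:  for i := X to X1 do Dst[i] := Dst[i] or Src[i];
    ModeAnd: for i := X to X1 do Dst[i] := Dst[i] and Src[i];
  else
    { ModeCopy and anything unrecognised: plain copy }
    for i := X to X1 do Dst[i] := Src[i];
  end;
end;

var
  D, S: TRow;
  k: Integer;
begin
  for k := 0 to 7 do
  begin
    D[k] := $00FF;
    S[k] := $0F0F;
  end;
  BlitRow(D, S, 2, 5, ModeXor);
  WriteLn(D[0], ' ', D[2]);   { prints: 255 4080 }
end.
```

The plain-copy branch could also be replaced by a single Move call per row, which is presumably where most of the speedup in the accelerated putimage comes from.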
Nikolay