[fpc-devel] I'll be straight

J. Gareth Moreton gareth at moreton-family.com
Sat Feb 9 11:57:05 CET 2019

 Well, at least we're having a good civil conversation about it!  As long
as we can talk and discuss things, then there's no harm, and hopefully
someone like me who is over-enthusiastic at times can still be useful!

 I suppose loc[low(loc)] isn't necessarily bad and is guaranteed to be
correct - just feels a little clumsy, that's all, plus with the almost
universal standard of starting arrays at 0, low(loc) seems a bit
superfluous (although looks neat when paired with "High", e.g. "for x :=
low(loc) to high(loc)" etc.)
 All I can promise is that the code I write tends to be completely
original.  I was exploring cpupara.pas with no knowledge of libffi, so
looking for optimisations with fresh eyes.  Admittedly I was looking there
while researching something else and just saw something and thought "that
looks a bit inefficient" and also was a little bemused at the mixed
standards (i.e. loc[0] and loc[low(loc)], and also using "fillchar" and a
for-loop to initialise an array of the same record type).

 I try to avoid hacks and produce clean code, or if I have to use something
a little strange, like calling Create as a standard method rather than as a
constructor (I did this in my patch for #34679), I make sure the comments
explain clearly what's going on.
 If I had to describe my own standard, it would be "iterative
improvement".  I start with something rough (although hopefully still
relatively clean according to a design plan) and then refactor it over time
after leaving it for a bit, or in the case of FPC, coming to the code with
no previous knowledge of it and studying how it works.  I also look for
small performance gains that, by themselves, don't amount to much, but when
several of them stack together, start to have a noticeable effect.

 My biggest submission to date, but which is still pending analysis and
approval by Florian and others, is the x86-64 optimiser overhaul (although
it also affects i386 now as well, since separating out the 32-bit code for
testing reasons just made everything incredibly untidy with $ifdef's). 
That one is a case of "if it ain't broke, don't fix it", although given my
preliminary tests show a 15% speed increase, it's a little bit deeper than
just fixing it!  The main objective with the overhaul was to reduce the
number of passes over individual assembler blocks - currently it's a
minimum of 5 (pre-peephole, pass 1, pass 1 again, pass 2 and
post-peephole), while the overhaul attempts to reduce this to 2 (it merges
pre-peephole, the two pass 1 runs, and pass 2, while adding some extra code
in the individual procedures to ensure the optimiser doesn't perform worse
than before under -O1).  A few other developers spotted problems, mostly
on Linux, but I believe those have been resolved now.
 Admittedly, the bounty that was offered a few months ago is a minor
incentive too, although I play around with FPC because I love it!

 I hope despite us being a little bit different in philosophies, we can
find common ground and both improve FPC as a team.  Thanks for your
understanding response, and I hope I wasn't too challenging or critical
over #34679.

 Gareth aka. Kit

 On Sat 09/02/19 11:32 , Jonas Maebe jonas at freepascal.org sent:
 On 09/02/19 08:33, J. Gareth Moreton wrote: 

 > In the last patch that Jonas almost immediately closed, the speed 
 > savings were inconclusive because the number of cycles saved is probably

 > only a few dozen, but I would argue it makes the code a bit more 
 > reasonable too because it replaces things like loc[low(loc)] with loc[0]

 > and fillchar with a for-loop that initialises each element of an array 
 > to zero (it's slightly faster because the element size is a multiple of 
 > 16, while fillchar is general-purpose and spends a lot of time jumping 
 > around and even performing a multiplication before it starts filling 
 > things up).  I guess seeing it marked as "won't fix" within 20 minutes 
 > was a moment of horror. 

 When it comes to the compiler, I'm indeed usually inclined to the "if it 
 ain't broken, don't fix it" school of thought. There are already plenty 
 of things that break when necessary changes are performed, so why risk 
 more breakage by minor touch-ups? As an aside, I also disagree that 
 loc[low(loc)] is unreasonable. I consider it defensive programming. That 
 said, since the convention is far from followed throughout all of that 
 code, removing it doesn't make things really worse either. 

 That said, it is true that the parameter classification code for x86-64 
 is quite complex and probably takes a lot of time. Additionally, 
 sometimes we even calculate the parameter layout for the same type two 
 times in a row (first in push_addr_param, and then again if that one 
 returned false). 

 I think a better way to tackle this would be to add a small cache 
 (possibly just an array of 10 elements that is searched linearly; a hash 
 table risks spending more time and memory/cache flushing on maintaining 
 the table than gaining from it) that keeps track of recently requested 
 parameter layouts. This would have a potentially much bigger impact, and 
 at the same time would not require fiddling with the internals of the 
 existing code. 

 > I just want to avoid working on something, building passion for it which

 > is an unfortunate character flaw of mine, and then seeing it rejected or

 > ignored when it's finished. 

 I understand this is annoying. I realise that I am rather conservative 
 in this respect. While I think I have good reasons for this stance 
 (having spent a lot of time on fixing bugs added by other people while I 
 wanted to work on my own stuff; not that other people have never been 
 blocked by bugs I added, of course), it also bothers me that I 
 discourage new contributors with this attitude. 

 > I've just hit a couple of impasses since I'm waiting on the node dump 
 > feature to be approved and the status of the optimiser overhaul (I tend 
 > to be quite good at doing peephole optimisations... at least a peephole 
 > optimisation was the very first piece of code I submitted to FPC). 

 That's also how I started on FPC in 1997 :) But yes, it definitely was 
 much easier back then to jump right in, because there were way fewer 
 barriers to getting your code included. At the same time, we paid a 
 hefty price for that for many years after that (from copyright issues to 
 cleaning out hacks). 

 fpc-devel maillist - fpc-devel at lists.freepascal.org [1] 


[1] mailto:fpc-devel at lists.freepascal.org
[2] http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freepascal.org/pipermail/fpc-devel/attachments/20190209/a09a7a4b/attachment.html>

More information about the fpc-devel mailing list