<div dir="auto"><div><div class="gmail_quote"><div dir="ltr">On Fri, 23 Nov 2018, 12:15, Adriaan van Os <<a href="mailto:fpc@microbizz.nl">fpc@microbizz.nl</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Simon Kissel wrote:<br>
<br>
> We know about a couple of bottlenecks (fpc_pushexceptaddr /<br>
> RelocateThreadVar etc) which explain FPC's terrible multi-threading<br>
> performance, but in general, FPC's code generator really is quite<br>
> a mess, which we learned the hard way a couple of years ago when we<br>
> did optimization work on the ARM target.<br>
<br>
I find the phrase "FPC's terrible multi-threading performance" unjust. When I do multi-threading <br>
with FPC, I get a near N speed improvement (on i386 and x86_64) where N is the number of cores, <br>
including hyper-threaded cores ....<br>
<br>
What about taking another approach and having a precise look at the source code? Did you profile it? What <br>
sort of work does the code do? How are the threads synchronized? What data structures are used?<br>
<br>
I don't take "the compiler is so bad" without an answer to these questions.<br></blockquote></div></div><div dir="auto"><br></div><div dir="auto">Simon wrote that the same code performs better when compiled with Kylix, so there definitely are things that FPC could do better, and, as Florian's work on TLS variables showed, such changes indeed *do* make FPC perform better. I suspect a similar improvement with DWARF exceptions, as the setjmp/longjmp-based approach *is* more expensive in the case where no exception occurs than marking protected code in the metadata, as DWARF and SEH64 do. </div><div dir="auto"><br></div><div dir="auto">Regards, </div><div dir="auto">Sven </div><div dir="auto"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
</blockquote></div></div></div>