<HTML>

<div>I think that is the fundamental difference between intrinsics and reusable, inlinable assembly language: you would use intrinsics mostly to use one or two special instructions in a convenient, concise way, while assembly language is a "give me all the power!" request.  Admittedly, though, my patch does allow the peephole optimizer to touch the assembly language in the inlined procedure, mostly to help optimize parameter and result passing.</div><div><br>

</div><div>Using assembly language always carries risk - that's pretty much what a programmer signs up for if they choose to use it.  The ability to inline particular assembler routines is a means to give as much power and trust as possible to the programmer.  If the program breaks because of what they've coded, it's on them to fix it.  Besides, when debugging, I comment out all of my "inline" directives anyway, because even regular inlined procedures cause problems with the code stepper.  The one thing I want to minimise when developing the compiler is the chance of code behaving differently depending on whether it's inlined or not.</div><div><br>

</div><div>I'll admit that I'm stubborn and sceptical about certain things, not least because I know the Free Pascal Compiler has limitations that are not easily addressed.  For a simple example, if you do "x div 10" and "x mod 10" close to each other, the compiler will write code to perform each calculation separately, rather than performing the division once (using a multiplication trick for speed) and calculating the remainder via "R := x - (Q * 10)", where Q is the result of "x div 10".  It's even worse if the divisor is a variable.  Something like the Microsoft Visual C++ compiler will spot this and combine the calculations into a single DIV instruction, using the results in EAX and EDX as appropriate, while Free Pascal will still perform them separately.  It's not easy to peephole-optimize either, because the second DIV instruction may use a different register for the divisor.<br>
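<br>
As a rough sketch of that single-division idea, written in C rather than Pascal so it can be compiled standalone: the helper below uses the hypothetical names qr10 and divmod10, assumes a 32-bit unsigned int, and uses the well-known magic constant 0xCCCCCCCD (2^35 / 10, rounded up) for the multiplication trick. The remainder is then derived from the quotient. It only illustrates the arithmetic, not what any particular compiler actually emits.<br>

```c
/* Quotient and remainder of an unsigned division by 10.
   qr10 and divmod10 are hypothetical names used for illustration. */
struct qr10 {
    unsigned int quot;
    unsigned int rem;
};

/* Assumes unsigned int is 32 bits wide and unsigned long long is 64 bits. */
static struct qr10 divmod10(unsigned int x)
{
    struct qr10 r;

    /* Q := x div 10, done as a multiply by ceil(2^35 / 10) and a shift. */
    r.quot = (unsigned int)(((unsigned long long)x * 0xCCCCCCCDULL) >> 35);

    /* R := x - (Q * 10), so no second division is needed. */
    r.rem = x - r.quot * 10u;

    return r;
}
```

Calling divmod10 once gives both values, which is the effect a compiler gets when it merges a nearby "div" and "mod" pair.<br>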

</div><div><br>

</div><div>Gareth aka. Kit<br>

<br>

P.S. This is where my work on a "deep optimizer" comes into play, as it attempts to perform data flow analysis on the used registers, although it currently only works on MOV instructions.  It works fairly well as it stands, but would work better if the step where virtual registers are converted into real ones were performed last (or right before the post-peephole optimization stage), since this step also allocates stack space that might not be needed if optimization is able to remove the use of a register completely, freeing it up for a value that would otherwise spend its time on the stack.<br>

</div> <br>

<br>

<span style="font-weight: bold;">On Mon 18/03/19 11:05, Marco van de Voort core@pascalprogramming.org sent:<br>

</span><blockquote style="BORDER-LEFT: #F5F5F5 2px solid; MARGIN-LEFT: 5px; MARGIN-RIGHT:0px; PADDING-LEFT: 5px; PADDING-RIGHT: 0px"> 

  
 <defanged_meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> 

  
 <defanged_body text="#000000" bgcolor="#FFFFFF"> 

 <p><br>

 
 </p> 

 <div class="moz-cite-prefix">On 3/18/2019 at 8:00 AM, Sven Barth via fpc-devel wrote:<br>

 
 </div> 

 <blockquote type="cite" cite="mid:CAFMUeB843H49QHuGLD7WqOPzpmyycFqXPsepqvHKZBhfGcAKUQ@mail.gmail.com"> 

 <defanged_meta http-equiv="content-type" content="text/html; charset=UTF-8"> 

 <div dir="auto"> 

 <div> 

 <div class="gmail_quote"> 

 <div dir="ltr" class="gmail_attr">J. Gareth Moreton &lt;<a href="mailto:gareth@moreton-family.com">gareth@moreton-family.com</a>&gt; wrote on Sun, 17 March 2019, 23:27:<br>

 
 </div> 

 <br>

 
 </div> 

 </div> 

 <div dir="auto">And I believe that this is the advantage of 

 intrinsics, because here the compiler *can* decide to use a 

 different register. Especially if the compiler supports 

 instruction scheduling and such. </div> 

 <div dir="auto">At work I've worked with AES-NI, and I definitely preferred the intrinsics: I didn't have to care about which registers to use, because the compiler and optimizer took care of that. <br>

 
 </div> 

 </div> 

 </defanged_meta></blockquote> 

 <p>(Well, better to double-check the output; it is not always ideal.)<br>

 
 </p> 

 <p>I've seen nice examples in the Simd library (<a class="moz-txt-link-freetext" href="http://ermig1979.github.io/Simd/index.html">http://ermig1979.github.io/Simd/index.html</a>), where they use generics to bundle intrinsics into blocks and then reuse them multiple times, e.g. three times: for the first line, the bulk, and the last line of an image.<br>

 
 </p> 

 <blockquote type="cite" cite="mid:CAFMUeB843H49QHuGLD7WqOPzpmyycFqXPsepqvHKZBhfGcAKUQ@mail.gmail.com"> 

 <div dir="auto"> 

 <div dir="auto">That is something that Pascal should stand for: 

 ease of use. Assembler is not easy to use. </div> 

 <br>

 
 </div> 

 </blockquote> 

 <p>If something is generic enough to be an intrinsic, it should be an intrinsic, and secured as much as possible.</p>

 <p>Inlinable assembler, however, puts some of that defining power in the user's hands as well. It doesn't really matter that there are border cases, as long as they can be described, since assembler is intrinsically unportable anyway. Having something like that is quite important, I think, though my examples come more from my embedded work and less from my PC work (even though I use AVX2 there; intrinsics would be better in many cases).<br>

 
 </p> 

  


</defanged_body></defanged_meta></blockquote></HTML>