<div dir="ltr">I think the problem is the way write_portd() is implemented:<div><br></div><div><div>procedure write_portd(const Data: Pointer; const Port: Word); {$IFDEF ASMINLINE} inline; {$ENDIF}</div><div>asm // RCX: data, RDX: port</div><div>  {$IFDEF LINUX} mov dx, port {$ENDIF}</div><div><span style="white-space:pre-wrap">  </span>mov rsi, data // DX=port</div><div>        outsd</div><div>end;   </div></div><div><br></div><div>If I replace it with an implementation that does not use the outsd instruction, it works fine. </div><div><br></div><div>Matias</div></div><div class="gmail_extra"><br><div class="gmail_quote">2018-01-10 15:55 GMT+01:00 Matias Vara <span dir="ltr">&lt;<a href="mailto:matiasevara@gmail.com" target="_blank">matiasevara@gmail.com</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hello everyone, <div><br></div><div>I am getting an exception when I enable the -O2 optimization. More precisely, the line that starts with write_portd(...) is corrupting the data section. 
This is the Pascal code: </div><div><br></div><div><div>function PciReadDword(const bus, device, func, regnum: UInt32): UInt32;</div><div>var</div><div>  Send: DWORD;</div><div>begin</div><div>  Send := $80000000 or (bus shl 16) or (device shl 11) or (func shl 8) or (regnum shl 2);</div><div>  write_portd(@Send, PCI_CONF_PORT_INDEX);</div><div>  read_portd(@Send, PCI_CONF_PORT_DATA);</div><div>  Result := Send;</div><div>end;  </div></div><div><br></div><div><b><u>which generates (without -O2):</u></b></div><div><br></div><div><div>.section .text.n_arch_$$_pcireaddword$<wbr>longword$longword$longword$<wbr>longword$$longword,"x"</div><div><span style="white-space:pre-wrap">      </span>.balign 16,0x90</div><div>.globl<span style="white-space:pre-wrap">    </span>ARCH_$$_PCIREADDWORD$LONGWORD$<wbr>LONGWORD$LONGWORD$LONGWORD$$<wbr>LONGWORD</div><div>ARCH_$$_PCIREADDWORD$LONGWORD$<wbr>LONGWORD$LONGWORD$LONGWORD$$<wbr>LONGWORD:</div><div>.Lc207:</div><div>.seh_proc ARCH_$$_PCIREADDWORD$LONGWORD$<wbr>LONGWORD$LONGWORD$LONGWORD$$<wbr>LONGWORD</div><div>.Ll464:</div><div># [992] begin</div><div><span style="white-space:pre-wrap">        </span>pushq<span style="white-space:pre-wrap">   </span>%rbp</div><div>.seh_pushreg %rbp</div><div>.Lc209:</div><div>.Lc210:</div><div><span style="white-space:pre-wrap"> </span>movq<span style="white-space:pre-wrap">    </span>%rsp,%rbp</div><div>.Lc211:</div><div><span style="white-space:pre-wrap">  </span>leaq<span style="white-space:pre-wrap">    </span>-80(%rsp),%rsp</div><div>.seh_stackalloc 80</div><div>.seh_endprologue</div><div># Var bus located at rbp-8, size=OS_32</div><div># Var device located at rbp-16, size=OS_32</div><div># Var func located at rbp-24, size=OS_32</div><div># Var regnum located at rbp-32, size=OS_32</div><div># Var $result located at rbp-40, size=OS_32</div><div># Var Send located at rbp-48, size=OS_32</div><div><span style="white-space:pre-wrap">    </span>movl<span style="white-space:pre-wrap">    
</span>%ecx,-8(%rbp)</div><div><span style="white-space:pre-wrap">    </span>movl<span style="white-space:pre-wrap">    </span>%edx,-16(%rbp)</div><div><span style="white-space:pre-wrap">   </span>movl<span style="white-space:pre-wrap">    </span>%r8d,-24(%rbp)</div><div><span style="white-space:pre-wrap">   </span>movl<span style="white-space:pre-wrap">    </span>%r9d,-32(%rbp)</div><div>.Ll465:</div><div># [993] Send := $80000000 or (bus shl 16) or (device shl 11) or (func shl 8) or (regnum shl 2);</div><div><span style="white-space:pre-wrap">       </span>movl<span style="white-space:pre-wrap">    </span>-8(%rbp),%eax</div><div><span style="white-space:pre-wrap">    </span>shll<span style="white-space:pre-wrap">    </span>$16,%eax</div><div><span style="white-space:pre-wrap"> </span>orl<span style="white-space:pre-wrap">     </span>$2147483648,%eax</div><div><span style="white-space:pre-wrap"> </span>movl<span style="white-space:pre-wrap">    </span>-16(%rbp),%edx</div><div><span style="white-space:pre-wrap">   </span>shll<span style="white-space:pre-wrap">    </span>$11,%edx</div><div><span style="white-space:pre-wrap"> </span>orl<span style="white-space:pre-wrap">     </span>%eax,%edx</div><div><span style="white-space:pre-wrap">        </span>movl<span style="white-space:pre-wrap">    </span>-24(%rbp),%eax</div><div><span style="white-space:pre-wrap">   </span>shll<span style="white-space:pre-wrap">    </span>$8,%eax</div><div><span style="white-space:pre-wrap">  </span>orl<span style="white-space:pre-wrap">     </span>%edx,%eax</div><div><span style="white-space:pre-wrap">        </span>movl<span style="white-space:pre-wrap">    </span>-32(%rbp),%edx</div><div><span style="white-space:pre-wrap">   </span>shll<span style="white-space:pre-wrap">    </span>$2,%edx</div><div><span style="white-space:pre-wrap">  </span>orl<span style="white-space:pre-wrap">     </span>%eax,%edx</div><div><span style="white-space:pre-wrap">        </span>movl<span 
style="white-space:pre-wrap">    </span>%edx,-48(%rbp)</div><div>.Ll466:</div><div># [995] write_portd(@Send, PCI_CONF_PORT_INDEX);</div><div><span style="white-space:pre-wrap">      </span>leaq<span style="white-space:pre-wrap">    </span>-48(%rbp),%rcx</div><div><span style="white-space:pre-wrap">   </span>movl<span style="white-space:pre-wrap">    </span>$3320,%edx</div><div><span style="white-space:pre-wrap">       </span>call<span style="white-space:pre-wrap">    </span>ARCH_$$_WRITE_PORTD$POINTER$<wbr>WORD</div><div>.Ll467:</div><div># [996] read_portd(@Send, PCI_CONF_PORT_DATA);</div><div><span style="white-space:pre-wrap">   </span>leaq<span style="white-space:pre-wrap">    </span>-48(%rbp),%rcx</div><div><span style="white-space:pre-wrap">   </span>movl<span style="white-space:pre-wrap">    </span>$3324,%edx</div><div><span style="white-space:pre-wrap">       </span>call<span style="white-space:pre-wrap">    </span>ARCH_$$_READ_PORTD$POINTER$<wbr>WORD</div><div>.Ll468:</div><div># [997] Result := Send;</div><div><span style="white-space:pre-wrap">   </span>movl<span style="white-space:pre-wrap">    </span>-48(%rbp),%eax</div><div><span style="white-space:pre-wrap">   </span>movl<span style="white-space:pre-wrap">    </span>%eax,-40(%rbp)</div><div>.Ll469:</div><div># [998] end;</div><div><span style="white-space:pre-wrap">  </span>movl<span style="white-space:pre-wrap">    </span>-40(%rbp),%eax</div><div><span style="white-space:pre-wrap">   </span>nop</div><div><span style="white-space:pre-wrap">      </span>leaq<span style="white-space:pre-wrap">    </span>(%rbp),%rsp</div><div><span style="white-space:pre-wrap">      </span>popq<span style="white-space:pre-wrap">    </span>%rbp</div><div><span style="white-space:pre-wrap">     </span>ret</div><div>.seh_endproc</div><div>.Lc208:</div><div>.Lt28:</div><div>.Ll470:</div></div><div><br></div><div><b><u>and with -O2:</u></b></div><div><br></div><div><div>.section 
.text.n_arch_$$_pciwriteword$<wbr>word$word$word$word$word,"x"</div><div><span style="white-space:pre-wrap"> </span>.balign 16,0x90</div><div>.globl<span style="white-space:pre-wrap">    </span>ARCH_$$_PCIWRITEWORD$WORD$<wbr>WORD$WORD$WORD$WORD</div><div>ARCH_$$_PCIWRITEWORD$WORD$<wbr>WORD$WORD$WORD$WORD:</div><div>.Lc148:</div><div># Temps allocated between rbp-16 and rbp-8</div><div>.seh_proc ARCH_$$_PCIWRITEWORD$WORD$<wbr>WORD$WORD$WORD$WORD</div><div>.Ll471:</div><div># [1014] begin</div><div><span style="white-space:pre-wrap">      </span>pushq<span style="white-space:pre-wrap">   </span>%rbp</div><div>.seh_pushreg %rbp</div><div>.Lc150:</div><div>.Lc151:</div><div><span style="white-space:pre-wrap"> </span>movq<span style="white-space:pre-wrap">    </span>%rsp,%rbp</div><div>.Lc152:</div><div><span style="white-space:pre-wrap">  </span>leaq<span style="white-space:pre-wrap">    </span>-48(%rsp),%rsp</div><div>.seh_stackalloc 48</div><div># Var bus located in register ax</div><div># Var device located in register dx</div><div># Var func located in register r8w</div><div># Var regnum located in register r9w</div><div># Var value located in register cx</div><div><span style="white-space:pre-wrap">    </span>movq<span style="white-space:pre-wrap">    </span>%rbx,-16(%rbp)</div><div>.seh_savereg %rbx, 32</div><div>.seh_endprologue</div><div># Var Send located at rbp-8, size=OS_32</div><div><span style="white-space:pre-wrap">  </span>movw<span style="white-space:pre-wrap">    </span>%cx,%ax</div><div><span style="white-space:pre-wrap">  </span>movw<span style="white-space:pre-wrap">    </span>48(%rbp),%bx</div><div># PeepHole Optimization,var11</div><div>.Ll472:</div><div># [1015] Send := $80000000 or (bus shl 16) or (device shl 11) or (func shl 8) or (regnum and $fc);</div><div><span style="white-space:pre-wrap">  </span>andl<span style="white-space:pre-wrap">    </span>$65535,%eax</div><div><span style="white-space:pre-wrap">      </span>shll<span 
style="white-space:pre-wrap">    </span>$16,%eax</div><div><span style="white-space:pre-wrap"> </span>orl<span style="white-space:pre-wrap">     </span>$2147483648,%eax</div><div># PeepHole Optimization,var11</div><div><span style="white-space:pre-wrap">     </span>andl<span style="white-space:pre-wrap">    </span>$65535,%edx</div><div><span style="white-space:pre-wrap">      </span>shll<span style="white-space:pre-wrap">    </span>$11,%edx</div><div><span style="white-space:pre-wrap"> </span>orl<span style="white-space:pre-wrap">     </span>%eax,%edx</div><div># PeepHole Optimization,var11</div><div><span style="white-space:pre-wrap">    </span>andl<span style="white-space:pre-wrap">    </span>$65535,%r8d</div><div><span style="white-space:pre-wrap">      </span>shll<span style="white-space:pre-wrap">    </span>$8,%r8d</div><div><span style="white-space:pre-wrap">  </span>orl<span style="white-space:pre-wrap">     </span>%edx,%r8d</div><div># PeepHole Optimization,var1</div><div># PeepHole Optimization,var11</div><div><span style="white-space:pre-wrap"> </span>andl<span style="white-space:pre-wrap">    </span>$252,%r9d</div><div><span style="white-space:pre-wrap">        </span>orl<span style="white-space:pre-wrap">     </span>%r8d,%r9d</div><div><span style="white-space:pre-wrap">        </span>movl<span style="white-space:pre-wrap">    </span>%r9d,-8(%rbp)</div><div>.Ll473:</div><div># [1016] write_portd(@Send, PCI_CONF_PORT_INDEX);</div><div><span style="white-space:pre-wrap">      </span>leaq<span style="white-space:pre-wrap">    </span>-8(%rbp),%rcx</div><div><span style="white-space:pre-wrap">    </span>movl<span style="white-space:pre-wrap">    </span>$3320,%edx</div><div><span style="white-space:pre-wrap">       </span>call<span style="white-space:pre-wrap">    </span>ARCH_$$_WRITE_PORTD$POINTER$<wbr>WORD</div><div>.Ll474:</div><div># [1017] write_portw(value, PCI_CONF_PORT_DATA);</div><div><span style="white-space:pre-wrap"> </span>movw<span 
style="white-space:pre-wrap">    </span>%bx,%cx</div><div># Var value located in register cx</div><div># PeepHole Optimization,var11</div><div><span style="white-space:pre-wrap">     </span>andl<span style="white-space:pre-wrap">    </span>$65535,%ecx</div><div><span style="white-space:pre-wrap">      </span>movl<span style="white-space:pre-wrap">    </span>$3324,%edx</div><div><span style="white-space:pre-wrap">       </span>call<span style="white-space:pre-wrap">    </span>ARCH_$$_WRITE_PORTW$WORD$WORD</div><div>.Ll475:</div><div># [1018] end;</div><div><span style="white-space:pre-wrap">  </span>movq<span style="white-space:pre-wrap">    </span>-16(%rbp),%rbx</div><div><span style="white-space:pre-wrap">   </span>leaq<span style="white-space:pre-wrap">    </span>(%rbp),%rsp</div><div><span style="white-space:pre-wrap">      </span>popq<span style="white-space:pre-wrap">    </span>%rbp</div><div><span style="white-space:pre-wrap">     </span>ret</div><div>.seh_endproc</div></div><div><br></div><div>The first thing I noticed is that the optimized version does not seem to generate the correct code on exit, since it should return "Send". Am I right? The assembler code of write_portd remains the same. Am I missing something? </div><div><br></div><div>Regards, Matias. </div><div><br></div></div>
</blockquote></div><br></div>