[fpc-devel] SSE/AVX instruction encodings

J. Gareth Moreton gareth at moreton-family.com
Fri Oct 2 08:59:38 CEST 2020


Hi Torsten,

It isn't compiling correctly with -a because the operand size is being
set to S_XMM rather than S_YMM (it goes by the size of the source
operand), so when the .s file is written, an 'x' suffix gets appended
to the opcode.
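
As a rough illustration of that effect, here is a standalone sketch (not
compiler code). TSketchOpSize, the SK_* names and WithSizeSuffix are
hypothetical stand-ins, and only the 'x' and 'y' suffixes are modelled:

```pascal
program OpsizeSuffixSketch;
{$mode objfpc}{$H+}

type
  { Hypothetical, cut-down stand-in for the compiler's operand size type }
  TSketchOpSize = (SK_NO, SK_XMM, SK_YMM, SK_ZMM);

{ Appends a size suffix to the mnemonic, the way an AT&T-style writer
  might when the instruction carries an explicit operand size.
  Assumption: only 128-bit ('x') and 256-bit ('y') are modelled. }
function WithSizeSuffix(const Mnemonic: string; Size: TSketchOpSize): string;
begin
  case Size of
    SK_XMM: Result := Mnemonic + 'x';
    SK_YMM: Result := Mnemonic + 'y';
  else
    Result := Mnemonic;
  end;
end;

begin
  { Opsize recorded as the 128-bit size: the 'x' form is written }
  WriteLn(WithSizeSuffix('vcvtpd2ps', SK_XMM));
  { Opsize recorded as the 256-bit size: the 'y' form is written }
  WriteLn(WithSizeSuffix('vcvtpd2ps', SK_YMM));
end.
```

The point is only that the suffix follows whatever opsize was recorded,
so a wrong opsize silently selects a different form of the instruction
in the .s file.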

I know there's a high risk of breaking existing code, but the existing
function already checks for a lot of exceptional cases, many of which
are SSE/AVX instructions that deal with operands of different sizes.
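
For the mixed-size cases, the rule "use the source operand when it is
the wider vector" can be sketched on its own as follows. This is a
minimal sketch with hypothetical names (TSketchOpSize, VectorBits,
SourceIsWiderVector), not the code in the patch; it looks widths up
explicitly so that nothing depends on the order of the enumeration:

```pascal
program MixedSizeSketch;
{$mode objfpc}{$H+}

type
  { Hypothetical stand-in for the operand size enumeration; the order of
    the items is deliberately not relied upon anywhere below }
  TSketchOpSize = (SK_NO, SK_B, SK_W, SK_L, SK_Q, SK_XMM, SK_YMM, SK_ZMM);

{ Width in bits of a vector size; 0 means "not a vector size", which is
  also what any entry added later would get until it is listed here }
function VectorBits(Size: TSketchOpSize): Integer;
begin
  case Size of
    SK_XMM: Result := 128;
    SK_YMM: Result := 256;
    SK_ZMM: Result := 512;
  else
    Result := 0;
  end;
end;

{ True only when both operands are vector-sized and the source is wider,
  i.e. for the pairs YMM/XMM, ZMM/XMM and ZMM/YMM }
function SourceIsWiderVector(Src, Dst: TSketchOpSize): Boolean;
begin
  Result := (VectorBits(Src) > 0) and (VectorBits(Dst) > 0) and
            (VectorBits(Src) > VectorBits(Dst));
end;

begin
  WriteLn(SourceIsWiderVector(SK_YMM, SK_XMM));
  WriteLn(SourceIsWiderVector(SK_XMM, SK_YMM));
  WriteLn(SourceIsWiderVector(SK_ZMM, SK_Q));
end.
```

Only the first call satisfies the rule; the other two fall through to
whatever default handling applies.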

Gareth aka. Kit

On 01/10/2020 23:09, avx512--- via fpc-devel wrote:
> Hi Gareth,
>
> In my opinion it is not a good idea to introduce a new function to calculate the operand size.
>
> The risk of breaking existing code (fpc and user code) is very high.
>
> I introduced the memrefinfo system for SSE and AVX opcodes to protect existing user code. The basis of this concept is the opcode definition in x86ins.dat.
>
> The definition for opcode VCVTPD2PS in trunk is:
>
> ; VCVTPD2PS xmmreg_mz,mem256 must come first - map MemRefSize 256bits correct
> ;                                              map all other MemrefSize (without broasdcast MemRef) to xmmreg, xmmrm
> [VCVTPD2PS,vcvtpd2psM]
> (Ch_Wop2, Ch_Rop1)
> xmmreg_mz,mem256                          \350\352\361\362\364\370\1\x5A\110        AVX,SANDYBRIDGE,TFV
> xmmreg_mz,ymmreg                          \350\352\361\362\364\370\1\x5A\110        AVX,SANDYBRIDGE
> xmmreg_mz,xmmrm                           \350\352\361\362\370\1\x5A\110            AVX,SANDYBRIDGE,TFV
>
> // AVX512
> xmmreg_mz,bmem64                          \350\352\361\370\1\x5A\110                AVX512,BCST2,TFV
> xmmreg_mz,bmem64                          \350\352\361\364\370\1\x5A\110            AVX512,BCST4,TFV
> ymmreg_mz,mem512                          \350\351\352\361\370\1\x5A\110            AVX512,TFV
> ymmreg_mz,bmem64                          \350\351\352\361\370\1\x5A\110            AVX512,BCST8,TFV
> ymmreg_mz,zmmreg_er                       \350\351\352\361\370\1\x5A\110            AVX512
>
>
> In trunk it compiles correctly without the compiler option -a; with -a the result is not correct. I am checking this.
>
> Torsten
>
>
>
> -----Original Message-----
> Subject: Re: [fpc-devel] SSE/AVX instruction encodings
> Date: 2020-10-01T18:04:26+0200
> From: "J. Gareth Moreton via fpc-devel" <fpc-devel at lists.freepascal.org>
> To: "fpc-devel at lists.freepascal.org" <fpc-devel at lists.freepascal.org>
>
> Hi Torsten,
>
> I've done that already actually, although only to grab the value of the
> ExistsSSEAVX field.  I'm currently testing a new nested function in
> Tx86Instruction.SetInstructionOpsize:
>
>     function CheckSSEAVX: Boolean;
>       begin
>         Result := False;
>
>         if not MemRefInfo(opcode).ExistsSSEAVX then
>           Exit;
>
>         { This check also covers MMX instructions that move data to and from
>           32-bit and 64-bit registers or memory, since such instructions are
>           replicated in SSE2 for use with XMM registers }
>         if tx86operand(operands[1]).opsize in [S_B,S_W,S_L,S_Q] then
>           begin
>             opsize := S_NO;
>             Exit(True);
>           end;
>
>         if (tx86operand(operands[1]).opsize <> S_NO) and (operands[1].opr.typ = OPR_REFERENCE) then
>           begin
>             { Memory sizes of 64 bits and under are handled above }
>             opsize:=tx86operand(operands[1]).opsize;
>             Exit(True);
>           end;
>
>         { If the source operand is larger than the destination (e.g.
>           "VCVTTPD2DQ XMM0, YMM1" in Intel notation), use the source operand }
>         if ((tx86operand(operands[1]).opsize = S_YMM) and (tx86operand(operands[2]).opsize = S_XMM)) or
>           ((tx86operand(operands[1]).opsize = S_ZMM) and (tx86operand(operands[2]).opsize = S_XMM)) or
>           ((tx86operand(operands[1]).opsize = S_ZMM) and (tx86operand(operands[2]).opsize = S_YMM)) then
>           begin
>             opsize:=tx86operand(operands[1]).opsize;
>             Exit(True);
>           end;
>
>         { If none of the conditions are met, this function returns False
>           and the opsize is set to the last operand's opsize }
>       end;
>
> I've also commented out the individual checks for MOVD, MOVQ, VMOVQ etc
> to see how it handles itself and to simplify the code. "make all" at
> least works successfully and it fixes the bug listed in #37785, but it
> will need extensive testing, lest I break someone's assembly language.
>
> Note that the reason why I've done "(tx86operand(operands[1]).opsize =
> S_YMM) and (tx86operand(operands[2]).opsize = S_XMM)" etc. and not
> something like "(tx86operand(operands[1]).opsize >= S_YMM) and
> (tx86operand(operands[1]).opsize > tx86operand(operands[2]).opsize)" is
> for future safety, since the opsize field doesn't have items in size
> order (plus some entries, like S_BL, don't have a distinct size because
> it's a size conversion) and it's to prevent an unintended side-effect if
> a new entry is added after S_ZMM in the future.
>
> One thing that makes it difficult is that I don't have a processor that
> supports the AVX-512 instruction set, at least I don't think it does
> (Intel Core i7-10750H).
>
> Gareth aka. Kit
>
> P.S. If anyone can see a way to break the above code (before I submit a
> patch), please tell me!
>
>
> On 01/10/2020 15:52, avx512--- via fpc-devel wrote:
>> Hi,
>>
>> Look at the function "MemRefInfo(aAsmop: TAsmOp)" in "compiler/x86/aasmcpu.pas".
>>
>>
>> Torsten
>>
>>
>>
>> -----Original Message-----
>> Subject: [fpc-devel] SSE/AVX instruction encodings
>> Date: 2020-10-01T13:57:05+0200
>> From: "J. Gareth Moreton via fpc-devel" <fpc-devel at lists.freepascal.org>
>> To: "FPC developers' list" <fpc-devel at lists.freepascal.org>
>>
>> Hi everyone,
>>
>> I've decided to take on https://bugs.freepascal.org/view.php?id=37785 -
>> I've noticed that the compiler isn't too good at working out the sizes
>> of SSE and AVX instructions.  If you look at
>> Tx86Instruction.SetInstructionOpsize in compiler/x86/rax86.pas, it
>> checks for individual problematic instructions rather than any logical
>> flags.  I feel this isn't viable in the long-term (i.e. I really don't
>> want to continually add exceptional instructions) and has the code smell
>> of something being fundamentally wrong or incomplete with how
>> instruction sizes and encodings are determined.
>>
>> I'm looking to see if there's a way I can detect the correct size
>> logically given the flags.  I figure I'll need to learn a few things
>> about AVX512 as well so I don't mess anything up (I've noticed a few
>> AVX512 flags to indicate if scalars rather than vectors are being used,
>> and I'm wondering if they can be incorporated into the older SSE and
>> AVX instructions in x86ins.dat).
>>
>> Long story short, I'm going to experiment a bit to see if I can develop
>> an algorithm that works and is correct.
>>
>> Gareth aka. Kit
>>
>>

