[fpc-devel] fpc bug with M1
Martin Frb
lazarus at mfriebe.de
Thu Dec 30 18:22:15 CET 2021
On 30/12/2021 17:16, Florian Klämpfl via fpc-devel wrote:
> Am 30.12.21 um 14:52 schrieb Jonas Maebe via fpc-devel:
>> On 29/12/2021 00:48, Martin Frb via fpc-devel wrote:
>>> I don't have an M1 myself, but according to the data from the thread
>>> on the lazarus mail list, there is a bug in the 3.3.1 asm generator
>>> for M1
>>>
>>> var pn8: pint8; // pointer signed byte
>>>
>>> In the below expression ...(not pn8^)...
>>>
>>> "pn8^" is loaded to w0 and sign extended. From this point onwards
>>> operations on the value should be 32 bits (the value has been
>>> extended, and the full 32 bits are later used).
>>> However, "not" only affects the lowest 8 bits.
>>>
>>> Apparently in 3.2.2 (or was it 3.2.0) there was
>>> mvn w0,w0
>>>
>>> If someone can confirm this....
>>
>> It's probably caused by c90616944d3bde7b36e924d27a0790195d61f95c
>> (Florian)
>>
>
> Isn't the sign extension during the load wrong? Martin didn't post the
> whole assembly code, but I would expect that 3.2.2 produced an uxtb
> instruction afterwards, which hid the problem.
The code is from the "old" LazUtils UTF8LengthFast.
"old" => about a week ago; it was recently changed to use uint8.
function UTF8LengthFast(p: PChar; ByteCount: PtrInt): PtrInt;
var
  pnx: PPtrInt absolute p; // To get contents of text in PtrInt blocks. x refers to 32 or 64 bits
  pn8: pint8 absolute pnx; // To read text as Int8 in the initial and final loops
begin
  ....
  Result += (pn8^ shr 7) and ((not pn8^) shr 6);
The issue is in the "((not pn8^) shr 6)" part.
On x86 the "not" is applied to the byte only, then the result is sign
extended, then shifted (interesting that the value for a logical shift
is sign extended).
Project1.pas:276 Result += (pn8^ shr 7) and ((not pn8^) shr 6);
0000000100001B30 488b45f8 mov -0x8(%rbp),%rax
0000000100001B34 8a00 mov (%rax),%al
0000000100001B36 f6d0 not %al
0000000100001B38 0fbec0 movsbl %al,%eax
0000000100001B3B c1e806 shr $0x6,%eax
0000000100001B3E 488b55f8 mov -0x8(%rbp),%rdx
0000000100001B42 0fbe12 movsbl (%rdx),%edx
0000000100001B45 c1ea07 shr $0x7,%edx
0000000100001B48 21d0 and %edx,%eax
0000000100001B4A 4863c0 movslq %eax,%rax
0000000100001B4D 480345e8 add -0x18(%rbp),%rax
0000000100001B51 488945e8 mov %rax,-0x18(%rbp)