[fpc-pascal] Question about System.Move()

Derek Edson mr.ed.123 at gmail.com
Sat Jan 30 08:32:39 CET 2021


The reference count is only updated if the variable "A" was assigned to
another variable. However, if "A" was global, it may still be changed by a
second thread without changing its value.

program SwapTest;
uses
  Classes, SysUtils;

type

  { MyThread }

  MyThread = class(TThread)
  protected
    procedure Execute; override;
  end;

type
  PAnsiRec = ^TAnsiRec;
  TAnsiRec = Record
    CodePage    : TSystemCodePage;
    ElementSize : Word;
    {$ifdef CPU64}
    { align fields  }
    Dummy       : DWord;
    {$endif CPU64}
    Ref         : SizeInt;
    Len         : SizeInt;
    end;

var
  AStr: string;
  BStr: string;
  LThread: TThread;

function SRefCount(P : Pointer) : integer;
begin
  if P=Nil then
    Result:=0
  else
    Result:=PAnsiRec(P-SizeOf(TAnsiRec))^.Ref;
end;

procedure stringSwap(var a, b: string);
var
  t: string;
begin
  writeln('RefCount of A before assign to T=', SRefCount(pointer(a)));
  t := a;
  writeln('RefCount of A after assign to T=', SRefCount(pointer(a)));
  a := b;
  b := t;
  writeln('RefCount of A after at exit of stringSwap=',
SRefCount(pointer(a)));
  writeln('RefCount of B after at exit of stringSwap=',
SRefCount(pointer(B)));
end;

{ MyThread }

procedure MyThread.Execute;
begin
  AStr := '';
end;

begin
  LThread := MyThread.Create(false);
  writeln('Multi-threading=', IsMultiThread);
  AStr := IntToStr(123);
  UniqueString(AStr);
  BStr := IntToHex(123, 3);
  UniqueString(BStr);
  writeln('AStr="', AStr, '", Reference count before call=',
SRefCount(pointer(AStr)));
  writeln('BStr="', BStr, '", Reference count before call=',
SRefCount(pointer(BStr)));
  stringSwap(AStr, BStr);
  writeln('AStr="', AStr, '", Reference count after call=',
SRefCount(pointer(AStr)));
  writeln('BStr="', BStr, '", Reference count after call=',
SRefCount(pointer(BStr)));
  LThread.WaitFor;
  LThread.Free;
end.

The output is dependent on the the timing of the two threads, but in this
case it occurred after the return of stringSwap, but you can still see the
affect of the variable being cleared by the second thread. This global
value could be assigned a new value, and that could also cause an issue as
the reference in your simplified code could still reference the old value
which may have been deleted.

Multi-threading=TRUE
AStr="123", Reference count before call=1
BStr="07B", Reference count before call=1
RefCount of A before assign to T=1
RefCount of A after assign to T=2
RefCount of A after on exit of stringSwap=1
RefCount of B after at exit of stringSwap=2
AStr="", Reference count after call=0
BStr="123", Reference count after call=1

Derek


On Sat, Jan 30, 2021 at 12:24 AM Benito van der Zander via fpc-pascal <
fpc-pascal at lists.freepascal.org> wrote:

> Hi,
>
>
> Should "A" have a reference count of 1, and it is assigned a new value on
> another thread, its reference count will decrease to zero and the string
> and its descriptor are both freed. The value of "A" will be set to nil.
>
>
> If the reference count is 1, there is no other thread involved.
> If there was another thread, there would be a reference in this thread and
> in the other thread, so the reference count would be at least 2
>
>
> It is exactly as thread safe as the unoptimized function. If the reference
> count was 1, the unoptimized function would not work either, since the
> thread swap might occur just before the first instruction, and the strings
> are freed before the function can do anyhting
>
>
> Bye,
> Benito
> On 28.01.21 08:08, Derek Edson via fpc-pascal wrote:
>
> Your simplified code would not be thread safe. A thread swap after the
> first instruction of
>
>         mov (%rdi), %rax
>
> could potentially cause a problem. RAX (A) contains the pointer to the
> string descriptor, which includes the pointer to the actual string data and
> the reference count.
>
> Should "A" have a reference count of 1, and it is assigned a new value on
> another thread, its reference count will decrease to zero and the string
> and its descriptor are both freed. The value of "A" will be set to nil.
>
> RAX will not have been updated, so after
>
>         mov %rax, (%rsi)
>
> "B" will be pointing to an invalid location. Its content is now dependent
> on memory reallocation.
>
> In this situation, changing "B" after the return from stringSwap procedure
> might again cause the string to be freed (assuming the memory was not
> reallocated), causing a double free error.
>
> Use of the internal procedures to increment the string reference count of
> both strings before your code and then decrementing the reference counts
> afterwards would probably work.
>
> The same probably applies to other reference count objects.
>
> Derek
>
> On Thu, Jan 28, 2021 at 11:35 AM Benito van der Zander via fpc-pascal <
> fpc-pascal at lists.freepascal.org> wrote:
>
>> Hi,
>>
>>
>> Procedure ManagedMove<T>(const source: T;var dest: T;count: SizeInt);
>>
>>
>> In principle a good idea. However this is one of those cases where you'd
>> definitely need to use constref instead of const.
>>
>>
>> Or var, since the source might be cleared
>>
>> And perhaps there could be a special attribute to mark which kind of
>>> moving is needed, e.g..
>>>   type [moveable] TA = record
>>>   type [referencecounted] TA = record
>>>   type [nonmoveable] TA = record
>>>
>>
>> No, thank you.
>>
>>
>>
>> But it could help a lot with optimizations. For moveable types it could
>> omit calling an assignment function and just do a memory copy; and for
>> refcounted tyes it could that when the number of references does not
>> change.
>>
>>
>>
>> Like a simple swap function:
>>
>> procedure stringSwap(var a, b: string);
>> var t: string;
>> begin
>>   t := a;
>>   a := b;
>>   b := t;
>> end;
>>
>> A smart compiler could detect that it is a refcounted type and the number
>> of references stays the same, and thus optimize it into:
>>
>>     mov    (%rdi),%rax
>>     mov    (%rsi),%rdx
>>     mov    %rdx,(%rdi)
>>     mov    %rax,(%rsi)
>>     retq
>>
>> But FPC turns it into:
>>
>>     begin
>>     push   %rbx
>>     push   %r12
>>     lea    -0x68(%rsp),%rsp
>>     mov    %rdi,%rbx
>>     mov    %rsi,%r12
>>     movq   $0x0,(%rsp)
>>     lea    0x8(%rsp),%rdx
>>     lea    0x20(%rsp),%rsi
>>     mov    $0x1,%edi
>>     callq  0x4324c0 <fpc_pushexceptaddr>
>>     mov    %rax,%rdi
>>     callq  0x41fba0 <fpc_setjmp>
>>     movslq %eax,%rdx
>>     mov    %rdx,0x60(%rsp)
>>     test   %eax,%eax
>>     jne    0x469191 <STRINGSWAP+97>
>>     t := a;
>>     mov    (%rbx),%rsi
>>     mov    %rsp,%rdi
>>     callq  0x428d00 <fpc_ansistr_assign>
>>     a := b;
>>     mov    (%r12),%rsi
>>     mov    %rbx,%rdi
>>     callq  0x428d00 <fpc_ansistr_assign>
>>     b := t;
>>     mov    (%rsp),%rsi
>>     mov    %r12,%rdi
>>     callq  0x428d00 <fpc_ansistr_assign>
>>     callq  0x432830 <fpc_popaddrstack>
>>     end;
>>     mov    %rsp,%rdi
>>     callq  0x428ca0 <fpc_ansistr_decr_ref>
>>     mov    0x60(%rsp),%rax
>>     test   %rax,%rax
>>     je     0x4691b8 <STRINGSWAP+136>
>>     callq  0x4329e0 <fpc_reraise>
>>     movq   $0x0,0x60(%rsp)
>>     jmp    0x469191 <STRINGSWAP+97>
>>     lea    0x68(%rsp),%rsp
>>     pop    %r12
>>     pop    %rbx
>>     retq
>>
>>
>>
>> Then you want to simply use ismanagedtype and move().  Moving the move()
>> to a separate generic procedure will only lead to many instantiations of
>> what is basically a move() procedure.
>>
>>
>> The goal is to reduce instantiations.
>> If there are n generic (collection) classes, C1, C2, .., Cn, and k types
>> used in the collections, T1, T2, .. Tk, there are n*k collection class
>> instantiation C1<T1>, C1<T2>, .. C1<Tk>, C2<T1>, ... Cn<Tk>
>> If each collection class does its own Finalize-Loop/Move/Fillchar there
>> are also n*k of these move implementations. So like 3*n*k calls.
>> When any collection can call the same ManagedMove, there should only be k
>> ManagedMove instantiations and n*k calls to ManagedMove, which is only one
>> call. So it is k + n*k calls,  which is around a third of 3*n*k
>>
>> Bye,
>> Benito
>> On 11.01.21 18:51, Sven Barth via fpc-pascal wrote:
>>
>> Benito van der Zander via fpc-pascal <fpc-pascal at lists.freepascal.org>
>> schrieb am Mo., 11. Jan. 2021, 15:26:
>>
>>> Hi,
>>>
>>> perhaps a  safe, generic function for this copying could be added to the
>>> RTL. Like:
>>>
>>> Procedure ManagedMove<T>(const source: T;var dest: T;count: SizeInt);
>>>
>>
>> In principle a good idea. However this is one of those cases where you'd
>> definitely need to use constref instead of const.
>>
>> And when you use IsManagedType, it does not distinguish standard strings
>>> with such weird managed types.
>>>
>>
>> You can additionally use GetTypeKind as well. Unlike TypeInfo it directly
>> returns the TTypeKind (which for this case is enough) and is considered
>> constant.
>>
>> And perhaps there could be a special attribute to mark which kind of
>>> moving is needed, e.g..
>>>   type [moveable] TA = record
>>>   type [referencecounted] TA = record
>>>   type [nonmoveable] TA = record
>>>
>>
>> No, thank you.
>>
>> Regards,
>> Sven
>>
>> _______________________________________________
>> fpc-pascal maillist  -  fpc-pascal at lists.freepascal.orghttps://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
>>
>> _______________________________________________
>> fpc-pascal maillist  -  fpc-pascal at lists.freepascal.org
>> https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
>>
>
>
> --
> Derek Edson
>
> _______________________________________________
> fpc-pascal maillist  -  fpc-pascal at lists.freepascal.orghttps://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
>
> _______________________________________________
> fpc-pascal maillist  -  fpc-pascal at lists.freepascal.org
> https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-pascal
>


-- 
Derek Edson
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.freepascal.org/pipermail/fpc-pascal/attachments/20210130/bcf5d707/attachment-0001.htm>


More information about the fpc-pascal mailing list