[fpc-devel] SMP > 2 core Cache issue with Posix threads

Michael Schnell mschnell at lumino.de
Fri Jul 1 10:43:04 CEST 2011


In another topic (now closed) Andrew described that code similar to 
HansPeter's example ran correctly on a dual core machine, but 
produced errors on a machine with more cores.

Now I understand that threaded FPC user programs are supposed to be 
written in a Posix compliant way and that FPC/RTL/LCL provide Posix 
compatible MUTEX behavior, e.g. with TCriticalSection.

I understand that regarding threads and SMP, Posix (while not explicitly 
handling SMP) requires a user program to assume that unsynchronized code 
sequences in multiple threads can run truly in parallel (an indefinite 
number of cores needs to be assumed, and thread priority does not 
enforce sequential behavior).

Moreover, regarding memory / cache synchronization, Posix (not explicitly 
handling SMP and cache) requires the user program not to assume that any 
variable has been set by another thread unless some synchronization 
ensures that the code sequence that sets it has already been executed.

OTOH, if some synchronization ensures that the code sequence that sets 
the variable has been executed, then IMHO Posix compatibility of the 
infrastructure (compiler, libraries, OS, hardware) ensures that another 
thread in fact gets the correct value of that variable (i.e. SMP cache 
synchronization is done by the time the synchronization is granted).


On 06/30/2011 11:31 AM, Hans-Peter Diettrich wrote:
>
> Consider the shareable bi-linked list, where insertion requires code 
> like this:
>   list.Lock; //prevent concurrent access
>   ... //determine affected list elements
>   new.prev := prev; //prev must be guaranteed to be valid
>   new.next := next;
>   prev.next := new;
>   next.prev := new;
>   list.Unlock;

In the other topic we found that
  - my original "volatile" question is invalid, as the function calls 
done with list.lock and list.unlock are a "volatile barrier" preventing 
the compiler from caching some value inappropriately in a register, 
simply because they are function calls.
  - the MUTEX operations done with list.lock and list.unlock (at least 
on PCs in Linux and Windows) use library calls that include memory 
barriers.
  - if the potential cache incoherency were not handled by hardware 
/ OS / libraries on behalf of userland programs (including those not 
built with FPC), I feel the result would be so disastrous and 
ubiquitous, with so many programs not working on SMP systems, that it 
would be a really well-known issue in the (C-) programmer community.

So, if Andrew's test case really fails, I think, there is some kind of 
bug _somewhere_ and I think it should be located to verify that it is 
not in the FPC compiler or the FPC RTL.

(If it fails only on Andrew's machine, my guess would be a hardware problem.)

Who has a machine with more than two cores to test and debug this?
-Michael


