[fpc-devel] SMP > 2 core Cache issue with Posix threads
Michael Schnell
mschnell at lumino.de
Fri Jul 1 10:43:04 CEST 2011
In another topic (now closed) Andrew described that code similar to
HansPeter's example ran correctly on a dual-core machine, but
produced errors on a machine with more cores.
Now I understand that threaded FPC user programs are supposed to be
written in a Posix compliant way, and that FPC/RTL/LCL provide Posix
compatible MUTEX behavior, e.g. with TCriticalSection.
I understand that regarding threads and SMP, Posix (while not explicitly
handling SMP) requires a user program to assume that unsynchronized code
sequences in multiple threads can work truly in parallel (an indefinite
number of cores needs to be assumed, priority does not enforce a
sequential behavior).
Moreover, regarding memory / cache synchronization, Posix (not explicitly
handling SMP and cache) requires the user program not to assume that any
variable has been set by another thread unless some synchronization
ensures that the code sequence that sets it has already been executed.
OTOH, if some synchronization ensures that the code sequence that sets
the variable has been executed, IMHO Posix compatibility of the
infrastructure (compiler, libraries, OS, hardware) ensures that another
thread in fact gets the correct value of that variable (i.e. the SMP
caches are synchronized once the synchronization is granted).
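To make the rule above concrete, here is a minimal sketch (my own
illustration, not code from the other topic). The names TWriter, Lock,
Data and Ready are hypothetical. The reader only trusts Data after it
has seen Ready while holding the same TCriticalSection the writer used:

```pascal
program visibility;
{$mode objfpc}{$H+}
uses
  {$ifdef unix}cthreads,{$endif}  // thread manager, must be first on Unix
  Classes, SyncObjs, SysUtils;

type
  // Hypothetical writer thread: publishes Data under the lock.
  TWriter = class(TThread)
  protected
    procedure Execute; override;
  end;

var
  Lock: TCriticalSection;
  Data: Integer = 0;
  Ready: Boolean = False;

procedure TWriter.Execute;
begin
  Lock.Acquire;
  try
    Data := 42;     // the value to publish
    Ready := True;  // set inside the same locked section
  finally
    Lock.Release;
  end;
end;

var
  W: TWriter;
  Seen: Boolean;
  Value: Integer;
begin
  Lock := TCriticalSection.Create;
  W := TWriter.Create(False);  // start immediately
  repeat
    // Read both variables only while holding the lock.
    Lock.Acquire;
    try
      Seen := Ready;
      Value := Data;
    finally
      Lock.Release;
    end;
    if not Seen then
      Sleep(1);
  until Seen;
  W.WaitFor;
  W.Free;
  Lock.Free;
  WriteLn('Value seen by reader: ', Value);
end.
```

If the reader polled Ready without taking the lock, Posix would give
no guarantee about what it sees in Data.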
On 06/30/2011 11:31 AM, Hans-Peter Diettrich wrote:
>
> Consider the shareable bi-linked list, where insertion requires code
> like this:
> list.Lock; //prevent concurrent access
> ... //determine affected list elements
> new.prev := prev; //prev must be guaranteed to be valid
> new.next := next;
> prev.next := new;
> next.prev := new;
> list.Unlock;
In the other topic we found that
- my original "volatile" question is invalid, as the function calls
done with list.lock and list.unlock are a "volatile barrier" preventing
the compiler from caching some value inappropriately in a register,
simply because they are function calls.
- the MUTEX operations done with list.lock and list.unlock (at least
on PCs in Linux and Windows) use library calls that include memory
barriers
- if the potential cache incoherency were not handled by Hardware
/ OS / Libraries on behalf of user land programs (not done with FPC), I
feel that this would be so disastrous and ubiquitous, with so many
programs not working on SMP systems, that it would be a really well
known issue in the (C-) programmer community.
So, if Andrew's test case really fails, I think, there is some kind of
bug _somewhere_ and I think it should be located to verify that it is
not in the FPC compiler or the FPC RTL.
(If it fails only on Andrew's machine, I would suspect a hardware problem.)
Who has a machine with more than two cores to test and debug this?
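As a starting point, here is a minimal sketch of such a test. It is
not Andrew's test case: it is my own assumption of what a reduced
version could look like. All names (TInserter, ListLock,
InsertAfterHead, CountAndCheck) and both constants are hypothetical.
Several threads insert into one bi-linked list under a
TCriticalSection, and the main thread checks the links afterwards:

```pascal
program listtest;
{$mode objfpc}{$H+}
uses
  {$ifdef unix}cthreads,{$endif}  // thread manager, must be first on Unix
  Classes, SyncObjs, SysUtils;

const
  ThreadCount = 8;            // assumption: pick at least the core count
  InsertsPerThread = 100000;  // assumption: enough to provoke contention

type
  PNode = ^TNode;
  TNode = record
    Prev, Next: PNode;
  end;

  TInserter = class(TThread)
  protected
    procedure Execute; override;
  end;

var
  ListLock: TCriticalSection;
  Head, Tail: TNode;  // sentinel nodes, never removed

// Same four assignments as in HansPeter's example, inside the lock.
procedure InsertAfterHead;
var
  N, P, X: PNode;
begin
  New(N);
  ListLock.Acquire;
  try
    P := @Head;
    X := Head.Next;
    N^.Prev := P;
    N^.Next := X;
    P^.Next := N;
    X^.Prev := N;
  finally
    ListLock.Release;
  end;
end;

procedure TInserter.Execute;
var
  I: Integer;
begin
  for I := 1 to InsertsPerThread do
    InsertAfterHead;
end;

// Walks the list forward; returns the node count, or -1 on a bad link.
function CountAndCheck: Integer;
var
  P: PNode;
begin
  Result := 0;
  P := Head.Next;
  while P <> @Tail do
  begin
    if (P = nil) or (P^.Prev = nil) or (P^.Prev^.Next <> P) or
       (Result >= ThreadCount * InsertsPerThread) then
      Exit(-1);
    Inc(Result);
    P := P^.Next;
  end;
  if (Tail.Prev = nil) or (Tail.Prev^.Next <> @Tail) then
    Result := -1;
end;

var
  Threads: array[1..ThreadCount] of TInserter;
  I, Count: Integer;
begin
  ListLock := TCriticalSection.Create;
  Head.Prev := nil;
  Head.Next := @Tail;
  Tail.Prev := @Head;
  Tail.Next := nil;
  for I := 1 to ThreadCount do
    Threads[I] := TInserter.Create(False);
  for I := 1 to ThreadCount do
  begin
    Threads[I].WaitFor;
    Threads[I].Free;
  end;
  Count := CountAndCheck;
  if Count = ThreadCount * InsertsPerThread then
    WriteLn('OK: ', Count, ' nodes, links consistent')
  else
    WriteLn('FAILED: check returned ', Count);
  ListLock.Free;  // nodes are deliberately not freed in this sketch
end.
```

If this reports FAILED on a machine with more than two cores, the next
step would be to compare it with an equivalent C program using
pthread_mutex_lock, to see whether the problem is in FPC or below it.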
-Michael