[fpc-devel] SMP > 2 core Cache issue with Posix threads

Michael Schnell mschnell at lumino.de
Fri Jul 1 15:36:46 CEST 2011


On 07/01/2011 03:03 PM, Vincent Snijders wrote:
>
> So how you expect us to find the description *you* want us to read in
> all those mails, if even you cannot find it.
I can't find it in the mailing list archive website. I did find it in my mail store
(no idea if this helps, though):

This is Andrew's message of 06/28/2011 02:47 PM:
> On Tue, Jun 28, 2011 at 7:40 AM, Michael Schnell<mschnell at lumino.de>  wrote:
>> All references I read today say that pthread_mutex (on which supposedly
>> TCriticalSection is based) and the appropriate Windows stuff does contain an
>> MB. But there might be issues with other OSes and Archs.
>>
> Yes, any object that requires atomic features will employ a memory barrier.
> That is to say the MB is employed in order for the spincount to be
> accurate across all cores.
>
>> If they would not do so, the complete plain old threaded application
>> paradigm would be invalid, and tons of applications would need to be
>> trashed.
>> -Michael
> Probably right here.  My engine had worked fine on a triple-core AMD.
> It wasn't until I upgraded to the 6-core system that I had to start
> looking into what was causing random problems with pointer
> assignments.
>

------------------------------------------------------------------------

This is Andrew's initial message of 06/23/2011 02:02 PM:
> Partially.  Read barriers may not be necessary.  It depends on the
> implementation.  By having barriers placed at strategic locations in
> the code, you can make an implementation thread safe.
>
> Add, Delete, and Clear would be some.  List iteration would be
> difficult, if not unsafe, outside a lock, but if the list is accessed
> through a manager thread you can schedule an operation and it will
> wait until conditions are safe to perform it.
>
> Getting an item from the list and setting an inUse boolean to true for
> that item, rather than locking the entire list, would allow any
> deletion to be deferred until another time (by a manager thread).
>
> With multi-core systems simply adding a mutex/lock is not enough.  A
> manager is needed to be in place to safely provision, delegate, and
> use items.  IMO, this forces efficiency in a system design.  The basic
> logic for a manager thread is that the manager thread is accepting
> commands, and scheduled execution happens automatically.
>
> Further, you could create a command system for each list so when you
> add/remove an item from the list you are merely scheduling that item
> for deletion. I would avoid polling for large lists.  The command
> should have a completion routine like Item.onComplete(Item) where the
> item is passed for use in the application.
>
> This way there would be absolutely no waiting for data in code, and
> the system in general stays at rest until needed.
>
Here he goes on (06/27/2011 04:58 PM):
> You're totally underestimating the need for a memory barrier:
>
> "Multithreaded programming and memory visibility
>
> See also: Memory model (computing)
> Multithreaded programs usually use synchronization primitives provided
> by a high-level programming environment, such as Java and .NET
> Framework, or an application programming interface (API) such as POSIX
> Threads or Windows API. Primitives such as mutexes and semaphores are
> provided to synchronize access to resources from parallel threads of
> execution. These primitives are usually implemented with the memory
> barriers required to provide the expected memory visibility semantics.
> In such environments explicit use of memory barriers is not generally
> necessary. "
>
> Focus on the "These primitives are usually implemented with the memory
> barriers required to provide the expected memory visibility semantics.
> "
>
> These primitives include TCriticalSection.  It's not enough to assume
> that another thread can read a variable at will and see the right
> value.  And it would be safe to assume that the scheduler can switch
> code execution to any core at any time.
(06/27/2011 08:03 AM):
> On Mon, Jun 27, 2011 at 12:52 PM, Hans-Peter Diettrich
> <DrDiettrich1 at aol.com>  wrote:
>> You forget the strength of protection. An MB will immediately disallow any
>> concurrent access to the memory area - no way around it. Protection by other
>> means can only work in *perfect* cooperation, a very weak model.
> Absolutely incorrect.
>
> EnterCriticalSection();
> loop
>    a := b + c;
> end loop;
> LeaveCriticalSection();
>
> thread 2
>
> can read a, b, and c in any state.  If you want an accurate view of
> a, b, c you need to employ interlocked statements :-)
(06/28/2011 01:41 AM):
> 2011/6/27 Malcom Haak<insanemal at gmail.com>:
>> Tell me then why any of what you have said is relevant. In fact, in cases
>> like this the use of CriticalSections would be sensible and would not cause
>> 'tons of wait', as you have all your worker threads off doing things 99% of
>> the time.
> Thread 1:
>    a := b + c;
>    a2 := a + c2;
>    SignalEvent(E1);
>
> Thread 2:
>    repeat
>      WaitForEvent(E1, 120);
>      // we can read anything now
>    until terminated or complete
>
> This is the prime example.  On a 6-core system, a can look like one
> value to one core and a different value to another core.  It's that
> simple.  No getting around this issue.
>
> While a spinlock can block entry, it cannot guarantee memory
> order / code execution order.  Therefore it is good practice to adopt
> interlocked assignments to guarantee that memory is what it appears to be.
>
>
> Core X computes a=b+c
> Core X+Y computes a2
>
> This is relatively new territory, which requires low-level CPU
> instructions to perform such locks.  It was never needed until the
> introduction of multi-core systems, on which I did extensive tests on
> AMD via FPC/Lazarus.
(06/28/2011 04:41 AM):
> On Tue, Jun 28, 2011 at 9:47 AM, Hans-Peter Diettrich
> <DrDiettrich1 at aol.com>  wrote:
>
>> I don't see anything like memory barriers here.
> Compare-and-swap mechanisms aren't quite like memory barriers, but
> they do get the CPU to send a "fresh" copy of a variable to all cores'
> caches...



