[fpc-devel] volatile variables
Hans-Peter Diettrich
DrDiettrich1 at aol.com
Thu Jun 30 11:31:23 CEST 2011
Vinzent Höfler schrieb:
>>>> Question is, what makes one variable use read/write-through, while
>>>> other variables can be read from the cache, with lazy-write?
>>> Synchronisation. Memory barriers. That's what they are for.
>>
>> And this doesn't happen out of thin air. How else?
>
> Ok, maybe I misunderstood your question. Normally, these days every
> memory access is cached (and that's of course independent from compiler
> optimizations).
This should be "normal" (speedy) behaviour.
> But there are usually possibilities to define certain areas as
> write-through (like the video frame buffer). But that's more a chip-set
> thing than it has anything to do with concurrent programming.
I also thought about cache or page attributes (CONST vs. DATA sections),
but IMO these don't offer the fine granularity that would allow
separating synchronized from unsynchronized variables (class/record
members!) in code. Even the memory manager would then have to deal with
two classes of un/synced memory :-(
> Apart from that you have to use the appropriate CPU instructions.
Seems to be the only solution.
>>>> Is this a compiler requirement, which has to enforce
>>>> read/write-through for all volatile variables?
>>> No. "volatile" (at least in C) does not mean that.
>>
>> Can you provide a positive answer instead?
>
> Ada2005 RM:
>
> |C.6(16): For a volatile object all reads and updates of the object as
> | a whole are performed directly to memory.
> |C.6(20): The external effect [...] is defined to include each read and
> | update of a volatile or atomic object. The implementation shall
> | not generate any memory reads or updates of atomic or volatile
> | objects other than those specified by the program.
>
> That's Ada's definition of "volatile". C's definition is weaker, but
> should basically have the same effect.
>
> Is that positive enough for you? ;)
Much better ;-)
But what would this mean for FPC code in general (do we *need* such
attributes?), and what would their speed impact be? This obviously
depends on the effects of the specific synchronizing instructions
inserted by the compiler.
>>>> But if so, which variables (class fields...) can ever be treated as
>>>> non-volatile, when they can be used from threads other than the main
>>>> thread?
>>> Without explicit synchronisation? Actually, none.
>>
>> Do you understand the implication of your answer?
>
> I hope so. :)
>
>> When it's up to every coder to insert explicit synchronization
>> whenever required, how does one determine the places where explicit
>> code is required?
>
> By careful analysis. Although there may exist tools which detect
> potentially un-synchronised accesses to shared variables, there will
> be no tool that inserts synchronisation code automatically for you.
I wouldn't like such tools, except the compiler itself :-(
Consider a shared doubly-linked list, where insertion requires code
like this:
  list.Lock;         // prevent concurrent access
  ...                // determine the affected list elements
  new.prev := prev;  // prev must be guaranteed to be valid
  new.next := next;
  prev.next := new;
  next.prev := new;
  list.Unlock;
What can we expect from the Lock method/instruction - what kind of
synchronization (memory barrier) can, will, or should it provide?
As I understand it, a *full* cache synchronization would slow down not
only the current core and its cache, but also all other caches?
If so, would it help to enclose the above instructions in e.g.
  Synchronized begin
    update the links...
  end;
so that the compiler makes all memory references (at least reads) occur
read/write-through inside such a code block? If necessary, a global
cache sync could be inserted on exit from such a block.
After these considerations my understanding is that using Interlocked
instructions in the code would ensure such read/write-through, but
merely as a side effect - they also lock the bus for every instruction,
which is not required when concurrent access has already been excluded
by other means.
Conclusion:
We need documentation of the FPC-specific means of cache
synchronization, with their guaranteed effects on every target[1].
Furthermore we need concrete examples[2] of how (and to what extent)
these special instructions/procedures must be used in cases like the
one above. If cache synchronization is such a big issue, then the usage
of related (thread-unaware) objects should be discussed as well, i.e.
how to ensure that their use will cause no trouble, e.g. by
invalidating the entire cache beforehand.
[1] If the effects of the "primitives" vary amongst targets, then
either more specific documentation is required, or higher-level
target-independent procedures with guaranteed behaviour. Possibly
both, so that the experienced coder can use conditional code for the
different handling on various targets.
[2] References to e.g. C code samples are IMO inappropriate, because
the user cannot know what special handling such a different compiler
will apply to its compiled code (see "volatile").
DoDi