<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">

</head>

<body text="#000000" bgcolor="#ffffff">

<pre wrap="">In the thread "Threading support and C library under Linux/Unix", 

On 24 June 2010 10:05, Henry Vermaak <a class="moz-txt-link-rfc2396E"

 href="mailto:henry.vermaak@gmail.com">&lt;henry.vermaak@gmail.com&gt;</a> wrote:

</pre>

Which exposes the user helpers here:<br>

<pre wrap="">

<a class="moz-txt-link-freetext"

 href="http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/kernel/entry-armv.S;h=7ee48e7f8f318a7b453e12849b60a6832bb85770;hb=HEAD#l767">http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/kernel/entry-armv.S;h=7ee48e7f8f318a7b453e12849b60a6832bb85770;hb=HEAD#l767</a>

</pre>

<div class="pre"><a id="l768"

 href="http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/kernel/entry-armv.S;h=7ee48e7f8f318a7b453e12849b60a6832bb85770;hb=HEAD#l768"

 class="linenr"> 768</a>  * User helpers.</div>

<div class="pre"><a id="l769"

 href="http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/kernel/entry-armv.S;h=7ee48e7f8f318a7b453e12849b60a6832bb85770;hb=HEAD#l769"

 class="linenr"> 769</a>  *</div>

<div class="pre"><a id="l770"

 href="http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/kernel/entry-armv.S;h=7ee48e7f8f318a7b453e12849b60a6832bb85770;hb=HEAD#l770"

 class="linenr"> 770</a>

 * These are segment of kernel provided user code reachable from user space</div>

<div class="pre"><a id="l771"

 href="http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/kernel/entry-armv.S;h=7ee48e7f8f318a7b453e12849b60a6832bb85770;hb=HEAD#l771"

 class="linenr"> 771</a>

 * at a fixed address in kernel memory.  This is used to provide user space</div>

<div class="pre"><a id="l772"

 href="http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/kernel/entry-armv.S;h=7ee48e7f8f318a7b453e12849b60a6832bb85770;hb=HEAD#l772"

 class="linenr"> 772</a>

 * with some operations which require kernel help because of unimplemented</div>

<div class="pre"><a id="l773"

 href="http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/kernel/entry-armv.S;h=7ee48e7f8f318a7b453e12849b60a6832bb85770;hb=HEAD#l773"

 class="linenr"> 773</a>

 * native feature and/or instructions in many ARM CPUs. The idea is for</div>

<div class="pre"><a id="l774"

 href="http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/kernel/entry-armv.S;h=7ee48e7f8f318a7b453e12849b60a6832bb85770;hb=HEAD#l774"

 class="linenr"> 774</a>

 * this code to be executed directly in user mode for best efficiency but</div>

<div class="pre"><a id="l775"

 href="http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/kernel/entry-armv.S;h=7ee48e7f8f318a7b453e12849b60a6832bb85770;hb=HEAD#l775"

 class="linenr"> 775</a>

 * which is too intimate with the kernel counter part to be left to user</div>

<div class="pre"><a id="l776"

 href="http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/kernel/entry-armv.S;h=7ee48e7f8f318a7b453e12849b60a6832bb85770;hb=HEAD#l776"

 class="linenr"> 776</a>

 * libraries.  In fact this code might even differ from one CPU to another</div>

<div class="pre"><a id="l777"

 href="http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/kernel/entry-armv.S;h=7ee48e7f8f318a7b453e12849b60a6832bb85770;hb=HEAD#l777"

 class="linenr"> 777</a>

 * depending on the available  instruction set and restrictions like on</div>

<div class="pre"><a id="l778"

 href="http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/kernel/entry-armv.S;h=7ee48e7f8f318a7b453e12849b60a6832bb85770;hb=HEAD#l778"

 class="linenr"> 778</a>

 * SMP systems.  In other words, the kernel reserves the right to change</div>

<div class="pre"><a id="l779"

 href="http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/kernel/entry-armv.S;h=7ee48e7f8f318a7b453e12849b60a6832bb85770;hb=HEAD#l779"

 class="linenr"> 779</a>

 * this code as needed without warning. Only the entry points and their</div>

<div class="pre"><a id="l780"

 href="http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/kernel/entry-armv.S;h=7ee48e7f8f318a7b453e12849b60a6832bb85770;hb=HEAD#l780"

 class="linenr"> 780</a>  * results are guaranteed to be stable.</div>

<br>

Thus the Kernel developers provide us with a nice abstraction that
avoids having to write different code for different ARM sub-archs. <br>

<br>

Moreover, on older ARM sub-archs, some "atomic" operations (called
"interlocked" in the FPC RTL) cannot be done properly in user
space without help from the Kernel (and this help does not affect the
speed of such operations; see the other thread).<br>

<br>

In the RTL, these interlocked operations are used, e.g., in memory
management.<br>

<br>

Since avoiding the libc binding seems to be a good idea, it is suggested
to do threading support directly in the RTL, using Kernel APIs instead
of pthreadlib. Besides creating and managing threads, this includes
inter-thread synchronization such as TCriticalSection. <br>

<br>

Now, if the Kernel supports it (which is true for all archs supported
by FPC except 68K), pthreadlib uses FUTEX (Fast User space muTEX) to
implement pthread_mutex, which performs a lot better than using
one of the Kernel's interprocess synchronization APIs. <br>

<br>

This is quite tricky code (see "Futexes Are Tricky":
<a class="moz-txt-link-freetext" href="http://people.redhat.com/drepper/futex.pdf">http://people.redhat.com/drepper/futex.pdf</a> ), but of course it can be done
in the RTL. It does, however, need certain "atomic" operations. <br>
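<br>
A minimal sketch of such a futex-based mutex, following the three-state
scheme described in the paper linked above (0 = unlocked, 1 = locked,
2 = locked with possible waiters). It is only an illustration under
stated assumptions: C11 atomics stand in for the RTL's interlocked
functions, libc's syscall() is used for brevity (an RTL version would
issue the system call directly), and the names futex_mutex, futex_call,
futex_mutex_lock and futex_mutex_unlock are hypothetical.<br>
<pre wrap="">
```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* state: 0 = unlocked, 1 = locked, 2 = locked, waiters possible */
typedef struct { atomic_int state; } futex_mutex;

/* Thin wrapper around the raw futex system call (hypothetical helper). */
static long futex_call(atomic_int *addr, int op, int val)
{
    return syscall(SYS_futex, (int *)addr, op, val, NULL, NULL, 0);
}

static void futex_mutex_init(futex_mutex *m)
{
    atomic_init(&m->state, 0);
}

static void futex_mutex_lock(futex_mutex *m)
{
    int c = 0;

    /* Fast path: uncontended, one atomic compare-and-swap, no syscall. */
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;

    /* Slow path: c holds the observed state (1 or 2). Mark the mutex
       as contended, then sleep in the kernel until it is released. */
    if (c != 2)
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        /* Sleeps only if the state is still 2 when the kernel checks. */
        futex_call(&m->state, FUTEX_WAIT, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

static void futex_mutex_unlock(futex_mutex *m)
{
    int prev = atomic_fetch_sub(&m->state, 1);

    assert(prev != 0); /* unlocking an unlocked mutex is a caller bug */
    if (prev != 1) {
        /* There may be waiters: release fully and wake one of them. */
        atomic_store(&m->state, 0);
        futex_call(&m->state, FUTEX_WAKE, 1);
    }
}
```
</pre>
In the uncontended case neither lock nor unlock enters the Kernel; only
the atomic compare-and-swap, exchange and decrement are needed, which is
why the quality of the "atomic" primitives matters so much here.<br>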

<br>

Right now, the implementation of "atomic" ("interlocked") functions in
the FPC RTL is not optimal on ARM. <br>

<br>

 - It uses compiler switches to determine whether an older or a newer
ARM sub-arch is used. Of course, it is not always known at compile time
on which sub-arch the program will run, so by default the old
sub-arch is assumed.<br>

<br>

 - The code for the old sub-arch is not very good (and it can't be
good, as Kernel help is necessary here): <br>
.... - it is prone to be slow, as a busy spin lock is used<br>
.... - it will deadlock if the priority of one thread is "realtime"<br>
.... - it might fail, as it is not guaranteed that all ARMv5 chips
implement the atomicity of the swap instruction correctly<br>
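<br>
Regarding the first point, the choice could be made at run time instead
of by compiler switch. The sketch below reads the helper version word
that the Kernel publishes at 0xffff0ffc on 32-bit ARM Linux; according to
the Kernel's user helper documentation, the compare-and-exchange helper
is valid from version 2 on. The names atomic_backend,
kuser_helper_version and select_backend are hypothetical, and on any
other target the version is simply reported as 0 so the sketch still
builds.<br>
<pre wrap="">
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag for the implementation chosen at start-up. */
enum atomic_backend { BACKEND_KUSER_HELPER, BACKEND_FALLBACK };

/* Reads __kuser_helper_version from the ARM vector page.
   Returns 0 on targets that have no user helpers. */
static int kuser_helper_version(void)
{
#if defined(__linux__) && defined(__arm__)
    return *(const int32_t *)0xffff0ffc;
#else
    return 0;
#endif
}

/* The cmpxchg helper is only valid for helper version 2 or later. */
static enum atomic_backend select_backend(int helper_version)
{
    assert(helper_version >= 0);
    return helper_version >= 2 ? BACKEND_KUSER_HELPER : BACKEND_FALLBACK;
}
```
</pre>
A call such as select_backend(kuser_helper_version()) during RTL
initialization would then decide once which code path the interlocked
functions take.<br>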

<br>

I feel that this definitely calls for redoing this part of the RTL
using the Kernel-provided "User Helpers", making this part of the
RTL much more reliable, a lot simpler (just redirecting the function
calls) and free of sub-arch related compiler switches.<br>

<br>

I feel this should be quite an easy task. The fixed addresses should be
found in the .h files of the ARM Kernel interface, and calling a
function at a fixed address should be no problem at all and does not
even seem to require ASM. <br>
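<br>
To illustrate how little is needed, here is a minimal C sketch (not FPC
code) that calls the Kernel's compare-and-exchange helper through a
plain function pointer. The address 0xffff0fc0 and the signature are
taken from the Kernel's user helper documentation; the helper returns
zero if *ptr was changed and non-zero otherwise. The wrapper names
cmpxchg, interlocked_increment and interlocked_exchange are
hypothetical, and on targets other than 32-bit ARM Linux a GCC/Clang
builtin stands in for the helper so the sketch can be built anywhere.<br>
<pre wrap="">
```c
#include <assert.h>

/* Signature of __kuser_cmpxchg: returns 0 if *ptr was changed from
   oldval to newval, non-zero otherwise. */
typedef int (kuser_cmpxchg_fn)(int oldval, int newval, volatile int *ptr);

#if defined(__linux__) && defined(__arm__)
/* Fixed entry point in the ARM vector page; no assembler required. */
#define KUSER_CMPXCHG ((kuser_cmpxchg_fn *)0xffff0fc0)

static int cmpxchg(int oldval, int newval, volatile int *ptr)
{
    return KUSER_CMPXCHG(oldval, newval, ptr);
}
#else
/* Stand-in for other targets (assumption: GCC or Clang builtins). */
static int cmpxchg(int oldval, int newval, volatile int *ptr)
{
    return !__sync_bool_compare_and_swap(ptr, oldval, newval);
}
#endif

/* Atomically adds 1 to *target and returns the new value. */
static int interlocked_increment(volatile int *target)
{
    int old;

    assert(target);
    do {
        old = *target;
    } while (cmpxchg(old, old + 1, target) != 0); /* retry if raced */
    return old + 1;
}

/* Atomically stores value in *target and returns the previous value. */
static int interlocked_exchange(volatile int *target, int value)
{
    int old;

    assert(target);
    do {
        old = *target;
    } while (cmpxchg(old, value, target) != 0); /* retry if raced */
    return old;
}
```
</pre>
The retry loops only repeat when another thread changed the value in
between, so unlike a busy spin lock no thread ever waits for another
one to be scheduled.<br>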

<br>

I will very likely start to use FPC for ARM some day in the future, but
right now we have not even started to define the hardware and design
the PCB the software will be running on. So I can't be of much help
here at the moment.<br>

<br>

So maybe someone who already has the necessary hardware and experience

might want to get this done.... <br>

<br>

Thanks a lot!<br>

-Michael<br>

<br>

<br>


</body>

</html>