devel / comp.lang.c++ / Semaphore thread Flip: 20.000 clock cycles

Subject                                                 Author
* Semaphore thread Flip: 20.000 clock cycles            Bonita Montero
+* Re: Semaphore thread Flip: 20.000 clock cycles       Bo Persson
|`* Re: Semaphore thread Flip: 20.000 clock cycles      Bonita Montero
| `* Re: Semaphore thread Flip: 20.000 clock cycles     Bo Persson
|  `- Re: Semaphore thread Flip: 20.000 clock cycles    Bonita Montero
`* Re: Semaphore thread Flip: 20.000 clock cycles       Pavel
 `* Re: Semaphore thread Flip: 20.000 clock cycles      Bonita Montero
  `* Re: Semaphore thread Flip: 20.000 clock cycles     Chris M. Thomasson
   `* Re: Semaphore thread Flip: 20.000 clock cycles    Bonita Montero
    `- Re: Semaphore thread Flip: 20.000 clock cycles   Chris M. Thomasson

Semaphore thread Flip: 20.000 clock cycles

<u6kmk8$19q46$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=440&group=comp.lang.c%2B%2B#440
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Bonita.Montero@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Semaphore thread Flip: 20.000 clock cycles
Date: Sat, 17 Jun 2023 18:22:35 +0200
Organization: A noiseless patient Spider
Lines: 89
Message-ID: <u6kmk8$19q46$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 17 Jun 2023 16:22:32 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="70ae5adec0be00e54b241ae7ebc9e0a2";
logging-data="1370246"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+MwmNeCwUT+zL5cG42a6uENRaK7KbsvF0="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.12.0
Cancel-Lock: sha1:n/X/SakNlNnn4FTee24KoXT5G04=
Content-Language: de-DE
 by: Bonita Montero - Sat, 17 Jun 2023 16:22 UTC

I wanted to test how much time it takes for a thread to signal
a semaphore to another thread and to wait to be signalled back.
That's essential for mutexes when they're contended. I tested
this under Windows 11 with a Ryzen 9 7950X system.
I tested different combinations of logical cores. The first
thread is always pinned to the first core and the second thread's
core varies. I print the x2APIC ID along with the result.
The fastest result I get is about 20,000 clock cycles for one
switch to the other thread. I think that's enormous.
A similar benchmark written for Linux using POSIX semaphores
gives about 8,000 clock cycles per switch on a 3990X system.
That's a huge difference, all the more so because the 3990X is
a Zen 2 CPU with a much lower clock rate than the Zen 4 7950X.

#include <Windows.h>
#include <intrin.h>
#include <atomic>
#include <chrono>
#include <cstdint>
#include <iostream>
#include <latch>
#include <thread>

using namespace std;
using namespace chrono;

int main()
{
    // print a message and terminate if a call failed
    static auto errTerm = []( bool succ, char const *what )
    {
        if( succ )
            return;
        cerr << what << endl;
        terminate();
    };
    // CPUID leaf 0xB is needed to read the x2APIC ID
    int regs[4];
    __cpuid( regs, 0 );
    errTerm( (unsigned)regs[0] >= 0xB, "max CPUID below 0xB" );
    bool fPrio = SetPriorityClass( GetCurrentProcess(), REALTIME_PRIORITY_CLASS )
        || SetPriorityClass( GetCurrentProcess(), HIGH_PRIORITY_CLASS );
    errTerm( fPrio, "can't set process priority class" );
    unsigned nCPUs = jthread::hardware_concurrency();
    // thread A stays on logical CPU 0, thread B visits every other logical CPU
    for( unsigned cpuB = 1; cpuB != nCPUs; ++cpuB )
    {
        auto init = []( HANDLE &hSem, bool set )
        {
            hSem = CreateSemaphoreA( nullptr, set, 1, nullptr );
            errTerm( hSem != nullptr, "can't create semaphore" );
        };
        HANDLE hSemA, hSemB;
        init( hSemA, false );
        init( hSemB, true ); // B starts signalled, so thread B runs first
        atomic_int64_t tSum( 0 );
        latch latRun( 3 ); // both threads plus main, released once affinity is set
        auto flipThread = [&]( HANDLE hSemMe, HANDLE hSemYou, size_t n,
            uint32_t *pX2ApicId )
        {
            latRun.arrive_and_wait();
            auto start = high_resolution_clock::now();
            for( ; n--; )
            {
                errTerm( WaitForSingleObject( hSemMe, INFINITE ) == WAIT_OBJECT_0,
                    "can't wait for semaphore" );
                errTerm( ReleaseSemaphore( hSemYou, 1, nullptr ),
                    "can't post semaphore" );
            }
            tSum.fetch_add( duration_cast<nanoseconds>(
                high_resolution_clock::now() - start ).count(),
                memory_order::relaxed );
            if( !pX2ApicId )
                return;
            int regs[4];
            __cpuidex( regs, 0xB, 0 );
            *pX2ApicId = regs[3]; // EDX: x2APIC ID of the current logical CPU
        };
        constexpr size_t ROUNDS = 10'000;
        uint32_t x2ApicId;
        jthread
            thrA( flipThread, hSemA, hSemB, ROUNDS, nullptr ),
            thrB( flipThread, hSemB, hSemA, ROUNDS, &x2ApicId );
        errTerm( SetThreadAffinityMask( thrA.native_handle(), 1 ),
            "can't set CPU affinity" );
        errTerm( SetThreadAffinityMask( thrB.native_handle(), (DWORD_PTR)1 << cpuB ),
            "can't set CPU affinity" );
        latRun.arrive_and_wait();
        thrA.join();
        thrB.join();
        CloseHandle( hSemA ); // don't leak two handles per iteration
        CloseHandle( hSemB );
        // average nanoseconds per loop iteration (one wait plus one release)
        cout << x2ApicId << ": "
            << (double)tSum.load( memory_order::relaxed ) / (2.0 * ROUNDS) << endl;
    }
}
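
For comparison, the Linux side of the measurement could look roughly like the sketch below. It is not the benchmark that produced the 3990X figure, which was not posted. The name flipNs, the use of std::thread, and the omission of CPU pinning and priority changes are all assumptions made to keep it short. It reports the same kind of number as the Windows program: average nanoseconds per loop iteration.

```cpp
#include <semaphore.h>
#include <cerrno>
#include <chrono>
#include <cstddef>
#include <thread>

// Wait on a POSIX semaphore, retrying if a signal interrupts the call.
static void semWait(sem_t *s) {
    while (sem_wait(s) == -1 && errno == EINTR) {}
}

// Two threads pass a token back and forth through two POSIX semaphores.
// Returns the average time per loop iteration (one wait plus one post) in
// nanoseconds. Thread start-up is included in the timing, so use enough
// rounds to amortise it. Error checking is left out for brevity.
double flipNs(std::size_t rounds) {
    sem_t semA, semB;
    sem_init(&semA, 0, 0);  // A starts unsignalled
    sem_init(&semB, 0, 1);  // B starts signalled, so thread B runs first
    auto flip = [rounds](sem_t *me, sem_t *you) {
        for (std::size_t n = rounds; n--; ) {
            semWait(me);
            sem_post(you);
        }
    };
    auto start = std::chrono::steady_clock::now();
    std::thread thrA(flip, &semA, &semB);
    std::thread thrB(flip, &semB, &semA);
    thrA.join();
    thrB.join();
    auto elapsed = std::chrono::steady_clock::now() - start;
    sem_destroy(&semA);
    sem_destroy(&semB);
    return std::chrono::duration<double, std::nano>(elapsed).count() / rounds;
}
```

Calling flipNs(10000) gives one number per run. Without pinning, the scheduler is free to move both threads around, so results will vary more than in the pinned Windows version.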

Re: Semaphore thread Flip: 20.000 clock cycles

<kf65r7Fm7c2U1@mid.individual.net>

https://www.rocksolidbbs.com/devel/article-flat.php?id=441&group=comp.lang.c%2B%2B#441
Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: bo@bo-persson.se (Bo Persson)
Newsgroups: comp.lang.c++
Subject: Re: Semaphore thread Flip: 20.000 clock cycles
Date: Sat, 17 Jun 2023 18:37:58 +0200
Lines: 99
Message-ID: <kf65r7Fm7c2U1@mid.individual.net>
References: <u6kmk8$19q46$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Trace: individual.net 8vgwm5RWfA6iXxIh8xvEYAQAc8R5zLhSJfJTbXplMFQtTFrYdE
Cancel-Lock: sha1:Bl5vkQN0qAVuktlM8QInB/1YvXA=
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.12.0
Content-Language: sv
In-Reply-To: <u6kmk8$19q46$1@dont-email.me>
 by: Bo Persson - Sat, 17 Jun 2023 16:37 UTC

On 2023-06-17 at 18:22, Bonita Montero wrote:
> I wanted to test how much time it takes for a thread to signal
> a semaphore to another thread and to wait to be signalled back.
> That's essential for mutexes when they're contended. I tested
> this under Windows 11 with a Ryzen 9 7950X system.
> I tested different combinations of logical cores. The first
> thread is always pinned to the first core and the second thread's
> core varies. I print the x2APIC ID along with the result.
> The fastest result I get is about 20,000 clock cycles for one
> switch to the other thread. I think that's enormous.
> A similar benchmark written for Linux using POSIX semaphores
> gives about 8,000 clock cycles per switch on a 3990X system.
> That's a huge difference, all the more so because the 3990X is
> a Zen 2 CPU with a much lower clock rate than the Zen 4 7950X.

I have Windows 10 with a Core i9 9900K, and get results between 7500 and
8000.

So is it the Windows version or the CPU model that is most important?


Re: Semaphore thread Flip: 20.000 clock cycles

<u6kon8$1a545$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=442&group=comp.lang.c%2B%2B#442
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Bonita.Montero@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Semaphore thread Flip: 20.000 clock cycles
Date: Sat, 17 Jun 2023 18:58:18 +0200
Organization: A noiseless patient Spider
Lines: 7
Message-ID: <u6kon8$1a545$1@dont-email.me>
References: <u6kmk8$19q46$1@dont-email.me> <kf65r7Fm7c2U1@mid.individual.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 17 Jun 2023 16:58:16 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="70ae5adec0be00e54b241ae7ebc9e0a2";
logging-data="1381509"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18AO3wyDHu+NEurVgjHxnqYDX8ZeAc8c10="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.12.0
Cancel-Lock: sha1:k8LpV5aZxuXoJTyZKqoth5F5+qI=
In-Reply-To: <kf65r7Fm7c2U1@mid.individual.net>
Content-Language: de-DE
 by: Bonita Montero - Sat, 17 Jun 2023 16:58 UTC

On 17.06.2023 at 18:37, Bo Persson wrote:

> I have Windows 10 with a Core i9 9900K, and get results between 7500 and
> 8000.

That's the number of nanoseconds for each switch.
You must check the current clock rate.

Re: Semaphore thread Flip: 20.000 clock cycles

<kf69bgFmpj7U1@mid.individual.net>

https://www.rocksolidbbs.com/devel/article-flat.php?id=443&group=comp.lang.c%2B%2B#443
Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: bo@bo-persson.se (Bo Persson)
Newsgroups: comp.lang.c++
Subject: Re: Semaphore thread Flip: 20.000 clock cycles
Date: Sat, 17 Jun 2023 19:37:51 +0200
Lines: 14
Message-ID: <kf69bgFmpj7U1@mid.individual.net>
References: <u6kmk8$19q46$1@dont-email.me> <kf65r7Fm7c2U1@mid.individual.net>
<u6kon8$1a545$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Trace: individual.net +wcNJ0OK33QJqGtm5grKcQJJyGdJ3tEQPNe23AysYMoRKb5KUU
Cancel-Lock: sha1:UxwQpL3dcC/BPl7mQRDmdeS7oIo=
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.12.0
Content-Language: sv
In-Reply-To: <u6kon8$1a545$1@dont-email.me>
 by: Bo Persson - Sat, 17 Jun 2023 17:37 UTC

On 2023-06-17 at 18:58, Bonita Montero wrote:
> On 17.06.2023 at 18:37, Bo Persson wrote:
>
>> I have Windows 10 with a Core i9 9900K, and get results between 7500
>> and 8000.
>
> That's the number of nanoseconds for each switch.
> You must check the current clock rate.

OK, so if running at 5 GHz, you multiply by 5?

Then it goes from good to really bad! :-)
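
The conversion being discussed is just nanoseconds times clock rate in GHz, since 1 GHz is one cycle per nanosecond. A minimal sketch, assuming the core holds a constant clock for the whole run (nsToCycles is a made-up helper name, and 5 GHz is only the hypothetical figure from the question above):

```cpp
// Convert a duration in nanoseconds to CPU clock cycles.
// Assumes the clock rate stays constant at `ghz` for the whole measurement,
// which is not guaranteed on CPUs that boost or throttle.
double nsToCycles(double ns, double ghz) {
    return ns * ghz;  // 1 GHz == 1 cycle per nanosecond
}
```

With the 7500 ns reading and an assumed 5 GHz, nsToCycles(7500.0, 5.0) gives 37500 cycles.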

Re: Semaphore thread Flip: 20.000 clock cycles

<u6krq8$1aib5$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=444&group=comp.lang.c%2B%2B#444
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Bonita.Montero@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Semaphore thread Flip: 20.000 clock cycles
Date: Sat, 17 Jun 2023 19:51:06 +0200
Organization: A noiseless patient Spider
Lines: 18
Message-ID: <u6krq8$1aib5$1@dont-email.me>
References: <u6kmk8$19q46$1@dont-email.me> <kf65r7Fm7c2U1@mid.individual.net>
<u6kon8$1a545$1@dont-email.me> <kf69bgFmpj7U1@mid.individual.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 17 Jun 2023 17:51:04 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="70ae5adec0be00e54b241ae7ebc9e0a2";
logging-data="1395045"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/eD7nq/nWlV1KmGPR8CBUKj58K4cLQwWo="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.12.0
Cancel-Lock: sha1:T0F9ME1kUPKt/4xE2n7Wy+wsiac=
In-Reply-To: <kf69bgFmpj7U1@mid.individual.net>
Content-Language: de-DE
 by: Bonita Montero - Sat, 17 Jun 2023 17:51 UTC

On 17.06.2023 at 19:37, Bo Persson wrote:
> On 2023-06-17 at 18:58, Bonita Montero wrote:
>> On 17.06.2023 at 18:37, Bo Persson wrote:
>>
>>> I have Windows 10 with a Core i9 9900K, and get results between 7500
>>> and 8000.
>>
>> That's the number of nanoseconds for each switch.
>> You must check the current clock rate.
>
>
> OK, so if running at 5 GHz, you multiply by 5?
>
> Then it goes from good to really bad!  :-)

I used CoreTemp to watch how the core clocks
behave while that code is running.

Re: Semaphore thread Flip: 20.000 clock cycles

<_WmjM.27743$8uge.25185@fx14.iad>

https://www.rocksolidbbs.com/devel/article-flat.php?id=447&group=comp.lang.c%2B%2B#447
Path: i2pn2.org!i2pn.org!news.swapon.de!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx14.iad.POSTED!not-for-mail
Subject: Re: Semaphore thread Flip: 20.000 clock cycles
Newsgroups: comp.lang.c++
References: <u6kmk8$19q46$1@dont-email.me>
From: pauldontspamtolk@removeyourself.dontspam.yahoo (Pavel)
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.15
MIME-Version: 1.0
In-Reply-To: <u6kmk8$19q46$1@dont-email.me>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Lines: 98
Message-ID: <_WmjM.27743$8uge.25185@fx14.iad>
X-Complaints-To: https://www.astraweb.com/aup
NNTP-Posting-Date: Sat, 17 Jun 2023 18:28:10 UTC
Date: Sat, 17 Jun 2023 14:28:01 -0400
X-Received-Bytes: 5060
 by: Pavel - Sat, 17 Jun 2023 18:28 UTC

Bonita Montero wrote:
> I wanted to test how much time it takes for a thread to signal
> a semaphore to another thread and to wait to be signalled back.
> That's essential for mutexes when they're contended. I tested
> this under Windows 11 with a Ryzen 9 7950X system.
> I tested different combinations of logical cores. The first
> thread is always pinned to the first core and the second thread's
> core varies. I print the x2APIC ID along with the result.
> The fastest result I get is about 20,000 clock cycles for one
> switch to the other thread. I think that's enormous.
> A similar benchmark written for Linux using POSIX semaphores
> gives about 8,000 clock cycles per switch on a 3990X system.
> That's a huge difference, all the more so because the 3990X is
> a Zen 2 CPU with a much lower clock rate than the Zen 4 7950X.

Comparing the performance of POSIX mutexes and Windows semaphores for
thread signalling is apples to oranges. A Windows critical section is the
closest analogue to the POSIX inter-thread recursive non-robust mutex. I
don't think there is a close analogue to a non-recursive non-robust
mutex, which is potentially the fastest.
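
To make the mutex kinds named above concrete, here is a small sketch of how they are selected through POSIX mutex attributes. The helper names initMutex and relockSucceeds are invented for illustration. The only behaviour shown is that a recursive mutex can be re-taken by its owner while a normal one cannot.

```cpp
#include <pthread.h>

// Initialise *m as a process-private (inter-thread), non-robust mutex of the
// given type: PTHREAD_MUTEX_RECURSIVE or PTHREAD_MUTEX_NORMAL.
// Returns 0 on success, otherwise the pthread error code.
int initMutex(pthread_mutex_t *m, int type) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_mutexattr_settype(&attr, type);
    if (rc == 0) rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_PRIVATE);
    if (rc == 0) rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_STALLED);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

// Returns true if the thread that already owns the mutex can take it again.
bool relockSucceeds(int type) {
    pthread_mutex_t m;
    if (initMutex(&m, type) != 0) return false;
    pthread_mutex_lock(&m);
    // trylock never blocks: 0 for a recursive mutex, EBUSY for a normal one
    int rc = pthread_mutex_trylock(&m);
    if (rc == 0) pthread_mutex_unlock(&m);  // undo the second acquisition
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc == 0;
}
```

relockSucceeds(PTHREAD_MUTEX_RECURSIVE) is true and relockSucceeds(PTHREAD_MUTEX_NORMAL) is false. The recursive kind has to track an owner and a count, which is the extra bookkeeping the non-recursive kind avoids.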


Re: Semaphore thread Flip: 20.000 clock cycles

<u6m2b8$1iues$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=455&group=comp.lang.c%2B%2B#455
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Bonita.Montero@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Semaphore thread Flip: 20.000 clock cycles
Date: Sun, 18 Jun 2023 06:48:43 +0200
Organization: A noiseless patient Spider
Lines: 8
Message-ID: <u6m2b8$1iues$1@dont-email.me>
References: <u6kmk8$19q46$1@dont-email.me> <_WmjM.27743$8uge.25185@fx14.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 18 Jun 2023 04:48:40 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="66df86f3028653baf0f76f4b60dd598f";
logging-data="1669596"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+JLYRHMMx1dvjsjb5CZXQq1zGO4p4/zAI="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.12.0
Cancel-Lock: sha1:5du/D7CgCqovdQHyf9Ewgdt8tkI=
In-Reply-To: <_WmjM.27743$8uge.25185@fx14.iad>
Content-Language: de-DE
 by: Bonita Montero - Sun, 18 Jun 2023 04:48 UTC

On 17.06.2023 at 20:28, Pavel wrote:

> Comparing the performance of POSIX mutexes and Windows semaphores for
> thread signalling is apples to oranges. A Windows critical section is the
> closest analogue to the POSIX inter-thread recursive non-robust mutex.

I'm comparing semaphores vs. semaphores, not mutexes vs. mutexes.
That's relevant when a mutex is contended and you don't spin.

Re: Semaphore thread Flip: 20.000 clock cycles

<u6nmta$1o2f6$2@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=461&group=comp.lang.c%2B%2B#461
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m.thomasson.1@gmail.com (Chris M. Thomasson)
Newsgroups: comp.lang.c++
Subject: Re: Semaphore thread Flip: 20.000 clock cycles
Date: Sun, 18 Jun 2023 12:45:45 -0700
Organization: A noiseless patient Spider
Lines: 20
Message-ID: <u6nmta$1o2f6$2@dont-email.me>
References: <u6kmk8$19q46$1@dont-email.me> <_WmjM.27743$8uge.25185@fx14.iad>
<u6m2b8$1iues$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 18 Jun 2023 19:45:46 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="4a804301eb3b625a6f946fb08b3c4c15";
logging-data="1837542"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/Kx1x+8+aPj9COj0jD3/SacB6DmO0VDYE="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.12.0
Cancel-Lock: sha1:vj6YMQG3uCGh51irLCknpoVGZbg=
In-Reply-To: <u6m2b8$1iues$1@dont-email.me>
Content-Language: en-US
 by: Chris M. Thomasson - Sun, 18 Jun 2023 19:45 UTC

On 6/17/2023 9:48 PM, Bonita Montero wrote:
> On 17.06.2023 at 20:28, Pavel wrote:
>
>> Comparing the performance of POSIX mutexes and Windows semaphores for
>> thread signalling is apples to oranges. A Windows critical section is
>> the closest analogue to the POSIX inter-thread recursive non-robust
>> mutex.
>
> I'm comparing semaphores vs. semaphores, not mutexes vs. mutexes.
> That's relevant when a mutex is contended and you don't spin.

Have you heard of an adaptive mutex that spins on a contended state of a
mutex for a finite number of times before it resorts to blocking? There
are "fast" semaphores as well. Take a look at the benaphore:

https://www.haiku-os.org/legacy-docs/benewsletter/Issue1-26.html#Engineering1-26

The benaphore helps things out by skipping calls into the kernel.
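
The idea behind the benaphore in the linked newsletter can be sketched as an atomic counter in front of a kernel semaphore, so the semaphore is only touched when there is contention. The version below is a rough illustration, not the BeOS code: it uses a POSIX sem_t rather than the Windows semaphores used in this thread, and Benaphore and benaphoreDemo are names invented here.

```cpp
#include <semaphore.h>
#include <atomic>
#include <cerrno>
#include <thread>
#include <vector>

// Mutex-like lock: an atomic counter decides whether the semaphore is needed.
class Benaphore {
    std::atomic<int> count_{0};  // number of threads holding or wanting the lock
    sem_t sem_;                  // used only when a thread has to sleep
public:
    Benaphore() { sem_init(&sem_, 0, 0); }
    ~Benaphore() { sem_destroy(&sem_); }
    void lock() {
        // Previous value 0 means nobody held the lock: no kernel call at all.
        if (count_.fetch_add(1) > 0)
            while (sem_wait(&sem_) == -1 && errno == EINTR) {}
    }
    void unlock() {
        // Previous value above 1 means someone is waiting: wake exactly one.
        if (count_.fetch_sub(1) > 1)
            sem_post(&sem_);
    }
};

// Demo: several threads bump a plain counter under the lock.
// Returns the final counter, which should equal threads * iters.
long benaphoreDemo(int threads, int iters) {
    Benaphore lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    for (auto &th : pool) th.join();
    return counter;
}
```

In the uncontended case lock() and unlock() are one atomic add each. Only when two threads overlap does the code fall through to sem_wait and sem_post, which is where the expensive thread switch measured in this thread happens.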

Re: Semaphore thread Flip: 20.000 clock cycles

<u6og38$1v906$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=463&group=comp.lang.c%2B%2B#463
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Bonita.Montero@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Semaphore thread Flip: 20.000 clock cycles
Date: Mon, 19 Jun 2023 04:55:40 +0200
Organization: A noiseless patient Spider
Lines: 11
Message-ID: <u6og38$1v906$1@dont-email.me>
References: <u6kmk8$19q46$1@dont-email.me> <_WmjM.27743$8uge.25185@fx14.iad>
<u6m2b8$1iues$1@dont-email.me> <u6nmta$1o2f6$2@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 19 Jun 2023 02:55:36 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="3684d5eb37dfa19f4cda13045b7995ff";
logging-data="2073606"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/2HtnK2UePvLZh5g1J+ZZCnRUOiGb36YU="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.12.0
Cancel-Lock: sha1:8d9Ee6nL2kSAubryUwUwZrflSaU=
Content-Language: de-DE
In-Reply-To: <u6nmta$1o2f6$2@dont-email.me>
 by: Bonita Montero - Mon, 19 Jun 2023 02:55 UTC

On 18.06.2023 at 21:45, Chris M. Thomasson wrote:

> Have you heard of an adaptive mutex that spins on a contended state of a
> mutex for a finite number of times before it resorts to blocking? There
> are "fast" semaphores as well. Take a look at the benaphore:

I've written that before myself, using the spin count algorithm from
glibc (max = before * 2 + 10). Spinning only makes sense if the mutex
is held for a very short time and there's a high degree of contention.
That's not often the case.
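
A spin-then-block mutex built around that formula might look like the sketch below. Only max = before * 2 + 10 comes from the post. The cap of 100 spins, the divide-by-8 smoothing of the spin estimate, the use of std::mutex as the blocking fallback, and the names AdaptiveMutex and adaptiveDemo are all assumptions made for illustration, not the code referred to above.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

class AdaptiveMutex {
    static constexpr int kSpinCap = 100;  // assumed upper bound on spinning
    std::mutex m_;                        // blocking fallback
    std::atomic<int> spins_{0};           // running estimate of useful spins
public:
    void lock() {
        if (m_.try_lock()) return;        // uncontended fast path
        int before = spins_.load(std::memory_order_relaxed);
        int maxSpins = before * 2 + 10;   // formula quoted in the post
        if (maxSpins > kSpinCap) maxSpins = kSpinCap;
        int cnt = 0;
        for (;;) {
            if (cnt++ >= maxSpins) {      // spun long enough: block instead
                m_.lock();
                break;
            }
            if (m_.try_lock()) break;     // got it while spinning
        }
        // Move the estimate a little towards what this acquisition needed.
        spins_.store(before + (cnt - before) / 8, std::memory_order_relaxed);
    }
    void unlock() { m_.unlock(); }
};

// Demo: several threads bump a plain counter under the lock.
// Returns the final counter, which should equal threads * iters.
long adaptiveDemo(int threads, int iters) {
    AdaptiveMutex mtx;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                std::lock_guard<AdaptiveMutex> guard(mtx);
                ++counter;
            }
        });
    for (auto &th : pool) th.join();
    return counter;
}
```

The spin budget grows when spinning recently paid off and shrinks when it did not, so a mutex that is usually held for long periods quickly degrades to plain blocking.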

Re: Semaphore thread Flip: 20.000 clock cycles

<u6qsbu$273bq$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=471&group=comp.lang.c%2B%2B#471
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m.thomasson.1@gmail.com (Chris M. Thomasson)
Newsgroups: comp.lang.c++
Subject: Re: Semaphore thread Flip: 20.000 clock cycles
Date: Mon, 19 Jun 2023 17:37:15 -0700
Organization: A noiseless patient Spider
Lines: 29
Message-ID: <u6qsbu$273bq$1@dont-email.me>
References: <u6kmk8$19q46$1@dont-email.me> <_WmjM.27743$8uge.25185@fx14.iad>
<u6m2b8$1iues$1@dont-email.me> <u6nmta$1o2f6$2@dont-email.me>
<u6og38$1v906$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 20 Jun 2023 00:37:18 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="acf1b0791dc941086b9a0d988d3109cf";
logging-data="2329978"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19TGFnyEcX0xdFWzmUGwD+Q/cd4E7Gf+Wg="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.12.0
Cancel-Lock: sha1:fVhAU3luU4Lhr12ETTfc4XdoAaQ=
In-Reply-To: <u6og38$1v906$1@dont-email.me>
Content-Language: en-US
 by: Chris M. Thomasson - Tue, 20 Jun 2023 00:37 UTC

On 6/18/2023 7:55 PM, Bonita Montero wrote:
> On 18.06.2023 at 21:45, Chris M. Thomasson wrote:
>
>> Have you heard of an adaptive mutex that spins on a contended state of
>> a mutex for a finite number of times before it resorts to blocking?
>> There are "fast" semaphores as well. Take a look at the benaphore:
>
> I've written that before myself, using the spin count algorithm from
> glibc (max = before * 2 + 10). Spinning only makes sense if the mutex
> is held for a very short time and there's a high degree of contention.
> That's not often the case.

I have "seen things":

https://youtu.be/QefqJ7YhbWQ?t=117

;^D Seriously... Nasty deadlocks, and horrible things! ;^o

Mainly back when I used to work on this kind of stuff quite a bit, well
over a decade ago. I have seen some code where some simple logic changes
ended up making some mutexes go quite "hot", contended, and become
actual bottlenecks... Luckily, a decent amount of those had critical
sections that were able to be made lock-free.

One can also spin on semaphore logic: _if_ the semaphore is going
to have to block in the kernel, we can detect that and spin a couple of
times on a CAS instead of issuing XADDs for the counting logic.

Benaphores are pretty damn nice! :^)
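
One way to read the CAS-spinning idea above is sketched here: before committing to the XADD that would send the thread into the kernel, retry a few times with a compare-and-swap that only succeeds while the count is positive. This is an interpretation, not code from the thread. SpinSemaphore, spinSemDemo, and the spin limit of 64 are invented, and a POSIX sem_t stands in for the kernel semaphore.

```cpp
#include <semaphore.h>
#include <atomic>
#include <cerrno>
#include <thread>
#include <vector>

class SpinSemaphore {
    static constexpr int kSpins = 64;  // assumed spin limit
    std::atomic<int> count_;           // negative value = number of sleepers
    sem_t sem_;                        // kernel semaphore for the slow path
public:
    explicit SpinSemaphore(int initial) : count_(initial) { sem_init(&sem_, 0, 0); }
    ~SpinSemaphore() { sem_destroy(&sem_); }
    void acquire() {
        // Spin phase: CAS only when the count is positive, so a failed
        // attempt never registers this thread as a sleeper.
        for (int i = 0; i < kSpins; ++i) {
            int c = count_.load(std::memory_order_relaxed);
            if (c > 0 && count_.compare_exchange_weak(c, c - 1))
                return;
        }
        // Slow path: the XADD. A previous value <= 0 means we must sleep.
        if (count_.fetch_sub(1) <= 0)
            while (sem_wait(&sem_) == -1 && errno == EINTR) {}
    }
    void release() {
        // A negative previous value means at least one thread is asleep.
        if (count_.fetch_add(1) < 0)
            sem_post(&sem_);
    }
};

// Demo: use the semaphore with an initial count of 1 as a lock.
// Returns the final counter, which should equal threads * iters.
long spinSemDemo(int threads, int iters) {
    SpinSemaphore sem(1);
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t)
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                sem.acquire();
                ++counter;
                sem.release();
            }
        });
    for (auto &th : pool) th.join();
    return counter;
}
```

The difference from a plain fast semaphore is the spin loop: an unconditional XADD always decrements and so commits the thread to sleeping when the count was not positive, whereas the CAS leaves the count untouched until it can actually succeed.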
