devel / comp.lang.c++ / Scanning memory forward vs. reverse

Subject  Author
* Scanning memory forward vs. reverse  Bonita Montero
+* Re: Scanning memory forward vs. reverse  Marcel Mueller
|`* Re: Scanning memory forward vs. reverse  Bonita Montero
| `* Re: Scanning memory forward vs. reverse  Chris M. Thomasson
|  `* Re: Scanning memory forward vs. reverse  Bonita Montero
|   `* Re: Scanning memory forward vs. reverse  Chris M. Thomasson
|    `- Re: Scanning memory forward vs. reverse  Bonita Montero
`- Re: Scanning memory forward vs. reverse  Vir Campestris

Scanning memory forward vs. reverse

<up59n6$3ruj1$1@raubtier-asyl.eternal-september.org>

 by: Bonita Montero - Sun, 28 Jan 2024 10:19 UTC

In my thread pool there's some code that searches a deque in reverse
order, using find_if with reverse iterators, to check whether a given
"this" pointer is inside the deque. I could also have scanned the deque
in forward order, but a matching element is more likely to be found when
searching from the back first.
A deque usually consists of a number of contiguous blocks in memory.
This led me to the question of whether scanning memory is faster forward
or backward. I tried to test this with the program below:

#include <atomic>
#include <chrono>
#include <cstddef>
#include <iostream>
#include <vector>

using namespace std;
using namespace chrono;

// Sink for the checksum so the compiler cannot optimize the scan away.
atomic_char aSum;

int main()
{
    constexpr size_t GB = 1ull << 30;
    vector<char> vc( GB ); // 1 GiB buffer, value-initialized to zero
    // Reads every step-th element from begin towards end and prints the
    // elapsed time in milliseconds.
    auto scan = []( auto begin, auto end, ptrdiff_t step )
    {
        auto start = high_resolution_clock::now();
        char sum = 0;
        for( auto p = begin; end - p >= step; sum += *p, p += step );
        ::aSum.store( sum, memory_order_relaxed );
        cout << duration_cast<nanoseconds>( high_resolution_clock::now() -
            start ).count() / 1.0e6 << "ms" << endl;
    };
    constexpr ptrdiff_t STEP = 100;
    scan( vc.begin(), vc.end(), STEP );   // forward
    scan( vc.rbegin(), vc.rend(), STEP ); // reverse
}

On my Windows 7050X Zen4 computer, scanning memory has the same speed
in both directions. On my Linux 3990X Zen2 computer, scanning forward
is 22% faster. On my small Linux PC, an HP EliteDesk Mini PC with a
Skylake Pentium G4400, scanning memory forward is about 38% faster.
At first I would have guessed that the prefetchers between the memory
levels are equally effective in both directions. So I'd like to see some
results from you.
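
The reverse lookup described at the top of this post could look roughly
like the following. This is only a minimal sketch: the post does not show
the real thread pool code, so the Worker type and the function name
contains_from_back are hypothetical stand-ins.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>

// Hypothetical stand-in for the thread pool's worker type; the original
// post does not show the real class.
struct Worker { int id; };

// Returns true if `self` is stored in `workers`. The deque is scanned from
// the back because, as assumed in the post, a match is more likely there.
bool contains_from_back( std::deque<Worker *> const &workers, Worker const *self )
{
    assert( self != nullptr );
    auto it = std::find_if( workers.rbegin(), workers.rend(),
        [self]( Worker const *w ) { return w == self; } );
    return it != workers.rend();
}
```

Inside a member function the call would be something like
contains_from_back( workers, this ). Swapping rbegin()/rend() for
begin()/end() gives the forward variant with the same result, so only the
scan direction changes.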

Re: Scanning memory forward vs. reverse

<up5ago$3ev6k$4@gwaiyur.mb-net.net>

 by: Marcel Mueller - Sun, 28 Jan 2024 10:32 UTC

On 28.01.24 at 11:19, Bonita Montero wrote:
> A deque usually consists of a number of linear parts in memory. This
> led me to the question if scanning memory is faster forward or
> backward. I tried to test this with the below program:

> On my Windows 7050X Zen4 computer scanning memory in both directions
> has the same speed. On my Linux 3990X Zen2 computer scanning forward
> is 22% faster. On my small Linux PC, a HP EliteDesk Mini PC with a
> Skylake Pentium G4400 scanning memory forward is about 38% faster.

> I'd first have guessed that the prefetchers between the memory-levels
> are as effective for both directions. So I'd like to see some results
> from you.

Reverse memory access is typically slower simply because the last data
of a cache line (after a cache miss) arrives last. If you read forward,
processing continues as soon as the first few bytes of the cache line
have been read; the remaining data is read in parallel.

But the details depend on many other factors, above all the placement of
the memory chunks and the prefetching technique used (if any).

Marcel
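
To make the "first bytes vs. last bytes of a cache line" argument
concrete, here is a small simulation sketch. It measures nothing; it only
records which in-line offset a byte-wise scan touches first in each cache
line. The 64-byte line size, the line-aligned buffer and the name
first_touch_offsets are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t LINE = 64; // assumed cache-line size in bytes

// Simulates a byte-wise scan over `lines` line-aligned cache lines and
// returns, per line, the in-line offset of the first byte touched.
std::vector<std::size_t> first_touch_offsets( std::size_t lines, bool forward )
{
    assert( lines > 0 );
    std::size_t const n = lines * LINE;
    std::vector<std::size_t> first( lines, LINE ); // LINE means "not touched yet"
    for( std::size_t k = 0; k != n; ++k )
    {
        std::size_t i = forward ? k : n - 1 - k; // byte index in scan order
        std::size_t line = i / LINE;
        if( first[line] == LINE )
            first[line] = i % LINE;
    }
    return first;
}
```

A forward scan enters every line at offset 0 and a reverse scan at offset
63, which is the case the argument above is about. Whether the hardware
really delivers the line front to back is a property of the memory system
and is not something this sketch can show.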

Re: Scanning memory forward vs. reverse

<up5jje$3u7bh$1@raubtier-asyl.eternal-september.org>

 by: Bonita Montero - Sun, 28 Jan 2024 13:07 UTC

On 28.01.2024 at 11:32, Marcel Mueller wrote:

> Reverse memory access is typically slower simply because the
> last data of a cache line (after a cache miss) arrives last.

I tested this, and I get the same timing for every offset within a
cache line on all three of my computers:

#include <atomic>
#include <chrono>
#include <cstddef>
#include <iostream>
#include <new> // hardware_constructive_interference_size
#include <vector>

using namespace std;
using namespace chrono;

#if defined(__cpp_lib_hardware_interference_size)
constexpr size_t CL_SIZE = hardware_constructive_interference_size;
#else
constexpr size_t CL_SIZE = 64;
#endif

// Sink for the checksum so the compiler cannot optimize the reads away.
atomic_char aSum;

int main()
{
    constexpr size_t
        BLOCK_SIZE = 1ull << 20,
        BLOCKS = BLOCK_SIZE / CL_SIZE,
        ROUNDS = 1000;
    vector<char> block( BLOCK_SIZE );
    // For each offset, read one byte every CL_SIZE bytes and print the
    // average time per read in nanoseconds.
    for( size_t offset = 0; offset != CL_SIZE; ++offset )
    {
        auto start = high_resolution_clock::now();
        char sum = 0;
        for( size_t round = ROUNDS; round--; )
            for( size_t i = offset; i < BLOCK_SIZE; sum += block[i], i += CL_SIZE );
        ::aSum.store( sum, memory_order_relaxed );
        cout << offset << ": " <<
            duration_cast<nanoseconds>( high_resolution_clock::now() - start ).count()
            / ((double)BLOCKS * ROUNDS) << endl;
    }
}

Re: Scanning memory forward vs. reverse

<up69an$29an$1@dont-email.me>

 by: Chris M. Thomasson - Sun, 28 Jan 2024 19:18 UTC

On 1/28/2024 5:07 AM, Bonita Montero wrote:
> On 28.01.2024 at 11:32, Marcel Mueller wrote:
>
>> Reverse memory access is typically slower simply because the
>> last data of a cache line (after a cache miss) arrives last.
>
> I tested this and for all offsets within a cacheline I get the
> same timing for all three of my computers:
>
> #include <iostream>
> #include <vector>
> #include <chrono>
> #include <atomic>
>
> using namespace std;
> using namespace chrono;
>
> #if defined(__cpp_lib_hardware_interference_size)
> constexpr size_t CL_SIZE = hardware_constructive_interference_size;
> #else
> constexpr size_t CL_SIZE = 64;
> #endif
>
> atomic_char aSum;
>
> int main()
> {
>     constexpr size_t
>         BLOCK_SIZE = 1ull << 20,
>         BLOCKS = BLOCK_SIZE / CL_SIZE,
>         ROUNDS = 1000;

>     vector<char> block( BLOCK_SIZE );

Try padding and aligning the blocks. iirc, std::vector works with
alignas. Actually, it's pretty nice.

>     for( size_t offset = 0; offset != CL_SIZE; ++offset )
>     {
>         auto start = high_resolution_clock::now();
>         char sum = 0;
>         for( size_t round = ROUNDS; round--; )
>             for( size_t i = offset; i < BLOCK_SIZE; sum += block[i], i += CL_SIZE );
>         ::aSum.store( sum, memory_order_relaxed );
>         cout << offset << ": " <<
>             duration_cast<nanoseconds>(high_resolution_clock::now() - start).count()
>             / ((double)BLOCKS * ROUNDS) << endl;
>     }
> }
>

Re: Scanning memory forward vs. reverse

<up7p8b$d086$1@raubtier-asyl.eternal-september.org>

 by: Bonita Montero - Mon, 29 Jan 2024 08:56 UTC

On 28.01.2024 at 20:18, Chris M. Thomasson wrote:

> Try padding and aligning the blocks. iirc, std::vector works with
> alignas. Actually, it's pretty nice.

I'm testing all 64 offsets. Whether logical offset zero ends up at
physical offset one in the cache line doesn't matter, since physical
offset zero would then be occupied by logical offset 63.
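
The wrap-around argument can be checked with a few lines of arithmetic.
The sketch below assumes a 64-byte cache line and a block that starts
`misalign` bytes into a line; physical_hits and covers_all_offsets are
names made up for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t CL_SIZE = 64; // assumed cache-line size in bytes

// For a block whose start lies `misalign` bytes into a cache line, returns
// how often each physical in-line offset is hit when the logical offsets
// 0..CL_SIZE-1 are each tested once.
std::array<unsigned, CL_SIZE> physical_hits( std::size_t misalign )
{
    assert( misalign < CL_SIZE );
    std::array<unsigned, CL_SIZE> hits{};
    for( std::size_t logical = 0; logical != CL_SIZE; ++logical )
        ++hits[(misalign + logical) % CL_SIZE];
    return hits;
}

// True if every physical offset is covered exactly once.
bool covers_all_offsets( std::size_t misalign )
{
    for( unsigned h : physical_hits( misalign ) )
        if( h != 1 )
            return false;
    return true;
}
```

For every misalignment from 0 to 63 the mapping is a rotation of the
offsets, so each physical offset is tested exactly once. What the sketch
does not tell you is which printed line of the benchmark belongs to which
physical offset; for that the block's start address would have to be
known.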

Re: Scanning memory forward vs. reverse

<up97tb$kvd2$1@dont-email.me>

 by: Chris M. Thomasson - Mon, 29 Jan 2024 22:12 UTC

On 1/29/2024 12:56 AM, Bonita Montero wrote:
> On 28.01.2024 at 20:18, Chris M. Thomasson wrote:
>
>> Try padding and aligning the blocks. iirc, std::vector works with
>> alignas. Actually, it's pretty nice.
>
> I'm testing all 64 offsets. If offset zero becomes physically offset
> one in the cacheline doesn't matter since physical offset zero would
> then be occupied by logical offset 63.
>

You don't want to straddle any cache lines. Properly aligning and
padding the blocks gets around that...
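
One way the aligning and padding suggestion could be realized is sketched
below. It is an illustration under assumptions, not the code anyone in the
thread used: it assumes C++17 or later (where std::allocator honours
over-aligned element types) and a 64-byte cache line, and the names Line,
make_aligned_block and is_line_aligned are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t CL_SIZE = 64; // assumed cache-line size in bytes

// One cache line's worth of bytes. alignas makes every element start on a
// line boundary, and sizeof(Line) == CL_SIZE means no element straddles
// two lines.
struct alignas(CL_SIZE) Line
{
    char bytes[CL_SIZE];
};
static_assert( sizeof(Line) == CL_SIZE, "Line must fill exactly one cache line" );

// True if p lies on a CL_SIZE boundary.
bool is_line_aligned( void const *p )
{
    return reinterpret_cast<std::uintptr_t>( p ) % CL_SIZE == 0;
}

// Allocates `lines` zero-initialized, cache-line-aligned elements.
std::vector<Line> make_aligned_block( std::size_t lines )
{
    std::vector<Line> block( lines );
    assert( block.empty() || is_line_aligned( block.data() ) );
    return block;
}
```

With such a block, byte i of line n is block[n].bytes[i], so the offset
used in the benchmark would equal the physical offset inside the cache
line.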

Re: Scanning memory forward vs. reverse

<upak4j$uvc2$1@raubtier-asyl.eternal-september.org>

 by: Bonita Montero - Tue, 30 Jan 2024 10:47 UTC

On 29.01.2024 at 23:12, Chris M. Thomasson wrote:
> On 1/29/2024 12:56 AM, Bonita Montero wrote:
>> On 28.01.2024 at 20:18, Chris M. Thomasson wrote:
>>
>>> Try padding and aligning the blocks. iirc, std::vector works with
>>> alignas. Actually, it's pretty nice.
>>
>> I'm testing all 64 offsets. If offset zero becomes physically offset
>> one in the cacheline doesn't matter since physical offset zero would
>> then be occupied by logical offset 63.
>>
>
> You don't want to straddle any cache lines. ...

I'm testing all 64 offsets, and for my measurement it doesn't matter
whether the beginning of the block is at offset zero inside a cache
line, since the results show equal access times for all offsets.
Had the results differed, proper alignment might have made sense.

Re: Scanning memory forward vs. reverse

<updqii$1j99p$1@dont-email.me>

 by: Vir Campestris - Wed, 31 Jan 2024 15:56 UTC

On 28/01/2024 10:19, Bonita Montero wrote:
> On my Windows 7050X Zen4 computer scanning memory in both directions
> has the same speed. On my Linux 3990X Zen2 computer scanning forward
> is 22% faster. On my small Linux PC, a HP EliteDesk Mini PC with a
> Skylake Pentium G4400 scanning memory forward is about 38% faster.
> I'd first have guessed that the prefetchers between the memory-levels
> are as effective for both directions. So I'd like to see some results
> from you.

On my Linux box with an AMD Ryzen 5 3400G, the second number (the
reverse scan) is about 11% slower. But that's very approximate: the
machine is doing something else right now, and that figure is the
average of several runs in which the ratio ranged from 97% to 130%.

Andy
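
With run-to-run spread like that, one option is to repeat each
measurement and compare medians rather than single runs or averages. The
helper below is a sketch of that idea and is not part of the posted
benchmark; time_runs and median are hypothetical names, and it uses
steady_clock instead of the high_resolution_clock from the original
program.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Runs `fn` `runs` times and returns the per-run wall-clock times in
// milliseconds, sorted in ascending order.
template<typename Fn>
std::vector<double> time_runs( Fn fn, std::size_t runs )
{
    assert( runs > 0 );
    using clock = std::chrono::steady_clock;
    std::vector<double> ms;
    ms.reserve( runs );
    for( std::size_t r = 0; r != runs; ++r )
    {
        auto start = clock::now();
        fn();
        ms.push_back( std::chrono::duration<double, std::milli>(
            clock::now() - start ).count() );
    }
    std::sort( ms.begin(), ms.end() );
    return ms;
}

// Median of an ascending, non-empty sample (upper median for even sizes).
double median( std::vector<double> const &sorted )
{
    assert( !sorted.empty() );
    return sorted[sorted.size() / 2];
}
```

Passing the forward scan and the reverse scan as two lambdas and dividing
the two medians gives a ratio that is less sensitive to a single
disturbed run. The front() of each result is the best case, which is
another common figure to report.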
