Rocksolid Light

Welcome to RetroBBS

mail  files  register  newsreader  groups  login

Message-ID:  

Every little picofarad has a nanohenry all its own. -- Don Vonada


devel / comp.lang.c++ / Sieve of Erastosthenes optimized to the max

SubjectAuthor
* Sieve of Erastosthenes optimized to the maxBonita Montero
+* Re: Sieve of Erastosthenes optimized to the maxVir Campestris
|`* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
| `* Re: Sieve of Erastosthenes optimized to the maxVir Campestris
|  +* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|  |`* Re: Sieve of Erastosthenes optimized to the maxVir Campestris
|  | +* Re: Sieve of Erastosthenes optimized to the maxVir Campestris
|  | |`- Re: Sieve of Erastosthenes optimized to the maxVir Campestris
|  | `* Re: Sieve of Erastosthenes optimized to the maxred floyd
|  |  `* Re: Sieve of Erastosthenes optimized to the maxTim Rentsch
|  |   `* Re: Sieve of Erastosthenes optimized to the maxVir Campestris
|  |    `* Re: Sieve of Erastosthenes optimized to the maxTim Rentsch
|  |     `* Re: Sieve of Erastosthenes optimized to the maxVir Campestris
|  |      `- Re: Sieve of Erastosthenes optimized to the maxTim Rentsch
|  `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|   `* Re: Sieve of Erastosthenes optimized to the maxVir Campestris
|    +- Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|    +- Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|    `* Re: Sieve of Erastosthenes optimized to the maxTim Rentsch
|     `* Re: Sieve of Erastosthenes optimized to the maxVir Campestris
|      `- Re: Sieve of Erastosthenes optimized to the maxTim Rentsch
+* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|`* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
| `* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|  `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|   `* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|    `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|     `* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|      `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|       `* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|        `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|         `* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          +* Re: Sieve of Erastosthenes optimized to the maxKaz Kylheku
|          |`* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          | `* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |  `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |   `* Re: Sieve of Erastosthenes optimized to the maxKaz Kylheku
|          |    `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |     `* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |      `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       +* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       |`* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       | +* Re: Sieve of Erastosthenes optimized to the maxDavid Brown
|          |       | |`* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       | | `* Re: Sieve of Erastosthenes optimized to the maxDavid Brown
|          |       | |  +- Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       | |  `* Re: Sieve of Erastosthenes optimized to the maxScott Lurndal
|          |       | |   `- Re: Sieve of Erastosthenes optimized to the maxDavid Brown
|          |       | +* Re: Sieve of Erastosthenes optimized to the maxScott Lurndal
|          |       | |+* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       | ||`- Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       | |`* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       | | `* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       | |  +* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       | |  |+- Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       | |  |`* Re: Sieve of Erastosthenes optimized to the maxKaz Kylheku
|          |       | |  | `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       | |  |  +* Re: Sieve of Erastosthenes optimized to the maxKaz Kylheku
|          |       | |  |  |`* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       | |  |  | `* Re: Sieve of Erastosthenes optimized to the maxKaz Kylheku
|          |       | |  |  |  `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       | |  |  |   `* Re: Sieve of Erastosthenes optimized to the maxKaz Kylheku
|          |       | |  |  |    `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       | |  |  |     `* Re: Sieve of Erastosthenes optimized to the maxKaz Kylheku
|          |       | |  |  |      `- Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       | |  |  `* Re: Sieve of Erastosthenes optimized to the maxScott Lurndal
|          |       | |  |   `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       | |  |    `* Re: Sieve of Erastosthenes optimized to the maxKaz Kylheku
|          |       | |  |     `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       | |  |      `- Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       | |  `* Re: Sieve of Erastosthenes optimized to the maxKaz Kylheku
|          |       | |   `- Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       | `* Re: Sieve of Erastosthenes optimized to the maxKaz Kylheku
|          |       |  `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       |   `* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       |    +* Re: Sieve of Erastosthenes optimized to the maxred floyd
|          |       |    |`- Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       |    `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       |     +* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       |     |`* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       |     | `* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       |     |  `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       |     |   `* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       |     |    `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       |     |     `* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       |     |      `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       |     |       `* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       |     |        `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       |     |         `* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       |     |          `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       |     |           `* Re: Sieve of Erastosthenes optimized to the maxKaz Kylheku
|          |       |     |            `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       |     |             `* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       |     |              +* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       |     |              |`- Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       |     |              `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       |     |               `* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       |     |                `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       |     |                 +* Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       |     |                 |`* Re: Sieve of Erastosthenes optimized to the maxred floyd
|          |       |     |                 | +- Re: Sieve of Erastosthenes optimized to the maxBonita Montero
|          |       |     |                 | `- Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       |     |                 `* Re: Sieve of Erastosthenes optimized to the maxKaz Kylheku
|          |       |     `- Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          |       `- Re: Sieve of Erastosthenes optimized to the maxChris M. Thomasson
|          `* Re: Sieve of Erastosthenes optimized to the maxBonita Montero
`* Re: Sieve of Erastosthenes optimized to the maxwij

Pages:123456
Sieve of Erastosthenes optimized to the max

<ul41d4$2koct$1@raubtier-asyl.eternal-september.org>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2897&group=comp.lang.c%2B%2B#2897

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!raubtier-asyl.eternal-september.org!.POSTED!not-for-mail
From: Bonita.Montero@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Sieve of Erastosthenes optimized to the max
Date: Sun, 10 Dec 2023 10:46:18 +0100
Organization: A noiseless patient Spider
Lines: 251
Message-ID: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 10 Dec 2023 09:46:12 -0000 (UTC)
Injection-Info: raubtier-asyl.eternal-september.org; posting-host="90533d92cd816742ce0b78ec141ea3b0";
logging-data="2777501"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18hdKnADcnTGXgGPdK8vICjhKiMglRJ6vA="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:lhSVyr/alyDsQcXD19gMx0dc3Yc=
Content-Language: de-DE
 by: Bonita Montero - Sun, 10 Dec 2023 09:46 UTC

I've parallelized my sieve of Erastosthenes to all cores. The parallel
threads do al the punching of the non-prime multiplex beyond sqrt( max
). I found that the algorithm is limited by the memory bandwith since
the bit-range between sqrt( max ) and max is to large to fit inside
the cache. So I partitioned the each thread to a number of bit-ranges
that fits inside the L2-cache. With that I got a speedup of factor 28
when calculating about the first 210 million primes, i.e. all primes
<= 2 ^ 32.
On my Zen4 16 core PC under Windows with clang-cl 16.0.5 producing the
primes without any file output is only 0.28s. On my Linux-PC, a Zen2 64
core CPU producing the same number of primes is about 0.09s.
The file output is done with my own buffering. With that writing the
mentioned prime number range results in a 2.2GB file which is written
in about two seconds with a PCIe 4.0 SSD.
The first parameter is the highest prime candidate to be generated,
it can be prefixed with 0x for hex values. The second number is the
name of the output file; it can be "" to suppress file output. The
third parameter is the number of punching threads; if it is left the
number of threads defaults to the number of threads reported by the
runtime.

#include <iostream>
#include <cstdlib>
#include <charconv>
#include <cstring>
#include <vector>
#include <algorithm>
#include <cmath>
#include <bit>
#include <fstream>
#include <string_view>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <utility>
#include <semaphore>
#include <atomic>
#include <new>
#include <span>
#if defined(_MSC_VER)
#include <intrin.h>
#elif defined(__GNUC__) || defined(__clang__)
#include <cpuid.h>
#endif

#if defined(_MSC_VER)
#pragma warning(disable: 26495)
#endif

using namespace std;

#if defined(__cpp_lib_hardware_interference_size)
constexpr ptrdiff_t CL_SIZE = hardware_destructive_interference_size;
#else
constexpr ptrdiff_t CL_SIZE = 64;
#endif

size_t getL2Size();

int main( int argc, char **argv )
{ if( argc < 2 )
return EXIT_FAILURE;
auto parse = []( char const *str, bool hex, auto &value )
{
from_chars_result fcr = from_chars( str, str + strlen( str ), value,
!hex ? 10 : 16 );
return fcr.ec == errc() && !*fcr.ptr;
};
char const *sizeStr = argv[1];
bool hex = sizeStr[0] == '0' && (sizeStr[1] == 'x' || sizeStr[1] == 'X');
sizeStr += 2 * hex;
size_t end;
if( !parse( sizeStr, hex, end ) )
return EXIT_FAILURE;
if( (ptrdiff_t)end++ < 0 )
throw bad_alloc();
unsigned hc;
if( argc < 4 || !parse( argv[3], false, hc ) )
hc = jthread::hardware_concurrency();
if( !hc )
return EXIT_FAILURE;
using word_t = size_t;
constexpr size_t
BITS = sizeof(word_t) * 8,
CL_BITS = CL_SIZE * 8;
union alignas(CL_SIZE) ndi_words_cl { word_t words[CL_SIZE /
sizeof(word_t)]; ndi_words_cl() {} };
size_t nCls = (end + CL_BITS - 1) / CL_BITS;
vector<ndi_words_cl> sieveCachelines( (end + CL_BITS - 1) / CL_BITS );
size_t nWords = (end + BITS - 1) / BITS;
span<word_t> sieve( &sieveCachelines[0].words[0], nWords );
auto fill = sieve.begin();
*fill++ = (word_t)0xAAAAAAAAAAAAAAACu;
if( fill != sieve.end() )
for_each( fill, sieve.end(), []( word_t &w ) { w =
(word_t)0xAAAAAAAAAAAAAAAAu; } );
size_t sqrt = (ptrdiff_t)ceil( ::sqrt( (ptrdiff_t)end ) );
auto punch = [&]( auto, size_t start, size_t end, size_t prime )
{
if( start >= end )
return;
size_t multiple = start;
if( prime >= BITS ) [[likely]]
do
sieve[multiple / BITS] &= rotl( (word_t)-2, multiple % BITS );
while( (multiple += prime) < end );
else
{
auto pSieve = sieve.begin() + multiple / BITS;
do
{
word_t
word = *pSieve,
mask = rotl( (word_t)-2, multiple % BITS ),
oldMask;
do
word &= mask,
oldMask = mask,
mask = rotl( mask, prime % BITS ),
multiple += prime;
while( mask < oldMask );
*pSieve++ = word;
} while( multiple < end );
}
};
for( size_t prime = 3; prime < sqrt; ++prime )
{
for( auto pSieve = sieve.begin() + prime / BITS; ; )
{
if( word_t w = *pSieve >> prime % BITS; w ) [[likely]]
{
prime += countr_zero( w );
break;
}
prime = prime + BITS & -(ptrdiff_t)BITS;
if( prime >= sqrt )
break;
}
if( prime >= sqrt ) [[unlikely]]
break;
punch( false_type(), prime * prime, sqrt, prime );
}
auto scan = [&]( size_t start, size_t end, auto fn )
{
for( size_t prime = start; prime < end; )
{
word_t word = sieve[prime / BITS] >> prime % BITS;
bool hasBits = word;
for( unsigned gap; word; word >>= gap, word >>= 1 )
{
gap = countr_zero( word );
if( (prime += gap) >= end ) [[unlikely]]
return;
fn( prime );
if( ++prime >= end ) [[unlikely]]
return;
}
prime = (prime + BITS - hasBits) & -(ptrdiff_t)BITS;
}
};
size_t bitsPerPartition = (end - sqrt) / hc;
using range_t = pair<size_t, size_t>;
vector<pair<size_t, size_t>> ranges;
ranges.reserve( hc );
for( size_t t = hc, rEnd = end, rStart; t--; rEnd = rStart )
{
rStart = sqrt + t * bitsPerPartition;
size_t rClStart = rStart & -(CL_SIZE * 8);
rStart = rClStart >= sqrt ? rClStart : sqrt;
if( rStart >= rEnd )
continue;
ranges.emplace_back( rStart, rEnd );
}
vector<jthread> threads;
threads.reserve( ranges.size() - 1 );
static auto dbl = []( ptrdiff_t x ) { return (double)x; };
double maxPartitionBits = dbl( getL2Size() ) * 8 / 3;
for( range_t const &range : ranges )
{
auto rangePuncher = [&]( size_t rStart, size_t rEnd )
{
double nBits = dbl( rEnd - rStart );
ptrdiff_t nPartitions = (ptrdiff_t)ceil( nBits / maxPartitionBits );
double bitsPerPartition = nBits / dbl( nPartitions );
for( ptrdiff_t p = nPartitions, pEnd = rEnd, pStart; p--; pEnd = pStart )
{
pStart = rStart + (ptrdiff_t)((double)p * bitsPerPartition);
pStart &= -(CL_SIZE * 8);
scan( 3, sqrt,
[&]( size_t prime )
{
punch( true_type(), (pStart + prime - 1) / prime * prime, pEnd,
prime );
} );
}
};
if( &range != &ranges.back() )
threads.emplace_back( rangePuncher, range.first, range.second );
else
rangePuncher( range.first, range.second );
}
threads.resize( 0 );
if( argc >= 3 && !*argv[2] )
return EXIT_SUCCESS;
ofstream ofs;
ofs.exceptions( ofstream::failbit | ofstream::badbit );
ofs.open( argc >= 3 ? argv[2] : "primes.txt", ofstream::binary |
ofstream::trunc );
constexpr size_t
BUF_SIZE = 0x100000,
AHEAD = 32;
union ndi_char { char c; ndi_char() {} };
vector<ndi_char> printBuf( BUF_SIZE + AHEAD - 1 );
auto
bufBegin = printBuf.begin(),
bufEnd = bufBegin,
bufFlush = bufBegin + BUF_SIZE;
auto print = [&]() { ofs << string_view( &bufBegin->c, &to_address(
bufEnd )->c ); };
scan( 2, end,
[&]( size_t prime )
{
auto [ptr, ec] = to_chars( &bufEnd->c, &bufEnd[AHEAD - 1].c, prime );
if( ec != errc() ) [[unlikely]]
throw system_error( (int)ec, generic_category(), "converson failed" );
bufEnd = ptr - &bufBegin->c + bufBegin; // pointer to iterator - NOP
bufEnd++->c = '\n';
if( bufEnd >= bufFlush ) [[unlikely]]
print(), bufEnd = bufBegin;
} );
print();
}

size_t getL2Size()
{ int regs[4];
auto cpuId = [&]( unsigned code )
{
#if defined(_MSC_VER)
__cpuid( regs, code );
#elif defined(__GNUC__) || defined(__clang__)
__cpuid(code, regs[0], regs[1], regs[2], regs[3]);
#endif
return regs[0];
};
if( (unsigned)cpuId( 0x80000000u ) < 0x80000006u )
return 512ull * 1024;
cpuId( 0x80000006u );
return ((unsigned)regs[2] >> 16) * 1024;
}

Re: Sieve of Erastosthenes optimized to the max

<ul5bn0$2qsht$2@dont-email.me>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2898&group=comp.lang.c%2B%2B#2898

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: vir.campestris@invalid.invalid (Vir Campestris)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Sun, 10 Dec 2023 21:48:16 +0000
Organization: A noiseless patient Spider
Lines: 37
Message-ID: <ul5bn0$2qsht$2@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 10 Dec 2023 21:48:16 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="44d07384f10fd4ad011f4d721d185aa8";
logging-data="2978365"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19SWdK3vxfTPxphKIzy2tOMGoRPUDYn2ds="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:zgt2zfpl9CChgHPJqWgle6Y3E0M=
In-Reply-To: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
Content-Language: en-GB
 by: Vir Campestris - Sun, 10 Dec 2023 21:48 UTC

On 10/12/2023 09:46, Bonita Montero wrote:
> I've parallelized my sieve of Erastosthenes to all cores. The parallel
> threads do al the punching of the non-prime multiplex beyond sqrt( max
> ). I found that the algorithm is limited by the memory bandwith since
> the bit-range between sqrt( max ) and max is to large to fit inside
> the cache. So I partitioned the each thread to a number of bit-ranges
> that fits inside the L2-cache. With that I got a speedup of factor 28
> when calculating about the first 210 million primes, i.e. all primes
> <= 2 ^ 32.
> On my Zen4 16 core PC under Windows with clang-cl 16.0.5 producing the
> primes without any file output is only 0.28s. On my Linux-PC, a Zen2 64
> core CPU producing the same number of primes is about 0.09s.
> The file output is done with my own buffering. With that writing the
> mentioned prime number range results in a 2.2GB file which is written
> in about two seconds with a PCIe 4.0 SSD.
> The first parameter is the highest prime candidate to be generated,
> it can be prefixed with 0x for hex values. The second number is the
> name of the output file; it can be "" to suppress file output. The
> third parameter is the number of punching threads; if it is left the
> number of threads defaults to the number of threads reported by the
> runtime.
>
<snip code>

Just so happens I've been thinking about this. I find your code
impenetrable - there are no comments at all. But two things occurred to
me quite quickly:
- You don't need to store the odd numbers
- You only need a bitmask to save the sieve.

I think you're doing the latter, though it's not obvious. I think you're
storing bits in an array of size_t, and that will be what
0xAAAAAAAAAAAAAAACu and 0xAAAAAAAAAAAAAAAAu are about. However if I am
reading that right you've assumed that unsigned is the same size as
size_t, and that the size is 64 bits.

Andy

Re: Sieve of Erastosthenes optimized to the max

<ul5ut3$31a17$1@raubtier-asyl.eternal-september.org>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2899&group=comp.lang.c%2B%2B#2899

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!raubtier-asyl.eternal-september.org!.POSTED!not-for-mail
From: Bonita.Montero@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Mon, 11 Dec 2023 04:15:47 +0100
Organization: A noiseless patient Spider
Lines: 18
Message-ID: <ul5ut3$31a17$1@raubtier-asyl.eternal-september.org>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<ul5bn0$2qsht$2@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 11 Dec 2023 03:15:47 -0000 (UTC)
Injection-Info: raubtier-asyl.eternal-september.org; posting-host="74b1f9cd0e072be5ee6fcd60890cf2d4";
logging-data="3188775"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18X1dOWDPmQb4Zm8IooPJ96YcY3y2kcGu0="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:l4DhiiyWJLHmVh9KVa8T1/Rs2FM=
In-Reply-To: <ul5bn0$2qsht$2@dont-email.me>
Content-Language: de-DE
 by: Bonita Montero - Mon, 11 Dec 2023 03:15 UTC

Am 10.12.2023 um 22:48 schrieb Vir Campestris:

> - You don't need to store the odd numbers

All primes except 2 are odd.

> - You only need a bitmask to save the sieve.

I'm using the "scan"-lambda which does serarch through the sieve
as fast as possible with countr_zero, which maps to a single CPU
-instruction on modern processors.

> ... reading that right you've assumed that unsigned is the same
> size as size_t, and that the size is 64 bits.

I've got a type word_t that is a size_t, but it can also
be a uint8_t to test what's the performance difference.

Re: Sieve of Erastosthenes optimized to the max

<ul7ft1$3853g$1@dont-email.me>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2900&group=comp.lang.c%2B%2B#2900

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: vir.campestris@invalid.invalid (Vir Campestris)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Mon, 11 Dec 2023 17:12:00 +0000
Organization: A noiseless patient Spider
Lines: 31
Message-ID: <ul7ft1$3853g$1@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<ul5bn0$2qsht$2@dont-email.me>
<ul5ut3$31a17$1@raubtier-asyl.eternal-september.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 11 Dec 2023 17:12:01 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="a3e9d5138d53c7adfd05c0f97864f54d";
logging-data="3413104"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19x/zuz9nhtFJMIcuNto/aIDvJm8HRCfvE="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:R/lpRBJnnsPytKlgFn+k4rgKMk0=
In-Reply-To: <ul5ut3$31a17$1@raubtier-asyl.eternal-september.org>
Content-Language: en-GB
 by: Vir Campestris - Mon, 11 Dec 2023 17:12 UTC

On 11/12/2023 03:15, Bonita Montero wrote:
> Am 10.12.2023 um 22:48 schrieb Vir Campestris:
>
>> - You don't need to store the odd numbers
>
> All primes except 2 are odd.
>
I had a really bad night, and I was half asleep yesterday.

Because all primes except 2 are odd you don't need to store the _even_
numbers, which is what I meant to say. Or else you only need to store
the odd ones.

>> - You only need a bitmask to save the sieve.
>
> I'm using the "scan"-lambda which does serarch through the sieve
> as fast as possible with countr_zero, which maps to a single CPU
> -instruction on modern processors.
>
>> ... reading that right you've assumed that unsigned is the same
>> size as size_t, and that the size is 64 bits.
>
> I've got a type word_t that is a size_t, but it can also
> be a uint8_t to test what's the performance difference.
>

Your constants are 64 bit hexadecimal numbers (from counting the digits)
and are unsigned (the u suffix) but there's no guarantee that size_t
will be the same size as unsigned, nor that it will be 64 bit.

Andy

Re: Sieve of Erastosthenes optimized to the max

<ul7gal$3885r$1@raubtier-asyl.eternal-september.org>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2901&group=comp.lang.c%2B%2B#2901

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!raubtier-asyl.eternal-september.org!.POSTED!not-for-mail
From: Bonita.Montero@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Mon, 11 Dec 2023 18:19:17 +0100
Organization: A noiseless patient Spider
Lines: 17
Message-ID: <ul7gal$3885r$1@raubtier-asyl.eternal-september.org>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<ul5bn0$2qsht$2@dont-email.me>
<ul5ut3$31a17$1@raubtier-asyl.eternal-september.org>
<ul7ft1$3853g$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 11 Dec 2023 17:19:18 -0000 (UTC)
Injection-Info: raubtier-asyl.eternal-september.org; posting-host="74b1f9cd0e072be5ee6fcd60890cf2d4";
logging-data="3416251"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19knBDHO6Hm+pWeuOkL0TlBbz9ZOLFDK3s="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:8bsdWWG0WA8SRlMUDS+LIXcRHLM=
Content-Language: de-DE
In-Reply-To: <ul7ft1$3853g$1@dont-email.me>
 by: Bonita Montero - Mon, 11 Dec 2023 17:19 UTC

Am 11.12.2023 um 18:12 schrieb Vir Campestris:

> Because all primes except 2 are odd you don't need to store the _even_
> numbers, which is what I meant to say. Or else you only need to store
> the odd ones.

I'll try that the next days; but I don't expect a significant change.

> Your constants are 64 bit hexadecimal numbers (from counting the digits)
> and are unsigned (the u suffix) but there's no guarantee that size_t
> will be the same size as unsigned, nor that it will be 64 bit.

The constants are wide enough for 64 bit size_t-s but it won't make
a functional difference if you define word_t as a uint8_t. With an
uint8_t word_t the code runs about 1/6-th slower; that's less than
I expected.

Re: Sieve of Erastosthenes optimized to the max

<ulchs1$t3ph$1@dont-email.me>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2902&group=comp.lang.c%2B%2B#2902

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: vir.campestris@invalid.invalid (Vir Campestris)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Wed, 13 Dec 2023 15:16:17 +0000
Organization: A noiseless patient Spider
Lines: 235
Message-ID: <ulchs1$t3ph$1@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<ul5bn0$2qsht$2@dont-email.me>
<ul5ut3$31a17$1@raubtier-asyl.eternal-september.org>
<ul7ft1$3853g$1@dont-email.me>
<ul7gal$3885r$1@raubtier-asyl.eternal-september.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 13 Dec 2023 15:16:17 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="7092820afedbead44477d8e518f5b0a6";
logging-data="954161"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18lO6PjjdEvlX5xMPoMm7FHfv06WWw0NME="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:2ScmBnGGJ4exX08fPgvSrw2Q6fk=
In-Reply-To: <ul7gal$3885r$1@raubtier-asyl.eternal-september.org>
Content-Language: en-GB
 by: Vir Campestris - Wed, 13 Dec 2023 15:16 UTC

On 11/12/2023 17:19, Bonita Montero wrote:
> Am 11.12.2023 um 18:12 schrieb Vir Campestris:
>
>> Because all primes except 2 are odd you don't need to store the _even_
>> numbers, which is what I meant to say. Or else you only need to store
>> the odd ones.
>
> I'll try that the next days; but I don't expect a significant change.
>
Well, I went away and tried it. I didn't write anything nearly as
sophisticated as yours - I didn't worry about threads and caching. The
code follows. You'll have to unwrap it in a few places.

It occurred to me I could go further - not just optimise it out to store
odd numbers only, but also miss out the ones divisible by 3, or other
higher primes. The results were:
- Storing all numbers: 9.494 seconds to run the sieve
- Storing odd numbers only: 4.455. That's over twice as fast.
- Excluding also the ones divisible by 3: 3.468. That's an improvement,
but not as much as I expected. Perhaps it's got coo complicated. I don't
have a high powered PC.

>> Your constants are 64 bit hexadecimal numbers (from counting the
>> digits) and are unsigned (the u suffix) but there's no guarantee that
>> size_t will be the same size as unsigned, nor that it will be 64 bit.
>
> The constants are wide enough for 64 bit size_t-s but it won't make
> a functional difference if you define word_t as a uint8_t. With an
> uint8_t word_t the code runs about 1/6-th slower; that's less than
> I expected.
>
Most likely what you are seeing there is that you've gone from being
memory bound to being compute bound. The CPU cache will make sure the
main memory accesses are 128 bits (or whatever your bus width is)

Andy
--
#include <cassert>
#include <chrono>
#include <cmath>
#include <iostream>
#include <thread>
#include <vector>

size_t PrimeCount = 1e8;

// this is a convenience class for timing.
class stopwatch {
std::chrono::time_point<std::chrono::steady_clock> mBegin, mEnd;
public:
stopwatch() {}

// record the start of an interval
void start()
{
mBegin = std::chrono::steady_clock::now();
}

// record the end of an interval
void stop()
{
mEnd = std::chrono::steady_clock::now();
}

// report the difference between the last start and stop
double delta() const
{
return
std::chrono::duration_cast<std::chrono::milliseconds>(mEnd -
mBegin).count() / 1000.0;
}
};

// the bitmask will be stores in a class that looks like this
class Primes
{ public:
Primes(size_t size) {};
virtual ~Primes() {};
virtual void unprime(size_t value) = 0; // marks a number
as not prime
virtual bool get(size_t value) const = 0; // gets how a
number is marked
virtual size_t size() const = 0; // Size of the store
};

// Basic store. Stores all the primes in a vector<bool>
class PrimesVB: public Primes
{ std::vector<bool> mStore;
public:
PrimesVB(size_t size) : Primes(size), mStore(size, true) {};
virtual void unprime(size_t value) override
{
mStore[value] = false;
}

virtual bool get(size_t value) const override
{
return mStore[value];
}

virtual size_t size() const override
{
return mStore.size();
}
};

class Primes2: public PrimesVB
{ size_t mSize;
public:
Primes2(size_t size) : PrimesVB((size + 1) / 2), mSize(size) {};
virtual void unprime(size_t value) override
{
if (value & 1)
PrimesVB::unprime(value / 2);
}

virtual bool get(size_t value) const override
{
if (value <= 2) return true;
return (value & 1) && PrimesVB::get(value / 2);
}

virtual size_t size() const override
{
return mSize;
}
};

class Primes23: public PrimesVB
{ // Size of the map. This is a lot bigger than the size of the
bitmap.
size_t mSize;

// each "page" contains 2 "lines". A line is a prime we are
storing.
// this map is where we store each number, depending on its
value modulo 6.
// -1 is known not to be prime, and isn't stored.
// we store 7, 13, 18 in "line" 0; 11, 17, 23 in "line" 1.
// the others are either divisible by 2 or 3, and don't need to
be stored.
static constexpr int mPosMap[6] = {-1, 0, -1, -1, -1, 1};
public:
Primes23(size_t size) : PrimesVB((size + 2) / 3), mSize(size) {};
virtual void unprime(size_t value) override
{
if (value <= 3) return;
auto page = value / 6;
auto line = mPosMap[value % 6];
if (line >= 0)
{
PrimesVB::unprime(page * 2 + line);
}
}

virtual bool get(size_t value) const override
{
if (value <= 3) return true;

auto page = value / 6; // 5=0 7=1 11=1 13=2
auto line = mPosMap[value % 6]; // 5=2 7=0 11=2 13=0
if (line >= 0)
{
return PrimesVB::get(page * 2 + line);
}
else
{
return false;
}
}

virtual size_t size() const override
{
return mSize;
}
};

// Simple sieve of Eratosthenes function
void sieve(Primes& store)
{ const size_t storesize = store.size();
const size_t maxfilter = std::ceil(std::sqrt(storesize));

size_t prime = 2;
while (prime < maxfilter)
{
// Unmark all the multiples
for (size_t inval = prime*2; inval < storesize; inval+=
prime)
store.unprime(inval);

// Find the next prime to filter with
while (!store.get(++prime) && prime < maxfilter) {};
}
}

// Benchmark function. Returns the constructed collection for validation.
template <typename storetype> storetype test(const char* classname)
{ stopwatch s;
s.start();
storetype store(PrimeCount);
s.stop();
std::cout << classname << " construct " << s.delta() << "
seconds." << std::endl;

s.start();
sieve(store);
s.stop();
std::cout << classname << " sieve " << s.delta() << " seconds."
<< std::endl;

return store;
}

int main()
{ auto vb = test<PrimesVB>("vector<bool>");
auto p2 = test<Primes2>("Storing odds only");
auto p23 = test<Primes23>("Not storing 2s and 3s");

// double check they all match.
for (auto p = 1; p < PrimeCount; ++p)
{
assert(vb.get(p) == p2.get(p));
assert(vb.get(p) == p23.get(p));
}

return 0;
}

Re: Sieve of Erastosthenes optimized to the max

<ulcidm$t3ph$2@dont-email.me>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2903&group=comp.lang.c%2B%2B#2903

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: vir.campestris@invalid.invalid (Vir Campestris)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Wed, 13 Dec 2023 15:25:42 +0000
Organization: A noiseless patient Spider
Lines: 21
Message-ID: <ulcidm$t3ph$2@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<ul5bn0$2qsht$2@dont-email.me>
<ul5ut3$31a17$1@raubtier-asyl.eternal-september.org>
<ul7ft1$3853g$1@dont-email.me>
<ul7gal$3885r$1@raubtier-asyl.eternal-september.org>
<ulchs1$t3ph$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 13 Dec 2023 15:25:42 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="7092820afedbead44477d8e518f5b0a6";
logging-data="954161"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+v04pg0L6kCo1GJ6MEV6PdfcB5GEXzpfY="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:HpqzbiMgGwQG5JFTn6fwyUpwocA=
Content-Language: en-GB
In-Reply-To: <ulchs1$t3ph$1@dont-email.me>
 by: Vir Campestris - Wed, 13 Dec 2023 15:25 UTC

On 13/12/2023 15:16, Vir Campestris wrote:
> Well, I went away and tried it. I didn't write anything nearly as
> sophisticated as yours - I didn't worry about threads and caching. The
> code follows. You'll have to unwrap it in a few places.
>
> It occurred to me I could go further - not just optimise it out to store
> odd numbers only, but also miss out the ones divisible by 3, or other
> higher primes. The results were:
> - Storing all numbers: 9.494 seconds to run the sieve
> - Storing odd numbers only: 4.455. That's over twice as fast.
> - Excluding also the ones divisible by 3: 3.468. That's an improvement,
> but not as much as I expected. Perhaps it's got coo complicated. I don't
> have a high powered PC.

And no sooner had I posted that than I realised I'd got the optimisation
settings wrong.

With -Ofast fed to g++ the version that doesn't store multiples of 3 is
_slower_ than the odds only one!

Andy

Re: Sieve of Erastosthenes optimized to the max

<ulf5l6$1e43l$1@dont-email.me>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2904&group=comp.lang.c%2B%2B#2904

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!paganini.bofh.team!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: vir.campestris@invalid.invalid (Vir Campestris)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Thu, 14 Dec 2023 15:06:14 +0000
Organization: A noiseless patient Spider
Lines: 24
Message-ID: <ulf5l6$1e43l$1@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<ul5bn0$2qsht$2@dont-email.me>
<ul5ut3$31a17$1@raubtier-asyl.eternal-september.org>
<ul7ft1$3853g$1@dont-email.me>
<ul7gal$3885r$1@raubtier-asyl.eternal-september.org>
<ulchs1$t3ph$1@dont-email.me> <ulcidm$t3ph$2@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 14 Dec 2023 15:06:14 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="4e6688d52ceacbf9987d5ecb7f3f8a7b";
logging-data="1511541"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19E+oqEPk/eSYrTu8e+bjBzv+XHuVBBuQE="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:0PmzWlAXJRzlHZnoytlUA/ajBfo=
In-Reply-To: <ulcidm$t3ph$2@dont-email.me>
Content-Language: en-GB
 by: Vir Campestris - Thu, 14 Dec 2023 15:06 UTC

I tried a few more collections.

The slowest on my system is vector<bool> 12.165s for 1e9 primes.

Having a manual bitmask, and storing in uint64_t is a lot faster -
almost twice as fast (6.659).

A uint32_t is a little faster than that (6.306).

Optimising it to store odds only more than doubles the speed of
vector<bool> to 5.107 seconds.

That has less effect with uint64_t, taking it to 4.946 - which is the
best time I have.

Oddly storing odds only with uint32_t is not as fast as odds only with
uint64_t, although it is still faster than storing all the values (5.302).

I've optimised it to do less recursion, which has helped, but it still
isn't worth not storing the multiples of 3. I'll guess that's because
that code required a divide and mod by 6, whereas optimising out the
evens only requires shifts and masks.

Andy

Re: Sieve of Erastosthenes optimized to the max

<ulfa02$1er97$1@redfloyd.dont-email.me>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2905&group=comp.lang.c%2B%2B#2905

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!redfloyd.dont-email.me!.POSTED!not-for-mail
From: no.spam.here@its.invalid (red floyd)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Thu, 14 Dec 2023 08:20:19 -0800
Organization: A noiseless patient Spider
Lines: 28
Message-ID: <ulfa02$1er97$1@redfloyd.dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<ul5bn0$2qsht$2@dont-email.me>
<ul5ut3$31a17$1@raubtier-asyl.eternal-september.org>
<ul7ft1$3853g$1@dont-email.me>
<ul7gal$3885r$1@raubtier-asyl.eternal-september.org>
<ulchs1$t3ph$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 14 Dec 2023 16:20:19 -0000 (UTC)
Injection-Info: redfloyd.dont-email.me; posting-host="5d04a0f9e044967e16dc08d839041786";
logging-data="1535271"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19y9YD9isZPdgaUXucqBYejHIWECiSovZ0="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:jxjzWVjdzAouDKaTyGXN55pmxwU=
In-Reply-To: <ulchs1$t3ph$1@dont-email.me>
Content-Language: en-US
 by: red floyd - Thu, 14 Dec 2023 16:20 UTC

On 12/13/2023 7:16 AM, Vir Campestris wrote:
> On 11/12/2023 17:19, Bonita Montero wrote:
>> Am 11.12.2023 um 18:12 schrieb Vir Campestris:
>>
>>> Because all primes except 2 are odd you don't need to store the
>>> _even_ numbers, which is what I meant to say. Or else you only need
>>> to store the odd ones.
>>
>> I'll try that the next days; but I don't expect a significant change.
>>
> Well, I went away and tried it. I didn't write anything nearly as
> sophisticated as yours - I didn't worry about threads and caching. The
> code follows. You'll have to unwrap it in a few places.
>
> It occurred to me I could go further - not just optimise it out to store
> odd numbers only, but also miss out the ones divisible by 3, or other
> higher primes. The results were:
> - Storing all numbers: 9.494 seconds to run the sieve
> - Storing odd numbers only: 4.455. That's over twice as fast.
> - Excluding also the ones divisible by 3: 3.468. That's an improvement,
> but not as much as I expected. Perhaps it's got coo complicated. I don't
> have a high powered PC.

Another way to attempt to optimize is that except for 2 and 3, all
primes are of the form 6n+1 or 6n-1.

Re: Sieve of Erastosthenes optimized to the max

<ulunk0$iacr$1@raubtier-asyl.eternal-september.org>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2927&group=comp.lang.c%2B%2B#2927

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!raubtier-asyl.eternal-september.org!.POSTED!not-for-mail
From: Bonita.Montero@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Wed, 20 Dec 2023 13:44:49 +0100
Organization: A noiseless patient Spider
Lines: 261
Message-ID: <ulunk0$iacr$1@raubtier-asyl.eternal-september.org>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<ul5bn0$2qsht$2@dont-email.me>
<ul5ut3$31a17$1@raubtier-asyl.eternal-september.org>
<ul7ft1$3853g$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 20 Dec 2023 12:44:48 -0000 (UTC)
Injection-Info: raubtier-asyl.eternal-september.org; posting-host="e15d072def80e99b64772bfcde13d860";
logging-data="600475"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/YeHSRkBg7NnoWi/cPVe9RomHnyRyVJ8w="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:6s1Es5wfiAv7gcG2Y/9vRfak9sE=
In-Reply-To: <ul7ft1$3853g$1@dont-email.me>
Content-Language: de-DE
 by: Bonita Montero - Wed, 20 Dec 2023 12:44 UTC

Am 11.12.2023 um 18:12 schrieb Vir Campestris:

> Because all primes except 2 are odd you don't need to store the _even_
> numbers, which is what I meant to say. Or else you only need to store
> the odd ones.

I changed the code this morning:

#include <cstdlib>
#include <charconv>
#include <cstring>
#include <vector>
#include <algorithm>
#include <cmath>
#include <bit>
#include <fstream>
#include <string_view>
#include <thread>
#include <utility>
#include <new>
#include <span>
#include <array>
#include <cassert>
#if defined(_MSC_VER)
#include <intrin.h>
#elif defined(__GNUC__) || defined(__clang__)
#include <cpuid.h>
#endif

#if defined(_MSC_VER)
#pragma warning(disable: 26495)
#endif

using namespace std;

#if defined(__cpp_lib_hardware_interference_size)
constexpr ptrdiff_t CL_SIZE = hardware_destructive_interference_size;
#else
constexpr ptrdiff_t CL_SIZE = 64;
#endif

size_t getL2Size();
bool smt();

int main( int argc, char **argv )
{ if( argc < 2 )
return EXIT_FAILURE;
auto parse = []( char const *str, auto &value )
{
bool hex = str[0] == '0' && (str[1] == 'x' || str[1] == 'X');
str += 2 * hex;
auto [ptr, ec] = from_chars( str, str + strlen( str ), value, !hex ?
10 : 16 );
return ec == errc() && !*ptr;
};
size_t end;
if( !parse( argv[1], end ) )
return EXIT_FAILURE;
if( end < 2 || (ptrdiff_t)end++ < 0 )
throw bad_alloc();
unsigned
hc = jthread::hardware_concurrency(),
nThreads;
if( argc < 4 || !parse( argv[3], nThreads ) )
nThreads = hc;
if( !nThreads )
return EXIT_FAILURE;
using word_t = size_t;
constexpr size_t
BITS_PER_CL = CL_SIZE * 8,
BITS = sizeof(word_t) * 8,
DBITS = 2 * BITS;
size_t nBits = end / 2 + (end & 1 ^ 1);
union alignas(CL_SIZE) ndi_words_cl { word_t words[CL_SIZE /
sizeof(word_t)]; ndi_words_cl() {} };
vector<ndi_words_cl> sieveCachelines( (nBits + BITS_PER_CL - 1) /
BITS_PER_CL );
span<word_t> sieve( &sieveCachelines[0].words[0], (nBits + BITS - 1) /
BITS );
fill( sieve.begin(), sieve.end(), (word_t)-1 );
size_t sqrt = (ptrdiff_t)ceil( ::sqrt( (ptrdiff_t)end ) );
auto punch = [&]( auto, size_t start, size_t end, size_t prime )
{
assert(start & 1);
size_t bit = start / 2;
end = end / 2 + (end & 1 ^ 1);
if( bit >= end )
return;
if( prime >= BITS ) [[likely]]
do [[likely]]
sieve[bit / BITS] &= rotl( (word_t)-2, bit % BITS );
while( (bit += prime) < end );
else
{
auto pSieve = sieve.begin() + bit / BITS;
do [[likely]]
{
word_t
word = *pSieve,
mask = rotl( (word_t)-2, bit % BITS ),
oldMask;
do
word &= mask,
oldMask = mask,
mask = rotl( mask, prime % BITS ),
bit += prime;
while( mask < oldMask );
*pSieve++ = word;
} while( bit < end );
}
};
for( size_t prime = 3; prime < sqrt; prime += 2 ) [[likely]]
{
auto pSieve = sieve.begin() + prime / DBITS;
do [[likely]]
if( word_t w = *pSieve >> prime / 2 % BITS; w ) [[likely]]
{
prime += 2 * countr_zero( w );
break;
}
while( (prime = (prime + DBITS & -(ptrdiff_t)DBITS) + 1) < sqrt );
if( prime >= sqrt ) [[unlikely]]
break;
punch( false_type(), prime * prime, sqrt, prime );
}
auto scan = [&]( size_t start, size_t end, auto fn )
{
assert(start & 1);
start /= 2;
end = end / 2 + (end & 1 ^ 1);
if( start >= end )
return;
auto pSieve = sieve.begin() + start / BITS;
for( size_t bit = start; ; )
{
word_t word = *pSieve++ >> bit % BITS;
bool hasBits = word;
for( unsigned gap; word; word >>= gap, word >>= 1 ) [[likely]]
{
gap = countr_zero( word );
if( (bit += gap) >= end ) [[unlikely]]
return;
fn( bit * 2 + 1, true_type() );
if( ++bit >= end ) [[unlikely]]
return;
}
if( bit >= end ) [[unlikely]]
break;
bit = (bit + BITS - hasBits) & -(ptrdiff_t)BITS;
}
};
static auto dbl = []( ptrdiff_t x ) { return (double)x; };
double threadTange = dbl( end - sqrt ) / nThreads;
using range_t = pair<size_t, size_t>;
vector<pair<size_t, size_t>> ranges;
ranges.reserve( nThreads );
for( size_t t = nThreads, rEnd = end, rStart; t--; rEnd = rStart )
[[likely]]
{
rStart = sqrt + (ptrdiff_t)((ptrdiff_t)t * threadTange + 0.5);
size_t rClStart = rStart & -(2 * CL_SIZE * 8);
rStart = rClStart >= sqrt ? rClStart : sqrt;
if( rStart < rEnd )
ranges.emplace_back( rStart, rEnd );
}
double maxCacheRange = dbl( getL2Size() * (8 * 2) ) / 3 * (smt() &&
nThreads > hc / 2 ? 1 : 2);
vector<jthread> threads;
threads.reserve( ranges.size() - 1 );
for( range_t const &range : ranges )
{
auto cacheAwarePunch = [&]( size_t rStart, size_t rEnd )
{
double thisThreadRange = dbl( rEnd - rStart );
ptrdiff_t nCachePartitions = (ptrdiff_t)ceil( thisThreadRange /
maxCacheRange );
double cacheRange = thisThreadRange / dbl( nCachePartitions );
for( size_t p = nCachePartitions, cacheEnd = rEnd, cacheStart; p--;
cacheEnd = cacheStart ) [[likely]]
{
cacheStart = rStart + (ptrdiff_t)((double)(ptrdiff_t)p * cacheRange
+ 0.5);
cacheStart &= -(2 * CL_SIZE * 8);
scan( 3, sqrt,
[&]( size_t prime, auto )
{
size_t start = ((cacheStart >= sqrt ? cacheStart : sqrt) + prime -
1) / prime * prime;
start = start & 1 ? start : start + prime;
punch( true_type(), start, cacheEnd, prime );
} );
}
};
if( &range != &ranges.back() )
threads.emplace_back( cacheAwarePunch, range.first, range.second );
else
cacheAwarePunch( range.first, range.second );
}
threads.resize( 0 );
if( argc >= 3 && !*argv[2] )
return EXIT_SUCCESS;
ofstream ofs;
ofs.exceptions( ofstream::failbit | ofstream::badbit );
ofs.open( argc >= 3 ? argv[2] : "primes.txt", ofstream::binary |
ofstream::trunc );
constexpr size_t
BUF_SIZE = 0x100000,
AHEAD = 32;
union ndi_char { char c; ndi_char() {} };
vector<ndi_char> rawPrintBuf( BUF_SIZE + AHEAD - 1 );
span printBuf( &rawPrintBuf.front().c, &rawPrintBuf.back().c + 1 );
auto
bufBegin = printBuf.begin(),
bufEnd = bufBegin,
bufFlush = bufBegin + BUF_SIZE;
auto print = [&]() { ofs << string_view( bufBegin, bufEnd ); };
auto printPrime = [&]( size_t prime, auto )
{
auto [ptr, ec] = to_chars( &*bufEnd, &bufEnd[AHEAD - 1], prime );
if( ec != errc() ) [[unlikely]]
throw system_error( (int)ec, generic_category(), "converson failed" );
bufEnd = ptr - &*bufBegin + bufBegin; // pointer to iterator - NOP
*bufEnd++ = '\n';
if( bufEnd >= bufFlush ) [[unlikely]]
print(), bufEnd = bufBegin;
};
printPrime( 2, false_type() );
scan( 3, end, printPrime );
print();
}

array<unsigned, 4> cpuId( unsigned code )
{ array<unsigned, 4> regs;
#if defined(_MSC_VER)
__cpuid( (int *)&regs[0], code );
#elif defined(__GNUC__) || defined(__clang__)
__cpuid(code, regs[0], regs[1], regs[2], regs[3]);
#endif
return regs;
}

bool smt()
{ if( cpuId( 0 )[0] < 1 )
return false;
return cpuId( 1 )[3] >> 28 & 1;
}

size_t getL2Size()
{ if( cpuId( 0x80000000u )[0] < 0x80000006u )
return 512ull * 1024;
return (cpuId( 0x80000006u )[2] >> 16) * 1024;
}

Now the code runs 0.15s on the 16 core Zen4 system, that's +87% faster.
On th4e 64 core Zen2 system the time to produce the first 21 million
primes is about 0.05 seconds.

Re: Sieve of Erastosthenes optimized to the max

<um1lm5$14br5$1@dont-email.me>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2942&group=comp.lang.c%2B%2B#2942

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!rocksolid2!news.neodome.net!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: vir.campestris@invalid.invalid (Vir Campestris)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Thu, 21 Dec 2023 15:30:12 +0000
Organization: A noiseless patient Spider
Lines: 26
Message-ID: <um1lm5$14br5$1@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<ul5bn0$2qsht$2@dont-email.me>
<ul5ut3$31a17$1@raubtier-asyl.eternal-september.org>
<ul7ft1$3853g$1@dont-email.me>
<ulunk0$iacr$1@raubtier-asyl.eternal-september.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 21 Dec 2023 15:30:13 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="5d46455dea781c0d66cdade1098eb582";
logging-data="1191781"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+u2q83ST1gmblRxO4oPaAQJdeUvWU38ag="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:2Qpo0baGTi9KrhzDl677gO3iRIM=
In-Reply-To: <ulunk0$iacr$1@raubtier-asyl.eternal-september.org>
Content-Language: en-GB
 by: Vir Campestris - Thu, 21 Dec 2023 15:30 UTC

On 20/12/2023 12:44, Bonita Montero wrote:
> Now the code runs 0.15s on the 16 core Zen4 system, that's +87% faster.
> On th4e 64 core Zen2 system the time to produce the first 21 million
> primes is about 0.05 seconds.

I thought I'd try it.

Can I make a suggestion? Where it says

if( argc < 2 )
return EXIT_FAILURE;

make it instead print a user guide?

On my system my code takes about 20 seconds to produce 1e9 primes. It's
single threaded. Yours is taking a little over 6, but about 18 seconds
of CPU time. (I have 4 cores, each with 2 threads. Mine is designed to
be cool and quiet...)

I've obviously got something wrong though - I'm using a _lot_ more
memory than you.

Out of interest - I thought you must be Spanish from your name. And yet
you speak fluent German?

Andy

Re: Sieve of Erastosthenes optimized to the max

<um1nsg$14lai$2@raubtier-asyl.eternal-september.org>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2943&group=comp.lang.c%2B%2B#2943

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!news.niel.me!news.gegeweb.eu!gegeweb.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!raubtier-asyl.eternal-september.org!.POSTED!not-for-mail
From: Bonita.Montero@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Thu, 21 Dec 2023 17:07:45 +0100
Organization: A noiseless patient Spider
Lines: 40
Message-ID: <um1nsg$14lai$2@raubtier-asyl.eternal-september.org>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<ul5bn0$2qsht$2@dont-email.me>
<ul5ut3$31a17$1@raubtier-asyl.eternal-september.org>
<ul7ft1$3853g$1@dont-email.me>
<ulunk0$iacr$1@raubtier-asyl.eternal-september.org>
<um1lm5$14br5$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 21 Dec 2023 16:07:44 -0000 (UTC)
Injection-Info: raubtier-asyl.eternal-september.org; posting-host="dbdc32063cf8648a2068507deca7038a";
logging-data="1201490"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18BmklbYicRv/+KB+rLrvf3XnZK5Yb8yIM="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:jeCrGFgtV9v7FhomiD9P3PN8Q8s=
In-Reply-To: <um1lm5$14br5$1@dont-email.me>
Content-Language: de-DE
 by: Bonita Montero - Thu, 21 Dec 2023 16:07 UTC

Am 21.12.2023 um 16:30 schrieb Vir Campestris:

> Can I make a suggestion? Where it says
>
>     if( argc < 2 )
>         return EXIT_FAILURE;
>
> make it instead print a user guide?

The code isnt for an application which is ready-to-use but just
a technology experiment.

> On my system my code takes about 20 seconds to produce 1e9 primes. It's
> single threaded. Yours is taking a little over 6, but about 18 seconds
> of CPU time. (I have 4 cores, each with 2 threads. Mine is designed to
> be cool and quiet...)

bg --times --high bitmapSieve 0x1000000000 "" // bg is my own tool
real 2.471s
user 1m:0.313s
sys 0.234s
cylces 273.769.270.155

As you can see my code spends an minute CPU-time in about 2.5 seconds.
Interestingly the whole code largely depends on the performance of the
L2 cache. If more than one of my punch-threads occupies one core there
is almost no performance-improvement since both threads depend on the
same L2 cache.

> I've obviously got something wrong though - I'm using a _lot_ more
> memory than you.

I'm using a bitmap to store the odd values.

> Out of interest - I thought you must be Spanish from your name.
> And yet you speak fluent German?

My parents come from Peru.

Re: Sieve of Erastosthenes optimized to the max

<um1o7j$14ool$1@raubtier-asyl.eternal-september.org>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2944&group=comp.lang.c%2B%2B#2944

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!news.chmurka.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!raubtier-asyl.eternal-september.org!.POSTED!not-for-mail
From: Bonita.Montero@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Thu, 21 Dec 2023 17:13:39 +0100
Organization: A noiseless patient Spider
Lines: 13
Message-ID: <um1o7j$14ool$1@raubtier-asyl.eternal-september.org>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<ul5bn0$2qsht$2@dont-email.me>
<ul5ut3$31a17$1@raubtier-asyl.eternal-september.org>
<ul7ft1$3853g$1@dont-email.me>
<ulunk0$iacr$1@raubtier-asyl.eternal-september.org>
<um1lm5$14br5$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 21 Dec 2023 16:13:39 -0000 (UTC)
Injection-Info: raubtier-asyl.eternal-september.org; posting-host="dbdc32063cf8648a2068507deca7038a";
logging-data="1205013"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+U5vXbk1BWYQreZeQVzIRFUwLK5dmas18="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:UR6+BCMFRAa82rElW8zTF5Jguhk=
In-Reply-To: <um1lm5$14br5$1@dont-email.me>
Content-Language: de-DE
 by: Bonita Montero - Thu, 21 Dec 2023 16:13 UTC

Am 21.12.2023 um 16:30 schrieb Vir Campestris:

> On my system my code takes about 20 seconds to produce 1e9 primes. It's
> single threaded. Yours is taking a little over 6, but about 18 seconds
> of CPU time. (I have 4 cores, each with 2 threads. Mine is designed to
> be cool and quiet...)

I forgot to mention that I'm using "" as the second commandline
parameter, ich suppresses the file output. Producing a 2.2GB
file from all primes that fit in a 32 bit number costs an extra
two seconds for my system with a PCIe 4.0 SSD from Samsung.

Re: Sieve of Erastosthenes optimized to the max

<um2dsb$17vgg$2@dont-email.me>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2945&group=comp.lang.c%2B%2B#2945

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m.thomasson.1@gmail.com (Chris M. Thomasson)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Thu, 21 Dec 2023 14:23:05 -0800
Organization: A noiseless patient Spider
Lines: 6
Message-ID: <um2dsb$17vgg$2@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 21 Dec 2023 22:23:07 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="d5d40264675499f3691087d5c7ec9ace";
logging-data="1310224"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19lrmfOAkLOpACnaQtiHhbqP6IXsEhUMVM="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:2tqbe5/9x+ALRvo7s9VYEuQqEpY=
Content-Language: en-US
In-Reply-To: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
 by: Chris M. Thomasson - Thu, 21 Dec 2023 22:23 UTC

On 12/10/2023 1:46 AM, Bonita Montero wrote:
> union alignas(CL_SIZE) ndi_words_cl { word_t words[CL_SIZE /
> sizeof(word_t)]; ndi_words_cl() {} };

alignas is pretty damn convenient. Iirc, it pads and aligns? Iirc, it
even works with std::vector.

Re: Sieve of Erastosthenes optimized to the max

<um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2947&group=comp.lang.c%2B%2B#2947

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!raubtier-asyl.eternal-september.org!.POSTED!not-for-mail
From: Bonita.Montero@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Fri, 22 Dec 2023 04:28:59 +0100
Organization: A noiseless patient Spider
Lines: 10
Message-ID: <um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<um2dsb$17vgg$2@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 22 Dec 2023 03:28:59 -0000 (UTC)
Injection-Info: raubtier-asyl.eternal-september.org; posting-host="e4ef1e5a5b1a4a3709614f0f0644043b";
logging-data="1515319"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+Fm90M6nHEEH+cT6SBSQwPtOrrEacndFo="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:LTML8Tq6QY7Pg39YlqgCCLllUXU=
In-Reply-To: <um2dsb$17vgg$2@dont-email.me>
Content-Language: de-DE
 by: Bonita Montero - Fri, 22 Dec 2023 03:28 UTC

Am 21.12.2023 um 23:23 schrieb Chris M. Thomasson:

> alignas is pretty damn convenient. Iirc, it pads and aligns? Iirc, it
> even works with std::vector.

The problem with that is that I can't have an alignas on a word_t since
the rest of the cacheline would be padded. So I join as much word_t's as
possible in a single ndi_word_s_cl and have a span later on it to have
full iterator debugging.

Re: Sieve of Erastosthenes optimized to the max

<um31o0$1edq3$1@dont-email.me>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2948&group=comp.lang.c%2B%2B#2948

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m.thomasson.1@gmail.com (Chris M. Thomasson)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Thu, 21 Dec 2023 20:02:07 -0800
Organization: A noiseless patient Spider
Lines: 34
Message-ID: <um31o0$1edq3$1@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<um2dsb$17vgg$2@dont-email.me>
<um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 22 Dec 2023 04:02:08 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="976eed046f8a4bb8ca380aeef2438635";
logging-data="1521475"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18mVua5/JKMeRAuehXKY5D33uPXEtqYq+E="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:Ay4MpWFEfuMa19rXY4BDZV/zv4A=
In-Reply-To: <um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>
Content-Language: en-US
 by: Chris M. Thomasson - Fri, 22 Dec 2023 04:02 UTC

On 12/21/2023 7:28 PM, Bonita Montero wrote:
> Am 21.12.2023 um 23:23 schrieb Chris M. Thomasson:
>
>> alignas is pretty damn convenient. Iirc, it pads and aligns? Iirc, it
>> even works with std::vector.
>
> The problem with that is that I can't have an alignas on a word_t since
> the rest of the cacheline would be padded. So I join as much word_t's as
> possible in a single ndi_word_s_cl and have a span later on it to have
> full iterator debugging.
>

Yup. This is a good practice indeed as long as they are related
logically, it will be beneficial to do this. Say a cache line is say 64
bytes and a word of 8 bytes, can be packed 8 fulls words indeed. No
cache line straddling allowed, damn it!

aligned on a l2 cache line boundary:

struct l2_cache_line
{ word m_words[8];
};

Fine. I think alignas auto packs. I need to to bust out the most recent
C++ mode on my compilers to revisit it. Actually, I need to do it
anyway. The fun part is that I have old code that is interesting, well
before alignas:

https://pastebin.com/raw/f37a23918

Fun times! :^)

Re: Sieve of Erastosthenes optimized to the max

<um4f1l$1l1c2$1@raubtier-asyl.eternal-september.org>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2952&group=comp.lang.c%2B%2B#2952

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!raubtier-asyl.eternal-september.org!.POSTED!not-for-mail
From: Bonita.Montero@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Fri, 22 Dec 2023 17:55:19 +0100
Organization: A noiseless patient Spider
Lines: 12
Message-ID: <um4f1l$1l1c2$1@raubtier-asyl.eternal-september.org>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<um2dsb$17vgg$2@dont-email.me>
<um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>
<um31o0$1edq3$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 22 Dec 2023 16:55:17 -0000 (UTC)
Injection-Info: raubtier-asyl.eternal-september.org; posting-host="e4ef1e5a5b1a4a3709614f0f0644043b";
logging-data="1738114"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+hh5zdQ3Ss4hq0JhCaqTQiVg+CoCxlCGc="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:2xdOE5LI+gFOBCnn1o+5FW5jksU=
Content-Language: de-DE
In-Reply-To: <um31o0$1edq3$1@dont-email.me>
 by: Bonita Montero - Fri, 22 Dec 2023 16:55 UTC

Am 22.12.2023 um 05:02 schrieb Chris M. Thomasson:

> Fine. I think alignas auto packs. I need to to bust out the most recent
> C++ mode on my compilers to revisit it. Actually, I need to do it
> anyway. The fun part is that I have old code that is interesting, well
> before alignas:

False-sharing woudln't hurt my algorithm much since only the beginning
and the end of the thread-local segment overlaps with other thread; but
I do it anyway to have maximum performance.

Re: Sieve of Erastosthenes optimized to the max

<86il4ovkz3.fsf@linuxsc.com>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2957&group=comp.lang.c%2B%2B#2957

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Sat, 23 Dec 2023 10:21:04 -0800
Organization: A noiseless patient Spider
Lines: 9
Message-ID: <86il4ovkz3.fsf@linuxsc.com>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org> <ul5bn0$2qsht$2@dont-email.me> <ul5ut3$31a17$1@raubtier-asyl.eternal-september.org> <ul7ft1$3853g$1@dont-email.me> <ulunk0$iacr$1@raubtier-asyl.eternal-september.org> <um1lm5$14br5$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="a0e2891dc7b8f3c7cc576ac7cc01eb92";
logging-data="2290155"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/ryWQqyaEWken9dKLl+AmYCosc68BbGlA="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:oaYFpJqf/ydu7EsdeGkUxBQF37A=
sha1:NoP2UDUBLBFl+g8H6BxYxYc2CIk=
 by: Tim Rentsch - Sat, 23 Dec 2023 18:21 UTC

Vir Campestris <vir.campestris@invalid.invalid> writes:

> On my system my code takes about 20 seconds to produce 1e9
> primes. [...]

Do you mean 1e9 primes or just the primes less than 1e9? To
do the first 1e9 primes a sieve would need to go up to about
23.9e9 (so half that many bits if only odd numbers were
represented).

Re: Sieve of Erastosthenes optimized to the max

<86edfcvkjm.fsf@linuxsc.com>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2958&group=comp.lang.c%2B%2B#2958

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Sat, 23 Dec 2023 10:30:21 -0800
Organization: A noiseless patient Spider
Lines: 35
Message-ID: <86edfcvkjm.fsf@linuxsc.com>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org> <ul5bn0$2qsht$2@dont-email.me> <ul5ut3$31a17$1@raubtier-asyl.eternal-september.org> <ul7ft1$3853g$1@dont-email.me> <ul7gal$3885r$1@raubtier-asyl.eternal-september.org> <ulchs1$t3ph$1@dont-email.me> <ulfa02$1er97$1@redfloyd.dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="a0e2891dc7b8f3c7cc576ac7cc01eb92";
logging-data="2290155"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/Vyi49E7ZSkRyKccDKY/7YbBETR+andEE="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:33CFMS860q9JZWmmwIjOnWsLbE4=
sha1:a3/TSNbwRePKMhi8WOpkGryMJe4=
 by: Tim Rentsch - Sat, 23 Dec 2023 18:30 UTC

red floyd <no.spam.here@its.invalid> writes:

> On 12/13/2023 7:16 AM, Vir Campestris wrote:
>
>> On 11/12/2023 17:19, Bonita Montero wrote:
>>
>>> Am 11.12.2023 um 18:12 schrieb Vir Campestris:
>>>
>>>> Because all primes except 2 are odd you don't need to store the
>>>> _even_ numbers, which is what I meant to say. Or else you only
>>>> need to store the odd ones.
>>>
>>> I'll try that the next days; but I don't expect a significant change.
>>
>> Well, I went away and tried it. I didn't write anything nearly as
>> sophisticated as yours - I didn't worry about threads and
>> caching. The code follows. You'll have to unwrap it in a few places.
>>
>> It occurred to me I could go further - not just optimise it out to
>> store odd numbers only, but also miss out the ones divisible by 3,
>> or other higher primes. The results were:
>> - Storing all numbers: 9.494 seconds to run the sieve
>> - Storing odd numbers only: 4.455. That's over twice as fast.
>> - Excluding also the ones divisible by 3: 3.468. That's an
>> improvement, but not as much as I expected. Perhaps it's got coo
>> complicated. I don't have a high powered PC.
>
> Another way to attempt to optimize is that except for 2 and 3, all
> primes are of the form 6n+1 or 6n-1.

A more compact form is to consider numbers mod 30, in which case
numbers that are 0 mod {2, 3, or 5} can be ignored. Conveniently
there are just 8 such values - 1, 7, 11, 13, 17, 19, 23, and 29 -
so each block of 30 can be represented in one 8-bit byte. Doing
that gives a 25% reduction in space compared to a 6n+/-1 scheme.

Re: Sieve of Erastosthenes optimized to the max

<um7haa$274oh$3@dont-email.me>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2959&group=comp.lang.c%2B%2B#2959

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m.thomasson.1@gmail.com (Chris M. Thomasson)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Sat, 23 Dec 2023 12:52:25 -0800
Organization: A noiseless patient Spider
Lines: 15
Message-ID: <um7haa$274oh$3@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<um2dsb$17vgg$2@dont-email.me>
<um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>
<um31o0$1edq3$1@dont-email.me>
<um4f1l$1l1c2$1@raubtier-asyl.eternal-september.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 23 Dec 2023 20:52:26 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="ec15cd74988e47dd581b59265d205ee1";
logging-data="2331409"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/E9Tga76ZQs40AvmT2S2n1iDmuxTw/6Aw="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:nH4bvOdMu5QDtMxTY27uxAGKZPU=
Content-Language: en-US
In-Reply-To: <um4f1l$1l1c2$1@raubtier-asyl.eternal-september.org>
 by: Chris M. Thomasson - Sat, 23 Dec 2023 20:52 UTC

On 12/22/2023 8:55 AM, Bonita Montero wrote:
> Am 22.12.2023 um 05:02 schrieb Chris M. Thomasson:
>
>> Fine. I think alignas auto packs. I need to to bust out the most
>> recent C++ mode on my compilers to revisit it. Actually, I need to do
>> it anyway. The fun part is that I have old code that is interesting,
>> well before alignas:
>
> False-sharing woudln't hurt my algorithm much since only the beginning
> and the end of the thread-local segment overlaps with other thread; but
> I do it anyway to have maximum performance.

That's is a good habit to get into. :^)

Re: Sieve of Erastosthenes optimized to the max

<um7iva$27ikh$1@dont-email.me>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2960&group=comp.lang.c%2B%2B#2960

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: vir.campestris@invalid.invalid (Vir Campestris)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Sat, 23 Dec 2023 21:20:42 +0000
Organization: A noiseless patient Spider
Lines: 11
Message-ID: <um7iva$27ikh$1@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<ul5bn0$2qsht$2@dont-email.me>
<ul5ut3$31a17$1@raubtier-asyl.eternal-september.org>
<ul7ft1$3853g$1@dont-email.me>
<ul7gal$3885r$1@raubtier-asyl.eternal-september.org>
<ulchs1$t3ph$1@dont-email.me> <ulfa02$1er97$1@redfloyd.dont-email.me>
<86edfcvkjm.fsf@linuxsc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 23 Dec 2023 21:20:42 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="1ef8412645e9256ce98af330d3194132";
logging-data="2345617"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19kZdzmfS5VjWtEn6BZi8hatmOuXT9kTGc="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:cVu6Z+nkMRoOiWq4Iue6fy3mXww=
Content-Language: en-GB
In-Reply-To: <86edfcvkjm.fsf@linuxsc.com>
 by: Vir Campestris - Sat, 23 Dec 2023 21:20 UTC

On 23/12/2023 18:30, Tim Rentsch wrote:
> A more compact form is to consider numbers mod 30, in which case
> numbers that are 0 mod {2, 3, or 5} can be ignored. Conveniently
> there are just 8 such values - 1, 7, 11, 13, 17, 19, 23, and 29 -
> so each block of 30 can be represented in one 8-bit byte. Doing
> that gives a 25% reduction in space compared to a 6n+/-1 scheme.

I found that on my system the modulo operation was so slow this wasn't
worth doing.

Andy

Re: Sieve of Erastosthenes optimized to the max

<um7j0h$27ikh$2@dont-email.me>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2961&group=comp.lang.c%2B%2B#2961

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: vir.campestris@invalid.invalid (Vir Campestris)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Sat, 23 Dec 2023 21:21:21 +0000
Organization: A noiseless patient Spider
Lines: 16
Message-ID: <um7j0h$27ikh$2@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<ul5bn0$2qsht$2@dont-email.me>
<ul5ut3$31a17$1@raubtier-asyl.eternal-september.org>
<ul7ft1$3853g$1@dont-email.me>
<ulunk0$iacr$1@raubtier-asyl.eternal-september.org>
<um1lm5$14br5$1@dont-email.me> <86il4ovkz3.fsf@linuxsc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 23 Dec 2023 21:21:21 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="1ef8412645e9256ce98af330d3194132";
logging-data="2345617"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18+5fFymIyYq9pXYMpMYwRxqNbAW9EoDPQ="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:NonLJ2HeCkHGqGybvg0zuw05XNo=
In-Reply-To: <86il4ovkz3.fsf@linuxsc.com>
Content-Language: en-GB
 by: Vir Campestris - Sat, 23 Dec 2023 21:21 UTC

On 23/12/2023 18:21, Tim Rentsch wrote:
> Vir Campestris <vir.campestris@invalid.invalid> writes:
>
>> On my system my code takes about 20 seconds to produce 1e9
>> primes. [...]
>
> Do you mean 1e9 primes or just the primes less than 1e9? To
> do the first 1e9 primes a sieve would need to go up to about
> 23.9e9 (so half that many bits if only odd numbers were
> represented).

Primes up to 1e9.

I have another idea though, watch this space...

Andy

Re: Sieve of Erastosthenes optimized to the max

<86a5q0uhd1.fsf@linuxsc.com>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2963&group=comp.lang.c%2B%2B#2963

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Sun, 24 Dec 2023 00:36:42 -0800
Organization: A noiseless patient Spider
Lines: 50
Message-ID: <86a5q0uhd1.fsf@linuxsc.com>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org> <ul5bn0$2qsht$2@dont-email.me> <ul5ut3$31a17$1@raubtier-asyl.eternal-september.org> <ul7ft1$3853g$1@dont-email.me> <ul7gal$3885r$1@raubtier-asyl.eternal-september.org> <ulchs1$t3ph$1@dont-email.me> <ulfa02$1er97$1@redfloyd.dont-email.me> <86edfcvkjm.fsf@linuxsc.com> <um7iva$27ikh$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="e91b714cfd3f51772707cd580185f87c";
logging-data="2640454"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18653vuyhC0XQlBvlzbrV8p5m4vElefOEk="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:WmDrc1q/bl46P8FXGL88MxhdTxc=
sha1:uxfVDNveoH1P1f66t4eETrVYTUs=
 by: Tim Rentsch - Sun, 24 Dec 2023 08:36 UTC

Vir Campestris <vir.campestris@invalid.invalid> writes:

> On 23/12/2023 18:30, Tim Rentsch wrote:
>
>> A more compact form is to consider numbers mod 30, in which case
>> numbers that are 0 mod {2, 3, or 5} can be ignored. Conveniently
>> there are just 8 such values - 1, 7, 11, 13, 17, 19, 23, and 29 -
>> so each block of 30 can be represented in one 8-bit byte. Doing
>> that gives a 25% reduction in space compared to a 6n+/-1 scheme.
>
> I found that on my system the modulo operation was so slow this wasn't
> worth doing.

Depending on how the code is written, no modulo operations need
to be done, because they will be optimized away and done at
compile time. If we look at multiplying two numbers represented
by bits in bytes i and j, the two numbers are

i*30 + a
j*30 + b

for some a and b in { 1, 7, 11, 13 17, 19, 23, 29 }.

The values we're interested in are the index of the product and
the residue of the product, namely

(i*30+a) * (j*30+b) / 30 (for the index)
(i*30+a) * (j*30+b) % 30 (for the residue)

Any term with a *30 in the numerator doesn't contribute to
the residue, and also can be combined with the by-30 divide
for computing the index. Thus these expressions can be
rewritten as

i*30*j + i*b + j*a + (a*b/30) (for the index)
a*b%30 (for the residue)

When a and b have values that are known at compile time,
neither the divide nor the remainder result in run-time
operations being done; all of that heavy lifting is
optimized away and done at compile time. Of course there
are some multiplies, but they are cheaper than divides, and
also can be done in parallel. (The multiplication a*b also
can be done at compile time.)

The residue needs to be turned into a bit mask to do the
logical operation on the byte of bits, but here again that
computation can be optimized away and done at compile time.

Does that all make sense?

Re: Sieve of Erastosthenes optimized to the max

<um8vlp$2h8oc$1@raubtier-asyl.eternal-september.org>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2964&group=comp.lang.c%2B%2B#2964

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!raubtier-asyl.eternal-september.org!.POSTED!not-for-mail
From: Bonita.Montero@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Sun, 24 Dec 2023 11:03:37 +0100
Organization: A noiseless patient Spider
Lines: 14
Message-ID: <um8vlp$2h8oc$1@raubtier-asyl.eternal-september.org>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<um2dsb$17vgg$2@dont-email.me>
<um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>
<um31o0$1edq3$1@dont-email.me>
<um4f1l$1l1c2$1@raubtier-asyl.eternal-september.org>
<um7haa$274oh$3@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 24 Dec 2023 10:03:37 -0000 (UTC)
Injection-Info: raubtier-asyl.eternal-september.org; posting-host="d22b4bfa57955bf45d3f20f1cb3746a0";
logging-data="2663180"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+atPIhSnHzW6hxVNd3chRvi7APUmSs83c="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:BohbBszU0OgzEe11bvs3umHwUqk=
Content-Language: de-DE
In-Reply-To: <um7haa$274oh$3@dont-email.me>
 by: Bonita Montero - Sun, 24 Dec 2023 10:03 UTC

Am 23.12.2023 um 21:52 schrieb Chris M. Thomasson:

> On 12/22/2023 8:55 AM, Bonita Montero wrote:

>> False-sharing woudln't hurt my algorithm much since only the beginning
>> and the end of the thread-local segment overlaps with other thread; but
>> I do it anyway to have maximum performance.

> That's is a good habit to get into. :^)

I experimentally removed the masking of the lower bits of the
partition bounds according to the cacheline size and there was
no measurable performance-loss.

Re: Sieve of Erastosthenes optimized to the max

<865y0nv3jk.fsf@linuxsc.com>

  copy mid

https://www.rocksolidbbs.com/devel/article-flat.php?id=2965&group=comp.lang.c%2B%2B#2965

  copy link   Newsgroups: comp.lang.c++
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17687@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Sun, 24 Dec 2023 10:49:51 -0800
Organization: A noiseless patient Spider
Lines: 20
Message-ID: <865y0nv3jk.fsf@linuxsc.com>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org> <ul5bn0$2qsht$2@dont-email.me> <ul5ut3$31a17$1@raubtier-asyl.eternal-september.org> <ul7ft1$3853g$1@dont-email.me> <ulunk0$iacr$1@raubtier-asyl.eternal-september.org> <um1lm5$14br5$1@dont-email.me> <86il4ovkz3.fsf@linuxsc.com> <um7j0h$27ikh$2@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="e91b714cfd3f51772707cd580185f87c";
logging-data="2818417"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX184l9H8QNZOupemAcZo9Zi7QzqBThFXs1g="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:eeWUq6/R/OhTOm/UK+gDp85eSQU=
sha1:IkkWL/x7S+6lYzzqXVjWGKtSntU=
 by: Tim Rentsch - Sun, 24 Dec 2023 18:49 UTC

Vir Campestris <vir.campestris@invalid.invalid> writes:

> On 23/12/2023 18:21, Tim Rentsch wrote:
>
>> Vir Campestris <vir.campestris@invalid.invalid> writes:
>>
>>> On my system my code takes about 20 seconds to produce 1e9
>>> primes. [...]
>>
>> Do you mean 1e9 primes or just the primes less than 1e9? To
>> do the first 1e9 primes a sieve would need to go up to about
>> 23.9e9 (so half that many bits if only odd numbers were
>> represented).
>
> Primes up to 1e9.
>
> I have another idea though, watch this space...

Does your have enough memory to compute all the primes up
to 24e9? If it does I suggest that for your next milestone.

Pages:123456
server_pubkey.txt

rocksolid light 0.9.8
clearnet tor