The nicest thing about the Alto is that it doesn't run faster at night.

Unfortunately C++17's std::from_chars doesn't support wide characters.
So I implemented my own from_chars, called parse_double, generically
as a template that can handle any ASCII-based character set whose
characters are integers.
Another requirement for me was that the code should support
zero-terminated strings as well as flat memory ranges with a beginning
and an end. This is done with a function object which determines
whether an iterator points to a terminating character or address.
The default type for this function object is
decltype(parse_never_ends), which never reports an end. Any character
that is invalid for the textual value then acts as a terminator, and
since '\0' is always invalid in a floating-point value, parsing stops
there: not because the end function object reports an end, but
because the parser can't find any further valid characters.
There's also an overload of parse_double which is given a start and
an end iterator. This overload internally has its own end function
object which compares the current reading position against the end
iterator.
My code scans the digits after the decimal point first and stores
their values in a thread_local vector of doubles whose capacity only
grows across multiple calls of my function, for optimal performance.
To maximize performance I'm using my own union around a double to
suppress default-initialization of the vector's elements. The digits
are appended until their weight underflows to zero or there are no
further digits.
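
The scratch-buffer idea can be looked at on its own. The following is
only a sketch: scratch is a hypothetical helper, and whether the
compiler really skips zeroing the elements depends on the
implementation. The union has the same shape as the one in my code.

```cpp
#include <cstddef>
#include <vector>

// The empty user-provided constructor means that value-initializing an
// element (as vector::resize does) leaves d untouched instead of zeroing it.
union ndi_dbl
{
	double d;
	ndi_dbl() {}
};
static_assert(sizeof(ndi_dbl) == sizeof(double));

// Hypothetical helper: hands out a per-thread buffer with n slots.
// The slots are uninitialized and must be written before they are read.
inline std::vector<ndi_dbl> &scratch( std::size_t n )
{
	thread_local std::vector<ndi_dbl> buf;
	// shrinking keeps the allocation, so later calls rarely reallocate
	buf.resize( n );
	return buf;
}
```

After scratch( 1000 ) a later scratch( 10 ) on the same thread returns
the same vector with size 10 and the larger allocation still in place.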
The values in the suffix vector are added in reverse order. If I
added them in forward order the precision would be lower, since
low-order mantissa bits of the small values would be dropped when
they're added to the larger running sum. That's why there's the
vector of doubles.
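
The effect of the summation order is easy to reproduce with a small
sketch. sum_forward, sum_reverse and sample_terms are hypothetical
names, and the sample data is chosen to make the effect obvious rather
than taken from the parser: one large term followed by ten terms that
are each smaller than half an ulp of 1.0.

```cpp
#include <cstddef>
#include <vector>

// adds the terms from the first to the last
inline double sum_forward( std::vector<double> const &v )
{
	double s = 0.0;
	for( std::size_t i = 0; i < v.size(); ++i )
		s += v[i];
	return s;
}

// adds the terms from the last to the first
inline double sum_reverse( std::vector<double> const &v )
{
	double s = 0.0;
	for( std::size_t i = v.size(); i-- > 0; )
		s += v[i];
	return s;
}

// 1.0 followed by ten times 1e-16
inline std::vector<double> sample_terms()
{
	std::vector<double> v{ 1.0 };
	v.insert( v.end(), 10, 1e-16 );
	return v;
}
```

With IEEE-754 doubles and round-to-nearest, the forward sum stays at
exactly 1.0 because every single 1e-16 is rounded away, while the
reverse sum first accumulates the small terms to about 1e-15 and ends
up above 1.0.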
Each digit's value is multiplied by a 10 ^ N value. This value is
calculated incrementally by successive / 10.0 or * 10.0 operations.
These successive calculations may lead to less precision than if
the value is calculated for each digit with binary exponentiation.
So there's a precise mode in my code which is activated through the
first template parameter of my function, which defaults to false.
With that, each digit's 10 ^ N value is calculated with binary
exponentiation, but this also gives less performance. With my test
code this gives up to four bits of additional precision.
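
For reference, binary exponentiation of 10 can be sketched in a few
lines. This is not the pow10 lambda from my code: pow10_binary is a
hypothetical name, it walks the exponent from the lowest bit upwards
and squares on the fly instead of using a precalculated table.

```cpp
#include <cstdint>

// 10 ^ exp with O(log |exp|) multiplications (square-and-multiply)
inline double pow10_binary( std::int64_t exp )
{
	// magnitude of the exponent, calculated in unsigned arithmetic
	std::uint64_t n = exp >= 0 ? std::uint64_t( exp ) : std::uint64_t( 0 ) - std::uint64_t( exp );
	// begin with 1.0 since x ^ 0 = 1
	double result = 1.0;
	// holds 10 ^ (2 ^ k) in iteration k
	double square = 10.0;
	for( ; n; n >>= 1, square *= square )
		// bit k set: this power of ten is part of the product
		if( n & 1 )
			result *= square;
	// negative exponent: one final division
	return exp >= 0 ? result : 1.0 / result;
}
```

For an exponent of 22 this needs three multiplications of the result
(by 1e2, 1e4 and 1e16), each of them exact, whereas a cascade of
* 10.0 needs 22 steps, and for larger exponents each of those steps is
a chance for another rounding error.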

So here's the code:

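#include <array>
#include <atomic>
#include <bit>
#include <concepts>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <limits>
#include <memory>
#include <mutex>
#include <span>
#include <system_error>
#include <type_traits>
#include <vector>

template<std::random_access_iterator Iterator>
	requires std::integral<std::iter_value_t<Iterator>>
struct parse_result
{
	std::errc ec;
	Iterator next;
};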

// parse ends at first invalid character, at least at '\0'
auto parse_never_ends = []( auto ) { return false; };

template<bool Precise = false, std::random_access_iterator Iterator, typename End = decltype(parse_never_ends)>
	requires std::integral<std::iter_value_t<Iterator>>
		&& requires( End end, Iterator it ) { { end( it ) } -> std::convertible_to<bool>; }
parse_result<Iterator> parse_double( Iterator str, double &result, End end = End() )
{
	using namespace std;
	static_assert(sizeof(double) == sizeof(uint64_t) && numeric_limits<double>::is_iec559,
		"double must be IEEE-754 double precision");
	// mask to a double's exponent
	constexpr uint64_t EXP_MASK = 0x7FFull << 52;
	// calculate 10 ^ exp in double
	auto pow10 = []( int64_t exp ) -> double
	{
		// table for binary exponentiation with 10 ^ (2 ^ N)
		static array<double, 64> tenPows;
		// table initialized ?
		if( static atomic_bool once( false ); !once.load( memory_order_acquire ) )
		{
			// weakly no: test again while locked
			static mutex onceMtx;
			lock_guard lock( onceMtx );
			if( !once.load( memory_order_relaxed ) )
			{
				// no: calculate table
				for( double p10x2xN = 10.0; double &pow : tenPows )
					pow = p10x2xN,
					p10x2xN *= p10x2xN;
				// set initialized flag with release semantics
				once.store( true, memory_order_release );
			}
		}
		// begin with 1.0 since x ^ 0 = 1
		double result = 1.0;
		// unsigned exponent
		uint64_t uExp = exp >= 0 ? exp : -exp;
		// zero exponent ?
		if( !uExp )
			// yes: 10 ^ 0; countl_zero( 0 ) == 64 would make the shift below invalid
			return result;
		// highest set bit of exponent
		size_t bit = 63 - countl_zero( uExp );
		// bit mask to highest set bit
		uint64_t mask = 1ull << bit;
		// loop as long as there are bits in unsigned exponent
		for( ; uExp; uExp &= ~mask, mask >>= 1, --bit )
			// bit set ?
			if( uExp & mask )
			{
				// yes: multiply result by 10 ^ (2 ^ bit)
				result *= tenPows[bit];
				// overflow ?
				if( (bit_cast<uint64_t>( result ) & EXP_MASK) == EXP_MASK )
					// yes: result wouldn't change further; stop
					break;
			}
		// return 1 / result if exponent is negative
		return exp >= 0 ? result : 1.0 / result;
	};
	Iterator scn = str;
	// case-insensitive compare of a string of arbitrary character width
	// with a lowercase C string
	auto xstricmp = [&]( Iterator str, char const *second ) -> bool
	{
		// unsigned character-type
		using uchar_t = make_unsigned_t<iter_value_t<Iterator>>;
		// map 'A' ... 'Z' to 'a' ... 'z', leave everything else as it is
		auto toLower = []( uchar_t c ) -> uchar_t
		{
			return c + (c >= 'A' && c <= 'Z' ? 'a' - 'A' : 0);
		};
		for( ; ; ++str, ++second )
			if( !*second ) [[unlikely]]
				return true;
			else if( end( str ) ) [[unlikely]]
				return false;
			else if( toLower( *str ) != (unsigned char)*second ) [[unlikely]]
				return false;
	};
	// at end ?
	if( end( scn ) )
		// yes: err
		return { errc::invalid_argument, scn };
	// double's binary representation sign
	uint64_t binSign = 0;
	// positive sign ?
	if( *scn == '+' ) [[unlikely]]
	{
		// at end ?
		if( end( ++scn ) ) [[unlikely]]
			// yes: err
			return { errc::invalid_argument, str };
	}
	// negative sign ?
	else if( *scn == '-' )
	{
		// yes: remember sign
		binSign = 1ull << 63;
		// at end ?
		if( end( ++scn ) )
			// yes: err
			return { errc::invalid_argument, str };
	}
	// apply binSign to a double
	auto applySign = [&]( double d )
	{
		return bit_cast<double>( binSign | bit_cast<uint64_t>( d ) );
	};
	// NaN ?
	if( xstricmp( scn, "nan" ) ) [[unlikely]]
	{
		// yes: apply sign to NaN
		result = applySign( numeric_limits<double>::quiet_NaN() );
		return { errc(), scn + 3 };
	}
	// SNaN ?
	if( xstricmp( scn, "snan" ) ) [[unlikely]]
	{
		// yes: apply sign to NaN
		result = applySign( numeric_limits<double>::signaling_NaN() );
		return { errc(), scn + 4 };
	}
	// Inf ?
	if( xstricmp( scn, "inf" ) ) [[unlikely]]
	{
		// yes: apply sign to Inf
		result = applySign( numeric_limits<double>::infinity() );
		return { errc(), scn + 3 };
	}
	// begin of prefix (digits before the decimal point)
	Iterator prefixBegin = scn;
	while( *scn >= '0' && *scn <= '9' && !end( ++scn ) );
	Iterator
		// end of prefix
		prefixEnd = scn,
		// begin and end of suffix initially empty
		suffixBegin = scn,
		suffixEnd = scn;
	// has decimal point for suffix ?
	if( !end( scn ) && *scn == '.' )
	{
		// suffix begin
		suffixBegin = ++scn;
		for( ; !end( scn ) && *scn >= '0' && *scn <= '9'; ++scn );
		// suffix end
		suffixEnd = scn;
	}
	// prefix and suffix empty ?
	if( prefixBegin == prefixEnd && suffixBegin == suffixEnd ) [[unlikely]]
		// yes: err
		return { errc::invalid_argument, str };
	// exponent initially zero
	int64_t exp = 0;
	// has 'e' for exponent ?
	if( !end( scn ) && (*scn == 'e' || *scn == 'E') )
	{
		// yes: scan exponent
		// (parse_int is the integer counterpart of this function and is not part of this post)
		if( auto [ec, next] = parse_int( ++scn, exp, end ); ec == errc() ) [[likely]]
			// succeeded: remember end of exponent
			scn = next;
		else
			// failed: 'e' without actual exponent
			return { ec, scn };
	}
	// number of digits which end up right of the decimal point
	// after the exponent has been applied
	size_t nSuffixes;
	if( exp >= 0 ) [[likely]]
		// decimal point moves right, into or beyond the suffix
		if( suffixEnd - suffixBegin - exp > 0 ) [[likely]]
			// decimal point lands within the suffix
			nSuffixes = suffixEnd - suffixBegin - (ptrdiff_t)exp;
		else
			// there are no suffixes
			nSuffixes = 0;
	else
		if( prefixEnd - prefixBegin + exp >= 0 ) [[likely]]
			// decimal point lands within the prefix
			nSuffixes = suffixEnd - suffixBegin - (ptrdiff_t)exp;
		else
			// there are no prefixes, all digits are suffixes
			nSuffixes = suffixEnd - suffixBegin + (prefixEnd - prefixBegin);
	// have non-default initialized doubles to save CPU-time
	union ndi_dbl { double d; ndi_dbl() {} };
	// thread-local vector with suffixes
	thread_local vector<ndi_dbl> ndiSuffixDbls;
	// resize suffixes vector, capacity will stick to the maximum
	ndiSuffixDbls.resize( nSuffixes );
	// have range checking with suffixes on debugging
	span suffixDbls( &to_address( ndiSuffixDbls.begin() )->d, &to_address( ndiSuffixDbls.end() )->d );
	// iterator after last suffix
	auto suffixDblsEnd = suffixDbls.begin();
	double digMul;
	int64_t nextExp;
	auto suffix = [&]( Iterator first, Iterator end )
	{
		while( first != end ) [[likely]]
		{
			// if we're having maximum precision calculate digMul with pow10
			// for every iteration
			if constexpr( Precise )
				digMul = pow10( nextExp-- );
			// pow10-value of digit becomes zero ?
			if( !bit_cast<uint64_t>( digMul ) )
				// yes: no further suffix digits to calculate
				return false;
			// append suffix double
			*suffixDblsEnd++ = (int)(*first++ - '0') * digMul;
			// if we're having less precision calculate digMul cascaded
			if constexpr( !Precise )
				digMul /= 10.0;
		}
		// further suffix digits to calculate
		return true;
	};
	// flag that signals that there are suffix digits beyond those
	// taken from the prefix
	bool furtherSuffix;
	if( exp >= 0 ) [[likely]]
		// there's no suffix in prefix
		nextExp = -1,
		digMul = 1.0 / 10.0,
		furtherSuffix = true;
	else
	{
		// there's suffix in prefix
		Iterator suffixInPrefixBegin;
		if( prefixEnd - prefixBegin + exp >= 0 )
			// suffix begins within prefix
			suffixInPrefixBegin = prefixEnd + (ptrdiff_t)exp,
			nextExp = -1,
			digMul = 1.0 / 10.0;
		else
		{
			// suffix begins before prefix
			suffixInPrefixBegin = prefixBegin;
			nextExp = (ptrdiff_t)exp + (prefixEnd - prefixBegin) - 1;
			if constexpr( !Precise )
				digMul = pow10( nextExp );
		}
		furtherSuffix = suffix( suffixInPrefixBegin, prefixEnd );
	}
	if( furtherSuffix && exp < suffixEnd - suffixBegin )
		// there's suffix in suffix
		if( exp <= 0 )
			// (remaining) suffix begins at suffix begin
			suffix( suffixBegin, suffixEnd );
		else
			// suffix begins at exp in suffix
			suffix( suffixBegin + (ptrdiff_t)exp, suffixEnd );
	result = 0.0;
	// add suffixes from the tail to the beginning
	for( ; suffixDblsEnd != suffixDbls.begin(); result += *--suffixDblsEnd );
	// add prefix digits from end reverse to first
	auto prefix = [&]( Iterator end, Iterator first )
	{
		while( end != first ) [[likely]]
		{
			// if we're having maximum precision calculate digMul with pow10
			// for every iteration
			if constexpr( Precise )
				digMul = pow10( nextExp++ );
			// pow10-value of digit becomes infinite ?
			if( (bit_cast<uint64_t>( digMul ) & EXP_MASK) == EXP_MASK ) [[unlikely]]
			{
				// yes: infinite result, no further prefix digits to calculate
				result = numeric_limits<double>::infinity();
				return false;
			}
			// add digit to result
			result += (int)(*--end - '0') * digMul;
			// if we're having less precision calculate digMul cascaded
			if constexpr( !Precise )
				digMul *= 10.0;
		}
		return true;
	};
	// flag that signals that prefix digits are finite so far, i.e. not Inf
	bool prefixFinite = true;
	if( !exp ) [[likely]]
		// prefix ends at suffix
		nextExp = 0,
		digMul = 1.0;
	else if( exp > 0 ) [[likely]]
	{
		// there's prefix in suffix
		Iterator prefixInSuffixEnd;
		if( exp <= suffixEnd - suffixBegin )
			// prefix ends within suffix
			prefixInSuffixEnd = suffixBegin + (ptrdiff_t)exp,
			nextExp = 0,
			digMul = 1.0;
		else
		{
			// prefix ends after suffix end
			prefixInSuffixEnd = suffixEnd;
			nextExp = exp - (suffixEnd - suffixBegin);
			if constexpr( !Precise )
				digMul = pow10( nextExp );
		}
		prefixFinite = prefix( prefixInSuffixEnd, suffixBegin );
	}
	else if( exp < 0 )
	{
		// prefix ends before suffix: the last remaining prefix digit
		// is the ones digit, so its weight is 10 ^ 0
		nextExp = 0;
		digMul = 1.0;
	}
	if( prefixFinite && prefixEnd - prefixBegin + exp > 0 ) [[likely]]
		// prefix has prefix
		if( exp >= 0 ) [[likely]]
			// there's full prefix in prefix
			prefixFinite = prefix( prefixEnd, prefixBegin );
		else
			// remaining prefix is within prefix
			prefixFinite = prefix( prefixEnd + (ptrdiff_t)exp, prefixBegin );
	if( !prefixFinite ) [[unlikely]]
	{
		// a digit's weight became Inf, so the result is out of range:
		// make result +/-Inf
		result = bit_cast<double>( binSign | EXP_MASK );
		return { errc::result_out_of_range, scn };
	}
	result = applySign( result );
	return { errc(), scn };
}

template<bool Precise = false, std::random_access_iterator Iterator>
	requires std::integral<std::iter_value_t<Iterator>>
parse_result<Iterator> parse_double( Iterator str, Iterator end, double &result )
{
	return parse_double<Precise>( str, result, [&]( Iterator it ) { return it == end; } );
}

Subject	Replies	Author
Ain't that beautiful / 2 By: Bonita Montero on Wed, 10 Apr 2024	3	Bonita Montero

