[ANN] Release of UXStrings 0.5.0

From: p.p11@orange.fr (Blady)
Newsgroups: comp.lang.ada
Subject: [ANN] Release of UXStrings 0.5.0
Date: Fri, 5 May 2023 05:36:42 +0200
Message-ID: <u31tka$26jou$1@dont-email.me>
 by: Blady - Fri, 5 May 2023 03:36 UTC

UXStrings, an Ada library providing Unicode character strings of dynamic
length, gains a third implementation: UXStrings3 [1], also available on
Alire [2]. In this implementation, characters are stored as Unicode code
points and the dynamic size is managed by the standard
Ada.Strings.Wide_Wide_Unbounded library.

Performance with Gnoga [3] is better. UXStrings2 already improved
performance for strings made up only of ASCII characters (by a factor of
2 to 3 compared to UXStrings1). With UXStrings3, performance in that
case improves further (by a factor of 6 to 7 compared to UXStrings1).
Moreover, for French strings with accented characters and for strings
containing emojis, processing times also improve (by a factor of 7 to 8
compared to UXStrings1, or even more in the case of emojis).

In all cases, the overall memory footprint of the Gnoga application is
broadly similar (9 to 10 MB). The memory used by UXStrings3 is
negligible compared to that of the server engine implemented in Gnoga.

Case study: the AdaEdit application, built on the Gnoga graphics library,
with UTF-8 files:
English: 315 kB
French: 447 kB
Emojis: 439 kB
Process: read all lines of the given file and display the full text.

Regardless of the implementation chosen, the appeal of a library rests
mainly on the capabilities it offers (its API). So far in UXStrings,
these are similar to those of the standard Ada string libraries. If you
find something missing, please make your proposals on GitHub [4].

Pascal.

[1] https://github.com/Blady-Com/UXStrings/blob/master/src/uxstrings3.ads
[2] https://alire.ada.dev/crates/uxstrings.html
[3] https://sourceforge.net/projects/gnoga
[4] https://github.com/Blady-Com/UXStrings/issues

Re: [ANN] Release of UXStrings 0.5.0

Newsgroups: comp.lang.ada
Date: Thu, 29 Jun 2023 01:49:12 -0700 (PDT)
In-Reply-To: <u31tka$26jou$1@dont-email.me>
References: <u31tka$26jou$1@dont-email.me>
Message-ID: <0162bf97-8a37-4244-a368-1bf7ae00077bn@googlegroups.com>
Subject: Re: [ANN] Release of UXStrings 0.5.0
From: vincent.diemunsch@gmail.com (Vincent D.)
 by: Vincent D. - Thu, 29 Jun 2023 08:49 UTC

Hello Pascal,

Thank you for this contribution. Here are some comments:
- Since UTFString is a class ("a tagged record type"), why don't you create an abstract root "UXString" and then derive specialized object types, such as UTF_8_XString, UTF_16_XString, ASCII_XString, Win_1252_XString, Latin_XString, etc.?
- The default format to convert between different encodings should be UTF-8 as it is now ubiquitous.
> [...] Moreover, for French strings with accented characters and for strings containing emojis, processing times also improve (by a factor of 7 to 8 compared to UXStrings1
- I find it quite astonishing to see a factor of 8 compared to UTF-8 encoding. Do you have an explanation? This looks like a poor implementation, because UTF-8 encoding is fast and allows direct manipulation in most cases. Maybe it is because random access is treated as sequential access for UTF-8 encoded strings, but that again would be a poor implementation.

Kind regards,

Vincent

Re: [ANN] Release of UXStrings 0.5.0

From: p.p11@orange.fr (Blady)
Newsgroups: comp.lang.ada
Subject: Re: [ANN] Release of UXStrings 0.5.0
Date: Sat, 1 Jul 2023 16:41:37 +0200
Message-ID: <u7pdv2$2v1de$1@dont-email.me>
References: <u31tka$26jou$1@dont-email.me>
<0162bf97-8a37-4244-a368-1bf7ae00077bn@googlegroups.com>
In-Reply-To: <0162bf97-8a37-4244-a368-1bf7ae00077bn@googlegroups.com>
 by: Blady - Sat, 1 Jul 2023 14:41 UTC

Hello Vincent,

On 29/06/2023 at 10:49, Vincent D. wrote:
> Hello Pascal,
>
> Thank you for this contribution. Here are some comments:
> - Since UTFString is a class ("a tagged record type"), why don't you create an abstract root "UXString" and then derive specialized object types, such as UTF_8_XString, UTF_16_XString, ASCII_XString, Win_1252_XString, Latin_XString, etc.?

Well, that's a possibility chosen in some other Ada string libraries.
I preferred to keep the API for the legacy Ada "string" types close to
that of the Ada library, so that adaptation would be easy.
These types are not intended to be used outside legacy code adaptation.
Note that I have renamed them as character arrays rather than strings in
order to emphasize the semantic difference.

> - The default format to convert between different encodings should be UTF-8 as it is now ubiquitous.

Conversions are between UXString and encodings, not between encodings.
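
One way to read this, sketched in Python under the assumption of a hypothetical UString class that stores code points (it is not the UXStrings API): every conversion goes to or from the internal form, and an encoding-to-encoding conversion is simply two such steps.

```python
class UString:
    """Hypothetical encoding-agnostic string that holds code points only."""

    def __init__(self, code_points):
        self._code_points = list(code_points)

    @classmethod
    def from_encoded(cls, data: bytes, encoding: str) -> "UString":
        # Decode once on input; the encoding is not remembered afterwards.
        return cls(ord(ch) for ch in data.decode(encoding))

    def to_encoded(self, encoding: str) -> bytes:
        # Encode on output only; raises UnicodeEncodeError if a code point
        # cannot be represented in the target encoding.
        return "".join(chr(cp) for cp in self._code_points).encode(encoding)


# UTF-8 to Latin-1 is expressed as two conversions through the internal form.
utf8_bytes = "caf\u00e9".encode("utf-8")
latin1_bytes = UString.from_encoded(utf8_bytes, "utf-8").to_encoded("latin-1")
```

With this design there is no "default" interchange encoding to choose, since the string itself plays that role.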

>> [...] Moreover, for French strings with accented characters and for strings containing emojis, processing times also improve (by a factor of 7 to 8 compared to UXStrings1
> - I find it quite astonishing to see a factor of 8 compared to UTF-8 encoding. Do you have an explanation? This looks like a poor implementation, because UTF-8 encoding is fast and allows direct manipulation in most cases. Maybe it is because random access is treated as sequential access for UTF-8 encoded strings, but that again would be a poor implementation.

You got it: "most cases". Apart from complex implementations, if you
want to access a specific position you have to parse from the beginning
of the UTF-8 data, as UXStrings1 does.
UXStrings2 always computes whether the resulting data are all ASCII, in
which case access is direct.
UXStrings3 is internally like a Unicode array, so access is direct.
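
To make the three access strategies concrete, here is a minimal Python sketch. It is not UXStrings code: the function names are hypothetical, positions are 1-based as in Ada strings, and the all_ascii flag is assumed to have been computed when the string was built.

```python
from typing import Sequence


def utf8_index(data: bytes, position: int) -> str:
    """Sequential access: scan UTF-8 bytes from the start, counting characters."""
    if position < 1:
        raise IndexError(position)
    count = 0
    i = 0
    while i < len(data):
        lead = data[i]
        # The lead byte tells how many bytes the character occupies.
        if lead < 0x80:
            width = 1
        elif lead >> 5 == 0b110:
            width = 2
        elif lead >> 4 == 0b1110:
            width = 3
        elif lead >> 3 == 0b11110:
            width = 4
        else:
            raise ValueError("invalid UTF-8 lead byte")
        count += 1
        if count == position:
            return data[i:i + width].decode("utf-8")
        i += width
    raise IndexError(position)


def ascii_flag_index(data: bytes, all_ascii: bool, position: int) -> str:
    """Direct access when every byte is ASCII, otherwise fall back to scanning."""
    if all_ascii:
        if not 1 <= position <= len(data):
            raise IndexError(position)
        return chr(data[position - 1])
    return utf8_index(data, position)


def code_point_index(code_points: Sequence[int], position: int) -> str:
    """Direct access into an array holding one code point per element."""
    if not 1 <= position <= len(code_points):
        raise IndexError(position)
    return chr(code_points[position - 1])
```

For a string of n characters, the first function may walk up to n characters per access, while the other two index directly whenever their precondition holds.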

Best regards, Pascal.
