Re: Ada and Unicode

From: fantome.forums.tDeContes@free.fr.invalid (Thomas)
Newsgroups: comp.lang.ada
Subject: Re: Ada and Unicode
References: <607b5b20$0$27442$426a74cc@news.free.fr> <f9d91cb0-c9bb-4d42-a1a9-0cd546da436cn@googlegroups.com>
Date: Sun, 03 Apr 2022 18:51:56 +0200
Message-ID: <fantome.forums.tDeContes-079FD6.18515603042022@news.free.fr>

In article <f9d91cb0-c9bb-4d42-a1a9-0cd546da436cn@googlegroups.com>,
Vadim Godunko <vgodunko@gmail.com> wrote:

> On Sunday, April 18, 2021 at 1:03:14 AM UTC+3, DrPi wrote:

> > What's the way to manage Unicode correctly?
>
> Ada doesn't have good Unicode support. :( So you need to find a suitable
> set of "workarounds".
>
> There are a few different aspects of Unicode support that need to be
> considered:
>
> 1. Representation of string literals. If you want to use non-ASCII
>    characters in source code, you need to use the -gnatW8 switch, and it
>    will require the use of Wide_Wide_String everywhere.
> 2. Internal representation during application execution. You are forced to
>    use Wide_Wide_String at the previous step, so it will be UCS-4/UTF-32.
>
> It is hard to say that this is a reasonable set of features for the modern
> world.

I don't think Ada is missing that much to have good UTF-8 support.

The key point is to be able to initialize an
Ada.Strings.UTF_Encoding.UTF_8_String from a literal.
(Once you have that, trying to put a non-Latin-1 character into a
Standard.String will be an error, and I think that's fine :-) )

Does Ada 202x allow it?

If not, it would probably have been easier if the declaration were
   type UTF_8_String is new String;
instead of
   subtype UTF_8_String is String;

For subprograms it's quite easy: we just have to duplicate them with the
new type and mark the old ones as Obsolescent.

But now that "subtype UTF_8_String" exists, I don't know what we can do
for the type itself.
(Is choosing a new name the only way?)
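
To illustrate the difference, here is a minimal sketch.
Strict_UTF_8_String is a hypothetical name used only for this example; it
is not part of the standard library. With the existing subtype, any String
can be assigned to a UTF_8_String with no conversion and no check. With a
derived type, the compiler would require an explicit conversion:

```ada
with Ada.Strings.UTF_Encoding;
with Ada.Text_IO;

procedure Subtype_Vs_Derived is

   --  Hypothetical: what a distinct UTF-8 string type could look like.
   type Strict_UTF_8_String is new String;

   Plain : constant String := "abc";

   --  Legal today: UTF_8_String is only a subtype of String, so any
   --  String (valid UTF-8 or not) is accepted without a conversion.
   Loose : constant Ada.Strings.UTF_Encoding.UTF_8_String := Plain;

   --  With a derived type, mixing the two requires an explicit conversion.
   Strict : constant Strict_UTF_8_String := Strict_UTF_8_String (Plain);

   --  The following would be rejected at compile time (type mismatch):
   --  Wrong : constant Strict_UTF_8_String := Plain;

begin
   Ada.Text_IO.Put_Line (Loose);
   Ada.Text_IO.Put_Line (String (Strict));
end Subtype_Vs_Derived;
```

Note that a derived type only separates the two at the type level; it does
not by itself validate that the contents are well-formed UTF-8.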

> To fix some of the drawbacks of the current situation, we are developing
> a new text processing library, known as VSS.
>
> https://github.com/AdaCore/VSS

(Do you work at AdaCore?)

--
RAPID maintainer
http://savannah.nongnu.org/projects/rapid/

Re: Ada and Unicode

From: fantome.forums.tDeContes@free.fr.invalid (Thomas)
Newsgroups: comp.lang.ada
Subject: Re: Ada and Unicode
References: <607b5b20$0$27442$426a74cc@news.free.fr> <f9d91cb0-c9bb-4d42-a1a9-0cd546da436cn@googlegroups.com> <fantome.forums.tDeContes-079FD6.18515603042022@news.free.fr>
Date: Tue, 04 Apr 2023 02:02:03 +0200
Message-ID: <642b68fb$0$3206$426a34cc@news.free.fr>

In article
<fantome.forums.tDeContes-079FD6.18515603042022@news.free.fr>,
Thomas <fantome.forums.tDeContes@free.fr.invalid> wrote:

> In article <f9d91cb0-c9bb-4d42-a1a9-0cd546da436cn@googlegroups.com>,
> Vadim Godunko <vgodunko@gmail.com> wrote:
>
> > On Sunday, April 18, 2021 at 1:03:14 AM UTC+3, DrPi wrote:
>
> > > What's the way to manage Unicode correctly?
> >
> > Ada doesn't have good Unicode support. :( So you need to find a suitable
> > set of "workarounds".
> >
> > There are a few different aspects of Unicode support that need to be
> > considered:
> >
> > 1. Representation of string literals. If you want to use non-ASCII
> >    characters in source code, you need to use the -gnatW8 switch, and it
> >    will require the use of Wide_Wide_String everywhere.
> > 2. Internal representation during application execution. You are forced
> >    to use Wide_Wide_String at the previous step, so it will be
> >    UCS-4/UTF-32.
> >
> > It is hard to say that this is a reasonable set of features for the
> > modern world.
>
> I don't think Ada is missing that much to have good UTF-8 support.
>
> The key point is to be able to initialize an
> Ada.Strings.UTF_Encoding.UTF_8_String from a literal.
> (Once you have that, trying to put a non-Latin-1 character into a
> Standard.String will be an error, and I think that's fine :-) )
>
> Does Ada 202x allow it?

Hi!

I think I found quite a nice solution!
(after reading <t3lj44$fh5$1@dont-email.me> again)
(not tested yet)

It doesn't quite follow best practice, but it is:

- Ada 2012 compatible
- better than writing UTF-8 Ada source and then telling GNAT it is Latin-1
  (that way the compiler would take UTF_8_String for what it is, an array
  of octets, but it would not detect an invalid UTF-8 string, and if
  someone later tells it the source really is UTF-8, everything goes wrong)
- better than being limited to ASCII in string literals
- free of any need to explicitly declare a Wide_Wide_String: it is always
  implicit, lives for a very short time, and is AFAIK eligible for
  optimization

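with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

package UTF_Encoding is

   subtype UTF_8_String is Ada.Strings.UTF_Encoding.UTF_8_String;

   --  Encode also takes a defaulted Output_BOM parameter, so a plain
   --  "renames" does not conform; an expression function wraps it instead.
   function "+" (A : in Wide_Wide_String) return UTF_8_String is
     (Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode (A));

end UTF_Encoding;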

Then we can write:

with UTF_Encoding;

package User is

   use UTF_Encoding;

   My_String : UTF_8_String := +"Greek characters + smileys";

end User;

If you want to avoid "use UTF_Encoding;",
I think "use type UTF_Encoding.UTF_8_String;" doesn't work,
but this should work:

with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;

package UTF_Encoding is

   subtype UTF_8_String is Ada.Strings.UTF_Encoding.UTF_8_String;

   type Literals_For_UTF_8_String is new Wide_Wide_String;

   --  A derived type needs an explicit conversion before calling Encode,
   --  so this is an expression function rather than a renaming.
   function "+" (A : in Literals_For_UTF_8_String) return UTF_8_String is
     (Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode
        (Wide_Wide_String (A)));

end UTF_Encoding;

with UTF_Encoding;

package User is

   use type UTF_Encoding.Literals_For_UTF_8_String;

   My_String : UTF_Encoding.UTF_8_String
     := +"Greek characters + smileys";

end User;
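
For anyone who wants to try the second variant in a single file, here is a
self-contained sketch under a few assumptions of mine. The package is
nested inside a hypothetical procedure Literal_Demo. "+" is written as an
expression function, because Encode has an extra defaulted Output_BOM
parameter and expects a Wide_Wide_String, so it has to be wrapped with a
conversion. The non-ASCII characters are built with
Wide_Wide_Character'Val so that the file stays pure ASCII and does not
depend on -gnatW8; with that switch, a literal containing the characters
themselves should work the same way. "use type" is enough here because "+"
is a primitive operator of Literals_For_UTF_8_String, which it cannot be
for a mere subtype of String.

```ada
with Ada.Strings.UTF_Encoding.Wide_Wide_Strings;
with Ada.Text_IO;

procedure Literal_Demo is

   package UTF_Encoding is

      subtype UTF_8_String is Ada.Strings.UTF_Encoding.UTF_8_String;

      type Literals_For_UTF_8_String is new Wide_Wide_String;

      --  Convert to the parent type, then let the standard library encode.
      function "+" (A : Literals_For_UTF_8_String) return UTF_8_String is
        (Ada.Strings.UTF_Encoding.Wide_Wide_Strings.Encode
           (Wide_Wide_String (A)));

   end UTF_Encoding;

   use type UTF_Encoding.Literals_For_UTF_8_String;

   --  U+03B1 (Greek small alpha) and U+1F600 (a smiley), built without
   --  putting non-ASCII characters in the source text.
   Alpha  : constant Wide_Wide_Character :=
     Wide_Wide_Character'Val (16#03B1#);
   Smiley : constant Wide_Wide_Character :=
     Wide_Wide_Character'Val (16#1F600#);

   My_String : constant UTF_Encoding.UTF_8_String :=
     +("x" & Alpha & Smiley);

begin
   --  UTF-8 uses 1 octet for 'x', 2 for alpha and 4 for the smiley,
   --  so the encoded length is 7.
   Ada.Text_IO.Put_Line (Natural'Image (My_String'Length));
end Literal_Demo;
```

The intermediate Literals_For_UTF_8_String value exists only as the operand
of "+", which matches the goal of never declaring a Wide_Wide_String
explicitly.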

What do you think? Good idea or not? :-)

--
RAPID maintainer
http://savannah.nongnu.org/projects/rapid/

