Re: Ada and Unicode

<fantome.forums.tDeContes-ACB00A.21201803042022@news.free.fr>

https://www.rocksolidbbs.com/devel/article-flat.php?id=8143&group=comp.lang.ada#8143

From: fantome.forums.tDeContes@free.fr.invalid (Thomas)
Newsgroups: comp.lang.ada
Subject: Re: Ada and Unicode
References: <607b5b20$0$27442$426a74cc@news.free.fr> <660e25a5-506b-43c0-b4ac-e7738e5500e5n@googlegroups.com> <lyfszm5xv2.fsf@pushface.org>
Date: Sun, 03 Apr 2022 21:20:19 +0200
Message-ID: <fantome.forums.tDeContes-ACB00A.21201803042022@news.free.fr>
 by: Thomas - Sun, 3 Apr 2022 19:20 UTC

In article <lyfszm5xv2.fsf@pushface.org>,
Simon Wright <simon@pushface.org> wrote:

> But don't use unit names containing international characters, at any
> rate if you're (interested in compiling on) Windows or macOS:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81114

if i understand, Eric Botcazou is a gnu admin who decided to reject your bug?
i find him very "low portability thinking"!

it is the responsibility of compilers and other underlying tools to manage the various underlying OSes and file systems,
not of the user to avoid those that the compiler devs find too bad!
(or to use the right encoding. i heard that Windows uses UTF-16, do you know about it?)

clearly, To_Lower takes Latin-1.
and this kind of problem would be easier to avoid if string types were stronger ...

after:

package Ada.Strings.UTF_Encoding is
   ...
   type UTF_8_String is new String;
   ...
end Ada.Strings.UTF_Encoding;

i would have also made:

package Ada.Directories is
   ...
   type File_Name_String is new Ada.Strings.UTF_Encoding.UTF_8_String;
   ...
end Ada.Directories;

probably with a validity check and a Dynamic_Predicate that allows "".

then, i would use File_Name_String in all Ada.Directories and Ada.*_IO.

--
RAPID maintainer
http://savannah.nongnu.org/projects/rapid/

Re: Ada and Unicode

<48309745-aa2a-47bd-a4f9-6daa843e0771n@googlegroups.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=8144&group=comp.lang.ada#8144

Newsgroups: comp.lang.ada
Date: Sun, 3 Apr 2022 23:10:25 -0700 (PDT)
In-Reply-To: <fantome.forums.tDeContes-ACB00A.21201803042022@news.free.fr>
References: <607b5b20$0$27442$426a74cc@news.free.fr> <660e25a5-506b-43c0-b4ac-e7738e5500e5n@googlegroups.com>
<lyfszm5xv2.fsf@pushface.org> <fantome.forums.tDeContes-ACB00A.21201803042022@news.free.fr>
Message-ID: <48309745-aa2a-47bd-a4f9-6daa843e0771n@googlegroups.com>
Subject: Re: Ada and Unicode
From: vgodunko@gmail.com (Vadim Godunko)
 by: Vadim Godunko - Mon, 4 Apr 2022 06:10 UTC

On Sunday, April 3, 2022 at 10:20:21 PM UTC+3, Thomas wrote:
>
> > But don't use unit names containing international characters, at any
> > rate if you're (interested in compiling on) Windows or macOS:
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81114
>
> and this kind of problems would be easier to avoid if string types were stronger ...
>

Your suggestion cannot resolve this issue on Mac OS X. As with case sensitivity, a binary comparison cannot match two strings that are in different normalization forms. The right solution is to use a proper type to represent paths, and even that doesn't resolve some issues, such as relative paths and rules that change at mount points.
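
The normalization point can be tried out in a few lines. This is only a sketch in Python (not Ada), using the standard unicodedata module; the file name and the helper same_name are made up for illustration, and nothing here reflects how GNAT or macOS actually compare names.

```python
import unicodedata

# Two spellings of the same visible name (hypothetical file name).
precomposed = "\u00e1.adb"   # U+00E1 LATIN SMALL LETTER A WITH ACUTE
decomposed = "a\u0301.adb"   # "a" followed by U+0301 COMBINING ACUTE ACCENT


def same_name(left: str, right: str) -> bool:
    """Compare two names after bringing both to NFD (hypothetical helper)."""
    return unicodedata.normalize("NFD", left) == unicodedata.normalize("NFD", right)


# A binary comparison of the encoded names sees two different byte strings.
binary_equal = precomposed.encode("utf-8") == decomposed.encode("utf-8")

# Comparing after normalization treats them as the same name.
normalized_equal = same_name(precomposed, decomposed)

print(binary_equal, normalized_equal)
```

A stronger string type alone would not change the first result; the comparison itself has to normalize.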

Re: Ada and Unicode

<lyfsmt53tn.fsf@pushface.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=8145&group=comp.lang.ada#8145

From: simon@pushface.org (Simon Wright)
Newsgroups: comp.lang.ada
Subject: Re: Ada and Unicode
Date: Mon, 04 Apr 2022 15:19:16 +0100
Message-ID: <lyfsmt53tn.fsf@pushface.org>
References: <607b5b20$0$27442$426a74cc@news.free.fr>
<660e25a5-506b-43c0-b4ac-e7738e5500e5n@googlegroups.com>
<lyfszm5xv2.fsf@pushface.org>
<fantome.forums.tDeContes-ACB00A.21201803042022@news.free.fr>
<48309745-aa2a-47bd-a4f9-6daa843e0771n@googlegroups.com>
 by: Simon Wright - Mon, 4 Apr 2022 14:19 UTC

Vadim Godunko <vgodunko@gmail.com> writes:

> On Sunday, April 3, 2022 at 10:20:21 PM UTC+3, Thomas wrote:
>>
>> > But don't use unit names containing international characters, at
>> > any rate if you're (interested in compiling on) Windows or macOS:
>> >
>> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81114
>>
>> and this kind of problems would be easier to avoid if string types
>> were stronger ...
>>
>
> Your suggestion is unable to resolve this issue on Mac OS X. Like case
> sensitivity, binary compare of two strings can't compare strings in
> different normalization forms. Right solution is to use right type to
> represent any paths, and even it doesn't resolve some issues, like
> relative paths and change of rules at mounting points.

I think that's a macOS problem that Apple aren't going to resolve* any
time soon! While banging my head against PR81114 recently, I found
(can't remember where) that (lower case a acute) and (lower case a,
combining acute) represent the same concept and it's up to
tools/operating systems etc to recognise that.

Emacs, too, has a problem: it doesn't recognise the 'combining' part of
(lower case a, combining acute), so what you see on your screen is "a'".

* I don't know how/whether clang addresses this.

Re: Ada and Unicode

<lybkxg6hqm.fsf@pushface.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=8146&group=comp.lang.ada#8146

From: simon@pushface.org (Simon Wright)
Newsgroups: comp.lang.ada
Subject: Re: Ada and Unicode
Date: Mon, 04 Apr 2022 15:33:21 +0100
Message-ID: <lybkxg6hqm.fsf@pushface.org>
References: <607b5b20$0$27442$426a74cc@news.free.fr>
<660e25a5-506b-43c0-b4ac-e7738e5500e5n@googlegroups.com>
<lyfszm5xv2.fsf@pushface.org>
<fantome.forums.tDeContes-ACB00A.21201803042022@news.free.fr>
 by: Simon Wright - Mon, 4 Apr 2022 14:33 UTC

Thomas <fantome.forums.tDeContes@free.fr.invalid> writes:

> In article <lyfszm5xv2.fsf@pushface.org>,
> Simon Wright <simon@pushface.org> wrote:
>
>> But don't use unit names containing international characters, at any
>> rate if you're (interested in compiling on) Windows or macOS:
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81114
>
> if i understand, Eric Botcazou is a gnu admin who decided to reject
> your bug? i find him very "low portability thinking"!

To be fair, he only suspended it - you can tell I didn't want to press
very far.

We could remove the part where the filename is smashed to lower-case as
if it were ASCII[1][2][3] (OK, perhaps Latin-1?) if the machine is
Windows or (Apple if not on aarch64!!!), but that still leaves the
filesystem name issue. Windows might be OK (code pages???)

[1] https://github.com/gcc-mirror/gcc/blob/master/gcc/ada/adaint.c#L620
[2] https://github.com/gcc-mirror/gcc/blob/master/gcc/ada/lib-writ.adb#L812
[3] https://github.com/gcc-mirror/gcc/blob/master/gcc/ada/lib-writ.adb#L1490
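
To see why lower-casing a file name "as if it were Latin-1" hurts, here is a small Python sketch of the failure mode. It does not reproduce the GNAT code linked above; lower_as_latin1 is a hypothetical stand-in that treats every byte as one Latin-1 character, and the file name is invented.

```python
def lower_as_latin1(raw: bytes) -> bytes:
    """Lower-case a byte string as if every byte were one Latin-1 character."""
    return raw.decode("latin-1").lower().encode("latin-1")


# A UTF-8 encoded name starting with U+00C9 (capital E with acute).
original = "\u00c9.adb".encode("utf-8")

# What byte-wise Latin-1 lower-casing produces.
mangled = lower_as_latin1(original)

# What lower-casing the decoded text and re-encoding would produce.
expected = "\u00c9.adb".lower().encode("utf-8")

# Check whether the mangled bytes are still well-formed UTF-8.
try:
    mangled.decode("utf-8")
    still_valid_utf8 = True
except UnicodeDecodeError:
    still_valid_utf8 = False

print(original, mangled, expected, still_valid_utf8)
```

The lead byte 0xC3 is read as a Latin-1 capital letter and changed, while its continuation byte is left alone, so the result no longer names the same file. A pure ASCII name passes through such a routine unharmed, which is why the problem only shows up with international characters.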

Re: Ada and Unicode

<ly7d846fz8.fsf@pushface.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=8147&group=comp.lang.ada#8147

From: simon@pushface.org (Simon Wright)
Newsgroups: comp.lang.ada
Subject: Re: Ada and Unicode
Date: Mon, 04 Apr 2022 16:11:23 +0100
Message-ID: <ly7d846fz8.fsf@pushface.org>
References: <607b5b20$0$27442$426a74cc@news.free.fr>
<660e25a5-506b-43c0-b4ac-e7738e5500e5n@googlegroups.com>
<lyfszm5xv2.fsf@pushface.org>
<fantome.forums.tDeContes-ACB00A.21201803042022@news.free.fr>
<48309745-aa2a-47bd-a4f9-6daa843e0771n@googlegroups.com>
<lyfsmt53tn.fsf@pushface.org>
 by: Simon Wright - Mon, 4 Apr 2022 15:11 UTC

Simon Wright <simon@pushface.org> writes:

> I think that's a macOS problem that Apple aren't going to resolve* any
> time soon! While banging my head against PR81114 recently, I found
> (can't remember where) that (lower case a acute) and (lower case a,
> combining acute) represent the same concept and it's up to
> tools/operating systems etc to recognise that.
[...]
> * I don't know how/whether clang addresses this.

It doesn't, so far as I can tell; has the exact same problem.

Re: Ada and Unicode

<2c365530-67d2-4983-95fd-71966c468997n@googlegroups.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=8150&group=comp.lang.ada#8150

Newsgroups: comp.lang.ada
Date: Tue, 5 Apr 2022 00:59:54 -0700 (PDT)
In-Reply-To: <lyfsmt53tn.fsf@pushface.org>
References: <607b5b20$0$27442$426a74cc@news.free.fr> <660e25a5-506b-43c0-b4ac-e7738e5500e5n@googlegroups.com>
<lyfszm5xv2.fsf@pushface.org> <fantome.forums.tDeContes-ACB00A.21201803042022@news.free.fr>
<48309745-aa2a-47bd-a4f9-6daa843e0771n@googlegroups.com> <lyfsmt53tn.fsf@pushface.org>
Message-ID: <2c365530-67d2-4983-95fd-71966c468997n@googlegroups.com>
Subject: Re: Ada and Unicode
From: vgodunko@gmail.com (Vadim Godunko)
 by: Vadim Godunko - Tue, 5 Apr 2022 07:59 UTC

On Monday, April 4, 2022 at 5:19:20 PM UTC+3, Simon Wright wrote:
> I think that's a macOS problem that Apple aren't going to resolve* any
> time soon! While banging my head against PR81114 recently, I found
> (can't remember where) that (lower case a acute) and (lower case a,
> combining acute) represent the same concept and it's up to
> tools/operating systems etc to recognise that.
>
And they will not. It is the application's responsibility to convert file names to NFD before passing them to the OS. The application must also compare paths only after converting them to NFD; this matters for the more complicated cases where canonical reordering is applied.
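
The canonical reordering case can be illustrated with a letter that carries two combining marks. This is a Python sketch using the standard unicodedata module, not anything from GNAT or from the OS; the four spellings are chosen for illustration only.

```python
import unicodedata

# Four code-point sequences that all render as "a" with a dot below
# and an acute accent.
spellings = [
    "\u1ea1\u0301",    # precomposed a-with-dot-below, then combining acute
    "\u00e1\u0323",    # precomposed a-with-acute, then combining dot below
    "a\u0323\u0301",   # base letter, dot below, acute
    "a\u0301\u0323",   # base letter, acute, dot below
]

# Canonical combining classes decide the order of marks in NFD.
dot_below_class = unicodedata.combining("\u0323")
acute_class = unicodedata.combining("\u0301")

# As raw strings the four spellings are all different ...
distinct_raw = len(set(spellings))

# ... but NFD decomposes and reorders them into one single form.
nfd_forms = {unicodedata.normalize("NFD", s) for s in spellings}

print(distinct_raw, dot_below_class, acute_class, nfd_forms)
```

Decomposition alone is not enough here: the marks also have to be sorted by combining class, which is why comparing after a full NFD conversion is the safer rule.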

Re: Ada and Unicode

<lya6cv2xak.fsf@pushface.org>

https://www.rocksolidbbs.com/devel/article-flat.php?id=8166&group=comp.lang.ada#8166

From: simon@pushface.org (Simon Wright)
Newsgroups: comp.lang.ada
Subject: Re: Ada and Unicode
Date: Fri, 08 Apr 2022 10:01:33 +0100
Message-ID: <lya6cv2xak.fsf@pushface.org>
References: <607b5b20$0$27442$426a74cc@news.free.fr>
<660e25a5-506b-43c0-b4ac-e7738e5500e5n@googlegroups.com>
<lyfszm5xv2.fsf@pushface.org>
<fantome.forums.tDeContes-ACB00A.21201803042022@news.free.fr>
<48309745-aa2a-47bd-a4f9-6daa843e0771n@googlegroups.com>
<lyfsmt53tn.fsf@pushface.org>
<2c365530-67d2-4983-95fd-71966c468997n@googlegroups.com>
 by: Simon Wright - Fri, 8 Apr 2022 09:01 UTC

Vadim Godunko <vgodunko@gmail.com> writes:

> On Monday, April 4, 2022 at 5:19:20 PM UTC+3, Simon Wright wrote:
>> I think that's a macOS problem that Apple aren't going to resolve* any
>> time soon! While banging my head against PR81114 recently, I found
>> (can't remember where) that (lower case a acute) and (lower case a,
>> combining acute) represent the same concept and it's up to
>> tools/operating systems etc to recognise that.
>>
> And will not. It is application responsibility to convert file names
> to NFD to pass to OS. Also, application must compare any paths after
> conversion to NFD, it is important to handle more complicated cases
> when canonical reordering is applied.

Isn't the compiler a tool? gnatmake? gprbuild? (gnatmake handles ACATS
c250002 provided you tell the compiler that the fs is case-sensitive,
gprbuild doesn't even manage that)

Re: Ada and Unicode

<64261cd4$0$7661$426a74cc@news.free.fr>

https://www.rocksolidbbs.com/devel/article-flat.php?id=9432&group=comp.lang.ada#9432

From: fantome.forums.tDeContes@free.fr.invalid (Thomas)
Newsgroups: comp.lang.ada
Subject: Re: Ada and Unicode
References: <607b5b20$0$27442$426a74cc@news.free.fr> <660e25a5-506b-43c0-b4ac-e7738e5500e5n@googlegroups.com> <lyfszm5xv2.fsf@pushface.org> <fantome.forums.tDeContes-ACB00A.21201803042022@news.free.fr> <48309745-aa2a-47bd-a4f9-6daa843e0771n@googlegroups.com>
Date: Fri, 31 Mar 2023 01:35:48 +0200
Message-ID: <64261cd4$0$7661$426a74cc@news.free.fr>
 by: Thomas - Thu, 30 Mar 2023 23:35 UTC

sorry for the delay.

In article <48309745-aa2a-47bd-a4f9-6daa843e0771n@googlegroups.com>,
Vadim Godunko <vgodunko@gmail.com> wrote:

> On Sunday, April 3, 2022 at 10:20:21 PM UTC+3, Thomas wrote:
> >
> > > But don't use unit names containing international characters, at any
> > > rate if you're (interested in compiling on) Windows or macOS:
> > >
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81114
> >
> > and this kind of problems would be easier to avoid if string types were
> > stronger ...
> >
>
> Your suggestion is unable to resolve this issue on Mac OS X.

i said "easier" not "easy".

don't forget that Unicode has two levels:
- octets <-> code points
- code points <-> characters/glyphs

and you can't expect the upper to work if the lower doesn't.
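
The two levels can be made visible with a short Python sketch (Python rather than Ada, standard library only). The helper name layers is invented for illustration.

```python
def layers(text: str) -> tuple[bytes, list[str]]:
    """Return the UTF-8 octets and the code points behind one piece of text."""
    octets = text.encode("utf-8")                      # code points -> octets
    code_points = [f"U+{ord(ch):04X}" for ch in text]  # input to the upper level
    return octets, code_points


composed = layers("\u00e1")     # a-acute as a single code point
decomposed = layers("a\u0301")  # "a" plus a combining acute accent

# The lower level going wrong: the same octets read with the wrong
# encoding give different code points, so nothing built on top of
# them (case mapping, normalization) can give the right answer.
misread = b"\xc3\xa1".decode("latin-1")

print(composed)
print(decomposed)
print([f"U+{ord(ch):04X}" for ch in misread])
```

One visible character is two octets and one code point in the first form, and three octets and two code points in the second; read as Latin-1, the first form turns into two unrelated characters.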

> Like case
> sensitivity, binary compare of two strings can't compare strings in different
> normalization forms. Right solution is to use right type to represent any
> paths,

what would be the "right type", according to you?

In fact, here the first question to ask is:
what's the expected encoding for Ada.Text_IO.Open.Name?
- is it Latin-1 because the type is String not UTF_8_String?
- is it undefined because it depends on the underlying FS?

--
RAPID maintainer
http://savannah.nongnu.org/projects/rapid/

