When do we need "encoding system"

<8105dbe2-9117-4d23-8947-0047c324ff63n@googlegroups.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=19728&group=comp.lang.tcl#19728

 by: Alexandru - Tue, 2 Aug 2022 16:30 UTC

Recently I thought it would be a good idea to add "encoding system utf-8" to my code. After that I realized that the icons of Windows folders in the treectrl package are not shown anymore if the folder path contains special chars such as umlauts. So I had to revert. But when do we need this command anyway?

Re: When do we need "encoding system"

<tcbpln$1l6pi$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=19729&group=comp.lang.tcl#19729

 by: Rich - Tue, 2 Aug 2022 18:16 UTC

Alexandru <alexandru.dadalau@meshparts.de> wrote:
> Recently I thought it would be a good idea to add "encoding system
> utf-8" to my code.

Will only work right if the OS system call encoding is also UTF-8.

> After that I realized that the icons of Windows folders in the
> treectrl package are not shown anymore, if the folder path contains
> special chars such as umlauts. So I must revert back.

Yup, expected, as Windows system calls are likely largely still UTF-16.

> But when do we need this command anyway?

When you need to change the encoding for a system call that accepts
something other than the overall default for the rest. From the man page:

    encoding system ?encoding?
        Set the system encoding to encoding. If encoding is omitted then
        the command returns the current system encoding. The system
        encoding is used whenever Tcl passes strings to system calls.

The key phrase is the last sentence.

Overall, unless you are testing obscure things, it is probably best to
leave the system encoding alone.

Re: When do we need "encoding system"

<709e4c74-26d3-4380-b601-5c4948cdbb35n@googlegroups.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=19730&group=comp.lang.tcl#19730

 by: Alexandru - Tue, 2 Aug 2022 20:42 UTC

Rich wrote on Tuesday, 2 August 2022 at 21:16:27 UTC+3:
> Alexandru <alexandr...@meshparts.de> wrote:
> > Recently I thought it would be a good idea to add "encoding system
> > utf-8" to my code.
> Will only work right if the OS system call encoding is also UTF-8.
> > After that I realized that the icons of Windows folders in the
> > treectrl package are not shown anymore, if the folder path contains
> > special chars such as umlauts. So I must revert back.
> Yup, expected, as Windows system calls are likely largely still UTF-16.
> > But when do we need this command anyway?
> When you need to change the encoding for a system call that accepts
> something other than the overall default for the rest. From the man page:
>
> encoding system ?encoding?
> Set the system encoding to encoding. If encoding is omitted then
> the command returns the current system encoding. The system
> encoding is used whenever Tcl passes strings to system calls.
>
> The key phrase is the last sentence.
>
> Overall, unless you are testing obscure things, it is probably best to
> leave the system encoding alone.

Thanks Rich for the explanation.
I think Windows uses cp1252.
So it's a mess: I write files typically in utf-8, read them back in utf-8.
All the application data is encoded in utf-8 although the system encoding is cp1252.
E.g. when I use CAWT to read an Excel file, its content is cp1252 but somehow this still works?

Regards
Alexandru

Re: When do we need "encoding system"

<tcc4g6$1ntr4$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=19731&group=comp.lang.tcl#19731

 by: Rich - Tue, 2 Aug 2022 21:21 UTC

Alexandru <alexandru.dadalau@meshparts.de> wrote:
> Rich wrote on Tuesday, 2 August 2022 at 21:16:27 UTC+3:
>> Alexandru <alexandr...@meshparts.de> wrote:
>> > Recently I thought it would be a good idea to add "encoding system
>> > utf-8" to my code.
>>
>> Overall, unless you are testing obscure things, it is probably best to
>> leave the system encoding alone.
>
> Thanks Rich for the explanation.
> I think Windows uses cp1252.

cp1252 is a character-set mapping (a code page), UTF-16 is an encoding -
two different, but related, items. Character-set mappings define what
character each integer value represents (such as 65 meaning capital
letter A in ASCII). Encodings are how the integers are stored in memory
(in the case of UTF-16, as 16-bit integer values).
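
A small sketch of that distinction, written in Python only because it is easy to run (the thread is about Tcl, and nothing here is Tcl API). It takes one character, looks up its integer value, and then shows the bytes that three codecs store for it. The variable names are illustrative, and note that Python's cp1252 codec stores the mapped value directly as a single byte.

```python
# One character: LATIN SMALL LETTER U WITH DIAERESIS (an umlaut).
ch = "\u00fc"

# The integer value that identifies the character.
code_point = ord(ch)

# How that same character is stored as bytes by three different codecs.
encodings = ["cp1252", "utf-8", "utf-16-le"]
stored = {name: ch.encode(name) for name in encodings}

print("code point:", code_point)
for name, data in stored.items():
    # One byte for cp1252, two for UTF-8, one 16-bit unit for UTF-16.
    print(name, data.hex(), len(data), "byte(s)")
```

The character is the same in all three cases; only the stored byte sequence differs.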

> So it's a mess: I write files typically in utf-8, read them back in
> utf-8.

Yep, and most new work really should be in UTF-8, unless you need
something else due to 'legacy'.

> All the application data is encoded in utf-8 although the system
> encoding is cp1252.

Again, that legacy stuff... :)

> E.g. when I use CAWT to read an Excel file, its content is cp1252
> but somehow this still works?

Yes, because Tcl transparently converts it from cp1252 (or whatever
encoding it is stored in) for you.
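
As a rough illustration of what "transparently converts" amounts to, here is the general decode-then-encode pattern in Python. This is a sketch of the idea only, not of what Tcl or CAWT do internally; the helper name `transcode` and the sample word are made up for the example.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes from the source encoding, then re-encode to the target."""
    text = data.decode(source)   # bytes -> abstract characters
    return text.encode(target)   # abstract characters -> bytes


# Sample legacy data: the German word "Groesse" spelled with o-umlaut and
# sharp s, stored as cp1252 bytes.
legacy = "Gr\u00f6\u00dfe".encode("cp1252")

as_utf8 = transcode(legacy, "cp1252", "utf-8")

print(legacy.hex())    # 5 bytes, one per character
print(as_utf8.hex())   # 7 bytes, the two non-ASCII characters take two each
```

Once the bytes have been decoded with the right source encoding, the application only deals with characters, and the storage encoding on either side no longer matters.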

Re: When do we need "encoding system"

<ygah72twy1j.fsf@akutech.de>

https://www.rocksolidbbs.com/devel/article-flat.php?id=19734&group=comp.lang.tcl#19734

 by: Ralf Fassel - Wed, 3 Aug 2022 08:07 UTC

* Rich <rich@example.invalid>
| Alexandru <alexandru.dadalau@meshparts.de> wrote:
| > Recently I thought it would be a good idea to add "encoding system
| > utf-8" to my code.
|
| Will only work right if the OS system call encoding is also UTF-8.
|
| > After that I realized that the icons of Windows folders in the
| > treectrl package are not shown anymore, if the folder path contains
| > special chars such as umlauts. So I must revert back.
|
| Yup, expected, as Windows system calls are likely largely still UTF-16.

I'm not convinced that this is the real reason for that error.

In my experience, the file handling functions on Windows don't care
about the system encoding when it comes to the *name* of the file - they
simply convert Tcl's internal rep to wide char
(win/tclWinFile.c:TclNativeCreateNativeRep(), using
MultiByteToWideChar() from CP_UTF8).

In contrast, the code on unix indeed uses the system encoding to get the
file name to open (unix/tclUnixFile.c:TclNativeCreateNativeRep() uses
Tcl_UtfToExternalDString(NULL,) where the NULL denotes the system
encoding).

I rather suspect that the file *reading* behind the scenes relies on the
system encoding being 'correct'. That might fail if the system encoding
is set to utf-8, but the file content is not stored in utf-8.
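
To show the kind of failure I mean, here is a sketch in Python (not Tcl, and not the treectrl code, which I have not traced). It reads cp1252 bytes as if they were utf-8, and also the other way round. The sample name and variable names are made up. Python raises an error in the strict case; other runtimes may silently substitute characters instead.

```python
name = "M\u00fcller"  # sample text containing an umlaut

# Case 1: content stored as cp1252, but read as utf-8.
content = name.encode("cp1252")
try:
    content.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    # The single byte 0xFC is not a valid UTF-8 sequence.
    strict_ok = False

# A lenient reader substitutes U+FFFD for the undecodable byte.
lenient = content.decode("utf-8", errors="replace")

# Case 2: content stored as utf-8, but read as cp1252 (classic mojibake).
mojibake = name.encode("utf-8").decode("cp1252")

print(strict_ok)
print(repr(lenient))
print(repr(mojibake))
```

In both cases the bytes on disk are fine; only the assumption made while reading them is wrong.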

R'

