Converting from utf-16le to ascii or utf-8

<uco2a7$2r9lu$1@dont-email.me>

From: simon@whiteowl.co.uk (Simon Geard)
Newsgroups: comp.lang.tcl
Subject: Converting from utf-16le to ascii or utf-8
Date: Wed, 30 Aug 2023 19:37:59 +0100
Message-ID: <uco2a7$2r9lu$1@dont-email.me>

So I have a data file containing ASCII characters which is UTF-16LE
encoded (output from PowerShell). I'd like to do two things with this
file from Tcl on both Windows and Linux:

1) detect that it is utf-16le
2) convert it to ascii (or utf-8)

Looking at the output from [encoding names] on Linux, there is no
utf-16le. Does it have a different name? It looks to me as if I could
just read the file two bytes at a time and drop the second byte, but I
was hoping I could use fconfigure and encoding.

Thanks for any ideas.

Simon

Re: Converting from utf-16le to ascii or utf-8

<uco5hv$2rtm7$1@dont-email.me>

From: rich@example.invalid (Rich)
Newsgroups: comp.lang.tcl
Subject: Re: Converting from utf-16le to ascii or utf-8
Date: Wed, 30 Aug 2023 19:33:19 -0000 (UTC)
Message-ID: <uco5hv$2rtm7$1@dont-email.me>
References: <uco2a7$2r9lu$1@dont-email.me>

Simon Geard <simon@whiteowl.co.uk> wrote:
> So I have a data file containing ascii characters which is UTF-16LE
> encoded (output from Powershell). I'd like to do two things with this
> file from tcl on both Windows and Linux:
>
> 1) detect that it is utf-16le

For detection, a UTF-16 encoded file is /supposed/ to begin with a Byte
Order Mark (https://en.wikipedia.org/wiki/Byte_order_mark). Assuming it
has one, the first two bytes are FF FE for UTF-16LE (FE FF for
UTF-16BE), so checking those two bytes is how to detect it.
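
A minimal sketch of that check in Tcl, assuming the file does start with
a byte order mark. The proc name bomEncoding is hypothetical, and the
names it returns are just labels, not necessarily entries in
[encoding names]. It does not try to tell UTF-16LE apart from UTF-32LE,
whose mark also begins with FF FE.

```tcl
# Hypothetical helper: guess a file's encoding from its byte order mark.
# Returns utf-16le, utf-16be, utf-8, or an empty string if no mark is found.
proc bomEncoding {path} {
    # Open in binary mode so the bytes arrive untranslated.
    set f [open $path rb]
    set head [read $f 3]
    close $f

    set two [string range $head 0 1]
    if {$two eq "\xFF\xFE"} { return utf-16le }
    if {$two eq "\xFE\xFF"} { return utf-16be }
    if {$head eq "\xEF\xBB\xBF"} { return utf-8 }
    return ""
}
```

A file without a mark gives an empty result, so the caller still has to
decide what to assume in that case.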

> 2) convert it to ascii (or utf-8)

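It looks like Tcl 8.6 has no encoding named utf-16le. There is a TIP
that adds the utf-16 encodings,
https://core.tcl-lang.org/tips/doc/main/tip/547.md, but it is marked as
Tcl 8.7. In 8.6 the closest match is the 'unicode' encoding, which is
16-bit in the machine's native byte order, so on little-endian hardware
it should read UTF-16LE.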
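
A sketch of how the conversion could look with fconfigure, under two
assumptions: a Tcl that implements TIP 547 lists utf-16le in
[encoding names], and on Tcl 8.6 the 'unicode' encoding (16-bit, native
byte order) is an acceptable stand-in on a little-endian machine. The
proc names readUtf16le and utf16leToUtf8 are hypothetical.

```tcl
# Hypothetical helper: read a UTF-16LE file and return its text.
proc readUtf16le {path} {
    if {"utf-16le" in [encoding names]} {
        # Available once TIP 547 is implemented.
        set enc utf-16le
    } elseif {$::tcl_platform(byteOrder) eq "littleEndian"} {
        # Assumption: 'unicode' is native-order 16-bit, so it matches LE here.
        set enc unicode
    } else {
        error "no UTF-16LE decoder available in this Tcl build"
    }
    set f [open $path r]
    fconfigure $f -encoding $enc
    set text [read $f]
    close $f
    # The byte order mark decodes to U+FEFF; drop it if present.
    return [string trimleft $text \uFEFF]
}

# Hypothetical helper: write the decoded text back out as UTF-8.
proc utf16leToUtf8 {src dst} {
    set text [readUtf16le $src]
    set out [open $dst w]
    fconfigure $out -encoding utf-8
    puts -nonewline $out $text
    close $out
}
```

Since the data is said to be plain ASCII, the UTF-8 output is also valid
ASCII.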

> Looking at the output from [encoding names] on Linux there is no
> utf-16le, does it have a different name? It looks to me as if I could
> just read the file two bytes at a time and drop the second byte but I
> was hoping I could use fconfigure and encoding.

Since you have Linux, it appears that the iconv command handles
converting from UTF-16 LE and BE -- so you might be able to convert on
Linux and then use the converted file afterward.
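
If the conversion is to be driven from Tcl anyway, iconv can be called
through exec. This is only a sketch: it assumes an iconv binary on PATH
that accepts the encoding names UTF-16 and UTF-8 (as the GNU one does),
and the proc name iconvToUtf8 is hypothetical. With -f UTF-16 the byte
order is expected to be taken from the byte order mark.

```tcl
# Hypothetical helper: convert a BOM-prefixed UTF-16 file to UTF-8 via iconv.
proc iconvToUtf8 {src dst} {
    if {[auto_execok iconv] eq ""} {
        error "iconv not found on PATH"
    }
    # exec redirects iconv's standard output into the destination file.
    exec iconv -f UTF-16 -t UTF-8 $src > $dst
}
```

This does not help on a Windows machine without iconv installed, so it
only covers the Linux half of the question.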

Re: Converting from utf-16le to ascii or utf-8

<ucp6vg$34bnj$1@dont-email.me>

From: peterd@nospam.net (Peter Dean)
Newsgroups: comp.lang.tcl
Subject: Re: Converting from utf-16le to ascii or utf-8
Date: Thu, 31 Aug 2023 15:03:44 +1000
Message-ID: <ucp6vg$34bnj$1@dont-email.me>
References: <uco2a7$2r9lu$1@dont-email.me>

On 31/8/23 04:37, Simon Geard wrote:
> So I have a data file containing ascii characters which is UTF-16LE
> encoded (output from Powershell). I'd like to do two things with this
> file from tcl on both Windows and Linux:
>
> 1) detect that it is utf-16le
> 2) convert it to ascii (or utf-8)
>
> Looking at the output from [encoding names] on Linux there is no
> utf-16le, does it have a different name? It looks to me as if I could
> just read the file two bytes at a time and drop the second byte but I
> was hoping I could use fconfigure and encoding.
>
> Thanks for any ideas.
>
> Simon

https://wiki.tcl-lang.org/page/Unicode+file+reader

but I just use Notepad++ and save as UTF-8

Peter

