
computers / comp.misc / Re: in praise of text files

Subject  Author
* in praise of text files  Ben Collver
+- Re: in praise of text files  Bob Eager
+* Re: in praise of text files  5GyYap52yQ1UGMWD
|`* Re: in praise of text files  Matthew Ernisse
| `- Re: in praise of text files  Bob Eager
+* Re: in praise of text files  Oregonian Haruspex
|`* Re: in praise of text files  Ben Collver
| `- Re: in praise of text files  Samuel Christie
`* Re: in praise of text files  Roger Blake
 `* Re: in praise of text files  Computer Nerd Kev
  +* Re: in praise of text files  scott
  |+- Re: in praise of text files  Computer Nerd Kev
  |+* Re: in praise of text files  Roger Blake
  ||+- Re: in praise of text files  Rich
  ||+* Re: in praise of text files  Retrograde
  |||`* Re: in praise of text files  Computer Nerd Kev
  ||| `* Re: in praise of text files  Anthk
  |||  `- Re: in praise of text files  Computer Nerd Kev
  ||`* Re: in praise of text files  Anthk
  || `- Re: in praise of text files  Bob Eager
  |`- Re: in praise of text files  Dan Espen
  +* Re: in praise of text files  Mike Spencer
  |+* Re: in praise of text files  Sn!pe
  ||`* Re: in praise of text files  Mike Spencer
  || `- Re: in praise of text files  Sn!pe
  |`* Re: in praise of text files  Samuel Christie
  | +- Re: in praise of text files  Richard Kettlewell
  | +- Re: in praise of text files  Grant Taylor
  | `* Re: in praise of text files  Spiros Bousbouras
  |  +* Re: in praise of text files  Samuel Christie
  |  |`- Re: in praise of text files  The Real Bev
  |  +* Re: in praise of text files  scott
  |  |`- Re: in praise of text files  Computer Nerd Kev
  |  +* Re: in praise of text files  Bob Eager
  |  |`- Re: in praise of text files  Retrograde
  |  +- Re: in praise of text files  Ben Collver
  |  `- Re: in praise of text files  Otto J. Makela
  `* Re: in praise of text files  Bob Eager
   `* Re: in praise of text files  Computer Nerd Kev
    `* Re: in praise of text files  Bob Eager
     `* Re: in praise of text files  Louis Krupp
      `- Re: in praise of text files  Bob Eager

Re: in praise of text files

<jt8e1j-ec9.ln1@berry.solani.net>

https://www.rocksolidbbs.com/computers/article-flat.php?id=1983&group=comp.misc#1983

Newsgroups: comp.misc
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!reader5.news.weretis.net!news.solani.org!.POSTED!berry.solani.net!not-for-mail
From: fungus@amongus.com.invalid (Retrograde)
Newsgroups: comp.misc
Subject: Re: in praise of text files
Date: Mon, 10 Oct 2022 20:37:55 +0100
Message-ID: <jt8e1j-ec9.ln1@berry.solani.net>
References: <slrntjoo4n.73r.bencollver@svadhyaya.localdomain>
<20221006222224@news.eternal-september.org> <633f868e@news.ausics.net>
<CZY%K.752888$BKL8.201000@fx15.iad>
<20221007231243@news.eternal-september.org>
Reply-To: fungus@amongus.com.invalid
Injection-Info: solani.org;
logging-data="155495"; mail-complaints-to="abuse@news.solani.org"
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:GEswgNsZ04AbtZshGytYsbz2f04=
X-User-ID: eJwVyMkRACEMA7CW4o1joJ0c9F8Cs3oqXFAtKsS4cQ25N+rfaJfBg9XZQC8aD2aYt2lf2ujoAQHtEF0=
X-Face: B,ckSl,FpK$Tw&Gx_oee5Tcj|RCK=sbQ=a&cJ9)e*A|.f}uctF}Rohq&$BI&OBVck/zSV
DV s<~Tu)q"Z]^2KikYTfy^bh'9MsB'ObTszVRGI_#zXVB\_B4BE~|Ad
 by: Retrograde - Mon, 10 Oct 2022 19:37 UTC

On 2022-10-07, Roger Blake <rogblake@iname.invalid> wrote:
> On 2022-10-07, scott@alfter.diespammersdie.us <scott@alfter.diespammersdie.us> wrote:

> I used to be able to extract text directly from Microsoft Word
> documents using "antiword" but it only works with the old binary
> (.doc) format and of course the default has been the new .docx format
> since the 2007 version.

Pandoc does quite a nice job of converting docx to other formats.
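
Part of why .docx is more tractable than the old binary .doc is that a .docx file is a ZIP archive of XML parts, with the main body in `word/document.xml`. The following is a minimal sketch of pulling paragraph text out with only the Python standard library. It is not how Pandoc works internally, it ignores tables, footnotes and formatting, and `make_sample_docx` is a hypothetical helper that builds a tiny stand-in archive purely for demonstration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(source):
    """Return the text of each paragraph in a .docx file (path or file object)."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # Visible text lives in <w:t> elements inside the paragraph's runs.
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return paragraphs


def make_sample_docx():
    """Build a tiny in-memory stand-in for a .docx (hypothetical content)."""
    xml = (
        f'<w:document xmlns:w="{W_NS}"><w:body>'
        "<w:p><w:r><w:t>Hello, </w:t></w:r><w:r><w:t>text files</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
        "</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for line in docx_paragraphs(make_sample_docx()):
        print(line)
```

For real documents Pandoc remains the better tool; the sketch only shows that the text is reachable without proprietary software.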

Re: in praise of text files

<jqjhe5Fers4U1@mid.individual.net>

https://www.rocksolidbbs.com/computers/article-flat.php?id=1984&group=comp.misc#1984

Newsgroups: comp.misc
Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!3.eu.feeder.erje.net!feeder.erje.net!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: news0009@eager.cx (Bob Eager)
Newsgroups: comp.misc
Subject: Re: in praise of text files
Date: 10 Oct 2022 21:33:57 GMT
Lines: 20
Message-ID: <jqjhe5Fers4U1@mid.individual.net>
References: <slrntjoo4n.73r.bencollver@svadhyaya.localdomain>
<20221006222224@news.eternal-september.org> <633f868e@news.ausics.net>
<87wn9bjyz9.fsf@bogus.nodomain.nowhere> <878rlrcv13.fsf@sdf.org>
<=QM=Kpsbv2YnanbCN@bongo-ra.co>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: individual.net UoNa4pL72+B+x9zn5eNduwbUArqW0MLjlDLQQo6MzUxqP8KQn4
Cancel-Lock: sha1:AyeQ2j2ToeQl3TlQ3xo/1zSjsN0=
User-Agent: Pan/0.145 (Duplicitous mercenary valetism; d7e168a
git.gnome.org/pan2)
 by: Bob Eager - Mon, 10 Oct 2022 21:33 UTC

On Sat, 08 Oct 2022 03:58:04 +0000, Spiros Bousbouras wrote:

> Let's try it out :
>
> Greek alphabet :
> ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩ αβγδεζηθικλμνξοπρστυφχψω
>
> Some mathematical symbols :
> ∅ ∁ ∈ ∉ ∋ ∌ ∖ ∩ ∪ ⊂ ⊃ ⊄ ⊅ ⊆ ⊇ ⊈ ⊉ ⊊ ⊋
>
> Can you read all this ?

Fine for me. Pan on FreeBSD.

--
Using UNIX since v6 (1975)...

Use the BIG mirror service in the UK:
http://www.mirrorservice.org

Re: in praise of text files

<jqjhg7Fers4U2@mid.individual.net>

https://www.rocksolidbbs.com/computers/article-flat.php?id=1985&group=comp.misc#1985

Newsgroups: comp.misc
Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: news0009@eager.cx (Bob Eager)
Newsgroups: comp.misc
Subject: Re: in praise of text files
Date: 10 Oct 2022 21:35:03 GMT
Lines: 15
Message-ID: <jqjhg7Fers4U2@mid.individual.net>
References: <slrntjoo4n.73r.bencollver@svadhyaya.localdomain>
<20221006222224@news.eternal-september.org> <633f868e@news.ausics.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: individual.net grtvuKcSQ0Guxbuy0kH1hw8I4DtVFOLXW0Rs+UZUDXphPm4M6r
Cancel-Lock: sha1:yik2FkAW1AjHMRIN5V69l7YMqlc=
User-Agent: Pan/0.145 (Duplicitous mercenary valetism; d7e168a
git.gnome.org/pan2)
 by: Bob Eager - Mon, 10 Oct 2022 21:35 UTC

On Fri, 07 Oct 2022 11:53:18 +1000, Computer Nerd Kev wrote:

> Well you can convert PDF to PostScript, and so far as I'm concerned
> that's "plain text" in the way that Markdown is.

Doesn't work if the PostScript file is just a load of images. I usually
print, scan and OCR.

--
Using UNIX since v6 (1975)...

Use the BIG mirror service in the UK:
http://www.mirrorservice.org

Re: in praise of text files

<jqjhhoFers4U3@mid.individual.net>

https://www.rocksolidbbs.com/computers/article-flat.php?id=1986&group=comp.misc#1986

Newsgroups: comp.misc
Path: i2pn2.org!i2pn.org!news.uzoreto.com!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: news0009@eager.cx (Bob Eager)
Newsgroups: comp.misc
Subject: Re: in praise of text files
Date: 10 Oct 2022 21:35:52 GMT
Lines: 16
Message-ID: <jqjhhoFers4U3@mid.individual.net>
References: <slrntjoo4n.73r.bencollver@svadhyaya.localdomain>
<87y1tv7xlp.fsf@gzwpqvv.pcxqwwh>
<slrntk4anm.3jb.matt@imladris.colo.ub3rgeek.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: individual.net rGfJ6N0cadsNXUgCEUGuygdTVmZN2Z4sZkB4T7xGGME4p/uQfu
Cancel-Lock: sha1:03MS5r5BnYHRR2UzHa+d8wuD8EM=
User-Agent: Pan/0.145 (Duplicitous mercenary valetism; d7e168a
git.gnome.org/pan2)
 by: Bob Eager - Mon, 10 Oct 2022 21:35 UTC

On Sun, 09 Oct 2022 01:59:18 +0000, Matthew Ernisse wrote:

> I'm hoping you are aware that you don't need a CMS or a database to
> publish information over HTTP, but if you aren't then you can quite
> happily (and just as easily) publish things to a web server to present
> over HTTP using a text editor and scp. This has the benefit of still
> being supported by modern browsers.

I always do it like that (although I use the curl library for REXX for an
automated upload).
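
As a rough illustration of the "text editor and scp" workflow, here is a minimal sketch that wraps plain text in a static HTML page and builds an scp command to copy it to a server. The function names, the page layout, and the `user@example.org:/var/www/` destination are all assumptions made for the example. It uses scp rather than the REXX curl library mentioned above.

```python
import html


def text_to_html(title, body_text):
    """Wrap plain text in a minimal static HTML page, one <p> per blank-line-separated block."""
    paragraphs = [p.strip() for p in body_text.split("\n\n") if p.strip()]
    rendered = "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        f'<head><meta charset="utf-8"><title>{html.escape(title)}</title></head>\n'
        f"<body>\n<h1>{html.escape(title)}</h1>\n{rendered}\n</body>\n"
        "</html>\n"
    )


def scp_command(local_path, remote):
    """Build (but do not run) an scp invocation; `remote` is a hypothetical user@host:/path."""
    return ["scp", local_path, remote]


if __name__ == "__main__":
    page = text_to_html("Notes", "First paragraph, with a < sign.\n\nSecond paragraph.")
    with open("index.html", "w", encoding="utf-8") as handle:
        handle.write(page)
    # Printed rather than executed, because the destination is made up.
    print(" ".join(scp_command("index.html", "user@example.org:/var/www/")))
```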

--
Using UNIX since v6 (1975)...

Use the BIG mirror service in the UK:
http://www.mirrorservice.org

Re: in praise of text files

<634492eb@news.ausics.net>

https://www.rocksolidbbs.com/computers/article-flat.php?id=1987&group=comp.misc#1987

Newsgroups: comp.misc
Message-ID: <634492eb@news.ausics.net>
From: not@telling.you.invalid (Computer Nerd Kev)
Subject: Re: in praise of text files
Newsgroups: comp.misc
References: <slrntjoo4n.73r.bencollver@svadhyaya.localdomain> <20221006222224@news.eternal-september.org> <633f868e@news.ausics.net> <CZY%K.752888$BKL8.201000@fx15.iad> <20221007231243@news.eternal-september.org> <jt8e1j-ec9.ln1@berry.solani.net>
User-Agent: tin/2.0.1-20111224 ("Achenvoir") (UNIX) (Linux/2.4.31 (i586))
NNTP-Posting-Host: news.ausics.net
Date: 11 Oct 2022 07:47:24 +1000
Organization: Ausics - https://www.ausics.net
Lines: 27
X-Complaints: abuse@ausics.net
Path: i2pn2.org!i2pn.org!news.bbs.nz!news.ausics.net!not-for-mail
 by: Computer Nerd Kev - Mon, 10 Oct 2022 21:47 UTC

Retrograde <fungus@amongus.com.invalid> wrote:
> On 2022-10-07, Roger Blake <rogblake@iname.invalid> wrote:
>> On 2022-10-07, scott@alfter.diespammersdie.us <scott@alfter.diespammersdie.us> wrote:
>
>> I used to be able to extract text directly from Microsoft Word
>> documents using "antiword" but it only works with the old binary
>> (.doc) format and of course the default has been the new .docx format
>> since the 2007 version.
>
> Pandoc does quite a nice job of converting docx to other formats.

I just discovered that myself actually. This command seems to work
well to generate a HTML file with any images embedded within it (I
prefer this a little over PDF):
pandoc -s --embed-resources --ascii -o file.htm file.docx

The other one that I would like to handle is Excel spreadsheets in
xls and xlsx formats. PHPSpreadsheet from the PHPOffice project
seems to handle this, but as it's not designed for command-line use
it's going to take some more work to get equivalent functionality
out of it.

https://github.com/PHPOffice

--
__ __
#_ < |\| |< _#

Re: in praise of text files

<63449931@news.ausics.net>

https://www.rocksolidbbs.com/computers/article-flat.php?id=1988&group=comp.misc#1988

Newsgroups: comp.misc
Message-ID: <63449931@news.ausics.net>
From: not@telling.you.invalid (Computer Nerd Kev)
Subject: Re: in praise of text files
Newsgroups: comp.misc
References: <slrntjoo4n.73r.bencollver@svadhyaya.localdomain> <20221006222224@news.eternal-september.org> <633f868e@news.ausics.net> <jqjhg7Fers4U2@mid.individual.net>
User-Agent: tin/2.0.1-20111224 ("Achenvoir") (UNIX) (Linux/2.4.31 (i586))
NNTP-Posting-Host: news.ausics.net
Date: 11 Oct 2022 08:14:09 +1000
Organization: Ausics - https://www.ausics.net
Lines: 28
X-Complaints: abuse@ausics.net
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.quux.org!news.bbs.nz!news.ausics.net!not-for-mail
 by: Computer Nerd Kev - Mon, 10 Oct 2022 22:14 UTC

Bob Eager <news0009@eager.cx> wrote:
> On Fri, 07 Oct 2022 11:53:18 +1000, Computer Nerd Kev wrote:
>
>> Well you can convert PDF to PostScript, and so far as I'm concerned
>> that's "plain text" in the way that Markdown is.
>
> Doesn't work if the PostScript file is just a load of images.

Presuming bitmap images, yes. Markdown apparently allows you to
reference images as well though, so you could just as well have a
Markdown document with only scanned images of text in it.

> I usually print, scan and OCR.

Surely you can OCR without the printing and scanning? Ghostscript
can generate PNG (etc.) bitmap images for each page of a PDF, at a
specified resolution.
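
A minimal sketch of that Ghostscript step, assuming the `gs` executable is on the PATH. The `png16m` device, `-r` resolution flag and `-sOutputFile` pattern are standard Ghostscript options; the function names, the 300 dpi default and the `page-%03d.png` naming are choices made for the example.

```python
import subprocess


def ghostscript_png_command(pdf_path, output_pattern="page-%03d.png", dpi=300):
    """Build a Ghostscript command line that renders each PDF page to a PNG file."""
    return [
        "gs",
        "-dNOPAUSE", "-dBATCH", "-dSAFER",   # run non-interactively, then exit
        "-sDEVICE=png16m",                   # 24-bit colour PNG output
        f"-r{dpi}",                          # rendering resolution in dpi
        f"-sOutputFile={output_pattern}",    # %03d expands to the page number
        pdf_path,
    ]


def render_pages(pdf_path, dpi=300):
    """Run Ghostscript on pdf_path; raises CalledProcessError if gs fails."""
    subprocess.run(ghostscript_png_command(pdf_path, dpi=dpi), check=True)


if __name__ == "__main__":
    # Only prints the command; call render_pages() on a real file to run it.
    print(" ".join(ghostscript_png_command("input.pdf")))
```

The resulting page images could then be handed to whatever OCR program is preferred.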

The pdfimages program from Xpdf claims that it "extracts the images
from a PDF file", so it may be better again because there isn't
any recompression or resampling. To be honest I don't do OCR for
anything so I haven't looked into it much. Where I last found that
editing PostScript manually came in handy was actually for
correcting a formatting glitch for printing.

--
__ __
#_ < |\| |< _#

Re: in praise of text files

<jqknbmFers4U4@mid.individual.net>

https://www.rocksolidbbs.com/computers/article-flat.php?id=1989&group=comp.misc#1989

Newsgroups: comp.misc
Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: news0009@eager.cx (Bob Eager)
Newsgroups: comp.misc
Subject: Re: in praise of text files
Date: 11 Oct 2022 08:21:10 GMT
Lines: 17
Message-ID: <jqknbmFers4U4@mid.individual.net>
References: <slrntjoo4n.73r.bencollver@svadhyaya.localdomain>
<20221006222224@news.eternal-september.org> <633f868e@news.ausics.net>
<jqjhg7Fers4U2@mid.individual.net> <63449931@news.ausics.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: individual.net WuJw9DTw0ZnJrl6yWr8mZQ1HM8Q3W0vo7t7ygNNpAlRYqtLrHN
Cancel-Lock: sha1:oQ81UKimgafjOIhkZ6IQHLoovbo=
User-Agent: Pan/0.145 (Duplicitous mercenary valetism; d7e168a
git.gnome.org/pan2)
 by: Bob Eager - Tue, 11 Oct 2022 08:21 UTC

On Tue, 11 Oct 2022 08:14:09 +1000, Computer Nerd Kev wrote:

>> I usually print, scan and OCR.
>
> Surely you can OCR without the printing and scanning? Ghostscript can
> generate PNG (etc.) bitmap images for each page of a PDF, at a specified
> resolution.

Not in this case. I have a lot of material that is on a CD, in a format
only accessible by a Windows program that won't run on anything later
than XP. It fails when printed to a file!

--
Using UNIX since v6 (1975)...

Use the BIG mirror service in the UK:
http://www.mirrorservice.org

Re: in praise of text files

<slrntkb5md.qo3.bencollver@svadhyaya.localdomain>

https://www.rocksolidbbs.com/computers/article-flat.php?id=1990&group=comp.misc#1990

Newsgroups: comp.misc
Path: i2pn2.org!i2pn.org!aioe.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: bencollver@tilde.pink (Ben Collver)
Newsgroups: comp.misc
Subject: Re: in praise of text files
Date: Tue, 11 Oct 2022 16:19:17 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 11
Message-ID: <slrntkb5md.qo3.bencollver@svadhyaya.localdomain>
References: <slrntjoo4n.73r.bencollver@svadhyaya.localdomain>
<20221006222224@news.eternal-september.org> <633f868e@news.ausics.net>
<87wn9bjyz9.fsf@bogus.nodomain.nowhere> <878rlrcv13.fsf@sdf.org>
<=QM=Kpsbv2YnanbCN@bongo-ra.co>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 11 Oct 2022 16:19:17 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="64b87334c509d752d9a2d8be0486c360";
logging-data="1208477"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+Rnj4CB0xUS5KN5XCLjIc8BT9knV4OZk8="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:YeVrFzwbuQMAIF4Abi7NcxRoHhg=
 by: Ben Collver - Tue, 11 Oct 2022 16:19 UTC

On 2022-10-08, Spiros Bousbouras <spibou@gmail.com> wrote:
> Greek alphabet :
> ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩ
> αβγδεζηθικλμνξοπρστυφχψω
>
> Some mathematical symbols :
> ∅ ∁ ∈ ∉ ∋ ∌ ∖ ∩ ∪ ⊂ ⊃ ⊄ ⊅ ⊆ ⊇ ⊈ ⊉ ⊊ ⊋
>
> Can you read all this ?

Reads fine for me in slrn and xfce4-terminal.

Re: in praise of text files

<6345d0fa@news.ausics.net>

https://www.rocksolidbbs.com/computers/article-flat.php?id=1991&group=comp.misc#1991

Newsgroups: comp.misc
Message-ID: <6345d0fa@news.ausics.net>
From: not@telling.you.invalid (Computer Nerd Kev)
Subject: Re: in praise of text files
Newsgroups: comp.misc
References: <slrntjoo4n.73r.bencollver@svadhyaya.localdomain> <20221006222224@news.eternal-september.org> <633f868e@news.ausics.net> <87wn9bjyz9.fsf@bogus.nodomain.nowhere> <878rlrcv13.fsf@sdf.org> <=QM=Kpsbv2YnanbCN@bongo-ra.co> <EGZ0L.327560$9Yp5.106107@fx12.iad>
User-Agent: tin/2.0.1-20111224 ("Achenvoir") (UNIX) (Linux/2.4.31 (i586))
NNTP-Posting-Host: news.ausics.net
Date: 12 Oct 2022 06:24:26 +1000
Organization: Ausics - https://www.ausics.net
Lines: 28
X-Complaints: abuse@ausics.net
Path: i2pn2.org!i2pn.org!news.bbs.nz!news.ausics.net!not-for-mail
 by: Computer Nerd Kev - Tue, 11 Oct 2022 20:24 UTC

scott@alfter.diespammersdie.us wrote:
> Spiros Bousbouras <spibou@gmail.com> wrote:
>> Let's try it out :
>>
>> Greek alphabet :
>> ????????????????????????
>> ????????????????????????
>>
>> Some mathematical symbols :
>> ? ? ? ? ? ? \ ? ? ? ? ? ? ? ? ? ? ? ?
>>
>> Can you read all this ?
>
> Received five-by-five, though the math symbols are a bit small. Pretty sure
> that's just down to font choice (Lucida Console, 9 pt.).
>
> As you might see from examining the header, I'm using tin.

Tin also supports translating characters into other character sets
if it's set to prefer them, which is handy if you don't use a
Unicode-capable terminal or font. But as you can see, it does tend
to go a little heavy on the "I don't know" character at times.

Compile-time options control some of that behaviour.
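
The same effect can be reproduced outside tin. This is a minimal sketch, not tin's actual conversion code: it uses Python's standard `replace` error handler to turn every character that plain ASCII cannot represent into a question mark. The quoted text above shows a backslash where the set-minus sign was, which suggests tin also substitutes look-alike characters in some cases; this sketch does not attempt that.

```python
GREEK = "ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩ"
MATH = "∅ ∁ ∈ ∉ ∋"


def to_ascii_lossy(text):
    """Replace every character that ASCII cannot represent with '?'."""
    return text.encode("ascii", errors="replace").decode("ascii")


if __name__ == "__main__":
    print(to_ascii_lossy(GREEK))
    print(to_ascii_lossy(MATH))
    print(to_ascii_lossy("Can you read all this ?"))  # pure ASCII passes through unchanged
```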

--
__ __
#_ < |\| |< _#

Re: in praise of text files

<jcah1j-svo.ln1@berry.solani.net>

https://www.rocksolidbbs.com/computers/article-flat.php?id=1992&group=comp.misc#1992

Newsgroups: comp.misc
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!reader5.news.weretis.net!news.solani.org!.POSTED!berry.solani.net!not-for-mail
From: fungus@amongus.com.invalid (Retrograde)
Newsgroups: comp.misc
Subject: Re: in praise of text files
Date: Wed, 12 Oct 2022 00:21:24 +0100
Message-ID: <jcah1j-svo.ln1@berry.solani.net>
References: <slrntjoo4n.73r.bencollver@svadhyaya.localdomain>
<20221006222224@news.eternal-september.org> <633f868e@news.ausics.net>
<87wn9bjyz9.fsf@bogus.nodomain.nowhere> <878rlrcv13.fsf@sdf.org>
<=QM=Kpsbv2YnanbCN@bongo-ra.co> <jqjhe5Fers4U1@mid.individual.net>
Reply-To: fungus@amongus.com.invalid
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Info: solani.org;
logging-data="201334"; mail-complaints-to="abuse@news.solani.org"
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:j/3xL7MK+tta2XwZxO8dNEsZntw=
X-User-ID: eJwFwYkBgDAIA8CV+EJlHEzN/iN4h2xvnmp0QZB9E+CSxAOW0Hp11rfzRsTcs0ZZhsWsW/0rYhEa
X-Face: B,ckSl,FpK$Tw&Gx_oee5Tcj|RCK=sbQ=a&cJ9)e*A|.f}uctF}Rohq&$BI&OBVck/zSV
DV s<~Tu)q"Z]^2KikYTfy^bh'9MsB'ObTszVRGI_#zXVB\_B4BE~|Ad
 by: Retrograde - Tue, 11 Oct 2022 23:21 UTC

On 2022-10-10, Bob Eager <news0009@eager.cx> wrote:
> On Sat, 08 Oct 2022 03:58:04 +0000, Spiros Bousbouras wrote:
>
>> Let's try it out :
>>
>> Greek alphabet :
>> ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩ αβγδεζηθικλμνξοπρστυφχψω
>>
>> Some mathematical symbols :
>> ∅ ∁ ∈ ∉ ∋ ∌ ∖ ∩ ∪ ⊂ ⊃ ⊄ ⊅ ⊆ ⊇ ⊈ ⊉ ⊊ ⊋
>>
>> Can you read all this ?
>
> Fine for me. Pan on FreeBSD.
>
It's encoded text/plain; charset=UTF-8 so any UTF-8-aware newsreader in an
environment with the right font should work fine. Both claws-mail and slrn (in
gnome-term) on Linux Mint show me both your Greek and your math just fine over
here. On the Linux console, the Greek comes through but only half the math - I
interpret that as my chosen console font having only a partial set of the math
glyphs.
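
The split between encoding and font can be seen in a minimal sketch like the one below, where the function name is made up for the example. UTF-8 encodes every one of these characters (two bytes for the Greek letters, three for the math symbols), so a missing symbol on screen points to glyph coverage rather than to the transport.

```python
def utf8_widths(text):
    """Map each distinct non-space character to the number of bytes UTF-8 uses for it."""
    return {ch: len(ch.encode("utf-8")) for ch in text if not ch.isspace()}


if __name__ == "__main__":
    for ch, width in utf8_widths("a Ω ∅ ⊆").items():
        print(f"U+{ord(ch):04X} {ch} -> {width} byte(s)")
```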

I'm nostalgic for lots of early technology, but I wouldn't go back to
the era before UTF-8 for anything.

Re: in praise of text files

<TGn1L.574866$Ny99.90119@fx16.iad>

https://www.rocksolidbbs.com/computers/article-flat.php?id=1993&group=comp.misc#1993

Newsgroups: comp.misc
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx16.iad.POSTED!not-for-mail
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.3.2
Subject: Re: in praise of text files
Content-Language: en-US
Newsgroups: comp.misc
References: <slrntjoo4n.73r.bencollver@svadhyaya.localdomain>
<20221006222224@news.eternal-september.org> <633f868e@news.ausics.net>
<jqjhg7Fers4U2@mid.individual.net> <63449931@news.ausics.net>
<jqknbmFers4U4@mid.individual.net>
From: lkrupp@invalid.pssw.com.invalid (Louis Krupp)
In-Reply-To: <jqknbmFers4U4@mid.individual.net>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 18
Message-ID: <TGn1L.574866$Ny99.90119@fx16.iad>
X-Complaints-To: abuse(at)newshosting.com
NNTP-Posting-Date: Wed, 12 Oct 2022 00:15:47 UTC
Organization: Newshosting.com - Highest quality at a great price! www.newshosting.com
Date: Tue, 11 Oct 2022 18:15:47 -0600
X-Received-Bytes: 1681
 by: Louis Krupp - Wed, 12 Oct 2022 00:15 UTC

On 10/11/2022 2:21 AM, Bob Eager wrote:
> On Tue, 11 Oct 2022 08:14:09 +1000, Computer Nerd Kev wrote:
>
>>> I usually print, scan and OCR.
>> Surely you can OCR without the printing and scanning? Ghostscript can
>> generate PNG (etc.) bitmap images for each page of a PDF, at a specified
>> resolution.
> Not in this case. I have a lot of material that is on a CD, in a format
> only accessible by a Windows program that won't run on anything later
> than XP. It fails when printed to a file!
>
Can the program that reads the file export it as something else? Out of
curiosity, what is the file format called, and is it by any chance
documented?

Louis

(My apologies if this shows up twice.)

Re: in praise of text files

<jqnaghFers4U8@mid.individual.net>

https://www.rocksolidbbs.com/computers/article-flat.php?id=1995&group=comp.misc#1995

Newsgroups: comp.misc
Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: news0009@eager.cx (Bob Eager)
Newsgroups: comp.misc
Subject: Re: in praise of text files
Date: 12 Oct 2022 08:00:17 GMT
Lines: 36
Message-ID: <jqnaghFers4U8@mid.individual.net>
References: <slrntjoo4n.73r.bencollver@svadhyaya.localdomain>
<20221006222224@news.eternal-september.org> <633f868e@news.ausics.net>
<jqjhg7Fers4U2@mid.individual.net> <63449931@news.ausics.net>
<jqknbmFers4U4@mid.individual.net> <TGn1L.574866$Ny99.90119@fx16.iad>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: individual.net ahatnyTKQAXr7tslO6WFJg8NkrnEWoCS0dm4DhGEE2SG0QeNao
Cancel-Lock: sha1:ZeTEQVNu+aGgr6nizcG4rJMdTxg=
User-Agent: Pan/0.145 (Duplicitous mercenary valetism; d7e168a
git.gnome.org/pan2)
 by: Bob Eager - Wed, 12 Oct 2022 08:00 UTC

On Tue, 11 Oct 2022 18:15:47 -0600, Louis Krupp wrote:

> On 10/11/2022 2:21 AM, Bob Eager wrote:
>> On Tue, 11 Oct 2022 08:14:09 +1000, Computer Nerd Kev wrote:
>>
>>>> I usually print, scan and OCR.
>>> Surely you can OCR without the printing and scanning? Ghostscript can
>>> generate PNG (etc.) bitmap images for each page of a PDF, at a
>>> specified resolution.
>> Not in this case. I have a lot of material that is on a CD, in a format
>> only accessible by a Windows program that won't run on anything later
>> than XP. It fails when printed to a file!
>>
> Can the program that reads the file export it as something else? Out of
> curiosity, what is the file format called, and is it by any chance
> documented?

It's a proprietary format, and the thing that reads it is designed to
ONLY allow documents to be read on screen or printed.

It's not a problem; finally I have completed it and won't have to revisit.

Explanation: it's a CD of back issues of a journal. They want silly money
for PDFs of single articles. I knew a colleague had all the back issues
on paper, but when I asked him he had dumped them three weeks previously!
He had the CD, but it has a 16 bit installer for the reading application.
A VM with XP allowed me to use the application.

I have now thought of another possible way, but I've done it all now. The
printing and OCR worked really well.

--
Using UNIX since v6 (1975)...

Use the BIG mirror service in the UK:
http://www.mirrorservice.org

Re: in praise of text files

<87fsft9vtf.fsf@tigger.extechop.net>

https://www.rocksolidbbs.com/computers/article-flat.php?id=1996&group=comp.misc#1996

Newsgroups: comp.misc
Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: om@iki.fi (Otto J. Makela)
Newsgroups: comp.misc
Subject: Re: in praise of text files
Date: Wed, 12 Oct 2022 13:32:12 +0300
Organization: Games and Theory
Lines: 24
Message-ID: <87fsft9vtf.fsf@tigger.extechop.net>
References: <slrntjoo4n.73r.bencollver@svadhyaya.localdomain>
<20221006222224@news.eternal-september.org> <633f868e@news.ausics.net>
<87wn9bjyz9.fsf@bogus.nodomain.nowhere> <878rlrcv13.fsf@sdf.org>
<=QM=Kpsbv2YnanbCN@bongo-ra.co>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: reader01.eternal-september.org; posting-host="19d0196dbf0b8363a0ee977c91ae597b";
logging-data="1543256"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18w4ssqpwyWtLTmobFZcvb1"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Cancel-Lock: sha1:q7HTcGDE3+PqdZbCKk6QuKUBE40=
sha1:PyF0rq0Eku1J5syhM0aTcDYbQUg=
Mail-Copies-To: never
X-URL: http://www.iki.fi/om/
X-Face: 'g'S,X"!c;\pfvl4ljdcm?cDdk<-Z;`x5;YJPI-cs~D%;_<\V3!3GCims?a*;~u$<FYl@"E
c?3?_J+Zwn~{$8<iEy}EqIn_08"`oWuqO$#(5y3hGq8}BG#sag{BL)u8(c^Lu;*{8+'Z-k\?k09ILS
 by: Otto J. Makela - Wed, 12 Oct 2022 10:32 UTC

Spiros Bousbouras <spibou@gmail.com> wrote:

> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
[...]
> Let's try it out :
>
> Greek alphabet :
> ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩ
> αβγδεζηθικλμνξοπρστυφχψω
>
> Some mathematical symbols :
> ∅ ∁ ∈ ∉ ∋ ∌ ∖ ∩ ∪ ⊂ ⊃ ⊄ ⊅ ⊆ ⊇ ⊈ ⊉ ⊊ ⊋
>
> Can you read all this ?

UTF-8 encoding works just fine with Gnus v5.13, to the extent that a
text terminal (I'm running this through mosh) can display characters.
--
/* * * Otto J. Makela <om@iki.fi> * * * * * * * * * */
/* Phone: +358 40 765 5772, ICBM: N 60 10' E 24 55' */
/* Mail: Mechelininkatu 26 B 27, FI-00100 Helsinki */
/* * * Computers Rule 01001111 01001011 * * * * * * */

Re: in praise of text files

<slrntkd9jk.1lhp.anthk@openbsd.home.local>

https://www.rocksolidbbs.com/computers/article-flat.php?id=2000&group=comp.misc#2000

Newsgroups: comp.misc
Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: anthk@disroot.org (Anthk)
Newsgroups: comp.misc
Subject: Re: in praise of text files
Date: Thu, 13 Oct 2022 00:38:56 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 19
Message-ID: <slrntkd9jk.1lhp.anthk@openbsd.home.local>
References: <slrntjoo4n.73r.bencollver@svadhyaya.localdomain>
<20221006222224@news.eternal-september.org> <633f868e@news.ausics.net>
<CZY%K.752888$BKL8.201000@fx15.iad>
<20221007231243@news.eternal-september.org>
Reply-To: Ander GM <anthk@disroot.org>
Injection-Date: Thu, 13 Oct 2022 00:38:56 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="63f79be1eadd618227d11acc94762c66";
logging-data="1699850"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18qzDgWo9tW3tQndu1m1cfh"
User-Agent: slrn/1.0.3 (OpenBSD)
Cancel-Lock: sha1:ofK0CYAV3hHq9KNWA1fTxD3OfTM=
 by: Anthk - Thu, 13 Oct 2022 00:38 UTC

On 2022-10-07, Roger Blake <rogblake@iname.invalid> wrote:
> On 2022-10-07, scott@alfter.diespammersdie.us <scott@alfter.diespammersdie.us> wrote:
>> If you're lucky, you can extract text from a PDF by selecting and copying
>> it. If it's just an image, though (as it might be if the PDF was produced
>> from a scan), you'll get back nothing. You might be able to feed the PDF
>> through an OCR engine and extract the text that way, but the quality of
>> those results depends largely on the quality of the scan.
>
> I used to be able to extract text directly from Microsoft Word documents
> using "antiword" but it only works with the old binary (.doc) format and
> of course the default has been the new .docx format since the 2007 version.
>
> At least pdf is an open format. The "pdftotext" program can extract any
> actual text it finds in a pdf file but sometimes those are just an image
> which would require ocr to interpret.
>

With MUPDF you can select the text with the right click mouse button
and it will be copied into the clipboard.

Re: in praise of text files

<slrntkd9md.1lhp.anthk@openbsd.home.local>

https://www.rocksolidbbs.com/computers/article-flat.php?id=2001&group=comp.misc#2001

Newsgroups: comp.misc
Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: anthk@disroot.org (Anthk)
Newsgroups: comp.misc
Subject: Re: in praise of text files
Date: Thu, 13 Oct 2022 00:38:57 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 31
Message-ID: <slrntkd9md.1lhp.anthk@openbsd.home.local>
References: <slrntjoo4n.73r.bencollver@svadhyaya.localdomain>
<20221006222224@news.eternal-september.org> <633f868e@news.ausics.net>
<CZY%K.752888$BKL8.201000@fx15.iad>
<20221007231243@news.eternal-september.org>
<jt8e1j-ec9.ln1@berry.solani.net> <634492eb@news.ausics.net>
Reply-To: Ander GM <anthk@disroot.org>
Injection-Date: Thu, 13 Oct 2022 00:38:57 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="63f79be1eadd618227d11acc94762c66";
logging-data="1699850"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19yDYp4nsNOFXWrMuxGjEIg"
User-Agent: slrn/1.0.3 (OpenBSD)
Cancel-Lock: sha1:A7jF5EoHDzNLg6Atk74HwFN/+yo=
 by: Anthk - Thu, 13 Oct 2022 00:38 UTC

On 2022-10-10, Computer Nerd Kev <not@telling.you.invalid> wrote:
> Retrograde <fungus@amongus.com.invalid> wrote:
>> On 2022-10-07, Roger Blake <rogblake@iname.invalid> wrote:
>>> On 2022-10-07, scott@alfter.diespammersdie.us <scott@alfter.diespammersdie.us> wrote:
>>
>>> I used to be able to extract text directly from Microsoft Word
>>> documents using "antiword" but it only works with the old binary
>>> (.doc) format and of course the default has been the new .docx format
>>> since the 2007 version.
>>
>> Pandoc does quite a nice job of converting docx to other formats.
>
> I just discovered that myself actually. This command seems to work
> well to generate a HTML file with any images embedded within it (I
> prefer this a little over PDF):
> pandoc -s --embed-resources --ascii -o file.htm file.docx
>
> The other one that I would like to handle is Excel spreadsheets in
> xls and xlsx formats. PHPSpreadsheet from the PHPOffice project
> seems to handle this, but as it's not designed for command-line use
> it's going to take some more work to get equivalent functionality
> out of it.
>
> https://github.com/PHPOffice
>

Get sc-im+gnuplot for xls and xlsx files. It's like LibreOffice Calc
but for the CLI and with vi keys.
For more operations, install visicalc and the required dependencies.
Also, to dump DOC files, you can use catdoc and antiword.

Re: in praise of text files

<jqpu9jF9nd2U1@mid.individual.net>

https://www.rocksolidbbs.com/computers/article-flat.php?id=2004&group=comp.misc#2004

Newsgroups: comp.misc
Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: news0009@eager.cx (Bob Eager)
Newsgroups: comp.misc
Subject: Re: in praise of text files
Date: 13 Oct 2022 07:50:11 GMT
Lines: 17
Message-ID: <jqpu9jF9nd2U1@mid.individual.net>
References: <slrntjoo4n.73r.bencollver@svadhyaya.localdomain>
<20221006222224@news.eternal-september.org> <633f868e@news.ausics.net>
<CZY%K.752888$BKL8.201000@fx15.iad>
<20221007231243@news.eternal-september.org>
<slrntkd9jk.1lhp.anthk@openbsd.home.local>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: individual.net yuThgtqvPgIzts6/AN+VdwfQ/t95amLhJaekIILfk/mT0XcLBB
Cancel-Lock: sha1:ZBNoEhER6jvA4YQxodWcVnfgQWk=
User-Agent: Pan/0.145 (Duplicitous mercenary valetism; d7e168a
git.gnome.org/pan2)
 by: Bob Eager - Thu, 13 Oct 2022 07:50 UTC

On Thu, 13 Oct 2022 00:38:56 +0000, Anthk wrote:

>> At least pdf is an open format. The "pdftotext" program can extract any
>> actual text it finds in a pdf file but sometimes those are just an
>> image which would require ocr to interpret.
>>
>>
> With MUPDF you can select the text with the right click mouse button and
> it will be copied into the clipboard.

Not if the pages are just scanned images.
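
One way to tell the two cases apart before reaching for OCR is to run pdftotext and see whether anything comes back. This is a minimal sketch, assuming a `pdftotext` executable is on the PATH. The 20-character threshold and both function names are arbitrary choices for the example, and a scan with a stray text watermark would fool the heuristic.

```python
import subprocess


def extract_text(pdf_path):
    """Run pdftotext and return what it wrote to standard output ('-' means stdout)."""
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        check=True,
        capture_output=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def probably_needs_ocr(extracted_text, min_chars=20):
    """Heuristic (an assumption, not a rule): almost no visible text suggests scanned pages."""
    visible = "".join(extracted_text.split())  # drops spaces, newlines and form feeds
    return len(visible) < min_chars


if __name__ == "__main__":
    # Pure-function demonstration; extract_text() needs a real PDF and pdftotext installed.
    print(probably_needs_ocr("\f\f\n"))
    print(probably_needs_ocr("This page carries a real text layer."))
```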

--
Using UNIX since v6 (1975)...

Use the BIG mirror service in the UK:
http://www.mirrorservice.org

Re: in praise of text files

<634886c1@news.ausics.net>

https://www.rocksolidbbs.com/computers/article-flat.php?id=2005&group=comp.misc#2005

Newsgroups: comp.misc
Message-ID: <634886c1@news.ausics.net>
From: not@telling.you.invalid (Computer Nerd Kev)
Subject: Re: in praise of text files
Newsgroups: comp.misc
References: <slrntjoo4n.73r.bencollver@svadhyaya.localdomain> <20221006222224@news.eternal-september.org> <633f868e@news.ausics.net> <CZY%K.752888$BKL8.201000@fx15.iad> <20221007231243@news.eternal-september.org> <jt8e1j-ec9.ln1@berry.solani.net> <634492eb@news.ausics.net> <slrntkd9md.1lhp.anthk@openbsd.home.local>
User-Agent: tin/2.0.1-20111224 ("Achenvoir") (UNIX) (Linux/2.4.31 (i586))
NNTP-Posting-Host: news.ausics.net
Date: 14 Oct 2022 07:44:33 +1000
Organization: Ausics - https://www.ausics.net
Lines: 33
X-Complaints: abuse@ausics.net
Path: i2pn2.org!i2pn.org!news.bbs.nz!news.ausics.net!not-for-mail
 by: Computer Nerd Kev - Thu, 13 Oct 2022 21:44 UTC

Anthk <anthk@disroot.org> wrote:
> On 2022-10-10, Computer Nerd Kev <not@telling.you.invalid> wrote:
>> The other one that I would like to handle is Excel spreadsheets in
>> xls and xlsx formats. PHPSpreadsheet from the PHPOffice project
>> seems to handle this, but as it's not designed for command-line use
>> it's going to take some more work to get equivalent functionality
>> out of it.
>>
>> https://github.com/PHPOffice
>>
>
> Get sc-im+gnuplot for xls and xlsx files. It's like LibreOffice Calc
> but for the CLI and with vi keys.

Thanks! That saved me from trying to figure out how to write a
command-line application in PHP. It still took me a while to find
the right options to get it to work as a Pandoc-style command-line
converter though. This is the magic concoction that generates a TSV
file from an XLSX spreadsheet without a lot of rubbish at the start
of the file:

sc-im --export_tab --nocurses --quit_afterload file.xlsx > file.tsv

Strangely the "--output=" option only wants to create empty files
for me (with version 0.7.0).
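
Once the spreadsheet is in TSV form it is easy to consume from a script. This is a minimal sketch using Python's standard csv module; the sample data is invented, and whether sc-im's output ever contains quoted fields that need different `csv` settings has not been checked here.

```python
import csv
import io


def read_tsv(stream):
    """Parse tab-separated text into a list of rows (each a list of strings)."""
    return list(csv.reader(stream, delimiter="\t"))


# Invented sample standing in for the contents of file.tsv.
sample = "item\tqty\nfloppy disks\t10\n"
rows = read_tsv(io.StringIO(sample))

if __name__ == "__main__":
    for row in rows:
        print(row)
```

For a real file, pass an open file object instead, for example `open("file.tsv", newline="", encoding="utf-8")`.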

The terminal-based spreadsheet program itself does look interesting
as well, though I'm pretty sure that I'd miss selecting cells and
copying/pasting using the mouse.

--
__ __
#_ < |\| |< _#
