

The Art of Unix Programming - Case Study: awk

<st6udg$k03$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1025&group=comp.lang.awk#1025
From: janis_papanagnou@hotmail.com (Janis Papanagnou)
Newsgroups: comp.lang.awk
Subject: The Art of Unix Programming - Case Study: awk
Date: Sun, 30 Jan 2022 22:02:40 +0100
 by: Janis Papanagnou - Sun, 30 Jan 2022 21:02 UTC

I accidentally stumbled across the book "The Art of UNIX Programming"
(2004), by Eric S. Raymond. It has a chapter on Awk (about one and a
half page long). I was a bit astonished about quite some statements,
valuations, and conclusions. (And not only in the light of a recent
Fosslife article that Arnold informed us about here in c.l.a in May
2021.)

Here are two paragraphs quoted from the book. I'm interested in your
opinions.

" The awk language was originally designed to be a small,
expressive special-purpose language for report generation.
Unfortunately, it turns out to have been designed at a bad
spot on the complexity-vs.-power curve. The action language
is noncompact, but the pattern-driven framework it sits
inside keeps it from being generally applicable — that’s the
worst of both worlds. And the new-school scripting languages
can do anything awk can; their equivalent programs are
usually just as readable, if not more so. "

" For a few years after the release of Perl in 1987, awk
remained competitive simply because it had a smaller, faster
implementation. But as the cost of compute cycles and memory
dropped, the economic reasons for favoring a special-purpose
language that was relatively thrifty with both lost their
force. Programmers increasingly chose to do awklike things
with Perl or (later) Python, rather than keep two different
scripting languages in their heads. By the year 2000 awk had
become little more than a memory for most old-school Unix
hackers, and not a particularly nostalgic one. "

Janis

Re: The Art of Unix Programming - Case Study: awk

<st70q4$3r4c6$1@news.xmission.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1026&group=comp.lang.awk#1026
From: gazelle@shell.xmission.com (Kenny McCormack)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Sun, 30 Jan 2022 21:43:32 -0000 (UTC)
References: <st6udg$k03$1@dont-email.me>
 by: Kenny McCormack - Sun, 30 Jan 2022 21:43 UTC

In article <st6udg$k03$1@dont-email.me>,
Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>I accidentally stumbled across the book "The Art of UNIX Programming"
>(2004), by Eric S. Raymond. It has a chapter on Awk (about one and a
>half page long). I was a bit astonished about quite some statements,
>valuations, and conclusions. (And not only in the light of a recent
>Fosslife article that Arnold informed us about here in c.l.a in May
>2021.)
>
>Here are two paragraphs quoted from the book. I'm interested in your
>opinions.

Obviously, this guy is full of crap.

That's not as uncommon a situation (even in those we are supposed to admire
and hold up as heroes) as we'd like it to be.

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/Pedantic

Re: The Art of Unix Programming - Case Study: awk

<st7144$765$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1027&group=comp.lang.awk#1027
From: mortonspam@gmail.com (Ed Morton)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Sun, 30 Jan 2022 15:48:51 -0600
References: <st6udg$k03$1@dont-email.me>
 by: Ed Morton - Sun, 30 Jan 2022 21:48 UTC

On 1/30/2022 3:02 PM, Janis Papanagnou wrote:
> I accidentally stumbled across the book "The Art of UNIX Programming"
> (2004), by Eric S. Raymond. It has a chapter on Awk (about one and a
> half page long). I was a bit astonished about quite some statements,
> valuations, and conclusions. (And not only in the light of a recent
> Fosslife article that Arnold informed us about here in c.l.a in May
> 2021.)
>
> Here are two paragraphs quoted from the book. I'm interested in your
> opinions.
>
> " The awk language was originally designed to be a small,
> expressive special-purpose language for report generation.
> Unfortunately, it turns out to have been designed at a bad
> spot on the complexity-vs.-power curve. The action language
> is noncompact, but the pattern-driven framework it sits
> inside keeps it from being generally applicable — that’s the
> worst of both worlds. And the new-school scripting languages
> can do anything awk can; their equivalent programs are
> usually just as readable, if not more so. "
>
> " For a few years after the release of Perl in 1987, awk
> remained competitive simply because it had a smaller, faster
> implementation. But as the cost of compute cycles and memory
> dropped, the economic reasons for favoring a special-purpose
> language that was relatively thrifty with both lost their
> force. Programmers increasingly chose to do awklike things
> with Perl or (later) Python, rather than keep two different
> scripting languages in their heads. By the year 2000 awk had
> become little more than a memory for most old-school Unix
> hackers, and not a particularly nostalgic one. "
>
>
> Janis

Sounds like he completely missed the point on how and why to use awk,
misunderstood the huge benefits of a tiny language that doesn't have a
million constructs to do things "compactly", and is unaware of current
awk usage which, if the questions posted on StackOverflow and
StackExchange are any indication, is thriving.

Ed.

Re: The Art of Unix Programming - Case Study: awk

<st71oc$3r4c6$2@news.xmission.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1028&group=comp.lang.awk#1028
From: gazelle@shell.xmission.com (Kenny McCormack)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Sun, 30 Jan 2022 21:59:40 -0000 (UTC)
References: <st6udg$k03$1@dont-email.me> <st70q4$3r4c6$1@news.xmission.com>
 by: Kenny McCormack - Sun, 30 Jan 2022 21:59 UTC

In article <st70q4$3r4c6$1@news.xmission.com>,
Kenny McCormack <gazelle@shell.xmission.com> wrote:
>In article <st6udg$k03$1@dont-email.me>,
>Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>>I accidentally stumbled across the book "The Art of UNIX Programming"
>>(2004), by Eric S. Raymond. It has a chapter on Awk (about one and a
>>half page long). I was a bit astonished about quite some statements,
>>valuations, and conclusions. (And not only in the light of a recent
>>Fosslife article that Arnold informed us about here in c.l.a in May
>>2021.)
>>
>>Here are two paragraphs quoted from the book. I'm interested in your
>>opinions.
>
>Obviously, this guy is full of crap.
>
>That's not as uncommon a situation (even in those we are supposed to admire
>and hold up as heroes) as we'd like it to be.

It's funny in particular, since he mentions the power-complexity curve, and
I always thought that was AWK's main strength - that's the thing I always liked
about it - that it was perfectly situated on that curve. You can do really
cool things in AWK w/o having to spend lots of time bowing down to the gods
of the language. I.e., with AWK, you can sit down and start writing your
algorithm w/o having to spend lots of time writing boilerplate code to get
started, as with most other languages.

The problem really is as it is with everything - ya always gotta push the new
stuff. Whether we're talking about books, movies, TV, music, programming
languages, whatever. You always have to be pushing the new stuff and
disparaging the old. In fact, I read something recently that most of the
interest in music these days is in old music and this is viewed as a
certified Bad Thing, by people who need people to be interested in (and, of
course, buying and supporting) new music in order to keep the economic
engine running.

--
Reading any post by Fred Hodgin, you're always faced with the choice of:
lunatic, moron, or troll.

I always try to be generous and give benefit of the doubt, by assuming troll.

Re: The Art of Unix Programming - Case Study: awk

<st735v$lnq$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1029&group=comp.lang.awk#1029
From: janis_papanagnou@hotmail.com (Janis Papanagnou)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Sun, 30 Jan 2022 23:23:59 +0100
References: <st6udg$k03$1@dont-email.me> <st70q4$3r4c6$1@news.xmission.com>
<st71oc$3r4c6$2@news.xmission.com>
 by: Janis Papanagnou - Sun, 30 Jan 2022 22:23 UTC

On 30.01.2022 22:59, Kenny McCormack wrote:
> In article <st70q4$3r4c6$1@news.xmission.com>,
> Kenny McCormack <gazelle@shell.xmission.com> wrote:
>> In article <st6udg$k03$1@dont-email.me>,
>> Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
>>> I accidentally stumbled across the book "The Art of UNIX Programming"
>>> (2004), by Eric S. Raymond. It has a chapter on Awk (about one and a
>>> half page long). I was a bit astonished about quite some statements,
>>> valuations, and conclusions. (And not only in the light of a recent
>>> Fosslife article that Arnold informed us about here in c.l.a in May
>>> 2021.)
>>>
>>> Here are two paragraphs quoted from the book. I'm interested in your
>>> opinions.
>>
>> Obviously, this guy is full of crap.
>>
>> That's not as uncommon a situation (even in those we are supposed to admire
>> and hold up as heroes) as we'd like it to be.
>
> It's funny in particular, since he mentions the power-complexity curve, and
> I always thought that was AWK's main strength - that's the thing I always liked
> about it - that it was perfectly situated on that curve.

Yep. Exactly this was where I just thought: "WHAT?" (or rather "WTF?").
This ratio is what I also regularly communicate as a big strength of Awk.

I read that paragraph two or three times to understand Eric's mindset;
obviously he measures Awk against a general purpose language as the
target. I would say (as Ed also put it), he missed the point.

I also stumbled across the argument of other languages (in context of
Perl and Python as the only mentioned languages) supposedly being the
preferable alternatives ("are usually just as readable, if not more
so").

> You can do really
> cool things in AWK w/o having to spend lots of time bowing down to the gods
> of the language. I.e., with AWK, you can sit down and start writing your
> algorithm w/o having to spend lots of time writing boilerplate code to get
> started, as with most other languages.
>
> The problem really is as it is with everything - ya always gotta push the new
> stuff. Whether we're talking about books, movies, TV, music, programming
> languages, whatever. You always have to be pushing the new stuff and
> disparaging the old. [...]

It's interesting that in the same book (200 pages earlier), in chapter
1.2, "The Durability of Unix", he points to the strengths of Unix and
why Unix persisted (and even got stronger).

Schizophrenic.

Janis

Re: The Art of Unix Programming - Case Study: awk

<87mtjcdcu4.fsf@bsb.me.uk>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1030&group=comp.lang.awk#1030
From: ben.usenet@bsb.me.uk (Ben Bacarisse)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Sun, 30 Jan 2022 23:26:59 +0000
References: <st6udg$k03$1@dont-email.me>
 by: Ben Bacarisse - Sun, 30 Jan 2022 23:26 UTC

Janis Papanagnou <janis_papanagnou@hotmail.com> writes:

> I accidentally stumbled across the book "The Art of UNIX Programming"
> (2004), by Eric S. Raymond. It has a chapter on Awk (about one and a
> half page long). I was a bit astonished about quite some statements,
> valuations, and conclusions. (And not only in the light of a recent
> Fosslife article that Arnold informed us about here in c.l.a in May
> 2021.)
>
> Here are two paragraphs quoted from the book. I'm interested in your
> opinions.
>
> " The awk language was originally designed to be a small,
> expressive special-purpose language for report generation.
> Unfortunately, it turns out to have been designed at a bad
> spot on the complexity-vs.-power curve. The action language
> is noncompact, but the pattern-driven framework it sits
> inside keeps it from being generally applicable — that’s the
> worst of both worlds. And the new-school scripting languages
> can do anything awk can; their equivalent programs are
> usually just as readable, if not more so. "
>
> " For a few years after the release of Perl in 1987, awk
> remained competitive simply because it had a smaller, faster
> implementation. But as the cost of compute cycles and memory
> dropped, the economic reasons for favoring a special-purpose
> language that was relatively thrifty with both lost their
> force. Programmers increasingly chose to do awklike things
> with Perl or (later) Python, rather than keep two different
> scripting languages in their heads. By the year 2000 awk had
> become little more than a memory for most old-school Unix
> hackers, and not a particularly nostalgic one. "

I think there's some truth in this. I don't like the quasi-scientific
way it's put -- I'll bet ESR has no measurements of complexity or
power -- but the story matches my experience of people moving away from
AWK.

As someone else has said here, there's a lot to be said for a small
language, but that advantage starts to drain away as soon as you are
forced to bite the bullet of using a bigger one (whatever that really
means). A huge driver of this for Perl was CPAN. Perl had publicly
shared modules so you could knock up something to parse out bits of
HTML, process emails and so on in just an hour or so. And you could
avoid name clashes quite easily. At the time (and maybe to this day)
you shared AWK code by literal copying of text into your script, hoping
that no name clashes would cause trouble.

Of course, AWK was not designed for things like HTML, but once you know
enough Perl to do the one project that needed it, it's right there for
the next one, even if AWK would do that one just as well. In fact, even
if AWK can do the next project /better/, because keeping two scripting
languages in your head is not as easy as keeping one there.

--
Ben.

Re: The Art of Unix Programming - Case Study: awk

<st8c31$dc9$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1031&group=comp.lang.awk#1031
From: janis_papanagnou@hotmail.com (Janis Papanagnou)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Mon, 31 Jan 2022 11:02:08 +0100
References: <st6udg$k03$1@dont-email.me> <87mtjcdcu4.fsf@bsb.me.uk>
 by: Janis Papanagnou - Mon, 31 Jan 2022 10:02 UTC

On 31.01.2022 00:26, Ben Bacarisse wrote:
>
> As someone else has said here, there's a lot to be said for a small
> language, but that advantage starts to drain away as soon as you are
> forced to bite the bullet of using a bigger one (whatever that really
> means). A huge driver of this for Perl was CPAN. Perl had publicly
> shared modules so you could knock up something to parse out bits of
> HTML, process emails and so on in just an hour or so. And you could
> avoid name clashes quite easily. At the time (and maybe to this day)
> you shared AWK code by literal copying of text into your script, hoping
> that no name clashes would cause trouble.

I am skeptical about that. Aren't you essentially drawing the picture
of "featureitis" - feature driven language enhancements? It seems to
me that often hype starts and initially fosters a new language, and
fans continue using these languages for things initially unintended,
so that the application domain is expanded step by step, and (if done
in the right way) by libraries, successfully (in a way). The result
appears to be an asymptotic evolution toward general purpose languages, or
something that's intended as one; often ignoring sophisticated design
(Javascript comes to my mind)[*]. But the point is, in my opinion, that
the original intent to be a small language that covers only a special
domain gets lost. The schizophrenic thing - also just in my opinion -
is that it seems contrary to the Unix philosophy of separating duties
and keeping tools small and specialized, which incidentally ESR also
describes extensively in that book. It's one thing if folks are fans of
some new language (because of design, features, applicability to their
domain, or a good marketing division, etc.) and focus on that; that's
fine. It's another thing when powerful special purpose languages - I
consider awk one - are dismissed because they are not general purpose
monoliths or feature-full toolboxes.[**]

Janis

[*] This is different from language designs as we know them, e.g. from
the 1960's (e.g. Simula, Algol) or even later (e.g. C++), especially
when standards-driven.

[**] Incidentally GNU Awk opens that path with its Extension Library,
without actually taking it.

Re: The Art of Unix Programming - Case Study: awk

<8735l3ddwz.fsf@bsb.me.uk>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1032&group=comp.lang.awk#1032
From: ben.usenet@bsb.me.uk (Ben Bacarisse)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Mon, 31 Jan 2022 17:15:56 +0000
References: <st6udg$k03$1@dont-email.me> <87mtjcdcu4.fsf@bsb.me.uk>
<st8c31$dc9$1@dont-email.me>
 by: Ben Bacarisse - Mon, 31 Jan 2022 17:15 UTC

Janis Papanagnou <janis_papanagnou@hotmail.com> writes:

> On 31.01.2022 00:26, Ben Bacarisse wrote:
>>
>> As someone else has said here, there's a lot to be said for a small
>> language, but that advantage starts to drain away as soon as you are
>> forced to bite the bullet of using a bigger one (whatever that really
>> means). A huge driver of this for Perl was CPAN. Perl had publicly
>> shared modules so you could knock up something to parse out bits of
>> HTML, process emails and so on in just an hour or so. And you could
>> avoid name clashes quite easily. At the time (and maybe to this day)
>> you shared AWK code by literal copying of text into your script, hoping
>> that no name clashes would cause trouble.
>
> I am skeptical about that. Aren't you essentially drawing the picture
> of "featureitis" - feature driven language enhancements?

I don't think so because I don't think I'm talking about language
enhancements.

> It seems to
> me that often hype starts and initially fosters a new language, and
> fans continue using these languages for things initially unintended,
> so that the application domain is expanded step by step, and (if done
> in the right way) by libraries, successfully (in a way). The result
> appears to be an asymptotic evolution toward general purpose languages, or
> something that's intended as one; often ignoring sophisticated design
> (Javascript comes to my mind)[*].

I must be having a bad day because I don't follow. I was describing how
a lot of people I know transitioned away from AWK. It started with a
task that AWK was not good at (I'm not blaming AWK here) but then they
have two options for the next task.

I don't think any of the people I am thinking of were subject to hype.
For one thing, I'm talking about the late 80s and early 90s. There
really wasn't much "hype" about scripting languages. You just used
whatever tools suited the task.

> But the point is, in my opinion, that
> the original intent to be a small language that covers only a special
> domain gets lost. The schizophrenic thing - also just in my opinion -
> is that it seems contrary to the Unix philosophy of separating duties
> and keeping tools small and specialized, which incidentally ESR also
> describes extensively in that book.

This is why keeping AWK simple and narrowly focused is good. But that
will inevitably lead people to find alternatives for some tasks, and
that is a danger (if you want to look at it like a competition) because
it opens the door to using these other alternatives in the future.

--
Ben.

Re: The Art of Unix Programming - Case Study: awk

<20220131101050.713@kylheku.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1033&group=comp.lang.awk#1033
From: 480-992-1380@kylheku.com (Kaz Kylheku)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Mon, 31 Jan 2022 18:55:02 -0000 (UTC)
References: <st6udg$k03$1@dont-email.me> <87mtjcdcu4.fsf@bsb.me.uk>
<st8c31$dc9$1@dont-email.me>
 by: Kaz Kylheku - Mon, 31 Jan 2022 18:55 UTC

On 2022-01-31, Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:
> On 31.01.2022 00:26, Ben Bacarisse wrote:
>>
>> As someone else has said here, there's a lot to be said for a small
>> language, but that advantage starts to drain away as soon as you are
>> forced to bite the bullet of using a bigger one (whatever that really
>> means). A huge driver of this for Perl was CPAN. Perl had publicly
>> shared modules so you could knock up something to parse out bits of
>> HTML, process emails and so on in just an hour or so. And you could
>> avoid name clashes quite easily. At the time (and maybe to this day)
>> you shared AWK code by literal copying of text into your script, hoping
>> that no name clashes would cause trouble.
>
> I am skeptical about that. Aren't you essentially drawing the picture
> of "featureitis" - feature driven language enhancements? It seems to
> me that often hype starts and initially fosters a new language, and
> fans continue using these languages for things initially unintended,

Perl had enough capabilities in the core language that someone could
write, say, a useable interface module to some RDBMS. Or HTTP serving or
querying or whatever.

For some people, not having such a ready-made module will be a
deal-breaker. And that's even if it *can* be written in some language
they are considering. Most of those things can't be written in Awk
because it has insufficient system access.

> [**] Incidentally GNU Awk opens that path with its Extension Library,
> without actually taking it.

This is probably too late, and in an awkward form. Leading scripting
languages nowadays have a FFI (foreign function interface), whereby you
can bind to shared libraries without having to compile (let alone write)
any C code, just using FFI statements in the scripting language.

Lisps had this kind of FFI going back to at least the 1980's. E.g.
DEC's VaxLisp.

Here is a documentation reference: VAX LISP VMS System Access
Programming Guide, May 1986:

http://www.softwarepreservation.org/projects/LISP/common_lisp_family/dec/VAX_LISP_VMS_System_Access_Programming_Guide_May86.pdf

There are examples in section 2.10, like calling an Acos function
in a math library:

(DEFINE-EXTERNAL-ROUTINE (MTH$ACOSD
                          :FILE "MTHRTL"
                          :RESULT (:LISP-TYPE SINGLE-FLOAT
                                   :VAX-TYPE :F-FLOATING))
  "This routine returns the arc cosine
of an angle in degrees."
  (X :LISP-TYPE SINGLE-FLOAT
     :VAX-TYPE :F-FLOATING))

That's it: bind to the MTH$ACOSD symbol in the MTHRTL library
(RTL == run-time library, likely). The lisp and external types
of the parameter and return value are specified, and all the conversion
happens automatically.

After the above DEFINE-EXTERNAL-ROUTINE incantation, it's ready for use:

Lisp> (CALL-OUT MTH$ACOSD 0.5)
60.0

OK, that's either state of the art for 1986, or else a lower bound for what
was state of the art.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Re: The Art of Unix Programming - Case Study: awk

<stblci$iso$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1038&group=comp.lang.awk#1038

Newsgroups: comp.lang.awk
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: janis_papanagnou@hotmail.com (Janis Papanagnou)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Tue, 1 Feb 2022 16:59:14 +0100
Organization: A noiseless patient Spider
Lines: 17
Message-ID: <stblci$iso$1@dont-email.me>
References: <st6udg$k03$1@dont-email.me> <87mtjcdcu4.fsf@bsb.me.uk>
<st8c31$dc9$1@dont-email.me> <8735l3ddwz.fsf@bsb.me.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 1 Feb 2022 15:59:14 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="104b6dee84ab88669742ec376381c132";
logging-data="19352"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18bbY2uuwEt+LuQh6aadqDV"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:A9b4Dr9MlhILE21xVSbO88AOqUE=
In-Reply-To: <8735l3ddwz.fsf@bsb.me.uk>
X-Enigmail-Draft-Status: N1110
 by: Janis Papanagnou - Tue, 1 Feb 2022 15:59 UTC

On 31.01.2022 18:15, Ben Bacarisse wrote:
>> [ keep things simple and specialized Unix philosophy ]
>
> This is why keeping AWK simple and narrowly focused is good. But that
> will inevitably lead people to find alternatives for some tasks, and
> that is a danger (if you want to look at it like a competition) because
> it opens the door to using these other alternatives in the future.

Well, I think it's okay to find a language better suited for a given
task. Certainly better than if every [simple] language gets enhanced
only for competition purposes (which is no argument for me).

In that light I cannot understand how ESR came to the statements that
I quoted in my OP.

Janis

Re: The Art of Unix Programming - Case Study: awk

<875ypybkqn.fsf@bsb.me.uk>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1039&group=comp.lang.awk#1039

Newsgroups: comp.lang.awk
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ben.usenet@bsb.me.uk (Ben Bacarisse)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Tue, 01 Feb 2022 16:43:44 +0000
Organization: A noiseless patient Spider
Lines: 33
Message-ID: <875ypybkqn.fsf@bsb.me.uk>
References: <st6udg$k03$1@dont-email.me> <87mtjcdcu4.fsf@bsb.me.uk>
<st8c31$dc9$1@dont-email.me> <8735l3ddwz.fsf@bsb.me.uk>
<stblci$iso$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="f5d105e5673c19dc18ee2409e3f0d08d";
logging-data="13331"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/MVhwfHSYTgdX1n+64peCWZRG/6R+CfaQ="
Cancel-Lock: sha1:tMlVm4EjtO66XfMoRD5nwCojIuU=
sha1:41HNbMmOjWbct+mlw8nBNqInKlU=
X-BSB-Auth: 1.ea352035f87cda47b19c.20220201164344GMT.875ypybkqn.fsf@bsb.me.uk
 by: Ben Bacarisse - Tue, 1 Feb 2022 16:43 UTC

Janis Papanagnou <janis_papanagnou@hotmail.com> writes:

> On 31.01.2022 18:15, Ben Bacarisse wrote:
>>> [ keep things simple and specialized Unix philosophy ]
>>
>> This is why keeping AWK simple and narrowly focused is good. But that
>> will inevitably lead people to find alternatives for some tasks, and
>> that is a danger (if you want to look at it like a competition) because
>> it opens the door to using these other alternatives in the future.
>
> Well, I think it's okay to find a language better suited for a given
> task. Certainly better than if every [simple] language gets enhanced
> only for competition purposes (which is no argument for me).
>
> In that light I cannot understand how ESR came to the statements that
> I quoted in my OP.

The problem is cognitive load. Not everyone can (or wants to) keep
multiple general-purpose programming languages, several scripting
languages and a couple of shell languages in their head all the time.

To my mind, where ESR errs is in thinking of AWK as a scripting language
at all. If you think of it as a tool for manipulating line-oriented
text files to be used alongside Unix's other tools like grep, cut, sort,
uniq then you probably won't mind the space it takes up in your head.

The sweet spot is a simple, easy to remember language, that can do 95%
of the tasks you want to script. ESR thinks AWK misses that sweet spot
and that that explains why it is not much used these days. You need a
particular computing environment to see exactly where AWK fits in.

--
Ben.

Re: The Art of Unix Programming - Case Study: awk

<20220201141255.519@kylheku.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1040&group=comp.lang.awk#1040

Newsgroups: comp.lang.awk
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: 480-992-1380@kylheku.com (Kaz Kylheku)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Tue, 1 Feb 2022 22:21:06 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 34
Message-ID: <20220201141255.519@kylheku.com>
References: <st6udg$k03$1@dont-email.me> <87mtjcdcu4.fsf@bsb.me.uk>
<st8c31$dc9$1@dont-email.me> <8735l3ddwz.fsf@bsb.me.uk>
<stblci$iso$1@dont-email.me> <875ypybkqn.fsf@bsb.me.uk>
Injection-Date: Tue, 1 Feb 2022 22:21:06 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="bcdfedb9bfd2b4b5d14fa171537b55e5";
logging-data="28409"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18GAKr4i/gTC6G41Ck7nFkxO3iTWSiO4lI="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:8j/hfQLBaPS/6tMn8wdod5RRmNs=
 by: Kaz Kylheku - Tue, 1 Feb 2022 22:21 UTC

On 2022-02-01, Ben Bacarisse <ben.usenet@bsb.me.uk> wrote:
> To my mind, where ESR errs is in thinking of AWK as a scripting language
> at all. If you think of it as a tool for manipulating line-oriented
> text files to be used alongside Unix's other tools like grep, cut, sort,
> uniq then you probably won't mind the space it takes up in your head.

Where ESR errs is believing that Awk is a language he actually knows.

Otherwise he'd know that you can use the "curly brace dialect" without
the pattern-condition framework, other than a BEGIN clause.

function helper()
{
}

function main()
{
    helper();
}

BEGIN { main(); }

Awk turns off the pattern-action framework when there are no patterns
and actions other than BEGIN.
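
To make that concrete, here is a minimal sketch (an illustration, not code from the original post), wrapped in a shell function so it can be run directly. The names run_begin_only, helper and main, and the doubling done by helper, are purely illustrative. Because the program contains only function definitions and a BEGIN block with no getline, awk should exit without consuming its input.

```shell
# Sketch only: helper(), main() and run_begin_only are illustrative names.
run_begin_only() {
    awk '
    function helper(n) { return n * 2 }

    function main() {
        print "helper(21) = " helper(21)
    }

    # The only rule is BEGIN, so awk exits without reading any input.
    BEGIN { main() }
    '
}

# The here-document is supplied as standard input but is never consumed.
run_begin_only <<'EOF'
this line is ignored
EOF
```

Run as written, this prints a single line, "helper(21) = 42", and the input line never appears in the output.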

He said some pretty quirky things about Lisp also, like it being
useful for some profound enlightenment "once you get it" that will
magically make you a better programmer for the rest of your days,
without you having to learn anything about Lisp or continue using
it beyond that one "eureka" moment.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Re: The Art of Unix Programming - Case Study: awk

<87czk69dj6.fsf@bsb.me.uk>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1041&group=comp.lang.awk#1041

Newsgroups: comp.lang.awk
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ben.usenet@bsb.me.uk (Ben Bacarisse)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Wed, 02 Feb 2022 03:02:21 +0000
Organization: A noiseless patient Spider
Lines: 33
Message-ID: <87czk69dj6.fsf@bsb.me.uk>
References: <st6udg$k03$1@dont-email.me> <87mtjcdcu4.fsf@bsb.me.uk>
<st8c31$dc9$1@dont-email.me> <8735l3ddwz.fsf@bsb.me.uk>
<stblci$iso$1@dont-email.me> <875ypybkqn.fsf@bsb.me.uk>
<20220201141255.519@kylheku.com>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="f78ddbb5a5a3a1b2f8beca2913947ed7";
logging-data="25970"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/KYx76LeMovNgEmgOVcX1HUYSAj1dWh7A="
Cancel-Lock: sha1:73ZI0t1DA/yTe/zDNnkjoLMjjII=
sha1:mycvwM4vV6v2ndtcnmld9699QhY=
X-BSB-Auth: 1.2fd638553af94f44ee6b.20220202030221GMT.87czk69dj6.fsf@bsb.me.uk
 by: Ben Bacarisse - Wed, 2 Feb 2022 03:02 UTC

Kaz Kylheku <480-992-1380@kylheku.com> writes:

> On 2022-02-01, Ben Bacarisse <ben.usenet@bsb.me.uk> wrote:
>> To my mind, where ESR errs is in thinking of AWK as a scripting language
>> at all. If you think of it as a tool for manipulating line-oriented
>> text files to be used alongside Unix's other tools like grep, cut, sort,
>> uniq then you probably won't mind the space it takes up in your head.
>
> Where ESR errs is believing that Awk is a language he actually knows.
>
> Otherwise he'd know that you can use the "curly brace dialect" without
> the pattern-condition framework, other than a BEGIN clause.
>
> function helper()
> {
> }
>
> function main()
> {
> helper();
> }
>
> BEGIN { main(); }
>
> Awk turns off the pattern-action framework when there are no patterns
> and actions other than BEGIN.

I'll take your word that he did not know this. But how does this weaken
what he was saying? The "curly brace dialect" of AWK is hardly a better
AWK. It's AWK without the most convenient part (for most tasks).

--
Ben.

Re: The Art of Unix Programming - Case Study: awk

<20220201221825.754@kylheku.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1042&group=comp.lang.awk#1042

Newsgroups: comp.lang.awk
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: 480-992-1380@kylheku.com (Kaz Kylheku)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Wed, 2 Feb 2022 06:29:33 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 79
Message-ID: <20220201221825.754@kylheku.com>
References: <st6udg$k03$1@dont-email.me> <87mtjcdcu4.fsf@bsb.me.uk>
<st8c31$dc9$1@dont-email.me> <8735l3ddwz.fsf@bsb.me.uk>
<stblci$iso$1@dont-email.me> <875ypybkqn.fsf@bsb.me.uk>
<20220201141255.519@kylheku.com> <87czk69dj6.fsf@bsb.me.uk>
Injection-Date: Wed, 2 Feb 2022 06:29:33 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="1fe7e55e8dc8942cfb9f0ecc61e74891";
logging-data="16343"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18OG91SqK0IdbnzESvQ6ZhCl8ZGTrjCY2I="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:1YQ6CsUY6YU5JCi+RTMjYvhKg+g=
 by: Kaz Kylheku - Wed, 2 Feb 2022 06:29 UTC

On 2022-02-02, Ben Bacarisse <ben.usenet@bsb.me.uk> wrote:
> Kaz Kylheku <480-992-1380@kylheku.com> writes:
>
>> On 2022-02-01, Ben Bacarisse <ben.usenet@bsb.me.uk> wrote:
>>> To my mind, where ESR errs is in thinking of AWK as a scripting language
>>> at all. If you think of it as a tool for manipulating line-oriented
>>> text files to be used alongside Unix's other tools like grep, cut, sort,
>>> uniq then you probably won't mind the space it takes up in your head.
>>
>> Where ESR errs is believing that Awk is a language he actually knows.
>>
>> Otherwise he'd know that you can use the "curly brace dialect" without
>> the pattern-condition framework, other than a BEGIN clause.
>>
>> function helper()
>> {
>> }
>>
>> function main()
>> {
>> helper();
>> }
>>
>> BEGIN { main(); }
>>
>> Awk turns off the pattern-action framework when there are no patterns
>> and actions other than BEGIN.
>
> I'll take your word that he did not know this.

At that time he wrote incorrect statements, which he apparently hasn't
corrected in any errata.

ESR> Programs in awk consist of pattern/action pairs.

Contradicted by my example above: function definitions are not
pattern/action pairs.

ESR> Each pattern is a regular expression, [...]

Nope.

ESR> The action language is noncompact, but the pattern-driven
framework it sits inside keeps it from being generally applicable

Whatever.

> But how does this weaken
> what he was saying? The "curly brace dialect" of AWK is hardly a better
> AWK.

I think that by "action language" he means that stuff written between
the curly braces. It is supposedly trapped in this pattern/action DSL and
so cannot be used to just write normal programs.

> It's AWK without the most convenient part (for most tasks).

But it's not exactly/entirely without it, because something like

{ print $1 + $2 }

can be coded explicitly in the "action language":

function main()
{
    while (getline > 0) {
        print $1 + $2
    }
}

BEGIN { main() }

Call getline to do the input and field splitting, then
code your own loop around it, using if for the conditions.
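
The "using if for the conditions" part can be sketched as follows. This is an illustration, not code from the post: the function name sum_pairs, the comment-skipping rule and the input layout are all assumptions made up for the example. Each if inside the loop plays the role that a pattern would play in the pattern-action form.

```shell
# Sketch only: sum_pairs and the input layout are illustrative assumptions.
sum_pairs() {
    awk '
    function main(    total) {          # total is a local (extra parameter)
        # Plain getline reads the next record into $0 and splits the fields.
        # It returns 1 for a record, 0 at end of input, -1 on error.
        while ((getline) > 0) {
            if ($0 ~ /^#/) continue     # like the rule: /^#/ { next }
            if (NF >= 2) total += $1 + $2
        }
        print total + 0
    }

    BEGIN { main() }
    '
}

printf '# comment\n1 2\n3 4\nsolo\n' | sum_pairs    # prints 10
```

The comment line and the one-field line are skipped, so only "1 2" and "3 4" contribute to the sum.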

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Re: The Art of Unix Programming - Case Study: awk

<stgsh0$j01$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1045&group=comp.lang.awk#1045

Newsgroups: comp.lang.awk
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: jbrubake.362@orionarts.invalid (Jeremy Brubaker)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Thu, 3 Feb 2022 15:31:45 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 43
Message-ID: <stgsh0$j01$1@dont-email.me>
References: <st6udg$k03$1@dont-email.me> <87mtjcdcu4.fsf@bsb.me.uk>
<st8c31$dc9$1@dont-email.me> <8735l3ddwz.fsf@bsb.me.uk>
<stblci$iso$1@dont-email.me> <875ypybkqn.fsf@bsb.me.uk>
<20220201141255.519@kylheku.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 3 Feb 2022 15:31:45 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="5e09d67c13245496c073ba2cf4b75065";
logging-data="19457"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX182Sflr0vzESzHHoHiEKriqL7WQLpo2bQE="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:2JbJMm35UA+MI18/NMHt7lJxWDI=
 by: Jeremy Brubaker - Thu, 3 Feb 2022 15:31 UTC

On 2022-02-01, Kaz Kylheku wrote:
> On 2022-02-01, Ben Bacarisse <ben.usenet@bsb.me.uk> wrote:
>> To my mind, where ESR errs is in thinking of AWK as a scripting language
>> at all. If you think of it as a tool for manipulating line-oriented
>> text files to be used alongside Unix's other tools like grep, cut, sort,
>> uniq then you probably won't mind the space it takes up in your head.
>
> Where ESR errs is believing that Awk is a language he actually knows.
>
> Otherwise he'd know that you can use the "curly brace dialect" without
> the pattern-condition framework, other than a BEGIN clause.
>
> function helper()
> {
> }
>
> function main()
> {
> helper();
> }
>
> BEGIN { main(); }
>
> Awk turns off the pattern-action framework when there are no patterns
> and actions other than BEGIN.

I recently figured this out and it made me appreciate awk more. I found
it a good language for processing text files even beyond the standard
case where each line is the same record type.

By embracing awk's capabilities as a more robust language I could solve
problems with text file input fairly quickly by combining the standard
pattern-action method with the function oriented approach Kaz mentions
above.
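
As a rough sketch of that combination (an illustration, not code from this post): assume a hypothetical input format with SECTION lines and ITEM lines, so that not every line is the same record type. The patterns dispatch on the record type while ordinary functions hold the logic. All names here (summarize, start_section, add_item) are made up for the example.

```shell
# Sketch only: the SECTION/ITEM format and all names are assumptions.
summarize() {
    awk '
    function start_section(name) {
        section = name
        order[++n] = name
    }

    function add_item(value) {
        count[section]++
        total[section] += value
    }

    # Patterns pick the record type; functions do the work.
    $1 == "SECTION" { start_section($2); next }
    $1 == "ITEM"    { add_item($3); next }

    END {
        for (i = 1; i <= n; i++)
            printf "%s: count=%d total=%d\n", order[i], count[order[i]], total[order[i]]
    }
    '
}

summarize <<'EOF'
SECTION fruit
ITEM apple 3
ITEM pear 4
SECTION veg
ITEM leek 5
EOF
```

For this input the sketch reports "fruit: count=2 total=7" and "veg: count=1 total=5", one line per section in the order the sections appeared.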

Awk isn't good for everything, but I'm glad I have a better appreciation
of it.

--
() www.asciiribbon.org | Jeremy Brubaker
/\ - against html mail | јЬruЬаkе@оrіоnаrtѕ.іо / neonrex on IRC

I Know A Joke!!

Re: The Art of Unix Programming - Case Study: awk

<j65dqrF3odiU1@mid.individual.net>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1046&group=comp.lang.awk#1046

Newsgroups: comp.lang.awk
Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: o.schultz@enhydralutris.de (Olaf Schultz)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Fri, 4 Feb 2022 20:41:15 +0100
Lines: 33
Message-ID: <j65dqrF3odiU1@mid.individual.net>
References: <st6udg$k03$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Trace: individual.net bMXpGiiGezuTopKQHJ5EmQ+PfJQ9NnHmuz093j20vVrTfGnU9d
Cancel-Lock: sha1:bKwqevpS+IsErELhnSgBHaH4VTw=
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:97.0) Gecko/20100101
Thunderbird/97.0
Content-Language: en-US
In-Reply-To: <st6udg$k03$1@dont-email.me>
 by: Olaf Schultz - Fri, 4 Feb 2022 19:41 UTC

On 30.01.22 at 22:02, Janis Papanagnou wrote:
...

Just my 5 ct:
With programming languages I started with BASIC and switched in approx.
1989 to AutoLISP.

I have been using awk since 1996, at first for data conversion. Since then it
has been my coding language to 99.x percent (besides LaTeX ;-)
Now mainly for manipulation of FE pre- and post-processing (Nastran,
Abaqus...) on Unix/Linux. But also for data processing at home
(measurement, conversion...)

Other colleagues tend to use python or TCL (as the Calculation/FEM-Tools
are using this as interface-language).

The underestimated smoothness of awk is: Piping from the command line
for very little helpers... Oh, it gets too complex? Take the same
language to a larger piece of code in an editor.

In a few cases we ran benchmarks (execution speed, readability and
coding speed...); awk was not the slowest (perl, python).

So I try to encourage new colleagues to use awk as often as I can.

Olaf

PS: An awk a day keeps the python away;-)

PPS: A careful look at the code and files I receive at work from
unknown people shows I'm not the only awk user there. So that language is
clearly not dead, it's underestimated.

Re: The Art of Unix Programming - Case Study: awk

<stlrh2$gei$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1047&group=comp.lang.awk#1047

Newsgroups: comp.lang.awk
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: janis_papanagnou@hotmail.com (Janis Papanagnou)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Sat, 5 Feb 2022 13:45:21 +0100
Organization: A noiseless patient Spider
Lines: 14
Message-ID: <stlrh2$gei$1@dont-email.me>
References: <st6udg$k03$1@dont-email.me> <j65dqrF3odiU1@mid.individual.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 5 Feb 2022 12:45:22 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="f252f9e3bddb6955f3f10ebe1d1da080";
logging-data="16850"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+Kogn8x7cmRyRzq8ZYQOzi"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:Et48Hgsc2lGQ4W+7njj5npMrKmM=
In-Reply-To: <j65dqrF3odiU1@mid.individual.net>
X-Enigmail-Draft-Status: N1110
 by: Janis Papanagnou - Sat, 5 Feb 2022 12:45 UTC

On 04.02.2022 20:41, Olaf Schultz wrote:
>
> So I try to encourage new colleagues to use awk as often as I can.

And sometimes curious colleagues are asking. I recall having done
some data evaluations, a couple awk scripts (1-4 liners), and when
they saw how cute solutions with awk can be they immediately asked
for a presentation.[*]

Janis

[*] A similar reaction, BTW, I once had when doing editing with vi;
some Unix tools are just brilliant.

Re: The Art of Unix Programming - Case Study: awk

<88d1d61e-458f-4364-81b5-7301658ee500n@googlegroups.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1048&group=comp.lang.awk#1048

Newsgroups: comp.lang.awk
X-Received: by 2002:a05:620a:22d4:: with SMTP id o20mr4218110qki.90.1644366489192;
Tue, 08 Feb 2022 16:28:09 -0800 (PST)
X-Received: by 2002:a25:a569:: with SMTP id h96mr7135597ybi.142.1644366489016;
Tue, 08 Feb 2022 16:28:09 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.awk
Date: Tue, 8 Feb 2022 16:28:08 -0800 (PST)
In-Reply-To: <st6udg$k03$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2603:7000:3c3d:41c0:a19e:b71e:7819:f6b4;
posting-account=n74spgoAAAAZZyBGGjbj9G0N4Q659lEi
NNTP-Posting-Host: 2603:7000:3c3d:41c0:a19e:b71e:7819:f6b4
References: <st6udg$k03$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <88d1d61e-458f-4364-81b5-7301658ee500n@googlegroups.com>
Subject: Re: The Art of Unix Programming - Case Study: awk
From: jason.cy.kwan@gmail.com (Kpop 2GM)
Injection-Date: Wed, 09 Feb 2022 00:28:09 +0000
Content-Type: text/plain; charset="UTF-8"
 by: Kpop 2GM - Wed, 9 Feb 2022 00:28 UTC

all i can say is that eric raymond definitely exists in his own fantasy land where he equates unnecessary complexity with virtue, and treats bloat as a feature.

if he could tell me what to type in python3 to apply comma-formatting to a single 526,824,456-digit integer in less than 6.24 seconds, great i'll heed his advice and switch over.

until such time, i'll trust the raw power of awk above all.

Re: The Art of Unix Programming - Case Study: awk

<947ba8e0-80b6-458d-8caa-dac0764526bcn@googlegroups.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1049&group=comp.lang.awk#1049

Newsgroups: comp.lang.awk
X-Received: by 2002:a05:6214:4111:: with SMTP id kc17mr5119487qvb.61.1644367788758;
Tue, 08 Feb 2022 16:49:48 -0800 (PST)
X-Received: by 2002:a25:b908:: with SMTP id x8mr7009344ybj.561.1644367788607;
Tue, 08 Feb 2022 16:49:48 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.awk
Date: Tue, 8 Feb 2022 16:49:48 -0800 (PST)
In-Reply-To: <88d1d61e-458f-4364-81b5-7301658ee500n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2603:7000:3c3d:41c0:a19e:b71e:7819:f6b4;
posting-account=n74spgoAAAAZZyBGGjbj9G0N4Q659lEi
NNTP-Posting-Host: 2603:7000:3c3d:41c0:a19e:b71e:7819:f6b4
References: <st6udg$k03$1@dont-email.me> <88d1d61e-458f-4364-81b5-7301658ee500n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <947ba8e0-80b6-458d-8caa-dac0764526bcn@googlegroups.com>
Subject: Re: The Art of Unix Programming - Case Study: awk
From: jason.cy.kwan@gmail.com (Kpop 2GM)
Injection-Date: Wed, 09 Feb 2022 00:49:48 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: Kpop 2GM - Wed, 9 Feb 2022 00:49 UTC

i'm an ultra late-comer to awk - only discovering it in 2017-2018. and the moment i found it, i realized nearly all else - perl R python java C# - can be thrown straight into the toilet, if performance is a key criterion for the task at hand

unwashed masses still think awk is for benchmarking against perl or python. i skip those and directly benchmark my codes against compiled C-code binaries, which awk is very competitive against - rather remarkable for something largely, if not entirely, single threaded.

the analogy i would use is python being an artist with every brush and every color, pre-mixed, every paint type, and tasked to draw on a 3x2 canvas.

awk would be an artist with only 2 brushes, 1 type of paint, and only the 3 basic colors - even getting to orange green and purple would require manual mixing by the painter herself... and the only thing constraining her from fully expressing her talents in ceiling murals,

would be the height of the Sistine Chapel itself.

Re: The Art of Unix Programming - Case Study: awk

<8735ksqy1k.fsf@axel-reichert.de>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1050&group=comp.lang.awk#1050

Newsgroups: comp.lang.awk
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: mail@axel-reichert.de (Axel Reichert)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Wed, 09 Feb 2022 08:49:59 +0100
Organization: A noiseless patient Spider
Lines: 34
Message-ID: <8735ksqy1k.fsf@axel-reichert.de>
References: <st6udg$k03$1@dont-email.me>
<88d1d61e-458f-4364-81b5-7301658ee500n@googlegroups.com>
<947ba8e0-80b6-458d-8caa-dac0764526bcn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="e735537f540f1bd18a124c715bca005a";
logging-data="5038"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+p5AShpNLQCDdzbOtIdNc0hVgaisqe9Bs="
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)
Cancel-Lock: sha1:on7cuypWn2EhPQ9VvV8ZUp9MJJo=
sha1:wd5yirYjerV9ef2nmxgflUCBXJg=
 by: Axel Reichert - Wed, 9 Feb 2022 07:49 UTC

Kpop 2GM <jason.cy.kwan@gmail.com> writes:

> i'm an ultra late-comer to awk - only discovering it in 2017-2018. and
> the moment i found it, i realized nearly all else - perl R python java
> C# - can be thrown straight into the toilet, if performance is a key
> criterion for the task at hand

I would rather go for TCW (Total Cost of Wizardry): A competent Python
programmer once consulted me on performance tuning for an (ASCII data
mangling) script he had written (which took him about 30 min). It had been
running for 10 min, with no end in sight according to a monitor on the
(transformed) output. After he had explained the task at hand, I replied
that I would not use Python, but rather some Unix command line tools. I
started immediately, cobbled something together (awk featured
prominently among other usual suspects, such as tr, sed, cut, grep). It
delivered the desired results before his Python script was finished. So
the final tally was "10 min" versus "> 30 min + 10 min + 10 min".

Once the logic becomes more intricate, I will usually go for Python
though, so I will use awk mostly for command line use, rarely as a file
to be run by "awk -f".

I was also a late-comer to this tool. When I started to learn Perl in
the late 90s, I learned that it was a superset of sed and awk (coming
even with conversion scripts), and so I gave the older tools another try
(the "man" pages were completely incomprehensible to me before, I could
not wrap my head around stream processing). Once it clicked, I rarely
used Perl anymore.

Same goes for spreadsheet tools, for which I also seldom feel the need.

Best regards

Axel

Re: The Art of Unix Programming - Case Study: awk

<su0n16$od0$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1051&group=comp.lang.awk#1051

Newsgroups: comp.lang.awk
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: janis_papanagnou@hotmail.com (Janis Papanagnou)
Newsgroups: comp.lang.awk
Subject: Re: The Art of Unix Programming - Case Study: awk
Date: Wed, 9 Feb 2022 16:36:06 +0100
Organization: A noiseless patient Spider
Lines: 17
Message-ID: <su0n16$od0$1@dont-email.me>
References: <st6udg$k03$1@dont-email.me>
<88d1d61e-458f-4364-81b5-7301658ee500n@googlegroups.com>
<947ba8e0-80b6-458d-8caa-dac0764526bcn@googlegroups.com>
<8735ksqy1k.fsf@axel-reichert.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 9 Feb 2022 15:36:06 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="184a7a44e78b754863a5559082d4210b";
logging-data="24992"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+b+LxtqJu7Fqam4fWeZhuq"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:6WA0PyT0KSWSibSznLNdV61EVk0=
In-Reply-To: <8735ksqy1k.fsf@axel-reichert.de>
X-Enigmail-Draft-Status: N1110
 by: Janis Papanagnou - Wed, 9 Feb 2022 15:36 UTC

On 09.02.2022 08:49, Axel Reichert wrote:
[ about an ASCII data mangling Python script ]
> [....] I started immediately, cobbled something together (awk featured
> prominently among other usual suspects, such as tr, sed, cut, grep).

Hmm.. - these four tools are amongst those where I usually say: instead
of connecting and running a lot of such processes use just one instance
of awk. The functions expressed in those tools are - modulo a few edge
cases - basics in Awk and part of its core.

(And as an essential plus; you can keep state information in the awk
instance where managing state between the first and the last process in
a pipeline is cumbersome, to say the least, sometimes "impossible", and
usually inefficient. - But I am digressing.)
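
A small sketch of both points (an illustration only; the colon-separated log format and the names with_pipeline and with_awk are assumptions made up here): the first function chains grep, cut and tr, while the second does the same selection, field extraction and case conversion in one awk process and additionally keeps a running count, which the pipeline has no simple place to hold.

```shell
# Sketch only: the "LEVEL:component:message" format is an assumption.

# Three processes: select lines, pick field 2, upper-case it.
with_pipeline() {
    grep 'ERROR' | cut -d: -f2 | tr '[:lower:]' '[:upper:]'
}

# One awk process doing the same, plus state kept across records.
with_awk() {
    awk -F: '
    /ERROR/ { print toupper($2); hits++ }
    END     { printf "%d matching lines\n", hits }
    '
}

printf 'INFO:boot:ok\nERROR:disk:full\nERROR:net:down\n' | with_awk
```

For the sample input both variants emit DISK and NET; the awk variant then adds "2 matching lines". One edge case where they differ: cut prints a line unchanged when it contains no delimiter, whereas $2 in awk is simply empty.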

Janis

Re: The Art of Unix Programming - Case Study: awk

<4a69797d-ef6e-44c4-a5d5-0b61fee65aa9n@googlegroups.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1052&group=comp.lang.awk#1052

Newsgroups: comp.lang.awk
X-Received: by 2002:a05:6214:509a:: with SMTP id kk26mr2628593qvb.24.1644433013205;
Wed, 09 Feb 2022 10:56:53 -0800 (PST)
X-Received: by 2002:a25:d246:: with SMTP id j67mr3566269ybg.641.1644433012996;
Wed, 09 Feb 2022 10:56:52 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.awk
Date: Wed, 9 Feb 2022 10:56:52 -0800 (PST)
In-Reply-To: <8735ksqy1k.fsf@axel-reichert.de>
Injection-Info: google-groups.googlegroups.com; posting-host=2603:7000:3c3d:41c0:a19e:b71e:7819:f6b4;
posting-account=n74spgoAAAAZZyBGGjbj9G0N4Q659lEi
NNTP-Posting-Host: 2603:7000:3c3d:41c0:a19e:b71e:7819:f6b4
References: <st6udg$k03$1@dont-email.me> <88d1d61e-458f-4364-81b5-7301658ee500n@googlegroups.com>
<947ba8e0-80b6-458d-8caa-dac0764526bcn@googlegroups.com> <8735ksqy1k.fsf@axel-reichert.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <4a69797d-ef6e-44c4-a5d5-0b61fee65aa9n@googlegroups.com>
Subject: Re: The Art of Unix Programming - Case Study: awk
From: jason.cy.kwan@gmail.com (Kpop 2GM)
Injection-Date: Wed, 09 Feb 2022 18:56:53 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 100
 by: Kpop 2GM - Wed, 9 Feb 2022 18:56 UTC

> [...] (awk featured
> prominently among other usual suspects, such as tr, sed, cut, grep). It
> delivered the desired results before his Python script was finished. So
> the final tally was "10 min" versus "> 30 min + 10 min + 10 min".
>
> Once the logic becomes more intricate, I will usually go for Python
> though, so I will use awk mostly for command line use, rarely as a file
> to be run by "awk -f".

Funny you mentioned "the usual suspects". Here is a benchmark you can replicate: a bare-bones awk reimplementation of the Unix utility wc, timed against both GNU wc ("gwc") and BSD wc ("wc").

Obviously this isn't a full UTF-8 validator that deals with all the edge cases of non-compliant input. But assuming the input is already known to be valid UTF-8 text, awk can count the rows, UTF-8 characters, and bytes of a 1.85 GiB (1,983,544,693-byte) input file even with the locale set to LC_ALL=C, i.e. byte-level only and not UTF-8-aware.

Compared against 17.9 secs for BSD wc and 23.3 secs for GNU wc:

- gawk 5.1.1 posts a reasonably competitive time of 31.5 secs,
- mawk 1.3.4's time of 19.3 secs beats GNU wc and is only slightly slower than BSD wc, while
- mawk 1.9.9.6's impressive 12.7 secs leaves both in the dust: some 41% faster than BSD wc, and a whopping 83% faster than GNU wc. I wasn't kidding when I said I benchmark awk code against C binaries instead of against perl or python.

So an interpreted scripting language that can only use one CPU core comes in 41-83% faster than compiled C binaries (with mawk 1.9.9.6, at least). And it took me less than 10 minutes to write this.

* I couldn't set the same locale (LC_ALL=C) for the wc runs; otherwise they couldn't count UTF-8 characters properly.
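
[A sketch, not part of the original post, of how the 41% and 83% figures follow from the wall-clock totals in the timing lines of this post. The helper name pct_faster is made up. "Faster" is read here as a ratio of run times, i.e. relative throughput.]

```shell
# Percent by which the fast run beats the slow one in throughput terms:
# (slow_time / fast_time - 1) * 100, rounded to a whole percent.
pct_faster() {
    awk -v slow="$1" -v fast="$2" \
        'BEGIN { printf "%.0f%%\n", (slow / fast - 1) * 100 }'
}

# Wall-clock totals in seconds, copied from the timing lines of this post.
pct_faster 17.951 12.729    # BSD wc vs mawk 1.9.9.6, prints 41%
pct_faster 23.297 12.729    # GNU wc vs mawk 1.9.9.6, prints 83%
```

Read as a reduction in elapsed time instead, the same numbers come to roughly 29% and 45% less time.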
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

command ::

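echo " awk0 :: ${awk0}" ; ( time ( pv < "${f}" | LC_ALL=C "${awk0}" '
BEGIN { FS = "^$" }   # FS only matches an empty row, so rows are never split
{ bytes += length }   # bytes in the row, newline excluded
/[\200-\377]/ {       # only rows with non-ASCII bytes need the gsub
    # drop continuation bytes and bytes never valid in UTF-8; _ is unset, i.e. ""
    gsub(/[\200-\301\365-\377]+/, _)
}
{ chars += length }   # what is left is one byte per character
END {
    # \43 is "#", so "%#.f" prints a whole number with a trailing "."
    # adding NR counts the newline that ends each row
    printf("rows = %\43.f | "\
           "UTF8 chars = %\43.f | "\
           "bytes = %\43.f\n",
           NR, chars + NR, bytes + NR)
} ' ) ) | lgp3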

awk0 :: mawk

1.85GiB 0:00:19 [97.8MiB/s] [============================================>] 100%
( pv < "${f}" | LC_ALL=C "${awk0}" ; ) 18.75s user 1.31s system 103% cpu 19.344 total
rows = 12494275. | UTF8 chars = 1285316715. | bytes = 1983544693.

awk0 :: gawk

1.85GiB 0:00:31 [60.1MiB/s] [============================================>] 100%
( pv < "${f}" | LC_ALL=C "${awk0}" ; ) 31.02s user 0.94s system 101% cpu 31.474 total
rows = 12494275. | UTF8 chars = 1285316715. | bytes = 1983544693.

awk0 :: mawk2

1.85GiB 0:00:12 [ 148MiB/s] [============================================>] 100%
( pv < "${f}" | LC_ALL=C "${awk0}" ; ) 12.31s user 1.09s system 105% cpu 12.729 total
rows = 12494275. | UTF8 chars = 1285316715. | bytes = 1983544693.

in0: 1.85GiB 0:00:23 [81.3MiB/s] [81.3MiB/s] [=====================>] 100%
( pvE 0.1 in0 < "${f}" | gwc -lcm; ) 22.74s user 1.29s system 103% cpu 23.297 total
12494275 1285316715 1983544693

in0: 1.85GiB 0:00:17 [ 105MiB/s] [ 105MiB/s] [=====================>] 100%
( pvE 0.1 in0 < "${f}" | wc -lm; ) 17.18s user 1.96s system 106% cpu 17.951 total
12494275 1285316715
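
[A sketch, not part of the original post, for checking the counting trick on a tiny input. The function name and sample text are made up, and the pattern is simplified to continuation bytes only (0x80-0xBF). In valid UTF-8 every character has exactly one byte that is not a continuation byte, so deleting continuation bytes in a byte-level locale leaves one byte per character.]

```shell
# count_utf8: print "rows chars bytes" for valid UTF-8 text on stdin,
# using byte-level matching only (LC_ALL=C).
count_utf8() {
    LC_ALL=C awk '
        { bytes += length($0) }          # bytes in the row, newline excluded
        /[\200-\377]/ {
            gsub(/[\200-\277]+/, "")     # delete continuation bytes
        }
        { chars += length($0) }          # one byte left per character
        END { printf "%d %d %d\n", NR, chars + NR, bytes + NR }
    '
}

# Two rows; the accented letters are 2 bytes each and the last
# character of row 2 is 3 bytes (all written as octal escapes).
printf 'h\303\251llo\nw\303\266rld \342\230\203\n' | count_utf8
```

For this input the function prints 2 14 18: two rows, 14 characters and 18 bytes, newlines included.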

Re: The Art of Unix Programming - Case Study: awk

<87leyjn42c.fsf@bsb.me.uk>


https://www.rocksolidbbs.com/devel/article-flat.php?id=1053&group=comp.lang.awk#1053

 by: Ben Bacarisse - Wed, 9 Feb 2022 21:05 UTC

Janis Papanagnou <janis_papanagnou@hotmail.com> writes:

> On 09.02.2022 08:49, Axel Reichert wrote:
> [ about an ASCII data mangling Python script ]
>> [....] I started immediately, cobbled something together (awk featured
>> prominently among other usual suspects, such as tr, sed, cut, grep).
>
> Hmm.. - these four tools are amongst those where I usually say: instead
> of connecting and running a lot of such processes use just one instance
> of awk. The functions expressed in those tools are - modulo a few edge
> cases - basics in Awk and part of its core.

That sometimes works, but the trouble is that once you've used AWK's
pattern/action feature once, you can't do so again -- you are stuck
inside the action part. Just the other day I needed to split fields
within a field after finding the lines I wanted. This was, for me, an
obvious case for two processes:

awk -F: '/wanted/ { print $3 }' | awk -F, '...'

but I could have used grep and cut in place of the first AWK. Maybe I'm
just not good at remembering the details of all the key functions, but I
find I use AWK in pipelines quite a lot.

--
Ben.
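
[For comparison, a sketch that is not part of the original post: one way the two-stage pipeline above could be written as a single awk process, using split() on the selected field. The sample data and function name are hypothetical, and the elided second program ('...') is stood in for by printing each sub-field.]

```shell
# One awk process: select the wanted lines, then split field 3 on commas.
split_third() {
    awk -F: '
        /wanted/ {
            n = split($3, part, ",")    # part[1..n] hold the sub-fields of $3
            for (i = 1; i <= n; i++)
                print part[i]
        }
    '
}

printf 'wanted:x:a,b,c\nother:y:d,e\n' | split_third
```

Whether this reads better than two small awk programs in a pipeline is a matter of taste; the split() version trades the second pattern/action stage for an explicit loop.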

Re: The Art of Unix Programming - Case Study: awk

<87pmnvvh81.fsf@eder.anydns.info>


https://www.rocksolidbbs.com/devel/article-flat.php?id=1054&group=comp.lang.awk#1054

 by: Andreas Eder - Wed, 9 Feb 2022 21:54 UTC

On Mi 09 Feb 2022 at 16:36, Janis Papanagnou <janis_papanagnou@hotmail.com> wrote:

> On 09.02.2022 08:49, Axel Reichert wrote:
> [ about an ASCII data mangling Python script ]
>> [....] I started immediately, cobbled something together (awk featured
>> prominently among other usual suspects, such as tr, sed, cut, grep).
>
> Hmm.. - these four tools are amongst those where I usually say: instead
> of connecting and running a lot of such processes use just one instance
> of awk. The functions expressed in those tools are - modulo a few edge
> cases - basics in Awk and part of its core.

+1

'Andreas

Re: The Art of Unix Programming - Case Study: awk

<20220209140644.790@kylheku.com>


https://www.rocksolidbbs.com/devel/article-flat.php?id=1055&group=comp.lang.awk#1055

 by: Kaz Kylheku - Wed, 9 Feb 2022 22:11 UTC

On 2022-02-09, Kpop 2GM <jason.cy.kwan@gmail.com> wrote:
> - gawk 5.1.1 posts a reasonably competitive time of 31.5 secs,
> - mawk 1.3.4's time of 19.3 secs beats GNU wc and is only slightly slower than BSD wc, while
> - mawk 1.9.9.6's impressive 12.7 secs leaves both in the dust: some 41% faster than BSD wc, and a whopping 83% faster than GNU wc. I wasn't kidding when I said I benchmark awk code against C binaries instead of against perl or python.

Why would you need a UTF8 validator in languages that are
largely Unicode-ignorant?

It's a bit like a visual guitar tuner for the deaf.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
