Subject  Author
* Reverse Polish Notation Parser  Mike Sanders
+- Re: Reverse Polish Notation Parser  Mike Sanders
+* Re: Reverse Polish Notation Parser  Janis Papanagnou
|+* Re: Reverse Polish Notation Parser  Mike Sanders
||`* Re: Reverse Polish Notation Parser  Janis Papanagnou
|| `- Re: Reverse Polish Notation Parser  Mike Sanders
|`* Re: Reverse Polish Notation Parser  Ed Morton
| `* Re: Reverse Polish Notation Parser  Janis Papanagnou
|  +- Re: Reverse Polish Notation Parser  Janis Papanagnou
|  `* Re: Reverse Polish Notation Parser  Ed Morton
|   `- Re: Reverse Polish Notation Parser  Ed Morton
`* Re: Reverse Polish Notation Parser  Kaz Kylheku
 `- Re: Reverse Polish Notation Parser  Mike Sanders

Reverse Polish Notation Parser

<ui9ist$74vq$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1707&group=comp.lang.awk#1707

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: porkchop@invalid.foo (Mike Sanders)
Newsgroups: comp.lang.awk
Subject: Reverse Polish Notation Parser
Date: Mon, 6 Nov 2023 02:26:38 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 95
Sender: Mike Sanders <busybox@sdf.org>
Message-ID: <ui9ist$74vq$1@dont-email.me>
Injection-Date: Mon, 6 Nov 2023 02:26:38 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="bea9002c59825590baf04d6c11b58fc5";
logging-data="234490"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18CbEiU8OxHAXyWwCy5XSrt"
Keywords: Mike's Notes
User-Agent: tin/2.6.2-20221225 ("Pittyvaich") (NetBSD/9.3 (amd64))
Cancel-Lock: sha1:mDMTtxd9s5YPFALn4UoEA67nlCQ=
 by: Mike Sanders - Mon, 6 Nov 2023 02:26 UTC

Ping Janis: Question... Are you interested in rewriting this
as a Gawk only implementation? Would be great for switch/case
statements IMO. If so, I'll add your version to the file at
my website.

Folks, assuming Janis says yes, let's work off that version
& use it as a baseline.

At any rate, a work in progress...

# tags: rpn, calc, numbers, awk, code
#
# awk reverse polish notation parser
# Michael Sanders 2023
# https://busybox.neocities.org/notes/rpn.txt
#
# operands...
#
# large numbers? this really depends on your awk
# decimal fractions? yes, via a hopefully not too
# clever regex that expects decimal points to be
# surrounded by digits eg...
#
# valid 0.5, invalid .5
# valid 5.0, invalid 5.
#
# operators...
#
# + addition
# - subtraction
# * multiplication
# / division
# % modulus
# ^ exponentiation (see footnotes)
#
# *always* surround input with 'quotes' as some RPN
# operators can be misconstrued as meta-characters
# by your shell, example...
#
# echo '0.089 2.5 + 2 * 3 ^' | awk -f rpn.txt
#
# output precision via printf format specifier...
#
# %0.0f 1
# %0.2f 1.00 (default)
# %0.9f 1.000000000
#
# further reading...
#
# https://en.wikipedia.org/wiki/Reverse_Polish_notation

{ RPN($0) }

function RPN(input, x, y, z, stack) {
    split(input, stack, /[[:space:]]+/)
    z = 0
    for (x = 1; x in stack; ++x) {
        y = stack[x]
        if (y ~ /^[0-9]+(\.[0-9]+)?$/) stack[++z] = y
        else {
            if (z < 2) { print "error: insufficient operands"; return }
            if (y == "+") stack[z-1] += stack[z]
            else if (y == "-") stack[z-1] -= stack[z]
            else if (y == "*") stack[z-1] *= stack[z]
            else if (y == "^") stack[z-1] ^= stack[z] # see footnotes
            else if (y == "/") {
                if (stack[z] == 0) { print "error: division by zero"; return }
                stack[z-1] /= stack[z]
            } else if (y == "%") {
                if (stack[z] == 0) { print "error: modulo by zero"; return }
                stack[z-1] %= stack[z]
            } else { print "error: invalid operator " y; return }
            --z
        }
    }
    if (z != 1) { print "error: invalid RPN expression"; return }
    printf "%0.2f\n", stack[z]
}

# footnotes: exponentiation operators for differing awks...
#
# awk 'BEGIN {print 2 ^ 3}' should print 8 if using ^
# stack[z-1] ^= stack[z] works on some awks using ^
# stack[z-1] = stack[z-1] ^ stack[z] always works if using ^
#
# awk 'BEGIN {print 2 ** 3}' should print 8 if using **
# stack[z-1] **= stack[z] works on some awks using **
# stack[z-1] = stack[z-1] ** stack[z] always works if using **

# eof

--
:wq
Mike Sanders
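
A quick way to try the posted script without saving it to a file is to inline it in a shell function. The sketch below is only that: the wrapper name `rpn` is hypothetical, the awk body is a copy of the RPN() function above, and exponentiation uses the long form from the footnotes on the assumption that it is the most portable spelling.

```shell
# rpn: hypothetical wrapper name; evaluates one RPN expression per input line.
rpn() {
    awk '
    { RPN($0) }

    function RPN(input,    x, y, z, stack) {
        split(input, stack, /[[:space:]]+/)
        z = 0
        for (x = 1; x in stack; ++x) {
            y = stack[x]
            if (y ~ /^[0-9]+(\.[0-9]+)?$/) stack[++z] = y    # operand: push
            else {
                if (z < 2) { print "error: insufficient operands"; return }
                if      (y == "+") stack[z-1] += stack[z]
                else if (y == "-") stack[z-1] -= stack[z]
                else if (y == "*") stack[z-1] *= stack[z]
                else if (y == "^") stack[z-1] = stack[z-1] ^ stack[z]
                else if (y == "/") {
                    if (stack[z] == 0) { print "error: division by zero"; return }
                    stack[z-1] /= stack[z]
                } else if (y == "%") {
                    if (stack[z] == 0) { print "error: modulo by zero"; return }
                    stack[z-1] %= stack[z]
                } else { print "error: invalid operator " y; return }
                --z                      # two operands were folded into one result
            }
        }
        if (z != 1) { print "error: invalid RPN expression"; return }
        printf "%0.2f\n", stack[z]
    }'
}

echo '3 4 + 2 *' | rpn    # (3 + 4) * 2
echo '1 0 /'     | rpn    # division by zero is reported, not computed
echo '1 +'       | rpn    # too few operands for the operator
```

The three calls should print 14.00, "error: division by zero" and "error: insufficient operands" respectively. Note that the same array holds both the split tokens and the evaluation stack; this is safe because the stack index z never runs ahead of the token index x.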

Re: Reverse Polish Notation Parser

<ui9tfo$c5gv$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1712&group=comp.lang.awk#1712

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: porkchop@invalid.foo (Mike Sanders)
Newsgroups: comp.lang.awk
Subject: Re: Reverse Polish Notation Parser
Date: Mon, 6 Nov 2023 05:27:20 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 12
Sender: Mike Sanders <busybox@sdf.org>
Message-ID: <ui9tfo$c5gv$1@dont-email.me>
References: <ui9ist$74vq$1@dont-email.me>
Injection-Date: Mon, 6 Nov 2023 05:27:20 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="bea9002c59825590baf04d6c11b58fc5";
logging-data="398879"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/2qGXNOATx2c6oDw60W+By"
User-Agent: tin/2.6.2-20221225 ("Pittyvaich") (NetBSD/9.3 (amd64))
Cancel-Lock: sha1:/wEaFMA5ncgYD48940UbKp1SmaU=
 by: Mike Sanders - Mon, 6 Nov 2023 05:27 UTC

Mike Sanders <porkchop@invalid.foo> wrote:

> [...]

quick update I've been meaning to make...

https://busybox.neocities.org/notes/rpn.txt

--
:wq
Mike Sanders

Re: Reverse Polish Notation Parser

<uiahqs$epj9$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1715&group=comp.lang.awk#1715

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: janis_papanagnou+ng@hotmail.com (Janis Papanagnou)
Newsgroups: comp.lang.awk
Subject: Re: Reverse Polish Notation Parser
Date: Mon, 6 Nov 2023 12:14:35 +0100
Organization: A noiseless patient Spider
Lines: 95
Message-ID: <uiahqs$epj9$1@dont-email.me>
References: <ui9ist$74vq$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 6 Nov 2023 11:14:37 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="3c0c854f90746957498651c7ebacd4ac";
logging-data="484969"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX185+Sg3IgX+kkcKZ7mO2/3Q"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:9nPE3RzsPmKRP0sJSD6a1qPoZu8=
X-Enigmail-Draft-Status: N1110
In-Reply-To: <ui9ist$74vq$1@dont-email.me>
 by: Janis Papanagnou - Mon, 6 Nov 2023 11:14 UTC

On 06.11.2023 03:26, Mike Sanders wrote:
> Ping Janis: Question... Are you interested in rewriting this
> as a Gawk only implementation? Would be great for switch/case
> statements IMO. If so, I'll add your version to the file at
> my website.

This is not rocket science. If you have

if (some_variable = "Val1") Cmd1 ;
else if (some_variable = "Val2") Cmd2 ;
...
else Def ;

the switch statement (for simple string comparisons) looks like

switch (some_variable) {
case "Val1": Cmd1 ; break ;
case "Val2": Cmd2 ; break ;
...
default: Def ;
}

The point is that you have to provide and evaluate 'some_variable'
just once. The "disadvantage" is that you will need the 'break'
commands to not fall through to the next case (which is another
common feature of such statements but not necessary in your
case).
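
As an illustration of the mapping just described, here is one way the operator dispatch of RPN() might look with switch. This is a sketch, not a version anyone in the thread posted: the wrapper name `rpn_switch` and the separate `tokens` array are assumptions, and it needs GNU Awk, since switch is a gawk extension.

```shell
# rpn_switch: hypothetical name; requires gawk because of the switch statement.
rpn_switch() {
    gawk '
    { RPN($0) }

    function RPN(input,    n, i, tok, z, tokens, stack) {
        n = split(input, tokens, /[[:space:]]+/)
        z = 0
        for (i = 1; i <= n; i++) {
            tok = tokens[i]
            if (tok ~ /^[0-9]+(\.[0-9]+)?$/) { stack[++z] = tok; continue }
            if (z < 2) { print "error: insufficient operands"; return }
            switch (tok) {               # tok is evaluated once for all cases
            case "+": stack[z-1] += stack[z]; break
            case "-": stack[z-1] -= stack[z]; break
            case "*": stack[z-1] *= stack[z]; break
            case "^": stack[z-1] = stack[z-1] ^ stack[z]; break
            case "/":
                if (stack[z] == 0) { print "error: division by zero"; return }
                stack[z-1] /= stack[z]; break
            case "%":
                if (stack[z] == 0) { print "error: modulo by zero"; return }
                stack[z-1] %= stack[z]; break
            default:
                print "error: invalid operator " tok; return
            }
            --z
        }
        if (z != 1) { print "error: invalid RPN expression"; return }
        printf "%0.2f\n", stack[z]
    }'
}

# Only run the demo where gawk is actually installed.
if command -v gawk >/dev/null 2>&1; then
    echo '3 4 + 2 *' | rpn_switch
    echo '1 2 x'     | rpn_switch
fi
```

Every case ends in break or return, so nothing falls through. Whether the result is clearer than the if/else chain is a matter of taste; it does give up portability to other awks.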

You can apply 'switch' in function RPN() and errormessage(), but
whether it makes sense or not to introduce 'switch' and sacrifice
portability is on you to decide.

GNU Awk has (as opposed to some other programming languages
like C) the advantage that you can compare not only scalars;
you can also compare strings and patterns in GNU Awk's switch:

switch (some_variable) {
case 42: ...
case "string": ...
case /pattern/: ...
}

I don't see any GNU Awk specific features/constructs that would
suggest a [non-portable] transcription.

One point you may want to consider is the trim() function; the
two substitutions can be combined in one

sub (/^[[:space:]]+(.*)[[:space:]]+$/, "&", str)

(but test that in your Awk versions before using it; "&" is an
old feature but off the top of my head I'm not sure whether the
subexpression with parenthesis /...(...).../ is generally
supported in other Awks).

Janis

>
> Folks, assuming Janis says yes, lets work off that version
> & use it as a baseline.
>
> At any rate, a work in progress...
> [...]
>
> { RPN($0) }
>
> function RPN(input, x, y, z, stack) {
> split(input, stack, /[[:space:]]+/)
> z = 0
> for (x = 1; x in stack; ++x) {
> y = stack[x]
> if (y ~ /^[0-9]+(\.[0-9]+)?$/) stack[++z] = y
> else {
> if (z < 2) { print "error: insufficient operands"; return }
> if (y == "+") stack[z-1] += stack[z]
> else if (y == "-") stack[z-1] -= stack[z]
> else if (y == "*") stack[z-1] *= stack[z]
> else if (y == "^") stack[z-1] ^= stack[z] # see footnotes
> else if (y == "/") {
> if (stack[z] == 0) { print "error: division by zero"; return }
> stack[z-1] /= stack[z]
> } else if (y == "%") {
> if (stack[z] == 0) { print "error: modulo by zero"; return }
> stack[z-1] %= stack[z]
> } else { print "error: invalid operator " y; return }
> --z
> }
> }
> if (z != 1) { print "error: invalid RPN expression"; return }
> printf "%0.2f\n", stack[z]
> }
> [...]

Re: Reverse Polish Notation Parser

<uial6t$f7o5$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1717&group=comp.lang.awk#1717

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: porkchop@invalid.foo (Mike Sanders)
Newsgroups: comp.lang.awk
Subject: Re: Reverse Polish Notation Parser
Date: Mon, 6 Nov 2023 12:12:14 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 33
Sender: Mike Sanders <busybox@sdf.org>
Message-ID: <uial6t$f7o5$1@dont-email.me>
References: <ui9ist$74vq$1@dont-email.me> <uiahqs$epj9$1@dont-email.me>
Injection-Date: Mon, 6 Nov 2023 12:12:14 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="bea9002c59825590baf04d6c11b58fc5";
logging-data="499461"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18wvOqzF930WqGheXYOD3I+"
User-Agent: tin/2.6.2-20221225 ("Pittyvaich") (NetBSD/9.3 (amd64))
Cancel-Lock: sha1:lfOVWs96BC9w0kmyptm90EJ3V6M=
 by: Mike Sanders - Mon, 6 Nov 2023 12:12 UTC

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

> the switch statement (for simple string comparisons) looks like
>
> switch (some_variable) {
> case "Val1": Cmd1 ; break ;
> case "Val2": Cmd2 ; break ;
> ...
> default: Def ;
> }

Can multiple cases be 'or'ed (combined) as in $SHELL?

case q|Q) quit(); break;

> I don't see any GNU Awk specific features/constructs that would
> suggest a [non-portable] transcription.

Ok, so no.

> (but test that in your Awk versions before using it; "&" is an
> old feature but off the top of my head I'm not sure whether the
> subexpression with parenthesis /...(...).../ is generally
> supported in other Awks).

'&' I did not know this, must study it further.

Thanks. Off to work for me.

--
:wq
Mike Sanders

Re: Reverse Polish Notation Parser

<uid86i$10146$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1723&group=comp.lang.awk#1723

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: mortonspam@gmail.com (Ed Morton)
Newsgroups: comp.lang.awk
Subject: Re: Reverse Polish Notation Parser
Date: Tue, 7 Nov 2023 05:48:34 -0600
Organization: A noiseless patient Spider
Lines: 153
Message-ID: <uid86i$10146$1@dont-email.me>
References: <ui9ist$74vq$1@dont-email.me> <uiahqs$epj9$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 7 Nov 2023 11:48:34 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="5cbdf8628088b236a92509c82ff09111";
logging-data="1049734"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+AIQYtGAouu3vWt93sREx1"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:9z91JGExgYeVVo6DPkn6wGGwNMY=
In-Reply-To: <uiahqs$epj9$1@dont-email.me>
X-Antivirus: Avast (VPS 231106-8, 11/6/2023), Outbound message
Content-Language: en-US
X-Antivirus-Status: Clean
 by: Ed Morton - Tue, 7 Nov 2023 11:48 UTC

On 11/6/2023 5:14 AM, Janis Papanagnou wrote:
> On 06.11.2023 03:26, Mike Sanders wrote:
>> Ping Janis: Question... Are you interested in rewriting this
>> as a Gawk only implementation? Would be great for switch/case
>> statements IMO. If so, I'll add your version to the file at
>> my website.
>
> This is no rocket science. If you have
>
> if (some_variable = "Val1") Cmd1 ;
> else if (some_variable = "Val2") Cmd2 ;
> ...
> else Def ;
>
> the switch statement (for simple string comparisons) looks like
>
> switch (some_variable) {
> case "Val1": Cmd1 ; break ;
> case "Val2": Cmd2 ; break ;
> ...
> default: Def ;
> }
>
> The point is that you have to provide and evaluate 'some_variable'
> just once. The "disadvantage" is that you will need the 'break'
> commands to not fall through to the next case (which is another
> common feature of such statements but not necessary in your
> case).
>
> You can apply 'switch' in function RPN() and errormessage(), but
> whether it makes sense or not to introduce 'switch' and sacrifice
> portability is on you to decide.
>
> GNU Awk has (as opposed to some other programming languages
> like C) the advantage that you can compare not only scalars;
> you can also compare strings and patterns in GNU Awk's switch:
>
> switch (some_variable) {
> case 42: ...
> case "string": ...
> case /pattern/: ...
Nitpick - it's

case /regexp/:

rather than:

case /pattern/

The word "pattern" is ambiguous and misused all over awk documentation.
You can make some argument for it in:

pattern { action }

since that includes `BEGIN`, integers, etc. I'd argue that should be:

condition { action }

but in the "case" statement above what goes inside `/.../` is simply and
always a regexp.

> }
>
> I don't see any GNU Awk specific features/constructs that would
> suggest a [non-portable] transcription.
>
>
> One point you may want to consider is the trim() function; the
> two substitutions can be combined in one
>
> sub (/^[[:space:]]+(.*)[[:space:]]+$/, "&", str)
>
> (but test that in your Awk versions before using it; "&" is an
> old feature but off the top of my head I'm not sure whether the
> subexpression with parenthesis /...(...).../ is generally
> supported in other Awks).
You can write that, but it's not a capture group that can be
backreferenced from the replacement and if it was "&" wouldn't refer to
the string that matched ".*" anyway, it'd refer to the string that
matched the whole regexp.
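
The point about "&" is easy to check from the command line. The two one-liners below are a small sketch using literal spaces instead of [[:space:]] to keep them short; they rely only on standard sub() behaviour.

```shell
# "&" in the replacement stands for the entire matched text, so the
# parenthesised group changes nothing: the line is replaced by itself.
echo '  foo  ' | awk '{ sub(/^ +(.*) +$/, "&"); print "<" $0 ">" }'

# Wrapping "&" in brackets makes the whole-match behaviour visible.
echo 'foo' | awk '{ sub(/o+/, "[&]"); print }'
```

The first command prints the input untrimmed, `<  foo  >`, and the second prints `f[oo]`.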

You could use a capture group in GNU awk for gensub():

str = gensub (/^[[:space:]]+(.*)[[:space:]]+$/, "\\1", 1, str)

and in most awks you could do:

gsub (/^[[:space:]]+|[[:space:]]+$/, "", str)

but there are some out there that will fail to do both substitutions for
that case (tawk and nawk maybe?) and so you need:

sub(/^[[:space:]]+/, "", str)
sub(/[[:space:]]+$/, "", str)

for those but since this requires an awk that supports POSIX character
classes anyway you could just declare the script as portable to all
POSIX awks (rather than all "modern" awks) and not worry about ones that
fail given the above gsub().

Alternatively in any POSIX awk you could do:

match(str,/[^[:space:]](.*[^[:space:]])?/)
str = substr(str,RSTART,RLENGTH)

and I expect that'd also work in any awk that supports POSIX character
classes but does not support the above gsub().
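
For comparison, the two portable variants above can be wrapped as awk functions and run side by side. This is a sketch under the assumption that the awk in use understands POSIX character classes; the names `trim_demo`, `trim_sub` and `trim_match` are made up for the example, and the explicit check of match()'s return value is an addition for readability.

```shell
# trim_demo: hypothetical wrapper; prints each input line trimmed both ways.
trim_demo() {
    awk '
    # Two anchored sub() calls: the most conservative form.
    function trim_sub(s) {
        sub(/^[[:space:]]+/, "", s)
        sub(/[[:space:]]+$/, "", s)
        return s
    }
    # match() + substr(): keep the span from the first to the last
    # non-space character; an all-blank string yields "".
    function trim_match(s) {
        if (match(s, /[^[:space:]](.*[^[:space:]])?/))
            return substr(s, RSTART, RLENGTH)
        return ""
    }
    { print "<" trim_sub($0) "> <" trim_match($0) ">" }'
}

printf '  a b \t\n' | trim_demo
```

For that input both functions should agree and the line printed is `<a b> <a b>`. Because s is passed by value, neither function modifies the caller's variable.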

Regards,

Ed.

>
> Janis
>
>>
>> Folks, assuming Janis says yes, lets work off that version
>> & use it as a baseline.
>>
>> At any rate, a work in progress...
>> [...]
>>
>> { RPN($0) }
>>
>> function RPN(input, x, y, z, stack) {
>> split(input, stack, /[[:space:]]+/)
>> z = 0
>> for (x = 1; x in stack; ++x) {
>> y = stack[x]
>> if (y ~ /^[0-9]+(\.[0-9]+)?$/) stack[++z] = y
>> else {
>> if (z < 2) { print "error: insufficient operands"; return }
>> if (y == "+") stack[z-1] += stack[z]
>> else if (y == "-") stack[z-1] -= stack[z]
>> else if (y == "*") stack[z-1] *= stack[z]
>> else if (y == "^") stack[z-1] ^= stack[z] # see footnotes
>> else if (y == "/") {
>> if (stack[z] == 0) { print "error: division by zero"; return }
>> stack[z-1] /= stack[z]
>> } else if (y == "%") {
>> if (stack[z] == 0) { print "error: modulo by zero"; return }
>> stack[z-1] %= stack[z]
>> } else { print "error: invalid operator " y; return }
>> --z
>> }
>> }
>> if (z != 1) { print "error: invalid RPN expression"; return }
>> printf "%0.2f\n", stack[z]
>> }
>> [...]
>
>

Re: Reverse Polish Notation Parser

<uidb0k$10f8a$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1724&group=comp.lang.awk#1724

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: janis_papanagnou+ng@hotmail.com (Janis Papanagnou)
Newsgroups: comp.lang.awk
Subject: Re: Reverse Polish Notation Parser
Date: Tue, 7 Nov 2023 13:36:35 +0100
Organization: A noiseless patient Spider
Lines: 33
Message-ID: <uidb0k$10f8a$1@dont-email.me>
References: <ui9ist$74vq$1@dont-email.me> <uiahqs$epj9$1@dont-email.me>
<uial6t$f7o5$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 7 Nov 2023 12:36:36 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="e45243d42addc5c2b31baa962c3de949";
logging-data="1064202"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18OeN5JJO52Do5nOI6nMiA/"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:Waw9HECGrwDA9YT7jl2DyzKBKTk=
X-Enigmail-Draft-Status: N1110
In-Reply-To: <uial6t$f7o5$1@dont-email.me>
 by: Janis Papanagnou - Tue, 7 Nov 2023 12:36 UTC

On 06.11.2023 13:12, Mike Sanders wrote:
> Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
>
>> the switch statement (for simple string comparisons) looks like
>>
>> switch (some_variable) {
>> case "Val1": Cmd1 ; break ;
>> case "Val2": Cmd2 ; break ;
>> ...
>> default: Def ;
>> }
>
> Can multiple cases be 'or'ed (combined) as in $SHELL?

What's always possible is to use the same bulky way as other C based
languages:

case "q": case "Q":

in case of comparing patterns I'd try

case /q|Q/:

(adding anchors ^ and $ as necessary). For details see [*].
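
Both ways of combining cases can be tried with a few lines of gawk. The following is a sketch only: the function name `classify_key` and the quit/help commands are invented for the example, and it requires GNU Awk.

```shell
# classify_key: hypothetical name; maps one-line "key presses" to actions.
classify_key() {
    gawk '{
        switch ($0) {
        case "q":                 # empty body: falls through to the next case
        case "Q":
            print "quit"; break
        case /^[hH?]$/:           # anchored regexp: exactly one of h, H or ?
            print "help"; break
        default:
            print "other"
        }
    }'
}

# Only run the demo where gawk is actually installed.
if command -v gawk >/dev/null 2>&1; then
    printf 'q\nQ\nh\nquiet\n' | classify_key
fi
```

With gawk available this should print quit, quit, help and other, one per line. The last input shows why the anchors matter: without ^ and $, a regexp such as /q|Q/ would also match "quiet".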

>
> case q|Q) quit(); break;
[...]

Janis

[*] https://www.gnu.org/software/gawk/manual/gawk.html#Switch-Statement

Re: Reverse Polish Notation Parser

<uidch4$10mh8$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1725&group=comp.lang.awk#1725

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: janis_papanagnou+ng@hotmail.com (Janis Papanagnou)
Newsgroups: comp.lang.awk
Subject: Re: Reverse Polish Notation Parser
Date: Tue, 7 Nov 2023 14:02:27 +0100
Organization: A noiseless patient Spider
Lines: 95
Message-ID: <uidch4$10mh8$1@dont-email.me>
References: <ui9ist$74vq$1@dont-email.me> <uiahqs$epj9$1@dont-email.me>
<uid86i$10146$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 7 Nov 2023 13:02:28 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="d631079bd7ce3fe7d8e32b894bc434c1";
logging-data="1071656"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18+adp+ciPZN7WLOytuTZB2"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:kZpqs9YcEsh8w6o7DXTxS7hPjjM=
In-Reply-To: <uid86i$10146$1@dont-email.me>
X-Enigmail-Draft-Status: N1110
 by: Janis Papanagnou - Tue, 7 Nov 2023 13:02 UTC

On 07.11.2023 12:48, Ed Morton wrote:
> On 11/6/2023 5:14 AM, Janis Papanagnou wrote:
>>
>> switch (some_variable) {
>> case 42: ...
>> case "string": ...
>> case /pattern/: ...

> Nitpick - it's

And I expected the nitpick on my typo using '=' (instead of '==') in
the 'if' comparison. ;-)

>
> case /regexp/:
>
> rather than:
>
> case /pattern/
>
> The word "pattern" is ambiguous and misused all over awk documentation.

You are right that the historic (and, sadly, surviving) documentation
speaks about 'pattern' and '/regexp/'.

> You can make some argument for it in:
>
> pattern { action }
>
> since that includes `BEGIN`, integers, etc. I'd argue that should be:
>
> condition { action }

Decades ago I was the first one suggesting the term 'condition' here so
you certainly don't need to teach me. (I have been using that term in
my Awk courses ever since the 1990s.)

>
> but in the "case" statement above what goes inside `/.../` is simply and
> always a regexp.

Yes. Sorry for my sloppiness here. Thanks for the nitpick.

>
> [...]
>>
>>
>> One point you may want to consider is the trim() function; the
>> two substitutions can be combined in one
>>
>> sub (/^[[:space:]]+(.*)[[:space:]]+$/, "&", str)
>>
>> (but test that in your Awk versions before using it; "&" is an
>> old feature but off the top of my head I'm not sure whether the
>> subexpression with parenthesis /...(...).../ is generally
>> supported in other Awks).

> You can write that, but it's not a capture group that can be
> backreferenced from the replacement and if it was "&" wouldn't refer to
> the string that matched ".*" anyway, it'd refer to the string that
> matched the whole regexp.

Using my Awk it does what is advertised. I merely didn't find it clearly
documented whether the '&' is generally guaranteed to refer to the
grouping.

>
> You could use a capture group in GNU awk for gensub():

I deliberately abstained from gensub() here since the OP avoids GNU
Awk specifics.

>
> str = gensub (/^[[:space:]]+(.*)[[:space:]]+$/, "\\1", 1, str)
>
> and in most awks you could do:
>
> gsub (/^[[:space:]]+|[[:space:]]+$/, "", str)

Yes, this is a sensible alternative.

>
> but there are some out there that will fail to do both substitutions for
> that case (tawk and nawk maybe?) and so you need:

Really? But why? - Alternation in regexps is certainly an old feature
(at least since nawk in the 1980s).

>
> sub(/^[[:space:]]+/, "", str)
> sub(/[[:space:]]+$/, "", str)
> [...]

Janis

Re: Reverse Polish Notation Parser

<uidej4$111gq$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1728&group=comp.lang.awk#1728

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: janis_papanagnou+ng@hotmail.com (Janis Papanagnou)
Newsgroups: comp.lang.awk
Subject: Re: Reverse Polish Notation Parser
Date: Tue, 7 Nov 2023 14:37:40 +0100
Organization: A noiseless patient Spider
Lines: 27
Message-ID: <uidej4$111gq$1@dont-email.me>
References: <ui9ist$74vq$1@dont-email.me> <uiahqs$epj9$1@dont-email.me>
<uid86i$10146$1@dont-email.me> <uidch4$10mh8$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 7 Nov 2023 13:37:41 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="d631079bd7ce3fe7d8e32b894bc434c1";
logging-data="1082906"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX191culWSJPkh59P8i/sYnPW"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:RJej3S94QdJgZZZPoNbak3+VrS0=
In-Reply-To: <uidch4$10mh8$1@dont-email.me>
X-Enigmail-Draft-Status: N1110
 by: Janis Papanagnou - Tue, 7 Nov 2023 13:37 UTC

On 07.11.2023 14:02, Janis Papanagnou wrote:
>>>
>>> One point you may want to consider is the trim() function; the
>>> two substitutions can be combined in one
>>>
>>> sub (/^[[:space:]]+(.*)[[:space:]]+$/, "&", str)
>>>
>>> (but test that in your Awk versions before using it; "&" is an
>>> old feature but off the top of my head I'm not sure whether the
>>> subexpression with parenthesis /...(...).../ is generally
>>> supported in other Awks).
>
>> You can write that, but it's not a capture group that can be
>> backreferenced from the replacement and if it was "&" wouldn't refer to
>> the string that matched ".*" anyway, it'd refer to the string that
>> matched the whole regexp.
>
> Using my Awk it does what advertised. I merely didn't find it clearly
> documented whether the '&' is generally guaranteed to refer to the
> grouping.

I stand corrected, I had a wrong test case! - "&" does _not_ work
on grouping, and as you said (and as it is specified), it generally
defines the whole match. (It wouldn't make sense otherwise.) - Thanks!

Janis

Re: Reverse Polish Notation Parser

<20231107091941.131@kylheku.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1729&group=comp.lang.awk#1729

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: 864-117-4973@kylheku.com (Kaz Kylheku)
Newsgroups: comp.lang.awk
Subject: Re: Reverse Polish Notation Parser
Date: Tue, 7 Nov 2023 17:24:38 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 25
Message-ID: <20231107091941.131@kylheku.com>
References: <ui9ist$74vq$1@dont-email.me>
Injection-Date: Tue, 7 Nov 2023 17:24:38 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="ae97bfd932834673b179a519cc953896";
logging-data="1170736"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19Zbsea7AadfcO1qh5cheKRCkgsK+F/LJ4="
User-Agent: slrn/pre1.0.4-9 (Linux)
Cancel-Lock: sha1:IAzXvorZdKXXqH1pIm0zx4pij4s=
 by: Kaz Kylheku - Tue, 7 Nov 2023 17:24 UTC

On 2023-11-06, Mike Sanders <porkchop@invalid.foo> wrote:
> Ping Janis: Question... Are you interested in rewriting this
> as a Gawk only implementation? Would be great for switch/case
> statements IMO. If so, I'll add your version to the file at
> my website.

The cppawk preprocessor supports a case macro which
compiles to the switch statement for Gawk, or to portable
Awk code.

The macro is documented in its own man page:

https://www.kylheku.com/cgit/cppawk/tree/cppawk-case.1

case is safer than switch because it doesn't have implicit
"fallthrough".

Each case must end with one of: cbreak, cfall or cret: break,
fallthrough or return.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.

Re: Reverse Polish Notation Parser

<uie0m3$14n3d$3@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1732&group=comp.lang.awk#1732

Path: i2pn2.org!i2pn.org!news.hispagatos.org!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: porkchop@invalid.foo (Mike Sanders)
Newsgroups: comp.lang.awk
Subject: Re: Reverse Polish Notation Parser
Date: Tue, 7 Nov 2023 18:46:27 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 10
Sender: Mike Sanders <busybox@sdf.org>
Message-ID: <uie0m3$14n3d$3@dont-email.me>
References: <ui9ist$74vq$1@dont-email.me> <uiahqs$epj9$1@dont-email.me> <uial6t$f7o5$1@dont-email.me> <uidb0k$10f8a$1@dont-email.me>
Injection-Date: Tue, 7 Nov 2023 18:46:27 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="71a8db3047af07a88a22f1215a4f4163";
logging-data="1203309"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18xIgyRo6D0AUbk3cCk6nnI"
User-Agent: tin/2.6.2-20221225 ("Pittyvaich") (NetBSD/9.3 (amd64))
Cancel-Lock: sha1:aJqaJrtr7Kc1XiDZAk6fkY7nd8I=
 by: Mike Sanders - Tue, 7 Nov 2023 18:46 UTC

Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

> [*] https://www.gnu.org/software/gawk/manual/gawk.html#Switch-Statement

Will read/study (as always thanks Janis!)

--
:wq
Mike Sanders

Re: Reverse Polish Notation Parser

<uie0ub$14n3d$4@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1733&group=comp.lang.awk#1733

Path: i2pn2.org!i2pn.org!news.hispagatos.org!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: porkchop@invalid.foo (Mike Sanders)
Newsgroups: comp.lang.awk
Subject: Re: Reverse Polish Notation Parser
Date: Tue, 7 Nov 2023 18:50:51 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 28
Sender: Mike Sanders <busybox@sdf.org>
Message-ID: <uie0ub$14n3d$4@dont-email.me>
References: <ui9ist$74vq$1@dont-email.me> <20231107091941.131@kylheku.com>
Injection-Date: Tue, 7 Nov 2023 18:50:51 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="71a8db3047af07a88a22f1215a4f4163";
logging-data="1203309"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/V74FBevjZsKUefspXQzss"
User-Agent: tin/2.6.2-20221225 ("Pittyvaich") (NetBSD/9.3 (amd64))
Cancel-Lock: sha1:2yW58QdP4aToZTSUgqHknIKBb8A=
 by: Mike Sanders - Tue, 7 Nov 2023 18:50 UTC

Kaz Kylheku <864-117-4973@kylheku.com> wrote:

> The cppawk preprocessor supports a case macro which
> compiles to the switch statement for Gawk, or to portable
> Awk code.
>
> The macro is documented in its own man page:
>
> https://www.kylheku.com/cgit/cppawk/tree/cppawk-case.1
>
> case is safer than switch because it doesn't have implicit
> "fallthrough".
>
> Each case must end with one of: cbreak, cfall or cret: break,
> fallthrough or return.

Hey-hey Kaz.

That's really nifty in fact. I might try my hand at a cppawk
project just to familiarize myself with its workings.

Thanks & mucho appreciate the heads-up kind sir, must study
more about it all.

--
:wq
Mike Sanders

Re: Reverse Polish Notation Parser

<uifnh7$1i709$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1734&group=comp.lang.awk#1734

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: mortonspam@gmail.com (Ed Morton)
Newsgroups: comp.lang.awk
Subject: Re: Reverse Polish Notation Parser
Date: Wed, 8 Nov 2023 04:22:28 -0600
Organization: A noiseless patient Spider
Lines: 31
Message-ID: <uifnh7$1i709$1@dont-email.me>
References: <ui9ist$74vq$1@dont-email.me> <uiahqs$epj9$1@dont-email.me>
<uid86i$10146$1@dont-email.me> <uidch4$10mh8$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 8 Nov 2023 10:22:32 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="cdc024731fbfc486e757d0c640edeaa3";
logging-data="1645577"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+sXsXZMBe82LTSPZxgf9sb"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:Gu6U6inTg0wWv4UckbLzdruSqEk=
X-Antivirus-Status: Clean
Content-Language: en-US
X-Antivirus: Avast (VPS 231108-0, 11/7/2023), Outbound message
In-Reply-To: <uidch4$10mh8$1@dont-email.me>
 by: Ed Morton - Wed, 8 Nov 2023 10:22 UTC

On 11/7/2023 7:02 AM, Janis Papanagnou wrote:
> On 07.11.2023 12:48, Ed Morton wrote:
<snip>
>> and in most awks you could do:
>>
>> gsub (/^[[:space:]]+|[[:space:]]+$/, "", str)
>
> Yes, this is a sensible alternative.
>
>>
>> but there are some out there that will fail to do both substitutions for
>> that case (tawk and nawk maybe?) and so you need:
>
> Really? But why? - Alternation in regexps is certainly an old feature
> (at least since nawk in the 1980s).

In my opinion it's just a bug. It was demonstrated to me when I posted
an answer on Stack Overflow several years ago that I can't find right
now. I know it's not gawk and I'm about 99% sure it's neither BSD awk
nor /usr/xpg[46]/bin/awk so the only non-oawk awks I can imagine would
have this problem are nawk, tawk (I'm about 80% sure I remember tawk is
one that DOES have the problem), and/or busybox awk, none of which I
have access to, so if anyone has and could test them by running:

$ echo ' foo ' | awk '{gsub(/^ +| +$/,""); print "<" $0 ">"}'
<foo>

and let us know which don't produce that output, that'd be great.
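
For anyone with several awks installed, here is a minimal sketch of how the same test could be run against each of them in one go. The function name trim_test and the list of candidate binaries are assumptions for illustration, not something from this thread; adjust the list to whatever is on your system.

```shell
# Run the trim test against every awk variant found on PATH.
# The candidate list below is an assumption; edit it to match your system.
trim_test() {
    for awk_bin in awk gawk mawk nawk original-awk "busybox awk"; do
        # The first word is the executable; skip candidates that are not installed.
        command -v "${awk_bin%% *}" >/dev/null 2>&1 || continue
        # $awk_bin is unquoted on purpose so "busybox awk" splits into two words.
        out=$(echo ' foo ' | $awk_bin '{gsub(/^ +| +$/,""); print "<" $0 ">"}')
        if [ "$out" = "<foo>" ]; then
            echo "$awk_bin: ok"
        else
            echo "$awk_bin: FAIL, got $out"
        fi
    done
}
trim_test
```

Each installed awk gets one line of output, either "ok" or "FAIL" with whatever it printed instead of <foo>.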

Ed.

Re: Reverse Polish Notation Parser

<uifs4e$1iumm$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1735&group=comp.lang.awk#1735

Path: i2pn2.org!i2pn.org!news.niel.me!news.gegeweb.eu!gegeweb.org!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: mortonspam@gmail.com (Ed Morton)
Newsgroups: comp.lang.awk
Subject: Re: Reverse Polish Notation Parser
Date: Wed, 8 Nov 2023 05:41:01 -0600
Organization: A noiseless patient Spider
Lines: 48
Message-ID: <uifs4e$1iumm$1@dont-email.me>
References: <ui9ist$74vq$1@dont-email.me> <uiahqs$epj9$1@dont-email.me>
<uid86i$10146$1@dont-email.me> <uidch4$10mh8$1@dont-email.me>
<uifnh7$1i709$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 8 Nov 2023 11:41:02 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="cdc024731fbfc486e757d0c640edeaa3";
logging-data="1669846"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19TDL9JuYeuZiIH5FcxxeSI"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:PzUd/VvHcLSKJcWQbTVznR1m1no=
X-Antivirus-Status: Clean
In-Reply-To: <uifnh7$1i709$1@dont-email.me>
X-Antivirus: Avast (VPS 231108-0, 11/7/2023), Outbound message
Content-Language: en-US
 by: Ed Morton - Wed, 8 Nov 2023 11:41 UTC

On 11/8/2023 4:22 AM, Ed Morton wrote:
> On 11/7/2023 7:02 AM, Janis Papanagnou wrote:
>> On 07.11.2023 12:48, Ed Morton wrote:
> <snip>
>>> and in most awks you could do:
>>>
>>>     gsub (/^[[:space:]]+|[[:space:]]+$/, "", str)
>>
>> Yes, this is a sensible alternative.
>>
>>>
>>> but there are some out there that will fail to do both substitutions for
>>> that case (tawk and nawk maybe?) and so you need:
>>
>> Really? But why? - Alternation in regexps is certainly an old feature
>> (at least since nawk in the 1980s).
>
> In my opinion it's just a bug. It was demonstrated to me when I posted
> an answer on Stack Overflow several years ago that I can't find right
> now. I know it's not gawk and I'm about 99% sure it's neither BSD awk
> nor /usr/xpg[46]/bin/awk so the only non-oawk awks I can imagine would
> have this problem are nawk, tawk (I'm about 80% sure I remember tawk is
> one that DOES have the problem), and/or busybox awk, none of which I
> have access to, so if anyone has and could test them by running:
>
> $ echo ' foo ' | awk '{gsub(/^ +| +$/,""); print "<" $0 ">"}'
> <foo>
>
> and let us know which don't produce that output, that'd be great.
>
>     Ed.
>

I found a different answer I had posted (also years ago) that mentions
this bug and there I say it's tawk and mawk1 where it occurs. Still
can't find the original where I was told about it (and it was in a
discussion in comments that's probably been removed by now) but that
should be good enough for others to reproduce it.

The specific case was removing quotes from around a field in a CSV:

$ printf '"foo"' | awk '{gsub(/^"+|"+$/,""); print "<" $0 ">"}'
<foo>

but I doubt if that detail matters.
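
For awks that do trip over the anchored alternation, one common way around it is to avoid the alternation entirely and strip each end with its own anchored sub(). This is only a sketch under that assumption: the workaround originally suggested is snipped from the quotes above, and trim() is a hypothetical helper name. It uses [ \t] rather than [[:space:]] in case the awk in question also lacks POSIX character classes.

```shell
# Portable trim: one anchored sub() per end instead of a single gsub()
# with alternation. trim() is a hypothetical helper name.
result=$(echo '  foo bar  ' | awk '
function trim(s) {
    sub(/^[ \t]+/, "", s)   # strip leading blanks and tabs
    sub(/[ \t]+$/, "", s)   # strip trailing blanks and tabs
    return s
}
{ print "<" trim($0) ">" }')
echo "$result"
```

Since each regexp is anchored, sub() is enough; there can be at most one match per end, so gsub() buys nothing here.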

Regards,

Ed.
