Difficulty to use sensible line breaks in expressions

<ti6i38$1h02v$1@dont-email.me>

From: janis_papanagnou+ng@hotmail.com (Janis Papanagnou)
Newsgroups: comp.lang.awk
Subject: Difficulty to use sensible line breaks in expressions
Date: Wed, 12 Oct 2022 16:13:59 +0200
Organization: A noiseless patient Spider
Lines: 70
Message-ID: <ti6i38$1h02v$1@dont-email.me>
 by: Janis Papanagnou - Wed, 12 Oct 2022 14:13 UTC

About the difficulty of using sensible line breaks in expressions
without adding syntactically spurious escape characters.
(Note 1: The need for line breaks arises with longer expressions.)
(Note 2: Yes, we can add line-continuation/escape characters.)

 1
 2  function f (a,b) { }
 3
 4  {
 5      # okay
 6      if (f(a,b) < c + d) print a, b, c, d
 7
 8      # okay
 9      if (f(a,b) < c + d) print a, b,
10              c, d
11
12      # okay
13      if (f(a,
14              b) < c + d) print a, b, c, d
15
16      # error
17      if (f(a, b) <
18              c + d) print a, b, c, d
19
20      # error
21      if (f(a,b) < c +
22              d) print a, b, c, d
23
24      # error
25      if (f(a,b) < c + d
26              ) print a, b, c, d
27
28      # okay
29      if (f(a,b) < c &&
30              d) print a, b, c, d
31
32      # okay
33      if (f(a,b) < (c &&
34              d)) print a, b, c, d
35
36      # error
37      if (f(a,b) < (c +
38              d)) print a, b, c, d
39  }

awk: awk-breaks:18: if (f(a, b) <
awk: awk-breaks:18: ^ unexpected newline or end of string
awk: awk-breaks:18: c + d) print a, b, c, d
awk: awk-breaks:18: ^ syntax error
awk: awk-breaks:22: if (f(a,b) < c +
awk: awk-breaks:22: ^ unexpected newline or end of string
awk: awk-breaks:26: if (f(a,b) < c + d
awk: awk-breaks:26: ^ unexpected newline or end of string
awk: awk-breaks:38: if (f(a,b) < (c +
awk: awk-breaks:38: ^ unexpected newline or end of string
awk: awk-breaks:38: d)) print a, b, c, d
awk: awk-breaks:38: ^ syntax error
awk: awk-breaks:38: d)) print a, b, c, d
awk: awk-breaks:38: ^ syntax error
awk: awk-breaks:39: d)) print a, b, c, d
awk: awk-breaks:39: ^ unexpected newline or end of string

Is throwing (some/any of) these syntax errors mandated by POSIX? - If
not, Awk variants, I suppose, could decide to implement semantically
sensible [valid] interpretations and remove existing inconsistencies?
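
To see how a particular awk treats the failing case, a small probe like the following could be used. This is only a sketch: the function name probe_awk is made up for illustration, and the result depends entirely on which awk is installed.

```shell
# Hypothetical helper: report whether an awk accepts a line break
# directly after a binary "+" inside a parenthesized condition.
probe_awk() {
    # $1: awk command to test (defaults to "awk")
    if "${1:-awk}" 'BEGIN { if (1 +
        1 == 2) exit 0
        exit 1 }' 2>/dev/null
    then
        echo "accepts newline after +"
    else
        echo "rejects newline after +"
    fi
}

probe_awk          # probe the default awk
```

Passing another command name, e.g. probe_awk mawk, probes that implementation instead, provided it is installed.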

Janis

Re: Difficulty to use sensible line breaks in expressions

<20221012083803.803@kylheku.com>

From: 864-117-4973@kylheku.com (Kaz Kylheku)
Newsgroups: comp.lang.awk
Subject: Re: Difficulty to use sensible line breaks in expressions
Date: Wed, 12 Oct 2022 16:56:19 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 101
Message-ID: <20221012083803.803@kylheku.com>
References: <ti6i38$1h02v$1@dont-email.me>
 by: Kaz Kylheku - Wed, 12 Oct 2022 16:56 UTC

On 2022-10-12, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
> About the difficulty to use sensible line breaks in expressions,
> without adding syntactically spurious escape characters.
> (Note 1: The need for line breaks arise with longer expressions.)
> (Note 2: Yes, we can use/add line-continuation/escape characters.)
>
> [...]
>
>
> Is throwing (some/any of) these syntax errors mandated by POSIX? - If
> not, Awk variants, I suppose, could decide to implement semantically
> sensible [valid] interpretations and remove existing inconsistencies?

Newlines are significant in Awk, and appear as a token (the NEWLINE
token in the POSIX grammar).

Only some parts of the grammar accept a newline token; anywhere else
it causes a syntax error.
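
For contrast, here is a minimal sketch of breaks the grammar does accept: after &&, after ||, after a comma in the print list, and after the closing parenthesis of the if condition. The variable names and values are invented for the illustration.

```shell
out=$(awk 'BEGIN {
    a = 1; b = 2; c = 3
    # Newlines after && and || are accepted, as is the one after the
    # closing parenthesis of the condition and the one after a comma.
    if (a < b &&
        b < c ||
        c < a)
        print "ordered:", a,
              b, c
}')
echo "$out"    # prints: ordered: 1 2 3
```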

I think fixing that would require that, for instance, the phrase structure
for E + E admit zero or more newline tokens on either side of the +,
which are ignored.

Or else, we have the parser communicate with the lexer, so that the
lexer makes newlines disappear and reappear in a syntax-directed way.

I suspect that this wouldn't be upstreamed into gawk.

I have a fork of gawk called egawk (enhanced gnu awk) where this
approach could be tried.

At certain points in the parser, we call some function in the lexer
which says "eat newlines; do not feed me NEWLINE tokens", and at other
points we re-enable newlines.

The lexer could do it itself; for instance if a '(' token is processed,
it may be okay to enable newline-eating until the matching ')',
which just requires a counter. So then line breaks would be allowed
in anything parenthesized, without disturbing their syntactic role
as alternative semicolon terminators.
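
The counter idea can be tried without touching any awk implementation. The sketch below is a deliberately naive source preprocessor, not the lexer change itself: it joins physical lines while the parenthesis depth is positive. The name join_parens is hypothetical, and strings, regexp literals and comments containing parentheses are not handled.

```shell
# Hypothetical preprocessor: merge lines until every "(" opened so far
# has been closed, so the awk parser never sees those newlines.
join_parens() {
    awk '{
        line = $0
        # gsub returns the number of replacements, which is used here
        # purely as a way to count parentheses on the line.
        opens  = gsub(/\(/, "(", line)
        closes = gsub(/\)/, ")", line)
        depth += opens - closes
        buf = (buf == "" ? $0 : buf " " $0)
        if (depth <= 0) { print buf; buf = ""; depth = 0 }
    }
    END { if (buf != "") print buf }'
}

# Usage: the break after "+" is gone before awk parses the program.
joined=$(printf '%s\n' 'BEGIN {' 'if (1 +' '1 == 2) print "ok"' '}' | join_parens)
awk "$joined"    # prints: ok
```

Newlines outside parentheses pass through unchanged, so they keep their role as statement terminators.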

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Re: Difficulty to use sensible line breaks in expressions

<20221012120715.628@kylheku.com>

From: 864-117-4973@kylheku.com (Kaz Kylheku)
Newsgroups: comp.lang.awk
Subject: Re: Difficulty to use sensible line breaks in expressions
Date: Wed, 12 Oct 2022 19:15:30 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 74
Message-ID: <20221012120715.628@kylheku.com>
References: <ti6i38$1h02v$1@dont-email.me> <20221012083803.803@kylheku.com>
 by: Kaz Kylheku - Wed, 12 Oct 2022 19:15 UTC

On 2022-10-12, Kaz Kylheku <864-117-4973@kylheku.com> wrote:
> I have a fork of gawk called egawk (enhanced gnu awk) where this
> approach could be tried.

I got it working very easily, at the proof-of-concept stage,
without having validated it against test cases and such:

Patched:

~/gawk$ ./gawk 'BEGIN {
if (x +
x == 0) { print "blah" } }'
blah

Stock distro gawk:

~/gawk$ gawk 'BEGIN {
if (x +
x == 0) { print "blah" } }'
gawk: cmd. line:3: if (x +
gawk: cmd. line:3: ^ unexpected newline or end of string
gawk: cmd. line:3: x == 0) { print "blah" } }
gawk: cmd. line:3: ^ syntax error

Patched, in --posix mode:

~/gawk$ ./gawk --posix 'BEGIN {
if (x +
x == 0) { print "blah" } }'
gawk: cmd. line:3: if (x +
gawk: cmd. line:3: ^ unexpected newline or end of string
gawk: cmd. line:3: x == 0) { print "blah" } }
gawk: cmd. line:3: ^ syntax error

Patch:

~/gawk$ git diff awkgram.y
diff --git a/awkgram.y b/awkgram.y
index fc35100d..c24e35c5 100644
--- a/awkgram.y
+++ b/awkgram.y
@@ -3911,6 +3911,13 @@ yylex(void)

     case '\n':
         sourceline++;
+        /*
+         * If not in POSIX mode, allow free-form newline in bracketed
+         * and parenthesized expressions, by swallowing '\n' rather than
+         * turning it into a NEWLINE token.
+         */
+        if (! do_posix && in_parens)
+            goto retry;
         return lasttok = NEWLINE;

     case '#':    /* it's a comment */

Very easy; the lexer already counts parentheses, so no extra
bookkeeping is needed.

All of the above said and patched, note that you can use backslash
continuations, which are a bit ugly:

~/gawk$ gawk 'BEGIN {
if (x + \
x == 0) { print "blah" } }'
blah

So before trying to upstream this, you need a convincing argument why
standard-conforming backslash-newline continuations aren't good enough.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Re: Difficulty to use sensible line breaks in expressions

<ti7bai$1j23a$1@dont-email.me>

From: janis_papanagnou+ng@hotmail.com (Janis Papanagnou)
Newsgroups: comp.lang.awk
Subject: Re: Difficulty to use sensible line breaks in expressions
Date: Wed, 12 Oct 2022 23:24:34 +0200
Organization: A noiseless patient Spider
Lines: 43
Message-ID: <ti7bai$1j23a$1@dont-email.me>
References: <ti6i38$1h02v$1@dont-email.me> <20221012083803.803@kylheku.com>
<20221012120715.628@kylheku.com>
 by: Janis Papanagnou - Wed, 12 Oct 2022 21:24 UTC

On 12.10.2022 21:15, Kaz Kylheku wrote:
>
> So before trying to upstream this, you need a convincing argument why
> standard-conforming backslash-newline continuations aren't good enough.

I acknowledged line-continuation/escapes in my OP:
>> About the difficulty to use sensible line breaks in expressions,
>> without adding syntactically spurious escape characters.
....
>> (Note 2: Yes, we can use/add line-continuation/escape characters.)

It may be just me, but I consider line continuation a hack of the
last century, or even of the 1960s (cf. the '+' symbol in column 1 of
punch cards, where THAT continuation does NOT have the issue of invisible
whitespace characters after the '\' that we have had at least since the
UNIX epoch). In the Awk language, because of its design, we have to
keep certain things together on a line because the semantics would
otherwise change; e.g. pattern { action } cannot be split before the
brace. In other places (see the examples in my OP) it's syntactically and
semantically unnecessary. There are also inconsistencies (see the
examples again) in expressions (+ vs. &&, to name just one).
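
A small sketch of the pattern { action } case, where the newline really does change the meaning (the input values and variable names are invented for the illustration):

```shell
# One rule: the action runs only for records matching the pattern.
same_line=$(printf '%s\n' 1 2 3 | awk '$1 > 1 { print "big:", $1 }')

# Two rules: a bare pattern (default action: print the record) and a
# pattern-less action that runs for every record.
split_line=$(printf '%s\n' 1 2 3 | awk '$1 > 1
{ print "big:", $1 }')

printf '%s\n--\n%s\n' "$same_line" "$split_line"
```

The first program prints "big: 2" and "big: 3". The second prints "big: 1", "2", "big: 2", "3", "big: 3", because the newline ends the first rule before the brace.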

But as you pointed out in your first post, the syntax is in POSIX, so
at least in POSIX mode it should behave in a standard-conforming way.
(If the POSIX grammar is only "informational", the assessment may
change, though.)

In cases where fatal (syntax) errors are [unnecessarily] produced,
though, I think that a more graceful/accommodating behavior would
not only add to readability, safety, and consistency; it might also
increase the attractiveness for new users and the acceptance by users
(in case anyone is concerned about such considerations).

That's all. I don't think that anything will change here. And I will
continue to write lengthy lines in Awk (where its syntax requires it)
and hope not to need to look into them again some time later, or to
check (when tracking down a bug) whether every continuation has a NL
immediately after it. And in 10 years, when I will have forgotten my
post, I'll probably ask that question again.
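
The check for a NL immediately after each continuation can at least be automated. A minimal sketch, assuming a helper named check_continuations (a made-up name) that reports every line where a backslash is followed by spaces or tabs before the end of the line:

```shell
# Hypothetical helper: list lines whose trailing backslash is followed
# by invisible whitespace, and return non-zero if any were found.
check_continuations() {
    awk '/\\[ \t]+$/ { printf "%d: %s\n", NR, $0; bad = 1 }
         END { exit bad }' "$@"
}

# Usage: pass awk source files as arguments, or feed the source on stdin.
printf 'if (x + \\ \n    x == 0) print "blah"\n' | check_continuations ||
    echo "found a broken continuation"
```

It is a plain text scan, so a backslash followed by blanks at the end of a string or comment line is reported as well.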

Janis

PS: Thanks for your proof of concept and tests.
