
gawk regexp question

<tm9qmi$nuf4$3@news.xmission.com>

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!xmission!nnrp.xmission!.POSTED.shell.xmission.com!not-for-mail
From: gazelle@shell.xmission.com (Kenny McCormack)
Newsgroups: comp.lang.awk
Subject: gawk regexp question
Date: Thu, 1 Dec 2022 09:04:18 -0000 (UTC)
Organization: The official candy of the new Millennium
Message-ID: <tm9qmi$nuf4$3@news.xmission.com>
Injection-Date: Thu, 1 Dec 2022 09:04:18 -0000 (UTC)
Injection-Info: news.xmission.com; posting-host="shell.xmission.com:166.70.8.4";
logging-data="784868"; mail-complaints-to="abuse@xmission.com"
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: gazelle@shell.xmission.com (Kenny McCormack)
 by: Kenny McCormack - Thu, 1 Dec 2022 09:04 UTC

I have a regexp like:

/^.*[?/]word[=/]/

and it seems to work as expected. Notice that neither of the weird/special
characters (? or /) is escaped (i.e., preceded with \) inside of [].

Am I correct in assuming this is OK? Is there a list anywhere of what is
and isn't "special" (i.e., needing to be escaped) inside of []?


Re: gawk regexp question

<651e12f0-d7c7-fc22-ffc1-080cb836e4f6@users.sourceforge.net>

Path: i2pn2.org!i2pn.org!aioe.org!0in1pHreWDEP413UZPB/SA.user.46.165.242.75.POSTED!not-for-mail
From: m-collado@users.sourceforge.net (Manuel Collado)
Newsgroups: comp.lang.awk
Subject: Re: gawk regexp question
Date: Thu, 1 Dec 2022 11:30:53 +0100
Organization: Aioe.org NNTP Server
Message-ID: <651e12f0-d7c7-fc22-ffc1-080cb836e4f6@users.sourceforge.net>
References: <tm9qmi$nuf4$3@news.xmission.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="30103"; posting-host="0in1pHreWDEP413UZPB/SA.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.5.0
X-Notice: Filtered by postfilter v. 0.9.2
X-Antivirus-Status: Clean
X-Antivirus: AVG (VPS 221201-0, 1/12/2022), Outbound message
 by: Manuel Collado - Thu, 1 Dec 2022 10:30 UTC

On 01/12/2022 at 10:04, Kenny McCormack wrote:
> I have a regexp like:
>
> /^.*[?/]word[=/]/
>
> and it seems to work as expected. Notice that neither of the weird/special
> characters (? or /) is escaped (i.e., preceded with \) inside of [].
>
> Am I correct in assuming this is OK? Is there a list anywhere of what is
> and isn't "special" (i.e., needing to be escaped) inside of []?
>

The gawk manual says:

"To include one of the characters ‘\’, ‘]’, ‘-’, or ‘^’ in a bracket
expression, put a ‘\’ in front of it. For example:
[d\]]
matches either ‘d’ or ‘]’. Additionally, if you place ‘]’ right after
the opening ‘[’, the closing bracket is treated as one of the characters
to be matched."

Don't know if this also applies to other awk variants.
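
The rule quoted above can be tried out on a few sample strings. What follows is only a sketch under stated assumptions: re_test is a hypothetical helper name, the sample strings are invented, and the regexp is passed as a dynamic (string) regexp through ENVIRON so that neither the shell nor awk's -v processing touches the backslashes. That also sidesteps the separate matter of how a "/" is read inside a /.../ regexp constant.

```sh
#!/bin/sh
# re_test REGEXP STRING  (hypothetical helper, not part of awk)
# Prints "match" if STRING matches REGEXP, otherwise "no match".
# Both values travel through the environment, so awk sees them verbatim.
re_test() {
    RE="$1" STR="$2" awk 'BEGIN {
        print ((ENVIRON["STR"] ~ ENVIRON["RE"]) ? "match" : "no match")
    }'
}

# The regexp from the original post: "?" and "/" are unescaped inside [].
re_test '^.*[?/]word[=/]' 'http://example.com/path?word=1'   # match ("?word=")
re_test '^.*[?/]word[=/]' '/usr/share/word/list'             # match ("/word/")
re_test '^.*[?/]word[=/]' 'password=1'                       # no match

# "]" placed right after the opening "[" is an ordinary member of the set.
re_test '[]d]' ']'                                           # match

# The manual's own example. gawk should report a match here; other awks
# are not covered by the quoted text and may behave differently.
re_test '[d\]]' ']'
```

With gawk, the regexp constant from the original post can also be tried directly, for example: gawk '/^.*[?/]word[=/]/' somefile (somefile being any input file of your own).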

--
Manuel Collado - http://mcollado.z15.es

Re: gawk regexp question

<tma7ca$o4og$1@news.xmission.com>

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!xmission!nnrp.xmission!.POSTED.shell.xmission.com!not-for-mail
From: gazelle@shell.xmission.com (Kenny McCormack)
Newsgroups: comp.lang.awk
Subject: Re: gawk regexp question
Date: Thu, 1 Dec 2022 12:40:42 -0000 (UTC)
Organization: The official candy of the new Millennium
Message-ID: <tma7ca$o4og$1@news.xmission.com>
References: <tm9qmi$nuf4$3@news.xmission.com> <651e12f0-d7c7-fc22-ffc1-080cb836e4f6@users.sourceforge.net>
Injection-Date: Thu, 1 Dec 2022 12:40:42 -0000 (UTC)
Injection-Info: news.xmission.com; posting-host="shell.xmission.com:166.70.8.4";
logging-data="791312"; mail-complaints-to="abuse@xmission.com"
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: gazelle@shell.xmission.com (Kenny McCormack)
 by: Kenny McCormack - Thu, 1 Dec 2022 12:40 UTC

In article <651e12f0-d7c7-fc22-ffc1-080cb836e4f6@users.sourceforge.net>,
Manuel Collado <m-collado@users.sourceforge.net> wrote:
....
>The gawk manual says:
>
>"To include one of the characters \, ], -, or ^ in a bracket
>expression, put a \ in front of it. For example:
> [d\]]
>matches either d or ]. Additionally, if you place ] right after
>the opening [, the closing bracket is treated as one of the characters
>to be matched."

OK, so it is just those 4 (\]-^).

I think "-" is also OK (i.e., doesn't need to be escaped) if it is the first
character inside of [].
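
This is easy to check. A minimal sketch, assuming a hypothetical helper br_match and made-up test strings: a "-" that is first inside [] has nothing on its left to form a range with, so it should be taken literally, and the same reasoning applies to a "-" in last position (that second case is an addition here, not something claimed above).

```sh
#!/bin/sh
# br_match REGEXP STRING  (hypothetical helper): prints 1 on a match, else 0.
# The regexp is read from the environment so it reaches awk unmodified.
br_match() {
    RE="$1" STR="$2" awk 'BEGIN {
        print ((ENVIRON["STR"] ~ ENVIRON["RE"]) ? 1 : 0)
    }'
}

br_match '^[-ac]$' '-'   # 1: "-" first, literal hyphen
br_match '^[ac-]$' '-'   # 1: "-" last, literal hyphen
br_match '^[a-c]$' 'b'   # 1: "-" between two characters is a range a..c
br_match '^[a-c]$' '-'   # 0: so the hyphen itself is not in the set
```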

>Don't know if this also applies to other awk variants.

Nobody cares anymore about "other awk variants".
(This is a Good Thing...)


Re: gawk regexp question

<tmgqkh$3f19p$1@dont-email.me>

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: janis_papanagnou+ng@hotmail.com (Janis Papanagnou)
Newsgroups: comp.lang.awk
Subject: Re: gawk regexp question
Date: Sun, 4 Dec 2022 01:46:08 +0100
Organization: A noiseless patient Spider
Lines: 31
Message-ID: <tmgqkh$3f19p$1@dont-email.me>
References: <tm9qmi$nuf4$3@news.xmission.com>
<651e12f0-d7c7-fc22-ffc1-080cb836e4f6@users.sourceforge.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 4 Dec 2022 00:46:09 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="c2ba3099d819ac3f62a359006f0c1e9a";
logging-data="3638585"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/cF+bnoBdl79FK1cuF9cUh"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
Thunderbird/45.8.0
Cancel-Lock: sha1:t/tWmJgldnxbbluzWhUxL1nhFdw=
In-Reply-To: <651e12f0-d7c7-fc22-ffc1-080cb836e4f6@users.sourceforge.net>
X-Enigmail-Draft-Status: N1110
 by: Janis Papanagnou - Sun, 4 Dec 2022 00:46 UTC

On 01.12.2022 11:30, Manuel Collado wrote:
>
> The gawk manual says:
>
> "To include one of the characters ‘\’, ‘]’, ‘-’, or ‘^’ in a bracket
> expression, put a ‘\’ in front of it. For example:
> [d\]]
> matches either ‘d’ or ‘]’. Additionally, if you place ‘]’ right after
> the opening ‘[’, the closing bracket is treated as one of the characters
> to be matched."
>
> Don't know if this also applies to other awk variants.

The old Awk "Bible" says:
"Inside a character class, all characters have their literal meaning,
except for the quoting character \ , ^ at the beginning, and - between
two characters."

And for meta-characters generally it says that single meta-characters
match themselves, and otherwise need to be \-escaped to preserve their
literal meaning.

I suppose that's what we could expect from other awks, including older ones.
(Test cases might be []], [[], vs. [\]], [\[].)
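
Those four test cases could be run against whatever awk is installed with something like the sketch below. The helper name probe and the output format are made up for illustration. Only the first two forms have a result that can be stated with confidence across implementations; the backslash forms are exactly where awks may disagree, so no expected output is given for them.

```sh
#!/bin/sh
# probe REGEXP  (hypothetical helper)
# Reports whether the bracket expression matches "[" and whether it matches "]".
probe() {
    RE="$1" awk 'BEGIN {
        re = ENVIRON["RE"]
        printf "%s [=%s ]=%s\n", re, (("[" ~ re) ? "yes" : "no"), (("]" ~ re) ? "yes" : "no")
    }'
}

probe '[]]'     # prints: []] [=no ]=yes
probe '[[]'     # prints: [[] [=yes ]=no
probe '[\]]'    # implementation-dependent; compare across awks
probe '[\[]'    # implementation-dependent; compare across awks
```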

For more recent tools, POSIX defines bracket expressions (in its BRE
section) that also apply to POSIX awk, and it covers the brackets
themselves. (With regard to the bracket symbols it gets a bit more
complicated, though, because of the collating syntaxes [. .], [= =],
and [: :].)

Janis
