Subject                                                        Author
* approaches for reformatting data points into pairs           Bryan
`* Re: approaches for reformatting data points into pairs      Janis Papanagnou
 `* Re: approaches for reformatting data points into pairs     Bryan
  `* Re: approaches for reformatting data points into pairs    Janis Papanagnou
   `* Re: approaches for reformatting data points into pairs   Kenny McCormack
    `- Re: approaches for reformatting data points into pairs  Janis Papanagnou
approaches for reformatting data points into pairs

<be037406-2aeb-4508-a09a-793afcda8cd2n@googlegroups.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1404&group=comp.lang.awk#1404

 by: Bryan - Fri, 17 Mar 2023 14:21 UTC

I am interested in ways to reorganize a long string of comma-separated numbers (CSV). For instance, I'd like to extract every other pair of numbers (see my attempt below). This could be useful for basic mathematical analysis or for reformatting program output. For example, SVG files have paths with such features (see the "q" or "c" commands). It could also help when plotting different subsets of the data, e.g. every other pair or other combinations (though I note that the gnuplot "every" keyword is also useful for this).

For example this sequence:

-10.000000,-9.000000,-8.000000,-7.000000,-[...trim...]7.000000,8.000000,9.000000,10.000000

can be put into different groups, for example these "x,y" data points :

-10.000, -9.000
-8.000, -7.000
-6.000, -5.000
-4.000, -3.000
-2.000, -1.000
0.000, 1.000
2.000, 3.000
4.000, 5.000
6.000, 7.000
8.000, 9.000
10.000,

(Note that the last value has no partner.) This script does that, printing extra detail to make the process easier to follow:

awk_dev_test_seq=$(seq -s',' -f '%f' -10 10)
gawk -F, '
{
    for (i = 1; i <= NF; i++) {
        if (i % 2 == 0)
            printf("i=%s Y:%3.3f%s ", i, $i, "\n")
        else
            printf("i=%s X:%3.3f%s ", i, $i, ",")
    }
}' <<EOF
${awk_dev_test_seq}
EOF

The divisor in (i % 2 == 0) can be changed to put more values on each line. For example, changing "i % 2" to "i % 3" puts three consecutive numbers on each line. Result:

i=1 X:-10.000, i=2 X:-9.000, i=3 Y:-8.000

... and so on. I have been looking at other ways to group the data. For example, getting every other *pair* of numbers would be interesting, as illustrated in this pseudo-output:

keep this line : -10.000, -9.000
Skip this line-> -8.000, -7.000
keep this line : -6.000, -5.000
Skip this line-> -4.000, -3.000
keep this line : -2.000, -1.000

I am asking which approach would be best for doing that in awk: if/else, while, for, or some other control-flow statement (I think that is the term for those).
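The keep/skip pseudo-output above can be produced with the same modulo idea, applied to the pair number instead of the field number. This is only a minimal sketch: the function name label_pairs and the variables pair and tag are made up for illustration, and plain awk is assumed to be enough since nothing here is gawk-specific.

```shell
# label_pairs: read comma-separated numbers on stdin, print them as pairs,
# and tag odd-numbered pairs "keep" and even-numbered pairs "skip".
# (Function and variable names are illustrative, not from the post.)
label_pairs() {
    awk -F, '{
        for (i = 1; i < NF; i += 2) {               # i < NF: $(i+1) always exists
            pair = (i + 1) / 2                      # pair number: 1, 2, 3, ...
            tag = (pair % 2 == 1) ? "keep" : "skip"
            printf "%s : %3.3f, %3.3f\n", tag, $i, $(i + 1)
        }
    }'
}

echo "-10,-9,-8,-7,-6,-5" | label_pairs
```

Replacing the printf for the "skip" case with nothing at all would print only the kept pairs.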

Tried to keep this short, but I'll note some interesting postings on this topic :

"Parsing standard CVS data by gawk"
https://lists.gnu.org/archive/html/bug-gawk/2015-07/msg00002.html
"CSV parsing with awk"
https://backreference.org/2010/04/17/csv-parsing-with-awk/index.html

-Bryan

Re: approaches for reformatting data points into pairs

<tv1vr7$20pvs$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1405&group=comp.lang.awk#1405

 by: Janis Papanagnou - Fri, 17 Mar 2023 15:09 UTC

On 17.03.2023 15:21, Bryan wrote:
> I am interested in generally organizing a long string of
> comma-separated numbers ("CSV" or "CVS") in different ways. For
> instance, I'd like to get every other pair of numbers (see below for
> work). This might be useful and extendable for basic mathematical
> analysis, or practical reformatting of program output. E.g. svg files
> have paths with such features (see the "q" or "c" commands), or for
> plotting different sets the data, e.g. every other pair, or other
> combinations. (However, I note that the gnuplot "every" command is
> also useful for this).
>
> For example this sequence:
>
> -10.000000,-9.000000,-8.000000,-7.000000,-[...trim...]7.000000,8.000000,9.000000,10.000000
>
> can be put into different groups, for example these "x,y" data points :
>
> -10.000, -9.000
> -8.000, -7.000
> -6.000, -5.000
> -4.000, -3.000
> -2.000, -1.000
> 0.000, 1.000
> 2.000, 3.000
> 4.000, 5.000
> 6.000, 7.000
> 8.000, 9.000
> 10.000,
>
> (note there is no partner for the last pair). This script will do
> that (with extra details shown to help follow the processes):
>

I'm not sure whether you want a "universal" script or just hints for
coding variants. For the former you would need to specify the
requirements precisely. For the latter, see below...

> awk_dev_test_seq=$(seq -s',' -f '%f' -10 10)
> gawk -F, '
> {
> {
> for (i=1;i<=NF;i++ )
> {
> if ( i % 2 == 0 ) printf("i=%s Y:%3.3f%s ", i, $i, "\n")
> else
> printf("i=%s X:%3.3f%s ", i, $i, ",")
> }
> }
> }' <<EOF
> ${awk_dev_test_seq}
> EOF

Personally I'd take a (slightly) different approach here, such as
handling the irregular (odd) case explicitly

awk -F, '
NF % 2 == 1 { ...in case of odd number of fields - what to do?... }
NF % 2 == 0 { ...(regular?) case of even number of fields... }
'

(The second condition may be irrelevant if you use the first action
to fix your data, and you can fall through in the regular case.)
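As a rough sketch of that pattern/action layout: the first rule below "fixes" a record with an odd number of fields by appending a placeholder, and every record then falls through to the ordinary pairing rule. The placeholder "NaN", the function name pad_pairs, and the choice to pad rather than drop or reject the record are assumptions made for illustration only.

```shell
# pad_pairs: print comma-separated values as pairs, one record per input line.
# Records with an odd field count get a placeholder partner ("NaN" is an
# arbitrary, assumed choice) before the regular rule runs.
pad_pairs() {
    awk -F, '
    NF % 2 == 1 { $(NF + 1) = "NaN" }       # odd case: append a field, NF grows by one
    {
        for (i = 1; i <= NF; i += 2)        # NF is even here, so $(i+1) exists
            printf "%s, %s\n", $i, $(i + 1)
    }'
}

printf '1,2,3,4\n5,6,7\n' | pad_pairs
```

The format uses %s rather than %f so that the placeholder is printed as-is.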

For the iteration I'd do

for (i=1; i<=NF; i+=2) # i.e. increment by 2

and print a pair of numbers in a single printf statement

printf "X:%3.3f,Y:%3.3f\n", $i, $(i+1)

(adjust the formatting string and arguments as desired).

In case you want to skip a data pair, adjust the increment
accordingly: say, i+=4 (for your example below), or i+=3 if you
want to skip a single data value (say, a Z-coordinate).
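Put together, the increment can be passed in from the shell, so one script covers all three cases. A minimal sketch, assuming a made-up function name take_pairs and an awk variable step set with -v (neither appears in the thread):

```shell
# take_pairs STEP: print the first two values of every STEP-sized group.
#   STEP=2 keeps all pairs, STEP=4 keeps every other pair,
#   STEP=3 keeps x,y and skips a third (z) value.
# (take_pairs and step are illustrative names.)
take_pairs() {
    awk -F, -v step="$1" '{
        for (i = 1; i < NF; i += step)      # i < NF: never read past the last field
            printf "X:%3.3f,Y:%3.3f\n", $i, $(i + 1)
    }'
}

echo "-10,-9,-8,-7,-6,-5,-4,-3" | take_pairs 4
```

With this loop bound a trailing value without a partner is silently dropped; whether that is the right policy depends on the data.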

Janis

>
> The number in (i % 2 == 0 ) can be adjusted to get e.g. each line
> containing the three consecutive numbers by changing "i % 2" to "i % 3".
> results :
>
> i=1 X:-10.000, i=2 X:-9.000, i=3 Y:-8.000
>
> ... and so on. I have been looking at how to do other groupings of
> the data - for example, getting every other *pair* of numbers would be
> interesting, illustrated in this pseudo-output :
>
> keep this line : -10.000, -9.000
> Skip this line->-8.000, -7.000
> keep this line : -6.000, -5.000
> Skip this line-> -4.000, -3.000
> keep this line : -2.000, -1.000
>
> I am asking what approaches might be best to do that in awk -
> if/else, while, for, or other control sequences (I think is the term
> for those).
>
> Tried to keep this short, but I'll note some interesting postings on this topic :
>
> "Parsing standard CVS data by gawk"
> https://lists.gnu.org/archive/html/bug-gawk/2015-07/msg00002.html
> "CSV parsing with awk"
> https://backreference.org/2010/04/17/csv-parsing-with-awk/index.html
>
> -Bryan
>

Re: approaches for reformatting data points into pairs

<6029953a-aaae-43a7-a794-31412515e215n@googlegroups.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1408&group=comp.lang.awk#1408

 by: Bryan - Fri, 17 Mar 2023 16:32 UTC

On Friday, March 17, 2023 at 11:09:30 AM UTC-4, Janis Papanagnou wrote:
> Personally I'd take a (slightly) different approach here, like doing
> a handling of irregular (odd) cases
>
> awk -F, '
> NF % 2 == 1 { ...in case of odd number of fields - what to do?... }
> NF % 2 == 0 { ...(regular?) case of even number of fields... }
> '
>
> (The second condition may be irrelevant if you use the first action
> to fix your data, and you can fall through in the regular case.)

This is interesting, thanks.
> For the iteration I'd do
>
> for (i=1; i<=NF; i+=2) # i.e. increment by 2
>
> and print a pair of numbers in one single print statement
>
> printf "X:%3.3f,Y:%3.3f\n", $i, $(i+1)
>
> (adjust the formatting string and arguments as desired).
>
> In case you want to skip a data pair adjust the increment
> appropriately, say, by i+=4 (for your example below), or by
> i+=3 if you want to skip a data value (say a Z-coordinate).

That idea, used in the following script, appears to be exactly what I meant:

awk_dev_test_seq=$(seq -s',' -f '%f' -10 10)
gawk -F, '
{
    for (i = 1; i <= NF; i += 4)
        printf("i=%s %3.3f %3.3f \n", i, $i, $(i+1))
}' <<EOF
${awk_dev_test_seq}
EOF

output:

i=1 -10.000 -9.000
i=5 -6.000 -5.000
i=9 -2.000 -1.000
i=13 2.000 3.000
i=17 6.000 7.000
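
One detail worth keeping in mind: seq -10 10 produces 21 values, so a loop bounded by i<=NF with a step of 4 also reaches i=21, where $(i+1) refers to a field that does not exist. awk treats such a field as an empty string, which %f formats as 0.000. A minimal sketch of one way to report the unpaired value instead; the function name every_other_pair and the "(no partner)" label are illustrative assumptions:

```shell
# every_other_pair: print pairs 1, 3, 5, ... of a comma-separated list,
# and flag a final value that has no partner instead of printing 0.000.
# (Function name and label text are illustrative.)
every_other_pair() {
    awk -F, '{
        for (i = 1; i <= NF; i += 4) {
            if (i + 1 <= NF)
                printf "i=%d %3.3f %3.3f\n", i, $i, $(i + 1)
            else
                printf "i=%d %3.3f (no partner)\n", i, $i
        }
    }'
}

echo "-2,-1,0,1,2" | every_other_pair
```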

That helped a lot, thank you.

-Bryan

Re: approaches for reformatting data points into pairs

<tv3aao$2aucf$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1410&group=comp.lang.awk#1410

 by: Janis Papanagnou - Sat, 18 Mar 2023 03:14 UTC

On 17.03.2023 17:32, Bryan wrote:
> printf ( "i=%s %3.3f %3.3f \n", i, $i, $(i+1) )

I see you added parentheses. But note that 'printf', like 'print'
and unlike 'sprintf()', is a statement, not a function.
Just by the way.

Janis

Re: approaches for reformatting data points into pairs

<tv3e0l$17jet$1@news.xmission.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1413&group=comp.lang.awk#1413

 by: Kenny McCormack - Sat, 18 Mar 2023 04:17 UTC

In article <tv3aao$2aucf$1@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
>On 17.03.2023 17:32, Bryan wrote:
>> printf ( "i=%s %3.3f %3.3f \n", i, $i, $(i+1) )
>
>I see you added parenthesis. But note that 'printf' - as 'print',
>but as opposed to 'sprintf()' - is a statement, not a function.

Although you don't say so explicitly, the implication is that using
parentheses with printf is wrong. This implication is incorrect.

Although the parens are optional in most cases, they are necessary in
certain cases. I always use them (when I use printf in awk), because:

1) It looks better (IMHO, of course). It conforms more to what we
would expect to see in C.
2) It is necessary in certain cases, so might as well use them always.
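
The post does not say which cases it has in mind. One well-known case is a ">" comparison in the argument list: without parentheses awk reads ">" as output redirection. The sketch below shows both readings; the function names are made up, and the first one works in a scratch directory because it creates a file named "1".

```shell
# demo_redirect: without parentheses, "> 1" redirects output to a file named "1".
# Runs in a subshell inside a temporary directory so no stray file is left behind.
demo_redirect() (
    cd "$(mktemp -d)" || exit 1
    awk 'BEGIN { printf "%d\n", 2 > 1 }'    # nothing on stdout; "2" goes to file "1"
    cat 1                                   # show what was written to the file
)

# demo_compare: with parentheses around the argument list, ">" is a comparison.
demo_compare() {
    awk 'BEGIN { printf("%d\n", 2 > 1) }'   # prints 1, the value of "2 > 1"
}

demo_redirect
demo_compare
```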

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/DanQuayle

Re: approaches for reformatting data points into pairs

<tv3hlv$2c1g6$1@dont-email.me>

https://www.rocksolidbbs.com/devel/article-flat.php?id=1415&group=comp.lang.awk#1415

 by: Janis Papanagnou - Sat, 18 Mar 2023 05:19 UTC

On 18.03.2023 05:17, Kenny McCormack wrote:
> In article <tv3aao$2aucf$1@dont-email.me>,
> Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
>> On 17.03.2023 17:32, Bryan wrote:
>>> printf ( "i=%s %3.3f %3.3f \n", i, $i, $(i+1) )
>>
>> I see you added parenthesis. But note that 'printf' - as 'print',
>> but as opposed to 'sprintf()' - is a statement, not a function.
>
> Although you don't say so explicitly, the implication is that using
> parentheses with printf is wrong. This implication is incorrect.

There was no implication that they are wrong - actually they work.

But knowing that it is a statement and not a function lets you
understand the mechanics and derive explanations for the cases in
which parentheses around expressions are necessary. In those cases
it's not because [one wrongly assumes] that it is a function. In
other words, knowing the difference lets you grasp the semantics
of these language constructs.
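
A small sketch of the distinction, under the assumption that a side-by-side comparison is what helps here (the function name stmt_vs_func is made up): sprintf() returns a string that can be used inside an expression, while printf is a statement whose argument list may be written with or without surrounding parentheses.

```shell
# stmt_vs_func: contrast the sprintf() function with the printf statement.
stmt_vs_func() {
    awk 'BEGIN {
        s = sprintf("%3.3f", 2.5)   # function: the result can be assigned or nested
        print length(s), s          # prints: 5 2.500
        printf "%3.3f\n", 2.5       # statement with a bare argument list
        printf("%3.3f\n", 2.5)      # same statement, argument list in parentheses
    }'
}

stmt_vs_func
```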

>
> Although the parens are optional in most cases, they are necessary in
> certain cases. I always use them (when I use printf in awk), because:
>
> 1) It looks better (IMHO, of course). It conforms more to what we
> would expect to see in C.
> 2) It is necessary in certain cases, so might as well use them always.

That's worth religious wars. :-) I'll abstain.

Janis
