computers / news.software.nntp / Re: force expiration by path?

Subject                                Author
* force expiration by path?            Dave McGuire
+* Re: force expiration by path?       Grant Taylor
|+* Re: force expiration by path?      Dave McGuire
||`* Re: force expiration by path?     Grant Taylor
|| `* Re: force expiration by path?    Dave McGuire
||  `- Re: force expiration by path?   Grant Taylor
|`- Re: force expiration by path?      Tom Furie
`- Re: force expiration by path?       Julien ÉLIE

force expiration by path?

<ul5nl4$18j4$1@mail.neurotica.com>

Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.killfile.org!news.eyrie.org!news.xcski.com!news.neurotica.com!.POSTED.gw.neurotica.com!not-for-mail
From: mcguire@lssmuseum.org (Dave McGuire)
Newsgroups: news.software.nntp
Subject: force expiration by path?
Date: Sun, 10 Dec 2023 20:12:04 -0500
Organization: LSSM
Message-ID: <ul5nl4$18j4$1@mail.neurotica.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 11 Dec 2023 01:12:04 -0000 (UTC)
Injection-Info: mail.neurotica.com; posting-host="gw.neurotica.com:50.73.179.1";
logging-data="41572"; mail-complaints-to="usenet@mail.neurotica.com"
User-Agent: Mozilla Thunderbird
Content-Language: en-US
 by: Dave McGuire - Mon, 11 Dec 2023 01:12 UTC

Hi folks. Can anyone tell me if there's a way to tell INN to expire
a set of articles, as a one-time operation, based on their path?

I'm sure it's obvious that my goal is to get rid of all the Google
spam from the spool. I just filtered them in my cleanfeed configuration
but would like to purge the articles that are already there, as my
server is set up with a long expiration period.

A perusal of the docs for expire and such has turned up nothing, so
I'd appreciate some advice on whether or not there's a way to do this.

Thanks,
-Dave

--
Dave McGuire, President/Curator
Large Scale Systems Museum
New Kensington, PA

Re: force expiration by path?

<ul65qq$74a$1@tncsrv09.home.tnetconsulting.net>

Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!tncsrv06.tnetconsulting.net!tncsrv09.home.tnetconsulting.net!.POSTED.198.18.1.140!not-for-mail
From: gtaylor@tnetconsulting.net (Grant Taylor)
Newsgroups: news.software.nntp
Subject: Re: force expiration by path?
Date: Sun, 10 Dec 2023 23:14:02 -0600
Organization: TNet Consulting
Message-ID: <ul65qq$74a$1@tncsrv09.home.tnetconsulting.net>
References: <ul5nl4$18j4$1@mail.neurotica.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 11 Dec 2023 05:14:02 -0000 (UTC)
Injection-Info: tncsrv09.home.tnetconsulting.net; posting-host="198.18.1.140";
logging-data="7306"; mail-complaints-to="newsmaster@tnetconsulting.net"
User-Agent: Mozilla Thunderbird
Content-Language: en-US
In-Reply-To: <ul5nl4$18j4$1@mail.neurotica.com>
 by: Grant Taylor - Mon, 11 Dec 2023 05:14 UTC

On 12/10/23 19:12, Dave McGuire wrote:
>   Hi folks.  Can anyone tell me if there's a way to tell INN to expire
> a set of articles, as a one-time operation, based on their path?

Maybe and it depends. (More below.)

>   I'm sure it's obvious that my goal is to get rid of all the Google
> spam from the spool.  I just filtered them in my cleanfeed configuration
> but would like to purge the articles that are already there, as my
> server is set up with a long expiration period.

I was doing that very thing as we type this thread. I just checked,
and a long-running command has finished.

time says that my command ran for:

84021.02s user 19364.71s system 29% cpu 98:13:31.85 total

This is a tradspool on a four (spinning rust) drive ZFS pool.

Seeing as how I'm using tradspool, I'm able to delete files from the
spool directory.

I suspect that this isn't proper, much less pure, from an INN sense. I
bet I should have extracted the article number and passed a
cancel message to INN, likely via ctlinnd. But, I did a hack and I'll
deal with it if / when it becomes a problem.

That being said, I did a find across /var/spool/news/articles and had it
exec a script per article that looked for Message-IDs that ended with
@googlegroups.com.

This is actually the second time I've done this. The first time I did
it the process removed nearly seven million articles. Then I found out
that the Message-ID had a different pattern, likely as fields grew over
time. So I re-ran the process with a more forgiving format.

   export LC_ALL=C
   egrep -lm1 "^Message-ID: <[0-9A-Za-z]+-[0-9A-Za-z]+-[0-9A-Za-z]+-[0-9A-Za-z]+-[0-9A-Za-z]+@googlegroups.com>$" ${1} > /dev/null 2>&1
   if [ ${?} -eq 0 ]; then
        echo -n "X"
        rm ${1}
   fi

I'm sure there are other ways to do this. But it worked for me. I was
able to let it run in the background in a window.

   time (clear; find $(pwd) -type d | while read DIR; do
       echo -n "${TS}${${DIR/\/var\/spool\/news\/articles\//}//\//.}${FS}"
       find ${DIR} -maxdepth 1 -type f -exec /root/remove-google-groups-news-posting-if-its-spam.sh {} \;
   done; echo)

The echo / ${TS} / ${FS} isn't important, much less required. It's
there because I wanted to update the window title to be the newsgroup
that was being worked on.

I'm sure there are better ways to do this. But this has worked for me
to do exactly what you're asking to do.
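
For anyone not running zsh: the nested ${${DIR/...}//\//.} expansion in the one-liner above is zsh syntax, and bash rejects it as a bad substitution. A minimal POSIX sh sketch of the same directory-to-newsgroup mapping follows. SPOOL and dir_to_group are names made up for illustration, and the title escape sequence in the usage comment is the common xterm one, which is only a guess at what ${TS} and ${FS} hold.

```shell
#!/bin/sh
# Sketch: turn a tradspool directory into a newsgroup name using only
# POSIX parameter expansion and tr. SPOOL and dir_to_group are assumed names.
SPOOL=/var/spool/news/articles

dir_to_group() {
    rel=${1#"$SPOOL"/}                  # strip the spool prefix
    printf '%s\n' "$rel" | tr '/' '.'   # news/software/nntp -> news.software.nntp
}

# Usage (assumption: an xterm-compatible terminal):
#   printf '\033]0;%s\007' "$(dir_to_group "$DIR")"
```

Note that the spool root itself has no prefix to strip, so it comes out as a dotted path rather than a group name; that is harmless for a window title.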

>   A perusal of the docs for expire and such has turned up nothing, so
> I'd appreciate some advice on whether or not there's a way to do this.

I'm not aware of anything built in to INN that will do this. But this
is one way that you can do this outside of INN.
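
One possible refinement of the per-article check shown earlier: egrep scans the whole file, so a body line that happens to start with a matching Message-ID would also trigger the rm. The sketch below only looks at the headers, stopping at the first blank line. is_google_groups_article is a hypothetical helper name, and the pattern is simply the five-group one from the script above.

```shell
#!/bin/sh
# Sketch: succeed only if the Message-ID header field (not the body) of an
# article file matches the five-group Google Groups pattern.
# is_google_groups_article is a hypothetical helper name.
is_google_groups_article() {
    LC_ALL=C awk '
        /^$/ { exit }    # blank line marks the end of the headers
        /^Message-ID: <[0-9A-Za-z]+-[0-9A-Za-z]+-[0-9A-Za-z]+-[0-9A-Za-z]+-[0-9A-Za-z]+@googlegroups\.com>$/ { found = 1; exit }
        END { exit !found }
    ' "$1"
}

# Dry run: list candidates instead of deleting them.
#   find /var/spool/news/articles -type f | while read -r f; do
#       is_google_groups_article "$f" && echo "$f"
#   done
```

Listing the candidates first and counting them is a cheap sanity check before anything gets removed.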

N.B. what I did is possibly very specific to the tradspool method. I
have no idea about other methods. It may be possible, but would likely
require using ctlinnd to cancel the articles.
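
For the ctlinnd route, a rough sketch of the first step, pulling the Message-ID out of an article file so it can be handed to "ctlinnd cancel", might look like this. article_message_id is a hypothetical helper, the file path in the usage comment is a placeholder, and I have not timed this against a large spool.

```shell
#!/bin/sh
# Sketch: print the Message-ID header field of an article file, looking at
# the headers only. article_message_id is a hypothetical helper name.
article_message_id() {
    LC_ALL=C awk '
        /^$/ { exit }                                    # end of headers
        tolower($1) == "message-id:" { print $2; exit }  # first match wins
    ' "$1"
}

# Usage (on the news server, as the news user; the path is a placeholder):
#   ctlinnd cancel "$(article_message_id /path/to/article)"
```

The Message-ID is passed with its angle brackets, which is what ctlinnd expects for cancel.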

--
Grant. . . .

Re: force expiration by path?

<ul7rvb$3k8q$1@news.trigofacile.com>

Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.trigofacile.com!.POSTED.2a01cb080adc11009867ab1f31395dad.ipv6.abo.wanadoo.fr!not-for-mail
From: iulius@nom-de-mon-site.com.invalid (Julien ÉLIE)
Newsgroups: news.software.nntp
Subject: Re: force expiration by path?
Date: Mon, 11 Dec 2023 21:38:03 +0100
Organization: Groupes francophones par TrigoFACILE
Message-ID: <ul7rvb$3k8q$1@news.trigofacile.com>
References: <ul5nl4$18j4$1@mail.neurotica.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 11 Dec 2023 20:38:03 -0000 (UTC)
Injection-Info: news.trigofacile.com; posting-account="julien"; posting-host="2a01cb080adc11009867ab1f31395dad.ipv6.abo.wanadoo.fr:2a01:cb08:adc:1100:9867:ab1f:3139:5dad";
logging-data="119066"; mail-complaints-to="abuse@trigofacile.com"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:JOn0d7/6EtDLhSjudat+tlHuDSQ= sha256:1CQRa/82zMTZBz/vCM4C5rVMGjfhFzzi3hIMWo+/UA4=
sha1:+fOgbC3jh32bLdlWh9XGvPmS+s8= sha256:unzvnInkmUAVX3TORM2YBFA3MO9J8h1MiHVpdTnDrCc=
In-Reply-To: <ul5nl4$18j4$1@mail.neurotica.com>
 by: Julien ÉLIE - Mon, 11 Dec 2023 20:38 UTC

Hi Dave,

> Can anyone tell me if there's a way to tell INN to expire
> a set of articles, as a one-time operation, based on their path?

Grant's method naturally works on tradspool and you can use it.

In a more general case, you can parse the history file (in <pathdb> as
set in inn.conf), retrieve the headers of each article (sm -H) and run
the regexps you wish on these headers.
As you're asking for a search based on the Path header field, the
following command will write to a googlegroups.tokens file the storage
tokens of articles sent from Google Groups:

perl -ne 'chomp; our ($hash, $timestamps, $_) = split " ";
          print "$_\n" if $_ and qx/sm -q -H "$_" | grep Path/
              =~ /!google-groups\.googlegroups\.com!not-for-mail$/' \
     history > googlegroups.tokens

The command will take a bit of time to run, as INN retrieves every article.
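
For those who would rather avoid Perl, the same idea can be sketched in plain sh and awk. This is an illustration rather than a tested replacement: google_groups_tokens is a made-up name, and it assumes, as the Perl version does, that the storage token is the third whitespace-separated field of a history line and is missing for articles that are no longer stored. Unlike the "grep Path" above, it anchors the match on the Path header field itself.

```shell
#!/bin/sh
# Sketch: print the storage tokens of articles whose Path ends at Google
# Groups. google_groups_tokens is a hypothetical name; $1 is the history file.
google_groups_tokens() {
    # Keep only history lines that carry a storage token (@...@) in field 3.
    awk '$3 ~ /^@.*@$/ { print $3 }' "$1" |
    while read -r token; do
        # sm -q -H prints the headers of one article; stdin is redirected so
        # sm cannot consume the token list feeding the loop.
        if sm -q -H "$token" < /dev/null |
            grep -q '^Path:.*!google-groups\.googlegroups\.com!not-for-mail$'; then
            printf '%s\n' "$token"
        fi
    done
}

# Usage:
#   google_groups_tokens history > googlegroups.tokens
```

It still runs sm once per article, so expect it to be no faster than the one-liner.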

Then, to delete these articles from your spool, just run "sm -d"
on them. Something like:

xargs sm -d < googlegroups.tokens

Before doing that, check that your regexp worked, by retrieving a few
storage tokens and verifying they're coming from Google Groups. You can
see the contents of an article with:

sm -R '@...token...@'

(-R in uppercase)
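
To spot-check several tokens at once rather than reading whole articles one by one, something along these lines could print each token next to its Path header field. preview_tokens is a hypothetical helper and the default of five tokens is arbitrary; it uses the same "sm -q -H" call as the search command.

```shell
#!/bin/sh
# Sketch: show the Path header field for the first few tokens in a file.
# preview_tokens is a hypothetical name; $1 is the token file, $2 the count.
preview_tokens() {
    head -n "${2:-5}" "$1" | while read -r token; do
        # Redirect stdin so sm leaves the remaining tokens for the loop.
        path=$(sm -q -H "$token" < /dev/null | grep '^Path:')
        printf '%s  %s\n' "$token" "$path"
    done
}

# Usage:
#   preview_tokens googlegroups.tokens 10
```

Every line printed should end in "!google-groups.googlegroups.com!not-for-mail"; if one does not, fix the regexp before running "sm -d".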

> A perusal of the docs for expire and such has turned up nothing, so
> I'd appreciate some advice on whether or not there's a way to do this.

The next run of news.daily will properly clean the overview, etc.

--
Julien ÉLIE

« Qui habet aures audiendi, audiat. » (Évangiles)

Re: force expiration by path?

<uldq2o$4rn$1@mail.neurotica.com>

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!news.trigofacile.com!news.eyrie.org!news.xcski.com!news.neurotica.com!.POSTED.gw.neurotica.com!not-for-mail
From: mcguire@lssmuseum.org (Dave McGuire)
Newsgroups: news.software.nntp
Subject: Re: force expiration by path?
Date: Wed, 13 Dec 2023 21:42:32 -0500
Organization: LSSM
Message-ID: <uldq2o$4rn$1@mail.neurotica.com>
References: <ul5nl4$18j4$1@mail.neurotica.com>
<ul65qq$74a$1@tncsrv09.home.tnetconsulting.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 14 Dec 2023 02:42:33 -0000 (UTC)
Injection-Info: mail.neurotica.com; posting-host="gw.neurotica.com:50.73.179.1";
logging-data="4983"; mail-complaints-to="usenet@mail.neurotica.com"
User-Agent: Betterbird (Linux)
Content-Language: en-US
In-Reply-To: <ul65qq$74a$1@tncsrv09.home.tnetconsulting.net>
 by: Dave McGuire - Thu, 14 Dec 2023 02:42 UTC

On 12/11/23 00:14, Grant Taylor wrote:
> That being said, I did a find across /var/spool/news/articles and had it
> exec a script per article that looked for Message-IDs that ended with
> @googlegroups.com.
>
> This  is actually the second time I've done this.  The first time I did
> it the process removed nearly seven million articles.  Then I found out
> that the Message-ID had a different pattern, likely as fields grew over
> time.  So I re-ran the process with a more forgiving format.
>
>    export LC_ALL=C
>    egrep -lm1 "^Message-ID: <[0-9A-Za-z]+-[0-9A-Za-z]+-[0-9A-Za-z]+-[0-9A-Za-z]+-[0-9A-Za-z]+@googlegroups.com>$" ${1} > /dev/null 2>&1
>    if [ ${?} -eq 0 ]; then
>         echo -n "X"
>         rm ${1}
>    fi
>
> I'm sure there are other ways to do this.  But it worked for me.  I was
> able to let it run in the background in a window.
>
>    time (clear; find $(pwd) -type d | while read DIR; do
>        echo -n "${TS}${${DIR/\/var\/spool\/news\/articles\//}//\//.}${FS}"
>        find ${DIR} -maxdepth 1 -type f -exec /root/remove-google-groups-news-posting-if-its-spam.sh {} \;
>    done; echo)
>
> The echo / ${TS} / ${FS} isn't important, much less required.  It's
> there because I wanted to update the window title to be the newsgroup
> that was being worked on.
>
> I'm sure there are better ways to do this.  But this has worked for me
> to do exactly what you're asking to do.

Hi Grant, thank you, I'll give this a shot. The window title thing
is a nice touch. :)

Thanks,
-Dave

--
Dave McGuire, President/Curator
Large Scale Systems Museum
New Kensington, PA

Re: force expiration by path?

<ulduab$5tt$1@tncsrv09.home.tnetconsulting.net>

Path: i2pn2.org!i2pn.org!news.1d4.us!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!tncsrv06.tnetconsulting.net!tncsrv09.home.tnetconsulting.net!.POSTED.198.18.1.140!not-for-mail
From: gtaylor@tnetconsulting.net (Grant Taylor)
Newsgroups: news.software.nntp
Subject: Re: force expiration by path?
Date: Wed, 13 Dec 2023 21:54:51 -0600
Organization: TNet Consulting
Message-ID: <ulduab$5tt$1@tncsrv09.home.tnetconsulting.net>
References: <ul5nl4$18j4$1@mail.neurotica.com>
<ul65qq$74a$1@tncsrv09.home.tnetconsulting.net>
<uldq2o$4rn$1@mail.neurotica.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 14 Dec 2023 03:54:51 -0000 (UTC)
Injection-Info: tncsrv09.home.tnetconsulting.net; posting-host="198.18.1.140";
logging-data="6077"; mail-complaints-to="newsmaster@tnetconsulting.net"
User-Agent: Mozilla Thunderbird
Content-Language: en-US
In-Reply-To: <uldq2o$4rn$1@mail.neurotica.com>
 by: Grant Taylor - Thu, 14 Dec 2023 03:54 UTC

On 12/13/23 20:42, Dave McGuire wrote:
>   Hi Grant, thank you, I'll give this a shot.  The window title thing
> is a nice touch. :)

Hi Dave,

You're welcome.

Please let me know if it works or if you have questions.

--
Grant. . . .

Re: force expiration by path?

<ule7ht$b93$1@freeq.furie.org.uk>

Path: i2pn2.org!i2pn.org!news.furie.org.uk!.POSTED.2001:470:1ae8:50::18d!not-for-mail
From: tom@furie.org.uk (Tom Furie)
Newsgroups: news.software.nntp
Subject: Re: force expiration by path?
Date: Thu, 14 Dec 2023 06:32:28 +0000
Organization: Little to None
Message-ID: <ule7ht$b93$1@freeq.furie.org.uk>
References: <ul5nl4$18j4$1@mail.neurotica.com>
<ul65qq$74a$1@tncsrv09.home.tnetconsulting.net>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: freeq.furie.org.uk; posting-host="2001:470:1ae8:50::18d";
logging-data="11555"; mail-complaints-to="usenet@furie.org.uk"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
 by: Tom Furie - Thu, 14 Dec 2023 06:32 UTC

Grant Taylor <gtaylor@tnetconsulting.net> writes:

> export LC_ALL=C
> egrep -lm1 "^Message-ID: <[0-9A-Za-z]+-[0-9A-Za-z]+-[0-9A-Za-z]+-[0-9A-Za-z]+-[0-9A-Za-z]+@googlegroups.com>$"

Just an aside in case you need to do something like this again...

A pattern like

"^Message-ID: <([[:alnum:]]+-)+[[:alnum:]]+[[:alpha:]]"

before the "@" should allow the hyphenated grouping to expand
arbitrarily without intervention required to modify the pattern. I'm
pretty certain that's all hex (probably a hash of something) until the
"n@google...", and I don't think I've ever noticed anything other than
"n" immediately preceding the "@", so I guess the pattern could be
something like

"^Message-ID: <([0-9a-f]+-)+[0-9a-f]+n@google..."

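A quick way to try the generalised hex pattern before pointing it at a spool is to wrap it in a small function and feed it sample header lines. This is only a sketch: matches_generalised_pattern is a made-up name, the Message-IDs in the usage comment are invented rather than taken from real articles, and the regex stops at "n@google" exactly as written above instead of guessing the rest.

```shell
#!/bin/sh
# Sketch: test one header line against the generalised pattern.
# matches_generalised_pattern is a hypothetical name.
matches_generalised_pattern() {
    printf '%s\n' "$1" |
        LC_ALL=C grep -Eq '^Message-ID: <([0-9a-f]+-)+[0-9a-f]+n@google'
}

# Usage (invented Message-IDs):
#   matches_generalised_pattern 'Message-ID: <0a1b-2c3d-4e5fn@googlegroups.com>' && echo match
#   matches_generalised_pattern 'Message-ID: <abc123@example.invalid>' || echo "no match"
```

Because the character class is lowercase hex only, an ID with uppercase or non-hex letters would slip past it, which the original [0-9A-Za-z] groups would have caught.
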
Re: force expiration by path?

<uneq50$2bjh$1@mail.neurotica.com>

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!news.trigofacile.com!news.eyrie.org!news.xcski.com!news.neurotica.com!.POSTED.gw.neurotica.com!not-for-mail
From: mcguire@lssmuseum.org (Dave McGuire)
Newsgroups: news.software.nntp
Subject: Re: force expiration by path?
Date: Sun, 7 Jan 2024 13:22:24 -0500
Organization: LSSM
Message-ID: <uneq50$2bjh$1@mail.neurotica.com>
References: <ul5nl4$18j4$1@mail.neurotica.com>
<ul65qq$74a$1@tncsrv09.home.tnetconsulting.net>
<uldq2o$4rn$1@mail.neurotica.com>
<ulduab$5tt$1@tncsrv09.home.tnetconsulting.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 7 Jan 2024 18:22:24 -0000 (UTC)
Injection-Info: mail.neurotica.com; posting-host="gw.neurotica.com:50.73.179.1";
logging-data="77425"; mail-complaints-to="usenet@mail.neurotica.com"
User-Agent: Betterbird (Linux)
Content-Language: en-US
In-Reply-To: <ulduab$5tt$1@tncsrv09.home.tnetconsulting.net>
 by: Dave McGuire - Sun, 7 Jan 2024 18:22 UTC

On 12/13/23 22:54, Grant Taylor wrote:
>>    Hi Grant, thank you, I'll give this a shot.  The window title thing
>> is a nice touch. :)
>
> Hi Dave,
>
> You're welcome.
>
> Please let me know if it works or if you have questions.

Hi Grant, yes it did indeed work. Thank you for your advice.

-Dave

--
Dave McGuire, President/Curator
Large Scale Systems Museum
New Kensington, PA

Re: force expiration by path?

<unl5k2$kml$3@tncsrv09.home.tnetconsulting.net>

Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!tncsrv06.tnetconsulting.net!tncsrv09.home.tnetconsulting.net!.POSTED.198.18.1.140!not-for-mail
From: gtaylor@tnetconsulting.net (Grant Taylor)
Newsgroups: news.software.nntp
Subject: Re: force expiration by path?
Date: Tue, 9 Jan 2024 22:14:58 -0600
Organization: TNet Consulting
Message-ID: <unl5k2$kml$3@tncsrv09.home.tnetconsulting.net>
References: <ul5nl4$18j4$1@mail.neurotica.com>
<ul65qq$74a$1@tncsrv09.home.tnetconsulting.net>
<uldq2o$4rn$1@mail.neurotica.com>
<ulduab$5tt$1@tncsrv09.home.tnetconsulting.net>
<uneq50$2bjh$1@mail.neurotica.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 10 Jan 2024 04:14:58 -0000 (UTC)
Injection-Info: tncsrv09.home.tnetconsulting.net; posting-host="198.18.1.140";
logging-data="21205"; mail-complaints-to="newsmaster@tnetconsulting.net"
User-Agent: Mozilla Thunderbird
Content-Language: en-US
In-Reply-To: <uneq50$2bjh$1@mail.neurotica.com>
 by: Grant Taylor - Wed, 10 Jan 2024 04:14 UTC

On 1/7/24 12:22, Dave McGuire wrote:
> Hi Grant, yes it did indeed work.  Thank you for your advice.

Hi Dave,

Thank you for the follow up. I'm glad that it worked for you. :-)

--
Grant. . . .
