Subject (Author)
* Getting spamassassin and clamav as inn filters (The Doctor)
`* Re: Getting spamassassin and clamav as inn filters (Gea-Suan Lin)
 +* Re: Getting spamassassin and clamav as inn filters (Ray Banana)
 |`- Re: Getting spamassassin and clamav as inn filters (Gea-Suan Lin)
 `* Re: Getting spamassassin and clamav as inn filters (Julien ÉLIE)
  `* Re: Getting spamassassin and clamav as inn filters (yamo')
   `* Re: Getting spamassassin and clamav as inn filters (Ray Banana)
    `- Re: Getting spamassassin and clamav as inn filters (yamo')

Getting spamassassin and clamav as inn filters

<ug14cb$a9g$15@gallifrey.nk.ca>

https://www.rocksolidbbs.com/computers/article-flat.php?id=2287&group=news.software.nntp#2287

Newsgroups: news.software.nntp
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.nk.ca!.POSTED.doctor.nl2k.ab.ca!not-for-mail
From: doctor@doctor.nl2k.ab.ca (The Doctor)
Newsgroups: news.software.nntp
Subject: Getting spamassassin and clamav as inn filters
Date: Mon, 9 Oct 2023 14:57:15 -0000 (UTC)
Organization: NetKnow News
Message-ID: <ug14cb$a9g$15@gallifrey.nk.ca>
Injection-Date: Mon, 9 Oct 2023 14:57:15 -0000 (UTC)
Injection-Info: gallifrey.nk.ca; posting-host="doctor.nl2k.ab.ca:204.209.81.1";
logging-data="10544"; mail-complaints-to="usenet@gallifrey.nk.ca"
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: doctor@doctor.nl2k.ab.ca (The Doctor)
 by: The Doctor - Mon, 9 Oct 2023 14:57 UTC

Any recipes how?
--
Member - Liberal International This is doctor@nk.ca Ici doctor@nk.ca
Yahweh, King & country!Never Satan President Republic!Beware AntiChrist rising!
Look at Psalms 14 and 53 on Atheism https://www.empire.kred/ROOTNK?t=94a1f39b
An oil stain on the carpet is not removed by picking up the litter. -unknown Beware https://mindspring.com

Re: Getting spamassassin and clamav as inn filters

<ug7sh8$pcc$1@colo-sc-1.gslin.com>

https://www.rocksolidbbs.com/computers/article-flat.php?id=2302&group=news.software.nntp#2302

Newsgroups: news.software.nntp
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!feeder6.news.weretis.net!newsfeed.hasname.com!.POSTED.114-34-121-114.hinet-ip.hinet.net!not-for-mail
From: gslin@gslin.org (Gea-Suan Lin)
Newsgroups: news.software.nntp
Subject: Re: Getting spamassassin and clamav as inn filters
Date: Thu, 12 Oct 2023 04:26:16 -0000 (UTC)
Organization: Hasname
Message-ID: <ug7sh8$pcc$1@colo-sc-1.gslin.com>
References: <ug14cb$a9g$15@gallifrey.nk.ca>
Injection-Date: Thu, 12 Oct 2023 04:26:16 -0000 (UTC)
Injection-Info: colo-sc-1.gslin.com; posting-host="114-34-121-114.hinet-ip.hinet.net:114.34.121.114";
logging-data="25996"; mail-complaints-to="usenet@colo-sc-1.gslin.com"
User-Agent: slrn/1.0.3 (Linux)
 by: Gea-Suan Lin - Thu, 12 Oct 2023 04:26 UTC

On 2023-10-09, The Doctor <doctor@doctor.nl2k.ab.ca> wrote:
> Any recipes how?

Yeah, I just implemented a simple hack within `cleanfeed.local`. I have
tried it, but it is not very useful: a lot of spam still gets into
comp.lang.c and other groups.

The most effective way to avoid Google Groups spam for now is simply to
reject everything coming from Google Groups.

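```
use Mail::SpamAssassin;

# Create the SpamAssassin object once, when the filter is loaded.
my $sa_agent = Mail::SpamAssassin->new();

sub local_filter_last {
    # Only check articles that came through Google Groups.
    return unless $hdr{Path} =~ /google-groups\.googlegroups\.com/;

    # Copy the headers, leaving out the __BODY__ and __LINES__ pseudo-headers.
    my %myhdr = %hdr;
    delete $myhdr{__BODY__};
    delete $myhdr{__LINES__};

    # Rebuild a message that SpamAssassin can parse.
    my $header_str = join "\n", map { "$_: $myhdr{$_}" } keys %myhdr;
    my $article_str = "$header_str\n\n$hdr{__BODY__}";

    my $mail = $sa_agent->parse($article_str);
    my $status = $sa_agent->check($mail);
    my $is_spam = $status->is_spam();

    # Release the per-message objects before returning, for spam as well.
    $status->finish();
    $mail->finish();

    return reject("Reject Google Groups posting to $hdr{Newsgroups} by SpamAssassin") if $is_spam;
    return;
}
```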
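
For anyone who wants to try the header-rebuilding step outside INN, here is a minimal sketch. It assumes a hash shaped like the `%hdr` used in the filter above (real headers plus the `__BODY__` and `__LINES__` pseudo-keys). The helper name `build_article_str` and the sample article are made up for illustration, and the keys are sorted only so that the output is reproducible.

```perl
use strict;
use warnings;

# Hypothetical helper: rebuild a parseable message from an INN-style
# header hash. __BODY__ and __LINES__ are pseudo-keys, not real headers,
# so they are left out of the header section.
sub build_article_str {
    my ($hdr) = @_;
    my @names = sort grep { $_ ne '__BODY__' && $_ ne '__LINES__' } keys %$hdr;
    my $header_str = join "\n", map { "$_: $hdr->{$_}" } @names;
    my $body = defined $hdr->{__BODY__} ? $hdr->{__BODY__} : '';
    # A blank line separates the headers from the body.
    return "$header_str\n\n$body";
}

# Made-up sample article, for illustration only.
my %sample = (
    'Path'       => 'example.invalid!not-for-mail',
    'From'       => 'someone@example.invalid',
    'Subject'    => 'test',
    'Newsgroups' => 'misc.test',
    '__BODY__'   => "Hello, world.\n",
    '__LINES__'  => 1,
);

print build_article_str(\%sample);
```

The resulting string is what would be handed to `$sa_agent->parse()` in the filter above.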

Re: Getting spamassassin and clamav as inn filters

<slrnuif1ts.29emv.rayban@raybanana.net>

https://www.rocksolidbbs.com/computers/article-flat.php?id=2303&group=news.software.nntp#2303

Newsgroups: news.software.nntp
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!raybanana.eternal-september.org!.POSTED!not-for-mail
From: rayban@raybanana.net (Ray Banana)
Newsgroups: news.software.nntp
Subject: Re: Getting spamassassin and clamav as inn filters
Date: Thu, 12 Oct 2023 05:44:28 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 42
Message-ID: <slrnuif1ts.29emv.rayban@raybanana.net>
References: <ug14cb$a9g$15@gallifrey.nk.ca> <ug7sh8$pcc$1@colo-sc-1.gslin.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 12 Oct 2023 05:44:28 -0000 (UTC)
Injection-Info: raybanana.eternal-september.org; posting-host="f1613cabe9933e581fdcceef24ef56be";
logging-data="2467742"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1++TWolk8UkPYWzkGJx1twJqioR6WRT3J0="
User-Agent: slrn/pre1.0.4-9 (Linux)
Cancel-Lock: sha1:RHnK30vlAU1H7C+KhXT6MIYGIKU=
 by: Ray Banana - Thu, 12 Oct 2023 05:44 UTC

* Gea-Suan Lin wrote:
> On 2023-10-09, The Doctor <doctor@doctor.nl2k.ab.ca> wrote:
>> Any recipes how?
>
> Yeah, I just implemented a simple hack within `cleanfeed.local`. Have
> tried, but not so useful. Still many spam into comp.lang.c and other
> groups.
[...]
> use Mail::SpamAssassin;
>
> my $sa_agent = Mail::SpamAssassin->new();
>
> sub local_filter_last {
> return unless $hdr{Path} =~ /google-groups\.googlegroups\.com/;
>
> my %myhdr = %hdr;
> delete $myhdr{__BODY__};
> delete $myhdr{__LINES__};
>
> my $header_str = join "\n", map { "$_: $hdr{$_}" } keys %myhdr;
> my $article_str = "$header_str\n\n$hdr{__BODY__}";
>
> my $mail = $sa_agent->parse($article_str);
> my $status = $sa_agent->check($mail);
>
> return reject("Reject Google Groups posting to $hdr{Newsgroups} by SpamAssassin") if $status->is_spam();
>
> $status->finish();
> $mail->finish();
>
> return;
> }
> ```

OK, now you need a ~/.spamassassin directory for your news user and a user_prefs
file in that directory. After that you can start adding rules for Usenet spam.
You will also need to feed several hundred spam and ham articles to sa-learn --spam
and sa-learn --ham, respectively, as the news user. After that, SpamAssassin will gradually improve.
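
The training step could be wrapped in a small Perl script along these lines. This is only a sketch: it assumes the articles have been saved one per file into a directory, the helper names `sa_learn_command` and `train` are made up for illustration, and it has to run as the news user so that the Bayes data ends up under that user's ~/.spamassassin.

```perl
use strict;
use warnings;

# Hypothetical helper: build the sa-learn command line for one corpus
# directory. $kind is 'spam' or 'ham'; $dir holds one article per file.
sub sa_learn_command {
    my ($kind, $dir) = @_;
    die "kind must be 'spam' or 'ham'\n"
        unless $kind eq 'spam' || $kind eq 'ham';
    return ('sa-learn', "--$kind", $dir);
}

# Run sa-learn on the directory and warn if it does not exit cleanly.
sub train {
    my ($kind, $dir) = @_;
    my @cmd = sa_learn_command($kind, $dir);
    system(@cmd) == 0 or warn "sa-learn failed for $dir\n";
}

# Only train when called with both arguments: <spam|ham> <directory>
train(@ARGV) if @ARGV == 2;
```

It would be called once per corpus, with `spam` or `ham` as the first argument and the directory of saved articles as the second.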

--
Пу́тін — хуйло́
http://www.eternal-september.org

Re: Getting spamassassin and clamav as inn filters

<ug85eb$435p$1@news.trigofacile.com>

https://www.rocksolidbbs.com/computers/article-flat.php?id=2304&group=news.software.nntp#2304

Newsgroups: news.software.nntp
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!news.trigofacile.com!.POSTED.san13-h02-176-143-2-105.dsl.sta.abo.bbox.fr!not-for-mail
From: iulius@nom-de-mon-site.com.invalid (Julien ÉLIE)
Newsgroups: news.software.nntp
Subject: Re: Getting spamassassin and clamav as inn filters
Date: Thu, 12 Oct 2023 08:58:19 +0200
Organization: Groupes francophones par TrigoFACILE
Message-ID: <ug85eb$435p$1@news.trigofacile.com>
References: <ug14cb$a9g$15@gallifrey.nk.ca> <ug7sh8$pcc$1@colo-sc-1.gslin.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 12 Oct 2023 06:58:19 -0000 (UTC)
Injection-Info: news.trigofacile.com; posting-account="julien"; posting-host="san13-h02-176-143-2-105.dsl.sta.abo.bbox.fr:176.143.2.105";
logging-data="134329"; mail-complaints-to="abuse@trigofacile.com"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:OYMBnKOY3Wn1Nd4LTRgSLTLlfQU= sha256:xAu49yXSrQw4lG8OBsg2VdfOJ8EKWSOwCOwFRsfEKng=
sha1:l26XjeHr4qnrI8g6UuWUACGrtjM= sha256:TdxRgXCEsH9iHenYrVOYqV9lQvmFp1XdQtP7Ool1OaE=
In-Reply-To: <ug7sh8$pcc$1@colo-sc-1.gslin.com>
 by: Julien ÉLIE - Thu, 12 Oct 2023 06:58 UTC

Hi Gea-Suan Lin,

>> Any recipes how?
>
> Yeah, I just implemented a simple hack within `cleanfeed.local`. Have
> tried, but not so useful. Still many spam into comp.lang.c and other
> groups.

FWIW, there's a doc in French to set up a "spamchk" funnel to
SpamAssassin in the newsfeeds file:
https://web.archive.org/web/20230901182332/https://git.alphanet.ch/gitweb/?p=inn-install;a=blob_plain;f=README.html;hb=HEAD#filtrer-le-spam-avec-spamassassin

--
Julien ÉLIE

« Medicus curat, natura sanat. »

Re: Getting spamassassin and clamav as inn filters

<ug8cim$255$1@colo-sc-1.gslin.com>

https://www.rocksolidbbs.com/computers/article-flat.php?id=2306&group=news.software.nntp#2306

Newsgroups: news.software.nntp
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!newsfeed.hasname.com!.POSTED.114-34-121-114.hinet-ip.hinet.net!not-for-mail
From: gslin@gslin.org (Gea-Suan Lin)
Newsgroups: news.software.nntp
Subject: Re: Getting spamassassin and clamav as inn filters
Date: Thu, 12 Oct 2023 09:00:09 -0000 (UTC)
Organization: Hasname
Message-ID: <ug8cim$255$1@colo-sc-1.gslin.com>
References: <ug14cb$a9g$15@gallifrey.nk.ca> <ug7sh8$pcc$1@colo-sc-1.gslin.com> <slrnuif1ts.29emv.rayban@raybanana.net>
Injection-Date: Thu, 12 Oct 2023 09:00:09 -0000 (UTC)
Injection-Info: colo-sc-1.gslin.com; posting-host="114-34-121-114.hinet-ip.hinet.net:114.34.121.114";
logging-data="2213"; mail-complaints-to="usenet@colo-sc-1.gslin.com"
User-Agent: tin/2.6.2-20220130 ("Convalmore") (Linux/5.15.0-86-generic (x86_64))
 by: Gea-Suan Lin - Thu, 12 Oct 2023 09:00 UTC

Thanks for the information.

I added a setting to ~/.spamassassin/user_prefs so that MIME parts are
also recognized:

#
bayes_token_sources all

Then I manually selected 200+ ham articles and 200+ spam articles from
comp.lang.c, 50+ spam articles from comp.lang.python, and 200+ spam
articles from sci.crypt. Afterwards I fed all of them to sa-learn.

The result looks pretty good so far. Almost all new spam posted to
comp.lang.c has been blocked by SpamAssassin.

I put my trained files here, so you can just reuse them:

https://newsfeed.hasname.com/files/usenet-spamassassin-20231012.tar.gz

Ray Banana <rayban@raybanana.net> wrote:
> * Gea-Suan Lin wrote:
>> On 2023-10-09, The Doctor <doctor@doctor.nl2k.ab.ca> wrote:
>>> Any recipes how?
>>
>> Yeah, I just implemented a simple hack within `cleanfeed.local`. Have
>> tried, but not so useful. Still many spam into comp.lang.c and other
>> groups.
> [...]
>> use Mail::SpamAssassin;
>>
>> my $sa_agent = Mail::SpamAssassin->new();
>>
>> sub local_filter_last {
>> return unless $hdr{Path} =~ /google-groups\.googlegroups\.com/;
>>
>> my %myhdr = %hdr;
>> delete $myhdr{__BODY__};
>> delete $myhdr{__LINES__};
>>
>> my $header_str = join "\n", map { "$_: $hdr{$_}" } keys %myhdr;
>> my $article_str = "$header_str\n\n$hdr{__BODY__}";
>>
>> my $mail = $sa_agent->parse($article_str);
>> my $status = $sa_agent->check($mail);
>>
>> return reject("Reject Google Groups posting to $hdr{Newsgroups} by SpamAssassin") if $status->is_spam();
>>
>> $status->finish();
>> $mail->finish();
>>
>> return;
>> }
>> ```
>
> OK, now you need a ~/.spamassassin directory for your news user and a user_prefs
> file in that directory. After that you can start adding rules for Usenet spam.
> You will also need to feed several hundreds of spam and ham articles to sa-learn --spam
> or sa-learn --ham as the news user. After that, SpamAssassin will gradually improve.
>

--
Resistance is futile.
https://blog.gslin.org/ & <gslin@gslin.org>

Re: Getting spamassassin and clamav as inn filters

<up55dd$6jv$1@rasp.pasdenom.info>

https://www.rocksolidbbs.com/computers/article-flat.php?id=2869&group=news.software.nntp#2869

Newsgroups: news.software.nntp
Path: i2pn2.org!i2pn.org!news.nntp4.net!pasdenom.info!.POSTED.2a01:e0a:21:ea80:432d:d113:4f14:b986!not-for-mail
From: yamo@beurdin.invalid (yamo')
Newsgroups: news.software.nntp
Subject: Re: Getting spamassassin and clamav as inn filters
Date: Sun, 28 Jan 2024 10:05:49 +0100
Organization: <http://pasdenom.info/news.html>
Message-ID: <up55dd$6jv$1@rasp.pasdenom.info>
References: <ug14cb$a9g$15@gallifrey.nk.ca> <ug7sh8$pcc$1@colo-sc-1.gslin.com>
<ug85eb$435p$1@news.trigofacile.com>
Reply-To: yamo@groumpf.org
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 28 Jan 2024 09:05:49 -0000 (UTC)
Injection-Info: rasp.pasdenom.info; posting-account="stephane@usenet"; posting-host="2a01:e0a:21:ea80:432d:d113:4f14:b986";
logging-data="6783"; mail-complaints-to="abuse@pasdenom.info"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.18.1
Cancel-Lock: sha1:VXrshjYfjdanuNjmb5CJ3KDqzhs= sha256:TQDr1LocNVNmDwFD8LdySljCgZ8MPrNHY4FhNYzz6dM=
sha1:wGLIh2hSgmZhfS7mo1PSPOwspU0= sha256:BX+HaYhkJKmgmj/LfS3XwlL3PeaNCvna9nyF8pou1Xg=
X-Seamonkey: <https://www.seamonkey-project.org/>
In-Reply-To: <ug85eb$435p$1@news.trigofacile.com>
 by: yamo' - Sun, 28 Jan 2024 09:05 UTC

Hi Julien,

Julien ÉLIE wrote on 12/10/2023 08:58:
> Hi Gea-Suan Lin,
>
>>> Any recipes how?
>>
>> Yeah, I just implemented a simple hack within `cleanfeed.local`. Have
>> tried, but not so useful. Still many spam into comp.lang.c and other
>> groups.
>
> FWIW, there's a doc in French to set up a "spamchk" funnel to
> SpamAssassin in the newsfeeds file:
>
> https://web.archive.org/web/20230901182332/https://git.alphanet.ch/gitweb/?p=inn-install;a=blob_plain;f=README.html;hb=HEAD#filtrer-le-spam-avec-spamassassin
>

The spamchk funnel is slower than calling SpamAssassin in cleanfeed.local.
After some tests, I've adopted the technique from Gea-Suan Lin, which can
be found here:
<http://al.howardknight.net/?STYPE=msgid&MSGI=%3Cug7sh8%24pcc%241%40colo-sc-1.gslin.com%3E>

I will update the French documentation:
<https://git.mcos.nc/INN/inn_install>

--
Stéphane
GOOGLE GROUPS USERS, you will soon lose access to Usenet.
<https://support.google.com/groups/answer/11036538>
Free replacement servers: <http://usenet-fr.yakakwatik.org>
Software: <http://usenet-fr.yakakwatik.org/lecteurs-de-news.html>

Re: Getting spamassassin and clamav as inn filters

<8m1qa1daf8.fsf@raybanana.net>

https://www.rocksolidbbs.com/computers/article-flat.php?id=2871&group=news.software.nntp#2871

Newsgroups: news.software.nntp
Path: i2pn2.org!i2pn.org!news.hispagatos.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!raybanana.eternal-september.org!.POSTED!not-for-mail
From: rayban@raybanana.net (Ray Banana)
Newsgroups: news.software.nntp
Subject: Re: Getting spamassassin and clamav as inn filters
Date: Sun, 28 Jan 2024 11:22:51 +0100
Organization: A noiseless patient spider
Lines: 38
Message-ID: <8m1qa1daf8.fsf@raybanana.net>
References: <ug14cb$a9g$15@gallifrey.nk.ca> <ug7sh8$pcc$1@colo-sc-1.gslin.com>
<ug85eb$435p$1@news.trigofacile.com> <up55dd$6jv$1@rasp.pasdenom.info>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: raybanana.eternal-september.org; posting-host="9539ecef5f12bf8cc58827fd6470a152";
logging-data="4063256"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+V0PAFfmbPt6j88DONDj0Xhcf1Br6caIM="
User-Agent: Plonkenlights
Cancel-Lock: sha1:qIN1XGj3HXhIzKon4sw1KZ/cdaE=
sha1:3mfvrIJVmp4kLjIHPThBs3Jnupw=
X-Attribution: Ray Banana
 by: Ray Banana - Sun, 28 Jan 2024 10:22 UTC

Thus spake yamo' <yamo@beurdin.invalid>
[...]
> After some tests, I've adopted the technique from Gea-Suan Lin, it could
> be found here :
> <http://al.howardknight.net/?STYPE=msgid&MSGI=%3Cug7sh8%24pcc%241%40colo-sc-1.gslin.com%3E>

For performance reasons, especially if you receive a full text feed, I
would recommend using spamd instead of starting spamassassin for every
article:

use Mail::SpamAssassin::Client;   # once, at the top of cleanfeed.local

# The rest goes inside local_filter_last. $spamd_port and $spamd_host are
# placeholders for your own spamd port and host.
my %myhdr = %hdr;
delete $myhdr{__BODY__};
delete $myhdr{__LINES__};
my $header_str = join "\n", map { "$_: $myhdr{$_}" } keys %myhdr;
my $article_str = "$header_str\n\n$hdr{__BODY__}";
my $spamtest = Mail::SpamAssassin::Client->new({
    port     => $spamd_port,
    host     => $spamd_host,
    username => 'news'});   # Use ~news/.spamassassin/user_prefs

my $result = $spamtest->process($article_str);
# No usable answer from spamd: accept the article (an assumed policy, adjust as needed).
return unless ref $result;
my $score = $result->{score};

INN::syslog('notice', $hdr{'Message-ID'} . " Score: $score, isspam: " . $result->{isspam} );
if ($result->{isspam} eq 'True') {
    # [...] local processing, nocemize etc.
    return 'SPAM';
} else {
    # [...] local processing
}

--
Пу́тін — хуйло́
https://www.eternal-september.org

Re: Getting spamassassin and clamav as inn filters

<up684d$3gp$1@rasp.pasdenom.info>

https://www.rocksolidbbs.com/computers/article-flat.php?id=2872&group=news.software.nntp#2872

Newsgroups: news.software.nntp
Path: i2pn2.org!i2pn.org!news.niel.me!pasdenom.info!.POSTED.2a01:e0a:21:ea80:b768:587:2545:afd1!not-for-mail
From: yamo@beurdin.invalid (yamo')
Newsgroups: news.software.nntp
Subject: Re: Getting spamassassin and clamav as inn filters
Date: Sun, 28 Jan 2024 19:58:20 +0100
Organization: <http://pasdenom.info/news.html>
Message-ID: <up684d$3gp$1@rasp.pasdenom.info>
References: <ug14cb$a9g$15@gallifrey.nk.ca> <ug7sh8$pcc$1@colo-sc-1.gslin.com>
<ug85eb$435p$1@news.trigofacile.com> <up55dd$6jv$1@rasp.pasdenom.info>
<8m1qa1daf8.fsf@raybanana.net>
Reply-To: yamo@groumpf.org
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 28 Jan 2024 18:58:21 -0000 (UTC)
Injection-Info: rasp.pasdenom.info; posting-account="stephane@usenet"; posting-host="2a01:e0a:21:ea80:b768:587:2545:afd1";
logging-data="3609"; mail-complaints-to="abuse@pasdenom.info"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.18.1
Cancel-Lock: sha1:gyxzCDgeS35/SIlpPnnkk6gdhuU= sha256:GFxfvGP7wPnEf+E39NSbtfrc9YR7wajI6ecQZNfLSos=
sha1:YLYXUlYlytwF1Oqd/UhWeg1AJ70= sha256:vmHPP8OIoT5LnkOplvTkmWOjDVqBrXAFBM5Gk7/IDXY=
X-Seamonkey: <https://www.seamonkey-project.org/>
In-Reply-To: <8m1qa1daf8.fsf@raybanana.net>
 by: yamo' - Sun, 28 Jan 2024 18:58 UTC

Hi Ray,

Ray Banana wrote on 28/01/2024 11:22:
> Thus spake yamo' <yamo@beurdin.invalid>
>
> [...]
>> After some tests, I've adopted the technique from Gea-Suan Lin, it could
>> be found here :
>> <http://al.howardknight.net/?STYPE=msgid&MSGI=%3Cug7sh8%24pcc%241%40colo-sc-1.gslin.com%3E>
>
> For performance reasons, especially if you receive a full text feed, I
> would recommend to use spamd instead of starting spamassassin for every
> article:

Thanks!

It works but I have to test a little more.

--
Stéphane
GOOGLE GROUPS USERS, you will soon lose access to Usenet.
<https://support.google.com/groups/answer/11036538>
Free replacement servers: <http://usenet-fr.yakakwatik.org>
Software: <http://usenet-fr.yakakwatik.org/lecteurs-de-news.html>
