devel / comp.lang.perl.misc / Parsing an email message

Subject                              Author
* Parsing an email message           Bernie Cosell
+- Re: Parsing an email message      Rainer Weikusat
+- Re: Parsing an email message      Henry Law
+- Re: Parsing an email message      Andreas Karrer
`- Re: Parsing an email message      Bernie Cosell

Parsing an email message

<lmgptg9u4a7vitb5g7s9vrvbtdmtqc4gor@4ax.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=257&group=comp.lang.perl.misc#257

Path: i2pn2.org!i2pn.org!aioe.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: bernie@fantasyfarm.com (Bernie Cosell)
Newsgroups: comp.lang.perl.misc
Subject: Parsing an email message
Date: Mon, 10 Jan 2022 18:37:26 -0500
Organization: Fantasy Farm Fibers
Lines: 16
Message-ID: <lmgptg9u4a7vitb5g7s9vrvbtdmtqc4gor@4ax.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Injection-Info: reader02.eternal-september.org; posting-host="d839a48ff028b0a7343c68d3acf624be";
logging-data="9604"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18YwSlotMx5OTXn/PSbFAqJ"
User-Agent: ForteAgent/8.00.32.1272
Cancel-Lock: sha1:k8PShlzRdxz88JFaMoKUkZah6Yc=
 by: Bernie Cosell - Mon, 10 Jan 2022 23:37 UTC

I need to parse an email message and pull its various parts apart. Is
there some not-so-difficult way to do it? Corriel looks like it would be
just the thing, unfortunately it won't run on Windows. The Mail:: and
Email:: modules seem very complicated when all I want to do is feed it a
complete message and get at the various pieces [body, attachments, etc] and
the headers [from, date, etc]. Is there a _simple_ package that'll do
that? If not, are there tutorials or the like for Mail:: and/or Email::?
They seem to be much more focused on managing actual mailboxes {Mail::} and
*composing* emails [Email::] and give pretty short shrift [to my struggling
with the man pages] to just *parsing* an email. Thanks!

/Bernie\
--
Bernie Cosell Fantasy Farm Fibers
bernie@fantasyfarm.com Pearisburg, VA
--> Too many people, too few sheep <--

Re: Parsing an email message

<87r19e89ha.fsf@doppelsaurus.mobileactivedefense.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=258&group=comp.lang.perl.misc#258

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: rweikusat@talktalk.net (Rainer Weikusat)
Newsgroups: comp.lang.perl.misc
Subject: Re: Parsing an email message
Date: Tue, 11 Jan 2022 17:31:45 +0000
Lines: 14
Message-ID: <87r19e89ha.fsf@doppelsaurus.mobileactivedefense.com>
References: <lmgptg9u4a7vitb5g7s9vrvbtdmtqc4gor@4ax.com>
Mime-Version: 1.0
Content-Type: text/plain
X-Trace: individual.net Wj6hBsUsq/WL7xPjB6yEjAOojJgvl4xIVTDzvQcmTPBNsVDKA=
Cancel-Lock: sha1:4dcsu6eHMaqTst8Qguf4swiIBWI= sha1:ln4ox+STtqpWlJWrRssFX3zvXXs=
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)
 by: Rainer Weikusat - Tue, 11 Jan 2022 17:31 UTC

Bernie Cosell <bernie@fantasyfarm.com> writes:
> I need to parse an email message and pull its various parts apart. Is
> there some not-so-difficult way to do it? Corriel looks like it would be
> just the thing, unfortunately it won't run on Windows. The Mail:: and
> Email:: modules seem very complicated when all I want to do is feed it a
> complete message and get at the various pieces [body, attachments, etc] and
> the headers [from, date, etc]. Is there a _simple_ package that'll do
> that? If not, are there tutorials or the like for Mail:: and/or Email::?
> They seem to be much more focused on managing actual mailboxes {Mail::} and
> *composing* emails [Email::] and give pretty short shrift [to my struggling
> with the man pages] to just *parsing* an email. Thanks!

There is no simple way to parse an e-mail message: That's literally the
most complicated grammar I ever wrote a parser for.

Re: Parsing an email message

<SfidnQNjndEIlkP8nZ2dnUU7-dWdnZ2d@giganews.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=259&group=comp.lang.perl.misc#259

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!buffer2.nntp.dca1.giganews.com!news.giganews.com.POSTED!not-for-mail
NNTP-Posting-Date: Tue, 11 Jan 2022 16:58:29 -0600
From: news@lawshouse.org (Henry Law)
Subject: Re: Parsing an email message
Newsgroups: comp.lang.perl.misc
References: <lmgptg9u4a7vitb5g7s9vrvbtdmtqc4gor@4ax.com>
User-Agent: Pan/0.145 (Duplicitous mercenary valetism; d7e168a
git.gnome.org/pan2)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Message-ID: <SfidnQNjndEIlkP8nZ2dnUU7-dWdnZ2d@giganews.com>
Date: Tue, 11 Jan 2022 16:58:29 -0600
Lines: 24
X-Usenet-Provider: http://www.giganews.com
X-Trace: sv3-CimKUQdrl61LdqV8fCdJuywFgzlcJeb+EYBe+qrWPnztqO+VovkLjTDfb8avQ1IZlITCKauV1vON/s3!wx/3KgNYQbinWwGfVbHrzb4Sdg6MbduTVireViyfq0SrICSWXLXQ73/raP5cY+1zTvwfo44A9jRT
X-Complaints-To: abuse@giganews.com
X-DMCA-Notifications: http://www.giganews.com/info/dmca.html
X-Abuse-and-DMCA-Info: Please be sure to forward a copy of ALL headers
X-Abuse-and-DMCA-Info: Otherwise we will be unable to process your complaint properly
X-Postfilter: 1.3.40
X-Original-Bytes: 2207
 by: Henry Law - Tue, 11 Jan 2022 22:58 UTC

On Mon, 10 Jan 2022 18:37:26 -0500, Bernie Cosell wrote:

> Is there a _simple_ package that'll do that? If not, are there
> tutorials or the like for Mail:: and/or Email::?

I use Email::MIME. How "simple" it is depends on your point of view but,
as someone else has already observed, MIME email has a complicated
structure (e.g. separate parts within one message are themselves
Email::MIME structures), and you're not going to get a /simple/ piece of
code that understands that.

However, if you pass the text of a single message to Email::MIME, the
object will then give you a "header_pairs" method, which will give you a
great deal of what you need. And there's a "body" method which will give
you the body, surprisingly.
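
As a minimal sketch of the two methods just mentioned (not Henry's own code): the sample message below is hypothetical and stands in for the text of a real one, and the variable names are only for illustration.

```perl
use strict;
use warnings;
use Email::MIME;

# Hypothetical sample message; in practice this would be read from a file.
my $message = <<'EOM';
From: sender@example.com
To: recipient@example.com
Subject: Test message
Date: Mon, 10 Jan 2022 18:37:26 -0500

Hello, this is the body.
EOM

my $email = Email::MIME->new($message);

# header_pairs returns a flat list: name, value, name, value, ...
my @pairs = $email->header_pairs;
for (my $i = 0; $i < @pairs; $i += 2) {
    print "$pairs[$i]: $pairs[$i + 1]\n";
}

# A single header can also be fetched by name.
my $subject = $email->header('Subject');

# body returns the content with any transfer encoding undone.
my $body = $email->body;
print "\n$body";
```

For a multipart message the interesting content lives in the subparts rather than in the top-level body, so this only covers the simple single-part case.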

If you want to send me a mail (address is valid) I can let you have great
wodges of code that does this stuff; maybe reading through it and taking
out the bits you don't need might help you. It's object-oriented so you
might even be able to use the packages.

--
Henry Law n e w s @ l a w s h o u s e . o r g
Manchester, England

Re: Parsing an email message

<slrnsts7ir.3ut2a.ak-1a@chimborazo.ee.ethz.ch>

https://www.rocksolidbbs.com/devel/article-flat.php?id=260&group=comp.lang.perl.misc#260

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: ak-1a@gmx.ch (Andreas Karrer)
Newsgroups: comp.lang.perl.misc
Subject: Re: Parsing an email message
Date: 12 Jan 2022 00:18:35 GMT
Lines: 48
Message-ID: <slrnsts7ir.3ut2a.ak-1a@chimborazo.ee.ethz.ch>
References: <lmgptg9u4a7vitb5g7s9vrvbtdmtqc4gor@4ax.com>
X-Trace: individual.net UghnHDWbBsFuN3u//R12zgRbWwXIEoFWUWCFPZ5jxPaClOA3s3
Cancel-Lock: sha1:xbFaOauDJVjPPzk6B0fT3r5B6Lw=
User-Agent: slrn/1.0.3 (Linux)
 by: Andreas Karrer - Wed, 12 Jan 2022 00:18 UTC

* Bernie Cosell <bernie@fantasyfarm.com>:
> I need to parse an email message and pull its various parts apart. Is
> there some not-so-difficult way to do it? Corriel looks like it would be

There is no really simple way because mail headers and MIME are not
simple. A MIME message may be an arbitrarily complex tree of parts,
parts may be items of a whole lot of media types such as text, html,
images, videos, pdf etc. Then there is the further complexity of
"multipart/alternative", where you will have to decide by some
heuristic which of the alternatives you want to extract or display.

I'd recommend Email::MIME, maybe that qualifies as "not-so-difficult".

"arbitrarily complex tree" is a hint that a recursive approach should
be used.

This skeleton passes the mail message in $message to Email::MIME for
parsing. The "showparts" function then displays a summary of each direct
subpart and calls itself recursively for that subpart. It uses
Email::MIME::ContentType to parse the "Content-Type" headers, which may
be quite complex, too.

use Email::MIME;
use Email::MIME::ContentType;

# $message must already hold the complete text of one mail message.
my $email = Email::MIME->new($message);

# Print a one-line summary of each direct subpart of $item,
# then recurse into that subpart.
sub showparts {
    my ($item, $indent) = @_;
    my $i = 1;
    for my $part ($item->subparts) {
        my $ct  = parse_content_type($part->content_type);
        my $len = length $part->body;
        print "part$indent $i: $ct->{type}/$ct->{subtype}, $len bytes\n";
        showparts($part, "$indent $i");
        $i++;
    }
}
showparts($email, "");
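
The "multipart/alternative" decision mentioned earlier could be handled with the same kind of recursion. The following is only a sketch of one possible heuristic (prefer the first text/plain part, fall back to the first text/html part); the sample message and the helper names leaf_parts and pick_text are made up for illustration.

```perl
use strict;
use warnings;
use Email::MIME;
use Email::MIME::ContentType;

# Hypothetical two-alternative message used only for this illustration.
my $message = <<'EOM';
From: sender@example.com
Subject: Alternative test
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="XYZ"

--XYZ
Content-Type: text/plain; charset=us-ascii

Plain version.
--XYZ
Content-Type: text/html; charset=us-ascii

<p>HTML version.</p>
--XYZ--
EOM

# Return every part that has no subparts of its own (the leaves of the tree).
sub leaf_parts {
    my ($item) = @_;
    my @sub = $item->subparts;
    return $item unless @sub;
    return map { leaf_parts($_) } @sub;
}

# Assumed heuristic: first text/plain leaf, else first text/html leaf.
sub pick_text {
    my ($item) = @_;
    my %first;
    for my $part (leaf_parts($item)) {
        my $ct = parse_content_type($part->content_type);
        $first{ lc "$ct->{type}/$ct->{subtype}" } //= $part;
    }
    return $first{'text/plain'} // $first{'text/html'};
}

my $email  = Email::MIME->new($message);
my $chosen = pick_text($email);
print $chosen->body if $chosen;
```

Whether plain text or HTML should win depends on what the result is for; a program that displays mail might reasonably prefer the HTML alternative instead.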

If you are, for example, just interested in all pdf attachments, it
might be enough to filter out the parts with a Content-Type of
application/pdf or application/x-pdf.
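
A rough sketch of that filter, using Email::MIME's walk_parts to visit every part instead of hand-written recursion. The sample message (with a tiny base64 payload standing in for a real pdf) and the variable names are assumptions for illustration only.

```perl
use strict;
use warnings;
use Email::MIME;
use Email::MIME::ContentType;

# Hypothetical message with one text part and one (tiny, fake) pdf attachment.
my $message = <<'EOM';
From: sender@example.com
Subject: Report attached
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XYZ"

--XYZ
Content-Type: text/plain; charset=us-ascii

See the attached report.
--XYZ
Content-Type: application/pdf; name="report.pdf"
Content-Disposition: attachment; filename="report.pdf"
Content-Transfer-Encoding: base64

JVBERi0xLjQK
--XYZ--
EOM

my $email = Email::MIME->new($message);

# Collect every non-container part whose type is application/pdf or application/x-pdf.
my @pdfs;
$email->walk_parts(sub {
    my ($part) = @_;
    return if $part->subparts;    # skip multipart containers
    my $ct = parse_content_type($part->content_type);
    push @pdfs, $part
        if lc("$ct->{type}/$ct->{subtype}") =~ m{^application/(?:x-)?pdf$};
});

for my $pdf (@pdfs) {
    my $name = $pdf->filename // '(unnamed)';
    # body returns the decoded bytes, ready to be written to a file in binary mode.
    printf "%s: %d bytes\n", $name, length $pdf->body;
}
```

Some senders label pdf files as application/octet-stream, so checking the filename extension as well may be worthwhile in practice.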

- Andi

Re: Parsing an email message

<almguglnrqndekoml5odpan7tavfk5n54s@4ax.com>

https://www.rocksolidbbs.com/devel/article-flat.php?id=261&group=comp.lang.perl.misc#261

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: bernie@fantasyfarm.com (Bernie Cosell)
Newsgroups: comp.lang.perl.misc
Subject: Re: Parsing an email message
Date: Wed, 19 Jan 2022 13:39:52 -0500
Organization: Fantasy Farm Fibers
Lines: 15
Message-ID: <almguglnrqndekoml5odpan7tavfk5n54s@4ax.com>
References: <lmgptg9u4a7vitb5g7s9vrvbtdmtqc4gor@4ax.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Injection-Info: reader02.eternal-september.org; posting-host="0ead2ba44799486a90e8c986a281975b";
logging-data="18857"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+awgFhWkObz3mkClHgpyGg"
User-Agent: ForteAgent/8.00.32.1272
Cancel-Lock: sha1:WhUlbvDJi5OPvdmf8DwJICtTOlg=
 by: Bernie Cosell - Wed, 19 Jan 2022 18:39 UTC

Bernie Cosell <bernie@fantasyfarm.com> wrote:

} I need to parse an email message and pull its various parts apart. Is
} there some not-so-difficult way to do it?

Wow -- thanks for all the info. I knew MIME messages were messy but I
didn't really realize just *how* messy. I think I'll need to fine-tune
exactly what I want from the message and then focus on
finding/extracting just that.

Thanks! /Bernie\
--
Bernie Cosell Fantasy Farm Fibers
bernie@fantasyfarm.com Pearisburg, VA
--> Too many people, too few sheep <--

