
More Info On Agent Filtering - Perl Regular Expression Tutorial

<v48hnipu1qm6an8m561hnqapscbv41ihv0@4ax.com>


https://www.rocksolidbbs.com/computers/article-flat.php?id=5540&group=alt.usenet.offline-reader.forte-agent#5540

Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx12.iad.POSTED!not-for-mail
From: marcus@invalid.com
Newsgroups: alt.usenet.offline-reader.forte-agent
Subject: More Info On Agent Filtering - Perl Regular Expression Tutorial
Message-ID: <v48hnipu1qm6an8m561hnqapscbv41ihv0@4ax.com>
X-Newsreader: Forte Agent 1.93/32.576 English (American)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Lines: 473
X-Complaints-To: https://www.astraweb.com/aup
NNTP-Posting-Date: Tue, 12 Dec 2023 18:05:53 UTC
Date: Tue, 12 Dec 2023 12:05:57 -0600
X-Received-Bytes: 16685
 by: marcus@invalid.com - Tue, 12 Dec 2023 18:05 UTC

I got this info years back from some now dead site.

Perl Regular Expression Tutorial

Contents

1. Overview
2. Simple Regular Expressions
3. Metacharacters
4. Forbidden Characters
5. Things To Remember

Overview

A regular expression is a string of characters which
tells the searcher which string (or strings) you are looking
for. The following explains the format of regular expressions in
detail. If you are familiar with Perl, you already know the
syntax. If you are familiar with Unix, you should know that
there are subtle differences between Perl's regular expressions
and Unix' regular expressions.

Simple Regular Expressions

In its simplest form, a regular expression is just a word or
phrase to search for. For example,

gauss

would match any subject with the string "gauss" in it, or which
mentioned the word "gauss" in the subject line. Thus, subjects
with "gauss", "gaussian" or "degauss" would all be matched, as
would a subject containing the phrases "de-gauss the monitor" or
"gaussian elimination." Here are some more examples:

carbon

Finds any subject with the string "carbon" in its name, or which
mentions carbon (or carbonization or hydrocarbons or
carbon-based life forms) in the subject line.

hydro

Finds any subject with the string "hydro" in its name or
contents. Subjects with "hydro", "hydrogen" or "hydrodynamics"
are found, as well as subjects containing the words "hydroplane"
or "hydroelectric".

oxy

Finds any subject with the string "oxy" in the subject line.
This could be used to find subjects on oxygen, boxy houses or
oxymorons.

top ten

Note that spaces may be part of the regular expression. The
above expression could be used to find top ten lists. (Note that
it would also find articles on how to stop tension, since "stop
tension" contains the string "top ten".)
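
The substring behaviour described above can be tried out with a short Python sketch. Python's re module is only a stand-in for the searcher here (an assumption, as is the case-sensitive matching it does by default), and the subject lines and the helper name matching_subjects are made up for illustration.

```python
import re

# Hypothetical subject lines, invented for this illustration.
subjects = [
    "how to de-gauss the monitor",
    "gaussian elimination question",
    "my top ten lists",
    "ways to stop tension headaches",
]

def matching_subjects(pattern, lines):
    """Return the lines that contain a match for pattern anywhere in them."""
    return [line for line in lines if re.search(pattern, line)]

gauss_hits = matching_subjects("gauss", subjects)
top_ten_hits = matching_subjects("top ten", subjects)

print(gauss_hits)    # the two "gauss" subjects
print(top_ten_hits)  # the "top ten" subject and, unintentionally, "stop tension"
```

re.search is used rather than re.match because the tutorial describes matching anywhere in the subject line, not only at its start.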

Metacharacters

Some characters have a special meaning to the
searcher. These characters are called metacharacters. Although
they may seem confusing at first, they add a great deal of
flexibility and convenience to the searcher.

The period (.) is a commonly used metacharacter. It matches
exactly one character, regardless of what the character is. For
example, the regular expression:

2,.-Dimethylbutane

will match "2,2-Dimethylbutane" and "2,3-Dimethylbutane". Note
that the period matches exactly one character: it will not
match a string of characters, nor will it match the null string.
Thus, "2,200-Dimethylbutane" and "2,-Dimethylbutane" will not
be matched by the above regular expression.
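
A small Python check of the period example above, again assuming Python's re module behaves like the searcher for this operator. The variable names are hypothetical.

```python
import re

pattern = r"2,.-Dimethylbutane"
candidates = [
    "2,2-Dimethylbutane",
    "2,3-Dimethylbutane",
    "2,200-Dimethylbutane",  # three characters where the period allows one
    "2,-Dimethylbutane",     # no character where the period requires one
]

# Map each candidate to True/False depending on whether the pattern is found.
period_results = {text: bool(re.search(pattern, text)) for text in candidates}

for text, found in period_results.items():
    print(text, "matched" if found else "not matched")
```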

But what if you wanted to search for a string containing a
period? For example, suppose we wished to search for references
to pi. The following regular expression would not work:

3.14 (THIS IS WRONG!)

This would indeed match "3.14", but it would also match "3514",
"3f14", or even "3+14". In short, any string of the form "3x14",
where x is any character, would be matched by the regular
expression above.

To get around this, we introduce a second metacharacter, the
backslash (\). The backslash can be used to indicate that the
character immediately to its right is to be taken literally.
Thus, to search for the string "3.14", we would use:

3\.14 (This will work.)

This is called "quoting". We would say that the period in the
regular expression above has been quoted. In general, whenever
the backslash is placed before a metacharacter, the searcher
treats the metacharacter literally rather than invoking its
special meaning.

(Unfortunately, the backslash is used for other things besides
quoting metacharacters. Many "normal" characters take on special
meanings when preceded by a backslash. The rule of thumb is,
quoting a metacharacter turns it into a normal character, and
quoting a normal character may turn it into a metacharacter.)
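
The difference between the unquoted and quoted period can be sketched in Python as follows. The sample strings come from the text above. re.escape is a Python convenience that does the quoting for you; it is not something the tutorial itself mentions.

```python
import re

samples = ["3.14", "3514", "3f14", "3+14"]

# Unquoted period: matches any single character between "3" and "14".
unquoted_hits = [s for s in samples if re.search(r"3.14", s)]

# Quoted period: matches only a literal "." between "3" and "14".
quoted_hits = [s for s in samples if re.search(r"3\.14", s)]

# re.escape builds the quoted form from a literal string.
escaped = re.escape("3.14")

print(unquoted_hits)
print(quoted_hits)
print(escaped)
```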

Let's look at some more common metacharacters. We consider first
the question mark (?). The question mark indicates that the
character immediately preceding it may appear either zero times
or one time. Thus

m?ethane

would match either "ethane" or "methane". Similarly,

comm?a

would match either "coma" or "comma".
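
A rough Python sketch of the question mark examples. It uses re.fullmatch so that only whole-word matches count, which makes the "zero or one" rule easy to see; the searcher described in this tutorial matches substrings instead, so this is an illustration under that assumption. The helper name whole_matches is hypothetical.

```python
import re

def whole_matches(pattern, words):
    """Return the words that the pattern matches from start to end."""
    return [w for w in words if re.fullmatch(pattern, w)]

ethane_hits = whole_matches(r"m?ethane", ["ethane", "methane", "mmethane"])
comma_hits = whole_matches(r"comm?a", ["coma", "comma", "commma"])

print(ethane_hits)  # "mmethane" is left out: two m's is more than "zero or one"
print(comma_hits)   # "commma" is left out for the same reason
```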

Another metacharacter is the star (*). This indicates that the
character immediately to its left may be repeated any number of
times, including zero. Thus

ab*c

would match "ac", "abc", "abbc", "abbbc", "abbbbbbbbc", and any
string that starts with an "a", is followed by a sequence of
"b"'s, and ends with a "c".

The plus (+) metacharacter indicates that the character
immediately preceding it may be repeated one or more times. It
is just like the star metacharacter, except it doesn't match the
null string. Thus

ab+c

would not match "ac", but it would match "abc", "abbc", "abbbc",
"abbbbbbbbc" and so on.

Metacharacters may be combined. A common combination includes
the period and star metacharacters, with the star immediately
following the period. This is used to match an arbitrary string
of any length, including the null string. For example:

cyclo.*ane

would match "cyclodecane", "cyclohexane" and even "cyclones
drive me insane." Any string that starts with "cyclo", is
followed by an arbitrary string, and ends with "ane" will be
matched. Note that the null string will be matched by the
period-star pair; thus, "cycloane" would be matched by the above
expression.
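
The star, plus and period-star behaviour described above can be compared side by side in Python. This is a sketch that assumes Python's re module treats these operators the way the tutorial describes; re.fullmatch is used so that each word is tested as a whole, and the helper name whole_matches is hypothetical.

```python
import re

def whole_matches(pattern, words):
    """Return the words that the pattern matches from start to end."""
    return [w for w in words if re.fullmatch(pattern, w)]

words = ["ac", "abc", "abbc", "abbbbbbbbc", "adc"]

star_hits = whole_matches(r"ab*c", words)  # zero or more b's
plus_hits = whole_matches(r"ab+c", words)  # one or more b's

# Period-star: an arbitrary string, possibly empty, between "cyclo" and "ane".
dot_star_hits = whole_matches(
    r"cyclo.*ane",
    ["cycloane", "cyclohexane", "cyclones drive me insane", "cyclotron"],
)

print(star_hits)
print(plus_hits)
print(dot_star_hits)
```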

If you wanted to search for articles on cyclodecane and
cyclohexane, but didn't want to match articles about how
cyclones drive one insane, you could string together three
periods, as follows:

cyclo...ane

This would match "cyclodecane" and "cyclohexane", but would not
match "cyclones drive me insane." Only strings eleven characters
long which start with "cyclo" and end with "ane" will be
matched. (Note that "cyclopentane" would not be matched,
however, since cyclopentane has twelve characters, not eleven.)
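
To see how the three-period pattern narrows things down compared with period-star, here is a short Python sketch using the strings from the text. Python's re module is assumed to be an adequate stand-in for the searcher, and the variable names are hypothetical.

```python
import re

texts = [
    "cyclodecane",
    "cyclohexane",
    "cycloane",
    "cyclopentane",
    "cyclones drive me insane",
]

# Any number of characters between "cyclo" and "ane".
any_length_hits = [t for t in texts if re.search(r"cyclo.*ane", t)]

# Exactly three characters between "cyclo" and "ane".
three_char_hits = [t for t in texts if re.search(r"cyclo...ane", t)]

print(any_length_hits)
print(three_char_hits)
```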

Here are some more examples. These involve the backslash. Note
that the placement of backslash is important.

a\.*z

Matches any string starting with "a", followed by a series of
periods (including the "series" of length zero), and terminated
by "z". Thus, "az", "a.z", "a..z", "a...z" and so forth are all
matched.

a.\*z

(Note that the backslash and period are reversed in this regular
expression.)

Matches any string starting with an "a", followed by one
arbitrary character, and terminated with "*z". Thus, "ag*z",
"a5*z" and "a@*z" are all matched. Only strings of length four,
where the first character is "a", the third "*", and the fourth
"z", are matched.

a\++z

Matches any string starting with "a", followed by a series of
plus signs, and terminated by "z". There must be at least one
plus sign between the "a" and the "z". Thus, "az" is not
matched, but "a+z", "a++z", "a+++z", etc. will be matched.

a\+\+z

Matches only the string "a++z".

a+\+z

Matches any string starting with a series of "a"'s, followed by
a single plus sign and ending with a "z". There must be at least
one "a" at the start of the string. Thus "a+z", "aa+z", "aaa+z"
and so on will match, but "+z" will not.

a.?e

Matches "ace", "ale", "axe" and any other three-character string
beginning with "a" and ending with "e"; will also match "ae".

a\.?e

Matches "ae" and "a.e". No other string is matched.

a.\?e

Matches any four-character string starting with "a" and ending
with "?e". Thus, "ad?e", "a1?e" and "a%?e" will all be matched.

a\.\?e

Matches only "a.?e" and nothing else.
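
The backslash-placement examples above can be checked mechanically. The following Python sketch pairs each pattern with a few candidate strings (some taken from the text, some added as counterexamples) and keeps the ones matched as a whole. Python's re module is assumed to agree with the tutorial for these operators; the names cases and placement_results are hypothetical.

```python
import re

# Each pattern is paired with candidates: some should match, some should not.
cases = {
    r"a\.*z": ["az", "a.z", "a...z", "abz"],
    r"a.\*z": ["ag*z", "a5*z", "a*z"],
    r"a\++z": ["az", "a+z", "a+++z"],
    r"a\+\+z": ["a+z", "a++z", "a+++z"],
    r"a+\+z": ["a+z", "aaa+z", "+z"],
    r"a.?e": ["ace", "ae", "abce"],
    r"a\.?e": ["ae", "a.e", "ace"],
    r"a.\?e": ["ad?e", "a%?e", "ade"],
    r"a\.\?e": ["a.?e", "ab?e"],
}

# For every pattern, keep only the candidates it matches from start to end.
placement_results = {
    pattern: [w for w in words if re.fullmatch(pattern, w)]
    for pattern, words in cases.items()
}

for pattern, hits in placement_results.items():
    print(pattern, "->", hits)
```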

Earlier it was mentioned that the backslash can turn ordinary
characters into metacharacters, as well as the other way around.
One such use of this is the digit metacharacter, which is
invoked by following a backslash with a lower-case "d", like
this: "\d". The "d" must be lower case, for reasons explained
later. The digit metacharacter matches exactly one digit; that
is, exactly one occurrence of "0", "1", "2", "3", "4", "5", "6",
"7", "8" or "9". For example, the regular expression:

2,\d-Dimethylbutane

would match "2,2-Dimethylbutane", "2,3-Dimethylbutane" and so
forth. Similarly,

1\.\d\d\d\d\d

would match any six-digit floating-point number from 1.00000 to
1.99999 inclusive. We could combine the digit metacharacter with
other metacharacters; for instance,


Re: More Info On Agent Filtering - Perl Regular Expression Tutorial

<2udhnil0lkhe55874jjdgndq4nkbcgedvb@4ax.com>


https://www.rocksolidbbs.com/computers/article-flat.php?id=5546&group=alt.usenet.offline-reader.forte-agent#5546

Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx17.iad.POSTED!not-for-mail
From: -rf-nz-@-.invalid (Ralph Fox)
Newsgroups: alt.usenet.offline-reader.forte-agent
Subject: Re: More Info On Agent Filtering - Perl Regular Expression Tutorial
Message-ID: <2udhnil0lkhe55874jjdgndq4nkbcgedvb@4ax.com>
References: <v48hnipu1qm6an8m561hnqapscbv41ihv0@4ax.com>
User-Agent: ForteAgent/8.00.32.1272
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Face: 5gSW~"1=jGDo(BXfTrgL2BnC3tUB_\d0u@mP~wA1fvK`z8I[>1jXVVZ!N6ittQ.K<5!i3l> ==jcyAk.[B>kLg8TY{+8%edZ(le:ncPt%s8Pr?]QXNXO]0RC#V_zt|%>=bt>rZ2iCI^-yl7Be(]Ep> OfyI!3Bf|e
Lines: 47
X-Complaints-To: abuse@easynews.com
Organization: Forte - www.forteinc.com
X-Complaints-Info: Please be sure to forward a copy of ALL headers otherwise we will be unable to process your complaint properly.
Date: Wed, 13 Dec 2023 08:43:36 +1300
X-Received-Bytes: 1933
 by: Ralph Fox - Tue, 12 Dec 2023 19:43 UTC

On Tue, 12 Dec 2023 12:05:57 -0600, marcus@invalid.com wrote:

> Subject: More Info On Agent Filtering - Perl Regular Expression Tutorial

> I got this info years back from some now dead site.
>
> Perl Regular Expression Tutorial

For Agent filtering, take this tutorial with several grains of salt.
Agent's regular expressions are _not_ the same as Perl regular expressions.

For example...

> Similarly,
>
> comm?a
>
> would match either "coma" or "comma".

Agent's regular expressions are different here.

In Agent, this would match either "a" or "comma".
See part II.E of Jim Bradley's Agent Regular Expressions page at
<https://web.archive.org/web/20030122175504/http://jlbradley.home.att.net/REGULAR.HTM>.

> One such use of this is the digit metacharacter, which is
> invoked by following a backslash with a lower-case "d", like
> this: "\d".

Not in Agent.

> ab{3,5}c

Not in Agent.
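
As a rough illustration of the comm?a difference noted above: the sketch below does not run Agent's engine. It uses Python's re module, which follows the Perl reading, and emulates the behaviour described for Agent by grouping "comm" explicitly. That emulation is an assumption based only on the "a" or "comma" result stated above, and the variable names are hypothetical.

```python
import re

words = ["a", "coma", "comma"]

# Perl-style reading: the ? applies only to the second "m".
perl_style_hits = [w for w in words if re.fullmatch(r"comm?a", w)]

# Emulating the described Agent result: the ? covers all of "comm".
grouped_hits = [w for w in words if re.fullmatch(r"(comm)?a", w)]

print(perl_style_hits)
print(grouped_hits)
```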

--
Kind regards
Ralph Fox
🦊

If things were to be done twice, all would be wise.

Re: More Info On Agent Filtering - Perl Regular Expression Tutorial

<23njnitnna0ofpj0lrgjhakvsf3qdbmckl@4ax.com>


https://www.rocksolidbbs.com/computers/article-flat.php?id=5554&group=alt.usenet.offline-reader.forte-agent#5554

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!newsfeed.hasname.com!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx33.iad.POSTED!not-for-mail
From: croy@spam.invalid.net (croy)
Newsgroups: alt.usenet.offline-reader.forte-agent
Subject: Re: More Info On Agent Filtering - Perl Regular Expression Tutorial
Reply-To: croy@pam.invalid.net
Message-ID: <23njnitnna0ofpj0lrgjhakvsf3qdbmckl@4ax.com>
References: <v48hnipu1qm6an8m561hnqapscbv41ihv0@4ax.com> <2udhnil0lkhe55874jjdgndq4nkbcgedvb@4ax.com>
User-Agent: ForteAgent/8.00.32.1272
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Lines: 27
X-Complaints-To: abuse@easynews.com
Organization: Forte - www.forteinc.com
X-Complaints-Info: Please be sure to forward a copy of ALL headers otherwise we will be unable to process your complaint properly.
Date: Wed, 13 Dec 2023 08:35:12 -0800
X-Received-Bytes: 1557
 by: croy - Wed, 13 Dec 2023 16:35 UTC

On Wed, 13 Dec 2023 08:43:36 +1300, Ralph Fox <-rf-nz-@-.invalid> wrote:

>For Agent filtering, take this tutorial with several grains of salt.
>Agent's regular expressions are _not_ the same as Perl regular expressions.
>
>
>For example...
>
>> Similarly,
>>
>> comm?a
>>
>> would match either "coma" or "comma".
>
>Agent's regular expressions are different here.
>
>In Agent, this would match either "a" or "comma".
>See part II.E of Jim Bradley's Agent Regular Expressions page at
><https://web.archive.org/web/20030122175504/http://jlbradley.home.att.net/REGULAR.HTM>.

What a wonderful reference! I found that I could copy it all, and paste into MS Word 2k, and
retain *all* the formatting.

Thanks Ralph!

--
croy
