regm.py
Here is a Python script I wrote to find mail in mailboxes
using quick regular expressions.
It could be useful for people with huge mail directories (and
especially for Mutt users).
Regm.py is released under the
GNU General Public License.
Documentation
The output of regm --help:
regm version 1.0 15/07/01
Filter and extract messages from mailboxes with regular expressions.
installation:
- change the defaults at the beginning of the script to fit your
needs, it's GPL code.
- put it with your Python scripts (needs Python 2.0) and do a
ln -s regm.py ~/bin/regm
bugs:
- parse encoded attachment
- only tested on Linux
- not usable as a library
usage:
regm [options] [search string] [input files]
examples of use:
regm hello ~/mail/* > ~/tmp/mbox
regm -v -f ~/tmp/mbox -b hello -o -b bye ~/mail/*
regm -vx -f ~/tmp/mbox 'hello||bye' ~/mail/*
regm -v -f ~/tmp/mbox -n -b hello ~/mail/*
regm -v -f ~/tmp/mbox -h '^to:.*@free.fr' ~/mail/*
regm -vx 't:@free.fr' > ~/tmp/mbox
regm -vN -f ~/tmp/mbox -h '^to:.*@free.fr' ~/mail/*
regm -xm 'f:joe&&hello&&!t:@free.fr'
regm -p '%40d sub: %s\n %5B\n\n' "" ~/mail/inbox
regm -v -p '%a\n\n' '' ~/news/fr.misc.bavardages.dinosaures
regm -p '%D %T\n' "" ~/mail/sent
regm -U '' duplicate.mbox > clean.mbox
Default file path for mbox is ~/mail/*
All search strings are regexes. In expert mode (-x) you can use the "||",
"&&" and "!" operators in the same string as separators between different
regexes.
Each output message is left completely unmodified by the filter.
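
The way "||", "&&" and "!" combine several regexes in one string can be
pictured with a short Python sketch. This is not regm's own parser: the
function name `matches`, the operator precedence ("||" loosest, "&&"
tighter) and the rule that "!" negates only the single regex it prefixes
are all assumptions made for illustration.

```python
import re


def matches(expression, text, case_sensitive=False):
    """Evaluate an expert-mode style expression against a block of text.

    Assumed semantics (not taken from regm's source): "||" separates
    alternatives, "&&" joins regexes that must all hold, and a leading
    "!" negates the one regex it prefixes.
    """
    # Case-insensitive by default, mirroring the documented default.
    flags = re.MULTILINE | (0 if case_sensitive else re.IGNORECASE)

    def term(pattern):
        # A leading "!" inverts the result of this single regex.
        negate = pattern.startswith("!")
        if negate:
            pattern = pattern[1:]
        found = re.search(pattern, text, flags) is not None
        return found != negate

    # Any alternative may match; within one, every term must match.
    return any(
        all(term(part) for part in alternative.split("&&"))
        for alternative in expression.split("||")
    )
```

With these assumptions, `matches("hello||bye", text)` accepts a message
containing either word, while `matches("hello&&!bye", text)` accepts only
messages that contain "hello" and do not contain "bye".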
options:
-h string search string in message header
-b string search string in message body
-n negation of the following -h or -b
-N global negation (invert the filter output)
-o "or" (between -h or -b). Default is "and"
-f file output file. There is a warning if the output file exists.
Default output is stdout.
-x xpert mode
no -h or -b option, use of && || !, the search string
must be after all options (and before input path)
's:' is '^Subject: .*'
'd:' is '^Date: .*'
'f:' is '^From: .*'
'e:' is '^Sender: .*'
't:' is '^To: .*'
'c:' is '^Cc: .*'
'r:' is '^Reply-To: .*'
'i:' is '^Message-ID: .*'
'g:' is '^References: .*'
'a:' is '^Approved: .*'
'x:' is '^X-Loop: .*'
'n:' is '^newsgroups: .*'
'h:' is equivalent to the -h option, default is
searching in body
-u case sensitive. Default is case insensitive
-p string output format, syntax : \n \t and
%[number][sdfetcrixnagBDFETCR] for subject, date, from,...
B is body, other uppercase are for stripped mail
addresses and D for date with "%D" format
-m direct launch of mutt on the result (using temp output
file)
-U discard duplicate messages (with same message-id)
-D string change output date format of -p "%D" option
-q quiet
-v verbosity
--help this help
--version version
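
As an illustration of what the -U option describes (discarding messages
that share a message-id), here is a minimal sketch built on the standard
`mailbox` module of current Python 3. It is not regm's implementation:
the function name `unique_messages` is hypothetical, and keeping every
message that has no Message-ID header is an assumption.

```python
import mailbox


def unique_messages(mbox_path):
    """Yield messages from an mbox file, skipping repeated Message-IDs.

    Assumption: messages without a Message-ID header are always kept,
    since there is nothing to compare them by.
    """
    seen = set()
    # create=False: raise an error instead of creating a missing file.
    for message in mailbox.mbox(mbox_path, create=False):
        message_id = message.get("Message-ID")
        if message_id is not None:
            if message_id in seen:
                continue  # a copy was already yielded
            seen.add(message_id)
        yield message
```

Called as `unique_messages("duplicate.mbox")`, it yields the first copy of
each message in file order, similar in spirit to
`regm -U '' duplicate.mbox > clean.mbox`.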
Other tools that you may consider:
- grepmail
http://grepmail.sourceforge.net
Search mailboxes for mail matching a regular expression.
Grepmail is a lot more powerful than regm, with many options.
- archivemail
http://archivemail.sourceforge.net
A tool written in Python for archiving and compressing old email in mailboxes.
Extraction based on date, gzip of the resulting mailbox, and many other useful options.