MHonArc Resource List

MIMEFILTERS


Syntax

Envariable

N/A

Element

<MIMEFILTERS>
filter-specification
...
</MIMEFILTERS>

Command-line Option

N/A


Description

The MIMEFILTERS resource hooks user-specified filters into MHonArc. It can only be set via the MIMEFILTERS element. The syntax for each line of the MIMEFILTERS element is as follows:

content-type;routine-name;file-of-routine

The definition of each semi-colon-separated value is as follows:

content-type

The MIME content-type the filter processes. An explicit content-type (base/subtype) or a base content-type (base/*) can be specified.

routine-name

The actual routine name of the filter. The name should be fully qualified by the package it is defined in (e.g. "mypackage::filter").

file-of-routine

The name of the file that defines routine-name. If the file is not a full pathname, MHonArc finds the file by looking in the standard include paths of Perl, and the paths specified by the PERLINC resource.

NOTE

For backwards compatibility, the values of a filter specification can be separated with a colon, ":". However, if you use a colon, package qualification of a function must use Perl 4 syntax (e.g. "mypackage'filter"), since the "::" qualifier itself contains colons.

Whitespace is stripped out for each filter specification.
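
As an illustration of the syntax, the following resource-file fragment registers one filter routine for two content-types. The content-types, the routine name mypackage::filter, and both file names are hypothetical and chosen only for the example. The first line relies on the Perl include paths to locate the file; the second gives a full pathname.

```
<MIMEFILTERS>
application/x-myformat; mypackage::filter; myfilter.pl
text/x-mytext;          mypackage::filter; /usr/local/lib/mhonarc/myfilter.pl
</MIMEFILTERS>
```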

Writing Filters

If you want to write your own filter for use in MHonArc, you need to know the Perl programming language. The following information assumes you know Perl.

NOTE

For historical reasons, the filter model follows Perl 4 syntax conventions and constructs. The implementation of a filter can still use Perl 5 syntax and features, where applicable, if MHonArc is running under Perl 5.

NOTE

The default filters provided by MHonArc are described in the Default Settings section.

Function Interface of Filter

MHonArc interfaces with MIME filters by calling a routine with a specific set of arguments. The prototype of the interface routine is as follows:

sub filter {
    local($head, *fields, $data, $decoded, $argstring) = @_;

    # Filter code here

    # The last statement should be the return value, unless an
    # explicit return is done. See the following for the format of the
    # return value.
}

Argument Descriptions

$head

This is the header text of the message (or body part if called in a multipart message).

*fields

A pointer (typeglob) to an associative array that breaks $head down into field label/field value pairs. The keys are the lowercase field labels. For example, to retrieve the value of the Content-Type field, use $fields{'content-type'}.

If a field occurs more than once in a header, MHonArc separates the field values in the associative array by a `\034' character. To make your filter less likely to break due to changes in MHonArc, you may use the $readmail::FieldSep variable instead of `\034'.

NOTE

Since the *fields argument is a typeglob, my cannot be used when assigning the typeglob from the @_ array. my can be used for creating local copies of the other arguments.

$data

This is a copy of the message (or body part if called in a multipart message) body.

$decoded

This flag is set to 1 if MHonArc decoded the message, in which case $data represents the original data before it was encoded by the sender. If set to 0, $data has not been decoded. Decoding fails if MHonArc does not recognize the encoding specified in the Content-Transfer-Encoding field.

MHonArc decodes the data for you if it was encoded as 7-Bit, 8-Bit, Binary, Quoted-Printable, Base64, or Uuencode (x-uuencode, uuencode, x-uue, uue).

$argstring

This is an optional argument string that may be used to modify the behavior of the filter. The format of this string is determined by the filter itself. The value of the string is set by the MIMEARGS resource.
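
The argument handling described above can be sketched with two helper routines. This is only an illustration: the package name myfilter and the routine names field_values and parse_args are hypothetical, and the "flag" / "name=value" argument format is one possible convention, since each filter defines its own $argstring format.

```perl
package myfilter;   # hypothetical package name

# Inside MHonArc this variable is already set; the fallback only lets
# the sketch run on its own.
$readmail::FieldSep = "\034" unless defined $readmail::FieldSep;

# Return every value of a (possibly repeated) header field as a list.
sub field_values {
    local(*fields, $name) = @_;     # typeglob: use local, not my
    return () unless defined $fields{$name};
    split(/\Q$readmail::FieldSep\E/, $fields{$name});
}

# Parse an argument string of space-separated "flag" and "name=value"
# words into a hash.  Quoted values containing spaces are not handled.
sub parse_args {
    my($argstring) = @_;
    my(%opt, $word);
    foreach $word (split(' ', defined $argstring ? $argstring : '')) {
        if ($word =~ /^([^=]+)=(.*)$/) {
            $opt{$1} = $2;
        } else {
            $opt{$word} = 1;
        }
    }
    %opt;
}

1;
```

Within a filter, field_values(*fields, 'received') would return one list element per Received field, and parse_args($argstring) would return a hash of the options that were set.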

Return Value

The return value is treated as a list. The first item of the list is a string representing the HTML markup to insert in the HTMLized message. An empty string may be returned to tell MHonArc that the routine was unable to filter data.

Any other list items are treated as names of any files that were generated by the filter. MHonArc needs to keep track of any extra files that a filter generates so that it can delete those files if the message gets removed from the archive.

NOTE

If the filter creates a subdirectory with files, the filter only needs to return the subdirectory in the return list. If the message gets removed, MHonArc will delete the entire directory.
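
The return convention can be illustrated with a minimal filter that saves the body to a file and links to it. This is a sketch, not one of the filters shipped with MHonArc: the package name m2h_example, the variables $OutDir and $Count, and the filename pattern are hypothetical, and returning the filename relative to the output directory is an assumption.

```perl
package m2h_example;        # hypothetical package name

$OutDir = '.';              # assumption: directory for derived files
$Count  = 0;                # used to build unique filenames

sub filter {
    local($head, *fields, $data, $decoded, $argstring) = @_;

    # An empty string tells MHonArc the data could not be filtered
    return ('') unless $decoded;

    # Write the body to a derived file
    my $file = sprintf("example%04d.dat", ++$Count);
    open(OUT, ">$OutDir/$file") || return ('');
    binmode(OUT);
    print OUT $data;
    close(OUT);

    # First item: HTML markup.  Remaining items: generated files.
    my $size = length($data);
    ("<P><A HREF=\"$file\">Attached data</A> ($size bytes)</P>\n", $file);
}

1;
```

Because the filename is included in the returned list, MHonArc can remove the derived file when the message is removed from the archive.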

Filter Writing Tips

The following recommendations/tips are given to help you write filters:

Using C

If a MIME filter requires a C program, or other non-Perl executable, a Perl wrapper must be written for the program in order to interface with MHonArc. The wrapper must follow the conventions described in Writing Filters.
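
A wrapper of this kind might look like the following sketch. It assumes a hypothetical external program, myconv, that takes a filename on its command line and writes an HTML fragment to standard output; the package name m2h_wrapper and the $Converter variable are also hypothetical. The data is handed over through a temporary file created with the standard File::Temp module.

```perl
package m2h_wrapper;        # hypothetical package name

use File::Temp qw(tempfile);

# Assumption: "myconv" is a hypothetical program that reads the file
# named on its command line and prints an HTML fragment to stdout.
$Converter = 'myconv';

sub filter {
    local($head, *fields, $data, $decoded, $argstring) = @_;
    return ('') unless $decoded;

    # Hand the data to the external program through a temporary file
    my($fh, $tmp) = tempfile('m2hwrapXXXXXX', TMPDIR => 1);
    binmode($fh);
    print $fh $data;
    close($fh);

    # Run the program and capture its standard output
    my $html   = `$Converter "$tmp"`;
    my $status = $?;
    unlink($tmp);

    # An empty string tells MHonArc the data could not be filtered
    return ('') if $status != 0 || !defined($html) || $html eq '';

    # No extra files are left behind, so only the markup is returned
    ($html);
}

1;
```

If the external program itself creates files in the archive, their names would need to be appended to the returned list as described in Return Value.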


Default Setting

<MIMEFilters>
application/*;		   m2h_external::filter;	mhexternal.pl
application/x-patch;	   m2h_text_plain::filter;	mhtxtplain.pl
audio/*;		   m2h_external::filter;	mhexternal.pl
chemical/*;		   m2h_external::filter;	mhexternal.pl
model/*;		   m2h_external::filter;	mhexternal.pl
image/*;		   m2h_external::filter;	mhexternal.pl
message/delivery-status;   m2h_text_plain::filter;	mhtxtplain.pl
message/partial;	   m2h_text_plain::filter;	mhtxtplain.pl
text/*;			   m2h_text_plain::filter;	mhtxtplain.pl
text/enriched;		   m2h_text_enriched::filter;	mhtxtenrich.pl
text/html;		   m2h_text_html::filter;	mhtxthtml.pl
text/plain;		   m2h_text_plain::filter;	mhtxtplain.pl
text/richtext;		   m2h_text_enriched::filter;	mhtxtenrich.pl
text/setext;		   m2h_text_setext::filter;	mhtxtsetext.pl
text/tab-separated-values; m2h_text_tsv::filter;	mhtxttsv.pl
text/x-html;		   m2h_text_html::filter;	mhtxthtml.pl
text/x-setext;		   m2h_text_setext::filter;	mhtxtsetext.pl
video/*;		   m2h_external::filter;	mhexternal.pl
x-sun-attachment;	   m2h_text_plain::filter;	mhtxtplain.pl
</MIMEFilters>

The following describes the behavior of each filter.


m2h_external::filter

The filter extracts the data into a separate file and puts a hyperlink to the file into the HTMLized message.

By default, the filter ignores any filename specification given in the message when writing the data to disk. A unique filename with an extension based upon the sub-type is generated.

m2h_external::filter can take the following arguments:

iconurl="url"

Use "url" as the url for the icon to use if the useicon option is set. This option will override any setting defined by the ICONS resource. The double quotes are required.

inline

Inline image data by default if no content-disposition is defined.

ext=ext

Use ext as the filename extension. The filter already has a large list of extensions for various content-types. Use this argument if you process a content-type not recognized by the filter.

type="description"

Use "description" as type description of the data. The double quotes are required. The filter already has a large list of descriptions for various content-types. Use this argument if you process a content-type not recognized by the filter.

subdir

Place derived file in a subdirectory of the archive. The subdirectory will be called "msgMSGNUM.dir". This option may be useful if usename is specified to avoid security and name conflict problems.

target=name

Set the TARGET attribute of the anchor link to the file. The default value is undefined (i.e. no TARGET attribute will be written).

useicon

Include a content-type icon with the hyperlink to the derived file. The icon used is the value of the iconurl option or the icon defined by the ICONS resource.

usename

Use (file)name attribute for determining name of derived file. Use this option with caution since it can lead to filename conflicts and security problems (however, see the subdir option).

usenameext

Use (file)name attribute for determining the extension of derived file. Use this option with caution since it can lead to security problems (however, see the subdir option).

All arguments should be separated by at least one space.
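
As an illustration, arguments such as these are passed to the filter through the MIMEARGS resource. The fragment below is a sketch that assumes MIMEARGS lines take the form content-type;argument-string, by analogy with MIMEFILTERS; the MIMEARGS resource page gives the authoritative syntax. The icon URL is hypothetical.

```
<MIMEARGS>
image/*;       inline
application/*; usename subdir useicon iconurl="/icons/binary.gif"
</MIMEARGS>
```

Here subdir is combined with usename, as recommended in the option descriptions, so that attachment names taken from messages cannot collide with other files in the archive.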

The following table shows the default list of content-types with the filename extension used and a short description that m2h_external::filter recognizes:

Content-type Extension Description
application/astound asd Astound presentation
application/envoy evy Envoy file
application/fastman lcc fastman file
application/fractals fif Fractal Image Format
application/iges iges IGES file
application/mac-binhex40 hqx Mac BinHex archive
application/mathematica ma Mathematica Notebook document
application/mbedlet mbd mbedlet file
application/msword doc MS-Word document
application/octet-stream bin Binary data
application/oda oda ODA file
application/pdf pdf Adobe PDF document
application/pgp pgp PGP message
application/pgp-signature pgp PGP signature
application/postscript ps PostScript document
application/rtf rtf RTF file
application/sgml sgml SGML document
application/studiom smp Studio M file
application/timbuktu tbt timbuktu file
application/vis5d v5d Vis5D dataset
application/vnd.framemaker fm FrameMaker document
application/vnd.hp-hpgl hpg HPGL file
application/vnd.mif mif Frame MIF document
application/vnd.ms-excel xls MS-Excel spreadsheet
application/vnd.ms-powerpoint ppt MS-Powerpoint presentation
application/vnd.ms-project mpp MS-Project file
application/winhlp hlp WinHelp document
application/wordperfect5.1 wp WordPerfect 5.1 document
application/x-asap asp asap file
application/x-bcpio bcpio BCPIO file
application/x-compress Z Unix compressed data
application/x-cpio cpio CPIO file
application/x-csh csh C-Shell script
application/x-dot dot dot file
application/x-dvi dvi TeX dvi file
application/x-earthtime etc Earthtime file
application/x-envoy evy Envoy file
application/x-excel xls MS-Excel spreadsheet
application/x-gtar gtar GNU Unix tar archive
application/x-gzip gz GNU Zip compressed data
application/x-hdf hdf HDF file
application/x-javascript js JavaScript source
application/x-ksh ksh Korn Shell script
application/x-latex latex LaTeX document
application/x-maker fm FrameMaker document
application/x-mif mif Frame MIF document
application/x-mocha moc mocha file
application/x-msaccess mdb MS-Access database
application/x-mscardfile crd MS-CardFile
application/x-msclip clp MS-Clip file
application/x-msmediaview m14 MS-Media View file
application/x-msmetafile wmf MS-Metafile
application/x-msmoney mny MS-Money file
application/x-mspublisher pub MS-Publisher document
application/x-msschedule scd MS-Schedule file
application/x-msterminal trm MS-Terminal
application/x-mswrite wri MS-Write document
application/x-net-install ins Net Install file
application/x-netcdf cdf Cdf file
application/x-ns-proxy-autoconfig proxy Netscape Proxy Auto Config
application/x-patch patch Source code patch
application/x-perl pl Perl program
application/x-pointplus css pointplus file
application/x-salsa slc salsa file
application/x-script script A script file
application/x-sh sh Bourne shell script
application/x-shar shar Unix shell archive
application/x-sprite spr sprite file
application/x-stuffit sit Macintosh archive
application/x-sv4cpio sv4cpio SV4Cpio file
application/x-sv4crc sv4crc SV4Crc file
application/x-tar tar Unix tar archive
application/x-tcl tcl Tcl script
application/x-tex tex TeX document
application/x-texinfo texinfo TeXInfo document
application/x-timbuktu tbp timbuktu file
application/x-tkined tki tkined file
application/x-troff roff Troff document
application/x-troff-man man Unix manual page
application/x-troff-me me Troff ME-macros document
application/x-troff-ms ms Troff MS-macros document
application/x-ustar ustar UStar file
application/x-wais-source src WAIS Source
application/x-zip-compressed zip Zip compressed data
application/zip zip Zip archive
audio/basic snd Basic audio
audio/echospeech es Echospeech audio
audio/microsoft-wav wav Wave audio
audio/midi midi MIDI audio
audio/x-aiff aif AIF audio
audio/x-epac pae epac audio
audio/x-midi midi MIDI audio
audio/x-mpeg mp2 MPEG audio
audio/x-pac pac pac audio
audio/x-pn-realaudio ra PN Realaudio
audio/x-wav wav Wave audio
chemical/chem3d c3d Chem3d chemical test
chemical/chemdraw chm Chemdraw chemical test
chemical/cif cif CIF chemical test
chemical/cml cml CML chemical test
chemical/cxf cxf Chemical Exhange Format file
chemical/daylight-smiles smi SMILES format file
chemical/embl-dl-nucleotide emb EMBL nucleotide format file
chemical/gaussian-input gau Gaussian chemical test
chemical/gcg8-sequence gcg GCG format file
chemical/genbank gen GENbank data
chemical/jcamp-dx jdx Jcamp chemical spectra test
chemical/kinemage kin Kinemage chemical test
chemical/macromodel-input mmd Macromodel chemical test
chemical/mdl-molfile mol MOL mdl chemical test
chemical/mdl-rdf rdf RDF chemical test
chemical/mdl-rxn rxn RXN chemical test
chemical/mdl-sdf sdf SDF chemical test
chemical/mdl-tgf tgf TGF chemical test
chemical/mif mif MIF chemical test
chemical/mopac-input mop MOPAC data
chemical/ncbi-asn1 asn NCBI data
chemical/pdb pdb PDB chemical test
chemical/rosdal ros Rosdal data
image/bmp bmp Windows bitmap
image/cgm cgm Computer Graphics Metafile
image/fif fif Fractal Image Format image
image/g3fax g3f Group III FAX image
image/gif gif GIF image
image/ief ief IEF image
image/ifs ifs IFS image
image/jpeg jpg JPEG image
image/png png PNG image
image/tiff tif TIFF image
image/vnd dwg VND image
image/wavelet wi Wavelet image
image/x-cmu-raster ras CMU raster
image/x-pbm pbm Portable bitmap
image/x-pcx pcx PCX image
image/x-pgm pgm Portable graymap
image/x-pict pict Mac PICT image
image/x-pnm pnm Portable anymap
image/x-portable-anymap pnm Portable anymap
image/x-portable-bitmap pbm Portable bitmap
image/x-portable-graymap pgm Portable graymap
image/x-portable-pixmap ppm Portable pixmap
image/x-ppm ppm Portable pixmap
image/x-rgb rgb RGB image
image/x-xbitmap xbm X bitmap
image/x-xbm xbm X bitmap
image/x-xpixmap xpm X pixmap
image/x-xpm xpm X pixmap
image/x-xwd xwd X window dump
image/x-xwindowdump xwd X window dump
model/iges iges IGES model
model/mesh mesh Mesh model
model/vrml wrl VRML model
text/enriched rtx Text-enriched document
text/html html HTML document
text/plain txt Text document
text/richtext rtx Richtext document
text/setext stx Setext document
text/sgml sgml SGML document
text/tab-separated-values tsv Tab separated values
text/x-speech talk Speech document
video/isivideo fvi isi video
video/mpeg mpg MPEG movie
video/msvideo avi MS Video
video/quicktime mov QuickTime movie
video/vivo viv vivo video
video/wavelet wv Wavelet video
video/x-sgi-movie movie SGI movie

m2h_text_enriched::filter

This filter is designed to process text/enriched, or text/richtext, data. The following table summarizes the translation of text/enriched commands to HTML tags:

Text/Enriched Command  HTML Translation
<Bold> <B>
<Italic> <I>
<Underline> <U>
<Fixed> <TT>
<Smaller> <SMALL>
<Bigger> <BIG>
<FontFamily><Param>family</Param> <FONT face="family">
<Color><Param>color</Param> <FONT color="color">
<Center> <P align="center">
<FlushLeft> <P align="left">
<FlushRight> <P align="right">
<FlushBoth> <P align="both"> (not supported in HTML)
<ParaIndent> <BLOCKQUOTE>
<Excerpt> <BLOCKQUOTE>
<Lang> Stripped

If the text/enriched data contains non-ASCII characters, the filter will convert the characters to the appropriate entity references.

NOTE

Only the ISO-8859-[1-10] character sets are recognized.
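
The one-to-one rows of the translation table above can be sketched as a lookup table plus a single substitution. This is not the code of m2h_text_enriched::filter; the package name, %TagMap, and translate_simple are hypothetical, and commands that take a <Param>, the alignment commands, the "<<" escape, and character set conversion are deliberately left out.

```perl
package m2h_enriched_sketch;    # hypothetical package name

# Simple commands from the table that map directly onto one HTML tag
%TagMap = (
    'bold'       => 'B',
    'italic'     => 'I',
    'underline'  => 'U',
    'fixed'      => 'TT',
    'smaller'    => 'SMALL',
    'bigger'     => 'BIG',
    'paraindent' => 'BLOCKQUOTE',
    'excerpt'    => 'BLOCKQUOTE',
);

# Replace known simple commands; leave every other tag untouched.
sub translate_simple {
    my($text) = @_;
    # $1 is "/" for a closing command, $2 is the command name
    $text =~ s{<(/?)([A-Za-z]+)>}
              {exists $TagMap{lc $2} ? '<' . $1 . $TagMap{lc $2} . '>'
                                     : '<' . $1 . $2 . '>'}ge;
    $text;
}

1;
```

Command names are matched case-insensitively by lowercasing them before the lookup.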


m2h_text_html::filter

This filter is designed to process text/html or text/x-html data. The filter modifies the HTML so it can be included validly into the message page. Any head data will be stripped out, but the title will be extracted and prepended to the body data.


m2h_text_plain::filter

This filter is designed to process text/plain messages and messages with no MIME information. The filter is also used to process text messages of an unknown subtype.

The default behavior of the filter is to wrap the data in the HTML PRE element and escape special characters. It will also convert text that looks like a URL into a hyperlink. If the data contains non-ASCII characters, the filter will convert the characters to the appropriate entity references.

NOTE

Only the ISO-8859-[1-10] and ISO-2022-JP character sets are recognized. For ISO-2022-JP data, a Web client with ISO-2022-JP support is required to read the data.

m2h_text_plain::filter can take the following arguments:

asis=set1:...

Colon-separated list of charsets to leave as-is. Only HTML special characters will be converted into entities.

default=charset

Character set to use as the default if no character set is defined for the message. If this option is not specified, "us-ascii" is used.

keepspace

Preserve all spaces if the nonfixed option is specified. All spaces and tabs will be translated to the equivalent number of &nbsp; entity references.

maxwidth=#

Force the maximum width of lines to be # characters in length. Any lines longer than # characters will be wrapped.

nonfixed

Do not wrap message text in the HTML PRE element. This will cause text to be rendered in the default font (which is normally proportionally spaced). Each line of the message will have a <BR> appended in order to preserve the line representation of the message.

nourl

Do not hyperlink URLs.

quote

Italicize quoted message text.

target=name

Set the TARGET attribute of anchor links generated from hyperlinking URLs.

All arguments should be separated by at least one space.
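
For example, the following sketch wraps lines longer than 78 characters, italicizes quoted text, and sets TARGET="_top" on hyperlinked URLs. It assumes that the MIMEARGS element accepts lines of the form content-type;argument-string; that form is an assumption here, and the MIMEARGS resource page is authoritative.

```
<MIMEARGS>
text/plain; maxwidth=78 quote target=_top
</MIMEARGS>
```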


m2h_text_setext::filter

This filter converts text/setext and text/x-setext messages to HTML.


m2h_text_tsv::filter

This filter converts text/tab-separated-values to HTML. The tabular data will be converted into an HTML table.


Resource Variables

N/A


Examples

The following code is a filter for HTML message data (code extracted from the HTML filter provided with MHonArc):

##---------------------------------------------------------------------------##
##    Copyright (C) 1995-1998	Earl Hood, earlhood@usa.net
##---------------------------------------------------------------------------##

package m2h_text_html;

$Url	= '(\w+://|\w+:)';	# Beginning of URL match expression

##---------------------------------------------------------------------------
##	The filter must modify HTML content parts for merging into the
##	final filtered HTML messages.  Modification is needed so the
##	resulting filtered message is valid HTML.
##
sub filter {
    local($header, *fields, *data, $isdecode, $args) = @_;
    local($base, $title, $tmp);
    $base 	= '';
    $title	= '';
    $tmp	= '';

    ## Get/remove title
    if ($data =~ s%<title\s*>([^<]*)</title\s*>%%i) {
        $title = "<ADDRESS>Title: <STRONG>$1</STRONG></ADDRESS>\n";
    }
    ## Get/remove BASE url
    if ($data =~ s%(<base\s[^>]*>)%%i) {
        $tmp = $1;
        if ($tmp =~ m|href\s*=\s*['"]([^'"]+)['"]|i) {
	    $base = $1;
	} elsif ($tmp =~ m|href\s*=\s*([^\s>]+)|i) {
	    $base = $1;
	}
    } elsif ((defined($tmp = $fields{'content-base'}) ||
	      defined($tmp = $fields{'content-location'})) &&
	     ($tmp =~ m%/%)) {
	($base = $tmp) =~ s/['"\s]//g;
    }
    $base =~ s%(.*/).*%$1%;

    ## Strip out certain elements/tags
    $data =~ s%<!doctype\s[^>]*>%%i;
    $data =~ s%</?html[^>]*>%%ig;
    $data =~ s%</?body[^>]*>%%ig;
    $data =~ s%<head\s*>[\s\S]*</head\s*>%%i;

    ## Modify relative urls to absolute using BASE
    if ($base =~ /\S/) {
        $data =~ s%(href\s*=\s*['"])([^'"]+)(['"])%
		   &addbase($base,$1,$2,$3)%gei;
        $data =~ s%(src\s*=\s*['"])([^'"]+)(['"])%
                   &addbase($base,$1,$2,$3)%gei;
    }

    ($title . $data);
}
##---------------------------------------------------------------------------
sub addbase {
    local($b, $pre, $u, $suf) = @_;
    local($ret);
    $u =~ s/^\s+//;
    if ($u =~ m%^$Url%o) {	# Non-relative URL, do nothing
        $ret = $pre . $u . $suf;
    } else {			# Relative URL
	if ($u =~ m%^/%) {		# Check for "/..."
	    $b =~ s%^(${Url}[^/]*)/.*%$1%o;	# Get hostname:port number
	}
        $ret = $pre . $b . $u . $suf;
    }
    $ret;
}
##---------------------------------------------------------------------------

1;

Version

1.0


See Also

CHARSETCONVERTERS, MIMEARGS, PERLINC


98/10/10 21:28:59
MHonArc
Copyright © 1997-1998 Earl Hood, earlhood@usa.net