Re: Typesetting email automatically (macro package?)



Daniel Barrett <dbarrett@xxxxxxxxxxxxxxx> writes:
> I need to typeset several thousand email messages "nicely" for a book.
> Is there a macro package that makes this easy, say, typesetting quoted
> replies in italics, suppressing uninteresting headers (like
> "Received:"), bolding the mail header keywords (To, From, etc), and so
> on? My fantasy is:
>
> \begin{email}
> [insert entire email message, unedited]
> \end{email}

the problem is that email messages are enormously variable.

first, the headers: i've just looked at one from the upload system at
the tug box ... 52 lines of headers. what of that lot is important?
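
One way to approach the header question is a whitelist applied before typesetting. The following is a minimal sketch in Python using the standard `email` package; the list of headers in `KEEP` and the function name `interesting_headers` are assumptions made for illustration, not something the thread specifies.

```python
from email import message_from_string
from email.policy import default

# Hypothetical whitelist: which headers count as "important" is an assumption.
KEEP = ("From", "To", "Cc", "Date", "Subject")

def interesting_headers(raw, keep=KEEP):
    """Return (name, value) pairs for whitelisted headers, in whitelist order."""
    msg = message_from_string(raw, policy=default)
    # Headers such as Received: or X-Mailer: are simply never looked up.
    return [(name, str(msg[name])) for name in keep if msg[name] is not None]
```

Everything not named in the whitelist is dropped, so a message with 52 header lines and one with 5 come out looking the same.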

second, the mime structure: lots of exciting differences here,
including in-line images, etc., and of course the actual text bodies
may be in plain text, html, or both (in different mime parts).
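
For the mime structure, a preparatory script could pick one text part per message and ignore the rest. This sketch assumes that plain text is preferred over html when both are present, and that images and other attachments are simply skipped; `best_text` is a hypothetical helper name.

```python
from email import message_from_string
from email.policy import default

def best_text(raw):
    """Return (content_type, text) for the preferred text part, or (None, None)."""
    msg = message_from_string(raw, policy=default)
    # Assumption: prefer text/plain, fall back to text/html.
    part = msg.get_body(preferencelist=("plain", "html"))
    if part is None:
        return None, None
    return part.get_content_type(), part.get_content()
```

When only an html part exists, the caller still has to turn that html into something typesettable, which this sketch does not attempt.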

third, encodings: you have to guess, for example, whether the encoding
that claims to be iso-latin-1 is in fact micro$'s corrupt version of
it, and you have to deal with all the myriad m$ code pages and the old
national encodings, too, as well as the iso 8-bit codes and the
various unicode encodings (i've had several mails that are unicode but
in some encoding other than utf-8).
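
The guessing can be sketched as a chain of fallbacks. The policy below is an assumption, not a recommendation from the thread: a declared iso-8859-1 label is treated as windows-1252 (bytes 0x80 to 0x9f are control characters in real iso-8859-1, so printable text there usually means the Microsoft code page), then utf-8 and windows-1252 are tried, and latin-1 is the last resort because it never fails.

```python
def decode_body(data, declared=None):
    """Decode bytes to str, returning (text, encoding_actually_used)."""
    candidates = []
    if declared:
        name = declared.strip().lower()
        # Assumption: a latin-1 label is read as windows-1252.
        if name in ("iso-8859-1", "latin-1", "latin1"):
            name = "cp1252"
        candidates.append(name)
    candidates += ["utf-8", "cp1252"]
    for enc in candidates:
        try:
            return data.decode(enc), enc
        except (UnicodeDecodeError, LookupError):
            continue  # wrong guess or unknown charset name; try the next one
    # latin-1 maps every byte to a code point, so this cannot raise.
    return data.decode("latin-1"), "latin-1"
```

A chain like this can produce wrong text silently (for example, one of the old national encodings read as windows-1252), so the returned encoding name is worth logging for later checking.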

fourth, content transfer encodings: you have to deal with
quoted-printable, base64, and all that sort of thing.
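
The two common transfer encodings can be undone with Python's standard `quopri` and `base64` modules. This is a minimal sketch; `decode_transfer` is a hypothetical name, and anything other than quoted-printable or base64 is assumed to need no decoding.

```python
import base64
import quopri

def decode_transfer(payload, cte):
    """Undo a Content-Transfer-Encoding on a bytes payload."""
    cte = (cte or "7bit").strip().lower()
    if cte == "quoted-printable":
        return quopri.decodestring(payload)
    if cte == "base64":
        return base64.b64decode(payload)
    # Assumption: 7bit, 8bit and binary are passed through unchanged.
    return payload
```

In practice `get_payload(decode=True)` on a parsed message part performs the same step, so a helper like this is mainly useful when working on raw parts by hand.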

and fifth (the last one i can currently think of), how are quotations
within mails to be recognised? i have at least 4 different quoting
styles among my regular correspondents.
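
A heuristic for spotting quoted lines might look like the sketch below. It assumes only three styles: a leading ">", a leading "|", and up to four initials followed by ">" (as in "DB> "). Other styles, such as indentation-only quoting, are not covered, and the pattern will misfire on ordinary lines that happen to begin like "x>y".

```python
import re

# Assumed quoting styles: "> text", "| text", and "DB> text".
QUOTE_RE = re.compile(r"^\s{0,3}(?:[A-Za-z]{0,4}>|\|)")

def is_quoted(line):
    """Guess whether a body line is quoted from an earlier message."""
    return bool(QUOTE_RE.match(line))

def split_runs(lines):
    """Group consecutive lines into (is_quoted, [lines]) runs."""
    runs = []
    for line in lines:
        flag = is_quoted(line)
        if runs and runs[-1][0] == flag:
            runs[-1][1].append(line)
        else:
            runs.append((flag, [line]))
    return runs
```

Each quoted run could then be wrapped in whatever italic markup the typesetting step uses.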

> Editing each message by hand doesn't seem fun....
>
> A combination of programs or scripts would also be fine, e.g., feeding
> the output of an email pretty-printer (or whatever) into something
> else, ultimately producing something close to what I want. I can
> write Perl scripts if needed.
>
> I looked on CTAN & Google a bit, but "email" is such a common word
> that it's hard to narrow the search.

with a preparatory script to sort out the encodings and insert some
sort of basic sanity, the listings package can probably be tricked
into making sense of mails, but html mail will be a killer (largely
because it's usually such *awful* html).

i think i'm saying, "email" isn't well-specified: without a clear
specification of the input format, no-one can hope to format it
automatically, unless serious randomness is acceptable.
--
Robin Fairbairns, Cambridge