this post was submitted on 21 Mar 2024

Digital Forensics

I received several machine-generated e-mails which are all mostly the same: a notification. They are HTML emails with no plaintext MIME part. Yikes! And to complicate matters further, the messages traversed my anonaddy forwarding account, which PGP-encrypts every message to me before forwarding it to my normal email account.

The gov wants me to give them an “unaltered copy” of these e-mails. This gov office actually blocks my mail server so I am generally unwilling to send them email. This means I will be giving them the emails on paper hardcopy.

So wtf, this is tricky. They want an “unaltered copy”. If I were to print the MBOX files, it would be useless to them because it’s a base64 blob that only I can decrypt. My mail client is mutt so the HTML is detected and piped through w3m to give me a text version that is readable enough.

But in general, how do you give unaltered copies of an HTML email on paper form? This is not necessarily for a court but it could go down that path. Would a court want to see raw HTML tags? Or do courts prefer the HTML to be rendered for readability?

Normally I copy the w3m-rendered text of the email into LaTeX and typeset it to look pretty, then copy-paste the useful headers into a well-styled header in a monospaced font. I omit the useless headers. But I get the impression my way of working would not pass for “unaltered”.

I could perhaps try to feed the HTML into wkhtmltopdf. In the end, HTML rendering always varies depending on the rendering tool. Normies use MS Outlook, and I have to figure that the gov is normally dealing with normies. So maybe I should install Evolution or Thunderbird. Any suggestions for a tool that is particularly good at making HTML email presentable on paper without looking too custom?
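For anyone who wants to feed the HTML into a renderer, here is a minimal Python sketch for pulling the HTML part out of a message. It assumes the message has already been decrypted and is available as raw bytes (for example, saved from the mail client). The function name `extract_html_body` is made up for illustration.

```python
from email import message_from_bytes, policy


def extract_html_body(raw_message: bytes) -> str:
    """Return the decoded text/html body of a message, or '' if there is none."""
    # policy.default gives the modern EmailMessage API, including get_body().
    msg = message_from_bytes(raw_message, policy=policy.default)
    part = msg.get_body(preferencelist=("html",))
    if part is None:
        return ""
    # get_content() undoes the transfer encoding (base64, quoted-printable)
    # and decodes the bytes using the charset declared on the part.
    return part.get_content()
```

The returned string can be written to a `.html` file and opened in a browser or passed to a tool like wkhtmltopdf. The original message file is left untouched.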

#askFedi

[–] thebardingreen@lemmy.starlightkel.xyz 4 points 7 months ago (1 children)

In my admittedly limited experience, courts don't want to look at raw HTML unless something in the headers or something is relevant to the case. Then, they want the important bits to be highlighted by experts.

I actually wrote some Python scripts about 3 months ago to parse MBOX files so that specific emails could be entered into evidence in a lawsuit. I don't know if my scripts would help you, but I'd be happy to send them to you.

[–] coffeeClean@infosec.pub 1 points 7 months ago* (last edited 7 months ago) (1 children)

My Python knowledge is quite rough, but if not much hacking is needed it could be useful. I’ve seen others asking for a similar tool. I thought about creating one over the years but keep passing on it, thinking I won’t need it often enough and every situation can bring different requirements as well. Which is why I settled on pasting into a LaTeX template. I do things like use a tiny font on signature blocks that are so big they would spill over to another page.

Does Python have a standard library for HTML rendering? Or do you call a browser of some kind?
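For context on that question: the Python standard library can parse HTML (`html.parser`) but does not include a layout or rendering engine, so visual rendering needs an external tool or a browser. A rough sketch of what the standard library alone can do is below. It only flattens HTML to text, similar in spirit to a text dump, and the class and function names are made up for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    BLOCK_TAGS = {"p", "div", "br", "tr", "li", "h1", "h2", "h3"}
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()  # convert_charrefs=True by default, so &amp; becomes &
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.chunks.append("\n")  # start block-level elements on a new line

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html: str) -> str:
    """Flatten HTML to plain text, one non-empty line per block."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (line.strip() for line in "".join(parser.chunks).splitlines())
    return "\n".join(line for line in lines if line)
```

This drops all styling, tables, and images, so it is no substitute for a rendered page. It is mainly useful for a readable text version next to the original.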

[–] thebardingreen@lemmy.starlightkel.xyz 2 points 7 months ago (1 children)

I'm on my phone right now. When I get home I'll dig them up.

[–] coffeeClean@infosec.pub 1 points 7 months ago* (last edited 7 months ago) (1 children)

I might be able to get by without the script. I just found that I can render the body in Firefox well enough (that often fails but it works with the particular emails I’m dealing with), fiddle with the paper format and scale to exactly fit a page, and then import it into LaTeX, rescale, and attach a header. If you’ve already got the script ready then I would be happy to take it anyway and compare the script output to what I’m manually rigging up. But if you’ve not started then no worries. Thanks!

(edit)
fwiw to anyone with the same need, I found this project: https://github.com/nickrussler/email-to-pdf-converter It looks a bit messy to install on my distro and I’m not sure of EML / Mbox differences, so I’m not planning to use it myself.

[–] thebardingreen@lemmy.starlightkel.xyz 2 points 7 months ago* (last edited 7 months ago) (1 children)

I'd kind of forgotten how I'd done it.

This script searches an MBOX file for emails from or to lawyer1 and lawyer2 that contain the names or email addresses in target_names. It exports each email to a txt file and saves any attachments in their original format.

https://pastebin.com/i0xq4fP9
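For readers who just want the general shape of that kind of filter, here is a minimal sketch using Python's standard `mailbox` module. It is not the linked script. The function names and the matching rules (sender or recipient in a set of addresses, plus a case-insensitive term match on subject and text parts) are assumptions for illustration, and it leaves out the txt export and attachment saving.

```python
import mailbox
from email.utils import getaddresses


def _addresses(msg):
    """Lower-cased addresses found in the From, To and Cc headers."""
    headers = []
    for name in ("From", "To", "Cc"):
        headers.extend(str(value) for value in msg.get_all(name, []))
    return {addr.lower() for _, addr in getaddresses(headers)}


def _text(msg):
    """Concatenate the decoded text/* parts of a message."""
    parts = []
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        try:
            parts.append(payload.decode(charset, errors="replace"))
        except LookupError:  # unknown charset label in the message
            parts.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(parts)


def find_messages(mbox_path, party_addresses, target_terms):
    """Yield messages from/to any party that mention any target term."""
    parties = {a.lower() for a in party_addresses}
    terms = [t.lower() for t in target_terms]
    box = mailbox.mbox(mbox_path, create=False)  # do not create a missing file
    try:
        for msg in box:
            if not (_addresses(msg) & parties):
                continue
            haystack = (str(msg.get("Subject", "")) + "\n" + _text(msg)).lower()
            if any(term in haystack for term in terms):
                yield msg
    finally:
        box.close()
```

Called as `find_messages("export.mbox", ["lawyer1@example.org"], ["jane doe"])`, with a placeholder path and values, it yields each matching message object, which can then be written out however is needed.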

Then I used this bash script to export the txt files to PDF, using pandoc (https://pandoc.org/)

https://pastebin.com/17FPXPr5
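The same batch conversion can be sketched in Python for anyone who prefers to keep everything in one language. This is not the linked bash script, just a minimal equivalent that calls `pandoc INPUT -o OUTPUT.pdf` per file. The function names and directory layout are assumptions, and pandoc needs a PDF engine (LaTeX by default) installed to write PDFs.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(txt_path: Path, out_dir: Path) -> list:
    """Build the pandoc command line that converts one txt file to a PDF."""
    pdf_path = out_dir / (txt_path.stem + ".pdf")
    return ["pandoc", str(txt_path), "-o", str(pdf_path)]


def convert_all(txt_dir: Path, out_dir: Path) -> int:
    """Convert every *.txt file in txt_dir; return the number converted."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc not found on PATH")
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for txt_path in sorted(txt_dir.glob("*.txt")):
        # check=True raises CalledProcessError if pandoc exits with an error.
        subprocess.run(pandoc_command(txt_path, out_dir), check=True)
        count += 1
    return count
```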

In my case, we needed to export like 7,000 emails spanning a 6-year period from like a 45 GB Gmail MBOX export. The lawyers seemed happy with the result, but it was a lot of data.

[–] coffeeClean@infosec.pub 1 points 7 months ago (1 children)

Thanks! I grabbed it in case it comes in handy. I wonder if the first script, which searches for messages, might have been simplified by using grepmail. Grepmail is slow but powerful.

This is slow too.