seawasp: (A wise toad)
[personal profile] seawasp
I've posted screenshots of the output from "less" that I was able to cut-and-paste into a text file. The first shot is this one, called Less Screen One; this is typical of the first few screens of stuff that come up. Then the next few look like this; this unreadable garbage goes for several screens, then it goes back to things like Less Screen One for quite some time (I can't say for the rest of the file as I don't have the time to bounce through the entire 1+GB a screen at a time).

Also at intervals it pops up the error messages:

"less(3414) malloc: *** mmap(size=1073741824) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
less(3414) malloc: *** mmap(size=1073741824) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
:"

Now you see what I see... is it of any help?

Date: 2009-07-30 04:28 pm (UTC)
From: [identity profile] mouser.livejournal.com
I can't tell if it's a relay causing it, or if the mail program is having the issue.



1) Unix/Linux message - End of line is a single character (CR) instead of the Windows standard (CR/LF). Probably something got confused in a configuration somewhere.

2) The second looks like part of a UUENCODED message (at a guess).

3) The memory allocation errors are because something doesn't understand the end of line character in #1 and is going until it runs out of memory at a one gig limit (one gig on a single line).
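One way to check the line-ending theory on a small sample of the mailbox is to count how many CR/LF pairs, lone CRs, and lone LFs it contains. This is only a sketch: the function name `count_line_endings` is made up for illustration, and it reads the whole sample into memory, so it is meant for a few kilobytes cut from the file, not the full 1+GB.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CR/LF pairs, lone CRs, and lone LFs in a byte string."""
    crlf = data.count(b"\r\n")
    # Every CR/LF pair contains one CR and one LF, so subtract the pairs
    # to get the characters that appear on their own.
    lone_cr = data.count(b"\r") - crlf
    lone_lf = data.count(b"\n") - crlf
    return {"CRLF": crlf, "CR": lone_cr, "LF": lone_lf}


# Hypothetical sample with all three styles mixed together.
sample = b"From: a\r\nTo: b\nSubject: c\rBody\r\n"
counts = count_line_endings(sample)
```

If more than one of the three counts is large, the file is mixing conventions, which would fit what the screenshots suggest.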



Date: 2009-07-30 04:28 pm (UTC)
From: [identity profile] mouser.livejournal.com
^M = CR (Carriage Return)
^L = LF (Line feed)

That's what is getting munged.


I'm not sure what your system layout is - I only found one other post about your mail issues.

Edited Date: 2009-07-30 04:30 pm (UTC)

Date: 2009-07-30 04:40 pm (UTC)
From: [identity profile] ninjarat.livejournal.com
Yes. Screen One confirms what I suspected: line ends are a disaster. See the line break between "MX-M" and "ozilla-Status:"? That shouldn't be there.

If that were my mailbox, what I would do is:
- Use m0rlock's perl script to convert the DOS end of lines to Unix.
- Break the file up into N-line pieces with split.
- Edit the first piece with Emacs and look at the line endings and pray that I can tell the difference between good line breaks (with white space or punctuation at the ends) and bad splits (no white space or punctuation).
- Create a replace expression or macro that deletes the line ends for bad splits and run that macro on every piece. I could do it in Perl or Python (or, sed for that matter) but for this kind of thing I want to see it work in real time.
- If I can't do it global then I figure out a way to repair broken headers as automatically as possible.
- Catenate all the pieces together again and load it up.
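The end-of-line conversion in the first step can also be done in one streaming pass, so the file never has to fit in memory or be split first. What follows is a rough sketch, not m0rlock's Perl script: the name `normalize_newlines` is hypothetical, and it assumes a lone CR should also count as a line end, which may or may not be right for this mailbox. It does nothing about the misplaced breaks in the middle of headers; those still need eyeballing.

```python
import io


def normalize_newlines(src, dst, chunk_size=1 << 20):
    """Copy src to dst, turning CR/LF and lone CR into LF, chunk by chunk."""
    pending_cr = False  # True when the previous chunk ended in a CR
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        if pending_cr:
            # Put the held-back CR in front so a CR/LF pair that straddles
            # two chunks is still seen as a pair.
            chunk = b"\r" + chunk
            pending_cr = False
        if chunk.endswith(b"\r"):
            # Hold a trailing CR until the next byte is known.
            chunk = chunk[:-1]
            pending_cr = True
        dst.write(chunk.replace(b"\r\n", b"\n").replace(b"\r", b"\n"))
    if pending_cr:
        dst.write(b"\n")


# In-memory demonstration; for a real file, open both in binary mode.
source = io.BytesIO(b"a\r\nb\rc\nd\r")
target = io.BytesIO()
normalize_newlines(source, target, chunk_size=2)
result = target.getvalue()
```

The tiny `chunk_size` in the demonstration is only there to exercise the case where a CR/LF pair is cut in half by a chunk boundary.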

For the record, I've unmunged plenty of mail files before. This one takes the prize for weirdness for the mix of DOS and Unix end of lines and the misplaced ends.

Don't worry about Screen Two. That's MIME base64-encoded data, an attachment of some sort.

Date: 2009-07-30 05:47 pm (UTC)
From: [identity profile] kpreid.livejournal.com
Er, line feed is ^J.

Date: 2009-07-30 05:48 pm (UTC)
From: [identity profile] mouser.livejournal.com
Unless you're using it as a mail server (which you don't seem to be), I'd contact your ISP. It seems to be having issues with handshaking with other mail servers.

Date: 2009-07-30 06:01 pm (UTC)
From: [identity profile] mouser.livejournal.com
...maybe I just like a LOT of space between lines, did you ever think of that!?!?!





^L = FF (Form Feed/Page break)

Date: 2009-07-30 06:06 pm (UTC)
From: [identity profile] kpreid.livejournal.com
Heh heh.^M^J
^M^J
I used to write software inside of a telecom app's scripting (y'know, modems, talking to BBSes and 'online services' kind of thing) and it got very familiar...^M^J

Date: 2009-07-30 06:47 pm (UTC)
From: [identity profile] mouser.livejournal.com
...I really need to read the entire thread...


Changing the ^M to your favorite line break while removing any other line breaks should reformat it correctly.

The stuff in the second view is definitely a UUEncoded message. These are a headache to "hand" decode with an external program.

I used to have a UUDECODE command line program but it won't run anymore under modern OSes (least of all on a Mac...)
Edited Date: 2009-07-30 06:49 pm (UTC)

Date: 2009-07-30 06:51 pm (UTC)
From: [identity profile] ninjarat.livejournal.com
This is a job for MacPorts.
http://www.macports.org/

Date: 2009-07-30 07:00 pm (UTC)
From: [identity profile] mouser.livejournal.com
...you were hearing theme music in your head when you posted that, weren't you?

Date: 2009-07-30 07:39 pm (UTC)
From: [identity profile] ninjarat.livejournal.com
The only reason that I would use Emacs is because I'm familiar with doing this kind of work with Emacs. You can use whatever tools you are comfortable with. I could do it with Microsoft Word. In that case I would likely have to massage the existing line ends (paragraph marks in Word) to something unique like "lineendratPP", then change the bad ones, then change the good ones to something useful. The tricky part is figuring out how to either automate it or make it a few global replaces. Once you have that it's just tedious.

Date: 2009-07-30 10:32 pm (UTC)
kengr: (Default)
From: [personal profile] kengr
The garbage is standard mime base-64 encoding. So those are sections of the message that are images, or other "binary" content.

Odds are that the file is fine, it just needs to get run thru the right software to split out messages.
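To show what "the right software" does with those base64 sections, here is a small sketch using Python's standard `email` package. The message is built in memory purely for illustration (the subject, body, and filename `blob.bin` are invented), and the helper name `extract_attachments` is hypothetical. It assumes one well-formed message at a time, which is not yet true of the damaged mailbox.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_attachments(raw: bytes):
    """Return (filename, decoded bytes) for each attachment in one message."""
    msg = message_from_bytes(raw, policy=policy.default)
    return [
        # decode=True undoes the base64 transfer encoding.
        (part.get_filename(), part.get_payload(decode=True))
        for part in msg.iter_attachments()
    ]


# Build a made-up message with one binary attachment.
demo = EmailMessage()
demo["Subject"] = "demo"
demo.set_content("hello")
demo.add_attachment(
    b"\x00\x01binary data\xff",
    maintype="application",
    subtype="octet-stream",
    filename="blob.bin",
)
raw_message = demo.as_bytes()  # the attachment is base64 text in here
attachments = extract_attachments(raw_message)
```

Viewed in "less", the attachment part of `raw_message` is the same kind of unreadable block seen in the second screenshot, yet it decodes back to the original bytes.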

Date: 2009-07-31 05:31 am (UTC)
From: [identity profile] m0rlock.livejournal.com
That's frustrating - I assume the file is either too big, or the line endings have it confused, or more likely a combination of the two.

Can you post a small chunk of the file somewhere? If you run

head -c 10000 Inbox > tmpInbox

it'll copy the first 10000 bytes of Inbox to tmpInbox. That way we'd have something we could experiment on. You can adjust 10000 down if you have privacy concerns, or up if you don't (a bigger sample would be better, of course).

Date: 2009-07-31 04:38 pm (UTC)
From: [identity profile] m0rlock.livejournal.com
In the interest of defeating email address harvesting...send it to dmeyer@example.com. Except you should use dmeyer.net instead of example.com as the domain.
