Backing up local mail to Gmail and other mailboxes

Requirements

Back up mail to Gmail or another mailbox.

Google GMail Loader (GML) is a tool that does this.

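My basic requirement is that the Subject, From, Date, and To fields are preserved, so that each message still looks the way it did when it was first received: the sender, recipient, date, and so on must not change as a result of the backup. The effect is similar to a mail server's automatic forwarding.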

Outline

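Forwarding directly from a mail client, or using the method described in the article "Backing up Outlook mail to Gmail", generates a new message based on the original. The fields listed above then describe the forwarding action rather than the original delivery.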

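A MIME message is plain text, so it is enough to keep the relevant metadata in the message headers intact.
mbox is the mail storage format on Unix, and Thunderbird uses it too. In Gmail, clicking Show original under More options shows that the raw message has the same format. Below is one message from an mbox file; apart from the first line, which begins with From, it is the complete message (including attachments). .eml files use the same format.
All that is needed is to send the messages to be backed up in their original form.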

From - Sat Mar 18 13:05:01 2006
X-Account-Key: account2
X-UIDL: GmailId109e4eefe4d8ed58
X-Mozilla-Status: 0001
X-Mozilla-Status2: 00000000
X-Gmail-Received: c8a53f9e65e27182a7181970efbd3fed4e69a43b
Received: by 10.70.56.14; Fri, 10 Mar 2006 08:13:21 -0800 (PST)
Message-ID:
Date: Fri, 10 Mar 2006 08:13:21 -0800
From: "Gmail Team"
Reply-To: caichanghui520@gmail.com
To: "charlie zhu"
Subject: =?GB2312?B?ssyzpLvUIGhhcyBhY2NlcHRlZCB5b3VyIGludml0YXRpb24gdG8gR21haWw=?=
MIME-Version: 1.0
Content-Type: multipart/alternative;
 boundary="----=_Part_6342_7572464.1142007201426"

------=_Part_6342_7572464.1142007201426
Content-Type: text/plain; charset=GB2312
Content-Transfer-Encoding: base64
Content-Disposition: inline

ssyzpLvUIGhhcyBhY2NlcHRlZCB5b3VyIGludml0YXRpb24gdG8gR21haWwgYW5kIGhhcyBjaG9z
ZW4gdGhlIGJyYW5kIG5ldwphZGRyZXNzIGNhaWNoYW5naHVpNTIwQGdtYWlsLmNvbS4gQmUgb25l
IG9mIHRoZSBmaXJzdCB0byBlbWFpbCCyzLOku9QgYXQgdGhpcwpuZXcgR21haWwgYWRkcmVzcy0t
anVzdCBoaXQgcmVwbHkgYW5kIHNlbmQgssyzpLvUIGEgbWVzc2FnZS4KY2FpY2hhbmdodWk1MjBA
Z21haWwuY29tIGhhcyBhbHNvIGJlZW4gYXV0b21hdGljYWxseSBhZGRlZCB0byB5b3VyIGNvbnRh
Y3QKbGlzdCBzbyB5b3UgY2FuIHN0YXkgaW4gdG91Y2ggd2l0aCBHbWFpbC4KCgpUaGFua3MsCgpU
aGUgR21haWwgVGVhbQo=
------=_Part_6342_7572464.1142007201426
Content-Type: text/html; charset=GB2312
Content-Transfer-Encoding: base64
Content-Disposition: inline

PGh0bWw+Cjxmb250IGZhY2U9IkFyaWFsLCBIZWx2ZXRpY2EsIHNhbnMtc2VyaWYiPgo8cD6yzLOk
u9QgaGFzIGFjY2VwdGVkIHlvdXIgaW52aXRhdGlvbiB0byBHbWFpbCBhbmQgaGFzCiAgY2hvc2Vu
IHRoZSBicmFuZCBuZXcgYWRkcmVzcyBjYWljaGFuZ2h1aTUyMEBnbWFpbC5jb20uIEJlIG9uZSBv
ZiB0aGUgZmlyc3QgdG8gZW1haWwgCiAgssyzpLvUIGF0IHRoaXMgbmV3IEdtYWlsIGFkZHJlc3Mt
LWp1c3QgaGl0IHJlcGx5IGFuZCBzZW5kIAogILLMs6S71CBhIG1lc3NhZ2UuIGNhaWNoYW5naHVp
NTIwQGdtYWlsLmNvbSBoYXMgYWxzbyBiZWVuIGF1dG9tYXRpY2FsbHkgYWRkZWQgdG8KICB5b3Vy
IGNvbnRhY3QgbGlzdCBzbyB5b3UgY2FuIHN0YXkgaW4gdG91Y2ggd2l0aCBHbWFpbC4KPC9wPgo8
cD48YnI+CiAgVGhhbmtzLCA8L3A+CjxwPiBUaGUgR21haWwgVGVhbTwvcD4KPC9mb250Pgo8L2h0
bWw+Cg==
------=_Part_6342_7572464.1142007201426--

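The part highlighted in green above, which includes the leading From line and the X-Mozilla-Status headers, was added by Thunderbird.
Note that the Message-ID line, highlighted in red, has to be removed before Gmail will handle the message correctly. My guess is that Gmail refuses the message because a message with that ID already exists. Then again, this particular message was originally received by Gmail and downloaded over POP, so there is usually no need to send it back.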
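A minimal sketch of removing the Message-ID header with Python 3's standard email package, as an alternative to matching line prefixes. It also copes with a Message-ID whose value is folded onto the next line. The function name `strip_message_id` and the sample message are made up for illustration, and re-serializing through the parser is assumed to leave the other headers untouched as long as they are short and well formed.

```python
import email
from email import policy


def strip_message_id(raw):
    """Return the raw message (bytes) with every Message-ID header removed."""
    # compat32 keeps header values as they were parsed instead of rewriting them.
    msg = email.message_from_bytes(raw, policy=policy.compat32)
    del msg["Message-ID"]  # removes all occurrences; no error if the header is absent
    # as_bytes() omits the mbox "From " postmark line by default.
    return msg.as_bytes()


# Hypothetical sample with an mbox postmark and a folded Message-ID header.
sample = (
    b"From - Sat Mar 18 13:05:01 2006\n"
    b"Message-ID:\n"
    b" <abc123@example.invalid>\n"
    b"Date: Fri, 10 Mar 2006 08:13:21 -0800\n"
    b"Subject: hello\n"
    b"\n"
    b"hello body\n"
)
cleaned = strip_message_id(sample)
```

The result can be passed directly as the message argument of `smtplib.SMTP.sendmail`, which accepts bytes.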

Implement

Python code:

import smtplib

# The MY_* names are settings you define yourself before running this script.
smtp = smtplib.SMTP(MY_SMTP_SERVER, MY_SMTP_SERVER_PORT)
smtp.set_debuglevel(1)  # print the SMTP conversation
smtp.login(MY_SMTP_USERNAME, MY_SMTP_PASSWORD)
'''
Earlier attempt using the mailbox module (Python 2 API), left disabled:
ms = mailbox.UnixMailbox(open(MY_SOURCE_MBOX_FILE))
m = ms.next()
while m:
    msg = m.__str__() + m.fp.read()
    smtp.sendmail("dumi@me.com", MY_BACKUP_EMAIL, msg)
    m = ms.next()
'''
# Read bytes so that 8-bit content passes through unchanged.
with open(MY_SOURCE_MBOX_FILE, 'rb') as mbx:
    msg = b''
    for line in mbx:
        if line.startswith(b'From -'):  # a new message starts here
            if msg:
                smtp.sendmail("dumy@me.com", MY_BACKUP_EMAIL, msg)
            msg = b''
            continue  # the mbox postmark line is not part of the message
        if line.startswith(b'Message-ID:'):
            continue  # otherwise the mail server may refuse it as a resend
        msg += line
    if msg:  # send the last message in the file
        smtp.sendmail("dumy@me.com", MY_BACKUP_EMAIL, msg)
smtp.quit()
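For comparison, here is a sketch of the same loop built on Python 3's standard `mailbox.mbox` class, which takes care of finding the message boundaries. The function name `resend_mbox` and the `send` callback are assumptions made for illustration, not part of the original program. Passing the sender in as a callback keeps the SMTP details out of the loop.

```python
import mailbox


def resend_mbox(path, send):
    """Call send(raw_bytes) for each message in the mbox file; return the count."""
    count = 0
    box = mailbox.mbox(path, create=False)  # do not create the file if it is missing
    try:
        for msg in box:
            del msg["Message-ID"]  # dropped for the reason given above
            send(msg.as_bytes())   # serialized without the "From " postmark line
            count += 1
    finally:
        box.close()
    return count
```

With an open `smtplib.SMTP` connection named `smtp`, the callback could be `lambda raw: smtp.sendmail("dumy@me.com", MY_BACKUP_EMAIL, raw)`.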

Result

In the backed-up messages, the sender, the recipient, and the received time are all unchanged.

Programming with mbox

An mbox is a text file containing an arbitrary number of e-mail messages. Each message consists of a postmark, followed by an e-mail message formatted according to RFC 822. The file format is line-oriented. Lines are separated by line feed characters (ASCII 10). mbox

The mbox Format: If we use the mbox format to store emails, we put all of them in one file. This creates a more or less long text file (Internet email always only exists as 7-bit ASCII text, everything else, attachments for example, is encoded) containing one email message after the other. How do we know where one ends and another starts?

Fortunately, every email has at least one From-line at its very beginning. Every message begins with "From " (From followed by a white space character, also called a "From_" line). If this sequence ("From ") at the beginning of a line is preceded by an empty line or is at the top of the file, we have found the beginning of a message.
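The boundary rule quoted above can be sketched in a few lines of Python. The function name `split_mbox` and the sample text are made up for illustration. The postmark lines themselves are dropped from the returned messages.

```python
def split_mbox(text):
    """Split mbox text into a list of messages, without their postmark lines."""
    messages = []
    current = None     # lines of the message being collected
    prev_blank = True  # the top of the file counts as "preceded by an empty line"
    for line in text.splitlines(keepends=True):
        if line.startswith("From ") and prev_blank:
            if current is not None:
                messages.append("".join(current))
            current = []  # start a new message; skip the postmark line itself
        elif current is not None:
            current.append(line)
        prev_blank = line.strip() == ""
    if current is not None:
        messages.append("".join(current))
    return messages


sample = (
    "From - Sat Mar 18 13:05:01 2006\n"
    "Subject: one\n"
    "\n"
    "Hello\n"
    "From here on it is body text\n"
    "\n"
    "From - Sat Mar 18 13:06:01 2006\n"
    "Subject: two\n"
    "\n"
    "Bye\n"
)
parts = split_mbox(sample)
```

The body line beginning with "From here" is not treated as a boundary because the line before it is not empty.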

Reading and writing mbox style mailbox files

mbox - file containing mail messages

More about Thunderbird

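I personally recommend managing mail with a client such as Thunderbird that stores messages in the standard mbox format, because the mail then stays reliably accessible in any environment. In fact, my current mail strategy uses Gmail as the hub: all my other mailboxes forward automatically to the main Gmail account, and Thunderbird then downloads everything over POP as a local backup in case something goes wrong. I wrote this program to sync the large volume of older mail, which had been received directly by mail clients, back into Gmail. After all, Gmail's label-based organization and search are very convenient.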

Mozilla Thunderbird stores emails in the mbox format. http://www.broobles.com/imapsize/th2outlook.php

Calling Thunderbird from other programs
Thunderbird supports SimpleMAPI, which is a Microsoft standard way for a third party application to send email messages using the default email client. SimpleMAPI can be called from C, C++ and Visual Basic.

XPCOM (Cross Platform Component Object Model) is Mozilla’s framework for writing cross-platform, modular software. Despite some obvious similarities, Microsoft COM and XPCOM components are not compatible or interchangeable. XPCOM components can be written in and used from C, C++, Perl, Python, and JavaScript.

Thunderbird supports command line arguments to open the compose message window and fill in the headers, the message body and attachment(s), but you’d still have to press the send button.
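As a rough sketch of the command-line route, the helper below builds an argument list for Thunderbird's `-compose` option, using the comma-separated `field='value'` form. The function name `compose_args` and the default executable name `thunderbird` are assumptions, and values containing single quotes or commas are not escaped here.

```python
def compose_args(to, subject, body, thunderbird="thunderbird"):
    """Build the argument list that opens a pre-filled compose window."""
    fields = {"to": to, "subject": subject, "body": body}
    # Each field is written as name='value'; fields are joined with commas.
    spec = ",".join("{}='{}'".format(name, value) for name, value in fields.items())
    return [thunderbird, "-compose", spec]


args = compose_args("someone@example.invalid", "Backup test", "Hello")
```

The list can be handed to `subprocess.run(args)`; the message still has to be sent by hand from the compose window.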

Thunderbird doesn’t have a scripting capability. Its functionality can be modified using XUL-based extensions. It does not support traditional plug-ins.

If all you want to do is to process new messages, don’t overlook writing a script that parses the “inbox.” mbox file using the X-Mozilla-Status headers to figure out if a message is a new message. An mbox file is essentially just a flat text file that has a separator between the messages and special encoding for any “From” strings if they occur at the beginning of a line in either the headers or the message body.
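A sketch of that idea for a single message, assuming that X-Mozilla-Status holds a hexadecimal bit field whose lowest bit (0x0001) means "read". The constant and function names are made up for illustration, and a missing or malformed header is treated as unread.

```python
import email
from email import policy

MSG_FLAG_READ = 0x0001  # assumption: lowest bit of X-Mozilla-Status marks a read message


def is_unread(raw):
    """Return True if the raw message (bytes) is not flagged as read."""
    msg = email.message_from_bytes(raw, policy=policy.compat32)
    status = msg.get("X-Mozilla-Status", "0000")
    try:
        flags = int(status.strip(), 16)
    except ValueError:
        return True  # unparsable header: treat the message as unread
    return not flags & MSG_FLAG_READ


read_msg = b"X-Mozilla-Status: 0001\nSubject: seen\n\nbody\n"
new_msg = b"X-Mozilla-Status: 0000\nSubject: fresh\n\nbody\n"
```

Under this assumption, the sample message shown earlier, with X-Mozilla-Status: 0001, counts as read.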

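Your mail files are stored in your profile (see Profile folder), in the Mail and ImapMail (if you use IMAP) folders. Each mail folder (Inbox, Sent, and so on) is stored as two files: one without an extension (for example INBOX), which is the mail file itself (in "mbox" format), and one with the .msf extension (for example INBOX.msf), which is the index of the mail file (the mail summary file). Other programs import mail from the file without an extension. http://www.mozilla.org.cn/support/thunderbird/faq.html#q2.10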
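Based on that layout, a short sketch for locating the mbox files: for every .msf index file, look for the file of the same name without the extension. The function name `find_mbox_files` is made up for illustration, and the path of the Mail or ImapMail folder has to be supplied by the caller.

```python
from pathlib import Path


def find_mbox_files(mail_dir):
    """Return the mbox files under mail_dir that have a matching .msf index."""
    found = []
    for msf in sorted(Path(mail_dir).rglob("*.msf")):
        mbox_file = msf.with_suffix("")  # INBOX.msf -> INBOX
        if mbox_file.is_file():
            found.append(mbox_file)
    return found
```

Each returned path can then be opened as an mbox file by the backup script.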
