@tremby
Created May 3, 2020 20:05
Migrating email from Messenger Pro (or probably any other IMAP server) to Gmail

Migrating email from any IMAP server to Gmail

I recently needed to move some large archives of email from Messenger Pro on RISC OS to Gmail. I tried a lot of different approaches, and finally came up with a solution I believe to be the most reliable and least error-prone, using OfflineIMAP.

Using OfflineIMAP has its own set of difficulties, but I think it's a much more bulletproof solution overall than the others I tried, such as logging into both IMAP servers with Thunderbird and doing a lot of dragging and dropping, or having Gmail collect email via POP.

Pros:

  • No dragging and dropping.
  • Once the slow parts are going, they're pretty hands-off, so it can be left overnight for the download stage and then the upload stage.
  • Directories are preserved (but may need some renaming and renesting).
  • Timestamps are preserved.
  • Flags (read/unread) are preserved.
  • Can pause/resume.
  • Can run steps again to transfer more mail later, if needed.

Cons:

  • You need a Linux machine (though you can probably do this on Windows too).
  • You need to be at least somewhat comfortable with a Linux command line.
  • This involves workarounds for two OfflineIMAP bugs.
  • It looks like a lot of steps. But trust me: with a large mail archive it's a lot quicker and less error-prone than dragging and dropping.

What follows, then, is the best method I've found so far for migrating mail from Messenger Pro to Gmail. The process should be very similar for any other source IMAP server.

A brief outline

  a) One-way sync your legacy IMAP server to a local Maildir.
  b) Fix the directory structure.
  c) Two-way sync the local Maildir and Gmail, with deletions disabled.

Test first!

I recommend trying on a fresh, empty Gmail account first (with a few test emails thrown in it), to make sure the process works end to end, before trying it on an existing Gmail account with existing precious mail in it. (You can abort the first leg of the process after a few dozen emails have downloaded and continue onwards to the next step, to speed things up.)

If you don't test like this first, and instead go ahead and upload to your main Gmail account immediately, but then something goes wrong and you want to start over, you'll have a nightmare trying to delete only those mails you uploaded via this method, without touching mails which were already there in Gmail. This is especially difficult since Gmail treats groups of messages (both incoming and outgoing) as conversations rather than individual messages. You've been warned!

Instructions

  1. Enable IMAP in the Gmail account. (Gmail settings → Forwarding and POP/IMAP → IMAP access. I didn't change any of its other settings from default.)

  2. Switch on "less secure app access". (Google Account → Security → Less secure app access.) There is probably another way, but this seemed fine to me for this temporary usage.

  3. On a Linux machine, install OfflineIMAP. I used the latest version, installed via pip, which at the time of writing is v7.3.3.

  4. Configure OfflineIMAP as follows. Adjust as necessary. I called this configuration file mprotogmail.conf:

    [general]
    accounts = myaccount
    
    [Account myaccount]
    localrepository = Local
    
    # Uncomment for download only
    remoterepository = Mpro
    
    # Uncomment for upload only
    #remoterepository = Gmail
    
    [Repository Local]
    type = Maildir
    localfolders = ~/localmail
    
    # Ensure mail timestamps end up correct on Gmail
    # See https://github.com/OfflineIMAP/offlineimap/issues/662
    utime_from_header = yes
    
    # Ensure nothing found locally but not on Gmail gets deleted
    # from Gmail
    sync_deletes = no
    
    # Uncomment for download only, if your source server is Mpro
    sep = @
    # Choose a separator which doesn't appear in any of your
    # mailbox names. This is to get around an OfflineIMAP bug.
    # See https://github.com/OfflineIMAP/offlineimap/issues/663
    
    [Repository Mpro]
    type = IMAP
    remotehost = mpro.address.or.ip
    ssl = no
    remoteuser = your.mpro.username
    remotepass = anyStringSeemsToWork
    readonly = True
    
    # Speed things up a little
    maxconnections = 3
    
    # Don't download mail from particular mailboxes
    folderfilter = lambda mbox: mbox not in [
        'wastebasket',
        'spam',
        'Trash',
        ]
    
    [Repository Gmail]
    type = Gmail
    remoteuser = your.address@gmail.com
    remotepass = yourGmailPassword
    sslcacertfile = /etc/ssl/certs/ca-certificates.crt
    
    # Speed things up a little
    maxconnections = 3
    
    # Ignore special Gmail boxes
    folderfilter = lambda mbox: not mbox.startswith('[Gmail]')
    
    # Ensure nothing found on Gmail but not locally gets deleted
    # from the local repo
    sync_deletes = no
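
    The folderfilter values in this configuration are ordinary Python expressions, so they can be tried out on their own before any server is contacted. Here is a minimal sketch that copies the two lambdas from above and runs them against some made-up mailbox names (the names in sample_names are hypothetical; substitute the ones the --info run in the next step reports for you):

```python
# The two folderfilter expressions from the configuration above,
# copied so they can be exercised without running OfflineIMAP.
mpro_filter = lambda mbox: mbox not in [
    'wastebasket',
    'spam',
    'Trash',
    ]
gmail_filter = lambda mbox: not mbox.startswith('[Gmail]')

# Hypothetical mailbox names, for illustration only.
sample_names = ['inbox', 'spam', 'Spam', 'outgoing.mail', '[Gmail]/All Mail']

for name in sample_names:
    # True means the mailbox would be synced, False means it is skipped.
    print(f"{name!r}: Mpro={mpro_filter(name)}, Gmail={gmail_filter(name)}")
```

    Note that the list comparison is case-sensitive: in this sketch 'spam' is skipped but 'Spam' would still be synced, so the names in the list need to match the server's spelling exactly.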
    
  5. Do a test run to check mailbox names, and adjust configuration (folderfilter) as necessary to limit those that will download to just those you want. You probably want to filter out mailing list archives and spam boxes, for example.

    offlineimap -c mprotogmail.conf --info
    

    Run this again after each config tweak until you're satisfied.

    Note that either Mpro or OfflineIMAP doesn't handle non-Latin1 characters well. If you see strings like &IBQ- in mailbox names, it'd be a good idea to rename those mailboxes at the source in Mpro before continuing. Your mileage may vary with other IMAP servers.
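
    Strings of the form &...- look like IMAP's "modified UTF-7" encoding of mailbox names (RFC 3501, section 5.1.3). Assuming that is what you're seeing, the following sketch decodes such a name so you can tell which mailbox it refers to before renaming it. The function name decode_imap_utf7 is my own; it is not part of OfflineIMAP.

```python
import base64
import re


def decode_imap_utf7(name):
    """Decode an IMAP modified UTF-7 mailbox name (RFC 3501, section 5.1.3)."""

    def decode_shift(match):
        payload = match.group(1)
        if payload == "":
            # "&-" is the escape sequence for a literal ampersand.
            return "&"
        # Modified base64 uses "," instead of "/" and omits "=" padding.
        b64 = payload.replace(",", "/")
        b64 += "=" * (-len(b64) % 4)
        return base64.b64decode(b64).decode("utf-16-be")

    return re.sub(r"&([^-]*)-", decode_shift, name)


# ascii() keeps the output printable on any terminal encoding.
print(ascii(decode_imap_utf7("&IBQ-")))  # prints '\u2014'
```

    So the &IBQ- in the example above stands for the single character U+2014, a long dash.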

  6. Run the first stage: transfer mail from Mpro to the local machine.

    offlineimap -c mprotogmail.conf
    

    This can be run again if it fails at some point, and it will continue from where it left off. OfflineIMAP collects any errors which happen during execution and repeats them when it exits, so it's easy to tell whether it was successful.

    It can also be stopped with control-C and will wait for the current message to finish copying before exiting, and then you can continue later by running the same command again.
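
    Once a download pass has finished, you may want to spot-check that utime_from_header did its job. The sketch below assumes that the option sets each message file's modification time from the message's Date header, which is my reading of it, and lists any files in one mailbox directory where the two disagree. The function name and the tolerance are my own choices, and the demonstration runs on a throwaway directory rather than on real mail.

```python
import os
import tempfile
from datetime import timezone
from email.parser import BytesParser
from email.utils import parsedate_to_datetime


def mtime_mismatches(mailbox_dir, tolerance_seconds=2):
    """Return filenames in mailbox_dir whose mtime differs from their Date header."""
    mismatched = []
    for filename in sorted(os.listdir(mailbox_dir)):
        path = os.path.join(mailbox_dir, filename)
        with open(path, "rb") as fh:
            headers = BytesParser().parse(fh, headersonly=True)
        try:
            sent = parsedate_to_datetime(headers.get("Date", ""))
        except (TypeError, ValueError):
            mismatched.append(filename)  # missing or unparseable Date header
            continue
        if sent.tzinfo is None:
            # A "-0000" zone yields a naive datetime that is really UTC.
            sent = sent.replace(tzinfo=timezone.utc)
        if abs(os.path.getmtime(path) - sent.timestamp()) > tolerance_seconds:
            mismatched.append(filename)
    return mismatched


# Demonstration: two identical messages, only one with a matching mtime.
with tempfile.TemporaryDirectory() as cur:
    date_header = "Sun, 03 May 2020 20:05:00 +0000"
    message = b"Date: " + date_header.encode() + b"\r\nSubject: demo\r\n\r\nbody\r\n"
    for name in ("good", "stale"):
        with open(os.path.join(cur, name), "wb") as fh:
            fh.write(message)
    header_time = parsedate_to_datetime(date_header).timestamp()
    os.utime(os.path.join(cur, "good"), (header_time, header_time))
    demo_mismatches = mtime_mismatches(cur)

print(demo_mismatches)  # ['stale']
```

    On real mail you would call mtime_mismatches on a path such as a mailbox's cur directory under localmail.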

  7. Check you've downloaded the mail you expected to, and that no mailboxes you expected to have mail are empty. Here's a command which might be useful for that.

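    # Print each cur/new directory followed by its message count.
    # The path is passed as $1 so odd characters in names are safe.
    find localmail -type d \( -name cur -o -name new \) \
        -exec bash -c 'printf "%s: " "$1"; ls "$1" | wc -l' _ {} \;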
    

    At this point you may want to make a full backup to avoid having to do this download again later, should you make a mistake in the next steps. Make sure you preserve timestamps. For example:

    cp -a localmail localmail.backup
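
    If you have many mailboxes, scanning the output of the find command by eye gets tedious. As an alternative, here is a small sketch which totals cur and new per mailbox and picks out the empty ones. It assumes the flat layout I ended up with, where every mailbox is a direct child of the localfolders directory; count_messages is a name I made up, and the demonstration builds a throwaway Maildir rather than reading real mail.

```python
import os
import tempfile


def count_messages(maildir_root):
    """Return {mailbox_name: message_count} for each mailbox under maildir_root."""
    counts = {}
    for entry in sorted(os.listdir(maildir_root)):
        box = os.path.join(maildir_root, entry)
        if not os.path.isdir(os.path.join(box, "cur")):
            continue  # not a Maildir mailbox
        total = 0
        for sub in ("cur", "new"):
            subdir = os.path.join(box, sub)
            if os.path.isdir(subdir):
                total += len(os.listdir(subdir))
        counts[entry] = total
    return counts


# Demonstration on a temporary Maildir with made-up mailbox names.
with tempfile.TemporaryDirectory() as root:
    for box, n_messages in {"inbox": 2, "drafts": 0}.items():
        for sub in ("cur", "new", "tmp"):
            os.makedirs(os.path.join(root, box, sub))
        for i in range(n_messages):
            open(os.path.join(root, box, "cur", f"msg{i}"), "w").close()
    demo_counts = count_messages(root)

print(demo_counts)  # {'drafts': 0, 'inbox': 2}
empty = [name for name, n in demo_counts.items() if n == 0]
print("empty mailboxes:", empty)  # empty mailboxes: ['drafts']
```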
    
  8. If a particular bug still exists, and your source server is Mpro, you'll find that all the mailboxes are named with @ characters between each letter, and at the start and end. (These would have been dots if we had not set the sep setting.)

    The separator appearing before and after each character at this stage is a current OfflineIMAP bug, which I reported at OfflineIMAP/offlineimap#663.

    Strip all of these characters out (this is why we chose a character which doesn't appear in any mailbox names):

    rename -v "s/@//g" localmail/@?*@
    

    Beware: there is more than one program on common Linux distributions called rename. The above expects the Perl rename command, which on Ubuntu comes in the rename package (the rename shipped with util-linux is a different program with a different syntax); on some other distributions it might come in the perl package, or be called prename. If you run it with no arguments or with --help, it should mention perlexpr in the usage summary.

    If the bug doesn't still exist, you should be able to remove the sep part from the configuration file (for both stages), start over, and skip this particular step entirely.
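
    If you can't get hold of the Perl rename, or would like to see what will happen before anything is touched, the same stripping can be sketched in Python. This is only an illustration under the same assumption as above (the separator appears in no real mailbox name); plan_renames and apply_renames are names I made up. The plan step refuses to continue if two mailboxes would collapse to the same name.

```python
import os
import tempfile


def plan_renames(maildir_root, sep="@"):
    """Return [(old_name, new_name)] for mailbox directories containing sep.

    Raises ValueError if a new name would be empty or would collide
    with another mailbox.
    """
    existing = set(os.listdir(maildir_root))
    plan = []
    targets = set()
    for old in sorted(existing):
        if sep not in old:
            continue
        new = old.replace(sep, "")
        if not new or new in targets or new in existing:
            raise ValueError(f"cannot rename {old!r} to {new!r}")
        targets.add(new)
        plan.append((old, new))
    return plan


def apply_renames(maildir_root, plan):
    """Carry out a plan produced by plan_renames."""
    for old, new in plan:
        os.rename(os.path.join(maildir_root, old),
                  os.path.join(maildir_root, new))


# Demonstration on a temporary directory with made-up mailbox names.
with tempfile.TemporaryDirectory() as root:
    for name in ("@i@n@b@o@x@", "@o@u@t@g@o@i@n@g@.@m@a@i@l@"):
        os.makedirs(os.path.join(root, name, "cur"))
    demo_plan = plan_renames(root)
    for old, new in demo_plan:
        print(f"{old} -> {new}")  # dry run: nothing has been renamed yet
    apply_renames(root, demo_plan)
    demo_result = sorted(os.listdir(root))

print(demo_result)  # ['inbox', 'outgoing.mail']
```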

  9. Though Mpro can display mailboxes as nested, the parent mailboxes don't actually exist. OfflineIMAP will choke while trying to sync to Gmail since it thinks it's a true hierarchy and so expects them to exist. So let's make sure they exist:

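    # For each mailbox with a dot in its name, make sure the
    # mailbox named by the part before the first dot exists.
    ls -d localmail/*?.?* | sed 's/\..*//' | sort -u | \
        while IFS= read -r stem; do mkdir -p "$stem"/{cur,new,tmp}; done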
    

    You may also want to rename some other mailboxes. I didn't want Mpro's "outgoing.mail" to show up in Gmail as nested labels "outgoing/mail", so I renamed this to "outgoing-mail".
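
    The shell pipeline in this step takes everything before the first dot, so it only creates top-level parents. If your mailboxes nest more than one level deep (say "work.clients.acme"), the intermediate "work.clients" would also need to exist. Here is a sketch which finds every missing ancestor, assuming dot as the separator and a flat Maildir layout; the function names and the demonstration mailboxes are made up.

```python
import os
import tempfile


def missing_parents(maildir_root, sep="."):
    """Return the sorted names of ancestor mailboxes that do not exist yet."""
    existing = {
        name for name in os.listdir(maildir_root)
        if os.path.isdir(os.path.join(maildir_root, name))
        and not name.startswith(sep)  # ignore hidden directories
    }
    wanted = set()
    for name in existing:
        parts = name.split(sep)
        # Every proper prefix ("a" and "a.b" for "a.b.c") must exist.
        for depth in range(1, len(parts)):
            wanted.add(sep.join(parts[:depth]))
    return sorted(wanted - existing)


def create_mailboxes(maildir_root, names):
    """Create empty Maildir mailboxes (cur, new, tmp) for the given names."""
    for name in names:
        for sub in ("cur", "new", "tmp"):
            os.makedirs(os.path.join(maildir_root, name, sub), exist_ok=True)


# Demonstration on a temporary directory.
with tempfile.TemporaryDirectory() as root:
    create_mailboxes(root, ["work", "work.clients.acme", "lists.dev"])
    demo_missing = missing_parents(root)
    create_mailboxes(root, demo_missing)
    demo_missing_after = missing_parents(root)

print(demo_missing)        # ['lists', 'work.clients']
print(demo_missing_after)  # []
```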

  10. Edit the OfflineIMAP configuration file to be ready to upload to Gmail. Things you need to comment/uncomment are noted in comments:

    • Switch the remoterepository to Gmail by commenting one remoterepository setting and uncommenting the other.

    • Comment out the sep setting in [Repository Local] -- since we renamed the mailboxes, the correct separator is dot, which is the default.

  11. Run OfflineIMAP again to sync mail with Gmail:

    offlineimap -c mprotogmail.conf
    

    Again, if this stops it can be run again and continue where it left off.

    This will both download mail found on Gmail to the local repository and upload anything found in the local repository to Gmail. Since sync_deletes is switched off on both ends, nothing found on one end but not on the other will be deleted. Gmail will deduplicate things for you -- its "mailboxes" are really "labels", and the same email (presumably identified by its message ID) can have multiple labels. To OfflineIMAP this appears as the same message having copies in multiple mailboxes. Other than taking extra disk space and transfer time, this is harmless.
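
    If, as I guessed above, Gmail identifies messages by message ID, then a message carrying several labels will show up locally as several files sharing the same Message-ID header. Here is a rough sketch for seeing which messages in the local repository are such copies. It assumes the flat Maildir layout used throughout; mailboxes_by_message_id is a made-up name, and the demonstration uses fabricated messages in a temporary directory.

```python
import os
import tempfile
from collections import defaultdict
from email.parser import BytesParser


def mailboxes_by_message_id(maildir_root):
    """Map each Message-ID header to the sorted list of mailboxes holding a copy."""
    seen = defaultdict(set)
    for box in os.listdir(maildir_root):
        for sub in ("cur", "new"):
            subdir = os.path.join(maildir_root, box, sub)
            if not os.path.isdir(subdir):
                continue
            for filename in os.listdir(subdir):
                with open(os.path.join(subdir, filename), "rb") as fh:
                    headers = BytesParser().parse(fh, headersonly=True)
                message_id = headers.get("Message-ID")
                if message_id:
                    seen[message_id.strip()].add(box)
    return {mid: sorted(boxes) for mid, boxes in seen.items()}


# Demonstration: one message present in two mailboxes, one in a single mailbox.
with tempfile.TemporaryDirectory() as root:
    demo_messages = [
        ("inbox", "1", b"<a@example.invalid>"),
        ("archive", "2", b"<a@example.invalid>"),
        ("inbox", "3", b"<b@example.invalid>"),
    ]
    for box, filename, message_id in demo_messages:
        os.makedirs(os.path.join(root, box, "cur"), exist_ok=True)
        with open(os.path.join(root, box, "cur", filename), "wb") as fh:
            fh.write(b"Message-ID: " + message_id + b"\r\nSubject: demo\r\n\r\nbody\r\n")
    demo_index = mailboxes_by_message_id(root)

shared = {mid: boxes for mid, boxes in demo_index.items() if len(boxes) > 1}
print(shared)  # {'<a@example.invalid>': ['archive', 'inbox']}
```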

  12. Once you refresh your browser Gmail page you'll find that the mail has appeared and that mailboxes have come up as labels. Once you're happy that everything's there, you may want to do some more renaming or renesting.

    It'll also be safe to delete your "outgoing-mail" label, or whatever it is called for you, since Gmail autodetects what is sent based on the sender address and which email addresses you have configured that you can send from. You'll find your sent mail in "Sent" if Gmail is configured appropriately now or later. (Removing a label does not delete the associated mail in Gmail.)

  13. Disable "less secure app access" again once you're done.

Please test first

As I said above, please test this first on a brand new Gmail account, with nothing but some test emails in it! You should find that your test emails are still there and so is everything you are uploading.

Resetting

If something goes wrong during your test runs, you can reset everything like this:

  1. Delete your local mailbox, and OfflineIMAP's cache too for good measure:

    rm -rf ~/localmail ~/.offlineimap
    
  2. In Gmail, go to the "All mail" mailbox, select all, and delete them, perhaps except your test emails. Clearly you won't want to do this on your real Gmail account, which is one reason I urge you to test on a fresh account first.

  3. Still in Gmail, go to Bin (might be named differently like "Trash" based on your selected language) and empty it.

  4. Reset the configuration file to the download stage.
