@evanlinde
Forked from pecigonzalo/mbox2gg.py
Last active January 27, 2024 00:37
mbox to Google Groups
#
# Import an mbox file into Google Groups using the Groups Migration API
#
# To run this script, you will need to:
# 1. Be an admin on your domain
# 2. Enable API access in your domain
# 3. Create a project on the developer console
# 4. Activate the Groups Migration API for the project
# 5. Create an OAuth Client ID and Secret for the project
# 6. Install the google-api-python-client and google-auth-oauthlib
# packages into your python environment (tested in python 3)
#
# Messy Details (recorded 2018-07-02):
# 2. See https://developers.google.com/admin-sdk/groups-migration/v1/guides/prerequisites
# 3. Go to https://console.developers.google.com
#    Create a new project
#    Pick a meaningful name and set the Organization and Location to your domain name
# 4. Select your project in the developer console (https://console.developers.google.com)
#    Go to the dashboard page and click "Enable APIs and Services"
#    Find the Groups Migration API and click it
#    Click the "Enable" button
# 5. Select your project in the developer console (https://console.developers.google.com)
#    Go to the credentials page/tab and click "Create Credentials"
#    Choose OAuth Client ID
#    Click "Configure Consent Screen"
#    Set a product name. (This will appear in an authorization prompt when you attempt to run the script.)
#    Click "Save"
#    Choose "Other" for the application type and choose a name (e.g. "mbox2gg") for the OAuth client ID
#    Click "Create"
#    Your OAuth Client ID should now be listed on the Credentials page/tab. Click the download link
#    to save the json file.
#
# References:
# Groups Migration API:
# https://developers.google.com/admin-sdk/groups-migration/v1/get-start/getting-started
# Python API Client:
# https://developers.google.com/api-client-library/python/
# Google Auth / OAuth:
# http://google-auth-oauthlib.readthedocs.io/en/latest/
#
import mailbox
import time
from io import StringIO
import apiclient # pip install --upgrade google-api-python-client
from google_auth_oauthlib.flow import Flow # pip install --upgrade google-auth-oauthlib
# File containing the OAuth Client ID / Secret
client_secret_file = 'client_secret.json'
# The email address of the group to import to (e.g. "groupname@example.com")
groupId = input('Enter groupId: ')
# The mbox file you want to import
mbox_path = input('Enter mbox_path: ')
# Scope for the Groups Migration API
# See https://developers.google.com/admin-sdk/groups-migration/v1/guides/authorizing
scope = 'https://www.googleapis.com/auth/apps.groups.migration'
#
# The next few lines to handle authentication are taken almost verbatim from
# http://google-auth-oauthlib.readthedocs.io/en/latest/reference/google_auth_oauthlib.flow.html
#
# The original version of this script used the oauth2client library (which is now [2018-07-02]
# deprecated) for authentication and included a feature for reading saved credentials from a
# file. The google_auth_oauthlib library doesn't have comparable functionality (and I haven't
# spent the time to figure out how to do it) so this script will prompt for authorization every
# time it runs.
#
# Create the flow using the client secrets file from the Google API Console.
flow = Flow.from_client_secrets_file(
    client_secret_file,
    scopes=[scope],
    redirect_uri='urn:ietf:wg:oauth:2.0:oob')
# Tell the user to go to the authorization URL.
auth_url, _ = flow.authorization_url(prompt='consent')
print('Please go to this URL: {}'.format(auth_url))
# The user will get an authorization code. This code is used to get the
# access token.
code = input('Enter the authorization code: ')
flow.fetch_token(code=code)
# Can now get the credentials after the call to fetch_token()
credentials = flow.credentials
# Create an object for interacting with the Groups Migration API
#service = discovery.build('groupsmigration', 'v1', credentials=credentials)
service = apiclient.discovery.build('groupsmigration', 'v1', credentials=credentials)
# Open the mbox file
mb = mailbox.mbox(mbox_path) # The path of the mbox file to import
# Create a counter variable and find the total number of messages
# Use this for status updates as messages are uploaded
i = 1
total_messages = len(mb)
# Process messages from the mbox file one-at-a-time
for msg in mb:
    # Read the rfc822 text (i.e. the message) as a stream object
    stream = StringIO()
    stream.write(msg.as_string())
    # Create an object/description suitable for uploading
    media = apiclient.http.MediaIoBaseUpload(stream, mimetype='message/rfc822')
    # Upload message to the group archive and get the response
    response = service.archive().insert(groupId=groupId, media_body=media).execute()
    # Show a status message
    print('Message {} of {}: {}'.format(
        i,
        total_messages,
        response['responseCode'])
    )
    i = i + 1
    # Limit to no more than 10 messages per second to avoid exceeding API quota
    # https://developers.google.com/admin-sdk/groups-migration/v1/limits
    time.sleep(0.1)
print('Done.')
@Yeikop

Yeikop commented Jan 14, 2021

Hello,

Could you help me, please? I followed your instructions, but I receive this error:

mboxerrror

Thanks for your help!

@evanlinde
Author

I'm not sure why you'd be getting that particular error since this script doesn't use the oauth2client library. Perhaps you were using another script and not this one.

@Yeikop

Yeikop commented Jan 15, 2021

First of all, thanks for your help.
It seems I have made a little progress: when I launch your script, I get these two screens:
mbox1
mbox2

After that, I receive the URL to authenticate and permit the script to migrate the files, but when I enter the code, the script doesn't do anything.

Could you send an example to my personal email? Or can I send you my script so you can tell me what is happening?

Thank you so much!!!

@evanlinde
Author

The screenshots you have included do not show very much; there should be prompts for the group name and mbox file, but I do not see them.

If you are running the script above, please read lines 4-31 and 49-50 to make sure you have everything ready. Then, in your command prompt, navigate to c:\mbox or wherever you have placed the client_secret.json file and run python mbox2gg.py.

For troubleshooting, it will be helpful to copy the text from your console window. You may need to use the menu from clicking the window icon to select and copy everything. (See image below.)

image

@Yeikop

Yeikop commented Jan 16, 2021

I was finally able to migrate 50 emails to my Google Group, thank you so much. Now, when I try to migrate 400 MB into my Google Group, I receive this error:

mboxerror

Could it be the quotas?

Thanks again for your support!

@evanlinde
Author

I don't think it is a quota problem -- note how the error says "Unable to parse the raw message". Most likely, there is a problem with that message (e.g. bad encoding, problem with a header, etc.) and there could be many more problem messages.

I ran into a similar problem before. You can use the following code to replace lines 116-124 (the upload call and the status print inside the loop). It will cause the script to create an output file with the content of each message entry that fails and then continue with the rest of the mbox file. For this code to work, you will need to create a directory named errors in the same directory as your mbox file and run the script from the same directory as your mbox file (so that you only need to specify the name of the mbox file, not the whole path).

    try: 
        # Upload message to the group archive and get the response
        response = service.archive().insert(groupId=groupId, media_body=media).execute()
        print('Message {} of {}: {}'.format(
            i,
            total_messages,
            response['responseCode'])
        )
    except Exception:
        print('ERROR: Failed on message %d of %d'%(i,total_messages))
        with open('errors/%s.%04d' % (mbox_path, i), 'wb+') as f:
            f.write(msg.as_bytes())

@Yeikop

Yeikop commented Jan 17, 2021

The code seems OK, but I receive this error:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "mboxggg.py", line 126, in <module>
    with open('errors/%s.%04d' % (mbox_path, i), 'wb+') as f:
OSError: [Errno 22] Invalid argument: 'errors/C:\mboxDemo\Mail.mbox.0003'

I created an errors folder in the directory, but I don't know what the problem is.

@evanlinde
Author

Try not specifying the entire path of the mbox file (i.e. use Mail.mbox instead of C:\mboxDemo\Mail.mbox). You'll need to run the script from the C:\mboxDemo folder for this to work.
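
If you would rather keep passing the full path, one possible alternative is to build the error file name from only the file-name part of mbox_path. This is a minimal sketch, not part of the original script: error_file_path and save_failed_message are hypothetical helper names, and os.path.basename follows the path rules of the operating system the script runs on (so a C:\... path is only split correctly on Windows).

```python
import os


def error_file_path(mbox_path, index, errors_dir='errors'):
    """Build the dump-file path for a failed message.

    Only the file name of mbox_path is used, so a full path to
    Mail.mbox still yields <errors_dir>/Mail.mbox.0003 for index 3.
    """
    name = os.path.basename(mbox_path)
    return os.path.join(errors_dir, '%s.%04d' % (name, index))


def save_failed_message(raw_bytes, mbox_path, index, errors_dir='errors'):
    """Write the raw bytes of a failed message and return the file path."""
    # Create the errors directory if it does not exist yet
    os.makedirs(errors_dir, exist_ok=True)
    path = error_file_path(mbox_path, index, errors_dir)
    with open(path, 'wb') as f:
        f.write(raw_bytes)
    return path
```

Inside the except block, the two lines that open and write the file could then become save_failed_message(msg.as_bytes(), mbox_path, i).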

@Yeikop

Yeikop commented Jan 19, 2021

@evanlinde thank you so much for your support.

Now I can run the script, but I get a lot of errors during the migration. Here is the error:

Traceback (most recent call last):
  File "mboxgg.py", line 111, in <module>
    stream.write(msg.as_string())
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.1776.0_x64__qbz5n2kfra8p0\lib\email\message.py", line 158, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.1776.0_x64__qbz5n2kfra8p0\lib\email\generator.py", line 116, in flatten
    self._write(msg)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.1776.0_x64__qbz5n2kfra8p0\lib\email\generator.py", line 181, in _write
    self._dispatch(msg)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.1776.0_x64__qbz5n2kfra8p0\lib\email\generator.py", line 214, in _dispatch
    meth(msg)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.1776.0_x64__qbz5n2kfra8p0\lib\email\generator.py", line 272, in _handle_multipart
    g.flatten(part, unixfrom=False, linesep=self._NL)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.1776.0_x64__qbz5n2kfra8p0\lib\email\generator.py", line 116, in flatten
    self._write(msg)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.1776.0_x64__qbz5n2kfra8p0\lib\email\generator.py", line 181, in _write
    self._dispatch(msg)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.1776.0_x64__qbz5n2kfra8p0\lib\email\generator.py", line 214, in _dispatch
    meth(msg)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.1776.0_x64__qbz5n2kfra8p0\lib\email\generator.py", line 243, in _handle_text
    msg.set_payload(payload, charset)
  File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.8_3.8.1776.0_x64__qbz5n2kfra8p0\lib\email\message.py", line 315, in set_payload
    payload = payload.encode(charset.output_charset)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 3622-3623: ordinal not in range(128)

Thanks for your support!!!!
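
The traceback above ends inside msg.as_string(), which has to turn the whole message into text. One possible workaround, not confirmed anywhere in this thread, is to serialize with msg.as_bytes() (the same call the error-handling snippet earlier uses to dump failed messages) and hand the upload a BytesIO instead of a StringIO. The sketch below shows only that conversion; message_to_stream is a hypothetical helper name, and whether the API then accepts the affected messages is an open question.

```python
import email
from io import BytesIO


def message_to_stream(msg):
    """Return the message as a binary stream positioned at the start.

    as_bytes() writes the original bytes of a message that was parsed
    from bytes, so no text encoding step is needed.
    """
    return BytesIO(msg.as_bytes())


if __name__ == '__main__':
    # A message whose header claims us-ascii but whose body has 8-bit bytes
    raw = (b"From: a@example.com\n"
           b"Subject: test\n"
           b"Content-Type: text/plain; charset=us-ascii\n"
           b"Content-Transfer-Encoding: 8bit\n"
           b"\n"
           b"caf\xc3\xa9\n")
    demo = email.message_from_bytes(raw)
    print(len(message_to_stream(demo).getvalue()), 'bytes ready to upload')
```

In the script, the two lines that create and fill the StringIO would be replaced by stream = message_to_stream(msg); the MediaIoBaseUpload call stays the same.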

@Yeikop

Yeikop commented Mar 9, 2021

Hi Evanlinde,

Thank you so much for your script; it has been very useful for me. Now I have a problem and I don't understand why. On Windows I have no problems with your script, but when I use my Mac (MacBook Pro, 13-inch, M1, 2020, running Big Sur) I receive this error:

Captura de pantalla 2021-03-09 a las 16 33 58

I have Python 2.7.18 installed, and this screenshot shows where I keep my mbox file and your script.

Captura de pantalla 2021-03-09 a las 16 37 32

Could you help me?

Many thanks

@evanlinde
Author

I think you've missed the important part of the error in your screenshot -- it should be the message immediately below the caret character pointing to the @ sign.

But before worrying about that, try python 3. (I don't know if this script is compatible with python 2, but I know I haven't used it with python 2.)

@Yeikop

Yeikop commented Mar 10, 2021

Here is my error:

jacobogarrido@MacBook-Pro MboxGroups % python3 mboxggup.py
Traceback (most recent call last):
  File "/Users/jacobogarrido/MboxGroups/mboxggup.py", line 45, in <module>
    import apiclient # pip install --upgrade google-api-python-client
ModuleNotFoundError: No module named 'apiclient'
jacobogarrido@MacBook-Pro MboxGroups % pip install --upgrade google-api-python-client
Traceback (most recent call last):
  File "/usr/local/bin/pip", line 11, in <module>
    load_entry_point('pip==21.0.1', 'console_scripts', 'pip')()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources/__init__.py", line 489, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources/__init__.py", line 2843, in load_entry_point
    return ep.load()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources/__init__.py", line 2434, in load
    return self.resolve()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources/__init__.py", line 2440, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/Library/Python/2.7/site-packages/pip-21.0.1-py2.7.egg/pip/_internal/cli/main.py", line 60
    sys.stderr.write(f"ERROR: {exc}")

I installed Homebrew on my Mac, but I don't know what the next step is to solve this issue.

Many thanks

@Yeikop

Yeikop commented Mar 10, 2021

I fixed it. I ran the command pip3 install --upgrade google-api-python-client and now the script runs perfectly.

Thank you so much for your support.

Kind Regards !!!

@greatestradioshow

Hello,
I tried this code and for some messages the import worked, but for most of my messages I got the following error:
<HttpError 400 when requesting https://groupsmigration.googleapis.com/upload/groups/v1/groups/orla.baumgartner%40steadysense.at/archive?alt=json&uploadType=media returned "Unable to parse the raw message". Details: "[{'message': 'Unable to parse the raw message', 'domain': 'global', 'reason': 'invalid'}]">

What is the problem with my emails, and how can I resolve this error?

Thanks in advance.
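
One way to start narrowing this down, assuming the failures come from malformed messages as suggested earlier in the thread, is to scan the mbox file locally before uploading. The sketch below flags messages that cannot be serialized or that lack a few basic headers. Which headers the Groups Migration API actually requires is not stated in this thread, so the EXPECTED_HEADERS list is an assumption, and find_suspect_messages is a hypothetical helper name. A message that passes this check may still be rejected by the API.

```python
import mailbox

# Assumption: these headers are only a starting point for manual inspection
EXPECTED_HEADERS = ('From', 'Date', 'Message-ID')


def find_suspect_messages(mbox_path, expected_headers=EXPECTED_HEADERS):
    """Return (position, reason) pairs for messages worth checking by hand.

    Positions start at 1, matching the counter used in the upload loop.
    """
    suspects = []
    mb = mailbox.mbox(mbox_path)
    try:
        for position, msg in enumerate(mb, start=1):
            try:
                # Same serialization call used to dump failed messages
                msg.as_bytes()
            except Exception as exc:
                suspects.append((position, 'cannot serialize: %r' % exc))
                continue
            missing = [h for h in expected_headers if msg[h] is None]
            if missing:
                suspects.append(
                    (position, 'missing headers: ' + ', '.join(missing)))
    finally:
        mb.close()
    return suspects
```

Calling find_suspect_messages('Mail.mbox') and printing each pair gives a list of message positions to compare against the ones that fail during upload.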
