@nmukerje
Last active January 11, 2018 01:33
Exports a Zeppelin Notebook in S3 to a Python file in S3
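import json
import boto3

def notebook2py(nb_bucket, nb_key, py_bucket, py_key):
    """Copy the enabled %pyspark paragraphs of a Zeppelin notebook in S3 to a .py file in S3."""
    s3c = boto3.client('s3')
    # Download and parse the notebook JSON.
    obj = s3c.get_object(Bucket=nb_bucket, Key=nb_key)
    content = json.loads(obj['Body'].read())
    # Keep enabled %pyspark paragraphs; [8:] drops the 8-character '%pyspark' prefix.
    notebook_text = ['\n' + item['text'][8:] for item in content['paragraphs']
                     if item.get('config', {}).get('enabled') is True
                     and item.get('text', '').startswith('%pyspark')]
    # Upload the joined code as UTF-8 bytes (no StringIO wrapper needed).
    s3c.put_object(Bucket=py_bucket, Key=py_key, Body='\n'.join(notebook_text).encode('utf-8'))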
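The paragraph filter can be tried locally without S3. This is a minimal sketch, assuming the notebook JSON has the `paragraphs` / `config` / `text` layout that the function above reads. The helper name `extract_pyspark` and the sample notebook are hypothetical, made up for illustration.

```python
import json


def extract_pyspark(notebook_json):
    """Return the code of all enabled %pyspark paragraphs as one string."""
    content = json.loads(notebook_json)
    parts = []
    for item in content['paragraphs']:
        # Empty paragraphs may have no 'text' key, so fall back to ''.
        text = item.get('text', '')
        enabled = item.get('config', {}).get('enabled') is True
        if enabled and text.startswith('%pyspark'):
            # Drop the interpreter directive, keep the code that follows it.
            parts.append('\n' + text[len('%pyspark'):])
    return '\n'.join(parts)


# Hypothetical notebook: two enabled %pyspark paragraphs, one markdown
# paragraph, and one disabled %pyspark paragraph.
sample = json.dumps({
    'paragraphs': [
        {'text': '%pyspark\nprint(1)', 'config': {'enabled': True}},
        {'text': '%md\n# notes', 'config': {'enabled': True}},
        {'text': '%pyspark\nprint(2)', 'config': {'enabled': False}},
        {'text': '%pyspark\nx = 3', 'config': {'enabled': True}},
    ]
})

print(extract_pyspark(sample))
```

Only the first and last paragraphs survive: the markdown paragraph fails the prefix check and the disabled one fails the `enabled` check.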