@yuanzhaoYZ
Last active July 9, 2022 00:48
S3 backed notebooks for Zeppelin

7. S3 backed notebooks for Zeppelin

  • Create zeppelin-site.xml from the template:
sudo cp /etc/zeppelin/conf/zeppelin-site.xml.template /etc/zeppelin/conf/zeppelin-site.xml
  • Edit the file /etc/zeppelin/conf/zeppelin-site.xml with an editor of your choice, for example:
sudo vim /etc/zeppelin/conf/zeppelin-site.xml
  • Uncomment the XML block under the line that reads
<!-- Amazon S3 notebook storage -->
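  • Edit the values for the S3 BUCKET name (the bare bucket name, without an s3:// prefix) and the FOLDER name (the top-level folder inside the bucket, set through zeppelin.notebook.s3.user). Make sure the bucket and folder already exist in S3: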
<!-- Amazon S3 notebook storage -->
<!-- Creates the following directory structure: s3://{bucket}/{username}/{notebook-id}/note.json -->
<property>  
  <name>zeppelin.notebook.s3.user</name>
  <value>zeppelin</value> <!-- FOLDER -->
  <description>user name for s3 folder structure</description>
</property>

<property>
  <name>zeppelin.notebook.s3.bucket</name>
  <value>bootcamp.workspace.intellinum.co</value> <!-- BUCKET -->
  <description>bucket name for notebook storage</description>
</property>

<property>
  <name>zeppelin.notebook.s3.endpoint</name>
  <value>s3.amazonaws.com</value>
  <description>endpoint for s3 bucket</description>
</property>

<property>
  <name>zeppelin.notebook.storage</name>
  <value>org.apache.zeppelin.notebook.repo.S3NotebookRepo</value>
  <description>notebook persistence layer implementation</description>
</property>
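
To make the BUCKET and FOLDER values concrete, here is a minimal shell sketch that builds the S3 location implied by the two properties and lists it with the AWS CLI. It assumes the AWS CLI is installed and has credentials that can read the bucket. The helper names notebook_root_uri and check_notebook_root are made up for illustration and are not part of Zeppelin or the AWS CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of Zeppelin or the AWS CLI.

# Build the S3 URI of the notebook root from the two property values:
#   $1 = zeppelin.notebook.s3.bucket (bare bucket name, no "s3://")
#   $2 = zeppelin.notebook.s3.user   (top-level folder in the bucket)
notebook_root_uri() {
  local bucket="${1#s3://}"   # tolerate an accidental s3:// prefix
  bucket="${bucket%%/*}"      # drop anything after the bucket name
  local user="${2#/}"         # strip one leading slash
  user="${user%/}"            # strip one trailing slash
  printf 's3://%s/%s/\n' "$bucket" "$user"
}

# List the folder so you can see whether it exists and what it contains.
check_notebook_root() {
  aws s3 ls "$(notebook_root_uri "$1" "$2")"
}
```

With the values used in this gist, check_notebook_root bootcamp.workspace.intellinum.co zeppelin would list s3://bootcamp.workspace.intellinum.co/zeppelin/.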
  • Comment out the next property to disable local Git notebook storage (the default), leaving the S3 entry as the only active zeppelin.notebook.storage property:
<property>
  <name>zeppelin.notebook.storage</name>
  <value>org.apache.zeppelin.notebook.repo.GitNotebookRepo</value>
  <description>versioned notebook persistence layer implementation</description>
</property>
  • Restart Zeppelin
sudo /sbin/stop zeppelin
sudo /sbin/start zeppelin
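
The stop/start commands above are the upstart form. On nodes where services are managed by systemd instead, the equivalent is likely systemctl, so the sketch below picks whichever is present. Treat the systemd branch, and the assumption that the service is also named zeppelin there, as something to verify on your cluster; restart_zeppelin, run, and the DRY_RUN switch are illustrative names.

```shell
#!/usr/bin/env bash
# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Restart Zeppelin with systemd if systemctl exists, otherwise with upstart.
restart_zeppelin() {
  if command -v systemctl >/dev/null 2>&1; then
    run sudo systemctl stop zeppelin    # assumed service name under systemd
    run sudo systemctl start zeppelin
  else
    run sudo /sbin/stop zeppelin        # upstart form used in this gist
    run sudo /sbin/start zeppelin
  fi
}
```

Running DRY_RUN=1 restart_zeppelin prints the two commands without executing them, which is a cheap way to see which branch a node would take.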
  • Access Zeppelin
http://master-public-dns-name:8890/

Reference:

k2ev commented Jul 9, 2022

Can you please add concrete examples?

  1. zeppelin.notebook.s3.user
    The example value is zeppelin.
    What value should go here? You describe it as FOLDER; what does that mean?

  2. Similarly, how should the bucket name be defined?
    Should it be just xyz, or s3://xyz/folder?

  3. An example of the S3 endpoint.
