@hoangdh
Last active June 6, 2024 08:48
A few steps for installing JupyterHub

Install Jupyter via conda

conda install -c conda-forge jupyterhub  # installs jupyterhub and proxy
conda install jupyterlab notebook  # needed if running the notebook servers in the same environment

Configuration file

Filename: jupyterhub_config.py

c = get_config()  #noqa
c.JupyterHub.ip = '0.0.0.0'
c.JupyterHub.port = 8080
c.Authenticator.admin_users = { 'hoangdh' }
# c.LocalAuthenticator.create_system_users=True
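# Note: FileContentsManager belongs to the single-user Jupyter server, so this line may need to
# live in that server's config (e.g. jupyter_server_config.py) to take effect
c.FileContentsManager.delete_to_trash = False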

Systemd

Filename: /lib/systemd/system/jupyterhub.service

[Unit]
Description=JupyterHub Server

[Service]
Type=simple
User=jupyter
Group=jupyter
WorkingDirectory=/data/jupyterhub/
Environment="PATH=/data/softs/anaconda/envs/jupyterhub/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
ExecStart=/data/softs/anaconda/envs/jupyterhub/bin/jupyterhub -f /data/jupyterhub/jupyterhub_config.py 

[Install]
WantedBy=multi-user.target
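
A unit file like the one above can be sanity-checked before it is installed. The following is a minimal sketch, not part of the original setup: `check_unit` is a hypothetical helper that parses the unit text with Python's `configparser` and flags two common mistakes. It assumes the unit has no repeated keys, and it relies on systemd service types being lowercase, case-sensitive values such as `simple`.

```python
import configparser

# Service types accepted by systemd; the values are lowercase and case-sensitive.
VALID_TYPES = {
    "simple", "exec", "forking", "oneshot",
    "dbus", "notify", "notify-reload", "idle",
}


def check_unit(text):
    """Return a list of problems found in a systemd service unit (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key case as written in the unit file
    parser.read_string(text)

    problems = []
    service = parser["Service"] if parser.has_section("Service") else {}

    # systemd falls back to "simple" when Type= is absent and ExecStart= is set.
    service_type = service.get("Type", "simple")
    if service_type not in VALID_TYPES:
        problems.append(f"Type={service_type!r} is not a valid service type")

    exec_start = service.get("ExecStart", "")
    if not exec_start.startswith("/"):
        problems.append("ExecStart should begin with an absolute path")
    return problems
```

Calling `check_unit(open("/lib/systemd/system/jupyterhub.service").read())` returns an empty list when neither problem is present. The check is deliberately narrow; `systemd-analyze verify` remains the more thorough tool.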

Create a Python environment for Jupyter

  • Create a user for the hub
useradd -m -d /data/users/bi_shark/ -s /bin/bash bi_shark
passwd bi_shark
  • Initialize conda for the user
su - bi_shark
/data/bigdata/anaconda3/bin/conda init
  • Log in again and create a new environment for the user
conda create -n bi_shark python=3.9 -y
  • Install a new kernel for Jupyter
conda activate bi_shark
pip install --upgrade pip
pip install ipykernel
python -m ipykernel install --user --name="bi_shark" --display-name="Python 3.9 (Conda)"
  • Add user-specific environment variables

vi /data/users/bi_shark/.local/share/jupyter/kernels/bi_shark/run.sh

#!/usr/bin/bash
export JAVA_HOME=/opt/softs/jdk1.8.0_331
export HADOOP_HOME=/opt/softs/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

export CLASSPATH=$($HADOOP_HOME/bin/hdfs classpath --glob)

export HADOOP_PREFIX=$HADOOP_HOME

export SPARK_LOCAL_HOSTNAME=10.10.10.10
export SPARK_HOME=/opt/softs/spark

# Change me
export USER_ENV="/data/users/bi_shark/.conda/envs/bi_shark"

export PATH="${USER_ENV}/bin:/data/bigdata/anaconda3/condabin/:${PATH}"
exec ${USER_ENV}/bin/python -m ipykernel "$@"

chmod +x /data/users/bi_shark/.local/share/jupyter/kernels/bi_shark/run.sh

  • Edit the kernel's startup file

Replace the kernel's launch command with the script created above.

vi /data/users/bi_shark/.local/share/jupyter/kernels/bi_shark/kernel.json

{
 "argv": [
  "/data/users/bi_shark/.local/share/jupyter/kernels/bi_shark/run.sh",
  "-f",
  "{connection_file}"
 ],
 "display_name": "Python 3.9 (Conda)",
 "language": "python",
 "metadata": {
  "debugger": true
 }
}
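
The same kernel.json can also be generated programmatically rather than edited by hand. This is a minimal sketch under stated assumptions: `write_kernel_spec` is a hypothetical helper, and it assumes the wrapper script is named `run.sh` and sits in the kernel directory, as in the steps above.

```python
import json
from pathlib import Path


def write_kernel_spec(kernel_dir, display_name):
    """Write a kernel.json whose argv points at run.sh in the same directory."""
    kernel_dir = Path(kernel_dir).expanduser().resolve()
    kernel_dir.mkdir(parents=True, exist_ok=True)

    spec = {
        # An absolute path to the wrapper script, followed by the arguments
        # Jupyter passes to the kernel process.
        "argv": [str(kernel_dir / "run.sh"), "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
        "metadata": {"debugger": True},
    }

    path = kernel_dir / "kernel.json"
    path.write_text(json.dumps(spec, indent=1) + "\n", encoding="utf-8")
    return path
```

For example, `write_kernel_spec("/data/users/bi_shark/.local/share/jupyter/kernels/bi_shark", "Python 3.9 (Conda)")` would produce a file equivalent to the one listed above. Building the path from the directory itself avoids typing the kernel name twice.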

Bonus: Script that creates a Conda environment and its matching kernel automatically

#!/bin/bash

ENV_NAME="${USER}_p${1}"
ENV_DISPLAYNAME="Python ${1} (Conda)"

conda create --name "${ENV_NAME}" python="${1}" -y
eval "$(conda shell.bash hook)"
conda activate ${ENV_NAME}
pip install --upgrade pip
pip install ipykernel
python -m ipykernel install --user --name="${ENV_NAME}" --display-name="${ENV_DISPLAYNAME}"

cat > ~/.local/share/jupyter/kernels/${ENV_NAME}/run.sh << EOF
#!/usr/bin/bash
export JAVA_HOME=/opt/softs/jdk1.8.0_331
export HADOOP_HOME=/opt/softs/hadoop
export HADOOP_CONF_DIR=\$HADOOP_HOME/etc/hadoop

export CLASSPATH=\$(\$HADOOP_HOME/bin/hdfs classpath --glob)

export HADOOP_PREFIX=\$HADOOP_HOME

export SPARK_LOCAL_HOSTNAME=10.10.10.10
export SPARK_HOME=/opt/softs/spark

# Change me
export USER_ENV="/data/users/${USER}/.conda/envs/${ENV_NAME}"

export PATH="\${USER_ENV}/bin:/data/bigdata/anaconda3/condabin/:\${PATH}"
exec \${USER_ENV}/bin/python -m ipykernel "\$@"
EOF

chmod +x ~/.local/share/jupyter/kernels/${ENV_NAME}/run.sh

cat > ~/.local/share/jupyter/kernels/${ENV_NAME}/kernel.json << EOF
{
 "argv": [
  "${HOME}/.local/share/jupyter/kernels/${ENV_NAME}/run.sh",
  "-f",
  "{connection_file}"
 ],
 "display_name": "${ENV_DISPLAYNAME}",
 "language": "python",
 "metadata": {
  "debugger": true
 }
}
EOF
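
After the script has run, it can be worth confirming that the generated kernel.json is launchable. As far as I know, Jupyter starts the kernel process directly from `argv` without going through a shell, so a leading `~` in the path is not expanded. The sketch below uses a hypothetical helper, `check_kernel_spec`, and assumes a POSIX system; it checks that the first argv entry is an absolute, executable path and that the connection-file placeholder is present.

```python
import json
import os


def check_kernel_spec(kernel_json_path):
    """Return a list of problems found in a kernel.json file (hypothetical helper)."""
    with open(kernel_json_path, encoding="utf-8") as fh:
        spec = json.load(fh)

    argv = spec.get("argv", [])
    problems = []

    if not argv:
        problems.append("argv is empty")
    elif not os.path.isabs(argv[0]):
        # Covers "~/..." as well as bare relative paths.
        problems.append(f"argv[0] is not an absolute path: {argv[0]}")
    elif not os.access(argv[0], os.X_OK):
        problems.append(f"argv[0] is not executable: {argv[0]}")

    if "{connection_file}" not in argv:
        problems.append("argv lacks the {connection_file} placeholder")
    return problems
```

An empty list from `check_kernel_spec(os.path.expanduser("~/.local/share/jupyter/kernels/bi_tuna_p3.10/kernel.json"))` suggests the kernel should at least start; it does not verify the environment variables set inside run.sh.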

Example: Set up a new environment for user bi_tuna

useradd -m -d /data/users/bi_tuna/ -s /bin/bash bi_tuna
passwd bi_tuna

Switch to user bi_tuna, copy the script over, and run it to create a new environment with Python 3.10

su - bi_tuna
/data/bigdata/anaconda3/bin/conda init
logout
su - bi_tuna
./script.sh 3.10

Reference: https://help.rc.ufl.edu/doc/Managing_Python_environments_and_Jupyter_kernels

Install PyArrow

pip install pyarrow

export JAVA_HOME=/opt/softs/jdk1.8.0_331
export HADOOP_HOME=/opt/softs/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_HOME=/opt/softs/spark
export PATH=${JAVA_HOME}/bin:${PATH}
export CLASSPATH=`/opt/softs/hadoop/bin/hdfs classpath --glob`

Note: for CLASSPATH, run the command and copy its output into the variable.
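
Inside a notebook, the same CLASSPATH step can be done from Python before PyArrow's HDFS support is first used. This is a minimal sketch, assuming the `hdfs` binary lives at the path used above; `hadoop_classpath` and `export_classpath` are hypothetical helpers, and the `run` parameter exists only so the command can be swapped out when testing.

```python
import os
import subprocess


def hadoop_classpath(hdfs_bin="/opt/softs/hadoop/bin/hdfs", run=subprocess.run):
    """Return the output of `hdfs classpath --glob` as a single string."""
    result = run(
        [hdfs_bin, "classpath", "--glob"],
        capture_output=True,
        text=True,
        check=True,  # raise if the hdfs command fails
    )
    return result.stdout.strip()


def export_classpath(**kwargs):
    """Copy the Hadoop classpath into this process's CLASSPATH variable."""
    os.environ["CLASSPATH"] = hadoop_classpath(**kwargs)
```

Call `export_classpath()` once at the top of the notebook, before connecting to HDFS through PyArrow. The variable is set only for the current kernel process, so it has to be repeated in each new kernel unless it is exported in run.sh instead.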

hoangdh commented May 22, 2024

Configuring with LDAP

pip install mysql-connector-python
pip install jupyterhub-ldapauthenticator

Filename: jupyterhub_config.py

import os
import pwd
import subprocess

c = get_config()  #noqa
c.JupyterHub.ip = '0.0.0.0'
c.JupyterHub.port = 8080
c.Authenticator.admin_users = { 'hoangdh' }
c.JupyterHub.log_level = 'DEBUG'

# LDAP
c.JupyterHub.authenticator_class = 'ldapauthenticator.LDAPAuthenticator'
c.LDAPAuthenticator.server_address = 'ldap.local'
c.LDAPAuthenticator.server_port =  389
c.LDAPAuthenticator.use_ssl = False
c.LDAPAuthenticator.lookup_dn_search_user = 'cn=admin,ou=Technical,dc=ldap,dc=local'
c.LDAPAuthenticator.lookup_dn_search_password = 'PassWord'
c.LDAPAuthenticator.bind_dn_template = [ 'uid={username},ou=Technical,dc=ldap,dc=local' ]
c.LDAPAuthenticator.user_search_base = 'ou=Technical,dc=ldap,dc=local'
c.LDAPAuthenticator.user_attribute = 'uid'
c.LDAPAuthenticator.user_search_filter = '(&(objectClass=person)(uid={username}))'

notebook_dir = '/data/hub/userdata/{username}/notebooks'
c.Spawner.notebook_dir = notebook_dir

# Create the user notebook directory if it doesn't exist
def create_notebook_dir(spawner):
    username = spawner.user.name
    notebook_path = notebook_dir.format(username=username)
    if not os.path.exists(notebook_path):
        os.makedirs(notebook_path, exist_ok=True)
        # Look up the account once; LDAP users must be resolvable on this host (e.g. through SSSD)
        user_info = pwd.getpwnam(username)
        uid, gid = user_info.pw_uid, user_info.pw_gid
        # Hand the directory to the user recursively, using numeric IDs so that
        # a group named after the user does not have to exist
        subprocess.check_call(['chown', '-R', f'{uid}:{gid}', notebook_path])

c.Spawner.pre_spawn_hook = create_notebook_dir
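
The ownership step in the hook can also be written without shelling out to `chown`. The sketch below is an alternative, not the approach used in the config above: `ensure_owned_dir` is a hypothetical helper that creates the directory and walks it with `os.walk`, and its `chown` parameter exists only so the function can be exercised without root privileges. The `uid` and `gid` values are expected to come from `pwd.getpwnam(username)`.

```python
import os


def ensure_owned_dir(path, uid, gid, chown=os.chown):
    """Create `path` if needed and give it, and everything under it, to uid:gid."""
    os.makedirs(path, exist_ok=True)
    chown(path, uid, gid)

    # Pure-Python stand-in for `chown -R`: visit every directory and file.
    for root, dirs, files in os.walk(path):
        for name in dirs + files:
            chown(os.path.join(root, name), uid, gid)
```

Inside a pre-spawn hook this would be called as `ensure_owned_dir(notebook_path, user_info.pw_uid, user_info.pw_gid)`. Like the subprocess version, it only succeeds when the hub process is allowed to change ownership, which usually means running as root.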
