@izikeros
Last active December 16, 2023 03:22
Install package in colab and jupyter notebook

Installation of Git Packages for Jupyter and Google Colab

This README walks through installing a package directly from a Git repository in a way that works in both Google Colab and Jupyter Notebooks. The snippet detects which environment it is running in and executes the commands appropriate for it.

Workflow Explanation

Extension Loading:

%load_ext autoreload
%autoreload 2

These lines enable the autoreload extension. With %autoreload 2, IPython reloads all imported modules before executing each cell, so edits to the package source take effect without manually reloading modules or restarting the kernel.
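For readers who want to see what autoreload saves them from, here is a minimal sketch of the manual equivalent using the standard importlib.reload. The module name demo_mod and the temporary directory are hypothetical and exist only for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway module on disk (demo_mod is a made-up name).
workdir = Path(tempfile.mkdtemp())
module_file = workdir / "demo_mod.py"
module_file.write_text("VALUE = 1\n")
sys.path.insert(0, str(workdir))

import demo_mod

first_value = demo_mod.VALUE

# Edit the source. The new text has a different length, so any cached
# bytecode for the old version is treated as stale.
module_file.write_text("VALUE = 2  # edited\n")

# Without autoreload, the change is only picked up after an explicit reload.
importlib.invalidate_caches()
demo_mod = importlib.reload(demo_mod)
second_value = demo_mod.VALUE
```

With %autoreload 2 active, the explicit importlib.reload call is not needed; the next cell execution sees the edited module.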

Importing Necessary Modules:

import sys
import os

We import the sys and os modules that will allow us to interact with the operating system and Python’s runtime environment.

Cloning the Repository:

try:  # When on Google Colab, clone the repository to download any necessary cache.
    import google.colab
    repo_path = '<package_repo>'
    !git -C $repo_path pull origin || git clone <github_repo_url> $repo_path
except ImportError:
    repo_path = '.'  # Use the local path if not on Google Colab

We attempt to import google.colab to check if we are in a Google Colab environment. If this import is successful, it means we are running the code on Google Colab and hence we carry out operations suited for it.

In the Colab branch, repo_path is set to the placeholder string <package_repo>, which you should replace with your repository name. The shell line first tries git pull inside that directory; if the pull fails (typically because the directory does not exist yet), it falls back to cloning the repository from the GitHub URL <github_repo_url>.
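The pull-or-clone step can also be written in plain Python, which is handy outside a notebook where the ! shell syntax is unavailable. The sketch below is an illustration, not the gist's code: sync_command and sync_repo are hypothetical helper names, and it decides based on the presence of a .git directory rather than on whether git pull fails.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def sync_command(repo_path: str, repo_url: str) -> list[str]:
    """Return the git command to run: pull if a checkout exists, else clone."""
    if (Path(repo_path) / ".git").exists():
        return ["git", "-C", repo_path, "pull", "origin"]
    return ["git", "clone", repo_url, repo_path]


def sync_repo(repo_path: str, repo_url: str) -> None:
    """Run the chosen git command and raise CalledProcessError on failure."""
    subprocess.run(sync_command(repo_path, repo_url), check=True)
```

Calling sync_repo("my_repo", "<github_repo_url>") would then update an existing checkout or create a new one, assuming git is available on the PATH.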

If google.colab cannot be imported, the code is not running on Colab but, for example, in a local Jupyter notebook. In this case, repo_path is set to the current directory, '.'.
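The environment check can be pulled into a small function so that it does not rely on a try/except around the whole setup block. This is a sketch under assumptions: in_colab and choose_repo_path are hypothetical names, and the check only looks for the google.colab module rather than importing it.

```python
import importlib.util


def in_colab() -> bool:
    """Return True if the google.colab module can be located."""
    try:
        return importlib.util.find_spec("google.colab") is not None
    except (ImportError, ValueError):
        # Raised when the parent "google" package is missing or malformed.
        return False


def choose_repo_path(colab_path: str, local_path: str = ".") -> str:
    """Pick the repository path the same way the setup cell does."""
    return colab_path if in_colab() else local_path
```

For example, repo_path = choose_repo_path("<package_repo>") yields the repository directory on Colab and '.' everywhere else.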

Checking If 'repo_path' is in sys.path:

if repo_path not in sys.path:
    sys.path.append(repo_path)

We check if our repo_path is a part of sys.path. If not, we add it. This allows Python to look for libraries in the specified repository when importing.
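One caveat with a relative path such as '.' is that it is resolved against the current working directory at import time, so it can silently point elsewhere after a directory change. A possible variation, shown here as a sketch with the hypothetical helper add_to_sys_path, stores the absolute path instead:

```python
import os
import sys


def add_to_sys_path(path: str) -> str:
    """Append the absolute form of path to sys.path once; return that form."""
    resolved = os.path.abspath(path)
    if resolved not in sys.path:
        sys.path.append(resolved)
    return resolved
```

Calling the helper repeatedly with the same directory leaves a single entry in sys.path, which matches the intent of the membership check above.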

Checking If Package is Installed & Installing the Package:

import pkg_resources

if "<package_name>" not in {pkg.key for pkg in pkg_resources.working_set}:
    !pip install -U pip
    !pip install <package_name>
    # !pip install -e $repo_path

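We first import the pkg_resources module, which ships with setuptools and exposes the set of installed distributions as working_set. Then, we check if the package <package_name> (replace it with your desired package's name) is installed. Each pkg.key is the distribution name in lowercase, so write <package_name> in lowercase here. Note that pkg_resources is deprecated in recent setuptools releases.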

If the package is not installed, we update pip and then install the required package.

The commented line !pip install -e $repo_path would install the package in editable mode. If this is the mode you desire, uncomment this line.
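Because pkg_resources is deprecated, the same check can be sketched with the standard library's importlib.metadata (Python 3.8+). This is an alternative to the gist's approach, not a copy of it; is_installed and ensure_installed are hypothetical helper names, and pip is invoked through sys.executable so that the install targets the interpreter running the kernel.

```python
import subprocess
import sys
from importlib import metadata


def is_installed(dist_name: str) -> bool:
    """Return True if a distribution with this name is installed."""
    try:
        metadata.version(dist_name)
        return True
    except metadata.PackageNotFoundError:
        return False


def ensure_installed(dist_name: str) -> bool:
    """Install dist_name if it is missing; return True if pip was run."""
    if is_installed(dist_name):
        return False
    subprocess.check_call([sys.executable, "-m", "pip", "install", dist_name])
    return True
```

In a notebook, ensure_installed("<package_name>") would stand in for the if block above; for an editable install you would still run pip install -e against repo_path yourself.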

Importing the Package:

import <package_name>

Finally, we import the package for use. Remember to replace <package_name> with your package name.

Conclusion

By following these steps, you can install a package from a Git repository on both Google Colab and Jupyter notebooks. The same cell works unchanged in both environments, so you do not have to maintain separate installation steps for each.

Please make sure to replace <package_repo>, <github_repo_url>, and <package_name> with your specific repository name, GitHub URL, and package name respectively.

Credits

The snippet is inspired by the setup part of the intro notebook of the great DSPy package.

%load_ext autoreload
%autoreload 2

import sys
import os

try:  # When on Google Colab, clone the repository to download any necessary cache.
    import google.colab
    repo_path = '<package_repo>'  # replace '<package_repo>' with the name of your repository
    !git -C $repo_path pull origin || git clone <github_repo_url> $repo_path  # replace '<github_repo_url>' with the URL of your repo
except ImportError:
    repo_path = '.'  # Use the local path if not on Google Colab

if repo_path not in sys.path:
    sys.path.append(repo_path)

# Set up cache for this notebook, modify if needed for your package
os.environ["NOTEBOOK_CACHEDIR"] = os.path.join(repo_path, 'cache')

import pkg_resources  # Check if package is installed

if "<package_name>" not in {pkg.key for pkg in pkg_resources.working_set}:  # replace '<package_name>' with your package name
    !pip install -U pip
    !pip install <package_name>  # replace '<package_name>' with your package name
    # !pip install -e $repo_path  # Uncomment this line if you want to install the package in editable mode

import <package_name>  # replace '<package_name>' with your package name