# Install and access FastAI course locally on a Nvidia Jetson natively as user
# Steps
1) set up the host Jetson with the newest L4T 32.6.1 (Ubuntu 18.04) with JetPack
2) build a custom Python3.6 fastai Jupyter kernel natively on the Jetson host so the full CUDA and other header files are available for compiling
3) test
# References
https://docs.fast.ai/
https://github.com/fastai
https://docs.nvidia.com/deeplearning/frameworks/index.html
https://catalog.ngc.nvidia.com/orgs/nvidia/containers/l4t-ml
https://qooba.net/2020/05/10/fastai-with-tensorrt-on-jetson-nano/
https://github.com/streicherlouw/fastai2_jetson_nano
https://forums.fast.ai/t/platform-nvidia-jetson-xavier-nx/72119
https://www.pyimagesearch.com/2019/05/06/getting-started-with-the-nvidia-jetson-nano/
https://jkjung-avt.github.io/setting-up-xavier-nx/
https://pypi.org/project/jetson-stats/
# testing on Jetson TX2 & Nano 4Gb
# new flash with L4T 32.6.1 with Jetpack 4.6
# TX2 - 8Gb ram & 250Gb SSD
# 1st training cell imports fastai & downloads images/model, then errors, needs debugging
# Nano - 32Gb SD card ==> errors, needs debugging
# use >=64Gb SD card for ML, 32Gb runs out of space quickly
# training will be very slow for most models on Nano
# on Jetson [ssh or local console as "user"]
$ sudo apt update && sudo apt upgrade -y
$ sudo apt install -y curl htop vim git python3-pip
$ sudo apt install -y build-essential cmake
$ sudo apt install -y libatlas-base-dev gfortran
$ sudo apt install -y libhdf5-serial-dev hdf5-tools
$ sudo apt install -y libopenblas-base libopenblas-dev libopenmpi-dev
# fix docker error when using Jetpack v4.6
https://github.com/dusty-nv/jetson-containers/issues/108
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt-get update && sudo apt-get upgrade -y
# $ sudo apt-get install nvidia-docker2=2.8.0-1
# ========================================================
# for Python3.6 installed in user
# doesn't work from a venv, may work with work-around
# https://stackoverflow.com/questions/55600132/installing-local-packages-with-python-virtualenv-system-site-packages#55600285
# installs some out of date packages
# don't just blindly copy & run, versions & options change
# items in [] are notes or options, don't include the brackets in cmd
#-------------------------------------
$ mkdir -p ~/Development/Tools/jupyter_kernels
$ cd ~/Development/Tools/jupyter_kernels
# insert in .bashrc or run from cmd line each time
$ export PATH=$HOME/.local/bin:$PATH
$ pip3 install -U --user pip setuptools wheel
$ pip3 install --user "Cython>=0.25.0,<3.0" --no-cache-dir [--no-binary :all:]
# install matplotlib from source, need <v3.4 for python3.6 [current v3.5.1]
https://matplotlib.org/stable/users/installing/index.html
https://github.com/matplotlib/matplotlib/issues/16027
https://github.com/matplotlib/matplotlib/blob/v3.3.4/INSTALL.rst
# need to build whl
$ sudo apt install libfreetype6-dev
$ pip3 install --user "numpy>=1.15" setuptools "cycler>=0.10.0" "python-dateutil>=2.1" "kiwisolver>=1.0.0" "pillow>=6.2" "pyparsing>=2.0.3"
$ cd ~/Software-local/Matplotlib
$ git clone https://github.com/matplotlib/matplotlib.git
$ cd matplotlib
$ git checkout v3.3.4
$ pip3 install --user .
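# optional sanity check [not part of the original steps]: confirm the source-built matplotlib
# imports and reports the v3.3.4 tag checked out above
$ python
>> import matplotlib
>> print(matplotlib.__version__)
# expect 3.3.4, no import errors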
$ cd ~/Development/Tools/jupyter_kernels
$ pip3 install --user jupyterlab ipywidgets pudb
# initial test of fastai on TX2
# torchvision 0.11.2 errors on import
https://discuss.pytorch.org/t/failed-to-load-image-python-extension-could-not-find-module/140278
https://github.com/pytorch/vision/blob/main/torchvision/io/image.py
https://forums.developer.nvidia.com/t/pytorch-for-jetson-version-1-10-now-available/72048
# install older version of torchvision for Jetsons, needs previous version of torch
https://pytorch.org/get-started/locally/
https://forums.developer.nvidia.com/t/pytorch-for-jetson-version-1-10-now-available/72048
https://stackoverflow.com/questions/62407851/pytorch-doesnt-find-a-cuda-device#
# download Jetson torch v1.10.0 whl [only for Python3.6]
$ wget https://nvidia.box.com/shared/static/fjtbno0vpo676a25cgvuqc1wty0fkkg6.whl -O torch-1.10.0-cp36-cp36m-linux_aarch64.whl
$ pip3 install --user torch-1.10.0-cp36-cp36m-linux_aarch64.whl
$ pip3 install --user torchvision==0.11.1
# the torchvision install forces numpy v1.19.5
# if that version causes errors, reinstall a newer numpy instead:
$ pip3 install "numpy==1.21.5" --no-cache-dir [--no-binary :all: - errors w/ --no-binary]
# test in python console
$ python
>> import torch
>> x = torch.rand(5, 3)
>> print(x)
# test gpu driver & cuda availability
>> import torch
>> torch.cuda.is_available()
==> works
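# optional extra check [not in the original notes]: also confirm torchvision itself imports,
# since v0.11.2 was erroring on import; the version shown assumes the 0.11.1 pin above
>> import torchvision
>> print(torchvision.__version__)
# expect 0.11.1 with no "Failed to load image Python extension" error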
# test numpy
https://numpy.org/doc/stable/reference/testing.html
$ python
>> import numpy
>> numpy.test(label='slow')
==> no errors
# install the dependencies step by step because a plain "pip install fastai" errors out
# check for errors and build from source any individual modules that failed
# check the latest release of thinc that supports Python3.6 [v7.4.3 for 3.6]
# preinstall blis - build from source for arm64, host Jetson has Cuda development env
https://github.com/explosion/spaCy/issues/3861
$ BLIS_ARCH="generic" pip3 --no-cache-dir install blis --user --no-binary blis [installed v0.7.3]
# generate thinc requirements.txt
https://github.com/explosion/thinc/blob/v7.4.3/requirements.txt
$ wget -c https://raw.githubusercontent.com/explosion/thinc/v7.4.3/requirements.txt -O requirements-thinc-v7.4.3.txt
$ pip3 install --user -r requirements-thinc-v7.4.3.txt --no-cache-dir [--no-binary :all:]
$ pip3 install --user "thinc[cuda102,torch]==7.4.3" --no-cache-dir --no-deps [installed 7.4.3]
# don't use "thinc --pre"; it fails because of murmurhash or possibly pip conflicts with CPython versions
# see: https://github.com/pypa/pip/issues/10222
https://github.com/explosion/spaCy/issues/3861
$ pip3 install spacy[cuda]==2.3.4 --user --no-cache-dir [installed v2.3.4]
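# optional check [not in the original notes] that thinc & spacy import and can use the GPU;
# spacy.prefer_gpu() is standard spaCy API and returns True when the CUDA extras are usable
$ python
>> import thinc, spacy
>> print(thinc.__version__, spacy.__version__)
>> spacy.prefer_gpu()
# expect 7.4.3, 2.3.4 and True on a working CUDA setup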
# check on latest release version of fastai for Python3.6 [v2.5.3 for 3.6 as of Jan-05-2022]
# generate fastai requirements.txt
https://github.com/fastai/fastai/blob/2.5.3/settings.ini
$ vi requirements-fastai-v2.5.3.txt
# preinstall newer scipy & scikit-learn into the user site
# the stock scikit-learn version made "import fastai" error out
$ pip3 install -U scipy --user --no-cache-dir --no-binary scipy [installed v1.5.4]
$ pip3 install -U scikit-learn --user --no-cache-dir --no-binary scikit-learn [installed v0.24.2]
$ pip3 install --user -r requirements-fastai-v2.5.3.txt --no-cache-dir
# error: can't install any version of matplotlib
# built matplotlib from source, moved step to above
$ pip3 install --user fastai --no-cache-dir --no-deps
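# optional check [not in the original notes]: verify fastai imports cleanly before setting up Jupyter;
# the version string assumes the v2.5.3 release discussed above
$ python
>> import fastai
>> print(fastai.__version__)
>> from fastai.vision.all import *
# expect 2.5.3 and a clean import with no scikit-learn errors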
# install Jupyterlab start script
# include LD_PRELOAD in the start script to avoid:
#   OSError: /usr/lib/aarch64-linux-gnu/libgomp.so.1: cannot allocate memory in static TLS block
# see:
https://forums.developer.nvidia.com/t/what-is-cannot-allocate-memory-in-static-tls-block/169225
https://forums.developer.nvidia.com/t/oserror-usr-lib-aarch64-linux-gnu-libgomp-so-1-only-in-jupyter-notebook/174881/8
$ cd ~/Development/Tools
$ echo '#!/usr/bin/env bash
LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libgomp.so.1:$LD_PRELOAD jupyter lab --ip=0.0.0.0 --port=9743 --allow-root --no-browser' > ~/Development/Tools/run_jupyter.sh
$ chmod 755 ~/Development/Tools/run_jupyter.sh
# git clone FastAI book & course material
$ mkdir -p ~/Development/ML/projects/fastai/course
$ cd ~/Development/ML/projects/fastai/course
$ git clone https://github.com/fastai/course20.git
$ git clone https://github.com/fastai/fastbook.git
# create 01_intro_cell-1 test script for debugging
$ cd ~/Development/ML/projects/fastai/course/fastbook
$ vi 01_intro_cell-01_test.py
from fastai.vision.all import *
path = untar_data(URLs.PETS)/'images'
def is_cat(x): return x[0].isupper()
dls = ImageDataLoaders.from_name_func(path, get_image_files(path), valid_pct=0.2, seed=42, label_func=is_cat, item_tfms=Resize(224))
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(1)
# run debugger in console
$ pudb3 01_intro_cell-01_test.py
step through the code
==> works, no "illegal instruction (core dumped)" error
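# if the Nano runs out of memory on this step, a smaller batch size usually helps;
# this is a hedged variation of the test script above - bs is a standard fastai DataLoaders
# argument and 16 is only an example value, not something tested in these notes
dls = ImageDataLoaders.from_name_func(path, get_image_files(path), valid_pct=0.2, seed=42, label_func=is_cat, item_tfms=Resize(224), bs=16)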
# test in Jupyter
# start Jupyterlab on port 9743
$ cd ~/Development/ML/projects/fastai/course/fastbook
$ ~/Development/Tools/run_jupyter.sh
# access from a browser
http://x.x.x.x:9743 --> fastbook --> 01_intro.ipynb
use Python 3 kernel
run "first training" cell
==> runs, no errors
#=========================================================================