@abishekmuthian
Last active June 16, 2023 15:07
Installing cudf on ARM(aarch64)[Jetson Nano]

My setup

I'm using an NVIDIA Jetson Nano:

Quad-core ARM® Cortex®-A57 MPCore processor

NVIDIA Maxwell™ architecture with 128 NVIDIA CUDA® cores

4 GB 64-bit LPDDR4 1600MHz - 25.6 GB/s

Ubuntu 18.04 LTS

Python 3.6.9

Environment variables for CUDA

export CUDACXX=/usr/local/cuda/bin/nvcc
export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
export PATH=$PATH:$CUDA_HOME/bin
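
Before starting the long builds, it may be worth confirming that these variables are actually visible in the current shell. The following is a minimal sketch, not part of the original procedure; the helper name `check_cuda_env` is hypothetical, and it only checks the variables exported above.

```python
import os
import shutil


def check_cuda_env(env=None):
    """Return a list of problems found in the CUDA-related environment variables."""
    env = os.environ if env is None else env
    problems = []

    cuda_home = env.get("CUDA_HOME")
    if not cuda_home:
        problems.append("CUDA_HOME is not set")
    elif not os.path.isdir(cuda_home):
        problems.append("CUDA_HOME points to a missing directory: " + cuda_home)

    cudacxx = env.get("CUDACXX")
    if not cudacxx:
        problems.append("CUDACXX is not set")
    elif not os.path.isfile(cudacxx):
        problems.append("CUDACXX points to a missing file: " + cudacxx)

    # nvcc should be discoverable through the PATH that was exported above.
    if shutil.which("nvcc", path=env.get("PATH", "")) is None:
        problems.append("nvcc was not found on PATH")

    return problems
```

Calling `check_cuda_env()` with no arguments inspects the current process environment; an empty list means none of these checks failed.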

Pip Dependencies

$ pip install cmake-setuptools

Dependencies

Apache Arrow

Follow my earlier Apache Arrow installation guide for ARMv8: https://gist.github.com/heavyinfo/04e1326bb9bed9cecb19c2d603c8d521

DLPack

$ git clone https://github.com/dmlc/dlpack.git
$ cd dlpack
$ mkdir build && cd build
$ cmake .. -DCMAKE_INSTALL_PREFIX=/usr/local/lib/dlpack
$ make -j4
$ sudo make install

RMM

$ git clone --recurse-submodules https://github.com/rapidsai/rmm.git
$ cd rmm
$ mkdir build && cd build
$ cmake .. -DCMAKE_INSTALL_PREFIX=/usr/local/lib/rmm
$ make -j4
$ sudo -E make install

Install the rmm Python library. From the build directory, move to the repository's python directory first:

cd ../python
sudo -E python3 setup.py build_ext --inplace
sudo -E python3 setup.py install

Export RMM path to LD_LIBRARY_PATH

export LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib/rmm/lib:$LD_LIBRARY_PATH
sudo ldconfig

Download cudf

I'm checking out the commit that was HEAD when I built cudf. Later versions should also work, as long as their install procedure does not deviate from this one.

git clone https://github.com/rapidsai/cudf.git
cd cudf
git checkout d6b4794b4a3ed7e6577042596a32732bd62fd074

Building and installing libcudf

Set the paths according to your own locations.

-DGPU_ARCHS="" was needed on the Jetson Nano because its CUDA compute capability is 5.3 and cudf requires 6.0+. If you are building for a device whose GPU has CUDA compute capability 6.0 or higher, this flag can be omitted.
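
The rule above can be written down as a small check. This is only a sketch: the function names are hypothetical, the 6.0 minimum is taken from the text above, and the optional lookup assumes numba is installed and a CUDA device is visible.

```python
MIN_CUDF_CAPABILITY = (6, 0)  # minimum compute capability, as stated in the text above


def gpu_archs_override_needed(capability, minimum=MIN_CUDF_CAPABILITY):
    """Return True when the device is below the minimum, so -DGPU_ARCHS="" is needed."""
    return tuple(capability) < tuple(minimum)


def detect_capability():
    """Best-effort lookup through numba; returns None if numba or a GPU is unavailable."""
    try:
        from numba import cuda

        return tuple(cuda.get_current_device().compute_capability)
    except Exception:
        return None
```

For the Jetson Nano, `gpu_archs_override_needed((5, 3))` is true, which matches the need for the flag described above.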

export CUDF_HOME=/home/abishek/Downloads/cudf
export DLPACK_ROOT=/usr/local/lib/dlpack/
export RMM_ROOT=/usr/local/lib/rmm/
mkdir -p $CUDF_HOME/cpp/build && cd $CUDF_HOME/cpp/build/
cmake .. -DCMAKE_INSTALL_PREFIX=/usr/local/lib/cudf/ -DCMAKE_CXX11_ABI=ON -DRMM_INCLUDE=/usr/local/lib/rmm/include/ -DDLPACK_INCLUDE=/usr/local/lib/dlpack/include/ -DGPU_ARCHS=""
make # Takes about a day on the Nano; running more parallel jobs (-j) crashes the build because of low memory.
sudo -E make install
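
Since memory, not CPU count, is what limits parallel jobs here, one way to pick a job count is to derive it from available memory. The sketch below is an illustration only: the function names are hypothetical, and the 2 GiB per job figure is an assumption, not a measured requirement of the cudf build.

```python
def parse_mem_available_kib(meminfo_text):
    """Extract the MemAvailable value (in KiB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    return None


def suggest_make_jobs(mem_available_kib, cpu_count, gib_per_job=2.0):
    """Suggest a make -j value bounded by both CPU count and available memory.

    gib_per_job is an assumed budget per compile job, not a measured figure.
    """
    if mem_available_kib is None:
        return 1
    by_memory = int(mem_available_kib / (gib_per_job * 1024 * 1024))
    return max(1, min(cpu_count, by_memory))
```

On a Linux system the input can come from reading `/proc/meminfo` and `os.cpu_count()`. On a 4 GB board the result will usually be 1, which is consistent with running plain `make` as above.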

Building and installing nvstrings python module

cd $CUDF_HOME/python/nvstrings
export NVSTRINGS_ROOT=/usr/local/lib/cudf/
export NVSTRINGS_LIBRARY=/usr/local/lib/cudf/lib/
export NVTEXT_LIBRARY=/usr/local/lib/cudf/lib/
export NVCATEGORY_LIBRARY=/usr/local/lib/cudf/lib/
sudo -E python3 setup.py install

Export CUDF path to LD_LIBRARY_PATH

export LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib/cudf/lib:/usr/local/lib/rmm/lib:$LD_LIBRARY_PATH
sudo ldconfig
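
To confirm that the export took effect, the library directories can be compared against the current LD_LIBRARY_PATH. This is a minimal sketch with a hypothetical helper name; the directories are the ones used in this guide.

```python
import os


def missing_library_dirs(ld_library_path, required):
    """Return the required directories that are absent from an LD_LIBRARY_PATH string."""
    present = {os.path.normpath(p) for p in ld_library_path.split(":") if p}
    return [d for d in required if os.path.normpath(d) not in present]


# Directories exported in this guide; adjust them if you installed elsewhere.
REQUIRED_DIRS = ["/usr/local/lib/cudf/lib", "/usr/local/lib/rmm/lib"]
```

For example, `missing_library_dirs(os.environ.get("LD_LIBRARY_PATH", ""), REQUIRED_DIRS)` returns an empty list when both directories are on the path.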

Fixing cudf python module setup

cd $CUDF_HOME/python/cudf
nano setup.py

To work around this issue, rapidsai/cudf#4121, add the following include paths to setup.py at this location: https://github.com/rapidsai/cudf/blob/d6b4794b4a3ed7e6577042596a32732bd62fd074/python/cudf/setup.py#L41-L50

Change the paths to match your own locations if you did not install these libraries to the paths used earlier.

"/usr/local/lib/",
"/usr/local/lib/cudf",
"/usr/local/lib/include",
"/usr/local/lib/rmm/include/",
"/usr/local/lib/dlpack/include/",

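Before editing setup.py, it can help to confirm that each of these directories exists, since a mistyped include path fails silently until compilation. The sketch below is not taken from the cudf setup.py; the helper name is hypothetical, and the list simply repeats the paths above.

```python
import os

# Paths listed above; adjust them if your install prefixes differ.
EXTRA_INCLUDE_DIRS = [
    "/usr/local/lib/",
    "/usr/local/lib/cudf",
    "/usr/local/lib/include",
    "/usr/local/lib/rmm/include/",
    "/usr/local/lib/dlpack/include/",
]


def existing_include_dirs(candidates, exists=os.path.isdir):
    """Keep only directories that exist, preserving order and dropping duplicates."""
    seen = set()
    kept = []
    for path in candidates:
        norm = os.path.normpath(path)
        if norm not in seen and exists(norm):
            seen.add(norm)
            kept.append(norm)
    return kept
```

Any entry of `EXTRA_INCLUDE_DIRS` missing from `existing_include_dirs(EXTRA_INCLUDE_DIRS)` points at a library that was installed somewhere else.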
Building cudf python module

cd $CUDF_HOME/python/cudf
sudo -E python3 setup.py build_ext --inplace
sudo -E python3 setup.py install
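
Once the install finishes, a quick way to see which module fails to load is to import them one at a time and record the error. This is a minimal sketch with a hypothetical helper name; the module names you pass in are up to you.

```python
import importlib


def try_imports(names):
    """Import each module by name; map it to None on success or an error string."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # e.g. ImportError from an unresolved shared-library symbol
            results[name] = "{}: {}".format(type(exc).__name__, exc)
    return results
```

For this guide, `try_imports(["rmm", "pyarrow", "cudf"])` would report each module separately, so a failure in cudf is not confused with a problem in its dependencies.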

Notes

  1. You can probably drop sudo when installing the Python libraries if you don't hit any permission errors. If you do use sudo for Python library installations, make sure you understand the risks involved.

References

https://github.com/rapidsai/cudf/blob/branch-0.13/CONTRIBUTING.md#setting-up-your-build-environment

rapidsai/cudf#2770 (comment)

rapidsai/cudf#2505 (comment)

@baldoucam

I have a problem with the proposed configuration. On my AGX Xavier, importing cudf from Python fails. There is no error at compile time, only at run time:

Python 3.6.9 (default, Apr 18 2020, 01:56:04)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import rmm
>>> import pyarrow
>>> from numba import cuda
>>> import cudf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/cudf-0.14.0a0+4051.g5f95c9745.dirty-py3.6-linux-aarch64.egg/cudf/__init__.py", line 5, in <module>
    validate_setup(check_dask=False)
  File "/usr/local/lib/python3.6/dist-packages/cudf-0.14.0a0+4051.g5f95c9745.dirty-py3.6-linux-aarch64.egg/cudf/utils/gpu_utils.py", line 10, in validate_setup
    from cudf._cuda.gpu import (
ImportError: /usr/local/lib/python3.6/dist-packages/cudf-0.14.0a0+4051.g5f95c9745.dirty-py3.6-linux-aarch64.egg/cudf/_cuda/gpu.cpython-36m-aarch64-linux-gnu.so: undefined symbol: cudaDriverGetVersion

Starting Python and importing rmm, pyarrow, numba, or dlpack causes no problems. I have been stuck on this problem for several days.

@abishekmuthian
Author

@baldoucam Have you set the environment variables mentioned at the top of my gist?

@baldoucam

baldoucam commented May 12, 2020

Yes, I have configured everything in my .bashrc so that the settings persist across sessions. As a test, I also compiled with options like -DGPU_ARCHS="" in case that was the cause, but it is not.

I have also tested the CUDA API with a small program, and deviceQuery works without problems. In Python, I have run an LSTM with TensorFlow on the GPU for a study, and that works well too. I don't know what could be going on.

@baldoucam

Another important detail: if I run "ldd" on gpu.cpython-36m-aarch64-linux-gnu.so, the output is:

linux-vdso.so.1 (0x0000007f88fd8000)
libc.so.6 => /lib/aarch64-linux-gnu/libc.so.6 (0x0000007f88dfe000)
/lib/ld-linux-aarch64.so.1 (0x0000007f88fad000)

The CUDA library is not linked. Is this correct? Can you tell me the ldd output for that library on the TX2?
If cudaDriverGetVersion is part of the CUDA API and the CUDA library is not linked, the symbol cannot be found.
Yet the link step succeeds: it reports no failure when it creates the .so.

@abishekmuthian
Author

@baldoucam I'm on a Jetson Nano (check out the My setup section), where I was able to reproduce this build several times.

gpu.cpython-36m-aarch64-linux-gnu.so is not present on my system, and the /_cuda/ path does not exist either, so I suppose whatever it is used for is linked directly from the system CUDA binaries in my case. That could be because I'm using cudf version 0.13.

Update: I found the issue where you are discussing this with the cudf team regarding the AGX Xavier. Are you facing the same problem on the TX2 as well?

@Zerorigin

Hello, are pyarrow and cudf working well on the Jetson Nano B01?
I'm trying to compile and install them, but I'm running into trouble for now.

@schindam

schindam commented Jun 16, 2023

Have you been able to use cudf in Python code successfully and see performance improvements? According to rapidsai/cuml#665, RAPIDS and its libraries, such as cuML, require the Pascal architecture or higher. The Jetson Nano uses the Maxwell architecture. RAPIDS does work on the Xavier boards.
