@RomanSteinberg
Created November 14, 2017 09:35
Install nvidia stuff

Introduction

I assume you are doing a clean system install with no previous Nvidia installations. Otherwise, try the following steps at your own risk:

  • sudo apt-get purge 'nvidia*'
  • rm ~/.Xauthority

Pre-step

Switch off the PC and remove the video card.

Installation

  • Make a clean installation of Ubuntu with the Intel drivers and check that you can log in to the system with graphics;

  • In BIOS, check the following things:

    • internal video is enabled (not auto)
    • the video output during boot time is something like IGX (select the one that looks like internal graphics)
  • Switch off the PC, insert the card back, boot, and check that the graphics is OK

  • Install nvidia-375 from the repo

    • check: after that I had the following packages installed:
    bbswitch-dkms/xenial,now 0.8-3ubuntu1 amd64 [installed,automatic]
    libcuda1-375/xenial-updates,xenial-security,now 375.39-0ubuntu0.16.04.1 amd64 [installed,automatic]
    libvdpau1/xenial,now 1.1.1-3ubuntu1 amd64 [installed,automatic]
    libxnvctrl0/xenial,now 361.42-0ubuntu1 amd64 [installed,automatic]
    nvidia-375/xenial-updates,xenial-security,now 375.39-0ubuntu0.16.04.1 amd64 [installed]
    nvidia-opencl-icd-375/xenial-updates,xenial-security,now 375.39-0ubuntu0.16.04.1 amd64 [installed,automatic]
    nvidia-prime/xenial,now 0.8.2 amd64 [installed,automatic]
    nvidia-settings/xenial,now 361.42-0ubuntu1 amd64 [installed,automatic]
    vdpau-driver-all/xenial,now 1.1.1-3ubuntu1 amd64 [installed,automatic]
    xserver-xorg-video-nouveau/xenial,now 1:1.0.12-1build2 amd64 [installed,automatic]
    

Note: you can see which driver version the system proposes with the following command:

$ ubuntu-drivers devices
== cpu-microcode.py ==
driver   : intel-microcode - distro non-free

== /sys/devices/pci0000:00/0000:00:1c.4/0000:06:00.0 ==
vendor   : NVIDIA Corporation
modalias : pci:v000010DEd00001B80sv00001458sd00003702bc03sc00i00
driver   : xserver-xorg-video-nouveau - distro free builtin
driver   : nvidia-375 - distro non-free recommended
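
If you want to pick the recommended driver out of that output in a script, something like the following may help. It is only a sketch: the function name recommended_driver is made up for illustration, and it assumes the output keeps the "driver : NAME - ... recommended" layout shown above.

```shell
# Print the driver package that `ubuntu-drivers devices` marks as "recommended".
# Reads the command output on stdin, so it also works on saved output.
# Prints nothing if no driver line ends with "recommended".
recommended_driver() {
    awk '$1 == "driver" && $NF == "recommended" { print $3 }'
}

# Usage on a real system (assumes ubuntu-drivers is installed):
#   ubuntu-drivers devices | recommended_driver
```
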
  • The first problem is xserver-xorg-video-nouveau, which will try to take over from the correct nvidia-375 driver. So, just disable it after installation.

    1. Create the file /etc/modprobe.d/blacklist-nouveau.conf with the following content:
    $ cat /etc/modprobe.d/blacklist-nouveau.conf 
    
    blacklist nouveau
    blacklist lbm-nouveau
    options nouveau modeset=0
    alias nouveau off
    alias lbm-nouveau off
    2. Create the file /etc/modprobe.d/nouveau-kms.conf with the following content:
    $ cat /etc/modprobe.d/nouveau-kms.conf

    options nouveau modeset=0
    3. Update the boot image:
    $ sudo update-initramfs -u
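
The two modprobe files above can also be written by a small helper instead of by hand. This is a minimal sketch, not a tested installer: the function name write_nouveau_blacklist and its optional directory argument are assumptions added for illustration (the default is /etc/modprobe.d, which needs root).

```shell
# Write the two modprobe files described above into a target directory.
# $1 is a hypothetical override of the directory; default is /etc/modprobe.d.
write_nouveau_blacklist() {
    dir="${1:-/etc/modprobe.d}"
    cat > "$dir/blacklist-nouveau.conf" <<'EOF' || return 1
blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
EOF
    printf 'options nouveau modeset=0\n' > "$dir/nouveau-kms.conf"
}

# Usage (as root), followed by the boot image update from the last step:
#   write_nouveau_blacklist && update-initramfs -u
```

Running it twice just overwrites the files with the same content, so it is safe to repeat.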
  • Run the NVIDIA X Server Settings app from the menu (or start nvidia-settings from a terminal). Select the PRIME profile 'Intel'

  • The second problem is that nvidia-375 installs OpenGL libraries that do not work in this setup. Solution: just rename the libs so they are no longer picked up! =)

$ sudo mv /usr/lib/nvidia-375/libGLdispatch.so.0 /usr/lib/nvidia-375/disable_libGLdispatch.so.0
$ sudo mv /usr/lib/nvidia-375/libGL.so /usr/lib/nvidia-375/disable_libGL.so
$ sudo mv /usr/lib/nvidia-375/libGLX.so /usr/lib/nvidia-375/disable_libGLX.so
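
Since a driver package update may bring these files back, it can be handy to have the rename as a repeatable and reversible helper. A sketch under assumptions: the function names and the optional directory argument are invented here, and the default path is the one used in the commands above.

```shell
# Rename the three libraries so the dynamic loader no longer finds them.
# $1 is a hypothetical override of the directory (default: /usr/lib/nvidia-375).
# Files that are missing or already renamed are skipped.
disable_nvidia_gl() {
    dir="${1:-/usr/lib/nvidia-375}"
    for lib in libGLdispatch.so.0 libGL.so libGLX.so; do
        if [ -e "$dir/$lib" ] || [ -L "$dir/$lib" ]; then
            mv "$dir/$lib" "$dir/disable_$lib" || return 1
        fi
    done
}

# Undo the rename, e.g. before removing or reinstalling the driver package.
restore_nvidia_gl() {
    dir="${1:-/usr/lib/nvidia-375}"
    for lib in libGLdispatch.so.0 libGL.so libGLX.so; do
        if [ -e "$dir/disable_$lib" ] || [ -L "$dir/disable_$lib" ]; then
            mv "$dir/disable_$lib" "$dir/$lib" || return 1
        fi
    done
}

# Usage (as root): disable_nvidia_gl
```
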
  • Check that OpenGL is working correctly:

    1. Install mesa-utils
    2. Run
    $ ldd $(which glxinfo)|grep nvidia

    it should return nothing

    3. For reference, my output from ldd:
    linux-vdso.so.1 =>  (0x00007ffcdb1f6000)
    libGL.so.1 => /usr/lib/x86_64-linux-gnu/mesa/libGL.so.1 (0x00007f0fb484d000)
    libX11.so.6 => /usr/lib/x86_64-linux-gnu/libX11.so.6 (0x00007f0fb4513000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0fb4149000)
    libexpat.so.1 => /lib/x86_64-linux-gnu/libexpat.so.1 (0x00007f0fb3f20000)
    libxcb-dri3.so.0 => /usr/lib/x86_64-linux-gnu/libxcb-dri3.so.0 (0x00007f0fb3d1d000)
    libxcb-present.so.0 => /usr/lib/x86_64-linux-gnu/libxcb-present.so.0 (0x00007f0fb3b19000)
    libxcb-sync.so.1 => /usr/lib/x86_64-linux-gnu/libxcb-sync.so.1 (0x00007f0fb3912000)
    libxshmfence.so.1 => /usr/lib/x86_64-linux-gnu/libxshmfence.so.1 (0x00007f0fb370f000)
    libglapi.so.0 => /usr/lib/x86_64-linux-gnu/libglapi.so.0 (0x00007f0fb34df000)
    libXext.so.6 => /usr/lib/x86_64-linux-gnu/libXext.so.6 (0x00007f0fb32cd000)
    libXdamage.so.1 => /usr/lib/x86_64-linux-gnu/libXdamage.so.1 (0x00007f0fb30ca000)
    libXfixes.so.3 => /usr/lib/x86_64-linux-gnu/libXfixes.so.3 (0x00007f0fb2ec3000)
    libX11-xcb.so.1 => /usr/lib/x86_64-linux-gnu/libX11-xcb.so.1 (0x00007f0fb2cc1000)
    libxcb-glx.so.0 => /usr/lib/x86_64-linux-gnu/libxcb-glx.so.0 (0x00007f0fb2aa8000)
    libxcb-dri2.so.0 => /usr/lib/x86_64-linux-gnu/libxcb-dri2.so.0 (0x00007f0fb28a2000)
    libxcb.so.1 => /usr/lib/x86_64-linux-gnu/libxcb.so.1 (0x00007f0fb2680000)
    libXxf86vm.so.1 => /usr/lib/x86_64-linux-gnu/libXxf86vm.so.1 (0x00007f0fb247a000)
    libdrm.so.2 => /usr/lib/x86_64-linux-gnu/libdrm.so.2 (0x00007f0fb226a000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f0fb1f61000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f0fb1d44000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f0fb1b3f000)
    /lib64/ld-linux-x86-64.so.2 (0x000055c15a3e6000)
    libXau.so.6 => /usr/lib/x86_64-linux-gnu/libXau.so.6 (0x00007f0fb193b000)
    libXdmcp.so.6 => /usr/lib/x86_64-linux-gnu/libXdmcp.so.6 (0x00007f0fb1734000)
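
The "should return nothing" check can be turned into a pass/fail test for scripts. A small sketch: the function name no_nvidia_libs is made up, and it only looks for the string "nvidia" in whatever ldd output it is given.

```shell
# Succeed (exit status 0) when no line on stdin mentions nvidia,
# i.e. the binary does not resolve any library from the NVIDIA directories.
no_nvidia_libs() {
    ! grep -qi 'nvidia'
}

# Usage (assumes mesa-utils is installed so that glxinfo exists):
#   ldd "$(command -v glxinfo)" | no_nvidia_libs && echo "OK: no NVIDIA GL libs"
```
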
    
  • Reboot

  • Check that nouveau is not loaded:

$ lsmod|grep nouveau

should return nothing.

$ lsmod|grep nvidia
nvidia_uvm            647168  0
nvidia_drm             53248  1
nvidia_modeset        790528  1 nvidia_drm
nvidia              12144640  2 nvidia_modeset,nvidia_uvm
drm_kms_helper        139264  2 i915_bpo,nvidia_drm
drm                   360448  9 i915_bpo,drm_kms_helper,nvidia_drm

$ lsmod|grep i915
i915_bpo             1261568  6
intel_ips              20480  1 i915_bpo
i2c_algo_bit           16384  1 i915_bpo
drm_kms_helper        139264  2 i915_bpo,nvidia_drm
drm                   360448  9 i915_bpo,drm_kms_helper,nvidia_drm
video                  40960  1 i915_bpo
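
The three lsmod checks above can be combined into one. This is a sketch under assumptions: the function name check_gpu_modules is invented, and it treats either i915 or i915_bpo as the Intel driver, since the module name depends on the kernel.

```shell
# Read lsmod-style output on stdin and check the expected state:
# nouveau absent, nvidia present, Intel driver (i915 or i915_bpo) present.
# Prints one line per problem and returns non-zero if any is found.
check_gpu_modules() {
    awk '
        { seen[$1] = 1 }
        END {
            bad = 0
            if ("nouveau" in seen) { print "nouveau is loaded"; bad = 1 }
            if (!("nvidia" in seen)) { print "nvidia is not loaded"; bad = 1 }
            if (!("i915" in seen) && !("i915_bpo" in seen)) {
                print "no Intel i915 module loaded"; bad = 1
            }
            exit bad
        }'
}

# Usage:
#   lsmod | check_gpu_modules && echo "modules look OK"
```
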
  • Check graphics is working

  • Install CUDA (from the binary package on the NVIDIA site). Important: answer no when asked to install the OpenGL libs, no to installing the driver, and no to configuring Xorg.

    $ sudo sh cuda_8.0.44_linux.run --no-opengl-libs
  • For nvidia-smi (and other CUDA-related stuff) to work correctly, we should add the paths to the libraries and binaries

    1. Add the path to the libraries to LD_LIBRARY_PATH
    $ echo 'export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/usr/local/cuda/lib64:/usr/lib/nvidia-375"' >> ~/.bashrc
    2. Start a new shell and check the nvidia-smi output:
    $ nvidia-smi
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce GTX 1080    Off  | 0000:06:00.0     Off |                  N/A |
    |  0%   33C    P0    40W / 200W |      0MiB /  8114MiB |      0%      Default |
    +-------------------------------+----------------------+----------------------+
                                                                                   
    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID  Type  Process name                               Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+
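
One caveat about step 1: running that echo command more than once appends the export line to ~/.bashrc each time. A minimal sketch of an idempotent version follows; the function name add_cuda_lib_path and its optional file argument are assumptions for illustration.

```shell
# Append the LD_LIBRARY_PATH export line to a shell rc file unless the exact
# line is already there. $1 is a hypothetical override of the file
# (default: ~/.bashrc).
add_cuda_lib_path() {
    rc="${1:-$HOME/.bashrc}"
    line='export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/usr/local/cuda/lib64:/usr/lib/nvidia-375"'
    touch "$rc" || return 1
    grep -qxF "$line" "$rc" || printf '%s\n' "$line" >> "$rc"
}

# Usage:
#   add_cuda_lib_path && exec bash   # start a new shell to pick up the change
```
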

    If it fails to detect the GPU, that is OK the first time after a reboot. Solution:

    1. start nvidia-settings and select NVIDIA in PRIME Profiles (or run sudo prime-select nvidia)
    2. restart the X session as it says (or sudo service lightdm restart)
    3. after that nvidia-smi should show that the X server is running on the GPU
    4. start nvidia-settings and select Intel in PRIME Profiles (or run sudo prime-select intel)
    5. restart the X session as it says (or sudo service lightdm restart)
    6. after that nvidia-smi should work OK

Note: if you cannot start graphics, you can try the following from a TTY:

$ sudo service lightdm stop
$ sudo prime-select intel
$ sudo service lightdm start

Troubleshooting

If nvidia-smi says it cannot communicate with the driver, this is the solution that helped me:

sudo apt install nvidia-375
sudo apt remove bbswitch-dkms
sudo apt install bbswitch-dkms

Also, some information about possible problems with loading the kernel driver can be found in /var/log/gpu-manager.log


Author: @sergregory from https://opendatascience.slack.com/