@whizzzkid
Last active December 3, 2022 15:43
[XPS 15 Early 2017 9560 kabylake] Making Nvidia Drivers + (CUDA 8 / CUDA 9 / CUDA 9.1) + Bumblebee work together on linux ( Ubuntu / KDE Neon / Linux Mint / debian )
# Instructions for 4.14 and cuda 9.1
# If upgrading from 4.13 and cuda 9.0
$ sudo apt-get purge --auto-remove libcud*
$ sudo apt-get purge --auto-remove cuda*
$ sudo apt-get purge --auto-remove nvidia*
# also remove the leftover install directory at /usr/local/cuda-9.0/
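# A small sketch for checking that the purge left nothing behind. The helper name
# leftover_gpu_packages is hypothetical (not part of any package); it filters
# `dpkg -l` style output for installed packages whose names start with nvidia, cuda or libcud.

```shell
# Hypothetical helper: reads `dpkg -l` style output on stdin and prints the
# names of still-installed (status "ii") nvidia/cuda/libcud packages.
leftover_gpu_packages() {
    awk '$1 == "ii" && $2 ~ /^(nvidia|cuda|libcud)/ { print $2 }'
}

# Usage on a real system (prints nothing when the purge was complete):
#   dpkg -l | leftover_gpu_packages
```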
# Important libs required with 4.14.x with Cuda 9.X
$ sudo apt install libelf1 libelf-dev
# Install Intel Graphics Patch Firmwares (This should reboot your system):
bash -c "$(curl -fsSL http://bit.ly/IGFWL-install)"
# Update to 4.14 kernel. nvidia-384 compiles fine with this.
cd /tmp
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14.13/linux-headers-4.14.13-041413_4.14.13-041413.201801101001_all.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14.13/linux-headers-4.14.13-041413-generic_4.14.13-041413.201801101001_amd64.deb
wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14.13/linux-image-4.14.13-041413-generic_4.14.13-041413.201801101001_amd64.deb
sudo dpkg -i *.deb
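# Once the machine is running the new kernel, the version can be checked with a
# short comparison. This is only a sketch: version_ge is a hypothetical helper and
# it assumes GNU `sort -V` (version sort) is available, as it is on Ubuntu.

```shell
# version_ge A B: succeeds when version string A is >= version string B.
# The smaller of the two sorts first, so A >= B exactly when B comes out on top.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage:
#   version_ge "$(uname -r)" 4.14 && echo "running kernel 4.14 or newer"
```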
# Add Nvidia repository
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
# Install via ubuntu drivers
sudo ubuntu-drivers autoinstall
# <optional for ML Folks> Install CUDA (if you're interested in using the GPU for ML) ~> requires the nvidia stable drivers
# This acts as repo so install it somewhere safe and do not delete
# cuda is now available with membership only, download from https://developer.nvidia.com/cuda-release-candidate-download
sudo dpkg -i cuda-repo-ubuntu1604-9-0-local-rc_9.0.103-1_amd64.deb
sudo apt-key add /var/cuda-repo-<version>/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda
# Update PATH to include /usr/local/cuda/bin, e.g. in your dotfiles.
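# One way to do the PATH update from ~/.bashrc or ~/.zshrc without adding the
# directory twice on every new shell. add_to_path is a hypothetical helper name;
# /usr/local/cuda/bin is the path mentioned above.

```shell
# Prepend a directory to PATH only if it is not already present.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already in PATH, do nothing
        *) PATH="$1:$PATH" ;;
    esac
}

add_to_path /usr/local/cuda/bin
export PATH
```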
# <optional for ML Folks> Install CuDNN for running things like tensorflow
# Download cudnn7 for cuda9 from nvidia https://developer.nvidia.com/cudnn
sudo dpkg -i libcudnn7_7.0.2.43-1+cuda9.0_amd64.deb
# Testing cuda.
$ sudo prime-select intel
$ nvidia-smi
# This should give an error, no drivers found
# Try this
$ sudo prime-select nvidia
# displays CUDA info.
# At this point the nvidia drivers work well enough. Search for "nvidia x server settings"
# in your applications menu and you can switch between intel and nvidia PRIME profiles.
# But the nvidia card is still ON (bbswitch will report it off, but the battery consumption is 20 +/- 5 W).
# You can leave it here if you're not worried about battery but if you are then continue with this.
# Install powertop
sudo apt install powertop
# TLP interferes with bluetooth, better not install it, or remove it completely.
sudo apt remove --purge tlp
# Run powertop:
sudo powertop
# You should see battery discharge around 20w +/- 5W, this eats up my battery 4 times faster.
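# powertop is interactive, so for a quick one-off reading the kernel's sysfs value
# can be converted instead. This is a sketch under assumptions: the battery is
# assumed to show up as BAT0 and to expose power_now (in microwatts); some
# batteries expose current_now and voltage_now instead. microwatts_to_watts is a
# hypothetical helper name.

```shell
# microwatts_to_watts N: convert a power_now reading (microwatts) to watts.
microwatts_to_watts() {
    awk -v uw="$1" 'BEGIN { printf "%.1f\n", uw / 1000000 }'
}

# Usage (the sysfs path is an assumption, check /sys/class/power_supply/ first):
#   microwatts_to_watts "$(cat /sys/class/power_supply/BAT0/power_now)"
```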
# Add command line params:
sudo nano /etc/default/grub
# Make the following look like this, do not ask why.
GRUB_CMDLINE_LINUX_DEFAULT='pcie_port_pm=off acpi_backlight=none acpi_osi=Linux acpi_osi=! acpi_osi="Windows 2009"'
sudo update-grub2
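# The new parameters only take effect after a reboot. A minimal sketch for
# confirming that a given flag made it onto the running kernel's command line;
# has_kernel_param is a hypothetical helper and only handles flags without spaces.

```shell
# has_kernel_param CMDLINE PARAM: succeeds if PARAM appears as one of the
# space-separated words in CMDLINE.
has_kernel_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage after rebooting:
#   has_kernel_param "$(cat /proc/cmdline)" pcie_port_pm=off && echo "pcie_port_pm=off is active"
```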
# Install bumblebee. This is the danger zone: this software has not been updated in a while and I am not sure when an update will be available.
# Avoid updating your system if you're fine with this.
sudo add-apt-repository ppa:bumblebee/testing
sudo apt update
sudo apt install bumblebee bumblebee-nvidia
# at the time of writing this, the latest is nvidia-381 but cuda 8 requires the stable, which is nvidia-375
# add them to bumblebee config file.
sudo nano /etc/bumblebee/bumblebee.conf
# Change 'Driver=' to 'Driver=nvidia'
# Change all occurrences of 'nvidia-current' to 'nvidia-xxx' (xxx = your installed driver version)
# Set KernelDriver=nvidia-384
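# The three edits above could also be scripted. This is only a sketch:
# set_bumblebee_driver is a hypothetical helper, it assumes GNU sed (for -i), and
# it is safest to run it on a copy of the file and diff the result before
# touching /etc/bumblebee/bumblebee.conf.

```shell
# Hypothetical helper: apply the three edits to a bumblebee.conf style file.
#   $1 = path to the config file (try it on a copy first)
#   $2 = driver package name, e.g. nvidia-384
set_bumblebee_driver() {
    sed -i \
        -e 's/^Driver=.*/Driver=nvidia/' \
        -e "s/nvidia-current/$2/g" \
        -e "s/^KernelDriver=.*/KernelDriver=$2/" \
        "$1"
}

# Usage:
#   cp /etc/bumblebee/bumblebee.conf /tmp/bumblebee.conf
#   set_bumblebee_driver /tmp/bumblebee.conf nvidia-384
#   diff /etc/bumblebee/bumblebee.conf /tmp/bumblebee.conf
```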
# save and run
sudo service bumblebeed restart
# this should report that the daemon is already running
sudo bumblebeed
# Since the driver load will now be handled by bumblebee, we need to stop the OS from loading it.
sudo nano /etc/modprobe.d/bumblebee.conf
# Make the following section look like this (the drm line will be added):
#387
blacklist nvidia-387
blacklist nvidia-387-drm
blacklist nvidia-387-updates
blacklist nvidia-experimental-387
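# The block above is written for 387. As a sketch, the same four lines can be
# generated for whichever driver version is installed; this assumes the same
# naming pattern holds for other versions, and blacklist_lines is a hypothetical helper.

```shell
# blacklist_lines VERSION: print the four blacklist entries for that version.
blacklist_lines() {
    printf 'blacklist nvidia-%s\n' "$1"
    printf 'blacklist nvidia-%s-drm\n' "$1"
    printf 'blacklist nvidia-%s-updates\n' "$1"
    printf 'blacklist nvidia-experimental-%s\n' "$1"
}

# Usage: print the lines, then paste them into /etc/modprobe.d/bumblebee.conf
#   blacklist_lines 387
```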
# once that is done, you'll need bbswitch dkms module
sudo apt-get install bbswitch-dkms
# Load this with the kernel.
sudo nano /etc/modules-load.d/modules.conf
# add following
i915
bbswitch
# now make sure nvidia-settings has nvidia prime profile selected.
# So what actually happened:
# The control for switching between graphics has been moved from nvidia's driver to bumblebee. This helps
# maximize battery life because now you can selectively switch between which graphics card to use. In case
# you want to provide access to nvidia gpu for the current application run it using optirun.
# e.g. if you want to run steam with nvidia gpu, run something like: $ optirun steam
# or if you're using gpu to run ml tasks, just run them with optirun and they would work just fine.
# Additional
# TLP is known to interfere with bumblebee, so avoid using it. See https://wiki.archlinux.org/index.php/Talk:Bumblebee#Bumblebee_and_TLP_interferening
# Run powertop to see if battery consumption is in check: 10w +/- 5W
# Testing bumblebee
cat /proc/acpi/bbswitch # Output: 0000:01:00.0 OFF
optirun glxgears -info # Runs the Gears demo
optirun nvidia-smi # Should give an error
sudo prime-select nvidia # Should select nvidia hardware for cuda
optirun nvidia-smi # Outputs:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 387.34                 Driver Version: 387.34                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1050    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   29C    P0    N/A /  N/A |      5MiB /  4041MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     23052      G   /usr/lib/xorg/Xorg                             5MiB |
+-----------------------------------------------------------------------------+
cat /proc/acpi/bbswitch # Still outputs: 0000:01:00.0 OFF, which means we're using the nvidia hardware only when we run applications using optirun
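# For scripts, the ON/OFF state can be pulled out of that line. A minimal sketch,
# assuming the "<pci address> <state>" format shown above; bbswitch_state is a
# hypothetical helper name.

```shell
# bbswitch_state: reads a /proc/acpi/bbswitch style line on stdin
# (e.g. "0000:01:00.0 OFF") and prints just the state field.
bbswitch_state() {
    awk '{ print $2; exit }'
}

# Usage:
#   [ "$(bbswitch_state < /proc/acpi/bbswitch)" = "OFF" ] && echo "discrete GPU is powered down"
```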
# Getting mouse freezes, random misses?
dmesg -w | grep psmouse #check if your trackpad is out of sync frequently
# Add this boot flag:
"psmouse.resetafter=0"
# The system loads a blank screen with just a cursor on reboot?
# I found that selecting nvidia and restarting the system would need nvidia to run again,
# but unfortunately we disable nvidia on startup. A simple fix could be adding an alias in your ~/.(bash|zsh)rc
# I always restart my system from the terminal so this makes sure I move back to intel always before restart.
alias reboot="sudo prime-select intel; sudo reboot now"
alias shutdown="sudo prime-select intel; sudo shutdown -h now"
# Other Helpful links:
http://en.community.dell.com/techcenter/os-applications/f/4613/t/19629103
https://karlgrz.com/dell-xps-15-ubuntu-tweaks/
https://hemenkapadia.github.io/blog/2016/05/07/Ubuntu-with-Nvidia-Bumblebee.html
https://askubuntu.com/questions/879856/nvidia-prime-cant-switch-to-intel/885487
http://www.webupd8.org/2016/08/how-to-install-and-configure-bumblebee.html
# Benchmarks
==================================
GpuTest 0.7.0
http://www.geeks3d.com
Module: TessMark X64
Score: 5243 points (FPS: 87)
Settings:
- 1920x1080 windowed
- antialiasing: Off
- duration: 60000 ms
Renderer:
- GeForce GTX 1050/PCIe/SSE2
- OpenGL: 4.5.0 NVIDIA 384.98
==================================
==================================
GpuTest 0.7.0
http://www.geeks3d.com
Module: Plot3D
Score: 11340 points (FPS: 189)
Settings:
- 1920x1080 windowed
- antialiasing: Off
- duration: 60000 ms
Renderer:
- GeForce GTX 1050/PCIe/SSE2
- OpenGL: 4.5.0 NVIDIA 384.98
==================================
=================================
GpuTest 0.7.0
http://www.geeks3d.com
Module: FurMark
Score: 2670 points (FPS: 44)
Settings:
- 1920x1080 windowed
- antialiasing: Off
- duration: 60000 ms
Renderer:
- GeForce GTX 1050/PCIe/SSE2
- OpenGL: 4.5.0 NVIDIA 384.98
==================================
@fayazkhan

@whizzzkid, my setup has stopped working suddenly. I had an abrupt shutdown while steam with primusrun was live.

Then, when I try to start steam again, it just runs the updater and quits.

$ primusrun steam
/usr/bin/primusrun: line 41: warning: command substitution: ignored null byte in input                                                                        
Running Steam on ubuntu 17.10 64-bit                                                                                                                          
STEAM_RUNTIME is enabled automatically
Pins up-to-date!
[2018-04-07 21:44:12] Startup - updater built Apr  2 2018 15:23:43
Looks like steam didn't shutdown cleanly, scheduling immediate update check
[2018-04-07 21:44:13] Checking for update on startup
[2018-04-07 21:44:13] Checking for available updates...
[2018-04-07 21:44:14] Download skipped: /client/steam_client_ubuntu12 version 1522709999, installed version 1522709999
[2018-04-07 21:44:14] Nothing to do
[2018-04-07 21:44:14] Verifying installation...
[2018-04-07 21:44:14] Performing checksum verification of executable files
[2018-04-07 21:44:15] Verification complete
$

This doesn't happen when running steam in intel or with optirun, but the performance is just not there.

Also I am seeing some weird dmesg output too.

[ 4265.046654] ldconfig.real[17704]: segfault at 338 ip 000000000049c5a7 sp 00007ffd43442710 error 4 in ldconfig.real[400000+e2000]
[ 4265.193360] bbswitch: enabling discrete graphics
[ 4265.407318] nvidia-nvlink: Nvlink Core is being initialized, major device number 239
[ 4265.407797] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  390.48  Thu Mar 22 00:42:57 PDT 2018 (using threaded interrupts)
[ 4266.463078] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  390.48  Wed Mar 21 23:48:34 PDT 2018
[ 4266.479341] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 238
[ 4266.509646] nvidia-modeset: Allocated GPU:0 (GPU-e67ebdca-1dc6-cc63-e15c-c63297995ddf) @ PCI:0000:01:00.0
[ 4266.509799] nvidia-modeset: Freed GPU:0 (GPU-e67ebdca-1dc6-cc63-e15c-c63297995ddf) @ PCI:0000:01:00.0
[ 4266.568571] nvidia-modeset: Allocated GPU:0 (GPU-e67ebdca-1dc6-cc63-e15c-c63297995ddf) @ PCI:0000:01:00.0
[ 4266.568720] nvidia-modeset: Freed GPU:0 (GPU-e67ebdca-1dc6-cc63-e15c-c63297995ddf) @ PCI:0000:01:00.0
[ 4268.457375] steam[17728]: segfault at a8 ip 00000000f75c8f83 sp 00000000ffd95e10 error 4 in libGL.so.390.48[f7547000+c5000]
[ 4268.582475] nvidia-modeset: Unloading
[ 4268.611549] nvidia-uvm: Unloaded the UVM driver in 8 mode
[ 4268.639389] nvidia-nvlink: Unregistered the Nvlink Core, major device number 239
[ 4268.697497] bbswitch: disabling discrete graphics
[ 4268.714983] pci 0000:01:00.0: Refused to change power state, currently in D0
[ 4273.194150] bbswitch: enabling discrete graphics
[ 4273.417666] nvidia-nvlink: Nvlink Core is being initialized, major device number 239
[ 4273.418106] NVRM: loading NVIDIA UNIX x86_64 Kernel Module  390.48  Thu Mar 22 00:42:57 PDT 2018 (using threaded interrupts)
[ 4274.474290] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms  390.48  Wed Mar 21 23:48:34 PDT 2018
[ 4274.491723] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 238
[ 4274.523783] nvidia-modeset: Allocated GPU:0 (GPU-e67ebdca-1dc6-cc63-e15c-c63297995ddf) @ PCI:0000:01:00.0
[ 4274.523930] nvidia-modeset: Freed GPU:0 (GPU-e67ebdca-1dc6-cc63-e15c-c63297995ddf) @ PCI:0000:01:00.0
[ 4274.575560] nvidia-modeset: Unloading
[ 4274.640395] nvidia-uvm: Unloaded the UVM driver in 8 mode
[ 4274.659580] nvidia-nvlink: Unregistered the Nvlink Core, major device number 239
[ 4274.693555] bbswitch: disabling discrete graphics
[ 4274.711057] pci 0000:01:00.0: Refused to change power state, currently in D0

@whizzzkid
Author

Welcome everyone

I updated to 390.48 a couple of days ago and 4.15.18 kernel, the watts are oddly satisfying on idle.
4 43w

@fayazkhan try reinstalling, I am clueless.

Cheers :)

@fayazkhan

fayazkhan commented Apr 27, 2018

@whizzzkid it actually wasn't a setup issue, but a Steam issue that has a fix here: ValveSoftware/steam-for-linux#5428 (comment)

This command solves it

primusrun steam -steamos

@ToothyTahr

Hello! Thank you for taking the time to provide clear instructions.

I am running Ubuntu 16.04 on my Dell XPS 15 9560 (UHD screen, 16Gb RAM, 512 SSD etc.) After following the guide, I ended up with a black screen (no cursor) upon reboot. I have a feeling the error has something to do with one of these factors:

  1. When running sudo bumblebeed , the output was something to the effect of "nvidia" not being found.
  2. When I ran cat /proc/acpi/bbswitch, the output was 0000:01:00.0 ON
  3. I installed the V4.14.13 kernel, but sudo ubuntu-drivers autoinstall installed the version 396 driver.

Do you perhaps have any advice how I could get this working properly? At the moment, having removed all nvidia related components, powerstat indicates an average power draw of 25W. I would very much like to get this down to more reasonable figures.

My CPU temperature sits around 50 degrees - which is hotter than the 40 degrees I have read online others are able to achieve. Furthermore, the GPU temperature seems to be at around 60 degrees.

@SvenMeyer

I finally installed Manjaro Arch Linux and suddenly everything works perfectly out of the box, zero configuration required !
