johnstcn/ethminer_ubuntu_nvidia.md

## ethminer_ubuntu_nvidia.md

      
Headless Ethminer (nVidia) Setup Guide

Cian Johnston, July 2017
WARNING: THESE WORDS ARE OLD AND MAY NOT WORK FOR YOU IN THESE NEW AND INTERESTING TIMES.

A couple of weeks ago, I decided I should put my gaming rig to work crypto mining. I did not expect to make any significant profit on this; it was more of a fun project to set up. There were a large number of tutorials and guides already out there, but many were more than a year out of date.
This guide assumes the reader already has a crypto wallet set up, is comfortable with Linux and the command line, and knows how to use Google if they run into problems.
The end result is an Ubuntu 16.04 LTS headless server running CUDA ethminer via systemd.
Hardware


ASUS P8Z68-V/GEN3 motherboard
1x nVidia GeForce GTX 970 (I only have one of these)
Intel i7-2600 processor.
Corsair Modular PSU 850W
plus a case, RAM, SSD/HDD, etc., the usual.

The GTX 970 is a decent card - not the best for mining, but it's what I have. The Intel processor is way overkill and is basically idle when mining in CUDA mode. In OpenCL mode, one virtual core is pegged at 100%. Hence, CUDA.
System Setup

Operating System

I'm running Ubuntu Server 16.04 (LTS) on this machine. Most folks seem to use Windows for mining, but I'd rather not have to deal with the random Windows Update-induced restarts and other miscellaneous annoyances that come bundled with it.
Installing Ubuntu is outside of the scope of what I'm willing to write here; you can check the official documentation. I installed the base packages, build essentials and OpenSSH Server, as this will be a headless build. I did need to hook up a monitor initially to perform the installation, but once SSH was up and running this was no longer necessary.
Also note that I did not install X11 or any form of GUI. This is not required here, unless you want to overclock your GPUs using nvidia-settings (see Overclocking).
nVidia GPU Driver

Once Ubuntu was installed I grabbed the official nVidia driver from their website. You need to select "Show More Operating Systems" to select Linux 64-bit, then you can select your particular graphics card.
You will then be prompted to download a file named something like NVIDIA-Linux-x86_64-XXX.XX.run. Copy the link and download the file on the mining machine using e.g. wget, make it executable with chmod +x, and then run sudo ./NVIDIA-Linux-x86_64-XXX.XX.run to install the driver. Follow the prompts and reboot when asked to do so.
When the system comes back up, verify the driver is loaded and running: lsmod | grep nvidia.
It is also no harm to check the open-source nouveau module is not loaded:

lsmod | grep nouveau should return no output.
The file /etc/modprobe.d/nvidia-installer-disable-nouveau.conf should exist with the following content:

# generated by nvidia-installer
blacklist nouveau
options nouveau modeset=0

Also verify that the command nvidia-smi outputs a text-based table showing information about your GPU. If the driver is not installed properly this command will output an error.
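The checks above can be rolled into one small script to rerun after a driver or kernel update. This is only a sketch: the helper names has_module and blacklists_nouveau are my own, and it assumes the usual lsmod layout (one header row, module name in the first column).

```shell
#!/bin/sh
# Sketch: post-reboot sanity checks for the nVidia driver.

# has_module NAME: reads lsmod-style text on stdin and succeeds only if a
# module with exactly that name is listed (the header row is skipped).
has_module() {
    awk -v name="$1" 'NR > 1 && $1 == name { found = 1 } END { exit !found }'
}

# blacklists_nouveau FILE: succeeds if FILE has a "blacklist nouveau" line.
blacklists_nouveau() {
    [ -r "$1" ] &&
        grep -Eq '^[[:space:]]*blacklist[[:space:]]+nouveau[[:space:]]*$' "$1"
}

# Only run the live checks on a machine that actually has lsmod.
if command -v lsmod >/dev/null 2>&1; then
    mods=$(lsmod)
    if printf '%s\n' "$mods" | has_module nvidia; then
        echo "ok:   nvidia module is loaded"
    else
        echo "FAIL: nvidia module is not loaded"
    fi
    if printf '%s\n' "$mods" | has_module nouveau; then
        echo "FAIL: nouveau module is loaded"
    else
        echo "ok:   nouveau module is not loaded"
    fi
    if blacklists_nouveau /etc/modprobe.d/nvidia-installer-disable-nouveau.conf; then
        echo "ok:   nouveau is blacklisted"
    else
        echo "FAIL: nouveau blacklist file missing or incomplete"
    fi
    if nvidia-smi >/dev/null 2>&1; then
        echo "ok:   nvidia-smi runs"
    else
        echo "FAIL: nvidia-smi missing or returned an error"
    fi
fi
```

Matching the module name exactly avoids a false positive from related modules such as nvidia_uvm, which a plain grep nvidia would also match.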
ethminer setup

We will be using ethminer but the below is also applicable for claymore. Visit the releases page on GitHub and download the latest version (ethminer-X.YY.Z-Linux.tar.gz).
As of the time of writing (July 2017), the project has a proper build pipeline set up for both Windows and Linux x64, so there should be no need to compile it yourself.
Extract the tar.gz archive to wherever you like (preferably somewhere in your $PATH) on your mining machine. Then ensure it runs properly, e.g. ethminer --help, and perform a benchmark or two to check that everything is working: ethminer -U -M.
On a single nVidia GTX 970 I observed roughly 20.78 MH/s on the above benchmark. A value lower than this on a more powerful GPU would indicate an issue with your setup. Ensure the GPU driver is installed and loaded correctly.
ethminer systemd unit

Instead of running directly via the CLI, we should let systemd handle starting ethminer for us. Create a file /etc/systemd/system/eth-miner.service and add the content from here, changing the variables as appropriate.


Change the flag -G to -U if you want to use CUDA instead of OpenCL. I found that OpenCL would cause one CPU core to be pegged at 100%.


Be sure to replace 127.0.0.1:8080/NAMEOFTHISRIG with the URL of whatever pool you are using. Make sure you include your correct wallet address in your connection URL, unless you want to give someone free ETH!


Once this is done, run sudo systemctl daemon-reload to force systemd to re-read its unit files, and then run sudo service eth-miner start to start ethminer.
You can view ethminer output by running the command journalctl -u eth-miner.service -f
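For readers who want to see the general shape of such a unit, here is a rough sketch written as a shell function that fills in the pool URL. It is not the linked unit file. The function name render_unit, the install path /usr/local/bin/ethminer, the restart settings, and the use of -F to pass a getwork pool URL are all assumptions on my part; check ethminer --help for the flags your release actually accepts.

```shell
#!/bin/sh
# Sketch: print a systemd unit for ethminer to stdout.
# render_unit POOL_URL [ETHMINER_PATH]
render_unit() {
    pool_url=$1
    ethminer_bin=${2:-/usr/local/bin/ethminer}   # assumed install location
    cat <<EOF
[Unit]
Description=ethminer (CUDA)
After=network-online.target
Wants=network-online.target

[Service]
# -U selects CUDA, -G selects OpenCL. -F (pool URL) is an assumption.
ExecStart=${ethminer_bin} -U -F ${pool_url}
Restart=always
RestartSec=30

[Install]
WantedBy=multi-user.target
EOF
}

# Example usage (not run here, it writes to /etc):
#   render_unit "http://127.0.0.1:8080/NAMEOFTHISRIG" |
#       sudo tee /etc/systemd/system/eth-miner.service
#   sudo systemctl daemon-reload
```

Restart=always lets systemd bring the miner back if it exits, and the [Install] section means sudo systemctl enable eth-miner should start it at boot. If your pool URL contains a literal % sign, it needs to be doubled (%%) inside a unit file.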
Monitoring

I found the following things useful:


Install byobu and configure it to start on login. It's nice to not have to worry about detaching/reattaching screen sessions or having whatever you were running die when you suspend your laptop.


As a bonus, you can have it monitor your mining rig's temperature if you edit ~/.byobu/statusrc and add MONITORED_TEMP. This can be different on many systems; for me it's MONITORED_TEMP=/sys/class/thermal/thermal_zone0/temp. Have a poke around in /proc and /sys to see where your temp sensors live.


Monitor the output of nvidia-smi either via watch or by running nvidia-smi -l INTERVAL_IN_SECONDS. If you find your GPU in performance state P2, you should be able to gain some extra performance by setting the application clock:


Run nvidia-smi -q -d SUPPORTED_CLOCKS to see the supported GPU/Memory clock rates, and then run nvidia-smi -ac MEM_CLOCK,GPU_CLOCK to set the clock.


You should now see nvidia-smi report a P0 performance state. On a GTX 970, the maximum MEM/GPU clock pairing is 3505 MHz and 1455 MHz. If you wish, you may also underclock the GPU clock to reduce power consumption, potentially at the expense of hash rate.


htop is also a useful program to leave open to see your system load, and if anything is causing undue strain on the system.


If you are into Prometheus and Haskell, there is a Haskell module to monitor nvidia-smi. I haven't looked into this personally, but I am liking the idea of writing a Go module.
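For the application clock step above, picking the highest pairing out of the SUPPORTED_CLOCKS listing can be scripted. A minimal sketch, assuming a single GPU and assuming the listing has "Memory : N MHz" lines each followed by indented "Graphics : N MHz" lines; max_app_clocks is a name I made up:

```shell
#!/bin/sh
# Sketch: reads `nvidia-smi -q -d SUPPORTED_CLOCKS` output on stdin and
# prints "MEM,GPU" for the highest memory clock and the highest graphics
# clock listed under it. Exits non-zero if no clocks were found.
max_app_clocks() {
    awk '
        $1 == "Memory" && $2 == ":" {
            cur = $3 + 0
            if (cur > best_mem) { best_mem = cur; best_gfx = 0 }
        }
        $1 == "Graphics" && $2 == ":" {
            if (cur == best_mem && $3 + 0 > best_gfx) best_gfx = $3 + 0
        }
        END {
            if (best_mem) print best_mem "," best_gfx
            else exit 1
        }
    '
}

# Example usage (not run here, it needs the driver and root):
#   nvidia-smi -q -d SUPPORTED_CLOCKS | max_app_clocks
#   sudo nvidia-smi -ac "$(nvidia-smi -q -d SUPPORTED_CLOCKS | max_app_clocks)"
```

The output is already in the MEM_CLOCK,GPU_CLOCK order that nvidia-smi -ac expects. With more than one card installed you would want to query each GPU separately (nvidia-smi -i INDEX), since this sketch does not distinguish between them.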


Overclocking

Be careful when overclocking. You could damage your hardware.
nVidia drivers above 37x.xx support increasing GPU graphics clock and memory transfer rates above the default maximum values. This can net you some extra hash rate at the expense of increased power usage and heat generation... and dead hardware if you're not careful.
Currently this can only be accomplished via nvidia-settings, which requires an X server to be running. Luckily it does provide a command-line interface and can be tricked into not caring about opening a GUI window.
Xorg setup

To do this we first need to install a minimal X server and a display manager: sudo apt install --no-install-recommends xorg lightdm lightdm-gtk-greeter. No need for a window manager. (Edit: @jeff-dagenais pointed out we might need to install lightdm-gtk-greeter as well, and also recommends removing ubuntu-drivers-common to ensure gpu-manager does not override your Xorg.conf: sudo apt remove ubuntu-drivers-common.)
Then, run sudo nvidia-xconfig to generate a minimal /etc/X11/xorg.conf. (Wow, this brings me back.) Edit this file with vim/nano or whatever you like and edit Section "Screen", adding the following options:
    Option         "AllowEmptyInitialConfiguration" "True"
    Option         "ConnectedMonitor" "DFP-0"
    Option         "Interactive" "False"
    Option         "Coolbits" "24"

The first three let us start X without having a monitor actually connected. The Coolbits line allows us to modify overclocking and overvoltage settings. The ever-wonderful ArchLinux wiki has more information on this, or you can read the HTML documents under /usr/share/doc/nvidia-*/html/.
At this point, you should be able to sudo service lightdm start and verify it is running correctly by sshing into your box with the -X option (you may have to exit your current session) and trying to run an xterm. Double-check hostname in case you opened a local xterm by accident. You can also verify it's running remotely by keeping an eye on your network usage and moving the window around a bit. It will spike like crazy.
If X doesn't start, pretend like it's 2005 again and look at the end of /var/log/Xorg.0.log to try and see where it tripped up. Google is your friend here; X11 has been causing premature hair loss since 1987.
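To save some scrolling, the error and warning lines can be filtered out of the log. A small sketch, assuming the timestamped line format that Xorg on Ubuntu 16.04 writes ("[   12.345] (EE) ..."); xorg_problems is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: keep only error (EE) and warning (WW) lines from an Xorg log
# supplied on stdin. Anchoring on the timestamp skips the "Markers:"
# legend near the top of the log, which also mentions (EE) and (WW).
xorg_problems() {
    grep -E '^\[ *[0-9]+\.[0-9]+\] \((EE|WW)\)'
}

# Example usage (not run here):
#   xorg_problems < /var/log/Xorg.0.log | tail -n 20
```

Warnings are often harmless on a headless box, so start with the (EE) lines.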
Determine GPU Perf Mode

If you get this far with no issues, awesome. To actually change some clock speeds, determine which GPUPerfMode allows this by running sudo DISPLAY=:0 XAUTHORITY=/var/run/lightdm/root/:0 nvidia-settings -q '[gpu:0]/GPUPerfModes'. (Note: lightdm needs to be running for this as we are hijacking root's XAUTHORITY). You should see something like this:
sudo DISPLAY=:0 XAUTHORITY=/var/run/lightdm/root/:0 nvidia-settings -q '[gpu:0]/GPUPerfModes'
  Attribute 'GPUPerfModes' (monolith:0[gpu:0]): perf=0, nvclock=135, nvclockmin=135, nvclockmax=405, nvclockeditable=0, memclock=324, memclockmin=324,
  memclockmax=324, memclockeditable=0, memTransferRate=648, memTransferRatemin=648, memTransferRatemax=648, memTransferRateeditable=0 ; perf=1, nvclock=135,
  nvclockmin=135, nvclockmax=1392, nvclockeditable=0, memclock=810, memclockmin=810, memclockmax=810, memclockeditable=0, memTransferRate=1620,
  memTransferRatemin=1620, memTransferRatemax=1620, memTransferRateeditable=0 ; perf=2, nvclock=135, nvclockmin=135, nvclockmax=1455, nvclockeditable=0,
  memclock=3004, memclockmin=3004, memclockmax=3004, memclockeditable=0, memTransferRate=6008, memTransferRatemin=6008, memTransferRatemax=6008,
  memTransferRateeditable=0 ; perf=3, nvclock=135, nvclockmin=135, nvclockmax=1455, nvclockeditable=1, memclock=3505, memclockmin=3505, memclockmax=3505,
  memclockeditable=1, memTransferRate=7010, memTransferRatemin=7010, memTransferRatemax=7010, memTransferRateeditable=1

The GPUPerfModes are semicolon-separated; you want to check which one has nvclockeditable=1 and memclockeditable=1. In the above example, it's 3.
Then, verify your GPU is in that perf level: sudo DISPLAY=:0 XAUTHORITY=/var/run/lightdm/root/:0 nvidia-settings -q '[gpu:0]/GPUCurrentPerfLevel'. All going well, it should report that same perf mode as indicated above, as we set the Coolbits as required in /etc/X11/xorg.conf.
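Picking the editable perf mode out of that wall of text by eye is error-prone, so here is a sketch that does it for you. It assumes the semicolon-separated format shown above; editable_perf_levels is a name I made up, not part of nvidia-settings.

```shell
#!/bin/sh
# Sketch: reads the output of
#   nvidia-settings -q '[gpu:0]/GPUPerfModes'
# on stdin and prints every perf level that has both nvclockeditable=1
# and memclockeditable=1, one per line.
editable_perf_levels() {
    # Join wrapped lines, then put each semicolon-separated mode on its own line.
    tr '\n' ' ' | tr ';' '\n' |
        awk '/nvclockeditable=1/ && /memclockeditable=1/ && match($0, /perf=[0-9]+/) {
                 # "perf=" is 5 characters; print only the number after it.
                 print substr($0, RSTART + 5, RLENGTH - 5)
             }'
}

# Example usage (not run here, it needs lightdm and the driver):
#   sudo DISPLAY=:0 XAUTHORITY=/var/run/lightdm/root/:0 \
#       nvidia-settings -q '[gpu:0]/GPUPerfModes' | editable_perf_levels
```

Fed the example output above, this prints 3.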
Overclock

At this stage you should be able to perform some overclocking. Do this in small increments and ensure the power draw and temperature do not get too high.
You can modify the configuration options GPUGraphicsClockOffset and GPUMemoryTransferRateOffset like so, assuming that 3 is the perf level that allows editing them as in my above example:
sudo DISPLAY=:0 XAUTHORITY=/var/run/lightdm/root/:0 nvidia-settings -a '[gpu:0]/GPUGraphicsClockOffset[3]=100' -a '[gpu:0]/GPUMemoryTransferRateOffset[3]=200'
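Typing the DISPLAY and XAUTHORITY prefix every time gets old quickly, so a pair of wrapper functions can help. This is a sketch only: nvs, set_offsets and the DRY_RUN variable are names I invented, and the paths are the same lightdm ones used above.

```shell
#!/bin/sh
# Sketch: run nvidia-settings against the headless X server.
# With DRY_RUN=1 the command is printed instead of executed.
nvs() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "nvidia-settings $*"
    else
        sudo DISPLAY=:0 XAUTHORITY=/var/run/lightdm/root/:0 nvidia-settings "$@"
    fi
}

# set_offsets PERF_LEVEL GFX_OFFSET MEM_OFFSET
# Applies both offsets to GPU 0 at the given perf level.
set_offsets() {
    nvs -a "[gpu:0]/GPUGraphicsClockOffset[$1]=$2" \
        -a "[gpu:0]/GPUMemoryTransferRateOffset[$1]=$3"
}

# Example usage (not run here):
#   DRY_RUN=1; set_offsets 3 100 200   # print the command first
#   DRY_RUN=0; set_offsets 3 100 200   # then apply it
```

Printing the command first is a cheap way to double-check the perf level and offsets before anything touches the hardware.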

Overclocking Example:

Baseline of 3505/1455 MEM/GFX: ~18.64 MH/s at 144W/65C.
+600/+300 MEM/GFX: peak of 23.30 MH/s at 158W/67C.
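Whether an overclock is worth it depends on hash rate per watt as much as raw hash rate. A quick sketch of that arithmetic using the two measurements above; the helper names mhs_per_watt and pct_change are my own:

```shell
#!/bin/sh
# Sketch: compare mining efficiency before and after an overclock.

# mhs_per_watt HASHRATE_MHS POWER_W: efficiency to three decimals.
mhs_per_watt() {
    awk -v h="$1" -v p="$2" 'BEGIN { printf "%.3f\n", h / p }'
}

# pct_change OLD NEW: percentage change from OLD to NEW, one decimal.
pct_change() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (b - a) / a * 100 }'
}

echo "baseline:    $(mhs_per_watt 18.64 144) MH/s per W"
echo "overclocked: $(mhs_per_watt 23.30 158) MH/s per W"
echo "hash rate:   +$(pct_change 18.64 23.30)%"
echo "power draw:  +$(pct_change 144 158)%"
```

For my numbers this works out to roughly 25% more hash rate for roughly 10% more power, so the overclocked card is also the more efficient one.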

Troubleshooting


If you start to see a large number of rejected solutions and/or 0 MH/s, try restarting ethminer: sudo service eth-miner restart. I have had to do this if my internet connection had a blip.
If the ethminer process is stuck in the shutdown state and won't restart even with a sudo kill -9, it's likely that the GPU driver has run into some issue and it's time to do a power cycle.

References


/r/EtherMining (of course)
Ethereum Systemd Unit Files
How to Squeeze Some Extra Performance Mining Ethereum on Nvidia
nvidia-smi-prometheus
How to run nvidia-settings remotely (Bless you, Mr. f0k)
Faking a Head for a Headless X Server
nvidia-settings over SSH sees my local GPU?
ArchLinux Wiki - nVidia Tips and tricks: Enabling overclocking
nVidia DevTalk Forum - Overclocking Issues