ChaiNNer with remote backend

Last active: May 15, 2024

Alex Dodge (@Eighty on Discord)

Last Updated: 27 Feb 2023

Introduction

This document describes how to run the chaiNNer backend on RunPod and connect to it with a frontend running on your local machine, so you can take advantage of cloud GPUs.

We're not using any RunPod-specific features, so this general approach should work for any cloud service, with some modifications. This is written assuming you're running Linux. At the end, I'll put some notes about Mac and Windows, as far as I can figure it out.

This was written Feb 23, 2023. It's likely that this process will change, ideally to become easier. I will try to keep this document up-to-date, but it's a good idea to refer to the chaiNNer github page to see if it has more recent information.

Summary

This is possible because chaiNNer has two components that start together when you run the application: a frontend written in Node.js that shows the UI, and a backend written in Python that executes the chains. The backend listens on port 8000 for requests from the frontend.

With that in mind, the approach proposed here is:

  1. start a runpod instance
  2. install chaiNNer and its python dependencies on the runpod and run the backend
  3. forward port 8000 to your local machine
  4. install chaiNNer and its node.js dependencies to your local machine and run the frontend
  5. move files back and forth with rsync

Process

Set up RunPod account

You have to make an account and load it with some money to start. 10 USD is the minimum, and that's plenty to try this out. (Roughly 50 hours of the cheapest pod, which has an 8GB RTX 3070.)

You also need to add an SSH key to your account. If you don't have one, you can generate one with ssh-keygen -b 4096 and follow the prompts. Then copy the contents of ~/.ssh/id_rsa.pub into this field on the RunPod settings page.
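For example, a minimal sketch (the key type and file path shown are the classic OpenSSH defaults; newer OpenSSH versions may default to a different key type, so adjust accordingly):

```shell
# Generate a 4096-bit RSA key pair; accept the default location (~/.ssh/id_rsa)
# at the prompts, or pass -f to choose another path.
ssh-keygen -t rsa -b 4096

# Print the public half so you can paste it into the RunPod settings page.
cat ~/.ssh/id_rsa.pub
```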

[screenshot: SSH public key field on the RunPod settings page]

Start a pod

Go to "Browse Servers" and find a machine to start a pod on. We need a public TCP connection, so make sure it has the "TCP" tag. Try to find one with a decent amount of system memory and vCPUs; we want the GPU to be the bottleneck, not anything else. With too little system memory, the non-GPU parts of chaiNNer may run into problems, and with too few CPUs, things like loading models can take forever. I don't have hard numbers on this, except to say that 1 vCPU is not enough.

We're also going to be moving data between the server and your local machine, so good bandwidth and geographic proximity are considerations. That said, they might be irrelevant if you're limited by your home internet connection. On the pod, download matters more than upload, since you'll be sending it large model files and it will be sending you relatively small image files. Download speed also affects how long it takes to install chaiNNer and its dependencies.

Note that chaiNNer doesn't currently support more than one GPU, so don't bother getting a multi-GPU pod.

Here's one that seems fine:

[screenshot: example server listing]

We're going to use the "RunPod Pytorch" image to start with. I'm selecting no persistent storage, and 50GB of temporary disk, because this will be plenty for my experiments today.

[screenshot: pod configuration using the RunPod Pytorch image]

Install chaiNNer on the pod

Go to "My Pods" and click the down arrow under your new pod. It will spend some time starting up; when it's ready, click "Connect" to see how to log in.

[screenshot: "My Pods" page]

[screenshot: connection details]

Use the public IP and external port number to ssh in.

# ON LOCAL
ssh root@$HOST -p $PORT

(I'm going to use $HOST and $PORT going forward to indicate whatever the host and port of your pod are. From the screenshot above, this is HOST=74.218.30.108 and PORT=9581)

Install the stuff we need on the pod:

# ON POD
apt-get update && apt-get install -y rsync libgl1 python3.9 python3.9-venv
git clone https://github.com/chaiNNer-org/chaiNNer
cd chaiNNer
python3.9 -m venv venv
source ./venv/bin/activate
pip3 install -r requirements.txt

You don't need to install any node.js stuff, because that's all being run on your local host.

Run the chaiNNer backend

# ON POD
python3 ./backend/src/run.py 8000

This will start the backend. If you want to keep this running without keeping a terminal open, use something like tmux or screen. (Out of scope for this document.)
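If tmux or screen isn't available on the pod image, one minimal alternative is nohup (a sketch; the log and PID file names are arbitrary choices):

```shell
# Detach the backend from the terminal and log its output to a file.
# Run this from the chaiNNer checkout, with the venv activated.
nohup python3 ./backend/src/run.py 8000 > backend.log 2>&1 &
echo $! > backend.pid    # save the PID so the backend can be stopped later

# Later, to stop it:
# kill "$(cat backend.pid)"
```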

Install chaiNNer on your local machine

We need to use the GitHub version, not a packaged application. So, we'll check it out and install the dependencies ourselves.

# ON LOCAL
git clone https://github.com/chaiNNer-org/chaiNNer
cd chaiNNer
sudo apt-get update && sudo apt-get install npm
npm install --force  # npm installs things to the current directory, not globally

You don't need to install any python stuff, because that's all being run on the remote machine.

Port forwarding

This will forward port 8000 on your local machine to port 8000 on the remote machine.

# ON LOCAL
ssh root@$HOST -p $PORT -L 8000:localhost:8000 -N

This will also occupy a shell. If you want to keep it running without keeping a terminal open, again I suggest tmux. Note that if the backend stops, this will probably need to be restarted as well. autossh is a good way to keep connections like this up, but I'll let you figure that out yourself.
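As a middle ground before reaching for autossh, OpenSSH's own keepalive options make a dead tunnel exit promptly instead of hanging silently, so you (or a simple retry loop) can restart it:

```shell
# Send a keepalive every 30 seconds and give up after 3 missed replies,
# so a broken tunnel exits instead of hanging.
ssh root@$HOST -p $PORT -L 8000:localhost:8000 -N \
    -o ServerAliveInterval=30 -o ServerAliveCountMax=3
```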

Run the chaiNNer frontend

# ON LOCAL
npm run frontend

Once again, this will occupy a shell.

Demo

You can now run chains that don't use local files. They will execute on the remote backend. I suggest using the Create Noise node to test that it's working.

[screenshot]

(An example of the sort of chain you could run now.)

Set up a local data directory

We definitely need to be able to load our own models and files, so let's set up a data directory that will be synced between the two machines. This directory needs to be at the same path on both machines, so that you can use the frontend to select a file, and the backend will find that file at the same location.

I suggest making a subdirectory in your home directory, then making a directory with the same name (including the entire path to your home directory) on the pod.

# ON BOTH MACHINES
# NAME is whatever your username is on your host machine
# DATA_PATH could be /home/$NAME/Documents/chaiNNer_data/
mkdir -p $DATA_PATH

(The "/" at the end of the data path is important, for rsync later.)

Put the models and files you want to use in here. Keep it to things you're actually using, because you're going to need to transfer this. Also it needs to fit in the storage that you picked when you set up your pod.

[screenshot: data directory contents]

Rsync to the pod

We can use rsync to move the files. This will only move new files or files that have changed, so we can just run it again and it will transfer only what's needed.

# ON LOCAL
rsync -rav -e "ssh -p $PORT" $DATA_PATH root@$HOST:$DATA_PATH

Demo

Now you can run chains that load files from this directory and save them back. The new files won't be on your local machine until you copy them back.

[screenshot]

Rsync back to your local machine

Note this command is the same, except the last two arguments are flipped. We're syncing from the pod to our local path. Again, this will only move files that have changed, so it should just copy the new images you saved on the pod.

# ON LOCAL
rsync -rav -e "ssh -p $PORT" root@$HOST:$DATA_PATH/ $DATA_PATH/

[screenshots: results synced back to the local machine]

Done

There you have it. It's a little hacky, but surprisingly easy considering this is not a supported way of using chaiNNer. (Yet?) Hopefully if this is something people want to do, support can be added to make it easier.

Remember to copy your files off and terminate the RunPod instance when you're done; otherwise it'll eventually use all the money you loaded in. There's a useful indicator at the top of the RunPod page showing how much you're spending. [screenshot]

Other Operating Systems

Mac

I believe these steps would mostly work on macOS, since git, ssh, and rsync are built in. The only problem I anticipate is installing npm; I think you would have to install it from the node.js download page.

Alternatively, there is this even hackier approach that avoids having to do anything with git or npm: Using the packaged executable for chaiNNer, start the application, then manually find and kill the backend process without killing the frontend process. (The backend process should be the one ending with run.py 8000.) Run the port forward command and the frontend should just start talking to the remote backend on the same port.

I have not tested this fully, but it does seem to work. Presumably, it will be more reliable if the backend and frontend are the same version, so maybe check out the tagged release when you install chaiNNer on the pod. You still have to forward a port and rsync the files.
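A sketch of finding and killing the packaged backend (the "run.py 8000" pattern is what I'd expect based on the process description above; verify it with ps aux | grep run.py first, since it may differ between chaiNNer versions):

```shell
# Find the backend by its command line, then kill it. pgrep/pkill -f
# match against the full command line, not just the process name.
pgrep -f "run.py 8000"        # note the PID(s) it prints
pkill -f "run.py 8000"

# Then forward the port so the still-running frontend talks to the pod:
ssh root@$HOST -p $PORT -L 8000:localhost:8000 -N
```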

Windows

I have never used Windows to do development, so I don't know much about how it would work. If rsync and ssh are installed (or equivalent Windows utilities), maybe the same alternate strategy for MacOS would work. (Killing the backend process and replacing it with a forwarded port to the remote backend.) I welcome feedback and ideas.

(Feb 27: One obvious problem we can anticipate is that Windows filepaths look nothing like Linux filepaths, which might mean this is impractical without explicit support in ChaiNNer for syncing files.)

Future Work

  • make this more user-friendly
  • put a chaiNNer backend image on Docker Hub so setting up the pod is faster
  • hosting models/data on S3 or google drive or something, to get around a slow home internet connection
  • try something like sshfs to mount the data directory into the remote pod. This would only be viable if the data is cached on the remote end, unless there's a lot of bandwidth. Also, the way I'm imagining it, it would involve sshing into your home computer from the pod, which I would be reluctant to do considering the "trust-based" security model of RunPod.
  • I believe the dependency manager installs things on the local machine, not through the backend
Comments

@Amit30swgoh:

colab pls

@ProtoBelisarius:
I've tried this early may 2024 with the vast(dot)ai provider and it worked pretty well still.

Only additions now are that chaiNNer itself uses Python 3.11, so making a venv specifically with 3.9 seems unnecessary, especially as there is a likely chance the deadsnakes PPA will have to be added.

It's also necessary to manually install pytorch into the venv, though I'm unsure if that's a vast thing.
