@BloodBlight
Forked from In-line/Dockerfile
Last active October 19, 2023 16:12
AMD 7900 XTX Stable Diffusion Web UI docker container (ROCM 5.6)

Make a spot and go there:

mkdir -p ~/SD
cd ~/SD

Put the dockerfile and docker-compose.yml here.

You're on your own on that. ;)

Build the image:

docker build -t sd-mrocm-5-6 .

Start it up!

docker-compose up -d

Now watch the logs to see it do its thing. This will take a while the first time:

docker logs -f sd-mrocm-5-6

It's usually finished once the "lve/main/config.json" file has been downloaded.
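
If you'd rather not stare at the logs, something like the sketch below can poll the web UI until it answers. This is not part of the original setup: `wait_for_ui` is a made-up helper name, the 30 minute default timeout is a guess, and it assumes `curl` is installed on the host.

```shell
#!/bin/sh
# Hypothetical helper (not from the gist): poll the web UI until it
# answers, or give up after a timeout.
# Usage: wait_for_ui [url] [timeout_seconds]
wait_for_ui() {
    url="${1:-http://127.0.0.1:7860}"
    timeout="${2:-1800}"   # assumed default; first start can be slow
    waited=0
    # -s hides progress output, -f treats HTTP errors as failure,
    # -o /dev/null throws the page body away.
    until curl -sf -o /dev/null "$url"; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "gave up after ${waited}s waiting for $url" >&2
            return 1
        fi
        sleep 5
        waited=$((waited + 5))
    done
    echo "web UI is answering at $url"
}
```

Paste the function into a shell and run `wait_for_ui` after `docker-compose up -d`. It only tells you the port is answering, so the UI may still be pulling things in the background.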

In the future, you can just run this to re-start it:

docker start sd-mrocm-5-6

It's now running as a service, browse to: http://127.0.0.1:7860

NOTES

The web UI will pull things in the background and may act odd the first few times you run things. WAIT for a while after a click and watch your GPU and network load to see what's up. Sometimes just hitting F5 to reload the page will fix things.
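
The compose file hands /dev/kfd and /dev/dri to the container, and Docker will refuse to start it if either one is missing on the host. A quick pre-flight check could look like the sketch below. `check_devices` is a hypothetical helper name, not something that ships with Docker or ROCm.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not from the gist): report every path
# that does not exist and return non-zero if any were missing.
# Usage: check_devices /dev/kfd /dev/dri
check_devices() {
    missing=0
    for dev in "$@"; do
        if [ ! -e "$dev" ]; then
            echo "missing: $dev" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_devices /dev/kfd /dev/dri && docker-compose up -d` only starts the container when both device nodes are there. If /dev/kfd is missing, the likely (but not certain) cause is that the host's AMD GPU kernel driver is not loaded.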

version: "3.9"
services:
  sd:
    image: sd-mrocm-5-6
    container_name: sd-mrocm-5-6
    ports:
      - "7860:7860"
    volumes:
      - ./models:/SD/stable-diffusion-webui/models/
      - ./repositories:/SD/stable-diffusion-webui/repositories/
      - ./extensions:/SD/stable-diffusion-webui/extensions/
      - ./outputs:/SD/stable-diffusion-webui/outputs/
    devices:
      - '/dev/kfd:/dev/kfd'
      - '/dev/dri:/dev/dri'
    security_opt:
      - seccomp:unconfined
    group_add:
      - video
#docker run -ti rocm/composable_kernel:ck_ub20.04_rocm5.6 bash
FROM rocm/composable_kernel:ck_ub20.04_rocm5.6
RUN mkdir /SD
# Clone SD
WORKDIR /SD
RUN git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
WORKDIR /SD/stable-diffusion-webui
RUN git reset --hard 22bcc7be428c94e9408f589966c2040187245d81
RUN apt update && apt install -y python3.8-venv
# Create the venv once; putting its bin/ first on PATH (next line) is what "activates" it
ENV VIRTUAL_ENV=/SD/stable-diffusion-webui/venv
RUN python3 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
RUN python3 -m pip install --upgrade pip wheel
ENV HIP_VISIBLE_DEVICES=0
ENV PYTORCH_ROCM_ARCH="gfx1100"
ENV CMAKE_PREFIX_PATH=/SD/stable-diffusion-webui/venv/
ENV USE_CUDA=0
# Remove old torch and torchvision
RUN pip uninstall -y torch torchvision
# pytorch
RUN pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm5.5
WORKDIR /SD/stable-diffusion-webui
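# Patch requirements.txt in place (-i) to drop only the plain torch line,
# so pip keeps the nightly ROCm build installed above
RUN sed -i -E '/^torch([=<>~!; ]|$)/d' requirements.txt
RUN pip install -r requirements.txt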
EXPOSE 7860/tcp
# Fix for "detected dubious ownership in repository" by rom1win.
RUN git config --global --add safe.directory '*'
CMD python3 launch.py --listen --disable-safe-unpickle --skip-torch-cuda-test

BloodBlight commented Jul 3, 2023

You can also make variations by adjusting the compose file and have different versions at the same time. I think you might be able to run at least 2 at the same time with an XTX. Probably not on the XT.

I tried doing a single 1024x1024 image on my XTX and it used so much VRAM it killed my desktop session. It ended up working, I just lost my login.

Dnak-jb commented Jul 4, 2023

Oh, no, I was talking about AUTOMATIC1111 being an older build. It was as simple to change as I thought, though. I just replaced the hash in the git reset --hard command with the newest build they have. Then I built a new Docker image with that hash and a new name in another folder.

As for not being able to get to 1024 without issue, that is strange. With your provided Dockerfiles I was able to do 1024 easily at around 16it/s. Highres fix went at about 2.5it/sec. It was only eating about 17 to 18 GB of VRAM doing so. When I upgraded to my new Docker image I was able to use the InvokeAI optimizations instead of the default Doggettx in your Dockerfiles, and that allowed me to go all the way to 2048 image size at about 4s/it.

Edit: just tested with the InvokeAI optimizations and I was able to batch 8 x 1024 images at once. It used 16 GB during both the highres fix portion and the first pass. Total time was 1:01 with facefix. 2.07it/s during regular and 3.55s/it during highres fix. Using DPM++ 2M Karras at 50 steps normal and 10 for highres.

One thing I did do, though, was stop Firefox from using hardware acceleration to prevent it from eating VRAM for the session. But that's it.

@SebaLenny

Unfortunately I get 😭
Error response from daemon: error gathering device information while adding custom device "/dev/kfd": no such file or directory

BloodBlight commented Oct 19, 2023 via email
