@fbarretto
Last active December 18, 2024 19:43
StreamDiffusion on a Mac

This gist shows how to get StreamDiffusion running on a Mac using PyTorch's MPS (Metal Performance Shaders) backend.

  1. Clone the repo

git clone https://github.com/cumulo-autumn/StreamDiffusion.git
  2. Set up the environment

cd StreamDiffusion
python3 -m venv venv  # or: python -m venv venv
source venv/bin/activate
  3. Install dependencies and StreamDiffusion

pip install --upgrade pip

pip install --pre torch torchvision --extra-index-url https://download.pytorch.org/whl/nightly/cpu

pip install .          
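
Before moving on, it can help to confirm that the installed PyTorch build actually sees the MPS backend. The following is a minimal sketch, not part of StreamDiffusion: the function name `describe_backend` is hypothetical, and it assumes a PyTorch version that exposes `torch.backends.mps` (it degrades gracefully if that attribute or PyTorch itself is missing).

```python
import platform


def describe_backend() -> dict:
    """Collect basic facts about the local PyTorch install (hypothetical helper)."""
    info = {
        "machine": platform.machine(),  # CPU architecture reported by the OS
        "torch": None,                  # stays None if PyTorch is not importable
        "mps_built": False,
        "mps_available": False,
    }
    try:
        import torch
    except ImportError:
        return info

    info["torch"] = torch.__version__
    # Assumption: torch.backends.mps exists in this PyTorch version.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None:
        info["mps_built"] = bool(mps.is_built())          # compiled with MPS support
        info["mps_available"] = bool(mps.is_available())  # usable on this machine
    return info


if __name__ == "__main__":
    for key, value in describe_backend().items():
        print(f"{key}: {value}")
```

If `mps_available` comes back false inside the virtual environment, the later steps will most likely fail when the demos try to move the model to the "mps" device.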
  4. Edit pipeline.py to support non-CUDA devices

  • Open venv/lib/python3.11/site-packages/streamdiffusion/pipeline.py (adjust python3.11 to match your Python version)

At line 439, replace the __call__ method with the following:

    # Hack: only record and synchronize CUDA events on CUDA devices.
    # Timing (profiling) is skipped on every other device.
    @torch.no_grad()
    def __call__(
        self, x: Union[torch.Tensor, PIL.Image.Image, np.ndarray] = None
    ) -> torch.Tensor:
        if self.device == "cuda":
            start = torch.cuda.Event(enable_timing=True)
            end = torch.cuda.Event(enable_timing=True)
            start.record()
        if x is not None:
            x = self.image_processor.preprocess(x, self.height, self.width).to(
                device=self.device, dtype=self.dtype
            )
            if self.similar_image_filter:
                x = self.similar_filter(x)
                if x is None:
                    time.sleep(self.inference_time_ema)
                    return self.prev_image_result
            x_t_latent = self.encode_image(x)
        else:
            # TODO: check the dimension of x_t_latent
            x_t_latent = torch.randn((1, 4, self.latent_height, self.latent_width)).to(
                device=self.device, dtype=self.dtype
            )
        x_0_pred_out = self.predict_x0_batch(x_t_latent)
        x_output = self.decode_image(x_0_pred_out).detach().clone()

        self.prev_image_result = x_output
        if self.device == "cuda":
            end.record()
            torch.cuda.synchronize()
            inference_time = start.elapsed_time(end) / 1000
            self.inference_time_ema = 0.9 * self.inference_time_ema + 0.1 * inference_time
        return x_output
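
The patch above drops timing entirely on non-CUDA devices, so the inference-time average is never updated there. If that average is wanted on a Mac, one option is wall-clock timing with the same 0.9 / 0.1 weighting used in the method. This is only a sketch under assumptions: `EmaTimer` is a hypothetical helper, not StreamDiffusion code, and because GPU work is typically asynchronous the device would need to be synchronized before the clock is stopped (recent PyTorch versions are assumed to provide `torch.mps.synchronize()` for this).

```python
import time


class EmaTimer:
    """Exponential moving average of wall-clock durations (hypothetical helper)."""

    def __init__(self, decay: float = 0.9) -> None:
        self.decay = decay
        self.ema = 0.0  # seconds

    def update(self, seconds: float) -> float:
        # Same weighting as the patched method: 0.9 * old + 0.1 * new.
        self.ema = self.decay * self.ema + (1.0 - self.decay) * seconds
        return self.ema

    def measure(self, fn, *args, **kwargs):
        """Run fn, fold its duration into the average, and return its result."""
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        # Assumption: for GPU work, synchronize the device here before
        # reading the clock, otherwise only the launch overhead is timed.
        self.update(time.perf_counter() - start)
        return result


if __name__ == "__main__":
    timer = EmaTimer()
    timer.measure(time.sleep, 0.01)
    print(f"inference_time_ema ~ {timer.ema:.4f} s")
```

Wall-clock timing is coarser than CUDA events, but it is enough for the sleep used by the similar-image filter.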
  5. Run the demos

txt2img

Make sure you have Node.js installed and npm or pnpm in your PATH.

  • Edit config.py to set
    torch.device("cuda" if torch.cuda.is_available() else "mps")
    ...
    acceleration: Literal["none", "xformers", "tensorrt"] = "none"
  • Run
cd demo/realtime-txt2img
pip install -r requirements.txt
./start.sh

or, to build the frontend and start the server manually:

cd demo/realtime-txt2img
pip install -r requirements.txt
cd frontend
pnpm i
pnpm run build
cd ..
python main.py

img2img

  • Edit main.py (line 161)
device = torch.device("cuda" if torch.cuda.is_available() else "mps")
  • Run the demo
cd demo/realtime-img2img
pip install -r requirements.txt
./start.sh
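
Both demo edits above use the same expression, which falls back to "mps" whenever CUDA is missing, even on machines where MPS is not available either. A slightly more defensive selection could look like the sketch below. The helper names `pick_device_name` and `detect_device_name` are hypothetical, and the code assumes a PyTorch version that exposes `torch.backends.mps.is_available()`.

```python
def pick_device_name(cuda_ok: bool, mps_ok: bool) -> str:
    """Prefer CUDA, then MPS, then fall back to the CPU."""
    if cuda_ok:
        return "cuda"
    if mps_ok:
        return "mps"
    return "cpu"


def detect_device_name() -> str:
    """Query PyTorch for available backends (returns "cpu" if PyTorch is missing)."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Assumption: torch.backends.mps exists in this PyTorch version.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps is not None and mps.is_available())
    return pick_device_name(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print(detect_device_name())
```

In the demos, the result would be used as `device = torch.device(detect_device_name())`. Running on the CPU is expected to be far slower, so the fallback mainly serves to produce a clear result instead of a device error.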
  6. References

[1] cumulo-autumn/StreamDiffusion#34

[2] cumulo-autumn/StreamDiffusion#125

[3] https://developer.apple.com/metal/pytorch/

[4] cumulo-autumn/StreamDiffusion#134

@jobeejoba
Sorry for not answering. Yes, this works!
Thanks, FBarreto