Get Flash Attention + Axolotl + Torch to work on CUDA 12.1

This fixed it for me:

pip uninstall -y torch flash-attn
conda install pytorch==2.1.2 torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -U git+https://github.com/Dao-AILab/flash-attention
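The usual failure mode here is a PyTorch wheel built against one CUDA major version while flash-attn compiles its extension against the system toolkit, so the two must agree. After installing, you can check the torch build's CUDA tag with `python -c "import torch; print(torch.version.cuda)"`. Below is a minimal sketch of that compatibility check as a standalone helper; the function name and logic are my own for illustration, not part of torch or flash-attn:

```python
def cuda_compatible(torch_cuda: str, system_cuda: str) -> bool:
    """Return True if a torch build's CUDA version string (e.g. the value
    of torch.version.cuda) shares a major version with the system CUDA
    toolkit -- the condition flash-attn's build generally needs."""
    torch_major = torch_cuda.split(".")[0]
    system_major = system_cuda.split(".")[0]
    return torch_major == system_major


# A cu121 torch build against a CUDA 12.1 toolkit is fine...
print(cuda_compatible("12.1", "12.1"))   # True
# ...but a cu118 build against CUDA 12.1 is the mismatch that breaks the build.
print(cuda_compatible("11.8", "12.1"))   # False
```

If the check fails, reinstalling torch from the `pytorch-cuda=12.1` channel as above is exactly what realigns the two.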