Last active: May 7, 2024 21:28
fsdp-qlora-llama3
Reference: https://github.com/philschmid/deep-learning-pytorch-huggingface/blob/main/training/fsdp-qlora-distributed-llama3.ipynb

Expected memory usage:
Full fine-tuning with FSDP needs ~16x 80GB GPUs
FSDP + LoRA needs ~8x 80GB GPUs
FSDP + Q-LoRA needs ~2x 40GB GPUs
FSDP + Q-LoRA + CPU offloading needs 4x 24GB GPUs, at ~22 GB per GPU and 127 GB of CPU RAM, with a sequence length of 3072 and a batch size of 1.
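The GPU counts above can be sanity-checked with a back-of-envelope calculation. This is a rough sketch only: it assumes Llama-3 70B (~70.6B parameters — the model size is implied by the notebook title, not stated here), even FSDP sharding, ~16 bytes/param for full fine-tuning with mixed-precision Adam (bf16 weights + grads, fp32 master/optimizer states), and 0.5 bytes/param for frozen 4-bit base weights under Q-LoRA. It ignores activations, adapter states, and framework overhead.

```python
# Back-of-envelope per-GPU memory for training state, evenly sharded by
# FSDP. Assumed numbers, not from the gist: 70.6e9 params (Llama-3 70B),
# 16 bytes/param for full fine-tuning, 0.5 bytes/param for 4-bit weights.

PARAMS = 70.6e9  # assumed parameter count

def gb_per_gpu(bytes_per_param: float, n_gpus: int) -> float:
    """GB of sharded training state held by each GPU."""
    return PARAMS * bytes_per_param / n_gpus / 1024**3

print(gb_per_gpu(16, 16))   # full fine-tune across 16 GPUs: ~66 GB/GPU -> 80GB cards
print(gb_per_gpu(0.5, 2))   # Q-LoRA 4-bit base weights on 2 GPUs: ~16 GB/GPU
```

Both figures line up with the table: ~66 GB/GPU of state (before activations) explains why full fine-tuning wants 16x 80GB cards, and ~16 GB/GPU of frozen 4-bit weights leaves headroom for activations on 2x 40GB cards.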
Tue May  7 20:43:36 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100-SXM4-40GB          On  | 00000000:07:00.0 Off |                    0 |
| N/A   45C    P0             248W / 400W |  20611MiB / 40960MiB |     82%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM4-40GB          On  | 00000000:08:00.0 Off |                    0 |
| N/A   41C    P0             242W / 400W |  20755MiB / 40960MiB |     82%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA A100-SXM4-40GB          On  | 00000000:09:00.0 Off |                    0 |
| N/A   44C    P0             235W / 400W |  20253MiB / 40960MiB |     84%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   3  NVIDIA A100-SXM4-40GB          On  | 00000000:0A:00.0 Off |                    0 |
| N/A   45C    P0             252W / 400W |  20755MiB / 40960MiB |     83%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   4  NVIDIA A100-SXM4-40GB          On  | 00000000:0B:00.0 Off |                    0 |
| N/A   44C    P0             205W / 400W |  20253MiB / 40960MiB |     85%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   5  NVIDIA A100-SXM4-40GB          On  | 00000000:0C:00.0 Off |                    0 |
| N/A   40C    P0             249W / 400W |  20253MiB / 40960MiB |     86%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   6  NVIDIA A100-SXM4-40GB          On  | 00000000:0D:00.0 Off |                    0 |
| N/A   42C    P0             253W / 400W |  20253MiB / 40960MiB |     85%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   7  NVIDIA A100-SXM4-40GB          On  | 00000000:0E:00.0 Off |                    0 |
| N/A   46C    P0             273W / 400W |  20611MiB / 40960MiB |     86%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A     10795      C   /usr/bin/python3                          20576MiB |
|    1   N/A  N/A     10796      C   /usr/bin/python3                          20720MiB |
|    2   N/A  N/A     10797      C   /usr/bin/python3                          20218MiB |
|    3   N/A  N/A     10798      C   /usr/bin/python3                          20218MiB |
|    4   N/A  N/A     10799      C   /usr/bin/python3                          20218MiB |
|    5   N/A  N/A     10800      C   /usr/bin/python3                          20218MiB |
|    6   N/A  N/A     10801      C   /usr/bin/python3                          20218MiB |
|    7   N/A  N/A     10802      C   /usr/bin/python3                          20576MiB |
+---------------------------------------------------------------------------------------+
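For scripted monitoring, the same readings are available in machine-readable form via `nvidia-smi --query-gpu=index,memory.used,memory.total --format=csv,noheader,nounits`. A small sketch parsing that CSV output; the sample string below mirrors GPUs 0 and 1 from the table above rather than a live query:

```python
import csv
import io

# Sample output in the --format=csv,noheader,nounits shape, taken from
# GPUs 0 and 1 of the table above (MiB used, MiB total).
sample = "0, 20611, 40960\n1, 20755, 40960"

def memory_fraction(text: str) -> dict[int, float]:
    """Map each GPU index to its fraction of memory in use."""
    out = {}
    for row in csv.reader(io.StringIO(text)):
        idx, used, total = (int(field) for field in row)
        out[idx] = used / total
    return out

usage = memory_fraction(sample)
print(usage)  # roughly {0: 0.503..., 1: 0.506...} -- ~50% of each 40GB card
```

This matches the table: the Q-LoRA run occupies about half of each 40GB A100, consistent with the "~2x 40GB GPUs" sizing when only two cards are used.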