@simonw
Created October 14, 2025 03:11
Claude Code v2.0.14
Welcome back Simon!
Sonnet 4.5 · Claude Max
/home/dev
Tips for getting started: Run /init to create a CLAUDE.md file with instructions for Claude
Note: You have launched claude in your home directory. For the best experience, launch it in a project…
Recent activity: No recent activity
> Tell me things about this environment, in particular the CUDA abilities
● I'll gather information about your environment and CUDA capabilities.
● Bash(nvidia-smi)
⎿  Mon Oct 13 20:54:05 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.82.09 Driver Version: 580.82.09 CUDA Version: 13.0 |
… +17 lines (ctrl+o to expand)
● Bash(nvcc --version)
⎿  Error: /bin/bash: line 1: nvcc: command not found
● Bash(cat /usr/local/cuda/version.txt 2>/dev/null || cat /usr/local/cuda/version.json 2>/dev/null || echo "CUDA version file not found")
⎿  {
"cuda" : {
"name" : "CUDA SDK",
… +151 lines (ctrl+o to expand)
● Bash(lscpu | grep -E "Model name|CPU\(s\)|Thread|Core")
⎿  CPU(s): 20
On-line CPU(s) list: 0-19
Model name: Cortex-X925
… +8 lines (ctrl+o to expand)
● Bash(free -h)
⎿   total used free shared buff/cache available
Mem: 119Gi 7.5Gi 100Gi 17Mi 12Gi 112Gi
Swap: 0B 0B 0B
● Bash(python3 --version 2>&1 && python --version 2>&1)
⎿  Error: Python 3.12.3
/bin/bash: line 1: python: command not found
● Bash(ls -la /usr/local/cuda* 2>/dev/null | head -20)
⎿  lrwxrwxrwx 1 root root 22 Sep 8 10:33 /usr/local/cuda -> /etc/alternatives/cuda
lrwxrwxrwx 1 root root 25 Sep 8 10:33 /usr/local/cuda-13 -> /etc/alternatives/cuda-13
… +17 lines (ctrl+o to expand)
● Bash(python3 -c "import torch; print(f'PyTorch: {torch.__version__}'); print(f'CUDA Available: {torch.cuda.is_available()}');
print(f'CUDA Version: {torch.version.cu…)
⎿  Error: Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'torch'
● Bash(df -h /home/dev)
⎿  Filesystem Size Used Avail Use% Mounted on
overlay 3.7T 194G 3.3T 6% /
● Bash(python3 -c "import sys; import pkg_resources; packages = [p.project_name for p in pkg_resources.working_set]; ml_packages = [p for
p in packages if any(x in p.l…)
⎿ <string>:1: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
No common ML packages found
● Bash(echo $PATH | tr ':' '\n' | grep -i cuda)
⎿  No matches found
● Bash(echo $LD_LIBRARY_PATH | tr ':' '\n' | grep -i cuda)
⎿  No matches found
● Bash(uname -m && arch)
⎿  aarch64
aarch64
● Environment Summary
System Overview
- OS: Linux 6.11.0-1014-nvidia (Ubuntu/Debian-based)
- Architecture: aarch64 (ARM64)
- CPU: ARM-based with 20 cores
  - 10x Cortex-X925 cores (performance cores)
  - 10x Cortex-A725 cores (efficiency cores)
- RAM: 119 GB total (112 GB available)
- Storage: 3.7 TB total, 3.3 TB available (6% used)
- Python: 3.12.3
CUDA Capabilities
GPU Hardware:
- GPU: NVIDIA GB10 (Blackwell architecture - latest generation)
- Temperature: 33°C (idle)
- Power: 4W (idle state)
- Status: Persistence mode enabled, no running processes
CUDA Software:
- Driver Version: 580.82.09
- CUDA SDK: 13.0.1
- Installation: /usr/local/cuda-13.0
- Included Tools:
  - NVCC compiler (13.0.88) - installed but not in PATH
  - cuBLAS 13.0.2.14 (linear algebra)
  - cuFFT 12.0.0.61 (fast Fourier transforms)
  - cuRAND 10.4.0.35 (random number generation)
  - cuSOLVER 12.0.4.66 (linear solvers)
  - cuSPARSE 12.6.3.3 (sparse matrix operations)
  - NVML, NVRTC, CUPTI (profiling and runtime compilation)
  - Nsight Compute 2025.3.1.4 (profiling)
  - Nsight Systems 2025.3.2.474 (system profiling)
  - GPUDirect Storage 1.15.1.6
Current Limitations:
- CUDA tools not in PATH (nvcc not accessible by default)
- No ML frameworks installed (PyTorch, TensorFlow, JAX, etc.)
- No scientific computing libraries (NumPy, SciPy, Pandas, CuPy)
- LD_LIBRARY_PATH not configured for CUDA
This is a high-performance ARM-based system with cutting-edge NVIDIA GPU support and CUDA 13.0, ideal for GPU-accelerated
computing, ML/AI workloads, or CUDA development once the appropriate frameworks are installed.
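The two PATH-related limitations listed above can be worked around per process, without editing shell profiles. What follows is a minimal sketch, not something the session ran. It assumes the toolkit sits behind the /usr/local/cuda symlink shown earlier and follows the conventional bin/ and lib64/ layout, which the session did not verify. The helper name `cuda_env` is hypothetical.

```python
import os


def cuda_env(cuda_home="/usr/local/cuda", base_env=None):
    """Return a copy of the environment with CUDA's bin and lib64 dirs prepended.

    cuda_home defaults to the symlink seen in the session; the bin/ and
    lib64/ subdirectories are an assumed (conventional) toolkit layout.
    """
    env = dict(os.environ if base_env is None else base_env)
    additions = {
        "PATH": os.path.join(cuda_home, "bin"),
        "LD_LIBRARY_PATH": os.path.join(cuda_home, "lib64"),
    }
    for var, new_dir in additions.items():
        # Drop empty entries so an unset variable does not leave a stray separator.
        parts = [p for p in env.get(var, "").split(os.pathsep) if p]
        if new_dir not in parts:
            parts.insert(0, new_dir)
        env[var] = os.pathsep.join(parts)
    return env
```

The returned mapping can be passed as the `env` argument of `subprocess.run`, for example when invoking `nvcc --version`, so the change stays local to that child process.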
> create a python virtual environment and then try to install torch and torch.cuda to say that the GPU is visible
● I'll create a Python virtual environment and install PyTorch with CUDA support, then verify the GPU is visible.
● Bash(python3 -m venv ~/pytorch_env)
⎿  Error: The virtual environment was not created successfully because ensurepip is not
available. On Debian/Ubuntu systems, you need to install the python3-venv
package using the following command.
apt install python3.12-venv
You may need to use sudo with that command. After installing the python3-venv
package, recreate your virtual environment.
Failing command: /home/dev/pytorch_env/bin/python3
● Bash(sudo apt install -y python3.12-venv) timeout: 3m 0s
⎿  Reading package lists...
Building dependency tree...
Reading state information...
… +32 lines (ctrl+o to expand)
● Bash(python3 -m venv ~/pytorch_env)
⎿  (No content)
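The first attempt failed because `ensurepip` was missing, and the second succeeded once python3.12-venv was installed. The same check can be made up front from Python. This is a sketch under the assumption that an importable `ensurepip` is the only precondition that matters. The function names are hypothetical.

```python
import importlib.util
import venv


def can_bootstrap_pip():
    """True if the ensurepip module that venv uses to install pip is importable."""
    return importlib.util.find_spec("ensurepip") is not None


def create_env(path):
    """Create a virtual environment with pip, failing early with a clear hint."""
    if not can_bootstrap_pip():
        raise RuntimeError(
            "ensurepip is missing; on Debian/Ubuntu install the matching "
            "python3.X-venv package (python3.12-venv in this session) first"
        )
    venv.create(path, with_pip=True)
```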
● Bash(source ~/pytorch_env/bin/activate && pip install --upgrade pip) timeout: 3m 0s
⎿  Requirement already satisfied: pip in ./pytorch_env/lib/python3.12/site-packages (24.0)
Collecting pip
Downloading pip-25.2-py3-none-any.whl.metadata (4.7 kB)
… +8 lines (ctrl+o to expand)
● Bash(source ~/pytorch_env/bin/activate && pip install torch torchvision torchaudio) timeout: 10m 0s
⎿  Collecting torch
Downloading torch-2.8.0-cp312-cp312-manylinux_2_28_aarch64.whl.metadata (30 kB)
Collecting torchvision
… +54 lines (ctrl+o to expand)
● Bash(source ~/pytorch_env/bin/activate && python3 -c "
import torch…)
⎿  ============================================================
PyTorch Installation and CUDA Information
============================================================
… +32 lines (ctrl+o to expand)
● The installed PyTorch version (2.8.0+cpu) is CPU-only and doesn't have CUDA support. Let me install the CUDA-enabled version for
ARM64.
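The `+cpu` suffix is what gives the build away here. Below is a small sketch of reading that suffix, assuming the `+cpu` / `+cuNNN` local-tag convention seen in this session (`2.8.0+cpu`, `2.7.1+cu128`). The helper `build_flavor` is hypothetical. Some builds report no tag at all, such as the plain `2.5.1` later in the session, so `torch.version.cuda` (which is `None` on CPU-only builds) is the more reliable check once torch is importable.

```python
import re


def build_flavor(version):
    """Classify a torch version string by its local tag (the part after '+')."""
    _, _, local = version.partition("+")
    if not local:
        return "unknown"  # no tag present, so the string alone cannot tell
    if local == "cpu":
        return "cpu"
    # Assumed convention: 'cu' + major digits + one minor digit, e.g. cu128.
    match = re.fullmatch(r"cu(\d+)(\d)", local)
    if match:
        return f"cuda {match.group(1)}.{match.group(2)}"
    return "unknown"
```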
● Bash(source ~/pytorch_env/bin/activate && pip uninstall -y torch torchvision torchaudio)
⎿  Found existing installation: torch 2.8.0
Uninstalling torch-2.8.0:
Successfully uninstalled torch-2.8.0
… +6 lines (ctrl+o to expand)
● Bash(source ~/pytorch_env/bin/activate && pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124) timeout: 10m 0s
⎿  Looking in indexes: https://download.pytorch.org/whl/cu124
Collecting torch
Downloading https://download.pytorch.org/whl/cu124/torch-2.5.1-cp312-cp312-linux_aarch64.whl (2359.8 MB)
… +28 lines (ctrl+o to expand)
● Bash(source ~/pytorch_env/bin/activate && python3 -c "
import torch…)
⎿  Error: /home/dev/pytorch_env/lib/python3.12/site-packages/torch/cuda/__init__.py:235: UserWarning:
NVIDIA GB10 with CUDA capability sm_121 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_50 sm_80 sm_86 sm_89 sm_90 sm_90a.
If you want to use the NVIDIA GB10 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/
warnings.warn(
Traceback (most recent call last):
File "<string>", line 27, in <module>
File "/home/dev/pytorch_env/lib/python3.12/site-packages/torch/_tensor.py", line 1053, in __format__
return object.__format__(self, format_spec)
… +44 lines (ctrl+o to see all)
● Summary
Good news: The GPU is visible to PyTorch!
PyTorch Version: 2.5.1
CUDA Available: True
CUDA Version (compiled): 12.4
cuDNN Version: 90100
Number of CUDA Devices: 1
Device 0: NVIDIA GB10
- Device Capability: (12, 1)
- Total Memory: 119.68 GB
- Multi-processor Count: 48
Issue: Your NVIDIA GB10 (Blackwell architecture) has compute capability sm_121, which is very new. The current PyTorch 2.5.1 build
was compiled with support for older architectures (sm_50, sm_80, sm_86, sm_89, sm_90, sm_90a) but not sm_121 yet.
This means:
- ✓ PyTorch can detect and see the GPU
- ✗ PyTorch cannot execute operations on it (no kernel image available)
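The gap between "visible" and "usable" comes down to whether the device's architecture appears in the list the build was compiled for. Here is a sketch of that comparison using the values from the warning above. Exact matching is a simplifying assumption that ignores any forward compatibility the CUDA runtime may offer. The helper names are hypothetical.

```python
def capability_tag(capability):
    """Turn a (major, minor) tuple such as (12, 1) into the 'sm_121' form."""
    major, minor = capability
    return f"sm_{major}{minor}"


def has_native_kernels(capability, arch_list):
    """True if the build lists exactly this architecture.

    Exact matching is a simplification: it ignores any forward
    compatibility the CUDA runtime may provide for newer devices.
    """
    return capability_tag(capability) in arch_list


# Values copied from the warning in the session above.
PYTORCH_251_ARCHS = ["sm_50", "sm_80", "sm_86", "sm_89", "sm_90", "sm_90a"]
print(has_native_kernels((12, 1), PYTORCH_251_ARCHS))  # prints False
```

With torch installed, the two inputs can be read from `torch.cuda.get_device_capability(0)` and `torch.cuda.get_arch_list()` instead of being hard-coded.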
Solutions:
1. Try PyTorch nightly build (may have sm_121 support):
   source ~/pytorch_env/bin/activate
   pip uninstall -y torch torchvision torchaudio
   pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu124
2. Wait for official PyTorch release with sm_121 support
3. Build PyTorch from source with sm_121 support (more complex)
The environment is otherwise perfectly set up for CUDA development - you have CUDA 13.0 installed, 119 GB of GPU memory, and the
driver is working correctly.
> OK you try a pytorch nightly then
● Bash(source ~/pytorch_env/bin/activate && pip uninstall -y torch torchvision torchaudio)
⎿  Found existing installation: torch 2.5.1
Uninstalling torch-2.5.1:
Successfully uninstalled torch-2.5.1
… +6 lines (ctrl+o to expand)
● Bash(source ~/pytorch_env/bin/activate && pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu124) timeout: 10m 0s
⎿  Error: ERROR: Could not find a version that satisfies the requirement torch (from versions: none)
ERROR: No matching distribution found for torch
Looking in indexes: https://download.pytorch.org/whl/nightly/cu124
● Bash(source ~/pytorch_env/bin/activate && pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121) timeout: 10m 0s
⎿  Error: ERROR: Could not find a version that satisfies the requirement torch (from versions: none)
ERROR: No matching distribution found for torch
Looking in indexes: https://download.pytorch.org/whl/nightly/cu121
● Bash(source ~/pytorch_env/bin/activate && pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu) timeout: 10m 0s
⎿ Looking in indexes: https://download.pytorch.org/whl/nightly/cpu
Collecting torch
Downloading https://download.pytorch.org/whl/nightly/cpu/torch-2.10.0.dev20251013%2Bcpu-cp312-cp312-manylinux_2_28_aarch64.whl
… +37 lines (ctrl+o to expand)
● Bash(source ~/pytorch_env/bin/activate && pip uninstall -y torch torchvision torchaudio)
⎿  Found existing installation: torch 2.10.0.dev20251013+cpu
Uninstalling torch-2.10.0.dev20251013+cpu:
Successfully uninstalled torch-2.10.0.dev20251013+cpu
… +6 lines (ctrl+o to expand)
● Web Search("PyTorch ARM64 CUDA sm_121 Blackwell GB10 support 2025")
⎿  Did 6 searches in 70s
● Bash(pip search nvidia-pyindex 2>&1 || echo "pip search is disabled")
⎿  /bin/bash: line 1: pip: command not found
pip search is disabled
● Fetch(https://pytorch.org/blog/pytorch-2-7/)
⎿  Received 347.8KB (200 OK)
● Bash(source ~/pytorch_env/bin/activate && pip install torch==2.7.0 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124) timeout: 10m 0s
⎿  Error: ERROR: Could not find a version that satisfies the requirement torch==2.7.0 (from versions: 2.4.0, 2.4.1, 2.5.0, 2.5.1)
ERROR: No matching distribution found for torch==2.7.0
Looking in indexes: https://download.pytorch.org/whl/cu124
● Bash(source ~/pytorch_env/bin/activate && pip index versions torch 2>&1 | head -30)
⎿  torch (2.8.0)
Available versions: 2.8.0, 2.7.1, 2.7.0, 2.6.0, 2.5.1, 2.5.0, 2.4.1, 2.4.0, 2.3.1, 2.3.0, 2.2.2, 2.2.1, 2.2.0
● Bash(source ~/pytorch_env/bin/activate && pip install torch==2.8.0 torchvision torchaudio) timeout: 10m 0s
⎿  Collecting torch==2.8.0
Using cached torch-2.8.0-cp312-cp312-manylinux_2_28_aarch64.whl.metadata (30 kB)
Collecting torchvision
… +22 lines (ctrl+o to expand)
● Bash(source ~/pytorch_env/bin/activate && pip uninstall -y torch torchvision torchaudio && pip install torch torchvision torchaudio --index-url https://download.pyto…) timeout: 10m 0s
⎿  Found existing installation: torch 2.8.0
Uninstalling torch-2.8.0:
Successfully uninstalled torch-2.8.0
… +40 lines (ctrl+o to expand)
● Perfect! I've installed PyTorch 2.7.1 with CUDA 12.8 support and Triton 3.3.1 (which has Blackwell support according to the release
notes). Let me test if this works with your GB10 GPU:
● Bash(source ~/pytorch_env/bin/activate && python3 -c "
import torch…)
⎿  ============================================================
PyTorch Installation and CUDA Information
============================================================
… +38 lines (ctrl+o to expand)
● Success! 🎉
PyTorch 2.7.1 with CUDA 12.8 works perfectly with your NVIDIA GB10 GPU!
Results:
PyTorch Version: 2.7.1+cu128
CUDA Available: True
CUDA Version (compiled): 12.8
cuDNN Version: 90800
Device 0: NVIDIA GB10
- Device Capability: (12, 1)
- Total Memory: 119.68 GB
- Multi-processor Count: 48
✓ GPU is visible and working correctly!
All GPU operations tested successfully:
- ✓ Tensor creation and transfer to GPU
- ✓ Basic arithmetic operations on GPU
- ✓ Matrix multiplication on GPU
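The script the session actually ran is truncated in this transcript, so the following is an independent minimal sketch of the three checks listed above, not a reconstruction of it. The function name `gpu_smoke_test` and the 256x256 tensor size are arbitrary choices. It reports "skipped" rather than raising when torch or CUDA is absent.

```python
def gpu_smoke_test():
    """Run the three checks listed above; return a short status string."""
    try:
        import torch
    except ImportError:
        return "skipped: torch not installed"
    if not torch.cuda.is_available():
        return "skipped: CUDA not available"

    try:
        device = torch.device("cuda:0")
        a = torch.ones(256, 256).to(device)             # create on CPU, then transfer
        b = torch.full((256, 256), 2.0, device=device)  # create directly on the GPU
        total = (a + b).sum().item()                    # arithmetic: every element is 3.0
        product = (a @ b)[0, 0].item()                  # matmul: each entry is 256 * 1 * 2
    except RuntimeError as exc:
        # A build without kernels for this GPU is expected to surface here.
        return f"failed: {exc}"

    ok = total == 3.0 * 256 * 256 and product == 512.0
    return "passed" if ok else "failed: unexpected values"


print(gpu_smoke_test())
```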
What Made It Work:
PyTorch 2.7.1 with CUDA 12.8 includes:
- Support for Blackwell architecture (sm_121)
- Triton 3.3.1 with Blackwell compatibility
- Pre-built ARM64 wheels with full CUDA support
Virtual Environment Location:
source ~/pytorch_env/bin/activate
You now have a fully functional PyTorch environment that can leverage your GB10's 119.68 GB of GPU memory and 48 multi-processors!
> Write detailed notes on how we got here
● Write(pytorch_cuda_setup_notes.md)
⎿  Wrote 380 lines to pytorch_cuda_setup_notes.md
# PyTorch CUDA Setup on NVIDIA GB10 (Blackwell) - Detailed Notes
**Date:** October 13, 2025
**System:** ARM64 (aarch64) Linux with NVIDIA GB10 Grace Blackwell
**Goal:** Install PyTorch with CUDA support to utilize the NVIDIA GB10 GPU
---
## System Environment
… +370 lines (ctrl+o to expand)
● I've created comprehensive documentation in /home/dev/pytorch_cuda_setup_notes.md that covers:
Contents
System Environment
- Complete hardware specs (ARM64, 20-core CPU, 119GB RAM)
- NVIDIA GB10 GPU details (sm_121, Blackwell architecture, 119GB VRAM)
- CUDA 13.0.1 toolkit components and libraries
Challenges Encountered
1. Python venv setup (missing python3.12-venv package)
2. CPU-only PyTorch from default PyPI
3. Blackwell sm_121 compatibility issues with PyTorch 2.5.1
4. No ARM64 CUDA nightly builds available
The Solution
- Why PyTorch 2.7.1 with CUDA 12.8 works
- Step-by-step installation process
- Package versions and sizes
Verification
- Complete test script
- Results showing successful GPU operations
- All operations verified (tensor ops, matrix multiplication)
Key Takeaways
- Version compatibility matrix
- What worked vs what didn't
- CUDA version compatibility notes
Reference Section
- Quick command reference
- Troubleshooting guide
- Requirements.txt for reproduction
- Links to official documentation
The document captures the entire journey from initial environment discovery through troubleshooting to the working solution, with
enough detail that someone could reproduce this setup or understand why specific versions were needed.
simonw commented Oct 14, 2025

Installed npm, then:

npm install -g @anthropic-ai/claude-code
claude --dangerously-skip-permissions
