Note
This tutorial is largely based on the following ROCm blog post:
Triton Inference Server with vLLM on AMD GPUs
It includes a few tweaks and fixes to enable running across three MI300X GPUs.