Prom is a 36-node, DGX-based Slurm cluster. There are three main partitions:

  • main/batch: max 4 nodes per user
  • bigjob: max 16 nodes per user
  • backfill: no limits, but jobs run at lower priority and are pre-emptible
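As a quick sketch of how these limits come into play (the exact partition names and job.sh are assumptions, not taken from the source), the partition is selected at submission time with sbatch -p:

```bash
sbatch -p batch    -N 4  job.sh   # main/batch: up to 4 nodes per user
sbatch -p bigjob   -N 16 job.sh   # bigjob: up to 16 nodes per user
sbatch -p backfill -N 32 job.sh   # backfill: no node limit, lower priority, pre-emptible
```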

Below are two scripts, dask-scheduler.script and dask-cuda-worker.script. For the interactive workflows, I think we should do the following:

  1. Allocate a node for interactive use: salloc -N1 bash -- this allocates a node we can ssh into (the client)
  2. Start the scheduler and a set of dask-cuda workers: sbatch dask-scheduler.script -- the scheduler runs on the main/batch partition
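The two scripts themselves are not reproduced in this excerpt. Here are minimal sketches of what they might contain, assuming a shared filesystem for the scheduler file and the standard dask-scheduler and dask-cuda-worker CLIs; the partition names, node counts, and scheduler-file path are placeholders:

```bash
#!/usr/bin/env bash
# dask-scheduler.script -- minimal sketch (assumed contents)
#SBATCH -p batch              # scheduler on the main/batch partition
#SBATCH -N 1

# Publish connection info to a scheduler file on the shared filesystem
dask-scheduler --scheduler-file "$HOME/dask-scheduler.json"
```

```bash
#!/usr/bin/env bash
# dask-cuda-worker.script -- minimal sketch (assumed contents)
#SBATCH -p batch              # placeholder; bigjob/backfill for larger runs
#SBATCH -N 4                  # main/batch allows at most 4 nodes per user

# One task per node; dask-cuda-worker starts one worker per GPU on each node
srun --ntasks-per-node=1 dask-cuda-worker --scheduler-file "$HOME/dask-scheduler.json"
```

Once both jobs are running, a client on the interactively allocated node can connect through the same scheduler file, e.g. dask.distributed.Client(scheduler_file=...).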
pentschev / pynvml_query_memory.py
Last active April 28, 2020 08:22
Query used GPU memory with pynvml
import datetime
import getopt
import os
import sys
import time
import pynvml

def get_printable_util_mem(dev_count, peak_mem):
    # Assumed body (the gist is truncated here): per-GPU used memory with a
    # running peak; requires pynvml.nvmlInit() to have been called first
    lines = []
    for i in range(dev_count):
        used = pynvml.nvmlDeviceGetMemoryInfo(pynvml.nvmlDeviceGetHandleByIndex(i)).used
        peak_mem[i] = max(peak_mem[i], used)
        lines.append("GPU %d: %d MiB used, peak %d MiB" % (i, used >> 20, peak_mem[i] >> 20))
    return "\n".join(lines)
jrhemstad / ninja_instructions.md
Last active May 27, 2024 07:34
How to build with Ninja

How to Use Ninja

  1. Install Ninja:

sudo apt install ninja-build

  2. Configure CMake to create Ninja build files:

mkdir build && cd build
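The excerpt ends mid-step; a conventional way to finish configuring and building (assumed here, not quoted from the gist) is:

```bash
cmake .. -GNinja   # generate build.ninja instead of Makefiles
ninja              # build using the generated Ninja files
```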