Table of contents

  • GPU
  • CPU
  • Storage
  • General Training
  • Loss
  • Dataset/Dataloader
  • Misc

GPU

CPU

  • If you are running multiple experiments but have a limited number of cores, use taskset --cpu-list <starting_thread>-<ending_thread> python <your_code>.py. This makes sure that run uses only the allotted threads, from <starting_thread> to <ending_thread>, and prevents the constant reallocation of CPU threads that happens when runs fight over them (a concrete invocation is shown after this list). Note that this helps only if everyone on the server respects the core allotment.
  • A higher num_workers doesn't automatically lead to a faster data loader. In fact, in most cases a higher num_workers will make the data loader slower. As far as I know there is no rule of thumb, but there is a sweet spot, and it is mostly found through trial and error (a timing sweep like the sketch after this list works well).
  • Useful utilities/commands:
    • htop
    • glances
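
As a concrete instance of the taskset tip above (the core range 0-7 and the script name train.py are placeholders, not values from the original note):

    taskset --cpu-list 0-7 python train.py

And a minimal sketch of the num_workers trial-and-error sweep, assuming a PyTorch DataLoader and a small dummy in-memory dataset standing in for your real one:

    import time
    import torch
    from torch.utils.data import DataLoader, TensorDataset

    def time_one_pass(num_workers: int) -> float:
        # Dummy dataset standing in for your real one; shapes are arbitrary.
        dataset = TensorDataset(torch.randn(2_000, 3, 64, 64),
                                torch.randint(0, 10, (2_000,)))
        loader = DataLoader(dataset, batch_size=64,
                            num_workers=num_workers, pin_memory=True)
        start = time.time()
        for _ in loader:  # one full pass, discarding the batches
            pass
        return time.time() - start

    if __name__ == "__main__":
        # Try a few worker counts and keep the fastest one for your real runs.
        for n in (0, 2, 4, 8, 16):
            print(f"num_workers={n}: {time_one_pass(n):.2f}s")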

Storage

  • Make sure you are running everything (reading data, writing logs, training) on an SSD. An HDD causes I/O bottlenecks which are hard to get over even if you sell your soul to Satan.
    • Check with lsblk -o NAME,MOUNTPOINT,MODEL,ROTA,SIZE. ROTA == 0 means the drive is non-rotational, i.e. an SSD.
  • Instead of loading your data from an SSD or HDD, you can stage it directly in RAM. /dev/shm/ is a tmpfs directory backed by RAM. First check that you have enough free RAM, then copy the entire dataset to /dev/shm/ and make your dataloader load from there (see the sketch after this list).
  • Useful utilities/commands:
    • ncdu
    • df -h
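
A minimal sketch of the /dev/shm tip above; the paths and the ImageFolder layout are assumptions, not part of the original note:

    import shutil
    import torchvision

    # Stage the dataset in the RAM-backed tmpfs once (check free RAM first, e.g. with free -h).
    shutil.copytree("/data/my_dataset", "/dev/shm/my_dataset", dirs_exist_ok=True)

    # Point the dataset/dataloader at the RAM copy instead of the copy on disk.
    dataset = torchvision.datasets.ImageFolder(
        "/dev/shm/my_dataset/train",
        transform=torchvision.transforms.ToTensor(),
    )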

General Training

Loss

Dataset/Dataloader

Misc
