@v-i-s-h
Created April 17, 2020 05:51

TensorFlow performance tips

Notes for performance optimization

Training on a local CPU (when using conda TensorFlow)

Consider setting the following environment variables as optimization flags when training.

  • KMP_BLOCKTIME=0
  • KMP_AFFINITY=granularity=fine,verbose,compact,1,0
  • KMP_SETTINGS=1
  • OMP_NUM_THREADS=8

Also, use the following code after importing tf.
import os
import tensorflow as tf

# Run this before TensorFlow executes any op; the thread pools cannot be resized afterwards.
if 'OMP_NUM_THREADS' in os.environ:
    # Reuse the OpenMP thread count for both TensorFlow thread pools.
    NUM_THREADS = int(os.environ['OMP_NUM_THREADS'])
    tf.config.threading.set_inter_op_parallelism_threads(NUM_THREADS)
    tf.config.threading.set_intra_op_parallelism_threads(NUM_THREADS)
    print(f"Using {NUM_THREADS} threads")
else:
    print("Running with default configuration. Consider setting optimization flags while starting jupyter.")
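
The snippet above converts OMP_NUM_THREADS with a bare int(...), which raises an error if the variable is empty or not a number. As a minimal sketch, the helper below validates the value first. The name `resolve_num_threads` and the fallback to `os.cpu_count()` are assumptions made for illustration; they are not part of TensorFlow or of these notes.

```python
import os


def resolve_num_threads(environ=None, default=None):
    """Return a positive thread count from OMP_NUM_THREADS, or a fallback.

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    environ = os.environ if environ is None else environ
    # Assumed fallback: the logical CPU count, or 1 if it cannot be determined.
    fallback = default if default is not None else (os.cpu_count() or 1)
    raw = environ.get("OMP_NUM_THREADS", "").strip()
    try:
        value = int(raw)
    except ValueError:
        # Unset, empty, or non-numeric value (for example a nested list like "4,2").
        return fallback
    # Reject zero and negative counts.
    return value if value > 0 else fallback
```

The returned value could then be passed to tf.config.threading.set_inter_op_parallelism_threads and tf.config.threading.set_intra_op_parallelism_threads in place of the direct conversion.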

Example:

  • When starting jupyter-lab, use
    KMP_BLOCKTIME=0 KMP_AFFINITY=granularity=fine,verbose,compact,1,0 KMP_SETTINGS=1 OMP_NUM_THREADS=8 jupyter-lab
  • When starting a python script, use
    KMP_BLOCKTIME=0 KMP_AFFINITY=granularity=fine,verbose,compact,1,0 KMP_SETTINGS=1 OMP_NUM_THREADS=8 python <script_to_run.py>
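
The same launch can also be scripted from Python. The sketch below copies the flag values from these notes into a child-process environment and starts a script with it. The names `build_env` and `run_with_flags` are hypothetical, introduced only for this example.

```python
import os
import subprocess
import sys

# Flag values copied from the notes above.
KMP_FLAGS = {
    "KMP_BLOCKTIME": "0",
    "KMP_AFFINITY": "granularity=fine,verbose,compact,1,0",
    "KMP_SETTINGS": "1",
}


def build_env(num_threads=8, base_env=None):
    """Return a copy of the environment with the optimization flags set."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(KMP_FLAGS)
    env["OMP_NUM_THREADS"] = str(num_threads)
    return env


def run_with_flags(script, num_threads=8):
    """Run a Python script in a child process that sees the flags."""
    command = [sys.executable, script]
    return subprocess.run(command, env=build_env(num_threads), check=True)
```

Calling run_with_flags with the path of a training script would be roughly equivalent to the python command line shown above. The flags are applied to the child process only, so the parent interpreter's environment is left unchanged.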