ds-hwang/gist:0c35e1f6bf7da4804ac1e91b21275cbd

## gistfile1.md

      
Env

test notebook
Intel Xeon Skylake (36 cores, 72 threads), >$7k (note: the linked part has half as many cores)
Nvidia Quadro P1000, $300

Model

densenet121: 8M parameters
feature image: 224x224x3, train epoch: 352, eval epoch: 40

Result

Intel takes 975.458 secs for one epoch of training plus one epoch of validation.
Nvidia takes 382.901 secs for one epoch of training plus one epoch of validation.
Nvidia is 2.55x faster even though it is >10x cheaper (>$7k vs. $300).
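
The ratios quoted above can be reproduced with a few lines of arithmetic. This is a minimal sketch using only the timings and prices listed in this note. The variable names are made up for illustration, the Xeon price is treated as its stated lower bound of $7k, and the "speedup per dollar" figure is an extra illustrative metric, not something the original measurement reports.

```python
# Timings copied from the Result section (seconds for one training epoch
# plus one validation epoch).
intel_secs = 975.458    # Xeon Skylake
nvidia_secs = 382.901   # Quadro P1000

# Prices copied from the Env section. The Xeon price is only a lower
# bound (">$7k"), so every ratio derived from it is a lower bound too.
intel_price_usd = 7000
nvidia_price_usd = 300

# How many times faster the GPU run finished.
speedup = intel_secs / nvidia_secs

# How many times cheaper the GPU is.
price_ratio = intel_price_usd / nvidia_price_usd

# Assumption: combine both into a single "speedup per dollar" ratio.
speedup_per_dollar_ratio = speedup * price_ratio

print(f"speedup: {speedup:.2f}x")
print(f"price ratio: at least {price_ratio:.1f}x")
print(f"speedup per dollar: at least {speedup_per_dollar_ratio:.0f}x")
```

Because the price comparison ignores the rest of the machine (the GPU still needs a host CPU, as the note below points out), the per-dollar number should be read as a rough upper-level comparison rather than a total cost of ownership figure.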

Note: when the Nvidia GPU is used, about 30 CPU threads run at 27% utilization to distribute and aggregate(?) tasks.
Note: according to a CUDA benchmark, the GeForce GTX 1080 delivers about 3x the FLOPS of the Quadro P1000 :(