AHEADer/benchmark.md

## benchmark.md

      
Benchmark

This is a speed benchmark for distributed training.
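As a rough illustration of what "speed" could mean here, the sketch below measures training throughput in samples per second. The names `measure_throughput` and `dummy_step`, the warm-up and step counts, and the batch size are all assumptions made for this example; the plan itself does not define a metric.

```python
import time


def measure_throughput(step_fn, batch_size, warmup_steps=5, timed_steps=20):
    """Return samples processed per second by step_fn, excluding warm-up."""
    # Warm-up iterations are excluded so one-time costs (graph building,
    # memory allocation, communicator setup) do not skew the result.
    for _ in range(warmup_steps):
        step_fn()

    start = time.perf_counter()
    for _ in range(timed_steps):
        step_fn()
    elapsed = time.perf_counter() - start

    return batch_size * timed_steps / elapsed


def dummy_step():
    # Hypothetical stand-in for one training iteration.
    time.sleep(0.001)


if __name__ == "__main__":
    rate = measure_throughput(dummy_step, batch_size=32)
    print(f"{rate:.1f} samples/sec")
```

On GPUs, kernels are typically launched asynchronously, so a real `step_fn` would need to synchronize the device before the clock is read; otherwise the timer only captures launch overhead.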

Environment

System configuration


Ubuntu xxx
CUDA xxx
NCCL xxx

Framework


Autobot xxx
TensorFlow xxx
PyTorch xxx
MXNet xxx

Profiling tools


cProfile
NVIDIA Nsight Systems
Profiling tools provided by each framework
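A minimal sketch of how cProfile might be applied to a few training iterations is shown below. `train_step` and `profile_steps` are hypothetical names used only for illustration; a real run would call the framework's own training loop instead.

```python
import cProfile
import io
import pstats


def train_step(n=10_000):
    # Hypothetical stand-in for one training iteration.
    return sum(i * i for i in range(n))


def profile_steps(step_fn, num_steps=10, top=5):
    """Profile num_steps calls of step_fn and return the report text."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(num_steps):
        step_fn()
    profiler.disable()

    # Write the report to a string, sorted by cumulative time,
    # limited to the `top` most expensive entries.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_steps(train_step))
```

cProfile only observes Python-level function calls on the host, so it is best suited to finding input-pipeline or framework-overhead hotspots. GPU kernel and communication timelines would have to come from NVIDIA Nsight Systems or the framework profilers.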

Testing models and experiments

Models

* Image classification: ResNet50, VGG16
* Translation: GNMT-16
* Video captioning: S2VT

Experiments

Each experiment below should be run on all four deep learning frameworks.

Different GPU placements (e.g. 4 GPUs on different nodes)
With or without Horovod; our Horovod vs. the official Horovod
RDMA or socket
Different parallel architectures
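The experiment axes above can be enumerated as a run matrix. The sketch below is only one possible way to do that: the option labels (in particular the single-node placement) are placeholders chosen for illustration, the parallel-architecture axis is left out because the plan does not name its variants, and not every combination is necessarily valid for every framework.

```python
from itertools import product

# All option values below are illustrative labels, not settings from the plan.
FRAMEWORKS = ["Autobot", "TensorFlow", "PyTorch", "MXNet"]
GPU_PLACEMENTS = ["4 GPUs on 1 node", "4 GPUs on 4 nodes"]
HOROVOD_VARIANTS = ["none", "ours", "official"]
TRANSPORTS = ["RDMA", "socket"]


def build_experiment_matrix():
    """Return one dict per combination of the experiment axes."""
    keys = ("framework", "placement", "horovod", "transport")
    axes = (FRAMEWORKS, GPU_PLACEMENTS, HOROVOD_VARIANTS, TRANSPORTS)
    return [dict(zip(keys, combo)) for combo in product(*axes)]


if __name__ == "__main__":
    matrix = build_experiment_matrix()
    print(f"{len(matrix)} runs")
    for run in matrix[:3]:
        print(run)
```

Generating the matrix up front makes it easy to see how many runs the plan implies and to filter out combinations that turn out not to apply.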

Benchmark results

TODO