- Follow these directions, completing everything before "Set up Python environment"
- Choose Ubuntu 18.04
- Follow the sections of these directions listed below
- Do "Setting up CUDA Toolkit"
- In "Running CUDA Applications", try to `cd /usr/local/cuda/samples/0_Simple/matrixMulCUBLAS/`, `make`, and `./matrixMulCUBLAS`
- Do "Setting up to Run Containers"
- Follow 1-4 in these directions
- Inside WSL, do `docker pull nvcr.io/nvidia/tensorflow:20.11-tf1-py3`
- Run this container with

      docker run --gpus all --rm -it -v /mnt/d/Documents/:/projects nvcr.io/nvidia/tensorflow:20.11-tf1-py3

  substituting `/mnt/d/Documents/` with your own documents folder - other containers are available here
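The `-v /mnt/d/Documents/:/projects` mount uses the WSL view of a Windows drive. Inside WSL the built-in `wslpath` tool does this mapping properly; as a rough sketch of the convention, `win_to_wsl` below is a hypothetical helper that assumes the default `/mnt` drive mounts and GNU sed:

```shell
# Hypothetical helper: map a Windows path to its default WSL mount path.
# Assumes drives are mounted under /mnt and GNU sed (\L) is available;
# inside WSL itself, prefer the built-in `wslpath` tool.
win_to_wsl() {
  printf '%s\n' "$1" | sed -E 's|^([A-Za-z]):|/mnt/\L\1|; s|\\|/|g'
}

win_to_wsl 'D:\Documents'   # prints /mnt/d/Documents
```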
- Inside the running container, fetch and run the benchmarks:

      git clone https://github.com/aime-team/tf1-benchmarks.git
      cd tf1-benchmarks
      python ./scripts/tf_cnn_benchmarks/tf_cnn_benchmarks.py --model resnet50 --num_gpus=1 --batch_size=1
- Increase the batch size slowly. An out-of-memory error in this benchmark leaves the process hung, so it needs to be killed with `kill -9` from another terminal (`docker container list`, `docker exec -it CONTAINER_ID bash`, `ps -a`, and `kill -9 PID`).
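The kill workflow above can be rehearsed outside docker by using a long-running `sleep` as a stand-in for the hung benchmark process:

```shell
# Stand-in for a hung benchmark: start a long-running process, confirm
# it exists, then force-kill it, mirroring the ps/kill steps above.
sleep 300 &
PID=$!
kill -0 "$PID" && echo "process $PID is running"   # -0 only checks existence
kill -9 "$PID"
wait "$PID" 2>/dev/null   # reap the child; status 137 means killed by SIGKILL
echo "killed $PID"
```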
On a GeForce GTX 960 4GB, the maximum batch size is 3, which gets 20.48 images/sec. This is roughly 10x faster than the same benchmark running on the DirectML TensorFlow build.
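As a quick sanity check on those figures (plain awk arithmetic, nothing benchmark-specific; the DirectML number is just the stated 10x ratio worked backwards):

```shell
# Back-of-envelope arithmetic for the figures above.
awk 'BEGIN {
  ips   = 20.48   # measured images/sec at batch_size=3
  batch = 3
  printf "seconds per step:  %.3f\n", batch / ips   # time for one batch of 3
  printf "directml estimate: %.2f images/sec\n", ips / 10
}'
# prints:
# seconds per step:  0.146
# directml estimate: 2.05 images/sec
```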