Note: all these tests use the GPU. "loader" does not execute any neural-net operations; it only loads the images onto the GPU, then passes on to the next batch.
The weights are guesstimates from Carl.
loader
- weight: 2 (not realistic)
- dataloader only, on fakeimagenet
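To make the "loading only" behaviour concrete, here is a minimal sketch of such a loop, assuming PyTorch; the names and the in-memory stand-in for fakeimagenet are illustrative, not the benchmark's actual code:

```python
# Hypothetical sketch of a dataloader-only benchmark loop: each batch is
# copied to the device and then discarded, so only data loading and the
# host-to-GPU transfer are exercised (no forward/backward pass).
import torch
from torch.utils.data import DataLoader, TensorDataset

def run_loader_only(loader: DataLoader, device: str = "cuda") -> int:
    batches = 0
    for images, _labels in loader:
        images = images.to(device, non_blocking=True)  # move batch to GPU
        batches += 1  # then immediately pass on to the next batch
    return batches

if __name__ == "__main__":
    # Tiny fake dataset standing in for fakeimagenet: 64 images, batch size 16.
    data = TensorDataset(torch.randn(64, 3, 8, 8), torch.zeros(64))
    loader = DataLoader(data, batch_size=16)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(run_loader_only(loader, device))  # prints 4
```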
translator
- weight: 8
- language translation
wlm
- weight: 8
- next word (or character) prediction
wlmfp16
- weight: 8
- same as "wlm" using half precision floats
- potential issue: warnings seem to say that FP16 is disabled; see "FP16 warnings" below. However, this test scores about 5000, while "wlm" scores about 1500, so FP16 probably works anyway.
cart
- weight: 8
- RL toy: balance a pole in 2D
minigrid
- weight: 4 (very niche)
- Mila research platform: BabyAI
- RL + language understanding
- test RL environment
atari
- weight: 8
- RL: playing the game Pong
vae
- weight: 8
- image generation
reso
- weight: 8
- image super-resolution
ssd
- weight: 8
- object detection in images
- potential issue: downloads files from the internet: https://download.pytorch.org/models/resnet34-333f7ec4.pth and https://download.pytorch.org/models/vgg16-397923af.pth. The docker image should be modified to include these two files in /root/.cache/torch/checkpoints; this will avoid the download.
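One way to bake the two checkpoints into the image (an untested sketch; the exact placement depends on the Dockerfile's existing layout) is a build step like:

```shell
# Hypothetical Dockerfile fragment: pre-fetch the two pretrained
# checkpoints at build time so the ssd test never hits the network.
RUN mkdir -p /root/.cache/torch/checkpoints && \
    wget -P /root/.cache/torch/checkpoints \
        https://download.pytorch.org/models/resnet34-333f7ec4.pth \
        https://download.pytorch.org/models/vgg16-397923af.pth
```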
fast_style
- weight: 8
- image stylization (filters)
dcgan
- weight: 8
- Deep Convolutional Generative Adversarial Network
- image generation
convnet
- weight: 8
- image classification
convnet_fp16
- weight: 8
- same as "convnet" but using half precision floats
- potential issue: warnings seem to say that FP16 is disabled; see "FP16 warnings" below. However, this test scores about 278-333, while "convnet" scores about 210, so FP16 probably works anyway.
scaling
- weight: 8
- image classification
- multi GPU testing
toy_reg
- weight: 1 (toy)
- polynomial fitting
toy_lstm
- weight: 1 (toy)
- sine function fitting
recom
- weight: 8
- movie recommendation using collaborative filtering (not based on language understanding)
FP16 warnings

Warning: multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback. Original ImportError was: ModuleNotFoundError("No module named 'amp_C'")

Attempting to unscale a grad with type torch.cuda.HalfTensor Unscaling non-fp32 grads may indicate an error. When using Amp, you don't need to call .half() on your model.

Warning: FP16_Optimizer is deprecated and dangerous, and will be deleted soon. If it still works, you're probably getting lucky. For mixed precision, use the documented API https://nvidia.github.io/apex/amp.html, with opt_level=O1.
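The last warning points at the Amp API as the replacement for FP16_Optimizer. A minimal sketch of that migration, assuming apex is installed with its CUDA extensions and a GPU is available (the model and optimizer here are illustrative placeholders, not the tests' actual networks):

```python
# Sketch of the apex Amp API recommended by the warning above.
# Requires NVIDIA apex and a CUDA device; not runnable CPU-only.
import torch
from apex import amp

model = torch.nn.Linear(10, 10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# opt_level="O1": mixed precision with automatic casting; note that no
# manual .half() call on the model is needed, per the warning above.
model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

loss = model(torch.randn(4, 10).cuda()).sum()
with amp.scale_loss(loss, optimizer) as scaled_loss:
    scaled_loss.backward()  # scaled backward pass replaces loss.backward()
optimizer.step()
```

If the tests were ported to this API, the "fused unscale kernel is unavailable" fallback would also go away once apex is built with --cuda_ext --cpp_ext.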