""" I was writing a dataloader from a video stream. I ran some numbers. | |
# in a nutshell. | |
-> np.transpose() or torch.permute() is faster as uint8, no difference between torch and numpy | |
-> np.uint8/number results in np.float64, never do it, if anything cast as np.float32 | |
-> convert to pytorch before converting uint8 to float32 | |
-> contiguous() is is faster in torch than numpy | |
-> contiguous() is faster for torch.float32 than for torch.uint8 | |
-> convert to CUDA in the numpy to pytorch conversion, if you can. | |
-> in CPU tensor/my_float is > 130% more costly than tensor.div_(myfloat), however tensor.div_() | |
does not keep track of gradients, so be careful using it. |
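A minimal sketch of the pipeline these notes suggest, assuming an HWC uint8 frame as input; the frame shape, variable names, and device handling are illustrative, not from the original benchmarks:

import numpy as np
import torch

# hypothetical frame: one 1080p RGB video frame in HWC uint8 layout
frame = np.random.randint(0, 256, (1080, 1920, 3), dtype=np.uint8)

# the pitfall: dividing uint8 by a Python float silently promotes to float64
bad = frame / 255.0                      # dtype float64, twice the memory traffic
ok = frame.astype(np.float32) / 255.0    # if dividing in numpy, at least stay in float32

# the order the notes recommend: stay uint8 as long as possible,
# convert to torch (and CUDA) first, then go float32 and normalize in place
t = torch.from_numpy(frame)              # zero-copy: shares memory with `frame`
if torch.cuda.is_available():
    t = t.to("cuda")                     # transfer while still uint8 (4x fewer bytes than float32)
t = t.permute(2, 0, 1)                   # HWC -> CHW; a cheap view, still uint8
t = t.float()                            # uint8 -> float32, keeps the permuted strides
t = t.contiguous()                       # materialize the layout, done as float32
t.div_(255.0)                            # in-place divide; does not track gradients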
"""testing vram in pytorch cuda | |
every time a variable is put inside a container in python, to remove it completely | |
one needs to delete variable and container, | |
this can be problematic when using pytorch cuda if one doesnt clear all containers | |
Three tests: | |
>>> python memory_tests list | |
# creates 2 tensors puts them in a list, modifies them in place, deletes them | |
# in place mod changes original tensors | |
# list and both tensors need to be deleted |
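A minimal sketch of the `list` test described above, assuming a CUDA device is available; the tensor sizes and names are illustrative:

import torch

def allocated_mb():
    # bytes currently held by live tensors on the default CUDA device, in MB
    return torch.cuda.memory_allocated() / 1024 ** 2

a = torch.ones(1024, 1024, device="cuda")
b = torch.ones(1024, 1024, device="cuda")
tensors = [a, b]

tensors[0].add_(1.0)            # in-place modification through the container
print(a[0, 0].item())           # 2.0: the original tensor was changed too

del a, b
print(allocated_mb())           # unchanged: the list still references both tensors

del tensors
print(allocated_mb())           # ~0: with the last references gone, the memory is freed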