mkmohangb/torch.md

## torch.md

      
Tensors

Construction

There are a few main ways to create a tensor, depending on your use case.

To create a tensor from pre-existing data, use torch.tensor(). It copies the data and constructs a leaf tensor with no autograd history.

Letting t be a tensor, torch.tensor(t) is equivalent to t.clone().detach()


To create a tensor with specific size, use torch.* tensor creation ops (see Creation Ops).
To create a tensor with the same size (and similar types) as another tensor, use torch.*_like tensor creation ops (see Creation Ops).
To create a tensor with similar type but different size as another tensor, use tensor.new_* creation ops.
torch.as_tensor() preserves autograd history and avoids copies where possible. torch.from_numpy() creates a tensor that shares storage with a NumPy array.
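
The copy-versus-share behavior described above can be checked with a small sketch. The variable names are illustrative, and the example assumes a CPU-only setup with just `torch` installed.

```python
import torch

t = torch.tensor([[1.0, 2.0], [3.0, 4.0]], dtype=torch.float64)

copied = t.clone().detach()   # what torch.tensor(t) amounts to: an independent leaf copy
same = torch.as_tensor(t)     # no copy when dtype and device already match

like = torch.zeros_like(t)    # same shape and dtype as t
fresh = t.new_ones(3)         # same dtype and device as t, different shape

t[0, 0] = 100.0               # mutate the original in place

print(copied[0, 0].item())    # the copy keeps the old value
print(same[0, 0].item())      # the shared tensor sees the new value
```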

conversion & size


You can obtain the shape of a tensor in the same way as in numpy (x.shape), or using the .size method.

The size method also accepts a dimension index and returns the size of that dimension only, e.g. a.size(0).


Converting a tensor to NumPy requires the tensor to be on the CPU, not the GPU: np_arr = tensor.cpu().numpy()
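
As a rough illustration of the points above, the following sketch queries a tensor's shape and converts it to NumPy. The `detach()` call is an addition not mentioned in the notes: it is needed when the tensor requires gradients.

```python
import torch

a = torch.zeros(4, 3, requires_grad=True)

print(a.shape)    # torch.Size([4, 3])
print(a.size())   # same information via the method
print(a.size(0))  # size of dimension 0 only

# .numpy() needs a CPU tensor that does not require grad, so detach
# first; .cpu() is a no-op if the tensor already lives on the CPU.
np_arr = a.detach().cpu().numpy()
print(type(np_arr), np_arr.shape)
```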

examples


Passing shape to torch.Tensor
x = torch.Tensor(2, 3)  # values are uninitialized; torch.empty(2, 3) does the same


From Python iterables and NumPy arrays
import numpy as np
a = torch.tensor([0,1,2])
b = torch.tensor(((1.0,1.1), (1.2, 1.3)))
c = torch.tensor(np.ones([2,3]))  # dtype follows the NumPy array (float64)


Tensor constructors
x = torch.ones(5,3)
y = torch.zeros(2)
z = torch.empty(1,1,5)


Constructors for random numbers
a = torch.rand(1,3) # uniform on [0, 1)
b = torch.randn(3,4) # standard normal
c = torch.randint(100, 20000, (1, 4, 40)) # uniform integers in [100, 20000)


Tensors with dimensions equal to another tensor
c = torch.zeros_like(a)
d = torch.rand_like(b)


Generate ranges

```
a = torch.arange(0, 10, step=1)
b = torch.linspace(0, 5, steps=11)
```

Operations

In-place operations are usually marked with an underscore suffix (e.g. add_ instead of add).
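
A minimal sketch of the difference, using `add` and `add_` as the example pair:

```python
import torch

x = torch.ones(3)
y = x.add(1)   # out-of-place: returns a new tensor, x is untouched
x.add_(5)      # in-place: modifies x itself

print(y)
print(x)
```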

View
view returns a new tensor that shares the original tensor's data but has a different shape.

  torch.randn(4, 4).view(16)
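
Because a view shares data with its source, writes through the view show up in the original. The sketch below illustrates this, and also shows `reshape` as a fallback for non-contiguous tensors, which is an addition beyond what the notes state.

```python
import torch

base = torch.zeros(4, 4)
flat = base.view(16)   # same storage, new shape
flat[0] = 1.0          # writes through to `base`

# view() needs a compatible memory layout; a transposed tensor is not
# contiguous, so use reshape() (which copies when required) instead.
t = base.t()
reshaped = t.reshape(16)

print(base[0, 0].item(), t.is_contiguous(), reshaped.shape)
```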


Matrix

@ or torch.matmul or torch.mm - a1 @ a2 + a3
  - torch.mm does not broadcast
  - torch.bmm - batch matrix multiply
  - torch.einsum - Performs matrix multiplications and more (i.e. sums of products) using the Einstein summation convention
torch.dot - torch.dot(b1, b2)
torch.t() or Tensor.T - transpose
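
The operations listed above could be exercised as follows. Shapes and names are illustrative, and the einsum subscript string is one possible way to express the batched product.

```python
import torch

a = torch.randn(2, 3)
b = torch.randn(3, 4)

prod = a @ b                # same as torch.matmul(a, b) or torch.mm(a, b) for 2-D inputs

# torch.matmul broadcasts leading (batch) dimensions; torch.mm does not
batch = torch.randn(5, 2, 3)
out = torch.matmul(batch, b)                     # b is broadcast: result is [5, 2, 4]

# torch.bmm expects two 3-D tensors with the same batch size
out_bmm = torch.bmm(batch, b.expand(5, 3, 4))

# The same batched product written with einsum
out_einsum = torch.einsum("bij,jk->bik", batch, b)

# torch.dot works on 1-D tensors only
v = torch.tensor([1.0, 2.0, 3.0])
d = torch.dot(v, v)         # 1 + 4 + 9

at = a.T                    # transpose of a 2-D tensor: [3, 2]
```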


Indexing

x = torch.arange(0, 10)
x, x[-1], x[1:3], x[:-2]


Flatten & reshape

z = torch.arange(12).reshape(6,2)
z = z.flatten()
z = z.reshape(3,4)


Squeeze & Unsqueeze

x = torch.randn(1, 10)
x = x.squeeze(0) # squeeze 0th dim - [10]
y = torch.randn(5, 5)
y = y.unsqueeze(1) # dim 1 - [5, 1, 5]


Permutation

x = torch.rand(3, 48, 64)
y = x.transpose(0,1).transpose(1,2) # [48, 64, 3]
z = x.permute(1,2,0) # [48, 64, 3]
torch.equal(y, z) # True


Concatenation

x = torch.arange(12, dtype=torch.float32).reshape((3,4))
y = torch.tensor([[2.0, 1, 4, 3],[1,2,3,4],[4,3,2,1]])

cat_rows = torch.cat((x, y), dim=0)
cat_cols = torch.cat((x, y), dim=1)


Conversion between torch & numpy

x = torch.randn(5)
y = x.numpy()
z = torch.tensor(y)
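
Which of these conversions share memory matters in practice. The following sketch, assuming CPU tensors, shows that `x.numpy()` and `torch.from_numpy()` share storage while `torch.tensor()` makes a copy.

```python
import torch

x = torch.zeros(3)
y = x.numpy()            # shares memory with x (CPU tensors only)
z = torch.tensor(y)      # independent copy
w = torch.from_numpy(y)  # shares memory with y, and therefore with x

x.add_(1)                # in-place update of x

print(y)  # follows x
print(z)  # unaffected
print(w)  # follows x
```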


Tensor to Python data type

a = torch.tensor([3.5])
a, a.item(), float(a), int(a), a.tolist()


Number of elements

  torch.rand(3,5).numel() # 15


Casting

from sklearn.datasets import make_moons
X, y = make_moons(256, noise=0.1)
X = torch.tensor(X, dtype=torch.float32)
y = torch.from_numpy(y).type(torch.LongTensor)


Find index of element

a = torch.tensor([1, 2, 3])
torch.where(a == 2)[0]

or
(a == 2).nonzero(as_tuple=True)
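
Both forms return a tuple of index tensors, one per dimension. A short sketch with a repeated value, where `first` is a hypothetical convenience variable for the first match:

```python
import torch

a = torch.tensor([1, 2, 3, 2])
mask = a == 2

idx_where = torch.where(mask)[0]              # 1-D tensor of matching positions
idx_nonzero = mask.nonzero(as_tuple=True)[0]  # same result

# First match as a Python int, or None when nothing matches
first = idx_where[0].item() if idx_where.numel() > 0 else None
print(idx_where, first)
```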


nn.ReLU - inplace argument

import torch.nn as nn
a = torch.randn(2,3)
nn.ReLU(inplace=True)(a)

==> a itself is modified in place
additional insight
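
To make the effect of the inplace argument visible, this sketch compares it with the default out-of-place module on fixed values chosen for illustration:

```python
import torch
import torch.nn as nn

a = torch.tensor([[-1.0, 2.0], [3.0, -4.0]])
b = a.clone()

out_inplace = nn.ReLU(inplace=True)(a)  # overwrites a and returns it
out_copy = nn.ReLU()(b)                 # b keeps its negative entries

print(a)
print(b)
```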
Backpropagation

By default, a newly created tensor has requires_grad=False. Use the requires_grad_() method to change this in place, or pass requires_grad=True when creating the tensor.
  a = torch.ones(3, requires_grad=True)
  y = a.mean()
  y.backward()
  a.grad # tensor([0.3333, 0.3333, 0.3333])
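
The in-place route mentioned above might look like this. The quadratic loss is just an illustrative choice whose gradient, 2 * w, is easy to verify by hand.

```python
import torch

w = torch.tensor([1.0, 2.0, 3.0])
print(w.requires_grad)   # False by default

w.requires_grad_()       # switch on gradient tracking in place
loss = (w ** 2).sum()    # d(loss)/dw = 2 * w
loss.backward()

print(w.grad)
```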

GPU

x = torch.rand(dim, dim, device="cuda")  # dim: matrix size, defined beforehand
z = 2 * torch.ones(10, 10).to("cuda")
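
The two lines above fail on a machine without a GPU. A common pattern, sketched here with an illustrative 3x3 size, is to pick the device once and fall back to the CPU:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.rand(3, 3, device=device)   # created directly on the device
z = 2 * torch.ones(3, 3).to(device)   # created on the CPU, then moved
out = x @ z                           # operands must share a device

print(out.device)
```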

When generating random numbers, the seed is not synchronized between CPU and GPU, so we need to set the seed on the GPU separately to make the code reproducible. Note that, because GPU architectures differ, running the same code on different GPUs does not guarantee the same random numbers.
# GPU operations have a separate seed that we also want to set
if torch.cuda.is_available():
    torch.cuda.manual_seed(42)
    torch.cuda.manual_seed_all(42)
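
One way to bundle the seeding calls is a small helper. The name `set_seed` is hypothetical and not part of PyTorch, and seeding Python's `random` and NumPy is an extra assumption about what a typical script also uses.

```python
import random

import numpy as np
import torch


def set_seed(seed: int) -> None:
    """Seed the Python, NumPy and PyTorch generators (hypothetical helper)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(seed)  # covers every visible GPU


# Re-seeding reproduces the same random draw
set_seed(42)
first = torch.rand(3)
set_seed(42)
second = torch.rand(3)
print(torch.equal(first, second))
```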

Additionally, some operations on a GPU are implemented stochastically for efficiency.
# We want to ensure that all operations are deterministic on GPU (if used) for reproducibility.
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

As background on cuDNN: for many operations, cuDNN has several implementations, which we will call algorithms. Setting cudnn.deterministic allows only those cuDNN algorithms that are (believed to be) deterministic. By default, cuDNN uses heuristics to pick an algorithm, roughly based on the input shape, strides (i.e. memory layout), and dtype. These heuristics cover a broad set of cases but, being heuristics, can occasionally pick a less efficient algorithm. If you set cudnn.benchmark, cuDNN instead benchmarks several algorithms and picks the one it found to be fastest.
Benchmark mode is useful when the input sizes of your network do not vary: cuDNN searches once for the best set of algorithms for that configuration (which takes some time), and this usually leads to faster runtime. If your input sizes change at each iteration, cuDNN benchmarks again every time a new size appears, possibly leading to worse runtime performance. It therefore seems good practice to turn off cudnn.benchmark when turning on cudnn.deterministic.
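
The trade-off between the two flags can be captured in a single switch. The helper name `configure_cudnn` is hypothetical, and treating the two flags as strict opposites is a simplifying assumption.

```python
import torch


def configure_cudnn(reproducible: bool) -> None:
    """Hypothetical helper: favor determinism, or favor autotuned speed."""
    torch.backends.cudnn.deterministic = reproducible
    torch.backends.cudnn.benchmark = not reproducible


configure_cudnn(reproducible=True)
print(torch.backends.cudnn.deterministic, torch.backends.cudnn.benchmark)
```

Calling `configure_cudnn(reproducible=False)` would suit training runs with fixed input sizes where speed matters more than bit-exact repeatability.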

Ref: UVA DL Tutorials
Ref: cudnn.benchmark
Ref: cudnn.deterministic & cudnn.benchmark