Tensors

Construction

There are a few main ways to create a tensor, depending on your use case.

  • To create a tensor with pre-existing data, use torch.tensor(). It constructs a tensor with no autograd history (a leaf tensor) by copying data.
    • For a tensor t, torch.tensor(t) is equivalent to t.clone().detach().
  • To create a tensor with specific size, use torch.* tensor creation ops (see Creation Ops).
  • To create a tensor with the same size (and similar types) as another tensor, use torch.*_like tensor creation ops (see Creation Ops).
  • To create a tensor with similar type but different size as another tensor, use tensor.new_* creation ops.
  • torch.as_tensor() preserves autograd history and avoids copies where possible. torch.from_numpy() creates a tensor that shares storage with a NumPy array (see the sketch after this list).
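
A minimal sketch of the copy vs. share behaviour (names and values chosen only for illustration):

```
import numpy as np
import torch

arr = np.zeros(3)
t1 = torch.tensor(arr)      # copies the data
t2 = torch.from_numpy(arr)  # shares storage with arr
arr[0] = 1.0
print(t1[0].item(), t2[0].item())  # 0.0 1.0 (only t2 sees the change)
```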

Conversion & size

  • You can obtain the shape of a tensor in the same way as in NumPy (x.shape), or using the .size() method.
    • The size() method also accepts a dimension to return, e.g. a.size(0).
  • Converting a tensor to NumPy requires the tensor to be on the CPU, not the GPU: np_arr = tensor.cpu().numpy() (see the sketch below).
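
A short illustration of the calls above:

```
x = torch.ones(2, 3)
x.shape      # torch.Size([2, 3])
x.size()     # torch.Size([2, 3])
x.size(0)    # 2
np_arr = x.cpu().numpy()  # .cpu() is a no-op if x is already on the CPU
```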

Examples

  1. Passing a shape to torch.Tensor

    import torch
    x = torch.Tensor(2, 3)  # uninitialized 2x3 tensor
    
  2. From Python iterables

    import numpy as np

    a = torch.tensor([0, 1, 2])
    b = torch.tensor(((1.0, 1.1), (1.2, 1.3)))
    c = torch.tensor(np.ones([2, 3]))
    
  3. Tensor constructors

    x = torch.ones(5, 3)
    y = torch.zeros(2)
    z = torch.empty(1, 1, 5)  # uninitialized values
    
  4. Constructors for random numbers

    a = torch.rand(1, 3)    # uniform over [0, 1)
    b = torch.randn(3, 4)   # standard normal
    c = torch.randint(100, 20000, (1, 4, 40))  # integers uniform in [100, 20000)
    
  5. Tensors with the same dimensions as another tensor

    c = torch.zeros_like(a)
    d = torch.rand_like(b)
    

Generate ranges

```
a = torch.arange(0, 10, step=1)     # 0, 1, ..., 9; the end point is excluded
b = torch.linspace(0, 5, steps=11)  # 11 evenly spaced points; both end points included
```

Operations

In-place operations are usually marked with an underscore suffix (e.g. add_ instead of add), as in the sketch below.
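
A quick illustration of the in-place vs. out-of-place convention:

```
a = torch.ones(3)
a.add_(1)     # modifies a in place -> tensor([2., 2., 2.])
b = a.add(1)  # returns a new tensor; a is unchanged
```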

  1. View - returns a new tensor with the same data as the original tensor but of a different shape; the data is shared, not copied (see the sketch below)

    torch.randn(4, 4).view(16)
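
Because the view shares storage with the original tensor, writes through the view are visible in it (a small illustrative sketch):

```
x = torch.zeros(4, 4)
v = x.view(16)   # same storage, new shape; x.view(-1, 8) would infer the first dim
v[0] = 5.0
x[0, 0]          # tensor(5.)
```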

  2. Matrix operations

    - @, torch.matmul, or torch.mm - a1 @ a2 + a3
    - torch.mm does not broadcast
    - torch.bmm - batch matrix multiply
    - torch.einsum - performs matrix multiplications and more (i.e. sums of products) using the Einstein summation convention (see the sketch below)
    - torch.dot - dot product of two 1-D tensors: torch.dot(b1, b2)
    - torch.t() or Tensor.T - transpose
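
A minimal sketch of these operations (shapes chosen arbitrarily for illustration):

```
a1 = torch.randn(2, 3)
a2 = torch.randn(3, 4)
a3 = torch.randn(2, 4)
out = a1 @ a2 + a3                               # matmul plus elementwise add
out_e = torch.einsum('ij,jk->ik', a1, a2) + a3   # the same product via einsum
torch.allclose(out, out_e)                       # True
batch = torch.bmm(torch.randn(10, 2, 3), torch.randn(10, 3, 4))  # [10, 2, 4]
```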

  3. Indexing

    x = torch.arange(0, 10)
    x, x[-1], x[1:3], x[:-2]

  4. Flatten & reshape

    z = torch.arange(12).reshape(6, 2)
    z = z.flatten()
    z = z.reshape(3, 4)

  5. Squeeze & unsqueeze

    x = torch.randn(1, 10)
    x = x.squeeze(0)    # remove dim 0 -> shape [10]
    y = torch.randn(5, 5)
    y = y.unsqueeze(1)  # insert dim 1 -> shape [5, 1, 5]

  6. Permutation

    x = torch.rand(3, 48, 64)
    y = x.transpose(0, 1).transpose(1, 2)
    z = x.permute(1, 2, 0)  # [48, 64, 3]
    torch.equal(y, z)       # True

  7. Concatenation

    x = torch.arange(12, dtype=torch.float32).reshape((3, 4))
    y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

    cat_rows = torch.cat((x, y), dim=0)  # shape [6, 4]
    cat_cols = torch.cat((x, y), dim=1)  # shape [3, 8]

  8. Conversion between torch & NumPy

    x = torch.randn(5)
    y = x.numpy()        # shares memory with x (for CPU tensors)
    z = torch.tensor(y)  # copies the data into a new tensor

  9. Tensor to Python data type

    a = torch.tensor([3.5])
    a, a.item(), float(a), int(a), a.tolist()

  10. Number of elements

    torch.rand(3, 5).numel()  # 15

  11. Casting

    from sklearn.datasets import make_moons

    X, y = make_moons(256, noise=0.1)
    X = torch.tensor(X, dtype=torch.float32)
    y = torch.from_numpy(y).type(torch.LongTensor)

  12. Find the index of an element

    a = torch.tensor([1, 2, 3])
    torch.where(a == 2)[0]

    or

    (a == 2).nonzero(as_tuple=True)

  13. nn.ReLU - the inplace argument

    import torch.nn as nn

    a = torch.randn(2, 3)
    nn.ReLU(inplace=True)(a)  # ==> a itself is also updated

Backpropagation

By default, requires_grad is False when a tensor is created. Use the requires_grad_() method to change it in place, or pass requires_grad=True when creating the tensor.

  a = torch.ones(3, requires_grad=True)
  y = a.mean()
  y.backward()
  a.grad # tensor([0.3333, 0.3333, 0.3333])
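
The in-place variant looks like this (a small illustrative sketch):

```
b = torch.ones(3)        # requires_grad defaults to False
b.requires_grad_(True)   # switch it on in place
(b * 2).sum().backward()
b.grad                   # tensor([2., 2., 2.])
```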

GPU

dim = 3
x = torch.rand(dim, dim, device="cuda")  # allocate directly on the GPU
z = 2 * torch.ones(10, 10).to("cuda")    # or move an existing tensor to the GPU
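
A common device-agnostic pattern (a sketch, not from the original notes; assumes import torch.nn as nn):

```
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.rand(3, 3, device=device)
model = nn.Linear(3, 1).to(device)  # modules are moved with .to() as well
```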

When generating random numbers, the seed is not synchronized between CPU and GPU. Hence, we need to set the seed on the GPU separately to ensure reproducible code. Note that, due to different GPU architectures, running the same code on different GPUs does not guarantee the same random numbers.

GPU operations have a separate seed that we also want to set:

if torch.cuda.is_available():
    torch.cuda.manual_seed(42)
    torch.cuda.manual_seed_all(42)

Additionally, some operations on a GPU are implemented stochastically for efficiency. To ensure reproducibility, we want all GPU operations to be deterministic (if a GPU is used):

torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

As background on cuDNN, it is important to realize that, for many operations, cuDNN has several implementations, call them different algorithms. cudnn.deterministic will only allow those cuDNN algorithms that are (believed to be) deterministic.

cuDNN normally uses heuristics to pick an algorithm; these depend, roughly, on the input shape, strides (i.e. memory layout), and dtype. The heuristics cover a broad set of cases, but, being heuristics, they may pick a less efficient algorithm at times. To improve on the heuristics, setting cudnn.benchmark makes the cuDNN library benchmark several algorithms and pick the one it finds to be fastest.

Benchmark mode is good whenever your network's input sizes do not vary: cuDNN looks for the optimal set of algorithms for that particular configuration (which takes some time), and this usually leads to faster runtime. But if your input size changes at each iteration, cuDNN benchmarks every time a new size appears, possibly leading to worse runtime performance. It therefore seems good practice to turn off cudnn.benchmark when turning on cudnn.deterministic.
