There are a few main ways to create a tensor, depending on your use case.
- To create a tensor with pre-existing data, use torch.tensor(). It constructs a tensor with no autograd history (a leaf tensor) by copying the data.
  - If t is a tensor, torch.tensor(t) is equivalent to t.clone().detach().
- To create a tensor with specific size, use torch.* tensor creation ops (see Creation Ops).
- To create a tensor with the same size (and similar types) as another tensor, use torch.*_like tensor creation ops (see Creation Ops).
- To create a tensor with a similar type but a different size from another tensor, use tensor.new_* creation ops.
- torch.as_tensor() preserves autograd history and avoids copies where possible.
- torch.from_numpy() creates a tensor that shares storage with a NumPy array.
- You can obtain the shape of a tensor the same way as in NumPy (x.shape), or with the .size() method. The size method also accepts the dimension to return, e.g. a.size(0).
- Converting a tensor to a NumPy array requires the tensor to be on the CPU, not the GPU.
np_arr = tensor.cpu().numpy()
- Passing a shape to torch.Tensor
x = torch.Tensor(2, 3) # uninitialized float tensor of shape [2, 3]
- From Python iterables
a = torch.tensor([0,1,2])
b = torch.tensor(((1.0,1.1), (1.2, 1.3)))
c = torch.tensor(np.ones([2,3]))
- Tensor constructors
x = torch.ones(5,3)
y = torch.zeros(2)
z = torch.empty(1,1,5)
- Constructors for random numbers
a = torch.rand(1,3) # uniform on [0, 1)
b = torch.randn(3,4) # standard normal
c = torch.randint(100, 20000, (1, 4, 40)) # uniform integers in [100, 20000)
- Tensors with the same dimensions as another tensor
c = torch.zeros_like(a)
d = torch.rand_like(b)
```
a = torch.arange(0, 10, step=1)
b = torch.linspace(0, 5, steps=11)
```
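The difference between copying and sharing data described in the creation notes above can be checked with a small sketch. The variable names (arr, copied, shared, maybe_shared) are illustrative only, and the example assumes a CPU-only setup with NumPy installed.

```python
import numpy as np
import torch

arr = np.zeros(3)

copied = torch.tensor(arr)           # always copies the data
shared = torch.from_numpy(arr)       # shares memory with arr
maybe_shared = torch.as_tensor(arr)  # avoids a copy when dtype and device allow it

arr[0] = 1.0  # modify the NumPy array in place

print(copied[0].item())        # the copy does not see the change
print(shared[0].item())        # the shared tensor does
print(maybe_shared[0].item())  # no copy was needed here, so it also sees the change

t = torch.ones(2, 3, requires_grad=True)
detached = t.clone().detach()  # same result as torch.tensor(t): a copy without autograd history
print(detached.requires_grad)

# Shape can be read NumPy-style or through size()
print(t.shape, t.size(), t.size(0))
```

If the tensor must not be affected by later changes to the array, torch.tensor() is the safer choice; the sharing variants are mainly useful for avoiding copies of large arrays.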
In-place operations are usually marked with an underscore suffix (e.g. add_ instead of add).
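A minimal sketch of the naming convention, using add and add_ as the example pair. The variable names are illustrative.

```python
import torch

a = torch.ones(3)

b = a.add(1)   # out-of-place: returns a new tensor, a is left unchanged
print(a, b)

c = a.add_(1)  # in-place: modifies a itself and returns it
print(a, c)

# c refers to the same underlying memory as a, while b has its own storage
print(c.data_ptr() == a.data_ptr(), b.data_ptr() == a.data_ptr())
```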
- View: returns a new tensor with the same data as the original tensor but with a different shape.
torch.randn(4, 4).view(16)
- Matrix operations
  - @ or torch.matmul or torch.mm - a1 @ a2 + a3
  - torch.mm does not broadcast
  - torch.bmm - batch matrix multiply
  - torch.einsum - performs matrix multiplications and more (i.e. sums of products) using the Einstein summation convention
  - torch.dot - dot product of two 1D tensors, e.g. torch.dot(b1, b2)
  - torch.t() or Tensor.T - transpose
- Indexing
x = torch.arange(0, 10)
x, x[-1], x[1:3], x[:-2]
- Flatten & reshape
z = torch.arange(12).reshape(6,2)
z = z.flatten()
z = z.reshape(3,4)
- Squeeze & Unsqueeze
x = torch.randn(1, 10)
x = x.squeeze(0) # squeeze 0th dim - [10]
y = torch.randn(5, 5)
y = y.unsqueeze(1) # dim 1 - [5, 1, 5]
- Permutation
x = torch.rand(3, 48, 64)
y = x.transpose(0,1).transpose(1,2) # [48, 64, 3]
z = x.permute(1,2,0) # [48, 64, 3]
torch.equal(y, z) # True
- Concatenation
x = torch.arange(12, dtype=torch.float32).reshape((3,4))
y = torch.tensor([[2.0, 1, 4, 3],[1,2,3,4],[4,3,2,1]])
cat_rows = torch.cat((x, y), dim=0)
cat_cols = torch.cat((x, y), dim=1)
- Conversion between torch & numpy
x = torch.randn(5)
y = x.numpy() # shares memory with x
z = torch.tensor(y) # copies the data into a new tensor
- Tensor to Python data type
a = torch.tensor([3.5])
a, a.item(), float(a), int(a), a.tolist()
- Number of elements
torch.rand(3,5).numel() # 15
- Casting
from sklearn.datasets import make_moons
X, y = make_moons(256, noise=0.1)
X = torch.tensor(X, dtype=torch.float32)
y = torch.from_numpy(y).type(torch.LongTensor)
- Find index of element
a = torch.tensor([1, 2, 3])
torch.where(a == 2)[0]
or
(a == 2).nonzero(as_tuple=True)
- nn.ReLU - inplace argument
a = torch.randn(2,3)
nn.ReLU(inplace=True)(a)
==> a is also updated
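As a rough check of the matrix multiplication entries in the list above, the following sketch compares the variants on small random inputs. The tensor names and shapes are made up for illustration.

```python
import torch

a = torch.randn(2, 3)
b = torch.randn(3, 4)

# For 2D inputs these all compute the same matrix product
p1 = a @ b
p2 = torch.matmul(a, b)
p3 = torch.mm(a, b)
p4 = torch.einsum("ik,kj->ij", a, b)

# torch.matmul broadcasts over leading batch dimensions; torch.mm accepts only 2D inputs
batch = torch.randn(5, 2, 3)
bm = torch.matmul(batch, b)  # b is broadcast across the batch: [5, 2, 4]

# torch.bmm needs an explicit 3D batch on both sides
bb = torch.bmm(batch, b.unsqueeze(0).expand(5, -1, -1))

# torch.dot works on 1D tensors only
v = torch.tensor([1.0, 2.0, 3.0])
d = torch.dot(v, v)

print(bm.shape, bb.shape, d.item())
```

In this sketch torch.matmul is the most forgiving choice, while torch.mm and torch.bmm make the expected dimensionality explicit, which can help catch shape mistakes early.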
By default, when we create a tensor, requires_grad is False. Use the requires_grad_ method to change it in place, or pass requires_grad=True when creating the tensor.
a = torch.ones(3, requires_grad=True)
y = a.mean()
y.backward()
a.grad # tensor([0.3333, 0.3333, 0.3333])
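The in-place route through requires_grad_ can be sketched the same way. The names w and loss are illustrative.

```python
import torch

w = torch.ones(3)
print(w.requires_grad)  # gradient tracking is off by default

w.requires_grad_()      # switch gradient tracking on in place
loss = (w * 2).sum()
loss.backward()

print(w.grad)           # d(loss)/dw is 2 for every element
```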
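dim = 3 # example size (assumed here; any positive integer works)
x = torch.rand(dim, dim, device="cuda") # create the tensor directly on the GPU
z = 2 * torch.ones(10, 10).to("cuda") # create on the CPU, then move to the GPU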
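The two lines above assume a CUDA device is present. A common pattern, sketched here on the assumption that falling back to the CPU is acceptable, is to pick the device once and reuse it. The variable names are illustrative.

```python
import torch

# Fall back to the CPU when no CUDA device is available (an assumption for portability)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.rand(3, 3, device=device)    # allocated directly on the chosen device
z = (2 * torch.ones(3, 3)).to(device)  # created on the CPU, then moved

y = x + z                # both operands live on the same device
y_np = y.cpu().numpy()   # NumPy conversion requires a CPU tensor

print(y.device, y_np.shape)
```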
When generating random numbers, the seeds on the CPU and the GPU are not synchronized. Hence, we need to set the seed on the GPU separately to ensure reproducible code. Note that due to different GPU architectures, running the same code on different GPUs does not guarantee the same random numbers.
# GPU operations have a separate seed we also want to set
if torch.cuda.is_available():
    torch.cuda.manual_seed(42)
    torch.cuda.manual_seed_all(42)
Additionally, some operations on a GPU are implemented stochastically for efficiency. We want to ensure that all operations are deterministic on the GPU (if used) for reproducibility.
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
As background on cuDNN: for many operations, cuDNN has several implementations, which we call algorithms. Setting cudnn.deterministic allows only those cuDNN algorithms that are (believed to be) deterministic. By default, cuDNN uses heuristics to pick an algorithm; these depend, roughly, on the input shape, strides (i.e. memory layout) and dtype. The heuristics cover a broad set of cases but, being heuristics, they sometimes pick a less efficient algorithm. If you set cudnn.benchmark, the cuDNN library instead benchmarks several algorithms and picks the one it found to be fastest.

Benchmark mode is a good choice when the input sizes of your network do not vary: cuDNN searches for the best set of algorithms for that particular configuration (which takes some time), and this usually leads to faster runtime. But if your input sizes change at each iteration, cuDNN benchmarks again every time a new size appears, possibly leading to worse runtime performance. As such, it seems good practice to turn off cudnn.benchmark when turning on cudnn.deterministic.
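The seeding and cuDNN settings discussed above are often gathered into one helper. The function name set_seed is hypothetical, not a PyTorch API; this is only a sketch that combines the calls already shown, plus torch.manual_seed for the CPU generator.

```python
import torch

def set_seed(seed: int) -> None:
    """Hypothetical helper: seed PyTorch RNGs and request deterministic cuDNN algorithms."""
    torch.manual_seed(seed)               # CPU generator
    if torch.cuda.is_available():
        torch.cuda.manual_seed(seed)      # current GPU
        torch.cuda.manual_seed_all(seed)  # all GPUs
    torch.backends.cudnn.deterministic = True  # only deterministic cuDNN algorithms
    torch.backends.cudnn.benchmark = False     # no per-shape algorithm benchmarking

set_seed(42)
first = torch.rand(3)
set_seed(42)
second = torch.rand(3)

print(torch.equal(first, second))  # re-seeding reproduces the same CPU random numbers
```

Calling such a helper once at the start of a script keeps the reproducibility settings in one place. It does not cover other sources of randomness such as Python's random module or NumPy.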
- Ref: UVA DL Tutorials
- Ref: cudnn.benchmark
- Ref: cudnn.deterministic & cudnn.benchmark