# For tips on running notebooks in Google Colab, see
# https://pytorch.org/tutorials/beginner/colab
%matplotlib inline

Tensors

Tensors are specialized data structures that are very similar to arrays and matrices. In PyTorch, we use tensors to encode the inputs and outputs of a model, as well as the model’s parameters.

Tensors are similar to NumPy’s ndarrays, except that tensors can run on GPUs or other specialized hardware to accelerate computing. If you’re familiar with ndarrays, you’ll be right at home with the Tensor API. If not, follow along in this quick API walkthrough.

import torch
import numpy as np

Tensor Initialization

Tensors can be initialized in various ways. Take a look at the following examples:

Directly from data (torch.tensor)

Tensors can be created directly from data. The data type is automatically inferred.

data = [[1, 2], [3, 4]]
x_data = torch.tensor(data)
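The inferred dtype follows the Python values: integer data yields torch.int64, floating-point data yields torch.float32. A minimal check:

print(x_data.dtype)                    # torch.int64, inferred from the int data
print(torch.tensor([[1., 2.]]).dtype)  # torch.float32, inferred from float data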

From a NumPy array (torch.from_numpy)

Tensors can be created from NumPy arrays (and vice versa - see Bridge with NumPy below).

np_array = np.array(data)
x_np = torch.from_numpy(np_array)
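Note the difference from torch.tensor, which always copies: torch.from_numpy shares memory with the source array. A minimal sketch:

a = np.array([1, 2, 3])
t_shared = torch.from_numpy(a)  # shares memory with a
t_copied = torch.tensor(a)      # makes an independent copy
a[0] = 99
print(t_shared)  # tensor([99,  2,  3]) -- the change is visible
print(t_copied)  # tensor([1, 2, 3])   -- the copy is unaffected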

From another tensor

The new tensor retains the properties (shape, datatype) of the argument tensor, unless explicitly overridden.

x_ones = torch.ones_like(x_data) # retains the properties of x_data (torch.ones_like: a ones tensor with the same shape)
print(f"Ones Tensor: \n {x_ones} \n")

x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the datatype of x_data (torch.rand_like: a random tensor with the same shape)
print(f"Random Tensor: \n {x_rand} \n")
Ones Tensor: 
 tensor([[1, 1],
        [1, 1]]) 

Random Tensor: 
 tensor([[0.5875, 0.9007],
        [0.6026, 0.1032]]) 
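The dtype override above is needed because torch.rand draws floats in [0, 1), which an integer tensor cannot hold. The same pattern applies to the other *_like constructors; a minimal sketch:

x_zeros = torch.zeros_like(x_data)  # same shape and dtype (torch.int64) as x_data
print(x_zeros.dtype)                # torch.int64
x_full = torch.full_like(x_data, 7.5, dtype=torch.float)  # shape kept, dtype overridden
print(x_full)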

With random or constant values

shape is a tuple of tensor dimensions. In the functions below, it determines the dimensionality of the output tensor.

shape = (2, 3,)  # the shape is given as a tuple
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")
Random Tensor: 
 tensor([[0.9403, 0.6379, 0.9612],
        [0.6934, 0.6707, 0.0477]]) 

Ones Tensor: 
 tensor([[1., 1., 1.],
        [1., 1., 1.]]) 

Zeros Tensor: 
 tensor([[0., 0., 0.],
        [0., 0., 0.]])

Tensor Attributes

Tensor attributes describe their shape, datatype, and the device on which they are stored.

tensor = torch.rand(3, 4)

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")
Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu
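.to returns a new tensor and can change the dtype as well as the device; a minimal sketch:

t64 = tensor.to(torch.float64)  # convert to double precision
print(t64.dtype)                # torch.float64
print(tensor.dtype)             # torch.float32 -- the original is unchanged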

Tensor Operations

Over 100 tensor operations, including transposing, indexing, slicing, mathematical operations, linear algebra, random sampling, and more, are comprehensively described in the torch documentation (https://pytorch.org/docs/stable/torch.html).

Each of them can be run on the GPU (at typically higher speeds than on a CPU). If you’re using Colab, allocate a GPU by going to Edit > Notebook Settings.

# We move our tensor to the GPU if available
if torch.cuda.is_available():
    tensor = tensor.to('cuda')
    print(f"Device tensor is stored on: {tensor.device}")
Device tensor is stored on: cuda:0
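A common device-agnostic variant of the same move picks the device once and reuses it:

device = 'cuda' if torch.cuda.is_available() else 'cpu'
tensor = tensor.to(device)
print(f"Device tensor is stored on: {tensor.device}")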

Try out some of the operations from the list. If you’re familiar with the NumPy API, you’ll find the Tensor API a breeze to use.

Standard numpy-like indexing and slicing:

tensor = torch.ones(4, 4)
tensor[:,1] = 0  # set every element of column index 1 (the second column) to 0
print(tensor)
tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])
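A few more NumPy-style accessors on the same tensor:

print(tensor[0])        # first row: tensor([1., 0., 1., 1.])
print(tensor[:, -1])    # last column: tensor([1., 1., 1., 1.])
print(tensor[1:3, :2])  # rows 1-2, first two columns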

Joining tensors You can use torch.cat to concatenate a sequence of tensors along a given dimension. See also torch.stack, another tensor joining op that is subtly different from torch.cat.


t1 = torch.cat([tensor, tensor, tensor], dim=1)  # dim=1 concatenates along columns (side by side); dim=0 would stack vertically
print(t1)
tensor([[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.]])
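The subtle difference: torch.cat joins along an existing dimension, while torch.stack inserts a new one. A minimal sketch on the same 4x4 tensor:

t_cat = torch.cat([tensor, tensor], dim=0)
t_stack = torch.stack([tensor, tensor], dim=0)
print(t_cat.shape)    # torch.Size([8, 4])    -- dim 0 grew
print(t_stack.shape)  # torch.Size([2, 4, 4]) -- a new dim 0 was inserted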

Multiplying tensors


# This computes the element-wise product: entries at matching positions are multiplied, so the two shapes must match
print(f"tensor.mul(tensor) \n {tensor.mul(tensor)} \n")
# Alternative syntax: * performs the same element-wise multiplication
print(f"tensor * tensor \n {tensor * tensor}")
tensor.mul(tensor) 
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 

tensor * tensor 
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])

This computes the matrix multiplication between two tensors. Note the shape requirement: for m1 @ m2, the number of columns of m1 must equal the number of rows of m2.

print(f"tensor.matmul(tensor.T) \n {tensor.matmul(tensor.T)} \n")
# Alternative syntax:
print(f"tensor @ tensor.T \n {tensor @ tensor.T}")
tensor.matmul(tensor.T) 
 tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]]) 

tensor @ tensor.T 
 tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]])
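The general shape rule: an (m, n) matrix can multiply an (n, p) matrix, giving an (m, p) result. A minimal sketch with non-square operands:

m1 = torch.rand(2, 3)
m2 = torch.rand(3, 5)
print((m1 @ m2).shape)  # torch.Size([2, 5]); (2, 3) @ (3, 5) -> (2, 5)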

In-place operations Operations that have a _ suffix are in-place. For example, x.copy_(y) and x.t_() will change x.

An operation with a trailing _ modifies the object it is called on (the object before the dot) rather than returning a new tensor.

print(tensor, "\n")
tensor.add_(5)
print(tensor)
tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 

tensor([[6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.]])
NOTE:

In-place operations save some memory, but can be problematic when computing derivatives because of an immediate loss of history. Hence, their use is discouraged.
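For example, autograd refuses an in-place update of a leaf tensor that requires gradients (a minimal sketch; the exact error message varies by PyTorch version):

x = torch.ones(3, requires_grad=True)
try:
    x.add_(1)  # in-place update of a leaf tensor that requires grad
except RuntimeError as e:
    print(e)   # complains about an in-place operation on a leaf Variable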


Bridge with NumPy

Tensors on the CPU and NumPy arrays can share their underlying memory locations, and changing one will change the other.


Tensor to NumPy array

t = torch.ones(5)  # create a tensor
print(f"t: {t}")
n = t.numpy()  # convert the tensor to a NumPy array
print(f"n: {n}")
t: tensor([1., 1., 1., 1., 1.])
n: [1. 1. 1. 1. 1.]

A change in the tensor reflects in the NumPy array.

t and n above now share the same underlying memory.

t.add_(1)
print(f"t: {t}")
print(f"n: {n}")
t: tensor([2., 2., 2., 2., 2.])
n: [2. 2. 2. 2. 2.]

NumPy array to Tensor

n = np.ones(5)  # create a NumPy array
t = torch.from_numpy(n)  # convert the NumPy array to a tensor
# t and n now share the same underlying memory

Changes in the NumPy array reflect in the tensor.

np.add(n, 1, out=n)  # add 1 to each element of n; out=n writes the result back into n
print(f"t: {t}")
print(f"n: {n}")
t: tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
n: [2. 2. 2. 2. 2.]
As an aside, np.add broadcasts operands whose shapes differ but are compatible:

x1 = np.arange(9.0).reshape((3, 3))
x2 = np.arange(3.0)
x3 = np.add(x1, x2)  # x2 is broadcast to match x1, as if it were [[0. 1. 2.], [0. 1. 2.], [0. 1. 2.]]
print(x1)
print('\n')
print(x2)
print('\n')
print(x3)

print("-----------------------------------")

x1 = np.arange(3.0)
x2 = np.arange(9.0).reshape((3, 3))
x3 = np.add(x1, x2)  # broadcasting also applies when the smaller operand comes first; x1 is expanded the same way
print(x1)
print('\n')
print(x2)
print('\n')
print(x3)
[[0. 1. 2.]
 [3. 4. 5.]
 [6. 7. 8.]]

[0. 1. 2.]

[[ 0.  2.  4.]
 [ 3.  5.  7.]
 [ 6.  8. 10.]]

-----------------------------------

[0. 1. 2.]

[[0. 1. 2.]
 [3. 4. 5.]
 [6. 7. 8.]]

[[ 0.  2.  4.]
 [ 3.  5.  7.]
 [ 6.  8. 10.]]
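PyTorch tensors follow the same broadcasting rules; a minimal sketch:

t1 = torch.arange(9.0).reshape(3, 3)
t2 = torch.arange(3.0)
print(t1 + t2)  # t2 is broadcast across the rows, exactly as with np.add above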
np.random.randint draws random integers from a half-open range into an array of a given shape:

import numpy as np
xx = np.random.randint(0, 10, [1, 2, 4, 4])  # low=0, high=10: values drawn from [0, 10)
print(xx)
[[[[7 2 8 5]
   [4 7 6 8]
   [6 1 9 7]
   [9 2 1 1]]

  [[8 6 7 8]
   [2 9 1 5]
   [1 6 9 7]
   [8 0 9 9]]]]
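PyTorch has an analogous generator, torch.randint; a minimal sketch:

tt = torch.randint(0, 10, (1, 2, 4, 4))  # integers in [0, 10), shape (1, 2, 4, 4)
print(tt.shape)  # torch.Size([1, 2, 4, 4])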