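# PyTorch Tensors

## Overview

This article draws mainly on [Dive into Deep Learning](https://github.com/d2l-ai/d2l-zh) and [Thorough PyTorch](https://github.com/datawhalechina/thorough-pytorch). This series does not cover much deep learning theory; it focuses on how to use PyTorch for deep learning.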

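## Creating tensors

A tensor is an array of numerical values that may have multiple dimensions. A tensor with one axis corresponds to a vector in mathematics; a tensor with two axes corresponds to a matrix; tensors with more than two axes have no special mathematical name.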
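
As a quick illustration of this terminology, the sketch below builds a vector, a matrix, and a tensor with three axes, then checks the number of axes through `ndim`. The variable names are only illustrative.

``` python
import torch

vector = torch.tensor([1.0, 2.0, 3.0])           # one axis
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])  # two axes
cube = torch.zeros(2, 3, 4)                      # three axes, no special name

print(vector.ndim, matrix.ndim, cube.ndim)
print(vector.shape, matrix.shape, cube.shape)
```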

PyTorch exposes the following tensor classes, each of which can be called as a constructor:

``` python
class Tensor(torch._C.TensorBase): ...
class DoubleTensor(Tensor): ...  
class FloatTensor(Tensor): ...  
class BFloat16Tensor(Tensor): ...  
class LongTensor(Tensor): ...  
class IntTensor(Tensor): ...  
class ShortTensor(Tensor): ...  
class HalfTensor(Tensor): ...  
class CharTensor(Tensor): ...  
class ByteTensor(Tensor): ...  
class BoolTensor(Tensor): ...
```

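Tensors can be created directly through these constructors, but PyTorch also provides many factory functions that give finer control over how a `Tensor` is generated. In general, PyTorch recommends creating tensors with these factory functions, which the following subsections cover.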
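
One reason to prefer the factory functions is that the constructor and `torch.tensor` interpret their arguments differently. The sketch below is a small comparison, assuming the default dtype has not been changed from `float32`; the variable names are illustrative.

``` python
import torch

# The constructor treats integer arguments as a shape and
# produces a tensor of the default dtype (float32).
legacy = torch.Tensor(2, 3)                 # shape (2, 3), values uninitialized
legacy_from_data = torch.Tensor([1, 2, 3])  # data is converted to float32

# The factory function treats its argument as data and infers the dtype.
scalar = torch.tensor(3)                    # 0-dim tensor holding the value 3
inferred = torch.tensor([1, 2, 3])          # int64, inferred from Python ints

print(legacy.shape, legacy_from_data.dtype)
print(scalar.shape, inferred.dtype)
```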

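### Creating from existing data

Tensors can be created from a `list` or an `np.ndarray`.

`torch.tensor` always creates a new tensor by copying the data, so the result does not share memory with the source `numpy` array.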

``` python
import torch
import numpy as np
torch.tensor([[1, 2], [3, 4]], dtype = torch.float32)
```

    tensor([[1., 2.],
            [3., 4.]])

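`torch.as_tensor` avoids a copy when it can: if the input is already a tensor or an `np.ndarray` and the requested `dtype` and `device` match the input, the result shares memory with it. If a different `dtype` is requested, the target `device` is a GPU, or the input is a Python `list` (as in the example below), a new tensor is created.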

``` python
torch.as_tensor([5, 6, 7])
```

    tensor([5, 6, 7])

`torch.from_numpy` always shares memory with the source array, and it does not accept `dtype` or `device` arguments.

``` python
np_arr = np.array([1, 2, 3], dtype = np.float32)
torch.from_numpy(np_arr)
```

    tensor([1., 2., 3.])
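
The memory-sharing behavior of the three functions can be checked by modifying the source array in place and seeing which tensors change. This is a minimal sketch on the CPU; the variable names are illustrative.

``` python
import numpy as np
import torch

src = np.array([1.0, 2.0, 3.0], dtype=np.float32)

copied = torch.tensor(src)                             # always copies
shared = torch.as_tensor(src)                          # same dtype, CPU: shares memory
converted = torch.as_tensor(src, dtype=torch.float64)  # dtype differs: copies
aliased = torch.from_numpy(src)                        # always shares memory

src[0] = 100.0  # modify the NumPy array in place

# Only the tensors that share memory see the new value.
print(copied[0].item(), shared[0].item(), converted[0].item(), aliased[0].item())
```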

#### Constant tensors

Tensors can be created with every element set to the same constant.

``` python
torch.zeros((2, 3))
```

    tensor([[0., 0., 0.],
            [0., 0., 0.]])

``` python
torch.ones((3, 2), dtype = torch.int64)
```

    tensor([[1, 1],
            [1, 1],
            [1, 1]])

``` python
torch.full((2, 2), 3.14)
```

    tensor([[3.1400, 3.1400],
            [3.1400, 3.1400]])

``` python
e = torch.eye(3, 4)
e
```

    tensor([[1., 0., 0., 0.],
            [0., 1., 0., 0.],
            [0., 0., 1., 0.]])

``` python
torch.zeros_like(e)
```

    tensor([[0., 0., 0., 0.],
            [0., 0., 0., 0.],
            [0., 0., 0., 0.]])

``` python
torch.ones_like(e)
```

    tensor([[1., 1., 1., 1.],
            [1., 1., 1., 1.],
            [1., 1., 1., 1.]])

``` python
torch.full_like(e, -1.0)
```

    tensor([[-1., -1., -1., -1.],
            [-1., -1., -1., -1.],
            [-1., -1., -1., -1.]])

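#### Uninitialized tensors

An uninitialized tensor only has its memory allocated; the values are not initialized, so it contains whatever happened to be in that memory. The values are arbitrary and should not be relied on.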

``` python
torch.empty((3, 3))
```

    tensor([[0., 0., 0.],
            [0., 0., 0.],
            [0., 0., 0.]])

``` python
torch.empty_like(e)
```

    tensor([[-3.9431e-34,  1.8133e-42,  0.0000e+00,  0.0000e+00],
            [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  0.0000e+00],
            [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  0.0000e+00]])
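
Since the contents of an uninitialized tensor are arbitrary, a typical pattern is to treat it as a buffer and overwrite every element before reading it. The sketch below shows two ways of doing that; the names `buf` and `out` are illustrative.

``` python
import torch

buf = torch.empty(2, 3)  # contents are arbitrary until written
buf.fill_(7.0)           # in-place fill overwrites every element

out = torch.empty(3)     # preallocated result buffer
torch.add(torch.ones(3), torch.ones(3), out=out)  # result is written into `out`

print(buf)
print(out)
```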

#### Sequence tensors

Tensors can be generated from specific sequences.

``` python
# Arithmetic sequence: from 1.0 up to 5.0 (5.0 excluded), step 0.5
torch.arange(1.0, 5.0, 0.5)
```

    tensor([1.0000, 1.5000, 2.0000, 2.5000, 3.0000, 3.5000, 4.0000, 4.5000])

``` python
# Linear spacing: 5 evenly spaced points from 0 to 1 (end point 1 included)
torch.linspace(0, 1, steps=5)
```

    tensor([0.0000, 0.2500, 0.5000, 0.7500, 1.0000])

``` python
# Logarithmic spacing: 4 points between 10^0 and 10^3
torch.logspace(0, 3, steps=4)
```

    tensor([   1.,   10.,  100., 1000.])
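
The main difference between these functions is how the end point and the spacing are handled. The following sketch puts them side by side, and also uses the `base` argument of `torch.logspace`; the variable names are illustrative.

``` python
import torch

# arange takes a step and excludes the end point.
stepped = torch.arange(0, 10, 2)          # 0, 2, 4, 6, 8

# linspace takes a number of points and includes the end point.
lin = torch.linspace(0, 10, steps=6)      # 0, 2, 4, 6, 8, 10

# logspace spaces the exponents linearly; base defaults to 10.
powers_of_two = torch.logspace(0, 3, steps=4, base=2)  # 2^0 .. 2^3

print(stepped, lin, powers_of_two)
```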

#### Random tensors

Tensors can be sampled from specific distributions.

``` python
# Uniform distribution: sample from the uniform distribution on [0, 1), shape (2, 2)
torch.rand((2, 2))
```

    tensor([[0.9802, 0.4825],
            [0.1893, 0.8265]])

``` python
# Normal distribution: sample from N(mean=0.0, std=0.5), producing a 1-D tensor
torch.normal(mean=0.0, std=0.5, size=(3,))
```

    tensor([ 0.7416, -1.2993, -0.4609])

``` python
# Standard normal distribution: sample from N(0, 1), shape (2, 2)
torch.randn((2, 2))
```

    tensor([[ 0.2501, -1.0411],
            [-0.4427,  1.6018]])

``` python
# Random integers in [0, 10), shape (3, 3)
torch.randint(0, 10, (3, 3))
```

    tensor([[7, 4, 5],
            [5, 5, 9],
            [0, 2, 2]])

``` python
# Random permutation of 0 to 9 (often used to shuffle indices)
torch.randperm(10)
```

    tensor([1, 8, 3, 4, 2, 0, 7, 6, 9, 5])
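
The sampled values above differ on every run. If reproducible samples are needed, one option is to reset the global generator with `torch.manual_seed` before sampling. The sketch below shows that, along with the index-shuffling use of `torch.randperm`; the seed value and variable names are arbitrary choices for illustration.

``` python
import torch

# Resetting the seed makes the next draws repeat exactly.
torch.manual_seed(0)
first = torch.rand(2, 2)
torch.manual_seed(0)
second = torch.rand(2, 2)
print(torch.equal(first, second))

# Shuffle a tensor by indexing it with a random permutation.
data = torch.arange(10, 15)
perm = torch.randperm(len(data))
shuffled = data[perm]
print(shuffled)
```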

## Tensor operations

### Basic attributes

Tensors have the following commonly used basic attributes:

``` python
import torch
a = torch.ones(4,3)
a
```

    tensor([[1., 1., 1.],
            [1., 1., 1.],
            [1., 1., 1.],
            [1., 1., 1.]])

``` python
# Number of dimensions of the tensor
a.ndim
```

    2

``` python
# Same as ndim
a.dim()
```

    2

``` python
# Size of each dimension of the tensor
a.shape
```

    torch.Size([4, 3])

``` python
# size(i) returns the size of dimension i; a.size() with no argument equals a.shape
a.size(0)
```

    4

``` python
# Size of dimension 0
len(a)
```

    4

``` python
# Total number of elements
a.numel()
```

    12

``` python
# Data type of the tensor's elements
a.dtype
```

    torch.float32

``` python
# Device the tensor lives on (CPU / GPU)
a.device
```

    device(type='cpu')

``` python
print(torch.cuda.is_available())
```

    True

``` python
a.to('cuda'), a.cpu(), a.cuda()
```

    (tensor([[1., 1., 1.],
             [1., 1., 1.],
             [1., 1., 1.],
             [1., 1., 1.]], device='cuda:0'),
     tensor([[1., 1., 1.],
             [1., 1., 1.],
             [1., 1., 1.],
             [1., 1., 1.]]),
     tensor([[1., 1., 1.],
             [1., 1., 1.],
             [1., 1., 1.],
             [1., 1., 1.]], device='cuda:0'))

``` python
# Whether the tensor takes part in backpropagation (autograd tracks operations on it)
a.requires_grad
```

    False

``` python
# Gradient stored after a backward pass (None here, so nothing is displayed)
a.grad
```

``` python
# Detach from the gradient graph while sharing the same data
a.detach()
```

    tensor([[1., 1., 1.],
            [1., 1., 1.],
            [1., 1., 1.],
            [1., 1., 1.]])
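
Because `a` does not require gradients, `requires_grad`, `grad`, and `detach()` have little to show above. The sketch below uses a small tensor that does require gradients to show how the three fit together; the tensor `w` and the quadratic `loss` are illustrative choices.

``` python
import torch

w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
loss = (w ** 2).sum()

print(w.grad)      # None: no backward pass has run yet
loss.backward()    # computes d(loss)/dw = 2 * w
print(w.grad)

d = w.detach()     # no gradient tracking, same underlying storage
print(d.requires_grad)
print(d.data_ptr() == w.data_ptr())
```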

### Shape transformations

The shape of a tensor can be changed:

``` python
# Change the shape; copies only if the memory layout does not allow a view
a.reshape([2, -1])
```

    tensor([[1., 1., 1., 1., 1., 1.],
            [1., 1., 1., 1., 1., 1.]])

``` python
# Change the shape without copying; requires a compatible (e.g. contiguous) memory layout
a.view([3, -1])
```

    tensor([[1., 1., 1., 1.],
            [1., 1., 1., 1.],
            [1., 1., 1., 1.]])

``` python
# Change the shape and copy the data
a.view([3, -1]).clone()
```

    tensor([[1., 1., 1., 1.],
            [1., 1., 1., 1.],
            [1., 1., 1., 1.]])

``` python
# Add a dimension of size 1
a.unsqueeze(0).shape, a.unsqueeze(1).shape, a.unsqueeze(2).shape
```

    (torch.Size([1, 4, 3]), torch.Size([4, 1, 3]), torch.Size([4, 3, 1]))

``` python
# Remove dimensions of size 1
b = a.unsqueeze(0).clone()
b.shape, b.squeeze().shape
```

    (torch.Size([1, 4, 3]), torch.Size([4, 3]))

``` python
# Swap two dimensions
b.shape, b.transpose(0,1).shape, b.transpose(1,2).shape
```

    (torch.Size([1, 4, 3]), torch.Size([4, 1, 3]), torch.Size([1, 3, 4]))

``` python
# Reorder the dimensions arbitrarily
b.shape, b.permute(2,0,1).shape
```

    (torch.Size([1, 4, 3]), torch.Size([3, 1, 4]))

``` python
# Transpose a matrix
a.T
```

    tensor([[1., 1., 1., 1.],
            [1., 1., 1., 1.],
            [1., 1., 1., 1.]])

``` python
# Flatten to 1D
a.flatten()
```

    tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
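
The difference between `view` and `reshape` only becomes visible on a tensor whose memory is not contiguous, such as a transposed matrix. The sketch below uses `arange` values instead of all ones so the effect is easy to see; the variable names are illustrative.

``` python
import torch

m = torch.arange(6).reshape(2, 3)
t = m.T                        # transposed view of m, not contiguous
print(t.is_contiguous())

# view cannot flatten a non-contiguous tensor without copying.
try:
    t.view(-1)
    view_failed = False
except RuntimeError:
    view_failed = True

flat_copy = t.reshape(-1)      # reshape falls back to a copy here
flat_view = m.view(-1)         # m is contiguous, so this shares memory

m[0, 0] = 100                  # modify the original in place
print(view_failed, flat_copy[0].item(), flat_view[0].item())
```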

### Type conversions

A tensor can be converted to other types:

``` python
type(a), type(a.numpy()), type(a.tolist())
```

    (torch.Tensor, numpy.ndarray, list)

``` python
a.double().dtype, a.float().dtype, a.long().dtype, a.int().dtype, a.half().dtype
```

    (torch.float64, torch.float32, torch.int64, torch.int32, torch.float16)

For a tensor holding a single element, the following conversions to Python scalars are available:

``` python
a = torch.tensor([3.5])
a, a.item(), float(a), int(a)
```

    (tensor([3.5000]), 3.5, 3.5, 3)
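
Two details of these conversions are easy to miss: on the CPU, `numpy()` returns an array backed by the same memory as the tensor, and converting floats to an integer dtype truncates toward zero, just as `int(a)` gave `3` above. A small sketch, with illustrative variable names:

``` python
import torch

t = torch.ones(3)
n = t.numpy()   # NumPy array sharing memory with the CPU tensor
t.add_(1)       # in-place update on the tensor
print(n)        # the array reflects the update

# Float-to-integer conversion drops the fractional part.
truncated = torch.tensor([3.9, -3.9]).long()
print(truncated)
```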

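### Operators

Tensors support the following operations. The common arithmetic operators are overloaded, so they can be written directly with the usual symbols: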

``` python
x = torch.Tensor([1.0,2,3,4,5,6]).view(2,3)
y = torch.Tensor([1,3,5,7,9,11]).view(2,3)
```

``` python
x + y, x - y, x * y, x / y, x ** y, x @ y.T
```

    (tensor([[ 2.,  5.,  8.],
             [11., 14., 17.]]),
     tensor([[ 0., -1., -2.],
             [-3., -4., -5.]]),
     tensor([[ 1.,  6., 15.],
             [28., 45., 66.]]),
     tensor([[1.0000, 0.6667, 0.6000],
             [0.5714, 0.5556, 0.5455]]),
     tensor([[1.0000e+00, 8.0000e+00, 2.4300e+02],
             [1.6384e+04, 1.9531e+06, 3.6280e+08]]),
     tensor([[ 22.,  58.],
             [ 49., 139.]]))

``` python
# e^x
x.exp()
```

    tensor([[  2.7183,   7.3891,  20.0855],
            [ 54.5981, 148.4132, 403.4288]])

``` python
# Mean
x.mean(), x.mean(axis = 0), x.mean(axis = 1)
```

    (tensor(3.5000), tensor([2.5000, 3.5000, 4.5000]), tensor([2., 5.]))

``` python
# Variance
x.var(), x.var(axis = 0), x.var(axis = 1)
```

    (tensor(3.5000), tensor([4.5000, 4.5000, 4.5000]), tensor([1., 1.]))

``` python
# Standard deviation
x.std(), x.std(axis = 0), x.std(axis = 1)
```

    (tensor(1.8708), tensor([2.1213, 2.1213, 2.1213]), tensor([1., 1.]))

``` python
# L1 norm
x.abs().sum()
```

    tensor(21.)

``` python
# L2 norm
x.norm()
```

    tensor(9.5394)

``` python
# Sum
x.sum(), x.sum(axis = 0), x.sum(axis = 1)
```

    (tensor(21.), tensor([5., 7., 9.]), tensor([ 6., 15.]))

``` python
# Cumulative sum
x.cumsum(axis = 0), x.cumsum(axis = 1)
```

    (tensor([[1., 2., 3.],
             [5., 7., 9.]]),
     tensor([[ 1.,  3.,  6.],
             [ 4.,  9., 15.]]))

``` python
# Maximum
x.max(), x.max(axis = 0), x.max(axis = 1)
```

    (tensor(6.),
     torch.return_types.max(
     values=tensor([4., 5., 6.]),
     indices=tensor([1, 1, 1])),
     torch.return_types.max(
     values=tensor([3., 6.]),
     indices=tensor([2, 2])))

``` python
# Minimum
x.min(), x.min(axis = 0), x.min(axis = 1)
```

    (tensor(1.),
     torch.return_types.min(
     values=tensor([1., 2., 3.]),
     indices=tensor([0, 0, 0])),
     torch.return_types.min(
     values=tensor([1., 4.]),
     indices=tensor([0, 0])))

``` python
# Index of the maximum
x.argmax(), x.argmax(axis = 0), x.argmax(axis = 1)
```

    (tensor(5), tensor([1, 1, 1]), tensor([2, 2]))

``` python
# Index of the minimum
x.argmin(), x.argmin(axis = 0), x.argmin(axis = 1)
```

    (tensor(0), tensor([0, 0, 0]), tensor([0, 0]))

``` python
# Whether any element is True (nonzero)
x.any(), x.any(axis = 0), x.any(axis = 1)
```

    (tensor(True), tensor([True, True, True]), tensor([True, True]))

``` python
# Whether all elements are True (nonzero)
x.all(), x.all(axis = 0), x.all(axis = 1)
```

    (tensor(True), tensor([True, True, True]), tensor([True, True]))
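
A few options of these reductions are worth seeing in isolation. The sketch below rebuilds the same `x` and shows `keepdim=True` (the reduced axis is kept with size 1 so the result broadcasts against the input), the `unbiased` flag of `var` (the default divides by N - 1), and unpacking the `(values, indices)` pair that `max` returns along an axis. The variable names are illustrative.

``` python
import torch

x = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# keepdim=True keeps the reduced axis, giving shape (2, 1).
row_mean = x.mean(dim=1, keepdim=True)
centered = x - row_mean                 # each row now has mean 0

sample_var = x.var()                    # divides by N - 1
population_var = x.var(unbiased=False)  # divides by N

# max along an axis returns both the values and their indices.
values, indices = x.max(dim=1)

print(centered)
print(sample_var, population_var)
print(values, indices)
```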

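### Concatenation

Tensors can be concatenated. The tensors being joined must have the same `shape` in every dimension except the one they are concatenated along.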

``` python
torch.cat((x, y), axis = 0), torch.cat((x, y), axis = 1)
```

    (tensor([[ 1.,  2.,  3.],
             [ 4.,  5.,  6.],
             [ 1.,  3.,  5.],
             [ 7.,  9., 11.]]),
     tensor([[ 1.,  2.,  3.,  1.,  3.,  5.],
             [ 4.,  5.,  6.,  7.,  9., 11.]]))
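
The shape requirement can be checked directly. The sketch below also contrasts `torch.cat` with `torch.stack`, which joins tensors along a new dimension instead of an existing one; the tensors here are illustrative placeholders.

``` python
import torch

x = torch.zeros(2, 3)
y = torch.ones(2, 3)
z = torch.ones(2, 5)

cat_rows = torch.cat((x, y), dim=0)    # joins along existing dim 0
stacked = torch.stack((x, y), dim=0)   # creates a new leading dim
cat_cols = torch.cat((x, z), dim=1)    # dim 0 matches, so this works

# Along dim 0 the other dimension (3 vs 5) does not match.
try:
    torch.cat((x, z), dim=0)
    cat_failed = False
except RuntimeError:
    cat_failed = True

print(cat_rows.shape, stacked.shape, cat_cols.shape, cat_failed)
```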

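### Broadcasting

Broadcasting aligns shapes starting from the **last dimension**. For each aligned pair of dimensions, the two sizes must be equal or one of them must be 1; a missing leading dimension is treated as size 1.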

``` python
a = torch.arange(3).reshape((3, 1))
b = torch.arange(2).reshape((1, 1, 2))
a, b
```

    (tensor([[0],
             [1],
             [2]]),
     tensor([[[0, 1]]]))

``` python
a + b, (a + b).shape
```

    (tensor([[[0, 1],
              [1, 2],
              [2, 3]]]),
     torch.Size([1, 3, 2]))
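
The alignment rule can be checked without doing any arithmetic by using `torch.broadcast_shapes`, which returns the resulting shape or raises an error when the shapes are incompatible. A minimal sketch, assuming a PyTorch version that provides this function (1.8 or later):

``` python
import torch

# (3, 1) and (1, 1, 2) are aligned from the right:
#       3, 1
#    1, 1, 2
# -> 1, 3, 2
result_shape = torch.broadcast_shapes((3, 1), (1, 1, 2))
print(result_shape)

# Trailing sizes 2 and 3 are neither equal nor 1, so this fails.
try:
    torch.ones(3, 2) + torch.ones(3)
    broadcast_failed = False
except RuntimeError:
    broadcast_failed = True
print(broadcast_failed)
```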

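### Indexing and slicing

Indexing and slicing in PyTorch work the same way as in `numpy`. The result of basic indexing or slicing shares memory with the original tensor, whereas indexing with a list or a boolean mask returns a copy.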

``` python
X = torch.arange(12, dtype = torch.float32).view((3,4))
X
```

    tensor([[ 0.,  1.,  2.,  3.],
            [ 4.,  5.,  6.,  7.],
            [ 8.,  9., 10., 11.]])

``` python
X[-1], X[1:3]
```

    (tensor([ 8.,  9., 10., 11.]),
     tensor([[ 4.,  5.,  6.,  7.],
             [ 8.,  9., 10., 11.]]))

``` python
X[1, 2] = 9
X
```

    tensor([[ 0.,  1.,  2.,  3.],
            [ 4.,  5.,  9.,  7.],
            [ 8.,  9., 10., 11.]])

``` python
X[0:2, :] = 12
X
```

    tensor([[12., 12., 12., 12.],
            [12., 12., 12., 12.],
            [ 8.,  9., 10., 11.]])
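
Whether an indexing result shares memory with the original depends on the kind of index. The sketch below rebuilds `X` and compares basic indexing, which returns a view, with list and boolean-mask indexing, which return copies. The variable names are illustrative.

``` python
import torch

X = torch.arange(12, dtype=torch.float32).view(3, 4)

row = X[1]             # basic indexing: a view of X
row[0] = -1.0          # writes through to X[1, 0]

picked = X[[0, 2]]     # list indexing: a copy
picked[0, 0] = 99.0    # X[0, 0] is unchanged

above_five = X[X > 5]  # boolean mask: a 1-D copy of the selected elements

print(X[1, 0].item(), X[0, 0].item())
print(above_five)
```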


---

> Author: Aphros  
> URL: https://blog.papergate.top/posts/pytorch-%E5%BC%A0%E9%87%8F/