PyTorch Tensors

2025-12-30 2026-01-06

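Section 1 Overview

The content of this article comes mainly from 《动手学深度学习》 (Dive into Deep Learning) and 《深入浅出 Pytorch》. This series does not cover much deep learning theory; it focuses on how to use PyTorch for deep learning.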

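Section 2 Creating Tensors

A tensor represents an array of numerical values, and the array may have several dimensions. A tensor with one axis corresponds to a vector in mathematics; a tensor with two axes corresponds to a matrix; tensors with more than two axes have no special mathematical name.

In PyTorch, the following tensor classes can be used as constructors: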

class Tensor(torch._C.TensorBase): ...
class DoubleTensor(Tensor): ...  
class FloatTensor(Tensor): ...  
class BFloat16Tensor(Tensor): ...  
class LongTensor(Tensor): ...  
class IntTensor(Tensor): ...  
class ShortTensor(Tensor): ...  
class HalfTensor(Tensor): ...  
class CharTensor(Tensor): ...  
class ByteTensor(Tensor): ...  
class BoolTensor(Tensor): ...

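Tensors can be created directly through these constructors, but PyTorch also provides many factory functions that give finer control over how a tensor is created. In general, PyTorch recommends creating tensors with the factory functions, as follows: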
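As a minimal sketch of one practical difference, assuming a recent PyTorch release with the default dtype left at float32: the legacy `torch.Tensor` constructor always produces the default floating-point dtype, while the `torch.tensor` factory function infers the dtype from the data. The variable names are illustrative only.

```python
import torch

# Legacy constructor: always uses the default dtype (float32 unless changed).
legacy = torch.Tensor([1, 2, 3])

# Factory function: infers the dtype from the Python values.
inferred_int = torch.tensor([1, 2, 3])      # Python ints   -> int64
inferred_float = torch.tensor([1.0, 2.0])   # Python floats -> float32

# The dtype can also be requested explicitly.
explicit = torch.tensor([1, 2, 3], dtype=torch.float64)

print(legacy.dtype, inferred_int.dtype, inferred_float.dtype, explicit.dtype)
```

Being explicit about the dtype avoids surprises when integer data later meets floating-point operations.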

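2.1 Creating from Existing Data

Tensors can be created from a list or an np.ndarray.

torch.tensor always creates a new tensor by copying the data, so the result does not share memory with the original NumPy array.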

import torch
import numpy as np
torch.tensor([[1, 2], [3, 4]], dtype = torch.float32)

tensor([[1., 2.],
        [3., 4.]])

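torch.as_tensor avoids a copy whenever it can. If the input is an np.ndarray or a tensor whose dtype and device already match the requested ones, the result shares memory with it. If a different dtype is requested, or the target device is a GPU, a new tensor is created. A Python list, as in the example below, is always copied.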

torch.as_tensor([5, 6, 7])

tensor([5, 6, 7])

torch.from_numpy always shares memory with the source array, and it does not accept dtype or device arguments.

np_arr = np.array([1, 2, 3], dtype = np.float32)
torch.from_numpy(np_arr)

tensor([1., 2., 3.])
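The memory-sharing behaviour of the three functions can be checked with a small experiment. This is a sketch on the CPU only; the variable names are illustrative.

```python
import numpy as np
import torch

np_arr = np.array([1, 2, 3], dtype=np.float32)

copied = torch.tensor(np_arr)                              # always copies
shared_a = torch.as_tensor(np_arr)                         # same dtype, CPU: shares memory
shared_b = torch.from_numpy(np_arr)                        # always shares memory
converted = torch.as_tensor(np_arr, dtype=torch.float64)   # dtype differs: copies

np_arr[0] = 100.0  # modify the source array in place

print(copied[0].item())     # 1.0   (unaffected)
print(shared_a[0].item())   # 100.0 (sees the change)
print(shared_b[0].item())   # 100.0 (sees the change)
print(converted[0].item())  # 1.0   (unaffected)
```

Sharing memory avoids a copy, but it also means that in-place changes on either side are visible on the other.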

2.2 Constant Tensors

Tensors filled entirely with a constant value can be created.

torch.zeros((2, 3))

tensor([[0., 0., 0.],
        [0., 0., 0.]])

torch.ones((3, 2), dtype = torch.int64)

tensor([[1, 1],
        [1, 1],
        [1, 1]])

torch.full((2, 2), 3.14)

tensor([[3.1400, 3.1400],
        [3.1400, 3.1400]])

e = torch.eye(3, 4)
e

tensor([[1., 0., 0., 0.],
        [0., 1., 0., 0.],
        [0., 0., 1., 0.]])

torch.zeros_like(e)

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

torch.ones_like(e)

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

torch.full_like(e, -1.0)

tensor([[-1., -1., -1., -1.],
        [-1., -1., -1., -1.],
        [-1., -1., -1., -1.]])
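The `*_like` functions take their shape and dtype from the tensor they are given, and the dtype can be overridden. A minimal sketch, with `src` as an illustrative name:

```python
import torch

src = torch.ones((2, 3), dtype=torch.int64)

same = torch.zeros_like(src)                            # shape (2, 3), dtype int64
as_float = torch.zeros_like(src, dtype=torch.float32)   # same shape, different dtype
filled = torch.full_like(src, 7)                        # every element equals 7

print(same.shape, same.dtype)
print(as_float.dtype)
print(filled)
```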

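2.3 Uninitialized Tensors

An uninitialized tensor only has its memory allocated. The values are not initialized, so the tensor contains whatever happened to be in that memory: the values are arbitrary and must not be relied on.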

torch.empty((3, 3))

tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])

torch.empty_like(e)

tensor([[-3.9431e-34,  1.8133e-42,  0.0000e+00,  0.0000e+00],
        [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  0.0000e+00],
        [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  0.0000e+00]])
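One common use of an uninitialized tensor is as a preallocated output buffer that is fully overwritten before it is read. A minimal sketch, assuming the `out=` argument of `torch.add`; the names `a`, `b`, and `buf` are illustrative:

```python
import torch

a = torch.tensor([1.0, 2.0, 3.0])
b = torch.tensor([10.0, 20.0, 30.0])

buf = torch.empty(3)        # contents are arbitrary at this point
torch.add(a, b, out=buf)    # every element of buf is overwritten here

print(buf)  # tensor([11., 22., 33.])
```

Reading `buf` before the write would give meaningless values, which is why `torch.zeros` is the safer default when the buffer might not be filled completely.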

2.4 Sequence Tensors

Tensors can be generated from a specified sequence.

# Arithmetic sequence: from 1.0 up to but not including 5.0, with step 0.5
torch.arange(1.0, 5.0, 0.5)

tensor([1.0000, 1.5000, 2.0000, 2.5000, 3.0000, 3.5000, 4.0000, 4.5000])

# Linearly spaced: 5 evenly spaced points from 0 to 1 (the endpoint 1 is included)
torch.linspace(0, 1, steps=5)

tensor([0.0000, 0.2500, 0.5000, 0.7500, 1.0000])

# Logarithmically spaced: 4 points between 10^0 and 10^3
torch.logspace(0, 3, steps=4)

tensor([   1.,   10.,  100., 1000.])
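The endpoint handling is the main difference between these functions, and `torch.logspace` also accepts a `base` argument. A short sketch comparing them (variable names are illustrative):

```python
import torch

stepped = torch.arange(0.0, 1.0, 0.25)            # end excluded: 0, 0.25, 0.5, 0.75
spaced = torch.linspace(0.0, 1.0, steps=5)        # end included: 0, 0.25, 0.5, 0.75, 1
powers = torch.logspace(0, 3, steps=4, base=2)    # 2^0, 2^1, 2^2, 2^3

print(stepped.numel(), spaced.numel())
print(powers)
```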

2.5 Random Tensors

Tensors can be sampled from a specified distribution.

# Uniform distribution: sample from the uniform distribution on [0, 1), shape (2, 2)
torch.rand((2, 2))

tensor([[0.9802, 0.4825],
        [0.1893, 0.8265]])

# Normal distribution: sample from N(mean=0.0, std=0.5), producing a 1-D tensor
torch.normal(mean=0.0, std=0.5, size=(3,))

tensor([ 0.7416, -1.2993, -0.4609])

# Standard normal distribution: sample from N(0, 1), shape (2, 2)
torch.randn((2, 2))

tensor([[ 0.2501, -1.0411],
        [-0.4427,  1.6018]])

# Random integers in [0, 10), shape (3, 3)
torch.randint(0, 10, (3, 3))

tensor([[7, 4, 5],
        [5, 5, 9],
        [0, 2, 2]])

# Random permutation of 0 to 9 (commonly used to shuffle indices)
torch.randperm(10)

tensor([1, 8, 3, 4, 2, 0, 7, 6, 9, 5])
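The index-shuffling use of `torch.randperm` can be sketched as follows. The toy `features` and `labels` tensors and the seed value 0 are assumptions made for illustration; the same permutation is applied to both tensors so that each sample keeps its label.

```python
import torch

# Toy data: 5 samples with 2 features each; row i is [2*i, 2*i + 1].
features = torch.arange(10, dtype=torch.float32).reshape(5, 2)
labels = torch.arange(5)  # label i belongs to row i

# A seeded generator makes the shuffle reproducible (seed chosen arbitrarily).
gen = torch.Generator().manual_seed(0)
perm = torch.randperm(5, generator=gen)

# Indexing both tensors with the same permutation keeps rows and labels aligned.
shuffled_features = features[perm]
shuffled_labels = labels[perm]

print(perm)
print(shuffled_features)
print(shuffled_labels)
```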

Section 3 Tensor Operations

3.1 Basic Attributes

Tensors have the following commonly used basic attributes:

import torch
a = torch.ones(4,3)
a

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])

# Number of dimensions of the tensor
a.ndim

2

# Same as ndim
a.dim()

2

# Size of each dimension of the tensor
a.shape

torch.Size([4, 3])

# size() without arguments returns the same as shape; size(0) returns the size of dimension 0
a.size(0)

4

# Size of dimension 0
len(a)

4

# Total number of elements
a.numel()

12

# Data type of the elements in the tensor
a.dtype

torch.float32

# Device the tensor is stored on (CPU / GPU)
a.device

device(type='cpu')

print(torch.cuda.is_available())

True

a.to('cuda'), a.cpu(), a.cuda()

(tensor([[1., 1., 1.],
         [1., 1., 1.],
         [1., 1., 1.],
         [1., 1., 1.]], device='cuda:0'),
 tensor([[1., 1., 1.],
         [1., 1., 1.],
         [1., 1., 1.],
         [1., 1., 1.]]),
 tensor([[1., 1., 1.],
         [1., 1., 1.],
         [1., 1., 1.],
         [1., 1., 1.]], device='cuda:0'))
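The calls above require a CUDA device. A common device-agnostic pattern, sketched here with illustrative names, picks the device once and falls back to the CPU when no GPU is available:

```python
import torch

# Use the GPU when available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.ones(4, 3)
a_dev = a.to(device)                        # returns a itself if already on that device
b_dev = torch.zeros(4, 3, device=device)    # create directly on the target device

# Operands must be on the same device; move the result back to the CPU
# before converting it to NumPy, for example.
result = (a_dev + b_dev).cpu()
print(result.device)  # cpu
```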

# Whether autograd tracks this tensor for backpropagation
a.requires_grad

False

# Gradient stored after backpropagation (None here, since no backward pass has run)
a.grad

# Cut the tensor off from the autograd graph while sharing the same data
a.detach()

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
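How `requires_grad`, `grad`, and `detach()` interact can be seen in a small sketch. The tensor `w` and the toy loss are assumptions chosen so that the gradient is easy to verify by hand.

```python
import torch

w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
loss = (w ** 2).sum()   # d(loss)/dw = 2 * w

print(w.grad)           # None: backward() has not run yet

loss.backward()         # populate w.grad
print(w.grad)           # tensor([2., 4., 6.])

d = w.detach()          # no gradient tracking, same underlying storage
print(d.requires_grad)                  # False
print(d.data_ptr() == w.data_ptr())     # True
```

Because `d` shares storage with `w`, an in-place change to `d` also changes `w`; use `w.detach().clone()` when an independent copy is needed.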

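3.2 Reshaping

The shape of a tensor can be changed:

# Change the shape; the data is copied only when it cannot be expressed as a view (e.g. non-contiguous memory)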
a.reshape([2, -1])

tensor([[1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1.]])

# Change the shape without copying; requires a compatible (e.g. contiguous) memory layout
a.view([3, -1])

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

# Change the shape, then make an explicit copy
a.view([3, -1]).clone()

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

# Insert a dimension of size 1 at the given position
a.unsqueeze(0).shape, a.unsqueeze(1).shape, a.unsqueeze(2).shape

(torch.Size([1, 4, 3]), torch.Size([4, 1, 3]), torch.Size([4, 3, 1]))

# Remove dimensions of size 1
b = a.unsqueeze(0).clone()
b.shape, b.squeeze().shape

(torch.Size([1, 4, 3]), torch.Size([4, 3]))

# Swap two dimensions
b.shape, b.transpose(0,1).shape, b.transpose(1,2).shape

(torch.Size([1, 4, 3]), torch.Size([4, 1, 3]), torch.Size([1, 3, 4]))

# Reorder the dimensions arbitrarily
b.shape, b.permute(2,0,1).shape

(torch.Size([1, 4, 3]), torch.Size([3, 1, 4]))

# Transpose a matrix
a.T

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

# Flatten to 1-D
a.flatten()

tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
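The difference between `view` and `reshape` only shows up on non-contiguous tensors, such as the result of a transpose. A minimal sketch with illustrative names:

```python
import torch

t = torch.arange(6).reshape(2, 3)
tt = t.T                        # transposed view; its memory is not contiguous

print(tt.is_contiguous())       # False

# view cannot reinterpret this layout and raises an error.
try:
    tt.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True

flat = tt.reshape(6)            # reshape falls back to copying the data
flat2 = tt.contiguous().view(6) # explicit copy first, then view

print(view_failed)              # True
print(flat)                     # tensor([0, 3, 1, 4, 2, 5])
```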

3.3 Type Conversion

A tensor can be converted to other types:

type(a), type(a.numpy()), type(a.tolist())

(torch.Tensor, numpy.ndarray, list)

a.double().dtype, a.float().dtype, a.long().dtype, a.int().dtype, a.half().dtype

(torch.float64, torch.float32, torch.int64, torch.int32, torch.float16)
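Two details of these conversions are worth a quick check: for a CPU tensor, `numpy()` returns an array that shares memory with the tensor, while `tolist()` produces an independent Python list; and `to(dtype)` is the general form of shortcuts such as `double()`. A sketch with illustrative names:

```python
import torch

t = torch.ones(3)
arr = t.numpy()      # shares memory with the CPU tensor
lst = t.tolist()     # independent Python list

t[0] = 5.0           # in-place change to the tensor

print(arr[0])        # 5.0 (the array sees the change)
print(lst[0])        # 1.0 (the list does not)

d = t.to(torch.float64)   # same result as t.double()
print(d.dtype)            # torch.float64
```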

Single-element tensors support the following special conversions to Python scalars:

a = torch.tensor([3.5])
a, a.item(), float(a), int(a)

(tensor([3.5000]), 3.5, 3.5, 3)

3.4 Operators

Tensors support the following operations. Common operators are overloaded, so they can be written directly with symbols:

x = torch.tensor([1, 2, 3, 4, 5, 6], dtype=torch.float32).view(2, 3)
y = torch.tensor([1, 3, 5, 7, 9, 11], dtype=torch.float32).view(2, 3)

x + y, x - y, x * y, x / y, x ** y, x @ y.T

(tensor([[ 2.,  5.,  8.],
         [11., 14., 17.]]),
 tensor([[ 0., -1., -2.],
         [-3., -4., -5.]]),
 tensor([[ 1.,  6., 15.],
         [28., 45., 66.]]),
 tensor([[1.0000, 0.6667, 0.6000],
         [0.5714, 0.5556, 0.5455]]),
 tensor([[1.0000e+00, 8.0000e+00, 2.4300e+02],
         [1.6384e+04, 1.9531e+06, 3.6280e+08]]),
 tensor([[ 22.,  58.],
         [ 49., 139.]]))

# e^x
x.exp()

tensor([[  2.7183,   7.3891,  20.0855],
        [ 54.5981, 148.4132, 403.4288]])

# Mean
x.mean(), x.mean(axis = 0), x.mean(axis = 1)

(tensor(3.5000), tensor([2.5000, 3.5000, 4.5000]), tensor([2., 5.]))

# Variance (unbiased estimate with Bessel's correction by default)
x.var(), x.var(axis = 0), x.var(axis = 1)

(tensor(3.5000), tensor([4.5000, 4.5000, 4.5000]), tensor([1., 1.]))

# Standard deviation
x.std(), x.std(axis = 0), x.std(axis = 1)

(tensor(1.8708), tensor([2.1213, 2.1213, 2.1213]), tensor([1., 1.]))

# L1 norm
x.abs().sum()

tensor(21.)

# L2 norm
x.norm()

tensor(9.5394)
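The two norms can be cross-checked against their definitions. In this sketch `x` is redefined so that the block runs on its own: the L1 norm is the sum of absolute values, and the L2 norm is the square root of the sum of squares.

```python
import torch

x = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

l1 = x.abs().sum()                   # 1 + 2 + ... + 6 = 21
l2_manual = (x ** 2).sum().sqrt()    # sqrt(1 + 4 + 9 + 16 + 25 + 36) = sqrt(91)
l2_builtin = x.norm()                # L2 norm over all elements by default

print(l1, l2_manual, l2_builtin)
```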

# Sum
x.sum(), x.sum(axis = 0), x.sum(axis = 1)

(tensor(21.), tensor([5., 7., 9.]), tensor([ 6., 15.]))

# Cumulative sum
x.cumsum(axis = 0), x.cumsum(axis = 1)

(tensor([[1., 2., 3.],
         [5., 7., 9.]]),
 tensor([[ 1.,  3.,  6.],
         [ 4.,  9., 15.]]))

# Maximum
x.max(), x.max(axis = 0), x.max(axis = 1)

(tensor(6.),
 torch.return_types.max(
 values=tensor([4., 5., 6.]),
 indices=tensor([1, 1, 1])),
 torch.return_types.max(
 values=tensor([3., 6.]),
 indices=tensor([2, 2])))
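When a dimension is given, `max` returns a named tuple of values and indices, which can be unpacked or accessed by field name. A short sketch, with `x` redefined so the block is self-contained:

```python
import torch

x = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# Unpack the (values, indices) pair for the per-row maximum.
values, indices = x.max(dim=1)
print(values)    # tensor([3., 6.])
print(indices)   # tensor([2, 2])

# keepdim=True keeps the reduced dimension with size 1,
# which is convenient for broadcasting against x later.
col_max = x.max(dim=0, keepdim=True).values
print(col_max.shape)  # torch.Size([1, 3])
```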

# Minimum
x.min(), x.min(axis = 0), x.min(axis = 1)

(tensor(1.),
 torch.return_types.min(
 values=tensor([1., 2., 3.]),
 indices=tensor([0, 0, 0])),
 torch.return_types.min(
 values=tensor([1., 4.]),
 indices=tensor([0, 0])))

# Index of the maximum
x.argmax(), x.argmax(axis = 0), x.argmax(axis = 1)

(tensor(5), tensor([1, 1, 1]), tensor([2, 2]))

# Index of the minimum
x.argmin(), x.argmin(axis = 0), x.argmin(axis = 1)

(tensor(0), tensor([0, 0, 0]), tensor([0, 0]))

# Whether any element is True (nonzero)
x.any(), x.any(axis = 0), x.any(axis = 1)

(tensor(True), tensor([True, True, True]), tensor([True, True]))

# Whether all elements are True (nonzero)
x.all(), x.all(axis = 0), x.all(axis = 1)

(tensor(True), tensor([True, True, True]), tensor([True, True]))

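3.5 Concatenation

Tensors can be concatenated. The tensors being joined must have the same shape in every dimension except the one they are concatenated along.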

torch.cat((x, y), axis = 0), torch.cat((x, y), axis = 1)

(tensor([[ 1.,  2.,  3.],
         [ 4.,  5.,  6.],
         [ 1.,  3.,  5.],
         [ 7.,  9., 11.]]),
 tensor([[ 1.,  2.,  3.,  1.,  3.,  5.],
         [ 4.,  5.,  6.,  7.,  9., 11.]]))
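The shape requirement can be demonstrated with two tensors that agree only in their second dimension. A minimal sketch with illustrative names:

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(4, 3)

# Dimension 1 matches (3 == 3), so joining along dimension 0 works.
rows = torch.cat((a, b), dim=0)
print(rows.shape)  # torch.Size([6, 3])

# Joining along dimension 1 requires dimension 0 to match (2 != 4), so it fails.
try:
    torch.cat((a, b), dim=1)
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True

print(mismatch_raised)  # True
```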

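3.6 Broadcasting

Broadcasting aligns the shapes starting from the last dimension. Each pair of aligned dimensions must either be equal, or one of them must be 1 (or missing) before broadcasting.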

a = torch.arange(3).reshape((3, 1))
b = torch.arange(2).reshape((1, 1, 2))
a, b

(tensor([[0],
         [1],
         [2]]),
 tensor([[[0, 1]]]))

a + b, (a + b).shape

(tensor([[[0, 1],
          [1, 2],
          [2, 3]]]),
 torch.Size([1, 3, 2]))
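The alignment rule can be checked without performing any arithmetic. This sketch assumes a PyTorch version that provides `torch.broadcast_shapes` (1.8 or later); `expand` then makes the broadcast explicit without copying data.

```python
import torch

a = torch.arange(3).reshape(3, 1)       # shape (3, 1)
b = torch.arange(2).reshape(1, 1, 2)    # shape (1, 1, 2)

# Aligned from the right: (   3, 1)
#                         (1, 1, 2)  ->  (1, 3, 2)
shape = torch.broadcast_shapes(a.shape, b.shape)
print(shape)  # torch.Size([1, 3, 2])

# expand() produces views with the broadcast shape.
a_exp = a.expand(shape)
b_exp = b.expand(shape)
print(torch.equal(a_exp + b_exp, a + b))  # True

# Trailing dimensions 2 and 4 differ and neither is 1: not broadcastable.
try:
    torch.broadcast_shapes((3, 2), (3, 4))
    incompatible = False
except RuntimeError:
    incompatible = True

print(incompatible)  # True
```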

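3.7 Indexing and Slicing

Indexing and slicing in PyTorch work the same way as in NumPy. The result of basic indexing or slicing shares memory with the original tensor.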

X = torch.arange(12, dtype = torch.float32).view((3,4))
X

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]])

X[-1], X[1:3]

(tensor([ 8.,  9., 10., 11.]),
 tensor([[ 4.,  5.,  6.,  7.],
         [ 8.,  9., 10., 11.]]))

X[1, 2] = 9
X

tensor([[ 0.,  1.,  2.,  3.],
        [ 4.,  5.,  9.,  7.],
        [ 8.,  9., 10., 11.]])

X[0:2, :] = 12
X

tensor([[12., 12., 12., 12.],
        [12., 12., 12., 12.],
        [ 8.,  9., 10., 11.]])
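The memory sharing between a slice and its source can be observed directly. In this sketch `X` is rebuilt so the block runs on its own; `clone()` gives an independent copy, and indexing with a list of indices (advanced indexing) also returns a copy rather than a view.

```python
import torch

X = torch.arange(12, dtype=torch.float32).view(3, 4)

# Basic indexing returns a view: writing through it changes X.
row = X[0]
row[0] = -1.0
print(X[0, 0].item())   # -1.0

# clone() makes an independent copy: X is unaffected.
independent = X[1].clone()
independent[0] = 100.0
print(X[1, 0].item())   # 4.0

# Advanced indexing with a list of indices copies the selected rows.
picked = X[[0, 2]]
picked[0, 0] = 42.0
print(X[0, 0].item())   # -1.0
```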