张量裁剪、索引与数据筛选、组合与拼接、切片

释放双眼，带上耳机，听听看~！

本文介绍了张量裁剪、索引与数据筛选、组合与拼接、切片的操作及应用，帮助读者更好地理解和使用Tensor库。

携手创作，共同成长！这是我参与「掘金日新计划 · 8 月更文挑战」的第20天，上一篇文章中，我们介绍的Tensor统计学相关的函数、分布函数、随机抽样、线性代数运算、矩阵分解——PCA、SVD分解——LDA。

今天，我们来介绍张量裁剪、索引与数据筛选、组合与拼接、切片。

$5.1 T e n sor 的裁剪运算$

对Tensor中的元素进行范围过滤
常用于梯度裁剪(gradient clipping)，即在发生梯度离散或者梯度爆炸时对梯度的处理
a.clamp(2, 10)

a = torch.rand(2, 2) * 10

print(a)
a = a.clamp(2, 5)

print(a)

运行结果：

tensor([[6.8261, 3.0515],
        [4.6355, 4.5499]])
tensor([[5.0000, 3.0515],
        [4.6355, 4.5499]])

对于大于5的，变成5，对于小于2的，变成2，在之间的，保持不变，将数据处理到 m <=a <= n

$5.2 T e n sor 的索引与数据筛选$

张量裁剪、索引与数据筛选、组合与拼接、切片

5.2.1 torch.where

a = torch.rand(4, 4)
b = torch.rand(4, 4)

print(a)
print(b)

out = torch.where(a > 0.5, a, b)

print(out)

运行结果：

tensor([[0.3398, 0.5239, 0.7981, 0.7718],
        [0.0112, 0.8100, 0.6397, 0.9743],
        [0.8300, 0.0444, 0.0246, 0.2588],
        [0.9391, 0.4167, 0.7140, 0.2676]])
tensor([[0.9906, 0.2885, 0.8750, 0.5059],
        [0.2366, 0.7570, 0.2346, 0.6471],
        [0.3556, 0.4452, 0.0193, 0.2616],
        [0.7713, 0.3785, 0.9980, 0.9008]])
tensor([[0.9906, 0.5239, 0.7981, 0.7718],
        [0.2366, 0.8100, 0.6397, 0.9743],
        [0.8300, 0.4452, 0.0193, 0.2616],
        [0.9391, 0.3785, 0.7140, 0.9008]])

5.2.2 torch.index_select

print("torch.index_select")
a = torch.rand(4, 4)
print(a)
out = torch.index_select(a, dim=0,
                   index=torch.tensor([0, 3, 2]))

print(out, out.shape)

运行结果：

torch.index_select
tensor([[0.4766, 0.1663, 0.8045, 0.6552],
        [0.1768, 0.8248, 0.8036, 0.9434],
        [0.2197, 0.4177, 0.4903, 0.5730],
        [0.1205, 0.1452, 0.7720, 0.3828]])
tensor([[0.4766, 0.1663, 0.8045, 0.6552],
        [0.1205, 0.1452, 0.7720, 0.3828],
        [0.2197, 0.4177, 0.4903, 0.5730]]) torch.Size([3, 4])

5.2.3 torch.gather

print("torch.gather")
a = torch.linspace(1, 16, 16).view(4, 4)

print(a)

out = torch.gather(a, dim=0,
             index=torch.tensor([[0, 1, 1, 1],
                                 [0, 1, 2, 2],
                                 [0, 1, 3, 3]]))
print(out)
print(out.shape)

定义了一个1-16的4×4的矩阵，进行gather按照列进行索引

运行结果：

torch.gather
tensor([[ 1.,  2.,  3.,  4.],
        [ 5.,  6.,  7.,  8.],
        [ 9., 10., 11., 12.],
        [13., 14., 15., 16.]])
tensor([[ 1.,  6.,  7.,  8.],
        [ 1.,  6., 11., 12.],
        [ 1.,  6., 15., 16.]])
torch.Size([3, 4])

在这里，gather和index_select不一样，对于index_select是对于[0，3，2]是每一列取0，3，2构造，对于gather就是，dim定义成0，就是按照列来进行索引，每一个值对应一个维度

5.2.4 torch.masked_index

print("torch.masked_index")
a = torch.linspace(1, 16, 16).view(4, 4)
mask = torch.gt(a, 8)
print(a)
print(mask)
out = torch.masked_select(a, mask)
print(out)

将a中大于8的值打印出来

运行结果：

torch.masked_index
tensor([[ 1.,  2.,  3.,  4.],
        [ 5.,  6.,  7.,  8.],
        [ 9., 10., 11., 12.],
        [13., 14., 15., 16.]])
tensor([[False, False, False, False],
        [False, False, False, False],
        [ True,  True,  True,  True],
        [ True,  True,  True,  True]])
tensor([ 9., 10., 11., 12., 13., 14., 15., 16.])

5.2.5 torch.take

print("torch.take")
a = torch.linspace(1, 16, 16).view(4, 4)

b = torch.take(a, index=torch.tensor([0, 15, 13, 10]))

print(b)

输出的就是拉成一个向量之后的对应的索引值

运行结果：

torch.take
tensor([ 1., 16., 14., 11.])

5.2.6 torch.nonzero

nonzero的作用：

如果Tensor对应的非零元素非常多，那么意味着它的Tensor就是非常稀疏的，
对于这样一个稀疏的矩阵，我们在表示的时候可以采用稠密的方式来进行表示，就是将所有的数据直接存储在我们当前的存储中，
我们还有一种表示方式，就是只表示当前非零元素和非零元素对应的位置，这个位置就是索引值，而我们可以通过nonzero将索引值拿出来，再根据索引值再将值取出来，将它们存储下来，这么保存就会更加节省空间。

print("torch.take")
a = torch.tensor([[0, 1, 2, 0], [2, 3, 0, 1]])
out = torch.nonzero(a)
print(out)

输出的就是对应的非零元素的索引值

运行结果：

torch.take
tensor([[0, 1],
        [0, 2],
        [1, 0],
        [1, 1],
        [1, 3]])

$5.3 T e n sor 的组合 / 拼接$

这是我们在搭建深度学习网络的时候必然会用到的方式。

张量裁剪、索引与数据筛选、组合与拼接、切片

对于cat，我输入的是三阶的张量，输出的也是三阶的张量，只是在张量不同维度上的维度值进行叠加
对于stack，在进行维度拼接的时候会扩展出来一个维度，而这里维度的参数就是指向了我们所要拼接好的新的Tensor扩展出来的维度所对应到的哪个维度上，具体看一下编程实例来理解
对于gather，在指定过维度索引之后，会对输入的数据进行索引，将索引到的值再聚合到一起去，最终得到一个新的Tensor
5.3.1 torch.cat

a = torch.zeros((2, 4))
b = torch.ones((2, 4))

out = torch.cat((a,b),dim=0)
print(out)

out = torch.cat((a,b),dim=1)
print(out)

运行结果：

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
tensor([[0., 0., 0., 0., 1., 1., 1., 1.],
        [0., 0., 0., 0., 1., 1., 1., 1.]])

5.3.2 torch.stack

print("torch.stack")
a = torch.linspace(1, 6, 6).view(2, 3)
b = torch.linspace(7, 12, 6).view(2, 3)
print(a, b)
out = torch.stack((a, b), dim=0)
print(out)
print(out.shape)

运行结果：

torch.stack
tensor([[1., 2., 3.],
        [4., 5., 6.]]) tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])
tensor([[[ 1.,  2.,  3.],
         [ 4.,  5.,  6.]],

        [[ 7.,  8.,  9.],
         [10., 11., 12.]]])
torch.Size([2, 2, 3])

当dim=2的时候

print("torch.stack")
a = torch.linspace(1, 6, 6).view(2, 3)
b = torch.linspace(7, 12, 6).view(2, 3)
print(a, b)
out = torch.stack((a, b), dim=2)
print(out)
print(out.shape)

print(out[:, :, 0])
print(out[:, :, 1])

运行结果：

torch.stack
tensor([[1., 2., 3.],
        [4., 5., 6.]]) tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])
tensor([[[ 1.,  7.],
         [ 2.,  8.],
         [ 3.,  9.]],

        [[ 4., 10.],
         [ 5., 11.],
         [ 6., 12.]]])
torch.Size([2, 3, 2])
tensor([[1., 2., 3.],
        [4., 5., 6.]])
tensor([[ 7.,  8.,  9.],
        [10., 11., 12.]])

$5.4 T e n sor 的切片$

张量裁剪、索引与数据筛选、组合与拼接、切片

平均就有一种可能就是平均不了，这时就会在最后一个维度进行小维度的切片，比如 5/2 就会切分成3，2

5.4.1 chunk

a = torch.rand((3, 4))
print(a)
out = torch.chunk(a, 2, dim=0)
print(out[0], out[0].shape)
print(out[1], out[1].shape)

运行结果：

tensor([[0.4521, 0.7974, 0.8710, 0.2167],
        [0.5701, 0.6740, 0.2146, 0.1090],
        [0.3028, 0.0716, 0.3060, 0.7585]])
tensor([[0.4521, 0.7974, 0.8710, 0.2167],
        [0.5701, 0.6740, 0.2146, 0.1090]]) torch.Size([2, 4])
tensor([[0.3028, 0.0716, 0.3060, 0.7585]]) torch.Size([1, 4])

5.4.1 split

a = torch.rand((10, 4))
print(a)
out = torch.split(a, 3, dim=0)
print(len(out))
for t in out:
    print(t.shape)

按照3来进行切片，3个一切，最终剩下多少就是多少，所以size应该是[3,4][3,4][3,4][1,4],这里的3就相当于chunks中的平均数

运行结果：

tensor([[0.7520, 0.2409, 0.0557, 0.2538],
        [0.2732, 0.9621, 0.8594, 0.2093],
        [0.6765, 0.0184, 0.0302, 0.8823],
        [0.4466, 0.8463, 0.0128, 0.7556],
        [0.9977, 0.5755, 0.1309, 0.5111],
        [0.4356, 0.2921, 0.9877, 0.3839],
        [0.3028, 0.0747, 0.0995, 0.0351],
        [0.9524, 0.5013, 0.5494, 0.8217],
        [0.3683, 0.4238, 0.9658, 0.7221],
        [0.5116, 0.6708, 0.0067, 0.4776]])
4
torch.Size([3, 4])
torch.Size([3, 4])
torch.Size([3, 4])
torch.Size([1, 4])

还可以指定list来切分，

out = torch.split(a, [1, 3, 6], dim=0)
for t in out:
    print(t.shape)

运行结果：

torch.Size([1, 4])
torch.Size([3, 4])
torch.Size([6, 4])

我们常用的就是split，因为split是可以完成chunk的方式的

张量裁剪、索引与数据筛选、组合与拼接、切片

本网站的内容主要来自互联网上的各种资源，仅供参考和信息分享之用，不代表本网站拥有相关版权或知识产权。如您认为内容侵犯您的权益，请联系我们，我们将尽快采取行动，包括删除或更正。

{{userData.name}}已认证

张量裁剪、索引与数据筛选、组合与拼接、切片

$5.1 T e n sor 的裁剪运算$

$5.2 T e n sor 的索引与数据筛选$

5.2.1 torch.where

5.2.2 torch.index_select

5.2.3 torch.gather

5.2.4 torch.masked_index

5.2.5 torch.take

5.2.6 torch.nonzero

$5.3 T e n sor 的组合 / 拼接$

5.3.1 torch.cat

5.3.2 torch.stack

$5.4 T e n sor 的切片$

5.4.1 chunk

5.4.1 split

PolyCoder: A Systematic Evaluation of Open-Source Code Language Models

基于风险驱动的交付在百度实践智能测试中的应用

GeoSpy.ai

Globe Explorer

即梦Dreamina

Luma Dream Machine

Motionshop

StoryDiffusion

归档

{{userData.name}}已认证

5.1Tensor的裁剪运算color{#135ce0}{5.1 Tensor的裁剪运算}5.1Tensor的裁剪运算

5.2Tensor的索引与数据筛选color{#135ce0}{5.2 Tensor的索引与数据筛选}5.2Tensor的索引与数据筛选

5.2.1 torch.where

5.2.2 torch.index_select

5.2.3 torch.gather

5.2.4 torch.masked_index

5.2.5 torch.take

5.2.6 torch.nonzero

5.3Tensor的组合/拼接color{#135ce0}{5.3 Tensor的组合/拼接}5.3Tensor的组合/拼接

5.3.1 torch.cat

5.3.2 torch.stack

5.4Tensor的切片color{#135ce0}{5.4 Tensor的切片}5.4Tensor的切片

5.4.1 chunk

5.4.1 split

PolyCoder: A Systematic Evaluation of Open-Source Code Language Models

基于风险驱动的交付在百度实践智能测试中的应用

Pandas快速入门：基础操作与常用工作流

PyTorch简明教程：四则运算和线性回归实践

Python sklearn库常用数据预处理方法详解

Python文本分类入门实战攻略

$5.1 T e n sor 的裁剪运算$

$5.2 T e n sor 的索引与数据筛选$

$5.3 T e n sor 的组合 / 拼接$

$5.4 T e n sor 的切片$