Tensor | Working with the Tensor data type in PyTorch

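A PyTorch Tensor is a data type for handling tensors, that is, multidimensional arrays. It closely resembles the NumPy array type, and many of the methods that NumPy arrays provide are also available on Tensor. As with NumPy arrays, you can take the mean of a Tensor along a particular dimension or slice out part of its data. The main difference is that Tensor also supports computation on a GPU.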
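
As a minimal sketch of the per-dimension mean mentioned above, the following uses made-up 2x3 example data. Floating-point values are assumed here because mean is not defined for integer tensors.

```python
import torch

# Hypothetical example data (2 rows, 3 columns), not taken from the article.
x = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

col_mean = x.mean(dim=0)  # average over the rows: one value per column
row_mean = x.mean(dim=1)  # average over the columns: one value per row

print(col_mean)
## tensor([2.5000, 3.5000, 4.5000])
print(row_mean)
## tensor([2., 5.])
```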

Creating a Tensor

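A Tensor is created by converting a list or a NumPy array. To convert a list, use the torch.tensor function. The numeric type is then chosen automatically. To specify it explicitly, use the dtype argument, just as when creating a NumPy array.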

import torch
import numpy as np

x = torch.tensor([[1, 2], [3, 4]])
print(x)
## tensor([[1, 2],
##         [3, 4]])

y = torch.tensor([[1, 2], [3, 4]], dtype=torch.float64)
print(y)
## tensor([[1., 2.],
##         [3., 4.]], dtype=torch.float64)
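
The automatic type selection can be checked through the dtype attribute. The sketch below assumes the default floating-point type has not been changed with torch.set_default_dtype.

```python
import torch

int_t = torch.tensor([1, 2])           # only Python ints
float_t = torch.tensor([1.0, 2])       # at least one Python float
bool_t = torch.tensor([True, False])   # only Python bools

print(int_t.dtype)
## torch.int64
print(float_t.dtype)
## torch.float32
print(bool_t.dtype)
## torch.bool
```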

To convert a NumPy array, use the torch.from_numpy function as shown below. The resulting Tensor inherits the element type of the NumPy array.

import torch
import numpy as np

y = np.array([[1, 2], [3, 4]])
x = torch.from_numpy(y)
print(x)
## tensor([[1, 2],
##         [3, 4]])

import torch
import numpy as np

y = np.array([[1, 2], [3, 4]], dtype=np.float64)
x = torch.from_numpy(y)
print(x)
## tensor([[1., 2.],
##         [3., 4.]], dtype=torch.float64)
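
To illustrate how the type carries over, this small sketch converts arrays of a few NumPy dtypes and prints the resulting Tensor dtype. The three dtypes are chosen only as examples.

```python
import numpy as np
import torch

for np_dtype in (np.int16, np.float32, np.float64):
    arr = np.array([1, 2, 3], dtype=np_dtype)
    t = torch.from_numpy(arr)
    print(arr.dtype, '->', t.dtype)
## int16 -> torch.int16
## float32 -> torch.float32
## float64 -> torch.float64
```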

Accessing Tensor elements

As with NumPy arrays, you can pick out a single element with an index, or a contiguous range of elements with a slice.

import torch

x = torch.tensor([[1, 2], [3, 4], [5, 6]])
print(x)
## tensor([[1, 2],
##         [3, 4],
##         [5, 6]])

print(x[0, 0])
## tensor(1)

print(x[:, 1:2])
## tensor([[2],
##         [4],
##         [6]])
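
A few other NumPy-style indexing forms also work. The sketch below reuses the same x and is illustrative rather than exhaustive. Note that x[0, 0] above is a 0-dimensional tensor, and item converts it to a plain Python number.

```python
import torch

x = torch.tensor([[1, 2], [3, 4], [5, 6]])

last_row = x[-1]        # negative index: the last row
first_col = x[:, 0]     # an integer index drops that dimension, shape (3,)
large = x[x > 2]        # a boolean mask returns a flattened 1-D tensor
value = x[0, 0].item()  # 0-dim tensor converted to a Python int

print(last_row)
## tensor([5, 6])
print(first_col)
## tensor([1, 3, 5])
print(large)
## tensor([3, 4, 5, 6])
print(value)
## 1
```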

Tensor operations

The arithmetic operations available for NumPy arrays can be applied to Tensors in the same way.

import torch

a = torch.tensor([1, 2])
b = torch.tensor([10, 20])

d = a + 1
print(d)
## tensor([2, 3])

d = a + b
print(d)
## tensor([11, 22])

d = a * b
print(d)
## tensor([10, 40])
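
Beyond element-wise arithmetic, matrix products, broadcasting, and reductions also follow NumPy conventions. This is a minimal sketch. The matrix m is a made-up example.

```python
import torch

a = torch.tensor([1, 2])
b = torch.tensor([10, 20])
m = torch.tensor([[2, 1], [1, 3]])  # hypothetical 2x2 matrix

dot = a @ b            # dot product of two 1-D tensors
mv = m @ a             # matrix-vector product
shifted = m + a        # broadcasting: a is added to every row of m
col_sum = m.sum(dim=0) # sum over the rows, one value per column

print(dot)
## tensor(50)
print(mv)
## tensor([4, 7])
print(shifted)
## tensor([[3, 3],
##         [2, 5]])
print(col_sum)
## tensor([3, 4])
```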

Moving Tensors between CPU and GPU

When the torch.tensor function is called with its default arguments, the Tensor is created in CPU memory.

import torch

a = torch.tensor([1, 2])
print(a)
## tensor([1, 2])

If a GPU is available, passing the device argument creates the tensor in GPU memory.

import torch

device = torch.device('cuda:0')

a = torch.tensor([1, 2, 3, 4], device=device)
print(a)
## tensor([1, 2, 3, 4], device='cuda:0')
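
The example above fails on a machine without a CUDA-capable GPU. One common pattern, sketched here as a suggestion rather than as part of the original example, is to pick the device at run time with torch.cuda.is_available and fall back to the CPU.

```python
import torch

# Use the first GPU if PyTorch can see one, otherwise stay on the CPU.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

a = torch.tensor([1, 2, 3, 4], device=device)
print(a.device)  # the output depends on the machine
```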

Moving from CPU to GPU

To move a tensor from CPU memory to GPU memory, use the tensor's to method.

import torch

device = torch.device('cuda:0')

a = torch.tensor([1, 2, 3, 4])
print(a)
## tensor([1, 2, 3, 4])

a = a.to(device)
print(a)
## tensor([1, 2, 3, 4], device='cuda:0')
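
The to method does not modify the tensor in place, which is why the result is assigned back to a above. The following sketch shows this on the CPU so that it runs without a GPU. The GPU branch is skipped when CUDA is unavailable.

```python
import torch

a = torch.tensor([1, 2, 3, 4])

b = a.to('cpu')           # already on the CPU with the same dtype
print(b is a)             # to returns the same object in this case
## True

c = a.to(torch.float64)   # to can also change the dtype, giving a new tensor
print(c is a)
## False
print(a.dtype, c.dtype)
## torch.int64 torch.float64

if torch.cuda.is_available():
    g = a.to('cuda:0')    # a copy on the GPU; a itself stays on the CPU
    print(a.device, g.device)
```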

Moving from GPU to CPU

To move a tensor from GPU memory to CPU memory, use the cpu method.

import torch

device = torch.device('cuda:0')

a = torch.tensor([1, 2, 3, 4], device=device)
print(a)
## tensor([1, 2, 3, 4], device='cuda:0')

a = a.cpu()
print(a)
## tensor([1, 2, 3, 4])

Converting between Tensor and NumPy arrays

When a Tensor is on the CPU, it can be converted to and from a NumPy array. To convert a Tensor to a NumPy array, use the numpy method. Conversely, to convert an array to a Tensor, use from_numpy, introduced earlier.

When both are on the CPU, the Tensor and the NumPy array share the same memory, so changing one also changes the other.

import torch
import numpy as np

x = torch.tensor([[1, 2], [3, 4]])
y = x.numpy()
print(x)
## tensor([[1, 2],
##         [3, 4]])
print(y)
## [[1 2]
##  [3 4]]


x[0, 0] = 0
print(x)
## tensor([[0, 2],
##         [3, 4]])
print(y)
## [[0 2]
##  [3 4]]


y[1, 1] = 0
print(x)
## tensor([[0, 2],
##         [3, 0]])
print(y)
## [[0 2]
##  [3 0]]



Memory is shared in the same way when a Tensor is created from a NumPy array, as shown below.

import torch
import numpy as np

y = np.array([[1, 2], [3, 4]])
x = torch.from_numpy(y)
print(x)
## tensor([[1, 2],
##         [3, 4]])


y[0, 0] = 0
print(x)
## tensor([[0, 2],
##         [3, 4]])
print(y)
## [[0 2]
##  [3 4]]


x[1, 1] = 0
print(x)
## tensor([[0, 2],
##         [3, 0]])
print(y)
## [[0 2]
##  [3 0]]
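
When the sharing is not wanted, the data can be copied instead. As a minimal sketch: torch.tensor copies its input, and calling clone before numpy gives an array that does not alias the tensor. The variable names here are illustrative only.

```python
import numpy as np
import torch

y = np.array([[1, 2], [3, 4]])

shared = torch.from_numpy(y)  # shares memory with y
copied = torch.tensor(y)      # torch.tensor copies the data

y[0, 0] = 0
print(shared[0, 0].item())
## 0
print(copied[0, 0].item())
## 1

x = torch.tensor([1, 2])
z = x.clone().numpy()         # clone first, so z does not alias x
x[0] = 100
print(z)
## [1 2]
```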

When a Tensor is on the GPU, it must first be moved to the CPU before it can be converted to a NumPy array.

import torch
import numpy as np

device = torch.device('cuda:0')
x = torch.tensor([1, 2, 3, 4], device=device)

y = x.cpu().numpy()
print(y)
## [1 2 3 4]
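
A related case is a tensor that tracks gradients: calling numpy on it directly raises an error, so detach is needed first. The helper to_numpy below is a hypothetical convenience function, not part of PyTorch, sketched to cover both situations. It runs on the CPU as written.

```python
import torch

def to_numpy(t):
    """Hypothetical helper: return a NumPy array for a tensor on any device."""
    # detach() drops autograd tracking; cpu() returns the tensor itself
    # when it is already in CPU memory.
    return t.detach().cpu().numpy()

w = torch.tensor([1.0, 2.0], requires_grad=True)
result = to_numpy(w * 2)
print(result)
## [2. 4.]
```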