
[Solved] RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

Last Updated on 2024-08-21 by Clay

Problem Description

When building deep learning models in PyTorch, adjusting the shapes of layers and input/output dimensions is something every AI engineer has to deal with. However, PyTorch's view() method has a small but interesting pitfall that can raise the following error:

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.


Intuitively, .view() never copies data: it only reinterprets the memory the tensor already occupies, so the requested shape must be compatible with the tensor's current size and stride. After certain operations such as .transpose() and .permute(), the memory layout of the tensor might no longer be contiguous, and merging dimensions of such a tensor with .view() is then generally not possible.
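To make the layout change visible, here is a minimal sketch that inspects a tensor before and after .permute(). The toy tensor and the names x, y, and z are illustrative assumptions, not part of the original example.

```python
import torch

# A small tensor so the strides are easy to read.
x = torch.arange(24).reshape(2, 3, 4)
print(x.is_contiguous())  # True
print(x.stride())         # (12, 4, 1)

# permute() returns a view over the same storage with reordered strides.
y = x.permute(0, 2, 1)
print(y.shape)            # torch.Size([2, 4, 3])
print(y.is_contiguous())  # False
print(y.stride())         # (12, 1, 4)

# contiguous() copies the data into a fresh row-major layout.
z = y.contiguous()
print(z.is_contiguous())  # True
print(z.stride())         # (12, 3, 1)
```

The permuted tensor y still points at the memory of x; only the strides changed, which is why .view() cannot always merge its dimensions.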


Solution

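Therefore, after applying methods like .transpose() or .permute(), if you need to use view() to reshape the tensor, first call .contiguous(), which copies the data into a contiguous memory layout if it is not already contiguous. Alternatively, as the error message suggests, you can call .reshape(), which returns a view when possible and copies the data otherwise.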

Below is an example that reproduces the error. Suppose that, after computing multi-head attention, we want to merge the heads back into the original hidden_size dimension.

import torch

batch_size = 16
seq_length = 512
num_head = 2
hidden_size = 768

inputs = torch.rand(batch_size, num_head, seq_length, hidden_size // num_head)
print("Shape:", inputs.shape)

inputs = inputs.permute(0, 2, 1, 3)
print("Permute Shape:", inputs.shape)

inputs = inputs.view(batch_size, seq_length, hidden_size)
print("Merge multi-head Shape:", inputs.shape)


Output:

Shape: torch.Size([16, 2, 512, 384])
Permute Shape: torch.Size([16, 512, 2, 384])
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/home/clay/Projects/machine_learning/transformers_from_scratch/analysis.ipynb Cell 12 line 1
     11 inputs = inputs.permute(0, 2, 1, 3)
     12 print("Permute Shape:", inputs.shape)
---> 14 inputs = inputs.view(batch_size, seq_length, hidden_size)
     15 print("Merge multi-head Shape:", inputs.shape)

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.


This issue arises because the tensor is not contiguous in memory. Once we apply .contiguous(), the reshaping proceeds smoothly.

import torch

batch_size = 16
seq_length = 512
num_head = 2
hidden_size = 768

inputs = torch.rand(batch_size, num_head, seq_length, hidden_size // num_head)
print("Shape:", inputs.shape)

# Wrong
# inputs = inputs.permute(0, 2, 1, 3)

# Correct
inputs = inputs.permute(0, 2, 1, 3).contiguous()
print("Permute Shape:", inputs.shape)

inputs = inputs.view(batch_size, seq_length, hidden_size)
print("Merge multi-head Shape:", inputs.shape)


Output:

Shape: torch.Size([16, 2, 512, 384])
Permute Shape: torch.Size([16, 512, 2, 384])
Merge multi-head Shape: torch.Size([16, 512, 768])


As we can see, the multi-head attention outputs have been successfully merged back together.
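The error message itself points to .reshape() as another option. The following is a minimal sketch, assuming smaller toy dimensions than the example above and illustrative variable names (head_dim, permuted, merged_view, merged_reshape), that compares it with .contiguous().view():

```python
import torch

# Toy dimensions (assumed for illustration, smaller than the example above).
batch_size, seq_length, num_head, hidden_size = 2, 4, 2, 8
head_dim = hidden_size // num_head

inputs = torch.rand(batch_size, num_head, seq_length, head_dim)
permuted = inputs.permute(0, 2, 1, 3)  # non-contiguous view

# Option 1: copy into a contiguous layout, then view.
merged_view = permuted.contiguous().view(batch_size, seq_length, hidden_size)

# Option 2: reshape() returns a view when possible and copies otherwise.
merged_reshape = permuted.reshape(batch_size, seq_length, hidden_size)

print(merged_reshape.shape)                       # torch.Size([2, 4, 8])
print(torch.equal(merged_view, merged_reshape))   # True
```

Both options give the same values here. .reshape() is shorter, while .contiguous().view() makes the copy explicit in the code.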

