Manipulating the Shape of Matrices and Arrays

Last modified: May 19, 2025

This article is written in: 🇺🇸

Manipulating the Shape of Matrices and Arrays

In data manipulation and analysis, adjusting the shape or dimensionality of arrays and matrices is a common task. Reshaping allows you to reorganize data without altering its underlying values, making it suitable for various applications such as data preprocessing, machine learning model input preparation, and visualization. Understanding how to effectively reshape arrays using NumPy's versatile functions is crucial for efficient data handling and computational performance.

Mathematical Intro

In linear-algebra terms, an \$k\$-dimensional array (or tensor) is just an ordered collection of scalars indexed by a Cartesian product of finite sets. If

$$ \mathbf{A}=\bigl(a_{i_1,i_2,\dots ,i_k}\bigr) \quad\text{with}\quad
1\le i_j\le n_j,\; j=1,\dots ,k $$

we say that $shape(\mathbf{A})=(n_1,n_2,\dots ,n_k)$ and that the array lives in the vector space

$$ \mathbb{R}^{n_1\times n_2\times!\cdots!\times n_k} $$

Total size is invariant

The cardinality of the indexing set

$$ N\;=\;\prod_{j=1}^{k}n_j $$

equals the number of stored scalars. Any operation that merely reshapes an array must preserve this product; it is therefore a bijection

$$ \varphi:{1,\dots ,N}\;\longrightarrow\;{1,\dots ,m_1}\times\dots\times{1,\dots ,m_\ell} \quad\text{with}\quad\prod_{r=1}^{\ell}m_r=N, $$

where the left-hand side is usually taken to be the flattened (1-D) index. In NumPy, calling arr.reshape(m₁,…,m_ℓ) simply installs the mapping \$\varphi\$ without moving data in memory.

Rank-1 vs. Higher-Rank Views

A vector in \$\mathbb{R}^N\$ can be viewed as

a row vector: \$\mathbb{R}^{1\times N}\$
a column vector: \$\mathbb{R}^{N\times 1}\$

These are not new data structures—only different indexings of the same ordered list of \$N\$ scalars. Likewise, an RGB image of size \$H\times W\$ can be regarded as

a matrix in \$\mathbb{R}^{H\times 3W}\$ (channel-concatenated), or
a 3-tensor in \$\mathbb{R}^{H\times W\times 3}\$ (height × width × channel),

depending on what a downstream algorithm expects.

Reshaping as Vector-Space Isomorphism

Because reshaping is a bijection of index sets, it yields a linear isomorphism of vector spaces:

$$ \mathcal{R}:\mathbb{R}^{n_1\times!\cdots!\times n_k}\;\overset{\cong}{\longrightarrow}\;\mathbb{R}^{m_1\times!\cdots!\times m_\ell} $$

$$ \mathcal{R}\bigl(a_{i_1,\dots ,i_k}\bigr)=a_{\varphi^{-1}(j_1,\dots ,j_\ell)} $$

Hence no numerical information is lost or gained—only the interpretation of the indices changes.

The Basics of Reshaping

Reshaping an array involves changing its structure—such as the number of dimensions or the size of each dimension—while keeping the total number of elements unchanged. This transformation is vital for preparing data in the required format for different computational tasks.

import numpy as np
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Reshape to a 2x5 matrix
reshaped_arr = arr.reshape(2, 5)
print(reshaped_arr)

Expected output:

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]

arr.reshape(2, 5) transforms the original 1D array of 10 elements into a 2x5 matrix.
The product of the new dimensions (2 * 5) matches the total number of elements in the original array, ensuring a valid reshape.
Reshaping is useful when transitioning data between different formats, such as converting a flat list of features into a matrix suitable for machine learning algorithms that expect 2D input.

From One Dimension to Many

Transforming 1D arrays into multi-dimensional structures is a frequent requirement, especially when dealing with data that inherently possesses multiple dimensions, such as images or time-series data.

arr = np.array([1, 2, 3, 4, 5, 6])

# Convert to 1x6 row vector
row_vector = arr.reshape(1, -1)
print("Row Vector:\n", row_vector)

# Convert to 6x1 column vector
column_vector = arr.reshape(-1, 1)
print("\nColumn Vector:\n", column_vector)

Expected output:

Row Vector:
 [[1 2 3 4 5 6]]

Column Vector:
 [[1]
 [2]
 [3]
 [4]
 [5]
 [6]]

Using -1 in the reshape() method allows NumPy to automatically determine the appropriate dimension size based on the total number of elements.
arr.reshape(1, -1) converts the array into a row vector with 1 row and as many columns as needed.
arr.reshape(-1, 1) converts the array into a column vector with as many rows as needed and 1 column.
Reshaping arrays into row or column vectors is useful when performing matrix multiplications or when interfacing with libraries that require specific input shapes.

Higher-Dimensional Reshaping

Reshaping isn't limited to two dimensions; NumPy allows the creation of arrays with three or more dimensions, which are useful in more complex data representations like 3D models, color images, or time-series data across multiple sensors.

# Create a 1D array with 12 elements
arr = np.arange(12)

# Reshape to a 2x3x2 3D array
reshaped_3d = arr.reshape(2, 3, 2)
print("3D Array:\n", reshaped_3d)

Expected output:

3D Array:
 [[[ 0  1]
  [ 2  3]
  [ 4  5]]

 [[ 6  7]
  [ 8  9]
  [10 11]]]

np.arange(12) creates a 1D array with elements from 0 to 11.
arr.reshape(2, 3, 2) reshapes the array into a 3D structure with dimensions 2x3x2.
The total number of elements remains consistent (2 * 3 * 2 = 12).
3D reshaping is commonly used in image processing where images are represented as 3D arrays (height x width x color channels) or in processing volumetric data like medical scans.

Flattening Arrays

Converting multi-dimensional arrays back to a single dimension is known as flattening. This operation is useful when you need to preprocess data for algorithms that require input in a specific shape or when simplifying data for certain analyses.

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Using flatten
flat_arr = arr.flatten()
print("Using flatten:\n", flat_arr)

# Using reshape
one_d_arr = arr.reshape(-1)
print("\nUsing reshape:\n", one_d_arr)

Expected output:

Using flatten:
 [1 2 3 4 5 6]

Using reshape:
 [1 2 3 4 5 6]

arr.flatten() creates a copy of the original array in a 1D format.
arr.reshape(-1) reshapes the array into a 1D array without creating a copy, providing a view of the original data.
Flattening can be used when preparing data for machine learning models that expect input features as flat vectors or when performing certain types of statistical analyses that require data in a single dimension.

Practical Applications and Considerations

Reshaping arrays is a fundamental skill in data science and machine learning, facilitating the preparation and transformation of data to fit various computational models and visualization requirements. Here are some practical applications and important considerations when reshaping arrays:

Reshaping is used in data preparation to ensure the data conforms to the input shape requirements of machine learning models. For example, a list of pixel values can be converted into a 2D image matrix or a 3D tensor for convolutional neural networks.
During batch processing, data is organized into batches to facilitate efficient training and processing. An example is reshaping data into batches of samples to be input into a neural network.
Data augmentation often involves reshaping to create variations in datasets. For example, images can be flipped or rotated to increase the diversity of training data for better model generalization.
In memory management, reshaping can help optimize memory usage. For instance, using reshaped views is more memory-efficient than creating copies, as views avoid duplicating data in memory.
Matrix operations often require reshaping to align data structures for mathematical computations. For example, vectors may be reshaped into matrices to enable matrix multiplication or inversion in linear algebra tasks.

Reshape Examples and Edge Cases

Understanding both standard and edge case scenarios in reshaping helps in writing robust and error-free code.

Example: Invalid Reshaping

Attempting to reshape an array into an incompatible shape—where the total number of elements does not match—will raise an error. This ensures data integrity by preventing mismatched transformations.

arr = np.array([1, 2, 3, 4, 5, 6])

try:
    invalid_reshape = arr.reshape(3, 3)
except ValueError as e:
    print("Error:", e)

Expected output:

Error: cannot reshape array of size 6 into shape (3,3)

The original array has 6 elements.
Attempting to reshape it into a 3x3 matrix requires 9 elements (3 * 3), which is not possible, resulting in a ValueError.
This scenario emphasizes the importance of ensuring that the product of the new dimensions matches the total number of elements in the original array when reshaping.

Example: Reshaping for Machine Learning

Machine learning models often require data in specific shapes. For example, convolutional neural networks expect image data with channel dimensions.

# Example: Reshape a batch of images (28x28) to (batch_size, 28, 28, 1)
batch_size = 100
images = np.random.rand(batch_size, 28, 28)

# Reshape to include the channel dimension
images_reshaped = images.reshape(batch_size, 28, 28, 1)
print("Reshaped Images Shape:", images_reshaped.shape)

Expected output:

Reshaped Images Shape: (100, 28, 28, 1)

np.random.rand(batch_size, 28, 28) creates a batch of 100 grayscale images, each of size 28x28 pixels.
images.reshape(batch_size, 28, 28, 1) adds a channel dimension, converting the shape to (100, 28, 28, 1), which is required by many deep learning frameworks.
Adding or modifying dimensions is crucial when preparing image data for training convolutional neural networks, ensuring compatibility with the model's expected input shape.

Additional Reshaping Techniques

Beyond the basic reshape and flatten, NumPy offers other methods to manipulate array shapes effectively:

I. resize(): Changes the shape of an array in-place, which can alter the original array.

arr = np.array([1, 2, 3, 4, 5, 6])
arr.resize((2, 3))
print(arr)

Expected output:

[[1 2 3]
[4 5 6]]

II. swapaxes(): Swaps two axes of an array, useful for changing the orientation of multi-dimensional data.

arr = np.array([[1, 2, 3], [4, 5, 6]])
swapped = arr.swapaxes(0, 1)
print(swapped)

Expected output:

[[1 4]
[2 5]
[3 6]]

III. transpose(): Permutes the dimensions of an array, similar to swapaxes but more general.

arr = np.array([[1, 2, 3], [4, 5, 6]])
transposed = arr.transpose()
print(transposed)

Expected output:

[[1 4]
 [2 5]
 [3 6]]

These additional methods provide more flexibility in manipulating array dimensions, allowing for complex data transformations required in various computational tasks.
Transposing and swapping axes are commonly used in data preprocessing steps, such as preparing data for matrix multiplication or reorienting image data for different processing pipelines.

Summary Table for Manipulating Dimensions

This table summarizes various operations for manipulating the dimensions of arrays in NumPy, along with examples and descriptions of each operation.

Operation	Method/Function	Description	Example Code	Example Output
Reshape	`reshape(new_shape)`	Changes the shape of an array without altering its data. Total elements must remain the same.	`arr = np.array([1, 2, 3, 4, 5, 6])` `reshaped = arr.reshape(2, 3)`	`[[1 2 3]` `[4 5 6]]`
Row Vector	`reshape(1, -1)`	Converts a 1D array to a 1xN row vector.	`row_vector = arr.reshape(1, -1)`	`[[1 2 3 4 5 6]]`
Column Vector	`reshape(-1, 1)`	Converts a 1D array to an Nx1 column vector.	`column_vector = arr.reshape(-1, 1)`	`[[1]` `[2]` `[3]` `[4]` `[5]` `[6]]`
Flatten	`flatten()`	Flattens a multi-dimensional array into a 1D array, returns a copy.	`flat_arr = np.array([[1, 2, 3], [4, 5, 6]]).flatten()`	`[1 2 3 4 5 6]`
Flatten with Reshape	`reshape(-1)`	Flattens a multi-dimensional array into a 1D array, returns a view.	`one_d_arr = np.array([[1, 2, 3], [4, 5, 6]]).reshape(-1)`	`[1 2 3 4 5 6]`
3D Reshape	`reshape(new_shape)`	Converts a 1D array into a 3D array.	`reshaped_3d = np.arange(12).reshape(2, 3, 2)`	`[[[ 0 1]` `[ 2 3]` `[ 4 5]]` `[[ 6 7]` `[ 8 9]` `[10 11]]]`
Transpose	`transpose()`	Permutes the dimensions of an array.	`transposed = np.array([[1, 2, 3], [4, 5, 6]]).transpose()`	`[[1 4]` `[2 5]` `[3 6]]`
Resize	`resize(new_shape)`	Changes the shape and size of an array, modifying the array in-place.	`arr = np.array([1, 2, 3, 4, 5, 6])` `arr.resize(2, 3)`	`[[1 2 3]` `[4 5 6]]`
Expand Dimensions	`expand_dims(a, axis)`	Expands the shape of an array by inserting a new axis.	`expanded = np.expand_dims(np.array([1, 2, 3]), axis=0)`	`[[1 2 3]]`
Squeeze Dimensions	`squeeze()`	Removes single-dimensional entries from the shape of an array.	`squeezed = np.array([[[1], [2], [3]]]).squeeze()`	`[1 2 3]`