A matrix is a rectangular arrangement of numbers (elements) in rows and columns. An $m \times n$ matrix has $m$ rows and $n$ columns, and its dimensions are written as $m \times n$.
A vector norm is a function that assigns a non-negative value to a vector in an $n$-dimensional space, providing a quantitative measure of the vector’s length or magnitude. One commonly used vector norm is the Euclidean norm, also known as the $L^2$ norm, defined for a vector $\vec{x}$ in $\mathbb{R}^n$ as:
$$ \lVert \vec{x} \rVert_2 = \sqrt{\sum_{i=1}^n x_i^2} $$
where $x_i$ represents the components of the vector $\vec{x}$.
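As a quick illustration, the Euclidean norm of the vector $(3, 4)$ is $\sqrt{3^2 + 4^2} = 5$; the snippet below evaluates the definition directly and compares it against NumPy's built-in helper:

```python
import numpy as np

# Euclidean (L2) norm computed straight from the definition
# and with NumPy's built-in function.
x = np.array([3.0, 4.0])
print(np.sqrt(np.sum(x**2)))   # 5.0
print(np.linalg.norm(x))       # 5.0 (the L2 norm is the default for vectors)
```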
Matrix norms extend the concept of vector norms to matrices. A widely used matrix norm is the Frobenius norm, analogous to the Euclidean norm for vectors. For a matrix $M$ with dimensions $m \times n$, the Frobenius norm is defined as:
$$ \lVert M \rVert_F = \sqrt{\sum_{i=1}^m \sum_{j=1}^n (M_{ij})^2} $$
where $M_{ij}$ represents the elements of the matrix $M$.
Several matrix norms are commonly used in practice, each with unique properties and applications:
I. Frobenius Norm
Measures the “size” of a matrix in terms of the sum of the squares of its entries.
$$ \lVert M \rVert_F = \sqrt{\sum_{i=1}^m \sum_{j=1}^n (M_{ij})^2} $$
II. Spectral Norm
Also known as the operator 2-norm, it is the largest singular value of the matrix, which corresponds to the square root of the largest eigenvalue of $M^T M$.
$$ \lVert M \rVert_2 = \sigma_{\max}(M) $$
III. 1-Norm (Maximum Column Sum Norm)
The maximum absolute column sum of the matrix.
$$ \lVert M \rVert_1 = \max_{1 \le j \le n} \sum_{i=1}^m \lvert M_{ij} \rvert $$
IV. Infinity Norm (Maximum Row Sum Norm)
The maximum absolute row sum of the matrix.
$$ \lVert M \rVert_\infty = \max_{1 \le i \le m} \sum_{j=1}^n \lvert M_{ij} \rvert $$
NumPy provides functions to compute various matrix norms, making it easy to work with these concepts in Python.
The Frobenius norm can be computed using the `numpy.linalg.norm` function with the `'fro'` argument:
import numpy as np
A = np.array([[1, 2], [3, 4]])
frobenius_norm = np.linalg.norm(A, 'fro')
print("Frobenius Norm:", frobenius_norm)
Expected Output:
Frobenius Norm: 5.477225575051661
The spectral norm, or 2-norm, can be computed using the `numpy.linalg.norm` function with the `2` argument:
spectral_norm = np.linalg.norm(A, 2)
print("Spectral Norm:", spectral_norm)
Expected Output:
Spectral Norm: 5.464985704219043
The 1-norm can be computed by specifying the `1` argument in the `numpy.linalg.norm` function:
one_norm = np.linalg.norm(A, 1)
print("1-Norm:", one_norm)
Expected Output:
1-Norm: 6.0
The infinity norm can be computed using the `np.inf` argument in the `numpy.linalg.norm` function:
infinity_norm = np.linalg.norm(A, np.inf)
print("Infinity Norm:", infinity_norm)
Expected Output:
Infinity Norm: 7.0
Matrix norms are used in a variety of applications, including measuring approximation error, testing convergence of iterative methods, and bounding the results of matrix operations.
Consider a problem where we want to find a matrix $X$ that approximates another matrix $A$ while minimizing the Frobenius norm of the difference:
A = np.array([[1, 2], [3, 4]])
X = np.array([[0.9, 2.1], [3.1, 3.9]])
# Calculate the Frobenius norm of the difference
approx_error = np.linalg.norm(A - X, 'fro')
print("Approximation Error (Frobenius Norm):", approx_error)
Expected Output (approximately; the exact trailing digits depend on floating-point rounding):
Approximation Error (Frobenius Norm): 0.2
Many matrix norms, including the Frobenius norm and all induced (operator) norms, satisfy the sub-multiplicative property, a crucial characteristic in linear algebra and numerical analysis. This property is defined as follows:
$$ \lVert AB \rVert \leq \lVert A \rVert \, \lVert B \rVert $$
The sub-multiplicative property implies that the norm of the product of two matrices $A$ and $B$ is at most the product of the norms of the individual matrices. This property is significant because it helps in understanding the behavior of matrix operations and provides bounds for the results of these operations.
The distance between two matrices $A$ and $B$ can be defined as:
$$ d(A, B) = \lVert A - B \rVert $$
This metric measures how "far apart" two matrices are, which is particularly useful in iterative methods where convergence to a particular matrix is desired.
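As a small illustration (a toy iteration chosen here for simplicity, not a specific numerical method), this distance can serve as a convergence check:

```python
import numpy as np

# Toy iteration X_{k+1} = 0.5 * X_k, which converges to the zero matrix.
# The Frobenius norm d(X_k, limit) measures the remaining distance.
X = np.array([[1.0, 2.0], [3.0, 4.0]])
limit = np.zeros_like(X)
for k in range(5):
    X = 0.5 * X
    print(f"iteration {k + 1}: d(X, limit) = {np.linalg.norm(X - limit, 'fro'):.4f}")
```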
Let's verify the sub-multiplicative property with a practical example using NumPy.
A = np.array([[1, 2], [3, 4]])
B = np.array([[2, 0], [1, 2]])
# Calculate the norms
norm_A = np.linalg.norm(A, 'fro')
norm_B = np.linalg.norm(B, 'fro')
# Calculate the product of the norms
product_of_norms = norm_A * norm_B
# Calculate the norm of the matrix product
product_matrix = np.dot(A, B)
norm_product_matrix = np.linalg.norm(product_matrix, 'fro')
print("Norm of A:", norm_A)
print("Norm of B:", norm_B)
print("Product of Norms:", product_of_norms)
print("Norm of Product Matrix:", norm_product_matrix)
# Verify the sub-multiplicative property
assert norm_product_matrix <= product_of_norms, "Sub-multiplicative property violated!"
print("Sub-multiplicative property holds!")
Output:
Norm of A: 5.477225575051661
Norm of B: 3.0
Product of Norms: 16.431676725154983
Norm of Product Matrix: 14.0
Sub-multiplicative property holds!
Matrix multiplication is a fundamental operation in linear algebra, essential for various applications in science, engineering, computer graphics, and machine learning. The operation involves two matrices, where the number of columns in the first matrix must match the number of rows in the second matrix. The resulting matrix has dimensions determined by the rows of the first matrix and the columns of the second matrix.
Given an $m \times n$ matrix $M$ and an $n \times p$ matrix $N$, the product $P = M \times N$ is an $m \times p$ matrix. The elements of the resulting matrix $P$ are computed as follows:
$$ P_{ij} = \sum_{k=1}^n{M_{ik} N_{kj}} $$
where $P_{ij}$ is the entry in row $i$ and column $j$ of $P$, formed by pairing row $i$ of $M$ with column $j$ of $N$.
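To make the summation concrete, here is a pure-Python triple loop implementing the same formula (illustrative only; the optimized NumPy routines below should be preferred in practice):

```python
import numpy as np

M = np.array([[-4, 5], [1, 7], [8, 3]])         # 3 x 2
N = np.array([[3, -5, 2, 7], [-5, 1, -4, -3]])  # 2 x 4

m, n = M.shape
p = N.shape[1]
P = np.zeros((m, p), dtype=int)
for i in range(m):          # rows of M
    for j in range(p):      # columns of N
        for k in range(n):  # shared dimension
            P[i, j] += M[i, k] * N[k, j]
print(P)  # matches the np.dot(M, N) result shown below
```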
NumPy provides several methods to perform matrix multiplication:
`np.dot()`
The `np.dot()` function computes the dot product of two arrays. For 2-D arrays, it is equivalent to matrix multiplication.
import numpy as np
M = np.array([[-4, 5], [1, 7], [8, 3]])
N = np.array([[3, -5, 2, 7], [-5, 1, -4, -3]])
product = np.dot(M, N)
print(product)
Expected Output:
[[-37  25 -28 -43]
 [-32   2 -26 -14]
 [  9 -37   4  47]]
`@` Operator
The `@` operator is another way to perform matrix multiplication in Python 3.5+.
product = M @ N
print(product)
Expected Output:
[[-37  25 -28 -43]
 [-32   2 -26 -14]
 [  9 -37   4  47]]
`np.matmul()`
The `np.matmul()` function performs matrix multiplication for two arrays. Unlike `np.dot`, it broadcasts over leading batch dimensions and does not accept scalar arguments.
product = np.matmul(M, N)
print(product)
Expected Output:
[[-37  25 -28 -43]
 [-32   2 -26 -14]
 [  9 -37   4  47]]
Matrix multiplication can be computationally intensive, especially for large matrices. NumPy uses optimized libraries such as BLAS and LAPACK to perform efficient matrix multiplications. For very large datasets, leveraging these optimizations is crucial.
import numpy as np
# Create large random matrices
A = np.random.rand(1000, 500)
B = np.random.rand(500, 1000)
# Multiply using np.dot
result = np.dot(A, B)
print(result.shape)
Expected Output:
(1000, 1000)
For extremely large matrices, Strassen's algorithm can be used to reduce the computational complexity. Although NumPy does not implement Strassen's algorithm directly, understanding it can be beneficial for theoretical insights.
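For reference, here is a minimal sketch of Strassen's recursion, assuming square matrices whose size is a power of two and falling back to NumPy's `@` below a cutoff (the `leaf_size` value is an arbitrary choice for illustration):

```python
import numpy as np

def strassen(A, B, leaf_size=64):
    """Multiply square matrices (size a power of two) using Strassen's
    seven-multiplication recursion."""
    n = A.shape[0]
    if n <= leaf_size:  # small blocks: plain (BLAS-backed) multiplication
        return A @ B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    M1 = strassen(A11 + A22, B11 + B22, leaf_size)
    M2 = strassen(A21 + A22, B11, leaf_size)
    M3 = strassen(A11, B12 - B22, leaf_size)
    M4 = strassen(A22, B21 - B11, leaf_size)
    M5 = strassen(A11 + A12, B22, leaf_size)
    M6 = strassen(A21 - A11, B11 + B12, leaf_size)
    M7 = strassen(A12 - A22, B21 + B22, leaf_size)
    C = np.empty((n, n), dtype=A.dtype)
    C[:h, :h] = M1 + M4 - M5 + M7
    C[:h, h:] = M3 + M5
    C[h:, :h] = M2 + M4
    C[h:, h:] = M1 - M2 + M3 + M6
    return C

A = np.random.rand(128, 128)
B = np.random.rand(128, 128)
print(np.allclose(strassen(A, B), A @ B))  # True
```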
Transposing a matrix involves interchanging its rows and columns. The transpose of an $m \times n$ matrix results in an $n \times m$ matrix. This operation is fundamental in various applications, including solving linear equations, optimization problems, and transforming data.
For an $m \times n$ matrix $M$, the transpose of $M$, denoted as $M^T$, is an $n \times m$ matrix where the element at the $i$-th row and $j$-th column of $M$ becomes the element at the $j$-th row and $i$-th column of $M^T$.
Consider the matrix $M$:
import numpy as np
M = np.array([[-4, 5], [1, 7], [8, 3]])
print("Original Matrix:
", M)
print("Transpose of Matrix:
", M.T)
Expected output:
Original Matrix:
[[-4  5]
 [ 1  7]
 [ 8  3]]
Transpose of Matrix:
[[-4  1  8]
 [ 5  7  3]]
The determinant is a scalar value that is computed from a square matrix. It has significant applications in linear algebra, including solving systems of linear equations, computing inverses of matrices, and determining whether a matrix is invertible.
For a square matrix $A$, the determinant is denoted as $\text{det}(A)$ or $|A|$. For a $2 \times 2$ matrix:
$$ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} $$
The determinant is calculated as:
$$ \text{det}(A) = ad - bc $$
Consider the matrix $M$:
M = np.array([[-4, 5], [1, 7]])
det_M = np.linalg.det(M)
print("Determinant of M:", det_M)
Expected output:
Determinant of M: -33.0
I. The multiplicative property of determinants states that for any two square matrices $A$ and $B$ of the same size, the determinant of their product is equal to the product of their determinants: $\text{det}(AB) = \text{det}(A) \cdot \text{det}(B)$.
II. According to the transpose property, the determinant of a matrix is the same as the determinant of its transpose, represented as $\text{det}(A) = \text{det}(A^T)$.
III. The inverse property indicates that if $A$ is invertible, then the determinant of the inverse is the reciprocal of the determinant of the matrix, expressed as $\text{det}(A^{-1}) = \frac{1}{\text{det}(A)}$.
IV. For row operations on a matrix: swapping two rows changes the sign of the determinant, multiplying a row by a scalar $c$ multiplies the determinant by $c$, and adding a multiple of one row to another row leaves the determinant unchanged. These properties can be spot-checked numerically, as in the sketch below.
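A minimal sketch that verifies properties I–III on random matrices (random $3 \times 3$ matrices are almost surely invertible):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 3))
B = rng.random((3, 3))

# det(AB) = det(A) * det(B)
print(np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B)))  # True
# det(A) = det(A^T)
print(np.isclose(np.linalg.det(A), np.linalg.det(A.T)))  # True
# det(A^{-1}) = 1 / det(A)
print(np.isclose(np.linalg.det(np.linalg.inv(A)), 1 / np.linalg.det(A)))  # True
```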
The identity matrix, typically denoted as $I$, is a special square matrix with ones on its main diagonal and zeros in all other positions. It serves as the multiplicative identity in matrix operations, meaning any matrix multiplied by the identity matrix remains unchanged.
For an $n \times n$ identity matrix $I$:
$$ I = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} $$
Creating an identity matrix using NumPy:
I = np.eye(3)
print("Identity Matrix I:", I)
Expected output:
Identity Matrix I:
[[1. 0. 0.]
[0. 1. 0.]
[0. 0. 1.]]
A square matrix $A$ is said to have an inverse, denoted $A^{-1}$, if:
$$ A \times A^{-1} = A^{-1} \times A = I $$
The inverse matrix is crucial in solving systems of linear equations and various other applications.
Consider the matrix $M$:
M = np.array([[-4, 5], [1, 7]])
inv_M = np.linalg.inv(M)
print("Inverse of M:", inv_M)
Expected output:
[[-0.21212121 0.15151515]
[ 0.03030303 0.12121212]]
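The defining property can be verified directly: multiplying $M$ by its computed inverse should reproduce the identity matrix up to floating-point error.

```python
# Continuing from the snippet above: M @ inv_M should be (numerically) I
print(np.allclose(M @ inv_M, np.eye(2)))  # True
```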
The rank of a matrix provides insight into its structure and properties. Essentially, it is the number of linearly independent rows or columns present in the matrix. The rank can reveal information about the solutions of linear systems or the invertibility of a matrix.
Python's NumPy library offers a convenient function to compute the rank: `np.linalg.matrix_rank`.
Example:
import numpy as np
# Define a matrix
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Calculate its rank
rank_A = np.linalg.matrix_rank(A)
print("Rank of A:", rank_A)
Output:
Rank of A: 2
In this instance, the rank of matrix A is 2, suggesting that only 2 of its rows (or columns) are linearly independent.
A matrix's rank can indicate whether it's singular (non-invertible). A square matrix is singular if its rank is less than its size (number of rows or columns). Singular matrices don't possess unique inverses.
To check for singularity using rank:
import numpy as np
# Create a matrix, which is clearly singular due to linearly dependent rows
A = np.array([[1, 2], [2, 4]])
# Calculate the rank
rank_A = np.linalg.matrix_rank(A)
# Check for singularity
is_singular = "Matrix A is singular." if rank_A < A.shape[1] else "Matrix A is not singular."
print(is_singular)
Output:
Matrix A is singular.
By understanding the rank, one can determine the properties of a matrix and its ability to be inverted, which is crucial in numerous linear algebra applications.
| Operation | Purpose | Primary NumPy Call | Python Shorthand | Shape Rules |
|---|---|---|---|---|
| Dot / inner product | 1-D arrays → scalar (inner product); 2-D arrays → matrix product | `np.dot(a, b)` | `a @ b` | Last dim of `a` = second-to-last dim of `b` |
| Matrix product | General (broadcast-aware) matrix multiplication | `np.matmul(a, b)` | `a @ b` | Handles `(..., m, k) @ (..., k, n)` batched |
| Element-wise multiply | Hadamard product (same shape) | `a * b` | `*` | Broadcasting-compatible shapes |
| Transpose | Swap axes 0 and 1 (or any via `axes=`) | `a.T` or `np.transpose(a)` | — | For >2-D use `np.swapaxes`/`np.moveaxis` |
| Inverse | Matrix inverse (square, non-singular) | `np.linalg.inv(a)` | — | Prefer `np.linalg.solve(a, b)` for systems |
| Determinant | Scalar determinant of square matrix | `np.linalg.det(a)` | — | Ill-conditioned if `det(a) ≈ 0` |
| Rank | Numerical rank (≈ number of linearly independent rows) | `np.linalg.matrix_rank(a)` | — | Uses SVD under the hood |
| Trace | Sum of diagonal elements | `np.trace(a)` | — | Defaults to the first two axes (`axis1=0`, `axis2=1`) |
| Eigenvalues / vectors | Spectral decomposition (square) | `vals, vecs = np.linalg.eig(a)` | — | Use `np.linalg.eigvals(a)` for values only |
| SVD | Singular-value decomposition | `u, s, vh = np.linalg.svd(a)` | — | Robust for rectangular or rank-deficient matrices |
| Matrix power | Integer power $k$ (square) | `np.linalg.matrix_power(a, k)` | — | `k < 0` gives inverse powers |
Tips & Best Practices:
@
for readability and automatic broadcasting; it resolves to np.matmul
for ≥2-D inputs and to np.dot
for 1-D.solve
not inv
: to compute $x$ in $Ax=b$, x = np.linalg.solve(A, b)
is faster and stabler than np.linalg.inv(A) @ b
.np.linalg.cond(a)
before inverting or solving.scipy.sparse.linalg
counterparts to avoid excessive memory use.a.shape
—most dimension-mismatch bugs arise from overlooked trailing axes.
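As a sketch of the solve-not-inv tip, using a small assumed system $Ax = b$:

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])

x_solve = np.linalg.solve(A, b)      # LU-based solve; no explicit inverse formed
x_inv = np.linalg.inv(A) @ b         # works, but slower and less stable
print(x_solve)                       # [2. 3.]
print(np.allclose(x_solve, x_inv))   # True
```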