Last modified: February 17, 2022
This article is written in: 🇺🇸
Matrices are rectangular arrays of numbers organized into rows and columns, and they form the bedrock of numerical methods. They are used to solve systems of linear equations, transform geometric data, and carry out many of the algorithmic tasks at the heart of applied mathematics and engineering. The power of matrix operations comes from their ability to represent large data sets and complicated transformations compactly. Numerical methods built around matrices become essential whenever a problem calls for a systematic, repeatable solution procedure, as in high-dimensional computation or iterative optimization.
A matrix is denoted by an uppercase letter, while its elements are commonly denoted by the corresponding lowercase letter with subscripts indicating row and column position. For instance, a generic 3×3 matrix $A$ might look like this:
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$
One way to visualize a 3×3 matrix is with a diagram that shows the rows and columns:
```
         Column 1   Column 2   Column 3
Row 1      a11        a12        a13
Row 2      a21        a22        a23
Row 3      a31        a32        a33
```
Rows are horizontal sequences of elements, and columns are vertical sequences. This basic structure underpins all matrix-based numerical methods, from simple multiplication to more advanced decompositions.
Matrix operations include addition, multiplication, inversion, and determinant calculation. Each plays an important role in numerical algorithms.
When adding two matrices $A$ and $B$ of the same dimension, you add each element in $A$ to the corresponding element in $B$. This is expressed by:
$$C = A + B \quad \Longrightarrow \quad c_{ij} = a_{ij} + b_{ij}$$
In scalar multiplication, you multiply every element of a matrix by the same scalar:
$$C = kA \quad \Longrightarrow \quad c_{ij} = k \cdot a_{ij}$$
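As a quick check of both elementwise rules, here is a minimal NumPy sketch (the matrices and the scalar $k = 2.5$ are arbitrary examples):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])

C = A + B    # elementwise addition: c_ij = a_ij + b_ij
D = 2.5 * A  # scalar multiplication: d_ij = k * a_ij

print(C)  # [[ 6.  8.]  [10. 12.]]
print(D)  # [[ 2.5  5. ]  [ 7.5 10. ]]
```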
Multiplying two matrices of appropriate dimensions involves taking the dot product of each row of the first matrix with each column of the second. If $A$ is an $(m \times n)$ matrix and $B$ is an $(n \times p)$ matrix, then their product $C = AB$ is an $(m \times p)$ matrix where:
$$c_{ij} = \sum_{k=1}^{n} a_{ik} \, b_{kj}$$
A sketch for a 3×2 times 2×3 multiplication might look like this:
```
    A (3x2)              B (2x3)
[ a11  a12 ]        [ b11  b12  b13 ]
[ a21  a22 ]   x    [ b21  b22  b23 ]
[ a31  a32 ]
```

The product C = AB is a 3×3 matrix:

```
    [ c11  c12  c13 ]
C = [ c21  c22  c23 ]
    [ c31  c32  c33 ]
```
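To connect this sketch to the summation formula, the following minimal sketch implements $c_{ij} = \sum_k a_{ik} b_{kj}$ with explicit loops and checks the result against NumPy's built-in product (the entries of A and B are arbitrary):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])           # 3x2
B = np.array([[7.0, 8.0, 9.0],
              [10.0, 11.0, 12.0]])   # 2x3

m, n = A.shape
n2, p = B.shape
assert n == n2, "inner dimensions must agree"

# Direct implementation of c_ij = sum_k a_ik * b_kj
C = np.zeros((m, p))
for i in range(m):
    for j in range(p):
        for k in range(n):
            C[i, j] += A[i, k] * B[k, j]

print(np.allclose(C, A @ B))  # True: matches NumPy's built-in product
```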
An invertible matrix $A$ of dimension $n \times n$ has an inverse $A^{-1}$ such that:
$$A \, A^{-1} = A^{-1} \, A = I_n$$
where $I_n$ is the $n \times n$ identity matrix. Not every square matrix possesses an inverse; those that do not are called singular or degenerate.
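As a small illustration, assuming NumPy is available, this sketch inverts an arbitrary invertible 2×2 example and verifies that $A \, A^{-1}$ is numerically the identity:

```python
import numpy as np

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])

A_inv = np.linalg.inv(A)          # raises LinAlgError if A is singular
I = A @ A_inv                     # should be close to the 2x2 identity

print(np.allclose(I, np.eye(2)))  # True
```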
The determinant of a square matrix $A$ is a scalar value that provides information about certain properties of $A$, such as whether $A$ is invertible. For a 3×3 matrix:
$$\det(A) = a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31})$$
A non-zero determinant indicates that $A$ is invertible, while a zero determinant means that $A$ is singular.
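The 3×3 formula can be checked directly. The sketch below expands along the first row, exactly as written above, and compares the result with `np.linalg.det` (the example matrix is arbitrary):

```python
import numpy as np

A = np.array([[2.0, -1.0, 0.0],
              [1.0,  3.0, 2.0],
              [0.0,  1.0, 4.0]])

# Cofactor expansion along the first row, mirroring the formula above
det_manual = (A[0, 0] * (A[1, 1] * A[2, 2] - A[1, 2] * A[2, 1])
              - A[0, 1] * (A[1, 0] * A[2, 2] - A[1, 2] * A[2, 0])
              + A[0, 2] * (A[1, 0] * A[2, 1] - A[1, 1] * A[2, 0]))

print(det_manual, np.linalg.det(A))  # both should agree (up to rounding)
```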
Matrix decompositions break a matrix into products of simpler or special-purpose matrices. They help solve complicated problems systematically.
LU decomposition (or factorization) writes a square matrix $A$ as the product of a lower triangular matrix $L$ and an upper triangular matrix $U$. Symbolically:
$$A = L \, U$$
Matrix $L$ has all 1s on the main diagonal and zero entries above the diagonal, while $U$ has zeros below the diagonal. An ASCII depiction for a 3×3 case could look like this:
```
    [ a  b  c ]   [ 1    0    0 ]   [ u11  u12  u13 ]
A = [ d  e  f ] = [ l21  1    0 ] x [ 0    u22  u23 ]
    [ g  h  i ]   [ l31  l32  1 ]   [ 0    0    u33 ]
```
This method is a cornerstone for solving systems of linear equations of the form $Ax = b$: once $A$ is factored, the system reduces to a forward substitution $Ly = b$ followed by a backward substitution $Ux = y$, each of which is inexpensive. This is especially advantageous when the same $A$ must be solved against many right-hand sides.
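In practice, library routines perform the factorization with partial pivoting (effectively $PA = LU$, a mild generalization of the plain $A = LU$ above). Assuming SciPy is available, a minimal sketch of the factor-once, solve-cheaply pattern looks like this:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve  # assumes SciPy is installed

A = np.array([[4.0, 3.0, 0.0],
              [6.0, 3.0, 1.0],
              [0.0, 2.0, 5.0]])
b = np.array([7.0, 10.0, 7.0])

# Factor once (with partial pivoting), then reuse for any right-hand side
lu, piv = lu_factor(A)
x = lu_solve((lu, piv), b)

print(np.allclose(A @ x, b))  # True
```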
The singular value decomposition (SVD) factorizes any $m \times n$ matrix $A$ (whether square or rectangular) into three matrices:
$$A = U \, \Sigma \, V^T$$
Matrix $U$ is an orthogonal (or, in the complex case, unitary) $m \times m$ matrix, $\Sigma$ is an $m \times n$ diagonal matrix with nonnegative real numbers (the singular values) on the diagonal, and $V$ is an $n \times n$ orthogonal (or unitary) matrix. The superscript $T$ denotes the transpose. The SVD exposes the underlying structure of $A$, making it valuable in data compression, noise reduction, and feature extraction.
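A minimal NumPy sketch, using an arbitrary 2×3 rectangular example, computes the three factors and verifies the reconstruction $A = U \, \Sigma \, V^T$:

```python
import numpy as np

A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])   # an arbitrary 2x3 rectangular example

U, s, Vt = np.linalg.svd(A, full_matrices=True)  # s holds the singular values

# Rebuild the 2x3 Sigma from the singular values, then reconstruct A
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

print(np.allclose(A, U @ Sigma @ Vt))  # True
```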
QR decomposition takes a matrix $A$ of size $m \times n$ (often $m \geq n$) and expresses it as:
$$A = Q \, R$$
Matrix $Q$ is $m \times m$ and orthogonal, which means $Q^T Q = I$. Matrix $R$ is $m \times n$ and upper triangular. This decomposition appears in algorithms for solving least squares problems and in eigenvalue computations. Transforming a matrix into upper-triangular form via orthogonal operations is numerically attractive because orthogonal transformations preserve norms and therefore do not amplify rounding errors.
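As an illustration, the sketch below fits a straight line to four arbitrary data points via QR. Note that NumPy's `np.linalg.qr` returns the reduced factorization by default ($Q$ is $m \times n$ rather than the full $m \times m$ described above), which is exactly what least squares needs:

```python
import numpy as np

# Overdetermined system (m > n): fit y = c0 + c1 * t in the least
# squares sense using a QR factorization of the design matrix.
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 1.9, 3.2, 3.9])
A = np.column_stack([np.ones_like(t), t])   # 4x2 design matrix

Q, R = np.linalg.qr(A)           # reduced QR: Q is 4x2, R is 2x2
c = np.linalg.solve(R, Q.T @ y)  # solve the small system R c = Q^T y

print(c)  # approximately [1.07, 0.97]: intercept and slope
```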
The Power Method is an iterative approach for finding the largest eigenvalue and the associated eigenvector of a square matrix $A$. You start with an initial vector $x^{(0)}$ and repeatedly apply the matrix to this vector, normalizing each time:
$$x^{(k+1)} = \frac{A \, x^{(k)}}{\|A \, x^{(k)}\|}$$
Provided $A$ has a single dominant eigenvalue and $x^{(0)}$ has a nonzero component along the corresponding eigenvector, $x^{(k)}$ converges to the eigenvector associated with that dominant eigenvalue. Numerically, the eigenvalue can be approximated by the Rayleigh quotient:
$$\lambda_{\text{approx}} = \frac{(x^{(k)})^T \, A \, x^{(k)}}{(x^{(k)})^T \, x^{(k)}}$$
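A minimal sketch of the iteration, applied to an arbitrary symmetric 2×2 example whose dominant eigenvalue is $(5+\sqrt{5})/2 \approx 3.618$, might look like this (the `num_iters`, `tol`, and random-start choices are illustrative, not prescribed):

```python
import numpy as np

def power_method(A, num_iters=100, tol=1e-10, seed=0):
    """Approximate the dominant eigenvalue/eigenvector of A by power iteration."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])
    x /= np.linalg.norm(x)
    lam = 0.0
    for _ in range(num_iters):
        y = A @ x
        x_new = y / np.linalg.norm(y)     # normalize each iterate
        lam_new = x_new @ A @ x_new       # Rayleigh quotient (x_new is unit length)
        if abs(lam_new - lam) < tol:      # stop when the estimate stabilizes
            return lam_new, x_new
        x, lam = x_new, lam_new
    return lam, x

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
lam, v = power_method(A)
print(lam)  # close to the dominant eigenvalue, about 3.618
```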