Monte Carlo Integration

Last modified: January 03, 2020

This article is written in: 🇺🇸

Monte Carlo Integration

Monte Carlo integration is a numerical technique for approximating integrals using randomness. Rather than systematically sampling a function at predetermined points, as done in methods like the trapezoidal rule or Simpson’s rule, Monte Carlo methods rely on random samples drawn from a prescribed domain.

The key strength of Monte Carlo integration lies in its applicability to high-dimensional problems, where conventional deterministic methods suffer from the “curse of dimensionality” and become prohibitively complex or computationally expensive. In such high-dimensional spaces, Monte Carlo methods often provide a practical solution by producing an estimate whose accuracy improves as more random samples are drawn.

Conceptual Illustration:

Consider integrating a function $f(x)$ over an interval or a multidimensional region. You can think of Monte Carlo integration as throwing random “dart points” into the domain and using the average value of the function at these randomly chosen points to approximate the integral. The more darts you throw, the closer you get to the true area (integral), provided the sampling is unbiased:

output(32)

As you increase the number of random points (samples), your approximation generally improves.

Mathematical Formulation

Suppose we want to approximate the integral:

$$I = \int_D f(x)\, dx$$

where $D$ is the domain of integration and $x$ can be scalar or vector-valued (i.e., the integral can be 1D, 2D, or higher-dimensional).

Key Idea:

I. Let $V = \text{Volume}(D)$ be the volume (or length, area, hypervolume) of the domain $D$.

II. Draw $N$ independent random points $x_1, x_2, \ldots, x_N$ uniformly distributed within $D$.

III. Evaluate the function $f$ at these points and compute the average value:

$$\overline{f} = \frac{1}{N}\sum_{i=1}^N f(x_i)$$

IV. Estimate the integral as:

$$I \approx V \overline{f}$$

As $N \to \infty$, the Law of Large Numbers ensures that $\overline{f}$ converges to the true mean of $f$ over the domain, and thus $I$ converges to the true integral.

Derivation and Statistical Basis

Monte Carlo integration’s foundation lies in probability theory. If $X$ is a random variable uniformly distributed over $D$, then:

$$\mathbb{E}[f(X)] = \frac{1}{V}\int_D f(x)\, dx$$

Hence:

$$\int_D f(x)\, dx = V \mathbb{E}[f(X)]$$

By sampling $X_1, X_2, \ldots, X_N$ independently and identically distributed (i.i.d.) as $X$, we can approximate $\mathbb{E}[f(X)]$ by the sample mean:

$$\mathbb{E}[f(X)] \approx \frac{1}{N}\sum_{i=1}^N f(X_i)$$

Thus:

$$\int_D f(x)\, dx \approx V \frac{1}{N}\sum_{i=1}^N f(X_i)$$

The accuracy improves as $\sqrt{N}$, meaning the error decreases proportionally to $1/\sqrt{N}$.

Algorithm Steps

I. Identify the Domain and Volume:

Determine the region $D$ over which you need to integrate.
Compute or know the volume $V$ of $D$. For example, if $D=[a,b]$ in one dimension, then $V=b-a$. In higher dimensions, compute the product of side lengths for a hyper-rectangle, or use known formulas or methods for more complex domains.

II. Generate Random Points:

Generate $N$ random points $x_i$ uniformly distributed in $D$.
In one dimension, sample $x_i \in [a,b]$ uniformly.
In multiple dimensions, sample each coordinate from the appropriate range to cover the entire domain $D$.

III. Evaluate the Function:

Compute $f(x_i)$ for each random point $x_i$.

IV. Compute the Average:

Calculate $\overline{f} = \frac{1}{N}\sum_{i=1}^N f(x_i)$.

V. Estimate the Integral:

Multiply by the volume $V$:

$$I \approx V \overline{f}.$$

VI. Assess Accuracy:

If necessary, increase $N$ and repeat to improve accuracy.
The standard deviation of the estimator decreases as $1/\sqrt{N}$.

Example

1D Example: Estimate $\int_0^1 x^2 dx$.

Exact answer: $\int_0^1 x^2 dx = \frac{1}{3} \approx 0.3333.$

Monte Carlo Steps:

I. Domain $D=[0,1]$, volume $V=1$.

II. Let $N=1000$. Generate 1000 random points $x_i$ in [0,1].

III. Compute $f(x_i)= (x_i)^2$ for each $i$.

IV. Suppose after computation, $\frac{1}{1000}\sum_{i=1}^{1000} (x_i)^2 =0.3365.$ (This is just an example value.)

V. Since $V=1$, the integral estimate is $I \approx 0.3365.$

VI. With more points (e.g., $N=10^5$), we would expect the estimate to get closer to 0.3333.

Advantages

I. Dimensional Independence:

Monte Carlo methods handle high-dimensional integrals more easily than deterministic methods, whose complexity often grows exponentially with dimension.

II. Simplicity:

Easy to implement, no complex quadrature rules needed. Just random sampling and arithmetic.

III. Versatility:

Works with any integrable function and domain, including complex shapes, as long as uniform sampling is possible.

Limitations

I. Convergence Rate:

Monte Carlo integration converges as $1/\sqrt{N}$, which can be slower than some deterministic methods in low dimensions.

II. Variance and Accuracy:

To achieve high accuracy, a large $N$ may be required, increasing computational cost.

III. Randomness:

The result is a random variable. Each run may give slightly different answers unless a fixed random seed is used. Confidence intervals and variance reduction techniques (e.g., importance sampling, stratified sampling) are often employed.

Variants and Enhancements

Importance Sampling: Improves convergence by sampling more frequently in regions where the function contributes more to the integral.
Stratified, Latin Hypercube, and Quasi-Monte Carlo Sampling: Reduce variance by more clever sampling strategies.
Adaptive Methods: Adjust the sampling distribution on the fly to improve efficiency.

Monte Carlo Integration

Mathematical Formulation
Derivation and Statistical Basis
Algorithm Steps
Example
Advantages
Limitations
Variants and Enhancements