Last modified: September 19, 2022
Root-Finding in Numerical Methods
Root-finding algorithms aim to solve equations of the form
$$f(x) = 0$$
for a scalar (or vector) variable $x$. Although linear problems $ax + b = 0$ admit the straightforward solution $x = -b/a$, many practical applications involve nonlinear equations (polynomial or transcendental) where closed-form solutions may be unavailable or difficult to find. Consequently, numerical methods are indispensable for approximating roots.
Relevance and Applications
- Root-finding is used in engineering to determine operating points and equilibrium solutions in mechanical, electrical, or chemical systems.
- In physics, root-finding helps solve boundary value problems, locate energy eigenstates, or find intersection points in scattering problems.
- In computer science and machine learning, root-finding is involved in optimizing objective functions, which can reduce to solving $\nabla f(x) = 0$.
- Financial applications include calculating implied volatilities or internal rates of return, often involving transcendental equations that require root-finding.
- Solving nonlinear systems in higher dimensions, such as $\mathbf{f}(\mathbf{x}) = \mathbf{0}$, extends 1D root-finding methods (e.g., Newton's method generalizes to Newton–Raphson for multiple dimensions).
Basic Concepts
Root
A root $r$ of a function $f$ is a solution to
$$f(r) = 0.$$
Depending on $f$:
- Polynomial functions $f(x) = a_n x^n + \cdots + a_0$ may have multiple real or complex roots.
- Transcendental functions (e.g., $\sin x$, $e^x$, $\log x$) may have infinitely many roots or require special methods.
- Generalized exponents or radicals ($x^\pi$, $x^{3/5}$, $\sqrt{x}$, ...) introduce domain constraints or branch cuts.
Brackets and the Intermediate Value Theorem
A bracket $[a, b]$ is an interval such that
$$f(a)\,f(b) < 0.$$
Provided $f$ is continuous on $[a, b]$, the Intermediate Value Theorem (IVT) guarantees there is at least one root in $[a, b]$. This property underpins bracketing methods, ensuring that each iteration can reduce the interval size while preserving a sign change that encloses a root.
Note: If $f$ is not continuous, a sign change does not necessarily imply a root in that interval.
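To make the bracketing idea concrete, here is a minimal sketch in Python; the function name `scan_for_bracket` and the uniform-grid scanning strategy are illustrative choices, not a standard API:

```python
import math

def scan_for_bracket(f, x_min, x_max, steps=100):
    """Scan [x_min, x_max] on a uniform grid and return the first
    subinterval (a, b) where f changes sign, or None if none is found."""
    xs = [x_min + i * (x_max - x_min) / steps for i in range(steps + 1)]
    for a, b in zip(xs, xs[1:]):
        # Sign change: by the IVT, a continuous f has a root in [a, b].
        if f(a) * f(b) < 0:
            return a, b
    return None

# f(x) = cos(x) - x changes sign near x ~ 0.739.
print(scan_for_bracket(lambda x: math.cos(x) - x, 0.0, 2.0))
```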
Convergence
An iterative method generates a sequence of approximations $x_k$. If $x_k \to r$ as $k \to \infty$ (for some root $r$), we say the method converges to $r$. The rate of convergence is an important property:
- Linear convergence: $|x_{k+1} - r| \approx C\,|x_k - r|$ for some constant $C < 1$.
- Quadratic convergence: $|x_{k+1} - r| \approx K\,|x_k - r|^2$ for some constant $K > 0$.
For example, bisection converges linearly with $C = 0.5$, gaining about one binary digit of accuracy per step, while Newton's Method can exhibit quadratic convergence (under favorable conditions), roughly doubling the number of correct digits each iteration.
Tolerance
A tolerance $\varepsilon$ is chosen so that when
$$|x_{k+1} - x_k| < \varepsilon \quad \text{or} \quad |f(x_k)| < \delta,$$
we consider the root sufficiently approximated and stop. The choice of $\varepsilon$ and $\delta$ depends on the application's precision needs.
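In code, the two tests are typically combined into a single stopping check; the helper below is a minimal sketch with illustrative names and default values:

```python
def converged(x_new, x_old, f_new, eps=1e-10, delta=1e-12):
    """Stop when successive iterates are close (step test) or when
    the residual |f(x)| is small (residual test)."""
    return abs(x_new - x_old) < eps or abs(f_new) < delta
```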
Common Root-Finding Methods
Root-finding methods can be broadly classified into bracketing (guaranteed convergence under certain assumptions) and open methods (faster convergence but risk of divergence). Some methods combine both aspects.
Bracketing Methods
Bisection Method
- Start with an interval $[a_0, b_0]$ where $f(a_0)\,f(b_0) < 0$ to ensure the presence of a root.
- Calculate the midpoint $m = \frac{a_k + b_k}{2}$ of the current interval.
- If $f(a_k)\,f(m) < 0$, update the interval to $[a_{k+1}, b_{k+1}] = [a_k, m]$.
- If $f(a_k)\,f(m) \ge 0$, update the interval to $[a_{k+1}, b_{k+1}] = [m, b_k]$.
- The method halves the interval length in each iteration, leading to linear convergence with a ratio of $0.5$.
- After $n$ steps, the bracket size reduces to $\frac{b_0 - a_0}{2^n}$; see the sketch after this list.
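A direct translation of these steps into Python might look like the following sketch (parameter names and defaults are illustrative):

```python
import math

def bisection(f, a, b, tol=1e-10, max_iter=200):
    """Bisection: repeatedly halve a sign-changing bracket [a, b]."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        m = (a + b) / 2
        fm = f(m)
        if fm == 0 or (b - a) / 2 < tol:
            return m
        if fa * fm < 0:       # root lies in [a, m]
            b = m
        else:                 # root lies in [m, b]
            a, fa = m, fm
    return (a + b) / 2

print(bisection(lambda x: math.cos(x) - x, 0.0, 1.0))  # ~0.7390851
```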
False Position (Regula Falsi)
- Select an interval $[a, b]$ such that $f(a)\,f(b) < 0$, ensuring the presence of a root in the interval.
- Compute the interpolated point $x_{\text{interp}} = \frac{a\,f(b) - b\,f(a)}{f(b) - f(a)}$ using linear interpolation between $(a, f(a))$ and $(b, f(b))$.
- Evaluate $f(x_{\text{interp}})$ to check which subinterval contains the root.
- If $f(a)\,f(x_{\text{interp}}) < 0$, update the interval to $[a, x_{\text{interp}}]$.
- If $f(a)\,f(x_{\text{interp}}) \ge 0$, update the interval to $[x_{\text{interp}}, b]$.
- Repeat the computation of $x_{\text{interp}}$ and the interval updates until the stopping condition is met.
- Stop when $|b - a|$ or $|f(x_{\text{interp}})|$ is smaller than the predefined tolerance.
- The final result is the root approximation $x_{\text{interp}}$, with accuracy determined by the chosen tolerance; a sketch follows this list.
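A minimal sketch of regula falsi, mirroring the bisection code above (names are illustrative):

```python
def false_position(f, a, b, tol=1e-10, max_iter=200):
    """Regula falsi: like bisection, but split the bracket at the
    x-intercept of the line through (a, f(a)) and (b, f(b))."""
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    x = a
    for _ in range(max_iter):
        x = (a * fb - b * fa) / (fb - fa)  # linear interpolation
        fx = f(x)
        if abs(fx) < tol:
            return x
        if fa * fx < 0:       # root in [a, x]
            b, fb = x, fx
        else:                 # root in [x, b]
            a, fa = x, fx
    return x
```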
Open Methods
Open methods rely on derivative information or estimates and do not require an initial bracket. However, they can fail to converge if the initial guess is poorly chosen or the function behaves pathologically.
Newton’s Method
- Start with an initial guess $x_0$ for the root of $f(x) = 0$.
- Compute the next approximation using the formula $x_{k+1} = x_k - \frac{f(x_k)}{f'(x_k)}$.
- At each step, approximate $f(x)$ by the tangent line at $x_k$ and solve for the root of this tangent.
- The method exhibits quadratic convergence near simple roots where $f'(r) \ne 0$, as the error reduces quadratically with each iteration.
- The convergence rate satisfies $\lim_{k \to \infty} \frac{|x_{k+1} - r|}{|x_k - r|^2} = \text{constant}$.
- The method requires the derivative $f'(x)$, which must be computed or approximated if unavailable.
- If $f'(x_k) \approx 0$, the method can diverge or produce large, inaccurate jumps.
- Without a guaranteed bracket, the method may fail if the initial guess $x_0$ is too far from the actual root; see the sketch after this list.
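The iteration translates directly into code; the sketch below takes the derivative as a separate callable (all names and defaults are illustrative):

```python
import math

def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton's method: follow the tangent line at each iterate."""
    x = x0
    for _ in range(max_iter):
        fx, dfx = f(x), fprime(x)
        if dfx == 0:
            raise ZeroDivisionError("f'(x) vanished; pick another x0")
        x_new = x - fx / dfx          # root of the tangent line
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("did not converge; try a better initial guess")

# f(x) = cos(x) - x, f'(x) = -sin(x) - 1
print(newton(lambda x: math.cos(x) - x, lambda x: -math.sin(x) - 1, 1.0))
```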
Secant Method
- The idea is to approximate $f'(x_k)$ by a finite difference: $f'(x_k) \approx \frac{f(x_k) - f(x_{k-1})}{x_k - x_{k-1}}$.
- Substituting this into Newton's update gives the iteration $x_{k+1} = x_k - f(x_k)\,\frac{x_k - x_{k-1}}{f(x_k) - f(x_{k-1})}$.
- The method typically exhibits superlinear convergence with order approximately $1.618$, the golden ratio.
- This approach is useful because it avoids the need for an explicit derivative function, relying instead on previous function evaluations; see the sketch after this list.
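Replacing the derivative with the finite-difference slope yields the following sketch, which needs two starting points instead of one (names are illustrative):

```python
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    """Secant method: Newton's update with f' replaced by the
    slope of the secant through the last two iterates."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:
            raise ZeroDivisionError("flat secant; cannot continue")
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)
        if abs(x2 - x1) < tol:
            return x2
        x0, f0, x1, f1 = x1, f1, x2, f(x2)
    raise RuntimeError("did not converge")
```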
Combination Methods
- Start with a bracket $[a_0, b_0]$ such that $f(a_0)\,f(b_0) < 0$, ensuring a root exists within the interval.
- Evaluate $f(a)$ and $f(b)$, and initialize $c = b$ as the last valid root approximation.
- At each iteration, compute an interpolation candidate $x_{\text{interp}}$ using either secant or inverse quadratic interpolation based on the most recent points.
- Check whether $x_{\text{interp}}$ falls within the bracket $[a, b]$ and provides sufficient progress toward the root.
- If $x_{\text{interp}}$ is valid and improves the approximation, update the bracket to $[a, x_{\text{interp}}]$ or $[x_{\text{interp}}, b]$ based on the sign of $f(x_{\text{interp}})$.
- If $x_{\text{interp}}$ fails or is outside the bracket, perform a bisection step by taking the midpoint of the bracket as the next approximation.
- Update $c$ to the most recent approximation, ensuring $c$ always represents the last reliable approximation of the root.
- Repeat the process until the stopping condition is met, typically when the interval size or $|f(c)|$ is below a predefined tolerance.
- The final result is the approximation $c$, which satisfies the convergence criteria; a usage sketch follows this list.
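In practice this logic is rarely hand-rolled: SciPy's `scipy.optimize.brentq` implements Brent's method, which follows the scheme above and needs only a sign-changing bracket. A usage sketch:

```python
import math
from scipy.optimize import brentq

# Brent's method mixes bisection with secant/inverse quadratic
# interpolation steps internally, as described above.
root = brentq(lambda x: math.cos(x) - x, 0.0, 1.0, xtol=1e-12)
print(root)  # ~0.7390851
```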
Choosing a Root-Finding Method
- Continuity ensures that a bracketing method can reliably locate a root, as the existence of a sign change guarantees a solution within the interval.
- Differentiability allows for use of methods like Newton's, which rely on first derivatives to predict the root. If the derivative is unavailable, alternative methods such as secant or bisection are better options.
- Multiplicity of roots poses challenges for Newton’s method; when a root is not simple (e.g., $f(r) = 0$ but $f'(r) = 0$), the method can converge very slowly or fail entirely.
- Bracketing methods like bisection are robust because they systematically narrow the interval containing the root, but their linear convergence makes them slower for high-precision needs.
- Open methods such as Newton's or secant can converge much faster, often at a superlinear or quadratic rate, but they require good initial guesses to avoid divergence or incorrect results.
- Computational cost can escalate when functions or their derivatives are complex to evaluate, making derivative-free methods like secant attractive as they avoid calculating derivatives directly. However, they may require more iterations to achieve the same accuracy.
- Hybrid approaches combine reliability and speed by starting with a bracketing method like bisection to secure the root interval, then transitioning to faster methods like Newton's or secant once near the solution.
- Combination methods like Brent’s dynamically adapt between bracketing and open techniques, providing both robustness of convergence and faster refinement when conditions are favorable.
Example Flow of Decision
- Determine if $f(x)$ has an easily detectable sign change.
- If a sign change is present, use bracketing methods like bisection or false position to isolate the root.
- If no sign change is found, scan the domain or apply domain knowledge to guess a valid interval.
- Assess if $f'(x)$ is analytically or numerically cheap to compute.
- If derivatives are easy to compute, use Newton’s method for faster local convergence.
- If derivatives are not available or are expensive, use derivative-free methods like secant.
- Decide if reliability is a key priority for the solution.
- If reliability is critical, select robust methods like Brent’s, which combine bracketing and open steps; a dispatcher sketch follows this list.
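This decision flow can be summarized in a small dispatcher. The sketch below reuses the `bisection`, `newton`, and `secant` sketches from earlier sections; all parameter names are illustrative:

```python
def find_root(f, a=None, b=None, fprime=None, x0=None, tol=1e-10):
    """Illustrative dispatcher following the decision flow above."""
    if a is not None and b is not None and f(a) * f(b) < 0:
        # A valid bracket exists: use a robust bracketing method.
        return bisection(f, a, b, tol=tol)
    if fprime is not None and x0 is not None:
        # Cheap derivative available: Newton for fast local convergence.
        return newton(f, fprime, x0, tol=tol)
    if x0 is not None:
        # Derivative-free fallback: secant from two nearby guesses.
        return secant(f, x0, x0 + 1e-4, tol=tol)
    raise ValueError("need a bracket [a, b] or an initial guess x0")
```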