Root finding algorithm basics

Motivation

Finding the zero(s), or “root(s)”, of an equation is often necessary in engineering or mathematical problems. Fortunately for us, there is a suite of different pre-packaged root finding algorithms available for different programming languages. Julie Quinn wrote a blog post in 2016 summarizing some of the root finding tools available in MATLAB, R, Python, and C++, which can be found here.

Although these tools make the process of finding the root of an equation quick and easy, it is important for users to have some basic knowledge regarding the numerical methods being used.

In this post, I will provide background on the fundamentals of some common root finding algorithms, point out the strengths and weaknesses of the different algorithms, and highlight some points that you might want to take into consideration when using these algorithms. Ultimately, the goal here is to encourage more informed use of well-established tools.

The Bisection Method

The bisection method is a simple bracketing method. The start and end points of a range, [a, b], are specified, and if the function values at those points have opposite signs (i.e., sign(f(a)) ≠ sign(f(b))), then the intermediate value theorem guarantees the existence of a root within that range. The range is sequentially split in half, or bisected, and the sign of the function value at the midpoint determines which half of the range is used in the subsequent iteration.

The algorithm generally follows these steps:

1. Choose a bracket [a, b] such that sign(f(a)) ≠ sign(f(b)).
2. Evaluate the function at the midpoint, m = (a + b)/2.
3. If sign(f(m)) ≠ sign(f(b)), replace a with m; otherwise replace b with m.
4. Repeat steps 2–3 until the bracket width falls below the desired tolerance.

The simplest version of this algorithm, implemented in Python, follows:

import numpy as np

def Bisect(Function, range_start, range_end, maxNFE = 1000, tolerance = 10**-6):
 
    A = range_start
    B = range_end
    
    # Check that range bounds have opposite signs
    if np.sign(Function(A)) == np.sign(Function(B)):
        raise ValueError("Start and end values for the range must correspond to function values of different sign.")
        
    RootRange = [A, B]
    
    # Perform bisection until the bracket width falls below the tolerance
    NFE = 0
    spread = RootRange[1] - RootRange[0]
        
    while (spread > tolerance) and (NFE < maxNFE):
        
        MidPoint = (RootRange[1] + RootRange[0])/2
        
        # If the midpoint is exactly a root, return it immediately
        if Function(MidPoint) == 0:
            Root = MidPoint
            NFE += 1
            return Root, NFE
            
        # Keep the half of the bracket whose endpoints have opposite signs
        elif np.sign(Function(MidPoint)) != np.sign(Function(RootRange[1])):
            RootRange[0] = MidPoint
                
        else:
            RootRange[1] = MidPoint
            
        NFE += 1        
        spread = RootRange[1] - RootRange[0]
        
    Root = (RootRange[1] + RootRange[0])/2
    
    return Root, NFE
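As a quick sanity check, a condensed version of the routine above (compacted here so the snippet runs standalone) can be pointed at a function with a known root, f(x) = x² − 2 on [1, 2]:

```python
import numpy as np

def bisect(f, a, b, tolerance=10**-6, maxNFE=1000):
    # Condensed version of the Bisect() routine above
    NFE = 0
    while (b - a) > tolerance and NFE < maxNFE:
        mid = (a + b) / 2
        # Keep the half-bracket whose endpoints straddle the root
        if np.sign(f(mid)) != np.sign(f(b)):
            a = mid
        else:
            b = mid
        NFE += 1
    return (a + b) / 2, NFE

# The true root of x^2 - 2 on [1, 2] is sqrt(2) = 1.41421...
root, NFE = bisect(lambda x: x**2 - 2, 1.0, 2.0)
print(root, NFE)
```

Since the bracket halves each iteration, roughly log2((b − a)/tolerance) ≈ 20 iterations are needed here, an illustration of the method’s linear convergence.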

The Newton-Raphson Method

The Newton-Raphson method is an iterative root finding approach which uses information about the value and derivative of the function to produce successively better approximations of the root. The user must provide an initial guess of the root. If the value of the function at that point is not adequately close to zero, then the slope of the function there is used to determine the next estimate.

The mathematical representation of the algorithm follows:

x_(n+1) = x_n − f(x_n) / f'(x_n)

The use of the derivative makes this algorithm much quicker than the bisection method. The analytical derivative can either be provided to the algorithm, or an estimate can be used. Here, I use a simple forward-difference estimate of the derivative, calculating the slope of the function across some very small step size.

def df(Function, x, h = 10**-6):
    
    # Forward-difference approximation of the derivative
    df = (Function(x+h) - Function(x))/h
    
    return df
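A quick check of this estimate (repeating the function here so the snippet runs standalone): the derivative of x² at x = 3 is exactly 6, and the forward difference lands within roughly h of it:

```python
def df(Function, x, h=10**-6):
    # Forward-difference slope across a small step h
    return (Function(x + h) - Function(x)) / h

approx = df(lambda x: x**2, 3.0)
print(approx)  # close to 6, with truncation error on the order of h
```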

With the derivative estimation function ready, I can implement the Newton-Raphson method:

def NewtonRaphson(Function, x0, maxNFE = 1000, tol = 10**-10):
    
    NFE = 0

    while abs(Function(x0)) > tol and NFE < maxNFE:
        NFE += 1
        derivative = df(Function, x0, h = 10**-6)
        
        # Step to where the tangent line crosses zero
        x0 = x0 - Function(x0)/derivative
        
    return x0, NFE
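Applied to the same test function, f(x) = x² − 2 (condensed into a standalone snippet with the forward-difference slope inlined), the method reaches the root in far fewer evaluations than the bisection method needed:

```python
def newton_raphson(f, x0, tol=10**-10, maxNFE=1000, h=10**-6):
    # Condensed version of NewtonRaphson() above
    NFE = 0
    while abs(f(x0)) > tol and NFE < maxNFE:
        slope = (f(x0 + h) - f(x0)) / h  # forward-difference derivative
        x0 = x0 - f(x0) / slope          # step to the tangent's zero-crossing
        NFE += 1
    return x0, NFE

root, NFE = newton_raphson(lambda x: x**2 - 2, 1.0)
print(root, NFE)
```

Starting from x0 = 1, only a handful of iterations are needed to drive |f(x)| below the tolerance.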

When the Newton-Raphson method does converge to a solution, it converges at a quadratic rate, which is much quicker than the linear convergence of the bisection method. When choosing to implement a root finding technique, it is important to have some understanding of that technique’s unique strengths and weaknesses and choose an algorithm accordingly. For example, while the Newton-Raphson method is quicker than the bisection method, it does not provide the guaranteed convergence to the intended root that the bisection method does.

An example in custom root finding algorithms

Here, I would like to provide an example of how to implement a custom root finding algorithm, to demonstrate some challenges that can occur when using generic root finding algorithms, and how custom algorithms can help avoid potential pitfalls.

The shallow lake problem [1, 2], which has been used as an example in many other posts on this site, is a great example of a dynamic environmental system with multiple equilibria of interest. The problem centers around a shallow lake which receives both natural, Yt ~ LN(μ, σ²), and anthropogenic, at, phosphorus (P) pollution inputs, along with a natural recycling of P back into the water column characterized by the shape parameter q. The lake experiences some natural P removal at rate b. The combination of these inputs and outputs results in three different equilibrium states, where the inputs are equal to the outputs. The figure below (Quinn, 2016) helps to illustrate this behavior.

Figure generated by Prof. Julie Quinn (2016).

Notice that of the three equilibrium states, two are stable and one is unstable. The first stable equilibrium, the oligotrophic equilibrium, is the preferred state and will result in healthy ecosystem conditions. The unstable equilibrium presents a dangerous tipping point, referred to as the critical P threshold, beyond which the lake will irreversibly transition into an unhealthy, stable eutrophic equilibrium.

When considering pollution control policies, it is critical that the analyst identify the unstable, critical P threshold so that it can be avoided!

For the purposes of this example, I will consider the lake dynamics under the influence of only the natural removal and recycling, and make assumptions regarding these parameter values (b = 0.42, q = 2.0). The net flux of P experienced by the lake can be defined by the function:

def Fluxes(x):
    # Set parameters
    b = 0.42
    q = 2
    flux = (x**q) / (1 + x**q) - b*x
    return flux

Let’s use the Newton-Raphson root solver defined above to try to find the critical P threshold. Since I do not have prior knowledge of the critical P threshold, it seems reasonable to begin the Newton-Raphson algorithm with an arbitrary initial guess of 0.25.

Root, NFE = NewtonRaphson(Fluxes, x0 = 0.25)

From the above, I find that:
Root = 1.836

Just out of curiosity, I would like to test my Newton-Raphson function with a different initial estimate… if I place my initial estimate closer to the previously found root, then I should find the same root again, right?

Root, NFE = NewtonRaphson(Fluxes, 1.0)

Root = 0.0

In fact, the Newton-Raphson approach can result in some very unexpected convergence behavior, and might result in the convergence to a wildly different root if not used carefully.

For the sake of completeness, I will use the method once more with an initial guess between my first two:

Root, NFE = NewtonRaphson(Fluxes, 0.75)

Root = 0.5445

Finally, I find the unstable tipping point that I have been looking for. However, this process has revealed how unpredictable the convergence of this method can be.
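Because q = 2 here, these three equilibria can be cross-checked analytically: setting x²/(1 + x²) = bx and dividing out the trivial root at x = 0 leaves the quadratic bx² − x + b = 0, whose roots follow from the quadratic formula:

```python
import math

b = 0.42  # natural P removal rate, as in Fluxes()

# Nonzero equilibria of x**2/(1 + x**2) - b*x solve b*x**2 - x + b = 0
disc = math.sqrt(1 - 4 * b**2)
x_unstable = (1 - disc) / (2 * b)  # critical P threshold
x_stable = (1 + disc) / (2 * b)    # eutrophic equilibrium

print(x_unstable, x_stable)  # ~0.5445 and ~1.8364
```

The three Newton-Raphson runs above did find all three equilibria, but only by the luck of the initial guesses.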

The figure below shows which equilibrium is found for different initial guesses of the root. The color of the horizontal line indicates which of the three equilibria will be found given that value as the initial estimate.

This is where one might recognize the need to construct a unique algorithm better suited to this analysis. With regard to the lake problem, I want to be certain that I am identifying the correct critical P threshold. To do this, I will design an algorithm which is capable of finding all roots of the flux equation as well as identifying the stability condition of each equilibrium found.

The bisection method is preferable for this purpose because bracketing avoids the unpredictable convergence behavior of the Newton-Raphson method and, given a sufficiently fine discretization, can find every root within the specified range.

I’ve created a new algorithm, building off of the basic Bisect() function included above, which takes the entire range of interest and first discretizes it into a set of subspaces. It then checks the signs of the function values at the boundaries of each subspace, identifying those subspaces whose boundary values have opposite signs and thus contain a zero.

After identifying the subspaces containing zeros, the algorithm performs the traditional bisection method within each, as shown above. Lastly, I’ve included the evaluation of stability conditions once each root has been found within the specified tolerance.

def ManyRootBisect(Function, range_start, range_end, step_size, maxNFE = 1000, tolerance = 10**-6):
 
    A = range_start
    B = range_end
    
    # Discretize the range space (the overall bounds need not have
    # opposite signs, since each subspace is checked individually)
    n = int(np.floor((B - A)/step_size))
    nodes = np.linspace(A, B, n + 1, endpoint = True)
    
    # Count sign-changes (roots) and record ranges where roots exist
    nRoots = 0
    RootRange = []
    
    for i in range(len(nodes) - 1):
        
        if np.sign(Function(nodes[i])) != np.sign(Function(nodes[i+1])):
            nRoots += 1
            RootRange.append([nodes[i], nodes[i+1]])
    
    # Perform the bisection method on each range containing a root
    
    Roots = []
    NFE = 0

    for r in range(nRoots):
        
        spread = RootRange[r][1] - RootRange[r][0]
        
        while (spread > tolerance) and (NFE < maxNFE):
        
            MidPoint = (RootRange[r][1] + RootRange[r][0])/2
            
            # If the midpoint is exactly a root, collapse the bracket onto it
            if Function(MidPoint) == 0:
                
                RootRange[r][0] = MidPoint
                RootRange[r][1] = MidPoint
            
            elif np.sign(Function(MidPoint)) != np.sign(Function(RootRange[r][1])):
                RootRange[r][0] = MidPoint
                
            else:
                RootRange[r][1] = MidPoint
                
            NFE += 1
            spread = RootRange[r][1] - RootRange[r][0]
    
    for i in range(nRoots):
        
        Roots.append((RootRange[i][1] + RootRange[i][0])/2)
    
    # Check the stability of each root: an equilibrium is stable if the
    # function value just beyond the root is negative (the flux pushes back)
    Stability = []
    for i in range(nRoots):
        
        if Function(Roots[i] + tolerance) < 0:
            Stability.append("Stable")
        else:
            Stability.append("Unstable")
        
    return Roots, Stability, nRoots, NFE
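A condensed, standalone sketch of the same scan-then-bisect idea (with lowercase names to distinguish it from the full routine above) recovers all three equilibria of the flux function in one call:

```python
import numpy as np

def fluxes(x):
    # Net P flux with the same assumed parameters as above
    b, q = 0.42, 2
    return x**q / (1 + x**q) - b * x

def many_root_bisect(f, a, b, step, tolerance=10**-6):
    # Scan a discretized range for sign changes, then bisect each bracket
    nodes = np.arange(a, b + step, step)
    roots = []
    for lo, hi in zip(nodes[:-1], nodes[1:]):
        if np.sign(f(lo)) != np.sign(f(hi)):
            while (hi - lo) > tolerance:
                mid = (lo + hi) / 2
                if np.sign(f(mid)) != np.sign(f(hi)):
                    lo = mid
                else:
                    hi = mid
            roots.append((lo + hi) / 2)
    # Stable if the flux just beyond the root is negative (pushes P back down)
    stability = ["Stable" if f(r + tolerance) < 0 else "Unstable" for r in roots]
    return roots, stability

roots, stability = many_root_bisect(fluxes, -0.05, 3.0, 0.1)
print(roots, stability)
```

Note that the step size matters: any pair of roots closer together than the step would fall inside a single subspace and be missed.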

The above function is much more robust, and is capable of identifying each of the three roots and their stability conditions.

Roots = [0.0, 0.5445, 1.8364]
Stability = ['Stable', 'Unstable', 'Stable']

The goal of this blog post was not to argue that any particular root finding algorithm is superior to any other. Rather, I hoped to encourage more informed use of these common tools.

P.S. For a very interesting and tangentially related exploration of how the Newton-Raphson method can produce fractal geometries, check out the video by 3Blue1Brown here.

References

[1] Carpenter, Stephen R., Donald Ludwig, and William A. Brock. “Management of eutrophication for lakes subject to potentially irreversible change.” Ecological Applications 9.3 (1999): 751-771.

[2] Quinn, Julianne D., Patrick M. Reed, and Klaus Keller. “Direct policy search for robust multi-objective management of deeply uncertain socio-ecological tipping points.” Environmental Modelling & Software 92 (2017): 125-141.
