What is Newton’s Method?
Let’s say you have some function f(x), and you want to find where that function is equal to zero. If that function is a polynomial like

f(x) = x² + 3x + 2
we can plug it into the quadratic formula and find that the roots are -2 and -1. But what if the function is the following?
There’s no closed-form solution for the above equation. We could either turn to an infinite series solution, or to a numerical method like Newton’s method. Newton’s method is an iterative method for approximating the roots of a function.
How does Newton’s Method Work?
Newton’s method is easiest to explore geometrically, so let’s pick some initial x value on a cubic function f(x) which we know. The black horizontal line is g(x)=0, so we want to find where these two curves intersect.
First, we find the derivative of f(x) at the point we chose and use it as the slope of a new line, which I drew in blue. Because this line is linear, we can calculate the new x value at which it crosses g(x)=0.
Then, if we plug the new x value into the original function f(x), we get a new tangent line. We then repeat the process of seeing where this tangent line intersects g(x).
Taking the new x value and finding a tangent one more time, our new x point is quite close to the actual place where f(x)=0.
We could continue this iterative process until we get arbitrarily close to the actual zero of our function. We arrived at an extremely close point in only 3 iterations, and once you are close, this method converges quadratically. (Note: I’m planning on covering rates of convergence in a future post, but for now just know that quadratic convergence is good.)
Now how do we convert this method from the geometric picture above into an algebraic method we can implement in a script? This goes back to calculus 1 and finding the tangent line to a function. The formula for this new line is

y = mx + b

where m is the slope and b is the y-intercept. The derivative of the function at the point x_n will be our slope. Because we only want to find the distance we need to move to our new x, we can place our y-axis at the original starting point; setting y=0 we get the following equation

0 = f'(x_n)(x_{n+1} - x_n) + f(x_n)

By rearranging, we get the following update equation

x_{n+1} = x_n - f(x_n)/f'(x_n)
Now, I’ve implemented a Newton solver for a simple cubic function and created a GIF of it being solved, along with the absolute error plotted on a log scale.
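A minimal sketch of such a solver, assuming an example cubic x³ − x − 2 (the actual cubic from the GIF wasn’t reproduced in the text):

```python
# Minimal Newton solver. The cubic below is an assumed example; the
# post's actual cubic appeared only in the GIF.
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n) until |f(x)| < tol."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            break
        x -= fx / df(x)
    return x

f = lambda x: x**3 - x - 2      # assumed example cubic
df = lambda x: 3 * x**2 - 1     # its derivative

root = newton(f, df, x0=2.0)
print(root)  # ≈ 1.5214
```

Recording abs(f(x)) at each iteration and plotting it on a log scale would give the kind of error curve shown in the GIF.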
How to Pick an Initial Guess
Now we know how to iterate, but how do we get the first guess? Well, we want something near the answer. How near? That depends on the function, and I’m going to leave it beyond the scope of this post. But if you’re interested in digging deeper, I’ve heard this is a good and cheap introduction to the analysis of numerical methods.
How Newton’s Method can Fail
Newton’s method is not foolproof. It can fail in many different ways. The first is if an initial guess or iteration lands on or near a point where the derivative is zero. This leads the update equation to be

x_{n+1} = x_n - f(x_n)/0
If the derivative is exactly zero, we have hit a singularity. If the derivative is merely small, the new step lands very far away from the old one, which, because we assume our initial guess is close, means the new step will likely have a large amount of error. *Shakes fist at singularities*
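A quick sketch of this failure mode, using f(x) = cos(x) as an assumed example (its derivative is exactly zero at x = 0):

```python
import math

# Assumed example: f(x) = cos(x) has f'(0) = 0, so a guess near 0
# makes the Newton step -f(x)/f'(x) enormous.
f = math.cos
df = lambda x: -math.sin(x)

x0 = 1e-3                  # starts right next to the flat point at x = 0
step = -f(x0) / df(x0)     # the Newton step we'd add to x0
print(step)                # ≈ 1000, a huge jump away from the guess
```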
Speaking of singularities, what if the derivative is undefined? Suppose we are trying to find a zero of any of the following family of functions

f(x) = x^(1/n)

where n is an integer larger than 1. The derivative is always undefined at 0, which again leads to the same problems of landing at or near a singularity.
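A sketch of what goes wrong for the n = 3 member of the family: the Newton update for f(x) = x^(1/3) simplifies algebraically to x_{n+1} = -2·x_n, so every iteration doubles the distance from the root at 0.

```python
import math

# Newton's method on f(x) = x^(1/3). The update simplifies to
# x_{n+1} = x_n - 3*x_n = -2*x_n: each iteration doubles the distance
# from the root at 0, so the method diverges for any nonzero guess.
def cbrt(x):
    return math.copysign(abs(x) ** (1 / 3), x)

x = 0.1
for _ in range(5):
    fx = cbrt(x)
    dfx = 1 / (3 * cbrt(x) ** 2)   # f'(x) = (1/3) * x^(-2/3)
    x -= fx / dfx
print(x)  # ≈ -3.2 after five sign-flipping doublings of 0.1
```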
Another way it can fail is if it gets stuck in a loop. Take the function

If your initial guess is either 0 or 1, it will bounce between these two points forever. In fact, even if you start close to these two points, Newton’s method can converge to this cycle.
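This cycle can be demonstrated directly, assuming the cubic f(x) = x³ − 2x + 2 (a classic example whose Newton iterates bounce between exactly 0 and 1):

```python
# Assumed stand-in for the function shown in the post's image:
# f(x) = x^3 - 2x + 2 has a 2-cycle under Newton's method.
# f(0) = 2, f'(0) = -2  ->  x1 = 0 - 2/(-2) = 1
# f(1) = 1, f'(1) = 1   ->  x2 = 1 - 1/1   = 0, and we're stuck.
f = lambda x: x**3 - 2 * x + 2
df = lambda x: 3 * x**2 - 2

x = 0.0
history = []
for _ in range(6):
    x -= f(x) / df(x)
    history.append(x)
print(history)  # [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
```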
Newton’s Method: Good for More than Just Finding Roots
Newton’s method was designed to find roots, but it can also be applied to solving certain equations that have no closed-form solutions. One great example is Kepler’s equation

M = E - e sin(E)
I’m not going to go into this equation in this post, but lowercase e is a constant and capital E and M are both variables. If we write it in function notation, it would be

M(E) = E - e sin(E)
In this form it’s easy to find M given E, but what about the inverse problem of finding E given M? At a cursory glance we can see the appearance of the sine function, meaning this is a transcendental equation. Transcendental equations generally do not have closed-form solutions. In the case of Kepler’s equation, there’s no closed-form solution unless e is 0 or 1. There’s a way of solving it with an infinite series, but this can be tedious, and a numerical method is often preferred.
If we wanted to use Newton’s method to find the solution, we would rearrange Kepler’s equation as follows by subtracting M from both sides

f(E) = E - e sin(E) - M = 0
This is now of a form where we can take the derivative, f'(E) = 1 - e cos(E), and plug it into our update equation

E_{n+1} = E_n - (E_n - e sin(E_n) - M) / (1 - e cos(E_n))
Now what would make a good initial guess?
Because we are dealing with sine and cosine, we can bound E between 0 and 2π. (Note: we could also bound E by looking at the context in which Kepler’s equation arises.) Let’s select something in the middle of that range, but not something that causes either sin or cos to be 0. I’d select 3π/4 as my initial guess and see if it converges from there.
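Here’s a sketch of that plan, with assumed example values e = 0.5 and M = 1 (any elliptical eccentricity and mean anomaly would do):

```python
import math

# Newton's method for Kepler's equation M = E - e*sin(E), solving for E.
# The eccentricity e and mean anomaly M below are assumed example values.
def solve_kepler(M, e, E0=3 * math.pi / 4, tol=1e-12, max_iter=50):
    E = E0
    for _ in range(max_iter):
        fE = E - e * math.sin(E) - M      # f(E), want this to be 0
        if abs(fE) < tol:
            break
        dfE = 1 - e * math.cos(E)         # f'(E)
        E -= fE / dfE
    return E

E = solve_kepler(M=1.0, e=0.5)
print(E)                       # eccentric anomaly in radians
print(E - 0.5 * math.sin(E))   # ≈ 1.0, recovering M
```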
What About if we Don’t Have a Function?
Let’s say we have some problem with a desired outcome, and we have some variable we can alter to change that final outcome. Unfortunately, we don’t have a function relating the two to each other. We do, however, know the system dynamics. For example, we have the following governing equations
where ρ is a constant and we want to get to a certain x, y location. Our initial conditions are as follows, where we can change θ but V is constant.
Relating θ to the final position analytically would be hard, but we can find a solution rather easily by turning to Newton’s method.
First let’s turn our two requirements, x and y location, into a single cost function below.
Now let’s write out the update equation

θ_{n+1} = θ_n - J(θ_n)/J'(θ_n)
Getting J as an analytic function of θ is hard, but we can get its value at a particular θ by numerically integrating the governing equations forward in time. Additionally, we can approximate the derivative of J through a finite difference like

J'(θ) ≈ (J(θ + Δθ) - J(θ)) / Δθ

The above approximation only works when the difference Δθ between the two thetas is small.
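Since the governing equations above are left for the reader to integrate, here is a minimal runnable sketch of the finite-difference Newton loop, with a hypothetical placeholder cost function standing in for “integrate the dynamics forward and compare to the target”:

```python
import math

# Sketch of the finite-difference Newton loop. The real problem would
# numerically integrate the governing equations to evaluate J(theta);
# `cost` below is a hypothetical placeholder so the loop is runnable.
def cost(theta):
    # stand-in for "integrate dynamics forward, compare to the target"
    return math.sin(theta) - 0.5

def fd_newton(J, theta0, h=1e-6, tol=1e-10, max_iter=50):
    theta = theta0
    for _ in range(max_iter):
        J0 = J(theta)
        if abs(J0) < tol:
            break
        dJ = (J(theta + h) - J0) / h   # forward finite difference
        theta -= J0 / dJ
    return theta

theta = fd_newton(cost, theta0=0.3)
print(theta)  # ≈ 0.5236 (pi/6) for this placeholder cost
```

Swapping `cost` for a function that integrates the real dynamics and returns the miss distance is all that would change for the actual problem.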
Next week’s post will solve a problem similar to this one using Newton’s method, but with a different source of the nonlinearity, so I’m not going to solve this problem here. Instead, I leave it up to you, the reader, if you want to solve it (solve it to four significant figures). If you can solve it, email me the answer you got at ari@gereshes.com and I’ll add your name, or pseudonym if you prefer, to the bottom of this page. Note: I reserve the right to veto any name on any grounds. I’ll probably only invoke this for obscenity reasons.
Use the following values as constants
- ρ=1
- V=1
- x Final = 80.03716
- y Final =0
In case you were wondering whether there’s a real-world background for this problem: it’s a modified trajectory problem that includes both drag and a super-basic atmospheric density model.
Note: Hey, this post was a surprise hit, and at the time I wrote this note (2018.09.08) it was the second most read post on this site. I’ll be producing more numerical methods posts in the future, but if you want to get ahead, I recommend this book.
Want more Gereshes…
If you want to receive the weekly Gereshes blog post directly to your email every Monday morning, you can sign up for the newsletter here!
If you can’t wait for next week’s post and want some more Gereshes, I suggest
The Math behind swinging a swing