An Introduction to Convergence

Good Enough

Numerical methods are all about getting an answer that is good enough. A whole class of numerical methods are iterative methods, which keep running until their answer is good enough. Newton’s method is a classic example of an iterative method: it keeps going, making the error smaller and smaller, until it finds a zero.


A good iterative method should, as it iterates, converge to the truth. The big questions are: first, does it converge, and second, if it does, how fast does it converge? This post will go over some basic ideas in convergence, as well as how to analyze the convergence of Newton’s method.

A Sequence of Unfortunate Events

OK, before we get started on convergence, let’s take a moment to go over what a sequence is.

Let’s say you are making a schedule for yourself:

  1. Pickup paycheck
  2. Cash paycheck
  3. Pay Rent

You’ll notice that if you changed the order of this schedule, it wouldn’t make sense. You can’t cash your paycheck if you haven’t picked it up, and you can’t pay your rent if you don’t have cash. This schedule is a sequence: an ordered list of things. Let’s turn some words into sequences:

repetition – (r,e,p,e,t,i,t,i,o,n)

mary – (m,a,r,y)

army – (a,r,m,y)

You’ll want to notice two important things about the three sequences above.

  • Repetition in elements is allowed – in the word repetition, the letters e, t, and i each appear twice.
  • Order matters – mary and army contain the same letters, but they are different sequences.

Because order matters, let’s define a way to pick out a certain element. In mary, m is the first letter, a is the second, r is the third, and y is the fourth. What is the 7th element of repetition? Answer: t.
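If you like to think in code, here’s a quick sketch of these word sequences as Python tuples (Python is just my choice here; nothing in the post requires it). One thing to watch: Python counts from 0, so the math’s “7th element” lives at index 6.

# Word sequences as Python tuples: ordered, and repetition is allowed
repetition = tuple("repetition")  # ('r', 'e', 'p', 'e', 't', 'i', 't', 'i', 'o', 'n')
mary = tuple("mary")
army = tuple("army")

print(mary == army)       # False -- same letters, different order
print(repetition[7 - 1])  # 't' -- the 7th element, remembering Python indexes from 0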

Numeric Sequences

This blog deals with math, so let’s define a numeric sequence. Let’s start easy with the counting numbers from 1 to 5 – (1,2,3,4,5).

What about all the integers greater than zero – (1,2,3,4,5,…)? Now that sequence will go on forever, so let’s call it z and refer to its nth element as

z_n = n

Let’s create a new sequence with the nth element defined as

a_n = e^{-0.1n} + 0.5

Let’s plot that sequence here, with the element on the y-axis and the index of that element on the x-axis:

convergeIntro.png

We can see that as the index keeps going up, this sequence keeps heading towards 0.5. We say that this sequence converges to 0.5.
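If you want to generate a plot like this yourself, here’s a minimal Python sketch using numpy and matplotlib (my tooling choice; the figures in this post weren’t necessarily made this way):

import numpy as np
import matplotlib.pyplot as plt

n = np.arange(1, 51)        # indices n = 1..50
a = np.exp(-0.1 * n) + 0.5  # a_n = e^{-0.1 n} + 0.5

plt.scatter(n, a)                 # scatter, since a sequence only exists at integer indices
plt.axhline(0.5, linestyle="--")  # the value the sequence is heading towards
plt.xlabel("index n")
plt.ylabel("a_n")
plt.show()

Flip the sign in the exponent and the same code plots the divergent sequence below.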

What if we eliminate the minus sign to form the new sequence

b_n = e^{0.1n} + 0.5

divergeIntro.png

We can now see that this sequence heads off towards infinity as the index grows. We say this sequence diverges.

Respecting Bounds

Let’s define another sequence now

s_n = \cos(\frac{n\pi}{16})

boundedIntro.png

Does this sequence converge? No – as n goes to infinity, it doesn’t head towards any single number. So does it diverge? Technically yes, but it doesn’t head off towards either plus or minus infinity. This sequence bounces between -1 and 1 forever. We say that this sequence is bounded by these two numbers.

Note: By now, you’ve probably noticed that I’ve been making good use of scatter plots instead of my usual line plots. The functions we’ve been using to generate our sequences are all continuous, but we only evaluate them at the integers.

Definition for Convergence

There’s a more technical definition of convergence which involves epsilons, but let’s just use the following simple definition for this post: a sequence converges if

\lim_{n \to \infty} a_n = L

for some finite constant L, and a divergent sequence is any sequence that doesn’t converge.
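For the curious, the epsilon version says: for every tolerance ε > 0, there is some index N beyond which every term stays within ε of the limit L. Here’s a small numerical sketch of that idea in Python – a finite check for illustration only, since no finite computation can prove a limit:

import math

def first_index_within(a, L, eps, n_max=100_000):
    """First index n where |a(n) - L| < eps, or None if never found.

    Illustration only: for a monotone sequence like the one below this
    matches the epsilon definition, but in general it can't prove a limit.
    """
    for n in range(1, n_max + 1):
        if abs(a(n) - L) < eps:
            return n
    return None

# a_n = e^{-0.1 n} + 0.5 converges to L = 0.5
print(first_index_within(lambda n: math.exp(-0.1 * n) + 0.5, 0.5, 1e-3))  # prints 70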

Types of Convergence

Let’s say we have the following three sequences:

a_n = \frac{1}{n}

b_n = \frac{1}{2^n}

and finally,

c_n = e^{-n}

By inspection, we can see that all three of them go to zero as n goes to infinity, but do they go to zero at the same rate?

Let’s plot them to find out

typesOfConvergence.png

We can see that sequence b goes to zero faster than sequence a, and sequence c goes to zero the fastest. Let’s define a test to make this rate-of-convergence idea more concrete. Instead of looking at the sequence itself, let’s look at the ratio of the current term’s error to the last term’s error, and see what that new sequence goes to as n goes to infinity:

\lim_{n \to \infty}\frac{|a_n-L|}{|a_{n-1}-L|}=\mu

In the above equation, L is whatever constant the original sequence approaches, and μ is the number the new ratio sequence approaches. Why don’t you take a moment and try to figure out what μ would be for our sequences a, b, and c above?

ratesOfConvergence.png

We can see that sequence a’s ratio goes to 1, b’s goes to 0.5, and c’s goes to 0.3679 (that’s 1/e).
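You can check these numbers yourself by evaluating the test ratio at a large index. A quick Python sketch (all three sequences have L = 0):

import math

def ratio(a, L, n):
    """The convergence-test ratio |a_n - L| / |a_{n-1} - L| at index n."""
    return abs(a(n) - L) / abs(a(n - 1) - L)

sequences = {
    "a_n = 1/n":   lambda n: 1.0 / n,
    "b_n = 1/2^n": lambda n: 0.5 ** n,
    "c_n = e^-n":  lambda n: math.exp(-n),
}

for name, seq in sequences.items():
    print(name, ratio(seq, 0.0, 200))
# a's ratio approaches 1, b's is exactly 0.5, and c's is e^-1 ~ 0.3679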

  • If the ratio goes to 1, we say the sequence converges sub-linearly.
  • If it goes to some number between 0 and 1 (non-inclusive), it converges linearly.
  • If it goes to 0, it converges super-linearly.

Note: when the ratio goes to a number between 0 and 1, that number is also known as the sequence’s rate of convergence. The lower the number, the faster the convergence.

You might have heard the term quadratic convergence from my post on Newton’s method and be wondering where that one fits in. It’s a subset of super-linear convergence. We can divide super-linear convergence into subgroups by changing our test to the following:

\lim_{n \to \infty}\frac{|a_n-L|}{|a_{n-1}-L|^q}=M

where M is some positive, finite constant and q is the order of convergence (for q = 1, we additionally need M between 0 and 1, as above). If q is 2, then that sequence converges quadratically.

Note: we’re now going to shift back to iterative methods, but if you want to dive deeper into sequences and series, this might interest you.

Iterative Methods

But what does all of this have to do with iterative methods like Newton’s method?

At the beginning of the post I said this would be about iterative methods, and here we are, over halfway done, having only talked about sequences.

Well, think of an iterative method as a recursive sequence generator. Let’s write out the update equation for Newton’s method:

x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}

It’s of the same form as the sequences we were working with earlier. 
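To make that concrete, here’s a short Python sketch of Newton’s method written as a plain sequence generator (the example function and starting point are my choices, not anything special):

def newton_sequence(f, fprime, x0, n_terms=8):
    """Return the first n_terms of the Newton's method sequence."""
    x = x0
    terms = [x]
    for _ in range(n_terms - 1):
        x = x - f(x) / fprime(x)  # the update equation above
        terms.append(x)
    return terms

# Example: the roots of f(x) = x^2 - 2 are +/- sqrt(2); start at x0 = 1
for x in newton_sequence(lambda x: x**2 - 2, lambda x: 2 * x, 1.0):
    print(x)

Each pass through the loop spits out the next element of the sequence, exactly like the formulas for a_n and b_n did earlier.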

Proving Newton’s Method is Quadratic

Now, let’s actually show that Newton’s method has quadratic convergence. As we usually do, let’s start with a first-order Taylor series expansion, about the current iterate x_n, of the function whose root a we want to find:

f(a) = f(x_n)+f'(x_n)(a-x_n)+R

where R is the remainder. We can replace it with the Lagrange form

R = \frac{1}{2}f''(\xi_n)(a-x_n)^2

where ξ_n is between x_n and a. We also know that, because a is a root, f(a) = 0, which allows us to pull the second-order term to the other side and divide by f'(x_n):

\frac{f(x_n)}{f'(x_n)}+ (a-x_n) = \frac{-f''(\xi_n)}{2f'(x_n)}(a-x_n)^2

Recall our update equation for Newton’s method:

x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}

The right-hand side (RHS) of the update equation looks a lot like the left-hand side (LHS) of our earlier equation: the update equation tells us that f(x_n)/f'(x_n) = x_n - x_{n+1}, so the left-hand side collapses to a - x_{n+1}.

a-x_{n+1}= \frac{-f''(\xi_n)}{2f'(x_n)}(a-x_n)^2

Replacing our (a - x) error terms with ε’s, we get the following:

\epsilon_{n+1} = \frac{-f''(\xi_n)}{2f'(x_n)}\epsilon_n^2

Taking the absolute value, rearranging, and taking the limit as n goes to infinity (note that both x_n and ξ_n get squeezed towards the root a), we get

\lim_{n \to \infty} \frac{|\epsilon_{n+1}|}{|\epsilon_{n}|^2} = \frac{|f''(a)|}{2|f'(a)|}=M

When we were formulating our super-linear convergence test, we had a variable q that denoted the order of convergence. Here that q is a 2, indicating that the convergence is quadratic. We have now brought the ideas of convergence we developed for sequences to bear on the convergence of an iterative method.
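We can sanity-check this result numerically. For f(x) = x^2 - 2 the root is a = sqrt(2), so the predicted constant is M = |f''(a)| / (2|f'(a)|) = 2 / (4 sqrt(2)) ≈ 0.3536. A minimal Python sketch:

import math

f  = lambda x: x**2 - 2
fp = lambda x: 2 * x
a  = math.sqrt(2)  # the root Newton's method is converging to

x = 1.0                # starting guess
eps_prev = abs(a - x)  # epsilon_0
for _ in range(4):
    x = x - f(x) / fp(x)      # Newton update
    eps = abs(a - x)          # epsilon_{n+1}
    print(eps / eps_prev**2)  # should head towards M ~ 0.3536
    eps_prev = eps

Running this prints ratios that settle down near 0.3536, just as the limit above predicts.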

When Might This Fail

Now, there are a few assumptions I made in this proof that I didn’t spell out clearly while I was doing it. First, the easy one: f''(x) and f'(x) exist and are continuous. Second, the ratio |f''(ξ_n)| / |2f'(x_n)| stays bounded, so M is finite. Third, f'(x) is not zero. The first condition is met by smooth, continuous functions, which are generally what we deal with. Staying with smooth, continuous functions, f'(x) generally equals zero only at a small set of points, and a proper choice of initial conditions can avoid those points. Finally, the second condition is met when you are close enough to the root of the function. As you get farther away, this assumption breaks down, and quadratic convergence (and convergence more generally) isn’t guaranteed.
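To see the “too far from the root” failure concretely, a standard textbook example (my addition, not from the proof above) is f(x) = arctan(x), whose only root is x = 0. Start close and Newton’s method homes in; start too far out and each step overshoots worse than the last:

import math

f  = lambda x: math.atan(x)
fp = lambda x: 1 / (1 + x**2)

for x0 in (0.5, 2.0):
    x = x0
    for _ in range(6):
        x = x - f(x) / fp(x)  # Newton update
    print(f"x0 = {x0}: after 6 steps, x = {x}")
# x0 = 0.5 lands essentially on the root at 0; x0 = 2.0 blows up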

Closing Note

Convergence isn’t a small topic. Different fields have different approaches to how they define and use convergence. I’ve just given you a small flavor of it that has hopefully enticed you to delve deeper into the topic. Dover has a good and relatively cheap book on numerical methods that goes more into convergence. You can find that book here.

Want more Gereshes…

If you want to receive the weekly Gereshes blog post directly to your email every Monday morning, you can sign up for the newsletter here! Don’t want another email? That’s OK, Gereshes also has a Twitter account and subreddit!

If you can’t wait for next week’s post and want some more Gereshes, I suggest:

An Introduction to Error

Asteroid Wars – Part 1

My Undergraduate EDC

Announcement: Journal Club

As a new grad student, I’ve been wanting to get better at reading scientific literature. The more you do something, the better you get at it, so I set a goal of reading a paper a day and writing up a small, informal blurb about each one. Collecting each week’s blurbs might make for an interesting second weekly post on Gereshes, but I’ve decided to try it on the newsletter first. I’ll still only be sending out the newsletter on a weekly basis, so no extra emails, but each one will have a section that says Journal Club. This is just an experiment, and I may axe it at any time, but I’ll probably continue using the newsletter to try out new ideas.