Contents | Supplements and Links |
Objectives: To solve the problem of finding the generalization of a few simple results, using a new construction. To gain experience in computing interpolating polynomials. To develop an appreciation of mathematical induction and "infinite" problems. | Necessary Background: It is assumed that students are familiar with the method of mathematical induction. |
Summary: In Section I, students discover explicit formulas for the sum for p = 0, 1, 2, 3, and 4. In Section II, they are led to discover the general formula. |
We would like to investigate sums of consecutive like powers p. Our goal is to arrive at an expression for the function
(1) |
that captures any patterns that might be present in these sums. In particular, we would like to be able to reduce these sums computationally. For example, suppose we wished to compute the sum of the first billion square numbers. Is there any faster way of doing this than by simply adding up 1 + 4 + 9 + 16 + ... ? The answer is "yes", but first let's start with more simple cases.
Consider the power p = 0. What is f_{0}(n) = 1^{0} + 2^{0} + 3^{0} + ... + n^{0}, the sum of the first n natural numbers, each raised to the 0th power?
For p = 1, the sum (1) becomes the sum
of the first n natural numbers. Find f_{1}(n) for n = 1, 2, 3, ..., 10.
This sequence is called the sequence of triangular numbers because its members are the numbers of dots in the following isosceles right "triangles":
There is a story in the mathematical folklore about the great mathematician Gauss (1777–1855), who, while in elementary school, reportedly found the sum of the first hundred natural numbers in a matter of minutes. Gauss did this by pairing up the numbers into pairs that summed to 101:
Since there are fifty such pairs, the total sum is then 50 · 101 = 5050.
Now we move on to the case p = 2. Compute the sum of the first n squares for n = 1, 2, 3, 4, 5, and 6. Is there an obvious trick for computing these sums that works like Gauss's method for p = 1?
To find a general formula for , we will use an "interpolating polynomial". Interpolation is the process of fitting a curve to a set of data points. For example, there is a unique line that passes through any two points in the plane. If the coordinates of the plane are x and y and our two points are (x_{1}, y_{1}) and (x_{2}, y_{2}), then the line passing through them is given by the equation
We say that this "curve" (in this case, a line) fits the data points (x_{1}, y_{1}) and (x_{2}, y_{2}). (Here we assume that x_{1} ≠ x_{2} so that y is actually a function of x. We need not worry about this in the case of sums of consecutive powers since for every p, f_{p}(n) is a function of n by definition.) For example, let's fit a line to the points (x_{1}, y_{1}) = (1, f_{0}(1)) = (1, 1) and (x_{2}, y_{2}) = (2, f_{0}(2)) = (2, 2). Check that the above formula gives the same polynomial for f_{0}(n) that you determined earlier.
Any three points in the plane (with distinct x values) can be fit to a polynomial of degree two; any four points can be fit by a polynomial of degree three; etc. One formula for the interpolating polynomial of p + 2 points (x_{i}, y_{i}) is the so-called Lagrange interpolating polynomial,
(The product is over 1 ≤ j ≤ i – 1 and i + 1 ≤ j ≤ n.) First note that y(x_{k}) = y_{k} for each k since the summand is 0 for every i ≠ k and y_{i} for i = k. Secondly, note that the polynomial has degree at most p + 1. One computes the polynomial passing through p + 2 points merely by plugging the x- and y-coordinates of the points into the Lagrange interpolating polynomial. We might be lucky in that f_{2}(n) is a polynomial in n of degree 3. Assuming that it is, find this polynomial using the Lagrange interpolating polynomial. Check whether this polynomial works for the six data points you found above.
Mathematical induction is a method of proof that is often useful for proving identities such as this one. Prove that this polynomial for f_{2}(n) holds for all natural numbers n.
Possibly by implementing the Lagrange interpolating polynomial on a calculator or computer, use the same method to guess polynomial expressions for f_{3}(n) and f_{4}(n). Prove that these expressions hold for all natural numbers n.
Now that we have found f_{p}(n) to be a polynomial for several special cases, we may start to look for patterns in the hope of discovering a general polynomial formula. What patterns can be observed in the formulas found for p = 0, 1, 2, 3, and 4?
At this point the reader is invited to depart from this outline of the project and to work independently toward a general formula for f_{p}(n). It is possible to find this formula empirically; however, you will likely need outside information. What follows is an approach to the problem that is guided by foreknowledge of the answer. Certainly this is not how research is conducted in mathematics, but in this case the answer is made simpler by considering additional constructions that are not especially natural unless one knows that they will work. But that is not to say that they cannot be discovered without foreknowledge; the reader who chooses not to follow the rest of this outline should start by finding the polynomials f_{p}(n) for many additional p and looking for patterns in this sequence of polynomials.
For each natural number p, let B_{p}(n) be a polynomial in n of degree p satisfying
(2) |
This defines B_{p}(n) up to its constant term. We can see this by writing (for a fixed p) . Then
so b_{p} = 1. The coefficient b_{p – 1} is determined by b_{p}; b_{p – 2} is determined by b_{p} and b_{p – 1}; ...; and b_{1} = –(b_{p} + b_{p – 1} + ... + b_{2}). We call B_{p}(n) the pth Bernoulli polynomial, after Jakob Bernoulli (1654–1705), who discovered them in his attempt to find the general formula for sums of consecutive powers.
The constant term of B_{p}(n) is defined in mathematics, but we need not concern ourselves with it here. We write B_{p} = B_{p}(0) for the constant term of the pth Bernoulli polynomial. (These numbers are called Bernoulli numbers and appear throughout mathematics.) Let p > 1. By substituting n = 0 into the definition of Bernoulli polynomials, find another expression for the Bernoulli number B_{p}.
The first Bernoulli polynomial is B_{1}(n) = n + B_{1}. From the definition, find B_{p}(n) – B_{p} for p = 2, 3, and 4.
Now we notice a similarity between f_{p}(n) and B_{p}(n). Using the definition of f_{p}(n), find a simple expression for the difference f_{p – 1}(n) – f_{p – 1}(n – 1).
This statement is very reminiscent of the definition of Bernoulli polynomials. Multiply both sides of the equality by p. Since the pth Bernoulli polynomial is the unique polynomial (up to a constant) satisfying (2), find a relationship of the function p f_{p – 1}(n) to a Bernoulli polynomial (assuming that f_{p – 1}(n) is a polynomial in n of degree p).
At this point we have a conjectural formula for f_{p}(n) as a polynomial of degree p + 1, assuming that it is a polynomial of this degree. We would like to show that it in fact is this polynomial. For each p we will do this by induction on n. For p > 0 we may redefine f_{p}(n) as
since summing from m = 0 to n gives the same result as summing from m = 1 to n. Use this to find the constant term of f_{p}(n).
For each p, prove the conjectural formula for f_{p}(n) by induction on n.
One might object to this "solution" of finding a formula for sums of consecutive powers under the premise that Bernoulli polynomials aren't sufficiently explicit. After all, we have only changed the problem of computing sums of like powers into one of finding the coefficients of Bernoulli polynomials, and we don't have a closed-form expression for these coefficients. There are two points to make in response to such an objection.
The first is that not every problem in mathematics has its solution in terms of fundamental constructs such as the integers and π. While this accounts for much of the difficulty of mathematics, it is responsible for everything interesting and beautiful in mathematics as well. There is a hierarchy of mathematical objects that appear in nature, and the Bernoulli polynomials (and especially the Bernoulli numbers) happen to lie outside of any "direct" construction from the integers. This emphasizes their value.
The second is that, in a real sense, we have reduced the problem computationally, as it is much easier (from a computational standpoint) to find the p + 1 coefficients of B_{p}(n) and then to evaluate at n = 1000000 than it is to start adding up powers of natural numbers. Doing both by hand gets the point across, but it is even more appreciated when both methods are implemented on a computer.
Relevant links: