The Proof that every Equation has a Root

Let \[Z = P(z) = \alpha_{0} z^{n} + \alpha_{1} z^{n-1} + \dots + \alpha_{n}\] be a polynomial in \(z\), with real or complex coefficients. We can represent the values of \(z\) and \(Z\) by points in two planes, which we may call the \(z\)-plane and the \(Z\)-plane respectively. It is evident that if \(z\) describes a closed path \(\gamma\) in the \(z\)-plane, then \(Z\) describes a corresponding closed path \(\Gamma\) in the \(Z\)-plane. We shall assume for the present that the path \(\Gamma\) does not pass through the origin.

To any value of \(Z\) correspond an infinity of values of \(\operatorname{am} Z\), differing by multiples of \(2\pi\), and each of these values varies continuously as \(Z\) describes \(\Gamma\).\(^{1}\) We can select a particular value of \(\operatorname{am} Z\) corresponding to each point of \(\Gamma\), by first selecting a particular value corresponding to the initial value of \(Z\), and then following the continuous variation of this value as \(Z\) moves along \(\Gamma\). We shall, in the argument which follows, use the phrase ‘the amplitude of \(Z\)’ and the formula \(\operatorname{am} Z\) to denote the particular value of the amplitude of \(Z\) thus selected. Thus \(\operatorname{am} Z\) denotes a one-valued and continuous function of \(X\) and \(Y\), the real and imaginary parts of \(Z\).

When \(Z\), after describing \(\Gamma\), returns to its original position, its amplitude may be the same as before, as will certainly be the case if \(\Gamma\) does not enclose the origin, like path (a) in Fig. B, or it may differ from its original value by any multiple of \(2\pi\). Thus if its path is like (b) in Fig. B, winding once round the origin in the positive direction, then its amplitude will have increased by \(2\pi\). These remarks apply, not merely to \(\Gamma\), but to any closed contour in the \(Z\)-plane which does not pass through the origin. Associated with any such contour there is a number which we may call ‘the increment of \(\operatorname{am} Z\) when \(Z\) describes the contour’, a number independent of the initial choice of a particular value of the amplitude of \(Z\).
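The increment of \(\operatorname{am} Z\) round a closed contour can be illustrated numerically. The following is a minimal sketch, not part of the original argument: the function names `amplitude_increment` and `circle`, and the two sample circles standing in for paths like (a) and (b), are assumptions made for illustration. It follows the amplitude continuously by adding up the small changes between neighbouring sample points.

```python
import cmath
import math

def amplitude_increment(points):
    """Total change of am Z along the closed polygonal path through `points`.

    Each step adds the principal argument of the ratio of consecutive
    points; this equals the continuous change of am Z as long as the
    samples are close enough that no single step turns by pi or more.
    """
    total = 0.0
    n = len(points)
    for i in range(n):
        z_now, z_next = points[i], points[(i + 1) % n]
        total += cmath.phase(z_next / z_now)
    return total

def circle(centre, radius, samples=400):
    """Sample points on a circle, described in the positive sense."""
    return [centre + radius * cmath.exp(2j * math.pi * k / samples)
            for k in range(samples)]

# A path like (a): a circle that does not enclose the origin.
inc_a = amplitude_increment(circle(3 + 0j, 1.0))
# A path like (b): a circle winding once round the origin.
inc_b = amplitude_increment(circle(0j, 1.0))
print(round(inc_a / (2 * math.pi)), round(inc_b / (2 * math.pi)))
```

The result does not depend on which sample point is taken first, in agreement with the remark that the increment is independent of the initial choice of the amplitude.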

 

We shall now prove that if the amplitude of \(Z\) is not the same when \(Z\) returns to its original position, then the path of \(z\) must contain inside or on it at least one point at which \(Z = 0\).

We can divide \(\gamma\) into a number of smaller contours by drawing parallels to the axes at a distance \(\delta_{1}\) from one another, as in Fig. C.\(^{2}\) If there is, on the boundary of any one of these contours, a point at which \(Z = 0\), what we wish to prove is already established. We may therefore suppose that this is not the case. Then the increment of \(\operatorname{am} Z\), when \(z\) describes \(\gamma\), is equal to the sum of all the increments of \(\operatorname{am} Z\) obtained by supposing \(z\) to describe each of these smaller contours separately in the same sense as \(\gamma\). For if \(z\) describes each of the smaller contours in turn, in the same sense, it will ultimately (see Fig. D) have described the boundary of \(\gamma\) once, and each part of each of the dividing parallels twice and in opposite directions. Thus \(PQ\) will have been described twice, once from \(P\) to \(Q\) and once from \(Q\) to \(P\). As \(z\) moves from \(P\) to \(Q\), \(\operatorname{am} Z\) varies continuously, since \(Z\) does not pass through the origin; and if the increment of \(\operatorname{am} Z\) is in this case \(\theta\), then its increment when \(z\) moves from \(Q\) to \(P\) is \(-\theta\); so that, when we add up the increments of \(\operatorname{am} Z\) due to the description of the various parts of the smaller contours, all cancel one another, save the increments due to the description of parts of \(\gamma\) itself.
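The cancellation along the dividing parallels can be checked numerically. The sketch below assumes, purely for illustration, that \(\gamma\) is the square with corners \(\pm 2 \pm 2i\), that \(\delta_{1} = 1\), and that \(P(z)\) is a quadratic with roots \(0.3 + 0.4i\) and \(-1.2 + 0.7i\) (chosen so that no root lies on a dividing line); the helper names are hypothetical.

```python
import cmath
import math

def P(z):
    # Illustrative polynomial (an assumption): roots 0.3+0.4i and -1.2+0.7i.
    return (z - (0.3 + 0.4j)) * (z - (-1.2 + 0.7j))

def square_path(x, y, side, per_side=200):
    """Points on the boundary of a square, described in the positive sense."""
    corners = [complex(x, y), complex(x + side, y),
               complex(x + side, y + side), complex(x, y + side)]
    points = []
    for i in range(4):
        a, b = corners[i], corners[(i + 1) % 4]
        points.extend(a + (b - a) * k / per_side for k in range(per_side))
    return points

def increment_of_am(f, path):
    """Increment of am f(z) as z describes the sampled closed path once."""
    values = [f(z) for z in path]
    return sum(cmath.phase(values[(i + 1) % len(values)] / values[i])
               for i in range(len(values)))

# Increment round the whole square, and round each of the 16 unit squares.
outer = increment_of_am(P, square_path(-2, -2, 4))
pieces = [increment_of_am(P, square_path(x, y, 1))
          for x in range(-2, 2) for y in range(-2, 2)]
turns_outer = round(outer / (2 * math.pi))
turns_pieces = [round(p / (2 * math.pi)) for p in pieces]
print(turns_outer, sum(turns_pieces))
```

Only the two unit squares that contain a root contribute a non-zero increment; the contributions of all interior sides cancel in pairs.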

 

Hence, if \(\operatorname{am} Z\) is changed when \(z\) describes \(\gamma\), there must be at least one of the smaller contours, say \(\gamma_{1}\), such that \(\operatorname{am} Z\) is changed when \(z\) describes \(\gamma_{1}\). This contour may be a square whose sides are parts of the auxiliary parallels, or may be composed of parts of these parallels and parts of the boundary of \(\gamma\). In any case every point of the contour lies in or on the boundary of a square \(\Delta_{1}\) whose sides are parts of the auxiliary parallels and of length \(\delta_{1}\).

We can now further subdivide \(\gamma_{1}\) by the help of parallels to the axes at a smaller distance \(\delta_{2}\) from one another, and we can find a contour \(\gamma_{2}\), entirely included in a square \(\Delta_{2}\) of side \(\delta_{2}\), itself included in \(\Delta_{1}\), such that \(\operatorname{am} Z\) is changed when \(z\) describes the contour.

Now let us take an infinite sequence of decreasing numbers \(\delta_{1}\), \(\delta_{2}\), …, \(\delta_{m}\), …, whose limit is zero.\(^{3}\) By repeating the argument used above, we can determine a series of squares \(\Delta_{1}\), \(\Delta_{2}\), …, \(\Delta_{m}\), … and a series of contours \(\gamma_{1}\), \(\gamma_{2}\), …, \(\gamma_{m}\), … such that (i) \(\Delta_{m+1}\) lies entirely inside \(\Delta_{m}\), (ii) \(\gamma_{m}\) lies entirely inside \(\Delta_{m}\), (iii) \(\operatorname{am} Z\) is changed when \(z\) describes \(\gamma_{m}\).

If \((x_{m}, y_{m})\) and \((x_{m} + \delta_{m}, y_{m} + \delta_{m})\) are the lower left-hand and upper right-hand corners of \(\Delta_{m}\), it is clear that \(x_{1}\), \(x_{2}\), …, \(x_{m}\), … is an increasing and \(x_{1} + \delta_{1}\), \(x_{2} + \delta_{2}\), …, \(x_{m} + \delta_{m}\), … a decreasing sequence, and that they have a common limit \(x_{0}\). Similarly \(y_{m}\) and \(y_{m} + \delta_{m}\) have a common limit \(y_{0}\), and \((x_{0}, y_{0})\) is the one and only point situated inside every square \(\Delta_{m}\). However small \(\delta\) may be, we can draw a square which includes \((x_{0}, y_{0})\), and whose sides are parallel to the axes and of length \(\delta\), and inside this square a closed contour such that \(\operatorname{am} Z\) is changed when \(z\) describes the contour.
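The construction of the nested squares \(\Delta_{m}\) can be imitated on a computer. The following is only a sketch under stated assumptions: the starting contour is itself a square, each \(\delta_{m}\) is half the preceding one, the polynomial is an illustrative quadratic with roots \(0.3 + 0.4i\) and \(-1.2 + 0.7i\), and the names `turns_round_square` and `shrink_to_root` are hypothetical. At each stage it keeps a quarter of the current square round which \(\operatorname{am} Z\) is changed.

```python
import cmath
import math

def P(z):
    # Illustrative polynomial (an assumption): roots 0.3+0.4i and -1.2+0.7i.
    return (z - (0.3 + 0.4j)) * (z - (-1.2 + 0.7j))

def turns_round_square(f, x, y, side, per_side=200):
    """Increment of am f(z) round the square, as a multiple of 2*pi."""
    corners = [complex(x, y), complex(x + side, y),
               complex(x + side, y + side), complex(x, y + side)]
    values = []
    for i in range(4):
        a, b = corners[i], corners[(i + 1) % 4]
        values.extend(f(a + (b - a) * k / per_side) for k in range(per_side))
    total = sum(cmath.phase(values[(i + 1) % len(values)] / values[i])
                for i in range(len(values)))
    return round(total / (2 * math.pi))

def shrink_to_root(f, x, y, side, levels=20):
    """Halve the square repeatedly, keeping a quarter on which am f changes."""
    for _ in range(levels):
        side /= 2
        for dx, dy in ((0, 0), (0, side), (side, 0), (side, side)):
            if turns_round_square(f, x + dx, y + dy, side) != 0:
                x, y = x + dx, y + dy
                break
        else:
            # Can only happen if a root lies on (or very near) a dividing line.
            raise ValueError("no quarter found on which am f changes")
    return complex(x + side / 2, y + side / 2)

z0 = shrink_to_root(P, -2.0, -2.0, 4.0)
print(z0, abs(P(z0)))
```

With these choices the squares close down on the root \(-1.2 + 0.7i\), because the quarter containing it is the first one examined that gives a non-zero increment. If a root fell exactly on a dividing line the division inside `turns_round_square` would fail, which corresponds to the case the text sets aside as already proving the theorem.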

It can now be shown that \[P(x_{0} + iy_{0}) = 0.\] For suppose that \(P(x_{0} + iy_{0}) = a\), where \(|a| = \rho > 0\). Since \(P(x + iy)\) is a continuous function of \(x\) and \(y\), we can draw a square whose centre is \((x_{0}, y_{0})\) and whose sides are parallel to the axes, and which is such that \[|P(x + iy) - P(x_{0} + iy_{0})| < \tfrac{1}{2}\rho\] at all points \(x + iy\) inside the square or on its boundary. At all such points \[P(x + iy) = a + \phi,\] where \(|\phi| < \frac{1}{2}\rho\). Now let us take any closed contour lying entirely inside this square. As \(z\) describes this contour, \(Z = a + \phi\) also describes a closed contour. But the latter contour evidently lies inside the circle whose centre is \(a\) and whose radius is \(\frac{1}{2}\rho\), and this circle does not include the origin. Hence the amplitude of \(Z\) is unchanged.

But this contradicts what was proved above, viz. that inside each square \(\Delta_{m}\) we can find a closed contour the description of which by \(z\) changes \(\operatorname{am} Z\). Hence \(P(x_{0} + iy_{0}) = 0\).

All that remains is to show that we can always find some contour \(\gamma\) such that \(\operatorname{am} Z\) is changed when \(z\) describes \(\gamma\). Now \[Z = \alpha_{0} z^{n} \left(1 + \frac{\alpha_{1}}{\alpha_{0}z} + \frac{\alpha_{2}}{\alpha_{0} z^{2}} + \dots + \frac{\alpha_{n}}{\alpha_{0} z^{n}}\right).\] We can choose \(R\) so that \[\frac{|\alpha_{1}|}{|\alpha_{0}| R} + \frac{|\alpha_{2}|}{|\alpha_{0}| R^{2}} + \dots + \frac{|\alpha_{n}|}{|\alpha_{0}| R^{n}} < \delta,\] where \(\delta\) is any positive number, however small; and then, if \(\gamma\) is the circle whose centre is the origin and whose radius is \(R\), we have \[Z = \alpha_{0} z^{n} (1 + \rho),\] where \(|\rho| < \delta\), at all points on \(\gamma\). We can then show, by an argument similar to that used above, that \(\operatorname{am}(1 + \rho)\) is unchanged as \(z\) describes \(\gamma\) in the positive sense, while \(\operatorname{am} z^{n}\) on the other hand is increased by \(2n\pi\). Hence \(\operatorname{am} Z\) is increased by \(2n\pi\), and the proof that \(Z = 0\) has a root is completed.
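The choice of \(R\) and the resulting increment \(2n\pi\) can be seen in a concrete case. The sketch below is illustrative only: it assumes the quintic \(z^{5} - 13z - 7\), the value \(\delta = \frac{1}{2}\), and a simple doubling search for \(R\); none of these choices comes from the argument itself, which needs only that \(\delta < 1\).

```python
import cmath
import math

coeffs = [1, 0, 0, 0, -13, -7]   # z^5 - 13z - 7, highest power first (assumed example)
n = len(coeffs) - 1

def P(z):
    value = 0
    for c in coeffs:             # Horner's rule
        value = value * z + c
    return value

def tail_sum(R):
    """Sum over k = 1..n of |c_k| / (|c_0| R^k), c_k being the k-th coefficient."""
    return sum(abs(coeffs[k]) / (abs(coeffs[0]) * R ** k) for k in range(1, n + 1))

delta = 0.5                      # any positive number less than 1 would serve
R = 1.0
while tail_sum(R) >= delta:      # double R until the sum falls below delta
    R *= 2

# Increment of am Z as z describes the circle |z| = R in the positive sense.
samples = 2000
values = [P(R * cmath.exp(2j * math.pi * k / samples)) for k in range(samples)]
increment = sum(cmath.phase(values[(k + 1) % samples] / values[k])
                for k in range(samples))
print(R, round(increment / (2 * math.pi)))
```

For this example the search stops at \(R = 4\), and the increment is \(5 \cdot 2\pi\), as the degree requires.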

We have assumed throughout the argument that neither \(\Gamma\), nor any of the smaller contours into which it is resolved, passes through the origin. This assumption is obviously legitimate, for to suppose the contrary, at any stage of the argument, is to admit the truth of the theorem.

We leave it as an exercise to the reader to infer, from the discussion which precedes and that of § 43, that

when \(z\) describes any contour \(\gamma\) in the positive sense the increment of \(\operatorname{am} Z\) is \(2k\pi\), where \(k\) is the number of roots of \(Z = 0\) inside \(\gamma\), multiple roots being counted multiply.
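A numerical check of this statement, including the counting of a multiple root, may be sketched as follows. The cubic \((z - 1)^{2}(z + 2)\), the three circles, and the name `roots_inside_circle` are assumptions chosen for illustration; the sketch does not replace the proof left to the reader.

```python
import cmath
import math

def P(z):
    # Assumed example: double root at z = 1, simple root at z = -2.
    return (z - 1) ** 2 * (z + 2)

def roots_inside_circle(f, centre, radius, samples=2000):
    """Increment of am f(z) round the circle, divided by 2*pi."""
    values = [f(centre + radius * cmath.exp(2j * math.pi * k / samples))
              for k in range(samples)]
    increment = sum(cmath.phase(values[(k + 1) % samples] / values[k])
                    for k in range(samples))
    return round(increment / (2 * math.pi))

k_double = roots_inside_circle(P, 1, 1)    # encloses only the double root
k_simple = roots_inside_circle(P, -2, 1)   # encloses only the simple root
k_all = roots_inside_circle(P, 0, 3)       # encloses all the roots
print(k_double, k_simple, k_all)
```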

There is another proof, proceeding on different lines, which is often given. It depends, however, on an extension to functions of two or more variables of the results of § 102 et seq.

We define, precisely on the lines of § 102, the upper and lower bounds of a function \(f(x, y)\), for all pairs of values of \(x\) and \(y\) corresponding to any point of any region in the plane of \((x, y)\) bounded by a closed curve. And we can prove, much as in § 102, that a continuous function \(f(x, y)\) attains its upper and lower bounds in any such region.

Now \[|Z| = |P(x + iy)|\] is a positive and continuous function of \(x\) and \(y\). If \(m\) is its lower bound for points on and inside \(\gamma\), then there must be a point \(z_{0}\) for which \(|Z| = m\), and this must be the least value assumed by \(|Z|\). If \(m = 0\), then \(P(z_{0}) = 0\), and we have proved what we want. We may therefore suppose that \(m > 0\).

The point \(z_{0}\) must lie either inside or on the boundary of \(\gamma\): but if \(\gamma\) is a circle whose centre is the origin, and whose radius \(R\) is large enough, then the last hypothesis is untenable, since \(|P(z)| \to \infty\) as \(|z| \to \infty\). We may therefore suppose that \(z_{0}\) lies inside \(\gamma\).

If we put \(z = z_{0} + \zeta\), and rearrange \(P(z)\) according to powers of \(\zeta\), we obtain \[P(z) = P(z_{0}) + A_{1}\zeta + A_{2}\zeta^{2} + \dots + A_{n}\zeta^{n},\] say. Let \(A_{k}\) be the first of the coefficients which does not vanish, and let \(|A_{k}| = \mu\), \(|\zeta| = \rho\). We can choose \(\rho\) so small that \[|A_{k+1}|\rho + |A_{k+2}|\rho^{2} + \dots + |A_{n}|\rho^{n-k} < \tfrac{1}{2}\mu.\] Then \[|P(z) - P(z_{0}) - A_{k}\zeta^{k}| < \tfrac{1}{2}\mu\rho^{k},\] and \[|P(z)| < |P(z_{0}) + A_{k}\zeta^{k}| + \tfrac{1}{2}\mu\rho^{k}.\]

Now suppose that \(z\) moves round the circle whose centre is \(z_{0}\) and radius \(\rho\). Then \[P(z_{0}) + A_{k}\zeta^{k}\] moves \(k\) times round the circle whose centre is \(P(z_{0})\) and radius \(|A_{k}\zeta^{k}| = \mu\rho^{k}\), and passes \(k\) times through the point in which this circle is intersected by the line joining \(P(z_{0})\) to the origin. Hence there are \(k\) points on the circle described by \(z\) at which \(|P(z_{0}) + A_{k}\zeta^{k}| = |P(z_{0})| - \mu\rho^{k}\) and so \[|P(z)| < |P(z_{0})| - \mu\rho^{k} + \tfrac{1}{2}\mu\rho^{k} = m - \tfrac{1}{2}\mu\rho^{k} < m;\] and this contradicts the hypothesis that \(m\) is the lower bound of \(|P(z)|\).

It follows that \(m\) must be zero and that \(P(z_{0}) = 0\).
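The key step of this second proof, that \(|P|\) can always be lowered near a point where \(P \neq 0\), can be sketched in code. The sketch is illustrative and rests on assumptions: the names `taylor_shift` and `lowering_step` are hypothetical, the starting radius \(\rho = \frac{1}{2}\) and the tolerance used to decide which \(A_{k}\) vanish are arbitrary, and the example \(P(z) = z^{4} + 1\) at \(z_{0} = 0\) is chosen because there \(k = 4\).

```python
import cmath

def taylor_shift(coeffs, z0):
    """Coefficients [P(z0), A_1, ..., A_n] of P(z0 + zeta) in powers of zeta.

    `coeffs` lists the coefficients of P, highest power first; the shift is
    carried out by repeated synthetic division.
    """
    c = [complex(a) for a in coeffs]
    n = len(c) - 1
    for i in range(n):
        for j in range(1, n - i + 1):
            c[j] += z0 * c[j - 1]
    return c[::-1]

def lowering_step(coeffs, z0, rho=0.5, tol=1e-12):
    """Return (zeta, k) with |P(z0 + zeta)| < |P(z0)|, assuming P(z0) != 0."""
    A = taylor_shift(coeffs, z0)
    # First coefficient after P(z0) that does not vanish (P is not constant).
    k = next(i for i in range(1, len(A)) if abs(A[i]) > tol)
    # Direction of zeta for which A_k * zeta^k points from P(z0) towards the origin.
    angle = cmath.phase(-A[0] / A[k]) / k
    while True:
        zeta = cmath.rect(rho, angle)
        shifted = sum(A[i] * zeta ** i for i in range(len(A)))
        if abs(shifted) < abs(A[0]):
            return zeta, k
        rho /= 2                 # take rho smaller, as the argument permits

coeffs = [1, 0, 0, 0, 1]         # P(z) = z^4 + 1, so that P(0) = 1
zeta, k = lowering_step(coeffs, 0j)
new_modulus = abs(zeta ** 4 + 1)
print(k, new_modulus)
```

Here \(|\zeta| = \frac{1}{2}\) already suffices, and \(|P|\) falls from \(1\) to about \(1 - \frac{1}{16}\), so that \(z_{0} = 0\) cannot be a point at which \(|P|\) attains a positive lower bound.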

EXAMPLES ON APPENDIX I
1. Show that the number of roots of \(f(z) = 0\) which lie within a closed contour which does not pass through any root is equal to the increment of \[\{\log f(z)\}/2\pi i\] when \(z\) describes the contour.

2. Show that if \(R\) is any number such that \[\frac{|a_{1}|}{R} + \frac{|a_{2}|}{R^{2}} + \dots + \frac{|a_{n}|}{R^{n}} < 1,\] then all the roots of \(z^{n} + a_{1}z^{n-1} + \dots + a_{n} = 0\) are in absolute value less than \(R\). In particular show that all the roots of \(z^{5} - 13z - 7 = 0\) are in absolute value less than \(2\frac{1}{67}\).

3. Determine the numbers of the roots of the equation \(z^{2p} + az + b = 0\) where \(a\) and \(b\) are real and \(p\) odd, which have their real parts positive and negative. Show that if \(a > 0\), \(b > 0\) then the numbers are \(p - 1\) and \(p + 1\); if \(a < 0\), \(b > 0\) they are \(p + 1\) and \(p - 1\); and if \(b < 0\) they are \(p\) and \(p\). Discuss the particular cases in which \(a = 0\) or \(b = 0\). Verify the results when \(p = 1\).

[Trace the variation of \(\operatorname{am}(z^{2p} + az + b)\) as \(z\) describes the contour formed by a large semicircle whose centre is the origin and whose radius is \(R\), and the part of the imaginary axis intercepted by the semicircle.]

4. Consider similarly the equations \[z^{4q} + az + b = 0,\quad z^{4q-1} + az + b = 0,\quad z^{4q+1} + az + b = 0.\]

5. Show that if \(\alpha\) and \(\beta\) are real then the numbers of the roots of the equation \(z^{2n} + \alpha^{2} z^{2n-1} + \beta^{2} = 0\) which have their real parts positive and negative are \(n - 1\) and \(n + 1\), or \(n\) and \(n\), according as \(n\) is odd or even.

6. Show that when \(z\) moves along the straight line joining the points \(z = z_{1}\), \(z = z_{2}\), from a point near \(z_{1}\) to a point near \(z_{2}\), the increment of \[\operatorname{am} \left(\frac{1}{z - z_{1}} + \frac{1}{z - z_{2}}\right)\] is nearly equal to \(\pi\).

7. A contour enclosing the three points \(z = z_{1}\), \(z = z_{2}\), \(z = z_{3}\) is defined by parts of the sides of the triangle formed by \(z_{1}\), \(z_{2}\), \(z_{3}\), and the parts exterior to the triangle of three small circles with their centres at those points. Show that when \(z\) describes the contour the increment of \[\operatorname{am} \left(\frac{1}{z - z_{1}} + \frac{1}{z - z_{2}} + \frac{1}{z - z_{3}}\right)\] is equal to \(-2\pi\).

8. Prove that a closed oval path which surrounds all the roots of a cubic equation \(f(z) = 0\) also surrounds those of the derived equation \(f'(z) = 0\). [Use the equation \[f'(z) = f(z) \left( \frac{1}{z - z_{1}} + \frac{1}{z - z_{2}} + \frac{1}{z - z_{3}} \right),\] where \(z_{1}\), \(z_{2}\), \(z_{3}\) are the roots of \(f(z) = 0\), and the result of Ex. 7.]

9. Show that the roots of \(f'(z) = 0\) are the foci of the ellipse which touches the sides of the triangle \((z_{1}, z_{2}, z_{3})\) at their middle points. [For a proof see Cesàro’s Elementares Lehrbuch der algebraischen Analysis, p. 352.]

10. Extend the result of Ex. 8 to equations of any degree.

11. If \(f(z)\) and \(\phi(z)\) are two polynomials in \(z\), and \(\gamma\) is a contour which does not pass through any root of \(f(z)\), and \(|\phi(z)| < |f(z)|\) at all points on \(\gamma\), then the numbers of the roots of the equations \[f(z) = 0,\quad f(z) + \phi(z) = 0\] which lie inside \(\gamma\) are the same.

12. Show that the equations \[e^{z} = az,\quad e^{z} = az^{2},\quad e^{z} = az^{3},\] where \(a > e\), have respectively (i) one positive root, (ii) one positive and one negative root, and (iii) one positive and two complex roots within the circle \(|z| = 1\).
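The numerical hypothesis in the particular case of Ex. 2 can be checked in exact arithmetic. The sketch below only verifies that \(R = 2\frac{1}{67}\) satisfies the stated inequality for \(z^{5} - 13z - 7\), and that \(R = 2\) does not; it is not a solution of the exercise, and the function name `bound_sum` is hypothetical.

```python
from fractions import Fraction

def bound_sum(R, a):
    """|a_1|/R + |a_2|/R^2 + ... + |a_n|/R^n for the list a = [a_1, ..., a_n]."""
    return sum(abs(a_k) / R ** k for k, a_k in enumerate(a, start=1))

a = [0, 0, 0, -13, -7]                    # z^5 - 13z - 7: a_4 = -13, a_5 = -7
at_R = bound_sum(Fraction(135, 67), a)    # R = 2 + 1/67, kept as an exact fraction
at_2 = bound_sum(Fraction(2), a)          # R = 2 for comparison
print(at_R < 1, at_2 < 1)
```

The sum at \(R = 2\) is \(\frac{33}{32}\), which just fails the test, so the slightly larger value of \(R\) is needed for the inequality to apply.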


  1. It is here that we assume that \(\Gamma\) does not pass through the origin.↩︎
  2. There is no difficulty in giving a definite rule for the construction of these parallels: the most obvious course is to draw all the lines \(x = k\delta_{1}\), \(y = k\delta_{1}\), where \(k\) is an integer positive or negative.↩︎
  3. We may, e.g., take \(\delta_{m} = \delta_{1}/2^{m-1}\).↩︎
