156. Definite Integrals and Areas.
It will be remembered that, in Ch. VI, § 145, we assumed that, if \(f(x)\) is a continuous function of \(x\), and \(PQ\) is the graph of \(y = f(x)\), then the region \(PpqQ\) shown in Fig. 47 has associated with it a definite number which we call its area. It is clear that, if we denote \(Op\) and \(Oq\) by \(a\) and \(x\), and allow \(x\) to vary, this area is a function of \(x\), which we denote by \(F(x)\).
Making this assumption, we proved in § 145 that \(F'(x) = f(x)\), and we showed how this result might be used in the calculation of the areas of particular curves. But we have still to justify the fundamental assumption that there is such a number as the area \(F(x)\).
We know indeed what is meant by the area of a rectangle, and that it is measured by the product of its sides. Also the properties of triangles, parallelograms, and polygons proved by Euclid enable us to attach a definite meaning to the areas of such figures. But nothing which we know so far provides us with a direct definition of the area of a figure bounded by curved lines. We shall now show how to give a definition of \(F(x)\) which will enable us to prove its existence.[1]
Let us suppose \(f(x)\) continuous throughout the interval \({[a, b]}\), and let us divide up the interval into a number of sub-intervals by means of the points of division \(x_{0}\), \(x_{1}\), \(x_{2}\), …, \(x_{n}\), where \[a = x_{0} < x_{1} < \dots < x_{n-1} < x_{n} = b.\] Further, let us denote by \(\delta_{\nu}\) the interval \({[x_{\nu}, x_{\nu+1}]}\) (and also, where no confusion can arise, its length \(x_{\nu+1} - x_{\nu}\)), and by \(m_{\nu}\) the lower bound (§ 102) of \(f(x)\) in \(\delta_{\nu}\), and let us write \[s = m_{0}\delta_{0} + m_{1}\delta_{1} + \dots + m_{n-1}\delta_{n-1} = \sum m_{\nu}\delta_{\nu},\] say.
It is evident that, if \(M\) is the upper bound of \(f(x)\) in \({[a, b]}\), then \(s \leq M(b - a)\). The aggregate of values of \(s\) is therefore, in the language of § 80, bounded above, and possesses an upper bound which we will denote by \(j\). No value of \(s\) exceeds \(j\), but there are values of \(s\) which exceed any number less than \(j\).
In the same way, if \(M_{\nu}\) is the upper bound of \(f(x)\) in \(\delta_{\nu}\), we can define the sum \[S = \sum M_{\nu}\delta_{\nu}.\]
It is evident that, if \(m\) is the lower bound of \(f(x)\) in \({[a, b]}\), then \(S \geq m(b - a)\). The aggregate of values of \(S\) is therefore bounded below, and possesses a lower bound which we will denote by \(J\). No value of \(S\) is less than \(J\), but there are values of \(S\) less than any number greater than \(J\).
It will help to make clear the significance of the sums \(s\) and \(S\) if we observe that, in the simple case in which \(f(x)\) increases steadily from \(x = a\) to \(x = b\), \(m_{\nu}\) is \(f(x_{\nu})\) and \(M_{\nu}\) is \(f(x_{\nu+1})\). In this case \(s\) is the total area of the rectangles shaded in Fig. 48, and \(S\) is the area bounded by a thick line. In general \(s\) and \(S\) will still be areas, composed of rectangles, respectively included in and including the curvilinear region whose area we are trying to define.
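The monotone case just described is easy to compute directly. The following is only a minimal sketch: the helper names `lower_upper_sums` and `uniform_partition`, and the choice \(f(x) = x^{2}\) on \([0, 1]\) with four equal sub-intervals, are illustrative assumptions and not part of the text. It relies on the fact stated above that, for a steadily increasing function, \(m_{\nu} = f(x_{\nu})\) and \(M_{\nu} = f(x_{\nu+1})\).

```python
def lower_upper_sums(f, points):
    """Return (s, S) for a steadily increasing f on the given points of division."""
    s = 0.0
    S = 0.0
    for left, right in zip(points, points[1:]):
        width = right - left
        s += f(left) * width    # m_nu = f(x_nu) when f increases
        S += f(right) * width   # M_nu = f(x_{nu+1}) when f increases
    return s, S


def uniform_partition(a, b, n):
    """Points of division x_0, ..., x_n cutting [a, b] into n equal parts."""
    return [a + (b - a) * k / n for k in range(n + 1)]


def square(x):
    return x * x


s, S = lower_upper_sums(square, uniform_partition(0.0, 1.0, 4))
print(s, S)  # 0.21875 0.46875
```

The two values enclose \(\tfrac{1}{3}\), the area which the sums are intended to approximate from below and from above.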
We shall now show that no sum such as \(s\) can exceed any sum such as \(S\). Let \(s\), \(S\) be the sums corresponding to one mode of subdivision, and \(s'\), \(S'\) those corresponding to another. We have to show that \(s \leq S'\) and \(s' \leq S\).
We can form a third mode of subdivision by taking as dividing points all points which are such for either \(s\), \(S\) or \(s'\), \(S'\). Let \(\mathbf{s}\), \(\mathbf{S}\) be the sums corresponding to this third mode of subdivision. Then it is easy to see that \[\mathbf{s} \geq s,\quad \mathbf{s} \geq s',\quad \mathbf{S} \leq S,\quad \mathbf{S} \leq S'. \tag{1}\] For example, \(\mathbf{s}\) differs from \(s\) in that at least one interval \(\delta_{\nu}\) which occurs in \(s\) is divided into a number of smaller intervals \[\delta_{\nu, 1},\ \delta_{\nu, 2},\ \dots,\ \delta_{\nu, p},\] so that a term \(m_{\nu}\delta_{\nu}\) of \(s\) is replaced in \(\mathbf{s}\) by a sum \[m_{\nu, 1}\delta_{\nu, 1} + m_{\nu, 2}\delta_{\nu, 2} + \dots + m_{\nu, p}\delta_{\nu, p},\] where \(m_{\nu, 1}\), \(m_{\nu, 2}\), … are the lower bounds of \(f(x)\) in \(\delta_{\nu, 1}\), \(\delta_{\nu, 2}\), …. But evidently \(m_{\nu, 1} \geq m_{\nu}\), \(m_{\nu, 2} \geq m_{\nu}\), …, so that the sum just written is not less than \(m_{\nu}\delta_{\nu}\). Hence \(\mathbf{s} \geq s;\) and the other inequalities (1) can be established in the same way. But, since \(\mathbf{s} \leq \mathbf{S}\), it follows that \[s \leq \mathbf{s} \leq \mathbf{S} \leq S',\] which is what we wanted to prove.
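The inequalities (1) can be checked numerically on a small case. This is a hedged sketch only: the two modes of subdivision, the function \(f(x) = x^{2}\), and the helper name `lower_upper_sums` are assumptions chosen for illustration, and the bounds \(m_{\nu}\), \(M_{\nu}\) are taken at the end points of each interval, which is legitimate only because the chosen function increases steadily.

```python
def lower_upper_sums(f, points):
    """Return (s, S) for a steadily increasing f on the given points of division."""
    s = 0.0
    S = 0.0
    for left, right in zip(points, points[1:]):
        width = right - left
        s += f(left) * width    # lower bound attained at the left end
        S += f(right) * width   # upper bound attained at the right end
    return s, S


def square(x):
    return x * x


first = [0.0, 0.5, 1.0]             # one mode of subdivision
second = [0.0, 0.3, 0.7, 1.0]       # another mode of subdivision
combined = sorted(set(first) | set(second))  # all dividing points of either

s1, S1 = lower_upper_sums(square, first)
s2, S2 = lower_upper_sums(square, second)
s3, S3 = lower_upper_sums(square, combined)

print(s1 <= s3 and s2 <= s3)   # the combined lower sum is not less than either
print(S3 <= S1 and S3 <= S2)   # the combined upper sum is not greater than either
print(s1 <= S2 and s2 <= S1)   # hence no lower sum exceeds any upper sum
```

Each comparison prints `True` for this choice; the sketch illustrates the argument and is in no sense a proof of it.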
It also follows that \(j \leq J\). For we can find an \(s\) as near to \(j\) as we please and an \(S\) as near to \(J\) as we please,[2] and so \(j > J\) would involve the existence of an \(s\) and an \(S\) for which \(s > S\).
So far we have made no use of the fact that \(f(x)\) is continuous. We shall now show that \(j = J\), and that the sums \(s\), \(S\) tend to the limit \(J\) when the points of division \(x_{\nu}\) are multiplied indefinitely in such a way that all the intervals \(\delta_{\nu}\) tend to zero. More precisely, we shall show that, given any positive number \(\epsilon\), it is possible to find \(\delta\) so that \[0 \leq J - s < \epsilon,\quad 0 \leq S - J < \epsilon\] whenever \(\delta_{\nu} < \delta\) for all values of \(\nu\).
There is, by Theorem II of § 106, a number \(\delta\) such that \[M_{\nu} - m_{\nu} < \epsilon/(b - a)\] for every \(\nu\), whenever every \(\delta_{\nu}\) is less than \(\delta\). Hence, since the lengths \(\delta_{\nu}\) have the sum \(b - a\), \[S - s = \sum (M_{\nu} - m_{\nu})\, \delta_{\nu} < \epsilon.\] But \[S - s = (S - J) + (J - j) + (j - s);\] and all the three terms on the right-hand side are non-negative, and therefore all less than \(\epsilon\). As \(J - j\) is a constant, and is less than every positive number \(\epsilon\), it must be zero. Hence \(j = J\) and \(0 \leq j - s < \epsilon\), \(0 \leq S - J < \epsilon\), as was to be proved.
We define the area of \(PpqQ\) as being the common limit of \(s\) and \(S\), that is to say \(J\). It is easy to give a more general form to this definition. Consider the sum \[\sigma = \sum f_{\nu}\delta_{\nu}\] where \(f_{\nu}\) denotes the value of \(f(x)\) at any point in \(\delta_{\nu}\). Then \(\sigma\) plainly lies between \(s\) and \(S\), and so tends to the limit \(J\) when the intervals \(\delta_{\nu}\) tend to zero. We may therefore define the area as the limit of \(\sigma\).
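The behaviour of \(s\), \(\sigma\) and \(S\) as the intervals shrink can be observed numerically. The sketch below makes several assumptions that are not in the text: equal sub-intervals, the increasing function \(f(x) = x^{2}\) on \([0, 1]\), the middle point of each interval as the point at which \(f_{\nu}\) is taken, and the helper name `sums_for_increasing`.

```python
def sums_for_increasing(f, a, b, n):
    """Return (s, sigma, S) for an increasing f and n equal sub-intervals of [a, b]."""
    h = (b - a) / n
    s = 0.0
    sigma = 0.0
    S = 0.0
    for k in range(n):
        left = a + k * h
        right = a + (k + 1) * h
        s += f(left) * h                      # lower sum
        sigma += f(0.5 * (left + right)) * h  # value of f at a point inside the interval
        S += f(right) * h                     # upper sum
    return s, sigma, S


def square(x):
    return x * x


for n in (10, 100, 1000):
    s, sigma, S = sums_for_increasing(square, 0.0, 1.0, n)
    print(n, s, sigma, S, S - s)
```

For this function the difference \(S - s\) is \(\{f(b) - f(a)\}(b - a)/n\), so it tends to zero as \(n\) increases, and \(\sigma\) remains between \(s\) and \(S\) throughout.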
157. The definite integral.
Let us now suppose that \(f(x)\) is a continuous function, so that the region bounded by the curve \(y = f(x)\), the ordinates \(x = a\) and \(x = b\), and the axis of \(x\), has a definite area. We proved in Ch. VI, § 145, that if \(F(x)\) is an ‘integral function’ of \(f(x)\), i.e. if \[F'(x) = f(x),\quad F(x) = \int f(x)\, dx,\] then the area in question is \(F(b) - F(a)\).
As it is not always practicable actually to determine the form of \(F(x)\), it is convenient to have a formula which represents the area \(PpqQ\) and contains no explicit reference to \(F(x)\). We shall write \[(PpqQ) = \int_{a}^{b} f(x)\, dx.\]
The expression on the right-hand side of this equation may then be regarded as being defined in either of two ways. We may regard it as simply an abbreviation for \(F(b) - F(a)\), where \(F(x)\) is some integral function of \(f(x)\), whether an actual formula expressing it is known or not; or we may regard it as the value of the area \(PpqQ\), as directly defined in § 156.
The number \[\int_{a}^{b} f(x)\, dx\] is called a definite integral; \(a\) and \(b\) are called its lower and upper limits; \(f(x)\) is called the subject of integration or integrand; and the interval \({[a, b]}\) the range of integration. The definite integral depends on \(a\) and \(b\) and the form of the function \(f(x)\) only, and is not a function of \(x\). On the other hand the integral function \[F(x) = \int f(x)\, dx\] is sometimes called the indefinite integral of \(f(x)\).
The distinction between the definite and the indefinite integral is merely one of point of view. The definite integral \(\int_{a}^{b} f(x)\, dx = F(b) - F(a)\) is a function of \(b\), and may be regarded as a particular integral function of \(f(b)\). On the other hand the indefinite integral \(F(x)\) can always be expressed by means of a definite integral, since \[F(x) = F(a) + \int_{a}^{x} f(t)\, dt.\]
But when we are considering ‘indefinite integrals’ or ‘integral functions’ we are usually thinking of a relation between two functions, in virtue of which one is the derivative of the other. And when we are considering a ‘definite integral’ we are not as a rule concerned with any possible variation of the limits. Usually the limits are constants such as \(0\) and \(1\); and \[\int_{0}^{1} f(x)\, dx = F(1) - F(0)\] is not a function at all, but a mere number.
It should be observed that the integral \(\int_{a}^{x} f(t)\, dt\), having a differential coefficient \(f(x)\), is a fortiori a continuous function of \(x\).
Since \(1/x\) is continuous for all positive values of \(x\), the investigations of the preceding paragraphs supply us with a proof of the actual existence of the function \(\log x\), which we agreed to assume provisionally in § 128.
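As a numerical illustration, the logarithm may be approximated as an area. The sketch assumes the definition \(\log x = \int_{1}^{x} dt/t\), to which the passage alludes without restating it; the function name `log_by_area` and the use of middle points of equal sub-intervals are likewise assumptions. The library value `math.log` is used only for comparison.

```python
import math


def log_by_area(x, n=10_000):
    """Approximate the integral of 1/t from 1 to x by a sum over n equal parts."""
    if x <= 0:
        raise ValueError("x must be positive")
    h = (x - 1.0) / n  # signed width of each part; negative when x < 1
    # The middle point of the k-th part is 1 + (k + 0.5) * h.
    return sum(h / (1.0 + (k + 0.5) * h) for k in range(n))


for x in (0.5, 2.0, 10.0):
    print(x, log_by_area(x), math.log(x))
```

When \(x < 1\) the signed width makes the sum negative, which agrees with the convention for interchanged limits adopted in § 160.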
158. Area of a sector of a circle. The circular functions.
The theory of the trigonometrical functions \(\cos x\), \(\sin x\), etc., as usually presented in text-books of elementary trigonometry, rests on an unproved assumption. An angle is the configuration formed by two straight lines \(OA\), \(OP\); there is no particular difficulty in translating this ‘geometrical’ definition into purely analytical terms. The assumption comes at the next stage, when it is assumed that angles are capable of numerical measurement, that is to say that there is a real number \(x\) associated with the configuration, just as there is a real number associated with the region \(PpqQ\) of Fig. 47. This point once admitted, \(\cos x\) and \(\sin x\) may be defined in the ordinary way, and there is no further difficulty of principle in the elaboration of the theory. The whole difficulty lies in the question, what is the \(x\) which occurs in \(\cos x\) and \(\sin x\)? To answer this question, we must define the measure of an angle, and we are now in a position to do so. The most natural definition would be this: suppose that \(AP\) is an arc of a circle whose centre is \(O\) and whose radius is unity, so that \(OA = OP = 1\). Then \(x\), the measure of the angle, is the length of the arc \(AP\). This is, in substance, the definition adopted in the text-books, in the accounts which they give of the theory of ‘circular measure’. It has however, for our present purpose, a fatal defect; for we have not proved that the arc of a curve, even of a circle, possesses a length. The notion of the length of a curve is capable of precise mathematical analysis just as much as that of an area; but the analysis, although of the same general character as that of the preceding sections, is decidedly more difficult, and it is impossible that we should give any general treatment of the subject here.
We must therefore found our definition on the notion not of length but of area. We define the measure of the angle \(AOP\) as twice the area of the sector \(AOP\) of the unit circle.
Suppose, in particular, that \(OA\) is \(y = 0\) and that \(OP\) is \(y = mx\), where \(m > 0\). The area is a function of \(m\), which we may denote by \(\phi(m)\). If we write \(\mu\) for \((1 + m^{2})^{-\frac{1}{2}}\), \(P\) is the point \((\mu, m\mu)\), and we have \[\phi(m) = \tfrac{1}{2} m\mu^{2} + \int_{\mu}^{1} \sqrt{1 - x^{2}}\, dx.\] Differentiating with respect to \(m\), we find \[\phi'(m) = \frac{1}{2(1 + m^{2})},\quad \phi(m) = \tfrac{1}{2} \int_{0}^{m} \frac{dt}{1 + t^{2}}.\] Thus the analytical equivalent of our definition would be to define \(\arctan m\) by the equation \[\arctan m = \int_{0}^{m} \frac{dt}{1 + t^{2}};\] and the whole theory of the circular functions could be worked out from this starting point, just as the theory of the logarithm is worked out from a similar definition in Ch. IX. See Appendix III.
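The two expressions for the sector can be compared numerically. In the sketch below the helper names `midpoint_integral`, `sector_area` and `arctan_by_integral` are hypothetical, and the integrals are approximated by sums taken at the middle points of equal sub-intervals; `math.atan` serves only as an independent check.

```python
import math


def midpoint_integral(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] by a sum at middle points."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def sector_area(m):
    """phi(m): the triangle under OP plus the area under the circle from mu to 1."""
    mu = 1.0 / math.sqrt(1.0 + m * m)
    triangle = 0.5 * m * mu * mu
    cap = midpoint_integral(lambda x: math.sqrt(1.0 - x * x), mu, 1.0)
    return triangle + cap


def arctan_by_integral(m):
    """The integral of dt / (1 + t^2) from 0 to m."""
    return midpoint_integral(lambda t: 1.0 / (1.0 + t * t), 0.0, m)


for m in (0.5, 1.0, 2.0):
    print(m, 2.0 * sector_area(m), arctan_by_integral(m), math.atan(m))
```

For each \(m\) the three printed values should agree to several decimal places, as the equation \(\arctan m = 2\phi(m)\) requires.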
159. Calculation of the definite integral from its definition as the limit of a sum.
In a few cases we can evaluate a definite integral by direct calculation, starting from the definitions of §§ 156 and 157. As a rule it is much simpler to use the indefinite integral, but the reader will find it instructive to work through a few examples.
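As one brief worked illustration (a sketch supplied here, with the function \(f(x) = x\), supposed for simplicity that \(0 \leq a < b\), and the division of \({[a, b]}\) into \(n\) equal parts of length \(h = (b - a)/n\) all chosen for convenience), take \(x_{\nu} = a + \nu h\). Since \(f(x)\) increases steadily, \(m_{\nu} = x_{\nu}\) and \(M_{\nu} = x_{\nu+1}\), so that \[s = \sum_{\nu=0}^{n-1} (a + \nu h)h = nah + \tfrac{1}{2}n(n - 1)h^{2} = a(b - a) + \tfrac{1}{2}(b - a)^{2}\left(1 - \frac{1}{n}\right),\] \[S = \sum_{\nu=0}^{n-1} \{a + (\nu + 1)h\}h = s + nh^{2} = s + \frac{(b - a)^{2}}{n}.\] When \(n \to \infty\) both sums tend to \[a(b - a) + \tfrac{1}{2}(b - a)^{2} = \tfrac{1}{2}(b^{2} - a^{2}),\] which agrees with \(F(b) - F(a)\) when \(F(x) = \tfrac{1}{2}x^{2}\).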
160. General properties of the definite integral.
The definite integral possesses the important properties expressed by the following equations.[3]
(1) \[\int_{a}^{b} f(x)\, dx = -\int_{b}^{a} f(x)\, dx.\]
This follows at once from the definition of the integral by means of the integral function \(F(x)\), since \(F(b) - F(a) = -\{F(a) - F(b)\}\). It should be observed that in the direct definition it was presupposed that the upper limit is greater than the lower; thus this method of definition does not apply to the integral \(\int_{b}^{a} f(x)\, dx\) when \(a < b\). If we adopt this definition as fundamental we must extend it to such cases by regarding the equation (1) as a definition of its right-hand side.
(2) \[\int_{a}^{a} f(x)\, dx = 0.\]
(3) \[\int_{a}^{b} f(x)\, dx + \int_{b}^{c} f(x)\, dx = \int_{a}^{c} f(x)\, dx.\]
(4) \[\int_{a}^{b} kf(x)\, dx = k \int_{a}^{b} f(x)\, dx.\]
(5) \[\int_{a}^{b} \{f(x) + \phi(x)\}\, dx = \int_{a}^{b} f(x)\, dx + \int_{a}^{b} \phi(x)\, dx.\]
The reader will find it an instructive exercise to write out formal proofs of these properties, in each case giving a proof starting from (\(\alpha\)) the definition by means of the integral function and (\(\beta\)) the direct definition.
The following theorems are also important.
(6) If \(f(x) \geq 0\) when \(a \leq x \leq b\), then \(\int_{a}^{b}f(x)\, dx \geq 0\).
We have only to observe that the sum \(s\) of § 156 cannot be negative. It will be shown later (Misc. Ex. 41) that the value of the integral cannot be zero unless \(f(x)\) is always equal to zero: this may also be deduced from the second corollary of § 121.
(7) If \(H \leq f(x) \leq K\) when \(a \leq x \leq b\), then \[H(b - a) \leq \int_{a}^{b} f(x)\, dx \leq K(b - a).\]
This follows at once if we apply (6) to \(f(x) - H\) and \(K - f(x)\).
(8) \[\int_{a}^{b} f(x)\, dx = (b - a)f(\xi),\]
where \(\xi\) lies between \(a\) and \(b\).
This follows from (7). For we can take \(H\) to be the least and \(K\) the greatest value of \(f(x)\) in \({[a, b]}\). Then the integral is equal to \(\eta(b - a)\), where \(\eta\) lies between \(H\) and \(K\). But, since \(f(x)\) is continuous, there must be a value of \(\xi\) for which \(f(\xi) = \eta\) (§ 100).
If \(F(x)\) is the integral function, we can write the result of (8) in the form \[F(b) - F(a) = (b - a)F'(\xi),\] so that (8) appears now to be only another way of stating the Mean Value Theorem of § 125. We may call (8) the First Mean Value Theorem for Integrals.
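The number \(\xi\) of (8) can be located numerically in a simple case. The following sketch assumes an increasing function, so that a search by repeated bisection applies; the names `midpoint_integral` and `mean_value_point`, and the example \(f(x) = x^{2}\) on \([0, 1]\), are illustrative assumptions only.

```python
import math


def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] by a sum at middle points."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def mean_value_point(f, a, b, steps=60):
    """Find xi in [a, b] with f(xi) = eta, for an increasing f, by bisection."""
    eta = midpoint_integral(f, a, b) / (b - a)  # eta lies between H and K
    lo, hi = a, b
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if f(mid) < eta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


xi = mean_value_point(lambda x: x * x, 0.0, 1.0)
print(xi, 1.0 / math.sqrt(3.0))
```

Here the integral is \(\tfrac{1}{3}\), so that \(\eta = \tfrac{1}{3}\) and \(\xi = 1/\sqrt{3}\); the two printed values should agree closely.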
(9) The Generalised Mean Value Theorem for integrals. If \(\phi(x)\) is positive, and \(H\) and \(K\) are defined as in (7), then \[H\int_{a}^{b} \phi(x)\, dx \leq \int_{a}^{b} f(x)\phi(x)\, dx \leq K\int_{a}^{b} \phi(x)\, dx;\] and \[\int_{a}^{b} f(x)\phi(x)\, dx = f(\xi) \int_{a}^{b} \phi(x)\, dx,\] where \(\xi\) is defined as in (8).
This follows at once by applying Theorem (6) to the integrals \[\int_{a}^{b} \{f(x) - H\}\phi(x)\, dx,\quad \int_{a}^{b} \{K - f(x)\}\phi(x)\, dx.\] The reader should formulate for himself the corresponding result which holds when \(\phi(x)\) is always negative.
(10) The Fundamental Theorem of the Integral Calculus. The function \[F(x) = \int_{a}^{x} f(t)\, dt\] has a derivative equal to \(f(x)\).
This has been proved already in § 145, but it is convenient to restate the result here as a formal theorem. It follows as a corollary, as was pointed out in § 157, that \(F(x)\) is a continuous function of \(x\).
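Theorem (10) may be illustrated numerically, though of course not proved, by comparing a difference quotient of \(F(x)\) with \(f(x)\). The sketch assumes \(f(t) = \cos t\) and \(a = 0\); the helper names, the symmetric difference quotient, and the step \(h = 10^{-3}\) are choices made here for illustration.

```python
import math


def midpoint_integral(f, a, b, n=20_000):
    """Approximate the integral of f over [a, b] by a sum at middle points."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def F(x):
    """F(x) = integral of cos t from 0 to x, computed from sums only."""
    return midpoint_integral(math.cos, 0.0, x)


def central_difference(g, x, h=1e-3):
    """Symmetric difference quotient approximating g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)


x = 1.0
print(central_difference(F, x), math.cos(x))
```

The two printed numbers should differ only by a small quantity, which arises from the finite step and the finite number of sub-intervals.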
161. Integration by parts and by substitution.
It follows from § 138 that \[\int_{a}^{b} f(x)\phi'(x)\, dx = f(b)\phi(b) - f(a)\phi(a) - \int_{a}^{b} f'(x)\phi(x)\, dx.\] This formula is known as the formula for integration by parts.
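The formula can be checked numerically in a particular case. In this sketch \(f(x) = x\), \(\phi(x) = \sin x\), \(a = 0\), \(b = \tfrac{1}{2}\pi\) are assumptions chosen for illustration, and `midpoint_integral` is a hypothetical helper which approximates each integral by a sum.

```python
import math


def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] by a sum at middle points."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


a, b = 0.0, math.pi / 2.0


def f(x):
    return x


def f_prime(x):
    return 1.0


phi = math.sin        # phi(x) = sin x
phi_prime = math.cos  # phi'(x) = cos x

# Left-hand side: the integral of f(x) phi'(x).
lhs = midpoint_integral(lambda x: f(x) * phi_prime(x), a, b)

# Right-hand side: the integrated part minus the integral of f'(x) phi(x).
rhs = f(b) * phi(b) - f(a) * phi(a) - midpoint_integral(
    lambda x: f_prime(x) * phi(x), a, b
)

print(lhs, rhs, math.pi / 2.0 - 1.0)
```

All three printed values should agree closely, the exact value being \(\tfrac{1}{2}\pi - 1\).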
Again, we know (§ 133) that if \(F(t)\) is the integral function of \(f(t)\), then \[\int f\{\phi(x)\}\phi'(x)\, dx = F\{\phi(x)\}.\] Hence, if \(\phi(a) = c\), \(\phi(b) = d\), we have \[\int_{c}^{d} f(t)\, dt = F(d) - F(c) = F\{\phi(b)\} - F\{\phi(a)\} = \int_{a}^{b} f\{\phi(x)\}\phi'(x)\, dx;\] which is the formula for the transformation of a definite integral by substitution.
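A numerical check of the formula for substitution, under assumptions made only for illustration: \(f(t) = \cos t\), \(\phi(x) = x^{2}\), \(a = 1\), \(b = 2\), so that \(c = 1\) and \(d = 4\). The helper `midpoint_integral` is hypothetical, as in the other sketches.

```python
import math


def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] by a sum at middle points."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def phi(x):
    return x * x


def phi_prime(x):
    return 2.0 * x


a, b = 1.0, 2.0
c, d = phi(a), phi(b)  # c = 1, d = 4

# Integral of f(t) over [c, d].
in_t = midpoint_integral(math.cos, c, d)

# Integral of f{phi(x)} phi'(x) over [a, b].
in_x = midpoint_integral(lambda x: math.cos(phi(x)) * phi_prime(x), a, b)

print(in_t, in_x, math.sin(d) - math.sin(c))
```

The three values should agree closely, the last being \(F(d) - F(c)\) with \(F(t) = \sin t\).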
The formulae for integration by parts and for transformation often enable us to evaluate a definite integral without the labour of actually finding the integral function of the subject of integration, and sometimes even when the integral function cannot be found. Some instances of this will be found in the following examples. That the value of a definite integral may sometimes be found without a knowledge of the integral function is only to be expected, for the fact that we cannot determine the general form of a function \(F(x)\) in no way precludes the possibility that we may be able to determine the difference \(F(b) - F(a)\) between two of its particular values. But as a rule this can only be effected by the use of more advanced methods than are at present at our disposal.
[1] The argument which follows is modelled on that given in Goursat’s Cours d’Analyse (second edition), vol. i, pp. 171 et seq.; but Goursat’s treatment is much more general.
[2] The \(s\) and the \(S\) do not in general correspond to the same mode of subdivision.
[3] All functions mentioned in these equations are of course continuous, as the definite integral has been defined for continuous functions only.