An important application of derivatives is solving optimization problems. We have learned how to determine the maximum and minimum values of a function of a single variable; in this section, we learn how to determine the extreme values of functions of two or more variables. Finding the extreme values of a function of several variables has features similar to the single-variable case, but it is often more complicated. So first let’s review what we already know about the maxima and minima of functions of one variable.

Review of Maxima and Minima of Single Variable Functions

Consider a function \(y=f(x)\):

  • We say \(f\) has a maximum (respectively minimum) if there exists \(x_0\) in its domain such that for all \(x\) in the domain of \(f\), \(f(x)\leq f(x_0)\) [respectively \(f(x)\geq f(x_0)\)]. \(f(x_0)\) is called the maximum value (respectively the minimum value) of \(f\). The word “extremum” refers to either a maximum or a minimum.
  • We say \(f\) has a relative maximum (respectively relative minimum) at \(x_0\) if \(f(x)\leq f(x_0)\) [respectively \(f(x)\geq f(x_0)\)] for all \(x\) in the domain of \(f\) that are sufficiently close to \(x_0\).
  • If \(f\) is continuous on a closed interval \([a,b]\), it takes on both its absolute maximum value and its absolute minimum value on that interval. If the interval is not closed or if \(f\) is not continuous on that interval, there is no guarantee that the function takes on its extreme values on that interval.
  • If \(f(x_0)\) is a relative or absolute extreme value of \(f\), then the point \(x_0\) is one of three kinds of points:
    1. the point \(x_0\) is a stationary point; that is, \(f'(x_0)=0\),
    2. the point \(x_0\) is a rough point; that is, \(f'(x_0)\) does not exist, or
    3. the point \(x_0\) is one of the endpoints of the domain of \(f\).
  • If \(f^{\prime\prime}(x)>0\) for every \(x\) in an interval \(I\), then the graph of \(f\) is concave up on \(I\).
  • If \(f^{\prime\prime}(x)<0\) for every \(x\) in an interval \(I\), then the graph of \(f\) is concave down on \(I\).
  • If \(f'(x_0)=0\) and \(f^{\prime\prime}(x_0)>0\), then \(f\) has a local minimum at \(x_0\).
  • If \(f'(x_0)=0\) and \(f^{\prime\prime}(x_0)<0\), then \(f\) has a local maximum at \(x_0\).
  • If \(f'(x_0)=0\) and \(f^{\prime\prime}(x_0)=0\), more information is required to conclude whether or not \(f\) has a local extremum at \(x_0\). In fact, the additional information is the behavior of higher order derivatives. The complete theorem is as follows. Suppose
    \(f'(x_0)=\cdots =f^{(n-1)}(x_0)=0\) and \(f^{(n)}(x_0)\neq 0\).


    1. If \(n\) is even and \(f^{(n)}(x_0)>0\), then \(f\) has a local minimum at \(x_0\).
    2. If \(n\) is even and \(f^{(n)}(x_0)<0\), then \(f\) has a local maximum at \(x_0\).
    3. If \(n\) is odd, then \(f\) does not have an extremum at \(x_0\).
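
As a quick computational check of the higher-order test above, here is a minimal sketch in Python using SymPy (the library choice and the helper name `classify` are ours, not part of the text). It differentiates repeatedly until it finds the first nonvanishing derivative at the point and then applies the rule above.

```python
import sympy as sp

x = sp.symbols('x')

def classify(f, x0, max_order=8):
    """Classify a stationary point x0 of f via the first nonzero higher derivative."""
    for n in range(2, max_order + 1):
        dn = sp.diff(f, x, n).subs(x, x0)
        if dn != 0:
            if n % 2 == 1:
                return "no extremum (first nonzero derivative has odd order)"
            return "local minimum" if dn > 0 else "local maximum"
    return "inconclusive up to the tested order"

# x^4: f'(0) = f''(0) = f'''(0) = 0 but f''''(0) = 24 > 0, so a local minimum.
print(classify(x**4, 0))   # local minimum
print(classify(x**3, 0))   # no extremum (first nonzero derivative has odd order)
print(classify(-x**6, 0))  # local maximum
```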

Definitions of Maxima and Minima for Multivariable Functions

Now we are ready to talk about finding maxima and minima for functions of two or more variables.

Consider a function \(z=f(x,y)\) defined on a set \(U\) in the \(xy\)-plane. We say \(f\) has a maximum at the point \((x_0,y_0)\) of its domain \(U\) if \[f(x,y)\leq f(x_0,y_0)\] for all \((x,y)\) in \(U\). An absolute maximum corresponds to a highest point on the surface \(z=f(x,y)\). We say \(f\) has a relative maximum (or local maximum) at \((x_0,y_0)\) if \[f(x,y)\leq f(x_0,y_0)\] for all \((x,y)\) in \(U\) that lie in a sufficiently small neighborhood of \((x_0,y_0)\). The value \(f(x_0,y_0)\) at a relative maximum does not have to be the greatest value of \(f(x,y)\) on all of \(U\); it is only the greatest value of \(f(x,y)\) if we restrict ourselves to points sufficiently close to \((x_0,y_0)\). The definitions of minimum (or more specifically absolute minimum) and relative minimum are analogous. Consider Fig. 1.

Figure 1.

In a similar way we can define the maximum and minimum points for functions of three or more variables.

Definition 1. Let \(f:U\subseteq \mathbb{R}^n\to \mathbb{R}\).

We say \(f\) has a maximum (or more specifically an absolute maximum) at the point \(\mathbf{x}_0\in U\) if \[f(\mathbf{x}_0)\geq f(\mathbf{x}),\] for all \(\mathbf{x}\in U\).

We say \(f\) has a relative maximum at the point \(\mathbf{x}_0\in U\), if there is a neighborhood \(V\) of \(\mathbf{x}_0\) such that for every
\(\mathbf{x}\in V\cap U\)
\[f(\mathbf{x}_0)\geq f(\mathbf{x}).\]

We say \(f\) has a minimum (or more specifically an absolute minimum) at the point \(\mathbf{x}_0\in U\) if \[f(\mathbf{x}_0)\leq f(\mathbf{x}),\] for all points \(\mathbf{x}\in U\).

We say \(f\) has a relative minimum (or local minimum) at the point \(\mathbf{x}_0\in U\) if there is a neighborhood \(V\) of \(\mathbf{x}_0\) such that for every \(\mathbf{x}\in V\cap U\)
\[f(\mathbf{x}_0)\leq f(\mathbf{x}).\]

A point which is either a (relative or absolute) maximum or minimum is called a (relative or absolute) extremum.

  • Every absolute maximum (respectively minimum) is also a relative maximum (respectively minimum).


Bounded and Unbounded Sets

A set in \(\mathbb{R}\) is bounded if it is contained in an interval of finite length, and is unbounded otherwise. A set in \(\mathbb{R}^2\) is bounded if the entire set can be contained within a disk of finite radius, and is unbounded if no disk contains all the points of the set. Similarly, a set in \(\mathbb{R}^3\) is bounded if the entire set can be contained within a sphere of finite radius, and is unbounded otherwise. In general, a set in \(\mathbb{R}^n\) is bounded if all of its points are contained inside a ball \(|\mathbf{x}|^2=x_1^2+x_2^2+\cdots+x_n^2\leq R^2\) of finite radius \(R\), and is unbounded otherwise.

A bounded set in \(\mathbb{R}^2\)
An unbounded set (first quadrant) in \(\mathbb{R}^2\)
A bounded set in \(\mathbb{R}^3\)

The Extreme Value Theorem

The following theorem assures us that a continuous function on a closed and bounded set attains its extreme values.

In the conditional statement “if \(P\) then \(Q\)” or “\(P\) implies \(Q\)” (written as \(P \rightarrow Q\)), we say \(P\) is a sufficient condition for \(Q\) and \(Q\) is a necessary condition for \(P\). Also note that “if \(P\) then \(Q\)” is equivalent to “if \(Q\) is false, then \(P\) is false.”

Theorem 1. (Sufficient Conditions for Extreme Values) If \(U\) is a closed and bounded set, and \(f\) is continuous on \(U\), then there is a point in \(U\) at which \(f\) attains its (absolute) maximum and a point in \(U\) at which \(f\) attains its (absolute) minimum.

Finding Extrema

Calculus gives us the necessary conditions for an interior point to be a relative extremum. Let \(f\) be a function of two variables \(x\) and \(y\) and let \((x_0,y_0)\) be an interior point of the domain of \(f\). If \(f\) has a relative maximum or minimum at \((x_0,y_0)\) and if \(f_x(x_0,y_0)\) and \(f_y(x_0,y_0)\) exist, then
\[\overrightarrow{\nabla} f(x_0,y_0)=\mathbf{0} \quad \quad (\text{that is } f_x(x_0,y_0)=f_y(x_0,y_0)=0).\]

Figure 2. \(P\) is an interior point, and \(Q\) is a boundary point.

If we define a single-variable function \(F(x)=f(x,y_0)\) (see Fig. 3(a)), then \(F'(x)=f_x(x,y_0)\). If \((x_0,y_0)\) is a relative maximum point, then for all \((x,y)\) in the domain of \(f\) that are in a sufficiently small neighborhood of \((x_0,y_0)\), \(f(x_0,y_0)\geq f(x,y)\). Consequently, in that neighborhood \(F(x_0)\geq F(x)\). This means \(F\) has a relative maximum at \(x_0\). It follows from single variable calculus that \(F'(x_0)=0\); that is, \(f_x(x_0,y_0)=0\) (Fig. 3(b)). The proof that \(f_y(x_0,y_0)=0\) is analogous.

Figure 3.

At every relative extremum in the interior domain of a differentiable function \(f(x,y)\) we have
\[df=\underbrace{\frac{\partial f}{\partial x}}_{=0} dx+\underbrace{\frac{\partial f}{\partial y}}_{=0} dy=0,\]
for \((x,y)=(x_0,y_0)\) and all \(dx\) and \(dy\). Geometrically, \(df=0\) means that the tangent plane at the point \((x_0,y_0,f(x_0,y_0))\) is horizontal (or perpendicular to the \(z\)-axis). See Fig. 4.

Figure 4.

We can easily generalize this result to functions of any number of independent variables. The proof of the following theorem is essentially the same as the argument we just gave, but expressed in a different way.

Theorem 2. Let \(f:U\subseteq\mathbb{R}^n\to\mathbb{R}\) and let \(\mathbf{x}_0\) be an interior point of \(U\). If \(f\) has a relative maximum or minimum at \(\mathbf{x_0}\) and if the partial derivatives of \(f\) at \(\mathbf{x}_0\) exist, then \[\overrightarrow{\nabla} f(\mathbf{x}_0)=\mathbf{0}.\]

Show the proof

Suppose \(f\) has a relative maximum at \(\mathbf{x}_0\). We need to show \(f_{x_i}(\mathbf{x}_0)=0\) for each \(i\). By the definition of a partial derivative
\[\frac{\partial f}{\partial x_i}(\mathbf{x}_0)=D_{\hat{\mathbf{e}}_i}f(\mathbf{x}_0)=\lim_{t\to 0}\frac{f(\mathbf{x}_0+t\hat{\mathbf{e}}_i)-f(\mathbf{x}_0)}{t},\]
where \(\hat{\mathbf{e}}_i\) as usual is the unit vector all of whose components are zero, except the \(i\)-th component, which is one.

Because \(f\) has a relative maximum at \(\mathbf{x}_0\), by Definition 1 we have \[f(\mathbf{x}_0+t\hat{\mathbf{e}}_i)- f(\mathbf{x}_0)\leq 0,\] whenever \(|t|\) is small enough, so that \(\mathbf{x}_0+t\hat{\mathbf{e}}_i\) is sufficiently close to \(\mathbf{x}_0\). If \(t\to 0^+\); that is, if \(t\) approaches 0 from the right, then \(t>0\); therefore:
\[\frac{f(\mathbf{x}_0+t\hat{\mathbf{e}}_i)-f(\mathbf{x}_0)}{t}\leq 0. \tag{*}\]
If \(t\to 0^-\); that is, \(t\) approaches 0 from the left, \(t<0\); therefore:
\[\frac{f(\mathbf{x}_0+t\hat{\mathbf{e}}_i)-f(\mathbf{x}_0)}{t}\geq 0. \tag{**}\]
If \(\frac{\partial f}{\partial x_i}(\mathbf{x}_0)\) exists, the limit of the difference quotient must be consistent with both (*) and (**); that is, it must be both \(\leq 0\) and \(\geq 0\). Therefore, we must have
\(\frac{\partial f}{\partial x_i}(\mathbf{x}_0)=0.\)

The proof when \(f\) has a relative minimum at \(\mathbf{x}_0\) is very similar. \(\blacksquare\)


Definition 2. A point \(\mathbf{x}_0\) is called a stationary point of \(f\) if all first order partial derivatives of \(f\) at that point exist and are equal to 0; that is, \(\overrightarrow{\nabla} f(\mathbf{x}_0)=\mathbf{0}\).

It follows from Theorem 2 that if a function has a relative extremum at an interior point of its domain and its partial derivatives exist there, the point must be a stationary point of the function. Theorem 2 says nothing about points where the partial derivatives do not exist or about points on the boundary. That is, it is possible for a function to assume its (relative or absolute) extreme value at a point where at least one of the first partial derivatives does not exist (Fig. 5(a)) or at a boundary point (Fig. 5(b)).

(a) Maximum occurs at a rough point.

(b) Extrema occur at two boundary points.

Figure 5.

  • A point at which at least one of the partial derivatives does not exist is called a rough point. In other words, at a rough point, the gradient does not exist.
  • Stationary points and rough points together constitute the critical points of \(f\).
  • From the above discussion, we conclude that to determine the extreme values of a function, we should search for them among stationary points, rough points, and boundary points.

A function \(f\) has a relative or absolute extremum at a point \(\mathbf{x}_0\) of its domain only if \(\mathbf{x}_0\) is one of the three types of points:

  1. \(\mathbf{x}_0\) is a stationary point of \(f\); that is, \(\overrightarrow{\nabla} f(\mathbf{x}_0)=\mathbf{0}\),
  2. \(\mathbf{x}_0\) is a rough point of \(f\); that is, \(\overrightarrow{\nabla} f(\mathbf{x}_0)\) does not exist, or
  3. \(\mathbf{x}_0\) is on the boundary of the domain of \(f\).
Example 1
Find the extreme values of the function \(f(x,y)=x^2+y^2\).

Solution
First we need to find the gradient of \(f\)
\[\overrightarrow{\nabla} f(x,y)=(2x, 2y).\]

The gradient exists everywhere; therefore, there is no rough point. Both components of the gradient are zero only at \((0,0)\); that is, \((0,0)\) is the only critical point of \(f\). Because \(f(0,0)=0\) and \(f(x,y)\geq 0\) everywhere, \(f\) has an absolute minimum at the origin. The value of \(f\) increases without bound as we move away from the origin, so there is no absolute maximum.
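
For readers who want to verify such computations symbolically, here is a minimal SymPy sketch (the use of SymPy here is our own illustrative choice, not part of the example) that computes the gradient and solves \(\overrightarrow{\nabla} f=\mathbf{0}\):

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**2 + y**2

grad = [sp.diff(f, v) for v in (x, y)]        # gradient: (2x, 2y)
critical = sp.solve(grad, (x, y), dict=True)  # [{x: 0, y: 0}]
print(critical)
print(f.subs(critical[0]))                    # value at the critical point: 0
```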

Graph of \(z=x^2 +y^2\)


Example 2
Find the extreme values of the function \(f(x,y)=\sqrt{1-x^2-y^2}\).

Solution
The partial derivatives of \(f\) are \[f_x=-\frac{x}{\sqrt{1-x^2-y^2}}, \quad f_y=-\frac{y}{\sqrt{1-x^2-y^2}},\]

and they become zero at the origin. The function has a maximum at the origin because at every \((x,y)\neq (0,0)\) the quantity \(1-x^2-y^2\) under the square root is less than 1, its value at the origin.

The domain of \(f\) is \(1-x^2-y^2\geq 0\), or \(x^2+y^2\leq 1\), which is the disk of radius 1 centered at the origin. On the boundary of the domain (i.e., \(x^2+y^2=1\)), \(f\) is zero. The absolute minimum value of \(f\) occurs on the boundary because the square root is nonnegative and equals zero there.

Graph of \(z=\sqrt{1-x^2 -y^2}\)
Example 3
Find the extreme values of the function \(f(x,y)=\sqrt{x^2+y^2}\).

Solution
First we calculate the partial derivatives
\[\frac{\partial f}{\partial x}=\frac{x}{\sqrt{x^2+y^2}},\quad \frac{\partial f}{\partial y}=\frac{y}{\sqrt{x^2+y^2}}.\]

The partial derivatives are not both zero at any \((x,y)\neq (0,0)\). Because \(\frac{\partial f}{\partial x}(0,0)\) and \(\frac{\partial f}{\partial y}(0,0)\) do not exist, \((0,0)\) is a rough point. The value of \(f\) at \((0,0)\) is zero, which is the minimum possible value of the square root. Thus, \(f\) has an absolute minimum at the origin. The graph of \(f\) is a circular cone; \(f\) increases without bound as \(|(x,y)|\to\infty\) and does not have a maximum.

Graph of \(z=\sqrt{x^2 +y^2}\)


Example 4
Suppose we would like to build a distribution center to service three cities, and suppose the locations (the coordinates) of the three cities are \((a_1,b_1)\), \((a_2,b_2)\), and \((a_3,b_3)\). Locate the distribution center to minimize the cost

  1. if the cost is proportional to the sum of the squared distances from the center to the cities;
  2. if the cost is proportional to the sum of the distances from the center to the cities.
Solution
Let \((x,y)\) be the coordinates of the distribution center. The distance from the distribution center to the \(i\)-th city is \[r_i=\sqrt{(x-a_i)^2+(y-b_i)^2}, \quad i=1,2,3.\] For part (a) we wish to minimize \[f(x,y)=r_1^2+r_2^2+r_3^2,\] and for part (b), we wish to minimize \[g(x,y)=r_1+r_2+r_3.\] We will see that minimizing \(f(x,y)\) is easier than minimizing \(g(x,y)\).

(a) We have
\[\begin{align} f(x,y)&=(x-a_1)^2+(y-b_1)^2+(x-a_2)^2+(y-b_2)^2+(x-a_3)^2+(y-b_3)^2\\ &=\sum_{i=1}^3 \left[(x-a_i)^2+(y-b_i)^2\right]\end{align}\]
\[\Rightarrow \frac{\partial f}{\partial x}=2(x-a_1+x-a_2+x-a_3),\quad \frac{\partial f}{\partial y}=2(y-b_1+y-b_2+y-b_3)\]
Solving \(\frac{\partial f}{\partial x}=0\) and \(\frac{\partial f}{\partial y}=0\) gives \(x=(a_1+a_2+a_3)/3\) and \(y=(b_1+b_2+b_3)/3\). Therefore, the best location to minimize the sum of the squared distances is the centroid of the triangle formed by the cities (Fig. 6).

Figure 6.
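
The centroid answer of part (a) can be confirmed symbolically for general city coordinates; the following SymPy sketch is one way to do it (an illustrative aid, not part of the original solution):

```python
import sympy as sp

x, y = sp.symbols('x y')
a = sp.symbols('a1 a2 a3')
b = sp.symbols('b1 b2 b3')

# f = sum of squared distances from (x, y) to the three cities
f = sum((x - ai)**2 + (y - bi)**2 for ai, bi in zip(a, b))
sol = sp.solve([sp.diff(f, x), sp.diff(f, y)], (x, y))
print(sol)  # {x: a1/3 + a2/3 + a3/3, y: b1/3 + b2/3 + b3/3}, the centroid
```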

(b) We have
\[\begin{align} g(x,y)&=r_1+r_2+r_3\\ &=\sum_{i=1}^3 \sqrt{(x-a_i)^2+(y-b_i)^2}.\end{align}\]
Differentiating \(g\) and equating to zero, we obtain
\[\begin{align} \frac{\partial g}{\partial x}&=\frac{x-a_1}{r_1}+\frac{x-a_2}{r_2}+\frac{x-a_3}{r_3}=0,\\ \frac{\partial g}{\partial y}&=\frac{y-b_1}{r_1}+\frac{y-b_2}{r_2}+\frac{y-b_3}{r_3}=0.\end{align}\]
These equations are complicated to solve. However, we note that
\[\overrightarrow{\nabla} g=\overrightarrow{\nabla} r_1+\overrightarrow{\nabla} r_2+\overrightarrow{\nabla} r_3,\]
where
\[\overrightarrow{\nabla} r_i=\left(\frac{x-a_i}{r_i},\frac{y-b_i}{r_i}\right) \ \ (\text{for } i=1,2,3)\]
Also note that these gradient vectors \(\overrightarrow{\nabla} r_i\) are unit vectors, \(|\overrightarrow{\nabla} r_i|=1\). Therefore, at a relative minimum the sum of three unit vectors has to be \(\mathbf{0}\). The only way for this to happen is when the angles between them are \(360^\circ/3=120^\circ\) (as in (b) and (c) in the following figure). So the minimizer is likely the point at which the roads from the cities to the center make \(120^\circ\) angles. There are, however, other possibilities. Each function \(r_i\) has a rough point; that is, \(g\) has three rough points. The graph of \(r_i\) is a circular cone (see Example 3), shifted \(a_i\) units in the \(x\)-direction and \(b_i\) units in the \(y\)-direction. Therefore, \(r_i\) has a rough point at \((a_i,b_i)\). This means that building the distribution center in one of the cities might minimize the cost.

If the triangle formed by the cities has an angle larger than \(120^\circ\), then there is no point inside the triangle at which the roads from the distribution center to the cities make \(120^\circ\) angles. In this case, the best point is the city at the wide angle (see (c) in the following figure). Otherwise, a point inside the triangle is the solution.

(a) (b) (c)
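
Part (b) can also be attacked numerically. A classical scheme for minimizing a sum of distances is Weiszfeld's iteration, which repeatedly re-weights the cities by reciprocal distances. The sketch below (assuming NumPy; the helper name `weiszfeld` and the sample coordinates are ours) also illustrates why the rough points matter: the iteration breaks down if an iterate lands exactly on a city.

```python
import numpy as np

def weiszfeld(points, n_iter=200, eps=1e-12):
    """Approximate the minimizer of g = sum of distances (the geometric median)."""
    p = np.mean(points, axis=0)  # start at the centroid, part (a)'s answer
    for _ in range(n_iter):
        d = np.linalg.norm(points - p, axis=1)
        if np.any(d < eps):      # iterate hit a city: a rough point of g
            break
        w = 1.0 / d              # reciprocal-distance weights
        p = (points * w[:, None]).sum(axis=0) / w.sum()
    return p

cities = np.array([(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)])
print(weiszfeld(cities))  # approximates the point where the roads meet at 120 degrees
```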

Example 5
Find the extreme values of the function \(f(x,y)=-x^2-y^2+4xy+x+y\) on the triangular region \(R=\{(x,y)\mid -1\leq x\leq 0,\ 0\leq y\leq 2x+2\}\) (see Fig. 6).

Figure 6: Region \(R\).


Solution

Because \(R\) is closed and bounded, and the function is continuous in \(R\), \(f\) takes on both its maximum and minimum values. Also because \(f\) is a differentiable function, \(\overrightarrow{\nabla} f(x,y)\) exists everywhere and there is no rough point. So to find the extreme values of \(f\), we should find the critical points and the extreme values on the boundary.

\[f(x,y)=-x^2-y^2+4xy+x+y\]
\[\Rightarrow \overrightarrow{\nabla} f(x,y)=(-2x+4y+1,-2y+4x+1)\] To find the critical point, we have to solve the following system of equations:
\[\begin{align} \left\{\begin{array}{l} -2x+4y+1=0\\ 4x-2y+1=0 \end{array}\right. \Rightarrow x=\frac{-1}{2}, y=\frac{-1}{2}.\end{align}\]
The value of \(f\) at this point is \(f\left(-\frac{1}{2},-\frac{1}{2}\right)=-\frac{1}{2}\). However, this point does not lie in \(R\) (its \(y\)-coordinate is negative), so it is not a candidate for an extremum of \(f\) on \(R\). Now we find the extreme values on the boundary. Three line segments form the boundary. To look for an extremum on the line segment \(y=0\) and \(-1\leq x\leq 0\), we set \(y=0\), so \(f\) depends on \(x\) alone. Then we find the best \(x\). \[f(x,0)=-x^2+x \quad\text{and}\quad f_x(x,0)=-2x+1.\] Therefore, the critical point of \(f(x,0)\) is \(x=\frac{1}{2}\), which is outside this line segment. We evaluate \(f\) at the endpoints of this line segment, i.e. \((-1,0)\) and \((0,0)\). \[f(-1,0)=-2,\quad f(0,0)=0.\] On the line segment \(x=0\) and
\(0\leq y\leq 2\), to find the extreme value of \(f\), we set \(x=0\) and find the best \(y\): \[f(0,y)=-y^2+y \quad \text{and}\quad f_y=-2y+1.\] Thus, \(f(0,y)\) has a critical point at \(y=\frac{1}{2}\). We evaluate \(f(0,y)\) at this point and the endpoints.
\[f\left(0,\frac{1}{2}\right)=\frac{1}{4},\quad f(0,2)=-2,\quad \text{(already evaluated } f(0,0)).\]
On the last segment of the boundary, i.e. \(y=2x+2\) and \(-1\leq x\leq 0\), we set \(y=2x+2\) and evaluate \(f\). Then we choose the best \(x\):
\[\begin{align} f(x,2x+2)&=-x^2-(2x+2)^2+4x(2x+2)+x+(2x+2)\\ &=3x^2+3x-2\end{align}\]
\[\Rightarrow f_x=6x+3=0 \Rightarrow x=-\frac{1}{2}\] On this line segment when \(x=-\frac{1}{2}\), \(y=2(-\frac{1}{2})+2=1\), and \(f(-\frac{1}{2},1)=-\frac{11}{4}\).

To find the extreme values, we compare \(f(x,y)\) at all the candidate points we found:

| \(x\) | \(y\) | \(f(x,y)\) |  |
|---|---|---|---|
| \(-1\) | \(0\) | \(-2\) |  |
| \(0\) | \(0\) | \(0\) |  |
| \(0\) | \(\frac{1}{2}\) | \(\frac{1}{4}\) | \(\leftarrow\) maximum |
| \(0\) | \(2\) | \(-2\) |  |
| \(-\frac{1}{2}\) | \(1\) | \(-\frac{11}{4}\) | \(\leftarrow\) minimum |

Therefore, the absolute maximum of \(f\) in \(R\) is \(\frac{1}{4}\) and its absolute minimum in \(R\) is \(-\frac{11}{4}\), which occur at \((0,\frac{1}{2})\) and \((-\frac{1}{2},1)\), respectively.
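
As a sanity check on this answer, one can sample \(f\) on a fine grid over \(R\) and compare with the values found above. A short sketch (assuming NumPy; the grid resolution is an arbitrary choice):

```python
import numpy as np

def f(x, y):
    return -x**2 - y**2 + 4*x*y + x + y

# Sample the triangular region R = {-1 <= x <= 0, 0 <= y <= 2x + 2}.
best_max, best_min = -np.inf, np.inf
for xv in np.linspace(-1, 0, 401):
    vals = f(xv, np.linspace(0, 2*xv + 2, 401))
    best_max = max(best_max, vals.max())
    best_min = min(best_min, vals.min())
print(best_max, best_min)  # approximately 0.25 = 1/4 and -2.75 = -11/4
```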

Saddle Points

Theorem 2 states necessary conditions (not sufficient ones). Not every critical point is a relative extremum. See the following example.

Example 6
Given \(f(x,y)=xy\), find its extremum points.

Solution
\[f(x,y)=xy \Rightarrow \overrightarrow{\nabla} f(x,y)=(y, x)\] Equating the gradient of \(f\) to \((0,0)\) tells us that the origin is a critical point. Does \(f\) have a relative maximum or minimum at this point?

The answer is that the origin is neither a relative maximum nor a relative minimum, because any neighborhood of \((0,0)\), no matter how small, includes points from each quadrant: \(f(0,0)=0\), but \(f(x,y)>0\) for any point in the first quadrant (where \(x>0, y>0\)) and the third quadrant (where \(x<0, y<0\)), and \(f(x,y)<0\) for any point in the second quadrant (where \(x<0, y>0\)) and the fourth quadrant (where \(x>0, y<0\)). Such a point is called a saddle point.

(a) Graph of \(z=xy\)
(b) Contour plot of \(z=xy\). As we can see, every neighborhood of \((0,0)\) contains points with \(z>0\) and points with \(z<0\).

Figure 7.

A critical point at which all the components of the gradient of the function vanish, but which is neither a relative maximum nor a relative minimum, is called a saddle point.

Saddle points are somewhat analogous to the points of inflection for functions of one variable.

Second Partials Test

To figure out whether a critical point is a maximum, a minimum, or a saddle point, we may graph the function. But what if we do not have access to a graphing application, and what can we do to classify the critical points of functions of three or more variables? Fortunately, we can systematically use what is called the “second partials test.” This test is similar to the second derivative test for functions of one variable. Because applying the second partials test to functions of three or more variables is rather laborious, here we restrict ourselves to functions of two variables.

Theorem 3. (Sufficient Conditions for Extreme Values: Second Partials Test) Let \((x_0,y_0)\) be a critical point of \(f(x,y)\) interior to the domain of \(f\). Suppose \(f\) has continuous second order partial derivatives in a neighborhood of \((x_0,y_0)\). Let
\[A=\frac{\partial^2 f}{\partial x^2}(x_0,y_0),\quad B=\frac{\partial^2 f}{\partial x \partial y}(x_0,y_0),\quad C=\frac{\partial^2 f}{\partial y^2}(x_0,y_0)\]
and let
\[\Delta =\det H(x_0,y_0)=\det\begin{bmatrix} A & B\\ B & C \end{bmatrix}=AC-B^2,\]
where \(H(x_0,y_0)\) denotes the Hessian matrix of \(f\) at \((x_0,y_0)\).
Then we have:

  (a) If \(\Delta >0\) and \(A>0\), then \(f\) has a relative minimum at \((x_0,y_0)\).
  (b) If \(\Delta >0\) and \(A<0\), then \(f\) has a relative maximum at \((x_0,y_0)\).
  (c) If \(\Delta <0\), then \(f\) has a saddle point at \((x_0,y_0)\).
  (d) If \(\Delta=0\), this test gives no information.
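
The test is mechanical enough to automate. Here is a minimal SymPy sketch (the library and the helper name `second_partials_test` are our own choices) that computes \(A\), \(B\), \(C\), and \(\Delta\) at a given critical point and reports the classification:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)

def second_partials_test(f, x0, y0):
    """Classify an interior critical point (x0, y0) of f(x, y)."""
    A = sp.diff(f, x, 2).subs({x: x0, y: y0})
    B = sp.diff(f, x, y).subs({x: x0, y: y0})
    C = sp.diff(f, y, 2).subs({x: x0, y: y0})
    delta = A*C - B**2
    if delta > 0:
        return "relative minimum" if A > 0 else "relative maximum"
    if delta < 0:
        return "saddle point"
    return "inconclusive"

print(second_partials_test(x*y, 0, 0))          # saddle point (Example 6)
print(second_partials_test(x**2 + y**2, 0, 0))  # relative minimum (Example 1)
```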

Show the proof

Let \((x,y)\) be a point (other than \((x_0,y_0)\)) in a neighborhood of \((x_0,y_0)\) where the second order partial derivatives of \(f\) are continuous. Using Taylor’s formula with the second-order remainder, we can write
\[\begin{align} \Delta f=&f(x_0+h,y_0+k)-f(x_0,y_0)\\ =&f_x(x_0,y_0) h+f_y(x_0,y_0) k+\frac{1}{2}\left[a h^2+2bhk +ck^2\right]\end{align}\]
where \(h=x-x_0\), \(k=y-y_0\), and \(a, b\) and \(c\) are the second order partial derivatives of \(f\) at some point \((x_0+h\theta,y_0+k\theta)\) for \(0<\theta<1\):
\[a=f_{xx}(x_0+h\theta,y_0+k\theta),\quad b=f_{xy}(x_0+h\theta,y_0+k\theta),\quad c=f_{yy}(x_0+h\theta,y_0+k\theta).\]
Because \((x_0,y_0)\) is a critical point, we have \(f_x(x_0,y_0)=f_y(x_0,y_0)=0\); thus,
\[\begin{align} \Delta f=\frac{1}{2}\left[a h^2+2bhk +ck^2\right] \tag{*}.\end{align}\]

Case I: \(\Delta=AC-B^2>0\). It follows from \(\Delta>0\) that \(A\neq 0\). Let’s define \(\phi(x,y)=f_{xx}(x,y)f_{yy}(x,y)-\left[f_{xy}(x,y)\right]^2\). We are given \(\Delta=\phi(x_0,y_0)>0\). Because \(f_{xx}, f_{yy}\), and \(f_{xy}\) are continuous in a neighborhood of \((x_0,y_0)\), say \(N_r(x_0,y_0)\), \(\phi(x,y)\) is also continuous in \(N_r(x_0,y_0)\). Hence, there is a neighborhood \(N_{r'}(x_0,y_0)\) of \((x_0,y_0)\) for some \(r'\leq r\) in which \(\phi(x,y)>0\) and \(f_{xx}(x,y)\) has the same sign as \(A\). Now consider only points \((x,y)=(x_0+h,y_0+k)\) that lie in \(N_{r'}(x_0,y_0)\). In this case \((x_0+h\theta,y_0+k\theta)\) for \(0<\theta<1\) also lies in \(N_{r'}(x_0,y_0)\); hence \(ac-b^2>0\) and \(a\) has the same sign as \(A\).

The expression inside the square brackets in (*) is a quadratic form in \(h\) and \(k\). Because \(a\neq 0\), by completing the square, we may rewrite (*) as \[\Delta f=\frac{1}{2a}\left[(ah+bk)^2+(ac-b^2)k^2\right]. \tag{**}\] The expression inside the square brackets in (**) is the sum of two squares, so \(\Delta f\) has the same sign as \(a\) (and hence as \(A\)). Therefore, if \(A>0\), then \(\Delta f=f(x,y)-f(x_0,y_0)>0\); that is, \(f\) has a relative minimum at \((x_0,y_0)\). If \(A<0\), then \(\Delta f=f(x,y)-f(x_0,y_0)<0\); that is, \(f\) has a relative maximum at \((x_0,y_0)\). This proves parts (a) and (b) of the above theorem.

Case II: \(\Delta=AC-B^2<0\). If \(A\neq 0\), we consider \((x,y)\) in a neighborhood of \((x_0,y_0)\) where \(\phi(x,y)<0\) and \(f_{xx}(x,y)\) has the same sign as \(A=f_{xx}(x_0,y_0)\). Again by completing the square, we can rewrite (*) as (**).

Because \(ac-b^2<0\), the expression in square brackets in (**) is the difference of two squares. If we put \(k=0\) and \(h\neq 0\), \(\Delta f\) has the same sign as \(a\) (or, in turn, as \(A\)). Now if we put \(k\neq 0\) and \(h=-\frac{bk}{a}\), \(\Delta f\) has the sign opposite to that of \(A\). Therefore \(f\) has a saddle point at \((x_0,y_0)\).

If \(A=0\) but \(C\neq 0\), again we can complete the square and rewrite (*) as \[\Delta f=\frac{1}{2c}\left[(ck+bh)^2+(ac-b^2)h^2\right],\] in which the expression inside the square brackets is again the difference of two squares because \(ac-b^2<0\). Using the same argument as before (once putting \(h=0\) and \(k\neq 0\), and once \(h\neq 0\) and \(k=-bh/c\)), we can show that \(f\) has a saddle point at \((x_0,y_0)\).

The last case we need to investigate is when \(\Delta<0\) and \(A=C=0\). It follows from \(\Delta<0\) that \(B\neq 0\). If we put \(h=k\), then Eq. (*) becomes: \[\Delta f=\frac{h^2}{2}(a+2b+c).\] Taking the limit, we have:
\[\lim_{h\to 0}\frac{\Delta f}{h^2}=\lim_{h\to 0} \frac{a+2b+c}{2}=\frac{A+2B+C}{2}=B\neq 0.\]
Therefore \(\Delta f\) has the same sign as \(B\) for sufficiently small \(|h|\). If we put \(h=-k\), the same argument shows that \(\Delta f\) has the same sign as \(-B\). This shows that \(f\) has a saddle point at \((x_0,y_0)\).

Case III: \(\Delta=AC-B^2=0\). Part (d) can be shown through examples. \(\blacksquare\)


Example 7
Given that \(f\) is the function defined by \(f(x,y)=x^3-3xy+y^3\), determine the relative extrema of \(f\) if there are any.

Solution
To find the critical points, we need to find the first order partial derivatives of \(f\) and set both of them equal to zero. We have:
\[\begin{align} \left\{\begin{array}{l} f_x(x,y)=3x^2-3y=0\\ \\ f_y(x,y)=3y^2-3x=0 \end{array}\right.\end{align}\]

It follows from the first equation that \(y=x^2\). If we plug \(y=x^2\) into the second equation, we will arrive at \(3x^4-3x=3x(x^3-1)=0\), which gives us \(x=0\) or \(x=1\). Thus, there are two critical points \(O=(0,0)\) and \(P=(1,1)\). To determine if \(f\) has relative extrema at these points, we use the second partials test. The second order partial derivatives are

\[f_{xx}(x,y)=6x, \quad f_{xy}(x,y)=-3,\quad f_{yy}(x,y)=6y.\]

At \(O=(0,0)\): \[A=f_{xx}(0,0)=0,\quad B=f_{xy}(0,0)=-3,\quad C=f_{yy}(0,0)=0.\] Because
\[\Delta=\det\begin{bmatrix} 0 & -3\\ -3 & 0 \end{bmatrix} =AC-B^2=-9<0,\]
according to part (c) of Theorem 3, \(f\) has a saddle point at \(O=(0,0)\).

At \(P=(1,1)\): \[A=f_{xx}(1,1)=6,\quad B=f_{xy}(1,1)=-3,\quad C=f_{yy}(1,1)=6.\] Because
\[\Delta=\det\begin{bmatrix} 6 & -3\\ -3 & 6 \end{bmatrix} =AC-B^2=27>0,\]
and \(A=f_{xx}(1,1)=6>0\), according to part (a) of Theorem 3, \(f\) has a relative minimum at \(P=(1,1)\).
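
The whole computation in this example can be reproduced symbolically. A brief sketch (assuming SymPy; `sp.hessian` builds the matrix of second order partial derivatives):

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**3 - 3*x*y + y**3

crit = sp.solve([sp.diff(f, x), sp.diff(f, y)], (x, y), dict=True)
H = sp.hessian(f, (x, y))
for pt in crit:
    print(pt, H.det().subs(pt), H[0, 0].subs(pt))
# (0,0): Delta = -9 < 0, a saddle point; (1,1): Delta = 27 > 0 and A = 6 > 0, a minimum
```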


Figure 8. Contour plot of \(f(x,y)=x^3-3xy+y^3\) shows that \(f\) has a relative minimum at \((1,1)\) and \((0,0)\) is a saddle point.


Example 8
Given that \(f\) is the function defined by \(f(x,y)=\sin x+\sin y+\sin (x+y)\), determine the relative extrema of \(f\) on the square \(0\leq x<2\pi\) and \(0\leq y<2\pi\).

Solution
To find the critical points, first we calculate the first order partial derivatives and set them equal to zero:
\[\begin{align} \left\{\begin{array}{l} f_x(x,y)=\cos x+\cos (x+y)=0\\ \\ f_y(x,y)=\cos y+\cos (x+y)=0 \end{array}\right.\end{align}\]
These equations together imply that \(\cos y=\cos x\); hence, \(y=2k\pi\pm x\).

If we plug \(y=2k\pi+x\) (for \(k\in\mathbb{Z}\)) into the first equation, we obtain \[\cos x+\cos 2x=0 \tag{*}\] We know \(\cos 2x=2\cos^2 x-1\).¹ If we plug this identity into (*), we obtain: \[2\cos^2 x+\cos x-1=0.\] This is a quadratic in \(\cos x\). Thus:
\[\cos x=\frac{1}{4}(-1\pm\sqrt{1+8})=\frac{1}{2}\ \text{or}\ -1\]

The equation \(\cos x=\frac{1}{2}\) has two solutions in the interval \([0,2\pi)\): \(x=\frac{\pi}{3}\) or \(x=2\pi-\frac{\pi}{3}=\frac{5\pi}{3}\). Because \(y\) lies between 0 and \(2\pi\), we have to choose \(k=0\) and get \(y=x\); that is \(P_1=\left(\frac{\pi}{3},\frac{\pi}{3}\right)\) and \(P_2=\left(\frac{5\pi}{3},\frac{5\pi}{3}\right)\) are two critical points.

The solution of \(\cos x=-1\) in the interval \([0,2\pi)\) is \(x=\pi\). Again, because \(y\) lies between 0 and \(2\pi\), we have to choose \(k=0\) and find \(y=\pi\). The third critical point is \(P_3=\left(\pi,\pi\right)\).

If we plug \(y=2k\pi-x\) into the first equation, we obtain \[\cos x+\underbrace{\cos(2k\pi)}_{=1}=0 \Rightarrow \cos x=-1,\] which is the same equation we had before.

Hence the critical points are \(P_1=\left(\frac{\pi}{3},\frac{\pi}{3}\right)\), \(P_2=\left(\frac{5\pi}{3},\frac{5\pi}{3}\right)\), and \(P_3=\left(\pi,\pi\right)\). For each of these points, we use the second partials test. We have:
\[\begin{align}f_{xx}(x,y)&=-\sin x-\sin (x+y),\\ f_{xy}(x,y)&=-\sin(x+y),\\ f_{yy}(x,y)&=-\sin y-\sin (x+y).\end{align}\]

At \(P_1=(\frac{\pi}{3},\frac{\pi}{3})\), we have:
\[\begin{align} A&=f_{xx}\left(\frac{\pi}{3},\frac{\pi}{3}\right)=-\sin\frac{\pi}{3}-\sin\frac{2\pi}{3}=-\sqrt{3}\\ B&=f_{xy}\left(\frac{\pi}{3},\frac{\pi}{3}\right)=-\sin \frac{2\pi}{3}=-\frac{\sqrt{3}}{2}\\ C&=f_{yy}\left(\frac{\pi}{3},\frac{\pi}{3}\right)=-\sin\frac{\pi}{3}-\sin\frac{2\pi}{3}=-\sqrt{3}.\end{align}\]
Thus
\[\Delta=\det\begin{bmatrix} -\sqrt{3} & -\frac{\sqrt{3}}{2}\\ \\ -\frac{\sqrt{3}}{2} & -\sqrt{3} \end{bmatrix}=3-\frac{3}{4}=\frac{9}{4}>0.\]
Because \(\Delta>0\) and \(A=-\sqrt{3}<0\), we conclude that \(f\) has a relative maximum at \(P_1=\left(\frac{\pi}{3},\frac{\pi}{3}\right)\).

At \(P_2=(\frac{5\pi}{3},\frac{5\pi}{3})\), we have:
\[\begin{align} A&=f_{xx}\left(\frac{5\pi}{3},\frac{5\pi}{3}\right)=-\sin\frac{5\pi}{3}-\sin\frac{10\pi}{3}=\sqrt{3}\\ B&=f_{xy}\left(\frac{5\pi}{3},\frac{5\pi}{3}\right)=-\sin \frac{10\pi}{3}=\frac{\sqrt{3}}{2}\\ C&=f_{yy}\left(\frac{5\pi}{3},\frac{5\pi}{3}\right)=-\sin\frac{5\pi}{3}-\sin\frac{10\pi}{3}=\sqrt{3}.\end{align}\]
Thus
\[\Delta=\det\begin{bmatrix} \sqrt{3} & \frac{\sqrt{3}}{2}\\ \\ \frac{\sqrt{3}}{2} & \sqrt{3} \end{bmatrix}=3-\frac{3}{4}=\frac{9}{4}>0.\]
Because \(\Delta>0\) and \(A=\sqrt{3}>0\), we conclude that \(f\) has a relative minimum at \(P_2=\left(\frac{5\pi}{3},\frac{5\pi}{3}\right)\).

At \(P_3=(\pi,\pi)\), we have:
\[\begin{align} A&=f_{xx}\left(\pi,\pi\right)=-\sin\pi-\sin2\pi=0\\ B&=f_{xy}\left(\pi,\pi\right)=-\sin 2\pi=0\\ C&=f_{yy}\left(\pi,\pi\right)=-\sin\pi-\sin{2\pi}=0.\end{align}\]
Thus
\[\Delta=\det\begin{bmatrix} 0 & 0\\ 0& 0 \end{bmatrix}=0.\] Because \(\Delta=0\), the second partials test is inconclusive. In this example, to investigate whether or not \(f\) has an extremum at \(P_3\), we can calculate the variation (increment) of \(f\) near \(P_3\):
\[\begin{align} \Delta f=f(\pi+h,\pi+k)-f(\pi,\pi)=\sin(\pi+h)+\sin(\pi+k)+\sin(2\pi+h+k)-\underbrace{f(\pi,\pi)}_{=0}\end{align}\]
Now we can use Taylor’s expansion
\[\begin{align} \Delta f=&-\sin h-\sin k+\sin(h+k)\\ =&-\left[h-\frac{h^3}{3!}+\cdots\right]-\left[k-\frac{k^3}{3!}+\cdots\right]+\left[(h+k)-\frac{(h+k)^3}{3!}+\cdots\right]\end{align}\]
To approximate \(\Delta f\), because \(h\) and \(k\) can be made as small as we desire, we just keep the dominant terms which are the terms with the lowest degree (i.e. the terms with the smallest power) and ignore the other terms as being negligible by comparison:
\[\begin{align} \Delta f \approx&\frac{h^3}{6}+\frac{k^3}{6}-\frac{(h+k)^3}{6}=-\frac{hk}{2}(h+k).\end{align}\]
If \(hk<0\) and \((h+k)>0\) (for example \(h=0.002\) and \(k=-0.001\)), \(\Delta f>0\) and if \(hk<0\) but \((h+k)<0\) (for example, \(h=-0.002\) and \(k=0.001\)), \(\Delta f<0\). Therefore, \(f\) has a saddle point at \(P_3=(\pi,\pi)\).
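
A quick numerical probe near \(P_3\) (a sketch assuming NumPy; the step sizes are the ones used above) confirms that \(\Delta f\) takes both signs:

```python
import numpy as np

def f(x, y):
    return np.sin(x) + np.sin(y) + np.sin(x + y)

p = np.pi  # f(p, p) = 0
# Delta f is approximately -(h*k/2)*(h + k): opposite signs for these two probes.
print(f(p + 0.002, p - 0.001) - f(p, p))  # positive
print(f(p - 0.002, p + 0.001) - f(p, p))  # negative
```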

The graph of \(f\) is shown in Fig. 9.

(a) Graph of \(f\).
(b) Contour plot of \(f\).

Figure 9.

We mentioned that when \(\Delta=0\), the second partials test is inconclusive, and the function may have a maximum, a minimum, or a saddle point at the critical point. In the previous example, \(f\) had a saddle point at a critical point where \(\Delta =0\). Now consider the following functions: \[g(x,y)=x^4+y^4,\quad \phi(x,y)=-(x^4+y^4),\quad \psi(x,y)=x^4-y^4.\] You can verify that the origin is a critical point of each of these functions; \(g\) has a relative minimum, \(\phi\) has a relative maximum, and \(\psi\) has a saddle point at the origin. See Fig. 10.
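
One can check, for instance with a sketch like the one given after the statement of the second partials test, that \(\Delta=0\) for all three functions at the origin even though their behaviors there differ. A self-contained variant (again assuming SymPy):

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)

# For each function, Delta = AC - B^2 vanishes at the origin, so the test is
# silent, even though the behaviors (min, max, saddle) all differ.
for f in (x**4 + y**4, -(x**4 + y**4), x**4 - y**4):
    H = sp.hessian(f, (x, y)).subs({x: 0, y: 0})
    print(H.det())  # 0 for each
```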


Figure 10.

¹ Alternatively, we can transform the sum into a product using

\[\cos a+\cos b=2\cos\left(\frac{a+b}{2}\right) \cos\left(\frac{a-b}{2}\right).\]

Thus 

\[\cos x+\cos 2x=2\cos\left(\frac{3x}{2}\right) \cos\left(\frac{x}{2}\right)=0.\]

\[\cos \left(\frac{3x}{2}\right) =0 \Rightarrow \frac{3x}{2}=(2n+1)\frac{\pi}{2} \Rightarrow x=(2n+1)\frac{\pi}{3}\]

\[\cos\left(\frac{x}{2}\right)=0 \Rightarrow \frac{x}{2}=(2n+1)\frac{\pi}{2} \Rightarrow x=(2n+1)\pi\] 

In the interval \([0,2\pi)\), we have \(x=\frac{\pi}{3}\), \(x=\frac{5\pi}{3}\), or \(x=\pi\).