An important application of derivatives is solving optimization problems. We have learned how to determine the maximum or minimum values of functions of a single variable, and in this section we want to know how to determine the extreme values of functions of two or more variables. The problem of finding the extreme values of functions of several variables has features similar to the single-variable problem, but it is often more complicated. So first let’s review what we already know about the maxima and minima of functions of one variable.

Review of Maxima and Minima of Single Variable Functions

Consider a function y=f(x):

  • We say $f$ has a maximum (respectively, minimum) if there exists $x_0$ in its domain such that $f(x)\le f(x_0)$ [respectively, $f(x)\ge f(x_0)$] for all $x$ in the domain of $f$. The value $f(x_0)$ is called the maximum value (respectively, the minimum value) of $f$. The word “extremum” refers to either a maximum or a minimum.
  • We say $f$ has a relative maximum (respectively, relative minimum) at $x_0$ if $f(x)\le f(x_0)$ [respectively, $f(x)\ge f(x_0)$] for all $x$ in the domain of $f$ that are sufficiently close to $x_0$.
  • If f is continuous on a closed interval [a,b], it takes on both its absolute maximum value and its absolute minimum value on that interval. If the interval is not closed or if f is not continuous on that interval, there is no guarantee that the function takes on its extreme values on that interval.
  • If f(x0) is a relative or absolute extreme value of f, then the point x0 is one of three kinds of points:
    1. the point $x_0$ is a stationary point; that is, $f'(x_0)=0$,
    2. the point $x_0$ is a rough point; that is, $f'(x_0)$ does not exist, or
    3. the point x0 is one of the endpoints of the domain of f.
  • If $f''(x)>0$ for every $x$ in an interval $I$, then the graph of $f$ is concave up on $I$.
  • If $f''(x)<0$ for every $x$ in an interval $I$, then the graph of $f$ is concave down on $I$.
  • If $f'(x_0)=0$ and $f''(x_0)>0$, then $f$ has a local minimum at $x_0$.
  • If $f'(x_0)=0$ and $f''(x_0)<0$, then $f$ has a local maximum at $x_0$.
  • If $f'(x_0)=0$ and $f''(x_0)=0$, more information is required to conclude whether or not $f$ has a local extremum at $x_0$. The additional information is the behavior of the higher-order derivatives. The complete theorem is as follows. Suppose
    $$f'(x_0)=\cdots=f^{(n-1)}(x_0)=0 \quad\text{and}\quad f^{(n)}(x_0)\neq 0.$$

    1. If $n$ is even and $f^{(n)}(x_0)>0$, then $f$ has a local minimum at $x_0$.
    2. If $n$ is even and $f^{(n)}(x_0)<0$, then $f$ has a local maximum at $x_0$.
    3. If $n$ is odd, then $f$ does not have an extremum at $x_0$.
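
The higher-order derivative test can be sketched in code for the special case of polynomials, where derivatives are easy to compute exactly. This is only an illustration: the coefficient-list representation and the function names are assumptions made for this sketch, not part of the lesson.

```python
from fractions import Fraction

def poly_derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...], where c_k multiplies x**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def poly_eval(coeffs, x):
    """Evaluate the polynomial at x with Horner's rule (exact arithmetic)."""
    value = Fraction(0)
    for c in reversed(coeffs):
        value = value * x + c
    return value

def classify_stationary_point(coeffs, x0):
    """Apply the higher-order derivative test to a polynomial at x0."""
    x0 = Fraction(x0)
    derivative = poly_derivative(coeffs)
    if poly_eval(derivative, x0) != 0:
        return "not a stationary point"
    n = 1  # order of the derivative currently stored in `derivative`
    while derivative:
        value = poly_eval(derivative, x0)
        if value != 0:
            # n is the order of the first non-vanishing derivative at x0
            if n % 2 == 1:
                return "no extremum"
            return "local minimum" if value > 0 else "local maximum"
        derivative = poly_derivative(derivative)
        n += 1
    return "constant polynomial: test does not apply"

# x**4 has its first non-zero derivative at order 4 (even, positive).
print(classify_stationary_point([0, 0, 0, 0, 1], 0))
# x**3 has its first non-zero derivative at order 3 (odd).
print(classify_stationary_point([0, 0, 0, 1], 0))
```

Exact rational arithmetic is used so that the comparison with zero is reliable; with floating-point numbers a tolerance would be needed instead.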

Definitions of Maxima and Minima for Multivariable Functions

Now we are ready to talk about finding maxima and minima for functions of two or more variables.

Consider a function $z=f(x,y)$ defined on a set $U$ in the $xy$-plane. We say $f$ has a maximum (or more specifically an absolute maximum) at the point $(x_0,y_0)$ of its domain $U$ if $f(x,y)\le f(x_0,y_0)$ for all $(x,y)$ in $U$. An absolute maximum corresponds to a highest point on the surface $z=f(x,y)$. We say $f$ has a relative maximum (or local maximum) at $(x_0,y_0)$ if $f(x,y)\le f(x_0,y_0)$ for all $(x,y)$ of $U$ that are in a sufficiently small neighborhood of $(x_0,y_0)$. The value $f(x_0,y_0)$ at a relative maximum does not have to be the greatest value of $f(x,y)$ in all of $U$; it only has to be the greatest value among points that are sufficiently close to $(x_0,y_0)$. The definitions of minimum (or more specifically absolute minimum) and relative minimum are analogous. Consider Fig. 1.

Figure 1.

In a similar way we can define the maximum and minimum points for functions of three or more variables.

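Definition 1. Let $f:U\subseteq\mathbb{R}^n\to\mathbb{R}$.

We say $f$ has a maximum (or more specifically an absolute maximum) at the point $\mathbf{x}_0\in U$ if $f(\mathbf{x}_0)\ge f(\mathbf{x})$ for all $\mathbf{x}\in U$.

We say $f$ has a relative maximum (or local maximum) at the point $\mathbf{x}_0\in U$ if there is a neighborhood $V$ of $\mathbf{x}_0$ such that $f(\mathbf{x}_0)\ge f(\mathbf{x})$ for every $\mathbf{x}\in V\cap U$.

We say $f$ has a minimum (or more specifically an absolute minimum) at the point $\mathbf{x}_0\in U$ if $f(\mathbf{x}_0)\le f(\mathbf{x})$ for all points $\mathbf{x}\in U$.

We say $f$ has a relative minimum (or local minimum) at the point $\mathbf{x}_0\in U$ if there is a neighborhood $V$ of $\mathbf{x}_0$ such that $f(\mathbf{x}_0)\le f(\mathbf{x})$ for every $\mathbf{x}\in V\cap U$.

A point which is either a (relative or absolute) maximum or minimum is called a (relative or absolute) extremum.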

  • Every absolute maximum (respectively minimum) is also a relative maximum (minimum).

 

Bounded and Unbounded Sets

A set in $\mathbb{R}$ is bounded if it is contained in an interval of finite length, and is unbounded otherwise. A set in $\mathbb{R}^2$ is bounded if the entire set can be contained within a disk of finite radius, and is called unbounded if there is no disk that contains all the points of the set. Similarly, a set in $\mathbb{R}^3$ is bounded if the entire set can be contained within a sphere of finite radius, and is unbounded otherwise. In general, a set in $\mathbb{R}^n$ is bounded if all of its points are contained inside a ball $|\mathbf{x}|^2=x_1^2+x_2^2+\cdots+x_n^2\le R^2$ of finite radius $R$.

A bounded set in R2
An unbounded set (first quadrant) in R2
A bounded set in R3

The Extreme Value Theorem

The following theorem assures us that a continuous function in a closed and bounded set takes on its extreme values.

In the conditional statement “if P then Q” or “P implies Q” (written as $P\Rightarrow Q$), we say P is a sufficient condition for Q and Q is a necessary condition for P. Also note that “if P then Q” is equivalent to “if Q is false, then P is false.”

Theorem 1. (Sufficient Conditions for Extreme Values) If $U$ is a closed and bounded set, and $f$ is continuous in $U$, then there is a point in $U$ at which $f$ attains its (absolute) maximum and there is a point in $U$ at which $f$ attains its (absolute) minimum.

Finding Extrema

Calculus gives us the necessary conditions for an interior point to be a relative extremum. Let $f$ be a function of two variables $x$ and $y$ and let $(x_0,y_0)$ be an interior point of the domain of $f$. If $f$ has a relative maximum or minimum at $(x_0,y_0)$ and if $f_x(x_0,y_0)$ and $f_y(x_0,y_0)$ exist, then
$$\nabla f(x_0,y_0)=\mathbf{0}\quad(\text{that is, } f_x(x_0,y_0)=f_y(x_0,y_0)=0).$$

Figure 2. P is an interior point, and Q is a boundary point.

If we define a single-variable function $F(x)=f(x,y_0)$ (see Fig. 3(a)), then $F'(x_0)=f_x(x_0,y_0)$. If $(x_0,y_0)$ is a relative maximum point, then for all $(x,y)$ in the domain of $f$ that are in a sufficiently small neighborhood of $(x_0,y_0)$, $f(x_0,y_0)\ge f(x,y)$. Consequently, in that neighborhood $F(x_0)\ge F(x)$. This means $F$ has a relative maximum at $x_0$. It follows from single variable calculus that $F'(x_0)=0$; that is, $f_x(x_0,y_0)=0$ (Fig. 3(b)). The proof that $f_y(x_0,y_0)=0$ is analogous.

 

Figure 3.

 

At every relative extremum in the interior of the domain of a differentiable function $f(x,y)$ we have
$$df=\underbrace{f_x}_{=0}\,dx+\underbrace{f_y}_{=0}\,dy=0,$$
for $(x,y)=(x_0,y_0)$ and all $dx$ and $dy$. Geometrically, $df=0$ means that the tangent plane at the point $(x_0,y_0,f(x_0,y_0))$ is horizontal (that is, perpendicular to the $z$-axis). See Fig. 4.

Figure 4.

We can easily generalize this result for functions of any number of independent variables. The proof of the following theorem is essentially the same as we discussed here, but is expressed in a different way.

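Theorem 2. Let $f:U\subseteq\mathbb{R}^n\to\mathbb{R}$ and let $\mathbf{x}_0$ be an interior point of $U$. If $f$ has a relative maximum or minimum at $\mathbf{x}_0$ and if the partial derivatives of $f$ at $\mathbf{x}_0$ exist, then $\nabla f(\mathbf{x}_0)=\mathbf{0}$.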


Definition 2. A point $\mathbf{x}_0$ is called a stationary point of $f$ if all first order partial derivatives of $f$ at that point exist and are equal to 0; that is, $\nabla f(\mathbf{x}_0)=\mathbf{0}$.
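
Whether a given point is (approximately) stationary can be checked numerically by estimating each partial derivative with a central difference. The following is a minimal sketch; the helper names, the step size, the tolerance, and the test function `paraboloid` are all assumptions chosen for illustration.

```python
def numerical_gradient(f, point, h=1e-6):
    """Estimate the gradient of f at `point` with central differences."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((f(*forward) - f(*backward)) / (2 * h))
    return grad

def is_stationary(f, point, tol=1e-6):
    """True if every estimated partial derivative is smaller than `tol` in absolute value."""
    return all(abs(g) < tol for g in numerical_gradient(f, point))

def paraboloid(x, y):
    # Hypothetical example: its only stationary point is (1, -2).
    return (x - 1) ** 2 + (y + 2) ** 2

print(is_stationary(paraboloid, (1.0, -2.0)))
print(is_stationary(paraboloid, (0.0, 0.0)))
```

A numerical check of this kind can only suggest that a point is stationary; it cannot detect rough points reliably, because a difference quotient may still produce a number where the derivative does not exist.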

It follows from Theorem 2 that if a function has a relative extremum at an interior point of its domain and if its partial derivatives at that point exist, the point must be a stationary point of the function. Theorem 2 does not talk about the points where the partial derivatives do not exist and the points on the boundary. That is, it is possible for a function to assume its (relative or absolute) extreme value at a point where at least one of the first partial derivatives does not exist (Fig. 5(a)) or at a boundary point (Fig. 5(b)).

(a) Maximum occurs at a rough point.

(b) Extrema occur at two boundary points.

Figure 5.

  • A point at which at least one of the partial derivatives does not exist is called a rough point. In other words, at a rough point, the gradient does not exist.
  • Stationary points and rough points constitute critical points.
  • From the above discussion, we conclude that to determine the extreme values of a function, we should search for them among stationary points, rough points, and boundary points.

A function f has a relative or absolute extremum at a point x0 of its domain only if x0 is one of the three types of points:

  1. $\mathbf{x}_0$ is a stationary point of $f$; that is, $\nabla f(\mathbf{x}_0)=\mathbf{0}$,
  2. $\mathbf{x}_0$ is a rough point of $f$; that is, $\nabla f(\mathbf{x}_0)$ does not exist, or
  3. x0 is on the boundary of the domain of f.
Example 1
Find the extreme values of the function $f(x,y)=x^2+y^2$.

Solution
First we need to find the gradient of $f$:
$$\nabla f(x,y)=(2x,2y).$$

The gradient exists everywhere; therefore, there is no rough point. Both components of the gradient are zero only at $(0,0)$; that is, $(0,0)$ is the only critical point of $f$. Because $f(0,0)=0$ and $f(x,y)\ge 0$ everywhere, $f$ has an absolute minimum at the origin. The value of $f$ increases without bound as we move away from the origin, so there is no absolute maximum.

Graph of $z=x^2+y^2$

 

Example 2
Find the extreme values of the function $f(x,y)=\sqrt{1-x^2-y^2}$.

Solution
The partial derivatives of $f$ are
$$f_x=\frac{-x}{\sqrt{1-x^2-y^2}},\quad f_y=\frac{-y}{\sqrt{1-x^2-y^2}},$$

and they are both zero only at the origin. The function has a maximum at the origin, because at every $(x,y)\neq(0,0)$ the quantity $1-x^2-y^2$ under the square root is less than 1, the value it takes at the origin.

The domain of $f$ is $1-x^2-y^2\ge 0$ or $x^2+y^2\le 1$, which is a disk of radius 1 centered at the origin. On the boundary of the domain (i.e. $x^2+y^2=1$), $f$ is zero. The absolute minimum value of $f$ occurs on the boundary because the minimum of the square root is zero.

Graph of $z=\sqrt{1-x^2-y^2}$
Example 3
Find the extreme values of the function $f(x,y)=\sqrt{x^2+y^2}$.

Solution
First we calculate the partial derivatives:
$$f_x=\frac{x}{\sqrt{x^2+y^2}},\quad f_y=\frac{y}{\sqrt{x^2+y^2}}.$$

The partial derivatives are not both zero if $(x,y)\neq(0,0)$. Because $f_x(0,0)$ and $f_y(0,0)$ do not exist, $(0,0)$ is a rough point. The value of $f$ at $(0,0)$ is zero, which is the minimum value of the square root. Thus, $f$ has an absolute minimum at the origin. The graph of $f$ is a circular cone; $f$ increases without bound as $|(x,y)|\to\infty$ and does not have a maximum.
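
To see why $f_x(0,0)$ fails to exist, one can compare the difference quotients of $f(x,y)=\sqrt{x^2+y^2}$ taken from the right and from the left along the $x$-axis. The short sketch below does this numerically; the function names are chosen only for this illustration.

```python
import math

def cone(x, y):
    return math.sqrt(x * x + y * y)

def one_sided_quotients(f, h):
    """Difference quotients of f at the origin in the x-direction, from the right and from the left."""
    right = (f(h, 0.0) - f(0.0, 0.0)) / h
    left = (f(-h, 0.0) - f(0.0, 0.0)) / (-h)
    return right, left

for h in (1e-1, 1e-3, 1e-6):
    print(h, one_sided_quotients(cone, h))
```

The right-hand quotient stays near $+1$ and the left-hand quotient stays near $-1$ however small $h$ is, so the two one-sided limits disagree and the partial derivative does not exist at the origin.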

Graph of $z=\sqrt{x^2+y^2}$

 

Example 4
Suppose we would like to build a distribution center to serve three cities, and suppose the locations (the coordinates) of the three cities are $(a_1,b_1)$, $(a_2,b_2)$, and $(a_3,b_3)$. Locate the distribution center to minimize the cost

  (a) if the cost is proportional to the sum of the squared distances from the center to the cities;
  (b) if the cost is proportional to the sum of the distances from the center to the cities.
Solution
Let $(x,y)$ be the coordinates of the distribution center. The distance from the distribution center to city $i$ is
$$r_i=\sqrt{(x-a_i)^2+(y-b_i)^2},\quad i=1,2,3.$$
For part (a) we wish to minimize $f(x,y)=r_1^2+r_2^2+r_3^2$, and for part (b) we wish to minimize $g(x,y)=r_1+r_2+r_3$. We will see that minimizing $f(x,y)$ is easier than minimizing $g(x,y)$.

(a) We have
$$f(x,y)=(x-a_1)^2+(y-b_1)^2+(x-a_2)^2+(y-b_2)^2+(x-a_3)^2+(y-b_3)^2=\sum_{i=1}^{3}\left[(x-a_i)^2+(y-b_i)^2\right]$$
$$f_x=2(x-a_1+x-a_2+x-a_3),\quad f_y=2(y-b_1+y-b_2+y-b_3)$$
Solving $f_x=0$ and $f_y=0$ gives $x=(a_1+a_2+a_3)/3$ and $y=(b_1+b_2+b_3)/3$. Therefore, the best location to minimize the sum of the squared distances is the centroid of the triangle formed by the cities (Fig. 6).
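
The centroid result of part (a) can be sanity-checked numerically. The sketch below uses made-up city coordinates (an assumption for illustration only) and confirms that moving the center away from the centroid in a few sample directions raises the sum of squared distances.

```python
def sum_of_squared_distances(point, cities):
    x, y = point
    return sum((x - a) ** 2 + (y - b) ** 2 for a, b in cities)

def centroid(cities):
    n = len(cities)
    return (sum(a for a, _ in cities) / n, sum(b for _, b in cities) / n)

# Hypothetical city coordinates, chosen only for illustration.
cities = [(0.0, 0.0), (6.0, 0.0), (3.0, 9.0)]
best = centroid(cities)
best_cost = sum_of_squared_distances(best, cities)

# Nudging the location in any of these sample directions should not lower the cost.
nudges = [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.5), (0.0, -0.5), (0.3, -0.4)]
costs_nearby = [
    sum_of_squared_distances((best[0] + dx, best[1] + dy), cities)
    for dx, dy in nudges
]
print(best, best_cost, min(costs_nearby))
```

Sampling a few directions is not a proof, but it agrees with the calculus: for three cities, expanding the squares shows the cost at a point displaced by $(dx,dy)$ from the centroid exceeds the cost at the centroid by $3(dx^2+dy^2)$.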

Figure 6.

(b) We have
$$g(x,y)=r_1+r_2+r_3=\sum_{i=1}^{3}\sqrt{(x-a_i)^2+(y-b_i)^2}.$$
Differentiating $g$ and equating to zero, we obtain
$$g_x=\frac{x-a_1}{r_1}+\frac{x-a_2}{r_2}+\frac{x-a_3}{r_3}=0,\quad g_y=\frac{y-b_1}{r_1}+\frac{y-b_2}{r_2}+\frac{y-b_3}{r_3}=0.$$
These equations are complicated to solve. However, we note that
$$\nabla g=\nabla r_1+\nabla r_2+\nabla r_3,$$
where
$$\nabla r_i=\left(\frac{x-a_i}{r_i},\frac{y-b_i}{r_i}\right)\quad(\text{for } i=1,2,3).$$
Also, we note that these gradient vectors $\nabla r_i$ are unit vectors, $|\nabla r_i|=1$. Therefore, the vector sum of three unit vectors at the relative minimum has to be $\mathbf{0}$. The only way for this to happen is when the angles between them are $360^\circ/3=120^\circ$ (as (b) and (c) in the following figure). So the solution probably is the point where the roads from the cities to the center make $120^\circ$ angles. There are other possibilities. Each function $r_i$ has a rough point; that is, $g$ has three rough points. The graph of $r_i$ is a circular cone (see Example 3) that has been shifted $a_i$ units in the $x$-direction and $b_i$ units in the $y$-direction. Therefore, $r_i$ has a rough point at $(a_i,b_i)$. This means that if we build the distribution center in one of the cities, we might have minimized the cost.

If the triangle formed by the cities has an angle larger than $120^\circ$, then there is no point inside the triangle from which the roads to the three cities make $120^\circ$ angles with one another. In this case, the best point is the city with the wide angle (see (c) in the following figure). Otherwise, a point inside the triangle would be the solution.
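
Although the equations $g_x=0$ and $g_y=0$ are hard to solve by hand, the minimizing point can be approximated numerically. The sketch below uses a Weiszfeld-style fixed-point iteration, which is a technique brought in here for illustration and is not used in the example itself; the city coordinates are also made up. It then checks that the three unit vectors $\nabla r_i$ sum to approximately zero at the computed point.

```python
import math

def fermat_point(cities, iterations=1000):
    """Weiszfeld-style iteration for the point minimizing the sum of distances.

    Starts at the centroid. If an iterate lands exactly on a city (a rough point),
    the sketch simply stops there without checking optimality.
    """
    n = len(cities)
    x = sum(a for a, _ in cities) / n
    y = sum(b for _, b in cities) / n
    for _ in range(iterations):
        weights = []
        for a, b in cities:
            r = math.hypot(x - a, y - b)
            if r == 0.0:
                return (x, y)
            weights.append(1.0 / r)
        total = sum(weights)
        # New estimate: average of the cities weighted by 1 / distance.
        x = sum(w * a for w, (a, _) in zip(weights, cities)) / total
        y = sum(w * b for w, (_, b) in zip(weights, cities)) / total
    return (x, y)

def unit_vector_sum(point, cities):
    """Sum of the unit vectors grad r_i at `point` (this is grad g)."""
    x, y = point
    sx = sy = 0.0
    for a, b in cities:
        r = math.hypot(x - a, y - b)
        sx += (x - a) / r
        sy += (y - b) / r
    return (sx, sy)

# Hypothetical cities forming a triangle whose angles are all below 120 degrees.
cities = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]
center = fermat_point(cities)
residual = math.hypot(*unit_vector_sum(center, cities))
print(center, residual)
```

For a triangle with an angle larger than $120^\circ$ the stationarity condition cannot be met inside the triangle, so this check would not apply and the wide-angle city itself has to be examined as a rough point.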

(a) (b) (c)

Example 5
Find the extreme values of the function $f(x,y)=-x^2-y^2+4xy+x+y$ on the triangular region $R=\{(x,y)\mid -1\le x\le 0,\ 0\le y\le 2x+2\}$ (see Fig. 6).

Figure 6: Region R.

 

 

Solution

Because $R$ is closed and bounded, and the function is continuous in $R$, $f$ takes on both its maximum and minimum values. Also, because $f$ is a differentiable function, $\nabla f(x,y)$ exists everywhere and there is no rough point. So to find the extreme values of $f$, we should find the critical points and the extreme values on the boundary.

$$f(x,y)=-x^2-y^2+4xy+x+y$$
$$\nabla f(x,y)=(-2x+4y+1,\ -2y+4x+1)$$
To find the critical point, we have to solve the following system of equations:
$$\begin{cases}-2x+4y+1=0\\ 4x-2y+1=0\end{cases}\ \Rightarrow\ x=-\frac12,\ y=-\frac12.$$
The value of $f$ at this point is $f(-\frac12,-\frac12)=-\frac12$. (This point has $y<0$, so it lies outside $R$ and cannot be an extremum of $f$ on $R$; its value is recorded only for comparison.)

Now we find the extreme values on the boundary. Three line segments form the boundary. To look for an extremum on the line segment $y=0$, $-1\le x\le 0$, we set $y=0$, so $f$ depends on $x$ alone. Then we find the best $x$:
$$f(x,0)=-x^2+x\quad\text{and}\quad f_x(x,0)=-2x+1.$$
Therefore, the critical point of $f(x,0)$ is $x=\frac12$, which is outside this line segment. We evaluate $f$ at the endpoints of this line segment, i.e. $(-1,0)$ and $(0,0)$:
$$f(-1,0)=-2,\quad f(0,0)=0.$$

On the line segment $x=0$, $0\le y\le 2$, to find the extreme value of $f$, we set $x=0$ and find the best $y$:
$$f(0,y)=-y^2+y\quad\text{and}\quad f_y(0,y)=-2y+1.$$
Thus, $f(0,y)$ has a critical point at $y=\frac12$. We evaluate $f(0,y)$ at this point and the endpoints:
$$f(0,\tfrac12)=\tfrac14,\quad f(0,2)=-2\quad(\text{already evaluated } f(0,0)).$$

To follow the last segment of the boundary, i.e. $y=2x+2$ and $-1\le x\le 0$, we set $y=2x+2$ and evaluate $f$. Then we choose the best $x$:
$$f(x,2x+2)=-x^2-(2x+2)^2+4x(2x+2)+x+(2x+2)=3x^2+3x-2$$
$$\frac{d}{dx}f(x,2x+2)=6x+3=0\ \Rightarrow\ x=-\frac12.$$
On this line segment, when $x=-\frac12$, $y=2(-\frac12)+2=1$, and $f(-\frac12,1)=-\frac{11}{4}$.

To find the extreme values, we compare f(x,y) at all the points we found:

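| $x$ | $y$ | $f(x,y)$ | |
|---|---|---|---|
| $-\frac12$ | $-\frac12$ | $-\frac12$ | (outside $R$) |
| $-1$ | $0$ | $-2$ | |
| $0$ | $0$ | $0$ | |
| $0$ | $\frac12$ | $\frac14$ | maximum |
| $0$ | $2$ | $-2$ | |
| $-\frac12$ | $1$ | $-\frac{11}{4}$ | minimum |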

Therefore, the absolute maximum of $f$ in $R$ is $\frac14$ and its absolute minimum in $R$ is $-\frac{11}{4}$, which occur at $(0,\frac12)$ and $(-\frac12,1)$, respectively.
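
The comparison of candidate points can be reproduced with exact rational arithmetic. The sketch below assumes $f(x,y)=-x^2-y^2+4xy+x+y$, the reading that is consistent with the gradient and the values in the solution, and it also scans a coarse grid over $R$ as an independent cross-check. The variable names and the grid resolution are choices made for this illustration.

```python
from fractions import Fraction as F

def f(x, y):
    return -x**2 - y**2 + 4*x*y + x + y

def in_region(x, y):
    """Triangle R: -1 <= x <= 0 and 0 <= y <= 2x + 2."""
    return -1 <= x <= 0 and 0 <= y <= 2*x + 2

candidates = [
    (F(-1, 2), F(-1, 2)),                        # stationary point of f
    (F(-1), F(0)), (F(0), F(0)), (F(0), F(2)),   # vertices of the triangle
    (F(0), F(1, 2)),                             # critical point on the side x = 0
    (F(-1, 2), F(1)),                            # critical point on the side y = 2x + 2
]

# Keep only the candidates that actually lie in R, then compare their values.
feasible = [p for p in candidates if in_region(*p)]
values = {p: f(*p) for p in feasible}
max_point = max(values, key=values.get)
min_point = min(values, key=values.get)
print("max", max_point, values[max_point])
print("min", min_point, values[min_point])

# Cross-check: evaluate f on a grid of points in R with spacing 1/N.
N = 40
grid_values = [
    f(F(-i, N), F(j, N))
    for i in range(N + 1)
    for j in range(2 * N + 1)
    if in_region(F(-i, N), F(j, N))
]
print(max(grid_values), min(grid_values))
```

The grid happens to contain both extreme points, so its largest and smallest values agree with the candidates exactly; in general a grid scan only gives approximate bounds.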

Saddle Points

Theorem 2 states necessary conditions (not sufficient ones). Not every critical point is a relative extremum. See the following example.

Example 6
Given $f(x,y)=xy$, find its extremum points.

Solution
$$f(x,y)=xy\ \Rightarrow\ \nabla f(x,y)=(y,x)$$
Equating the gradient of $f$ to $(0,0)$ tells us that the origin is a critical point. Does $f$ have a relative maximum or minimum at this point?

The answer is that the origin is neither a relative maximum nor a relative minimum. Any neighborhood of $(0,0)$, no matter how small, includes points from each quadrant; $f(0,0)=0$, but $f(x,y)>0$ for any point in the first (when $x>0$, $y>0$) and third (when $x<0$, $y<0$) quadrants, and $f(x,y)<0$ for any point in the second (when $x<0$, $y>0$) and fourth (when $x>0$, $y<0$) quadrants. Such a point is called a saddle point.
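
The sign pattern around the origin can be illustrated by evaluating $f(x,y)=xy$ at one point in each quadrant on circles of shrinking radius. This is a small sketch; the choice of sample directions (the diagonal of each quadrant) and the function names are assumptions made for illustration.

```python
import math

def f(x, y):
    return x * y

def quadrant_values(radius):
    """Value of f along the diagonal direction of each quadrant, at distance `radius` from the origin."""
    angles = [math.pi / 4 + k * math.pi / 2 for k in range(4)]
    return [f(radius * math.cos(t), radius * math.sin(t)) for t in angles]

for radius in (1.0, 1e-3, 1e-8):
    print(radius, quadrant_values(radius))
```

However small the radius, the values alternate in sign from quadrant to quadrant, so $f(0,0)=0$ is neither the largest nor the smallest value in any neighborhood of the origin.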

(a) Graph of z=xy
(b) Contour plot of $z=xy$. As we can see, every neighborhood of $(0,0)$ contains points with $z>0$ and points with $z<0$.
 

Figure 7.

A critical point at which all the components of the gradient vanish, but which is neither a relative maximum nor a relative minimum, is called a saddle point.

Saddle points are somewhat analogous to the points of inflection for functions of one variable.

Second Partials Test

To decide whether a critical point is a maximum, a minimum, or a saddle point, we may graph the function. But what if we do not have access to a graphing application, or what if the function has three or more variables? Fortunately, we can systematically use what is called the “second partials test.” This test is similar to the second derivative test for functions of one variable. Because applying the second partials test to functions of three or more variables is rather laborious, here we restrict ourselves to functions of two variables.

Theorem 3. (Sufficient Conditions for Extreme Values: Second Partials Test) Let $(x_0,y_0)$ be a critical point of $f(x,y)$ that is interior to the domain of $f$. Suppose $f$ has continuous second order partial derivatives in a neighborhood of $(x_0,y_0)$. Let
$$A=\frac{\partial^2 f}{\partial x^2}(x_0,y_0),\quad B=\frac{\partial^2 f}{\partial x\,\partial y}(x_0,y_0),\quad C=\frac{\partial^2 f}{\partial y^2}(x_0,y_0)$$
and let
$$\Delta=\det H(x_0,y_0)=\det\begin{bmatrix}A&B\\B&C\end{bmatrix}=AC-B^2.$$
Then we have:

  (a) If $\Delta>0$ and $A>0$, then $f$ has a relative minimum at $(x_0,y_0)$.
  (b) If $\Delta>0$ and $A<0$, then $f$ has a relative maximum at $(x_0,y_0)$.
  (c) If $\Delta<0$, then $f$ has a saddle point at $(x_0,y_0)$.
  (d) If $\Delta=0$, this test gives no information.
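
The four cases of the test translate directly into a small decision function. The following is a minimal sketch; the function name and the returned strings are assumptions made for illustration, and the inputs are the values $A$, $B$, $C$ of the second partial derivatives at a critical point.

```python
def second_partials_test(A, B, C):
    """Classify a critical point from A = f_xx, B = f_xy, C = f_yy evaluated there."""
    delta = A * C - B * B
    if delta > 0:
        # delta > 0 forces A != 0, so exactly one of the two cases applies.
        return "relative minimum" if A > 0 else "relative maximum"
    if delta < 0:
        return "saddle point"
    return "inconclusive"

# f = x**2 + y**2 at the origin: A = 2, B = 0, C = 2.
print(second_partials_test(2, 0, 2))
# f = x*y at the origin: A = 0, B = 1, C = 0.
print(second_partials_test(0, 1, 0))
```

With floating-point inputs, a computed $\Delta$ that is very close to zero should be treated as inconclusive rather than trusted for its sign.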


Example 7
Given that $f$ is the function defined by $f(x,y)=x^3-3xy+y^3$, determine the relative extrema of $f$ if there are any.

Solution
To find the critical points, we need to find the first order partial derivatives of $f$ and set both of them equal to zero. We have:
$$\begin{cases}f_x(x,y)=3x^2-3y=0\\ f_y(x,y)=3y^2-3x=0\end{cases}$$

It follows from the first equation that $y=x^2$. If we plug $y=x^2$ into the second equation, we arrive at $3x^4-3x=3x(x^3-1)=0$, which gives us $x=0$ or $x=1$. Thus, there are two critical points, $O=(0,0)$ and $P=(1,1)$. To determine if $f$ has relative extrema at these points, we use the second partials test. The second order partial derivatives are

$$f_{xx}(x,y)=6x,\quad f_{xy}(x,y)=-3,\quad f_{yy}(x,y)=6y.$$

At $O=(0,0)$: $A=f_{xx}(0,0)=0$, $B=f_{xy}(0,0)=-3$, $C=f_{yy}(0,0)=0$. Because
$$\Delta=\det\begin{bmatrix}0&-3\\-3&0\end{bmatrix}=AC-B^2=-9<0,$$
according to part (c) of Theorem 3, $f$ has a saddle point at $O=(0,0)$.

At $P=(1,1)$: $A=f_{xx}(1,1)=6$, $B=f_{xy}(1,1)=-3$, $C=f_{yy}(1,1)=6$. Because
$$\Delta=\det\begin{bmatrix}6&-3\\-3&6\end{bmatrix}=AC-B^2=27>0,$$
and $A=f_{xx}(1,1)=6>0$, according to part (a) of Theorem 3, $f$ has a relative minimum at $P=(1,1)$.
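
When second partial derivatives are tedious to compute by hand, they can be estimated with central differences. The sketch below does this for $f(x,y)=x^3-3xy+y^3$ at the two critical points and recomputes $\Delta=AC-B^2$; the step size and the helper name are assumptions made for illustration.

```python
def f(x, y):
    return x**3 - 3*x*y + y**3

def hessian_entries(f, x, y, h=1e-4):
    """Central-difference estimates of f_xx, f_xy, f_yy at (x, y)."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return fxx, fxy, fyy

for point in [(0.0, 0.0), (1.0, 1.0)]:
    A, B, C = hessian_entries(f, *point)
    delta = A * C - B * B
    print(point, round(A, 4), round(B, 4), round(C, 4), round(delta, 4))
```

The estimates should be close to the exact values $(0,-3,0)$ at the origin and $(6,-3,6)$ at $(1,1)$; they carry small rounding errors, so only the sign of a clearly non-zero $\Delta$ should be relied on.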

 

Figure 8. Contour plot of $f(x,y)=x^3-3xy+y^3$ shows that $f$ has a relative minimum at $(1,1)$ and that $(0,0)$ is a saddle point.

 

Example 8
Given that $f$ is the function defined by $f(x,y)=\sin x+\sin y+\sin(x+y)$, determine the relative extrema of $f$ on the square $0\le x<2\pi$ and $0\le y<2\pi$.

Solution
To find the critical points, first we calculate the first order partial derivatives and set them equal to zero:
$$\begin{cases}f_x(x,y)=\cos x+\cos(x+y)=0\\ f_y(x,y)=\cos y+\cos(x+y)=0\end{cases}$$
These equations together imply that $\cos y=\cos x$; hence, $y=2k\pi\pm x$.

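If we plug $y=2k\pi+x$ (for $k\in\mathbb{Z}$) into the first equation, we obtain
$$\cos x+\cos 2x=0.\tag{*}$$
We know $\cos 2x=2\cos^2 x-1$ (see footnote 1). If we plug this identity into (*), we obtain $2\cos^2 x+\cos x-1=0$. This is a quadratic in $\cos x$. Thus:
$$\cos x=\frac14\left(-1\pm\sqrt{1+8}\right)=\frac12\ \text{ or }\ -1.$$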

The equation $\cos x=\frac12$ has two solutions in the interval $[0,2\pi)$: $x=\frac{\pi}{3}$ or $x=2\pi-\frac{\pi}{3}=\frac{5\pi}{3}$. Because $y$ lies between $0$ and $2\pi$, we have to choose $k=0$ and get $y=x$; that is, $P_1=(\frac{\pi}{3},\frac{\pi}{3})$ and $P_2=(\frac{5\pi}{3},\frac{5\pi}{3})$ are two critical points.

The solution of $\cos x=-1$ in the interval $[0,2\pi)$ is $x=\pi$. Again, because $y$ lies between $0$ and $2\pi$, we have to choose $k=0$ and find $y=\pi$. The third critical point is $P_3=(\pi,\pi)$.

If we plug $y=2k\pi-x$ into the first equation, we obtain $\cos x+\underbrace{\cos(2k\pi)}_{=1}=0\ \Rightarrow\ \cos x=-1$, which is the same equation we had before.

Hence the critical points are $P_1=(\frac{\pi}{3},\frac{\pi}{3})$, $P_2=(\frac{5\pi}{3},\frac{5\pi}{3})$ and $P_3=(\pi,\pi)$. For each of these points, we use the second partials test. We have:
$$f_{xx}(x,y)=-\sin x-\sin(x+y),\quad f_{xy}(x,y)=-\sin(x+y),\quad f_{yy}(x,y)=-\sin y-\sin(x+y).$$

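At $P_1=(\frac{\pi}{3},\frac{\pi}{3})$, we have:
$$A=f_{xx}\left(\tfrac{\pi}{3},\tfrac{\pi}{3}\right)=-\sin\tfrac{\pi}{3}-\sin\tfrac{2\pi}{3}=-\sqrt3,\quad B=f_{xy}\left(\tfrac{\pi}{3},\tfrac{\pi}{3}\right)=-\sin\tfrac{2\pi}{3}=-\tfrac{\sqrt3}{2},\quad C=f_{yy}\left(\tfrac{\pi}{3},\tfrac{\pi}{3}\right)=-\sin\tfrac{\pi}{3}-\sin\tfrac{2\pi}{3}=-\sqrt3.$$
Thus
$$\Delta=\det\begin{bmatrix}-\sqrt3&-\frac{\sqrt3}{2}\\-\frac{\sqrt3}{2}&-\sqrt3\end{bmatrix}=3-\frac34=\frac94>0.$$
Because $\Delta>0$ and $A=-\sqrt3<0$, we can conclude that $f$ has a relative maximum at $P_1=(\frac{\pi}{3},\frac{\pi}{3})$.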

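At $P_2=(\frac{5\pi}{3},\frac{5\pi}{3})$, we have:
$$A=f_{xx}\left(\tfrac{5\pi}{3},\tfrac{5\pi}{3}\right)=-\sin\tfrac{5\pi}{3}-\sin\tfrac{10\pi}{3}=\sqrt3,\quad B=f_{xy}\left(\tfrac{5\pi}{3},\tfrac{5\pi}{3}\right)=-\sin\tfrac{10\pi}{3}=\tfrac{\sqrt3}{2},\quad C=f_{yy}\left(\tfrac{5\pi}{3},\tfrac{5\pi}{3}\right)=-\sin\tfrac{5\pi}{3}-\sin\tfrac{10\pi}{3}=\sqrt3.$$
Thus
$$\Delta=\det\begin{bmatrix}\sqrt3&\frac{\sqrt3}{2}\\\frac{\sqrt3}{2}&\sqrt3\end{bmatrix}=3-\frac34=\frac94>0.$$
Because $\Delta>0$ and $A=\sqrt3>0$, we can conclude that $f$ has a relative minimum at $P_2=(\frac{5\pi}{3},\frac{5\pi}{3})$.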

At $P_3=(\pi,\pi)$, we have:
$$A=f_{xx}(\pi,\pi)=-\sin\pi-\sin 2\pi=0,\quad B=f_{xy}(\pi,\pi)=-\sin 2\pi=0,\quad C=f_{yy}(\pi,\pi)=-\sin\pi-\sin 2\pi=0.$$
Thus
$$\Delta=\det\begin{bmatrix}0&0\\0&0\end{bmatrix}=0.$$
Because $\Delta=0$, the second partials test is inconclusive. In this example, to investigate whether or not $f$ has an extremum at $P_3$, we can calculate the variation (increment) of $f$ near $P_3$:
$$\Delta f=f(\pi+h,\pi+k)-f(\pi,\pi)=\sin(\pi+h)+\sin(\pi+k)+\sin(2\pi+h+k)-\underbrace{f(\pi,\pi)}_{=0}.$$
Now we can use Taylor’s expansion:
$$\Delta f=-\sin h-\sin k+\sin(h+k)=-\left[h-\frac{h^3}{3!}+\cdots\right]-\left[k-\frac{k^3}{3!}+\cdots\right]+\left[(h+k)-\frac{(h+k)^3}{3!}+\cdots\right].$$
The first-degree terms cancel. To approximate $\Delta f$, because $h$ and $k$ can be made as small as we desire, we keep only the dominant terms, which are the terms of lowest degree, and ignore the other terms as being negligible by comparison:
$$\Delta f\approx\frac{h^3}{6}+\frac{k^3}{6}-\frac{(h+k)^3}{6}=-\frac{hk}{2}(h+k).$$
If $hk<0$ and $h+k>0$ (for example, $h=0.002$ and $k=-0.001$), then $\Delta f>0$; if $hk<0$ but $h+k<0$ (for example, $h=-0.002$ and $k=0.001$), then $\Delta f<0$. Therefore, $f$ has a saddle point at $P_3=(\pi,\pi)$.
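
The sign change of $\Delta f$ near $(\pi,\pi)$ can be confirmed numerically and compared with the cubic approximation $-\frac{hk}{2}(h+k)$. The sketch below uses somewhat larger displacements than the ones quoted in the text so that floating-point rounding stays negligible; that choice and the function names are assumptions made for illustration.

```python
import math

def f(x, y):
    return math.sin(x) + math.sin(y) + math.sin(x + y)

def delta_f(h, k):
    """Increment of f when moving from (pi, pi) to (pi + h, pi + k)."""
    return f(math.pi + h, math.pi + k) - f(math.pi, math.pi)

def cubic_approximation(h, k):
    """Leading (third-degree) term of the Taylor expansion of delta_f."""
    return -h * k * (h + k) / 2

for h, k in [(0.02, -0.01), (-0.02, 0.01)]:
    print(h, k, delta_f(h, k), cubic_approximation(h, k))
```

The two displacements give increments of opposite sign, and each agrees closely with the cubic term, which is consistent with a saddle point at $(\pi,\pi)$.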

The graph of f is shown in Fig. 9.

(a) Graph of f.
(b) Contour plot of f.

Figure 9.

We mentioned that when $\Delta=0$, the second partials test is inconclusive, and the function may have a maximum, a minimum, or a saddle point at the critical point. In the previous example, $f$ had a saddle point at a critical point where $\Delta=0$. Now consider the following functions:
$$g(x,y)=x^4+y^4,\quad \phi(x,y)=-(x^4+y^4),\quad \psi(x,y)=x^4-y^4.$$
You can verify that the origin is a critical point of these functions; $g$ has a relative minimum, $\phi$ has a relative maximum, and $\psi$ has a saddle point at the origin. See Fig. 10.

 

Figure 10.

1 Alternatively, we can transform the sum into a product using the identity $\cos\alpha+\cos\beta=2\cos\frac{\alpha+\beta}{2}\cos\frac{\alpha-\beta}{2}$.