3.19 Constrained Extrema and Lagrange Multipliers


Constrained Extrema

In practice, we often wish to optimize a function subject to some existing constraints. For example:

In economics and engineering, the constraint may be due to limited funds, materials, or energy.
If we wish to find the distance from a point $P=(x_0,y_0)$ to a line $ax+by=c$, we should find the minimum value of $d(x,y)=\sqrt{(x-x_0)^2+(y-y_0)^2}$ while $(x,y)$ satisfies $ax+by=c$.
Suppose $T(x,y,z)$ represents the temperature in space, and we want to find the maximum temperature on a surface given by $g(x,y,z)=0$.

When the constraint (also called the side condition) equation is given, we can solve it for one of the variables, say $z=\phi(x,y)$, and replace it in the function $T$. Then the problem reduces to finding the extreme values of the function $T(x,y,\phi(x,y))$, which now depends only on two independent variables. We have already applied this method in an earlier example, where we optimized the value of a function on the boundary of a region. To practice this method, let’s consider the following example.

Example 1

We want to make a rectangular box without a top, and of given volume $V$. If the least amount of material is to be used, determine the design specifications.

Solution

Let
$x=$ length of the box
$y=$ width of the box
$z=$ height of the box
where $x,y$ and $z$ are in the interval $(0,\infty)$. The specified volume is \[V=xyz\] The amount of material for constructing this box is proportional to its surface area \[S=xy+2xz+2yz.\] So we want to minimize $S(x,y,z)$ subject to $xyz=V$.

From $xyz=V$ we obtain $z=\frac{V}{xy}$ and then we plug it into the formula of $S$: \[S=xy+2\frac{V}{y}+2\frac{V}{x}.\] Now $S$ is expressed as a function of only two variables. To determine the relative extremum, we differentiate and set the partial derivatives equal to zero:
\[\frac{\partial S}{\partial x}=y-\frac{2V}{x^2}=0, \quad \frac{\partial S}{\partial y}=x-\frac{2V}{y^2}=0\]
or equivalently:
\[\left\{\begin{array}{l} x^2y=2V\\ \\ xy^2=2V \end{array}\right.\] If we divide the first equation by the second one, we obtain $x/y=1$. Therefore: \[x^2y=x^2 x=x^3=2V \Rightarrow x=y=(2V)^{1/3}\] From these values of $x$ and $y$, we get \[z=\frac{V}{xy}=\frac{(2V)^{1/3}}{2}\] Using the second derivative test, we can show that these values of $x,y$ and $z$ give a relative minimum of $S$. Because $S\to \infty$ as either $x\to 0^+$, $y\to 0^+$, $x\to \infty$, or $y\to \infty$, we can conclude that the relative minimum is also the absolute minimum.
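As a quick numerical sanity check of Example 1, the following sketch evaluates the reduced surface area $S(x,y)=xy+2V/y+2V/x$ at the candidate point and compares it with values on a coarse grid. The sample volume `V = 4.0` and the grid are arbitrary choices made for illustration; they are not part of the original example.

```python
def surface(x, y, V):
    """Surface area of the open box after substituting z = V / (x * y)."""
    return x * y + 2 * V / y + 2 * V / x

V = 4.0                        # sample volume (arbitrary illustrative value)
x_opt = (2 * V) ** (1 / 3)     # x = y = (2V)^(1/3)
y_opt = x_opt
z_opt = V / (x_opt * y_opt)    # equals (2V)^(1/3) / 2

# Partial derivatives from the text, evaluated at the candidate point.
dS_dx = y_opt - 2 * V / x_opt ** 2
dS_dy = x_opt - 2 * V / y_opt ** 2

s_opt = surface(x_opt, y_opt, V)

# Compare against a coarse grid of other positive (x, y) pairs.
grid = [0.25 * k for k in range(1, 41)]          # 0.25, 0.50, ..., 10.0
s_grid_min = min(surface(x, y, V) for x in grid for y in grid)

print("candidate:", x_opt, y_opt, z_opt, "S =", s_opt)
print("partials :", dS_dx, dS_dy)
print("grid min :", s_grid_min)
```

Both partial derivatives vanish (up to rounding) at the candidate point, and no grid point gives a smaller surface area, which is consistent with the conclusion that the relative minimum is the absolute minimum.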

Lagrange Multipliers

When the constraint is given implicitly by $g(x,y,z)=c$, it is not always possible or easy to solve the constraint equation for one of the variables (express $x, y$ or $z$ as a function of the remaining variables). The problem can be more complicated when there is more than one constraint. In such cases, an alternative procedure is a method called Lagrange multipliers.

To explain this method, let’s start with an example. Suppose we want to find the shortest and longest distance between the point $(1,-1)$ and the curve $C$

\[g(x,y)=\frac{(x-y-2)^2}{18}+\frac{(x+y)^2}{32}=1.\]

The distance between a point $(x,y)$ and $(1,-1)$ is given by \[f(x,y)=\sqrt{(x-1)^2+(y+1)^2}.\] So we want to maximize and minimize $f(x,y)$ subject to $g(x,y)=1$. First let’s sketch the curve $C$ and some level curves of $f$ (Fig. 1).

Figure 1. The orange lines show contour lines of $f$ with different values. The function $f$ takes on extreme values at four points $A,B,P$, and $Q$.

To extremize $f(x,y)$ subject to $g(x,y)=1$, we have to find the largest and smallest value of $k$ such that the level curve $f(x,y)=k$ intersects $g(x,y)=1$. Among the level curves that intersect $g(x,y)=1$, the minimum value of $f(x,y)$ occurs at the points $P$ and $Q$ where $f(x,y)$ has a value of 3 (see Fig. 1). At these points, the constraint curve $g(x,y)=1$ and the level curve $f(x,y)=3$ just touch each other; in other words, $f=3$ and $g=1$ have a common tangent line at $P$ and a common tangent at $Q$.

Note that $g(x,y)=1$ is a level curve of $z=g(x,y)$. Because at each point $\overrightarrow{\nabla} f$ is perpendicular to the level curves of $f$ and similarly $\overrightarrow{\nabla} g$ is perpendicular to the level curves of $g$, a common tangent at $P$ means that $\overrightarrow{\nabla} f(P)$ and $\overrightarrow{\nabla} g(P)$ are parallel. That is, there is a number $\lambda_1$ such that
\[\overrightarrow{\nabla} f(P)=\lambda_1 \overrightarrow{\nabla} g(P).\]
Similarly, there is a number $\lambda_2$ such that
\[\overrightarrow{\nabla} f(Q)=\lambda_2\overrightarrow{\nabla} g(Q).\]

We also observe that the maximum value of $f(x,y)$ subject to $g(x,y)=1$ occurs where the constraint curve and a level curve (here $f=4$) touch each other (in Fig. 1, these points are denoted by $A$ and $B$). Thus \[\overrightarrow{\nabla} f(A)=\mu_1\overrightarrow{\nabla} g(A),\] and \[\overrightarrow{\nabla} f(B)=\mu_2 \overrightarrow{\nabla} g(B),\] for some $\mu_1$ and $\mu_2$. Therefore, to find the maximum or minimum of $f(x,y)$ subject to the constraint $g(x,y)=1$, we look for a point $\textbf{x}_0$ such that
\[\overrightarrow{\nabla} f(\textbf{x}_0)=\lambda \overrightarrow{\nabla} g(\textbf{x}_0)\]
for some $\lambda$. This is the method of Lagrange multipliers. But why is this true?

Suppose the constraint curve $C$ is parameterized by some functions:¹ \[x=X(t),\quad y=Y(t)\] If, in the equation of $f$, $x$ and $y$ are replaced by $X(t)$ and $Y(t)$, then the distance between $(1,-1)$ and the points on $C$ becomes a function of $t$ \[F(t)=f(X(t),Y(t)).\] Therefore, the extreme values of the distance occur where $F'(t)=0$. From the chain rule we know
\[F'(t)=\frac{dF}{dt}=\frac{\partial f}{\partial x}\frac{dX}{dt}+\frac{\partial f}{\partial y}\frac{dY}{dt}\]
We can write the equation $F'(t)=0$ as
\[ \left(\frac{\partial f}{\partial x},\frac{\partial f}{\partial y}\right)\boldsymbol{\cdot}\left(\frac{dX}{dt},\frac{dY}{dt}\right)=0\]
This means at the extreme points the gradient vector $(f_x,f_y)$ is perpendicular to $(X'(t),Y'(t))$. Recall that $(X'(t),Y'(t))$ is tangent to the curve $C$.
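The argument above can be checked numerically. The sketch below assumes the parametrization $X(t)=1+\frac{3}{\sqrt{2}}\cos t+2\sqrt{2}\sin t$, $Y(t)=-1-\frac{3}{\sqrt{2}}\cos t+2\sqrt{2}\sin t$ of the curve $C$ (an assumption of this sketch; the code verifies that these points satisfy $g=1$). It samples $F(t)$, locates its smallest and largest values, and evaluates $F'(t)=\overrightarrow{\nabla} f\boldsymbol{\cdot}(X'(t),Y'(t))$ there.

```python
import math

A = 3 / math.sqrt(2)      # coefficient of cos t
B = 2 * math.sqrt(2)      # coefficient of sin t

# Assumed parametrization of C, centred at (1, -1).
def X(t):  return 1 + A * math.cos(t) + B * math.sin(t)
def Y(t):  return -1 - A * math.cos(t) + B * math.sin(t)
def dX(t): return -A * math.sin(t) + B * math.cos(t)
def dY(t): return A * math.sin(t) + B * math.cos(t)

def g(x, y):
    return (x - y - 2) ** 2 / 18 + (x + y) ** 2 / 32

def f(x, y):
    return math.hypot(x - 1, y + 1)

def grad_f(x, y):
    d = f(x, y)
    return ((x - 1) / d, (y + 1) / d)

def F_prime(t):
    """Chain rule: F'(t) = grad f . (X'(t), Y'(t))."""
    fx, fy = grad_f(X(t), Y(t))
    return fx * dX(t) + fy * dY(t)

N = 3600
ts = [2 * math.pi * k / N for k in range(N)]
F = [f(X(t), Y(t)) for t in ts]

# Largest deviation of the sampled points from the constraint g = 1.
constraint_error = max(abs(g(X(t), Y(t)) - 1) for t in ts)

t_min = ts[F.index(min(F))]
t_max = ts[F.index(max(F))]

print("g error        :", constraint_error)
print("min distance   :", min(F), " F'(t_min) =", F_prime(t_min))
print("max distance   :", max(F), " F'(t_max) =", F_prime(t_max))
```

The sampled minimum and maximum agree with the values 3 and 4 read off from the level curves, and $F'$ vanishes at both locations, so the gradient of $f$ is perpendicular to the tangent of $C$ there.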

On the other hand, $C$ is a level curve of $g$, so at each point $\overrightarrow{\nabla} g$ is perpendicular to $C$. Therefore, at the extreme points, $\overrightarrow{\nabla} f$ and $\overrightarrow{\nabla} g$ are both perpendicular to the tangent of $C$ and hence are parallel. The simplest version of the method of Lagrange multipliers is as follows.

Theorem 1. Suppose $U\subseteq \mathbb{R}^n$ is an open set and $f:U\to\mathbb{R}$ and $g:U\to\mathbb{R}$ are two continuously differentiable functions². Let $S=\left\{\mathbf{x}\in U|\ g(\mathbf{x})=c\right\}$ be the level set for $g$ with value $c$. If $f|S$, denoting “$f$ restricted to $S$,” has a relative extremum on $S$ at $\mathbf{x}_0\in S$, and $\overrightarrow{\nabla} g(\mathbf{x}_0)\neq\mathbf{0}$, then there exists a real number $\lambda$ such that
\[\overrightarrow{\nabla} f(\mathbf{x}_0)=\lambda \overrightarrow{\nabla} g(\mathbf{x}_0).\]

Proof

Let $\mathbf{r}:I\subseteq\mathbb{R}\to\mathbb{R}^n$ be a differentiable curve on the level set $S$ such that $\mathbf{r}(t_0)=\mathbf{x}_0$, $\mathbf{r}'(t_0)\neq \mathbf{0}$, and $\mathbf{r}(t)\in S$ for every $t\in I$. Then $g(\mathbf{r}(t))=c$, so the chain rule gives
\[\overrightarrow{\nabla} g(\underbrace{\mathbf{r}(t_0)}_{=\mathbf{x}_0})\boldsymbol{\cdot}\mathbf{r}'(t_0)=0.\]
Because $f|S$ attains a relative maximum or minimum at $\mathbf{x}_0$, the function $h=f\circ\mathbf{r}:I\to\mathbb{R}$ attains a relative maximum or minimum at $t_0$. Hence $h'(t_0)=0$ and according to the chain rule we have:
\[h'(t_0)=\overrightarrow{\nabla} f(\mathbf{x}_0)\boldsymbol{\cdot}\mathbf{r}'(t_0)=0.\]

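Thus the vectors $\overrightarrow{\nabla} f(\mathbf{x}_0)$ and $\overrightarrow{\nabla} g(\mathbf{x}_0)\neq \mathbf{0}$ are both normal to the nonzero vector $\mathbf{r}'(t_0)$. Because this holds for every such curve through $\mathbf{x}_0$, both gradients are normal to every tangent vector of $S$ at $\mathbf{x}_0$. Since $\overrightarrow{\nabla} g(\mathbf{x}_0)\neq \mathbf{0}$, these tangent vectors fill out an $(n-1)$-dimensional subspace (a consequence of the implicit function theorem), so the two gradients are parallel; that is, $\overrightarrow{\nabla} f(\mathbf{x}_0)=\lambda \overrightarrow{\nabla} g(\mathbf{x}_0)$ for some $\lambda$. $\blacksquare$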

The number $\lambda$ in the above theorem is called a Lagrange multiplier.
$\lambda$ might be zero.

Note that to find the extremum of $f|S$, we have $n+1$ unknowns ($n$ components of $\mathbf{x}_0$ and $\lambda$) and $n+1$ equations:
\[\label{Eq:LagrangeEQ} \left.\begin{align} \frac{\partial f}{\partial x_1}(\mathbf{x}_0)&=\lambda \frac{\partial g}{\partial x_1}(\mathbf{x}_0)\\ &\vdots\\ \frac{\partial f}{\partial x_n}(\mathbf{x}_0)&=\lambda \frac{\partial g}{\partial x_n}(\mathbf{x}_0)\\ g(\mathbf{x}_0)&=c \end{align}\right\} \quad (n+1) \text{ equations} \tag{i} \]
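To make the $n+1$ equations concrete, here is a minimal sketch of a helper that tests whether a given point satisfies them. The function names `gradient` and `lagrange_check`, the finite-difference step, and the test problem ($f=x+y$ on the circle $x^2+y^2=2$) are all illustrative assumptions and do not come from the text. The multiplier is estimated as the best-fitting $\lambda$ in the least-squares sense, and the residual is the largest violation among the $n+1$ equations.

```python
import math

def gradient(fn, x, h=1e-6):
    """Central-difference approximation of the gradient of fn at x."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((fn(xp) - fn(xm)) / (2 * h))
    return grad

def lagrange_check(f, g, c, x):
    """Return (lam, residual) for the n + 1 Lagrange equations at x.

    lam is the least-squares multiplier; residual is the largest absolute
    violation among grad f = lam * grad g (n equations) and g(x) = c.
    """
    gf, gg = gradient(f, x), gradient(g, x)
    lam = sum(a * b for a, b in zip(gf, gg)) / sum(b * b for b in gg)
    errors = [abs(a - lam * b) for a, b in zip(gf, gg)]
    errors.append(abs(g(x) - c))
    return lam, max(errors)

# Hypothetical test problem (not from the text): f = x + y on x^2 + y^2 = 2.
f = lambda p: p[0] + p[1]
g = lambda p: p[0] ** 2 + p[1] ** 2

lam, res = lagrange_check(f, g, 2.0, [1.0, 1.0])               # a solution
lam_bad, res_bad = lagrange_check(f, g, 2.0, [math.sqrt(2), 0.0])  # not one

print("at (1, 1)      : lambda =", lam, " residual =", res)
print("at (sqrt 2, 0) : lambda =", lam_bad, " residual =", res_bad)
```

A residual near zero indicates that the point satisfies the system; a large residual, as at $(\sqrt{2},0)$, indicates that the gradients are not parallel there even though the constraint holds.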

Example 2

Find the extrema of the function $f(x,y)=xy$ subject to the constraint $\frac{1}{4}x^2+\frac{1}{9}y^2=1$.

Solution

Let $g(x,y)=\frac{1}{4}x^2+\frac{1}{9}y^2$, so $S$ consists of all points $(x,y)$ such that $g(x,y)=1$. We have:
\[\overrightarrow{\nabla} f(x,y)=(y,x),\quad \overrightarrow{\nabla} g(x,y)=\left(\frac{1}{2}x,\frac{2}{9}y\right)\]

Note that $\overrightarrow{\nabla} g=\mathbf{0}$ if and only if $(x,y)=(0,0)$, which does not lie on the ellipse $g(x,y)=1$. Thus if $f$ subject to the constraint has an extremum at $(x_0,y_0)$, we must have $\overrightarrow{\nabla} f(x_0,y_0)=\lambda \overrightarrow{\nabla} g(x_0,y_0)$. Equations (i) take on the form:
\[y_0=\frac{1}{2}\lambda x_0, \quad x_0=\frac{2}{9}\lambda y_0, \quad \frac{1}{4}x_0^2+\frac{1}{9}y_0^2=1.\]
There are different ways to solve the above system of equations. One way is to replace $y_0$ from the first equation in the second equation
\[x_0=\frac{2}{9}\lambda y_0=\frac{2\lambda}{9}(\frac{\lambda}{2} x_0)=\frac{1}{9} \lambda^2 x_0\]
This leads to
\[x_0-x_0\frac{\lambda^2}{9}=x_0\left(1-\frac{\lambda^2}{9}\right)=0\Rightarrow x_0=0 \ \text{or}\ \lambda=\pm 3.\]
If $x_0=0$, then the first equation gives $y_0=\frac{1}{2}\lambda x_0=0$, but $(0,0)$ does not satisfy $g(x_0,y_0)=1$, so this case yields no solution. If $\lambda=\pm 3$, it follows from $y_0=\frac{1}{2}\lambda x_0$ that $y_0=\pm \frac{3}{2} x_0$. Substituting into $g(x_0,y_0)=1$, we get:
\[\frac{1}{4}x_0^2+\frac{1}{9}\cdot\frac{9}{4}x_0^2=1 \Rightarrow \frac{1}{2}x_0^2=1\Rightarrow x_0=\pm \sqrt{2}.\]
If $x_0=\sqrt{2}$, then $y_0=\pm \frac{3}{2}\sqrt{2}$, and if $x_0=-\sqrt{2}$, then $y_0=\pm\frac{3}{2}\sqrt{2}$.

Now we list the points we found and the values of $f$ at these points:

$\large (x_0,y_0)$	$(\sqrt{2},\frac{3\sqrt{2}}{2})$	$(\sqrt{2},-\frac{3\sqrt{2}}{2})$	$(-\sqrt{2},\frac{3\sqrt{2}}{2})$	$(-\sqrt{2},-\frac{3\sqrt{2}}{2})$
$f(x_0,y_0)=x_0y_0$	$3$	$-3$	$-3$	$3$

The function $f$ attains its maximum value 3 at $(\sqrt{2},3\sqrt{2}/2)$ and $(-\sqrt{2},-3\sqrt{2}/2)$, and its minimum value $-3$ at $(\sqrt{2},-3\sqrt{2}/2)$ and $(-\sqrt{2},3\sqrt{2}/2)$. Note that at these points the level curve $g=1$ is tangent to one of the level curves of $f$ (see Fig. 2).


Figure 2. (a) The graph of $f(x,y)$; the blue curve shows $f$ restricted to the ellipse $g(x,y)=1$. (b) The level curves of $f$ and $g$; the extrema of $f$ subject to the constraint $g(x,y)=1$ occur at points where the level curves of $f$ are tangent to the level curve $g(x,y)=1$.
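The extreme values in Example 2 can be cross-checked by brute force. The sketch below assumes the standard parametrization $x=2\cos t$, $y=3\sin t$ of the ellipse (an assumption of this sketch, not used in the text), samples $f=xy$ along it, and also verifies that two of the candidate points satisfy the Lagrange equations with $\lambda=\pm 3$.

```python
import math

def f(x, y):
    return x * y

def g(x, y):
    return x ** 2 / 4 + y ** 2 / 9

# Sample the ellipse g = 1 via x = 2 cos t, y = 3 sin t.
N = 3600
pts = [(2 * math.cos(2 * math.pi * k / N), 3 * math.sin(2 * math.pi * k / N))
       for k in range(N)]
vals = [f(x, y) for x, y in pts]
f_max, f_min = max(vals), min(vals)

def lagrange_residuals(x0, y0, lam):
    """Left side minus right side of the three equations of the example."""
    return (y0 - 0.5 * lam * x0,
            x0 - (2 / 9) * lam * y0,
            g(x0, y0) - 1)

r2 = math.sqrt(2)
res_max = lagrange_residuals(r2, 1.5 * r2, 3.0)     # point with f = 3
res_min = lagrange_residuals(r2, -1.5 * r2, -3.0)   # point with f = -3

print("sampled max/min of f:", f_max, f_min)
print("residuals at max    :", res_max)
print("residuals at min    :", res_min)
```

The sampled extremes agree with the values $3$ and $-3$ found analytically, and the residuals are zero up to rounding.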

Let’s go back to the example of making an open box and try to solve it using the method of Lagrange multipliers.

Example 3

We want to make a rectangular box without a top, and of given volume $V$. If the least amount of material is to be used, determine the design specifications using Lagrange multipliers.

Solution

Again, let $x, y$ and $z$ be the length, width, and height of the box, respectively. We want to minimize \[S=xy+2xz+2yz\] subject to $xyz=V$, where $V$ is the given volume.

If we let $g(x,y,z)=xyz$, then
\[\overrightarrow{\nabla} S=(y+2z, x+2z, 2x+2y),\quad \overrightarrow{\nabla} g=(yz, xz, xy)\]
Note that $\overrightarrow{\nabla} g=\mathbf{0}$ only if at least two of $x$, $y$, and $z$ are zero, and no such point lies on the level surface $g(x,y,z)=V$ (recall that $V>0$). So if $S$ subject to the constraint has an extremum at $(x_0,y_0,z_0)$, we have
\[\overrightarrow{\nabla} S(x_0,y_0,z_0)=\lambda \overrightarrow{\nabla} g(x_0,y_0,z_0).\]
Hence our equations become
\[\begin{align} \left\{\begin{array}{l c c} y+2z=\lambda yz & & \text{($i$)}\\ \\ x+2z=\lambda xz & & \text{($ii$)}\\ \\ 2x+2y=\lambda xy& & \text{($iii$)}\\ \\ xyz=V& & \text{($iv$)} \end{array}\right.\end{align}\]
Because $x,y$ and $z$ are nonzero, if we divide the first equation by $yz$, the second equation by $xz$, and the third one by $xy$, we get:
\[\begin{align} \left\{\begin{array}{l c c} \dfrac{1}{z}+\dfrac{2}{y}=\lambda & & \text{($i^\prime$)}\\ \\ \dfrac{1}{z}+\dfrac{2}{x}=\lambda & & \text{($ii^\prime$)}\\ \\ \dfrac{2}{y}+\dfrac{2}{x}=\lambda& & \text{($iii^\prime$)}\\ \\ xyz=V& & \text{($iv^\prime$)} \end{array}\right.\end{align}\]
Then
\[\text{Eq. ($i^\prime$)},\ \ \text{Eq. ($ii^\prime$)}\quad \Rightarrow \quad x=y\]
\[\text{Eq. ($ii^\prime$)},\ \ \text{Eq. ($iii^\prime$)}, \ \text{and}\ x=y\quad \Rightarrow \quad z=\frac{1}{2} x\]
\[x=y=2z, \ \ \text{Eq. ($iv^\prime$)} \Rightarrow \frac{1}{2}x^3=V\quad \Rightarrow\quad x=y=2z=(2V)^{1/3}.\]
This is the result we previously obtained in Example 1.
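As a check on Example 3, the following sketch substitutes $x=y=2z=(2V)^{1/3}$ into equations ($i$) through ($iv$) and reports the residuals. The multiplier value $\lambda=4/x$ is obtained from ($iii^\prime$) with $x=y$; the sample volumes are arbitrary illustrative choices.

```python
def lagrange_residuals(V):
    """Residuals of equations (i)-(iv) at the claimed solution for volume V."""
    x = y = (2 * V) ** (1 / 3)
    z = x / 2
    lam = 4 / x                      # from (iii'): 2/y + 2/x with x = y
    return [
        (y + 2 * z) - lam * y * z,   # (i)
        (x + 2 * z) - lam * x * z,   # (ii)
        (2 * x + 2 * y) - lam * x * y,  # (iii)
        x * y * z - V,               # (iv)
    ]

# Sample volumes (arbitrary illustrative values).
volumes = (0.5, 1.0, 4.0, 27.0)
worst = max(abs(r) for V in volumes for r in lagrange_residuals(V))

for V in volumes:
    print(V, lagrange_residuals(V))
print("largest residual:", worst)
```

All four residuals vanish up to rounding for each sample volume, so the point satisfies the full Lagrange system and not only the reduced equations.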

The following example shows an application of Lagrange multipliers in economics and when there are many variables.

Example 4

Assume there are $n$ commodities with prices per unit $p_1,\cdots,p_n$. Assume we have $L$ dollars to spend on these commodities. This means we have the constraint $p_1 x_1+\cdots+p_n x_n=L$ where $x_1,\cdots,x_n$ are the amounts of the commodities. Assume the utility is given by the Cobb-Douglas function $U=f(x_1,\cdots,x_n)=K x_1^{\alpha_1}\cdots x_n^{\alpha_n}$ where $K, \alpha_1,\cdots,\alpha_n$ are positive constants. Find the maximum utility we can achieve.

Solution

Let $g(x_1,\cdots,x_n)=p_1 x_1+\cdots+p_n x_n$. So we seek the maximum value of $f(x_1,\cdots,x_n)$ subject to $g(x_1,\cdots,x_n)=L$. Because $x_i\geq 0$ (for $i=1,\cdots,n$), the constraint set is only a part of the level set $g(x_1,\cdots,x_n)=L$. We note that when the amount of a commodity $x_i$ approaches zero, the utility approaches zero. Hence, the utility takes on its maximum value when all the $x_i$ are positive. Because $\overrightarrow{\nabla} g(x_1,\cdots,x_n)=(p_1,\cdots,p_n)\neq \mathbf{0}$, if $f$ attains its maximum at a point, we have $\overrightarrow{\nabla} f(x_1,\cdots,x_n)=\lambda \overrightarrow{\nabla} g(x_1,\cdots,x_n)$; that is,
\[\left\{\begin{array}{l} \alpha_1 K x_1^{\alpha_1-1}x_2^{\alpha_2}\cdots x_n^{\alpha_n}=\lambda p_1\\ \hspace{1cm}\vdots\\ \alpha_n K x_1^{\alpha_1}x_2^{\alpha_2}\cdots x_n^{\alpha_n-1}=\lambda p_n \end{array}\right.\]
If we multiply both sides of the first equation by $x_1$, the second equation by $x_2$, $\cdots$, and the $n$-th equation by $x_n$, we get:
\[\left\{\begin{array}{l} \alpha_1 f(x_1,\cdots,x_n)=\lambda p_1 x_1\\ \hspace{1cm}\vdots\\ \alpha_n f(x_1,\cdots,x_n)=\lambda p_n x_n \end{array}\right.\]
Therefore: \[\frac{p_1}{\alpha_1}x_1=\cdots=\frac{p_n}{\alpha_n} x_n\] This result makes sense because it says the amount of commodity $x_i$ is proportional to its exponent $\alpha_i$ and inversely proportional to its price $p_i$.

Let $\frac{p_i}{\alpha_i} x_i=C$ for every $i$, so that $x_i=\frac{\alpha_i}{p_i}C$. Then
\[p_1 x_1+\cdots+p_n x_n=p_1\frac{\alpha_1}{p_1} C+\cdots+p_n\frac{\alpha_n}{p_n} C=L\]
or
\[C(\alpha_1+\cdots+\alpha_n)=C\sum_{i=1}^n \alpha_i=L\Rightarrow C=\frac{L}{\sum_{i=1}^n \alpha_i}.\]
Consequently,
\[x_i=\frac{\alpha_i}{p_i}\frac{L}{\sum_{j=1}^n \alpha_j}.\] Finally we conclude the maximum value of $f$ is:
\[K\left(\frac{\alpha_1 L}{p_1 \sum_{j=1}^n \alpha_j}\right)^{\alpha_1} \cdots \left(\frac{\alpha_n L}{p_n \sum_{j=1}^n \alpha_j}\right)^{\alpha_n}=K\left(\frac{L}{\sum_{j=1}^n \alpha_j}\right)^{\sum_{j=1}^n\alpha_j} \left(\frac{\alpha_1}{p_1}\right)^{\alpha_1} \cdots \left(\frac{\alpha_n}{p_n}\right)^{\alpha_n}\]
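The closed-form answer of Example 4 can be tested on sample data. In the sketch below, the values of `K`, `alpha`, `p`, and `L` are illustrative assumptions, not data from the text. The code computes the allocation $x_i=\alpha_i L/(p_i\sum_j\alpha_j)$, confirms that it spends exactly the budget, and compares its utility with that of many randomly generated allocations that also spend the full budget.

```python
import math
import random

def utility(x, K, alpha):
    """Cobb-Douglas utility K * x_1^a_1 * ... * x_n^a_n."""
    return K * math.prod(xi ** a for xi, a in zip(x, alpha))

# Sample data (illustrative assumptions only).
K = 2.0
alpha = [0.5, 0.3, 0.4]
p = [1.0, 2.0, 4.0]
L = 12.0

total = sum(alpha)
x_star = [a * L / (pi * total) for a, pi in zip(alpha, p)]
spent = sum(pi * xi for pi, xi in zip(p, x_star))
u_star = utility(x_star, K, alpha)

# Closed-form maximum value from the last displayed formula.
closed_form = K * (L / total) ** total * math.prod(
    (a / pi) ** a for a, pi in zip(alpha, p)
)

# Random allocations that spend the whole budget: pick budget shares w_i.
random.seed(0)
best_random = 0.0
for _ in range(10000):
    w = [random.random() for _ in alpha]
    s = sum(w)
    x = [wi / s * L / pi for wi, pi in zip(w, p)]
    best_random = max(best_random, utility(x, K, alpha))

print("x*          :", x_star, " spent:", spent)
print("U(x*)       :", u_star, " closed form:", closed_form)
print("best random :", best_random)
```

None of the random allocations should exceed the utility at $x^*$, and the two expressions for the maximum value should agree up to rounding.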

The following example shows that when maximizing or minimizing a function subject to a constraint, we should also investigate the points where the gradient of the constraint is zero, $\overrightarrow{\nabla} g(\mathbf{x})=\mathbf{0}$.

Example 5

Find the nearest point on the curve $y^3-x^2=0$ to the point $(0,-1)$.

Solution

If we plot the level curve $g(x,y)=y^3-x^2=0$ (Fig. 3), it is clear that $(0,0)$ is the nearest point on this curve to the point $(0,-1)$, and the distance is 1. If we want to use the method of Lagrange multipliers, we need to minimize the square of the distance $f(x,y)=(x-0)^2+(y-(-1))^2$ subject to the constraint $g(x,y)=y^3-x^2=0$.

Figure 3.

Here
\[\overrightarrow{\nabla} f(x,y)=(2x,2y+2),\quad \overrightarrow{\nabla} g(x,y)=(-2x,3y^2).\]
Equations (i) take on the form
\[\left\{\begin{array}{l} 2x_0=-2\lambda x_0\\ 2y_0+2=3\lambda y_0^2\\ y_0^3-x_0^2=0 \end{array}\right.\]
However, the solution $(x_0,y_0)=(0,0)$ does not satisfy the second equation. The reason is that at this point the gradient of $g$ is $\mathbf{0}$, $\overrightarrow{\nabla} g(0,0)=(0,0)$, so we cannot use Theorem 1. In addition, the level curve is not smooth at $(0,0)$.
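The failure of the Lagrange system in Example 5 can be seen numerically. The sketch below assumes the parametrization $x=t^3$, $y=t^2$ of the curve $y^3=x^2$ (an assumption of this sketch), finds the sampled point nearest to $(0,-1)$, and then examines the two cases of the first Lagrange equation, $x_0=0$ or $\lambda=-1$.

```python
def f(x, y):
    """Squared distance from (x, y) to (0, -1)."""
    return x ** 2 + (y + 1) ** 2

def g(x, y):
    return y ** 3 - x ** 2

def grad_g(x, y):
    return (-2 * x, 3 * y ** 2)

# Sample the curve via x = t^3, y = t^2, so that y^3 - x^2 = 0.
ts = [k / 1000 for k in range(-2000, 2001)]
curve = [(t ** 3, t ** 2) for t in ts]
on_curve = max(abs(g(x, y)) for x, y in curve)   # rounding error only

nearest = min(curve, key=lambda pt: f(*pt))

# Case x0 = 0: the constraint forces y0 = 0, and the second equation
# 2*y0 + 2 = 3*lam*y0**2 reads 2 = 0 whatever lam is.
lhs = 2 * 0.0 + 2
rhs_samples = [3 * lam * 0.0 ** 2 for lam in (-10.0, -1.0, 0.0, 1.0, 10.0)]

# Case lam = -1: the second equation becomes 3*y0**2 + 2*y0 + 2 = 0,
# whose discriminant is negative, so it has no real root.
discriminant = 2 ** 2 - 4 * 3 * 2

print("nearest point:", nearest, " squared distance:", f(*nearest))
print("grad g there :", grad_g(*nearest))
print("case x0 = 0  :", lhs, "vs", rhs_samples)
print("case lam = -1: discriminant =", discriminant)
```

The nearest sampled point is the origin, where $\overrightarrow{\nabla} g$ vanishes, while neither case of the Lagrange system has a real solution. The minimum is therefore found only by examining the point where $\overrightarrow{\nabla} g=\mathbf{0}$ separately.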

If a function $f$ subject to a constraint $g(\mathbf{x})=c$ attains its extreme value at a point $\mathbf{x}_0$, then $\mathbf{x}_0$ is one of the following four types of points:

$\mathbf{x}_0$ is a point where $\overrightarrow{\nabla} f(\mathbf{x}_0)=\lambda \overrightarrow{\nabla} g(\mathbf{x}_0)$,
$\mathbf{x}_0$ is a point where $\overrightarrow{\nabla} g(\mathbf{x}_0)=\mathbf{0}$,
$\mathbf{x}_0$ is a rough point of $f$ or $g$ where $\overrightarrow{\nabla} f(\mathbf{x}_0)$ or $\overrightarrow{\nabla} g(\mathbf{x}_0)$ does not exist, or
$\mathbf{x}_0$ is on the boundary of the level set $g(\mathbf{x})=c$.

Multiple Constraints

The method of Lagrange multipliers extends to the case of multiple constraints: say to $f(x,y,z)$ subject to two constraints
\[\label{Eq:Lagrange_2Constraint} g_1(x,y,z)=c_1,\quad \text{and} \quad g_2(x,y,z)=c_2.\tag{ii}\]
Geometrically this means we seek the extreme values of $f(x,y,z)$ on a curve $C$ formed by the intersection of two level surfaces $g_1(x,y,z)=c_1$ and $g_2(x,y,z)=c_2$ (Fig. 4).

Figure 4. Because $\vec{\nabla}g_1$ is perpendicular to the surface $g_1=c_1$ and $\vec{\nabla}g_2$ is perpendicular to the surface $g_2=c_2$, both $\vec{\nabla}g_1$ and $\vec{\nabla}g_2$ are perpendicular to $C$. At a point where $f$ has a maximum or minimum, $\vec{\nabla}f$ is also perpendicular to $C$.

Suppose $f$ subject to these constraints, $f|C$, has an extremum at $P=(x_0,y_0,z_0)$, and the functions $f, g_1$, and $g_2$ have continuous first partial derivatives near $P$. Analogous to the case of one constraint, we can argue that $\overrightarrow{\nabla} f$ is perpendicular to $C$ at $P$. Additionally, because the gradient is perpendicular to the level surface, both $\overrightarrow{\nabla} g_1$ and $\overrightarrow{\nabla} g_2$ are perpendicular to $C$. If $\overrightarrow{\nabla} g_1(x_0,y_0,z_0)$ and $\overrightarrow{\nabla} g_2(x_0,y_0,z_0)$ are neither zero vectors nor collinear vectors, then $\overrightarrow{\nabla} f(x_0,y_0,z_0)$ lies in the plane spanned by the two vectors $\overrightarrow{\nabla} g_1(x_0,y_0,z_0)$ and $\overrightarrow{\nabla} g_2(x_0,y_0,z_0)$ and hence can be expressed as a linear combination of these two vectors, say:
\[\begin{align} \label{Eq:Lagrange_2Multipliers} \bbox[#F2F2F2,5px,border:2px solid black]{\overrightarrow{\nabla} f(x_0,y_0,z_0)=\lambda_1 \overrightarrow{\nabla} g_1(x_0,y_0,z_0)+\lambda_2 \overrightarrow{\nabla} g_2(x_0,y_0,z_0)}\tag{iii}\end{align}\]
In this method, Equations (ii) and (iii) must be solved simultaneously for five unknowns $x_0,y_0,z_0,\lambda_1$ and $\lambda_2$:
\[\left\{\begin{array}{l} \dfrac{\partial f}{\partial x}(x_0,y_0,z_0)=\lambda_1 \dfrac{\partial g_1}{\partial x}(x_0,y_0,z_0)+\lambda_2 \dfrac{\partial g_2}{\partial x}(x_0,y_0,z_0)\\ \\ \dfrac{\partial f}{\partial y}(x_0,y_0,z_0)=\lambda_1 \dfrac{\partial g_1}{\partial y}(x_0,y_0,z_0)+\lambda_2 \dfrac{\partial g_2}{\partial y}(x_0,y_0,z_0)\\ \\ \dfrac{\partial f}{\partial z}(x_0,y_0,z_0)=\lambda_1 \dfrac{\partial g_1}{\partial z}(x_0,y_0,z_0)+\lambda_2 \dfrac{\partial g_2}{\partial z}(x_0,y_0,z_0)\\ \\ g_1(x_0,y_0,z_0)=c_1\\ \\ g_2(x_0,y_0,z_0)=c_2 \end{array}\right.\]

Example 6

Find the extreme points of the function $f(x,y,z)=x+4y-2z$ on the curve of the intersection of the cylinder $x^2+y^2=4$ and the plane $2y-z=3$ (Fig. 5).

Figure 5.

Solution

Here we are given two constraints:
\[g_1(x,y,z)=x^2+y^2=4,\quad g_2(x,y,z)=2y-z=3.\] Note that
\[\overrightarrow{\nabla} f=(1,4,-2),\quad \overrightarrow{\nabla} g_1=(2x,2y,0),\quad\text{and}\quad \overrightarrow{\nabla} g_2=(0,2,-1).\]
The vector $\overrightarrow{\nabla} g_1=(0,0,0)$ only when $x=0$ and $y=0$, which clearly does not satisfy the constraint $g_1=4$. Moreover, the third component of $\overrightarrow{\nabla} g_1$ is zero while that of $\overrightarrow{\nabla} g_2$ is not, so the two vectors are not parallel. Therefore, any constrained critical point $(x_0,y_0,z_0)$ must satisfy
\[\overrightarrow{\nabla} f(x_0,y_0,z_0)=\lambda_1 \overrightarrow{\nabla} g_1(x_0,y_0,z_0)+\lambda_2 \overrightarrow{\nabla} g_2(x_0,y_0,z_0).\]
It follows that we must solve the following system of equations for five unknowns $x,y,z,\lambda_1$ and $\lambda_2$:
\[\left\{\begin{array}{l} 1=2x\lambda_1 +0\\ 4=2y\lambda_1+2\lambda_2\\ -2=0-\lambda_2\\ x^2+y^2=4\\ 2y-z=3 \end{array}\right.\]
From the third equation, we know $\lambda_2=2$. So the second equation becomes $2y\lambda_1=0$. Because the first equation implies $\lambda_1\neq 0$, it follows from $2y\lambda_1=0$ that $y=0$. If we substitute $y=0$ in the fourth and fifth equations, we get $x=\pm 2$ and $z=-3$. Therefore, $f$ may have extrema at $(\pm 2,0,-3)$.

The condition $x^2+y^2=4$ implies $-2\leq x,y\leq 2$. Then it follows from $2y-z=3$ that $-7\leq z\leq 1$. This means the constraint set, $S$, is bounded. Because the constraint set $S$ is closed and bounded, and $f$ is a continuous function, $f|S$ assumes its maximum and minimum (Theorem 1 in the previous section). Here we have only two candidates, with $f(2,0,-3)=8$ and $f(-2,0,-3)=4$; therefore $(2,0,-3)$ is the maximum point and $(-2,0,-3)$ is the minimum point.
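Example 6 can also be checked by sampling the constraint curve directly. The sketch below assumes the parametrization $x=2\cos t$, $y=2\sin t$ of the cylinder (an assumption of this sketch) and computes $z$ from the plane equation $2y-z=3$. It then locates the largest and smallest sampled values of $f$ and tests the first three Lagrange equations at those points with $\lambda_1=1/(2x)$ and $\lambda_2=2$.

```python
import math

def f(x, y, z):
    return x + 4 * y - 2 * z

# Sample the intersection curve: cylinder x^2 + y^2 = 4, plane 2y - z = 3.
N = 3600
pts = []
for k in range(N):
    t = 2 * math.pi * k / N
    x, y = 2 * math.cos(t), 2 * math.sin(t)
    z = 2 * y - 3                    # solve the plane equation for z
    pts.append((x, y, z))

p_max = max(pts, key=lambda p: f(*p))
p_min = min(pts, key=lambda p: f(*p))

def lagrange_residuals(x, y, z):
    """Residuals of the first three equations with lam1 = 1/(2x), lam2 = 2."""
    lam1, lam2 = 1 / (2 * x), 2.0
    return (1 - 2 * x * lam1,
            4 - (2 * y * lam1 + 2 * lam2),
            -2 - (-lam2))

print("max point:", p_max, " f =", f(*p_max), lagrange_residuals(*p_max))
print("min point:", p_min, " f =", f(*p_min), lagrange_residuals(*p_min))
```

On the curve, $f$ simplifies to $x+6$ because $z=2y-3$, which explains why the extremes occur at $x=\pm 2$.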

By similar reasoning, we obtain equations for minimizing or maximizing $f(x_1,\cdots,x_n)$ subject to several constraints
\[\begin{align} \label{Eq:m-constraints} \left\{\begin{array}{l} g_1(x_1,\cdots,x_n)=c_1\\ g_2(x_1,\cdots,x_n)=c_2\\ \vdots\\ g_m(x_1,\cdots,x_n)=c_m \end{array}\right.\tag{iv}\end{align}\]
where $m<n$. Assume $f(x_1,\cdots,x_n)$ has a relative extremum at $\mathbf{x}_0$ when the variables are restricted by the constraint equations (iv). If $f, g_1,\cdots,$ and $g_m$ have continuous first partial derivatives at all points near $\mathbf{x}_0$ and if each $\overrightarrow{\nabla} g_i(\mathbf{x}_0)$ is not a linear combination of the other $\overrightarrow{\nabla} g_j(\mathbf{x}_0)$ ($j\neq i$), then there exist $m$ real numbers $\lambda_1,\cdots, \lambda_m$ such that
\[\bbox[#F2F2F2,5px,border:2px solid black]{\overrightarrow{\nabla} f(\mathbf{x}_0)=\lambda_1\overrightarrow{\nabla} g_1(\mathbf{x}_0)+\cdots+\lambda_m \overrightarrow{\nabla} g_m(\mathbf{x}_0).}\]

¹For instance, in this specific example,
\[\begin{cases} x=X(t)=1+\dfrac{3}{\sqrt{2}}\cos t+2\sqrt{2} \sin t\\ \\ y=Y(t)=-1-\dfrac{3}{\sqrt{2}}\cos t+2\sqrt{2}\sin t \end{cases}\qquad (0\leq t\leq2\pi)\]

²In other words, $f$ and $g$ have continuous partial derivatives.
