In practice, we often wish to optimize a function subject to some existing constraints.
In economics and engineering, the constraint may be due to limited funds, materials, or energy.
If we wish to find the distance from a point \(P=(x_0,y_0)\) to a line \(ax+by=c\), we should find the minimum value of \(d(x,y)=\sqrt{(x-x_0)^2+(y-y_0)^2}\) while \((x,y)\) satisfies \(ax+by=c\).
Suppose \(T(x,y,z)\) represents the temperature in space, and we want to find the maximum temperature on a surface given by \(g(x,y,z)=0\).
When the constraint (also called the side condition) equation is given, we can solve it for one of the variables, say \(z=\phi(x,y)\), and substitute it into the function \(T\). The problem then reduces to finding the extreme values of the function \(T(x,y,\phi(x,y))\), which now depends on only two independent variables. We have already applied this method when we optimized the value of a function on the boundary of a region. To practice this method, let’s consider the following example.
Example 1
We want to make a rectangular box without a top, and of given volume \(V\). If the least amount of material is to be used, determine the design specifications.
Solution
Let \(x=\) length of the box, \(y=\) width of the box, and \(z=\) height of the box,
where \(x,y\) and \(z\) are in the interval \((0,\infty)\). The specified volume is \[V=xyz\] The amount of material for constructing this box is proportional to its surface area \[S=xy+2xz+2yz.\] So we want to minimize \(S(x,y,z)\) subject to \(xyz=V\).
From \(xyz=V\) we obtain \(z=\frac{V}{xy}\) and then we plug it into the formula of \(S\): \[S=xy+2\frac{V}{y}+2\frac{V}{x}.\] Now \(S\) is expressed as a function of only two variables. To determine the relative extremum, we differentiate and set the partial derivatives equal to zero: \[\frac{\partial S}{\partial x}=y-\frac{2V}{x^2}=0, \quad \frac{\partial S}{\partial y}=x-\frac{2V}{y^2}=0\]
or equivalently: \[\left\{\begin{array}{l} x^2y=2V\\ \\ xy^2=2V \end{array}\right.\] If we divide the first equation by the second one, we obtain \(x/y=1\). Therefore: \[x^2y=x^2 x=x^3=2V \Rightarrow x=y=(2V)^{1/3}\] From these values of \(x\) and \(y\), we get \[z=\frac{V}{xy}=\frac{(2V)^{1/3}}{2}\] Using the second derivative test, we can show that these values of \(x,y\) and \(z\) give a relative minimum of \(S\). Because \(S\to \infty\) as either \(x\to 0+\), \(y\to 0+\), \(x\to \infty\), or \(y\to \infty\), we can conclude that the relative minimum is also the absolute minimum.
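The second derivative test mentioned in Example 1 can be checked numerically. The following is a minimal sketch, assuming a hypothetical volume \(V=4\) (chosen only so that the numbers come out round); the function names are illustrative and not part of the text.

```python
def surface_area(x, y, V):
    """Material for the open box after substituting z = V / (x * y)."""
    return x * y + 2 * V / y + 2 * V / x


def gradient(x, y, V):
    """Partial derivatives (S_x, S_y) of the reduced surface area."""
    return (y - 2 * V / x**2, x - 2 * V / y**2)


def second_derivative_test(x, y, V):
    """Return (S_xx, D) with D = S_xx * S_yy - S_xy**2."""
    s_xx = 4 * V / x**3
    s_yy = 4 * V / y**3
    s_xy = 1.0
    return s_xx, s_xx * s_yy - s_xy**2


V = 4.0                     # hypothetical volume (assumption, not from the text)
side = (2 * V) ** (1 / 3)   # x = y = (2V)^(1/3)
height = V / side**2        # z = V / (xy)

s_x, s_y = gradient(side, side, V)
s_xx, D = second_derivative_test(side, side, V)

print(f"x = y = {side:.4f}, z = {height:.4f}, S = {surface_area(side, side, V):.4f}")
print(f"gradient = ({s_x:.2e}, {s_y:.2e}), S_xx = {s_xx:.4f}, D = {D:.4f}")
```

A positive \(S_{xx}\) together with a positive \(D\) indicates a relative minimum. In general \(S_{xx}=S_{yy}=4V/(2V)=2\) and \(S_{xy}=1\) at the critical point, so \(D=3\) for every \(V>0\).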
Lagrange Multipliers
When the constraint is given implicitly by \(g(x,y,z)=c\), it is not always possible or easy to solve the constraint equation for one of the variables (express \(x, y\) or \(z\) as a function of the remaining variables). The problem can be more complicated when there is more than one constraint. In such cases, an alternative procedure is a method called Lagrange multipliers.
To explain this method, let’s start with an example. Suppose we want to find the shortest and longest distance between the point \((1,-1)\) and the curve \(C\)
The distance between a point \((x,y)\) and \((1,-1)\) is given by \[f(x,y)=\sqrt{(x-1)^2+(y+1)^2}.\] So we want to maximize and minimize \(f(x,y)\) subject to \(g(x,y)=1\). First let’s sketch the curve \(C\) and some level curves of \(f\) (Fig. 1).
To extremize \(f(x,y)\) subject to \(g(x,y)=1\), we have to find the largest and smallest value of \(k\) such that the level curve \(f(x,y)=k\) intersects \(g(x,y)=1\). Among the level curves that intersect \(g(x,y)=1\), the minimum value of \(f(x,y)\) occurs at the points \(P\) and \(Q\) where \(f(x,y)\) has a value of 3 (see Fig. 1). At these points, the constraint curve \(g(x,y)=1\) and the level curve \(f(x,y)=3\) just touch each other; in other words, \(f=3\) and \(g=1\) have a common tangent line at \(P\) and a common tangent at \(Q\).
Note that \(g(x,y)=1\) is a level curve of \(z=g(x,y)\). Because at each point \(\overrightarrow{\nabla} f\) is perpendicular to the level curves of \(f\) and similarly \(\overrightarrow{\nabla} g\) is perpendicular to the level curves of \(g\), a common tangent at \(P\) means that \(\overrightarrow{\nabla} f(P)\) and \(\overrightarrow{\nabla} g(P)\) are parallel. That is, there is a number \(\lambda_1\) such that \[\overrightarrow{\nabla} f(P)=\lambda_1 \overrightarrow{\nabla} g(P).\]
Similarly, there is a number \(\lambda_2\) such that \[\overrightarrow{\nabla} f(Q)=\lambda_2\overrightarrow{\nabla} g(Q).\]
We also observe that the maximum value of \(f(x,y)\) subject to \(g(x,y)=1\) occurs at the points where the constraint curve and a level curve (here \(f=4\)) touch each other (in Fig. 1, these points are denoted by \(A\) and \(B\)). Thus \[\overrightarrow{\nabla} f(A)=\mu_1\overrightarrow{\nabla} g(A),\] and \[\overrightarrow{\nabla} f(B)=\mu_2 \overrightarrow{\nabla} g(B),\] for some \(\mu_1\) and \(\mu_2\). Therefore, to find the maximum or minimum of \(f(x,y)\) subject to the constraint \(g(x,y)=1\), we look for a point \(\textbf{x}_0\) such that \[\overrightarrow{\nabla} f(\textbf{x}_0)=\lambda \overrightarrow{\nabla} g(\textbf{x}_0)\]
for some \(\lambda\). This is the method of Lagrange multipliers. But why is this true?
Suppose the constraint curve \(C\) is parameterized by some functions:\(^1\) \[x=X(t),\quad y=Y(t)\] If, in the equation of \(f\), \(x\) and \(y\) are replaced by \(X(t)\) and \(Y(t)\), then the distance between \((1,-1)\) and the points on \(C\) becomes a function of \(t\): \[F(t)=f(X(t),Y(t)).\] Therefore, the extreme values of the distance occur where \(F'(t)=0\). From the chain rule we know \[F'(t)=\frac{dF}{dt}=\frac{\partial f}{\partial x}\frac{dX}{dt}+\frac{\partial f}{\partial y}\frac{dY}{dt}\]
We can write the equation \(F'(t)=0\) as \[ \left(\frac{\partial f}{\partial x},\frac{\partial f}{\partial y}\right)\boldsymbol{\cdot}\left(\frac{dX}{dt},\frac{dY}{dt}\right)=0\]
This means at the extreme points the gradient vector \((f_x,f_y)\) is perpendicular to \((X'(t),Y'(t))\). Recall that \((X'(t),Y'(t))\) is tangent to the curve \(C\).
On the other hand \(C\) is a level curve of \(g\). Therefore, at each point \(\overrightarrow{\nabla} g\) is perpendicular to \(C\). Therefore at the extreme points, \(\overrightarrow{\nabla} g\) and \(\overrightarrow{\nabla} f\) are parallel. The simplest version of the method of Lagrange multipliers is as follows.
Theorem 1. Suppose \(U\subseteq \mathbb{R}^n\) is an open set and \(f:U\to\mathbb{R}\) and \(g:U\to\mathbb{R}\) are two continuously differentiable functions.\(^2\) Let \(S=\left\{\mathbf{x}\in U|\ g(\mathbf{x})=c\right\}\) be the level set for \(g\) with value \(c\). If \(f|S\), denoting “\(f\) restricted to \(S\),” has a relative extremum on \(S\) at \(\mathbf{x}_0\in U\), and \(\overrightarrow{\nabla} g(\mathbf{x}_0)\neq\mathbf{0}\), then there exists a real number \(\lambda\) such that \[\overrightarrow{\nabla} f(\mathbf{x}_0)=\lambda \overrightarrow{\nabla} g(\mathbf{x}_0).\]
Proof
Let \(\mathbf{r}:I\subseteq\mathbb{R}\to\mathbb{R}^n\) be a differentiable curve on the level set \(S\) such that \(\mathbf{r}(t_0)=\mathbf{x}_0\), \(\mathbf{r}'(t_0)\neq \mathbf{0}\), and \(\mathbf{r}(t)\in S\) for every \(t\in I\). Then \(g(\mathbf{r}(t))=c\), so the chain rule gives \[\overrightarrow{\nabla} g(\underbrace{\mathbf{r}(t_0)}_{=\mathbf{x}_0})\boldsymbol{\cdot}\mathbf{r}'(t_0)=0.\]
Because \(f|S\) attains a relative maximum or minimum at \(\mathbf{x}_0\), the function \(h=f\circ\mathbf{r}:I\to\mathbb{R}\) attains a relative maximum or minimum at \(t_0\). Hence \(h'(t_0)=0\) and according to the chain rule we have: \[h'(t_0)=\overrightarrow{\nabla} f(\mathbf{x}_0)\boldsymbol{\cdot}\mathbf{r}'(t_0)=0.\]
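Thus the vectors \(\overrightarrow{\nabla} f(\mathbf{x}_0)\) and \(\overrightarrow{\nabla} g(\mathbf{x}_0)\neq \mathbf{0}\) are both normal to the nonzero vector \(\mathbf{r}'(t_0)\). Because \(\mathbf{r}\) was an arbitrary curve in \(S\) through \(\mathbf{x}_0\), both gradients are normal to every tangent vector of \(S\) at \(\mathbf{x}_0\). Since \(\overrightarrow{\nabla} g(\mathbf{x}_0)\neq \mathbf{0}\), these tangent vectors fill an \((n-1)\)-dimensional subspace (by the implicit function theorem), and the vectors normal to it form a line through the origin. The two gradients are therefore parallel; that is, \(\overrightarrow{\nabla} f(\mathbf{x}_0)=\lambda \overrightarrow{\nabla} g(\mathbf{x}_0)\) for some \(\lambda\). \(\blacksquare\)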
The number \(\lambda\) in the above theorem is called a Lagrange multiplier.
\(\lambda\) might be zero.
Note that to find the extremum of \(f|S\), we have \(n+1\) unknowns (\(n\) components of \(\mathbf{x}_0\) and \(\lambda\)) and \(n+1\) equations: \[\label{Eq:LagrangeEQ} \left.\begin{align} \frac{\partial f}{\partial x_1}(\mathbf{x}_0)&=\lambda \frac{\partial g}{\partial x_1}(\mathbf{x}_0)\\ &\vdots\\ \frac{\partial f}{\partial x_n}(\mathbf{x}_0)&=\lambda \frac{\partial g}{\partial x_n}(\mathbf{x}_0)\\ g(\mathbf{x}_0)&=c \end{align}\right\} \quad (n+1) \text{ equations} \tag{i} \]
Example 2
Find the extrema of the function \(f(x,y)=xy\) subject to the constraint \(\frac{1}{4}x^2+\frac{1}{9}y^2=1\).
Solution
Let \(g(x,y)=\frac{1}{4}x^2+\frac{1}{9}y^2\), so \(S\) consists of all points \((x,y)\) such that \(g(x,y)=1\). We have: \[\overrightarrow{\nabla} f(x,y)=(y,x),\quad \overrightarrow{\nabla} g(x,y)=\left(\frac{1}{2}x,\frac{2}{9}y\right)\]
Note that \(\overrightarrow{\nabla} g=\mathbf{0}\) if and only if \((x,y)=(0,0)\), which does not lie on the ellipse \(g(x,y)=1\). Thus if \(f\) subject to the constraint has an extremum at \((x_0,y_0)\), we must have \(\overrightarrow{\nabla} f(x_0,y_0)=\lambda \overrightarrow{\nabla} g(x_0,y_0)\). Equations (i) take on the form: \[y_0=\frac{1}{2}\lambda x_0, \quad x_0=\frac{2}{9}\lambda y_0, \quad \frac{1}{4}x_0^2+\frac{1}{9}y_0^2=1.\]
There are different ways to solve the above system of equations. One way is to replace \(y_0\) from the first equation in the second equation \[x_0=\frac{2}{9}\lambda y_0=\frac{2\lambda}{9}(\frac{\lambda}{2} x_0)=\frac{1}{9} \lambda^2 x_0\]
This leads to \[x_0-x_0\frac{\lambda^2}{9}=x_0\left(1-\frac{\lambda^2}{9}\right)=0\Rightarrow x_0=0 \ \text{or}\ \lambda=\pm 3.\]
If \(x_0=0\), then the first equation \(y_0=\frac{1}{2}\lambda x_0\) gives \(y_0=0\), but \((0,0)\) does not satisfy \(g(x_0,y_0)=1\), so this case yields no solution. If \(\lambda=\pm 3\), it follows from \(y_0=\frac{1}{2}\lambda x_0\) that \(y_0=\pm \frac{3}{2} x_0\). Substituting into \(g(x_0,y_0)=1\), we get: \[\frac{1}{4}x_0^2+\frac{1}{9}\cdot\frac{9}{4}x_0^2=1 \Rightarrow \frac{1}{2}x_0^2=1\Rightarrow x_0=\pm \sqrt{2}.\]
If \(x_0=\sqrt{2}\), then \(y_0=\pm \frac{3}{2}\sqrt{2}\), and if \(x_0=-\sqrt{2}\), then \(y_0=\pm\frac{3}{2}\sqrt{2}\); the sign of \(y_0\) agrees with that of \(x_0\) when \(\lambda=3\) and is opposite when \(\lambda=-3\).
Now we list the points we found and the values of \(f\) at these points:
\[\begin{array}{c|cccc} (x_0,y_0) & \left(\sqrt{2},\frac{3\sqrt{2}}{2}\right) & \left(\sqrt{2},-\frac{3\sqrt{2}}{2}\right) & \left(-\sqrt{2},\frac{3\sqrt{2}}{2}\right) & \left(-\sqrt{2},-\frac{3\sqrt{2}}{2}\right)\\ \hline f(x_0,y_0)=x_0y_0 & 3 & -3 & -3 & 3 \end{array}\]
The function \(f\) attains its maximum value \(3\) at \((\sqrt{2},3\sqrt{2}/2)\) and \((-\sqrt{2},-3\sqrt{2}/2)\), and its minimum value \(-3\) at \((\sqrt{2},-3\sqrt{2}/2)\) and \((-\sqrt{2},3\sqrt{2}/2)\). Note that at these points the level curve \(g=1\) is tangent to one of the level curves of \(f\) (see Fig. 2).
Figure 2. (a) The graph of \(f(x,y)\); the blue curve shows \(f\) restricted to the ellipse \(g(x,y)=1\). (b) The level curves of \(f\) and \(g\); the extrema of \(f\) subject to the constraint \(g(x,y)=1\) occur at points where the level curves of \(f\) are tangent to the level curve \(g(x,y)=1\).
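The result of Example 2 can be cross-checked without Lagrange multipliers. The sketch below verifies that the four points with \(\lambda=\pm 3\) satisfy Equations (i), and then samples the ellipse through the parameterization \(x=2\cos t\), \(y=3\sin t\), which is an assumption introduced here for the check and does not appear in the text.

```python
import math


def f(x, y):
    return x * y


def g(x, y):
    return x**2 / 4 + y**2 / 9


def lagrange_residual(x, y, lam):
    """Largest absolute residual of grad f = lam * grad g together with g = 1."""
    return max(abs(y - lam * x / 2), abs(x - lam * 2 * y / 9), abs(g(x, y) - 1))


r2 = math.sqrt(2)
# (x0, y0, lambda) for the four points obtained from lambda = 3 and lambda = -3
candidates = [
    (r2, 1.5 * r2, 3.0),
    (-r2, -1.5 * r2, 3.0),
    (r2, -1.5 * r2, -3.0),
    (-r2, 1.5 * r2, -3.0),
]
residuals = [lagrange_residual(x, y, lam) for x, y, lam in candidates]

# Independent check: sample f along the ellipse x = 2 cos t, y = 3 sin t.
n = 3600
values = [f(2 * math.cos(2 * math.pi * k / n), 3 * math.sin(2 * math.pi * k / n))
          for k in range(n)]

print("largest residual:", max(residuals))
print("sampled max and min of f:", max(values), min(values))
```

Along this parameterization \(f=6\cos t\sin t=3\sin 2t\), so the sampled extremes should agree with the values \(3\) and \(-3\) up to rounding.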
Let’s go back to the example of making an open box and try to solve it using the method of Lagrange multipliers.
Example 3
We want to make a rectangular box without a top, and of given volume \(V\). If the least amount of material is to be used, determine the design specifications using Lagrange multipliers.
Solution
Again, let \(x, y\) and \(z\) be the length, width, and height of the box, respectively. We want to minimize \[S=xy+2xz+2yz\] subject to \(xyz=V\), where \(V\) is the given volume.
If we let \(g(x,y,z)=xyz\), then \[\overrightarrow{\nabla} S=(y+2z, x+2z, 2x+2y),\quad \overrightarrow{\nabla} g=(yz, xz, xy)\]
Note that \(\overrightarrow{\nabla} g=\mathbf{0}\) if and only if at least two of \(x, y, z\) are zero, and no such point lies on the level surface \(g(x,y,z)=V\) because \(V>0\). So if \(S\) has an extremum at \((x_0,y_0,z_0)\), we have \[\overrightarrow{\nabla} S(x_0,y_0,z_0)=\lambda \overrightarrow{\nabla} g(x_0,y_0,z_0).\]
Hence our equations become \[\begin{align} \left\{\begin{array}{l c c} y+2z=\lambda yz & & \text{($i$)}\\ \\ x+2z=\lambda xz & & \text{($ii$)}\\ \\ 2x+2y=\lambda xy& & \text{($iii$)}\\ \\ xyz=V& & \text{($iv$)} \end{array}\right.\end{align}\]
Because \(x,y\) and \(z\) are nonzero, if we divide the first equation by \(yz\), the second equation by \(xz\) and the third one by \(xy\), we get: \[\begin{align} \left\{\begin{array}{l c c} \dfrac{1}{z}+\dfrac{2}{y}=\lambda & & \text{($i^\prime$)}\\ \\ \dfrac{1}{z}+\dfrac{2}{x}=\lambda & & \text{($ii^\prime$)}\\ \\ \dfrac{2}{y}+\dfrac{2}{x}=\lambda& & \text{($iii^\prime$)}\\ \\ xyz=V& & \text{($iv^\prime$)} \end{array}\right.\end{align}\]
Then \[\text{Eq. ($i^\prime$)},\ \ \text{Eq. ($ii^\prime$)}\quad \Rightarrow \quad x=y\] \[\text{Eq. ($ii^\prime$)},\ \ \text{Eq. ($iii^\prime$)}, \ \text{and}\ x=y\quad \Rightarrow \quad z=\frac{1}{2} x\] \[x=y=2z, \ \ \text{Eq. ($iv^\prime$)} \Rightarrow \frac{1}{2}x^3=V\quad \Rightarrow\quad x=y=2z=(2V)^{1/3}.\]
This is the result we previously obtained in Example 1.
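As a sanity check on Example 3, the sketch below plugs \(x=y=2z=(2V)^{1/3}\) back into the four equations and compares the resulting surface area with randomly chosen boxes of the same volume. The volume \(V=4\), the sampling range, and the random seed are assumptions made only for this illustration.

```python
import random


def area(x, y, z):
    """Surface area of the open box: bottom plus four sides."""
    return x * y + 2 * x * z + 2 * y * z


def lagrange_residuals(x, y, z, lam, V):
    """Residuals of grad S = lam * grad g (three equations) and xyz = V."""
    return (
        y + 2 * z - lam * y * z,
        x + 2 * z - lam * x * z,
        2 * x + 2 * y - lam * x * y,
        x * y * z - V,
    )


V = 4.0                      # hypothetical volume (assumption)
x = y = (2 * V) ** (1 / 3)
z = x / 2
lam = 2 / y + 2 / x          # multiplier read off from the third divided equation

residuals = lagrange_residuals(x, y, z, lam, V)
best = area(x, y, z)

# Compare with random boxes of the same volume (z is fixed by the constraint).
random.seed(0)
trial_areas = []
for _ in range(1000):
    a = random.uniform(0.2, 10.0)
    b = random.uniform(0.2, 10.0)
    trial_areas.append(area(a, b, V / (a * b)))

print("residuals:", residuals)
print("area at the Lagrange point:", best, " smallest random trial:", min(trial_areas))
```

Random sampling cannot prove minimality, but none of the trial boxes should use less material than the Lagrange point.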
The following example shows an application of Lagrange multipliers in economics, in a setting with many variables.
Example 4
Assume there are \(n\) commodities with prices per unit \(p_1,\cdots,p_n\). Assume we have \(L\) dollars to spend on these commodities. This means we have the constraint \(p_1 x_1+\cdots+p_n x_n=L\) where \(x_1,\cdots,x_n\) are the amounts of the commodities. Assume the utility is given by the Cobb-Douglas function \(U=f(x_1,\cdots,x_n)=K x_1^{\alpha_1}\cdots x_n^{\alpha_n}\) where \(K, \alpha_1,\cdots,\alpha_n\) are positive constants. Find the maximum utility we can achieve.
Solution
Let \(g(x_1,\cdots,x_n)=p_1 x_1+\cdots+p_n x_n\). So we seek the maximum value of \(f(x_1,\cdots,x_n)\) subject to \(g(x_1,\cdots,x_n)=L\). Because \(x_i\geq 0\) (for \(i=1,\cdots,n\)), the constraint set is a part of the level set \(g(x_1,\cdots,x_n)=L\). We note that when the amount of a commodity \(x_i\) approaches zero, the utility approaches zero. Hence, the utility takes on its maximum value when all the \(x_i\)’s are positive. Because \(\overrightarrow{\nabla} g(x_1,\cdots,x_n)=(p_1,\cdots,p_n)\neq \mathbf{0}\), if \(f\) attains its maximum at a point, we have \(\overrightarrow{\nabla} f(x_1,\cdots,x_n)=\lambda \overrightarrow{\nabla} g(x_1,\cdots,x_n)\); that is, \[\left\{\begin{array}{l} \alpha_1 K x_1^{\alpha_1-1}x_2^{\alpha_2}\cdots x_n^{\alpha_n}=\lambda p_1\\ \hspace{1cm}\vdots\\ \alpha_n K x_1^{\alpha_1}x_2^{\alpha_2}\cdots x_n^{\alpha_n-1}=\lambda p_n \end{array}\right.\]
If we multiply both sides of the first equation by \(x_1\), the second equation by \(x_2\), \(\cdots\), and the \(n\)-th equation by \(x_n\), we get: \[\left\{\begin{array}{l} \alpha_1 f(x_1,\cdots,x_n)=\lambda p_1 x_1\\ \hspace{1cm}\vdots\\ \alpha_n f(x_1,\cdots,x_n)=\lambda p_n x_n \end{array}\right.\]
Therefore: \[\frac{p_1}{\alpha_1}x_1=\cdots=\frac{p_n}{\alpha_n} x_n\] This result makes sense because it says the amount of commodity \(x_i\) is proportional to its exponent \(\alpha_i\) and inversely proportional to its price \(p_i\).
Let \(\frac{p_i}{\alpha_i} x_i=C\) for every \(i\), so that \(x_i=\frac{\alpha_i}{p_i}C\). Then \[p_1 x_1+\cdots+p_n x_n=p_1\frac{\alpha_1}{p_1} C+\cdots+p_n\frac{\alpha_n}{p_n} C=L\]
or \[C(\alpha_1+\cdots+\alpha_n)=C\sum_{i=1}^n \alpha_i=L\Rightarrow C=\frac{L}{\sum_{i=1}^n \alpha_i}.\]
Consequently, \[x_i=\frac{\alpha_i}{p_i}\frac{L}{\sum_{j=1}^n \alpha_j}.\] Finally we conclude the maximum value of \(f\) is: \[K\left(\frac{\alpha_1 L}{p_1 \sum_{j=1}^n \alpha_j}\right)^{\alpha_1} \cdots \left(\frac{\alpha_n L}{p_n \sum_{j=1}^n \alpha_j}\right)^{\alpha_n}=K\left(\frac{L}{\sum_{j=1}^n \alpha_j}\right)^{\sum_{j=1}^n\alpha_j} \left(\frac{\alpha_1}{p_1}\right)^{\alpha_1} \cdots \left(\frac{\alpha_n}{p_n}\right)^{\alpha_n}\]
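The closed-form solution of Example 4 is easy to evaluate for concrete numbers. The following sketch uses hypothetical prices, exponents, and budget (none of them come from the text) and checks that the bundle spends exactly \(L\), that the quantities \(p_i x_i/\alpha_i\) coincide, and that shifting a few dollars between commodities lowers the utility.

```python
import math


def utility(x, alphas, K=1.0):
    """Cobb-Douglas utility K * x_1^a_1 * ... * x_n^a_n."""
    return K * math.prod(xi**a for xi, a in zip(x, alphas))


def optimal_bundle(prices, alphas, budget):
    """Amounts x_i = (alpha_i / p_i) * L / sum(alpha), as derived in Example 4."""
    total = sum(alphas)
    return [a / p * budget / total for p, a in zip(prices, alphas)]


# Hypothetical data (assumptions for illustration only).
prices = [2.0, 5.0, 1.0]
alphas = [0.5, 0.3, 0.2]
budget = 100.0

x_star = optimal_bundle(prices, alphas, budget)
spent = sum(p * xi for p, xi in zip(prices, x_star))
ratios = [p * xi / a for p, xi, a in zip(prices, x_star, alphas)]


def shift_spending(x, i, j, dollars):
    """Move `dollars` of spending from commodity i to commodity j (budget unchanged)."""
    y = list(x)
    y[i] -= dollars / prices[i]
    y[j] += dollars / prices[j]
    return y


u_star = utility(x_star, alphas)
u_shifted = [utility(shift_spending(x_star, i, j, 5.0), alphas)
             for i in range(3) for j in range(3) if i != j]

print("bundle:", x_star, " spent:", spent)
print("p_i x_i / alpha_i:", ratios)
print("utility at bundle:", u_star, " best shifted utility:", max(u_shifted))
```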
The following example shows that when maximizing or minimizing a function subject to a constraint, we should also investigate the points where the gradient of the constraint is zero, \(\overrightarrow{\nabla} g(\mathbf{x})=\mathbf{0}\).
Example 5
Find the nearest point on the curve \(y^3-x^2=0\) to the point \((0,-1)\).
Solution
If we plot the level curve \(g(x,y)=y^3-x^2=0\) (Fig. 3), it is clear that \((0,0)\) is the nearest point on this curve to the point \((0,-1)\), and the distance is 1. If we want to use the method of Lagrange multipliers, we need to minimize the square of distance \(f(x,y)=(x-0)^2+(y-(-1))^2\) subject to the constraint \(g(x,y)=y^3-x^2=0\).
Figure 3.
Here \[\overrightarrow{\nabla} f(x,y)=(2x,2y+2),\quad \overrightarrow{\nabla} g(x,y)=(-2x,3y^2).\]
Equations (i) take on the form \[\left\{\begin{array}{l} 2x_0=-2\lambda x_0\\ 2y_0+2=3\lambda y_0^2\\ y_0^3-x_0^2=0 \end{array}\right.\]
However, the actual nearest point, \((x_0,y_0)=(0,0)\), does not satisfy the second equation for any \(\lambda\), since it would require \(2=0\). The reason is that at this point the gradient of \(g\) is \(\mathbf{0}\), \(\overrightarrow{\nabla} g(0,0)=(0,0)\), so we cannot use Theorem 1. In addition, the level curve is not smooth at \((0,0)\).
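The failure in Example 5 can be seen numerically. The sketch below assumes the parameterization \(x=t^3\), \(y=t^2\) of the curve \(y^3-x^2=0\) (introduced here only for the check), finds the nearest sampled point to \((0,-1)\), and evaluates the second Lagrange equation at that point for a few arbitrary values of \(\lambda\).

```python
def squared_distance(x, y):
    """Squared distance from (x, y) to the point (0, -1)."""
    return x**2 + (y + 1) ** 2


def grad_g(x, y):
    """Gradient of g(x, y) = y^3 - x^2."""
    return (-2 * x, 3 * y**2)


def second_equation_residual(x0, y0, lam):
    """Left side minus right side of 2*y0 + 2 = 3*lam*y0^2."""
    return 2 * y0 + 2 - 3 * lam * y0**2


# Sample the curve through x = t^3, y = t^2, which satisfies y^3 = x^2.
ts = [k / 1000 for k in range(-2000, 2001)]
best_t = min(ts, key=lambda t: squared_distance(t**3, t**2))
nearest = (best_t**3, best_t**2)

# Arbitrary multipliers: the residual at the nearest point never vanishes.
residuals = [second_equation_residual(nearest[0], nearest[1], lam)
             for lam in (-10.0, -1.0, 0.0, 1.0, 10.0)]

print("nearest sampled point:", nearest)
print("squared distance:", squared_distance(*nearest))
print("grad g at that point:", grad_g(*nearest))
print("second-equation residuals:", residuals)
```

The multiplier drops out of the second equation at the origin because it is multiplied by \(y_0^2=0\), which is why no choice of \(\lambda\) can repair it.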
If a function \(f\) subject to a constraint \(g(\mathbf{x})=c\) attains its extreme value at a point \(\mathbf{x}_0\), then \(\mathbf{x}_0\) is one of the following four types of points:
\(\mathbf{x}_0\) is a point where \(\overrightarrow{\nabla} f(\mathbf{x}_0)=\lambda \overrightarrow{\nabla} g(\mathbf{x}_0)\),
\(\mathbf{x}_0\) is a point where \(\overrightarrow{\nabla} g(\mathbf{x}_0)=\mathbf{0}\),
\(\mathbf{x}_0\) is a rough point of \(f\) or \(g\) where \(\overrightarrow{\nabla} f(\mathbf{x}_0)\) or \(\overrightarrow{\nabla} g(\mathbf{x}_0)\) does not exist, or
\(\mathbf{x}_0\) is on the boundary of the level set \(g(\mathbf{x})=c\).
Multiple Constraints
The method of Lagrange multipliers extends to the case of multiple constraints: say to \(f(x,y,z)\) subject to two constraints \[\label{Eq:Lagrange_2Constraint} g_1(x,y,z)=c_1,\quad \text{and} \quad g_2(x,y,z)=c_2.\tag{ii}\]
Geometrically this means we seek the extreme values of \(f(x,y,z)\) on a curve \(C\) formed by the intersection of two level surfaces \(g_1(x,y,z)=c_1\) and \(g_2(x,y,z)=c_2\) (Fig. 4).
Figure 4. Because \(\vec{\nabla}g_1\) is perpendicular to the surface \(g_1=c_1\) and \(\vec{\nabla}g_2\) is perpendicular to the surface \(g_2=c_2\), both \(\vec{\nabla}g_1\) and \(\vec{\nabla}g_2\) are perpendicular to \(C\). At a point where \(f\) has a maximum or minimum, \(\vec{\nabla}f\) is also perpendicular to \(C\).
Suppose \(f\) subject to these constraints, \(f|C\), has an extremum at \(P=(x_0,y_0,z_0)\), and the functions \(f, g_1\), and \(g_2\) have continuous first partial derivatives near \(P\). Analogous to the case of one constraint, we can argue that \(\overrightarrow{\nabla} f\) is perpendicular to \(C\) at \(P\). Additionally, because the gradient is perpendicular to the level surface, both \(\overrightarrow{\nabla} g_1\) and \(\overrightarrow{\nabla} g_2\) are perpendicular to \(C\). If \(\overrightarrow{\nabla} g_1(x_0,y_0,z_0)\) and \(\overrightarrow{\nabla} g_2(x_0,y_0,z_0)\) are neither zero vectors nor collinear vectors, then \(\overrightarrow{\nabla} f(x_0,y_0,z_0)\) lies in the plane spanned by the two vectors \(\overrightarrow{\nabla} g_1(x_0,y_0,z_0)\) and \(\overrightarrow{\nabla} g_2(x_0,y_0,z_0)\) and hence can be expressed as a linear combination of these two vectors, say: \[\begin{align} \label{Eq:Lagrange_2Multipliers} \bbox[#F2F2F2,5px,border:2px solid black]{\overrightarrow{\nabla} f(x_0,y_0,z_0)=\lambda_1 \overrightarrow{\nabla} g_1(x_0,y_0,z_0)+\lambda_2 \overrightarrow{\nabla} g_2(x_0,y_0,z_0)}\tag{iii}\end{align}\]
In this method, Equations (ii) and (iii) must be solved simultaneously for five unknowns \(x_0,y_0,z_0,\lambda_1\) and \(\lambda_2\): \[\left\{\begin{array}{l} \dfrac{\partial f}{\partial x}(x_0,y_0,z_0)=\lambda_1 \dfrac{\partial g_1}{\partial x}(x_0,y_0,z_0)+\lambda_2 \dfrac{\partial g_2}{\partial x}(x_0,y_0,z_0)\\ \\ \dfrac{\partial f}{\partial y}(x_0,y_0,z_0)=\lambda_1 \dfrac{\partial g_1}{\partial y}(x_0,y_0,z_0)+\lambda_2 \dfrac{\partial g_2}{\partial y}(x_0,y_0,z_0)\\ \\ \dfrac{\partial f}{\partial z}(x_0,y_0,z_0)=\lambda_1 \dfrac{\partial g_1}{\partial z}(x_0,y_0,z_0)+\lambda_2 \dfrac{\partial g_2}{\partial z}(x_0,y_0,z_0)\\ \\ g_1(x_0,y_0,z_0)=c_1\\ \\ g_2(x_0,y_0,z_0)=c_2 \end{array}\right.\]
Example 6
Find the extreme points of the function \(f(x,y,z)=x+4y-2z\) on the curve of the intersection of the cylinder \(x^2+y^2=4\) and the plane \(2y-z=3\) (Fig. 5).
Figure 5.
Solution
Here we are given two constraints: \[g_1(x,y,z)=x^2+y^2=4,\quad g_2(x,y,z)=2y-z=3.\] Note that \[\overrightarrow{\nabla} f=(1,4,-2),\quad \overrightarrow{\nabla} g_1=(2x,2y,0),\quad\text{and}\quad \overrightarrow{\nabla} g_2=(0,2,-1).\]
The vector \(\overrightarrow{\nabla} g_1=(2x,2y,0)\) is \(\mathbf{0}\) only when \(x=0\) and \(y=0\), which does not satisfy the constraint \(g_1=4\), and \(\overrightarrow{\nabla} g_2=(0,2,-1)\) is never zero. Moreover, the third component of \(\overrightarrow{\nabla} g_1\) is zero while that of \(\overrightarrow{\nabla} g_2\) is \(-1\), so on the constraint curve the two vectors are not parallel. Therefore, any constrained critical point \((x_0,y_0,z_0)\) must satisfy \[\overrightarrow{\nabla} f(x_0,y_0,z_0)=\lambda_1 \overrightarrow{\nabla} g_1(x_0,y_0,z_0)+\lambda_2 \overrightarrow{\nabla} g_2(x_0,y_0,z_0).\]
It follows that we must solve the following system of equations for five unknowns \(x,y,z,\lambda_1\) and \(\lambda_2\): \[\left\{\begin{array}{l} 1=2x\lambda_1 +0\\ 4=2y\lambda_1+2\lambda_2\\ -2=0-\lambda_2\\ x^2+y^2=4\\ 2y-z=3 \end{array}\right.\]
From the third equation, we know \(\lambda_2=2\). So the second equation becomes \(2y\lambda_1=0\). Because the first equation implies \(\lambda_1\neq 0\), it follows from \(2y\lambda_1=0\) that \(y=0\). If we substitute \(y=0\) in the fourth and fifth equations, we get \(x=\pm 2\) and \(z=-3\). Therefore, \(f\) may have extrema at \((\pm 2,0,-3)\).
The condition \(x^2+y^2=4\) implies \(-2\leq x,y\leq 2\). Then it follows from \(2y-z=3\) that \(-7\leq z\leq 1\). This means the constraint set, \(S\), is bounded. Because the constraint set \(S\) is closed and bounded, and \(f\) is a continuous function, \(f|S\) assumes its maximum and minimum (Theorem 1 in the previous section). Here we have only two candidates, with \(f(2,0,-3)=8\) and \(f(-2,0,-3)=4\); therefore \((2,0,-3)\) is the maximum point and \((-2,0,-3)\) is the minimum point.
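The two-constraint problem of Example 6 can be checked by walking along the intersection curve directly. The sketch below assumes the parameterization \(x=2\cos t\), \(y=2\sin t\), \(z=2y-3\) (the last relation is the plane equation \(2y-z=3\) solved for \(z\)); the parameterization and the sample count are choices made here, not part of the text.

```python
import math


def f(x, y, z):
    return x + 4 * y - 2 * z


def point_on_curve(t):
    """Point on the intersection of x^2 + y^2 = 4 and 2y - z = 3."""
    x, y = 2 * math.cos(t), 2 * math.sin(t)
    return (x, y, 2 * y - 3)


def lagrange_residuals(x, y, z, lam1, lam2):
    """Residuals of grad f = lam1 * grad g1 + lam2 * grad g2 and both constraints."""
    return (
        1 - 2 * x * lam1,
        4 - 2 * y * lam1 - 2 * lam2,
        -2 + lam2,
        x**2 + y**2 - 4,
        2 * y - z - 3,
    )


n = 3600
points = [point_on_curve(2 * math.pi * k / n) for k in range(n)]
p_max = max(points, key=lambda p: f(*p))
p_min = min(points, key=lambda p: f(*p))

# Multipliers from the first and third equations: lam1 = 1 / (2x), lam2 = 2.
res_max = lagrange_residuals(*p_max, 1 / (2 * p_max[0]), 2.0)
res_min = lagrange_residuals(*p_min, 1 / (2 * p_min[0]), 2.0)

print("sampled maximum:", p_max, f(*p_max))
print("sampled minimum:", p_min, f(*p_min))
print("largest residuals:", max(map(abs, res_max)), max(map(abs, res_min)))
```

On this curve \(f=2\cos t+8\sin t-2(4\sin t-3)=2\cos t+6\), so the extremes occur at \(t=0\) and \(t=\pi\).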
By similar reasoning, we obtain equations for minimizing or maximizing \(f(x_1,\cdots,x_n)\) subject to several constraints \[\begin{align} \label{Eq:m-constraints} \left\{\begin{array}{l} g_1(x_1,\cdots,x_n)=c_1\\ g_2(x_1,\cdots,x_n)=c_2\\ \vdots\\ g_m(x_1,\cdots,x_n)=c_m \end{array}\right.\tag{iv}\end{align}\]
where \(m<n\). Assume \(f(x_1,\cdots,x_n)\) has a relative extremum at \(\mathbf{x}_0\) when the variables are restricted by the constraint equations (iv). If \(f, g_1,\cdots,\) and \(g_m\) have continuous first partial derivatives at all points near \(\mathbf{x}_0\) and if each \(\overrightarrow{\nabla} g_i(\mathbf{x}_0)\) is not a linear combination of the other \(\overrightarrow{\nabla} g_j(\mathbf{x}_0)\) (\(j\neq i\)), then there exist \(m\) real numbers \(\lambda_1,\cdots, \lambda_m\) such that \[\bbox[#F2F2F2,5px,border:2px solid black]{\overrightarrow{\nabla} f(\mathbf{x}_0)=\lambda_1\overrightarrow{\nabla} g_1(\mathbf{x}_0)+\cdots+\lambda_m \overrightarrow{\nabla} g_m(\mathbf{x}_0).}\]
1 For instance, in this specific example, \[\begin{cases} x=X(t)=\dfrac{3}{\sqrt{2}}\cos t+2\sqrt{2} \sin t\\ \\ y=Y(t)=-\dfrac{3}{\sqrt{2}}\cos t+2\sqrt{2}\sin t \end{cases}\qquad (0\leq t\leq2\pi)\]
2 In other words, \(f\) and \(g\) have continuous partial derivatives.