In single variable calculus, we learned how to use the chain rule. This rule tells us that if $y=f(u)$ and $u=g(x)$ are two differentiable functions, then $y=f(g(x))$ is also a differentiable function and
$$\frac{dy}{dx}=\frac{dy}{du}\,\frac{du}{dx},\quad\text{or}\quad (f\circ g)'(x)=f'(u)\,g'(x).$$
There is an analogous theorem for functions of several variables.

The Chain Rule for Multivariable Functions 

 We start with the simplest case for functions of two variables.

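Theorem 1. If $x(t)$ and $y(t)$ are differentiable functions at $t_0$ and if $z=f(x,y)$ is a differentiable function at $(x_0,y_0)=(x(t_0),y(t_0))$, then $z=f(x(t),y(t))$ is differentiable at $t_0$, and
$$\left.\frac{dz}{dt}\right|_{t_0}=\left.\frac{\partial z}{\partial x}\right|_{(x_0,y_0)}\left.\frac{dx}{dt}\right|_{t_0}+\left.\frac{\partial z}{\partial y}\right|_{(x_0,y_0)}\left.\frac{dy}{dt}\right|_{t_0}.$$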

 


 

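Now consider a case where there are other independent variables besides $t$. In this case the other variables are held fixed while $t$ changes, so the limits of the difference quotients are partial derivatives:
$$\lim_{\Delta t\to 0}\frac{\Delta z}{\Delta t}=\frac{\partial z}{\partial t},\quad \lim_{\Delta t\to 0}\frac{\Delta x}{\Delta t}=\frac{\partial x}{\partial t},\quad \lim_{\Delta t\to 0}\frac{\Delta y}{\Delta t}=\frac{\partial y}{\partial t},$$
and we have:
$$\frac{\partial z}{\partial t}=\frac{\partial z}{\partial x}\frac{\partial x}{\partial t}+\frac{\partial z}{\partial y}\frac{\partial y}{\partial t}.$$
Therefore, we have proved the following theorem.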

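Theorem 2. If $x=x(s,t)$ and $y=y(s,t)$ are differentiable functions at $(s_0,t_0)$ and if $z=f(x,y)$ is a differentiable function at $(x_0,y_0)=(x(s_0,t_0),y(s_0,t_0))$, then $z=f(x(s,t),y(s,t))$ is differentiable at $(s_0,t_0)$, and
$$\frac{\partial z}{\partial s}=\frac{\partial z}{\partial x}\frac{\partial x}{\partial s}+\frac{\partial z}{\partial y}\frac{\partial y}{\partial s},$$
$$\frac{\partial z}{\partial t}=\frac{\partial z}{\partial x}\frac{\partial x}{\partial t}+\frac{\partial z}{\partial y}\frac{\partial y}{\partial t}.$$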

Matrix Form of the Chain Rule

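We can combine the two equations in the above theorem into a single matrix equation:
$$\begin{bmatrix}\dfrac{\partial z}{\partial s} & \dfrac{\partial z}{\partial t}\end{bmatrix}=\begin{bmatrix}\dfrac{\partial z}{\partial x} & \dfrac{\partial z}{\partial y}\end{bmatrix}\begin{bmatrix}\dfrac{\partial x}{\partial s} & \dfrac{\partial x}{\partial t}\\[2mm] \dfrac{\partial y}{\partial s} & \dfrac{\partial y}{\partial t}\end{bmatrix}$$
This is called the matrix form of the chain rule. Note that the partial derivatives in the first and the last matrices are evaluated at $(s_0,t_0)$, while the partial derivatives in the second matrix are evaluated at $(x_0,y_0)$.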

  • When $z=f(x,y)$, $x=x(s,t)$, and $y=y(s,t)$, $z$ is the dependent variable, and $s$ and $t$ are the independent variables. $x$ and $y$ are called intermediate variables.

Tree Diagrams

As you will see in the following examples, the chain rule can take different forms. You can draw a tree diagram to help you determine the correct form of the chain rule for a given problem. Start from the dependent variable, say $z$, and draw branches to the intermediate variables, say $x$ and $y$. Then connect the intermediate variables to the independent variables, say $s$ and $t$. On each branch write the corresponding partial derivative, for example, $\partial z/\partial x$. This process is shown in Figure 1. To find $\partial z/\partial t$, read down each route from $z$ to $t$, multiply the derivatives along the way, and then add the products.

 

Figure 1. tree diagram

 

 

Example 1

If $z=x^2y-y^2$, $x=\cos t$, and $y=e^t$, find $dz/dt$ at $t=0$.

Solution

Method (a): We can plug $x(t)$ and $y(t)$ into the expression for $z$ and then differentiate with respect to $t$:
$$z=x^2y-y^2=\cos^2 t\,e^t-(e^t)^2=\cos^2 t\,e^t-e^{2t}\qquad[\text{Remember: }(a^b)^c=a^{bc}]$$
Now we can easily differentiate with respect to $t$:
$$\frac{dz}{dt}=-2\sin t\cos t\,e^t+\cos^2 t\,e^t-2e^{2t}$$

For the last step, we can just plug $t=0$ into the above expression:
$$\left.\frac{dz}{dt}\right|_{t=0}=-2\sin 0\cos 0\,e^0+\cos^2 0\,e^0-2e^0=0+1\times 1-2\times 1=-1$$

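Method (b): We can use Theorem 1.

$$\frac{\partial z}{\partial x}=2xy,\quad \frac{\partial z}{\partial y}=x^2-2y$$
$$\frac{dx}{dt}=-\sin t,\quad \frac{dy}{dt}=e^t$$
$$\left.\frac{dz}{dt}\right|_{t_0}=\left.\frac{\partial z}{\partial x}\right|_{(x_0,y_0)}\left.\frac{dx}{dt}\right|_{t_0}+\left.\frac{\partial z}{\partial y}\right|_{(x_0,y_0)}\left.\frac{dy}{dt}\right|_{t_0}$$

$$\frac{dz}{dt}=2xy\times(-\sin t)+(x^2-2y)\times e^t.$$

When $t=t_0=0$:
$$x_0=x(t_0)=\cos 0=1,\quad y_0=y(t_0)=e^0=1$$

Therefore:
$$\left.\frac{dz}{dt}\right|_{t=0}=\Big[2xy\Big]_{x=1,\,y=1}\times\Big(-\sin t\Big)\Big|_{t=0}+\Big[x^2-2y\Big]_{x=1,\,y=1}\times e^t\Big|_{t=0}=2\times 1\times 1\times 0+(1^2-2)\times 1=-1$$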
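
The two methods of Example 1 can also be compared numerically. The following is a minimal sketch, assuming $z=x^2y-y^2$, $x=\cos t$, $y=e^t$ as in the example; the function names and the step size `h` are our own illustrative choices, not part of the original text.

```python
import math

def z_of_t(t):
    """Method (a): substitute x = cos t and y = e^t into z = x^2 y - y^2."""
    x, y = math.cos(t), math.exp(t)
    return x**2 * y - y**2

def dz_dt_chain(t):
    """Method (b): dz/dt = z_x * dx/dt + z_y * dy/dt (Theorem 1)."""
    x, y = math.cos(t), math.exp(t)
    z_x, z_y = 2 * x * y, x**2 - 2 * y
    dx_dt, dy_dt = -math.sin(t), math.exp(t)
    return z_x * dx_dt + z_y * dy_dt

def central_difference(f, t, h=1e-6):
    """Numerical derivative of f at t (h is an assumed step size)."""
    return (f(t + h) - f(t - h)) / (2 * h)

chain_value = dz_dt_chain(0.0)
numeric_value = central_difference(z_of_t, 0.0)
print(chain_value, numeric_value)  # both should be close to -1
```

Both numbers should agree with the value $-1$ found above, up to the small error of the finite difference.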

Sometimes, we cannot use the substitution technique. The following example is one of those situations.

Example 2

Calculate how fast the volume of a right circular cone is changing if the radius of the base is 5 in and increasing at a rate of 0.1 in/sec, and the altitude of the cone is 12 in and decreasing at a rate of 0.5 in/sec.

Solution

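Let $r$ be the radius of the base, $h$ the altitude, and $V$ the volume of the cone. From geometry we know: $V=\frac{1}{3}\pi r^2h$.

Now $r$ and $h$ are functions of time, $t$, and we want to find $dV/dt$.

We know:
$$r_0=5\text{ in},\quad \frac{dr}{dt}=0.1\text{ in/sec},\quad h_0=12\text{ in},\quad \frac{dh}{dt}=-0.5\text{ in/sec}$$

$$\begin{aligned}
\left.\frac{dV}{dt}\right|_{t=t_0}&=\left.\frac{\partial V}{\partial r}\right|_{r=r_0,\,h=h_0}\left.\frac{dr}{dt}\right|_{t=t_0}+\left.\frac{\partial V}{\partial h}\right|_{r=r_0,\,h=h_0}\left.\frac{dh}{dt}\right|_{t=t_0}\\
&=\left(\tfrac{2}{3}\pi rh\right)_{r=r_0,\,h=h_0}(0.1)+\left(\tfrac{1}{3}\pi r^2\right)_{r=r_0,\,h=h_0}(-0.5)\\
&=\left(\tfrac{2}{3}\pi\times 5\times 12\right)(0.1)+\left(\tfrac{1}{3}\pi\times 5^2\right)(-0.5)=-\frac{\pi}{6}\ \text{in}^3/\text{sec}
\end{aligned}$$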
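
The arithmetic of Example 2 can be reproduced in a few lines. This is only a sketch: the function name `cone_volume_rate` and its parameter names are hypothetical, chosen here for illustration.

```python
import math

def cone_volume_rate(r, h, dr_dt, dh_dt):
    """dV/dt for V = (1/3) pi r^2 h, using dV/dt = V_r dr/dt + V_h dh/dt."""
    dV_dr = (2.0 / 3.0) * math.pi * r * h   # partial derivative with respect to r
    dV_dh = (1.0 / 3.0) * math.pi * r**2    # partial derivative with respect to h
    return dV_dr * dr_dt + dV_dh * dh_dt

# Radius 5 in growing at 0.1 in/sec, altitude 12 in shrinking at 0.5 in/sec.
rate = cone_volume_rate(r=5.0, h=12.0, dr_dt=0.1, dh_dt=-0.5)
print(rate, -math.pi / 6)  # the two values should match
```

The negative sign shows that, at this instant, the loss of height outweighs the gain in radius, so the volume is decreasing.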

Extension of Theorem 1

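Of course, we can extend Theorem 1 to the case where $z=f(x_1,\dots,x_n)$ is a differentiable function of $x_1,\dots,x_n$ and $x_1(t),\dots,x_n(t)$ are differentiable functions of $t$. Then:
$$\frac{dz}{dt}=\frac{\partial z}{\partial x_1}\frac{dx_1}{dt}+\cdots+\frac{\partial z}{\partial x_n}\frac{dx_n}{dt}.$$
We can write the above equation in matrix form:
$$\left[\frac{dz}{dt}\right]=\begin{bmatrix}\dfrac{\partial z}{\partial x_1}&\cdots&\dfrac{\partial z}{\partial x_n}\end{bmatrix}\begin{bmatrix}\dfrac{dx_1}{dt}\\ \vdots\\ \dfrac{dx_n}{dt}\end{bmatrix}$$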

Example 3

Consider two objects moving with time $t\ge 0$ on two paths given by the following equations:
$$\text{first object: } x_1=\sin t-1,\ y_1=\sin 2t+1,\qquad \text{second object: } x_2=2\sin t+2,\ y_2=\cos 2t.$$
At what rate is the distance between the two objects changing when $t=\pi/2$?

Solution

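The distance between two objects $A=(x_1,y_1)$ and $B=(x_2,y_2)$ is:
$$s=\sqrt{(x_2-x_1)^2+(y_2-y_1)^2}$$
Here $x_1,y_1,x_2$ and $y_2$ vary with $t$, and we need to find $ds/dt$. One method is to plug the formulas for $x_1(t),y_1(t),x_2(t)$ and $y_2(t)$ into the formula for $s$, making $s$ a function of $t$ alone, and then differentiate with respect to $t$. We leave this method to you. The second method is to use the chain rule we learned in this section:

$$\frac{ds}{dt}=\frac{\partial s}{\partial x_1}\frac{dx_1}{dt}+\frac{\partial s}{\partial y_1}\frac{dy_1}{dt}+\frac{\partial s}{\partial x_2}\frac{dx_2}{dt}+\frac{\partial s}{\partial y_2}\frac{dy_2}{dt}$$
At $t=\pi/2$:
$$\left.\frac{dx_1}{dt}\right|_{t=\pi/2}=\cos t\,\Big|_{t=\pi/2}=0,\qquad \left.\frac{dy_1}{dt}\right|_{t=\pi/2}=2\cos 2t\,\Big|_{t=\pi/2}=-2,$$
$$\left.\frac{dx_2}{dt}\right|_{t=\pi/2}=2\cos t\,\Big|_{t=\pi/2}=0,\qquad \left.\frac{dy_2}{dt}\right|_{t=\pi/2}=-2\sin 2t\,\Big|_{t=\pi/2}=0,$$
so only the second term survives:
$$\left.\frac{ds}{dt}\right|_{t=\pi/2}=\left.\frac{\partial s}{\partial y_1}\right|_{t=\pi/2}\left.\frac{dy_1}{dt}\right|_{t=\pi/2}=\left.\frac{y_1-y_2}{s}\right|_{t=\pi/2}(-2)$$

Because
$$x_1(\pi/2)=0,\quad y_1(\pi/2)=1,\quad x_2(\pi/2)=4,\quad y_2(\pi/2)=-1,$$
we get
$$\left.\frac{\partial s}{\partial y_1}\right|_{t=\pi/2}=\frac{2}{\sqrt{4^2+2^2}}=\frac{1}{\sqrt5},$$
and finally
$$\left.\frac{ds}{dt}\right|_{t=\pi/2}=\frac{1}{\sqrt5}\times(-2)=-\frac{2}{\sqrt5}.$$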
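
Both methods mentioned in the solution of Example 3 can be sketched in code: the chain rule with four intermediate variables, and direct differentiation of $s(t)$ (done here numerically). The helper names and the step size `h` are assumptions made for this illustration.

```python
import math

def positions(t):
    """(x1, y1, x2, y2) for the two moving objects."""
    return (math.sin(t) - 1, math.sin(2 * t) + 1,
            2 * math.sin(t) + 2, math.cos(2 * t))

def velocities(t):
    """(dx1/dt, dy1/dt, dx2/dt, dy2/dt)."""
    return (math.cos(t), 2 * math.cos(2 * t),
            2 * math.cos(t), -2 * math.sin(2 * t))

def distance(t):
    x1, y1, x2, y2 = positions(t)
    return math.hypot(x2 - x1, y2 - y1)

def ds_dt_chain(t):
    """Sum of (partial of s) * (time derivative) over the four variables."""
    x1, y1, x2, y2 = positions(t)
    s = distance(t)
    partials = (-(x2 - x1) / s, -(y2 - y1) / s, (x2 - x1) / s, (y2 - y1) / s)
    return sum(p * v for p, v in zip(partials, velocities(t)))

def ds_dt_numeric(t, h=1e-6):
    """Central difference of s(t); h is an assumed step size."""
    return (distance(t + h) - distance(t - h)) / (2 * h)

t0 = math.pi / 2
print(ds_dt_chain(t0), ds_dt_numeric(t0), -2 / math.sqrt(5))
```

All three printed values should be approximately $-0.894$, in agreement with $-2/\sqrt5$.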

Example 4

If $z=x^2-y^2$, find the rates of change of $z$ with respect to the polar coordinates (find $\partial z/\partial r$ and $\partial z/\partial\theta$).

Solution

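In polar coordinates:
$$x=r\cos\theta,\quad y=r\sin\theta$$
Method (a): We can write $z$ in terms of $r$ and $\theta$ and then differentiate with respect to them directly:
$$z=(r\cos\theta)^2-(r\sin\theta)^2=r^2(\cos^2\theta-\sin^2\theta)$$
Therefore:
$$\frac{\partial z}{\partial r}=2r(\cos^2\theta-\sin^2\theta)=2r\cos 2\theta,\qquad[\text{Recall }\cos^2\theta-\sin^2\theta=\cos 2\theta]$$
$$\frac{\partial z}{\partial\theta}=r^2(-2\sin\theta\cos\theta-2\cos\theta\sin\theta)=-4r^2\sin\theta\cos\theta=-2r^2\sin 2\theta,\qquad[\text{Recall }2\sin\theta\cos\theta=\sin 2\theta]$$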

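Method (b): We can use the chain rule:

$$\begin{bmatrix}\dfrac{\partial z}{\partial r} & \dfrac{\partial z}{\partial\theta}\end{bmatrix}=\begin{bmatrix}\dfrac{\partial z}{\partial x} & \dfrac{\partial z}{\partial y}\end{bmatrix}\begin{bmatrix}\dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial\theta}\\[2mm] \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial\theta}\end{bmatrix}$$

$$\frac{\partial z}{\partial x}=2x=2r\cos\theta,\quad \frac{\partial z}{\partial y}=-2y=-2r\sin\theta,$$
$$\frac{\partial x}{\partial r}=\cos\theta,\quad \frac{\partial x}{\partial\theta}=-r\sin\theta,\quad \frac{\partial y}{\partial r}=\sin\theta,\quad \frac{\partial y}{\partial\theta}=r\cos\theta.$$
Thus:
$$\frac{\partial z}{\partial r}=\frac{\partial z}{\partial x}\frac{\partial x}{\partial r}+\frac{\partial z}{\partial y}\frac{\partial y}{\partial r}=(2r\cos\theta)(\cos\theta)+(-2r\sin\theta)(\sin\theta)=2r(\cos^2\theta-\sin^2\theta)=2r\cos 2\theta$$

$$\frac{\partial z}{\partial\theta}=\frac{\partial z}{\partial x}\frac{\partial x}{\partial\theta}+\frac{\partial z}{\partial y}\frac{\partial y}{\partial\theta}=(2r\cos\theta)(-r\sin\theta)+(-2r\sin\theta)(r\cos\theta)=-4r^2\cos\theta\sin\theta=-2r^2\sin 2\theta$$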
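
The matrix form of the chain rule used in Method (b) can be sketched as a row vector times a Jacobian matrix. The code below is an illustration only: the helper names are ours, and the sample point $(r,\theta)=(2,\,0.3)$ is an arbitrary choice.

```python
import math

def gradient_z(x, y):
    """Row vector [z_x, z_y] for z = x^2 - y^2."""
    return [2.0 * x, -2.0 * y]

def polar_jacobian(r, theta):
    """Matrix [[x_r, x_theta], [y_r, y_theta]] for x = r cos(theta), y = r sin(theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -r * s], [s, r * c]]

def row_times_matrix(row, matrix):
    """Multiply a 1 x n row vector by an n x m matrix."""
    return [sum(row[i] * matrix[i][j] for i in range(len(row)))
            for j in range(len(matrix[0]))]

r, theta = 2.0, 0.3  # arbitrary sample point
x, y = r * math.cos(theta), r * math.sin(theta)
z_r, z_theta = row_times_matrix(gradient_z(x, y), polar_jacobian(r, theta))

# Compare with the closed forms 2 r cos(2 theta) and -2 r^2 sin(2 theta).
print(z_r, 2 * r * math.cos(2 * theta))
print(z_theta, -2 * r**2 * math.sin(2 * theta))
```

Each printed pair should agree, which is a quick check on both the signs and the power of $r$ in the two formulas.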

Example 5

If $x(t)$, $y(t)$ and $z=f(x,y)$ are twice continuously differentiable functions, find $d^2z/dt^2$.

Solution

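According to Theorem 1, we have:
$$\frac{dz}{dt}=\frac{\partial z}{\partial x}\frac{dx}{dt}+\frac{\partial z}{\partial y}\frac{dy}{dt}$$
If we differentiate with respect to $t$ (using the product rule), we have:
$$\frac{d^2z}{dt^2}=\frac{d}{dt}\left(\frac{\partial z}{\partial x}\right)\frac{dx}{dt}+\frac{\partial z}{\partial x}\frac{d^2x}{dt^2}+\frac{d}{dt}\left(\frac{\partial z}{\partial y}\right)\frac{dy}{dt}+\frac{\partial z}{\partial y}\frac{d^2y}{dt^2}.\tag{*}$$

Now $\partial z/\partial x$ and $\partial z/\partial y$ are functions of $x$ and $y$, and to find their derivatives with respect to $t$, we apply Theorem 1 with $z$ replaced by $\partial z/\partial x$ or $\partial z/\partial y$. Thus we have:
$$\frac{d}{dt}\left(\frac{\partial z}{\partial x}\right)=\frac{\partial^2 z}{\partial x^2}\frac{dx}{dt}+\frac{\partial^2 z}{\partial y\,\partial x}\frac{dy}{dt},$$
and
$$\frac{d}{dt}\left(\frac{\partial z}{\partial y}\right)=\frac{\partial^2 z}{\partial x\,\partial y}\frac{dx}{dt}+\frac{\partial^2 z}{\partial y^2}\frac{dy}{dt}.$$

Substituting in (*), we have
$$\frac{d^2z}{dt^2}=\left(\frac{\partial^2 z}{\partial x^2}\frac{dx}{dt}+\frac{\partial^2 z}{\partial y\,\partial x}\frac{dy}{dt}\right)\frac{dx}{dt}+\frac{\partial z}{\partial x}\frac{d^2x}{dt^2}+\left(\frac{\partial^2 z}{\partial x\,\partial y}\frac{dx}{dt}+\frac{\partial^2 z}{\partial y^2}\frac{dy}{dt}\right)\frac{dy}{dt}+\frac{\partial z}{\partial y}\frac{d^2y}{dt^2}$$
$$\frac{d^2z}{dt^2}=\frac{\partial^2 z}{\partial x^2}\left(\frac{dx}{dt}\right)^2+2\frac{\partial^2 z}{\partial x\,\partial y}\frac{dx}{dt}\frac{dy}{dt}+\frac{\partial^2 z}{\partial y^2}\left(\frac{dy}{dt}\right)^2+\frac{\partial z}{\partial x}\frac{d^2x}{dt^2}+\frac{\partial z}{\partial y}\frac{d^2y}{dt^2}.\tag{**}$$

Above, we used $\partial^2 z/(\partial x\,\partial y)=\partial^2 z/(\partial y\,\partial x)$, which holds because the second partial derivatives of $z=f(x,y)$ are continuous (recall the theorem on the symmetry of the second partial derivatives).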
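
As a sanity check on the second-derivative formula of Example 5, one can pick a concrete case and compare the formula with a numerical second derivative. The choices below ($f(x,y)=x^2y$, $x=\cos t$, $y=t^2$, the point $t=0.7$, and the step `h`) are assumptions made purely for illustration.

```python
import math

def x(t):
    return math.cos(t)

def y(t):
    return t**2

def z_of_t(t):
    """z = f(x(t), y(t)) with the illustrative choice f(x, y) = x^2 * y."""
    return x(t)**2 * y(t)

def d2z_dt2_formula(t):
    """z_xx x'^2 + 2 z_xy x' y' + z_yy y'^2 + z_x x'' + z_y y''."""
    xv, yv = x(t), y(t)
    dx, d2x = -math.sin(t), -math.cos(t)
    dy, d2y = 2 * t, 2.0
    z_x, z_y = 2 * xv * yv, xv**2
    z_xx, z_xy, z_yy = 2 * yv, 2 * xv, 0.0
    return (z_xx * dx**2 + 2 * z_xy * dx * dy + z_yy * dy**2
            + z_x * d2x + z_y * d2y)

def second_difference(f, t, h=1e-4):
    """Central approximation of the second derivative (h is assumed)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / h**2

t0 = 0.7
print(d2z_dt2_formula(t0), second_difference(z_of_t, t0))
```

The two printed values should agree to several decimal places; dropping any one of the five terms in the formula makes them disagree.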

Example 6

If $x(s,t)$, $y(s,t)$ and $z=f(x,y)$ are twice continuously differentiable functions, find $\dfrac{\partial^2 z}{\partial t^2}$ and $\dfrac{\partial^2 z}{\partial s\,\partial t}$.

Solution

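Finding $\dfrac{\partial^2 z}{\partial t^2}$ is easy. We just need to replace $\dfrac{d}{dt}$ by $\dfrac{\partial}{\partial t}$ in Eq. (**) in the previous example. Therefore:

$$\frac{\partial^2 z}{\partial t^2}=\frac{\partial^2 z}{\partial x^2}\left(\frac{\partial x}{\partial t}\right)^2+2\frac{\partial^2 z}{\partial x\,\partial y}\frac{\partial x}{\partial t}\frac{\partial y}{\partial t}+\frac{\partial^2 z}{\partial y^2}\left(\frac{\partial y}{\partial t}\right)^2+\frac{\partial z}{\partial x}\frac{\partial^2 x}{\partial t^2}+\frac{\partial z}{\partial y}\frac{\partial^2 y}{\partial t^2}.\tag{*}$$

The process of finding $\dfrac{\partial^2 z}{\partial s\,\partial t}$ is similar to the process of finding $\dfrac{d^2z}{dt^2}$ in the previous example. To find $\dfrac{\partial^2 z}{\partial s\,\partial t}$, we start from $\dfrac{\partial z}{\partial t}$, which we know from Theorem 2:
$$\frac{\partial z}{\partial t}=\frac{\partial z}{\partial x}\frac{\partial x}{\partial t}+\frac{\partial z}{\partial y}\frac{\partial y}{\partial t}.$$
Then we differentiate with respect to $s$:
$$\frac{\partial^2 z}{\partial s\,\partial t}=\frac{\partial}{\partial s}\left(\frac{\partial z}{\partial x}\right)\frac{\partial x}{\partial t}+\frac{\partial z}{\partial x}\frac{\partial^2 x}{\partial s\,\partial t}+\frac{\partial}{\partial s}\left(\frac{\partial z}{\partial y}\right)\frac{\partial y}{\partial t}+\frac{\partial z}{\partial y}\frac{\partial^2 y}{\partial s\,\partial t}.\tag{**}$$
Again $\partial z/\partial x$ and $\partial z/\partial y$ are two functions of $x$ and $y$, therefore:
$$\frac{\partial}{\partial s}\left(\frac{\partial z}{\partial x}\right)=\frac{\partial^2 z}{\partial x^2}\frac{\partial x}{\partial s}+\frac{\partial^2 z}{\partial y\,\partial x}\frac{\partial y}{\partial s},$$
$$\frac{\partial}{\partial s}\left(\frac{\partial z}{\partial y}\right)=\frac{\partial^2 z}{\partial x\,\partial y}\frac{\partial x}{\partial s}+\frac{\partial^2 z}{\partial y^2}\frac{\partial y}{\partial s}.$$
Substituting the above expressions in (**), we get:

$$\frac{\partial^2 z}{\partial s\,\partial t}=\frac{\partial^2 z}{\partial x^2}\frac{\partial x}{\partial s}\frac{\partial x}{\partial t}+\frac{\partial^2 z}{\partial x\,\partial y}\left(\frac{\partial x}{\partial s}\frac{\partial y}{\partial t}+\frac{\partial x}{\partial t}\frac{\partial y}{\partial s}\right)+\frac{\partial^2 z}{\partial y^2}\frac{\partial y}{\partial s}\frac{\partial y}{\partial t}+\frac{\partial z}{\partial x}\frac{\partial^2 x}{\partial s\,\partial t}+\frac{\partial z}{\partial y}\frac{\partial^2 y}{\partial s\,\partial t}.\tag{***}$$

Obviously (***) reduces to (*) if we put $s=t$.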


Example 7

If $z=f(x,y)$ where $f$ is twice continuously differentiable, express $\dfrac{\partial^2 z}{\partial x^2}+\dfrac{\partial^2 z}{\partial y^2}$ in polar coordinates.

Solution

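Here we assume $z=\phi(r,\theta)$, $r=r(x,y)$ and $\theta=\theta(x,y)$, and we want to find $\dfrac{\partial^2 z}{\partial x^2}+\dfrac{\partial^2 z}{\partial y^2}$ in terms of the first and second partial derivatives of $z$ with respect to $r$ and $\theta$. Therefore in this example, $x$ and $y$ are independent variables and $r$ and $\theta$ are intermediate variables. To this end, we can use Eq. (*) in the previous example, replacing $s$ and $t$ with $x$ and $y$, and replacing $x$ and $y$ in that equation by $r$ and $\theta$. Therefore:
$$\frac{\partial^2 z}{\partial x^2}=\frac{\partial^2 z}{\partial r^2}\left(\frac{\partial r}{\partial x}\right)^2+2\frac{\partial^2 z}{\partial r\,\partial\theta}\frac{\partial r}{\partial x}\frac{\partial\theta}{\partial x}+\frac{\partial^2 z}{\partial\theta^2}\left(\frac{\partial\theta}{\partial x}\right)^2+\frac{\partial z}{\partial r}\frac{\partial^2 r}{\partial x^2}+\frac{\partial z}{\partial\theta}\frac{\partial^2\theta}{\partial x^2},$$
Similarly,
$$\frac{\partial^2 z}{\partial y^2}=\frac{\partial^2 z}{\partial r^2}\left(\frac{\partial r}{\partial y}\right)^2+2\frac{\partial^2 z}{\partial r\,\partial\theta}\frac{\partial r}{\partial y}\frac{\partial\theta}{\partial y}+\frac{\partial^2 z}{\partial\theta^2}\left(\frac{\partial\theta}{\partial y}\right)^2+\frac{\partial z}{\partial r}\frac{\partial^2 r}{\partial y^2}+\frac{\partial z}{\partial\theta}\frac{\partial^2\theta}{\partial y^2}.$$
Now we need to find the partial derivatives of $r$ and $\theta$ with respect to $x$ and $y$:
$$r=\sqrt{x^2+y^2}\ \Rightarrow\ \begin{cases}\dfrac{\partial r}{\partial x}=\dfrac{x}{\sqrt{x^2+y^2}}=\dfrac{x}{r}\ \Rightarrow\ \dfrac{\partial^2 r}{\partial x^2}=\dfrac{r-x\,\frac{\partial r}{\partial x}}{r^2}=\dfrac{y^2}{r^3},\\[3mm] \dfrac{\partial r}{\partial y}=\dfrac{y}{\sqrt{x^2+y^2}}=\dfrac{y}{r}\ \Rightarrow\ \dfrac{\partial^2 r}{\partial y^2}=\dfrac{r-y\,\frac{\partial r}{\partial y}}{r^2}=\dfrac{x^2}{r^3},\end{cases}$$

$$\theta=\arctan\left(\frac{y}{x}\right)^{[1]}\ \Rightarrow\ \begin{cases}\dfrac{\partial\theta}{\partial x}=\dfrac{-\frac{y}{x^2}}{1+\frac{y^2}{x^2}}=-\dfrac{y}{r^2}\ \Rightarrow\ \dfrac{\partial^2\theta}{\partial x^2}=\dfrac{2r\,\frac{\partial r}{\partial x}\,y}{r^4}=\dfrac{2xy}{r^4},\\[3mm] \dfrac{\partial\theta}{\partial y}=\dfrac{\frac{1}{x}}{1+\frac{y^2}{x^2}}=\dfrac{x}{r^2}\ \Rightarrow\ \dfrac{\partial^2\theta}{\partial y^2}=-\dfrac{2r\,\frac{\partial r}{\partial y}\,x}{r^4}=-\dfrac{2xy}{r^4},\end{cases}$$

Therefore:
$$\begin{aligned}
\frac{\partial^2 z}{\partial x^2}+\frac{\partial^2 z}{\partial y^2}
&=\frac{\partial^2 z}{\partial r^2}\left[\left(\frac{\partial r}{\partial x}\right)^2+\left(\frac{\partial r}{\partial y}\right)^2\right]+2\frac{\partial^2 z}{\partial r\,\partial\theta}\left[\frac{\partial r}{\partial x}\frac{\partial\theta}{\partial x}+\frac{\partial r}{\partial y}\frac{\partial\theta}{\partial y}\right]+\frac{\partial^2 z}{\partial\theta^2}\left[\left(\frac{\partial\theta}{\partial x}\right)^2+\left(\frac{\partial\theta}{\partial y}\right)^2\right]\\
&\quad+\frac{\partial z}{\partial r}\left[\frac{\partial^2 r}{\partial x^2}+\frac{\partial^2 r}{\partial y^2}\right]+\frac{\partial z}{\partial\theta}\left[\frac{\partial^2\theta}{\partial x^2}+\frac{\partial^2\theta}{\partial y^2}\right]\\
&=\frac{\partial^2 z}{\partial r^2}\left[\frac{x^2}{r^2}+\frac{y^2}{r^2}\right]+2\frac{\partial^2 z}{\partial r\,\partial\theta}\left[\frac{x}{r}\cdot\frac{(-y)}{r^2}+\frac{y}{r}\cdot\frac{x}{r^2}\right]+\frac{\partial^2 z}{\partial\theta^2}\left[\frac{y^2}{r^4}+\frac{x^2}{r^4}\right]\\
&\quad+\frac{\partial z}{\partial r}\left[\frac{y^2}{r^3}+\frac{x^2}{r^3}\right]+\frac{\partial z}{\partial\theta}\left[\frac{2xy}{r^4}-\frac{2xy}{r^4}\right]\\
&=\frac{\partial^2 z}{\partial r^2}+\frac{1}{r^2}\frac{\partial^2 z}{\partial\theta^2}+\frac{1}{r}\frac{\partial z}{\partial r}.
\end{aligned}$$
Thus:

$$\frac{\partial^2 z}{\partial x^2}+\frac{\partial^2 z}{\partial y^2}=\frac{\partial^2 z}{\partial r^2}+\frac{1}{r}\frac{\partial z}{\partial r}+\frac{1}{r^2}\frac{\partial^2 z}{\partial\theta^2}.$$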

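  • As we learned in the previous example, if $z=f(x,y)$ where $f$ is twice continuously differentiable (that is, it has continuous partial derivatives of second order), and if $x=r\cos\theta$ and $y=r\sin\theta$, then we have:
    $$\frac{\partial^2 z}{\partial x^2}+\frac{\partial^2 z}{\partial y^2}=\frac{\partial^2 z}{\partial r^2}+\frac{1}{r}\frac{\partial z}{\partial r}+\frac{1}{r^2}\frac{\partial^2 z}{\partial\theta^2}.$$
    This is a very important result which has numerous applications in physics, engineering, and solving partial differential equations.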
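
The polar form of $\partial^2 z/\partial x^2+\partial^2 z/\partial y^2$ can be spot-checked numerically. The sketch below assumes the test function $f(x,y)=x^2y$ (for which $f_{xx}+f_{yy}=2y$), an arbitrary sample point, and finite-difference steps of size `h`; all of these are our own illustrative choices.

```python
import math

def f(x, y):
    """Illustrative test function; its Cartesian Laplacian is 2*y."""
    return x**2 * y

def g(r, theta):
    """The same function expressed in polar coordinates."""
    return f(r * math.cos(theta), r * math.sin(theta))

def polar_laplacian(func, r, theta, h=1e-4):
    """Approximate g_rr + g_r / r + g_theta_theta / r^2 by central differences."""
    g_r = (func(r + h, theta) - func(r - h, theta)) / (2 * h)
    g_rr = (func(r + h, theta) - 2 * func(r, theta) + func(r - h, theta)) / h**2
    g_tt = (func(r, theta + h) - 2 * func(r, theta) + func(r, theta - h)) / h**2
    return g_rr + g_r / r + g_tt / r**2

r0, theta0 = 1.5, 0.8  # arbitrary sample point with r0 > 0
polar_value = polar_laplacian(g, r0, theta0)
cartesian_value = 2 * r0 * math.sin(theta0)  # 2*y at the same point
print(polar_value, cartesian_value)
```

The two values should agree to within the finite-difference error. Note that the formula requires $r>0$, since it divides by $r$ and $r^2$.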

Be Careful About Which Variables Are Held Fixed

In physics, many quantities such as temperature, density, electrostatic and magnetic fields may vary with time and location. For example, the temperature, T, on earth varies with location (x,y,z) and time t, so we have T=f(x,y,z,t). If the location where we measure the temperature moves along a path, x,y, and z also vary with time. In other words, we have x(t),y(t), and z(t). The following tree diagram shows the dependency of temperature on time:

 

If we wish to find the rate of change of $T$ (with time), i.e. $dT/dt$, we need to use the chain rule. With the aid of the above diagram, we multiply the corresponding derivatives along each path from $T$ to $t$, and add these products together:
$$\frac{dT}{dt}=\frac{\partial T}{\partial x}\frac{dx}{dt}+\frac{\partial T}{\partial y}\frac{dy}{dt}+\frac{\partial T}{\partial z}\frac{dz}{dt}+\frac{\partial T}{\partial t}.$$
Note that $\partial T/\partial t$ is different from $dT/dt$. The notation $\partial T/\partial t$ means we have fixed the location and calculate the rate of change of temperature there. We may indicate the fact that we have fixed the location and then measured the rate of change of temperature by writing:
$$\left(\frac{\partial T}{\partial t}\right)_{x,y,z}$$

Example 8

Suppose that in a two-dimensional flow, an electrical conductor creates the electrostatic potential field $P$ given by $P=\cos(\pi t)\ln(x^2+y^2)$. If a particle of the fluid is moving along the path $x=\sin\left(\frac{\pi}{2}t\right)+2$ and $y=t+1$, find the rate of change of $P$ as seen by the particle when $t=1$.

Solution

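Here we have a function $P(x,y,t)$ where $x$ and $y$ also depend on $t$, and we want to find $dP/dt$ when $t=1$.

$$\frac{dP}{dt}=\frac{\partial P}{\partial x}\frac{dx}{dt}+\frac{\partial P}{\partial y}\frac{dy}{dt}+\frac{\partial P}{\partial t}$$

$$\frac{dx}{dt}=\frac{\pi}{2}\cos\left(\frac{\pi}{2}t\right)\ \Rightarrow\ \left.\frac{dx}{dt}\right|_{t=1}=0,$$
so we don't need to calculate $\dfrac{\partial P}{\partial x}$, because $\dfrac{\partial P}{\partial x}\dfrac{dx}{dt}=0$ at $t=1$. Also,
$$\frac{dy}{dt}=1$$
(because it is constant, we don't need to plug $t=1$ into it).

When $t=1$, we have $x=\sin\left(\frac{\pi}{2}\right)+2=3$ and $y=1+1=2$.
$$\frac{\partial P}{\partial y}=\cos(\pi t)\,\frac{2y}{x^2+y^2}\ \Rightarrow\ \left.\frac{\partial P}{\partial y}\right|_{x=3,\,y=2,\,t=1}=\cos\pi\,\frac{2\times 2}{3^2+2^2}=-\frac{4}{13},$$
$$\frac{\partial P}{\partial t}=-\pi\sin(\pi t)\ln(x^2+y^2)\ \Rightarrow\ \left.\frac{\partial P}{\partial t}\right|_{x=3,\,y=2,\,t=1}=-\pi\underbrace{\sin(\pi\times 1)}_{=0}\ln(3^2+2^2)=0.$$
Therefore:
$$\left.\frac{dP}{dt}\right|_{t=1}=\left.\frac{\partial P}{\partial y}\right|_{x=3,\,y=2,\,t=1}\left.\frac{dy}{dt}\right|_{t=1}=-\frac{4}{13}\times 1=-\frac{4}{13}.$$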

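An alternative method is to plug $x(t)$ and $y(t)$ into the formula for $P(x,y,t)$, differentiate with respect to $t$, and then plug in $t=1$. However, this is more complicated. Here we skip some calculations and show the results.

$$P=\cos(\pi t)\ln(x^2+y^2)=\cos(\pi t)\ln\left(\left(\sin\left(\tfrac{\pi}{2}t\right)+2\right)^2+(t+1)^2\right)$$

$$\frac{dP}{dt}=\frac{\cos(\pi t)\left\{2(1+t)+\pi\cos\left(\frac{\pi t}{2}\right)\left(2+\sin\left(\frac{\pi t}{2}\right)\right)\right\}}{(1+t)^2+\left(2+\sin\left(\frac{\pi t}{2}\right)\right)^2}-\pi\sin(\pi t)\ln\left[(1+t)^2+\left(2+\sin\left(\tfrac{\pi t}{2}\right)\right)^2\right]$$
$$\left.\frac{dP}{dt}\right|_{t=1}=-\frac{4}{13}$$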
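
Both routes of Example 8 can be compared in code: the chain rule with the explicit $\partial P/\partial t$ term, and a numerical derivative of $P$ along the path. This is a sketch only; the function names and the step size `h` are assumptions.

```python
import math

def P(x, y, t):
    return math.cos(math.pi * t) * math.log(x**2 + y**2)

def path(t):
    """Position (x, y) of the fluid particle at time t."""
    return math.sin(math.pi * t / 2) + 2, t + 1

def dP_dt_chain(t):
    """dP/dt = P_x dx/dt + P_y dy/dt + P_t (the last term has x, y held fixed)."""
    x, y = path(t)
    dx_dt, dy_dt = (math.pi / 2) * math.cos(math.pi * t / 2), 1.0
    q = x**2 + y**2
    P_x = math.cos(math.pi * t) * 2 * x / q
    P_y = math.cos(math.pi * t) * 2 * y / q
    P_t = -math.pi * math.sin(math.pi * t) * math.log(q)
    return P_x * dx_dt + P_y * dy_dt + P_t

def dP_dt_numeric(t, h=1e-6):
    """Central difference of P evaluated along the path (h is assumed)."""
    return (P(*path(t + h), t + h) - P(*path(t - h), t - h)) / (2 * h)

print(dP_dt_chain(1.0), dP_dt_numeric(1.0), -4 / 13)
```

All three printed values should be close to $-0.3077$.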

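Now consider a case where $T=f(x,y,z,t)$ and $x=x(s,t)$, $y=y(s,t)$, and $z=z(s,t)$. In this case,
$$T=f(x(s,t),y(s,t),z(s,t),t)=\phi(s,t).$$
The following tree diagram shows this situation.

If we wish to find $\dfrac{\partial\phi}{\partial t}$, according to the above tree diagram, we have:
$$\frac{\partial\phi}{\partial t}=\frac{\partial f}{\partial x}\frac{\partial x}{\partial t}+\frac{\partial f}{\partial y}\frac{\partial y}{\partial t}+\frac{\partial f}{\partial z}\frac{\partial z}{\partial t}+\frac{\partial f}{\partial t}.\tag{ii}$$
The above equation is sometimes written as
$$\frac{\partial T}{\partial t}=\frac{\partial T}{\partial x}\frac{\partial x}{\partial t}+\frac{\partial T}{\partial y}\frac{\partial y}{\partial t}+\frac{\partial T}{\partial z}\frac{\partial z}{\partial t}+\frac{\partial T}{\partial t},$$
but we should note that $\dfrac{\partial T}{\partial t}$ on the left and on the right have two different meanings. On the left we are treating $s$ as a constant, and on the right we are treating $x$, $y$ and $z$ as constants. To indicate the difference between them, it would be better to use the notation in Equation (ii) or to write it as:
$$\left(\frac{\partial T}{\partial t}\right)_{s}=\left(\frac{\partial T}{\partial x}\right)_{y,z,t}\left(\frac{\partial x}{\partial t}\right)_{s}+\left(\frac{\partial T}{\partial y}\right)_{x,z,t}\left(\frac{\partial y}{\partial t}\right)_{s}+\left(\frac{\partial T}{\partial z}\right)_{x,y,t}\left(\frac{\partial z}{\partial t}\right)_{s}+\left(\frac{\partial T}{\partial t}\right)_{x,y,z}.$$
When there is no ambiguity, we may drop some or all of the subscripts.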

Example 9

Given $u=xy+yz+xz$ and $z=x/y$, find $\left(\dfrac{\partial u}{\partial x}\right)_y$.

Solution

Method (a): According to the tree diagram:

 

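$$\left(\frac{\partial u}{\partial x}\right)_{y}=\left(\frac{\partial u}{\partial x}\right)_{y,z}+\left(\frac{\partial u}{\partial z}\right)_{x,y}\left(\frac{\partial z}{\partial x}\right)_{y}$$
$$\left(\frac{\partial u}{\partial x}\right)_{y,z}=y+z=y+\frac{x}{y},\quad \left(\frac{\partial u}{\partial z}\right)_{x,y}=x+y,\quad \left(\frac{\partial z}{\partial x}\right)_{y}=\frac{1}{y}.$$
Therefore:
$$\left(\frac{\partial u}{\partial x}\right)_{y}=y+\frac{x}{y}+(x+y)\left(\frac{1}{y}\right)=1+y+\frac{2x}{y}.$$
Method (b): First we plug $z=x/y$ into the formula for $u$, and then differentiate with respect to $x$:
$$u=xy+y\,\frac{x}{y}+x\,\frac{x}{y}\ \Rightarrow\ \left(\frac{\partial u}{\partial x}\right)_{y}=y+1+\frac{2x}{y}.$$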
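
The distinction between $(\partial u/\partial x)_{y,z}$ and $(\partial u/\partial x)_y$ in Example 9 can be made concrete with numbers. The sketch below uses hypothetical function names and an arbitrary sample point $(x,y)=(2,4)$.

```python
def u(x, y, z):
    return x * y + y * z + x * z

def du_dx_fixed_y_and_z(x, y):
    """(du/dx) with y and z held fixed, evaluated on the surface z = x / y."""
    z = x / y
    return y + z

def du_dx_fixed_y(x, y):
    """(du/dx) with only y held fixed, so z = x / y is allowed to change with x."""
    z = x / y
    du_dx_yz = y + z       # direct dependence of u on x
    du_dz_xy = x + y       # dependence of u on z
    dz_dx_y = 1.0 / y      # dependence of z on x at fixed y
    return du_dx_yz + du_dz_xy * dz_dx_y

def numeric_du_dx_fixed_y(x, y, h=1e-6):
    """Central difference of x -> u(x, y, x / y); h is an assumed step size."""
    composite = lambda x_val: u(x_val, y, x_val / y)
    return (composite(x + h) - composite(x - h)) / (2 * h)

x0, y0 = 2.0, 4.0
print(du_dx_fixed_y_and_z(x0, y0))   # y + x/y
print(du_dx_fixed_y(x0, y0))         # 1 + y + 2x/y
print(numeric_du_dx_fixed_y(x0, y0))
```

At this point the first value is $4.5$ while the other two are $6$, which shows that the same symbol $\partial u/\partial x$ gives different numbers depending on which variables are held fixed.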

Homogeneous Functions

A function $f(x_1,\dots,x_n)$ is called homogeneous of degree $k$ if for any $t>0$ and for all $(x_1,\dots,x_n)$ in the domain of $f$:
$$f(tx_1,\dots,tx_n)=t^kf(x_1,\dots,x_n)$$
In other words, a function is homogeneous if, whenever we multiply all of its arguments by the same positive factor, its value is multiplied by some power of this factor. Here are some examples of homogeneous functions:

  • The function $f(x,y)=x^2-5xy+y^2$ is homogeneous of degree 2 because:
    $$f(tx,ty)=(tx)^2-5(tx)(ty)+(ty)^2=t^2(x^2-5xy+y^2)=t^2f(x,y).$$
  • The function $f(x,y,z)=x^5yz^3$ is homogeneous of degree 9 because: $f(tx,ty,tz)=(tx)^5(ty)(tz)^3=(t^5\,t\,t^3)(x^5yz^3)=t^9f(x,y,z)$.
    • (Recall $a^ba^c=a^{b+c}$)
  • The function $f(x,y)=(x^2y^2)/(x^4+y^4)$ is homogeneous of degree 0 because:
    $$f(tx,ty)=\frac{t^4x^2y^2}{t^4(x^4+y^4)}=f(x,y).$$
  • The function $f(x,y,z)=(x^5+y^5+z^5)^{1/5}$ is homogeneous of degree 1 because:
    $$f(tx,ty,tz)=\left((tx)^5+(ty)^5+(tz)^5\right)^{1/5}=\left(t^5(x^5+y^5+z^5)\right)^{1/5}=t\,(x^5+y^5+z^5)^{1/5}=t\,f(x,y,z).$$
  • The function $f(x,y)=(x^2+xy)/(x^3+y^3)$ is homogeneous of degree -1 because:
    $$f(tx,ty)=\frac{t^2(x^2+xy)}{t^3(x^3+y^3)}=\frac{1}{t}\,\frac{x^2+xy}{x^3+y^3}=t^{-1}f(x,y)$$
  • The function $f(x,y)=\ln(x^2+y^2)$ is NOT homogeneous because in general: $f(tx,ty)=\ln[t^2(x^2+y^2)]=\ln(t^2)+\ln(x^2+y^2)=2\ln t+\ln(x^2+y^2)\neq t^k\ln(x^2+y^2)$.
    • (Recall $\ln ab=\ln a+\ln b$ and $\ln a^b=b\ln a$)

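Theorem 3. Euler's theorem: If $f(x,y)$ is a homogeneous function of degree $k$ then:
$$x\frac{\partial f}{\partial x}+y\frac{\partial f}{\partial y}=kf.$$
In general, if $f(x_1,\dots,x_n)$ is homogeneous of degree $k$ then:
$$x_1\frac{\partial f}{\partial x_1}+\cdots+x_n\frac{\partial f}{\partial x_n}=kf.$$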
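
Euler's theorem lends itself to a quick numerical experiment. The sketch below approximates the left-hand side with central differences for three of the homogeneous functions listed earlier; the helper name `euler_left_side`, the sample points, and the step size `h` are assumptions made for illustration.

```python
def euler_left_side(f, point, h=1e-6):
    """Approximate x_1 f_{x_1} + ... + x_n f_{x_n} at the given point."""
    total = 0.0
    for i, x_i in enumerate(point):
        forward, backward = list(point), list(point)
        forward[i] += h
        backward[i] -= h
        total += x_i * (f(*forward) - f(*backward)) / (2 * h)
    return total

def quadratic(x, y):       # homogeneous of degree 2
    return x**2 - 5 * x * y + y**2

def ratio(x, y):           # homogeneous of degree 0
    return x**2 * y**2 / (x**4 + y**4)

def fifth_root(x, y, z):   # homogeneous of degree 1
    return (x**5 + y**5 + z**5) ** 0.2

checks = [(quadratic, (1.5, 2.0), 2),
          (ratio, (1.5, 2.0), 0),
          (fifth_root, (1.0, 2.0, 3.0), 1)]
for func, point, k in checks:
    # The two printed numbers should agree: left side vs. k * f.
    print(func.__name__, euler_left_side(func, point), k * func(*point))
```

For each function the two printed numbers should agree up to the finite-difference error, as the theorem predicts.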


Example 10

If $z=f(x,y)$ is homogeneous of degree $k$ and $u=\phi(z)$, show:
$$x\frac{\partial u}{\partial x}+y\frac{\partial u}{\partial y}=kz\,\phi'(z).$$

Solution

According to the chain rule:


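$$\frac{\partial u}{\partial x}=\underbrace{\frac{du}{dz}}_{=\phi'(z)}\frac{\partial z}{\partial x}=\phi'(z)\frac{\partial z}{\partial x},$$
$$\frac{\partial u}{\partial y}=\underbrace{\frac{du}{dz}}_{=\phi'(z)}\frac{\partial z}{\partial y}=\phi'(z)\frac{\partial z}{\partial y}.$$
Therefore
$$x\frac{\partial u}{\partial x}+y\frac{\partial u}{\partial y}=x\,\phi'(z)\frac{\partial z}{\partial x}+y\,\phi'(z)\frac{\partial z}{\partial y}=\phi'(z)\underbrace{\left[x\frac{\partial z}{\partial x}+y\frac{\partial z}{\partial y}\right]}_{=kz}=kz\,\phi'(z).$$
[Because $z$ is homogeneous of degree $k$, according to Theorem 3, $x\dfrac{\partial z}{\partial x}+y\dfrac{\partial z}{\partial y}=kz$.]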

Example 11

Given
$$u=\arcsin\frac{x^2y^2}{x^2+y^2},$$
find $x\dfrac{\partial u}{\partial x}+y\dfrac{\partial u}{\partial y}$.

Solution

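We note that if we put $z=\dfrac{x^2y^2}{x^2+y^2}$, then $z$ is homogeneous of degree 2, and $u=\phi(z)=\arcsin z$. Therefore, using the result of the previous example, we have:
$$x\frac{\partial u}{\partial x}+y\frac{\partial u}{\partial y}=kz\,\phi'(z)=2z\,\frac{1}{\sqrt{1-z^2}}=2\sin u\,\frac{1}{\sqrt{1-\sin^2u}}=\frac{2\sin u}{|\cos u|}=\frac{2\sin u}{\cos u}=2\tan u$$
Because $u=\arcsin z$, we have $-\frac{\pi}{2}\le u\le\frac{\pi}{2}$. In this interval $\cos u\ge 0$, that is, $|\cos u|=\cos u$.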
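
The identity found in Example 11 can be checked at a sample point. The sketch below assumes $u=\arcsin\big(x^2y^2/(x^2+y^2)\big)$, an arbitrary point $(0.5,\,0.8)$ at which the argument of $\arcsin$ lies strictly between $-1$ and $1$, and a finite-difference step `h`.

```python
import math

def u(x, y):
    z = x**2 * y**2 / (x**2 + y**2)   # homogeneous of degree 2
    return math.asin(z)

def x_ux_plus_y_uy(x, y, h=1e-6):
    """Approximate x u_x + y u_y with central differences (h is assumed)."""
    u_x = (u(x + h, y) - u(x - h, y)) / (2 * h)
    u_y = (u(x, y + h) - u(x, y - h)) / (2 * h)
    return x * u_x + y * u_y

x0, y0 = 0.5, 0.8
left_side = x_ux_plus_y_uy(x0, y0)
right_side = 2 * math.tan(u(x0, y0))
print(left_side, right_side)
```

The two printed values should agree to about six decimal places.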


[1] This relationship is valid only in part of the plane (for example, when $x>0$ and $y>0$), but for our purpose it is enough.