Directional Derivatives

We have learned that the rates of change of a function f along the coordinate axes are its first partial derivatives. Now we want to generalize this concept and find the rate of change of f along any arbitrary direction.

Consider a function \(z=f(x,y)\) and a unit vector \(\mathbf{v}=v_1\mathbf{i}+v_2\mathbf{j}\). Let’s approach the point \((x_0,y_0)\) along a ray parallel to \(\mathbf{v}\). The rate of change of \(f\) at \((x_0,y_0)\) with respect to distance along this ray is called the directional derivative of \(f\) at \((x_0,y_0)\) in the direction \(\mathbf{v}\).

For a geometrical interpretation, consider the point \(P=(x_0,y_0,f(x_0,y_0))\) on the surface \(z=f(x,y)\) and the points \(Q=(x_0,y_0,0)\) and \(R=(x_0+tv_1,y_0+tv_2,0)\) in the \(xy\)-plane, as illustrated in Fig. 1. Note that \(QR\) is parallel to \(\mathbf{v}\). The plane \(\Omega\) through \(Q\) and \(R\) and perpendicular to the \(xy\)-plane[1] intersects the surface \(z=f(x,y)\) in a curve \(C\). The slope of the tangent line to \(C\) at the point \(P\), measured in the plane \(\Omega\), is the directional derivative of \(f\) in the direction \(\mathbf{v}\) and is denoted by \(D_{\mathbf{v}}f(x_0,y_0)\).

Figure 1.

 

Definition 1. The directional derivative of a function \(f\) at \((x_0,y_0)\) in the direction of \(\mathbf{v}\), denoted by \(D_{\mathbf{v}}f(x_0,y_0)\), is given by
\[D_{\mathbf{v}}f(x_0,y_0)=\lim_{t\to 0}\frac{f(x_0+tv_1,\,y_0+tv_2)-f(x_0,y_0)}{t},\]
whenever the limit on the right-hand side exists.

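  • The above definition is also meaningful when \(\mathbf{v}\) is not a unit vector. However, when \(|\mathbf{v}|\neq 1\), \(D_{\mathbf{v}}f(x_0,y_0)\) is NOT equal to the rate of change of \(f\) in the direction \(\mathbf{v}\). For \(\mathbf{v}\neq\mathbf{0}\), the rate of change of \(f\) in the direction \(\mathbf{v}\), that is, the slope of the tangent line to the curve \(C\) at \(P\), is equal to \(D_{\mathbf{v}/|\mathbf{v}|}f(x_0,y_0)\).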

The rate of change of \(f\) at \((x_0,y_0)\) along an arbitrary line parallel to \(\mathbf{v}\) (\(\mathbf{v}\neq\mathbf{0}\)) is equal to \(D_{\mathbf{v}/|\mathbf{v}|}f(x_0,y_0)\).

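  • If \(\mathbf{v}=\mathbf{0}\), Definition 1 gives
    \[D_{\mathbf{0}}f(x,y)=\lim_{t\to 0}\frac{f(x+t\times 0,\,y+t\times 0)-f(x,y)}{t}=0\]
    for every \((x,y)\) in the domain of \(f\).
  • If \(\mathbf{v}=\mathbf{i}\), the above definition gives
    \[D_{\mathbf{i}}f(x_0,y_0)=\lim_{t\to 0}\frac{f(x_0+t,\,y_0)-f(x_0,y_0)}{t},\]
    which is exactly the partial derivative of \(f\) with respect to \(x\), \(\frac{\partial f}{\partial x}(x_0,y_0)=f_x(x_0,y_0)\). Similarly, if \(\mathbf{v}=\mathbf{j}\), the directional derivative of \(f\) in the direction of \(\mathbf{j}\) is the partial derivative of \(f\) with respect to \(y\). So
    \[D_{\mathbf{i}}f=\frac{\partial f}{\partial x}=f_x,\qquad D_{\mathbf{j}}f=\frac{\partial f}{\partial y}=f_y.\]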
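The remarks above can be checked numerically. The following is a minimal sketch, not part of the original notes: it evaluates the quotient of Definition 1 at one small value of t (an approximation to the limit, not the limit itself) for a hypothetical test function f(x, y) = x²y at the point (1, 2). The names `difference_quotient` and `sample_f` are illustrative assumptions.

```python
import math

def difference_quotient(f, point, v, t=1e-6):
    """Quotient from Definition 1 at a small fixed t (approximates the limit)."""
    x0, y0 = point
    v1, v2 = v
    return (f(x0 + t * v1, y0 + t * v2) - f(x0, y0)) / t

def sample_f(x, y):
    # Hypothetical test function (not from the text): f(x, y) = x^2 * y
    return x**2 * y

p = (1.0, 2.0)
v = (3.0, 4.0)                          # |v| = 5, so v is not a unit vector
norm = math.hypot(v[0], v[1])
unit = (v[0] / norm, v[1] / norm)       # v / |v|

d_v = difference_quotient(sample_f, p, v)               # close to 16
d_unit = difference_quotient(sample_f, p, unit)         # close to 3.2
d_i = difference_quotient(sample_f, p, (1.0, 0.0))      # close to f_x(1, 2) = 4
d_zero = difference_quotient(sample_f, p, (0.0, 0.0))   # exactly 0

print(d_v, d_unit, d_i, d_zero)
```

For this sample, the value obtained with the non-unit vector is about |v| = 5 times the value obtained with v/|v|, which is why only the normalized direction gives the rate of change with respect to distance.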
Example 1

Given \(f(x,y)=x^2-2y^2\) and \(\mathbf{v}=\mathbf{i}-3\mathbf{j}\), find \(D_{\mathbf{v}}f(x,y)\).

Solution

It follows from Definition 1 that
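\[\begin{aligned}
D_{\mathbf{v}}f(x,y)&=\lim_{t\to 0}\frac{f(x+t,\,y-3t)-f(x,y)}{t}\\
&=\lim_{t\to 0}\frac{(x+t)^2-2(y-3t)^2-(x^2-2y^2)}{t}\\
&=\lim_{t\to 0}\frac{x^2+2xt+t^2-2(y^2-6yt+9t^2)-x^2+2y^2}{t}\\
&=\lim_{t\to 0}\frac{2xt+12yt-17t^2}{t}=2x+12y.
\end{aligned}\]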

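  • If \(g(t)=f(x_0+tv_1,\,y_0+tv_2)\), then \(g'(0)=D_{\mathbf{v}}f(x_0,y_0)\).
    To prove this, we use the definition of \(g'(0)\):
    \[g'(0)=\lim_{t\to 0}\frac{g(t)-g(0)}{t}=\lim_{t\to 0}\frac{f(x_0+tv_1,\,y_0+tv_2)-f(x_0,y_0)}{t}=D_{\mathbf{v}}f(x_0,y_0).\]
    We also note that \(g(t)=f(x(t),y(t))\), where \(x(t)=x_0+tv_1\) and \(y(t)=y_0+tv_2\). If \(f\) is differentiable, we can find \(g'(t)\) using the chain rule:
    \[\frac{dg}{dt}=\frac{\partial f}{\partial x}\frac{dx}{dt}+\frac{\partial f}{\partial y}\frac{dy}{dt}=\frac{\partial f}{\partial x}\,v_1+\frac{\partial f}{\partial y}\,v_2.\]
    When \(t=0\), we have \(x(0)=x_0\) and \(y(0)=y_0\). Thus:
    \[g'(0)=\left.\frac{dg}{dt}\right|_{t=0}=\left.\frac{\partial f}{\partial x}\right|_{(x_0,y_0)}v_1+\left.\frac{\partial f}{\partial y}\right|_{(x_0,y_0)}v_2=f_x(x_0,y_0)\,v_1+f_y(x_0,y_0)\,v_2.\]
    This proves the following theorem.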

Theorem 1. If \(f\) is a differentiable function at \((x_0,y_0)\), and \(\mathbf{v}=v_1\mathbf{i}+v_2\mathbf{j}\), then
\[D_{\mathbf{v}}f(x_0,y_0)=f_x(x_0,y_0)\,v_1+f_y(x_0,y_0)\,v_2.\]

Example 2

Given \(f(x,y)=x^2-2y^2\) and \(\mathbf{v}=\mathbf{i}-3\mathbf{j}\), find \(D_{\mathbf{v}}f(x,y)\) using Theorem 1.

Solution

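First we need to calculate \(f_x\) and \(f_y\):
\[f_x=2x,\qquad f_y=-4y.\]
Therefore:
\[D_{\mathbf{v}}f(x,y)=\underbrace{f_x}_{=2x}\,\underbrace{v_1}_{=1}+\underbrace{f_y}_{=-4y}\,\underbrace{v_2}_{=-3}=(2x)(1)+(-4y)(-3)=2x+12y.\]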
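As a sanity check on the result 2x + 12y, the sketch below compares the formula of Theorem 1 with the quotient of Definition 1 for shrinking values of t. It is only an illustration: the sample point (2, 1) and the helper names `theorem1` and `quotient` are assumptions, not part of the text.

```python
def f(x, y):
    # Function from Examples 1 and 2
    return x**2 - 2 * y**2

def grad_f(x, y):
    # Partial derivatives computed by hand: f_x = 2x, f_y = -4y
    return (2 * x, -4 * y)

def theorem1(x, y, v):
    """Directional derivative via Theorem 1: f_x * v1 + f_y * v2."""
    fx, fy = grad_f(x, y)
    return fx * v[0] + fy * v[1]

def quotient(x, y, v, t):
    """Quotient from Definition 1 at a fixed nonzero t."""
    return (f(x + t * v[0], y + t * v[1]) - f(x, y)) / t

v = (1.0, -3.0)
x, y = 2.0, 1.0                  # arbitrary sample point
exact = theorem1(x, y, v)        # 2x + 12y = 16 at (2, 1)

for t in (1e-1, 1e-2, 1e-3):
    # For this quadratic f the quotient equals 2x + 12y - 17t exactly
    print(t, quotient(x, y, v, t), exact)
```

The printed quotients approach the Theorem 1 value as t shrinks, with an error proportional to t.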

Does the Existence of Directional Derivatives in All Directions Guarantee Differentiability?

We learned that if the first partial derivatives of a function are continuous in a neighbourhood of a point, the function is differentiable at that point. However, the mere existence of the first partial derivatives does not imply that the function is differentiable. This raises a question: does the stronger condition that the directional derivatives of \(f\) exist in all directions (not just along the coordinate axes) guarantee the differentiability of \(f\)? The answer is still “no.” Consider the following example.

Example 3

Let
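\[z=f(x,y)=\begin{cases}\dfrac{xy^2}{x^2+y^2} & \text{if } (x,y)\neq(0,0),\\[6pt] 0 & \text{if } (x,y)=(0,0).\end{cases}\]
Find \(D_{\mathbf{v}}f(0,0)\) for every unit vector \(\mathbf{v}\). Is \(f\) differentiable at the origin?

Graph of \(z=\dfrac{xy^2}{x^2+y^2}\)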
Solution

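Let \(\mathbf{v}=\cos\theta\,\mathbf{i}+\sin\theta\,\mathbf{j}\) be any unit vector. In polar coordinates, we have
\[z=\frac{r\cos\theta\; r^2\sin^2\theta}{r^2}=r\cos\theta\sin^2\theta.\]
This means that along any ray from the origin making an angle \(\theta\) with the positive \(x\)-axis, the graph is a straight line of slope \(\cos\theta\sin^2\theta\). Therefore, \(D_{\mathbf{v}}f(0,0)=\cos\theta\sin^2\theta\). If this argument is not convincing, let’s calculate \(D_{\mathbf{v}}f(0,0)\) from the definition:
\[D_{(\cos\theta\,\mathbf{i}+\sin\theta\,\mathbf{j})}f(0,0)=\lim_{t\to 0}\frac{f(0+t\cos\theta,\,0+t\sin\theta)-\overbrace{f(0,0)}^{=0}}{t}=\lim_{t\to 0}\frac{1}{t}\,\frac{t\cos\theta\; t^2\sin^2\theta}{t^2\underbrace{(\cos^2\theta+\sin^2\theta)}_{=1}}=\cos\theta\sin^2\theta.\]
Because the function is constant (\(z=0\)) on the \(x\)- and \(y\)-axes, we conclude \(f_x(0,0)=f_y(0,0)=0\). Therefore, if the function were differentiable, according to Theorem 1 we would have \(D_{\mathbf{v}}f(0,0)=f_x(0,0)\cos\theta+f_y(0,0)\sin\theta=0\). However, we showed that \(D_{\mathbf{v}}f(0,0)=\cos\theta\sin^2\theta\), which is not zero if \(\theta\neq k\pi/2\) (for \(k\in\mathbb{Z}\)). Therefore the function cannot be differentiable at the origin.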
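The mismatch in Example 3 can also be seen numerically. The sketch below is an illustration under stated assumptions: the angle θ = π/4 and the step t = 10⁻⁶ are arbitrary choices, and `quotient_at_origin` is a hypothetical helper name. It compares the quotient from the definition with the value that the formula of Theorem 1 would predict.

```python
import math

def f(x, y):
    # Function from Example 3
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y**2 / (x**2 + y**2)

def quotient_at_origin(v1, v2, t=1e-6):
    """Quotient from the definition of D_v f(0, 0) at a small fixed t."""
    return (f(t * v1, t * v2) - f(0.0, 0.0)) / t

theta = math.pi / 4                      # arbitrary direction, not a multiple of pi/2
c, s = math.cos(theta), math.sin(theta)

from_definition = quotient_at_origin(c, s)   # close to cos(theta) * sin(theta)^2
closed_form = c * s**2

fx0 = quotient_at_origin(1.0, 0.0)       # f vanishes on the x-axis, so this is 0
fy0 = quotient_at_origin(0.0, 1.0)       # f vanishes on the y-axis, so this is 0
from_gradient = fx0 * c + fy0 * s        # what Theorem 1 would give if it applied

print(from_definition, closed_form, from_gradient)
```

The first two numbers agree, while the third is zero, which is consistent with the conclusion that the function is not differentiable at the origin.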

Even if a function has finite directional derivatives in all directions, it may fail to be continuous, let alone be differentiable. The following example illustrates such a situation.

Example 4

Let
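\[z=f(x,y)=\begin{cases}\dfrac{xy^2}{x^2+y^4} & \text{if } x\neq 0,\\[6pt] 0 & \text{if } x=0.\end{cases}\]
Find \(D_{\mathbf{v}}f(0,0)\) for every unit vector \(\mathbf{v}\).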

Solution

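Let \(\mathbf{v}=\cos\theta\,\mathbf{i}+\sin\theta\,\mathbf{j}\) be any unit vector. Because we don’t know whether or not \(f\) is differentiable, to find \(D_{\mathbf{v}}f(0,0)\) we have to use the definition of the directional derivative:
\[\begin{aligned}
D_{\mathbf{v}}f(0,0)&=\lim_{t\to 0}\frac{f(0+t\cos\theta,\,0+t\sin\theta)-\overbrace{f(0,0)}^{=0}}{t}\\
&=\lim_{t\to 0}\frac{1}{t}\,\frac{t\cos\theta\; t^2\sin^2\theta}{t^2(\cos^2\theta+t^2\sin^4\theta)}\\
&=\lim_{t\to 0}\frac{\cos\theta\,\sin^2\theta}{\cos^2\theta+t^2\sin^4\theta}
=\begin{cases}\dfrac{\sin^2\theta}{\cos\theta} & \text{if } \cos\theta\neq 0,\\[6pt] 0 & \text{if } \cos\theta=0.\end{cases}
\end{aligned}\]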

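In fact, if we approach the origin along the line \(y=mx\), we have:
\[\lim_{x\to 0}f(x,mx)=\lim_{x\to 0}\frac{x(m^2x^2)}{x^2(1+m^4x^2)}=\lim_{x\to 0}\frac{m^2x}{1+m^4x^2}=0,\]
and if \((x,y)\) tends to \((0,0)\) along the curve \(y=\sqrt{x}\), we have:
\[\lim_{x\to 0^+}f(x,\sqrt{x})=\lim_{x\to 0^+}\frac{x(\sqrt{x})^2}{x^2+(\sqrt{x})^4}=\lim_{x\to 0^+}\frac{x^2}{2x^2}=\frac{1}{2}.\]
Therefore \(\lim_{(x,y)\to(0,0)}f(x,y)\) does not exist, and the function is not even continuous at the origin, let alone differentiable.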
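The behaviour in Example 4 can be illustrated with a few function evaluations. This is a minimal sketch, not a proof: the angle θ = π/3, the slope m = 2, and the sample values of x are arbitrary assumptions. It evaluates f along a straight line and along the curve y = √x (equivalently x = y²), and also approximates one directional derivative at the origin.

```python
import math

def f(x, y):
    # Function from Example 4
    if x == 0.0:
        return 0.0
    return x * y**2 / (x**2 + y**4)

def quotient_at_origin(theta, t):
    """Quotient from the definition of D_v f(0, 0) for v = (cos theta, sin theta)."""
    return (f(t * math.cos(theta), t * math.sin(theta)) - f(0.0, 0.0)) / t

theta = math.pi / 3
estimate = quotient_at_origin(theta, 1e-6)
expected = math.sin(theta) ** 2 / math.cos(theta)   # sin^2(theta) / cos(theta) = 1.5

xs = (1e-2, 1e-4, 1e-6)
along_line = [f(x, 2.0 * x) for x in xs]            # y = 2x: values shrink toward 0
along_curve = [f(x, math.sqrt(x)) for x in xs]      # y = sqrt(x): values stay near 1/2

print(estimate, expected)
print(along_line)
print(along_curve)
```

The values along the line tend to 0 while those along the curve stay at about 1/2, so no single limit exists at the origin even though every directional derivative there is finite.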

Gradients

Now let’s go back to Theorem 1. We learned that if \(f\) is a differentiable function at \((x,y)\), and \(\mathbf{v}=v_1\mathbf{i}+v_2\mathbf{j}\), then
\[D_{\mathbf{v}}f(x,y)=f_x(x,y)\,v_1+f_y(x,y)\,v_2.\]
The right-hand side of the above expression can be written as the dot product of two vectors:
\[f_x(x,y)\,v_1+f_y(x,y)\,v_2=\big(f_x(x,y)\,\mathbf{i}+f_y(x,y)\,\mathbf{j}\big)\cdot\underbrace{(v_1\mathbf{i}+v_2\mathbf{j})}_{\mathbf{v}}.\]
Therefore:
\[D_{\mathbf{v}}f(x,y)=\big(f_x(x,y)\,\mathbf{i}+f_y(x,y)\,\mathbf{j}\big)\cdot\mathbf{v}.\]
The first vector on the right-hand side is called the “gradient of \(f\)” and is denoted by “\(\nabla f\)” or “\(\operatorname{grad} f\).” The symbol “\(\nabla\)” is an inverted capital delta, \(\Delta\), and is read “del” or “nabla.” We can also write \(\nabla f=(f_x,f_y)\). Gradients have many applications that we will discuss in this chapter.

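Definition 2. If \(f\) is a function of two variables \(x\) and \(y\), and if \(f_x(x,y)\) and \(f_y(x,y)\) exist, the gradient of \(f\), denoted by \(\nabla f\) or \(\operatorname{grad} f\), is defined by
\[\nabla f(x,y)=f_x(x,y)\,\mathbf{i}+f_y(x,y)\,\mathbf{j}.\]

If \(f\) is a differentiable function at \((x,y)\), then
\[D_{\mathbf{v}}f(x,y)=\mathbf{v}\cdot\nabla f(x,y).\]

    • Recall that the dot product is commutative. That is, for two vectors \(\mathbf{a}\) and \(\mathbf{b}\), we have \(\mathbf{a}\cdot\mathbf{b}=\mathbf{b}\cdot\mathbf{a}\).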
Example 5

Let \(f(x,y)=x^2-y^2+xy\). Find \(\nabla f(1,-2)\).

Solution

\[f_x(x,y)=2x+y\ \Rightarrow\ f_x(1,-2)=2\times 1-2=0\]
\[f_y(x,y)=-2y+x\ \Rightarrow\ f_y(1,-2)=-2\times(-2)+1=5\]
Thus:
\[\nabla f(1,-2)=5\,\mathbf{j}=(0,5).\]

Example 6

If \(f\) is a function of \(x\) and \(y\), find \(\nabla f\) in polar coordinates.

Solution

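We know \(\nabla f=\frac{\partial f}{\partial x}\mathbf{i}+\frac{\partial f}{\partial y}\mathbf{j}\). Not only should we write \(\frac{\partial f}{\partial x}\) and \(\frac{\partial f}{\partial y}\) in terms of \(\frac{\partial f}{\partial r}\) and \(\frac{\partial f}{\partial\theta}\), but we also need to write \(\mathbf{i}\) and \(\mathbf{j}\) in terms of the polar unit vectors \(\mathbf{e}_r\) and \(\mathbf{e}_\theta\) (the subscripts on these two unit vectors do not denote differentiation).

From the chain rule, we know:
\[\frac{\partial f}{\partial x}=\frac{\partial f}{\partial r}\frac{\partial r}{\partial x}+\frac{\partial f}{\partial\theta}\frac{\partial\theta}{\partial x},\qquad \frac{\partial f}{\partial y}=\frac{\partial f}{\partial r}\frac{\partial r}{\partial y}+\frac{\partial f}{\partial\theta}\frac{\partial\theta}{\partial y}.\]
In an earlier example, we showed:
\[\begin{bmatrix}\dfrac{\partial r}{\partial x} & \dfrac{\partial r}{\partial y}\\[8pt] \dfrac{\partial\theta}{\partial x} & \dfrac{\partial\theta}{\partial y}\end{bmatrix}=\begin{bmatrix}\dfrac{x}{r} & \dfrac{y}{r}\\[8pt] -\dfrac{y}{r^2} & \dfrac{x}{r^2}\end{bmatrix}\]
Therefore:
\[\frac{\partial f}{\partial x}=\frac{\partial f}{\partial r}\underbrace{\frac{x}{r}}_{=\cos\theta}+\frac{\partial f}{\partial\theta}\underbrace{\left(-\frac{y}{r^2}\right)}_{=-\frac{\sin\theta}{r}}\]
\[\frac{\partial f}{\partial y}=\frac{\partial f}{\partial r}\underbrace{\frac{y}{r}}_{=\sin\theta}+\frac{\partial f}{\partial\theta}\underbrace{\frac{x}{r^2}}_{=\frac{\cos\theta}{r}}\]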

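On the other hand, \(\mathbf{i}\) and \(\mathbf{j}\) should be written in terms of \(\mathbf{e}_r\) and \(\mathbf{e}_\theta\). Using geometry we have:
\[\mathbf{i}=\mathbf{e}_r\cos\theta-\mathbf{e}_\theta\sin\theta\]
\[\mathbf{j}=\mathbf{e}_r\sin\theta+\mathbf{e}_\theta\cos\theta\]

\[\begin{aligned}
\nabla f&=\frac{\partial f}{\partial x}\mathbf{i}+\frac{\partial f}{\partial y}\mathbf{j}\\
&=\left(\cos\theta\,\frac{\partial f}{\partial r}-\frac{\sin\theta}{r}\frac{\partial f}{\partial\theta}\right)(\mathbf{e}_r\cos\theta-\mathbf{e}_\theta\sin\theta)+\left(\sin\theta\,\frac{\partial f}{\partial r}+\frac{\cos\theta}{r}\frac{\partial f}{\partial\theta}\right)(\mathbf{e}_r\sin\theta+\mathbf{e}_\theta\cos\theta)\\
&=\Big[\frac{\partial f}{\partial r}\underbrace{(\cos^2\theta+\sin^2\theta)}_{=1}+\frac{1}{r}\frac{\partial f}{\partial\theta}\underbrace{(-\sin\theta\cos\theta+\sin\theta\cos\theta)}_{=0}\Big]\mathbf{e}_r\\
&\quad+\Big[\frac{\partial f}{\partial r}\underbrace{(-\sin\theta\cos\theta+\sin\theta\cos\theta)}_{=0}+\frac{\partial f}{\partial\theta}\underbrace{\left(\frac{\sin^2\theta}{r}+\frac{\cos^2\theta}{r}\right)}_{=\frac{1}{r}}\Big]\mathbf{e}_\theta
\end{aligned}\]
Thus the gradient of \(f\) in polar coordinates is:

\[\nabla f=\frac{\partial f}{\partial r}\mathbf{e}_r+\frac{1}{r}\frac{\partial f}{\partial\theta}\mathbf{e}_\theta\]

  • In the above example, we showed that the gradient in polar coordinates is
    \[\nabla f=\frac{\partial f}{\partial r}\mathbf{e}_r+\frac{1}{r}\frac{\partial f}{\partial\theta}\mathbf{e}_\theta\qquad\text{or}\qquad \nabla f=\left(\mathbf{e}_r\frac{\partial}{\partial r}+\mathbf{e}_\theta\frac{1}{r}\frac{\partial}{\partial\theta}\right)f.\]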
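The polar formula can be cross-checked against the Cartesian gradient for a concrete function. The sketch below is an illustration under assumptions: the test function f(x, y) = x²y, the point (r, θ) = (2, 0.7), and the central-difference step h are all arbitrary choices, and the components of the polar unit vectors are taken as e_r = (cos θ, sin θ) and e_θ = (−sin θ, cos θ), which matches the expressions for i and j used in the example.

```python
import math

def f(x, y):
    # Hypothetical test function (not from the text)
    return x**2 * y

def grad_cartesian(x, y):
    # Partial derivatives computed by hand: f_x = 2xy, f_y = x^2
    return (2 * x * y, x**2)

def grad_polar(r, theta, h=1e-6):
    """Gradient from f_r e_r + (1/r) f_theta e_theta, returned in Cartesian components."""
    def F(r_, th_):
        return f(r_ * math.cos(th_), r_ * math.sin(th_))
    f_r = (F(r + h, theta) - F(r - h, theta)) / (2 * h)       # approximates df/dr
    f_theta = (F(r, theta + h) - F(r, theta - h)) / (2 * h)   # approximates df/dtheta
    e_r = (math.cos(theta), math.sin(theta))
    e_theta = (-math.sin(theta), math.cos(theta))
    return (f_r * e_r[0] + f_theta / r * e_theta[0],
            f_r * e_r[1] + f_theta / r * e_theta[1])

r, theta = 2.0, 0.7
gp = grad_polar(r, theta)
gc = grad_cartesian(r * math.cos(theta), r * math.sin(theta))
print(gp, gc)
```

Both routes give the same vector up to the error of the finite-difference approximation.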

Directional Derivatives and Gradients in 3- and n-Space

The concepts of the directional derivative and the gradient extend easily to functions of three or more variables. For example, if \(f:U\subseteq\mathbb{R}^3\to\mathbb{R}\), its directional derivative in the direction of a vector \(\mathbf{v}=v_1\mathbf{i}+v_2\mathbf{j}+v_3\mathbf{k}\) is:
\[D_{\mathbf{v}}f(x,y,z)=\lim_{t\to 0}\frac{f(x+tv_1,\,y+tv_2,\,z+tv_3)-f(x,y,z)}{t}=\left.\frac{d}{dt}f(x+tv_1,\,y+tv_2,\,z+tv_3)\right|_{t=0}\]
Note that \((x+tv_1,\,y+tv_2,\,z+tv_3)=(x,y,z)+t(v_1,v_2,v_3)=(x,y,z)+t\mathbf{v}\). The general definition of the directional derivative is as follows.

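Definition 3. Consider a function \(f:U\subseteq\mathbb{R}^n\to\mathbb{R}\). The directional derivative of \(f\) at \(\mathbf{x}\in U\) in the direction \(\mathbf{v}\in\mathbb{R}^n\), denoted by \(D_{\mathbf{v}}f(\mathbf{x})\), is defined by:
\[D_{\mathbf{v}}f(\mathbf{x})=\lim_{t\to 0}\frac{f(\mathbf{x}+t\mathbf{v})-f(\mathbf{x})}{t}\]
whenever the limit on the right-hand side exists.

The gradient of \(f(x,y,z)\) is
\[\nabla f(x,y,z)=(f_x,f_y,f_z)=f_x\,\mathbf{i}+f_y\,\mathbf{j}+f_z\,\mathbf{k}\]
and for the general case \(f(x_1,\dots,x_n)\) we have:

Definition 4. Consider a function \(f:U\subseteq\mathbb{R}^n\to\mathbb{R}\) such that \(f_{x_1}(\mathbf{x}),\dots,f_{x_n}(\mathbf{x})\) exist. Then the gradient of \(f\), denoted by \(\nabla f\) or \(\operatorname{grad} f\), is the vector
\[\nabla f(\mathbf{x})=\big(f_{x_1}(\mathbf{x}),\dots,f_{x_n}(\mathbf{x})\big).\]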

And

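Theorem 2. If \(f\) is a differentiable function at \(\mathbf{x}=(x_1,\dots,x_n)\), then
\[D_{\mathbf{v}}f(\mathbf{x})=\mathbf{v}\cdot\nabla f(\mathbf{x})\]
for every vector \(\mathbf{v}\in\mathbb{R}^n\).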

Example 7

Let \(f(x,y,z)=\rho=\sqrt{x^2+y^2+z^2}\) be the function that gives the distance from \(\mathbf{0}\) to \((x,y,z)\). Find \(\nabla f(1,-2,-1)\).

Solution

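\[\nabla f(x,y,z)=(f_x,f_y,f_z)=\left(\frac{x}{\sqrt{x^2+y^2+z^2}},\frac{y}{\sqrt{x^2+y^2+z^2}},\frac{z}{\sqrt{x^2+y^2+z^2}}\right)=\left(\frac{x}{\rho},\frac{y}{\rho},\frac{z}{\rho}\right)=\frac{1}{\rho}(x,y,z)\]
This means that, away from the origin, \(\nabla f\) is a unit vector in the direction \((x,y,z)\). It is a unit vector because we divide the vector \((x,y,z)\) by its length \(\rho\).

To find \(\nabla f(1,-2,-1)\), we find the length of the vector \((1,-2,-1)\): \(\rho=|(1,-2,-1)|=\sqrt{1^2+(-2)^2+(-1)^2}=\sqrt{6}\), and therefore:
\[\nabla f(1,-2,-1)=\frac{1}{\sqrt{6}}(1,-2,-1)=\frac{1}{\sqrt{6}}\mathbf{i}-\frac{2}{\sqrt{6}}\mathbf{j}-\frac{1}{\sqrt{6}}\mathbf{k}.\]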
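A short numerical check of this example, offered as a sketch rather than as part of the original solution: it evaluates the gradient of the distance function at the point (1, −2, −1) and confirms that the result has length 1. The helper name `grad_distance` is an assumption.

```python
import math

def grad_distance(x, y, z):
    """Gradient of rho = sqrt(x^2 + y^2 + z^2), valid away from the origin."""
    rho = math.sqrt(x * x + y * y + z * z)
    return (x / rho, y / rho, z / rho)

g = grad_distance(1.0, -2.0, -1.0)
length = math.sqrt(sum(comp * comp for comp in g))

print(g)        # components 1/sqrt(6), -2/sqrt(6), -1/sqrt(6)
print(length)   # very close to 1
```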

Example 8

Let \(f(x,y,z)=xe^{yz}+3x^2-yz\). Find the rate of change of \(f\) at \((1,0,-3)\) in the direction of the vector \((2,0,1)\).

Solution

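We know the rate of change of \(f\) at a point in the direction of a vector \(\mathbf{v}\) is \(D_{\mathbf{v}/|\mathbf{v}|}f\) evaluated at that point (in this example, at \((1,0,-3)\)). On the other hand, we know:
\[D_{\mathbf{v}/|\mathbf{v}|}f=\nabla f\cdot\frac{\mathbf{v}}{|\mathbf{v}|}.\]
[We have replaced \(\mathbf{v}\) in Theorem 2 by the vector \(\mathbf{v}/|\mathbf{v}|\).]

So we need to (1) find \(\nabla f(1,0,-3)\), (2) normalize \(\mathbf{v}\) by dividing it by its length, and (3) take the dot product of the two vectors obtained in steps 1 and 2.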

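Step 1:
\[f(x,y,z)=xe^{yz}+3x^2-yz\ \Rightarrow\ \nabla f(x,y,z)=(f_x,f_y,f_z)=\left(e^{yz}+6x,\ xze^{yz}-z,\ xye^{yz}-y\right)\]
\[\nabla f(1,0,-3)=\left(e^0+6,\ -3e^0+3,\ 0\right)=(7,0,0)\]

Step 2: Here \(\mathbf{v}=(2,0,1)\) and \(|\mathbf{v}|=\sqrt{2^2+0^2+1^2}=\sqrt{5}\).
\[\frac{\mathbf{v}}{|\mathbf{v}|}=\left(\frac{2}{\sqrt{5}},0,\frac{1}{\sqrt{5}}\right)\]

Step 3:
\[D_{\mathbf{v}/|\mathbf{v}|}f(1,0,-3)=(7,0,0)\cdot\left(\frac{2}{\sqrt{5}},0,\frac{1}{\sqrt{5}}\right)=\frac{14}{\sqrt{5}}.\]

Therefore, the rate of change of \(f\) at \((1,0,-3)\) in the direction of the vector \((2,0,1)\) is \(14/\sqrt{5}\approx 6.26\).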

Properties of the Gradient

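Properties of the gradient are similar to the properties of the ordinary derivative of a function of a single variable. If \(f\) and \(g\) are differentiable functions from an open set \(U\subseteq\mathbb{R}^n\) to \(\mathbb{R}\), then:

  1. \(\nabla f(\mathbf{x})=\mathbf{0}\) for every \(\mathbf{x}\) in \(U\) if and only if \(f(\mathbf{x})=\) constant on \(U\) (the "only if" part assumes that \(U\) is connected).
  2. \(\nabla(f+g)(\mathbf{x})=\nabla f(\mathbf{x})+\nabla g(\mathbf{x})\).
  3. \(\nabla(cf)(\mathbf{x})=c\,\nabla f(\mathbf{x})\), where \(c\) is a constant.
  4. \(\nabla(fg)(\mathbf{x})=g(\mathbf{x})\nabla f(\mathbf{x})+f(\mathbf{x})\nabla g(\mathbf{x})\).
  5. \(\nabla\left(\dfrac{f}{g}\right)(\mathbf{x})=\dfrac{g(\mathbf{x})\nabla f(\mathbf{x})-f(\mathbf{x})\nabla g(\mathbf{x})}{g^2(\mathbf{x})}\), at points at which \(g(\mathbf{x})\neq 0\).
  6. Let \(\mathbf{r}:I\subseteq\mathbb{R}\to\mathbb{R}^n\) be a function that maps an interval \(I\) into the domain of \(f\). Assume \(\mathbf{r}'(t)\) exists and \(f\) is differentiable at \(\mathbf{r}(t)\). If we define \(\phi:I\subseteq\mathbb{R}\to\mathbb{R}\) by \(\phi(t)=f(\mathbf{r}(t))\), then by the chain rule \(\phi'(t)\) exists and is equal to:
    \[(\ast)\qquad \phi'(t)=\nabla f(\mathbf{r}(t))\cdot\mathbf{r}'(t).\]
    Note that \((\ast)\) is not something new. It is just a new way of writing what we saw before. For example, when \(n=3\) and \(\mathbf{r}(t)=(x(t),y(t),z(t))\), \((\ast)\) is the same as the following:
    \[\phi'(t)=\frac{d\phi(t)}{dt}=\frac{\partial f}{\partial x}\frac{dx(t)}{dt}+\frac{\partial f}{\partial y}\frac{dy(t)}{dt}+\frac{\partial f}{\partial z}\frac{dz(t)}{dt}.\]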
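Property 6 can be illustrated numerically. The sketch below is only an example under assumptions: the function f(x, y, z) = xy + z², the curve r(t) = (cos t, sin t, t), and the step h are hypothetical choices, not taken from the text. It compares the chain-rule expression for ϕ′(t) with a central-difference estimate of the derivative of ϕ(t) = f(r(t)).

```python
import math

def f(x, y, z):
    # Hypothetical test function
    return x * y + z**2

def grad_f(x, y, z):
    # Partial derivatives computed by hand
    return (y, x, 2 * z)

def r(t):
    # Hypothetical curve in R^3
    return (math.cos(t), math.sin(t), t)

def r_prime(t):
    return (-math.sin(t), math.cos(t), 1.0)

def phi(t):
    return f(*r(t))

def phi_prime_chain(t):
    """Chain-rule form: grad f(r(t)) . r'(t)."""
    return sum(g_i * r_i for g_i, r_i in zip(grad_f(*r(t)), r_prime(t)))

def phi_prime_numeric(t, h=1e-6):
    """Central-difference estimate of phi'(t)."""
    return (phi(t + h) - phi(t - h)) / (2 * h)

t0 = 0.5
chain = phi_prime_chain(t0)        # equals cos(2 * t0) + 2 * t0 for this sample
numeric = phi_prime_numeric(t0)
print(chain, numeric)
```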

 

Example 9
(Multi-dimensional version of the Mean-Value Theorem) Let \(U\) be an open set containing two points \(\mathbf{a}\) and \(\mathbf{b}\) and the line segment joining them. Show that if \(f\) and its first partial derivatives are continuous on \(U\), then there is a point \(\mathbf{c}\) on the line segment joining \(\mathbf{a}\) and \(\mathbf{b}\) such that \(f(\mathbf{b})-f(\mathbf{a})=\nabla f(\mathbf{c})\cdot(\mathbf{b}-\mathbf{a})\).

[Hint: Express the line segment in parametric form and use the Mean-Value Theorem for functions of one variable.]

Solution

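The line segment joining \(\mathbf{a}\) and \(\mathbf{b}\) can be parametrized by the equation

\[\mathbf{r}(t)=\mathbf{a}+t(\mathbf{b}-\mathbf{a}),\qquad 0\le t\le 1.\]

Let \(g\) be the function from \([0,1]\) to \(\mathbb{R}\) with

\[g(t)=f(\mathbf{r}(t)).\]

Because the first partial derivatives of \(f\) are continuous on \(U\), \(f\) is differentiable on \(U\), so by the chain rule \(g\) is continuous on \([0,1]\) and differentiable on \((0,1)\). It follows from the Mean-Value Theorem that there is a number \(t_0\) between 0 and 1 such that

\[g(1)-g(0)=g'(t_0)(1-0).\]

Because \(g(0)=f(\mathbf{r}(0))=f(\mathbf{a})\), \(g(1)=f(\mathbf{r}(1))=f(\mathbf{b})\), and

\[g'(t)=\nabla f(\mathbf{r}(t))\cdot\mathbf{r}'(t)=\nabla f(\mathbf{r}(t))\cdot(\mathbf{b}-\mathbf{a}),\]
we have

\[f(\mathbf{b})-f(\mathbf{a})=\nabla f(\mathbf{r}(t_0))\cdot(\mathbf{b}-\mathbf{a})\]

or

\[f(\mathbf{b})-f(\mathbf{a})=\nabla f(\mathbf{c})\cdot(\mathbf{b}-\mathbf{a}),\]

where \(\mathbf{c}=\mathbf{r}(t_0)\).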
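To make the statement concrete, the sketch below locates such a point c for one sample case. Everything in it is an assumption made for illustration: the function f(x, y) = x³ + y², the endpoints a = (0, 0) and b = (1, 2), and the use of bisection, which works here only because the quantity being solved for changes sign between t = 0 and t = 1 for this particular choice.

```python
def f(x, y):
    # Hypothetical test function
    return x**3 + y**2

def grad_f(x, y):
    # Partial derivatives computed by hand
    return (3 * x**2, 2 * y)

a = (0.0, 0.0)
b = (1.0, 2.0)
d = (b[0] - a[0], b[1] - a[1])          # b - a

def r(t):
    """Point on the segment from a to b."""
    return (a[0] + t * d[0], a[1] + t * d[1])

def gap(t):
    """grad f(r(t)) . (b - a) minus (f(b) - f(a)); a zero of gap gives c = r(t)."""
    gx, gy = grad_f(*r(t))
    return gx * d[0] + gy * d[1] - (f(*b) - f(*a))

lo, hi = 0.0, 1.0                       # gap(0) < 0 < gap(1) for this sample
for _ in range(60):                     # plain bisection on [0, 1]
    mid = 0.5 * (lo + hi)
    if gap(mid) < 0.0:
        lo = mid
    else:
        hi = mid

t0 = 0.5 * (lo + hi)
c = r(t0)
print(t0, c, gap(t0))
```

For this sample the equation reduces to 3t² + 8t − 5 = 0, so the computed t0 can be compared with the positive root of that quadratic.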


Gradient Vector Field

Suppose \(f\) is a function of \(x\) and \(y\). The gradient of \(f\) assigns a two-dimensional vector \((f_x,f_y)\) to each point of the plane \(\mathbb{R}^2\) at which the partial derivatives exist. A rule that assigns a vector to each point of two- or three-dimensional space is called a vector field. As such, \(\nabla f\) is referred to as a gradient vector field. Other examples of vector fields in physics and engineering include the velocity of (steady) wind or water currents, the gravitational field, electric and magnetic fields, and the displacement field of a deformable body under external forces.

To visualize a vector field in two or three dimensions, we pick a sample of points and at each of them draw the vector that the field assigns to that point. The lengths of the vectors are often scaled down so that more vectors can be shown in the plane. This is an effective way of representing a gradient field, because its graph cannot be drawn directly: if \(f:U\subseteq\mathbb{R}^2\to\mathbb{R}\), its gradient is a function from \(U\subseteq\mathbb{R}^2\) to \(\mathbb{R}^2\), so its graph would be the set of points \((\mathbf{x},\nabla f(\mathbf{x}))=(x,y,f_x,f_y)\), which is a subset of \(\mathbb{R}^4\) and impossible to plot.
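The sampling-and-scaling idea described above can be sketched without any plotting library. The example below is an assumption-laden illustration: it uses the function f(x, y) = x² − y² + xy, an integer grid on [−2, 2] × [−2, 2], and a scaling rule (divide by the longest sampled vector) chosen only for simplicity. The resulting pairs of point and scaled vector are what a plotting tool would draw as arrows.

```python
import math

def grad_f(x, y):
    # Gradient of f(x, y) = x^2 - y^2 + x*y, computed by hand
    return (2 * x + y, -2 * y + x)

# Sample the field at integer grid points
grid = [(x, y) for y in range(-2, 3) for x in range(-2, 3)]
field = {p: grad_f(*p) for p in grid}

# Scale so the longest sampled vector has length 1 (one grid spacing)
longest = max(math.hypot(u, w) for u, w in field.values())
scaled = {p: (u / longest, w / longest) for p, (u, w) in field.items()}

for p in [(0, 0), (1, -2), (2, 2)]:
    print(p, field[p], scaled[p])
```

Dividing by the longest vector keeps the arrows from overlapping their neighbours while preserving their directions and relative lengths.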


[1] or equivalently parallel to the z-axis