Chain Rule for the Derivative of the Composition of Functions

With what we have seen so far, we already have all the basics needed to compute almost any derivative. However, being able to compute a derivative is one thing, and the effort the computation demands is another. This is where theorems such as the chain rule of single-variable calculus come into play: the chain rule lets us quickly compute derivatives that would otherwise involve tedious and complicated work.

TABLE OF CONTENTS
The Chain Rule Theorem in One Real Variable
Proof of the Chain Rule
Examples of the Use of the Chain Rule in Functions of One Variable
Precaution to Keep in Mind Regarding the Chain Rule
Useful Results Obtained from the Chain Rule
Inverse Function Theorem
Derivative of the Exponential Function
Derivative of the Inverse Trigonometric Functions
Implicit Differentiation
Derivatives of Rational Powers
Exercise Guide


The Chain Rule Theorem in One Real Variable

Let f and g be two functions that can be composed:

f: A\subseteq \mathbb{R} \longrightarrow B\subseteq \mathbb{R}

g: B\subseteq Dom(g) \longrightarrow D\subseteq \mathbb{R}

If f is differentiable on A and g is differentiable on B, then the composite function g\circ f is differentiable at every x\in A and the following formula holds:

\displaystyle \frac{d}{dx}(g\circ f)(x) = \frac{d}{dx} g(f(x)) = \frac{dg(f(x))}{df(x)} \frac{df(x)}{dx}

Proof of the Chain Rule

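Let us consider the functions f and g as defined above, and assume in addition that f(x+\Delta x) \neq f(x) for every sufficiently small \Delta x \neq 0, so that we may divide by f(x+\Delta x) - f(x). If we compute the derivative of the composition from the definition, then we have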

\begin{array}{rcl} \dfrac{d}{dx} g(f(x))& = & \displaystyle\lim_{\Delta x \to 0} \dfrac{g(f(x + \Delta x)) - g(f(x))}{\Delta x} \\ \\ &=&\displaystyle \lim_{\Delta x \to 0} \frac{g(f(x + \Delta x)) - g(f(x))}{\Delta x} \cdot \frac{f(x + \Delta x) - f(x)}{f(x+\Delta x) - f(x)} \\ \\ &=& \displaystyle \lim_{\Delta x \to 0} \frac{g(f(x + \Delta x)) - g(f(x))}{f(x+\Delta x) - f(x)} \cdot \frac{f(x + \Delta x) - f(x)}{\Delta x} \\ \\ &=&\displaystyle \lim_{\Delta x \to 0} \frac{g(f(x + \Delta x)) - g(f(x))}{f(x+\Delta x) - f(x)} \cdot \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}\\ \\ &=& \displaystyle \lim_{f(x+\Delta x) \to f(x) } \frac{g(f(x + \Delta x)) - g(f(x))}{f(x+\Delta x) - f(x)} \cdot \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}\\ \\ &=& \displaystyle \frac{dg(f(x))}{df(x)} \frac{df(x)}{dx} \end{array}

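The change of variable in the first limit is justified because f, being differentiable, is continuous, so f(x+\Delta x) \to f(x) as \Delta x \to 0. This is what we wanted to prove.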

Examples of the Use of the Chain Rule in Functions of One Variable

Something that the formula does not make clear at first glance, but that becomes evident in practice, is that the chain rule lets us differentiate a composition of functions “from the outside in.” Examples are by far the quickest way to explain this.

  1. If we are asked to differentiate f(x) = (2x^2+1)^{12} without the chain rule, we would first have to expand the power and then apply the power rule to each term of the large polynomial obtained, an unnecessarily exhausting task. With the chain rule, the derivative can be computed in a single line:

    \displaystyle \frac{d}{dx} (2x^2+1)^{12} = 12(2x^2+1)^{11}(4x)= 48x(2x^2+1)^{11}

  2. Try to compute the derivative of g(x) = \sin(\cos(x)) using only basic differentiation techniques and face eternal suffering. Do it using the chain rule and the result will appear without tears and in just a few steps:

    \displaystyle \frac{d}{dx} \sin(\cos(x))= -\cos(\cos(x))\sin(x)

  3. You can also compute the derivative of functions that are compositions of many functions. If f(x)=\cos(\cos(\cos(x))), the derivative df/dx becomes:

    \begin{array}{rcl} \displaystyle \frac{d}{dx} \cos(\cos(\cos(x))) &=& -\sin(\cos(\cos(x)))\cdot(-\sin(\cos(x)))\cdot(-\sin(x)) \\ \\ &=& -\sin(\cos(\cos(x)))\cdot\sin(\cos(x))\cdot\sin(x) \end{array}

    As you can see, applying the chain rule simply means differentiating in a chained manner from the outside in.
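The three results above can be checked numerically. The following is a minimal sketch, not part of the original text: the helper `central_diff`, the sample point `x0 = 0.3`, and the step size are illustrative assumptions. It compares each chain-rule formula with a symmetric difference quotient.

```python
import math

def central_diff(func, x, h=1e-6):
    """Approximate func'(x) with a symmetric difference quotient (hypothetical helper)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Each pair: (function, derivative predicted by the chain rule in the examples above)
examples = [
    (lambda x: (2 * x**2 + 1) ** 12,
     lambda x: 48 * x * (2 * x**2 + 1) ** 11),
    (lambda x: math.sin(math.cos(x)),
     lambda x: -math.cos(math.cos(x)) * math.sin(x)),
    (lambda x: math.cos(math.cos(math.cos(x))),
     lambda x: -math.sin(math.cos(math.cos(x))) * math.sin(math.cos(x)) * math.sin(x)),
]

x0 = 0.3  # arbitrary sample point, chosen only for illustration
errors = []
for func, predicted in examples:
    numeric = central_diff(func, x0)
    # Relative error, guarded against very small derivative values
    errors.append(abs(numeric - predicted(x0)) / max(1.0, abs(predicted(x0))))

print(errors)  # every entry should be tiny if the formulas are right
```

A numerical check of this kind does not prove the formulas, but it is a quick way to catch a sign or a forgotten inner derivative.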

Precaution to Keep in Mind Regarding the Chain Rule

In the literature, everyone shows the great benefits of using the chain rule, but very few are emphatic about the precautions that must be taken before applying it. Despite the power of this theorem, you must always pay close attention to the domains and ranges of the functions involved. Before working, make sure that they are compatible for composition; otherwise, you run the risk of computing derivatives at points where they do not exist. Consider, for example, the function

f(x)=\ln(\cos(x))

If you blindly trust the chain rule, you will perform calculations such as the following:

\displaystyle \frac{d}{dx}\ln(\cos(x)) = -\frac{1}{\cos(x)}\sin(x) = -\tan(x)

The tangent function is perfectly well defined at x=2\pi/3, where \tan(2\pi/3) = -\sqrt{3}, so the formula above returns the value \sqrt{3}. However, the function f(x)=\ln(\cos(x)) is not defined there, because f(2\pi/3) = \ln(\cos(2\pi/3)) = \ln(-1/2), and the logarithm of a negative number is not defined in the real numbers! In cases like this, it is necessary to state, before applying the chain rule, that the values of x under consideration are those that keep the cosine positive (so that the composition makes sense); only then does the chain rule hold.
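The mismatch between the formal derivative and the original function can be seen directly on a computer. This is a small sketch using only Python's standard `math` module; the variable names are illustrative assumptions.

```python
import math

x = 2 * math.pi / 3

# The formal derivative -tan(x) evaluates without complaint...
formal_derivative = -math.tan(x)  # close to sqrt(3)

# ...but the original function ln(cos(x)) is not defined at this point,
# because cos(x) is negative and math.log rejects negative arguments.
try:
    value = math.log(math.cos(x))
    defined = True
except ValueError:
    defined = False

print(formal_derivative, defined)
```

The formula produces a number even though there is no function value to differentiate, which is exactly the risk described above.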

Useful Results Obtained from the Chain Rule

The chain rule is not only useful for achieving derivative calculations that would otherwise be unbearable; it is also useful for further expanding differentiation techniques to many other functions. Below we will review these techniques, their results, and their proofs.

Inverse Function Theorem

Let f be a bijective function that is differentiable on some interval I\subseteq \mathbb{R}, with df(x)/dx \neq 0 on I, and assume that its inverse f^{-1} is also differentiable. Using the chain rule, it is possible to differentiate the identity (f^{-1}\circ f)(x) = f^{-1}(f(x)) = x. The calculation yields the following result:

1 = \displaystyle \frac{d}{dx} x = \frac{d}{dx} f^{-1}(f(x)) = \frac{df^{-1}(f(x))}{df(x)}\frac{df(x)}{dx}

From this, one can solve for df^{-1}(f(x))/df(x) and obtain:

\displaystyle \color{blue}{\frac{df^{-1}(f(x))}{df(x)}= \frac{1}{\frac{df(x)}{dx}}}

This is what is known as the inverse function theorem for computing derivatives. In the literature, it is common to find this theorem written in the form

\displaystyle \color{blue}{\frac{dx}{dy}= \frac{1}{\frac{dy}{dx}}}

Both ways of expressing the inverse function theorem are equivalent and follow from writing y=f(x) and x=f^{-1}(y).
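To see the theorem at work on a function whose inverse has no convenient closed form, here is a hedged sketch. The function f(x) = x^3 + x, the bisection-based `f_inverse`, and the sample point are assumptions chosen for illustration; they do not come from the original text.

```python
def f(x):
    # Strictly increasing on the whole real line, hence bijective
    return x**3 + x

def f_prime(x):
    # Never zero, so the hypothesis of the theorem holds
    return 3 * x**2 + 1

def f_inverse(y, lo=-10.0, hi=10.0, iterations=200):
    """Invert f numerically by bisection (valid because f is increasing)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x0 = 1.5
y0 = f(x0)
h = 1e-5

# Derivative of the inverse at y0, estimated by a symmetric difference quotient
numeric = (f_inverse(y0 + h) - f_inverse(y0 - h)) / (2 * h)

# Value predicted by the inverse function theorem: 1 / f'(x0)
predicted = 1 / f_prime(x0)

print(numeric, predicted)
```

The two numbers should agree to several decimal places, without the inverse ever being written down explicitly.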

That is all we need to say about the inverse function theorem itself; now we will see how to use it to compute some derivatives that would otherwise be quite difficult.

Derivative of the Exponential Function

When we studied basic differentiation techniques we saw that

\displaystyle \frac{d}{dx}\ln(x) = \frac{1}{x}

With this result and the inverse function theorem, it is easy to prove that

\displaystyle \frac{d}{dx}e^x = e^x

PROOF:

It is clear that y=\ln(x) is equivalent to saying that x=e^y. Then, applying the inverse function theorem we have:

\displaystyle \frac{d}{dy}e^y = \frac{dx}{dy} = \frac{1}{\frac{dy}{dx}} = \frac{1}{\frac{d}{dx}\ln(x)} = x = e^y

That is:

\displaystyle \frac{d}{dy}e^y = e^y

If in this last expression we replace the “y” with “x”, we obtain what we wanted to prove:

\displaystyle \frac{d}{dx}e^x = e^x.

Derivative of the Inverse Trigonometric Functions

The inverse function theorem will also allow us to obtain the derivatives of all the inverse trigonometric functions. These are:

\begin{array}{ccccccc} \dfrac{d}{dx}\text{Arcsin}(x) &=& \dfrac{1}{\sqrt{1-x^2}} &\phantom{asd}&\dfrac{d}{dx}\text{Arccos}(x) &=& \dfrac{-1}{\sqrt{1-x^2}} \\ \\ \dfrac{d}{dx}\text{Arctan}(x) &=& \dfrac{1}{1+x^2} &\phantom{asd}&\dfrac{d}{dx}\text{Arccot}(x) &=& \dfrac{-1}{1+x^2} \\ \\ \dfrac{d}{dx}\text{Arcsec}(x) &=& \dfrac{1}{x\sqrt{x^2-1}} &\phantom{asd}&\dfrac{d}{dx}\text{Arccsc}(x) &=& \dfrac{-1}{x\sqrt{x^2-1}} \end{array}

PROOF

Arcsine

The function \sin(x) is bijective as long as we restrict its domain to a set of the form \displaystyle \left[\frac{-\pi}{2}+k\pi , \frac{\pi}{2}+ k\pi \right], with k any integer. Without loss of generality, it is possible to limit ourselves to the principal case, where k=0, so that the bijective sine function is of the form

\displaystyle\sin : \left[-\frac{\pi}{2}, \frac{\pi}{2}\right] \longrightarrow [-1,1]

and under these conditions it holds that

y=\sin(x) \longleftrightarrow x=\arcsin(y).

If we apply the inverse function theorem, we have:

\displaystyle \frac{d}{dy}\arcsin(y) = \frac{1}{\frac{d}{dx}\sin(x)} = \frac{1}{\cos(x)}

Now, let us recall the trigonometric identity

\sin^2(x) + \cos^2(x) = 1

from which it follows that, if x\in [-\pi/2, \pi/2], then

\cos(x) = \sqrt{1 - \sin^2(x)}

Then, if we substitute this into the derivative of the arcsine function we obtain

\displaystyle \frac{d}{dy}\arcsin(y) = \frac{1}{\cos(x)} = \frac{1}{ \sqrt{1 - \sin^2(x)}}

And since y=\sin(x)

\displaystyle \frac{d}{dy}\arcsin(y) = \frac{1}{ \sqrt{1 - y^2}}

Finally, substituting “y” with “x” in this last expression, we arrive at what we wanted to prove:

\displaystyle \color{blue}{\frac{d}{dx}\arcsin(x) = \frac{1}{ \sqrt{1 - x^2}}}

Arccosine

The function \cos(x) is bijective as long as we restrict its domain to a set of the form \left[0+k\pi , \pi+ k\pi \right], with k any integer. Without loss of generality, it is possible to limit ourselves to the principal case, where k=0, so that the bijective cosine function is of the form

\cos : \left[0, \pi\right] \longrightarrow [-1,1]

and under these conditions it holds that

y=\cos(x) \longleftrightarrow x=\arccos(y).

If we apply the inverse function theorem, we have:

\displaystyle \frac{d}{dy}\arccos(y) = \frac{1}{\frac{d}{dx}\cos(x)} = \frac{-1}{\sin(x)}

Now, let us recall the trigonometric identity

\sin^2(x) + \cos^2(x) = 1

from which it follows that, if x\in [0, \pi], then

\sin(x) = \sqrt{1 - \cos^2(x)}

Then, if we substitute this into the derivative of the arccosine function, we obtain

\displaystyle \frac{d}{dy}\arccos(y) = \frac{-1}{\sin(x)} = \frac{-1}{ \sqrt{1 - \cos^2(x)}}

And since y=\cos(x)

\displaystyle \frac{d}{dy}\arccos(y) = \frac{-1}{ \sqrt{1 - y^2}}

Finally, substituting “y” with “x” in this last expression, we arrive at what we wanted to prove:

\displaystyle \color{blue}{\frac{d}{dx}\arccos(x) = \frac{-1}{ \sqrt{1 - x^2}}}

Arctangent

The function \tan(x) is bijective as long as we restrict its domain to a set of the form \displaystyle \left]-\frac{\pi}{2}+k\pi , \frac{\pi}{2}+ k\pi \right[, with k any integer (the endpoints are excluded because the tangent is not defined there). Without loss of generality, it is possible to limit ourselves to the principal case, where k=0, so that the bijective tangent function is of the form

\displaystyle \tan : \left]-\frac{\pi}{2}, \frac{\pi}{2}\right[ \longrightarrow \mathbb{R}

and under these conditions it holds that

y=\tan(x) \longleftrightarrow x=\arctan(y).

If we apply the inverse function theorem, we have:

\displaystyle \frac{d}{dy}\arctan(y) = \frac{1}{\frac{d}{dx}\tan(x)} = \frac{1}{\sec^2(x)}

Now, let us recall the trigonometric identity

\sin^2(x) + \cos^2(x) = 1

from which it follows that

\sec^2(x) =1+\tan^2(x)

Then, if we substitute this into the derivative of the arctangent function, we obtain

\displaystyle \frac{d}{dy}\arctan(y) = \frac{1}{\sec^2(x)} = \frac{1}{ 1+\tan^2(x)}

And since y=\tan(x)

\displaystyle \frac{d}{dy}\arctan(y) = \frac{1}{1 + y^2}

Finally, substituting “y” with “x” in this last expression, we arrive at what we wanted to prove:

\displaystyle \color{blue}{\frac{d}{dx}\arctan(x) = \frac{1}{1+ x^2}}

Arccotangent

The function \cot(x) is bijective as long as we restrict its domain to a set of the form \left]0+k\pi , \pi+ k\pi \right[, with k any integer (the endpoints are excluded because the cotangent is not defined there). Without loss of generality, it is possible to limit ourselves to the principal case, where k=0, so that the bijective cotangent function is of the form

\cot : \left]0, \pi\right[ \longrightarrow \mathbb{R}

and under these conditions it holds that

y=\cot(x) \longleftrightarrow x=\text{arccot}(y).

If we apply the inverse function theorem, we have:

\displaystyle \frac{d}{dy}\text{arccot}(y) = \frac{1}{\frac{d}{dx}\cot(x)} = \frac{-1}{\csc^2(x)}

Now, let us recall the trigonometric identity

\sin^2(x) + \cos^2(x) = 1

from which it follows that

\csc^2(x) =1+\cot^2(x)

Then, if we substitute this into the derivative of the arccotangent function, we obtain

\displaystyle \frac{d}{dy}\text{arccot}(y) = \frac{-1}{\csc^2(x)} = \frac{-1}{ 1+\cot^2(x)}

And since y=\cot(x)

\displaystyle \frac{d}{dy}\text{arccot}(y) = \frac{-1}{1 + y^2}

Finally, substituting “y” with “x” in this last expression, we arrive at what we wanted to prove:

\displaystyle \color{blue}{\frac{d}{dx}\text{arccot}(x) = \frac{-1}{1+ x^2}}

Arcsecant

The function \sec(x) is bijective as long as we restrict its domain to a set of the form \displaystyle \left[0+k\pi , \pi+ k\pi \right]\setminus\left\{\frac{\pi}{2} + k\pi\right\}, with k any integer. Without loss of generality, we may limit ourselves to the principal case, where k=0, so that the bijective secant function is of the form

\sec : \left[0, \pi\right]\setminus\{\pi/2\} \longrightarrow \mathbb{R}\setminus]-1,1[

and under these conditions it holds that

y=\sec(x) \longleftrightarrow x=\text{arcsec}(y).

If we apply the inverse function theorem, we have:

\displaystyle \frac{d}{dy}\text{arcsec}(y) = \frac{1}{\frac{d}{dx}\sec(x)} = \frac{1}{\sec(x)\tan(x)}

Now, let us recall the trigonometric identity

\sin^2(x) + \cos^2(x) = 1

from which it follows that

\tan^2(x) =\sec^2(x)-1

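Then, if we substitute this into the derivative of the arcsecant function, taking the positive root \tan(x)=\sqrt{\sec^2(x)-1} (valid for x\in\,]0,\pi/2[, that is, for y>1), we obtain

\displaystyle \frac{d}{dy}\text{arcsec}(y) = \frac{1}{\sec(x)\tan(x)} = \frac{1}{\sec(x)\sqrt{\sec^2(x)-1}}

And since y=\sec(x)

\displaystyle \frac{d}{dy}\text{arcsec}(y) = \frac{1}{y\sqrt{y^2-1}}

Finally, substituting “y” with “x” in this last expression, we arrive at what we wanted to prove:

\displaystyle \color{blue}{\frac{d}{dx}\text{arcsec}(x) = \frac{1}{x\sqrt{x^2-1}}}

For arguments below -1 the tangent on the corresponding branch is negative, and the same computation gives the formula with |x| in place of the leading factor x.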

Arccosecant

The function \csc(x) is bijective as long as we restrict its domain to a set of the form \displaystyle \left[-\frac{\pi}{2}+k\pi , \frac{\pi}{2} + k\pi \right]\setminus\left\{0+k\pi\right\}, with k any integer. Without loss of generality, it is possible to limit ourselves to the principal case, where k=0, so that the bijective cosecant function is of the form

\displaystyle \csc : \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]\setminus\{0\} \longrightarrow \mathbb{R}\setminus]-1,1[

and under these conditions it holds that

y=\csc(x) \longleftrightarrow x=\text{arccsc}(y).

If we apply the inverse function theorem, we have:

\displaystyle \frac{d}{dy}\text{arccsc}(y) = \frac{1}{\frac{d}{dx}\csc(x)} = \frac{-1}{\csc(x)\cot(x)}

Now, let us recall the trigonometric identity

\sin^2(x) + \cos^2(x) = 1

from which it follows that

ctg^2(x) =\csc^2(x)-1

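Then, if we substitute this into the derivative of the arccosecant function, taking the positive root \cot(x)=\sqrt{\csc^2(x)-1} (valid for x\in\,]0,\pi/2[, that is, for y>1), we obtain

\displaystyle \frac{d}{dy}\text{arccsc}(y) = \frac{-1}{\csc(x)\cot(x)} = \frac{-1}{\csc(x)\sqrt{\csc^2(x)-1}}

And since y=\csc(x)

\displaystyle \frac{d}{dy}\text{arccsc}(y) = \frac{-1}{y\sqrt{y^2-1}}

Finally, substituting “y” with “x” in this last expression, we arrive at what we wanted to prove:

\displaystyle \color{blue}{\frac{d}{dx}\text{arccsc}(x) = \frac{-1}{x\sqrt{x^2-1}}}

For arguments below -1 the cotangent on the corresponding branch is negative, and the same computation gives the formula with |x| in place of the leading factor x.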
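The six formulas can be sanity-checked numerically. The sketch below is illustrative only: Python's `math` module has no arccot, arcsec, or arccsc, so those are built from `atan`, `acos`, and `asin` on the principal branches used in the proofs. These definitions, the helper `central_diff`, and the sample points are assumptions. The arcsecant and arccosecant are sampled at x = 2, where the leading factor x is positive.

```python
import math

def central_diff(func, x, h=1e-6):
    """Approximate func'(x) with a symmetric difference quotient (hypothetical helper)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Assumed principal branches, matching the domain restrictions in the proofs
def arccot(x):
    return math.pi / 2 - math.atan(x)  # values in ]0, pi[

def arcsec(x):
    return math.acos(1 / x)            # values in [0, pi] without pi/2

def arccsc(x):
    return math.asin(1 / x)            # values in [-pi/2, pi/2] without 0

# Each triple: (function, claimed derivative, sample point)
checks = [
    (math.asin, lambda x: 1 / math.sqrt(1 - x**2), 0.4),
    (math.acos, lambda x: -1 / math.sqrt(1 - x**2), 0.4),
    (math.atan, lambda x: 1 / (1 + x**2), 0.4),
    (arccot,    lambda x: -1 / (1 + x**2), 0.4),
    (arcsec,    lambda x: 1 / (x * math.sqrt(x**2 - 1)), 2.0),
    (arccsc,    lambda x: -1 / (x * math.sqrt(x**2 - 1)), 2.0),
]

errors = [abs(central_diff(func, x0) - formula(x0)) for func, formula, x0 in checks]
print(errors)  # all entries should be close to zero
```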

Implicit Differentiation

All the derivatives we have computed up to now have been carried out on functions that were defined explicitly: y=f(x). However, there are situations in which, based on the relationship between variables, it is either not easy to obtain the explicit expression of the function or such a task is simply not possible. For these types of cases, the technique of implicit differentiation is useful, and its foundations lie, once again, in the chain rule.

To understand this technique, examples are more valuable than proofs, so let us consider the relationship between the variables x and y given by the equation

x^3 +y^3- 9xy=0

If we graph this relationship, we will realize that it is not the graph of any function. It is the graph of a curve called the “folium of Descartes.”

[Figure: the folium of Descartes]

Now, if we wanted to compute, for example, the derivative of y with respect to x, we would face serious difficulties in finding an explicit expression f(x) that satisfies y=f(x) in order to differentiate afterwards. What we do instead is skip that step and implicitly assume that y is a function of x, that is: y=y(x). Doing so transforms the equation of the folium of Descartes into:

x^3 +y^3(x)- 9xy(x)=0

And we can consequently differentiate everything using the chain rule. If we do so, we arrive at the following result:

\begin{array}{rcl} \displaystyle 3x^{2} + 3\,y(x)^{2}\,\frac{dy}{dx} - \left(9\,y(x) + 9x\,\frac{dy}{dx}\right) &=& 0 \\ \\ \displaystyle 3x^{2} + 3\,y(x)^{2}\,\frac{dy}{dx} - 9\,y(x) - 9x\,\frac{dy}{dx} &=& 0 \\ \\ \displaystyle \frac{dy}{dx}\,\big(3\,y(x)^{2} - 9x\big) &=& 9\,y(x) - 3x^{2} \\ \\ \displaystyle \frac{dy}{dx} &=& \dfrac{9\,y(x) - 3x^{2}}{3\,y(x)^{2} - 9x} \\ \\ \displaystyle \color{blue}{\frac{dy}{dx}} &\color{blue}{=}& \color{blue}{\dfrac{3\,y(x) - x^{2}}{y(x)^{2} - 3x}} \end{array}

From this we can compute, if we know a point on the curve, the slope of the tangent line that passes through that point. For example, from the graph we can guess that the point (2,4) lies on the curve; and in fact, this is confirmed because 2^3 + 4^3 - 9\cdot 2\cdot 4 = 8+64 - 72 = 0. Knowing this, we can quickly say that the slope of the tangent line that passes through that point will be:

\displaystyle \color{blue}{\left.\frac{dy}{dx}\right|_{(2,4)}= \frac{3\cdot 4 - 2^2}{4^2 - 3\cdot 2}= \frac{8}{10}= \frac{4}{5}}
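The slope found above can be cross-checked without ever solving for y explicitly. The following sketch rests on assumptions not in the original text: the function names and the use of Newton's method to follow the branch of the curve near (2,4). It estimates dy/dx by a difference quotient and compares it with the implicit formula.

```python
def F(x, y):
    # Left-hand side of the equation of the curve
    return x**3 + y**3 - 9 * x * y

def slope(x, y):
    """dy/dx as obtained by implicit differentiation."""
    return (3 * y - x**2) / (y**2 - 3 * x)

def y_on_curve(x, y_guess, steps=50):
    """Follow the branch near y_guess with Newton's method in the variable y."""
    y = y_guess
    for _ in range(steps):
        y -= F(x, y) / (3 * y**2 - 9 * x)  # partial derivative of F in y
    return y

on_curve = F(2, 4)     # 0 means that (2, 4) lies on the curve
predicted = slope(2, 4)

h = 1e-5
numeric = (y_on_curve(2 + h, 4.0) - y_on_curve(2 - h, 4.0)) / (2 * h)

print(on_curve, predicted, numeric)
```

The approach works here because the denominator y^2 - 3x does not vanish at (2,4); at points where it does, the tangent is vertical and the formula for dy/dx breaks down.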

Derivatives of Rational Powers

By differentiating implicitly, it is possible to extend the reach of one of the basic differentiation techniques. This is the derivative of functions of the type f(x)=x^n, with n\in\mathbb{Z}. We can now move from considering integers to rational numbers and show without difficulty that

\displaystyle \frac{d}{dx}x^{p/q}= \frac{p}{q}x^{(p/q) -1}

where p,q\in\mathbb{Z} and q\neq 0.

To prove this, let y=x^{p/q} with x>0 (so that the logarithm is defined) and apply the natural logarithm to both sides to obtain:

\ln(y) = \displaystyle \frac{p}{q}\ln(x)

Now, differentiating this expression implicitly yields:

\displaystyle \frac{1}{y}\frac{dy}{dx} = \frac{p}{q}\frac{1}{x}

\displaystyle \color{blue}{\frac{dy}{dx} = \frac{p}{q}\frac{1}{x}y(x)= \frac{p}{q}\frac{1}{x}x^{p/q} = \frac{p}{q}x^{(p/q) - 1}}
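As a quick numerical illustration of the power rule for rational exponents, the sketch below compares the formula with a difference quotient. The exponents, the sample point `x0 = 2.0`, and the helper `central_diff` are arbitrary assumptions chosen for the example.

```python
def central_diff(func, x, h=1e-6):
    """Approximate func'(x) with a symmetric difference quotient (hypothetical helper)."""
    return (func(x + h) - func(x - h)) / (2 * h)

x0 = 2.0  # x > 0, so that ln(x) in the proof is defined
errors = []
for p, q in [(1, 2), (2, 3), (-3, 4), (7, 5)]:
    r = p / q
    numeric = central_diff(lambda x: x**r, x0)
    predicted = r * x0 ** (r - 1)  # (p/q) * x^((p/q) - 1)
    errors.append(abs(numeric - predicted))

print(errors)  # all entries should be close to zero
```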

Exercise Guide:

Chain Rule in One Variable

  1. Compute the derivatives of the following group of functions:
    a. f(x)=(x^2-3)^{12}
    b. f(x)=\displaystyle \left(\frac{4x^3 - x\cos(2x) - 1}{\sin(2x) + 2} \right)^5
    c. f(x)=\cos(1-x^2)
    d. f(x)=\tan(x\cos(3-x^2))
    e. f(x)=\displaystyle \frac{1}{(\sec(2x)-1)^{3/2}}
    f. f(x)=\displaystyle \frac{\tan(2x)}{1-\cot(2x)}
    g. f(x)=\displaystyle \ln\left(\frac{\tan(x)}{x^2+1}\right)
    h. f(x)=3^{\csc(4x)}
  2. Compute the derivatives of the following group of functions:
    a. f(x)=\displaystyle \frac{1}{\sqrt{x}\arctan\left(x^3\right)}
    b. f(x)=\displaystyle \frac{\text{arcsec}(x^2-x+2)}{\sqrt{x^2+1}}
    c. f(x)=x^x
    d. f(x)=\text{arccsc}\left(x^{\ln(x)}\right)
    e. f(x)=\ln\left(\arctan(e^x)\right)