Proof of Multivariable Chain Rule in higher dimensions

SpY] · Apr 9, 2011

Homework Statement

Let [tex]\textbf{F}: \textbf{R}^m \rightarrow \textbf{R}^n[/tex] and [tex]\textbf{G}: \textbf{R}^p \rightarrow \textbf{R}^m[/tex]

Prove that [tex]({\textbf{F} \circ \textbf{G}})'(\textbf{x}) = {\textbf{F}}'(\textbf{G}(\textbf{x})) {\textbf{G}}'(\textbf{x})[/tex]

Homework Equations

Assume the single-variable chain rule, that is, for differentiable
[tex]f, g: \textbf{R} \rightarrow \textbf{R}[/tex]

[tex]\frac {d(f \circ g)}{dt}(t) = \frac {df}{dt} \Big|_{g(t)} \frac {dg}{dt}(t)[/tex]

The Attempt at a Solution

My idea was to extend the single-variable result to [tex]\textbf{R}^2[/tex] first, as a sort of sub-proof that uses the mean value theorem:

Let [tex]f: \textbf{R}^2 \rightarrow \textbf{R}[/tex] and [tex]\textbf{G}: \textbf{R} \rightarrow \textbf{R}^2[/tex]

Then

[tex]f(\textbf{G}(t+h)) - f(\textbf{G}(t)) = f(G_1(t+h), G_2(t+h)) - f(G_1(t), G_2(t+h)) + f(G_1(t), G_2(t+h)) - f(G_1(t), G_2(t))[/tex]
The second and third terms cancel, so adding them changes nothing; I will use them below.

Then by the first mean value theorem,
[tex]\exists k_1, k_2 \in (0,h) [/tex] such that

[tex] G_1 (t+h) - G_1 (t) = h{G_1}'(t+k_1) [/tex]

[tex] G_2 (t+h) - G_2 (t) = h{G_2}'(t+k_2) [/tex]

Expanding the first two terms above by substituting for [tex]G_1(t+h)[/tex]:
[tex]f(G_1(t+h), G_2(t+h)) - f(G_1(t), G_2(t+h))[/tex]

[tex] = f(h{G_1}'(t+k_1) + G_1(t), G_2(t+h))- f(G_1(t), G_2(t+h))[/tex]

[tex] = h{G_1}'(t+k_1) \frac {\partial f}{\partial x_1} \Big|_{(p_1 + G_1(t), G_2(t+h))} [/tex]

where [tex]p_1[/tex] lies between [tex]0[/tex] and [tex]h{G_1}'(t+k_1)[/tex] (the mean value theorem again, applied to [tex]f[/tex] in its first argument).

Similarly for the next two terms, substituting for [tex]G_2(t+h)[/tex]:
[tex]f(G_1(t), G_2(t+h)) - f(G_1(t), G_2(t))[/tex]

[tex] = f(G_1(t), h{G_2}'(t+k_2) + G_2(t)) - f(G_1(t), G_2(t))[/tex]

[tex] = h{G_2}'(t+k_2) \frac {\partial f}{\partial x_2} \Big|_{(G_1(t), p_2 + G_2(t))} [/tex]

where [tex]p_2[/tex] lies between [tex]0[/tex] and [tex]h{G_2}'(t+k_2)[/tex].

Combining this all together and dividing by h:

[tex]\frac {f(\textbf{G}(t+h)) - f(\textbf{G}(t))}{h}[/tex]

[tex]= {G_1}'(t+k_1) \frac {\partial f}{\partial x_1} \Big|_{(p_1 + G_1(t), G_2(t+h))} + {G_2}'(t+k_2) \frac {\partial f}{\partial x_2} \Big|_{(G_1(t), p_2 + G_2(t))}[/tex]

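Now as [tex]h \rightarrow 0[/tex], [tex]k_1, k_2 \rightarrow 0[/tex] since they lie in [tex](0,h)[/tex], and [tex]p_1, p_2 \rightarrow 0[/tex] since they lie between [tex]0[/tex] and [tex]h{G_1}'(t+k_1)[/tex], [tex]h{G_2}'(t+k_2)[/tex] respectively. Assuming [tex]{G_1}'[/tex], [tex]{G_2}'[/tex] and the partial derivatives of [tex]f[/tex] are continuous, the LHS becomes the derivative of the composition:

[tex] {(f \circ \textbf{G})}'(t) =\lim_{h \to 0} \frac {f(\textbf{G}(t+h)) - f(\textbf{G}(t))}{h} [/tex]

[tex] = \lim_{h \to 0} \left[ {G_1}'(t+k_1) \frac {\partial f}{\partial x_1} \Big|_{(p_1 + G_1(t), G_2(t+h))} + {G_2}'(t+k_2) \frac {\partial f}{\partial x_2} \Big|_{(G_1(t), p_2 + G_2(t))} \right][/tex]

[tex] = {G_1}'(t) \frac {\partial f}{\partial x_1} \Big|_{\textbf{G}(t)} + {G_2}'(t) \frac {\partial f}{\partial x_2} \Big|_{\textbf{G}(t)}[/tex]

[tex]= {f}'(\textbf{G} (t)) { \textbf{G}}'(t) [/tex]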
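
As a quick sanity check of this formula, here is a worked example. The choices [tex]f(x_1, x_2) = x_1 x_2[/tex] and [tex]\textbf{G}(t) = (t^2, \sin t)[/tex] are purely illustrative and not part of the problem. Differentiating the composition directly:

[tex](f \circ \textbf{G})(t) = t^2 \sin t \quad \Rightarrow \quad {(f \circ \textbf{G})}'(t) = 2t \sin t + t^2 \cos t[/tex]

Using the formula instead, with [tex]\frac {\partial f}{\partial x_1} = x_2[/tex], [tex]\frac {\partial f}{\partial x_2} = x_1[/tex], [tex]{G_1}'(t) = 2t[/tex] and [tex]{G_2}'(t) = \cos t[/tex]:

[tex]{G_1}'(t) \frac {\partial f}{\partial x_1} \Big|_{\textbf{G}(t)} + {G_2}'(t) \frac {\partial f}{\partial x_2} \Big|_{\textbf{G}(t)} = 2t \cdot \sin t + \cos t \cdot t^2[/tex]

The two agree, as they should.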

I've tried generalizing this to any n, but it gets rather long, so I'm not sure how to put it concisely. After that, I don't know how to take it to the general proof (any m, n) as required.
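
One possible way to keep the general step short is to write the two-term split above as a telescoping sum. This is only a sketch under the same assumptions (continuous partial derivatives); the intermediate points [tex]\textbf{P}_j[/tex] and [tex]\xi_j[/tex] are notation introduced here, and the dimension is called [tex]m[/tex] to match the problem statement, so [tex]f: \textbf{R}^m \rightarrow \textbf{R}[/tex] and [tex]\textbf{G}: \textbf{R} \rightarrow \textbf{R}^m[/tex]. For [tex]j = 0, ..., m[/tex] let

[tex]\textbf{P}_j = (G_1(t), ..., G_j(t), G_{j+1}(t+h), ..., G_m(t+h))[/tex]

so that [tex]\textbf{P}_0 = \textbf{G}(t+h)[/tex] and [tex]\textbf{P}_m = \textbf{G}(t)[/tex]. Then

[tex]f(\textbf{G}(t+h)) - f(\textbf{G}(t)) = \sum_{j=1}^m \left[ f(\textbf{P}_{j-1}) - f(\textbf{P}_j) \right][/tex]

Since [tex]\textbf{P}_{j-1}[/tex] and [tex]\textbf{P}_j[/tex] differ only in the [tex]j[/tex]-th coordinate, the mean value theorem in that coordinate gives

[tex]f(\textbf{P}_{j-1}) - f(\textbf{P}_j) = \left( G_j(t+h) - G_j(t) \right) \frac {\partial f}{\partial x_j} \Big|_{\xi_j}[/tex]

for some [tex]\xi_j[/tex] on the segment between [tex]\textbf{P}_j[/tex] and [tex]\textbf{P}_{j-1}[/tex]. Dividing by [tex]h[/tex] and letting [tex]h \rightarrow 0[/tex], each [tex]\xi_j \rightarrow \textbf{G}(t)[/tex], so

[tex]{(f \circ \textbf{G})}'(t) = \sum_{j=1}^m \frac {\partial f}{\partial x_j} \Big|_{\textbf{G}(t)} {G_j}'(t)[/tex]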

Thanks

I like Serena · Apr 9, 2011

Rather than going into all the limit stuff, I think there is an easier way.

Let [tex]i \in \{1, ..., n\}, j \in \{1, ..., m\}, k \in \{1, ..., p\}[/tex].

First we need to have that:

[tex]\frac {\partial} {\partial x_k} f_i(g_1(x), ..., g_m(x)) = \sum_{j=1}^m \frac {\partial} {\partial x_j} f_i(g_1(x), ..., g_m(x)) \cdot \frac {\partial} {\partial x_k} g_j(x)[/tex]

Then we can apply the definitions and say:

[tex](F \circ G)'(x)
= \left( \frac {\partial} {\partial x_k} f_i(g_1(x), ..., g_m(x)) \right)
= \left( \sum_{j=1}^m \frac {\partial} {\partial x_j} f_i(g_1(x), ..., g_m(x)) \cdot \frac {\partial} {\partial x_k} g_j(x) \right)
= \left( \frac {\partial} {\partial x_j} f_i(g_1(x), ..., g_m(x)) \right) \left( \frac {\partial} {\partial x_k} g_j(x) \right)
= F'(G(x))G'(x)[/tex]
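
The step from the sum to the product of two matrices is just the definition of matrix multiplication, [tex](AB)_{ik} = \sum_{j=1}^m A_{ij} B_{jk}[/tex], where [tex]A_{ij}[/tex] is the partial derivative of [tex]f_i[/tex] with respect to its [tex]j[/tex]-th argument, evaluated at [tex]G(x)[/tex], and [tex]B_{jk} = \frac {\partial} {\partial x_k} g_j(x)[/tex]. As a small check with [tex]m = n = p = 2[/tex], take the illustrative functions (my own choice, not from the problem) [tex]F(y_1, y_2) = (y_1 y_2, y_1 + y_2)[/tex] and [tex]G(x_1, x_2) = (x_1^2, x_1 x_2)[/tex], so that [tex](F \circ G)(x) = (x_1^3 x_2, x_1^2 + x_1 x_2)[/tex]. Differentiating the composition directly:

[tex](F \circ G)'(x) = \begin{pmatrix} 3 x_1^2 x_2 & x_1^3 \\ 2 x_1 + x_2 & x_1 \end{pmatrix}[/tex]

and multiplying the two Jacobians:

[tex]F'(G(x)) G'(x) = \begin{pmatrix} x_1 x_2 & x_1^2 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 2 x_1 & 0 \\ x_2 & x_1 \end{pmatrix} = \begin{pmatrix} 3 x_1^2 x_2 & x_1^3 \\ 2 x_1 + x_2 & x_1 \end{pmatrix}[/tex]

Here [tex]i[/tex] labels the row, [tex]k[/tex] labels the column, and only [tex]j[/tex] is summed over.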

SpY] · Apr 10, 2011

Hmmm ok so let me get this straight: the [tex]i[/tex] refers to elements in [tex]f[/tex], [tex]j[/tex] to elements in [tex]g[/tex], and [tex]k[/tex] for partial derivatives in [tex]\frac {\partial} {\partial x_k}[/tex]? Where [tex]f: \textbf{R}^n \rightarrow \textbf{R}[/tex] and [tex]g: \textbf{R}^m \rightarrow \textbf{R}[/tex]? (just to be specific on domains here)

Then shouldn't your first line read

[tex]
\sum_{i=1}^n \frac {\partial} {\partial x_k} f_i(g_1(x), ..., g_m(x)) = \sum_{j=1}^m \sum_{i=1}^n \frac {\partial} {\partial x_k} f_i(g_1(x), ..., g_m(x)) \cdot \frac {\partial} {\partial x_k} g_j(x)
[/tex]

Because you need to sum over the components of [tex]f[/tex], otherwise [tex]f_i[/tex] is meaningless; you then end up with a double sum on the right (over f and g).

Also your first partial derivative on the right should be [tex]\frac {\partial} {\partial x_k}[/tex] not by [tex]x_j[/tex] otherwise it goes to [tex]\frac {\partial} {\partial x_m}[/tex] because of the sum, or should there be a [tex]\sum_{k=1}^p[/tex] somewhere?

I'm having trouble following your last line as well, because you expand the partial derivative using a sum, then just take the sum away keeping the same indices. Throughout you have the variables i, j, k without the sum in front, when you should be summing to n, m, p.

Thanks for the effort though. If a mentor or homework helper could give input it would be appreciated.
