Proof of Multivariable Chain Rule in higher dimensions

In summary, the thread tries to prove the multivariable chain rule starting from the single-variable chain rule. The key fact is that each partial derivative of a composition is a sum, over the intermediate variables, of a partial derivative of the outer function (evaluated at the inner function) times the corresponding partial derivative of the inner function. These sums are exactly the entries of the product of the two Jacobian matrices.
  • #1
SpY]

Homework Statement



Let [tex]\textbf{F}: \textbf{R}^m \rightarrow \textbf{R}^n[/tex] and [tex]\textbf{G}: \textbf{R}^p \rightarrow \textbf{R}^m[/tex]

Prove that [tex](\textbf{F} \circ \textbf{G})'(\textbf{x}) = \textbf{F}'(\textbf{G}(\textbf{x})) \, \textbf{G}'(\textbf{x})[/tex]

Homework Equations


Assume the single-variable chain rule; that is, for differentiable
[tex]f, g: \textbf{R} \rightarrow \textbf{R}[/tex]

[tex]\frac {d(f \circ g)}{dt}(t) = \frac {df}{dt} \big]_{g(t)} \frac {dg}{dt}(t)[/tex]


The Attempt at a Solution


I tried to use the single-variable result by first extending it to a function on [tex]\textbf{R}^2[/tex], as a sort of sub-proof that uses the mean value theorem:

Let [tex]f: \textbf{R}^2 \rightarrow \textbf{R}[/tex] and [tex]\textbf{G}: \textbf{R} \rightarrow \textbf{R}^2[/tex]

Then

[tex]f(\textbf{G}(t+h)) - f(\textbf{G}(t)) = f(G_1(t+h), G_2(t+h)) - f(G_1(t), G_2(t+h)) + f(G_1(t), G_2(t+h)) - f(G_1(t), G_2(t))[/tex]
The second and third terms cancel, so the sum is unchanged; they let me treat one coordinate at a time below.

Then by the first mean value theorem,
[tex]\exists k_1, k_2 \in (0,h) [/tex] such that

[tex] G_1 (t+h) - G_1 (t) = h{G_1}'(t+k_1) [/tex]

[tex] G_2 (t+h) - G_2 (t) = h{G_2}'(t+k_2) [/tex]

Expanding the first two terms by substituting for [tex]G_1(t+h)[/tex]:
[tex]f(G_1(t+h), G_2(t+h)) - f(G_1(t), G_2(t+h))[/tex]

[tex] = f(h{G_1}'(t+k_1) + G_1(t), G_2(t+h))- f(G_1(t), G_2(t+h))[/tex]

[tex] = h{G_1}'(t+k_1) \frac {\partial f}{\partial x_1} \big]_{(p_1 + G_1(t), G_2(t+h))} [/tex]

where [tex]p_1[/tex] lies between [tex]0[/tex] and [tex]h{G_1}'(t+k_1)[/tex], by the mean value theorem applied to [tex]f[/tex] in its first argument.

Similarly, for the last two terms, substituting for [tex]G_2(t+h)[/tex]:
[tex]f(G_1(t), G_2(t+h)) - f(G_1(t), G_2(t))[/tex]

[tex] = f(G_1(t), h{G_2}'(t+k_2) + G_2(t)) - f(G_1(t), G_2(t))[/tex]

[tex] = h{G_2}'(t+k_2) \frac {\partial f}{\partial x_2} \big]_{(G_1(t), p_2 + G_2(t))} [/tex]

where [tex]p_2[/tex] lies between [tex]0[/tex] and [tex]h{G_2}'(t+k_2)[/tex].

Combining this all together and dividing by h:

[tex]\frac {f(\textbf{G}(t+h)) - f(\textbf{G}(t))}{h}[/tex]

[tex]= {G_1}'(t+k_1) \frac {\partial f}{\partial x_1} \big]_{(p_1 + G_1(t), G_2(t+h))} + {G_2}'(t+k_2) \frac {\partial f}{\partial x_2} \big]_{(G_1(t), p_2 + G_2(t))}[/tex]

Now as [tex]h \rightarrow 0[/tex], [tex]k_1, k_2 \rightarrow 0[/tex] since they lie in [tex](0, h)[/tex], and [tex]p_1, p_2 \rightarrow 0[/tex] since each [tex]p_i[/tex] lies between [tex]0[/tex] and [tex]h{G_i}'(t+k_i)[/tex]. Assuming [tex]{G_1}', {G_2}'[/tex] and the partial derivatives of [tex]f[/tex] are continuous, each factor on the right converges to its value at [tex]t[/tex] or at [tex]\textbf{G}(t)[/tex]. The LHS becomes the derivative of the composition:

[tex] {(f \circ \textbf{G})}'(t) =\lim_{h \to 0} \frac {f(\textbf{G}(t+h)) - f(\textbf{G}(t))}{h} [/tex]

[tex] = {G_1}'(t) \frac {\partial f}{\partial x_1} \big]_{\textbf{G}(t)} + {G_2}'(t) \frac {\partial f}{\partial x_2} \big]_{\textbf{G}(t)}[/tex]

[tex]= {f}'(\textbf{G} (t)) { \textbf{G}}'(t) [/tex]
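
As a sanity check on this two-variable result (a numerical check, not a proof), one can compare a finite-difference derivative of [tex]f \circ \textbf{G}[/tex] with the chain-rule formula. The sample functions below, [tex]f(x,y) = x^2 y + \sin y[/tex] and [tex]\textbf{G}(t) = (\cos t, t^2)[/tex], are assumptions chosen only for illustration; they do not come from the problem.

```python
import math

# Hypothetical sample functions (not from the thread), chosen for illustration.
def G(t):
    """G: R -> R^2."""
    return (math.cos(t), t * t)

def dG(t):
    """Componentwise derivative (G_1'(t), G_2'(t))."""
    return (-math.sin(t), 2.0 * t)

def f(x, y):
    """f: R^2 -> R."""
    return x * x * y + math.sin(y)

def grad_f(x, y):
    """Partial derivatives (df/dx_1, df/dx_2) of f at (x, y)."""
    return (2.0 * x * y, x * x + math.cos(y))

def chain_rule_derivative(t):
    """G_1'(t) * df/dx_1 + G_2'(t) * df/dx_2, both partials evaluated at G(t)."""
    x, y = G(t)
    fx, fy = grad_f(x, y)
    g1, g2 = dG(t)
    return fx * g1 + fy * g2

def numeric_derivative(t, h=1e-6):
    """Central difference quotient of the composition f(G(t))."""
    return (f(*G(t + h)) - f(*G(t - h))) / (2.0 * h)

t0 = 0.7
analytic = chain_rule_derivative(t0)
numeric = numeric_derivative(t0)
print(analytic, numeric)  # the two values should agree to several digits
```

The check uses a central difference rather than the one-sided quotient in the argument above, purely because it is more accurate numerically.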

I've tried generalizing this to any n, but it gets rather long, so I'm not sure how to put it concisely. After that, I don't know how to extend it to the general proof (any m, n) as required.

Thanks
 
  • #2
Rather than going into all the limit stuff, I think there is an easier way.

Let [tex]i \in \{1, ..., n\}, j \in \{1, ..., m\}, k \in \{1, ..., p\}[/tex].

First we need to have that:

[tex]\frac {\partial} {\partial x_k} f_i(g_1(x), ..., g_m(x)) = \sum_{j=1}^m \frac {\partial} {\partial x_j} f_i(g_1(x), ..., g_m(x)) \cdot \frac {\partial} {\partial x_k} g_j(x)[/tex]
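
To make the index bookkeeping explicit (one reading of the notation, since the post does not spell it out): [tex]i[/tex] and [tex]k[/tex] are fixed and only [tex]j[/tex] is summed, and [tex]\frac {\partial} {\partial x_j}[/tex] on the right is taken to mean the derivative of [tex]f_i[/tex] with respect to its [tex]j[/tex]-th argument, evaluated at [tex]G(x) = (g_1(x), ..., g_m(x))[/tex]. Writing [tex]y_j[/tex] for that argument (a symbol introduced here for illustration), the case [tex]m = 2[/tex] reads

[tex]\frac{\partial (f_i \circ G)}{\partial x_k}(x) = \frac{\partial f_i}{\partial y_1}(G(x)) \, \frac{\partial g_1}{\partial x_k}(x) + \frac{\partial f_i}{\partial y_2}(G(x)) \, \frac{\partial g_2}{\partial x_k}(x)[/tex]

which is row [tex]i[/tex] of [tex]F'(G(x))[/tex] multiplied by column [tex]k[/tex] of [tex]G'(x)[/tex].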

Then we can apply the definitions and say:

[tex](F \circ G)'(x)
= \left( \frac {\partial} {\partial x_k} f_i(g_1(x), ..., g_m(x)) \right)
= \left( \sum_{j=1}^m \frac {\partial} {\partial x_j} f_i(g_1(x), ..., g_m(x)) \cdot \frac {\partial} {\partial x_k} g_j(x) \right)
= \left( \frac {\partial} {\partial x_j} f_i(g_1(x), ..., g_m(x)) \right) \left( \frac {\partial} {\partial x_k} g_j(x) \right)
= F'(G(x))G'(x)[/tex]
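
The matrix identity can also be checked numerically. The sketch below builds finite-difference Jacobians and compares [tex](F \circ G)'(x)[/tex] with the product [tex]F'(G(x))G'(x)[/tex]. The maps `F` and `G`, the helper names `jacobian` and `matmul`, and the test point are all assumptions made for illustration, with [tex]p = 3, m = 2, n = 2[/tex].

```python
import math

# Hypothetical sample maps (not from the thread): G: R^3 -> R^2, F: R^2 -> R^2,
# so p = 3, m = 2, n = 2 in the thread's notation.
def G(x):
    return [x[0] * x[1], math.sin(x[2]) + x[0]]

def F(y):
    return [y[0] ** 2 + y[1], math.exp(y[0]) * y[1]]

def jacobian(func, x, h=1e-6):
    """Central-difference Jacobian: entry [i][k] approximates d func_i / d x_k."""
    n_out = len(func(x))
    J = [[0.0] * len(x) for _ in range(n_out)]
    for k in range(len(x)):
        xp = list(x)
        xp[k] += h
        xm = list(x)
        xm[k] -= h
        fp, fm = func(xp), func(xm)
        for i in range(n_out):
            J[i][k] = (fp[i] - fm[i]) / (2.0 * h)
    return J

def matmul(A, B):
    """Matrix product: entry [i][k] is the sum over j of A[i][j] * B[j][k]."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))]
            for i in range(len(A))]

x0 = [0.3, -1.2, 0.5]
lhs = jacobian(lambda x: F(G(x)), x0)               # (F o G)'(x0), an n x p matrix
rhs = matmul(jacobian(F, G(x0)), jacobian(G, x0))   # F'(G(x0)) G'(x0)
max_diff = max(abs(lhs[i][k] - rhs[i][k])
               for i in range(len(lhs)) for k in range(len(lhs[0])))
print(max_diff)  # should be small, limited by finite-difference error
```

The sum over `j` inside `matmul` is the same sum over [tex]j[/tex] that appears in the entry-wise formula, which is why the matrix product drops the explicit summation sign.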
 
  • #3
Hmmm ok so let me get this straight: the [tex]i[/tex] refers to elements in [tex]f[/tex], [tex]j[/tex] to elements in [tex]g[/tex], and [tex]k[/tex] for partial derivatives in [tex]\frac {\partial} {\partial x_k}[/tex]? Where [tex]f: \textbf{R}^n \rightarrow \textbf{R}[/tex] and [tex]g: \textbf{R}^m \rightarrow \textbf{R}[/tex]? (just to be specific on domains here)

Then shouldn't your first line read

[tex]
\sum_{i=1}^n \frac {\partial} {\partial x_k} f_i(g_1(x), ..., g_m(x)) = \sum_{j=1}^m \sum_{i=1}^n \frac {\partial} {\partial x_k} f_i(g_1(x), ..., g_m(x)) \cdot \frac {\partial} {\partial x_k} g_j(x)
[/tex]

That is because you need to sum over the components of [tex]f[/tex], otherwise [tex]f_i[/tex] is meaningless; you then end up with a double sum on the right (over f and g).

Also, your first partial derivative on the right should be [tex]\frac {\partial} {\partial x_k}[/tex], not with respect to [tex]x_j[/tex]; otherwise it runs up to [tex]\frac {\partial} {\partial x_m}[/tex] because of the sum. Or should there be a [tex]\sum_{k=1}^p[/tex] somewhere?

I'm having trouble following your last line as well, because you expand the partial derivative using a sum, then just take the sum away keeping the same indices. Throughout you have the variables i, j, k without the sum in front, when you should be summing to n, m, p.

Thanks for the effort though. If a mentor or homework helper could give input it would be appreciated.
 

Related to Proof of Multivariable Chain Rule in higher dimensions

What is the multivariable chain rule in higher dimensions?

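The multivariable chain rule in higher dimensions gives the derivative of a composition of functions of several variables. For differentiable [tex]\textbf{G}: \textbf{R}^p \rightarrow \textbf{R}^m[/tex] and [tex]\textbf{F}: \textbf{R}^m \rightarrow \textbf{R}^n[/tex], the Jacobian matrix of [tex]\textbf{F} \circ \textbf{G}[/tex] at [tex]\textbf{x}[/tex] is the product of the Jacobian of [tex]\textbf{F}[/tex] at [tex]\textbf{G}(\textbf{x})[/tex] and the Jacobian of [tex]\textbf{G}[/tex] at [tex]\textbf{x}[/tex]. It extends the chain rule of single-variable calculus.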

How is the multivariable chain rule derived?

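The multivariable chain rule can be derived by applying the mean value theorem one coordinate at a time, as attempted in the thread above (this route needs continuous partial derivatives), or more generally from the definition of the derivative as a linear approximation. Either way, each partial derivative of the composition is a sum, over the intermediate variables, of products of partial derivatives, not a single product.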
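
A small worked instance may help here; the functions are chosen for illustration and are not from the thread. Take [tex]f(x, y) = x^2 y[/tex] with [tex]x = \cos t[/tex] and [tex]y = \sin t[/tex]. Then

[tex]\frac{d}{dt} f(x(t), y(t)) = \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt} = 2xy(-\sin t) + x^2 \cos t = -2 \cos t \sin^2 t + \cos^3 t[/tex]

which agrees with differentiating [tex]\cos^2 t \sin t[/tex] directly using the product rule.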

Why is the multivariable chain rule important?

The multivariable chain rule is important because it enables us to solve complex problems in multivariable calculus, such as finding the maximum or minimum values of a function with multiple variables. It is also used in physics, engineering, and other fields to model and analyze systems with multiple variables.

What are some applications of the multivariable chain rule?

The multivariable chain rule has various applications in mathematics, physics, and engineering. Some examples include optimization problems, finding tangent planes and normal lines to surfaces, and calculating the total differential of a multivariable function.

What are some tips for using the multivariable chain rule?

When using the multivariable chain rule, keep track of which variables depend on which, and of the order of the Jacobian factors, since matrix multiplication is not commutative and the matrix dimensions must match. It can also help to draw a dependency diagram of the variables. Practice and familiarity with the concept are key to mastering the multivariable chain rule.
