Is Covariant Derivative Notation Misleading in Vector Calculus?

stevendaryl · Apr 7, 2021

PeterDonis said:

But the notation ##\nabla_\mu V^\nu##, by itself, does not say whether you mean the particular number you get from making a particular choice of values for the indices, or the abstract object that is the (1, 1) tensor itself.

I thought you said that it always means components of a tensor, rather than the tensor itself. Once again, if it doesn't mean components, then how do you indicate the components of that tensor?

You, yourself, complained about that very ambiguity when you said, correctly, that physics notation doesn't make it clear whether you are talking about a vector or the components of a vector.

It's not an ambiguity if it always means components. Rather, it means one element of an indexed collection of objects.

But now you suddenly turn around and say that that notation always means the components? Why are you shifting your ground?

I'm not shifting my ground.

I understand quite well that your choice of index tells you perfectly clearly which one. My point is that it doesn't tell me which one--or most other physics readers. As I noted above, you complained before about physics notation not clearly distinguishing between vectors and their components; the notation you are using here fails to clearly distinguish between components and directional derivatives.

If there are indices, then you're always talking about one element of an indexed collection of objects. There is a directional derivative for each basis vector.

stevendaryl · Apr 7, 2021

PeterDonis said:

I addressed this a while back, and so did @Orodruin. See posts #13 and #21.

I looked at those posts, and they don't give an example of how the equivalence that I'm talking about fails in a non-coordinate basis.

I'm willing to be proved wrong. If ##T## is a (1,1) vector, then ##T(e_\mu)## is a vector. When is it the case that
##T^\nu_\mu \neq (T(e_\mu))^\nu##

PeterDonis · Apr 7, 2021

stevendaryl said:

thought you said that it always means components of a tensor, rather than the tensor itself.

Where did I say that?

PeterDonis · Apr 7, 2021

stevendaryl said:

If there are indices, then you're always talking about one element of an indexed collection of objects.

But if the indices are component indices, the objects in the indexed collection aren't basis vectors. They're components.

stevendaryl said:

There is a directional derivative for each basis vector.

Which doesn't help if the indexes aren't indexes that denote basis vectors.

PeterDonis · Apr 7, 2021

stevendaryl said:

If ##T## is a (1,1) ~~vector~~ tensor, then ##T(e_\mu)## is a vector.

See correction above. With the correction, I agree.

stevendaryl said:

When is it the case that
##T^\nu_\mu \neq (T(e_\mu))^\nu##

Take Schwarzschild spacetime in Schwarzschild coordinates. In the coordinate basis, we have ##e_0 = (1, 0, 0, 0)##. So for a general (1, 1) tensor ##T##, ##T ( e_0 )##, in matrix multiplication form, looks like:

$$
\begin{bmatrix}
T_{00} & T_{01} & T_{02} & T_{03} \\
T_{10} & T_{11} & T_{12} & T_{13} \\
T_{20} & T_{21} & T_{22} & T_{23} \\
T_{30} & T_{31} & T_{32} & T_{33}
\end{bmatrix}
\begin{bmatrix}
1 \\
0 \\
0 \\
0
\end{bmatrix}
=
\begin{bmatrix}
T_{00} \\
T_{10} \\
T_{20} \\
T_{30}
\end{bmatrix}
$$

Or, in the notation @Orodruin used in post #13, we have ##\left( e_0 \right)^\nu = \delta^\nu_0##, so ##\left[ T ( e_0 ) \right]^\nu = T_0{}^\nu##.

But in a non-coordinate, orthonormal basis, we have

$$
\hat{e}_0 = \left( \frac{1}{\sqrt{1 - 2M / r}}, 0, 0, 0 \right)
$$

So ##T ( \hat{e}_0 )## in matrix multiplication form now looks like this:

$$
\begin{bmatrix}
T_{00} & T_{01} & T_{02} & T_{03} \\
T_{10} & T_{11} & T_{12} & T_{13} \\
T_{20} & T_{21} & T_{22} & T_{23} \\
T_{30} & T_{31} & T_{32} & T_{33}
\end{bmatrix}
\begin{bmatrix}
\frac{1}{\sqrt{1 - 2M / r}} \\
0 \\
0 \\
0
\end{bmatrix}
= \frac{1}{\sqrt{1 - 2M / r}}
\begin{bmatrix}
T_{00} \\
T_{10} \\
T_{20} \\
T_{30}
\end{bmatrix}
$$

In other words, we have ##\left[ T ( \hat{e}_0 ) \right]^\nu \neq T_0{}^\nu##. The extra factor in ##\hat{e}_0## makes the two unequal.
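
As a quick numerical check of the two matrix products above, here is a minimal Python sketch. The entries of `T` and the values `M = 1`, `r = 4` are arbitrary illustrative assumptions, not components of any particular physical tensor.

```python
import math

def matvec(matrix, vector):
    """Multiply a 4x4 matrix (list of rows) by a length-4 vector."""
    return [sum(row[k] * vector[k] for k in range(4)) for row in matrix]

# Hypothetical component matrix of T in Schwarzschild coordinates (arbitrary numbers).
T = [[1.0, 2.0, 0.0, 0.0],
     [0.5, 3.0, 0.0, 0.0],
     [0.0, 0.0, 4.0, 0.0],
     [0.0, 0.0, 0.0, 5.0]]

M, r = 1.0, 4.0                          # assumed sample values with r > 2M
norm = 1.0 / math.sqrt(1.0 - 2.0 * M / r)

e0 = [1.0, 0.0, 0.0, 0.0]                # coordinate basis vector
e0_hat = [norm, 0.0, 0.0, 0.0]           # orthonormal basis vector, in coordinate components

T_e0 = matvec(T, e0)                     # picks out the first column of T
T_e0_hat = matvec(T, e0_hat)             # same column, scaled by 1/sqrt(1 - 2M/r)
ratio = T_e0_hat[0] / T_e0[0]            # equals the normalisation factor
```

Both results here are expressed in coordinate-basis components; only the inserted vector differs.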

Orodruin · Apr 7, 2021

PeterDonis said:

In other words, we have ##\left[ T ( \hat{e}_0 ) \right]^\nu \neq T_0{}^\nu##. The extra factor in ##\hat{e}_0## makes the two unequal.

This is true only if you are looking for the components of ##T## in the coordinate basis. If you instead wanted the components in the non-coordinate basis, then that is what you need to insert. It makes little sense to insert different bases unless you for some reason need to use mixed bases to express your tensor.

PeterDonis · Apr 8, 2021

Orodruin said:

If you instead wanted the components in the non-coordinate basis

In post #13, you said:

Orodruin said:

I will agree in the general case, but it really does not matter as long as we are dealing with holonomic bases. Since the components are ##(e_\mu)^\nu = \delta_\mu^\nu##, it is indeed the case that ##\nabla_{e_\nu} = \delta^\mu_\nu \nabla_\mu = \nabla_\nu##.

The non-coordinate basis is not holonomic. Are you now disagreeing with yourself?

PeterDonis · Apr 8, 2021

Orodruin said:

This is true only if you are looking for the components of T in the coordinate basis. If you instead wanted the components in the non-coordinate basis, then that is what you need to insert.

What you are calling "the components in the non-coordinate basis" are components in local inertial coordinates, where the coordinate basis vectors are orthonormal. But those coordinates are only local, and in them covariant derivatives are identical to partial derivatives so none of the issues discussed in this thread even arise.

Orodruin · Apr 8, 2021

PeterDonis said:

In post #13, you said:
The non-coordinate basis is not holonomic. Are you now disagreeing with yourself?

In that post I believe I assumed that the index was referring to the coordinate basis. If the index instead refers to a general basis it works regardless of the basis. All you really need is a basis of the tangent space ##e_a## and its dual ##e^a##. The components ##V^a## of the tangent vector ##V## are then defined through the relation ##V = V^a e_a## which also means that ##e^a(V) = e^a(V^b e_b) = V^b e^a(e_b) = V^b \delta^a_b = V^a## and so we can extract the components of ##V## by passing ##V## as the argument of ##e^a## (with the appropriate generalisation to any type of tensor). Whether ##e_a## is a coordinate basis or not is not relevant to this argument.
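
A minimal Python sketch of this component extraction, assuming a two-dimensional tangent space and a made-up basis that is deliberately neither orthogonal nor normalised. The names `e`, `dual` and `pair` are hypothetical helpers chosen for this illustration.

```python
from fractions import Fraction as F

# Hypothetical basis vectors, written in some fixed reference components.
e = [(F(2), F(0)),    # e_1
     (F(1), F(3))]    # e_2

# Dual basis: rows of the inverse of the matrix whose columns are e_1 and e_2,
# so that dual[a] applied to e[b] gives the Kronecker delta.
det = e[0][0] * e[1][1] - e[1][0] * e[0][1]
dual = [( e[1][1] / det, -e[1][0] / det),   # e^1
        (-e[0][1] / det,  e[0][0] / det)]   # e^2

def pair(covector, vector):
    """Apply a covector to a vector by summing over reference components."""
    return covector[0] * vector[0] + covector[1] * vector[1]

# Build V = V^a e_a from chosen components, then recover them as e^a(V).
V_components = (F(5), F(-2))
V = tuple(V_components[0] * e[0][i] + V_components[1] * e[1][i] for i in range(2))
recovered = tuple(pair(dual[a], V) for a in range(2))
```

Nothing in the sketch uses coordinates: only linear independence of the basis is needed for the dual basis to exist.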

PeterDonis said:

What you are calling "the components in the non-coordinate basis" are components in local inertial coordinates, where the coordinate basis vectors are orthonormal.

No, not necessarily. It is true for any basis, not only coordinate bases or even normalised or orthogonal bases (now, why you would pick such a basis is a different question). It is perfectly possible to find basis fields that are not the bases of any local coordinate system (e.g., by picking linearly independent but non-commutative fields as the basis fields).

Edit:
A good example of an orthonormal non-coordinate basis would be the normalised polar basis in ##\mathbb R^2##. We would have
$$
e_r = \partial_r, \qquad e_\theta = \frac 1r \partial_\theta
$$
leading to ##[e_r,e_\theta] = [\partial_r , (1/r) \partial_\theta] = -\frac{1}{r^2} \partial_\theta \neq 0##. The corresponding dual would be
$$
e^r = dr, \qquad e^\theta = r\, d\theta.
$$
Since ##e_r## and ##e_\theta## do not commute, they cannot be the tangent fields of local coordinate functions. In this case clearly also ##\nabla_{e_a} e_b \neq 0## in general.
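
Spelling out the commutator step on an arbitrary smooth test function ##f## (a symbol introduced here only for illustration), as a sketch of the intermediate algebra:

$$
[e_r, e_\theta] f = \partial_r \left( \frac{1}{r} \partial_\theta f \right) - \frac{1}{r} \partial_\theta \partial_r f = -\frac{1}{r^2} \partial_\theta f + \frac{1}{r} \left( \partial_r \partial_\theta f - \partial_\theta \partial_r f \right) = -\frac{1}{r^2} \partial_\theta f = -\frac{1}{r} e_\theta f
$$

The mixed partial derivatives cancel, so only the derivative of the ##1/r## normalisation factor survives.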

PeterDonis · Apr 8, 2021

Orodruin said:

It is perfectly possible to find basis fields that are not the bases of any local coordinate system (e.g., by picking linearly independent but non-commutative fields as the basis fields).

The orthonormal basis in Schwarzschild spacetime that I used is such a non-holonomic (i.e., the basis vector fields don't commute) basis. That was my point in using it.

Perhaps I should have explicitly included all of the basis vector fields, although they can be read off by inspection from the standard Schwarzschild line element so I had assumed it was clear which ones I was referring to:

$$
\hat{e}_0 = \frac{1}{\sqrt{1 - 2M / r}} \partial_t
$$

$$
\hat{e}_1 = \sqrt{1 - \frac{2M}{r}} \partial_r
$$

$$
\hat{e}_2 = \frac{1}{r} \partial_\theta
$$

$$
\hat{e}_3 = \frac{1}{r \sin \theta} \partial_\varphi
$$
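
As a sketch of why this basis is non-holonomic, one commutator can be worked out directly. Here ##f(r) = 1 - 2M/r## and the test scalar ##\phi## are shorthand introduced only for this calculation:

$$
[\hat{e}_0, \hat{e}_1] \phi = \frac{1}{\sqrt{f}} \partial_t \left( \sqrt{f} \, \partial_r \phi \right) - \sqrt{f} \, \partial_r \left( \frac{1}{\sqrt{f}} \partial_t \phi \right) = - \sqrt{f} \, \frac{d}{dr} \left( \frac{1}{\sqrt{f}} \right) \partial_t \phi = \frac{M}{r^2 f} \partial_t \phi
$$

so that ##[\hat{e}_0, \hat{e}_1] = \frac{M}{r^2 \sqrt{1 - 2M/r}} \, \hat{e}_0 \neq 0##.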

Orodruin · Apr 8, 2021

Indeed, and it has a corresponding dual basis ##\hat e^a## that together can be used to extract the components of any tensor in that basis. It is not restricted to local inertial coordinates.

PeterDonis · Apr 8, 2021

Orodruin said:

Since ##e_r## and ##e_\theta## do not commute, they cannot be the tangent fields of local coordinate functions.

Orodruin said:

Indeed, and it has a corresponding dual basis ##\hat e^a## that together can be used to extract the components of any tensor in that basis. It is not restricted to local inertial coordinates.

Hm. I think I see what was confusing me. The required commutation property is not that the basis vectors have to commute, but that the covariant derivative has to commute with contraction, since the contraction operation is what "extracting the components of the tensor" involves. AFAIK that commutation property is always true.

I'll elaborate (sorry if this is belaboring the obvious) by restating the original issue: we have a notation ##\nabla_\mu V^\nu## that can have at least two different meanings. Using Wald's abstract index notation, the two meanings are:

(1) ##\left[ \left( e_\mu \right)^a \nabla_a V^b \right]^\nu##, i.e., the ##\nu## component of the vector obtained by taking the directional derivative of the vector ##V## in the direction of the vector ##e_\mu##;

(2) ##\left( \nabla_a V^b \right)_\mu{}^\nu##, i.e., the ##\mu##, ##\nu## component of the (1, 1) tensor obtained by taking the covariant derivative of the vector ##V##.

Writing out the "taking the component" operations explicitly, we have:

(1) ##\left[ \left( e_\mu \right)^a \nabla_a V^b \right] \left[ \left( e^\nu \right)_b \right]##

(2) ##\left( \nabla_a V^b \right) \left[ \left( e_\mu \right)^a \left( e^\nu \right)_b \right]##

As long as the ##\nabla## operator commutes with contraction, these will be equal, since we just have to swap the contraction with ##e_\mu## and the ##\nabla## operation on ##V##.
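
To make the equality concrete, here is a sketch in a general (not necessarily coordinate) basis. The symbol ##\gamma^\nu{}_{\mu\lambda} := e^\nu ( \nabla_{e_\mu} e_\lambda )## for the connection coefficients is notation assumed here, and index-ordering conventions differ between texts. Expanding ##V = V^\lambda e_\lambda## and using the Leibniz rule:

$$
\left( e_\mu \right)^a \left( e^\nu \right)_b \nabla_a V^b = e^\nu \left( \nabla_{e_\mu} \left( V^\lambda e_\lambda \right) \right) = e_\mu ( V^\nu ) + \gamma^\nu{}_{\mu\lambda} V^\lambda
$$

In a coordinate basis ##e_\mu = \partial_\mu##, this reduces to the familiar ##\partial_\mu V^\nu + \Gamma^\nu{}_{\mu\lambda} V^\lambda##.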

PeterDonis · Apr 8, 2021

Orodruin said:

Since ##e_r## and ##e_\theta## do not commute, they cannot be the tangent fields of local coordinate functions.

This does raise another question. I understand that there is a 1-1 correspondence between tangent vectors and directional derivatives (MTW, for example, discusses this in some detail in a fairly early chapter). But doesn't that require that the tangent vectors be tangent fields of local coordinate functions? If so, how would we interpret directional derivatives in the direction of a vector that is part of a non-holonomic set of basis vector fields?

Orodruin · Apr 8, 2021

PeterDonis said:

But doesn't that require that the tangent vectors be tangent fields of local coordinate functions?

Not really. The tangent space is spanned by linear combinations of those tangent fields (the holonomic basis), but nothing stops you from introducing different linear combinations of those fields that also span the tangent space at each point yet do not form the holonomic basis of any set of local coordinates. For any given vector, you can of course find a coordinate system where it is the tangent of a local coordinate function.

PeterDonis said:

If so, how would we interpret directional derivatives in the direction of a vector that is part of a non-holonomic set of basis vector fields?

So, if I understand the question, you're asking how we should interpret something like ##e_a \phi##, where ##e_a## is a basis vector of some set of basis vectors on the tangent space (not necessarily holonomic). If we make things easier for ourselves and just consider the case where this vector is of the form ##f \partial_a##, where ##f## is some scalar function, then ##e_a \phi = f \partial_a \phi## would be the rate of change in ##\phi## if you go in the direction ##e_a##, which is going in the same direction as specified by the coordinate direction ##\partial_a##, but a factor ##f## faster.

To take a more concrete example, consider ##e_\theta## of the polar coordinates on ##\mathbb R^2##. While ##\partial_\theta \phi## represents the change in ##\phi## per change in the coordinate ##\theta##, ##e_\theta\phi## represents the change in ##\phi## per physical distance in the coordinate direction (since ##e_\theta## is normalised), but generally nothing stops you from defining any direction.
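
A small numerical sketch of that difference in Python, assuming the test field ##\phi = r \sin\theta## (i.e., the Cartesian ##y##) and an arbitrary sample point. The function names are hypothetical and the derivative is estimated by central finite differences.

```python
import math

def phi(r, theta):
    """Hypothetical test scalar field: phi = r sin(theta), i.e. the Cartesian y."""
    return r * math.sin(theta)

def d_theta(f, r, theta, h=1e-6):
    """Central finite-difference estimate of the coordinate derivative df/dtheta."""
    return (f(r, theta + h) - f(r, theta - h)) / (2.0 * h)

def e_theta(f, r, theta):
    """Normalised basis vector e_theta = (1/r) d/dtheta acting on f."""
    return d_theta(f, r, theta) / r

r0, theta0 = 3.0, 0.4                        # arbitrary sample point
coord_rate = d_theta(phi, r0, theta0)        # change per unit theta: r cos(theta)
physical_rate = e_theta(phi, r0, theta0)     # change per unit arc length: cos(theta)
```

The two rates differ by exactly the factor ##r##, which is the normalisation that makes ##e_\theta## a unit vector.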

It should also be noted that any single field can be made into a coordinate direction (just take the flow lines of that field and label them with ##n-1## other coordinates), but that a full set of basis fields cannot necessarily form a holonomic basis together.

vanhees71 · Apr 8, 2021

PeterDonis said:

Yes, that's correct; the common notation ##\nabla_\mu## is really a shorthand for saying that ##\nabla## is an operator that takes a (p, q) tensor and produces a (p, q + 1) tensor, i.e., it "adds" one lower index. As you note, it makes more sense to put the indexes on the entire expression ##\nabla V## instead of on the ##\nabla## and the ##V## separately.

But with such purism all the "magic" of the index calculus is lost. It's just convenient notation, and I don't think that it's very problematic.

vanhees71 · Apr 8, 2021

stevendaryl said:

That might be true. I think that the usual notation is pretty screwed up.
That doesn't make any sense to me. To me, ##\nabla V## is a 1,1 tensor, and ##\nabla_\mu V^\nu## is not a tensor at all, but a COMPONENT of a tensor.

To me, ##V^\nu## is not a vector, it is a component of a vector.

##\nabla V## is a (1,1) tensor and ##\nabla_{\mu} V^{\nu}## are the tensor components. From the context we discuss here it's the components with respect to the holonomic coordinate basis and dual basis, though some posters also seem to discuss non-holonomic bases, which of course also have their merits (particularly when using orthonormal tetrads).

cianfa72 · Apr 8, 2021

In this thread we talked about Wald's abstract index notation and Penrose's abstract index notation. Are they actually the same?

etotheipi said:

I think the point being made here is that the horizontal positioning is important [well, for any not totally-symmetric tensor] and of course cannot be ignored. In slot notation, the covariant derivative of a type ##(k,l)## tensor ##T## along a vector ##U## is $$\nabla_{U} T := \nabla T(\, \cdot \, , \dots, \, \cdot \, , U)$$ with the ##U## in the final slot.

Is it a usual convention to "add" the slot supposed to be filled with the vector field ##U## at the end of the tensor map ##\nabla T := \nabla T(\, \cdot \, , \dots, \, \cdot \, , \cdot \,)##?
In other words, in slot notation do we reference first the set of covector slots (instances of ##V^*##) and then the set of vector slots (instances of ##V##)?

vanhees71 · Apr 8, 2021

This notation I think I know from Misner, Thorne, Wheeler, Gravitation.

cianfa72 · Apr 8, 2021

vanhees71 said:

This notation I think I know from Misner, Thorne, Wheeler, Gravitation.

Does your statement apply to the following part of my previous post?

cianfa72 said:

Is it a usual convention to "add" the slot supposed to be filled with the vector field ##U## at the end of the tensor map ##\nabla T := \nabla T(\, \cdot \, , \dots, \, \cdot \, , \cdot \,)##?
In other words, in slot notation do we reference first the set of covector slots (instances of ##V^*##) and then the set of vector slots (instances of ##V##)?

vanhees71 · Apr 8, 2021

Yes! MTW is pretty good at explaining the abstract index-free notation to physicists.

stevendaryl · Apr 8, 2021

PeterDonis said:

Hm. I think I see what was confusing me. The required commutation property is not that the basis vectors have to commute, but that the covariant derivative has to commute with contraction, since the contraction operation is what "extracting the components of the tensor" involves. AFAIK that commutation property is always true.

I'll elaborate (sorry if this is belaboring the obvious) by restating the original issue: we have a notation ##\nabla_\mu V^\nu## that can have at least two different meanings. Using Wald's abstract index notation, the two meanings are:

(1) ##\left[ \left( e_\mu \right)^a \nabla_a V^b \right]^\nu##, i.e., the ##\nu## component of the vector obtained by taking the directional derivative of the vector ##V## in the direction of the vector ##e_\mu##;

(2) ##\left( \nabla_a V^b \right)_\mu{}^\nu##, i.e., the ##\mu##, ##\nu## component of the (1, 1) tensor obtained by taking the covariant derivative of the vector ##V##.

Writing out the "taking the component" operations explicitly, we have:

(1) ##\left[ \left( e_\mu \right)^a \nabla_a V^b \right] \left[ \left( e^\nu \right)_b \right]##

(2) ##\left( \nabla_a V^b \right) \left[ \left( e_\mu \right)^a \left( e^\nu \right)_b \right]##

As long as the ##\nabla## operator commutes with contraction, these will be equal, since we just have to swap the contraction with ##e_\mu## and the ##\nabla## operation on ##V##.

So they are always equal?

PeterDonis · Apr 8, 2021

vanhees71 said:

##\nabla V## is a (1,1) tensor and ##\nabla_{\mu} V^{\nu}## are the tensor components. From the context we discuss here it's the components with respect to the holonomic coordinate basis and dual basis, though some posters also seem to discuss non-holonomic bases

MTW sometimes uses the notation ##\nabla_{\hat{\mu}}{}^{\hat{\nu}}## to denote the components with respect to an orthonormal (non-coordinate) basis, to distinguish them from the components ##\nabla_\mu{}^\nu## with respect to a coordinate basis.

PeterDonis · Apr 8, 2021

cianfa72 said:

In other words, in slot notation do we reference first the set of covector slots (instances of ##V^*##) and then the set of vector slots (instances of ##V##)?

No. The slots can come in any order, and, strictly speaking, the order of the slots is part of the definition of a particular tensor, so, for example, there are really two possible kinds of (1, 1) tensors, a (lower index slot, upper index slot) tensor and an (upper index slot, lower index slot) tensor. In a space with a metric (which is all we ever deal with in GR), you can always raise and lower indexes, so this fine distinction doesn't really matter.

IIRC MTW puts the "covariant derivative" slot first, not last.

etotheipi · Apr 8, 2021

PeterDonis said:

In other words, we have ##\left[ T ( \hat{e}_0 ) \right]^\nu \neq T_0{}^\nu##. The extra factor in ##\hat{e}_0## makes the two unequal.

Although, wouldn't ##\left[ T ( \hat{e}_0 ) \right]^\nu = {\hat{T}_{0}}^{\nu}##, where I called ##{\hat{T}_{0}}^{\nu}## the components of ##T## in this non-coordinate basis? In other words I'm not so sure what this example is trying to show, because of course if you insert the ##e_{\mu}## basis vectors you get those 'un-hatted' components and if you insert the ##\hat{e}_{\mu}## you get those 'hatted' components, and we have no reason to believe these are equal anyway!

PeterDonis · Apr 8, 2021

etotheipi said:

wouldn't ##\left[ T ( \hat{e}_0 ) \right]^\nu = {\hat{T}_{0}}^{\nu}##, where I called ##{\hat{T}_{0}}^{\nu}## the components of ##T## in this non-coordinate basis?

Some sources put the hat on the tensor, others (such as MTW, as I mentioned before) put the hats on the component indexes, to indicate that the components are being taken with respect to a non-coordinate (usually orthonormal) basis.

etotheipi said:

I'm not so sure what this example is trying to show, because of course if you insert the ##e_{\mu}## basis vectors you get those 'un-hatted' components and if you insert the ##\hat{e}_{\mu}## you get those 'hatted' components, and we have no reason to believe these are equal anyway!

Yes, that was part of my original issue: if we interpret "un-hatted" indexes as components with respect to a coordinate basis, then equating ##\nabla_\mu V##, i.e., the ##\mu## component of the covariant derivative of ##V##, with ##\nabla_{e_\mu} V##, i.e., the directional derivative of ##V## in the ##e_\mu## direction, is only valid if ##e_\mu## is a coordinate basis vector, not a non-coordinate basis vector. But forming the directional derivative works fine for a non-coordinate basis vector, and extracting components works fine with respect to a non-coordinate basis (just insert the non-coordinate basis vector in the appropriate slot of the covariant derivative tensor). If the covariant derivative commutes with contraction, those two operations give the same result regardless of whether we are using a coordinate or non-coordinate basis. So the only real issue remaining is whether we should write ##\nabla_{\hat{\mu}}## (or ##\hat{\nabla}_\mu##) instead of just ##\nabla_\mu## when we are using a non-coordinate (usually orthonormal) basis.

vanhees71 · Apr 8, 2021

All these different conventions are really a nuisance. The only thing I don't like about MTW is precisely putting indicators on the indices and not on the symbol. For me that's almost confusing, because if ##\mu## and ##\hat{\mu}## run through the values 0...3 (or 1...4 ;-)) ##V^{\mu}## and ##V^{\hat{\mu}}## simply denote the same four numbers ##V^0\ldots V^3##.

One just has to be careful to find out what each author means with his notation not to get confused...

PeterDonis · Apr 8, 2021

stevendaryl said:

So they are always equal?

If the covariant derivative commutes with contraction, yes. The discussion in Chapter 3 of Carroll's online lecture notes [1] indicates that this property will always hold for the covariant derivative used in GR, although it strictly speaking does not have to hold for a general covariant derivative (one that only satisfies the linearity and Leibniz rule properties); the primary benefit of requiring the property to be true appears from his discussion to be that it means we can use the same connection coefficients for transforming tensors of any rank.

[1] https://ned.ipac.caltech.edu/level5/March01/Carroll3/Carroll3.html
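
A sketch of how that works in a coordinate basis, assuming in addition that ##\nabla## reduces to the partial derivative on scalars and acts on vectors as ##\nabla_\mu V^\lambda = \partial_\mu V^\lambda + \Gamma^\lambda{}_{\mu\rho} V^\rho## (the covector ##\omega_\lambda## is introduced here for illustration). Applying the Leibniz rule to the contracted scalar ##\omega_\lambda V^\lambda##:

$$
\partial_\mu ( \omega_\lambda V^\lambda ) = \nabla_\mu ( \omega_\lambda V^\lambda ) = ( \nabla_\mu \omega_\lambda ) V^\lambda + \omega_\lambda \left( \partial_\mu V^\lambda + \Gamma^\lambda{}_{\mu\rho} V^\rho \right)
$$

Since ##V## is arbitrary, comparing with ##\partial_\mu ( \omega_\lambda V^\lambda ) = ( \partial_\mu \omega_\lambda ) V^\lambda + \omega_\lambda \partial_\mu V^\lambda## gives

$$
\nabla_\mu \omega_\nu = \partial_\mu \omega_\nu - \Gamma^\lambda{}_{\mu\nu} \omega_\lambda
$$

so the same coefficients ##\Gamma## that act on vectors also fix the derivative of covectors.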

PeterDonis · Apr 8, 2021

vanhees71 said:

The only thing I don't like about MTW is precisely putting indicators on the indices and not on the symbol.

But the symbol symbolizes the geometric object itself, which is independent of any choice of coordinates; it's the indices that (at least with MTW's convention--Wald's abstract index convention is different) are supposed to convey information about the choice of coordinates. So putting the hats on the indices makes sense given MTW's general approach. Putting the hat on the symbol itself would imply that something about the geometric object changes when you change coordinates.

PeterDonis · Apr 8, 2021

vanhees71 said:

if ##\mu## and ##\hat{\mu}## run through the values 0...3 (or 1...4 ;-)) ##V^{\mu}## and ##V^{\hat{\mu}}## simply denote the same four numbers ##V^0\ldots V^3##

The components aren't numerically the same, since they are components taken with respect to two different choices of basis. The index numbers are the same, but that's just because we number them by dimensions without taking into account anything about the particular coordinate choice. But if we were to designate indexes by coordinate instead of by index number, we would have, for example, ##V^t##, ##V^r##, ##V^\theta##, and ##V^\phi## for a coordinate basis as compared with ##V^T##, ##V^X##, ##V^Y##, and ##V^Z## for an orthonormal basis.
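
A minimal Python sketch of that numerical difference for the Schwarzschild orthonormal basis used earlier in the thread. The sample point, the component values in `V_coord`, and the ##(-,+,+,+)## signature are assumptions made for this illustration.

```python
import math

M, r, theta = 1.0, 4.0, math.pi / 3          # assumed sample point with r > 2M
f = 1.0 - 2.0 * M / r

# Hypothetical coordinate-basis components (V^t, V^r, V^theta, V^phi).
V_coord = [2.0, 0.5, 0.1, 0.2]

# Orthonormal-basis components: each picks up the inverse of the factor that
# multiplies the corresponding coordinate vector in the hatted basis.
scale = [math.sqrt(f), 1.0 / math.sqrt(f), r, r * math.sin(theta)]
V_hat = [s * v for s, v in zip(scale, V_coord)]

# Squared norm computed two ways: Schwarzschild metric on coordinate components,
# Minkowski metric diag(-1, 1, 1, 1) on orthonormal components.
g_diag = [-f, 1.0 / f, r**2, (r * math.sin(theta))**2]
norm_coord = sum(g * v * v for g, v in zip(g_diag, V_coord))
norm_hat = -V_hat[0]**2 + V_hat[1]**2 + V_hat[2]**2 + V_hat[3]**2
```

The four numbers in `V_hat` differ from those in `V_coord`, while the invariant squared norm agrees between the two descriptions.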

dextercioby · Apr 8, 2021

I cannot believe there are 100 posts here about a simple pure mathematics issue (albeit with application in GR).

##\nabla## is a linear operator called covariant derivative which can be applied to any ##(n,l)## tensor to bring it to an ##(n,l+1)## tensor. It generalizes what in (pseudo-) Riemannian manifolds is called "affine connection".

In mathematics ##\nabla_{\mu}V^{\nu}## is ill defined (and from your long debate, it's certainly controversial in physics (!)). However, in order to connect the mathematical definition of a covariant derivative of a tensor with physics (in the GR tensor approach à la Einstein/Levi-Civita/Hilbert/Weyl, i.e. the old-school GR), we can define this dubious object appearing in physics texts as:

$$\nabla_{\mu}V^{\nu} := \left(\nabla V\right)_{\mu}^{~\nu}, \tag{1} $$

that is, the LHS are the components of the (1,1) tensor ##\nabla V## (with ##V## being a (1,0) tensor a.k.a. vector) in the basis ##dx^{\mu}\otimes \partial_{\nu}## (a coordinate basis in the tangent space of each point of the curved manifold).

So regarding the original "bracketing" issue brought up by @stevendaryl and its ensuing discussion, ##(1)## is the only reasonable definition of that object which appears in the "old-school" GR works. Modern (after 1960) GR uses the so-called "abstract index notation" which is meant to make more sense when analyzed from a pure math perspective. However, when the so-called "abstract tensors" are projected onto bases of directional derivatives and differential one-forms, the dubious objects of the LHS of ##(1)## appear again.

P.S. The directional derivative of a vector ##Y## along a vector ##X## is a vector. In formulas:

$$\nabla_{X} Y =: \nabla_{X^{\mu} \partial_{\mu}} \left(Y^{\nu}\partial_{\nu}\right) = \quad ... \quad =\\
\left[X^{\mu} \left(\partial_{\mu}Y^{\nu} +\Gamma^{\nu}_{~\mu\lambda} Y^{\lambda}\right)\right] \partial_{\nu} \tag{2} $$

In the round brackets of ##(2)## one recognizes the object defined in ##(1)##.
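
A numerical sketch of the round-bracket expression in ##(2)##, assuming flat ##\mathbb R^2## in polar coordinates and taking ##Y## to be the constant Cartesian field ##\partial_x## (both choices are illustrative, not from the post). Partial derivatives are estimated by central finite differences; all function names are hypothetical.

```python
import math

def Y(r, theta):
    """Polar coordinate-basis components (Y^r, Y^theta) of the constant field d/dx."""
    return (math.cos(theta), -math.sin(theta) / r)

def christoffel(r):
    """Christoffel symbols of flat R^2 in polar coordinates, indexed G[nu][mu][lam]."""
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -r          # Gamma^r_{theta theta}
    G[1][0][1] = 1.0 / r     # Gamma^theta_{r theta}
    G[1][1][0] = 1.0 / r     # Gamma^theta_{theta r}
    return G

def partial(mu, nu, r, theta, h=1e-6):
    """Central-difference estimate of partial_mu Y^nu (index 0 is r, index 1 is theta)."""
    dr, dth = (h, 0.0) if mu == 0 else (0.0, h)
    return (Y(r + dr, theta + dth)[nu] - Y(r - dr, theta - dth)[nu]) / (2.0 * h)

def cov_deriv(mu, nu, r, theta):
    """Component partial_mu Y^nu + Gamma^nu_{mu lam} Y^lam of the (1,1) tensor."""
    G, y = christoffel(r), Y(r, theta)
    return partial(mu, nu, r, theta) + sum(G[nu][mu][lam] * y[lam] for lam in range(2))

r0, theta0 = 2.0, 0.7        # arbitrary sample point
nabla_Y = [[cov_deriv(mu, nu, r0, theta0) for nu in range(2)] for mu in range(2)]
partials = [[partial(mu, nu, r0, theta0) for nu in range(2)] for mu in range(2)]
```

For this field the partial derivatives of the components are nonzero, but every entry of `nabla_Y` vanishes to within finite-difference error, as expected for a field that is constant in Cartesian terms.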

PeterDonis · Apr 8, 2021

PeterDonis said:

others (such as MTW, as I mentioned before) put the hats on the component indexes, to indicate that the components are being taken with respect to a non-coordinate (usually orthonormal) basis.

Actually, looking at MTW again, I think their "hat" convention actually has nothing to do with a non-coordinate basis; I think it has to do with components with respect to a local inertial coordinate chart, as opposed to components with respect to a general coordinate chart. As far as I can tell, MTW don't write equations in component form at all unless they are using a coordinate basis.

vanhees71 · Apr 9, 2021

PeterDonis said:

But the symbol symbolizes the geometric object itself, which is independent of any choice of coordinates; it's the indices that (at least with MTW's convention--Wald's abstract index convention is different) are supposed to convey information about the choice of coordinates. So putting the hats on the indices makes sense given MTW's general approach. Putting the hat on the symbol itself would imply that something about the geometric object changes when you change coordinates.

That again depends on how you define your symbols. For me a tensor (e.g., a vector ##V##) is an invariant object (under transformations under consideration, i.e., rotations in 3D Euclidean vector spaces, Lorentz transformations in 4D Lorentzian vector spaces, general linear transformations for a general vector space, general diffeomorphisms in all kinds of differentiable manifolds,...). Then there are the components wrt. a basis. Then the index notation comes into play, and we write e.g., for a vector (field) on a manifold wrt. the holonomic basis of some coordinates, ##V=V^{\mu} \partial_{\mu}##. For me the indices just count from 0 to 3 (in GR), no matter which ornaments I put on them. That's why I have to write the components in another holonomic basis derived from the "new" coordinates ##\bar{x}^{\mu}## as ##\bar{V}^{\mu}##. Then of course you have ##V=V^{\mu} \partial_{\mu}=\bar{V}^{\mu} \bar{\partial}_{\mu}##.
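
Spelled out, a sketch of what that last equality implies for the components, using only the chain rule ##\partial_\nu = \frac{\partial \bar{x}^\mu}{\partial x^\nu} \bar{\partial}_\mu##:

$$
V = V^\nu \partial_\nu = V^\nu \frac{\partial \bar{x}^\mu}{\partial x^\nu} \bar{\partial}_\mu \quad \Longrightarrow \quad \bar{V}^\mu = \frac{\partial \bar{x}^\mu}{\partial x^\nu} V^\nu
$$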

It's of course possible to do it like MTW and put some ornaments on the indices to indicate wrt. which basis vectors you decompose your tensors. It's just a matter of staying consistent, but I think a notation should also be as "error resistant" as possible, and in my experience this notation is less "safe" than the other.

vanhees71 · Apr 9, 2021

dextercioby said:

I cannot believe there are 100 posts here about a simple pure mathematics issue (albeit with application in GR).

##\nabla## is a linear operator called covariant derivative which can be applied to any ##(n,l)## tensor to bring it to an ##(n,l+1)## tensor. It generalizes what in (pseudo-) Riemannian manifolds is called "affine connection".

In mathematics ##\nabla_{\mu}V^{\nu}## is ill defined (and from your long debate, it's certainly controversial in physics (!)). However, in order to connect the mathematical definition of a covariant derivative of a tensor with physics (in the GR tensor approach à la Einstein/Levi-Civita/Hilbert/Weyl, i.e. the old-school GR), we can define this dubious object appearing in physics texts as:

$$\nabla_{\mu}V^{\nu} := \left(\nabla V\right)_{\mu}^{~\nu}, \tag{1} $$

that is, the LHS are the components of the (1,1) tensor ##\nabla V## (with ##V## being a (1,0) tensor a.k.a. vector) in the basis ##dx^{\mu}\otimes \partial_{\nu}## (a coordinate basis in the tangent space of each point of the curved manifold).

So regarding the original "bracketing" issue brought up by @stevendaryl and its ensuing discussion, ##(1)## is the only reasonable definition of that object which appears in the "old-school" GR works. Modern (after 1960) GR uses the so-called "abstract index notation" which is meant to make more sense when analyzed from a pure math perspective. However, when the so-called "abstract tensors" are projected onto bases of directional derivatives and differential one-forms, the dubious objects of the LHS of ##(1)## appear again.

P.S. The directional derivative of a vector ##Y## along a vector ##X## is a vector. In formulas:

$$\nabla_{X} Y =: \nabla_{X^{\mu} \partial_{\mu}} \left(Y^{\nu}\partial_{\nu}\right) = \quad ... \quad =\\
\left[X^{\mu} \left(\partial_{\mu}Y^{\nu} +\Gamma^{\nu}_{~\mu\lambda} Y^{\lambda}\right)\right] \partial_{\nu} \tag{2} $$

In the round brackets of ##(2)## one recognizes the object defined in ##(1)##.

If you really want to be safe you also have to obey the horizontal order of indices!

I don't see the advantage of complicating the notation by writing ##{(\nabla V)_{\mu}}^{\nu}## instead of the usual notation ##\nabla_{\mu} V^{\nu}##, which is simply the same thing. Mathematicians are sometimes at a disadvantage when it comes to practical calculations in comparison to physicists, who tend to write things in a way that facilitates such calculations. I took a lot of pure-math lectures, and usually when it came to just calculating something my notation was way easier and quicker than the mathematicians'. My disadvantage was that I had to translate my sloppy physicist's notation to the more rigorous but cumbersome notation of the mathematicians. There's no free lunch! ;-)).

vanhees71 · Apr 9, 2021

PeterDonis said:

If the covariant derivative commutes with contraction, yes. The discussion in Chapter 3 of Carroll's online lecture notes [1] indicates that this property will always hold for the covariant derivative used in GR, although it strictly speaking does not have to hold for a general covariant derivative (one that only satisfies the linearity and Leibniz rule properties); the primary benefit of requiring the property to be true appears from his discussion to be that it means we can use the same connection coefficients for transforming tensors of any rank.

[1] https://ned.ipac.caltech.edu/level5/March01/Carroll3/Carroll3.html

Interesting. I thought it's usually put in the definition of a derivation operation on vector spaces that it should commute with contraction. Otherwise you could have a different connection on the co-vector space than on the vector space, but that's a bit confusing then. It's easier to have the commutability between contraction and taking some kind of derivative. Then the definition of a specific kind of covariant derivative on the vector space implies a unique one on the co-vector space and vice versa. If you need different kinds of derivatives you can just introduce them. In GR you have not only the usual covariant derivatives (implying parallel transport) but also, e.g., Lie derivatives (implying Lie transport), etc.

It's amazing how much variety there is in the definitions of standard mathematical objects in the literature. Fortunately in standard GR everything is pretty unique, i.e., the usual connection is the unique torsion-free, metric-compatible connection on a Lorentzian manifold ;-)).

stevendaryl · Apr 9, 2021

vanhees71 said:

I don't see the advantage of complicating the notation by writing ##{(\nabla V)_{\mu}}^{\nu}## instead of the usual notation ##\nabla_{\mu} V^{\nu}##, which is simply the same thing.

My point is that it is ambiguous: Are you operating on a vector ##V## and then taking component ##\nu## of the result, or are you operating on the component ##V^\nu##?
