Confusion about Einstein notation

In summary: in the Einstein convention, summation occurs over an index that appears once as an upper index and once as a lower index. I have some confusion about how this works for matrix-vector and matrix-matrix multiplication and for the transpose of a vector; as discussed below, the "transpose" is better thought of as the corresponding covector, and contraction of indices does not preserve all of the information in the original tensor.
  • #1
TimeRip496
In the Einstein summation convention, the summation occurs over an index that appears once as an upper index and once as a lower index. However, I have some points of confusion.

1) $${\displaystyle v=v^{i}e_{i}={\begin{bmatrix}e_{1}&e_{2}&\cdots &e_{n}\end{bmatrix}}{\begin{bmatrix}v^{1}\\v^{2}\\\vdots \\v^{n}\end{bmatrix}},\ \qquad w=w_{i}e^{i}={\begin{bmatrix}w_{1}&w_{2}&\cdots &w_{n}\end{bmatrix}}{\begin{bmatrix}e^{1}\\e^{2}\\\vdots \\e^{n}\end{bmatrix}}}$$
Won't each of the above give me a scalar? Yet most texts, including Wikipedia, seem to label the ##v## here as a vector. I understand the vector components are labelled ##v^i## and the coordinate basis vectors ##e_i##, or is the definition of a vector different in the Einstein convention?

In addition, how does the transpose of the above then work?
E.g. $$v^T=v^ie^i$$
Does it only change the coordinate basis but not the coefficients?

1a) For the transpose of a matrix, we just need to switch the two indices around. What about the transpose of a vector? Does it remain the same?

2) Inner product of vectors
To take the inner product of two vectors, I first need to convert one of them into a covector, right? In that case, the inner product of vectors should be expressed as
$$v\cdot u=v^iu_i=g_{ij}v^iu^j$$

3) For repeated indices on a 4th order tensor, can I rewrite it as a 2nd order tensor without losing its meaning?
E.g.
$$R^{\mu}_{\nu\mu\kappa}=R_{\nu\kappa}$$
Is the above equality valid? It doesn't seem correct to me, since the repeated index requires summation, and removing it removes the summation, so the right-hand side seems to contain less information.

4) For matrix-vector or matrix-matrix multiplication, can they only be done when the matching upper and lower indices from each tensor are side by side?
E.g. $$u_i=A^{j}_iv_j=v^jA_j^i$$
But this multiplication is not possible, right? $$u_i\neq A_{ij}v_j$$

5) As for derivatives, can the partial derivative tensor be arranged anywhere in the equation?
E.g. $$A_{ij}\partial_{\mu}\partial_{\nu}f(x)=\partial_{\nu}A_{ij}\partial_{\mu}f(x)$$
It should be possible based on the commutative property of partial derivatives, unless it is a covariant derivative.
 
  • #2
TimeRip496 said:
Won't each of the above give me a scalar?

It depends on what the ##e##'s are supposed to be. I don't know what textbooks or other sources you are looking at (Wikipedia is not the best place to learn about this stuff rigorously), but there are two notation conventions here that are easy to confuse.

Convention #1 says that a thing with upper indexes, like ##v^i##, is a vector, and a thing with lower indexes, like ##w_i##, is a covector (also called a 1-form). Then you can use the Einstein summation convention to form the scalar ##v^i w_i##. In fact, this can be taken as the definition of a covector: a linear mapping from vectors to scalars.
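
For example, written out in ##n## dimensions, that scalar is just
$$v^i w_i = \sum_{i=1}^{n} v^i w_i = v^1 w_1 + v^2 w_2 + \cdots + v^n w_n,$$
a single number, with no free indexes left over.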

Convention #2 says that you express a vector in components by multiplying each component by its matching basis vector. So you would write ##\vec{v} = v^i e_i##, where ##e_i## is the basis vector with index ##i##. However, calling this the Einstein summation convention is a bit of a misnomer, because the lower index on the basis vector ##e_i## does not mean it's a covector or a 1-form, and the sum is not a scalar, it's the vector ##\vec{v}##.

TimeRip496 said:
What about the transpose of a vector?

The idea of a "transpose" doesn't really apply to a vector. But in particular cases (basically, when you are working in a metric space), you can find a one-to-one correspondence between vectors and covectors, and then you can think of the covector ##v_i## that corresponds to the vector ##v^i## as being the "transpose" of a vector. However, this terminology has limited usefulness, since it really depends on thinking of vectors as columns and covectors as rows, with operators as matrices, and that representation gets problematic when you start working with more complicated vector spaces.
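
Concretely, that correspondence is the metric lowering (and its inverse raising) the index:
$$v_i = g_{ij}\,v^j, \qquad v^i = g^{ij}\,v_j,$$
where ##g^{ij}## is the inverse metric; the "row" of components ##v_i## is just this covector written out in a basis.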

TimeRip496 said:
To take the inner product of two vectors, I first need to convert one of them into a covector, right?

Yes, which means you can only do this if you have a metric, i.e., a correspondence between vectors and covectors. If you don't have a metric, then the concept of an inner product of two vectors has no meaning, nor does the concept of an inner product of two covectors. Only the inner product of a vector and a covector has meaning.
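
Explicitly, with a metric the inner product of two vectors reads
$$u \cdot v = g_{ij}\,u^i v^j = u^i v_i,$$
and in Euclidean space with Cartesian coordinates, where ##g_{ij} = \delta_{ij}##, this reduces to the familiar ##u^1 v^1 + u^2 v^2 + \cdots + u^n v^n##.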

TimeRip496 said:
For repeated indices on a 4th order tensor, can I rewrite it as a 2nd order tensor without losing its meaning?

This operation is called "contraction", and it does not preserve all of the information in the original tensor. But it is a valid operation when you have an upper and lower index on a tensor, yes.
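
For example, contracting the first and third indexes of the Riemann tensor gives the Ricci tensor:
$$R_{\nu\kappa} = R^{\mu}{}_{\nu\mu\kappa} = \sum_{\mu} R^{\mu}{}_{\nu\mu\kappa} .$$
The summation is still performed; it is just implied by the repeated index. The loss of information shows up in the component count: in 4 dimensions the Riemann tensor has 20 independent components, while the Ricci tensor has only 10.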

TimeRip496 said:
For matrix-vector or matrix-matrix multiplication, can they only be done when the matching upper and lower indices from each tensor are side by side?

Again, a proper understanding of this requires separating vectors and tensors from their representations as rows (or columns for covectors) and matrices. You don't multiply matrices and vectors or matrices and matrices. You combine vectors and tensors to form new vectors and tensors. For example, ##A^i_j v^j## is really two separate operations: first, combining the (1-1) tensor ##A^i_j## and the vector, or (1-0) tensor, ##v^k##, into the (2-1) tensor ##A^i_j v^k##, and then contracting the lower index with the second upper index. These operations have meaning even if the vectors and tensors do not have matrix representations.
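
Written out, the two steps are the outer (tensor) product followed by the contraction:
$$T^{i}{}_{j}{}^{k} \equiv A^{i}{}_{j}\,v^{k}, \qquad u^i \equiv T^{i}{}_{j}{}^{j} = \sum_{j} A^{i}{}_{j}\,v^{j},$$
and only after the contraction does the result coincide with what the matrix product of ##A## with the column of components of ##v## computes.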

Also, the matrix representation of a tensor is ambiguous: it doesn't really distinguish between a (2-0) tensor, a (1-1) tensor, and a (0-2) tensor. This is a key reason for keeping vectors and tensors separate conceptually from particular representations.

TimeRip496 said:
the partial derivative tensor

The partial derivative is not a tensor; it's an operator. More precisely, it's an operation that can be used to build various operators on vectors and tensors.
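
For example, acting on a scalar field the partial derivatives ##\partial_\mu f## do transform as the components of a covector (the gradient), but ##\partial_\mu V^\nu## of a vector field does not transform as a tensor under general coordinate changes. That is one reason the covariant derivative is introduced:
$$\nabla_\mu V^\nu = \partial_\mu V^\nu + \Gamma^{\nu}{}_{\mu\lambda}\,V^{\lambda} .$$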

TimeRip496 said:
E.g.
$$
A_{ij}\partial_{\mu}\partial_{\nu}f(x)=\partial_{\nu}A_{ij}\partial_{\mu}f(x)
$$

I don't know what this equation is supposed to mean.
 
  • #3
My 2 cents: I strongly suggest that for basis vectors, you put parentheses around the index,

[tex]
V = V^i e_{(i)}
[/tex]

to emphasize that the index ##i## of ##e_{(i)}## labels not components, but whole basis vectors.
 
  • #4
TimeRip496 said:
In the Einstein summation convention, the summation occurs over an index that appears once as an upper index and once as a lower index. However, I have some points of confusion.

1) $$v=v^{i}e_{i}$$

This is not a lot different from the "standard" notation for linear algebra, except the sum has been suppressed:
$$\mathbf{v} = \sum_{i=1}^{n} v^i \mathbf{e}_i = \sum_i v^i \mathbf{e}_i = v^i \mathbf{e}_i$$
 
  • #5
haushofer said:
My 2 cents: I strongly suggest that for basis vectors, you put parentheses around the index,

[tex]
V = V^i e_{(i)}
[/tex]

to emphasize that the index ##i## of ##e_{(i)}## labels not components, but whole basis vectors.

PeroK said:
This is not a lot different from the "standard" notation for linear algebra, except the sum has been suppressed:
$$\mathbf{v} = \sum_{i=1}^{n} v^i \mathbf{e}_i = \sum_i v^i \mathbf{e}_i = v^i \mathbf{e}_i$$

@PeroK 's post clarifies that [itex]v^i[/itex] is the [itex]i[/itex]th component and [itex]\mathbf{e}_i[/itex] is the [itex]i[/itex]th basis vector. Instead of boldface [which isn't so easy to do when writing], you could use the arrowhead notation.

$${\vec v} = v^i {\vec e}_i \qquad\mbox{implied summation}$$

Introducing greek abstract indices [not to be summed over, but the label of a "slot"] instead of the arrowheads,
$${v^\mu} = v^i {{ e}_i}^\mu \qquad \mbox{implied summation}$$

In column-vector form, for example,
[tex]
\begin{bmatrix}v^{1}\\v^{2}\\\vdots \\v^{n}\end{bmatrix}=
v^{1}\begin{bmatrix}1\\0\\\vdots \\0\end{bmatrix}+
v^{2}\begin{bmatrix}0\\1\\\vdots \\0\end{bmatrix}+
\cdots+
v^{n}\begin{bmatrix}0\\0\\\vdots \\1\end{bmatrix}
[/tex]
 
  • #6
robphy said:
Introducing greek abstract indices [not to be summed over, but the label of a "slot"] instead of the arrowheads

I don't think it's correct to "mix" slot notation and component notation like this. Where are you getting this from?
 
  • #7
PeterDonis said:
I don't think it's correct to "mix" slot notation and component notation like this. Where are you getting this from?

It's not ideal to have both (multiple) types of indices... especially for a novice... but sometimes it might be needed.

Here are sections from Penrose & Rindler's Spinors and Spacetime

From p. 93 in Vol. I, in Ch. 2 on the abstract index notation:
[attached image: excerpt from p. 93]

then at the top of the next page:
[attached image: excerpt from the following page]

Here's something from p. 81 showing an explicit summation symbol for a different combination of indices:
[attached image: excerpt from p. 81]
 

  • #8
robphy said:
Here are sections from Penrose & Rindler's Spinors and Spacetime

Thanks for the reference! I admit there are a lot of complexities in this subject that I am not expert on.
 
  • #9
PeterDonis said:
Thanks for the reference! I admit there are a lot of complexities in this subject that I am not expert on.

Although different folks might have the same general idea of what they want to say,
there is a wide variety of [possibly idiosyncratic] notations that they employ.
Unfortunately, sometimes the notation gets too condensed and too abstract (pun intended).
Many times I feel I need a translator to unpack the notation.
 
  • #10
PeterDonis said:
It depends on what the ##e##'s are supposed to be. [...]
Thanks a lot!
 

What is Einstein notation?

Einstein notation, also known as index notation or tensor notation, is a mathematical notation used to represent and manipulate multilinear functions in the context of tensor calculus. The summation convention was introduced by the physicist Albert Einstein, and the notation is commonly used in physics and engineering.

Why is Einstein notation used?

Einstein notation simplifies the representation and manipulation of multilinear functions, particularly in the context of tensor calculus. It allows for the concise expression of complex equations and helps to avoid repetitive calculations.

How does Einstein notation work?

In Einstein notation, repeated indices imply summation over all possible values of that index. This is known as the Einstein summation convention. This means that instead of writing out each individual term in a summation, we can simply use repeated indices to represent all the terms. Additionally, indices in Einstein notation are often used to indicate the type of tensor being represented (e.g. upper indices for contravariant tensors, lower indices for covariant tensors).
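
For a concrete illustration, NumPy's np.einsum function uses the same repeated-index rule for arrays. The sketch below is only a rough analogy (plain arrays carry no distinction between upper and lower indices), and the arrays A and v are arbitrary example data:

[code]
import numpy as np

# Arbitrary example data: a 3x3 array A and a 3-component array v.
A = np.arange(9.0).reshape(3, 3)
v = np.array([1.0, 2.0, 3.0])

# u_i = A_ij v_j : the repeated index j is summed over automatically.
u = np.einsum('ij,j->i', A, v)
assert np.allclose(u, A @ v)

# A repeated index on a single array sums along the diagonal: the trace A_ii.
trace = np.einsum('ii->', A)
assert np.isclose(trace, np.trace(A))

# Indices that are not repeated stay free: the outer product T_ijk = A_ij v_k.
T = np.einsum('ij,k->ijk', A, v)
print(T.shape)  # (3, 3, 3)
[/code]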

What are the benefits of using Einstein notation?

Using Einstein notation can greatly simplify the representation and manipulation of complex equations in tensor calculus. It also allows for a more compact and efficient notation, making it easier to identify patterns and symmetries in equations.

Are there any limitations to using Einstein notation?

While Einstein notation is useful for representing and manipulating multilinear functions, it may not be the most intuitive notation for those who are unfamiliar with tensor calculus. Additionally, it may not be the most efficient notation for certain types of calculations, such as matrix operations or vector calculus.
