Understanding Minkowski Space Metrics: The Sign Reversal Mystery Explained

RCopernicus · Oct 14, 2014

I've never seen a satisfactory explanation of the metrics used in a calculation of distance in Minkowski space. In Euclidean space, the distance is:
ds^2 = dx^2 + dy^2 + dz^2
But in Minkowski space, the distance is
ds^2 = (dt * c)^2 - dx^2 - dy^2 - dz^2
Why are the signs reversed? This implies that space (or time depending on your convention) is imaginary.

Dale · Oct 14, 2014

That is one way to look at it, but it faded into disuse quite some time ago. Now, the usual approach is not to consider the time coordinate to be imaginary, but to consider the minus sign to be in the metric. So (in units where c=1):

##ds^2 = g_{\mu\nu} dx^{\mu} dx^{\nu} = -dt^2 + dx^2 + dy^2 + dz^2##

This can, as you suggested, be achieved by ##dx = (i~dt,dx,dy,dz)## and

##g = \left(
\begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\end{array}
\right)##

But it can also be achieved by ##dx = (dt,dx,dy,dz)## and

##g = \left(
\begin{array}{cccc}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\end{array}
\right)##

The usual modern approach is the latter.
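As a quick numerical check that the two bookkeeping choices above agree, here is a minimal Python sketch. The helper name `quadratic_form` and the sample displacement values are illustrative assumptions, not part of the original post; units are chosen so that c = 1.

```python
# Compare "imaginary time + identity metric" with "real time + diag(-1,1,1,1)".
# Everything here (names, sample values) is illustrative, with c = 1.

def quadratic_form(g, dx):
    """Return the double sum of g[m][n] * dx[m] * dx[n] over m and n."""
    return sum(g[m][n] * dx[m] * dx[n] for m in range(4) for n in range(4))

# Identity matrix: the metric used with the old-style imaginary time coordinate.
IDENTITY = [[1 if m == n else 0 for n in range(4)] for m in range(4)]

# Same matrix with the time-time entry flipped: diag(-1, 1, 1, 1).
MINKOWSKI = [row[:] for row in IDENTITY]
MINKOWSKI[0][0] = -1

# An arbitrary sample displacement (dt, dx, dy, dz).
dt, dx, dy, dz = 2.0, 1.0, 0.5, 0.25

old_style = quadratic_form(IDENTITY, [1j * dt, dx, dy, dz])  # complex arithmetic
modern = quadratic_form(MINKOWSKI, [dt, dx, dy, dz])         # real arithmetic

print(old_style)  # complex number with zero imaginary part
print(modern)     # the same value as a plain float
```

Both routes give the same number; the second one simply never needs complex arithmetic.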

Fredrik · Oct 14, 2014

The ##dx^\mu## notation is from differential geometry. In the context of SR, we can talk about matrices instead. The Euclidean inner product (i.e. the dot product) on the space of 4×1 matrices is given by ##\langle x,y\rangle=x^Ty##. If you insist on using this formula in SR, you have to make some components of x and y imaginary. A nicer way is to modify the definition to ##\langle x,y\rangle =x^Tg y##, where g is defined in DaleSpam's post.

Nugatory · Oct 14, 2014

RCopernicus said:

Why are the signs reversed? This implies that space (or time depending on your convention) is imaginary.

The different sign on the ##t## coordinate means that the Minkowski metric describes a space-time in which the distance between points on the line corresponding to the path of a light beam is zero. Experiments confirm that this model accurately describes the universe that we live in, so that's the model that we use. Thus, your "Why?" question comes down to "Why is the universe built this way and not some other way?" - and science isn't going to give you a satisfactory answer to that question.

As DaleSpam points out above, the modern style of moving the sign difference into the metric tensor reduces the embarrassing appearance of "imaginary" (better to say "complex" instead) numbers in the formulas. The older style, in which sooner or later you find yourself treating ##ict## (with ##i=\sqrt{-1}##) as a coordinate, was used mostly because it made the Lorentz transformations look like the already familiar problem of rotating the coordinate axes in Euclidean space. That helped people who were familiar with the mathematical underpinnings of classical mechanics make the jump to special relativity (it's worth noting that Goldstein introduces relativistic mechanics this way) but it's something you'll have to unlearn when you move on to general relativity.

Chalnoth · Oct 14, 2014

DaleSpam said:

That is one way to look at it, but it faded into disuse quite some time ago. Now, the usual approach is not to consider the time coordinate to be imaginary, but to consider the minus sign to be in the metric. So (in units where c=1):

##ds^2 = g_{\mu\nu} dx^{\mu} dx^{\nu} = -dt^2 + dx^2 + dy^2 + dz^2##

This can, as you suggested, be achieved by ##dx = (i~dt,dx,dy,dz)## and

##g = \left(
\begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\end{array}
\right)##

But it can also be achieved by ##dx = (dt,dx,dy,dz)## and

##g = \left(
\begin{array}{cccc}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\end{array}
\right)##

The usual modern approach is the latter

Also, whether to put the minus sign on the time coordinate or the spatial coordinates is a matter of convention, and both conventions are in wide use.

ghwellsjr · Oct 14, 2014

RCopernicus said:

I've never seen a satisfactory explanation of the metrics used in a calculation of distance in Minkowski space. In Euclidean space, the distance is:
ds^2 = dx^2 + dy^2 + dz^2
But in Minkowski space, the distance is
ds^2 = (dt * c)^2 - dx^2 - dy^2 - dz^2
Why are the signs reversed? This implies that space (or time depending on your convention) is imaginary.

Since you have the option of using either convention, when you are actually doing a calculation for the distance between two events, if it comes out imaginary in one convention, you can switch to the other convention. Then you can think of the distance as being either an actual spatial distance or an actual time interval, depending on which convention gives a positive value whose square root you can take. In the first case, it can be measured with a ruler at rest in an Inertial Reference Frame where the two events occur at the same time; in the second case, it can be measured with an inertial clock that is present at both events. Another commonly used term for this distance is the Spacetime Interval, which in the first case is called "Spacelike" and in the second case is called "Timelike". If the Spacetime Interval evaluates to zero, that means that it cannot be measured either with a ruler or with a clock and it is called "Null".
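The case distinction above can be sketched in a few lines of Python. The function names `interval_squared` and `classify` and the tolerance are assumptions made for illustration; the sign convention is the (+---) one from the opening post.

```python
import math

def interval_squared(dt, dx, dy, dz, c=1.0):
    """Squared interval in the (+---) convention: (c dt)^2 - dx^2 - dy^2 - dz^2."""
    return (c * dt) ** 2 - dx ** 2 - dy ** 2 - dz ** 2

def classify(dt, dx, dy, dz, c=1.0, tol=1e-12):
    """Return (kind, magnitude) for the separation between two events.

    timelike  -> magnitude is the time shown by an inertial clock present at both events
    spacelike -> magnitude is the ruler distance in the frame where the events are simultaneous
    null      -> magnitude is zero
    """
    s2 = interval_squared(dt, dx, dy, dz, c)
    if abs(s2) <= tol:
        return "null", 0.0
    if s2 > 0:
        return "timelike", math.sqrt(s2) / c
    # Negative in this convention: switch sign before taking the root.
    return "spacelike", math.sqrt(-s2)

print(classify(5.0, 3.0, 0.0, 0.0))  # timelike separation
print(classify(3.0, 5.0, 0.0, 0.0))  # spacelike separation
print(classify(1.0, 1.0, 0.0, 0.0))  # null separation
```

Switching convention corresponds to the `-s2` in the last branch: the sign flips, the measured magnitude does not.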

Chalnoth · Oct 14, 2014

Nugatory said:

The different sign on the ##t## coordinate means that the Minkowski metric describes a space-time in which the distance between points on the line corresponding to the path of a light beam is zero. Experiments confirm that this model accurately describes the universe that we live in, so that's the model that we use. Thus, your "Why?" question comes down to "Why is the universe built this way and not some other way?" - and science isn't going to give you a satisfactory answer to that question.

I think we can do a bit better than that.

For example, one way to understand why we formulate the dimensions in this way, with the time coordinate taking the opposite sign of the spatial coordinates, is that when we write down distances in this way, the shortest path something will take between two events is the path it actually takes. This isn't terribly useful in just Minkowski space, as this just means that things travel in straight lines. But in General Relativity this can be used to compute the paths of orbits, or of light rays being deflected by a gravitational field, or anything else you might care to estimate the path of. Everything takes the shortest distance when you use the metric, which has different signs between the spatial and time components, as the measure of distance. With the added constraints that light always takes a path which has space-time distance equal to zero and objects with mass cannot travel space-like distances (with the metric convention RCopernicus used, s^2 must be greater than zero).

pervect · Oct 14, 2014

RCopernicus said:

I've never seen a satisfactory explanation of the metrics used in a calculation of distance in Minkowski space. In Euclidean space, the distance is:
ds^2 = dx^2 + dy^2 + dz^2
But in Minkowski space, the distance is
ds^2 = (dt * c)^2 - dx^2 - dy^2 - dz^2
Why are the signs reversed? This implies that space (or time depending on your convention) is imaginary.

The name "distance" may be confusing you. In special relativity, distance is observer dependent, due to Lorentz contraction. What is independent of the observer, and hence an invariant, is the Lorentz interval.

We sometimes, especially in analogies, refer to space-like Lorentz intervals as distances, or call them "proper distances". But the Lorentz interval is still a separate concept: its distinguishing feature is that it's the same for all observers, and the formula (with its minus sign) calculates this quantity that is the same for all observers. Without the minus sign, the quantity we calculate would not be the same for all observers, and hence would not be of as much interest.

We'll get back to the similarities of the Lorentz interval with Euclidean distance later, but for now it's important to recognize that they are different things, before we point out their underlying similarity.

Note that the Lorentz interval being equal to zero is equivalent to a lightlike separation between a pair of points, and vice versa. So, saying that a zero Lorentz interval is the same for all observers is equivalent to saying that the geometry of space-time is such that a lightlike separation between points is independent of the observer. If this sounds like it's on the right track, good! If it seems a bit vague, read on.

The more formal justification of the Lorentz interval follows from the Lorentz transform itself. You can verify mathematically that a consequence of the Lorentz transform is that it leaves the Lorentz interval unchanged. The Lorentz transformations don't leave distances unchanged, nor do they leave times unchanged. The only scalar quantity that the Lorentz transforms leave unchanged is the Lorentz interval.

There are several ways of motivating the Lorentz transforms: you can use Einstein's original approach, or my favorite, the k-calculus approach due to Bondi. But the point is that you start out with the axioms of relativity, including that the speed of light is the same for all observers, plus whatever auxiliary assumptions your particular approach to special relativity needs (isotropy is a common one), and at the end you wind up with the Lorentz transform. I can't really get more specific than that in a short post. I will just suggest that if you don't understand how the Lorentz transformations came about, and my explanation is too brief, there is a lot of literature out there you can read to fill in the gaps. After you've derived this transform, you notice an interesting property it has: it leaves this quantity that we call the Lorentz interval unchanged.

If you compare this to Euclidean geometry, the invariance of the Lorentz interval under Lorentz transformations is similar to the invariance of distance under rotations in Euclidean space. So the Lorentz interval is a bit like the concept that distance used to be in Euclidean space, because it's independent of the observer.

So you have this useful analogy, between the transforms induced by changes in velocity (called Lorentz boosts), and rotations in standard Euclidean space. They both leave something underlying unchanged. In the case of Euclidean space, this important thing that is unchanged by rotation is Euclidean distance. In the case of Minkowski space, this important thing that is unchanged by a boost is the Lorentz interval.
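The analogy can be checked numerically. The sketch below takes the standard boost formula as given (it is not derived here), uses units where c = 1, and introduces the helper names `boost_x` and `rotate_xy` purely for illustration.

```python
import math

def boost_x(t, x, v):
    """Standard Lorentz boost along x with speed v (c = 1, |v| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)

def rotate_xy(x, y, angle):
    """Ordinary rotation of the x-y axes by the given angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return c * x + s * y, -s * x + c * y

# A boost changes t and x individually but keeps -t^2 + x^2 fixed.
t, x = 2.0, 1.0
t_new, x_new = boost_x(t, x, 0.6)
interval_before = -t ** 2 + x ** 2
interval_after = -t_new ** 2 + x_new ** 2

# The same boost does NOT keep the Euclidean-style sum t^2 + x^2 fixed.
euclid_before = t ** 2 + x ** 2
euclid_after = t_new ** 2 + x_new ** 2

# A rotation keeps x^2 + y^2 fixed.
px, py = 3.0, 4.0
qx, qy = rotate_xy(px, py, 0.7)
dist2_before = px ** 2 + py ** 2
dist2_after = qx ** 2 + qy ** 2

print(interval_before, interval_after)  # equal up to rounding
print(euclid_before, euclid_after)      # different
print(dist2_before, dist2_after)        # equal up to rounding
```

The middle pair is the point about the minus sign: with a plus sign, the quantity would depend on the observer.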

Dale · Oct 14, 2014

Chalnoth said:

Also, whether to put the minus sign on the time coordinate or the spatial coordinates is a matter of convention, and both conventions are in wide use.

Yes, both are in wide use. My personal preference is to use ##ds^2## for the (-+++) convention and ##d\tau^2## for the (+---) convention.

ghwellsjr · Oct 14, 2014

NOTE: the Lorentz interval that pervect was talking about in his post is the same as the Spacetime Interval that I was talking about in my post.

And remember, between any pair of arbitrary events, it is either a pure spatial distance, or a pure time interval, or neither, which is why it is called "null". Only a flash of light can be present at both events in the null case and for that reason it is also called "lightlike".

harrylin · Oct 15, 2014

RCopernicus said:

I've never seen a satisfactory explanation of the metrics used in a calculation of distance in Minkowski space. In Euclidean space, the distance is:
ds^2 = dx^2 + dy^2 + dz^2
But in Minkowski space, the distance is
ds^2 = (dt * c)^2 - dx^2 - dy^2 - dz^2
Why are the signs reversed? This implies that space (or time depending on your convention) is imaginary.

As others already explained, it's just a matter of convention. The way it was written the first time (I think) is probably easier to understand:

"[..] the invariants of the Lorentz group.
We know that the substitutions of this group [..] are linear substitutions which do not affect the quadratic form
x² + y² + z² - t²."
- https://en.wikisource.org/wiki/Translation:On_the_Dynamics_of_the_Electron_(July)#.C2.A7_9._.E2.80.94_Hypotheses_on_gravitation

shounakbhatta · Oct 15, 2014

DaleSpam said:

That is one way to look at it, but it faded into disuse quite some time ago. Now, the usual approach is not to consider the time coordinate to be imaginary, but to consider the minus sign to be in the metric. So (in units where c=1):

##ds^2 = g_{\mu\nu} dx^{\mu} dx^{\nu} = -dt^2 + dx^2 + dy^2 + dz^2##

This can, as you suggested, be achieved by ##dx = (i~dt,dx,dy,dz)## and

##g = \left(
\begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\end{array}
\right)##

But it can also be achieved by ##dx = (dt,dx,dy,dz)## and

##g = \left(
\begin{array}{cccc}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
\end{array}
\right)##

The usual modern approach is the latter

Hello Dale Spam,

As you have written, I would like to know where the values of (-1,0,0,0) come from. Are they parameters?

shounakbhatta · Oct 15, 2014

Nugatory said:

The different sign on the ##t## coordinate means that the Minkowski metric describes a space-time in which the distance between points on the line corresponding to the path of a light beam is zero. Experiments confirm that this model accurately describes the universe that we live in, so that's the model that we use. Thus, your "Why?" question comes down to "Why is the universe built this way and not some other way?" - and science isn't going to give you a satisfactory answer to that question.

As DaleSpam points out above, the modern style of moving the sign difference into the metric tensor reduces the embarrassing appearance of "imaginary" (better to say "complex" instead) numbers in the formulas. The older style, in which sooner or later you find yourself treating ##ict## (with ##i=\sqrt{-1}##) as a coordinate, was used mostly because it made the Lorentz transformations look like the already familiar problem of rotating the coordinate axes in Euclidean space. That helped people who were familiar with the mathematical underpinnings of classical mechanics make the jump to special relativity (it's worth noting that Goldstein introduces relativistic mechanics this way) but it's something you'll have to unlearn when you move on to general relativity.

Hello Nugatory,

Thank you for this wonderful, lucid answer. I would like to clarify one thing: is the (-ct) there because the time t is considered imaginary, hence ##i=\sqrt{-1}##? Is that so?

Torbjorn_L · Oct 15, 2014

Nugatory said:

your "Why?" question comes down to "Why is the universe built this way and not some other way?" - and science isn't going to give you a satisfactory answer to that question.

This is off topic, I'm swimming in deep waters here and maybe this is obvious, but as a note I thought relativity was the embodiment of having a universal speed limit. And that the problem with "why this way" starts out of that observation (e.g. "why locality and hence causality").

Nugatory · Oct 15, 2014

Torbjorn_L said:

This is off topic, I'm swimming in deep waters here and maybe this is obvious, but as a note I thought relativity was the embodiment of having a universal speed limit.

The essential basis of relativity is not the universal speed limit, it is the invariance of the speed of light; the universal speed limit (and much else) follows from light-speed invariance.

But with that said...
You've just moved the "Why?" question around. Why do we live in a universe that has a universal speed limit instead of one that does not? There's a fine and consistent mathematical model for describing a universe in which the speed of light is not invariant and there is no universal speed limit; it's called classical mechanics and there's nothing wrong with it except that observation tells us that it's not the way the universe works.

RCopernicus · Oct 15, 2014

Chalnoth said:

I think we can do a bit better than that.

For example, one way to understand why we formulate the dimensions in this way, with the time coordinate taking the opposite sign of the spatial coordinates, is that when we write down distances in this way, the shortest path something will take between two events is the path it actually takes. This isn't terribly useful in just Minkowski space, as this just means that things travel in straight lines. But in General Relativity this can be used to compute the paths of orbits, or of light rays being deflected by a gravitational field, or anything else you might care to estimate the path of. Everything takes the shortest distance when you use the metric, which has different signs between the spatial and time components, as the measure of distance. With the added constraints that light always takes a path which has space-time distance equal to zero and objects with mass cannot travel space-like distances (with the metric convention RCopernicus used, s^2 must be greater than zero).

I'm afraid your answer doesn't make much sense. I can claim that ds^2 = dx^2 - dy^2 describes a shorter path than ds^2 = dx^2 + dy^2, but I have no justification for arbitrarily flipping the sign on one of my dimensions. So why is Minkowski able to get away with it with time? Is not time orthogonal in every way to space?

Dale · Oct 15, 2014

The flipped sign is what sets up the causal structure of spacetime. It separates spacetime and four-vectors into timelike, spacelike, and lightlike regions.

Clocks measure timelike intervals and rods measure spacelike intervals. If you gave them both the same sign then you would have a theory where you could measure time with rods and simply turn towards the past as easily as turning left.

Dale · Oct 15, 2014

shounakbhatta said:

I would like to know how the values of (-1,0,0,0) come from?

They come directly from the line element:
##ds^2=g_{\mu\nu}dx^{\mu}dx^{\nu}=-dt^2+dx^2+dy^2+dz^2##

There are no cross terms, so all of the off diagonal entries are 0. Then the diagonal entries are the corresponding coefficients: (-1,1,1,1)
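Written out term by term, the step from line element to matrix looks like the sketch below. It assumes the index assignment ##x^0=t, x^1=x, x^2=y, x^3=z## and a symmetric ##g_{\mu\nu}##, so each cross term appears with a factor of 2.

```latex
ds^2 = g_{\mu\nu}\,dx^{\mu}dx^{\nu}
     = g_{00}\,dt^2 + g_{11}\,dx^2 + g_{22}\,dy^2 + g_{33}\,dz^2
     + 2g_{01}\,dt\,dx + 2g_{02}\,dt\,dy + 2g_{03}\,dt\,dz
     + 2g_{12}\,dx\,dy + 2g_{13}\,dx\,dz + 2g_{23}\,dy\,dz

% Matching coefficients against  ds^2 = -dt^2 + dx^2 + dy^2 + dz^2 :
g_{00} = -1, \qquad g_{11} = g_{22} = g_{33} = 1, \qquad
g_{\mu\nu} = 0 \ \text{ for } \mu \neq \nu
```

So the entries are not free parameters; they are read off from the line element.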

pervect · Oct 15, 2014

Responding to Chalnoth:

RCopernicus said:

I'm afraid your answer doesn't make much sense.

Interesting, why single it out for special attention then?

At the risk of being repetitive, I'll summarize my original longer post, which I'm concerned may have just gotten lost in the mass of replies.

While one can pretend that the sign flip in the expression for ds^2 did just "fall out of the air", and explore the consequence of saying "this magic quantity, which we won't tell you where it came from, is the same for all inertial observers" in detail and comparing the predictions made by this claim, I think it is more in the spirit of the original question to ask, historically, where did this quantity come from.

The process is a bit long, as I outlined in a longer post - one starts with special relativity and derives the Lorentz transform, then one looks at special quantities that are unchanged by the Lorentz transform and singles them out for special interest, eventually winding up with a geometrical interpretation of the special quantities that have this special property, which is called invariance.

So the quantity in question has a bit of a history, and if one wants to know where it came from, one needs to study the history.

If the real underlying question is "why study relativity at all", it might be good to remember that the end goal of science is to make predictions that agree with experiment, and focus on the experimental results. Starting out with preconceived notions of "how Nature ought to work" winds up in frustration at best, and at worst ends up with one sticking to the (incorrect) preconceived notions because one is happier with them than one is with the sometimes messy results that are actually measured.

DrGreg · Oct 15, 2014

RCopernicus, you might find an old thread useful: Lorentz interval

ChrisVer · Oct 16, 2014

I prefer the minus in the time component.
One reason is that we have the boosts and rotations in a Lorentz transformation. The boosts are the transformations concerning the time-components and they are generated by hyperbolic trigonometric functions.
If instead you had a 4D Euclidean space, you could have just the rotations. Rotations are generated by trigonometric functions. Changing the 0-component to an imaginary one, you will then get hyperbolic functions and so boosts.

I haven't ever gone through this to see if it's working or not... but subconsciously it leads me in writing the metric as diag(-+++)
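The idea can be tested numerically with a short Python sketch: rotate the pair (w, x) with w = i·t through an imaginary angle and compare with a boost written in hyperbolic functions. The variable names and sample values are illustrative assumptions, units have c = 1, and the sign of the rapidity depends on which direction is called a positive boost.

```python
import cmath
import math

phi = 0.8          # rapidity (illustrative value)
t, x = 2.0, 1.0    # sample event coordinates

# Old-style bookkeeping: imaginary time coordinate w = i*t, imaginary angle theta = i*phi.
theta = 1j * phi
w = 1j * t

# Ordinary rotation formulas in the (w, x) plane, evaluated with a complex angle.
w_new = cmath.cos(theta) * w + cmath.sin(theta) * x
x_new = -cmath.sin(theta) * w + cmath.cos(theta) * x

# The same transformation written directly as a boost with hyperbolic functions.
t_boost = math.cosh(phi) * t + math.sinh(phi) * x
x_boost = math.sinh(phi) * t + math.cosh(phi) * x

print(w_new / 1j, t_boost)  # recovered time coordinate vs. boosted time
print(x_new, x_boost)       # rotated x vs. boosted x
```

This works because cos(iφ) = cosh φ and sin(iφ) = i sinh φ, so a Euclidean rotation through an imaginary angle is a boost.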

(Side remark:

Nugatory said:

Thus, your "Why?" question comes down to "Why is the universe built this way and not some other way?" - and science isn't going to give you a satisfactory answer to that question.

Then scientists should never ask the question why, and just start collecting and believing in data.)

RCopernicus · Oct 16, 2014

pervect said:

Responding to Chalnoth:
While one can pretend that the sign flip in the expression for ds^2 did just "fall out of the air", and explore the consequence of saying "this magic quantity, which we won't tell you where it came from, is the same for all inertial observers" in detail and comparing the predictions made by this claim, I think it is more in the spirit of the original question to ask, historically, where did this quantity come from.

Actually, that is precisely the question I'm asking. I accept the reality that the square-root of minus one accurately describes an observation. What is harder to accept is that geometry allows us to do this. I can make a shortest path by postulating that ds^2 = dx^2 - dy^2, but these dimensions are orthogonal to each other so I can't just arbitrarily flip the sign to make the distance shorter. So why are you able to do this for the dimension of time? (Yes, I understand they're different things, but the whole idea of an invariant distance is to put them in a form where you can add them together: ds^2 = (ict)^2 + x^2 + y^2 + z^2).

One poster has claimed: we don't know, it just works that way. I suppose that's good enough for the Quantum Mechanics, but I'm left with the sense that we just threw in a minus sign to make the formula fit the observation and I feel the same sense of dissatisfaction I feel when I use Gravitational Constant in a formula.

Nugatory · Oct 16, 2014

RCopernicus said:

Actually, that is precisely the question I'm asking. I accept the reality that the square-root of minus one accurately describes an observation. What is harder to accept is that geometry allows us to do this. I can make a shortest path by postulating that ds^2 = dx^2 - dy^2, but these dimensions are orthogonal to each other so I can't just arbitrarily flip the sign to make the distance shorter. So why are you able to do this for the dimension of time? (Yes, I understand they're different things, but the whole idea of an invariant distance is to put them in a form where you can add them together: ds^2 = (ict)^2 + x^2 + y^2 + z^2).

The time coordinate is different from the three spatial coordinates because we can always rotate the spatial axes in such a way that only one of ##dx##, ##dy##, ##dz## is non-zero, or make any one of them negative, while still maintaining the orthogonality of all four axes. We can't do the same thing with the time axis because (as others have already said in this thread) that would be tantamount to turning in some direction and being able to look backwards in time.

The difference in sign between the three spatial components and the one temporal component of the Minkowski metric is capturing this basic difference between the spatial coordinates on the one hand and the temporal coordinate on the other. Indeed, the fact that both the (-1,1,1,1) and (1,-1,-1,-1) conventions work equally well is a pretty strong hint that no matter how we talk about them, they're different.

One poster has claimed: we don't know, it just works that way. I suppose that's good enough for the Quantum Mechanics, but I'm left with the sense that we just threw in a minus sign to make the formula fit the observation and I feel the same sense of dissatisfaction I feel when I use Gravitational Constant in a formula.

I didn't say that I was any more satisfied with this answer than you...
I said we are stuck with it, because the universe isn't responding to our complaints :)

robphy · Oct 16, 2014

RCopernicus said:

What is harder to accept is that geometry allows us to do this. I can make a shortest path by postulating that ds^2 = dx^2 - dy^2, but these dimensions are orthogonal to each other so I can't just arbitrarily flip the sign to make the distance shorter. So why are you able to do this for the dimension of time? (Yes, I understand they're different things, but the whole idea of an invariant distance is to put them in a form where you can add them together: ds^2 = (ict)^2 + x^2 + y^2 + z^2).

Suppose I told you about an experiment plotted on a position-vs-time graph.
Starting at a common event, have an infinite set of inertial observers with different velocities along a line in space
run and stop when their wristwatch reads 1 minute. Call the set of their stopping events a "circle" in this graph.
What is the equation of this "circle" in t and x variables? (Note that every inertial observer making a graph of this experiment will have identical looking graphs... with identical asymptotes.)
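One way to explore this experiment is to compute the stopping events directly. The Python sketch below assumes the standard time-dilation result (coordinate time t = γτ for a clock moving at speed v), units where c = 1, and a hypothetical helper `stopping_event`; the velocities are arbitrary samples.

```python
import math

def stopping_event(v, tau=1.0):
    """Event (t, x) at which an observer moving at speed v reads proper time tau.

    Assumes c = 1 and the standard time-dilation factor gamma.
    """
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * tau, gamma * v * tau

velocities = [-0.9, -0.5, 0.0, 0.5, 0.9]
events = [stopping_event(v) for v in velocities]

# The combination t^2 - x^2 evaluated at every stopping event.
invariants = [t * t - x * x for t, x in events]

for v, (t, x), inv in zip(velocities, events, invariants):
    print(f"v={v:+.1f}  t={t:.4f}  x={x:+.4f}  t^2 - x^2 = {inv:.12f}")
```

Every stopping event satisfies t² − x² = 1 (in minutes and light-minutes), so the "circle" is a hyperbola whose asymptotes are the light rays x = ±t.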

pervect · Oct 16, 2014

RCopernicus said:

Actually, that is precisely the question I'm asking. I accept the reality that the square-root of minus one accurately describes an observation. What is harder to accept is that geometry allows us to do this. I can make a shortest path by postulating that ds^2 = dx^2 - dy^2, but these dimensions are orthogonal to each other so I can't just arbitrarily flip the sign to make the distance shorter. So why are you able to do this for the dimension of time? (Yes, I understand they're different things, but the whole idea of an invariant distance is to put them in a form where you can add them together: ds^2 = (ict)^2 + x^2 + y^2 + z^2).

One poster has claimed: we don't know, it just works that way. I suppose that's good enough for the Quantum Mechanics, but I'm left with the sense that we just threw in a minus sign to make the formula fit the observation and I feel the same sense of dissatisfaction I feel when I use Gravitational Constant in a formula.

The full explanation of where it came from would involve deriving the Lorentz transform.

That's too much work for a post. Any SR book should go into the full details. As I said before, I'm particularly fond of the so-called k-calculus approach, if you look for books by Bondi like "Relativity and common sense", you'll see that approach applied. https://www.amazon.com/dp/0486240215/?tag=pfamazon01-20

But perhaps I can say something short and motivational instead of trying to derive the transform, I'll just point out one of its properties.

Consider a spherical wavefront propagating at a velocity "c". Let's describe it in a frame S with coordinates (t,x,y,z). The points on this wavefront form an expanding sphere. At time t, the radius of the sphere will be ct. This implies that ##x^2 + y^2 + z^2 = (ct)^2##, or ##x^2 + y^2 + z^2 - (ct)^2 = 0##.

Now, relativity says that light will propagate isotropically in a sphere in any inertial frame of reference. Let's consider two specific inertial frames of reference, S, and S'. If both S and S' are inertial, S' must be moving with some constant velocity v relative to S.

The first point is that any event in space-time will have exactly one unique set of coordinates in S, and a different set of unique coordinates in S'. As a consequence there will be a 1:1 mapping from coordinates in S to coordinates in S'.

Proof:
Given there is a 1:1 mapping from events<->S, and from events<->S'
We can invert the order and find a 1:1 mapping from S to events, because a 1:1 mapping must be invertible.

Then we construct the map S->events. Composing it with our map from events to S', we get
S -> events -> S'

This is the desired map from S to S'

This 1:1 mapping from S to S' is the Lorentz transform. But rather than derive it in detail I'm going to make a much simpler remark.

In frame S', describing the same wavefront from the same event, presumed to happen at t=t'=0, what do we get? There's nothing particularly special about either S or S', so hopefully it's clear that the description must involve simply replacing x with x', y with y', z with z', and t with t'. Thus we have

##x^2 + y^2 + z^2 - (ct)^2 = 0## in S
before the transform, and after the transform we must have

## x'^2 + y'^2 + z'^2 - (ct')^2 = 0.## in S'

Giving the quantity ##x^2 + y^2 + z^2 - (ct)^2## a name, the Lorentz interval, we've demonstrated that if the Lorentz interval is zero in S, it must also be zero in S'. It turns out that there is a more general result, that the value of the Lorentz interval is preserved even when it's not zero. I'm afraid you'll have to wade through the full details of the Lorentz transform to prove that. But if we are looking for preserved quantities, we have narrowed the field down a lot by noting that a zero value of the Lorentz interval in S must yield a zero value in S'.

Note that our proof relied on the constancy and isotropy of the speed of light, the idea that if it is a spherical wavefront in S, it must be a spherical wavefront in S'. This is one of the assumptions in relativity.
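The wavefront statement can be checked numerically by taking the standard boost formula as given (it is not derived in this post). In this Python sketch the names `boost_x` and `wavefront_residual`, the sample directions, and the boost speed are illustrative assumptions, with c = 1.

```python
import math

def boost_x(t, x, y, z, v, c=1.0):
    """Standard Lorentz boost along x: coordinates of the same event in S'."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t - v * x / c ** 2), gamma * (x - v * t), y, z

def wavefront_residual(t, x, y, z, c=1.0):
    """x^2 + y^2 + z^2 - (c t)^2, which is zero for events on the light sphere."""
    return x * x + y * y + z * z - (c * t) ** 2

# Events on the wavefront in S at time t: position = t * (unit direction).
t = 2.0
directions = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.6, 0.8, 0.0),
              (0.0, 0.6, -0.8), (-1.0, 0.0, 0.0)]
events = [(t, t * nx, t * ny, t * nz) for nx, ny, nz in directions]

residuals_S = [wavefront_residual(*e) for e in events]
residuals_S_prime = [wavefront_residual(*boost_x(*e, v=0.5)) for e in events]

print(residuals_S)        # all zero up to rounding
print(residuals_S_prime)  # still zero up to rounding after the boost
```

The individual t' values differ from event to event, so the events are no longer simultaneous in S', yet each one still lies on a sphere of radius ct' around the origin.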

stevendaryl · Oct 17, 2014

RCopernicus said:

I'm afraid your answer doesn't make much sense. I can claim that ds^2 = dx^2 - dy^2 describes a shorter path than ds^2 = dx^2 + dy^2, but I have no justification for arbitrarily flipping the sign on one of my dimensions. So why is Minkowski able to get away with it with time? Is not time orthogonal in every way to space?

The length of a path is relative to a metric. For example, if [itex]x[/itex] and [itex]y[/itex] are Cartesian coordinates, then traveling [itex]\delta x[/itex] in the x-direction and [itex]\delta y[/itex] in the y-direction will put you at a distance away from your start of [itex]\delta s[/itex] where [itex]\delta s^2 = \delta x^2 + \delta y^2[/itex]. On the other hand, in polar coordinates [itex]r, \theta[/itex], the distance [itex]\delta s[/itex] is given by [itex]\delta s^2 = \delta r^2 + r^2 \delta \theta^2[/itex]. The general notion of a metric (for 2D space, for simplicity) is a tensor, that can be represented as four numbers: [itex]g_{11}, g_{12}, g_{21}, g_{22}[/itex], and the corresponding notion of "distance" is given by:

[itex]\delta s^2 = \sum_{i j} g_{i j} \delta x^i \delta x^j[/itex]

For any such metric [itex]g_{ij}[/itex], there is a corresponding notion of "distance" (or distance-squared, actually), and that gives rise to its own notion of "minimal" (or "extremal"; it's not necessarily minimal) path, that generalizes "shortest path" in Euclidean space. This generalized notion of a metric views distance and the corresponding notion of extremal path as something empirical, rather than something you can discover by pure logic alone. So empirically, it turns out that the metric [itex]ds^2 = (c dt)^2 - dx^2 - dy^2 - dz^2[/itex] is important in nature. If you move a clock from point [itex](x,y,z)[/itex] at time [itex]t[/itex] to point [itex](x+dx, y+dy, z+dz)[/itex] at time [itex]t+dt[/itex], then the time on the clock will advance by an amount [itex]ds/c= \sqrt{dt^2 - (dx/c)^2 - (dy/c)^2 - (dz/c)^2}[/itex]. That's an empirical fact. The Minkowski metric is an especially convenient way to express this fact.
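The clock formula can be applied segment by segment to compare two worldlines between the same pair of events. This is a minimal Python sketch; the function name `proper_time` and the two sample worldlines are assumptions for illustration, with c = 1 by default.

```python
import math

def proper_time(events, c=1.0):
    """Total clock time along a piecewise-straight worldline.

    `events` is a list of (t, x, y, z) tuples; each straight segment contributes
    sqrt(dt^2 - (dx/c)^2 - (dy/c)^2 - (dz/c)^2), computed here as sqrt(ds^2) / c.
    """
    total = 0.0
    for (t0, x0, y0, z0), (t1, x1, y1, z1) in zip(events, events[1:]):
        dt, dx, dy, dz = t1 - t0, x1 - x0, y1 - y0, z1 - z0
        ds2 = (c * dt) ** 2 - dx ** 2 - dy ** 2 - dz ** 2
        if ds2 < 0:
            raise ValueError("segment is spacelike; no clock can follow it")
        total += math.sqrt(ds2) / c
    return total

# Two worldlines joining the same start and end events.
straight = [(0.0, 0.0, 0.0, 0.0), (10.0, 0.0, 0.0, 0.0)]
detour = [(0.0, 0.0, 0.0, 0.0), (5.0, 3.0, 0.0, 0.0), (10.0, 0.0, 0.0, 0.0)]

print(proper_time(straight))  # clock that stays put
print(proper_time(detour))    # clock that travels out and back at 0.6c
```

The straight worldline records more clock time than the detour, which illustrates the "extremal, not necessarily minimal" remark: with this metric the inertial path between two timelike-separated events maximizes the elapsed time.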

Torbjorn_L · Oct 17, 2014

Nugatory said:

The essential basis of relativity is not the universal speed limit, it is the invariance of the speed of light; the universal speed limit (and much else) follows from light-speed invariance.

I was talking to the question of "why this way" and not "how does it work". Relativity is based on the preservation of laws for all observers, unless I am mistaken - the speed limit and its universality (invariance) falls out of that.

And the problem with having no speed limit would be that there wouldn't be a spacetime metric. (I thank DaleSpam for making this more precise.)

Nugatory said:

You've just moved the "Why?" question around. Why do we live in a universe that has a universal speed limit instead of one that does not?

My observation was that the question is now more constrained, not merely just as loose but formulated differently. Causality is essential.

Nugatory said:

There's a fine and consistent mathematical model for describing a universe in which the speed of light is not invariant and there is no universal speed limit; it's called classical mechanics and there's nothing wrong with it except that observation tells us that it's not the way the universe works.

Interesting. I haven't thought of it that way.

The obvious problem is that it is a nonphysical approximation as you note, which we now know can't be realized. E.g. fields are quantum relativistic, not classical infinite-speed. Most problematic would be the loss of a working cosmology, I think.

I note that there would also be a loss of generality: without the universal speed limit c, the electric and magnetic fields would "fall apart", et cetera.

And of course space, time and causality come unglued, as per my first point, and are tacked on as ad hoc constraints rather than as a (more or less approximate) map of a physical system (spacetime in GR).

But this has become philosophy and not science, so I will stop there.

RCopernicus · Oct 17, 2014

I'd like to thank stevendaryl, pervect and Nugatory for their informative posts. What I get from this is Lorentz looked at his data and asked 'how would time and space need to be shaped in order to explain these observations?' and from there we have the minus sign on the metrics for space. I can live with that. However, I don't see how we escape the conclusion that space is imaginary.

Fredrik · Oct 17, 2014

RCopernicus said:

I'd like to thank stevendaryl, pervect and Nugatory for their informative posts. What I get from this is Lorentz looked at his data and asked 'how would time and space need to be shaped in order to explain these observations?' and from there we have the minus sign on the metrics for space. I can live with that.

I wouldn't say that it has anything to do with "shape", because Minkowski spacetime is completely flat, in the sense of differential geometry. (The Riemann tensor is zero everywhere). I prefer to think of it this way: The result that pervect explained suggests that the function g defined below is going to be useful.

RCopernicus said:

However, I don't see how we escape the conclusion that space is imaginary.

That's been answered in this thread. One of the answers is that we can define
$$\eta=\begin{pmatrix}-1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1\end{pmatrix},$$ and then define ##g(x,y)## for all 4×1 matrices ##x,y## by ##g(x,y)=x^T\eta y##. We haven't used any complex numbers in the definition of the function ##g##, and the right-hand side of that last equality is still equal to ##-x^0y^0+x^1y^1+x^2y^2+x^3y^3##.

I'm using units such that c=1. If you don't, you might want to take the upper left component of ##\eta## to be ##-c^2## instead of -1. Alternatively, you can take the "zeroth" component of x (i.e. ##x^0##) to be c times the time coordinate, rather than the time coordinate.
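As a rough check of the definition above, here is a small Python sketch, assuming 4×1 matrices are represented as plain lists of four real numbers. The names `ETA`, `g`, `g_components` and `eta_with_c` are made up for this illustration. Only real arithmetic is used anywhere.

```python
# The matrix eta from the post, in units where c = 1.
ETA = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]


def g(x, y, eta=ETA):
    """Compute x^T eta y for two 4-component vectors x and y."""
    return sum(x[i] * eta[i][j] * y[j] for i in range(4) for j in range(4))


def g_components(x, y):
    """The same quantity written out: -x0*y0 + x1*y1 + x2*y2 + x3*y3."""
    return -x[0] * y[0] + x[1] * y[1] + x[2] * y[2] + x[3] * y[3]


def eta_with_c(c):
    """Variant with upper-left component -c^2, for x[0] = t rather than c*t."""
    eta = [row[:] for row in ETA]
    eta[0][0] = -c * c
    return eta
```

With `x = [2.0, 1.0, 0.0, 0.0]` and `y = [3.0, 1.0, 0.0, 0.0]`, both `g(x, y)` and `g_components(x, y)` give `-5.0`. A light-like vector such as `[1.0, 1.0, 0.0, 0.0]` gives `g(x, x) == 0.0`. The two ways of handling c mentioned in the last paragraph agree: using `eta_with_c(c)` with `x[0] = t` gives the same number as using `ETA` with `x[0] = c*t`.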

robphy · Oct 17, 2014

RCopernicus said:

I'd like to thank stevendaryl, pervect and Nugatory for their informative posts. What I get from this is Lorentz looked at his data and asked 'how would time and space need to be shaped in order to explain these observations?' and from there we have the minus sign on the metrics for space. I can live with that. However, I don't see how we escape the conclusion that space is imaginary.

Then I shouldn't suggest a related experiment:
Galileo does the experiment I proposed (with a limited range of velocities (say, up to the speed of the fastest horse)),
then extrapolates the portion of his "circle" to infinite velocities. What would be the equation of Galileo's circle?

(You might not recognize that this diagram is essentially the position-vs-time graph drawn and interpreted in every introductory physics class... It's just that its non-euclidean geometry is not treated [or recognized].)

(In these two experiments without fancy equations, I have actually produced the metric of Minkowski spacetime and the degenerate-metric of Galilean spacetime. If I continue my story, I can build up all of the geometry of Special Relativity and Galilean Relativity.)

pervect · Oct 17, 2014

robphy said:

Then I shouldn't suggest a related experiment:
Galileo does the experiment I proposed (with a limited range of velocities (say, up to the speed of the fastest horse)),
then extrapolates the portion of his "circle" to infinite velocities. What would be the equation of Galileo's circle?

(You might not recognize that this diagram is essentially the position-vs-time graph drawn and interpreted in every introductory physics class... It's just that its non-euclidean geometry is not treated [or recognized].)

(In these two experiments without fancy equations, I have actually produced the metric of Minkowski spacetime and the degenerate-metric of Galilean spacetime. If I continue my story, I can build up all of the geometry of Special Relativity and Galilean Relativity.)

I think I'm missing something. I get ##R = v t = v \tau## for the Galilean case, and ##R = v / \sqrt{1-(v/c)^2} \, \tau## for the relativistic case, but I don't see how to use this to derive relativity; rather, I use relativity to derive the results. Here ##\tau## is the proper time for which the observers run; in your example it's a constant (one minute), but I've given it a symbolic value anyway.
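The two expressions for R can be compared numerically with a short Python sketch. The function names are illustrative and units with c = 1 are assumed by default.

```python
import math


def galilean_distance(v, tau):
    """R = v * tau: distance covered after running for time tau."""
    return v * tau


def relativistic_distance(v, tau, c=1.0):
    """R = v * tau / sqrt(1 - (v/c)^2), with tau the runner's proper time."""
    if abs(v) >= c:
        raise ValueError("speed must be below c")
    return v * tau / math.sqrt(1.0 - (v / c) ** 2)
```

At small v/c the two results are nearly indistinguishable, and they separate noticeably only as v approaches c. At v = 0.6 with tau = 1, the Galilean value is 0.6 and the relativistic value is about 0.75.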

Nugatory · Oct 17, 2014

RCopernicus said:

What I get from this is Lorentz looked at his data and asked 'how would time and space need to be shaped in order to explain these observations?' and from there we have the minus sign on the metrics for space.

There's more to the history than that.

Well before Einstein and as early as 1895, Lorentz developed the coordinate transformations that were consistent with the null result of the Michelson-Morley experiment. None of this stuff about metrics, geometry, space-time intervals showed up in this formulation; it was just an alternative to the Galilean transforms, one in which the ##\gamma## constant showed up and time did something a bit more complicated than the Galilean ##t'(x,y,z,t)=t##.

In 1905 Einstein demonstrated that these Lorentz transformations can be derived from the principle of relativity and the light-speed invariance. That introduced no new mathematics, but established those two principles as the basis for all subsequent theoretical physics.

Two years later, in 1907, Minkowski recognized that the Lorentz transformations were mathematically equivalent to a geometry in which the metric took on the form diag(-1,1,1,1) or diag(1,-1,-1,-1) depending on one's choice of sign conventions. That's when the metrics/geometry/space-time interval stuff appeared. At first it seemed to be just a more abstract mathematical formulation of what Einstein had already discovered, but it turned out to be essential to making the next jump to general relativity.

I can live with that. However, I don't see how we escape the conclusion that space is imaginary.

Easy... use the other sign convention, which is really nothing more than a trivial coordinate transformation, and space won't be "imaginary". Of course then time will be, but the ease with which I can flip them with a simple mathematical trick suggests that there is no physical significance to the complex numbers that appear when I take the square root of squared intervals calculated using the Minkowski metric.

Here's a more prosaic example of a mathematical formalism leading to a conclusion that you ought to be able to escape no matter what the math says: Standing at a height ##H## above the ground, I throw a ball upwards with speed ##v##. How many seconds later does the ball strike the ground? This is a fairly standard high-school sort of problem... but when we solve it, we find (because we're dealing with a quadratic equation) that we have two solutions, one positive and one negative. We could look at the negative time solution and say that we're stuck with the inescapable fact that the ball can travel backwards in time and strike the ground before we threw it. Or we can say that just because we can calculate a negative time doesn't mean that we have to assign any physical significance to it.
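The ball example can be worked through in a few lines of Python. This is only a sketch: it assumes no air resistance, constant gravitational acceleration, and a height measured upward from the ground, so the height obeys h(t) = H + v t - (1/2) g t^2. The function name `landing_times` is made up here.

```python
import math


def landing_times(height, speed, gravity=9.81):
    """Return both roots of height + speed*t - 0.5*gravity*t**2 = 0.

    For height > 0 the first root is negative and the second is positive.
    """
    root = math.sqrt(speed**2 + 2.0 * gravity * height)
    return (speed - root) / gravity, (speed + root) / gravity
```

With the round numbers `landing_times(10.0, 5.0, gravity=10.0)` the result is `(-1.0, 2.0)`. The quadratic supplies both values; only the positive one answers the question that was asked.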

robphy · Oct 17, 2014

pervect said:

I think I'm missing something. I get ##R = v t = v \tau## for the Galilean case, and ##R = v / \sqrt{1-(v/c)^2} \, \tau## for the relativistic case, but I don't see how to use this to derive relativity; rather, I use relativity to derive the results. Here ##\tau## is the proper time for which the observers run; in your example it's a constant (one minute), but I've given it a symbolic value anyway.

The idea is this: suppose we really did these experiments in the real world [without a theory yet to explain the observation]. How could one obtain a theory to explain it? Let's pretend that Euclid, Galileo, and Minkowski performed these experiments.

If Euclid set up surveyors on a plane and told them to travel from a point in various directions and stop when their odometers read 1 mi, what is the locus of these stopping points? A circle, of course. Suppose each surveyor also had a long ruler, which each somehow carried "perpendicular to his radial path". Each could assign coordinates to points: t [along his path] and x [perpendicular to his path... defined by being tangent to the circle]. Each surveyor would make a map of the stopping points such that t^2+x^2=(1 mi)^2, all identical--independent of surveyor. From this we get a way to measure the separation between points.

Now to find a separation between surveyor paths (radial lines), Euclid defines an angle as the arc-length intercepted divided by the radius of the circle. Define cos(angle) as the ratio between the t-coordinate of the stopping point of the other radial line and the radius of the circle (that is, the t-coordinate of the stopping point on my path). sin(angle) is the ratio between the x-coordinate and the radius of the circle. tan(angle)=slope=x/t. Note that since arc-length is additive, angle is additive, but slope as tan(angle) is not. You are now on your way to deriving the rest of Euclidean geometry.

If Galileo did this on a position-vs-time graph with a wristwatch and a ruler, he would get his version of a "circle" by extrapolating an apparently vertical segment (by probing a small range of slow speeds) to a vertical line t^2+(0)x^2=(1 minute)^2. Note that all tangents to the circle agree... so they agree on elapsed times between events (i.e. absolute simultaneity). Galileo's version of cosine would equal 1 (no time dilation) and Galileo's version of slope (a.k.a. velocity) would coincide with Galileo's version of "angle" (Galilean rapidity)... so velocities and angles are additive. With some work (and a spatial metric), you could get the Galilean spacetime geometry.

Of course, Minkowski's version of a "circle" would be a hyperbola t^2+(-1)x^2=(1 minute)^2, with asymptotes x=t and x=-t. Galileo couldn't see the hyperbola from his extrapolation from the small speed range. However, Minkowski had access to really fast particles. Alas, the tangents no longer agree as in Galileo's case (simultaneity is not absolute) and velocities are no longer additive. Minkowski's version of cosine is greater than or equal to 1 (i.e. time dilation) and would be identified as the hyperbolic-cosine (a.k.a. gamma). With some work, you could get the Minkowski spacetime geometry.
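The three "circles" can be put side by side in a short Python sketch. It assumes units where the radius is 1 and c = 1, and it labels the three cases by the coefficient k in t^2 + k x^2 = 1 (k = +1 for Euclid, 0 for Galileo, -1 for Minkowski). The function names and the k labelling are conventions chosen for this illustration.

```python
import math


def unit_circle_point(angle, k):
    """Point (t, x) on t^2 + k*x^2 = 1 at the given generalized angle."""
    if k == 1:       # Euclid: ordinary circle
        return math.cos(angle), math.sin(angle)
    if k == 0:       # Galileo: vertical line t = 1
        return 1.0, angle
    if k == -1:      # Minkowski: unit hyperbola
        return math.cosh(angle), math.sinh(angle)
    raise ValueError("k must be 1, 0 or -1")


def slope(angle, k):
    """Generalized tangent x/t; in the spacetime cases this is the velocity."""
    t, x = unit_circle_point(angle, k)
    return x / t


def gamma(v):
    """Time-dilation factor 1/sqrt(1 - v^2) in units where c = 1."""
    return 1.0 / math.sqrt(1.0 - v * v)
```

In the Minkowski case the "cosine" of the angle `math.atanh(v)` equals `gamma(v)`, for example 1.25 at v = 0.6. In the Galilean case `slope(a + b, 0)` equals `slope(a, 0) + slope(b, 0)`, so velocities add. In the Minkowski case the angles still add, but the slopes combine as (u + w)/(1 + u w).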

harrylin · Oct 19, 2014

Nugatory said:

There's more to the history than that. [..]

Two years later, in 1907, Minkowski recognized that the Lorentz transformations were mathematically equivalent to a geometry in which the metric took on the form diag(-1,1,1,1) or diag(1,-1,-1,-1) depending on one's choice of sign conventions. That's when the metrics/geometry/space-time interval stuff appeared. At first it seemed to be just a more abstract mathematical formulation of what Einstein had already discovered, but it turned out to be essential to making the next jump to general relativity.

[..]

Just a little nitpicking (see my post #11): the geometric invariant space-time interval stuff appeared already in 1906 with a paper by Poincaré that described the Lorentz transformation as a 4-dimensional space rotation (in the section on gravitation).

DiracPool · Oct 19, 2014

RCopernicus said:

I'd like to thank stevendaryl, pervect and Nugatory for their informative posts. What I get from this is Lorentz looked at his data and asked 'how would time and space need to be shaped in order to explain these observations?' and from there we have the minus sign on the metrics for space. I can live with that. However, I don't see how we escape the conclusion that space is imaginary.

I have to agree with Copernicus, I don't see what we gain by pretending that time (or space) isn't imaginary. Depending on how you set up your delta S squared, it could be either one. You can try to explain that away somehow, but I think Minkowski's trick of replacing the Y axis with ict and treating the Lorentz transformation as a rotation is the most instructive approach to understanding the invariant interval.
