
Flat Spacetime Acta

The article presents a structured approach to understanding Special Relativity (SR) and Galilean Spacetime (GS) through their causal structures and inertial motions, arguing that transformations like Galilean and Lorentz are secondary to the geometric structure of spacetime. It emphasizes the importance of a full geometrical perspective in teaching SR to avoid misconceptions. The paper also discusses the metric structure, causal vectors, and inertial frames, providing a clear distinction between the two types of spacetime and their properties.

Vol. 41 (2010) ACTA PHYSICA POLONICA B No 1

FLAT SPACETIME IN A CAPSULE

Andrzej Herdegen

Institute of Physics, Jagiellonian University


Reymonta 4, 30-059 Kraków, Poland
herdegen@th.if.uj.edu.pl

(Received August 14, 2009)


We propose a parallel introduction to Galilean and Einsteinian rela-
tivity based on the causal structure and inertial motions. Galilean and
Poincaré transformations, as objects secondary to the geometrical struc-
ture, are left aside.

PACS numbers: 03.30.+p

1. Introduction
In this article we propose a highly structured and logical approach to the
fundamentals of Special Relativity (SR) based on its causal structure and
relativity of inertial motions. For comparison and better understanding we
build the Galilean Spacetime (GS) in parallel on similar ideas. We indicate
that the causal structure determines the metric structure of SR spacetime
uniquely, which is not the case for the choice of Euclidean metric in the
Galilean case.
We want to stress the point that the Galilean and Lorentz (Poincaré)
transformations are objects secondary to the geometric structure of space-
time: they are affine mappings leaving this structure invariant. We regard
basing the introduction to SR on these transformations as a serious miscon-
ception and we do not discuss them in this article.
We are also of the opinion that introducing SR, for the sake of alleged
simplicity, from the three-dimensional rather than full geometrical point
of view, in fact makes understanding of SR more difficult, and can easily
lead to misconceptions. We regard as especially harmful those figures that
illustrate a hypothetical relative motion of frames, as depicted in Fig. 1.
Whereas such a picture is correct, though not the best, in GS, it is
completely wrong in SR. The
reason for that is that the hyperplanes of constant time (‘pure space’) of
observers in relative motion are not parallel, so they cannot be regarded as
‘sliding’ on each other.


Fig. 1. Reference frames — a popular picture.

Elements of the programme sketched above appeared, of course, in many
earlier publications and books (see e.g. Refs. [1–3]), thus the aims of this
article are mainly pedagogical. However, we believe that our scheme adds
some value to the clarity and logic of presentation.
In addition, we discuss some simple geometric effects in the present con-
text. This will include a discussion of the view of the celestial sphere as seen
by different observers [4, 5]. This point is particularly worth adding, as it is
usually treated with the help of a rather indirect method of stereographic
projection (see footnote 1). We discuss it directly on the celestial spheres of two observers.
In all discussions of effects involving different observers we consistently
avoid, as mentioned above, the use of Galilean or Lorentz transformations.
To relate the views on the spacetime as seen by two inertial observers one
needs only to know the directional vectors of their world-lines. On the other
hand, one needs complete bases attached to the observers to specify a trans-
formation between them.

2. Homogeneity with respect to translations and the affine structure
It is fairly obvious from everyday experience that one needs four real
numbers to place an event in space and time. For a given event the specific
values of these numbers depend on an adopted system of labels, but they
always form an element of the set $\mathbb{R}^4$. Our spacetime is a structure based
on this set.
Another common experience points to the applicability of spacetime
translations: if a physical occurrence takes place in a given region of space
and within some time-span, an analogous occurrence may take place else-
where and at another time. We include this property in our construction of
a model of the spacetime in the following form: the group of four-dimensional
translations acts transitively on the spacetime. This leads us to the following
starting point for the construction of a spacetime model:
Flat spacetime is modeled by a real four-dimensional affine space $(\mathcal{M}, M)$.

Footnote 1: However, in the original article on the shape of a moving sphere,
Ref. [4], there is a short remark on the idea used in the present article.

Here $\mathcal{M}$ denotes the affine space based on the four-dimensional vector
space $M$. We adopt the notation $P, Q, \ldots$ for points in $\mathcal{M}$ and
$x, y, \ldots$ for vectors in $M$. We write $x = \overrightarrow{PQ}$ if $Q = P + x$.
Moreover, if $P \in \mathcal{M}$ and $N \subset M$ is any subset then we use the
usual shorthand: $P + N = \{P + x \mid x \in N\}$. In particular, straight lines
are one-dimensional affine subspaces $P + L(x)$, where $L(x)$ denotes the
one-dimensional vector subspace spanned by the vector $x$. Ordered vector bases
in $M$ will be denoted by $(e_0, e_1, e_2, e_3)$. See Fig. 2 for a graphic
representation (here, as in the following, one space dimension is omitted).

Fig. 2. Vector and affine space.

3. Causal structure and inertial motions


Of course, the affine space structure is still a very poor one, and one needs
further specification. The most obvious element needed is one that
differentiates between physical time and space directions. This is
achieved in the following way.
We shall say that the spacetime is equipped with the causal structure
if in the accompanying vector space one has distinguished the following set
(see Fig. 3):

GS: a three-dimensional subspace S ⊂ M ,


SR: a homogeneous vector quadric V ⊂ M (different from a subspace),
with respect to which three dimensions of M are on equal footing, but
not the fourth.

Fig. 3. Causal structure.



By a homogeneous vector quadric we mean here a set of vectors $x \in M$
whose coordinates $x^0, x^1, x^2, x^3$ in some (and then any) basis satisfy the
equation $\sum_{i,j=0}^{3} \alpha_{ij} x^i x^j = 0$, with some basis-dependent
numerical coefficients $\alpha_{ij}$. We recall that for any such quadric there
is a basis in which it takes one of the forms
$\varepsilon_0 (x^0)^2 + \varepsilon_1 (x^1)^2 + \varepsilon_2 (x^2)^2 + \varepsilon_3 (x^3)^2 = 0$,
where $\varepsilon_\mu = 0, \pm 1$ (uncorrelated values). The only possibility
(up to a permutation of the basis vectors) to satisfy the demand imposed above
on $V$ is that in this canonical basis $V$ is a cone given by:

$$x \in V \iff (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 = 0 . \tag{1}$$

We shall say that a vector $x$ lies inside (or outside) $V$ if
$(x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 > 0$ ($< 0$) respectively.
We say that a nonzero vector is a causal vector if it:
GS: does not lie in S,
SR: lies inside or on V .

In addition we introduce the notion of a timelike vector which


GS: is identical with a causal vector,
SR: lies inside V .
We shall say that two events $P$ and $Q$ are causally related if $\overrightarrow{PQ}$ is
a causal vector, and they are temporally related if it is a timelike vector.
The causal structure makes contact with physics by the following identi-
fication. An inertial motion is a straight line in spacetime M with a timelike
directional vector (thus any two events on this line are temporally related).
Such lines will be called world-lines of the motion (see Fig. 4).
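
The definitions of causal and timelike vectors can be illustrated by a short sketch. It assumes that vectors are given by their coordinates in a basis in which $V$ has the canonical form (1) and in which $S$ is the hyperplane $x^0 = 0$; the function names `minkowski_sq`, `classify_sr` and `classify_gs` are introduced here only for illustration.

```python
def minkowski_sq(x):
    """Quadratic form (x^0)^2 - (x^1)^2 - (x^2)^2 - (x^3)^2 of Eq. (1)."""
    return x[0] ** 2 - x[1] ** 2 - x[2] ** 2 - x[3] ** 2


def classify_sr(x):
    """Classify a vector relative to the cone V (special relativity)."""
    if not any(x):
        return "zero"
    q = minkowski_sq(x)
    if q > 0:
        return "timelike"        # inside V: causal and timelike
    if q == 0:
        return "causal, on V"    # on V: causal but not timelike
    return "non-causal"          # outside V


def classify_gs(x):
    """Classify a vector relative to S, assumed here to be x^0 = 0 (Galilean case)."""
    if not any(x):
        return "zero"
    # In GS every causal vector is timelike: it only has to leave S.
    return "timelike" if x[0] != 0 else "non-causal"


# Integer coordinates keep the comparison with zero exact.
for vec in [(2, 1, 0, 0), (1, 1, 0, 0), (1, 2, 0, 0), (0, 1, 0, 0)]:
    print(vec, classify_sr(vec), classify_gs(vec))
```

The vector (1, 2, 0, 0) is a useful test case: it is timelike in GS but lies outside the cone in SR.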

Fig. 4. Inertial motions.

If a point $Q \neq P$ is not causally related to $P$ we say that it lies elsewhere
with respect to $P$. One then cannot connect $Q$ and $P$ by an inertial motion.

4. The four orientations of the spacetime


Let us choose a basis of M in which
GS: the subspace $S$ is given by $x^0 = 0$,
SR: the cone V has the canonical form.

The set of causal vectors splits into two disjoint sets: those for which $x^0 > 0$
or $x^0 < 0$ respectively in the distinguished basis. We denote one of these sets
by $C_+$ and call it the future and the other by $C_-$ and call it the past. (After
this choice has been made we can adjust the sign of $x^0$ so that $x^0 > 0$ for
$x \in C_+$.) Then the future (past) of any event $P$ is the set $P + C_+$ ($P + C_-$),
and $Q$ is in the future of $P$ if, and only if, $P$ is in the past of $Q$. Let us
write $Q > P$ for ‘$Q$ is in the future of $P$’, and $Q \geq P$ for: $Q > P$ or $Q = P$.
Then the relation $Q \geq P$ defines a partial order in $\mathcal{M}$:
1. P ≥ P ,
2. if Q ≥ P and P ≥ Q then Q = P ,
3. if R ≥ Q and Q ≥ P then R ≥ P .

The only less obvious of these properties is the third one in the special
relativity case. To prove it observe that $x \in C_+$ if in a canonical basis
$0 < x^0 \geq \sqrt{(x^1)^2 + (x^2)^2 + (x^3)^2}$. If $y$ is another such vector
then it is easily seen, by the triangle inequality applied to the spatial
parts, that the same relation is satisfied with $x$ replaced by $x + y$, which
was to be proved. See Fig. 5 for a graphic representation of causally defined
regions.
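
The closure of $C_+$ under addition, which underlies the third property, can be checked numerically. The following is a minimal sketch and not a proof; the sampling scheme, the fixed seed and the names `in_future_cone` and `random_future_vector` are assumptions made for this illustration.

```python
import math
import random


def in_future_cone(x, tol=1e-12):
    """True if x^0 > 0 and x^0 >= sqrt((x^1)^2 + (x^2)^2 + (x^3)^2), up to rounding."""
    return x[0] > 0 and x[0] + tol >= math.sqrt(x[1] ** 2 + x[2] ** 2 + x[3] ** 2)


def random_future_vector(rng):
    """Draw a spatial part, then a time component no smaller than its length."""
    space = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    length = math.sqrt(sum(c * c for c in space))
    return (length * (1.0 + rng.random()) + 1e-9, space[0], space[1], space[2])


def add(x, y):
    """Componentwise sum of two vectors."""
    return tuple(a + b for a, b in zip(x, y))


rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility
pairs = [(random_future_vector(rng), random_future_vector(rng)) for _ in range(1000)]

# x, y in C_+ should imply x + y in C_+ for every sampled pair.
closed_under_addition = all(in_future_cone(add(x, y)) for x, y in pairs)
print(closed_under_addition)
```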

Fig. 5. Past, future, elsewhere.

As there are two possible choices for the identification of the sets C± we
say that there are two possible causal orientations of the spacetime M.
At the same time $M$ as a real vector space has two possible orientations,
defined as usual as the equivalence classes of bases. In combination with
the causal orientation this gives four choices of orientation of the spacetime $\mathcal{M}$.

5. Relative rest, inertial observers, inertial frames


We do not yet have any metric tools, so we are unable to determine the
relative velocity of inertial motions, but we can already say what it means
that two motions are at relative rest: their world-lines are parallel (i.e. have
common directional vectors).
We see no need to differentiate between an inertial motion and the often
used term inertial observer; the difference, if any, is a rather
psychological one.
Finally, by an inertial frame we mean the class of all inertial observers
remaining at relative rest with respect to each other. We do not see the need
to make this notion more specific, as is often assumed, by demanding that
a particular basis has been chosen with the timelike vector parallel to the
world-lines of the motions in this family.

6. Metric structure, four-velocities


We recall two facts from linear algebra:
1. The kernel (zero space) of a nonzero linear form on a vector space
is a subspace of codimension one. Conversely, any such subspace S
determines uniquely up to a constant factor a linear form Dt such
that
x∈S ⇐⇒ Dt(x) = 0 . (2)
2. A real vector quadric V (if different from a subspace) determines
uniquely up to a constant factor a symmetric metric g such that
x∈V ⇐⇒ g(x, x) = 0 . (3)
A proof of the second fact for the case of our cone V is given for completeness
in the Appendix.

6.1. Galilean spacetime


In the case of the Galilean spacetime we choose the sign of $Dt$ by demanding
that
$$Dt(x) > 0 \quad \text{for } x \in C_+ . \tag{4}$$
Then $Dt(\overrightarrow{PQ}) > 0$ if $Q$ lies in the future of $P$. The remaining
positive factor in the definition of $Dt$ is fixed arbitrarily. For an
arbitrarily chosen point $P_0$ we fix a real value $t(P_0)$. Then there is
a unique affine form taking this value at $P_0$ and having $Dt$ as its linear
part. This means that for each pair of points $P, Q$ one has
$$t(Q) = t(P) + Dt\left(\overrightarrow{PQ}\right) . \tag{5}$$

This form determines the universal time in the Galilean spacetime. The
metric structure of this spacetime is now completed by choosing a Euclidean
metric h on the subspace S. This metric then determines ‘spatial’ metric
relations on each hyperplane Q + S of constant time. One notes that there
are no relations of this kind between points on different constant time planes.
Note also that the relative scale of the metric tools Dt and h is arbitrary.
See Fig. 6 for a graphic representation of the metric structure of GS.

Fig. 6. Metric structure of GS.

The world-lines of inertial motions pierce each of the constant time
hyperplanes at precisely one point. For each family of parallel inertial motions
there is a unique directional vector $u$ for which $Dt(u) = 1$. We shall call
such a vector a unit timelike, future-pointing vector or the four-velocity of
these world-lines.
Having chosen a particular family of parallel inertial motions characterized
by the four-velocity $u$ one can split the vector space into time and space
parts by
$$M = L(u) \oplus S , \tag{6}$$
where $L(u)$ denotes the one-dimensional subspace spanned by $u$. Observers
in the chosen family decompose each vector $x$ into the time and space
parts by
$$x = Dt(x)\,u + x_u , \quad \text{so } x_u \in S . \tag{7}$$
Note that while $Dt(x)$ does not depend on $u$, the space part $x_u$ does depend
on this vector, that is to say on the family of parallel inertial motions. The
Euclidean scalar product $h$ can be applied to the space parts of any two
vectors $x$ and $y$ and we shall also write
$$h(x_u, y_u) = x_u \circ y_u . \tag{8}$$

6.2. Special relativity


In this case g is fixed up to a real factor by the cone V , as described
above. We choose its sign by the convention that in the canonical basis of V

the metric has the signature (+1, −1, −1, −1). The remaining positive factor
is chosen arbitrarily. The metric structure of the spacetime is determined
completely by g. The vector x is a timelike vector when g(x, x) > 0, and it
is a causal vector when it is nonzero and g(x, x) ≥ 0. In addition, we say
that a vector is spacelike if $g(x, x) < 0$. We shall also use the notation
$$g(x, y) = x \cdot y , \qquad x \cdot x = x^2 . \tag{9}$$
See Fig. 7 for the metric properties of vector types.

Fig. 7. Scalar product in SR.

If $Q$ lies in the future of $P$ (with $\overrightarrow{PQ}$ timelike) then there is
a unique inertial motion joining them. The proper time interval covered by
this motion from $P$ to $Q$ is determined by
$$\Delta\tau(P, Q) = \left[ g\left(\overrightarrow{PQ}, \overrightarrow{PQ}\right) \right]^{1/2} . \tag{10}$$
Let $u = \lambda \overrightarrow{PQ}$ with $\lambda > 0$ so that $u \in C_+$. If we demand
that $g(u, u) = 1$ then $u$ is fixed uniquely by these conditions and
$\lambda = \left[ g\left(\overrightarrow{PQ}, \overrightarrow{PQ}\right) \right]^{-1/2}$. We
call such $u$ a unit timelike, future-pointing vector or a four-velocity.
A four-velocity $u$ may be used to define a time variable correlated with
the inertial frame defined by $u$. As in the Galilean case we fix $t_u(P_0)$ and
then there is a unique affine form $t_u$ taking this value at $P_0$ and having the
linear form
$$Dt_u(x) = u \cdot x \tag{11}$$
as its linear part. This means that for each pair of points $P, Q$ one has
$$t_u(Q) = t_u(P) + Dt_u\left(\overrightarrow{PQ}\right) . \tag{12}$$

Note that if $P$ and $Q$ lie on one $u$-world-line, $Q$ in the future of $P$, then
$$Dt_u\left(\overrightarrow{PQ}\right) = \Delta\tau(P, Q) , \tag{13}$$
so the definition of $Dt_u$ is an extension of the proper time interval on
a $u$-world-line, Eq. (10).

Let us denote by $S_u$ the kernel of the form $Dt_u$, which is the subspace of
vectors orthogonal to $u$ with respect to the metric $g$. Then the hyperplanes
$P + S_u$ are the sheets of constant $t_u$ time. The metric $g$ when restricted to
$S_u$ reduces to $-h_u$, where $h_u$ is a Euclidean metric. Thus the objects $Dt_u$,
$t_u$, $S_u$ and $h_u$ play a similar role as $Dt$, $t$, $S$ and $h$ in the Galilean
case, but with several important differences:

1. Here these quantities are not universal as in the Galilean case, they are
functions of the vector u; thus they depend on the choice of a family
of inertial observers in relative rest.
2. This relative character implies weaker status of these quantities as
compared to the Galilean case.
3. On the other hand, the form $Dt_u$ and the metric $h_u$ are uniquely
determined by $g$, so their relative scale is unambiguous. This is to be
contrasted with the Galilean case, where the scale of $Dt$ and $h$ could
be fixed independently.

The decomposition of the vector space $M$ into time and space parts takes
now the form

$$M = L(u) \oplus S_u , \qquad x = Dt_u(x)\,u + x_u , \qquad x_u \in S_u . \tag{14}$$

Note that in this case both $Dt_u(x)$ and $x_u$ depend on $u$, and for different
choices of this four-velocity the space parts $x_u$ lie in different subspaces. For
$x_u, y_u \in S_u$ we shall write $x_u \circ y_u = -x_u \cdot y_u$ and also denote
$|x_u| = \sqrt{x_u \circ x_u}$. Then

$$x \cdot y = Dt_u(x)\,Dt_u(y) - x_u \circ y_u , \qquad x^2 = (u \cdot x)^2 - |x_u|^2 . \tag{15}$$

The scalar product, in contrast to the Galilean case, is applicable to any
vectors. See Figs. 8 and 9 for a graphic representation of decompositions and
four-velocities, and Fig. 10 for the dependence of $S_u$ on $u$.
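
The decomposition (14) and the identity (15) can be checked numerically. The sketch below assumes coordinates in a canonical basis with signature (+1, −1, −1, −1); the helper names `dot`, `four_velocity` and `split`, as well as the sample vectors, are chosen here only for illustration.

```python
import math


def dot(x, y):
    """Scalar product g(x, y) with signature (+, -, -, -)."""
    return x[0] * y[0] - x[1] * y[1] - x[2] * y[2] - x[3] * y[3]


def four_velocity(x):
    """Rescale a future-pointing timelike vector so that g(u, u) = 1."""
    norm = math.sqrt(dot(x, x))
    return tuple(c / norm for c in x)


def split(x, u):
    """Return (Dt_u(x), x_u) with x = Dt_u(x) u + x_u and u . x_u = 0, as in Eq. (14)."""
    t = dot(u, x)  # Dt_u(x) = u . x, Eq. (11)
    x_u = tuple(xc - t * uc for xc, uc in zip(x, u))
    return t, x_u


u = four_velocity((5.0, 3.0, 0.0, 0.0))
x = (2.0, 1.0, 4.0, -1.0)
y = (0.5, -2.0, 1.0, 3.0)

tx, xu = split(x, u)
ty, yu = split(y, u)

space_product = -dot(xu, yu)      # x_u o y_u, the Euclidean product on S_u
lhs = dot(x, y)
rhs = tx * ty - space_product     # right-hand side of Eq. (15)
print(lhs, rhs)
```

Repeating the computation with a different `u` changes both `tx` and `xu`, while `lhs` stays the same, which is the frame dependence stressed in point 1 above.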

Fig. 8. Metric structure of SR.



Fig. 9. Four-velocities and future-directed lightvectors.

Fig. 10. Subspaces orthogonal to 4-velocities.

7. Equivalence of observers, light signals and their speed


The principle of relativity, i.e. of the equivalence of observers, can be
now put in the following form:
1. Physical theories do not depend on the choice of an inertial observer,
i.e. of the four-velocity u determining all inertial motions in a given
frame or a particular world-line in the family.
2. The set of physical states conforming with physical theories does not
distinguish any of the inertial observers.
In particular:
1. In SR the Maxwell equations imply that the light signals propagate
along straight lines whose directional vectors lie on V , i.e. l is such
a vector iff g(l, l) = 0. These vectors are called therefore lightlike
vectors and V is called the light-cone. The Maxwell equations do not
conform to the principle of relativity in the GS case. In this case the
only way to avoid clash with the principle of relativity is to assume
that light propagates with infinite speed, i.e. the directional vectors of
light rays lie in S.
2. If one defines physical units of time and space in each inertial frame
with the use of analogous physical phenomena then the proportion of
these units to the geometrical units defined by Dt and h in the case
of Galilean spacetime, and g in the case of SR, is the same for all
observers.

3. In the SR case if $l$ is lightlike and $u$ is any four-velocity, then
$|Dt_u(l)| = |l_u|$: light covers in each inertial frame a unit distance
in a unit time in geometrical units. If one determines physical units as
in the preceding point their ratio gives the speed of light in all inertial
frames in those units.
Note that the geometrical objects of the spacetime include, besides metrical
tools, also the choice of one of the four orientations (as defined above).
The principle of relativity in the above form does not require the indepen-
dence of physics of this choice. As is well-known there are exceptions not
conforming to this extended demand.

8. Relative velocities and their composition


To be precise the term ‘four-velocity’, although deeply rooted in the
language usually used in SR, is somewhat misleading. In fact the vector u
of an inertial frame simply points in the direction in which time flows but
there is no space translation for all observers in this frame. To introduce
a more justified notion of velocity one needs a reference observer which
‘rests’. But ‘all observers are equal’, so one has to say with respect to which
of them one makes the measurement.
Thus we assume there are given two four-velocities $u$ and $u'$ and we
want to determine a velocity of the motion defined by $u'$ with respect to
that defined by $u$. We propose three candidates:
1. $\Delta(u', u) = u' - u$,
2. $v_{pr}(u', u) = u'_u$,
3. $v(u', u) = u'_u / Dt_u(u')$.
The r.h.s. in 2. is formed as in (7) and (14) and the subscript ‘pr’ stands for
‘proper’. In 3. $Dt_u$ is independent of $u$ in the Galilean case.
The first of these definitions satisfies the antisymmetry and chain properties:
$$\Delta(u', u) = -\Delta(u, u') , \qquad \Delta(u'', u) = \Delta(u'', u') + \Delta(u', u) , \tag{16}$$
which has obvious interpretational advantages.

8.1. Galilean spacetime


In this case $Dt(u') = 1$ and $u'_u = u' - Dt(u')\,u = u' - u$, so all three
definitions coincide and we shall use the notation $v(u', u)$ for this quantity (see
Fig. 11). We have $v(u', u) \in S$ and point 3. above tells us that this vector
gives the change of position of an observer with four-velocity $u'$ with respect
to one with four-velocity $u$, undergone in unit time. The composition of
velocities obeys the simple vector addition law (16) (see Fig. 12).

Fig. 11. Relative velocity in GS.

Fig. 12. Composition of velocities in GS.

8.2. Special relativity


In this case all three definitions are different (see Fig. 13). The first one
has the advantage of the vector addition composition law (16) (see Fig. 14),
but $\Delta(u', u)$ does not lie in any of the subspaces $S_u$ or $S_{u'}$. Rather, it is in
the subspace $S_w$ of the observer with four-velocity ‘half way’ between $u$ and
$u'$: $w = (u + u') / \sqrt{(u + u') \cdot (u + u')}$.
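
That $\Delta(u', u)$ is orthogonal to $w$ follows from $(u + u') \cdot (u' - u) = u'^2 - u^2 = 0$. A minimal numerical sketch, with sample four-velocities and helper names (`dot`, `four_velocity`) assumed here for illustration:

```python
import math


def dot(x, y):
    """Scalar product g(x, y) with signature (+, -, -, -)."""
    return x[0] * y[0] - x[1] * y[1] - x[2] * y[2] - x[3] * y[3]


def four_velocity(x):
    """Rescale a future-pointing timelike vector so that g(u, u) = 1."""
    norm = math.sqrt(dot(x, x))
    return tuple(c / norm for c in x)


u = four_velocity((3.0, 0.0, 1.0, 0.0))
u1 = four_velocity((5.0, 3.0, 0.0, 0.0))            # plays the role of u'

delta = tuple(a - b for a, b in zip(u1, u))            # Delta(u', u) = u' - u
w = four_velocity(tuple(a + b for a, b in zip(u, u1))) # 'half way' four-velocity

print(dot(w, delta))    # vanishes up to rounding: Delta lies in S_w
print(dot(u, delta))    # equals u.u' - 1, nonzero: Delta is not in S_u
print(dot(u1, delta))   # equals 1 - u.u', nonzero: Delta is not in S_u'
```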

Fig. 13. Relative velocity in SR.



Fig. 14. Composition of velocities in SR.

The second and the third definitions give parallel vectors in $S_u$. The
proper velocity $v_{pr}(u', u)$ is the displacement of the motion along any
world-line $P + L(u')$, as seen in the $u$-frame, undergone during unit time
interval as measured on the world-line (proper time) (see Fig. 15). The velocity
$v(u', u)$ is a similar displacement but scaled to unit time in the $u$-frame. It
is only this latter quantity which is bounded by 1 (light velocity as defined
in Section 7).

Fig. 15. Proper velocities in SR.

The explicit form of the two latter velocities is easily obtained:
$$v_{pr}(u', u) = u' - (u' \cdot u)\,u , \tag{17}$$
$$v(u', u) = \frac{u'}{u' \cdot u} - u . \tag{18}$$
Neither of these velocities satisfies the antisymmetry or the chain rule
properties (16). If we write the first of these equations in the form
$u' = (u' \cdot u)\,u + v_{pr}$ and take the scalar square of both sides we find
$$(u' \cdot u)^2 - |v_{pr}|^2 = 1 \tag{19}$$

(from now on we write $v_{pr} \equiv v_{pr}(u', u)$, $v \equiv v(u', u)$). This tells us that
the quantities $u' \cdot u$ and $|v_{pr}|$ may be represented as the hyperbolic cosine
and hyperbolic sine of some unique parameter $\psi \geq 0$. If we denote (after
Bondi [1]) $k = \exp\psi \geq 1$ we get the representation
$$u' \cdot u = \tfrac{1}{2}\left(k + k^{-1}\right) \equiv c(k) , \qquad |v_{pr}| = \tfrac{1}{2}\left(k - k^{-1}\right) \equiv s(k) , \qquad |v| = \frac{s(k)}{c(k)} . \tag{20}$$
Some other useful relations which follow are
$$c(k) = \sqrt{1 + |v_{pr}|^2} = \frac{1}{\sqrt{1 - |v|^2}} , \qquad s(k) = \frac{|v|}{\sqrt{1 - |v|^2}} , \tag{21}$$
$$k = |v_{pr}| + \sqrt{1 + |v_{pr}|^2} = \left( \frac{1 + |v|}{1 - |v|} \right)^{1/2} . \tag{22}$$
We shall find the direct physical interpretation of $k$ in the next section.
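
Relations (20)–(22) are easy to verify for a sample speed. The following sketch assumes geometrical units (speed of light equal to 1) and an illustrative value $|v| = 0.6$; the function names are chosen here and are not part of the text.

```python
import math


def k_from_speed(v):
    """Bondi k-coefficient from |v| < 1, second form of Eq. (22)."""
    return math.sqrt((1.0 + v) / (1.0 - v))


def c(k):
    """c(k) = (k + 1/k)/2, the hyperbolic cosine of psi = ln k, Eq. (20)."""
    return 0.5 * (k + 1.0 / k)


def s(k):
    """s(k) = (k - 1/k)/2, the hyperbolic sine of psi = ln k, Eq. (20)."""
    return 0.5 * (k - 1.0 / k)


v = 0.6                      # illustrative value of |v|
k = k_from_speed(v)          # close to 2 for this choice
v_pr = s(k)                  # |v_pr|, close to 0.75

print(k, c(k), v_pr)
print(s(k) / c(k))                              # recovers |v|, Eq. (20)
print(1.0 / math.sqrt(1.0 - v * v))             # equals c(k), Eq. (21)
print(v_pr + math.sqrt(1.0 + v_pr * v_pr))      # equals k, first form of Eq. (22)
```

Note that $|v_{pr}|$ exceeds $|v|$ and is unbounded as $|v| \to 1$, in line with the remark that only $v$ is bounded by 1.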
The magnitude of $k$ is invariant with respect to the interchange of $u$
and $u'$, so if we denote $v'_{pr} \equiv v_{pr}(u, u')$ and $v' \equiv v(u, u')$ then we have
$$|v'_{pr}| = |v_{pr}| , \qquad |v'| = |v| . \tag{23}$$
The motion of an observer with respect to the $u$-frame is often defined
in terms of $v_{pr}$ or $v$ rather than $u'$, or similarly with the role of observers
interchanged, and then
$$u' = c(k)\,u + v_{pr} = c(k)(u + v) = c(k)\,u + s(k)\,n ,$$
$$u = c(k)\,u' + v'_{pr} = c(k)(u' + v') = c(k)\,u' + s(k)\,n' , \tag{24}$$
where by $n$ and $n'$ we have denoted the unit spacelike vectors pointing in the
direction of $v$ and $v'$ respectively. Although the use of $v_{pr}$ or $v$ instead of $u'$
may seem better suited for the point of view of the $u$-frame, one has to be
careful not to project Galilean properties of velocities to SR. For instance,
we have $v' \neq -v$, in contrast to GS.
The composition of velocities of these types is rather complicated and
not very illuminating. The special case of four-velocities $u, u', u''$ lying in
one two-dimensional subspace will be discussed in the next section.
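
The statement $v' \neq -v$ together with $|v'| = |v|$ can be seen in a small numerical sketch based on Eq. (18). The sample four-velocities and the helper names `dot`, `relative_velocity` and `length` are assumptions made for this illustration.

```python
import math


def dot(x, y):
    """Scalar product g(x, y) with signature (+, -, -, -)."""
    return x[0] * y[0] - x[1] * y[1] - x[2] * y[2] - x[3] * y[3]


def relative_velocity(u_moving, u_ref):
    """v(u', u) = u'/(u'.u) - u, Eq. (18), with u' = u_moving and u = u_ref."""
    c = dot(u_moving, u_ref)
    return tuple(a / c - b for a, b in zip(u_moving, u_ref))


def length(x):
    """Length |x| of a spacelike (or zero) vector: sqrt(-x.x)."""
    return math.sqrt(-dot(x, x))


u = (1.0, 0.0, 0.0, 0.0)
u1 = (1.25, 0.75, 0.0, 0.0)          # plays the role of u'; g(u1, u1) = 1

v = relative_velocity(u1, u)         # lies in S_u
v_back = relative_velocity(u, u1)    # v' = v(u, u'), lies in S_u'

print(v, v_back)
print(length(v), length(v_back))     # equal lengths, Eq. (23)
print(dot(v, u), dot(v_back, u1))    # both vanish up to rounding
```

The two vectors have the same length but lie in different subspaces, so they cannot be opposite to each other.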

9. Time measurement
The problem one wants to address here is the following. Two events
$P$ and $Q$ on a world-line with four-velocity $u'$ are separated by the vector
$\Delta t'\,u'$, so the time interval between them as measured directly by the inertial
observer on this world-line is $\Delta t'$. What time-span $\Delta t$ will be measured
between these events in the frame defined by the four-velocity $u$?

9.1. Galilean spacetime


Here the answer is simple. The spacetime is equipped with the universal
time interval form Dt, so there is no doubt how to measure this interval in
any frame. One has
$$\Delta t = Dt(\Delta t'\,u') = \Delta t' . \tag{25}$$

9.2. Special relativity


If one employs the frame-dependent time interval form $Dt_u$ described in
Section 6.2, one finds

$$\Delta t = Dt_u(\Delta t'\,u') = (u \cdot u')\,\Delta t' = c(k)\,\Delta t' \tag{26}$$

(notation as in the preceding section). This gives the famous ‘time dilation’
effect. However, one should be careful to interpret this result properly. No
inertial observer from the $u$-frame can pass directly through both events $P$
and $Q$, thus the measurement in this frame is by necessity indirect. To
establish one frame-dependent time variable $t_u$, observers on the world-lines
$P + L(u)$ and $Q + L(u)$ need only agree on a choice of a constant time
hypersurface to synchronize their clocks (as the time-interval form $Dt_u$ is
known directly to both of them). After this has been settled (see below) the
time $t_u(P)$ is measured directly by the first observer, and the time $t_u(Q)$ is
measured directly by the other. The difference $t_u(Q) - t_u(P)$ gives $\Delta t$.
See Fig. 16.

Fig. 16. Time measurement.

The synchronization of clocks can be done by the radar method. The
first observer sends at his time $t_1$ a light signal towards the other one and
receives it back reflected at $t_2$. Denote by $X$ the event on the world-line of
the first observer at his time $(t_1 + t_2)/2$, and by $Y$ the event on the world-line
of the second observer at which the reflection of the light ray takes place,
see Fig. 17. If $l_1$ and $l_2$ are lightlike vectors as depicted in the figure, then

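$(t_2 - t_1)\,u = l_1 + l_2$, $\overrightarrow{XY} = (l_1 - l_2)/2$, so $u \cdot \overrightarrow{XY} = 0$
(because $(l_1 + l_2) \cdot (l_1 - l_2) = l_1^2 - l_2^2 = 0$). Thus $X$ and $Y$ lie
in one hyperplane of $u$-simultaneity and if the second observer agrees to set
his clock for $(t_1 + t_2)/2$ at $Y$, the clocks will be synchronized.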
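
The radar construction can be reproduced numerically. In the sketch below the first observer's world-line is taken as $O + L(u)$ with his clock showing 0 at $O$, and $y = \overrightarrow{OY}$ is an arbitrarily chosen reflection event; these choices, and the names `dot` and `radar_times`, are assumptions of this illustration. The emission and reception times follow from the lightlike conditions $(y - t u)^2 = 0$.

```python
import math


def dot(x, y):
    """Scalar product g(x, y) with signature (+, -, -, -)."""
    return x[0] * y[0] - x[1] * y[1] - x[2] * y[2] - x[3] * y[3]


def radar_times(u, y):
    """Emission and reception times t1 <= t2 of a signal reflected at O + y.

    Both solve (y - t u).(y - t u) = 0, i.e. t^2 - 2 t (u.y) + y.y = 0.
    """
    a = dot(u, y)
    root = math.sqrt(a * a - dot(y, y))
    return a - root, a + root


u = (1.25, 0.75, 0.0, 0.0)        # four-velocity of the first observer
y = (2.0, 3.0, 1.0, -2.0)         # reflection event Y relative to O (sample value)

t1, t2 = radar_times(u, y)
l1 = tuple(yc - t1 * uc for yc, uc in zip(y, u))   # from emission to Y
l2 = tuple(t2 * uc - yc for yc, uc in zip(y, u))   # from Y to reception
xy = tuple(yc - 0.5 * (t1 + t2) * uc for yc, uc in zip(y, u))  # vector from X to Y

print(t1, t2)
print(dot(l1, l1), dot(l2, l2))   # both vanish up to rounding: lightlike
print(dot(u, xy))                 # vanishes up to rounding: X and Y are u-simultaneous
```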

Fig. 17. Synchronization of clocks.

In real life the time dilation measurement is rarely, if at all, done this
way. Probably the most famous instance of the dilation effect is the decay of
muons produced by cosmic radiation coming to Earth. Muons are unstable
particles with a characteristic lifetime (in their rest-frames). They are pro-
duced with known energy (so also known velocity) by scattered cosmic rays.
One finds that their mean lifetime in the Earth-frame is much longer than
the characteristic one. However, what is directly measured is not any time
at all! One measures the distance they cover during their life; then knowing
their relative velocity in the Earth-frame one calculates their lifetime in this
frame.
Another type of time measurement is by registering the time of arrival
of light signals. Suppose that two inertial observers travel along world-lines
$P + L(u')$ and $P + L(u)$ respectively (thus we assume for simplicity that
they meet at $P$). Let both of them set their clocks so as to show 0 at $P$.
The $u'$-observer sends a light signal towards the $u$-observer at his time $t'$,
which arrives at the $u$-observer’s world-line at the time $t_+$ on that line. Thus
one has the equation $t' u' + l = t_+ u$, where $l$ is the lightlike, future-pointing
vector connecting these two events (see Fig. 18). We write this as
$$l = t_+ u - t' u' , \qquad l \cdot l = 0 , \qquad l \cdot u > 0 . \tag{27}$$
Solving the second equation for $t_+$ one obtains two values out of which the
third condition selects only one:
$$t_+ = (u \cdot u')\,t' + \sqrt{(u \cdot u')^2 - 1}\;|t'| = c(k)\,t' + s(k)\,|t'| . \tag{28}$$
Note that $t', t_+ < 0$ for observers approaching each other (parts of world-lines
causally preceding $P$) and $t', t_+ > 0$ for observers moving away from each

Fig. 18. Time of arrival of light signals.

other (parts of world-lines causally following $P$). Let now the $u'$-observer
send two signals at times $t'_1$ and $t'_2 > t'_1$, either both negative or both positive,
and denote $\Delta t' = t'_2 - t'_1$, $\Delta t_+ = t_{+2} - t_{+1}$. Then one finds from the above
relation that

$$\Delta t_+ = k^{-1}\,\Delta t' \quad \text{(observers moving towards each other)} ,$$
$$\Delta t_+ = k\,\Delta t' \quad \text{(observers moving away from each other)} . \tag{29}$$

Note that the result is completely different from the ‘dilation effect’.
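
The selection of the root in Eq. (27) and the resulting factors $k$ and $k^{-1}$ can be checked with a short sketch. It solves the quadratic $l \cdot l = 0$ for $t_+$ and keeps the root with $l \cdot u > 0$; the sample four-velocities (for which $k = 2$) and the names `dot` and `arrival_time` are assumptions of this illustration.

```python
import math


def dot(x, y):
    """Scalar product g(x, y) with signature (+, -, -, -)."""
    return x[0] * y[0] - x[1] * y[1] - x[2] * y[2] - x[3] * y[3]


def arrival_time(u, u1, t_emit):
    """Solve l = t_plus*u - t_emit*u1, l.l = 0, l.u > 0 for t_plus, Eq. (27)."""
    c = dot(u, u1)
    s = math.sqrt(c * c - 1.0)
    # The quadratic t_plus^2 - 2 c t_emit t_plus + t_emit^2 = 0 has these two roots.
    for t_plus in (c * t_emit + s * t_emit, c * t_emit - s * t_emit):
        l = tuple(t_plus * a - t_emit * b for a, b in zip(u, u1))
        if dot(l, u) > 0:          # keep the future-pointing solution
            return t_plus
    raise ValueError("no future-pointing lightlike vector (t_emit == 0 or u == u1)")


u = (1.0, 0.0, 0.0, 0.0)
u1 = (1.25, 0.75, 0.0, 0.0)        # plays the role of u'; here c(k) = 1.25, s(k) = 0.75
k = dot(u, u1) + math.sqrt(dot(u, u1) ** 2 - 1.0)

receding = arrival_time(u, u1, 3.0) - arrival_time(u, u1, 1.0)        # emitted after P
approaching = arrival_time(u, u1, -1.0) - arrival_time(u, u1, -3.0)   # emitted before P

print(k)
print(receding / 2.0)      # ratio Delta t_+ / Delta t' for receding observers
print(approaching / 2.0)   # ratio Delta t_+ / Delta t' for approaching observers
```

For comparison, the time dilation factor $c(k)$ of Eq. (26) is 1.25 for the same pair of observers, which differs from both ratios.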
The above connections have a directly observable physical consequence.
The light is a wave phenomenon; the change of its phase from one ray to
another is the same for each of the above observers. But the times corre-
sponding to the given change of phase, say 2π, are related as above. Thus
the frequencies of light $\nu'$ and $\nu$ for the two observers are related by

$$\nu = k\,\nu' \quad \text{(observers moving towards each other)} ,$$
$$\nu = k^{-1}\,\nu' \quad \text{(observers moving away from each other)} . \tag{30}$$

With the interpretation of the $k$-coefficient given by the second equation
in (29) we can now find a simple formula for the composition of velocities
(or rather their lengths) in the special case of three co-planar four-velocities
$u, u', u''$. Let the $k$-coefficients be denoted as in Fig. 19. This figure then
also shows that $K = k k'$. Using the last equation in (20) and Eq. (22) one
finds
$$|v(u'', u)| = \frac{|v(u', u)| + |v(u'', u')|}{1 + |v(u', u)|\,|v(u'', u')|} . \tag{31}$$

Fig. 19. Composition of k-coefficients for co-planar four-velocities: $k/1 = K/k'$, so
$K = k k'$.
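
The step from $K = k k'$ to Eq. (31) can be checked numerically. The sketch assumes the configuration of Fig. 19, with $u'$ lying between $u$ and $u''$ in their common plane, and uses illustrative speeds 0.6 and 0.5; the function names are chosen here.

```python
import math


def k_from_speed(v):
    """Bondi k-coefficient from |v| < 1, Eq. (22)."""
    return math.sqrt((1.0 + v) / (1.0 - v))


def speed_from_k(k):
    """|v| = s(k)/c(k) = (k^2 - 1)/(k^2 + 1), last equation in (20)."""
    return (k * k - 1.0) / (k * k + 1.0)


def compose(v1, v2):
    """Compose two co-planar speeds by multiplying their k-coefficients."""
    return speed_from_k(k_from_speed(v1) * k_from_speed(v2))


v1, v2 = 0.6, 0.5                        # |v(u', u)| and |v(u'', u')|, sample values
composed = compose(v1, v2)
formula = (v1 + v2) / (1.0 + v1 * v2)    # right-hand side of Eq. (31)

print(composed, formula)
```

Since the product of two finite $k$-coefficients is finite, the composed speed always stays below 1.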

We end this section with a warning against a popular error in graphical
representations of the time dilation found in many introductory texts on SR.
One of many variants is this: an individual A is speeding in a rocket towards
(or away from) another individual B, who is busy with some activity. Each
of the individuals is equipped with a clock and A watches (by ‘looking’) B’s
activity. The claim then is that A will measure B’s activity to last longer than
it lasts for B, in agreement with the time dilation formula. This, however,
is wrong; in fact A receives light signals from B, so his measurement will
give a result obeying one of the cases in Eqs. (29). In fact, for approaching
observers, the time in question is shorter.

10. Space measurement


Here we pose the following question. Two parallel world-lines with
four-velocity $u'$ are separated by a vector $z'$ which is a ‘pure space’ vector
in the $u'$-frame. What is the ‘pure space’ vector $z$ which separates them in the
frame defined by $u$? These two vectors may be thought of as connecting
two particles in a rigid body in these two frames. This latter notion has
limitations in SR: it runs into difficulty when accelerations are involved,
and then needs an input of dynamics to be modified. However, as long as
only inertial motions are involved, a rigid body may be identified with some
family of parallel world-lines. This body rests in the frame defined by these
world-lines.

10.1. Galilean spacetime


Here again the answer is simple: the ‘pure space’ directions are univer-
sally determined by S, so
$$z = z' \in S . \tag{32}$$

10.2. Special relativity


In this case ‘pure space’ means that $u' \cdot z' = u \cdot z = 0$. The condition
for $z$ to connect the same two world-lines is $z = z' + \lambda u'$ with some real $\lambda$.
Taking the scalar product of this equation with $u$ we find this coefficient and
obtain
$$z = z' - \frac{z' \cdot u}{u' \cdot u}\,u' . \tag{33}$$
These two vectors can be decomposed as

$$z' = z'_\perp + \alpha' n' , \qquad z = z_\perp + \alpha n , \tag{34}$$

where $z'_\perp$ is orthogonal to $u'$ and $n'$ (as defined at the end of Section 8),
$z_\perp$ is orthogonal to $u$ and $n$, and $\alpha, \alpha'$ are numerical constants.
Note that $z'_\perp$ and $z_\perp$ are equivalently identified as parts of $z'$ and $z$
orthogonal both to $u$ and $u'$.
Taking the scalar product of Eq. (33) with $u'$ we find
$z \cdot u' = -(z' \cdot u)/(u' \cdot u)$.
Using now Eqs. (24) and (34) we find after some simple algebra

$$z_\perp = z'_\perp , \qquad \alpha = -\frac{\alpha'}{c(k)} . \tag{35}$$
The second of these equations describes the effect of the so-called ‘length
contraction’, whose popular formulation could run as: ‘the dimensions parallel
to the relative velocity measured by the moving observer are by the
factor 1/c(k) shorter than those measured by the observer at rest with respect
to the object being measured’. However, one should note that this
formulation and the term ‘contraction’ are somewhat misleading:
1. The vectors $z'$ and $z$ connect two different pairs of events on the two
world-lines considered; nothing is being ‘contracted’. Events separated
by $z'$ are simultaneous in the rest frame of the ‘rigid body’, while those
separated by $z$ are simultaneous for the moving observer.
2. The vectors $n'$ and $n$ (pointing in the directions of the two respective
velocities) are not even parallel, so for each of the frames the term
‘parallel to the velocity’ means something different.
Figure 20 illustrates the situation for the special case $z'_\perp = z_\perp = 0$, which
means that for the $u$-observer the rigid rod with ends on the two world-lines
moves parallel to its axis.
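Eqs. (33) and (35) can be checked numerically. The following is only a minimal sketch: it assumes units with the speed of light equal to 1, the signature (+, −, −, −), and the identification $c(k) = u\cdot u'$; the relative speed and the components of $z'$ are illustrative values, and the helper names are hypothetical.

```python
import math

def dot(a, b):
    """Minkowski product, signature (+, -, -, -), speed of light = 1."""
    return a[0]*b[0] - a[1]*b[1] - a[2]*b[2] - a[3]*b[3]

def pure_space_separation(z_rest, u_rest, u):
    """Eq. (33): z = z' - (z'.u / u'.u) u', with z_rest = z' and u_rest = u'."""
    lam = -dot(z_rest, u) / dot(u_rest, u)
    return tuple(zc + lam * uc for zc, uc in zip(z_rest, u_rest))

v = 0.6                                  # illustrative relative speed
gamma = 1.0 / math.sqrt(1.0 - v * v)     # equals u.u', taken here as c(k)
u = (1.0, 0.0, 0.0, 0.0)                 # four-velocity of the observer
u_rest = (gamma, gamma * v, 0.0, 0.0)    # four-velocity u' of the rigid body

# z' is 'pure space' for u': unit length along the relative velocity,
# 0.5 across it (illustrative numbers).
z_rest = (gamma * v, gamma, 0.5, 0.0)
z = pure_space_separation(z_rest, u_rest, u)

print("z' . u' =", dot(z_rest, u_rest))      # vanishes: z' is pure space for u'
print("z =", z)                              # time component vanishes for u
print("longitudinal ratio =", abs(z[1]))     # compare with 1/c(k)
print("1/c(k) =", 1.0 / gamma)
```

The transverse component is unchanged, while the longitudinal one is reduced by the factor 1/c(k), in line with Eq. (35). Note that $z'$ and $z$ are different vectors connecting different pairs of events.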
Fig. 20. Space measurement.

The proper understanding of the above dismisses various ‘length contraction
paradoxes’ in SR (see e.g. [6]). The key to all of them is a cautious
analysis of the relation between various vectors involved in the problem.
We illustrate this with a geometrical situation whose variants lie at the
base of most of these effects. Suppose we have two pairs of parallel world-lines:
$P + L(u')$, $Q + L(u')$, and $P + L(u)$, $Q + L(u)$, so that the first lines
in these pairs intersect at $P$, and the second lines intersect at $Q$. Physically
this may be thought of as modeling two rigid rods in relative motion, the
ends of the first and the second rod described by the lines in the first and
in the second pair respectively. The ‘front’ ends of the rods meet at some
point and similarly the ‘back’ ends meet at some other point. Let $z'$ and $w$
be the ‘pure space’ vectors (in respective rest-frames) connecting the ends of
rods and denote $x = \overrightarrow{PQ}$. (See Fig. 21. The picture might suggest that the
rods are bound to clash and cannot ‘go through’. This is because we lack in
the picture the fourth dimension, which may be used to slightly detach the
world-lines of the rods.)

Fig. 21. Two rods with ends meeting at P and Q respectively.

Then one has
\[ x = z' + \mu' u' = w + \nu u \tag{36} \]
with some constants $\mu'$, $\nu$. We decompose $z'$ as in the first Eq. (34) and
similarly write
\[ w = w_\perp + \beta n , \qquad w' = w_\perp - \frac{\beta}{c(k)}\,n' \tag{37} \]
(the second formula, obtained in analogy with Eqs. (34) and (35), is written
down for later use). As $n$ and $n'$ can be expressed as linear combinations
of $u$ and $u'$ (see Eq. (24)), the consistency condition for the second equality
in (36) is
\[ z'_\perp = w_\perp , \tag{38} \]
and then the constants $\mu'$ and $\nu$ have unique solutions, which we do not
need to write down explicitly.
The geometry of the situation is clear and no interpretational difficulty
arises if one insists on this four-dimensional picture. However, if one uses the
‘length contraction’ language, ‘paradoxes’ easily arise. Suppose, for instance,
that the vector $x$ is spacelike (as in Fig. 21) and consider any frame with
a four-velocity orthogonal to $x$. Then the intersecting of lines has this
interpretation: in each of these frames the two rods pass each other in parallel,
with both respective ends simultaneously coming into contact. But now the
‘paradoxical’ problem arises: if we go to some other frame not in this family,
then due to the different velocities of the two rods they will change their size
in different ways, so the ends cannot meet. The simple explanation is, of
course, that what is simultaneous in one frame usually is not simultaneous
in another, which falsifies the above conclusion. Moreover, rods
moving in parallel in one frame usually do not remain parallel in another.
To illustrate the last point suppose that in the above geometrical setting
$x = w$, i.e. the rods are parallel and of equal length in the $u$-frame.
This means that $w = z$, and decomposing these vectors as before we find
$\alpha' = -c(k)\beta$. Using this and Eq. (38) we find
\[ w' = w_\perp - \frac{\beta}{c(k)}\,n' , \qquad z' = w_\perp - c(k)\beta\, n' . \tag{39} \]
These vectors are parallel if, and only if, $w_\perp = 0$ or $\beta = 0$. In all other cases
the rods move in the $u'$-frame askew to each other. This is illustrated in Fig. 22.
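The two vectors of Eq. (39) can also be obtained directly by projection, which gives a simple numerical check. This is a sketch under stated assumptions: units with the speed of light equal to 1, signature (+, −, −, −), $c(k) = u\cdot u'$, and $n' = (u - c(k)u')/s(k)$ taken as the form implied by Eq. (24), which is not reproduced in this section. The values of `v`, `beta` and `h` are illustrative.

```python
import math

def dot(a, b):
    """Minkowski product, signature (+, -, -, -)."""
    return a[0]*b[0] - a[1]*b[1] - a[2]*b[2] - a[3]*b[3]

def add_scaled(a, factor, b):
    """Return a + factor * b, componentwise."""
    return tuple(x + factor * y for x, y in zip(a, b))

v = 0.6
c = 1.0 / math.sqrt(1.0 - v * v)       # c(k) = u.u'
s = math.sqrt(c * c - 1.0)             # s(k)
u = (1.0, 0.0, 0.0, 0.0)
u_p = (c, c * v, 0.0, 0.0)             # u'
n_p = tuple((a - c * b) / s for a, b in zip(u, u_p))   # assumed n'

beta, h = 1.0, 0.5
w = (0.0, beta, h, 0.0)                # w = w_perp + beta n, with n = (0, 1, 0, 0)

# Second rod (at rest for u), measured in the u'-frame: move along u
# until the separation is orthogonal to u'.
w_p = add_scaled(w, -dot(w, u_p) / dot(u, u_p), u)
# First rod (at rest for u') with z = w in the u-frame: move along u'
# until the separation is orthogonal to u'.
z_p = add_scaled(w, -dot(w, u_p), u_p)

# Longitudinal components along n' (Euclidean product is minus the
# Minkowski product on space vectors); the transverse part is h for both.
w_long = -dot(w_p, n_p)
z_long = -dot(z_p, n_p)
print(w_long, -beta / c)
print(z_long, -c * beta)
```

Both vectors share the transverse part but have different longitudinal parts, so they are not parallel unless `h` or `beta` vanishes.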

11. Non-inertial motions, proper time, simultaneity


Inertial motions, as we have seen, have a special role to play for the
interpretation of the geometry of spacetime. However, the picture would
not be complete without mentioning other, non-inertial, motions. Straight
lines are special examples in the more general class of curves. A regular
curve may be defined as a set of points obtained as values of a differentiable
mapping $\lambda \mapsto P(\lambda)$, where $\lambda$ is a real parameter taking values in some (finite
or not) interval on the real axis. The curve is invariant under a change of
parameter $\lambda = f(\lambda')$, where $f$ is differentiable together with its inverse.
Each regular curve has at each of its points $P(\lambda)$ a tangent vector defined as
$dP(\lambda)/d\lambda$. The extension of tangent vectors changes with the change of
parameter (but the tangent straight lines they generate remain unchanged).

Fig. 22. Two rods moving parallelly in $u$-frame, and askew in $u'$-frame.
We now define a general world-line as a curve with a four-velocity as
its tangent vector at each of its points. We say that $\tau$ is a proper time of
a world-line if it has the form $\tau \mapsto P(\tau)$ and the equation
\[ \frac{dP(\tau)}{d\tau} = u(\tau) \tag{40} \]
defines at each point the tangent four-velocity $u(\tau)$. Physically, proper time
intervals are measured by clocks traveling along the world-line. Integrating
the above equation one obtains
\[ \overrightarrow{P_1P_2} = \int_{\tau_1}^{\tau_2} u(\tau)\,d\tau , \quad \text{where } P_i = P(\tau_i) . \tag{41} \]
Note that sums of four-velocities are future-pointing timelike vectors, so $P_2$ is
in the future of $P_1$. One introduces also the concept of the four-acceleration:
\[ a(\tau) = \frac{du(\tau)}{d\tau} . \tag{42} \]
Note that acceleration, like relative velocity, points in a ‘purely spatial’
direction:
\[ \text{GS:}\quad Dt(a(\tau)) = \frac{d}{d\tau}\,Dt(u(\tau)) = 0 , \]
\[ \text{SR:}\quad Dt_{u(\tau)}(a(\tau)) = u(\tau)\cdot a(\tau) = \frac{1}{2}\frac{d}{d\tau}[u(\tau)]^2 = 0 . \tag{43} \]
However, unlike relative velocity, the acceleration is absolute — it does not
need a reference observer.
We now want to find
1. what is the relation of the proper time to affine time functions defined
earlier,
2. does the presence of acceleration influence the concept of simultaneity?

11.1. Galilean spacetime


We apply the linear form $Dt$ to both sides of Eq. (41) and find
\[ t(P_2) - t(P_1) = Dt\big(\overrightarrow{P_1P_2}\big) = \int_{\tau_1}^{\tau_2} Dt(u(\tau))\,d\tau = \tau_2 - \tau_1 . \tag{44} \]

Thus the proper time intervals are identical with the absolute time intervals.
Also, the notion of simultaneity is in no way influenced by accelerations.

11.2. Special relativity


Here we take the form $Dt_u$ and then proceed as in the Galilean case to
find
\[ t_u(P_2) - t_u(P_1) = \int_{\tau_1}^{\tau_2} u\cdot u(\tau)\,d\tau \ge \tau_2 - \tau_1 \tag{45} \]
(the inequality holds because $u\cdot u(\tau) \ge 1$ for any two four-velocities, with
equality only for $u(\tau) = u$). Therefore, the proper time interval is always
smaller than any affine time function interval, except for the case when
$u(\tau) \equiv u$. The latter case gives
simply $P(\tau) = P(\tau_1) + (\tau - \tau_1)u$, which is an inertial motion; proper time
intervals are then equal to the $u$-inertial time intervals on that line. In
general this is not the case. However, put $\tau_1 = \tau$, $\tau_2 = \tau + d\tau$ and $u = u(\tau)$.
Then we find
\[ t_{u(\tau)}(P(\tau + d\tau)) - t_{u(\tau)}(P(\tau)) = d\tau , \tag{46} \]
so locally the proper time interval is equal to the time interval as defined
earlier for inertial motions.

With accelerated motions in play it is now possible to let two general
observers start from $P_1$, take different routes, and then meet again at $P_2$.
In general their clocks will show different time intervals between these two
events. In particular, let the first observer go straight from $P_1$ to $P_2$ along
an inertial world-line, and let $u$ be his four-velocity. Then his clock will
show the interval $t_u(P_2) - t_u(P_1)$, which is always more than the reading of
the proper time interval for any accelerated observer. There is no paradox
here (the famous ‘twin paradox’): the accelerations, as noted above, are
absolute, so there is no symmetry between the observers.
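The inequality (45) can be illustrated by integrating Eq. (41) numerically for a round trip. The sketch below is an illustration only: it assumes motion along a single space axis, units with the speed of light equal to 1, and a hypothetical rapidity profile chosen so that the traveller returns to the inertial observer; the function name and all numbers are illustrative.

```python
import math

def elapsed(rapidity, tau_total, steps=20000):
    """Midpoint-rule integration of Eq. (41) for motion along one space axis.

    The four-velocity is u(tau) = (cosh(phi), sinh(phi), 0, 0) with
    phi = rapidity(tau). Returns the inertial time and the displacement
    in the frame of u = (1, 0, 0, 0).
    """
    d_tau = tau_total / steps
    t = 0.0
    x = 0.0
    for i in range(steps):
        phi = rapidity((i + 0.5) * d_tau)
        t += math.cosh(phi) * d_tau    # integrand u . u(tau) of Eq. (45)
        x += math.sinh(phi) * d_tau    # space part of the displacement
    return t, x

tau_total = 10.0    # proper time shown by the traveller's clock
phi_max = 1.5       # peak rapidity (hypothetical value)

def round_trip(tau):
    # Odd about the midpoint of the trip, so the displacement integrates to zero.
    return phi_max * math.sin(2.0 * math.pi * tau / tau_total)

t_u, x_end = elapsed(round_trip, tau_total)
t_inertial, _ = elapsed(lambda tau: 0.0, tau_total)

print("traveller's proper time:", tau_total)
print("inertial observer's time between the two meetings:", t_u)
print("final displacement:", x_end)
```

The inertial clock shows more time than the accelerated one, and the two readings agree only when the rapidity vanishes identically.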
Consider now simultaneity. Suppose that for an observer on the world-line
$P(\tau)$ we can extend this notion in the way determined by his local
position and four-velocity: an event $X$ is from his point of view simultaneous
with the event $P(\tau)$ iff $\overrightarrow{P(\tau)X}\cdot u(\tau) = 0$. However, this leads to
conceptual difficulties. To see this suppose the observer crosses $P_1$ with
a four-velocity $u_1$ and then $P_2$ with a different four-velocity $u_2$. The two
corresponding simultaneity hyperplanes cross on the 2-plane of events $X$
determined by the linear system
\[ \overrightarrow{P_iX}\cdot u_i = 0 , \quad i = 1, 2 . \tag{47} \]
Take any event $X$ on this 2-plane and put $X'_i = X + \overrightarrow{P_iX}$. We have
$\overrightarrow{P_iX'_i} = 2\overrightarrow{P_iX}$, so $X'_i$ is simultaneous with $P_i$. At the same time there is
$\overrightarrow{X'_1X'_2} = -\overrightarrow{P_1P_2}$. Therefore $X'_2$ is in the past of $X'_1$. Thus an event which according
to the above definition is simultaneous with $P_1$ turns out to be in the future
of an event simultaneous with a later event $P_2$ (see Fig. 23).

Fig. 23. Accelerated motion and simultaneity.
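A concrete instance of this construction may help. The sketch below assumes units with the speed of light equal to 1 and the signature (+, −, −, −); the events $P_1$, $P_2$ and the velocity at $P_2$ are illustrative choices, and the event `X` solving Eq. (47) was found by hand for these numbers.

```python
import math

def dot(a, b):
    """Minkowski product, signature (+, -, -, -)."""
    return a[0]*b[0] - a[1]*b[1] - a[2]*b[2] - a[3]*b[3]

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

v = 0.5                                   # velocity reached at P2 (illustrative)
g = 1.0 / math.sqrt(1.0 - v * v)
P1, u1 = (0.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0)
P2, u2 = (2.0, 0.5, 0.0, 0.0), (g, g * v, 0.0, 0.0)

# One event on both simultaneity hyperplanes of Eq. (47):
# the time coordinate is that of P1, the space coordinate solves (X - P2).u2 = 0.
X = (0.0, P2[1] - P2[0] / v, 0.0, 0.0)

X1p = add(X, sub(X, P1))     # X'_1 = X + P1X, simultaneous with P1
X2p = add(X, sub(X, P2))     # X'_2 = X + P2X, simultaneous with P2
gap = sub(X1p, X2p)          # vector from X'_2 to X'_1

print("X =", X)
print("X'_1 =", X1p, " X'_2 =", X2p)
print("X'_1 - X'_2 =", gap, " P2 - P1 =", sub(P2, P1))
```

The vector from $X'_2$ to $X'_1$ equals $\overrightarrow{P_1P_2}$, which is future-pointing and timelike, so the event assigned to the earlier $P_1$ lies in the causal future of the event assigned to the later $P_2$.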


This difficulty should by no means be interpreted as an argument against
the objectivity of the ‘direction of time flow’. This latter notion should be
simply identified with the choice of the causal orientation and the emerging
partial order Q ≥ P , as discussed in Section 4. The difficulty rather points
to the weakness of the notion of simultaneity, its restricted applicability
and, to some degree, its conventional character. It also shows that the strict
‘dilation’ and ‘contraction’ problems are of rather academic nature.

12. Four-momentum, four-angular momentum and their conservation

The four-momentum of a particle with mass $m_1$ and four-velocity $u_1$ is
given by
\[ p_1 = m_1 u_1 . \tag{48} \]
If one chooses a reference point $O$ and $x_1$ is a vector from this point to the
position of the particle then the four-angular momentum tensor is defined by
\[ L_1 = 2 x_1 \wedge p_1 . \tag{49} \]
Let $p_1, \ldots, p_k$ be the initial and $p'_1, \ldots, p'_l$ the final four-momenta in a
conservative mechanical process. The invariant laws of momentum and angular
momentum conservation say
\[ \sum_{i=1}^{k} p_i = \sum_{j=1}^{l} p'_j , \qquad \sum_{i=1}^{k} L_i = \sum_{j=1}^{l} L'_j . \tag{50} \]

12.1. Galilean spacetime


Here the mass is an invariant of the four-momentum given by
$m_1 = Dt(p_1)$. The decomposition of the four-momentum with respect to
the frame defined by the four-velocity $u$ is thus
\[ p_1 = m_1 u + p_{1u} , \tag{51} \]
see Fig. 24. We see thus that the law of conservation of mass and the law
of conservation of momentum are aspects of one observer-invariant law of
conservation of four-momentum.

Fig. 24. Four-momentum in GS.

12.2. Special relativity


The mass again is an invariant, but formed in another way: $p_1\cdot p_1 = m_1^2$.
Then in the $u$-frame we have
\[ p_1 = E_{1u}\,u + p_{1u} , \qquad E_{1u}^2 - |p_{1u}|^2 = m_1^2 , \tag{52} \]
see Fig. 25. $E_{1u}$ has the interpretation of the energy as seen in the chosen
frame. Now the aspects of the observer-invariant law of conservation of
four-momentum are laws of energy and momentum conservation, while the sum
of masses need not be conserved.

Fig. 25. Four-momentum in SR.
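That the sum of masses need not be conserved can be seen in the simplest process, a head-on collision of two equal particles that merge into one. The sketch below assumes units with the speed of light equal to 1 and the signature (+, −, −, −); the masses, the speed and the helper name `four_momentum` are illustrative.

```python
import math

def dot(a, b):
    """Minkowski product, signature (+, -, -, -)."""
    return a[0]*b[0] - a[1]*b[1] - a[2]*b[2] - a[3]*b[3]

def four_momentum(m, vx):
    """p = m u for a particle moving with velocity vx along the first space axis."""
    g = 1.0 / math.sqrt(1.0 - vx * vx)
    return (m * g, m * g * vx, 0.0, 0.0)

m, v = 1.0, 0.6
p1 = four_momentum(m, v)
p2 = four_momentum(m, -v)

# Conservation of four-momentum, Eq. (50): the merged particle carries P = p1 + p2.
P = tuple(a + b for a, b in zip(p1, p2))
M = math.sqrt(dot(P, P))        # its invariant mass, from P.P = M^2

print("sum of initial masses:", 2 * m)
print("mass of the merged particle:", M)
print("energy in this frame:", P[0], " momentum:", P[1:])
```

Energy and momentum are conserved by construction, while the final mass exceeds the sum of the initial masses.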

We observe that the geometrical analogy is:

Galilean mass ↔ Einsteinian energy
(and not energy ↔ energy). This analogy is further confirmed when one
considers the time-space part of the conservation of four-angular momentum.
For freely moving particles one obtains the law of uniform motion of center
of mass in the Galilean case, and of center of energy in the SR case.

13. Galilean kinetic energy


The question then arises what is the geometrical status of the Galilean
kinetic energy and whether its conservation has an invariant character.
To answer this observe that while there is no geometrical numerical invariant
formed out of the space-part of a single timelike vector, one can form
a respective invariant for a pair of such vectors. Let $Dt(p_i) = m_i$, $i = 1, 2$,
and let $u$ be any four-velocity. Then $p_i = m_i u + p_{iu}$, so that
\[ \frac{p_1}{m_1} - \frac{p_2}{m_2} = \frac{p_{1u}}{m_1} - \frac{p_{2u}}{m_2} \in S . \tag{53} \]
Thus the number
\[ d(p_1, p_2) = \frac{m_1 m_2}{2} \left| \frac{p_{1u}}{m_1} - \frac{p_{2u}}{m_2} \right|^2 \ge 0 \tag{54} \]
does not depend on $u$ (see Fig. 26). For momenta $p_1, \ldots, p_k$ it is now easy
to show that
\[ \sum_{i,j=1}^{k} d(p_i, p_j) = 2ME - |P_u|^2 \ge 0 , \tag{55} \]
where
\[ P = \sum_{i=1}^{k} p_i , \qquad P = M u + P_u , \qquad E = \sum_{i=1}^{k} \frac{|p_{iu}|^2}{2 m_i} . \tag{56} \]

Fig. 26. Galilean invariant of two causal vectors: $|p_{2u}/m_2 - p_{1u}/m_1|$.

We learn two facts:

1. If the total four-momentum is conserved, then the condition of energy
conservation is Galilean invariant.
2. There is always $E \ge |P_u|^2/2M$, and the equality holds if, and only if,
all momenta are parallel.
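Eq. (55) and its frame independence are easy to test numerically. In the sketch below a Galilean four-momentum is stored as a pair (mass, space part in a chosen frame), and a change of frame $u \to u + w$ with $w \in S$ shifts the space part by $-m\,w$; this representation, the function names and the sample numbers are assumptions made for illustration.

```python
import math

def d(pa, pb):
    """Eq. (54) for momenta given as (m, (px, py, pz)) in one inertial frame."""
    (ma, qa), (mb, qb) = pa, pb
    return 0.5 * ma * mb * sum((x / ma - y / mb) ** 2 for x, y in zip(qa, qb))

def change_frame(p, w):
    """Space part relative to u + w: since p = m u + p_u, it equals p_u - m w."""
    m, q = p
    return (m, tuple(x - m * wx for x, wx in zip(q, w)))

def two_ME_minus_P2(momenta):
    """Right-hand side of Eq. (55), with M, E and P_u as in Eq. (56)."""
    M = sum(m for m, _ in momenta)
    E = sum(sum(x * x for x in q) / (2.0 * m) for m, q in momenta)
    P = [sum(q[i] for _, q in momenta) for i in range(3)]
    return 2.0 * M * E - sum(x * x for x in P)

momenta = [
    (1.0, (1.0, 0.0, 0.0)),
    (2.0, (0.0, -3.0, 1.0)),
    (0.5, (0.2, 0.4, -1.0)),
]
other_frame = [change_frame(p, (0.7, -0.2, 1.3)) for p in momenta]

lhs = sum(d(a, b) for a in momenta for b in momenta)     # double sum in Eq. (55)
print(lhs, two_ME_minus_P2(momenta), two_ME_minus_P2(other_frame))
```

The three printed numbers agree up to rounding, although the kinetic energy $E$ and the momentum $P_u$ separately differ between the two frames.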

14. Celestial sphere


We fix a reference point O and consider all light rays coming into this
point. Imagine a world-line of an inertial observer with four-velocity u passes
through this point. At this point the observer positions the space directions
from which all light rays arrive. We want to find how the picture obtained
in this way depends on the four-velocity u of the observer.

14.1. Galilean spacetime


Here we assume that the light rays propagate with infinite speed. Thus
the straight lines of the rays lie in the hyperplane O + S, and their direc-
tional vectors are in S. But for such vectors the decomposition (7) is trivial
and independent of u. Therefore the picture formed by light on the celes-
tial sphere is independent of the choice of particular observer crossing the
point O.

14.2. Special relativity


A light ray with the directional past-pointing vector $-l \in V$ comes from
the space direction pointed by the unit spacelike vector
\[ r(l, u) = \frac{-l_u}{|l_u|} = -\frac{l - (u\cdot l)\,u}{u\cdot l} = -\frac{l}{u\cdot l} + u , \tag{57} \]
where we have used the fact that $|l_u|^2 = -l_u\cdot l_u = (u\cdot l)^2$ (see Fig. 27). If $u'$
is the four-velocity of another observer passing $O$ and we denote for brevity
$r = r(l, u)$, $r' = r(l, u')$ then we find
\[ \frac{u'\cdot l}{u\cdot l} = (u - r)\cdot u' = c(k) + s(k)\, n\circ r . \tag{58} \]

Fig. 27. Celestial sphere.

Using this and Eq. (57) for $r$ and $r'$ we find the transformation $r \mapsto r'$ of
the celestial sphere of the $u$-observer to the sphere of the $u'$-observer:
\[ r' = u' + \frac{r - u}{c(k) + s(k)\, n\circ r} . \tag{59} \]
Taking the scalar product of this equation with $u$ we find, in particular, the
well-known aberration formula:
\[ n'\circ r' = -\frac{s(k) + c(k)\, n\circ r}{c(k) + s(k)\, n\circ r} \tag{60} \]
(the difference in signs is due to the direction of $n$ and $n'$).
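Eqs. (59) and (60) can be compared with a direct evaluation of Eq. (57) in both frames. This is a sketch under stated assumptions: units with the speed of light equal to 1, signature (+, −, −, −), $c(k) = u\cdot u'$, and $n = (u' - c(k)u)/s(k)$, $n' = (u - c(k)u')/s(k)$ taken as the forms implied by Eq. (24), which is not reproduced in this section. The speed and the angle are illustrative.

```python
import math

def dot(a, b):
    """Minkowski product, signature (+, -, -, -)."""
    return a[0]*b[0] - a[1]*b[1] - a[2]*b[2] - a[3]*b[3]

def sp(a, b):
    """Euclidean product of space vectors: minus the Minkowski product."""
    return -dot(a, b)

def direction(l, u):
    """Eq. (57): r(l, u) = u - l / (u.l)."""
    ul = dot(u, l)
    return tuple(uc - lc / ul for uc, lc in zip(u, l))

v = 0.6
c = 1.0 / math.sqrt(1.0 - v * v)      # c(k) = u.u'
s = c * v                             # s(k)
u = (1.0, 0.0, 0.0, 0.0)
u_p = (c, s, 0.0, 0.0)                # u'
n = tuple((a - c * b) / s for a, b in zip(u_p, u))      # assumed n
n_p = tuple((a - c * b) / s for a, b in zip(u, u_p))    # assumed n'

theta = 1.0                           # angle between r and n for the u-observer
l = (1.0, -math.cos(theta), -math.sin(theta), 0.0)      # future-pointing null vector

r = direction(l, u)                   # direction seen by the u-observer
r_p = direction(l, u_p)               # direction seen by the u'-observer

D = c + s * sp(n, r)
r_p_formula = tuple(a + (b - e) / D for a, b, e in zip(u_p, r, u))   # Eq. (59)
cos_p_formula = -(s + c * sp(n, r)) / D                              # Eq. (60)

print(r_p, r_p_formula)
print(sp(n_p, r_p), cos_p_formula)
```

With these choices $n\circ r = \cos\theta$, and Eq. (60) reduces to the familiar form $-(v + \cos\theta)/(1 + v\cos\theta)$.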
A small variation of the direction of the light ray induces small variations
$\delta r$ and $\delta r'$, which are tangent to the two respective celestial spheres. The
linear transformation $\delta r \mapsto \delta r'$ is found by varying Eq. (59):
\[ \delta r' = \frac{\delta r}{c(k) + s(k)\, n\circ r} + \frac{s(k)\, n\circ\delta r}{[c(k) + s(k)\, n\circ r]^2}\,(u - r) . \tag{61} \]
Taking now two different variations $\delta_1$ and $\delta_2$ and using the constraints
$u\cdot\delta r = r\cdot\delta r = 0$ we find
\[ \delta_1 r'\circ\delta_2 r' = \frac{\delta_1 r\circ\delta_2 r}{[c(k) + s(k)\, n\circ r]^2} . \tag{62} \]
This equation tells us that the linear transformation $\delta r \mapsto \delta r'$ differs only
by the factor $[c(k) + s(k)\, n\circ r]^{-1}$ from an isometric transformation. Thus
locally (in the first order in $\delta r$) the picture registered on the celestial sphere
scales by this factor without a change of the shape (the angles) [5].
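The conformal property expressed by Eq. (62) can be checked by applying the linear map (61) to two tangent variations. The sketch below uses the same assumptions as elsewhere in these illustrations (speed of light equal to 1, signature (+, −, −, −), $c(k) = u\cdot u'$, $u' = c(k)u + s(k)n$); the particular direction $r$ and the variations `d1`, `d2` are illustrative.

```python
import math

def dot(a, b):
    """Minkowski product, signature (+, -, -, -)."""
    return a[0]*b[0] - a[1]*b[1] - a[2]*b[2] - a[3]*b[3]

def sp(a, b):
    """Euclidean product of space vectors: minus the Minkowski product."""
    return -dot(a, b)

v = 0.6
c = 1.0 / math.sqrt(1.0 - v * v)
s = c * v
u = (1.0, 0.0, 0.0, 0.0)
n = (0.0, 1.0, 0.0, 0.0)
u_p = tuple(c * a + s * b for a, b in zip(u, n))    # u' = c(k) u + s(k) n

theta = 1.0
r = (0.0, math.cos(theta), math.sin(theta), 0.0)    # a point of the u-sphere
D = c + s * sp(n, r)

def transform(dr):
    """Eq. (61) applied to a variation dr orthogonal to u and r."""
    k = s * sp(n, dr) / (D * D)
    return tuple(x / D + k * (uc - rc) for x, uc, rc in zip(dr, u, r))

# Two tangent variations (both orthogonal to u and to r).
d1 = (0.0, -math.sin(theta), math.cos(theta), 0.0)
d2 = (0.0, -0.3 * math.sin(theta), 0.3 * math.cos(theta), 1.0)
t1, t2 = transform(d1), transform(d2)

def cos_angle(a, b):
    return sp(a, b) / math.sqrt(sp(a, a) * sp(b, b))

print(sp(t1, t2), sp(d1, d2) / (D * D))      # the two sides of Eq. (62)
print(cos_angle(t1, t2), cos_angle(d1, d2))  # angles are preserved
```

The transformed variations are orthogonal to $u'$, lengths scale by the common factor $1/[c(k) + s(k)\,n\circ r]$, and the angle between the two variations is unchanged.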
Larger areas on the celestial sphere lose this scaling property and undergo
more complicated transformations. However, one feature of the local
transformation survives. To find it choose a spacelike vector $z$, $z^2 < 0$, and
consider among vectors $-l$ all those which satisfy the equation
\[ z\cdot l = 0 . \tag{63} \]
Using the geometrical quantities correlated to $u$ the spacelike character of $z$
is written down as $(u\cdot z)^2 < |z_u|^2$ and the above condition on $l$’s takes the
form
\[ r(l, u)\circ\frac{z_u}{|z_u|} = -\frac{u\cdot z}{|z_u|} = \cos[\phi(z, u)] , \tag{64} \]
where the last equality defines the angle $\phi(z, u)$. This equation tells us
that the vectors $r(l, u)$ are all those which form the angle $\phi(z, u)$ with the
vector $z_u/|z_u|$. Thus they form a circle on the celestial sphere. This fact
is independent of the choice of a particular observer (its vector $u$) crossing
the point $O$. However, the angle $\phi(z, u)$ does depend on this choice. Note
in particular that if Eq. (63) determines a ‘great circle’ for the observer
with four-velocity $u$ (i.e. $\phi(z, u) = \pi/2$), this circle will in general cease to
be ‘great’ for the one with the four-velocity $u'$. The exceptional cases when
‘great’ goes to ‘great’ are those determined by $z$ orthogonal both to $u$ and $u'$.
To summarize, the picture obtained on the celestial sphere undergoes
deformation from one observer to another, but in such a way that angles
are conserved and circles become circles, although the ‘greatness’ property
is usually not conserved. This is illustrated in Figs. 28 and 29.

Fig. 28. A bicycle wheel at rest.



Fig. 29. The same wheel as seen by a fast moving observer.

I am grateful to my colleague Piotr Bizoń for his suggestion to expand
what originally was a lecture presentation into this article, and for careful
reading of the manuscript.

Appendix
Theorem. The cone $V$ determines uniquely up to a constant factor a symmetric
bilinear form $g$ such that $x \in V \iff g(x, x) = 0$.

Proof. In a canonical basis $V$ takes the form given in Eq. (1), which is
equivalent to $x^0 = \pm\sqrt{\sum_{i=1}^{3}(x^i)^2}$. If this implies
$g(x, x) = \sum_{\mu,\nu=0}^{3} g_{\mu\nu}\, x^\mu x^\nu = 0$, then the conditions
\[ (g_{ik} + g_{00}\delta_{ik})\, x^i x^k \pm 2 g_{0i} \sqrt{\textstyle\sum_{j=1}^{3}(x^j)^2}\; x^i = 0 \]
must be satisfied identically (for any numbers $x^i$, $i = 1, 2, 3$). Thus $g_{0i} = 0$,
$i = 1, 2, 3$, and $g_{ik} + g_{00}\delta_{ik} = 0$, $i, k = 1, 2, 3$. Therefore in this frame
$g(x, y) = g_{00}\big(x^0 y^0 - \sum_{i=1}^{3} x^i y^i\big)$.

REFERENCES

[1] H. Bondi, Relativity and Common Sense, Doubleday & Company, New York
1964.
[2] W. Kopczyński, A. Trautman, Spacetime and Gravitation, John Wiley & Sons,
1992 [Polish original publication: Ossolineum, 1971].
[3] R. Geroch, General Relativity from A to B, The University of Chicago Press,
Chicago & London 1978.
[4] R. Penrose, Proc. Cambridge Phil. Soc. 55, 137 (1959).
[5] J. Terrell, Phys. Rev. 116, 1041 (1959).
[6] W. Rindler, Am. J. Phys. 29, 365 (1961); R. Shaw, Am. J. Phys. 30, 72
(1962).
