The Principle of Relativity

| g_μα g^αν | = | δ^ν_μ | = 1

From this it follows that (17) | g_μν | · | g^μν | = 1.

Invariant of volume.

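We first seek the transformation law for the determinant g = | g_μν |. According to (11)

g′ = | (∂x_μ/∂x′_σ)(∂x_ν/∂x′_τ) g_μν |.

From this, by applying the law of multiplication of determinants twice, we obtain

g′ = | ∂x_μ/∂x′_σ | · | ∂x_ν/∂x′_τ | · | g_μν | = | ∂x_μ/∂x′_σ |² g,

or

(A) √g′ = | ∂x_μ/∂x′_σ | √g.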

On the other hand the law of transformation of the volume element

dτ = ∫ dx₁ dx₂ dx₃ dx₄

is, according to the well-known law of Jacobi,

(B) dτ′ = | ∂x′_σ/∂x_μ | dτ.

By multiplication of the last two equations (A) and (B) we get

(18) √g′ dτ′ = √g dτ.

Instead of √g, we shall afterwards introduce √(-g) which has a real value on account of the hyperbolic character of the time-space continuum. The invariant √(-g)dτ, is equal in magnitude to the four-dimensional volume-element measured with solid rods and clocks, in accordance with the special relativity theory.
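As an illustration of the invariant √(-g)dτ (a sketch added here, not part of the original argument), we may take a region in which the special relativity theory holds and describe it in spherical polar co-ordinates. The co-ordinate names r, θ, φ, t and the sign convention of ds² are assumptions made for this example.

```latex
% Illustrative sketch (requires amsmath).
% Assumed: co-ordinates (r, \theta, \varphi, t) and signature (-,-,-,+).
\begin{align*}
ds^2 &= -dr^2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\varphi^2 + dt^2,\\
g &= (-1)\,(-r^2)\,(-r^2\sin^2\theta)\,(+1) = -r^4\sin^2\theta,\\
\sqrt{-g}\;d\tau &= r^2\sin\theta\;dr\,d\theta\,d\varphi\,dt .
\end{align*}
```

The result is the familiar volume element of polar co-ordinates multiplied by dt, i.e. the volume measured with rods and clocks.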

Remarks on the character of the space-time continuum. Our assumption that in an infinitely small region the special relativity theory holds leads us to conclude that ds² can always, according to (1), be expressed in real magnitudes dX₁ ... dX₄. If we call dτ₀ the “natural” volume element dX₁ dX₂ dX₃ dX₄, we have thus (18a) dτ₀ = √(-g) dτ.

Should √(-g) vanish at any point of the four-dimensional continuum, it would signify that to a finite co-ordinate volume at that place there corresponds an infinitely small “natural volume.” We assume that this is never the case, so that g can never change its sign; in accordance with the special relativity theory we shall assume that g always has a finite negative value. This is a hypothesis about the physical nature of the continuum considered, and at the same time a convention about the choice of co-ordinates.

If, however, (-g) remains positive and finite, it is clear that the choice of co-ordinates can be so made that this quantity becomes equal to one. We shall see later that such a limitation of the choice of co-ordinates produces a significant simplification in the expressions for the laws of nature.
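That co-ordinates with unit determinant can be found is easily seen in a simple case. The following sketch is an added illustration, not taken from the original: it uses the Euclidean plane as an assumed two-dimensional analogue (so that g is positive and √g takes the place of √(-g)), and the new co-ordinate u is a name introduced only for this example.

```latex
% Illustrative sketch (requires amsmath).
% Assumed: Euclidean plane, polar co-ordinates (r, \theta), new co-ordinate u = r^2/2.
\begin{align*}
ds^2 &= dr^2 + r^2\,d\theta^2, & g &= r^2 \neq 1,\\
u &= \tfrac{1}{2}r^2, \quad du = r\,dr, & ds^2 &= \frac{du^2}{2u} + 2u\,d\theta^2,\\
g' &= \frac{1}{2u}\cdot 2u = 1, & dx\,dy &= r\,dr\,d\theta = du\,d\theta .
\end{align*}
```

In the co-ordinates (u, θ) the determinant is equal to one, and the transformation from Cartesian co-ordinates has the functional determinant 1.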

In place of (18) it follows then simply that

dτ′ = dτ

from this it follows, remembering the law of Jacobi,

(19) | ∂x′_σ/∂x_μ | = 1.

With this choice of co-ordinates, only substitutions with determinant 1 are allowable.

It would however be erroneous to think that this step signifies a partial renunciation of the general relativity postulate. We do not seek those laws of nature which are co-variant with regard to the transformations having the determinant 1, but we ask: what are the generally co-variant laws of nature? First we get the law, and then we simplify its expression by a special choice of the system of reference.

Building up of new tensors with the help of the fundamental tensor.

Through inner, outer and mixed multiplications of a tensor with the fundamental tensor, tensors of other kinds and of other ranks can be formed.

Example:—

A^μ = g^μσ A_σ

A = g_μν A^μν

We would point out specially the following combinations:

A^μν = g^μα g^νβ A_αβ

A_μν = g_μα g_νβ A^αβ

(complement to the co-variant or contravariant tensors)

and B_μν = g_μν g^αβ A_αβ

We can call B_μν the reduced tensor related to A_μν.

Similarly

B^μν = g^μνg_αβA^αβ.

It is to be remarked that g^μν is no other than the “complement” of g_μν for we have,—

g^μα g^νβ g_αβ = g^μα δ^ν_α = g^μν.

§ 9. Equation of the geodetic line (or of point-motion).

As the “line element” ds is a definite magnitude independent of the co-ordinate system, we have also between two points P₁ and P₂ of a four dimensional continuum a line for which ∫ds is an extremum (geodetic line), i.e., one which has got a significance independent of the choice of co-ordinates.

Its equation is

(20) δ{ ∫^P₂_P₁ ds } = 0

From this equation we can, in a well-known way, deduce 4 total differential equations which define the geodetic line; this deduction is given here for the sake of completeness.

Let λ be a function of the co-ordinates x_ν; this defines a family of surfaces which cut the geodetic line sought-for as well as all neighbouring lines drawn from P₁ to P₂. We can suppose that every such curve is given when its co-ordinates x_ν are given as functions of λ. The sign δ corresponds to a passage from a point of the geodetic curve sought-for to that point of a neighbouring curve which lies on the same surface λ.

Then (20) can be replaced by

(20a) ∫^λ₂_λ₁ δω dλ = 0, where ω² = g_μν (dx_μ/dλ)(dx_ν/dλ).

But

δω = (1/ω){ ½ (∂g_μν/∂x_σ)(dx_μ/dλ)(dx_ν/dλ) δx_σ + g_μν (dx_μ/dλ) δ(dx_ν/dλ) }.

So we get by the substitution of δω in (20a), remembering that

δ(dx_ν/dλ) = (d/dλ)(δx_ν)

after partial integration,

(20b) ∫^λ₂_λ₁ k_σ δx_σ dλ = 0,

where

k_σ = (d/dλ){ (g_μσ/ω)(dx_μ/dλ) } - (1/(2ω)) (∂g_μν/∂x_σ)(dx_μ/dλ)(dx_ν/dλ).

From which it follows, since the choice of δx_σ is perfectly arbitrary, that the k_σ’s must vanish. Then

(20c) k_σ = 0 (σ = 1, 2, 3, 4)

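are the equations of the geodetic line; since along the geodetic line considered we have ds ≠ 0, we can choose the parameter λ as the length of the arc s measured along the geodetic line. Then ω = 1, and we get in place of (20c)

g_μσ (d²x_μ/ds²) + (∂g_μσ/∂x_ν)(dx_μ/ds)(dx_ν/ds) - ½ (∂g_μν/∂x_σ)(dx_μ/ds)(dx_ν/ds) = 0.

Or, by merely changing the notation suitably,

(20d) g_ασ (d²x_α/ds²) + [μν, σ] (dx_μ/ds)(dx_ν/ds) = 0,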

where we have put, following Christoffel,

(21) [μν, σ] = ½ ( ∂g_μσ/∂x_ν + ∂g_νσ/∂x_μ - ∂g_μν/∂x_σ ).

Multiplying finally (20d) by g^στ (outer multiplication with reference to τ, inner with respect to σ), we get at last the final form of the equation of the geodetic line:

(22) d²x_τ/ds² + {μν, τ} (dx_μ/ds)(dx_ν/ds) = 0.

Here we have put, following Christoffel,

(23) {μν, τ} = g^τα [μν, α].
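The equation of the geodetic line can be tried out on a simple case. The following sketch is an illustration added here and not part of the original deduction: it takes the Euclidean plane in polar co-ordinates (r, θ) as an assumed two-dimensional stand-in for the continuum, and uses the Christoffel brackets in the form [μν, σ] = ½(∂g_μσ/∂x_ν + ∂g_νσ/∂x_μ - ∂g_μν/∂x_σ) and {μν, τ} = g^τα [μν, α].

```latex
% Illustrative sketch (requires amsmath).
% Assumed: Euclidean plane in polar co-ordinates (r, \theta); dots denote d/ds.
\begin{align*}
ds^2 &= dr^2 + r^2\,d\theta^2, \qquad g_{rr} = 1,\; g_{\theta\theta} = r^2,\; g^{rr} = 1,\; g^{\theta\theta} = r^{-2},\\
[\theta\theta, r] &= -\tfrac{1}{2}\,\frac{\partial g_{\theta\theta}}{\partial r} = -r, \qquad
[r\theta, \theta] = \tfrac{1}{2}\,\frac{\partial g_{\theta\theta}}{\partial r} = r,\\
\{\theta\theta, r\} &= -r, \qquad \{r\theta, \theta\} = \{\theta r, \theta\} = \frac{1}{r}, \qquad \text{all others } = 0,\\
\ddot r - r\,\dot\theta^{2} &= 0, \qquad
\ddot\theta + \frac{2}{r}\,\dot r\,\dot\theta = 0
\;\Longleftrightarrow\; \frac{d}{ds}\bigl(r^{2}\dot\theta\bigr) = 0 .
\end{align*}
```

The second equation states that r² dθ/ds is constant along the curve; the straight lines of the plane, r cos(θ - θ₀) = p, satisfy both equations, as they must.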

§ 10. Formation of Tensors through Differentiation.

Relying on the equation of the geodetic line, we can now easily deduce the laws according to which new tensors can be formed from given tensors by differentiation. By this means we are first able to set up generally co-variant differential equations. We achieve this through a repeated application of the following simple law. If a certain curve be given in our continuum whose points are characterised by the arc-distance s measured from a fixed point on the curve, and if further φ be an invariant space function, then dφ/ds is also an invariant. The proof follows from the fact that dφ as well as ds are both invariants.

Since

dφ/ds = (∂φ/∂x_μ)(dx_μ/ds),

it follows that

ψ = (∂φ/∂x_μ)(dx_μ/ds)

is also an invariant for all curves which go out from a point in the continuum, i.e., for any choice of the vector dx_μ. From which follows immediately that

(24) A_μ = ∂φ/∂x_μ

is a co-variant four-vector (gradient of φ).

According to our law, the differential-quotient χ = dψ/ds taken along any curve is likewise an invariant.

Substituting the value of ψ, we get

χ = (∂²φ/∂x_μ∂x_ν)(dx_μ/ds)(dx_ν/ds) + (∂φ/∂x_μ)(d²x_μ/ds²).

Here, however, we cannot at once deduce the existence of a tensor. If, however, we take the curves along which we differentiate to be geodesics, we get, by replacing d²x_ν/ds² according to (22),

χ = ( ∂²φ/∂x_μ∂x_ν - {μν, τ} ∂φ/∂x_τ ) (dx_μ/ds)(dx_ν/ds).

Since the order of the differentiations with regard to μ and ν can be interchanged, and since according to (23) and (21) the bracket {μν, τ} is symmetrical with respect to μ and ν, the expression

∂²φ/∂x_μ∂x_ν - {μν, τ} ∂φ/∂x_τ

is symmetrical with respect to μ and ν.

As we can draw a geodetic line in any direction from any point in the continuum, dx_μ/ds is a four-vector with an arbitrary ratio of components, so that it follows from the results of §7 that

(25) A_μν = ∂²φ/∂x_μ∂x_ν - {μν, τ} ∂φ/∂x_τ

is a co-variant tensor of the second rank. We have thus got the result that out of the co-variant tensor of the first rank A_μ = ∂φ/∂x_μ we can get by differentiation a co-variant tensor of the second rank

(26) A_μν = ∂A_μ/∂x_ν - {μν, τ} A_τ.

We call the tensor A_μν the “extension” of the tensor A_μ. We can easily show that this combination also leads to a tensor when the vector A_μ is not representable as a gradient. In order to see this we first remark that ψ ∂φ/∂x_μ is a co-variant four-vector when ψ and φ are scalars. This is also the case for a sum of four such terms:

S_μ = ψ⁽¹⁾ ∂φ⁽¹⁾/∂x_μ + ... + ψ⁽⁴⁾ ∂φ⁽⁴⁾/∂x_μ,

when ψ⁽¹⁾, φ⁽¹⁾ ... ψ⁽⁴⁾, φ⁽⁴⁾ are scalars. Now it is clear, however, that every co-variant four-vector is representable in the form of S_μ.

If for example, A_μ is a four-vector whose components are any given functions of x_ν, we have, (with reference to the chosen co-ordinate system) only to put

ψ⁽¹⁾ = A₁ φ⁽¹⁾ = x₁

ψ⁽²⁾ = A₂ φ⁽²⁾ = x₂

ψ⁽³⁾ = A₃ φ⁽³⁾ = x₃

ψ⁽⁴⁾ = A₄ φ⁽⁴⁾ = x₄.

in order to arrive at the result that S_μ is equal to A_μ.

In order to prove then that A_μν is a tensor when on the right side of (26) we substitute any co-variant four-vector for A_μ, we have only to show that this is true for the four-vector S_μ. For this latter case, however, a glance at the right hand side of (26) will show that we have only to bring forth the proof for the case when

A_μ = ψ ∂φ/∂x_μ.

Now the right hand side of (25) multiplied by ψ is

ψ ∂²φ/∂x_μ∂x_ν - {μν, τ} ψ ∂φ/∂x_τ,

which has a tensor character. Similarly, (∂ψ/∂x_ν)(∂φ/∂x_μ) is also a tensor (outer product of two four-vectors).

Through addition follows the tensor character of

∂/∂x_ν ( ψ ∂φ/∂x_μ ) - {μν, τ} ( ψ ∂φ/∂x_τ ).

Thus we get the desired proof for the four-vector ψ ∂φ/∂x_μ, and hence for any four-vector A_μ, as shown above.

With the help of the extension of the four-vector, we can easily define “extension” of a co-variant tensor of any rank. This is a generalisation of the extension of the four-vector. We confine ourselves to the case of the extension of the tensors of the 2nd rank for which the law of formation can be clearly seen.

As already remarked every co-variant tensor of the 2nd rank can be represented as a sum of the tensors of the type A_μ B_ν.

It would therefore be sufficient to deduce the expression of the extension for one such special tensor. According to (26) the expressions

∂A_μ/∂x_σ - {σμ, τ} A_τ,

∂B_ν/∂x_σ - {σν, τ} B_τ

are tensors. Through outer multiplication of the first with B_ν and of the second with A_μ we get tensors of the third rank. Their addition gives the tensor of the third rank

(27) A_μνσ = ∂A_μν/∂x_σ - {σμ, τ} A_τν - {σν, τ} A_μτ,

where A_μν is put = A_μ B_ν. The right hand side of (27) is linear and homogeneous with reference to A_μν, and its first differential co-efficient, so that this law of formation leads to a tensor not only in the case of a tensor of the type A_μ B_ν but also in the case of a summation for all such tensors, i.e., in the case of any co-variant tensor of the second rank. We call A_μνσ the extension of the tensor A_μν. It is clear that (26) and (24) are only special cases of (27) (extension of the tensors of the first and zero rank). In general we can get all special laws of formation of tensors from (27) combined with tensor multiplication.
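The working of the extension may be followed in a simple case. The sketch below is an illustration added here, not part of the original text. It assumes the Euclidean plane in polar co-ordinates (r, θ), for which the only non-vanishing Christoffel symbols are {θθ, r} = -r and {rθ, θ} = {θr, θ} = 1/r, takes the scalar φ = r² as an assumed example, and uses the extension of a co-variant four-vector in the form A_μν = ∂A_μ/∂x_ν - {μν, τ} A_τ.

```latex
% Illustrative sketch (requires amsmath).
% Assumed: plane polar co-ordinates, scalar \varphi = r^2, A_\mu = \partial\varphi/\partial x_\mu.
\begin{align*}
A_r &= 2r, \qquad A_\theta = 0,\\
A_{rr} &= \frac{\partial A_r}{\partial r} - \{rr, \tau\}A_\tau = 2,\\
A_{r\theta} &= \frac{\partial A_r}{\partial\theta} - \{r\theta, \theta\}A_\theta = 0,
\qquad A_{\theta r} = \frac{\partial A_\theta}{\partial r} - \{\theta r, \theta\}A_\theta = 0,\\
A_{\theta\theta} &= \frac{\partial A_\theta}{\partial\theta} - \{\theta\theta, r\}A_r = 0 - (-r)(2r) = 2r^{2}.
\end{align*}
```

Since g_rr = 1 and g_θθ = r², the result is A_μν = 2 g_μν. This is the tensor which in Cartesian co-ordinates, where φ = x² + y² and the Christoffel symbols vanish, has the components 2δ_μν.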

§ 11. Some special cases of Particular Importance.

A few auxiliary lemmas concerning the fundamental tensor. We shall first deduce some of the lemmas much used afterwards. According to the law of differentiation of determinants, we have

(28) dg = g^μν g dg_μν = -g_μν g dg^μν.

The last form follows from the first when we remember that

g_μν g^μ′ν = δ^μ′_μ , and therefore g_μνg^μν = 4,

consequently g_μνdg^μν + g^μν dg_μν = 0.
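Both forms of (28) can be checked on a simple metric. The sketch below is an added illustration, not part of the original; it assumes the plane in polar co-ordinates, ds² = dr² + r²dθ², as a two-dimensional example.

```latex
% Illustrative sketch (requires amsmath).
% Assumed: g_{rr} = 1, g_{\theta\theta} = r^2, hence g = r^2, g^{\theta\theta} = r^{-2}.
\begin{align*}
dg &= 2r\,dr,\\
g^{\mu\nu}\,g\,dg_{\mu\nu} &= g\,g^{\theta\theta}\,dg_{\theta\theta}
  = r^{2}\cdot\frac{1}{r^{2}}\cdot 2r\,dr = 2r\,dr,\\
-\,g_{\mu\nu}\,g\,dg^{\mu\nu} &= -\,g\,g_{\theta\theta}\,dg^{\theta\theta}
  = -\,r^{2}\cdot r^{2}\cdot\Bigl(-\frac{2}{r^{3}}\Bigr)dr = 2r\,dr .
\end{align*}
```

The three expressions agree, as (28) requires.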

From (28), it follows that

(29) (1/√(-g)) ∂√(-g)/∂x_σ = ½ ∂ log(-g)/∂x_σ = ½ g^μν ∂g_μν/∂x_σ = -½ g_μν ∂g^μν/∂x_σ.

Again, since g_μσ g^νσ = δ^ν_μ, we have, by differentiation,

(30) g_μσ dg^νσ = -g^νσ dg_μσ, or g_μσ ∂g^νσ/∂x_λ = -g^νσ ∂g_μσ/∂x_λ.

By mixed multiplication with g^στ and g_νλ respectively we obtain (changing the mode of writing the indices)

(31)

dg^μν = -g^μα g^νβ dg_αβ

∂g^μν/∂x_σ = -g^μα g^νβ ∂g_αβ/∂x_σ

and

(32)

dg_μν = -g_μα g_νβ dg^αβ

∂g_μν/∂x_σ = -g_μα g_νβ ∂g^αβ/∂x_σ.

The relation (31) allows a transformation which we shall often use; according to (21)

(33) ∂g_αβ/∂x_σ = [ασ, β] + [βσ, α].

If we substitute this in the second of the formulae (31), we get, remembering (23),

(34) ∂g^μν/∂x_σ = -( g^μτ {τσ, ν} + g^ντ {τσ, μ} ).

By substituting the right-hand side of (34) in (29), we get

(29a) (1/√(-g)) ∂√(-g)/∂x_σ = {μσ, μ}.

Divergence of the contravariant four-vector.

Let us multiply (26) by the contravariant fundamental tensor g^μν (inner multiplication); then, by a transformation of the first member, the right-hand side takes the form

(A) ∂/∂x_ν ( g^μν A_μ ) - A_μ ∂g^μν/∂x_ν - ½ g^τα ( ∂g_μα/∂x_ν + ∂g_να/∂x_μ - ∂g_μν/∂x_α ) g^μν A_τ.

According to (31) and (29), the last member can take the form

(B) ½ (∂g^τν/∂x_ν) A_τ + ½ (∂g^τμ/∂x_μ) A_τ + (1/√(-g)) (∂√(-g)/∂x_α) g^τα A_τ.

The first two members of the expression (B) cancel the second member of the expression (A), since the naming of the summation-indices is immaterial. The last member of (B) can then be united with the first of (A). If we put

g^μν A_μ = A^ν,

where A^ν as well as A_μ are vectors which can be arbitrarily chosen, we obtain finally

(35) Φ = (1/√(-g)) ∂/∂x_ν ( √(-g) A^ν ).

This scalar is the Divergence of the contravariant four-vector A^ν.
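The lemma (29a) and the expression for the divergence can both be checked in a simple case. The sketch below is an added illustration, not part of the original. It assumes the Euclidean plane in polar co-ordinates (r, θ), where g = r² is positive so that √g takes the place of √(-g), and it takes the relations in the forms (1/√g) ∂√g/∂x_σ = {μσ, μ} and Φ = (1/√g) ∂(√g A^ν)/∂x_ν.

```latex
% Illustrative sketch (requires amsmath).
% Assumed: plane polar co-ordinates; \{\theta\theta, r\} = -r, \{r\theta, \theta\} = 1/r; \sqrt{g} = r.
\begin{align*}
\{\mu r, \mu\} &= \{rr, r\} + \{\theta r, \theta\} = 0 + \frac{1}{r}
  = \frac{1}{\sqrt{g}}\,\frac{\partial\sqrt{g}}{\partial r},\\
\{\mu\theta, \mu\} &= \{r\theta, r\} + \{\theta\theta, \theta\} = 0
  = \frac{1}{\sqrt{g}}\,\frac{\partial\sqrt{g}}{\partial\theta},\\
\Phi &= \frac{1}{r}\,\frac{\partial}{\partial r}\bigl(r\,A^{r}\bigr) + \frac{\partial A^{\theta}}{\partial\theta}.
\end{align*}
```

Here A^θ is the contravariant component with respect to the co-ordinate θ. Expressed through the component r A^θ measured with unit rods, Φ becomes the familiar divergence of vector analysis in polar co-ordinates.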

Rotation of the (covariant) four-vector.

The second member in (26) is symmetrical in the indices μ and ν. Hence A_μν - A_νμ is an antisymmetrical tensor built up in a very simple manner. We obtain

(36) B_μν = ∂A_μ/∂x_ν - ∂A_ν/∂x_μ.

Antisymmetrical Extension of a Six-vector.

If we apply the operation (27) to an antisymmetrical tensor of the second rank A_μν, form all the equations arising from the cyclic interchange of the indices μ, ν, σ, and add them all, we obtain a tensor of the third rank

(37) B_μνσ = A_μνσ + A_νσμ + A_σμν = ∂A_μν/∂x_σ + ∂A_νσ/∂x_μ + ∂A_σμ/∂x_ν,

from which it is easy to see that the tensor is antisymmetrical.

Divergence of the Six-vector.

If (27) is multiplied by g^μα g^νβ (mixed multiplication), then a tensor is obtained. The first member of the right hand side of (27) can be written in the form

∂/∂x_σ ( g^μα g^νβ A_μν ) - g^μα (∂g^νβ/∂x_σ) A_μν - g^νβ (∂g^μα/∂x_σ) A_μν.

If we replace g^μα g^νβ A_μνσ by A_σ^αβ, g^μα g^νβ A_μν by A^αβ and replace in the transformed first member

∂g^νβ/∂x_σ and ∂g^μα/∂x_σ

with the help of (34), then from the right-hand side of (27) there arises an expression with seven terms, of which four cancel. There remains

(38) A_σ^αβ = ∂A^αβ/∂x_σ + {σκ, α} A^κβ + {σκ, β} A^ακ.

This is the expression for the extension of a contravariant tensor of the second rank; extensions can also be formed for corresponding contravariant tensors of higher and lower ranks.

We remark that in the same way we can also form the extension of a mixed tensor A_μ^α:

(39) A_μσ^α = ∂A_μ^α/∂x_σ - {σμ, τ} A_τ^α + {στ, α} A_μ^τ.

By the reduction of (38) with reference to the indices β and σ (inner multiplication with δ_β^σ), we get a contravariant four-vector

A^α = ∂A^αβ/∂x_β + {βκ, β} A^ακ + {βκ, α} A^κβ.

On account of the symmetry of {βκ, α} with reference to the indices β and κ, the third member of the right hand side vanishes when A^αβ is an antisymmetrical tensor, which we assume here; the second member can be transformed according to (29a); we therefore get

(40) A^α = (1/√(-g)) ∂( √(-g) A^αβ )/∂x_β.

This is the expression of the divergence of a contravariant six-vector.

Divergence of the mixed tensor of the second rank.

Let us form the reduction of (39) with reference to the indices α and σ; we obtain, remembering (29a),

(41) √(-g) A_μ = ∂( √(-g) A_μ^σ )/∂x_σ - {σμ, τ} √(-g) A_τ^σ.

If we introduce into the last term the contravariant tensor A^ρσ = g^ρτ A_τ^σ, it takes the form

-[σμ, ρ] √(-g) A^ρσ.

If further A^ρσ is symmetrical, it is reduced to

-½ √(-g) (∂g_ρσ/∂x_μ) A^ρσ.

If instead of A^ρσ we introduce in a similar way the symmetrical co-variant tensor A_ρσ = g_ρα g_σβ A^αβ, then owing to (31) the last member can take the form

½ √(-g) (∂g^ρσ/∂x_μ) A_ρσ.

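In the symmetrical case treated, (41) can be replaced by either of the forms

(41a) √(-g) A_μ = ∂( √(-g) A_μ^σ )/∂x_σ - ½ (∂g_ρσ/∂x_μ) √(-g) A^ρσ

(41b) √(-g) A_μ = ∂( √(-g) A_μ^σ )/∂x_σ + ½ (∂g^ρσ/∂x_μ) √(-g) A_ρσ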

which we shall have to make use of afterwards.

§12. The Riemann-Christoffel Tensor.

We now seek those tensors which can be obtained from the fundamental tensor g_μν by differentiation alone. At first sight the solution seems easy to find. We put in (27), instead of any tensor A_μν, the fundamental tensor g_μν and get from it a new tensor, namely the extension of the fundamental tensor. We can easily convince ourselves, however, that this vanishes identically. We attain our object in the following way; we substitute in (27)

A_μν = ∂A_μ/∂x_ν - {μν, ρ} A_ρ,

i.e., the extension of a four-vector.

Thus we get (by slightly changing the indices) the tensor of the third rank

A_μστ = ∂²A_μ/∂x_σ∂x_τ - {μσ, ρ} ∂A_ρ/∂x_τ - {μτ, ρ} ∂A_ρ/∂x_σ - {στ, ρ} ∂A_μ/∂x_ρ + [ -∂/∂x_τ {μσ, ρ} + {μτ, α}{ασ, ρ} + {στ, α}{αμ, ρ} ] A_ρ.

We use this expression for the formation of the tensor A_μστ - A_μτσ. Thereby the following terms in A_μστ cancel the corresponding terms in A_μτσ: the first member, the fourth member, as well as the member corresponding to the last term within the square bracket. These are all symmetrical in σ and τ. The same is true for the sum of the second and third members. We thus get

(42) A_μστ - A_μτσ = B^ρ_μστ A_ρ,

(43) B^ρ_μστ = -∂/∂x_τ {μσ, ρ} + ∂/∂x_σ {μτ, ρ} - {μσ, α}{ατ, ρ} + {μτ, α}{ασ, ρ}.

The essential thing in this result is that on the right hand side of (42) we have only A_ρ, but not its differential co-efficients. From the tensor-character of A_μστ - A_μτσ, and from the fact that A_ρ is an arbitrary four vector, it follows, on account of the result of §7, that B^ρ_μστ is a tensor (Riemann-Christoffel Tensor).

The mathematical significance of this tensor is as follows; when the continuum is so shaped, that there is a co-ordinate system for which g_μν’s are constants, B^ρ_μστ all vanish.

If we choose instead of the original co-ordinate system any new one, the g_μν’s referred to this last system will no longer be constants. The tensor character of B^ρ_μστ shows us, however, that these components vanish collectively also in any other chosen system of reference. The vanishing of the Riemann Tensor is thus a necessary condition that for some choice of the axis-system the g_μν’s can be taken as constants. In our problem it corresponds to the case when, by a suitable choice of the co-ordinate system, the special relativity theory holds throughout any finite region. By the reduction of (43) with reference to the indices τ and ρ, we get the covariant tensor of the second rank

(44) B_μν = R_μν + S_μν,

R_μν = -∂/∂x_α {μν, α} + {μα, β}{νβ, α},

S_μν = ∂² log √(-g)/∂x_μ∂x_ν - {μν, α} ∂ log √(-g)/∂x_α.

Remarks upon the choice of co-ordinates.—It has already been remarked in §8, with reference to the equation (18a), that the co-ordinates can with advantage be so chosen that √(-g) = 1. A glance at the equations got in the last two paragraphs shows that, through such a choice, the law of formation of the tensors suffers a significant simplification. It is specially true for the tensor B_μν, which plays a fundamental rôle in the theory. By this simplification, S_μν vanishes of itself so that tensor B_μν reduces to R_μν.

I shall give in the following pages all relations in the simplified form, with the above-named specialisation of the co-ordinates. It is then very easy to go back to the general covariant equations, if it appears desirable in any special case.
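The meaning of the Riemann-Christoffel tensor may be illustrated by a continuum in which the g_μν’s cannot be made constant. The sketch below is an illustration added here and not part of the original. It assumes the surface of the unit sphere with co-ordinates (θ, φ) as a two-dimensional example, and it takes the tensor in the form B^ρ_μστ = -∂/∂x_τ {μσ, ρ} + ∂/∂x_σ {μτ, ρ} - {μσ, α}{ατ, ρ} + {μτ, α}{ασ, ρ}, with the reduction B_μν = B^ρ_μνρ.

```latex
% Illustrative sketch (requires amsmath).
% Assumed: unit sphere, ds^2 = d\theta^2 + \sin^2\theta\, d\varphi^2 (co-ordinates in general position).
\begin{align*}
\{\varphi\varphi, \theta\} &= -\sin\theta\cos\theta, \qquad
\{\theta\varphi, \varphi\} = \{\varphi\theta, \varphi\} = \cot\theta, \qquad \text{all others } = 0,\\
B^{\theta}_{\ \varphi\theta\varphi}
 &= \frac{\partial}{\partial\theta}\{\varphi\varphi, \theta\}
    - \{\varphi\theta, \varphi\}\{\varphi\varphi, \theta\}
  = (\sin^{2}\theta - \cos^{2}\theta) + \cos^{2}\theta = \sin^{2}\theta,\\
B^{\varphi}_{\ \theta\varphi\theta}
 &= -\frac{\partial}{\partial\theta}\{\theta\varphi, \varphi\}
    - \{\theta\varphi, \varphi\}\{\varphi\theta, \varphi\}
  = \frac{1}{\sin^{2}\theta} - \cot^{2}\theta = 1,\\
B_{\theta\theta} &= B^{\varphi}_{\ \theta\theta\varphi} = -1, \qquad
B_{\varphi\varphi} = B^{\theta}_{\ \varphi\varphi\theta} = -\sin^{2}\theta,
\qquad\text{i.e. } B_{\mu\nu} = -\,g_{\mu\nu}.
\end{align*}
```

As the components do not vanish, no choice of co-ordinates on the sphere can make the g_μν’s constant. For the plane in polar co-ordinates, on the other hand, the same calculation gives zero for every component, in agreement with the existence of Cartesian co-ordinates.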

C. THE THEORY OF THE GRAVITATION-FIELD

§13. Equation of motion of a material point in a gravitation-field. Expression for the field-components of gravitation.

A freely moving body not acted on by external forces moves, according to the special relativity theory, along a straight line and uniformly. This also holds for the generalised relativity theory for any part of the four-dimensional region in which the co-ordinate system K₀ can be, and is, so chosen that the g_μν’s have the special constant values given in (4).

Let us discuss this motion from the stand-point of any arbitrary co-ordinate-system K₁; the body moves with reference to K₁ (as explained in §2) in a gravitational field. The laws of motion with reference to K₁ follow easily from the following consideration. With reference to K₀, the law of motion is a four-dimensional straight line and thus a geodesic. As a geodetic line is defined independently of the system of co-ordinates, its equation will also be the law of motion of the material point with reference to K₁. If we put

(45) Γ^τ_μν = -{μν, τ},

we get the motion of the point with reference to K₁, given by

(46) d²x_τ/ds² = Γ^τ_μν (dx_μ/ds)(dx_ν/ds).

We now make the very simple assumption that this general covariant system of equations defines also the motion of the point in the gravitational field, when there exists no reference-system K₀, with reference to which the special relativity theory holds throughout a finite region. The assumption seems to us to be all the more legitimate, as (46) contains only the first differentials of g_μν, among which there is no relation in the special case when K₀ exists.

If the Γ^τ_μν’s vanish, the point moves uniformly and in a straight line; these magnitudes therefore determine the deviation from uniformity. They are the components of the gravitational field.

§14. The Field-equation of Gravitation in the absence of matter.

In the following, we differentiate gravitation-field from matter in the sense that everything besides the gravitation-field will be signified as matter; therefore the term includes not only matter in the usual sense, but also the electro-dynamic field. Our next problem is to seek the field-equations of gravitation in the absence of matter. For this we apply the same method as employed in the foregoing paragraph for the deduction of the equations of motion for material points. A special case in which the field-equations sought-for are evidently satisfied is that of the special relativity theory in which g_μν’s have certain constant values. This would be the case in a certain finite region with reference to a definite co-ordinate system K₀. With reference to this system, all the components B^ρ_μστ of the Riemann’s Tensor [equation 43] vanish. These vanish then also in the region considered, with reference to every other co-ordinate system.

The equations of the gravitation-field free from matter must thus be in every case satisfied when all B^ρ_μστ vanish. But this condition is clearly one which goes too far. For it is clear that the gravitation-field generated by a material point in its own neighbourhood can never be transformed away by any choice of axes, i.e., it cannot be transformed to a case of constant g_μν’s.

It is therefore natural to require, for a gravitational field free from matter, that the symmetrical tensor B_μν deduced from the tensor B^ρ_μστ should vanish. We thus get 10 equations for the 10 quantities g_μν, which are fulfilled in the special case when the B^ρ_μστ’s all vanish.

Remembering (44), we see that in the absence of matter the field-equations come out as follows (when referred to the special co-ordinate-system chosen):

(47) ∂Γ^α_μν/∂x_α + Γ^α_μβ Γ^β_να = 0, √(-g) = 1.

It can also be shown that the choice of these equations is connected with a minimum of arbitrariness. For besides B_μν there is no tensor of the second rank which can be built out of the g_μν’s and their derivatives, contains no derivatives higher than the second, and is linear in the second derivatives.

It will be shown that the equations arising in a purely mathematical way out of the conditions of the general relativity, together with equations (46), give us the Newtonian law of attraction as a first approximation, and lead in the second approximation to the explanation of the perihelion-motion of Mercury discovered by Leverrier (the residual effect which could not be accounted for by the consideration of all sorts of disturbing factors). My view is that these are convincing proofs of the physical correctness of my theory.

§15. Hamiltonian Function for the Gravitation-field.
Laws of Impulse and Energy.

In order to show that the field equations correspond to the laws of impulse and energy, it is most convenient to write them in the following Hamiltonian form:

(47a)

δ∫ Hdτ = 0

H = g^μν Γ^α_μβ Γ^β_να

√(-g) = 1

Here the variations vanish at the limits of the finite four-dimensional integration-space considered.

It is first necessary to show that the form (47a) is equivalent to equations (47). For this purpose, let us consider H as a function of g^μν and g^μν_σ (= ∂g^μν/∂x_σ)

We have at first

δH = Γ^α_μβ Γ^β_να δg^μν + 2g^μν Γ^α_μβ δΓ^β_να

= - Γ^α_μβ Γ^β_να δg^μν + 2Γ^α_μβ δ(g^μνΓ^β_να).

But

δ( g^μν Γ^β_να ) = -½ δ[ g^μν g^βλ ( ∂g_νλ/∂x_α + ∂g_αλ/∂x_ν - ∂g_αν/∂x_λ ) ].

The terms arising out of the last two terms within the round bracket are of different signs, and change into one another by the interchange of the indices μ and β. They cancel each other in the expression for δH, when they are multiplied by Γ_μβ^α, which is symmetrical with respect to μ and β, so that only the first member of the bracket remains for our consideration. Remembering (31), we thus have:

δH = -Γ_μβ^α Γ_να^β δg^μν + Γ_μβ^α δg_α^μβ

Therefore

(48)

∂H/∂g^μν = -Γ_μβ^α Γ_να^β

∂H/∂g_σ^μν = Γ_μν^σ

If we now carry out the variations in (47a), we obtain the system of equations

(47b) ∂/∂x_α ( ∂H/∂g_α^μν ) - ∂H/∂g^μν = 0,

which, owing to the relations (48), coincide with (47), as was required to be proved.
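The step from (47a) to (47b) is the usual partial integration of the calculus of variations. The following sketch is added here as an illustration and is not part of the original. It uses the derivatives (48) as given above, assumes that the field-equations (47) have the form ∂Γ^α_μν/∂x_α + Γ^α_μβ Γ^β_να = 0, and leaves the side-condition √(-g) = 1 on the variations out of account.

```latex
% Illustrative sketch (requires amsmath).
% Assumed: H = H(g^{\mu\nu}, g^{\mu\nu}_{\alpha}), with g^{\mu\nu}_{\alpha} = \partial g^{\mu\nu}/\partial x_\alpha;
% the variations \delta g^{\mu\nu} vanish at the limits of integration.
\begin{align*}
\delta\!\int H\,d\tau
 &= \int\Bigl(\frac{\partial H}{\partial g^{\mu\nu}}\,\delta g^{\mu\nu}
   + \frac{\partial H}{\partial g^{\mu\nu}_{\alpha}}\,
     \frac{\partial\,\delta g^{\mu\nu}}{\partial x_\alpha}\Bigr)d\tau\\
 &= \int\Bigl[\frac{\partial H}{\partial g^{\mu\nu}}
   - \frac{\partial}{\partial x_\alpha}
     \Bigl(\frac{\partial H}{\partial g^{\mu\nu}_{\alpha}}\Bigr)\Bigr]
     \delta g^{\mu\nu}\,d\tau = 0,\\
\frac{\partial}{\partial x_\alpha}\Bigl(\frac{\partial H}{\partial g^{\mu\nu}_{\alpha}}\Bigr)
   - \frac{\partial H}{\partial g^{\mu\nu}}
 &= \frac{\partial\Gamma^{\alpha}_{\mu\nu}}{\partial x_\alpha}
   + \Gamma^{\alpha}_{\mu\beta}\,\Gamma^{\beta}_{\nu\alpha} = 0 .
\end{align*}
```

The last line is obtained by inserting the two derivatives (48) in (47b).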

If (47b) is multiplied by g_σ^μν, since

∂g_σ^μν/∂x_α = ∂g_α^μν/∂x_σ

and consequently