The analysis of matter

CHAPTER VII
THE METHOD OF TENSORS

THE method of tensors contains the answer to a question which is rendered urgent by the arbitrary character of our co-ordinates. How can we know whether a formula expressed in terms of our co-ordinates expresses something which describes the physical occurrences, and not merely the particular co-ordinate system which we happen to be employing? A striking example of the mistakes that are possible in this respect is afforded by simultaneity. Suppose we have two events, whose co-ordinates, in the system we are employing, are $(x, y, z, t)$ and $(x', y', z', t)$, i.e. their time co-ordinates are the same. Before the special theory of relativity everybody would have asserted that this represented a physical fact about the two events, namely, that they are simultaneous. Now we know that the fact concerned is one which also involves mention of the co-ordinate system; that is to say, it is not a relation between the two events only, but between them and the body of reference. But this is to speak the language of the special theory. In the general theory, our co-ordinates may have no important physical significance, and a pair of events which have one co-ordinate identical need not have any intrinsic physical property not possessed by other pairs of events. In practice, there must be some principle on which co-ordinates are assigned, and this principle must have some physical significance. But we might, for instance, measure time by the worst clock ever made, provided it only went wrong and did not actually stop. And we might use a certain worm as our unit of length, disregarding the "FitzGerald contraction" to which motion subjects him.
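The dependence of simultaneity on the body of reference can be checked numerically. The following is only a sketch under stated assumptions: one space dimension, units in which the velocity of light is 1, the standard Lorentz transformation of the special theory, and an arbitrarily chosen relative velocity of 0.6. The function name `boost` and the two sample events are illustrative and do not come from the text.

```python
import math

def boost(t, x, v):
    """Lorentz-transform an event (t, x) to a system moving with velocity v (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)

# Two events with the same time co-ordinate in the system first employed.
event_a = (0.0, 0.0)  # (t, x)
event_b = (0.0, 1.0)

v = 0.6  # hypothetical relative velocity of a second body of reference
t_a, _ = boost(*event_a, v)
t_b, _ = boost(*event_b, v)

simultaneous_in_old_system = event_a[0] == event_b[0]
simultaneous_in_new_system = math.isclose(t_a, t_b, abs_tol=1e-12)
print(simultaneous_in_old_system, simultaneous_in_new_system)
```

The equality of the two time co-ordinates holds in the first system and fails in the second, so it cannot be a fact about the two events alone.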

In that case, if we say that there was unit distance between two events which both occurred at a certain instant, we shall be making a complicated comparison between the events, a bad clock, and a certain worm—that is to say, we shall be making a statement which depends upon our co-ordinate system. We want to discover a sufficient, if not necessary, condition which, if fulfilled, insures that a statement in terms of co-ordinates has a meaning independent of co-ordinates. The difference is more or less analogous to that, in ordinary language, between linguistic statements and statements which (as is usually the case) are about what words mean. If I say "strength is a desirable quality," my statement can be put into French or German without change of meaning. But if I say "strength is a word containing seven consonants and only one vowel," my statement becomes false if translated into French or German. Now in physics co-ordinates are analogous to words, with the difference that it is much harder to distinguish "linguistic" statements from others. This is what the method of tensors undertakes to do.

It does not seem possible to state the method of tensors in untechnical language; I am afraid that those philosophers who have not thought it worth while to learn the calculus cannot hope to understand it. Perhaps in time some simple way of explaining it may be found, but none has been found so far.[24]

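Suppose we have a vector quantity whose components are $A^1$, $A^2$, $A^3$, $A^4$. (Here 1, 2, 3, 4 play the part of suffixes, not of exponents denoting powers.) It happens in certain cases that, if we transform to any other co-ordinates $x'_1$, $x'_2$, $x'_3$, $x'_4$, which are continuous functions of the old co-ordinates $x_1$, $x_2$, $x_3$, $x_4$, we shall have, as the components of the vector in the new co-ordinates, $A'^1$, $A'^2$, $A'^3$, $A'^4$, where:

$$A'^1 = \frac{\partial x'_1}{\partial x_1} A^1 + \frac{\partial x'_1}{\partial x_2} A^2 + \frac{\partial x'_1}{\partial x_3} A^3 + \frac{\partial x'_1}{\partial x_4} A^4$$

with similar formulæ for $A'^2$, $A'^3$, $A'^4$. When this happens, the vector in question is called contravariant. The simplest example is $(dx_1, dx_2, dx_3, dx_4)$. Except in this one case, the "contravariant" property is symbolized by the upper position of the suffix.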
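As a rough numerical illustration of the contravariant law, one may take a small displacement, which is the simplest contravariant vector, and compare what the law predicts with the actual change in the new co-ordinates. The sketch below makes several assumptions not found in the text: it works in two dimensions instead of four, it takes Cartesian co-ordinates as the old system and polar co-ordinates as the new one, and the names `to_polar`, `jacobian` and `transform_contravariant` are purely illustrative.

```python
import math

def to_polar(x1, x2):
    """New co-ordinates (r, theta) as functions of the old co-ordinates (x1, x2)."""
    return math.hypot(x1, x2), math.atan2(x2, x1)

def jacobian(x1, x2):
    """Partial derivatives d x'_mu / d x_alpha at (x1, x2), indexed [mu][alpha]."""
    r = math.hypot(x1, x2)
    return [[x1 / r, x2 / r],
            [-x2 / (r * r), x1 / (r * r)]]

def transform_contravariant(jac, a):
    """A'^mu = sum over alpha of (d x'_mu / d x_alpha) * A^alpha."""
    n = len(a)
    return [sum(jac[mu][alpha] * a[alpha] for alpha in range(n)) for mu in range(n)]

point = (3.0, 4.0)
dx = [1e-6, -2e-6]  # a small displacement in the old co-ordinates

# Components predicted by the contravariant law.
predicted = transform_contravariant(jacobian(*point), dx)

# Components found by actually moving the point and subtracting.
r0, th0 = to_polar(*point)
r1, th1 = to_polar(point[0] + dx[0], point[1] + dx[1])
actual = [r1 - r0, th1 - th0]

print(predicted, actual)
```

The two results agree up to terms of the second order in the displacement, which is all that the law asserts for a finite (as opposed to infinitesimal) step.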

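Again we may have a vector, whose components are $A_1$, $A_2$, $A_3$, $A_4$, which is transformed according to the law:

$$A'_1 = \frac{\partial x_1}{\partial x'_1} A_1 + \frac{\partial x_2}{\partial x'_1} A_2 + \frac{\partial x_3}{\partial x'_1} A_3 + \frac{\partial x_4}{\partial x'_1} A_4$$

with similar formulæ for $A'_2$, $A'_3$, $A'_4$. Such a vector is called covariant. The simplest example is the vector whose components are:

$$\frac{\partial \phi}{\partial x_1}, \quad \frac{\partial \phi}{\partial x_2}, \quad \frac{\partial \phi}{\partial x_3}, \quad \frac{\partial \phi}{\partial x_4}$$

where $\phi$ is some function which has a fixed value at each point, independently of the co-ordinate system.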
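The covariant law can be tried on the gradient of a function having a fixed value at each point. The sketch below is an illustration under assumptions of my own choosing: two dimensions, Cartesian co-ordinates as the old system, polar co-ordinates as the new, and the particular function whose value at a point is the square of its distance from the origin (so that it is $x_1^2 + x_2^2$ in the old system and $r^2$ in the new). The names `inverse_jacobian` and `transform_covariant` are hypothetical.

```python
import math

def inverse_jacobian(r, theta):
    """Partial derivatives d x_alpha / d x'_mu, indexed [alpha][mu],
    for x1 = r cos(theta), x2 = r sin(theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def transform_covariant(inv_jac, a):
    """A'_mu = sum over alpha of (d x_alpha / d x'_mu) * A_alpha."""
    n = len(a)
    return [sum(inv_jac[alpha][mu] * a[alpha] for alpha in range(n)) for mu in range(n)]

x1, x2 = 3.0, 4.0
r, theta = math.hypot(x1, x2), math.atan2(x2, x1)

# Gradient in the old system: phi = x1^2 + x2^2.
grad_old = [2 * x1, 2 * x2]

# Gradient in the new system, obtained from the covariant law.
grad_new_by_law = transform_covariant(inverse_jacobian(r, theta), grad_old)

# Gradient in the new system, obtained by differentiating phi = r^2 directly.
grad_new_direct = [2 * r, 0.0]

print(grad_new_by_law, grad_new_direct)
```

Both routes give the same components (up to rounding), as they must if the gradient is a covariant vector.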

It is obvious that, if we have two contravariant vectors $A^\mu$ and $B^\mu$ whose components are equal in one system of co-ordinates, then their components are equal in any system of co-ordinates; and the same applies to two covariant vectors $A_\mu$ and $B_\mu$. This follows at once from the above rules of transformation. Thus an equality of two contravariant vectors, or of two covariant vectors, when it occurs, is a fact independent of the co-ordinate system. It is, in fact, a tensor equation of the simplest kind.
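The contrast between a tensor equation and a statement that merely reflects the co-ordinate system can be made concrete. In the sketch below, which assumes two dimensions and an arbitrarily chosen constant matrix of partial derivatives (both hypothetical), the equality of two contravariant vectors survives the transformation, while the statement "the first component vanishes" does not.

```python
def transform_contravariant(jac, a):
    """A'^mu = sum over alpha of (d x'_mu / d x_alpha) * A^alpha."""
    n = len(a)
    return [sum(jac[mu][alpha] * a[alpha] for alpha in range(n)) for mu in range(n)]

# Hypothetical constant values of d x'_mu / d x_alpha, indexed [mu][alpha].
jac = [[1, 2],
       [0, 1]]

a = [0, 1]
b = [0, 1]

a_new = transform_contravariant(jac, a)
b_new = transform_contravariant(jac, b)

equal_before, equal_after = a == b, a_new == b_new
first_zero_before, first_zero_after = a[0] == 0, a_new[0] == 0

print(equal_before, equal_after)            # the tensor equation is preserved
print(first_zero_before, first_zero_after)  # the co-ordinate statement is not
```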

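The general definition of a "tensor" is a generalization of those of contravariant and covariant vectors. Instead of a vector with only four components, we may have a quantity with sixteen components:

$$A_{11}, A_{12}, A_{13}, A_{14}, A_{21}, \ldots, A_{44}.$$

Such a quantity may be denoted by "$A_{\mu\nu}$," where it is understood that $\mu$ and $\nu$ can each take all values from 1 to 4. Similarly we may have a quantity with sixty-four components, $A_{111}$, $A_{112}$, etc.; such a quantity may be denoted by "$A_{\mu\nu\sigma}$," where $\mu$ and $\nu$ and $\sigma$ can each take all values from 1 to 4. Such quantities are called "tensors" if they obey laws of transformation analogous to those of contravariant and covariant vectors. Thus a contravariant tensor with sixteen components, which is written "$A^{\mu\nu}$," is one which satisfies the rule:

$$A'^{11} = \sum_{\alpha,\beta} \frac{\partial x'_1}{\partial x_\alpha} \frac{\partial x'_1}{\partial x_\beta} A^{\alpha\beta}$$

with similar equations for the other components, e.g.:

$$A'^{12} = \sum_{\alpha,\beta} \frac{\partial x'_1}{\partial x_\alpha} \frac{\partial x'_2}{\partial x_\beta} A^{\alpha\beta}.$$

These equations are comprised in:

$$A'^{\mu\nu} = \sum_{\alpha,\beta} \frac{\partial x'_\mu}{\partial x_\alpha} \frac{\partial x'_\nu}{\partial x_\beta} A^{\alpha\beta}$$

where $\mu$, $\nu$ are to take all values from 1 to 4. Similarly a covariant tensor with sixteen components, written "$A_{\mu\nu}$," is one which is transformed according to the rule:

$$A'_{\mu\nu} = \sum_{\alpha,\beta} \frac{\partial x_\alpha}{\partial x'_\mu} \frac{\partial x_\beta}{\partial x'_\nu} A_{\alpha\beta}$$

and a mixed tensor, written $A^\nu_\mu$, is one which satisfies the rule:

$$A'^\nu_\mu = \sum_{\alpha,\beta} \frac{\partial x_\alpha}{\partial x'_\mu} \frac{\partial x'_\nu}{\partial x_\beta} A^\beta_\alpha.$$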
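The three rules for tensors with two suffixes can be written out as explicit double sums. The following is a minimal sketch, not a general tensor library: it assumes two dimensions instead of four, a constant hypothetical matrix `jac` of the derivatives of new co-ordinates by old, and its inverse `inv` for the derivatives of old by new. The function names and the storage convention for the mixed tensor (lower suffix first) are my own.

```python
def contravariant2(jac, a):
    """A'^{mu nu} = sum (dx'_mu/dx_al)(dx'_nu/dx_be) A^{al be}."""
    n = len(a)
    return [[sum(jac[mu][al] * jac[nu][be] * a[al][be]
                 for al in range(n) for be in range(n))
             for nu in range(n)] for mu in range(n)]

def covariant2(inv, a):
    """A'_{mu nu} = sum (dx_al/dx'_mu)(dx_be/dx'_nu) A_{al be}."""
    n = len(a)
    return [[sum(inv[al][mu] * inv[be][nu] * a[al][be]
                 for al in range(n) for be in range(n))
             for nu in range(n)] for mu in range(n)]

def mixed2(jac, inv, a):
    """A'^nu_mu = sum (dx_al/dx'_mu)(dx'_nu/dx_be) A^be_al.
    Both a[al][be] and the result[mu][nu] store the lower suffix first."""
    n = len(a)
    return [[sum(inv[al][mu] * jac[nu][be] * a[al][be]
                 for al in range(n) for be in range(n))
             for nu in range(n)] for mu in range(n)]

jac = [[1, 2], [0, 1]]   # hypothetical d x'_mu / d x_alpha, indexed [mu][alpha]
inv = [[1, -2], [0, 1]]  # its inverse: d x_alpha / d x'_mu, indexed [alpha][mu]

contra_new = contravariant2(jac, [[0, 0], [0, 1]])
co_new = covariant2(inv, [[1, 0], [0, 0]])
mixed_new = mixed2(jac, inv, [[1, 0], [0, 1]])

print(contra_new, co_new, mixed_new)
```

With these sample values the mixed tensor whose components form the unit matrix keeps the same components in the new system, which serves as a check that `jac` and `inv` have been paired with the right suffixes.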

There is no difficulty in extending these definitions to any number of suffixes. It is obvious, as in the case of contravariant and covariant vectors, that if two tensors of the same kind are equal in one system of co-ordinates they are equal in any system of co-ordinates, so that tensor equations express conditions which are independent of the choice of co-ordinates. For this reason it is necessary to express all the general laws of physics as tensor equations; if this cannot be done, the law concerned must be wrong, and must require such correction as will enable it to be expressed as a tensor equation. The law of gravitation is the most noteworthy example of this; but perhaps the conservation of energy is scarcely less noteworthy.[25]

It seems natural to suppose that it would be possible to develop a less indirect method of expressing physical laws than that afforded by the method of tensors, which is perhaps a consequence of the historical development of physics. Originally, in physics, the co-ordinates were intended to express physical relations between the event concerned and the origin. Three of the co-ordinates were lengths, which, it was thought, could be ascertained by measurement with a rigid rod. The fourth was a time, which could be measured by a chronometer. There were difficulties, however, which the progress of physics made increasingly evident. So long as the earth could be regarded as motionless, axes fixed relatively to the earth and clocks which remained on the surface of the earth seemed to suffice. It was possible to disregard the facts that no body is quite rigid and no clock quite accurate, because the system of physical laws suggested by the choice of the most rigid bodies and the most accurate clocks could be used to estimate the departure of these instruments from strict constancy, and the results were on the whole self-consistent. But in astronomical problems, including that of the tides, the earth could not be treated as fixed. It was necessary to Newtonian dynamics that the axes should not have any acceleration, but it resulted from the law of gravitation that any material axes must have some acceleration. The axes, therefore, became ideal structures in absolute space; actual measurements with actual rods could only approximate to the results which would have followed if we could have used unaccelerated axes. This difficulty was not the most serious: the worst trouble was concerned with absolute acceleration. Then came the experimental discovery of the facts which led to the special theory of relativity: the variation of length and mass with velocity, and the constancy of the velocity of light in vacuo no matter what body was used to define the co-ordinates. This set of difficulties was solved by the special theory of relativity, which showed that equivalent results come from employing as reference-body any one of a set of bodies in uniform rectilinear motion. This, however, only achieved what Galileo and Newton thought they had achieved. It included electromagnetic phenomena within the scope of relativity as regards velocities, but it was clearly necessary to extend relativity to accelerations, and when this was done, co-ordinates ceased to have the clear physical meaning they had formerly possessed. It is true that, even in the general theory, a co-ordinate, in any system which can actually be used, will always have some physical significance, but its significance is trivial and complicated, not, as before, important and simple.

It is natural to ask: Could we not dispense with co-ordinates altogether, since they have become little more than conventional names systematically assigned? Perhaps this will become possible in time, but at present the necessary mathematics is lacking. We wish, for example, to be able to differentiate, and we cannot differentiate a function unless its arguments and values are numbers. This is not due to what might seem the more difficult parts of the definition of a differential. We can define for a non-numerical function the limit (if it exists) of a function for a given argument, and also the four limits which exist more frequently, viz. the maximum and minimum for approaches from above and below; we can also define a "continuous" non-numerical function. (See Principia Mathematica, *230 to *234.) What, so far, has not been defined, except for numbers, is a fraction. Now $dy/dx$ is the limit of a fraction; thus, although we can generalize the notion of a limit, we cannot at present generalize $dy/dx$, because we cannot generalize the notion of a fraction. It seems clear a priori that, since differentiation of co-ordinates is physically useful even when the quantitative value of the co-ordinates is conventional, there must be some process, of which differentiation is a special numerical form, which can be applied wherever we have continuous functions, even when they are non-numerical. To define such a process is a problem in mathematical logic, probably soluble, but hitherto unsolved. If it were solved, it might become possible to avoid the elaborate and round-about process of assigning co-ordinates and then treating almost all their properties as irrelevant, which is what is done when the method of tensors is employed.

There are, it is true, certain numbers which are important in the new geometry: they are those giving the measure of intervals. But, as we have already seen, two points at a finite distance apart do not have an unambiguous interval; and any two points are at a finite distance apart. The numbers involved in the notion of interval are not finite distances, but numbers derivable from the sixteen coefficients involved in the formula for $ds^2$ in the previous chapter. These coefficients themselves depend upon the co-ordinate system, but $ds^2$ does not. We cannot develop this theme until we have considered geodesics; it is from them that we must derive the numbers which have, in the new geometry, the same sort of physical importance as co-ordinates were supposed to have in the old. These numbers will be the integrals of $ds$ taken along certain geodesics. But, unlike lengths in the old metrical geometry, they are geometrically insufficient. To avoid irrelevant complications, we may illustrate this insufficiency by considering the special theory.
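The remark that the coefficients depend upon the co-ordinate system while $ds^2$ does not can be illustrated before passing on. The sketch below assumes the usual quadratic form, $ds^2 = \sum g_{\mu\nu}\, dx_\mu\, dx_\nu$, with the coefficients transforming as a covariant tensor and the displacement as a contravariant vector; the symbol `g`, the restriction to two dimensions, and the sample matrices are assumptions of the illustration, not taken from the text.

```python
def transform_vector(jac, dx):
    """dx'_mu = sum (d x'_mu / d x_al) dx_al  (contravariant law)."""
    n = len(dx)
    return [sum(jac[mu][al] * dx[al] for al in range(n)) for mu in range(n)]

def transform_coefficients(inv, g):
    """g'_{mu nu} = sum (dx_al/dx'_mu)(dx_be/dx'_nu) g_{al be}  (covariant law)."""
    n = len(g)
    return [[sum(inv[al][mu] * inv[be][nu] * g[al][be]
                 for al in range(n) for be in range(n))
             for nu in range(n)] for mu in range(n)]

def ds_squared(g, dx):
    """ds^2 = sum over mu, nu of g_{mu nu} dx_mu dx_nu."""
    n = len(dx)
    return sum(g[mu][nu] * dx[mu] * dx[nu] for mu in range(n) for nu in range(n))

jac = [[1, 2], [0, 1]]   # hypothetical d x'_mu / d x_alpha, indexed [mu][alpha]
inv = [[1, -2], [0, 1]]  # its inverse: d x_alpha / d x'_mu, indexed [alpha][mu]

g_old = [[1, 0], [0, 1]]  # hypothetical coefficients in the old system
dx_old = [3, 4]

g_new = transform_coefficients(inv, g_old)
dx_new = transform_vector(jac, dx_old)

print(g_old, g_new)                                        # coefficients differ
print(ds_squared(g_old, dx_old), ds_squared(g_new, dx_new))  # ds^2 agrees
```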

The most obvious example of the failure of interval to constitute a geometry is derived from consideration of light-rays. The interval between two events which are parts of the same light-ray is zero. Suppose now that a light-ray starts from an event $A$, and arrives at an event $B$; at the moment when it reaches $B$, another light-ray starts from $B$ and reaches $C$. Then the interval between $A$ and $B$ is zero, that between $B$ and $C$ is zero, but that between $A$ and $C$ may have any time-like magnitude. Euclid proved that two sides of a triangle are together greater than the third side, and was criticized on the ground that this proposition was evident even to asses. But in relativity geometry this proposition is false. In our triangle $ABC$, $AB$ and $BC$ are zero, while $AC$ may have any finite magnitude.
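A small computation shows such a triangle. The sketch assumes one space dimension, units in which the velocity of light is 1, and the sign convention in which time-like intervals have a positive square; the three events and the name `interval_squared` are chosen only for illustration.

```python
def interval_squared(e1, e2):
    """Squared interval dt^2 - dx^2 between events given as (t, x), with c = 1."""
    dt = e2[0] - e1[0]
    dx = e2[1] - e1[1]
    return dt * dt - dx * dx

A = (0, 0)  # a light-ray leaves the origin
B = (1, 1)  # it arrives one unit away, one unit of time later
C = (2, 0)  # a second ray from B returns to the starting place

ab = interval_squared(A, B)
bc = interval_squared(B, C)
ac = interval_squared(A, C)

print(ab, bc, ac)
```

Here two sides of the triangle have zero interval while the third has a squared interval of 4, so the two zero sides together are less than the third.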

Again, the events which are parts of a single light-ray have a definite time-order, in spite of the fact that the interval between any two of them is zero. This appears as follows. Suppose a light-ray proceeds from the sun to the moon and is thence reflected to the earth: it reaches the earth later than a direct ray which left the sun at the same time. There is therefore a definite sense in saying that the ray reached the moon later than it left the sun, i.e. we can say that the ray went from the sun to the moon, not from the moon to the sun. Generalizing, we may say: If $A$ and $B$ are part of one light-ray, and light-rays from $A$ and $B$, distinct from the previous light-ray, contain events $A'$, $B'$ whose interval is time-like, then the time-order of $A'$, $B'$ is the same whatever these new light-rays may be, i.e. we shall have always $A'$ before $B'$, or always $B'$ before $A'$. In the first case, we say that the "sense" of the ray is from $A$ to $B$; in the second, from $B$ to $A$. This illustrates the difficulties which would arise if we were to attempt to found our geometry on interval alone. We must also take account of the purely ordinal properties of the space-time manifold. These properties give a wide separation between the departure of a light-ray from the sun and its arrival on the earth, although the "interval" between these two events is zero.

Reverting now to the method of tensors and its possible eventual simplification, it seems probable that we have an example of a general tendency to over-emphasize numbers, which has existed in mathematics ever since the time of Pythagoras, though it was temporarily less prominent in later Greek geometry as exemplified in Euclid. Euclid's theory of proportion does not, of course, dispense with numbers, since it uses "equimultiples"; but at any rate it requires only integers, not irrationals. Owing to the fact that arithmetic is easy, Greek methods in geometry have been in the background since Descartes, and co-ordinates have come to seem indispensable. But mathematical logic has shown that number is logically irrelevant in many problems where it formerly seemed essential, notably mathematical induction, limits, and continuity. A new technique, which seems difficult because it is unfamiliar, is required when numbers are not used; but there is a compensating gain in logical purity. It should be possible to apply a similar process of purification to physics. The method of tensors first assigns co-ordinates, and then shows how to obtain results which, though expressed in terms of co-ordinates, do not really depend upon them. There must be a less indirect technique possible, in which we use no more apparatus than is logically necessary, and have a language which will only express such facts as are now expressed in the language of tensors, not such as depend upon the choice of co-ordinates. I do not say that such a method, if discovered, would be preferable in practice, but I do say that it would give a better expression of the essential relations, and greatly facilitate the task of the philosopher. In the meantime, the method of tensors is technically delightful, and suffices for mathematical needs.

FOOTNOTES:

[24] For what follows see Eddington, Mathematical Theory of Relativity, chap. II., Cambridge, 1924.

[25] See Eddington, op. cit., p. 134.
