$c\sqrt{\dfrac{1}{m} + \dfrac{1}{n}}$.

§ 20. So far, the problem under investigation has been of a direct kind. We have supposed that the ultimate mean value or central position has been given to us; either à priori (as in many games of chance), or from more immediate physical considerations (as in aiming at a mark), or from extensive statistics (as in tables of human stature). In all such cases therefore the main desideratum is already taken for granted, and it may reasonably be asked what remains to be done. The answers are various. For one thing we may want to estimate the value of an average of many when compared with an average of a few. Suppose that one man has collected statistics including 1000 instances, and another has collected 4000 similar instances. Common sense can recognize that the latter are better than the former; but it has no idea how much better they are. Here, as elsewhere, quantitative precision is the privilege of science. The answer we receive from this quarter is that, in the long run, the modulus,—and with this the probable error, the mean error, and the error of mean square, which all vary in proportion,—diminishes inversely as the square root of the number of measurements or observations. (This follows from the second of the above formulæ.) Accordingly the probable error of the more extensive statistics here is one half that of the less extensive. Take another instance. Observation shows that “the mean height of 2,315 criminals differs from the mean height of 8,585 members of the general adult population by about two inches” (v. Edgeworth, Methods of Statistics: Stat. Soc. Journ. 1885). As before, common sense would feel little doubt that such a difference was significant, but it could give no numerical estimate of the significance. Appealing to science, we see that this is an illustration of the third of the above formulæ. 
What we really want to know is the odds against the averages of two large batches differing by an assigned amount: in this case by an amount equalling twenty-five times the modulus of the variable quantity. The odds against this are many billions to one.
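The arithmetic of this section can be checked with a short sketch. It assumes that the two formulæ appealed to take the forms c/√n for the modulus of an average of n observations, and c√(1/m + 1/n) for the modulus of the difference between two averages of m and of n observations. The function names and the unit value of c are illustrative assumptions, not part of the original text.

```python
import math

def modulus_of_mean(c, n):
    """Modulus of the average of n observations, each having modulus c."""
    return c / math.sqrt(n)

def modulus_of_difference(c, m, n):
    """Modulus of the difference between an average of m and an average of n."""
    return c * math.sqrt(1.0 / m + 1.0 / n)

c = 1.0  # assumed modulus of a single observation, in arbitrary units

# 4000 instances against 1000: the modulus (and so the probable error) is halved.
ratio = modulus_of_mean(c, 4000) / modulus_of_mean(c, 1000)
print(ratio)

# 2,315 criminals against 8,585 members of the general population.
difference_modulus = modulus_of_difference(c, 2315, 8585)
print(difference_modulus)
```

The second figure is expressed as a fraction of the modulus of a single height, so an observed difference is to be judged by how many multiples of this small quantity it contains.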

§ 21. The number of direct problems which will thus admit of solution is very great, but we must confine ourselves here to the main inverse problem to which the foregoing discussion is a preliminary. It is this. Given a few only of one of these groups of measurements or observations; what can we do with these, in the way of determining that mean about which they would ultimately be found to cluster? Given a large number of them, they would betray the position of their ultimate centre with constantly increasing certainty: but we are now supposing that there are only a few of them at hand, say half a dozen, and that we have no power at present to add to the number.

In other words,—expressing ourselves by the aid of graphical illustration, which is perhaps the best method for the novice and for the logical student,—in the direct problem we merely have to draw the curve of frequency from a knowledge of its determining elements; viz. the position of the centre, and the numerical value of the modulus. In the inverse problem, on the other hand, we have three elements at least, to determine. For not only must we, (1), as before, determine whereabouts the centre may be assumed to lie; and (2), as before, determine the value of the modulus or degree of dispersion about this centre. This does not complete our knowledge. Since neither of these two elements is assigned with certainty, we want what is always required in the Theory of Chances, viz. some estimate of their probable truth. That is, after making the best assignment we can as to the value of these elements, we want also to assign numerically the ‘probable error’ committed in such assignment. Nothing more than this can be attained in Probability, but nothing less than this should be set before us.

§ 22. (1) As regards the first of these questions, the answer is very simple. Whether the number of measurements or observations be few or many, we must make the assumption that their average is the point we want; that is, that the average of the few will coincide with the ultimate average. This is the best, in fact the only assumption we can make. We should adopt this plan, of course, in the extreme case of there being only one value before us, by just taking that one; and our confidence increases slowly with the number of values before us. The only difference therefore here between knowledge resting upon such data, and knowledge resting upon complete data, lies not in the result obtained but in the confidence with which we entertain it.

§ 23. (2) As regards the second question, viz. the determination of the modulus or degree of dispersion about the mean, much the same may be said. That is, we adopt the same rule for the determination of the E.M.S. (error of mean square) by which the modulus is assigned, as we should adopt if we possessed full information. Or rather we are confined to one of the rules given on p. 473, viz. the second, for by supposition we have neither the à priori knowledge which would be able to supply the first, nor a sufficient number of observations to justify the third. That is, we reckon the errors, measured from the average, and calculate their mean square: twice this is equal to the square of the modulus of the probable curve of facility.[8]
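The rule just stated may be sketched as follows. The six measurements are hypothetical, and the mean square is taken by dividing by the number of observations, as the wording of the rule suggests; a writer who preferred to divide by one less than that number would obtain a slightly larger modulus.

```python
import math

def modulus_from_sample(values):
    """Return (average, mean square of errors, modulus) for a few observations."""
    average = sum(values) / len(values)
    # Errors are reckoned from the average of the observations themselves.
    mean_square = sum((v - average) ** 2 for v in values) / len(values)
    # Twice the mean square equals the square of the modulus.
    modulus = math.sqrt(2.0 * mean_square)
    return average, mean_square, modulus

measurements = [149.8, 150.1, 150.3, 149.9, 150.0, 149.9]  # hypothetical, in feet
average, mean_square, modulus = modulus_from_sample(measurements)
print(average, mean_square, modulus)
```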

§ 24. (3) The third question demands for its solution somewhat advanced mathematics; but the results can be indicated without much difficulty. A popular way of stating our requirement would be to say that we want to know how likely it is that the mean of the few, which we have thus accepted, shall coincide with the true mean. But this would be to speak loosely, for the chances are of course indefinitely great against such precise coincidence. What we really do is to assign the ‘probable error’; that is, to assign a limit which it is as likely as not that the discrepancy between the inferred mean and the true mean should exceed.[9] To take a numerical example: suppose we had made several measurements of a wall with a tape, and that the average of these was 150 feet. The scrupulous surveyor would give us this result, with some such correction as this added,—‘probable error 3 inches’. All that this means is that we may assume that the true value is 150 feet, with a confidence that in half the cases (of this description) in which we did so, we should really be within three inches of the truth.

The expression for this probable error is a simple multiple of the modulus: it is the modulus multiplied by 0.4769…. That it should be some function of the modulus, or E.M.S., seems plausible enough; for the greater the errors,—in other words the wider the observed discrepancy amongst our measurements,—the less must be the confidence we can feel in the accuracy of our determination of the mean. But, of course, without mathematics we should be quite unable to attempt any numerical assignment.
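The multiplier 0.4769… can be recovered numerically. The sketch below assumes the exponential law of error in the form given in the notes, with modulus c, so that the chance of an error lying within a distance t of the centre is erf(t/c); the probable error is then the value of t for which this chance is one half. The bisection routine and its name are illustrative only.

```python
import math

def probable_error_factor(tolerance=1e-12):
    """Solve erf(x) = 1/2 by bisection; x is the probable error in units of the modulus."""
    low, high = 0.0, 1.0  # erf(0) = 0 and erf(1) is about 0.84, so the root lies between
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if math.erf(middle) < 0.5:
            low = middle
        else:
            high = middle
    return (low + high) / 2.0

factor = probable_error_factor()
print(round(factor, 4))  # 0.4769

# A stated probable error of 3 inches therefore corresponds to this modulus, in inches:
modulus_inches = 3.0 / factor
print(modulus_inches)
```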

§ 25. The general conclusion therefore is that the determination of the curve of facility,—and therefore ultimately of every conclusion which rests upon a knowledge of this curve,—where only a few observations are available, is of just the same kind as where an infinity are available. The rules for obtaining it are the same, but the confidence with which it can be accepted is less.

The knowledge, therefore, obtainable by an average of a small number of measurements of any kind, hardly differs except in degree from that which would be attainable by an indefinitely extensive series of them. We know the same sort of facts, only we are less certain about them. But, on the other hand, the knowledge yielded by an average even of a small number differs in kind from that which is yielded by a single measurement. Revert to our marksman, whose bullseye is supposed to have been afterwards removed. If he had fired only a single shot, not only should we be less certain of the point he had aimed at, but we should have no means whatever of guessing at the quality of his shooting, or of inferring in consequence anything about the probable remoteness of the next shot from that which had gone before. But directly we have a plurality of shots before us, we not merely feel more confident as to whereabouts the centre of aim was, but we also gain some knowledge as to how the future shots will cluster about the spot thus indicated. The quality of his shooting begins at once to be betrayed by the results.

§ 26. Thus far we have been supposing the Law of Facility to be of the Binomial type. There are several reasons for discussing this at such comparative length. For one thing it is the only type which, or something approximately resembling which, is actually prevalent over a wide range of phenomena. Then again, in spite of its apparent intricacy, it is really one of the simplest to deal with; owing to the fact that every curve of facility derived from it by taking averages simply repeats the same type again. The curve of the average only differs from that of the single elements in having a smaller modulus; and its modulus is smaller in a ratio which is exceedingly easy to give. If that of the one is $c$, that of the other (derived by averaging $n$ single elements) is $c/\sqrt{n}$.

But for understanding the theory of averages we must consider other cases as well. Take then one which is intrinsically as simple as it possibly can be, viz. that in which all values within certain assigned limits are equally probable. This is a case familiar enough in abstract Probability, though, as just remarked, not so common in natural phenomena. It is the state of things when we act at random directly upon the objects of choice;[10] as when, for instance, we choose digits at random out of a table of logarithms.

The reader who likes to do so can without much labour work out the result of taking an average of two or three results by proceeding in exactly the same way which we adopted on p. 476. The ‘curve of facility’ with which we have to start in this case has become of course simply a finite straight line. Treating the question as one of simple combinations, we may divide the line into a number of equal parts, by equidistant points; and then proceed to take these two and two together in every possible way, as we did in the case discussed some pages back.

If we did so, what we should find would be this. When an average of two is taken, the ‘curve of facility’ of the average becomes a triangle with the initial straight line for base; so that the ultimate mean or central point becomes the likeliest result even with this commencement of the averaging process. If we were to take averages of three, four, and so on, what we should find would be that the Binomial law begins to display itself here. The familiar bell shape of the exponential curve would be more and more closely approximated to, until we obtained something quite indistinguishable from it.
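The combinations described can be counted directly. The sketch below divides the line into five equidistant points (an arbitrary choice made for illustration) and counts the number of ways in which each total, and therefore each average, can arise when the points are taken two and three together.

```python
def convolve(p, q):
    """Counts for the total of one value drawn from p and one drawn from q."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def counts_for_total(points, draws):
    """Ways of reaching each total when `draws` values are taken from equally likely points."""
    single = [1] * points
    result = single
    for _ in range(draws - 1):
        result = convolve(result, single)
    return result

pairs = counts_for_total(5, 2)
triples = counts_for_total(5, 3)
print(pairs)    # [1, 2, 3, 4, 5, 4, 3, 2, 1]
print(triples)  # [1, 3, 6, 10, 15, 18, 19, 18, 15, 10, 6, 3, 1]
```

The counts for pairs rise and fall by equal steps, which is the triangle; those for triples already round off at the summit and flatten towards the extremities.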

§ 27. The conclusion therefore is that when we are dealing with averages involving a considerable number it is not necessary, in general, to presuppose the binomial law of distribution in our original data. The law of arrangement of what we may call the derived curve, viz. that corresponding to the averages, will not be appreciably affected thereby. Accordingly we seem to be justified in bringing to bear all the same apparatus of calculation as in the former case. We take the initial average as the probable position of the true centre or ultimate average: we estimate the probability that we are within an assignable distance of the truth in so doing by calculating the ‘error of mean square’; and we appeal to this same element to determine the modulus, i.e. the amount of contraction or dispersion, of our derived curve of facility.

The same general considerations will apply to most other kinds of Law of Facility. Broadly speaking,—we shall come to the examination of certain exceptions immediately,—whatever may have been the primitive arrangement (i.e. that of the single results) the arrangement of the derived results (i.e. that of the averages) will be more crowded up towards the centre. This follows from the characteristic of combinations already noticed, viz. that extreme values can only be got at by a repetition of several extremes, whereas intermediate values can be got at either by repetition of intermediates or through the counteraction of opposite extremes. Provided the original distribution be symmetrical about the centre, and provided the limits of possible error be finite, or if infinite, that the falling off of frequency as we recede from the mean be very rapid, then the results of taking averages will be better than those of trusting to single results.

§ 28. We will now take notice of an exceptional case. We shall do so, not because it is one which can often actually occur, but because the consideration of it will force us to ask ourselves with some minuteness what we mean in the above instances by calling the results of the averages ‘better’ than those of the individual values. A diagram will bring home to us the point of the difficulty better than any verbal or symbolic description.

Distribution for two samples from a non-Gaussian distribution

The black line represents a Law of Error easily stated in words, and one which, as we shall subsequently see, can be conceived as occurring in practice. It represents a state of things under which up to a certain distance from O, on each side, viz. to A and B, the probability of an error diminishes uniformly with the distance from O; whilst beyond these points, up to E and F, the probability of error remains constant. The dotted line represents the resultant Law of Error obtained by taking the average of the former two and two together. Now is the latter ‘better’ than the former? Under it, certainly, great errors are less frequent and intermediate ones more frequent; but then on the other hand the small errors are less frequent: is this state of things on the whole an improvement or not? This requires us to reconsider the whole question.

§ 29. In all the cases discussed in the previous sections the superiority of the curve of averages over that of the single results showed itself at every point. The big errors were scarcer and the small errors were commoner; it was only just at one intermediate point that the two were on terms of equality, and this point was not supposed to possess any particular significance or importance. Accordingly we had no occasion to analyse the various cases included under the general relation. It was enough to say that one was better than the other, and it was sufficient for all purposes to take the ‘modulus’ as the measure of this superiority. In fact we are quite safe in simply saying that the average of those average results is better than that of the individual ones.

When however we proceed in what Hume calls “the sifting humour,” and enquire why it is sufficient thus to trust to the average; we find, in addition to the considerations hitherto advanced, that some postulate was required as to the consequences of the errors we incur. It involved an estimate of what is sometimes called the ‘detriment’ of an error. It seemed to take for granted that large and small errors all stand upon the same general footing of being mischievous in their consequences, but that their evil effects increase in a greater ratio than that of their own magnitude.

§ 30. Suppose, for comparison, a case in which the importance of an error is directly proportional to its magnitude (of course we suppose positive and negative errors to balance each other in the long run): it does not appear that any advantage would be gained by taking averages. Something of this sort may be considered to prevail in cases of mere purchase and sale. Suppose that any one had to buy a very large number of yards of cloth at a constant price per yard: that he had to do this, say, five times a day for many days in succession. And conceive that the measurement of the cloth was roughly estimated on each separate occasion, with resultant errors which are as likely to be in excess as in defect. Would it make the slightest difference to him whether he paid separately for each piece; or whether the five estimated lengths were added together, their average taken, and he were charged with this average price for each piece? In the latter case the errors which will be made in the estimation of each piece will of course be less in the long run than they would be in the former: will this be of any consequence? The answer surely is that it will not make the slightest difference to either party in the bargain. In the long run, since the same parties are concerned, it will not matter whether the intermediate errors have been small or large.

Of course nothing of this sort can be regarded as the general rule. In almost every case in which we have to make measurements we shall find that large errors are much more mischievous than small ones, that is, mischievous in a greater ratio than that of their mere magnitude. Even in purchase and sale, where different purchasers are concerned, this must be so, for the pleasure of him who is overserved will hardly equal the pain of him who is underserved. And in many cases of scientific measurement large errors may be simply fatal, in the sense that if there were no reasonable prospect of avoiding them we should not care to undertake the measurement at all.

§ 31. If we were only concerned with practical considerations we might stop at this point; but if we want to realize the full logical import of average-taking as a means to this particular end, viz. of estimating some assigned magnitude, we must look more closely into such an exceptional case as that which was indicated in the figure on p. 493. What we there assumed was a state of things in reference to which extremely small errors were very frequent, but that when once we got beyond a certain small range all other errors, within considerable limits, were equally likely.

It is not difficult to imagine an example which will aptly illustrate the case in point: at worst it may seem a little far-fetched. Conceive then that some firm in England received a hurried order to supply a portion of a machine, say a steam-engine, to customers at a distant place; and that it was absolutely essential that the work should be true to the tenth of an inch for it to be of any use. But conceive also that two specifications had been sent, resting on different measurements, in one of which the length of the requisite piece was described as sixty and in the other sixty-one inches. On the assumption of any ordinary law of error, whether of the binomial type or not, there can be no doubt that the firm would make the best of a very bad job by constructing a piece of 60 inches and a half: i.e. they would have a better chance of being within the requisite tenth of an inch by so doing, than by taking either of the two specifications at random and constructing it accurately to this. But if the law were of the kind indicated in our diagram,[11] then it seems equally certain that they would be less likely to be within the requisite narrow margin by so doing. As a mere question of probability,—that is, if such estimates were acted upon again and again,—there would be fewer failures encountered by simply choosing one of the conflicting measurements at random and working exactly to this, than by trusting to the average of the two.
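A rough simulation will illustrate the comparison. The law of error assumed here is one possible reading of the diagram: half the errors fall in a narrow triangular peak about the true value, and the other half are spread evenly over a wide range. The widths, the weights, the tolerance of a tenth of an inch, and the function names are all assumptions chosen for illustration.

```python
import random

def draw_error(rng, peak_half_width=0.1, plateau_half_width=10.0, peak_weight=0.5):
    """One error: a narrow triangular peak about zero, laid upon a wide level plateau."""
    if rng.random() < peak_weight:
        return rng.triangular(-peak_half_width, peak_half_width, 0.0)
    return rng.uniform(-plateau_half_width, plateau_half_width)

def success_rates(trials=100_000, tolerance=0.1, seed=0):
    """Proportion of cases within tolerance: one measurement at random, or the average of two."""
    rng = random.Random(seed)
    single_hits = 0
    average_hits = 0
    for _ in range(trials):
        first, second = draw_error(rng), draw_error(rng)
        chosen = first if rng.random() < 0.5 else second
        if abs(chosen) <= tolerance:
            single_hits += 1
        if abs((first + second) / 2.0) <= tolerance:
            average_hits += 1
    return single_hits / trials, average_hits / trials

single_rate, average_rate = success_rates()
print(single_rate, average_rate)
```

Under these assumptions the single measurement succeeds in about half the cases and the average in about a quarter, since the average is only safe when both measurements happen to fall within the narrow peak.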

This suggests some further reflections as to the taking of averages. We will turn now to another exceptional case, but one involving somewhat different considerations than those which have been just discussed. As before, it may be most conveniently introduced by commencing with an example.

§ 32. Suppose then that two scouts were sent to take the calibre of a gun in a hostile fort,—we may conceive that the fort was to be occupied next day, and used against the enemy, and that it was important to have a supply of shot or shell,—and that the result is that one of them reports the calibre to be 8 inches and the other 9. Would it be wise to assume that the mean of these two, viz. 8¹/₂ inches, was a likelier value than either separately?

The answer seems to be this. If we have reason to suppose that the possible calibres partake of the nature of a continuous magnitude, i.e. that all values, within certain limits, are to be considered as admissible (an assumption which we always make in our ordinary inverse step from an observation or magnitude to the thing observed or measured), then we should be justified in selecting the average as the likelier value. But if, on the other hand, we had reason to suppose that whole inches are always or generally preferred, as is in fact the case now with heavy guns, we should do better to take, even at hazard, one of the two estimates set before us, and trust this alone instead of taking an average of the two.

§ 33. The principle upon which we act here may be stated thus. Just as in the direct process of calculating or displaying the ‘errors’, whether in an algebraic formula or in a diagram, we generally assume that their possibility is continuous, i.e. that all intermediate values are possible; so, in the inverse process of determining the probable position of the original from the known value of two or more errors, we assume that that position is capable of falling at any point whatever between certain limits. In such an example as the above, where we know or suspect a discontinuity of that possibility of position, the value of the average may be entirely destroyed.

In the above example we were supposed to know that the calibre of the guns was likely to run in English inches or in some other recognized units. But if the battery were in China or Japan, and we knew nothing of the standards of length in use there, we could no longer appeal to this principle. It is doubtless highly probable that those calibres are not of the nature of continuously varying magnitudes; but in an entire ignorance of the standards actually adopted, we are to all intents and purposes in the same position as if they were of that continuous nature. When this is so the objections to trusting to the average would no longer hold good, and if we had only one opportunity, or a very few opportunities, we should do best to adhere to the customary practice.

§ 34. When however we are able to collect and compare a large number of measurements of various objects, this consideration of the probable discontinuity of the objects we thus measure,—that is, their tendency to assume some one or other of a finite number of distinct magnitudes, instead of showing an equal readiness to adapt themselves to all intermediate values,—again assumes importance. In fact, given a sufficient number of measurable objects, we can actually deduce with much probability the standard according to which the things in question were made.

This is the problem which Mr Flinders Petrie has attacked with so much acuteness and industry in his work on Inductive Metrology, a work which, merely on the ground of its speculative interest, may well be commended to the student of Probability. The main principles on which the reasoning is based are these two:—(1) that all artificers are prone to construct their works according to round numbers, or simple fractions, of their units of measurement; and (2) that, aiming to secure this, they will stray from it in tolerable accordance with the law of error. The result of these two assumptions is that if we collect a very large number of measurements of the different parts and proportions of some ancient building,—say an Egyptian temple,—whilst no assignable length is likely to be permanently unrepresented, yet we find a marked tendency for the measurements to cluster about certain determinate points in our own, or any other standard scale of measurement. These points mark the length of the standard, or of some multiple or submultiple of the standard, employed by the old builders. It need hardly be said that there are a multitude of practical considerations to be taken into account before this method can be expected to give trustworthy results, but the leading principles upon which it rests are comparatively simple.
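The kind of inference involved may be illustrated by a simple sketch. It is not Mr Petrie's own procedure: it substitutes a plain periodicity score, which rates a trial unit by how nearly every measurement falls upon a whole multiple of it. The measurements are hypothetical, and the range of trial units is deliberately restricted so that simple fractions and multiples of the unit are excluded from the search.

```python
import math

def periodicity_score(measurements, unit):
    """Mean of cos(2*pi*x/unit); near 1 when every x lies close to a multiple of unit."""
    total = sum(math.cos(2.0 * math.pi * x / unit) for x in measurements)
    return total / len(measurements)

def best_unit(measurements, lowest, highest, step):
    """Trial unit between lowest and highest (inclusive) having the greatest score."""
    steps = int(round((highest - lowest) / step))
    candidates = [lowest + i * step for i in range(steps + 1)]
    return max(candidates, key=lambda u: periodicity_score(measurements, u))

# Hypothetical lengths, in inches, of parts of one building.
lengths = [20.5, 41.3, 61.9, 103.2, 82.3, 144.0, 20.7, 41.1, 123.7]
unit = best_unit(lengths, 15.0, 30.0, 0.05)
print(unit)
```

With these figures the score singles out a unit of about 20.6 inches. In practice the multitude of practical considerations mentioned above would have to be weighed before any such result were trusted.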

§ 35. The case just considered is really nothing else than the recurrence, under a different application, of one which occupied our attention at a very early stage. We noticed (Chap. II.) the possibility of a curve of facility which instead of having a single vertex like that corresponding to the common law of error, should display two humps or vertices. It can readily be shown that this problem of the measurements of ancient buildings, is nothing more than the reopening of the same question, in a slightly more complex form, in reference to the question of the functions of an average.

Take a simple example. Suppose an instance in which great errors, of a certain approximate magnitude, are distinctly more likely to be committed than small ones, so that the curve of facility, instead of rising into one peak towards the centre, as in that of the familiar law of error, shows a depression or valley there. Imagine, in fact, two binomial curves, with a short interval between their centres. Now if we were to calculate the result of taking averages here we should find that this at once tends to fill up the valley; and if we went on long enough, that is, if we kept on taking averages of sufficiently large numbers, a peak would begin to arise in the centre. In fact the familiar single binomial curve would begin to make its appearance.
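The filling up of the valley can be shown by direct counting. The relative frequencies below are hypothetical: they describe seven equidistant values with a hump on either side of a central depression.

```python
def convolve(p, q):
    """Counts for the total of one value drawn from p and one drawn from q."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# Relative frequencies of single results: two humps with a valley at the centre.
single = [1, 3, 4, 2, 4, 3, 1]

# Relative frequencies of totals (and so of averages) of two results.
pairs = convolve(single, single)
print(pairs)  # [1, 6, 17, 28, 36, 46, 56, 46, 36, 28, 17, 6, 1]
```

With humps so near together the valley has vanished at the first step, and the centre has become the likeliest position for the average.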

§ 36. The question then at once suggests itself, ought we to do this? Shall we give the average free play to perform its allotted function of thus crowding things up towards the centre? To answer this question we must introduce a distinction. If that peculiar double-peaked curve had been, as it conceivably might, a true error-curve,—that is, if it had represented the divergences actually made in aiming at the real centre,—the result would be just what we should want. It would furnish an instance of the advantages to be gained by taking averages even in circumstances which were originally unfavourable. It is not difficult to suggest an appropriate illustration. Suppose a man firing at a mark from some sheltered spot, but such that the range crossed a broad exposed valley up or down which a strong wind was generally blowing. If the shot-marks were observed we should find them clustering about two centres to the right and left of the bullseye. And if the results were plotted out in a curve they would yield such a double-peaked curve as we have described. But if the winds were equally strong and prevalent in opposite directions, we should find that the averaging process redressed the consequent disturbance.

If however the curve represented, as it is decidedly more likely to do, some outcome of natural phenomena in which there was, so to say, a real double aim on the part of nature, it would be otherwise. Take, for instance, the results of measuring a large number of people who belonged to two very heterogeneous races. The curve of facility would here be of the kind indicated on p. 45, and if the numbers of the two commingled races were equal it would display a pair of twin peaks. Again the question arises, ‘ought’ we to involve the whole range within the scope of a single average? The answer is that the obligation depends upon the purpose we have in view. If we want to compare that heterogeneous race, as a whole, with some other, or with itself at some other time, we shall do well to average without analysis. All statistics of population, as we have already seen (v. p. 47), are forced to neglect a multitude of discriminating characteristics of the kind in question. But if our object were to interpret the causes of this abnormal error-curve we should do well to break up the statistics into corresponding parts, and subject these to analysis separately.

Similarly with the measurements of the ancient buildings. In this case if all our various ‘errors’ were thrown together into one group of statistics we should find that the resultant curve of facility displayed, not two peaks only, but a succession of them; and these of various magnitudes, corresponding to the frequency of occurrence of each particular measurement. We might take an average of the whole, but hardly any rational purpose could be subserved in so doing; whereas each separate point of maximum frequency of occurrence has something significant to teach us.

§ 37. One other peculiar case may be noticed in conclusion. Suppose a distinctly asymmetrical, or lop-sided curve of facility, such as this:—

An asymmetric (lop-sided) distribution

Laws of error, of which this is a graphical representation, are, I apprehend, far from uncommon. The curve in question, is, in fact, but a slight exaggeration of that of barometrical heights as referred to in the last chapter; when it was explained that in such cases the mean, the median, and the maximum ordinate would show a mutual divergence. The doubt here is not, as in the preceding instances, whether or not a single average should be taken, but rather what kind of average should be selected. As before, the answer must depend upon the special purpose we have in view. For all ordinary purposes of comparison between one time or place and another, any average will answer, and we should therefore naturally take the arithmetical, as the most familiar, or the median, as the simplest.

§ 38. Cases might however arise under which other kinds of average could justify themselves, with a momentary notice of which we may now conclude. Suppose, for instance, that the question involved here were one of desirability of climate. The ordinary mean, depending as it does so largely upon the number and magnitude of extreme values, might very reasonably be considered a less appropriate test than that of judging simply by the relatively most frequent value: in other words, by the maximum ordinate. And various other points of view can be suggested in respect of which this particular value would be the most suitable and significant.
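The divergence of the three values in a lop-sided group can be seen in a small hypothetical set of readings, in which a few large values draw the arithmetical mean away from the most frequent value.

```python
import statistics

# Hypothetical readings with a long tail towards the larger values.
readings = [1, 2, 2, 2, 3, 3, 4, 5, 7, 11]

mean = statistics.mean(readings)      # arithmetical average
median = statistics.median(readings)  # middle value
mode = statistics.mode(readings)      # most frequent value (maximum ordinate)
print(mode, median, mean)
```

Here the most frequent value is 2, the median 3, and the arithmetical mean 4; which of them is the most suitable depends upon the purpose in view.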

In the foregoing case, viz. that of the weather curve, there was no objective or ‘true’ value aimed at. But a curve closely resembling this would be representative of that particular class of estimates indicated by Mr Galton, and for which, as he has pointed out, the geometrical mean becomes the only appropriate one. In this case the curve of facility ends abruptly at O: it resembles a much foreshortened modification of the common exponential form. Its characteristics have been discussed in the paper by Dr Macalister already referred to, but any attempt to examine its properties here would lead us into far too intricate details.

§ 39. The general conclusion from all this seems quite in accordance with the nature and functions of an average as pointed out in the last chapter. Every average, it was urged, is but a single representative intermediate value substituted for a plurality of actual values. It must accordingly let slip the bulk of the information involved in these latter. Occasionally, as in most ordinary measurements, the one thing which it represents is obviously the thing we are in want of; and then the only question can be, which mean will most accord with the ‘true’ value we are seeking. But when, as may happen in most of the common applications of statistics, there is really no ‘true value’ of an objective kind behind the phenomena, the problem may branch out in various directions. We may have a variety of purposes to work out, and these may demand some discrimination as regards the average most appropriate for them. Whenever therefore we have any doubt whether the familiar arithmetical average is suitable for the purpose in hand we must first decide precisely what that purpose is.

1 Mr Mansfield Merriman published in 1877 (Trans. of the Connecticut Acad.) a list of 408 writings on the subject of Least Squares.

2 In other words, we are to take the “centre of gravity” of the shot-marks, regarding them as all of equal weight. This is, in reality, the ‘average’ of all the marks, as the elementary geometrical construction for obtaining the centre of gravity of a system of points will show; but it is not familiarly so regarded. Of course, when we are dealing with such cases as occur in Mensuration, where we have to combine or reconcile three or more inconsistent equations, some such rule as that of Least Squares becomes imperative. No taking of an average will get us out of the difficulty.

3 The only reason for supposing this exceptional shape is to secure simplicity. The ordinary target, allowing errors in two dimensions, would yield slightly more complicated results.

4 When first referred to, the general form of this equation was given (v. p. 29). The special form here assigned, in which h/√π is substituted for A, is commonly employed in Probability, because the integral of y dx, between +∞ and −∞, becomes equal to unity. That is, the sum of all the mutually exclusive possibilities is represented, as usual, by unity. In this form of expression h is a quantity of the order x⁻¹; for hx is to be a numerical quantity, standing as it does as an index. The modulus, being the reciprocal of this, is of the same order of quantities as the errors themselves. In fact, if we multiply it by 0.4769… we have the so-called ‘probable error.’
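Both statements in this note can be checked numerically. The sketch below assumes the usual exponential form y = (h/√π)·e^(−h²x²) for the curve in question, and uses the standard fact that the chance of an error lying within ±a is then erf(h·a). The value chosen for h and the function names are arbitrary, for illustration only.

```python
import math


def density(x, h):
    """Assumed law of error: y = (h / sqrt(pi)) * exp(-(h*x)**2)."""
    return h / math.sqrt(math.pi) * math.exp(-((h * x) ** 2))


def total_area(h, steps=20000):
    """Midpoint-rule estimate of the area under the curve over +/- 10 moduli."""
    modulus = 1.0 / h
    lo, hi = -10.0 * modulus, 10.0 * modulus
    width = (hi - lo) / steps
    return sum(density(lo + (i + 0.5) * width, h) for i in range(steps)) * width


def probable_error_factor(tolerance=1e-12):
    """Solve erf(t) = 1/2 by bisection; t is the multiplier of the modulus."""
    lo, hi = 0.0, 1.0
    while hi - lo > tolerance:
        mid = (lo + hi) / 2.0
        if math.erf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


h = 2.5                            # arbitrary illustrative value
factor = probable_error_factor()   # close to 0.4769
probable_error = factor / h        # factor times the modulus 1/h

print(f"area under curve : {total_area(h):.6f}")
print(f"factor           : {factor:.4f}")
print(f"probable error   : {probable_error:.4f}")
```

The probable error is the deviation that an error is as likely to exceed as to fall short of, which is why it is found here by asking where erf reaches one half.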

5 See, for the explanation of this, and of the graphical method of illustrating it, the note on p. 29.

6 Broadly speaking, we may say that the above remarks hold good of any law of frequency of error in which there are actual limits, however wide, to the possible magnitude of an error. If there are no limits to the possible errors, this characteristic of an average to heap its results up towards the centre will depend upon circumstances. When, as in the exponential curve, the approximation to the base, as asymptote, is exceedingly rapid (that is, when the extreme errors are relatively very few), it still holds good. But if we were to take as our law of facility such an equation as y = π/(1 + x²) (as hinted by De Morgan and noted by Mr Edgeworth: Camb. Phil. Trans. vol. X. p. 184, and vol. XIV. p. 160) it does not hold good. The result of averaging is to diminish the tendency to cluster towards the centre.
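The contrast drawn in this note can be tried out by simulation. The sketch below is only an illustration under stated assumptions: the curve in 1/(1 + x²) is taken in its normalised form 1/(π(1 + x²)), now usually called the Cauchy law; the batch size, number of trials and seed are arbitrary; and spread is measured by the median absolute deviation from the centre, which plays the part of the probable error. For this law the standard result is that an average of many observations is spread exactly as widely as a single one, so the simulation shows the heaping-up failing rather than the precise degree of failure.

```python
import math
import random
import statistics


def cauchy_draw(rng):
    """One error from the law proportional to 1 / (1 + x**2)."""
    return math.tan(math.pi * (rng.random() - 0.5))


def normal_draw(rng):
    """One error from the ordinary exponential law, for comparison."""
    return rng.gauss(0.0, 1.0)


def probable_error(values):
    """Deviation from zero that half the values exceed and half do not."""
    return statistics.median([abs(v) for v in values])


def shrink_ratio(draw, rng, batch=25, trials=4000):
    """Spread of averages of `batch` errors divided by spread of single errors."""
    singles = [draw(rng) for _ in range(trials)]
    averages = [
        sum(draw(rng) for _ in range(batch)) / batch for _ in range(trials)
    ]
    return probable_error(averages) / probable_error(singles)


rng = random.Random(1888)  # fixed seed so that the run is repeatable
normal_ratio = shrink_ratio(normal_draw, rng)
cauchy_ratio = shrink_ratio(cauchy_draw, rng)

print(f"exponential law: averages are {normal_ratio:.2f} times as spread out")
print(f"1/(1+x^2) law  : averages are {cauchy_ratio:.2f} times as spread out")
```

Under the exponential law the ratio should come out near 1/√25, in agreement with the inverse square root rule; under the other law it stays near unity, so nothing is gained by averaging.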

7 The reader will find the proofs of these and other similar formulæ in Galloway on Probability, and in Airy on Errors.

8 The formula commonly used for the E.M.S. in this case is ∑e²/(n − 1) and not ∑e²/n. The difference is trifling, unless n be small; the justification has been offered for it that since the sum of the squares measured from the true centre is a minimum (that centre being the ultimate arithmetical mean) the sum of the squares measured from the somewhat incorrectly assigned centre will be somewhat larger.
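The effect of the two divisors can be seen exactly in a small case. The sketch below is an illustration, not anything from the text: it assumes the throws of a fair six-faced die as the population, takes every possible batch of three throws, measures the squares from each batch's own mean, and compares the long-run value of each formula with the true mean square. It works with the mean square itself, before any square root is taken, and uses exact fractions so that no rounding enters.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4, 5, 6]   # assumed population: one fair die
n = 3                        # size of each batch of observations

# True mean square of the errors, measured from the true centre.
true_centre = Fraction(sum(faces), len(faces))
true_mean_square = sum((x - true_centre) ** 2 for x in faces) / len(faces)

# Enumerate every equally likely batch of n throws.
batches = list(product(faces, repeat=n))
total_with_n = Fraction(0)
total_with_n_minus_1 = Fraction(0)
for batch in batches:
    batch_mean = Fraction(sum(batch), n)
    sum_of_squares = sum((x - batch_mean) ** 2 for x in batch)
    total_with_n += sum_of_squares / n
    total_with_n_minus_1 += sum_of_squares / (n - 1)

average_with_n = total_with_n / len(batches)
average_with_n_minus_1 = total_with_n_minus_1 / len(batches)

print("true mean square       :", true_mean_square)
print("long-run value, n      :", average_with_n)
print("long-run value, n - 1  :", average_with_n_minus_1)
```

In this case the divisor n − 1 reproduces the true mean square in the long run, while the divisor n falls short of it by the factor (n − 1)/n, a difference that becomes trifling as n grows.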

9 It appears to me that in strict logical propriety we should like to know the probable error committed in both the assignments of the preceding two sections. But the profound mathematicians who have discussed this question, and who alone are competent to treat it, have mostly written with the practical wants of Astronomy in view; and for this purpose it is sufficient to take account of the one great desideratum, viz. the true values sought. Accordingly the only rules commonly given refer to the probable error of the mean.

10 i.e. as distinguished from acting upon them indirectly. This latter proceeding, as explained in the chapter on Randomness, may result in giving a non-uniform distribution.

11 There is no difficulty in conceiving circumstances under which a law very closely resembling this would prevail. Suppose, e.g., that one of the two measurements had been made by a careful and skilled mechanic, and the other by a man who to save himself trouble had put in the estimate at random (within certain limits),—the firm having a knowledge of this fact but being of course unable to assign the two to their authors,—we should get very much such a Law of Error as is supposed above.

INDEX.

Accidents 342
Airy, G. B. 447, 484
Anticipations, tacit 287
Arbuthnott 258
Aristotle 205, 307
Average
- arithmetical 437
- geometrical 439
- median 442
- consequences of 482
- necessary results of 457
- uses of 439, 489

Babbage 343
Bags and balls 180, 411
Belief
- correctness of 125, 131, 178
- gradations of 139
- growth of 199
- language of 143
- measurement of 119, 125, 146
- quantity of 133
- test of 140, 149, 294
- undue 129
- vagueness of 127
Bentham 319, 323
Bernoulli 91, 117, 389
Bertillon 435
Births, male and female 90, 258, 263
Boat race, Oxford and Cambridge 339
Boole 183
Buckle 237
Buffon 153, 205, 352, 389
Burgersdyck 311
Butler 209, 281, 333, 366

Carlisle Tables 169
Casual, meaning of 245
Causation
- need of 237
- proof of 244
Centre of gravity 467
Certainty, in Law 324
- reasonable 327
- hypothetical 210
Chance
- and
  - Causation 244
  - Creation 258
  - Design 256
  - Genius 353
- neglect of small 363
- selections 338
Chauvenet 352
Classification, numerical scheme of 48
Coincidences 245
Combinations and Permutations 87
Communism 375, 392
Conceptualism 275
Conflict of chances 418
Consumptives, insurance of 227
Cournot 245, 255, 338
Crackanthorpe 312, 320
Craig, J. 192
Crofton, M. W. 61, 101, 104