1 As stated above, this is really little more than a re-statement, a stage further back, of the existence of the same kind of uniformity as that which we are called upon to explain in the concrete details presented to us in experience.
2 “It would seem in fact that in coarse and rude observations the errors proceed from a very few principal causes, and in consequence our hypothesis [as to the Exponential Law of Error] will probably represent the facts only imperfectly, and the frequency of the errors will only approximate roughly and vaguely to the law which follows from it. But when astronomers, not content with the degree of accuracy they had reached, prosecuted their researches into the remaining sources of error, they found that not three or four, but a great number of minor sources of error of nearly co-ordinate importance began to reveal themselves, having been till then masked and overshadowed by the graver errors which had been now approximately removed…. There were errors of graduation, and many others in the contraction of instruments; other errors of their adjustments; errors (technically so called) of observation; errors from the changes of temperature, of weather, from slight irregular motions and vibrations; in short, the thousand minute disturbing influences with which modern astronomers are familiar.” (Extracted from a paper by Mr Crofton in the Vol. of the Philosophical Transactions for 1870, p. 177.)
3 Typical Laws of Heredity; read before the Royal Institution, Feb. 9, 1877. See also Journal of the Anthrop. Inst. Nov. 1885.
§ 1. At the point which we have now reached, we are supposed to be in possession of series or groups of a certain kind, lying at the bottom, as one may say, and forming the foundation on which the Science of Probability is to be erected. We have described with sufficient particularity the characteristics of such a series, and have indicated the process by which it is, as a rule, actually brought about in nature. The next enquiries which have to be successively made are, first, how in any particular case we are to establish their existence and determine their special character and properties; and secondly,[1] when we have obtained them, in what mode are they to be employed for logical purposes?
The answer to the former enquiry does not seem difficult. Experience is our sole guide. If we want to discover what is in reality a series of things, not a series of our own conceptions, we must appeal to the things themselves to obtain it, for we cannot find much help elsewhere. We cannot tell how many persons will be born or die in a year, or how many houses will be burnt or ships wrecked, without actually counting them. When we thus speak of ‘experience’ we mean to employ the term in its widest signification; we mean experience supplemented by all the aids which inductive or deductive logic can afford. When, for instance, we have found the series which comprises the numbers of persons of any assigned class who die in successive years, we have no hesitation in extending it some way into the future as well as into the past. The justification of such a procedure must be sought in the ordinary canons of Induction. As a special discussion will be given upon the connection between Probability and Induction, no more need be said upon this subject here; but nothing will be found there at variance with the assertion just made, that the series we employ are ultimately obtained by experience only.
§ 2. In many cases it is undoubtedly true that we do not resort to direct experience at all. If I want to know what is my chance of holding ten trumps in a game of whist, I do not enquire how often such a thing has occurred before. If all the inhabitants of the globe were to divide themselves up into whist parties they would have to keep on at it for a great many years, if they wanted to settle the question satisfactorily in that way. What we do of course is to calculate algebraically the proportion of possible combinations in which ten trumps can occur, and take this as the answer to our problem. So again, if I wanted to know the chance of throwing six with a die whose faces were unequal, it would be a question if my best way would not be to calculate geometrically the solid angle subtended at the centre of gravity by the opposite face, and the ratio of this to the whole surface of a sphere would represent sufficiently closely the chance required.
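The algebraical calculation here referred to may be sketched in a few lines of Python. The sketch rests on simplifying assumptions that are not in the text: the trump suit is treated as fixed beforehand, the hand as thirteen cards taken from a full pack of fifty-two, and 'holding ten trumps' as holding exactly ten. The function name chance_of_suit_count is merely illustrative. The counting also takes for granted that every possible hand is dealt equally often.

```python
from fractions import Fraction
from math import comb


def chance_of_suit_count(k, hand_size=13, suit_size=13, deck_size=52):
    """Chance that a hand holds exactly k cards of one named suit.

    Assumes (for illustration only) that all hands are equally frequent.
    """
    # Ways to pick k cards from the suit and the rest from the other cards.
    favourable = comb(suit_size, k) * comb(deck_size - suit_size, hand_size - k)
    # All possible hands of the given size.
    possible = comb(deck_size, hand_size)
    return Fraction(favourable, possible)


ten_trumps = chance_of_suit_count(10)
print(ten_trumps, float(ten_trumps))
```

The exact fraction is kept so that the proportion of favourable combinations can be inspected directly, rather than only its decimal approximation.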
It is quite true that in such examples as the above, especially the former one, nobody would ever think of appealing to statistics. This would be a tedious process to adopt when, as here, the mechanical and other conditions upon which the production of the events depend are comparatively few, determinate, and admit of isolated consideration, whilst the enormous number of combinations which can be constructed out of them causes an enormous consequent multiplicity of ways in which the events can possibly happen. Hence, in practice, à priori determination is often easy, whilst à posteriori appeal to experience would be not merely tedious but utterly impracticable. This, combined with the frequent simplicity and attractiveness of such examples when deductively treated, has made them very popular, and produced the impression in many quarters that they are the proper typical instances to illustrate the theory of chance. Whereas, had the science been concerned with those kinds of events only which in practice are commonly made subjects of insurance, probably no other view would ever have been taken than that it was based upon direct appeal to experience.
§ 3. When, however, we look a little closer, we find that there is no occasion for such a sharp distinction as that apparently implied between the two classes of examples just indicated. In such cases as those of dice and cards, even, in which we appear to reason directly from the determining conditions, or possible variety of the events, rather than from actual observation of their occurrence, we shall find that this procedure is only valid by the help of a tacit assumption which can never be determined otherwise than by direct experience. It is, no doubt, an exceedingly natural and obvious assumption, and one which is continually deriving fresh weight from every-day observation, but it is one which ought not to be admitted without consideration. As this is a very important matter, not so much in itself as in connection with the light which it throws upon the theory of the subject, we will enter into a somewhat detailed examination of it.
Let us take a very simple example, that of tossing up a penny. Suppose that I am contemplating a succession of two throws; I can see that the only possible events are[2] HH, HT, TH, TT. So much is certain. We are moreover tolerably well convinced from experience that these events occur, in the long run, about equally often. This is of course admitted on all hands. But on the view commonly maintained, it is contended that we might have known the fact beforehand on grounds which are applicable to an indefinite number of other and more complex cases. The form in which this view would generally be advanced is, that we are enabled to state beforehand that the four throws above mentioned are equally likely. If in return we ask what is meant by the expression ‘equally likely’, it appears that there are two and only two possible forms of reply. One of these seeks the explanation in the state of mind of the observer, the other seeks it in some characteristic of the things observed.
(1) It might, for instance, be said on the one hand, that what is meant is that the four events contemplated are equally easy to imagine, or, more accurately, that our expectation or belief in their occurrence is equal. We could hardly be content with this reply, for the further enquiry would immediately be urged, On what ground is this to be believed? What are the characteristics of events of which our expectation is equal? If we consented to give an answer to this further enquiry, we should be led to the second form of reply, to be noticed directly; if we did not consent we should, it seems, be admitting that Probability was only a portion of Psychology, confined therefore to considering states of mind in themselves, rather than in their reference to facts, viz. as being true or false. We should, that is, be ceasing to make it a science of inference about things. This point will have to be gone into more thoroughly in another chapter; but it is impossible to direct attention too prominently to the fact that Logic (and therefore Probability as a branch of Logic) is not concerned with what men do believe, but with what they ought to believe, if they are to believe correctly.
(2) In the other form of reply the explanation of the phrase in question would be sought, not in a state of mind, but in a quality of the things contemplated. We might assign the following as the meaning, viz. that the events really would occur with equal frequency in the long run. The ground of this assertion would probably be found in past experience, and it would doubtless be impossible so to frame the answer as to exclude the notion of our belief altogether. But still there is a broad distinction between seeking an equality in the amount of our belief, as before, and in the frequency of occurrence of the events themselves, as here.
§ 4. When we have got as far as this it can readily be shown that an appeal to experience cannot be long evaded. For can the assertion in question (viz. that the throws of the penny will occur equally often) be safely made à priori? Those who consider that it can seem hardly to have fully faced the difficulties which meet them. For when we begin to enquire seriously whether the penny will really do what is expected of it, we find that restrictions have to be introduced. In the first place it must be an ideal coin, with its sides equal and fair. This restriction is perfectly intelligible; the study of solid geometry enables us to idealize a penny into a circular or cylindrical lamina. But this condition by itself is not sufficient; others are wanted as well. The penny was supposed to be tossed up, as we say, ‘at random.’ What is meant by this, and how is this process to be idealized? To ask this is to introduce no idle subtlety; for it would scarcely be maintained that the heads and tails would get their fair chances if, immediately before the throwing, we were so to place the coin in our hands as to start it always with the same side upwards. The difference that would result in consequence, slight as its cause is, would tend in time to show itself in the results. Or, if we persisted in starting with each of the two sides alternately upwards, would the longer repetitions of the same side get their fair chance?
Perhaps it will be replied that if we think nothing whatever about these matters all will come right of its own accord. It may, and doubtless will be so, but this is falling back upon experience. It is here, then, that we find ourselves resting on the experimental assumption above mentioned, and which indeed cannot be avoided. For suppose, lastly, that the circumstances of nature, or my bodily or mental constitution, were such that the same side always is started upwards, or indeed that they are started in any arbitrary order of our own? Well, it will be replied, it would not then be a fair trial. If we press in this way for an answer to such enquiries, we shall find that these tacit restrictions are really nothing else than a mode of securing an experimental result. They are only another way of saying, Let a series of actions be performed in such a way as to secure a sequence of a particular kind, viz., of the kind described in the previous chapters.
§ 5. An intermediate way of evading the direct appeal to experience is sometimes found by defining the probability of an event as being measured by the ratio which the number of cases favourable to the event bears to the total number of cases which are possible. This seems a somewhat loose and ambiguous way of speaking. It is clearly not enough to count the number of cases merely; they must also be valued, since it is not certain that each is equally potent in producing the effect. This, of course, would never be denied, but sufficient importance does not seem to be attached to the fact that we have really no other way of valuing them except by estimating the effects which they actually do, or would produce. Instead of thus appealing to the proportion of cases favourable to the event, it is far better (at least as regards the foundation of the science, for we are not at this moment discussing the practical method of facilitating our calculations) to appeal at once to the proportion of cases in which the event actually occurs.
§ 6. The remarks above made will apply, of course, to most of the other common examples of chance; the throwing of dice, drawing of cards, of balls from bags, &c. In the last case, for instance, one would naturally be inclined to suppose that a ball which had just been put back would thereby have a better chance of coming out again next time, since it will be more in the way for that purpose. How is this to be prevented? If we designedly thrust it to the middle or bottom of the others, we may overdo the precaution; and are in any case introducing human design, that element so essentially hostile to all that we understand by chance. If we were to trust to a good shake setting matters right, we may easily be deceived; for shaking the bag can hardly do more than diminish the disposition of those balls which were already in each other's neighbourhood, to remain so. In the consequent interaction of each upon all, the arrangement in which they start cannot but leave its impress to some extent upon their final positions. In all such cases, therefore, if we scrutinize our language, we shall find that any supposed à priori mode of stating a problem is little else than a compendious way of saying, Let means be taken for obtaining a given result. Since it is upon this result that our inferences ultimately rest, it seems simpler and more philosophical to appeal to it at once as the groundwork of our science.
§ 7. Let us again take the instance of the tossing of a penny, and examine it somewhat more minutely, to see what can be actually proved about the results we shall obtain. We are willing to give the pence fair treatment by assuming that they are perfect, that is, that in the long run they show no preference for either head or tail; the question then remains, Will the repetitions of the same face obtain the proportional shares to which they are entitled by the usual interpretations of the theory? Putting then, as before, for the sake of brevity, H for head, and HH for heads twice running, we are brought to this issue;—Given that the chance of H is 1/2, does it follow necessarily that the chance of HH (with two pence) is 1/4? To say nothing of ‘H ten times’ occurring once in 1024 times (with ten pence), need it occur at all? The mathematicians, for the most part, seem to think that this conclusion follows necessarily from first principles; to me it seems to rest upon no more certain evidence than a reasonable extension by Induction.
Taking then the possible results which can be obtained from a pair of pence, what do we find? Four different results may follow, namely, (1) HT, (2) HH, (3) TH, (4) TT. If it can be proved that these four are equally probable, that is, occur equally often, the commonly accepted conclusions will follow, for a precisely similar argument would apply to all the larger numbers.
§ 8. The proof usually advanced makes use of what is called the Principle of Sufficient Reason. It takes this form;—Here are four kinds of throws which may happen; once admit that the separate elements of them, namely, H and T, happen equally often, and it will follow that the above combinations will also happen equally often, for no reason can be given in favour of one of them that would not equally hold in favour of the others.
To a certain extent we must admit the validity of the principle for the purpose. In the case of the throws given above, it would be valid to prove the equal frequency of (1) and (3) and also of (2) and (4); for there is no difference existing between these pairs except what is introduced by our own notation.[3] TH is the same as HT, except in the order of the occurrence of the symbols H and T, which we do not take into account. But either of the pair (1) and (3) is different from either of the pair (2) and (4). Transpose the notation, and there would still remain here a distinction which the mind can recognize. A succession of the same thing twice running is distinguished from the conjunction of two different things, by a distinction which does not depend upon our arbitrary notation only, and would remain entirely unaltered by a change in this notation. The principle therefore of Sufficient Reason, if admitted, would only prove that doublets of the two kinds, for example (2) and (4), occur equally often, but it would not prove that they must each occur once in four times. It cannot be proved indeed in this way that they need ever occur at all.
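That the equal frequency of H and T does not by itself fix the frequency of the doublets may be shown by two artificial series, sketched here in Python. Both series are invented for the purpose and are not drawn from any experiment; the names face_frequency and pair_frequencies are illustrative only, and the pairs are taken as successive, non-overlapping couples of throws.

```python
from collections import Counter
from fractions import Fraction


def face_frequency(series, face="H"):
    """Proportion of single throws in the series showing the given face."""
    return Fraction(series.count(face), len(series))


def pair_frequencies(series):
    """Proportion of each kind of successive, non-overlapping pair of throws."""
    pairs = [series[i:i + 2] for i in range(0, len(series) - 1, 2)]
    counts = Counter(pairs)
    return {kind: Fraction(counts[kind], len(pairs)) for kind in ("HH", "HT", "TH", "TT")}


# Two invented series, each with heads exactly half the time.
alternating = "HT" * 500   # H T H T ...
doubled = "HHTT" * 250     # H H T T H H T T ...

for name, series in (("alternating", alternating), ("doubled", doubled)):
    print(name, face_frequency(series), pair_frequencies(series))
```

In the first series HH never occurs at all; in the second HH and TT occur equally often, as the Principle of Sufficient Reason would require, yet each occurs once in two times rather than once in four.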
§ 9. The formula, then, not being demonstrable à priori, (as might have been concluded,) can it be obtained by experience? To a certain extent it can; the present experience of mankind in pence and dice seems to show that the smaller successions of throws do really occur in about the proportions assigned by the theory. But how nearly they do so no one can say, for the amount of time and trouble to be expended before we could feel that we have verified the fact, even for small numbers, is very great, whilst for large numbers it would be simply intolerable. The experiment of throwing often enough to obtain ‘heads ten times’ has been actually performed by two or three persons, and the results are given by De Morgan, and Jevons.[4] This, however, being only sufficient on the average to give ‘heads ten times’ a single chance, the evidence is very slight; it would take a considerable number of such experiments to set the matter nearly at rest.
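How slight such evidence is may be seen by computing, on the usual theory itself, how often ‘heads ten times’ would appear in a single experiment of this size. The following Python sketch assumes 1024 sets of ten throws, that being the number which gives the run one chance on the average; the figure is an assumption for illustration, not a record of the experiments cited, and the name chance_of_count is hypothetical. The calculation is purely algebraical and presupposes the very formula under discussion.

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 2) ** 10   # chance of 'heads ten times' on the usual theory
trials = 1024              # assumed number of sets of ten throws


def chance_of_count(k, n=trials, p=p):
    """Chance of exactly k runs of ten heads in n sets, by the binomial formula."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


for k in range(4):
    print(k, float(chance_of_count(k)))
```

On these assumptions the chance of the run not appearing at all is a little over one in three, so that a single experiment in which it appears once, or never, can hardly distinguish the formula from a considerable departure from it.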
Any such rule, then, as that which we have just been discussing, which professes to describe what will take place in a long succession of throws, is only conclusively proved by experience within very narrow limits, that is, for small repetitions of the same face; within limits less narrow, indeed, we feel assured that the rule cannot be flagrantly in error, otherwise the variation would be almost sure to be detected. From this we feel strongly inclined to infer that the same law will hold throughout. In other words, we are inclined to extend the rule by Induction and Analogy. Still there are so many instances in nature of proposed laws which hold within narrow limits but get egregiously astray when we attempt to push them to great lengths, that we must give at best but a qualified assent to the truth of the formula.
§ 10. The object of the above reasoning is simply to show that we cannot be certain that the rule is true. Let us now turn for a minute to consider the causes by which the succession of heads and tails is produced, and we may perhaps see reasons to make us still more doubtful.
It has been already pointed out that in calculating probabilities à priori, as it is called, we are only able to do so by introducing restrictions and suppositions which are in reality equivalent to assuming the expected results. We use words which in strictness mean, Let a given process be performed; but an analysis of our language, and an examination of various tacit suppositions which make themselves felt the moment they are not complied with, soon show that our real meaning is, Let a series of a given kind be obtained; it is to this series only, and not to the conditions of its production, that all our subsequent calculations properly apply. The physical process being performed, we want to know whether anything resembling the contemplated series really will be obtained.
Now if the penny were invariably set the same side uppermost, and thrown with the same velocity of rotation and to the same height, &c.—in a word, subjected to the same conditions,—it would always come down with the same side uppermost. Practically, we know that nothing of this kind occurs, for the individual variations in the results of the throws are endless. Still there will be an average of these conditions, about which the throws will be found, as it were, to cluster much more thickly than elsewhere. We should be inclined therefore to infer that if the same side were always set uppermost there would really be a departure from the sort of series which we ordinarily expect. In a very large number of throws we should probably begin to find, under such circumstances, that either head or tail was having a preference shown to it. If so, would not similar effects be found to be connected with the way in which we started each successive pair of throws? According as we chose to make a practice of putting HH or TT uppermost, might there not be a disturbance in the proportion of successions of two heads or two tails? Following out this train of reasoning, it would seem to point with some likelihood to the conclusion that in order to obtain a series of the kind we expect, we should have to dispose the antecedents in a similar series at the start. The changes and chances produced by the act of throwing might introduce infinite individual variations, and yet there might be found, in the very long run, to be a close similarity between these two series.
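The drift of this reasoning may be illustrated by a toy model in Python, which is wholly an assumption of the sketch and no part of the text. The penny is supposed always to start the same side uppermost and to make a whole number of half-turns before landing; the number of half-turns is supposed to fall, with equal frequency, anywhere within a given spread about an average of 40. The function name and all the figures are hypothetical.

```python
from fractions import Fraction


def chance_same_side_up(mean_half_turns, spread):
    """Chance the coin lands starting-side up in the toy model.

    The count of half-turns is assumed uniform over the integers from
    mean_half_turns - spread to mean_half_turns + spread inclusive.
    An even number of half-turns restores the starting side.
    """
    turns = range(mean_half_turns - spread, mean_half_turns + spread + 1)
    even = sum(1 for n in turns if n % 2 == 0)
    return Fraction(even, len(turns))


for spread in (0, 1, 5, 20):
    print(spread, chance_same_side_up(40, spread))
```

In this model the preference for one side grows fainter as the throws are less closely clustered about their average, but it never quite vanishes; whether real throws behave in any such way is a question for experience.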
§ 11. This is, to a certain extent, only shifting the difficulty, I admit; for the claim formerly advanced about the possibility of proving the proportions of the throws in the former series, will probably now be repeated in favour of those in the latter. Still the question is very much narrowed, for we have reduced it to a series of voluntary acts. A man may put whatever side he pleases uppermost. He may act consciously, as I have said, or he may think nothing whatever about the matter, that is, throw at random; if so, it will probably be asserted by many that he will involuntarily produce a series of the kind in question. It may be so, or it may not; it does not seem that there are any easily accessible data by which to decide. All that I am concerned with here is to show the likelihood that the commonly received result does in reality depend upon the fulfilment of a certain condition at the outset, a condition which it is certainly optional with any one to fulfil or not as he pleases. The short successions doubtless will take care of themselves, owing to the infinite complications produced by the casual variations in throwing; but the long ones may suffer, unless their interest be consciously or unconsciously regarded at the outset.
§ 12. The advice, ‘Only try long enough, and you will sooner or later get any result that is possible,’ is plausible, but it rests only on Induction and Analogy; mathematics do not prove it. As has been repeatedly stated, there are two distinct views of the subject. Either we may, on the one hand, take a series of symbols, call them heads and tails; H, T, &c.; and make the assumption that each of these, and each pair of them, and so on, will occur in the long run with a regulated degree of frequency. We may then calculate their various combinations, and the consequences that may be drawn from the data assumed. This is a purely algebraical process; it is infallible; and there is no limit whatever to the extent to which it may be carried. This way of looking at the matter may be, and undoubtedly should be, nothing more than the counterpart of what I have called the substituted or idealized series which generally has to be introduced as the basis of our calculation. The danger to be guarded against is that of regarding it too purely as an algebraical conception, and thence of sinking into the very natural errors both of too readily evolving it out of our own consciousness, and too freely pushing it to unwarranted lengths.
Or on the other hand, we may consider that we are treating of the behaviour of things;—balls, dice, births, deaths, &c.; and drawing inferences about them. But, then, what were in the former instance allowable assumptions, become here propositions to be tested by experience. Now the whole theory of Probability as a practical science, in fact as anything more than an algebraical truth, depends of course upon there being a close correspondence between these two views of the subject, in other words, upon our substituted series being kept in accordance with the actual series. Experience abundantly proves that, between considerable limits, in the example in question, there does exist such a correspondence. But let no one attempt to enforce our assent to every remote deduction that mathematicians can draw from their formulæ. When this is attempted the distinction just traced becomes prominent and important, and we have to choose our side. Either we go over to the mathematics, and so lose all right of discussion about the things; or else we take part with the things, and so defy the mathematics. We do not question the formal accuracy of the latter within their own province, but either we dismiss them as somewhat irrelevant, as applying to data of whose correctness we cannot be certain, or we take the liberty of remodelling them so as to bring them into accordance with facts.
§ 13. A critic of any doctrine can hardly be considered to have done much more than half his duty when he has explained and justified his grounds for objecting to it. It still remains for him to indicate, if only in a few words, what he considers its legitimate functions and position to be, for it can seldom happen that he regards it as absolutely worthless or unmeaning. I should say, then, that when Probability is thus divorced from direct reference to objects, as it substantially is by not being founded upon experience, it simply resolves itself into the common algebraical or arithmetical doctrine of Permutations and Combinations.[5] The considerations upon which these depend are purely formal and necessary, and can be fully reasoned out without any appeal to experience. We there start from pure considerations of number or magnitude, and we terminate with them, having only arithmetical calculations to connect them together. I wish, for instance, to find the chance of throwing heads three times running with a penny. All I have to do is first to ascertain the possible number of throws. Permutations tell me that with two things thus in question (viz. head and tail) and three times to perform the process, there are eight possible forms of the result. Of these eight one only being favourable, the chance in question is pronounced to be one-eighth.
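The purely formal character of this calculation is plain when it is written out, as in the following Python sketch. The symbols H and T are mere strings; the division at the end silently assumes that the eight forms occur equally often, which is the assumption that the text holds to rest upon experience.

```python
from fractions import Fraction
from itertools import product

# Every possible form of three throws with two faces.
forms = ["".join(throws) for throws in product("HT", repeat=3)]

# The single form favourable to 'heads three times running'.
favourable = [form for form in forms if form == "HHH"]

chance = Fraction(len(favourable), len(forms))
print(forms)   # ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'THT', 'TTH', 'TTT']
print(chance)  # 1/8
```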
Now though it is quite true that the actual calculation of every chance problem must be of the above character, viz. an algebraical or arithmetical process, yet there is, it seems to me, a broad and important distinction between a material science which employs mathematics, and a formal one which consists of nothing but mathematics. When we cut ourselves off from the necessity of any appeal to experience, we are retaining only the intermediate or calculating part of the investigation; we may talk of dice, or pence, or cards, but these are really only names we choose to give to our symbols. The H's and T's with which we deal have no bearing on objective occurrences, but are just like the x's and y's with which the rest of algebra deals. Probability in fact, when so treated, seems to be absolutely nothing else than a system of applied Permutations and Combinations.
It will now readily be seen how narrow is the range of cases to which any purely deductive method of treatment can apply. It is almost entirely confined to such employments as games of chance, and, as already pointed out, can only be regarded as really trustworthy even there, by the help of various tacit restrictions. This alone would be conclusive against the theory of the subject being rested upon such a basis. The experimental method, on the other hand, is, in the same theoretical sense, of universal application. It would include the ordinary problems furnished by games of chance, as well as those where the dice are loaded and the pence are not perfect, and also the indefinitely numerous applications of statistics to the various kinds of social phenomena.
§ 14. The particular view of the deductive character of Probability above discussed, could scarcely have intruded itself into any other examples than those of the nature of games of chance, in which the conditions of occurrence are by comparison few and simple, and are amenable to accurate numerical determination. But a doctrine, which is in reality little else than the same theory in a slightly disguised form, is very prevalent, and has been applied to truths of the most purely empirical character. This doctrine will be best introduced by a quotation from Laplace. After speaking of the irregularity and uncertainty of nature as it appears at first sight, he goes on to remark that when we look closer we begin to detect “a striking regularity which seems to suggest a design, and which some have considered a proof of Providence. But, on reflection, it is soon perceived that this regularity is nothing but the development of the respective probabilities of the simple events, which ought to occur more frequently according as they are more probable.”[6]
If this remark had been made about the succession of heads and tails in the throwing up of a penny, it would have been intelligible. It would simply mean this: that the constitution of the body was such that we could anticipate with some confidence what the result would be when it was treated in a certain way, and that experience would justify our anticipation in the long run. But applied as it is in a more general form to the facts of nature, it seems really to have but little meaning in it. Let us test it by an instance. Amidst the irregularity of individual births, we find that the male children are to the female, in the long run, in about the proportion of 106 to 100. Now if we were told that there is nothing in this but “the development of their respective probabilities,” would there be anything in such a statement but a somewhat pretentious re-statement of the fact already asserted? The probability is nothing but that proportion, and is unquestionably in this case derived from no other source but the statistics themselves; in the above remark the attempt seems to be made to invert this process, and to derive the sequence of events from the mere numerical statement of the proportions in which they occur.
§ 15. It will very likely be replied that by the probability above mentioned is meant, not the mere numerical proportion between the births, but some fact in our constitution upon which this proportion depends; that just as there was a relation of equality between the two sides of the penny, which produced the ultimate equality in the number of heads and tails, so there may be something in our constitution or circumstances in the proportion of 106 to 100, which produces the observed statistical result. When this something, whatever it might be, was discovered, the observed numbers might be supposed capable of being determined beforehand. Even if this were the case, however, it must not be forgotten that there could hardly fail to be, in combination with such causes, other concurrent conditions in order to produce the ultimate result; just as besides the shape of the penny, we had also to take into account the nature of the ‘randomness’ with which it was tossed. What these may be, no one at present can undertake to say, for the best physiologists seem indisposed to hazard even a guess upon the subject.[7] But without going into particulars, one may assert with some confidence that these conditions cannot well be altogether independent of the health, circumstances, manners and customs, &c. (to express oneself in the vaguest way) of the parents; and if once these influencing elements are introduced, even as very minute factors, the results cease to be dependent only on fixed and permanent conditions. We are at once letting in other conditions, which, if they also possess the characteristics that distinguish Probability (an exceedingly questionable assumption), must have that fact specially proved about them.
That this should be the case indeed seems not merely questionable, but almost certainly impossible; for these conditions partaking of the nature of what we term generally, Progress and Civilization, cannot be expected to show any permanent disposition to hover about an average.
§ 16. The reader who is familiar with Probability is of course acquainted with the celebrated theorem of James Bernoulli. This theorem, of which the examples just adduced are merely particular cases, is generally expressed somewhat as follows:—in the long run all events will tend to occur with a relative frequency proportional to their objective probabilities. With the mathematical proof of this theorem we need not trouble ourselves, as it lies outside the province of this work; but indeed if there is any value in the foregoing criticism, the basis on which the mathematics rest is faulty, owing to there being really nothing which we can with propriety call an objective probability.
If one might judge by the interpretation and uses to which this theorem is sometimes exposed, we should regard it as one of the last remaining relics of Realism, which after being banished elsewhere still manages to linger in the remote province of Probability. It would be an illustration of the inveterate tendency to objectify our conceptions, even in cases where the conceptions had no right to exist at all. A uniformity is observed; sometimes, as in games of chance, it is found to be so connected with the physical constitution of the bodies employed as to be capable of being inferred beforehand; though even here the connection is by no means so necessary as is commonly supposed, owing to the fact that in addition to these bodies themselves we have also to take into account their relation to the agencies which influence them. This constitution is then converted into an ‘objective probability’, supposed to develop into the sequence which exhibits the uniformity. Finally, this very questionable objective probability is assumed to exist, with the same faculty of development, in all the cases in which uniformity is observed, however little resemblance there may be between these and games of chance.
§ 17. How utterly inappropriate any such conception is in most of the cases in which we find statistical uniformity, will be obvious on a moment's consideration. The observed phenomena are generally the product, in these cases, of very numerous and complicated antecedents. The number of crimes, for instance, annually committed in any society, is a function amongst other things, of the strictness of the law, the morality of the people, their social condition, and the vigilance of the police, each of these elements being in itself almost infinitely complex. Now, as a result of all these agencies, there is some degree of uniformity; but what has been called above the change of type, which it sooner or later tends to display, is unmistakeable. The average annual numbers do not show a steady gradual approach towards what might be considered in some sense a limiting value, but, on the contrary, fluctuate in a way which, however it may depend upon causes, shows none of the permanent uniformity which is characteristic of games of chance. This fact, combined with the obvious arbitrariness of singling out, from amongst the many and various antecedents which produced the observed regularity, a few only, which should constitute the objective probability (if we took all, the events being absolutely determined, there would be no occasion for an appeal to probability in the case), would have been sufficient to prevent any one from assuming the existence of any such thing, unless the mistaken analogy of other cases had predisposed him to seek for it.
There is a familiar practical form of the same error, the tendency to which may not improbably be derived from a similar theoretical source. It is that of continuing to accumulate our statistical data to an excessive extent. If the type were absolutely fixed we could not possibly have too many statistics; the longer we chose to take the trouble of collecting them the more accurate our results would be. But if the type is changing, in other words, if some of the principal causes which aid in their production have, in regard to their present degree of intensity, strict limits of time or space, we shall do harm rather than good if we overstep these limits. The danger of stopping too soon is easily seen, but in avoiding it we must not fall into the opposite error of going on too long, and so getting either gradually or suddenly under the influence of a changed set of circumstances.
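The risk of going on too long may be made concrete by a small numerical sketch. The figures are purely illustrative assumptions, not statistics drawn from any source: a yearly proportion is supposed to drift steadily from 0.50 to 0.60 over a hundred years, and the mean of the whole record is compared with the mean of the last ten years as an estimate of the present value. Chance fluctuation is deliberately left out, so that the effect of the changing type is seen by itself.

```python
def drifting_proportions(start, end, years):
    """Hypothetical yearly proportions drifting linearly from start to end."""
    step = (end - start) / (years - 1)
    return [start + step * i for i in range(years)]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def compare_estimates(series, window):
    """Return (whole-record mean, recent-window mean, latest value)."""
    return mean(series), mean(series[-window:]), series[-1]


if __name__ == "__main__":
    # Assumed figures: 0.50 drifting to 0.60 over 100 years, 10-year window.
    series = drifting_proportions(0.50, 0.60, 100)
    whole, recent, latest = compare_estimates(series, 10)
    print(f"mean of whole record : {whole:.4f}")
    print(f"mean of last 10 years: {recent:.4f}")
    print(f"present value        : {latest:.4f}")
```

On these assumptions the longer record gives the worse estimate of the present state of things, since it averages over circumstances which no longer hold.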
§ 18. This chapter was intended to be devoted to a consideration, not of the processes by which nature produces the series with which we are concerned, but of the theoretic basis of the methods by which we can determine the existence of such series. But it is not possible to keep the two enquiries apart, for here, at any rate, the old maxim prevails that to know a thing we must know its causes. Recur for a minute to the considerations of the last chapter. We there saw that there was a large class of events, the conditions of production of which could be said to consist of (1) a comparatively few nearly unchangeable elements, and (2) a vast number of independent and very changeable elements. At least if there were any other elements besides these, we are assumed either to make special allowance for them, or to omit them from our enquiry. Now in certain cases, such as games of chance, the unchangeable elements may without practical error be regarded as really unchangeable throughout any range of time and space. Hence, as a result, the deductive method of treatment becomes in their case at once the most simple, natural, and conclusive; but, as a further consequence, the statistics of the events, if we choose to appeal to them, may be collected ad libitum with better and better approximation to truth. On the other hand, in all social applications of Probability, the unchangeable causes can only be regarded as really unchangeable under many qualifications. We know little or nothing of them directly; they are often in reality numerous, indeterminate, and fluctuating; and it is only under the guarantee of stringent restrictions of time and place, that we can with any safety attribute to them sufficient fixity to justify our theory. Hence, as a result, the deductive method, under whatever name it may go, becomes totally inapplicable both in theory and practice; and, as a further consequence, the appeal to statistics has to be made with the caution in mind that we shall do mischief rather than good if we go on collecting too many of them.
§ 19. The results of the last two chapters may be summed up as follows: We have extended the conception of a series obtained in the first chapter; for we have found that these series are mostly presented to us in groups. These groups are found upon examination to be formed upon approximately the same type throughout a very wide and varied range of experience; the causes of this agreement we discussed and explained in some detail. When, however, we extend our examination by supposing the series to run to a very great length, we find that they may be divided into two classes separated by important distinctions. In one of these classes (that containing the results of games of chance) the conditions of production, and consequently the laws of statistical occurrence, may be practically regarded as absolutely fixed; and the extent of the divergences from the mean seems to know no finite limit. In the other class, on the contrary (containing the bulk of ordinary statistical enquiries), the conditions of production vary with more or less rapidity, and so in consequence do the results. Moreover it is often impossible that variations from the mean should exceed a certain amount. The former we may term ideal series. It is they alone which show the requisite characteristics with any close approach to accuracy, and to make the theory of the subject tenable, we have really to substitute one of this kind for one of the less perfect ones of the other class, when these latter are under treatment. The former class have, however, been too exclusively considered by writers on the subject; and conceptions appropriate only to them, and not always even to them, have been imported into the other class. It is in this way that a general tendency to an excessive deductive or à priori treatment of the science has been encouraged.
1 This latter enquiry belongs to what may be termed the more purely logical part of this volume, and is entered on in the course of Chapter VI.
2 For the use of those not acquainted with the common notation employed in this subject, it may be remarked that HH is simply an abbreviated way of saying that the two successive throws of the penny give head; HT that the first of them gives head, and the second tail; and so on with the remaining symbols.
3 I am endeavouring to treat this rule of Sufficient Reason in a way that shall be legitimate in the opinion of those who accept it, but there seem very great doubts whether a contradiction is not involved when we attempt to extract results from it. If the sides are absolutely alike, how can there be any difference between the terms of the series? The succession seems then reduced to a dull uniformity, a mere iteration of the same thing many times; the series we contemplated has disappeared. If the sides are not absolutely alike, what becomes of the applicability of the rule?
4 Formal Logic, p. 185. Principles of Science, p. 208.
5 The close connection between these subjects is well indicated in the title of Mr Whitworth's treatise, Choice and Chance.
6 Essai Philosophique. Ed. 1825, p. 74.
7 An opinion prevailed rather at one time (quoted and supported by Quetelet amongst others) that the relative ages of the parents had something to do with the sex of the offspring. If this were so, it would quite bear out the above remarks. As a matter of fact, it should be observed, that the proportion of 106 to 100 does not seem by any means universal in all countries or at all times. For various statistical tables on the subject see Quetelet, Physique Sociale, Vol. I. 166, 173, 238.
§ 1. There is a term of frequent occurrence in treatises on Probability, and which we have already had repeated occasion to employ, viz. the designation random applied to an event, as in the expression ‘a random distribution’. The scientific conception involved in the correct use of this term is, I apprehend, nothing more than that of aggregate order and individual irregularity (or apparent irregularity), which has been already described in the preceding chapters. A brief discussion of the requisites in this scientific conception, and in particular of the nature and some of the reasons for the departure from the popular conception, may serve to clear up some of the principal remaining difficulties which attend this part of our subject.
The original,[1] and still popular, signification of the term is of course widely different from the scientific. What it looks to is the origin, not the results, of the random performance, and it has reference rather to the single action than to a group or series of actions. Thus, when a man draws a bow ‘at a venture’, or ‘at random’, we mean only to point out the aimless character of the performance; we are contrasting it with the definite intention to hit a certain mark. But it is none the less true, as already pointed out, that we can only apply processes of inference to such performances as these when we regard them as being capable of frequent, or rather of indefinitely extended repetition.
Begin with an illustration. Perhaps the best typical example that we can give of the scientific meaning of random distribution is afforded by the arrangement of the drops of rain in a shower. No one can give a guess whereabouts at any instant a drop will fall, but we know that if we put out a sheet of paper it will gradually become uniformly spotted over; and that if we were to mark out any two equal areas on the paper these would gradually tend to be struck equally often.
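The behaviour of the two equal areas may be imitated by a short simulation. It is only a sketch: the uniform scattering is here assumed outright (it is supplied by a pseudo-random generator), so the code illustrates what the definition implies and proves nothing about actual rain. The function name, the patches chosen, and the seed are all arbitrary assumptions.

```python
import random


def count_hits(n_drops, patch_a, patch_b, seed=0):
    """Scatter n_drops points uniformly on the unit square and count how
    many fall in each of two rectangular patches given as (x0, y0, x1, y1)."""
    rng = random.Random(seed)
    hits_a = hits_b = 0
    for _ in range(n_drops):
        x, y = rng.random(), rng.random()
        if patch_a[0] <= x < patch_a[2] and patch_a[1] <= y < patch_a[3]:
            hits_a += 1
        if patch_b[0] <= x < patch_b[2] and patch_b[1] <= y < patch_b[3]:
            hits_b += 1
    return hits_a, hits_b


if __name__ == "__main__":
    # Two patches of equal area (0.2 x 0.2) in different parts of the sheet.
    patch_a = (0.1, 0.1, 0.3, 0.3)
    patch_b = (0.6, 0.5, 0.8, 0.7)
    for n in (100, 10_000, 1_000_000):
        a, b = count_hits(n, patch_a, patch_b)
        print(n, a, b)
```

With few drops the two counts may differ widely in proportion; as the number grows, their ratio settles towards unity, though no single drop becomes any more predictable.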
§ 2. I. Any attempt to draw inferences from the assumption of random arrangement must postulate the occurrence of this particular state of things at some stage or other. But there is often considerable difficulty, leading occasionally to some arbitrariness, in deciding the particular stage at which it ought to be introduced.
(1) Thus, in many of the problems discussed by mathematicians, we look as entirely to the results obtained, and think as little of the actual process by which they are obtained, as when we are regarding the arrangement of the drops of rain. A simple example of this kind would be the following. A pawn, diameter of base one inch, is placed at random on a chess-board, the side of the squares of which is one inch and a quarter: find the chance that its base shall lie across one of the intersecting lines. Here we may imagine the pawns to be so to say rained down vertically upon the board, and the question is to find the ultimate proportion of those which meet a boundary line to the total of those which fall. The problem therefore becomes a merely geometrical one, viz. to determine the ratio of a certain area on the board to the whole area. The determination of this ratio is all that the mathematician ever takes into account.
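The ratio of areas may be written out, on the assumptions that each square measures one inch and a quarter along its side and that the centre of the pawn is uniformly distributed over the square in which it falls. The base, of radius half an inch, clears every line only when its centre lies at least half an inch from each side, that is, within a central square of side a quarter of an inch.

```latex
\[
P(\text{clear of every line})
  = \left(\frac{1.25 - 1}{1.25}\right)^{2}
  = \frac{1}{25},
\qquad
P(\text{across a line})
  = 1 - \frac{1}{25}
  = \frac{24}{25}.
\]
```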
Now take the following. A straight brittle rod is broken at random in two places: find the chance that the pieces can make a triangle.[2] Since the only condition for making a triangle with three straight lines is that each two shall be greater than the third, the problem seems to involve the same general conception as in the former case. We must conceive such rods breaking at one pair of spots after another,—no one can tell precisely where,—but showing the same ultimate tendency to distribute these spots throughout the whole length uniformly. As in the last case, the mathematician thinks of nothing but this final result, and pays no heed to the process by which it may be brought about. Accordingly the problem is again reduced to one of mensuration, though of a somewhat more complicated character.
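The mensuration may be checked by a simulation. This is a minimal sketch under one particular reading of the problem, namely that the two points of fracture are taken independently and uniformly along the whole length of the rod; the function names and the seed are arbitrary. On that reading the known exact value of the chance is 1/4.

```python
import random


def can_form_triangle(a, b, c):
    """Each two of the three pieces together must exceed the third."""
    return a + b > c and a + c > b and b + c > a


def estimate_rod_triangle(trials, seed=0):
    """Estimate the chance that a rod of unit length, broken at two points
    chosen independently and uniformly, yields pieces forming a triangle."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        u, v = sorted((rng.random(), rng.random()))
        if can_form_triangle(u, v - u, 1.0 - v):
            successes += 1
    return successes / trials


if __name__ == "__main__":
    print(estimate_rod_triangle(200_000))  # should lie close to 0.25
```

A different account of how the rod comes to be broken (for instance, breaking it once and then breaking one of the pieces) would place the uniformity at a different stage, and would not in general give the same value.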
§ 3. (2) In another class of cases we have to contemplate an intermediate process rather than a final result; but the same conception has to be introduced here, though it is now applied to the former stage, and in consequence will not in general apply to the latter.
For instance: a shot is fired at random from a gun whose maximum range (i.e. at 45° elevation) is 3000 yards: what is the chance that the actual range shall exceed 2000 yards? The ultimately uniform (or random) distribution here is commonly assumed to apply to the various directions in which the gun can be pointed; all possible directions above the horizontal being equally represented in the long run. We have therefore to contemplate a surface of uniform distribution, but it will be the surface, not of the ground, but of a hemisphere whose centre is occupied by the man who fires. The ultimate distribution of the bullets on the spots where they strike the ground will not be uniform. The problem is in fact to discover the law of variation of the density of distribution.
The above is, I presume, the treatment generally adopted in solving such a problem. But there seems no absolute necessity for any such particular choice. It is surely open to any one to maintain[3] that his conception of the randomness of the firing is assigned by saying that it is likely that a man should begin by facing towards any point of the compass indifferently, and then proceed to raise his gun to any angle indifferently. The stage of ultimately uniform distribution here has receded a step further back. It is not assigned directly to the surface of an imaginary hemisphere, but to the lines of altitude and azimuth drawn on that surface. Accordingly, the distribution over the hemisphere itself will not now be uniform,—there will be a comparative crowding up towards the pole,—and the ultimate distribution over the ground will not be the same as before.
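The difference between the two suppositions can be exhibited numerically. The sketch below rests on assumptions that are not in the text: the usual idealised trajectory without air resistance, for which the range is the maximum range multiplied by the sine of twice the elevation, and function names chosen merely for illustration. It computes the chance of exceeding 2000 yards first with directions uniform over the hemisphere, and then with azimuth and elevation each uniform.

```python
import math


def elevation_limits(threshold, max_range):
    """Elevations (radians) between which the range exceeds threshold,
    assuming range = max_range * sin(2 * elevation)."""
    low = 0.5 * math.asin(threshold / max_range)
    return low, math.pi / 2 - low


def chance_uniform_on_hemisphere(threshold, max_range):
    """Directions uniform over the hemisphere: the density of the
    elevation e is cos(e), so the chance is sin(high) - sin(low)."""
    low, high = elevation_limits(threshold, max_range)
    return math.sin(high) - math.sin(low)


def chance_uniform_in_angle(threshold, max_range):
    """Azimuth and elevation each uniform: the density of the elevation
    is constant over the quarter turn from horizontal to vertical."""
    low, high = elevation_limits(threshold, max_range)
    return (high - low) / (math.pi / 2)


if __name__ == "__main__":
    print(chance_uniform_on_hemisphere(2000, 3000))  # about 0.577
    print(chance_uniform_in_angle(2000, 3000))       # about 0.535
```

The azimuth does not affect the range, so only the elevation enters; the two answers differ because the second supposition crowds the directions towards the pole, where the range is short.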
§ 4. Difficulties of this kind, arising out of the uncertainty as to what stage should be selected for that of uniform distribution, will occasionally present themselves. For instance: let a book be taken at random out of a bookcase; what is the chance of hitting upon some assigned volume? I hardly know how this question would commonly be treated. If we were to set our man opposite the middle of the shelf and inquire what would generally happen in practice, supposing him blindfolded, there cannot be much doubt that the volumes would not be selected equally often. On the contrary, it is likely that there would be a tendency to increased frequency about a centre indicated by the height of his shoulder, and (unless he be left-handed) a trifle to the right of the point exactly opposite his starting point.
If the question were one which it were really worth while to work out on these lines we should be led a long way back. Just as we imagined our rifleman's position (on the second supposition) to be determined by two independent coordinates of assumed continuous and equal facility, so we might conceive our making the attempt to analyse the man's movements into a certain number of independent constituents. We might suppose all the various directions from his starting point, along the ground, to be equally likely; and that when he reaches the shelves the random motion of his hand is to be regulated after the fashion of a shot discharged at random.
The above would be one way of setting about the statement of the problem. But the reader will understand that all which I am here proposing to maintain is that in these, as in every similar case, we always encounter, under this conception of ‘randomness’, at some stage or other, this postulate of ultimate uniformity of distribution over some assigned magnitude: either time; or space, linear, superficial, or solid. But the selection of the stage at which this is to be applied may give rise to considerable difficulty, and even arbitrariness of choice.
§ 5. Some years ago there was a very interesting discussion upon this subject carried on in the mathematical part of the Educational Times (see, especially, Vol. VII.). As not unfrequently happens in mathematics there was an almost entire accord amongst the various writers as to the assumptions practically to be made in any particular case, and therefore as to the conclusion to be drawn, combined with a very considerable amount of difference as to the axioms and definitions to be employed. Thus Mr M. W. Crofton, with the substantial agreement of Mr Woolhouse, laid it down unhesitatingly that “at random” has “a very clear and definite meaning; one which cannot be better conveyed than by Mr Wilson's definition, ‘according to no law’; and in this sense alone I mean to use it.” According to any scientific interpretation of ‘law’ I should have said that where there was no law there could be no inference. But ultimate tendency towards equality of distribution is as much taken for granted by Mr Crofton as by any one else: in fact he makes this a deduction from his definition:—“As this infinite system of parallels are drawn according to no law, they are as thickly disposed along any part of the [common] perpendicular as along any other” (VII. p. 85). Mr Crofton holds that any kind of unequal distribution would imply law,—“If the points [on a plane] tended to become denser in any part of the plane than in another, there must be some law attracting them there” (ib. p. 84). The same view is enforced in his paper on Local Probability (in the Phil. Trans., Vol. 158). Surely if they tend to become equally dense this is just as much a case of regularity or law.
It may be remarked that wherever any serious practical consequences turn upon duly securing the desired randomness, it is always so contrived that no design or awkwardness or unconscious one-sidedness shall disturb the result. The principal case in point here is of course afforded by games of chance. What we want, when we toss a die, is to secure that all numbers from 1 to 6 shall be equally often represented in the long run, but that no person shall be able to predict the individual occurrence. We might, in our statement of a problem, as easily postulate ‘a number thought of at random’ as ‘a shot fired at random’, but no one would risk his chances of gain and loss on the supposition that this would be done with continued fairness. Accordingly, we construct a die whose sides are accurately alike, and it is found that we may do almost what we like with this, at any previous stage to that of its issue from the dice box on to the table, without interfering with the random nature of the result.
§ 6. II. Another characteristic in which the scientific conception seems to me to depart from the popular or original signification is the following. The area of distribution which we take into account must be a finite or limited one. The necessity for this restriction may not be obvious at first sight, but the consideration of one or two examples will serve to indicate the point at which it makes itself felt. Suppose that one were asked to choose a number at random, not from a finite range, but from the inexhaustible possibilities of enumeration. In the popular sense of the term,—i.e. of uttering a number without pausing to choose,—there is no difficulty. But a moment's consideration will show that no arrangement even tending towards ultimately uniform distribution can be secured in this way. No average could be struck with ever increasing steadiness. So with spatial infinity. We can rationally speak of choosing a point at random in a given straight line, area, or volume. But if we suppose the line to have no end, or the selection to be made in infinite space, the basis of ultimate tendency towards what may be called the equally thick deposit of our random points fails us utterly.
Similarly in any other example in which one of the magnitudes is unlimited. Suppose I fling a stick at random in a horizontal plane against a row of iron railings and inquire for the chance of its passing through without touching them. The problem bears some analogy to that of the chessmen, and so far as the motion of translation of the stick is concerned (if we begin with this) it presents no difficulty. But as regards the rotation it is otherwise. For any assigned linear velocity there is a certain angular velocity below which the stick may pass through without contact, but above which it cannot. And inasmuch as the former range is limited and the latter is unlimited, we encounter the same impossibility as before in endeavouring to conceive a uniform distribution. Of course we might evade this particular difficulty by beginning with an estimate of the angular velocity, when we should have to repeat what has just been said, mutatis mutandis, in reference to the linear velocity.
§ 7. I am of course aware that there are a variety of problems current which seem to conflict with what has just been said, but they will all submit to explanation. For instance; What is the chance that three straight lines, taken or drawn at random, shall be of such lengths as will admit of their forming a triangle? There are two ways in which we may regard the problem. We may, for one thing, start with the assumption of three lines not greater than a certain length n, and then determine towards what limit the chance tends as n increases unceasingly. Or, we may maintain that the question is merely one of relative proportion of the three lines. We may then start with any magnitude we please to represent one of the lines (for simplicity, say, the longest of them), and consider that all possible shapes of a triangle will be represented by varying the lengths of the other two. In either case we get a definite result without need to make an attempt to conceive any random selection from the infinity of possible length.
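A short working, on the assumption that the three lengths are independent and uniformly distributed between 0 and n, shows why the first method gives a definite result. Writing y and z (symbols introduced here for illustration) for two of the lengths divided by n, the triangle fails when some one line is at least equal to the other two together; the three ways of failing exclude one another and are equally likely.

```latex
\[
P(\text{no triangle})
  = 3 \int_{0}^{1} \int_{0}^{1-y} (1 - y - z)\, dz\, dy
  = 3 \int_{0}^{1} \frac{(1-y)^{2}}{2}\, dy
  = 3 \cdot \frac{1}{6}
  = \frac{1}{2},
\qquad
P(\text{triangle}) = \frac{1}{2}.
\]
```

The length n has disappeared from the result, so that the limit as n increases is the same value, 1/2. The second method agrees: taking the longest line as unity and the other two as uniform between 0 and 1, a triangle is formed when their sum exceeds unity, of which the chance is again 1/2.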
So in what is called the “three-point problem”:—Three points in space are selected at random; find the chance of their forming an acute-angled triangle. What is done is to start with a closed volume,—say a sphere, from its superior simplicity,—find the chance (on the assumption of uniform distribution within this volume); and then conceive the continual enlargement without limit of this sphere. So regarded the problem is perfectly consistent and intelligible, though I fail to see why it should be termed a random selection in space rather than in a sphere. Of course if we started with a different volume, say a cube, we should get a different result; and it is therefore contended (e.g. by Mr Crofton in the Educational Times, as already referred to) that infinite space is more naturally and appropriately regarded as tended towards by the enlargement of a sphere than by that of a cube or any other figure.
Again: A group of integers is taken at random; show that the number thus taken is more likely to be odd than even. What we do in answering this is to start with any finite number n, and show that of all the possible combinations which can be made within this range there are more odd than even. Since this is true irrespective of the magnitude of n, we are apt to speak as if we could conceive the selection being made at random from the true infinity contemplated in numeration.
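The counting may be carried out directly. The sketch assumes one reading of the problem: that ‘the number thus taken’ means how many integers the group contains, that the group is not empty, and that every possible group formed from the first n integers is equally likely. The function name is an arbitrary choice.

```python
from math import comb


def group_size_counts(n):
    """Count the non-empty groups that can be formed from n integers,
    separated according as the group contains an odd or an even number."""
    odd = sum(comb(n, k) for k in range(1, n + 1, 2))
    even = sum(comb(n, k) for k in range(2, n + 1, 2))
    return odd, even


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 20):
        odd, even = group_size_counts(n)
        print(n, odd, even)
```

For every n the odd groups exceed the even by exactly one, the empty group being the even group left out; the excess is therefore real for each finite n, though its proportion to the whole diminishes without limit.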
§ 8. Where these conditions cannot be secured then it seems to me that the attempt to assign any finite value to the probability fails. For instance, in the following problem, proposed by Mr J. M. Wilson, “Three straight lines are drawn at random on an infinite plane, and a fourth line is drawn at random to intersect them: find the probability of its passing through the triangle formed by the other three” (Ed. Times, Reprint, Vol. V. p. 82), he offers the following solution: “Of the four lines, two must and two must not pass within the triangle formed by the remaining three. Since all are drawn at random, the chance that the last drawn should pass through the triangle formed by the other three is consequently 1/2.”
I quote this solution because it seems to me to illustrate the difficulty to which I want to call attention. As the problem is worded, a triangle is supposed to be assigned by three straight lines. However large it may be, its size bears no finite ratio whatever to the indefinitely larger area outside it; and, so far as I can put any intelligible construction on the supposition, the chance of drawing a fourth random line which should happen to intersect this finite area must be reckoned as zero. The problem Mr Wilson has solved seems to me to be a quite different one, viz. “Given four intersecting straight lines, find the chance that we should, at random, select one that passes through the triangle formed by the other three.”
The same difficulty seems to me to turn up in most other attempts to apply this conception of randomness to real infinity. The following seems an exact analogue of the above problem:—A number is selected at random, find the chance that another number selected at random shall be greater than the former;—the answer surely must be that the chance is unity, viz. certainty, because the range above any assigned number is infinitely greater than that below it. Or, expressed in the only language in which I can understand the term ‘infinity’, what I mean is this. If the first number be m and I am restricted to selecting up to n (n > m) then the chance of exceeding m is n − m : n; if I am restricted to 2n then it is 2n − m : 2n and so on. That is, however large n and m may be the expression is always intelligible; but, m being chosen first, n may be made as much larger than m as we please: i.e. the chance may be made to approach as near to unity as we please.
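In symbols, and assuming the second number to be chosen uniformly from the first n integers, the ratio just given is

```latex
\[
\frac{n - m}{n} = 1 - \frac{m}{n},
\qquad
\lim_{n \to \infty} \left( 1 - \frac{m}{n} \right) = 1
\quad \text{for any fixed } m .
\]
```

so that it is the passage to the limit, and not any selection from an actual infinity, which yields the value unity.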
I cannot but think that there is a similar fallacy in De Morgan's admirably suggestive paper on Infinity (Camb. Phil. Trans. Vol. 11.) when he is discussing the “three-point problem”:—i.e. given three points taken at random find the chance that they shall form an acute-angled triangle. All that he shows is, that if we start with one side as given and consider the subsequent possible positions of the opposite vertex, there are infinitely as many such positions which would form an acute-angled triangle as an obtuse: but, as before, this is solving a different problem.
§ 9. The nearest approach I can make towards true indefinite randomness, or random selection from true indefiniteness, is as follows. Suppose a circle with a tangent line extended indefinitely in each direction. Now from the centre draw radii at random; in other words, let the semicircumference which lies towards the tangent be ultimately uniformly intersected by the radii. Let these radii be then produced so as to intersect the tangent line, and consider the distribution of these points of intersection. We shall obtain in the result one characteristic of our random distribution; i.e. no portion of this tangent, however small or however remote, but will find itself in the position ultimately of any small portion of the pavement in our supposed continual rainfall. That is, any such elementary patch will become more and more closely dotted over with the points of intersection. But the other essential characteristic, viz. that of ultimately uniform distribution, will be missing. There will be a special form of distribution,—what in fact will have to be discussed in a future chapter under the designation of a ‘law of error’,—by virtue of which the concentration will tend to be greatest at a certain point (that of contact with the circle), and will thin out from here in each direction according to an easily calculated formula. The existence of such a state of things as this is quite opposed to the conception of true randomness.
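The want of uniformity along the tangent is easily exhibited. In the sketch below the radius is taken as unity by default and distances are measured along the tangent from the point of contact; the function name is an arbitrary choice. Since a radius making an angle t with the perpendicular meets the tangent at a distance equal to the radius multiplied by tan t, the share of points between two distances follows from the difference of the corresponding angles. The resulting law of distribution is that now commonly called the Cauchy distribution.

```python
import math


def share_of_tangent(a, b, radius=1.0):
    """Long-run share of the points of intersection lying between the
    distances a and b along the tangent, when the radii are uniformly
    distributed in angle over the semicircle facing the tangent."""
    return (math.atan(b / radius) - math.atan(a / radius)) / math.pi


if __name__ == "__main__":
    # Equal lengths of tangent, taken further and further from the contact.
    for start in (0, 1, 2, 10, 1000):
        print(start, start + 1, share_of_tangent(start, start + 1))
```

Every segment, however remote, receives some share, but equal segments receive less and less as they recede from the point of contact, the first unit of length on one side taking a quarter of the whole.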
§ 10. III. Apart from definitions and what comes of them, perhaps the most important question connected with the conception of Randomness is this: How in any given case are we to determine whether an observed arrangement is to be considered a random one or not? This question will have to be more fully discussed in a future chapter, but we are already in a position to see our way through some of the difficulties involved in it.
(1) If the events or objects under consideration are supposed to be continued indefinitely, or if we know enough about the mode in which they are brought about to detect their ultimate tendency,—or even, short of this, if they are numerous enough to be beyond practical counting,—there is no great difficulty. We are simply confronted with a question of fact, to be settled like other questions of fact. In the case of the rain-drops, watch two equal squares of pavement or other surfaces, and note whether they come to be more and more densely uniformly and evenly spotted over: if they do, then the arrangement is what we call a random one. If I want to know whether a tobacco-pipe really breaks at random, and would therefore serve as an illustration of the problem proposed some pages back, I have only to drop enough of them and see whether pieces of all possible lengths are equally represented in the long run. Or, I may argue deductively, from what I know about the strength of materials and the molecular constitution of such bodies, as to whether fractures of small and large pieces are all equally likely to occur.
§ 11. The reader's attention must be carefully directed to a source of confusion here, arising out of a certain cross-division. What we are now discussing is a question of fact, viz. the nature of a certain ultimate arrangement; we are not discussing the particular way in which it is brought about. In other words, the antithesis is between what is and what is not random: it is not between what is random and what is designed. As we shall see in a few moments it is quite possible that an arrangement which is the result,—if ever anything were so,—of ‘design’, may nevertheless present the unmistakeable stamp of randomness of arrangement.
Consider a case which has been a good deal discussed, and to which we shall revert again: the arrangement of the stars. The question here is rather complicated by the fact that we know nothing about the actual mutual positions of the stars, all that we can take cognizance of being their apparent or visible places as projected upon the surface of a supposed sphere. Appealing to what alone we can thus observe, it is obvious that the arrangement, as a whole, is not of the random sort. The Milky Way and the other resolvable nebulæ, as they present themselves to us, are as obvious an infraction of such an arrangement as would be the occurrence here and there of patches of ground in a rainfall which received a vast number more drops than the spaces surrounding them. If we leave these exceptional areas out of the question and consider only the stars which are visible to the naked eye or by slight telescopic power, it seems equally certain that the arrangement is, for the most part, a fairly representative random one. By this we mean nothing more than that, when we mark off any number of equal areas on the visible sphere, these are found to contain approximately the same number of stars.
The actual arrangement of the stars in space may also be of the same character: that is, the apparently denser aggregation may be apparent only, arising from the fact that we are looking through regions which are not more thickly occupied but are merely more extensive. The alternative before us, in fact, is this. If the whole volume, so to say, of the starry heavens is tolerably regular in shape, then the arrangement of the stars is not of the random order; if that volume is very irregular in shape, it is possible that the arrangement within it may be throughout of that order.
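The test of equal areas on the visible sphere, described above, can be sketched in a few lines. The sketch rests on two known geometrical facts: bands of equal height cut from a sphere have equal area, and the height (z-coordinate) of a point scattered uniformly over a unit sphere is itself uniform between -1 and 1. Everything else is an assumption made for illustration only: the numbers of stars, the ten bands, the narrow 'belt' standing in for such a feature as the Milky Way, and the name `band_counts`. No actual star catalogue is used.

```python
import random

def band_counts(z_values, n_bands):
    """Count points falling in each of n_bands equal-height (hence equal-area)
    bands of a unit sphere, given only the z-coordinate of each point."""
    counts = [0] * n_bands
    for z in z_values:
        # Map z in [-1, 1] to a band index; clamp so z == 1.0 stays in the last band.
        i = min(int((z + 1.0) / 2.0 * n_bands), n_bands - 1)
        counts[i] += 1
    return counts

rng = random.Random(1)  # fixed seed, chosen arbitrarily

# Hypothetical stars scattered uniformly over the whole sphere.
scattered = [rng.uniform(-1.0, 1.0) for _ in range(20_000)]
# A hypothetical dense belt crowded close to the equator of the sphere.
belt = [rng.uniform(-0.05, 0.05) for _ in range(5_000)]

even = band_counts(scattered, 10)
uneven = band_counts(scattered + belt, 10)
print("scattered only: ", even)
print("with dense belt:", uneven)
```

The first set of counts should be nearly equal from band to band; in the second, the two bands adjoining the equator are conspicuously fuller than the rest, which is the kind of infraction the Milky Way presents to the eye.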
§ 12. (2) When the arrangement in question includes but a comparatively small number of events or objects, it becomes much more difficult to determine whether or not it is to be designated a random one. In fact we have to shift our ground, and to decide not by what has been actually observed but by what we have reason to conclude would be observed if we could continue our observation much longer. This introduces what is called ‘Inverse Probability’, viz. the determination of the nature of a cause from the nature of the observed effect; a question which will be fully discussed in a future chapter. But some introductory remarks may be conveniently made here.
Every problem of Probability, as the subject is here understood, introduces the conception of an ultimate limit, and therefore presupposes an indefinite possibility of repetition. When we have only a finite number of occurrences before us, direct evidence of the character of their arrangement fails us, and we have to fall back upon the nature of the agency which produces them. And as the number becomes smaller the confidence with which we can estimate the nature of the agency becomes gradually less.
Begin with an intermediate case. There is a small lawn, sprinkled over with daisies: is this a random arrangement? We feel some confidence that it is so, on mere inspection; meaning by this that (negatively) no trace of any regular pattern can be discerned and (affirmatively) that if we take any moderately small area, say a square yard, we shall find very nearly the same number of the plants included in it. But we can help ourselves by an appeal to the known agency of distribution here. We know that the daisy spreads by seed, and considering the effect of the wind and the continued sweeping and mowing of the lawn, we can detect causes at work which are analogous to those by which the dealing of cards and the tossing of dice are regulated.
In the above case the appeal to the process of production was subsidiary, but when we come to consider the nature of a very small succession or group this appeal becomes much more important. Let us be told of a certain succession of ‘heads’ and ‘tails’ to the number of ten. The range here is far too small for decision, and unless we are told whether the agent who obtained them was tossing or designing we are quite unable to say whether or not the designation of ‘random’ ought to be applied to the result obtained. The truth must never be forgotten that though ‘design’ is sure to break down in the long run if it make the attempt to produce directly the semblance of randomness,[4] yet for a short spell it can simulate it perfectly. Any short succession, say of heads and tails, may have been equally well brought about by tossing or by deliberate choice.
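The helplessness of mere inspection in so short a range may be illustrated by a small calculation. It is a sketch only, and assumes what the text does not state, namely a fair coin and independent tosses; the two sequences and the name `chance_under_tossing` are invented for the purpose.

```python
from fractions import Fraction

def chance_under_tossing(sequence):
    """Chance of obtaining this exact succession of H and T from a fair coin,
    the tosses being supposed independent (an assumption of this sketch)."""
    return Fraction(1, 2) ** len(sequence)

regular_looking = "HTHTHTHTHT"    # such as a designer might write down
irregular_looking = "HHTHTTTHHT"  # such as tossing might well yield

for s in (regular_looking, irregular_looking):
    print(s, chance_under_tossing(s))  # each prints 1/1024
```

Since every particular succession of ten has the same chance under tossing, none is excluded by that hypothesis, and each could just as well have been written down deliberately. The succession alone therefore decides nothing, and we are thrown back upon what we know of the agent.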