The following would be a fair illustration to test this view. I know that I must die on some day of the week, and there are but seven days. My belief, therefore, that I shall die on a Sunday is one-seventh. Here the contingent event is clearly one that does not admit of repetition; and yet would not the belief of every man have the value assigned it by the formula? It would appear that the same principle will be found to be at work here as in the former examples. It is quite true that I have only the opportunity of dying once myself, but I am a member of a class in which deaths occur with frequency, and I form my opinion upon evidence drawn from that class. If, for example, I had insured my life for £1000, I should feel a certain propriety in demanding £7000 in case the office declared that it would only pay in the event of my dying on a Sunday. I, indeed, for my own private part, might not find the arrangement an equitable one; but mankind at large, in case they acted on such a principle, might fairly commute their aggregate gains in such a way, whilst to the Insurance Office it would not make any difference at all.
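The arithmetic of this illustration may be put in the form of a short sketch. It assumes, what the text only implies, that deaths in the class fall equally often upon each of the seven days; the function names and the class of 7000 insured persons are hypothetical, chosen merely that the totals may come out whole.

```python
from fractions import Fraction

def equivalent_payout(base_payout, chance):
    """Sum to demand when payment is made only in a fraction `chance` of cases,
    so that the average payment per death is unchanged."""
    return Fraction(base_payout) / Fraction(chance)

def office_outlay(insured, payout, chance):
    """Aggregate outlay of the office, assuming exactly the fraction `chance`
    of the insured class dies under the qualifying condition."""
    return Fraction(insured) * Fraction(chance) * Fraction(payout)

sunday = Fraction(1, 7)                      # assumed chance of dying on a Sunday
demand = equivalent_payout(1000, sunday)     # sum demanded under the Sunday-only rule

# Hypothetical class of 7000 insured persons, all of whom die at some time.
ordinary_total = office_outlay(7000, 1000, 1)          # pays on every death
sunday_total = office_outlay(7000, demand, sunday)     # pays on Sunday deaths only

print(demand, ordinary_total, sunday_total)
```

Under these assumptions the two totals agree, which is the sense in which the arrangement makes no difference to the office, although any single insured person either receives the whole sum or nothing.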

§ 26. The results of the last few sections might be summarised as follows:—the different amounts of belief which we entertain upon different events, and which are recognized by various phrases in common use, have undoubtedly some meaning. But the greater part of their meaning, and certainly their only justification, are to be sought in the series of corresponding events to which they belong; in regard to which it may be shown that far more events are capable of being referred to a series than might be supposed at first sight. The test and justification of belief are to be found in conduct; in this test applied to the series as a whole, there is nothing peculiar, it differs in no way from the similar test when we are acting on our belief about any single event. But so applied, from the nature of the case it is applied successively to each of the individuals of the series; here our conduct generally admits of being separately considered in reference to each particular event; and this has been understood to denote a certain amount of belief which should be a fraction of certainty. Probably on the principles of association, a peculiar condition of mind is produced in reference to each single event. And these associations are not unnaturally retained even when we contemplate any one of these single events isolated from any series to which it belongs. When it is found alone we treat it, and feel towards it, as we do when it is in company with the rest of the series.

§ 27. We may now see, more clearly than we could before, why it is that we are free from any necessity of assuming the existence of causation, in the sense of necessary invariable sequence, in the case of the events which compose our series. Against such a view it might very plausibly be urged, that we constantly talk of the probability of a single event; but how can this be done, it may reasonably be said, if we once admit the possibility of that event occurring fortuitously? Take an instance from human life; the average duration of the lives of a batch of men aged thirty will be about thirty-four years. We say therefore to any individual of them, Your expectation of life is thirty-four years. But how can this be said if we admit that the train of events composing his life is liable to be destitute of all regular sequence of cause and effect? To this it may be replied that the denial of causation enables us to say neither more nor less than its assertion, in reference to the length of the individual life, for of this we are ignorant in each case alike. By assigning, as above, an expectation in reference to the individual, we mean nothing more than to make a statement about the average of his class. Whether there be causation or not in these individual cases does not affect our knowledge of the average, for this by supposition rests on independent experience. The legitimate inferences are the same on either hypothesis, and of equal value. The only difference is that on the hypothesis of non-causation we have forced upon our attention the impropriety of talking of the ‘proper’ expectation of the individual, owing to the fact that all knowledge of its amount is formally impossible; on the other hypothesis the impropriety is overlooked from the fact of such knowledge being only practically unattainable. As a matter of fact the amount of our knowledge is the same in each case; it is a knowledge of the average, and of that only.[6]

§ 28. We may conclude, then, that the limits within which we are thus able to justify the amount of our belief are far more extensive than might appear at first sight. Whether every case in which persons feel an amount of belief short of perfect confidence could be forced into the province of Probability is a wider question. Even, however, if the belief could be supposed capable of justification on its principles, its rules could never in such cases be made use of. Suppose, for example, that a father were in doubt whether to give a certain medicine to his sick child. On the one hand, the doctor declared that the child would die unless the medicine were given; on the other, through a mistake, the father cannot feel quite sure that the medicine he has is the right one. It is conceivable that some mathematicians, in their conviction that everything has its definite numerical probability, would declare that the man's belief had some ‘value’ (if they could only find out what it is), say nine-tenths; by which they would mean that in nine cases out of ten in which he entertained a belief of that particular value he proved to be right. So with his belief and doubt on the other side of the question. Putting the two together, there is but one course which, as a prudent man and a good father, he can possibly follow. It may be so, but when (as here) the identification of an event in a series depends on purely subjective conditions, as in this case upon the degree of vividness of his conviction, of which no one else can judge, no test is possible, and therefore no proof can be found.

§ 29. So much then for the attempts, so frequently made, to found the science on a subjective basis; they can lead, as it has here been endeavoured to show, to no satisfactory result. Still our belief is so inseparably connected with our action, that something of a defence can be made for the attempts described above; but when it is attempted, as is often the case, to import other sentiments besides pure belief, and to find a justification for them also in the results of our science, the confusion becomes far worse. The following extract from Archbishop Thomson's Laws of Thought (§ 122, Ed. II.) will show what kind of applications of the science are contemplated here: “In applying the doctrine of chances to that subject in connexion with which it was invented—games of chance,—the principles of what has been happily termed ‘moral arithmetic’ must not be forgotten. Not only would it be difficult for a gamester to find an antagonist on terms, as to fortune and needs, precisely equal, but also it is impossible that with such an equality the advantage of a considerable gain should balance the harm of a serious loss. ‘If two men,’ says Buffon, ‘were to determine to play for their whole property, what would be the effect of this agreement? The one would only double his fortune, and the other reduce his to naught. What proportion is there between the loss and the gain? The same that there is between all and nothing. The gain of the one is but a moderate sum,—the loss of the other is numerically infinite, and morally so great that the labour of his whole life may not perhaps suffice to restore his property.’ ”

As moral advice this is all very true and good. But if it be regarded as a contribution to the science of the subject it is quite inappropriate, and seems calculated to cause confusion. The doctrine of chances pronounces upon certain kinds of events in respect of number and magnitude; it has absolutely nothing to do with any particular person's feelings about these relations. We might as well append a corollary to the rules of arithmetic, to point out that although it is very true that twice two are four it does not follow that four horses will give twice as much pleasure to the owner as two will. If two men play on equal terms their chances are equal; in other words, if they were often to play in this manner each would lose as frequently as he would gain. That is all that Probability can say; what under the circumstances may be the determination and opinions of the men in question, it is for them and them alone to decide. There are many persons who cannot bear mediocrity of any kind, and to whom the prospect of doubling their fortune would outweigh a greater chance of losing it altogether. They alone are the judges.

If we will introduce such a balance of pleasure and pain the individual must make the calculation for himself. The supposition is that total ruin is very painful, partial loss painful in a less proportion than that assigned by the ratio of the losses themselves; the inference is therefore drawn that on the average more pain is caused by occasional great losses than by frequent small ones, though the money value of the losses in the long run may be the same in each case. But if we suppose a country where the desire of spending largely is very strong, and where owing to abundant production loss is easily replaced, the calculation might incline the other way. Under such circumstances it is quite possible that more happiness might result from playing for high than for low stakes. The fact is that all emotional considerations of this kind are irrelevant; they are, at most, mere applications of the theory, and such as each individual is alone competent to make for himself. Some more remarks will be made upon this subject in the chapter upon Insurance and Gambling.

§ 30. It is by the introduction of such considerations as these that the Petersburg Problem has been so perplexed. Having already given some description of this problem we will refer to it very briefly here. It presents us with a sequence of sets of throws for each of which sets I am to receive something, say a shilling, as the minimum receipt. My receipts increase in proportion to the rarity of each particular kind of set, and each kind is observed or inferred to grow more rare in a certain definite but unlimited order. By the wording of the problem, properly interpreted, I am supposed never to stop. Clearly therefore, however large a fee I pay for each of these sets, I shall be sure to make it up in time. The mathematical expression of this is, that I ought always to pay an infinite sum. To this the objection is opposed, that no sensible man would think of advancing even a large finite sum, say £50. Certainly he would not; but why? Because neither he nor those who are to pay him would be likely to live long enough for him to obtain throws good enough to remunerate him for one-tenth of his outlay; to say nothing of his trouble and loss of time. We must not suppose that the problem, as stated in the ideal form, will coincide with the practical form in which it presents itself in life. A carpenter might as well object to Euclid's second postulate, because his plane came to a stop in six feet on the plank on which he was at work. Many persons have failed to perceive this, and have assumed that, besides enabling us to draw numerical inferences about the members of a series, the theory ought also to be called upon to justify all the opinions which average respectable men might be inclined to form about them, as well as the conduct they might choose to pursue in consequence. It is obvious that to enter upon such considerations as these is to diverge from our proper ground. 
We are concerned, in these cases, with the actions of men only, as given in statistics; with the emotions they experience in the performance of these actions we have no direct concern whatever. The error is the same as if any one were to confound, in political economy, value in use with value in exchange, and object to measuring the value of a loaf by its cost of production, because bread is worth more to a man when he is hungry than it is just after his dinner.
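The numerical side of the Petersburg Problem may be sketched as follows. The sketch assumes the usual doubling form of the problem, which is not restated in this section: a set ends at the first head, a set ending at the k-th throw has a chance of 1 in 2^k, and it pays 2^(k-1) shillings. It also takes the pound at twenty shillings. The function name is hypothetical.

```python
from fractions import Fraction

def truncated_average_receipt(max_throws):
    """Average receipt per set, in shillings, when only sets ending within
    `max_throws` throws are counted (assumed doubling scheme)."""
    total = Fraction(0)
    for k in range(1, max_throws + 1):
        chance = Fraction(1, 2 ** k)   # first head appears at throw k
        receipt = 2 ** (k - 1)         # 1, 2, 4, ... shillings
        total += chance * receipt      # each kind of set adds half a shilling
    return total

fee_shillings = 50 * 20                # a fee of 50 pounds, in shillings

print(truncated_average_receipt(10))    # sets of at most 10 throws
print(truncated_average_receipt(2000))  # sets of at most 2000 throws
```

On these assumptions every kind of set contributes the same half-shilling to the average, so that the average grows without limit as longer sets are admitted; but it reaches the fee of £50 only when sets of 2000 throws are allowed for, which is the practical difficulty the text points out.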

§ 31. One class of emotions indeed ought to be excepted, which, from the apparent uniformity and consistency with which they show themselves in different persons and at different times, do really present some better claim to consideration. In connection with a science of inference they can never indeed be regarded as more than an accident of what is essential to the subject, but compared with other emotions they seem to be inseparable accidents.

The reader will remember that attention was drawn in the earlier part of this chapter to the compound nature of the state of mind which we term belief. It is partly intellectual, partly also emotional; it professes to rest upon experience, but in reality the experience acts through the distorting media of hopes and fears and other disturbing agencies. So long as we confine our attention to the state of mind of the person who believes, it appears to me that these two parts of belief are quite inseparable. Indeed, to speak of them as two parts may convey a wrong impression; for though they spring from different sources, they so entirely merge in one result as to produce what might be called an indistinguishable compound. Every kind of inference, whether in probability or not, is liable to be disturbed in this way. A timid man may honestly believe that he will be wounded in a coming battle, when others, with the same experience but calmer judgments, see that the chance is too small to deserve consideration. But such a man's belief, if we look only to that, will not differ in its nature from sound belief. His conduct also in consequence of his belief will by itself afford no ground of discrimination; he will make his will as sincerely as a man who is unmistakeably on his death-bed. The only resource is to check and correct his belief by appealing to past and current experience.[7] This was advanced as an objection to the theory on which probability is regarded as concerned primarily with laws of belief. But on the view taken in this Essay in which we are supposed to be concerned with laws of inference about things, error and difficulty from this source vanish. Let us bear clearly in mind that we are concerned with inferences about things, and whatever there may be in belief which does not depend on experience will disappear from notice.

§ 32. These emotions then can claim no notice as an integral portion of any science of inference, and should in strictness be rigidly excluded from it. But if any of them are uniform and regular in their production and magnitude, they may be fairly admitted as accidental and extraneous accompaniments. This is really the case to some extent with our surprise. This emotion does show a considerable degree of uniformity. The rarer any event is the more am I, in common with most other men, surprised at it when it does happen. This surprise may range through all degrees, from the most languid form of interest up to the condition which we term ‘being startled’. And since the surprise seems to be pretty much the same, under similar circumstances, at different times, and in the case of different persons, it is free from that extreme irregularity which is found in most of the other mental conditions which accompany the contemplation of unexpected events. Hence our surprise, though, as stated above, having no proper claim to admission into the science of Probability, is such a constant and regular accompaniment of that which Probability is concerned with, that notice must often be taken of it. References will occasionally be found to this aspect of the question in the following chapters.

It may be remarked in passing, for the sake of further illustration of the subject, that this emotional accompaniment of surprise, to which we are thus able to assign something like a fractional value, differs in two important respects from the commonly accepted fraction of belief. In the first place, it has what may be termed an independent existence; it is intelligible by itself. The belief, as we endeavoured to show, needs explanation and finds it in our consequent conduct. Not so with the emotion; this stands upon its own footing, and may be examined in and by itself. Hence, in the second place, it is as applicable, and as capable of any kind of justification, in relation to the single event, as to a series of events. In this respect, as will be remembered, it offers a complete contrast to our state of belief about any one contingent event. May not these considerations help to account for the general acceptance of the doctrine, that we have a certain definite and measurable amount of belief about these events? I cannot help thinking that what is so obviously true of the emotional portion of the belief, has been unconsciously transferred to the other or intellectual portion of the compound condition, to which it is not applicable, and where it cannot find a justification.

§ 33. A further illustration may now be given of the subjective view of Probability at present under discussion.

An appeal to common language is always of service, as the employment of any distinct word is generally a proof that mankind have observed some distinct properties in the things, which have caused them to be singled out and have that name appropriated to them. There is such a class of words assigned by popular usage to the kind of events of which Probability takes account. If we examine them we shall find, I think, that they direct us unmistakeably to the two-fold aspect of the question,—the objective and the subjective, the quality in the events and the state of our minds in considering them,—that have occupied our attention during the former chapters.

The word ‘extraordinary’, for instance, seems to point to the observed fact, that events are arranged in a sort of ordo or rank. No one of them might be so exactly placed that we could have inferred its position, but when we take a great many into account together, running our eye, as it were, along the line, we begin to see that they really do for the most part stand in order. Those which stand away from the line have this divergence observed, and are called extraordinary, the rest ordinary, or in the line. So too ‘irregular’ and ‘abnormal’ are doubtless used from the appearance of things, when examined in large numbers, being that of an arrangement by rule or measure. This only holds when there are a good many; we could not speak of the single events being so arranged. Again the word ‘law’, in its philosophical sense, has now become quite popularised. How the term became introduced is not certain, but there can be little doubt that it was somewhat in this way:—The effect of a law, in its usual application to human conduct, is to produce regularity where it did not previously exist; when then a regularity began to be perceived in nature, the same word was used, whether the cause was supposed to be the same or not. In each case there was the same generality of agreement, subject to occasional deflection.[8]

On the other hand, observe the words ‘wonderful’, ‘unexpected’, ‘incredible’. Their connotation describes states of mind simply; they are of course not confined to Probability, in the sense of statistical frequency, but imply simply that the events they denote are such as from some cause we did not expect would happen, and at which therefore, when they do happen, we are surprised.

Now when we bear in mind that these two classes of words are in their origin perfectly distinct;—the one denoting simply events of a certain character; the other, though also denoting events, connoting simply states of mind;—and yet that they are universally applied to the same events, so as to be used as perfectly synonymous, we have in this a striking illustration of the two sides under which Probability may be viewed, and of the universal recognition of a close connection between them. The words are popularly used as synonymous, and we must not press their meaning too far; but if it were to be observed, as I am rather inclined to think it could, that the application of the words which denote mental states is wider than that of the others, we should have an illustration of what has been already observed, viz. that the province of Probability is not so extensive as that over which variation of belief might be observed. Probability only considers the case in which this variation is brought about in a certain definite statistical way.

§ 34. It will be found in the end both interesting and important to have devoted some attention to this subjective side of the question. In the first place, as a mere speculative inquiry the quantity of our belief of any proposition deserves notice. To study it at all deeply would be to trespass into the province of Psychology, but it is so intimately connected with our own subject that we cannot avoid all reference to it. We therefore discuss the laws under which our expectation and surprise at isolated events increases or diminishes, so as to account for these states of mind in any individual instance, and, if necessary, to correct them when they vary from their proper amount.

But there is another more important reason than this. It is quite true that when the subjects of our discussion in any particular instance lie entirely within the province of Probability, they may be treated without any reference to our belief. We may or we may not employ this side of the question according to our pleasure. If, for example, I am asked whether it is more likely that A. B. will die this year, than that it will rain to-morrow, I may calculate the chance (which really is at bottom the same thing as my belief) of each, find them respectively, one-sixth and one-seventh, say, and therefore decide that my ‘expectation’ of the former is the greater, viz. that this is the more likely event. In this case the process is precisely the same whether we suppose our belief to be introduced or not; our mental state is, in fact, quite immaterial to the question. But, in other cases, it may be different. Suppose that we are comparing two things, of which one is wholly alien to Probability, in the sense that it is hopeless to attempt to assign any degree of numerical frequency to it, the only ground they have in common may be the amount of belief to which they are respectively entitled. We cannot compare the frequency of their occurrence, for one may occur too seldom to judge by, perhaps it may be unique. It has been already said, that our belief of many events rests upon a very complicated and extensive basis. My belief may be the product of many conflicting arguments, and many analogies more or less remote; these proofs themselves may have mostly faded from my mind, but they will leave their effect behind them in a weak or strong conviction. At the time, therefore, I may still be able to say, with some degree of accuracy, though a very slight degree, what amount of belief I entertain upon the subject. 
Now we cannot compare things that are heterogeneous: if, therefore, we are to decide between this and an event determined naturally and properly by Probability, it is impossible to appeal to chances or frequency of occurrence. The measure of belief is the only common ground, and we must therefore compare this quantity in each case. The test afforded will be an exceedingly rough one, for the reasons mentioned above, but it will be better than none; in some cases it will be found to furnish all we want.

Suppose, for example, that one letter in a million is lost in the Post Office, and that in any given instance I wish to know which is more likely, that a letter has been so lost, or that my servant has stolen it? If the latter alternative could, like the former, be stated in a numerical form, the comparison would be simple. But it cannot be reduced to this form, at least not consciously and directly. Still, if we could feel that our belief in the man's dishonesty was greater than one-millionth, we should then have homogeneous things before us, and therefore comparison would be possible.

§ 35. We are now in a position to give a tolerably accurate definition of a phrase which we have frequently been obliged to employ, or incidentally to suggest, and of which the reader may have looked for a definition already, viz. the probability of an event, or what is equivalent to this, the chance of any given event happening. I consider that these terms presuppose a series; within the indefinitely numerous class which composes this series a smaller class is distinguished by the presence or absence of some attribute or attributes, as was fully illustrated and explained in a previous chapter. These larger and smaller classes respectively are commonly spoken of as instances of the ‘event,’ and of ‘its happening in a given particular way.’ Adopting this phraseology, which with proper explanations is suitable enough, we may define the probability or chance (the terms are here regarded as synonymous) of the event happening in that particular way as the numerical fraction which represents the proportion between the two different classes in the long run. Thus, for example, let the probability be that of a given infant living to be eighty years of age. The larger series will comprise all infants, the smaller all who live to eighty. Let the proportion of the former to the latter be 10 to 1; in other words, suppose that one infant in ten lives to eighty. Then the chance or probability that any given infant will live to eighty is the numerical fraction ¹/₁₀. This assumes that the series are of indefinite extent, and of the kind which we have described as possessing a fixed type. If this be not the case, but the series be supposed terminable, or regularly or irregularly fluctuating, as might be the case, for instance, in a society where owing to sanitary or other causes the average longevity was steadily undergoing a change, then in so far as this is the case the series ceases to be a subject of science.
What we have to do under these circumstances, is to substitute a series of the right kind for the inappropriate one presented by nature, choosing it, of course, with as little deflection as possible from the observed facts. This is nothing more than has to be done, and invariably is done, whenever natural objects are made subjects of strict science.
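The definition may be illustrated by a small sketch. The counts are hypothetical, and a finite count can only stand in for the indefinitely extended series which the definition presupposes; the function names are likewise assumptions made for illustration.

```python
from fractions import Fraction

def chance(larger_class, smaller_class):
    """Fraction of the larger class that falls within the smaller class."""
    if larger_class <= 0 or not 0 <= smaller_class <= larger_class:
        raise ValueError("the smaller class must lie within a non-empty larger class")
    return Fraction(smaller_class, larger_class)

def odds_against(larger_class, smaller_class):
    """Ratio of members lacking the attribute to members possessing it."""
    return Fraction(larger_class - smaller_class, smaller_class)

infants = 10_000          # hypothetical count of the larger class
reach_eighty = 1_000      # hypothetical count of those who live to eighty

print(chance(infants, reach_eighty))        # one infant in ten
print(odds_against(infants, reach_eighty))  # those who fail : those who succeed
```

It may be worth observing that, with one infant in ten living to eighty, the whole class stands to the survivors as ten to one, whilst those who fail stand to those who succeed as nine to one; the two ratios are easily confounded.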

§ 36. A word or two of explanation may be added about the expression employed above, ‘the proportion in the long run.’ The run must be supposed to be very long indeed, in fact never to stop. As we keep on taking more terms of the series we shall find the proportion still fluctuating a little, but its fluctuations will grow less. The proportion, in fact, will gradually approach towards some fixed numerical value, what mathematicians term its limit. This fractional value is the one spoken of above. In the cases in which deductive reasoning is possible, this fraction may be obtained without direct appeal to statistics, from reasoning about the conditions under which the events occur, as was explained in the fourth chapter.
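The behaviour of the proportion "in the long run" may be sketched by an artificial series. The sketch assumes that every term is produced independently with the same chance of one in ten, which is a substituted series of fixed type and not a record of any actual statistics; the function name, the seed, and the checkpoints are hypothetical.

```python
import random

def running_proportions(chance, terms, checkpoints, seed=0):
    """Proportion of favourable terms among the first n, for each n in `checkpoints`.

    Every term is drawn independently with the same `chance`: an assumed
    stand-in for a series of fixed type.
    """
    rng = random.Random(seed)      # seeded, so that the run may be repeated
    wanted = set(checkpoints)
    favourable = 0
    proportions = {}
    for n in range(1, terms + 1):
        if rng.random() < chance:
            favourable += 1
        if n in wanted:
            proportions[n] = favourable / n
    return proportions

props = running_proportions(0.1, 100_000, [10, 100, 1_000, 10_000, 100_000])
for n, p in props.items():
    print(f"after {n:>7} terms: proportion = {p:.4f}")
```

In a run of this kind the earlier proportions commonly stand at some distance from one-tenth, whilst the later ones lie closer to it; the limit itself is never reached by any finite run, but only approached.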

Here becomes apparent the full importance of the distinction so frequently insisted on, between the actual irregular series before us and the substituted one of calculation, and the meaning of the assertion (Ch. I. § 13), that it was in the case of the latter only that strict scientific inferences could be made. For how can we have a ‘limit’ in the case of those series which ultimately exhibit irregular fluctuations? When we say, for instance, that it is an even chance that a given person recovers from the cholera, the meaning of this assertion is that in the long run one half of the persons attacked by that disease do recover. But if we examined a sufficiently extensive range of statistics, we might find that the manners and customs of society had produced such a change in the type of the disease or its treatment, that we were no nearer approaching towards a fixed limit than we were at first. The conception of an ultimate limit in the ratio between the numbers of the two classes in the series necessarily involves an absolute fixity of the type. When therefore nature does not present us with this absolute fixity, as she seldom or never does except in games of chance (and not demonstrably there), our only resource is to introduce such a series, in other words, as has so often been said, to substitute a series of the right kind.

§ 37. The above, which may be considered tolerably complete as a definition, might equally well have been given in the last chapter. It has been deferred however to the present place, in order to connect with it at once a proposition involving the conceptions introduced in this chapter; viz. the state of our own minds, in reference to the amount of belief we entertain in contemplating any one of the events whose probability has just been described. Reasons were given against the opinion that our belief admitted of any exact apportionment like the numerical one just mentioned. Still, it was shown that a reasonable explanation could be given of such an expression as, ‘my belief is ¹/₁₀th of certainty’, though it was an explanation which pointed unmistakeably to a series of events, and ceased to be intelligible, or at any rate justifiable, when it was not viewed in such a relation to a series. In so far, then, as this explanation is adopted, we may say that our belief is in proportion to the above fraction. This referred to the purely intellectual part of belief which cannot be conceived to be separable, even in thought, from the things upon which it is exercised. With this intellectual part there are commonly associated various emotions. These we can to a certain extent separate, and, when separated, can measure with that degree of accuracy which is possible in the case of other emotions. They are moreover intelligible in reference to the individual events. They will be found to increase and diminish in accordance, to some extent, with the fraction which represents the scarcity of the event. The emotion of surprise does so with some degree of accuracy.

The above investigation describes, though in a very brief form, the amount of truth which appears to me to be contained in the assertion frequently made, that the fraction expressive of the probability represents also the fractional part of full certainty to which our belief of the individual event amounts. Any further analysis of the matter would seem to belong to Psychology rather than to Probability.

[1] In the ordinary signification of this term. As De Morgan uses it he makes Formal Logic include Probability, as one of its branches, as indicated in his title “Formal Logic, or the Calculus of Inference, necessary and probable.”

[2] Formal Logic. Preface, page v.

[3] An illustration of the points here insisted on has recently [1876] been given in a quarter where few would have expected it; I allude, as many readers will readily infer, to J. S. Mill's exceedingly interesting Essays on Theism. It is not within our province here to criticise any of their conclusions, but they have expressed in a very significant way the conviction entertained by him that beliefs which are not justified by evidence, and possibly may not be capable of justification (those for instance of immortality and the existence of the Deity), may nevertheless not only continue to exist in cultivated minds, but may also be profitably encouraged there, at any rate in the shape of hopes, for certain supposed advantages attendant on their retention, irrespective even of their truth.

[4] It is necessary to take an example in which the man is forced to act, or we should not be able to shew that he has any belief on the subject at all. He may declare that he neither knows nor cares anything about the matter, and that therefore there is nothing of the nature of belief to be extracted out of his mental condition. He very likely would take this ground if we asked him, as De Morgan does, with a slightly different reference (Formal Logic, p. 183), whether he considers that there are volcanoes on the unseen side of the moon larger than those on the side turned towards us; or, with Jevons (Principles of Science, Ed. II. p. 212) whether he considers that a Platythliptic Coefficient is positive. These do not therefore seem good instances to illustrate the position that we always entertain a certain degree of belief on every question which can be stated, and that utter inability to give a reason in favour of either alternative corresponds to half belief.

[5] Except indeed on the principles indicated further on in §§ 24, 25.

[6] For a fuller discussion of this, see the Chapter on Causation.

[7] The best example I can recall of the distinction between judging from the subjective and the objective side, in such cases as these, occurred once in a railway train. I met a timid old lady who was in much fear of accidents. I endeavoured to soothe her on the usual statistical ground of the extreme rarity of such events. She listened patiently, and then replied, “Yes, Sir, that is all very well; but I don't see how the real danger will be a bit the less because I don't believe in it.”

[8] This would still hold of empirical laws which may be capable of being broken: we now have very much shifted the word, to denote an ultimate law which it is supposed cannot be broken.

CHAPTER VII.

THE RULES OF INFERENCE IN PROBABILITY.

§ 1. In the previous chapter, an investigation was made into what may be called, from the analogy of Logic, Immediate Inferences. Given that nine men out of ten, of any assigned age, live to forty, what could be inferred about the prospect of life of any particular man? It was shown that, although this step was very far from being so simple as it is frequently supposed to be, and as the corresponding step really is in Logic, there was nevertheless an intelligible sense in which we might speak of the amount of our belief in any one of these ‘proportional propositions,’ as they may succinctly be termed, and justify that amount. We must now proceed to the consideration of inferences more properly so called, I mean inferences of the kind analogous to those which form the staple of ordinary logical treatises. In other words, having ascertained in what manner particular propositions could be inferred from the general propositions which included them, we must now examine in what cases one general proposition can be inferred from another. By a general proposition here is meant, of course, a general proposition of the statistical kind contemplated in Probability. The rules of such inference being very few and simple, their consideration will not detain us long. From the data now in our possession we are able to deduce the rules of probability given in ordinary treatises upon the science. It would be more correct to say that we are able to deduce some of these rules, for, as will appear on examination, they are of two very different kinds, resting on entirely distinct grounds. They might be divided into those which are formal, and those which are more or less experimental. 
This may be otherwise expressed by saying that, from the kind of series described in the first chapters, some rules will follow necessarily by the mere application of arithmetic; whilst others either depend upon peculiar hypotheses, or demand for their establishment continually renewed appeals to experience, and extension by the aid of the various resources of Induction. We shall confine our attention at present principally to the former class; the latter can only be fully understood when we have considered the connection of our science with Induction.

§ 2. The fundamental rules of Probability strictly so called, that is the formal rules, may be divided into two classes,—those obtained by addition or subtraction on the one hand, corresponding to what are generally termed the connection of exclusive or incompatible events;[1] and those obtained by multiplication or division, on the other hand, corresponding to what are commonly termed dependent events. We will examine these in order.

(1) We can make inferences by simple addition. If, for instance, there are two distinct properties observable in various members of the series, which properties do not occur in the same individual; it is plain that in any batch the number that are of one kind or the other will be equal to the sum of those of the two kinds separately. Thus 36.4 infants in 100 live to over sixty, 35.4 in 100 die before they are ten;[2] take a large number, say 10,000, then there will be about 3640 who live to over sixty, and about 3540 who do not reach ten; hence the total number who do not die within the assigned limits will be about 7180 altogether. Of course if these proportions were accurately assigned, the resultant sum would be equally accurate: but, as the reader knows, in Probability this proportion is merely the limit towards which the numbers tend in the long run, not the precise result assigned in any particular case. Hence we can only venture to say that this is the limit towards which we tend as the numbers become greater and greater.
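The arithmetic of this example may be sketched as follows. This is only an illustration under stated assumptions: the function name `split_batch` is hypothetical, and the two rates are taken as exact, whereas the text is careful to treat them as limits approached in the long run.

```python
from fractions import Fraction

def split_batch(batch_size, rate_a, rate_b):
    """Return (number of one kind or the other, remainder) for two exclusive kinds."""
    either = batch_size * (rate_a + rate_b)
    return either, batch_size - either

# Proportions quoted in the text: 36.4 and 35.4 in 100.
over_sixty = Fraction(364, 1000)
under_ten = Fraction(354, 1000)

either, remainder = split_batch(10_000, over_sixty, under_ten)
print(either, remainder)  # 7180 2820
```

The first figure counts those who live to over sixty or die before ten; the second counts those who die between those ages.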

This rule, in its general algebraic form, would be expressed in the language of Probability as follows:—If the chances of two exclusive or incompatible events be respectively 1/m and 1/n, the chance of one or other of them happening will be 1/m + 1/n, or (m + n)/(mn). Similarly if there were more than two events of the kind in question. On the principles adopted in this work, the rule, when thus algebraically expressed, means precisely the same thing as when it is expressed in the statistical form. It was shown at the conclusion of the last chapter that to say, for example, that the chance of a given event happening in a certain way is ¹/₆, is only another way of saying that in the long run it does tend to happen in that way once in six times.

It is plain that a sort of corollary to this rule might be obtained, in precisely the same way, by subtraction instead of addition. Stated generally it would be as follows:—If the chance of one or other of two incompatible events be 1/m and the chance of one alone be 1/n, the chance of the remaining one will be 1/m − 1/n, or (n − m)/(nm).

For example, if the chance of any one dying in a year is ¹/₁₀, and his chance of dying of some particular disease is ¹/₁₀₀, his chance of dying of any other disease is ⁹/₁₀₀.
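The rule of addition and its corollary by subtraction can be checked with exact fractions. The sketch below is illustrative only; the function names are hypothetical, and the events are assumed to be strictly incompatible, as the rule requires.

```python
from fractions import Fraction

def chance_of_either(m, n):
    """Chance of one or other of two incompatible events with chances 1/m and 1/n."""
    return Fraction(1, m) + Fraction(1, n)

def chance_of_remaining(m, n):
    """Given chance 1/m of one or other, and 1/n of one alone, the chance of the remaining one."""
    return Fraction(1, m) - Fraction(1, n)

# The example from the text: 1/10 of dying in the year, 1/100 of dying of one disease.
print(chance_of_remaining(10, 100))  # 9/100
```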

The reader will remark here that there are two apparently different modes of stating this rule, according as we speak of ‘one or other of two or more events happening,’ or of ‘the same event happening in one or other of two or more ways.’ But no confusion need arise on this ground; either way of speaking is legitimate, the difference being merely verbal, and depending (as was shown in the first chapter, § 8) upon whether the distinctions between the ‘ways’ are or are not too deep and numerous to entitle the event to be conventionally regarded as the same.

We may also here point out the justification for the common doctrine that certainty is represented by unity, just as any given degree of probability is represented by its appropriate fraction. If the statement that an event happens once in m times, is equivalently expressed by saying that its chance is 1/m, it follows that to say that it happens m times in m times, or every time without exception, is equivalent to saying that its chance is m/m or 1. Now an event that happens every time is of course one of whose occurrence we are certain; hence the fraction which represents the ‘chance’ of an event which is certain becomes unity.

It will be equally obvious that given that the chance that an event will happen is 1/m, the chance that it will not happen is 1 − 1/m, or (m − 1)/m.

§ 3. (2) We can also make inferences by multiplication or division. Suppose that two events instead of being incompatible, are connected together in the sense that one is contingent upon the occurrence of the other. Let us be told that a given proportion of the members of the series possess a certain property, and a given proportion again of these possess another property, then the proportion of the whole which possess both properties will be found by multiplying together the two fractions which represent the above two proportions. Of the inhabitants of London, twenty-five in a thousand, say, will die in the course of the year; we suppose it to be known also that one death in five is due to fever; we should then infer that one in 200 of the inhabitants will die of fever in the course of the year. It would of course be equally simple, by division, to make a sort of converse inference. Given the total mortality per cent. of the population from fever, and the proportion of fever cases to the aggregate of other cases of mortality, we might have inferred, by dividing one fraction by the other, what was the total mortality per cent. from all causes.
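The London example, with its converse by division, may be sketched thus. The variable names are illustrative, and the share of deaths due to fever is here taken as a fraction of all deaths, which is the form the multiplication requires.

```python
from fractions import Fraction

death_rate = Fraction(25, 1000)   # deaths per inhabitant per year
fever_share = Fraction(1, 5)      # proportion of all deaths due to fever

# Multiplication: the proportion of inhabitants dying of fever in the year.
fever_rate = death_rate * fever_share
print(fever_rate)  # 1/200

# Division: the converse inference recovers the total mortality.
recovered_death_rate = fever_rate / fever_share
print(recovered_death_rate)  # 1/40
```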

The rule as given above is variously expressed in the language of Probability. Perhaps the simplest and best statement is that it gives us the rule of dependent events. That is; if the chance of one event is 1/m, and the chance that if it happens another will also happen 1/n, then the chance of the latter is 1/(mn). In this case it is assumed that the latter is so entirely dependent upon the former that though it does not always happen with it, it certainly will not happen without it; the necessity of this assumption however may be obviated by saying that what we are speaking of in the latter case is the joint event, viz. both together if they are simultaneous events, or the latter in consequence of the former, if they are successive.

§ 4. The above inferences are necessary, in the sense in which arithmetical inferences are necessary, and they do not demand for their establishment any arbitrary hypothesis. We assume in them no more than is warranted, and in fact necessitated by the data actually given to us, and make our inferences from these data by the help of arithmetic. In the simple examples given above nothing is required beyond arithmetic in its most familiar form, but it need hardly be added that in practice examples may often present themselves which will require much profounder methods than these. It may task all the resources of that higher and more abstract arithmetic known as algebra to extract a solution. But as the necessity of appeal to such methods as these does not touch the principles of this part of the subject we need not enter upon them here.

§ 5. The formula next to be discussed stands upon a somewhat different footing from the above in respect of its cogency and freedom from appeal to experience, or to hypothesis. In the two former instances we considered cases in which the data were supposed to be given under the conditions that the properties which distinguished the different kinds of events whose frequency was discussed, were respectively known to be disconnected and known to be connected. Let us now suppose that no such conditions are given to us. One man in ten, say, has black hair, and one in twelve is short-sighted; what conclusions could we then draw as to the chance of any given man having one only of these two attributes, or neither, or both? It is clearly possible that the properties in question might be inconsistent with one another, so as never to be found combined in the same person; or all the short-sighted might have black hair; or the properties might be allotted[3] in almost any other proportion whatever. If we are perfectly ignorant upon these points, it would seem that no inferences whatever could be drawn about the required chances.

Inferences however are drawn, and practically, in most cases, quite justly drawn. An escape from the apparent indeterminateness of the problem, as above described, is found by assuming that, not merely will one-tenth of the whole number of men have black hair (for this was given as one of the data), but also that one-tenth alike of those who are and who are not short-sighted have black hair. Let us take a batch of 1200, as a sample of the whole. Now, from the data which were originally given to us, it will easily be seen that in every such batch there will be on the average 120 who have black hair, and therefore 1080 who have not. And here in strict right we ought to stop, at least until we have appealed again to experience; but we do not stop here. From data which we assume, we go on to infer that of the 120, 10 (i.e. one-twelfth of 120) will be short-sighted, and 110 (the remainder) will not. Similarly we infer that of the 1080, 90 are short-sighted, and 990 are not. On the whole, then, the 1200 are thus divided:—black-haired short-sighted, 10; short-sighted without black hair, 90; black-haired men who are not short-sighted, 110; men who are neither short-sighted nor have black hair, 990.

This rule, expressed in its most general form, in the language of Probability, would be as follows:—If the chances of a thing being p and q are respectively 1/m and 1/n, then the chance of its being both p and q is 1/(mn), p and not q is (n − 1)/(mn), q and not p is (m − 1)/(mn), not p and not q is (m − 1)(n − 1)/(mn), where p and q are independent. The sum of these chances is obviously unity; as it ought to be, since one or other of the four alternatives must necessarily exist.
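The fourfold division may be worked out as in the sketch below, using the black-haired and short-sighted example (m = 10, n = 12, batch of 1200). The function name `four_way_split` is hypothetical, and the code simply takes for granted the very assumption of independence which the text marks as going beyond the data.

```python
from fractions import Fraction

def four_way_split(m, n):
    """Chances of the four combinations of p (chance 1/m) and q (chance 1/n),
    ASSUMING each property subdivides the other's classes in the same ratio."""
    p, q = Fraction(1, m), Fraction(1, n)
    return {
        "p and q": p * q,
        "p and not q": p * (1 - q),
        "q and not p": (1 - p) * q,
        "neither": (1 - p) * (1 - q),
    }

chances = four_way_split(10, 12)   # p: black hair, q: short-sighted
counts = {kind: int(chance * 1200) for kind, chance in chances.items()}
print(counts)
print(sum(chances.values()))  # 1
```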

§ 6. I have purposely emphasized the distinction between the inference in this case, and that in the two preceding, to an extent which to many readers may seem unwarranted. But it appears to me that where a science makes use, as Probability does, of two such very distinct sources of conviction as the necessary rules of arithmetic and the merely more or less cogent ones of Induction, it is hardly possible to lay too much stress upon the distinction. Few will be prepared to deny that very arbitrary assumptions have been made by many writers on the subject, and none will deny that in the case of what are called ‘inverse probabilities’ assumptions are sometimes made which are at least decidedly open to question. The best course therefore is to make a pause and stringent enquiry at the point at which the possibility of such error and doubtfulness first exhibits itself. These remarks apply to some of the best writers on the subject; in the case of inferior writers, or those who appeal to Probability without having properly mastered its principles, we may go further. It would really not be asserting too much to say that they seem to think themselves justified in assuming that where we know nothing about the distribution of the properties alluded to we must assume them to be distributed as above described, and therefore apportion our belief in the same ratio. This is called ‘assuming the events to be independent,’ the supposition being made that the rule will certainly follow from this independence, and that we have a right, if we know nothing to the contrary, to assume that the events are independent.

The validity of this last claim has already been discussed in the first chapter; it is only another of the attempts to construct à priori the series which experience will present to us, and one for which no such strong defence can be made as for the equality of heads and tails in the throws of a penny. But the meaning to be assigned to the ‘independence’ of the events in question demands a moment's consideration.

The circumstances of the problem are these. There are two different qualities, by the presence and absence respectively of each of which, amongst the individuals of a series, two distinct pairs of classes of these individuals are produced. For the establishment of the rule under discussion it was found that one supposition was both necessary and sufficient, namely, that the division into classes caused by each of the above distinctions should subdivide each of the classes created by the other distinction in the same ratio in which it subdivides the whole. If the independence be granted and so defined as to mean this, the rule of course will stand, but, without especial attention being drawn to the point, it does not seem that the word would naturally be so understood.

§ 7. The above, then, being the fundamental rules of inference in probability, the question at once arises, What is their relation to the great body of formulæ which are made use of in treatises upon the science, and in practical applications of it? The reply would be that these formulæ, in so far as they properly belong to the science, are nothing else in reality than applications of the above fundamental rules. Such applications may assume any degree of complexity, for owing to the difficulty of particular examples, in the form in which they actually present themselves, recourse must sometimes be made to the profoundest theorems of mathematics. Still we ought not to regard these theorems as being anything else than convenient and necessary abbreviations of arithmetical processes, which in practice have become too cumbersome to be otherwise performed.

This explanation will account for some of the rules as they are ordinarily given, but by no means for all of them. It will account for those which are demonstrable by the certain laws of arithmetic, but not for those which in reality rest only upon inductive generalizations. And it can hardly be doubted that many rules of the latter description have become associated with those of the former, so that in popular estimation they have been blended into one system, of which all the separate rules are supposed to possess a similar origin and equal certainty. Hints have already been frequently given of this tendency, but the subject is one of such extreme importance that a separate chapter (that on Induction) must be devoted to its consideration.

§ 8. In establishing the validity of the above rules, we have taken as the basis of our investigations, in accordance with the general scheme of this work, the statistical frequency of the events referred to; but it was also shown that each formula, when established, might with equal propriety be expressed in the more familiar form of a fraction representing the ‘chance’ of the occurrence of the particular event. The question may therefore now be raised, Can those writers who (as described in the last chapter) take as the primary subject of the science not the degree of statistical frequency, but the quantity of belief, with equal consistency make this the basis of their rules, and so also regard the fraction expressive of the chance as a merely synonymous expression? De Morgan maintains that whereas in ordinary logic we suppose the premises to be absolutely true, the province of Probability is to study ‘the effect which partial belief of the premises produces with respect to the conclusion.’ It would appear therefore as if in strictness we ought on this view to be able to determine this consequent diminution at first hand, from introspection of the mind, that is of the conceptions and beliefs which it entertains; instead of making any recourse to statistics to tell us how much we ought to believe the conclusion.

Any readers who have concurred with me in the general results of the last chapter, will naturally agree in the conclusion that nothing deserving the name of logical science can be extracted from any results of appeal to our consciousness as to the quantity of belief we entertain of this or that proposition. Suppose, for example, that one person in 100 dies on the sea passage out to India, and that one in 9 dies during a 5 years residence there. It would commonly be said that the chance that any one, who is now going out, has of living to start homewards 5 years hence, is ⁸⁸/₁₀₀; for his chance of getting there is ⁹⁹/₁₀₀; and of his surviving, if he gets there, ⁸/₉; hence the result or dependent event is got by multiplying these fractions together, which gives ⁸⁸/₁₀₀. Here the real basis of the reasoning is statistical, and the processes or results are merely translated afterwards into fractions. But can we say the same when we look at the belief side of the question? I quite admit the psychological fact that we have degrees of belief, more or less corresponding to the frequency of the events to which they refer. In the above example, for instance, we should undoubtedly admit on enquiry that our belief in the man's return was affected by each of the risks in question, so that we had less expectation of it than if he were subject to either risk separately; that is, we should in some way compound the risks. But what I cannot recognise is that we should be able to perform the process with any approach to accuracy without appeal to the statistics, or that, even supposing we could do so, we should have any guarantee of the correctness of the result without similar appeal. It appears to me in fact that but little meaning, and certainly no security, can be attained by so regarding the process of inference. 
The probabilities expressed as degrees of belief, just as those which are expressed as fractions, must, when we are put upon our justification, first be translated into their corresponding facts of statistical frequency of occurrence of the events, and then the inferences must be drawn and justified there. This part of the operation, as we have already shown, is mostly carried on by the ordinary rules of arithmetic. When we have obtained our conclusion we may, if we please, translate it back again into the subjective form, just as we can and do for convenience into the fractional, but I do not see how the process of inference can be conceived as taking place in that form, and still less how any proof of it can thus be given. If therefore the process of inference be so expressed it must be regarded as a symbolical process, symbolical of such an inference about things as has been described above, and it therefore seems to me more advisable to state and expound it in this latter form.
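The translation into statistical frequency, and back again into a fraction, may be illustrated on the voyage to India. The cohort of 900 is a hypothetical batch, chosen only so that every count comes out whole; the rates are those given in the text.

```python
from fractions import Fraction

cohort = 900                             # hypothetical batch of travellers
arrive = cohort * Fraction(99, 100)      # one in 100 dies on the passage out
survive = arrive * Fraction(8, 9)        # one in 9 dies during the residence

# Only after counting do we translate the result back into a 'chance'.
chance = survive / cohort
print(arrive, survive, chance)  # 891 792 22/25
```

The fraction 22/25 is the 88/100 of the text in lowest terms.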

On Inverse Probability and the Rules required for it.

§ 9. It has been already stated that the only fundamental rules of inference in Probability are the two described in §§ 2, 3, but there are of course abundance of derivative rules, the nature and use of which are best obtained from the study of any manual upon the subject. One class of these derivative rules, however, is sufficiently distinct in respect of the questions to which it may give rise, to deserve special examination. It involves the distinction commonly recognised as that between Direct and Inverse Probability. It is thus introduced by De Morgan:—

“In the preceding chapter we have calculated the chances of an event, knowing the circumstances under which it is to happen or fail. We are now to place ourselves in an inverted position: we know the event, and ask what is the probability which results from the event in favour of any set of circumstances under which the same might have happened.”[4] The distinction might therefore be summarily described as that between finding an effect when we are given the causes, and finding a cause when we are given effects.

On the principles of the science involved in the definition which was discussed and adopted in the earlier chapters of this work, the reader will easily infer that no such distinction as this can be regarded as fundamental. One common feature was traced in all the objects which were to be referred to Probability, and from this feature the possible rules of inference can be immediately derived. All other distinctions are merely those of arrangement or management.

But although the distinction is not by any means fundamental, it is nevertheless true that the practical treatment of such problems as those principally occurring in Inverse Probability, does correspond to a very serious source of ambiguity and perplexity. The arbitrary assumptions which appear in Direct Probability are not by any means serious, but those which invade us in a large proportion of the problems offered by Inverse Probability are both serious and inevitable.

§ 10. This will be best seen by the examination of special examples; as any, however simple, will serve our purpose, let us take the two following:—

(1) A ball is drawn from a bag containing nine black balls and one white: what is the chance of its being the white ball?

(2) A ball is drawn from a bag containing ten balls, and is found to be white; what is the chance of there having been but that one white ball in the bag?

The class of which the first example is a simple instance has been already abundantly discussed. The interpretation of it is as follows: If balls be continually drawn and replaced, the proportion of white ones to the whole number drawn will tend towards the fraction ¹/₁₀. The contemplated action is a single one, but we view it as one of the above series; at least our opinion is formed upon that assumption. We conclude that we are going to take one of a series of events which may appear individually fortuitous, but in which, in the long run, those of a given kind are one-tenth of the whole; this kind (white) is then singled out by anticipation. By stating that its chance is ¹/₁₀, we merely mean to assert this physical fact, together with such other mental facts, emotions, inferences, &c., as may be properly associated with it.

§ 11. Have we to interpret the second example in a different way? Here also we have a single instance, but the nature of the question would seem to decide that the only series to which it can properly be referred is the following:—Balls are continually drawn from different bags each containing ten, and are always found to be white; what is ultimately the proportion of cases in which they will be found to have been taken from bags with only one white ball in them? Now it may be readily shown[5] that time has nothing to do with the question; omitting therefore the consideration of this element, we have for the two series from which our opinions in these two examples respectively are to be formed:—(1) balls of different colours presented to us in a given ultimate ratio; (2) bags with different contents similarly presented. From these data respectively we have to assign their due weight to our anticipations of (1) a white ball; (2) a bag containing but one white ball. So stated the problems would appear to be formally identical.

When, however, we begin the practical work of solving them we perceive a most important distinction. In the first example there is not much that is arbitrary; balls would under such circumstances really come out more or less accurately in the proportion expected. Moreover, in case it should be objected that it is difficult to prove that they will do so, it does not seem an unfair demand to say that the balls are to be ‘well-mixed’ or ‘fairly distributed,’ or to introduce any of the other conditions by which, under the semblance of judging à priori, we take care to secure our prospect of a series of the desired kind. But we cannot say the same in the case of the second example.

§ 12. The line of proof by which it is generally attempted to solve the second example is of this kind;—It is shown that there being one white ball for certain in the bag, the only possible antecedents are of ten kinds, viz. bags, each of which contains ten balls, but in which the white balls range respectively from one to ten in number. This of course imposes limits upon the kind of terms to be found in our series. But we want more than such limitations, we must know the proportions in which these terms are ultimately found to arrange themselves in the series. Now this requires an experience about bags which may not, and indeed in a large proportion of similar cases, cannot, be given to us. If therefore we are to solve the question at all we must make an assumption; let us make the following;—that each of the bags described above occurs equally often,—and see what follows. The bags being drawn from equally often, it does not follow that they will each yield equal numbers of white balls. On the contrary they will, as in the last example, yield them in direct proportion to the number of such balls which they contain. The bag with one white and nine black will yield a white ball once in ten times; that with two white, twice; and so on. The result of this, it will be easily seen, is that in 100 drawings there will be obtained on the average 55 white balls and 45 black. Now with those drawings that do not yield white balls we have, by the question, nothing to do, for that question postulated the drawing of a white ball as an accomplished fact. The series we want is therefore composed of those which do yield white. Now what is the additional attribute which is found in some members, and in some members only, of this series, and which we mentally anticipate? Clearly it is the attribute of having been drawn from a bag which only contained one of these white balls. Of these there is, out of the 55 drawings, but one. Accordingly the required chance is ¹/₅₅. 
That is to say, the white ball will have been drawn from the bag containing only that one white, once in 55 times.
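The counting in this section may be sketched as follows. It rests on the assumption made in the text, that each of the ten kinds of bag occurs equally often, and it takes ten drawings from each bag as a convenient sample; the variable names are illustrative.

```python
from fractions import Fraction

# ASSUMPTION (from the text): bags holding 1, 2, ..., 10 white balls occur equally often.
# In ten drawings, a bag with k white balls out of ten yields k white on the average.
whites_in_ten_draws = {k: k for k in range(1, 11)}

total_draws = 10 * len(whites_in_ten_draws)
total_white = sum(whites_in_ten_draws.values())
total_black = total_draws - total_white

# Of the white drawings, those that came from the bag with a single white ball.
chance_single_white_bag = Fraction(whites_in_ten_draws[1], total_white)
print(total_draws, total_white, total_black, chance_single_white_bag)  # 100 55 45 1/55
```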

§ 13. Now, with the exception of the assumption introduced above (that each of the bags described occurs equally often), the process here is precisely the same as in the other example; it is somewhat longer only because we are not able to appeal immediately to experience, but are forced to try to deduce what the result will be, though the validity of this deduction itself rests, of course, ultimately upon experience. But that assumption is a very important one. It is scarcely necessary to point out how arbitrary it is.

For is the supposition, that the different specified kinds of bags are equally likely, the most reasonable supposition under the circumstances in question? One man may think it is, another may take a contrary view. In fact in an excellent manual[6] upon the subject a totally different supposition is made, at any rate in one example; it is taken for granted in that instance, not that every possible number of black and white balls respectively is equally likely, but that every possible way of getting each number is equally likely, whence it follows that bags with an intermediate number of black and white balls are far more likely than those with an extreme number of either. On this supposition five black and five white being obtainable in 252 ways against the ten ways of obtaining one white and nine black, it follows that the chance that we have drawn from a bag of the latter description is much less than on the hypothesis first made. The chance, in fact, becomes now ¹/₅₁₂ instead of ¹/₅₅. In the one case each distinct result is considered equally likely, in the other every distinct way of getting each result.
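The two suppositions may be compared side by side in the sketch below. The function `chance_single_white` and the names of the two weightings are hypothetical; the second weighting counts the ways of obtaining each number of white balls, as in the manual referred to.

```python
from fractions import Fraction
from math import comb

def chance_single_white(weights):
    """weights[k] is the relative frequency ASSUMED for bags with k white balls in ten.
    Returns the chance that a white ball drawn came from the bag with one white."""
    white_from = {k: Fraction(w) * Fraction(k, 10) for k, w in weights.items()}
    return white_from[1] / sum(white_from.values())

each_result_equal = {k: 1 for k in range(1, 11)}          # every number equally likely
each_way_equal = {k: comb(10, k) for k in range(1, 11)}   # every way equally likely

print(comb(10, 5), comb(10, 1))                  # 252 10
print(chance_single_white(each_result_equal))    # 1/55
print(chance_single_white(each_way_equal))       # 1/512
```

The same drawing thus yields very different answers according to the weighting assumed, and the data of the problem prescribe neither.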

§ 14. Uncertainties of this kind are peculiarly likely to arise in these inverse probabilities, because when we are merely given an effect and told to look out for the chance of some assigned cause, we are often given no clue as to the relative prevalence of these causes, but are left to determine them on general principles. Give us either their actual prevalence in statistics, or the conditions by which such prevalence is brought about, and we know what to do; but without the help of such data we are reduced to guessing. In the above example, if we had been told how the bag had been originally filled, that is by what process, or under what circumstances, we should have known what to do. If it had been filled at random from a box containing equal numbers of black and white balls, the supposition in Mr Whitworth's example is the most reasonable; but in the absence of any such information as this we are entirely in the dark, and the supposition made in § 12 is neither more nor less trustworthy and reasonable than many others, though it doubtless possesses the merit of superior simplicity.[7] If the reader will recur to Ch. V. §§ 4, 5, he will find this particular difficulty fully explained. Everybody practically admits that a certain characteristic arrangement or distribution has to be introduced at some prior stage; and that, as soon as this stage has been selected, there are no further theoretic difficulties to be encountered. But when we come to decide, in examples of the class in question, at what stage it is most reasonable to make our postulate, we are often left without any very definite or rational guidance.

§ 15. When, however, we take what may be called, by comparison with the above purely artificial examples, instances presented by nature, much of this uncertainty will disappear, and then all real distinction between direct and inverse probability will often vanish. In such cases the causes are mostly determined by tolerably definite rules, instead of being a mere cloud-land of capricious guesses. We may either find their relative frequency of occurrence by reference to tables, or may be able to infer it by examination of the circumstances under which they are brought about. Almost any simple example would then serve to illustrate the fact that under such circumstances the distinction between direct and inverse probability disappears altogether, or merely resolves itself into one of time, which, as will be more fully shown in a future chapter, is entirely foreign to our subject.

It is not of course intended to imply that difficulties similar to those mentioned above do not occasionally invade us here also. As already mentioned, they are, if not inherent in the subject, at any rate almost unavoidable in comparison with the simpler and more direct procedure of determining what is likely to follow from assigned conditions. What is meant is that so long as we confine ourselves within the comparatively regular and uniform field of natural sequences and co-existences, statistics of causes may be just as readily available as those of effects. There will not be much more that is arbitrary in the one than in the other. But of course this security is lost when, as will be almost immediately noticed, what may be called metaphysical rather than natural causes are introduced into the enquiry.

For instance, it is known that in London about 20 people per thousand die each year. Suppose it also known that of every 100 deaths there are about 4 attributable to bronchitis. The chance that any unknown person dies of bronchitis in a given year is therefore 20/1000 × 4/100, or 1 in 1250, so that the odds against it are 1249 to 1. Exactly the same statistics are available to solve the inverse problem: A man is dead, what is the chance that he died of bronchitis? Here, since the man's death is taken for granted, we do not require to know the general average mortality. All that we want is the proportional mortality from the disease in question as given above. If Probability dealt only with inferences founded in this way upon actual statistics, and these tolerably extensive, it is scarcely likely that any distinction such as this between direct and inverse problems would ever have been drawn.
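The arithmetic of the bronchitis example may be checked with a short sketch. It assumes only the two figures quoted in the text (20 deaths per thousand, 4 deaths in 100 from bronchitis); the function and variable names are illustrative and are not drawn from the source.

```python
from fractions import Fraction


def odds_against(p):
    """Return (a, b) such that the odds against an event of probability p are a to b."""
    ratio = (1 - p) / p  # exact Fraction, reduced to lowest terms
    return ratio.numerator, ratio.denominator


# Figures quoted in the text.
death_rate = Fraction(20, 1000)       # annual deaths per head in London
bronchitis_share = Fraction(4, 100)   # share of deaths attributable to bronchitis

# Direct problem: an unknown person dies of bronchitis within the year.
p_direct = death_rate * bronchitis_share

# Inverse problem: the death is given, so the general mortality drops out.
p_inverse = bronchitis_share

print(p_direct, odds_against(p_direct))    # 1/1250 (1249, 1)
print(p_inverse, odds_against(p_inverse))  # 1/25 (24, 1)
```

Both answers are read off the same two statistics; the inverse problem merely leaves one of them unused.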

§ 16. Considered therefore as a contribution to the theory of the subject, the distinction between Direct and Inverse Probability must be abandoned. When the appropriate statistics are at hand the two classes of problems become identical in method of treatment, and when they are not we have no more right to extract a solution in one case than in the other. The discussion however may serve to direct renewed attention to another and far more important distinction. It will remind us that there is one class of examples to which the calculus of Probability is rightfully applied, because statistical data are all we have to judge by; whereas there are other examples in regard to which, if we will insist upon making use of these rules, we may either be deliberately abandoning the opportunity of getting far more trustworthy information by other means, or we may be obtaining solutions about matters on which the human intellect has no right to any definite quantitative opinion.

§ 17. The nearest approach to any practical justification of such judgments that I remember to have seen is afforded by cases of which the following example is a specimen: “Of 10 cases treated by Lister's method, 7 did well and 3 suffered from blood-poisoning: of 14 treated with ordinary dressings, 9 did well and 5 had blood-poisoning; what are the odds that the success of Lister's method was due to chance?”[8] Or, to put it into other words, a short experience has shown an actual superiority in one method over the other: what are the chances that an indefinitely long experience, under similar conditions, will confirm this superiority?
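One way in which the quoted question might be given a definite numerical form is sketched here. It rests on an assumption that is not in the source: that the 16 favourable results among the 24 cases would have occurred whatever the dressing, and that the 10 cases given Lister's method were in effect drawn at random from the 24. On that supposition the chance of 7 or more favourable results falling to Lister's method is a hypergeometric tail. This is not put forward as the method of the writer quoted, and the function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def chance_of_at_least(successes_in_group, group_size, total_successes, total_cases):
    """Chance that a random group of `group_size` cases contains at least
    `successes_in_group` of the `total_successes` favourable cases (assumed model)."""
    total_failures = total_cases - total_successes
    most_possible = min(group_size, total_successes)
    favourable_ways = sum(
        comb(total_successes, k) * comb(total_failures, group_size - k)
        for k in range(successes_in_group, most_possible + 1)
    )
    return Fraction(favourable_ways, comb(total_cases, group_size))


# Figures from the quoted example: Lister 7 of 10 did well, ordinary dressings 9 of 14.
tail = chance_of_at_least(7, 10, 7 + 9, 10 + 14)
print(float(tail))  # roughly 0.56
```

On the assumed model a division at least as favourable to Lister's method would arise by chance rather more than half the time. The figure answers only the question as thus restated; it says nothing directly about what an indefinitely long experience would show.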