DEPARTMENT OF APPLIED MATHEMATICS, UNIVERSITY COLLEGE, UNIVERSITY OF LONDON

DRAPERS' COMPANY RESEARCH MEMOIRS

STUDIES IN NATIONAL DETERIORATION. II.

A FIRST STUDY OF THE STATISTICS OF PULMONARY TUBERCULOSIS

BY KARL PEARSON, F.R.S.

BIOMETRIC LABORATORY, UNIVERSITY COLLEGE, LONDON

WITH ONE DIAGRAM IN TEXT

LONDON: PUBLISHED BY DULAU AND CO., 37 SOHO SQUARE, W. 1907

"The germ has, perhaps, been too much with us, and the paramount importance of soil has been absurdly underrated." (Sir William Collins.)

I. Introductory.

The most satisfactory method of studying the influence of heredity on the occurrence of special diseases undoubtedly would be to obtain a very large random sample of family histories of the general population. In this way each special disease would be represented in its due proportions and we could test by the usual statistical methods the prevalency of special types of disease in particular stocks. Unfortunately the difficulty of this method is great. I have now been two years collecting family histories and at present have only reached between 200 and 300 fairly complete histories, where 2,000 or 3,000 at least are necessary. It is extremely hard to get co-operators who are willing, or if willing, able to give perfectly frank and full family details. Still the collection goes slowly forward, and one must hope that some day it may serve the purpose for which it was designed.* Enough, however, of these histories have now been collected to convince me that heredity plays a large part in the effective sources of tuberculous disease. The discovery of the possibility of phthisical infection has led, I think, to underestimation of the hereditary factor.
Probably few individuals who lead a moderately active life can escape an almost daily risk of infection under urban conditions; but in the great bulk of cases, a predisposition, a phthisical diathesis, must exist, to render the risk a really great one. In this sense it is probably legitimate to speak of the inheritance of tuberculosis and even of the inheritance of zymotic diseases, meaning thereby the inheritance of a constitutional condition favourable to the development of such diseases should a risk be run, which cannot in the ordinary course of life be wholly avoided.† The recognised importance of modern views as to the nature of phthisis would be dangerous if, throwing all weight on environment, they lead to the disregard of the inheritance of the diathesis as being in the bulk of cases an essential preliminary condition.

* I should be only too glad of further offers of aid in the preparation of family records.

† Again, without prejudice to fuller later investigation, my family histories, I think, show a general scattering over all stocks of zymotic diseases as a source of death, but in addition heavy incidence of these diseases in a relatively few stocks, and this not only in members living under the same environment, but among collaterals and distant ascendants.

Approaching pathological inheritance from the modern statistical standpoint it is almost heartrending to notice the great amount of effort and energy wasted in the collection of data bearing on the inheritance of disease. We want to know whether the tendency to disease may be assumed to be inherited at the same rate as other physical and mental characters in the organism. If this can be shown to be reasonable, the hereditary factor can be at once assigned its due influence and position. But the data by which this preliminary inquiry can be satisfactorily answered are wholly wanting.
We are told, for example, that a "family history" of tuberculosis or of cancer exists in such and such a percentage of cases, and this information is extended even to the differences of such percentages for cancer in various parts of the organism. But in 99 cases out of the 100, while a brother or a sister, an uncle or an aunt will be recorded as having cancer, no record whatever is made of the total number of brothers and sisters, or of uncles and aunts of the sufferer. No record of family history is of the least value unless the absolute number of collaterals, and their ages living or at death are taken. The like remark applies to tubercle; the same statement "no family history" is often recorded in the two cases when the mother died in childbed at twenty-five, and when she lived to a post-tuberculous age, or when no aunt or uncle existed and when none out of thirteen suffered from the disease. Luckily there has recently been an awakening in the medical profession with regard to this matter. Starting with complete family pedigrees in the case of rare diseases, certain physicians of distinction have introduced the custom of taking a complete family history; careful inquiries are made with regard to every ascendant and collateral that is known to have existed, and in each case it is clearly stated whether the individual was normal, abnormal or no information available. Such pedigrees often mean a very large expenditure of labour, but their ultimate value to medical science, and especially to national eugenics, is incontestable. From special and rare diseases the formation of such pedigrees is spreading and must spread to the discussion of inheritance in cases of carcinoma, pulmonary tuberculosis and the various forms of insanity.
Quite recently the Biometric Laboratory at University College has had placed at its disposal series (in each case amounting to several hundred) of such family histories in the particular cases of pulmonary tuberculosis, insanity and mental deficiency. The present paper deals with a first series of this kind embracing 384 stocks in which cases of pulmonary tuberculosis have occurred.*

* One case had to be omitted in some part of this inquiry owing to absence of information.

II. Material.

The material was most kindly sent to my laboratory by Dr. W. C. Rivers, of the Crossley Sanatorium, Frodsham. The records were made by the medical men of that institution and apply in both subject and family history to cases of pulmonary tuberculosis only. The consumptives were almost wholly of the lower middle and working classes, and from Manchester (the great majority), Liverpool and their environs during 1905-6. The total number of brothers and sisters is recorded, the position of the subject in the family, and the age of the subject at probable initial onset of the disease. The number of brothers and sisters affected by the disease is stated, but neither their positions in the family nor their ages at onset are given. The family history is further extended to parents and grandparents and a statement is made of the paternal and maternal aunts and uncles known to have suffered from pulmonary tuberculosis. The total number of such collaterals is not, however, recorded. The great value of the material is obvious, although we may hope that future records may be amplified in one or two directions.
Notably it would be desirable to enter (1) the positions in the family of affected brothers and sisters, and the ages of all brothers and sisters at death or if living at time of record, (2) the total number of uncles and aunts and their ages at death or if living at time of record, (3) a special entry distinguishing between "no information" and "no disease," and (4) a record of cases of tubercle other than pulmonary, and of cancer cases in the stock.* Naturally this is the statistician's view, which will of necessity be controlled by the actual difficulties of taking the record. The present series gives much more than I have been able so far to find elsewhere and appears to be of high value.

* On the correlation between the existence of cancer and tubercle, see K. Pearson, "Report on Certain Cancer Statistics," Archives of the Middlesex Hospital, vol. ii., pp. 127-137, 1904.

Two points must be first noted as they are essential to the right estimation of the results to be given later. First one is struck at once by the far larger number of cases of phthisis in the family histories of the women. This is associated with a fact that I have previously remarked in cancer family histories, and that has recurred in my general family pedigrees. The women know much more of the family history than the men do, and further they are much more ready to inquire about it, and to discuss it. The relative absence of tuberculous relatives in the cases of the males is not really a sexual difference; it results from no distinction being placed on the record between unaffected individuals and individuals about whom nothing is known. My own experience is that the women, especially in the lower middle and working classes, take a greater interest in and know much more than the men of the family history. They are also, if approached in the right way, more willing to reveal it, and not infrequently rather proud of a markedly pathological stock.

The second point to be borne in mind is that pulmonary tuberculosis is a disease the frequency of which culminates between twenty and thirty. Hence many subjects, if members of large families, have brothers and sisters who have not yet passed, perhaps even not yet reached the danger zone. Hence the number of brothers and sisters of a patient who also suffer from the disease is a minimum limit and not a real measure of the extent of the disease in the family. This is a most important point for it acts in another way on our inheritance tables and exaggerates the number of non-tuberculous * children of a tuberculous parent. The true number of non-tuberculous children to a tuberculous parent can only be reached when the family history is completed; for this reason it would be better to take the ascendants; but if we take cases of tuberculosis in the grandparental generation we find the total number of aunts and uncles unrecorded, and further, we have made a selection of the tuberculous individuals in the grandparental generation, namely, those in whose stocks the tuberculous diathesis was sufficiently strong for at least one grandchild to manifest tubercle. Here again we are forced to the conclusion that a random sample of the pedigrees of the general population is what is really required; but the thousands of such pedigrees needful to obtain a sufficient subsample of tuberculous stocks would alarm all but the youngest and bravest collector. We must therefore content ourselves with the fact that we have only got a lower limit to the number of tuberculous children of tuberculous parents, and endeavour to test generally the influence of increase of such number. We cannot hope accordingly to do much more at present than measure a lower limit to the intensity of hereditary influence in pulmonary tuberculosis.

III. Preliminary Investigations.

I first turn to the age distribution.
The 167 male and 216 female cases are distributed according to age of onset as follows:

Age: 5-7, 8-10, 11-13, 14-16, 17-19, 20-22, 23-25, 26-28, 29-31, 32-34, 35-37, 38-40, 41-43, 44-46, 47-49, 50-52, 53-55, 56-58, 59-61, 62-64. Totals.
Men, Women: 2 3 2 3 9 21 13 25 26 33 21 43 19 24 15 17 14 10 16 13 13 7 4 6 5 4 6 2 3 1 1 1 - 1. Totals: 167, 216.

Mean age of onset: Men, 29.1; Women, 25.3.
Standard deviation: Men, 9.8; Women, 8.6.

Thus we see that the average age of onset is lower in the women and more concentrated. The curves, as most age disease curves, are markedly skew, but the data are not numerous enough to justify their complete analytical treatment. The modal values of the onset are probably about twenty-four and twenty-one in the two cases, probably a little bit closer than the average values.†

* "Tuberculous" will be used throughout the remainder of this paper for those affected with pulmonary tuberculosis.

† The mean age of onset in cancer is earlier in women than men and again slightly more concentrated ("Report on Certain Cancer Statistics," loc. cit., p. 128).

An indirect method of measuring the effect of a phthisical environment now arises. If infection really plays a large part in the liability to phthisis, and common life, with parents or brethren suffering from phthisis, explains the apparent running of the disease in stocks, we ought, I think, to find subjects belonging to families in which one or more members (parents or brethren) were affected by phthisis attacked at an earlier age than cases in which there are no such members. Actually I reach the following results:

MEAN AGE OF ONSET.

Sex.       All Cases.     Cases of Immediate Family History.
Males      29.1 ± .5      27.7 ± .9
Females    25.3 ± .4      25.7 ± .7

The results considering the numbers dealt with cannot be considered conclusive.
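The ± figures in the table of mean ages can be checked with a short sketch. It assumes, as an interpretation not stated in the text, that they are probable errors of the mean (0.6745 times the standard deviation divided by the square root of the number of cases), and that the difference of two means may be compared with the probable error of that difference as if the two groups were independent, which is only roughly true here because the family-history cases form part of all cases. The function names are illustrative only.

```python
import math

# Assumption: the "±" values are probable errors, i.e. 0.6745 standard errors.
PROBABLE_ERROR_FACTOR = 0.6745


def probable_error_of_mean(standard_deviation, cases):
    """Probable error of a mean from the standard deviation and number of cases."""
    return PROBABLE_ERROR_FACTOR * standard_deviation / math.sqrt(cases)


def difference_in_probable_errors(mean_a, pe_a, mean_b, pe_b):
    """Difference of two means expressed in units of its own probable error.

    Treats the two groups as independent, which is only an approximation here.
    """
    return (mean_a - mean_b) / math.hypot(pe_a, pe_b)


# Standard deviations and totals from the age distribution.
male_pe = probable_error_of_mean(9.8, 167)
female_pe = probable_error_of_mean(8.6, 216)

# All cases against cases of immediate family history.
male_ratio = difference_in_probable_errors(29.1, 0.5, 27.7, 0.9)
female_ratio = difference_in_probable_errors(25.3, 0.4, 25.7, 0.7)

print(round(male_pe, 1), round(female_pe, 1))
print(round(male_ratio, 2), round(female_ratio, 2))
```

On these assumptions both differences are less than twice their probable errors, so neither difference stands out from its sampling error.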
At any rate no significant difference in the average age of onset can be deduced from these figures.*

Again another noteworthy point in the present figures is the comparatively few cases of both parents being tubercular. Out of 383 cases of known family history the subjects' parents were only both affected in six instances, one parent being affected in seventy-eight cases. It is difficult to consider that even those six instances are due to infection. Suppose only one-tenth of the population to suffer from tuberculosis and only eight per cent. of those that marry to suffer, then if 100 individuals who will sooner or later develop phthisis married at random, they would be likely in eight cases to have a tuberculous mate. Our statistics show that of eighty-four tuberculous married persons six had a tuberculous mate. There is clearly no need in such cases to appeal to infection from husband or wife to account for the small number of cases in which both parents suffered.

* Taking the 1,000 cases of male and 1,000 cases of female "acquired" phthisis given by Dr. E. B. Thompson in his Family Phthisis, I find Average Age of Onset: Males 29.5, Females 25.9. These are in close agreement with the above results for all cases. On the other hand if we take Dr. Thompson's tables for hereditary phthisis we find:

                                            Males.   Females.
Mother only attacked                        26.0     23.8
Father only attacked                        26.6     24.1
Mother and brothers or sisters attacked     26.6     26.3
Both parents                                24.9     22.6
Brothers and sisters, not parents           29.0     26.1

These results certainly show that the age of onset is less when one or both parents are phthisical, but are we to attribute this lowering to the greater chance of infection? It seems impossible to reconcile a large influence of infection in lowering the age, when we find that the average age at which those with a family history of mothers and brothers and sisters are attacked is greater than the average age at which those are attacked when only the mother is concerned. The fact of brothers and sisters being attacked does not lower the average age of onset at all below that of the "acquired cases". It would seem that parental phthisis lowers the age of onset, but this cannot at present be asserted to be due to infection. Inquiry ought in all cases of family history to distinguish between cases of possible infection and no such possibility. In how many of the above cases did the parents die before the offspring were attacked? A further point is suggestive: the average age of onset of phthisis in males and females coincides very closely with their respective primes in stature and probably with their respective ages of maximum fertility.

IV. Pulmonary Tuberculosis from the Mendelian Standpoint.

Since for cases in which one parent is tuberculous and the other not tuberculous, there may be one or more offspring tuberculous, and one or more offspring who have throughout life no evidence of tubercle at all, it is needful to consider the phthisical diathesis as a recessive character. We shall express therefore an individual who has shown evidence of tuberculosis as (RR), one who has not personally shown evidence of tuberculosis but of whom the offspring have shown it as (DR), while those in whom neither personally nor in the offspring there is any trace of tuberculosis are (DD). Cases in which both parents show the disease must be (RR) x (RR) and on Mendelian theory all the offspring should be tuberculous. Cases in which neither parent shows the disease but it appears in the offspring should be (DR) x (DR). And cases in which one parent shows it and the other does not should be (RR) x (DR).
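The three mating classes just described can be worked through as a simple one-factor cross. The following is a minimal sketch, assuming one allele is drawn at random from each parent; the function names are illustrative and not part of the memoir.

```python
from fractions import Fraction
from itertools import product


def offspring_distribution(parent_a, parent_b):
    """Genotype probabilities for a one-factor cross.

    Each parent is a two-letter string such as "RR" or "DR"; one allele is
    taken from each parent with equal chance.
    """
    distribution = {}
    for allele_a, allele_b in product(parent_a, parent_b):
        genotype = "".join(sorted((allele_a, allele_b)))
        distribution[genotype] = distribution.get(genotype, 0) + Fraction(1, 4)
    return distribution


def percent_recessive(parent_a, parent_b):
    """Percentage of offspring expected to be (RR), i.e. to carry the diathesis."""
    return float(offspring_distribution(parent_a, parent_b).get("RR", 0) * 100)


for mating in (("RR", "RR"), ("RR", "DR"), ("DR", "DR")):
    print(mating, percent_recessive(*mating))
```

The three matings give 100, 50 and 25 per cent. of (RR) offspring respectively.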
Now, of course, the inheritance being one of the diathesis and not of the disease, a parent treated as (DR) may really be (RR), but if some of the (DR) parents be really (RR) the effect of this should be to increase the number of tuberculous children. The expected theoretical results are as follows:

Matings.                      Symbol.        Offspring, per cent.
                                             Tuberculous.   Non-tuberculous.
Both parents tuberculous      (RR) x (RR)    100            -
One parent tuberculous        (RR) x (DR)     50            50
Neither parent tuberculous    (DR) x (DR)     25            75

Matings with (DD) do not concern us, as at least one of the offspring in the families recorded was tuberculous. We find:

                Females.                Males.                  Total.
Matings.        Cases.   T.    N.T.     Cases.   T.    N.T.     Cases.   T.    N.T.
(RR) x (RR)        2      4      4         4      8      5         6     12       9
(RR) x (DR)       52     89    203        26     33     98        78    122     301
(DR) x (DR)      162    205    711       137    167    658       299    372   1,369
Totals           216    298    918       167    208    761       383    506   1,679

(T. = tuberculous offspring; N.T. = non-tuberculous offspring.)

Treating only the totals for both sexes we have:

Matings.       Expected to have Phthisical Diathesis.    Known to have it.
(RR) x (RR)    100 per cent.                             57 per cent.
(RR) x (DR)     50 per cent.                             29 per cent.
(DR) x (DR)     25 per cent.                             21 per cent.

Now this table shows, what we might expect on any theory of heredity, that the percentage of tuberculous offspring rises steadily according as they have no, one or both parents tuberculous; but except in the last case the percentages show no approximation to the expected Mendelian values. Nor is this absence of agreement in itself to be at all insisted on, for it must be remembered that it is the inheritance of the diathesis and not that of the actual disease with which we are concerned.* The numbers of persons in the three cases we should expect on the Mendelian hypothesis to have the phthisical diathesis are 21, 212 and 435 respectively.
* Possibly a better comparison with Mendelian theory can be made by using the statistics of Dr. Thompson (Family Phthisis, p. 45). He gives the data for eighty families with completed family history, not very extensive statistics it is true, but the best available. Dr. Thompson gives no cases of non-parental incidence with details as to number of brothers and sisters affected, i.e., we cannot supply (DR) x (DR). But we have from the sixty-eight available families:

                                        Tuberculous.   Non-tuberculous.
One parent affected (DR) x (RR)         132            141
Both parents affected (RR) x (RR)        43             21

Here the "one parent affected" group has risen from 29 to 48 per cent., a near approach to the Mendelian 50 per cent. But the "both parents affected" class has only risen from 57 to 67 per cent. and is still a long way from the Mendelian 100 per cent. It is remarkable that in these statistics of completed family history, as well as in those cited in the text, the approach to Mendelism in the partial parental affection is considerably greater than in the (RR) x (RR) cases. One important point arises out of the statistics here dealt with. In stocks in which at least one parent is tuberculous, the offspring are tuberculous and non-tuberculous in the ratio of 175 to 162; in other words when the family history is completed at least 50 per cent. of such stocks will be tuberculous.

Thus in the case of both parents being tuberculous 43 per cent. of the offspring with the diathesis escaped infection; in the case of one parent being tuberculous 42 per cent. escaped infection, and in the case of neither parent being tuberculous 14 per cent. escaped infection up to the time of the record.
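The percentages quoted for the three matings follow from the offspring totals by simple arithmetic, which may be sketched as below. The counts are the totals for both sexes; the figure 372 for the tuberculous offspring of (DR) x (DR) matings is taken as the sum of the female and male entries (205 + 167). Rounding to whole numbers and the helper names are assumptions of this sketch.

```python
import math

# Offspring totals for both sexes: (tuberculous, non-tuberculous).
OBSERVED = {
    "(RR) x (RR)": (12, 9),
    "(RR) x (DR)": (122, 301),
    "(DR) x (DR)": (372, 1369),  # 372 = 205 + 167, female plus male entries
}

# Fraction of offspring expected to carry the diathesis on the simple theory.
MENDELIAN_FRACTION = {
    "(RR) x (RR)": 1.0,
    "(RR) x (DR)": 0.5,
    "(DR) x (DR)": 0.25,
}


def round_half_up(value):
    """Round to the nearest whole number, halves going upward."""
    return math.floor(value + 0.5)


def summarise(mating):
    """Return (per cent. known tuberculous, number expected to have the
    diathesis, per cent. of those who have so far escaped the disease)."""
    tuberculous, non_tuberculous = OBSERVED[mating]
    total = tuberculous + non_tuberculous
    known_pct = round_half_up(100 * tuberculous / total)
    expected = round_half_up(MENDELIAN_FRACTION[mating] * total)
    escaped_pct = round_half_up(100 * (expected - tuberculous) / expected)
    return known_pct, expected, escaped_pct


for mating in OBSERVED:
    print(mating, summarise(mating))
```

Each tuple gives the known percentage, the expected number with the diathesis, and the percentage of that number not yet affected.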
Now if these percentages had been approximately equal the divergence between the expected and the known percentages would not be remarkable; they would merely show that a large percentage of those with the diathesis had escaped infection at the time of the record; but the remarkable point is that three times as many escape the disease where one or two parents exist as centres of infection as escape when no such parental centres are present. The average age of onset being for males twenty-nine and females twenty-five it is clear that most parents are either dead or through the danger zone at the time of the record, and in the working-class population of a town like Manchester it may reasonably be doubted if many (RR)'s who have lived to forty or fifty years of age would not have met with infection enough to convert the potentiality into actual pulmonary tuberculosis. A few (RR)'s may have been classified as (DR)'s, but it is unlikely that the difference between 14 and 42 per cent. can be introduced in this way. Thus the group which at first sight gives the closest approach to the Mendelian percentage, namely 21 instead of 25 per cent., is the one which on the whole is most unfavourable to it.

Of course the present discussion is based on the assumptions (1) that the diathesis of pulmonary tuberculosis is a "recessive" character, and (2) that when tuberculous and non-tuberculous stocks cross the latter is dominant in the hybrid. If, however, diverse constitutions be attributed to the hybrids, so that they may sometimes be classed with (RR) and sometimes with (DD), other than the simple Mendelian percentages cited above may be reached. The great value, however, of the simple Mendelian theory, i.e., that it allows of the latency of the recessive character through several generations, is lost when the idea of dominance is suspended.

V. Pure Statistical Theory of Inheritance of Tuberculosis. Parental Heredity.
On the whole it does not seem probable that any simple Mendelian theory will throw light on the inheritance factor in pulmonary tuberculosis. That such an inheritance factor really exists is sufficiently marked by the increasing percentage of tuberculous offspring as we pass from neither parent to one parent and then to two parents affected. If we approach the matter now from the purely statistical standpoint we can arrange our records as follows:

Male Pedigrees.                          Female Pedigrees.
              Parent.                                  Parent.
Offspring.    T.          N.T.           Offspring.    T.          N.T.
T.            49 + y      361            T.            107 + y     509
N.T.          108 - y     x              N.T.          207 - y     x

Here the numbers represent the data actually given by the Crossley Sanatorium records. Each child is entered twice, once with each parent; x represents the number of non-tuberculous children (male or female) of non-tuberculous parents which are clearly not provided by the records, but which would be associated with this amount of tuberculous material had we reached the record of it from a random sample of the general population. To explain the other unknown, y, we note that 108 and 207 are the number of non-tuberculous offspring of the tuberculous parents at the time of taking the record; y is the further number which possess the tuberculous diathesis, and might, or probably would, exhibit pulmonary tuberculosis if the family history were completed. There would of course be a similar corrective factor for the non-tuberculous parents, who might posterior to the date of the record develop tubercle.
But this correction will not, I think, be of much significance, because (1) in the case of the parents their history is in a much larger number of cases completed and (2) the great bulk of parents have at the time of the record passed through the danger zone, i.e., they are forty-five years of age or older.* The general effect of this correction would be to increase the number in the first quadrant, and so intensify the hereditary influence. I shall not, however, make any attempt to allow for it.

* As already indicated there is the possibility that the parents recorded as "non-tuberculous" have died of other diseases before getting wholly across the danger zone. For this reason the age of relatives at death should always form part of the family history.

To determine y we must appeal to the only data that I know to exist, namely, Dr. Thompson's record of families with completed tuberculous history (see footnote, p. 9). This shows us that slightly more than half the offspring of a tuberculous parentage are tuberculous. Accordingly we have for the two cases:

49 + y = 108 - y, or y = 30, say,

and

107 + y = 207 - y, or y = 50.

Thus our two tables may be written in the following form:

MODIFIED TABLES, CORRECTION FOR COMPLETED FAMILY HISTORY.

Male Pedigrees.                                   Female Pedigrees.
              Parent.                                           Parent.
Offspring.    T.     N.T.       Totals.           Offspring.    T.     N.T.       Totals.
T.            79     361        440               T.            157    509        666
N.T.          78     x          78 + x            N.T.          157    x          157 + x
Totals        157    361 + x    518 + x           Totals        314    509 + x    823 + x

It remains to determine what value shall be given to x in order that these tables should represent random samples of the general population. To judge by the death rate, hardly fewer than 10 per cent. of the inhabitants of this country are affected by pulmonary tuberculosis. This fact is occasionally forgotten, when great stress is laid on the occurrence in a family history of one or more cases.
Very few families of average size would escape such incidence if the disease were distributed at random; the evidence for inheritance must therefore lie on a reduced incidence in certain stocks and a marked increase in other stocks. Assuming that our offspring are a normal sample of the population we must have:

440 = (1/10)(518 + x), or x = 3882 for the first case,

and

666 = (1/10)(823 + x), or x = 5837 for the second case.

We can then throw our tables into the following final forms, which represent on the basis of our hypotheses random samples of the general population classed into tuberculous and non-tuberculous groups, after completed family history.

GENERAL POPULATION, RANDOM SAMPLES.

Male Pedigrees.                                 Female Pedigrees.
              Parent.                                         Parent.
Offspring.    T.     N.T.     Totals.           Offspring.    T.     N.T.     Totals.
T.            79     361      440               T.            157    509      666
N.T.          78     3,882    3,960             N.T.          157    5,837    5,994
Totals        157    4,243    4,400             Totals        314    6,346    6,660

The exact hypotheses on which these tables are based must be realised. In the first place these results are deduced from the Crossley Sanatorium records; these naturally give "incomplete" family histories, i.e., they contain only the number of cases of pulmonary tuberculosis at the time when a patient was in the sanatorium. I have assumed that the history as to parents is approximately complete, because the average age of the patient is such that the parent will be usually through the danger zone. The actual Crossley records only show 156 tuberculous offspring out of 481 offspring of affected parents, i.e., about 32 per cent. But this is certainly below the correct value when the history is completed. If we take the data of Dr. Thompson, we find more than 50 per cent. are ultimately affected. The first correction is made on this basis of a final 50 per cent. In the next place an estimate has to be made of the non-tuberculous offspring of non-tuberculous parents.
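The two corrections that lead to the random-sample tables can be sketched as follows. The first moves y offspring of tuberculous parents from the non-tuberculous to the tuberculous class; the second chooses x so that the tuberculous offspring form a stated percentage of the whole table. The function names are illustrative, and taking the table total to the nearest whole number is an assumption of the sketch.

```python
def completed_history_shift(tuberculous, non_tuberculous):
    """The y for which tuberculous + y equals non_tuberculous - y."""
    return (non_tuberculous - tuberculous) / 2


def population_table(t_parent_t, t_parent_nt, nt_parent_t, y, prevalence_percent):
    """Fourfold table as (a, b, c, d).

    a: tuberculous offspring of tuberculous parents (after adding y)
    b: tuberculous offspring of non-tuberculous parents
    c: non-tuberculous offspring of tuberculous parents (after removing y)
    d: non-tuberculous offspring of non-tuberculous parents, i.e. x
    """
    a = t_parent_t + y
    c = t_parent_nt - y
    b = nt_parent_t
    # Size the whole table so that (a + b) is the stated percentage of it.
    total = round((a + b) * 100 / prevalence_percent)
    d = total - (a + b + c)
    return a, b, c, d


print(completed_history_shift(49, 108))    # 29.5, taken as 30 in the text
print(completed_history_shift(107, 207))   # 50.0
print(population_table(49, 108, 361, 30, 10))
print(population_table(107, 207, 509, 50, 10))
```

Passing y = 0, or a different percentage for the general population, shows how far the tables depend on each hypothesis.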
This can only be done by judging by the death rates from pulmonary tuberculosis the number of the population affected; this is certainly more than 8 per cent. and probably less than 13 per cent. I have taken 10 per cent. as a round number to start work with; 1 or 2 per cent. either way will not make a substantial difference in the result. Of course a reduction to 5 per cent. would much intensify the strength of heredity, just as raising to 15 per cent. would weaken the intensity. What we shall obtain from our tables will be an approximate value to the intensity of the heritage.

The tables have been worked out as fourfold tables of normal distribution, the assumption made of course being that the modal frequency of pulmonary tuberculosis is not among the very slight or very severe cases. This is generally in accordance with Dr. Thompson's statistics for degree of acuteness and of haemoptysis,* which, I think, suffice to show no clustering at one end of the range and justify the use of the fourfold table method for approximate results, and will provide at any rate a first approximation, which is all we can venture to hope for at present, to the intensity of tuberculous heredity.

For the male pedigrees, the equation for the correlation coefficient is

1.044264 = r + 1.155432 r^2 + .241046 r^3 - .032869 r^4,

which gives to the nearest second decimal: r = .59.

For the female pedigrees, the equation for the correlation coefficient is

1.091844 = r + 1.072139 r^2 + .192622 r^3 + .024362 r^4,

which gives to the nearest second decimal: r = .62.

Thus while the male pedigrees give a slightly lower intensity of inheritance than the female, both practically give .6 for the value of the coefficient of parental heredity.

Before drawing any conclusion, I will endeavour to ascertain the effect of modifying the values of x and y. Clearly a lower limit to the intensity would be found by putting y = 0.
This gives:

Male Pedigrees.                                 Female Pedigrees.
              Parent.                                         Parent.
Offspring.    T.     N.T.     Totals.           Offspring.    T.     N.T.     Totals.
T.            49     361      410               T.            107    509      616
N.T.          108    3,582    3,690             N.T.          207    5,337    5,544
Totals        157    3,943    4,100             Totals        314    5,846    6,160

* Loc. cit., pp. 52, 53, 62, 63. He gives the following data:

                              Acquired Phthisis.                   Hereditary Phthisis.
                              Males (1,000).   Females (1,000).    Males (1,000).   Females (1,000).
Degree:       Acute           288              425                 320              445
              Sub-acute       441              326                 382              296
              Chronic         271              249                 298              259
Haemoptysis:  Copious         180              279                 170              272
              Moderate        144              146                 116              171
              Slight          344              306                 368              285
              Nil             332              269                 346              272

For the male pedigrees we find

.464608 = r + 1.134724 r^2 + .228679 r^3 - .017452 r^4,

whence the correlation coefficient = .33, to two decimal places.

For the female pedigrees we have:

.667718 = r + 1.047991 r^2 + .179317 r^3 + .038549 r^4,

whence the correlation coefficient = .44.

The average of these values = .385. This value is itself higher than has been found for the inheritance of cephalic index from mother to offspring, or in some shorter series for stature from parent to child. But it is somewhat smaller than the values obtained for long series of stature or pigmentation in man. The male pedigree results show the reduced correlation to be expected from less complete records. We see accordingly that parental inheritance lies between .4 and .6, approaching the upper limit, if we suppose the family histories when completed to give as great a tuberculous percentage as that observed by Dr. Thompson.

Lastly, let us ask what effect it would have on the intensity of parental influence if we supposed 13 per cent. and not 10 per cent. of the community to suffer from tuberculosis. Our tables now become:

Male Pedigrees.                                 Female Pedigrees.
              Parent.                                         Parent.
Offspring.    T.     N.T.     Totals.           Offspring.    T.     N.T.     Totals.
T.            79     361      440               T.            157    509      666
N.T.          78     2,867    2,945             N.T.          157    4,300    4,457
Totals        157    3,228    3,385             Totals        314    4,809    5,123

From these tables we find:

Equation from male pedigrees:

.842527 = r + .946738 r^2 + .081783 r^3 + .023793 r^4, giving r = .55.

Equation from female pedigrees:

.885075 = r + .869594 r^2 + .061995 r^3 + .077276 r^4, giving r = .58 to nearest second place of decimals.

It will thus be seen that an increase of 3 per cent. in the amount of tuberculosis in the general population from which the Crossley Sanatorium patients are drawn would lower the correlation coefficients by .04. An increase in the percentage of tuberculous offspring of tuberculous parents from the one in three of the incomplete family records to the one in two of Dr. Thompson's complete family records * raises the correlation from about .4 to .6. We may, I think, accordingly conclude with safety that the intensity of the inheritance factor in pulmonary tuberculosis is greater than .4 and less than .6.

* This is the increase which is essential also on the Mendelian hypothesis, see p. 8.

It is not a dogmatic step from this result to assert that the tubercular diathesis is inherited at the same rate as I have found physical characters in man are inherited, namely, somewhere about .46 to .50. What we need to supplement the present investigation are more extensive records for the "completed" family history of tuberculous stocks. We may note here an interesting point: if we take the 50 per cent. of tuberculous offspring demanded by the Mendelian theory in the case of tuberculous parentage,* we can hardly reach a correlation less than .5. Such a value is inconsistent with the one-third which a simple Mendelian theory demands.
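The equations in r given for the fourfold tables have the form of the series used for a fourfold table of normal distribution, carried to the fourth power of r. The sketch below rebuilds such an equation from the four cell counts and solves it by bisection. It assumes that the coefficients are hk/2, (h^2 - 1)(k^2 - 1)/6 and h(h^2 - 3)k(k^2 - 3)/24, where h and k are the normal deviates cutting off the tuberculous proportions of parents and offspring; this agrees closely with the printed coefficients but is a reconstruction, not a statement of the author's working. The function name and the bisection solver are illustrative choices.

```python
from statistics import NormalDist


def fourfold_series_r(a, b, c, d, iterations=60):
    """Correlation from a fourfold table by the series truncated at r^4.

    a: both tuberculous          b: offspring tuberculous, parent not
    c: parent tuberculous only   d: neither tuberculous
    Assumes a positive association, so that the root lies between 0 and 1.
    """
    n = a + b + c + d
    normal = NormalDist()
    h = normal.inv_cdf(1 - (a + c) / n)   # deviate for tuberculous parents
    k = normal.inv_cdf(1 - (a + b) / n)   # deviate for tuberculous offspring
    left_side = (a * d - b * c) / (n * n * normal.pdf(h) * normal.pdf(k))
    coefficients = (
        1.0,
        h * k / 2,
        (h * h - 1) * (k * k - 1) / 6,
        h * (h * h - 3) * k * (k * k - 3) / 24,
    )

    def series(r):
        return sum(coef * r ** (power + 1) for power, coef in enumerate(coefficients))

    # Bisection: the series rises steadily with r over 0..1 for these tables.
    low, high = 0.0, 1.0
    for _ in range(iterations):
        middle = (low + high) / 2
        if series(middle) < left_side:
            low = middle
        else:
            high = middle
    return (low + high) / 2


# General population tables, 10 per cent. affected, completed histories.
print(round(fourfold_series_r(79, 361, 78, 3882), 2))    # male pedigrees
print(round(fourfold_series_r(157, 509, 157, 5837), 2))  # female pedigrees
```

Because the sketch recomputes h, k and the left-hand side directly from the cell counts, its results for the other tables need not agree with every printed figure to the last decimal.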
This fact, coupled with the number of non-tuberculous offspring found in cases where both parents are tuberculous, leads me to believe that if pulmonary tuberculosis has a Mendelian inheritance, the principle of dominance does not apply or applies only, to use a Mendelian phrase, "with a complication".

The main result of this investigation is not, however, a question of one or other theory being applicable; it is the all-important genetic fact that the diathesis of pulmonary tuberculosis is undoubtedly inherited and that the intensity of this inheritance is comparable with that found for normal physical characters in man. A theory of infection does not account for the facts, and there is an anti-social disregard for national eugenics in the conduct of medical men who can write to the public press that the marriage or even intermarriage of members of tuberculous stocks is of no social detriment, provided they live with a good supply of fresh air. I am inclined to think that the risks run, especially under urban conditions, are for tuberculosis as for a number of other infectious diseases so great, that the constitution or diathesis means almost everything for the individual whose life cannot be spent in self-protection.

TABLE OF PARENTAL INHERITANCE.

Condition.            Source of Statistics.    Computer and Locus.
Pul. Tuberculosis     Crossley Sanatorium      K. Pearson, this Memoir
Insane Diathesis      Dr. O. Diem's Data       K. Pearson, British Med. Journ., 1905, p. 1176
Hereditary Deafness   Dr. Fay's Data           E. Schuster, Biometrika, vol. iv., p. 466
Insane Diathesis      Dr. Urquhart's Data      D. Heron, Unpublished results

Values as printed (Minimum, Maximum, Probable): .40 .30 † .45 ? .60 ? .62 .65 .50 .54

Character.    Source of Statistics.      Computer and Locus.                              Minimum.  Maximum.  Probable.
Stature       Pearson, Family Records    Lee and Pearson, Biometrika, vol. ii., p. 378      .49       .51       .51
Span          Pearson, Family Records    Lee and Pearson, Biometrika, vol. ii., p. 378      .45       .46       .46
Forearm       Pearson, Family Records    Lee and Pearson, Biometrika, vol. ii., p. 378      .41       .42       .42
Eye Colour    Galton, Family Records     Lee and Pearson, Phil. Trans., 195 A, p. 106       .44       .55       .50

* Really slightly greater as a few cases of both parents tuberculous actually do occur.
† This should be really compared with the .33 and .44 of this memoir as being based on "incomplete" family histories.

The foregoing table reproduces what has been biometrically deduced at present regarding the inheritance of pathological conditions and compares the results with those for certain physical characters. The pathological states dealt with are very diverse but there appears no ground, after comparing the upper and lower parts of the table, for asserting that the pathological conditions are inherited with a less intensity than normal physical characters, or even in broad outlines for supposing that environment influences the one more markedly than the other.

VI. Fraternal Heredity.

As a general confirmation of the above result, that the tuberculous diathesis is inherited, and sensibly at the same rate as normal physical characters in man, we may consider the fraternal resemblance. The matter is not so straightforward as the parental correlation, and as the assumptions made are rather greater I have confined myself to the more reliable female pedigrees. Taking every pair of siblings * from these pedigrees we find:

                      First Sibling.
                      T.      N.T.    Totals.
Second     T.         288    1,214     1,502
Sibling.   N.T.     1,214    4,808     6,022
           Totals   1,502    6,022     7,524

Now this table contains only sibships in which one member at least is tuberculous, and these sibships deal only with "incomplete" family records. We have first to complete the family records and secondly to allow for non-tuberculous sibships, before we have a random sample of the general population. The above is based upon 216 families containing 1,217 offspring of whom 300 or nearly one-fourth were tuberculous. We have seen that if one parent at least be tuberculous, then 50 per cent. of the offspring in completed histories show tuberculosis.
I think we shall probably underestimate the amount of the disease if we suppose that, when family histories are completed, one-third of the offspring of all families having at least one tuberculous member will on the average be tuberculous. I base this estimate on the following considerations: There were 8 offspring with both parents tuberculous, 292 offspring with one parent tuberculous and 916 with neither parent tuberculous. Dr. Thompson's statistics for completed family history show that of the first two classes 292 + 8 = 300 individuals in all would give 300 x 175/337 = 156 tuberculous members. We may suppose the third class to give the Mendelian quarter at least or 229 tuberculous members. Hence the proportion of tuberculous = 385/1217 = .32 or nearly one-third.

* Pair of siblings = a pair of offspring from the same parents.

Accordingly we shall have, on completed family history, actually 406 instead of 300 tuberculous individuals in our total of 1,217 individuals. I have now distributed the 106 additional tuberculous individuals among the 216 families, so as to give as nearly as possible the ratio of one tuberculous to every three members. This process is of course only approximate, but there results the following table, which represents, as far as we can reach it without actual observation, the number of pairs of each class of siblings in 216 tuberculous families with completed family history.

                      First Sibling.
                      T.      N.T.    Totals.
Second     T.         441    1,302     1,743
Sibling.   N.T.     1,302    4,479     5,781
           Totals   1,743    5,781     7,524

This table represents the sibships of 406 tuberculous individuals, but such a number of individuals in the community would correspond to a total of 4,466 individuals, supposing 1 in 11 of the community tuberculous,* or to 4,060 non-tuberculous individuals. Of these 812 are already accounted for as siblings to the tuberculous; we have therefore 3,248 individuals without tainted stock.
But if, as we shall see, tuberculous stocks are at least as fertile as the non-tuberculous, we have the simple rule of three sum: If 1,217 individuals give rise to 7,524 pairs of siblings, how many will 3,248 provide? The answer is 20,080.

We must accordingly add into the above table 20,080 pairs arising from non-tuberculous families to reach a random sample of the general population. We obtain the following table which leads to the equation:

$$0.779940 = r + 1.168800\,r^2 + 0.298195\,r^3 + 0.042736\,r^4$$

and this gives fraternal correlation = .48.

FINAL TABLE FOR FRATERNAL CORRELATION, MODIFIED FOR COMPLETION OF FAMILY HISTORY AND BY INCLUSION OF UNTAINTED STOCKS.

                      First Sibling.
                      T.      N.T.    Totals.
Second     T.         441    1,302     1,743
Sibling.   N.T.     1,302   24,559    25,861
           Totals   1,743   25,861    27,604

* We have seen that the percentage lies between 8 and 13.

This value is very close to what has been already determined for the intensity of fraternal inheritance in man. It must be remembered that here we have mixed siblings of both sexes. The correlation between brothers and sisters for stature, span, cubit and eye colour gives a value slightly under .5 for adults.* Mr. E. Schuster, dealing with deaf mutes, obtains a value about .7 and nearly as high a value has been obtained for pigmentation in horses; but the bulk of values † are lower than this and cluster not far from .5 to .55.

The three assumptions made in the deduction are (1) the final value of the tuberculous contingent in families with neither parent tuberculous; (2) the distribution among the families of this "later than record" contingent, and (3) the total percentage of those who suffer at any time from pulmonary tuberculosis in the community. We can appreciate the effect of modifying (1) by simply supposing the table on p. 16, which represents the tuberculous contingent at the time of record, to represent the final state of affairs; we shall then get a minimum limit to fraternal correlation.
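The passage from the final fourfold table to the equation in r can be retraced numerically. The sketch below is only an illustration under stated assumptions, not the memoir's own working: it assumes the usual tetrachoric expansion, (ad - bc)/(N^2 H K) = r + (hk/2) r^2 + (h^2 - 1)(k^2 - 1) r^3/6 + hk(h^2 - 3)(k^2 - 3) r^4/24, cut off at the fourth power, where h and k are the normal deviates of the marginal divisions and H, K the corresponding ordinates. The function names `tetrachoric_series` and `solve_r` are hypothetical.

```python
from statistics import NormalDist


def tetrachoric_series(a, b, c, d):
    """Left-hand side and coefficients of r, r^2, r^3, r^4 for the fourfold
    table [[a, b], [c, d]], assuming the tetrachoric expansion cut at r^4."""
    n = a + b + c + d
    norm = NormalDist()
    # Normal deviates that cut off the "T" proportion in each margin.
    h = norm.inv_cdf(1 - (a + b) / n)
    k = norm.inv_cdf(1 - (a + c) / n)
    lhs = (a * d - b * c) / (n * n * norm.pdf(h) * norm.pdf(k))
    coeffs = [
        1.0,
        h * k / 2,
        (h * h - 1) * (k * k - 1) / 6,
        h * k * (h * h - 3) * (k * k - 3) / 24,
    ]
    return lhs, coeffs


def solve_r(lhs, coeffs, lo=0.0, hi=1.0):
    """Bisection for the root of sum(c_i * r^i) = lhs between lo and hi."""
    def f(r):
        return sum(c * r ** (i + 1) for i, c in enumerate(coeffs)) - lhs

    for _ in range(60):
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


# Rule of three: sibling pairs supplied by 3,248 individuals of untainted stock.
extra_pairs = round(7524 * 3248 / 1217)

# Completed-history table, with the extra pairs added to the N.T./N.T. cell.
a, b, c, d = 441, 1302, 1302, 4479 + extra_pairs
lhs, coeffs = tetrachoric_series(a, b, c, d)
r = solve_r(lhs, coeffs)
print(extra_pairs, d, round(lhs, 4), [round(x, 4) for x in coeffs], round(r, 2))
```

Under these assumptions the computed left-hand side and coefficients agree with the printed equation to about three decimal places, and the root rounds to .48. Bisection is used only because the left side increases steadily with r between 0 and 1; any root-finder would serve.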
Repeating the same argument as before we have 300 tuberculous and 917 non-tuberculous. In the general community, if we take 1 in 11 tuberculous to give round numbers, 300 tuberculous correspond to 3,000 non-tuberculous, or to an addition to our table of 3,000 - 917 = 2,083 non-tuberculous individuals from non-tuberculous sibships. But these 2,083 would give 7,524/1,217 x 2,083 = 12,878 sibling pairs. Thus our table becomes:

                      First Sibling.
                      T.      N.T.    Totals.
Second     T.         288    1,214     1,502
Sibling.   N.T.     1,214   17,686    18,900
           Totals   1,502   18,900    20,402

This gives:

$$0.446483 = r + 1.050312\,r^2 + 0.201895\,r^3 + 0.070797\,r^4,$$

leading to r = .33.

It is safe to say that the fraternal correlation is considerably above one-third, because 1 in 4 is undoubtedly too small a tuberculous contingent for completed records.

* Huxley Lecture. Biometrika, vol. iii., p. 140.
† See table in "Inheritance of Coat Colour in Shorthorns," Biometrika, vol. iv., p. 454.

Again as to (3), if we go to the opposite extreme and make 1 in 8 of the community tuberculous, or say 13 per cent., we find that the correlation is only reduced to about .43.* Thus, although our assumptions appear large, a very considerable latitude in their numerical application does not widely modify the correlation. I think we may safely assert that there is nothing in the degree of fraternal resemblance to oppose the result reached from the somewhat more certain parental data, and this result is that:

The tuberculous diathesis is inherited in the same way and with the same intensity as the physical characters are inherited in man.

VII. Fertility of Tuberculous Stocks.

The result reached in the previous section is of such importance from the standpoint of national eugenics, that it is desirable to consider in more detail the nature of the fertility in stocks tainted with pulmonary tuberculosis.
The distribution of the 381 † tuberculous families, which may be practically considered as completed, is as follows:

Size of family        1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  22  Total
Number of families   15  34  43  42  62  59  40  29  22  14   6   6   3   4   1   1    381

Accordingly the mean family contains 5.68 offspring. For the male pedigrees the mean size is 5.80 and for the female pedigrees 5.59. Mr. Schuster finds the mean size of families containing at least one deaf mute to be from American statistics 6.08, and from English statistics for probably completed families 6.19.‡ If we compare such results with those for the general population, we must be cautious about three points: (1) when we say that these families are probably complete, we mean that no sensible further additions are likely to be made after the date of record, because the average age at onset is twenty-five for women and twenty-eight for men; (2) barren marriages by the nature of the case are excluded; and (3) many of these marriages will not necessarily have lasted throughout the whole fertile period. While artizans' marriages usually begin early, they may terminate by the death of one or other of the pair before the end of fecundity.

* The fourfold table is then:

    441    1,302    1,743
  1,302   18,773   20,075
  1,743   20,075   21,818

with the same meanings for the compartments as before.
† Three individuals were not placed in their sibships.
‡ Biometrika, vol. iv., pp. 477 and 482.

I have reduced statistics of the fertility of the normal population available of three kinds: (a) those including sterile marriages; these are incomparable and need not be considered; (b) marriages which in each case have lasted fifteen years at least and where neither husband nor wife was more than thirty-five at the time of marriage.
A first series of such,* 417, chiefly middle class cases, gives: Mean size = 6.40 ± .10
A second series of such, 788, chiefly middle class cases, gives: Mean size = 6.68 ± .07

Next: (c) Marriages which were completed by death or by extending beyond the fertile period, barren marriages being excluded:

First series, 4,390, chiefly middle class cases, gives: Mean size, 4.52.
Second series, 204, middle class cases, gives: Mean size, 4.65.
Third series, 378, middle class cases, gives: Mean size, 4.70.

I add:

Danish,† fertile marriages, professional classes, 15 years at least (1,605 cases): 5.18.
Danish,† all marriages, working classes, 25 years at least (2,934 cases): 5.26.
New South Wales,‡ fertile marriages, all classes, 15 years at least (86,140 cases): Mean size, 7.10.

Now I have found that the exclusion of the barren marriages does not, in most cases, raise the mean size of families more than about .5 of a child. The Danish industrial classes have a mean size of family (gross fertility) of 5.26, and this number is identical with that given by Powys for the artizan classes in New South Wales.§ We may, I think, safely say that the fertile marriages of the artizan class in this country, even if they have lasted fifteen years at least, will not give a greater average gross fertility than six offspring. The educated and professional middle classes give a gross fertility for all completed marriages of under five offspring; only when we take the fairly stringent selection of marriages begun at or before thirty-five years for both husband and wife and lasting at least fifteen years does the average gross fertility rise to 6.5. Now in the tuberculous and deaf mute stocks no selection of marriages lasting fifteen or more years has been made, no selection of age at entering on the marriage has been made, yet we find in both these cases a fertility of 5.7 to 6.2.
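The figure of 5.68 for the tuberculous stocks can be checked directly against the distribution of family sizes given for the 381 families. This is a minimal sketch; the variable names are illustrative only.

```python
# Size of family -> number of families, as tabulated for the 381 families.
family_sizes = {
    1: 15, 2: 34, 3: 43, 4: 42, 5: 62, 6: 59, 7: 40, 8: 29,
    9: 22, 10: 14, 11: 6, 12: 6, 13: 3, 14: 4, 15: 1, 22: 1,
}

families = sum(family_sizes.values())
offspring = sum(size * count for size, count in family_sizes.items())
mean_size = offspring / families

print(families, offspring, round(mean_size, 2))
```

The total of offspring obtained in this way is the same 2,164 that appears later as the number of members classified by order of birth.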
We are forced to conclude that these pathological conditions do not tend to reduce the fertility, but that such stocks appear to be quite as fertile and in all probability are more fertile than normal stocks of the same class in the community at large.

It would thus appear that fewer offspring are not born to stocks tainted with pulmonary tuberculosis. The fact, however, that tuberculosis is a disease of youth and early middle life distinctly lowers the marriage rate of such stocks and thus reduces the total number of offspring born to them. From the Crossley Sanatorium records we find:

Both parents tuberculous:     Mean size of family = 3.50
One parent tuberculous:       Mean size of family = 5.42
Neither parent tuberculous:   Mean size of family = 5.82

* Unpublished material.
† Pearson, Chances of Death, vol. i. Reproductive Selection, pp. 62-102, passim.
‡ Powys, Biometrika, vol. iv., p. 250.
§ Loc. cit., p. 285.

Thus, to use Mendelian terminology, it is the (DR)'s and not the (RR)'s which constitute from the eugenic standpoint the gravest source of danger to the community. Without exogamy the endogamy of the tuberculous would lead to their extinction. The like result is not so marked, I think, in insanity statistics, where the evil so frequently manifests itself after the reproductive period.

VIII. On the Distribution of Pulmonary Tuberculosis in Tuberculous Families.

I now turn to an exceedingly important point, the question whether order of birth has any influence on liability to tuberculosis. This, if any limitation of natural fertility is taking place, is not only of importance from the eugenic standpoint, but clearly must have very considerable bearing on any Mendelian theory. Breeders are apt to assert (it is difficult to say on what definite evidence) that late offspring occasionally differ in a marked manner from earlier offspring even in pigmentation characters.
Now the position in the family of each tuberculous member is given in the present records. If we consider the community as a whole, it will be built up of families in all stages of development. There will be some in which both eldest and youngest siblings have passed through the tuberculous zone, some in which the eldest have and the youngest have not, and some in which the eldest are in it, and the youngest have not reached it. Each one in his lifetime passes through the danger zone, and we might expect, out of the totals that pass through, the same percentage would be attacked, whether they happen to be elder or younger siblings. In other words, if we take the Crossley Sanatorium population at a given date we might expect that as far as position in family is concerned it would be drawn indifferently from all parts of the family.* Taking the 381 families of which the record of birth position is available I find that there was the following distribution of birth position among the 2,164 members:

TUBERCULOUS STOCKS, NUMBERS OF EACH CLASS OF SIBLING.

Siblings' order     1    2    3    4    5    6    7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22
Number of cases   381  366  332  289  247  185  126  86  57  35  21  15   9   6   2   1   1   1   1   1   1   1

If we take the actual 381 tuberculous patients we find they were distributed as follows:

* Thus while a certain number of families exist with young siblings, in which the elder alone are likely to suffer, there are others in which the elder are dead or past the danger zone and in which only the younger are likely to suffer. I have taken out the actual patients and not all the tuberculous siblings of the family.

TUBERCULOUS PATIENTS.
Siblings' order                 1     2     3     4     5     6     7     8     9    10    11    12    13    14   Above 14
Number of cases, observed     113    79    41    52    39    18    18     9     3     3     3     1     1     1      0
Number of cases, calculated  67.1  64.4  58.5  50.9  43.5  32.6  22.2  15.1  10.0   6.2   3.7   2.6   1.6   1.1    1.6

It will be obvious on mere inspection of this table, or of the accompanying graph, that the excess of elder born and defect of younger born is most marked. Testing by the usual process for goodness of fit,* we find χ² = 59.61 and conclude that the probability of such a distribution of elder and younger members of a family occurring by random selection lies between one and two in the ten million trials.

[Diagram: Distribution of Tuberculous Members in Family. Curves: Observed Tuberculous Offspring; Expected Tuberculous Offspring.]
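The calculated row and the test of goodness of fit can be retraced from the two tables of birth order. The sketch below is an illustration under stated assumptions rather than the memoir's own working: the expected number at each birth order is taken as the 381 patients shared out in proportion to the number of siblings of that order; the third sibling count is taken as 332, the value for which the row sums to 2,164; and the tail probability assumes 14 degrees of freedom (15 classes less one). The helper `chi2_tail` is a hypothetical name.

```python
from math import exp, factorial

# Number of siblings at each birth order in the 381 families. The third
# entry is taken as 332 (assumption: the value that makes the row sum to 2,164).
siblings = [381, 366, 332, 289, 247, 185, 126, 86, 57, 35, 21, 15, 9, 6,
            2, 1, 1, 1, 1, 1, 1, 1]

# Observed patients at birth orders 1 to 14, then "above 14".
observed = [113, 79, 41, 52, 39, 18, 18, 9, 3, 3, 3, 1, 1, 1, 0]

# Pool birth orders above 14 into a single class, as in the table.
pooled = siblings[:14] + [sum(siblings[14:])]
total = sum(pooled)
patients = sum(observed)

# Expected patients if position in the family made no difference.
expected = [patients * n / total for n in pooled]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_tail(x, df):
    """Upper-tail probability of chi-square; closed form valid for even df."""
    half = x / 2
    return exp(-half) * sum(half ** i / factorial(i) for i in range(df // 2))


p = chi2_tail(chi2, 14)  # assumption: 15 classes less one
print([round(e, 1) for e in expected])
print(round(chi2, 2), p)
```

With unrounded expected values the statistic comes out slightly above the printed 59.61, and the tail probability falls between one and two in ten million, in agreement with the text.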