TESTING THE KNOWLEDGE OF RIGHT AND WRONG

SIX ARTICLES

HUGH HARTSHORNE
MARK A. MAY
AND OTHERS

Price Seventy-five Cents

THE RELIGIOUS EDUCATION ASSOCIATION
Monograph No. 1
July, 1927

1. General Description of the Tests.
2. Administration of the Tests and Preliminary Statistical Results.
3. The Code Value of Moral Knowledge Scores.
4. Some Probable Sources of Moral Knowledge in Children.
5. The Relation of Standards to Behavior in Individuals.
6. Group Standards and Group Conduct.

Reprinted from Religious Education, Issues of February, April, August, October and December, 1926, and May, 1927

TESTING THE KNOWLEDGE OF RIGHT AND WRONG
HUGH HARTSHORNE AND MARK A. MAY*

FIRST ARTICLE
GENERAL DESCRIPTION OF THE TESTS

The Character Education Inquiry is devoting itself to the problem of how to measure character. For convenience the field of character study in which tests are called for has been divided as follows:

1. Mental content and skills, the so-called intellectual factors.
2. Desires, attitudes, motives, etc., the dynamic factors.
3. Social behavior, the performance factors.
4. Self-control, the relation of all these factors to one another and to social-self-organization.

The first three items are of course abstractions from the unitary process of social experience mentioned in 4. This is the concrete reality we hope to get at, but for practical purposes it has seemed best to approach it in a somewhat piecemeal fashion, much as a doctor examines the composition of the blood, the reflexes, skin color, and so forth, to aid him in making a diagnosis of the condition of the individual as a whole, even while recognizing that blood-count, taken by itself, is a relatively insignificant item.
The series of articles of which this is the first will report the efforts so far made to test item one by means of paper and pencil tests requiring word responses.

The investigators' interest in what words may reveal of moral knowledge is not based on the assumption that knowledge and behavior are highly correlated. One of our problems is to discover what the relation is between behavior and the knowledge of right and wrong. Furthermore, we do not assume that word behavior and a true knowledge of right and wrong are necessarily correlated. It may be that overt action is a far better indication of what a man really knows about right and wrong than his verbal responses are. If this be the case, there remains the very significant problem of the relation between what he says and what he knows on the one side, and the relation between what he says and what he does or would do, on the other.

Words have a social significance that cannot be ignored. The heart of the problem of character lies in the adjustment of persons to one another, and this adjustment is never complete until it has become articulate. Even the extreme behaviorists write books.

It should also be remembered that the fundamental folkways are rather completely reflected in sayings, rules, slogans, definitions, and what not, and are here far more accessible than if studied only as mores. One can find out by word responses whether an individual is aware of certain customs even though his possession of a custom in the form of a habit may not be thus revealed.

* Dr. May and Dr. Hartshorne are the investigators for the Character Education Inquiry which is being conducted by the Institute of Educational Research at Teachers College, Columbia University, in cooperation with the Institute of Social and Religious Research. In this series of articles the writers have described in some detail one section of the work of the Inquiry. They have been asked to be as specific as possible in order that persons not familiar with the procedures used in test building and the application of tests to particular problems may be fully informed concerning the dangers, difficulties, pitfalls and values of statistical methods as applied to the study of one phase of moral behavior.

In the study of moral knowledge through word responses we have tried to keep distinct the power of making discriminations and the subject matter, or experiences in which discriminations are made. The one is a factor in pure intelligence. The other is a matter of experience.

We have no reason to suppose that the capacity to make ethical discriminations is not adequately measured by standard intelligence tests. Nor do we have reason to suppose that the ability to make such discriminations is measured by such tests. That is to say, experience with ethical situations and in making ethical judgments is required in addition to native intelligence.

This may be illustrated by reference to the relations between the ability to do arithmetic problems and general intelligence. From the results of an intelligence test which contains no arithmetic problems it is possible to predict the probable success a person would have in learning arithmetic. But a highly intelligent person who has had no training in fundamental processes in arithmetic would make a poor showing on an arithmetic test.

Conversely, a person possessing the ability to make fine discriminations of any sort, ethical included, must possess the necessary intelligence. A high score on an arithmetic test, that is, is a fair indication of the presence of high intelligence. It is equally true that a high score on an ethical discrimination test is an indication of high intelligence.
A low score, on the other hand, is not necessarily an indication of low intelligence, but may be merely the result of a limited experience in the handling of ethical situations.

What has just been said of the power of discrimination is equally true of any other typical mental process, such as the power of retention and recall of appropriate experience, of the organization and generalization of experience and the application of generalizations to the understanding of new experiences and the solution of new problems, the foresight of the consequences of behavior, the control of an adequate vocabulary, the recognition of what is at stake in any situation.

In planning a set of moral knowledge tests, therefore, it was necessary to keep in mind these two preliminary standards: First, the tests must cover as wide a range of moral experience as possible; second, the tests must require the exercise of as many appropriate mental processes as possible.

Sources of the Material

In order to facilitate the application of these standards, we found it convenient to make a preliminary classification of the kinds of experience that ought theoretically to be included in a complete set of moral knowledge tests. Had there been time, we should have made this classification on the basis of an extended study of the actual behavior of children of all ages and types in all sorts of actual situations. Such a study of children's moral behavior is very much needed. In lieu of such a study, we did the best we could with the knowledge of life and of children we happened to possess. The following constituted our work sheet:

Brief Outline of Certain Mental Contents and Skills Involved in Ethical Behavior

A. Certain tools needed for the intelligent consideration of problems of social adjustment.

1. Adequate social-ethical vocabulary.
2. Adequate control of language: the ability to say the right thing and to understand the more subtle nuances of delicate social adjustment.
3. Assimilation of the fundamental ideas or generalizations in terms of which life is coming increasingly to be understood, such as
   The idea of Sex
   The idea of God
   The idea of Right and Wrong
   The idea of Natural Law
   The idea of Growth
   The idea of Evolution
   The idea of Cooperation
   The idea of Personality
   The idea of Custom
   The idea of Design
   The idea of Legislation
   The idea of Education
   The idea of Work
   The idea of Fun
   The idea of The Machine
   The idea of Self-forgetting Service

B. Particular knowledges and skills needed for making social adjustments.

1. Knowledge of natural law, physical and biological, and the limitations and possibilities of experience.
2. Knowledge of body and mind in general and of oneself in particular: to understand the causes and consequences of certain kinds of behavior in oneself and others, the nature of temptation, reasons for social and legal requirements and desiderata; to control self and growth.
3. Knowledge of race experience in solving problems of social adjustment, as recorded in history, folk lore, fiction, biography, poetry. Particularly, knowledge of motives and purposes and their consequences.
4. Knowledge of how people behave toward one another in all sorts of situations: home, school, church, public meetings, committee meetings, discussion groups, play groups, emergencies, studying, visiting, etc., and the significance of this behavior for the life of the groups concerned.
5. Knowledge of moral principles held by different groups, and their implications and applications in concrete situations.
6. Knowledge of constitutional rights and obligations, legislative enactments and sanctions affecting oneself and one's groups.
7. Knowledge of institutions and other cooperative bodies and movements affecting oneself or needed as instruments of social adjustment, such as the church, the school, the home, the state, the town or city or community or block or neighborhood and its government, community agencies of welfare and safety, such as the police department, fire department, health department, national associations such as the Child Labor Committee and Red Cross, the movie, the playground, the library, the museum, local industries, the jail, the hospital, the court, the clinic. What they do, their history, their value, their address, how to cooperate.
8. Knowledge of how the work of the world is carried on in mining, agriculture, industry, commerce, finance, transportation, communication, the trades and professions; mechanical and social aspects.
9. Knowledge of contemporary peoples, races, nations, their contacts, conflicting interests, efforts toward peaceful settlement of disputes and world organization, effects of war and armament, historical and current utopias.
10. Knowledge of the trend of evolution, theories of the universe and the place of man in the universe.
11. Knowledge of how men have experienced God in connection with nature and in the control and development of self and society. Prayer and reflection, retrospect, valuation, foresight, repentance, forgiveness, aspiration, unification.
12. Knowledge of causes and consequences of social behavior, the habit of foresight and valuation, the recognition of personal and social responsibility, the habit of moral thoughtfulness.
13. Knowledge of how to think with the materials of social action, the habit of inhibition, abstraction from prejudice, gathering and weighing of evidence, use of past experience, willingness to experiment, discipline of group thinking, openminded consideration of differences, respect for self and others, freedom from social suggestion, social perception and imagination.
14. Knowledge of the sources of information needed, and the habit of making constant reference to them.

With such a framework in mind the tests described below were constructed. No attempt was made to match a test against any particular one of the above classes of material. Each test contains a variety of situations. But the second standard, that requiring the use of as many kinds of mental processes as possible, was applied chiefly in the form of the tests. It was hoped that in the responses requested in the directions for the different tests there would be found a fair sampling of the fundamental types of process.

The Tests as Given and Scored Experimentally

Of the tests devised, thirteen were given in sufficient numbers to warrant statistical treatment. Each of these will be described briefly and what each is supposed to measure or symptomatize will be pointed out. To avoid duplication of material, the problem of criterion and method of scoring will be discussed at the same time.

In the case of arithmetic tests the criterion as to what is the right answer is established by universal practice. Spelling tests have a less universal agreement back of them, but at least there are dictionaries. When it comes to handwriting and composition the criterion has to be established experimentally by ascertaining the judgment of experts and forming a scale for the evaluation of samples of handwriting or composition produced by the subject. In the case of ethical experience we are in a still different field, in which custom and opinion are mixed together to form a great variety of practice and judgment, with no universal agreement as to what constitutes the right or wrong answer. Indeed it would be difficult to select a group of "experts" to decide by discussion and vote what the "right" answer to a question in ethics or the "right" solution of a moral problem is.
Even with such a board of judges, there is strong probability that on many debatable issues there would be only a majority or perhaps 75 per cent agreement.

The idea of a "perfect" score on moral knowledge tests, therefore, will probably have to be replaced by the notion of a scale of moral values for each individual. Certain likenesses among these scales, once they were discovered, would doubtless appear, so that they could be classified and named without any derogatory implications such as are implied in the notion of a "low" score. A method of scoring that will reveal the individual's trend of thinking is therefore of more significance than one which will show merely his position on a necessarily arbitrary scale determined by a group of judges.

Such a qualitative, descriptive, objective scale waits upon the administration of a large number of tests of the sort to be discussed below. Meanwhile they must be scored to be handled in large enough quantities for the discovery of such scales as may prove reliable and valid. It was necessary, therefore, to resort to the notion of a standard answer for each question in comparison with which the particular answer given by the subject could be automatically judged right or wrong, or partly right and partly wrong. These standards will be taken up in connection with the description of each test.

A. Word Tests

1. Opposites: a multiple choice test of the sort frequently used in intelligence or achievement tests, with the words chosen from the field of social relations. The following is a sample:

In the bracket at the right of each line place the number of the word which is most nearly opposite in meaning to the word printed in capitals at the left.

1. GIVE. 1-present, 2-accept, 3-take, 4-wish, 5-absent ............ (    ) 1
2. FRIEND. 1-soldier, 2-true, 3-false, 4-enemy, 5-fight .......... (    ) 2
3. HELP. 1-hinder, 2-assist, 3-someone, 4-need, 5-charity ......... (    ) 3
4. BORROW. 1-steal, 2-return, 3-book, 4-loan, 5-debt .............. (    ) 4
5. KIND. 1-sweet, 2-cruel, 3-sort, 4-sympathy, 5-always ........... (    ) 5

It was expected that this test would give some notion of a child's social ethical vocabulary as well as his handling of ethical concepts. A better vocabulary test was later devised.

There was no peculiar problem of criterion here, as it was easy to secure agreement on the meaning of the words.

2. Similarities: a cross-out test, of which the following is a sample:

In each line below, four of the five words belong in a class or mean about the same thing. One of the five belongs in a different class. Find this odd word in each line and cross it out.

1-debase, 2-ignore, 3-humble, 4-disgrace, 5-lower.
1-quit, 2-surrender, 3-enemy, 4-relinquish, 5-forsake.
1-abhor, 2-detest, 3-loathe, 4-despise, 5-reduce.
1-abjure, 2-insult, 3-revile, 4-disparage, 5-curse.
1-love, 2-revere, 3-like, 4-adore, 5-fond.

Not only is the mental process of recognizing such likenesses and differences not usually found under a mental age of twelve, but the words and relations selected for the test proved difficult even for children over twelve. As this test was well represented in the vocabulary test later devised, it was also dropped.

As the criterion involved only word knowledge, it offered no particular difficulty.

3. Word Consequences: also a multiple choice test.

The directions required that the subject indicate (1) all likely consequences that might follow from the action represented by the word in capitals; (2) the most likely consequence; (3) the best consequence; and (4) the worst consequence. The following are sample test words with their multiple choice responses from which the subject is to make the selections just described:

1. CHEATING. 1-courage, 2-forgery, 3-outcast, 4-wealth, 5-poverty
2. BETTING. 1-gambling, 2-poverty, 3-optimism, 4-wealth, 5-war
3. FIGHTING. 1-weakness, 2-love, 3-injury, 4-honor, 5-death
4. COURAGE. 1-disgrace, 2-honor, 3-humility, 4-strength, 5-foolhardiness
5. LOYALTY. 1-bigotry, 2-treason, 3-friendship, 4-trust, 5-timidity

This is a word test which is intended to do more than test vocabulary. It is an association test in which the required associations are those based on experiences of value. It is an abbreviated evaluation test. The individual must first pick out probable consequences flowing from a form of behavior or an attitude, and then distinguish the best from the worst of these consequences. Something of his conception of the "best" is thus revealed.

The only criterion used in scoring this test was agreement between the two investigators. The criterion involved not only judgment as to the use of words, but also as to the consequential relationship of certain experiences. For this reason it was expected that the combined judgments of a group of mature and thoughtful people would be secured in regard to each response before the revised test was scored.

B. Sentence Tests

4. Cause and Effect: a true-false test with 100 items such as the following:

Some of the statements made below are true and some are false. Read each statement carefully and underline the word TRUE if it seems to you to be true. Underline the word FALSE if it seems to you to be false.

1. Good marks are chiefly a matter of luck ........................ True False
2. Ministers' sons and deacons' daughters usually go wrong ........ True False
3. If one eats stolen apples he will have a stomach ache .......... True False
4. Success always comes from hard work ............................ True False
5. From the standpoint of the individual workers the wage system is a form of slavery ... True False
6. God punishes bad people by making them sick .................... True False
7. Eavesdroppers never hear anything good about themselves ........ True False
8. The youngster who can cheat and not get caught at it shows more good sense than one who does not cheat ... True False

This test is open to the objections that need to be raised about any true-false form of testing. We were aware of these limitations but found the procedure useful, particularly when the test, as here, was only one of a battery of tests and the gross score only was used in measuring the individual.

The intention of this test is the reverse of that of the consequences test outlined above and the foresights test described below. The attempt is made here to get at the individual's ability to trace consequences back to their causes. It is felt that such ability is an important factor in locating one's own and others' moral responsibility for what happens, that is, in placing oneself and others in a true causal sequence with events that superficially may appear quite removed. Ability to place oneself in such a determinative sequence of events is one aspect of self-conscious activity that needs to be understood and measured.

In working out a criterion for this test as in the case of several others we were fortunate in having available a class of sixty graduate students in education who were taking a course called the Psychology of Character Study. It was the sort of group of which one might expect not only conscientious work but also mature and liberal ethical judgment. This group took the Cause and Effect test, and furnished us with a criterion of a 75% (or better) agreement on seventy-seven of the hundred items. The remaining twenty-three were reviewed by the investigators and were either dropped or scored with the majority vote of the class, except in a few cases where it seemed to us that either ignorance or conventional opinion prevailed, in which cases the class decision was reversed. For example, 55% of the class thought that success always comes from hard work.

Theoretically, the elements of this test deal only with objective fact, but it is in this sort of material that prejudice and highly conventional opinion often reign.
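The way a criterion of this kind is drawn from the votes of a class can be sketched in a few lines of Python. This is an illustration only and not the investigators' own procedure: the function names, the item labels, and the vote counts for the first item are hypothetical, while the 75% threshold, the class of sixty, and the 55% vote on the hard-work item are taken from the account above. Items falling below the threshold are simply set aside for review, as the twenty-three doubtful items were.

```python
from collections import Counter

THRESHOLD = 0.75  # share of the class that must agree, as stated in the text


def build_criterion(votes_by_item, threshold=THRESHOLD):
    """Split items into an agreed key and a list set aside for review."""
    key, for_review = {}, []
    for item, votes in votes_by_item.items():
        answer, count = Counter(votes).most_common(1)[0]
        if count / len(votes) >= threshold:
            key[item] = answer          # class agreement is strong enough
        else:
            for_review.append(item)     # left to the investigators' judgment
    return key, for_review


def score(responses, key):
    """Count the responses that match the agreed key."""
    return sum(1 for item, answer in responses.items() if key.get(item) == answer)


# Hypothetical vote counts for a class of sixty.
class_votes = {
    "item_a": ["False"] * 50 + ["True"] * 10,     # about 83% agree
    "hard_work": ["True"] * 33 + ["False"] * 27,  # 55% agree, below the threshold
}
key, for_review = build_criterion(class_votes)
pupil = {"item_a": "False", "hard_work": "True"}
print(key, for_review, score(pupil, key))
```

Reversing a class decision, as the investigators did in a few cases, would amount to writing the preferred answer into `key` by hand after this step.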
The individual's score, if the criterion is correct, reveals his approximation to knowledge as against ignorance, prejudice or convention. Of course, the fact that more than 75% of these graduate students say that it is not true that unemployment is the fault of the laborer does not make this statement untrue. But it does lend backing to what would otherwise be the unsupported personal judgment of the investigators. This standard is imperfect, very, but it is probably as objective as that which determines the bulk of the present-day school curriculum.

5. Duties: a modified true-false test with three point response.

A hundred items of the following nature were used, the subject being asked to indicate whether the act stated is his duty, is not his duty, or is sometimes his duty and sometimes not:

1. To help a slow or dull child with his lessons ........................ Yes ? No
2. To read the newspapers every day ..................................... Yes ? No
3. To call your teacher's attention to the fact if you received a higher grade than you deserved ... Yes ? No
4. To keep a diary ...................................................... Yes ? No
5. To sneeze when you feel like it ...................................... Yes ? No
6. To jeer at a child who has just been punished ........................ Yes ? No
7. To smile when things go wrong ........................................ Yes ? No
8. To report another pupil if you see him cheating ...................... Yes ? No

This test furnishes a sort of rough index to knowledge of folkways the significance of which to the child is indicated by whether he considers the act his duty or not.

It is very difficult to secure a criterion for a test of this sort. The items do not represent a grown person's activities and it is not particularly practicable for an untrained adult to attempt to answer such questions from the standpoint of a ten-year-old. The graduate class referred to showed far less agreement than on the Cause and Effect test. It may prove wise later to use as a standard the majority or 75% agreement of the pupils of a given age who have on the other tests a score approximating mature ethical judgment.

With some exceptions, illustrations of which are given below, the judgments of the class were utilized as follows: Two answers were allowed for each item, the one which followed the predominant vote of the class having a value of two, and the other, following the next most frequent reply, having a value of one. On each item, therefore, a child would score two, one or zero. The class judgments were reversed by the investigators in the case of some twenty items, such as the following, in which the class percentages are given on the first line, and the final score value, as set by us, on the second:

                                                                 Yes    ?   No
To pray at least once a day ...................................   64   19   17
                                                                   0    1    2
To go to Sunday school every Sunday ...........................   48   40   12
                                                                   0    1    2
To take a temperance pledge ...................................   81   17    2
                                                                   0    0    2
To sell tickets to your school entertainments .................   56   37    6
                                                                   1    2    0
To correct another pupil when you hear him using bad grammar ..   [values illegible]
To keep every secret that you promise to keep .................   [values illegible]
To keep quiet when older persons are talking ..................   [values illegible]

6. Comprehensions: a multiple choice test suggested by the Binet comprehensions, which employs similar situations. The Terman revision of the Binet distinguishes among such questions three orders of difficulty instead of lumping them together as Binet did and as we were compelled to do in our preliminary testing. The directions in this test called for the "what you would do or say" response first. Then after the test had been taken the pupils were asked to go back and indicate what would be the best thing to do or say. As the children almost invariably checked the same items, the second request was later dropped.
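The two-answer rule described above for the Duties test can be sketched as follows. This is a minimal illustration under stated assumptions, not the investigators' own scoring sheet: the function names and the second item are hypothetical, while the class percentages and the investigators' values for the ticket-selling item (56, 37, 6 and 1, 2, 0) are those printed in the table above.

```python
def default_values(class_percentages):
    """Give 2 to the most frequent class answer, 1 to the runner-up, 0 to the rest."""
    ranked = sorted(class_percentages, key=class_percentages.get, reverse=True)
    values = dict.fromkeys(class_percentages, 0)
    values[ranked[0]] = 2
    values[ranked[1]] = 1
    return values


def score_child(responses, item_values):
    """Sum the value of each response; an unlisted answer scores 0."""
    return sum(item_values[item].get(answer, 0) for item, answer in responses.items())


# Class vote on "To sell tickets to your school entertainments".
tickets_class = {"Yes": 56, "?": 37, "No": 6}

item_values = {
    # Values as set by the investigators, reversing the class vote.
    "tickets": {"Yes": 1, "?": 2, "No": 0},
    # Hypothetical item scored by the default rule.
    "hypothetical_item": default_values({"Yes": 70, "?": 20, "No": 10}),
}

child = {"tickets": "?", "hypothetical_item": "Yes"}
print(default_values(tickets_class))   # what the class vote alone would give
print(score_child(child, item_values))
```

Keeping the investigators' reversals as explicit entries beside the rule-derived ones makes it easy to see which items depart from the class vote.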
It might have been better to ask some such question as: "What would you advise a boy or girl of your own age to do?" or "Which act would be most likely to promote your own welfare in the long run?" or "Which act would be most fair, just and friendly for everyone concerned?"

The following are samples of the situations and responses:

1. If someone asks to borrow your pencil:
   (a) Tell him it's broken.
   (b) Tell him that you just lost it.
   (c) Tell him that you don't want to loan it.
   (d) Let him take it.
2. If someone steals your lunch:
   (a) Steal another lunch to even it up.
   (b) Report it to the teacher.
   (c) Cry about it.
   (d) Say nothing about it.
3. If you see a classmate cheating on an examination:
   (a) Say nothing to anyone.
   (b) Explain to him that it is wrong and warn him.
   (c) Report it to the teacher.
   (d) Say nothing, but try to cheat yourself.

This test is similar to the Duties test in its intent, but with a different technique. Counting the multiple responses there were 132 possible ways of responding, each one a distinct item, and not merely the opposite of another, as in the case of Duties.

The presumption back of these two tests is not that one may not do the correct thing without knowing he ought to, nor that he will do it when he knows he ought to, but that knowledge of what is expected or of what is wisest is, in the field of morals, just as in plumbing or cooking, an essential part of intelligent control of a situation, even when one chooses to do precisely what is not expected or what is not wise. Our moral issues lie largely in this field of conflict, on the one hand, between what we transiently wish and what we know is good, and, on the other, between what is generally regarded as good and what we ideally vision as better. In any case, the tester must know the individual's equipment of standards before he can understand the moral significance of his behavior.

Seventy-five per cent of the class agreed in twenty-three out of the thirty items.
In twenty-two of these cases their judgment was followed. In one case it was reversed. In five of the remaining cases, majority opinion was followed. One of the others was dropped and one was reversed. Of the two reversals one was subsequently omitted.

7. Provocations: a few illustrations will introduce the test:

Here are some little stories of what some children did. You are to decide whether they did right or wrong. If what they did was not quite right, perhaps it was at least excusable in view of the circumstances. Look at the sample first.

SAMPLE: Jane's family were too poor to buy fruit for her sick brother. So every now and then Jane took an apple or an orange from a fruit stand and brought it home to him.

Now if you think she was absolutely wrong in taking the fruit, put a circle around the Wr like this ........ R  Ex  (Wr)
But if she did exactly right, encircle the R, like this ........ (R)  Ex  Wr
If you think she was wrong but excusable in view of her desire to bring it to her sick brother, encircle the Ex like this ........ R  (Ex)  Wr

Begin here and do the rest in the same way:

1. Helen noticed that nearly everyone in the class was cheating on a test, so she cheated too ........ R Ex Wr
2. Harry was a Christian boy. One day a Jewish boy called Harry a "dirty Christian." Harry knocked him down ........ R Ex Wr
3. Charles did not want to play marbles for keeps but the boys called him a "sissy" so he went ahead and played for keeps anyway ........ R Ex Wr
4. On the way to Sunday school Jack matched pennies with the other boys in order to get some money for the Sunday-school collection ........ R Ex Wr

The test is called "Provocations" because the situations named are provocative of responses that are in conflict with ideal modes of response. In the case of ethically immature persons the situations stimulate wishes, prejudices, emotions, and so forth, which lead to the sort of action stated in the little story.
Sometimes convention supports the stated action and sometimes it does not. In the following case convention and wish seem to agree in contradistinction to more ideal conceptions of the appropriate response:

Henry saw a big bully strike a little boy, so Henry walked up and gave the bully a real hard blow and knocked him down.

Judgment is passed on the particular responses listed in the test, and thus the examiner gains an insight into the level of moral judgment attained by the subject.

As can be imagined, a standard for such a test as this is almost impossible of achievement. It was first decided to take a conventional standard as the criterion. Two suggested themselves, the one a rather mature one as found in the answers given by the graduate class, the other, the less socialized standard found by examining the actual answers given by all the children who took the test. So many of the conventional replies, however, offended our own sense of right and wrong that it was finally decided to attempt an approximation to a standard that would conform to the great historical moral ideals, and to measure all divergences from this viewpoint rather than from some point further down the scale. The conventional standard is thus identified by a score rather than by a qualitative exposition, and so also is the standard of the major group to which the child belongs. The median of his group may be lower or higher than the conventional standard and his own score may deviate from the median of his group toward the conventional, or toward the ideal, or toward a vague and undetermined zero of moral knowledge.

The decisions of the graduate class turned out to be so highly conventional that they were practically ignored as a criterion. They were too much like what sixth grade children give as their responses.
For example, in the last illustration given, of the boy who knocked down the bully, 45% of the class thought it was unqualifiedly right to knock the bully down, 42% thought it wrong but excusable, and only 13% called it unqualifiedly wrong. In one sixth grade previously given this test 85% marked it right, 6% excusable and 9% wrong. Our own standard gives a value of one to excusable and of two to wrong.

Or take the following illustration:

The neighbors had been kept awake at night by two cats fighting. So Fred set his bulldog on them.

The following percentages of the graduate class and the sixth grade were given to the different answers:

                     R     Ex    Wr
Graduate class      19%   53%   28%
Sixth grade         29%   45%   26%
Our valuation        0     1     2

Instances like this made us feel that if the test was to have real differentiating value, the only possible standard to be used was one which would grade all from the top down, on which the score would represent approximation toward consistency in forming judgments in the light of ethical ideals rather than in terms of convention or prejudice.

8. Foresights: one important distinction between this and the Word Consequences test is in the fact that here no suggestions are given as to the possible consequences. The subject is left entirely to himself in thinking of what might happen from the events recorded. He is requested to write down as many things as he can think of, both good things and bad, and a sample is given as an illustration. Here are some of the incidents selected from the forty-eight actually used:

1. Whenever anyone picked on John he would go tell his teacher. (Space is given for a large number of possible consequences.)
2. John accidentally broke a street lamp with a snow ball.
3. Ruth's folks had a crowded apartment so they kept a lot of boxes and things on the fire escape.
4. Jim was anxious to make good marks at school so he usually studied instead of going out to play with the other fellows.
The foresight of consequences involves the ability to see for oneself what is likely to happen. Foresight is, of course, a conspicuous factor in intelligence. But foresight in any particular field is a function of experience as well as of intelligence. The foresight of social consequences is one of the chief characteristics of the good man, and even the relatively unintelligent can learn from experience to see ahead to the effect of their own and others' deeds with sufficient clearness to act kindly if not altogether wisely.

The forty-eight items of this test were put into six separate forms. The eight items of each form consisted of two sets of four each, with each set covering about the same range of situation and allowing for about the same range of possible consequences. The method of scoring this test has not yet been worked out.

9. Recognitions. A multiple choice test. The following is a sample:

After each statement are five letters: C. L. S. X. J. If the deed is a case of Cheating, draw a circle around the C; if it is Lying, around the L; if it is Stealing, around the S. If it is something wrong, but not either cheating, lying, or stealing, put a circle around the X. If it is not wrong at all, put a circle around the J. If the thing is both cheating and lying or stealing and lying, or all three, encircle all the letters you need to in order to express your opinion. (A sample is given which is here omitted.)

1. [illegible] C. L. S. X. J.
2. Using street car transfers that are out of date. C. L. S. X. J.
3. Riding on the back of a truck without the driver's knowing it. C. L. S. X. J.
4. Apologizing for a misdeed when you are not really sorry. C. L. S. X. J.
5. Forgetting to brush your teeth for a [illegible]. C. L. S. X. J.
6. Talking loudly in the hallways when classes are in session. C. L. S. X. J.
7. [illegible] in a public park. C. L. S. X. J.
8. When you don't want to go somewhere, making up an excuse so as not to hurt anyone's feelings. C. L. S. X. J.
[Table II, concluded. Entries for the Comprehensions elements, including Elements 24 and 27, are not legible.]

The elements reported on in Table III are taken from the original form of the Provocations Test. The subject is to indicate whether the act described is right (R), excusable (Ex), or wrong (Wr).

1. Helen noticed that nearly everyone in the class was cheating on a test, so she cheated too. R Ex Wr
10. There was a contest among the classes for high grades. John cheated on the test in order to help his class win. R Ex Wr
15. The neighbors had been kept awake at night by two cats fighting. So Fred set his bulldog on them. R Ex Wr
21. When Dick pointed his father's revolver at Joe in fun, Joe said, "Don't you know better than that you ____ fool?" R Ex Wr
23. Helen knew that cucumber salad would make her sick but she ate some so as not to offend the hostess. R Ex Wr

TABLE III
CORRESPONDENCE OF IMAGINED PROVOCATION AND CONDUCT

            Element 1              Element 10             Element 15
C's     R   Ex   Wr    N       R   Ex   Wr    N       R   Ex   Wr    N
0       5    8   87  100       1   15   83   99      34   30   30   94
1       1   10  118  129       4   12  113  129      56   30   45  131
2       3    4   83   90       4   13   74   91      39   30   22   91
3       1    3   29   33       4    4   25   33      10    9   11   30
4       0    1    5    6       1    1    4    6       3    2    1    6
T      10   26  322  358      14   45  299  358     142  101  109  352
Ave.  1.0  1.2  1.2  1.2     2.0  1.2  1.2  1.2     1.2  1.2  1.2  1.2
% C    50   69   73   72      93   67   72   72      76   70   72   73

            Element 21             Element 23
C's     R   Ex   Wr    N       R   Ex   Wr    N
0      33   23   34   90      29   39   24   92
1      47   36   51  134      33   51   47  131
2      43   17   30   90      21   40   30   91
3      12    9    9   30       3   13   14   30
4       2    1    2    5       2    0    3    5
T     137   86  126  349      88  143  118  349
Ave.  1.3  1.2  1.2  1.2     1.0  1.2  1.4  1.2
% C    76   73   73   74      67   73   80   74

The elements reported in Table IV are taken from the original Duties Test and are as follows:

8. To bet on your home team. True ? False
36.
To stick with your gang even when they are wrong. True ? False
69. To accept every decision of the umpire without question. True ? False
74. To pretend you understand a thing when you do not. True ? False

TABLE IV
CORRESPONDENCE OF SENSE OF DUTY AND CONDUCT
[Entries for Elements 8, 36, 69 and 74, tabulated by number of cheatings against the responses +, - and ?, are not legible.]

Looking back over Tables II to IV we find the following conspicuous differences.

Comprehensions, Element 7. 91% of those who say it is all right to let another pupil copy your work and hand it in as his own actually cheated themselves.

Comprehensions, Element 12. 100% of those preferring to smash the slot machine to recover their lost nickel actually cheated on a test.

Provocations, Element 10. 93% of those who thought it right for John to cheat in order to help his class win actually cheated themselves.

It is noteworthy that these high agreements among the cheaters are in regard to cheating in two cases and to property in the other, and not in any instance to other types of behavior. This is somewhat surprising, since one would not expect a cheater to wear his heart on his sleeve. The way he gives himself away in these particular instances may afford suggestions as to how to build a test that will contain a large number of elements having this attraction for the cheater.

Meanwhile, it may be found that other elements already used may distinguish between the honest and the dishonest subjects. A complete analysis of six hundred elements for all the cases available was hardly justified in view of the improbability of success.
So we selected the twenty-five most deceptive individuals from a group that had over twenty tests of deception and twenty-five cases from another group who did not cheat on any one of ten tests. The first group cheated on the average three out of every four chances. All these children had Scale A, Form 2 of the Moral Knowledge Tests. We ran through the first four tests, Causes, Duties, Comprehensions and Provocations, and tabulated the way the honest and dishonest groups answered each element. The items of Table V showed significant* differences between the two groups. The score reported is the score chosen by the honest group, the other group choosing some other answer. Those marked "2" are weighted double because of the extreme difference between the groups. Items scored as shown in this table give the honest a high score and the dishonest a low score.

TABLE V
HONEST RESPONSES ON DISTINGUISHING ELEMENTS: SCALE A
[Entries, listed by item and score under Causes, Duties, Comprehensions and Provocations, are not legible.]

*Not statistically determined. The largest differences were used.

Using Table V as a key we scored the papers of the two groups, using only the items listed. Table VI shows the results:

TABLE VI
DISTRIBUTION OF MORAL KNOWLEDGE SCORES (SCALE A) OF HONEST AND DISHONEST GROUPS
[Entries, listed by score for the honest and dishonest groups, are not legible.]

This seemed to warrant further study, so we did the same thing for Scale B, Form 2, using another group of most honest cases, but the same group of dishonest cases.
The honest scoring of the most differentiating elements was as follows:

TABLE VII
HONEST RESPONSES ON DISTINGUISHING TEST ELEMENTS: SCALE B
[Entries, listed by item and score under Applications, Recognitions, Principles and Vocabulary, are not legible.]

When scored as in Table VII the two groups of papers yield the following distributions:

TABLE VIII
DISTRIBUTION OF MORAL KNOWLEDGE SCORES (SCALE B) OF HONEST AND DISHONEST GROUPS
[Entries, listed by score for the honest and dishonest groups, are not legible.]

Scale B did not succeed as well as Scale A in distinguishing the two groups, but the difference is still marked.

But we may still be "stacking the deck," so to speak, by this method of selection. The same questions might not distinguish between other groups. The apparent differences in the separate items may be chance differences in each case, so that by combining a lot of such chance differences we may have built up a large total difference peculiar to the groups selected. When the items are not selected because of their capacity to distinguish groups, but are chosen at random, the chance differences between groups tend to be neutralized. This can be tested by taking a fresh population of honest and dishonest cases and using the same items as before. We did this by selecting the most honest twenty-five and the most dishonest twenty-five from a population of 500. The difference in this case is much less significant, being only 2.7 times its standard error, whereas it should be three times its standard error to be beyond the range of chance. The difference between the cheating means of these groups on Behavior B alone was twelve times its S.E. This method seems to be unavailable for discovering the relation of moral knowledge to conduct.
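The test of significance applied above, a difference between two means divided by its standard error and compared with three, can be sketched as follows. This is a minimal sketch, assuming the usual standard error of a difference between independent means (the square root of the sum of the squared standard errors); the text does not state which formula was used, and the function names and scores are hypothetical.

```python
import math
from statistics import mean, stdev


def critical_ratio(group_a, group_b):
    """Difference between two group means divided by its standard error.

    Assumes independent groups, so that the S.E. of the difference is
    sqrt(SE_a**2 + SE_b**2), with each SE equal to S.D. / sqrt(N).
    """
    se_a = stdev(group_a) / math.sqrt(len(group_a))
    se_b = stdev(group_b) / math.sqrt(len(group_b))
    se_difference = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean(group_a) - mean(group_b)) / se_difference


def beyond_chance(ratio, threshold=3.0):
    """Apply the rule of thumb in the text: three or more times the S.E."""
    return abs(ratio) >= threshold


# Hypothetical scores, invented for illustration only.
honest = [20, 22, 19, 24, 21, 23, 20, 22]
dishonest = [18, 21, 17, 22, 19, 20, 18, 21]

ratio = critical_ratio(honest, dishonest)
print(round(ratio, 2), beyond_chance(ratio))
```

On this rule the ratio of 2.7 reported for the fresh population falls just short of the threshold, while the ratio of twelve for the cheating means lies far beyond it.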
But having gone so far we thought we might as well see what else the differences among these several populations might reveal. Apparently the moral knowledge scores are due to other factors than those which determine the behavior scores.

First it should be noted that the honest group in Table VI is from a private school of unusually fine moral tone. The deception group in the same table is from an institution for children from broken homes. The second group of honest cases used for comparison with these institutional cases consisted of about half the same children as before and half other children from the same school. The groups from the population of 500 used as a check and referred to in the second paragraph preceding, are from a suburban community and both the honest and dishonest groups are from the same schools so that the general background is relatively homogeneous. Let us call these groups HP1, HP2, DI, HS, DS, respectively; HP1 and 2 the most honest private school children, DI, the twenty-five most deceptive institutional children, HS and DS the twenty-five most honest and twenty-five most dishonest suburban children.

TABLE IX
DIFFERENCES (LEFT) BETWEEN HONEST AND DISHONEST GROUP MEANS AND THESE DIFFERENCES DIVIDED BY THEIR STANDARD ERRORS (RIGHT)
[Entries for groups HP1, HP2, DI, HS and DS in moral knowledge and deception are not legible.]

Table IX displays some interesting comparisons among these groups. Remembering that any difference three or more times its S.E. (right side of table) is beyond the range of chance, let us examine this table.
The biggest differences are between the groups on which the technique was built, HP1, HP2 and DI, the test questions being selected because they differentiated these groups, the private school most honest and the institutional most dishonest. The next largest differences occur in the two instances in which one of these original groups is compared with a fresh group, viz., HP1 and DS (private honest and suburban dishonest) and DI and HS (institutional dishonest and suburban honest). When entirely fresh populations are used for the honest and dishonest groups (HS and DS), the moral knowledge difference is not quite beyond the limits of chance although the deception difference is considerable. Comparison of the suburban and institutional dishonest groups, DI and DS, shows that there is a slight difference in favor of the suburban group on both moral knowledge and deception tests. Comparison of the private school honest and suburban honest groups (HP1 and HS), shows a curious and significant difference in both moral knowledge and deception. These moral knowledge scores, it must be remembered, are based on only twenty-six elements. The private school mean (honest groups) is 24.8 as against 18.8 for the suburban honest groups, a difference 4.4 times its standard error. This is a more significant difference than the difference between the moral knowledge elements of the two suburban groups. When we get away from the original two groups by means of which the elements were chosen, their power to distinguish disappears. This is particularly conspicuous when it is noted that these two suburban groups differ in deception by twelve times the S.E. of the difference.

We must conclude, therefore, that while the responses on the selected elements are much the same for two dishonest groups, they differ so between two honest groups as to eliminate their discriminative capacity.
But the comparability of the differences between the HP1 and HS group in moral knowledge (4.4) and deception (4.9) and between the DS and DI group in moral knowledge (2.5) and deception (2.9) as well as the relations between HP1 and DS, and HS and DI (see Table IX), suggest, if they do not demonstrate, a relation of some kind between the moral knowledge responses and conduct. But the great difference in answers between the two honest groups, HP1 and HS, suggests also that the relation is slight and that other factors such as the general cultural differences often found between distinct social groups such as public and private schools, and institutions, are more significant in determining correlations between knowledge and conduct than are any logical relations in the minds of individuals.

The facts just discussed are graphically portrayed in the accompanying chart, from which it will be seen that the various groups occupy the same relative position in both moral knowledge and deception.

[Chart: moral knowledge means and deception means of the groups HP1, HS, DS and DI.]

If, as has just been suggested, the group as a unit should exhibit higher correlations between such factors as knowledge and conduct than does the individual as a unit, many interesting problems of interpretation would be raised. It has seemed worthwhile, therefore, to make an intensive study of the relation between moral knowledge and conduct of social groups each of which is relatively homogeneous. The conclusions of this study will be reported in the next article.

SIXTH ARTICLE
GROUP STANDARDS AND GROUP CONDUCT

The previous paper in this series reported two conclusions and two provocative suggestions covering the extent to which standards and conduct are psychologically related in the behavior of individuals.
The scores on our moral knowledge tests, purporting to measure general level of comprehension of ideal conduct, proved to have very little in common with either deceptive or altruistic behavior. The way in which certain test items were answered by honest as contrasted with dishonest children seemed to offer a fruitful lead regarding the way to build a test of moral opinion which might show a better correlation with conduct. We were not able, however, to select from our own tests a group of items which would consistently discriminate between honest and dishonest children. Finally, we drew attention to the fact that close correspondences existed between the most honest sections and the most dishonest sections of certain school populations with respect to their mean differences in both moral knowledge and deception.

From Table IX of the last article it appears that Honest Group HP1 differs from Honest Group HS in the same amount in both moral knowledge and deception; Dishonest Group DI differs from Honest Group HP1 84 per cent as much in knowledge as in conduct and from Honest Group HP2 78 per cent as much in knowledge as in conduct. Dishonest Group DS differs similarly from Honest Group HP1 in about the same ratio as Dishonest Group DS differs from Honest Group HS. The means of four of these groups were charted on the last page of the previous article so as to indicate the correlation. All this suggests a group similarity in behavior on moral knowledge tests and deception tests which we have thought worth investigating.

In reporting the similarity of groups in moral knowledge and conduct we are not engaging in controversy over the psychological nature of a group.
We shall show, however, that when one relatively homogeneous group is compared with another, differences in both knowledge and conduct are found which cannot be accounted for by chance or by differences in intelligence and which also correlate more highly than do knowledge and conduct in individuals. These facts bear out the suggestion that there is a community of code and conduct in homogeneous groups which is not a function of individual integration.

In this paper two types of dishonesty tests and a record of helpful acts are used for the conduct scores, and eight different moral knowledge tests, wherever these could be matched, case for case. The classroom group is always the unit used.

Table I shows the correlations between the available moral knowledge test scores and a type of dishonesty called Behavior C, which consists in making illegitimate use of an answer sheet while taking a test or grading one's own paper. There were three such tests involving arithmetic problems, completion problems, and information problems. These three are combined in a single classroom or school deception score in Table I. The scores all represent amounts of deception. Classrooms doubtless differ in code in this matter as well as in conduct, but these codes are not qualitatively revealed in the moral knowledge scores, which indicate, rather, a kind of level of comprehension as to what is expected of children. If a genuine code were available the correlations would presumably run much higher.

TABLE I
INDIVIDUAL AND GROUP CORRELATIONS BETWEEN MORAL KNOWLEDGE AND DECEPTIVE BEHAVIOR C
(Total School Score)

                       Ind. r's        Group r's                       Partials
                       1       2       3       4       5       6       7
M.K. Tests             Raw     Corr.   r       P.E.    N       Groups  Intelligence constant
A1 Causes              -.04    -.05    +.28    .12     435     13
A2 Duties              -.25    -.32    -.35    .09     450     15
A3 Comprehensions      -.18    -.24    -.80*   .04     457     16      -.73
A4 Provocations        -.15    -.20    -.53*   .09     307     13      -.20
B2 Recognitions        [?]     [?]     -.64*   .06     766     [?]     [?]
B3 Principles          -.26    -.36    -.49    .09     302     8
B4 Applications        -.40    -.52    -.49    .09     243     9
B5 Vocabulary          -.15    -.18    -.51*   .08     540     18      -.05

The columns of Tables I, II and III have the following meanings: At the left are the separate moral knowledge tests, referred to by name, scale and number. Col. 1 gives the r's between individual moral knowledge and deception scores. Col. 2 gives these r's corrected for chance errors. Col. 3 gives the r's between the classroom means in moral knowledge and deception. Col. 4 gives the P.E.'s of Col. 3. Col. 5 is the number of cases in each population. Col. 6 gives the number of classroom groups. Col. 7 shows the partial r's between moral knowledge and deception group means with intelligence held constant.

Table II presents the same facts for Behavior A, a type of dishonesty which consists in adding on more scores in a speed test when one is supposed to be correcting his paper. There were six such opportunities in the test.

TABLE II
INDIVIDUAL AND GROUP CORRELATIONS BETWEEN MORAL KNOWLEDGE† AND DECEPTIVE BEHAVIOR A

                       Ind. r's        Group r's
                       1       2       3       4       5       6
                       Raw     Corr.   r       P.E.    N       Groups
A1                     -.14    -.22    +.265   .12     780     30
A2                     -.18    -.30    -.367   .12     710     28
A3                     -.08    -.13    -.087   .13     780     30
A4                     -.09    -.13    -.435   .09     780     30
B2                     +.03    +.04    -.177   [?]     458     17
B3                     -.06    -.11    +.382   .12     458     [?]
B4                     -.09    -.14    -.443*  .10     419     14
B5                     -.06    -.07    -.338   .12     528     19

The columns of Table II have the same meanings as those of Table I.

Table III gives the correlations for general helpful behavior called Behavior H. The helpfulness scores are ratios based on teachers' estimates of the amount of co-operation each child gave to each of several class and school service projects, and the number of such projects.
TABLE III
INDIVIDUAL AND GROUP CORRELATIONS BETWEEN MORAL KNOWLEDGE† AND BEHAVIOR H

                       Ind. r's    Group r's                       Partials
                       1           3       4       5       6       7
                       Raw         r       P.E.    N       Groups  (int. constant)
A1                     +.24        .714*   .06     387     13      +.65
A2                     +.26        .685*   .06     359     12      +.63
A3                     +.12        .362    .10     386     13
A4                     +.16        .404    .10     400     13
B2                     +.17        .363    .10     221     9
B3                     +.24        .730*   .05     222     9       +.75
B4                     +.18        .758*   .05     152     6       +.73
B5                     +.45        .650    .07     258     10

The columns of Table III have the same meaning as those of Tables I and II. The r's of Col. 1 could not be corrected for attenuation since the reliability of the helpfulness scores is not known.

†The moral knowledge scores in Tables II and III are from a revised form of those previously used which in each case is less than half the length of the original.

The first thing to be noticed in these tables is the fact that the group r's of Column 3 are, with one exception, higher than the individual r's of Column 1, and almost always higher than these r's even when they are corrected for attenuation in Column 2. Column 1 gives the fairer comparison since in groups made up by a random selection of cases $r_{m_1 m_2} = r_{12}$, as will be pointed out in a moment. In many cases the group r's exceed the individual r's in the ratio of from 4 to 1 to 7 to 1. Those that are significantly greater than the individual r's are starred (*). Table I shows that in the case of Behavior C at least four of the moral knowledge tests correlate significantly higher in the case of the group means than in the case of the individual scores. Behavior A, however, shows only one single significant difference, although in each case the group r's are larger than the individual r's. Four of the moral knowledge tests show significantly different r's between the individual and group r's for helpful behavior (Table III), and most of the r's run higher than for deception. The four that are starred for helpfulness are precisely the four that are not starred for the deception scores of Behavior C in Table I.
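The correction for attenuation behind Column 2 is not spelled out in the text. A minimal sketch, assuming the familiar Spearman form in which the observed r is divided by the square root of the product of the two reliabilities, would run as follows. The function name and the reliability figures are hypothetical and are not taken from the tables.

```python
import math


def correct_for_attenuation(r_observed, reliability_x, reliability_y):
    """Spearman's correction: r_observed / sqrt(r_xx * r_yy).

    reliability_x and reliability_y are the reliability coefficients of
    the two measures (assumed known; hypothetical in the example below).
    """
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)


# Hypothetical figures for illustration only.
raw_r = -0.25
corrected = correct_for_attenuation(raw_r, reliability_x=0.81, reliability_y=0.64)
print(round(corrected, 3))
```

Since both reliabilities are at most one, the corrected r is never smaller in absolute value than the raw r, which agrees with the direction of the differences between Columns 1 and 2. It also shows why no correction could be made where the reliability of one measure is unknown.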
These figures now set our problems for us: Classroom groups exhibit a genuine association of scores on certain moral knowledge tests and certain conduct tests which is not accounted for by the association of these same facts in the individuals who make up these groups. Individuals who rate high in moral knowledge do not necessarily rate high in conduct. In fact the relation between the two is nearly negligible. But groups that rate high in moral knowledge do also rate high in conduct, under certain conditions.

That a relation of this sort between individual r's and group r's is not a chance result has been shown by Pearson, who demonstrated that if a series of groups are random samples of the entire population, the r's between the means of the groups will be the same as the r's based on individual scores.* In our case, the groups are obviously not selected at random so far as age is concerned since they are ordinary grade groups, the members of which have been together for the most part for some time. It may be that the mere mechanical age and intelligence differentiation of such grade groups would account for the likeness found in knowledge and conduct. This explanation depends upon the existence of correlations between either age or intelligence and both moral knowledge and the conducts studied. Chronological age, we know, does not correlate with either Behavior C or H. It does slightly in the case of Behavior A, but this factor has already been eliminated from the scores reported for this behavior. Differences in age, therefore, cannot account for these correlations.

Differences between groups in intelligence, then, must be considered as a possible explanation of our superior group r's. Fortunately, intelligence scores were secured in the course of our study which enable us to test this hypothesis in two different ways. The first and most obvious procedure is to partial out the variability in intelligence.
This we have done for Behaviors C and H in the starred cases where the differences between the group and individual r's are statistically significant, and the results are to be found in Column 7 of Tables I and III. These partials are, of course, highly unreliable, but they are large enough in several cases to indicate that intelligence is not the only factor at work to produce group similarity of knowledge and conduct. Strictly speaking these partials should be compared with corresponding partials for Column 1. We have not computed these as the only effect would be, in most cases, to increase the difference between the individual and group r's and so still further undermine the suggestion that the group r's are to be accounted for by differences in the mean intelligence of the classrooms.

*See Kelley, Truman L. Statistical Method, page 178, Formula 118. More explicitly, if each pupil's moral knowledge and deception scores were written on a card, and all the cards were shuffled and then sorted by chance into piles, the correlation between the mean moral knowledge scores and mean cheating scores of these piles would be the same as the r between the individual scores if they were thrown into one plot (within the limits of chance variation).

The relatively low group r's and high P.E.'s in the case of Behavior A make the partial correlation technique here unavailable. Hence we have adopted a different method of testing the intelligence hypothesis in this case.
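The partialling procedure used for Column 7, in which intelligence is held constant, can be illustrated with the standard first-order partial correlation formula. This is a minimal sketch: the function name is hypothetical, and the three input correlations are invented for illustration, since the correlations of the group means with intelligence are not reported here.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant.

    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))
    """
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("r_xz and r_yz must be strictly between -1 and 1")
    return (r_xy - r_xz * r_yz) / denominator


# Hypothetical group correlations, for illustration only:
# x = mean moral knowledge, y = mean deception, z = mean intelligence.
r_knowledge_deception = -0.60
r_knowledge_intelligence = 0.50
r_deception_intelligence = -0.40

print(round(partial_r(r_knowledge_deception,
                      r_knowledge_intelligence,
                      r_deception_intelligence), 3))
```

If the partial r remains well away from zero, as in several of the starred cases, intelligence alone does not account for the agreement of the group means.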
Our criterion here depends on the following statistical relations among random samples: If from a large population several batches of about thirty each are drawn at random, the mean and the standard deviation of each batch will be the same as the mean and the standard deviation of the whole population, within the limits of determinable errors due to chance variations among the samples.* The means of the random samples will form a normal distribution, the mean of which will be the same as the mean of the larger population and the standard deviation of which will equal the average of the standard errors of the sample means, each of which is $\frac{\text{S.D.}}{\sqrt{N}}$. If the samples are not random, not mere chance accumulations of individuals, the average of the standard errors of the sample means will be less than the S.D. of the group means. The reason for this is that when a selective force is operating to make the members of a group resemble one another more than they would by chance, the range and therefore the S.D. of the scores in the trait concerned is less than for a random sample or for the total population of which the sample is a selection. Hence the average of a series of such non-random S.D.'s is less than the S.D. of the whole population. If the selective force operates unevenly from group to group, the range and therefore the S.D. of the group means will be greater than in the case of groups chosen at random. Consequently the average of the standard errors of the group means ($\frac{\text{S.D.}}{\sqrt{N}}$) is bound to be less than the S.D. of these group means.

Applying this criterion to our data, we are to show that even when class groups are random samples with respect to intelligence, or do not differ from one another significantly in this particular, they nevertheless do differ significantly from one another in both moral knowledge and conduct. Under these circumstances, such superiority of group over individual r's between knowledge and conduct as is secured may with some confidence be attributed to some common factor other than intelligence.
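The criterion just described reduces to a single ratio: the S.D. of the group means divided by the average standard error of those means. A minimal sketch follows, assuming the population form of the S.D. for the spread of the means and the sample form within each group; the text does not say which forms were used, and the function name and scores are hypothetical.

```python
import math
from statistics import mean, pstdev, stdev


def sampling_ratio(groups):
    """Ratio of the S.D. of group means to the average S.E. of those means.

    A ratio near 1 is what random sampling would give; a ratio well
    above 1 points to a selective force acting between the groups.
    """
    group_means = [mean(g) for g in groups]
    sd_of_means = pstdev(group_means)
    average_se = mean(stdev(g) / math.sqrt(len(g)) for g in groups)
    return sd_of_means / average_se


# Hypothetical classroom scores, invented for illustration only.
similar_classes = [[10, 14, 12, 16], [11, 15, 13, 17], [9, 13, 11, 15]]
distinct_classes = [[10, 14, 12, 16], [20, 24, 22, 26], [30, 34, 32, 36]]

print(round(sampling_ratio(similar_classes), 2))
print(round(sampling_ratio(distinct_classes), 2))
```

The second set of classes has the same spread within each class as the first but much more widely separated means, so its ratio is far larger.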
In applying this criterion we used seven classroom groups, whose mean intelligence scores were close together, and who had Scale A of the Moral Knowledge tests, nine such groups who had Scale B, and ten of homogeneous intelligence who were tested with Behavior A. The results are summarized in Table IV.

*See Yule, G. U. Introduction to the Theory of Statistics, page 344.

TABLE IV
CRITERION FOR RANDOM SAMPLING IN REGARD TO INTELLIGENCE, MORAL KNOWLEDGE AND BEHAVIOR A

                     1         2        3          4            5
                     No. of    Ave.     S.D. of    Ave. S.E.    Ratio of
                     groups    N        Means      of Means     3 to 4
Scale A
  Intelligence       7         23       3.4        3.5          .97
  Causes             7         23       3.7        1.04         3.56
  Duties             7         23       1.6        1.0          1.60
  Comprehensions     7         23       .72        .40          1.80
  Provocations       7         23       1.0        .78          1.28
Scale B
  Intelligence       9         24       3.2        3.5          .91
  Recognitions       9         24       3.0        2.5          1.20
  Principles         9         24       1.4        .45          3.1
  Vocabulary         9         24       4.4        1.8          2.44
Behavior A
  Intelligence       10        25       3.2        3.5          .91
  Deception          10        25       12.96      5.24         2.47

Thus we see from Table IV that in each set of groups the average of the S.E. of the intelligence means is slightly greater than the S.D. of the means, indicating that the groups selected are of the same level of intelligence, or, in other words, random with respect to intelligence. In each of the moral knowledge tests, however, and also in the deception test the ratios of Column 5 show the S.D.'s of the means to be greater, often very much greater, than the S.E.'s of the means, demonstrating that these groups are not random samples but show the presence of a selective force, operating independently of intelligence, to produce variation in the means.

We have approached the suggestion that there is genuine group unity of standard and conduct by several steps which may be summarized as follows:

1. The correlation of groups, treated as units, with respect to level of moral knowledge and conduct, is not altogether due to the correlation of these two factors in the individuals composing the groups.
2.
This correlation is not the product of a large number of uncorrelated factors (chance).
3. This correlation is not due to differences between the groups in age or in intelligence.
4. The variability of the groups among themselves is not such as could occur by chance, age or intelligence.
5. Since the group r's are larger than the individual r's, they cannot be accounted for by a causal relation between moral knowledge and conduct, since this relation could operate only through the minds of the individuals concerned.
6. Hence the superiority of the group r's must be due to the reaction of individuals to some influence which tends both toward higher code and more social conduct (and vice versa) without these being integrated in the minds of the individuals.

Such a common influence might be exerted either by the group as a whole through a growing tradition or by the teacher or by the school system, or by all three. No matter how much it affects either conduct or code for the better, if the correlations indicate the absence of individual integration, this improvement can hardly be regarded as growth in character.

Lest this evidence from group correlations be regarded as insubstantial we will illustrate how it is possible to get a high correlation between group means when the r between individual scores is zero. It all depends on how the groups are constituted or selected.

SCATTERGRAM No. 1: r = 0
[Scatter plot of individual scores; plotted points not reproduced.]

SCATTERGRAM No. 2: r = 0, $r_{M_x M_y}$