 UNIVERSITY OF PENNSYLVANIA 
 
 THE COMPETENCY OF FIFTY COLLEGE 
 
 STUDENTS 
 
 (A DIAGNOSTIC STUDY) 
 
 BY 
 KARL GREENWOOD MILLER 
 
 A THESIS 
 
 PRESENTED TO THE FACULTY OF THE GRADUATE SCHOOL IN PARTIAL 
 
 FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE 
 
 OF DOCTOR OF PHILOSOPHY IN PSYCHOLOGY 
 
 PHILADELPHIA 
 1922 
 
 THE COMPETENCY OF FIFTY COLLEGE STUDENTS. 
 
 (A Diagnostic Study.) 
 NOTE 
 
 This Thesis will be found reprinted as No. VIII of 
 Experimental Studies in Psychology and Pedagogy 
 
 Introduction. 
 
 No task more worthy of attention confronts the psychologist 
 today than the scientific study of the college student by means of 
 mental tests. 
 
 Psychological tests were first employed in the examination and 
 segregation of the mentally feeble. A large number of clinics con- 
 nected with modern school systems, hospitals, or juvenile courts 
 have found these tests of service in detecting mental subnormality. 
 It has only been in the last decade, however, that the possibilities 
 of the psychological examination of "normal" individuals have been 
 recognized, and rapid advances are now being made in this field. 
 The success with which mental tests were used in the classification 
 and stratification of the great mass of men who formed our National 
 Army probably did more to bring about a general acceptance of the 
 method and principles involved than would have resulted from 
 many years of experimentation in peace times. Today, psychological
 tests are not only used in the field of education but also form an
 integral part of the selective and administrative machinery of many
 large industrial organizations. The present vogue of the mental
 test carries with it one real danger in that the uninitiated are likely 
 to demand more of the psychologist than he can give. 
 
 Without doubt it is now possible to say, as a result of a psycho- 
 logical examination, that one individual possesses too little mentality 
 to admit of his being a self-supporting member of society, that 
 another can be trained to perform a simple task satisfactorily, that 
 a third has ability which will enable him to fill a place in the great 
 middle class, while still another has intellectual endowments which 
 should lead him into the fields of higher education and professional 
 activity. These broad classifications can be made through the 
 employment of many and various tests which have been carefully 
 devised and scientifically standardized. With the concept of differing
 levels of general intelligence fairly well developed, the psychologist
 now faces the further task of classifying individuals within each
 level. When the attempt is made not only to ascertain the general
 performance level but also to determine for what occupation the
 specific abilities of the individual best fit him, the difficulty of
 the problem is tremendously increased.
 Shall the man of small competency be a ditch-digger or a stevedore?
 Is the citizen of mediocre ability best qualified to follow the vocation 
 of motorman, mechanic or clerk? Should the college student be 
 guided into industry, law or teaching? 
 
 These questions imply that the psychologist must also function 
 as a vocational adviser, and while this obligation may not at present 
 be generally accepted, the implication is nevertheless warranted. 
 Mental tests, if they are to be of value to society, must lead to 
 prognoses as well as to diagnoses and must at least offer to the indi- 
 vidual tested some information which may be useful in the attain- 
 ment of greater personal and social efficiency. In much the same 
 manner as the employment manager of today places the applicant 
 in some particular position in his organization, so the psychologist 
 of the future may find it possible to direct each member of society 
 to the one vocation which will best utilize his peculiar qualifications. 
 
 It is hardly necessary to point out that the problem of differen- 
 tiation becomes increasingly complex as the higher levels of intellec- 
 tual organization are approached. The idiot may be consigned to 
 custodial care with but small probability of error. The stevedore, 
 the scavenger, and the ditch-digger gravitate to their respective 
 occupations without perceptible friction. The "common people" 
 present a more difficult problem in view of their higher level of 
 performance and greater complexity of response, but even here note- 
 worthy advances have been made in recent years through the intro- 
 duction of vocational guidance and the application of psychological 
 principles to industrial management. Although investigation of 
 this character has hardly passed beyond the experimental stage, a 
 beginning has nevertheless been made, and remarkable developments 
 during the next decade may be confidently anticipated. 
 
 The task of differentiating the particular abilities required of 
 the successful plumber, mechanic, clerk, motorman, and telephone 
 operator — to mention only a few of the almost countless range of 
 occupations — is doubtless a difficult one, but it hardly approaches the 
 complexity of the problem presented in the guidance of individuals of 
 greater intelligence and higher intellectual organization to the one 
 vocation for which each is best fitted. While interest, personality, 
 and various external circumstances can not be disregarded as impor- 
 tant factors in the selection of the life work, the concern of the 
 psychologist lies primarily in the determination of the specific 
 abilities requisite to each type of professional activity, and in the 
 scientific evaluation of the particular abilities possessed by each 
 individual. It is with the latter phase of the problem that this inves- 
 tigation will deal, the interest being centered on the college student, 
 who, despite his many shortcomings, must be regarded as representa- 
 tive of the highest intellectual type of young manhood in the country. 
 
Historical. 
 
 The attempt to appraise the undergraduate by means of mental 
 tests must not be considered a new departure in the field of psy- 
 chology. The credit for the first scientific study of the American 
 college student goes to J. McKeen Cattell. Stimulated by his
 researches in the anthropometric laboratory of Francis Galton, he 
 inaugurated in 1887 a series of experiments with undergraduates at 
 Cambridge University, which investigation he continued at the
 University of Pennsylvania and Bryn Mawr College in 1888 and 1889,
 and in the following years at Columbia University. The report 
 entitled, "Physical and Mental Measurements of the Students of
 Columbia University", which appeared in the Psychological Review 
 for November, 1896, and in which Professor Cattell collaborated 
 with Dr. Livingston Farrand, was probably the first publication of 
 the results of a systematic study of the mental status of the college 
 student. This report is of peculiar interest today not only because 
 of its scope, but also in view of the surprising number of mental and 
 physical tests actually employed or suggested at that time which 
 now constitute the accepted instruments of every clinical psychologist. 
 While the purpose of the investigation was necessarily the establish- 
 ment of norms by the statistical treatment of the test results of one 
 hundred students, and the aim of the present study is rather the 
 observation of individual variation, it will nevertheless be of interest 
 to indicate briefly the character of the information recorded by 
 Cattell. Anthropometric measurements such as height, weight, and 
 cephalic diameters were noted, together with such physiognomic
 characters as the color of hair and eyes, and the size and shape of 
 ears. In addition, psychophysical determinations of visual and 
 auditory acuity, sensitivity to pain, and various types of reaction 
 time were made, as well as tests of a more strictly psychological 
 nature which included memory of drawn lines, memory of numbers 
 heard, cancellation test, color preference, types of imagery, and 
 others. 
 
 The investigation under consideration was carried on during 
 the academic years of 1894-95 and 1895-96, and it is of interest to 
 note that the results were published so as to be of assistance to a 
 committee appointed at the annual meeting of the American Psy- 
 chological Association held at Philadelphia in December, 1895, to 
 consider the feasibility of co-operation among the various psycho- 
 logical laboratories in the collection of mental and physical statistics. 
 This "Committee on Mental and Physical Tests", which consisted 
 of Professors Cattell, Baldwin, Jastrow, Sanford, and Witmer, may 
 well be said to have laid the foundation for all subsequent
 developments in the realm of psychological tests in its report to the
 Psychological Association at the meeting held in Boston in 1896. This
 report may be found in the Psychological Review of March of the 
 following year. 
 
 Having thus briefly indicated the inception of the present field 
 of investigation, it would be a thankless task to trace its history down 
 to the present moment in any adequate manner. Studies of this 
 character have been carried on in every psychological laboratory 
 connected with a college or university, and a complete bibliography 
 of the reports on the subject would cover many pages. It will be 
 well, however, to mention a few of the more important investigations 
 which have a direct bearing on the present problem, in so far as it 
 concerns the correlation of test results with academic standing. 
 Wissler (1) correlated the results published by Cattell and Farrand, 
 to which reference has been made above, with the university grades 
 assigned to the hundred students under consideration. Calfee (2) 
 has reported on "Four General Intelligence Tests" given to approx- 
 imately one hundred students at the University of Texas. Similar 
 investigations have been made by Rowland and Lowden (3) at 
 Reed College, Waugh (4) at Beloit College, and by Kitson (5) at the 
 University of Chicago. The last-named study is particularly worthy of
 note in that a very careful and intensive examination of forty students 
 was made. King and McCrory (6) report the results of tests on 
 five hundred freshmen at the University of Iowa, Caldwell (7) has 
 correlated the Intelligence Quotient of approximately one hundred 
 students at Randolph-Macon Woman's College, as determined by 
 the Adult Tests of the Stanford Revision, with college grades, and 
 Rogers (8) gives interesting results of her investigation at Goucher 
 College. In the reports mentioned above, Kitson and Caldwell also 
 record correlations between test results and estimated intelligence, 
 which will be referred to later in this discussion. Incomplete as is 
 the preceding sketch, it nevertheless gives some indication of the 
 wide-spread interest in the application of mental tests to the college 
 student. In this connection it will likewise be well to refer to the 
 comparatively recent development in the field of psychological
 entrance examinations, which are now demonstrating their
 practicability in a number of the larger universities, and which constitute
 a further ramification of the same problem.
 
 Experimental Conditions. 
 
 Stated briefly, the aim of the present study is to examine certain 
 data which have been collected relative to each member of the class 
 in elementary psychology at the University of Pennsylvania during
 the academic year 1919-20. This information consists of the score
 obtained in a "general intelligence examination", the results of a 
 series of psychological tests, a rating on estimated competency, and 
 a rating based on the academic standing of the individual as de- 
 termined by the final grades received in all courses completed at the 
 University. The treatment of results will be concerned with the 
 examination of correlations existing between the various ratings under 
 consideration, and with the scrutiny of the individual record with a 
 view to reaching, if possible, some conclusions which might be of 
 assistance to the student in the direction of his intellectual 
 development. 
 
 The investigation differs from many which have preceded it, in 
 that the psychological tests, with one exception, were given as a 
 part of the ordinary class instruction and therefore not primarily 
 as tests. The elementary work in psychology consists of two courses 
 known as Psychology 1 and 2, each requiring five hours of class 
 attendance and continuing throughout one semester. Since credit 
 in Psychology 1 is prerequisite to admission into Psychology 2, 
 the two courses may be considered as a single introductory course 
 lasting through the full academic year. Of the five hours of class 
 attendance per week, only one hour is occupied by a formal lecture, 
 the remaining four hours being devoted to laboratory work. During 
 the first semester a number of mental tests are given as a part of the 
 laboratory work and with the purpose of graphically demonstrating 
 the various factors which function in the formation and development 
 of the intellect. It is believed that this method enables the student 
 better to understand and appreciate the particular ability or men- 
 tal process under discussion. It is not claimed, therefore, that the 
 series of tests employed would necessarily have been chosen had the 
 purpose been the psychological examination and diagnosis of the 
 individual to the exclusion of other considerations. However, the 
 tests unquestionably provide a very satisfactory framework upon 
 which to build a logical presentation of systematic psychology as 
 well as offering a medium for the demonstration of fundamental 
 psychological processes. In addition, the tests are extremely valuable 
 to the student, in that they enable him to determine his peculiar 
 mental assets and liabilities through a comparison of his individual 
 results with accepted standards or class distributions. 
 
 Since the tests under consideration were given as a part of the 
 usual classroom procedure, the scientifically controlled conditions 
 which are generally regarded as indispensable to a psychological 
 investigation of this character were for the most part lacking. As 
 the class in Psychology 1 numbered more than two hundred students,
 the laboratory work was conducted in three sections with an average
 enrollment of approximately seventy. These three sections all met 
 in the same room, one being held at eight-thirty o'clock in the morn- 
 ing, another at two in the afternoon, and the third at three o'clock 
 on a different afternoon. While the time of meeting was constant 
 for each section, the variation in hour possibly affected the com- 
 parability of section results. With such a large number of students 
 in a laboratory class, some were necessarily seated at a greater 
 distance from the instructor than others and in addition a few were 
 near windows which may have provided distraction of one kind or 
 another. In some cases, the same test was given to the three sections 
 by different experimenters, and although the attempt was made to 
 adhere as closely as conditions would permit to the standard pro- 
 cedure, this variant may have affected the results to some extent. 
 In summary, lack of uniformity in the time of meeting of the different 
 sections, in the seating arrangement of the classroom, and in the 
 identity of the experimenter may be considered factors which expose 
 this investigation to criticism as being unscientifically conceived and 
 prosecuted. 
 
 The comparative absence of controlled experimental conditions, 
 however, cannot be said to invalidate the results. It is an open 
 question whether the environment imposed upon a subject by 
 scientifically controlled conditions elicits a more representative sam- 
 ple of behavior than that produced under less artificial circumstances. 
 Is the psychologist more interested in the reaction of a subject who 
 has been isolated in a sound-proof cabinet with a screen before his 
 eyes to eliminate distracting visual stimuli, or in the behavior of the 
 same individual as displayed in natural association with his fellows? 
 For some, the classroom would provide as unnatural an environment 
 as any that the experimentalist might impose, but for a group of 
 university students no more satisfactory or less distracting
 atmosphere could be selected than that of the recitation hall or laboratory.
 It is contended, therefore, that the experimental results here 
 presented provide an index of the mental status of the college 
 student as reliable as any that might have been obtained under 
 other conditions. 
 
 Having thus disposed in a somewhat arbitrary manner of any 
 criticisms which might be voiced against the general procedure fol- 
 lowed in this investigation, it will be well to consider the treatment 
 of the data collected before undertaking a description of the specific 
 tests employed. As has been indicated, the information available 
 concerning each member of the group here studied consists chiefly 
 of the results of a series of mental tests and the academic record of
 the student as displayed in his college grades. The problem of
 devising some statistical method by which the various scores and 
 grades may be made easily comparable is immediately encountered. 
 For example, a member of the class might have obtained a score of 
 131 in the general intelligence examination, a time rating of forty- 
 three seconds in a mechanical test, and he may have an audito- 
 graphic memory span of eight digits as well as a number of other 
 test results. In addition, his college record may show that he has 
 received the highest grade in 10 per cent of his academic work, a 
 passing grade in 70 per cent, and that he failed in the remainder of 
 his courses. The necessity of reducing these various values to some 
 common denominator so as to render them comparable is evident. 
 
 Perhaps the most natural procedure would have been to obtain 
 arithmetical averages of the results of each test and rate the indi- 
 vidual performance in terms of its variation from the average. After 
 determining a rank order in academic standing it would then have 
 been possible to calculate the correlations and intercorrelations 
 desired. Such a method is valuable in the examination and stand- 
 ardization of tests, but it has little to offer when the interest is 
 chiefly centered in the study of the individual rather than the tests, 
 and it has a tendency to obscure significant personal variations under 
 a mass of figures. Indeed, it is probable that correlation as a sta- 
 tistical method has been carried to extremes in recent psychological 
 investigations. When the results of two mental tests show a high 
 degree of correlation, it does not necessarily follow that they tap 
 two abilities which are mutually dependent; it may be rather that the
 tests have called the same ability or group of abilities into play.
 Conversely, a lack of significant correlation may show either that 
 one of the tests is unreliable or that the results are not dependent on 
 some common factor. If college psychological tests are designed to 
 call into play the same abilities which function in college grades, 
 such tests are useless unless a high degree of correlation with academic 
 standing can be demonstrated. On the other hand, the absence of 
 such correlation does not show the tests to be devoid of significance, 
 but merely that they measure other abilities or factors than are 
 predominant in the attainment of grades. Further, if it be admitted 
 that individual competency is the algebraic sum of the various 
 specific abilities and disabilities, then the ideal series of psychological 
 tests — which would include a different test for each special ability — 
 would show no significant intercorrelations for individuals at the 
 same level of general intelligence. 
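
 The rank-order procedure set aside above may be illustrated by a
 brief modern sketch. It is offered only as an illustration and forms
 no part of the method adopted in this study. The text names no
 particular coefficient; the rank-difference formula used here, the
 function name, and the sample ranks are all assumptions, and the
 formula holds only where no ranks are tied.

```python
def rank_difference_correlation(ranks_a, ranks_b):
    """Rank-difference correlation for two untied rank orders of equal length.

    Assumed formula: rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)),
    where d is the difference between the two ranks of one individual.
    """
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both rank orders must cover the same individuals")
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


# Hypothetical ranks of five students in one test and in academic standing.
test_ranks = [1, 2, 3, 4, 5]
grade_ranks = [2, 1, 4, 3, 5]
rho = rank_difference_correlation(test_ranks, grade_ranks)
print(rho)
```

 A coefficient near 1 would indicate that the two rank orders nearly
 coincide, and one near 0 that they are unrelated; neither outcome by
 itself shows which abilities the test has called into play.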
 
 The purpose here, therefore, is to present the material in such 
 form as best to facilitate the scrutiny of the individual record, rather
 than in the form most convenient for statistical treatment. Hence
 the various results must be rated on some common scale which has 
 steps of sufficient number to provide the necessary differentiation 
 without introducing a false accuracy. In addition, since many of 
 the tests used have not been scientifically standardized, it is im- 
 portant to adopt a rating system which will permit the comparison 
 of test scores with each other rather than with accepted standards. 
 
 A consideration of the many rating scales which lend them- 
 selves to the present purpose shows that the extremes are to be 
 found in the percentile and the two-division systems. It is hardly 
 necessary to enter into a discussion of the pseudo-accuracy of the 
 percentile grade. It is only in the very unusual case that the 
 material to be rated can be clearly enough differentiated to give any 
 real significance to each of the hundred points on the percentile scale. 
 Investigations have shown the wide variation in grades given by 
 different instructors to the same examination paper even in the 
 field of mathematics where the greatest accuracy might be expected. 
 This variation, however, is no greater than that shown in the grades 
 given by the same scorer to the same paper at different times. The 
 injustice done to the college student who receives a final mark of 
 69 per cent in the course which demands 70 per cent as a passing 
 grade has been commented upon too frequently to require more than 
 passing mention in this discussion. Obviously, the refinement of the 
 percentile scale is too great for the material here at hand. On the
 other hand, the system which merely distinguishes the "passing" from
 the "not passing" does not provide sufficient differentiation for 
 analytic examination of the results of a series of mental tests. 
 
 Popular acceptance would seem to have stamped its seal of 
 approval on a five-division rating scale. Cabbages and kings alike 
 are usually judged mediocre, good or very good, poor or very poor. 
 The great majority of our quantitative expressions are given in 
 these terms, and the system seems to provide a sufficient number 
 of significant levels without introducing the fallacy of too great 
 refinement. This psychological justification of the five-point scale, 
 as well as other considerations of convenience and facility of compari- 
 son, led to its adoption as the most satisfactory method of treating 
 the various results and scores herein presented. In accordance with 
 this decision, the results of each test given to the two hundred students 
 who comprised the class in elementary psychology were arranged in 
 rank order and separated into quintiles. While the nature of some 
 of the tests has made even such a coarse rating as this quite difficult, 
 it is believed that the system adopted is the most practicable that 
 could have been devised for the present purpose. Since all grades
 assigned in the School of Arts and Science at the University of
 Pennsylvania are recorded in terms of a five-point system, an added
 advantage is gained in the comparison of test scores with academic 
 success. 
 
 The results tabulated in a later section will therefore not be 
 found to contain the number of digits for the memory span, the 
 number of seconds required for the completion of the cylinder test, 
 or the number of problems correctly solved in the general intel- 
 ligence examination, but instead the translation of each of these 
 scores into a quintile rating. If the performance of an individual
 places him in the best twenty per cent of the class in a particular
 test, he is given a rating of "5"; if in the poorest fifth of the group
 of two hundred, his quintile grade is "1". The three intermediate
 quintiles, upper, middle, and lower, are represented by "4", "3", and
 "2", respectively. By thus evaluating a given performance in terms of the class
 results, it will be found a relatively simple matter to scrutinize the 
 ratings for each individual and gain a fairly trustworthy impression 
 of his standing in an unselected group of university students, and at 
 the same time to note his peculiar mental assets and liabilities. 
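
 The translation of raw scores into quintile ratings may be sketched
 in modern terms as follows. The sketch is illustrative only: the
 function name and the sample values are hypothetical, and the
 treatment of tied scores (which here simply keep their original
 order) is an assumption, since the text does not say how ties were
 resolved.

```python
def quintile_ratings(scores, higher_is_better=True):
    """Rate each score from 1 (poorest fifth) to 5 (best fifth).

    Scores are placed in rank order from poorest to best and the rank
    order is cut into five equal parts. Ties keep their input order,
    which is an assumption and not a rule taken from the thesis.
    """
    n = len(scores)
    # Indices of the scores, ordered from poorest to best performance.
    order = sorted(range(n), key=lambda i: scores[i],
                   reverse=not higher_is_better)
    ratings = [0] * n
    for rank, i in enumerate(order):
        ratings[i] = rank * 5 // n + 1
    return ratings


# Hypothetical examination scores: a larger score is a better performance.
exam_scores = [131, 98, 120, 87, 142, 110, 95, 125, 104, 116]
print(quintile_ratings(exam_scores))

# Hypothetical times in seconds: a shorter time is a better performance.
cylinder_times = [43, 60, 30, 75, 52]
print(quintile_ratings(cylinder_times, higher_is_better=False))
```

 Because every test is reduced to the same five steps, a score in
 problems solved and a time in seconds become directly comparable,
 which is the purpose of the common scale.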
 
 Selection of Group. 
 
 Since it is the aim of this investigation to discover individual 
 differences in a comparatively homogeneous group of students, it 
 seemed advisable to make certain eliminations before undertaking an 
 intensive study of test scores and college grades. Of the 220 students 
 who registered for Psychology 1 at the beginning of the session of 
 1919-20, fifteen withdrew before the work of the semester was really 
 under way, reducing the class to an actual enrollment of 205. Of
 these, 125 were taking the course in the School of Arts and Science, 
 the remainder being students in the School of Education. This split 
 also gives the approximate ratio of men to women in the class. Dur- 
 ing the semester twenty members of the class were dropped because 
 of deficiency or received a failure upon the termination of the course 
 which excluded them from participation in Psychology 2. Since it 
 was deemed advisable to make the completion of both courses one 
 of the requisites for inclusion in this study, these twenty students 
 were automatically eliminated. In order to obtain homogeneity it 
 was also decided not to introduce sex differences but to limit the 
 investigation to male students enrolled in the School of Arts and 
 Science. Of the 125 men who originally started the course only 113 
 were eligible for Psychology 2, and of these only eighty received 
 final grades at the end of the second semester. Since one of the 
 ratings to be taken into consideration is based on academic standing,
 it was thought best not to include first-year students in the selected
 group, thereby eliminating all who were not able to survive at least 
 one year of university work, reducing the variation in age, and at the 
 same time making it possible to base the academic rating on college 
 grades received during two or more years of class attendance. 
 
 When these eliminations had been made, fifty-one students 
 were eligible for inclusion in this study. Of this number, one indi- 
 vidual over thirty years of age was arbitrarily excluded as not con- 
 forming to the normal college age. Of the fifty remaining as subjects 
 of this investigation, thirty-three had sophomore standing, twelve 
 were rated as juniors, and five were seniors. The average age for 
 the group as of October 1, 1919, was 20.8 years, that of the sopho- 
 mores being 20.5 years, and of the juniors and seniors, 21.3 and 21.4 
 years respectively. Although the averages in the latter cases are 
 not of great significance due to the small size of the groups in question, 
 the figures quoted do show that the larger group of fifty is composed 
 of students of approximately normal college age. In conclusion, it 
 will be well to point out that although only about one-fourth of the 
 total class in psychology is to be included in the study, the selection 
 was made on the basis of group qualifications and without regard to 
 individual merit, except for the automatic elimination of those mem- 
 bers of the class who were excluded for deficiency in scholarship. 
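
 The figures quoted in this section may be checked against one
 another by a short calculation. The sketch below uses only numbers
 stated in the text; the variable names are hypothetical, and the
 calculation is a consistency check, not part of the original
 procedure.

```python
# Counts stated in the text.
registered = 220
withdrew_early = 15
enrolled = registered - withdrew_early                  # actual enrollment

arts_and_science = 125
education = enrolled - arts_and_science                 # "the remainder"

# Academic standing of the selected group: (number of students, average age).
standing = {
    "sophomore": (33, 20.5),
    "junior": (12, 21.3),
    "senior": (5, 21.4),
}

group_size = sum(count for count, _ in standing.values())

# Average age of the whole group, weighting each class by its size.
mean_age = sum(count * age for count, age in standing.values()) / group_size

# Share of the enrolled class retained for intensive study.
fraction_of_class = group_size / enrolled

print(enrolled, education, group_size)
print(round(mean_age, 1), round(fraction_of_class, 2))
```

 The weighted average of the three class averages agrees with the
 stated group average of 20.8 years, and fifty of 205 students is
 slightly under one-fourth of the class.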
 
 The Psychological Tests. 
 
 The psychological tests included a general intelligence examina- 
 tion, the "Psychological Examination for College Freshmen and 
 High School Seniors", devised by Professor L. L. Thurstone, and the 
 following thirteen tests designed to exercise some particular ability 
 or group of abilities: (1) Ausfrage (Observation) Test, (2) Taylor 
 Number Test, (3) Memory Span for Digits, (4) Memory Span for 
 Syllables, (5) Memory Span for Ideas, (6) Description of Formboard, 
 (7) Trabue Language Test, (8) Courtis Arithmetic Test, (9) Differ- 
 ences and Likenesses Test, (10) Opposites Test, (11) Definitions 
 Test, (12) Humpstone Memory Test, (13) Witmer Cylinder Test. 
 
 The tests were given in the order indicated, and, with the excep- 
 tion of the Witmer cylinder test, all were given during the first half 
 of the academic year, or, in other words, as part of the laboratory 
 work in Psychology 1. The cylinder test was given in connection 
 with the competency rating toward the close of the second semester, 
 and it is the only one of the series which was given as an individual 
 and not as a group test, and likewise it alone was given primarily as 
 a test and not for its didactic or illustrative value. Of the series 
 employed, the memory span for digits, the Trabue sentence
 completion, the Courtis arithmetic, and the Witmer cylinder tests are all in
 general use and have been carefully standardized. The Ausfrage, 
 memory span for syllables and for ideas, description, differences and 
 likenesses, opposites, and definitions tests have merely been adapted 
 to the present instructional aims, while the Taylor number test and 
 the Humpstone memory test are here described for the first time. 
 
 Before undertaking a description of the various tests it will be 
 well to note in connection with the scoring that the quintile ratings 
 were in each case based on the results of the class of approximately 
 two hundred students and not on the relative performance of the 
 fifty here to be considered. 
 
 Thurstone Psychological Examination. 
 
 On the afternoon of October 26, 1919, some fifteen hundred first- 
 year students in the various undergraduate schools of the University 
 of Pennsylvania were given the Thurstone "Psychological Exam- 
 ination for College Freshmen and High School Seniors", the experi- 
 ment being conducted by the Department of Admissions in co- 
 operation with a number of other colleges and universities in the 
 state of Pennsylvania. At the same hour, the examination was 
 given to approximately 120 students who were then meeting in 
 different sections of Psychology 1, with the purpose of comparing 
 the scores obtained by this relatively selected group, which included 
 no freshmen, with the results of the larger first-year group. The
 fifty students who form the basis for this investigation all took the 
 examination as members of laboratory classes in psychology. 
 
 Description: The form which was used is known as "Test IV, 
 Edition of September, 1919 — issued by L. L. Thurstone of the 
 Carnegie Institute of Technology". The examination consists of 
 168 short problems which are to be solved in order. The printed 
 directions on the cover of the pamphlet, and the specific nature of 
 the instructions for each problem greatly simplify the administration 
 of the test. The important timing element, which is a complicating 
 factor in such examinations as the Army Alpha and the Otis intelli- 
 gence test, is practically eliminated in this case. The directions, 
 which are read by the examinee before the beginning of the exami- 
 nation, state that thirty minutes will be given in which to solve as 
 many problems as possible. The problems are to be taken in order, 
 but instructions are also given to skip any which may not be under- 
 stood. The task of the examiner, therefore, is merely to call attention 
 to the directions after the pamphlets have been distributed, and to 
 give the appropriate signals at the beginning and end of the thirty- 
 minute period. Although the subject is directed to solve the prob- 
 lems in order, the final score is determined solely by the number of 
 correct solutions without reference to errors or omissions. 
 
 The 168 problems which compose the examination are arranged 
 in what is known as the cycle-omnibus form. In other words, while 
 only six different tests are employed, the separate problems which 
 go to make up each test appear in rotation instead of being grouped 
 together as is more usually the case. The examination may readily 
 be analyzed into a number of sets of eight problems each, and in 
 each set all of the six types of tests occur in regular order. The 
 first two problems in each group form part of a general information 
 test, while the next two are a variation of the familiar analogies test, 
 and the fifth is a sentence completion test taken from the language 
 scales devised by Trabue. The sixth problem in each set is of the 
 type known as the syllogism test, and the seventh, referred to by 
 Thurstone as the reading test, is a form of the widely-used proverbs 
 test. The last problem of the group is an example of the number 
 completion test. 
 
 Since eight of the 168 problems are preliminary samples for 
 which the correct solution is given, the examination actually con- 
 sists of only 160 problems of which forty comprise a test of general 
 information, an equal number form an analogies test, while each of 
 the other types is represented by twenty problems. The final score 
 is therefore weighted in the direction of information and analogies. 
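
The arrangement just described can be checked with a short sketch. It assumes, since the text says only that the samples are "preliminary", that the eight sample problems make up the first complete set of eight; the names `SET_LAYOUT`, `problem_type` and `scored_counts` are introduced here for illustration and are not Thurstone's.

```python
from collections import Counter

# Order of the test types within one set of eight problems,
# following the description in the text.
SET_LAYOUT = [
    "information", "information",
    "analogies", "analogies",
    "sentence completion",
    "syllogism",
    "reading (proverbs)",
    "number completion",
]

TOTAL_PROBLEMS = 168
SAMPLE_PROBLEMS = 8  # assumption: the samples are the first set of eight


def problem_type(problem_number):
    """Return the test type for a 1-based problem number."""
    if not 1 <= problem_number <= TOTAL_PROBLEMS:
        raise ValueError("problem number out of range")
    return SET_LAYOUT[(problem_number - 1) % len(SET_LAYOUT)]


def scored_counts():
    """Count the scored problems of each type, skipping the sample set."""
    return Counter(
        problem_type(n)
        for n in range(SAMPLE_PROBLEMS + 1, TOTAL_PROBLEMS + 1)
    )


counts = scored_counts()
```

Under that assumption the counts agree with the figures in the text: forty problems each for information and analogies, and twenty for each of the remaining four types.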
 
 Discussion: It is not the present intention to enter into a 
 lengthy criticism of the validity of general intelligence tests. Ever 
 since the Binet-Simon scale came into popular use, this question 
 has been discussed with varying degrees of fervor, and the many 
 recent additions to the store of group tests, which have appeared as 
 an aftermath of the army series, have served to keep the controversy 
 before the psychological eye. Even the most conservative intro- 
 spectionist must admit that the army tests performed a valuable 
 service in the stratification of the National Army, and that satis- 
 factory results are being obtained at several of the larger univer- 
 sities by the admission of students on the basis of group psychological 
 examinations in lieu of the traditional entrance requirements. The 
 general intelligence test is of established significance in the differ- 
 entiation of the various well-recognized levels of performance. The 
 question which must be broached here is whether it is of equal 
 significance when applied to individuals at the same general intellec- 
 tual level, and particularly whether it discloses any information of 
 value relative to the college student. 
 
 It may be contended that the Thurstone examination is designed 
 for the elimination of applicants for admission, and that significant 
 results are not to be expected when the test is applied to students 
 who have not only met the entrance requirements but have success- 
 fully completed at least one year of college work, as is the case with 
 the present group. Nevertheless, it seems profitable to inquire into 
 the particular abilities called into play when the examination is 
 submitted to college students. A mere inspection of the series of 
 problems quoted above will demonstrate that the correct solutions 
 could be given by any person of the intellectual level of the college 
 student were unlimited time at his disposal. An exception to this 
 statement must be made in the case of a few general information 
 questions, which are so designed that no individual would be likely 
 to give correct answers to all. Hence, whatever the abilities in- 
 volved in the solution of the six different tests of which the exam- 
 ination is composed, the score obtained is primarily an index of mental 
 alertness or of the rapidity of the reasoning processes and not of 
 what is usually termed general intelligence. If the colleges wish to 
 admit candidates on the basis of the speed with which a problem can 
 be solved and without regard to the proportion of correct solutions, 
 then the Thurstone examination should be found very satisfactory. 
 Or if experimentation can demonstrate that the rapid thinker is also 
 the accurate thinker, this type of test will be equally acceptable. 
 In this connection it is interesting to note that a correlation between 
 the score on the Thurstone test and the percentage of correct answers 
 to the total number attempted shows the unexpectedly high coeffi- 
 cient of +0.74 (Pearson) in the case of fifty results chosen at random. 
 This would seem to indicate that accuracy and speed are closely 
 related, and must be considered as arguing for the validity of the 
 examination. A study of the same fifty cases shows that on the 
 average only 85 per cent of the solutions given were correct, the 
 syllogism test being the most difficult with 23 per cent incorrect, 
 while the greatest accuracy was shown in the analogies, sentence 
 completion, and number completion tests, each of which had an 
 error of only 10 per cent. 
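
As an illustration of the computation involved, the following sketch correlates the score (number of correct solutions) with the percentage of correct answers among those attempted, using the Pearson product-moment formula. The five records shown are hypothetical and are not taken from the fifty cases of the study; the function names are likewise introduced here for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


def percent_correct(correct, attempted):
    """Accuracy: percentage of attempted problems solved correctly."""
    return 100.0 * correct / attempted


# Hypothetical records, NOT data from the study: (correct, attempted).
records = [(60, 80), (90, 100), (120, 130), (75, 95), (105, 112)]
scores = [c for c, _ in records]
accuracy = [percent_correct(c, a) for c, a in records]
r = pearson_r(scores, accuracy)
```

A coefficient near +1 would mean that the subjects who solve the most problems are also the most accurate, which is the relation the reported figure of +0.74 suggests.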
 
 Although the "cycle-omnibus" type of examination has marked 
 advantages, chief among which are simplicity in administration and 
 scoring, one important weakness must be noted. Assuming that 
 the six tests which compose the examination call into play different 
 abilities, it is often desirable to analyze a given score in order to 
 determine individual assets or deficiencies. In other words, a low 
 score might be due either to a poor performance in all six of the 
 tests, or to a particularly deficient result in any one of them, such 
 as the general information test. While the score would be the same 
 in both cases, its significance would be very different. Most of the 
 general intelligence tests are so arranged that the scores for the dif- 
 ferent parts of the examination are readily available for comparison. 
 In the case of the cycle-omnibus, however, an analysis of the various 
 test results is practically impossible in view of the undue expendi- 
 ture of time and effort required. 
 
 As in the case of the other tests, the class scores for the Thurstone 
 examination were arranged in rank order and quintiled. The rating 
 for each individual in the table of results shows the quintile grade 
 and not the actual score. A discussion of the results and correlations 
 obtained will appear in a later section. 
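
The conversion of scores into quintile ratings may be sketched as follows. The text does not state how tied scores were treated, nor which end of the scale was numbered first; the sketch assumes that quintile 1 is the top fifth of the class and that tied scores share the rank of the first tied position. The function name is introduced here for illustration.

```python
def quintile_ratings(scores):
    """Map each score to a quintile from 1 (top fifth) to 5 (bottom fifth).

    Assumptions: higher scores are better, and tied scores share the
    rank of the first tied position in the rank order.
    """
    n = len(scores)
    ordered = sorted(scores, reverse=True)  # the class in rank order
    first_rank = {}
    for rank, score in enumerate(ordered):
        # Keep only the first (best) position at which a score occurs.
        first_rank.setdefault(score, rank)
    # Positions 0..n-1 are divided into five equal bands.
    return [first_rank[score] * 5 // n + 1 for score in scores]


ratings = quintile_ratings([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
```

With ten distinct scores, two individuals fall into each quintile, so that a rating reports relative standing in the class and conceals the actual score.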
 
 Ausfrage Test. 
 
 Description: This test is a variation of the familiar Aussage 
 test, differing from it only in that specific questions are asked. In 
 the first part of the test a picture was thrown on the screen with the 
 aid of a stereopticon and the class allowed to examine it for two 
 minutes, the following instructions having previously been given: 
 "I am going to throw a picture on the screen. While it is there I 
 want you to do nothing but look at it. When I have finished I will 
 ask you to answer some questions." Upon the removal of the 
 picture, ten questions were asked relative to different objects which 
 may or may not have appeared in the picture. 
 
 The second part of the test consisted of a series of ten questions 
 based on observation of the university buildings and campus and of 
 the city of Philadelphia. In both parts of the test written answers 
 were obtained. 
 
 In scoring the results, each correct answer received one point, 
 giving a maximum score of twenty points. The class results were 
 distributed in rank order and quintile ratings determined. 
 
 Discussion: The ability primarily involved in this test is that 
 of observation, which implies attending to something and making 
 note of it for a purpose. In this case, the stimulus was visual, and 
 therefore visual sensibility and discrimination are essential. It may 
 be assumed, however, in connection with this test as well as those 
 which follow, that every member of the class was equipped with the 
 necessary sensibility and with the psycho-motor apparatus involved 
 in the recording of results, and these factors will therefore be dis- 
 regarded in discussing the various tests. Analytic concentration 
 and distribution of attention play a part in the process of observa- 
 tion, as does the factor of associability, which will be discussed at 
 some length in connection with memory span. Memory enters but 
 little into the first part of the test, since the retention required is of 
 brief duration, but it must be considered an important element in 
 the second part. While all of the abilities mentioned are involved, 
 the test may be regarded as primarily one of observation. 
 
 Taylor Number Test. 
 
 Description: The test material consists of a sheet of white paper 
 8½ x 10 inches in size, upon which are distributed in a haphazard 
 arrangement the numbers from 1 to 50, inclusive, printed in half-inch 
 bold-face black type. One sheet was handed to each student with 
 the numbered side of the paper downward, while the following 
 directions were given: "I am going to give each of you a sheet of 
 paper. I want you to let it lie on your desk until I tell you what 
 to do with it. When I am ready I shall give three commands, the 
 first, 'Ready', the second, 'Turn', and the third, 'Go'. When you 
 turn the paper, turn it from the right side over to the left, and in 
 the upper left hand corner you will find the number '1'. On the 
 paper are the numbers from 1 to 50, not arranged in any regular 
 order, but scattered over the sheet. As soon as you have turned 
 the paper, place your pencil on number 1. When I say 'Go', draw 
 a straight line to number 2, then to 3, and go on in order to each 
 number until I say 'Stop'. When I say 'Stop' hold your pencils in 
 the air immediately." 
 
 A time limit of forty seconds was allowed for the test, and the 
 results were then scored on the basis of the highest number reached. 
 The distribution of class results was made and the quintile ratings 
 obtained. 
 
 Discussion: In so far as is known, this test was devised by 
 Mr. Charles K. Taylor and was first used a number of years ago in 
 the Psychological Laboratory of the University of Pennsylvania. 
 When repeated a number of times the Taylor number test serves as 
 an excellent index of trainability, but when only one trial is allowed 
 it must be considered a test of alertness or distribution of attention. 
 In many ways this test is similar to the more familiar "Cancellation 
 Test", but it has the advantage of providing no definite cues to 
 exploitation, since great care was taken not to arrange the numbers 
 on the sheet in an orderly manner. In addition, the goal is con- 
 stantly changing in this test while it remains constant in the can- 
 cellation test, where the aim is to locate some particular letter or 
 digit. 
 
 It would seem unlikely that discrimination of form would be a 
 factor worthy of consideration in the performance of this test by 
 college students, but under the conditions of rapid exploration which 
 usually exist this element cannot be overlooked. The test also has 
 an important motor phase, and coordination and control of move- 
 ment play a rather important part in the result. However, the 
 higher scores may be attributed to good distribution of attention 
 coupled with methodical exploration. 
 
 Memory Span for Digits. 
 
 Description: The material for this test consists of twenty series 
 of digits, ranging in length from three to twelve digits, and including 
 two series of each length. The series used were employed by H. J. 
 Humpstone in his standardization of the test, and were so prepared 
 that no two digits occur in the natural order or in the reversed order, 
 no two succeeding series begin with the same digit, and no digit is 
 repeated except in the series of ten or more. Zero is not used. 
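
The rules of construction stated above lend themselves to a mechanical check, sketched here. The phrase "natural order or reversed order" is read as applying to neighbouring digits (so that 3 followed by 4, or 4 followed by 3, is excluded); this reading is an assumption. The function names and the series used to exercise them are hypothetical and are not Humpstone's material.

```python
def valid_series(series):
    """Check one digit series against the stated construction rules."""
    if len(series) == 0:
        return False
    # Zero is not used.
    if any(d < 1 or d > 9 for d in series):
        return False
    # Assumed reading: neighbouring digits must not be consecutive
    # numbers, whether ascending (3, 4) or descending (4, 3).
    if any(abs(a - b) == 1 for a, b in zip(series, series[1:])):
        return False
    # No digit is repeated unless the series has ten or more digits.
    if len(series) < 10 and len(set(series)) != len(series):
        return False
    return True


def valid_set(all_series):
    """Check a whole set: every series is valid, and no two succeeding
    series begin with the same digit."""
    if not all(valid_series(s) for s in all_series):
        return False
    return all(a[0] != b[0] for a, b in zip(all_series, all_series[1:]))
```

Since only the nine digits from 1 to 9 are available, repetition cannot be avoided in a series of ten or more, which accounts for the exception in the rule.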
 
 The instructions given were those used by Humpstone (9). 
 "This is an experiment. In every experiment it is necessary for 
 everyone who takes part to do just what the experimenter asks. 
 Please do just as I ask you to. I am going to say some numbers. 
 While I say them I do not want you to do anything except look at 
 me and hold your pencils up where I can see them. When I put my 
 pencil down you write on your paper the numbers I have said." 
 The digits were then pronounced at the rate of one per second, with- 
 out rhythm or change of intonation except that on the last one of a 
 series the voice was allowed to fall as a signal for reproduction. In 
 each case the number of digits in the series was announced before 
 the series was given. In scoring the results, the number of digits 
 in the longest series correctly reproduced is considered the memory 
 span. The quintile ratings are based on the scores thus obtained. 
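
The scoring rule may be sketched as follows, the memory span being taken as the length of the longest series reproduced without error. The function name and the three trials are hypothetical and serve only to illustrate the rule.

```python
def memory_span(trials):
    """Return the length of the longest series reproduced correctly.

    `trials` is a list of (presented, reproduced) pairs of digit lists.
    Returns 0 if no series was reproduced correctly.
    """
    return max(
        (
            len(presented)
            for presented, reproduced in trials
            if list(presented) == list(reproduced)
        ),
        default=0,
    )


# Hypothetical trials: the five-digit series has two digits transposed.
trials = [
    ([5, 8, 2], [5, 8, 2]),
    ([7, 1, 4, 9], [7, 1, 4, 9]),
    ([3, 6, 1, 8, 5], [3, 6, 8, 1, 5]),
]
span = memory_span(trials)
```

In this illustration the span is four, since the five-digit series contains the right digits but in the wrong order.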
 
 Discussion: According to Professor Humpstone, "It has been 
 assumed by almost everybody who has written on the test that it 
 tests memory. A careful analysis causes us to doubt the validity of 
 the assumption. Some imagination is required. The subject must 
 have enough imageability to get perceptions of the stimuli .... 
 In the same sense memory is involved. The images must be retained 
 long enough for reproduction. But this period is so brief that the 
 results do not furnish any criterion by which to judge of retentive- 
 ness .... Attention is involved also .... The ability to dis- 
 tribute the attention well is doubtless an aid in the performance." 
 He continues, "Perhaps the memory span test comes nearer to testing 
 one definite ability than any other test. Whatever other factors or 
 abilities enter into the performance of this test it is clear that the 
 thing specifically tested is the ability to grasp and associate a number 
 of discrete units of perception in a definite order. This is not memory 
 as pointed out above. We are using the term associability and 
 subsuming it under the general heading imagination. Associability 
 refers to the 'number of discrete perceptions associated in a single 
 act of attention, and the combination of the associated component 
 parts of a single perception'." 
 
 While the memory span test is of great value in the examination 
 of the mentally retarded, and it can be said without fear of contra- 
 diction that a memory span of four and probably of five is pre- 
 requisite to intellectual development, the test loses much of its 
 significance when applied to a group of college students. Certainly 
 in the case of the higher scores the result has been exaggerated by 
 means of grouping, and the factor of planfulness plays an important 
 part. The lower memory spans of five and six digits are probably 
 of greater significance. 
 
 Memory Span for Syllables. 
 
 Description: The subject-matter of the test consists of sixteen 
 sentences ranging in length from ten to fifty syllables. The series 
 provides two sentences at each of the various levels, namely ten, 
 twenty, twenty-five, thirty, thirty-five, forty, forty-five, and fifty 
 syllables. The sentences were prepared by H. J. Humpstone and 
 were all taken from a popular current periodical, so as to obtain 
 material of a non-technical character which would be of suitable 
 difficulty and complexity for the ordinary adult. In each pair of 
 sentences, the first is designed to encourage visual imagery, while 
 the second is of a more abstract nature and does not lend itself 
 readily to any type of sensory imagery. 
 
 In administering the test, the sentences were read aloud with 
 natural expression, the class having been instructed to reproduce 
 each sentence graphically, immediately following its presentation. 
 The number of syllables in the longest sentence reproduced verbatim 
 was considered the memory span for syllables for each individual. 
 The scores thus obtained were distributed in the usual manner and 
 quintiled. 
 
 Discussion: The test here described is an adaptation of the 
 "repeating syllables" test used by Binet and modified by Terman 
 (10) in the Stanford revision of the Binet-Simon scale. It was felt 
 that the sentences used by Terman in the average adult group were 
 not well suited to the college student, and it was also desired to 
 extend the series beyond twenty-eight syllables. Results so far 
 obtained with the Humpstone sentences show a maximum span of 
 forty syllables, a minimum of twenty, with a decided mode at thirty 
 syllables. An analysis of a large number of results has shown no 
 significant difference in the difficulty of the visual and abstract 
 sentences. 
 
 The test may be said to measure the integrated memory span. 
 While the factor of associability is probably predominant, the ele- 
 ments in this case are not discrete as in the memory span for digits, 
 and reproduction calls for a higher degree of intellectual organization. 
 Memory is a more important factor than in the span for digits, since 
 the period of retention is somewhat longer, but again it cannot be 
 considered the ability primarily tested. Language ability is cer- 
 tainly involved but the popular character of the sentences employed 
 minimizes its importance when the test is applied to college students. 
 The use of tests of this nature as a measure of proficiency in a foreign 
 language is suggested in this connection. The memory span for 
 syllables must be considered an index of integrability rather than of 
 simple associability. 
 
 Memory Span for Ideas. 
 
 Description: The paragraph beginning "Tests such as we are 
 now making" from the superior adult series of the Stanford revision 
 was used as the material for the test. The standard directions were 
 given with necessary modification for graphic instead of oral repro- 
 duction, as follows: "I am going to read a little selection of about 
 six or eight lines. When I am through I will ask you to write as 
 much of it as you can. It doesn't make any difference whether you 
 remember the exact words or not, but you must listen carefully so 
 that you can write down everything it says." The paragraph was 
 then read at a natural rate, following which adequate time was 
 allowed for reproduction. 
 
 The results were scored on the basis of the number of ideas 
 correctly recorded, the paragraph having been analyzed into sixteen 
 discrete ideas. The scores thus obtained were arranged in rank 
 order and the quintile ratings determined. 
 
 Discussion: While this test is spoken of by Whipple and others 
 as a measure of logical as contrasted with rote memory, Terman calls 
 attention to the fact that it is rather a test "of ability to comprehend 
 the drift of an abstract passage". It seems more satisfactory, how- 
 ever, to regard the memory span for ideas as a natural sequent to 
 the spans for digits and syllables. It will readily be granted that 
 the college student, who receives most of his mental pabulum through 
 the medium of lectures, can comprehend the drift of such a passage 
 as the one here employed. The test must, therefore, be considered 
 a measure of the subject's ability to associate in consciousness a 
 number of logically related ideas. That this requires a higher level 
 of intellectual organization than the verbatim reproduction of a 
 sentence, as in the memory span for syllables, is hardly open to 
 question. The test, then, involves not only the element of associ- 
 ability but likewise a high degree of understanding and of intellect. 
 It would therefore be reasonable to expect this test to be more 
 significant when applied to college students than either the memory 
 span for digits or for syllables. 
 
 While there may be some disagreement as to what constitutes 
 the unit idea which is to be used as the basis for scoring, the method 
 employed by Terman is too vague for the present purpose, and it is 
 believed that the comparative results obtained by any logical scoring 
 system will be significant. 
 
 Description Test. 
 
 Description: The Witmer formboard, a modification of that of 
 Seguin, was used as the object to be described. The Witmer board 
 provides recesses for eleven forms, namely the square, rectangle, 
 cross, oval, semicircle, star, equilateral triangle, isosceles triangle, 
 hexagon, circle, and diamond. The following instructions were 
 given: "I have here an object. I am not going to give you a name 
 for it. You can call it a 'thing'; call it 'X'. I want you to pass 
 it around so that each one in the class has an opportunity to examine 
 it." A number of formboards were then passed about the class, 
 and after six minutes had been allowed for examination they were 
 collected and placed on tables in different parts of the laboratory 
 where they could easily be seen. Further instructions were then 
 given. "What is it? In answer to that question I want you to 
 write a description in such a way that anyone would understand and 
 recognize this object. You will be allowed twenty minutes in which 
 to write this paper." 
 
 Upon the completion of the twenty-minute period, the written 
 descriptions were collected and redistributed to other members of 
 the class. The number of "points of description" to be used as a 
 basis for scoring of results was then determined in an open discussion. 
 The scores, which were later translated into quintile ratings, were 
 therefore based on an empirical rather than an arbitrary standard. 
 
 Discussion: The term "description" as used in this test has 
 reference, not to a literary form, but to the enumeration of the 
 salient characteristics of the object in question. The test is obviously 
 related to the Ausfrage test previously discussed, in that observation 
 is an important factor. In this case, however, memory plays no 
 part, since the object to be described is displayed throughout the 
 twenty-minute period. The test resembles the Aussage test in that 
 no specific questions are asked, the score being based instead on the 
 number of points of description noted. The problem must therefore 
 be considered one of analysis, and the ability primarily involved may 
 be termed analytic concentration of attention. This ability is con- 
 trasted with the distribution or alertness of attention called for in 
 the Taylor number test. 
 
 The description test was first used by Binet, who stated that 
 individual psychology can be more readily studied through the 
 examination of complex rather than simple mental processes. The 
 test, in the form of description of pictures, is found in the Binet- 
 Simon scale as well as in the Stanford Revision. When applied to 
 children, the qualitative aspect of the description, whether mere 
 enumeration of points or interpretation, is of more significance than 
 the quantitative score used in this case. 
 
 Sentence Completion Test. 
 
 Description: Language Scale "K", devised by M. R. Trabue 
 (11), was employed in this test. Owing to its wide familiarity, it is 
 only necessary to remark in this connection that Scale K consists 
 of seven sentences which are arranged in the order of increasing 
 difficulty. Certain words in each sentence have been omitted and 
 the subject is asked to supply the missing words. The procedure 
 standardized by Trabue was adhered to, the following explanation 
 being given before the distribution of the forms. 
 
 "This sheet contains some incomplete sentences which form a 
 scale. This scale is to measure how carefully and rapidly you can 
 think, and especially how good you are in language work. You are 
 to write one word on each blank, in each case selecting the word 
 which makes the most sensible statement. You may have just five 
 minutes in which to sign your name at the top of the page and write 
 the words that are missing. The papers will be passed to you with 
 the face downward. Do not turn them over until we are ready. 
 After the signal is given to start, remember that you are to write 
 just one word on each blank and that your score depends on the 
 number of perfect sentences you have at the end of five minutes." 
 
 The forms were then distributed and the following additional 
 instructions given. "After you have been working five minutes, I 
 shall say, 'The time is up. All stop writing!' You will all please 
 stop at once and lay aside your pens (or pencils). Now if you are 
 all ready, you may turn your papers, sign your names and fill the 
 blanks." 
 
 In scoring the results the method recommended by Trabue was 
 followed, a sentence perfectly completed being given two points, 
 one point being allowed where the idea was right but the best word 
 not supplied, and a score of zero received where the completion was 
 unsatisfactory or omitted. The total number of points for the test 
 was determined and the quintile ratings given. 
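
The scoring method just described amounts to the following sketch. The judgment labels (`"perfect"`, `"idea_right"`, and so on) and the function name are introduced here for illustration and are not Trabue's terms; the judgment itself remains a matter for the scorer.

```python
# Points per sentence under the scoring method described in the text.
POINTS = {
    "perfect": 2,         # sentence perfectly completed
    "idea_right": 1,      # idea right, but the best word not supplied
    "unsatisfactory": 0,  # completion unsatisfactory
    "omitted": 0,         # sentence not attempted
}

SENTENCES_IN_SCALE_K = 7
MAX_SCORE = 2 * SENTENCES_IN_SCALE_K  # fourteen points


def scale_score(judgments):
    """Total points for one paper, given one judgment label per sentence."""
    if len(judgments) != SENTENCES_IN_SCALE_K:
        raise ValueError("Scale K has seven sentences")
    return sum(POINTS[j] for j in judgments)


# Hypothetical paper: four perfect sentences, one partly right, two failed.
paper = ["perfect"] * 4 + ["idea_right", "unsatisfactory", "omitted"]
score = scale_score(paper)
```

Since Scale K consists of seven sentences, the maximum attainable under this method is fourteen points.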
 
 Discussion: Trabue, in discussing his language scales, does not 
 attempt an analysis of the abilities involved. He calls attention to 
 the fact that the completion test was characterized by Ebbinghaus, 
 who first used the method, as a "real test of intelligence", and that 
 other psychologists have classified it as a test of imagination, memory, 
 association, and various other "faculties". Trabue himself is satis- 
 fied with the statement that the "ability to complete these sentences 
 successfully is very closely related to what is usually called language 
 ability". 
 
 As has been mentioned by Whipple and others, the ability 
 called into play by the sentence completion test varies greatly with 
 the number and character of the elisions made. If the elisions are 
 few and the nature of the context simple, the problem becomes 
 merely one of controlled association. When the elisions are more 
 numerous the test becomes one of active imagination. An inspection 
 of the seven sentences which form Scale K will show that for the 
 college student the first three sentences and probably the fourth 
 present no imaginative problem, and may be considered comparatively 
 simple tests of controlled association. The remaining sentences, 
 however, are decidedly more difficult, as evidenced by the fact that 
 very few errors were made in the completion of the first four sentences 
 while many were recorded in the fifth, sixth and seventh, and these 
 must be looked upon as tests of imagination. Nevertheless, language 
 ability is of so complex a character, involving as it does various types 
 of sensory imagery, memory, and intellectual organization, that the 
 use of the term imagination in this connection is little more than 
 begging the question. 
 
 Although the abilities involved in the sentence completion test 
 are difficult of analysis, the test is of proven significance as an index 
 of "general intelligence", and a study of the nature of the errors 
 made by a subject is often of diagnostic value. 
 
 Courtis Arithmetic Test. 
 
 Description: The wide acceptance of the Courtis standard 
 tests (12) makes necessary only a brief description here. Series A, 
 Form 3, of the Courtis arithmetic test was used. It consists of a 
 group of eight separate tests in the fundamental processes of arith- 
 metic and their application to problems of varying degrees of diffi- 
 culty. The first five tests of the series measure efficiency in copying 
 figures, and in simple addition, subtraction, multiplication and 
 division, respectively. The sixth test requires judgments of the 
 operation to be used in simple one-step problems, and is called by 
 Courtis the speed reasoning test. The seventh, or "fundamentals", 
 test provides abstract examples in the four operations, and serves 
 as a "general measure of the ability to add, subtract, multiply and 
 divide with whole numbers". The eighth test requires judgment of 
 the operations to be used, as well as the actual solution of more 
 difficult two-step problems. 
 
 The standard procedure was closely adhered to in administering 
 the tests, one minute being allowed for each of the first six, twelve 
 minutes for the seventh, and six minutes for the last test. After 
 the results had been scored in the usual manner, the scores for each 
 test were treated separately, class distributions being made and 
 quintile ratings assigned. The eight quintile grades for each indi- 
 vidual were then averaged and the averages thus obtained were in 
 turn put in rank order and quintiled. This final quintile rating 
 appears in the tabulation of results in a later section of this 
 report. 
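
The two-stage rating described above may be sketched as follows: each of the eight tests is quintiled separately, the eight grades of each individual are averaged, and the averages are quintiled in turn. The sketch assumes that quintile 1 denotes the top fifth and that ties share the first tied rank; neither point is stated in the text, and the function names are introduced here for illustration.

```python
def quintile_ratings(values, higher_is_better=True):
    """Quintile from 1 (best fifth) to 5 (poorest fifth) for each value.

    Assumption: tied values share the rank of the first tied position.
    """
    n = len(values)
    ordered = sorted(values, reverse=higher_is_better)
    first_rank = {}
    for rank, value in enumerate(ordered):
        first_rank.setdefault(value, rank)
    return [first_rank[value] * 5 // n + 1 for value in values]


def courtis_final_ratings(scores_by_test):
    """Final quintile per student from the scores on the separate tests.

    `scores_by_test` holds one list per test, each with one score per
    student, the students appearing in the same order throughout.
    """
    per_test = [quintile_ratings(scores) for scores in scores_by_test]
    n_students = len(scores_by_test[0])
    averages = [
        sum(grades[i] for grades in per_test) / len(per_test)
        for i in range(n_students)
    ]
    # A lower average grade is better, since 1 denotes the top fifth.
    return quintile_ratings(averages, higher_is_better=False)


# Hypothetical class of five students with the same scores on all eight tests.
scores_by_test = [[50, 40, 30, 20, 10] for _ in range(8)]
final = courtis_final_ratings(scores_by_test)
```

Averaging the quintile grades rather than the raw scores gives each of the eight tests equal weight in the final rating, whatever the range of its scores.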
 
 Discussion: The Courtis arithmetic tests provide a valuable 
 illustration of the efficiency test as contrasted with that of intelli- 
 gence. Although this is true to a greater degree of the first five 
 tests than of the sixth, seventh and eighth, even these latter must 
 be considered tests of efficiency when applied to college students. 
 It may be assumed that every member of such a group has the 
 educational background and mathematical ability necessary to solve 
 each of the simple problems presented, and the test therefore mea- 
 sures the facility with which the fundamental processes can be 
 employed. It is not the intention here to attempt to analyze the 
 specific abilities involved in arithmetic. It has even been asserted 
 that mathematical ability is itself specific, akin, for example, to 
 musical ability. Certainly, the factor of intellect cannot be disre- 
 garded, and in such a test as this, alertness of attention and motor 
 coordination are also important. Since the higher curriculum does 
 not frequently call for exercise in the simpler mathematical opera- 
 tions, it is not surprising to find that the college student often fails 
 to meet the standards of the higher elementary grades. This 
 fact illustrates clearly the distinction between efficiency and com- 
 petency. 
 
 It would be well to note in this connection the service which 
 the Courtis tests have performed in introducing scientific measure- 
 ments in the field of education. The tests were designed primarily 
 to determine the efficiency of the teacher or of the school system 
 and not to discover individual competency. 
 
 Differences and Likenesses. 
 
 Description: The tests here referred to are all found in the 
 Stanford Revision of the Binet-Simon scale, and include the "differ- 
 ences" test from Year VII, the "similarities — two things" test from 
 Year VIII, the "similarities — three things" test from Year XII, and 
 the "differences between president and king" test from the Year XIV group. 
 The Terman method was closely adhered to in giving the tests except 
 that a time element was introduced. One minute was allowed for 
 each part of the seven- and eight-year tests, two minutes for the 
 twelve-year test, and five minutes for the fourteen-year test. The 
 response in each case was written instead of oral. In scoring, one 
 point was given for each correct difference or similarity. It was 
 necessary, however, to quintile the papers largely on the basis of a 
 qualitative judgment of the results, since the tests here described do 
 not present a real problem to the college student. 
 
 Discussion: Since the association of ideas with reference to 
 differences and similarities constitutes the essential element of the 
 higher thought processes, these tests are of great significance when 
 applied to children, and were included in this series chiefly for their 
 illustrative value. From a genetic point of view, the recognition 
 of differences is an earlier development than the appreciation of 
 similarities, as evidenced by the Terman standardization which places 
 them at the seventh and eighth years, respectively. However, 
 although similarity in the use of familiar objects should be given at 
 the eight-year mental level, it is not until the twelfth year that the 
 concept has become usable to the extent of classing the snake, the
 cow and the sparrow as animals. It is not until the adult level has 
 practically been reached that the ability to appreciate essential 
 differences and likenesses is evident, and this ability may be con- 
 sidered a significant index of intellectual development. 
 
 The test in its present form cannot be considered satisfactory 
 for college students, and as Terman suggests it would be advantageous 
 to develop and standardize a new test designed primarily for use in 
 the upper years and at the adult level, and adapted to call into play 
 the ability to give essential differences and likenesses. As a test for 
 adults the one here used can only be said to exercise the associational 
 processes. 
 
 Opposites Test.
 
 Description: The difficult opposites found in List V, page 79,
 of Whipple's Manual of Mental and Physical Tests (13) were used.
 The directions suggested by Whipple were given, as follows: "Write
 as soon as I say a word as quickly as you can the word that means
 just the opposite. Opposites formed by the prefixes 'un' and 'in'
 or by the suffix 'less' are not to be given unless the root of the
 stimulus word is changed." The stimulus words were called at
 five-second intervals, and the results scored upon the basis of correct
 opposites determined in open discussion.
 
 Discussion: Tests of controlled association, such as the part- 
 whole, genus-species, and opposites tests, are usually scored on the 
 basis of the time required and the accuracy of the response. In the 
 present case, however, since printed forms were not used, the time 
 element had to be ignored except in so far as the five-second period 
 eliminated all associations requiring a greater length of time. In the 
 scoring of the test a difficulty was encountered in the determination 
 of correct or permissible opposites, and in some cases where no 
 original opposite could be agreed upon the use of two or even three 
 terms was allowed. 
 
 The opposites test has been extensively used by Thorndike, 
 Woodworth and Wells, Miss Norsworthy and others. The abilities 
 involved vary considerably with the ease or difficulty of the stimulus 
 words. If the associations called for are too simple the response 
 becomes automatic, while if the stimulus words are very difficult lack 
 of familiarity with the terms is likely to interfere with the validity 
 of the test. It may safely be stated that every word in the list here 
 employed is familiar to college students, and that, with one or two 
 exceptions, the associations required were difficult enough to eliminate 
 automatic responses. It is therefore reasonable to consider the test 
 a measure of the facility and accuracy of controlled association, 
 involving a high degree of language ability. 
 
 Definitions Test. 
 
 Description: As in the case of the differences and likenesses 
 test, a series of tests from different age levels of the Stanford Re- 
 vision of the Binet-Simon scale were used. The definitions tests 
 from Year V and from Year VIII, the definition of abstract terms 
 from Year XII, and the differences between abstract terms from the 
 average adult series comprise the present test. The Terman method 
 was employed except that the definition was written and a time 
 element introduced. One minute was allowed for each of the defi- 
 nitions in the first three tests and two minutes for each in the fourth 
 test. In scoring, the same method was followed as in the case of 
 differences and likenesses, and the same criticism as to the accuracy 
 of the quintile ratings applies here. 
 
 Discussion: The definitions test differs from those previously 
 discussed in that it tests neither intelligence nor efficiency in mental
 processes, but is employed as an index of intellectual development
 as displayed by the number of words at the disposal of the individual. 
 Since it may fairly be said that formal education consists in adding 
 to the number of usable idea-symbols and increasing their distinction, 
 the vocabulary test provides a simple and quite trustworthy measure- 
 ment of intellectual status. With formal education so important a 
 factor in each, it is not surprising to find the high degree of correlation 
 noted by Terman between the results of his vocabulary test and 
 intelligence quotients determined by the Stanford Revision. 
 
 While the principle involved is the same, the test here employed 
 differs from the usual vocabulary test in that only a limited number 
 of definitions were called for. The purpose was rather to demon- 
 strate the various stages of definition than to actually test the college 
 group. Beginning with definition "by use" at the five-year level, 
 the series shows the development of definition "superior to use" in
 the eighth year. Both of these types have a definite perceptual 
 basis, and it is not until the twelfth year that the processes of com- 
 parison and generalization make possible the definition of abstract 
 terms. In the contrasting of abstract terms, definition is related to 
 the recognition of essential differences, discussed in a previous test. 
 
 For the college student even the most difficult test of this series 
 can hardly be said to present a real problem, although in some 
 cases the contrast is not clearly drawn. While such processes as 
 discrimination and classification enter into definition, the test may 
 be considered one of intellectual development as displayed in language 
 ability. 
 
 Humpstone Memory Test. 
 
 Description: The memory test devised by H. J. Humpstone 
 consists of twenty sentences, each the statement of some rather 
 obscure historic fact connected with the name of some individual or 
 nation. These statements are in the form of the following sentence, 
 "North America was discovered by Columbus in 1492". The series 
 of twenty statements was read aloud to the class three times, care 
 being taken to pronounce the proper names and the dates distinctly. 
 A general discussion, not connected with the experiment, was then 
 entered into and continued for forty minutes. At the expiration of 
 that period, the first part of each sentence was read and the members 
 of the class asked to record in writing the name and date connected 
 with the incident. For example, the experimenter might read
 "North America was discovered by ______", the remainder of the
 sentence being supplied by each subject. It should be kept in mind
 that care was taken in devising the test to select historical incidents
 of a trivial and therefore unfamiliar character. Since each of the
 twenty sentences required the recall of a name and a date, the results 
 were scored on the basis of forty points. The scores were distributed 
 and quintiled in the usual manner. 
 
 Discussion: Various types of memory tests have been devised 
 and employed since Ebbinghaus published his pioneer study in this 
 field. Some of these have been open to the criticism that they test 
 associability rather than memory, others that the material is unsatis- 
 factory, as in the case of nonsense syllables, and still others that the 
 time element involved makes them impractical for use in the class- 
 room. The purpose in devising the test here described was to select 
 simple material which at the same time would be unfamiliar, and 
 would offer sufficient points for scoring to provide the necessary 
 differentiation of results. It was further desired to construct a test 
 which might be completed in the two-hour laboratory period and still 
 give sufficient weight to the factor of retentiveness to make the test 
 really one of memory. The Humpstone Test seems to fulfil these 
 requirements satisfactorily. The three readings of the material 
 bring in the element of repetition and give a fair degree of initial 
 memorization. The interval and distraction provided by the forty- 
 minute discussion involve sufficient retention to make the test sig- 
 nificant, and the fact that no perfect scores have been made demon- 
 strates that the material chosen is of sufficient difficulty for a college 
 group. The method of right associates employed in the recall needs 
 no comment because of its general acceptance. The natural division 
 of the recalled items into names and dates has shown the latter to 
 be more difficult of retention, as might be anticipated. 
 
 It is unnecessary at this point to enter into an analytic study of 
 memory. The subject has been so thoroughly treated in standard 
 text-books and scientific researches as to require no exposition here. 
 It will be sufficient to note that the present test adequately calls 
 into play the three abilities which are chiefly concerned in memory, 
 namely, modifiability, retentivity, and recall. 
 
 Witmer Cylinder Test. 
 
 Description: The material here employed is an adaptation of 
 the Montessori cylinders, and consists of a circular board containing 
 recesses for eighteen cylindrical insets. These insets are arranged in 
 three series of seven blocks each, the last cylinder of one series being 
 also the first of the next series. In the first series the insets are all of 
 equal diameter and vary only in height, in the second the variation 
 is in diameter, the height being constant, while the cylinders of the 
 third series vary in both height and diameter. The board, which is
 approximately twelve inches in diameter, contains a central recess
 in which all of the blocks may be placed, the subject then being 
 required to replace them as quickly as possible. 
 
 Each member of the elementary class in psychology was tested 
 individually by either Professor Witmer, Professor Twitmyer, or 
 Dr. Humpstone, this being the only one of the series which was not 
 given as a group test. The student was required to stand before the 
 table upon which the cylinder board was placed with all of the insets 
 in their proper recesses. His attention was called to the fact that 
 the tops of the different blocks were flush with the top of the board. 
 The insets were then removed by the experimenter and placed in 
 the central receptacle, care being taken to mix the blocks well and 
 at the same time to leave the larger cylinders on top. The subject 
 was then instructed to return the blocks to their original positions 
 as rapidly as possible, and the time required was recorded in seconds. 
 Upon the completion of the first trial the cylinders were again removed 
 and the time for the second replacement determined. 
 
 The results for each of the two trials were treated separately 
 and quintile ratings obtained. In accordance with the method 
 standardized by Paschal (14), a final rating was given by quintiling 
 the results for the shortest trial. The rating for the first, second, 
 and shortest trials all appear in the tabulation of results. 
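
 The three time scores kept for each subject may be illustrated with a
 short sketch. The function name and the sample times are hypothetical;
 the only rule taken from the text is that the final rating rests on the
 shorter of the two trials.

```python
def cylinder_record(first_seconds, second_seconds):
    """Return the three time scores (in seconds) kept for one subject."""
    return {
        "first": first_seconds,
        "second": second_seconds,
        # The final rating is based on the shorter of the two trials.
        "shortest": min(first_seconds, second_seconds),
    }


record = cylinder_record(74.0, 58.0)  # hypothetical times
print(record["shortest"])  # 58.0
```

 Each of the three scores would then be distributed and quintiled
 separately across the group.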
 
 Discussion: While the diagnostic value of the mechanical test 
 has long been recognized, the cylinder test is the only one of this 
 type which has been included in the present series. The test differs 
 from those which have previously been described in not requiring 
 any appreciable degree of language ability, and hence can not be 
 considered in any sense an index of intellectual level. If intelligence 
 be defined as the ability to solve what for the individual is a new 
 problem, the test is primarily one of intelligence. This, however, is 
 by no means the only ability involved. On the motor side may be 
 observed the rate of discharge of energy, coordination, complexity 
 of response, and in some cases endurance. The performance like- 
 wise displays some degree of analytic and distributed attention, 
 observation, understanding, and trainability when more than one 
 trial is given. While these are not the only abilities involved, they 
 may usually be rated with some accuracy on the basis of the cylinder 
 performance. 
 
 As Paschal has pointed out, the test has both a qualitative and 
 a quantitative aspect. In the present treatment of results, however, 
 only the latter has been considered, since the performance has been 
 rated solely on the basis of the number of seconds required for the 
 successful replacement of the insets. The qualitative aspect of the
 performance was an important factor in determining the competency
 rating which will be discussed in the following section. In general,
 the quality of the performance must be considered of more diagnostic 
 significance than the bare time element, although it is evident that a 
 very rapid replacement can not be made unless the performance is 
 qualitatively good, nor is it likely that excessive time will be required 
 if a satisfactory method is used. 
 
 While the quintile ratings for the first and second trials have 
 been included in the tabulation of results, it is probable that the 
 rating for the shortest of the two trials gives the safest index of 
 cylinder proficiency. In his standardization of the test Paschal 
 adopted the shortest of three trials as his criterion, and the results 
 here obtained are therefore not directly comparable with those upon 
 which the standardization was based. Even though the shortest 
 trial gives the most reliable basis for a single rating, the comparison 
 of scores made on the first and second trials is important as an index 
 of trainability, and these have therefore been included in the table 
 of results. 
 
 Composite Test Rating. 
 
 In the treatment of results it will be of interest to compare the 
 records made in the various tests described above with the score of 
 the Thurstone test, the rating on academic standing and that on 
 estimated competency. It seems advisable to obtain a composite 
 rating on the basis of the results for the series of tests in order to 
 facilitate this comparison. Unquestionably, the tests are not all of 
 equal value, and some method of weighting should be employed. 
 Here, however, an almost unsolvable problem is encountered, for 
 any system of weighting the various tests which might be adopted 
 would necessarily be arbitrary and based on an a priori judgment. 
 Moreover, the significance of the tests varies with the individual 
 case, and no one method of weighting would be really satisfactory 
 for the whole group. 
 
 With these difficulties in mind, it has been decided to obtain 
 a composite rating by taking a simple average of the quintile scores 
 on the thirteen tests of the series for each individual. Such an 
 average has the advantage of not being colored by personal opinions 
 of the value of the different tests, and is probably as significant an 
 index as could be devised by any complicated system of weighting. 
 This average includes only the rating for the shortest trial with the 
 Witmer cylinders. 
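
 The composite rating may be illustrated by a minimal sketch. The
 function name and the thirteen scores are hypothetical, and the
 handling of a missing score (averaging over the tests actually taken)
 is an assumption, not something stated in the text.

```python
def composite_rating(quintiles):
    """Simple unweighted mean of the quintile scores, to one decimal place.

    A missing score is passed as None and left out of the average
    (an assumption made for this sketch).
    """
    present = [q for q in quintiles if q is not None]
    # Note: Python's round() may round an exact tie to the even digit.
    return round(sum(present) / len(present), 1)


# Hypothetical quintile scores on the thirteen tests for one student.
scores = [5, 4, 3, 3, 2, 4, 5, 1, 3, 4, 2, 3, 4]
print(composite_rating(scores))  # 3.3
```

 Because every test carries the same weight, no judgment of the relative
 value of the tests enters into the figure.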
 
 
 The Competency Rating. 
 
 One purpose of this investigation was to determine what reliance 
 may be placed on the "snap judgment" of a trained observer. Is
 it possible to rate the college student with any degree of accuracy on 
 the basis of an interview covering no more than five minutes? Can 
 the experienced psychologist estimate the ability of an individual 
 by noting his appearance and carriage, and by obtaining his reaction 
 to a few simple questions and observing his performance with a 
 mechanical test? It was with a view to answering such questions 
 as these that each member of the first-year class in psychology was 
 personally interviewed by either Professor Witmer, Professor Twit- 
 myer or Dr. Humpstone, and given a competency rating on the basis 
 of five minutes' observation. Each student was required to replace 
 the insets of the Witmer cylinder test twice, as described in the 
 preceding section. The qualitative aspect of this performance had 
 considerable weight in determining the competency rating, and it 
 should be understood that while coordination, attention, understand- 
 ing, trainability and intelligence are all reflected in the time scores 
 of the two cylinder trials, the latter do not necessarily correlate with 
 a rating based on the quality of the performance. As has been 
 previously noted, the cylinder test is the only one considered in this 
 study which was given individually. 
 
 The rating, however, was not based solely on the performance 
 with the cylinders. As the student presented himself to the exami- 
 ner, he was asked to write his name upon a record card, and the 
 character of his writing as well as the degree of composure dis- 
 played were observed. A few leading questions were then asked 
 regarding preparatory school, purpose in coming to the University, 
 intended vocation, outside activities, and the like. No attempt 
 was made to ask the same questions of each individual, but rather 
 to carry on a short conversation which varied naturally with the 
 replies given. The subject was then given the cylinder test, follow- 
 ing the procedure previously outlined, and after answering one or 
 two questions as to his work in psychology was dismissed. As a 
 rule the whole interview consumed no more than five minutes. 
 
 While all three of the examiners had come into some contact 
 with members of the class through lecture work, no one of them knew 
 the students personally or had had occasion to be familiar with the 
 type of work done by any individual. The rating was therefore based 
 entirely upon an observation of the student's behavior as displayed 
 in his general bearing and address, his answers to the questions, and 
 his performance with the cylinders. In this respect, the competency 
 rating here employed differs from the rating on estimated intelligence
 which has frequently been used in connection with investigations
 of this character. Such a rating has usually been given by
 an instructor familiar with the student and with his work in the 
 classroom, or by averaging the estimates made by a number of in- 
 structors so qualified. The competency rating is therefore not 
 directly comparable with the ratings on estimated intelligence 
 referred to in a preceding section. 
 
 In giving these ratings, the five-point scale was used in a some- 
 what modified form. Each of the five points of the scale was sub- 
 divided into five lesser grades, thus giving a maximum rating of 5.5, 
 a minimum of 1.1, and a mediocre grade of 3.3. When each student 
 had been rated on this scale, the three examiners in conference 
 arranged the members of the class in rank order on the basis of es- 
 timated competency. Since it is felt that individual differences in 
 the standards of the three examiners somewhat reduce the
 significance of the actual rating assigned, the rank order has been
 employed in determining a quintile rating on estimated competency, 
 which appears in the tabulation of results. This treatment has the 
 added advantage of making the rating directly comparable with the 
 quintile scores of the various mental tests. 
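
 The modified scale and the conversion from rank order to quintile may
 be sketched as follows. The sketch assumes that the quintiles are equal
 fifths of the group and that 5 denotes the highest fifth; the names
 `SCALE` and `quintile_from_rank` are hypothetical.

```python
# The 25 grades of the subdivided five-point scale, written as labels
# ("1.1" is the minimum, "3.3" the middle grade, "5.5" the maximum).
SCALE = [f"{major}.{minor}" for major in range(1, 6) for minor in range(1, 6)]


def quintile_from_rank(rank, group_size):
    """Convert a rank (1 = most competent) to a quintile rating.

    Assumes equal fifths, with 5 for the top fifth and 1 for the bottom.
    """
    if not 1 <= rank <= group_size:
        raise ValueError("rank must lie between 1 and group_size")
    return 5 - (rank - 1) * 5 // group_size


print(SCALE[0], SCALE[12], SCALE[-1])  # 1.1 3.3 5.5
print(quintile_from_rank(1, 50), quintile_from_rank(11, 50), quintile_from_rank(50, 50))  # 5 4 1
```

 With a group of fifty, ranks 1 to 10 fall in the highest quintile under
 this assumption, ranks 11 to 20 in the next, and so on.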
 
 It will be well to note at this point that there is no objective 
 standard by which to measure the accuracy of the competency ratings. 
 In estimating the ability of the student, the attempt was not made 
 to predict the degree of his success in the study of psychology, nor 
 is the rating a prognosis of his relative academic standing as deter- 
 mined by the grades received in all college courses. Neither can the 
 accuracy of the judgment be measured by his performance in any one 
 or in any group of psychological tests. The term "competency 
 rating", implying the algebraic sum of the individual's specific 
 abilities and disabilities as demonstrated by his success as a member 
 of human society, best interprets the character of the rating under 
 discussion. In this connection it may be stated that no ratings 
 lower than 2.3 were given, or, in other words, no students were found 
 so deficient in general competency as to fall below the "doubtful"
 group. In view of the fact that the members of the class had under- 
 gone a strenuous process of selection in fulfilling the entrance require- 
 ments and surviving at least two years of college work, it is not 
 surprising to find a complete absence of "1" ratings. This fact does
 not appear in the tabulation of results where the ratings have been 
 quintiled on the basis of rank order, and only the quintile grade shown. 
 
 Although, as has been pointed out, the competency rating can 
 not be checked by comparison with mental test scores or academic 
 record, it will nevertheless be profitable to determine in the later
 treatment of results whether the rating shows any significant degree
 of correlation with competency as displayed in the tests and college 
 grades. 
 
 Academic Rating. 
 
 Popular tradition has it that the youth whose scholastic attain- 
 ments make him valedictorian of his college class is destined for 
 future mediocrity, while the typical campus lounger whose academic 
 life is cut short by a heartless faculty is sure to make his mark in the 
 world of success. Nevertheless it will hardly be denied that pro- 
 ficiency in the classroom is to some degree indicative of individual 
 competency, and it will therefore be desirable to know something 
 of the relative academic standing of the fifty students under 
 consideration. 
 
 While it might be contended that preparatory school records 
 would be significant in determining a rating on scholastic merit, the 
 great variation in standards and the incomparability of the various 
 grading systems employed make it advisable to reject this suggestion 
 without further deliberation. Moreover, since grades for at least 
 two years of college work are available for each member of the group, 
 it seems unnecessary to base the academic rating on any work other 
 than that done at the University of Pennsylvania. 
 
 As has previously been stated, a five-point system of grading is 
 employed in the School of Arts and Science. This scheme provides 
 three symbols for work of passing grade, while two are reserved for 
 that of an unsatisfactory character. To be more specific, the letters 
 "D", "G", "P", "N", and "F" are assigned, signifying Distin- 
 guished, Good, Passed, Not Passed (conditioned), and Failure, 
 respectively. A student receiving a grade of "N" in a course may 
 relieve himself of the condition by passing a re-examination, while 
 the grade "F" necessitates the repetition of the course. As applied 
 to the courses in psychology, an "N" in Psychology 1 permits the 
 student to continue his work in Psychology 2, but this permission is 
 not given when "F" is received in the first course. It will be noted, 
 therefore, that no member of the present group received a grade of 
 "F" in Psychology 1, since each of the fifty students completed both 
 courses in the academic year 1919-20. 
 
 While it must be understood that the letter system of grading 
 is intended to obviate the pseudo-accuracy of the percentile grade, 
 and that it is not possible to assign percentile equivalents for the 
 symbols used, the necessity for obtaining some kind of composite 
 rating as an index of academic standing is evident. For example, a 
 given student may have received a grade of "D" in five courses,
 "G" in eight, "P" in four, and "N" in two. Moreover, each course
 may have required from one to nine hours of class attendance per 
 week, with a value of from one to four units of credit, a unit being 
 the equivalent of one hour of lecture work or two hours of laboratory 
 work per week for the academic year. Hence it is clear that the 
 grades must be considered in terms of units of credit rather than by 
 courses if a significant rating is to be obtained, and also that some 
 numerical translation of the letter grades must be devised. 
 
 Since the percentile scale is not recognized in the University 
 marking system, any numerical equivalents which might be adopted 
 would necessarily be arbitrary. Roughly, it may be said that "D" 
 represents a range from 90 to 100 per cent, "G" from 80 to 90, and 
 "P" from 70 to 80. There seems to be no justification, however, 
 for selecting 70 per cent as the passing mark, nor would it be more
 accurate to place it at 60 per cent. A satisfactory evaluation of the 
 "N" and "F" is even more confusing. While the passing grades 
 might be valued at 95, 85, and 70, respectively, it would be difficult 
 to decide whether the "F", which ranges from zero to 50 per cent,
 should be rated as 25 or 45. By far the simplest solution to the 
 problem, and what seems to be the most logical, is to adopt here 
 the five-point scale generally employed in this study. It is quite 
 as reasonable to represent the five letter grades by the numbers 5,
 4, 3, 2, and 1, as by any other numerical values which might be 
 suggested, and this method has the advantage of permitting a direct 
 comparison between the composite rating for college grades and that 
 for mental tests. It has been determined, moreover, that the rank 
 order remains approximately the same whether this system is used 
 or the values 95, 85, 70, 55, and 45 be given to the letter grades. 
 
 The academic rating has therefore been determined by multiply- 
 ing the number of units assigned each letter grade by the appro- 
 priate digit, and dividing the sum of these products by the total 
 number of units graded. The student who had received no grades 
 lower than "D" would have a rating of 5.0, while a record with an 
 equal number of "G" and "D" units would average 4.5. Since it 
 would be almost impossible for a student to remain in college who 
 had not averaged the passing grade, it is not surprising to find that 
 only one of the fifty has an average below 3.0, his rating being 2.9. 
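
 The computation just described is a unit-weighted average, and may be
 sketched as follows. The letter values are those adopted in the text;
 the function name and the third sample record are hypothetical.

```python
# Numerical values adopted in the text for the five letter grades.
GRADE_POINTS = {"D": 5, "G": 4, "P": 3, "N": 2, "F": 1}


def academic_rating(units_by_grade, points=GRADE_POINTS):
    """Average of the grade values, weighted by units of credit.

    units_by_grade maps each letter grade to the number of units
    in which that grade was received.
    """
    total_units = sum(units_by_grade.values())
    weighted_sum = sum(points[grade] * units for grade, units in units_by_grade.items())
    return weighted_sum / total_units


print(academic_rating({"D": 10}))          # 5.0, no grade lower than "D"
print(academic_rating({"D": 8, "G": 8}))   # 4.5, equal "G" and "D" units
# Hypothetical mixed record of 26 units.
print(round(academic_rating({"D": 6, "G": 10, "P": 8, "N": 2}), 1))  # 3.8
```

 The `points` argument allows the same records to be re-rated with
 another set of values, such as 95, 85, 70, 55, and 45, should a
 comparison of the resulting rank orders be desired.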
 
 Perhaps even a rating of this kind implies an accuracy of mea- 
 surement which cannot be justified. If every "D" assigned as a 
 final grade stands for the same level of excellence, and if the same 
 amount of work is required for a passing grade in every course, then 
 the validity of the average rating cannot be questioned. If, however, 
 one department of the college is found to be giving the highest grade
 to 25 per cent of its students, while a second allows only 5 per cent
 of "D"s, then the impossibility of comparing grades assigned by 
 different departments is evident. Moreover, it has been demon- 
 strated that different instructors in the same department vary greatly 
 in the grades which they assign to a given piece of work, and that 
 this variation is no greater than that which will be shown by one 
 instructor marking the same work at different times. It is indeed 
 questionable whether any reliance should be placed in a comparison 
 of college grades in an institution where the majority of the courses 
 are elective, and where there is no general supervision of grading. 
 
 The grading problem is by no means a new one, and has a con- 
 siderable literature of its own. Finkelstein (15), for example, has 
 published an interesting study of conditions at Cornell University in 
 which he demonstrates the need for supervision of the grades assigned 
 by different departments by showing that some instructors are typi- 
 cally low markers. He makes a plea for the adoption of a five-division 
 system of grading with the provision that the grades given by any 
 instructor shall not deviate in the long run from a distribution agreed 
 upon. While the intention of the present study is not to preach the 
 necessity of some such general supervision of grading at the University 
 of Pennsylvania, the existing absence of uniformity demands com- 
 ment. Under the present curriculum, a student in the School of Arts 
 and Science is required to complete a specified number of units of work 
 in each "group" of subjects. In most cases he is free to elect which
 courses he will pursue in a given group. For example, six units of
 credit are required in the Biological Science Group which is composed
 of courses in botany, zoology and psychology, but the decision as to 
 whether all six units be taken in one subject or be distributed between 
 two is left entirely to the student, as well as the choice of the subject 
 or subjects to be elected. Until recently the elementary work in one 
 of the three subjects has been so much less difficult than that in the 
 other two, that the situation has been fully recognized by the under- 
 graduate, with a consequent influx to the easier course. While this 
 condition has been remedied in the case cited, it doubtless still exists 
 in other groups, and the present plea is made rather with the purpose 
 of calling attention to the lack of general supervision of grade dis- 
 tributions than as a criticism of any particular instance of non-con- 
 formity. Although the necessity of some general supervision of all 
 grades assigned in the college cannot be overlooked, the more pressing 
 need of uniformity within the various groups must be emphasized. 
 
 From the foregoing discussion it is evident that grades assigned 
 by various instructors in different departments of the University are 
 not really comparable, and it is with this understanding that the
 academic record will be included in the present investigation. Even
 though the data cannot be considered scientifically accurate, however, 
 it must be admitted that the student's college grades do give some 
 indication of his scholastic ability. The grades alone determine 
 whether he is to receive academic honors or be dropped by the 
 Executive Committee for general deficiency, as well as playing an
 important part in election to Phi Beta Kappa and in placement after 
 graduation. 
 
 In the tabulation of results, the final grades for the two courses 
 in psychology will be noted in addition to the average rating for all 
 college grades including those in psychology. The latter are given 
 separately since it is felt that the unusual opportunity for personal 
 contact between instructor and student in the elementary courses in 
 this department makes these grades somewhat more significant than 
 is generally the case. 
 
 In conclusion, it seems almost unnecessary to point to the fact 
 that similar grades may not mean the same thing when assigned to 
 different students even in the same course. Although the attempt 
 is made to control the amount of work done by fixing the maximum 
 as well as the minimum number of units which may be taken by a 
 student in a semester, some carry so full a roster as to seriously 
 interfere with the display of actual ability, while others who are not 
 experiencing great success with a comparatively light schedule may 
 be handicapped by outside work which they are pursuing as a means 
 of livelihood. Since the evaluation of these disturbing factors is
 well nigh impossible, they must be ignored in the present treatment 
 of college grades. 
 
 Tabulation of Results. 
 
 While it was intended to make a statistical study of the various 
 scores and ratings which form a basis for this investigation, the 
 primary purpose was to study the individual record rather than the 
 mass results. It has therefore been deemed advisable to present a 
 complete tabulation of the ratings for each member of the group, 
 and thereby facilitate the scrutiny of the individual case. In the 
 following table will be found (1) the number used to designate each
 student in the group, (2) his class, whether sophomore, junior, or
 senior, (3) the quintile rating for the Thurstone test, (4) the quintile
 rating for each of the thirteen mental tests with the addition of the
 ratings for the first and second trials with the cylinders, (5) the
 composite test rating obtained by averaging the ratings for the thirteen
 separate tests (this average does not include the Thurstone test, and
 only the shortest trial with the cylinders is included), (6) the quintile
 rating based on estimated competency, (7) the final grades in
 Psychology 1 and Psychology 2, (8) the academic rating obtained by
 averaging college grades as previously described.

                              Memory span                                                 Cylinders
No.   Class Thur. Ausf. Tayl. Digit Syll. Ideas Desc. Trab. Cour. ?     Opp.  Def.  Mem.  Cyl.  1st   2nd   Avg.  Comp. Acad. Psy.1 Psy.2
1     So.   3     4     2     1     3     3     3     2     2     4     4     4     4     2     1     2     2.9   3     4.5   G     D
2     So.   5     4     4     5     4     5     4     4     5     3     3     3     3     4     4     4     3.9   5     4.7   D     D
3     So.   2     1     3     3     3     2     2     5     2     3     3     2     1     4     5     2     2.6   4     3.4   N     N
4     So.   5     2     3     2     3     5     2     5     5     3     4     5     -     1     2     1     3.3   3     3.6   G     P
5     So.   3     5     4     1     3     1     3     4     5     3     3     1     1     5     2     5     3.0   2     3.9   P     P
6     Jr.   5     5     3     5     3     2     -     5     2     3     4     2     4     1     1     1     3.3   1     3.8   P     P
7     So.   4     2     3     2     3     5     3     1     4     3     3     4     2     5     5     5     3.1   5     3.2   N     F
8     So.   4     4     1     3     4     3     2     5     2     3     4     1     3     2     2     2     2.8   3     3.6   P     G
9     So.   4     2     3     3     3     3     4     1     2     3     3     3     4     5     5     5     3.0   5     4.1   G     G
10    So.   1     2     4     2     3     4     3     1     2     1     2     2     4     3     4     4     2.5   4     3.5   P     P
11    Jr.   1     3     3     5     3     2     4     1     5     3     3     5     4     4     1     4     3.5   1     4.2   P     G
12    So.   4     3     4     1     3     3     3     5     2     3     3     2     4     3     2     3     3.0   3     4.1   P     G
13    So.   4     3     5     5     3     4     3     5     4     2     -     2     1     5     5     5     3.5   5     4.5   G     D
14    Sr.   5     5     2     5     4     3     3     5     4     3     3     3     5     3     1     3     3.7   4     4.2   G     G
15    Sr.   5     5     4     3     4     5     3     -     2     3     3     2     3     5     4     5     3.5   5     3.5   P     P
16    So.   4     5     2     2     2     3     3     1     4     3     3     2     4     3     3     3     2.8   2     3.7   P     P
17    So.   1     1     2     4     3     2     5     1     1     3     1     4     2     1     3     1     2.3   1     3.7   P     G
18    So.   5     3     3     3     3     5     3     3     5     3     3     2     4     3     3     3     3.3   3     4.5   P     D
19    So.   5     3     3     5     4     5     3     5     2     3     3     5     4     1     3     1     3.5   4     3.9   P     P
20    So.   5     3     4     3     4     3     3     5     1     3     3     5     1     1     2     1     3.0   4     3.4   P     P
21    So.   4     ?     4     3     3     4     3     4     5     3     3     2     3     4     2     4     3.5   4     4.2   P     D
22    So.   2     -     -     3     3     3     1     3     3     3     4     4     5     1     1     1     3.0   1     3.5   P     P
23    So.   3     2     4     2     3     3     4     1     5     3     3     3     5     3     4     4     3.2   4     3.7   P     N
24    So.   3     5     5     5     3     4     3     3     3     3     3     2     4     5     2     5     3.7   4     3.1   P     P
25    Jr.   3     5     2     2     2     3     3     5     4     -     3     2     4     3     4     3     3.2   3     3.3   N     P
26    Jr.   1     4     3     1     4     4     3     3     3     3     2     1     2     5     3     5     2.9   3     4.0   P     P
27    So.   5     5     4     5     4     5     3     5     3     3     4     2     4     3     4     3     3.8   3     4.4   D     D
28    So.   3     5     5     3     3     -     3     2     5     4     5     4     4     2     2     2     3.8   3     4.7   D     D
29    Jr.   4     5     3     3     3     5     3     2     5     3     3     2     3     4     5     4     3.4   5     4.0   D     G
30    Sr.   4     5     4     2     3     4     -     3     3     3     4     1     5     5     3     5     3.5   3     3.7   P     N
31    Jr.   4     -     5     2     4     3     1     4     2     2     3     -     3     4     3     4     3.0   4     4.0   G     P
32    So.   5     3     2     2     4     3     3     1     3     1     3     2     5     1     3     2     2.5   2     4.6   D     G
33    So.   1     3     1     2     1     3     3     2     4     2     2     1     5     3     1     3     2.5   1     3.1   P     N
34    Sr.   5     2     2     5     3     3     -     3     4     ?     3     3     5     3     3     4     3.3   4     4.3   D     G
35    Jr.   5     3     5     5     3     3     3     5     4     3     3     4     5     5     5     5     3.9   5     3.7   G     G
36    Jr.   4     4     3     5     4     -     -     3     4     -     -     2     -     2     3     2     3.5   4     3.0   G     P
37    Jr.   2     4     4     4     3     3     2     3     4     3     -     2     2     5     4     5     3.3   3     3.4   P     G
38    So.   2     -     2     4     3     2     2     1     3     3     3     4     3     4     4     4     2.8   2     3.0   P     P
39    So.   1     1     4     3     3     ?     1     2     1     2     3     3     2     4     5     4     2.7   2     3.1   P     P
40    So.   4     1     3     2     4     4     3     3     4     2     3     4     5     1     1     1     3.0   1     3.9   G     P
41    So.   3     5     1     2     2     3     3     2     5     3     4     4     2     5     5     1     3.2   5     3.5   G     G
42    So.   5     5     3     2     3     4     5     5     2     3     2     4     4     5     5     5     3.6   5     4.5   G     D
43    Jr.   3     5     2     2     3     3     3     5     2     3     3     5     3     1     1     1     3.1   1     3.2   P     P
44    So.   3     5     4     2     4     4     3     4     3     3     -     4     4     3     5     3     3.6   4     4.2   G     G
45    So.   3     1     5     2     2     3     2     1     3     3     1     4     3     1     2     1     2.4   2     4.1   P     P
46    Sr.   2     4     4     2     3     2     3     3     5     4     4     4     -     5     5     5     3.6   5     2.9   P     N
47    So.   5     4     2     3     3     4     3     4     5     4     2     -     -     1     1     1     3.0   2     3.5   N     F
48    Jr.   5     5     -     5     4     3     -     1     1     5     3     5     4     5     5     5     3.7   5     3.4   G     P
49    So.   2     4     2     2     5     1     1     2     2     2     3     3     4     2     4     1     2.5   2     3.7   P     N
50    Jr.   5     2     ?     ?     ?     ?     1     1     1     3     2     2     5     3     1     3     2.4   3     3.9   G     P

 Thur., Thurstone test; Cyl., cylinder rating counted in the composite;
 1st and 2nd, first and second cylinder trials; Avg., composite test
 rating; Comp., estimated competency; Acad., academic rating. A dash
 marks a rating not recorded; a question mark, an entry that cannot be read.
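
 The averaging described under (5) can be illustrated with a short
 sketch in Python. It is an illustration only and rests on one
 assumption: that a test with no recorded rating (a dash in the
 tabulation) is simply left out of the average. The function name
 composite_rating is hypothetical. The figures are those tabulated for
 student No. 4, the last value being the cylinder rating counted in the
 composite.

```python
from statistics import mean

def composite_rating(ratings):
    """Average the quintile ratings that are present, to one decimal place."""
    present = [r for r in ratings if r is not None]
    return round(mean(present), 1)

# Student No. 4: twelve separate tests, then the cylinder rating.
# None stands for the dash in the tabulation (assumed to be skipped).
student_4 = [2, 3, 2, 3, 5, 2, 5, 5, 3, 4, 5, None, 1]
print(composite_rating(student_4))  # 3.3
```

 The result agrees with the composite of 3.3 tabulated for this student.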
 
 In studying the tabulation of results it must be borne in mind 
 that in every case the quintile rating was obtained from the dis- 
 tribution of the results of the class of approximately two hundred 
 students, and not merely on the basis of the fifty here included. 
 This explains the fact that the ratings are not equally divided among 
 the five quintiles. 
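
 Why a selected fifty need not fall evenly into the five quintiles can
 be seen in the following sketch. The scores are hypothetical (a class
 of two hundred with scores 1 to 200), as is the function name
 quintile_rating; the only assumption taken from the text is that the
 fifth quintile is the highest.

```python
from bisect import bisect_left
from collections import Counter

def quintile_rating(score, class_scores):
    """Return 1 (lowest fifth) to 5 (highest fifth) of the whole class."""
    ordered = sorted(class_scores)
    below = bisect_left(ordered, score)      # number of strictly lower scores
    return min(5, 1 + (5 * below) // len(ordered))

class_scores = list(range(1, 201))           # hypothetical class of 200
subgroup = list(range(101, 151))             # hypothetical fifty of them
counts = Counter(quintile_rating(s, class_scores) for s in subgroup)
print(sorted(counts.items()))                # [(3, 20), (4, 30)]
```

 Because each rating is fixed by the whole class, the fifty here fall
 into two quintiles only, in unequal numbers.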
 
 Discussion of Results. 
 
 In considering the data tabulated on the preceding page, it will 
 first be of interest to determine whether any significant correlations 
 exist between the various ratings given for the group as a whole, and 
 then to study the results for the individual student. It will be 
 valuable to ascertain, for example, whether the rating for the Thur- 
 stone test correlates with the average score for the series of more 
 specialized mental tests. Since general intelligence may be looked 
 upon as an average of the specific abilities of the individual, a high 
 correlation might well be expected between these two ratings. Each 
 of these, in turn, must be compared with the rating on estimated
 competency, and it will likewise be profitable to observe whether 
 any one of these three ratings may be considered an index of pro- 
 ficiency in college work. 
 
 With this purpose in view a series of intercorrelations has been 
 calculated between the ratings assigned for the four general divisions 
 of the results. In each case the coefficient of correlation was ob- 
 tained by the Pearson method. The data employed consists of the 
 quintile grade on the Thurstone test, the average rating for the 
 thirteen mental tests, the quintile rating on estimated competency, 
 and the average rating for college grades. 
 
 Correlations. 
 
 Competency rating with mental tests       r = +0.49
 Thurstone test with mental tests          r = +0.40
 Thurstone test with college grades        r = +0.39
 Thurstone test with competency rating     r = +0.36
 College grades with mental tests          r = +0.21
 College grades with competency rating     r = +0.10
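
 The Pearson (product-moment) coefficient may be sketched in Python as
 follows. The function name pearson_r is hypothetical, and the five
 pairs of values are merely the composite and competency ratings of
 students Nos. 1 to 5, taken for illustration; the figure they yield is
 not one of the coefficients reported for the whole group.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation of two equally long series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Students Nos. 1 to 5 only (illustrative subset of the tabulation).
composite = [2.9, 3.9, 2.6, 3.3, 3.0]
competency = [3, 5, 4, 3, 2]
print(round(pearson_r(composite, competency), 2))
```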
 
 A mere inspection of the coefficients listed above will show that 
 while all of the correlations are positive, not one can be considered
 significant. In general, it may be stated that coefficients between 
 +0.30 and +0.75 show that the same factors are operative in the
 two series to some degree, but the correlation can hardly be regarded 
 as significant unless a coefficient greater than +0.75 is found. An 
 immediate conclusion can therefore be drawn either to the effect 
 that the values employed are not to be relied upon, or that the per- 
 formances rated in the four cases did not involve the same factors 
 or abilities. Nevertheless, it will be of interest to scrutinize the 
 coefficients obtained more closely, and to attempt to interpret 
 them. 
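
 The rule of thumb just stated may be applied to the six coefficients
 in a few lines of Python. The sketch is illustrative: the function name
 interpret and the wording of the labels are hypothetical, and since the
 text does not say how a coefficient below +0.30 or exactly at +0.75 is
 to be read, the treatment of those cases is an assumption.

```python
def interpret(r):
    """Apply the stated rule of thumb to a correlation coefficient r."""
    if r > 0.75:
        return "significant"
    if r >= 0.30:
        return "common factors to some degree"
    return "below +0.30"          # case not discussed in the text

reported = {
    "Competency rating with mental tests": 0.49,
    "Thurstone test with mental tests": 0.40,
    "Thurstone test with college grades": 0.39,
    "Thurstone test with competency rating": 0.36,
    "College grades with mental tests": 0.21,
    "College grades with competency rating": 0.10,
}
for pair, r in reported.items():
    print(f"{pair}: {r:+.2f} -> {interpret(r)}")
```

 None of the six coefficients reaches the "significant" class, and the
 two involving college grades with mental tests and with the competency
 rating fall below +0.30.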
 
 The highest correlation of the series is found to exist between 
 the rating for estimated competency and that for mental tests. This 
 is not surprising since the competency rating was given largely on 
 the basis of the performance displayed in the solution of one of these 
 tests. In view of the fact that the cylinder test calls into play so 
 many of the abilities which enter into other tests of the series, it is 
 rather surprising that the correlation did not prove greater. This 
 can probably be accounted for by the fact that the cylinder test does 
 not involve language ability, which is an important factor in prac- 
 tically all of the other tests. 
 
 Next in order is found the correlation between the Thurstone 
 test and the mental test rating. As has been pointed out, both of 
 these ratings may in a sense be considered indices of general intelli- 
 gence, and since many tests in the series involve intellectual pro- 
 cesses similar to those called for in the Thurstone examination, the 
 low correlation displayed here is again unexpected. However, the 
 weight given to the time element in the latter test is so great, and 
 the range of abilities involved so much more restricted than in the 
 Pennsylvania series, that it is not difficult to account for the seeming 
 inconsistency of the results. 
 
 The very low correlations obtained between the academic rating 
 on the one hand and the mental test and competency ratings on the 
 other, provide food for serious reflection. The question which must 
 naturally arise is whether academic proficiency, as it is evaluated in 
 our colleges today, is really an index of the competency of the student. 
 Perhaps it will be well to notice whether the low correlation shown 
 here is typical of other similar investigations. In the report by 
 Caldwell (7) previously referred to, appears a summary of the results 
 obtained by other experimenters showing correlations obtained be- 
 tween various series of mental tests and college grades. In this con- 
 nection, it is unnecessary to note in detail the character of the tests 
 used by each of the investigators, and merely a statement of the 
 correlations obtained, as cited by Caldwell, is shown below. 
 
 Correlation of Test Results with College Grades. 
 
 Wissler               0.09
 Calfee                0.23
 Rowland and Lowden    0.37
 Waugh                 0.41
 Kitson                0.44
 King and McCrory      0.39
 Caldwell              0.44
 
 While the correlations above are in most cases greater than that 
 obtained in the present investigation, namely 0.21, it will be noted 
 that not in a single instance was a significant coefficient shown. 
 Rogers (8) does not even attempt to calculate a coefficient of corre- 
 lation between test results and college grades, but states that "to 
 predict an individual's probable status in academic work from his 
 performance in the tests would obviously be rash". As has previously
 been stated, a comparison of the competency rating with ratings on 
 estimated intelligence cited in other investigations is hardly possible, 
 since in this case the estimate was made by an individual unfamiliar 
 with the students rated. It is well to note, however, that even where 
 intelligence was graded by instructors well acquainted with their 
 students, correlations with college grades have not exceeded 0.60. 
 
 From the facts given above it is possible to arrive at three con- 
 clusions. In the first place, college grades may not actually reflect 
 the mentality of the student, or secondly, the tests employed are 
 inadequate or misleading, or finally, the factors which enter into the 
 assignment of college grades are not the same as those which are 
 measured in psychological tests. Probably all three of these con- 
 clusions are in some degree justified. 
 
 Voice has been given recently to much criticism of the present 
 university curricula on the grounds of impracticality and because of 
 the continuation of secondary school pedagogical methods in insti- 
 tutions of higher learning. On the other hand, a large proportion of 
 the instruction in our colleges today is given by means of lectures. 
 The grade assigned at the end of the course is often determined 
 chiefly by the student's ability to give back on an examination paper 
 certain information which has been fed to him in lectures during the 
 term. Frequently, little intelligence is called for and the student is 
 rated either on the excellence of his memory or on the degree of 
 industry with which he compensates for a deficiency in that ability. 
 When to this criticism of university instruction is added the unreli- 
 ability of the grades themselves, as discussed more fully in an earlier 
 section, it is evident that the low correlation between college grades
 and test results may be in part due to shortcomings of the educational 
 system both as regards methods of instruction and grading. 
 
 In scrutinizing psychological tests as a whole or the series em- 
 ployed here in particular, certain criticisms must be made. Perhaps 
 the exaggeration of the importance of the time element is the most 
 serious fault with the majority of mental tests. Intellectual dex- 
 terity is generally measured rather than organization and usability 
 of knowledge. The difficulty is increased in this case by the homo- 
 geneity of the group tested. Many of the tests would be significant 
 when applied to individuals less carefully selected and at a lower 
 level of mental development. In most cases the problem presented 
 is too easy to tax the college student, and the speed of reaction is 
 the only ability measured. Another criticism which may be made of 
 tests in general, is that they do not measure with sufficient accuracy 
 the abilities which they are designed to gauge. In other words, a 
 subject does not always give the same score on the same or equivalent 
 tests due to variations in attention, interest, physical condition, etc. 
 Mental testing will not be scientifically accurate until the technique 
 has been so refined as to greatly reduce the probable error of the 
 score, or until a higher reliability coefficient can be obtained. The 
 low correlation between college grades and mental tests may, then, 
 be due to shortcomings of the latter as well as to inaccuracies of the 
 former. 
 
 It seems reasonable, however, to believe that this lack of corre- 
 spondence can be attributed largely to the fact that college work 
 involves other factors than those measured by any series of psycho- 
 logical tests which has yet been devised. In addition to the mental 
 abilities which go to make up the competency of the individual, the 
 factor of motivation plays a most important rôle in academic success.
 It is possible to conceive of two students of approximately equal 
 competency, one of whom is inspired by the desire to excel in intellec- 
 tual pursuits, while the other is in college for the purpose of enjoying 
 social or athletic advantages. The intense interest and industry of 
 the first is likely to result in a higher academic rating than would 
 be predicted from his performance in a series of mental tests, while
 just the opposite is true in the case of the second student. While 
 it is fair to believe, therefore, that psychological tests can be em- 
 ployed to select those students who have the ability to succeed in 
 college, they will not form an adequate basis upon which to predict 
 academic success until some means has been devised of measuring 
 motive in quantitative terms. The final solution of the problem 
 will be reached when more accurate methods of assigning college 
 grades have been adopted, and those grades depend more on the
 higher thought processes and less on memory, and when, on the 
 other hand, psychological tests have been made more difficult, place 
 less stress on the time element, and include some index of motivation. 
 
 Although it must be admitted that the formulation of a series
 of mental tests which will accurately predict success in college work 
 is desirable, no great benefit would accrue thereby either to the 
 science of psychology or to the field of education. The psychologist 
 is not so much interested in the abilities which determine college 
 grades, as in evaluating the particular mental assets and liabilities 
 which characterize the individual. While the general intelligence 
 rating, which represents the summation of the scores in a number 
 of tests, is doubtless of some significance, the analysis of such a 
 rating so as to show the peculiar abilities and disabilities of the 
 individual is of much greater importance from the point of view of 
 psychology. An inspection of the results shown in the preceding 
 tabulation reveals the fact that although two students may have 
 the same average test rating, the scores obtained in the different 
 tests are not really reflected in this average. Of two individuals 
 who had an average rating of 3.3 in the thirteen tests and who 
 received the same quintile rating on the Thurstone test, one was 
 placed in the third quintile in nine of the tests, the other in only 
 three. Obviously the first student showed consistent mediocre 
 ability, while the second displayed considerable variation in the 
 different tests, having four ratings of "5", three of "2", and one of "1".
 There is no doubt that the latter student provides the more interesting 
 material for psychological study and for vocational guidance. 
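
 The contrast between the two profiles can be made concrete with a
 short Python sketch. Judging from the tabulation, the two students
 meant appear to be Nos. 18 and 4; that identification is an inference
 and not a statement of the text. The use of the standard deviation as
 a measure of scatter is likewise an illustrative choice.

```python
from collections import Counter
from statistics import mean, pstdev

# Ratings on the thirteen tests as read from the tabulation.
student_18 = [3, 3, 3, 3, 5, 3, 3, 5, 3, 3, 2, 4, 3]
student_4 = [2, 3, 2, 3, 5, 2, 5, 5, 3, 4, 5, 1]   # one test not recorded

for label, ratings in (("No. 18", student_18), ("No. 4", student_4)):
    counts = Counter(ratings)
    print(label,
          "average", round(mean(ratings), 1),
          "ratings of 3:", counts[3],
          "scatter", round(pstdev(ratings), 2))
```

 Both averages come to 3.3, yet the first record has nine ratings of
 "3" and the second only three, with a correspondingly wider scatter.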
 
 Since it is believed that the present investigation is significant 
 rather in the analysis of individual competency than in the correla- 
 tion of group results, it will be the purpose in the following section 
 to scrutinize the record of each member of the group and to deter- 
 mine whether any conclusions of value in diagnosis or guidance can 
 be reached. In considering the academic rating it is well to note 
 that ratings higher than 4.0 are very good, while those below 3.5 are 
 poor. The median academic rating for the group is 3.7. Composite 
 test ratings above 2.9 and below 3.5 are considered mediocre, with the 
 median rating at 3.2. 
 
 Analysis of Individual Records. 
 
 No. 1. 
 
 This student shows a consistently mediocre record until his
 college grades are observed, when he is found to have one of the
 highest academic ratings of the group. Placed in the middle quintile
 in the Thurstone test as well as in estimated competency, his average
 test rating is well below the median. As for the separate tests, he 
 has received the highest rating in none, and the lowest only in the 
 memory span for digits. In general, the higher scores are exhibited 
 in those tests which involve language ability and memory, and the 
 lower where these factors are not prominent, namely, in the Taylor 
 number test, the Courtis test, and the cylinders. In view of the 
 high grades in psychology and the high academic rating it seems 
 probable that this student has some strong motive, such as ambition, 
 and supplements a mediocre intellect with an unusual amount of 
 industry. 
 
 No. 2. 
 
 This record shows the most consistently high rating to be found 
 in the group. The academic record is the highest, and this is borne 
 out by "distinguished" grades in both courses in psychology. The
 ratings for estimated competency and for the Thurstone test are 
 both in the fifth quintile, while the average test rating is equaled 
 by only one other student in the group. In considering the results 
 of the particular tests, it will be observed that this student has not 
 fallen below the middle quintile, but has reached the highest in only 
 three tests. He shows the poorer scores in those tests which stress 
 language ability and memory, and the higher ratings where intel- 
 ligence, imagination and attention are involved. The general level 
 of performance is so high as to make any specific recommendation 
 or prognosis unsafe. 
 
 No. 3. 
 
 The chief point of interest in this case is the lack of correspond- 
 ence between the competency rating and the remainder of the data 
 at hand. This student shows an academic rating which places him 
 in the poorest fifth of the group, with conditions for both courses in 
 psychology. Although in the second quintile in the Thurstone test, 
 his average test rating is one of the lowest recorded. He rates above 
 the middle quintile only in the sentence completion and cylinder 
 tests. This may indicate good intelligence not directed toward 
 college work, but the conclusion that the competency rating is too 
 high in this case seems justified. 
 
 No. 4. 
 The indication here is of a student of somewhat more than 
 average general intelligence whose record is largely influenced by 
 interest in the task at hand. With an academic rating slightly 
 above the average and a "G" and "P" in psychology, his score in the 
 Thurstone test puts him in the highest quintile. The composite test 
 rating is slightly above the average, and shows a preponderance of
 "5's" as well as a number of "2's" and a "1". High ratings in the 
 memory span for ideas, the sentence completion and the definitions 
 tests, as contrasted with a very poor cylinder performance, indicate 
 intellectual ability rather than intelligence. 
 
 No. 5. 
 This record shows a student somewhat below the average in 
 competency, with an academic rating slightly better than would be 
 expected from the test results. Passing grades in both courses in 
 psychology, estimated competency in the second quintile, and the 
 median rating for the Thurstone test all indicate mediocre ability. 
 This is borne out by an average test rating below the mean for the 
 group. Performances in the memory spans for digits and ideas, and 
 in the definitions and memory tests were rated in the lowest quintile. 
 High scores were obtained in the Ausfrage and Courtis tests and in 
 the second trial with the cylinders. Although the test results show 
 great variation, there seems to be no definite tendency displayed. 
 
 No. 6. 
 This individual probably possesses mediocre ability, although 
 receiving a very low competency rating and a very high score on 
 the Thurstone test. A fair academic rating with passing work in 
 psychology, and a test rating slightly below the average seem to 
 indicate that neither the Thurstone test nor the competency rating 
 gives a true picture of the student. High scores in the Ausfrage, 
 digit span, Trabue, opposites and memory tests, with very poor 
 cylinder performances, would suggest fair intellect coupled with 
 rather deficient intelligence. 
 
 No. 7. 
 The record here indicates relatively low competency with a 
 high degree of native intelligence. A very poor academic rating is 
 substantiated by a condition and a failure in the two courses in 
 psychology, and a low average test rating. An exceptionally good 
 performance with the cylinders and a high rating on the Thurstone 
 test and the idea span, with lower scores on the tests requiring language
 ability and memory, lead to the conclusion that this man is misplaced 
 in college, but would probably succeed in a pursuit which does not 
 stress intellectual development. 
 
 No. 8. 
 
 There are no outstanding features in the record of this student.
 The academic rating and the Thurstone score are both slightly
 above the average, while the test rating is somewhat below. The
 tests which emphasize the intellectual side usually show good scores,
 while those which do not depend on language ability, such as the 
 Taylor number, the Courtis, and the cylinder tests, are placed in the 
 lower quintiles. On the whole, the record is mediocre. 
 
 No. 9. 
 In this instance, a high academic rating, good work in psychology, 
 a high competency rating and a good score on the Thurstone test 
 fail to correlate with a rather low test rating. Median scores on seven 
 of the tests, with only one result in the highest and one in the lowest 
 quintile, indicate a rather consistent mediocrity. A high rating 
 in the memory test and an excellent cylinder performance suggest 
 that good memory and intelligence are responsible for the high 
 academic standing. 
 
 No. 10. 
 A competency rating of "4" indicates that this man was not 
 doing his best on the mental tests. Mediocre college work and a 
 low rating on the Thurstone test suggest that the competency rating 
 is too high. The test scores are generally low where language ability 
 is involved, and are above the average for the Taylor, idea span, 
 memory and cylinder tests. As in Case 7, it seems likely that this 
 individual is not profiting by his college course and would be more 
 successful in some other line of activity. 
 
 No. 11. 
 
 The record of this student is quite inconsistent. Placed in 
 the lowest quintile in the Thurstone test and competency rating, 
 his test and academic ratings are well above the average. The low 
 score in the first cylinder trial indicates a lack of intelligence, while 
 the marked improvement on the second trial indicates good train- 
 ability. The low rating on the Trabue test and idea span, contrasted 
 with high ratings for the Courtis and Humpstone memory tests, 
 suggests an efficient and retentive mind rather than a quick and
 imaginative one. That this man is a slow thinker is demonstrated by 
 his score on the Thurstone test. The fact that he retains and digests 
 the information which he acquires is evidenced by his high academic 
 record. 
 
 No. 12. 
 
 This student displays a record consistently near the average for 
 the group. The Thurstone and academic ratings are somewhat better 
 than the mean, the competency rating is in the third quintile, and the 
 test rating slightly below the average. Of the separate tests, seven 
 are rated in the third quintile, a poor score on digit span and a very 
 high rating on the Trabue test being the only significant scores.
 On the whole, the competency rating seems to express the ability of 
 the student adequately. 
 
 No. 13. 
 
 The record in this case is consistently high. Very good grades 
 in the two courses in psychology substantiate an academic rating 
 which is exceeded by only three members of the group. A competency 
 rating of "5" and a Thurstone rating of "4" correlate with a high 
 test rating. The only rating in the lowest quintile is that on the 
 memory test and when this is contrasted with an exceptionally good 
 performance with the cylinders, it seems reasonable to conclude 
 that this student depends more on intelligence than on memory in 
 his college work. Almost without exception ratings in the upper 
 quintiles are displayed for the tests which do not stress language 
 ability, while lower ratings are found where this factor is of great 
 importance. 
 
 No. 14. 
 
 This record presents an interesting contrast with that of student 
 No. 13 in that the intellectual rather than the intelligence factors 
 are here stressed. While not quite so good from the academic view- 
 point, this record shows a slightly higher rating for the Thurstone 
 and other mental tests than does the preceding case. Ratings of 
 "5" on the Ausfrage, digit span, Trabue, and memory tests indicate 
 associability, language ability and retentiveness, while a rating in 
 the lowest quintile for the first cylinder trial implies comparatively 
 poor intelligence. A much better record on the second trial with the 
 cylinders shows trainability, which, coupled with a high memory 
 span and good memory, pictures a student of more than average 
 intellect. 
 
 No. 15. 
 
 The indication here is of a man of high general intelligence who 
 does not care to apply himself to college work. On the one hand 
 his academic rating is mediocre and he has obtained merely passing 
 grades in psychology, while contrasted with this are Thurstone and 
 competency ratings in the highest quintile, and a combined test 
 rating well above the average. The low rating on the Courtis test 
 is probably the only score of particular significance, and seems to 
 indicate laziness and lack of interest. In view of the higher scores 
 on the other tests this explanation may also hold for the low rating 
 on definitions. On the whole the picture is that of a student with 
 real ability who does not care to exert himself. 
 
 No. 16. 
 
 In spite of a good rating on the Thurstone test, this record 
 indicates an individual of somewhat less than average ability. 
 Although the academic rating is fair, the competency rating and the 
 composite test rating are both low. Ratings below the middle quin- 
 tile are found for the Taylor number test, the digit and syllable 
 spans, the Trabue and definitions tests, while only the ratings for the 
 Ausfrage, Courtis and memory tests are better than the average. 
 It seems likely that this student supplements good retentiveness 
 with more than the usual degree of industry in passing his college work. 
 
 No. 17. 
 
 Thurstone and competency ratings in the lowest quintile com- 
 bined with the lowest composite test rating of the group indicate 
 decidedly inferior ability in this case. Eight of the separate test 
 ratings are below the middle quintile and only three are above. 
 Low ratings on the Thurstone, Taylor, Trabue, Courtis, opposites 
 and cylinder tests, all of which involve a definite speed factor, sug- 
 gest that a slow rate of discharge is primarily responsible for the 
 poor test performances of this individual. High ratings in the 
 digit span, description, and definitions tests, in all of which the time 
 element is relatively unimportant, seem to bear out this conclusion. 
 An observation of the scores of the three memory span tests shows 
 that as the material becomes more complicated the rating is lower. 
 This man evidently needs time to think, and does well when the time 
 is not limited. This fact explains the lack of correlation between the 
 test ratings and the academic record, which is at least average, 
 and it also emphasizes the undue weight given to the time factor in 
 most mental tests. 
 
 No. 18. 
 
 This record displays the interesting combination of a very 
 high academic rating with mediocre performance in the various men- 
 tal tests. The record is quite comparable with that of student 
 No. 1 with the exception that in this case nine of the thirteen test 
 results are found in the middle quintile. High ratings in the Thur- 
 stone and Courtis tests suggest alertness, and this ability, in con- 
 junction with a good rating on memory, may be partly responsible 
 for the success in college work. It seems probable, however, that 
 some motivation factor which cannot be measured by the test 
 results has played an important part in the academic attainments 
 of this student. 
 
No. 19. 
 
 In this case the record, with the exception of the grades in 
 psychology, is consistently above the average. Low ratings on the 
 Courtis and cylinder tests might suggest a slow rate of discharge 
 were it not for a very high rating on the Thurstone test. High 
 scores on the three memory span tests, the Trabue, definitions, and 
 memory tests show associability, retentiveness, and language ability, 
 which may be looked upon as essential factors in intellectual develop- 
 ment. The low rating on the cylinders hardly seems significant in 
 view of the other test results, although it may indicate a deficiency 
 in mechanical as contrasted with mental ability. 
 
 No. 20. 
 This record provides an interesting comparison with that of 
 student No. 19. Although the psychology grades, competency 
 rating, and Thurstone rating are identical, this student has a some- 
 what lower academic rating and a correspondingly lower composite 
 test rating. Even the ratings for the separate tests show similar 
 tendencies, but the scores for the Courtis and cylinder tests are lower 
 here than in the preceding case. The most significant difference 
 between the two records is found in the very low memory rating of 
 this student, which places him definitely in the mediocre group. 
 
 No. 21. 
 This record is one of the most consistent to be found in the group 
 and places the student definitely in the fourth quintile. The academic 
 rating is quite high, the Thurstone and competency ratings are both 
 "4", and the composite test rating is well above the average. The 
 separate test scores indicate little, since all but two of the ratings are
 in the middle and upper quintiles. Although the first cylinder trial
 was slow, the second trial compensated for this deficiency. There 
 is no comment to make on this case other than a desire that mental 
 tests might always correlate so closely with academic standing. 
 
 No. 22.
 While this record is, on the whole, mediocre, the academic 
 rating is somewhat higher than might be expected in view of the 
 low Thurstone and competency ratings. The latter may possibly be 
 accounted for by the poor intelligence displayed in both cylinder 
 performances, while good ratings on the tests requiring language 
 ability, and particularly on the memory test, provide a satisfactory 
 explanation for the fair academic rating. From the test results 
 it seems probable that this individual has to apply himself to his 
 studies in order to do passing work. 
 
 No. 23. 
 
 The failure in Psychology 2 is the only discordant note in an 
 otherwise mediocre record. The composite test rating and that for 
 the Thurstone test are about average for the group, while the com- 
 petency rating is in the fourth quintile. The separate test results 
 do not seem significant except for a high rating in memory. The 
 poor work in psychology must probably be accounted for by lack of 
 interest or failure to study. 
 
 No. 24. 
 The record here is comparable with that of student No. 15 in 
 that a high composite test rating is contrasted with a low academic 
 rating. In this case, however, the discrepancy is even more marked. 
 The test rating is exceeded by only four members of the group, while 
 only two have poorer college records. The separate test results 
 present no solution to the difficulty since the ratings are high with 
 only one exception. The competency rating is "4". It seems 
 probable that this man is not particularly interested in his college 
 work and is expending most of his time and energy in some kind 
 of outside activity. 
 
 No. 25. 
 In this instance the record is consistently mediocre. All four of 
 the general ratings are either in the middle quintile or slightly below 
 the group average. The failure in Psychology 1 is hardly to be 
 accounted for by the separate test results, which display no definite 
 tendency, and was probably due to lack of application, since the 
 student was able to pass the second course. 
 
 No. 26. 
 
 The rather high academic rating in this case seems to contradict 
 the low Thurstone and composite test ratings. The low digit span 
 and the poor rating on the memory test indicate that this student 
 must be a hard worker in order to have received such high grades 
 for his college courses. Good trainability as displayed in the second 
 trial with the cylinders may be a significant factor in his academic 
 work. 
 
 No. 27. 
 
 In this case a very high score on the Thurstone test correlates 
 well with a high composite test rating and a high academic rating.
 "Distinguished" grades in both courses in psychology also indicate
 general superiority. A poor performance on the second trial with the
 cylinders which resulted in a competency rating of only "3" is the
 only flaw in an otherwise excellent record. Eight of the thirteen
 tests are rated above the middle quintile and indicate nothing more
 than an unusually high level of general intelligence. 
 
 No. 28. 
 This record offers an interesting comparison with that of student 
 No. 27. The composite test ratings and the competency ratings are 
 identical in the two cases, while the academic ratings are very nearly 
 so. Both students received the highest grade in both courses in 
 psychology. In this instance, however, the Thurstone score is 
 mediocre, and the ratings for the Trabue and cylinder tests are in the 
 second quintile. The ratings on those tests which stress language 
 ability are generally higher than in the preceding case, while the 
 memory spans are conspicuously lower. These facts indicate a 
 relatively low intelligence coupled with a rather high intellectual 
 development. On the whole, the student is decidedly superior to 
 the majority of the group. 
 
 No. 29. 
 The record in this case must be considered consistently good 
 although it can hardly be compared with either of the two preceding 
 cases in general excellence. The academic rating shows a "G" 
 average and the psychology grades rate the student even higher. 
 While the Thurstone rating is "4", the rating on estimated com- 
 petency is higher than that in either of the preceding records. This 
 rating is not substantiated by the results of the separate tests, only 
 four of which are found to be above the middle quintile. These 
 seem to point to intelligence rather than to intellectual organization, 
 although it would be unsafe to make any specific diagnosis. 
 
 No. 30. 
 This record displays a relatively high test rating and a Thur- 
 stone rating in the fourth quintile contrasted with an average 
 academic rating and unsatisfactory grades in psychology. While 
 the separate test scores indicate somewhat erratic performances, 
 very high ratings on the memory and cylinder tests show that this 
 individual has unusual ability in some directions. It seems probable 
 that lack of interest or want of application is responsible for the 
 deficiency in psychology. 
 
 No. 31. 
 
 The mediocre composite test rating in this case does not corre- 
 late with the generally high level of the other ratings, all of which 
 are in the fourth quintile. Although the separate test results are 
 distributed through the five quintiles, they show no definite ten- 
 dency which might be considered explanatory. Possibly the high
 degree of trainability displayed in the second cylinder trial is sig-
 nificant, but it seems likely that this student either did not take the 
 tests seriously or that some strong motivation factor has entered into 
 his college work. 
 
 No. 32. 
 This record presents as great a contradiction as is to be found 
 in the whole group. While only two students have academic records 
 which exceed the rating in this case, only three have lower composite 
 test ratings. Moreover, the estimated competency rating is "2" 
 and the Thurstone test rating "5". Only three students have better 
 grades in the two psychology courses. In the separate tests, low 
 ratings were received on the Taylor number, digit span, Trabue, 
 differences, definitions, and second cylinder trial. Only the syllable 
 span and memory tests were rated higher than the middle quintile, 
 the latter receiving the only "5" of the series. It seems hardly 
 possible to explain the excellent academic record on the basis of good 
 memory alone, and the only conclusion which can be reached is that 
 the test results do not reflect the evident competency of this student. 
 
 No. 33. 
 
 All things taken into consideration, this is the poorest record in 
 the group. The academic rating is low and one of the courses in 
 psychology was not passed. The competency rating and that on 
 the Thurstone test are both in the lowest quintile, the score on the 
 latter test being the lowest made by any of the fifty students. The 
 composite test rating is one of the lowest in the group, and only two 
 of the separate test results are placed above the middle quintile. 
 A rating of "5" in the memory test suggests that this ability may 
 have enabled the student to stay in college. Low ratings on the 
 Taylor, digit span, syllable span, Trabue, differences, opposites and 
 definitions tests and the first trial with the cylinders indicate a very 
 general deficiency. The test results in this case are quite similar 
 to those in the record of student No. 32, but seem here to be really 
 significant. 
 
 No. 34. 
 
 In this instance, the various ratings of the record correlate 
 well to show better than average competency. The academic rating 
 is good, the psychology grades very good, and the competency rating 
 is in the fourth quintile. The Thurstone score is high, and while the 
 composite test rating is only fair, the separate test results show no 
 marked deficiencies. Low ratings in the Ausfrage and Taylor tests 
 are not particularly significant, while higher ratings in the digit span, 
 Courtis, and memory tests and second cylinder trial indicate asso-
 ciability, speed, retentiveness and trainability. On the whole, the
 record shows no contradictions. 
 
 No. 35. 
 This record is consistent in so far as the composite test rating, 
 the Thurstone rating and the competency rating are concerned. 
 The test rating is equaled only by student No. 2, and both of the other 
 ratings place this student in the highest quintile. In academic work, 
 however, only an average rating is to be found, and the explanation 
 must probably be based on lack of interest in studies or absorption 
 in other activities. High ratings on the Taylor number, digit span, 
 Trabue, Courtis, definitions and memory tests, on both trials with 
 the cylinders, and on the Thurstone test indicate that this student 
 has the ability to do excellent college work if he so desires. 
 
 No. 36. 
 Although a number of the separate test results are missing in 
 this record, the ratings on the Thurstone test and estimated com- 
 petency as well as the composite test rating indicate a rather high 
 level of mentality. The academic rating, however, is one of the 
 lowest in the group and shows that conditions and failures were 
 received in a number of courses, even though the work in psychology 
 was somewhat above the average. The evidence seems fairly con- 
 clusive that this man could do better college work if he wished to 
 apply himself. Interest in outside activities probably explains the 
 discrepancy between the test ratings and the academic record. 
 
 No. 37. 
 
 With the exception of a low rating on the Thurstone test, this 
 record is consistently mediocre. The academic rating, competency 
 rating and composite test rating all appear in the middle quintile. 
 A good performance on the first trial with the cylinders, followed
 by an excellent second trial, indicates intelligence and trainability,
 while a low rating on the memory test may explain the mediocre 
 college record. 
 
 No. 38. 
 
 The record in this instance is consistently below the average 
 for the group and may be considered typical of the second quintile. 
 The academic rating is low, the psychology grades merely passing, 
 the competency rating and the Thurstone rating are both "2", and 
 the composite test rating decidedly below the average. Low ratings 
 were received on the Taylor number, idea span, description, and 
 Trabue tests, while the digit span, definitions and cylinder tests 
 were rated above the middle quintile. No ratings in the highest
 quintile appear. An analysis of these results seems to indicate
 good associability and intelligence coupled with rather deficient 
 intellectual organization. This man would probably be more 
 successful in business than in an academic or professional vocation. 
 
 No. 39. 
 This record disputes with that of student No. 33 the distinction 
 of being the poorest in the group. The fact that the student was 
 excluded from the University at the end of the session gives peculiar 
 interest to this case. An observation of the grades received in college 
 courses discloses the significant fact that eight units of work were 
 assigned a grade of "D", while an equal number received a "G". 
 Eight units of credit were merely "Passed", conditions were given 
 for three units, and the remaining eight units received the grade 
 "F". Passing grades were assigned for both courses in psychology. 
 This unusual distribution of grades suggests specific ability along 
 certain lines with marked variations in interest. The student would 
 probably have received "Distinguished" grades in all of his college
 work if he had been allowed free election of courses. Low ratings on 
 the Thurstone and Courtis tests show that he cannot think quickly, 
 while poor scores in the Trabue and memory tests indicate deficiency 
 in imagination and retentiveness. High ratings on the Taylor 
 number and cylinder tests show that there is no deficiency in the rate 
 of discharge of energy, and that distribution of attention and intelli- 
 gence are both above the average. It seems probable that this man, 
 now being free to follow his own inclinations, will be successful in the 
 vocation which he chooses. The case is particularly interesting as an 
 example of the influence of special abilities and of motivation in the 
 behavior of the individual. 
 
 No. 40. 
 In this case a very low competency rating is contradicted by 
 a composite test rating only slightly below the average and Thur- 
 stone and academic ratings in the fourth quintile. The competency 
 rating was doubtless influenced by very poor performances in both 
 cylinder trials, but this deficiency in intelligence is compensated for 
 by high ratings in the syllable and idea spans, Courtis, definitions 
 and memory tests. In other words this student has the associability, 
 alertness, language ability and retentiveness necessary to do good 
 college work. It is possible, also, that lack of interest in the tests 
 may have affected the significance of the results. 
 
 No. 41. 
 This is a consistently mediocre record with the exception of 
 the psychology grades, which are slightly above the average, and
 the competency rating, which is very high. The academic rating is
 slightly below the median and the composite test rating is median 
 for the group. The Thurstone score is placed in the middle quintile. 
 High ratings are shown for the Ausfrage, Courtis, and first cylinder 
 trial. The latter, however, is offset by a very poor performance in 
 the second trial with the cylinders. Low ratings also appear for the 
 Taylor number, digit and syllable spans, Trabue, and memory tests. 
 These results indicate rather poor general intelligence and suggest 
 that the competency rating is too high. 
 
 No. 42. 
 
 Every one of the principal ratings in this record occurs in the 
 highest quintile, and the student must be ranked definitely with the 
 leaders of the group. High ratings on the Ausfrage, description, 
 Trabue, and memory tests and on both cylinder trials show good 
 observation, imagination, retentiveness and intelligence. A low 
 rating on the digit span is neutralized by a high idea span. Other 
 low ratings on the Courtis and opposites tests do not seem significant. 
 On the whole the record is unusually consistent and justifies the high 
 competency rating. 
 
 No. 43. 
 
 Although the composite test rating in this case is about average 
 for the group, the academic rating is decidedly inferior. The com- 
 petency rating is the lowest given to any member of the class, and 
 is based on very poor performances with the cylinders. Although 
 this student seems to lack intelligence, high ratings were obtained in 
 the Ausfrage, Trabue, and definitions tests. Low ratings for the 
 Taylor number, digit span, and Courtis tests indicate a consistently 
 poor performance in those tests which do not involve language ability. 
 The good ratings in the strictly intellectual tests suggest that outside 
 activities are responsible for the low academic rating. 
 
 No. 44. 
 
 This record seems to be typical of the fourth quintile. The 
 academic record shows a preponderance of "Good" grades, and this 
 mark was received for both courses in psychology. The competency 
 rating is "4" and the Thurstone rating "3". The composite test 
 rating is one of the best in the group, although fifth quintile ratings 
 appear only for the Ausfrage test and the first cylinder trial. Other 
 test ratings show a high level of general intelligence with no sig- 
 nificant disabilities. 
 
 No. 45. 
 An academic record in the fourth quintile is accompanied in 
 this case by a Thurstone score in the middle quintile, a competency 
 rating in the second, and a composite test rating in the lowest quintile 
 of the group. This unanimous absence of correlation is also shown 
 in the separate test results where ratings in all five quintiles appear. 
 A high rating on the Taylor number test suggests good distribution 
 of attention, but even this ability must have been lacking in the 
 cylinder performances. The test results show no definite tendency, 
 but display a low level of general intelligence. The high academic 
 rating notwithstanding, this student falls below the middle quintile 
 of the group in competency. 
 
 No. 46. 
 
 The lowest academic rating in the group is displayed by this 
 senior, who, nevertheless, was able to graduate with his class. While 
 the Thurstone score is poor, the competency rating and the com- 
 posite test rating are both high. The separate test results are low 
 for digit and idea spans, but high for most of the other tests with 
 exceptionally good performances on the cylinder test. This student 
 was evidently doing no more college work than was necessary to ob- 
 tain his degree, and was probably interested in outside activities. 
 
 No. 47. 
 The competency rating, composite test rating, and academic 
 rating agree in placing this student in the second quintile. The 
 rating on the Thurstone test is very high, and the grades in psychology 
 the poorest in the group, consisting of an "N" for the first course 
 and a "Failure" for the second. High ratings on the Thurstone 
 and Courtis suggest a rather quick mind when familiar operations 
 are involved, while the very low ratings on the cylinder test indicate 
 inability to meet a new problem successfully. Since the subject- 
 matter of the courses in psychology is quite unlike that of most 
 college courses, the inability of the student to adapt himself to the 
 new situation is probably the cause of his deficient work in this 
 subject. Although the result for the memory test is missing, a high 
 rating in that ability may be predicted. 
 
 No. 48. 
 In this record the composite test rating, the competency rating 
 and the Thurstone rating indicate a very high level of general intelli- 
 gence. The academic rating, however, is far below the average for 
 the group. Of the separate test results, only two fall below the middle
 quintile. The low ratings on the Trabue and Courtis tests are diffi-
 cult to explain in the light of the other test ratings, five of which are 
 in the highest quintile. Excellent associability, language ability, 
 retentiveness, and intelligence are displayed in the various test scores, 
 and the only explanation of the relatively poor college grades seems 
 to lie in lack of interest or absorption in outside activities. 
 
 No. 49. 
 Although the Thurstone score, the competency rating, the com- 
 posite test rating, and the grades in psychology agree in placing this 
 student below the middle quintile, the academic rating is the median 
 for the group. As is frequently the case where this situation is 
 encountered, the rating on the memory test is high. In addition to 
 this test only the Ausfrage and the syllable span were rated higher 
 than the middle quintile, while eight of the thirteen tests fell below 
 that level. It seems certain that more than the usual amount of 
 industry is expended by this individual on his college work. 
 
 No. 50. 
 This record is quite similar to that of student No. 49 with the 
 exception that the composite test rating is slightly lower and the 
 academic rating somewhat higher than in the preceding case. Here, 
 however, the Thurstone rating is high and the competency rating 
 and psychology grades average. Of the separate test results, only 
 the rating on the memory test is in the highest quintile. The ratings 
 for the Taylor number, description, Trabue, Courtis and first cylinder 
 tests are in the lowest quintile. The second trial with the cylinders 
 indicates good trainability, which with the assistance of an unusually 
 good memory may account for the high academic rating. On the 
 other hand, lack of effort in the tests may be responsible for the low 
 composite test rating, and is suggested by the high score on the 
 Thurstone test. 
 
 Summary. 
 
 A scrutiny of the analyses of the fifty individual records shows 
 that these may be separated into two general groups. In twenty-six 
 cases the correlation between the various ratings is close enough to 
 present fairly conclusive evidence of the relative performance level 
 of the student. These cases, in turn, naturally fall into five classes 
 corresponding roughly with the points of a five-division scale, which 
 may be referred to here as very good, good, medium, poor, and very 
 poor. Seven records are so consistently high as to warrant a place 
 in the first group, while five more are distinctly better than the 
 average and may be considered "good". Eight cases occur in the
 "medium" class, and of the six which fall below this level two are
 "poor" and four show such a general inferiority as to justify place- 
 ment in the lowest group. The twenty-four remaining records, 
 which display a decided lack of correlation between the various 
 major ratings, exhibit two opposing tendencies. In fourteen cases 
 the academic rating is higher than would be predicted from the test 
 results, while in the ten remaining cases the Thurstone score, com- 
 petency rating and composite test rating would seem to indicate 
 better scholastic ability than is displayed in the academic rating and 
 psychology grades. The following summary shows the classification 
 of each individual record. 
 
 Classification of Individual Records.

 I. Cases showing general correlation of ratings:

 Very good                       2, 13, 14, 27, 28, 42, 44
 Good                            11, 18, 19, 21, 29
 Medium                          4, 5, 6, 20, 23, 37, 40, 41
 Poor                            22, 47
 Very poor                       3, 33, 38, 39

 II. Cases where correlation is lacking:

 High academic, medium mental    1, 9, 12, 31, 34
 High academic, low mental       26, 32, 45
 Medium academic, low mental     8, 10, 16, 17, 49, 50
 High mental, medium academic    15, 30, 35
 High mental, low academic       24, 36, 46, 48
 Medium mental, low academic     7, 25, 43
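
 The counts stated in the Summary can be checked against the
 classification itself. The following is a minimal sketch in Python;
 the student numbers are copied from the classification above, while
 the variable and function names are illustrative only and form no
 part of the original study.

```python
# Student numbers copied from the classification of individual records.
consistent = {
    "very good": [2, 13, 14, 27, 28, 42, 44],
    "good": [11, 18, 19, 21, 29],
    "medium": [4, 5, 6, 20, 23, 37, 40, 41],
    "poor": [22, 47],
    "very poor": [3, 33, 38, 39],
}
discordant = {
    "high academic, medium mental": [1, 9, 12, 31, 34],
    "high academic, low mental": [26, 32, 45],
    "medium academic, low mental": [8, 10, 16, 17, 49, 50],
    "high mental, medium academic": [15, 30, 35],
    "high mental, low academic": [24, 36, 46, 48],
    "medium mental, low academic": [7, 25, 43],
}


def count(groups):
    """Total number of students listed across all classes of a group."""
    return sum(len(members) for members in groups.values())


n_consistent = count(consistent)
n_discordant = count(discordant)

# Classes whose label leads with the academic rating are those in which
# the academic rating is higher than the test results would predict.
academic_higher = sum(
    len(members)
    for label, members in discordant.items()
    if label.startswith(("high academic", "medium academic"))
)
mental_higher = n_discordant - academic_higher

# Every student from 1 to 50 should appear exactly once.
all_students = sorted(
    n for group in (consistent, discordant) for members in group.values() for n in members
)

print(n_consistent, n_discordant)      # 26 24
print(academic_higher, mental_higher)  # 14 10
print(all_students == list(range(1, 51)))  # True
```

 The tallies agree with the figures given in the Summary: twenty-six
 consistent records, twenty-four discordant ones, and a division of
 the latter into fourteen and ten.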
 
 Although in some cases the evidence is not so clear cut as the 
 summary above may seem to indicate, the classification nevertheless 
 is justified by the data at hand. It also seems reasonable to attribute 
 the absence of correlation shown in the second group of records to 
 variations in motivation and other external factors which have not 
 as yet lent themselves to quantitative measurement. Of two men 
 who have the same composite test rating and who may be assumed 
 to possess equal competency, one may be intensely interested in his 
 studies and impelled by a consuming ambition to gain the greatest 
 possible benefit from his college course, while the other is content to 
 do only the amount of work necessary to fulfil the minimum scholastic 
 requirements and seeks to excel in athletic or social activities. Again, 
 the first student may be devoting all of his time and effort to college 
 work, while the second is compelled to expend much of his energy
 in supporting himself. Certainly no series of mental tests will
 correlate closely with academic standing until some satisfactory 
 method of evaluating these factors external to competency has been 
 devised. At present it is possible to do no more than call attention 
 to the lack of correlation and attempt to explain the discrepancies 
 in the most logical manner. 
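
 The two opposing tendencies described in this summary may be
 illustrated by a simple comparison of two coarse levels. The sketch
 below is hypothetical: the three-step scale, the name LEVELS and
 the function compare_ratings are assumptions introduced for
 illustration. The classifications in this study rest on a judgment
 of several ratings taken together, not on a mechanical rule of this
 kind.

```python
# Hypothetical three-step scale; not a scale defined in the study.
LEVELS = {"low": 0, "medium": 1, "high": 2}


def compare_ratings(academic: str, mental: str) -> str:
    """Label a record by comparing its academic and mental levels.

    Equal levels are treated as agreement; otherwise the label names
    the higher rating first, in the manner of the classification
    headings used in the summary.
    """
    a, m = LEVELS[academic], LEVELS[mental]
    if a == m:
        return "general correlation of ratings"
    if a > m:
        return f"{academic.capitalize()} academic, {mental} mental"
    return f"{mental.capitalize()} mental, {academic} academic"


print(compare_ratings("high", "low"))
print(compare_ratings("low", "high"))
print(compare_ratings("medium", "medium"))
```

 A rule of this sort only names the direction of a discrepancy. It
 says nothing of its cause, which the text attributes to motivation
 and other factors external to competency.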
 
 Conclusions. 
 
 (1) The psychologist should engage in the analysis and evalu- 
 ation of the "ability" components of the college student's competency 
 rather than in the correlation of general intelligence tests with aca- 
 demic grades. 
 
 (2) The abilities required for scholastic success, under the 
 present methods of college instruction and grading, are not all of the 
 abilities comprising individual competency. Hence the failure of 
 test results to correlate with college grades. The better the general 
 intelligence test, the smaller will be the correlation with academic 
 standing. 
 
 (3) College grades will provide more satisfactory material 
 for statistical treatment when each institution adopts a standard 
 distribution of grades and provides for supervision by some adminis- 
 trative officer. 
 
 (4) Tests for college students must be devised which place less 
 dependence upon time measurement, which have a higher reliability 
 coefficient, and which are of greater difficulty, than most of the tests 
 now available. 
 
 (5) Motivation and environmental and economic conditions 
 have not as yet yielded to quantitative treatment. Until they do, 
 it will not be possible to predict with accuracy the success of a student 
 in college or in any other field of endeavor. 
 
 (6) Test ratings such as those presented here should be made 
 available to deans, faculty advisers, and committees dealing with 
 scholastic deficiency. In many instances this information would 
 be of value to the student, also, providing him with educational or 
 vocational guidance. 
 
 (7) A "follow up" of the fifty students who have provided the 
 material for this study will be published at some future date. 
 
 (8) Only after many investigations are at hand with diagnoses 
 carefully followed up over a period of years will psychological diag- 
 nosis and orthogenic guidance become as reliable for the normal 
 individual as they are now for the subnormal. 
 
 BIBLIOGRAPHY. 
 
 1. Wissler, Clark. The Correlation of Mental and Physical Tests. 
 Psychological Review Monograph Supplement 8, 1901, No. 6, 1-61. 
 
 2. Calfee, M. College Freshmen and Four General Intelligence Tests. 
 Journal of Educational Psychology, 1913, 4, 223-231. 
 
 3. Rowland, E., and Lowden, G. Report of Psychological Tests at Reed 
 College. Journal of Experimental Psychology, 1916, 1, 211-217. 
 
 4. Waugh, Karl T. A New Mental Diagnosis of the College Student. 
 New York Times Magazine Supplement, January 2, 1916. 
 
 5. Kitson, H. D. Scientific Study of the College Student. Psychological 
 Monograph No. 98, 1917. Pp. 81. 
 
 6. King, I., and McCrory, J. Freshmen Tests at the State University of 
 Iowa. Journal of Educational Psychology, 1918, 9, 32-46. 
 
 7. Caldwell, H. H. Adult Tests of the Stanford Revision Applied to 
 College Students. Journal of Educational Psychology, 1919, 10, 477-488. 
 
 8. Rogers, A. L. Mental Tests as a Means of Selecting and Classifying 
 College Students. Journal of Educational Psychology, 1920, 4, 181-192. 
 
 9. Humpstone, H. J. Some Aspects of the Memory Span Test. Experi- 
 mental Studies in Psychology and Pedagogy, 7. Psychological Clinic Press, 
 Philadelphia, 1917. Pp. 31. 
 
 10. Terman, L. M. The Measurement of Intelligence. Houghton-Mifflin 
 Company, Cambridge, 1916. Pp. 362. 
 
 11. Trabue, M. R. Completion-Test Language Scales. Teachers College, 
 Columbia University, 1916. Pp. 118. 
 
 12. Courtis, S. A. The Courtis Standard Tests. Department of Co-oper- 
 ative Research, Detroit, 1914. Pp. 125. 
 
 13. Whipple, G. M. Manual of Mental and Physical Tests, Part II. 
 Warwick and York, Baltimore, 1915. Pp. 336. 
 
 14. Paschal, F. C. The Witmer Cylinder Test. The Hershey Press, 
 Hershey, Pa., 1918. Pp. 54. 
 
 15. Finkelstein, I. E. The Marking System in Theory and Practice. 
 Warwick and York, Baltimore, 1913. Pp. 87. 
 