Using College Admissions Exams: Part II
Posted by Howard Wainer on Mon, Nov 10, 2008 @ 02:49 PM
Today's blog post is the second by Dr. Howard Wainer, who is the Distinguished
Research Scientist at the National Board of Medical Examiners, as well
as Professor of Statistics at the Wharton School of the University of
Pennsylvania. Dr. Wainer is also a member of Criteria's Scientific Advisory Board.
In an earlier post
I commented on one aspect of a report, commissioned by the National
Association for College Admission Counseling, that was critical of the
current college admission exams, the SAT and the ACT. The commission
was chaired by William R. Fitzsimmons, the dean of admissions and
financial aid at Harvard.
One of the recommendations of the Commission was for colleges to
consider making their admissions tests (SAT or ACT) optional. Using
data from Bowdoin College, which has had such a policy for almost 40
years, I showed that those students who did not submit their SAT scores
had, in fact, scored about a standard deviation lower than those
students who did submit them. This isn't surprising. More important,
the students who did not submit SAT scores also performed about a
standard deviation lower in their freshman grade point average at
Bowdoin. This would have been predictable from their SAT scores had the
College insisted on them. My conclusion is that colleges deny
themselves useful information by making SATs optional. And the
Commission, by making its recommendations in the absence of such
data, was shooting in the dark.
In this post I'd like to discuss another of their principal recommendations:
Schools should consider eliminating the SAT/ACT altogether
and substituting achievement tests instead. They cite the unfair effect
of coaching as the motivation for this. They weren't naive enough to
suggest that if achievement tests were to become higher stakes,
coaching for them would not be offered. Rather, they argued that such
coaching would be related to schooling and hence more beneficial to
education than is coaching that focuses on test-taking skills.
Driving the Commission's recommendations was the notion that the
differential availability of commercial coaching made admissions
testing unfair. They recognized that the 100-point gain (on the
1200-point SAT scale) that test prep providers often tout as a typical
outcome was hype and agreed with the estimates from more neutral sources
that about 20 points was more likely. However, they deemed even 20 points
too many. The Commission pointed out that there was no widespread
coaching for achievement tests, but agreed that should the admissions
option shift to achievement tests, the coaching would likely follow.
This would be no fairer to those applicants who could not afford extra
coaching, but at least the coaching would be of material more germane
to the subject matter and less related to test-taking strategies.
One can argue with the logic of this – that a test that is less
subject oriented and related more to the estimation of a general
aptitude might have greater generality. And that a test that is less
related to specific subject matter might be fairer to those students
whose schools have more limited resources for teaching a broad range of
courses. I find these arguments persuasive, but I have no data at hand
to support them. So instead I will take a different, albeit more
technical, tack. I will argue that the psychometric reality associated
with replacing general aptitude tests with achievement tests makes
the kinds of comparisons that schools need among different
candidates impossible.
When all students take the same tests we can compare their scores on
the same basis. The SAT and ACT were constructed specifically to be
suitable for a wide range of curricula. SAT-Math is based on
mathematics no more advanced than 8th grade. Contrast this
with what would be the case with achievement tests. There would need to
be a range of tests and students would choose a subset of them that best
displayed both the coursework they had had and the areas they felt they
were best in. Some might take chemistry, others physics; some French,
others music. The current system has students typically taking three
achievement tests (SAT-II). How can such very different tests be scored
so that the outcome on different tests can be compared? Do you know
more French than I know physics? Was Mozart a better composer than
Einstein was a physicist? How can admissions officers make sensible
decisions from incomparable scores?
How are SAT-II exams scored currently? Or, more specifically, how
were they scored for the decades before I left the employ of ETS seven
years ago? I don't know whether anything has changed in the interim.
They were all scored on the familiar 200-800 scales, but similar scores
on two different tests are only vaguely comparable. How could they be?
What is currently done is that tests in mathematics and science are
roughly equated using the SAT-Math, the aptitude test that everyone
takes, as an equating link. In the same way tests in the humanities and
social sciences are equated using the SAT-Verbal. This is not a great
solution, but is the best that can be done in a very difficult
situation. The resulting scores are too rough to settle even
moderately close comparisons between, say, history and physics.
One obvious approach would be to norm-reference each test, so that
someone who scores average for all those who take a particular test
gets a 500 and someone a standard deviation higher gets a 600, etc.
This would work if the people who take each test were, in some sense,
of equal ability. But that is not only unlikely, it is empirically
false. The average student taking the French achievement test might
starve to death in a French restaurant, whereas the average person
taking the Hebrew achievement test might do just fine if dropped in
the middle of the night onto the streets of Tel Aviv. Happily, the
latter students also do much better on the SAT-Verbal test and so the
equating helps. This is not true for the Spanish test, where a
substantial portion of those taking it come from Spanish-speaking homes.
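
To make the norm-referencing problem concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the numbers are made up, the function name norm_reference is hypothetical, and the "proficiency" values pretend there is a common scale that in reality we never get to observe. It is not how any testing organization actually scores its exams.

```python
from statistics import mean, pstdev

def norm_reference(raw_scores):
    """Rescale scores so that this test's own takers have mean 500 and SD 100."""
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)
    return [500 + 100 * (x - mu) / sigma for x in raw_scores]

# Hypothetical proficiency values on a shared scale (invented for illustration).
# The second group is far more proficient in its language than the first.
french_takers = [20, 30, 40, 50, 60]
hebrew_takers = [60, 70, 80, 90, 100]

french_scaled = norm_reference(french_takers)
hebrew_scaled = norm_reference(hebrew_takers)

# The middle taker in each group sits exactly at that group's mean.
print(french_scaled[2], hebrew_scaled[2])  # prints 500.0 500.0
```

Both middle test takers come out at 500, even though in this invented example the second group is 40 points more proficient on the underlying scale. Norm-referencing within each test erases exactly the between-group difference that an admissions officer would need to see.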
Substituting achievement tests is not a practical option unless
admissions officers are prepared to have subject matter quotas. I believe that solution would be too inflexible to be feasible.