Posted by Howard Wainer on Thu, Feb 05, 2009 @ 04:26 PM
Today's blog post is by Dr. Howard Wainer, who is the Distinguished Research Scientist at the National Board of Medical Examiners, as well as Professor of Statistics at the Wharton School of the University of Pennsylvania. Dr. Wainer received his Ph.D. from Princeton University, has won numerous scholarly awards, and spent 21 years as Principal Research Scientist in the Research Statistics Group at the Educational Testing Service. He is also, as far as we know, the only member of Criteria's Scientific Advisory Board to have swum the English Channel.
On September 22, 2008, the New York Times carried the first of three articles about a report, commissioned by the National Association for College Admission Counseling, that was critical of the current college admission exams, the SAT and the ACT. The commission was chaired by William R. Fitzsimmons, the dean of admissions and financial aid at Harvard.
The report was reasonably wide-ranging and drew many conclusions while offering alternatives. Although well-meaning, many of the suggestions only make sense if you say them fast.
Among their conclusions were:
- Schools should consider making their admissions "SAT optional," that is, allowing applicants to submit their SAT/ACT scores if they wish without making them mandatory. The commission cites the success that pioneering schools with this policy have had in the past as proof of concept.
- Schools should consider eliminating the SAT/ACT altogether and substituting achievement tests instead. They cite the unfair effect of coaching as the motivation for this. They weren't naive enough to suggest that, because there is no coaching for achievement tests now, none would be offered if those tests became more high-stakes. Rather, they argued that such coaching would be related to schooling and hence more beneficial to education than is coaching that focuses on test-taking skills.
- The use of the PSAT with a rigid qualification cut-score for such scholarship programs as the Merit Scholarships should be immediately halted.
I will not attempt to discuss all three of these here, just the first one. If there is sufficient interest in this topic, this entry will be followed by others.
Has the admissions process been hampered in schools that have instituted an SAT optional policy?
The first reasonably competitive school to institute such a policy was Bowdoin College, in 1969. Bowdoin is a small, highly competitive liberal arts college in Brunswick, Maine. A shade under 400 students a year elect to matriculate at Bowdoin, and roughly a quarter of them choose not to submit their SAT scores. Table 1 summarizes the classes at Bowdoin and five other institutions whose entering freshman classes had approximately the same average SAT score. At the other five institutions, the students who didn't submit SAT scores used ACT scores instead.
| Institution | All Students: N | Submitted SAT Scores: N | Submitted SAT Scores: Mean | Did Not Submit: N |
|---|---|---|---|---|
| Northwestern University | 1,654 | 1,505 | 1347 | 149 |
| Bowdoin College | 379 | 273 | 1323 | 106 |
| Carnegie Mellon University | 1,132 | 1,039 | 1319 | 93 |
| Barnard College | 419 | 399 | 1297 | 20 |
| Georgia Institute of Technology | 1,667 | 1,498 | 1294 | 169 |
| Colby College | 463 | 403 | 1286 | 60 |
| Means and Totals | 5,714 | 5,117 | 1316 | 597 |
Table 1: Six Colleges/Universities with similar observed mean SAT scores for the entering class of 1999.
To know how Bowdoin's SAT policy is working, we need to know two things. First, how did the students who didn't submit SAT scores do at Bowdoin in comparison to those who did submit them? And second, would the non-submitters' performance at Bowdoin have been predicted by their SAT scores?
The first question is easily answered by looking at their first year grades at Bowdoin. These are shown in Figure 1 below.
Figure 1: Bowdoin students who did not send their SAT scores performed worse in their first year courses than those who did submit them.

We see that non-SAT submitters did about a standard deviation worse than students who did submit SAT scores. And so, we can conclude that if the admissions office were using other variables to make up for the missing SAT scores, those variables did not contain enough information to prevent them from admitting a class that was academically inferior to the rest.
But would their SAT scores have provided information missing from the other material they submitted? Ordinarily this would be a question that is impossible to answer, for these students did not submit their SAT scores. However, all of these students actually took the SAT, and through a special data-gathering effort at the Educational Testing Service we find that the students who didn't submit their scores behaved sensibly. Realizing that their lower-than-average scores would not help their chances at Bowdoin, they chose not to submit them. Below (Figure 2) is the distribution of SAT scores for those who submitted them as well as those who did not.
Figure 2: Those students who don't submit SAT scores to Bowdoin score about 120 points lower than those who do submit their scores.

As it turns out, the SAT scores of the students who did not submit them would have accurately predicted their lower performance at Bowdoin. In fact, the correlation between grades and SAT scores was 12% higher for those who didn't submit them than for those who did.
So not having this information does not improve the academic performance of Bowdoin's entering class — on the contrary it diminishes it. Why would a school opt for such a policy? Why is less information preferred to more? There are surely many answers to this, but one is seen in an augmented version of Table 1 (below).
| Institution | All Students: N | All Students: Mean | Submitted SAT Scores: N | Submitted SAT Scores: Mean | Did Not Submit: N | Did Not Submit: Mean |
|---|---|---|---|---|---|---|
| Northwestern University | 1,654 | 1338 | 1,505 | 1347 | 149 | 1250 |
| Bowdoin College | 379 | 1288 | 273 | 1323 | 106 | 1201 |
| Carnegie Mellon University | 1,132 | 1312 | 1,039 | 1319 | 93 | 1242 |
| Barnard College | 419 | 1293 | 399 | 1297 | 20 | 1213 |
| Georgia Institute of Technology | 1,667 | 1288 | 1,498 | 1294 | 169 | 1241 |
| Colby College | 463 | 1278 | 403 | 1286 | 60 | 1226 |
| Means and Totals | 5,714 | 1307 | 5,117 | 1316 | 597 | 1234 |
We see that if all of the students in Bowdoin's entering class had their SAT scores included, the average SAT at Bowdoin would sink from 1323 to 1288, and instead of being second among these six schools Bowdoin would have been tied for next to last. Since mean SAT scores are a key component in school rankings, a school can game those rankings by allowing its lowest-scoring students to be left out of the average. I believe that Bowdoin's adoption of this policy pre-dates U.S. News & World Report's rankings, so that was unlikely to have been its motivation, but I cannot say the same for schools that have chosen such a policy more recently.
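For readers who want to check the arithmetic, here is a minimal sketch that recomputes each school's all-student mean as a size-weighted average of the two group means in the table, then ranks the schools both ways. The function and variable names are my own illustrative choices. Because the published group means are already rounded, the recomputed values can differ from the table's "All Students" column by a point.

```python
# (institution, N submitted, mean submitted, N not submitted, mean not submitted)
# Figures are copied from the augmented table; names below are illustrative.
ROWS = [
    ("Northwestern University", 1505, 1347, 149, 1250),
    ("Bowdoin College", 273, 1323, 106, 1201),
    ("Carnegie Mellon University", 1039, 1319, 93, 1242),
    ("Barnard College", 399, 1297, 20, 1213),
    ("Georgia Institute of Technology", 1498, 1294, 169, 1241),
    ("Colby College", 403, 1286, 60, 1226),
]


def pooled_mean(n_a, mean_a, n_b, mean_b):
    """Mean of two groups combined, weighting each group mean by its size."""
    return (n_a * mean_a + n_b * mean_b) / (n_a + n_b)


def rank_of(name, scores):
    """Rank with 1 = highest; tied schools share the better rank."""
    return 1 + sum(1 for s in scores.values() if s > scores[name])


# Mean over submitters only (what an SAT-optional school would report).
reported = {name: m_sub for name, _, m_sub, _, _ in ROWS}
# Mean over everyone, rounded to whole points like the table.
pooled = {
    name: round(pooled_mean(n_sub, m_sub, n_non, m_non))
    for name, n_sub, m_sub, n_non, m_non in ROWS
}

for name in reported:
    print(f"{name:32s} submitters {reported[name]}   all students {pooled[name]}")

print("Bowdoin rank, submitters only:", rank_of("Bowdoin College", reported))
print("Bowdoin rank, all students:  ", rank_of("Bowdoin College", pooled))
```

Bowdoin's rank drops from second to a tie for fourth of six with Georgia Tech, which is the "tied for next to last" position described above.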
Posted by Howard Wainer on Mon, Nov 10, 2008 @ 02:49 PM
Today's blog post is the second by Dr. Howard Wainer, who is the Distinguished
Research Scientist at the National Board of Medical Examiners, as well
as Professor of Statistics at the Wharton School of the University of
Pennsylvania. Dr. Wainer is also a member of Criteria's Scientific Advisory Board.
In an earlier post
I commented on one aspect of a report, commissioned by the National
Association for College Admission Counseling, that was critical of the
current college admission exams, the SAT and the ACT. The commission
was chaired by William R. Fitzsimmons, the dean of admissions and
financial aid at Harvard.
One of the recommendations of the Commission was for colleges to
consider making their admissions tests (SAT or ACT) optional. Using
data from Bowdoin College, which has had such a policy for almost 40
years, I showed that those students who did not submit their SAT scores
had, in fact, scored about a standard deviation lower than those
students who did submit them. This isn't surprising. More important,
the students who did not submit SAT scores also performed about a
standard deviation lower in their freshman grade point average at
Bowdoin. This would have been predictable from their SAT scores had the
College insisted on them. My conclusion is that colleges deny
themselves useful information by making SATs optional. And the
Commission, by making their recommendations in the absence of such
data, was shooting in the dark.
In this post I'd like to discuss another of their principal recommendations:
Schools should consider eliminating the SAT/ACT altogether
and substituting instead achievement tests. They cite the unfair effect
of coaching as the motivation for this; they weren't naive enough to
suggest that if achievement tests were to become more high-stakes,
coaching for them would not be offered. Rather, they argued that such
coaching would be related to schooling and hence more beneficial to
education than is coaching that focuses on test-taking skills.
Driving the Commission's recommendations was the notion that the
differential availability of commercial coaching made admissions
testing unfair. They recognized that the 100-point gain (on the
1200-point SAT scale) test prep providers often tout as a typical outcome
was hype and agreed with the estimates from more neutral sources that
about 20 points was more likely. However, they deemed even 20 points
too many. The Commission pointed out that there was no widespread
coaching for achievement tests, but agreed that should the admissions
option shift to achievement tests the coaching would likely follow.
This would be no fairer to those applicants who could not afford extra
coaching, but at least the coaching would be of material more germane
to the subject matter and less related to test-taking strategies.
One can argue with the logic of this – that a test that is less
subject oriented and related more to the estimation of a general
aptitude might have greater generality. And that a test that is less
related to specific subject matter might be fairer to those students
whose schools have more limited resources for teaching a broad range of
courses. I find these arguments persuasive, but I have no data at hand
to support them. So instead I will take a different, albeit more
technical, tack. I will argue that the psychometric reality associated
with replacing general aptitude tests with achievement tests makes
the kinds of comparisons that schools need among different
candidates impossible.
When all students take the same tests we can compare their scores on
the same basis. The SAT and ACT were constructed specifically to be
suitable for a wide range of curricula. SAT–Math is based on
mathematics no more advanced than 8th grade. Contrast this
with what would be the case with achievement tests. There would need to
be a range of tests and students would choose a subset of them that best
displayed both the coursework they had had and the areas they felt they
were best in. Some might take chemistry, others physics; some French,
others music. The current system has students typically taking three
achievement tests (SAT-II). How can such very different tests be scored
so that the outcome on different tests can be compared? Do you know
more French than I know physics? Was Mozart a better composer than
Einstein was a physicist? How can admissions officers make sensible
decisions through incomparable scores?
How are SAT-II exams scored currently? Or, more specifically, how
had they been scored for decades when I left the employ of ETS seven
years ago – I don't know if they have changed anything in the interim.
They were all scored on the familiar 200-800 scales, but similar scores
on two different tests are only vaguely comparable. How could they be?
What is currently done is that tests in mathematics and science are
roughly equated using the SAT-Math, the aptitude test that everyone
takes, as an equating link. In the same way tests in the humanities and
social sciences are equated using the SAT-Verbal. This is not a great
solution, but is the best that can be done in a very difficult
situation. Comparing history with physics is not worth doing for even
moderately close comparisons.
One obvious approach would be to norm reference each test, so that
someone who scores average for all those who take a particular test
gets a 500 and someone a standard deviation higher gets a 600, etc.
This would work if the people who take each test were, in some sense,
of equal ability. But that is not only unlikely, it is empirically
false. The average student taking the French achievement test might
starve to death in a French restaurant, whereas the average person
taking the Hebrew achievement test might do just fine if dropped in
the middle of the night onto the streets of Tel Aviv. Happily the
latter students also do much better on the SAT-Verbal test and so the
equating helps. This is not true for the Spanish test, where a
substantial portion of those taking it come from Spanish-speaking homes.
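To make the norm-referencing problem concrete, here is a minimal sketch of the scaling described above: the group mean maps to 500 and each standard deviation to 100 points. The two groups of "proficiency" numbers are invented purely for illustration and are not real test data. The function name is likewise my own.

```python
from statistics import mean, pstdev


def norm_referenced(raw, group_scores):
    """Scale a raw score so the group's mean maps to 500 and one SD to 100 points."""
    mu = mean(group_scores)
    sigma = pstdev(group_scores)
    return 500 + 100 * (raw - mu) / sigma


# Hypothetical proficiency levels on a shared 0-100 scale (invented numbers).
# The second group is far more proficient in its language than the first.
french_takers = [20, 30, 40, 50, 60]
hebrew_takers = [60, 70, 80, 90, 100]

# The average member of each group receives the same scaled score,
# even though their underlying proficiency differs by a factor of two.
print(norm_referenced(40, french_takers))
print(norm_referenced(80, hebrew_takers))
```

Scaling within each group erases exactly the between-group difference an admissions officer would need to see, which is why some outside link, such as the SAT-Verbal equating described above, is required.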
Substituting achievement tests is not a practical option unless
admissions officers are prepared to have subject matter quotas. I believe that solution would be too inflexible to be feasible.