Saturday, June 27, 2020

The SAT will still have an experimental section -- but not everyone will be taking it

The Washington Post reported yesterday that the new SAT will in fact continue to include an experimental section. According to James Murphy of the Princeton Review, guest-writing in Valerie Strauss's column, the change was announced at a meeting for test-center coordinators in Boston on February 4th.

To sum up: The SAT has traditionally included an extra section -- either Reading, Writing, or Math -- that is used for research purposes only and is not scored. In the past, every student taking the exam under regular conditions (that is, without extra time) received an exam that included one of these sections. On the new SAT, however, only students not taking the test with writing (essay) will be given versions of the test that include experimental multiple-choice questions, and then only some of those students. The College Board has not made it clear what percentage will take the experimental version, nor has it indicated how those students will be selected.

Murphy writes: "In all the public relations the company has done for the new SAT, however, no mention has been made of an experimental section. This omission led test-prep professionals to conclude that the experimental section was dead."

He's got that right -- I certainly assumed the experimental section had been scrapped! And I spend a fair amount of time communicating with people who stay much more in the loop about the College Board's less-publicized wheelings and dealings than I do.

Murphy continues:

"The College Board has not been transparent about the inclusion of this section. Even in that one place it mentions experimental questions -- the counselors' guide available for download as a PDF -- you need to be familiar with the language of psychometrics to even know that what you're actually reading is the announcement of experimental questions.

"'The SAT will be given in a standard testing room (to students with no testing accommodations) and consist of four components -- five if the optional 50-minute Essay is taken -- with each component timed separately. The timed portion of the SAT with Essay (excluding breaks) is three hours and 50 minutes. To allow for pretesting, some students taking the SAT with no Essay will take a fifth, 20-minute section. Any section of the SAT may contain both operational and pretest items.'

"The College Board document defines neither 'operational' nor 'pretest.' Nor does this paragraph make it clear whether all the experimental questions will appear only on the fifth section, at the start or end of the test, or will be dispersed throughout the exam. During the session, I asked if all the questions on the extra section won't count and was told they would not. This paragraph is less clear on that issue, since it suggests that experimental ('pretest') questions can show up on any section. When The Washington Post asked for clarification on this question, they were sent the counselor's paragraph, verbatim. Once again, the terminology was not defined and it was not clarified that 'pretest' does not mean before the exam, but experimental."

For starters, I was unaware that the term "pretest" could have a second meaning. Even by the College Board's current standards, that's pretty brazen (although closer to the norm than not).

Second, I'm not sure how it is possible to have a standardized test that has different versions with different lengths, but one set of scores. (Although students who took the old test with accommodations did not receive an experimental section, they presumably formed a group small enough not to be statistically significant.) In order to ensure that scores are as valid as possible, it would seem reasonable to ensure that, at bare minimum, as many students as possible receive the same version of the test.
As Murphy rightly points out, issues of fatigue and pacing can have a significant effect on students' scores: a student who takes a longer test will, almost certainly, become more tired and thus more likely to incorrectly answer questions that he or she would otherwise have gotten right.

Third, I'm no expert in statistics, but there would seem to be some problems with this method of data collection. Because the old experimental section was given to nearly all test-takers, any information gleaned from it could be assumed to hold true for the general population of test-takers. The problem now is not simply that only one group of testers will be given experimental questions, but that the group given experimental questions and the group not given experimental questions may not be comparable. If you consider that the colleges requiring the Essay are, for the most part, quite selective, and that students tend not to apply to those schools unless they're somewhere in the ballpark academically, then it stands to reason that the group sitting for the Essay will be, on the whole, a higher-scoring group than the group not sitting for the Essay. Consequently, the results obtained from the non-Essay group might not apply to test-takers across the board. Let's say, hypothetically, that test-takers in the Essay group are more likely to correctly answer a certain question than are test-takers in the non-Essay group. Because the only data obtained will be from students in the non-Essay group, the proportion of students answering that question correctly will be lower than it would be if the entire group of test-takers were taken into account.
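
To put rough numbers on that hypothetical, here is a minimal Python sketch. Every figure in it is invented purely for illustration: the group sizes, the percent-correct rates, and the function name `observed_vs_population_rate` are assumptions, not College Board data or methodology. It simply compares the percent-correct you would see if you sampled only the non-Essay group with the percent-correct across all test-takers.

```python
# All numbers here are hypothetical, chosen only to illustrate the argument.

def observed_vs_population_rate(n_essay, p_essay, n_no_essay, p_no_essay):
    """Return (rate seen in the non-Essay group alone, rate over everyone).

    n_*: number of test-takers in each group
    p_*: fraction of that group answering the question correctly
    """
    total_correct = n_essay * p_essay + n_no_essay * p_no_essay
    population_rate = total_correct / (n_essay + n_no_essay)
    # Experimental questions are only given to the non-Essay group,
    # so that group's rate is the only one the test-maker observes.
    observed_rate = p_no_essay
    return observed_rate, population_rate


observed, population = observed_vs_population_rate(
    n_essay=600, p_essay=0.70,        # assumed stronger group
    n_no_essay=400, p_no_essay=0.50,  # assumed weaker group
)
print(f"non-Essay only: {observed:.2f}, all test-takers: {population:.2f}")
# prints "non-Essay only: 0.50, all test-takers: 0.62"
```

With these made-up inputs, the question looks harder (50% correct) than it would be for the full population (62% correct). The size of the gap depends entirely on the assumed numbers; the sketch only shows the direction of the distortion.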
If the same phenomenon repeats itself for many, or even every, experimental question, and new tests are created based on the data gathered from the two unequal groups, then the entire level of the test will eventually shift down -- perhaps further erasing some of the score gap, but also giving a further advantage to the stronger (and likely more privileged) group of students on future tests.

All of this is speculation, of course. It's possible that the College Board has some way of statistically adjusting for the difference in the two groups (maybe the Hive can help with that!), but even so, you have to wonder... Wouldn't it just have been better to create a five-part exam and give the same test to everyone?
