Since the 1980s, Educational Testing Service (ETS), which dominated educational admission testing from 1940 to 1980, has been hemorrhaging product lines. In its heyday (SAT word), ETS was the Sauron to US education’s Middle Earth, providing admissions tests for the vast majority of professional certification programs and higher ed admissions. Its services ranged from teacher certification exams to the SAT, GRE, GMAT, MCAT, and LSAT. In the last decade or so, ETS’s business strategy has changed, and the organization has begun to aggressively market its most popular remaining assessment product, the Graduate Record Exam (commonly known by its initialism, GRE), as “the One Test to Assess Them All.” This strategic market grab, while an interesting business strategy, raises significant questions about all admission tests. Specifically, the expansion of the GRE into fields beyond its design should force responsible test users to reevaluate long-held assumptions about what information is gained by requiring the GRE (and all its brethren) and at what cost.
First, let’s explore how we arrived at this most interesting point in time, when ETS is claiming that its test of “general educational attainment” is somehow appropriate to measure the specific skills necessary to pursue a degree in engineering, accounting, or veterinary sciences.
In 1949, ETS, which at the time developed and administered the SAT, created a general aptitude test as another option within its series of tests called the Graduate Record Examinations, which consisted of the Profile Tests (similar to today’s GRE Subject Tests) and the Tests of General Education. At that time, the GREs were used primarily by universities to measure their own undergraduates prior to admission into a graduate program. The GRE continued to grow in usage following WWII and evolved as more universities adopted the GREs as well as ETS’s other exams. In the ’60s, and through much of the ’70s, approximately 300,000 people per year took the Aptitude Test. By 1972, ETS offered 19 Advanced Tests (née Profile Tests) and the Aptitude Test with verbal and quantitative sections as part of its GRE. Until the early ’80s, ETS didn’t even have to compete in the test prep industry, as the only company doing any real test prep was Kaplan, which at the time was still relatively small.
According to the February 1947 Bulletin of the Graduate Record Examinations, the Tests of General Education were designed “to measure as directly as possible the attainment of important objectives of general education at the college level.”
All was well in the realm as admissions tests were viewed as predictive of aptitude or intelligence, and ETS/Sauron was mystically powered and ruled with an iron fist from its impenetrable lair, watching us and dictating our (educational) fates, effectively controlling access to college, graduate school, business school, and law school. But over time ETS lost its contracts for these lucrative product lines: AAMC formed in the 1960s and took over production and management of the MCAT (Medical College Admission Test), LSAC took control of the LSAT (Law School Admission Test) in or around 1988, GMAC awarded the writing and administration of the GMAT (Graduate Management Admission Test) to ACT/Pearson in 2005, and College Board reclaimed much of the SAT authorship in 2014. So what’s an evil overlord to do?
ETS responded with a shift in marketing and began to actively compete for new business. To support this effort it produced studies validating the GRE as a test that predicts success “equally well” (defined, oddly, as achieving a certain first year GPA) in masters level study of veterinary medicine, theology, advanced mathematics, and business. Since 2004, ETS has been rather effective in penetrating MBA admission offices, and in the last 4 years, ETS has been working hard to convince law schools and the ABA that the GRE is the panacea to save the declining pursuit of law education.
Jack of All Trades, Master of None
As ETS continues to aggressively market its product to a wider array of graduate admissions offices, I continue to worry that it is creating the educational version of the America’s Got Talent “competition.” By using the GRE as the measure of students’ ability to perform in any and every academic setting, are we essentially pitting jugglers against comedians and contortionists against singers to determine who is “best”? In what world is there a rational argument that the skills necessary for success in each of these fields are comparable? How does one fairly assess such disparate activities?
As more business and law schools accept the GRE in addition to the tests traditionally required (which admittedly have their own issues), does that signal that admissions offices are acknowledging that these hallowed tools are merely testing superficially relevant skills and abilities? What’s next, the NBA using the presidential fitness test as a pre-draft evaluation tool? Using the same combine for MLB and NFL players? Using the same audition for company members of the American Ballet Theatre and participants in Dancing with the Stars? All of these would be terrible uses of assessments. All of these are essentially what ETS has been fairly successful in convincing American higher ed to take a shot at.
By producing research showing that the GRE is capable of validly predicting the performance of candidates for everything from a Master of Science in Biology to a Master of Arts in Literature, ETS is essentially arguing that all educational success boils down to remembering basic high school math, tier II and tier III vocabulary words, the ability to narrowly interpret informational texts, and the ability to perform under intense time pressure. While these skills are certainly present in many fields, are they relevant enough to carry the weight often assigned to the GRE? Does this new, broader acceptance of the GRE highlight major flaws in our assumptions about these tests? Could one not argue that if the GRE, LSAT, GMAT, and MCAT all perform similarly in predictive validity regardless of program type, then they are all terrible predictors? As more institutions consider the GRE, are they concurrently evaluating their commitment to testing overall? Are institutions asking themselves whether admissions tests have lived up to their promise of objective evaluation, or whether they are simply adding statistically attractive noise to decisions?
If the GRE has essentially equal predictive validity for first year success in English, Law, advanced math, business, doctoral studies, and engineering does that not indicate that it’s an equally poor predictor for all these aspiring students?
Another way to look at this might be to ask, if Buzzfeed were to show that their quizzes had a .4 correlation with first year grad school GPA would admissions committees put “Which Middle-Earth Character Are You?” quizzes into their application process?
Correlation values from various test reports. Full research reports are provided in the receipts section at the bottom. The methodologies used to reach each of these correlations are potentially different (I posted the r value in all cases, not r-squared), so this is provided as general guidance rather than a tool for direct comparison.
In the conversations about testing, a key indicator (once we’ve established that the content is relevant and worthwhile to test) is the test’s predictive validity, which is usually expressed as a correlation to first year GPA. Even before delving into that correlation, it’s worth pausing to point out the oddity of using first year graduate school GPA as the metric of success. It would seem more logical for a school to concern itself with the final product produced (the graduates), not some arbitrary initial marker. But alas, validity is rarely discussed in terms of its correlation to graduation or final GPA.
So let’s look at the correlation for the GRE as it applies to various graduate programs. One of the interesting things about these correlations is that they generally fall in the range of .1 to .5 across all tests, after any statistical adjustments (range restrictions, methodology variations, zero-order correlations versus regression weights, non-traditional methodology, Cohen’s rules of thumb, redefinitions of what counts as statistically relevant, etc.). The range of these correlations also seems surprisingly large; studies that looked at individual institutions found correlations varying greatly by institution. These results reinforce the need for institutions and programs to look closely at what test they currently require and what value it currently offers. An institution that requires these tests out of blind adherence to tradition is doing a grave disservice to its potential students and to education writ large.
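To make those numbers concrete, remember that squaring a correlation r gives the share of variance in first year GPA a test score “explains.” A quick sketch (the r values below are simply the endpoints and midpoints of the .1 to .5 range cited above, not figures from any single study):

```python
# r-squared = share of first-year-GPA variance explained by the test.
# Illustrative r values spanning the .1 to .5 range reported above.
for r in (0.1, 0.3, 0.4, 0.5):
    print(f"r = {r:.1f} -> explains {r**2:.0%} of GPA variance")
```

Even at the very top of the reported range (r = .5), three-quarters of the variation in first year grades is left unexplained by the test score.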
While the GRE might have some predictive power for first year grades, it fails on most other seemingly more important measures. Source: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0166742
Any responsible admission office should be pondering these questions each time it requires the GRE, GMAT, or LSAT. This is especially true when one considers that the genesis of so much of the impetus to use the GRE to evaluate a growing swath of graduate applicants is ultimately a business case being made by the business that would most benefit. Graduate schools should also consider more closely whether any value added by these tests is worth the problems they create. Further, several recent studies (Joshua D. Hall – UNC; Orion D. Weiner and Liane Moneta-Koehler – Vanderbilt) have shown that the GRE’s predictive validity beyond first year performance is highly suspect. In fact, ETS in its Guide to the Use of Scores warns against practices such as cutoffs and recommends institutional validity studies.
Is a test that provides only marginal additional information about who will be successful and who will not and is so susceptible to the influence of money (test preparation) the right tool to carry significant weight in the admission process?
Drawn from an ETS research report this chart shows how little impact higher GRE scores have on academic performance.
As important as the question of validity is the question that isn’t debated often enough: whether the content of the test has enough significance and relevance to merit its use. The underlying content of the GRE is reading comprehension, vocabulary, and high school math, tested in a high-speed manner. While a strong case can be made for the relevance of reading comprehension skills in all academic programs, vocabulary and math skills are much more suspect. The continued testing of stand-alone vocabulary on admissions tests is a vestige of a bygone epoch (you see what I did there?) when only the most educated or elite had wide access to the words needed to understand most academic texts. In the era of Google and digital readers, this is clearly no longer the case, and the testing of Tier II and Tier III words out of context is anachronistic to say the least.
Even if one could argue the relevance of testing somewhat obscure vocabulary, the math section is much harder to justify for most graduate programs. The GRE uses math rules, formulas, and principles typically taught in 7th through 10th grade to test “quantitative reasoning.” Yet, since most graduate students (especially those not studying anything STEM related) will struggle to recall many of these highly specific rules and formulae, the content basis of the GRE math creates a need to prepare expressly for the GRE in order to display the superficial “intelligence” that test success supposedly represents. As ETS continues pushing for new clients for its products, the relevance of the test seems to grow more and more questionable.
If we consider what’s tested on the LSAT (the most recent target in ETS’s war for dominance) compared to what’s tested on the GRE, it’s fairly clear that the skills tested on the two exams are vastly different. While statistically one might find similar predictive validity and correlations, it’s hard to believe this use passes the “smell test.”
Does the content tested on the GRE reasonably connect to the skills necessary for success in the program for which the use of the GRE is being considered?
Here are a few questions from each exam to help you make your own assessment:
GRE Text Completion
GRE Sentence Equivalence
Questions pulled from the June 2007 LSAT
LSAT Analytical Reasoning
If the content of the exam isn’t relevant to or indicative of the program of study, then is this just a box to check for admission offices? And if so, can we not improve on the metrics being used so that they are at least relevant, if not actually useful? The claim that the tests are objective can’t be enough to justify their use when there are so many red flags, questions, and concerns brought on by that use.
If the GRE is a good assessment of who is prepared for a particular program, one would imagine that the professors in that program would be able to ace the GRE. Any faculty member that argues the value of the GRE should be required to take the test and report their scores publicly.
The Diversity Argument
Finally, the expansion of the GRE has progressed partially on the strength of the narrative that accepting the GRE in addition to another test will increase the diversity of the candidate pool. On certain levels this argument is true and important. Law and business schools that accept the GRE will likely see an increase in students who were considering other graduate programs that required the GRE but did not accept either the LSAT or GMAT.
However, the heavily implied boost in racial diversity will likely not come from accepting the GRE. Across all graduate school admission tests, under-represented minorities score lower than white and Asian test takers. This is not surprising considering that the stratification of “good” test takers from “bad” has been routing Black and Hispanic students to less enriched academic environments since kindergarten. So unless there is a radical change in the way admissions staff consider GRE scores for law or business school, it is likely that the benefit of adding the GRE as an option will really only help those already benefiting from the system currently in place.
Consider the data below. The chart shows by how much of a standard deviation each ethnic group’s mean score varies from the overall mean. On the LSAT, Hispanic test-takers score .44 standard deviations below the mean, while on the GRE Verbal they score only .07 below; Asian test-takers score .23 of a standard deviation above the mean LSAT score. The only group for whom the GRE is a clear boon is Black test-takers (that’s probably worth some research). And while that might be good for Black test-takers, the overall impact on diversity at an institution might not change due to decreases in other groups.
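To get a feel for what those standard-deviation gaps mean in practice, one can convert them into approximate percentile positions. This is a back-of-the-envelope sketch, not a reanalysis of the reports: it uses only the effect sizes quoted above and assumes the score distributions are roughly normal.

```python
from statistics import NormalDist

# Effect sizes (in standard deviations) quoted in the text above.
gaps = {
    "Hispanic, LSAT": -0.44,
    "Hispanic, GRE Verbal": -0.07,
    "Asian, LSAT": 0.23,
}

# Under a normal distribution, a group whose mean sits d standard
# deviations from the overall mean lands at percentile 100 * Phi(d)
# of the overall distribution.
for group, d in gaps.items():
    pct = NormalDist().cdf(d) * 100
    print(f"{group}: group mean near the {pct:.0f}th percentile overall")
```

On these assumptions, a .44 SD deficit puts the group mean near the 33rd percentile of all LSAT takers, while a .07 SD deficit on the GRE Verbal leaves it near the 47th — which is the gap the diversity argument quietly glosses over.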
But what is interesting, and perhaps disingenuous, is the school’s claim that accepting GRE scores will promote student-body diversity “in all its forms.” This is because if the GRE is misused in the same manner as the LSAT, the admissions process will remain inequitable, at the expense of racial, ethnic and socioeconomic diversity.
– Aaron Taylor, Saint Louis University School of Law
So what’s all this mean? The slow creep of the GRE offers us an opportunity to reevaluate not only whether the GRE is appropriate to use, but whether any of these tests are holding to the promise of providing an objective measure of preparedness for graduate education.
I’d love to hear your thoughts; share them in the comments.
For pleasure or research, here are additional resources that might be of interest if you care about this topic.
… tweet master class
I rant and rail on twitter, and here are a few threads that might offer more information than is in this way-too-long post. Twitter threads sometimes break because of replies; usually if you click on the last tweet from me before a reply from someone else, the thread will continue.
- GRE Masterclass Part 1: History
- GRE Masterclass Part 2: Costs, Opportunity and Access to Information
- GRE Masterclass Part 3: Content and test prep
- Random validity tweets
…. Correlation studies (the validity studies from whence the numbers in the chart hailed)
One word of warning with research reports: you actually have to read them carefully. There are lots of caveats, statistical “adjustments,” and varying methods that make comparing the data from one report to another really difficult. If you have no training in these matters, you should probably consult a psychometrician to ensure a full understanding of the reports.
- GRE – New Perspectives on the Validity of the GRE® General Test for Predicting Graduate School Grades
- GRE for B-School – The Validity of Scores from the GRE® revised General Test for Forecasting Performance in Business Schools: Phase One (Yes, phase one in 2014; I could not find any other phases or later reports)
- GRE for Law School – Pre-Published Report not yet public
- LSAT – Predictive Validity of the LSAT: A National Summary of the 2013 and 2014 LSAT Correlation Studies (PDF)
- GMAT – Differential Validity and Differential Prediction of the GMAT® Exam
… Other studies and research reports
- 1977 GRE Technical Report (PDF) – a technical report for a test provides the research that underpins the test as well as metrics on performance. All good admissions test writers should publicly release research reports; some, like the SHSAT (written by Pearson) and the ISEE (written by the ERB), don’t.
- LSAT Performance With Regional, Gender, and Racial/Ethnic Breakdowns: 2007–2008 Through 2013–2014 (PDF)
- GRE Worldwide Test Taker Report – July 2013-June 2016
- GMAT Profile Reports
- MCAT Research and Data reports
- An analysis on the validity of the lexicon required by GRE® test takers
- GRE Compendium of Studies