Taking me to task

The Family Social Scientist takes me to task in the comments on my post on Decision Fatigue:

while this is an interesting quirk of the data and warrants further exploration, it's hardly conclusive. To say that the marshmallow test is "known to predict future success in life" is a little misleading and perhaps a misinterpretation of the results.

Touché. I really cannot complain that the FSS is calling me out the same way I call out others all the time. It is indeed perilous to try to glean science out of popular articles. I am a rank amateur in the field, and I know how that looks to an expert, because I have not always had the grace to deal lightly with those who have trespassed into my own technical specialty.

Nevertheless, I will persist on this topic, despite the many landmines, because it is absolutely fascinating to me. The FSS brings up some really good points in his comment that need to be considered:

grades are hardly a proper measurement of academic progress and intelligence

Quite true. This was arguably less true in the past, but grades, both high school and undergraduate, now have only a loose connection with either academic progress or intelligence. This is actually one of the things that first attracted me to psychometrics, because it gave me the mental tools to understand why grades aren't a measure of intelligence.

By way of example, consider this work by Steve Hsu and Jim Schombert on college GPA and SAT scores. In an interview, Hsu and Schombert explain some of the complexities their work uncovered:

“Freshman GPA is not a satisfactory metric of academic success,” Hsu explains. “There is simply too much variation in the difficulty of courses taken by freshmen.” More able freshmen typically take more difficult courses, whereas less able freshmen take introductory courses “not very different from high school classes,” he says. Under these circumstances, academic success—an “A” in an introductory course versus a “B” in an advanced course—becomes too relative to accurately measure. Course variation decreases in later years, as students settle into their respective majors, working hard in required classes.

The new approach bore fruit: SAT and ACT scores, their analysis showed, predict upper-level much better than lower-level college grades, “a significant and entirely new result,” Schombert says.
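To see why uneven course difficulty could blur the picture, here is a toy simulation. It is only a sketch under made-up assumptions, not Hsu and Schombert's analysis or data: every number in it (the noise levels, the 0.8 factor tying course difficulty to ability) is hypothetical and chosen purely for illustration. Each simulated student has a hidden ability, a test score that is a noisy reading of it, a lower-level grade earned in courses whose difficulty rises with ability, and an upper-level grade earned in courses of common difficulty.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, seed=0):
    """Return (r_lower, r_upper): correlation of test score with each grade.

    All coefficients below are illustrative assumptions, not estimates
    taken from any real study.
    """
    rng = random.Random(seed)
    scores, lower_grades, upper_grades = [], [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)
        # Test score: ability plus measurement noise.
        scores.append(ability + rng.gauss(0, 0.5))
        # Lower-level courses: abler students pick harder courses, so the
        # difficulty they face cancels out much of their advantage.
        difficulty = 0.8 * ability + rng.gauss(0, 0.5)
        lower_grades.append(ability - difficulty + rng.gauss(0, 0.5))
        # Upper-level courses: everyone faces roughly the same difficulty.
        upper_grades.append(ability + rng.gauss(0, 0.5))
    return pearson(scores, lower_grades), pearson(scores, upper_grades)


if __name__ == "__main__":
    r_lower, r_upper = simulate()
    print(f"score vs lower-level grades: r = {r_lower:.2f}")
    print(f"score vs upper-level grades: r = {r_upper:.2f}")
```

In this toy world the test score tracks upper-level grades far more closely than lower-level ones, even though it measures exactly the same ability in both cases. The only thing that changed is how much course selection muddies the grade.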

Hsu and Schombert are now working on including personality inventories in this assessment to see whether they can improve their model. My guess is that conscientiousness will be a big hitter. But there is a difficulty here. How do you measure conscientiousness? The short answer is: we don't know how. The longer answer is that we try various techniques to quantify a quality, such as personality inventories or the marshmallow test. Personality inventories are easy to administer, but they are also easy to game. If you know what the questions are getting at, you can manufacture any result you want. The marshmallow test and the ice bath test are a little better in this respect, because they push up against a hard limit that we hope is correlated with the thing we are interested in. Thus, even if you knew that holding your hand in the water was going to be used to judge your mental toughness, that would not spoil the test, because your ability to endure unpleasantness for a positive social judgement is exactly what the test is after.

This is also related to why grades aren't the best predictor either: the system is easy to game. In college admissions, this is part of the reason grades have been de-emphasized. Good grades in high school aren't by themselves a good predictor of doing well in college, but if you factor in participation in sports and other extracurriculars, you can get a rough estimate of a student's ability to see something through and to manage competing priorities. This can be gamed too, as Amy Chua demonstrates, but if you can successfully game this system, it means you are probably smart and likely to be wealthy, which is something colleges want anyway.