Guest Blog: Misusing Test Data

This post originally appeared in the CTQ Collaboratory, and I thought it deserved a wider audience, so I invited Scott Diamond to guest blog it here. Scott is a Science Teacher at The Learning Center at Linlee, in the Fayette County Public Schools, and Adjunct Faculty with University of Kentucky College of Medicine.

Many educational testing controversies come down to how test results are interpreted and used: educational assessments are being misused. Today’s NYT story, “Unlikely Allies Uniting to Fight School Changes,” is a good place to start reading. I prefer to ascribe this misuse to ignorance rather than to malice. Too often I see education decisions made with (1) little understanding of statistical power and (2) outcome measures mistaken for the outcomes themselves.

Statistical power dictates the need for large populations to identify small differences. Measures that provide useful data on populations in the tens of thousands often lack “statistical power” for smaller populations, and all of a single teacher’s classes together would be considered a very small population. This makes standardized tests designed to gather data from large populations useless for evaluating individual students’ educational next steps or individual teachers’ pay and retention. Using large study data to make decisions about small populations or individuals isn’t much better than consulting a Ouija board.
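To make the point about sample size concrete, here is a rough sketch in Python. It approximates the power of a two-sided, two-sample z-test, meaning the chance of detecting a real difference between two groups of equal size. The numbers are my own illustrative assumptions, not figures from any real assessment: a true difference of 0.1 standard deviations, a 5% significance level, and group sizes of 30, 150, and 10,000. The function name approx_power is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(n_per_group, effect_size=0.1, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the assumed true difference in means, measured in
    standard deviations. All default values are illustrative assumptions.
    """
    std_normal = NormalDist()
    # Critical value for a two-sided test at significance level alpha.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # How far the true effect shifts the test statistic, in standard errors.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of landing in either rejection tail when the effect is real.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical group sizes: one class, one teacher's classes, a large population.
for n in (30, 150, 10_000):
    print(f"n = {n:>6} per group: power ~ {approx_power(n):.2f}")
```

Under these assumptions, the chance of detecting the difference stays below one in five with 150 students per group, while with 10,000 per group it is close to certain. A real analysis would also need to account for measurement error and for students not being randomly assigned to teachers, which this sketch ignores.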

The second kind of error, mistaking outcome measures for outcomes, is like a doctor who mistakes symptoms for the underlying disease, focuses on fever instead of infection, treats victims of bacterial infections with only aspirin, and is shocked, just shocked, to see those patients die. It is malpractice.

This kind of error still occurs in medicine and health, recently in dietary recommendations. Researchers analyzing data from the Framingham study noticed that diets high in saturated fats were associated with “worse” measures for HDL, LDL, etc. But they didn’t notice that there was no association with actual deaths! They were so focused on outcome measures like HDL vs. LDL that they forgot that the actual desired outcome was being alive! Their low-fat diet recommendations may have led to higher sugar intake, and possibly worse overall health!

We educators commit this same error when we mistake standardized assessment results for education, and make raising test scores, and not authentic learning, the explicit focus of our efforts. And our policy makers are shocked, just shocked, to see test outcomes become uncoupled from learning. We are fortunate that our errors do not result in mortality, just in ignorance.

How else do we in education misuse testing, mistaking testing outcomes for educational outcomes? And how has that twisted teaching in schools?

Should we require educational administrators to have expertise in statistics? And by that, I mean more than an introductory class?

What would you recommend?
