Monday, March 25, 2013

Evidence-based . . . or not

During the course of researching my post on the purported boy crisis in education (coming soon!) I came across a fascinating book by John Hattie, an educational researcher from New Zealand. The title is Visible Learning: A Synthesis of Over 800 Meta-Analyses Relating to Achievement (Routledge 2009). By "Visible Learning" Hattie seems to mean some combination of transparent pedagogy, feedback from teachers to students and students to teachers, and evidence-based innovation. But for me the real fascination of the book is the sheer amount of data underpinning Hattie's review of the 800 meta-analyses. Together, the 52,637 studies included in the book represent over 83 million students. That's a lot of data and a lot of "effects"–146,142, in fact–correlated to 138 variables affecting learning.

"Evidence-based" is a popular buzzword in education, as it is in many other spheres of research. But as Hattie points out, the vast number of studies about "what works" in education can be overwhelming to teachers and administrators. Perhaps more damagingly, the abundance of disparate and sometimes contradictory findings can lead to the justification of certain pet practices on the basis of one or two studies that may suffer from small sample size or design flaws. This is where the power of meta-analysis comes in. Meta-analysis overcomes or neutralizes potential distortions in the results of individual studies–caused by sample size, design or methodological problems–through the aggregation (and sophisticated statistical manipulation) of data from multiple studies. That's the theory anyway, but as with any theory, meta-analysis has its critics. In the case of this particular synthesis (or meta-meta-analysis), the most salient problem–one acknowledged by the author himself–is that it concerns itself solely with research measuring effects on "achievement," and achievement is invariably measured via some form of testing. But many educational policies, including choice of curricular materials, pedagogical approaches, and integration of technologies, are justified by appealing to these very measures. So the information presented by Hattie is extremely useful to anyone trying to evaluate such programs on their promulgators' own terms.

And the results are quite surprising. Contrary to expectation–mine, anyway–many popular "progressive" pedagogical approaches have low "effect sizes," whereas practices that have been mostly discredited and discarded turn out to have large effect sizes. For instance, the practice of problem-based learning, wherein a teacher acts as a facilitator while students work through "authentic" real-world problems, has an average effect size of 0.14; in Hattie's scheme, an effect of 0.40 or greater is considered one that rises above the baseline achievement due to teacher influence and students' year-to-year development. By contrast, the practice of Direct Instruction (which, in this review, means something quite specific) has an effect size of 0.59. In his discussion of the studies of Direct Instruction, Hattie usefully points out the ways in which constructivism–a theory of knowledge–has been confused with certain types of inquiry-based teaching strategies. He rightly stresses that constructivism is not a pedagogy, and that constructivist epistemological views are not incompatible with teacher-directed pedagogies. Certainly, my own experience with my daughters' elementary curriculum leads me to believe that more direct instruction, especially in math, would not be amiss. (See here for my take on the problems inherent in so-called constructivist approaches to math instruction.) And in fact, when it comes to math instruction, Hattie's data show that direct instruction methods have more positive effects on achievement (0.55) than other methods, such as technology-aided approaches (effect size 0.07).
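
For readers who like to see the comparison laid out, here is a minimal sketch in Python. It assumes nothing beyond the four figures quoted above and the 0.40 benchmark; the labels and the helper name `above_hinge` are my own illustration, not anything taken from Hattie's book.

```python
# Effect sizes as quoted in this post; the labels are informal shorthand.
HINGE_POINT = 0.40  # the benchmark described above

effect_sizes = {
    "problem-based learning": 0.14,
    "Direct Instruction": 0.59,
    "direct instruction in math": 0.55,
    "technology-aided math instruction": 0.07,
}


def above_hinge(effect, hinge=HINGE_POINT):
    """Return True when an effect size meets or exceeds the benchmark."""
    return effect >= hinge


# Order the practices from largest to smallest effect size.
ranked = sorted(effect_sizes.items(), key=lambda item: item[1], reverse=True)

for name, effect in ranked:
    label = "at or above" if above_hinge(effect) else "below"
    print(f"{name}: {effect:.2f} ({label} the {HINGE_POINT:.2f} benchmark)")
```

Only the two direct instruction figures clear the benchmark; the comparison says nothing about outcomes that achievement tests do not capture.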

Other results are less surprising to me, but still may be controversial or difficult for administrators to accept. The effect of homework, for instance, is low (0.29), as is that of extracurriculars (0.17). This latter figure may be of interest to parents and teachers in Ontario, where extracurriculars have been curtailed by teachers protesting government-imposed contracts. (See my post on the protests here.) But this is also an example of the weakness of this type of study: the effects of extracurricular programs in schools may not be measurable by tests of achievement, but that does not necessarily make them less worthwhile than programs that can be so measured. It might, however, explain why they are considered "extra" as opposed to part of the curriculum.

One other finding is worth pointing out: by Hattie's calculations the effect size of gender on achievement is a paltry 0.12. He writes:
The . . . question . . . is why we are so constantly immersed in debates about gender differences in achievement–they are just not there. The current synthesis shows that where differences are reported, they are minor indeed.
 This is a good question, and one which provides a perfect segue to my next post on the supposed "boy crisis" in education. Stay tuned.


  1. It seems like the sheer volume of data on standardized test scores makes it harder for people to admit that test scores might be a very inadequate (and even counterproductive) measure of what we mean by "well-educated." It's as if the idea of scores=achievement has its own inertia. You're right that you're just pointing out that these commentators are sometimes unpersuasive even on their own terms, but it's hard to see how we can ever talk about whether there are real gender disparities in education unless we first come up with some way to talk about what we want from education beyond just higher standardized test scores.

  2. I'm not sure standardized test scores were the only measure of achievement used in these studies; the book is actually a little vague on this point, but I suspect a variety of test results (including report cards) were included. But your point is valid: the idea that scores=achievement does have its own inertia, and meta-analyses such as this one certainly contribute to it. In a climate where the test results-achievement equation is rarely questioned, though, it can be helpful to use the data to point out contradictions and weaknesses in current policy. But I agree that we need to figure out an alternative way to "measure" education or to talk about what it is and what it should be. Sometimes I think the ethnographic approach used in books like Philip Jackson’s Life in Classrooms (an overlooked classic, in my opinion) would shed more light on what's wrong (or right) about contemporary schooling. But I really don't know the answer.

  3. There was an interesting comment over at kitchen table math about the alleged boy crisis. The commenter said that boys have always done better on standardized tests, while girls get better grades. Today it's fashionable to complain that boys are being graded unfairly; decades ago the complaint was that the tests were unfair to girls!

    I'm ready to throw out grades and standardized tests both. I do think there's a place for tests, if they're used to find out what the kids know, and followed up with re-teaching what the kids missed. Tests are hardly ever used for this, in my experience.

  4. FedUpMom: Thanks for your comment. It is interesting how the pendulum swings, isn't it? I do remember when the concern was that girls were being shortchanged by school, especially by co-ed classes in which teachers called on boys more frequently, and boys dominated discussions. (Actually, I think there's still evidence that boys get called on more.) As for the question of how boys fare on tests (versus grades), it's pretty complicated, as I'm finding out as I continue to research my "boy crisis" post. (For example, the "gender gap" tends to disappear when you approach the question through the lens of socio-economic class.) But I'm getting ahead of myself -- time to finish writing the damn post!
