Monday, March 25, 2013

Evidence-based . . . or not

While researching my post on the purported boy crisis in education (coming soon!), I came across a fascinating book by John Hattie, an educational researcher from New Zealand. The title is Visible Learning: A Synthesis of Over 800 Meta-Analyses Relating to Achievement (Routledge 2009). By "Visible Learning" Hattie seems to mean some combination of transparent pedagogy, feedback from teachers to students and students to teachers, and evidence-based innovation. But for me the real fascination of the book is the sheer amount of data underpinning Hattie's review of the 800 meta-analyses. Together, the 52,637 studies included in the book represent over 83 million students. That's a lot of data and a lot of "effects"–146,142, in fact–correlated to 138 variables affecting learning.

"Evidence-based" is a popular buzzword in education, as it is in many other spheres of research. But as Hattie points out, the vast number of studies about "what works" in education can be overwhelming to teachers and administrators. Perhaps more damagingly, the abundance of disparate and sometimes contradictory findings can lead to the justification of certain pet practices on the basis of one or two studies that may suffer from small sample size or design flaws. This is where the power of of meta-analysis comes in. Meta-analysis overcomes or neutralizes potential distortions in the results of individual studies–caused by sample size, design or methodological problems–through the aggregation (and sophisticated statistical manipulation) of data from multiple studies. That's the theory anyway, but as with any theory, meta-analysis has its critics. In the case of this particular synthesis (or meta-meta-analysis), the most salient problem–one acknowledged by the author himself–is that it concerns itself solely with research measuring effects on "achievement," and achievement is invariably measured via some form of testing. But many educational policies, including choice of curricular materials, pedagogical approaches, and integration of technologies, are justified by appealing to these very measures. So the information presented by Hatti is extremely useful to anyone trying to evaluate such programs on their promulgators' own terms.

And the results are quite surprising. Contrary to expectation–mine, anyway–many popular "progressive" pedagogical approaches have low "effect sizes," whereas practices that have been largely discredited and discarded turn out to have large effect sizes. For instance, the practice of problem-based learning, wherein a teacher acts as a facilitator while students work through "authentic" real-world problems, has an average effect size of 0.14; in Hattie's scheme, an effect of 0.40 or greater is considered one that rises above the baseline achievement due to teacher influence and students' year-to-year development. By contrast, the practice of Direct Instruction (which, in this review, means something quite specific) has an effect size of 0.59. In his discussion of the studies of Direct Instruction, Hattie usefully points out the ways in which constructivism–a theory of knowledge–has been confused with certain types of inquiry-based teaching strategies. He rightly stresses that constructivism is not a pedagogy, and that constructivist epistemological views are not incompatible with teacher-directed pedagogies. Certainly, my own experience with my daughters' elementary curriculum leads me to believe that more direct instruction, especially in math, would not be amiss. (See here for my take on the problems inherent in so-called constructivist approaches to math instruction.) And in fact, when it comes to math instruction, Hattie's data show that direct instruction methods have more positive effects on achievement (0.55) than other methods, such as technology-aided approaches (effect size 0.07).
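For anyone wondering what these numbers actually measure: an effect size in this literature is a standardized mean difference, that is, the gap between a treatment group's average score and a comparison group's, expressed in standard-deviation units. Here is a rough Python illustration with made-up scores; it sketches the general idea rather than Hattie's exact calculation.

# Rough illustration with invented test scores: the effect size is the
# difference in group means divided by the pooled standard deviation.

from statistics import mean, stdev

treatment = [78, 85, 90, 72, 88, 81]   # hypothetical scores, new method
control   = [75, 80, 84, 70, 83, 78]   # hypothetical scores, usual teaching

def cohens_d(a, b):
    # Pooled standard deviation of the two groups
    na, nb = len(a), len(b)
    pooled_sd = (((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2)
                 / (na + nb - 2)) ** 0.5
    return (mean(a) - mean(b)) / pooled_sd

print(f"effect size d = {cohens_d(treatment, control):.2f}")

On this scale, a d of 0.14 means the average student in the treatment group scored only a hair above the average student in the comparison group, which is why Hattie treats anything below his 0.40 hinge point as unremarkable.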

Other results are less surprising to me, but still may be controversial or difficult for administrators to accept. The effect of homework, for instance, is low (0.29), as is that of extracurriculars (0.17). This latter figure may be of interest to parents and teachers in Ontario, where extracurriculars have been curtailed by teachers protesting government-imposed contracts. (See my post on the protests here.) But this is also an example of the weakness of this type of study: the effects of extracurricular programs in schools may not be measurable by tests of achievement, but that does not necessarily make them less worthwhile than programs that can be so measured. It might, however, explain why they are considered "extra" as opposed to part of the curriculum.

One other finding is worth pointing out: by Hattie's calculations the effect size of gender on achievement is a paltry 0.12. He writes:
The . . . question . . . is why we are so constantly immersed in debates about gender differences in achievement–they are just not there. The current synthesis shows that where differences are reported, they are minor indeed.
This is a good question, and one which provides a perfect segue to my next post on the supposed "boy crisis" in education. Stay tuned.