Tuesday, December 11

Progress reports (may be*) statistically sound; not enough for Council, parents

After yesterday's excitement, I'm ready to take a more substantive look at the content of the City Council hearing on the progress reports. Jenny Medina at the Times has the best rundown of all of the papers, and for an overview of what James Liebman said and how the Council members responded, I would go to her report.

What stood out most to me was that once again the DOE managed to present a compelling initiative in a way that frustrated and angered elected officials and parents. A numbers-oriented friend of mine who shares my interest in education has told me that the progress reports are sound from her vantage point, and from mine, nothing I heard yesterday dissuaded me from thinking that they contain useful information parents ought to be able to find out. Liebman's presentation also helped me understand just how some top schools got low grades by showing how their students' progress, particularly that of their students who began the year in the lowest third, stacked up unfavorably next to other schools with similar students.

So I don't understand why Liebman had to undermine his own hard work by arguing that the grades are not based almost entirely on single assessments in math and English; by saying that his office had "consulted" with, among many others, an organization whose leader was in the room and later testified that their only conversation was not about the progress reports; and by giving Time Out From Testing the runaround on his way out the door.

I was also relieved to see that in disliking the progress reports, Insideschools readers are more like typical New Yorkers than the Quinnipiac poll would have us think; Council member after Council member commented that their constituents have told them that poor grades are unfairly stigmatizing some good schools, some of which fear that their recent progress could be undercut. Liebman did say, as he has before, that he is open to tweaking the formula used to calculate the grades or even assigning schools multiple grades based on different criteria. But in my view, it's the presentation and the attitude behind it, not the formula, that need a major revision.

*Title updated to reflect an exchange in the comments about the statistical validity of the reports.


Leonie Haimson said...

There is nothing statistically sound about the grading system devised by Liebman. You can ask any psychometrician about this.

34-80% of the annual fluctuations in test scores at the school level are essentially random or due to one-time factors alone, unrelated to the amount of learning taking place. (see Kane and Staiger, "Promise and Pitfalls of Using Imprecise School Accountability Measures", 2002).

And yet 55% of the school grade is based just on one year's gains or losses in students' test scores. It is also the case that the state tests were not designed or validated to make the year-to-year comparisons for which they are being used.

There are also profound problems with the peer groups as constructed, the other schools to which each school was compared to determine its grade. Some schools are neighborhood zoned schools and yet are being compared to highly selective schools; some schools are vastly overcrowded with huge class sizes while others have far better conditions. None of this is properly controlled for in the system used.

There are many other problems with the grading system, leading to unreliable results, including the fact that 100 schools in good standing w/ the state and/or federal government got failing grades, while 54% of SURR and SINI schools got As or Bs.

Philissa said...

I posted this morning before heading off to visit a school and as soon as I got into the subway I thought of all of the flaws in what I wrote. I'm surprised it took you a whole 45 minutes to respond!

I agree that the fact that the progress reports draw on only one year of data is a big problem and it's alarming that the DOE doesn't look like it's planning to incorporate a longer sample into the formula in the future. It's interesting that you note that the state tests are not designed for this kind of comparison. Do we have any idea what NYSED thinks about the progress reports?

The peer groups and divergence from the SINI/SURR lists don't bother me as much. As Liebman has made clear, the progress reports are supposed to give credit for a different kind of achievement, and I think that's a good step.

So I don't mind that this system shows off the weirdnesses of the state and NCLB accountability systems. But having a multitude of accountability measures would be a lot more interesting if each of them didn't come with its own high stakes. Thanks for keeping me on my toes.