Teacher testing shows performance gaps

Advocate staff photo by JOHN McCUSKER -- Katina Williams teaches her kindergarten class at Arthur Ashe Charter School in New Orleans Friday, September 6, 2013. Louisiana has a new system for evaluating teachers that has two main components, one based on classroom observations, the other -- more controversially -- on test scores.

Louisiana’s new system for evaluating teachers has two main components, one based on classroom observations, the other — more controversially — on test scores. And with the first release of data this week on how teachers are faring in this new system, one thing became obvious: the test scores tend to judge more harshly.

A far greater proportion of teachers this year earned “proficient” or “highly effective” ratings on the more subjective side of the evaluation than on the test-based component, each of which counts for 50 percent of an overall score. Statewide, the figures were 90 percent versus just 52 percent, a pattern that showed up in most school districts across the New Orleans metro area.

For policy makers, it’s a gap that drives home the challenges they face in implementing a system that was designed by the state Department of Education but must be carried out by local districts, not all of which share the department’s zeal for the test-based accountability movement.

The subjective observations offer local principals and other administrators a chance to counterbalance the unforgiving weight of test scores, and it appears many are doing just that, perhaps more than state officials feel is warranted.

State Superintendent John White took to Twitter a day after releasing the initial results to tout high-performing schools that also hold teachers to a high standard.

He pointed out, for instance, that four out of five schools run by the charter network FirstLine in New Orleans fell in the top 10 percent of Louisiana schools in terms of improving test scores, yet ranked fewer than 10 percent of their teachers highly effective.

“Amazing results,” he wrote.

For those who already feel the state’s new evaluation system, known as Compass, is flawed, the gap between observations and test scores also brought cause for concern, but for different reasons. Steve Monaghan, head of the Louisiana Federation of Teachers, argued that the state is sending a message to local administrators: evaluate your teachers more harshly.

“There is going to be either an explicit or implicit message for administrators that their evaluations are off because the machine has indicated differently,” Monaghan said.

He and other union leaders in Louisiana and around the country have expressed deep reservations about test scores playing any role in evaluations that have an impact on teachers’ job security, since flaws in how the state collects or analyzes data could have devastating consequences for individuals.

However the data should be interpreted, the divide between test scores and more subjective measures was stark.

Broadly speaking, the new evaluations combine one ranking based on observations and another on student achievement. For most teachers, the student achievement half is somewhat fuzzy. How do you gauge how far a student has come in music class or gym? In those subjects, schools or districts set their own goals at the beginning of the year and then measure progress toward achieving them.

But for teachers in subjects like reading, math, science and social studies, the state collects what are known as value-added data, measuring how quickly students assigned to a given teacher improve their test scores in the space of a year.

That makes any comparison of observations with value-added data somewhat problematic, since only 30 percent of teachers have value-added scores while every teacher is observed. But the gap between the two components was consistently wide.

In St. Tammany Parish, for instance, 91 percent of teachers earned a proficient or highly effective rating from classroom observations, rather than “Ineffective” or “Emerging Effective.” But on the student achievement half, looking at only the teachers in the parish who had a value-added score, the percentage ranking that well came to 54 percent.

Among teachers in St. Tammany who fall outside of value-added subjects — those who have “student learning targets” set by the parish to gauge student achievement — about 93 percent earned a proficient or highly effective ranking on the student achievement portion, suggesting again that when local administrators set the bar, they tend to be more generous.

Cheryl Arabie, an assistant superintendent for instruction in St. Tammany, argued that the district’s teachers deserve the high marks, given how well students in the parish have done on standardized tests. This year, about 81 percent of students in St. Tammany scored on grade level or better on their exams, up about a percentage point from the year before.

Still, the data raised concerns even among officials who argue the new evaluation system is a broad improvement over how evaluations worked in the past, when almost every teacher received a “satisfactory” rating.

Rayne Martin, who helped design the new evaluation system, put out a report for an advocacy group called Stand for Children, arguing that district results “generally correspond with the improvement of students in those districts.”

But she also noted that “with 80 percent of educators falling in ‘highly effective’ and ‘effective proficient,’ the top two evaluation categories, there is still a lot of work to do to ensure that Compass is implemented so that it more accurately aligns with the academic progress that students are making in the classroom.”

Even among those who support lifting the bar for teachers, however, there are some who argue statewide teacher evaluations are the wrong tool. Evaluation systems can be top-notch, the thinking goes, but if school leaders aren’t willing or able to apply them rigorously, they won’t have much of an impact.

“If we had a problem in the past where principals felt for whatever reason that all their teachers were effective, the state mandate isn’t going to change that perception,” said Kathleen Porter-Magee, a fellow at the conservative Thomas B. Fordham Institute. “We get exactly the same results now but with a more complicated, top-down system.”

Some of the districts that proved an exception to the rule are those that were already winning plaudits from state officials for lifting test results.

The state-run Recovery School District in New Orleans, made up almost entirely of non-union charter schools that share the “high expectations” ethos of the national charter movement, was one of them. In the RSD, 73 percent of teachers ranked proficient or highly effective on observations, compared with 72 percent based on value-added data. St. Bernard Parish had roughly similar percentages as well, at 83 percent and 81 percent respectively.