Should student evaluations be used in tenure and promotion? No, says arbitrator
Arbitrator: Student evaluations cannot be used by management to assess the quality or teaching effectiveness of university faculty members
An arbitrator has held that, while student evaluations are easy for a university to administer and “have the air of objectivity”, their ability to measure the quality of an instructor’s teaching is “imperfect at best and biased and unreliable at worst.” They cannot, therefore, be used to assess the quality or teaching effectiveness of university faculty members.
How management may use student evaluations was a point of dispute in collective agreement negotiations between the Ryerson University Faculty Association and Ryerson University. When the parties could not reach an agreement, they referred the issue to interest arbitration.
The arbitrator’s decision
At the outset, the arbitrator noted that “demonstrated evidence of high quality in teaching is an essential requirement for tenure and promotion.” A high standard of justice, fairness and due process was clearly required, given the common use of student evaluations in those decisions and the need to make tenure and promotion decisions on the best possible evidence. “The evaluation of teaching effectiveness for purposes of tenure and promotion is so important – to both the faculty member and the University – that it has to be done right.”
The arbitrator accepted uncontested expert evidence that the results of student evaluations are skewed by a long list of factors, including personal characteristics (such as race, gender, accent, age and attractiveness) and course characteristics (such as class size, quantitative content, and the use of traditional teaching methodologies versus innovative pedagogy). Indeed, the evidence indicated that student evaluations are most directly influenced by the student’s perception of the attractiveness of the instructor and the student’s expectation of a particular grade in the course.
The expert evidence also showed that the unreliability of student evaluations of teaching is compounded when results are reduced to averages and then compared with the averages of other faculty members, or across departments, faculties or the entire University. The arbitrator held that the evidence was “clear, cogent and compelling” that averages establish nothing relevant or useful about teaching effectiveness. Rather, he held that the use of averages is fundamentally and irreparably flawed and that the only relevant metric is the frequency distribution.
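The distinction between an average and a frequency distribution can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and none of it comes from the decision: the five-point scale, the two course sections, and the response data are invented, and `frequency_distribution` is a hypothetical helper. It shows two sets of responses that share the same average while describing very different patterns of student feedback.

```python
from collections import Counter
from statistics import mean

SCALE = (1, 2, 3, 4, 5)  # hypothetical five-point rating scale


def frequency_distribution(ratings):
    """Count how many responses fall on each point of the scale."""
    counts = Counter(ratings)
    return {point: counts.get(point, 0) for point in SCALE}


# Invented responses for two hypothetical course sections.
section_a = [3] * 10           # every student chose the midpoint
section_b = [1] * 5 + [5] * 5  # students split between the two extremes

mean_a = mean(section_a)
mean_b = mean(section_b)
dist_a = frequency_distribution(section_a)
dist_b = frequency_distribution(section_b)

# The two averages are identical, but the distributions are not.
print("Averages:", mean_a, mean_b)
print("Section A distribution:", dist_a)
print("Section B distribution:", dist_b)
```

In this invented example the averages cannot tell the two sections apart, whereas the distributions show uniform responses in one and sharply divided responses in the other.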
The arbitrator concluded that, while the use of student evaluations may be ubiquitous in Canadian universities, that could not serve as a justification for relying on a flawed tool, and therefore student evaluations of teaching can no longer be used to assess the quality or teaching effectiveness of university faculty members for tenure and promotion purposes.
Among other remedies, the arbitrator ordered that:
- The parties attempt to amend the collective agreement themselves to ensure that student evaluation results are not used in measuring teaching effectiveness for promotion or tenure
- The numerical rating system currently in use be replaced with an alphabetical one
- Averages of student evaluations no longer be created or relied on
- The University administration and Association ensure that administrators and committee members charged with evaluating faculty in personnel decisions are educated in the systemic biases inherent in student evaluations.