Using Performance Rating Scales for Assessment

Assessment Questions the Data Might Answer

(Adapted from conference presentation materials, Barbara Walvoord, Ph.D., University of Notre Dame)

Here are some suggestions about the kinds of questions that might be answered by a collection of assignments, performance rating scales, and student scores collected over time. In each case, the question is followed by one or more hypothetical statements that might answer it.

Example #1: Documenting Classroom Assessment

Who Needs to Know What, For What?

At this most elemental level, we want to document for ourselves and for our accrediting agency that we have an assessment program in place and that the assessment program is meeting the following criteria:

  1. Assessment is being conducted in classrooms.

  2. Assessment in classrooms is connected to learning goals.

  3. Assessment instruments (tests and assignments) are measuring those learning goals.

  4. Criteria are explicitly stated.

  5. Students' work is being assessed against those criteria.

  6. Assessment results are fed back into student learning and into teaching methods.

Examination of Assignments and Performance Rating Scales Results in This Finding:

From a random sample of 20% of the courses in the program, we collected the following information:

  1. Professor's statement of learning goals for the course
  2. Copies of major tests and assignments that assessed student achievement of those goals
  3. The performance rating scale that makes explicit what criteria are used to assess students' performance on the tests/assignments
  4. A statement by the professor of how assessment results were fed back into student learning and teaching practice, along with available evidence of the feedback process (e.g., student revisions, student evaluations, a new syllabus, additional handouts)

Example #2: Finding Common Expectations

Who Needs to Know What, For What?

The General Education Committee needs to know what is being taught and expected of students in General Education courses. What are common expectations? The committee wants to make recommendations to Gen Ed faculty and to the Faculty Senate about enhancing the cohesiveness of the general education experience. Further, the college would like to be able to describe for external audiences the skills taught in Gen Ed.

Examination of Assignments and Performance Rating Scales Results in This Finding:

Examination of assignments and performance rating scales from X courses shows the following common expectations (one way such a tally might be computed is sketched after the list):

  1. Problem-solving: 46% of the assignments

  2. Generalizing from data: 42% of the assignments

  3. Questioning assumptions: 39% of the assignments

  4. Analyzing a text: 79% of the assignments
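
A committee might arrive at percentages like these by recording, for each collected assignment, which expectations its performance rating scale makes explicit, and then counting how many assignments call for each one. The following is only an illustrative sketch; the course names and skill tags are hypothetical.

    from collections import Counter

    # Hypothetical records: each collected assignment is tagged with the
    # expectations its performance rating scale makes explicit.
    assignments = [
        {"course": "ENG 101", "skills": {"analyzing a text", "questioning assumptions"}},
        {"course": "BIO 110", "skills": {"generalizing from data", "problem-solving"}},
        {"course": "HIS 120", "skills": {"analyzing a text", "generalizing from data"}},
        # ... one record per assignment in the sample
    ]

    counts = Counter(skill for a in assignments for skill in a["skills"])
    total = len(assignments)

    # Report the share of assignments that require each expectation.
    for skill, n in counts.most_common():
        print(f"{skill}: {n / total:.0%} of the assignments")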

Comments

Examiners will probably have to interpret differences in language across the performance rating scales and assignments. For example, an assignment may call for analysis of a text without using that term. Divergence of language will be reduced if professors develop their performance rating scales in a workshop or collaborative setting, and/or if they have common models or mission statements to work from.

Example #3: Overview of Where Various Skills Are Taught & Assessed

Who Needs to Know What, For What?

A department wants to ensure that skills taught and assessed at lower levels build consistently toward skills required for upper-level work. Further, the department wants to describe to its prospective students, and to those who employ its students, what skills the students have been taught.

Examination of Assignments and Performance Rating Scales Results in This Finding:

Examination of assignments and performance rating scales from all the department’s senior courses and from a selection of the department's lower-level courses indicates that the following skills are commonly required:

  1. Upper level: X, Y, and Z

  2. Lower level: A, B, and X

Comments

A finding such as this leads the department to discuss whether skills A and B are appropriate preparation for X, Y, and Z, whether X is required at the same level of skill in both upper- and lower-level courses, and so on.

Example #4: What is Required of Graduates?

Who Needs to Know What, For What?

A department wants to know what is required of its graduates, both for its own use and for employers and prospective students.

Examination of Assignments and Performance Rating Scales Results in This Finding:

An examination of assignments and performance rating scales in the three courses that, among them, enroll all senior students indicates that all seniors are required to demonstrate the following skills for an assignment grade of "C": X, Y, and Z.

Comments

A finding such as this may encourage the department to teach more deliberately toward those skills at the lower levels. Or the department may decide that all teachers in those three courses will, through one assignment or another, assess the three skills. Or it may make it a policy that a student who does not perform at least at the "C" level on those three skills will not pass the course, and so on.

With the next two examples, we move from cases that require a committee only to examine the assignments and performance rating scales to cases where the committee might also examine student scores collected over time. The professors would have to keep these scores and submit them to the committee in aggregate form, or with students' privacy otherwise protected. Remember that the scores are not the same as the assignment grades: it is possible to score a piece of student writing only for the traits that the committee wants to look at.

Example #5: Strengths and Weaknesses in Student Performance

Who Needs to Know What, For What?

Let us suppose that the department in Example #4, having identified the skills being assessed in its three senior courses, now decides it wants to track, over time, how well its students do on those skills. It need not institute an external test for this, as long as the teachers of the three senior courses submit student scores over time. Looking at these scores, the teachers and the committee together might try to figure out how to raise them, whether by better preparation before students' senior year, by different teaching strategies in the senior courses, or both.

Examination of Assignments and Performance Rating Scales Results in This Finding:

After examining the assignments, the performance rating scales, and student scores from the three senior courses over three years, the committee finds that:

  1. Students consistently score lower on X than on Y and Z

  2. Student scores on X and Y have remained fairly constant, while student scores on Z have risen over the three years (quantitative results and statistical computations might be given here; a minimal sketch of such a computation follows this list)
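
A minimal sketch of the kind of computation that could sit behind such a finding, assuming each senior-course teacher submits a mean score per trait per year. All of the numbers and year labels below are invented for illustration.

    # Hypothetical mean scores on the 5-point scales, per trait per year,
    # as submitted by the teachers of the three senior courses.
    scores_by_year = {
        "Year 1": {"X": 2.9, "Y": 3.6, "Z": 3.1},
        "Year 2": {"X": 3.0, "Y": 3.5, "Z": 3.5},
        "Year 3": {"X": 2.9, "Y": 3.6, "Z": 3.9},
    }

    years = list(scores_by_year)
    for trait in ("X", "Y", "Z"):
        first = scores_by_year[years[0]][trait]
        last = scores_by_year[years[-1]][trait]
        # Compare the first and last years to see which traits have risen.
        print(f"Trait {trait}: {first:.1f} in {years[0]}, {last:.1f} in {years[-1]} "
              f"(change {last - first:+.1f})")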

Comments

An examination like this might lead the department to ask why scores have risen on Z, or why students score lower on X than on the others. The department might move to help students more directly with X in the lower courses, or the teachers of the senior courses might agree to work harder or differently with X. Then, analysis of further scores over time could determine whether students were doing better.

Example #6: Comparing and Tracking Student Performance

Who Needs to Know What, For What?

The college has been working hard to improve instruction in critical thinking in Gen Ed courses. It wants to know whether these efforts have produced any changes in students' performance on assignments that assess critical thinking.

Examination of Assignments and Performance Rating Scales Results in This Finding:

In a random sample of Gen Ed course assignments, performance rating scales, and student scores in 1994 and in 1996, the committee finds that students' scores in critical thinking, as defined in each discipline through the performance rating scale, have risen significantly in 48% of the courses, remained the same in 27%, and fallen in 25%.

Comments

This kind of comparison is possible because critical thinking, though defined differently in each course, is labeled as such by the professors. That is, the chemistry teacher is scoring students on one or more 5-point scales for critical thinking as she defines it; the history teacher is scoring students on one or more 5-point scales for critical thinking as he defines it. So what you're comparing here is students' standing on those 5-point scales. For example, let's say the chemistry teacher is asking students for research proposals that require the traits of hypothesis development and experimental design. The chemistry teacher defines those as "critical thinking." She constructs a 5-point scale for each trait. She (and outside raters, if needed) scores student work on those 5-point scales, and she turns in those aggregated scores to the committee along with the assignment and the performance rating scale. The history teacher is asking students for essays that require the traits of evidence and counter-argument. He defines those as "critical thinking." He constructs a 5-point scale for each trait. He (and outside raters, if needed) scores student work on those 5-point scales, and he turns in that material just as the chemistry teacher did. Now the committee sees a configuration something like this:

Figure 1. Mean performance rating score for all students who took the exam, on teacher-defined critical thinking traits (5=high; 1=low)

                  1994            1996
    Chemistry     3.20 (n=208)    3.70 (n=235)
    History       2.80 (n=42)     4.10 (n=38)
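
A minimal sketch of how mean scores like those in Figure 1 might be computed from the raw 5-point trait scores each teacher submits, assuming each submission is simply a set of per-student trait scores with no student names attached. The numbers below are invented and are not taken from Figure 1.

    from statistics import mean

    # Hypothetical submissions: for each course and year, a list of per-student
    # scores on each teacher-defined critical-thinking trait. Index i in each
    # list belongs to the same (anonymous) student.
    submissions = {
        ("Chemistry", 1994): {"hypothesis development": [3, 4, 2, 3],
                              "experimental design":    [4, 3, 3, 3]},
        ("History",   1996): {"evidence":               [4, 4, 5, 4],
                              "counter-argument":       [4, 4, 3, 4]},
        # ... remaining course/year submissions
    }

    for (course, year), traits in submissions.items():
        # Average each student's trait scores, then average across students.
        per_student = [mean(scores) for scores in zip(*traits.values())]
        print(f"{course} {year}: mean score {mean(per_student):.2f} (n={len(per_student)})")

The committee would then compare these means across years, as in Figure 1, rather than comparing assignment grades.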

These are some examples of questions that might be answered by having a committee examine a collection of materials about the grading process. But the examination of this collected data assumes a long prior process of deciding that this is the best mode, helping faculty develop these materials, and collecting them in useful formats.