Prior to starting my first semester as a teacher, I sat down to write the green sheets for my two courses: AP Calculus AB and Algebra 1. As a freshly minted ed school grad, filled with progressive idealism, I chose to use my own grading system and associated rating scale. Designing and implementing it caused more difficulties than I expected, but it also expanded my horizons as I learned more about the history of, and trends in, the purpose, structure, and implementation of grading systems nationally and globally. I still struggle with dimensions of it, even as of this writing. For this post, I focus on establishing cut scores for use in my rating scale and grade reporting scale, the former providing formative feedback to students and the latter providing summative feedback for students, parents, and others.

**Investigating Cut Scores for Algebra 1**

Deciding where to draw the lines between rating scale levels consumed more than a fair amount of time. I did not want to follow the traditional ten-point scale used by my district, where an A is 90 or above, a B from 80-89, a C from 70-79, and so on. I found it too punitive, especially if averaged scores included zeros. For my algebra 1 classes, I soon found myself exploring the cut-points used for the 2010 Algebra 1 California Standards Tests (“CSTs”), part of the California Standardized Testing and Reporting (“STAR”) program, as shown in Table 1 below. [Endnote 1]

**Table 1:** 2010 Algebra 1 CST Score Summary

I must admit, the algebra 1 cut scores surprised me, initially. The fact that students were deemed “Advanced” if they achieved a score of 80% struck me as too low. More surprisingly, students were deemed proficient with a score of 58%. Suddenly, the No Child Left Behind (“NCLB”) requirement that all students reach proficiency did not seem so ridiculous. While I still found its mandates ill-conceived and net destructive to public education, it seemed reasonable that a larger proportion of students might attain proficiency with these cut scores. Counterbalancing that thought, the number of students reaching proficient, or above, on the 2010 algebra 1 CST lingered at just over 30%, very far from all students, or even a simple majority. Furthermore, while the 2011 algebra 1 CST data rose 1 percentage point from 2010, as the following graphic shows, it still showed over two-thirds of students scoring below proficient. [Endnote 2]

**Table 2:** 2011 Algebra 1 CST Score Summary

The 2011 data also show the downward shift in the distribution of performance levels attained by students in grades seven through eleven. It might seem counterintuitive to many that as grade level increases, the mean score for each successive grade level declines. However, considering that students in the higher grades have likely taken the same course two and sometimes three times, they may have become so disenchanted with the subject, and perhaps school, that they put little effort into learning the material in or out of class, irrespective of teacher or support staff interventions. They may also truly be impeded in improving their performance for a variety of cognitive and/or developmental reasons, which could be intertwined with their disposition towards the subject, or school. Unraveling these factors is very difficult for any one student, much less reversing their impact for all.

**Cut Scores from Standards-Aligned Algebra 1 Assessments**

After my initial review of algebra 1 CST cut scores and distributions, I was highly inclined to use them as the basis for my algebra 1 course, especially since I intended to assess students based on the California Algebra 1 Content Standards. This led me to investigate how the algebra 1 CST was created, and more importantly, how its cut scores associated with specific performance levels were determined.

To this end, the California Department of Education (“CDE”) Assessment and Accountability Division provides a fairly detailed, but abstruse, description of the score setting process in its technical report titled “California Standards Tests Technical Report Spring 2010 Administration.” An additional report on the alignment of CSTs to the CA content standards, contracted out by the CDE, details parameters such as the distribution of questions across performance levels (see the table below), as well as the depth-of-knowledge (“DOK”) and range-of-knowledge (“ROK”) for each standard. The report offers insight into the complexity involved in aligning the CSTs to the content standards.

**Table 3:** Distribution of CST Test Questions by Performance Level

I have some lingering questions from Table 3. How are students designated as Below Basic, or Far Below Basic, if only two questions fall into the Below Basic performance level, and none in the Far Below Basic level? For a Below Basic rating using raw scores, students must correctly answer nineteen (19) to twenty-seven (27) questions. From my superficial understanding of the data in Table 3, it seems students should be classified as Basic instead. The distribution of questions by performance level, the cumulative number of questions at any performance level, and the raw scores per performance level are contrasted in the following table. Some description of how this distribution relates to raw scores must exist; however, I believe I missed that detail. If anyone is more familiar with the nuances of this, please let me know.

**Table 4:** Contrasting Question Distribution by Performance Level with Raw Scores

Lastly, another report, from the same contractor, asserts that “The performance level descriptors developed here are empirically based descriptions of what students at each performance level do know and what they are able to do.”

After reading each of these reports, with a caveat related to my lingering questions above, it was clear to me that the Algebra 1 CST, as a standards-aligned, criterion-aligned exam, serves as a rigorous instrument to assess student understanding.

**Establishing My Cut Scores**

Setting NCLB expectations aside, but keeping standards-alignment and criterion-alignment in mind, I incorporated aspects of the CST percent-correct cut scores into my rating scale and modified them selectively. The result, shown below, is the standards-based rating scale I use for all assignments and assessments.

**Table 5:** Algebra 1 Rating Scale

Standards-based grading appealed to my sense of fairness, especially when accompanied by students’ ability to retake assessments or resubmit assignments to raise their scores. I did not want poor performance on any one, or a few, assessments or assignments to seal a student’s fate in the course. I also hoped they might strive to improve their standing if the option existed.

For grade reporting, the district requires that I translate these numeric ratings, with associated performance level descriptors, into end of marking period grades ranging from F to A+. Hence, I convert the weighted average of assessments and assignments per marking period using my 1 to 5 scale to letter grades of F through A, respectively. The cut points for each letter grade follow: 4.25 (A-), 3.5 (B-), 2.75 (C-), and 2 (D-).
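To make the conversion concrete, here is a minimal sketch of how it could be computed. It is not the gradebook I actually use. The function names and the example weights are hypothetical, and I assume that an average exactly at a cut point earns the higher letter. Only the four cut points listed above are modeled; plus and minus subdivisions within each letter are left out because their boundaries are not stated here.

```python
from bisect import bisect_right

# Lower bounds of the D, C, B, and A bands on the 1-to-5 rating scale,
# taken from the cut points listed above (2, 2.75, 3.5, 4.25).
CUT_POINTS = [2.0, 2.75, 3.5, 4.25]
LETTERS = ["F", "D", "C", "B", "A"]


def weighted_average(ratings, weights):
    """Weighted mean of 1-to-5 ratings; the weights are hypothetical."""
    if len(ratings) != len(weights):
        raise ValueError("ratings and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(r * w for r, w in zip(ratings, weights)) / total_weight


def letter_grade(average):
    """Map a weighted average to a base letter grade.

    Assumption: an average exactly on a cut point earns the higher letter.
    """
    # bisect_right counts how many cut points are <= average,
    # which is exactly the index of the matching letter.
    return LETTERS[bisect_right(CUT_POINTS, average)]


if __name__ == "__main__":
    # Hypothetical student: assessments rated 4 and 3, one assignment rated 5,
    # with assessments weighted twice as heavily as the assignment.
    average = weighted_average([4, 3, 5], [2, 2, 1])
    print(average, letter_grade(average))
```

Keeping the cut points in one sorted list means a change to any boundary is a single edit, and the lookup stays the same.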

**Preliminary First Semester Grade Distribution**

Preliminary aggregate results for my 96 algebra 1 students follow. Interestingly, their grade distribution approaches that of the algebra 1 CST results. To what extent that is coincidence, or follows since one is the basis for the other, is beyond my faculties at the moment; it is fodder for a follow-up post.

**Table 6:** Preliminary End of Semester Algebra 1 Grades

**Conclusion**

While the preliminary grades are curiously similar to the CST scores, I am nonetheless left with many unanswered questions, some of which are in endnote 2 below. Should more students receive A grades, even though I used a highly tolerant standards-based rating scale? Should the distribution of students specifically matter? What about students who scored higher on assessments versus assignments? Should a lower average score on assignments lower a student’s higher average assessment score? Should my assessments be revised to allow more students to score higher? And many, many, more…

**Endnotes:**

[1] Data in the table were taken from California Standards Tests Technical Report Spring 2010 Administration, specifically Table 7.2 Percentage of Examinees in Performance Levels for CSTs and Table 8.D.66 New Conversions for General Mathematics and Algebra I.

[2] A colleague believes the difficulty of the algebra 1 CST is the primary reason for students’ low scores, on average, which I find plausible, especially if the test is a rigorous, standards-based assessment. At the same time, what is needed to improve the attainment of students on this assessment? Is it realistic to believe that all students want to reach a proficient level of performance? Whose needs are we truly meeting in this massive nationwide edu-scramble?

Hiya,

“It might seem counterintuitive to many that as grade level increases, the mean score for each successive grade level declines.”

I don’t think that’s counterintuitive at all. It’s entirely what you’d expect. The year a student takes algebra is directly correlated to his math ability. So seventh graders tagged for math are excellent at math and will do well. Eighth grade, now the normal entry point, has fewer excellent students because they all took it in seventh grade. By ninth grade, almost all students tested took algebra the year before–or they are very, very weak in math. By tenth grade and beyond, it’s everyone. So it’s not at all counterintuitive. Math tracks by age of entry.

Incidentally, there was a wholesale improvement in algebra test scores somewhere between 2004 and 2008 for 7th and 8th graders. It could be they made the test easier, but I doubt it. I wonder if that corresponds with when they started mandating that middle school math teachers had to have single subject credentials.

I am concerned that using the CST (which is rigorously evaluated) percentages as your own is a problem, given that you aren’t a psychometrician. However, my concern would be that your grades would be skewed high, when instead nearly half your class has a D or an F–which, of course, tracks to the CST. In both cases, I think it means the tests are too difficult. While this may not impact grades much, it concerns me that the tests are too hard for the students. This is my problem with the CST. Give a student a test with too many difficult questions, and he’s less likely to work the test and more likely to give up.

I hope you are giving the students the higher of the assessment grade or the overall grade, particularly with the Ds and Fs. If a student passed all your tests, he should not fail the class, and if his or her test average is a C, I would not advise giving him a D.

More generally, though, I think you are overthinking this. I love analyzing performance, but so far as grades go, I’ve decided that it’s pretty easy to place a kid based on his performance without all the rigorous data computation.

