Phase 4.4 Diagnosing Rubric Issues
Review a sample of graded responses
Look at 5–10 submissions across the score range: a few high scores, a few low scores, and a few in the middle.
Navigate quickly with keyboard shortcuts: ← → to switch questions, ↑ ↓ to switch students, and M for manual grading mode.
Look for patterns
- Students with correct reasoning are losing marks.
- Similar responses are receiving inconsistent partial credit.
- Students are earning full credit despite flawed logic.
- A single criterion accounts for most of the lost points.
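To check the last pattern, you can tally lost points per criterion across the sample you reviewed. The sketch below is a hypothetical offline calculation, not a product feature: the criterion names and point values are made-up illustration data, and you would enter the values you read from each score breakdown.

```python
from collections import defaultdict

# Hypothetical sample: one list per reviewed submission, holding
# (criterion, points earned, points possible). Illustrative values only.
sample = [
    [("Sets up equation", 4, 4), ("Shows steps", 3, 3), ("Final answer", 0, 3)],
    [("Sets up equation", 4, 4), ("Shows steps", 2, 3), ("Final answer", 0, 3)],
    [("Sets up equation", 3, 4), ("Shows steps", 3, 3), ("Final answer", 3, 3)],
]

def lost_points_share(submissions):
    """Return each criterion's share of all points lost in the sample."""
    lost = defaultdict(float)
    for breakdown in submissions:
        for criterion, earned, possible in breakdown:
            lost[criterion] += possible - earned
    total = sum(lost.values())
    if total == 0:
        return {}  # nothing was lost, so no criterion stands out
    return {name: points / total for name, points in lost.items()}

shares = lost_points_share(sample)
# List criteria from the largest share of lost points to the smallest.
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {share:.0%} of lost points")
```

In this made-up sample, "Final answer" accounts for three quarters of the lost points, which would make it the first criterion to inspect.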
Identify the problematic criterion
Open the score breakdown for affected submissions and check which criteria are misfiring. Match the pattern to one of the common problems below.
Phase 4.5 Common Rubric Problems
Rubric doesn't include the correct answer
Problem: The rubric does not specify the correct answer.
Solution: Embed the correct answer directly into each criterion. For specific values, use phrases like “must be exactly…”. For approximations, provide the acceptable range.
| Before | After |
|---|---|
| “Mode is correct” | “Mode is correctly calculated as exactly 15.75” |
| “Median is roughly correct” | “Median is correctly calculated as between 17.25 and 17.75” |
Rubric doesn't define how to award partial credit
Problem: Criteria are binary and don’t specify how to award partial credit.
Solution: Add explicit scoring tiers to each criterion. Define what earns full credit, what earns partial credit (and how much), and what earns zero.
| Before | After |
|---|---|
| “Student identifies the pH of the solution” | “Student identifies the pH of the solution as 4.5. Award 50% if the student identifies the solution as acidic but does not calculate the exact pH. Award 0% if no pH or acidity is mentioned.” |
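The tiers in the “After” criterion amount to a small lookup from what the grader observed to a fraction of the criterion's points. A minimal sketch of that idea follows; the tier labels and the 4-point criterion value are hypothetical, chosen only for illustration, and do not come from the product.

```python
# Fractions taken from the "After" criterion in the table above.
# The tier labels are hypothetical names chosen for this sketch.
PH_TIERS = {
    "exact_ph": 1.0,       # identifies the pH as 4.5
    "acidic_only": 0.5,    # says the solution is acidic, no exact pH
    "not_mentioned": 0.0,  # no pH or acidity mentioned
}

def criterion_score(tier: str, max_points: float) -> float:
    """Points awarded for one criterion, given the tier the response falls in."""
    if tier not in PH_TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    return PH_TIERS[tier] * max_points

# Score an assumed 4-point criterion at each tier.
for tier in PH_TIERS:
    print(tier, criterion_score(tier, 4))
```

Writing the tiers out this way makes gaps easy to spot: any response type that does not map to exactly one tier is a case the rubric wording still needs to cover.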
Rubric is too strict/specific
Problem: The rubric only recognizes one form of a correct answer. Students who express the same answer in a different form, or reach it by a different method, are marked wrong.
Solution: List all acceptable answer forms and solution methods explicitly in the criterion. Review a batch of graded responses to identify common alternatives students are using.
| Before | After |
|---|---|
| “The y-axis is labeled frequency” | “The y-axis has an appropriate label such as frequency, f, count, number of students, or equivalent” |
Rubric puts too much weight into one criterion
Problem: One criterion carries so much weight that the score is essentially determined by a single element of the response.
Solution: Redistribute weight toward process-oriented criteria. No single criterion should dominate the total score unless the question genuinely tests one thing.
| Before | After |
|---|---|
| “Arrives at the correct answer” (50%), “Sets up the equation correctly” (20%), “Shows algebraic steps” (30%) | “Arrives at the correct answer” (20%), “Sets up the equation correctly” (40%), “Shows algebraic steps” (40%) |
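A short worked calculation shows the effect of the reweighting. The weights below come from the table above; the student is a hypothetical case (sound setup and steps, wrong final answer), and the shortened criterion keys are names chosen for this sketch.

```python
# Weights from the Before/After table above, as fractions of the total score.
before = {"answer": 0.50, "setup": 0.20, "steps": 0.30}
after = {"answer": 0.20, "setup": 0.40, "steps": 0.40}

def total_score(weights, earned):
    """Weighted score; `earned` maps each criterion to a fraction in [0, 1]."""
    return sum(weights[name] * earned[name] for name in weights)

# Hypothetical student: correct setup and steps, but a slip in the final answer.
student = {"answer": 0.0, "setup": 1.0, "steps": 1.0}

print(f"before: {total_score(before, student):.0%}")
print(f"after:  {total_score(after, student):.0%}")
```

Under the original weights this student earns 50% of the marks; under the redistributed weights the same response earns 80%, so the score reflects the reasoning rather than a single slip.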
Rubric is misordered
Problem: Criteria don’t follow the sequence the answer is written in. For visual responses, this causes the AI to jump around and mismatch elements.
Solution: Reorder criteria to match the sequence the answer is written in, or the order a human grader would naturally follow when grading the response.
| Before | After |
|---|---|
| “Nucleus is labeled”, “Cell membrane is labeled”, “Mitochondria is labeled”, “Cytoplasm is labeled” | “Cell membrane is labeled”, “Cytoplasm is labeled”, “Nucleus is labeled”, “Mitochondria is labeled” |
Criteria overlap
Problem: Two criteria evaluate the same aspect of the response from different angles. A student gets penalized or rewarded twice for a single element.
Solution: Merge the overlapping criteria into one and combine their weights.
| Before | After |
|---|---|
| “Correctly applies Newton’s second law” (30%), “Uses F=ma to solve for acceleration” (30%) | “Correctly applies Newton’s second law (F=ma) to solve for acceleration” (60%) |
Criterion is too broad
Problem: A single criterion bundles multiple independent concepts.
Solution: Split the criterion into separate criteria, each testing one concept, and redistribute the weight.
| Before | After |
|---|---|
| “Student identifies the literary device and explains its effect on tone” (40%) | “Student identifies the literary device” (20%), “Student explains its effect on tone” (20%) |
If accuracy remains below 95% after two rounds of rubric refinement and regrading, contact support@uflo.io for assistance.
What to expect after rubric refinement
After refining a rubric and regrading, estimate accuracy by counting how many scores you disagree with out of the sample you reviewed:
- Below 95% accuracy: The rubric likely still has an issue. Contact us so we can diagnose the issue and provide a resolution.
- 95–99% accuracy: The rubric is working well. Address the remaining individual discrepancies during the student regrade request window, where students can flag specific issues for your review.
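The accuracy estimate described above is simple arithmetic: the share of reviewed scores you agree with. A minimal sketch follows, assuming hypothetical function names and made-up sample counts; only the 95% threshold comes from the guidance above.

```python
def estimated_accuracy(reviewed: int, disagreements: int) -> float:
    """Fraction of reviewed scores that you agree with."""
    if reviewed <= 0:
        raise ValueError("review at least one score")
    if not 0 <= disagreements <= reviewed:
        raise ValueError("disagreements must be between 0 and reviewed")
    return (reviewed - disagreements) / reviewed

def next_step(accuracy: float) -> str:
    """Map an accuracy estimate to the guidance above (95% threshold)."""
    if accuracy < 0.95:
        return "rubric likely still has an issue"
    return "rubric is working well"

# Hypothetical review: 40 scores checked, 3 disagreements.
acc = estimated_accuracy(reviewed=40, disagreements=3)
print(f"{acc:.1%}: {next_step(acc)}")
```

With a sample of 10, a single disagreement already gives 90%, so a larger sample produces a finer estimate near the 95% threshold.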
Phase 4.6 Regrading After Changes
After updating a rubric, regrade the affected question using one of these workflows:
- Manual Score Adjustments
- AI-Powered Regrades
Use a manual score adjustment when only a specific submission needs a score change. To adjust a score:
Adjust the Score
Add a comment explaining the change to create a transparent record with the student.
Next: After Grading
Release grades and handle student regrade requests.