---
id: "concept-model-self-review-bias"
type: "concept"
source_timestamps: ["00:00:00"]
tags: ["evaluations", "model-behavior", "benchmarking"]
related: ["entity-claude-opus-4-7", "entity-chatgpt-5-4"]
definition: "The tendency of different LLMs to exhibit distinct biases (overselling vs. underselling) when evaluating their own outputs or the outputs of competing models."
sources: ["s12-opus-47"]
sourceVaultSlug: "s12-opus-47"
originDay: 12
---
# Model Self-Review Bias

## Definition

The tendency of different LLMs to exhibit distinct biases (overselling vs. underselling) when evaluating their own outputs or the outputs of competing models.

## Detail

Model Self-Review Bias describes the inherent psychological or alignment-driven skew when LLMs are asked to evaluate their own outputs or the outputs of competitors.

### Concrete Findings (Head-to-Head Testing)

| Model | Self-Grade | Behavior |
|-------|------------|----------|
| [[entity-claude-opus-4-7-d12|Claude Opus 4.7]] | 3.5 / 5 | **Oversells** itself — grades flawed outputs highly despite missing critical data |
| [[entity-chatgpt-5-4|ChatGPT 5.4]] | 3.1 / 5 | **Undersells** itself — grades own work harshly, surfaces own errors transparently |

Furthermore, **GPT-5.4 graded Opus 4.7's work much more strictly than Opus graded itself.**

## Why It Matters

This bias indicates that using LLMs as automated evaluators (the **LLM-as-a-judge** pattern) requires **careful calibration**, as the alignment training of the model heavily influences its leniency and transparency during self-reflection and peer review.

## Operator Implications

- Don't trust a single model's self-grading.
- Use cross-model peer review (see [[framework-hex-eval]] step 5).
- Calibrate any LLM-as-a-judge pipeline against human-graded ground truth.

## Cross-References

- Entity: [[entity-claude-opus-4-7-d12]], [[entity-chatgpt-5-4]]
- Quote: [[quote-oversell-undersell]]
- Framework: [[framework-hex-eval]]
