Evaluation & Diagnosis

Before/After Metrics — Catching Catastrophic Forgetting

Overview: Evaluation & Diagnosis

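Always report delta metrics (the fine-tuned model's score minus the base model's score on the same benchmark), not just final scores. Evaluate general benchmarks (MMLU, HellaSwag) alongside task-specific ones to catch catastrophic forgetting, the silent killer of fine-tuned models: a model can improve on its target task while quietly losing general capabilities.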
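A before/after comparison can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed workflow: the helper names (`metric_deltas`, `flag_forgetting`), the 0.02 drop threshold, and every score below are hypothetical placeholders, and it assumes you have already obtained per-benchmark scores for both the base and fine-tuned model from whatever evaluation tool you use.

```python
from typing import Dict, List


def metric_deltas(before: Dict[str, float], after: Dict[str, float]) -> Dict[str, float]:
    """Return after - before for every benchmark present in both runs."""
    shared = before.keys() & after.keys()
    return {name: after[name] - before[name] for name in sorted(shared)}


def flag_forgetting(
    deltas: Dict[str, float],
    general_benchmarks: List[str],
    max_drop: float = 0.02,  # hypothetical tolerance; choose one that suits your setup
) -> List[str]:
    """List the general benchmarks whose score fell by more than max_drop."""
    return [
        name
        for name in general_benchmarks
        if name in deltas and deltas[name] < -max_drop
    ]


# Placeholder scores for illustration only; these are not real results.
before = {"mmlu": 0.60, "hellaswag": 0.80, "task_accuracy": 0.50}
after = {"mmlu": 0.50, "hellaswag": 0.79, "task_accuracy": 0.90}

deltas = metric_deltas(before, after)
regressed = flag_forgetting(deltas, general_benchmarks=["mmlu", "hellaswag"])

for name, delta in deltas.items():
    print(f"{name}: {delta:+.3f}")
print("Possible forgetting on:", regressed)
```

With these placeholder numbers the task metric rises sharply while one general benchmark drops past the threshold, which is the pattern a final-score-only report would hide.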
