When Your Ruler Is Wrong: How Rescoring Improves Measurement in LEVANTE