Resources
Database Credentialed Access
CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays
CheXStruct is an automated pipeline that derives structured diagnostic reasoning steps from chest X-rays. CXReasonBench builds on this pipeline to evaluate whether models perform clinically grounded, multi-step reasoning rather than only producing a final diagnosis.
evaluation, chest x-ray, benchmark, structured chest x-ray qa, intermediate reasoning steps, structured reasoning, grounded reasoning, diagnostic reasoning, structured diagnostic pipeline
Published: Oct. 23, 2025. Version: 1.0.1