The Great Reality Check Part 2: Acute Stroke
Read the results of our new user studies – up-to-date and transparent!
Purpose:
The aim of the study was to prospectively determine the performance of a common AI assistant in acute stroke, validated against the first-read reports of radiologists specialized in emergency radiology as well as imaging and clinical follow-up.
Patients, Material and Methods:
In 2025, 88 patients (age: 18 to 89 years, mean: 52 years, standard deviation: ± 25 years) who had been referred to ERS Emergency Radiology Schueller, a provider of teleradiology services, for cranial CT scans with suspected acute stroke were randomly and prospectively enrolled in the study over three consecutive weeks. The CT studies of these patients were evaluated by a common, commercially available AI assistant. Radiologists first reported the CT studies without knowledge of the AI results and, in a second step, compared their own findings with the AI findings. The gold standard consisted of the specialists' reports and clinical follow-up. Where the radiologists' and the AI assistant's findings differed, the CT studies received a second read within 30 minutes at the latest. The study was terminated prematurely because of the AI results.
Results:
For 14 of the 88 patients, the AI results could not be retrieved. In the remaining 74 patients, radiologists and clinical follow-up diagnosed 2 cases of acute ischemia (2.7%). The AI assistant yielded 2 true positive (TP), 58 false positive (FP), 0 false negative (FN), and 14 true negative (TN) results: sensitivity 1.0; specificity 0.194; positive predictive value (PPV) 0.033; negative predictive value (NPV) 1.0. In a second step, the results of the AI assistant were recalculated using the clinically and therapeutically relevant threshold of an ASPECTS score of 7 or lower. On this basis, the AI assistant yielded 2 TP, 32 FP, 0 FN, and 14 TN results: sensitivity 1.0; specificity 0.304; PPV 0.059; NPV 1.0.
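The reported metrics follow directly from the four confusion-matrix counts. The following is a minimal sketch in Python that recomputes them; the class name `ConfusionCounts` and the variable names are illustrative assumptions and not part of the study. The counts are taken exactly as reported above. Note that the counts for the ASPECTS-based evaluation add up to 48 rather than 74; the sketch uses them as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for the four confusion-matrix counts."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    def sensitivity(self) -> float:
        # Share of actual positives that were flagged: TP / (TP + FN)
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of actual negatives that were cleared: TN / (TN + FP)
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Share of positive calls that were correct: TP / (TP + FP)
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Share of negative calls that were correct: TN / (TN + FN)
        return self.tn / (self.tn + self.fn)


# Counts as reported in the Results section.
all_findings = ConfusionCounts(tp=2, fp=58, fn=0, tn=14)
aspects_le_7 = ConfusionCounts(tp=2, fp=32, fn=0, tn=14)

for label, c in (("all AI findings", all_findings),
                 ("ASPECTS score <= 7", aspects_le_7)):
    print(f"{label}: sensitivity={c.sensitivity():.3f} "
          f"specificity={c.specificity():.3f} "
          f"PPV={c.ppv():.3f} NPV={c.npv():.3f}")
```

With only 2 positive cases, sensitivity and NPV rest on very few observations, which is why specificity and PPV carry most of the information here.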
Discussion:
The AI assistant produced FP results in 58 of 74 patients (78%) and TP results in 2 of 74 patients (2.7%). This FP rate, together with the absence of FN results, suggests that the manufacturer accepts false positives in favor of sensitivity. The calculated specificity is considerably lower than the values stated in the AI manufacturer's publications. Evaluation based on the clinically and therapeutically relevant threshold of an ASPECTS score of 7 or lower yielded a similar result. Data collection was terminated prematurely, and the low number of cases is a clear limitation of our study. Based on the available data, we conclude that the reporting of CT scans for acute stroke, which is complex especially in the presence of pre-existing, non-acute brain lesions and particularly in older patients, should for the time being remain entirely in the hands of experienced radiologists.
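To give a sense of how much the small sample limits the specificity estimate, one option is a confidence interval for a proportion. The following sketch computes a 95% Wilson score interval for the specificity of 14 out of 72 non-stroke patients. This calculation is not part of the study; the choice of the Wilson method, the function name `wilson_interval`, and the value z = 1.96 are assumptions made for illustration.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (illustrative helper)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    # Center of the interval is pulled slightly toward 0.5.
    center = (p + z * z / (2 * n)) / denom
    # Half-width combines sampling variance and a small-sample correction.
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Counts as reported in the Results section (first evaluation).
tn, fp = 14, 58
low, high = wilson_interval(tn, tn + fp)
print(f"specificity {tn / (tn + fp):.3f}, "
      f"95% Wilson interval {low:.3f} to {high:.3f}")
```

The Wilson interval is used here instead of the simpler normal approximation because it behaves better for small samples and proportions far from 0.5.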

