The bitter pie charts of radiology AI efficacy
This is a comment on Artificial intelligence in radiology: 100 commercially available products and their scientific evidence, recently published in European Radiology by a Dutch group. This is great work and I can highly recommend reading it to anyone interested in radiology AI and the evolving market 👍🏼.
The authors compiled a list of 100 CE-marked, commercially available AI products and collected information on the supporting evidence for their efficacy as well as on their validation methods.
Overall, it turned out that only 36% of all products were validated in peer-reviewed publications. This is the number that was discussed most frequently on Twitter in the last few days. I agree it is concerning that only such a small fraction published anything at all. However, it would have been very interesting to know whether all the evidence in the 36 publications actually supported the usefulness of the respective product. The supplement contains a table that ranks each publication by level of efficacy, but the authors do not mention whether there was any competing evidence. It also did not become clear to me how exactly the 100 products were chosen. Is it a coincidence that there are exactly 100? Is this a selection?
One aspect that I find particularly puzzling is that a fair amount of approved AI software was only validated in one center and/or with only one scanner vendor (25%). Even worse, these numbers refer only to the 36% of AI tools with publications; I'm afraid the evidence for the remaining 64% may be even thinner. I am not an expert in regulatory approval processes, but it seems careless to me that it is even possible to obtain CE approval for a diagnostic product that was only validated in a single center… 🙄🤦🏻‍♂️
The paper also shows that radiology AI companies are founded, on average, almost four years before CE approval. I have written about this before on this blog, but it underlines that radiology AI software development resembles drug development in the pharmaceutical industry more than traditional software development. In traditional software, I can ship a product whenever I want. In healthcare, before selling a product, we have to analyze risk, define the intended use, obtain regulatory approval, show efficacy, and perform post-market surveillance. This takes years and shares quite a few similarities with drug development.
The authors conclude that the market is still in its infancy, and I fully agree. If radiology AI products continue to be developed with such poor validation methods, they are doomed to fail in clinical settings and we'll slip into an AI winter. On the other hand, there is a fair number of AI vendors who validate in more than 20 imaging centers, which makes me hopeful that AI will find its way into clinical routine in the coming years. To me, the article made clear that we as potential customers need to insist on seeing evidence before purchasing.