AI in Healthcare & Life Sciences: Good News, Bad News

Timothy Chou
Jul 23, 2024


A few years ago, Jeff Dean at Google shared the right side of the above figure. It helps explain why companies like Waymo continue to acquire more data to build more accurate autonomous driving algorithms and why all consumer-facing companies like OpenAI continue to look for larger data sets.

What is the state of the art of AI in medical imaging?

Recently, Pranav Rajpurkar and Matthew Lungren authored a survey paper, “The current and future state of AI interpretation of medical images.” The following is an excerpt.

In considering the widespread adoption of AI algorithms in radiology, a critical question arises — will they work for all patients? The models underlying specific AI applications are often not tested outside the setting in which they were trained, and even AI systems that receive FDA approval are rarely tested prospectively or in multiple clinical settings¹. Very few randomized controlled trials have shown the safety and effectiveness of existing AI algorithms in radiology and the lack of real-world evaluation of AI systems can pose a substantial risk to patients and clinicians².

Moreover, studies have shown that the performance of many radiologic models worsens when they are applied to patients who differ from those used for model development, a phenomenon known as data set shift³ ⁴ ⁵ ⁶ ⁷. In interpretation of medical images, data set shift can occur as a result of various factors such as differences in health care systems, patient populations and clinical practices. For instance, the performance of models for brain tumor segmentation and chest radiographic interpretation worsens when the models are validated on external data collected at hospitals other than those used for model training⁸ ⁹. In another example, a retrospective study showed the performance of a commercial AI model in detecting cervical spine fractures was worse in real-world practice than the performance reported to the FDA¹⁰.
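To make the idea of data set shift concrete, here is a minimal sketch on synthetic numbers. It is not taken from the survey paper and uses no real imaging data. It assumes a single numeric "reading" per patient, two classes, and a hypothetical second site whose readings carry a fixed calibration offset. All function names, the offset value and the class centres are invented for illustration.

```python
import random
from statistics import mean


def make_site(n_per_class, offset, rng):
    """Return (reading, label) pairs for one site.

    `offset` is an assumed site-specific calibration difference
    that is added to every reading at that site.
    """
    data = []
    for label, centre in ((0, 0.0), (1, 2.0)):
        for _ in range(n_per_class):
            data.append((rng.gauss(centre + offset, 1.0), label))
    return data


def fit_threshold(data):
    """Fit a one-feature classifier: the midpoint between the class means."""
    mean0 = mean(x for x, y in data if y == 0)
    mean1 = mean(x for x, y in data if y == 1)
    return (mean0 + mean1) / 2


def accuracy(data, threshold):
    """Fraction of readings classified correctly by the threshold rule."""
    correct = sum((x >= threshold) == bool(y) for x, y in data)
    return correct / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable

site_a_train = make_site(2000, 0.0, rng)  # development site
site_a_test = make_site(2000, 0.0, rng)   # held-out data, same site
site_b_test = make_site(2000, 1.5, rng)   # external site with shifted readings

threshold = fit_threshold(site_a_train)
acc_internal = accuracy(site_a_test, threshold)
acc_external = accuracy(site_b_test, threshold)

print(f"threshold learned at site A: {threshold:.2f}")
print(f"accuracy on site A hold-out: {acc_internal:.2f}")
print(f"accuracy on site B:          {acc_external:.2f}")
```

The model looks fine on held-out data from the site it was developed on, yet loses accuracy on the second site even though the underlying classes are equally separable there. Real imaging models and real shifts are far more complex, but the mechanism is the same.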

In a nutshell, most AI models have been trained and tested on a small sample of between 1,000 and 10,000 images, which is insufficient for real-world practice. After all, a car trained to drive in Palo Alto will likely not work in London.

The good news is that we already have all the data we need.

Good News

Petabytes of data are generated every day in hospitals, clinics and research labs around the world. Consider just the echo lab, focused only on pediatric cardiology: we estimate that each of the 500 children’s hospitals has on average 17 ultrasound machines. Each machine is used on average 4 times a day, which results in approximately 6,000,000 echo studies per year across the 500 children’s hospitals, or roughly 6,000,000 TB of imaging data. That’s several orders of magnitude more data than has been used to train any AI algorithm in children’s or adult medicine. That’s 100,000x the data available in the centralized NIH Imaging Data Commons.
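The estimate above can be reproduced as a back-of-envelope calculation. The sketch below uses only the figures quoted in the text. The number of operating days per year is not stated, so the values passed to `studies_per_year` are assumptions for illustration, and the function names are hypothetical.

```python
# Figures quoted in the text.
HOSPITALS = 500
MACHINES_PER_HOSPITAL = 17
STUDIES_PER_MACHINE_PER_DAY = 4
QUOTED_STUDIES_PER_YEAR = 6_000_000
QUOTED_DATA_TB = 6_000_000


def studies_per_day() -> int:
    """Echo studies per day across all hospitals."""
    return HOSPITALS * MACHINES_PER_HOSPITAL * STUDIES_PER_MACHINE_PER_DAY


def studies_per_year(operating_days: int) -> int:
    """Annual studies for an ASSUMED number of operating days per year."""
    return studies_per_day() * operating_days


def implied_operating_days() -> float:
    """Operating days per year implied by the quoted annual figure."""
    return QUOTED_STUDIES_PER_YEAR / studies_per_day()


def implied_tb_per_study() -> float:
    """Average data volume per study implied by the quoted totals."""
    return QUOTED_DATA_TB / QUOTED_STUDIES_PER_YEAR


print(f"studies per day:          {studies_per_day():,}")
print(f"per year at 250 days:     {studies_per_year(250):,}")  # assumed weekdays only
print(f"per year at 365 days:     {studies_per_year(365):,}")  # assumed every day
print(f"implied operating days:   {implied_operating_days():.0f}")
print(f"implied TB per study:     {implied_tb_per_study():.1f}")
```

A straight multiplication gives 34,000 studies per day, so the quoted 6,000,000 studies per year corresponds to roughly 176 days of use at the stated rate, and the quoted data volume works out to about 1 TB per study.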

And if there are 6,000,000 echo studies a year in children’s hospitals alone, then once you include adult hospitals and clinics and all imaging types (X-ray, PET, CT, MRI), you realize there is a vast amount of real-world data available to train accurate AI applications. It’s just all in the building.

So you might think: let’s just move all the data out of the building into a centralized public or private cloud and train there, as is done with consumer AI applications like ChatGPT. But centralized AI architectures will not work in healthcare and life sciences.

¹https://pubmed.ncbi.nlm.nih.gov/33820998/

²https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2796833

³https://arxiv.org/abs/1909.01940

⁴https://arxiv.org/abs/2002.02497

⁵https://arxiv.org/abs/1910.04597

⁶https://arxiv.org/abs/2012.07421

⁷https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2798829

⁸https://pubmed.ncbi.nlm.nih.gov/29356028/

⁹https://arxiv.org/abs/2102.08660

¹⁰https://pubmed.ncbi.nlm.nih.gov/34117018/

Written by Timothy Chou

www.linkedin.com/in/timothychou, Lecturer @Stanford, Board Member @Teradata @Ooomnitza, Chairman @AlchemistAcc
