Distributed AI Architecture for Medicine

5 min read · Mar 1, 2023

A decentralized in-the-building edge cloud service

Just as the original moon shot required a new rocket, the Pediatric Moonshot also requires a new “rocket” — but this time in the form of a distributed architecture for AI in medicine.

Before launching into a discussion of decentralized architectures, let’s first consider classic cloud computing for a moment. As I routinely share on the first day of the cloud computing course I teach at Stanford, cloud-based web services like those provided by Amazon, Google, and Microsoft have many advantages compared to on-premises computing. Amazon’s AWS (Amazon Web Services) exemplifies three key elements of a cloud service. First, it is Amazon that purchases the compute, storage, and networking equipment. Second, use of AWS comes with the company’s commitment to managing the security, availability, performance, and change of the infrastructure. And finally, Amazon delivers these cloud services from about 10–20 data centers around the world. These attributes also apply to Microsoft’s Azure and Google’s GCP and are in stark contrast to the traditional on-premises model, where you purchase the capital equipment, rent the data center, pay the power and cooling bills, and, on top of that, hire the staff to manage the security, performance, and availability of the compute and storage infrastructure. The result is that AWS and the other cloud services (also referred to as “infrastructure” cloud services) are both lower cost and higher quality.

While there are many advantages to moving to these cloud services, some applications are difficult to run in a centralized architecture. They include:

· Real-time applications. Some applications require near-real-time decision-making at the location where the data is captured (rather than at a remote center).

· Big data applications. In the previous post we estimated the echocardiogram data produced by 500 children’s hospitals at 6,000,000 TB. Pushing this much data up to a central cloud can be slow and expensive, even over 5G.

· Isolated applications. Some applications need to work even when disconnected from the cloud for substantial periods of time, especially where data connections are unreliable or intermittent (on an oil platform in the middle of the ocean, for example).

· Privacy, security, and compliance applications. Sometimes data is not allowed to leave the location where it is generated. In other cases, sufficient processing is required before any data can be shared. These barriers are typical of the healthcare system.
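
To put the big data point in perspective, here is a back-of-the-envelope calculation. It uses the 6,000,000 TB and 500 hospitals quoted above; the 1 Gbps sustained uplink, the equal split of data across hospitals, and the use of decimal terabytes are assumptions made purely for illustration.

```python
# Back-of-the-envelope estimate. The link speed and the equal split of data
# across hospitals are assumptions, not figures from the article.
TOTAL_TB = 6_000_000          # echocardiogram data estimate quoted above
HOSPITALS = 500               # number of children's hospitals quoted above
ASSUMED_UPLINK_GBPS = 1.0     # hypothetical sustained uplink per hospital

BYTES_PER_TB = 10**12         # decimal terabytes
SECONDS_PER_YEAR = 365 * 24 * 3600


def upload_years(total_tb: float, hospitals: int, uplink_gbps: float) -> float:
    """Years needed for one hospital to upload its equal share of the data."""
    bits_per_hospital = total_tb / hospitals * BYTES_PER_TB * 8
    seconds = bits_per_hospital / (uplink_gbps * 10**9)
    return seconds / SECONDS_PER_YEAR


per_hospital_tb = TOTAL_TB / HOSPITALS
years = upload_years(TOTAL_TB, HOSPITALS, ASSUMED_UPLINK_GBPS)
print(f"{per_hospital_tb:,.0f} TB per hospital, "
      f"about {years:.1f} years at {ASSUMED_UPLINK_GBPS} Gbps")
```

Under these assumptions each hospital holds 12,000 TB and would need roughly three years of continuous uploading, which is why moving the computation to the data is attractive.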

In the previous episode we reviewed five specific challenges to deploying and training AI applications in a centralized architecture. Based on these challenges, we concluded that there is a need for a distributed AI cloud infrastructure. Similar to centralized cloud services, a distributed AI cloud computing service provider still acquires the compute, storage, and network equipment; manages the performance, availability, security, and change of the infrastructure; and delivers it in a subscription business model.

The major defining difference is that the distributed AI cloud infrastructure needs to be engineered to deploy in any location or building, rather than being centralized in a handful of data centers. More specifically, to be able to support the Pediatric Moonshot mission, the distributed cloud needs to run in the buildings of all 500 of the world’s children’s hospitals.

We’ve established the need to achieve both real-time and privacy-preserving features. And we know that our applications will require our computing infrastructure to be decentralized and in the building, such as the hospital, the clinic, or ultimately the home. Together, these needs mean that our edge cloud system will have to satisfy a unique constellation of requirements, including:

1. Secure Distributed Compute & Storage

Centralized cloud services today are delivered from a handful of data centers where physical access to the building is strictly controlled. Edge servers placed in hospitals or clinics cannot realistically require the same level of physical access control. Because it cannot rely on physical access control, a distributed edge cloud needs to implement the features necessary to protect its compute and storage in the absence of physical security.

2. Network Security

One of the main reasons the computing infrastructure needs to be in the building is that the healthcare machines creating the data are in the building, and the only way to communicate with those data-generating machines is to be on the same secure, managed network. So a distributed AI network service must support intra-zone (in-the-building) communications as well as secure extra-zone (outside-the-building) communications.

3. Access to Data from Healthcare Machines

While there is useful data in the electronic medical record (EMR) or electronic health record (EHR), there is far more data in the imaging machines, blood analyzers, drug infusion pumps, ventilators, and gene sequencers. Any in-the-building edge cloud service must support access to the static data (e.g. machine serial number), environmental data (e.g. location), dynamic data (e.g. laser power level of the gene sequencer), and finally, the “nomic” data (e.g. echocardiogram, EEG, MRI scan, gene sequence, or blood analysis).

4. Fine-Grained Data Sharing

As we discussed in the previous episode, one of the fundamentals of privacy is purpose limitation. The distributed AI infrastructure needs to allow for fine-grained data sharing, such that a machine owner can choose the specific distributed AI applications with which to share data, as well as the ones with which not to. Doing so will clearly define not only which data can be shared and with whom, but also for which specific purpose(s), for commercial and research applications alike.
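
One way to picture this kind of purpose-limited sharing is as a deny-by-default policy attached to each machine. The sketch below is only an illustration: the class names, application identifiers, and purpose labels are hypothetical, and the data types simply reuse the four categories listed in the previous section.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Grant:
    """One permission: an application may read one data type for one purpose."""
    app_id: str
    data_type: str   # "static", "environmental", "dynamic", or "nomic"
    purpose: str     # hypothetical labels such as "research" or "commercial"


@dataclass
class MachinePolicy:
    """Hypothetical sharing policy a machine owner attaches to one machine."""
    machine_id: str
    grants: set = field(default_factory=set)

    def allow(self, app_id: str, data_type: str, purpose: str) -> None:
        """Record that the owner has opted in to this specific sharing."""
        self.grants.add(Grant(app_id, data_type, purpose))

    def is_allowed(self, app_id: str, data_type: str, purpose: str) -> bool:
        """Deny by default: only exact, explicitly granted combinations pass."""
        return Grant(app_id, data_type, purpose) in self.grants


# Example: the owner shares echocardiograms with one app, for research only.
policy = MachinePolicy("ultrasound-001")
policy.allow("arrhythmia-detector", "nomic", "research")

print(policy.is_allowed("arrhythmia-detector", "nomic", "research"))
print(policy.is_allowed("arrhythmia-detector", "nomic", "commercial"))
```

Keying each grant on the application, the data type, and the purpose together means that approving an application for research does not silently approve it for commercial use.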

5. Distributed AI Application Control

In addition to fine-grained data sharing, the architecture should also offer a rigorous process for allowing distributed AI applications in the building. This process should include security vulnerability testing, application security review, and defined white lists for any external communication.
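
The white list idea can be sketched as a simple check made before an application is allowed to open an outbound connection. This is a minimal illustration, not a description of any real product: the application name, the host name, and the extra rule that only HTTPS is permitted are all assumptions.

```python
from urllib.parse import urlsplit

# Hypothetical per-application white list of external hosts.
ALLOWED_HOSTS = {
    "arrhythmia-detector": {"models.example.org"},
}


def may_connect(app_id: str, url: str) -> bool:
    """Return True only if the app may reach this URL's host over HTTPS."""
    parts = urlsplit(url)
    host = parts.hostname  # lower-cased host name, or None if absent
    if parts.scheme != "https" or host is None:
        return False  # assumption: plain-text protocols are never allowed
    return host in ALLOWED_HOSTS.get(app_id, set())


print(may_connect("arrhythmia-detector", "https://models.example.org/v1"))
print(may_connect("arrhythmia-detector", "https://elsewhere.example.com/"))
```

An application that has not been through the review process has no entry in the table, so every outbound request it makes is refused.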

6. Image Sanitization

Given the intent to share medically related images outside of the hospital or clinic, the infrastructure should support what is referred to as “image sanitization.” In other words, it needs to be able to automatically identify and redact any personally identifying information (PII) present on the images.
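
Sanitization has two parts: finding the identifying text and removing it. The sketch below shows only the second part, on a grayscale image held as a list of rows. It assumes the bounding boxes come from some separate detector (for example, an OCR step), which is not shown; the function name and box format are hypothetical.

```python
def redact(pixels, boxes, fill=0):
    """Return a copy of a grayscale image with each box painted over.

    pixels: list of rows, each a list of integer intensities.
    boxes:  (top, left, bottom, right) tuples; bottom and right are exclusive.
            Assumed to come from a separate PII detector that is not shown.
    """
    out = [row[:] for row in pixels]          # never modify the original
    height = len(out)
    width = len(out[0]) if out else 0
    for top, left, bottom, right in boxes:
        # Clamp each box to the image so out-of-range boxes are harmless.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                out[y][x] = fill
    return out


# Example: a 4x6 all-white image with a name burned into the top-left corner.
image = [[255] * 6 for _ in range(4)]
clean = redact(image, [(0, 0, 2, 3)])
```

Returning a copy rather than editing in place keeps the original image inside the building untouched while only the redacted version is shared.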

7. Real-Time Inference

The architecture should support real-time AI inference. Regardless of where an AI application’s training takes place (using a distributed or a centralized architecture), the servers at the point of care must ultimately be able to execute locally on that learning (i.e., execute the resulting AI application) without having to rely on or make use of servers outside of the building.

8. Distributed Learning Capabilities

Finally, the distributed AI infrastructure should be optimized for privacy-preserving, network-preserving, federated learning. Rather than a centralized architecture that learns on 6,000,000 terabytes of shared, aggregated ultrasound data in a central site, distributed learning would allow the AI training to take place in a decentralized fashion across the 7,000 servers located in all 500 children’s hospitals around the globe.

So what is distributed learning? How is it capable of implementing privacy-preserving, network-preserving training of AI in medicine applications? Read on to learn about how distributed learning works for consumer AI applications.
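
As a small preview, one widely used form of federated learning is federated averaging: each site trains on its own data and sends back only model weights, which a coordinator combines in proportion to how much data each site used. The sketch below illustrates that single combining step. It is not necessarily the method used by the Pediatric Moonshot, and the hospital weights and example counts are made up.

```python
def federated_average(updates):
    """Combine per-site model weights into one set of weights.

    updates: list of (weights, num_examples) pairs, where weights is a list
    of floats produced by training locally at one site. Only these weights
    leave the building; the training data never does.
    """
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("at least one site must contribute examples")
    size = len(updates[0][0])
    averaged = [0.0] * size
    for weights, n in updates:
        if len(weights) != size:
            raise ValueError("all sites must use the same model shape")
        for i, w in enumerate(weights):
            averaged[i] += w * n / total   # weight each site by its data volume
    return averaged


# Made-up example: two hospitals, the second with three times as much data.
hospital_a = ([1.0, 2.0], 100)
hospital_b = ([3.0, 4.0], 300)
global_weights = federated_average([hospital_a, hospital_b])
print(global_weights)
```

In a full system this step would repeat over many rounds, with the combined weights sent back to each site for further local training.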

If you’d like to stay up to date on our progress to the moon, register for the newsletter at www.pediatricmoonshot.com, follow us on LinkedIn, subscribe to the Pediatric Moonshot podcast, listen to the Spotify playlist, and subscribe to the Pediatric Moonshot YouTube channel.

Many thanks for extensive editing by Laura Jana, Pediatrician, Social Entrepreneur & Connector of Dots, and Leanne West, Chief Engineer of Pediatric Technology at Georgia Tech. Special thanks to Alberto Tozzi, Head of Predictive and Preventive Medicine Research Unit at Ospedale Pediatrico Bambino Gesù, for the translation to Italian.

Written by Timothy Chou
