The Risky Integration of Generative AI and Health Analytics

Written by Socially Determined | Apr 10, 2025 5:16:04 PM

Generative artificial intelligence (genAI) is taking business operations and everyday life by storm, and the healthcare industry isn't immune. The next few years will likely see significant leaps in applying genAI to administrative tasks, care delivery and even clinical work.

Today we’re going to discuss the opportunities and limitations of this next wave of tech enablement, and how to move forward with the best interests of our partners (and our mission here at Socially Determined) in mind.

Researchers in the Artificial Intelligence in Medicine Program at Mass General Brigham found that large language models (LLMs) they had trained were extremely adept at identifying social determinants of health (SDOH) in clinicians’ notes that might otherwise have been lost in the growing volume of health records. In their tests, the trained LLMs identified 93.8% of patients whose SDOH called for more attention, while the official diagnostic codes used in examinations captured that information in just 2% of cases. Some of that gap comes down to how slowly diagnostic coding keeps pace with the latest research on social determinants of health, but there’s also something to be said for being able to parse that much data quickly.
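To make the idea concrete, here is a minimal, hypothetical sketch of how an LLM could be prompted to flag SDOH mentions in a free-text note. This is not the Mass General Brigham researchers’ actual method (they fine-tuned their own models on annotated notes); the model name, category list, prompt wording and sample note below are illustrative assumptions only.

```python
# Illustrative sketch only: a prompt-based approach to flagging SDOH mentions
# in a free-text clinical note. The model name, categories, prompt and sample
# note are assumptions for demonstration, not a production pipeline.
from openai import OpenAI

SDOH_CATEGORIES = [
    "housing instability", "food insecurity", "transportation barriers",
    "employment or financial strain", "social isolation",
]

PROMPT_TEMPLATE = (
    "You are reviewing a clinical note. List any social determinants of health "
    "mentioned in it, choosing only from these categories: {categories}. "
    "If none are mentioned, reply 'none'.\n\nNote:\n{note}"
)

def flag_sdoh(note_text: str, model: str = "gpt-4o-mini") -> str:
    """Ask the model which SDOH categories appear in a single clinical note."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,
        messages=[{
            "role": "user",
            "content": PROMPT_TEMPLATE.format(
                categories=", ".join(SDOH_CATEGORIES), note=note_text
            ),
        }],
        temperature=0,  # keep the extraction as deterministic as possible
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    sample_note = (
        "Patient reports missing two follow-up visits because she has no "
        "reliable ride to the clinic and recently lost her job."
    )
    print(flag_sdoh(sample_note))
```

Even in a sketch this simple, the cautions discussed below apply: any extraction like this would need to be validated through chart review or direct patient outreach before it informs care.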

The advantages of using AI in this case were dramatic, but these advancements come with the same red flags other industries have encountered with the unrestrained use of AI.

Generative AI’s Healthcare Data Problem

For all of its power, genAI is decidedly far from perfect.

One well-known issue is the technology’s propensity for producing “hallucinations,” which arise when incomplete or inadequate data is used to train an AI model, combined with an AI interface’s obligation to produce (seemingly) definitive results. Those biases and output pressures lead to incorrect, even absurd, conclusions. Unchecked AI processes can create serious risk, particularly in fields where their conclusions have impactful consequences, such as healthcare diagnoses, financial transactions or legal proceedings.

A lawyer in New York, for example, used OpenAI’s ChatGPT to gather supporting research in a personal injury lawsuit, but six of the cases it cited were, according to the judge, “bogus.” The AI made them up.

In healthcare, the stakes are even higher because mistakes can put lives at risk. In one instance, the American Medical Association found that an AI tool used by an EHR vendor to provide early warnings of sepsis infections often missed cases or issued false alarms.

Effective AI Requires Accurate Training Data

Transparency in the design and performance of algorithms is essential, although efforts toward “explainable AI,” in which AI models can describe their reasoning in terms humans can understand, are an area where developers are only beginning to make progress. Before getting to that point, however, the more immediate issue is the quality of the data being used to train AI models.

When applying AI to SDOH, the industry is working with inherently flawed data. The study at Mass General Brigham produced promising results, but the research was based exclusively on information provided by patients, which the hospital’s chief equity officer called “the gold standard.” Unfortunately, patients, intentionally or not, aren’t always accurate or truthful in what they tell providers or report on surveys. Relying solely on patient-supplied information produces flawed data, which in turn delivers flawed results from an AI model.

An obvious step is to use better data, giving AI a more accurate starting point for its work, but even that can leave massive blind spots: clinical data alone lacks information about the parts of a patient’s daily life that affect their health.

For SDOH, AI Models Need to Integrate “Outside” Data

Quality is paramount. For many healthcare use cases, data that already exists within the industry will probably be enough, as long as it’s accurate and complete. But that won’t work for social determinants of health, which, by definition, are drawn from people’s lives outside of healthcare settings: housing, employment status, means of transportation, access to healthy food, finances and other factors.

For AI to effectively address SDOH, organizations must look outside the industry for nontraditional datasets. Fortunately, data collection occurs throughout society, and there are myriad potential sources, from datasets that are already publicly available to those offered by private providers.

A Critical, Measured Approach to Healthcare Data

AI has demonstrated that it can be a powerful tool for understanding a patient’s life in and outside of clinical settings, which can improve care. But AI makes mistakes, especially when it’s trained on inadequate data. Healthcare organizations need to be sure they’re using complete, accurate datasets and should always confirm an AI’s results through other means, such as surveys or direct outreach to patients.

At Socially Determined, we avoid many of these issues by not deploying LLMs in our work. It’s an unpopular thing to say right now, but the accuracy is just not there yet. And accuracy is paramount in healthcare.

Even so, we avoid bias in our platform by not using biased data. Our carefully vetted sources go through frequent, additional internal verification. We also don’t use external risk scores. In fact, while corollary outcomes are useful for analysis, we don’t even use our own risk scores to describe other risks, because that, too, would introduce bias.

The potential of new technology is exciting, but by trusting experts to determine the efficacy of that technology at each stage of its evolution, we can ensure the highest possible quality and accuracy in our data and in the value we deliver to our clients.