Applying Advanced Machine Learning Techniques to Healthcare Research

July 1, 2025  |  By Meghan Perry


When Rafid Zaman Pranto started his master's thesis, he had no idea his research would touch on some of the biggest questions about health insurance in America. Using new machine learning techniques, the recent University of Arkansas graduate discovered that the relationship between insurance and healthcare use is far messier, and more interesting, than anyone expected.

With healthcare costs spiraling and politicians endlessly debating insurance reform, Pranto wanted to answer a fundamental question: How does having health insurance actually change the way people use healthcare? And more importantly, do different types of insurance plans create different behaviors?

The Data Detective

Pranto built his study around three types of healthcare visits that matter most: urgent care visits, emergency room trips, and overnight hospital stays. He used data from the National Health Interview Survey’s Sample Adult Interview (2019–2023), one of the Centers for Disease Control and Prevention’s largest and most comprehensive health surveys, which collects information from thousands of people across the country every year.

Instead of just comparing everyone at once, Pranto set up a three-stage investigation. First, he looked at people with insurance versus those without. Then he zoomed in on just the privately insured folks and compared people with deductibles to those without. Finally, he examined whether high deductibles (over $1,500) created different patterns than low ones. To make sure he wasn't missing anything important, he controlled for more than 20 variables that could affect healthcare use: age, sex, income, weight, existing health problems like heart disease and diabetes, job status, education, family size, where people lived, and how healthy they felt overall.
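The three-stage setup can be sketched in a few lines of Python. This is only an illustration: the respondent records and field names below are invented, not actual NHIS variables, and it assumes the third stage compares only people who have a deductible, which the article does not spell out.

```python
# Hypothetical respondent records; field names are illustrative, not NHIS variable names.
HIGH_DEDUCTIBLE_THRESHOLD = 1500  # dollars, the cutoff mentioned in the article

respondents = [
    {"id": 1, "insured": False, "private": False, "deductible": None},
    {"id": 2, "insured": True, "private": True, "deductible": 0},
    {"id": 3, "insured": True, "private": True, "deductible": 1000},
    {"id": 4, "insured": True, "private": True, "deductible": 3000},
    {"id": 5, "insured": True, "private": False, "deductible": None},
]


def build_stages(rows):
    """Return (treated ids, comparison ids) for each of the three stages."""
    # Stage 1: everyone, insured versus uninsured.
    stage1 = (
        [r["id"] for r in rows if r["insured"]],
        [r["id"] for r in rows if not r["insured"]],
    )
    # Stage 2: privately insured only, any deductible versus none.
    private = [r for r in rows if r["insured"] and r["private"]]
    stage2 = (
        [r["id"] for r in private if r["deductible"] > 0],
        [r["id"] for r in private if r["deductible"] == 0],
    )
    # Stage 3 (assumed): people with a deductible, high versus low.
    with_deductible = [r for r in private if r["deductible"] > 0]
    stage3 = (
        [r["id"] for r in with_deductible
         if r["deductible"] > HIGH_DEDUCTIBLE_THRESHOLD],
        [r["id"] for r in with_deductible
         if r["deductible"] <= HIGH_DEDUCTIBLE_THRESHOLD],
    )
    return {"stage1": stage1, "stage2": stage2, "stage3": stage3}


stages = build_stages(respondents)
print(stages)
```

Each stage narrows the sample before comparing, so later stages work with fewer people than the first.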

Seeing through the Causal Forests

Pranto got innovative with his research. Instead of relying on traditional statistical methods that treat everyone the same, he used something called Causal Forests—a machine learning approach that helps researchers see how the effects of something (like health insurance) can vary across different groups of people. Unlike older methods that look for a single, uniform average outcome, Causal Forests break the data into subgroups to uncover hidden patterns, such as how certain policies might benefit employed adults more than those who are unemployed. “Machine learning is gaining traction, but it's still pretty new in health economics,” Pranto explains. “That gave me an opportunity to contribute academically.”

He implemented this technique using Python, one of the most widely used programming languages in data science. Python's flexibility and growing library of machine learning tools allowed him to run complex models like Causal Forests efficiently. Think of it this way: traditional analysis gives you one average answer for everyone. But Causal Forests can reveal that insurance may affect working mothers differently than retired men, or that employment status changes how people respond to high deductibles. Using machine learning also helped solve a common challenge in research—making sure the results reflect the actual impact of having insurance, not just pre-existing differences between insured and uninsured people. And it’s worth noting: Pranto didn’t even know Python when he started his master’s program. He taught himself over the summer, making this his first full-scale machine learning project.
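The difference between one average answer and subgroup-specific answers can be shown with a toy example. To be clear, this is not a Causal Forest and not Pranto's code: it is a plain difference in means computed overall and then by employment group, with no adjustment for confounders, and every number in it is made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Invented toy records: (employment group, insured?, urgent care visits).
records = [
    ("employed", True, 3), ("employed", True, 5),
    ("employed", False, 1), ("employed", False, 1),
    ("unemployed", True, 2), ("unemployed", True, 2),
    ("unemployed", False, 1), ("unemployed", False, 2),
]


def diff_in_means(rows):
    """Average visits of the insured minus average visits of the uninsured."""
    insured = [visits for _, is_insured, visits in rows if is_insured]
    uninsured = [visits for _, is_insured, visits in rows if not is_insured]
    return mean(insured) - mean(uninsured)


# One number for everyone, as a traditional average effect would give.
overall = diff_in_means(records)

# Separate numbers per subgroup, the kind of variation the article describes.
by_group = defaultdict(list)
for row in records:
    by_group[row[0]].append(row)
subgroup_effects = {group: diff_in_means(rows) for group, rows in by_group.items()}

print(overall)           # 1.75
print(subgroup_effects)  # employed: 3, unemployed: 0.5
```

In this toy data the overall gap of 1.75 visits hides a much larger gap for employed people than for unemployed people. A Causal Forest looks for that kind of variation automatically across many variables at once, rather than relying on subgroups chosen by hand.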

The Human Variable

The first discovery wasn't shocking: people with insurance used more healthcare services across the board. In his data, about 60-70% of Americans had private insurance, with the rest covered by Medicare or Medicaid. But when Pranto dug into whether deductibles mattered (the part that should have shown clear differences), things got weird. Economic theory says higher out-of-pocket costs should make people use less healthcare. The data told a different story.

"In the second and third stages—comparing deductible status and high vs. low deductibles—we didn't find statistically significant effects," Pranto admits. "The results made intuitive sense but weren't statistically robust, possibly due to high variance." In other words, people's healthcare decisions are more complicated than simple price calculations.
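A result can point in the expected direction and still fail to be statistically significant when the data are noisy. The sketch below uses invented visit counts and a rough normal-approximation confidence interval, which is cruder than anything a real study would use; the variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Invented yearly visit counts for two tiny groups of privately insured people.
low_deductible_visits = [0, 1, 2, 5]
high_deductible_visits = [0, 0, 1, 3]

# Estimated effect of a high deductible: negative means fewer visits.
effect = mean(high_deductible_visits) - mean(low_deductible_visits)

# Standard error of a difference in means (unequal variances).
standard_error = sqrt(
    variance(high_deductible_visits) / len(high_deductible_visits)
    + variance(low_deductible_visits) / len(low_deductible_visits)
)

# Rough 95% interval using the normal critical value 1.96.
ci_low = effect - 1.96 * standard_error
ci_high = effect + 1.96 * standard_error
significant = not (ci_low <= 0 <= ci_high)

print(effect, (round(ci_low, 2), round(ci_high, 2)), significant)
```

Here the estimate is one fewer visit for the high-deductible group, which matches economic intuition, but the interval stretches well past zero in both directions, so the data cannot rule out no effect at all.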

The Subgroup Surprise

Moving beyond the deductible factor, when Pranto examined different subgroups with machine learning to understand what else drove healthcare choices, he uncovered striking patterns that traditional methods might have missed. What emerged first was that women consistently used all three types of healthcare services more than men, no matter how the groups were compared. Even more unexpected was what the data revealed about age. "I didn't find clear trends by age, even though I expected older people to use services more," Pranto notes. Contrary to the common belief that healthcare use rises sharply with age, the data showed a surprisingly flat pattern across age groups. But perhaps the most eye-opening discovery came when examining employment status, revealing just how complex the relationship between money and healthcare decisions can be.

For urgent care, employed people were nearly twice as likely to seek care compared to unemployed folks. This makes sense. If someone has a steady paycheck, they're more likely to address health issues before they become serious problems. But emergency room visits? Almost equal between employed and unemployed people, with unemployed individuals sometimes using ERs slightly more. "You can't delay emergencies," Pranto points out. When something's truly urgent, financial concerns take a back seat.

Overnight hospital stays revealed the most troubling pattern. Employed people were much more likely to stay in the hospital as long as doctors recommended. Unemployed individuals tended to leave early (even those who were insured). "Employed people were more likely to stay, while unemployed individuals may choose not to due to high costs even with insurance," Pranto explains.

The Deductible Debate

During Pranto’s poster presentation, an undergraduate student asked a question that stopped him in his tracks. Why weren't older people showing up in the data as heavy healthcare users? The student suggested that older individuals, who would be expected to use emergency services at a higher rate, might be systematically missing from survey data due to mortality. This "survivor bias" could help explain why the data doesn’t show the expected increase in emergency service use with age. It's a sobering thought that illustrates how complex healthcare research can be. "That bias could affect the age trends, so I might explore that in future research," Pranto says.
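The student's point can be made concrete with a tiny, entirely invented example: if the heaviest emergency room users in an age group are the ones most likely to have died before the survey, the people left to interview will look healthier than the group really was.

```python
from statistics import mean

# Invented older adults: (ER visits per year, still alive at survey time?).
older_adults = [
    (0, True), (1, True), (1, True), (2, True),
    (5, False), (6, False),  # heaviest users are missing from the survey
]

# What the average would be if everyone could be interviewed.
true_mean = mean([visits for visits, _ in older_adults])

# What a survey of survivors actually observes.
surveyed_mean = mean([visits for visits, alive in older_adults if alive])

print(true_mean, surveyed_mean)  # 2.5 versus 1
```

With these made-up numbers, the survey would report an average of 1 visit per year when the full group averaged 2.5, enough to flatten an age trend that really exists.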

Pranto's research has serious implications for ongoing healthcare debates. The finding that deductible amounts didn't significantly change people's behavior challenges the logic behind high-deductible health plans, which are supposed to make people more cost-conscious healthcare consumers. Adding to that, the employment-based differences in hospital stays suggest that having insurance isn't enough—people still need financial resources to get optimal care. This finding could influence discussions about everything from sick leave policies to hospital billing practices. The research also shows that men and women, different age groups, and people with different employment situations all respond to insurance differently. This suggests that one-size-fits-all insurance policies might not work as well as targeted approaches.

The Future of AI and Health Insurance

Beyond the specific findings, Pranto's work demonstrates how advanced machine learning techniques, and the growing role of AI in research, can uncover patterns in human behavior that traditional research methods miss. As healthcare policy becomes increasingly data-driven, tools that can detect these complex patterns become essential. His research suggests that effective healthcare policy needs to account for the very human reality of how different people make healthcare decisions, a complexity that AI tools are well equipped to handle. Instead of assuming everyone reacts to insurance the same way, policymakers should recognize that different groups may be affected by insurance design in different ways.

As Pranto prepares for post-grad life and job hunting, his research stands as proof that sometimes the most important discoveries come from students willing to apply new tools to old questions. In a field where assumptions about healthcare behavior drive billion-dollar policy decisions, his work provides the kind of nuanced, data-driven insights that policymakers desperately need.

Meghan Perry

Meghan is an experienced freelance writer and editor. In the daytime, she works as a PR and content writer specializing in B2B, government tech, and higher education. Her heart truly belongs to creative writing, where she finds joy in spinning tales and polishing editorial gems.

With a TBR pile that could rival a small mountain, there’s always a book tucked away in her tote bag. Her LinkedIn DMs are open for project requests, book recommendations, and Harry Potter trivia.