University of Chicago researchers launch projects exploring health disparities, machine learning.
The COVID-19 pandemic has mobilized the world’s scientific community like no other recent crisis, including many researchers using the most modern data science and artificial intelligence approaches. At the University of Chicago, public health experts, computer scientists, economists and policy analysts have launched projects using computational tools to better detect, diagnose, treat and prevent the spread of the deadly virus.
This summer, three of these projects received seed funding from the C3.ai Digital Transformation Institute (DTI), a new partnership of technology companies and universities committed to accelerating the benefits of artificial intelligence for business, government and society. The research attacks the pandemic from several angles: helping policymakers control disease spread by identifying and addressing key social factors, physicians detect the disease at earlier stages, and hospitals decide which patients require admission. A fourth project, a collaboration led by UChicago Medicine’s Maryellen Giger, was funded by the organization in spring.
The awards were part of $5.4 million in funding that DTI distributed after its inaugural call for proposals in March. The group also provides AI software tools and a “data lake” of COVID-19 datasets to aid researchers studying the pandemic.
“The enthusiastic response among scientists and researchers coupled with the diverse, high-quality and compelling proposals we’ve received suggests that we have the potential to alter the course of this global pandemic,” said Thomas M. Siebel, CEO of C3.ai. “In the face of this crisis, the Institute is proud to bring together the best and brightest minds and provide direction and leadership to support objective analysis and AI-based, data-driven science to mitigate COVID-19.”
Modeling health disparities
The early toll of the COVID-19 pandemic revealed severe health inequities in who catches the disease and who suffers death and morbidity. Latin and African Americans are more than three times as likely to catch the virus and twice as likely to die as white Americans, according to CDC data. Many experts believe this disparity goes beyond medical comorbidities, to social determinants such as housing, jobs and neighborhood features.
Anna Hotton, a research assistant professor at UChicago Medicine, previously studied the relationship between social factors and viral spread in the context of other infectious diseases. With her DTI grant, she’s working with fellow UChicago researchers Aditya Khanna, Harold Pollack and John Schneider to adapt that work to COVID-19, with help from agent-based modeling experts Jonathan Ozik and Charles Macal at Argonne National Laboratory.
“A lot of my substantive work focuses around understanding social and structural factors as they impact HIV transmission,” Hotton said. “With COVID-19, there are a lot of similarities in terms of the social factors that shape people’s vulnerability to infection, and I’m motivated to shed light on some of these social issues and help guide work around reducing health inequities.”
Agent-based modeling is a powerful form of computer simulation for studying complex systems, from molecular interactions to traffic congestion. Over the last decade, Argonne researchers Ozik and Macal have gradually assembled a computer model for the entire city of Chicago and its population, using it to observe and predict the spread of diseases both real (MRSA, influenza) and imagined (a zombie outbreak). Recently, the team has focused their ChiSIM model on the spread of COVID-19, looking for types of buildings and areas of the city where people gather and disease transmission risk is high.
With Hotton and her collaborators, Ozik and Macal are working on adding new data to their synthetic Chicago population of 2.7 million “agents,” including information on housing, occupations and other social determinants that likely influence virus spread. The team will also use machine learning to identify the data elements that are most important to include in the model from a long list of options, such as time spent on public transit, ability to work from home, number of family members in a household, and many other details.
Once the model is enriched with this data, the researchers will be able to better simulate various scenarios of disease spread and virtually test how different public health or social policy strategies can help mitigate the disease. Their results will be shared with partners in the Chicago and Illinois Departments of Public Health, advising these agencies on how best to deploy testing, reopen businesses and schools, and, eventually, distribute vaccines.
“Agent-based modeling allows us to explore intervention approaches in a virtual environment before rolling out interventions in real life, in addition to making predictions about trends in incidence and mortality,” Hotton said. “Later, when vaccines are available, we’ll need to figure out how to deploy them most efficiently to the populations with greatest need.”
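To make the idea of testing an intervention "in a virtual environment" concrete, here is a minimal sketch of an agent-based simulation in Python. It is not the ChiSIM model and does not reflect the team's methods or data: the population size, the household and workplace structure, the work-from-home attribute and every parameter value are assumptions chosen purely for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    household: int
    workplace: int
    can_work_from_home: bool
    state: str = "S"  # "S" susceptible, "I" infectious, "R" recovered
    days_infectious: int = 0


def build_population(n_agents, household_size, n_workplaces, wfh_share, rng):
    """Create synthetic agents; every attribute is made up for illustration."""
    return [
        Agent(
            household=i // household_size,
            workplace=rng.randrange(n_workplaces),
            can_work_from_home=rng.random() < wfh_share,
        )
        for i in range(n_agents)
    ]


def places_visited(agent, stay_home_policy):
    """Everyone is at home each day; some agents also go to a workplace."""
    places = [("home", agent.household)]
    if not (stay_home_policy and agent.can_work_from_home):
        places.append(("work", agent.workplace))
    return places


def simulate(agents, days, p_transmit, infectious_days, stay_home_policy, rng,
             n_seeds=5):
    """Run a toy S-I-R outbreak and return the final count in each state."""
    for agent in rng.sample(agents, n_seeds):
        agent.state = "I"
    for _ in range(days):
        # Count infectious agents present at each place today.
        infectious_at = {}
        for agent in agents:
            if agent.state == "I":
                for place in places_visited(agent, stay_home_policy):
                    infectious_at[place] = infectious_at.get(place, 0) + 1
        # Each susceptible agent may be infected by any infectious contact.
        newly_infected = []
        for agent in agents:
            if agent.state != "S":
                continue
            contacts = sum(infectious_at.get(place, 0)
                           for place in places_visited(agent, stay_home_policy))
            if contacts and rng.random() < 1 - (1 - p_transmit) ** contacts:
                newly_infected.append(agent)
        # Infectious agents recover after a fixed number of days.
        for agent in agents:
            if agent.state == "I":
                agent.days_infectious += 1
                if agent.days_infectious >= infectious_days:
                    agent.state = "R"
        for agent in newly_infected:
            agent.state = "I"
    counts = {"S": 0, "I": 0, "R": 0}
    for agent in agents:
        counts[agent.state] += 1
    return counts


def run_scenario(stay_home_policy, seed=0):
    """Build a fresh synthetic population and simulate one policy scenario."""
    rng = random.Random(seed)
    agents = build_population(2000, 4, 50, 0.4, rng)
    return simulate(agents, days=60, p_transmit=0.02, infectious_days=7,
                    stay_home_policy=stay_home_policy, rng=rng)


if __name__ == "__main__":
    print("no policy:", run_scenario(False))
    print("work-from-home policy:", run_scenario(True))
```

Comparing the two scenarios across many random seeds, rather than a single run, is what would let a modeler say anything about the likely effect of a policy. A real model would also draw its agents and their social attributes from census and survey data instead of random assignment.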
Admit or release?
One of the toughest decisions physicians face during the pandemic is which COVID-19 patients to keep in the hospital and which can safely recover at home. In the face of overwhelmed hospital capacity and a brand-new disease with little data-based evidence for diagnosis and treatment, old rubrics for deciding which patients to admit have proven ineffective. But machine learning could help make the right decision earlier, saving lives and lowering health care costs.
A team led by Prof. Sendhil Mullainathan of Chicago Booth will work with a large northwest U.S. hospital network on creating a new model for predicting acute respiratory distress syndrome (ARDS), the most severe symptom and primary cause of death for COVID-19 patients. Using over 4 million chest X-rays, the team—which also includes Aleksander Madry of Massachusetts Institute of Technology and Ziad Obermeyer from University of California, Berkeley—will build a new machine learning model that predicts the likelihood of this pulmonary collapse.
To work around the issue of limited COVID-19 data early in the pandemic, the team will feed their model with X-rays from other conditions that affect the lungs, such as influenza and pneumonia.
“No one has enough data on COVID yet to apply the modern machine learning toolkit,” said Obermeyer. “But in a pulmonary infection such as COVID, the lungs actually have a very limited physiological playbook. When the lungs are attacked by a virus or bacterium, they basically only react in one way. Our hypothesis is that we can learn about deterioration in COVID by looking at deterioration in other conditions.”
Once validated, their AI model will be made open source and available to other health systems around the world. The project also gives Mullainathan and Obermeyer an opportunity to develop a medical decision-making algorithm that controls for the bias they identified in other health care software in previous research.
“Even if you’re using objective biological data like X-rays, your outcomes are biased because they’re produced by a health system that is biased,” Obermeyer said. “The optimistic view of our prior work on racial bias is that once you’re aware of those biases, you can make algorithms that take them into account.”
Early detection: Treating a pandemic like engine failure
In the early stages of a disease outbreak, detecting cases is critical to preventing population spread, but also very difficult: a proverbial “needle in a haystack” data problem. But computer scientists have already developed artificial intelligence systems for such challenges in other contexts, such as detecting mechanical faults in jet engines or anomalous and potentially fraudulent financial transactions. Models built for these applications must be able to accurately and reliably find rare occurrences in a flood of data. Nobody wants to discover airplane engine failure too late.
In previous work at Caltech, UChicago computer scientist Yuxin Chen built these early detection systems for mechanical engineers and other domain experts. With DTI funding, he’ll work with researchers from UC Berkeley and UCSF on transferring these approaches to detecting infection from COVID and other diseases using medical and public health surveillance data. The team will adapt solutions for common challenges such as training models on sparse data, combining data from different sources and collection techniques, and minimizing false negatives that could have dire consequences if infected patients are missed.
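One common way to limit false negatives when true cases are rare is to choose the alert threshold from a required detection rate rather than using a fixed cutoff. The Python sketch below illustrates that general idea only. It is not the team's system, and the function names, scores and labels are hypothetical.

```python
def recall_at(threshold, scores, labels):
    """Share of true cases (label 1) whose score reaches the threshold."""
    positive_scores = [s for s, y in zip(scores, labels) if y == 1]
    if not positive_scores:
        raise ValueError("need at least one true case to measure recall")
    return sum(s >= threshold for s in positive_scores) / len(positive_scores)


def pick_threshold(scores, labels, min_recall):
    """Highest observed score that, used as a cutoff, still meets min_recall."""
    if not 0 < min_recall <= 1:
        raise ValueError("min_recall must be in (0, 1]")
    for threshold in sorted(set(scores), reverse=True):
        if recall_at(threshold, scores, labels) >= min_recall:
            return threshold
    # Not reached: the lowest score always flags every true case.
    raise RuntimeError("no threshold satisfies min_recall")


def false_alarms(threshold, scores, labels):
    """Number of non-cases (label 0) flagged at this threshold."""
    return sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)


# Hypothetical model scores for eight people; three are true cases.
scores = [0.95, 0.80, 0.40, 0.35, 0.30, 0.10, 0.05, 0.02]
labels = [1, 0, 1, 0, 0, 1, 0, 0]

strict = pick_threshold(scores, labels, min_recall=1.0)   # miss no one
relaxed = pick_threshold(scores, labels, min_recall=0.66)  # allow one miss
```

In this toy data, catching every true case means lowering the threshold far enough that several healthy people are flagged as well. That trade-off between missed infections and false alarms is the one a surveillance system has to manage.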
Chen’s portion of the project focuses on his primary research interest: interactive machine learning. Unlike the passive “black box” of most AI models, these systems actively work with human experts, suggesting new data sources that should be gathered to improve predictions, or asking for help when a particular diagnosis is unclear.
“If the model is not very confident about the predictive results for a certain medical diagnosis that we have data on, it will flag these data and ask experts to verify or correct the predictive results,” said Chen, an assistant professor. “We also care about interpretable recommendations; we’re training our AI system to effectively communicate with the human users to collaboratively make detection and diagnosis decisions. So we need to build an interpretable interface that sits between the system and medical professionals in order to make the collaboration seamless.”
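The flag-and-verify loop Chen describes can be sketched in a few lines of Python. This is a simplified illustration, not the project's system: it assumes the model outputs a probability of infection for each case, treats probabilities near 0.5 as "not very confident," and uses made-up case IDs, values and function names.

```python
def split_by_confidence(probabilities, margin):
    """Separate confident predictions from cases to send to an expert.

    probabilities maps a case ID to the model's estimated probability of
    infection. A case is flagged when that probability is within `margin`
    of 0.5, where the model is least sure either way.
    """
    confident = {}
    flagged = []
    for case_id, p in probabilities.items():
        if abs(p - 0.5) < margin:
            flagged.append(case_id)
        else:
            confident[case_id] = p >= 0.5
    # Show the expert the most uncertain cases first.
    flagged.sort(key=lambda case_id: abs(probabilities[case_id] - 0.5))
    return confident, flagged


def merge_expert_labels(confident, expert_labels):
    """Combine the model's confident calls with the expert's answers."""
    final = dict(confident)
    final.update(expert_labels)
    return final


# Hypothetical model outputs for five cases.
model_output = {"a": 0.97, "b": 0.55, "c": 0.08, "d": 0.42, "e": 0.51}
confident, flagged = split_by_confidence(model_output, margin=0.1)

# An expert reviews only the flagged cases; these answers are invented.
expert_labels = {"e": False, "b": True, "d": True}
final = merge_expert_labels(confident, expert_labels)
```

In an interactive system, the expert's answers would typically also be fed back as new training data, so the model needs to ask less often over time.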