Using Data Science to Help Predict Adverse Drug Reactions
PhD students, undergrads develop visual analytics system for FDA
May 9, 2018
Six WPI students, under the direction of computer science professor Elke Rundensteiner, have integrated natural language processing and deep learning techniques to develop a visual analytics system that processes the more than one million reports of adverse drug reactions gathered annually by the U.S. Food and Drug Administration (FDA). The research is aimed at better prediction of harmful reactions from drug-drug interactions.
The implications of this accomplishment, which includes work done by four undergraduates as their Major Qualifying Project (MQP), are important because adverse drug reaction events cause more than 100,000 deaths a year in the United States and cost over $170 billion in annual added expenses. Drug-drug interactions are also a major cause of emergency room visits and hospitalizations. Without big data tools such as those developed by the WPI team to help FDA safety evaluators sift through this huge volume of adverse event reports, dangerous incidents will remain unchecked.
“This project is important to me because it has impactful real-world implications,” says Brian Zylich '19. He says the new technology “may very well be used to improve the capabilities of FDA safety evaluators to identify harmful drug-drug interactions. This means that the FDA will be able to detect problems and warn patients and consumers in a more timely manner and potentially save lives.”
The students’ efforts have already resulted in the submission of several academic research papers, including one presented last month at the IEEE International Conference on Data Engineering in Paris. Other papers are being reviewed for publication.
Rundensteiner says machine learning and natural language processing capabilities from the system may be worked into the FDA Adverse Event Reporting System (FAERS) software for use by the FDA. She and her PhD students, Tabassum Kakar and Xiao Qin, will meet with their FDA collaborators, safety evaluators, and the deputy director of the Office of Surveillance and Epidemiology this month in Washington, D.C., to discuss the WPI project and see whether it might serve as a model for a wider improvement of the adverse drug reaction review process.
Detecting all potential drug interactions in clinical trials for new drugs is impossible, since it would require testing all potential drugs against the one being used by patients in the trial. Consequently, FDA evaluators rely on the reporting of adverse drug reactions from the field—from healthcare professionals, manufacturers, and consumers—that are submitted to the agency.
The FDA uses FAERS to collect the semi-structured adverse event reports into a database. FAERS then enables the safety reviewers to browse through the list of reports related to a given disease by date or sorted by other criteria. But the system currently does not employ machine learning algorithms for detecting adverse drug-drug interactions and then recommending particular interactions to the safety reviewers for in-depth analysis. The visual analytics capabilities of the current review system also are limited—missing the opportunity to bring the human in the loop of the analysis process.
The WPI students have worked to develop a system that uses text mining and deep learning to effectively sort through and compare the reports and then present the results on a series of interactive visualizations, so analysts can more readily identify possibly serious drug interactions.
Research into the system began several years ago when WPI graduate Marni Hall '97, now a university trustee, suggested that Rundensteiner and her students undertake a project with the FDA to apply WPI’s data science expertise to the FDA’s challenging big data problems. At the time, Hall was a senior program director at the FDA, overseeing the branch responsible for surveillance and epidemiology.
Rundensteiner, Kakar, and Qin began a collaboration with the FDA through fellowships funded by the agency through the Oak Ridge Institute for Science and Education. These multi-year graduate fellowships have amounted to nearly $200,000 in funding.
Rundensteiner and the students hold weekly conferences with their FDA collaborators and have developed a close rapport while learning about the big data challenges the FDA faces.
Qin developed natural language processing and machine learning strategies that extract information from unstructured text and allow it to be compared to data from other reports, scoring the relative interest of the mined interactions for their relevancy. Kakar, with guidance from computer science professor Lane Harrison, has designed novel data visualizations to display critical information on the computer screen and empower safety evaluators to visually interact with the data to facilitate discovery.
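The article does not say how the mined interactions are scored. As a rough illustration only, the Python sketch below ranks candidate (drug pair, reaction) combinations with a simple lift-style ratio: how much more often a reaction appears in reports naming both drugs than in reports overall. The function name, the `min_support` threshold, the placeholder drug names, and the sample reports are all hypothetical, and this simple statistic stands in for, rather than reproduces, the deep learning approach used in the WPI system.

```python
from collections import Counter
from itertools import combinations


def score_interactions(reports, min_support=2):
    """Rank (drug pair, reaction) candidates by a lift-style ratio.

    reports: list of (drugs, reactions) tuples, each element a set of strings.
    Returns tuples of (score, supporting_reports, drug_pair, reaction),
    highest score first.
    """
    total = len(reports)
    pair_count = Counter()       # reports mentioning both drugs of a pair
    reaction_count = Counter()   # reports mentioning a reaction
    joint_count = Counter()      # reports mentioning the pair and the reaction

    for drugs, reactions in reports:
        pairs = [frozenset(p) for p in combinations(sorted(drugs), 2)]
        pair_count.update(pairs)
        reaction_count.update(reactions)
        for pair in pairs:
            for reaction in reactions:
                joint_count[(pair, reaction)] += 1

    scored = []
    for (pair, reaction), joint in joint_count.items():
        if joint < min_support:
            continue  # too few reports to be worth a reviewer's attention
        rate_with_pair = joint / pair_count[pair]
        rate_overall = reaction_count[reaction] / total
        scored.append(
            (rate_with_pair / rate_overall, joint, tuple(sorted(pair)), reaction)
        )
    return sorted(scored, reverse=True)


# Made-up reports with placeholder drug names (not real data).
sample_reports = [
    ({"drug_a", "drug_b"}, {"bleeding"}),
    ({"drug_a", "drug_b"}, {"bleeding"}),
    ({"drug_a"}, {"nausea"}),
    ({"drug_b", "drug_c"}, {"nausea"}),
    ({"drug_c"}, {"nausea"}),
]
ranked = score_interactions(sample_reports)
for score, support, pair, reaction in ranked:
    print(pair, reaction, support, round(score, 2))
```

A ratio well above 1 suggests the reaction is reported disproportionately often when the two drugs appear together, which is the kind of signal a reviewer might want surfaced first.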
The research innovations by Kakar and Qin were brought to fruition in an integrated web-based system that incorporates them within one platform. Four undergraduates took on this task with Kakar and Qin as their mentors. The Multiple Drug Interaction Analytics Platform (MIAP), developed by Zylich, Andrew Schade, Brian McCarthy, and Huy Quoc Tran together with their mentors, is a fully working web-based prototype system, Rundensteiner says.
The undergraduates realized the graduate students’ research concepts by implementing the technologies in a common programming language and then combining them into an integrated web-based client-server system. They tackled difficult data integration challenges, developing algorithms for detecting and correcting data ambiguities such as differences in names for the same drug and imprecise naming of adverse drug events. This enables the MIAP system to distinguish newly discovered drug-to-drug interactions from those previously known by the community. They also overlaid scientific information about known drug-to-drug interactions, extracted from external web sources, onto those discovered by the system to provide better context for safety evaluators.
The undergraduates also enhanced visualizations of the drug interactions and developed a web-based human-computer interface for the system so it can be accessed by evaluators more easily.
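The idea of resolving drug-name ambiguities and then separating known from newly observed interactions can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MIAP implementation: the synonym table, the set of known interactions, the function names, and the sample reports are all hypothetical stand-ins for the curated vocabularies and external sources a real system would draw on.

```python
from collections import Counter
from itertools import combinations

# Hypothetical synonym table mapping name variants to one canonical name.
SYNONYMS = {
    "tylenol": "acetaminophen",
    "paracetamol": "acetaminophen",
    "coumadin": "warfarin",
    "advil": "ibuprofen",
}

# Hypothetical stand-in for interactions pulled from an external source.
KNOWN_INTERACTIONS = {frozenset({"warfarin", "ibuprofen"})}


def normalize(name):
    """Lower-case, collapse whitespace, and map a variant to its canonical name."""
    key = " ".join(name.lower().split())
    return SYNONYMS.get(key, key)


def count_pairs(reports):
    """Count how many reports mention each unordered pair of canonical drugs."""
    counts = Counter()
    for drugs in reports:
        canonical = sorted({normalize(d) for d in drugs})
        for a, b in combinations(canonical, 2):
            counts[frozenset({a, b})] += 1
    return counts


def split_known_and_new(counts):
    """Separate pairs already in KNOWN_INTERACTIONS from those that are not."""
    known = {p: n for p, n in counts.items() if p in KNOWN_INTERACTIONS}
    new = {p: n for p, n in counts.items() if p not in KNOWN_INTERACTIONS}
    return known, new


# Made-up reports: each lists the drug names as a reporter might have typed them.
reports = [
    ["Coumadin", "Advil"],
    ["warfarin", " IBUPROFEN "],
    ["Tylenol", "warfarin"],
    ["Paracetamol", "Coumadin"],
]
pair_counts = count_pairs(reports)
known, new = split_known_and_new(pair_counts)
```

Without the normalization step, the four reports above would look like four unrelated drug pairs; with it, they collapse into two pairs with two supporting reports each, one of which is already in the known set.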
The new system presents its results on a series of visual displays that allow an evaluator to see myriad interactions among drugs, to drill down on a specific drug and to access additional data on the drugs, down to the specific reports that serve as evidence for the learned drug-to-drug interaction.
The undergraduates have put together a video demonstrating the highlights of the new technology.
“The undergraduates really did an amazing job to show what is feasible by bringing the two distinct innovations together in one integrated system,” says Rundensteiner. “In terms of potential vision for the FDA, this is huge.”
Kakar thinks the WPI system is an improvement over the query-based approach now in use at the FDA, which requires evaluators to ask explicit questions about the report data to draw out conclusions about interactions.
“The [WPI] system is very interactive,” she says. “They can find out what are potential interactions with ease. They don't have to run manual queries each time to find certain information.
“It can help them find out something alarming very quickly, as compared to their current system, where they have to run queries. We received positive feedback from the FDA—from the safety evaluators and the supervisors I work with.”
Rundensteiner says this innovative WPI system could serve as a model for improvements in the review system in place at the FDA.
“I believe that our proposed techniques have the potential to fundamentally revolutionize how science and safety review is done in the future,” Rundensteiner says.
“Having developed a fully working prototype lets the FDA safety evaluators and their staff play with this as a prototype and see what they like and what they don’t like. This will speed up the development cycle for future technologies. It will guide the FDA in putting together requirements for which features a review system of the future should have.”
Undergraduates Schade and McCarthy see real-world value in what they have produced: it makes the process of evaluating drug interaction information more efficient, allowing evaluators to find important information and connections more easily and to prioritize actions more quickly.
“It's important because it allows FDA drug evaluators to understand the risk of combining prescriptions between two different drugs and really making that public knowledge,” says Schade. “Our project creates a way to interpret that data quickly.”
“That’s important because these adverse effects that happen when you're taking a drug or multiple drugs can be life threatening,” adds McCarthy. “And yet they're rather common.”
Rundensteiner also sees tremendous benefit from bringing together both undergraduate and graduate students to work jointly on this larger vision in a mixed team.
“This project showcases the value of the capstone projects at WPI because it's going from critical societal problems to building a real-world working solution,” she continues. The students’ work demonstrated “the wide spectrum of skills that computer science graduates trained at WPI are equipped with from system building to human interface design.”
Zylich, Schade, and McCarthy say the project has taught them invaluable lessons that will last well after their time at WPI.
Schade, who starts work this week as a data engineer for an insurance company in New York, says the FDA effort taught him how to keep the large vision of a project in focus while he broke down the project into individual tasks, prioritized the tasks, and worked on them. Beyond the technical aspects of the project, the effort allowed him to practice both his teamwork and leadership skills. He says he has mentioned this project experience and its lessons in all his job interviews.
Zylich, a junior in a BS/MS program who wants to get his computer science PhD and go into research, says the project “allowed me to apply the theoretical concepts I learned in my classes to a practical project, incorporating WPI’s pillars of theory and practice.” He says it also provided “a fantastic opportunity to learn about graduate school, work with graduate students, and experience conducting research and developing a scholarly manuscript and video describing the research and its impact.”
And McCarthy, who starts a software engineering job in July at Constant Contact in Waltham, says the project exposed the team to “many state-of-the-art techniques and technologies—ranging from data visualization, human computer interaction, client-server systems, web development, database processing, to machine learning” and diverse challenges of software development.
On top of all that, the project also sharpened their teamwork skills, he adds.
“There were so many different software components that were designed in different ways and used distinct technologies,” he says. “It was critical for the success of the project that we were able to work together, that we were able to divide up that work, but also communicate well so that these different pieces that were being developed independently would work together well.”
- By Thomas Coakley