NIH Data Science Plan Aims to Boost Data Analytics, Access

NIH’s Strategic Plan for Data Science will provide a roadmap for improved healthcare data analytics, access, and sharing.

Source: Thinkstock

June 07, 2018 - The National Institutes of Health (NIH) has released the final draft of its Strategic Plan for Data Science, which seeks to enhance biomedical research by boosting healthcare data analytics capabilities, data access, and data sharing.

In order for researchers to facilitate medical breakthroughs and improve health outcomes, their data resources must be clean and accessible. However, as NIH noted, this is not an easy task.

“The generation of most biomedical data is highly distributed and is accomplished mainly by individual scientists or relatively small groups of researchers,” NIH wrote in the document.

“Moreover, data also exist in a wide variety of formats, which complicates the ability of researchers to find and use biomedical research data generated by others and creates the need for extensive data ‘cleaning.’”

The organization cited a 2016 survey that found data scientists spend about 80 percent of their work time collecting and organizing existing data. This leaves little time for them to mine data for patterns that could lead to new discoveries.

The Strategic Plan for Data Science maps a general path to improve the biomedical data ecosystem over the next five years. NIH released a draft of the plan for public comment in March 2018, and in this finalized version the organization details how it will maximize the value of research-generated data.

NIH stated that it plans to prioritize the development and distribution of health IT tools that will accelerate data management and analytics.

NIH will also help establish a more competitive marketplace for tool developers and providers, aiming to make new technology more available and less costly for the research community. 

In addition, the organization will establish programs that allow engineers to optimize and refine tools developed in academia, making them more efficient, cost-effective, and useful for biomedical research.

NIH will also work to develop and adopt health IT tools that will improve the collection and integration of data from disparate sources. These new tools have the potential to transform big data into actionable clinical information, allowing researchers to identify patient needs and predict poor outcomes in vulnerable populations.

NIH will aim to increase researchers’ access to data as well. The organization intends to improve data accessibility by utilizing large-scale cloud computing platforms, which have the potential to streamline NIH data use by allowing rapid and seamless access.

NIH will leverage partnerships with cloud-service providers to facilitate access to large, high-value NIH datasets, and will ensure that these cloud environments are stable and secure to protect against data compromise.

Additionally, the organization plans to make smaller datasets from individual laboratories more accessible. NIH will create an environment in which individual laboratories can use intuitive interfaces to link datasets to publications in the National Center for Biotechnology Information (NCBI) database.

Data sharing is also a high priority for NIH. According to the organization, more than 3,000 different groups and individuals submit data to NCBI systems each day. These data can include human genome sequences, chemical structures and properties, or clinical trial results.

NIH will work to build a framework to ensure that these datasets can exist together rather than in isolated data silos. NIH will connect new data resources to other systems upon implementation and, when appropriate, develop connections to non-NIH data resources.

NIH expects that expanded data sharing will benefit not only biomedical researchers, but also policymakers and the public.   

Ultimately, the organization anticipates that its Plan for Data Science will foster breakthroughs in research to improve health outcomes.

“Data science holds significant potential for accelerating the pace of biomedical research,” NIH concluded.

“To this end, NIH will continue to leverage its roles as an influential convener and major funding agency to encourage rapid, open sharing of data and greater harmonization of scientific efforts.”
