Erin McNemar

How de-identified data can advance medical research and improve patient care.

De-identified data has become an important tool in medical research and for providers looking to enhance patient care. While data sharing between different organizations could violate the Health Insurance Portability and Accountability Act of 1996 (HIPAA), the de-identification process makes sharing information HIPAA-compliant.

De-identified data sharing can then assist medical researchers in advancing tools and treatments. Additionally, it allows for collaborative efforts among large providers. Overall, de-identification plays a critical role in improving the patient experience.


The process of de-identification removes all direct identifiers from patient data and allows organizations to share it without the potential of violating HIPAA.

Direct identifiers can include a patient’s name, address, medical record information, etc. While direct identifiers are removed from the data to keep a patient’s identity confidential, indirect identifiers can remain untouched to allow researchers to study data trends. Indirect identifiers include gender, race, age, etc.
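As a rough illustration of the distinction above, the following minimal Python sketch drops direct identifiers from a patient record while leaving indirect identifiers in place. The field names, the sample record, and the `deidentify` helper are all hypothetical and chosen only for this example; a real de-identification process would cover a much longer list of identifiers and would follow an organization's own HIPAA procedures.

```python
# Hypothetical field names for illustration only; a real record layout
# and a real list of direct identifiers would be longer and different.
DIRECT_IDENTIFIERS = {"name", "address", "medical_record_number"}


def deidentify(record: dict) -> dict:
    """Return a copy of the record with direct identifiers removed.

    Indirect identifiers (for example gender, race, age) are left
    untouched so that trends in the data can still be studied.
    """
    return {
        field: value
        for field, value in record.items()
        if field not in DIRECT_IDENTIFIERS
    }


# Made-up sample record, not real patient data.
patient = {
    "name": "Jane Doe",
    "address": "123 Main St",
    "medical_record_number": "MRN-0001",
    "gender": "F",
    "race": "Asian",
    "age": 54,
    "diagnosis": "type 2 diabetes",
}

shared = deidentify(patient)
print(sorted(shared))
```

The helper returns a new dictionary rather than modifying the original, so the source record stays intact for the organization that holds it while only the reduced copy is shared.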

According to the Department of Health & Human Services, “The process of de-identification, by which identifiers are removed from the health information, mitigates privacy risks to individuals and thereby supports the secondary use of data for comparative effectiveness studies, policy assessment, life sciences research, and other endeavors.”

Data de-identification is crucial to advancing medical research and treatment while also protecting patient privacy.


De-identified data can be used in medical research and treatment. Once identifying information is removed, the data can provide useful information for advancing healthcare.

In a recent study, researchers used de-identified data to develop an artificial intelligence tool to predict 30-day mortality risks in patients with cancer. Cancer is one of the leading causes of death in the United States each year. With the artificial intelligence tool, medical professionals can discover patients who are at high risk and provide early intervention and resolutions for reversible complications.

Additionally, the tool can identify patients who are approaching end of life (EoL) and refer them to early palliative and hospice care. In this case, the use of de-identified data assists with artificial intelligence and can provide an improved quality of life and symptom management for the patient.

“In contrast, aggressive, life-sustaining EoL care can conflict with patient preference and result in lower quality of life, family perceptions of poorer quality of care, and greater regret about treatment decisions. Earlier referral also represents an opportunity to transform cancer care by reducing the potential for unnecessary, toxic and expensive treatments at EoL,” the study authors wrote.

De-identified data can also be used in developing predictive analytics tools. To address healthcare gaps created by the COVID-19 pandemic, UnitedHealthcare developed a predictive analytics tool that used de-identified data to address social determinants of health.

“Around 80 percent of your health is determined by things that are not your genetics. There are things more such as what’s going on in the rest of your life, what we call social determinants of health — social, economic, gender orientation, and other markers that sometimes can lead to inequality,” Rebecca Madsen, chief consumer officer, UnitedHealthcare said.

To eliminate care gaps, UnitedHealthcare created an advocacy system to assist members who might be struggling due to their social environment. Through predictive analytics and a machine learning model, the advocacy system can evaluate de-identified data from members and determine the need for social services.

Data is then loaded into an agent dashboard used by UnitedHealthcare advocates. When a member calls in, advocates can connect the caller to community resources at low or no cost.

De-identified data allows medical professionals to both develop tools to better serve patients and advance research to produce improved outcomes.


Data sharing allows those in the healthcare field to create better tools and treatments to improve patient care and outcomes. However, according to the Centers for Disease Control & Prevention (CDC), HIPAA law states that patient information must be protected and cannot be shared with other entities without the patient’s knowledge and consent.

By de-identifying data, providers can share information with other organizations to advance medical research and treatment. Additionally, de-identifying the data removes some liability regarding HIPAA violations.

Furthermore, the use of de-identified data allows for the collaboration of large data analytic platforms. Earlier this year, fourteen leading healthcare providers partnered to form Truveta, a new company that used big data analytics to enhance care insights.

The providers included AdventHealth, Advocate Aurora Health, Baptist Health of Northeast Florida, Bon Secours Mercy Health, CommonSpirit Health, Hawaii Pacific Health, Henry Ford Health System, Memorial Hermann Health System, Northwell Health, Novant Health, Providence health system, Sentara Healthcare, Tenet Health, and Trinity Health.

By combining data on the healthcare providers’ tens of millions of patients from thousands of care facilities across 40 states, Truveta created a large de-identified dataset for its analytic effort.

With de-identified data, providers can share patient data to assist in medical advances while also maintaining patient privacy and complying with HIPAA.