Data Scientists Uncover Molecular Processes Linking Acute to Long-Term Stages of COVID-19

A study by Mount Sinai's Division of Data-Driven and Digital Medicine provides the first evidence that molecular signatures associated with “long COVID” are detectable during the acute stage of SARS-CoV-2 infection.


A Mount Sinai study provides the first evidence that molecular signatures associated with “long COVID” are detectable during the acute stage of SARS-CoV-2 infection. The finding underscores the enormous potential of computation and data science to statistically model and ultimately resolve some of the biggest clinical challenges of our time.

At the center of that effort is the Department of Medicine’s two-year-old Division of Data-Driven and Digital Medicine (D3M), which led the long-COVID investigation and views it as emblematic of the expanding effort at Mount Sinai to integrate data science and digital tools into translational research and clinical care.

“Data science is enabling us to understand disease on a much deeper molecular scale than ever before, and it’s clear that knowledge will play a transformative role in how we conduct research and practice medicine,” says Girish Nadkarni, MD, MPH, Irene and Dr. Arthur M. Fishberg Professor of Medicine at the Icahn School of Medicine at Mount Sinai and Chief of the new Division. “Our team is already making a difference by integrating its work into the clinical side of the Mount Sinai Health System and acting as a powerful data resource for health care professionals who often find access to that information challenging.”

Indeed, the data-driven computational study of post-acute sequelae of SARS-CoV-2 infection (commonly referred to as long COVID), published in Nature Medicine, was a collaborative effort with The Charles Bronfman Institute for Personalized Medicine at Icahn Mount Sinai and the clinical data science team.

“We believe this study exemplifies the type of meaningful partnerships that will evolve between D3M as a data science and precision medicine hub, and clinicians across Mount Sinai who manage the patient populations,” notes senior author Noam Beckmann, PhD, Assistant Professor of Medicine (Data-Driven and Digital Medicine). “Together, we can create rich, patient-centric data sets that allow us to investigate the big medical problems of our time.”

Tapping into the vast resources of the Mount Sinai COVID-19 Biobank, researchers examined gene expression data in blood samples from more than 500 patients hospitalized with COVID-19 between April and June 2020. More than 160 provided self-reported assessments of a broad range of symptoms still present six months or more after hospitalization, including fatigue, dyspnea, sleep disruptions, and smell and taste problems. The team tested each gene expressed in the blood for association with each long COVID symptom, and then for associations specific to each of 13 different types of immune cells, including plasma cells. Finally, these associations were categorized by whether they matched up with changes in patients’ levels of antibodies specific to the virus.
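For readers curious what testing "each gene for association with each long COVID symptom" can look like in practice, here is a minimal Python sketch. It is not the study's actual pipeline, which the article does not describe in detail. The sketch assumes a simple permutation test on the difference in mean expression, followed by a Benjamini-Hochberg correction across all gene-symptom pairs. The gene names, symptom labels, and expression values are made up for illustration.

```python
import random
from statistics import mean


def permutation_p_value(expression, has_symptom, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in mean expression
    between patients with and without a symptom (illustrative choice of test)."""
    rng = random.Random(seed)

    def mean_difference(labels):
        with_symptom = [x for x, flag in zip(expression, labels) if flag]
        without_symptom = [x for x, flag in zip(expression, labels) if not flag]
        return mean(with_symptom) - mean(without_symptom)

    observed = abs(mean_difference(has_symptom))
    shuffled = list(has_symptom)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real link between gene and symptom
        if abs(mean_difference(shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def associate_all_pairs(expression_by_gene, symptoms):
    """Test every gene against every symptom, then correct across all tests."""
    pairs = [(gene, symptom) for gene in expression_by_gene for symptom in symptoms]
    raw = [permutation_p_value(expression_by_gene[g], symptoms[s]) for g, s in pairs]
    return dict(zip(pairs, benjamini_hochberg(raw)))


# Synthetic example: 20 hypothetical patients, the first 10 report dyspnea.
data_rng = random.Random(42)
has_dyspnea = [i < 10 for i in range(20)]
expression_by_gene = {
    # GENE_A is simulated as lower in patients with the symptom.
    "GENE_A": [data_rng.gauss(3.0 if flag else 5.0, 0.5) for flag in has_dyspnea],
    # GENE_B is simulated with no relationship to the symptom.
    "GENE_B": [data_rng.gauss(4.0, 0.5) for _ in has_dyspnea],
}
results = associate_all_pairs(expression_by_gene, {"dyspnea": has_dyspnea})
for (gene, symptom), adjusted_p in results.items():
    print(f"{gene} vs {symptom}: adjusted p = {adjusted_p:.4f}")
```

A real transcriptome-wide analysis would test thousands of genes and would typically adjust for covariates such as age, sex, and disease severity. That is why a correction for multiple testing, like the one sketched here, matters.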

Among the team’s breakthrough findings was the presence of two molecularly distinct subsets of long COVID symptoms with opposing gene expression patterns, often observed in the same plasma cells (the immune system’s antibody-producing cells). In patients who progressed to pulmonary problems, for example, transcripts of antibody-production genes were less abundant, while in patients with non-respiratory issues, such as loss of smell and taste or sleep problems, transcripts of the same antibody-production genes were more abundant.

“These opposing patterns point to the existence of multiple independent molecular processes leading to different long COVID phenotypes,” explains Dr. Beckmann, whose omics skills include genomics, proteomics, and transcriptomics. “Just as importantly, our data revealed that these processes are already present during the acute infection stage of COVID-19. This finding has tremendous implications for the design of research studies as well as for the development of potential biomarkers, prevention strategies, and treatment options for individuals who develop long COVID.”

Featured Faculty and Division Leadership

Noam Beckmann, PhD

Assistant Professor of Medicine (Data-Driven and Digital Medicine)

Girish Nadkarni, MD, MPH

Irene and Dr. Arthur M. Fishberg Professor of Medicine; Chief, Division of Data-Driven and Digital Medicine