NHLBI TOPMed - NHGRI CCDG: The BioMe Biobank at Mount Sinai

Description

The description below was taken directly from the NCBI database of Genotypes and Phenotypes (dbGaP):

This study is part of the NHLBI Trans-Omics for Precision Medicine (TOPMed) Whole Genome Sequencing Program. TOPMed is part of a broader Precision Medicine Initiative, which aims to provide disease treatments that are tailored to an individual's unique genes and environment. TOPMed will contribute to this initiative through the integration of whole-genome sequencing (WGS) and other -omics (e.g., metabolic profiles, protein and RNA expression patterns) data with molecular, behavioral, imaging, environmental, and clinical data. In doing so, this program aims to uncover factors that increase or decrease the risk of disease, to identify subtypes of disease, and to develop more targeted and personalized treatments. Two genotype call sets derived from WGS are now available, Freeze 5b (GRCh38) and Freeze 8 (GRCh38), with largely overlapping sample sets. Information about how to identify other TOPMed WGS accessions for cross-study analysis, as well as descriptions of TOPMed methods of data acquisition, data processing and quality control, are provided in the accompanying documents, "TOPMed Whole Genome Sequencing Project - Freeze 5b, Phases 1 and 2" and "TOPMed Whole Genome Sequencing Project - Freeze 8, Phases 1-4". Please check the study list at the top of each of these methods documents to determine whether it applies to this study accession.

The IPM BioMe Biobank, founded in September 2007, is an ongoing, broadly-consented electronic health record (EHR)-linked clinical care biobank that enrolls participants non-selectively from the Mount Sinai Medical Center patient population. BioMe currently comprises >42,000 participants from diverse ancestries, characterized by a broad spectrum of longitudinal biomedical traits. Participants enroll through an opt-in process and consent to be followed throughout their clinical care (past, present, and future) in real-time, allowing us to integrate their genomic information with their EHRs for discovery research and clinical care implementation. BioMe participants consent for recall, based on their genotype and/or phenotype, permitting in-depth follow-up and functional studies for selected participants at any time. Phenotypic and genomic data are stored in a secure database and made available to investigators, contingent on approval by the BioMe Governing Board. BioMe uses a "data-broker" system to protect confidentiality.

Ancestral diversity - BioMe participants represent broad racial, ethnic and socioeconomic diversity with a distinct and population-specific disease burden. Specifically, BioMe participants are predominantly of African (AA, 24%), Hispanic/Latino (HL, 35%), European (EA, 32%), and other/mixed ancestry (OA, 10%) (Table 1, Figure 1). Participants who self-identify as Hispanic/Latino further report being of Puerto Rican (39%), Dominican (23%), Central/South American (17%), Mexican (5%) or other Hispanic (16%) ancestry. More than 40% of European ancestry participants are genetically determined to be of Ashkenazi Jewish ancestry. With this broad ancestral diversity, BioMe is uniquely positioned to examine the impact of demographic and evolutionary forces that have shaped common disease risk.

Phenotypes available in BioMe - BioMe offers a high-quality, validated set of fully implemented clinical phenotype data, culled by a multi-disciplinary team of experienced investigators, clinicians, information technologists, data managers, and programmers who apply advanced medical informatics and data mining tools to extract and harmonize EHRs. As a cohort, BioMe offers great versatility for designing nested case-control sample sets, particularly for studying longitudinal traits and co-morbidity in disease burden.

** Biomedical and clinical outcomes: The BioMe Biobank is linked to Mount Sinai's system-wide Epic EHR, which captures a full spectrum of biomedical phenotypes, including clinical outcomes, covariate and exposure data from past, present and future health care encounters. As such, the BioMe Biobank has a longitudinal design as participants consent to make all of their EHR data from past (dating back as far as 2003), present and future inpatient or outpatient encounters available for research, without restriction. The median number of outpatient encounters is 21 per participant, reflecting predominant enrollment of participants with common chronic conditions from primary care facilities.

** Environmental data: The clinical and EHR information is complemented by detailed demographic and lifestyle information, including ancestry, residence history, country of origin, personal and familial medical history, education, socio-economic status, physical activity, smoking, dietary habits, alcohol intake, and body weight history, which is collected in a systematic manner by interview-based questionnaire at time of enrollment.

The IPM BioMe Biobank contributed ~10,600 DNA samples for whole genome sequencing to the TOPMed program. Samples were selected for the Coronary Artery Disease (CAD) and the Chronic Obstructive Pulmonary Disease (COPD) working groups. Using a Case-Definition-Algorithm (CDA), we identified ~4,100 individuals with CAD (~50% women) and ~3,000 individuals as controls (65% women). In addition, we identified ~800 individuals with COPD (62% women) and 1,800 as controls (72% women). Another 600 BioMe participants with Atrial Fibrillation, all of African ancestry, were included.

General information

phs# | Study abbreviation | Study type | Parent phs#
phs001644 | BioMe | Cohort | phs000925