NHLBI studies metadata
Overview
Metadata is data that describes other data. This page details the NHLBI studies metadata that is available for viewing and filtering NHLBI studies data in the Data Browser. NHLBI studies metadata on BioData Catalyst powered by Seven Bridges consists of properties that describe entities.
- Entities are particular resources such as files, samples, subjects, and studies.
- Properties can either describe an entity or relate that entity to another entity. For instance, properties include an entity's gender, data format, or biospecimen repository.
Entities for NHLBI studies
The following are entities for NHLBI studies. They represent clinical data, biospecimen data and data about NHLBI studies files. Learn more about NHLBI studies data.
- Study
- Subject
- Sample
- File
Read more about these entities and related properties below.
Study
The Study entity represents the NHLBI study.
Property | Description |
---|---|
Study name | TOPMed-assigned short study name (1:1 with phs number). |
Study design | Study design is a process wherein the trial methodology and statistical analysis are organized to ensure that the null hypothesis is either accepted or rejected and the conclusions arrived at reflect the truth. |
Biospecimen repository | A biorepository is a biological materials repository that collects, processes, stores, and distributes biospecimens to support future scientific investigation. |
Study disease | Disease that is being investigated. |
dbGap Accession | dbGaP study accession number. |
Subject
The Subject entity represents the person from whom the sample was taken and analyzed.
Property | Description |
---|---|
Consent | Consent group as determined by Data Access Committee (DAC). |
DataStage Subject ID | DataSTAGE subject identifier used across datasets and systems, in the form StudyIdentifier(with version)_submitted subject ID (e.g. phs001218.v1_GS86970684). |
DbGap Subject ID | The dbGaP Subject ID is a dbGaP-assigned accession for the submitted Subject ID. |
Study Accession | Each study or sub-study is assigned an ID with a “phs” prefix, a version suffix and a participant suffix (e.g. phs000946.v4.p1). |
Study Accession With Consent | Study accession with the consent group suffix appended (e.g. phs000946.v3.p1.c1). |
Study With Consent | Defines specific consent group (e.g. phs000946.c1). |
Sex | Self-reported sex or gender identity. |
Organism | A living thing, such as an animal, a plant, a bacterium, or a fungus. (NCI Thesaurus Code: C14250) |
Subject is affected | Case or control status of the subject. |
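The identifier formats in the table above can be split into their parts programmatically. The following is a minimal Python sketch, not part of the platform: the names `StudyAccession`, `parse_study_accession`, and `split_datastage_subject_id` are hypothetical, and the pattern assumes that every suffix (`.v`, `.p`, `.c`) is optional and numeric, as in the examples shown on this page.

```python
import re
from typing import NamedTuple, Optional, Tuple


class StudyAccession(NamedTuple):
    """Parts of a phs-style identifier (hypothetical helper type)."""
    study: str                       # e.g. "phs000946"
    version: Optional[int]           # number after ".v", if present
    participant_set: Optional[int]   # number after ".p", if present
    consent_group: Optional[int]     # number after ".c", if present


# Assumption: suffixes appear in the order .v, .p, .c and each is optional.
_ACCESSION_RE = re.compile(r"^(phs\d+)(?:\.v(\d+))?(?:\.p(\d+))?(?:\.c(\d+))?$")


def _to_int(value: Optional[str]) -> Optional[int]:
    return int(value) if value is not None else None


def parse_study_accession(text: str) -> StudyAccession:
    """Split an identifier such as 'phs000946.v3.p1.c1' into its parts."""
    match = _ACCESSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognizable phs identifier: {text!r}")
    study, version, participant, consent = match.groups()
    return StudyAccession(study, _to_int(version), _to_int(participant), _to_int(consent))


def split_datastage_subject_id(text: str) -> Tuple[str, str]:
    """Split 'phs001218.v1_GS86970684' at the first underscore into
    the versioned study identifier and the submitted subject ID."""
    study_with_version, separator, submitted_id = text.partition("_")
    if not separator or not submitted_id:
        raise ValueError(f"expected '<study>_<subject ID>': {text!r}")
    return study_with_version, submitted_id
```

For example, `parse_study_accession("phs000946.c1")` yields a study of `phs000946` with only the consent group set, which matches the Study With Consent form.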
Sample
The Sample entity represents an analyte or biological specimen sampled from a subject (e.g. DNA from blood).
Property | Description |
---|---|
BioSample ID | A biosample ID assigned by dbGaP. These are unique across all studies in dbGaP. |
Body Site | Body site where sample was collected. |
Analyte type | Analyte type (e.g. DNA, RNA). |
Histological type | Cell or tissue type or subtype (e.g. melanocytes, buccal cells, embryonic stem cells). |
Is tumor | Tumor status of the sample. |
DbGap Sample ID | The dbGaP Sample ID is a dbGaP-assigned accession for the submitted Sample ID. |
Sra Sample ID | SRA samples are given independent IDs at different stages of data processing, handling, and archiving for different purposes. Most of the SRA samples distributed through dbGaP have submitted_sample_id, sra_accession, sra_sample_id, and dbgap_sample_id. |
File
The File entity refers to the files in the TOPMed project produced by aliquot analysis. Find the properties of the File entity below.
Property | Description |
---|---|
Access level | A Boolean value indicating Controlled Data or Open Data. Controlled Data is data from public studies that has limitations on use and requires approval by dbGaP. Open Data is data from public studies that doesn't have limitations on its use. |
Assembly name | The reference assembly (such as HG19 or GRCh37) to which the nucleotide sequence of a case can be aligned. |
Platform | The version (for instance, manufacturer or model) of the technology that was used for sequencing or assaying. |
Molecular data type | Molecular data type (e.g. SNP/CNV Genotypes). |
Freeze | TOPMed WGS genotype call sets. |
Sequencing center | Name of the center which conducted sequencing. |
Assay type | DNA sequencing technique that was applied (e.g. WGS, WES). |
Instrument | Type of instrument that was being used for sequencing. |
Data format | Data format (e.g. CRAM, VCF). |
Library name | Name of the library. |
Library layout | Library layout of a project (SINGLE or PAIRED end reads). |
Library selection | The method used to select and/or enrich the material being sequenced. |
Library source | The type of source material that is being sequenced. |
Alignment provider | Information about who did the alignment. |
Release date | Date when the sequencing data was released to SRA. |
Consent | Consent group as determined by Data Access Committee (DAC). |
Coverage | Coverage refers to the number of times a genome is sequenced. The more times the genome is sequenced (i.e. the higher the coverage), the more accurate the data will be. |
Data type | Data type (e.g. Aligned Reads, Simple Germline Variation, Variant Call...) |
GUID | Unique file identifier. |
Study Accession | Each study or sub-study is assigned an ID with a “phs” prefix, a version suffix and a participant suffix (e.g. phs000946.v4.p1). |
Study Accession With Consent | Study accession with the consent group suffix appended (e.g. phs000946.v3.p1.c1). |
Study With Consent | Defines specific consent group (e.g. phs000946.c1). |
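To illustrate how these properties can narrow down a set of files, here is a minimal Python sketch that filters in-memory records by exact property values. It is not the Data Browser's implementation or API: the function `filter_files`, the record layout (a dictionary keyed by the property names from the table above), and the GUID values are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List

# Assumption: each file is a plain dictionary keyed by property name.
FileRecord = Dict[str, str]


def filter_files(records: Iterable[FileRecord],
                 criteria: Dict[str, str]) -> List[FileRecord]:
    """Keep only the records whose properties equal every value in criteria."""
    return [
        record
        for record in records
        if all(record.get(name) == value for name, value in criteria.items())
    ]


# Placeholder records; the GUIDs are made up for this example.
example_files: List[FileRecord] = [
    {"GUID": "file-1", "Data format": "CRAM", "Assay type": "WGS",
     "Study With Consent": "phs000946.c1"},
    {"GUID": "file-2", "Data format": "VCF", "Assay type": "WGS",
     "Study With Consent": "phs000946.c1"},
    {"GUID": "file-3", "Data format": "CRAM", "Assay type": "WES",
     "Study With Consent": "phs000946.c1"},
]

wgs_crams = filter_files(example_files, {"Data format": "CRAM", "Assay type": "WGS"})
```

Here `wgs_crams` contains only the record with the placeholder GUID `file-1`, because it is the only one that matches both criteria.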