TOPMed Data overview

Overview

The Data Overview page for TOPMed datasets (studies) provides genomic summary information for variants represented in the Freeze 8 VCF files, on the GRCh38 reference genome.

The page lists all TOPMed studies included in Freeze 8, but you can query variants only in studies that have open-access genomic summary information.

This feature lets you learn about the available datasets and judge whether they will be valuable for your research before you apply for access through dbGaP.

Searching and filtering

Use the search/filter menu to specify variants and genotypes of interest by genomic position or region, optionally narrowed by additional variant properties.

The results show how many samples (and what percentage) contain variants and genotypes that match the specified criteria. Results are listed by dbGaP accession code; each accession groups participants in the same study who share the same data use limitations.

The following search terms and filters are available:

  • Chromosome and position, in the format "chr1:55052210"
  • Chromosome, position, ref and alt alleles, in the format "chr1:55052210:C>G"
  • Chromosome and region, in the format "chr14:72942975-72942985"
  • Filter status (region searches only): include only variants with filter status PASS, or include all variants
  • Genotype: which genotype counts as a hit in a sample (Homozygous alt, Heterozygous, or both)
  • Variant effect (region searches only): include only variants with a specific variant effect, as annotated by SnpEff
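
The three search term formats listed above can be checked before you paste them into the search box. The following is a minimal Python sketch, not the portal's own parser: the names VariantQuery and parse_term are hypothetical, and the accepted chromosome names ("chr" followed by letters or digits) and allele alphabet (A, C, G, T) are assumptions.

```python
import re
from typing import NamedTuple, Optional


class VariantQuery(NamedTuple):
    """One parsed search term (hypothetical structure, for illustration only)."""
    chrom: str
    start: int
    end: int
    ref: Optional[str] = None
    alt: Optional[str] = None


# Assumption: chromosome names look like "chr" + letters/digits, alleles use A/C/G/T.
_TERM = re.compile(
    r"^(chr[0-9A-Za-z]+)"                  # chromosome
    r":(\d+)"                              # position, or start of a region
    r"(?:-(\d+)|:([ACGT]+)>([ACGT]+))?$"   # optional region end, or ref>alt
)


def parse_term(term: str) -> VariantQuery:
    """Parse "chr1:55052210", "chr1:55052210:C>G" or "chr14:72942975-72942985"."""
    match = _TERM.match(term.strip())
    if match is None:
        raise ValueError(f"unrecognized search term: {term!r}")
    chrom, start, end, ref, alt = match.groups()
    start_pos = int(start)
    # A single position is treated as a region of length one.
    end_pos = int(end) if end else start_pos
    if end_pos < start_pos:
        raise ValueError(f"region end precedes start: {term!r}")
    return VariantQuery(chrom, start_pos, end_pos, ref, alt)


if __name__ == "__main__":
    for example in ("chr1:55052210", "chr1:55052210:C>G", "chr14:72942975-72942985"):
        print(parse_term(example))
```

Treating a single position as a one-base region keeps the three formats in one structure; the filter status, genotype, and variant effect options are set in the page controls and are not part of the term itself.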

You can combine multiple search criteria in simple expressions connected with AND/OR operators. For example, the search term "chr1:55052210:C>G AND chr14:72942975-72942985" counts samples that contain both the variant on chr1 and any variant in the region on chr14.
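
One way to picture how AND/OR expressions combine is as set operations on sample IDs: AND keeps samples that match both terms, OR keeps samples that match either. The Python sketch below illustrates that idea only. The function name, the sample IDs, and the left-to-right evaluation order are assumptions; the page does not document operator precedence or how the counts are computed internally.

```python
import re
from typing import Dict, Set


def count_matching_samples(expression: str, samples_by_term: Dict[str, Set[str]]) -> int:
    """Count samples matching terms joined by AND/OR.

    Assumption: operators are applied left to right, with no parentheses.
    """
    # Splitting on a captured group keeps the operators in the token list:
    # ["term1", "AND", "term2", "OR", "term3", ...]
    tokens = re.split(r"\s+(AND|OR)\s+", expression.strip())
    result = set(samples_by_term[tokens[0]])
    for operator, term in zip(tokens[1::2], tokens[2::2]):
        hits = samples_by_term[term]
        result = result & hits if operator == "AND" else result | hits
    return len(result)


# Hypothetical sample IDs carrying a matching variant for each term.
example_hits = {
    "chr1:55052210:C>G": {"S1", "S2", "S3"},
    "chr14:72942975-72942985": {"S2", "S3", "S4"},
}

if __name__ == "__main__":
    print(count_matching_samples(
        "chr1:55052210:C>G AND chr14:72942975-72942985", example_hits))
    print(count_matching_samples(
        "chr1:55052210:C>G OR chr14:72942975-72942985", example_hits))
```

With these made-up sets, the AND expression counts the two samples present in both sets, and the OR expression counts all four.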

Note that there are some restrictions in place, in compliance with the NHLBI Genomic Summary Results policies:

  • Summary results cannot be shown for a subset of studies. Query results for these studies always show Restricted in place of the number of samples.
  • Summary results cannot be shown when the resulting number is less than 10. If the query result for a specific row is less than 10, the page shows Less than 10 instead of the actual result.
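
The two display rules above can be summarized in a few lines. This is a hedged Python sketch of the behavior as described on this page, not the portal's code: the function name display_count and its parameters are hypothetical.

```python
def display_count(count: int, restricted: bool = False, threshold: int = 10) -> str:
    """Return the text shown for one result row under the two rules above."""
    if restricted:
        # Studies whose summary results cannot be shown at all.
        return "Restricted"
    if count < threshold:
        # Small counts are masked rather than reported exactly.
        return f"Less than {threshold}"
    return str(count)


if __name__ == "__main__":
    print(display_count(3))
    print(display_count(10))
    print(display_count(500, restricted=True))
```

Checking the restricted flag first means a restricted study never reveals whether its count was above or below the threshold.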

Procedure

To access the TOPMed Data Overview page and run a search:

  1. Click Data in the main menu.
  2. Choose the Data overview option. The Data Overview page opens.
  3. Accept the Terms of Use.
  4. Enter a variant or a region into the search box, e.g. "chr1:55052210:C>G" or "chr14:72942975-72942985".
  5. (Optional) Use the toggle to only search for variants with PASS filter status.
  6. (Optional) Choose options for Genotype and Variant effect filters.
  7. Click Search. The results are displayed below.
  8. Select the desired results and click Explore in Data Browser to continue your analysis (see Data Browser features).