Thursday, September 27, 2012

Downloading Annotation Files for methylKit

The DNA methylation analysis package methylKit can annotate differentially methylated bases or regions, as well as bases or regions covered by reads. The package accepts annotation files in BED format. An annotation file can contain the locations of genes, CpG islands, or any other genomic feature of interest. methylKit reads the provided annotation file and produces a GRanges object (from the GenomicRanges package) that subsequent functions use to annotate regions or bases of interest.
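To make the BED-to-ranges step concrete, here is a minimal Python sketch of what reading a BED file into a list of intervals involves. This is not methylKit's own code (methylKit is an R package); the names `Interval` and `parse_bed` are made up for illustration. It assumes tab-separated BED lines and converts the 0-based BED start to a 1-based start, which is the convention GRanges uses.

```python
from typing import Iterable, List, NamedTuple


class Interval(NamedTuple):
    """Hypothetical record type; coordinates are 1-based and inclusive."""
    chrom: str
    start: int
    end: int
    name: str
    strand: str


def parse_bed(lines: Iterable[str]) -> List[Interval]:
    """Parse tab-separated BED lines into 1-based inclusive intervals."""
    intervals = []
    for line in lines:
        line = line.rstrip("\r\n")
        # Skip blank lines and the optional UCSC header/comment lines.
        if not line.strip() or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            raise ValueError(f"BED line needs at least 3 columns: {line!r}")
        chrom = fields[0]
        start0 = int(fields[1])  # BED start is 0-based
        end = int(fields[2])     # BED end is exclusive, so it equals the 1-based inclusive end
        name = fields[3] if len(fields) > 3 else "."
        strand = fields[5] if len(fields) > 5 else "*"
        intervals.append(Interval(chrom, start0 + 1, end, name, strand))
    return intervals
```

For example, the BED line `chr1<TAB>100<TAB>200<TAB>geneA<TAB>0<TAB>+` becomes an interval on chr1 from 101 to 200 on the plus strand.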

You can download annotation files for your genome of interest from the UCSC Table Browser. Go to the UCSC Genome Browser website. In the top menu, click "Tools", then "Table Browser". Select your genome and assembly from the "genome" and "assembly" drop-down menus. Make sure both match the genome and assembly used for your data: annotation from the wrong genome or assembly will not line up with your methylation calls, and the downstream results will be meaningless.
From here on, you can download either gene annotation or CpG island annotation.
  1. For gene annotation, select "Genes and Gene Prediction Tracks" from the "group" drop-down menu, then select "RefSeq Genes" from the "track" drop-down menu. Select "BED - browser extensible data" as the "output format". Click "get output", and on the following page click "get BED" without changing any options. Save the output as a text file.
  2. For CpG island annotation, select "Regulation" from the "group" drop-down menu, then select "CpG Islands" from the "track" drop-down menu. Select "BED - browser extensible data" as the "output format". Click "get output", and on the following page click "get BED" without changing any options. Save the output as a text file.
In addition, this tutorial explains how to download any track from UCSC in BED format: http://www.openhelix.com/cgi/tutorialInfo.cgi?id=28
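Before handing a saved file to methylKit, it can be worth a quick sanity check that the download is what you expect. The Python sketch below is one way to do that; the function name `summarize_bed` and the file name in the usage note are placeholders, not part of methylKit or UCSC. It assumes the file is tab-separated and simply counts records, columns per line, and chromosome names.

```python
from collections import Counter


def summarize_bed(path):
    """Count records, columns per line, and chromosome names in a BED file."""
    column_counts = Counter()
    chromosomes = Counter()
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            # Ignore blank lines and optional UCSC header/comment lines.
            if not line.strip() or line.startswith(("track", "browser", "#")):
                continue
            fields = line.split("\t")
            column_counts[len(fields)] += 1
            chromosomes[fields[0]] += 1
    return {
        "records": sum(column_counts.values()),
        "column_counts": dict(column_counts),
        "chromosomes": dict(chromosomes),
    }
```

Calling `summarize_bed("refseq.genes.bed")` (a placeholder file name) on the gene annotation should, as far as I know, report 12 columns on every line, since "get BED" with default options returns whole-gene records with exon blocks; the CpG island file has fewer columns. More than one column count in the same file, or chromosome names that do not match the ones in your methylation data, usually means the wrong options or the wrong assembly were selected.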
