Optimized procedure for ChIP-seq peak detection

April 19, 2011

The peakROTS R package implements the ROTS procedure for optimized ChIP-seq peak detection. This is a companion site for the article that describes the procedure.

The procedure optimizes binding site detection for a given ChIP-seq experiment. It can improve detection accuracy beyond that obtained with the default settings of popular peak calling software, or inform the user when the peak calling results are compromised. The implementation is a stand-alone, open-source R package. To facilitate in-depth searches through large parameter spaces, we have modularized the implementation so that it can be distributed efficiently across multiple computing cores, allowing large computational resources to be used effectively. The infrastructure needed for distributed computing is included in the R package.

Download

R package: peakROTS_1.0.1.tar.gz (older versions)

R documentation: peakROTS_1.0.1.pdf (older versions)

The R package is a platform-independent source package that should work on all Unix flavors (Linux, Mac OS X, Solaris). It has been tested on Linux with R versions 2.5.1 and later.

The peakROTS R package is available under the terms of the GNU General Public License version 3 or later. This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY. If you use the package in your work, we would appreciate a citation to the publication. Source code is available at Google Code SVN.

To run MACS with the peakROTS package, you need to download and install it from: http://liulab.dfci.harvard.edu/MACS

To run PeakSeq with the peakROTS package, you need to download and compile the improved PeakSeq package. Mappability files for different genomes (.txt) are available at: http://archive.gersteinlab.org/proj/PeakSeq/.

Installation

The peakROTS package is a standard R package and can be installed using the normal R package installation procedures. For more information, please consult the R manual on package installation.

In most environments the package can be installed from the command line with:

R CMD INSTALL peakROTS_1.0.1.tar.gz

If you do not have access rights to install the package into the main R library directory, you can use the user-specific package directory instead. The default directory can be checked from the R command line with Sys.getenv("R_LIBS_USER"). Setting the environment variable R_LIBS_USER before installation allows you to install to a custom location. Please note that if the tool is run in a batch processing system, all compute nodes must see the package directory and be able to load the package with the R command library(peakROTS).
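As a minimal sketch of a user-specific installation, the shell commands below set R_LIBS_USER, create the directory, and define two helper functions. The directory $HOME/R/library and the function names install_peakrots and check_peakrots are assumptions made for illustration and are not part of peakROTS; the tarball name is the one listed under Download. The R commands are wrapped in functions so that nothing is installed until you call them.

```shell
# Hypothetical per-user library location (an assumption, not a peakROTS default).
# On a batch system, choose a directory that every compute node can see.
export R_LIBS_USER="${R_LIBS_USER:-$HOME/R/library}"

# R only adds this directory to its library path if it already exists.
mkdir -p "$R_LIBS_USER"

# Install the downloaded source package into the user library.
# The optional first argument is the path to the tarball.
install_peakrots() {
    R CMD INSTALL --library="$R_LIBS_USER" "${1:-peakROTS_1.0.1.tar.gz}"
}

# Confirm that a fresh R session can load the package.
# Run this on a compute node as well to verify that the library is visible there.
check_peakrots() {
    Rscript -e 'library(peakROTS)'
}
```

After running the commands above in the directory that contains the downloaded tarball, call install_peakrots and then check_peakrots. To make the setting permanent, the export line can be added to your shell startup file.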

Tutorial

Walk through the tutorial to get started with using peakROTS.

Visualizations

Example visualizations demonstrate some of the peak differences between the ROTS and default parameter settings in the STAT1 dataset. They were produced with Chipster, which also allows real-time navigation and exploration of the data. Chipster is freely available; ready-made data and instructions for browsing the examples yourself are given below.

 

Example peak visualization produced with Chipster Viewer

A peak in C21orf2 detected by ROTS that would have been completely missed using the default parameters with MACS.

 

Example peak visualization produced with Chipster Viewer

A peak in MIB2 detected by ROTS that would have been completely missed using the default parameters with PeakSeq.

 

Example peak visualization produced with Chipster Viewer

The ROTS parameters typically produced markedly narrower peaks than the MACS default parameters.

To browse the data yourself, please follow these steps:

  1. Download the data as a session file
  2. Open the Java Web Start launch link to start the viewer
  3. In Chipster, click "Open session" and select the downloaded session file
  4. Click "Select all and open genome browser"
  5. Alternatively, select only the BAM and BAI files plus the particular BED files you are interested in
  6. Select the first genome (Homo Sapiens)
  7. Accept the download of annotations (highly recommended)
  8. Click Go
  9. Browse around: navigate with the mouse (drag to move, wheel to zoom) or the keyboard (arrow keys move and zoom)

Citation

Laura L Elo, Aleksi Kallio, Teemu D Laajala, R David Hawkins, Eija Korpelainen & Tero Aittokallio (2011) Optimized detection of transcription factor binding sites in ChIP-seq experiments, Nucleic Acids Research, to appear.

Version history

1.0.0: first public release (January 07, 2011)

1.0.1: improved PeakSeq support (April 19, 2011)

Contact

The main contact is Laura Elo (copy to Tero Aittokallio). For technical questions about the R package, this web site, or the peak visualizations, please contact Aleksi Kallio (copy to Eija Korpelainen).