RNA sequencing is now widely performed to study differential
expression among experimental conditions. Because tests are performed
on a large number of genes, stringent false discovery rate control is
required, and it comes at the expense of detection power. Ad hoc
filtering techniques are regularly used to moderate this correction by
removing genes with low signal, but little attention has been paid to
their impact on downstream analyses.
Researchers at INRA, France, propose a data-driven method based on the
Jaccard similarity index to calculate a filtering threshold for
replicated RNA-seq data. In comparisons with alternative data filters
regularly used in practice, they demonstrate that the proposed method
correctly filters weakly expressed genes, which increases detection
power for moderately to highly expressed genes. Interestingly, this
data-driven threshold varies among experiments, which underscores the
value of estimating it from the data rather than fixing it in advance.
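
To make the idea concrete, here is a minimal Python sketch of a Jaccard-based threshold search. It is not the HTSFilter implementation, and the summary above does not spell out the exact procedure, so several things are assumptions: that the threshold is chosen to maximize the mean Jaccard index between replicates of the same condition, that a gene counts as "expressed" in a sample when its normalized count exceeds the threshold, and that a gene is kept when it exceeds the threshold in at least one sample. The function names (`jaccard`, `global_jaccard`, `choose_threshold`, `filter_genes`) and the toy counts are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |a & b| / |a | b| between two sets.

    Assumption: two empty sets score 0.0, so that a threshold above
    every count cannot win the search trivially.
    """
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def global_jaccard(counts, conditions, s):
    """Mean Jaccard index over all replicate pairs within a condition.

    counts: dict mapping gene name -> list of normalized counts (one per sample)
    conditions: list of condition labels, one per sample
    s: candidate threshold
    """
    n_samples = len(conditions)
    # For each sample, the set of genes whose count exceeds s.
    expressed = [
        {gene for gene, row in counts.items() if row[j] > s}
        for j in range(n_samples)
    ]
    # Compare only samples that are replicates of the same condition.
    scores = [
        jaccard(expressed[i], expressed[j])
        for i, j in combinations(range(n_samples), 2)
        if conditions[i] == conditions[j]
    ]
    if not scores:
        raise ValueError("at least one condition needs two or more replicates")
    return sum(scores) / len(scores)


def choose_threshold(counts, conditions, candidates):
    """Return the candidate threshold with the highest global Jaccard index."""
    return max(candidates, key=lambda s: global_jaccard(counts, conditions, s))


def filter_genes(counts, s):
    """Keep genes whose count exceeds s in at least one sample (assumed rule)."""
    return {gene: row for gene, row in counts.items() if max(row) > s}


# Toy data (made up for illustration): two conditions, two replicates each.
toy_counts = {
    "g1": [0, 2, 1, 0],
    "g2": [1, 0, 0, 3],
    "g3": [50, 60, 55, 70],
    "g4": [100, 90, 120, 110],
}
toy_conditions = ["A", "A", "B", "B"]

best = choose_threshold(toy_counts, toy_conditions, [0, 5, 200])
kept = filter_genes(toy_counts, best)
print(best, sorted(kept))
```

Because the threshold is found by searching over candidates on the data at hand, different experiments can land on different values, which matches the observation that the threshold varies among experiments.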
AVAILABILITY: The proposed filtering method is implemented in the R package HTSFilter, available at http://www.bioconductor.org/packages/release/bioc/html/HTSFilter.html
CONTACT: [email protected]
Rau A, Gallopin M, Celeux G, Jaffrézic F. (2013) Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics [Epub ahead of print].