GPU- and FPGA-accelerated Bioinformatics Algorithms

Many researchers are starting to look into hardware acceleration to speed up their bioinformatics analyses. For example, see this SEQanswers thread, where davispeter asked about GPU-accelerated genome assembly programs.
Our next few commentaries will cover various hardware-accelerated approaches. We will start from the very basics before exploring more complex topics. Following our typical style, we will stay away from describing how to download, install and run specific programs, and instead focus on explaining how the technologies work and discussing programming approaches and algorithms. We will also try to explain where the main cost savings come from. Be aware that the topic of cost savings is directly related to ease of programming, but ‘ease of programming’ is often very subjective. My grandma finds it very difficult to program her microwave clock.
Please feel free to suggest any algorithm that interests you in the comment section. We are collecting a list of references and will append it to this post as soon as it is ready.
GPU Algorithms
1. Short Read Alignment
Yongchao Liu, Bertil Schmidt, and Douglas L. Maskell: “CUSHAW: a CUDA compatible short read aligner to large genomes based on the Burrows-Wheeler transform”. Bioinformatics, 2012, 28(14): 1830-1837.
Yongchao Liu and Bertil Schmidt: “Long read alignment based on maximal exact match seeds”. Bioinformatics, 2012, 28(18): i318-324.
2. Efficient Hash Tables on the GPU
3. Assembly
Yongchao Liu, Bertil Schmidt, and Douglas L. Maskell: “Parallelized short read assembly of large genomes using de Bruijn graphs”. BMC Bioinformatics, 2011, 12:354.
4. Error Correction
Yongchao Liu, Bertil Schmidt, and Douglas L. Maskell: “DecGPU: distributed error correction on massively parallel graphics processing units using CUDA and MPI”. BMC Bioinformatics, 2011, 12:85.
5. Phylogeny

8. Other Implemented Programs
FPGA Algorithms
1. Short Read Alignment

MPI, Multicore
2. Burrows-Wheeler