BGI has developed a series of bioinformatics analysis tools for various applications. SOAP (Short Oligonucleotide Alignment Program) has evolved from a single alignment tool into a package that provides a full solution for next-generation sequencing data analysis, and has been adopted by more than 10,000 users. BGI also applies a variety of open source software, for example ABySS and Velvet, to provide comprehensive bioinformatics analysis for our sequencing services.

The following is a list of software developed by BGI:

  • SOAPdenovo – SOAPdenovo is a short-read de novo assembly package that assembles short oligonucleotides into contigs and scaffolds. SOAP family software can be found at http://soap.genomics.org.cn/.
  • RePS (repeat-masked Phrap with scaffolding) – RePS is a WGS sequence assembler. It identifies repeated k-mer sequences in the WGS data and removes them prior to assembly. The established software Phrap is used to compute meaningful error probabilities for each base. Clone-end-pairing information is used to construct scaffolds that order and orient the contigs. The updated version of RePS incorporates some of the ideas on clustering introduced by Phusion.
  • Exon_Capture_Pipeline – Whole-genome exon trapping analysis software.
  • Maq (Mapping and Assembly with Quality) – Maq builds assemblies by mapping short reads to reference sequences. Maq was previously known as mapass2.
  • ReAS – Software to recover ancestral sequences for transposable elements using unassembled reads from whole-genome shotgun sequencing.
  • SOAPaligner/soap2 – SOAPaligner/soap2 is a program for faster and more efficient alignment of short oligonucleotides onto reference sequences. SOAPaligner/soap2 is compatible with numerous applications, including single-read or paired-end resequencing.
  • SOAPsnp – SOAPsnp is an accurate consensus sequence builder based on the alignment output of Soap1 and SOAPaligner/soap2. It calculates a quality score for each consensus base, which any later process can use to call SNPs.
  • SOAPindel – SOAPindel finds insertions and deletions and is designed specifically for resequencing data.
  • SOAPsv – SOAPsv is a program for detecting structural variation.
  • SOAP3/GPU – SOAP3 is GPU-based software for aligning short reads to a reference sequence. It can find all alignments with k mismatches, where k is chosen from 0 to 3. Compared with its predecessor, SOAP2, SOAP3 can be up to tens of times faster.
  • MIREAP – This is used to identify both known and novel microRNAs from small RNA libraries that were deeply sequenced using Illumina-Solexa/454/SOLiD technology.
  • FGF (Fishing Gene Family, http://fgf.genomics.org.cn/) – This finds gene families, plots phylogenetic trees, and provides evolutionary information on gene duplication.
  • SVBP – This provides reliability tests and results visualization for sequence assembly.
  • WEGO (Web Gene Ontology Annotation Plot, http://wego.genomics.org.cn/cgi-bin/wego/index.pl) – A tool for plotting GO annotation results, especially for comparative genomics.
  • HIBAIS – Ancestor deduction software based on HapMap.
  • SOLEXA-MRNATAG_PIPELINE – Digital gene expression software based on Illumina-Solexa sequencing data.
  • CAT (Cross-species Alignment Tool) – Aligns mRNA sequences to mammalian genomes across species.
  • KaKs_Calculator – This calculates nonsynonymous (Ka) and synonymous (Ks) substitution rates. More information is available at http://evolution.genomics.org.cn/software.htm.
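
Two of the ideas mentioned in the list above can be illustrated with a short sketch: identifying repeated k-mers in a set of reads (as described for RePS) and finding every placement of a read on a reference with at most k mismatches (as described for SOAP3). The code below is a brute-force illustration only and is not how either tool is implemented. The function names, parameters, and the count threshold are assumptions introduced for this example.

```python
from collections import Counter


def repeated_kmers(reads, k, min_count):
    """Return the k-mers seen at least min_count times across all reads.

    min_count is a hypothetical threshold chosen for illustration.
    """
    counts = Counter(
        read[i:i + k]
        for read in reads
        for i in range(len(read) - k + 1)
    )
    return {kmer for kmer, n in counts.items() if n >= min_count}


def find_alignments(read, reference, max_mismatches):
    """Return (position, mismatches) for every ungapped placement of
    read on reference with at most max_mismatches mismatching bases.

    Brute-force scan over all positions; real aligners use an index.
    """
    hits = []
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= max_mismatches:
            hits.append((pos, mismatches))
    return hits
```

For example, `find_alignments("ACGT", "ACGAACGT", 1)` reports two placements, one at position 0 with a single mismatch and one at position 4 with none. Setting `max_mismatches` to 0 keeps only the exact match.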
