Fighting against uncertainty: An essential issue in bioinformatics

Many bioinformatics problems, such as sequence alignment, gene prediction, phylogenetic tree estimation and RNA secondary structure prediction, are affected by the "uncertainty" of a solution; that is, the probability of the solution is extremely small. This situation arises in estimation problems on high-dimensional discrete spaces, where the number of possible solutions is immense. In the analysis of biological data and the development of prediction algorithms, this uncertainty should be handled carefully and appropriately. In this review, I explain several methods to combat this uncertainty and present a number of examples from bioinformatics. The methods include (i) avoiding point estimation, (ii) maximum expected accuracy (MEA) estimation, and (iii) several strategies for designing a pipeline that involves several prediction methods. I believe that the basic concepts and ideas described in this review will be generally useful for estimation problems in various areas of bioinformatics.
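
As a rough illustration of why the probability of a single solution can be tiny, and of how an MEA-style estimate can differ from the most probable solution, the following minimal Python sketch may help. Everything in it is an assumption made for illustration: the toy posterior, the per-position probability of 0.9, the helper names `expected_accuracy` and `map_probability`, and the choice of "number of matching positions" as the accuracy measure. It is not taken from the review itself.

```python
from itertools import product

# Hypothetical toy posterior over 3-bit solutions (illustrative values only).
posterior = {(0, 0, 0): 0.4, (0, 1, 1): 0.3, (1, 0, 1): 0.3}


def expected_accuracy(candidate, posterior):
    """Expected number of positions where `candidate` agrees with a solution
    drawn from `posterior` (one simple, assumed accuracy measure)."""
    return sum(
        p * sum(c == s for c, s in zip(candidate, solution))
        for solution, p in posterior.items()
    )


# Point estimate: the single solution with the highest probability.
map_solution = max(posterior, key=posterior.get)

# MEA-style estimate: brute-force search over all 2^3 candidates for the
# one with the highest expected accuracy.
mea_solution = max(
    product((0, 1), repeat=3),
    key=lambda c: expected_accuracy(c, posterior),
)


def map_probability(n, p=0.9):
    """Probability of the most probable solution when n positions are
    independent and each is individually correct with probability p."""
    return p ** n


print(map_solution, expected_accuracy(map_solution, posterior))
print(mea_solution, expected_accuracy(mea_solution, posterior))
print(map_probability(100))
```

In this toy case the most probable solution and the solution with the highest expected accuracy are different, and the latter has zero probability under the posterior. The brute-force search is only feasible because the space is tiny; real problems of the kind listed above would need dynamic programming or similar techniques.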
Comments: This manuscript was accepted for publication in Briefings in Bioinformatics
Subjects: Quantitative Methods (q-bio.QM); Biomolecules (q-bio.BM); Genomics (q-bio.GN)
Cite as: arXiv:1305.3655 [q-bio.QM]
 (or arXiv:1305.3655v1 [q-bio.QM] for this version)

Submission history

From: Michiaki Hamada
[v1] Wed, 15 May 2013 23:49:59 GMT (972kb)