Introduction
Burrows-Wheeler Aligner (BWA) is an efficient program that aligns relatively short nucleotide sequences against a long reference sequence such as the human genome. It implements two algorithms, bwa-short and BWA-SW. The former works for query sequences shorter than 200bp and the latter for longer sequences up to around 100kbp. Both algorithms do gapped alignment. They are usually more accurate and faster on queries with low error rates. Please see the BWA manual page for more information.
FAQ
- How can I cite BWA?
- The short read alignment component (bwa-short) has been published:
  Li H. and Durbin R. (2009) Fast and accurate short read alignment with Burrows-Wheeler Transform. Bioinformatics, 25:1754-60. [PMID: 19451168]
- Does BWA align 454 reads?
- Yes and no. The BWA-SW component of BWA works well on 454 reads of about 200bp or longer. It achieves alignment accuracy similar to that of SSAHA2 while being much faster. BWA-SW also works for shorter reads, but the sensitivity is lower. In addition, BWA-SW does not support paired-end alignment.
- What is the maximum query sequence length in alignment?
- It is recommended to use bwa-short only on reads shorter than 200bp. Although bwa-short works in principle for queries of up to a few kbp, its performance degrades. For long reads, BWA-SW is better.
- The BWA-SW component can align a BAC sequence (about 150kbp) against the human genome. The speed in terms of aligned bases per time unit is comparable to the speed of 1kbp read alignment. In principle, BWA-SW should be able to align a few Mbp query sequence at a similar speed, but I have not tried.
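The length guidance above can be written as a small helper. This is only a sketch: the function name `choose_algorithm` is hypothetical and not part of BWA, and the hard cut-off at exactly 200bp is an assumption taken from the recommendation above, not a limit enforced by the program.

```python
def choose_algorithm(read_length_bp: int) -> str:
    """Return the BWA component suggested by the read-length guidance.

    Hypothetical helper for illustration; BWA itself does not expose this.
    """
    if read_length_bp <= 0:
        raise ValueError("read length must be positive")
    # bwa-short is recommended only for reads shorter than 200bp.
    if read_length_bp < 200:
        return "bwa-short"
    # Longer reads (including BAC-sized queries) are better served by BWA-SW.
    return "BWA-SW"


if __name__ == "__main__":
    for length in (36, 100, 454, 150_000):
        print(length, choose_algorithm(length))
```

Reads close to the 200bp boundary may be worth trying with both components, since bwa-short still works on longer queries, only with degraded performance.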
- What is the tolerance of sequencing errors?
- bwa-short is mainly designed for sequencing error rates below 2%. Although users can ask it to tolerate more errors by tuning command-line options, its performance degrades quickly. Note that for Illumina reads, bwa-short may optionally trim low-quality bases from the 3'-end before alignment and is thus able to align more reads with a high error rate in the tail, which is typical of Illumina data.
- BWA-SW tolerates more errors given longer reads. Simulation suggests that BWA-SW may work well given 3% error for a 200bp alignment, 5% for 500bp and 10% for 1000bp or longer alignment.
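As a rough illustration, the figures quoted above can be turned into a lookup. The step-wise interpretation is an assumption (the simulation results only give three data points plus the 2% design target for bwa-short), and both function names are hypothetical.

```python
def suggested_error_tolerance(alignment_length_bp: int) -> float:
    """Approximate error rate at which alignment may still work well.

    Step function built from the figures quoted in the text; the
    behaviour between the quoted lengths is an assumption.
    """
    if alignment_length_bp <= 0:
        raise ValueError("alignment length must be positive")
    if alignment_length_bp < 200:
        return 0.02  # bwa-short design target: error rates below 2%
    if alignment_length_bp < 500:
        return 0.03  # BWA-SW, about 200bp
    if alignment_length_bp < 1000:
        return 0.05  # BWA-SW, about 500bp
    return 0.10      # BWA-SW, 1000bp or longer


def within_tolerance(alignment_length_bp: int, n_errors: int) -> bool:
    """Check an observed error count against the suggested tolerance."""
    rate = n_errors / alignment_length_bp
    return rate <= suggested_error_tolerance(alignment_length_bp)


if __name__ == "__main__":
    for length, errors in ((100, 1), (200, 10), (1000, 80)):
        print(length, errors, within_tolerance(length, errors))
```

These thresholds are guidance from simulation, not guarantees; real data may behave differently.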
- Does BWA find chimeric reads?
- Yes, the BWA-SW component is able to find chimeras. BWA usually reports one alignment for each read but may output two or more alignments if the read/contig is a chimera.
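Because a chimeric read appears as two or more alignment records, candidate chimeras can be listed by grouping SAM records by read name. The sketch below assumes single-end, tab-delimited SAM text (paired-end mates share a read name and would be counted too); the function name and the example records are made up for illustration. Multiple records are only a hint and do not prove that a read is chimeric.

```python
from collections import defaultdict


def reads_with_multiple_alignments(sam_lines):
    """Map read name -> [(reference, position), ...] for reads that
    have more than one mapped record. Hypothetical helper."""
    hits = defaultdict(list)
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        rname, pos = fields[2], int(fields[3])
        # FLAG bit 0x4 marks an unmapped read; ignore those records.
        if flag & 0x4:
            continue
        hits[qname].append((rname, pos))
    return {name: locs for name, locs in hits.items() if len(locs) > 1}


if __name__ == "__main__":
    # Invented records: r2 has two partial alignments on different references.
    example = [
        "@SQ\tSN:chr1\tLN:1000",
        "@SQ\tSN:chr2\tLN:1000",
        "r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\t*\t*",
        "r2\t0\tchr1\t200\t60\t30M20S\t*\t0\t0\t*\t*",
        "r2\t0\tchr2\t500\t60\t30S20M\t*\t0\t0\t*\t*",
        "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    ]
    print(reads_with_multiple_alignments(example))
```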
- Does BWA call SNPs like MAQ?
- No, BWA only does alignment. Nonetheless, it outputs alignments in the SAM format, which is supported by several generic SNP callers such as samtools and GATK.
- Does BWA work on reference sequences longer than 4GB in total?
- No, this is not possible and will not be supported in the near future due to the technical complexity involved.
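To check a reference before indexing, its total length can be summed from the FASTA file. This is a minimal sketch: it assumes "4GB" means 4 * 2^30 bases of sequence, which is an interpretation rather than something stated above, and it conservatively treats a reference of exactly that size as too long. The names `total_reference_length`, `fits_assumed_limit` and `ASSUMED_LIMIT_BASES` are hypothetical.

```python
import io

# Assumption: the 4GB limit is read as 4 * 2**30 bases of sequence.
ASSUMED_LIMIT_BASES = 4 * 1024**3


def total_reference_length(fasta_handle) -> int:
    """Sum the sequence lengths of all records in a FASTA stream."""
    total = 0
    for line in fasta_handle:
        # Header lines start with '>' and carry no sequence.
        if line.startswith(">"):
            continue
        total += len(line.strip())
    return total


def fits_assumed_limit(total_bases: int, limit: int = ASSUMED_LIMIT_BASES) -> bool:
    """True if the reference is strictly shorter than the assumed limit."""
    return total_bases < limit


if __name__ == "__main__":
    toy = io.StringIO(">chr1\nACGT\nAC\n>chr2\nGG\n")
    n = total_reference_length(toy)
    print(n, fits_assumed_limit(n))
```

For a real reference, pass an open file object instead of the in-memory example.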