Beware of mis-assembled genomes

Abstract
With hundreds of genomes now in GenBank, researchers might be forgiven for assuming that genome sequence data are correct, at least at a large scale. Certainly there might be errors at some small rate, perhaps 1 in 50 000 or 100 000 bases (Schmutz et al., 2004; Read et al., 2002), but at a large scale these genomes are put together correctly, are they not? Well, not always.