PacBio Revealed

Oliver Elemento has done a pretty remarkable in-depth analysis of the first publicly available PacBio data. It’s all up on his blog, so jump over and read the whole thing, but here are a few of the highlights:

  • The machine only produces about 48k reads per run. By Oliver’s reckoning, this works out to about 6,400 runs to get 10x coverage of a human genome. Ouch.
  • Single-pass sequence accuracy is remarkably low, at just over 80%. I'd heard rumors that PacBio had accuracy problems, but I didn't expect the error rate to be that ugly.
  • On a more positive note, reads are very long, with several runs *averaging* 2,300 bp and an overall average read length of ~850 bp.
  • Interestingly, there is a positive correlation between read length and quality. This is somewhat different from what we see on other platforms, where read length is limited by the huge drops in quality near the end of the read.
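
For a rough feel for the throughput arithmetic, here is a back-of-the-envelope sketch. The 3.1 Gb genome size and the `usable_fraction` parameter are my own assumptions, not numbers from Oliver's post. Plugging in 48k reads and the ~850 bp overall average gives far fewer runs than 6,400, so his estimate presumably rests on a much lower usable yield per run than this naive calculation assumes.

```python
import math


def runs_for_coverage(genome_size_bp, coverage, reads_per_run,
                      mean_read_length_bp, usable_fraction=1.0):
    """Estimate how many runs are needed to reach a target fold coverage.

    usable_fraction is a hypothetical knob for the share of sequenced
    bases that actually end up contributing to coverage.
    """
    bases_needed = genome_size_bp * coverage
    bases_per_run = reads_per_run * mean_read_length_bp * usable_fraction
    return math.ceil(bases_needed / bases_per_run)


# Assumption: a human genome of roughly 3.1 billion bases.
HUMAN_GENOME_BP = 3_100_000_000

# Naive estimate: every base of every read counts toward coverage.
naive_runs = runs_for_coverage(HUMAN_GENOME_BP, 10, 48_000, 850)

# Working backwards: usable bases per run implied by a 6,400-run estimate.
implied_bases_per_run = HUMAN_GENOME_BP * 10 / 6_400

print(f"naive estimate: {naive_runs} runs")
print(f"implied usable yield at 6,400 runs: {implied_bases_per_run:,.0f} bp per run")
```

Lowering `usable_fraction` (say, to account for reads that fail to align) pushes the run count up quickly, which is one way the two estimates could be reconciled.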

The bottom line, in my mind, is that unless PacBio can solve its accuracy and throughput problems, it's going to be relegated to niche applications.