The PacBio Benchtop - The Vega

Nov 07, 2024

Let’s get straight to it: PacBio have released a new instrument called Vega, their highly anticipated benchtop DNA sequencer:

It delivers 60Gb of data in 24 hours
Costs $169K
Runs are $1100

There's also a low capital cost option:

$79K for the instrument, with a commitment to buy 152 cells at $1750 each.
After 152 cells, the price drops back to the standard $1100.
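To see what the low capital option costs over the commitment period, here is a rough sketch using only the prices quoted above. The function and constant names are made up for illustration, and it assumes all 152 committed cells are paid for and ignores service contracts, financing and anything else not listed in this post.

```python
# Prices from the post; names are illustrative only.
FULL_PRICE_INSTRUMENT = 169_000
LOW_CAPITAL_INSTRUMENT = 79_000
STANDARD_CELL = 1_100
COMMITMENT_CELL = 1_750
COMMITMENT_CELLS = 152


def total_cost(cells_run: int, low_capital: bool) -> int:
    """Instrument plus cell cost in USD after `cells_run` cells."""
    if not low_capital:
        return FULL_PRICE_INSTRUMENT + cells_run * STANDARD_CELL
    # Assumption: the 152 committed cells are paid for even if fewer are run.
    committed = COMMITMENT_CELLS * COMMITMENT_CELL
    extra = max(0, cells_run - COMMITMENT_CELLS) * STANDARD_CELL
    return LOW_CAPITAL_INSTRUMENT + committed + extra


premium = total_cost(152, low_capital=True) - total_cost(152, low_capital=False)
print(f"Extra paid via the low capital route after 152 cells: ${premium:,}")
```

Under those assumptions the low capital route ends up $8,800 more expensive once the commitment is met: $90K less up front, $98.8K more in cells.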

The instrument is ~2ft square. So, roughly the same amount of space as a MiSeq.

Comparing to ONT

Throughput

I don’t know how much a PromethION flowcell produces and Oxford Nanopore are seemingly unwilling to commit to a throughput spec. I looked at a few papers, most seemed like relatively small projects… which makes it difficult to estimate real life performance and flow cell variability.

So I just grabbed all human genome PromethION datasets from the SRA and plotted them as a histogram.

From this I’m going to go with a PromethION flow cell generating ~72.5Gb of data [1]. Not a whole lot more than the Vega. In 4-unit volumes the PromethION flow cell is $900.
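As a quick sanity check, the consumable cost per Gb can be worked out from those numbers. This is only a sketch: the 72.5Gb figure is the estimate above rather than a vendor spec, the helper name is made up, and library prep and instrument cost are left out.

```python
def cost_per_gb(run_price_usd: float, yield_gb: float) -> float:
    """Consumable cost per gigabase for a single run."""
    return run_price_usd / yield_gb


vega = cost_per_gb(1100, 60)          # Vega run price and stated yield
promethion = cost_per_gb(900, 72.5)   # flow cell price and estimated yield

print(f"Vega: ${vega:.2f}/Gb, PromethION: ${promethion:.2f}/Gb")
```

That works out to roughly $18/Gb for the Vega against roughly $12/Gb for the PromethION flow cell.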

On the face of it, perhaps it looks like the PromethION flow cell is the better bet? But there are two further factors:

PromethION run output is variable, and there’s no vendor claim to back it up.
PromethION quality still isn’t good enough.

The lack of a vendor commitment on throughput would for me make it difficult to build a large project (or company) around the PromethION. That could be solved if ONT are willing to step up and say “if you don’t get an average of X using protocol Y on whole human genomes you get a refund”.

Accuracy

The quality issue is more problematic. This needs further research, but the last time I looked at ONT data the overall error rate was still pretty high. It seems that ONT are currently claiming something like a Q26 modal read accuracy.

PacBio are showing Q35 median on some Revio runs, and state Q33 on their new chemistry. More than this, I’d still be more concerned about systematic bias in ONT datasets (see below).
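For anyone who doesn't think in Q scores, the standard Phred relationship is Q = -10·log10(p), where p is the per-base error probability. A small sketch of the conversion follows (the function name is illustrative). Note that a modal figure and a median figure are different statistics, so this is only a rough comparison.

```python
def phred_to_error_rate(q: float) -> float:
    """Per-base error probability implied by a Phred quality score."""
    return 10 ** (-q / 10)


# Q values quoted in the post.
for label, q in [("ONT modal", 26), ("PacBio new chemistry", 33), ("Revio median", 35)]:
    p = phred_to_error_rate(q)
    print(f"{label}: Q{q} is about 1 error in {round(1 / p):,} bases")
```

So Q26 is about 1 error in 400 bases, Q33 about 1 in 2,000, and Q35 about 1 in 3,200.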

Is 20X Enough?

One relatively recent publication suggested that “below 8× coverage, ONT demonstrated a higher F1 score for SVs compared to PacBio HiFi. However, beyond 8× coverage, there was a significant improvement in the F1-score for PacBio HiFi, exceeding 90%, while the F1-score for ONT remained around approximately 87%”.

This suggests that the PacBio results at 20x should be better than ONT. Unfortunately ONT is a constantly moving target so this isn’t the latest chemistry. PacBio’s results however suggest that a 20x PacBio genome should be better than a significantly higher coverage ONT genome.
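The 20x figure is presumably what one 60Gb Vega run gives on a human genome. A minimal sketch of that arithmetic, assuming a haploid human genome of roughly 3.1Gb (that size is an assumption added here, not a number from this post):

```python
HUMAN_GENOME_GB = 3.1  # assumed approximate haploid human genome size


def coverage(yield_gb: float, genome_gb: float = HUMAN_GENOME_GB) -> float:
    """Mean depth of coverage from total yield, ignoring unmapped or filtered reads."""
    return yield_gb / genome_gb


print(f"Vega run (60Gb): {coverage(60):.1f}x")
print(f"PromethION flow cell (~72.5Gb estimate): {coverage(72.5):.1f}x")
```

That gives roughly 19x for a Vega run and roughly 23x for the estimated PromethION flow cell output.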

Run time/Instrument Cost

PromethION output is generally quoted for 72h run time. That’s 3x slower than the Vega. Realistically this means the Vega should be compared against either the P2 or P24.
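Putting the run times and yields together gives a rough output per day. This sketch uses the ~72.5Gb estimate from above and assumes back-to-back runs with no turnaround time. The names are illustrative.

```python
import math


def gb_per_day(yield_gb: float, run_hours: float) -> float:
    """Average daily output assuming back-to-back runs with no downtime."""
    return yield_gb * 24 / run_hours


vega_daily = gb_per_day(60, 24)          # Vega: 60Gb in 24 hours
promethion_daily = gb_per_day(72.5, 72)  # one PromethION flow cell, 72h run

# Flow cells that would need to run in parallel to match one Vega.
flow_cells_to_match = math.ceil(vega_daily / promethion_daily)

print(f"Vega: {vega_daily:.1f} Gb/day")
print(f"PromethION flow cell: {promethion_daily:.1f} Gb/day")
print(f"Flow cells in parallel to match a Vega: {flow_cells_to_match}")
```

On these numbers a single flow cell delivers about 24Gb/day against 60Gb/day for the Vega, so matching it takes more than two flow cells running in parallel.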

The P2 costs $97K, the P24 $436K. Of course, if you don’t need the throughput, ONT have lower-end options which may suit better. But if you’re going to be running nearer capacity, the Vega looks like a good option.

Summary

I’d expect the Vega to perform similarly to the Revio. As such I suspect it will be a relatively solid workhorse instrument and a good option for smaller labs.

To me it seems like a no-brainer versus ONT [2]. The price is in the same ballpark, accuracy is better, and I’d expect the vendor claims to hold up and the platform to deliver consistent results.

I think it could even displace short read instruments like the new MiSeq i100. Smaller labs could address their low-throughput needs, generating data with Illumina-compatible quality. But as a bonus they also get to experiment with long reads and direct methylation, generating papers they otherwise wouldn’t have.

What’s less clear is how much revenue this ends up generating for PacBio. Traditionally these low end instruments haven’t created a lot of revenue. PacBio said they wanted to be cash flow positive by 2026.

This instrument seems like it might generate some short-term revenue and, longer term, help drive adoption of the Revio, so this seems possible. But 2026 is a very short timeline.

There will be a post covering some technical aspects of the platform very shortly. So subscribe for that!

PacBio reviewed a version of this article prior to publication.

[1] That being a smallish peak in the distribution roughly in the region I’d expect it to be and ignoring the low throughput runs which I assume were multiplexed. But if someone wants to chime in with another dataset (or even better an official guarantee from ONT) that would be of interest.

[2] For human genomes, and unless you for some reason need very long reads for a research project.

ASeq Newsletter
