PacBio: All Reads are HiFi Reads? And thoughts on DeepConsensus.
All Reads are HiFi Reads?
The Sequel IIe has 8M ZMWs and provides 4M reads on a SMRT Cell 8M. HiFi (>Q20 reads) yield is 30 Gb, with an average read length of ~15 Kb. That works out to ~2M HiFi reads, which suggests that ~50% of reads are HiFi. This roughly matches the plots shown in PacBio’s marketing material. So, total HiFi read yield on a Sequel IIe should be ~2M reads.
The Revio has 25M ZMWs and yields 90 Gb per SMRT Cell. Revio marketing material doesn’t present any data below Q20.
With a 90 Gb yield and a 15 Kb average read length, this would be 6M reads. So out of a total of 25M ZMWs, only 6M are producing usable data. Most likely other wells are producing reads (traditional, low-pass CCS reads), but my guess is that the instrument is only delivering HiFi reads to the user.
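As a sanity check on the arithmetic above, here is a minimal Python sketch. The function and variable names are my own (hypothetical), and it assumes 1 Gb = 1e9 bases and that every read is exactly the average length, so treat the output as a back-of-envelope estimate rather than a measured figure.

```python
def reads_from_yield(yield_gb: float, mean_read_kb: float) -> float:
    """Approximate read count as total bases divided by mean read length.

    Assumes 1 Gb = 1e9 bases and 1 Kb = 1e3 bases (a simplification).
    """
    return (yield_gb * 1e9) / (mean_read_kb * 1e3)


# Figures quoted in the post (yield in Gb, average read length in Kb).
sequel_hifi_reads = reads_from_yield(30, 15)
revio_hifi_reads = reads_from_yield(90, 15)

# Totals quoted in the post.
SEQUEL_TOTAL_READS = 4e6   # reads per SMRT Cell 8M
SEQUEL_ZMWS = 8e6
REVIO_ZMWS = 25e6

# Share of Sequel IIe reads that are HiFi, and share of ZMWs on each
# instrument that end up producing a HiFi read.
sequel_hifi_share_of_reads = sequel_hifi_reads / SEQUEL_TOTAL_READS
sequel_hifi_share_of_zmws = sequel_hifi_reads / SEQUEL_ZMWS
revio_hifi_share_of_zmws = revio_hifi_reads / REVIO_ZMWS

print(f"Sequel IIe HiFi reads: {sequel_hifi_reads:,.0f}")
print(f"Revio HiFi reads:      {revio_hifi_reads:,.0f}")
print(f"Sequel IIe HiFi share of reads: {sequel_hifi_share_of_reads:.0%}")
print(f"Sequel IIe HiFi share of ZMWs:  {sequel_hifi_share_of_zmws:.0%}")
print(f"Revio HiFi share of ZMWs:       {revio_hifi_share_of_zmws:.0%}")
```

On these numbers, the share of ZMWs producing a HiFi read comes out about the same on both instruments (roughly a quarter), which fits the guess that the Revio is generating non-HiFi reads too and just not reporting them.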
This is pretty important, because it plays into the compute requirements for DeepConsensus, PacBio’s machine learning base caller built into the Revio. These compute requirements could be a significant part of the Revio COGS, as DeepConsensus can take between 200 and 800 hours to run even on Sequel IIe data.
Below the paywall break, I try to estimate the compute requirements in more detail. If it’s difficult for you to subscribe and you’d like a copy, get in touch (new@sgenomics.org).