Ultima Genomics - Further Thoughts
There were a few further points I thought were worth addressing on Ultima Genomics. Some of these come from Keith’s excellent blog post on Ultima, others from a pre-print re-analyzing some of Ultima’s public data. I highly recommend checking out both these sources.
But before we get into that, it’s worth looking at two recent announcements from AGBT and how these affect Ultima’s positioning. The first is from Illumina, who with near impeccable timing just announced 300bp paired-end reads on the NextSeq 1000/2000. This neatly beats Ultima’s 300bp single-end reads.
The second is from Singular, who have coincidentally announced a 4B read kit. Not quite in the NovaSeq/Ultima range, but getting closer. Will Ultima be able to compete against not just Illumina, but a number of other players all generating higher quality data at similar throughput?
I guess we’ll find out… but as promised, let’s continue to address the technical approaches Ultima has developed.
Ultima’s Neat Tricks
In addition to using per-run trained ML base calling, Keith notes that:
“Second, for short homopolymers Ultima embeds in the Q-scores a probability matrix of the length”...“This is leveraged by their customized version of GATK, developed with the Broad Institute.”
Which is clever, but I’d be concerned if this information needs to be incorporated in the variant calling process. Other platforms could do this but don’t, as it complicates the analysis process and is generally not worth the effort.
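To make the trade-off concrete, here is a minimal Python sketch of what gets lost when a per-homopolymer length distribution is collapsed into a single quality score. This is not Ultima’s actual encoding (I haven’t seen the details); the function names and the example probabilities are made up for illustration. The only thing taken as given is the standard Phred definition, Q = -10 log10(P(error)).

```python
import math


def phred(p_error, q_max=60.0):
    """Standard Phred scale: Q = -10 * log10(P(error)), capped at q_max."""
    if p_error <= 0.0:
        return q_max
    return min(q_max, -10.0 * math.log10(p_error))


def summarize_hmer(length_probs):
    """Collapse a homopolymer length distribution to (called_length, Q).

    length_probs is a hypothetical dict mapping length -> probability.
    """
    total = sum(length_probs.values())
    if not math.isclose(total, 1.0, abs_tol=1e-6):
        raise ValueError("length probabilities must sum to 1")
    # Call the most probable length; everything else counts as error.
    called = max(length_probs, key=length_probs.get)
    p_error = 1.0 - length_probs[called]
    return called, phred(p_error)


# Two distributions with the same call and the same Q, but the likely
# error points in opposite directions (made-up numbers).
leans_short = {4: 0.09, 5: 0.90, 6: 0.01}
leans_long = {4: 0.01, 5: 0.90, 6: 0.09}
print(summarize_hmer(leans_short))
print(summarize_hmer(leans_long))
```

Both distributions collapse to the same call and the same Q, so a standard variant caller can’t tell them apart. A caller that sees the full distribution can, which is presumably the point of the customized GATK, and also why it complicates the downstream analysis.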
Keith also notes a variant calling trick associated with known flow orders: “Cycle shift uses the known order of flows to increase the confidence in variant calls – particularly variable for low coverage data such as cell-free DNA.” It’s not clear to me exactly what this means and I didn’t see any details in the pre-prints. However, Ion Torrent also use a longer 32 base flow order, which I’ve looked at previously, and this sounds similar… so their approach may not be unique.
In addition to this, Ultima’s platform will not call any homopolymers longer than 12… so if you see a 12mer in your dataset I assume this actually means “12 or longer”. While it might be rare, there are users who need to call long homopolymers… I guess they won’t be using Ultima’s platform.
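For anyone who hasn’t worked in flow space, the following Python sketch shows how a read maps onto per-flow homopolymer counts, and what a cap of 12 would do. The 4-base cyclic flow order “TGCA” and the saturating behaviour of the cap are my assumptions for illustration, not something confirmed by Ultima; the function name is made up.

```python
def to_flowgram(seq, flow_order="TGCA", max_hmer=12):
    """Convert a base sequence into per-flow homopolymer counts.

    flow_order and max_hmer are illustrative assumptions. Counts above
    max_hmer are saturated, mimicking a platform that never calls a
    homopolymer longer than the cap.
    """
    unknown = set(seq) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not in flow order: {sorted(unknown)}")

    flows = []
    pos = 0
    flow_idx = 0
    while pos < len(seq):
        base = flow_order[flow_idx % len(flow_order)]
        # Count how many consecutive copies of this flow's base incorporate.
        run = 0
        while pos < len(seq) and seq[pos] == base:
            run += 1
            pos += 1
        flows.append(min(run, max_hmer))
        flow_idx += 1
    return flows


print(to_flowgram("TTGCA"))
# A 12-mer and a 15-mer of A give the same capped flowgram.
print(to_flowgram("A" * 12 + "C") == to_flowgram("A" * 15 + "C"))
```

Under these assumptions the 12-mer and the 15-mer produce identical flowgrams, which is why I read a called 12mer as “12 or longer”.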
Imaging
Keith notes that: “Imaging as the wafer spins enables shooting many tiles without having to repeatedly accelerate and decelerate the flowcell as a rectilinear scanning scheme must do.”
This isn’t exactly accurate. Ultima may have some advantage here, but Illumina don’t need to accelerate and decelerate between tiles either. Since the HiSeq, Illumina have used TDI (time delay integration) imaging. The stage moves continuously, and a line-scan CCD sensor shifts its charge in sync with the stage, scanning across the flow cell.
The stage only needs to stop and reverse direction at the end of the flow cell. Illumina’s analysis pipeline still slices these scans up and analyzes them as “tiles”. But this doesn’t affect imaging speed… so I don’t think Ultima has much of an advantage here.
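If TDI is unfamiliar, here is a toy Python simulation of a single sensor column. It’s a sketch of the general TDI principle, not of Illumina’s actual hardware: the function name, the number of stages, and the scene values are all made up. The idea is that charge is shifted one row per time step, in step with the moving image, so each scene line is integrated once per row without the stage ever stopping.

```python
def tdi_scan(scene, stages):
    """Simulate one column of a TDI line-scan sensor.

    scene  : intensities along the scan direction (one value per line)
    stages : number of sensor rows the charge is shifted through
    Returns the integrated signal for each scene line.
    """
    rows = [0.0] * stages  # charge currently held in each sensor row
    readout = []
    for t in range(len(scene) + stages):
        # Expose: at time t, sensor row r is looking at scene line t - r.
        for r in range(stages):
            line = t - r
            if 0 <= line < len(scene):
                rows[r] += scene[line]
        # Read out the final row, then shift all charge one row along,
        # keeping each charge packet locked to the same scene line.
        readout.append(rows[-1])
        rows = [0.0] + rows[:-1]
    # Scene line s reaches the readout at time s + stages - 1.
    return readout[stages - 1 : stages - 1 + len(scene)]


print(tdi_scan([1.0, 2.0, 0.0, 5.0], stages=4))
```

Each scene line comes out with `stages` times its single-exposure signal, and the scan never pauses. That’s the sense in which a rectilinear scan doesn’t have to be stop-and-shoot.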
Those were the main points I had based on Keith’s post, which added some context from his interviews with Ultima.
Below the paywall break, I discuss some additional data quality and sample prep issues.
And also why I’m probably completely wrong about Ultima anyway…