Running alignment and BEST against random HG002 datasets is probably starting to get old. But there was at least one more I wanted to try: the Ultima Genomics data release!
Ultima’s quality claims have been fairly modest in their previous announcements. Ultima’s main focus seems to be on providing an ultra-high throughput platform with a low cost per base that undercuts Illumina.
This makes sense, as high throughput users are likely the most cost sensitive, and none of the other new players in sequencing [1] are currently going for this part of the market, preferring instead to target the mid-range (NextSeq class instruments).
Ultima, however, are clearly going after the NovaSeq. So let’s take a look at the data quality!
I’m also using compute donated by GenomeMiner as usual!
The same alignment approach I used previously was applied here [2]. Obviously this could bias things against Ultima, but it should give a general sense of the platform performance.
Looking at Q scores, they seem pretty well calibrated, at least as good as we saw in the Illumina dataset:
The Illumina dataset was 250bp. With Ultima you get a distribution of read lengths [3], the average being ~286bp [4]. So not a totally fair comparison, but there is still a significantly higher fraction of perfect reads (assigned Q75 by BEST) in the Illumina data:
Overall, identity/accuracy as determined by BEST is as follows:
Quality is clearly a little worse in Ultima’s data release than in Illumina’s. But if you’re getting this significantly cheaper, I imagine it’s likely good enough for most applications!
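For anyone wanting to put the Q values in context, here is a minimal sketch of the standard Phred relationship between a Q score and an error rate. The function names are my own for illustration, and the cap of 75 for error-free reads simply mirrors the Q75 that BEST reports for perfect reads; how BEST actually computes its values internally is an assumption here.

```python
import math

def phred_to_error(q):
    """Convert a Phred Q score to an error probability (Q = -10 * log10(p))."""
    return 10 ** (-q / 10)

def error_to_phred(p, cap=75.0):
    """Convert an error probability to a Phred Q score.

    A zero error rate has no finite Q, so it is reported as `cap`
    (assumed here to be 75, matching the Q75 given to perfect reads).
    """
    if p <= 0:
        return cap
    return min(cap, -10 * math.log10(p))

def empirical_q(mismatches, aligned_bases, cap=75.0):
    """Empirical Q score of one read from its alignment error count."""
    return error_to_phred(mismatches / aligned_bases, cap)

if __name__ == "__main__":
    # One error in a 250bp read versus a perfect read
    print(empirical_q(1, 250))
    print(empirical_q(0, 250))
```

Well calibrated Q scores are ones where the predicted Q for a set of bases sits close to the empirical Q calculated from alignment errors in this way.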
Will be interesting to see how Ultima performance plays out in the market and what Illumina’s response is!
[1] Aside from perhaps MGI and the T7, but I’m not sure how many users have one of these… And of course with the Apton acquisition PacBio announced that they would also be targeting the ultra-high throughput market.
[2] Still throwing everything through Dorado, because that’s what I have set up on GenomeMiner. This is Minimap2 under the hood and is basically used here because I started this project looking at ONT data. Anyway… good enough for a quick look at the data! You, however, may want to do something more sensible and systematic.
[3] Due to their use of a single channel chemistry, like Ion Torrent and others.
[4] Based on quickly running an awk script over the first 2M reads.
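The awk script itself isn't shown, but a rough Python equivalent of that kind of quick check might look like the sketch below. It assumes the reads are available as a plain or gzipped FASTQ with strict 4-line records; the function names and the file name in the usage note are hypothetical, and a release shipped as BAM/CRAM would need converting first.

```python
import gzip
from itertools import islice

def open_fastq(path):
    """Open a FASTQ file as text, transparently handling .gz files."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")

def mean_read_length(handle, max_reads=2_000_000):
    """Mean sequence length over the first `max_reads` FASTQ records.

    Assumes strict 4-line records, where the second line of each
    record holds the sequence.
    """
    total = 0
    count = 0
    for i, line in enumerate(islice(handle, 4 * max_reads)):
        if i % 4 == 1:  # sequence line of the record
            total += len(line.strip())
            count += 1
    return total / count if count else 0.0
```

Usage would be something like `mean_read_length(open_fastq("reads.fastq.gz"))`, where `reads.fastq.gz` is a placeholder file name.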
Interesting, thanks. Still pretty far behind Illumina. And this is after the improvements Ultima recently mentioned at ASHG?
Presumably a big part of their inaccuracy is from their Ion Torrent chemistry which isn't synced and suffers homopolymer issues.