ASeq Newsletter

PacBio Sub-read Simulations

May 05, 2024

After my previous PacBio subread experiments I wanted to build out a basic simulation to help confirm my understanding of the process and see how subread accuracy impacts final CCS/HiFi accuracy.

First I wanted to gather statistics for the raw subread dataset. I aligned everything and ran the results through BEST:
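As a rough sketch of the kind of statistics involved (this is not BEST itself, and not necessarily how it computes its numbers), per-base mismatch, insertion and deletion rates can be tallied from alignment CIGAR strings. The function names here are made up for illustration, and the sketch assumes the aligner emitted extended CIGARs (`=`/`X`), since a plain `M` hides mismatches:

```python
import re
from collections import Counter

# One CIGAR element: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def tally_cigar(cigar, counts=None):
    """Add the per-operation base counts of one CIGAR string to `counts`."""
    counts = Counter() if counts is None else counts
    for length, op in CIGAR_RE.findall(cigar):
        counts[op] += int(length)
    return counts

def error_rates(counts):
    """Convert operation counts into rates per alignment column.

    The denominator is matches + mismatches + insertions + deletions;
    soft/hard clips are ignored. Other definitions are possible.
    """
    if counts["M"]:
        raise ValueError("need =/X CIGAR ops to separate matches from mismatches")
    columns = counts["="] + counts["X"] + counts["I"] + counts["D"]
    if columns == 0:
        raise ValueError("no aligned bases")
    return {
        "mismatch": counts["X"] / columns,
        "insertion": counts["I"] / columns,
        "deletion": counts["D"] / columns,
    }

# Accumulate over every alignment, then convert once at the end.
counts = Counter()
for cigar in ["5S90=2X3I5D", "40=1X10=1D49="]:
    tally_cigar(cigar, counts)
rates = error_rates(counts)
```

The resulting three rates are the sort of numbers a simple simulator needs as inputs.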

I then used these to parameterize a very basic simulator. This wasn’t too difficult, but you need to get the SAM/BAM metadata right or CCS will fail. I also nulled out the pulse data, as previous experiments suggest that CCS doesn’t use it anyway. My simulated reads also all have 6 subreads.
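A minimal sketch of what such a simulator might look like, assuming independent per-base substitution, insertion and deletion errors and alternating strands between passes. This is not the simulator used for this post: the function names, the default error rates and the adapter length are placeholders, and it only produces read names and sequences, not the BAM header or per-read tags that CCS needs. The read names follow the PacBio `movie/zmw/start_end` convention.

```python
import random

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def add_errors(seq, p_sub, p_ins, p_del, rng):
    """Return a noisy copy of `seq` with independent per-base errors."""
    out = []
    for base in seq:
        if rng.random() < p_ins:
            out.append(rng.choice(BASES))  # spurious base before this one
        r = rng.random()
        if r < p_del:
            continue  # base dropped
        if r < p_del + p_sub:
            base = rng.choice([b for b in BASES if b != base])
        out.append(base)
    return "".join(out)

def simulate_zmw(template, movie, zmw, n_passes=6,
                 p_sub=0.02, p_ins=0.05, p_del=0.03,  # placeholder rates
                 adapter_len=45, seed=None):          # placeholder adapter length
    """Simulate the subreads of one ZMW as (name, sequence) pairs."""
    rng = random.Random(seed)
    subreads = []
    pos = 0
    for i in range(n_passes):
        # Successive passes around the circular template alternate strands.
        strand = template if i % 2 == 0 else reverse_complement(template)
        seq = add_errors(strand, p_sub, p_ins, p_del, rng)
        start, end = pos, pos + len(seq)
        subreads.append((f"{movie}/{zmw}/{start}_{end}", seq))
        pos = end + adapter_len  # skip over the adapter between passes
    return subreads

template = "".join(random.Random(0).choice(BASES) for _ in range(200))
for name, seq in simulate_zmw(template, "m_sim", 1, seed=1):
    print(name, len(seq))
```

Fixing the number of passes at 6 keeps the comparison simple; real ZMWs vary in pass count.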

I ran CCS over reads simulated using an error profile derived from the above. I didn’t break out insertions and deletions by homopolymer/non-homopolymer context (perhaps another time), but the results matched reasonably closely:

Real Data (left) and Simulated (right)
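For intuition about how subread accuracy feeds into consensus accuracy, here is a toy Monte Carlo model. It is not how CCS computes its consensus: it assumes substitution-only errors, independent passes, and a simple plurality vote per position with ties counted as errors. The function name and parameters are illustrative.

```python
import random
from collections import Counter

def consensus_error_rate(p_sub, n_passes=6, n_positions=20000, seed=0):
    """Estimate the per-base error of a plurality vote over noisy passes."""
    rng = random.Random(seed)
    truth = "A"  # the identity of the true base does not matter here
    wrong = 0
    for _ in range(n_positions):
        calls = Counter()
        for _ in range(n_passes):
            if rng.random() < p_sub:
                calls[rng.choice("CGT")] += 1  # substitution to another base
            else:
                calls[truth] += 1
        top = max(calls.values())
        winners = [b for b, c in calls.items() if c == top]
        if winners != [truth]:  # lost the vote, or tied
            wrong += 1
    return wrong / n_positions

for p in (0.05, 0.10, 0.20):
    print(p, consensus_error_rate(p))
```

Even this crude model shows the consensus error falling much faster than the per-pass error, though the real picture is dominated by indels, which the model ignores.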

© 2025 Nava Whiteford