Cees Dekker's protein sequencing paper

Cees Dekker’s group recently published progress toward nanopore protein sequencing. The work addresses one of the fundamental issues in the field: controlling the translocation speed.

Without this control, peptides translocate too quickly for individual amino acids to be detected (for more context on this see the Dreampore post). As there are no established techniques for controlling peptide translocation, the Dekker approach links a peptide to a DNA molecule. This allows motion control techniques established for DNA sequencing to be applied to protein sequencing.

Specifically, they use a DNA helicase to pull DNA (and the linked peptide) through an MspA nanopore:

In the paper they only use negatively charged peptides. This means the construct as a whole has a negative charge, and is pulled through the pore under the bias voltage. The helicase works in the opposite direction, pulling the conjugate back through the pore.

From example traces, translocation seems to take ~3 seconds. The peptide is 12 amino acids, so this is ~250ms per amino acid. Eyeballing the traces, I suspect we’re seeing roughly 10pA between the highest and lowest current blockages (it’s not obvious because they’re using scaled units):

They have looked at various peptides showing single amino acid differences. From the consensus trace you can see that, as in nanopore DNA sequencing, a single amino acid change affects multiple positions:

Conservatively, the single amino acid change contributes to differences in at least 3 steps of the trace. If each step depends on three adjacent residues drawn from the 20 standard amino acids, then when building out a complete platform we can expect at least 20^3 = 8000 different current levels. Spread over what we assume will be a 10 pA range, that leaves roughly 0.001 pA (about 1 fA) between states. This is similar to what we saw with Dreampore.
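To make that estimate easy to rerun, here is a minimal Python sketch of the arithmetic. The context length of 3 and the 10 pA range are my assumptions from reading the traces, not figures stated in the paper, and evenly spaced levels are a simplification.

```python
# Back-of-envelope: how many distinct current levels, and how far apart?
N_AMINO_ACIDS = 20          # standard proteinogenic amino acids
CONTEXT = 3                 # assumption: each level depends on 3 adjacent residues
CURRENT_RANGE_PA = 10.0     # assumption: eyeballed from the scaled traces


def level_count(alphabet: int = N_AMINO_ACIDS, context: int = CONTEXT) -> int:
    """Number of distinct residue contexts, i.e. potential current levels."""
    return alphabet ** context


def mean_spacing_fa(range_pa: float = CURRENT_RANGE_PA,
                    alphabet: int = N_AMINO_ACIDS,
                    context: int = CONTEXT) -> float:
    """Average gap between levels in femtoamps, if levels were evenly spaced."""
    return range_pa * 1000.0 / level_count(alphabet, context)


print(level_count())        # number of levels
print(mean_spacing_fa())    # average spacing in fA
```

Real levels will not be evenly spaced, so some pairs of states will sit much closer together than this average.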

The advantage here is that the translocation is far slower. Sampling at (or averaging to) 10 Hz might be enough to get you near the femtoamp resolution you need to differentiate between states. So, by using an HMM (or an ML approach) to use information from adjacent states, it feels like you might be in the right ballpark.
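As an illustration of what "using information from adjacent states" could look like, here is a toy Viterbi decoder in Python. This is a sketch under loud assumptions, not the method from the paper: the three-letter alphabet, the evenly spaced made-up levels, the Gaussian noise model and the function names are all hypothetical. Each hidden state is a 3-mer, and a state can only be followed by a 3-mer that overlaps it by two residues.

```python
import itertools


def build_levels(alphabet, k):
    """Assign each k-mer a made-up, evenly spaced level (arbitrary units)."""
    kmers = ["".join(p) for p in itertools.product(alphabet, repeat=k)]
    return {kmer: float(i) for i, kmer in enumerate(kmers)}


def viterbi_kmers(observations, levels, alphabet, sigma=0.5):
    """Most likely k-mer path, given one observed level per step.

    Transitions are uniform over the k-mers that overlap the previous
    state by k-1 residues; emissions are Gaussian around each level.
    """
    if not observations:
        return []
    kmers = list(levels)

    def log_emit(x, kmer):
        return -((x - levels[kmer]) ** 2) / (2 * sigma ** 2)

    score = {s: log_emit(observations[0], s) for s in kmers}
    back = []
    for x in observations[1:]:
        new_score, pointers = {}, {}
        for s in kmers:
            # Valid predecessors end with the first k-1 residues of s.
            preds = [c + s[:-1] for c in alphabet]
            best = max(preds, key=score.__getitem__)
            new_score[s] = score[best] + log_emit(x, s)
            pointers[s] = best
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(score, key=score.__getitem__)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def path_to_sequence(path):
    """Collapse overlapping k-mers back into a residue string."""
    return path[0] + "".join(s[-1] for s in path[1:])


alphabet = "ACD"                       # toy alphabet, not all 20 residues
levels = build_levels(alphabet, k=3)
true_seq = "ACDDCAADC"
kmers = [true_seq[i:i + 3] for i in range(len(true_seq) - 2)]
wobble = [0.2, -0.3, 0.1, -0.2, 0.3, -0.1, 0.2]   # made-up measurement error
observed = [levels[k] + w for k, w in zip(kmers, wobble)]
decoded = path_to_sequence(viterbi_kmers(observed, levels, alphabet))
print(decoded)
```

With errors this small the nearest level is already the right one at every step. The overlap constraint starts to earn its keep when the noise is comparable to the level spacing, which is the regime the 8000-level estimate suggests.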

Dekker’s team also have a method that allows each peptide to be observed multiple times. Mostly this is likely to help resolve missing/duplicated states, but it may also help with resolving state levels.
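To show how rereads could tighten the level estimates, here is a small Python sketch. It assumes the rereads have already been aligned step-for-step and that the noise in each reread is independent, in which case the standard error of each level shrinks roughly with the square root of the number of reads. The function name and the example values are made up for illustration.

```python
import math
import statistics


def consensus_levels(reads):
    """Average per-step levels across aligned rereads of one peptide.

    `reads` is a list of equal-length lists, one per reread.
    Returns a list of (mean, standard_error) tuples, one per step.
    """
    if len(reads) < 2:
        raise ValueError("need at least two rereads to estimate a standard error")
    n = len(reads)
    result = []
    for step_values in zip(*reads):
        mean = statistics.fmean(step_values)
        sem = statistics.stdev(step_values) / math.sqrt(n)
        result.append((mean, sem))
    return result


# Three made-up rereads of a four-step trace (arbitrary scaled units).
rereads = [
    [0.10, 0.52, 0.31, 0.80],
    [0.12, 0.48, 0.29, 0.78],
    [0.11, 0.50, 0.30, 0.82],
]
for mean, sem in consensus_levels(rereads):
    print(f"{mean:.3f} +/- {sem:.3f}")
```

The alignment step is the hard part in practice, since missing or duplicated states have to be resolved before any averaging makes sense.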

In terms of throughput, if we imagine running this on a MinION, ~250 ms per amino acid works out to 4 AAs a second per pore, or ~2048 AAs a second across the device (512 pores, always active).

Assume we want a depth of 10 billion proteins (as suggested by Nautilus and others) and that we need 12mers to give a reasonable protein alignment/signal. That is 120 billion amino acids, so running the numbers suggests we’d be looking at ~680 days to sequence to this depth.
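The throughput estimate can be rerun under different assumptions with a few lines of Python. This is a rough sketch: the parameter values are the figures quoted in this post, the function names are my own, and it ignores pore downtime, capture time between peptides, and rereads.

```python
TRANSLOCATION_S = 3.0               # approximate time per peptide, from the example traces
PEPTIDE_LENGTH = 12                 # residues per peptide
PORES = 512                         # MinION-style pore count, assumed always active
TARGET_PEPTIDES = 10_000_000_000    # desired depth
SECONDS_PER_DAY = 86_400


def residues_per_second(pores=PORES,
                        translocation_s=TRANSLOCATION_S,
                        length=PEPTIDE_LENGTH):
    """Device-wide read rate in amino acids per second."""
    dwell_s = translocation_s / length      # seconds spent on each residue
    return pores / dwell_s


def days_to_depth(target=TARGET_PEPTIDES,
                  pores=PORES,
                  translocation_s=TRANSLOCATION_S,
                  length=PEPTIDE_LENGTH):
    """Days of continuous running needed to read `target` peptides once."""
    total_residues = target * length
    seconds = total_residues / residues_per_second(pores, translocation_s, length)
    return seconds / SECONDS_PER_DAY


print(f"{residues_per_second():.0f} residues/s")
print(f"{days_to_depth():.0f} days")
```

Changing `PORES` or `TRANSLOCATION_S` shows how quickly the run time moves: the estimate scales inversely with pore count and linearly with dwell time.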

Beyond potential throughput issues, it’s not clear what will happen when you look at positively charged amino acids. The approach relies on the construct carrying a net negative charge, so this feels like it could be pretty problematic.

A further complication is that the paper uses “the mutant nanopore M2 MspA”, which is likely covered by US8673550B2, a patent to which Illumina appears to have exclusive rights.

Overall, there’s clearly a lot to be done in order to build a working nanopore protein sequencing platform. But this is a fairly promising development.