
STORM (Sequence-based Toehold Optimization & Redesign Model) was developed jointly by the Predictive BioAnalytics Lab at the Wyss Institute at Harvard University and the Collins Lab at MIT to automate the prediction of toehold switch performance in silico.
A toehold switch is a riboregulator that can sense and respond to nucleic acids. Toehold switches are compatible with freeze-dried cell-free systems, making them good diagnostic tools for viruses such as Ebola and Zika. To learn more about previous work on toehold switches, check out how Wyss researchers have used them in a variety of synthetic biology applications.

Given the trigger or switch sequence, STORM uses a pretrained convolutional neural network to predict ON and OFF values. STORM includes the option to redesign toeholds with improved ON/OFF ratios.

Each toehold should be a 30-nucleotide or 59-nucleotide DNA or RNA sequence.
Input sequences should be either the 30-nucleotide trigger region or the 59-nucleotide trigger + hairpin. The 59-nucleotide sequence is built from a region complementary to the trigger, a ribosome binding site, and a start codon.
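
To check sequences against these requirements before submitting them, a minimal sketch along the following lines may help. It is not STORM's own input handling: the function names `parse_sequences` and `validate_toehold` are hypothetical, and converting U to T so that RNA and DNA inputs are treated alike is an assumption made here for illustration.

```python
import re

# Allowed lengths from the text above: trigger only, or trigger + hairpin.
VALID_LENGTHS = (30, 59)


def parse_sequences(text):
    """Split raw input into (name, sequence) pairs.

    Accepts either FASTA records or one sequence per line.
    """
    records = []
    if text.lstrip().startswith(">"):
        name, chunks = None, []
        for line in text.splitlines():
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if name is not None:
                    records.append((name, "".join(chunks)))
                name, chunks = line[1:].strip(), []
            else:
                chunks.append(line)
        if name is not None:
            records.append((name, "".join(chunks)))
    else:
        lines = [line.strip() for line in text.splitlines() if line.strip()]
        for i, line in enumerate(lines):
            records.append((f"seq{i + 1}", line))
    return records


def validate_toehold(seq):
    """Return the sequence as upper-case DNA, or raise ValueError."""
    # Assumption: RNA input is mapped onto the DNA alphabet (U -> T).
    seq = seq.upper().replace("U", "T")
    if len(seq) not in VALID_LENGTHS:
        raise ValueError(f"expected 30 or 59 nucleotides, got {len(seq)}")
    if not re.fullmatch("[ACGT]+", seq):
        raise ValueError("sequence contains characters other than A, C, G, T/U")
    return seq
```

Calling `validate_toehold` on each sequence returned by `parse_sequences` rejects anything that is not a 30- or 59-nucleotide DNA or RNA string.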

To use STORM, enter your toehold sequence(s) separated by line breaks or in FASTA format:


Please note: because of the nature of the algorithm, redesign takes about 5 minutes per sequence, so we only allow ONE sequence to be optimized at a time.
For each sequence, the target ON/OFF value is set to 1 and supplied to an application of SeqProp, an open-source Python package developed by Georg Seelig's lab that enables streamlined development of gradient ascent pipelines for genomic and RNA biology applications. At each iteration, the ON/OFF ratio of the current toehold sequence (initially the input sequence) is predicted, and the difference between the predicted and target values is computed. This discrepancy is then propagated back through the model to update the input sequence in the direction that decreases the difference between the predicted ON/OFF value and the target. The updated toehold position weight matrix is used as input to the next round of optimization, and at the last iteration, the final sequence is composed of the nucleotides with the highest probabilities in the position weight matrix. STORM runs this process five times and selects the toehold with the highest ON/OFF value. This lengthy algorithm is why redesign is limited to one sequence at a time.
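
The loop described above can be illustrated with a small NumPy sketch. It does not use SeqProp or STORM's trained network: `toy_predict` is a hypothetical linear stand-in for the convolutional model, and the function names, step count, learning rate, and noise scale are all assumptions chosen for illustration. What it does show is the general pattern: predict, compare with the target of 1, push the error back to the position weight matrix, and finally read off the most probable nucleotide at each position.

```python
import numpy as np

ALPHABET = "ACGT"


def one_hot(seq):
    """Encode a DNA string as an (L, 4) one-hot matrix."""
    idx = np.array([ALPHABET.index(base) for base in seq])
    return np.eye(4)[idx]


def softmax(logits):
    """Row-wise softmax: turns per-position logits into a position weight matrix."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def toy_predict(pwm, weights):
    """Hypothetical stand-in for the CNN: a linear score squashed into [0, 1]."""
    return 1.0 / (1.0 + np.exp(-np.sum(weights * pwm)))


def optimize_once(seq, weights, target=1.0, steps=200, lr=5.0, seed=0):
    """One run: descend the squared error with respect to the PWM logits."""
    rng = np.random.default_rng(seed)
    # Start near the input sequence; the noise makes each run differ.
    logits = 2.0 * one_hot(seq) + rng.normal(scale=0.5, size=(len(seq), 4))
    for _ in range(steps):
        pwm = softmax(logits)
        pred = toy_predict(pwm, weights)
        # Chain rule: loss -> prediction -> PWM -> logits.
        grad_pwm = 2.0 * (pred - target) * pred * (1.0 - pred) * weights
        grad_logits = pwm * (grad_pwm - (pwm * grad_pwm).sum(axis=1, keepdims=True))
        logits -= lr * grad_logits
    # Final sequence: the most probable nucleotide at each position.
    final = "".join(ALPHABET[i] for i in softmax(logits).argmax(axis=1))
    return final, toy_predict(one_hot(final), weights)


def redesign(seq, weights, n_runs=5):
    """Repeat the optimization n_runs times and keep the best-scoring sequence."""
    runs = [optimize_once(seq, weights, seed=s) for s in range(n_runs)]
    return max(runs, key=lambda run: run[1])
```

Here `redesign(seq, weights)` takes a DNA string and an array of shape `(len(seq), 4)` that plays the role of the model, and returns the best sequence together with its toy score. With a real network the gradient would come from automatic differentiation rather than the hand-written chain rule used in this sketch.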

Interested in finding the best toehold in a region? Enter a genomic sequence:
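
One way to picture this search, assuming the region is scanned with a sliding 30-nucleotide window and every window is scored as a candidate trigger, is sketched below. The page does not describe how STORM performs the scan, so the window size, the step, and the `score_fn` callback are assumptions; in practice `score_fn` would be a predicted ON/OFF value.

```python
def sliding_windows(genome, window=30, step=1):
    """Yield (start, subsequence) for every full-length window in the genome."""
    for start in range(0, len(genome) - window + 1, step):
        yield start, genome[start:start + window]


def best_window(genome, score_fn, window=30):
    """Return (start, subsequence, score) for the top-scoring window.

    score_fn is a caller-supplied function (hypothetical here) that maps a
    candidate sequence to a number, with higher meaning better.
    """
    if len(genome) < window:
        raise ValueError(f"genome is shorter than the {window}-nucleotide window")
    scored = (
        (start, sub, score_fn(sub))
        for start, sub in sliding_windows(genome, window)
    )
    return max(scored, key=lambda item: item[2])
```

Passing a different `score_fn` changes the ranking criterion without touching the scanning logic.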


Jupyter notebooks and additional code are available on GitHub. More information about methods and data is available in the bioRxiv preprint of the paper, "Sequence-to-function deep learning frameworks for engineered riboregulators".


This code is under a GPLv3 license.
Developed by Jacqueline Valeri and Katherine Collins. For questions or feedback, please contact jacqueline "dot" valerie "at" wyss "dot" harvard "dot" edu.