Initially, modules thought to impact biosensor function must be selected for variation. These can include regulation of transport proteins, which can affect the intracellular concentration of ligands and thus biosensor output, as well as the relative transcription and translation levels of the aTF itself and of the fluorescent reporter or output gene (Figure 1). Figure 2 demonstrates a typical workflow used in the development of a DoE-based experiment for biosensor optimization, beginning with the organization of regulatory elements into distinct modules amenable to manipulation via synthetic biology through changes at the sequence level, specifically in operator sites, hexboxes, or RBSs (Figure 2A). The next step in the DoE workflow is the randomization of sequence sites in order to generate libraries of variants (Figure 2B). The degree of randomization must be carefully considered, as the number of colonies screened should scale with the theoretical library size of 4^N, where N is the number of fully randomized base positions. Treating each unique promoter or RBS sequence as a separate categorical variable in DoE would increase the number of required constructs to experimentally unfeasible levels. Conversion to continuous variables through characterization of the libraries is therefore a necessary triaging step to gauge the range of function acquired through randomization and to define the upper, middle, and lower bounds of functionality. This is first achieved through analysis of the libraries' output, measured through a reporter gene, as an indication of the strength of each RBS or promoter variant (Figure 2C). A lin-log transformation is then performed, as shown, to discretize the continuous variables into levels that can be used by DoE to explore different combinations and to develop a model that describes the effects of these variants.
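The 4^N scaling can be checked quickly before ordering a degenerate oligonucleotide. The following is a minimal sketch, not part of the original protocol: the function name and the example sequence are hypothetical, and the only outside knowledge used is the standard IUPAC degenerate-base code (for example, N encodes four bases, whereas K or M encode two).

```python
# Number of nucleotides each IUPAC degenerate code can encode.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def theoretical_library_size(degenerate_sequence: str) -> int:
    """Return the number of distinct sequences a degenerate oligo can encode."""
    size = 1
    for base in degenerate_sequence.upper():
        size *= IUPAC_DEGENERACY[base]
    return size


# Hypothetical six-base site (not taken from the source) with three
# fully randomized positions, compared with a fully randomized hexamer.
example_site = "TTNNNA"
print(theoretical_library_size(example_site))  # 4**3 = 64
print(theoretical_library_size("N" * 6))       # 4**6 = 4096
```

Because two-fold degenerate positions contribute a factor of 2 rather than 4, mixing them with N positions is one way to keep the theoretical library size, and therefore the screening burden, within reach.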
A screening design is then implemented using 3 levels that describe the range of activity for each factor in a combinatorial fashion (Figure 2D). Through assembly and testing of the suggested designs, the experimental space is efficiently explored and the interactions between factors revealed. Statistical analysis of the resulting data is used to determine which combination of factors has the most significant effect on biosensor output, and SLSR is used to predict the behavior of the system under different criteria, facilitating optimization of the biosensor toward specific outcomes such as increased dynamic range or sensitivity (Figure 2D).
Figure 3 demonstrates the assembly and screening of an aTF-regulated promoter library. Isothermal assembly using a degenerate oligonucleotide was performed to create a plasmid-encoded library in which each plasmid is uniquely randomized at specific positions. The degree of library diversity ultimately determines the number of colonies to be screened, with larger theoretical library sizes benefiting greatly from automation. Promoter sequence analysis of homologous operons to TphR provided a map of base conservation that was used to inform randomization locations, specifically bases that showed some degree of variation and therefore may modulate activity without being absolutely essential23. Three bases in each of the -35 and -10 hexboxes were targeted for complete randomization in addition to six bases in the operator site (Figure 3A), resulting in a theoretical promoter library of ~500,000 variants. The plasmid library was subsequently used to transform the host strain. At this stage, good transformation efficiency is crucial in order to obtain sufficient library coverage; common troubleshooting approaches are shown in Figure 3B. Optimization of DNA concentrations, transformation method, and cloning design can significantly improve transformant yields. Figure 3C demonstrates a typical workflow upon obtaining transformants. Individual colonies corresponding to unique variants must first be grown in media before any characterization work can begin. In order to cover the theoretical library size, a vast number of variants will need to be picked and arrayed into plates. Leveraging automated systems such as liquid handlers and colony pickers can greatly simplify this labor-intensive step. Step 1 of Figure 3C illustrates the transfer of growth media into MTPs that have been manually loaded into the liquid handler dock, followed by automated inoculation by a colony picker.
Some stages, such as sealing the plates and transferring them to offline incubators, are manual but can also be automated if desired. Following the growth of the cultures, liquid handlers can also be used to generate cryo-stocks through the addition of glycerol, as shown in Figure 3C. At this stage, barcoding of the plates will ensure that every picked variant will be linked to a specific plate and well location, enabling easy referencing for further downstream characterization. One of the major advantages of automated approaches, aside from the reduction of labor, is reduced human error, with mistakes at the stage of library preparation less likely to be carried forward. Step 2 of Figure 3C illustrates the automated characterization phase of library preparation. This begins via the filling of DWBs with media using the liquid handler platform, followed by inoculation using the barcoded cryo-stocks. Automation at this stage again ensures that pipetting errors and labor are minimized. The plates are then sealed and manually transferred to offline incubators for growth, at which point arraying of effector compounds into fresh deep well plates can be initiated. For the purposes of an initial screen of part libraries, a simple ON/OFF screen can be desirable as this can be used to prescreen non-functional variants that exhibit equal or worse activity than the base construct and enrich the variant pool for those that exhibit enhanced activity. This has the added benefit of reducing the material costs of tips and plates which can become prohibitive in large library screening protocols. However, where optimization of more complex biosensor performance metrics is required (e.g., EC50), additional effector concentrations will be required. 
Following the growth of the cultures, the plates are returned to the liquid handler platform, which inoculates the plates containing effector compounds; these plates are then manually returned to the incubator for the duration of the assay. Figure 3D demonstrates the final automation step before data collection. Once the period for growth and biosensor activation has elapsed, the plates are removed from the offline incubator and returned to the liquid handler platform. To remove residual growth media, which can interfere with fluorescence data collection, centrifugation, removal of supernatant, and washing of the cells with 1x PBS are required. The use of liquid handlers can again simplify this process, with automated resuspension of cultures enabling rapid processing of the plates, including transfer of the washed cells to 96-well format MTPs for screening. Data collection can be performed in a manual or automated fashion, with some readers featuring plate stacks that can interface with liquid handlers to further automate the data collection process. In total, 5,000 variants were assessed by comparing biosensor output in the presence of effector (ON) with that in its absence (OFF), using the degree of biosensor activation (fold change) to determine biosensor function. Only the variants with activity above that of the base construct (3.6-fold) were taken forward for further characterization, as indicated by the red-pink shaded region of the scatter plot (Figure 3D). Based on the plate and well positions of the enriched variant pool, robust characterization using biological replicates or different effector concentrations can then be carried out by referring back to the original barcoded cryo-stock plates generated in Step 1 of the workflow.
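The ON/OFF triage described above amounts to a ratio and a threshold. The sketch below is illustrative only: the plate barcodes, well names, and signal values are hypothetical, and the signals are assumed to be fluorescence readings already normalized to OD. Only the 3.6-fold cut-off comes from the text.

```python
def fold_change(on_signal: float, off_signal: float) -> float:
    """Ratio of reporter output with effector (ON) to output without it (OFF)."""
    if off_signal <= 0:
        raise ValueError("OFF signal must be positive to compute a ratio")
    return on_signal / off_signal


def triage(variants: dict, threshold: float) -> dict:
    """Keep only variants whose ON/OFF fold change exceeds the threshold."""
    kept = {}
    for location, (on_signal, off_signal) in variants.items():
        fc = fold_change(on_signal, off_signal)
        if fc > threshold:
            kept[location] = fc
    return kept


# Fold change of the base construct, as stated in the text.
BASE_CONSTRUCT_FOLD_CHANGE = 3.6

# Hypothetical readings keyed by (plate barcode, well): (ON, OFF).
example_variants = {
    ("P001", "A1"): (1200.0, 1150.0),  # close to 1, likely loss of function
    ("P001", "A2"): (5400.0, 900.0),   # better than the base construct
    ("P001", "A3"): (2000.0, 800.0),   # functional but below the base construct
}

enriched = triage(example_variants, BASE_CONSTRUCT_FOLD_CHANGE)
print(enriched)
```

Keying each result by plate barcode and well keeps the link back to the cryo-stock location, which is what allows the enriched variants to be re-arrayed later.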
Figure 4 demonstrates the screening of the triaged variants from the initial library screen, with the aim of developing a promoter library for optimizing sensitivity. From the 5,000 variants screened in the previous workflow, a triaged pool of 226 variants, determined in the initial ON/OFF screen to be more active than the parental sequence, was further characterized and ranked according to sensitivity in order to provide the levels around which a DSD could be designed. As a first step, the categorical variables, in this case the top Pout variants, must be converted into continuous variables that span a wide sensitivity range. To screen sensitivity, dose response curves are required to obtain EC50 data from a fitted Hill function; this increases plating work dramatically and is well suited to automation using liquid handlers, which simplify assay setup and screening as shown in Figure 4A. Following the workflow established in Step 2 of Figure 3C, plate barcodes and well positions corresponding to the enriched pool of variants were used to inoculate DWBs filled with growth media and antibiotics. To enhance experimental robustness, variants were screened in biological triplicate. Following transfer of the plates to the offline incubator to grow, fresh DWBs were filled with 0, 1, 25, and 1000 µM effector-supplemented growth media using the liquid handlers to reduce labor. To reduce the number of plates required for the assay, a concentration range encompassing the bottom, middle, and top of the curve was chosen, with the mid-point concentrations revealing the relative sensitivities of each variant as illustrated in Figure 4A. After inoculation of the variant pools at each effector concentration and analysis of fluorescence and OD600, dose response curves were plotted, with non-linear regression analysis used to determine EC50.
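To see why the intermediate concentrations separate variants by sensitivity, the Hill function can be evaluated at the four assay concentrations. This is a sketch under stated assumptions: it uses the common four-parameter form of the Hill equation, and the two variants, their EC50 values, and the output range are hypothetical rather than taken from the source data.

```python
def hill_response(conc, bottom, top, ec50, n_h):
    """Four-parameter Hill equation: output as a function of effector concentration."""
    if conc == 0:
        return bottom
    return bottom + (top - bottom) * conc**n_h / (ec50**n_h + conc**n_h)


# Effector concentrations used in the reduced screen (µM), as stated in the text.
CONCENTRATIONS_UM = [0, 1, 25, 1000]

# Hypothetical variants that differ only in EC50 (µM).
example_ec50 = {"sensitive": 2.0, "insensitive": 100.0}

for name, ec50 in example_ec50.items():
    responses = [
        hill_response(c, bottom=100.0, top=1000.0, ec50=ec50, n_h=1.0)
        for c in CONCENTRATIONS_UM
    ]
    print(name, [round(r) for r in responses])
```

Both hypothetical variants give nearly the same output at 0 and 1000 µM, whereas the 1 and 25 µM points diverge strongly, which is the behavior that the reduced concentration range relies on.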
At this stage, a raw library in which each variant has its own EC50 value was generated, with the 100 most robust variants taken forward, as shown in Figure 4B, in order to further reduce library size. Before this library can be used in DoE, however, the unique variants must be converted into a ranked library that represents the range of sensitivity contained within it. This was achieved by performing a lin-log transformation of the data, which ranks and rescales the data so that each variant is ranked from most sensitive (-1) to least sensitive (+1), and which defines a mid-point value (0) representing the geometric mean of the dataset (Figure 4C). The transformation of the raw data produced the blue plot shown in Figure 4D, from which discrete Pout sequences corresponding to +1, 0, and -1 were taken forward into the definitive screening design as Pout factor levels.
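The rescaling step can be sketched as follows. The exact lin-log rate equation used by the authors is not reproduced in this section, so this example assumes a simple min-max scaling on a logarithmic axis, in which 0 corresponds to the geometric mean of the lowest and highest EC50 values. The variant names and EC50 values are hypothetical.

```python
import math


def lin_log_scale(values):
    """Rescale positive values onto [-1, +1] along a logarithmic axis."""
    logs = [math.log(v) for v in values]
    lo, hi = min(logs), max(logs)
    if hi == lo:
        raise ValueError("at least two distinct values are required")
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in logs]


def pick_levels(ec50_by_variant):
    """Return the variants closest to the -1, 0, and +1 levels."""
    names = list(ec50_by_variant)
    scaled = lin_log_scale([ec50_by_variant[n] for n in names])
    levels = {}
    for target in (-1, 0, 1):
        nearest = min(range(len(names)), key=lambda i: abs(scaled[i] - target))
        levels[target] = names[nearest]
    return levels


# Hypothetical EC50 values (µM); low EC50 means high sensitivity (-1).
example_ec50 = {"v1": 0.5, "v2": 2.0, "v3": 5.0, "v4": 20.0, "v5": 50.0}
print(pick_levels(example_ec50))
```

In this example, v3 lands at the mid-point because 5 µM is the geometric mean of 0.5 and 50 µM, so the three selected variants are evenly spaced in log terms rather than in raw concentration.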
Figure 5 demonstrates the complete workflow after library generation, from DSD generation to modelling and global optimization of a biosensor based on the DoE-assisted learnings. Figure 5A features a breakdown of a typical biosensor into 3 modules with either 1 (Transport and Regulator modules) or 2 (Output module) nodes of regulation. Following the example of Figure 4, RBS or promoter libraries will have been developed, and levels of +1, 0, and -1 selected to encompass the greatest variation of each factor. The size of the screened libraries would typically determine the number of experiments required to fully explore the design space; for example, if each of the four libraries were of size 22, this would equate to 22^4 (234,256) combinations. DoE aims to reduce experimental workload by decreasing the number of combinations through structured screening designs. While many methodologies are possible, DSD is ideal for biosensor development as it allows identification of main factors and two-factor interactions whilst avoiding confounding second-order effects. Additionally, as DSD designs utilize 3 levels, it is possible to estimate curvature (non-linearity). Figure 5A demonstrates a typical DSD output in which each of the 4 regulatory nodes is set to different levels. As each level corresponds to a particular promoter or RBS variant, isothermal assembly is used to generate the genetic constructs corresponding to the recommended designs of the DSD. After assembling the recommended constructs and transforming the host strain with them, dose response curves are obtained using a full range of effector concentrations to provide more confidence in the performance of each of the constructs (Figure 5B). As DSD dramatically reduces the number of constructs, this step can often be performed by hand, or using automated liquid handlers if preferred.
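The reduction in workload can be illustrated by building a minimal DSD for four three-level factors. This is a sketch of the standard conference-matrix construction (a conference matrix, its mirror image, and one centre run), not the design table used in the source; statistical software may add further runs, and the assignment of columns to the four regulatory nodes here is arbitrary.

```python
from itertools import combinations

# Order-4 conference matrix: zero diagonal, +/-1 elsewhere,
# with mutually orthogonal columns.
CONFERENCE_4 = [
    [0, 1, 1, 1],
    [-1, 0, 1, -1],
    [-1, -1, 0, 1],
    [-1, 1, -1, 0],
]

FACTORS = ["RBStrans", "Preg", "Pout", "RBSout"]


def definitive_screening_design(conference):
    """Stack the conference matrix, its mirror image, and one all-centre run."""
    mirrored = [[-x for x in row] for row in conference]
    centre = [[0] * len(conference[0])]
    return [list(row) for row in conference] + mirrored + centre


def column(matrix, j):
    """Extract column j of a list-of-lists matrix."""
    return [row[j] for row in matrix]


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


design = definitive_screening_design(CONFERENCE_4)
full_factorial_size = 22 ** 4  # four libraries of 22 variants each

print("all combinations:", full_factorial_size)
print("DSD runs:", len(design))
for run in design:
    print(dict(zip(FACTORS, run)))

# Main-effect columns are mutually orthogonal in this design.
print(all(dot(column(design, i), column(design, j)) == 0
          for i, j in combinations(range(len(FACTORS)), 2)))
```

Each row is one construct to assemble, with -1, 0, and +1 denoting the bottom, middle, and top variant of the corresponding library. The mirrored half of the design is what keeps main effects clear of quadratic and two-factor interaction terms.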
Figure 5C presents the prediction profile obtained after constructing and testing the suggested combinations from the DSD screen and building predictive models based on the Hill coefficient (nH) and EC50 output of each tested combination. The aim of the experiment was to develop a biosensor construct that was globally optimized toward both nH and EC50 through modulation of the expression of the 4 regulatory nodes to maximize both parameters. Each regulatory factor is shown in its own column, with the degree of expression indicated along the x-axis corresponding to the lin-log transformed promoter and RBS part libraries (-1 to +1). The effect of changing the expression of nodes on both EC50 and nH is indicated by the curves in the subplots. The profile plots highlight the often-unintuitive nature of biosensor optimization, whereby the tuning of one regulatory node can have opposing effects on output parameters. For example, RBStrans is shown to have no strong correlation with nH; however, it positively correlates with EC50 in a non-linear manner. Higher-order (non-linear) interactions are also implied: in the case of RBSout, an increase in strength will increase slope (higher nH) with a concomitant increase in sensitivity (lower EC50), resulting in a curve with a more digital slope and a sharper response to increasing effector concentration. These models render unintuitive facets of biosensor tuning more clearly, which enables the optimization of the regulatory nodes towards a global optimum. The models were used to predict the global optima for both EC50 and nH, with the red lines in the plot indicating the optimal levels of each regulatory node (Figure 5C). Figure 5D demonstrates the dose-response profile of the initial parental biosensor construct (blue) compared against the top-performing DSD design (green) and the globally optimized construct (lilac).
Using the model to predict the ideal module strengths for maximizing EC50 and nH, a variant corresponding to RBStrans (-1), Preg (-0.7), Pout (-0.3), and RBSout (+1) strengths was assembled and characterized, with the optimized construct showing enhancements in EC50 and nH (Figure 5D). Whilst the DSD and globally optimized biosensors display similar EC50 values (0.8 vs 0.7 µM), nH was significantly improved without compromising the EC50 gains that were already achieved. The results clearly demonstrate the advantages of data-driven design over intuition-based approaches and serve to validate DoE as a means of streamlining and simplifying the biosensor tuning process.

Figure 1: Tuning of genetically encoded biosensor parameters. Layout of genetic modules of a genetically encoded biosensor, including aTF, operator sites (OS), hexboxes (-35, -10), and RBS components. Colored boxes correspond to interactions that typically affect biosensor parameters, such as ligand-aTF affinity (grey), aTF-operator (pink), RNAP-hexbox (green), and RBS (orange). The effects of each parameter on dose-response characteristics are indicated within the representative graphs.

Figure 2: Overview of a typical DoE biosensor optimization workflow. (A) Overview of modularization of biosensor components showing a transport module encoding a transport protein to import the target effector, a regulator module pertaining to the aTF, and an output module that encodes a reporter protein such as sfGFP. Also shown are nodes of regulation, such as RBStrans, Preg, Pout, and RBSout, which correspond to the genetic nodes that will be subjected to randomization in order to explore biosensor parameters. (B) A selection of sequence elements amenable to base randomization, including promoters and RBSs. The parental sequence of the promoter is shown on the top line, with the finalized mutant sequence shown below. Stars indicate unchanged bases, whereas K, M, and N refer to guanine/thymine, adenine/cytosine, or any nucleotide, respectively. Promoters offer greater randomization potential through targeting hexboxes or operator sites, and options can also include duplicating sequences or modifying their spacing. RBS libraries offer more limited randomization options; however, they are significantly easier to screen owing to their smaller maximum diversity. (C) The expression levels of the variants are characterized and then converted to a ranked lin-log library to convert the categorical variant factors into 3 discrete levels that are more amenable to analysis via DoE. (D) Mapping of the experimental space is performed using multiplexed combinations of the three levels of each module to generate a model that can be used to inform design choices to tune biosensor performance toward desired outcomes, such as dynamic range or sensitivity.

Figure 3: Biosensor modularization, promoter library construction, and automated workflow. (A) Example of randomization of specific sequences in the aTF promoter and insertion into the biosensor construct via isothermal assembly. Bolded letters indicate positions that were randomized in the operator site or hexboxes according to the provided key during degenerate oligonucleotide synthesis. (B) Transformation of the resulting biosensor variant library into a cloning host such as E. coli, and next steps depending on transformant yield. Low transformation efficiency can result in poor theoretical library coverage and inadequate exploration of the design space. Troubleshooting at this stage is imperative to ensure a significant portion of variants is available for characterization, with common troubleshooting measures outlined. (C) Workflow of steps 1 and 2 as outlined in the protocol, with the red hand symbol indicating manual steps and the cog indicating automated steps. The step 1 workflow highlights key steps in the protocol from colony selection to cryo-stock generation. The step 2 workflow demonstrates revival and re-arraying of cryo-stocks for assaying by dose response curve. (D) The final procedure before screening, including washing of the cells and transfer to assay plates before measurement of fluorescence and OD. A screened variant pool of 5,000 is shown in the panel, with the variants demonstrating ON/OFF in excess of the parental promoter sequence (3.6-fold) highlighted in the orange box. Many of the variants can be seen to cluster around 1, indicating poor performance and low variability, likely due to the randomization at the sequence level causing loss of function. The 226 variants shown boxed in the plot were taken forward for robust characterization. Data was adapted from the original publication by Alvarez Gonzalez et al.23.

Figure 4: Promoter library top variant screening and discretization via lin-log transformation. (A) Outline of the standard procedure for generating expression data for an RBS or promoter library. Using the triaged variants that represent a good range of expression levels, liquid handlers are used to generate assay plates prefilled with predetermined concentrations of effector from which to derive dose response curves of the triaged 226 variants. (B) After EC50 determination and further reduction of the characterized library to 100 variants, the data is plotted as a bar chart showing the mix of different sensitivities generated from the promoter randomization. (C) The EC50 data is transformed using the lin-log rate equation to convert the continuous data set into a categorical one more suited to factorization in a DSD. (D) The transformed EC50 variant data is shown, now reduced to a simplified scale and ranked from high to low EC50 activity. From this, 3 levels corresponding to the top (+1), geometric mean (0), and bottom (-1) variants are selected and carried forward into the DSD to explore the experimental space.

Figure 5: DSD experimental design, testing, and model-based learning outcomes. (A) Schematic workflow showing the generation of a DSD design table based on the lin-log transformed ranked libraries of the RBStrans, Preg, Pout, and RBSout modules. The DSD design table suggests the smallest number of combinations to efficiently map the experimental space. An example output is given wherein +1, 0, and -1 refer to top, middle, and bottom performing variants for each regulatory node as described by the lin-log transformations. These are constructed via isothermal assembly and confirmed by sequencing before being transformed into the expression host for characterization. (B) Following transformation, cells are grown up and assayed against a wide range of effector concentrations, and the fluorescent output is measured to generate dose-response curves. Various parameters, such as nH and EC50, are extracted from the dose-response curves and fed into the DSD to generate predictive models for each factor. (C) Using the models, predictions can be made on the impact of modulating one biosensor parameter through changing the expression level of any regulatory module. Importantly, global tuning of the regulatory nodes becomes possible, enabling maximization of one or more biosensor parameters simultaneously, indicated by the dashed red lines in each subplot. (D) Optimizing the model toward maximum sensitivity results in the globally optimized construct (lilac), the dose response curve of which is plotted against the top-performing DSD construct (green) and the parental biosensor construct (blue). Extracted nH and EC50 parameters are shown below the plot, demonstrating the improvement of both parameters above the top-performing DSD construct, validating the efficacy of the predictive models generated from the DSD. Data was adapted from the original publication by Alvarez Gonzalez et al.23.
Supplementary Figure 1: Automated liquid handling protocol steps used for biosensor library preparation and assay setup.
Supplementary Figure 2 to Supplementary Figure 6: Step-by-step generation of a Definitive Screening Design (DSD).