We have worked for many years on the computational design of proteins using molecular modeling techniques. We later extended our computational optimization methods to other biological systems, including metabolic, transcriptional, and RNA networks. We have focused in particular on transcriptional and RNA networks, which are the most important regulatory systems in the cell. In 2009, we started our experimental microbiology lab with the aim of validating our computational designs experimentally without relying on external collaborations. This allowed us to test our synthetic regulatory systems, and we began working towards novel information processing devices in living cells.
Until now, engineering the regulation of gene expression has been more of a craft than a systematic endeavor. This is due to the complexity and unreliability of biological systems in living cells. One obvious way to tackle complexity is to use computational power, but it is unknown whether our current biological knowledge is sufficient to engineer anything using automated design techniques. There are reasons to expect a positive answer to this question: i) most engineering attempts rely on design principles that could already be incorporated into an optimization algorithm, ii) many behaviors of interest are robust, and iii) researchers do sometimes succeed in engineering the targeted behavior in living cells in only a few tries. In addition, incorporating feedback from experiments into the algorithm helps us spot misleading quantifications. For instance, growth rate has a devastating effect on the quantification of long-lived proteins (such as fluorescent proteins), and this is most often overlooked.
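The effect of growth rate on long-lived reporters can be illustrated with a minimal sketch. It assumes a standard one-variable model that is not taken from our own pipeline: a protein produced at a constant rate, removed by first-order degradation, and diluted by exponential cell growth, so that its steady-state level is production / (degradation + growth). All names and numbers below are hypothetical and chosen only for illustration.

```python
import math


def rate_from_half_time(half_time_min):
    """Convert a half-life or doubling time (minutes) into a first-order rate (1/min)."""
    return math.log(2) / half_time_min


def steady_state_level(production, degradation, growth):
    """Steady state of dP/dt = production - (degradation + growth) * P."""
    return production / (degradation + growth)


# Hypothetical values: identical production rate in both growth conditions.
production = 10.0                      # molecules per cell per minute (assumed)
fast_growth = rate_from_half_time(30)  # 30 min doubling time
slow_growth = rate_from_half_time(60)  # 60 min doubling time

# Stable reporter (no active degradation): dilution is the only loss term.
stable_ratio = (steady_state_level(production, 0.0, slow_growth)
                / steady_state_level(production, 0.0, fast_growth))

# Unstable reporter (5 min half-life): degradation dominates over dilution.
fast_decay = rate_from_half_time(5)
unstable_ratio = (steady_state_level(production, fast_decay, slow_growth)
                  / steady_state_level(production, fast_decay, fast_growth))

print(f"stable reporter, slow/fast growth:   {stable_ratio:.3f}")
print(f"unstable reporter, slow/fast growth: {unstable_ratio:.3f}")
```

Under these assumptions, halving the growth rate doubles the apparent level of the stable reporter even though its production rate is unchanged, whereas the unstable reporter changes by less than 10%. This is why reporter measurements taken at different growth rates are hard to compare directly.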
Engineering synthetic gene sequences by rational design usually requires several years to obtain a functional sequence that works as expected in a living cell. The complexity of the combinatorial interactions makes design difficult, as happens in RNA systems, where a single nucleotide mutation may lead to a completely unrelated structure. Another example is the design of genomes, where the many gene interactions make the design of a whole genome sequence impractical. These cases exemplify a major roadblock that Synthetic Biology faces in the years to come: our capability to construct exceeds our ability to design. Even the introduction of high-throughput characterization techniques to screen large libraries cannot solve this problem, and we need to improve our design strategy dramatically. This is where computational approaches to automate invention come in. Such approaches have already been used in recent decades to develop software able to program computers or even to produce novel, patentable designs (in the computer science discipline of “genetic programming”). Our research focuses on the development of higher-order information processing systems using automated approaches. We want to automate not just the design step, but also the construction and characterization phases. To automate design we use software that implements an optimization algorithm. To automate construction we are developing methods to synthesize large fragments of DNA, for instance using directed evolution. To automate characterization we use microfluidic devices and automated image processing.
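As a rough illustration of what design by optimization algorithm means, the sketch below runs a simple hill-climbing search over an RNA sequence. It is not our design software: the objective `toy_score` is a hypothetical stand-in (target GC content plus a penalty for a forbidden motif), whereas a real design pipeline would score candidates with a structure or function prediction model. All function names and parameters are assumptions made for this example.

```python
import random

NUCLEOTIDES = "ACGU"


def toy_score(seq, target_gc=0.5, forbidden="AAAA"):
    """Hypothetical objective, lower is better: distance from a target
    GC fraction plus one penalty point per forbidden motif occurrence."""
    gc_fraction = sum(base in "GC" for base in seq) / len(seq)
    return abs(gc_fraction - target_gc) + seq.count(forbidden)


def hill_climb(seq, score, steps=2000, rng=None):
    """Propose single-nucleotide mutations and keep those that do not
    worsen the score (a basic hill-climbing search)."""
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    best, best_score = seq, score(seq)
    for _ in range(steps):
        pos = rng.randrange(len(best))
        # Pick a base different from the current one at this position.
        new_base = rng.choice(NUCLEOTIDES.replace(best[pos], ""))
        candidate = best[:pos] + new_base + best[pos + 1:]
        candidate_score = score(candidate)
        if candidate_score <= best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


start = "A" * 40
designed, final_score = hill_climb(start, toy_score)
print(start, toy_score(start))
print(designed, final_score)
```

Because only non-worsening mutations are accepted, the final score can never exceed the starting score. Practical design algorithms typically also accept some worsening moves (for example, simulated annealing) to escape local optima, which matters in RNA design, where a single mutation can change the predicted structure completely.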
It is widely believed that curing complex diseases will require new approaches to harness biological complexity in a context of limited knowledge. The Synthetic Biology discipline has recently emerged with the aim of providing such enabling tools, by borrowing strategies from the engineering of complex systems (made of millions of components). Unlike the systems handled by such standard engineering disciplines, biological systems are made of millions of components most of which have still unknown physical and chemical properties. Moreover, even if we had the computational capability to properly simulate a biological system made of just two or three biomolecules, we would only be able to analyse the phenotype of a handful of sequences. Perhaps the exponential increase in computational power and knowledge will allow us to predict the behaviour of synthetic biological systems in the coming years. Meanwhile, the approach to engineering (and also to understanding) biological systems has to rely on simplifying biological molecules as much as possible by creating "caricature" versions of them: synthetic molecules. Such synthetic molecules share only the main molecular mechanism of function with their natural counterparts and are otherwise very different. This is done by de novo engineering methods. We propose to use de novo engineering as a way to redesign biological systems for greater reproducibility. As biomolecules have amorphous shapes with complex charge distributions across their surface, we hypothesise that a random molecule will not interact strongly with other molecules in the cell, minimising the occurrence of unknown interactions and, therefore, maximising its predictability. There are computational and experimental approaches to de novo engineering. The former are restricted to special cases where the lack of molecular knowledge and computational power is not limiting, for instance the design of RNA circuits. The latter are limited by the lack of molecular, systemic and organismal knowledge and by the slow rate at which genotype space is sampled (evolutionary power).
Our lab is developing new computational and experimental technologies for the de novo engineering of biomolecules and their interactions (in particular, with the long-term goal of creating a synthetic phage). These technologies will facilitate a new generation of therapeutic tools based on multi-component synthetic molecules that sense and act on various factors in sick cells, tissues and organisms. Our computational methodologies have already produced the proof-of-concept validations necessary to encourage their wide use in the community, which we are fostering by producing protocols, web servers, open-source code and reviews. We are also working on the development of experimental technologies for the de novo engineering of biological systems. To this end, we are coordinating the EVOPROG consortium, where we are programming phage and bacteria to compute our combinatorial optimisation algorithms (for protein, RNA and transcriptional/metabolic network design) by constructing and using a high-throughput droplet device for the de novo directed evolution of biomolecules, integrating in silico and in vivo evolution for the first time. As part of this work, we are developing a general-purpose 3D biochip that uses computational and fluidic automation and could also be applied to perform in vivo molecular biology operations in high throughput (including time-dependent characterisations of gene expression levels using fluorescent proteins).
Our work to engineer a synthetic phage will help accelerate the design cycle for the next generation of synthetic biology. Engineering a synthetic phage requires excelling at both RNA design and protein design. To this end we have contributed to: i) extending the available toolkit of biological parts and circuits, ii) creating hybrid protein-RNA (nucleoprotein) circuits with increased performance, iii) developing new tools for the high-throughput in vivo characterisation of synthetic nucleoprotein circuits in E. coli using microfluidic time-lapse microscopy, iv) modelling and predicting circuit dynamics under external forcing at the single-cell level, and v) creating a biological test platform where circuits and their parts can be measured and debugged before being implemented in more complex cellular backgrounds. This is important for Synthetic Biology because it establishes the foundations for the next generation of synthetic regulatory circuits. There is already a large number of protein and RNA domains with known function and structure that could work interchangeably as switchable modules. Our current computational tools for biomolecular function prediction can provide a genotype-phenotype linkage in many cases, opening the door to the computational design of circuits composed of such modules. Only recently has computational design been shown to produce functional circuits in vivo in several cases involving functional modules such as ribosome-binding sites, transcription terminators, or ribozymes.
We are applying our methodology to engineer novel riboswitches for metabolic engineering within the PROMYS consortium. We have produced the first computational methodology to design riboregulators in living cells, which we have also validated in vivo with cooperative riboregulators, ribozyme-containing riboregulators and anti-terminators. In this way, we are developing a general methodology for the de novo engineering of synthetic RNA parts and circuits that work robustly as targeted in a given cellular context. We are constructing biological devices in E. coli that respond in a complex way after processing a suitable time-dependent signal, and we use custom characterisation methods to measure gene expression dynamics at the single-cell level using microfluidics, automated time-lapse microscopy and automated image analysis.