University of Warwick
    Warwick Systems Biology Centre

    David Wild

    Prof. David L. Wild
    Warwick Systems Biology Centre
    Coventry House
    University of Warwick
    Coventry
    CV4 7AL
    U.K.

    Tel.:
    44-24-761-50242 (direct)
    44-24-765-28321 (secretary)

    e-mail:
    d.l.wild(at)warwick.ac.uk

    Research Interests

    My research interests are in the field of statistical bioinformatics, in particular the application of Bayesian statistical machine learning techniques to problems in systems biology, functional genomics and proteomics.

    Modelling gene regulatory networks

    Over 50 years ago the developmental biologist C.H. Waddington laid the conceptual foundations of modern systems biology in his book "The Strategy of the Genes", in which he envisaged an epigenetic landscape as the potential surface of a multidimensional state-space of cellular metabolism, underpinned by a network of interacting genes. With the advent of high-throughput post-genomic technologies, we are now beginning to investigate the topology of these networks.

    This research aims to combine functional genomics and computational modelling into a novel integrative systems approach, based on a probabilistic modelling technique (Bayesian state-space models), to identify key components of the regulatory networks involved in cell physiology. We aim to learn networks integrating transcriptional data with the production of proteins and metabolites with well-defined biological activity.

    State-space models are a simple class of probabilistic graphical model used for time series analysis. We are investigating the use of these models for the inference of genetic regulatory networks from high-throughput microarray, proteomics and metabolomics data. In the context of genetic regulatory networks, the hidden states of a state-space model can represent unmeasured factors, such as genes that have not been included in the microarray, levels of regulatory proteins, and the effects of mRNA and protein degradation. We have used state-space models to reverse engineer transcriptional networks from highly replicated gene expression profiling time series data obtained from a well-established biological model of T cell activation. The resulting networks reflect many of the dynamics of an activated T cell and provide a methodology for the development of rational and experimentally testable hypotheses. In particular, they reveal the integrated activation of cytokines, proliferation, and adhesion following activation, and place JunB and JunD at the center of the mechanisms that control apoptosis and proliferation (Rangel et al., 2001; Rangel et al., 2004a,b; Beal et al., 2005, 2007).
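
    As a rough illustration of the model class described above, the following is a minimal sketch of a linear-Gaussian state-space model with a Kalman filter, written in Python with NumPy. The matrices A, C, Q and R, the dimensions (2 hidden states, 3 observed "genes") and the function names are all assumptions chosen for illustration. The sketch only filters with known parameters; it does not learn the network structure from data and is not the model or code used in the cited papers.

```python
import numpy as np

def simulate_ssm(A, C, Q, R, x0, T, rng):
    """Simulate x[t+1] = A x[t] + w[t], y[t] = C x[t] + v[t] with Gaussian noise."""
    n, m = A.shape[0], C.shape[0]
    xs = np.zeros((T, n))
    ys = np.zeros((T, m))
    x = x0
    for t in range(T):
        xs[t] = x
        ys[t] = C @ x + rng.multivariate_normal(np.zeros(m), R)
        x = A @ x + rng.multivariate_normal(np.zeros(n), Q)
    return xs, ys

def kalman_filter(ys, A, C, Q, R, mu0, P0):
    """Return the filtered means E[x[t] | y[0..t]] for every time step."""
    mu, P = mu0, P0
    means = np.zeros((len(ys), len(mu0)))
    for t, y in enumerate(ys):
        # Update step: correct the prediction using the observation y[t].
        S = C @ P @ C.T + R
        K = P @ C.T @ np.linalg.inv(S)
        mu = mu + K @ (y - C @ mu)
        P = (np.eye(len(mu)) - K @ C) @ P
        means[t] = mu
        # Predict step: propagate the estimate through the hidden dynamics.
        mu = A @ mu
        P = A @ P @ A.T + Q
    return means

rng = np.random.default_rng(0)
A = np.array([[0.9, 0.1], [0.0, 0.8]])              # hidden-state dynamics (assumed)
C = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])  # hidden states -> observed expression (assumed)
Q = 0.01 * np.eye(2)                                # process noise covariance (assumed)
R = 0.1 * np.eye(3)                                 # measurement noise covariance (assumed)

xs, ys = simulate_ssm(A, C, Q, R, x0=np.array([1.0, -1.0]), T=50, rng=rng)
means = kalman_filter(ys, A, C, Q, R, mu0=np.zeros(2), P0=np.eye(2))
```

    Here the rows of `means` are estimates of the unmeasured hidden factors given the observed series up to each time point. In a network inference setting, A and C would themselves be estimated from the data rather than fixed in advance.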

    In collaboration with scientists at Warwick HRI we are applying these methods to elucidate key transcriptional networks and underlying regulatory mechanisms controlling plant responses to pathogens, high light and drought. As part of a European consortium, we are investigating metabolic regulation in the bacterium Streptomyces coelicolor, a major producer of antibiotics. With the University of Birmingham we aim to develop a computational framework to reconstruct transcriptional and metabolic networks representative of the response of E. coli to acid stress. These projects are funded by the Biotechnology and Biological Sciences Research Council (SABR and SYSMO initiatives), the European Union, and the U.S. National Science Foundation.


    Fast Bayesian computational methods for post-genomic data analysis

    A key feature that distinguishes the modern approach to Systems Biology is the aim of linking modelling of the interactions of system components with the huge volume and diversity of contemporary cellular and molecular data, such as that coming from high-throughput, genome-wide and imaging technologies. This project focuses on the development of statistical and computational methods for the analysis of such data, using novel approaches from the fields of machine learning and nonparametric Bayesian statistics. The project involves a close collaboration of scientists with expertise in machine learning and statistics, bioinformatics and molecular biology at the universities of Cambridge, Kent and Warwick. The new software tools will be developed in the context of real-world scientific problems, such as elucidating signalling networks in plant stress responses and metabolic regulation in Streptomyces coelicolor. The scientific goal of the project will be to apply these novel methods to modelling bioinformatics data, but the methods developed will be broadly applicable across a number of fields. This research is funded by the Engineering and Physical Sciences Research Council (Life Sciences Interface).

     

    Analysing Protein Energetics with Stochastic Computation

    50 years ago, the Nobel laureate Christian Anfinsen and colleagues demonstrated that protein molecules can fold into their three-dimensional ‘native state’ reversibly, leading to the view that these structures represented the global minimum of a rugged funnel-like ‘energy landscape’. Since this seminal work, computer simulations have continued to shed light on the phenomena of protein folding and function. However, protein modelling and structure prediction face two major challenges if progress is to be made in the development of more precise models, which quantitatively describe experimental observations. The first is the difficulty of efficient sampling in the enormous conformational space accessible to protein molecules, whilst the second is the development of the energy function describing molecular interactions for the problem at hand. The microscopic size of protein molecules makes it impossible to measure these interactions directly, and so known protein structures themselves have become the best available experimental evidence. Traditionally, empirical so-called ‘knowledge-based statistical potentials’ have been used to describe such interactions from analysis of a collection of known protein structures.

    This project aims to address both of these challenges. The overall goal of this research is to advance knowledge of protein energetics and improve on established modelling techniques that utilize these empirical knowledge-based potentials. We are using an alternative approach, based on a novel statistical machine learning methodology known as ‘Contrastive Divergence’, to infer interaction potentials from a subset of known protein structures. We also utilize a novel Bayesian computational method for sampling the conformational space of molecular systems, known as ‘Nested Sampling’, which allows us to directly investigate the macroscopic states of the protein folding pathway and evaluate the associated free energies. This research is funded by the Leverhulme Trust.
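
    To give a feel for how Nested Sampling works, here is a toy sketch in Python on a one-dimensional problem where the answer is known analytically. The uniform prior on [-5, 5], the standard normal likelihood, the function name and all parameter values are assumptions made for illustration; this is not the project's protein sampling code. In a molecular setting the likelihood would typically be replaced by a Boltzmann factor and the evidence by a partition function.

```python
import math
import numpy as np

def nested_sampling_log_evidence(n_live=200, n_iter=2000, half_width=5.0, seed=0):
    """Toy nested sampling estimate of log Z for a 1-D problem.

    Prior: uniform on [-half_width, half_width].
    Likelihood: standard normal density evaluated at theta.
    """
    rng = np.random.default_rng(seed)

    def log_like(theta):
        return -0.5 * theta**2 - 0.5 * np.log(2.0 * np.pi)

    live = rng.uniform(-half_width, half_width, size=n_live)
    live_logl = log_like(live)
    log_z = -np.inf
    log_x_prev = 0.0  # log of the enclosed prior mass, starting at X = 1
    for i in range(1, n_iter + 1):
        worst = np.argmin(live_logl)
        log_x = -i / n_live  # expected shrinkage of the enclosed prior mass
        log_w = np.log(np.exp(log_x_prev) - np.exp(log_x))
        log_z = np.logaddexp(log_z, live_logl[worst] + log_w)
        # Replace the worst point with a prior draw of higher likelihood.
        # In this toy case L(theta) > L* is equivalent to |theta| < |theta*|,
        # so the constrained prior can be sampled exactly.
        bound = abs(live[worst])
        live[worst] = rng.uniform(-bound, bound)
        live_logl[worst] = log_like(live[worst])
        log_x_prev = log_x
    # Add the contribution of the remaining live points.
    log_mean_live = np.logaddexp.reduce(live_logl) - np.log(n_live)
    return np.logaddexp(log_z, log_mean_live + log_x_prev)

log_z = nested_sampling_log_evidence()
# Analytic evidence: (1/10) * integral of the normal density over [-5, 5].
analytic = math.log(math.erf(5.0 / math.sqrt(2.0)) / 10.0)
print(f"nested sampling log Z = {log_z:.3f}, analytic log Z = {analytic:.3f}")
```

    The exact constrained draw is only possible because the toy likelihood is symmetric and unimodal. For a real conformational space, the replacement point would have to be generated by a constrained Monte Carlo move instead.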

    Advanced Bayesian Computation for Cross-Disciplinary Research

    We live in an era of abundant data. Rapid technological advances, such as the internet, have made it possible to collect, store and share large amounts of information more easily than ever before. The availability of large amounts of data has had a major impact on society, commerce, and the sciences. Data plays a particularly important role in the sciences. Data is what you get from conducting experiments, and data is what you use to test scientific theories. In recent years, the amount of data collected and generated in the sciences has grown tremendously. We need better tools to model this data, so that we can understand and test theories and make scientific predictions.

    This project focuses on advanced statistical tools for modelling data. It is important that the models are based on probability and statistics, because any model of real world phenomena has to represent the uncertainty we have from incomplete information and noisy measurements. Probability theory provides a coherent mathematical language for expressing uncertainty in models. Our research will develop models based on Bayesian statistics, which used to be called "inverse probability" until the 20th century, and refers to the application of probability theory to learn unknown quantities from observable data. Bayesian statistics can also be used to compare multiple models (i.e. hypotheses) given the data, and thus can play a fundamental role in scientific hypothesis testing. The goal of this research is to develop new computational tools for Bayesian modelling, ensuring that the models are flexible enough to capture the complexity of real-world phenomena and scalable enough to deal with very large data sets. We will also develop new methods for deciding which data to collect and which experiments to perform, which can greatly reduce the cost of scientific inquiry. We will make use of the latest advances in computer hardware, in the form of massively parallel graphics processing units (GPUs), to speed up modelling of scientific data.
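
    As a small worked example of Bayesian model comparison, the sketch below computes a Bayes factor between two hypotheses about a coin: a fair coin versus a coin whose bias is unknown and given a uniform prior. The data (8 heads in 10 tosses), the two models and the function names are assumptions chosen purely for illustration and are not taken from the project.

```python
from math import comb

def marginal_likelihood_fair(k, n):
    """P(k heads in n tosses | fair coin): binomial probability with p = 0.5."""
    return comb(n, k) * 0.5**n

def marginal_likelihood_uniform(k, n):
    """P(k heads in n tosses | unknown bias p, uniform prior on [0, 1]).

    Integrating the binomial likelihood over p gives 1 / (n + 1).
    """
    return 1.0 / (n + 1)

k, n = 8, 10  # hypothetical data: 8 heads in 10 tosses
bayes_factor = marginal_likelihood_uniform(k, n) / marginal_likelihood_fair(k, n)
print(f"Bayes factor (unknown bias vs. fair) = {bayes_factor:.3f}")
```

    A Bayes factor above 1 favours the unknown-bias model and a value below 1 favours the fair coin. Because each marginal likelihood averages over the model's parameters, the more flexible model is automatically penalised when the data do not require the extra flexibility.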

    This research is truly cross-disciplinary in that we do not focus on a single scientific discipline. We have assembled a team whose expertise spans Bayesian modelling across the physical, biological and social sciences. We will create modelling tools for better astronomical surveying of the skies so that we can understand the composition of our universe; we will create tools for analysing gene and protein data so that we can better understand biological phenomena and design drug therapies; and we will develop powerful methods for modelling and predicting economic and financial data which will hopefully reduce risk in financial markets. Surprisingly, these diverse areas of the sciences - astronomy, biology and economics - can come together through a unified set of computational and statistical modelling tools. Our advances will benefit not just these areas but many other areas of science based on data-intensive modelling. This research is a collaboration with experts in statistical machine learning (Prof. Zoubin Ghahramani, Cambridge), statistics and econometrics (Dr. Jim Griffin, Kent) and astronomy (Prof. Andrew Liddle, Sussex) and is funded by the Engineering and Physical Sciences Research Council (Cross Disciplinary Interfaces Programme).

    Recent Publications

    • Penfold, C.A., Buchanan-Wollaston, V., Denby, K.J. and Wild, D.L. Nonparametric Bayesian Inference for Perturbed and Orthologous Gene Regulatory Networks. Bioinformatics (2012). In press.
    • Burkoff, N.S., Várnai, C., Wells, S.A. and Wild, D.L. Exploring the Energy Landscapes of Protein Folding Simulations with Bayesian Computation. Biophysical Journal (2012), 102, 878-886.
    • Podtelezhnikov, A.A. and Wild, D.L. Inferring knowledge based potentials using contrastive divergence. In Hamelryck T., Mardia K.V. and Ferkinghoff-Borg J. (Eds.), Bayesian Methods in Structural Bioinformatics, pp 135-155. Springer (2012).
    • Penfold C.A. and Wild D.L. How to infer gene networks from expression profiles, revisited. Journal of the Royal Society Interface Focus (2011), 1, 857-870 (Invited Review).
    • Cooke E.J., Savage, R.S., Kirk P.D.W., Darkins R., Wild, D.L. Bayesian hierarchical clustering for microarray time series data with replicates and outlier measurements. BMC Bioinformatics (2011), 12:399 (Highly Accessed Paper).
    • Breeze, E., Harrison, E., McHattie, S., Hughes, L., Hickman, R., Hill, C., Kiddle, S., Kim, Y-S., Penfold, C., Jenkins, D., Zhang, C., Morris, K., Jenner, C., Jackson, S., Thomas, B., Tabrett, A., Legaie, R., Moore, J.D., Wild, D.L., Ott, S., Rand, D., Beynon, J., Denby, K., Mead, A., Buchanan-Wollaston, V. High resolution temporal profiling of transcripts during Arabidopsis leaf senescence reveals a distinct chronology of processes and regulation. Plant Cell (2011), 23, 873–894.
    • Savage, R.S., Ghahramani, Z., Griffin, J.E., de la Cruz, B.J. and Wild, D.L. Discovering Transcriptional Modules by Bayesian Data Integration, Bioinformatics, 26, i158-i167, (2010).
    • Angus, J. Beal, M.J., Li, J., Rangel, C., and Wild, D.L. Inferring Transcriptional Networks Using Prior Biological Knowledge and Constrained State-Space Models. In Lawrence, N.D., Girolami, M., Rattray, M. and Sanguinetti, G. (Eds.), Learning and Inference in Computational Systems Biology, MIT Press, Cambridge, (2010), pp 117-152.
    • Nieselt, K., Battke, F., Herbig, A., Bruheim, P., Wentzel, A., Jakobsen, O.M., Sletta, H., Alam, M.T., Merlo, M.E., Moore, J., Omara, W., Morrissey, E.R., Juarez-Hermosillo, M., Rodríguez-García, A., Nentwich, M., Thomas, L., Legaie, R., Gaze, W.H., Challis, G.L., Jansen, R.C., Dijkhuizen, L., Rand, D.A., Wild, D.L., Bonin, M., Reuther, J., Wohlleben, W., Smith, M.C.M., Burroughs, N.J., Martín, J.F., Hodgson, D.A., Takano, E., Breitling, R., Ellingsen, T.E., Wellington, E.M.H. The dynamic architecture of the metabolic switch in Streptomyces coelicolor. BMC Genomics (2010), 11:10 (Highly Accessed Paper).
    • Savage, R. S., Heller, K., Xu, Y., Ghahramani, Z., Truman, W.M., Grant, M.R., Denby, K.J. and Wild, D.L. R/BHC: Fast Bayesian Hierarchical Clustering for Microarray Data, BMC Bioinformatics, 10:242, (2009) (Highly Accessed Paper).
    • Cooke, E.J., Savage R.S. and Wild, D.L. Computational approaches to the integration of gene expression, ChIP-chip and sequence data in the inference of gene regulatory networks. Seminars in Cell and Developmental Biology, 20: 863–868, (2009) (Invited Review).
    • Podtelezhnikov, A.A. and Wild, D.L. Reconstruction and stability of secondary structure elements in the context of protein structure prediction. Biophysical Journal, 96:4399-4408, (2009).
    • Stegle, O., Denby, K.J., Wild, D.L., Ghahramani, Z. and Borgwardt, K.M. A Robust Bayesian Two-Sample Test for Detecting Intervals of Differential Gene Expression in Microarray Time Series, Research in Computational Molecular Biology, Lecture Notes in Bioinformatics, Ed. S. Batzoglou, Springer, Berlin, 5541, 201-216, (2009).
    • Rasmussen, C.E., de la Cruz, B.J., Ghahramani, Z., Wild, D.L. Modeling and Visualizing Uncertainty in Gene Expression Clusters using Dirichlet Process Mixtures. IEEE/ACM Transactions on Computational Biology & Bioinformatics, 6: 615-628, (2009).
    • Saqi, M., Dobson, R.J.B., Kraben, P., Hodgson, D.A., and Wild, D.L. An approach to pathway reconstruction using whole genome metabolic models and sensitive sequence searching. Journal of Integrative Bioinformatics, 6(1):107, (2009).
    • Podtelezhnikov A.A. and Wild D.L. CRANKITE: A Fast Polypeptide Backbone Conformation Sampler. Source Code for Biology and Medicine (2008), 3:12.
    • Podtelezhnikov A.A. and Wild D.L. Comment on: Efficient Monte Carlo trial moves for polypeptide simulations. J. Chem. Phys. (2008), 129(2):027103.

    Page contact: David Wild. Last revised: Fri 18 May 2012.