University of Warwick

    Molecular Organisation and Assembly in Cells


    Peter's Python Programming Pages

    Python is a free programming language that runs on Windows, Linux and MacOS. There is a Beginners' Guide, and I have found searching Google for specific questions very handy.

    Biopython

    Biopython is an add-on module which provides support for a lot of bioinformatics work:

    • Dealing with numerous forms of biological data (including FASTA and GenBank sequence files, and PDB structure files).
    • Handy functions like translating nucleotide sequences into amino acids.
    • Calling (standalone) NCBI BLAST to search for sequence matches.
    • Calling (standalone) ClustalW to do alignments.
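
    To give a flavour of what translating nucleotides into amino acids involves, here is a minimal pure-Python sketch. It is not Biopython's own code: the names BASES, AMINO_ACIDS, CODON_TABLE and translate are made up for this illustration, and it assumes the standard genetic code (NCBI translation table 1) and a plain DNA or RNA string. Biopython's own translation support also covers other genetic codes.

```python
# Illustrative sketch only, not Biopython's implementation.
# Standard genetic code (NCBI translation table 1), with codons
# enumerated in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each three-letter codon to its one-letter amino acid ('*' is a stop).
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(nucleotides):
    """Translate a nucleotide string codon by codon, starting at the first base.

    Unrecognised codons (for example those containing N) become 'X', and any
    trailing bases that do not make up a full codon are ignored.
    """
    seq = nucleotides.upper().replace("U", "T")  # accept RNA as well as DNA
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


if __name__ == "__main__":
    print(translate("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"))
```

    The lookup table is built once from the two strings, so translating a sequence is just one dictionary lookup per codon.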

    I started using Biopython in 2004 for my PhD, and then began contributing to the project. By the end of my PhD I was one of the core developers and was lead author on the Biopython application note publication. Why not have a look at the Biopython Tutorial and Cookbook, or some of my examples?

    On a related note, Thomas Mailund's Newick Tree module is very handy for dealing with phylogenetic trees. The Newick file format is used by PHYLIP, TREE-PUZZLE, PROTML, and several other programs including Clustal. Biopython 1.30 did not include any code for dealing with tree files, but check out the Nexus module by Frank Kauff and Cymon Cox, included from Biopython 1.40b onwards, which is a nice alternative.

    The Molecular Modelling Toolkit (MMTK)

    The Molecular Modelling Toolkit (MMTK) is a Python library providing a range of tools for molecular simulations (the numerically intensive parts are actually written in C for speed).

    MMTK also has visualisation capabilities, including the ability to use Visual Python or VMD for output. I wrote an example, now included in MMTK, which loads a PDB file of insulin and displays it using a space-filling model.

    I did some work with MMTK for my second MOAC MSc mini-project, and wrote a good chunk of the Windows installation instructions for MMTK.

    RPy (R from Python)

    R is a language and environment for statistical computing and graphics, available for free from The R Project. R's rich libraries for statistics and graph creation can be called from within a Python program using RPy (R from Python), which is used in several of my examples below.

    See also my R programming pages.

    Examples

    Here are some Python examples I have written and chosen to share:

    • Sudoku Solver
    • Using FASTA nucleotide files to calculate GC percentages
    • Downloading (Bacterial) Genomes with an FTP script
    • Parsing (reading) GenBank files
    • Converting GenBank files into FASTA files
    • Running RPS-BLAST and parsing the output
    • Using Python (and R) to calculate Linear Regressions
    • Using Python (and R) to calculate Rank Correlations
    • Using Python (and R) to draw heatmaps from microarray data
    • Protein Superposition (3D alignment) using SVD in BioPython
    • Protein visualisation using MMTK
    • Ramachandran Plots for proteins:
      • Calculating (ϕ,ψ) angles from a PDB file using MMTK
      • Drawing a Ramachandran Plot with Python and R
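
    As a taste of the FASTA example in the list above, here is a rough pure-Python sketch of calculating GC percentages. It is not the code from the linked page: the function names read_fasta and gc_percentage and the two short sequences are invented for illustration. It assumes the GC percentage is taken over the unambiguous bases A, C, G and T only.

```python
# Illustrative sketch only; names and example data are hypothetical.

def read_fasta(lines):
    """Yield (title, sequence) pairs from an iterable of FASTA-format lines."""
    title, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new record starts; hand back the previous one, if any.
            if title is not None:
                yield title, "".join(chunks)
            title, chunks = line[1:], []
        elif line and title is not None:
            chunks.append(line)
    if title is not None:
        yield title, "".join(chunks)


def gc_percentage(sequence):
    """Return the percentage of G and C among the A, C, G and T bases."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0  # avoid dividing by zero for empty or all-ambiguous input
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


# Two made-up records, split over several lines as FASTA files often are.
example = """>seq1 made-up example
ATGCGC
GGTA
>seq2 made-up example
ATATAT
""".splitlines()

if __name__ == "__main__":
    for title, seq in read_fasta(example):
        print(title, gc_percentage(seq))
```

    To run this on a real file, an open file handle can be passed to read_fasta in place of the example list, since it only needs something that yields lines.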


    MOAC DTC, Coventry House, University of Warwick, Gibbet Hill Road, Coventry, CV4 7AL
    Tel. 024 765 75808 moac2 at warwick dot ac dot uk

    Page contact: Peter Cock Last revised: Tue 12 May 2009