1. HM introduction
What is comparative protein structure modelling?
1.1 Why Do We Need Comparative Protein Structure Modelling?
It has been hypothesized that the total number of different protein folds is finite, and roughly on the order of 1,000. The fact that protein structure space is finite, and much smaller than protein sequence space, has given rise to the hope that representative structure models can be obtained for all protein sequences without resorting to expensive, systematic experimental structure determination. Structural genomics is currently focusing on the construction of an extensive library of folds, and a figure of 10,000 to 100,000 representative proteins has been proposed. With such a library, it is expected that models for all proteins can be constructed.
1.2 What is Comparative Protein Structure Modelling?
Comparative protein structure modelling usually proceeds in four steps:
(a) Fold recognition. First, a template protein structure is identified as a plausible model for the protein sequence of interest. This step is usually referred to as fold recognition, and relies on sequence matching [5,6] and/or threading techniques, in which the sequence is tested against a library of protein folds.
(b) Building the 3D model. In most cases, the template protein provides only an incomplete framework for building a 3-dimensional model of the protein of interest (the "target"). This framework consists of pieces of protein backbone corresponding to the conserved regions in the alignment of the template and target sequences. The second step of comparative modelling is to fill the gaps in the framework (this is usually referred to as "loop building"), and to predict the conformation of the sidechains of the target protein on the completed backbone.
(c) Refinement. The structural model generated in step (b) is only as accurate as the sequence alignment allows. This model can be further refined using energy minimization, with either molecular mechanics or molecular dynamics programs.
(d) Assessment of the models. After refinement, the quality of the final model is assessed using standard energy functions and/or manual visual inspection with a molecular graphics program.
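As a rough illustration of how steps (a) and (b) connect, the Python sketch below takes a target-template alignment that is assumed to be already computed (gaps written as "-"), scores it by percent identity, and splits the target residues into framework segments (aligned to a template residue) and loops (facing a gap in the template, so they would need loop building). The function names, the alignment strings, and the use of percent identity as the matching score are all illustrative assumptions; this is not how ProModel or any particular program does it.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # half-open range of 0-based target residue indices


def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over columns where both sequences have a residue."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(t, s) for t, s in zip(aligned_target, aligned_template)
             if t != "-" and s != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for t, s in pairs if t == s)
    return 100.0 * matches / len(pairs)


def split_framework_and_loops(
    aligned_target: str, aligned_template: str
) -> Tuple[List[Segment], List[Segment]]:
    """Label target residues as framework (template residue available)
    or loop (gap in the template, coordinates must be built)."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    segments = {"framework": [], "loop": []}
    pos = 0          # index of the next target residue
    kind = None      # kind of the segment currently being extended
    start = 0        # first target residue of the current segment
    for t, s in zip(aligned_target, aligned_template):
        if t == "-":
            # Template residue with no target counterpart: the backbone
            # breaks here, so close the current segment.
            if kind is not None:
                segments[kind].append((start, pos))
                kind = None
            continue
        new_kind = "loop" if s == "-" else "framework"
        if new_kind != kind:
            if kind is not None:
                segments[kind].append((start, pos))
            kind, start = new_kind, pos
        pos += 1
    if kind is not None:
        segments[kind].append((start, pos))
    return segments["framework"], segments["loop"]


# Hypothetical alignment, made up for this example.
target_aln = "MKT-AYIAKQRGG"
template_aln = "MKTLAFI--QRGG"

identity = percent_identity(target_aln, template_aln)
framework, loops = split_framework_and_loops(target_aln, template_aln)
print(f"identity: {identity:.1f}%")
print("framework segments:", framework)
print("loops to build:", loops)
```

In this toy alignment the two target residues facing the template gap form the only loop, and the single-residue deletion in the target splits the framework into separate segments. In a real modelling run the alignment would come from a sequence-matching or threading program, and the segments would be used to copy backbone coordinates from the template.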
1.3 Useful links
In collaboration with Michael Levitt from Stanford and Marc Delarue from Institut Pasteur, Paris, I have developed a suite of programs, ProModel, for homology modelling. Information on ProModel can be found here. ProModel is one among many programs written for solving the comparative protein structure modelling problem. We provide links to other available programs and/or web services. This list is by no means exhaustive.