
1. HM introduction

What is Comparative Protein Structure Modelling?


1.1 Why Do We Need Comparative Protein Structure Modelling?

It has been hypothesized that the total number of different protein folds is finite, and on the order of 1,000. Because the protein structure space is finite, and much smaller than the protein sequence space, there is hope that representative structure models can be obtained for all protein sequences without resorting to expensive, systematic experimental structure determination. Structural genomics is currently focusing on the construction of an extensive library of folds, and a figure of 10,000 to 100,000 representative proteins has been proposed. With such a library, it is expected that models for all proteins can be constructed.

Struct_genom.jpg

The aim of structural genomics is to generate enough protein structures
that any unknown protein will have a close homolog whose structure is known.
The success of this approach depends on the number of protein structures (i.e.
the number of red dots), as well as on the quality of the tools for detecting
structural homologs (defined schematically by the radii of the circles).

The success of this approach is, however, strongly correlated with our ability to identify a proper structural template for a protein of interest, and to build an accurate model for this protein based on the template. Techniques to solve the identification step, or "fold recognition" problem, rely on the assumption that similarities between the sequences of two proteins imply similarities between the structures of these proteins. The building step is usually referred to as "homology modelling", or "comparative protein structure modelling". ProModel was designed as a method for solving this problem, and is described in detail below.

1.2 What is Comparative Protein Structure Modelling?

Comparative protein structure modelling usually proceeds in four steps:

(a) Fold recognition.

Firstly, a template protein structure is identified as a plausible model for the protein sequence of interest. This step is usually referred to as fold recognition, and relies on sequence matching [5,6], and/or threading techniques in which the sequence is tested against a library of protein folds.
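As an illustration of the sequence-matching route to fold recognition, the sketch below scores a target sequence against a small library of template sequences with a global (Needleman-Wunsch) alignment score and ranks the templates. It is not ProModel's method: the scoring values (match, mismatch, gap), the function names, and the toy library are all assumptions chosen for illustration, and a real search would use a substitution matrix and affine gap penalties.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score (Needleman-Wunsch) with a linear gap penalty.

    The scoring values are illustrative assumptions, not tuned parameters.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def rank_templates(target, library):
    """Return (name, score) pairs sorted from best to worst score."""
    scored = [(name, nw_score(target, seq)) for name, seq in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy library: names and sequences are made up.
library = {
    "tmplA": "MKTAYIAKQR",
    "tmplB": "GSSGSSGLVP",
    "tmplC": "MKTAYLAKQR",
}
target = "MKTAYIAKQR"
ranking = rank_templates(target, library)
for name, score in ranking:
    print(name, score)
```

The top-ranked entry would be the candidate template. Threading methods replace this sequence-only score with one that tests the sequence against the structure of each fold.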

(b) Building the 3D model.

In most cases, the template protein only provides an incomplete framework for building a 3-dimensional model for the protein of interest (the "target"). This framework consists of pieces of protein backbone corresponding to the conserved regions in the alignment of the template and target sequences. The second step of comparative modelling is to fill the gaps in the framework (this is usually referred to as "loop building"), and to predict the conformation of the sidechains of the target protein on the completed backbone.
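The way an alignment defines the framework and its gaps can be sketched as follows. The function walks a pairwise alignment column by column and groups the columns into framework blocks, insertions (target residues with no template counterpart, which need loop building), and deletions (template residues absent from the target). This is a simplified sketch: it assumes '-' marks a gap, treats every aligned residue pair as framework regardless of residue identity, and uses hypothetical sequences and a hypothetical function name.

```python
def framework_segments(target_aln, template_aln):
    """Split a pairwise alignment into framework blocks and gaps.

    Returns (kind, start, end) tuples over alignment columns, end exclusive.
    kind is 'framework', 'insertion', or 'deletion' (relative to the target).
    """
    if len(target_aln) != len(template_aln):
        raise ValueError("aligned sequences must have equal length")
    segments = []
    for col, (t, s) in enumerate(zip(target_aln, template_aln)):
        if t == "-" and s == "-":
            raise ValueError(f"column {col} is a gap in both sequences")
        if t != "-" and s != "-":
            kind = "framework"   # template backbone can be copied here
        elif s == "-":
            kind = "insertion"   # target residues without template coordinates
        else:
            kind = "deletion"    # template residues missing from the target
        if segments and segments[-1][0] == kind:
            # Extend the current run of identical column types.
            segments[-1] = (kind, segments[-1][1], col + 1)
        else:
            segments.append((kind, col, col + 1))
    return segments


# Hypothetical alignment, for illustration only.
target_aln = "MKT--AYIAKQRGG"
template_aln = "MKTLLAYI--QRGG"
segments = framework_segments(target_aln, template_aln)
for kind, start, end in segments:
    print(kind, start, end)
```

Only the 'framework' blocks inherit backbone coordinates from the template. The insertions and the joins around deletions are the places where loop building has to supply the backbone.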

(c) Refinement.

The structural model generated after step (b) is only as good as the sequence alignment it was built from. This model can be further refined using energy minimization, either with molecular mechanics or molecular dynamics programs.
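To give a feel for what energy minimization does, here is a toy sketch: steepest descent on a chain of points connected by harmonic "bonds". It is a deliberately minimal stand-in for a molecular mechanics program, not a real force field. The energy function, the rest length of 3.8 (roughly the distance between consecutive C-alpha atoms, in angstroms), the step size, and the starting coordinates are all assumptions made for this example.

```python
import math


def bond_energy(coords, r0=3.8, k=1.0):
    """Sum of harmonic terms k * (d - r0)**2 over consecutive points."""
    return sum(k * (math.dist(p, q) - r0) ** 2
               for p, q in zip(coords, coords[1:]))


def minimize(coords, steps=200, lr=0.05, r0=3.8, k=1.0):
    """Steepest descent on bond_energy; returns the relaxed coordinates."""
    coords = [list(p) for p in coords]
    for _ in range(steps):
        grads = [[0.0, 0.0, 0.0] for _ in coords]
        for i in range(len(coords) - 1):
            p, q = coords[i], coords[i + 1]
            d = math.dist(p, q)
            if d == 0.0:
                continue  # direction undefined for coincident points
            # Gradient of k*(d - r0)^2 with respect to p is 2k(d - r0)(p - q)/d.
            coef = 2.0 * k * (d - r0) / d
            for axis in range(3):
                g = coef * (p[axis] - q[axis])
                grads[i][axis] += g
                grads[i + 1][axis] -= g
        # Move every point a small step against its gradient.
        for p, g in zip(coords, grads):
            for axis in range(3):
                p[axis] -= lr * g[axis]
    return coords


# Hypothetical distorted starting geometry.
start = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (5.0, 2.0, 0.0), (9.0, 2.0, 1.0)]
relaxed = minimize(start)
print(bond_energy(start), bond_energy(relaxed))
```

Real refinement programs follow the same idea with a much richer energy function (angles, torsions, non-bonded terms) and more robust optimizers.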

(d) Assessment of the models.

After refinement, the quality of the final model is assessed using standard energy functions and/or manual visual inspection with a molecular graphics program.

promodel_sketch.jpg
Comparative Protein Structure Modelling: Building the structural model:
- The alignment between the template and target sequences defines conserved regions between the two proteins, from which a framework for the target protein is built. Gaps in the framework correspond to insertions and deletions.
- A structural model for the target protein is built starting from the framework.

1.3 Useful links


In collaboration with Michael Levitt from Stanford and Marc Delarue from Institut Pasteur, Paris, I have developed a suite of programs, ProModel, for homology modelling. Information on ProModel can be found here. ProModel is one among many programs written for solving the comparative protein structure modelling problem. We provide links to other available programs and/or web services. This list is by no means exhaustive.



MODELLER       http://salilab.org/modeller/
3D-Jigsaw      http://www.bmm.icnet.uk/servers/3djigsaw/
SDSC1
SCWRL          http://dunbrack.fccc.edu/SCWRL3.php
SWISS-MODEL    http://www.expasy.ch/swissmod/SWISS-MODEL.html
ESyPred3D      http://www.fundp.ac.be/urbm/bioinfo/esypred