
Research Program (December 2004)

My research program focuses on understanding protein structures. I am interested in characterizing their shapes and in using this information to improve our understanding of their stability (ProShape). I am also interested in characterizing the subset of sequence space compatible with a protein structure; this is an indirect approach to understanding protein sequence evolution (ProDesign). In parallel, I am involved in the development of new algorithms for predicting the structure of a protein from its sequence (ProModel).

If you are interested in reading more about the techniques used in these projects, please see my Bio eBook.


ProShape: Understanding the shapes of protein structures

It is widely believed that the geometry of a protein plays a crucial role in its folding process as well as in its interactions with other biomolecules or small ligands. Yet geometrical methods are still uncommon in computational biology, because several representational and algorithmic issues remain unresolved. In 2000, I joined an NSF-funded ITR project on computational geometry for structural biology and bioinformatics, or bio-geometry for short. ProShape summarizes my efforts within this project, in collaboration with Michael Levitt from Stanford University and Herbert Edelsbrunner from Duke University. ProShape is an ongoing effort whose aim is to develop new computational techniques for simulating, analyzing and visualizing protein structures. It currently focuses on computing the accessible surface area and the volume of a molecule, as well as their derivatives, and on applying these quantities to the design of new implicit solvent models. Read more...

ProDesign: Characterizing the sequence space compatible with a protein structure

The sequences of naturally occurring proteins are shaped by evolutionary selective pressure, which is controlled by a fine balance of function, stability and kinetics. While most random mutations of sequences are unlikely to enhance stability or function, they can be accepted by natural selection as long as they are neutral (or nearly neutral). As a consequence, the size of the sequence space compatible with a given protein fold is usually very large. Interestingly, the size of this sequence space is found to vary greatly from fold to fold. A large number of folds have a single representative, whereas other folds, such as the TIM fold or the Ig fold, have hundreds of representatives in the PDB. The question arises whether these differences are a consequence of variations in function, in stability, in evolution, or in all three. My approach to answering this question is based on computational protein design. It has led to the development of a full research program, ProDesign. Read more...

ProModel: Automatic comparative protein structure modelling

It has been hypothesized that the total number of different protein folds is finite, and roughly of the order of 1000. The fact that the protein structure space is finite, and much smaller than the protein sequence space, has given rise to the hope that representative structure models can be obtained for all protein sequences without resorting to the expensive procedures of systematic experimental structure determination. Structural genomics is currently focusing on the construction of an extensive library of folds, and a figure of 10,000 to 100,000 representative proteins has been proposed. With such a library, it is expected that models for all proteins can be constructed. The success of this approach, however, depends strongly on our ability to identify a proper structural template for a protein of interest, and to build an accurate model for this protein based on the template. Techniques that address the identification step, or "fold recognition" problem, rely on the assumption that similarities between the sequences of two proteins imply similarities between the structures of these proteins. The building step is usually referred to as "homology modelling", or "comparative protein structure modelling". I have been actively working on the homology modelling problem with Marc Delarue (Institut Pasteur, Paris, France) and Michael Levitt (Stanford University). Former results and current developments are summarized under the project ProModel. Read more...