SCORING FUNCTION FOR PROTEIN STRUCTURE EVALUATION
ProSEE (Protein Structure Energy Evaluation)

Protein structure prediction via comparative modeling or ab initio approaches involves generating many three-dimensional, atomic-level models (conformations) for a protein sequence, followed by rapid evaluation of these models to locate the most native-like conformation, which is characterized by the lowest free energy. This evaluation is carried out with all-atom, energy-based empirical scoring functions that approximate the conformational free energy of a protein. Such a scoring function must balance simplicity and speed against accuracy. Devising a scoring function that mimics a free energy function and distinguishes correct (native or native-like) structures from incorrect ones is a challenging task, and it has prompted intense efforts to devise better and more efficient scoring functions.

Scoring functions can be knowledge (statistics) based, physics based, or hybrids of the two. Knowledge-based scoring functions are derived from experimentally determined protein structures and are limited only by the training data set. Physics-based scoring functions, on the other hand, are derived from force fields that model the energy landscape of the protein. Over the past few years, empirical potential functions and protocols have steadily improved in their ability to identify the native structure as the lowest-energy structure, yet an energy function that achieves 100% discrimination between the native structure and the decoys has not been designed.

Our group at IIT Delhi has devised an all-atom empirical energy function that combines second-generation force field parameters with a hydrophobicity function. The scoring function considers the non-bonded energy of a protein, expressed as a sum of three terms: electrostatics, van der Waals and hydrophobicity.
E_Total = ∑ ( E_el + E_vdw + E_hpb )
E_el is the electrostatic contribution to the energy, E_vdw is the van der Waals term, E_hpb is the hydrophobic contribution, and the summation runs over all the atoms of the protein [1-4].
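
The three-term sum above can be illustrated with a minimal Python sketch. The functional forms used here are textbook assumptions, not the published ProSEE terms: a Coulomb term with a unit dielectric constant, a 12-6 Lennard-Jones term with a geometric-mean well depth, and a hydrophobic term left as a user-supplied function because its form is not given on this page. The names `Atom`, `total_energy`, `COULOMB_K` and `no_hydrophobic` are hypothetical, and the sketch sums over all atom pairs without excluding bonded neighbours.

```python
import math
from typing import Callable, Dict, NamedTuple, Sequence

# Conventional factor converting e^2/Angstrom to kcal/mol.
COULOMB_K = 332.06


class Atom(NamedTuple):
    """Hypothetical atom record; field names are illustrative only."""
    x: float
    y: float
    z: float
    charge: float     # partial charge, in units of e
    epsilon: float    # Lennard-Jones well depth, kcal/mol
    rmin_half: float  # half of the minimum-energy distance, Angstrom


def distance(a: Atom, b: Atom) -> float:
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))


def electrostatic(a: Atom, b: Atom, r: float) -> float:
    # Assumption: plain Coulomb law with a dielectric constant of 1.
    return COULOMB_K * a.charge * b.charge / r


def van_der_waals(a: Atom, b: Atom, r: float) -> float:
    # Assumption: 12-6 Lennard-Jones with geometric-mean epsilon.
    eps = math.sqrt(a.epsilon * b.epsilon)
    ratio6 = ((a.rmin_half + b.rmin_half) / r) ** 6
    return eps * (ratio6 ** 2 - 2.0 * ratio6)


def no_hydrophobic(a: Atom, b: Atom, r: float) -> float:
    # Placeholder: the hydrophobicity function is not specified here.
    return 0.0


def total_energy(
    atoms: Sequence[Atom],
    hydrophobic: Callable[[Atom, Atom, float], float] = no_hydrophobic,
) -> Dict[str, float]:
    """Sum the three non-bonded terms over all unique atom pairs."""
    e_el = e_vdw = e_hpb = 0.0
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            r = distance(atoms[i], atoms[j])
            e_el += electrostatic(atoms[i], atoms[j], r)
            e_vdw += van_der_waals(atoms[i], atoms[j], r)
            e_hpb += hydrophobic(atoms[i], atoms[j], r)
    return {"el": e_el, "vdw": e_vdw, "hpb": e_hpb,
            "total": e_el + e_vdw + e_hpb}
```

Keeping the hydrophobic term as a pluggable function makes the structure of the sum explicit without guessing at the published parameterization described in [1-4].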

SCORING FUNCTION
Upload the file in the given format [Sample File]

Instructions for using the Tool

  1. The input file must be an energy-minimized, all-atom protein structure in PDB format. The file should follow the format described in the README; otherwise erroneous values will be generated. Currently the program works only for monomeric globular proteins.
  2. The program reports the electrostatic, van der Waals and hydrophobic components, as well as the total energy of the protein, in kcal/mol.
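
Before uploading, it may help to check that a file looks like an all-atom, single-chain structure. The sketch below is a hypothetical helper (`summarize_pdb` is not part of the tool) that relies only on the standard fixed-column PDB layout: chain identifier in column 22, atom name in columns 13-16 and element symbol in columns 77-78. It does not check the specific requirements of the README.

```python
from typing import Dict, Iterable


def summarize_pdb(lines: Iterable[str]) -> Dict[str, object]:
    """Count atoms, hydrogens and chains in the ATOM records of a PDB file."""
    chains = set()
    n_atoms = 0
    n_hydrogens = 0
    has_hetatm = False
    for line in lines:
        record = line[:6].strip()
        if record == "HETATM":
            has_hetatm = True  # ligands, ions or waters may be present
        if record != "ATOM":
            continue
        n_atoms += 1
        chains.add(line[21:22] or " ")
        # Prefer the element column; fall back to the atom name when blank.
        element = line[76:78].strip().upper()
        if not element:
            name = line[12:16].strip().lstrip("0123456789")
            element = name[:1].upper()
        if element == "H":
            n_hydrogens += 1
    return {
        "atoms": n_atoms,
        "hydrogens": n_hydrogens,
        "chains": sorted(chains),
        "single_chain": len(chains) == 1,
        "has_hetatm": has_hetatm,
    }
```

A typical use would be `summarize_pdb(open("model.pdb"))`, where `model.pdb` is a placeholder file name. A result with zero hydrogens or more than one chain suggests that the file does not meet the all-atom, monomeric requirement stated above.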

Validation of Scoring Function

The empirical scoring function has been validated on 69 proteins and their 61974 decoys, drawn from twelve decoy sets, some publicly available and some built in-house. Only proteins whose native structure was determined by X-ray crystallography and which are free of metal ions and prosthetic groups were selected. Proteins that are fragments or multimers, or that show mismatches in the number of atoms between the native structure and the decoys, were excluded. Hydrogen atoms were added to each structure, which was then energy minimized before its energy was calculated with the scoring function [4]. The energy function was also tested on homology-built models to assess its general applicability. The scoring function presented here can thus be used in conjunction with protein structure prediction methodologies, such as ab initio or comparative modeling, for bracketing native-like structures. The minimized all-atom decoy sets are listed below.

| S.No. | Decoy set | References | Number of sequences studied | Number of decoy structures investigated |
|-------|-----------|------------|-----------------------------|-----------------------------------------|
| 1 | EMBL | Holm & Sander, J Mol Biol, 225, 93-105 (1992). http://prostar.carb.nist.gov | 7 | 10 |
| 2 | CASP1 | Collection of structures for given targets in CASP1. http://predictioncenter.llnl.gov/download_area/ | 2 | 12 |
| 3 | 4state_reduced | Park & Levitt, J Mol Biol, 258, 367-392 (1996). http://dd.stanford.edu/ | 5 | 3326 |
| 4 | Lattice_fit | Samudrala & Moult, J Mol Biol, 275, 893-913 (1998). http://dd.stanford.edu/ | 4 | 8000 |
| 5 | Lmds | Keasar & Levitt, J Mol Biol, 329, 159-174 (2003). http://dd.stanford.edu/ | 6 | 2634 |
| 6 | Fisa | Simons et al., J Mol Biol, 268, 209-225 (1997). http://dd.stanford.edu/ | 2 | 1000 |
| 7 | Fisa_CASP3 | Simons et al., J Mol Biol, 268, 209-225 (1997). http://dd.stanford.edu/ | 3 | 3098 |
| 8 | Hg_structal | Samudrala et al., unpublished work. http://dd.stanford.edu/ | 2 | 58 |
| 9 | Semfold | Samudrala & Levitt, BMC Struct Biol, 2, 3-10 (2002). http://dd.stanford.edu/ | 2 | 22667 |
| 10 | Rosetta | Simons et al., Proteins: Struct Funct Genet, Suppl 3, 171-176 (1999). http://depts.washington.edu/bakerpg | 21 | 20484 |
| 11 | CASP5 | Collection of structures for given targets in CASP5. http://moult.carb.nist.gov/ | 7 | 614 |
| 12 | Homology | http://scfbio-iitd.res.in/decoys/Homology/ | 8 | 71 |
| | Total | | 69 | 61974 |
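
For a single decoy set, discrimination of the native structure can be summarized in a few lines of Python. The two measures below, the rank of the native structure among the decoys and its Z-score relative to the decoy energy distribution, are commonly used for this purpose and are assumed here for illustration; they are not necessarily the measures reported in [4]. The function names are hypothetical.

```python
import statistics
from typing import Sequence


def native_rank(native_energy: float, decoy_energies: Sequence[float]) -> int:
    """Rank of the native structure; 1 means no decoy has a lower energy."""
    return 1 + sum(1 for e in decoy_energies if e < native_energy)


def native_z_score(native_energy: float, decoy_energies: Sequence[float]) -> float:
    """Native energy in units of decoy standard deviations from the decoy mean.

    A large negative value means the native structure is well separated
    from the decoys on the low-energy side.
    """
    mean = statistics.fmean(decoy_energies)
    spread = statistics.pstdev(decoy_energies)
    if spread == 0.0:
        raise ValueError("decoy energies have zero spread")
    return (native_energy - mean) / spread
```

Under this convention, a scoring function discriminates perfectly on a decoy set when `native_rank` returns 1 for every protein in the set.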

REFERENCES
  1. N. Arora and B. Jayaram, J. Phys. Chem. 102, 6139-6144 (1998).
  2. N. Arora and B. Jayaram, J. Comp. Chem. 18, 1245-1252 (1997).
  3. M. A. Young, B. Jayaram and D. L. Beveridge, J. Phys. Chem. 102, 7666-7669 (1998).
  4. P. Narang, K. Bhushan, S. Bose and B. Jayaram, Protein structure evaluation using an all-atom energy based empirical scoring function, J. Biomol. Struct. Dyn. 23, 385-406 (2006).

SUGGESTIONS AND COMMENTS

Please send your suggestions and comments to scfbio@scfbio-iitd.res.in