G2H: Chikungunya Database

     G2H is a novel, rapid approach for identifying hit molecules starting from a genome sequence. Chikungunya virus (CHIKV) is used as the case study. The study was carried out using SCFBio tools (www.scfbio-iitd.res.in).

Why CHIKV?
     Chikungunya (CHIK) is a viral disease caused by CHIKV. Its symptoms closely resemble those of dengue fever and are characterized by high fever with or without rashes, vomiting, nausea and mild to severe arthralgia/arthritis. The joint pain usually persists for days to months, and in some cases for years. It is one of the most important re-emerging infectious diseases, spreading globally at sporadic intervals. India has been the country worst affected by the re-emergence of CHIKV, which came after a gap of 32 years. Moreover, CHIKV is reported as a biosafety level 3 (BSL3) pathogen and was placed in Category C by the National Institute of Allergy and Infectious Diseases (NIAID) in 2008. Despite this, no approved drug or vaccine is currently available in the public domain.

GENOME TO HITS (G2H)
     In response to this growing global disease burden, the need for an antiviral drug specific to CHIKV has become urgent. The 'G2H in silico' methodology draws on the various databases available and predicts hits from the genome in an automated mode. This in silico approach also helps biologists and chemists work faster by reducing cost and time. The pathway involves several challenges: (i) accurate genome annotation, (ii) accurate tertiary structure prediction of proteins, (iii) active site identification, (iv) hit molecule identification, (v) docking and scoring of identified hits, and (vi) optimization of hits to leads to achieve selectivity and synthesizability with low toxicity. The steps carried out in this study are described below.

(i) Genome annotation: Genome annotation plays a vital role in finding potential therapeutic targets in pathogens. Chemgenome 3.0, an ab initio approach based on a physico-chemical model, was used to produce and interpret structural annotations for the CHIKV genome [1]. It predicted two genes, in 100% agreement with experiment in this case. These nucleotide sequences are translated to protein sequences, and the results can be retrieved from here. The proteins in CHIKV are polyproteins. Because no computational tools are available to date for polyprotein cleavage, the individual proteins were separated manually at the post-translational cleavage sites reported in the literature. Chemgenome 3.0 is available for free use at http://www.scfbio-iitd.res.in/chemgenome/chemgenome3.jsp.
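
The translation and manual cleavage steps described above can be sketched as follows. This is a minimal illustration, not the Chemgenome 3.0 implementation: the function names `translate` and `cleave` are hypothetical, the standard genetic code is assumed, and the cleavage positions passed to `cleave` would have to come from the literature.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    orf = orf.upper().replace("U", "T")  # accept RNA-alphabet input as well
    protein = []
    for i in range(0, len(orf) - len(orf) % 3, 3):
        aa = CODON_TABLE.get(orf[i:i + 3], "X")  # X marks an ambiguous codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def cleave(polyprotein: str, cut_after: list[int]) -> list[str]:
    """Split a polyprotein after the given 1-based residue positions
    (hypothetical input: positions taken from the literature)."""
    bounds = [0, *sorted(cut_after), len(polyprotein)]
    return [polyprotein[a:b] for a, b in zip(bounds, bounds[1:])]
```

For example, `translate("ATGGCCTAA")` yields a two-residue peptide, and `cleave(seq, [3, 5])` splits `seq` into three fragments after residues 3 and 5.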

(ii) Tertiary structure prediction of proteins: The sequences extracted from Chemgenome 3.0 served as input to the Bhageerath-H server, a tertiary structure prediction software available at http://www.scfbio-iitd.res.in/bhageerath/bhageerath_h.jsp [2]. Only the nonstructural proteins (nsPs) were considered for structure prediction, given the server's accuracy for small globular proteins. Bhageerath-H is a hybrid ab initio/homology method; the top five structures it identifies are considered plausible candidates for the native structure and are carried forward for further study. The structures generated are available here.

(iii) Active site detection: Given the structural information, active site detection is a necessary step in the overall process. For this, AADS (Automated Active Site Docking and Scoring), an automated active site finder, is used. It predicts the potential binding site(s) and then docks the selected molecule into the top ten cavities in an automated mode [3]. It requires the protein sequence information and detects the ten sites with 100% accuracy. The results of AADS can be downloaded from here. The AADS website is accessible at http://www.scfbio-iitd.res.in/dock/ActiveSite_new.jsp.

(iv) Hit identification: Once the active site is known, hit molecules can be predicted using RASPD (Rapid Screening of Preliminary Drugs), available at http://www.scfbio-iitd.res.in/software/drugdesign/raspd.jsp [4]. The software is designed in the spirit of structure-based drug design and screens its molecular library based on various physico-chemical parameters of the proteins and drugs. The user can choose between the incorporated libraries: a million-compound library and a natural product library. All ten cavities identified are processed for rapid screening in search of probable hits. The cutoff binding energy is set to -8.00 kcal/mol, and the top 100 molecules screened can be obtained from here. ZINC IDs can be used to retrieve the molecules from the ZINC database [5]. The Wiener index, whose values correspond to the volume, is chosen as a search criterion. Hits are identified by scanning binding energies from the stated cutoff down to the lowest value achieved.
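
Two ideas in this step can be sketched in a few lines: the Wiener index (the sum of shortest-path distances between all pairs of atoms in the hydrogen-suppressed molecular graph) and the selection of hits by an energy cutoff. This is an illustrative sketch only, not RASPD's code. The function names, the adjacency-list input format, and the choice to treat the cutoff as inclusive are assumptions.

```python
from collections import deque


def wiener_index(adjacency: dict[int, list[int]]) -> int:
    """Sum of shortest-path lengths over all unordered pairs of atoms."""
    total = 0
    for source in adjacency:
        # Breadth-first search gives bond-count distances from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in adjacency[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
    return total // 2  # every pair was counted once from each end


def select_hits(scored, cutoff=-8.0, top_n=100):
    """Keep (molecule, energy) pairs at or below the cutoff (kcal/mol),
    most negative first. Treating the cutoff as inclusive is an assumption."""
    passing = [(mol, energy) for mol, energy in scored if energy <= cutoff]
    return sorted(passing, key=lambda pair: pair[1])[:top_n]
```

As a check on the definition, the carbon skeleton of n-butane (a four-atom chain) has a Wiener index of 10, while the more compact isobutane skeleton has a Wiener index of 9.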

(v) Docking and Scoring of identified molecules: To assess their efficacy, the identified hits are further processed for docking and scoring using the Sanjeevini suite [6], which uses ParDOCK as the docking tool. ParDOCK uses the 'BAPPL' scoring function to predict the binding free energies of protein-ligand complexes. ParDOCK is an all-atom energy-based Monte Carlo algorithm that generates random configurations of the molecules to find an optimal fit to the receptor and scores them by their binding energy values. Sanjeevini is a complete drug design software suite with numerous built-in tools for protein-drug docking and DNA-drug docking. It is freely accessible at http://www.scfbio-iitd.res.in/sanjeevini/sanjeevini.jsp. After docking and scoring, one molecule is identified for each cavity of the various models of the protein. The dataset contains the molecules proposed for synthesis with their corresponding binding energies.
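
The general idea of an energy-based Monte Carlo search (random perturbation of a configuration, followed by acceptance or rejection based on the energy change) can be sketched as below. This is a generic Metropolis scheme on a toy problem, not ParDOCK's implementation: `toy_energy` is a hypothetical stand-in for a real scoring function such as BAPPL, the "pose" is just three coordinates, and all parameter values are arbitrary.

```python
import math
import random


def toy_energy(pose):
    """Hypothetical stand-in for a scoring function; minimum of 0 at (1, -2, 0.5)."""
    target = (1.0, -2.0, 0.5)
    return sum((p - t) ** 2 for p, t in zip(pose, target))


def metropolis_search(energy, start, steps=5000, step_size=0.3, kT=0.6, seed=0):
    """Random-perturbation search using the Metropolis acceptance rule."""
    rng = random.Random(seed)
    current, e_current = tuple(start), energy(start)
    best, e_best = current, e_current
    for _ in range(steps):
        # Propose a random displacement of every coordinate.
        trial = tuple(x + rng.uniform(-step_size, step_size) for x in current)
        e_trial = energy(trial)
        delta = e_trial - e_current
        # Always accept downhill moves; accept uphill moves with
        # Boltzmann probability exp(-delta / kT).
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best
```

Calling `metropolis_search(toy_energy, (5.0, 5.0, 5.0))` returns the lowest-energy configuration visited and its energy. A real docking code would perturb ligand translation, rotation and torsions, and evaluate an all-atom protein-ligand energy instead.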

References:

1. (a) A Physico-Chemical model for analyzing DNA sequences. Dutta S., Singhal P., Agrawal P., Tomer R., Kritee, Khurana E. and Jayaram B., J. Chem. Inf. Mod., 2006, 46(1), 78-85.

(b) Prokaryotic gene finding based on physicochemical characteristics of codons calculated from molecular dynamics simulations. Singhal P., Jayaram B., Dixit S. B. and Beveridge D. L., Biophys J., 2008, 94, 11, 4173-4183.

(c) G. Khandelwal, B. Jayaram, "A Phenomenological model for predicting melting temperatures of DNA sequences", PLoS One, 2010, 5(8), e12433.

2. (a) Bhageerath-H: A Homology ab initio Hybrid Webserver for Protein Tertiary Structure Prediction. Mohanty P., Lakhani B. and Jayaram B. (Manuscript in preparation).

(b) Jayaram B, Bhushan K, et al. Bhageerath: An Energy Based Web Enabled Computer Software Suite for Limiting the Search Space of Tertiary Structures of Small Globular Proteins. Nucl. Acids Res., 2006, 34, 6195-6204.

(c) Narang P, Bhushan K, Bose S, Jayaram B. Protein structure evaluation using an all-atom energy based empirical scoring function. J. Biomol. Str. Dyn., 2006, 23, 385-406.

(d) Narang P, Bhushan K, Bose S, Jayaram B. A computational pathway for bracketing native-like structures for small alpha helical globular proteins. Phys. Chem. Chem. Phys., 2005, 7, 2364-2375.

3. Singh T, Biswas D, Jayaram B. AADS--an automated active site identification, docking, and scoring protocol for protein targets based on physicochemical descriptors. J Chem Inf Model. 2011; 51: 2515-27.

4. Mukherjee G, Jayaram B. A Rapid Identification of Hit Molecules for Target Proteins via Physico-Chemical descriptors. Manuscript Submitted 2011.

5. Irwin and Shoichet. ZINC - A Free Database of Commercially Available Compounds for Virtual Screening. J. Chem. Inf. Model. 2005;45(1):177-82.

6. (a) Jain, T. and Jayaram, B. (2005) An all atom energy based computational protocol for predicting binding affinities of protein-ligand complexes. FEBS Letters, 579, 6659-6666.

(b) Jain, T. and Jayaram, B. A computational protocol for predicting the binding affinities of zinc containing metalloprotein-ligand complexes. PROTEINS: Struct. Funct. Bioinfo., 2007, 67, 1167-1178.

(c) Shaikh SA, Jayaram B. A swift all-atom energy-based computational protocol to predict DNA-ligand binding affinity and DeltaTm. J Med Chem. 2007 May 3;50(9):2240-4.

(d) Gupta, A., Gandhimathi, A., Sharma, P. and Jayaram, B. ParDOCK: An All Atom Energy Based Monte Carlo Docking Protocol for Protein-Ligand Complexes. Protein and Peptide Letters, 2007, 14, 7, 632-646.

(e) Shaikh SA, Jain T, Sandhu G, Latha N, Jayaram B. From drug target to leads--sketching a physicochemical pathway for lead molecule design in silico. Curr Pharm Des. 2007;13(34):3454-70.

(f) Mukherjee G, Patra N, Barua P, Jayaram B. A fast empirical GAFF compatible partial atomic charge assignment scheme for modeling interactions of small molecules with biomolecular targets. J Comput Chem. 2010, DOI:10.1002/jcc.21671.