Regulation of gene expression is a matter of chemistry between DNA and proteins at the molecular level. Remarkable advances have been made in gene identification over the last two decades using statistical and mathematical models that rely heavily on databases and computational protocols. A gene-finding model that directly captures the physico-chemical properties intrinsic to DNA sequences and the chemistry of protein-DNA interactions, however, remains a goal yet to be realized.
As a step towards an ab initio physico-chemical model, christened ChemGenome, we constructed a three-dimensional vector for each trinucleotide (codon) from its hydrogen bond energy, its stacking energy, and a third parameter that we provisionally identify with groove potentials. As this three-dimensional vector walks along a genome, the net orientation of the resultant vector differs significantly between gene and non-gene regions. The model works well for prokaryotic genomes and shows promise of universal applicability. Efforts to develop ChemGenome into a stand-alone algorithm for gene prediction are in progress.
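The vector walk described above can be illustrated with a minimal sketch. The abstract does not give ChemGenome's parameter values, so everything numeric here is an assumption: the three components are toy stand-ins (a hydrogen-bond count, a crude stacking proxy, and a placeholder for the third parameter), and the names `codon_vector`, `resultant` and `orientation_deg` are hypothetical. It only shows the mechanics of summing codon vectors along a reading frame and measuring the orientation of the resultant.

```python
import math

# Toy per-base values, NOT ChemGenome's fitted parameters (assumed for illustration).
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}  # Watson-Crick hydrogen bonds per base pair
PURINES = {"A", "G"}


def codon_vector(codon):
    """Map one codon to a hypothetical 3-D vector (x, y, z)."""
    # x: hydrogen-bond proxy, centred so AT-rich codons are negative and GC-rich positive
    x = sum(H_BONDS[b] for b in codon) - 7.5
    # y: stand-in for stacking, counts steps where both bases are purines or both pyrimidines
    y = sum((a in PURINES) == (b in PURINES) for a, b in zip(codon, codon[1:])) - 1.0
    # z: placeholder for the third (groove-related) parameter, keyed on the third base
    z = 1.0 if codon[2] in "GC" else -1.0
    return (x, y, z)


def resultant(sequence, frame=0):
    """Sum the codon vectors along one reading frame of the sequence."""
    total = [0.0, 0.0, 0.0]
    seq = sequence.upper()
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if set(codon) <= set("ACGT"):  # skip codons with ambiguous bases
            for k, component in enumerate(codon_vector(codon)):
                total[k] += component
    return tuple(total)


def orientation_deg(vec, axis=(0.0, 0.0, 1.0)):
    """Angle in degrees between vec and a reference axis, or None for a zero vector."""
    dot = sum(v * a for v, a in zip(vec, axis))
    norm = math.sqrt(sum(v * v for v in vec)) * math.sqrt(sum(a * a for a in axis))
    if norm == 0:
        return None
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


if __name__ == "__main__":
    window = "ATGGCC"
    vec = resultant(window)
    print(vec, orientation_deg(vec))
```

In a real application, the orientation would be computed for candidate open reading frames and compared against a decision boundary separating gene from non-gene regions. How that boundary is defined is not stated in the abstract and is not modelled here.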
| S.No. | NCBI_ID | Species Name | Genes | TP# | FP# | SS# | SP# | CC# |
|---|---|---|---|---|---|---|---|---|
| 1 | NC_000117 | Chlamydia trachomatis | 463 | 458 | 4 | 0.98 | 0.99 | 0.98 |
| 2 | NC_000853 | Thermotoga maritima MSB8 | 641 | 619 | 3 | 0.96 | 0.99 | 0.96 |
| 3 | NC_000854 | Aeropyrum pernix K1 | 561 | 532 | 7 | 0.94 | 0.98 | 0.93 |
| 4 | NC_000868 | Pyrococcus abyssi GE5 | 632 | 630 | 241 | 0.99 | 0.63 | 0.49 |
| 5 | NC_000907 | Haemophilus influenzae | 955 | 953 | 7 | 0.99 | 0.99 | 0.99 |
| 6 | NC_000908 | Mycoplasma genitalium G-37 | 189 | 186 | 2 | 0.98 | 0.98 | 0.97 |
| 7 | NC_000909 | Methanocaldococcus jannaschii | 720 | 708 | 9 | 0.98 | 0.98 | 0.97 |
| 8 | NC_000912 | Mycoplasma pneumoniae M129 | 243 | 241 | 2 | 0.99 | 0.99 | 0.98 |
| 9 | NC_000913 | Escherichia coli K12 | 2759 | 175 | 659 | 0.63 | 0.72 | 0.39 |
| 10 | NC_000915 | Helicobacter pylori | 731 | 727 | 4 | 0.99 | 0.99 | 0.98 |
| 11 | NC_000916 | Methanobacterium thermoautotrophicum | 719 | 711 | 4 | 0.98 | 0.99 | 0.98 |
| 12 | NC_000917 | Archaeoglobus fulgidus | 782 | 774 | 8 | 0.98 | 0.98 | 0.97 |
| 13 | NC_000917 | Archaeoglobus fulgidus DSM4304 | 782 | 774 | 8 | 0.98 | 0.98 | 0.98 |
| 14 | NC_000918 | Aquifex aeolicus VF5 | 584 | 575 | 3 | 0.98 | 0.99 | 0.97 |
| 15 | NC_000921 | Helicobacter pylori strain J99 | 658 | 648 | 9 | 0.98 | 0.98 | 0.97 |
| 16 | NC_000922 | Chlamydophila pneumoniae CWL029 | 597 | 590 | 9 | 0.98 | 0.98 | 0.97 |
| 17 | NC_000948 | Borrelia burgdorferi B31 plasmid cp32-1 | 11 | 11 | 0 | 1.0 | 1.0 | 1.0 |
| 18 | NC_000949 | Borrelia burgdorferi B31 plasmid cp32-3 | 11 | 11 | 0 | 1.0 | 1.0 | 1.0 |
| 19 | NC_000950 | Borrelia burgdorferi B31 plasmid cp32-4 | 11 | 11 | 0 | 1.0 | 1.0 | 1.0 |
| 20 | NC_000951 | Borrelia burgdorferi B31 plasmid cp32-6 | 10 | 10 | 0 | 1.0 | 1.0 | 1.0 |
# True positives (TP): genes evaluated as genes.

Reference: 2. Jayaram, B. Beyond the wobble: the rule of conjugates. J. Mol. Evol. 1997, 45, 704-705.
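The footnote defines only TP, so the formulas behind the SS and SP columns are not stated here. The sketch below assumes the conventions common in gene-prediction evaluation: sensitivity SS = TP / (TP + FN), with TP + FN taken to be the Genes column, and specificity SP = TP / (TP + FP). The function names are hypothetical, and truncating to two decimals is likewise an assumption, made because it reproduces the first row of the table. The CC column is not modelled, since its formula cannot be inferred from the columns shown.

```python
import math


def sensitivity(tp, genes):
    """Assumed definition: fraction of annotated genes that were predicted as genes."""
    return tp / genes


def specificity(tp, fp):
    """Assumed definition: fraction of predicted genes that are true genes."""
    return tp / (tp + fp)


def truncate2(value):
    """Truncate (not round) to two decimal places."""
    return math.floor(value * 100) / 100


if __name__ == "__main__":
    # Row 1 of the table: Chlamydia trachomatis (Genes = 463, TP = 458, FP = 4)
    genes, tp, fp = 463, 458, 4
    print(truncate2(sensitivity(tp, genes)), truncate2(specificity(tp, fp)))
```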