cxtrain
Property predictions (such as logP and pKa) can be improved when experimental data are available for molecules that are similar to the target.
Such user-specific information can be incorporated into so-called training libraries, which can be generated with ChemAxon's command-line tool cxtrain.
It is part of the JChem and Marvin Beans program packages. The generated training library, stored on the user's own computer, is later used by the calculator plugins to improve the prediction of properties.
Download and launch the platform-specific installer by following the installation instructions.
cxtrain <prediction> [options] [input file (training set)]
Prediction:
  pka                                train pKa prediction
  logp                               train logP prediction
  prediction                         train custom prediction
General options:
  cxtrain -h, --help                 this help message
  -i, --training-id <training>       sets the training ID
  -l, --list                         list available training ID's
  -g, --ignore-error                 continue with next molecule on error
pKa options:
  -V, --validation <filepath>        validation results file path
logP options:
  -t, --tag <tag name>               name of the SDFile tag that stores the experimental logP values
  -a, --add-built-in-training-set    add built-in logP training set
Custom prediction options:
  -t, --tag <tag name>               name of the SDFile tag that stores the experimental property values
The training is run by calling cxtrain as follows:
cxtrain <prediction> [options] [input file (training set)]
where <prediction> must be one of "pka", "logp" or "prediction" (the last is used for a custom property). There are general options that can be used with every training type, as well as property-specific options.
General options:
--training-id (-i) sets the ID of your training. Afterwards, this ID refers to the given training during the calculation.
--list (-l) lists the available training IDs.
--ignore-error (-g) skips the molecule on error and continues with the next correct one.
logP options:
--add-built-in-training-set (-a) merges your data with the data from the built-in logP training set.
--tag (-t) defines the name of the SDFile tag that stores the experimental logP values.
Custom prediction options:
--tag (-t) defines the name of the SDFile tag that stores the experimental custom-defined values.
The input of the software is a file in a format that supports molecular properties (such as SDfile, MDL molfile, compressed molfile or compressed SDfile).
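Before training, it can help to confirm that every record of the input file actually carries the tag that will be passed to --tag. The following is a minimal shell sketch and not part of cxtrain: the function names sdf_record_count, sdf_tag_count and sdf_check_tag are hypothetical helpers, and the sketch assumes the usual SD file layout, in which each record ends with a `$$$$` line and each data item is introduced by a `> <TAG>` header line. It also assumes the tag name contains no regular-expression metacharacters.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting an SD file before running cxtrain.

# Count the records in an SD file: each record ends with a "$$$$" line.
sdf_record_count() {
    grep -c '^\$\$\$\$' "$1"
}

# Count the data headers for a given tag, e.g. "> <LOGP>".
# $1 = SD file, $2 = tag name (assumed to be free of regex metacharacters).
sdf_tag_count() {
    grep -c "^>.*<$2>" "$1"
}

# Report how many records carry the tag; succeed only if all of them do.
sdf_check_tag() {
    records=$(sdf_record_count "$1")
    tagged=$(sdf_tag_count "$1" "$2")
    echo "$tagged of $records records carry the <$2> tag"
    [ "$records" -gt 0 ] && [ "$tagged" -eq "$records" ]
}
```

For example, `sdf_check_tag logP_trainingset.sdf LOGP && cxtrain logp -t LOGP -i mylogp logP_trainingset.sdf` would start the training only if the check passes.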
The generated training library will be stored on your computer, and it can be used via Marvin, Chemical Terms, Instant JChem or cxcalc.
cxtrain pka -i mypka pKa_trainingset.sdf
Using the training with cxcalc:
cxcalc pKa --correctionlibrary mypka "CSC1=NC2=C(N1)C=NC(O)=N2"
id apKa1 apKa2 bpKa1 bpKa2 atoms
1 11.19 16.01 2.34 -2.59 7,11,9,4
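When the result table above is processed in a script, a single column can be picked out by its header name. The following is a minimal sketch, not a ChemAxon utility: cxcalc_column is a hypothetical helper that reads the table on standard input and assumes that the first line is the header row and that the fields are separated by whitespace. If the table is tab-separated and may contain empty cells, adding `-F '\t'` to the awk call would be the safer choice.

```shell
#!/bin/sh
# Hypothetical helper: print the values of one named column of a result table.
# $1 = column name taken from the header row, e.g. bpKa1.
cxcalc_column() {
    awk -v col="$1" '
        # Header row: remember the position of the requested column.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) idx = i; next }
        # Data rows: print that field, but only if the column was found.
        idx     { print $idx }
    '
}
```

A call such as `cxcalc pKa --correctionlibrary mypka "CSC1=NC2=C(N1)C=NC(O)=N2" | cxcalc_column bpKa1` would then print only the bpKa1 values.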
cxtrain logp -t LOGP -i mylogp -a logP_trainingset.sdf
To use the training via cxcalc, combine the parameter --trainingid with the parameter --method:
cxcalc logp --method user --trainingid mylogp "CC(C)CCO"
id logP
1 1,13
cxtrain logp --list
cxtrain prediction -t PAMPA -i mypampa pampa_trainingset.sdf
See also the logP, pKa and Predictor training pages.