Training of Calculator Plugins via `cxtrain`

Version 14.9.29.0

Introduction
Installation
Usage
Options
- General
- pK_a
- logP
- Custom prediction
Input
The place of the training library
Examples
License Management

Introduction

Property predictions (such as logP and pK_a) can be improved when experimental data are available for molecules that are similar to the target. Such user-specific data can be incorporated into so-called training libraries, which are generated with ChemAxon's command-line tool cxtrain. The tool is part of the JChem and Marvin Beans program packages. The generated training library, stored on the user's own computer, is later used by the calculator plugins to improve property predictions.

Installation

Download and launch the platform-specific installer, following the installation instructions.

Usage

cxtrain <prediction> [options] [input file (training set)]

Prediction:
pka                                   train pKa prediction
logp                                  train logP prediction
prediction                            train custom prediction
General options:
cxtrain -h, --help                    this help message
 -i, --training-id<training>          sets the training ID
 -l, --list                           list available training ID's
 -g, --ignore-error                   continue with next molecule on error
pKa options:
 -V, --validation <filepath>          validation results file path
logP options:
 -t, --tag <tag name>                 name of the SDFile tag that stores the experimental logP values
 -a, --add-built-in-training-set      add built-in logP training set
Custom prediction options:
 -t, --tag <tag name>                 name of the SDFile tag that stores the experimental property values

The training is run by calling cxtrain as follows:

cxtrain <prediction> [options] [input file (training set)]

where <prediction> must be one of "pka", "logp" or "prediction" (used for a custom property). Some general options can be used with every training type; the other options are property-specific.
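The calling convention above can be sketched as a small shell wrapper that checks the prediction type before handing the arguments to cxtrain. The helper name run_cxtrain is hypothetical and not part of the ChemAxon tools. The sketch accepts only the lowercase spellings shown in the usage synopsis, which is an assumption about how strictly cxtrain itself checks them. When cxtrain is not on the PATH, the command is printed instead of executed.

```shell
#!/bin/sh
# Sketch only: run_cxtrain is a hypothetical helper, not a ChemAxon command.
# It validates the prediction type, then forwards all remaining arguments.
run_cxtrain() {
    prediction=$1
    shift
    case "$prediction" in
        pka|logp|prediction) ;;   # spellings taken from the usage synopsis
        *)
            echo "unknown prediction type: $prediction" >&2
            return 2
            ;;
    esac
    if command -v cxtrain >/dev/null 2>&1; then
        cxtrain "$prediction" "$@"
    else
        # cxtrain is not installed here: print the command instead of running it
        echo "cxtrain $prediction $*"
    fi
}
```

For example, `run_cxtrain pka -i mypka pKa_trainingset.sdf` forwards the same arguments as the plain cxtrain call.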

General options

The option --training-id (-i) sets the ID of your training. This ID is then used to refer to the training during calculations.
The available training IDs can be listed with the option --list (-l).
The option --ignore-error (-g) skips a molecule that causes an error and continues with the next one.

pK_a specific option

--validation <filepath> (-V) creates validation data; optionally, the file path of the pK_a training validation chart can be specified.

logP specific options

--add-built-in-training-set (-a) merges your data with the data from the built-in logP training set.
Option --tag (-t) defines the name of the SDFile tag that stores the experimental logP values.

Custom prediction option

Option --tag (-t) defines the name of the SDFile tag that stores the experimental custom defined values.
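
Because --tag names an SDFile data field, it can help to check before training that the records in the input file actually carry that field. The following is a minimal sketch using standard grep. The helper name count_tagged is hypothetical, and it assumes the usual SD file layout: a data header line such as `>  <LOGP>` before each value and a `$$$$` line closing each record.

```shell
#!/bin/sh
# Sketch only: count_tagged is a hypothetical helper, not a ChemAxon command.
# Usage: count_tagged TAG FILE
count_tagged() {
    tag=$1
    file=$2
    # every SD file record is closed by a line starting with "$$$$"
    records=$(grep -c '^\$\$\$\$' "$file")
    # data header lines start with ">" and hold the tag name in angle brackets
    tagged=$(grep -c "^>.*<$tag>" "$file")
    echo "$tagged of $records records have <$tag>"
}
```

If the two counts differ, some molecules have no experimental value under that tag name. How cxtrain treats such records is not described here.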

Examples

- Training
  This command trains pK_a calculation, using the datafile pKa_trainingset.sdf and setting training ID to "mypka":
```
cxtrain pka -i mypka pKa_trainingset.sdf
```
- Calculation
  The following example shows how the generated training library can be used in pK_a calculations via cxcalc:
```
cxcalc pKa --correctionlibrary mypka "CSC1=NC2=C(N1)C=NC(O)=N2"
```
- Result
```
 id      apKa1   apKa2   bpKa1   bpKa2   atoms
 1       11.19   16.01   2.34    -2.59   7,11,9,4
```
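
When such a result is processed in a script, a single column can be picked out by name. The sketch below uses awk. The helper name table_column is hypothetical, and it assumes a whitespace-separated table whose first line holds the column names, as in the result above.

```shell
#!/bin/sh
# Sketch only: table_column is a hypothetical helper, not a ChemAxon command.
# Reads a result table on standard input and prints the column named in $1.
table_column() {
    awk -v col="$1" '
        # header line: remember the position of the requested column
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) idx = i; next }
        # data lines: print that field, or nothing if the column was not found
        idx     { print $idx }
    '
}
```

It would be used by piping the cxcalc call shown above into it, for example `... | table_column apKa1`.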
- Training
  Same for logP calculation, using the datafile logP_trainingset.sdf, with the experimental logP values stored in the SDF tag named "LOGP", setting training ID to "mylogp" and including data from the built-in training set:
```
cxtrain logp -t LOGP -i mylogp -a logP_trainingset.sdf
```
- Calculation
  To apply your generated logP training library in calculations, combine the parameter --trainingid with the parameter --method in cxcalc:
```
cxcalc logp --method user --trainingid mylogp "CC(C)CCO"
```
- Result
```
id      logP
1       1.13
```
- Training
  The following command lists the available training IDs for logP calculation:
```
cxtrain logp --list
```
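
Since --list is described as a general option, the same call should work for the other prediction types as well. The sketch below loops over all three. The helper name list_all_trainings and the DRY_RUN variable are hypothetical additions for illustration; with DRY_RUN set, the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch only: list_all_trainings and DRY_RUN are hypothetical, not part of cxtrain.
list_all_trainings() {
    for prediction in pka logp prediction; do
        echo "== $prediction =="
        if [ -n "$DRY_RUN" ]; then
            # show the command that would be run
            echo "cxtrain $prediction --list"
        else
            cxtrain "$prediction" --list
        fi
    done
}
```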
- Training
  This command trains a custom property calculation, using the datafile pampa_trainingset.sdf, with the experimental values stored in the SDF tag named "PAMPA", setting training ID to "mypampa":
```
cxtrain prediction -t PAMPA -i mypampa pampa_trainingset.sdf
```

See also the logP, pK_a and Predictor training pages.
