Chemical Calculations with Calculator Plugins using cxcalc

Version 14.9.29.0

Introduction

ChemAxon's Calculator (cxcalc) is a command line program in Marvin Beans and JChem that performs chemical calculations using calculator plugins.

ChemAxon provides many calculations (e.g. charge, pKa, logP, logD), and custom plugins can also be added.

To obtain a license key for the calculations provided by ChemAxon Ltd., contact ChemAxon. Check the list of licensed calculations to request the appropriate license key.

Installation

Download and run the platform-specific installer, following the installation instructions.

Usage

Calculator performs plugin calculations in a uniform way: it processes general parameters, which cover input, output, and the SDF file tag names used to store calculation results, as well as plugin specific parameters, which differ for each plugin. The available calculations are defined in the configuration file and are listed below the general help message when you type cxcalc -h or simply cxcalc. Calculator can also be used to train some of the plugin calculations; for details, see the training section of the help.

    cxcalc [general options] [input files/strings] <plugin> [plugin options] [input files/strings]
    cxcalc [general options] [input files/strings] <plugin1> [plugin1 options] [input files/strings] <plugin2> [plugin2 options] [input files/strings] ...
    cxcalc [training options] [input file (the training set)]

General Options

  cxcalc -h, --help                 this help message,
                                    list of available calculations
  cxcalc <plugin> -h, --help        plugin specific help message
  -o, --output <filepath>           output file path (default: standard output)
  -t, --tag <tag name>              name of the SDFile tag to store the
                                    calculation results, tag name prefix
                                    to default tag names in case of multiple
                                    plugins (default: see plugin help)
  -i, --id <tag name|format>        the name of the existing SDFile tag that
                                    stores the molecule ID; or create
                                    molecule ID by converting the input
                                    molecule into the specified format;
                                    (default: molecule index is used as ID)
  -N, --do-not-display <type>       [i|h|ih]
                                    do not display molecule ID and/or
                                    table header (in table output form)
          i                         no molecule ID
          h                         no table header
          ih                        neither molecule ID nor table header
  -S, --sdf-output                  SDF output with results in SDF tags
  -M, --mrv-output                  result molecule output in MRV format
                                    (if neither -S nor -M is specified, then
                                    plugin results are written in table form)
  -g, --ignore-error                continue with next molecule on error
  -v, --verbose                     print calculation warnings to the console
      --log <filepath>              write log messages to file
                                    (default: write log to system error)
      --log-level <level>           [error|warning|off]
                                    set log level (default: error)
          error                     log error level information
          warning                   log warning and error level information
          off                       no log information
      --log-options <options>       list of logger options, separated by ','
          time                      log calculation execution time; calculation
                                    will run on ONE CPU in this case
          timelimit=<time in ms>    only execution times above the specified
                                    limit will be logged
          format=<molecule format>  log file format; default is SDF when
                                    logging to file and SMILES when logging to
                                    system error

You can also pass some JVM options to the Java Virtual Machine as cxcalc command line arguments.

Input files can be given on both the general option side and the plugin specific option side; in either case, these input files/strings supply the input molecules for the calculations. If several plugins are given, all plugin calculations are performed for every input molecule.

Note: plugin IDs are case-insensitive, so you can mix upper and lower case letters as you like. For example, logP and logp refer to the same plugin.

Note: command syntax may differ between command line shells (bash, tcsh, zsh, etc.).
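
When cxcalc is driven from a script, the syntax described above maps onto an argument list. The following Python sketch only assembles such a list and does not run cxcalc. The helper name build_cxcalc_args and its parameters are hypothetical; the plugin names and options are taken from the examples in this document.

```python
from typing import Iterable, List, Sequence, Tuple


def build_cxcalc_args(
    general_options: Sequence[str],
    plugins: Iterable[Tuple[str, Sequence[str]]],
    inputs: Sequence[str],
) -> List[str]:
    """Assemble: cxcalc [general options] <plugin1> [options] ... [inputs].

    Hypothetical helper for illustration; it is not part of cxcalc.
    """
    args = ["cxcalc", *general_options]
    for name, options in plugins:
        # Plugin IDs are case-insensitive, so normalise them to lower case.
        args.append(name.lower())
        args.extend(options)
    # Input files/strings may appear on either side; here they go last.
    args.extend(inputs)
    return args


args = build_cxcalc_args(
    general_options=["-i", "ID"],
    plugins=[("mass", []), ("logP", []), ("logD", ["-H", "6.4"])],
    inputs=["mols.sdf"],
)
print(" ".join(args))
```

Where cxcalc is installed, a list built this way could be passed to subprocess.run, which bypasses the shell and so avoids shell-specific quoting differences.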

The available plugins are configured in the xjars/calc.properties configuration file. The xjars directory is inside MarvinBeans.jar (in the Marvin Beans package) or jchem.jar (in the JChem package); in the Marvin Applets package, it is in the "marvin" directory. User-defined plugins may also be configured in this file. The built-in plugins can be purchased from ChemAxon. A detailed description of the configuration file is given below.

cxcalc parameters

Plugin Specific Options

The plugin specific help message is printed if the user types:

    cxcalc <plugin> -h

Here <plugin> is the plugin key from the configuration file.

Example

Typing cxcalc logp -h produces the help string:

Calculator plugin: logp.

logP calculation:
for type logPTrue: logP of uncharged species, or,
in the case of zwitterions, logD at pI;
for type logPMicro: logP of the input species.

Usage:
  cxcalc [general options] [input files/strings] logp
[logp options] [input files/strings]

logp options: 
  
 -h, --help                     this help message
 -p, --precision                <floating point precision as number of
                                fractional digits: 0-8 or inf> (default: 2)
 -m, --method                   [vg|klop|phys|user|weighted]
                                (default: weighted)
     --trainingid               <training id>
 -w, --weights                  <wVG:wKLOP:wPHYS:wUSER> method weights
                                (default: 1:1:1:0)
                                wVG: weight of the VG method
                                wKLOP: weight of the KLOP method
                                wPHYS: weight of the PHYS method
                                wUSER: weight of the user defined method
 -a, --anion                    <Cl- concentration>
                                (default: 0.1, range: [0.0, 0.25])
 -k, --kation                   <Na+ K+ concentration>
                                (default: 0.1, range: [0.0, 0.25])
 -t, --type                     [increments|logPMicro|logPTrue]
                                (default: logPTrue)
 -i, --increments               [true|false] show atomic increments
                                (default: false)
     --considertautomerization  [true|false] consider tautomerization
                                (default: false)
Multiple values for the same parameter
should be separated by commas (',' without space).

Example:
  cxcalc -S -t myLOGP logp -a 0.15 -k 0.05 test.mol

Calculations

See the list of calculations, or the output of the cxcalc -h command, for the available calculations.

Input

The software takes molecules from text files or from SMILES strings. Most molecular file formats are accepted (for instance MDL molfile, compressed molfile, SDfile, compressed SDfile, SMILES).

If no input file name or SMILES string is given in the command line, then input molecules are read from the standard input.

Output

The format of the results depends on the calculation and on the output options. If the result refers to the entire molecule, it is written as a single number. If the calculation gives a separate number for each atom in the molecule, the result is written as a list of numbers separated by semicolons, ordered by atom index. Other output formats may be available for certain plugins; see the plugin specific options of the plugin in question. By default, results are written in table form without the input molecule. If the --sdf-output parameter is specified, the results are instead written to an SDF file as SDF tags.
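
As an illustration of the per-atom convention, the following Python sketch reads table-form output and splits semicolon-separated cells into lists. It assumes a tab-separated table with a header line, which this document does not specify. The function name parse_table, the column names, and the sample numbers are all made up for the example.

```python
from typing import Dict, List, Union

Value = Union[str, List[str]]


def parse_table(text: str) -> List[Dict[str, Value]]:
    """Parse table-form output into one dict per molecule.

    Assumes a tab-separated table with a header line (an assumption,
    not something this document specifies).
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split("\t")
    rows = []
    for line in lines[1:]:
        row: Dict[str, Value] = {}
        for name, cell in zip(header, line.split("\t")):
            # Per-atom results are semicolon-separated, in atom-index order.
            row[name] = cell.split(";") if ";" in cell else cell
        rows.append(row)
    return rows


# Made-up sample in the assumed layout; the numbers are not real results.
sample = "id\tcharge\tlogp\n1\t-0.10;0.05;0.05\t1.23\n"
for row in parse_table(sample):
    print(row)
```

Cells are kept as strings so that non-numeric results pass through unchanged; convert them to numbers only where the plugin is known to produce them.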

 

Configuration File

The available plugins can be configured by editing the plugins/calc.properties file (the path is relative to the Marvin root directory). User-developed calculations can be added, and built-in calculations modified, by editing this configuration file (the configuration of the built-in calculations is defined in the xjars/calc.properties file inside MarvinBeans.jar). The plugins provided by ChemAxon Ltd. can be purchased from ChemAxon.

Configuration File Format

The configuration file is a Java properties file. Its format is best shown by an example:

charge=$chemaxon.marvin.calculations.ChargePlugin\
	$ChargePlugin.jar\
	$Charge\
	$p=precision:2;t=type:total;i=implh:false;r=resonance:false;H=pH\
	$CHARGE\
	$Partial charge calculation.\nTypes aromaticsystem / aromaticring calculate the sum of charges\nin the aromatic system / aromatic ring containing the atom.\
	$-p, --precision=<floating point precision as number of\
	\nfractional digits: 0-8 or inf> (default: 2);-t, --type=[sigma|pi|total|implh|\naromaticsystem|aromaticsystemsigma|aromaticsystempi|\naromaticring|aromaticringsigma|aromaticringpi]\
	\n(default: total);-i, --implh=[true|false] implicit H charge sum shown in brackets\
	\n(for sigma and total charge only) (default: false);-r, --resonance=[true|false]\
	\ntrue: take resonant structures (default: false);-H, --pH=<pH value> takes major microspecies at this pH\
	\n(default: no pH, takes the input molecule)\
	$cxcalc -S -o result.sdf -t myCHARGE charge -t pi,total -p 3 test.mol

The key, charge, is the name by which the plugin is referenced on the cxcalc command line.

Configuration items are separated by '$' characters. The '\' characters allow a property value to span multiple lines: the '\' character itself, as well as leading whitespace on the next line, is ignored.

The configuration items:

  1. the plugin class with full package name
  2. the plugin JAR name (with path relative to the plugins directory)
  3. the plugin group name (used for grouping the available plugins in the help message)
  4. the plugin specific parameters:
    <short name>=<long name>:<default value>
    separated by semicolons
  5. the default SDF file tag name storing the results in case of SDF file output
  6. a short description used in the plugin specific help message
  7. the plugin specific help text (parameter description text) with newline characters replaced by semicolons
  8. an example usage text (optional)
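
The item layout above can be sketched in Python as follows. This is a minimal illustration, not the parser cxcalc uses: it joins '\' continuation lines, splits the value on '$', and treats '-' as a missing item. It does not expand '\n' escapes, and it assumes the text holds a single entry. The function names and dictionary keys are hypothetical, and the sample entry is an abbreviated form of the charge configuration shown in this section.

```python
from typing import Dict, List, Optional, Tuple

ITEM_NAMES = [
    "class", "jar", "group", "parameters",
    "sdf_tag", "description", "help", "example",
]


def join_continuations(raw: str) -> str:
    """Join one entry's lines, dropping trailing '\\' and leading whitespace."""
    parts = []
    for line in raw.splitlines():
        line = line.lstrip()
        parts.append(line[:-1] if line.endswith("\\") else line)
    return "".join(parts)


def parse_plugin_entry(raw: str) -> Dict[str, Optional[str]]:
    """Split one '<key>=$item1$item2...' entry into named items."""
    key, _, value = join_continuations(raw).partition("=")
    items = value.split("$")[1:]  # the value itself starts with '$'
    items += ["-"] * (len(ITEM_NAMES) - len(items))  # pad optional items
    entry: Dict[str, Optional[str]] = {"key": key.strip()}
    for name, item in zip(ITEM_NAMES, items):
        item = item.strip()
        entry[name] = None if item == "-" else item
    return entry


def parse_parameters(spec: str) -> List[Tuple[str, str, Optional[str]]]:
    """Parse '<short name>=<long name>:<default value>' items split by ';'."""
    params = []
    for part in spec.split(";"):
        short, _, rest = part.partition("=")
        long_name, _, default = rest.partition(":")
        params.append((short, long_name, default or None))
    return params


# Abbreviated entry for illustration (JAR name omitted, help text shortened).
raw = (
    "charge=$chemaxon.marvin.calculations.ChargePlugin\\\n"
    "\t$-\\\n"
    "\t$Charge\\\n"
    "\t$p=precision:2;t=type:total;i=implh:false;H=pH\\\n"
    "\t$CHARGE\\\n"
    "\t$Partial charge calculation.\\\n"
    "\t$-p, --precision=<digits>;-t, --type=[sigma|pi|total]\n"
)
entry = parse_plugin_entry(raw)
print(entry["key"], entry["class"], entry["jar"])
print(parse_parameters(entry["parameters"]))
```

In this sketch a parameter without a default value, such as H=pH, yields None as its default.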

The plugin loading mechanism is as follows: first the program tries to load the plugin class from the CLASSPATH using the default class loader; if the plugin class is not found, the JAR is loaded and the system tries to load the plugin class from there.

If the plugin class name is omitted, the plugin is loaded directly from the JAR, whose Plugin-Class manifest attribute specifies the plugin class.

If the JAR name is omitted then the plugin is loaded from the CLASSPATH.

Missing configuration items should be denoted by '-' characters. For example, here is the above plugin configuration with omitted JAR name:

charge=$chemaxon.marvin.calculations.ChargePlugin\
	$-\
	$Charge\
	$p=precision:2;t=type:total;i=implh:false;H=pH\
	$CHARGE\
	$Partial charge calculation.\nTypes aromaticsystem / aromaticring calculate the sum of charges\nin the aromatic system / aromatic ring containing the atom.\
	$-p, --precision=<floating point precision as number of\
	\nfractional digits: 0-8 or inf> (default: 2);-t, --type=[sigma|pi|total|implh|aromaticsystem|aromaticring]\
	\n(default: total);-i, --implh=[true|false] implicit H charge sum shown in brackets\
	\n(for sigma and total charge only) (default: false);-H, --pH=<pH value> takes physiological microspecies at this pH\
	\n(default: no pH, takes the input molecule)\
	$cxcalc -S -o result.sdf -t myCHARGE charge -t pi,total -p 3 test.mol

Important: the long parameter names in the "plugin specific parameters" item must match the parameter property keys used by the plugin class in its setParameters(Properties params) method.

Examples

  1. pKa calculation with table form output, showing the two most significant acidic and the two most significant basic pKa values (this is the default table output mode):
    cxcalc mols.sdf pka
    
  2. The same with molecule IDs taken from the ID tag of the input SDF file, writing three significant values of each pKa type:
    cxcalc mols.sdf -i ID pka -a 3 -b 3
    
  3. The same with setting minimum basic pKa to -5, maximum acidic pKa to 15:
    cxcalc mols.sdf -i ID pka -a 3 -b 3 -i -5 -x 15
    
  4. Charge calculation for molecules in the mols.sdf file, writes results to the standard output in MRV format, charge values displayed in atom labels:
    cxcalc -M charge mols.sdf
    
  5. The same with output to the molcharges.mrv file to be created in the same directory, displaying the results in MarvinView:
    cxcalc -M -o molcharges.mrv charge mols.sdf
    mview molcharges.mrv
    
  6. LogP calculation with both result types (atomic increments and overall molecule) and a user-defined SDF tag name, piping the result to MarvinView:
    cxcalc -S mols.sdf -t LOGP_BOTH logp -t increments,logPTrue | mview -
    

    Note that such piping does not work on Windows.

    With the Table / Show Fields option set in MarvinView, the SDF file tags are shown in the table cells, which makes the calculated values visible.

  7. Elemental analysis (all result types), output in table form, molecule IDs taken from the ID tag of the input SDF file, output written to the text file elemanal.txt:
    cxcalc -o elemanal.txt -i ID elemanal mols.sdf
    
  8. A similar example with input taken from mols.smiles and output written as SDF to elemanal.sdf with ELEMANAL tag name:
    cxcalc -S -t ELEMANAL -o elemanal.sdf elemanal mols.smiles
    
  9. Writing molecular mass, logP and logD at pH 6.4 in the same table:
    cxcalc mass logP logD -H 6.4 mols.smiles
    
  10. Calculating some topological data:
    cxcalc ringCount ringAtomCount ringBondCount mols.smiles