structurechecker command-line tool

Structure Checker is a chemical validation tool that detects and fixes common structural errors or special features that can be potential sources of problems. structurechecker is the command-line tool of Structure Checker.
structurechecker
  -h, --help                         this help page
  -hc, --help-checker-action         help page of valid checker actions
  -hf, --help-fixer-action           help page of valid fixer actions
  -m, --mode <operationmode>         [check|fix] mode of the operation (default: check)
                                       check  only check is executed, does not modify molecules
                                       fix    fix molecules containing structure errors whenever possible
  -x                                 fix mode (deprecated, use --mode fix)

Input options:
  -c, --config <filepath|string>     action string configuration, actions separated by ".."

Output options:
  -t, --output-type <output type>    [single|separated|accepted|discarded] set output type (default: single)
                                       single     both accepted and discarded structures are written to the <output path>
                                       separated  accepted structures are written to the <output path>,
                                                  discarded structures are written to the <discarded path>
                                       accepted   only accepted structures are written to the <output path>
                                       discarded  only discarded structures are written to the <discarded path>
  -o, --output <output path>         output file (default: standard output)
  -d, --discarded <discarded path>   write molecules with structure error to a separate file (default: standard output)
  -f, --format <format>              output file format (default: smiles)
  -rf, --report-file <filepath>      write report to a file
  -rp, --report-property <propname>  write report to the property of the output, with the specified property name
  -rt, --report-pattern <pattern>    generate pattern based report file
  -re, --report-format <format>      file format of the molecules in report
  -l, --log <filepath>               write software-error log messages to file
  -ocr, --discard-scan-errors        discard incorrectly scanned molecules
structurechecker -hc
Valid checker actions (strings) are:
  3d                             detect atoms with 3D coordinates
  abbrevgroup                    detect all abbreviated groups
    :expanded=[true|false]         detect expanded abbreviated groups
    :contracted=[true|false]       detect contracted abbreviated groups
    :excluded=[...]                exclude the following groups during check; set comma-separated list of group abbreviations, e.g., "abbrevgroup:excluded=[Ph,COOH,Val]"
  absentchiralflag               detect absent chiral flag
  absolutestereoconfiguration    detect molecules in which all asymmetric centers have absolute stereo configuration
  alias                          detect atoms with alias
  aromaticity                    (deprecated) use aromaticityerror
  aromaticityerror               detect aromaticity errors with the given aromatization type (default: general)
    :basic                         basic aromaticity errors
    :loose                         loose aromaticity errors
    :general                       general aromaticity errors
  atommap                        detect atoms with map number
  atomqueryproperty              detect all or specified atom query properties
    :H=[true|false]                hydrogen count
    :X=[true|false]                connection count
    :D=[true|false]                explicit connection count
    :R=[true|false]                ring count
    :h=[true|false]                implicit hydrogen count
    :r=[true|false]                smallest ring count
    :a=[true|false]                aromaticity
    :s=[true|false]                substitution count
    :u=[true|false]                unsaturation
    :rb=[true|false]               ring bond count
  atomvalue                      detect atoms with atom value
  atropisomer                    detect atropisomers
  attacheddata                   detect atoms with attached data
  bondangle                      detect unpreferred bond angles in 2d
  bondlength                     detect bonds that are too long or too short
  chiralflagerror                detect incorrectly set chiral flag
  circularrgroup                 (deprecated) use circularrgroupreference
  circularrgroupreference        detect circular R-group references
  coordsystem                    detect invalid coordination systems
  covalentcounterion             detect covalent counterions
  crosseddoublebond              detect crossed double bonds
  empty                          detect items without atoms
  explicith                      detect all or specified explicit hydrogens
    :lonely=[true|false]           lonely explicit hydrogens
    :mapped=[true|false]           mapped explicit hydrogens
    :charged=[true|false]          charged explicit hydrogens
    :isotopic=[true|false]         isotopic explicit hydrogens
    :radical=[true|false]          radical explicit hydrogens
    :wedged=[true|false]           wedged explicit hydrogens
    :hconnected=[true|false]       hydrogen connected to hydrogen atom
    :polymerendgroup=[true|false]  hydrogen connected to a SRU S-group
    :sgroup=[true|false]           hydrogen which is the only atom in an S-group
    :sgroupend=[true|false]        hydrogen connected to a Superatom S-group
    :valenceerror=[true|false]     hydrogen connected to an atom which has valence error
    :bridgehead=[true|false]       hydrogen connected to a bridgehead atom
  explicitlp                     detect explicit lone pairs
  ezdoublebond                   detect if a double bond can be cis or trans
  isotope                        detect isotopes
  metallocene                    detect incorrect metallocene representations
  missingatommap                 detect atoms without map numbers
  missingrgroup                  (deprecated) use missingrgroupreference
  missingrgroupreference         detect missing R-group definitions
  moleculecharge                 detect non-neutral molecules
  multicenter                    detect multicenters
  multicomponent                 detect molecules containing disconnected parts
  multiplestereocenter           detect molecules with multiple stereocenters
  ocr                            detect drawings that originates from incorrect optical structure recognition
  overlappingAtoms               detect atoms that are too close to each other
  overlappingBonds               detect bonds that are too close to each other
  pseudoatom                     detect pseudo atoms
  queryatom                      detect query atoms
  querybond                      detect query bonds
  racemate                       detect asymmetric tetrahedral atoms without specific stereo configuration
  radical                        detect radical atoms
  rare                           (deprecated) use rareelement
  rareelement                    detect rare elements
  ratom                          detect specified type of R-atoms
    :all=[true|false]              all type of R-atoms
    :disconnected=[true|false]     disconnected type of R-atoms
    :generic=[true|false]          generic type of R-atoms
    :linker=[true|false]           linker type of R-atoms
    :nested=[true|false]           nested type of R-atoms
  reactionmap                    (deprecated) use reactionmaperror
  reactionmaperror               detect reactions with invalid atom mapping
  relativestereo                 detect multiple stereogenic center groups
  rgroupattachmenterror          detect all R-group attachment errors
  rgroupreferenceerror           detect errors in R-group definitions
                                 DEPRECATED checker, please use "missingrgroup", "unusedrgroup", "circularrgroup" instead.
    :missingratom=[true|false]     missing R-atom definition
    :missingrgroup=[true|false]    missing R-group definition
    :selfreference=[true|false]    self reference errors in R-group definitions
  ringstrainerror                detect small rings with trans or cumulative double bonds, or triple bond
  solvent                        detect common solvents appearing by a main component
  staratom                       detect star atoms
  stereocarebox                  detect stereo search markers on double bonds
  straightdoublebond             detect undefined double bond stereo layout
  substructure:[smarts]          detect the given SMARTS structure as a substructure in the original molecule
  unbalancedreaction             detect reactions with orphan atoms
  unusedrgroup                   (deprecated) use unusedrgroupreference
  unusedrgroupreference          detect unused R-group definitions
  valence                        (deprecated) use valenceerror
  valenceerror                   detect valence errors
  valenceproperty                detect atoms with all or specified valence properties
    :defaultvalence=[true|false]     default valence properties
    :nondefaultvalence=[true|false]  non-default valence properties
  wedge                          (deprecated) use wedgeerror
  wedgeerror                     detect incorrect wedge bonds
  wigglybond                     detect wiggly bonds on chiral centers
  wigglydoublebond               detect non_stereo double bonds with wiggly representation connected to a double bond
structurechecker -hf
Valid fixer actions (strings) are:
  addchiralflag              add chiral flag to the molecule
  aliastoatom                remove aliases from atoms
  aliastocarbon              (deprecated) use converttocarbon
  aliastogroup               convert atoms with aliases to abbreviated groups if the alias is recognized
  clean                      calculate 2D coordinates
  clearabsstereo             (deprecated) use removeinvalidchiralflag
  contractgroup              contract all abbreviated groups
  converttoelementalform     convert isotopes into elemental atoms
  converttocarbon            remove alias values from atoms and convert the atom to a carbon
  converttoionicform         convert covalent counterions to ionic form
  converttometalloceneform   convert non-standard metallocene representations into coordinated multicenter representation
  converttosingle            (deprecated) use converttosinglebond
  converttosinglebond        convert faulty bonds to single bonds
  converttowigglydoublebond  convert non-stereo double bond represented by crossed double bond to wiggly bond representation
  crosseddoublebond          convert non-stereo double bond represented by wiggly bond to crossed double bond representation
  crossedtowiggly            (deprecated) use converttowigglydoublebond
  dearomatize                convert aromatic rings into Kekule form
  expandgroup                expand all abbreviated groups if it is possible
  fixmetallocene             converts metallocenes to coordinative multicenter layout
  fixrgroupattachment        add missing attachments points to members with single location
  fixunusedrgroups           delete unreferenced R-group definitions
  fixvalence                 correct valence problem by removing hydrogens or setting charges
  mapmolecule                add atom maps to each atom of the molecule
  mapreaction                add atom maps to the reaction
  neutralize                 remove charges from the molecule
  partialclean               recalculate parts of the atom coordinates for 2D layout
  pseudotogroup              convert pseudo atoms to abbreviated groups if pseudo label is a known abbreviated group
  rearomatize                dearomatize the molecule and aromatize it again
  removealias                remove alias values from atoms
  removeatom                 remove the problematic atoms from the molecule
  removeatommap              remove atom map numbers
  removeatomqueryproperty    remove atom query properties
  removeatomvalue            remove atom values
  removeattacheddata         remove data attached to atoms
  removebond                 remove problematic bonds from the molecule
  removeexplicith            remove explicit hydrogens
  removeinvalidchiralflag    remove the chiral flag
  removeradical              convert radicals to non_radical atoms
  removestereocarebox        remove stereo search markers from double bonds
  removevalenceproperty      remove valence properties from atoms
  removezcoordinate          set the z-coordinates of atoms to zero
  ungroup                    ungroup all abbreviated groups
  wedgeclean                 recalculate the orientation of wedge bonds
structurechecker -c <config file> -m [mode] [<options>] [input list]
The command line parameter -c or --config is mandatory. This parameter specifies the configuration file path or a simple action string.
structurechecker -c config.xml
or
structurechecker -c "atomqueryproperty"
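When the action string is assembled by a script rather than typed by hand, a small helper can keep the separators consistent. The sketch below is only an illustration: the function names action and action_string are hypothetical, the ".." separator comes from the option help, and the "checker->fixer" form is taken from the examples later on this page. Whether both forms can be mixed in one string is an assumption of this sketch, not something this page states.

```python
def action(checker, fixer=None):
    """Return 'checker', or 'checker->fixer' when a fixer is attached.

    The '->' form mirrors the "isotope->converttoelementalform" example
    on this page; the helper itself is hypothetical.
    """
    return f"{checker}->{fixer}" if fixer else checker


def action_string(*actions):
    """Join individual actions with the '..' separator from the option help."""
    if not actions:
        raise ValueError("at least one action is required")
    return "..".join(actions)


print(action_string("aromaticity", "valence"))      # aromaticity..valence
print(action("isotope", "converttoelementalform"))  # isotope->converttoelementalform
```

The resulting string is what would be passed as the value of -c in place of a configuration file path.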
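The -m or --mode parameter specifies the operation mode. The following operation modes are available:
check (default): only the check is executed; molecules are not modified.
structurechecker -c config.xml -m check
fix: molecules containing structure errors are fixed whenever possible.
structurechecker -c config.xml -m fix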
Note: The syntax of commands can be different under various command line shells (bash, tcsh, zsh, etc.).
structurechecker accepts most molecular file formats as input (Marvin Documents (MRV), MDL molfile, SDfile, RXNfile, RDfile, SMILES, etc.). The input can be specified as a file:
structurechecker -c config.xml -m check input.mrv
or as a molecule string:
structurechecker -c config.xml -m check "OCC(O)C1OC(=O)C(O)=C1O"
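Because quoting rules differ between shells, one way to sidestep them in scripts is to build the command as an argument list. The following is a minimal sketch under stated assumptions: build_command is a hypothetical helper name, and it only uses the -c and -m options and the input list described above. It does not start the tool; it only prepares and prints the command.

```python
import shlex


def build_command(config, mode="check", inputs=()):
    """Build a structurechecker argument list (hypothetical helper).

    config: configuration file path or action string (value of -c)
    mode:   "check" or "fix" (value of -m)
    inputs: input files or molecule strings
    """
    if mode not in ("check", "fix"):
        raise ValueError(f"unknown mode: {mode!r}")
    return ["structurechecker", "-c", config, "-m", mode, *inputs]


cmd = build_command("config.xml", "check", ["OCC(O)C1OC(=O)C(O)=C1O"])
# shlex.join renders the list as a single POSIX-shell-safe line for logging.
print(shlex.join(cmd))
```

Passing such a list to subprocess.run starts the program without an intermediate shell, so the parentheses in the SMILES string need no extra quoting.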
The output of structurechecker contains the file(s) of the checked or fixed molecules and, optionally, a report of the results. The molecules are written to the output file(s). The format of the output file(s) can be specified with the -f or --format option (the default format is "smiles").
The type of output is defined by the -t or --output-type parameter. The possible values of the output type are the following:
single (default): Both accepted and discarded structures are written to the file defined by the --output parameter. If the --output parameter is omitted, the result is written to the standard output (console). (The --discarded parameter is ignored in this case.)

separated: The --output parameter defines the output file of molecules with valid structures, and the --discarded parameter defines the output file of molecules with invalid structures (or, in fix mode, those which cannot be fixed automatically).
If the --discarded parameter is omitted, molecules with invalid structures are written to standard output.
If the --output parameter is omitted, molecules with valid structures are written to standard output.
Note: Specifying the --output or the --discarded parameter is mandatory for this output type. If neither parameter is defined, the program stops.

accepted: Only molecules with valid structures are written, to the file defined by the --output parameter. If the --output parameter is omitted, molecules with valid structures are written to the standard output. (The --discarded parameter is ignored in this case.)

discarded: Only molecules with invalid structures are written, to the file defined by the --discarded parameter. If the --discarded parameter is omitted, molecules with invalid structures are written to the standard output. (The --output parameter is ignored in this case.)

The report can be written to a separate file defined by the --report-file parameter, or to the output file(s) as an additional molecule property. The name of the property can be defined by the --report-property parameter.
Note: Not all molecules with structure errors are discarded. In fix mode, only molecules whose errors cannot be fixed automatically are discarded.
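The interplay of the output type with the --output and --discarded parameters can be summarized in a short sketch. This is an illustrative reading of the rules described in this section, not the tool's own logic: output_args is a hypothetical helper, and dropping the ignored parameter before the command is built is a design choice of this sketch.

```python
OUTPUT_TYPES = ("single", "separated", "accepted", "discarded")


def output_args(output_type="single", output=None, discarded=None, fmt=None):
    """Return the output-related arguments for a structurechecker call.

    Hypothetical helper: parameters that this section describes as ignored
    for the chosen output type are left out of the argument list.
    """
    if output_type not in OUTPUT_TYPES:
        raise ValueError(f"unknown output type: {output_type!r}")
    # "separated" requires at least one of --output / --discarded.
    if output_type == "separated" and output is None and discarded is None:
        raise ValueError("separated output needs --output or --discarded")

    args = ["-t", output_type]
    # --output is ignored for the "discarded" type.
    if output is not None and output_type != "discarded":
        args += ["-o", output]
    # --discarded is only used by "separated" and "discarded".
    if discarded is not None and output_type in ("separated", "discarded"):
        args += ["-d", discarded]
    if fmt is not None:
        args += ["-f", fmt]
    return args


print(output_args("separated", output="out.sdf", discarded="discarded.sdf", fmt="sdf"))
```

Passing fmt explicitly matters when the file names end in .sdf, because the default output format is smiles regardless of the file extension.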
Below you can find short descriptions of some examples.
If you want to check, fix, or filter structures in evaluate or JChem Cartridge, find examples here.
structurechecker -c "metallocene"
Executes a check with the configuration metallocene on the molecule(s) read from the standard input, and writes the result to the standard output (console).

structurechecker -c "bondLength" in.sdf
Executes a check with the configuration bondLength on the molecule(s) in the in.sdf file, and writes the result to the standard output (console).

structurechecker -c "isotope->converttoelementalform" in.sdf
Executes a check with the configuration isotope->converttoelementalform on the molecule(s) in the in.sdf file, and writes the result to the standard output (console).

structurechecker -c "aromaticity..valence" -m fix -f sdf -o out.sdf in.sdf
Executes a fix with the configurations aromaticity and valence on the molecule(s) in the in.sdf file, and writes the molecules with valid structures (including automatically fixed molecules) in sdf format to the out.sdf output file.

structurechecker -c config.xml -t separated -o out.sdf -d discarded.sdf
Executes a check with the configuration contained in config.xml, writes the molecules with valid structures to out.sdf, and writes the molecules with invalid structures to discarded.sdf.
Note: The format of both outputs is SMILES(!) as --format (-f) is not defined.

structurechecker -c config.xml -m fix -t separated -d discarded.sdf
Executes a fix with the configuration contained in config.xml, writes the molecules with invalid structures to discarded.sdf, and writes the molecules with valid structures to the standard output (console).

structurechecker -c config.xml -m fix -t discarded -d discarded.sdf in.sdf
Executes a fix with the configuration contained in config.xml on the molecule(s) in the in.sdf file, writes the molecules with invalid structures to discarded.sdf, and omits molecules with valid structures.

List of available checkers | Examples of structure checking in various ChemAxon products | Structure Checker GUI | Structure Checker in MarvinSketch | Structure Checker Developer Guide