File Formats : SMARTS

ChemAxon products import and export SMARTS strings with the following features:

Fully functional (editable) query features

SMARTS features interpreted during import/export as fully functional (editable) query features:

  • atom lists like [C,N,P] and 'NOT' lists like [!#6!#7!#15]
  • any bond: ~
  • ring bond: C@C
  • hydrogen count: H0, H1, H2, H3, H4
  • valence: v0, v1, ..., v8
  • connectivity: X0, X1, X2, X3, X4
  • in ring: R
    ring count: R0, R1, ..., R6
  • size of smallest ring: r3, r4, r5, r..
  • number of ring bonds: x2, x3, x4
    at least one ring bond: x
  • aromatic and aliphatic atoms: a, A
  • aliphatic, aromatic atom query properties
  • single_or_double, single_or_aromatic, double_or_aromatic bonds (used in Marvin)
  • directional or unspecified bonds: C\C=C/?C
  • chiral or unspecified atoms: C[C@?H](Cl)Br
  • component level grouping: (C).(O) (C.O)
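
A few of the features listed above can be illustrated with a small, library-free Python sketch. It builds the atom list and 'NOT' list strings and maps simple numeric primitives to the feature names used in the list. The function names (atom_list, not_list, describe) are hypothetical and are not part of any ChemAxon API; bare H, v, X and r without a number are deliberately left out of this sketch.

```python
import re

# Primitive letter -> feature name, taken from the feature list above.
FEATURES = {
    "H": "hydrogen count",
    "v": "valence",
    "X": "connectivity",
    "R": "ring count",
    "r": "size of smallest ring",
    "x": "number of ring bonds",
}

# Primitives that the list above documents without a number.
BARE = {"R": "in ring", "x": "at least one ring bond"}

PRIMITIVE = re.compile(r"([HvXRrx])(\d*)")


def atom_list(symbols):
    """Join atom symbols into a SMARTS atom list such as [C,N,P]."""
    return "[" + ",".join(symbols) + "]"


def not_list(atomic_numbers):
    """Build a SMARTS 'NOT' list such as [!#6!#7!#15]."""
    return "[" + "".join(f"!#{z}" for z in atomic_numbers) + "]"


def describe(primitive):
    """Return the feature name for one primitive, or None if not covered."""
    match = PRIMITIVE.fullmatch(primitive)
    if match is None:
        return None
    letter, digits = match.groups()
    if digits:
        return f"{FEATURES[letter]} = {digits}"
    return BARE.get(letter)


print(atom_list(["C", "N", "P"]))
print(not_list([6, 7, 15]))
print(describe("r5"))
```
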

Features with limited editing support

A subset of SMARTS features are imported as SMARTS atoms/bonds. These atoms/bonds have limited editing support in the Marvin GUI, but can be exported and evaluated (e.g. JChem structure searching handles them correctly):

  • implicit hydrogen count: h2, h3, h..
  • degree: D2, D3, D..
  • more complex logical expressions in atom or bond expressions, built from the operators & , ; !
    (Simpler cases, such as atom lists, 'NOT' lists and "and"-expressions, are handled by the fully functional features above.)
  • recursive SMARTS: [$(CCC)]
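
To show why nested logical expressions are harder to edit than plain lists, the following Python sketch splits the inside of a bracket atom by operator precedence: ';' (low-precedence "and") first, then ',' ("or"), then '&' (high-precedence "and"). It leaves recursive SMARTS such as $(CCC) intact by tracking parentheses. The helper names are hypothetical, and implicit "and" by juxtaposition (as in [CH3]) is not split; this is a minimal sketch, not how any ChemAxon product parses SMARTS.

```python
def split_top_level(expr, sep):
    """Split expr on sep, ignoring separators inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in expr:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_logic(expr):
    """Return a nested list: ';' groups -> ',' alternatives -> '&' terms."""
    return [
        [split_top_level(alternative, "&")
         for alternative in split_top_level(group, ",")]
        for group in split_top_level(expr, ";")
    ]


# "C or N, and in a ring"
print(parse_logic("C,N;R"))
# The comma inside the recursive SMARTS is not treated as a separator.
print(parse_logic("$(C,N);!R&X2"))
```
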

Features exported as SMARTS atoms/bonds

A subset of features are exported as SMARTS atoms/bonds.

  • MDL Substitution Count query atom property s<n> is converted to degree Dn. In case of s* the non-H neighbours are counted and exported as degree D<number>.
  • MDL Unsaturated Atom query atom property u is converted to recursive SMARTS: $([*,#1]=,#,:[*,#1]) is appended after the SMARTS atom.
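
The two conversions above can be sketched in Python as follows. The function export_atom and its parameters are hypothetical, and joining the parts with ';' inside one bracket atom is an assumption made for illustration; the text above only states that the recursive SMARTS is appended after the SMARTS atom.

```python
# Recursive SMARTS quoted in the text above for the MDL "u" property.
UNSATURATED = "$([*,#1]=,#,:[*,#1])"


def export_atom(symbol, substitution=None, non_h_neighbours=0, unsaturated=False):
    """Build a SMARTS atom from MDL query properties (illustrative only).

    substitution: an integer n for s<n>, "*" for s*, or None if unset.
    non_h_neighbours: number of non-hydrogen neighbours, used only for s*.
    """
    parts = [symbol]
    if substitution == "*":
        # s*: the non-H neighbours are counted and exported as D<number>.
        parts.append(f"D{non_h_neighbours}")
    elif substitution is not None:
        # s<n> is converted to degree D<n>.
        parts.append(f"D{substitution}")
    if unsaturated:
        parts.append(UNSATURATED)
    # Assumption: the parts are combined with the low-precedence "and" ';'.
    return "[" + ";".join(parts) + "]"


print(export_atom("C", substitution=2))
print(export_atom("C", substitution="*", non_h_neighbours=3))
print(export_atom("C", unsaturated=True))
```
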

Implicit and Query Hydrogen Atoms

  • Implicit hydrogen atoms are not written inside brackets, e.g. [C:1]
  • Query hydrogen atoms are written inside brackets without the low-precedence "and" operator ';', e.g. [CH3]
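
As a minimal illustration of the two cases, the hypothetical Python helper below writes a bracket atom. A hydrogen count appears only when it is a query property, and it is attached directly to the symbol rather than joined with ';'. The helper and its parameter names are assumptions made for this example.

```python
def bracket_atom(symbol, query_h=None, atom_map=None):
    """Write a bracket atom; implicit hydrogens are never written."""
    text = symbol
    if query_h is not None:
        # Query hydrogen count: appended directly, no ';' operator.
        text += f"H{query_h}"
    if atom_map is not None:
        text += f":{atom_map}"
    return f"[{text}]"


print(bracket_atom("C", atom_map=1))  # implicit H atoms are omitted
print(bracket_atom("C", query_h=3))   # query H count is written
```
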

Implicit bond types

The default bond types for import and export strongly depend on the atoms connected by the bond.

  • Aromatic bonds are not written explicitly if neither atom is aliphatic and the atoms are in a ring.
    e.g.: c1ccccc1 But: c:c, c:[c;a], [#6]:c
  • Single bonds are not written explicitly if at least one atom is not aromatic.
    e.g.: CC, C[c;a], Cc, C[C;A], [#6]C But: [#6]-[c;a], c1ccc(cc1)-c2ccccc2
  • Single_or_aromatic bonds are not written explicitly if both atoms of the bond are aromatic and they are not in the same ring.
    e.g.: [#6]cc, [#6][c;a], [#6][#6]
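
The three rules can be read as a small decision function. The Python sketch below is one interpretation inferred from the examples above, not a description of the actual export code. It assumes each atom is tagged "aromatic", "aliphatic" or "unspecified" (for atoms such as [#6]), treats "not aromatic" in the single-bond rule as "explicitly aliphatic", and lets unspecified atoms qualify for the single_or_aromatic rule, since that is what the listed examples suggest. The function name, the tags and the explicit form "-,:" for single_or_aromatic are assumptions.

```python
def bond_symbol(bond, atom1, atom2, in_ring):
    """Return the bond text to write, or "" if the bond can stay implicit.

    bond: "single", "aromatic" or "single_or_aromatic".
    atom1, atom2: "aromatic", "aliphatic" or "unspecified".
    in_ring: True if both atoms are in the same ring.
    """
    any_aliphatic = "aliphatic" in (atom1, atom2)
    if bond == "aromatic":
        # Implicit only inside a ring when neither atom is aliphatic.
        return "" if (not any_aliphatic and in_ring) else ":"
    if bond == "single":
        # Implicit when at least one atom is aliphatic.
        return "" if any_aliphatic else "-"
    if bond == "single_or_aromatic":
        # Implicit outside a ring when neither atom is aliphatic.
        return "" if (not any_aliphatic and not in_ring) else "-,:"
    raise ValueError(f"unsupported bond type: {bond}")


# c1ccccc1: ring aromatic bonds stay implicit.
print(repr(bond_symbol("aromatic", "aromatic", "aromatic", True)))
# c:c outside a ring is written explicitly.
print(repr(bond_symbol("aromatic", "aromatic", "aromatic", False)))
# [#6]-[c;a]: a single bond with no aliphatic atom is written explicitly.
print(repr(bond_symbol("single", "unspecified", "aromatic", False)))
```

Checked against the examples in the list, this reading reproduces c1ccccc1, c:c, [#6]:c, CC, Cc, [#6]C, [#6]-[c;a], the biphenyl single bond, and [#6][#6].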

 

™: SMILES, SMARTS, and SMIRKS are trademarks of Daylight Chemical Information Systems.