Wednesday, January 12, 2011

....SMILES...

SMILES stands for Simplified Molecular Input Line Entry System.

Introduction to SMILES
  • SMILES is a line notation (a typographical method using printable characters) for entering and representing molecules and reactions.
  • The primary reason SMILES is more useful than a connection table is that it is a linguistic construct, rather than a computer data structure. 
  • SMILES is a true language, albeit with a simple vocabulary (atom and bond symbols) and only a few grammar rules.
  • SMILES representations of structure can in turn be used as "words" in the vocabulary of other languages designed for storage of chemical information (information about chemicals) and chemical intelligence (information about chemistry). 
A) Canonicalization
SMILES denotes a molecular structure as a graph with optional chiral indications. This is essentially the two-dimensional picture chemists draw to describe a molecule. The same molecule can usually be written as several different valid SMILES strings.
  • "unique SMILES" - the one special generic SMILES that a canonicalization algorithm generates from among all valid possibilities.
  • "isomeric SMILES" - SMILES written with isotopic and chiral specifications.
Examples:
Input SMILES            Unique SMILES
OCC                     CCO
[CH3][CH2][OH]          CCO
OC(=O)C(Br)(Cl)N        NC(Cl)(Br)C(=O)O
ClC(Br)(N)C(=O)O        NC(Cl)(Br)C(=O)O
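
To see why the inputs in the table above collapse to one unique SMILES, it helps to check that they really describe the same graph. The following is only a rough sketch and is not the canonicalization algorithm itself: the helper names parse and same_graph are my own, the check is a brute-force comparison that is only practical for very small molecules, and the parser assumes acyclic, non-aromatic SMILES and ignores hydrogen counts and charges inside brackets.

```python
import re
from itertools import permutations

# Tokens handled by this sketch: bracket atoms, organic-subset atoms,
# explicit bonds, and branch parentheses. Rings and aromatic atoms are
# NOT supported (assumption made to keep the example short).
TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[-=#]|[()]")
BOND_ORDER = {"-": 1, "=": 2, "#": 3}

def parse(smiles):
    """Parse a small acyclic SMILES into (elements, bonds)."""
    elements, bonds = [], {}
    stack, prev, order = [], None, 1
    for tok in TOKEN.findall(smiles):
        if tok == "(":
            stack.append(prev)          # a branch starts at the previous atom
        elif tok == ")":
            prev = stack.pop()          # the branch ends, go back to that atom
        elif tok in BOND_ORDER:
            order = BOND_ORDER[tok]     # applies to the next atom only
        else:
            if tok[0] == "[":
                # keep only the element symbol, e.g. [CH3] -> C
                tok = re.match(r"\[\d*([A-Z][a-z]?)", tok).group(1)
            elements.append(tok)
            cur = len(elements) - 1
            if prev is not None:
                bonds[frozenset((prev, cur))] = order
            prev, order = cur, 1        # an omitted bond is a single bond
    return elements, bonds

def same_graph(a, b):
    """Brute-force check that two SMILES denote the same labelled graph."""
    ea, ba = parse(a)
    eb, bb = parse(b)
    if sorted(ea) != sorted(eb) or len(ba) != len(bb):
        return False
    n = len(ea)
    for perm in permutations(range(n)):
        atoms_match = all(ea[i] == eb[perm[i]] for i in range(n))
        if atoms_match and all(
            bb.get(frozenset(perm[i] for i in pair)) == o
            for pair, o in ba.items()
        ):
            return True
    return False

print(same_graph("OCC", "CCO"))
print(same_graph("OC(=O)C(Br)(Cl)N", "ClC(Br)(N)C(=O)O"))
```

A real canonicalizer does not compare strings pairwise like this. It picks one atom ordering by a fixed rule, so every valid input produces the same output string.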

B) Atoms
Atoms are represented by their atomic symbols. Each non-hydrogen atom is specified independently by its atomic symbol enclosed in square brackets, [ ]. Elements in the "organic subset" (B, C, N, O, P, S, F, Cl, Br, and I) may be written without brackets if the number of attached hydrogens conforms to the lowest normal valence consistent with explicit bonds.
  • Aliphatic atoms are specified by capital letters, e.g., aliphatic carbon is represented by the capital letter C.
  • Atoms in aromatic rings are specified by lower case letters, e.g., aromatic carbon is represented by lower case c.
The following atomic symbols are valid SMILES notations:

Examples:
C methane CH4
N ammonia NH3
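
The "lowest normal valence" rule can be sketched in a few lines. This is only an illustration: the function name implicit_hydrogens is my own, and the valence table is the usual one given for the organic subset in SMILES documentation (B 3, C 4, N 3 or 5, O 2, P 3 or 5, S 2, 4 or 6, halogens 1).

```python
# Normal valences of the organic-subset elements (assumed from the
# standard SMILES rules; not listed in this post).
NORMAL_VALENCES = {
    "B": (3,), "C": (4,), "N": (3, 5), "O": (2,),
    "P": (3, 5), "S": (2, 4, 6),
    "F": (1,), "Cl": (1,), "Br": (1,), "I": (1,),
}

def implicit_hydrogens(symbol, bond_order_sum=0):
    """Hydrogens implied for an unbracketed atom.

    bond_order_sum is the total order of the bonds written explicitly
    (single = 1, double = 2, triple = 3).
    """
    for valence in NORMAL_VALENCES[symbol]:
        # pick the lowest normal valence that can hold the explicit bonds
        if valence >= bond_order_sum:
            return valence - bond_order_sum
    return 0  # explicit bonds already exceed every normal valence

print("C", implicit_hydrogens("C"))   # methane, CH4
print("N", implicit_hydrogens("N"))   # ammonia, NH3
```

So a bare C gets four hydrogens and a bare N gets three, which is why the one-letter strings above are complete molecules.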

Atoms with valences other than "normal" and elements not in the "organic subset" must be described in brackets [ ].

Examples:
[S] elemental sulfur
[Au] elemental gold

C) Bonds
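Single, double, triple, and aromatic bonds are represented by the symbols -, =, #, and :, respectively. Single and aromatic bonds may be omitted, so adjacent atoms with no bond symbol between them are joined by a single (or aromatic) bond.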

Examples:
CC ethane CH3CH3
C=O formaldehyde CH2O
O=C=O carbon dioxide CO2

D) Branches
Branches are specified by enclosing them in parentheses, and can be nested or stacked. In all cases, the implicit connection to a parenthesized expression (a "branch") is to the left.

Example:
CCN(CC)CC triethylamine
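
The "connection is to the left" rule maps nicely onto a stack. Here is a minimal sketch, assuming only unbracketed organic-subset atoms and parentheses. The function name neighbors is my own, and bond symbols are simply skipped because only connectivity is tracked.

```python
import re

def neighbors(smiles):
    """Return (atoms, adjacency) for a simple branched SMILES."""
    atoms, adjacency = [], []
    stack, prev = [], None
    for tok in re.findall(r"Br|Cl|[BCNOPSFI]|[()]", smiles):
        if tok == "(":
            stack.append(prev)      # remember the atom the branch hangs from
        elif tok == ")":
            prev = stack.pop()      # continue from that atom after the branch
        else:
            atoms.append(tok)
            adjacency.append([])
            cur = len(atoms) - 1
            if prev is not None:    # bond the new atom to the atom on its left
                adjacency[prev].append(cur)
                adjacency[cur].append(prev)
            prev = cur
    return atoms, adjacency

atoms, adj = neighbors("CCN(CC)CC")
print(atoms[2], adj[2])  # prints: N [1, 3, 5]
```

The nitrogen (atom 2) ends up with three carbon neighbours, one from the main chain on its left, one from the parenthesized branch, and one from the chain that continues after the branch, which is exactly triethylamine.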



 
These are examples of SMILES that I have made using ACD/ChemSketch.









SMILES is not only comprehensive but also well documented!!!

