SMILES on Wikipedia
The original SMILES specification was developed by Arthur Weininger and David Weininger in the late 1980s. It has since been modified and extended by others, most notably by Daylight Chemical Information Systems Inc.
SMILES also has a wide base of software support, with extensive theoretical (e.g., graph theory) backing.
A common application of Canonical SMILES is for indexing and ensuring uniqueness of molecules in a database.
In terms of a graph-based computational procedure, SMILES is a string obtained by printing the symbol nodes encountered in a depth-first tree traversal of a chemical graph. The chemical graph is first trimmed to remove hydrogen atoms, and cycles are broken to turn it into a spanning tree.
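The traversal described above can be sketched in a few lines of Python. This is a minimal illustration, not the Daylight algorithm: the function name to_smiles and the input layout (a list of element symbols plus a dict of bonds) are assumptions made for this example. It covers only organic-subset atoms without charges, aromaticity, or stereochemistry, and it does not produce canonical SMILES. Broken ring bonds are marked with matching digits and branches go in parentheses, following the usual SMILES conventions.

```python
def to_smiles(atoms, bonds, start=0):
    """Write a SMILES string for a hydrogen-suppressed molecular graph.

    atoms: list of element symbols, e.g. ["O", "C", "C"]
    bonds: dict {(i, j): symbol}, symbol "" for single, "=" for double
    (Both layouts are assumptions of this sketch.)
    """
    adj = {i: [] for i in range(len(atoms))}
    for (i, j), sym in bonds.items():
        adj[i].append((j, sym))
        adj[j].append((i, sym))

    visited = set()
    seen_edges = set()
    tree_children = {i: [] for i in adj}  # spanning-tree edges
    closures = {i: [] for i in adj}       # ring-closure labels per atom
    next_digit = 1

    def explore(node):
        # Pass 1: depth-first search that splits bonds into tree edges
        # and ring closures (the bonds "broken" to get a spanning tree).
        nonlocal next_digit
        visited.add(node)
        for nbr, sym in adj[node]:
            edge = frozenset((node, nbr))
            if edge in seen_edges:
                continue
            seen_edges.add(edge)
            if nbr in visited:
                closures[node].append((next_digit, sym))
                closures[nbr].append((next_digit, ""))
                next_digit += 1
            else:
                tree_children[node].append((nbr, sym))
                explore(nbr)

    def write(node):
        # Pass 2: print atoms in depth-first order along the tree.
        out = atoms[node]
        for digit, sym in closures[node]:
            out += sym + (str(digit) if digit < 10 else f"%{digit}")
        children = tree_children[node]
        for k, (child, sym) in enumerate(children):
            piece = sym + write(child)
            # Every branch except the last is wrapped in parentheses.
            out += piece if k == len(children) - 1 else f"({piece})"
        return out

    explore(start)
    return write(start)


ethanol = to_smiles(["O", "C", "C"], {(0, 1): "", (1, 2): ""})
acetic_acid = to_smiles(["O", "C", "O", "C"],
                        {(0, 1): "", (1, 2): "=", (1, 3): ""})
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
cyclohexane = to_smiles(["C"] * 6, {b: "" for b in ring})
print(ethanol, acetic_acid, cyclohexane)  # OCC OC(=O)C C1CCCCC1
```

Starting from a different atom or visiting neighbors in a different order gives a different but equally valid string, which is why a separate canonicalization step is needed before SMILES can serve as a unique key.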
SMARTS is a modification of SMILES that allows, in addition to the SMILES elements, the specification of wildcard atoms and bonds. It is widely used to specify query structures in chemical database search applications.
Improved SMILES Substructure Searching, by Daylight
Daylight Theory Manual - Covering general information on representing molecules and an in-depth discussion of SMILES™, SMARTS®, SMIRKS®, fingerprints, THOR database concepts, and Merlin analysis (HTML, PDF)
SMARTS - A Language for Describing Molecular Patterns
Fingerprints - Screening and Similarity
OpenBabel, including Implementation of Daylight SMARTS molecular matching syntax
makefp is a command-line program to compute hashed path fingerprints from input SMILES, or from other file formats such as SDF or MOL files.
Checkmol is a command-line utility program which reads molecular structure files in different formats and analyzes the input molecule for the presence of various functional groups and structural elements.
Search by Functional groups
PubChem similarity search allows you to find chemical structures similar to the provided query. Similarity is measured using the Tanimoto equation and a binary fingerprint computed for every structure in the PubChem Compound database. This fingerprint consists of a series of chemical substructure “keys”. Each key denotes the presence or absence of a particular substructure in a molecule. The fingerprint does not consider variation in stereochemical or isotopic information. Collectively, these binary keys provide a “fingerprint” of a particular chemical structure valence-bond form.
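As a rough illustration of the Tanimoto equation on binary fingerprints: with a bits set in the first fingerprint, b in the second, and c set in both, the score is c / (a + b - c). The sketch below stores each fingerprint as a Python integer used as a bit field. The four keys are hypothetical and chosen only for this example; they are not PubChem's actual substructure keys. Returning 0.0 for two empty fingerprints is an arbitrary choice of this sketch.

```python
# Hypothetical substructure keys, one bit each (NOT the real PubChem keys).
KEYS = ["hydroxyl", "carbonyl", "methyl", "aromatic ring"]


def fingerprint(present):
    """Build a bit-field fingerprint from the names of the keys present."""
    bits = 0
    for name in present:
        bits |= 1 << KEYS.index(name)
    return bits


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared on-bits over on-bits in either."""
    common = bin(fp_a & fp_b).count("1")
    either = bin(fp_a | fp_b).count("1")
    return common / either if either else 0.0


ethanol_fp = fingerprint(["hydroxyl", "methyl"])
acetic_acid_fp = fingerprint(["hydroxyl", "carbonyl", "methyl"])
score = tanimoto(ethanol_fp, acetic_acid_fp)
print(round(score, 3))  # 0.667 (2 shared keys out of 3 set in either)
```

A similarity search then amounts to computing this score between the query fingerprint and each stored fingerprint and keeping the hits above a chosen threshold.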
PubChem Substructure search allows you to locate chemical structures that contain the particular connectivity and valence bond pattern that you provide in your query. For example, a substructure search of ethanol (SMILES: OCC) would return, among others, acetic acid (SMILES: OC(=O)C), since the heavy-atom skeleton of ethanol (O-C-C) is contained in acetic acid.
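The ethanol and acetic acid example can be checked with a naive backtracking matcher over hydrogen-suppressed graphs. This is only a sketch of the general idea, not how PubChem implements its search: the function name is_substructure and the input layout (element symbols plus a dict of bond orders) are assumptions for this example, and the brute-force search is practical only for small molecules.

```python
def is_substructure(q_atoms, q_bonds, t_atoms, t_bonds):
    """Return True if the query graph can be embedded in the target graph.

    Atoms are lists of element symbols; bonds are dicts {(i, j): order}.
    Query bonds must appear in the target with the same order; the target
    may have additional atoms and bonds.
    """
    def order(bonds, i, j):
        # Bonds are undirected, so look the pair up in both orientations.
        return bonds.get((i, j)) or bonds.get((j, i))

    def extend(mapping):
        # mapping[j] is the target atom assigned to query atom j.
        k = len(mapping)
        if k == len(q_atoms):
            return True
        for cand in range(len(t_atoms)):
            if cand in mapping or t_atoms[cand] != q_atoms[k]:
                continue
            # Each query bond from atom k to an already-placed atom
            # must exist in the target with the same bond order.
            ok = all(
                order(q_bonds, k, j) is None
                or order(q_bonds, k, j) == order(t_bonds, cand, mapping[j])
                for j in range(k)
            )
            if ok and extend(mapping + [cand]):
                return True
        return False

    return extend([])


# Ethanol, OCC
ethanol = (["O", "C", "C"], {(0, 1): 1, (1, 2): 1})
# Acetic acid, OC(=O)C
acetic_acid = (["O", "C", "O", "C"], {(0, 1): 1, (1, 2): 2, (1, 3): 1})
# Dimethyl ether, COC: same atoms as ethanol, different connectivity
dimethyl_ether = (["C", "O", "C"], {(0, 1): 1, (1, 2): 1})

print(is_substructure(*ethanol, *acetic_acid))     # True
print(is_substructure(*ethanol, *dimethyl_ether))  # False
```

Production systems typically run a cheap fingerprint screen first and apply an exact graph match like this only to the candidates that pass it.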
OpenEye software
Roll Your Own Chemical Database With Free Components
Creating a Web-based, Searchable Molecular Structure Database Using Free Software
How to create a web-based molecular structure database with free software: a fine presentation to read.