Similarity Searching
Look for the compounds that are most similar to the query compound.
Each compound in the database is ranked by its similarity to the query.
In other application areas, the technique is known as pattern matching or signature analysis.
2D Similarity Measures
Commonly based on fingerprints: binary vectors in which 1 indicates the presence of a fragment and 0 its absence.
Can be based on structural keys, hashed fingerprints, or continuous data (e.g., topological indexes that take into account size, degree of branching, and overall shape).
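As a rough illustration of the binary-vector idea, the sketch below builds a structural-key fingerprint from a list of fragments already known to be present in a molecule. The key list, the fragment names, and the function name `structural_key_fingerprint` are all hypothetical; detecting the fragments in a real structure (substructure search) is assumed to have been done elsewhere.

```python
# Hypothetical structural keys: each bit position stands for one predefined fragment.
STRUCTURAL_KEYS = ["carbonyl", "hydroxyl", "amine", "aromatic ring", "halogen"]


def structural_key_fingerprint(fragments_present):
    """Return one bit per key: 1 if the fragment is present, 0 if it is absent."""
    return [1 if key in fragments_present else 0 for key in STRUCTURAL_KEYS]


# A made-up molecule containing a hydroxyl group and an aromatic ring.
fingerprint = structural_key_fingerprint({"hydroxyl", "aromatic ring"})
print(fingerprint)  # [0, 1, 0, 1, 0]
```

A hashed fingerprint differs in that bit positions are not tied to a fixed dictionary of fragments, so this sketch covers the structural-key case only.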
Tanimoto Coefficient
Tanimoto coefficient of similarity for molecules A and B: $S_{AB} = \frac{c}{a + b - c}$
a = number of bits set to 1 in A, b = number of bits set to 1 in B, c = number of 1 bits common to both.
Range is 0 to 1.
A value of 1 does not mean the molecules are identical, only that their fingerprints are.
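The definition above can be sketched in a few lines of Python, together with the ranking step of a similarity search. Fingerprints are represented here as plain lists of 0/1 values; the function names, the molecule names, and the example bit patterns are made up for illustration, and returning 0.0 for two all-zero fingerprints is a convention assumed here, not something the definition specifies.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient c / (a + b - c) for two equal-length 0/1 lists."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    a = sum(fp_a)                                  # bits set to 1 in A
    b = sum(fp_b)                                  # bits set to 1 in B
    c = sum(x & y for x, y in zip(fp_a, fp_b))     # 1 bits common to both
    if a + b - c == 0:
        return 0.0  # assumed convention when neither fingerprint has any bit set
    return c / (a + b - c)


def rank_database(query_fp, database):
    """Score every database entry against the query and sort, most similar first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


query = [1, 1, 0, 1, 0, 0, 1, 0]
database = {
    "mol_3": [0, 0, 1, 0, 1, 0, 0, 0],  # no bits in common with the query
    "mol_2": [1, 1, 0, 0, 0, 0, 1, 1],  # a = 4, b = 4, c = 3
    "mol_1": [1, 1, 0, 1, 0, 0, 1, 0],  # same bit pattern as the query
}

for name, score in rank_database(query, database):
    print(f"{name}: {score:.2f}")
# mol_1: 1.00
# mol_2: 0.60
# mol_3: 0.00
```

Here `mol_1` scores 1.0 because its fingerprint matches the query bit for bit, which says nothing about whether the two underlying structures are the same molecule.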
Similarity Coefficients
Tanimoto coefficient is most widely used for binary fingerprints.
Others: Dice coefficient, cosine similarity, Euclidean distance, Hamming distance, Soergel distance.
Distance values must be symmetric.
Distance values must obey the triangle inequality: $D_{AB} \le D_{AC} + D_{BC}$.
The distance between non-identical objects must be greater than zero.
Dissimilarity = distance in the n-dimensional descriptor space.
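For binary fingerprints, the other coefficients listed above can all be written in terms of the same counts a, b, and c used for the Tanimoto coefficient. The sketch below uses the standard binary forms; the function names and example fingerprints are illustrative, the zero-denominator handling is an assumed convention, and the final check only tests the symmetry and triangle conditions on one triple of fingerprints, which is a spot check, not a proof.

```python
import math


def counts(fp_a, fp_b):
    """Return (a, b, c): on-bits in A, on-bits in B, on-bits common to both."""
    a = sum(fp_a)
    b = sum(fp_b)
    c = sum(x & y for x, y in zip(fp_a, fp_b))
    return a, b, c


def dice(fp_a, fp_b):
    a, b, c = counts(fp_a, fp_b)
    return 2 * c / (a + b) if a + b else 0.0      # assumed 0.0 for two empty fingerprints


def cosine(fp_a, fp_b):
    a, b, c = counts(fp_a, fp_b)
    return c / math.sqrt(a * b) if a * b else 0.0  # assumed 0.0 if either is empty


def hamming(fp_a, fp_b):
    a, b, c = counts(fp_a, fp_b)
    return a + b - 2 * c                           # number of differing bit positions


def euclidean(fp_a, fp_b):
    return math.sqrt(hamming(fp_a, fp_b))          # binary case: root of the Hamming distance


def soergel(fp_a, fp_b):
    a, b, c = counts(fp_a, fp_b)
    union = a + b - c
    return 1 - c / union if union else 0.0         # binary case: 1 minus Tanimoto


A = [1, 1, 0, 1, 0, 0, 1, 0]
B = [1, 1, 0, 0, 0, 0, 1, 1]
C = [0, 0, 1, 0, 1, 0, 0, 0]

print(dice(A, B), cosine(A, B), hamming(A, B))  # 0.75 0.75 2

# Spot-check symmetry and the triangle inequality D_AB <= D_AC + D_BC.
for dist in (hamming, euclidean, soergel):
    assert dist(A, B) == dist(B, A)
    assert dist(A, B) <= dist(A, C) + dist(B, C)
```

Dice and cosine are similarity coefficients (larger means more alike), while Hamming, Euclidean, and Soergel are distances (larger means less alike), so only the last three are checked against the distance conditions.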
3D Similarity
The aim is often to identify structurally different molecules.
3D methods require consideration of the conformational properties of molecules.
Pharmacophore
A structural abstraction of the interactions between various functional group types in a compound.
Described by a spatial representation of these groups as centers (or vertices) of geometrical polyhedra, together with the pairwise distances between centers.
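One way to picture this representation is as a set of labeled 3D points plus the table of distances between every pair of them. The sketch below assumes a hypothetical three-center pharmacophore with invented coordinates (in angstroms); the feature labels and the function name `pairwise_distances` are illustrative only.

```python
import math
from itertools import combinations

# Hypothetical pharmacophore: feature type -> (x, y, z) center, made-up coordinates.
pharmacophore = {
    "donor": (0.0, 0.0, 0.0),
    "acceptor": (3.0, 0.0, 0.0),
    "aromatic": (0.0, 4.0, 0.0),
}


def pairwise_distances(centers):
    """Return the Euclidean distance for every pair of pharmacophore centers."""
    return {
        (name_1, name_2): math.dist(point_1, point_2)
        for (name_1, point_1), (name_2, point_2) in combinations(centers.items(), 2)
    }


for pair, distance in pairwise_distances(pharmacophore).items():
    print(pair, distance)
# ('donor', 'acceptor') 3.0
# ('donor', 'aromatic') 4.0
# ('acceptor', 'aromatic') 5.0
```

Three centers give a triangle with three distances; a fourth center would give a tetrahedron with six. Because the distances depend on the 3D coordinates, a different conformation of the same molecule can yield a different distance table.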
http://www.ma.psu.edu/~csb15/pubs/searle.pdf