This page describes the problem of rigid molecule docking. We describe two variants of
the problem (ligand-protein and protein-protein docking) and various criteria for a "good"
docking. We then describe the DOCK algorithm developed by Kuntz et al., and the basic
principles of the soft docking algorithm by Jiang & Kim and of the docking algorithm by
Katchalski-Katzir et al.
Rigid docking
Ligand protein docking
Protein protein docking
The DOCK algorithm
Soft (Flexible) docking
Rigid Docking
In the rigid molecule docking problem we treat the molecules as rigid
objects that cannot change their spatial shape during the docking process.
In molecular biology, the docking problem arises in two main settings:
ligand-protein docking and protein-protein docking.
Ligand-Protein Docking:
This problem involves a large molecule (the protein, also
called the 'receptor') and a small molecule (the ligand), and
is very useful in drug development. A common
situation is the 'lock and key' situation, in which the ligand
docks in a cavity of the protein.
Protein-Protein Docking:
This problem involves two proteins of approximately
the same size. The docking site is therefore usually a
more "planar" surface than in ligand-protein docking,
and cases in which one molecule docks inside a cavity
of the other molecule are very rare.
Several physical and chemical forces act between the two
molecules. These forces are used to define various docking scores that measure
how good each solution is. These scores take into account the strength of these
forces and the plausibility of the docking solution.
The most significant forces are:
Electrical forces:
The most significant force. It draws parts of the molecules closer
together or pushes them further apart according to their electrical charges.
Van der Waals forces:
This force is very significant when the molecules are close and their
contact surface is large.
The formula for this force is $A/r^{6} - B/r^{12}$, where A and B are constants
and r is the distance between the interacting atoms. Notice that when the distance r is
very small, the $r^{12}$ term dominates and there is a strong repulsive force driving them apart.
Hydrogen bonds:
These bonds can significantly strengthen the binding between the two
molecules. A hydrogen bond occurs when one molecule has a hydrogen atom close to
the docking surface that interacts with an atom of the second molecule
once the molecules are docked.
Figure 1 - When the docking of the molecules is achieved, a hydrogen bond occurs between the hydrogen atom of the donor and another atom (in this case, an oxygen atom) in the acceptor.
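The distance dependence of the Van der Waals expression above can be checked numerically. The following is a minimal sketch that evaluates the expression exactly as it is written in the text; the function name and the constants A = B = 1 are illustrative assumptions, not values taken from any docking program.

```python
def six_twelve_term(r, a=1.0, b=1.0):
    """Evaluate A/r^6 - B/r^12 as written in the text.

    a and b are placeholder constants (assumed to be 1.0 here).
    """
    if r <= 0:
        raise ValueError("distance must be positive")
    return a / r**6 - b / r**12


# At small r the r^12 term dominates; at larger r the r^6 term dominates.
for r in (0.5, 1.0, 2.0):
    print(r, six_twelve_term(r))
```

With these constants the two terms cancel at r = 1. Below that distance the $r^{12}$ term grows much faster than the $r^{6}$ term, which is the strong short-range repulsion described above.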
1. Compute the molecular surface using Connolly's method. The output of this
stage is a set of points on the 'smoothed' molecular surface with their normals.
2. SPHGEN (Sphere Generator) - This stage creates a new representation
of the molecular surface of the protein and the ligand using 'pseudo-atoms'
(see below) and then uses this representation to find plausible
docking sites on the molecular surface. The docking sites that
SPHGEN looks for are cavities in the surface of the receptor. The
rest of the algorithm focuses on these sites only.
This stage is crucial to the success of the algorithm and will be explained
in more detail below.
After all this is performed on the receptor, the same is done on the ligand,
but this time we take the points together with the vectors opposite to their normals,
in order to create the spheres inside the surface instead of outside the
surface.
The result of SPHGEN on the receptor is sometimes called the 'negative
image', and the result on the ligand is called the 'positive image'.
Kuntz's group uses the distances between the points instead of the
points' locations. For each molecule, the receptor (R) and the ligand (L),
an appropriate distance matrix is defined: $d^{R}_{i,j}$ and $d^{L}_{i,j}$, respectively.
They then try to find two subsets, one in R and one in L, whose internal distances
are the same up to some error tolerance. These two subsets define
two subgraphs with almost identical distances between their vertices.
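The distance-matrix comparison described above can be sketched as follows. This is an illustration under assumptions, not the DOCK implementation: the function names, the toy coordinates, and the tolerance value are all hypothetical.

```python
import math


def distance_matrix(points):
    """Return the matrix of pairwise Euclidean distances between points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def subsets_match(d_r, d_l, r_idx, l_idx, tol):
    """Check whether receptor points r_idx and ligand points l_idx, paired in
    order, have the same internal distances up to the tolerance tol."""
    k = len(r_idx)
    return all(
        abs(d_r[r_idx[a]][r_idx[b]] - d_l[l_idx[a]][l_idx[b]]) <= tol
        for a in range(k)
        for b in range(a + 1, k)
    )


# Toy data: the ligand points are a rotated and translated copy of the
# first three receptor points.
receptor = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 5)]
ligand = [(10, 10, 10), (10, 11, 10), (12, 10, 10)]
d_r = distance_matrix(receptor)
d_l = distance_matrix(ligand)
print(subsets_match(d_r, d_l, [0, 1, 2], [0, 1, 2], tol=1e-6))
print(subsets_match(d_r, d_l, [0, 1, 3], [0, 1, 2], tol=1e-6))
```

Because only distances are compared, the check is unaffected by how either molecule is rotated or translated, which is why distances are used in place of point locations.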
This is done using a method similar to the interpretation tree of
Grimson and Lozano-Perez:
The root has several matching candidates, and as we go down the tree
we check that the distances along the path comply with the required
error tolerance. To make the algorithm more efficient, the implementation by Kuntz's
group trims off paths where the matching is not good
enough. However, this makes the implementation sensitive to the order of the
points in the tree, and in the worst case the algorithm can still have an exponential
run-time.
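A tree search of this kind can be sketched as a depth-first search over partial matchings, where a branch is abandoned as soon as a newly added pair violates the distance tolerance. This is a simplified illustration, not the implementation of Kuntz's group; the function name, the `min_size` parameter, and the toy coordinates are assumptions.

```python
import math


def match_points(rec, lig, tol, min_size=3):
    """Find the largest set of (receptor index, ligand index) pairs whose
    pairwise distances agree within tol. Worst-case run-time is exponential."""
    best = []

    def extend(pairs, next_r):
        nonlocal best
        if len(pairs) > len(best):
            best = list(pairs)
        for i in range(next_r, len(rec)):
            for j in range(len(lig)):
                if any(j == pj for _, pj in pairs):
                    continue  # ligand point already used on this path
                # Prune: the new pair must agree with every pair on the path.
                if all(
                    abs(math.dist(rec[i], rec[pi]) - math.dist(lig[j], lig[pj])) <= tol
                    for pi, pj in pairs
                ):
                    pairs.append((i, j))
                    extend(pairs, i + 1)
                    pairs.pop()

    extend([], 0)
    return best if len(best) >= min_size else []


receptor = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 5)]
ligand = [(10, 10, 10), (10, 11, 10), (12, 10, 10)]
print(match_points(receptor, ligand, tol=1e-6))
```

Each level of the tree adds one pair, and the distance check on every new pair is what prunes the search. The sketch keeps only the first largest matching it finds, so its result can depend on the order in which points are visited.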
Each cell in the grid is marked as "surface" (if it contains at least one
Connolly point) or "volume" (if it doesn't contain any Connolly point).
Usually, each surface cell contains 2-3 Connolly points.
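The cell-marking rule stated above can be sketched as follows. The function names, the cubic cell size, and the sample points are illustrative assumptions; the sketch applies only the rule as given in the text and does not decide which cells belong to the molecule at all.

```python
from collections import Counter


def count_points_per_cell(connolly_points, cell_size):
    """Count how many Connolly points fall into each cubic grid cell."""
    return Counter(
        tuple(int(c // cell_size) for c in point) for point in connolly_points
    )


def cell_type(counts, cell):
    """Apply the rule from the text: at least one Connolly point -> 'surface'."""
    return "surface" if counts.get(cell, 0) >= 1 else "volume"


points = [(0.2, 0.1, 0.3), (0.8, 0.4, 0.9), (1.5, 0.2, 0.1)]
counts = count_points_per_cell(points, cell_size=1.0)
print(counts[(0, 0, 0)], cell_type(counts, (0, 0, 0)))
print(counts[(2, 2, 2)], cell_type(counts, (2, 2, 2)))
```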
This is done first in low resolution, and the best results are calculated
again in fine resolution with the addition of an approximate energetic
score. The approximate energetic score is calculated according to the
number of "favourable" and "unfavourable" interactions. The atoms of each
molecule are assigned to several categories, and combinations of
these categories are marked as "favourable" if they make a good
contribution to the energetic plausibility of the match, or "unfavourable"
otherwise.
For example, it is unfavourable for an atom with a positive charge to be
placed near another atom with a positive charge, but it is favourable for two
atoms to be adjacent if one of them is an H-donor and the other is an
H-acceptor.
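Counting favourable and unfavourable interactions can be sketched as below. The category names and the two-entry classification table are hypothetical and contain only the two examples given in the text. How the counts are combined into a single score is not specified here, so the sketch simply returns both counts.

```python
# Hypothetical classification table, limited to the two examples in the text.
FAVOURABLE = {frozenset(["h_donor", "h_acceptor"])}
UNFAVOURABLE = {frozenset(["positive"])}  # positive next to positive


def count_interactions(contacts):
    """contacts: list of (category of atom in A, category of atom in B) for
    pairs of adjacent atoms. Returns (favourable count, unfavourable count).
    Pairs that appear in neither table are ignored (an assumption)."""
    favourable = unfavourable = 0
    for cat_a, cat_b in contacts:
        key = frozenset((cat_a, cat_b))
        if key in FAVOURABLE:
            favourable += 1
        elif key in UNFAVOURABLE:
            unfavourable += 1
    return favourable, unfavourable


contacts = [
    ("h_donor", "h_acceptor"),
    ("h_acceptor", "h_donor"),
    ("positive", "positive"),
    ("positive", "h_donor"),
]
print(count_interactions(contacts))
```

Using `frozenset` as the key makes the table symmetric, so a donor-acceptor contact counts the same regardless of which molecule contributes the donor.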
Katchalski-Katzir et al. (1992)
The basic idea is to enumerate the possible translations, while using
FFT to calculate the matching score efficiently. As in the previous
algorithm, both molecules are placed on a 3-dimensional grid, but here 3
types of grid cells are defined - "volume", "surface" and "intermediate".
If the molecules are A and B, we define the matrices $A_{l,m,n}$ and $B_{l,m,n}$ as
follows (l,m,n are the grid coordinates)
We choose q<0 and r>0, where |q| is large and |r| is small. The scalar
product of these matrices, evaluated for every translation, can be calculated
efficiently using FFT, thus improving the algorithm's performance considerably.
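The use of FFT to score all translations at once can be sketched with NumPy. The grid contents below are assumptions made for illustration: surface cells are set to 1, the interior of A to q, the interior of B to r, and all other cells to 0. The values of q and r and the toy shapes are hypothetical. The sketch uses cyclic shifts, so a real implementation would need to pad the grids to avoid wrap-around.

```python
import numpy as np


def correlation_scores(a, b):
    """Return c with c[s] = sum over x of a[x] * b[x + s] for every cyclic
    shift s, computed with forward and inverse FFTs."""
    return np.real(np.fft.ifftn(np.conj(np.fft.fftn(a)) * np.fft.fftn(b)))


def direct_score(a, b, shift):
    """Score a single shift directly, for comparison with the FFT result."""
    shifted = np.roll(b, tuple(-s for s in shift), axis=(0, 1, 2))
    return float(np.sum(a * shifted))


n = 8
q, r = -15.0, 0.5  # assumed values: q < 0 with large |q|, r > 0 with small |r|

a = np.zeros((n, n, n))
a[1:5, 1:5, 1:5] = 1.0   # outer layer of a toy cube, treated as surface
a[2:4, 2:4, 2:4] = q     # interior of A

b = np.zeros((n, n, n))
b[0:3, 0:3, 0:3] = 1.0   # outer layer of a smaller toy cube
b[1, 1, 1] = r           # interior of B

scores = correlation_scores(a, b)
best_shift = np.unravel_index(np.argmax(scores), scores.shape)
print(best_shift, scores[best_shift])
```

For a grid with N cells per axis, scoring every translation directly costs on the order of N^6 operations, while the FFT route costs on the order of N^3 log N. The large negative q penalizes translations in which B penetrates the interior of A, while surface-to-surface overlap contributes positively.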