Ligand docking
CS/CME/BioE/Biophys/BMI 279
    Oct. 25 and 27, 2016
          Ron Dror
                       Outline
•   Goals of ligand docking
•   Defining binding affinity (strength)
•   Computing binding affinity: Simplifying the problem
•   Ligand docking methodology
•   How well does docking work?
Goals of ligand docking
        A drug binding to its target
    (The great majority of drug targets are proteins)
Beta-blocker alprenolol binding to an adrenaline receptor
                                            Dror et al., PNAS 2011
                     Problem definition
• A ligand is any molecule that binds to a protein
  – We’ll also use ligand to refer to any molecule that might bind to a protein
    (e.g., any candidate drug)
• Ligand docking addresses two problems:
  – Given a ligand known to bind a particular protein, what is its binding pose
    (that is, the location, orientation, and internal conformation of the bound
    ligand)?
  – How tightly does a ligand bind a given protein?
                                                http://www.nih.gov/researchmatters/october2012/images/structure_l.jpg
            Why is docking useful?
• Virtual screening: Identifying drug candidates by
  considering large numbers of possible ligands
• Lead optimization: Modifying a drug candidate to
  improve its properties
  – If the binding pose of the candidate is unknown,
    docking can help identify it (which helps envision how
    modifying the ligand would affect its binding)
  – Docking can predict binding strengths of related
    compounds
Ligand docking: a graphical summary
           http://www.slideshare.net/baoilleach/proteinligand-docking-13581869
Defining binding affinity (strength)
     How do we measure how tightly a
        ligand binds to a protein?
• Binding affinity quantifies the binding strength of a ligand to a
  protein (or other target)
  – Conceptual definition: if we mix the protein and the ligand (with no
    other ligands around), what fraction of the time will the protein have
    a ligand bound?
     •   This depends on ligand concentration, so we assume that the ligand is
         present at some standard concentration.
  – Binding affinity is usually expressed as either:
     •   The difference ΔG in free energy of the bound state (all atomic
         arrangements where the protein is ligand-bound) and the unbound
         state (all atomic arrangements where the protein is not ligand-bound)
         –   Again, assume standard concentration of ligand
         –   From ΔG, one can compute the fraction of time the ligand will be bound
     •   A dissociation constant (Kd), which is (roughly) the ligand concentration
         at which half the protein molecules will have a ligand bound
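The link between ΔG, Kd, and the fraction of time the protein is bound can be sketched numerically. The sketch below assumes a simple one-site, two-state binding model with the ligand in large excess over the protein, a standard concentration of 1 M, and a temperature of about 298 K. The function names are illustrative and do not come from the lecture.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol*K)
TEMPERATURE = 298.15   # assumed temperature in K
C_STANDARD = 1.0       # assumed standard concentration in mol/L

def kd_from_delta_g(delta_g_kcal):
    """Dissociation constant (mol/L) from a binding free energy (kcal/mol).

    Uses delta_G = R*T*ln(Kd / C_STANDARD), so a more negative delta_G
    corresponds to a smaller Kd (tighter binding).
    """
    return C_STANDARD * math.exp(delta_g_kcal / (R_KCAL * TEMPERATURE))

def delta_g_from_kd(kd_molar):
    """Binding free energy (kcal/mol) from a dissociation constant (mol/L)."""
    return R_KCAL * TEMPERATURE * math.log(kd_molar / C_STANDARD)

def fraction_bound(ligand_conc_molar, kd_molar):
    """Fraction of protein molecules with a ligand bound.

    Assumes the free ligand concentration is approximately the total
    ligand concentration (ligand in large excess over protein).
    """
    return ligand_conc_molar / (ligand_conc_molar + kd_molar)

kd = kd_from_delta_g(-10.0)
print(f"Kd for delta_G = -10 kcal/mol: {kd:.2e} M")
print(f"Fraction bound at [L] = Kd: {fraction_bound(kd, kd)}")  # 0.5
```

When the ligand concentration equals Kd, the fraction bound is one half, which matches the rough definition of Kd given above.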
Computing binding affinity: 
 Simplifying the problem
       Direct approach to computing
               binding affinity
• Run a really long molecular dynamics (MD)
  simulation in which a ligand binds to and unbinds
  from a protein many times.
• Directly observe the fraction of time the ligand is
  bound.
    The direct approach doesn’t work
• It is so computationally intensive that we usually
  cannot do it for even a single ligand
  – Drug molecules usually take seconds to hours to unbind
    from their targets.
  – Microsecond-timescale molecular dynamics simulations
    usually take days.
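A back-of-envelope calculation makes the gap concrete. All three inputs below are assumptions chosen for illustration: a throughput of one simulated microsecond per day, an unbinding time of one second (the low end of the range above), and ten unbinding events for usable statistics. Time spent waiting for the ligand to rebind is ignored.

```python
# Back-of-envelope estimate; every input here is an illustrative assumption.
SIMULATED_US_PER_DAY = 1.0   # assumed MD throughput: 1 microsecond per day
UNBINDING_TIME_S = 1.0       # low end of the "seconds to hours" range
EVENTS_NEEDED = 10           # assumed number of unbinding events to observe

def wall_clock_years(unbinding_time_s, events, simulated_us_per_day):
    """Years of computing needed to simulate the given number of unbinding events."""
    simulated_us = unbinding_time_s * 1e6 * events     # seconds -> microseconds
    wall_clock_days = simulated_us / simulated_us_per_day
    return wall_clock_days / 365.25

years = wall_clock_years(UNBINDING_TIME_S, EVENTS_NEEDED, SIMULATED_US_PER_DAY)
print(f"Roughly {years:,.0f} years of computing")  # tens of thousands of years
```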
           What can we do instead?
Option 1: Use alternative MD-based approaches
• It turns out that one can compute binding affinities by
  MD in more efficient ways
   – These methods, called free energy perturbation (FEP) and
     thermodynamic integration (TI), are very clever
   – They represent the most accurate way to determine binding
     affinities computationally
   – They are very expensive computationally and thus cannot be
     used on large numbers of ligands
   – They assume that one knows the binding pose
• There are also methods based on implicit solvent MD
  simulation (water molecules not represented explicitly)
   – These methods are faster, but still computationally intensive
   – They are somewhat less accurate
   – They again assume that one knows the binding pose
You are not responsible for any of the methods on this slide
              Option 2: Ligand docking
• Ligand docking is a fast, heuristic approach with two key
  components
  – A scoring function that very roughly approximates the binding affinity
    of a ligand to a protein given a binding pose
  – A search method that searches for the best-scoring binding pose for
    a given ligand
• Most ligand docking methods assume that
  – The protein is rigid
  – The approximate binding site is known
     •   That is, one is looking for ligands that will bind to a particular site on the
         target
• In reality, ligand mobility, protein mobility, and water molecules
  all play a major role in determining binding affinity
  – Docking is approximate but useful
  – The term scoring function is used instead of energy function to
    emphasize the highly approximate nature of the scoring function
                  Docking software
• Most popular (based on citations 2001–2011):
  – AutoDock
  – GOLD
  – DOCK
  – FlexX
  – Glide
  – FTDOCK
  – QXP
Sousa et al., Current Medicinal Chemistry 2013
http://en.wikipedia.org/wiki/Docking_(molecular)
You are not responsible for the details on this slide
Ligand docking methodology
                   Scoring functions
• Scoring functions used for docking tend to be
  empirical
  – Capture chemists’ intuition about what makes a ligand–receptor
    interaction energetically favorable (e.g., hydrogen
    bonding, or displacement of water from a hydrophobic
    binding pocket)
  – Parameters are often optimized based on known binding
    affinities of many ligands for many receptors
  – Some scoring functions borrow terms from molecular
    mechanics force fields, but a molecular mechanics force
    field is rarely used directly as a scoring function for docking
    •   The scoring function is an (extremely rough) attempt to
        approximate the binding free energy. Molecular mechanics
        force fields give potential energy associated with a particular
        arrangement of atoms.
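To make the idea of an empirical scoring function concrete, here is a minimal sketch. It is not GlideScore or any published function: the atom typing, the distance thresholds, and the weights are all hypothetical, and hydrogen-bond angles are omitted for brevity. It assumes the convention that lower (more negative) scores are better.

```python
import math
from dataclasses import dataclass

@dataclass
class Atom:
    x: float
    y: float
    z: float
    kind: str  # "hydrophobic", "donor", "acceptor", or "other"

def distance(a, b):
    """Euclidean distance between two atoms, in angstroms."""
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))

def ramp(value, full, zero):
    """Return 1 when value <= full, 0 when value >= zero, linear in between."""
    if value <= full:
        return 1.0
    if value >= zero:
        return 0.0
    return (zero - value) / (zero - full)

# Hypothetical weights. In practice such parameters are fitted to known
# binding affinities of many ligands for many receptors.
W_LIPO = -0.5
W_HBOND = -1.5

def toy_score(ligand, protein):
    """Sum of weighted pairwise terms over all ligand-protein atom pairs."""
    score = 0.0
    for lig_atom in ligand:
        for prot_atom in protein:
            r = distance(lig_atom, prot_atom)
            if lig_atom.kind == "hydrophobic" and prot_atom.kind == "hydrophobic":
                # Reward hydrophobic contacts, fading out between 4 and 6 angstroms.
                score += W_LIPO * ramp(r, 4.0, 6.0)
            elif {lig_atom.kind, prot_atom.kind} == {"donor", "acceptor"}:
                # Reward donor-acceptor pairs near an ideal 2.9 angstrom separation.
                score += W_HBOND * ramp(abs(r - 2.9), 0.25, 0.65)
    return score

protein = [Atom(0.0, 0.0, 0.0, "hydrophobic"), Atom(5.0, 0.0, 0.0, "acceptor")]
ligand_near = [Atom(0.0, 3.5, 0.0, "hydrophobic"), Atom(5.0, 2.9, 0.0, "donor")]
ligand_far = [Atom(0.0, 23.5, 0.0, "hydrophobic"), Atom(5.0, 22.9, 0.0, "donor")]
print(toy_score(ligand_near, protein), toy_score(ligand_far, protein))
```

Note that nothing here accounts for entropy, water, or protein flexibility, which is part of why such functions only roughly approximate binding free energy.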
            Example: Glide scoring function
• Glide (considered one of the most accurate docking software packages) uses the
  following “GlideScore” function in SP (“standard precision”) mode:
  [Equation not preserved in this text: the GlideScore SP formula, from Friesner et al., Journal of Medicinal Chemistry 47:1739-49 (2004)]
      – The first term rewards contacts between hydrophobic atoms of the ligand and protein, and is a
        function of the distance between them
      – The next several terms reward specific kinds of hydrogen bonds, and are a function of both
        angle and distance
• The final ranking of ligands in Glide SP is determined by a combination of the
  GlideScore, an interaction energy computed using a molecular mechanics force field
  (OPLS-AA), and an estimate of the internal strain of the ligand in the bound pose
• Glide’s XP (“extra precision”) mode uses an even more complicated scoring function
You are not responsible for the details on this slide
                 Search methods
• Docking software needs to search for the best-scoring
  pose for each ligand
• The search space is huge, because one needs to
  consider all possible ligand positions and orientations,
  and the ligand’s internal degrees of freedom
• To search this space efficiently, docking software
  typically employs some combination of:
  – Heuristic assumptions about what poses will/won’t work
  – Monte Carlo methods
  – Hierarchical methods in which one uses approximate
    measures to identify promising groups of poses, then
    evaluates them in more detail
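As an illustration of the Monte Carlo option, the sketch below runs a Metropolis-style search over a pose vector. It is a generic sketch, not the search used by any particular docking program. The pose is reduced to three numbers (x, y, and a rotation angle), the stand-in score function is a made-up bowl with a known minimum, and lower scores are assumed to be better.

```python
import math
import random

def monte_carlo_search(score, start_pose, n_steps=5000, step_size=0.5,
                       temperature=1.0, seed=0):
    """Metropolis Monte Carlo search for a low-scoring pose.

    `score` maps a pose (a list of floats) to a number; lower is better.
    Returns the best pose seen and its score.
    """
    rng = random.Random(seed)
    current = list(start_pose)
    current_score = score(current)
    best, best_score = list(current), current_score
    for _ in range(n_steps):
        # Propose a random perturbation of every pose coordinate.
        trial = [value + rng.gauss(0.0, step_size) for value in current]
        trial_score = score(trial)
        delta = trial_score - current_score
        # Always accept improvements; accept worse poses with probability
        # exp(-delta / temperature) so the search can escape local minima.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = trial, trial_score
            if current_score < best_score:
                best, best_score = list(current), current_score
    return best, best_score

def stand_in_score(pose):
    """Made-up score with its minimum (-10) at x=1, y=-2, angle=0.5."""
    x, y, angle = pose
    return (x - 1.0) ** 2 + (y + 2.0) ** 2 + (angle - 0.5) ** 2 - 10.0

best_pose, best_score = monte_carlo_search(stand_in_score, start_pose=(5.0, 5.0, 0.0))
print([round(v, 2) for v in best_pose], round(best_score, 2))
```

A real docking search would also perturb the ligand's internal torsion angles and would use a far more rugged scoring landscape than this smooth bowl.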
                   Example: Glide search
• Glide SP uses a hierarchical
  search method
• It first identifies a discrete set
  of “reasonable” conformations
  for each ligand, by varying
  internal torsion angles
• For each ligand, it scans
  possible positions and
  orientations, using a rough
  metric of fit
• The most promising
  approximate poses undergo
  further “refinement” and
  evaluation
Friesner et al., J Med Chem 47:1739, 2004
You are not responsible for the details on this slide
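The general coarse-to-fine idea can be sketched as follows. This is a generic two-stage search written for illustration, not Glide's actual algorithm: poses are reduced to 2D positions, both score functions are made up, and the ligand conformation enumeration step is left out. Lower scores are assumed to be better.

```python
import itertools

def hierarchical_search(rough_score, detailed_score, grid_values,
                        keep=5, refine_step=0.25, refine_rounds=20):
    """Coarse grid scan with a cheap score, then local refinement of the best few."""
    # Stage 1: score every grid pose with the cheap, approximate metric.
    coarse_poses = sorted(itertools.product(grid_values, repeat=2), key=rough_score)
    candidates = coarse_poses[:keep]

    # Stage 2: refine each surviving pose with the more expensive score.
    best_pose, best_score = None, float("inf")
    for pose in candidates:
        score = detailed_score(pose)
        step = refine_step
        for _ in range(refine_rounds):
            improved = False
            for dx, dy in ((step, 0), (-step, 0), (0, step), (0, -step)):
                trial = (pose[0] + dx, pose[1] + dy)
                trial_score = detailed_score(trial)
                if trial_score < score:
                    pose, score, improved = trial, trial_score, True
            if not improved:
                step /= 2  # no neighbor is better: search on a finer scale
        if score < best_score:
            best_pose, best_score = pose, score
    return best_pose, best_score

def detailed_score(pose):
    """Made-up 'accurate' score with its minimum at (1.3, -0.7)."""
    x, y = pose
    return (x - 1.3) ** 2 + (y + 0.7) ** 2

def rough_score(pose):
    """Made-up cheap score that only knows the pocket is near (1, -1)."""
    x, y = pose
    return abs(x - 1.0) + abs(y + 1.0)

best_pose, best_score = hierarchical_search(rough_score, detailed_score, range(-5, 6))
print(best_pose, best_score)
```

The design point is that the expensive score is evaluated only near the handful of poses that survive the cheap scan, rather than over the whole grid.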
How well does docking work?
        How well does docking work?
• The best docking protocols:
  – Predict a reasonably accurate pose (for ligands that do
    in fact bind the target protein) a little more than half the
    time
    •   Usually one of the few top-ranked poses is close to the
        correct one
  – Provide useful, but far from perfect, results when
    ranking ligands
    •   Tend to work best when comparing closely related
        ligands
  – Are not particularly useful when it comes to
    quantitatively estimating binding free energies
                                    Leach et al., J Med Chem 49:5851 (2006)
                                    Warren et al., J Med Chem 49:5912 (2006)
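The slide does not say how "reasonably accurate" is measured. One common convention, assumed here, is the root-mean-square deviation (RMSD) between predicted and experimentally determined ligand atom positions, with a pose counted as correct below about 2 Å. The sketch below assumes both poses are in the same protein coordinate frame with atoms listed in the same order, and it ignores ligand symmetry. The function names are illustrative.

```python
import math

def pose_rmsd(predicted, reference):
    """RMSD in angstroms between two poses given as lists of (x, y, z) tuples."""
    if len(predicted) != len(reference):
        raise ValueError("poses must contain the same number of atoms")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))

def any_top_pose_correct(ranked_poses, reference, top_n=3, cutoff=2.0):
    """True if any of the top_n ranked poses is within `cutoff` of the reference."""
    return any(pose_rmsd(pose, reference) <= cutoff for pose in ranked_poses[:top_n])

# Made-up coordinates for a two-atom "ligand".
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ranked_poses = [
    [(0.0, 0.0, 5.0), (1.0, 0.0, 5.0)],   # top-ranked pose, 5 angstroms off
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],   # second-ranked pose, 1 angstrom off
]
print(pose_rmsd(ranked_poses[0], reference))                 # 5.0
print(any_top_pose_correct(ranked_poses, reference, top_n=1))  # False
print(any_top_pose_correct(ranked_poses, reference, top_n=3))  # True
```

The made-up example mirrors the point above: the top-ranked pose can be wrong while one of the next few ranked poses is close to the correct one.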
        How well does docking work?
      Example: Performance of Glide on ligand-ranking tests for multiple targets.
[Figure not preserved: ligand-ranking performance for a series of different target proteins. Performance is good on some targets and poor (near-random) on others.]
                                                   Warren et al., J Med Chem 49:5912 (2006)
                        How well does docking work?
                    Example: Correlation between docking scores and affinity for one target
[Figure not preserved: scatter plot with axes docking score (FlexX) and pAffinity = –log(Kd).]
• Magenta points correspond to ligands from one chemical family. Blue points correspond to a second chemical family.
• Magenta points: decent correlation between docking score and affinity.
• Blue points: no correlation.
                                                                   Warren et al., J Med Chem 49:5912 (2006)
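A per-family comparison of this kind can be sketched as below. The data are made up for illustration and are not taken from Warren et al. The sketch assumes the logarithm in pAffinity is base 10 with Kd in mol/L, and that more negative docking scores predict tighter binding, so a good scoring function gives a strongly negative correlation.

```python
import math

def p_affinity(kd_molar):
    """pAffinity = -log10(Kd), with Kd in mol/L (base 10 is an assumption)."""
    return -math.log10(kd_molar)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up (Kd in mol/L, docking score) pairs for two chemical families.
families = {
    "family_A": [(1e-9, -11.0), (1e-8, -9.5), (1e-7, -8.5), (1e-6, -7.0)],
    "family_B": [(1e-9, -8.0), (1e-8, -9.0), (1e-7, -7.5), (1e-6, -8.5)],
}

correlations = {}
for name, data in families.items():
    affinities = [p_affinity(kd) for kd, _ in data]
    scores = [score for _, score in data]
    correlations[name] = pearson(affinities, scores)
    print(name, round(correlations[name], 2))
```

Computing the correlation separately for each family, rather than over all ligands pooled together, is what reveals that a scoring function can work for one chemical series and fail for another.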