
Introduction to Modeling

Biopolymer Structural Transitions

General Principles
 Most measurements reflect average behavior of a
population of biopolymers.
– e.g., a protein’s native state is an average over many similar
conformations, all of which exhibit activity.
 Statistical Thermodynamics allows a discussion of:
– how a population is distributed over the accessible states;
– the resulting mean values of the physical properties.
 Two types of “average” behavior:
– time-average of a single molecule;
– instantaneous average over an ensemble of molecules.
 Fundamental Assumption:
– time-average and ensemble-average behaviors are
identical.
Modeling Structural Transitions
 Statistical Thermodynamics most useful for modeling
phenomena involving multiple states:
– transitions between biopolymer states;
– assembly of complexes from multiple subunits;
– binding of multiple ligands to macromolecules.
 Our focus: Modeling Structural Transitions
– Formalism originally developed to predict biopolymer melting;
– also useful for predicting protein/nucleic acid structural
transitions.
 Outline:
– Lecture 9: Intro. to Modeling Structural Transitions.
– Lecture 10: Structural Transitions in Polypeptides/Proteins.
– Lecture 11-12: Structural Transitions in Nucleic Acids.
Lecture 9 – Introduction to
Modeling Structural Transitions
Modeling The Two-State Transition
 In the simplest case:
– A biopolymer is modeled as having only two states, A and
B.
– if the two forms are at equilibrium, ‘mass action’ gives:

K = [B]/[A].
 Consider an ensemble of such biopolymers…

– a large number of identically prepared systems;
• roughly $10^{23}$ copies.
– which is at equilibrium; (constant V, P, N, T)
 What is the probability, per member, of occupying B?
– From simple probability considerations and the law of mass
action, we know:

$P_B = [B]/([A]+[B]) = K/(1+K)$.
The Statistical Weight
 Let state A be defined as the ‘reference state’…
– energies of all conformations then defined relative to that of
the ref. state.
• generally, this is defined as our state of zero free energy.
 Then, let the statistical weight ($\omega_i$) of any state $i$:
– also be defined relative to the reference state,
• in terms of relative equilibrium concentrations.
– For our simple, 2-state system, we have 2 weights:
$\omega_A = [A]/[A] = 1, \qquad \omega_B = [B]/[A] = K_{eq}$

– Weight of the reference state generally denoted by $\omega_0 = 1$.

Generality of the Statistical Weight
 $\omega_i$ appears to be a simple redefinition of $K_{eq}$.
– however, $\omega_i$ is more general:
• $K_{eq}$ is a macroscopic measure of all products to all reactants.
– often includes several ‘pathways’…
• In contrast, $\omega_i$ relates occupancies of each state, $i$.
– to that of the reference state, of weight $\omega_0$.
– $\omega$ is essentially a micro-equilibrium constant;
• very useful for systems with more than two states.
 How do we estimate statistical weights,
– for a physical system of interest?
• by computing the ‘Gibbs Factors’.
The Gibbs Factor
 At equilibrium, the statistical weight, $\omega_j$, of state $j$:
– is related to the standard Gibbs free energy ($G_j^\circ$) of
formation of state $j$…
• relative to the free energy ($G_0^\circ$) of the reference state:

$\Delta G_j^\circ = G_j^\circ - G_0^\circ$,

– In particular, $\omega_j = \exp[-\Delta G_j^\circ/RT]$;
• $R$ = molar gas constant.
– In this form, $\omega_j$ is called the Gibbs Factor.
 For our 2-state system, $P_B$ can be rewritten:
– in terms of the Gibbs factor of state B:
• $P_B = \exp[-\Delta G_B^\circ/RT]\,/\,(1 + \exp[-\Delta G_B^\circ/RT])$.

– the state with the lowest free-energy is most favorable…
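As a rough illustration of the Gibbs factor, the sketch below evaluates the weight of state B and the resulting two-state probability for a few free-energy differences. The function names and the numerical values of the free energy and temperature are illustrative assumptions, not taken from the lecture.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)

def gibbs_factor(delta_g, temperature):
    """Statistical weight exp(-dG/RT) of a state relative to the reference state."""
    return math.exp(-delta_g / (R * temperature))

def two_state_probability(delta_g_b, temperature):
    """Equilibrium probability of state B when state A is the reference (weight 1)."""
    w_b = gibbs_factor(delta_g_b, temperature)
    return w_b / (1.0 + w_b)

# Hypothetical free-energy differences in J/mol, chosen only for illustration.
for dg in (-5000.0, 0.0, 5000.0):
    print(dg, round(two_state_probability(dg, 298.15), 4))
```

A negative free-energy difference gives a weight above 1 and so a probability above ½; a value of zero gives exactly ½.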

The Partition Function, Q
 The partition function, $Q$, of a system,
– is the sum of the statistical weights of all system states:
• $Q = \sum_i \omega_i = \sum_i \exp[-\Delta G_i^\circ/RT]$.
– For our 2-state system:
• $Q = \exp[-\Delta G_A^\circ/RT] + \exp[-\Delta G_B^\circ/RT] = 1 + \exp[-\Delta G_B^\circ/RT]$.
 $Q$ facilitates computation of system probabilities:
– the equilibrium probability of occupying state $j$:
• $P_j = \omega_j/Q = \exp[-\Delta G_j^\circ/RT]/Q$.
 $Q$ also allows computation of ensemble average
quantities:
– such as the mean free energy or entropy of a biopolymer.
Computing Ensemble Averages
 The ensemble average of an observable quantity, $X$,
– at equilibrium…
– is computed from $Q$ by simply taking a weighted average:
$\langle X \rangle = \sum_i X_i \exp[-\Delta G_i^\circ/RT]\,/\,Q = \sum_i X_i P_i$.
• Here, $X_i$ is the $X$ value characteristic of state $i$.

– For our 2-state system, we have:

$\langle X \rangle = (X_A \exp[-\Delta G_A^\circ/RT] + X_B \exp[-\Delta G_B^\circ/RT])\,/\,Q = (X_A + X_B \exp[-\Delta G_B^\circ/RT])\,/\,Q$.
– It turns out that every intrinsic macroscopic quantity
• will correspond to an ensemble average.
 Q thus ‘contains’ a complete specification of the
system:
– if we know Q, we should be able to predict system behavior.
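The weighted-average recipe above can be sketched for any finite list of states. This is a minimal sketch, assuming each state is described by a free energy relative to the reference state (in J/mol) and a characteristic value of the observable; the function names and the example numbers are hypothetical.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)

def partition_function(delta_gs, temperature):
    """Q = sum of Gibbs factors over all states."""
    return sum(math.exp(-dg / (R * temperature)) for dg in delta_gs)

def state_probabilities(delta_gs, temperature):
    """P_i = omega_i / Q for every state i."""
    weights = [math.exp(-dg / (R * temperature)) for dg in delta_gs]
    q = sum(weights)
    return [w / q for w in weights]

def ensemble_average(values, delta_gs, temperature):
    """<X> = sum_i X_i P_i."""
    probs = state_probabilities(delta_gs, temperature)
    return sum(x * p for x, p in zip(values, probs))

# Hypothetical two-state example: observable is 0 in state A and 1 in state B.
print(ensemble_average([0.0, 1.0], [0.0, -2000.0], 298.15))
```

With the observable set to 0 for A and 1 for B, the ensemble average is simply the probability of occupying B.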
Example – 2 Flipped Coins
 To illustrate probabilistic nature of our treatment:
– consider the simple, analogous example of 2 flipped coins.
 Each coin can land either heads (h) or tails (t).
• 4 outcomes…hh, ht, th, tt (system microstates).
– System (both coins) can assume 3 macrostates (j):
• State ‘HH’: $j = 2$ heads…unique ($g_2 = 1$…only hh).
• State ‘HT’: $j = 1$ head…degenerate ($g_1 = 2$…ht and th).
• State ‘TT’: $j = 0$ heads…unique ($g_0 = 1$…only tt).
…so the ‘macrostate’, $j$, is the total number of heads.
 Assume both coins are fair:
– So, the coin states (h, t) have equal weights of 1:
• $\omega_h = \omega_t = 1$.
 Weight of each system macrostate, $j$, is given by:
$\omega_j = g_j\,\omega_h^{\,j}\,\omega_t^{\,2-j}$
2 Flipped Coins: Average Quantities
 Our ‘Partition Function’:
– given by the sum of the weights, taken over system states:
$Q = \sum_j \omega_j = (1)(1)^2(1)^0 + (2)(1)^1(1)^1 + (1)(1)^0(1)^2 = 4$.
 Probabilities of State Occupancy:
– for the 3 ‘macroscopic’ states, HH, HT, and TT
• $P_{HH} = \omega_2/Q = 1/4$,
• $P_{HT} = \omega_1/Q = 2/4 = 1/2$,
• $P_{TT} = \omega_0/Q = 1/4$. (note: $P_{HH} + P_{HT} + P_{TT} = 1$!)

 Ensemble Average Quantity: Mean Number of Heads…

– i.e., the mean number after many independent trials.
• estimated by a weighted average:
$\langle H \rangle = \sum_j j\,\omega_j/Q = [(2)(1) + (1)(2) + (0)(1)]\,/\,4 = 1$.
• note: ensemble average quantities are denoted by angle brackets, $\langle\,\rangle$.
– …the expected result, from probability considerations.
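The two-coin bookkeeping can be reproduced by listing the four microstates and grouping them by number of heads. This is a small sketch; the variable names are illustrative choices.

```python
from collections import Counter
from itertools import product

# Weights of the single-coin states (fair coin: both equal to 1).
coin_weight = {"h": 1.0, "t": 1.0}

# The four microstates: hh, ht, th, tt.
microstates = list(product("ht", repeat=2))

# Macrostate weight: sum of microstate weights with the same number of heads j.
weights = Counter()
for first, second in microstates:
    j = (first, second).count("h")
    weights[j] += coin_weight[first] * coin_weight[second]

Q = sum(weights.values())                      # partition function
probs = {j: w / Q for j, w in weights.items()} # P_j = omega_j / Q
mean_heads = sum(j * p for j, p in probs.items())

print(Q, probs, mean_heads)
```

Changing the entries of `coin_weight` gives the corresponding result for a biased coin, since only the ratio of the two weights matters.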
Modeling Structural Transitions in
Biopolymers
 The 2-state system is conceptually simple.
– Adopting a 2-state model for a biopolymer:
• Process described as an all-or-none transition;
– Then, weights of only two states are considered;
• Useful for assessing the total Keq of formation.
– However: yields no information about folding intermediates.
• e.g., partially folded states.
• We need a more complete description
– much more complicated.
 Simplification: Focus on 2° structure
– limits the problem to transitions b/w well-defined states:
• For polypeptides:
– between a random coil and an α-helix.
– alternatively, between an α-helix and a β-strand.
• For polynucleotides:
– between a pair of random coils and a double-strand (helix).
– alternatively, b/w a B-helix and an A- or Z-helix.
Generalized Two-State Modeling
 Each residue in a biopolymer may assume many states.
– e.g., ranging from fully ‘helical’ to fully ‘coil’.
 Simplification: residue-level, 2-conformation model.
– 2° structure formation all-or-none for each residue.
• DNA: each base-pair either stacked or unstacked.
– …or H-bonded or non-H-bonded, in our simple treatment.
• Polypeptide: each residue either H-bonded or free.
 Again, we consider the overall transition:
– from a reference or starting state (A) to a new state, (B).
 But, each residue has either the properties of A or B.
• i.e., is in state a or b.
– Limiting states:
• A = …aaaaaa… (all residues have the properties of A)
• B = …bbbbbb… (all residues have the properties of B)
– Many ‘intermediate’ states: …aaabbaa…
Model for an N-residue Chain
 Assign a weight to each residue, based on
conformation:
– a: assign weight, $\omega_a = 1$ (the residue reference state).
– b: assign weight, $\omega_b = s$.
 Also, residue transitions may be ‘cooperative’.
– assign a nucleation penalty, $\sigma^{1/2}$, to each ab interface.
– may also be assigned to b’s at chain ends (model-dependent)
 Total statistical weight of each chain state…
– then given by the product of all residue weights.
– States with identical weights classified as degenerate…
• grouped into a single, macroscopically-observable chain state.
 There are 3 cases:
– which depend on the size of σ, relative to s:
• Here, σ models process ‘cooperativity’.
Model 1: All-or-None
 Transition from A to B, length N polymer.
– Reference State = State A (all a’s).
– Each residue has properties of state A or B.
• $\omega_a = [a]/[a] = 1$, $\omega_b = [b]/[a] = s$.
• here…micro-equilibrium constants.
– residue transitions totally cooperative ($\sigma = 0$).
• but no $\sigma^{1/2}$ assigned to b’s at chain ends.
• residues change in unison…
– e.g., for …aabaa…, weight = …$(1)^2(\sigma s)(1)^2$… = 0.
 Result:
– only homogeneous chains have nonzero weight.
– Let ‘$j$’ denote the number of b’s.
– Only 2 occupied states, each with degeneracy
of 1:
• State A ($j = 0$)…$\omega_0 = 1$.
• State B ($j = N$)…$\omega_N = s^N$.
Estimating PA and PB.
 Both states A and B are unique:
– degeneracy, $g_0 = g_N = 1$.
 Partition Function:
• $Q = 1 + s^N$.
 Probability of observing states A and B:
– estimated by a ratio of statistical weights:
• $P_A = \omega_A/Q = 1/(1+s^N)$.
• $P_B = \omega_B/Q = s^N/(1+s^N)$.
– equilibrium probabilities.
 Physical interpretation of parameter $s$:
• $s = \exp[-\Delta G_{ab}^\circ/RT]$.
• $\Delta G_{ab}^\circ = G_b^\circ - G_a^\circ$ is the standard free energy
– of residue conversion from a to b.
Ensemble Average Fraction of b’s
 Ensemble averages are also obtainable from Q…
– by constructing a weighted average.
 Example: Mean Fraction of b’s
– Denoted by $\langle P_b \rangle$.
– $j$ and $f_j = j/N$ are the number and fraction of b’s in state $j$.
– $\langle P_b \rangle$ is the weighted average of $f_j$:

$\langle P_b \rangle = \sum_j f_j\,\omega_j/Q = [(0)(1) + (1)(s^N)]\,/\,(1+s^N) = s^N/(1+s^N)$.

 Parameter $s$:
– related to mean free energy of residue conversion, $\Delta G_{ab}^\circ$,
by the Gibbs factor:
$s = \exp[-\Delta G_{ab}^\circ/RT]$.
• when $\Delta G_{ab}^\circ < 0$, $\langle P_b \rangle$ is greater than ½.
• when $\Delta G_{ab}^\circ = 0$, $\langle P_b \rangle$ = ½, as expected.
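To see how the all-or-none expression behaves, the sketch below evaluates the mean fraction of b’s for a few chain lengths. The function name and the sample values of s and N are illustrative assumptions.

```python
def all_or_none_fraction_b(s, n):
    """Mean fraction of b's in the all-or-none model: s**N / (1 + s**N)."""
    w_b = s ** n          # weight of the all-b chain
    return w_b / (1.0 + w_b)

# Illustrative values only: s slightly below, at, and above 1.
for n in (1, 10, 100):
    row = [round(all_or_none_fraction_b(s, n), 4) for s in (0.9, 1.0, 1.1)]
    print(n, row)
```

Because s enters as the N-th power, a small departure of s from 1 is amplified in long chains, which is why the all-or-none transition sharpens as N grows.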
Utility of the All-or-None Model
– conceptually simple.
– correctly estimates relative occupancies of A, B.
– provides estimates of the gross details of the transition.
– neglects the occupancy of intermediate conformations…
• occupancies of other states unmodeled.
• Thus, absolute occupancies of A, B generally overestimated.
– structural details of the transition unavailable.
• e.g., misses ‘sub’-transitions.
 Primary uses:
– developing an approximate picture of the transition.
– estimation of overall, or ‘bulk’ equilibrium constants (Keq).
Model 2: Non-Cooperative
 Transition from A to B, length N polymer.
– Reference state = state A.
– again, residues have properties of A or B.
• $\omega_a = [a]/[a] = 1$, $\omega_b = [b]/[a] = s$.
– residue transitions non-cooperative ($\sigma = 1$):
• each residue converts independently.
 Result: Biopolymer has $N+1$ states…
– chains with equal $j$ values are equivalent;
• have the same weight, $s^j$.
• grouped into a single chain ‘state’, $j$.
– states then enumerated by $j$:
• $N + 1$ states…$j = 0, \ldots, N$.
– Weight of state $j$:
• $\omega_j = g_j\,s^j = \dfrac{N!}{j!(N-j)!}\,s^j$.
The Partition Function, Q
 Degeneracy of state j:
– Given by: $g_j = N!/[j!(N-j)!] = C(N,j)$.
• …the binomial coefficient;
• number of ways of placing $j$ balls in $N$ boxes, at most one per box.
 Partition Function:
– equal to the sum over all states:
$Q = \sum_j s^j\,N!/[j!(N-j)!]$.
• but this has the form of a binomial expansion, which
reduces to:

$Q = (1 + s)^N$…
– e.g.; ($N = 4$): $Q = 1 + 4s + 6s^2 + 4s^3 + s^4 = (s+1)^4$.
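The binomial reduction can be checked numerically by comparing the explicit sum over states with the closed form. This is a minimal sketch with illustrative function names; it relies on `math.comb`, available in Python 3.8 and later.

```python
from math import comb, isclose

def q_by_summation(s, n):
    """Q as the explicit sum over states j of C(N, j) * s**j."""
    return sum(comb(n, j) * s**j for j in range(n + 1))

def q_closed_form(s, n):
    """Q from the binomial theorem: (1 + s)**N."""
    return (1.0 + s) ** n

# Degeneracies for N = 4, the coefficients in the expansion above.
print([comb(4, j) for j in range(5)])

# The two routes to Q agree for arbitrary (illustrative) s and N.
for s, n in ((0.5, 4), (1.0, 10), (2.3, 25)):
    print(s, n, isclose(q_by_summation(s, n), q_closed_form(s, n)))
```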
The Propagation Parameter, s
 In the non-cooperative model, the propagation
parameter:
– sets the probability that any residue converts from
a to b: each residue is found as b with probability $s/(1+s)$;

– Intuitively, the transition from A to B occurs as a series of
micro-equilibria…
• i.e., a series of 1-residue transitions;
• Thus, $s$ is the equilibrium constant for each micro-transition.
Equilibrium Occupancy of State j
 Probability of observing state j:
– i.e., one of gj physically distinct, equivalent chains…
• with j b’s;
• denoted by Pj.
– again, given by a ratio of statistical weights:
• $P_j = \omega_j/Q = C(N,j)\,s^j/(1 + s)^N$, where $C(N,j) = N!/[j!(N-j)!]$.
 For the special case in which s = 1,
– We note that $\Delta G_{ab}^\circ = 0$.
– Residue occupancies are random…since $\omega_a = \omega_b = 1$;
• Each residue executes a random walk.
– Then, the most likely state = most degenerate state.
• $C(N,j)$ is maximum for state $j = N/2$ (for even $N$).
– Example: For $N = 4$…most likely state is $j = 2$.
• degeneracy, $g_2 = C(4,2) = 6$.
Ensemble Average Fraction of b’s

– Ensemble averages again obtainable from Q…

• By constructing a weighted average.
– Example: Mean Fraction of b’s :
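• Denoted by $\langle P_b \rangle$.
• $f_j = j/N$ = fraction of b’s in state $j$.
 $\langle P_b \rangle = \sum_j f_j\,\omega_j/Q = \sum_j (j/N)\,C(N,j)\,s^j/(1+s)^N = s/(1+s)$
– The mean fraction of a’s is then:
 $\langle P_a \rangle = 1 - \langle P_b \rangle = 1/(1+s)$.
• so that…
 $\langle P_b \rangle / \langle P_a \rangle = s$, as we expect.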
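The non-cooperative occupancies and the mean fraction of b’s can be evaluated directly from the state weights. The sketch below uses illustrative function names and sample values; it computes the weighted average term by term rather than using the closed form.

```python
from math import comb

def state_probabilities(s, n):
    """P_j = C(N, j) * s**j / (1 + s)**N for j = 0..N."""
    q = (1.0 + s) ** n
    return [comb(n, j) * s**j / q for j in range(n + 1)]

def mean_fraction_b(s, n):
    """<P_b> as the weighted average of f_j = j/N over all states."""
    probs = state_probabilities(s, n)
    return sum((j / n) * p for j, p in enumerate(probs))

# Special case s = 1, N = 4: the most degenerate state (j = 2) is most likely.
print(state_probabilities(1.0, 4))

# The weighted average matches s/(1+s) and does not depend on N.
for n in (4, 20):
    print(n, mean_fraction_b(2.0, n), 2.0 / 3.0)
```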
Utility of the Non-Cooperative
Model
 Since each step is independent, this model…
– may be used to describe a random walk.
– As well as non-cooperative binding of multiple ligands:
• e.g., proteins with multiple, independent active sites.
 Most Biopolymer 2° structure transitions:
– exhibit an intermediate level of cooperativity…
– neither the all-or-none, nor non-cooperative models suitable.
• more sophisticated model required.
Modeling Intermediate Cooperativity
 Model…Transition from A to B, length N polymer.
– a and b weighted: $\omega_a = 1$, $\omega_b = s$.
– however…transitions are cooperative ($1 > \sigma > 0$).
• $\sigma^{1/2}$ assigned to each ‘ba’ interface, and each terminal b.
– degree of cooperativity: relative size of $s$ and $\sigma$.
• $\sigma = 1$…non-cooperative model.
• $\sigma \ll s$…all-or-none model (more accurately: $\sigma \to 0$).
• $\sigma$ often called the ‘cooperativity parameter’.
 Now, we will have many distinct states:
– not all chains with $j$ b’s are equivalent…
– e.g., consider the 2 chains, below (each w/ 2 b’s):
• …aabbaaa… $\omega = \sigma s^2$.
• …aababaa… $\omega = \sigma^2 s^2$. Not equivalent!
– The general model (with all states) is very complex…
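For short chains the general model can still be handled by brute force, which makes the weighting rule concrete. The sketch below assumes the convention stated above: s for every b, and a factor of σ½ for each a/b interface and each terminal b, so every contiguous block of b’s contributes one factor of σ. The function names are illustrative, and the enumeration visits all 2^N chains, so it is only practical for small N.

```python
from itertools import product
from math import isclose

def chain_weight(chain, s, sigma):
    """Weight sigma**(number of b-blocks) * s**(number of b's) of one chain."""
    n_b = chain.count("b")
    # A block starts at every b that is first in the chain or preceded by an a.
    blocks = sum(
        1 for i, residue in enumerate(chain)
        if residue == "b" and (i == 0 or chain[i - 1] == "a")
    )
    return (sigma ** blocks) * (s ** n_b)

def general_partition_function(n, s, sigma):
    """Sum of chain weights over all 2**N sequences of a's and b's."""
    return sum(chain_weight("".join(c), s, sigma) for c in product("ab", repeat=n))

# The two example chains: one block versus two blocks of b's.
print(chain_weight("aabbaaa", 2.0, 0.1), chain_weight("aababaa", 2.0, 0.1))

# With sigma = 1 the enumeration recovers the non-cooperative result (1 + s)**N.
print(isclose(general_partition_function(6, 1.7, 1.0), (1.0 + 1.7) ** 6))
```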
Model 3: Statistical Zipper Model
 For biopolymers, generally 0 < σ << 1…
– results in a zipper-like transition.
 Two types of residue transitions:
– initiation: formation of an isolated b;
• also called ‘nucleation’...
• This is high energy (σ << 1)…difficult.
• σ often called the initiation parameter.
– propagation: formation of ‘neighboring’ b.
• low energy for T < Tm (i.e., s > 1)...easy.
• s called the ‘propagation parameter’.
 Simplification: The Zipper Model.
– chains with more than 1 isolated b neglected.
• e.g., …aababaa… (hard to form!)
– After a single nucleation…
• Transition expands in a Zipper-like fashion.
– Chains with j consecutive b’s equivalent;
• grouped into a single ‘state’, j.
The Partition Function
 Again, we have N+1 states…
– each denoted by the value of j.
• where j denotes the number of b’s in the chain.
 Each state has much smaller degeneracy:
– Given by: $g_j = N - j + 1$, for $N \ge j > 0$.
• Number of ways of placing a string of $j$ b’s in $N$ positions.
• Here, $g_0 = 1$ (treated as an exception).
 Partition Function:
– sum of the weights of all states.
$Q = 1 + \sum_{j=1}^{N} \omega_j = 1 + \sigma \sum_{j=1}^{N} (N - j + 1)\,s^j$.
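The zipper partition function can be evaluated directly and cross-checked by listing every chain that contains a single contiguous run of b’s. This is a minimal sketch, assuming the sum runs over j = 1 to N and that every such run carries one factor of σ; the function names are illustrative.

```python
from math import isclose

def zipper_partition_function(n, s, sigma):
    """Q = 1 + sigma * sum_{j=1}^{N} (N - j + 1) * s**j."""
    return 1.0 + sigma * sum((n - j + 1) * s**j for j in range(1, n + 1))

def zipper_q_by_enumeration(n, s, sigma):
    """Same Q, built by placing one run of b's at every allowed start position."""
    total = 1.0  # the all-a reference chain
    for start in range(n):
        for length in range(1, n - start + 1):
            total += sigma * s**length
    return total

# Illustrative parameters only.
print(zipper_partition_function(20, 1.2, 1e-3))
print(isclose(zipper_partition_function(20, 1.2, 1e-3),
              zipper_q_by_enumeration(20, 1.2, 1e-3)))
```

For N = 3 and s = σ = 1 the result is 7 rather than 2³ = 8: the one chain left out is bab, which has two separate runs of b’s.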
Ensemble Average Fraction of b’s
 Ensemble averages again obtainable from Q.
– by constructing a weighted average.
 Estimating the Mean Fraction of b’s (< Pb >):
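– $j$ and $f_j$ are the number, fraction of b’s in state $j$:
• we have: $f_j = j/N$.
– Weighted average of $f_j$ yields $\langle P_b \rangle$:

$\langle P_b \rangle = \sum_j f_j\,\omega_j/Q = \dfrac{\sigma \sum_{j=1}^{N} (j/N)(N-j+1)\,s^j}{1 + \sigma \sum_{j=1}^{N} (N-j+1)\,s^j}$

 Note sums include a term $s^j$, but not $\sigma$…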

– polymer length contributes to propagation, but not initiation.
• s, σ thus experimentally resolved by studying <Pb> vs. N.
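The length dependence mentioned above can be explored numerically. The sketch below evaluates the zipper-model mean fraction of b’s as the weighted average of j/N, assuming state weights of σ(N − j + 1)s^j for j ≥ 1 and 1 for j = 0; the function name and the parameter values are hypothetical.

```python
def zipper_mean_fraction_b(n, s, sigma):
    """<P_b> = sum_j (j/N) * omega_j / Q in the zipper model."""
    js = range(1, n + 1)
    terms = [(n - j + 1) * s**j for j in js]   # g_j * s**j, without sigma
    q = 1.0 + sigma * sum(terms)
    numerator = sigma * sum((j / n) * t for j, t in zip(js, terms))
    return numerator / q

# Illustrative parameters: fixed s and sigma, varying chain length N.
for n in (5, 10, 20, 40):
    print(n, round(zipper_mean_fraction_b(n, 1.3, 1e-3), 4))
```

Only the s-dependent sums change with N while σ multiplies them as a constant prefactor, which is the basis for separating the two parameters from measurements at several chain lengths.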
Degree of Cooperativity
 Determined by the size of σ:
– sharpness of the transition increases with decreasing σ.
• remember that s varies with T:
$s = \exp[-\Delta G_{ab}^\circ/RT]$
• σ often called the ‘cooperativity parameter’.
 Limiting cases correspond to our other models:
– Case I - if σ << s:
• transition highly cooperative…
• approaches our all-or-none
model.
– Case II - if σ = 1,
• transition is not cooperative…
• reduces to our non-cooperative
model.
Applicability of the Zipper Model
 Applicable to a wide range of biopolymer transitions:
– formation of multi-meric complexes by subunit assembly.
• e.g., viral coat assembly.
– formation of regular structures along a biopolymer chain.
• transition of a coil to an α-helix (polypeptide folding);
• transition of a B-helix to a pair of ssDNA coils (DNA melting);
– The reverse process.
• the B- to Z-helix transition (DNA).
 Application requires assigning physical meanings to σ
and s.
– this allows the definition of each state:
• in terms of an experimentally-meaningful statistical weight.
– thus, allows estimation of the system’s partition function (Q).
The Nucleation Parameter, σ
 Again, σ determines process cooperativity.
– non-cooperative processes:
• should be modeled with σ ~ 1.
– processes which exhibit a higher cooperativity:
• modeled with smaller values of σ.
 σ intrinsic to each type of biopolymer:
– different values for different biopolymers…
• even if the transition appears to be similar.
– Also specific to the type of transition:
• e.g., α-helix/coil, α-helix/β-strand, etc…
 σ independent of Temperature and Length:
• Often also considered roughly independent of sequence.
– Once determined for a given polymer and transition…
• will be equally applicable to any instance.
The Propagation Parameter, s
 Dependent on the specific interactions:
– that stabilize or destabilize a and b.
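– Specifically, $s = \exp(-\Delta G_{ab}^\circ/RT)$
• where $\Delta G_{ab}^\circ = G_b^\circ - G_a^\circ$.
– since $\Delta G_{ab}^\circ = \Delta H_{ab}^\circ - T\,\Delta S_{ab}^\circ$, $s$ consists of both:
• T-dependent component…$\exp(-\Delta H_{ab}^\circ/RT)$;
• T-independent component…$\exp(\Delta S_{ab}^\circ/R)$.
– both of which can be determined experimentally…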
• or, if desired, can be estimated by simulation.
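The temperature dependence of s can be sketched from its enthalpic and entropic factors. The enthalpy and entropy values below are hypothetical, chosen only so that the midpoint falls near 333 K; the function names are likewise illustrative, and both quantities are assumed to be independent of temperature.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)

def propagation_parameter(delta_h, delta_s, temperature):
    """s = exp(-dH/RT) * exp(dS/R), for dH in J/mol and dS in J/(mol*K)."""
    enthalpic = math.exp(-delta_h / (R * temperature))  # T-dependent factor
    entropic = math.exp(delta_s / R)                    # T-independent factor
    return enthalpic * entropic

def midpoint_temperature(delta_h, delta_s):
    """Temperature at which dG = dH - T*dS = 0, so that s = 1."""
    return delta_h / delta_s

# Hypothetical per-residue values for an enthalpically favorable a -> b step.
dH, dS = -4000.0, -12.0
Tm = midpoint_temperature(dH, dS)
for T in (300.0, Tm, 360.0):
    print(round(T, 1), round(propagation_parameter(dH, dS, T), 4))
```

With a negative enthalpy of conversion, s is greater than 1 below the midpoint temperature and less than 1 above it.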
Forward
 In this lecture, we have described:
– the basic ideas behind statistical thermodynamic modeling:
• as applied to state transitions of a polymer chain.
– the Zipper model approximation:
• and its dependence on σ…
• as well as its two limiting cases:
– the all-or-none model (σ << s).
– and the non-cooperative model (σ = 1).
 In the next 2 lectures:
– we focus on a limited set of biopolymer applications:
• the coil to α-helix transition of polypeptides.
• the melting transition of the DNA B-helix.
• the B-DNA to Z-DNA transition.