Académique Documents
Professionnel Documents
Culture Documents
DOI 10.1007/s10858-010-9459-z
ARTICLE
Received: 8 September 2010 / Accepted: 16 November 2010 / Published online: 9 December 2010
The Author(s) 2010. This article is published with open access at Springerlink.com
Abstract Sequence specific resonance assignment constitutes an important step towards high-resolution structure
determination of proteins by NMR and is aided by selective
identification and assignment of amino acid types. The traditional approach to selective labeling yields only the
chemical shifts of the particular amino acid being selected
and does not help in establishing a link between adjacent
residues along the polypeptide chain, which is important for
sequential assignments. An alternative approach is the
method of amino acid selective unlabeling or reverse
labeling, which involves selective unlabeling of specific
amino acid types against a uniformly 13C/15N labeled background. Based on this method, we present a novel approach
for sequential assignments in proteins. The method involves a
new NMR experiment named, {12COi15Ni?1}-filtered
HSQC, which aids in linking the 1HN/15N resonances of the
Introduction
The first step towards three-dimensional (3D) structure
determination of proteins by NMR involves sequence
specific resonance assignments of its backbone and sidechain 1H, 13C and 15N nuclei (Wuthrich 1986). For proteins
with molecular mass[10 kDa, this is accomplished using a
suite of triple resonance spectra acquired with a doubly or
triply labeled (13C, 15N or 2H, 13C, 15N) protein sample
(Cavanagh et al. 1996). In this method, sequential connectivities between neighboring amino acid residues along
the polypeptide chain (di-peptide links) are established and
a stretch of assigned residues is mapped on to the primary
sequence for assigning the resonances sequence specifically. Due to high sensitivity and resolution of triple
123
40
123
41
10
20
II
VIII
C /C (ppm)
30
VI
40
VII
IX
50
60
70
80
III
IV
G A S T K R Q E M H WC
red
oxd
D N F Y L
V P
Amino acid
123
42
123
43
y
1
y
1
rec
1H
t2
2
1
2
15N
t1/2
t1/2
3
GARP
13C
G1
G2
G4
G3
PFG
2D Difference spectrum
2D {12COi-15Ni+1}-filtered HSQC
15 N
15 N
i+1
HN
HN
CBCA(CO)NH/
HN(CO)CACB HNCACB
CBCA(CO)NH/
HN(CO)CACB HNCACB
Ci
Ci-1
C i+1
C
i-1
C i+1
H Ni
H Ni
H Ni+1
H Ni+1
flow from GE) and the protein eluted with a salt gradient of
00.6 M NaCl. The protein sample was further purified
using size exclusion chromatography with Superdex 75
(Sigma). For NMR studies, a sample containing *1.0 mM
of protein in 50 mM Phosphate buffer (90% H2O/10%
2
H2O, pH 6.0) was prepared. Four samples containing the
following pairs of selectively unlabeled amino acids were
prepared: (a) Gln, Ile, (b) Arg, Asn (c) Phe, Val and
(d) Lys, Leu. Three other 15N (only) labeled samples
containing the following selectively unlabeled amino acids
were prepared: (a) Asp, (b) His and (c) Tyr.
The gene encoding N-terminal domain of Tim 23 (198)
with the inclusion of a hexa-histidine tag at the C-terminus
was cloned into a pET3a vector and transformed into the
E. coli strain, Rosetta. The cells were grown at 30C in 1liter of a M9 minimal medium containing 4 g of 13C-glucose and 1 g of 15NH4Cl as the sole source of carbon and
nitrogen, respectively. Lysine was selectively unlabelled
by adding at a concentration of 1.0 g/L to the minimal
medium. Cells were induced at A600 of 0.6 with 0.5 mM
IPTG and grown for an additional 4 h. Cells were harvested by centrifugation at 4,000 rpm at 4C and resuspended in 12 ml of lysis buffer (20 mM TrisCl, pH. 7.5,
20 mM imidazole, 0.5 M NaCl, 5% glycerol, 50 lM
123
44
R, N unlabeled
105
1.2
EN
PP
GR
RR
T55
110
1.0
N25
R74
R72
120
V26
R42
L43
L73
IRunlab/IRref
I61
R54
15N
115
N60
[ppm]
G75
125
0.8
0.6
0.4
0.2
0.0
130
10
20
30
40
50
60
70
60
70
60
70
60
70
Residue number
K, L unlabeled
K, L unlabeled
105
T9
1.2
110
I61
K63
K29
I30
I23
V26 K48
L71
L43 L69
L73
L15
L50
V70
L67 K6
V5
I44
I13
F4
H68
I36
T14
Q49
R72
F45
K11
T12
R74
Q31
120
E16
A28
125
Q62
E51
0.8
IRunlab/IRref
L56
K27
L8
115
[ppm]
V17
K33
15N
S57
E34
T7
E64
I3
1.0
0.6
0.4
0.2
V70
0.0
130
10
20
30
40
50
Residue number
Q, I unlabeled
S20
Q, I unlabeled
105
1.4
G35
I23
L8
V5
I30
V26
I44
Q49
Q31 L71
Q2
L69
L73
L15
L50
V70
I31
K6
F4
I36
K63
T14
D32
120
D52 N25
R42
Q62
1.0
IRunlab/IRref
Q41
I61
L56
[ppm]
115
S65
Q40
15N
D39
I3
V17
1.2
110
0.8
0.6
0.4
125
F45
0.2
0.0
130
10
20
30
40
50
Residue number
F, V unlabeled
F, V unlabeled
105
1.4
110
E18
K27
120
V5
V26
L71
F45
L71
L73
L15
125
V70
6
1H
123
[ppm]
0.6
0.4
0.0
0
A46
0.8
0.2
K6
130
1.0
IRunlab/IRref
F4
[ppm]
V17
15N
115
Y54
1.2
10
20
30
40
50
Residue number
45
Results
Figure 5a and b shows the 2D [15N1H] HSQC-difference
spectrum and 2D {12COi15Ni?1}-filtered HSQC of the
four selectively unlabeled samples, respectively. The efficacy of 2D {12COi15Ni?1}-filtered HSQC to suppress
peaks from residues having 13C/15N labeled N-terminal
(sequential) neighbors was tested using the reference
(uniformly 13C/15N) labeled sample of Ubiquitin. No peaks
were observed in the spectrum (Figure S1 of Supporting
Information) which is due to the fact that all residues in the
reference sample have 13C/15N labeled N-terminal neighbors. Analysis of the intensities of peaks corresponding to
the unlabeled amino acids (i.e., IRunlab/IRref) is shown in
Fig. 5c. For calculating IRunlab and IRref, G75 was chosen
as a control residue as it is unaffected by selective unlabeling in any of the samples. In all the four pairs of
samples, complete peak yields were obtained for residues, i (i.e., the amino acid type being unlabeled) and
i ? 1. This implies that the extent of unlabeling of a
desired amino acid type is always complete. Similar analysis
for the 15N-labeled samples of the three other major selectively unlabeled amino acid types is provided in Supporting
Information (Figure S2 of Supporting Information). Note
that if two consecutive amino acids (i.e., both i and i ? 1)
are unlabeled, the experiment will not detect resonances
corresponding to the i ? 1 residue. This happens in the case
of Q40Q41, I2I3, I30Q31 and I61Q62 in Ubiquitin.
Figure 6 illustrates the identification and assignment of
a tri-peptide segment using the Arg, Asn unlabeled sample
of Ubiquitin. As described above (Fig. 4), based on 3D
HNCACB and 3D CBCA(CO)NH spectrum acquired for
the reference (uniformly 13C/15N labeled) sample, the 1HN
and the 15N resonances of the G75 were identified in 2D
{12COi15Ni?1}-filtered HSQC and the segment LRG
could be assigned and placed uniquely in the primary
sequence. A statistical analysis (described below) describes
the extent to which such tri-peptide segments in general are
present uniquely in a given protein sequence. Figure 7
illustrates larger segments of amino acid residues that can
be assigned using different selectively unlabeled samples.
The methodology described above was particularly
found to be useful for resonance assignment of an 11 kDa
glycine rich and intrisincally disordered protein, Tim23
from S. cerevisiae (Cruz et al. 2010). Tim23 is a critical
component of the protein machinary involved in the import
of nuclear-encoded proteins into the mitochondrial membranes. Figure 8a shows an overlay of the 2D [15N1H]
HSQC acquired for uniformly 15N-labeled and selectively
123
46
2D Difference spectrum
R, N unlabeled
R, N unlabeled
105
T55
115
N60
I61
R54
R74
N25
120
V26
R42
R72
L43
L73
[ppm]
110
G75
15N
125
130
(R74)
3D CBCACONH
3D HNCACB
(G75)
(G75)
(R74)
20
13C
i
30
Group VI
13C
20
30
40
40
Group I
i-1
13C
50
13C
13C
i-1
60
8.4
8.3
1H
73
8.5
(ppm)
8.3
50
70
8.5
8.4
8.3
1H
8.5
8.4
8.3
(ppm)
75
123
8.4
i+1
60
70
8.5
C (ppm)
3D HNCACB
3D CBCA(CO)NH
13
47
2D Difference spectrum
2D {12COi-15Ni+1}-filtered HSQC
R, N unlabeled
R, N unlabeled
HSQC helped in removing this degeneracy and the tripeptide segments D7K8T9 and D65K66G67 were
uniquely identified (complete assignments are deposited in
BMRB under accession number 16198).
105
I61
120
Analysis of misincorporation
125
130
2D {12COi-15Ni+1}-filtered HSQC
2D Difference spectrum
K, L unlabeled
105
K, L unlabeled
120
15N
I61
[ppm]
110
115
125
Q62
130
2D Difference spectrum
2D { 12COi-15Ni+1}-filtered HSQC
Q, I unlabeled
Q, I unlabeled
105
15N
120
K63
[ppm]
110
115
125
Q62
130
6
1
14
[ppm]
115
N60
15N
110
H [ppm]
N at undesired sites
Discussion
Sequence specific resonance assignments are often supported by selective identification of amino acid types. This
is primarily done to simplify the search problem. Automated assignment approaches employ this information, if
available, in their algorithms to reduce the ambiguity in
linking amino acid residues (Baran et al. 2004). The
objective of this study was to apply the method of selective
unlabeling to sequential assignments and accomplish: (a)
123
48
(a)
Tim23
110
Tim23- Lysine unlabeled
(b)
G67
112.5
K66
118
122
N [ppm]
15
N [ppm]
114
K32, K8
15
K27
126
115.5
T9
118.5
K25
Q33
E28
121.5
130
10.0
9.5
9.0
8.5
8.0
7.5
7.0
1H [ppm]
6.5
8.7
8.5
8.3
8.1
7.9
1 H [ppm]
obtained from 13Ca and 13Cb chemical shifts (Fig. 1). Note
that the following types of tri-peptide segments cannot be
assigned using the approach described here: (a) the tripeptide segment in which the residue Pro is the immediate
C-terminal neighbor of the central (unlabeled) residue,
(b) the tri-peptide segment in which the two consecutive
amino acids i and i ? 1 are selectively unlabeled and
(c) if the selectively unlabeled residue is located at the
C-terminus end of the protein.
We have carried out a statistical analysis of 186 nonhomologues primary sequences from the TALOS database
(Cornilescu et al. 1999) to estimate the extent to which
such tri-peptide segments can be placed uniquely in a
protein primary sequence. The analysis was carried out by
assigning a unique number code to each of the 9 categories
shown in Fig. 1 except for the residue being selectively
unlabeled, which was assigned a separate code (given that
its type is known exactly). Thus, both the tri-peptide segment and the polypeptide chain were converted into a new
sequence of codes, which was then searched for multiple
occurrences of the tri-peptide segment (this is further
illustrated in Figure S3 of Supporting Information for a 99
amino acid residue protein HIVprotease). This approach
was carried out for 15 amino acids (excluding Ala, Gly,
Ser, Thr and Pro) in the central position of the tri-peptide
segment and repeated for each protein in the database. The
following two comparisons were made to judge the
advantage of selective unlabeling: First, the number of
proteins in the database was taken wherein all tri-peptide
segments (comprising a given amino acid type in the
Selectively unlabeled
amino acid
Cross
metabolism to
Extent of cross
Metabolism
Arginine
Asparagine
Aspartic acid
Phenylalanine
Weak
Tyrosine
Weak
Threonine
Strong
Asparagine
Strong
Lysine
Weak
Glutamine
Proline
Weak
Histidine
Isoleucine
Leucine
Strong
Valine
Strong
Leucine
Isoleucine
Strong
Valine
Strong
Lysine
Phenylalanine
Tyrosine
Weak
Valine
Tyrosine
Leucine
Weak
Isoleucine
Weak
Phenylalanine
Weak
123
(a)
Number of proteins
160
120
100
80
60
40
20
0
Percentage of unique
tri-peptide segments
(b)
K R Q
M H W C red C oxd D N
K R Q
M H W C red C oxd D N
K R Q
M H W C red C oxd D N
100
80
60
40
20
0
(c)
Factor reduction in multiple
occurrences of tri-peptide
segments
49
40
30
20
10
123
50
Conclusion
In summary, we have developed a new method for
sequence specific NMR assignments in proteins based on
the method of amino acid selective unlabeling. Using a
new pulse sequence proposed in combination with 2D
[15N1H] HSQC, the method involves the detection of tripeptide segments along the polypeptide chain. The method
achieves the dual objective of reducing the search space for
sequential assignments and conferring uniqueness to short
segments of linked amino acid residues. This will directly
benefit automated assignment algorithms for unambiguous
and accurate resonance assignments. The method can be
combined with other approaches for isotope labeling
(Kainosho and Guntert 2009; Ohki and Kainosho 2008)
and/or with completely/partially deuterated samples prepared for structure determination of larger molecular weight
proteins. Taken together, this study opens up new avenues
to using this methodology in protein structural studies.
Acknowledgments The facilities provided by NMR Research
Centre at IISc supported by Department of Science and Technology
(DST), India is gratefully acknowledged. HSA acknowledges support
from DST-SERC research award. PDS acknowledges support from
the Wellcome Trust International Senior Research Fellowship in
Biomedical Science, WT081643MA. GJ acknowledges fellowship
from Council of Scientific and Industrial Research (CSIR) and AT
acknowledges support from University Grants Commission (UGC).
We thank Dr. John Cort, Pacific Northwest National Laboratory, for
providing the Ubiquitin plasmid.
Open Access This article is distributed under the terms of the
Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any
medium, provided the original author(s) and source are credited.
References
Apponyi MA, Ozawa K, Dixon NE, Otting G (2008) Cell-free protein
synthesis for analysis by NMR spectroscopy. Methods Mol Biol
426:257268
Atreya HS, Chary KVR (2000) Amino acid selective unlabeling for
residue-specific NMR assignments in proteins. Curr Sci 79:
504507
Atreya HS, Chary KVR (2001) Selective unlabeling of amino acids
in fractionally C-13 labeled proteins: an approach for stereospecific NMR assignments of CH3 groups in Val and Leu residues.
J Biomol NMR 19:267272
123
51
Tugarinov V, Kay LE (2004) Stereospecific NMR assignments of
prochiral methyls, rotameric states and dynamics of valine
residues in malate synthase G. J Am Chem Soc 126:98279836
Tugarinov V, Hwang PM, Kay LE (2004) Nuclear magnetic
resonance spectroscopy of high-molecular-weight proteins.
Ann Rev Biochem 73:107146
Turano P, Lalli D, Felli IC, Theil EC, Bertini I (2010) NMR reveals
pathway for ferric mineral precursors to the central cavity of
ferritin. Proc Natl Acad Sci USA 107:545550
Vinarov DA, Lytle BL, Peterson FC, Tyler EM, Volkman BF,
Markley JL (2004) Cell-free protein production and labeling
protocol for NMR-based structural proteomics. Nat Methods
1:149153
Vuister GW, Kim SJ, Wu C, Bax A (1994) 2D and 3D NMR-study of
phenylalanine residues in proteins by reverse isotopic labeling.
J Am Chem Soc 116:92069210
Wuthrich K (1986) NMR of proteins and nucleic acids. Wiley,
New York
123