The evolutionary trace method partitions the sequences of a protein family into subgroups according to sequence similarity and examines residue conservation within and between the subgroups. Residues identified in this way that are exposed on the 3D structure of the protein of interest are likely to be functionally important, for example for binding or enzymatic activity. The method can be applied to proteins with known (experimental) structures as well as to theoretical models.

Multiple sequence alignment and sequence dendrogram:

Using the multiple sequence alignment, the sequences are clustered into a hierarchical tree by the average linkage clustering method according to percentage sequence identity. The distance between two nodes a and b is calculated as follows:

    D(a,b) = \frac{1}{n\,m} \sum_{i \in a} \sum_{j \in b} ID_{ij}

where ID_{ij} is the pairwise percentage sequence identity between sequence i in node a and sequence j in node b. There are n and m sequences in nodes a and b, respectively.

The sequence cluster, or dendrogram, is a representation of the evolutionary or functional relationship of the sequences in the sequence family. Based on the dendrogram, the sequence family can be divided into subfamilies, or groups of sequences, at a given sequence identity cutoff. Different cutoffs represent different levels of functional resolution.
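The node-to-node calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the sequences are taken to be already aligned and of equal length, and percent identity is assumed to be counted over columns where neither sequence has a gap (the manual does not say how align123 or evTrace treats gaps).

```python
from itertools import product


def percent_identity(seq_a, seq_b):
    """Percentage of identical residues over aligned columns without gaps.

    Assumption: gap columns ('-') are skipped; the real programs may differ.
    """
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def node_identity(node_a, node_b):
    """Average pairwise percent identity between every sequence in node_a
    and every sequence in node_b (the average linkage quantity)."""
    total = sum(percent_identity(a, b) for a, b in product(node_a, node_b))
    return total / (len(node_a) * len(node_b))


# Toy aligned sequences, invented for illustration only.
node_a = ["ACDEF", "ACDEY"]
node_b = ["ACDKF"]
print(node_identity(node_a, node_b))  # average of 80% and 60%
```

With n sequences in one node and m in the other, the double loop visits all n x m pairs, which is exactly the double sum in the formula.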
Identifying the trace residues:

From the multiple sequence alignment, residues conserved across the family of protein sequences are identified. Conserved residues are assumed to be essential for maintaining protein function. Based on the dendrogram, sequences can be partitioned into subgroups at a selected percentage sequence identity cutoff (PIC). The proteins in different subgroups may have similar functions but different specificity. The residues responsible for the functional specificity are forced to mutate during evolution to make the distinction. Residues that are conserved within each subgroup but differ between subgroups are called class-specific residues. The conserved residues and the class-specific residues together are called trace residues (Figure 1). When the evolutionary trace method is applied at different sequence identity cutoffs for the same sequence family, trace residues at different functional resolution will be identified.

If a sequence is not similar to any other sequence and is a subgroup by itself at the selected PIC, this sequence will be ignored in the analysis. It will be included in the analysis only when the PIC is low enough to make it a member of a subgroup. If the query sequence with the 3D structure is not in any subgroup at the selected cutoff, no trace residue will be reported.

Clustering of trace residues:

To define the functional sites, the trace residues are mapped to the structure of one of the proteins in the sequence family and clustered in 3D space. Since protein structures are more conserved than sequences, especially at active sites, trace residue clusters identified from one protein structure can be applied to all the members of the sequence family.
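The definitions of conserved and class-specific residues given above can be made concrete with a small sketch. It assumes the subgroups have already been chosen at some PIC and that all sequences come from one alignment; the function name, the one-letter labels ("C", "S", "-") and the toy sequences are illustrative assumptions, not part of evTrace.

```python
def classify_columns(subgroups):
    """Label each alignment column.

    'C' = conserved: the same residue in every sequence of every subgroup.
    'S' = class-specific: invariant inside each subgroup, but the subgroups
          do not all share the same residue.
    '-' = neither (varies inside at least one subgroup, or contains a gap).

    subgroups: list of subgroups, each a list of equal-length aligned strings.
    """
    length = len(subgroups[0][0])
    labels = []
    for col in range(length):
        # Residues observed in this column, collected per subgroup.
        per_group = [{seq[col] for seq in group} for group in subgroups]
        invariant = all(len(s) == 1 and "-" not in s for s in per_group)
        if not invariant:
            labels.append("-")
            continue
        consensus = {next(iter(s)) for s in per_group}
        labels.append("C" if len(consensus) == 1 else "S")
    return "".join(labels)


# Two toy subgroups, invented for illustration only.
groups = [["DVGDAY", "DVGDAF"], ["DLKDCF", "DLKDCF"]]
print(classify_columns(groups))
```

Every column labelled "C" or "S" would count as a trace residue position at that cutoff. Repeating the call with the subgroups obtained at a different PIC gives the trace at a different functional resolution.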
[Figure 1: segment of a DNA polymerase (POL) sequence alignment for four subgroups, each followed by its consensus line, with the trace line at the bottom. Conserved residues are shown in red.]

Figure 1. Identification of trace residues: conserved residues and class-specific residues from a sequence alignment. A segment of the DNA polymerase sequence alignment is shown for four subgroups of the sequence family. Conserved residues are defined as residues conserved across the sequence family, and class-specific residues are defined as residues that are conserved within each subgroup and are different between subgroups.

bioCampus Homology, Chapter 4
Residues are clustered using hierarchical clustering algorithms, including single linkage and average linkage. Distances between residues are measured between all heavy atoms or between side-chain heavy atoms.
The sequences selected from the sequence database are first aligned using the program align123. The standalone program evTrace then takes the multiple sequence alignment and the protein structure as input and identifies the trace residues. The quality of the alignment may affect the resulting trace residues.

1. Searching for homologous sequences

Sequences can be obtained by searching a protein sequence database such as SwissProt. Sequences selected from the database search should share the function of the query protein or of its close relatives.

2. Aligning homologous sequences

Homologous sequences from the sequence database search are aligned using the align123 program. Substitution matrices and gap penalties can be adjusted, or the default parameters can be used. Because alignment is time consuming, it runs as a background job. It takes a file with multiple sequences as input. An alignment file named <jobName>.aln is generated. Be sure to give a job name which is different from the prefix of the input sequence file, so that the input file is not affected.
3. Generating trace residues

After the alignment is created, load the query protein with a structure into Insight II and extract its sequence using the commands under the Sequences pulldown. The query protein's sequence will be aligned with the multiple sequence alignment from the previous step, again by using align123. Since align123 cannot take two sequences with the same name, either the query protein sequence should not be included in the multiple sequence alignment file in the previous step, or the query protein should be given a different name. The output alignment file is given the query protein name with ".aln" appended.

Based on the alignment, sequences are clustered and trace residues are identified at specific percentage sequence identity cutoffs (PICs). By default, they are identified at PIC = 30%, 40%, 50%, 60%, 70%, 80%, and 90%. Three subsets are created for each PIC to hold the conserved and class-specific residues, i.e.:

protein_ID$id%C: conserved residues selected for the PIC.
protein_ID$id%CS: class-specific residues selected for the PIC.
protein_ID$id%: both conserved and class-specific residues selected for the PIC.
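As a quick way to see which subset names to expect, the naming pattern can be generated programmatically. The sketch below only reproduces the pattern described in this step; the helper name and the protein identifier "PROT" are hypothetical, and nothing here calls evTrace or Insight II.

```python
# Default percentage sequence identity cutoffs listed in this step.
DEFAULT_PICS = (30, 40, 50, 60, 70, 80, 90)


def subset_names(protein_id, pics=DEFAULT_PICS):
    """Return the three expected subset names for each PIC.

    Pattern assumed from the text: <protein_ID>$<PIC>C, <protein_ID>$<PIC>CS
    and <protein_ID>$<PIC>.
    """
    names = {}
    for pic in pics:
        base = f"{protein_id}${pic}"
        names[pic] = {
            "conserved": base + "C",
            "class_specific": base + "CS",
            "both": base,
        }
    return names


# "PROT" is a placeholder identifier, not a protein from the manual.
for pic, trio in subset_names("PROT").items():
    print(pic, trio["conserved"], trio["class_specific"], trio["both"])
```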
For example, for protein RJTIA at 30% PIC, the three subsets are named RJTIA$30C, RJTIA$30CS, and RJTIA$30, respectively.

4. Loading trace residues

Trace residues of the query protein are defined as subsets in a file by the evTrace program. The subsets can be read into Insight II by the Load_Trace command and can be used to color or render the query protein, or as input for the Cluster_Residue command. A separate graphical window containing the sequence dendrogram can also be displayed (Figure 3). With a single mouse click on the dendrogram, you can move the vertical PIC bar to any place on the graph. The number of subfamilies and sequence clusters are displayed at the bottom of the graph. The subfamilies are shown in alternating red and magenta colors. If a sequence is not in any subfamily at the selected PIC, it is shown in the foreground color, which can be changed by choosing a color and clicking on the Foreground Color button. The dendrogram is useful for analyzing the sequence family and subfamilies, and for selecting different PIC cutoffs for generating trace residues. The multiple sequence alignment used for the calculation can also be displayed in a Netscape browser, where the consensus sequences of each subfamily and of all sequences are also shown for the PIC chosen. The conserved residues and the class-specific trace residues are colored in red and magenta, respectively.

5. Clustering

Trace residues can be clustered using the single linkage or the average linkage method. The single linkage method is more sensitive if the trace residues form a chain and are not tightly clustered. The average linkage method should work equally as well as the single linkage method for tightly clustered residue groups. Clustering can be applied to side-chain heavy atoms or all heavy atoms.
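The remark that single linkage follows chains of residues can be illustrated with a minimal sketch. It is not the Cluster_Residue implementation: it assumes each residue is reduced to a single 3D point, uses an arbitrary 4.0 distance cutoff, and relies on the fact that cutting a single-linkage tree at a given distance is the same as finding groups of points connected by links no longer than that distance.

```python
import math


def single_linkage_clusters(coords, cutoff):
    """Group points so that any two points within `cutoff` of each other,
    directly or through a chain of neighbours, end up in the same cluster."""
    parent = list(range(len(coords)))

    def find(i):
        # Follow parent links to the cluster representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i in range(len(coords)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Three points in a row 3 units apart, plus one distant point (toy data).
points = [(0, 0, 0), (3, 0, 0), (6, 0, 0), (20, 0, 0)]
print(single_linkage_clusters(points, cutoff=4.0))
```

Points 0 and 2 are 6 units apart, more than the cutoff, yet they join the same cluster through point 1. An average linkage criterion would be less willing to merge such an elongated group, which is the difference the text describes.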
To draw a circle, press down the left mouse button, drag, and then release. The selected residues in the cluster can be colored in the 3D window by pressing the Color Residue Cluster button on the dendrogram window.
Exposed and buried trace residues

Exposed and buried trace residues represent different classes of behavior. Exposed trace residues are more likely to be directly involved in binding or enzymatic activity, while the buried trace residues will affect the structural integrity. If one is interested in the functional residues at the binding site or active site, it is useful to identify the exposed trace residues. They are the common residues defined by the subset of exposed residues and the subset of trace residues. To define the subset of exposed residues, the Access_Surf and SAS_Subset commands under the Prostat pulldown are used. First, the Access_Surf command is used to calculate the solvent accessible surface (SAS) of the protein. It is important to turn on the Create/Update Table option to produce a table report. Then the SAS_Subset command is used to define the subset according to the selected SAS type. For the selected SAS type, if a residue has an SAS larger than the SAS_Cutoff, it will be included in the subset. For example, if fractional SAS is chosen and the cutoff is set to 0.1, residues with a fractional SAS larger than 0.1 will be included in the subset. The subset name is automatically set to protein_name$SAS.

PROCHECK (Model evaluation)
Introduction:

For a homology model from any source, it is important to demonstrate that the structural features of the model are reasonable in terms of what is known about protein structures in general. That is, researchers have analyzed the three-dimensional structures of proteins, from which basic principles of protein structure and folding have been developed. Several programs are available to assist in this analysis; they include PROCHECK and 3D-Profiler. The aim of PROCHECK is to assess how normal, or conversely how unusual, the geometry of the residues in a given protein structure is, as compared with stereochemical parameters derived from well-refined, high-resolution structures. PROCHECK is based on an analysis of (phi, psi) angles, peptide bond planarity, bond lengths, bond angles, hydrogen-bond geometry, and side-chain conformations of known protein structures as a function of atomic resolution. Thus, the expected values of these parameters are known and can be compared to a modeled structure, based on the atomic resolution of the structures from which the model was developed.

Checks performed by PROCHECK:

The current checks performed by PROCHECK on a given protein structure are as follows:

- Covalent geometry
- Planarity
- Dihedral angles
- Chirality
- Non-bonded interactions
- Main-chain hydrogen bonds
- Disulphide bonds
- Stereochemical parameters
- Parameter comparisons
- Residue-by-residue analysis

Stereochemical parameters used in the PROCHECK programs:

TABLE A.1: Stereochemical parameters derived from high-resolution protein structures, Morris et al. (1992).
Stereochemical parameter                            Mean value   Standard deviation

Percentage of residues in most favoured phi-psi
  regions of Ramachandran plot                      > 90%        -
chi1 dihedral angle: gauche minus                   64.1         15.7   degrees
                     trans                          183.6        16.8   degrees
                     gauche plus                    -66.7        15.0   degrees
chi2 dihedral angle                                 177.4        18.5   degrees
Proline phi torsion angle                           -65.4        11.2   degrees
Helix phi torsion angle                             -65.3        11.9   degrees
Helix psi torsion angle                             -39.4        11.3   degrees
chi3 (S-S bridge): right-handed                     96.8         14.8   degrees
                   left-handed                      -85.8        10.7   degrees
Disulphide bond separation                          2.0          0.1    Angstroms
omega dihedral angle                                180.0        5.8    degrees
Main-chain hydrogen bond energy (kcal/mol)          -2.03        0.75
Calpha chirality: zeta "virtual" torsion angle
  (Calpha-N-C-Cbeta)                                33.9         3.5    degrees
TABLE A.2: Main-chain bond lengths and bond angles, and their standard deviations, as observed in small molecules (Engh & Huber, 1991).

a. Bond lengths (Angstroms)

Bond            X-PLOR labelling                  Value    sigma

C-N             C-NH1       (except Pro)          1.329    0.014
                C-N         (Pro)                 1.341    0.016
C-O             C-O                               1.231    0.020
Calpha-C        CH1E-C      (except Gly)          1.525    0.021
                CH2G*-C     (Gly)                 1.516    0.018
Calpha-Cbeta    CH1E-CH3E   (Ala)                 1.521    0.033
                CH1E-CH1E   (Ile, Thr, Val)       1.540    0.027
                CH1E-CH2E   (the rest)            1.530    0.020
N-Calpha        NH1-CH1E    (except Gly, Pro)     1.458    0.019
                NH1-CH2G*   (Gly)                 1.451    0.016
                N-CH1E      (Pro)                 1.466    0.015

b. Bond angles (degrees)

Angle            X-PLOR labelling                     Value    sigma

C-N-Calpha       C-NH1-CH1E     (except Gly, Pro)     121.7    1.8
                 C-NH1-CH2G*    (Gly)                 120.6    1.7
                 C-N-CH1E       (Pro)                 122.6    5.0
Calpha-C-N       CH1E-C-NH1     (except Gly, Pro)     116.2    2.0
                 CH2G*-C-NH1    (Gly)                 116.4    2.1
                 CH1E-C-N       (Pro)                 116.9    1.5
Calpha-C-O       CH1E-C-O       (except Gly)          120.8    1.7
                 CH2G*-C-O      (Gly)                 120.8    2.1
Cbeta-Calpha-C   CH3E-CH1E-C    (Ala)                 110.5    1.5
                 CH1E-CH1E-C    (Ile, Thr, Val)       109.1    2.2
                 CH2E-CH1E-C    (the rest)            110.1    1.9
N-Calpha-C       NH1-CH1E-C     (except Gly, Pro)     111.2    2.8
                 NH1-CH2G*-C    (Gly)                 112.5    2.9
                 N-CH1E-C       (Pro)                 111.8    2.5
N-Calpha-Cbeta   NH1-CH1E-CH3E  (Ala)                 110.4    1.5
                 NH1-CH1E-CH1E  (Ile, Thr, Val)       111.5    1.7
                 N-CH1E-CH2E    (Pro)                 103.0    1.1
                 NH1-CH1E-CH2E  (the rest)            110.5    1.7
O-C-N            O-C-NH1        (except Pro)          123.0    1.6
                 O-C-N          (Pro)                 122.0    1.4
The input to PROCHECK is a single file containing the coordinates of your protein structure. This must be in Brookhaven file format. The output comprises a number of plots, together with a detailed residue-by-residue listing. The plots are output in PostScript format.

Different plots generated by PROCHECK:

1st plot: Ramachandran plot

The Ramachandran plot shows the phi-psi torsion angles for all residues in the structure (except those at the chain termini). Glycine residues are separately identified by triangles, as these are not restricted to the regions of the plot appropriate to the other side-chain types.

The colouring/shading on the plot represents the different regions described in Morris et al. (1992): the darkest areas (here shown in red) correspond to the "core" regions representing the most favourable combinations of phi-psi values.

Ideally, one would hope to have over 90% of the residues in these "core" regions. The percentage of residues in the "core" regions is one of the better guides to stereochemical quality.

The main options for the Ramachandran plot are:

- Labelling of residues in disallowed regions can be switched off, or alternatively can be extended into the other regions.
- Shading/colouring of the different regions can be switched off.
- The plot can be in colour or black-and-white.
- A "publication version" of the plot (without the outer border and statistics) can be generated.

2nd plot: Ramachandran plots by residue type

The plot shows separate Ramachandran plots for each of the 20 different amino acid types. The darker the shaded area on each plot, the more favourable the region. The data on which the shading is based has come from a data set of 163 non-homologous, high-resolution protein chains chosen from structures solved by X-ray crystallography to a resolution of 2.0 Angstroms or better and an R-factor no greater than 20%.
The numbers in brackets, following each residue name, show the total number of data points on that graph. The red numbers above the data points are the residue numbers of the residues in question (ie those lying in unfavourable regions of the plot).

The main options for the plot are:

- Ramachandran plots for Gly & Pro residues only.
- The cut-off value for the G-factor defining which points are to be labelled.
- The plot can be in colour or black-and-white.

3rd plot: Chi1-Chi2 plot

The chi1-chi2 plots show the chi1-chi2 side-chain torsion angle combinations for all residue types whose side chains are long enough to have both these angles. The shading on each plot indicates how favourable each region on the plot is; the darker the shade, the more favourable the region. The data on which the shading is based has come from a data set of 163 non-homologous, high-resolution protein chains chosen from structures solved by X-ray crystallography to a resolution of 2.0 Angstroms or better and an R-factor no greater than 20%.

The numbers in brackets, following each residue name, show the total number of data points on that graph. The red numbers above the data points are the residue numbers of the residues in question (ie those lying in unfavourable regions of the plot).

The main options for the plot are:

- The cut-off value for the G-factor defining which points are to be labelled.
- The plot can be in colour or black-and-white.

4th plot: Main-chain parameters

The six graphs on the main-chain parameters plot show how the structure (represented by the solid square) compares with well-refined structures at a similar resolution. The dark band in each graph represents the results from the well-refined structures; the central line is a least-squares fit to the mean trend as a function of resolution, while the width of the band on either side of it corresponds to a variation of one standard deviation about the mean. In some cases the trend is dependent on the resolution, and in other cases it is not.

Note. This plot is intended as a rough guide only, and too much reliance should not be placed on getting results that are "better than structures at the same resolution".

The six properties plotted are:
a. Ramachandran plot quality. This property is measured by the percentage of the protein's residues that are in the most favoured, or core, regions of the Ramachandran plot. For a good model structure, obtained at high resolution, one would expect this percentage to be over 90%. However, as the resolution gets poorer, so this figure decreases, as might be expected. The shaded region reflects this expected decrease with worsening resolution.
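The percentage itself is a simple count once each residue has been assigned to a Ramachandran region. The sketch below assumes that assignment has already been done elsewhere (the region boundaries are not reproduced in this chapter) and that the list passed in contains only the residues one wants counted; the function name and the labels are illustrative.

```python
def core_percentage(region_labels):
    """Percentage of residues whose Ramachandran region label is 'core'.

    region_labels: one label per counted residue, e.g. 'core', 'allowed',
    'generous' or 'disallowed'. Which residues to exclude (chain termini,
    Gly, and so on) is left to the caller and is an assumption here.
    """
    if not region_labels:
        raise ValueError("no residues to evaluate")
    core = sum(1 for label in region_labels if label == "core")
    return 100.0 * core / len(region_labels)


# Toy assignment: 18 core residues out of 20 (invented for illustration).
labels = ["core"] * 18 + ["allowed", "generous"]
pct = core_percentage(labels)
print(f"{pct:.1f}% in core regions; over 90%: {pct > 90.0}")
```

In this toy case the structure sits exactly at 90%, so it does not exceed the guideline quoted above.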
b. Peptide bond planarity. This property is measured by calculating the standard deviation of the protein structure's omega torsion angles. The smaller the value, the tighter the clustering around the ideal of 180 degrees (which represents a perfectly planar peptide bond).
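One practical detail when computing this spread is that omega is usually reported in the range -180 to 180 degrees, so values such as 179 and -179 are only 2 degrees apart. A minimal sketch, assuming trans peptides only and a sample standard deviation (the exact convention used by PROCHECK is not given in this chapter), is:

```python
import math


def omega_std(omegas_deg):
    """Standard deviation of omega angles, in degrees.

    Angles are first mapped onto 0..360 so that values on either side of
    +/-180 (for example 179 and -179) sit next to each other. Assumes all
    peptides are trans; cis peptides (omega near 0) would need separate care.
    Uses the sample (n - 1) form, which is an assumption.
    """
    wrapped = [w % 360.0 for w in omegas_deg]  # -178 becomes 182
    mean = sum(wrapped) / len(wrapped)
    variance = sum((w - mean) ** 2 for w in wrapped) / (len(wrapped) - 1)
    return math.sqrt(variance)


# Toy omega values scattered around 180 degrees (invented for illustration).
print(round(omega_std([180.0, 178.0, -178.0, 176.0, -176.0]), 3))
```

Without the wrapping step the same five values would give a standard deviation of well over 100 degrees, which would wrongly suggest badly non-planar peptide bonds.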
c. Bad non-bonded interactions. This property is measured by the number of bad contacts per 100 residues. Bad contacts are selected from the list of non-bonded interactions found by program NB. They are defined as contacts where the distance of closest approach is less than or equal to 2.6 Angstroms.

d. Calpha tetrahedral distortion. This property is measured by calculating the standard deviation of the zeta torsion angle. This is a notional torsion angle in that it is not defined about any actual bond in the structure. Rather, it is defined by the following four atoms within a given residue: Calpha, N, C, and Cbeta.
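Because zeta is defined by four atoms rather than by a bond, it can be computed with the same formula as any other torsion angle. The sketch below is a generic four-point torsion function in plain Python; the function names are hypothetical, the sign convention is that of this particular formula and has not been checked against PROCHECK, and the coordinates are simple test points rather than real atoms.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion(p0, p1, p2, p3):
    """Torsion angle in degrees defined by four points, about the p1-p2 axis."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    axis = tuple(c / norm for c in b1)
    # Components of b0 and b2 perpendicular to the central axis.
    v = _sub(b0, tuple(_dot(b0, axis) * c for c in axis))
    w = _sub(b2, tuple(_dot(b2, axis) * c for c in axis))
    return math.degrees(math.atan2(_dot(_cross(axis, v), w), _dot(v, w)))


def zeta(calpha, n, c, cbeta):
    """Notional zeta torsion, using the atom order given in the text."""
    return torsion(calpha, n, c, cbeta)


# Synthetic points giving a right angle between the two outer directions.
print(zeta((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For a real structure one would call zeta once per residue that has a Cbeta atom and then take the standard deviation of the results, as described in item d.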
e. Main-chain hydrogen bond energy. This property is measured by the standard deviation of the hydrogen bond energies for main-chain hydrogen bonds. The energies are calculated using the method of Kabsch & Sander (1983).

f. Overall G-factor. The overall G-factor is a measure of the overall normality of the structure. The overall value is obtained from an average of all the different G-factors for each residue in the structure.
5th plot: Side-chain parameters

The five graphs on the side-chain parameters plot show how the structure (represented by the solid square) compares with well-refined structures at a similar resolution. The dark band in each graph represents the results from the well-refined structures; the central line is a least-squares fit to the mean trend as a function of resolution, while the width of the band on either side of it corresponds to a variation of one standard deviation about the mean. In all cases the trend is dependent on the resolution.

Note. This plot is intended as a rough guide only, and too much reliance should not be placed on getting results that are "better than structures at the same resolution".

The 5 properties plotted are:

a. Standard deviation of the chi-1 gauche minus torsion angles.
b. Standard deviation of the chi-1 trans torsion angles.
c. Standard deviation of the chi-1 gauche plus torsion angles.
d. Pooled standard deviation of all chi-1 torsion angles.
e. Standard deviation of the chi-2 trans torsion angles.

6th plot: Residue properties

The various graphs and diagrams on this plot show how the protein's geometrical properties vary along its sequence. This gives a visualization of which regions appear to have consistently poor or unusual geometry (perhaps because they are poorly defined) and which have more normal geometry.

The properties plotted are:

Graphs a-c: Optional properties

The first three graphs at the top of the page can be selected from 14 possibles by the user. The three default graphs, which are plotted when you first run PROCHECK, are the first three of:

1. Absolute deviation from mean chi-1 value (excl. Pro)
2. Absolute deviation from mean of omega torsion
3. C-alpha chirality: abs. deviation of zeta torsion
4. Absolute deviation from mean of H-bond energy
5. Gamma atom B-value
6. Average B-value of main-chain atoms
7. Average B-value of side-chain atoms
8. G-factor for phi-psi distribution
9. G-factor for chi1-chi2 distribution
10. Residue-by-residue G-factor
11. Approx. accessibility (estimated by Ooi numbers)
12. Percentage residue accessibility
13. Standard deviation of main-chain B-values
14. Standard deviation of side-chain B-values

For each graph, unusual values (usually those more than 2.0 standard deviations away from the "ideal" mean value) are highlighted.

Graph d: Secondary structure & average estimated accessibility

The secondary structure plot shows a schematic representation of the Kabsch & Sander (1983) secondary structure assignments. The key just below the picture shows which structure is which. Beta strands are taken to include all residues with a Kabsch & Sander assignment of E, helices correspond to both H and G assignments, while everything else is taken to be random coil.

The shading behind the schematic picture gives an approximation to the residue accessibilities. The approximation is a fairly crude one, being based on each residue's Ooi number (Nishikawa & Ooi, 1986). An Ooi number is a count of the number of other Calpha atoms within a radius of, in this case, 14 Angstroms of the given residue's own Calpha. Although crude, this does give a good impression of which parts of the structure are buried and which are exposed on the surface. Future versions of PROCHECK will include accurate calculation of residue accessibility.
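The Ooi number as defined above is straightforward to compute from Calpha coordinates. The following sketch is an illustration only: the function name is hypothetical, "within" is assumed to include the boundary distance, and the coordinates are toy values rather than a real structure.

```python
import math


def ooi_numbers(ca_coords, radius=14.0):
    """For each Calpha, count the other Calpha atoms within `radius`.

    ca_coords: list of (x, y, z) tuples, one per residue, in Angstroms.
    A high count suggests a buried residue, a low count an exposed one.
    """
    counts = []
    for i, centre in enumerate(ca_coords):
        neighbours = sum(
            1
            for j, other in enumerate(ca_coords)
            if j != i and math.dist(centre, other) <= radius
        )
        counts.append(neighbours)
    return counts


# Four toy Calpha positions along a line (invented for illustration).
toy_ca = [(0, 0, 0), (10, 0, 0), (20, 0, 0), (40, 0, 0)]
print(ooi_numbers(toy_ca))
```

The double loop is quadratic in the number of residues, which is acceptable for single proteins; a spatial grid would be the usual refinement for very large structures.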
Graph e: Sequence & Ramachandran regions

The one-letter amino acid code of each residue is shown together with a set of markers that identify the region of the Ramachandran plot in which the residue is located. There are four marker types, one for each of the four types of region: core (ie most favoured), allowed, generous and disallowed.

Graph f: Maximum deviation

The small histogram of asterisks and plus-signs shows each residue's maximum deviation from the ideal values.

Graph g: G-factors

The shaded squares give a schematic representation of each residue's G-factor values. (Note that the chi-1 G-factors are shown only for those residues that do not have a chi-2, and hence no chi1-chi2 G-factor.) Regions with many dark squares correspond to regions where the properties are "unusual", as defined by a low (or negative) G-factor. These may correspond to highly mobile or poorly defined regions such as loops, or may need further investigation.

The main options for the plot are:

- Which 3 of 14 plots to be printed in the 3 main graphs at the top of the page.
- Number of standard deviations for highlighting outliers (default is 2.0).
- The plot can be in colour or black-and-white.

7th plot: Main-chain bond length distributions

The histograms on this plot show the distributions of each of the main-chain bond lengths in the structure. The solid line in the centre of each plot corresponds to the small-molecule mean value, while the dashed lines either side show the small-molecule standard deviation, the data coming from Engh & Huber (1991).

Highlighted bars correspond to values more than 2.0 standard deviations from the mean, though the value of 2.0 can be changed by editing the procheck.prm file.

If any of the histogram bars lie off the graph, to the left or to the right, a large arrow indicates the number of these outliers (as in the C-O plot).

Significant outliers are shown on the Distorted geometry plots.

8th plot: Main-chain bond angle distributions

The histograms on this plot show the distributions of each of the different main-chain bond angles in the structure. The solid line in the centre of each plot corresponds to the small-molecule mean value, while the dashed lines either side show the small-molecule standard deviation, the data coming from Engh & Huber (1991).

Highlighted bars correspond to values more than 2.0 standard deviations from the mean, though the value of 2.0 can be changed by editing the procheck.prm file.

If any of the histogram bars lie off the graph, to the left or to the right, a large arrow indicates the number of these outliers (as in the CA-C-O and CB-CA-C plots).

Significant outliers are shown on the Distorted geometry plots.
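The highlighting rule used on the bond length and bond angle histograms (values more than 2.0 standard deviations from the small-molecule mean) can be sketched as follows. The function name is hypothetical, and the mean, standard deviation and measured values in the example are round illustrative numbers, not entries from the Engh & Huber tables.

```python
def flag_outliers(values, ideal_mean, ideal_sd, n_sd=2.0):
    """Return (index, value, deviation in SD units) for every value lying
    more than n_sd standard deviations from the ideal mean."""
    flagged = []
    for index, value in enumerate(values):
        deviation = (value - ideal_mean) / ideal_sd
        if abs(deviation) > n_sd:
            flagged.append((index, value, round(deviation, 2)))
    return flagged


# Illustrative numbers only: an ideal bond length of 1.50 with sigma 0.02.
measured = [1.50, 1.53, 1.55, 1.45]
for index, value, dev in flag_outliers(measured, ideal_mean=1.50, ideal_sd=0.02):
    print(f"bond {index}: {value} is {dev} standard deviations from the ideal")
```

Raising or lowering n_sd plays the same role as editing the threshold in the procheck.prm file: a larger value flags fewer bonds, a smaller one flags more.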
9th plot: RMS distances from planarity

These histograms show the RMS distances from planarity for the planar groups in the structure: rings (Phe, Tyr, Trp, His) and planar end-groups (Arg, Asn, Asp, Gln, Glu). The dashed lines indicate the different ideal values for ring groups and for the other planar groups.

The main options for the plot are:

- RMS distance for highlighting outliers for ring groups (default is 0.03 Angstroms).
- RMS distance for highlighting outliers for other groups (default is 0.02 Angstroms).
- The plot can be in colour or black-and-white.

10th plot: Distorted geometry plots

These plots show the distorted main-chain bond lengths, main-chain bond angles, and planar groups. The parameters defining how distorted these properties need to be before being plotted here are given in the procheck.prm parameter file.

For each main-chain bond length and angle plotted, the plot shows the ideal value (as defined by the Engh & Huber small-molecule data), the actual value, and the difference between the two.

For each distorted planar group, three orthogonal projections are plotted, along with the corresponding value.
References:

Blundell, T. L.; Sibanda, B. L.; Sternberg, M. J. E.; Thornton, J. M. "Knowledge-based prediction of protein structures and the design of novel molecules," Nature, 326, 347 (1987).

Blundell, T. L.; Carney, D.; Gardner, S.; Hayes, F.; Howlin, B.; Hubbard, T.; Overington, J.; Singh, D. A.; Sibanda, B. L.; Sutcliffe, M. J. "Knowledge-based protein modelling and design," Eur. J. Biochem., 172, 513 (1988).

Browne, W. J.; North, A. C. T.; Phillips, D. C.; Brew, K.; Vanaman, T. C.; Hill, R. L. "A Possible Three-dimensional Structure of Bovine alpha-Lactalbumin based on that of Hen's Egg-white Lysozyme," J. Mol. Biol., 42, 65 (1969).

Dayhoff, M. O.; Barker, W. C.; Hunt, L. T. "Establishing homologies in protein sequences," Methods in Enzymology, 91, 524 (1983).

Andrej Sali and Tom L. Blundell, "Comparative protein modeling by satisfaction of spatial

Adobe Systems Inc. (1985). PostScript Language Reference Manual. Addison-Wesley, Reading.

Allen F H, Bellard S, Brice M D, Cartwright B A, Doubleday A, Higgs H, Hummelink T, Hummelink-Peters B G, Kennard O, Motherwell W D S, Rodgers J R & Watson D G (1979).

Hartigan, J. A. (1975). Clustering Algorithms. New York: Wiley.

Lichtarge, O., Bourne, H. R. and Cohen, F. E., J. Mol. Biol., 257, 342-358 (1996).

Lichtarge, O., Yamamoto, K. R. and Cohen, F. E., J. Mol. Biol., 274, 325-337 (1997).