
bioCampus Homology Chapter 4


The evolutionary trace method looks for a connection between the evolutionary importance of residues and their functional role. The sequences of a protein family are partitioned into subgroups according to residue conservation, and the residues identified in this way are mapped onto the 3D structure of the protein of interest. Exposed clusters of such residues are likely to be functionally important, for example for binding or enzymatic activity. Furthermore, the method can be applied to proteins with known experimental structures as well as to theoretical models.

Multiple sequence alignment and sequence dendrogram: Based on the multiple sequence alignment, the sequences are clustered hierarchically using the average linkage clustering method according to their percentage sequence identity. The distance between two sequence clusters (nodes) a and b is calculated as follows:

$$D_{ab} = \frac{1}{n_a n_b} \sum_{i \in a} \sum_{j \in b} ID_{ij}$$

where $ID_{ij}$ is the pairwise percentage sequence identity between sequence i in node a and sequence j in node b. There are $n_a$ and $n_b$ sequences in nodes a and b, respectively.

The sequence cluster, or dendrogram, is a representation of the evolutionary or functional relationship of the sequences in the sequence family. Based on the dendrogram, a sequence family can be divided into subfamilies, or groups of sequences, at a given sequence identity cutoff. Different sequence identity cutoffs represent different levels of functional resolution.
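As a minimal sketch of the average linkage calculation, the following Python code computes the mean pairwise percentage identity between two nodes and keeps merging the most similar pair of nodes until no pair reaches a chosen cutoff. The function names, the gap handling (positions with a gap in either sequence are skipped) and the toy sequences are assumptions made for illustration; they are not taken from the programs described in this chapter. As defined in the text, a larger value means the two nodes are more similar.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions (gaps skipped) where two sequences agree."""
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def node_identity(node_a, node_b, aligned):
    """Average of ID_ij over every sequence i in node a and j in node b."""
    total = sum(percent_identity(aligned[i], aligned[j])
                for i in node_a for j in node_b)
    return total / (len(node_a) * len(node_b))


def subfamilies(aligned, cutoff):
    """Merge the most similar pair of nodes until none reaches the cutoff."""
    nodes = [[name] for name in aligned]
    while len(nodes) > 1:
        a, b = max(combinations(range(len(nodes)), 2),
                   key=lambda p: node_identity(nodes[p[0]], nodes[p[1]], aligned))
        if node_identity(nodes[a], nodes[b], aligned) < cutoff:
            break
        merged = nodes[a] + nodes[b]
        nodes = [n for k, n in enumerate(nodes) if k not in (a, b)] + [merged]
    return sorted(sorted(n) for n in nodes)


# Hypothetical toy alignment, for illustration only.
toy = {
    "s1": "ACDEFGHIKL",
    "s2": "ACDEFGHIKV",
    "s3": "ACDWYGHTRL",
    "s4": "ACDWYGHTRV",
}
print(subfamilies(toy, cutoff=80.0))
print(subfamilies(toy, cutoff=50.0))
```

With the toy data, a cutoff of 80% gives two subfamilies, while a cutoff of 50% merges all four sequences into a single family.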

Identifying trace residues: From the multiple sequence alignment, residues conserved across the family of protein sequences are identified. The conserved residues are assumed to be essential for maintaining protein function. Based on the dendrogram, sequences can be partitioned into subgroups at a selected sequence percentage identity cutoff (PIC); at low identity cutoffs, the subfamilies are larger. The proteins in different subgroups may have similar functions but different specificity. The residues responsible for the functional specificity are forced to mutate during evolution to make the distinction. Residues that are conserved within each subgroup but differ between subgroups are called class-specific residues. Conserved residues and class-specific residues are together called trace residues (Figure 1).

When the evolutionary trace method is applied at different sequence identity cutoffs for the same sequence family, trace residues at different functional resolution will be identified. If a sequence is not similar to any other sequence and forms a subgroup by itself at the selected PIC, this sequence will be ignored in the analysis. It will be included in the analysis only when the PIC is low enough to make it a member of a subgroup. If the query sequence with the 3D structure is not in any subgroup at the selected cutoff, no trace residue will be reported.

Clustering of trace residues: To define the functional sites, trace residues are mapped to the structure of one of the proteins in the sequence family and clustered in 3D space. Since protein structures are more conserved than sequences, especially at active sites, the trace residue clusters identified from one protein structure can be applied to all the members of the sequence family.
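The classification of alignment columns into conserved and class-specific residues can be sketched as follows. This is only an illustration under stated assumptions: the function name, the treatment of gaps (a column containing a gap is never labelled) and the example sequences are hypothetical and do not reproduce the evTrace program.

```python
def classify_columns(subgroups):
    """Label each alignment column as 'conserved', 'class-specific' or None.

    subgroups: list of subgroups, each a list of equal-length aligned sequences.
    """
    # Subgroups holding a single sequence are ignored, as described in the text.
    groups = [g for g in subgroups if len(g) > 1]
    if not groups:
        return []
    labels = []
    for col in range(len(groups[0][0])):
        # Set of residues observed in this column within each subgroup.
        per_group = [{seq[col] for seq in g} for g in groups]
        if any(len(s) != 1 or "-" in s for s in per_group):
            labels.append(None)               # varies inside some subgroup
        elif len(set().union(*per_group)) == 1:
            labels.append("conserved")        # same residue in every subgroup
        else:
            labels.append("class-specific")   # invariant within, differs between
    return labels


# Hypothetical subgroups; the last one is a singleton and is skipped.
example = [["ACDK", "ACEK"], ["ACDR", "ACER"], ["WWWW"]]
print(classify_columns(example))
```

In the example, the first two columns are conserved across the family, the third varies inside the subgroups and is not a trace residue, and the fourth is class-specific.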
Figure 1. Identification of trace residues: conserved residues and class-specific residues from a sequence alignment. A segment of a DNA polymerase sequence alignment is shown for four subgroups, each with its consensus sequence, together with the resulting trace. Conserved residues (shown in red) are defined as residues that are conserved across the sequence family, and class-specific residues are defined as residues that are conserved within each subgroup but differ between subgroups.

Residues are clustered using hierarchical clustering algorithms, including single linkage.

Distances between residues: Distances can be measured between side chain heavy atoms or between all heavy atoms. The distance between a pair of residues is defined as the distance between the closest pair of selected atoms (i.e. side chain heavy atoms or all heavy atoms) in the two residues.
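A minimal sketch of this distance definition, combined with single linkage clustering, is given below. The residue names, coordinates and the 4.0 Å cutoff are hypothetical values chosen for illustration; the source does not state a cutoff. Cutting a single linkage tree at a fixed distance is equivalent to linking every pair of residues closer than that distance and taking the connected groups, which is what the code does.

```python
import math
from itertools import product


def residue_distance(atoms_a, atoms_b):
    """Distance between the closest pair of selected atoms in two residues."""
    return min(math.dist(p, q) for p, q in product(atoms_a, atoms_b))


def single_linkage_clusters(residues, cutoff):
    """Group residues linked, directly or through a chain, within `cutoff`.

    residues: dict mapping a residue name to a list of (x, y, z) atom positions.
    """
    names = list(residues)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the representative of the cluster.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if residue_distance(residues[a], residues[b]) <= cutoff:
                parent[find(a)] = find(b)

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical coordinates in Angstrom, for illustration only.
toy_residues = {
    "A10": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "B20": [(3.0, 0.0, 0.0)],
    "C30": [(6.0, 0.0, 0.0)],
    "D40": [(20.0, 0.0, 0.0)],
}
print(single_linkage_clusters(toy_residues, cutoff=4.0))
```

A10 and C30 are 5 Å apart, yet they end up in the same cluster because both are within the cutoff of B20. This chaining behaviour is characteristic of single linkage.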

Starting from the query protein sequence, homologous sequences can be obtained by searching a protein sequence database such as SwissProt. Sequences selected from the database search should share the function of the query protein or its close relatives.

The sequences selected from the sequence database are first aligned using the standalone program align123. The selection matters, since including unrelated sequences may result in a poor alignment, which in turn will affect the identification of the trace residues. Then the program evTrace takes the multiple sequence alignment and the protein structure as input. It identifies trace residues for the query protein and writes them as subsets into a file. Those trace residues can then be clustered according to the residue-residue distances between all heavy atoms or side chain heavy atoms.

1. Searching for homologous sequences

Evolutionary Trace is not able to search the protein sequence databases; however, the Create_Alignment command (under Evolutionary Trace) can read most of the sequence files as input.

2. Multiple sequence alignment

Homologous sequences from sequence database searching are aligned using the align123 program. Default parameters can be used, or the substitution matrices and gap penalties can be adjusted. Because alignment is time consuming, it runs as a background job. It takes a file with multiple sequences as input and generates an alignment file named <jobName>.aln. Be sure to give a job name which is different from the prefix of the input sequence file, so that the input file will not be overwritten. All the gaps in the input sequences will be retained if RemoveGap is turned off.

3. Generating trace residues

After the alignment is created, load the query protein with a structure into Insight II and extract its sequence using the commands under the Sequences pulldown. The query protein's sequence will be aligned with the multiple sequence alignment from the previous step, again using align123. Since align123 cannot take two sequences with the same name, either the query protein sequence should not be included in the multiple sequence alignment file in the previous step, or the query protein should be given a different name. The output alignment file is given the query protein name with ".aln" appended.

Based on the alignment, sequences are clustered and trace residues are identified at one or more specific percentage sequence identity cutoffs (PICs). By default, they are identified at PIC = 30%, 40%, 50%, 60%, 70%, 80%, and 90%. Three subsets are created for each PIC to hold the conserved and class-specific residues, i.e.:

protein_ID$id%C: conserved residues for the selected PIC.
protein_ID$id%CS: class-specific residues for the selected PIC.
protein_ID$id%: both conserved and class-specific residues for the selected PIC.

For example, for protein RJTLA at 30% PIC, the three subsets are named RJTLA$30C, RJTLA$30CS, and RJTLA$30, respectively.

4. Loading trace residues

Trace residues of the query protein are defined as subsets in a file by the evTrace program. The subsets can be read into Insight II by the Load_Trace command and can be used to color or render the query protein, or as input for the Cluster_Residue command. A separate graphical window containing the sequence dendrogram can also be displayed (Figure 3). With a single mouse click on the dendrogram, you can move the vertical PIC bar to any place on the graph. The number of subfamilies and the sequence clusters are displayed at the bottom of the graph. The sequence subfamilies are shown in alternating red and magenta colors. If a sequence is not in any subfamily at the selected PIC, it is drawn in a separate color, which can be changed by choosing a color and clicking on the Foreground Color button. The dendrogram is useful for analyzing the sequence family and sub-families, and for selecting different PIC cutoffs for generating trace residues. The multiple sequence alignment used for the calculation can also be displayed in a Netscape browser, where the consensus sequences of each subfamily and of all sequences are also shown for the PIC chosen. The conserved residues and the class-specific trace residues are colored in red and magenta, respectively.

5. Clustering trace residues

Trace residues can be clustered using the single linkage method or the average linkage method. The single linkage method is more sensitive if the trace residues form a chain and are not tightly clustered. The average linkage method should work equally as well as the single linkage method for tightly clustered groups of residues. Clustering can be applied to side chain heavy atoms or all heavy atoms.
To draw a circle, press down the left mouse button, drag, and then release. The residues in the selected cluster can be colored in the 3D window by pressing the Color Residue Cluster button on the dendrogram window.

Exposed and buried trace residues

Exposed and buried trace residues represent different classes of behaviors. Exposed trace residues are more likely to be directly involved in binding or enzymatic activity, while the buried trace residues will affect structural integrity. If one is interested in the functional residues at the binding site or active site, it is useful to identify the exposed trace residues. They are the common residues defined by the subset of exposed residues and the subset of trace residues. To define the subset of exposed residues, the Access_Surf and SAS_Subset commands under the Prostat pulldown are used.

First, the Access_Surf command is used to calculate the solvent accessible surface (SAS) of the protein. It is important to turn on the Create/Update Table option to produce a table report. Then the SAS_Subset command is used to define the subset according to one of the accessible surface types. For the selected SAS type, if a residue has an SAS larger than the SAS_Cutoff, it will be included in the subset. For example, when fractional SAS is chosen and the cutoff is set to 0.1, residues with a fractional SAS larger than 0.1 will be included in the subset. The subset name is automatically set to protein_name$SAS.

PROCHECK (Model evaluation)

Introduction:

For a homology model from any source, it is important to demonstrate that the structural features of the model are reasonable in terms of what is known about protein structures in general. That is, researchers have analyzed three-dimensional structures of proteins, from which basic principles of protein structure and folding have been developed. Several programs are available to assist in this analysis. The criteria for analysis of correctness include:

• main chain conformations in acceptable regions of the Ramachandran map
• planar peptide bonds
• side chain conformations that correspond to those in the rotamer library
• hydrogen-bonding of polar atoms if they are buried
• proper environments for hydrophobic and hydrophilic residues
• no bad atom-atom contacts
• no holes inside the structure

These programs include PROCHECK and 3D-Profiler. The aim of PROCHECK is to assess how normal, or conversely how unusual, the geometry of the residues in a given protein structure is, as compared with stereochemical parameters derived from well-refined, high-resolution structures. PROCHECK is based on an analysis of (phi, psi) angles, peptide bond planarity, bond lengths, bond angles, hydrogen-bond geometry, and side-chain conformations of known protein structures as a function of atomic resolution. Thus, the expected values of these parameters are known and can be compared to a modeled structure, based on the atomic resolution of the structures from which the model was developed.

Checks performed by PROCHECK:

The current checks performed by PROCHECK on a given protein structure are as follows:

• Covalent geometry
• Planarity
• Dihedral angles
• Chirality
• Non-bonded interactions
• Main-chain hydrogen bonds
• Disulphide bonds
• Stereochemical parameters
• Parameter comparisons
• Residue-by-residue analysis

Stereochemical parameters used in the PROCHECK programs:

TABLE A.1: Stereochemical parameters derived from high-resolution protein structures, Morris et al. (1992). Entries shown as … are not legible in the source.

Stereochemical parameter                                  Mean value        Standard deviation
% residues in most favoured phi-psi regions
  of the Ramachandran plot                                > 90%
chi1 dihedral angle: gauche minus                         … degrees         … degrees
chi1 dihedral angle: trans                                … degrees         … degrees
chi1 dihedral angle: gauche plus                          -66.7 degrees     … degrees
chi2 dihedral angle                                       177.4 degrees     … degrees
Proline phi torsion angle                                 -65.4 degrees     … degrees
Helix phi torsion angle                                   -65.3 degrees     … degrees
Helix psi torsion angle                                   -39.4 degrees     … degrees
chi3 (S-S bridge): right-handed                           … degrees         14.8 degrees
chi3 (S-S bridge): left-handed                            -85.8 degrees     … degrees
Disulphide bond separation                                …                 …
omega dihedral angle                                      … degrees         … degrees
Main-chain hydrogen bond energy (kcal/mol)                -2.03             …
Calpha chirality: zeta "virtual" torsion angle
  (Calpha-N-C-Cbeta)                                      … degrees         … degrees

TABLE A.2: Main-chain bond lengths and bond angles, and their standard deviations, as observed in small molecules (Engh & Huber, 1991). Entries shown as … are not legible in the source.

a. Bond lengths

Bond            X-PLOR labelling    Residues             Value (Å)    sigma
C-N             C-NH1               (except Pro)         …            …
                C-N                 (Pro)                …            0.016
C-O             C-O                                      …            0.020
Calpha-C        CH1E-C              (except Gly)         …            …
                CH2G*-C             (Gly)                …            …
Calpha-Cbeta    CH1E-CH3E           (Ala)                …            …
                CH1E-CH1E           (Ile, Thr, Val)      …            …
                CH1E-CH2E           (the rest)           …            0.020
N-Calpha        NH1-CH1E            (except Gly, Pro)    …            …
                NH1-CH2G*           (Gly)                …            …
                N-CH1E              (Pro)                …            …

b. Bond angles

Angle             X-PLOR labelling    Residues             Value (degrees)    sigma
C-N-Calpha        C-NH1-CH1E          (except Gly, Pro)    121.7              …
                  C-NH1-CH2G*         (Gly)                120.6              …
                  C-N-CH1E            (Pro)                …                  …
Calpha-C-N        CH1E-C-NH1          (except Gly, Pro)    …                  …
                  CH2G*-C-NH1         (Gly)                …                  …
                  CH1E-C-N            (Pro)                116.9              …
Calpha-C-O        CH1E-C-O            (except Gly)         …                  …
                  CH2G*-C-O           (Gly)                …                  …
Cbeta-Calpha-C    CH3E-CH1E-C         (Ala)                …                  …
                  CH1E-CH1E-C         (Ile, Thr, Val)      …                  …
                  CH2E-CH1E-C         (the rest)           …                  …
N-Calpha-C        NH1-CH1E-C          (except Gly, Pro)    111.2              …
                  NH1-CH2G*-C         (Gly)                112.5              …
                  N-CH1E-C            (Pro)                111.8              …
N-Calpha-Cbeta    NH1-CH1E-CH3E       (Ala)                110.4              …
                  NH1-CH1E-CH1E       (Ile, Thr, Val)      111.5              …
                  N-CH1E-CH2E         (Pro)                103.0              …
                  NH1-CH1E-CH2E       (the rest)           …                  …
O-C-N             O-C-NH1             (except Pro)         123.0              …
                  O-C-N               (Pro)                122.0              …

The input to PROCHECK is a single file containing the coordinates of your protein structure. This must be in Brookhaven file format. The outputs comprise a number of plots, together with a detailed residue-by-residue listing. The plots are output in PostScript format.

Different plots generated by PROCHECK:

1st plot: Ramachandran plot

The Ramachandran plot shows the phi-psi torsion angles for all residues in the structure (except those at the chain termini). Glycine residues are separately identified by triangles, as these are not restricted to the regions of the plot appropriate to the other side chain types.

The colouring/shading on the plot represents the different regions described in Morris et al. (1992): the darkest areas (here shown in red) correspond to the "core" regions representing the most favourable combinations of phi-psi values.

Ideally, one would hope to have over 90% of the residues in these "core" regions. The percentage of residues in the "core" regions is one of the better guides to stereochemical quality.

The main options for the Ramachandran plot are:
• Labelling of residues in disallowed regions can be switched off, or alternatively can be extended into the other regions.
• Shading/colouring of the different regions can be switched off.
• The plot can be in colour or black-and-white.
• A "publication version" of the plot (without the outer border and statistics) can be generated.

2nd plot: Ramachandran plots by residue type

The plot shows separate Ramachandran plots for each of the 20 different amino acid types. The darker the shaded area on each plot, the more favourable the region. The data on which the shading is based has come from a data set of 163 non-homologous, high-resolution protein chains chosen from structures solved by X-ray crystallography to a resolution of 2.0 Å or better and an R-factor no greater than 20%.

The numbers in brackets, following each residue name, show the total number of data points on that graph. The red numbers above the data points are the residue numbers of the residues in question (ie those residues lying in unfavourable regions of the plot).

The main options for the plot are:
• Ramachandran plots for Gly & Pro residues only.
• The cut-off value for the G-factor defining which points are to be labelled.
• The plot can be in colour or black-and-white.

3rd plot: Chi1-Chi2 plots

The chi1-chi2 plots show the chi1-chi2 side chain torsion angle combinations for all residue types whose side chains are long enough to have both these angles. The shading on each plot indicates how favourable each region on the plot is; the darker the shade, the more favourable the region. The data on which the shading is based has come from a data set of 163 non-homologous, high-resolution protein chains chosen from structures solved by X-ray crystallography to a resolution of 2.0 Å or better and an R-factor no greater than 20%.

The numbers in brackets, following each residue name, show the total number of data points on that graph. The red numbers above the data points are the residue numbers of the residues in question (ie those residues lying in unfavourable regions of the plot).

The main options for the plot are:
• The cut-off value for the G-factor defining which points are to be labelled.
• The plot can be in colour or black-and-white.

4th plot: Main-chain parameters

The six graphs on the main-chain parameters plot show how the structure (represented by the solid square) compares with well-refined structures at a similar resolution. The dark band in each graph represents the results from the well-refined structures; the central line is a least-squares fit to the mean trend as a function of resolution, while the width of the band on either side of it corresponds to a variation of one standard deviation about the mean. In some cases, the trend is dependent on the resolution, and in other cases it is not.

Note. This plot is intended as a rough guide only, and too much reliance should not be placed on getting results that are "better than structures at the same resolution".
The 6 properties plotted are:

a. Ramachandran plot quality. This property is measured by the percentage of the protein's residues that are in the most favoured, or core, regions of the Ramachandran plot. For a good model structure, obtained at high resolution, one would expect this percentage to be over 90%. However, as the resolution gets poorer, this figure decreases, as might be expected. The shaded region reflects this expected decrease with worsening resolution.

b. Peptide bond planarity. This property is measured by calculating the standard deviation of the protein structure's omega torsion angles. The smaller the value, the tighter the clustering around the ideal of 180 degrees (which represents a perfectly planar peptide bond).

c. Bad non-bonded interactions. This property is measured by the number of bad contacts per 100 residues. Bad contacts are selected from the list of non-bonded interactions found by program NB. They are defined as contacts where the distance of closest approach is less than or equal to 2.6 Å.

d. Calpha tetrahedral distortion. This property is measured by calculating the standard deviation of the zeta torsion angle. This is a notional torsion angle in that it is not defined about any actual bond in the structure. Rather, it is defined by the following four atoms within a given residue: Calpha, N, C, and Cbeta.

e. Main-chain hydrogen bond energy. This property is measured by the standard deviation of the hydrogen bond energies for main-chain hydrogen bonds. The energies are calculated using the method of Kabsch & Sander (1983).

f. Overall G-factor. The overall G-factor is a measure of the overall normality of the structure. The overall value is obtained from an average of all the different G-factors for each residue in the structure.

5th plot: Side-chain parameters

The five graphs on the side-chain parameters plot show how the structure (represented by the solid square) compares with well-refined structures at a similar resolution. The dark band in each graph represents the results from the well-refined structures; the central line is a least-squares fit to the mean trend as a function of resolution, while the width of the band on either side of it corresponds to a variation of one standard deviation about the mean. In all cases the trend is dependent on the resolution.

Note. This plot is intended as a rough guide only, and too much reliance should not be placed on getting results that are "better than structures at the same resolution".

The 5 properties plotted are:

a. Standard deviation of the chi-1 gauche minus torsion angles.
b. Standard deviation of the chi-1 trans torsion angles.
c. Standard deviation of the chi-1 gauche plus torsion angles.
d. Pooled standard deviation of all chi-1 torsion angles.
e. Standard deviation of the chi-2 trans torsion angles.

6th plot: Residue properties

The various graphs and diagrams on this plot show how the protein's geometrical properties vary along its sequence. This gives a visualization of which regions appear to have consistently poor or unusual geometry (perhaps because they are poorly defined) and which have more normal geometry.

The properties plotted are:

Graphs a-c: Optional properties

The first three graphs at the top of the page can be selected from 14 possibles by the user. The three default graphs, which are plotted when you first run PROCHECK, are the first three of:

1. Absolute deviation from mean Chi-1 value (excl. Pro)
2. Absolute deviation from mean of omega torsion
3. C-alpha chirality: abs. deviation of zeta torsion
4. Absolute deviation from mean of H-bond energy
5. Gamma atom B-value
6. Average B-value of main-chain atoms
7. Average B-value of side-chain atoms
8. G-factor for phi-psi distribution
9. G-factor for chi1-chi2 distribution
10. Residue-by-residue G-factor
11. Approx. accessibility (estimated by Ooi numbers)
12. Percentage residue accessibility
13. Standard deviation of main-chain B-values
14. Standard deviation of side-chain B-values

For each graph, unusual values (usually those more than 2.0 standard deviations away from the "ideal" mean value) are shown highlighted.
Graph d: Secondary structure & estimated accessibility

The secondary structure plot shows a schematic representation of the Kabsch & Sander (1983) secondary structure assignments. The key just below the picture shows which structure is which. Beta strands are taken to include residues with a Kabsch & Sander assignment of E, helices correspond to both H and G assignments, while everything else is taken to be random coil.

The shading behind the schematic picture gives an approximation to the residue accessibilities. The approximation is a fairly crude one, being based on each residue's Ooi number (Nishikawa & Ooi, 1986). An Ooi number is a count of the number of other Calpha atoms within a radius of, in this case, 14 Å of the given residue's own Calpha. Although crude, this does give a good impression of which parts of the structure are buried and which are exposed on the surface. Future versions of PROCHECK will include accurate calculation of residue accessibility.
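The Ooi number is simple enough to sketch in a few lines of Python. The function name and the toy coordinates below are assumptions for illustration, and the choice to count atoms at exactly 14 Å as inside the radius is also an assumption, since the text does not say how the boundary is treated.

```python
import math


def ooi_numbers(calpha_coords, radius=14.0):
    """For each residue, count the other Calpha atoms within `radius` Angstrom."""
    counts = []
    for i, p in enumerate(calpha_coords):
        counts.append(sum(1 for j, q in enumerate(calpha_coords)
                          if j != i and math.dist(p, q) <= radius))
    return counts


# Hypothetical Calpha positions: three close together and one far away.
toy_calphas = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
print(ooi_numbers(toy_calphas))
```

A high count indicates a residue surrounded by many others, which suggests it is buried; a low count, as for the isolated fourth atom here, suggests an exposed residue.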

Graph e: Sequence & Ramachandran regions

The next section shows the sequence of the structure (using the 20 standard one-letter amino-acid codes) and a set of markers that identify the region of the Ramachandran plot in which each residue is located. There are four marker types, one for each of the four types of region: core (ie most favoured), allowed, generous and disallowed.

Graph f: Max. deviation

The small histogram of asterisks and plus-signs shows each residue's maximum deviation from the ideal values, as listed in the .out file. Refer to the final column of the .out file to see which is the parameter that deviates by the amount shown here.

Graph g: G-factors

The shaded squares give a schematic representation of each residue's G-factor values. (Note that the chi-1 G-factors are shown only for those residues that do not have a chi-2, and hence no chi1-chi2 G-factor.) Regions with many dark squares correspond to regions where the properties are "unusual", as defined by a low (or negative) G-factor. These may correspond to highly mobile or poorly defined regions such as loops, or may need further investigation.

The main options for the plot are:
• Which 3 of 14 plots to be printed in the 3 main graphs at the top of the page.
• Number of standard deviations for highlighting outliers (default is 2.0).
• The plot can be in colour or black-and-white.

7th plot: Main-chain bond length distributions

The histograms on this plot show the distributions of each of the main-chain bond lengths in the structure. The solid line in the centre of each plot corresponds to the small-molecule mean value, while the dashed lines either side show the small-molecule standard deviation, the data coming from Engh & Huber (1991).

Highlighted bars correspond to values more than 2.0 standard deviations from the mean, though the value of 2.0 can be changed by editing the procheck.prm file.

If any of the histogram bars lie off the graph, to the left or to the right, a large arrow indicates the number of these outliers (as in the C-O plot above).

Significant outliers are shown on the Distorted geometry plots.

8th plot: Main-chain bond angle distributions

The histograms on this plot show the distributions of each of the different main-chain bond angles in the structure. The solid line in the centre of each plot corresponds to the small-molecule mean value, while the dashed lines either side show the small-molecule standard deviation, the data coming from Engh & Huber (1991).

Highlighted bars correspond to values more than 2.0 standard deviations from the mean, though the value of 2.0 can be changed by editing the procheck.prm file.

If any of the histogram bars lie off the graph, to the left or to the right, a large arrow indicates the number of these outliers (as in the CA-C-O and CB-CA-C plots above).

Significant outliers are shown on the Distorted geometry plots.
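The highlighting rule used for the bond length and bond angle histograms (values more than 2.0 standard deviations from the small-molecule mean) can be sketched as follows. The function name is hypothetical, and the mean, standard deviation and measured angles in the example are placeholder numbers chosen for illustration, not values quoted from Engh & Huber (1991) or read from procheck.prm.

```python
def flag_outliers(values, mean, sigma, n_sigma=2.0):
    """Return (index, value, deviation in sigma units) for each outlier.

    A value is an outlier when it lies more than `n_sigma` standard
    deviations away from `mean`.
    """
    flagged = []
    for i, v in enumerate(values):
        z = (v - mean) / sigma
        if abs(z) > n_sigma:
            flagged.append((i, v, round(z, 2)))
    return flagged


# Placeholder bond angles in degrees, with an assumed mean and sigma.
angles = [121.0, 126.0, 117.9, 123.4]
for index, value, z in flag_outliers(angles, mean=121.7, sigma=1.8):
    print(index, value, z)
```

Raising `n_sigma` mirrors the effect of changing the 2.0 threshold: fewer values are highlighted.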

9th plot: RMS distances from planarity

These histograms show the RMS distances from planarity for the different planar groups in the structure. The dashed lines indicate different ideal values for aromatic rings (Phe, Tyr, Trp, His) and for planar end-groups (Arg, Asn, Asp, Gln, Glu). The default values are 0.03 Å and 0.02 Å, respectively, and can be altered by editing the procheck.prm file.

Histogram bars beyond the dashed lines are shown as highlighted.

The main options for the plot are:
• RMS distance from planarity for highlighting outliers for ring groups (default is 0.03 Å).
• RMS distance from planarity for highlighting outliers for other groups (default is 0.02 Å).
• The plot can be in colour or black-and-white.

10th plot: Distorted geometry plots

These plots show all distorted main-chain bond lengths, main-chain bond angles, and planar groups. The parameters defining how distorted these properties need to be before being plotted here are given in the procheck.prm parameter file.

For each main-chain bond length and angle plotted, the plot shows the ideal value (as defined by the Engh & Huber small-molecule data), the actual value, and the difference between the two.

For each distorted planar group, three orthogonal projections are plotted, and the value shown is the RMS distance of the atoms from the best-fit plane.

References:

Blundell, T. L.; Sibanda, B. L.; Sternberg, M. J. E.; Thornton, J. M. "Knowledge-based prediction of protein structures and the design of novel molecules", Nature, 326, 347 (1987).

Blundell, T. L.; Carney, D.; Gardner, S.; Hayes, F.; Howlin, B.; Hubbard, T.; Overington, J.; Singh, D. A.; Sibanda, B. L.; Sutcliffe, M. J. "Knowledge-based protein modelling and design", Eur. J. Biochem., 172, 513 (1988).

Browne, W. J.; North, A. C. T.; Phillips, D. C.; Brew, K.; Vanaman, T. C.; Hill, R. L. "A Possible Three-dimensional Structure of Bovine α-Lactalbumin based on that of Hen's Egg-white Lysozyme", J. Mol. Biol., 42, 65 (1969).

Dayhoff, M. O.; Barker, W. C.; Hunt, L. T. "Establishing Homologies in Protein Sequences", Methods in Enzymology, 91, 524 (1983).

Sali, A. and Blundell, T. L. "Comparative protein modelling by satisfaction of spatial restraints", J. Mol. Biol., 234, 779-815 (1993).

Adobe Systems Inc. (1985). PostScript Language Reference Manual. Addison-Wesley, Reading, MA.

Allen F H, Bellard S, Brice M D, Cartwright B A, Doubleday A, Higgs H, Hummelink T, Hummelink-Peters B G, Kennard O, Motherwell W D S, Rodgers J R & Watson D G (1979).

Hartigan, J. A. (1975). Clustering Algorithms. New York: Wiley.

Lichtarge, O., Bourne, H. R. and Cohen, F. E., J. Mol. Biol., 257, 342-358 (1996).

Lichtarge, O., Yamamoto, K. R. and Cohen, F. E., J. Mol. Biol., 274, 325-337 (1997).
