DDT: TARGETS Vol. 3, No. 2 (Suppl.), 2004 mass spectrometry in proteomics supplement | reviews

Analysis of large-scale MS data sets: the dramas and the delights
Keiryn L. Bennett*, Jan C. Brønd, Dan B. Kristensen, Alexandre V. Podtelejnikov and Jacek R. Wiśniewski†

MDS Denmark, Staermosegaardsvej 6, Odense M, DK-5230 Denmark
*e-mail: kbennett@mdsdenmark.com
†e-mail: jwisniewski@mdsdenmark.com

1741-8372/04/$ – see front matter ©2004 Elsevier Ltd. All rights reserved. PII: S1741-8372(04)02412-0 www.drugdiscoverytoday.com

The biotechnology and pharmaceutical industries are faced with the serious challenge of consolidating the enormous quantities of data that have been generated from high-throughput proteomic applications. The bottleneck of data validation and placement of the information obtained into sound biological context urgently needs to be addressed. Here, we review the issues that arise when analysing large quantities of data generated by liquid chromatography mass spectrometry, offer potential solutions for data management and predict the future direction of large-scale data analysis by mass spectrometry.

In the age of high-throughput proteomics, ‘more is better’ became the catch cry of the biotechnology and pharmaceutical industry. Companies have invested vast amounts of capital in state-of-the-art mass spectrometers, robotics to prepare samples for analysis, and automated systems to generate large data sets. These investments have neither been cost-effective nor significantly aided the pharmaceutical industry in discovering new targets. Mass spectrometry is a core technology in proteomic research; however, large-scale proteome characterization has led to serious bottlenecks in data interpretation.

The situation is eloquently captured by a recent quote from Scott Patterson, “our ability to generate data now outstrips our ability to analyse it” [1], and reflects the current status for industry and academia alike. There is now an evolution towards statistically sound, biologically relevant data sets: ‘quality, not quantity’ is the reality. Efforts are concentrated on producing focused data sets aimed at resolving the complexity of biological samples. This is being achieved by a synergistic relationship among sample preparation, mass spectrometry (MS) analysis of the biologically relevant samples, and data analysis and integration.

Priorities of large-scale proteomic analyses
The predominant methods of large-scale data generation are liquid chromatography mass spectrometry (LC–MS) of complex protein mixtures [2–5], and matrix-assisted laser desorption ionization (MALDI) MS profiling of tissue sections [6,7] and sera [8]. The current need of the industry with respect to the information required from large-scale proteomic analyses is the detection of pharmaceutically relevant proteins. These include proteins involved in signal transduction and the effects of inhibitors or activators on the proteins; kinases and phosphatases involved in pathway modulation; and receptors and associated ligands involved in intra- and intercellular signalling. Identification of proteins such as transcription factors, cell-surface proteins, secreted proteins and biomarkers is a high priority for the generation of drug targets, diagnosis of disease and assessment of drug toxicity. Within these classes of proteins, the challenge is to identify low-abundance proteins and to provide quantitative information.

Low-abundance proteins
Owing to the inherent nature of cellular complexity, the detection and identification of low-abundance proteins by MS is extremely challenging. To increase the chance that a protein of low copy number will be detected, it is necessary to reduce the complexity of the sample. There are several approaches whereby sample complexity can be reduced. One method is to fractionate tissue and cells into subcellular components, such as plasma membrane [9–11], mitochondria [12], nucleoli [13] and lipid rafts [14,15]. A second approach is to isolate protein populations by affinity chromatography. This can be achieved by utilizing the inherent properties of the protein via posttranslational modifications (e.g. phosphoproteome [18] and the glycoproteome [19]). Alternatively, a specific amino acid can be labelled with a tag and, subsequently, the tagged peptides can be isolated by affinity purification [11,20].

Peptide and protein quantitation
There is an increasing demand for comparative proteomics in which populations of specific proteins are compared between two different states: for example, healthy versus diseased tissue. There are three main methods for quantitation by MS. The first approach employs in vivo metabolic labelling. Different cell states are grown in dissimilar isotopic environments, for example, in media containing light and heavy versions of stable amino acid isotopes [21–23]. Equimolar quantities of light and heavy cells are mixed, then processed, digested and analysed by MS.

The second approach uses 18O during proteolysis [24]. Digestion of a protein in 18O-enriched water results in the incorporation of heavy isotopes into the C-terminus of each peptide. The third method is to label a specific amino acid, for example cysteine, with a chemical reagent containing a light or heavy isotope. Such reagents often incorporate an isotope-coded affinity tag, which allows enrichment of the labelled peptides and determination of the relative abundance of the peptides [20]. For all three methods, peptides are observed as doublets with mass differences corresponding to the light and heavy isotopes, and the ratio of the peak heights or peak areas within a doublet is proportional to the relative abundance of each peptide. New approaches for protein and peptide quantitation based on the modification of specific amino acids with stable isotopes are constantly emerging, as exemplified by a recent article from Olsen et al. [11].

Issues in MS data analysis
It is now relatively straightforward to establish a high-throughput proteomic laboratory and produce large quantities of data. Data generation no longer represents a major bottleneck. Instead, the proteomic pendulum has swung away from data generation and towards data validation, that is, relating the information generated by MS and proteomics to the biological question.

The publication of nonvalidated data sets has created confusion and misunderstanding because the ‘numbers’ cannot be related realistically to a biological application. In addition, manual validation is an extremely tedious exercise that is highly subject to human variability. Another major issue in dealing with large data sets generated by LC–MS is the availability of trustworthy software to (i) process, (ii) search, and (iii) analyse, verify and curate the data.

Data processing
Before searching information generated by MS, the data must be in a format compatible with a database search engine. Processing software converts raw mass spectra into a generic list of monoisotopic masses and intensities. The mass-to-charge (m/z) ratio for each peak is defined by the centroid (i.e. the centre of the peak mass), which is determined above a specified percentage of the peak height (e.g. 50%), selected to avoid the inclusion of noise at the baseline. Many of the software programs available for transforming raw mass spectra attempt to simplify the data by adjusting parameters such as signal-to-noise ratio, smoothing, baseline subtraction, peak extraction and filtering according to specific criteria that exclude nonpeptide peaks, data centroiding, and charge-state determination of the precursor ion. The result of such spectral manipulation, in many instances, is that data can be omitted.

Data searching
Search engines such as Sequest [25], Mascot [26], Protein Prospector [27] and Sonar [28] are founded on probability-based scoring algorithms and, as such, are fraught with problems in assessing the trustworthiness of an identification. The output from a search engine analysis is a series of peptide and protein identifications that are ranked according to specified criteria. Because there are no universal standards for scoring the output from the programs, determining what constitutes a significant match is not straightforward. If the data are of sufficient quality, then the probability of a false positive can be disregarded; however, in some cases the identification returned by the search engine is likely to be incorrect.

In an attempt to increase the confidence of a protein identification, it has become common practice to submit the same data to two or more search engines and combine the outputs. The claim is that if two search engines return the same protein, then confidence in a definitive identification will increase. However, the same data are assessed with similar search algorithms, which might exacerbate the incidence of false positives. Ideally, the data should be assessed by ‘unrelated’ algorithms. It is reasonable to speculate that when different algorithms return the same identification, confidence in the ‘correct’ protein identity will increase.

The information returned by the search engine should thus be taken as an initial data filter rather than the final and correct answer. After data filtering, the remaining uninterpreted data should be searched against expressed sequence tag (EST), single nucleotide polymorphism (SNP), alternatively spliced protein databases and, ultimately, genome databases. Exhaustive data searches would guarantee that all publicly available sequence information is accessed and, thereby, the incidence of false-positive identifications would be reduced.


Data analysis and curation
Reassembly of peptide information back to the protein is a dilemma for ‘shotgun proteomics’. Figure 1 is an overview of the process by which data are produced by MS and illustrates the associated difficulty of reassigning identified degenerate peptides back to the parent protein.

[Figure 1: schematic with three levels (protein level, peptide level, MS/MS spectra level). Labels: protein sample (A, B, C, D); enzymatic digestion; peptide mixture; LC/MS/MS; MS/MS spectra; database search, validation; peptide identifications; peptide grouping, validation; protein identifications (A, B, C).]

Figure 1. Simplified outline of the experimental steps and flow of the data in a typical high-throughput mass spectrometry (MS)-based analysis of complex protein mixtures. Each sample protein (yellow circle) is cleaved into smaller peptides (yellow squares), which can be either unique to that protein (unbroken arrows) or shared with other sample proteins (broken arrows). The peptides are then ionized, and selected ions are fragmented to produce tandem MS (MS/MS) spectra. Some peptides are selected for fragmentation several times (broken arrows), whereas some are not selected even once. Each acquired MS/MS spectrum is searched against a sequence database and assigned a best-matching peptide, which might be correct (yellow squares) or incorrect (black square). The database search results are then manually or statistically validated. The list of identified peptides is used to infer which proteins are present in the original sample (yellow circles), and which are false identifications (black circles) corresponding to incorrect peptide assignments. The process of inferring protein identities is complicated by the presence of degenerate peptides corresponding to more than a single entry in the protein sequence database (broken arrows). Adapted, with permission, from Ref. [33], © (2003) American Chemical Society.

Potential solutions to assist the analysis of large-scale MS data sets
From a biological point of view, improvement in experimental design is mandatory as more researchers are realizing the impact of design and methodology on the quality of the results. For example, the analysis of total cell lysate gives a ‘snap shot’ of the most abundant proteins, whereas ‘hypothesis-driven proteomics’ (e.g. specific pull-down experiments or subcellular fractionation) provides insight into the mechanism or function of a particular biological system.

From an analytical angle, the quality of the data produced from the mass spectrometer is of fundamental importance. Low-resolution three-dimensional ion traps are extremely popular and well-suited to high-throughput LC–MS; however, the charge state of a precursor ion cannot be distinguished when operating in full-scan mode. The combination of a quadrupole mass selector and collision cell with orthogonal acceleration has led to high resolution (~10 000) and mass accuracy (5–20 ppm with internal calibration). Charge states are more readily assigned, which is an important advancement in minimizing false protein identifications. Fourier transform ion cyclotron resonance MS provides the ultimate performance with a mass accuracy of 1–5 ppm; however, such instruments are beyond the budget of most laboratories.

Ultimately, intelligent software solutions are vital to the continuation of proteomics. Areas of importance include (i) adapting current algorithms and developing new algorithms to process and search data, (ii) maximizing information extraction, (iii) semiautomatic to automatic data validation and generation of statistically sound data, and (iv) data management and curation.

Data processing and search algorithms
There have been numerous attempts to determine the optimal means of extracting information from MS-generated data and to process the information into a format that is compatible with search engine analysis. Effective automated transfer of MS data to informatic programs is dependent on the reliable performance and accuracy of peak-assignment algorithms. Most instrument vendors provide proprietary algorithms as part of the acquisition software, and there have been considerable efforts by independent groups to develop alternative mathematical approaches to maximize information extraction from the data.

A lack of fully automated data analysis systems for protein identification from complex mixtures requires the development of more rigorous search and scoring algorithms. Recently, Colinge et al. [29] introduced a scoring scheme named OLAV. This scheme is similar to established probability-based search engines, but the scoring has been markedly improved by including structural information from consecutive fragment ion matches; that is, it is an extension of the sequence tag concept. The overall benefit of OLAV compared with other probability-based search engines is the improved selection of correct peptide assignments based purely on the score returned by the search engine.

Other algorithms, including Protein Prospector (http://www.ucsf.edu), ProbID [30] and Sequest [31], are under further development with the aim of providing more reliable significance thresholds. It still remains to be seen, however, whether putative matches close to the threshold can be unequivocally trusted.

Maximal extraction of information
Multiple analyses of the same sample by MS via an ‘exclusion list’ approach [32] and iterative database analyses of the same data are being combined in an approach to extract information exhaustively from a single biological experiment (MDS Denmark).

During LC–MS analysis of a mixture of proteins on a quadrupole time-of-flight (TOF) instrument, the TOF region can resolve significantly more precursor ions than the mass spectrometer can select and fragment; therefore, a single LC–MS acquisition is insufficient to analyse all precursor ions comprehensively. Multiple LC–MS analyses of the same sample and the generation of an exclusion list between consecutive analyses provide a means whereby only unique precursor ions are fragmented. After each LC–MS analysis, the m/z ratio and retention time of all of the precursor ions selected for tandem MS (MS/MS) are added to an exclusion list. These peptides are disqualified from selection in subsequent analyses, and peptides that were not previously selected are fragmented. The greater the number of peptides selected for fragmentation by the mass spectrometer, the greater the number of peptides that will match a given protein, thereby increasing the sequence coverage and the probability of a correct identification. Following data generation, the files are iteratively searched through a combination of static and dynamic databases. The aim is to extract more information (e.g. posttranslational modifications, alternatively spliced proteins and point mutations) from a single experiment.

The analysis of a complex biological mixture by LC–MS usually involves the production of several fractions; each fraction is analysed once and the data are searched against a protein database. This represents linear data generation and analysis (Figure 2a). Alternatively, each fraction is analysed several times by LC–MS and the data are cycled through several databases (Figure 2b). Thus, the quantity of data that can be generated from a single biological experiment exponentially explodes.

Data validation and statistical analysis
An experimental peptide identification repository (EPIR) is a recently developed database system that facilitates the validation and statistical analysis of data generated by LC–MS. One of the software modules enables automated peptide validation. To enhance the sensitivity of true identification and to avoid false positives, additional parameters related to the identification are computed. An important parameter is the number of sibling peptides (NSP) [33]. NSP is the total number of unique peptides identifying a protein (or protein group) and is a strong indicator of the probability that the peptide identification is correct. Similar to OLAV, a score is calculated for consecutive y- and b-ion fragment matches. Information concerning specific fragment ions (e.g. proline), missed cleavage sites and quantitative information (if available) are also scored. MS/MS spectra that are rejected by autovalidation can be manually validated directly from the raw data by generating a sequence tag, or reassessed by the breakpoint algorithm with alternative parameters.

PeptideProphet (http://www.systemsbiology.org) [34] is an open source software program that facilitates automatic validation of proteomic data. Based on an empirical statistical model, a sensitivity threshold for correct and incorrect peptide identification can be selected. As would be expected, a higher sensitivity threshold will increase the number of false-positive identifications. Thus, the gain obtained from automatic validation is lost because of the need for additional manual validation to ensure that the peptides and proteins returned are indeed correct.

ProteinProphet [33] is a tool for protein validation based on the output of confirmed peptides from PeptideProphet. Identified peptides corresponding to the same protein are combined using a statistical model to estimate the probability that the protein is present. The number of sibling peptides is also estimated and the information is used to improve peptide validation.

Another approach for peptide validation is to analyse the data by an alternative algorithm, such as sequence tag [35], after the initial first-round analysis using a breakpoint algorithm. Confirmation of peptides by two unrelated algorithms will increase the probability of a correct protein identification.

Data management and curation
Combining all of the information generated from a large-scale MS investigation in a simple fashion that not only allows peptide and protein validation, but also grouping of the results, reassembly of peptide information into proteins, and data reporting in a format that is simple to understand is not a trivial task. There are several filtering and visualization programs that attempt to simplify large MS data sets [36–38].

The EPIR database containing validated and confirmed peptides has a series of modular interactive software tools that overlay the peptide database. These tools include (i) grouping of peptides to related proteins, (ii) addition of quantitative data, (iii) extraction of statistical information from within a single experiment, across multiple MS analyses and different experiments, and (iv) data collation into a simple format showing, for example, protein sequence, confirmed peptides and database accession identifiers.


[Figure 2: two flow diagrams, (a) and (b), each proceeding from cell or tissue extract to subcellular fractionation (e.g. plasma membrane), protein/peptide fractionation (e.g. off-line rpHPLC), LC–MS analysis (e.g. exclusion list), database search (e.g. iterative) and data management (e.g. EPIR).]

Figure 2. Proteomic hierarchy: from cell or tissue extract to peptide or protein identification and data management. (a) Generic linear analysis by liquid chromatography mass spectrometry (LC–MS) and database searching. Cell or tissue extracts are fractionated into subcellular components and then further fractionated at the protein or peptide level. Each sample is analysed once by LC–MS, and the data generated are searched against a single database. (b) Advanced exhaustive data production and database analysis. Cell or tissue extracts are fractionated into subcellular components, such as plasma membrane, and then further fractionated at the protein or peptide level by, for example, off-line reversed-phase high-performance liquid chromatography (rpHPLC). Each sample is analysed three times by LC–MS via the exclusion list approach, and for each analysis the data are searched iteratively against a series of databases. The quantity of data generated from a single biological experiment exponentially explodes through the application of multiple LC–MS analyses and multiple database searches (combined static and dynamic databases). Peptide-centric databases for storing and mining LC/MS/MS data hold the key to large-scale MS data consolidation.
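
The exclusion list approach in panel (b) can be made concrete with a small simulation. This is a sketch under stated assumptions and not the acquisition logic of any instrument: it assumes that each run simply fragments the `per_run` most intense precursors that are not yet excluded, that exclusion is decided by hypothetical m/z and retention-time tolerances (`mz_tol`, `rt_tol`), and that the precursor values are invented for illustration.

```python
def run_with_exclusion(precursors, n_runs, per_run, mz_tol=0.05, rt_tol=1.0):
    """Simulate repeated LC-MS runs that skip previously fragmented precursors.

    Each precursor is a dict with "id", "mz", "rt" (minutes) and "intensity".
    Returns the list of precursor ids selected in each run.
    """
    exclusion = []  # (mz, rt) pairs of precursors already selected for MS/MS

    def excluded(p):
        return any(
            abs(p["mz"] - mz) <= mz_tol and abs(p["rt"] - rt) <= rt_tol
            for mz, rt in exclusion
        )

    selected_per_run = []
    for _ in range(n_runs):
        # Only precursors not on the exclusion list compete for selection.
        candidates = [p for p in precursors if not excluded(p)]
        candidates.sort(key=lambda p: p["intensity"], reverse=True)
        chosen = candidates[:per_run]
        # After the run, add the m/z and retention time of every selected ion.
        exclusion.extend((p["mz"], p["rt"]) for p in chosen)
        selected_per_run.append([p["id"] for p in chosen])
    return selected_per_run


# Hypothetical precursors, well separated in m/z, with decreasing intensity.
precursors = [
    {"id": "a", "mz": 500.0, "rt": 10.0, "intensity": 50},
    {"id": "b", "mz": 600.0, "rt": 12.0, "intensity": 40},
    {"id": "c", "mz": 700.0, "rt": 14.0, "intensity": 30},
    {"id": "d", "mz": 800.0, "rt": 16.0, "intensity": 20},
    {"id": "e", "mz": 900.0, "rt": 18.0, "intensity": 10},
]
runs = run_with_exclusion(precursors, n_runs=3, per_run=2)
print(runs)
```

With a capacity of two precursors per run, a single acquisition would reach only the two most intense ions; three runs with an exclusion list reach all five, which is the gain in coverage described in the text.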

Similarly, Interact (from the Institute for Systems Biology) [36] allows rapid and flexible interrogation and analysis of multiple data sets including filtering, unfiltering, sorting, grouping and highlighting of the data by user-controlled criteria. Comparisons across experiments are achieved using a tool that highlights similarities and differences at either the peptide or the protein level. Interact interfaces with quantitative software to provide the average relative abundance and standard deviation for each protein.

The potential solutions outlined in this section to assist in the analysis of large-scale MS data sets are multifaceted, highly interdependent and cross-functional. The demand to formulate intelligible answers from MS-generated data will continue to direct advancement in sample preparation, MS and software development.

Perspectives
Large-scale data production by MS and analysis of the results in a biological context are still the domain of specialists who are aware of the disadvantages and restrictions of the technology. Expert knowledge is heavily exploited when considering the limitations of an experiment and the answers obtained. As MS and proteomics become increasingly accessible to a broader range of researchers – medical clinicians, for example – it is imperative that there is sufficient support to assist ‘novice’ scientists in formulating realistic conclusions from the data. Without assistance, an inadequate understanding of the technology will severely impact the reliability and significance of the results.

The future demand for software and statistical analysis of MS data will continue to grow as increasingly more laboratories realize the need for exhaustive data analysis rather than excessive data production. Quantitation adds an extra dimension of complexity to the data that cannot be addressed generically by current software tools. Therefore, further software development in this area is essential. The integration of visionary, well-formulated biological theories with MS, plus access to simple and comprehensible software solutions, is the key to bringing the realm of MS and proteomics successfully within the reach of everyday scientists.

References
1 Patterson, S.D. (2003) Data analysis – the Achilles heel of proteomics. Nat. Biotechnol. 21, 221–222
2 Lasonder, E. et al. (2002) Analysis of the Plasmodium falciparum proteome by high-accuracy mass spectrometry. Nature 419, 537–542
3 Florens, L. et al. (2002) A proteomic view of the Plasmodium falciparum life cycle. Nature 419, 520–526
4 Washburn, M.P. et al. (2001) Large scale analysis of the yeast proteome via multidimensional protein identification technology. Nat. Biotechnol. 19, 242–247
5 Lipton, M.S. et al. (2002) Global analysis of the Deinococcus radiodurans R1 proteome by using accurate mass tags. Proc. Natl. Acad. Sci. U. S. A. 99, 11049–11054
6 Caprioli, R.M. et al. (1997) Molecular imaging of biological samples: localisation of peptides and proteins using MALDI-TOF-MS. Anal. Chem. 69, 4751–4760
7 Yanagisawa, K. et al. (2003) Proteomic patterns of tumour subsets in non-small-cell lung cancer. Lancet 362, 433–439
8 Marshall, J. et al. (2003) Processing of serum proteins underlies the mass spectral fingerprinting of myocardial infarction. J. Proteome Res. 2, 361–372
9 Adam, P.J. et al. (2003) Comprehensive proteomic analysis of breast cancer cell membranes reveals unique proteins with potential roles in clinical cancer. J. Biol. Chem. 278, 6482–6489
10 Blonder, J. et al. A detergent- and cyanogen bromide-free method for integral membrane proteomics: application to Halobacterium purple membranes and human epidermis. Proteomics (in press)
11 Olsen, J.V. et al. (2004) HysTag – a novel proteomic quantification tool applied to differential display analysis of membrane proteins from distinct areas of mouse brain. Mol. Cell. Proteomics 3, 82–92
12 Mootha, V.K. et al. (2003) Integrated analysis of protein composition, tissue diversity, and gene regulation in mouse mitochondria. Cell 115, 629–640
13 Andersen, J.S. et al. (2002) Directed proteomic analysis of the human nucleolus. Curr. Biol. 12, 1–11
14 Nebl, T. et al. (2002) Proteomic analysis of a detergent-resistant membrane skeleton from neutrophil plasma membranes. J. Biol. Chem. 277, 43399–43409
15 Foster, L.J. et al. (2003) Unbiased quantitative proteomics of lipid rafts reveals high specificity for signalling factors. Proc. Natl. Acad. Sci. U. S. A. 100, 5813–5818
16 Neubauer, G. et al. (1998) Mass spectrometry and EST-database searching allows characterisation of the multiprotein spliceosome complex. Nat. Genet. 20, 46–50
17 Taylor, S.W. et al. (2003) Characterisation of the human heart mitochondrial proteome. Nat. Biotechnol. 21, 281–286
18 Ficarro, S.B. et al. (2002) Phosphoproteome analysis by mass spectrometry and its application to Saccharomyces cerevisiae. Nat. Biotechnol. 20, 301–305
19 Bunkenborg, J. et al. (2004) Screening for N-glycosylated proteins by liquid chromatography mass spectrometry. Proteomics 4, 454–465
20 Gygi, S.P. et al. (1999) Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat. Biotechnol. 17, 994–999
21 Ong, S.E. et al. (2002) Stable isotope labelling by amino acids in cell culture, SILAC, as a simple and accurate approach to expression proteomics. Mol. Cell. Proteomics 1, 376–386
22 Conrads, T.P. et al. (2001) Quantitative analysis of bacterial and mammalian proteomes using a combination of cysteine affinity tags and 15N-metabolic labelling. Anal. Chem. 73, 2132–2139
23 Blagoev, B. et al. (2003) A proteomics strategy to elucidate functional protein–protein interactions applied to EGF signalling. Nat. Biotechnol. 21, 315–318
24 Yao, X. et al. (2001) Proteolytic 18O labelling for comparative proteomics: model studies with two serotypes of adenovirus. Anal. Chem. 73, 2836–2842
25 Eng, J.K. et al. (1994) An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. J. Am. Soc. Mass Spectrom. 5, 976–989
26 Perkins, D.N. et al. (1999) Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis 20, 3551–3567
27 Clauser, K.R. et al. (1999) Role of accurate mass measurement (±10 ppm) in protein identification strategies employing MS or MS/MS and database searching. Anal. Chem. 71, 2871–2882
28 Field, H.I. et al. (2002) RADARS, a bioinformatics solution that automates proteome mass spectral analysis, optimises protein identification, and archives data in a relational database. Proteomics 2, 36–47
29 Colinge, J. et al. (2003) OLAV: towards high-throughput tandem mass spectrometry data identification. Proteomics 3, 1454–1463
30 Zhang, N. et al. (2002) ProbID: a probabilistic algorithm to identify peptides through sequence database searching using tandem mass spectral data. Proteomics 2, 1406–1412
31 MacCoss, M.J. et al. (2002) Probability-based validation of protein identifications using a modified SEQUEST algorithm. Anal. Chem. 74, 5593–5599
32 Kristensen, D.B. et al. (2003) Multiple LC–MS exclusion list analyses: a tool to enhance protein identification from complex biological samples. Abstract at 51st ASMS Conference on Mass Spectrometry and Allied Topics (http://www.inmerge.com/aspfolder/ASMSAbstracts.html), A031566
33 Nesvizhskii, A.I. et al. (2003) A statistical model for identifying proteins by tandem mass spectrometry. Anal. Chem. 75, 4646–4658
34 Keller, A. et al. (2002) Empirical statistical model to estimate the accuracy of peptide identifications made by MS/MS and database search. Anal. Chem. 74, 5383–5392
35 Mann, M. and Wilm, M. (1994) Error-tolerant identification of peptides in sequence databases by peptide sequence tags. Anal. Chem. 66, 4390–4399
36 Han, D.K. et al. (2001) Quantitative profiling of differentiation-induced microsomal proteins using isotope-coded affinity tags and mass spectrometry. Nat. Biotechnol. 19, 946–951
37 Tabb, D.L. et al. (2002) DTASelect and contrast: tools for assembling and comparing protein identifications from shotgun proteomics. J. Proteome Res. 1, 21–26
38 Eddes, J.S. (2002) CHOMPER: a bioinformatic tool for rapid validation of tandem mass spectrometry search results associated with high-throughput proteomic strategies. Proteomics 2, 1097–1103
