Revised D28
Peptide number Sequence E. coli B. cereus S. aureus
D28 FLGVVFKLASKVFPAVFGKV 64 16 8
D28 FLGVVFKLASKVFPAVFGKV 64 16 4
D28 FLGVVFKLASKVFPAVFGKV 64 16 8
R1 FLKVVFKLASKVFPAVFGKV 32 16 4
R2 FLGKVFKLASKVFPAVFGKV 16 16 32
R3 FLGVVFKKASKVFPAVFGKV 256 256 *
R4 FLGVVFKLASKVFKAVFGKV 128 8 8
R5 FLGVVFKLASKVFPAVFKKV 64 16 4
R6 FLKKVFKLASKVFPAVFGKV 4 32 64
R7 FLGKVFKKASKVFPAVFGKV 32 * *
R8 FLGKVFKLASKVFKAVFGKV 16 8 4
R9 FLGKVFKLASKVFPKVFGKV 16 32 64
R10 FLGKVFKLASKVFPAVFKKV 16 128 128
R11 FLKVVFKKASKVFPAVFGKV 128 * *
R12 FLGVVFKKASKVFKAVFGKV 32 32 256
R13 FLGVVFKKASKVFPKVFGKV 128 * *
R14 FLGVVFKKASKVFPAVFKKV 256 * *
R15 FLGGVFKLASKVFPAVFGKV 64 32 64
R16 FLGSVFKLASKVFPAVFGKV 64 32 64
R17 FLGVVFKGASKVFPAVFGKV * 256 *
R18 FLGVVFKSASKVFPAVFGKV * * *
R19 FLGGVFKGASKVFPAVFGKV * * *
R20 FLGSVFKSASKVFPAVFGKV * * *
R21 FLGVVVKLASKVFPAVFGKV * 32 16
R22 FLGVVFKLVSKVFPAVFGKV 256 16 8
R23 FLGVVFKLAVKVFPAVFGKV * * *
R24 FLGVVFKLASKVVPAVFGKV 256 64 16
R25 FLGVVFKLASKVFPAVVGKV 256 64 32
R26 FLGKVVKGASKVFPAVFGKV * * *
R27 FLGKVFKGVSKVFPAVFGKV 128 * *
R28 FLGKVFKGAVKVFPAVFGKV 128 * *
R29 FLGKVFKGASKVVPAVFGKV * * *
R30 FLGVVFKGASKVFPAVVGKV * * *
R31 FLGKVVKKASKVFPAVFGKV 128 * *
R32 FLGKVFKKVSKVFPAVFGKV 32 * *
R33 FLGKVFKKAVKVFPAVFGKV 32 * 256
R34 FLGKVFKKASKVVPAVFGKV 256 * *
R35 FLGKVFKKASKVFPAVVGKV 128 * *
R36 FLGVVFKLASKVFPAVFGKV-NH3 64 16 4
R37 FLGVVFKLASKVFGAVFGKV * 16 8
R38 FLKVVFKLASKVFGAVFGKV 128 16 16
R39 FLGKVFKLASKVFGAVFGKV 16 8 8
R40 FLGVVFKKASKVFGAVFGKV 64 128 *
R41 FLGVVFKLASKVFGKVFGKV 64 8 8
R42 FLGVVFKLASKVFGAVFKKV 256 8 8
R43 FLGKVFKGASKVFGAVFGKV 32 128 *
R44 FLGGVFKKASKVFGAVFGKV 32 128 *
Peptide Sequence Molecular Weight (theoretical) Molecular Weight (MALDI-TOF)
D28 FLGVVFKLASKVFPAVFGKV 2153.9 2154.0
Supplementary Figure 1 (b) Samples of B. cereus incubated with varying concentrations of D28
and S28 are streaked on agar plates, which confirms that the bacteria are not viable at or above
the MIC.
Supplementary Methods
Derivation of grammars
Using the set of 526 AmP sequences from APD, we ran the Teiresias pattern discovery
tool with the following settings: L = 6, W = 6, and K = 2 (a detailed description of the Teiresias
input parameters and associated tools is available elsewhere1). The resulting grammar set was
masked from the input sequences and the process was repeated using L = 7, W = 15, K = 5 with
the following amino acid equivalency groups [[AG], [DE], [FYW], [KR], [ILMV], [QN], [ST]].
(See Rigoutsos and Floratos (1998) for a background on pattern discovery, masking, and
Teiresias outputs its grammars in regular expression format, using wildcards. To make
the grammars more selective, we de-referenced each wildcard in the grammars to a bracketed
expression. That is, we replaced each wildcard with the set of amino acids implied by the
grammar’s offset list. Finally, to allow partial matches as short as 10 amino acids, we divided
each grammar into sub-grammars using a sliding window of size 10, resulting in 1551 grammars
of length ten.
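The two steps above (wildcard de-referencing and the sliding window) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the toy grammar and instances are invented, and `instances` stands in for the substrings that a grammar's offset list points to. It is not the code used in this work.

```python
import re

# One token per residue position: a bracketed class, a single residue, or '.'.
TOKEN = re.compile(r"\[[A-Z]+\]|[A-Z]|\.")

def tokenize(grammar):
    """Split a regex-style grammar into one token per residue position."""
    tokens = TOKEN.findall(grammar)
    if "".join(tokens) != grammar:
        raise ValueError(f"unsupported grammar syntax: {grammar!r}")
    return tokens

def dereference(grammar, instances):
    """Replace each '.' wildcard with the residues observed at that position.

    `instances` are equal-length substrings matched by the grammar
    (assumed here to be what the offset list locates).
    """
    out = []
    for i, tok in enumerate(tokenize(grammar)):
        if tok == ".":
            seen = sorted({inst[i] for inst in instances})
            tok = seen[0] if len(seen) == 1 else "[" + "".join(seen) + "]"
        out.append(tok)
    return "".join(out)

def sub_grammars(grammar, window=10):
    """Slide a fixed-size window over the positions of a grammar."""
    tokens = tokenize(grammar)
    return ["".join(tokens[i:i + window])
            for i in range(len(tokens) - window + 1)]

# Toy example (invented pattern and instances, for illustration only).
full = dereference("K.[ILMV]F.KLASKV", ["KAIFGKLASKV", "KGVFSKLASKV"])
print(full)                # K[AG][ILMV]F[GS]KLASKV
print(sub_grammars(full))  # two sub-grammars of ten positions each
```

A grammar with n positions yields n - 9 sub-grammars of length ten, and a grammar shorter than ten positions yields none.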
By design, these 1551 grammars are sensitive for the AmP sequences from the APD. That is, these
sequences from the APD are likely to be matched by the grammars. However, the grammars are
not necessarily selective for the APD AmPs. That is, non-AmP sequences may also be matched
by the grammars.
To select only those AmP grammars that are both sensitive and selective, we searched
each of the grammars against a nearly exhaustive set of all known AmPs. These sequences were
drawn from the AMSdb and supplemented with an additional ~200 antimicrobial peptides from
Swiss–Prot/TrEMBL2 that were not included in the AMSdb. In addition, we searched each of the
grammars against sequences from Swiss–Prot/TrEMBL that were not AmPs. Using these two
searches, we eliminated grammars that were not at least 80% selective for AmPs. That is, at least
80% of the matches for a single grammar had to come from the set of all known AmPs.
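As a rough illustration of the 80% selectivity filter described above, the following Python sketch counts, for each grammar, how many of its matching sequences are AmPs. The function names and toy sequences are hypothetical. The sketch also assumes that a match is counted once per sequence, which the text does not specify.

```python
import re

def selectivity(grammar, amp_seqs, non_amp_seqs):
    """Fraction of a grammar's matching sequences that are AmPs.

    Returns None when the grammar matches nothing in either set.
    """
    pattern = re.compile(grammar)
    amp_hits = sum(1 for s in amp_seqs if pattern.search(s))
    other_hits = sum(1 for s in non_amp_seqs if pattern.search(s))
    total = amp_hits + other_hits
    return amp_hits / total if total else None

def select_grammars(grammars, amp_seqs, non_amp_seqs, threshold=0.8):
    """Keep grammars whose matches come from AmPs at least `threshold` of the time."""
    kept = []
    for g in grammars:
        frac = selectivity(g, amp_seqs, non_amp_seqs)
        if frac is not None and frac >= threshold:
            kept.append(g)
    return kept

# Toy data (invented): "KK" hits 4 AmPs and 1 non-AmP, so it is exactly 80% selective.
amps = ["AAKK", "GKKA", "KKLL", "VKKV"]
others = ["TTKK", "TTTT"]
print(select_grammars(["KK", "TT", "WW"], amps, others))
```

Grammars that match no sequence at all are dropped here, since their selectivity is undefined.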
The resulting final set of 684 ten-amino-acid grammars was used to design the unnatural
AmPs as described in the text under Methods, Computational design of unnatural AmPs.
Scoring of sequences
As discussed in the text, if one of the AmP grammars matches a particular 10-residue
stretch of a 20-mer, we call that 10-residue stretch “grammatical.” Here, the 20-mers were
designed such that each window of 10 amino acids is grammatical. Some grammars, however,
are more common in the set of known AmPs than others. So, rather than using a
binary metric (grammatical or not), we developed a score S, which is the degree to which a
given 20-mer is grammatical. This score is computed by making a sequence dot plot matrix3. In
the dot plot, the columns represent the positions, 1-20, of the 20-mer and the rows represent the
concatenated sequences of the ~1000 naturally occurring AmPs. A dot is placed in the matrix
wherever a grammar matches both a naturally occurring AmP and the 20-mer. Then score S is
Peptide Synthesis
Fmoc chemistry was used to synthesize peptides on the Intavis Multipep Synthesizer
(Intavis LLC, San Marcos, CA) at the MIT Biopolymers Lab. Mass spectrometry was used to
confirm the accuracy of the synthesis and typical purities obtained with the synthesizer were
>85%. Recombinantly produced standards for Cecropin P1, Cecropin Melittin Hybrid, Melittin,
Magainin 2, and Parasin were purchased from the American Peptide Company (Sunnyvale, CA).
In antimicrobial assays, four of the five recombinant peptides had MICs identical to those of the
chemically synthesized versions from the MIT Biopolymers Lab, with the fifth (Cecropin P1)
differing by one dilution. Also, D28 was synthesized by the MIT Biopolymers Lab 4 separate times and the resulting
The Minimum Inhibitory Concentration (MIC) was measured with an assay based on the NCCLS
M26A and the Hancock assay for cationic peptides4. Briefly, serial dilutions of peptides in 0.2%
bovine serum albumin and 0.01% acetic acid were made at 10x the desired testing
concentration. Target bacteria were grown in Mueller Hinton Broth (MHB; BD, Franklin Lakes, NJ) to
an OD600 between 0.1 and 0.3 and diluted to 2-7 x 10^5 cfu/ml in fresh MHB, as confirmed by
plating serial dilutions (2-7 x 10^6 cfu/ml for S. aureus). Five µl of each peptide dilution was incubated
with 45 µl of the target culture in sterile, capped polypropylene strip tubes for 16-20 h. The MIC was
defined as the minimum concentration that prevented growth based on visual inspection of OD.
For a representative set of peptides, dose-response curves were created by taking OD600
measurements on a 96-well plate reader after diluting 30 µl of the sample 1:3 in MHB. When
desired, the samples were streaked on an MHB agar plate to assess the viability of the bacteria.
Viability measurements corresponded with the MIC for all samples except 51s, for which some
samples at high optical density did not produce viable colonies upon plating.
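The dilution arithmetic and the MIC read-out described above can be illustrated with a short Python sketch. Adding 5 µl of a 10x peptide dilution to 45 µl of culture gives a 1:10 dilution, so the peptide ends up at the desired 1x testing concentration. The function names, the two-fold dilution series, and the growth calls below are hypothetical and serve only to show the calculation.

```python
def final_concentration(stock, peptide_ul=5.0, culture_ul=45.0):
    """Peptide concentration after mixing the stock into the bacterial culture."""
    return stock * peptide_ul / (peptide_ul + culture_ul)

def mic(growth_by_conc):
    """Lowest tested concentration with no visible growth, or None if all wells grew.

    `growth_by_conc` maps a final concentration to True (growth) or False.
    """
    inhibitory = [c for c, grew in growth_by_conc.items() if not grew]
    return min(inhibitory) if inhibitory else None

# Assumed two-fold series of 10x stocks (arbitrary concentration units).
stocks = [2560 / 2**i for i in range(7)]           # 2560, 1280, ..., 40
finals = [final_concentration(s) for s in stocks]  # 256, 128, ..., 4

# Invented read-out: growth only at the two lowest concentrations.
growth = {c: c < 16 for c in finals}
print(mic(growth))  # 16.0
```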
Supplementary Notes
1. Rigoutsos, I. & Floratos, A. Combinatorial pattern discovery in biological sequences:
The TEIRESIAS algorithm. Bioinformatics 14, 55–67 (1998).
2. Bairoch, A. & Apweiler, R. The SWISS-PROT protein sequence database and its
supplement TrEMBL in 2000. Nucleic Acids Res 28, 45–48 (2000).
3. Maizel, J.V. & Lenk, R.P. Enhanced graphic matrix analysis of nucleic acid and protein
sequences. Proc Natl Acad Sci USA 78, 7665–7669 (1981).
4. Hilpert, K., Volkmer-Engert, R., Walter, J. & Hancock, R.E.W. High-throughput
generation of small antibacterial peptides with improved activity. Nat Biotech 23, 1008–1012 (2005).