There are many different types of restriction enzymes.
Officially: "Restriction Endonucleases"
Only cut DNA at specific sequences (hence "restriction")
Cutting locations: "Restriction sites"
Usually 4-8 bp
Essentially enzymatic DNA scissors.
Hundreds have been isolated.
Probably serve as a bacterial immune system against phages
Nobel Prize: 1978
Generally speaking:
They recognize "palindromic DNA sequences"
They either cut in the middle of the sequence ("blunt cuts"), or cut off-center and leave a short single-stranded overhang of a few bases, 5' or 3' ("sticky ends").
DNA from different sources can be recombined by treatment with the same sticky-end-making restriction enzyme.
Ligase can be used to seal the break between strands.
Different restriction enzymes recognize different restriction sites.
Vectors
DNA molecules that can be modified to store and replicate other DNA sequences.
Examples: Bacterial plasmids, phages, viruses, artificial chromosomes.
To be useful, plasmids must minimally have:
An origin of replication
A region containing many restriction sites (a "multiple cloning site")
A gene/genes that enable screening of cells that have successfully taken up the plasmid (usually ampicillin resistance)
There is a size limit on plasmids.
To get more genes into cells, artificial chromosomes can be used.
In 2010, the J. Craig Venter Institute created the first cell with a completely synthetic chromosome.
PCR
A way to produce many copies of a DNA molecule without the use of cells.
A "target" DNA sequence is analyzed.
Oligonucleotide primers that bracket the sequence are created.
The DNA, primers, free nucleotides, coenzymes, and special Taq polymerase are put in a thermal cycler.
The thermal cycler is programmed and it runs.
One sequence can be copied many times over overnight, because the number of copies roughly doubles with each cycle.
The thermal cycler is
a laboratory apparatus most
commonly used to amplify segments
of DNA via the polymerase chain
reaction (PCR).[1] Thermal cyclers
may also be used in laboratories to
facilitate other temperature-sensitive
reactions, including but not limited
to restriction enzyme digestion or
rapid diagnostics.
Taq Polymerase
For PCR to work, a polymerase that
is not destroyed by high
temperatures is required.
Taq polymerase: isolated from
Thermus aquaticus, a bacterium first
found in hot springs in Yellowstone
National Park
Steps of PCR
Denaturation:
94-98 degrees Celsius
DNA strands separate
Annealing:
50-65 degrees Celsius
primers bond to strands
Extension:
75-80 degrees Celsius
Taq polymerase attaches to primers
and replicates target sequence
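The three steps above repeat once per cycle, and in the ideal case each cycle doubles the number of target copies. A minimal sketch of that arithmetic follows. The temperature ranges are the ones listed in these notes; the function name `copies_after` and its `efficiency` parameter are hypothetical, introduced only for illustration.

```python
# Idealized PCR model: each cycle multiplies the copy number by (1 + efficiency).
# efficiency = 1.0 is perfect doubling; real reactions fall short of it.

# Step names and temperature ranges (degrees Celsius) as listed in the notes.
PCR_STEPS = [
    ("denaturation", (94, 98)),
    ("annealing", (50, 65)),
    ("extension", (75, 80)),
]


def copies_after(cycles, starting_copies=1, efficiency=1.0):
    """Return the expected number of copies after the given number of cycles."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return starting_copies * (1 + efficiency) ** cycles
```

With perfect doubling, one template molecule becomes 2^30 (about a billion) copies after 30 cycles, which is why a single overnight run is enough to amplify a sequence.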
Oligonucleotides are short, single-stranded DNA or RNA molecules that have a wide range of applications in genetic testing, research, and forensics. Commonly made in the laboratory by solid-phase chemical synthesis, these small bits of nucleic acids can be manufactured with any user-specified sequence, and so are vital for artificial gene synthesis, polymerase chain reaction (PCR), DNA sequencing, library construction and as molecular probes. In nature, oligonucleotides are usually found as small RNA molecules that function in the regulation of gene expression (e.g. microRNA), or are degradation intermediates derived from the breakdown of larger nucleic acid molecules.
Primer (molecular biology)
[Figure: the DNA replication fork, with the RNA primer labeled at top.]
A primer is a strand of nucleic acid that serves as a starting point for DNA synthesis.
CCR5 or Chemokine Receptor 5 is
a membrane receptor protein found
on human immune cells. Its primary
function is to bind specific chemical
signals, called chemokines, and
recruit other immune cells. The
structure of the molecule is shown in
the figure to the right.
The structure of the Chemokine Receptor CCR5 shown here is displayed within the context of the T-helper cell membrane. The PDB entry 4mbs.pdb is that of an engineered molecule fused to rubredoxin (not shown here for clarity) and in complex with a fusion inhibitor drug bound to the extracellular face of the molecule.
The CCR5 protein is an HIV co-receptor. It cooperates with the host cellular CD4 protein to allow the initial docking of HIV onto T-cells, and subsequent infection. The CD4-bound HIV envelope spike protein uses this molecule as a co-receptor to enter and infect host cells. In some instances HIV uses another similar chemokine receptor, CXCR4, as the co-receptor for entry into host cells.
Curiously, approximately 15-20% of
the northern European population is
heterozygous for a naturally
occurring 32 base pair deletion in
their CCR5 gene making them less
susceptible to HIV infection.
Approximately 1% of this population
is homozygous for this mutation
and resistant to HIV infection.
The eleven amino acids encoded by
the 32 base pair deletion are located
midway through the gene, changing
the translation reading frame.
Therefore, the protein product
translated from the gene containing
this deletion is truncated as a result
of the out-of-frame STOP codon
encountered 31 codons after the
deletion site.

Zinc finger nucleases are built from zinc fingers, which are sequence-specific DNA-binding protein domains. Each finger is composed of a short alpha helix and a 2-stranded beta sheet. Two histidines from the helix and two cysteines from the beta sheet simultaneously bind a zinc atom to stabilize this protein motif. Each finger recognizes and binds to three consecutive base pairs in double-stranded DNA. By linking 6 zinc fingers together, it is possible to target a unique 18 base pair sequence of DNA. But most natural zinc finger DNA-binding proteins use only 3 consecutive fingers to bind DNA.
Can you guess why?
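The arithmetic behind the 18 base pair figure can be sketched as follows. Each finger reads three base pairs, so the number of distinct sites an array can address is 4 raised to the site length. The genome size constant (about 3.1 billion base pairs for a haploid human genome) and the random-sequence model are assumptions added here for illustration; they are not from the notes.

```python
BASES_PER_FINGER = 3      # from the notes: one finger reads three base pairs
HUMAN_GENOME_BP = 3.1e9   # assumption: approximate haploid human genome size


def target_length(fingers):
    """Length in base pairs of the site read by an array of zinc fingers."""
    return fingers * BASES_PER_FINGER


def possible_sequences(fingers):
    """Number of distinct DNA sequences of the targeted length (4 bases per position)."""
    return 4 ** target_length(fingers)


def expected_random_matches(fingers, genome_bp=HUMAN_GENOME_BP):
    """Expected chance occurrences of one given site on one strand,
    assuming (unrealistically) a random, equal-frequency genome."""
    return genome_bp / possible_sequences(fingers)
```

Under these assumptions a 9 base pair site (3 fingers) is expected to turn up by chance thousands of times in a genome of that size, while an 18 base pair site (6 fingers) is expected far less than once.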
Zinc fingers were first identified in a frog transcription factor (transcription factor IIIA). Interestingly, this protein structure was found to bind both 5S RNA and its cognate DNA. Over the years zinc fingers have been identified in many other proteins, and they are one of the most common protein domains that bind to specific DNA/RNA sequences.
Each zinc finger domain has ~30
amino acids with two beta strands
and a single alpha helix. In addition
to its hydrophobic core, it is
stabilized by a Zinc ion coordinated
by side chains of four Cysteines, four
Histidines or a combination of these.
The structure of a single zinc finger
protein domain is shown in the
figure to the right. Most zinc finger
containing proteins have a series of
these domains linked to each other.
These domains bind to the major
groove of the DNA. Specific amino
acid side chains reach out from these
domains to "read" the DNA sequence
by interacting with specific DNA
bases
Proteins play countless roles
throughout the biological world,
from catalyzing chemical reactions to
building the structures of all living
things. Despite this wide range of
functions all proteins are made out of
the same twenty amino acids, but
combined in different ways. The way
these twenty amino acids are
arranged dictates the folding of the
protein into its unique final shape.
Since protein function is based on
the ability to recognize and bind to
specific molecules, having the
correct shape is critical for proteins
to do their jobs correctly.
Primary Structure
Primary structure is the linear sequence of amino acids as encoded by the DNA. This sequence defines how the protein will fold and therefore also defines how it will function. A single change in the amino acid sequence of hemoglobin can cause the proteins to clump together, resulting in the disease sickle cell anemia.
Secondary Structure
Hydrogen bonds between amino acids form two
particularly stable structural
elements in proteins: alpha helices
and beta sheets. Alpha helices
(shown in blue) are the basic
structural elements found in
hemoglobin, but many other proteins
also include beta sheets. The inset
highlights the pattern of hydrogen
bonds (shown in green) that
stabilizes alpha helices.
Tertiary Structure
Many functional
proteins fold into a compact globular
shape, with many carbon-rich amino
acids sheltered inside away from the
surrounding water. The folded
structure of hemoglobin includes a
pocket to hold heme, which is the
molecule that carries oxygen as it is
transported throughout the body.
Quaternary Structure
Two or more
polypeptide chains can come
together to form one functional
molecule with several subunits. The
four subunits of hemoglobin
cooperate so that the complex picks
up and delivers more oxygen than is
possible with single subunits.
Functions
Defense The flexible arms of
antibodies have binding sites that can
protect the body from disease by
recognizing and binding to foreign
molecules.
Structure Collagen forms a strong
and flexible triple helix that is
widely used throughout the body for
structural support
Enzymes Alpha amylase is an
enzyme with a specific catalytic site
that begins the breakdown of
carbohydrates in our saliva.
Communication Insulin is a small,
stable protein that can easily
maintain its shape while traveling
through the blood to regulate blood
sugar levels
Storage Ferritin forms a hollow shell
that stores iron from our food.
Transport The calcium pump moves
ions across cell membranes allowing
the synchronized contraction of
muscle cells.

Stem Cells
Totipotent Stem Cells
These are the most versatile of the
stem cell types. When a sperm cell
and an egg cell unite, they form a
one-celled fertilized egg. This cell is
totipotent, meaning it has the
potential to give rise to any and all
human cells, such as brain, liver,
blood or heart cells. It can even give
rise to an entire functional organism.
The first few cell divisions in
embryonic development produce
more totipotent cells. After four days
of embryonic cell division, the cells
begin to specialize into pluripotent
stem cells [18].
Pluripotent Stem Cells
These cells are like totipotent stem
cells in that they can give rise to all
tissue types. Unlike totipotent stem
cells, however, they cannot give rise
to an entire organism. On the fourth
day of development, the embryo
forms into two layers, an outer
layer which will become the
placenta, and an inner mass which
will form the tissues of the
developing human body. These inner
cells, though they can form nearly
any human tissue, cannot do so
without the outer layer; so are not
totipotent, but pluripotent. As these
pluripotent stem cells continue to
divide, they begin to specialize
further [18].
Multipotent Stem Cells
These are less plastic and more
differentiated stem cells. They give
rise to a limited range of cells within
a tissue type. The offspring of the
pluripotent cells become the
progenitors of such cell lines as
blood cells, skin cells and nerve
cells. At this stage, they are
multipotent. They can become one of
several types of cells within a given
organ. For example, multipotent
blood stem cells can develop into red
blood cells, white blood cells or
platelets [18].

Adult Stem Cells
An adult stem cell is a multipotent stem cell in adult humans that is used to replace cells that have died or lost function. It is an undifferentiated cell present in differentiated tissue. It renews itself and can specialize to yield all cell types present in the tissue from which it originated. So far, adult stem cells have been identified for many different tissue types such as hematopoietic (blood), neural, endothelial, muscle, mesenchymal, gastrointestinal, and epidermal cells [18].

Embryonic stem cells (ESCs) are stem cells derived from the undifferentiated inner mass cells of a human embryo. Embryonic stem cells are pluripotent, meaning they are able to grow (i.e. differentiate) into all derivatives of the three primary germ layers: ectoderm, endoderm and mesoderm.
Structural Biology of HIV
The Human Immunodeficiency Virus
(HIV) is an RNA virus that can
infect specific immune cells in our
body, called T helper cells. The RNA
genome of HIV is encased in a
capsid, which is in turn covered by
an envelope derived from the host
cell membrane. The structures and functions of most of HIV's proteins are now known. Explore the anatomy of HIV and learn about the different structural proteins, enzymes and accessory proteins using the RCSB PDB animation or poster linked below. We are still learning about the accessory and regulatory proteins of HIV that exploit the host cell's machinery for the virus's own advantage.
Take a look at the Structural Biology of HIV poster from the RCSB Protein Data Bank web site at www.rcsb.org/pdb/education_discussion. This poster will introduce the overall structure of HIV, which is important for understanding how CCR5 interacts with HIV.
The HIV life cycle can be summarized in the following steps:
Attachment: The HIV spike or envelope protein, gp120, attaches to the host cell protein CD4 on specific types of T-cells.
Fusion and entry: Binding of gp120 and CD4 rearranges their structures, allowing the complex to bind another host cell receptor, the chemokine receptor CCR5. In some cases an alternate receptor called CXCR4 may replace CCR5 in this interaction. This in turn allows the stalk of the HIV spike (the protein gp41) to penetrate the host cell membrane and fuse the viral envelope with the host cell membrane.
Reverse transcription: Upon entry, HIV sheds its capsid and the 2 single strands of viral RNA are converted to a double-stranded DNA by a special viral enzyme called reverse transcriptase.
Integration: The double-stranded DNA, or proviral DNA, enters the host cell nucleus and is integrated into the cell's genome by another special viral enzyme called integrase.
Transcription and translation: The proviral DNA is transcribed and translated like any other host cell gene using host cell machinery (RNA polymerase, ribosomes etc.).
Assembly and budding: The various viral proteins and RNA come together to assemble the virus. At this stage some of the viral proteins are still linked to each other as part of the polyprotein synthesized by the virus. Various HIV proteins and RNA are packaged into an immature viral particle that buds off from the host cell encased in its membrane.
Maturation of viral particle: With the action of the viral protease, the various HIV proteins are cut and separated, free to perform their specific functions. This rearrangement or maturation helps HIV become a mature infectious particle ready to infect another cell.
All the steps of the viral lifecycle are presented in the HHMI BioInteractive animation, narrated by HHMI investigator Bruce Walker, MD.
Research in the last three decades has yielded a number of different strategies to block the HIV lifecycle. Today, more than 25 antiretroviral drugs are available to manage HIV infection, significantly reducing morbidity and mortality. With current treatments, HIV infection has become a chronic disease that is manageable, but only with lifelong medication.
The approaches currently used to treat HIV infections include:
1. Viral enzyme inhibitors: block the actions of some critical enzymes in the HIV lifecycle.
   Reverse transcriptase inhibitors (RTI): block initial conversion of viral RNA to proviral DNA that is integrated in the host cell genome
      By mimicking the enzyme substrate and directly binding to the active site (nucleoside RTIs)
      By binding to a site near the enzyme active site and blocking its function (non-nucleoside RTIs)
   Integrase inhibitors: block integration of proviral DNA into the host cell genome, preventing permanent infection of the host cells
   Protease inhibitors: block cleavage of viral polyprotein, preventing maturation of HIV to infectious particles
2. Entry inhibitors: block interaction of the CD4-gp120 complex with the chemokine co-receptor, preventing entry of HIV into the host cell
3. Fusion inhibitors: block the structural changes in the stalk of the HIV spike (gp41) that are needed for the viral envelope and host cell membranes to fuse

Self-renewal
Two mechanisms exist to ensure that a stem cell population is maintained:
1. Obligatory asymmetric replication: a stem cell divides into one mother cell that is identical to the original stem cell, and another daughter cell that is differentiated.
2. Stochastic differentiation: when one stem cell develops into two differentiated daughter cells, another stem cell undergoes mitosis and produces two stem cells identical to the original.

Potency specifies the differentiation potential (the potential to differentiate into different cell types) of the stem cell.[4]
Totipotent (a.k.a. omnipotent) stem cells can differentiate into embryonic and extraembryonic cell types. Such cells can construct a complete, viable organism.[4] These cells are produced from the fusion of an egg and sperm cell. Cells produced by the first few divisions of the fertilized egg are also totipotent.[5]
Pluripotent stem cells are the descendants of totipotent cells and can differentiate into nearly all cells,[4] i.e. cells derived from any of the three germ layers.[6]
Multipotent stem cells can differentiate into a number of cell types, but only those of a closely related family of cells.[4]
Oligopotent stem cells can differentiate into only a few cell types, such as lymphoid or myeloid stem cells.[4]
Unipotent cells can produce only one cell type, their own,[4] but have the property of self-renewal, which distinguishes them from non-stem cells (e.g. progenitor cells, muscle stem cells).

Restriction Endonucleases
Restriction endonucleases are proteins that can cut DNA at a specific point in a specific sequence, allowing genome editing. They are termed "restriction enzymes" because they restrict the infection of bacteriophages. Bacteria are under constant attack by bacteriophages (e.g. bacteriophage phiX174).
To protect themselves, many types of bacteria have developed a method to chop up any foreign DNA, such as that of an attacking phage. Bacteria build an endonuclease (an enzyme that cuts DNA) which is allowed to circulate in the bacterial cytoplasm, waiting for phage DNA. Each type of restriction enzyme seeks out a single DNA sequence and precisely cuts it in one place.
Example: EcoRI cuts the sequence GAATTC, cutting between the G and the A. Roving endonucleases can be dangerous, so bacteria protect their own DNA by modifying it with methyl groups. These groups are added to adenine or cytosine bases (depending on the particular type of bacteria) in the major groove. Methyl groups block the binding of restriction enzymes but don't block the normal reading and replication of the genomic information stored in the DNA. DNA from an attacking bacteriophage won't have protective methyl groups and will be destroyed.
Each particular type of bacteria has a restriction enzyme (or several different ones) that cuts a specific DNA sequence, paired with a methyl-transferase enzyme that protects this same sequence in the bacterial genome.
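The EcoRI example (recognition site GAATTC, cut between the G and the A) can be sketched in a few lines. This is a minimal illustration on a single strand written 5' to 3'; the helper names `reverse_complement`, `is_palindromic` and `digest` are hypothetical, and methylation protection is not modeled.

```python
ECORI_SITE = "GAATTC"
ECORI_CUT_OFFSET = 1  # the cut falls after the first base, between G and A

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """A site is palindromic when it reads the same on both strands."""
    return site == reverse_complement(site)


def digest(seq, site=ECORI_SITE, cut_offset=ECORI_CUT_OFFSET):
    """Cut seq at every occurrence of site and return the fragments in order."""
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments
```

Because GAATTC is palindromic, the same off-center cut happens on the opposite strand, and each fragment produced by `digest` begins with AATT, the single-stranded overhang that makes the ends sticky.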

The zinc finger protein has a tetra-coordinated zinc at the core of the
structure to stabilize its structure.
Some scientists experimented with
the idea of replacing the zinc
coordination with other interactions.
This exercise led to the design of a
peptide that could adopt the same
shape and structure as the DNA
binding zinc finger domain but had a
completely different rationale for its
stability.
Zinc finger nucleases use zinc fingers, which are sequence-specific DNA-binding domains. Each finger binds three bases. Each finger is composed of a short alpha helix and a 2-stranded beta sheet. Zinc fingers were first identified in a frog transcription factor (transcription factor IIIA). This protein structure was found to bind both 5S RNA and its cognate DNA. Over the years zinc fingers have been identified in many other proteins, and they are one of the most common protein domains that bind to specific DNA/RNA sequences.

Endonuclease FokI
The nuclease FokI occurs naturally in bacteria as a defense mechanism against invading viruses. It is an enzyme derived from Flavobacterium okeanokoites (or Planomicrobium okeanokoites).
This protein has two domains (functional parts): the cleavage domain (nuclease) and the DNA-binding domain. It is commonly used in designing genome editing nucleases. The nuclease of FokI is typically removed from its natural DNA-binding domain and attached to new binding domains, such as zinc fingers, to create a new specialized restriction enzyme.
The nuclease functions solely as a dimer, meaning it requires two copies (one for each strand of DNA) in order to successfully cleave the DNA. It recognizes a specific DNA sequence (5'-GGATG-3' and, on the complementary strand, 5'-CATCC-3') and cleaves both DNA strands outside that sequence: 14 bases after the first G of GGATG on one strand and 13 bases before the first C of CATCC on the other.
It has a cofactor: Mg2+
Zinc Finger Proteins
Each zinc finger domain has ~30
amino acids. In addition to its
hydrophobic core, it is stabilized by
a Zinc ion coordinated by side chains
of four Cysteines, four Histidines or
a combination of these. Most zinc
finger containing proteins have a
series of these domains linked to
each other. These domains bind to
the major groove of the DNA.
Specific amino acid side chains
reach out from these domains to
"read" the DNA sequence by
interacting with specific DNA bases.
CCR5 (Chemokine Receptor 5)
CCR5 is a membrane receptor protein found in human immune cells that is used by HIV to enter the host cell. It is an HIV co-receptor: it cooperates with the host cellular CD4 primary receptor to allow the initial docking of HIV onto T-cells, and subsequent infection. The CD4-bound HIV envelope spike protein uses this molecule as a co-receptor to enter and infect host cells. In some instances HIV uses another similar chemokine receptor, CXCR4, as the co-receptor for entry into host cells. A naturally occurring deletion in the CCR5 gene enables a cell to become resistant to HIV, since the virus is unable to properly bind and insert its genetic information.

CCR5 is normally 353 amino acids long and folds up into a structure composed of 7 transmembrane alpha helices with structural homology to the family of G protein-coupled receptors (GPCRs). Primarily, the CCR5 protein is involved in receiving chemical signals called chemokines and recruiting other immune cells to help the immune system function.
In some ethnic groups (Caucasians) a 32 nucleotide deletion in the gene results in a corresponding deletion in the mRNA. Full resistance from this variation is a recessive trait, meaning it requires two copies of the deleted allele for the resistant properties to be expressed.
Because the genetic code is a triplet
code, and 32 isn't a multiple of 3, the
deletion results in 1) the deletion of
11 amino acids and 2) a switch in the
translational reading frame, resulting
in a scrambled amino acid sequence
after the deletion site. 31
additional amino acids are added as a
result of the deletion before a stop
codon is met by the ribosome. This
prematurely terminated CCR5
protein is 215 amino acids long.
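The frameshift reasoning can be checked with simple arithmetic. The sketch below uses only numbers given in these notes (32 deleted bases, 31 scrambled residues, 215 residues in total); the function names are hypothetical, and counting the affected codons by rounding up assumes the deletion begins at a codon boundary.

```python
CODON_LENGTH = 3  # the genetic code is read three bases at a time


def deletion_effect(deleted_bases):
    """Split a deletion into whole codons and leftover bases."""
    whole_codons, leftover = divmod(deleted_bases, CODON_LENGTH)
    return {
        "whole_codons_removed": whole_codons,
        "leftover_bases": leftover,
        "frameshift": leftover != 0,  # any leftover shifts the reading frame
    }


def codons_touched(deleted_bases):
    """Codons hit by the deletion, assuming it begins at a codon boundary."""
    return -(-deleted_bases // CODON_LENGTH)  # ceiling division


def native_residues_kept(truncated_length, scrambled_residues):
    """Residues translated in the normal frame before the deletion site."""
    return truncated_length - scrambled_residues
```

For 32 bases this gives 10 whole codons plus 2 leftover bases, so 11 codons are affected and the frame shifts. With 31 scrambled residues in a 215-residue product, the first 184 residues are the normal CCR5 sequence.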
CCR5 normally dimerizes and is phosphorylated in the endoplasmic reticulum and is then efficiently trafficked through the Golgi to the cell membrane. In contrast, Δ32 CCR5 is not phosphorylated, and is not trafficked to the cell membrane. Δ32 CCR5 retains its ability to dimerize with wild type CCR5, leading to a transdominant negative effect on the delivery of the functional CCR5 to the cell surface.
Approximately 15-20% of the northern European population is heterozygous for a naturally occurring 32 base pair deletion in their CCR5 gene, making them less susceptible to HIV infection. Approximately 1% of European Caucasians are homozygous for this mutation and resistant to HIV infection. Based on the functional cure of the Berlin patient, it appears that introducing the CCR5 delta 32 mutation may make host cells resistant to HIV. Using an engineered nuclease, such as a zinc finger nuclease, to specifically target and disrupt the CCR5 gene in HIV patients, so that no functional CCR5 protein is made, may make the patient's endogenous T-cells resistant to further infection.

Since HIV infection is persistent, making the host cells resistant may provide a functional cure for HIV-infected individuals. Sangamo Biosciences (a biotech company specializing in the development of therapeutic zinc finger nucleases) has developed a zinc finger nuclease that is targeted to disrupt the CCR5 gene. It is currently being tested in a Phase 2 clinical trial with HIV/AIDS patients by Sangamo Biosciences in collaboration with groups from the University of Pennsylvania School of Medicine and the Albert Einstein College of Medicine.

Upcoming Approaches
Making the host cells resistant to HIV: Currently researchers are using zinc finger nucleases to target the CCR5 gene in stem cells that give rise to blood cells and introduce a deletion or disruption in the gene. As a result these cells are unable to make a functional CCR5 protein and become resistant to HIV infection. A treatment protocol using this approach is currently in a Phase II clinical trial conducted by a group from the University of Pennsylvania School of Medicine, the Albert Einstein College of Medicine and Sangamo Biosciences (a biotech company specializing in the development of therapeutic zinc finger nucleases).
Seek out and destroy all the integrated proviral DNA: A recent research report has suggested the possibility of using a gene therapy approach to specifically identify and edit out the integrated proviral HIV-1 DNA. While there is a long way to go before this can even be tested as a treatment option, it offers the hope that gene therapy can be used for dealing with tough diseases like HIV/AIDS.
Protein Structure
"Structure equals function" is the
basic tenet of Protein Modeling: i.e.,
it's important to know what a
protein's structure is like because its
function is determined by its
structure.
There are four different types of
protein structure: primary, secondary,
tertiary, and quaternary.
Primary Structure
Primary structure is the sequence of
amino acid residues in a protein
chain (they're called residues, by the
way, because they're not individual
amino acids anymore, having lost a
hydrogen off their amino groups and a hydroxyl group off their carboxylic acid groups in the process of bonding through dehydration synthesis; TL;DR the ends of the amino acids are missing because they're connected, so we call them "residues" instead of "amino acids").
There are 20 main varieties of amino acid, which differ only in their sidechain (sometimes called an "R group").
Different residue sidechains have
different properties; for example, the
red sidechains in the diagram are
negatively charged, and the blue
ones are positively charged. These
properties determine how the protein
folds (i.e.,
the secondary and tertiary structure),
because certain types of residues
attract, repel or bond to other types
of residues. Also, the types of
residues present can determine how
the protein interacts with other
molecules such as DNA- for
example, serine can form hydrogen
bonds, and therefore is often found
at binding sites in a protein.
Charged sidechains repel like
charges and attract opposite charges.
Hydrophilic, or polar, sidechains
usually end up on the outside of a
folded structure, because most
proteins fold in a watery
environment and the polar sidechains
interact well with water, which is
also polar. For the same reason,
hydrophobic, or non-polar,
sidechains usually end up on the
inside of the structure, because they
do not interact well with
water. Cysteine, which is shown in
green in the diagram, forms very
strong covalent disulfide bonds with
other cysteines.
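The sidechain categories described above can be written down as a small lookup. This is a sketch only: the grouping of the 20 one-letter codes follows one common textbook convention and is an assumption, not the coloring of the original diagram, and borderline residues such as glycine, proline, histidine and tyrosine are placed differently by different sources. The names `GROUPS`, `classify`, `summarize` and `could_form_disulfide` are hypothetical.

```python
from collections import Counter

# Assumed grouping of the 20 standard residues by sidechain property.
GROUPS = {
    "negative": set("DE"),           # aspartate, glutamate
    "positive": set("KRH"),          # lysine, arginine, histidine
    "polar": set("STNQY"),           # e.g. serine and glutamine from the notes
    "hydrophobic": set("AVLIMFWPG"), # e.g. tryptophan from the notes
    "disulfide-forming": set("C"),   # cysteine
}


def classify(residue):
    """Return the property group for a one-letter residue code."""
    code = residue.upper()
    for name, members in GROUPS.items():
        if code in members:
            return name
    raise ValueError(f"unknown residue code: {residue!r}")


def summarize(sequence):
    """Count how many residues of each property group a sequence contains."""
    return Counter(classify(residue) for residue in sequence)


def could_form_disulfide(sequence):
    """A disulfide bond needs at least two cysteines in the chain."""
    return sequence.upper().count("C") >= 2
```

A tally like this gives a rough first guess at which residues are likely to be buried (hydrophobic) or exposed (polar and charged) once the chain folds in water.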
Each residue in a chain is given a
number, starting at the amino
terminus (that is, the end that has an
amino group still present) with the
lowest number (which is not always
1, depending on the numbering
conventions for the particular family
of proteins) and going up to the
carboxy terminus (the end that has a
carboxyl group still present).
Secondary Structure
Secondary structure is the first level of folding in a protein. Patterns called "motifs", such as alpha helices and beta sheets (by far the two most common), are caused by hydrogen bonding between backbone atoms of the residues (the N-H group of one residue and the C=O group of another), not between the sidechains.
Alpha helices are slightly more
common in proteins overall than beta
sheets. These helices are tightly
coiled single strands, kept in place
by hydrogen bonds between nearby
residues. They can be anywhere from
only a few residues in length to over
100 Angstroms in some proteins.
They tend to be the base of protein
"stalks" (such as that of 2009-10's
influenza hemagglutinin).
Beta sheets, on the other hand, are made up of many beta strands (kinked sequences of residues) separated by loops. These strands line up parallel to each other (actually, often antiparallel, which means that adjacent strands point in opposite directions; direction matters, remember, because of the numbering of residues from the amino terminus to the carboxy terminus), with multiple hydrogen bonds between adjacent strands. They are very strong as protective or support layers (such as the "beta-barrel" exterior of GFP).
Tertiary Structure
Tertiary structure is the position in
three dimensions of the secondary
structures (motifs). It is determined
by the secondary structures present,
as well as the properties of the
sidechains. Hydrophilic sidechains
such as glutamine will move to the
"outside" when the protein is folded
in a watery environment, while
hydrophobic sidechains such as
tryptophan will cluster "inside" the
protein, protected by other sections
of the protein, to prevent their
exposure to water. Oppositely
charged sidechains come together,
forming salt bridges (ionic bonds),
while sidechains with the same
charge repel each other. Cysteine,
which contains sulfur, bonds
covalently with other cysteines to
form strong disulfide bonds. The
interaction of all these attractions
and repulsions causes the protein to
develop a unique shape in 3D, called
a "conformation".
The protein's tertiary structure also
depends on the environment in
which it is folded: in the human
body, which is a watery
environment, the hydrophobic
(nonpolar) sidechains end up on the
inside, as stated above. However, in
a protein folded in a hydrophobic
environment (such as a protein
embedded in a phospholipid cell
membrane), the hydrophilic (polar)
sidechains end up on the inside.
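The sidechain properties described above can be tallied for a sequence with a short sketch. The grouping used here is one simplified textbook classification and is an assumption (other sources place G, P, Y, H and C differently); the example sequence is hypothetical, and the counts say nothing by themselves about where each residue ends up in the folded protein.

```python
from collections import Counter

# One simplified grouping of the 20 standard amino acids (an assumption;
# classification schemes differ between textbooks).
GROUPS = {
    "hydrophobic": "AVLIMFWP",
    "polar": "STNQYG",
    "positive": "KRH",
    "negative": "DE",
    "cysteine": "C",  # kept separate because it can form disulfide bonds
}

# Map each one-letter residue code to its group label.
SIDECHAIN_CLASS = {aa: label for label, letters in GROUPS.items() for aa in letters}


def classify_sidechains(sequence):
    """Count residues of each sidechain class in a one-letter sequence."""
    counts = Counter()
    for aa in sequence.upper():
        if aa not in SIDECHAIN_CLASS:
            raise ValueError(f"unknown residue code: {aa!r}")
        counts[SIDECHAIN_CLASS[aa]] += 1
    return counts


example = "MKWVTFISLLQCDE"  # hypothetical sequence, for illustration only
counts = classify_sidechains(example)
print(dict(counts))
```

A sequence rich in the "hydrophobic" class would be expected to bury more of its sidechains in a watery environment, while oppositely charged "positive" and "negative" residues are candidates for salt bridges.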
Quaternary Structure
Quaternary structure is the
arrangement of each of the
individual pieces (monomers) of a
multi-unit (multimeric) protein.
These subunits, or "chains" as they
are often called, each have their own
amino and carboxy terminus, and are
not joined into one continuous chain.
However, they are held together by
bonds, which can be disulfide or
ionic (more commonly the latter),
and arranged together in a
specific conformation. Multimers are
quite common, and may contain
several distinct chains or simply
several copies of the same one (or
of a few).

Proteins are the chief actors within
the cell, said to be carrying out the
duties specified by the information
encoded in genes.[5] With the
exception of certain types of RNA,
most other biological molecules are
relatively inert elements upon which
proteins act. Proteins make up half
the dry weight of an Escherichia
coli cell, whereas other
macromolecules such as DNA and
RNA make up only 3% and 20%,
respectively.[23] The set of proteins
expressed in a particular cell or cell
type is known as its proteome.

The enzyme hexokinase is shown as
a conventional ball-and-stick
molecular model. To scale in the top
right-hand corner are two of its
substrates, ATP and glucose.
The chief characteristic of proteins
that also allows their diverse set of
functions is their ability to bind other
molecules specifically and tightly.
The region of the protein responsible
for binding another molecule is
known as the binding site and is often
a depression or "pocket" on the
molecular surface. This binding
ability is mediated by the tertiary
structure of the protein, which
defines the binding site pocket, and
by the chemical properties of the
surrounding amino acids' side chains.
Protein binding can be
extraordinarily tight and specific; for
example, the ribonuclease
inhibitor protein binds to
human angiogenin with a
sub-femtomolar dissociation
constant (<10^-15 M) but does not
bind at all to its amphibian
homolog onconase (>1 M).
Extremely minor chemical changes
such as the addition of a single
methyl group to a binding partner
can sometimes suffice to nearly
eliminate binding; for example,
the aminoacyl tRNA
synthetase specific to the amino
acid valine discriminates against the
very similar side chain of the amino
acid isoleucine.[24]
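To get a feel for what these dissociation constants mean, the sketch below computes the fraction of protein occupied at equilibrium for simple 1:1 binding, fraction = [L] / ([L] + Kd). It assumes the ligand is in large excess, so that free ligand is approximately total ligand; the 1 nM ligand concentration is a hypothetical value chosen only for illustration.

```python
def fraction_bound(ligand_conc_molar, kd_molar):
    """Equilibrium occupancy for simple 1:1 binding: [L] / ([L] + Kd).

    Assumes free ligand concentration is close to total ligand
    concentration (ligand in large excess over protein).
    """
    if ligand_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return ligand_conc_molar / (ligand_conc_molar + kd_molar)


LIGAND = 1e-9  # hypothetical 1 nM ligand concentration

tight = fraction_bound(LIGAND, 1e-15)  # Kd at the sub-femtomolar bound
weak = fraction_bound(LIGAND, 1.0)     # Kd at the 1 M bound

print(f"Kd = 1e-15 M: fraction bound = {tight:.6f}")
print(f"Kd = 1 M:     fraction bound = {weak:.1e}")
```

At the same ligand concentration, the tight binder is essentially fully occupied while the weak one is essentially empty, which is the contrast the angiogenin and onconase figures describe.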
Proteins can bind to other proteins as
well as to small-molecule substrates.
When proteins bind specifically to
other copies of the same molecule,
they can oligomerize to form fibrils;
this process occurs often in structural
proteins that consist of globular
monomers that self-associate to form
rigid fibers. Protein-protein
interactions also regulate enzymatic
activity, control progression through
the cell cycle, and allow the assembly
of large protein complexes that carry
out many closely related reactions
with a common biological function.
Proteins can also bind to, or even be
integrated into, cell membranes. The
ability of binding partners to induce
conformational changes in proteins
allows the construction of
enormously
complex signaling networks.[25]
Importantly, because interactions between
proteins are reversible and depend
heavily on the availability of
different groups of partner proteins
to form aggregates that are capable
of carrying out discrete sets of functions,
the study of the interactions between
specific proteins is key to
understanding important aspects of
cellular function, and ultimately the
properties that distinguish particular
cell types.[26][27]
Enzymes
The best-known role of proteins in
the cell is as enzymes,
which catalyze chemical reactions.
Enzymes are usually highly specific
and accelerate only one or a few
chemical reactions. Enzymes carry
out most of the reactions involved
in metabolism, as well as
manipulating DNA in processes such
as DNA replication, DNA repair,
and transcription. Some enzymes act
on other proteins to add or remove
chemical groups in a process known
as posttranslational modification.
About 4,000 reactions are known to
be catalyzed by enzymes.[28] The rate
acceleration conferred by enzymatic
catalysis is often enormous: as
much as a 10^17-fold increase in rate
over the uncatalyzed reaction in the
case of orotate decarboxylase (78
million years without the enzyme, 18
milliseconds with the enzyme).[29]
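The orotate decarboxylase figures can be checked with a few lines of arithmetic. This sketch simply divides the two times quoted above; the year length of 365.25 days is an assumption made for the unit conversion.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed average year length

uncatalyzed_seconds = 78e6 * SECONDS_PER_YEAR  # 78 million years
catalyzed_seconds = 18e-3                      # 18 milliseconds

# Ratio of the two times, i.e. the fold increase in rate.
rate_enhancement = uncatalyzed_seconds / catalyzed_seconds
print(f"rate enhancement is roughly {rate_enhancement:.1e}-fold")
```

The ratio comes out on the order of 10^17, consistent with the figure quoted in the text.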
The molecules bound and acted upon
by enzymes are called substrates.
Although enzymes can consist of
hundreds of amino acids, it is usually
only a small fraction of the residues
that come in contact with the
substrate, and an even smaller
fraction (three to four residues on
average) that are directly involved
in catalysis.[30] The region of the
enzyme that binds the substrate and
contains the catalytic residues is
known as the active site.
Dirigent proteins are members of a
class of proteins which dictate the
stereochemistry of a compound
synthesized by other enzymes.
Cell signaling and ligand binding

Ribbon diagram of a mouse antibody
against cholera that binds
a carbohydrate antigen
Many proteins are involved in the
process of cell signaling and signal
transduction. Some proteins, such
as insulin, are extracellular proteins
that transmit a signal from the cell in
which they were synthesized to other
cells in distant tissues. Others
are membrane proteins that act
as receptors whose main function is
to bind a signaling molecule and
induce a biochemical response in the
cell. Many receptors have a binding
site exposed on the cell surface and
an effector domain within the cell,
which may have enzymatic activity
or may undergo a conformational
change detected by other proteins
within the cell.[31]
Antibodies are protein components
of an adaptive immune system whose
main function is to bind antigens, or
foreign substances in the body, and
target them for destruction.
Antibodies can be secreted into the
extracellular environment or
anchored in the membranes of
specialized B cells known as plasma
cells. Whereas enzymes are limited
in their binding affinity for their
substrates by the necessity of
conducting their reaction, antibodies
have no such constraints. An
antibody's binding affinity to its
target is extraordinarily high.[32]
Many ligand transport proteins bind
particular small biomolecules and
transport them to other locations in
the body of a multicellular organism.
These proteins must have a high
binding affinity when their ligand is
present in high concentrations, but
must also release the ligand when it
is present at low concentrations in
the target tissues. The canonical
example of a ligand-binding protein
is haemoglobin, which
transports oxygen from the lungs to
other organs and tissues in
all vertebrates and has
close homologs in every
biological kingdom.[33] Lectins are
sugar-binding proteins which are
highly specific for their sugar
moieties. Lectins typically play a
role in
biological recognition phenomena
involving cells and proteins.[34]
Receptors and hormones are
highly specific binding proteins.
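The requirement that a transport protein bind tightly where its ligand is abundant and release it where the ligand is scarce can be illustrated for haemoglobin with the Hill equation, Y = pO2^n / (P50^n + pO2^n). This is a rough sketch, not a physiological model: the Hill coefficient (about 2.8), the half-saturation pressure (about 26 mmHg) and the oxygen pressures used for lungs and resting tissue are approximate textbook values assumed here, not figures from this document.

```python
def hill_saturation(p_o2_mmhg, p50_mmhg=26.0, hill_n=2.8):
    """Fractional oxygen saturation from the Hill equation.

    p50_mmhg and hill_n are approximate textbook values for adult
    human haemoglobin (assumptions for illustration).
    """
    if p_o2_mmhg < 0:
        raise ValueError("partial pressure must be non-negative")
    ratio = (p_o2_mmhg / p50_mmhg) ** hill_n
    return ratio / (1.0 + ratio)


LUNG_PO2 = 100.0   # assumed oxygen partial pressure in the lungs, mmHg
TISSUE_PO2 = 40.0  # assumed oxygen partial pressure in resting tissue, mmHg

loaded = hill_saturation(LUNG_PO2)
unloaded = hill_saturation(TISSUE_PO2)
print(f"saturation in lungs:  {loaded:.2f}")
print(f"saturation in tissue: {unloaded:.2f}")
print(f"fraction released:    {loaded - unloaded:.2f}")
```

Under these assumptions the protein is almost fully loaded in the lungs and gives up a sizeable fraction of its oxygen in the tissues, which is the behaviour the paragraph describes.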
Transmembrane proteins can also
serve as ligand transport proteins that
alter the permeability of the cell
membrane to small molecules and
ions. The membrane alone has
a hydrophobic core through
which polar or charged molecules
cannot diffuse. Membrane proteins
contain internal channels that allow
such molecules to enter and exit the
cell. Many ion channel proteins are
specialized to select for only a
particular ion; for
example, potassium and sodium
channels often discriminate for only one
of the two ions.[35]
Structural proteins
Structural proteins confer stiffness
and rigidity to otherwise-fluid
biological components. Most
structural proteins are fibrous
proteins; for
example, collagen and elastin are
critical components of connective
tissue such as cartilage,
and keratin is found in hard or
filamentous structures such
as hair, nails, feathers, hooves, and
some animal shells.[36] Some globular
proteins can also play structural
roles; for
example, actin and tubulin are
globular and soluble as monomers,
but polymerize to form long, stiff
fibers that make up the cytoskeleton,
which allows the cell to maintain its
shape and size.
Other proteins that serve structural
functions are motor proteins such
as myosin, kinesin, and dynein,
which are capable of generating
mechanical forces. These proteins
are crucial for cellular motility of
single-celled organisms and
the sperm of many multicellular
organisms which reproduce sexually.
They also generate the forces exerted
by contracting muscles[37] and play
essential roles in intracellular
transport.
Methods of study
The activities and structures of
proteins may be examined in
vitro, in vivo, and in silico. In
vitro studies of purified proteins in
controlled environments are useful
for learning how a protein carries out
its function: for example, enzyme
kinetics studies explore the chemical
mechanism of an enzyme's catalytic
activity and its relative affinity for
various possible substrate molecules.
By contrast, in vivo experiments can
provide information about the
physiological role of a protein in the
context of a cell or even a
whole organism. In silico studies use
computational methods to study
proteins.
Protein purification
To perform in vitro analysis, a
protein must be purified away from
other cellular components. This
process usually begins with cell
lysis, in which a cell's membrane is
disrupted and its internal contents
released into a solution known as
a crude lysate. The resulting mixture
can be purified
using ultracentrifugation, which
fractionates the various cellular
components into fractions containing
soluble proteins;
membrane lipids and proteins;
cellular organelles; and nucleic
acids. Precipitation by a method
known as salting out can concentrate
the proteins from this lysate. Various
types of chromatography are then
used to isolate the protein or proteins
of interest based on properties such
as molecular weight, net charge and
binding affinity.[38] The level of
purification can be monitored using
various types of gel
electrophoresis if the desired
protein's molecular weight
and isoelectric point are known,
by spectroscopy if the protein has
distinguishable spectroscopic
features, or by enzyme assays if the
protein has enzymatic activity.
Additionally, proteins can be isolated
according to their charge
using electrofocusing.[39]
For natural proteins, a series of
purification steps may be necessary
to obtain protein sufficiently pure for
laboratory applications. To simplify
this process, genetic engineering is
often used to add chemical features
to proteins that make them easier to
purify without affecting their
structure or activity. Here, a "tag"
consisting of a specific amino acid
sequence, often a series
of histidine residues (a "His-tag"), is
attached to one terminus of the
protein. As a result, when the lysate
is passed over a chromatography
column containing nickel, the
histidine residues ligate the nickel
and attach to the column while the
untagged components of the lysate
pass unimpeded. A number of
different tags have been developed to
help researchers purify specific
proteins from complex mixtures.[40]
Cellular localization

Proteins in different cellular
compartments and structures tagged
with green fluorescent protein (here,
white)
The study of proteins in vivo is often
concerned with the synthesis and
localization of the protein within the
cell. Although many intracellular
proteins are synthesized in
the cytoplasm and membrane-bound
or secreted proteins in
the endoplasmic reticulum, the
specifics of how proteins
are targeted to specific organelles or
cellular structures are often unclear. A
useful technique for assessing
cellular localization uses genetic
engineering to express in a cell
a fusion protein or chimera consisting
of the natural protein of interest
linked to a "reporter" such as green
fluorescent protein (GFP).[41] The
fused protein's position within the
cell can be cleanly and efficiently
visualized using microscopy,[42] as
shown in the figure opposite.
Other methods for elucidating the
cellular location of proteins require
the use of known compartmental
markers for regions such as the ER,
the Golgi, lysosomes or vacuoles,
mitochondria, chloroplasts, plasma
membrane, etc. With the use of
fluorescently tagged versions of
these markers or of antibodies to
known markers, it becomes much
simpler to identify the localization of
a protein of interest. For
example, indirect
immunofluorescence will allow for
fluorescence colocalization and
demonstration of location.
Fluorescent dyes are used to label
cellular compartments for a similar
purpose.[43]
Other possibilities exist as well. For
example, immunohistochemistry
usually utilizes an antibody to one or
more proteins of interest that is
conjugated to enzymes yielding
either luminescent or chromogenic
signals that can be compared
between samples, allowing for
localization information. Another
applicable technique is
cofractionation in sucrose (or other
material) gradients using isopycnic
centrifugation.[44] While this
technique does not prove
colocalization of a compartment of
known density and the protein of
interest, it does increase the
likelihood, and is more amenable to
large-scale studies.
Finally, the gold-standard method of
cellular localization
is immunoelectron microscopy. This
technique also uses an antibody to
the protein of interest, along with
classical electron microscopy
techniques. The sample is prepared
for normal electron microscopic
examination, and then treated with
an antibody to the protein of interest
that is conjugated to an extremely
electron-dense material, usually gold.
This allows for the localization of
both ultrastructural details and
the protein of interest.[45]
Through another genetic engineering
application known as site-directed
mutagenesis, researchers can alter
the protein sequence and hence its
structure, cellular localization, and
susceptibility to regulation. This
technique even allows the
incorporation of unnatural amino
acids into proteins, using modified
tRNAs,[46] and may allow the rational
design of new proteins with novel
properties.[47]
Proteomics
The total complement of proteins
present at a time in a cell or cell type
is known as its proteome, and the
study of such large-scale data sets
defines the field of proteomics,
named by analogy to the related field
of genomics. Key experimental
techniques in proteomics include 2D
electrophoresis,[48] which allows the
separation of a large number of
proteins; mass spectrometry,[49]
which allows rapid high-throughput
identification of proteins
and sequencing of peptides (most
often after in-gel digestion); protein
microarrays,[50] which allow the
detection of the relative levels of a
large number of proteins present in a
cell; and two-hybrid screening,
which allows the systematic
exploration of protein-protein
interactions.[51] The total complement
of biologically possible such
interactions is known as
the interactome.[52] A systematic
attempt to determine the structures of
proteins representing every possible
fold is known as structural genomics.[53]

Bioinformatics
A vast array of computational
methods have been developed to
analyze the structure, function, and
evolution of proteins.
The development of such tools has
been driven by the large amount of
genomic and proteomic data
available for a variety of organisms,
including the human genome. It is
simply impossible to study all
proteins experimentally, hence only a
few are subjected to laboratory
experiments while computational
tools are used to extrapolate to
similar proteins.
Such homologous proteins can be
efficiently identified in distantly
related organisms by sequence
alignment. Genome and gene
sequences can be searched by a
variety of tools for certain
properties. Sequence profiling
tools can find restriction
enzyme sites, open reading
frames in nucleotide sequences, and
predict secondary
structures. Phylogenetic trees can be
constructed
and evolutionary hypotheses
developed using special software
like ClustalW regarding the ancestry
of modern organisms and the genes
they express. The field
of bioinformatics is now
indispensable for the analysis of
genes and proteins.
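Two of the sequence-profiling tasks mentioned above, finding restriction enzyme sites and open reading frames, can be sketched in a few lines. This is a minimal illustration rather than a substitute for real tools: it scans only the forward strand, uses the standard ATG start and TAA/TAG/TGA stop codons, and searches for the EcoRI recognition sequence GAATTC. The function names and the example DNA string are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def find_sites(dna, site):
    """Return 0-based start positions of every occurrence of `site`."""
    dna, site = dna.upper(), site.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)  # allow overlapping matches
    return positions


def find_orfs(dna, min_codons=3):
    """Forward-strand ORFs as (start, end) pairs, 0-based, end exclusive.

    Each ORF runs from an ATG to the first in-frame stop codon
    (stop included); min_codons counts the start and stop codons.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


dna = "CCGAATTCATGAAATTTGGGTAAGAATTCGG"  # hypothetical example sequence
print("EcoRI sites at:", find_sites(dna, "GAATTC"))
print("ORFs:", find_orfs(dna))
print("GAATTC is palindromic:", reverse_complement("GAATTC") == "GAATTC")
```

The last line checks the sense in which restriction sites are "palindromic": the site reads the same as its own reverse complement.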
Structure prediction and simulation
Constituent amino-acids can be
analyzed to predict secondary,
tertiary and quaternary protein
structure, in this case hemoglobin
containing heme units.
Complementary to the field of
structural genomics, protein structure
prediction seeks to develop efficient
ways to provide plausible models for
proteins whose structures have not
yet been determined experimentally.[54]
The most successful type of
structure prediction, known
as homology modeling, relies on the
existence of a "template" structure
with sequence similarity to the
protein being modeled; structural
genomics' goal is to provide
sufficient representation in solved
structures to model most of those
that remain.[55] Although producing
accurate models remains a challenge
when only distantly related template
structures are available, it has been
suggested that sequence alignment is
the bottleneck in this process, as
quite accurate models can be
produced if a "perfect" sequence
alignment is known.[56] Many
structure prediction methods have
served to inform the emerging field
of protein engineering, in which
novel protein folds have already
been designed.[57] A more complex
computational problem is the
prediction of intermolecular
interactions, such as in molecular
docking and protein-protein
interaction prediction.[58]
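Sequence alignment, named above as the bottleneck in homology modeling, can be illustrated with a minimal global alignment by dynamic programming (the Needleman-Wunsch approach). This is a teaching sketch only: the match, mismatch and gap scores are arbitrary assumptions, whereas real protein alignments use substitution matrices and more elaborate gap penalties, and the two example sequences are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of strings a and b.

    Returns (score, aligned_a, aligned_b). The scoring values are
    illustrative assumptions, not a biological substitution matrix.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair_score(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell is the best of diagonal, up, or left.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


result = needleman_wunsch("ACDEFG", "ACEFG")  # hypothetical sequences
print(result[0])
print(result[1])
print(result[2])
```

The gap character in the second aligned string marks the residue that has no counterpart, which is the kind of decision a template-based model depends on.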
The processes of protein folding and
binding can be simulated using such
techniques as molecular mechanics, in
particular, molecular
dynamics and Monte Carlo, which
increasingly take advantage of
parallel and distributed
computing (Folding@home project;[59]
molecular modeling on GPU).
The folding of small alpha-helical
protein domains such as
the villin headpiece[60] and
the HIV accessory protein[61] has
been successfully simulated in silico,
and hybrid methods that combine
standard molecular dynamics
with quantum
mechanics calculations have allowed
exploration of the electronic states
of rhodopsins.[62]
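The Monte Carlo idea mentioned above can be shown on a toy problem. The sketch below applies the Metropolis acceptance rule to a single coordinate moving in a one-dimensional double-well energy function. Everything about the setup is an assumption made for illustration: the energy function, temperature, step size and starting point are hypothetical, and a real protein simulation would use a molecular mechanics force field over many thousands of coordinates.

```python
import math
import random


def energy(x):
    """Toy double-well energy with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2


def metropolis_accept(delta_e, temperature, rng):
    """Metropolis rule: always accept downhill moves, and accept
    uphill moves with probability exp(-delta_e / temperature)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def run_monte_carlo(steps=5000, temperature=0.2, step_size=0.5, seed=0):
    """Return the trajectory of x over `steps` Metropolis trials."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    x = 3.0                    # start far from either minimum
    samples = []
    for _ in range(steps):
        trial = x + rng.uniform(-step_size, step_size)
        if metropolis_accept(energy(trial) - energy(x), temperature, rng):
            x = trial
        samples.append(x)
    return samples


samples = run_monte_carlo()
print(f"starting energy: {energy(3.0):.1f}")
print(f"final energy:    {energy(samples[-1]):.3f}")
```

The trajectory quickly relaxes from the high-energy start into one of the wells and then fluctuates around it, a small-scale analogue of a chain settling into a low-energy conformation.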
