
VARIATION IN HUMAN GENOME

Variations

The snails’ shells vary in their colour, the presence or absence of bands and the number of bands.
There are variations in skin colour, hair colour, hair curliness, eye colour and gender.

Variations may be inherited or acquired


Genomic variation
• Variation in the alleles of genes occurs both within and among populations.

• Individuals of a species have similar characteristics but are rarely identical; the difference between them is called variation.

• Mutations are the ultimate source of genetic variation because they alter the order of bases in DNA (a change in the chemical structure of a gene).

• The genome sequence for any species is a reference sequence; the consequences of variation in a species’ genome are often overlooked.

• How does genome variation affect:
- species within a complex ecosystem?
- an individual’s life span?
Human Genetic Variation

□ Genetics is the scientific study of inherited variation

□ Human genetics is the scientific study of inherited human variation

□ We study this variation in order to better understand ourselves as a species and use this knowledge to improve our health.
Major Types of Genetic Variations

□ Mutations – changes at the level of DNA; one or more base pairs have undergone a change; the change could be random or due to a factor in the environment
□ Major deletions, insertions, and genetic rearrangements can affect several genes or large areas of a chromosome at once
□ Polymorphisms – differences in individual DNA which are not mutations
■ Single-nucleotide polymorphisms (SNPs) are the most common, occurring about once every 1,000 bases or so
■ Copy number variations – some DNA repeats itself (e.g. AAGAAGAAGAAG) and there can be variation in the number of repeats
Types of genetic variations
CCTAGTTGACTGATCGCGGGATTCACACACATGG

CCTGGTTGAC..ATCGCGGGATTCACACACACACATGG

Single (point) base changes: single nucleotide polymorphisms (SNPs)
• …
• > 1,000,000

InDels (insertions/deletions)
• two alleles
• > 1,000,000

SSR – short sequence repeats (VNTR – variable number tandem repeats)
• many alleles
• microsatellites (1-5)
• minisatellites (6-100)
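
The two sequences at the top of this slide can be compared column by column once they are aligned. The sketch below is a minimal illustration, not a real variant caller: the gapped alignment (where the `-` characters are placed) and the function name `classify_differences` are assumptions made for this example.

```python
GAPS = "-."

def classify_differences(ref: str, alt: str):
    """Compare two gapped, equal-length sequences and list the differences.

    Returns tuples of (kind, start_column, length), where kind is
    "SNP", "deletion" (bases missing from alt) or "insertion"
    (bases missing from ref).
    """
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned to the same length")
    events = []
    i = 0
    while i < len(ref):
        if ref[i] in GAPS or alt[i] in GAPS:
            gap_in_alt = alt[i] in GAPS
            start = i
            # Extend the run while the gap stays on the same sequence.
            while i < len(ref) and (alt[i] if gap_in_alt else ref[i]) in GAPS:
                i += 1
            events.append(("deletion" if gap_in_alt else "insertion", start, i - start))
            continue
        if ref[i] != alt[i]:
            events.append(("SNP", i, 1))
        i += 1
    return events

# Slide sequences with gaps added by hand (assumed alignment).
REF = "CCTAGTTGACTGATCGCGGGATTCACACAC----ATGG"
ALT = "CCTGGTTGAC--ATCGCGGGATTCACACACACACATGG"

for event in classify_differences(REF, ALT):
    print(event)
```

Under this alignment the three differences correspond to the three classes on the slide: one single-base change, one two-base InDel, and a change in the number of CA repeats (reported here as a four-base insertion).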
Human SNPs classification:

• Inversions
• Duplications
• Translocations
• Transposon insertions

• Variations exceeding 1000 bp are STRUCTURAL VARIATIONS
• Less than 3 million bp: submicroscopic; larger: microscopic
• InDels and duplications are called CNVs (copy number variations)
Variations may be inherited and acquired
1. Inherited causes of variation. Variation in a characteristic that results from genetic inheritance from the parents is called inherited variation. Children usually look a little like their father and a little like their mother, but they are not identical to either parent. Inherited characteristics are genetically controlled.
For example: hair colour, skin colour, blood group and sex cannot be changed naturally.

2. Acquired characteristics result from an individual’s activities or nutrition, or from environmental conditions during a lifetime. Acquired characteristics cannot be inherited.
Examples of acquired conditions in humans are obesity, athletic skills, body building and sun tan.
Continuous variation
There are two forms of variation: continuous and discontinuous.
Continuous variation is variation in which a characteristic can take any value within a range across a population. A line graph is used to represent continuous variation.
Example: shades of hair colour between black and blond, or height and weight. The genotype AA BB CC DD might give black hair while the genotype aa bb cc dd might give blond hair.
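
To see why several genes acting together give a graded range of shades, the sketch below enumerates the offspring of two AaBbCcDd parents and counts how many dominant alleles each child carries. It assumes, purely for illustration, that the four loci assort independently and that each dominant allele adds equally to the shade; the function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def dominant_allele_distribution(n_loci: int = 4):
    """Distribution of dominant-allele counts among offspring of two
    parents that are heterozygous at every one of n_loci loci."""
    # A gamete carries one allele per locus: 1 = dominant, 0 = recessive.
    gametes = list(product((0, 1), repeat=n_loci))
    counts = Counter(sum(g1) + sum(g2) for g1 in gametes for g2 in gametes)
    total = len(gametes) ** 2
    return {k: Fraction(counts[k], total) for k in sorted(counts)}

dist = dominant_allele_distribution()
for n_dominant, fraction in dist.items():
    print(n_dominant, fraction)
```

With four loci there are nine classes (0 to 8 dominant alleles), with the extremes (aa bb cc dd and AA BB CC DD) rarest and the intermediate shades most common, which approaches a continuous distribution as more loci are added.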

Discontinuous variation
In discontinuous variation, individuals fall into a number of distinct classes; it is based on features that cannot be measured across a complete range. These features cannot be altered by external conditions: you either have the characteristic or you don't.
Example: blood groups. You are either one blood group or another; you can't be in between.
Blood group is either A, B, AB or O.
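
The four distinct classes can be sketched as a lookup from genotype to phenotype. This assumes the standard ABO inheritance model (A and B codominant, O recessive), which the slide does not spell out; the function name is hypothetical.

```python
def abo_phenotype(allele1: str, allele2: str) -> str:
    """Map a pair of ABO alleles to one of the four blood groups."""
    alleles = {allele1, allele2}
    if not alleles <= {"A", "B", "O"}:
        raise ValueError("alleles must be 'A', 'B' or 'O'")
    if alleles == {"A", "B"}:
        return "AB"          # A and B are codominant
    if "A" in alleles:
        return "A"           # AA or AO
    if "B" in alleles:
        return "B"           # BB or BO
    return "O"               # OO only

for genotype in [("A", "O"), ("B", "B"), ("A", "B"), ("O", "O")]:
    print(genotype, "->", abo_phenotype(*genotype))
```

Whatever the genotype, the result is always exactly one of four classes, with nothing in between.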

Genetic defects such as colour blindness, albinism, achondroplastic dwarfism and sickle cell anaemia are all genetically controlled and expressed in a discontinuous way.
GENETIC VARIATIONS: alternative forms of a genomic DNA sequence (alleles) that are present in individuals or populations

Polymorphisms (common variation): minor allele frequency > 1%
Majority: neutral
The rest:
• slightly “bad” (predispose to disease)
• slightly “good” (protect from disease)
• both slightly bad and good (predispose to and protect from certain conditions)

Mutations (rare variants): minor allele frequency < 1%
“Bad”: cause genetic disease
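
The 1% cut-off can be sketched as a small function. The sample data and function names are hypothetical, and how to treat a frequency of exactly 1% is an assumption (the slide only gives > 1% and < 1%).

```python
from collections import Counter

def minor_allele_frequency(alleles) -> float:
    """Frequency of the second most common allele in a sample."""
    counts = sorted(Counter(alleles).values(), reverse=True)
    if len(counts) < 2:
        return 0.0  # only one allele observed
    return counts[1] / sum(counts)

def classify_variant(maf: float, threshold: float = 0.01) -> str:
    """Apply the slide's 1% rule; exactly 1% is treated as rare here."""
    return "polymorphism" if maf > threshold else "rare variant"

common = ["A"] * 95 + ["G"] * 5     # hypothetical sample of 100 chromosomes
rare = ["A"] * 999 + ["G"] * 1      # hypothetical sample of 1000 chromosomes

print(classify_variant(minor_allele_frequency(common)))
print(classify_variant(minor_allele_frequency(rare)))
```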
http://genome-lab.ucdavis.edu/

http://projects.tcag.ca/variation/

A Sad Story . . .
□ Ekaterina and Sergei, young married Russian skaters, had won two Olympic medals in the pairs competition and were expected to continue their success.

□ But in November of 1995, 28-year-old Sergei suddenly collapsed and died during his practice session.

□ He was a non-smoker and physically fit, and there had been no warning signs.
What happened?
Could anything have been done?
□ It seems as though PL(A2) mutations interact negatively with cholesterol in the blood.

□ Someone with a PL(A2) mutation, like Sergei Grinkov, may be able to reduce their risk of a heart attack by maintaining a low-cholesterol diet and exercising regularly.

□ While genetic mutations such as these are very rare, we continue to learn more about how genetic factors interact with the environment in causing conditions like heart disease. This knowledge could help to better define individuals’ health risks.
THE Human genome?
• No two humans have the exact same DNA sequence

• The reference genome describes a composite genome; it does not include all the polymorphisms found among individuals

Allele Vs Polymorphism Vs Mutation

Allele: one of two or more forms of a gene or a genetic locus, i.e. an alternative form of a genetic locus; a single allele for each locus is inherited separately from each parent.

Polymorphism: a difference in DNA sequence among individuals. A variant present in > 1% of the population is considered a polymorphism and is useful for genetic linkage analysis.

Mutation: any heritable change in DNA sequence.

How much variation is there in the human genome?

Tandem Repeats and Morphological Variation

• Canine RUNX2 (Runt-related transcription factor 2) affects the length and degree of curve in a dog’s snout

• The Alx-4 gene determines how many toes there should be on a dog’s hind leg

Single-Nucleotide Polymorphism (SNP)

• Single bases at a particular locus that differ between individuals.

• SNPs are the most abundant type of genetic variation in the human genome, accounting for > 90% of all differences between individuals.

• SNPs occur very frequently, once every 100-1000 bp in humans, anywhere in the genome.
Human SNPs are classified as Major / Minor alleles
• The minor allele of a SNP is the less common allele at that location in a population.

• Within a particular ethnic group, 20% of the group's allelic variation is due to SNP minor alleles (the least common allele that occurs in a given population).
• Generalization to the entire species based on genome-wide definitions of SNP minor alleles should be avoided.

Allele: each of two or more alternative forms of a gene that arise by mutation and are found at the same place on a chromosome.

International SNP Map Working Group, Nature 409, 928-933 (2001)
Types of SNPs
• Single-nucleotide polymorphisms may fall within coding sequences of genes, non-coding regions of genes, or in the intergenic regions (regions between genes).

• SNPs within a coding sequence do not necessarily change the amino acid sequence of the protein that is produced, due to degeneracy of the genetic code.

• SNPs in the coding region are of two types: synonymous and nonsynonymous.

• Synonymous SNPs do not affect the protein sequence, while nonsynonymous SNPs change the amino acid sequence of the protein. Nonsynonymous SNPs are of two types: missense and nonsense.

• SNPs that are not in protein-coding regions may still affect gene splicing, transcription factor binding, messenger RNA degradation, or the sequence of non-coding RNA.
• A SNP of this type that affects gene expression is referred to as an eSNP (expression SNP) and may be upstream or downstream from the gene.
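
The synonymous / missense / nonsense distinction can be sketched with the standard genetic code. The function name is hypothetical, codons are written as coding-strand DNA, and the sketch only handles a single-base change within one codon.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify_coding_snp(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-base change inside one codon."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must have three bases")
    if sum(r != a for r, a in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("codons must differ at exactly one base")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"   # new stop codon truncates the protein
    return "missense"       # a different amino acid (a lost stop codon also lands here)

print(classify_coding_snp("GAA", "GAG"))  # Glu -> Glu
print(classify_coding_snp("GAG", "GTG"))  # Glu -> Val
print(classify_coding_snp("TAC", "TAA"))  # Tyr -> stop
```

Degeneracy is visible in the table itself: 64 codons map to only 20 amino acids plus stop, so many third-position changes leave the protein unchanged.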
Simple diseases have complex
genomic underpinnings

P – Point mutation, or insertion/deletion entirely inside a gene
D – Deletion
C – Whole chromosome extra, missing, or both
T – Trinucleotide repeat disorders: gene is extended in length
Are all SNPs really SNPs?
• SNPs (single nucleotide polymorphisms): single bases at a particular locus where individual people differ in their sequences.

• SNP: two or more different bases occur in the population, each with a frequency > 1%

• Candidate SNPs can be seen by aligning a set of overlapping sequences and identifying the odd one out:

GCATGCAaGCATGCAT
GCATGCAcGCATGCAT
GCATGCAaGCATGCAT
GCATGCAaGCATGCAT
GCATGCAaGCATGCAT

• Identify the false positives / false negatives
• A base can be falsely identified due to
(i) inclusion of paralogs in the sequence alignments and/or (ii) errors in sequencing
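
Spotting "the odd one" in an alignment can be sketched as a scan over alignment columns. This is only the first, naive step (it cannot tell a true SNP from a paralog or a sequencing error); the function name is hypothetical.

```python
from collections import Counter

def variable_columns(aligned_sequences):
    """Return {column index: base counts} for columns with more than one base."""
    if len({len(seq) for seq in aligned_sequences}) != 1:
        raise ValueError("sequences must all have the same aligned length")
    result = {}
    for position, column in enumerate(zip(*aligned_sequences)):
        counts = Counter(base.upper() for base in column)
        if len(counts) > 1:
            result[position] = dict(counts)
    return result

# The five aligned sequences from the slide.
reads = [
    "GCATGCAaGCATGCAT",
    "GCATGCAcGCATGCAT",
    "GCATGCAaGCATGCAT",
    "GCATGCAaGCATGCAT",
    "GCATGCAaGCATGCAT",
]
print(variable_columns(reads))
```

Only one column (0-based index 7) varies, with four A and one C; whether that C is a real SNP is what the following steps try to establish.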

Warren Gish
Step 1: paralog errors
1.3 Mb of finished human genome reference sequence was searched against the EST database, giving 1954 EST hits (restricted to ESTs for which chromatograms were available).

By aligning the ESTs with the reference sequence, they clustered the 1954 ESTs into 147 contigs that aligned with 80469 positions.

Some of the ESTs that aligned with a particular segment of the reference sequence may be from paralogs.

“A paralog must be less similar to the reference sequence than the same gene containing SNPs”

Frequency of differences:
paralog: 1 base in every 50
SNP: 1 base in every 1000

EST1 GCATGCAaGCATGCAT
EST2 GCAgGCAcGCATGCAT
EST3 GCATGCAaGCATGCAT
EST4 GCATGCAaGCATGCAT
Ref GCATGCAaGCATGCAT

ESTs with a larger number of differences from the reference sequence were classified as paralogs.

Removal of paralogs left 69756 of the original 80469 positions.
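
The paralog filter can be sketched as a mismatch count against the reference. The cut-off used here (`max_mismatches=1`) is an illustrative assumption chosen for this 16-base toy alignment; the slides give only the expected rates (about 1 in 50 for paralogs, 1 in 1000 for SNPs), not the threshold actually used.

```python
def count_mismatches(est: str, ref: str) -> int:
    """Number of aligned positions where the EST differs from the reference."""
    return sum(e.upper() != r.upper() for e, r in zip(est, ref))

def split_paralogs(ests: dict, ref: str, max_mismatches: int = 1):
    """Separate ESTs into (kept, suspected paralogs) by mismatch count."""
    kept, paralogs = {}, {}
    for name, seq in ests.items():
        target = paralogs if count_mismatches(seq, ref) > max_mismatches else kept
        target[name] = seq
    return kept, paralogs

REF = "GCATGCAaGCATGCAT"
ESTS = {
    "EST1": "GCATGCAaGCATGCAT",
    "EST2": "GCAgGCAcGCATGCAT",
    "EST3": "GCATGCAaGCATGCAT",
    "EST4": "GCATGCAaGCATGCAT",
}

kept, paralogs = split_paralogs(ESTS, REF)
print("kept:", sorted(kept))
print("suspected paralogs:", sorted(paralogs))
```

EST2 differs from the reference at two positions and is set aside. In the study, this step removed 80469 - 69756 = 10713 aligned positions.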


Step 2: Sequence errors

The chromatograms of the 1954 ESTs and the sequences are considered for analysis.

The probability that a particular position is truly a SNP is based on:

1. The bases that were observed at that position

2. Depth of coverage of the position, i.e. how many ESTs aligned (in the example, the depth of coverage is 4 and the observed bases, including the reference, are {a,c,a,a,a})

3. Base quality value at that position in each sequence (calculated by processing the chromatograms with the PHRED base-calling program)

EST1 GCATGCAaGCATGCAT
EST2 GCAgGCAcGCATGCAT
EST3 GCATGCAaGCATGCAT
EST4 GCATGCAaGCATGCAT
Ref GCATGCAaGCATGCAT

All positions for which the probability of being a true SNP was estimated to be > 0.4 were designated candidate SNPs.

59 candidates, with an average probability of 0.78 of being a true SNP
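
Two pieces of this step can be sketched directly: the standard Phred relation between a quality value Q and the probability that the base call is wrong (P = 10^(-Q/10)), and the > 0.4 cut-off. How the study combined bases, depth and quality into one probability is not given in the slides, so the per-position probabilities below are hypothetical inputs.

```python
def phred_to_error_probability(quality: float) -> float:
    """Standard Phred scale: Q = -10 * log10(P_error)."""
    return 10 ** (-quality / 10)

def candidate_snps(snp_probabilities: dict, threshold: float = 0.4) -> dict:
    """Keep positions whose estimated probability of being a true SNP exceeds the threshold."""
    return {pos: p for pos, p in snp_probabilities.items() if p > threshold}

for q in (10, 20, 30):
    print(f"Q{q}: error probability {phred_to_error_probability(q):.3f}")

# Hypothetical probabilities for three alignment positions.
estimates = {101: 0.95, 202: 0.35, 303: 0.41}
print(candidate_snps(estimates))
```

A base called with Q20 has a 1 in 100 chance of being wrong, so a single low-quality "odd" base is weak evidence for a SNP compared with several high-quality reads showing the same base.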


Step 3: validation

Candidate SNPs were examined at the corresponding positions in an independent collection of DNA sequences (the validation set).

Of the 59 candidates, 23 were removed from validation (59 - 23 = 36).

20 were found to be polymorphic in the validation set (56%).

SNPs are being added to databases at a rapid pace, but the quality of the data must be taken into account before concluding a SNP is really a SNP!
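
The validation arithmetic, written out as a short check using only the numbers on the slide:

```python
candidates = 59          # candidate SNPs from step 2
removed = 23             # candidates removed from validation
confirmed = 20           # polymorphic in the validation set

tested = candidates - removed
validation_rate = confirmed / tested

print(f"{tested} tested, {confirmed} confirmed: {validation_rate:.0%}")
print(f"share of all {candidates} candidates confirmed: {confirmed / candidates:.0%}")
```

So 56% is the fraction of the 36 tested candidates; measured against all 59 original candidates, the confirmed share is about a third.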

Uses of SNPs:

1. Comparison between different subpopulations (why mutations are maintained in some populations and not in all).

2. DNA fingerprinting for tracing criminals.

3. Mapping polygenic traits.

4. Genotype-specific medication.

5. Markers in certain diseases, to find which combinations of alleles are associated with a particular disease.

Linkage , Linkage disequilibrium & Haplotype:

Linkage: how close two loci are to each other on a chromosome. If they are near each other, the two loci are linked.

Linkage disequilibrium: if two alleles (or two SNPs) tend to be inherited together more often than expected by chance, they are in linkage disequilibrium.

Haplotype: a set of alleles on one particular chromosome.
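
Linkage disequilibrium is commonly quantified with the coefficient D and the squared correlation r². These measures are not defined on the slide; the sketch below uses their standard definitions on a hypothetical list of two-locus haplotypes.

```python
def linkage_disequilibrium(haplotypes, allele_a="A", allele_b="B"):
    """Return (D, r^2) for two biallelic loci.

    Each haplotype is a 2-character string: allele at locus 1, allele at locus 2.
    D   = P(AB) - P(A) * P(B)
    r^2 = D^2 / (P(A) * (1 - P(A)) * P(B) * (1 - P(B)))
    """
    n = len(haplotypes)
    p_a = sum(h[0] == allele_a for h in haplotypes) / n
    p_b = sum(h[1] == allele_b for h in haplotypes) / n
    p_ab = sum(h[0] == allele_a and h[1] == allele_b for h in haplotypes) / n
    d = p_ab - p_a * p_b
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denominator if denominator > 0 else 0.0
    return d, r_squared

# Alleles always travel together: complete linkage disequilibrium.
print(linkage_disequilibrium(["AB", "AB", "ab", "ab"]))
# All four combinations equally common: linkage equilibrium.
print(linkage_disequilibrium(["AB", "Ab", "aB", "ab"]))
```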


Changes in non disease QTL due to SNPs
A quantitative trait locus (QTL) is a section of DNA (the locus) that correlates with variation in a phenotype (the quantitative trait). The QTL is usually linked to, or contains, the genes that control that phenotype.

What is a locus?
A locus is simply a region within a genome: anything from a part of a single gene to a large hunk of a chromosome.

What is a quantitative trait?
A quantitative trait is one where different individuals vary continuously (like height or weight) rather than falling into discrete categories (like whether a person has blue or brown eyes).
Quantitative trait locus mapping requires:

• Two or more strains of organisms that differ genetically with regard to the trait of interest.

• Genetic markers that distinguish between these parental lines.

SNPs that do not lead to disease but produce altered phenotypes

“What is food to some people may be brutal poison to others”

• Fava beans → lysis of red blood cells (RBC): 10% of people in the world are at risk of this problem because they lack G6PD.

• G6PD is a metabolic enzyme found in the cytoplasm of every cell (90%); it produces NADPH, which helps neutralize the toxin H2O2. Fava beans increase the sensitivity of red blood cells to hydrogen peroxide.

• G6PD is encoded on the X chromosome.

• To block the production of G6PD, a female needs two copies of the SNP (one on each X chromosome), whereas a male needs to inherit only one, from his mother.
Fava beans

Glucose-6-phosphate dehydrogenase (G6PD): a metabolic enzyme involved in the pentose phosphate pathway, especially important in red blood cell metabolism.

CHANGES IN NON DISEASE QTL DUE TO SNPs

• SNP 376 A to G: produces normal G6PD activity and is found in 20% of African males.

• SNP 202 G to A: reduces G6PD activity by 10% and is found in about 20% of African alleles.

• SNP 563 C to T: produces undetectable activity and is found in about 20% of alleles in Caucasians living.

• Thus SNPs in non-disease QTLs are known.

Food and Drug interaction

• Grapefruit juice can alter your ability to absorb drugs!

• Grapefruit in general is a potent inhibitor of the cytochrome P450 enzymes, which can affect the metabolism of a variety of drugs, increasing their bioavailability.

• Pills dissolve in the small intestine, where the medication is absorbed.

• P-glycoprotein is involved in absorption: it is a pump that transports drugs across the membrane of intestinal cells, exporting them back out of the cell.

• One glass of grapefruit juice can block P-glycoprotein and cytochrome P450 3A for 24 hours.

• P-glycoprotein is also known as multidrug resistance protein. It is responsible for decreased drug accumulation in multidrug-resistant cells and often mediates the development of resistance to anticancer drugs.

• Cytochrome P450 3A (the cytochrome P450 enzyme 3A4) metabolizes drugs and converts potentially beneficial drugs into a form that is more readily excreted.
SNPs and Skin Pigmentation
• Sickle cell anaemia is caused by a SNP that results in a one amino acid difference in the beta subunit of haemoglobin.

• In 1964, using the molecular tools then available, it was observed that 30-40 genes are involved in skin pigmentation.

• After 40 years, it is now known that melanin, the cause of skin pigmentation, is a polymer of two oxidized derivatives of tyrosine: pheomelanin (red, yellow) and eumelanin (black, brown).

• SNPs that lead to an inactive form of the gene Mc1R result in accumulation of pheomelanin and reduction of eumelanin, leading to the development of red hair, pale skin and freckles.

• It was also noted that people with red hair showed resistance to anaesthetics.
SNPs and Malarial Resistance
• In Africa, every year millions of people, especially children, are killed by malaria.

• In East Africa, a team of hematologists studied the frequency of a SNP in the promoter of Nos2, which codes for nitric oxide synthase, the enzyme that produces the cell-signaling molecule NO.

• When the base T was changed to C, the children had more NO in their blood, which reduced the chance of developing fatal malaria by 80%.

• This discovery could lead to new treatments for malaria that regulate NO production to appropriate levels through medication.

Mitochondrial SNP’s
Oxidative phosphorylation
Nuclear genes
Mitochondrial genes (13)

• Oxidative phosphorylation is a metabolic pathway that uses energy released by the oxidation of nutrients to produce adenosine triphosphate (ATP).

• Each ‘mit’ mitochondrial gene requires the proper function of the 22 tRNA and 2 rRNA genes that are encoded in the mitochondrial genome.

• The phenotypes associated with defective oxidative phosphorylation tend to affect the tissues that need a great amount of energy, such as skeletal muscle and cardiac muscle.

• > 50 different disease-causing mitochondrial SNPs have been identified.

2000 patients were screened for 44 known mitochondrial SNPs. Of these, 108 were determined to have a known disease-causing SNP:
i. 108 patients could be treated more effectively
ii. The few remaining SNPs that were not in the screen should be tested
iii. Some of these 2000 patients don’t have mitochondrial mutations
iv. There might be non-SNP mutations
v. Additional SNPs have not been discovered yet: there is a need for high-throughput methods to identify them.
Incorrect mRNA splicing

• mRNA production requires splicing to join exons and exclude introns.

• Incorrect splicing is the mechanism behind unexpected mutations in the breast cancer gene BRCA1 (BRCA1 is associated with breast and ovarian cancer).

• If the amino acid composition of the encoded protein is unaffected, the SNP is called “silent” and this type of point mutation is called a “silent mutation”.

• Silent mutations in exons of BRCA1 can lead to alternative splicing.

• Splicing of RNA to produce a mature mRNA involves the 5’ and 3’ ends of each exon, but internal sequences are required as well.

Adrian R. Krainer, CSHL

Variations in Medication responsiveness
• Many medications are not administered in their active, final form.
• These drugs are metabolized in a predictable way and the enzymatic product is the therapeutic compound. Such drugs are called prodrugs.
• One enzyme family that metabolizes a large number of such prodrugs is cytochrome P450.
• The cytochrome P450 superfamily (CYP) is a large and diverse group of enzymes that catalyze the oxidation of a large number of prodrugs (organic substances).
• Based on the activity of these enzymes, people can be classified into 3 categories:
• Typical metabolizers – normal therapeutic effect
• Poor metabolizers – fail to metabolize the drug to its active form and excrete it before conversion
• Ultra-rapid metabolizers – convert the drug into its active form so quickly that even a low dose can result in an overdose

Variations in Medication responsiveness
• Two cytochrome P450 genes are considered here: 2C19, located on chromosome 10, and 2D6, located on chromosome 22.

• In 2D6, about 12 SNPs have been identified, but the most common is a G to A substitution within exon 4.

• About 40 different prodrugs require 2D6p to become active, including antiarrhythmics, antidepressants, etc.

• Mephenytoin can’t be used to treat epilepsy when there is a SNP in exon 5 of 2C19 (681 G to A).

• 23% of the Asian population are poor metabolizers for 2C19p.

• Hence, when drugs are administered, it is vital to determine the population-specific dosage.

