
What is bioinformatics?

The management, analysis, and visualization of molecular, cellular, and genomic
information.

Molecular Biology

Computational Biology

Bioinformatics

Computer Science

Genomics

Genomics

Development and application of genetic mapping, sequencing, and computation
(bioinformatics) to analyze the genomes of organisms.

Sub-fields of genomics:
1. Structural genomics: genetic and physical mapping of
genomes.
2. Functional genomics: analysis of gene function (and of non-genes).
3. Comparative genomics: comparison of genomes across
species.

Includes structural and functional genomics.

Evolutionary genomics.

COMPARATIVE GENOMICS
Brief Review

Definition
Comparative genomics is the comparison of gene numbers, gene
locations, and the biological functions of genes across
the genomes of different organisms.
Major objective: to identify a gene or group
of genes that plays a unique biological role
in a particular organism.

Some Terminology
Homology: the relationship between any
two characters (such as two proteins that have
similar sequences) that have descended,
usually through divergence, from a common
ancestral character.
Homologues are thus components or characters
(such as genes or proteins with similar sequences)
that can be attributed to a common ancestor of
the two organisms during evolution.

Homologues can be orthologues, paralogues, or
xenologues.
Orthologues are homologues that have evolved
from a common ancestral gene by speciation.
They usually have similar functions.
Paralogues are homologues produced by
duplication within a genome
followed by subsequent divergence. They often
have different functions.
Xenologues are homologues that are related by
an interspecies (horizontal) transfer of the
genetic material for one of the homologues. The
functions of xenologues are quite often
similar.

Analogues
Analogues are non-homologous
genes or proteins that have arisen
convergently from unrelated ancestors.
They have similar functions although they
are unrelated in either sequence or
structure.

Why Comparative Genomics?


Problems:
1. The vast number of species, and the much larger size of
some genomes, make sequencing every genome in full
a non-optimal approach to understanding
genome structure.
2. Within a given species, most individuals are genetically
distinct in a number of ways. What does it actually mean,
for example, to "sequence a human genome"? The
genomes of two genetically distinct individuals
differ in DNA sequence by definition.
These two problems, and the potential for other novel
applications, have given rise to new approaches that
constitute the field of comparative genomics.

All modern genomes have arisen from common ancestral genomes


The relationships between genomes can be studied with this fact in
mind.
This commonality means that information gained in one organism
can be applied to other, even distantly related, organisms.
Comparative genomics enables information
gained from model systems to be applied to agriculture.
The nature and significance of differences between genomes also
provide a powerful tool for determining the relationship between
genotype and phenotype, through comparative genomics combined with
morphological and physiological studies.

Methods
A DNA walk of a genome represents how the
frequency of each nucleotide in a complementary
pair (G/C or T/A) changes locally.
This analysis measures the local
distribution of G within the GC content and of T within the
TA content.
Lobry was the first to propose this analysis (1996,
1999). Two complementary representations can be
derived from the DNA walk: the cumulative TA-skew and
the cumulative GC-skew analyses.

1) DNA walk
1.1) Drawing a DNA walk by reading a sequence file nucleotide
by nucleotide.
A simple algorithm draws a DNA walk by
assigning a direction to each nucleotide.
Let T, C, A, and G correspond to the E(ast), S(outh), W(est), and
N(orth) directions, respectively. Reading the sequence
nucleotide by nucleotide and taking one step in the
assigned direction each time traces a path on the graph.
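
The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the original authors' program: the names `STEPS` and `dna_walk` are hypothetical, and skipping symbols other than A, C, G, and T is an assumption the slides do not address.

```python
# Direction assigned to each nucleotide, as a (dx, dy) step on a grid:
# T = East, C = South, A = West, G = North.
STEPS = {"T": (1, 0), "C": (0, -1), "A": (-1, 0), "G": (0, 1)}


def dna_walk(sequence):
    """Return the list of (x, y) points visited, starting at the origin."""
    x, y = 0, 0
    path = [(x, y)]
    for base in sequence.upper():
        if base not in STEPS:
            continue  # assumption: ignore ambiguous symbols such as N
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


# The Figure 1 sequence, taking its two lines as one continuous
# sequence (an assumption about how the slide wrapped it).
figure_1 = ("GTCTGGTGTCTGGAGTTCCTGGGTCTTGAGACCACAGGACC"
            "CACCAGGGACCCAGGACCC")
print(dna_walk(figure_1)[-1])
```

Read this way, the Figure 1 sequence contains as many Ts as As and as many Gs as Cs, so the walk finishes where it started, which agrees with the figure caption. The returned points can be passed to any plotting library to draw the curve.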

Figure 1: DNA walk of the sequence

GTCTGGTGTCTGGAGTTCCTGGGTCTTGAGACCACAGGACC

CACCAGGGACCCAGGACCC

Starting from the bottom left (bold blue line), the curve ends at the bottom left (pink line).

2) The cumulative TA- and the GC-skew analyses.


2.1) Drawing a cumulative TA- or GC-skew analysis by reading a
sequence file nucleotide by nucleotide.
Cumulative TA-skew analysis: assign a direction to each nucleotide
as follows: A, T, C, and G correspond to S, N, nd (no
direction), and nd, respectively.
On the graph, after each nucleotide is read, the pointer
moves one step eastward. If an A or a T is read, a further step is added,
southward or northward, respectively.

Cumulative GC-skew analysis: assign a direction to each nucleotide
as follows: A, T, C, and G correspond to nd, nd, S,
and N, respectively. On the graph, after each
nucleotide is read, the pointer moves one step eastward. If a C
or a G is read, a further step is added, southward or northward,
respectively.
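
Both cumulative skews follow the same pattern, so one helper can serve for either. The sketch below is illustrative only: the function names `cumulative_skew`, `ta_skew`, and `gc_skew` are hypothetical, and each recorded point is taken after both the eastward step and any north/south step.

```python
def cumulative_skew(sequence, up, down):
    """Return (x, y) points for a cumulative skew curve.

    x advances by one for every nucleotide read (the eastward step);
    y rises by one for `up` and falls by one for `down`.
    """
    y = 0
    points = [(0, 0)]
    for x, base in enumerate(sequence.upper(), start=1):
        if base == up:
            y += 1  # northward step
        elif base == down:
            y -= 1  # southward step
        points.append((x, y))
    return points


def ta_skew(sequence):
    """Cumulative TA-skew: T moves north, A moves south."""
    return cumulative_skew(sequence, up="T", down="A")


def gc_skew(sequence):
    """Cumulative GC-skew: G moves north, C moves south."""
    return cumulative_skew(sequence, up="G", down="C")


print(gc_skew("GGCAT"))
print(ta_skew("GGCAT"))
```

Because x is simply the position in the sequence, the curve can be plotted directly against genome coordinates.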

Methods (dry)
Bioinformatics.
Its tools (software)

Computational analysis
Shannon entropy is a measure of variation
or change over a time series. Genes that
exhibit significant changes are regarded
as good target candidates.
Clustering is a method for grouping
patterns by similarities in their shapes.
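
As a rough illustration of the entropy idea, the sketch below scores one gene's expression time series. The slides do not say how values are discretized, so the equal-width binning, the default of three bins, and the function name `shannon_entropy` are all assumptions made for this example.

```python
import math
from collections import Counter


def shannon_entropy(series, n_bins=3):
    """Shannon entropy (in bits) of a non-empty numeric time series.

    Values are first placed into `n_bins` equal-width bins
    (an assumed discretization); a flat series scores 0.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        return 0.0  # no variation at all
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    bins = [min(int((v - lo) / width), n_bins - 1) for v in series]
    n = len(series)
    return -sum((c / n) * math.log2(c / n) for c in Counter(bins).values())


flat_gene = [5.0, 5.0, 5.0, 5.0]
changing_gene = [0.0, 0.0, 10.0, 10.0]
print(shannon_entropy(flat_gene), shannon_entropy(changing_gene))
```

Under these assumptions, a gene whose expression moves between levels scores higher than one that stays flat, which is the sense in which high-entropy genes are treated as candidates.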

GCG tools
Founded in 1982 as a service of the Department of
Genetics at the University of Wisconsin, GCG became a
private company in 1990 and was acquired by Oxford
Molecular Group in 1997. The company was one of the
pioneers of bioinformatics and its Wisconsin Package
sequence analysis tools are widely used and well regarded
throughout the pharmaceutical and biotechnology
industries and in academia. To support enterprise
bioinformatics efforts, GCG developed SeqStore, its
Oracle-based data management system. Desktop solutions
are delivered to bench scientists through products such as
MacVector and OMIGA.

GCG Wisconsin Package


Comparison
Database Searching and Retrieval
DNA/RNA Secondary Structure
Editing and Publication
Evolution
Fragment Assembly
Gene Finding and Pattern Recognition
Importing and Exporting
Mapping
Primer Selection
Protein Analysis
Translation

PAUP* version 4.0 is a major upgrade and new release of the software
package for inference of evolutionary trees, for use in Windows,
UNIX/VMS, or DOS-based formats.

Target Validation
Target validation involves taking steps to prove that a
DNA, RNA, or protein molecule is directly involved in a
disease process and is therefore a suitable target for
development of a new therapeutic compound.
Genes that do not belong to an established family are
critical to many disease processes and also need to be
validated as potential targets.

Target validation & identification


Computer-based drug design: beginning
with the protein engineering and analysis
tools, we can identify and evaluate the
target.
The complete suite of software provides
a seamless environment in which to work more
efficiently and quickly.

Target validation & identification


A computational component analyzes genomic
sequences, producing 3D structural and functional
annotations. Once annotated, sequences can be
identified as potential drug targets for development.
X-ray crystallography has become a central tool in
modern drug and target discovery.
These annotations, made from knowledge of
predicted protein structure, are an important
component in identifying potential targets, thereby
facilitating successful and competitive drug
discovery.

Outcomes/Benefits
Provides first-pass information on the function
of the putative protein, based on the existence of
conserved protein sequence motifs.
Advancements in computer software
technologies (bioinformatics) have made
comparative analysis of genomes an extremely
powerful approach for functional genomics too.
These studies can also reveal insights into the
recruitment of enzymes in a pathway.

Outcomes/Benefits
It will help us to understand the genetic basis of diversity
in organisms, both speciation and variation, which are
important aspects of evolutionary biology.
Comparative genomics provides a powerful way in which
to analyze sequence data.
Indeed, there is already a long list of 'model' organisms,
which allow comparative analyses in a variety of ways.
