Facts to know:
Bioinformatics is a multidisciplinary field that combines computer science, statistics,
mathematics and engineering to analyze and present biological data, with the aim of
better understanding a given DNA, RNA or protein sequence.
Common uses of bioinformatics include gene and nucleotide identification. Scientists have
used bioinformatics in the medical field for years to better explain genetically
related diseases through sequence analysis, genome annotation and many other
methods.
Sequence analysis. With improvements in bioinformatics, DNA analysis, which used to
be a frustrating manual task, has become more interesting and even fascinating.
Among the most widely used programs in the field is BLAST, which searches sequence
databases. The database it searches holds sequences from more than 260 000 organisms
and over 19 billion nucleotides. So far, thousands of DNA sequences from thousands of
organisms have been decoded and stored in several other databases. These sequences
provide information that is then used to identify gene(s) that encode polypeptides
(proteins), RNA genes, regulatory sequences and repetitive sequences. They also enable
comparative analysis of genes within a species or between different species, revealing
similarities in protein function and relationships between species.
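To make the idea of comparative analysis concrete, here is a minimal sketch of how the similarity between two already aligned sequences can be scored as percent identity. The function name and the two short fragments are made up for illustration, and the definition is simplified (columns with a gap are skipped), so the value may differ from the "ident" figure that BLAST reports.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of compared columns in which both sequences have the same residue.

    Assumes the two strings are already aligned (same length, '-' for gaps).
    Columns where either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy fragments invented for this example; they are not database records.
first = "ATGGTGCACCTGACT"
second = "ATGGTGCATCTGACT"
print(round(percent_identity(first, second), 1))
```

The two fragments differ at one of fifteen positions, so the sketch reports an identity a little above 93 percent.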
Annotation is another function of bioinformatics. It allows computational gene finding:
searching a genome for protein-coding genes, RNA genes and other functional sequences.
As you may recall from the lecture, not every part of a DNA sequence serves to preserve
genetically valuable information; the part that does not is called junk DNA.
Bioinformatics bridges the gap from genome to proteome, for example by using a DNA
sequence for protein identification.
Genome Annotation. In the context of genomics, applying this computational activity to a
DNA sequence allows gene prediction (predicting how many genes there are in the given
DNA sequence) and the identification of other biological features of the DNA. Annotation
is made possible by the fact that genes have recognizable start and stop regions,
although the exact sequence found in these regions can vary between genes.
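The idea of recognizable start and stop regions can be sketched with a very simple open reading frame (ORF) scan: look for an ATG start codon and read in steps of three bases until a stop codon appears. This is only an illustration under strong assumptions (forward strand only, no introns). Real gene predictors such as Genscan use far more elaborate models. The function name `find_orfs` and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3):
    """Return (start, end, frame) for each ATG...stop stretch on the forward strand.

    start is 0-based; end is exclusive and includes the stop codon.
    min_codons counts every codon in the stretch, start and stop included.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon seen in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next start codon
    return orfs


# Toy sequence invented for this example.
toy = "CCATGAAATTTGGGTAACC"
print(find_orfs(toy))  # [(2, 17, 2)]
```

Here the only ORF starts at position 2 (ATG) and ends with TAA, in the third reading frame.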
What to do?
With the brief explanation above of bioinformatics and its use in computational biology,
you are expected to gain meaningful experience by following the instructions below.
The instructions are designed to give you an idea of what bioinformatics is and how you
can benefit from it. For practical purposes, you are going to perform a series of
activities. The procedural information for each activity is given in terms of the
objective, how you are expected to perform the task (method), and the report, which
consists of your findings. Detailed information on how to write the report will be
provided by the senior students who are assisting with the practicum.
1. Locating and Retrieving a Sequence
(Nucleotide and Amino Acid Sequence)
Objective:
To locate and retrieve the nucleotide sequence of the Human Beta Globin Region on
chromosome 11 from NCBI GenBank
Method:
NCBI Genbank is available at http://www.ncbi.nlm.nih.gov/genbank/
Submit the accession number or the title of the sequence in the search box.
Pay attention to the sequence. What kind of sequence do you think it is: DNA, RNA or
amino acid? How can you tell the difference?
Report:
The sequence GI, length, the whole sequence in FASTA format, NCBI Graphics
Brief explanation of the result (annotate your sequence)
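If you save the retrieved record as FASTA text, the following sketch shows one way to read it and to make a first guess at the kind of sequence from the letters it uses. The function names and the example record are hypothetical, and the guess is only a heuristic: a short protein made up solely of A, C, G and T residues would be misclassified.

```python
def parse_fasta(text: str):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def guess_sequence_type(seq: str) -> str:
    """Guess 'DNA', 'RNA' or 'protein' from the alphabet used (heuristic only)."""
    # Ignore unknown bases, gap characters and stop symbols before checking.
    letters = set(seq.upper()) - {"N", "-", "*"}
    if letters <= set("ACGT"):
        return "DNA"
    if letters <= set("ACGU"):
        return "RNA"
    return "protein"


# Made-up record for illustration; not a GenBank entry.
example = """>toy_record made-up example
ATGGTGCACC
TGACTCCTGA
"""
for header, seq in parse_fasta(example):
    print(header, len(seq), guess_sequence_type(seq))
```

The length printed here is the number of residues after joining the wrapped lines, which is the figure to compare with the length shown on the GenBank page.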
2. Identifying Genes within a Sequence
Objective:
To identify gene(s) within a given genomic DNA sequence using NCBI GenBank and the
Genscan prediction tool
Method:
Use the NCBI GenBank facilities to identify the genes within the previously retrieved
Human Beta Globin Region on chromosome 11.
Have a look at the result from Step 1. What are the properties of the sequence? How
many genes, exons and introns are there? How do you know?
Contrast the NCBI GenBank result with the Genscan prediction tool. Genscan is available
at http://argonaute.mit.edu/GENSCAN.html
Copy the Human Beta Globin Region on chromosome 11 sequence in FASTA format, and
then submit it in the input box.
Pay attention to the result: how many genes, exons and introns are there?
Also locate the protein sequence at the bottom of the result page. You may
want to copy that polypeptide sequence, because we are going to use it
later on (Step 4).
Compare the results of the two tools. Do they agree on the number of genes, exons and
introns?
Report:
NCBI Graphics and Genscan result
Compare the results from the GenBank and Genscan prediction tools in terms of the
numbers of genes, exons and introns, and provide a brief explanation of the results
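One way to organize the comparison is to write down the exon coordinates reported by each tool and count genes, exons and introns from them. The sketch below assumes a simple data layout (gene name mapped to a list of exon start and end positions) and counts one fewer intron than exons per gene. All names and coordinates are hypothetical placeholders, not values from GenBank or Genscan.

```python
def summarize(annotation: dict) -> dict:
    """Count genes, exons and introns in one annotation.

    `annotation` maps a gene name to its list of (start, end) exon coordinates.
    Within one gene, introns lie between consecutive exons, so a gene with
    n exons is counted as having n - 1 introns.
    """
    exons = sum(len(e) for e in annotation.values())
    introns = sum(max(len(e) - 1, 0) for e in annotation.values())
    return {"genes": len(annotation), "exons": exons, "introns": introns}


def compare(first: dict, second: dict) -> dict:
    """Pair up the counts from two annotations of the same sequence."""
    a, b = summarize(first), summarize(second)
    return {key: (a[key], b[key]) for key in a}


# Hypothetical coordinates for illustration only; replace them with the
# values you read from NCBI Graphics and from the Genscan output.
genbank_like = {
    "gene_A": [(100, 200), (350, 570), (1400, 1530)],
    "gene_B": [(5000, 5100), (5300, 5500)],
}
genscan_like = {
    "prediction_1": [(100, 200), (350, 570), (1400, 1530)],
}
print(compare(genbank_like, genscan_like))
# {'genes': (2, 1), 'exons': (5, 3), 'introns': (3, 2)}
```

Each tuple lists the count from the first annotation followed by the count from the second, which makes disagreements between the two tools easy to spot.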
3. Finding a Similar Sequence in Another Species
Method:
Using the NCBI Blastn tool, identify and retrieve a sequence in Pan troglodytes that is
highly similar to the human Beta Globin Region on chromosome 11. Use the human globin
sequence as the query sequence. NCBI Blastn is available at
http://blast.ncbi.nlm.nih.gov/Blast.cgi
Then follow the link to the result with the highest identity, in this case 97% identity,
with accession number NC_006478.3
What do you think about the sequence? Why are there so many Ns? What do these Ns
represent? Also, why do we need to BLAST the human Beta Globin Region on
chromosome 11; what is the purpose?
Report:
NCBI Graphics for the Pan troglodytes sequence matching the Human Beta Globin Region
on chromosome 11
Brief explanation of the results above
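To support your explanation of the Ns, you can measure how many there are and where they cluster. In the IUPAC nucleotide code, N stands for any base. The sketch below reports the stretches of consecutive Ns and their share of the sequence. The function names and the toy sequence are made up for illustration.

```python
import re


def n_runs(seq: str):
    """Return (start, end) spans of consecutive N characters (0-based, end exclusive)."""
    return [match.span() for match in re.finditer(r"N+", seq.upper())]


def n_fraction(seq: str) -> float:
    """Return the share of positions that are N (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return seq.upper().count("N") / len(seq)


# Toy sequence invented for this example.
toy = "ACGTNNNNNNACGTNNAC"
print(n_runs(toy))  # [(4, 10), (14, 16)]
print(round(n_fraction(toy), 2))  # 0.44
```

Long runs and isolated single Ns usually call for different explanations, so it is worth reporting the run lengths and not only the total count.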
4. Translating a Nucleotide Sequence to Amino Acid Sequence
Objective:
To translate the Human Beta Globin Region on chromosome 11 nucleotide sequence into an
amino acid sequence
Method:
Translate the Human Beta Globin Region on chromosome 11 nucleotide sequence into an
amino acid sequence using the Expasy Translate tool, available at
http://web.expasy.org/translate/.
Submit the Human Beta Globin Region on chromosome 11 sequence in FASTA format, and
translate.
For comparison, contrast the NCBI GenBank translation and the Genscan version with the
Expasy result.
You now have three sequences. You can blastp them to find their similarities.
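The translation step itself can be sketched in a few lines: read the sequence three bases at a time and look each codon up in the standard genetic code, in each of the six reading frames. This is a minimal illustration and not how the Expasy tool is implemented. The function names are hypothetical, and unknown codons are written as X. Keep in mind that translating a genomic region directly reads through introns, which is one reason the three results may differ.

```python
from itertools import product

# Standard genetic code, with codons ordered by bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate one forward reading frame; '*' marks a stop codon."""
    dna = dna.upper().replace("U", "T")
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        # Codons containing N or other ambiguity letters become 'X'.
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (N stays N)."""
    return dna.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def six_frames(dna: str) -> dict:
    """Translate the three forward and three reverse-strand frames."""
    reverse = reverse_complement(dna)
    frames = {f"+{f + 1}": translate(dna, f) for f in range(3)}
    frames.update({f"-{f + 1}": translate(reverse, f) for f in range(3)})
    return frames


# Short fragment used only for illustration.
fragment = "ATGGTGCATCTGACTCCTGAGGAGAAGTCT"
print(translate(fragment))  # MVHLTPEEKS
for name, protein in six_frames(fragment).items():
    print(name, protein)
```

Comparing the six frames side by side shows why choosing the correct frame matters: only one of them gives a sensible protein for this fragment.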
Report:
Prosite result
Explanation of the result, and the function of the protein (search for other supporting
references)
What is expected of you?
Upon completing the instructions, you are expected to
1. Choose a sequence of interest and analyze its properties, translate it into protein,
discover its similarity to sequences of similar function across species, and finally
predict its function (repeat Steps 1-5). Examples of sequences are albumin, insulin,
insulin receptor, glycogen phosphorylase, etc.
2. Then write a group report on your sequence of interest. Each group must choose
2 sequences, and no sequence may repeat one chosen by another group. The format of
the report will be explained during or after the practicum. Please use proper
wording and do not copy other students' work.
3. Several words are written in italics in the first paragraphs, and several
questions are also written in italics in the activities section to trigger your thinking
process. You are expected to write a description and brief explanation of those
words, and to answer the questions. This additional writing should
be embedded within the main report.
For further clarification on this practicum, please contact the available lecturers.