
Analysis of gene sequences

Analysis of introns, exons, acceptor sites, donor sites, start sites, stop sites and ORF regions in the given gene sequence.

To predict the gene structure using various gene prediction tools such as GENSCAN, geneid and GeneMark.

Procedure: In this exercise, a previously annotated gene is used to measure the accuracy of different gene finding approaches. GENSCAN, geneid and GeneMark are used to annotate the sequence. Search by signal, search by content and homology-based methods (using protein and cDNA sequences) are employed to improve on the ab initio results. Weak conservation of start codons leads to wrong prediction of initial exons in most cases.

1. Fetch the nucleotide sequence with ID U30787 from the EMBL database.
2. Save the sequence in FASTA format.
3. Use the following gene finding programs to predict genes in the retrieved sequence and compare the results.

GENSCAN: http://genes.mit.edu/GENSCAN.html
geneid: http://www1.imim.es/software/geneid/geneid.html
GeneMark: http://exon.gatech.edu/GeneMark/
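Steps 1 and 2 can also be scripted rather than done through a browser. The sketch below is one possible way to do it, not part of the original exercise. It assumes that the ENA browser API serves EMBL records as FASTA under the URL pattern shown; that URL, the function names and the output file name are assumptions made for illustration.

```python
from urllib.request import urlopen

# Assumption: the ENA browser API returns a FASTA record at this path.
ENA_FASTA_URL = "https://www.ebi.ac.uk/ena/browser/api/fasta/{accession}"


def build_fasta_url(accession: str) -> str:
    """Return the (assumed) ENA download URL for one accession."""
    return ENA_FASTA_URL.format(accession=accession.strip())


def parse_fasta(text: str) -> tuple[str, str]:
    """Split a single-record FASTA string into (header, sequence)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    header = lines[0][1:]                    # drop the leading '>'
    sequence = "".join(lines[1:]).upper()    # join the wrapped sequence lines
    return header, sequence


def fetch_and_save(accession: str, path: str) -> tuple[str, str]:
    """Download one record, write it to `path`, and return (header, sequence)."""
    with urlopen(build_fasta_url(accession), timeout=30) as response:
        text = response.read().decode("utf-8")
    header, sequence = parse_fasta(text)     # fail early if the reply is not FASTA
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return header, sequence
```

Calling `fetch_and_save("U30787", "U30787.fasta")` would then write a file that can be pasted into or uploaded to each of the three web servers. Parsing the reply before saving guards against storing an error page as if it were a sequence.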

Spliced alignment is very useful when additional information about the content of the sequence is available, such as a putative homologous protein sequence. Gene prediction is then guided by fitting the protein sequence to the best splice sites predicted in the genomic sequence.
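Because exon boundaries in a spliced alignment are placed at predicted splice sites, a quick sanity check on any predicted gene structure is to look at the dinucleotides at each intron boundary. The sketch below is a minimal illustration, not what any of the tools does internally. It assumes exons are given as 1-based inclusive (start, end) pairs on the forward strand, and it tests only the canonical GT...AG rule; a minority of real introns use other boundaries such as GC...AG or AT...AC, so a failed check is a flag for inspection, not proof of an error. The function names are hypothetical.

```python
def introns_from_exons(exons):
    """Return 1-based inclusive intron intervals lying between consecutive exons."""
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start - prev_end > 1:        # at least one base between the exons
            introns.append((prev_end + 1, next_start - 1))
    return introns


def check_splice_sites(sequence, exons):
    """Report the donor and acceptor dinucleotides of every intron.

    Assumes forward-strand coordinates; 'canonical' is True only when the
    intron starts with GT (donor site) and ends with AG (acceptor site).
    """
    seq = sequence.upper()
    report = []
    for start, end in introns_from_exons(exons):
        intron = seq[start - 1:end]          # convert to 0-based slicing
        report.append({
            "intron": (start, end),
            "donor": intron[:2],
            "acceptor": intron[-2:],
            "canonical": intron[:2] == "GT" and intron[-2:] == "AG",
        })
    return report
```

Running this on the exon coordinates reported by each tool gives the intron start and end positions as a by-product and shows at a glance which predicted boundaries deserve a closer look.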

Questions:
1. Compare the results obtained from the above tools.
2. On average, how many exons and introns are there in the given gene sequence?
3. Specify the start and end positions of the identified exons and introns.
4. How many possible ORFs are there in the given gene sequence? Retrieve their FASTA sequences.
5. List all the possible protein-coding exons in the given gene sequence.

To identify the gene structure using the GeneWise web server: http://www.ebi.ac.uk/Wise2/

Procedure: Paste both the protein and genomic sequences and run the program. Compare the predicted gene (at the end of the output file) with the annotation: look for splice sites within introns to check that the exon boundaries are correct. Save your result images and answer the following questions:
a. What is the important conclusion from the observation?
b. Which tool is more accurate?
c. How many exons are in the EMBL entry?
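To make the question of which tool is more accurate concrete, the exon lists from each tool can be scored against the annotated exons of the EMBL entry. The sketch below is one common way to do this and is offered as an assumption, not as the method the exercise prescribes. It counts an exon as correct only when both its start and end match the annotation exactly, then reports exon-level sensitivity (correct exons divided by annotated exons) and specificity in the gene-prediction sense (correct exons divided by predicted exons). The function names are hypothetical, and the coordinates must be copied by hand from each tool's output.

```python
def exon_level_accuracy(annotated, predicted):
    """Compare two exon lists by exact (start, end) match.

    Returns (sensitivity, specificity) where
      sensitivity = correctly predicted exons / annotated exons
      specificity = correctly predicted exons / predicted exons
    Either value is 0.0 when its denominator would be zero.
    """
    annotated_set = set(annotated)
    predicted_set = set(predicted)
    correct = annotated_set & predicted_set
    sensitivity = len(correct) / len(annotated_set) if annotated_set else 0.0
    specificity = len(correct) / len(predicted_set) if predicted_set else 0.0
    return sensitivity, specificity


def compare_tools(annotated, predictions):
    """Score every tool; `predictions` maps a tool name to its exon list."""
    return {
        tool: exon_level_accuracy(annotated, exons)
        for tool, exons in predictions.items()
    }
```

A tool that finds every annotated exon but also reports extra ones scores high on sensitivity and lower on specificity, so both numbers are needed when ranking GENSCAN, geneid, GeneMark and GeneWise. Exact matching is deliberately strict: an exon with one wrong boundary counts as a miss, which is where weakly conserved start codons and mispredicted initial exons tend to show up.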
