PRACTICAL ACTIVITY 1
Daytime class, 1st quadrimester of 2010
Cássia de Souza Carvalho
Prof. Ana Carolina Simões
Santo André, March 14, 2010

Part 1.1 - Sequence A1 and the modes of the BLAST tool

>A1
ACCGACATGGCGGCCGTCTTCGCTGTGGTGACTTTAACTCTCGGTTTTCGGTTATAGCCGGCCGGCGCTC
ACTTGTCTTCAGGAAGCTCGGAGCCTTTGGTGGAGCCGGGGAGAGGAAGGGTGGGTGCAAGAGTGAAAGG
CGAGAGGGGACTGCAAGCATCCGGGTCGGCTCCTGGCCGGAGCAAGATGGCTGAGGGCGAGCGGCAGCCG
CCGCCAGATTCTTCAGAGGAGGCCCCTCCAGCCACTCAGAACTTCATCATTCCAAAAAAGGAGATCCACA
CAGTTCCAGACATGGGCAAATGGAAGCGTTCTCAGGCATACGCTGACTACATCGGATTCATCCTCACCCT
CAACGAAGGTGTGAAGGGGAAGAAGCTGACCTTCGAGTACAGAGTCTCCGAGGCCATTGAGAAACTAGTC
GCTCTTCTCAACACGCTGGACAGGTGGATTGATGAGACTCCTCCAGTGGACCAGCCCTCTCGGTTTGGGA
ATAAGGCATACAGGACCTGGTATGCCAAACTTGATGAGGAAGCAGAAAACTTGGTGGCCACAGTGGTCCC
TACCCATCTGGCAGCCGCTGTGCCTGAGGTGGCTGTTTACCTAAAGGAGTCAGTGGGGAACTCCACGCGC
ATTGACTACGGCACAGGGCATGAGGCAGCCTTCGCTGCTTTCCTCTGCTGTCTCTGCAAGATTGGGGTGC
TCCGGGTGGATGACCAAATAGCTATTGTCTTCAAGGTGTTCAATCGGTACCTTGAGGTTATGCGGAAACT
CCAGAAAACATACAGGATGGAGCCAGCCGGCAGCCAGGGAGTGTGGGGTCTGGATGACTTCCAGTTTCTG
CCCTTCATCTGGGGCAGTTCGCAGCTGATAGACCACCCATACCTGGAGCCCAGACACTTTGTGGATGAGA
AGGCCGTGAATGAGAACCACAAGGACTACATGTTCCTGGAGTGTATCCTGTTTATTACCGAGATGAAGAC
TGGCCCATTTGCAGAGCACTCCAACCAGCTGTGGAACATCAGCGCCGTCCCTTCCTGGTCCAAAGTGAAC
CAGGGTCTCATCCGCATGTATAAGGCCGAGTGCCTGGAGAAGTTCCCTGTGATCCAGCACTTCAAGTTCG
GGAGCCTGCTGCCCATCCATCCTGTCACGTTGGGCTAGGAGGGGCCAAGCCGAAGAGCCACCCAGGCCAC
AGTTCCTGTGCCTGCCTTCCCCACCCCAGCAGTGGCCCCTCCCCATCCCCTCCCTCTGTTCGTCCCGTTT
GATGAGAGGCTGTTTACTGGGGTGGGGTGGCGAGATGGGCTTGAGGGGGCTCAGAGCATAAGGCTTCAGG
GCCCAAGTTGGGAGAAGTGACCAAAGTGTAGCCAGTTTTCTGAGTTCCCGTGTGCTAGACTGGCCAGAAG
AGAGGGTCTGGGGCCTGGTCACTCGGCCACTCTCTCCTGTTTCTGGCCTCTTCTCCCTTCACTCCCGTCC
AGTCTGGTTTTGAGAGCAGGGGCTGTTCTGCAGCACCGCAGGGAAGGGAGGAGAGATACCTGCTGCTTCC
ATTGCTTTTCCCTTCCTGGAGTCGATGCCTTTCTAAGGGTTGGAGCTGCTCCTTGCAGGGGCGGGTCAGT
TTCCCAGGCCATGCCGGG

Sequence A1, in FASTA format, was first submitted to blastn (Figure 1.1.1), which compares a given nucleotide sequence against the nucleotide sequences in the NCBI databases. The default database (Human genomic plus transcript) was changed to Nucleotide collection (nr/nt), which is more comprehensive: it contains All GenBank + RefSeq Nucleotides + EMBL + DDBJ + PDB sequences (excluding HTGS0,1,2, EST, GSS, STS, PAT, WGS). The maximum number of target sequences to display was also changed to 20000.

Figure 1.1.1 Search screen of the blastn program.

The results show sequences with high alignment scores, greater than or equal to 200 (Figure 1.1.2). Some of the sequences found have full coverage (alignments that span the whole query, with few gaps and the same number of base pairs), but there are also sequences with gaps and different lengths, found in animals phylogenetically more distant from humans. Identity ranges from 100% down to 75%, with the great majority between 100% and 99%. The default sort criterion of the result tables is the E value (expectation value: statistically, the number of hits with an equal or better score expected by chance in a search of the database; the lower the E value, the more significant the score). The sequences with E value 0.0 are at the top.

Figure 1.1.2 Graphical distribution of the sequence alignments produced by blastn.

The first sequence listed has the maximum score, 100% identity and E value 0.0. It is a cDNA, obtained from Homo sapiens mRNA, of the gene responsible for a protein of the phosphatase 2A family (Figure 1.1.3 and Figure 1.1.4). The first three sequences listed are from the human genome, and the fourth is from a primate of the family Hominidae, Pan troglodytes, which shows an identity of 9% together with 100% coverage. Most of the sequences belong to primates and other mammals.

Figure 1.1.3 Sequences obtained, highlighting the first one, from Homo sapiens, for the production of the phosphatase 2A protein.

Figure 1.1.4 Part of the alignment of the first sequence, with 100% identity. The Gene ID of the sequence is highlighted in green.
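The relationship between score and E value described above can be illustrated with a short sketch. It assumes the textbook bit-score form of the Karlin-Altschul statistic, E = m * n * 2^(-S'), and ignores the length corrections that BLAST applies in practice. The function name `expected_hits` and the database size are hypothetical, chosen only for illustration; only the query length (1628 bp) comes from the report.

```python
def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E value: E = m * n * 2**(-S').

    m is the query length, n the total database length and S' the bit score.
    This is a simplified textbook form, not BLAST's exact computation.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


if __name__ == "__main__":
    query_len = 1628      # length of sequence A1 (from the report)
    db_len = 10**10       # hypothetical database size, for illustration only
    for bits in (40, 200, 3000):
        # Higher bit scores give smaller E values; very high scores
        # underflow to 0.0 in floating point.
        print(bits, expected_hits(bits, query_len, db_len))
```

Under these assumptions, each additional bit of score halves the expected number of chance hits, which is why the top-scoring alignments are reported with an E value of 0.0.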
The same sequence was then submitted to blastx, which compares a nucleotide sequence translated in all six reading frames against a protein database. The database Non-redundant protein sequences (nr) was kept (Figure 1.1.5) and the number of target sequences was changed to 20000. The substitution matrix chosen was the default, BLOSUM62, a scoring matrix derived from blocks of aligned sequences clustered at 62% identity (Figure 1.1.6).

Figure 1.1.5 Screen of the blastx program.

Figure 1.1.6 Algorithm parameters, highlighting the scoring matrix used, BLOSUM62.

Visually, the blastx graphic (Figure 1.1.7) differs from that of the earlier blastn search. Although the alignment scores remain high, the alignment coverage is now comparatively lower: on average, only about 1000 base pairs are aligned to the query sequence, which has 1628 bp. No internal gaps appear in the sequences. This can be explained by the nature of the search algorithm. Because the nucleotide sequence is translated in each of the six possible reading frames, the predicted proteins exist only as possibilities and are not necessarily functional, so they may contain hypothetical parts that the real proteins in the database do not have. The alignment therefore shows lower coverage overall. This is consistent with the coding region later predicted for A1 (positions 187 to 1158, 972 bp): the flanking regions would have no protein counterpart.

Figure 1.1.7 Graphical view of the sequence alignments obtained with blastx.

Only the first eight sequences have E value 0.0 (Figure 1.1.8). The difference in E value coincides with alignments of lower coverage and lower score. The sequence with the highest score and E value 0.0 corresponds to the same gene found as the first blastn result.

Figure 1.1.8 Results screen of blastx.

Figure 1.1.9 Part of the best-match alignment, showing the predicted protein versus the protein from the databases searched.
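The six-frame translation that blastx applies to the query can be sketched as follows. This is a minimal illustration using the standard genetic code, not the BLAST implementation; the names `translate` and `six_frame_translations` are hypothetical, and any codon containing a non-ACGT character is rendered as `X`.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate a DNA sequence codon by codon; '*' marks a stop codon."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq: str) -> dict:
    """Translate both strands at offsets 0, 1 and 2 (frames +1..+3, -1..-3)."""
    frames = {}
    for sign, strand in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{sign}{offset + 1}"] = translate(strand[offset:])
    return frames


if __name__ == "__main__":
    # First codons of the A1 coding region (positions 187 onward).
    for name, protein in six_frame_translations("ATGGCTGAGGGCGAGCGG").items():
        print(name, protein)
```

Frames that contain many `*` characters correspond to the stop-rich translations mentioned in the text, which is why only some of the six translations yield meaningful database hits.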
The tblastx program translates the given nucleotide sequence in all six possible reading frames and compares the result with the proteins obtained by the same process from the nucleotide sequences in its databases. This requires more processing, and the searches become slower. The first attempt to use it with A1, with Nucleotide collection (nr/nt) as the database, did not complete, and a server error message was shown. It may be a temporary error, but it recurred up to the date of this report. The database was then restricted to RefSeq, which has fewer sequences, and only then were results returned (Figure 1.1.10). For the first time, regions of the alignment appeared with scores below 200, including stretches below 40 points.

Figure 1.1.10 View of the alignments obtained with tblastx.

Figure 1.1.11 Results of tblastx. The first column shows the scores and the second, the E values.

tblastx therefore ends up finding more distant relationships between sequences. Fewer human sequences appear, and evolutionarily more distant animals have E value 0.0, which did not happen in the blastx mode (Figure 1.1.12). Once again the first sequence listed, with the best alignment results, is that of the human gene for the phosphatase (Figure 1.1.12).

Figure 1.1.12 Part of the alignment of the first sequence found by tblastx.

The blastx and tblastx programs translate the nucleotides in all six reading frames, but it is unlikely that each frame yields a different protein. In some frames many stop codons appear, so no gene can be determined. The program pDRAW32 deduces the ORFs (Open Reading Frames, portions of the genome that can encode a protein) of the sequence (Figure 1.1.13). Two possible ORFs can be seen in A1, one oriented in the 5' to 3' direction (named ORF1 here, simply to tell them apart) and another in the 3' to 5' direction (ORF2).

Figure 1.1.13 Screen of the pDRAW32 software with the analysis of sequence A1.
The black arrows indicate the size and orientation of the predicted ORFs.

The predicted translation of the ORFs into proteins is shown in Figure 1.1.14. These two proteins were later used in the blastp program.

Figure 1.1.14 Proteins obtained from the ORFs.

A similar result was obtained with the NCBI ORF Finder tool. The length-threshold parameter was changed from 100 to 300. ORF Finder offers some interesting features. By default it shows proteins starting at the ATG codon, but it is possible to view a protein starting at CTG (button Alternative Initiation Codons, Figure 1.1.15). Accepting the default sequence, the selected ORF and its translated protein can be viewed in GenPept format (button View, Figure 1.1.16). From the results screen it is also possible to run a blastp search, choosing the database and, if the option with parameters is selected, adjusting the scoring matrix, the number of target sequences, and so on.

Figure 1.1.15 Result obtained with ORF Finder, searching for ORFs with a 300-base-pair length threshold and the ATG codon as the start of the sequence.

LOCUS       A1                      1628 bp            linear       10-MAR-2010
DEFINITION  No definition line found.
ACCESSION   A1
VERSION
KEYWORDS    .
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
FEATURES             Location/Qualifiers
     source          1..1628
     CDS             187..1158,73..1158
                     /note="predicted coding region"
                     /translation="MAEGERQPPPDSSEEAPPATQNFIIPKKEIHTVPDMGKWKRSQA
                     YADYIGFILTLNEGVKGKKLTFEYRVSEAIEKLVALLNTLDRWIDETPPVDQPSRFGN
                     KAYRTWYAKLDEEAENLVATVVPTHLAAAVPEVAVYLKESVGNSTRIDYGTGHEAAFA
                     AFLCCLCKIGVLRVDDQIAIVFKVFNRYLEVMRKLQKTYRMEPAGSQGVWGLDDFQFL
                     PFIWGSSQLIDHPYLEPRHFVDEKAVNENHKDYMFLECILFITEMKTGPFAEHSNQLW
                     NISAVPSWSKVNQGLIRMYKAECLEKFPVIQHFKFGSLLPIHPVTLG"
BASE COUNT      338 a    441 c    487 g    362 t
ORIGIN
        1 accgacatgg cggccgtctt cgctgtggtg actttaactc tcggttttcg gttatagccg
       61 gccggcgctc acttgtcttc aggaagctcg gagcctttgg tggagccggg gagaggaagg
      121 gtgggtgcaa gagtgaaagg cgagagggga ctgcaagcat ccgggtcggc tcctggccgg
      181 agcaagatgg ctgagggcga gcggcagccg ccgccagatt cttcagagga ggcccctcca
      241 gccactcaga acttcatcat tccaaaaaag gagatccaca cagttccaga catgggcaaa
      301 tggaagcgtt ctcaggcata cgctgactac atcggattca tcctcaccct caacgaaggt
      361 gtgaagggga agaagctgac cttcgagtac agagtctccg aggccattga gaaactagtc
      421 gctcttctca acacgctgga caggtggatt gatgagactc ctccagtgga ccagccctct
      481 cggtttggga ataaggcata caggacctgg tatgccaaac ttgatgagga agcagaaaac
      541 ttggtggcca cagtggtccc tacccatctg gcagccgctg tgcctgaggt ggctgtttac
      601 ctaaaggagt cagtggggaa ctccacgcgc attgactacg gcacagggca tgaggcagcc
      661 ttcgctgctt tcctctgctg tctctgcaag attggggtgc tccgggtgga tgaccaaata
      721 gctattgtct tcaaggtgtt caatcggtac cttgaggtta tgcggaaact ccagaaaaca
      781 tacaggatgg agccagccgg cagccaggga gtgtggggtc tggatgactt ccagtttctg
      841 cccttcatct ggggcagttc gcagctgata gaccacccat acctggagcc cagacacttt
      901 gtggatgaga aggccgtgaa tgagaaccac aaggactaca tgttcctgga gtgtatcctg
      961 tttattaccg agatgaagac tggcccattt gcagagcact ccaaccagct gtggaacatc
     1021 agcgccgtcc cttcctggtc caaagtgaac cagggtctca tccgcatgta taaggccgag
     1081 tgcctggaga agttccctgt gatccagcac ttcaagttcg ggagcctgct gcccatccat
     1141 cctgtcacgt tgggctagga ggggccaagc cgaagagcca cccaggccac agttcctgtg
     1201 cctgccttcc ccaccccagc agtggcccct ccccatcccc tccctctgtt cgtcccgttt
     1261 gatgagaggc tgtttactgg ggtggggtgg cgagatgggc ttgagggggc tcagagcata
     1321 aggcttcagg gcccaagttg ggagaagtga ccaaagtgta gccagttttc tgagttcccg
     1381 tgtgctagac tggccagaag agagggtctg gggcctggtc actcggccac tctctcctgt
     1441 ttctggcctc ttctcccttc actcccgtcc agtctggttt tgagagcagg ggctgttctg
     1501 cagcaccgca gggaagggag gagagatacc tgctgcttcc attgcttttc ccttcctgga
     1561 gtcgatgcct ttctaagggt tggagctgct ccttgcaggg gcgggtcagt ttcccaggcc
     1621 atgccggg
//

Figure 1.1.16 Base sequence and sequence of the protein translated from ORF1.

The proteins obtained with the two programs were the same for their respective ORFs. Using the two protein sequences, another BLAST mode was run, blastp, which compares a given protein sequence against the protein database. The BLOSUM62 substitution matrix was used again, against 20000 target sequences. For identification, Protein1 denotes the protein predicted from ORF1 and Protein2 the one corresponding to ORF2. Figure 1.1.17 shows part of the blastp results screen for Protein1. First, the domain is displayed (a domain is a compact structural element of a protein that is stable on its own) [1].

Figure 1.1.17 Domain of the PTPA superfamily found by blastp.

Coverage of the 323 amino acids is almost complete, and again many sequences with E value 0.0 are found (Figures 1.1.18 and 1.1.19).

Figure 1.1.18 Distribution of the alignments in blastp.

Figure 1.1.19 Results screen of blastp.

Figure 1.1.20 Complete alignment of the Protein1 sequence with the phosphatase 2A protein.

The blastp results for Protein2, translated from ORF2, are quite different from the previous ones. The alignments have low scores, no E value equal to zero, and low identity (Figures 1.1.21 and 1.1.22). The first related sequence is from an organism evolutionarily distant from humans (Clavispora lusitaniae, a fungus; Figure 1.1.23). Biologically, it is not common in eukaryotes to find two genes in the same mRNA sequence that are both able to encode proteins.
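The ORF prediction step described above can be approximated with a simple scan of both strands. The sketch below is an illustration only, not the algorithm used by pDRAW32 or ORF Finder: it assumes ORFs start at ATG, end at the first in-frame stop codon, and are reported when their length in nucleotides (stop codon included) reaches a threshold. The function name `find_orfs` and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_len: int = 300) -> list:
    """Find ATG...stop ORFs of at least min_len nucleotides on both strands.

    Returns tuples (strand, frame, start, end, length) with 1-based
    coordinates given on the forward strand; length includes the stop codon.
    """
    seq = seq.upper()
    total = len(seq)
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, total - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    length = i + 3 - start
                    if length >= min_len:
                        if strand == "+":
                            first, last = start + 1, i + 3
                        else:
                            # Map reverse-strand indices back to the forward strand.
                            first, last = total - (i + 3) + 1, total - start
                        orfs.append((strand, frame + 1, first, last, length))
                    start = None                   # look for the next ORF
    return orfs


if __name__ == "__main__":
    toy = "CCATGGCTGCTGCTTAGG"   # hypothetical toy sequence with one short ORF
    print(find_orfs(toy, min_len=9))
```

A scan like this reports every sufficiently long ATG-to-stop stretch, on either strand, whether or not it is a real gene. That is why a prediction such as Protein2 needs to be checked against curated databases before it is accepted.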
Although Protein2 was predicted by both pDRAW32 and ORF Finder, the prediction is not supported by the blastp searches against a RefSeq database (the Reference Sequence collection, a set of reliable, curated and comprehensive sequences including genomic DNA, RNA and proteins). The conclusion is that Protein2 should be disregarded.

Figure 1.1.21 Distribution of the alignments for Protein2 in blastp.

Figure 1.1.22 Results obtained with blastp for Protein2.

Figure 1.1.23 Alignment of the predicted protein with a hypothetical protein of the fungus C. lusitaniae.

Finally, Protein1 was searched with tblastn, which takes a protein sequence and compares it with sequences translated from the nucleotides in its databases. BLOSUM62 was used as the substitution matrix. The result is shown in Figures 1.1.24, 1.1.25 and 1.1.26. The data are very similar to those found with blastp, that is, high coverage and E values of 0.0.

Figure 1.1.24 Alignments from tblastn.

Figure 1.1.25 Results from tblastn, sorted by E value.

Figure 1.1.26 Alignment of the first sequence found by tblastn. The Gene ID highlighted in green is the same one found in the previous searches.

From all the searches performed, using databases that include RefSeq, the conclusion is that the gene in sequence A1 comes from human mRNA and encodes the phosphatase 2A protein. This gene is found in the NCBI Gene database under the identifier 5524. Since in every search this same Gene ID always appeared in first place (except for Protein2), with 99% or 100% identity, E value 0.0 and the highest score, its description in the Gene database was reached by following its own link (http.ncbi.nlm.nih.govsitesentre dbgenecmdsearchterm554DT59H7014loggeneexplicitnuclblast_ran 1).

Three freely available articles related to this gene can be read online at the following addresses:

Peter P. Ruvolo, Xingming Deng, Takahiko Ito, Boyd K. Carr and W. Stratford May. Ceramide Induces Bcl2 Dephosphorylation via a Mechanism Involving Mitochondrial PP2A. J. Biol. Chem. 1999 274: 20296-20300. doi:10.1074/jbc.274.29.20296
Full text: http://www.jbc.org/content/274/29/20296.full

Soner Altiok, Min Xu and Bruce M. Spiegelman. PPARγ induces cell cycle withdrawal: inhibition of E2F/DP DNA-binding activity via down-regulation of PP2A. Genes Dev. August 1, 1997 11: 1987-1998. doi:10.1101/gad.11.15.1987
Full text: http://genesdev.cshlp.org/content/11/15/1987.full

R Tournebize, S S Andersen, F Verde, M Dorée, E Karsenti, and A A Hyman. Distinct roles of PP1 and PP2A-like phosphatases in control of microtubule dynamics during mitosis. EMBO J. 1997 September 15; 16(18): 5537-5549. doi:10.1093/emboj/16.18.5537. PMCID: PMC1170186
Full text: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1170186/pdf/005537.pdf

Part 1.2 - GC content and intronic and exonic regions

The percentage of GC content for sequence A1, calculated by the GENSCAN software, is 5.97% (Figure 1.2.1).

Figure 1.2.1 Percentage of GC content according to GENSCAN.

The pDRAW32 program provides a map of the sequence indicating the GC percentage of each stretch (Figure 1.2.2). Comparing it with the exonic regions, that is, with ORF1, the one that encodes the phosphatase 2A protein, the legend colors show a GC content between 90% and 50%. Regions with high GC content tend to have high gene density.

Figure 1.2.2 pDRAW32 screen showing the legend for GC content.

The last analysis of sequence A1 was carried out with the online tool Genome Browser. The details shown in Figure 1.2.3 belong to the sequence found with the highest similarity. The graph shows the variation of GC content along the sequence. Once again the match presented is the gene for human phosphatase 2A, and the tool also shows that this gene is part of human chromosome 9.

Figure 1.2.3 Genome Browser analysis of sequence A1.

Part 2 - Sequence GI 289450895

The sequence with GI 289450895 was looked up in the NCBI Nucleotide database, and the corresponding sequence was saved in FASTA format (Annex 1).
It is a BAC (Bacterial Artificial Chromosome), a plasmid carrying foreign DNA, in this case a fragment of the human genome. The first search was made with the Softberry application, using the FGENESH tool (Annex 2). Figure 2.1 gives an overview of the number of genes and exons found.

Figure 2.1 First part of the analysis performed by Softberry. The legend shows the number of predicted genes and exons.

With another tool, GENSCAN, the result differed in the number of genes and exons found. The output is shown graphically in Figure 2.2.

Figure 2.2 GENSCAN results. In total, 13 genes were predicted (Annex 3).

The search in ORF Finder with this sequence returned 43 possible genes (Figure 2.3), which is highly unlikely. Given such differences, how can the number of genes be determined? One alternative is to run BLAST searches with the predicted genes and proteins. Many of the sequences will return no significant result, or will show low correlation with the database sequences, which indicates a prediction not supported by RefSeq data, for example. Among so many nucleotide pairs, the chance of finding, at random, sequences that start with ATG and end with some stop codon is high, even if such a protein has never actually been synthesized by the human organism. Since the algorithms used by each program differ, the results obtained are different.

REFERENCES

1. Lesk, Arthur M. Introdução à Bioinformática - 2. ed. - Porto Alegre: Artmed, 2008.
2. Lesk, Arthur M. Introduction to Genomics. Oxford University Press, 2007.

FGENESH 2.6 Prediction of potential genes in Homo_sapiens genomic DNA
Seq name: test sequence
Length of sequence: 172772
Number of predicted genes 9: in +chain 6, in -chain 3.
Number of predicted exons 53: in +chain 38, in -chain 15.
Positions of predicted genes and exons: Variant 1 from 1, Score: 564.839160

Feature codes: CDSf (first coding exon), CDSi (internal coding exon), CDSl (last coding exon), CDSo (single-exon gene), PolA (polyadenylation signal), TSS (transcription start site).

 G S  No Feat  Start -    End   Score ORF start - ORF end   Len
 1 +     TSS   16602            -5.69
 1 +   1 CDSf  16676 -  16680    6.80     16676 -   16678     3
 1 +   2 CDSl  22253 -  22400    8.80     22254 -   22400   147
 1 +     PolA  23029            -1.07

 2 +     TSS   24909            -6.79
 2 +   1 CDSf  25535 -  25651    2.10     25535 -   25651   117
 2 +   2 CDSi  39151 -  39255    9.80     39151 -   39255   105
 2 +   3 CDSi  55474 -  55609    4.56     55474 -   55608   135
 2 +   4 CDSi  56061 -  56167    8.97     56063 -   56167   105
 2 +   5 CDSi  56475 -  56597   11.32     56475 -   56597   123
 2 +   6 CDSi  58519 -  58621   12.50     58519 -   58620   102
 2 +   7 CDSi  66323 -  66506    6.54     66325 -   66504   180
 2 +   8 CDSi  67209 -  67317   18.88     67210 -   67317   108
 2 +   9 CDSi  68082 -  68221   12.73     68082 -   68219   138
 2 +  10 CDSi  69249 -  69330    3.06     69250 -   69330    81
 2 +  11 CDSi  71040 -  71131    5.68     71040 -   71129    90
 2 +  12 CDSi  72005 -  72104   11.39     72006 -   72104    99
 2 +  13 CDSi  73072 -  73300   22.19     73072 -   73299   228
 2 +  14 CDSi  74834 -  74938    6.89     74836 -   74937   102
 2 +  15 CDSi  75394 -  75490    3.26     75396 -   75488    93
 2 +  16 CDSi  77055 -  77162    0.17     77056 -   77160   105
 2 +  17 CDSi  79581 -  79677    3.68     79582 -   79677    96
 2 +  18 CDSi  79924 -  80031    4.39     79924 -   80031   108
 2 +  19 CDSi  80967 -  81080   13.03     80967 -   81080   114
 2 +  20 CDSi  81426 -  81560   12.87     81426 -   81560   135
 2 +  21 CDSi  82740 -  82900   17.62     82740 -   82898   159
 2 +  22 CDSi  83493 -  83634   13.80     83494 -   83634   141
 2 +  23 CDSi  84601 -  84687   11.45     84601 -   84687    87
 2 +  24 CDSi  87824 -  87934   10.73     87824 -   87934   111
 2 +  25 CDSi  89109 -  89232   18.70     89109 -   89231   123
 2 +  26 CDSl 103238 - 103536    0.58    103240 -  103536   297
 2 +     PolA 103558             1.12

 3 -     PolA 107766             1.12
 3 -   1 CDSl 107955 - 108051    4.97    107955 -  108050    96
 3 -   2 CDSi 108287 - 108417    1.85    108289 -  108417   129
 3 -   3 CDSi 111862 - 112032    7.43    111862 -  112032   171
 3 -   4 CDSi 112851 - 113006   17.41    112851 -  113006   156
 3 -   5 CDSf 117319 - 117489   12.30    117319 -  117489   171
 3 -     TSS  119946            -8.69

 4 +     TSS  120441            -8.59
 4 +   1 CDSo 121985 - 122689   21.67    121985 -  122689   705
 4 +     PolA 122767            -5.58

 5 +     TSS  122851           -12.79
 5 +   1 CDSf 122914 - 123180   17.94    122914 -  123180   267
 5 +   2 CDSi 123854 - 124427   12.76    123854 -  124426   573
 5 +   3 CDSi 124962 - 125621   15.92    124964 -  125620   657
 5 +   4 CDSi 126881 - 126978    7.87    126883 -  126978    96
 5 +   5 CDSl 127716 - 128072   -3.82    127716 -  128072   357
 5 +     PolA 128697             1.12

 6 -     PolA 136089             1.12
 6 -   1 CDSl 136289 - 136388    1.07    136289 -  136387    99
 6 -   2 CDSi 143579 - 143694   -3.28    143581 -  143694   114
 6 -   3 CDSi 145710 - 145802    6.49    145710 -  145802    93
 6 -   4 CDSi 146768 - 146791    2.75    146768 -  146791    24
 6 -   5 CDSi 147026 - 147121    3.46    147026 -  147121    96
 6 -   6 CDSi 148162 - 151153  207.93    148162 -  151152  2991
 6 -   7 CDSf 154219 - 154298    1.39    154221 -  154298    78
 6 -     TSS  154439            -5.79

 7 +     TSS  155933            -6.59
 7 +   1 CDSo 158588 - 158863   20.50    158588 -  158863   276
 7 +     PolA 159282             1.12

 8 +     TSS  159821            -7.29
 8 +   1 CDSf 160522 - 161172   19.33    160522 -  161172   651
 8 +   2 CDSi 161236 - 161943    4.69    161236 -  161943   708
 8 +   3 CDSl 161999 - 162727   14.76    161999 -  162727   729
 8 +     PolA 162848            -1.07

 9 -     PolA 163073             1.12
 9 -   1 CDSl 164379 - 164397   -5.66    164379 -  164396    18
 9 -   2 CDSi 164665 - 164795   14.56    164667 -  164795   129
 9 -   3 CDSi 170771 - 170878   11.46    170771 -  170878   108

Predicted protein(s):
>FGENESH:[mRNA] 1 2 exon (s) 16676 - 22400 153 bp, chain +
ATGAAGGAGAAGTTGGAAAAGGATGCAGATCTGGATGGTGTTTTTGCCTGCAGGGAGAAG
TCGGAAAAGGATGCAGATCTGGATGGTGTTTTTGCCTGCAGGGAGAAGTTGGAAAAGGAT
GCAGATCTGGATGGTGTTTCAACAGGGAAATAA
>FGENESH: 1 2 exon (s) 16676 - 22400 50 aa, chain +
MKEKLEKDADLDGVFACREKSEKDADLDGVFACREKLEKDADLDGVSTGK
>FGENESH:[mRNA] 2 26 exon (s)
25535 - 103536 3315 bp, chain + ATGTTTCTTGGTGGTGGGAGCCCAGGGTTAGTTAGAAGCAGTGAGGGAAGTGTCCCTGAG AGAATAAACGTCCACGGAATGTTAGCAGAACCTTCTTCTCTGGTAGCTTATGGTCAGGTT ATTACACCACAACGAAAAATCACTCTGGCTGCACCCAACCGGAAAGACATGGAAGAATGG ATTAACATCATAAAAACCATCCAACAGGGAGAAATTTATAAGATACCTGCAGCAGAAAAC AACCCTTTTCTTGTTGGAATGCATTGTTGGTACTCCAGTTACAGCCACCGGACCCAGCAC TGCAATGTTTGTCGAGAGAGCATTCCTGCCTTATCTAGAGATGCCATCATCTGTGAAGTG TGCAAAGTGAAATCTCACAGATTGTGTGCTTTGAGAGCAAGCAAAGACTGCAAGTGGAAT ACATTGTCTATCACTGATGACCTCCTTCTGCCTGCAGATGAAGTAAACATGCCCCATCAA TGGGTAGAAGGAAACATGCCTGTCAGCTCTCAGTGTGCAGTGTGTCATGAGAGCTGTGGC AGTTATCAAAGACTTCAAGACTTCCGCTGCCTGTGGTGTAATTCTACGGTGCATGATGAC TGTAGGAGACGGTTTTCCAAGGAATGTTGCTTCAGAAGCCATCGCTCATCAGTCATTCCT CCCACTGCTCTAAGCGACCCCAAAGGCGATGGCCAATTAGTAGTATCTTCAGACTTCTGG AATCTTGATTGGTCATCAGCCTGTTCATGTCCCTTGCTCATCTTCATCAACTCCAAAAGT GGCGATCATCAGGGGATCGTCTTCCTCCGAAAATTCAAGCAATACCTTAACCCATCTCAA GTGTTCGACTTATTGAAGGGTGGACCTGAAGCAGGGCTGTCTATGTTCAAGAACTTTGCT CGCTTTCGCATTCTGGTTTGTGGTGGAGATGGCAGCGTGAGCTGGGTCTTATCTCTGATT GATGCCTTTGGATTACATGAAAAGTGTCAGTTGGCAGTCATCCCACTTGGAACCGGCAAT GATCTGGCTCGTGTTCTGGGCTGGGGTGCATTCTGGAACAAAAGCAAGTCACCTCTGGAC ATCCTCAACAGAGTGGAGCAGGCTAGTGTGAGGATCCTAGACAGATGGAGTGTGATGATT CGTGAGACTCCCAGACAAACCCCGCTGCTAAAAGGACAGGTTGAAATGGATGTACCACGA TTTGAGGCTGCTGCCATCCAACACTTAGAATCTGCAGCCACCGAGTTGAACAAAATCCTG AAGGCCAAGTACCCCACAGAGATGATCATCGCAACCAGATTCTTGTGTTCAGCTGTGGAA GATTTTGTGGTTGATATTGTAAAGGCCTGGGGTCAGATAAAACAGAACAACACTGCAATA GTGTCTGTGATTTTGAAAAGTGACTTAATGTATGATAGGCTCAGTGTCCTGATCGATGTC CTGGCTGAGGAGGCAGCAGCTACTTCTGCTGAAAAAAGTGCTACAGAATATGCAGACAGC AGCAAGGCAGATAGGAAGCCCTTCATTCCTCAAATAGACCACATAGCCAAGTGCAAGTTG GAGCTGGCTACAAAGGCCCAGAGTCTCCAGAAATCCTTGAAACTCATCATATTTCAAGTT GAACAAGCTCTGGATGAGGAAAGCAGACAGACAATATCTGTTAAGAACTTTAGTTCAACT TTCTTCCTGGAAGATGACCCAGAAGATATTAACCAGACAAGCCCACGACGCCGTTCTCGT CGTGGCACTTTGTCTTCTATATCTTCTCTCAAAAGTGAGGACCTGGACAACCTTAACTTG GATCACTTACATTTTACACCTGAATCTATACGCTTCAAAGAAAAATGTGTCATGAACAAC TACTTCGGAATTGGACTGGATGCTAAAATTTCTCTGGACTTCAACACCAGAAGAGATGAA 
CACCCAGGGCAATACAATAGCCGCCTTAAGAACAAGATGTGGTATGGCCTTCTGGGAACC AAAGAACTTTTGCAGCGCTCTTACAGGAAACTGGAAGAACGAGTGCATTTGGAGTGTGAT GGAGAAACCATCTCCTTGCCAAACCTGCAAGGCATTGTAGTGCTCAACATTACCAGCTAT GCTGGAGGTATCAACTTCTGGGGAAGCAACACAGCAACCACGGAATATGAGGCTCCTGCA ATCGATGATGGGAAACTGGAGGTGGTGGCAATCTTTGGTTCTGTGCAGATGGCAATGTCC CGTATCATCAACCTGCATCATCATCGCATTGCCCAGTGCCATGAGGTGATGATAACCATT GATGGTGAAGAAGGTATCCCAGTGCAGGTGGATGGGGAGGCCTGGATTCAGAGACCAGGC CTTATCAAAATTAGATACAAGAACGCTGCCCAGATGCTGACAAGAGATCGGGACTTTGAG AACTCAATGAAAATGTGGGAATACAAGCATACTGAAATTCAAGCTGCCCCTCAACCCCAG CTGGACTTCCAGGACTCTCAAGAGAGCCTCTCTGACGAGGAGTATGCCCAGATGCAGCAC TTAGCTCGGCTTGCAGAAAACCTCATCAGCAAACTTAATGACCTGAGCAAGATCCACCAG CATGTGTCTGTCCTCATGGGTTCTGTGAATGCCAGCGCTAACATCCTGAATGATATATTT TACGGCCAAGACAGTGGCAATGAGATGGGTGCAGCTTCCTGTATTCCCATTGAAACTCTA AGCAGAAATGATGCCGTAGATGTTACATTTAGTCTTAAAGGTCTCTACGATGACACCACA GCTTTCCTGGATGAAAAGCTGCTGAGAAGTGCTGAGGATGAGACTGCACTACAAAGCGCC CTGGATGCCATGAATAAGGAGTTCAAAAAGCTATCTGAGATTGACTGGATGAATCCAATC TTTGTTCCAGAGGAAAAATCTTCGGACACTGACAGTAGAAGCCTCAGGCTGAAAATTAAG TTCCCCAAATTGGGAAAGAAAAAGGTAGAAGAGGAACGCAAGCCTAAATCAGGCCAGAGT GTCCAGAGTTTTATTGGACTAGAAAACACAATCCTAAAATTCATATGGAACCAAAAAAGA GCCCGCATAGCCAAAGCAAGACTAAGCAAAAAGAACAAATCTGGAGGCATCACATTACTT GACTTCAAACCATACTATAAGGGTATAGTTACCAAAACAGGATGGTACTGGTATAAAAAT AGGCACATAGACCAATGGAACAGAATAGAGAACCCAGAAATAAAGCCAAATATGTACAGC CAACTGATCTTCAACAAAGCAAACAAAAACAAAGTGGAAAAAGAAAACCCTATTCAACAA ATGGTGTTGGGATAA >FGENESH: 2 26 exon (s) 25535 - 103536 1104 aa, chain + MFLGGGSPGLVRSSEGSVPERINVHGMLAEPSSLVAYGQVITPQRKITLAAPNRKDMEEW INIIKTIQQGEIYKIPAAENNPFLVGMHCWYSSYSHRTQHCNVCRESIPALSRDAIICEV CKVKSHRLCALRASKDCKWNTLSITDDLLLPADEVNMPHQWVEGNMPVSSQCAVCHESCG SYQRLQDFRCLWCNSTVHDDCRRRFSKECCFRSHRSSVIPPTALSDPKGDGQLVVSSDFW NLDWSSACSCPLLIFINSKSGDHQGIVFLRKFKQYLNPSQVFDLLKGGPEAGLSMFKNFA RFRILVCGGDGSVSWVLSLIDAFGLHEKCQLAVIPLGTGNDLARVLGWGAFWNKSKSPLD ILNRVEQASVRILDRWSVMIRETPRQTPLLKGQVEMDVPRFEAAAIQHLESAATELNKIL KAKYPTEMIIATRFLCSAVEDFVVDIVKAWGQIKQNNTAIVSVILKSDLMYDRLSVLIDV 
LAEEAAATSAEKSATEYADSSKADRKPFIPQIDHIAKCKLELATKAQSLQKSLKLIIFQV EQALDEESRQTISVKNFSSTFFLEDDPEDINQTSPRRRSRRGTLSSISSLKSEDLDNLNL DHLHFTPESIRFKEKCVMNNYFGIGLDAKISLDFNTRRDEHPGQYNSRLKNKMWYGLLGT KELLQRSYRKLEERVHLECDGETISLPNLQGIVVLNITSYAGGINFWGSNTATTEYEAPA IDDGKLEVVAIFGSVQMAMSRIINLHHHRIAQCHEVMITIDGEEGIPVQVDGEAWIQRPG LIKIRYKNAAQMLTRDRDFENSMKMWEYKHTEIQAAPQPQLDFQDSQESLSDEEYAQMQH LARLAENLISKLNDLSKIHQHVSVLMGSVNASANILNDIFYGQDSGNEMGAASCIPIETL SRNDAVDVTFSLKGLYDDTTAFLDEKLLRSAEDETALQSALDAMNKEFKKLSEIDWMNPI FVPEEKSSDTDSRSLRLKIKFPKLGKKKVEEERKPKSGQSVQSFIGLENTILKFIWNQKR ARIAKARLSKKNKSGGITLLDFKPYYKGIVTKTGWYWYKNRHIDQWNRIENPEIKPNMYS QLIFNKANKNKVEKENPIQQMVLG >FGENESH:[mRNA] 3 5 exon (s) 107955 - 117489 726 bp, chain - ATGCAGATCTGTGTCGTGTGTGTGTTCCTCCAGGTGTCCTTTGAGATGACCCATGAGACC CTGTACTTGGCAGTGAAGCTGGTGGATCTCTACCTAATGAAGGCAGTATGCAAGAAGGAT AAGTTACAACTCCTTGGTGCCACTGCCTTTATGATTGCAGCAAAATTTGAGGAGCACAAC TCACCTCGTGTGGATGACTTTGTGTACATCTGTGATGATAATTATCAGCGATCTGAGGTA CTCAGCATGGAAATCAACATCCTGAACGTCCTCAAATGTGACATTAACATTCCCATCGCC TACCATTTTCTGCGCAGATATGCTAGGTGTATCCACACCAACATGAAGACACTGACCTTG TCCCGCTACATCTGCGAGATGACCCTGCAGGAATACCACTATGTCCAGGAGAAGGCTTCC AAGCTAGCTGCTGCCTCCTTACTCCTGGCCCTCTACATGAAGAAGCTCGGATACTGGGTA AACACTTGCGAGATAGGGGTTCCCTTCCTGGAGCATTACAGTGGCTACAGTATCTCTGAG CTTCACCCCTTGGTCAGACAGCTGAACAAACTGCTGACTTTCAGTTCTTACGATAGTCTC AAGGCTGTGTATTACAAGTATTCTCACCCGGTCTTCTTTGAAGTCGCCAAAATCCCTGCC TTGGATATGTTGAAGCTGGAGGAGATTTTGAACTGTGATTGTGAGGCTCAGGGCCTGGTA CTCTAG >FGENESH: 3 5 exon (s) 107955 - 117489 241 aa, chain - MQICVVCVFLQVSFEMTHETLYLAVKLVDLYLMKAVCKKDKLQLLGATAFMIAAKFEEHN SPRVDDFVYICDDNYQRSEVLSMEINILNVLKCDINIPIAYHFLRRYARCIHTNMKTLTL SRYICEMTLQEYHYVQEKASKLAAASLLLALYMKKLGYWVNTCEIGVPFLEHYSGYSISE LHPLVRQLNKLLTFSSYDSLKAVYYKYSHPVFFEVAKIPALDMLKLEEILNCDCEAQGLV L >FGENESH:[mRNA] 4 1 exon (s) 121985 - 122689 705 bp, chain + ATGTTGAAAAATTTCAAAAAGGGATTTAAGGGAGACTATGGAGTTACTATGACACCAGGA AAACTTAGAACTTTGTGTGAGATAGACTGGCCAGCATTAGAGGTGGGTTGGCCATCAGAA GGAAGCCTGGACAGGTCCCTTGTCTCGAAGGTATGGCACAGGGGAACCTGTAAGCCAGGG 
CACCCAGATCAGTTCCCGTATATAGATTCTTGGTTACAGCTAGTTTTGGATCCCCCACAG TGGTTAAGAGGACAGGCAGCAGCAGTACTAGTAGCAAAGGGACAGTTAGTTAAGGAAGGC TGTCACTCCACCCACCGAGGGAAGTCGGCACCAAAAGTCCTGTCCGAGCCAACACCGGAA GAATCATGGCAGGAATTGGTACCAGCAGTGCCTCCTCCTTATCGAGAGGAAGGGCTCCCC ATTCCTGAGCCCACAGCACCTCCACTTCCACCAGATATCCATACTCCTAGACCACCCAGA GTAGACAAAAGAAGAAGTGAAGCCATGGGAGAAACTCCTCCCTTGGCAGCTCACTTACGG CCCAAGACTGAAATCCAAATGCCCGTGAGAGAACAGCAATATACTGGGGTAGATGAGGAC AGACACATGGTGAAAAGGCAGCCTTTGTGTATCAACCTTTCACCTCTGCTGACCTCAATT GGAAAAATAATACTCCAACTTACACCGAAAAGCCTCAAGCTTTAA >FGENESH: 4 1 exon (s) 121985 - 122689 234 aa, chain + MLKNFKKGFKGDYGVTMTPGKLRTLCEIDWPALEVGWPSEGSLDRSLVSKVWHRGTCKPG HPDQFPYIDSWLQLVLDPPQWLRGQAAAVLVAKGQLVKEGCHSTHRGKSAPKVLSEPTPE ESWQELVPAVPPPYREEGLPIPEPTAPPLPPDIHTPRPPRVDKRRSEAMGETPPLAAHLR PKTEIQMPVREQQYTGVDEDRHMVKRQPLCINLSPLLTSIGKIILQLTPKSLKL >FGENESH:[mRNA] 5 5 exon (s) 122914 - 128072 1956 bp, chain + ATGGAGAGGCTAAGACAGTACCATGCGGCATGGATAGAAGGTCTAAAGAAAGAGGCTCAA AAGGATACAAATGTAAATAAGGTCTCTGAGGTCATCCAAGGAAAAGAGGAGAGTCCAGCG CAATTCTATGAACGACTGTGTGAGGCTTACCGTATGTACACTCCTTTTGATCCAGATAGT CCTGAAAATCAGAGAATGATTAATATGGCCTTAGTCAAAGTGTGGAAGATATCAGGAGAA AATTGCAGAAACAGGCTGGGTTTGCAGGGCTCTTTACAGCTAAACTTACCAGAAACAGGA GTTATCATGGCTCTTATGGTCACCAGGGAAAAAGAATGGAGACTTTTTCTGACCGAGCCA GGCCAAGAGATAAAACCAGCTCTAGCTAAGCGATGGCCCCGAATATGGGCGGATGATAAT CCTCCGGGACTGGTGGTCAACCAAGACCCTGTACTCATAGAAGTTAAGCCTGGGGCCCAG CCAATTAGACAAAAGCAGTATCCGGTTCCCAGAGAAGCTCTCGAAGGAATCCAGGTTCAT CTCAGGCACTTGAAAGCCTTTGGAATTATAGTTCCTTGCCAGTCTCCATGGAACCTCCCC TTCCTCCCTGTCCCTAAGCCAGGGACCAAGGACTACCAGCCAGTACAGGACTTGTGCTTG GTCAACCAAGCTACAGTGACTCTGCACCCAACAGTTCCTAACCTTTACACATTGTTAGGG CTGCTGCAGGCTGAGGACAGCTGGTTTACCTGTCTGGACTTAAAAGATGCCTTCTTTAGC ATCAGACTAGCTCCTGAAAGCCAGAAGCTGTTTGCCTTTCAGTGGGAAGATCCGGAGTCA GTCCCAGGACTACCAGATTTGACAAAGCCCTTTACACTCTATGTGTCAGAAAGAGAAAAA ATGGCAGTTGGAGTTTTAACCCAGACTGTGGGGCCCTGGCCAAGGCCAGTGGTCTATCTC TCAAAACAACTAGATGGGGTTTCCAAAGTCTGGCCACCATGTCTAAGGGCCCTGGCAGCA ACAGCCCTGTTAGCACAAGAAGCAGATAAACTAACCCTTGGGCAAAACCTGAATATAAAG 
GCCCCCCATGCTGTGGTAACTTTGATGAAAACCAAAGGACATCATTGGCTAACAAATGCT AGATTAACAAAGTACCAAAGCTTGCTGTGTGAAAATCCCCACATAACCACTGAAGTCTGT AACACCCTAAATCCCGCCACCCTGCTCCTAGTATCAGAGAGCCTGGTAGAGCATAACTGT GTAGAGGTGTTGGACTCAGTTTATTCTAGCAGACCTGACCTTTGGGACCAGCCATGGGCA TCAGTAGACTGGGAGTTATACATGGACGGGAGCAGCTTCATCAACCCACAAGGAGAAAGA TGTGCAGGATATGTGATGGTAACTTTGGATGCTGTCATTGAAGCCAAACCATTGCCACAG GGCACTTCAGCCCAGAAGGCTGAGCTCATTGCTTTAACTCGGGCTCTAGAACTCAGTGAA GGTGACCACGTGTGGATGAAGGATTGGAATGTAGCCCCTTTGCGGCCACAGTGGAAAGGA CCTCAGACCGTCATCCTGACCACCCCCAAGGCTGTAAAGAAAGGAAAAAGTGGCTCTTCC TGTACTAAGGGACAATGTAACCCCTTAGAGCTGGTAATAACCAATCCCCTTGATTCTCGC TGGAAAAAAGGGGAGCGTGTGGCCTTAGGAATCAGTGGGGCCAGACTGAATCCTCGAGTA AATATCTTAGTTCGAGGAGAAGTTTACAAACGCTCTCCTGAGCCAATGTTTCAAACTTTC TATGATGAACTACATGTGCCAGTACCAGAAATTCCAGGAAAAACAAGAAATTTGTTTTTG CAATTAGCCGAGCATGTAGCCCAGTCTCTCAATGTCACTTCATGTTATGTATGTGGAGGA ATTGTAATGGGAGACCAATGGCCATGGCAAGCCTGA >FGENESH: 5 5 exon (s) 122914 - 128072 651 aa, chain + MERLRQYHAAWIEGLKKEAQKDTNVNKVSEVIQGKEESPAQFYERLCEAYRMYTPFDPDS PENQRMINMALVKVWKISGENCRNRLGLQGSLQLNLPETGVIMALMVTREKEWRLFLTEP GQEIKPALAKRWPRIWADDNPPGLVVNQDPVLIEVKPGAQPIRQKQYPVPREALEGIQVH LRHLKAFGIIVPCQSPWNLPFLPVPKPGTKDYQPVQDLCLVNQATVTLHPTVPNLYTLLG LLQAEDSWFTCLDLKDAFFSIRLAPESQKLFAFQWEDPESVPGLPDLTKPFTLYVSEREK MAVGVLTQTVGPWPRPVVYLSKQLDGVSKVWPPCLRALAATALLAQEADKLTLGQNLNIK APHAVVTLMKTKGHHWLTNARLTKYQSLLCENPHITTEVCNTLNPATLLLVSESLVEHNC VEVLDSVYSSRPDLWDQPWASVDWELYMDGSSFINPQGERCAGYVMVTLDAVIEAKPLPQ GTSAQKAELIALTRALELSEGDHVWMKDWNVAPLRPQWKGPQTVILTTPKAVKKGKSGSS CTKGQCNPLELVITNPLDSRWKKGERVALGISGARLNPRVNILVRGEVYKRSPEPMFQTF YDELHVPVPEIPGKTRNLFLQLAEHVAQSLNVTSCYVCGGIVMGDQWPWQA >FGENESH:[mRNA] 6 7 exon (s) 136289 - 154298 3501 bp, chain - ATGGATGAAGCTGGAAACCACTGTTCTCAGCAAACTATCGCAAGGACAAGAAACCAAACA CCGCATGTTCTCACTCATAGGCATAAGCTGGAAGTCACACCAGTAGTAGCCTCTACTACC GTGGTACCAAACATTATGGAGAAACCACTCATTCTAGACATATCCACCACCTCCAAAACA CCCAACACTGAGGAGGCATCTCTCTTCAGAAAGCCATTAGTTTTAAAGGAGGAACCCACT ATTGAGGATGAAACCCTTATCAATAAGTCATTATCTTTAAAAAAGTGCTCAAATCATGAG 
GAGGTGTCCTTACTGGAAAAGCTACAGCCCCTGCAGGAGGAGAGTGACAGTGATGATGCG TTTGTTATAGAGCCAATGACTTTTAAGAAGACACATAAAACTGAGGAGGCAGCCATCACC AAGAAGACATTATCCTTAAAGAAGAAGATGTGTGCAAGTCAGCGGAAGCAGTCCTGCCAG GAAGAGTCGTTGGCTGTGCAGGATGTCAATATGGAAGAGGATTCCTTCTTTATGGAGTCA ATGAGTTTTAAGAAGAAGCCTAAAACTGAGGAGTCAATCCCCACCCATAAGTTATCATCT TTAAAGAAGAAATGTACCATTTATGGGAAGATATGCCACTTTAGGAAGCCACCAGTATTG CAGACAACCATCTGTGGAGCAATGTCCTCCATTAAGAAGCCTACCACTGAGAAGGAGACA CTTTTCCAAGAGCTATCTGTATTGCAAGAGAAACACACCACTGAGCATGAGATGTCCATC TTGAAGAAATCATTGGCCTTGCAGAAGACCAACTTTAAAGAGGATTCCCTTGTTAAGGAG TCGTTAGCCTTTAAGAAGAAGCCTAGCACTGAGGAGGCAATCATGATGCCAGTAATATTG AAGGAGCAGTGCATGACTGAGGGGAAGAGGTCCCGTCTGAAGCCATTAGTATTGCAGGAG ATCACCTCTGGAGAGAAGTCGCTCATTATGAAGCCATTGTCCATTAAAGAAAAGCCATCT ACTGAGAAGGAGTCCTTTTCCCAGGAACCATCTGCATTGCAAAAGAAGCACACCACTCAG GAGGAGGTTTCCATCTTAAAGGAGCCCTCGTCCTTGCTAAAGTCTCCAACTGAGGAGTCA CCTTTTGATGAGGCTTTGGCTTTTACAAAGAAGTGTACCATTGAGGAGGCACCCCCCACC AAGAAGCCTTTAATTTTAAAGAGGAAGCATGCCACTCAGGGGACAATGTCCCACTTGAAG AAACCACTAATATTACAGACCACCTCTGGAGAAAAGTCACTTATTAAGGAGCCACTGCCC TTTAAAGAAGAAAAAGTGTCTTTAAAGAAAAAGTGTACCACACAAGAGATGATGTCCATC TGTCCAGAACTGTTGGACTTTCAGGATATGATTGGTGAAGATAAGAATTCTTTCTTTATG GAGCCAATGTCATTTAGGAAGAACCCTACAACTGAGGAGACAGTACTTACCAAGACATCG TTGTCTTTACAGGAAAAGAAAATTACTCAGGGGAAGATGTCCCACTTAAAGAAGCCACTG GTCTTGCAGAAGATCACTTCTGAGGAGGAGTCATTCTATAAGAAGCTGTTGCCCTTTAAG ATGAAATCTACAACGGAAGAAAAGTTCCTCTCCCAGGAACCATCTGCATTGAAAGAGAAG CATACCACCTTGCAGGAAGTGTCCCTCTCAAAAGAGTCATTGGCCATCCAAGAGAAGGCT ACCACTGAGGAGGAATTCTCTCAGGAACTATTTTCATTGCATGTTAAGCATACCAACAAA AGTGGGTCCCTCTTCCAGGAGGCTTTGGTCTTGCAAGAGAAGACTGATGCCGAAGAGGAT TCCTTGAAGAACTTGTTGGCTTTGCAGGAGAAAAGCACCATGGAAGAAGAGTCCCTTATC AATAAGCTATTGGCTCTGAAGGAGGAGCTTTCTGCTGAGGCAGCCACAAACATACAGACA CAATTATCTTTAAAGAAGAAGTCCACTTCTCATGGAAAAGTGTTCTTCCTGAAGAAGCAG TTGGCTTTGAATGAGACCATCAATGAAGAGGAGTTCCTTAATAAGCAGCCACTGGCCTTG GAGGGGTATCCCAGCATTGCGGAGGGGGAGACCCTCTTCAAGAAGCTTTTGGCCATGCAG GAGGAGCCCAGCATTGAGAAGGAAGCTGTCCTCAAGGAGCCCACTATTGACACAGAAGCT 
CACTTTAAGGAACCTTTGGCCTTGCAGGAGGAGCCCAGCACTGAGAAGGAGGCTGTCCTC AAGGAGCCCAGTGTTGACACAGAAGCTCACTTTAAGGAAACTTTGGCCTTGCAGGAGAAG CCCAGCATTGAGCAGGAGGCCCTCTTTAAGCGACACTCAGCTTTGTGGGAGAAGCCCAGC ACTGAGAAGGAGACCATCTTCAAGGAGTCTTTGGACTTGCAAGAGAAGCCCAGCATTAAG AAAGAGACCCTCCTCAAAAAGCCATTAGCCTTGAAGATGTCTACCATCAATGAGGCAGTC CTCTTCGAAGATATGATAGCTCTGAATGAGAAACCCACCACTGGGAAGGAGTTGTCCTTC AAGGAGCCATTAGCCTTACAAGAGAGTCCCACCTACAAGGAAGACACCTTTCTCAAAACA TTGTTGGTCCCCCAAGTTGGAACCAGCCCAAATGTGTCTAGCACTGCCCCTGAATCCATA ACCAGCAAGTCCAGCATTGCTACCATGACCAGTGTGGGCAAATCTGGTACCATCAATGAG GCATTCCTCTTCGAAGATATGATAACTCTGAATGAGAAACCCACCACTGGGAAGGAGTTG TCCTTCAAGGAGCCATTGGCCTTACAAGAGAGTCCCACCTGCAAGGAAGACACCTTTCTG GAAACATTCTTGATCCCCCAAATTGGAACCAGCCCATATGTGTTTAGCACCACCCCTGAA TCCATAACAGAGAAGTCCAGCATTGCAACCATGACCAGCGTGGGCAAGTCCAGGACCACC ACCGAGTCCAGTGCATGTGAATCTGCTTCTGATAAACCTGTCTCACCACAGGCCAAGGGA ACACCAAAGGAGATAACCCCACGGGAAGATATTGATGAGGACAGCAGTGATCCAAGTTTC AACCCAATGTATGCCAAGGAAATCTTCAGTTACATGAAAGAGAGAGAGATAGAGGAAACT GCCCAGAGTCATGAACAGTTTATACTTACAGATTACATGAACAGGCAGATTGAAATCACC AGTGACATGAGGGCCATTCTTGTGGACTGGTTGGTGGAGGTGCAGTATTCCATGGTACAG ATGTACCACGGTAATATGGTTTGGCTATGTCCACACCCAAATCTTATCTTGAATTATAGC TCCCATAATCCCCACGTGTTGTGGGAGGGACGCAGTGGGAGGGACATGGATGAAGCTGGA AACCATCGTTCTCAGCAAACTATCGCAAGGACAAAAAACCAAACACCGCATGTTCTCACT CAGGTGGGAGTTGAACAATGA >FGENESH: 6 7 exon (s) 136289 - 154298 1166 aa, chain - MDEAGNHCSQQTIARTRNQTPHVLTHRHKLEVTPVVASTTVVPNIMEKPLILDISTTSKT PNTEEASLFRKPLVLKEEPTIEDETLINKSLSLKKCSNHEEVSLLEKLQPLQEESDSDDA FVIEPMTFKKTHKTEEAAITKKTLSLKKKMCASQRKQSCQEESLAVQDVNMEEDSFFMES MSFKKKPKTEESIPTHKLSSLKKKCTIYGKICHFRKPPVLQTTICGAMSSIKKPTTEKET LFQELSVLQEKHTTEHEMSILKKSLALQKTNFKEDSLVKESLAFKKKPSTEEAIMMPVIL KEQCMTEGKRSRLKPLVLQEITSGEKSLIMKPLSIKEKPSTEKESFSQEPSALQKKHTTQ EEVSILKEPSSLLKSPTEESPFDEALAFTKKCTIEEAPPTKKPLILKRKHATQGTMSHLK KPLILQTTSGEKSLIKEPLPFKEEKVSLKKKCTTQEMMSICPELLDFQDMIGEDKNSFFM EPMSFRKNPTTEETVLTKTSLSLQEKKITQGKMSHLKKPLVLQKITSEEESFYKKLLPFK MKSTTEEKFLSQEPSALKEKHTTLQEVSLSKESLAIQEKATTEEEFSQELFSLHVKHTNK 
SGSLFQEALVLQEKTDAEEDSLKNLLALQEKSTMEEESLINKLLALKEELSAEAATNIQT QLSLKKKSTSHGKVFFLKKQLALNETINEEEFLNKQPLALEGYPSIAEGETLFKKLLAMQ EEPSIEKEAVLKEPTIDTEAHFKEPLALQEEPSTEKEAVLKEPSVDTEAHFKETLALQEK PSIEQEALFKRHSALWEKPSTEKETIFKESLDLQEKPSIKKETLLKKPLALKMSTINEAV LFEDMIALNEKPTTGKELSFKEPLALQESPTYKEDTFLKTLLVPQVGTSPNVSSTAPESI TSKSSIATMTSVGKSGTINEAFLFEDMITLNEKPTTGKELSFKEPLALQESPTCKEDTFL ETFLIPQIGTSPYVFSTTPESITEKSSIATMTSVGKSRTTTESSACESASDKPVSPQAKG TPKEITPREDIDEDSSDPSFNPMYAKEIFSYMKEREIEETAQSHEQFILTDYMNRQIEIT SDMRAILVDWLVEVQYSMVQMYHGNMVWLCPHPNLILNYSSHNPHVLWEGRSGRDMDEAG NHRSQQTIARTKNQTPHVLTQVGVEQ >FGENESH:[mRNA] 7 1 exon (s) 158588 - 158863 276 bp, chain + ATGGCAAAGAAGTTAAAAACTTTGAAAAAAAAAATAGACGAATGGCTAACTAGAATAACC AATGCAGAGAAGTCCTTAAAGGACCTGATGGAGCTGAAAACCATGGCACGAGAACTACGT GACGAATGCACAAGCCTCAGTAAACAAGGCGATCAACTGGAAGAAAGGGTATCAGCAACG GAAGACGAAATGAATGAAATGAAGCATGAAGAGAAGTTTAGAGAAAAAGAATACAAAGAA ATGAACAAAGCCTCCAAGAAATATGGGACTATGTGA >FGENESH: 7 1 exon (s) 158588 - 158863 91 aa, chain + MAKKLKTLKKKIDEWLTRITNAEKSLKDLMELKTMARELRDECTSLSKQGDQLEERVSAT EDEMNEMKHEEKFREKEYKEMNKASKKYGTM >FGENESH:[mRNA] 8 3 exon (s) 160522 - 162727 2088 bp, chain + ATGAAGGCAATAGAGACACAAAAATCCCTTCAAAAAATCAATGAATCCAGGAGCTGGTTT TTTGAAAAGATCAACAAAATTGATAGACCGCTAGCAAGAATAATAAAGAAGAAAACAGAG AAGAATCAAATAGACGCAATAAAAAATGACAAAGGGGATATCACCACCGATCCCACAGAA ATACAAACTACCATCAGAGAATACTATAAACACCTCTACACAAATAAACTAGAAATTCTA GAAGAAATGGATAAATTTTTCGACACATACACTCTCCCAAGACTAAATCAGGAAGAAGTT GAATCTCTGAATAGACAAATAACAGACTCTGAAATTGAGGCAATAATTAATAGCTTACCA ACCAAAAAAAGTCCAGGACCAGATGGATTCACAGCCGAATTCTACCAGAGGTACAAGGAG GAGCTGGTACCATTCCTTCTGAAACTATTCCAATCAATAGAAAAAGAGGGAATCCTCCCT AACTCATTTTATGAGGCCAGCATCATTCTGATACCAAAGCCTGGCAGAGACACAACAAAA AAAGAGAATTTTAGACCAATATCTTTGATGAACATCGATGCAAAAATCCTCAATAAAATA CTGGCAAAAGGAATCCAGCAGCACAGCAAAAAGCTTATCCACCATGATCAACATTTAAAC AGAACCAAAGACAAAAACCACATGATTATCTCAATAGATGCAGAAAAGGCCTTTGACAAA ATTCAACAACCCTTCATGCTAAAAACTCTCAATAAATTAGGTATTGATGGGACGTATCTC AAAATAATAAGAGATATCTGTGACAAACCCACAGCCAATATCATACTGAATGGACAAAAA 
CTGGAAGCATTCCCTTTGAAAACTGGCACAAGACAGGGATGCCCTCTCTCACCACTCCTA TTCAACATAGTGTTGGAAGTTCTGGCCAGGGCAATCAGGCAGCAGAAGGAAATAAAGGGC ATTCAATTAGGAAAAGAGGAAGTCAAATTGTCCCTGTTTGCAGATGACATGATTGTATAT CTAGAAAACCCCATTGTCTCAGCCCAAAATCTCCTTAAGCTAATAAGCAACTTCAGCAAA TTCTCAGGATACAAAATCCATGTGCAAAAATCACAAGTATTCTTATACACCAATAACAGA CAAACAGAGAGCCAAATCATGAGTGAACTCCCATTCACAATTGCTTCAAAGAGAATAAAA TACCTAGGAATCCAACTTACAAGGGATGTGAAGGACCTTTTCAAGAAGAACTACAAACCA CTGCTCAATGAAATAAAAGAGGATACAAACAAATGGAAGAACATTCCATGCTCATGGGTA GGAAGAATCAATATTGTGAAAATGGCCATACTGTCCAAGAATTGGAAAAAAACTACTTTA AAGTTCATATGGAACCAAAAAAGAGCCCGCATTGACAAGTCAATCCTAAGCCAAAAGAAC AAAGCTGGAGGCATCACGCTACCTGACTTCAAACTATACTACAAGGCTATAGTAACCAAA ACAGCATGGTACTGGTACCAAAACAGAGATATAGACCAATGGAACAGAACAGAGCCCTCA GAAATAATACCACACATCAACAACTATCTGATCTTTGACAAACCTGACAAAAACAAGCAA TGGGGAAAGGATTCCCTATTTAATAAATGGTGCTGGGAAAACTGGCTAGCCATATGTAGA AAGCTGAAACTGGATCCCTTCCTTTCACCTTATACAAAAATTAATTCAAGATGGATTAAA GACTTAAATGTTAGACCTAAAACCATAAAAACCCTAGAAGAAAACCTAGGCAATACCATT CAGGACATAGGCATGGGCAAGGATTTCATGTCTAAAACACCAAAAGCAATGGCAACAGAA GCCAAAATTGACAAATGGGATCTAATTAAACTAAAGAGCTTCTGCACAGCAAAAGAAACC ACCATCAGAGTGAACAGGAAACCTACAGAATGGGAGAAAATTTTTGCAACCTACTCATCT GACAAAGGGCTAATATCCAGAATCTACAATGAACTCAAACAAATTTACAAGAAAAAAGCA AACAACCCCATCAAAAAGTGGGTGAAGGATATGAACAGACATTTCTGA >FGENESH: 8 3 exon (s) 160522 - 162727 695 aa, chain + MKAIETQKSLQKINESRSWFFEKINKIDRPLARIIKKKTEKNQIDAIKNDKGDITTDPTE IQTTIREYYKHLYTNKLEILEEMDKFFDTYTLPRLNQEEVESLNRQITDSEIEAIINSLP TKKSPGPDGFTAEFYQRYKEELVPFLLKLFQSIEKEGILPNSFYEASIILIPKPGRDTTK KENFRPISLMNIDAKILNKILAKGIQQHSKKLIHHDQHLNRTKDKNHMIISIDAEKAFDK IQQPFMLKTLNKLGIDGTYLKIIRDICDKPTANIILNGQKLEAFPLKTGTRQGCPLSPLL FNIVLEVLARAIRQQKEIKGIQLGKEEVKLSLFADDMIVYLENPIVSAQNLLKLISNFSK FSGYKIHVQKSQVFLYTNNRQTESQIMSELPFTIASKRIKYLGIQLTRDVKDLFKKNYKP LLNEIKEDTNKWKNIPCSWVGRINIVKMAILSKNWKKTTLKFIWNQKRARIDKSILSQKN KAGGITLPDFKLYYKAIVTKTAWYWYQNRDIDQWNRTEPSEIIPHINNYLIFDKPDKNKQ WGKDSLFNKWCWENWLAICRKLKLDPFLSPYTKINSRWIKDLNVRPKTIKTLEENLGNTI QDIGMGKDFMSKTPKAMATEAKIDKWDLIKLKSFCTAKETTIRVNRKPTEWEKIFATYSS 
DKGLISRIYNELKQIYKKKANNPIKKWVKDMNRHF
>FGENESH:[mRNA] 9 3 exon (s) 164379 - 170878 258 bp, chain -
ACGGGGGAGAATTGCCAAACGAAGATATCTCCATCTTCACTTCAGGAGTCTCCATCTTCA
CTTCAGGGAGCACTCAAAAAGAGATCAGCTTTTGAAGATCTCACTAATGCTTCTCAATGT
CAACCTGTCCAGCCCAAGAAAGAAGCCAATAAAGAGTTTGTAAAAGTTGTTTCCAAGAAG
ATAAACAGGAACACACATGCTCTTGGACTGGCCAAAAAGAATAAGCGGAATCTAAAATGT
AATTCAGTACCTGTTTGA
>FGENESH: 9 3 exon (s) 164379 - 170878 85 aa, chain -
TGENCQTKISPSSLQESPSSLQGALKKRSAFEDLTNASQCQPVQPKKEANKEFVKVVSKK
INRNTHALGLAKKNKRNLKCNSVPV

GENSCAN Output http://genes.mit.edu/cgi-bin/genscanw_py.cgi

GENSCAN 1.0 Date run: 14-Mar-110 Time: 19:06:16
Sequence /tmp/03_14_10-19:06:14.fasta : 172778 bp : 40.67% C+G : Isochore 1 ( 0 - 43 C+G%)
Parameter matrix: HumanIso.smat

Predicted genes/exons:

Gn.Ex Type S .Begin ...End .Len Fr Ph I/Ac Do/T CodRg P.... Tscr..
----- ---- - ------ ------ ---- -- -- ---- ---- ----- ----- ------

 1.02 Intr -   3385   3245  141  1  0   22   90    87 0.333   1.93
 1.01 Init -   5876   5796   81  2  0   65   96    65 0.928   6.02
 1.00 Prom -   5984   5945   40                              -2.25

 2.00 Prom +   7347   7386   40                              -6.25
 2.01 Init +   8676   8887  212  2  2   91   35   134 0.499   6.70
 2.02 Term +  13352  13808  457  2  1   10   44   205 0.125   1.91
 2.03 PlyA +  15360  15365    6                               1.05

 3.00 Prom +  16606  16645   40                              -3.05
 3.01 Init +  16682  16686    5  1  2  103  100     0 0.741   2.22
 3.02 Intr +  22259  22386  128  2  2   43   -8   228 0.622   8.10
 3.03 Intr +  25325  25516  192  0  0  127  -30   115 0.169   2.14
 3.04 Term +  27429  27652  224  1  2   43   53   156 0.278   3.70
 3.05 PlyA +  28461  28466    6                              -0.45

 4.00 Prom +  28809  28848   40                              -6.35
 4.01 Init +  28936  29079  144  0  0   83   23    95 0.274   2.67
 4.02 Intr +  37138  37218   81  0  0   83   75    39 0.534   1.02
 4.03 Intr +  39157  39261  105  0  0   43  116   121 0.948   9.89
 4.04 Term +  44436  44534   99  2  0   88   42    95 0.943   2.05
 4.05 PlyA +  45122  45127    6                               1.05

 5.05 PlyA -  45149  45144    6                               1.05
 5.04 Term -  48535  48471   65  2  2   90   36    39 0.568  -4.03
 5.03 Intr -  48916  48743  174  2  0   85   52    66 0.464   1.69
 5.02 Intr -  51653  51480  174  0  0   68   81   131 0.817   9.39
 5.01 Init -  51985  51868  118  1  1   72   61   103 0.912   6.41
 5.00 Prom -  52350  52311   40                              -6.95

 6.00 Prom +  53595  53634   40                              -4.95
 6.01 Init +  53726  53905  180  1  0   51   58    55 0.814  -1.97
 6.02 Intr +  55480  55615  136  0  1   84   77    98 0.885   7.52
 6.03 Intr +  56067  56173  107  1  2  105   86    43 0.944   4.81
 6.04 Intr +  56481  56603  123  2  0  112  107    40 0.995   8.16
 6.05 Intr +  58525  58627  103  0  1  147  101    56 0.998  11.63
 6.06 Intr +  66329  66512  184  0  1   85   98   112 0.976   9.82
 6.07 Intr +  67215  67323  109  0  1  146  115    69 0.999  14.77
 6.08 Intr +  68088  68227  140  2  2   90  111   140 0.998  14.84
 6.09 Intr +  69255  69336   82  0  1   78   78    72 0.998   3.92
 6.10 Intr +  71046  71137   92  2  2   73   82   150 0.854  10.77
 6.11 Intr +  72011  72110  100  2  1   97  106    64 0.999   8.29
 6.12 Intr +  73078  73306  229  0  1  102  116   202 0.996  20.92
 6.13 Intr +  74840  74944  105  0  0   94   69   128 0.999  10.77
 6.14 Intr +  75400  75496   97  2  1   81   92    79 0.944   5.75
 6.15 Intr +  77061  77168  108  0  0   46   80    95 0.740   2.98
 6.16 Intr +  79587  79683   97  0  1   42   88    67 0.579   1.19
 6.17 Intr +  79930  80037  108  0  0   69   61   104 0.778   5.36
 6.18 Intr +  80973  81086  114  2  0   53   96   214 0.874  18.42
 6.19 Intr +  81432  81566  135  2  0   95   90   106 0.992  11.34
 6.20 Intr +  82746  82906  161  2  2   70  108   203 0.999  18.36
 6.21 Intr +  83499  83640  142  0  1   86  100   166 0.999  17.03
 6.22 Intr +  84607  84693   87  0  0  113  105   125 0.999  15.95
 6.23 Intr +  87830  87940  111  1  0   82   92   103 0.997   9.76
 6.24 Intr +  89115  89238  124  2  1  104  111    80 0.995  11.14
 6.25 Intr +  89287  89406  120  2  0   10  100    88 0.481   1.75
 6.26 Term +  90645  90724   80  1  2   64   41    79 0.433  -2.25
 6.27 PlyA +  90767  90772    6                              -0.45

 7.10 PlyA -  90991  90986    6                               1.05
 7.09 Term -  93770  93591  180  2  0   50   43   118 0.751   0.13
 7.08 Intr -  94489  94280  210  1  0   75   84   148 0.877  11.29
 7.07 Intr - 105109 104838  272  2  2   38   49   157 0.424   3.14
 7.06 Intr - 108057 107987   71  2  2   99   40    54 0.575  -0.39
 7.05 Intr - 108423 108293  131  0  2   61  105   100 0.716   7.57
 7.04 Intr - 112038 111889  150  0  0   70   32   201 0.577  12.14
 7.03 Intr - 113012 112857  156  2  0  116   90   197 0.999  21.99
 7.02 Intr - 117422 117325   98  0  2   69  105    44 0.542   3.01
 7.01 Init - 117744 117537  208  0  1   43   70   130 0.549   5.73
 7.00 Prom - 119993 119954   40                              -9.35

 8.00 Prom + 120443 120482   40                              -7.15
 8.01 Init + 121991 122599  609  1  0   54   22   503 0.722  35.66
 8.02 Intr + 122803 123186  384  0  0  -36   80   320 0.870  12.92
 8.03 Intr + 123569 124433  865  1  1  -11   38   492 0.644  24.29
 8.04 Intr + 124542 124792  251  1  2    9   80   209 0.398   8.43
 8.05 Intr + 124873 124933   61  0  1   30   78    38 0.448  -5.61
 8.06 Term + 124968 125875  908  1  2   17   48   471 0.396  27.47
 8.07 PlyA + 125998 126003    6                               1.05

 9.00 Prom + 126009 126048   40                             -18.05
 9.01 Init + 126077 126308  232  1  1   70   75   182 0.509  13.57
 9.02 Intr + 126887 126984   98  0  2  102   70   101 0.820   8.51
 9.03 Term + 127752 128078  327  2  0   -1   54   171 0.288  -1.48
 9.04 PlyA + 128703 128708    6                               1.05

10.06 PlyA - 131375 131370    6                              -0.45
10.05 Term - 132647 132588   60  2  0   86   55    53 0.145  -1.37
10.04 Intr - 145808 145716   93  2  0   62  115    87 0.973   8.04
10.03 Intr - 146797 146774   24  1  0   84  103    23 0.618   0.70
10.02 Intr - 147127 147032   96  1  0   52   53   126 0.796   4.79
10.01 Init - 151104 148168 2937  0  0   59   51  2476 0.843 232.51
10.00 Prom - 154481 154442   40                              -3.65

11.00 Prom + 156034 156073   40                              -5.05
11.01 Sngl + 158594 158869  276  1  0   47   48   247 0.614  11.73
11.02 PlyA + 159288 159293    6                               1.05

12.00 Prom + 159823 159862   40                              -6.15
12.01 Sngl + 160528 161220  693  0  0   83   48   267 0.767  18.15
12.02 PlyA + 161226 161231    6                               1.05

13.05 PlyA - 163084 163079    6                               1.05
13.04 Term - 164403 164385   19  2  1   92   49    12 0.565  -5.49
13.03 Intr - 164801 164671  131  2  2  133  101   144 0.745  18.77
13.02 Intr - 170884 170777  108  1  0   93   82   103 0.993   9.76
13.01 Intr - 171464 171405   60  2  0   41   72    86 0.333   0.31

Suboptimal exons with probability > 1.000

Exnum Type S .Begin ...End .Len Fr Ph B/Ac Do/T CodRg P.... Tscr..
----- ---- - ------ ------ ---- -- -- ---- ---- ----- ----- ------

NO EXONS FOUND AT GIVEN PROBABILITY CUTOFF

Predicted peptide sequence(s):

>/tmp/03_14_10-19:06:14.fasta|GENSCAN_predicted_peptide_1|74_aa
MGNPEGKRKHCRPSDGTHKESSQPEKRIKLGLPKEHQSLAVASQSSQKRESLGIVLLTPS
SKFKCQVQEKFDTK
>/tmp/03_14_10-19:06:14.fasta|GENSCAN_predicted_peptide_2|222_aa
MADTISDAWIYIHKTNWREQGYSVYALTLQHLLLLRDLSSDIKFGYPDKVSKILRLLTVD
DIFYSQLSISSTGSPGQSNQARERNKGHPNRKRGSQTILFADNMILYLENPIVLAPKLLQ
LINNFSKLSRYKINLEKPLAFLYTNNSQAKSQIRNAIPFTIAIKGIKYLRIQLTKEVKNL
YKDNYNTLVKEIRDDTNTWKNIPCSWMGRINIIKMTTLPKEI
>/tmp/03_14_10-19:06:14.fasta|GENSCAN_predicted_peptide_3|182_aa
MKEKLEKDADLDGVFACREKSEKDADLDGVFACREKLEKDADLDGLWLRALNKIMHVKQG
QIAGICQPETNLFLWRRRVEEKLREEIATPAASNEGHRQSHNRSHSSHRSSGRRFADTVL
ADLRGRKIWLFSVRMTTRSQFAQDPPDCSAETPGKTRTVGHPIPHPAAAALRQPGGKIPV
VA
>/tmp/03_14_10-19:06:14.fasta|GENSCAN_predicted_peptide_4|142_aa
MDLGIGLSKRIQNETEAEFSKALAFSARAAMVLAKTILCEHCTQDLNYFAHFETIDLSQA
TVAESSCRNLCHSFCVITPQRKITLAAPNRKDMEEWINIIKTIQQGEIYKTYTSGNITPT
DQQGSANAVALYSGPIHGMILC
>/tmp/03_14_10-19:06:14.fasta|GENSCAN_predicted_peptide_5|176_aa
MQSITELLSKKSPELEDLENSQPIHIAKPKKVCPAENIKENGRMTSRGSSEASKAAERGS
EARDAAVAIIGPKNVGPRGRGTASLVPEGGVVSWVPEAGAVPQRISLELWFLSGKRELKA
DIQLPHISGPFKGSPLLSCPTGNIGSTSRARSLGVSQGITVRPAITHGGTAGPIQP
>/tmp/03_14_10-19:06:14.fasta|GENSCAN_predicted_peptide_6|1057_aa
MLVELQSSDLGSRKVLGAQSAKAKSMSLPQDSPSSILITLELSSVLMYNKHYNHKAIPEH
IPAAENNPFLVGMHCWYSSYSHRTQHCNVCRESIPALSRDAIICEVCKVKSHRLCALRAS
KDCKWNTLSITDDLLLPADEVNMPHQWVEGNMPVSSQCAVCHESCGSYQRLQDFRCLWCN
STVHDDCRRRFSKECCFRSHRSSVIPPTALSDPKGDGQLVVSSDFWNLDWSSACSCPLLI
FINSKSGDHQGIVFLRKFKQYLNPSQVFDLLKGGPEAGLSMFKNFARFRILVCGGDGSVS
WVLSLIDAFGLHEKCQLAVIPLGTGNDLARVLGWGAFWNKSKSPLDILNRVEQASVRILD
RWSVMIRETPRQTPLLKGQVEMDVPRFEAAAIQHLESAATELNKILKAKYPTEMIIATRF
LCSAVEDFVVDIVKAWGQIKQNNTAIVSVILKSDLMYDRLSVLIDVLAEEAAATSAEKSA
TEYADSSKADRKPFIPQIDHIAKCKLELATKAQSLQKSLKLIIFQVEQALDEESRQTISV
KNFSSTFFLEDDPEDINQTSPRRRSRRGTLSSISSLKSEDLDNLNLDHLHFTPESIRFKE
KCVMNNYFGIGLDAKISLDFNTRRDEHPGQYNSRLKNKMWYGLLGTKELLQRSYRKLEER
VHLECDGETISLPNLQGIVVLNITSYAGGINFWGSNTATTEYEAPAIDDGKLEVVAIFGS
VQMAMSRIINLHHHRIAQCHEVMITIDGEEGIPVQVDGEAWIQRPGLIKIRYKNAAQMLT
RDRDFENSMKMWEYKHTEIQAAPQPQLDFQDSQESLSDEEYAQMQHLARLAENLISKLND
LSKIHQHVSVLMGSVNASANILNDIFYGQDSGNEMGAASCIPIETLSRNDAVDVTFSLKG
LYDDTTAFLDEKLLRSAEDETALQSALDAMNKEFKKLSEIDWMNPIFVPEEKSSDTDSRS
LRLKIKFPKLGKKKVEEERKPKSGQSVQSFIAWLPPRDLKPPLEMFVGMVMMKREAIWES
AVSDTKPYLGQGNLWHRRHREDEAEGDDPLTPSRSQL
>/tmp/03_14_10-19:06:14.fasta|GENSCAN_predicted_peptide_7|491_aa
MCPLKALIEKAEHFSLICKAGQLPKCVRIDNASVLKGTGDIHRNLAEAVSVFVAESQLVR
TVLDVADTPVKLVDLYLMKAVCKKDKLQLLGATAFMIAAKFEEHNSPRVDDFVYICDDNY
QRSEVLSMEINILNVLKCDINIPIAYHFLRRYARCIHTNMKTLTLSRYICEMTLQEYHYV
QEKASKLAAASLLLALYMKKLGYWVPFLEHYSGYSISELHPLVRQLNKLLTFSSYDSLKA
VYYKYSHPVFFEVAKIPALDMLKLEEILNCDCFYTKSFDFGDVSIKVQSENRNHSSYFEQ
RKFNAGNWLLKDRRAKKSHTQCEASVILRAGGTVAKRWCHQTPRCQGYQAGFGTSDQLGP
MRDHPESQLGFRVLLGNPFSIMIIPDGDSWFALLLSSLTCHRDLHVACLELGWPPKPTCP
PFPPPFNSLYAEHRAFLEKGTHNCLPPIVLTPEAVGGSSNVSAFSSPLFFWPGIAYPTNS
SESATKDTTKP
>/tmp/03_14_10-19:06:14.fasta|GENSCAN_predicted_peptide_8|1025_aa
MLKNFKKGFKGDYGVTMTPGKLRTLCEIDWPALEVGWPSEGSLDRSLVSKVWHRGTCKPG
HPDQFPYIDSWLQLVLDPPQWLRGQAAAVLVAKGQLVKEGCHSTHRGKSAPKVLSEPTPE
ESWQELVPAVPPPYREEGLPIPEPTAPPLPPDIHTPRPPRVDKRRSEAMGETPPLAAHLR
PKTEIQMPVREQQYTGVDEDRHMAATKWLEEHVLANYQNPQEYRRIQLPGTGHQWDLNKG
PDMERLRQYHAAWIEGLKKEAQKDTNVNKVSEVIQGKEESPAQFYERLCEAYRMYTPFDP
DSPENQRMINMALVKVWKISGENCRNRLGLQEPTVRITIGGKDIKFVVDTGTEHSVVTTL
VTPLSKETIDIIRATGVSTKQAFCLPWTFSVGGHEIVHQFLYMPDCPLPLLGRDLLSKLR
ATISFTKQGSLQLNLPETGVIMALMVTREKEWRLFLTEPGQEIKPALAKRWPRIWADDNP
PGLVVNQDPVLIEVKPGAQPIRQKQYPVPREALEGIQVHLRHLKAFGIIVPCQSPWNLPF
LPVPKPGTKDYQPVQDLCLVNQATVTLHPTVPNLYTLLGLLQAEDSWFTCLDLKDAFFSI
RLAPESQKLFAFQWEDPESDLGLLQYTDDLLLGHSTAVGCTKGTDALLWHLEDCGYKVSK
KKAQICRQQVHYLGFTIQKGERSLESERKQVICSLPEPKTRRQGYRGGDWEPFEWGPLQQ
QAFFPGLPDLTKPFTLYVSEREKMAVGVLTQTVGPWPRPVVYLSKQLDGVSKVWPPCLRA
LAATALLAQEADKLTLGQNLNIKAPHAVVTLMKTKGHHWLTNARLTKYQSLLCENPHITT
EVCNTLNPATLLLVSESLVEHNCVEVLDSVYSSRPDLWDQPWASVDWELYMDGSSFINPQ
GERCAGYVMVTLDAVIEAKPLPQGTSAQKAELIALTRALELSEGKTVNIYIDSRYAFLTL
QVHGTLYKEKGLLNSGGKDIKYQQEILQLLEAVWKPQKVAVMHCRGHQRASISVALGSSR
ADSEA
>/tmp/03_14_10-19:06:14.fasta|GENSCAN_predicted_peptide_9|218_aa
MHETTHPGQESLEKLLGRYFYISQLPALAKAVTQQQCVTCQQHNARQGPTVPPSIQAYGA
APFEDLQVDFTEMPKCGGDHVWMKDWNVAPLRPQWKGPQTVILTTPKAVKGQCNPLELVI
TNPLDSRWKKGERVALGISGARLNPRVNILVRGEVYKRSPEPMFQTFYDELHVPVPEIPG
KTRNLFLQLAEHVAQSLNVTSCYVCGGIVMGDQWPWQA
>/tmp/03_14_10-19:06:14.fasta|GENSCAN_predicted_peptide_10|1069_aa
MEKPLILDISTTSKTPNTEEASLFRKPLVLKEEPTIEDETLINKSLSLKKCSNHEEVSLL
EKLQPLQEESDSDDAFVIEPMTFKKTHKTEEAAITKKTLSLKKKMCASQRKQSCQEESLA
VQDVNMEEDSFFMESMSFKKKPKTEESIPTHKLSSLKKKCTIYGKICHFRKPPVLQTTIC
GAMSSIKKPTTEKETLFQELSVLQEKHTTEHEMSILKKSLALQKTNFKEDSLVKESLAFK
KKPSTEEAIMMPVILKEQCMTEGKRSRLKPLVLQEITSGEKSLIMKPLSIKEKPSTEKES
FSQEPSALQKKHTTQEEVSILKEPSSLLKSPTEESPFDEALAFTKKCTIEEAPPTKKPLI
LKRKHATQGTMSHLKKPLILQTTSGEKSLIKEPLPFKEEKVSLKKKCTTQEMMSICPELL
DFQDMIGEDKNSFFMEPMSFRKNPTTEETVLTKTSLSLQEKKITQGKMSHLKKPLVLQKI
TSEEESFYKKLLPFKMKSTTEEKFLSQEPSALKEKHTTLQEVSLSKESLAIQEKATTEEE
FSQELFSLHVKHTNKSGSLFQEALVLQEKTDAEEDSLKNLLALQEKSTMEEESLINKLLA
LKEELSAEAATNIQTQLSLKKKSTSHGKVFFLKKQLALNETINEEEFLNKQPLALEGYPS
IAEGETLFKKLLAMQEEPSIEKEAVLKEPTIDTEAHFKEPLALQEEPSTEKEAVLKEPSV
DTEAHFKETLALQEKPSIEQEALFKRHSALWEKPSTEKETIFKESLDLQEKPSIKKETLL
KKPLALKMSTINEAVLFEDMIALNEKPTTGKELSFKEPLALQESPTYKEDTFLKTLLVPQ
VGTSPNVSSTAPESITSKSSIATMTSVGKSGTINEAFLFEDMITLNEKPTTGKELSFKEP
LALQESPTCKEDTFLETFLIPQIGTSPYVFSTTPESITEKSSIATMTSVGKSRTTTESSA
CESASDKPVSPQAKGTPKEITPREDIDEDSSDPSFNPMYAKEIFSYMKEREIEETAQSHE
QFILTDYMNRQIEITSDMRAILVDWLVEVQTGKAVVSEELEECSNQKHL
>/tmp/03_14_10-19:06:14.fasta|GENSCAN_predicted_peptide_11|91_aa
MAKKLKTLKKKIDEWLTRITNAEKSLKDLMELKTMARELRDECTSLSKQGDQLEERVSAT
EDEMNEMKHEEKFREKEYKEMNKASKKYGTM
>/tmp/03_14_10-19:06:14.fasta|GENSCAN_predicted_peptide_12|230_aa
MKAIETQKSLQKINESRSWFFEKINKIDRPLARIIKKKTEKNQIDAIKNDKGDITTDPTE
IQTTIREYYKHLYTNKLEILEEMDKFFDTYTLPRLNQEEVESLNRQITDSEIEAIINSLP
TKKSPGPDGFTAEFYQRYKEELVPFLLKLFQSIEKEGILPNSFYEASIILIPKPGRDTTK
KENFRPISLMNIDAKILNKILAKGIQQHSKKLIHHDQVGFIPGMQGWFNI
>/tmp/03_14_10-19:06:14.fasta|GENSCAN_predicted_peptide_13|105_aa
PTPVSVENLSSMKPVPGAKKTGENCQTKISPSSLQESPSSLQGALKKRSAFEDLTNASQC
QPVQPKKEANKEFVKVVSKKINRNTHALGLAKKNKRNLKCNSVPV

ORF Finder (Open Reading Frame Finder)

gi|289450895|gb|AC239396.3|

Frame   from      to  Length
-3    148138..151098   2961
+2    125021..125869    849
+2    123776..124510    735
+2    121985..122689    705
+1    160522..161214    693
+1    161266..161952    687
+1    159484..160137    654
+3    119625..120209    585
-1    104661..105191    531
+1    158923..159420    498
+2    148154..148621    468
-3    130990..131457    468
-1    148746..149183    438
+2     35321.. 35737    417
-2     90047.. 90454    408
-3     83383.. 83781    399
+1    103141..103536    396
-1    141849..142241    393
-3     91045.. 91437    393
+2    150230..150613    384
+3    127128..127511    384
+1     96181.. 96555    375
+2    131048..131416    369
-2     99737..100105    369
-3    104005..104370    366
+3     13443.. 13802    360
-2     89105.. 89458    354
+2     51299.. 51652    354
-3     73078.. 73428    351
+1    124825..125169    345
+1    120439..120783    345
+2    126071..126406    336
-1    164661..164993    333
-2    154484..154810    327
-3    113341..113661    321
+2     46727.. 47044    318
-2     27401.. 27718    318
+2     85199.. 85513    315
+1     73081.. 73395    315
-3     22144.. 22452    309
-3    153196..153498    303
-3    136234..136536    303
+2    103985..104287    303
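The Length column of the ORF Finder listing can be cross-checked against the coordinates: each length should equal `to - from + 1` and, since an open reading frame is a whole number of codons, be a multiple of 3. The following is a minimal Python sketch of that check. The names `parse_orf_rows` and `check_orf` are hypothetical helpers introduced for illustration only, and the three sample rows are copied from the listing above.

```python
import re

# One ORF Finder row: signed frame, "from..to" coordinates, then length.
# Five-digit coordinates may be padded with a space after the "..".
ORF_ROW = re.compile(r"([+-][1-3])\s+(\d+)\.\.\s*(\d+)\s+(\d+)")


def parse_orf_rows(text):
    """Return (frame, start, end, length) tuples found in the text."""
    return [
        (int(frame), int(start), int(end), int(length))
        for frame, start, end, length in ORF_ROW.findall(text)
    ]


def check_orf(start, end, length):
    """True if the length matches the inclusive span and is a whole number of codons."""
    span = end - start + 1
    return span == length and span % 3 == 0


sample = """
-3 148138..151098 2961
+2 125021..125869  849
+2  35321.. 35737  417
"""

rows = parse_orf_rows(sample)
for frame, start, end, length in rows:
    print(frame, start, end, length, check_orf(start, end, length))
```

For the first row, 151098 - 148138 + 1 = 2961 = 3 x 987, so the reported length is consistent with an ORF of 987 codons (stop codon included).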