Am. J. Trop. Med. Hyg., 65(3), 2001, pp. 242–251
Copyright © 2001 by The American Society of Tropical Medicine and Hygiene
PHYLOGENETIC ANALYSIS OF JAPANESE ENCEPHALITIS VIRUS: ENVELOPE GENE BASED ANALYSIS REVEALS A FIFTH GENOTYPE, GEOGRAPHIC CLUSTERING, AND MULTIPLE INTRODUCTIONS OF THE VIRUS INTO THE INDIAN SUBCONTINENT
PRADEEP D. UCHIL AND VIJAYA SATCHIDANANDAM
Department of Microbiology and Cell Biology, Indian Institute of Science, Bangalore, India
Abstract. We report the analysis of the complete nucleotide sequence for the Indian isolate (P20778; Genbank
Accession number AF080251) of Japanese encephalitis virus (JEV). The phylogenetic tree topology obtained using
thirteen complete genome sequences of JEV was reproduced with the envelope, NS1, NS3, and NS5 genes and
revealed extensive divergence between the two Indian strains included. A more exhaustive analysis of JEV evolution
using 107 envelope sequences available for isolates from different geographic locations worldwide revealed five
distinct genotypes of JEV, displaying a minimum nucleotide divergence of 7% with high bootstrap support values.
The tree also revealed overall clustering of strains based on geographic location, as well as multiple introductions of
JEV into the Indian subcontinent. Nonsynonymous nucleotide divergence rates of the envelope gene indicated that
the ancestor common to all JEV genotypes arose within the last three hundred years.
INTRODUCTION

Japanese encephalitis virus (JEV) is the cause of the most prevalent viral encephalitis of man in terms of morbidity and mortality.1 Approximately 50,000 human cases of JE occur annually in Asia.2 Japanese encephalitis virus belongs to the family Flaviviridae, whose members include several pathogens of humans and animals. In India, epidemics of JEV have been reported since the mid-1950s, and the virus is now endemic in most parts of the country. The genomes of flaviviruses comprise single-stranded RNA of positive polarity approximately 11,000 nucleotides (nt) in length. The structural proteins: capsid, premembrane, and envelope, are encoded in the 5′ third of the genome and are followed by the genes specifying the seven nonstructural proteins.

Japanese encephalitis virus is one of the recently diverged members of the family Flaviviridae, and most JEV isolates are estimated to have evolved over the last 130 years.3 The study of flavivirus phylogeny has primarily utilized the structural glycoprotein envelope sequences.4 These, in addition to serological studies,5 have elegantly shown differences in evolutionary dynamics of the tick- and mosquito-borne subgroups of this genus. Phylogenetic analysis of sequences from the nonstructural RNA dependent RNA polymerase (NS5) gene of mosquito- and tick-borne flaviviruses6,7 has also yielded tree topologies similar to that obtained with envelope sequences. More recently, NS3, the full-length genome sequence, and NS5 were used with equal success to classify the nonvector-borne and arthropod-borne flaviviruses.8

To study JEV evolution, on the other hand, a stretch of 240 or 198 nt from the premembrane (prM) region has been predominantly used.9–13 These studies have classified JEV strains into four distinct genotypic groups. Another tree building exercise using a limited number of twenty JEV envelope sequences identified four clusters which however did not match the four genotypes mentioned above and also "did not correspond to geographic origin, isolation host, or virulence".14 Yet another analysis which compared six full-length JEV sequences suggested that the Indian strain GP78 (Accession number AF075723) may be related to the Chinese SA-14 isolate.15 Most of these studies utilized uncorrected p-distances (proportion of nucleotide or amino acid sites at which the two sequences compared are different) for building phylogenetic trees.

Our objective was three-fold. Firstly, to study the evolution of JEV, which required identification of good phylogenetic markers to replace the 240 nt stretch used previously (see below). We have determined and analyzed the complete nucleotide sequence for an Indian isolate P20778 of JEV and carried out phylogenetic analysis of this and twelve other independently isolated full-length JEV sequences available in the database at the nucleotide and amino acid sequence level. The ten individual genes were separately analyzed using the nucleotide sequences to assess their utility as phylogenetic markers.

Our second objective was to analyze the extensive genetic diversity of Indian JEV strains9,16 and to see if strain variation of JEV from different parts of the globe could be linked to their geographic origin. The literature on this topic is conflicting.9,17 Therefore, an exhaustive tree building exercise was undertaken with all 107 unique envelope sequences available in the database of JEV strains from geographically diverse regions of Asia. This revealed the hitherto unrecognized fifth genotype of JEV.

Thirdly, we determined the date of the JEV ancestor common to all 5 E-gene based genotypes we have identified. This study thus represents an up-to-date, rigorous, and in-depth phylogenetic analysis of JEV isolates.

MATERIALS AND METHODS

Cells and viruses. The Aedes albopictus cell line C6/36 (obtained from NCCS, Pune, India) was maintained in Eagle's minimum essential medium (MEM) with 10% fetal bovine serum (FBS), and was used for growth of virus. Japanese encephalitis virus strain P20778 from Vellore18 was used for sequence determination reported here. Confluent monolayers of C6/36 were infected with virus at a multiplicity of infection (m.o.i.) of 0.1. Medium containing virus was harvested twice, at day four and again at day eight post-infection (p.i.). Virus titer was determined on monolayers of
TABLE 1
Details of virus isolates compared in the study

Strain               Genbank accession #   Year and origin                      Genome length
P20778 (P20)         AF080251              1958, India, Vellore                 10,977
GP78                 AF075723              1978, India, Gorakhpur               10,976
Beijing-1 (BP1)      L48961                1949, China, Beijing                 10,976
Beijing P3 (P3)      U47032                1949, China, Beijing                 10,977
SA-14 (SA14)         U14163                1954, China                          10,976
SA-14-2-8 (VAC)      U15763                China*                               10,969
JaGAr 01 (JA01)      AF069076              1959, Japan, Gunma                   10,976
JaOArS892 (S892)     M18370                1982, Japan, Osaka                   10,976
JE RP-9 (RP9)        AF014161              Taiwan*                              10,976
HVI                  AF098735              1965, Taiwan                         10,976
TC                   AF098736              Taiwan*                              10,976
TL                   AF098737              Taiwan*                              10,976
FU                   AF217620              1995, Australia                      10,964
K94P05 (K94)         AF045551              1994, Korea, Wando                   10,963
YFV 17D (YFV)        X03700                Derived from Asibi strain Africa*    10,862

* Year of isolation not available.
the porcine kidney cell line PS (obtained from NCCS, Pune, India) maintained in MEM with 10% FBS by the TCID50 method.19

RNA isolation and cDNA synthesis. Virus was pelleted from the medium of infected cells by centrifugation in a Beckman L8–80 ultracentrifuge at 100,000 × g for 90 minutes. The crude virus pellet was then centrifuged at 100,000 × g for 16 hr through a gradient of 15–60% sucrose. Pure virus banding at a density of 1.17 g/cm³ was solubilized in 4 M guanidinium isothiocyanate and used for isolation of viral RNA. cDNA synthesis was carried out using random hexanucleotide primers and Avian myeloblastosis virus reverse transcriptase (Promega Corporation, Madison, WI). The cDNAs were tailed with EcoRI linkers and ligated to EcoRI-digested pUC18. Overlapping recombinant clones were ordered by hybridization to one another as well as by determining the sequences at the two ends of the inserts. The following regions of the viral genome were obtained by RT-PCR of total RNA from virus infected cells: 1 to 154 nt, 1,200 to 2,000 nt, and 9,200 to 10,976 nt. The primers for the ends of the genome spanning positions 1 to 31 (primer A) and complementary to position 10,958 to 10,976 (primer B) were synthesized based on the sequence published for the JaOArS982 isolate.20 The sequence of P20778 at the 3′ end of the genome was identical to that of the JaOArS982 strain based on northern hybridization under stringent conditions of labeled primer B to viral RNA.

DNA sequencing. pUC18 recombinants were sequenced using M13 forward and reverse primers in the dideoxy chain termination method with Sequenase version 2.0 from US Biochemicals (Cleveland, OH). Some of the sequencing was also carried out using an ABI (Foster City, CA) Prism DNA sequencing kit for dye terminator cycle sequencing with AmpliTaq-FS enzyme. The sequence of the P20778 JEV isolate has been deposited in Genbank (Accession number AF080251). Every segment was sequenced on both the strands with several stretches being done more than once on each strand. For PCR products, sequencing was carried out for two independent clones on either strand.

Sequence analysis. The full-length and individual gene and protein sequences of thirteen independent JEV isolates listed in Table 1 were analyzed. The average proportions of synonymous (Ps) and nonsynonymous substitution (Pn) were estimated using the Nei and Gojobori method with Jukes-Cantor correction available in the MEGA package version 1.01 (obtained from the website, http://evolgen.biol.metro-u.ac.jp/MEGA).21

Phylogenetic analysis. The sequences of the P20778 strain and the twelve other independently isolated full-length JEV sequences available in the database that were used in this study are listed in Table 1. Yellow Fever virus (YFV) 17D strain22 was used as reference taxon for the analyses. Multiple alignments of the amino acid sequence sets were generated using the Clustal W version 1.8 software (obtained from the web site, http://www.ebi.ac.uk/clustalw)23 assuming default alignment parameters. Conserved motifs and putative cleavage sites were used as control of validity for alignments as described earlier.8 This alignment was used to direct the alignment of the nucleotide sequence sets using the TransAlign version 1.0 Java program package (obtained from the website, http://life.anu.edu.au/molecular/software/transalign).24 Alignment of the 5′ and 3′ untranslated regions was performed at the nucleotide level. Phylogenetic analyses of the JEV isolates were carried out in every case using maximum likelihood (ML), neighbour-joining (NJ) and parsimony methods using PAUP* version 4.0b4a (obtained from Sinauer Associates, Sunderland, MA).25 When ML and NJ methods were employed, the general time-reversible (GTR)26 or Hasegawa-Kishino-Yano (HKY85) substitution model27 was used with the shape of the gamma distribution for among-site variation and ratio of transitions to transversions estimated from the data. In the case of analysis using parsimony method Goloboff-fit criterion28 with default concavity parameter was assumed. Phylogenetic analyses for amino acid sequences were also performed with the help of MEGA version 1.01.21 Jukes-Cantor and gamma distance (a = 2) algorithms were used with NJ method for construction of trees. Phylogenetic relationships for the nucleotide and protein sequences of the complete viral genome as well as nucleotide sequences of individual genes were determined. The robustness of phylograms was evaluated by 1,000 bootstrap resamplings applying random heuristic search with 100 replicates each. Phylogenetic trees were drawn using the pro-
gram TreeView version 1.6.1 (website: http://taxonomy.zoology.gla.ac.uk/rod/rod.html).29

Estimation of lineage divergence times. Initial trees were obtained using the maximum parsimony and distance methods available in the PAUP* version 4.0b4a. These trees were then used as input for the maximum likelihood method available in PAUP* and allowed to be perturbed using either NNI (nearest-neighbour interchange) or TBR (tree bisection-reconnection) in a heuristic search using a GTR model for distance estimation. The shape parameter of the discrete gamma distribution for among-site variation was determined from the sequence. The number of rate categories was assumed to be ten. The envelope gene was used for this analysis with only the nonsynonymous sites taken into consideration because saturation plots drawn using synonymous distances revealed that the synonymous sites were saturated. Pairwise nonsynonymous substitutions per site were obtained by using the GTR model under the distance settings present in PAUP*. Isolation dates where known are listed in Table 1 and Figure 1.

The estimation of pairwise nonsynonymous substitution per site for multiple JEV clades from the predicted ancestor for that clade was done using the branch lengths obtained from the maximum likelihood tree as mentioned above. These values were regressed on the time interval that separates the years of isolation. The slope of such a plot is the regression coefficient and is the average rate of nonsynonymous substitution per site, which is referred to as "k". The divergence date was estimated by dividing the branch length (BL) towards a given node (or the sum of BLs, if the node is next to a tip, i.e., patristic distance) by the value of k to obtain the amount of time it took for the amount of change represented by that BL to take place.

RESULTS AND DISCUSSION

Determination and comparative analysis of the sequence of P20778. A summary of the relevant information relating to the thirteen JEV strains compared in this study along with YFV as the outgroup is given in Table 1. Surprisingly, the capsid, NS1 and NS2a protein sequences showed the highest percentage of amino acid differences among isolates (11.81%, 11.89% and 10.17% variable sites, respectively, Table 2). This was despite the lowest percentage of variable nucleotides (16.79%) for the capsid gene, and was correlated with the presence in the capsid gene of the highest proportion of nonsynonymous substitutions among all ten genes analyzed (Table 2). A comparison of the amino acid sequences of all the proteins of the thirteen JEV strains listed in Table 1 showed that the nonstructural proteins NS2b and NS4a were the most conserved, with percentage amino acid variability of 6.1 and 6.7. In keeping with its non-neuroinvasive phenotype, P20 did not have any of the nine amino acids found to be unique to the envelope protein of the neuroinvasive isolate Beijing P3.30 Also, none of the five amino acids unique to the envelope protein of the attenuated vaccine strains (derived from SA-14)31 was common to P20.

We observed the insertion of a G residue at position 10,701 in the P20 sequence. The same insertion was also seen in the JaGAr01 strain from Japan and the FU strain from Australia. Secondary structure prediction of the 3′ untranslated region (UTR) sequence using the program MFOLD version 3.0 (obtained from the website, http://bioweb.pasteur.fr/docs/softgen.html#MFOLD)32 revealed that the insertion of this G residue fell within a predicted loop and consequently did not affect the secondary structure. The 3′ UTRs of the K94 and FU strains were 13 and 12 bases shorter, respectively, than the usual 585 bases present in most of the wild type JEV strains analyzed. The percentage of variable sites in the 585 nt 3′ UTR was 2.79-fold more than in the 95 nt 5′ UTR. Complete identity was observed in the 5′ UTRs of nine of the thirteen JEV sequences compared. This suggested that the mode of recognition of the 5′ and 3′ UTRs, which function as viral promoter regions, by the viral polymerase complex may be different. Extensive secondary structure has been predicted for the 3′ UTR of flaviviruses,33 which appears to be vital for its function in viral replication. One might speculate that for the 5′ UTR, in addition to secondary structure,34 nucleotide sequence may also play an important role in recognition by the viral polymerase complex and/or by the capsid protein during packaging.

Identification of candidate phylogenetic markers. The premembrane (prM) region has been extensively used to study JEV evolution. The choice of a 240 nt stretch of the prM gene was based on a comparison of the genes encoding the prM, membrane (M), envelope (E), and the nonstructural 1 (NS1) proteins among a limited data set of five JEV isolates, all of which belong to genotype III.9 These workers concluded that the prM region harbored the largest number of third position or synonymous mutations "implying that this region was free of selective pressures that might obscure long-term evolutionary relationships"9 and thus is a good phylogenetic marker. Indeed, our analysis with 13 full-length JEV sequences belonging to genotypes I, II and III (Tables 1 and 2) also confirmed this high nucleotide divergence of the prM region. Subsequent to the analysis of 46 JEV strains based on this 240 nt stretch,9 several other workers have found this region valuable for classifying newer isolates among the 4 genotypes.10–13 To date, the database contains sequences for this stretch from 167 JEV isolates, testifying to its suitability for genotype-based molecular phylogenetic classification of JEV.

The 240 nt prM tree obtained in the above study9 was a simple p-distance based tree, and was not subjected to statistical analysis. When we carried out a tree-building exercise with this same stretch using the rigorous GTR model
FIGURE 1. Maximum likelihood phylogenetic trees for Japanese encephalitis virus: maximum likelihood trees were derived from the sequences of each of the four genes envelope (A), NS1 (B), NS3 (C), and NS5 (D), and from the full-length sequence (E), using PAUP* version 4.0b4a for fourteen JEV sequences available in the database listed in Table 1. Branch lengths were derived using the general time reversible (GTR) model with the shape parameter for among-site variation estimated from the input sequences. Corresponding YFV sequences were used as reference taxa. The scale shows a genetic distance of 10, which is equivalent to 0.01% nucleotide divergence. The internal node numbers indicate bootstrap support values expressed as percentage for 1,000 replicates.
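The bootstrap support values mentioned in the caption rest on resampling alignment columns with replacement and rebuilding a tree from each pseudo-replicate. The following is a minimal Python sketch of the resampling step only, assuming a toy alignment; the function names are hypothetical, tree building is omitted, and this is not a description of how PAUP* implements the procedure.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap pseudo-replicate of an alignment.

    Columns (sites) are drawn with replacement until the replicate has as
    many sites as the original alignment.
    """
    n_sites = len(alignment[0])
    if any(len(seq) != n_sites for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[i] for i in columns) for seq in alignment]


def support_percent(replicates_with_clade, n_replicates=1000):
    """Bootstrap support: share of replicate trees containing a clade, in %."""
    return 100.0 * replicates_with_clade / n_replicates


# Toy alignment (hypothetical sequences, not JEV data).
toy_alignment = ["ACGTAC", "ACGTTC", "ACCTAC"]
replicate = bootstrap_alignment(toy_alignment, random.Random(0))
print(replicate)
```

In a full analysis a tree would be inferred from each of the 1,000 replicates, and the support for a node would be the percentage of replicate trees in which that grouping appears.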
TABLE 2
Analysis of sequences of Japanese encephalitis virus genes*

                               % Variable sites
Gene              Total sites  AA      NT      Ps                 Pn
prM (240 nt)      240          8.75    24.16   0.4867 ± 0.0975    0.0111 ± 0.0037
Capsid            381          11.81   16.79   0.1366 ± 0.0188    0.0108 ± 0.0027
Premembrane       501          7.78    22.75   0.3022 ± 0.0349    0.0090 ± 0.0023
Envelope          1,500        8.20    21.53   0.2638 ± 0.0161    0.0067 ± 0.0010
NS1               1,236        11.89   20.22   0.2220 ± 0.0158    0.0104 ± 0.0015
NS2a              501          10.17   23.35   0.2551 ± 0.0275    0.0106 ± 0.0026
NS2b              393          6.10    20.35   0.2697 ± 0.0351    0.0068 ± 0.0023
NS3               1,857        7.75    22.50   0.2922 ± 0.0164    0.0068 ± 0.0010
NS4a              447          6.71    21.02   0.2409 ± 0.0268    0.0056 ± 0.0020
NS4b              765          7.05    22.61   0.2572 ± 0.0205    0.0071 ± 0.0016
NS5               2,715        8.06    19.07   0.2282 ± 0.0108    0.0067 ± 0.0008
5′UTR             95           –       5.26    –                  –
3′UTR             585          –       14.70   –                  –
Full-length ORF   10,296       8.50    23.49   0.2471 ± 0.0058    0.0076 ± 0.0004

* Individual genes from the thirteen JEV isolates listed in Table 1, which excluded the attenuated derivative SA-14-2-8, were used for sequence analysis. Ps and Pn = average proportion of synonymous (Ps) and nonsynonymous (Pn) mutations per synonymous and nonsynonymous site (mean ± standard error); AA = amino acids; NT = nucleotides.
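One way to read the Ps and Pn columns of Table 2 together is as a ratio per gene. The sketch below is a hedged illustration in Python: the Pn/Ps ratio is not reported in the table and is introduced here only as an assumed summary, the standard errors are ignored, and the names `TABLE2` and `pn_ps_ratio` are hypothetical. The mean values are copied from Table 2 for the ten genes.

```python
# Mean (Ps, Pn) values copied from Table 2; standard errors are omitted.
TABLE2 = {
    "Capsid": (0.1366, 0.0108),
    "Premembrane": (0.3022, 0.0090),
    "Envelope": (0.2638, 0.0067),
    "NS1": (0.2220, 0.0104),
    "NS2a": (0.2551, 0.0106),
    "NS2b": (0.2697, 0.0068),
    "NS3": (0.2922, 0.0068),
    "NS4a": (0.2409, 0.0056),
    "NS4b": (0.2572, 0.0071),
    "NS5": (0.2282, 0.0067),
}


def pn_ps_ratio(ps, pn):
    """Nonsynonymous proportion relative to synonymous proportion."""
    return pn / ps


ratios = {gene: pn_ps_ratio(ps, pn) for gene, (ps, pn) in TABLE2.items()}
highest = max(ratios, key=ratios.get)

# List the genes from the highest to the lowest ratio.
for gene, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{gene:12s} Pn/Ps = {ratio:.4f}")
```

Under this assumed summary the capsid gene ranks first, which is in line with the statement that it carries the highest proportion of nonsynonymous substitutions despite having the fewest variable nucleotides.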
for distance estimation followed by bootstrap resampling (see Methods), no statistically reliable tree topology was obtained for members of genotype III. This indicates that the nucleotide differences among isolates within this region were not sufficiently meaningful to resolve the phylogenetic relationship among closely related isolates. The functional significance of this high level of nucleotide variation within this 240 nt prM region has therefore yet to be established.17 Indeed, earlier reports have shown that the use of short stretches of nucleotides or amino acids in phylogenetic analysis35,36 results in incorrect tree topologies, suggesting that the 240 nt prM region is not the ideal candidate for studying the molecular evolution of JEV.

We therefore evaluated possible phylogenetic markers that would best reflect JEV evolution. We used the simple criterion that a good marker would be a region of the JEV genome that could faithfully reproduce tree topologies obtained using the full-length genome sequences with high confidence values. As a first step toward this goal we obtained a phylogenetic tree using the thirteen independently isolated full-length JEV sequences from Genbank (Table 1) using YFV as the outgroup. Of the remaining seven full-length sequences available in the database for attenuated derivatives of original isolates, we included only SA-14-2-8 in our analysis. This strain consistently paired with its parent, thus functioning as an additional indicator of the correctness of the tree topology. The resulting tree (Figure 2E) revealed that the Korean strain (K94) and the Australian strain (FU) were the most divergent of all the JE strains studied, and separated from the rest as the first branch after the outgroup. This was in keeping with the position of these two strains within genotypes I and IV in the envelope tree (Figure 1). The trees also placed the Indian strain P20 distant from the other Indian strain GP78, and the Taiwanese strains close to the Japanese strains. A similar tree topology was obtained using the full-length amino acid sequences (data not shown).

Although full-length sequences gave reliable trees, rigorous phylogenetic analysis of large numbers of long sequence stretches requires enormous computing capacity. We therefore explored the utility of individual structural and nonstructural genes as phylogenetic markers that reliably reproduce the tree topology obtained with the full-length sequence. Among the nonstructural and structural genes analyzed, polytomies were observed with capsid, prM, NS2a, NS2b, NS4a, and NS4b (data not shown), possibly due to lack of sufficient phylogenetically informative sites. NS1, NS3, NS5, and envelope genes alone reproduced the topology obtained using the full-length sequence with reliable bootstrap support values (Figure 2, A–E). When we used the 240 nt stretch from the prM region to study JEV evolution, polytomies surfaced with respect to isolates from genotype III, making these trees unreliable (data not shown).

The trees obtained during the above analysis reflected the wide divergence of the Indian JEV strains P20 and GP78 (Figure 2, A–E) despite their geographic proximity. An earlier report had observed genetic diversity for JEV strains from Japan and India.9 Immunotyping and oligotyping data suggest that strain variation among JE viruses is not related to geographic location.16,37–39 Comparison of nucleotide and amino acid sequence of envelope genes of thirteen JEV strains17 showed no clustering based on geographical location. However, Chen and coworkers9 used forty-six prM sequences to show that JEV isolates from the same geographic region and time period are similar.

To resolve this issue, and to gain insight into the genetic diversity of JEV from different geographic areas, we carried out an exhaustive phylogenetic analysis using all 107 envelope sequences from unique JEV isolates in the database. E-gene sequences for JEV are available in the database, making it possible to use this gene for phylogenetic analysis. These isolates span a sixty-year period and represent strains from almost all the countries where JE virus is prevalent. Japanese encephalitis virus has been reported in Pakistan;40 however, no sequence for this isolate is available. The database contained repeats for 32 envelope sequences, which were also included in our analysis. The 107 E-gene sequences were first analyzed using distance and parsimony methods, as described in Methods. The tree classified JEV strains into five distinct genotypic groups differing by a minimum nucleotide divergence of 7% with high bootstrap support values (Figure 1). Four of these groups broadly matched the four genotypes that were proposed earlier using the prM
FIGURE 2. Phylogenetic tree derived from Japanese encephalitis virus envelope sequences from 107 unique isolates: neighbor joining method available in PAUP* using Hasegawa-Kishino-Yano (HKY) distances with transition/transversion (ts/tv) ratios estimated empirically was used to draw the tree. Homologous sequences from Murray Valley encephalitis virus, St. Louis encephalitis virus (sister groups to JEV10) and yellow fever virus were used to root the tree. Numbers on the internal nodes refer to the bootstrap support values expressed as percentage for 1,000 replicates. The scale shows a genetic distance of 10, or 1% nucleotide divergence. The E-gene sequences from the clade of Taiwanese strains depicted as node "A" in genotype III were used for regression analysis to estimate the JEV evolutionary rate.
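The caption refers to transition/transversion (ts/tv) ratios estimated empirically. As a rough illustration of what is being counted, the Python sketch below tallies transitions (purine to purine, or pyrimidine to pyrimidine) and transversions between two aligned sequences. It is a naive pairwise count under stated assumptions (DNA alphabet with T rather than U, gaps and ambiguity codes skipped, hypothetical function name), not the estimator used by PAUP*.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq1, seq2):
    """Count transitions and transversions between two aligned sequences.

    Sites with gaps or ambiguity codes are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same aligned length")
    transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    return transitions, transversions


# Toy sequences (hypothetical, not JEV data).
ts, tv = count_ts_tv("AACCGGTT", "GACTGTTA")
print(ts, tv)
```

A raw count such as this ignores multiple substitutions at the same site, which is one reason model-based estimates are preferred for divergent sequences.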
gene region,9 while a fifth genotype surfaced in our analysis, containing the lone Singapore Muar isolate.41 The Muar strain was the most divergent of the JEV isolates analyzed, differing from genotypes I, II, III, and IV by a minimum nucleotide divergence of 21%, 20%, 19.3%, and 22% respectively. Using a 7% cut-off value of nucleotide divergence, the Muar strain could clearly be placed in a newly formed fifth genotype. The reason for the presence of this unique strain in the midst of countries populated by strains of 3 other genotypes is unclear. Without exception, all of the repeat envelope sequences of any isolate grouped together as expected. The vastly larger number of envelope sequences analyzed, compared to the prM region,9 and with greater diversity, may contribute to the 7% cut-off in nucleotide divergence for this gene. Thus, while the tree based on the 240 nt prM region could not reliably display statistically significant topologies (data not shown), it could distinguish between the different genotypes. This may be due to the 12% nucleotide divergence between genotypes in this nucleotide stretch.9 We therefore propose the use of envelope sequences to classify all future JEV isolates into the various genotypes and more importantly, to obtain information on precise relationships among closely related isolates.

We also observed striking differences between the prM- and E-based trees. In the prM-based tree9 JEV strains from Thailand were found only in genotypes I and II. The large E-gene data set-based tree placed Thai strains KPP034-35CT and Chiang Mai in genotype III. Similarly, Indonesian strains JKT646 and 208335, previously classified in genotype IV, and JKT1724 previously classified in genotype II, as well as a Korean strain (K82P01) previously classified in genotype I, were all now placed in genotype III. The Korean strain showed a minimal difference of 4.4% and 5.3% from genotypes III and I respectively. Using the 7% nucleotide divergence cut-off mentioned above, this strain could be placed with equal confidence in both genotypes I and III.

One of the hallmarks of the tree was the overall clustering of strains according to their geographic origin (Figure 1). For example, the Chinese, the Japanese, and the Taiwanese strains were found to group together on this basis. Similarly, JEV strains from the Indian subcontinent, i.e., from India, Sri Lanka, and Nepal also clustered together. The tree also revealed the genetic diversity among the JEV strains within the same geographic boundary. For instance, the Japanese strains grouped into four different sub-geographic clusters suggesting the co-existence of at least four genetically di-
TABLE 3
Genome sequence analysis of the two Indian strains P20 and GP78 of Japanese encephalitis virus

                               % Variable sites
Gene              Total sites  AA      NT      Ps                 Pn
prM (240 nt)      240          1.25    5.41    0.2524 ± 0.0802    0.0107 ± 0.0076
Capsid            381          3.93    4.72    0.1530 ± 0.0437    0.0176 ± 0.0079
Premembrane       501          2.39    4.99    0.1927 ± 0.0448    0.0132 ± 0.0059
Envelope          1,500        1.60    4.53    0.1862 ± 0.0249    0.0071 ± 0.0025
NS1               1,236        1.45    2.75    0.1051 ± 0.0203    0.0063 ± 0.0026
NS2a              501          1.19    4.19    0.1643 ± 0.0389    0.0054 ± 0.0038
NS2b              393          1.52    4.83    0.2102 ± 0.0532    0.0067 ± 0.0047
NS3               1,857        2.26    5.33    0.2196 ± 0.0250    0.0107 ± 0.0028
NS4a              447          2.01    4.92    0.1847 ± 0.0439    0.0091 ± 0.0053
NS4b              765          1.96    4.70    0.1723 ± 0.0320    0.0089 ± 0.0040
NS5               2,715        1.48    4.15    0.1820 ± 0.0189    0.0062 ± 0.0017
5′UTR             95           –       1.05    –                  –
3′UTR             585          –       3.58    –                  –
Full-length ORF   10,296       1.80    4.37    0.2220 ± 0.0158    0.0082 ± 0.0010

Ps and Pn = average proportion of synonymous (Ps) and nonsynonymous (Pn) mutations per synonymous and nonsynonymous site (mean ± standard error).
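Because Table 3 compares only two strains, the percentage of variable nucleotide sites can be read as an uncorrected pairwise p-distance (the proportion of sites at which the two sequences differ). The Python sketch below shows that proportion and the textbook Jukes-Cantor correction, d = -(3/4) ln(1 - 4p/3), applied to the 4.37% full-length ORF value. It is an illustration under simplifying assumptions (no gaps, equal base frequencies and substitution rates, hypothetical function names) and does not reproduce the GTR or HKY85 settings used for the trees in this study.

```python
import math


def p_distance(seq1, seq2):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    differences = sum(a != b for a, b in zip(seq1, seq2))
    return differences / len(seq1)


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a nucleotide p-distance."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1 - 4 * p / 3)


# Full-length ORF nucleotide divergence between P20 and GP78 (Table 3).
p_orf = 4.37 / 100
d_orf = jukes_cantor(p_orf)
print(f"p = {p_orf:.4f}, Jukes-Cantor d = {d_orf:.4f}")
```

At this low level of divergence the corrected distance is only slightly larger than the raw proportion; the correction matters more as sequences diverge and multiple substitutions accumulate at the same site.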
verse groups in this country where JEV was first reported. The JEV strains in Taiwan and the Indian subcontinent grouped into three distinct clusters each, whereas the strains from China were homogeneous and found in one large cluster. This last mentioned feature was reported previously.9,42 Our analysis thus revealed that while genetic diversity within a defined geographic region resulted in distinct clades, there was overall clustering of JE isolates within the genotypes based on place of isolation. Notable exceptions to this pattern of clustering were also observed. The P1 strain from China for example was found in a cluster which contained isolates from Japan (Sagiyama), Vietnam (VN118), and Taiwan (TL). Two strains from Thailand falling within genotype III along with Chinese and Indian strains have been previously mentioned. Strains from Indonesia were distributed among genotypes II, III, and IV, along with viruses from other geographic regions, making the Indonesian strains the most diverse among the JEV prevalent areas.

In a second analysis performed for dating the ancestral JEV lineage, we carried out a refinement on a subset of 40 E genes containing representatives from the 5 genotypes. Here we used the more rigorous and time-consuming ML method, which can overcome possible substitutional superimposition, as described in the Methods section. This tree faithfully reproduced the topology obtained from 107 sequences, proving the reliability of the method used to obtain the initial tree (data not shown).

Genetic diversity of JEV strains in the Indian subcontinent. Our full-length genome sequence analysis indicated the significant differences between the two Indian strains P20 and GP78. A comparison of these strains revealed that the percent nucleotide and amino acid divergence was 4.37 and 1.80, respectively (Table 3). The NS2a protein was the most conserved between the two isolates with a difference of only 1.19%. The capsid protein, on the other hand, was least conserved with 3.93% divergence. Interestingly, a comparison of the capsid protein of GP78 with other JEV isolates listed in Table 1 revealed that two of the four amino acid residues that were unique to GP78 capsid sequence fell in the 85 to 102 amino acid stretch, resulting in abolition of the putative "nuclear localization signal" identified by the program ProfileScan (obtained from the website, http://www.isrec.isb-sib.ch/software/PFSCAN_form.html).43 This sequence presumably represents the RNA binding region of the capsid protein, judging from the presence of an arginine-rich stretch as well as RGG motifs.44 Prediction of secondary structure of the capsid protein using the program Predator (obtained from the web site, http://www.embl-heidelberg.de/argos/predator)45 revealed that in this region of interest, in contrast to the helical structure assumed by the capsid of all other JE isolates, GP78 displayed a random-coil structure. Significantly, the RGG box of the RNA binding protein hnRNP U44 also displayed a random-coil secondary structure in this region. Hence in addition to envelope playing an important role in the slow release of RNA into the cytoplasm following viral entry for GP78 as hypothesized,46 the capsid may also contribute to this defect by binding strongly to the viral RNA.

The large data set of 107 E-gene sequences contained eight independent Indian isolates from various parts of India. In addition 2 strains each from Nepal and Sri Lanka were also present. This analysis grouped these strains into three different clusters (Figures 1 and 3). The first group designated as Vellore group contained the two Vellore strains P20 (isolated in 1958 from human brain), and G8924 (isolated in 1956 from mosquito) plus a strain 782219 from Tamil Nadu (isolated in 1978). The Sri Lankan strain, 691004 (isolated in 1969 from human brain) which was placed with the Nakayama strain from Japan, was 2.9% divergent from the Vellore group, reflecting the geographic proximity of Sri Lanka to South India. The second group designated as Bankura group comprised four Indian strains, which included strains from northern India (GP78 from Gorakhpur, isolated in 1978 and 733913 from Bankura isolated in 1973) and western India (826309 from Goa isolated in 1982). In addition, this group contained a strain H49778 from Sri Lanka (isolated in 1987). The third group, designated as Nepal group, was formed by an Indian strain 7812474 from Assam (isolated in 1978 from human brain) and the B2524 strain from Nepal (isolated in 1985). Our calculations using the E-gene based tree revealed that an arbitrary cut-off of 3.4% divergence could be set to define these three groups (Figure 1). The Assam strain 7812474 could be placed either in the Bankura or Nepal groups, based on the 3.4% divergence cut
TABLE 4
Dating the common ancestor of Japanese encephalitis virus (JEV)*

Genotype    Date for the JEV ancestral node in years before present (mean ± standard error)
I           133.00 ± 56.24
II          166.21 ± 58.81
III         154.17 ± 45.01
IV          259.69 ± 37.33
V           158.59 ± 68.81

* The rate of nonsynonymous substitution per site (k = 7.5147 × 10^-4) obtained from regression analysis of clade A (Figure 1) was used to date the JEV ancestor with respect to each of the five genotypes.
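The arithmetic behind Table 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure (the text reports that the regression was run in EXCEL): it assumes that k is the ordinary least-squares slope of nonsynonymous distance against elapsed years, and that a node date is a nonsynonymous distance divided by k. The function names `fit_rate` and `years_before_present` and all data points below are hypothetical; only the value k = 7.5147 × 10^-4 comes from the table footnote.

```python
def fit_rate(years, distances):
    """Ordinary least-squares fit of distance = intercept + k * years.

    Returns (k, intercept, r_squared), where k is the slope, i.e. the
    substitutions per site accumulated per year.
    """
    n = len(years)
    if n < 3 or n != len(distances):
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(years) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, distances))
    if sxx == 0 or syy == 0:
        raise ValueError("years and distances must both vary")
    k = sxy / sxx
    intercept = mean_y - k * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return k, intercept, r_squared


def years_before_present(distance, k):
    """Convert a nonsynonymous distance into years, assuming a strict clock."""
    if k <= 0:
        raise ValueError("rate must be positive")
    return distance / k


# Hypothetical (made-up) observations: years separating each isolate from the
# predicted clade ancestor, and the matching nonsynonymous distance per site.
example_years = [2.0, 6.0, 11.0, 15.0, 21.0]
example_distances = [0.0016, 0.0043, 0.0085, 0.0110, 0.0159]

k, intercept, r_squared = fit_rate(example_years, example_distances)
print(f"estimated k = {k:.3e}, R^2 = {r_squared:.3f}")

# Dating with the published rate and a hypothetical distance of 0.1 per site.
print(f"{years_before_present(0.1, 7.5147e-4):.1f} years before present")
```

Because the date is a simple ratio, any uncertainty in k carries over directly to the date, which is one reason the standard errors in Table 4 are large relative to the means.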
off. We chose to keep this strain in the Nepal group since it was closer to the B2524 strain from Nepal (1.5% divergence) than to any of the strains present in the Bankura group. Availability of more E-gene sequences from this geographic region may further consolidate this group. The Vellore group was phylogenetically closer to the Nakayama strain from Japan, whereas the Bankura group was closer to the Chiang Mai strain from Thailand. The Nepal group was distant from most other JEV clusters and contained the JKT1724 strain from Java, Indonesia (Figure 1). The E-gene sequence of the Nepalese strain B2524 differed from that of the Indonesian strain JKT1724 by only two nucleotides. Thus, the phylogenetic relatedness may indicate the country of origin for the viruses present in the Indian subcontinent, and the possible involvement of migratory birds in transmitting JEV into the Indian subcontinent from these geographically distant regions. Interestingly, the ability of sera from vaccinees given the Nakayama strain JE vaccine to cross-neutralize JEV strains in the Vellore group more effectively than those from the Bankura group47 shows the relatedness of the Vellore group strains to the Nakayama strain from Japan. This similarity between the serology data and our phylogenetic classification further validates the grouping of the JEV strains in the Indian subcontinent. Perhaps additional genetically diverse groups exist in the Indian subcontinent, the extent of which will become known as more E-gene sequences become available. Phylogenetic analysis may therefore serve to point to the serological characteristics of the envelope proteins and may help to guide the planning of vaccination strategies, whose success will be influenced by the extent of diversity seen in natural isolates.

The geographic expanse of JEV strains from the Bankura group ranges from the east (West Bengal) and north of India (Uttar Pradesh) to the west coast of South India (Goa), and Sri Lanka (Figure 3). The strains that form the Bankura group were isolated between 1973 and 1982. Except for the Tamil Nadu strain isolated in 1978, members of the Vellore group were isolated in 1956 and 1958. This suggests that JEV was introduced into India on two occasions separated by at least 17 years. The earlier introduction that formed the Vellore group appears to have been restricted in its ability to spread, presumably due to non-availability of suitable host-vector combinations. However, the virus obviously was not lost and appears to be the progenitor of the more recent 1978 isolate from Tirunelveli (782219) in Tamil Nadu. Therefore, the present-day endemicity of JEV in India appears to be predominantly due to the robust spread of strains from the Bankura group.

FIGURE 3. Map of the Indian subcontinent showing the geographic expanse of the different Japanese encephalitis virus groups classified in Figure 2.

Dating the ancestor common to all JEV genotypes. Estimation of accurate branch lengths is critical to obtain valid numbers for divergence dates. The ML method gives the best estimate of branch lengths,48,49 while the distance and maximum parsimony methods are best suited for obtaining reliable tree topologies. We therefore obtained tree topologies for the envelope genes using the latter 2 methods available in PAUP* version 4.0b4a. These trees faithfully reproduced the topologies obtained earlier, and then served as user trees for the maximum likelihood method. PAUP* allows estimation of transition to transversion ratios and shape parameters for among-site variation from the sequence data. This approach yields the best estimates of branch lengths and consequently a more accurate dating of a particular node. Of the 107 envelope sequences mentioned earlier, forty JEV isolates representing each of the five genotypes were chosen for this rigorous analysis.

Regression analysis was subsequently carried out using sequences of multiple clades from this tree belonging to the different genotypes, each derived from a geographically restricted region. Pairwise nonsynonymous substitutions per site obtained by comparing each member of the clade to its predicted ancestor were plotted versus the time period in years that separates the isolates. The slope of the linear regression is the rate of nonsynonymous substitution per site per year, referred to as ‘‘k’’. We obtained the highest confidence level of P < 0.0001 (R^2 = 0.899) that the slope is significantly different from zero by using the t-test available in EXCEL, for the clade comprising the Taiwanese strains in genotype III (node ‘‘A’’ in Figure 1; the attenuated strains CH2195LA and CH2195SA were not included in this analysis). The corresponding k value of 7.5147 × 10^-4 was used to date the common ancestor of JEV with respect to each genotype (Table 4). The value of 2.6 × 10^-4 for k obtained earlier3 relied
on a limited number of 4 taxa and the use of the method of Li and coworkers.50 We also obtained similar values for k upon linear regression analysis of some clades, but with poor confidence values. Our estimations assumed the operation of a molecular clock for evolution of JEV, which may not be entirely valid. The value for divergence times from the common ancestor ranged from 259 ± 37 (mean ± standard error) years for genotype IV, the oldest of the genotypes, to 133 ± 56 years for genotype I, one of the most recently diverged of all five genotypes of JEV. Thus, the global ancestor of all JEV strains appears to have arisen in the last 300 years.

Acknowledgments: We are indebted to Niranjan V. Joshi from the Centre for Ecological Sciences, Indian Institute of Science, for valuable guidance especially in the initial stages. We thank the staff of the Supercomputer Education and Research Centre for help with use of the supercomputer. We thank Jim Wilgenbusch, Paolo Zanotto, and Ernest Gould for clarifications and suggestions during the course of these analyses.

Financial Support: PDU is a recipient of the Senior Research Fellowship of the Council of Scientific and Industrial Research.

Authors’ addresses: Pradeep D. Uchil and Vijaya Satchidanandam, Department of Microbiology and Cell Biology, Indian Institute of Science, Bangalore-560012, Karnataka, India.

Reprint requests: Vijaya Satchidanandam, Department of Microbiology and Cell Biology, Indian Institute of Science, Bangalore-560012, Karnataka, India. Telephone: 91-80-3092685. FAX: 91-80-3602697. Email: vijaya@mcbl.iisc.ernet.in

REFERENCES

1. Gourie-Devi M, Ravi V, Shankar SK, 1995. Japanese encephalitis: an overview. Rose FC, ed. Recent Advances in Tropical Neurology. Amsterdam, Netherlands: Elsevier Science Publishers B. V., 217–235.
2. Burke DS, Leake CJ, 1998. Japanese Encephalitis. Boca Raton, FL: CRC Press.
3. Zanotto PM, Gould EA, Gao GF, Harvey PH, Holmes EC, 1996. Population dynamics of flaviviruses revealed by molecular phylogenies. Proc Natl Acad Sci USA 93: 548–553.
4. Zanotto PM, Gao GF, Gritsun T, Marin MS, Jiang WR, Venugopal K, Reid HW, Gould EA, 1995. An arbovirus cline across the northern hemisphere. Virology 210: 152–159.
5. Calisher CH, Karabatsos N, Dalrymple JM, Shope RE, Porterfield JS, Westaway EG, Brandt WE, 1989. Antigenic relationships between flaviviruses as determined by cross-neutralization tests with polyclonal antisera. J Gen Virol 70: 37–43.
6. Kuno G, Chang G-JJ, Tsuchiya R, Karabatsos N, Cropp CB, 1998. Phylogeny of the genus Flavivirus. J Virol 72: 73–83.
7. Marin MS, Zanotto PM, Gritsun TS, Gould EA, 1995. Phylogeny of TYU, SRE, and CFA virus: Different evolutionary rates in the genus Flavivirus. Virology 206: 1133–1139.
8. Billoir F, Chesse R, Tolou H, Micco P, Gould EA, Lamballerie X, 2000. Phylogeny of genus Flavivirus using complete coding sequences of arthropod-borne viruses and viruses with no known vector. J Gen Virol 81: 781–790.
9. Chen WR, Tesh RB, Rico-Hesse R, 1990. Genetic variation of Japanese encephalitis virus in nature. J Gen Virol 71: 2915–2922.
10. Chen W-R, Tesh RB, Rico-Hesse R, 1992. A new genotype of Japanese encephalitis virus from Indonesia. Am J Trop Med Hyg 47: 61–69.
11. Huong VTQ, Ha DQ, Deubel V, 1993. Genetic study of Japanese encephalitis viruses from Vietnam. Am J Trop Med Hyg 49: 538–544.
12. Ali A, Igarashi A, Paneru LR, Hasebe F, Morita K, Takagi M, Suwonkerd YT, Wada Y, 1995. Characterization of two Japanese encephalitis virus isolated in Thailand. Arch Virol 140: 1557–1575.
13. Chung Y, Nam J, Ban S, Cho H, 1996. Antigenic and genetic analysis of Japanese encephalitis viruses isolated from Korea. Am J Trop Med Hyg 55: 91–97.
14. Paranjpe S, Banerjee K, 1996. Phylogenetic analysis of envelope gene of Japanese encephalitis virus. Virus Res 42: 107–117.
15. Vrati S, Giri RK, Razdan A, Malik P, 1999. Complete nucleotide sequence of an Indian strain of Japanese encephalitis virus. Am J Trop Med Hyg 61: 677–680.
16. Banerjee K, Ranadive SN, 1989. Oligonucleotide fingerprint analysis of Japanese encephalitis virus strains of different geographical origin. Indian J Med Res 89: 201–216.
17. Ni H, Barrett ADT, 1995. Nucleotide and deduced amino acid sequence of the structural protein genes of Japanese encephalitis viruses from different geographical locations. J Gen Virol 76: 401–407.
18. Webb JKG, Pavri K, George S, Chandy J, Jadhav M, 1964. Japanese B encephalitis in South India: isolation of virus from human brain. Bose SK, Dey AK, eds. Asian Paediatrics. Bombay: Asia Publishing House.
19. Gould EA, Clegg EA, 1985. Growth, titration and purification of Togaviruses. Mahy BWJ, ed. Virology: A Practical Approach. Oxford, England: IRL Press Limited, 43–78.
20. Sumiyoshi H, Mori C, Fuke I, Morita K, Kuhara S, Kondou J, Kikuchi Y, Nagamatu H, Igarashi A, 1987. Complete nucleotide sequence of the Japanese encephalitis virus genome RNA. Virology 161: 497–510.
21. Kumar S, Tamura K, Nei M, 1993. MEGA: Molecular Evolutionary Genetics Analysis. Version 1.01. University Park, PA: The Pennsylvania State University.
22. Rice CM, Lenches EM, Eddy SR, Shin SJ, Sheets RL, Strauss JH, 1985. Nucleotide sequence of yellow fever virus: implications for flavivirus gene expression and evolution. Science 229: 726–733.
23. Thompson JD, Higgins DG, Gibson TJ, 1994. CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22: 4673–4680.
24. Weiller GF, 1999. TransAlign: a Java program package for aligning coding nucleotide sequences according to the amino acid sequence encoded. Version 1.0. Canberra A.C.T. 2601, Australia: Bioinformatics Laboratory, Research School of Biological Sciences, Australian National University.
25. Swofford DL, 1998. PAUP*. Phylogenetic Analysis Using Parsimony (* and Other Methods). Version 4. Sunderland, Massachusetts: Sinauer Associates.
26. Rodriguez R, Oliver JL, Marin A, Medina JR, 1990. The general stochastic model of nucleotide substitution. J Theor Biol 142: 485–501.
27. Hasegawa M, Kishino H, Yano T, 1985. Dating the human-ape split by a molecular clock of mitochondrial DNA. J Mol Evol 22: 106–174.
28. Goloboff PA, 1993. Estimating the character weights during tree search. Cladistics 9: 83–91.
29. Page RDM, 1996. TREEVIEW: an application to display phylogenetic trees on personal computers. Comput Appl Biosci 12: 357–358.
30. Ni H, Barrett ADT, 1996. Analysis of molecular basis of high neuroinvasiveness for mice of wild-type Japanese encephalitis virus strain P3. J Gen Virol 77: 1449–1455.
31. Ni H, Chang GJ, Xie H, Trent DW, Barrett ADT, 1995. Molecular basis of attenuation of neurovirulence of wild-type Japanese encephalitis virus strain SA14. J Gen Virol 76: 409–413.
32. Zuker M, Mathews DH, Turner DH, 1999. Algorithms and Thermodynamics for RNA Secondary Structure Prediction: A Practical Guide. Dordrecht, Netherlands: Kluwer Academic Publishers.
33. Brinton MA, Fernandez AV, Dispoto JH, 1986. The 3′-nucleotides of flavivirus genomic RNA form a conserved secondary structure. Virology 153: 113–121.
34. Brinton MA, Dispoto JH, 1988. Sequence and secondary structure analysis of the 5′-terminal region of flavivirus genome RNA. Virology 162: 290–299.
35. Nei M, Kumar S, Takahashi K, 1998. The optimization principle in phylogenetic analysis tends to give incorrect topologies when the number of nucleotides or amino acids used is small. Proc Natl Acad Sci USA 95: 12390–12397.
36. Gascuel O, 2000. On the optimization principle in phylogenetic analysis and the minimum-evolution criterion. Mol Biol Evol 17: 401–405.
37. Okuno Y, Okuda T, Kondo A, Suzuki M, Kobayashi M, Oya A, 1968. Immunotyping of different strains of Japanese encephalitis virus by antibody-absorption, haemagglutination-inhibition and complement-fixation test. Bull World Health Organ 38: 547–563.
38. Wills MR, Sil BK, Cao JX, Barrett ADT, 1992. Antigenic characterization of live attenuated Japanese encephalitis vaccine virus SA14–4–2: a comparison with isolates of virus covering a wide geographic area. Vaccine 10: 861–872.
39. Hori H, 1986. Oligonucleotide fingerprint analysis of Japanese encephalitis (JE) virus strains of different geographic origins. Trop Med 28: 179–190.
40. Igarashi A, Tanaka M, Morita K, Takasu T, Ahmed A, Akram DS, Waqar MA, 1994. Detection of West Nile and Japanese encephalitis viral genome sequences in cerebrospinal fluid from acute encephalitis cases in Karachi, Pakistan. Microbiol Immunol 38: 827–830.
41. Hasegawa H, Yoshida M, Fujita S, Kobayashi Y, 1994. Comparison of structural proteins among antigenically different Japanese encephalitis virus strains. Vaccine 12: 841–844.
42. Huang CH, 1982. Studies on Japanese encephalitis in China. Advances Virus Res 27: 71–101.
43. Hofmann K, Bucher P, Falquet L, Bairoch A, 1999. The PROSITE database, its status in 1999. Nucleic Acids Res 27: 215–219.
44. Burd CG, Dreyfuss G, 1994. Conserved structures and diversity of functions of RNA-binding proteins. Science 265: 615–621.
45. Frishman D, Argos P, 1996. Incorporation of non-local interactions in protein secondary structure prediction from the amino acid sequence. Protein Eng 9: 133–142.
46. Vrati S, Agarwal V, Malik P, Wani SA, Saini M, 1999. Molecular characterization of an Indian isolate of Japanese encephalitis virus that shows an extended lag phase during growth. J Gen Virol 80: 1665–1671.
47. Banerjee K, 1986. Certain characteristics of Japanese encephalitis virus strains by neutralization test. Indian J Med Res 83: 243–250.
48. Nei M, 1996. Phylogenetic analysis in molecular evolutionary genetics. Annu Rev Gen 30: 371–403.
49. Yang Z, 1996. Phylogenetic analysis using parsimony and likelihood methods. J Mol Evol 42: 294–307.
50. Li W-H, Tanimura M, Sharp PM, 1988. Rates and dates of divergence between AIDS virus nucleotide sequences. Mol Biol Evol 5: 313–330.
