
Spatial Analysis of Serbian Election Data

Aleksandar Tomasevic, University of Novi Sad

This notebook contains the spatial analysis of turnout rates in the last two Serbian presidential elections (2012
and 2017). The spatial units of analysis are municipalities.
The analysis includes:
Plotting turnout maps
Calculating the global Moran's I
Testing the significance of the global Moran's I using Monte Carlo simulations
Calculating LISA (local Moran's I) and testing their significance
Plotting the Moran scatterplot (both regular and normalized)
Plotting the LISA map

Data preparation

Required packages for the entire analysis:


library(ggplot2)
library(maptools)
library(raster)
library(classInt)
library(RColorBrewer)
library(spdep)

First we download the shape file for Serbia on the municipality level.
Serbia<- getData('GADM', country='SRB', level=2)

Then we import the matrix of turnout results by municipality.


Serbia@data<-read.csv("opstine.csv")

Now we need to extract the neighborhood structure from the Serbia shape file. First we convert the polygon
shape file to a neighbors list (nb format).
pSerbia<-poly2nb(Serbia)

Then we convert the nb object to a spatial weights list.


lSerbia<-nb2listw(pSerbia,style="W")
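To make the effect of style="W" concrete, here is a small base-R sketch that is not part of the original notebook. It row-standardises a hypothetical binary contiguity matrix for four units arranged in a line. The matrix `adj` and its layout are illustrative assumptions; spdep stores the same information as a list rather than a dense matrix.

```r
# Hypothetical contiguity for four units on a line: 1 - 2 - 3 - 4
adj <- matrix(c(0, 1, 0, 0,
                1, 0, 1, 0,
                0, 1, 0, 1,
                0, 0, 1, 0),
              nrow = 4, byrow = TRUE)

# Row standardisation: each unit's weights are divided by its number of
# neighbours, so every row sums to 1 (the idea behind style = "W")
W <- adj / rowSums(adj)

print(W)
print(rowSums(W))
```

A unit with one neighbour gives that neighbour a weight of 1, while a unit with two neighbours gives each a weight of 0.5.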

Now we have the two basic elements of every spatial analysis: the variables we analyse, stored in the
Serbia@data slot, and the spatial weights, which are a crucial ingredient of most spatial methods.
For simplicity we extract the two variables of interest from the Serbia object: the turnout for the 2012 and 2017
elections.
T12<-Serbia@data$X2012
T17<-Serbia@data$X2017

Turnout maps

Now we can plot the two turnout maps. Each range of turnout needs its own color. Ranges can
be expressed as quantiles, but for this purpose we use fixed turnout percentages as class boundaries.

breaks<-c(35,40,45,55,60,65,70)
color<-brewer.pal(7,"Reds")
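As a quick illustration of how `findInterval` assigns turnout values to these classes, the sketch below uses a few made-up turnout values (not taken from opstine.csv). It also shows why values outside the 35-70 range need care: by default a value below the lowest break gets index 0, which selects no color at all when used to subset a color vector.

```r
breaks <- c(35, 40, 45, 55, 60, 65, 70)
# Hypothetical turnout percentages, including one below and one above the range
turnout <- c(22.5, 38, 47.3, 58, 66, 71)

# Default behaviour: values below 35 map to 0, values of 70 or more map to 7
idx_default <- findInterval(turnout, breaks)

# all.inside = TRUE clamps indices to 1..(length(breaks) - 1), i.e. 1..6
idx_clamped <- findInterval(turnout, breaks, all.inside = TRUE)

print(idx_default)
print(idx_clamped)
```

With the clamped indices, every municipality falls into one of the six classes between consecutive breaks, with the lowest and highest classes absorbing the out-of-range values.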

Plot of 2012 turnout.


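# all.inside=TRUE keeps class indices within 1..6, so turnouts below 35 or above 70 still get a color
plot(Serbia,col=color[findInterval(T12,breaks,all.inside=TRUE)])
title(paste("2012 Presidential Election Turnout by Municipality"))
legend(x="topright",y=24.06,legend = paste(leglabs(breaks),"%"),fill=color,bty="n")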

[Figure: 2012 Presidential Election Turnout by Municipality. Legend classes: under 40 %, 40-45 %, 45-55 %, 55-60 %, 60-65 %, over 65 %.]

A plot of 2017 turnout.


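# all.inside=TRUE keeps class indices within 1..6, so turnouts below 35 or above 70 still get a color
plot(Serbia,col=color[findInterval(T17,breaks,all.inside=TRUE)])
title(paste("2017 Presidential Election Turnout by Municipality"))
legend(x="topright",y=24.06,legend = paste(leglabs(breaks),"%"),fill=color,bty="n")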

[Figure: 2017 Presidential Election Turnout by Municipality. Legend classes: under 40 %, 40-45 %, 45-55 %, 55-60 %, 60-65 %, over 65 %.]

Calculating Moran's I


For each year we can check whether there is a global autocorrelation pattern in turnout between municipalities.
First, we calculate Moran's I for 2012 and 2017.
M2012<-moran(T12,lSerbia,length(pSerbia),S0=Szero(lSerbia))
M2017<-moran(T17,lSerbia,length(pSerbia),S0=Szero(lSerbia))

Moran's I is 0.3366 for 2012 and 0.2719 for 2017.
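For reference, the global Moran's I is usually written as below, where $x_i$ is the turnout in municipality $i$, $\bar{x}$ the mean turnout, $w_{ij}$ the spatial weights, $n$ the number of municipalities and $S_0$ the sum of all weights (the quantity passed as S0 in the calls to moran). This is the standard textbook definition, not something stated in the notebook itself.

```latex
I = \frac{n}{S_0}\,
    \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i-\bar{x})(x_j-\bar{x})}
         {\sum_{i=1}^{n}(x_i-\bar{x})^2},
\qquad
S_0 = \sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}
```

With row-standardised weights and no isolated units, $S_0 = n$, so the leading factor equals 1.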
Now we should test the significance of the statistic.
moran.test(T12,lSerbia)

##
## Moran I test under randomisation
##

## data: T12
## weights: lSerbia
##
## Moran I statistic standard deviate = 7.0354, p-value = 9.936e-13
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance
##       0.336612147      -0.006250000       0.002375007
moran.test(T17,lSerbia)

##
## Moran I test under randomisation
##
## data: T17
## weights: lSerbia
##
## Moran I statistic standard deviate = 5.8192, p-value = 2.956e-09
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance
##        0.27194165       -0.00625000        0.00228536
Both statistics are significant: the p-value is 9.9e-13 for 2012 and 3e-09 for 2017.
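The standard deviates and p-values in the output above can be reproduced by hand from the reported estimates. The sketch below only re-uses the numbers printed by moran.test; the variable names are illustrative.

```r
# Values copied from the moran.test output for 2012 and 2017
I_obs <- c("2012" = 0.336612147, "2017" = 0.27194165)
E_I   <- c("2012" = -0.006250000, "2017" = -0.00625000)
Var_I <- c("2012" = 0.002375007, "2017" = 0.00228536)

# Standard deviate: distance of the observed I from its expectation, in SD units
z <- (I_obs - E_I) / sqrt(Var_I)

# One-sided p-value for the alternative "greater"
p <- pnorm(z, lower.tail = FALSE)

# The expectation of I under no autocorrelation is -1 / (n - 1),
# so the number of spatial units can be recovered from it
n <- 1 - 1 / E_I[["2012"]]

print(round(z, 4))
print(signif(p, 3))
print(n)
```

The recovered `n` is the number of municipalities that entered the test, which is a useful sanity check that no units were dropped.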
However, since the significance test for Moran's I is highly sensitive to the configuration of spatial units, the
ordinary significance test (obtained from z-scores) should be complemented with a simulation test.
We perform a Monte Carlo randomization in which the variable values are randomly permuted between
municipalities. The observed value of Moran's I is then compared with the distribution of the statistic under
random permutations, which gives a more robust statistical test.
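The objects MC2012 and MC2017 used in the plots of this section are not defined in the text shown here; they presumably come from spdep's moran.mc, which runs this kind of permutation test, but that call and its number of simulations are not given. To show the idea without guessing those details, here is a self-contained base-R sketch on a hypothetical 5 x 5 grid. The data, the grid, the helper `moran_i` and `nsim = 999` are all assumptions made for illustration.

```r
set.seed(1)

# Hypothetical 5 x 5 grid; neighbours share an edge (Manhattan distance 1)
coords <- expand.grid(row = 1:5, col = 1:5)
d <- as.matrix(dist(coords, method = "manhattan"))
W <- (d == 1) * 1
W <- W / rowSums(W)   # row-standardised weights

# Global Moran's I for values x and weight matrix W
moran_i <- function(x, W) {
  z <- x - mean(x)
  (length(x) / sum(W)) * sum(W * outer(z, z)) / sum(z^2)
}

# Made-up variable with a smooth spatial gradient plus noise
x <- coords$row + coords$col + rnorm(nrow(coords), sd = 0.5)
observed <- moran_i(x, W)

# Permutation distribution: shuffle values across units, recompute I
nsim <- 999
simulated <- replicate(nsim, moran_i(sample(x), W))

# One-sided pseudo p-value: share of statistics at least as large as observed,
# counting the observed value itself
p_value <- (sum(simulated >= observed) + 1) / (nsim + 1)

print(observed)
print(p_value)
```

Because the observed value is counted among the simulations, the smallest attainable pseudo p-value is 1 / (nsim + 1), so the reported p-value depends on how many permutations are run.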
The density plot shows the distribution of Moran's I over the random permutations for 2012.
ggplot(as.data.frame(MC2012$res[1:1000]),aes(x=MC2012$res[1:1000])) + geom_density(fill="deepskyblue4",a

[Figure: Density plot of randomized Moran's I for 2012. x-axis: Simulated Moran's I; y-axis: Density; the observed Moran's I is marked on the plot.]
The same holds for the 2017 data. For 2017 the simulated distribution has a thin tail on the right side, close to the observed value.
However, the observed value still lies outside the simulated distribution, and both MC tests result in a p-value
of 0.0009. This confirms the significance of both statistics.
ggplot(as.data.frame(MC2017$res[1:1000]),aes(x=MC2017$res[1:1000])) + geom_density(fill="deepskyblue4",a

[Figure: Density plot of randomized Moran's I for 2017. x-axis: Simulated Moran's I; y-axis: Density; the observed Moran's I is marked on the plot.]

Spatial correlation is weaker in 2017. Looking at the maps, the absence of distinct high-turnout
zones contributes to the weaker correlation, as there are signs of local spatial randomness. This points
to the need to complement this analysis with local Moran coefficients.

Moran scatterplots
First we calculate the local Moran coefficients for each municipality. Significant results are then divided
into four categories (High-High, Low-Low, Low-High and High-Low), and the rest are marked as not
significant.
The first step is to calculate the local Moran's I for every municipality.
Serbia$LM12<-localmoran(T12,lSerbia)
Serbia$LM17<-localmoran(T17,lSerbia)
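For reference, a common textbook form of the local Moran statistic is given below, with $z_i = x_i - \bar{x}$ the centred turnout, $w_{ij}$ the spatial weights, $S_0$ the sum of all weights and $I$ the global Moran's I. The exact variance normalisation used by localmoran depends on its arguments, so this should be read as the general definition rather than a description of spdep's internals.

```latex
I_i = \frac{z_i}{m_2}\sum_{j=1}^{n} w_{ij}\,z_j,
\qquad
m_2 = \frac{1}{n}\sum_{k=1}^{n} z_k^2,
\qquad
\sum_{i=1}^{n} I_i = S_0\, I
```

The last identity is what makes the local values a decomposition of the global statistic: with row-standardised weights and no isolated units, $S_0 = n$ and the global $I$ is simply the mean of the $I_i$.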

The next step is to calculate neighborhood values using the spatial lag. To simplify the assignment of quadrants in the
Moran scatterplot, we standardize the original variable and calculate the neighborhood values
from the standardized values.
Serbia$s12<-scale(T12)
Serbia$s17<-scale(T17)
Serbia$lag12<-lag.listw(lSerbia,Serbia$s12)
Serbia$lag17<-lag.listw(lSerbia,Serbia$s17)
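The spatial lag is, for each unit, the weighted average of its neighbours' values. A minimal base-R sketch on made-up data shows what this amounts to with row-standardised weights; the four-unit layout and the turnout values are hypothetical, and a dense matrix product stands in for lag.listw.

```r
# Hypothetical row-standardised weights for four units on a line: 1 - 2 - 3 - 4
adj <- matrix(c(0, 1, 0, 0,
                1, 0, 1, 0,
                0, 1, 0, 1,
                0, 0, 1, 0),
              nrow = 4, byrow = TRUE)
W <- adj / rowSums(adj)

# Made-up turnout values, standardised to mean 0 and SD 1 as scale() does
turnout <- c(40, 50, 60, 70)
s <- as.numeric(scale(turnout))

# Spatial lag: for each unit, the weighted average of its neighbours' values
lag_s <- as.numeric(W %*% s)

print(round(cbind(s, lag_s), 3))
```

Because the variable is standardised, the sign of `s` says whether a unit is above or below average, and the sign of `lag_s` says the same about its neighbourhood, which is what defines the four quadrants of the Moran scatterplot.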

Now we divide the results into the five groups mentioned above (four significant categories plus not significant). Since the groups will be used for
coloring graphs and maps, we add a new variable to the Serbia object.

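# Column 5 of the localmoran output holds the p-value; 0.05 is the significance threshold.
# In each label the first word refers to the municipality, the second to its neighborhood.
# Colors follow the legend order: red, blue, skyblue2 (Low-High), lightpink (High-Low).
Serbia$color12[(Serbia$s12>=0 & Serbia$lag12>=0)&Serbia$LM12[,5]<=0.05]<-"red"       # High-High
Serbia$color12[(Serbia$s12<=0 & Serbia$lag12<=0)&Serbia$LM12[,5]<=0.05]<-"blue"      # Low-Low
Serbia$color12[(Serbia$s12<=0 & Serbia$lag12>=0)&Serbia$LM12[,5]<=0.05]<-"skyblue2"  # Low-High
Serbia$color12[(Serbia$s12>=0 & Serbia$lag12<=0)&Serbia$LM12[,5]<=0.05]<-"lightpink" # High-Low
Serbia$color12[Serbia$LM12[,5]>0.05]<-"black"                                         # Not significant
Serbia$color17[(Serbia$s17>=0 & Serbia$lag17>=0)&Serbia$LM17[,5]<=0.05]<-"red"       # High-High
Serbia$color17[(Serbia$s17<=0 & Serbia$lag17<=0)&Serbia$LM17[,5]<=0.05]<-"blue"      # Low-Low
Serbia$color17[(Serbia$s17<=0 & Serbia$lag17>=0)&Serbia$LM17[,5]<=0.05]<-"skyblue2"  # Low-High
Serbia$color17[(Serbia$s17>=0 & Serbia$lag17<=0)&Serbia$LM17[,5]<=0.05]<-"lightpink" # High-Low
Serbia$color17[Serbia$LM17[,5]>0.05]<-"black"                                         # Not significant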
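The grouping logic can also be written as a small helper that returns category labels instead of colors, which makes the rule easier to check on its own. The function name `classify_lisa`, its arguments and the example values are hypothetical; it follows the usual convention that the first word of a label describes the unit and the second its neighbourhood.

```r
# Hypothetical helper: label each unit by Moran scatterplot quadrant,
# keeping the label only where the local Moran p-value is at most alpha
classify_lisa <- function(s, lag_s, p, alpha = 0.05) {
  quadrant <- ifelse(s >= 0 & lag_s >= 0, "High-High",
              ifelse(s <  0 & lag_s <  0, "Low-Low",
              ifelse(s <  0 & lag_s >= 0, "Low-High", "High-Low")))
  ifelse(p <= alpha, quadrant, "Not significant")
}

# Made-up standardised values, spatial lags and p-values for five units
s     <- c( 1.2, -0.8, -0.5,  0.9,  0.1)
lag_s <- c( 0.7, -0.6,  0.4, -0.3,  0.2)
p     <- c(0.01, 0.03, 0.04, 0.02, 0.60)

print(classify_lisa(s, lag_s, p))
```

Unlike a sequence of overlapping `>=` and `<=` conditions, the nested `ifelse` assigns every unit to exactly one quadrant, so a value of exactly zero cannot be overwritten by a later rule.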

Then we can draw the scatterplots using the predefined function and use the coloring variables to mark the points.
Here is the 2012 plot.
legenda.tekst=c("High-High","Low-Low","Low-High","High-Low","Not significant")
moran.plot(T12,lSerbia,xlab="Observed value",ylab="Neighborhood value value",labels=as.character(Serbia@
legend("bottomright",legend=legenda.tekst,col=c("red","blue","skyblue2","lightpink","black","black"),pch

[Figure: Moran scatterplot for 2012. x-axis: Observed value; y-axis: Neighborhood value. Labelled municipalities: Niš, Pirot, Doljevac, Bosilegrad, Medveđa, Priboj, Kruševac, Merošina, Blace, Bela Palanka, Bojnik, Trgovište, Negotin, Bujanovac, Preševo. Legend: High-High, Low-Low, Low-High, High-Low, Not significant.]

And another one for 2017.

moran.plot(T17,lSerbia,xlab="Observed value",ylab="Neighborhood value value",labels=as.character(Serbia@
legend("bottomright",legend=legenda.tekst,col=c("red","blue","skyblue2","lightpink","black","black"),pch

[Figure: Moran scatterplot for 2017 (titled "Moran scatterplot for 2012" in the source). x-axis: Observed value; y-axis: Neighborhood value. Labelled municipalities: Bosilegrad, Pirot, Medveđa, Bela Palanka, Bujanovac, Senta, Trgovište, Novi Kneževac, Preševo. Legend: High-High, Low-Low, Low-High, High-Low, Not significant.]

Mapping local Moran


The scatterplot lets us see the significant local Moran values, but the key to the analysis is their spatial
distribution. In order to reveal clusters or pockets of locally correlated municipalities, it is necessary to
plot the local Moran results on a map.
Here is the first map for 2012.
breaks<-seq(1,5,1)
labels<-legenda.tekst
Serbia$color12[Serbia$LM12[,5]>0.05]<-"white"
Serbia$color17[Serbia$LM17[,5]>0.05]<-"white"
plot(Serbia,col=Serbia$color12)
legend("topleft",legend=labels[1:5],fill=c("red","blue","skyblue2","lightpink","white"),bty="n")

[Figure: LISA map for 2012. Legend: High-High, Low-Low, Low-High, High-Low, Not significant.]

And a map for 2017.


plot(Serbia,col=Serbia$color17)
legend("topleft",legend=labels[1:5],fill=c("red","blue","skyblue2","lightpink","white"),bty="n")

[Figure: LISA map for 2017. Legend: High-High, Low-Low, Low-High, High-Low, Not significant.]
