
ECONOMETRIC MODEL WITH

QUALITATIVE VARIABLES
How do we convert qualitative variables into quantitative ones?

Why do we need to do this?

An econometric model needs quantitative variables in order to estimate its
parameters.

What are the differences among these types of variable:

Dummy? Indicator? Binary? Dichotomous? Categorical?
ECONOMETRIC MODEL WITH
DUMMY VARIABLES

Specifically:
What if the variables are not quantitative, for example:
I. Male-female; urban-rural; yes-no; foreign-domestic
II. Level of education: SD, SLTP, SLTA, D3, S1, S2, S3
    Choice of investment: stocks, Bank Indonesia (BI) certificates, gold, etc.

Other uses:
How do we model an unstable regression?
- Jumping regression
- Shifting regression
Technically speaking, do we have problems with our model if:
- The independent variable(s) is (are) dummy variable(s)?
- The dependent variable is a dummy?

Illustration:
We would like to analyze whether there are differences between
graduate and undergraduate students in weekly entertainment
spending.

Y: weekly spending for entertainment per student


PS: graduate or undergraduate
PS = 1 ; graduate student
PS = 0 ; undergraduate student

Model: Y = α + β PS + u
From the model, average spending is:
• Graduate student: E(Y | PS = 1) = α + β
• Undergraduate student: E(Y | PS = 0) = α
For example, using data from a survey, the estimated model is:

Y = 9.4 + 16 PS
t   (53.22)   (6.245)
R² = 96.54%

The model indicates that α ≠ 0 and β ≠ 0 (both are statistically significant).

Interpretation:
average spending for graduate students: 9.4 + 16 = 25.4,
average spending for undergraduate students: 9.4

(There is a difference in spending between the two groups.)
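The link between the dummy coefficient and the two group means can be checked numerically. The sketch below is only an illustration: the spending figures are made up (they are not the survey data behind the estimate above), and the variable names are assumptions.

```python
import numpy as np

# Hypothetical weekly entertainment spending (not the survey data from the slides).
spending = np.array([9.0, 10.0, 9.5, 25.0, 26.0, 24.5])
ps = np.array([0, 0, 0, 1, 1, 1])  # PS = 1 graduate, PS = 0 undergraduate

# Design matrix with an intercept column and the dummy PS.
X = np.column_stack([np.ones_like(spending), ps])
(alpha_hat, beta_hat), *_ = np.linalg.lstsq(X, spending, rcond=None)

mean_undergrad = spending[ps == 0].mean()
mean_grad = spending[ps == 1].mean()

# alpha_hat matches the undergraduate mean; alpha_hat + beta_hat the graduate mean.
print(alpha_hat, mean_undergrad)
print(alpha_hat + beta_hat, mean_grad)
```

With a single dummy regressor, OLS simply reproduces the two sample means, which is why β measures the difference between the groups.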

The next question is whether graduate students are simply more able to pay, or
are more consumption-oriented in their entertainment spending, than
undergraduate students.
Professor’s salary = f(experience, sex)
Is there discrimination in salary policy against female professors?

Y = yearly salary of a professor
X = years of teaching
G = 1 ; male professor
    0 ; female professor

A model that can relate X and G to Y:


Y = α1 + α2 G + β X + u

From the model, it can be seen that:


• Average salary of female professor = α1 + β X
• Average salary of male professor = α1 + α2 + β X
Geometrically:
[Figure: yearly salary (Y) against teaching experience (X). Two parallel lines, one for male professors and one for female professors, with intercept α1 and vertical gap α2.]

Suppose the data give:

Y = 19.21 + 0.373 G + 1.453 X
t   (11.33)   (1.141)   (37.997)
R² = 89.75%

Is there discrimination?
What if we define the dummy differently?
S = 1 ; female professor
  = 0 ; male professor

Since the dummy variable is defined differently, will the results differ in substance?

Model with the new definition:

Y = α1 + α2 S + β X + u
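Drawn as a graph, this becomes:
[Figure: yearly salary (Y) against teaching experience (X). Two parallel lines, one for female professors and one for male professors, with intercept α1 and vertical gap α2.]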

Note that under the new definition:

• Average salary of female professors = α1 + α2 + β X
• Average salary of male professors = α1 + β X
Remark
In defining a dummy variable, it does not matter which category is
represented by one and which by zero, as long as the estimated model is
interpreted consistently.
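The remark can be illustrated by fitting the salary model twice, once with G (male = 1) and once with S = 1 - G (female = 1). This is a minimal sketch on made-up data; the numbers and the helper `ols` are assumptions, not part of the lecture.

```python
import numpy as np

# Hypothetical data: years of teaching, sex dummy, yearly salary (illustrative only).
x = np.array([1.0, 3.0, 5.0, 7.0, 2.0, 4.0, 6.0, 8.0])
g = np.array([1, 1, 1, 1, 0, 0, 0, 0])        # G = 1 male, 0 female
y = np.array([21.5, 24.0, 27.1, 30.2, 22.0, 25.1, 27.9, 31.0])
s = 1 - g                                      # S = 1 female, 0 male

def ols(dummy):
    """Return OLS estimates (intercept, dummy coefficient, slope on x)."""
    design = np.column_stack([np.ones_like(x), dummy, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

a1_g, a2_g, b_g = ols(g)
a1_s, a2_s, b_s = ols(s)

# The dummy coefficient only changes sign, the slope is unchanged,
# and the intercept moves to the new base group.
print(a2_g, a2_s)
print(b_g, b_s)
print(a1_s, a1_g + a2_g)
```

Both codings give the same fitted salaries for every professor; only the group that plays the role of the base category changes.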
What happens if we define the dummy variables as follows:

D2 = 1; male professor
0; female professor

D3 = 1; female professor
0; male professor

The model with this definition:


Y = α1 + α2 D2 + α3 D3 + β X + u

When we estimate this model with OLS, what will happen?


Table. Professors by sex (F = female, M = male)

Name      Sex   D2   D3
Ana Hen   F     0    1
Annisa    F     0    1
Budi      M     1    0
Bambang   M     1    0
Badrun    M     1    0
Betty     F     0    1
Relationship between the regressors: D2 = 1 - D3, or D3 = 1 - D2

Consequence: perfect collinearity

Rule:
If there are m categories, we need only m - 1 dummy variables.
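The collinearity can be seen directly from the rank of the design matrix. The sketch below uses the D2 and D3 columns from the table above; the years-of-teaching values are hypothetical and only there to complete the regressors.

```python
import numpy as np

# D2 and D3 as in the table above (D2 = 1 for male, D3 = 1 for female).
d2 = np.array([0, 0, 1, 1, 1, 0])
d3 = 1 - d2
x = np.array([2.0, 5.0, 3.0, 8.0, 1.0, 6.0])   # hypothetical years of teaching
const = np.ones(6)

with_both = np.column_stack([const, d2, d3, x])
with_one = np.column_stack([const, d2, x])

# The intercept column equals D2 + D3, so one of the four columns is redundant.
print(np.linalg.matrix_rank(with_both))  # 3, although there are 4 columns
print(np.linalg.matrix_rank(with_one))   # 3, full column rank
```

Because the four-column matrix has rank 3, the OLS normal equations have no unique solution; dropping one dummy restores full column rank.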
Qualitative Variables
with more than two categories
Levels of Education: SD, SLTP, SLTA, D3, S1, S2, S3
Choices of Investments: Stock, Saving Deposits, Property, Gold

Can we represent these types of variables with dummy variables?


How?

Suppose we have 3 categories of education level:

(i) Graduated from secondary school or lower,
(ii) Graduated from high school,
(iii) Graduated from university
Can we represent this type of variable with a single variable that
takes values such as 1, 2 and 3, according to the number of
categories?

Should we define it differently?

Try defining it as follows:

D2 = 1 ; if the highest level of education is high school
     0 ; otherwise

D3 = 1 ; if the highest level of education is university
     0 ; otherwise

Do we need to define the remaining category explicitly?


Illustration: 3 categories with 2 dummy variables
(SD = primary school, SLTP = junior secondary school, SMU = high school, PT = university)

Name      Education   D2   D3
Ana Hen   SD          0    0
Annisa    PT          0    1
Budi      SMU         1    0
Bambang   SD          0    0
Badrun    SLTP        0    0
Betty     SMU         1    0
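The coding in the illustration can be written as a small function. This is a sketch only; the function name `encode` and the dictionary layout are assumptions made for the example.

```python
# Education codes as in the illustration; SD and SLTP both fall in the base category.
education = {"Ana Hen": "SD", "Annisa": "PT", "Budi": "SMU",
             "Bambang": "SD", "Badrun": "SLTP", "Betty": "SMU"}

def encode(level):
    """Return (D2, D3): D2 = 1 for high school (SMU), D3 = 1 for university (PT)."""
    d2 = 1 if level == "SMU" else 0
    d3 = 1 if level == "PT" else 0
    return d2, d3

dummies = {name: encode(level) for name, level in education.items()}
for name, (d2, d3) in dummies.items():
    print(f"{name:8s} D2={d2} D3={d3}")
```

The base category needs no dummy of its own: it is identified by D2 = 0 and D3 = 0.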
Life Insurance Consumption = f (income, education)
Consider the following model:
Y = α1 + α2 D2 + α3 D3 + β X + u

Y = life insurance expenses per year


X = income per year
D2 = 1 ; high school degree
0 ; others
D3 = 1 ; college degree (S1)
0 ; others
Average spending based on education:
• less than high school : α1 + βX (base category)
• high school : α1 + α2 + βX
• university/college (S1): α1 + α3 + βX

Note: the reference group is "less than high school". Why?


How do we choose the base group?

[Figure: insurance spending (Y) against income (X) by education level. Three parallel lines for S1, SMU and below SMU, with intercept α1 and shifts α2 and α3.]
Assumed: α3 > α2
Model with Several Qualitative Variables
Salary = f(experience, sex, faculty)

Y = α1 + α2 D2 + α3 D3 + β X + u
Y = salary per year
X = length of teaching/experience (years)
D2 = 1 ; male professor
     0 ; female professor
D3 = 1 ; professor at FE (Faculty of Economics)
     0 ; otherwise

• Average salary of female professors outside FE = α1 + β X
• Average salary of male professors outside FE = α1 + α2 + β X
• Average salary of female professors in FE = α1 + α3 + β X
• Average salary of male professors in FE = α1 + α2 + α3 + β X
Based on the data, we obtain:

Y = 7.43 + 0.207 D2 + 0.164 D3 + 1.226 X
R² = 91.22%

What does it mean if:
(i) the t-tests show that D2 and D3 are not significant?
(ii) the t-tests show that all coefficients are significant?

Average salary (evaluated at X = 1):
• Female professor outside FE = 7.43 + 1.226 = Rp 8.656 million.
• Male professor outside FE = 7.43 + 0.207 + 1.226 = Rp 8.863 million.
• Female professor in FE = 7.43 + 0.164 + 1.226 = Rp 8.820 million.
• Male professor in FE = 7.43 + 0.207 + 0.164 + 1.226 = Rp 9.027 million.
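The four averages above are the fitted equation evaluated at X = 1 for each combination of the two dummies. A short sketch of that arithmetic follows; the function name `average_salary` is an assumption, while the coefficients are those of the estimated model above.

```python
# Estimated coefficients from the fitted salary equation above.
A1, A2, A3, B = 7.43, 0.207, 0.164, 1.226

def average_salary(d2, d3, x):
    """Fitted yearly salary (million Rp) for sex dummy d2, FE dummy d3, x years of teaching."""
    return A1 + A2 * d2 + A3 * d3 + B * x

# x = 1 reproduces the four figures listed above.
for label, d2, d3 in [("female, outside FE", 0, 0), ("male, outside FE", 1, 0),
                      ("female, in FE", 0, 1), ("male, in FE", 1, 1)]:
    print(label, round(average_salary(d2, d3, 1), 3))
```

Changing the last argument gives the fitted averages at other levels of experience; the gaps between the four groups stay the same because the model has a common slope.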
Wage modelling: moonlighting
A moonlighter is a person who holds one main job and one or more
side jobs.

Conjecture: such workers earn inadequate income from their main job
and are therefore forced to look for other sources of income.
What drives it?
Wm = moonlighting wage per hour
Wu = wage from the main job
Ras (race) = 0 ; non-indigenous
           = 1 ; indigenous
Kota (location) = 0 ; rural
                = 1 ; urban
SMU (high school) = 0 ; did not finish high school
                  = 1 ; finished high school
Wilayah (region) = 0 ; Eastern Region
                 = 1 ; Western Region
Umur = age (in years)
Proposed model:
Wm = α1 + α2 Wu + α3 Ras + α4 Kota + α5 SMU + α6 Wilayah + α7 Umur + u

Suppose that, based on a sample, the estimated model is:

Wm = 37.07 + 1.403 Wu - 90.06 Ras + 75.51 Kota + 47.33 SMU
+ 113.64 Wilayah + 2.26 Umur

What does it mean if the F-test and the t-tests show that all variables are
significant at the 5% significance level?

Average wage of a non-indigenous worker in a rural area of the Eastern Region (KTI) who did not
finish high school: Wm = 37.07 + 1.403 Wu + 2.26 Umur

Average wage of an indigenous worker in an urban area of the Western Region (KBI) who finished high school:

Wm = (37.07 - 90.06 + 75.51 + 113.64 + 47.33) + 1.403 Wu + 2.26 Umur
Wm = 183.49 + 1.403 Wu + 2.26 Umur
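Each combination of the four dummies shifts only the intercept of the fitted equation. The sketch below recomputes the two intercepts above; the dictionary `COEF` and the function `group_intercept` are illustrative names, and the coefficients are those of the estimated model.

```python
# Estimated coefficients from the fitted moonlighting equation above.
COEF = {"const": 37.07, "Wu": 1.403, "Ras": -90.06, "Kota": 75.51,
        "SMU": 47.33, "Wilayah": 113.64, "Umur": 2.26}

def group_intercept(ras, kota, smu, wilayah):
    """Intercept of the Wm equation for one combination of the four dummies."""
    return (COEF["const"] + COEF["Ras"] * ras + COEF["Kota"] * kota
            + COEF["SMU"] * smu + COEF["Wilayah"] * wilayah)

base = group_intercept(0, 0, 0, 0)    # non-indigenous, rural, no SMU, eastern
other = group_intercept(1, 1, 1, 1)   # indigenous, urban, SMU, western
print(round(base, 2), round(other, 2))
```

The same function gives the intercept for any of the 16 possible groups, while the coefficients on Wu and Umur stay fixed.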
Comparing 2 regressions
Saving (Y) = α1 + α2 Income (X) + u

The model above assumes that the relationship between saving and income is the same
across the sample and over time.

In reality, however, the model may differ before and after a certain event. For example,
saving behavior may differ before and after an economic crisis.

How can we accommodate this change in saving behavior?

The following model can be used to accommodate such a change.

Period I, before the crisis: Yi = α1 + α2 Xi + ui ; i = 1, 2, … , n
Period II, after the crisis: Yi = β1 + β2 Xi + εi ; i = n+1, n+2, … , N
Possibilities in comparing those two models:

Case 1: α1 = β1 and α2 = β2
Case 2: α1 ≠ β1 and α2 = β2
Case 3: α1 = β1 and α2 ≠ β2
Case 4: α1 ≠ β1 and α2 ≠ β2

Case 1: both models are the same; there is no shift.
Case 4: both models are different; there is a shift.

Dummy variables can be used to address this type of change.
Comparing 2 regressions with a dummy variable
Allowing for a shift in the regression model:

Yi = α1 + α2 Di + β1 Xi + β2 Di Xi + ui

Di = 1 ; observation in period 1
     0 ; observation in period 2

(Here α1 and β1 are the intercept and slope of period 2, and α2 and β2 are the
differences in intercept and slope for period 1.)

Hence, average saving (Y) in each period is:

I : E(Yi | Di = 1) = (α1 + α2) + (β1 + β2) Xi
II : E(Yi | Di = 0) = α1 + β1 Xi
How do we detect a shift in the model?

• 1: If α2 = 0 and β2 = 0 ⇒ Model I = Model II
• 2: If α2 ≠ 0 and β2 = 0 ⇒ same slope, different intercept
• 3: If α2 = 0 and β2 ≠ 0 ⇒ same intercept, different slope
• 4: If α2 ≠ 0 and β2 ≠ 0 ⇒ different intercept and slope
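The pooled dummy-variable model carries the same information as two separate regressions. The sketch below checks this on simulated data; all numbers are made up, and the helper `ols` is an assumption for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated income (X) and saving (Y); all numbers are made up for illustration.
n = 40
x = rng.uniform(10, 50, size=2 * n)
d = np.r_[np.ones(n), np.zeros(n)]          # D = 1 in period 1, 0 in period 2
y = np.where(d == 1, 2.0 + 0.30 * x, 1.0 + 0.15 * x) + rng.normal(0, 0.5, 2 * n)

def ols(design, target):
    """Return the OLS coefficient vector for the given design matrix."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

# Pooled model: Y = a1 + a2 D + b1 X + b2 D X + u
a1, a2, b1, b2 = ols(np.column_stack([np.ones(2 * n), d, x, d * x]), y)

# Separate regressions for each period.
p1, p2 = d == 1, d == 0
int1, slope1 = ols(np.column_stack([np.ones(n), x[p1]]), y[p1])
int2, slope2 = ols(np.column_stack([np.ones(n), x[p2]]), y[p2])

print(a1 + a2, int1, b1 + b2, slope1)   # period 1
print(a1, int2, b1, slope2)             # period 2
```

The point estimates coincide; what the pooled form adds is that the four cases can be examined through tests on α2 and β2, which this sketch does not compute.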
Piecewise linear regression
Application: modelling sales commission
Scenario:
A bonus is given to salespeople who exceed a target, say X*.

Y = sales commission
X = sales volume achieved by the salesperson
X* = sales target
D = 1 ; if X > X*
    0 ; if X ≤ X*
Average sales commission when the target is not exceeded:
Commission = α1 + β1 X ; X ≤ X*
Average sales commission when the target is exceeded:
Commission = α1 + (β1 + β2) X - β2 X* ; X > X*

Hence the model can be combined into:
Y = α1 + β1 X + β2 (X - X*) D

Geometrically:
[Figure: commission (Y) against sales (X), with α1 marked on the vertical axis and X* on the horizontal axis.]
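The combined equation can be evaluated directly to see that the two segments meet at the target. In the sketch below the parameter values are hypothetical, chosen only to make the arithmetic easy to follow, and the function name `commission` is an assumption.

```python
def commission(x, x_star, a1, b1, b2):
    """Expected commission Y = a1 + b1*x + b2*(x - x_star)*D, with D = 1 if x > x_star."""
    d = 1 if x > x_star else 0
    return a1 + b1 * x + b2 * (x - x_star) * d

# Hypothetical parameter values chosen only to illustrate the change of slope at the target.
A1, B1, B2, X_STAR = 100.0, 0.5, 0.25, 1000.0

below = commission(800.0, X_STAR, A1, B1, B2)    # slope b1 applies
at = commission(1000.0, X_STAR, A1, B1, B2)      # the two segments meet here
above = commission(1200.0, X_STAR, A1, B1, B2)   # slope b1 + b2 applies
print(below, at, above)
```

Because the dummy multiplies (X - X*), the term vanishes at X = X*, so the line changes slope at the target without jumping.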
The end of the lesson
by Nachrowi D. Nachrowi
