
Vladik Kreinovich · Songsak Sriboonchitta (Editors)

Structural Changes and their Econometric Modeling

Studies in Computational Intelligence, Volume 808

Series editor: Janusz Kacprzyk, Polish Academy of Sciences, Warsaw, Poland
e-mail: kacprzyk@ibspan.waw.pl

The series “Studies in Computational Intelligence” (SCI) publishes new developments and advances in the various areas of computational intelligence—quickly and with a high quality. The intent is to cover the theory, applications, and design methods of computational intelligence, as embedded in the fields of engineering, computer science, physics and life sciences, as well as the methodologies behind them. The series contains monographs, lecture notes and edited volumes in computational intelligence spanning the areas of neural networks, connectionist systems, genetic algorithms, evolutionary computation, artificial intelligence, cellular automata, self-organizing systems, soft computing, fuzzy systems, and hybrid intelligent systems. Of particular value to both the contributors and the readership are the short publication timeframe and the world-wide distribution, which enable both wide and rapid dissemination of research output.

Vladik Kreinovich · Songsak Sriboonchitta (Editors)

Structural Changes and their Econometric Modeling

Editors:
Vladik Kreinovich, Department of Computer Science, University of Texas at El Paso, El Paso, TX, USA
Songsak Sriboonchitta, Faculty of Economics, Chiang Mai University, Chiang Mai, Thailand

Studies in Computational Intelligence
ISBN 978-3-030-04262-2    ISBN 978-3-030-04263-9 (eBook)
https://doi.org/10.1007/978-3-030-04263-9

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.

Contents

General Theory

The Replacement for Hypothesis Testing . . . . . . . . . . . . . . . . . . . . . . . . 3

William M. Briggs, Hung T. Nguyen, and David Traﬁmow

On Quantum Probability Calculus for Modeling

Economic Decisions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

Hung T. Nguyen, Songsak Sriboonchitta, and Nguyen Ngoc Thach

My Ban on Null Hypothesis Signiﬁcance Testing

and Conﬁdence Intervals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

David Traﬁmow

Kalman Filter and Structural Change Revisited: An Application

to Foreign Trade-Economic Growth Nexus . . . . . . . . . . . . . . . . . . . . . . 49

Omorogbe Joseph Asemota

Statisticians Should Not Tell Scientists What to Think . . . . . . . . . . . . . . 63

Donald Bamber

Bayesian Modelling Structural Changes on Housing Price Dynamics . . . 83

Hong Than-Thi, Manh Cuong Dong, and Cathy W. S. Chen

Cumulative Residual Entropy-Based Goodness of Fit Test

for Location-Scale Time Series Model . . . . . . . . . . . . . . . . . . . . . . . . . . 105

Sangyeol Lee

The Quantum Formalism in Social Science: A Brief Excursion . . . . . . . 116

Emmanuel Haven

How Annualized Wavelet Trading “Beats” the Market . . . . . . . . . . . . . 124

Lanh Tran

Flexible Constructions for Bivariate Copulas Emphasizing

Local Dependence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138

Xiaonan Zhu, Qingsong Shan, Suttisak Wisadwongsa, and Tonghui Wang


Normal Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152

Cong Wang, Tonghui Wang, David Traﬁmow, and Hunter A. Myüz

Why the Best Predictive Models Are Often Different from the Best

Explanatory Models: A Theoretical Explanation . . . . . . . . . . . . . . . . . . 163

Songsak Sriboonchitta, Luc Longpré, Vladik Kreinovich,

and Thongchai Dumrongpokaphan

Algorithmic Need for Subcopulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172

Thach Ngoc Nguyen, Olga Kosheleva, Vladik Kreinovich,

and Hoang Phuong Nguyen

How to Take Expert Uncertainty into Account: Economic Approach

Illustrated by Pavement Engineering Applications . . . . . . . . . . . . . . . . . 182

Edgar Daniel Rodriguez Velasquez, Carlos M. Chang Albitres,

Thach Ngoc Nguyen, Olga Kosheleva, and Vladik Kreinovich

Quantum Approach Explains the Need for Expert Knowledge:

On the Example of Econometrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

Songsak Sriboonchitta, Hung T. Nguyen, Olga Kosheleva,

Vladik Kreinovich, and Thach Ngoc Nguyen

Applications

Monetary Policy Shocks and Macroeconomic Variables: Evidence

from Thailand . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203

Popkarn Arwatchanakarn

Thailand’s Household Income Inequality Revisited: Evidence

from Decomposition Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220

Natthaphat Kingnetr, Supanika Leurcharusmee, and Songsak Sriboonchitta

Simultaneous Conﬁdence Intervals for All Differences of Variances

of Log-Normal Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235

Warisa Thangjai and Suparat Niwitpong

Conﬁdence Intervals for the Inverse Mean and Difference of Inverse

Means of Normal Distributions with Unknown Coefﬁcients

of Variation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245

Warisa Thangjai, Sa-Aat Niwitpong, and Suparat Niwitpong

Conﬁdence Intervals for the Mean of Delta-Lognormal Distribution . . . 264

Patcharee Maneerat, Sa-Aat Niwitpong, and Suparat Niwitpong

The Interaction Between Fiscal Policy, Macroprudential Policy

and Financial Stability in Vietnam-An Application of Structural

Equation Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275

Nguyen Ngoc Thach, Tran Thi Kim Oanh, and Huynh Ngoc Chuong


Index for Vietnam . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289

Nguyen Ngoc Thach, Tran Thi Kim Oanh, and Huynh Ngoc Chuong

Mercury Retrograde and Stock Market Returns in Vietnam . . . . . . . . . 303

Nguyen Ngoc Thach and Nguyen Van Diep

Modeling Persistent and Periodic Weekly Rainfall in an Environment

of an Emerging Sri Lankan Economy . . . . . . . . . . . . . . . . . . . . . . . . . . 314

H. P. T. N. Silva, G. S. Dissanayake, and T. S. G. Peiris

Value at Risk of SET Returns Based on Bayesian Markov-Switching

GARCH Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329

Petchaluck Boonyakunakorn, Pathairat Pastpipatkul,

and Songsak Sriboonchitta

Benfordness of Chains of Truncated Beta Distributions via a Piecewise

Constant Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342

Tippawan Santiwipanont, Songkiat Sumetkijakan,

and Teerapot Wiriyakraikul

Conﬁdence Intervals for Coefﬁcient of Variation of Three

Parameters Delta-Lognormal Distribution . . . . . . . . . . . . . . . . . . . . . . . 352

Noppadon Yosboonruang, Suparat Niwitpong, and Sa-Aat Niwitpong

Conﬁdence Intervals for Difference Between Means and Ratio

of Means of Weibull Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364

Manussaya La-ongkaew, Sa-Aat Niwitpong, and Suparat Niwitpong

Trading Signal Analysis with Pairs Trading Strategy in the Stock

Exchange of Thailand . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378

Natnarong Namwong, Woraphon Yamaka, and Roengchai Tansuchat

Technical Efﬁciency Analysis of Tourism and Logistics in ASEAN:

Comparing Bootstrapping DEA and Stochastic Frontier Analysis

Based Decision on Copula Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . 389

Chanamart Intapan, Songsak Sriboonchitta, Chukiat Chaiboonsri,

and Pairach Piboonrungroj

Estimating the Difference in the Percentiles of Two Delta-Lognormal

Independent Populations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402

Maneerat Jaithun, Sa-Aat Niwitpong, and Suparat Niwitpong

Impacts of Global Market Volatility and US Dollar on Agricultural

Commodity Futures Prices: A Panel Cointegration Approach . . . . . . . . 412

Khunanont Lerkeitthamrong, Chatchai Khiewngamdee,

and Rossarin Osathanunkul


in Thailand’s Economic Trends Using Dynamic Stochastic

General Equilibrium (DSGE) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423

Chaiwat Klinlampu, Chukiat Chaiboonsri, Anuphak Saosaovaphak,

and Jirakom Sirisrisakulchai

A Regime Switching Skew-Distribution Model of Contagion . . . . . . . . . 439

Woraphon Yamaka, Payap Tarkhamtham, Paravee Maneejuk,

and Songsak Sriboonchitta

Structural Breaks Dependence Analysis of Oil, Natural Gas,

and Heating Oil: A Vine-Copula Approach . . . . . . . . . . . . . . . . . . . . . . 451

Nopasit Chakpitak, Payap Tarkhamtham, Woraphon Yamaka,

and Songsak Sriboonchitta

Markov Switching Constant Conditional Correlation GARCH

Models for Hedging on Gold and Crude Oil . . . . . . . . . . . . . . . . . . . . . 463

Noppasit Chakpitak, Pichayakone Rakpho, and Woraphon Yamaka

Portfolio Optimization of Stock, Oil and Gold Returns: A Mixed

Copula-Based Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474

Sukrit Thongkairat, Woraphon Yamaka, and Nopasit Chakpitak

Markov Switching Quantile Model Unknown tau Energy Stocks Price

Index Thailand . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 488

Pichayakone Rakpho, Woraphon Yamaka, and Songsak Sriboonchitta

Modeling the Dependence Dynamics and Risk Spillovers for G7

Stock Markets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 497

Noppasit Chakpitak, Rungrapee Phadkantha, and Woraphon Yamaka

A Regime Switching Vector Error Correction Model of Analysis

of Cointegration in Oil, Gold, Stock Markets . . . . . . . . . . . . . . . . . . . . . 514

Sukrit Thongkairat, Woraphon Yamaka, and Songsak Sriboonchitta

A Regime Switching Time-Varying Copula Approach to Oil and Stock

Markets Dependence: The Case of G7 Economies . . . . . . . . . . . . . . . . . 525

Rungrapee Phadkantha, Woraphon Yamaka, and Songsak Sriboonchitta

Forecasting Exchange Rate with Linear and Non-linear

Vector Autoregressive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541

Rungrapee Phadkantha, Woraphon Yamaka, and Songsak Sriboonchitta

The Impacts of Macroeconomic Variables on Economic Growth:

Evidence from China, Japan, and South Korea . . . . . . . . . . . . . . . . . . . 552

Wilawan Srichaikul, Woraphon Yamaka, and Songsak Sriboonchitta


Countries: Panel Threshold Approach and Panel Smooth Transition

Regression Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 563

Noppasit Chakpitak, Wilawan Srichaikul, Woraphon Yamaka,

and Songsak Sriboonchitta

Predictive Recursion Maximum Likelihood for Kink

Regression Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572

Noppasit Chakpitak, Woraphon Yamaka, and Paravee Maneejuk

Bayesian Extreme Value Optimization Algorithm: Application

to Forecast the Rubber Futures in Futures Exchange Markets . . . . . . . 582

Arisara Romyen, Satawat Wannapan, and Chukiat Chaiboonsri

Measuring U.S. Business Cycle Using Markov-Switching Model:

A Comparison Between Empirical Likelihood Estimation and

Parametric Estimations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 596

Paravee Maneejuk, Woraphon Yamaka, and Songsak Sriboonchitta

Analysis of Small and Medium-Sized Enterprises’ Insolvency

Probability by Financial Statements Using Probit Kink Model:

Manufacture Sector in Songkhla Province, Thailand . . . . . . . . . . . . . . . 607

Chalerm Jaitang, Paravee Maneejuk, Aree Wiboonpongse,

and Songsak Sriboonchitta

Frequency Domain Causality Analysis of Stock Market and Economic

Activities in Vietnam . . . . . . . . . . . . . . . . . . . . . . . . . . . . 620

Nguyen Ngoc Thach, Le Hoang Anh, and Ha Thi Nhu Phuong

Investigating Structural Dependence in Natural Rubber Supplys

Based on Entropy Analyses and Copulas . . . . . . . . . . . . . . . . . . . . . . . . 639

Kewalin Somboon, Chukiat Chaiboonsri, Satawat Wannapan,

and Songsak Sriboonchitta

The Dependence Between International Crude Oil Price and Vietnam

Stock Market: Nonlinear Cointegration Test Approach . . . . . . . . . . . . . 648

Le Hoang Anh, Tran Phuoc, and Ha Thi Nhu Phuong

Stability of Vietnam Money Demand Function: An Empirical

Application of Multiple Testing with a Structural Break . . . . . . . . . . . . 670

Bui Quang Hien and Pham Dinh Long

Analytic on Long-Run Equilibrium Between Thailand’s Economy

and Business Tourism (MICE) Industry Using Bayesian Inference . . . . 684

Chanamart Intapan, Songsak Sriboonchitta, Chukiat Chaiboonsri,

and Pairach Piboonrungroj


in Asia: Zero Inefﬁciency Meta-Frontier Approach . . . . . . . . . . . . . . . . 702

Jianxu Liu, Hui Li, Songsak Sriboonchitta, and Sanzidur Rahman

Technical Efﬁciency Analysis of Agricultural Production of BRIC

Countries and the United States of America: A Copula-Based

Meta-Frontier Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 724

Jianxu Liu, Yangnan Cheng, Sanzidur Rahman, and Songsak Sriboonchitta

Comparisons of Conﬁdence Interval for a Ratio of Non-normal

Variances Using a Kurtosis Estimator . . . . . . . . . . . . . . . . . . . . . . . . . . 745

Channarong Wongyai and Sirima Suwan

An Analysis of Stock Market Cycle with Markov Switching

and Kink Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 756

Konnika Palason and Roengchai Tansuchat

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 775

General Theory

The Replacement for Hypothesis Testing

William M. Briggs¹, Hung T. Nguyen²,³, and David Trafimow²

¹ New York, USA. matt@wmbriggs.com
² New Mexico State University, Las Cruces, USA. {hunguyen,dtrafimo}@nmsu.edu
³ Chiang Mai University, Chiang Mai, Thailand

factors, leads to over-certainty, and produces the false idea that causes

have been identiﬁed via statistical methods. The limitations and abuses

of p-values in particular are so well known, and by now so egregious, that a new method is badly needed. We propose returning to an old

idea, making direct predictions by models of observables, assessing the

value of evidence by the change in predictive ability, and then verify-

ing the predictions against reality. The latter step is badly in need of

implementation.

Keywords: Model selection · Model validation · Predictive probability

degree, certain hypotheses are true or false, or if a theory is good or bad, or

useful or not. This is not, of course, what that phrase means in frequentist or

Bayesian theory. Classical statistical philosophy has developed measures, such as

p-values and Bayes factors, which are not directly related to the plain meaning.

Yet the plain meaning is what all seek to know.

The relationship between a theory’s truth or goodness and p-values is non-

existent by design. The connection between a theory’s truth and Bayes factors is

more natural, e.g. Mulder and Wagenmakers (2016), but because Bayes factors

focus on unobservable parameters, they exaggerate evidence for or against a

theory (we demonstrate this presently). The predictive approach outlined below

restores, and puts into proper perspective, the natural goals of modeling.

The two main goals of modeling physical observables are prediction and

explanation, i.e. understanding the causes of the phenomenon of interest. With-

out delving too deeply into a highly complex subject, it should be obvious that

if we knew the cause or causes of an observable, we would write these down and

not need a probability model, see Briggs (2016). Probability models are only

needed when causes are unknown, at least in some degree. Though there is some


disagreement on the topic, e.g. Hitchcock (2016), Breiman (2001), and though

the reader need not agree here, we suggest that there is no ability for a wholly

statistical model to identify cause. Everybody agrees models can, and do, ﬁnd

correlations. And because correlations are not causes, hypothesis testing cannot

ﬁnd causes, nor does it claim to. At best, hypothesis testing highlights possibly

interesting relationships.

Now every statistician knows these arguments, and agrees with them to vary-

ing extent (most disputes are about the nature of cause, e.g. Pearl (2000)). But

the “civilians” who use the tools statisticians develop have not well assimilated

the arcane philosophy behind those tools. Civilians all too often assume that if

a hypothesis test has been “passed”, a causal eﬀect—or something very like it,

like a “link” (a word nowhere deﬁned)—has been conﬁrmed. This is only natural

given the name: hypothesis test. This explains the overarching desire for p-value

hacking and the like. The result is massive over-certainty and a reproducibility

crisis, e.g. see among many others Begley and Ioannidis (2015); see too Nosek

et al. (2015).

This leaves prediction. Prediction makes sense and is understandable to

everybody, and best of all opens all models to veriﬁcation, to real testing. A

hard check against reality is not the usual treatment statistical models receive.

This is a shame. The many beneﬁts of prediction are detailed below.

There is not much point here adding to the critiques of p-values. Not every

argument against them is well known, but enough are in common circulation

that even their most resolute defenders are given pause, e.g. Nguyen (2016),

Traﬁmow and Marks (2015). The only good use for p-values is the one for which

they are designed. Calculating the probability that certain functions of data will

exceed some value supposing a speciﬁed probability model holds. About whether

that, or any other, model is good, true, or useful, the p-value is utterly silent.

It’s funny, then, that the only uses to which p-values are put are on questions

they can’t answer.

The majority—which includes all users of statistical models, not just careful

academics—treat p-values like ritual, e.g. Gigerenzer (2004). If the p-value is

less than the magic number, a theory has been proved. It does not matter that

frequentist statistical theory insists that this is not so. It is what everybody

believes. And the belief is impossible to eradicate. For that reason alone, it’s

time to retire p-values.

As stated, Bayes factors come closer to the mark, but since they are stated

in terms of unobservable parameters, their use will always lead to over-certainty.

This is because we are always more certain of the value of parameters than we are

of observables. This is obvious since the posterior of any parameters feeds into

the equations for the predictive posterior of observables. Take an easy example.

Suppose we characterize the uncertainty in the observable y using a normal with

known parameters. Obviously, we are more uncertain of the observable than

the parameters, which are known with certainty. If we then suppose there is

uncertainty in the parameters (perhaps supplied by a posterior, or by guess), we

have to integrate out this new uncertainty in the parameters, which increases the


Bayes factors, though we do use what is usually considered an objective Bayes

framework, suitably understood, to produce predictions. Frequentist probability

predictions can also be used, but with diﬃculties in interpretation.

We take probability to be everywhere conditional, and nowhere causal, in the

same manner as Briggs (2016), Franklin (2001), Jaynes (2003), Keynes (2004).

Accepting this is not strictly necessary for understanding the predictive position,

but it is for a complete philosophical explanation. This philosophy’s emphasis on

observables and measurable values which inform observables is also important.

2 Predictive Assessment

All quantiﬁable probability models for observables y can ﬁt this predictive

schema:

Pr(y ∈ s|X, D, M) (1)

where y is the observable of interest (the dimension can be read from the con-

text), s a subset of concern, M is the evidence and premises that suggest the

model form, D is optionally old (or assumed) measurements of (y, x) and X

optionally represents new or assumed values of x. It is well to stress that proba-

bility, like logic, does not restrict itself to statements on observable propositions.

But scientiﬁc models do revolve around that which can be measured. Thus, the

only type of models we discuss here will be for observable, i.e. measurable, y.

It is also worth emphasizing M is usually a complex, compound proposition

that includes everything used to judge the model. Statisticians have developed

a shorthand that works well with mathematical manipulations of models, but

which masks important model information. Since nearly all models in practical

use are assigned ad hoc, the masking emboldens the false belief the model used

in an application is the correct model, or at least one “close enough” to the true

one. This over-emphasizes the importance of hypothesis testing, leading to over-

certainty that causal, or semi-causal “links”, have been properly identiﬁed. And

this in turn has led to a most unfortunate non-practice of model veriﬁcation.

It is rare, bordering on never, that the vast army of published models ever undergoes testing against the real world. About that subject, more below.

The majority of probability models follow one of two basic forms. Paradig-

matic examples:

MD = “A 6-sided object with sides labeled 1–6, which will be tossed, after

which one side must show”. The observable y is the side, with s = 1 · · · 6. Then

Pr(y = i|MD ) = 1/6, ∀i. About why this deduction holds, and about why we believe

we can deduce probability and why we do not believe probability is subjective,

we relegate to Briggs (2016).

Mtemp = “The uncertainty of tomorrow’s high temperature quantiﬁed by a

normal distribution, whose central parameter μ is a function of yesterday’s high

and an indicator of precipitation”; i.e. a standard regression.

MD has no parameters and requires no old observations. Its general form is

MP = P1 P2 · · · Pm , where each P is a premise as in a logical argument, and the


complex.

Mtemp is a parameterized model typically requiring old observations, and

in Bayesian analysis evidence on the uncertainty of the parameters, i.e. prior

distributions. The evidence suggesting the priors is assumed to be part of Mtemp .

Of course, there may, and even must, be some number of premises P included in

parameterized models. The one that must be present is the one identifying the

parameterized model. E.g. P = “Uncertainty in the observable will be quantiﬁed

with a normal distribution”. This P is almost always ad hoc. This does not mean

it is not useful.

Classical hypothesis testing in frequentist or Bayesian terms is usually applied

to parametric models, with the goal of model selection, a potentially confusing

term, as we shall see. The general idea is simple. In its most basic form, two

models are proposed, parameterized or not, both identical except one will have

one less premise or parameter. For example:

MPa : P1 P2 · · · Pm , (2)
MPb : P1 P2 · · · Pm−1 , (3)

Mθ1 : μ = θ0 + θ1 x1 + θ2 I(x2 ), (4)
Mθ2 : μ = θ0 + θ1 x1 . (5)

where in the ﬁrst set of comparisons MPb has one fewer premise than does MPa .

In the second set of comparisons, x1 might be, from the example above, yesterday's

high temperature, and I(x2 ) the indicator of precipitation. The ordering of more

to less complex models does not, of course, matter.
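To make the second pair concrete, here is a minimal R sketch of the temperature example; the data are simulated and the variable names (weather, high_today, high_yesterday, precip) are our own, not the chapter's. Mθ1 keeps both yesterday's high and the precipitation indicator, while Mθ2 drops the indicator.

# Simulated stand-in for the temperature data (names are ours).
set.seed(1)
weather <- data.frame(high_yesterday = rnorm(200, mean = 30, sd = 5),
                      precip = rbinom(200, size = 1, prob = 0.3))
weather$high_today <- 25 + 0.2 * weather$high_yesterday -
  2 * weather$precip + rnorm(200, sd = 3)
m_theta1 <- lm(high_today ~ high_yesterday + precip, data = weather)  # model (4)
m_theta2 <- lm(high_today ~ high_yesterday, data = weather)           # model (5)
# The predictive comparison below asks how Pr(y in s | X, D, M) changes
# between m_theta1 and m_theta2, not whether the precip p-value is small.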

Predictive selection for premise-based models is simplicity itself. But don’t

let its simplicity fool you. It contains the very basis of how models are actually

built. Calculate

Pr(y ∈ s|X, D, MPa ) = p + δ, (6)
Pr(y ∈ s|X, D, MPb ) = p. (7)

If δ = 0 at every s of interest, the added premise Pm is irrelevant to the uncertainty in y (the obvious restrictions on the values of p and δ apply); otherwise it is relevant.

Using the example above with MD remaining the same, and letting MD+1 = MD

& “Candy canes have peppermint ﬂavoring.” Then

Pr(y ∈ s|MD+1 ) = 1/6, ∀s, (8)
Pr(y ∈ s|MD ) = 1/6, ∀s. (9)

The candy-cane premise changes nothing about the probability of which side of the die will show. At no value of s was δ non-zero. The premise is therefore rejected.

The example is silly, but it highlights an important truth. All models are built

like this. Scores of irrelevant premises are rejected at the outset, with little or no


thought. This is the right thing to do, too. Yet it is the reason the premises are

rejected that is important. Model builders reject premises because they know

the probability of the observable y at some measurable x will not change. If

you like, we can say that the hypothesis that the premise is relevant has been

rejected—and rejected absolutely.

Hypothesis testing, then, begins well before any p-value is calculated or even

data collected. It does not reach any level of formality until well down the road.

This is interesting because if people were truly serious about the theory behind

p-values, to remain consistent with that theory, p-values (and Bayes factors)

should be used to rule out every hypothesis not making it into the ﬁnal model.

Now every is a lot; indeed, it is inﬁnite. Since any hypothesis not making it into

the ﬁnal model must be rejected in the formal way, true p-value and Bayes factor

believers would thus never ﬁnish testing. No model would ever get built in ﬁnite

time.

What we are proposing is an approach which is everywhere consistent. And

which produces no paradoxes.

In the case of comparing parameterized probability models, there is uncer-

tainty in which model is “better”. But there is no uncertainty in calling any

model true, if that word is meant in the causal sense. None but the strictly

causal (perfectly predictive) model is true. If we knew the actual cause of y, or

what determines the value of y, then we would not need a probability model.

Causal models are not impossible, or even rare. Physics is awash in causal and

deterministic models (to know the cause is greater than to know what determines

a value).

Most, or even all, statistical models are ad hoc. In the temperature example,

it is obvious many other parameterized, and even unparameterized, models could

have been used to express uncertainty in y. Not just in the sense that extra terms

could be added to the right hand side of the regression, but entirely diﬀerent

model structures. Normal distributions do not have to be used, for instance. The

model need not be linear in the parameters. The possibilities for ad hoc models

are limitless.

That is what makes talk of “true” values of the parameters curious. Since

statistical models are ad hoc and not true in any causal sense, and since nearly all

models do not specify the precise and total circumstance of an observable (i.e. all

auxiliary premises, see Traﬁmow (2017)), it is vain to search for “true” values of

parameters. Even at a hypothetical, never-will-be-reached limit. Again, physics

comes closest to an apt understanding of true values of parameters, because

there carefully controlled experiments can be run that delineate all the (known)

possible causal factors. In these limited circumstances, it makes more sense to

speak of true parameter values. Parameters in this sense often have physical

meaning, at least by proxy. But, again, this does not hold for the vast majority

of probability models.


Pr(y ∈ s|X, D, M2 ) = p (11)

assume the obvious numerical restrictions of p and δ. If at s, and given X and D,

δ = 0, the parameter(s) in M1 , and therefore the measurements associated with

those parameters, are irrelevant to the uncertainty of y. These X, and these

parameters, are therefore not needed in the model. Removing them does not

change the probability. The models in (10) are predictive, meaning the uncer-

tainty in the parameters given by priors is integrated out. Yet even frequentists

can use this method, as long as probability predictions can be made from the

frequentist model.

If at any s, for the given X and D, δ ≠ 0, then the X and its parameters

are relevant. Whether to keep the extra parameters becomes a standard prob-

lem in decision analysis. A relevant parameter important to one decision maker

can be unimportant to another. There can be no universal value of δ useful

in all situations, like there is with the magic number for p-values. As should

be clear, relevance depends on s and on everything on the right hand side of

the probability equation. That means any change on the right hand side might

change the measure of relevance. That accords with common sense: change your

information, change your basis of judgment.

In practice, on a per-model, per-decision basis, a δ is chosen, which may

depend on s, below which measurements are decided to be unimportant, and

above which are important. Measurements, and their associated parameters, are

kept or discarded accordingly.
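As a hedged sketch of this relevance check for the regression case (assuming, as the authors do in Sect. 3, flat out-of-the-box priors, under which the posterior predictive of a normal linear regression is a shifted and scaled Student t; the function name and arguments below are ours):

# Pr(s_lo < y_new <= s_hi | X, D, M) for an lm() fit under flat priors.
# x_new must be a numeric vector ordered as coef(fit), including the intercept.
pred_prob <- function(fit, x_new, s_lo, s_hi) {
  X  <- model.matrix(fit)
  n  <- nrow(X); k <- ncol(X)
  s2 <- sum(resid(fit)^2) / (n - k)
  h  <- drop(t(x_new) %*% solve(crossprod(X), x_new))  # x'(X'X)^{-1}x
  mu <- sum(x_new * coef(fit))                         # predictive centre
  sc <- sqrt(s2 * (1 + h))                             # predictive scale
  pt((s_hi - mu) / sc, df = n - k) - pt((s_lo - mu) / sc, df = n - k)
}
# delta for a set s is the change in this probability between the model with
# the extra measurement and the model without it, e.g.
# delta <- pred_prob(m1, x1_new, s_lo, s_hi) - pred_prob(m2, x2_new, s_lo, s_hi)

Whether a computed delta matters is, as the text stresses, a decision for the user of the model, not a universal cutoff.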

An additional advantage of this approach is that no parameter estimates

are needed, or even desired. Parameters are not in any case observable. The

models are already ad hoc anyway, so focusing on parameter estimates, either

as a Bayesian posterior or a frequentist point estimate with conﬁdence interval,

produces over-certainty in any X’s importance. The predictive approach thus

uniﬁes testing and point estimation.

Not only can (10) be used in intra-model selection, but it is ripe for estimating

the probabilistic importance of each X. It will often be found that a model with

multiple parameters will show a wee p-value and large (relative) point estimate

for one parameter, and a non-publishable p-value and small point estimate for

the second parameter. But when (10) is employed, the order of importance is

inverted. Changing the value of X for the classically “weaker” parameter will

produce larger variations in probability of y ∈ s, especially for values of s thought

crucial in the problem at hand.


3 Examples

3.1 Example 1: Product Placement Recall

We begin for the sake of clarity with the simplest of examples. A survey relating the ability to recall product placement in theater films to movie genre (Action, Comedy, Drama) and sex was given to 137 people, each providing a response (a score) equal to the number of correct recalls in the discrete interval 0–6; see Park and Berger (2010). The data were initially analyzed using null hypothesis

signiﬁcance testing. The conclusion of the authors was “Results suggest that

brand recognition is more common in drama ﬁlms.”

An ordinary regression in R on the score by sex (M = 1, or 0) and movie

genre was run, producing the following ANOVA table (sans hyperventilating

asterisks).

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) 3.4994 0.1930 18.134 <2e-16

M1 0.3952 0.2489 1.588 0.1147

GenreComedy 0.4087 0.2712 1.507 0.1342

GenreDrama 0.7077 0.2792 2.535 0.0124
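For readers wishing to reproduce output of this form, a call of roughly the following shape prints it; the data frame and its column names are our own stand-ins, not the authors' data.

# Sketch with simulated stand-in data (the real survey data are not reproduced here).
set.seed(2)
recall <- data.frame(score = sample(0:6, 137, replace = TRUE),
                     M     = factor(sample(0:1, 137, replace = TRUE)),
                     Genre = factor(sample(c("Action", "Comedy", "Drama"),
                                           137, replace = TRUE)))
summary(lm(score ~ M + Genre, data = recall))   # prints a coefficient table of this form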

The p-values for sex (diﬀerence) and Comedy were larger than the magic

number. Some authors would at this point remove sex from the model. The

p-value for Drama was publishable, hence the conclusion of the authors.

Predictive probabilities of the full model were calculated, assuming standard

out-of-the-box “ﬂat” priors. Posteriors on the parameters were ﬁrst calculated,

then these were integrated out to produce the predictive posterior of the observ-

able score, see Bernardo and Smith (2000). The results would, of course, change

with a diﬀerent prior; but so would they change with a diﬀerent model. We are

not recommending this model, and certainly not recommending ﬂat priors; we

are only showing how the predictive approach works in a common situation.

There is a bit of diﬃculty in creating predictive probabilities, because the

scores can only take the values 0–6, but the standard normal regression model

produces predictive probability densities along the continuum. Indeed, the model

produces predictions of positive probabilities for values of scores less than 0 and

greater than 6, scores that will never be seen (they are impossible) in any repeat

of the experiment. We elsewhere call the assignment of positive probability to

impossible events probability leakage, Briggs (2013). It usually shows up when

regression models do not make good approximations and when the observable

lives in a limited range, or when the observable’s discreteness is stark.

In this case, for males, the predictive probabilities for scores greater than 6

are 0.06 for Action, 0.1 for Comedy, and 0.15 for Drama (these are probabilities

for known impossible values). In other words, given the person is a male assessing

a Drama, the model predicts a probability of 0.15 for new scores greater than

6. For females, the numbers are 0.03, 0.06, and 0.09 respectively. Not small

numbers. For scores less than 0, the predictive probabilities for men are all


less than 0.001; for women the largest is 0.003. Whether any of these numbers

is important depends on the decisions to which the model is put, and not on

whether any statistician thinks them small or large. About these decisions, we

are here agnostic.

The next decision is how to turn the predictions which are over the real

line to predictions of discrete observable scores. One way of doing this, which

is not unique, is to calculate the predictive probability for being between 0 and

0.5, and assign that to a predictive probability of score = 0; next calculate the predictive probability for being between 0.5 and 1.5, and assign that to a predictive probability of score = 1; and so on. The probability of 5.5 to 6.5 can be assigned

to score = 6, with the remainder being left to leakage, or everything greater than

5.5 can be assigned to score = 6; correspondingly, everything less than 0.5 can

be assigned score = 0. Now all this rigmarole would not have been necessary

if a model which only allowed scores 0–6 were used (perhaps a multinomial

regression). But our purpose here is not to ﬁnd terriﬁc or apt models; we only

want to explain how to use the predictive approach for models people routinely

use.
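A minimal sketch of that bookkeeping, using one of the binning options the authors describe (the helper name and the example predictive distribution below are our own assumptions):

# Map a continuous predictive CDF onto the possible scores 0-6 with half-unit
# bins around each score; mass outside the bins is the probability leakage.
score_probs <- function(pred_cdf) {
  edges <- seq(-0.5, 6.5, by = 1)
  probs <- diff(pred_cdf(edges))        # Pr(score = 0), ..., Pr(score = 6)
  names(probs) <- 0:6
  list(probs = probs,
       leakage = pred_cdf(-0.5) + (1 - pred_cdf(6.5)))
}
# Example with an assumed normal predictive (not the fitted one from the paper):
res <- score_probs(function(q) pnorm(q, mean = 4.2, sd = 1.4))
round(res$probs, 2); round(res$leakage, 3)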

It is crucial to understand that in creating predictive probabilities, as in

Eq. (6), the model must be fully speciﬁed in each prediction. In other words, we

created a model of sex and genre because we thought these measurements would

change the uncertainty in the score, therefore for each and every prediction we

make, we must specify a value of sex and genre.

Figure 1 shows the predictive probability for men for each genre. Clearly, the

differences in these probabilities are non-zero, hence δ ≠ 0; so, genre is relevant to

uncertainty in score. The differences in probabilities clearly depend on the level

of score (the s), ranging from about 0.001 (in absolute values) for s = 1, up to

0.14 for s = 6. Again, whether these diﬀerences are important depends on the

decisions to which the model will be put. Supposing for the sake of argument a

δ = 0.05 (a familiar number!) to indicate importance, then there are no important

diﬀerences in probabilities between Action and Comedy for scores of 0–2 and

4–5 but there are for scores of 3 and 6. The p-value would lead to the decision

of no diﬀerence between Action and Comedy. But with our chosen δ, there is a

clear diﬀerence in importance.

Now the same plot (or calculations: visual inspection is not necessary) should

be done for females by genre, and the diﬀerences assessed there too. We skip that

step, noting that the important diﬀerences exist here, too, and for diﬀerent scores

for the genres. We instead show Fig. 2, the diﬀerences in sex at the Drama genre.

The diﬀerences (in absolute value) are between 0.002 and 0.08. The importance

δ is exceeded at scores of 3 and 6.

Again, the p-value for sex was not wee, and sex might have been dropped

from the model. The important diﬀerences noted for Drama were also found for

Comedy, but not for Action, though these were not noted by the p-values.

This level of detail in an analysis won’t always be needed. Instead, tables like

the following can and should be presented. Plots and summaries may of course

be better, depending on the situation. Here there are two diﬀerent regression


Fig. 1. Predictive probability of score for men for each genre (Action, Comedy, Drama).

Table 1. Probabilities (rounded to nearest hundredth) for scores 0–6 for the genre Drama, with and without considering sex, in two separate regression models.

Score    0     1     2     3     4     5     6
Either   0.00  0.01  0.06  0.17  0.28  0.27  0.20
Male     0.00  0.01  0.05  0.15  0.27  0.28  0.25
Female   0.00  0.02  0.08  0.20  0.29  0.25  0.16

models, the ﬁrst without sex and the second with. Readers are free to make

decisions based on their own δs, which might diﬀer from the authors’.

This next example shows the ﬂexibility of the predictive method, and its poten-

tial for partial automation. Full automation of analysis is not recommended for

any model, except in special circumstances. Automation can cause one to forget

limitations.


Fig. 2. Predictive probability of score for men and women for the Drama genre.

The data are nine-month academic salaries for faculty of three ranks for a college in the USA for two departments A and B “roughly corresponding to theoretical disciplines and applied disciplines, respectively”, quoted from Fox and Weisberg (2011). Faculty sex, years since PhD and years of service were also measured. The minimum measured salary was $57,800 and the maximum was $231,545, proving at least one of us is in the wrong job.

Obviously, we use this data to make predictions of people not in this data

set, because we already know all we can about the salaries of people we have

already measured.

That is, we desire naturally to make predictions.

Here is the ordinary ANOVA table:

Coefficients:

Estimate Std. Error t value Pr(>|t|)

(Intercept) 78.8628 4.9903 15.803 < 2e-16

rankAsstProf -12.9076 4.1453 -3.114 0.00198

rankProf 32.1584 3.5406 9.083 < 2e-16

disciplineB 14.4176 2.3429 6.154 1.88e-09


yrs.service -0.4895 0.2119 -2.310 0.02143

sexMale 4.7835 3.8587 1.240 0.21584

The standard ANOVA tells us little about predictions. That is easily reme-

died in Table 2, which we label the predictive “ANOVA” table. It uses the same

regression model with (again) “ﬂat” out-of-the-box priors. It shows the central

(most likely) estimate for the condition noted, holding all other measurements fixed at their observed median values or base levels (to be defined

below). The categorical variables are stepped through their levels, while the

others step through the ﬁrst, second, and third observed quartiles. Any other

values of special interest may of course be substituted, but we leave these to

demonstrate how an automatic analysis might look.

Table 2. The predictive “ANOVA” table: the most likely predicted salary ($1000s) under each condition, with all other measurements held at their medians or base levels, and the probability of a higher salary than the base-level person.

Measurement     Level/value   Predicted salary ($1000s)   Pr(salary > base)
rank            AssocProf     101                         0.5
rank            AsstProf      88.6                        0.343
rank            Prof          134                         0.844
discipline      A             119                         0.5
discipline      B             134                         0.675
yrs.since.phd   5             125                         0.5
yrs.since.phd   21            134                         0.606
yrs.since.phd   40            144                         0.719
yrs.service     3             140                         0.5
yrs.service     16            134                         0.421
yrs.service     37            123                         0.302
sex             Female        129                         0.5
sex             Male          134                         0.56

This Table also shows the predicted probability that a person holding these

attributes would have a higher salary than a “base level” person. The base level

is not unique and can be user speciﬁed as a particular level of interest. Here we

take the ﬁrst level of all other categorical measures as ordered (alphabetically)

by R. The ﬁrst level of rank is “AssocProf”, with “AsstProf” coming after,

alphabetically. The non-categorical measures take as base their observed ﬁrst

quartile values.
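A hedged sketch of how the last column can be computed by simulation under the same flat-prior regression; the function and argument names are ours, and x_new and x_base must be coded to match colnames(model.matrix(fit)).

# Monte Carlo estimate of Pr(salary at x_new > salary at x_base | X, D, M):
# draw (beta, sigma) from the flat-prior posterior of an lm() fit, then one
# predictive salary for each covariate setting per draw.
prob_higher <- function(fit, x_new, x_base, n_sim = 1e5) {
  X    <- model.matrix(fit)
  n    <- nrow(X); k <- ncol(X)
  V    <- solve(crossprod(X))                                    # (X'X)^{-1}
  sig  <- sqrt(sum(resid(fit)^2) / rchisq(n_sim, df = n - k))    # sigma draws
  B    <- (matrix(rnorm(n_sim * k), n_sim, k) %*% chol(V)) * sig # N(0, sigma^2 V)
  B    <- sweep(B, 2, coef(fit), "+")                            # beta draws
  y_new  <- drop(B %*% x_new)  + rnorm(n_sim, sd = sig)
  y_base <- drop(B %*% x_base) + rnorm(n_sim, sd = sig)
  mean(y_new > y_base)
}

With identical x_new and x_base the estimate is 0.5 up to simulation error, matching the base-level rows of the table above.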

For example, the predicted most likely salary for an Associate Professor in

discipline B (the median), and male (also median), with 21 years since PhD and

16 years of service is $101 thousand. The probability of a higher salary than another person at the base level, which in this case is a person with the same attributes, is, as expected, 0.5

(in this model, the posterior predictive distributions are all symmetric around


the central value). We next hold all these attributes constant, but change the

rank, so that we now have a new male Assistant Professor in discipline B with 21 years since PhD and 16 years of service. The probability this new man has a

higher salary is 0.34, meaning, of course, a man with the higher rank has a

probability of 0.66 of having a higher salary.

These tables take only a little getting used to, and they are easily modiﬁed, as

a standard ANOVA is not, for questions interesting to decision makers. Relevance

can be picked oﬀ the table: any probability diﬀering from 0.5 shows relevance,

at least for the levels speciﬁed. Direct information about the observable is also

prominent.

This table does not obviate a fuller analysis, as was done above in the ﬁrst

example. Plots and tables of the same sort can and should be made. For example,

as in Fig. 3.

Fig. 3. Predictive probability diﬀerences between men and women in discipline B for

new Assistant Professors in black (0 years of service, 1 year since PhD) and for seasoned

Professors in red (24 years of service, 25 years since PhD). Probabilities are calculated

every $5,000.

This shows the predictive probability diﬀerences between men and women in

discipline B for new Assistant Professors in black (0 years of service, 1 year since

PhD) and for seasoned Professors in red (24 years of service, 25 years since PhD).

Probability diﬀerences are calculated every $5,000. Most of these diﬀerences are

0.01, or less. The largest diﬀerence was for new hires at a salary lower than was

observed. The implication is that while there were observed differences in salaries

between men and women, the chances are not great for seeing them persist in

new data. At least, not for individual salaries. Calculating the diﬀerences over

larger “block” sizes of salaries, say, every $10 or $15 thousand would show larger

diﬀerences.


The predictive approach does not solve all modeling ills. No approach will. It

reduces some, but only some, of the excesses in classical hypothesis testing.

Although we advise against a universal, one-size-ﬁts-all value of δ, all experience

shows such a value will be picked. Doing so makes model selection and presenta-

tion automatic. People prefer less work to more. The predictive approach clearly

entails more work than standard hypothesis testing in every aspect. As such,

there will be reluctance to use it. It also does not provide answers that are as

sharply deﬁned as hypothesis testing. And people crave certainty—even when

this certainty is exaggerated, as it is with classical hypothesis testing. Every

statistician knows how easy it is to “prove” things with p-values.

Any approach that does not add model veriﬁcation to model selection is

doomed to failure. Models must be tested against reality. It is not at all clear

how to do this with classical hypothesis testing. As said above, the idea a “test”

has been passed gives the false impression the model has been checked against

reality and found good.

True veriﬁcation is natural using the predictive approach. Models under the

predictive approach are reported in probability form. Advanced training in statistical methods is not needed to understand results. The models reported in

Table 1 require no special expertise to comprehend. These are the (conditional)

probabilities of new scores that might be observed, perhaps depending on the

sex of the participant. “Bets” (i.e. decisions) can be made using this table. Here

the standard apparatus of decision analysis comes into play in choosing which

probabilities are important, and which not. If the model is a good one, the

probabilities will be well calibrated and sharp, when considered with respect to

whatever bets or decisions that are made with it.

Anybody can check a predictive model (given they can recreate the original

scenarios). The original data is not needed, nor the computer code used to gen-

erate it. The model is laid bare for all to see and test. Limitations and strengths,

especially for controversial and “novel” research, will quickly become apparent.

How best to do veriﬁcation we leave to outside authorities. This list is far

from complete, but a good place to start is here: e.g. Gneiting and Raftery

(2007), Briggs and Zaretzki (2008), Hersbach (2000), Wilks (2006), Briggs and

Ruppert (2005), Briggs (2016), Gneiting et al. (2007). The idea is basic. Produce

predictions and compare these using proper scores against observations never

used or seen in any way before. This is the exact way civil engineers test models of

bridges, or electrical engineers test models of cell phone capacity, etc. The “never

used” is strict, and thus excludes cross validation and other approaches which

reuse or “peek” at veriﬁcation datasets when building a model. It’s not that

these methods don’t have good uses, but that they will always inflate certainty

in the actual value of a model.
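As one minimal, hedged example of such a check for the recall model of Sect. 3.1 (the logarithmic score, one of the proper scores treated by Gneiting and Raftery (2007); the function is our sketch, not the authors' code):

# Average negative log predictive probability over genuinely new observations;
# lower is better, and rival models are compared on the same held-out cases.
log_score <- function(pred_probs, y_new) {
  # pred_probs: matrix, one row per new case, columns Pr(score = 0), ..., Pr(score = 6)
  # y_new: the realised scores (integers 0-6) for those cases
  -mean(log(pred_probs[cbind(seq_along(y_new), y_new + 1)]))
}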

Verification, like model building, is not exact, and cannot be. We must guard

against the idea that if a theory has passed whatever test we devise, we have the

best or a unique theory. Veriﬁcation is not proof. Quine and Duhem long ago

showed theories or models besides the one under consideration and testing could


equally well explain any set of observed (contingent) data, Quine (1953), Duhem

(1954). And when testing, the auxiliary assumptions (all implicit premises) of a

model can be diﬃcult or impossible to disentangle; see Traﬁmow (2009), Traﬁ-

mow (2017) for a discussion. What can be said is that given past good perfor-

mance of a model, and taking care the conditions in all explicit and implicit

premises are also met, it is likely the model will continue to perform well.

References

Begley, C.G., Ioannidis, J.P.: Reproducibility in science: Improving the standard for

basic and preclinical research. Circ. Res. 116, 116–126 (2015)

Bernardo, J.M., Smith, A.F.M.: Bayesian Theory. Wiley, New York (2000)

Breiman, L.: Statistical modeling: the two cultures. Stat. Sci. 16(3), 199–215 (2001)

Briggs, W.M.: On probability leakage. arxiv.org/abs/1201.3611 (2013)

Briggs, W.M.: Uncertainty: The Soul of Probability, Modeling & Statistics. Springer,

New York (2016)

Briggs, W.M., Ruppert, D.: Assessing the skill of yes/no predictions. Biometrics 61(3),

799–807 (2005)

Briggs, W.M., Zaretzki, R.A.: The skill plot: a graphical technique for evaluating con-

tinuous diagnostic tests. Biometrics 64, 250–263 (2008). (With discussion)

Duhem, P.: The Aim and Structure of Physical Theory. Princeton University Press,

Princeton (1954)

Fox, J., Weisberg, S.: An R Companion to Applied Regression, 2nd edn. SAGE Publi-

cations, Thousand Oaks (2011)

Franklin, J.: Resurrecting logical probability. Erkenntnis 55, 277–305 (2001)

Gigerenzer, G.: Mindless statistics. J. Socio Econ. 33, 587–606 (2004)

Gneiting, T., Raftery, A.E.: Strictly proper scoring rules, prediction, and estimation.

JASA 102, 359–378 (2007)

Gneiting, T., Raftery, A.E., Balabdaoui, F.: Probabilistic forecasts, calibration and

sharpness. J. R. Stat. Soc. Ser. B Stat. Methodol. 69, 243–268 (2007)

Hersbach, H.: Decompostion of the continuous ranked probability score for ensemble

prediction systems. Weather Forecast. 15, 559–570 (2000)

Hitchcock, C.: Probabilistic causation. In: The Stanford Encyclopedia of Philosophy

(Winter 2016 Edition) (2016). https://plato.stanford.edu/archives/win2016/entries/

causation--probabilistic

Jaynes, E.T.: Probability Theory: The Logic of Science. Cambridge University Press,

Cambridge (2003)

Keynes, J.M.: A Treatise on Probability. Dover Phoenix Editions, Mineola (2004)

Mulder, J., Wagenmakers, E.J.: Editor’s introduction to the special issue: Bayes fac-

tors for testing hypotheses in psychological research: Practical relevance and new

developments. J. Math. Psychol. 72, 1–5 (2016)

Nguyen, H.T.: On evidence measures of support for reasoning with integrated uncer-

tainty: a lesson from the ban of p-values in statistical inference. In: Huynh, V.N.,

Inuiguchi, M., Le, B., Le, B., Denoeux, T. (eds.) Integrated Uncertainty in Knowl-

edge Modelling and Decision Making, pp. 3–15. Springer, Cham (2016)

Nosek, B.A., Alter, G., Banks, G.C., et al.: Estimating the reproducibility of psycho-

logical science. Science 349, 1422–1425 (2015)

Park, D.J., Berger, B.K.: Brand placement in movies: the eﬀect of ﬁlm genre on viewer

recognition. J. Promot. Manag. 22, 428–444 (2010)


Pearl, J.: Causality: Models, Reasoning, and Inference. Cambridge University Press,

Cambridge (2000)

Quine, W.V.: Two Dogmas of Empiricism. Harper and Row, Harper Torchbooks,

Evanston (1953)

Traﬁmow, D.: The theory of reasoned action: a case study of falsiﬁcation in psychology.

Theory Psychol. 19, 501–518 (2009)

Traﬁmow, D.: Implications of an initial empirical victory for the truth of the theory

and additional empirical victories. Philos. Psychol. 30(4), 411–433 (2017)

Traﬁmow, D., Marks, M.: Editorial. Basic Appl. Soc. Psychol. 37(1), 1–2 (2015)

Wilks, D.S.: Statistical Methods in the Atmospheric Sciences, 2nd edn. Academic Press,

New York (2006)

On Quantum Probability Calculus for Modeling Economic Decisions

Hung T. Nguyen, Songsak Sriboonchitta, and Nguyen Ngoc Thach

¹ Department of Mathematical Sciences, New Mexico State University, Las Cruces, NM 88003, USA. hunguyen@nmsu.edu
² Faculty of Economics, Chiang Mai University, Chiang Mai 50200, Thailand. songsakecon@gmail.com
³ Banking University of Ho-Chi-Minh City, 36 Ton That Dam Street, District 1, Ho-Chi-Minh City, Vietnam. thachnn@buh.edu.vn

Abstract. Motivated by the Nobel Memorial Prize in Economic Sciences awarded to Richard H. Thaler in 2017 for his work on behavioral economics, we address in this paper the fundamentals of uncertainty modeling of free will. Extensions of von Neumann's expected utility theory

in social choice, including various nonadditive probability approaches and prospect theory, seem to get closer to cognitive behavior, but still ignore an important factor in human decision-making, namely the so-called “order effect”. Thus, a better candidate for modeling quantitatively the uncertainty under which economic agents make their decisions could be a probability calculus which is both nonadditive and noncommutative. Such a probability calculus already exists, and it is called “quantum probability”. The main goal of this paper is to elaborate on the rationale of using quantum stochastic calculus in decision-making for econometricians, at a conference such as this, who are not yet aware of this new trend of ongoing research in the literature.

Keywords: Expected utility · Nonadditive probability · Noncommutativity · Quantum probability calculus

1 Introduction

In 1951, Feynman [11] came to a symposium on mathematical statistics and

probability at the university of California, Berkeley, to let probabilists and statis-

ticians know that the quantitative notion of chance in the context of intrinsic

randomness in quantum mechanics is not the same as the one considered much

earlier by Laplace (and hence diﬀerent from the more general probability concept

formulated by Kolmogorov in 1933, which is the standard quantitative modeling

of uncertainty used in almost all sciences). That did not seem to get any atten-

tion of probabilists and statisticians, perhaps because the nonadditivity (and


only for “quantum uncertainty” in physics, and not for “ordinary” uncertainty

encountered in games of chance and ordinary random phenomena in social sci-

ences. The most signiﬁcant eﬀorts on developing further quantum probability,

on purely mathematical grounds, were Meyer [23] and Parthasarathy [26]. How-

ever, there seems to be no reason for thinking about using quantum probability

outside of physics of particles, let alone in standard statistics.

Unlike Newtonian mechanics, quantum mechanics is intrinsically random,

and as such, deterministic laws should be replaced by stochastic models. In

physics, such an approach is called an “eﬀective theory” [17], i.e., a framework

created to model observed phenomena without describing in detail all of the

underlying processes. The situation seems similar as far as predicting human

behavior is concerned. Speciﬁcally, as Hawking put it [17] (p. 47): “Economics

is also an eﬀective theory, based on the notion of free will plus the assumption

that people evaluate their possible alternative courses of action and choose the

best. That effective theory is only moderately successful in predicting behavior because, as we all know, decisions are often not rational or are based on a defective analysis of the consequences of the choice. That is why the world is in such a mess.”

Since economics (and hence economic data) is “created” by humans (eco-

nomic agents), the buzzword is “free will”. Since it is not possible to have physical laws that predict human behavior, we propose models, such as von Neumann’s expected utility as a social choice theory, based on rationality and rational “degrees of belief” (Kolmogorov probability, which is an additive and commutative theory).

On the other hand, as Hawking reminded us “Since we cannot solve the equa-

tions that determine our behavior, we use the eﬀective theory that people have

free will. The study of our will, and of the behavior that arises from it, is the

science of psychology”. As such, not only should the study of economics be from

a psychological perspective (called behavioral economics), but also, psychology

should play an essential role in economics, somewhat similar to experiments in

physics to validate models.

It all boils down to uncertainty, the “soul” of modeling, probability and

statistics, as Briggs put it [3]. In an eﬀective theory like economics, where free

will is a consequence of the human mind (a human has two ingredients: a body and

a soul; the body is a machine, and hence subject to physical laws, but the soul,

living somewhere in the brain, is something diﬀerent from the physical world,

and hence is not subject to physical laws), the “soul” is in fact the source of

uncertainty we face!

Except for death and taxes, everything else is uncertain. But while uncertainty

has a general, common sense meaning, it is speciﬁc in each context. For exam-

ple, we can talk about uncertainty in physics, or in economics. In an uncertain

situation, when we talk about “the chance” of something, we implicitly refer to

a quantitative notion of chance, i.e., some measurement of uncertainty. In the

recent book “Ten Great Ideas about Chance” [8], Diaconis and Skyrms discussed


Of course, all probability calculi are additive and commutative.

Do we already have a unique theory of probability to model quantitatively the

notion of uncertainty? In their ninth great idea “physical chance”, they wrote (p.

180) “Does quantum mechanics need a diﬀerent notion of probability? We think

not”. Well, of course not, as Richard Feynman said it clearly [11] “the concept of

probability is not altered in quantum mechanics”. Note that Richard Feynman’s

paper [11] was not listed in [8]. Instead, the point Feynman wanted to make

is this “What is changed, and changed radically, is the method of calculating

probabilities”, i.e., another probability calculus, and that is so because, in the

context of quantum mechanics (where the uncertainty is due to “free will” of

particles to choose paths to travel) “Nature with her inﬁnite imagination has

found another set of principles for determining probabilities; a set other than that

of Laplace, which nevertheless does not lead to logical inconsistencies”. Thus,

the point is this. A quantitative theory of chance should be developed in each

uncertain environment, no single universal probability calculus is appropriate

for all situations. The authors of [8] declared that they are thorough Bayesians,

i.e., adopting epistemic uncertainty, using Bayes’ rule and additive calculus of

probabilities. In Sect. 2 below, a typical experiment of W. Edwards revealed that,

in social sciences, people may not behave according to Bayes’ updating rule! See

also Gelman’s comments on this issue [14].

It was von Neumann who provided the mathematical language for quantum

mechanics (the counterpart of Newton’s calculus for his own mechanics), but it

seems he did not pay much attention to its probabilistic aspect, i.e., quantum

probability, except its logic (quantum logic), let alone using quantum proba-

bility in some other places. Instead, when moving to social sciences, he used

Kolmogorov probability to formulate his eﬀective theory of free will (economic

behavior) in [32]. See also [21].

Von Neumann’s expected utility [32] is the norm for human behavior in

economics, such as game theory (economic competition), social choice. But it is

only a “hypothesis”, or if you like, a “model” of human behavior. As a model,

it needs to be validated for usage. As in physics, a model is “reasonable” (to be

considered as a “law”) if there are no experiments contradicting its predictions.

Note that to validate a model, we use predictions, and not by “statistical testing”!

Here, von Neumann’s expected utility, as a model of “rational” behavior of free

will, predicts how humans make decisions (choosing alternatives) in the face of

uncertainty/risk. It is right here that psychologists become useful, especially for

economics!

It is well known that Von Neumann’s expected utility theory was contra-

dicted by major “paradoxes” like Allais [1], Ellsberg [9], see also [12,13]. The

main root of these violations of expected utility is the additivity of Kolmogorov

probability axiom. Using the spirit of physics, when a proposed model (here,

for describing human behavior under uncertainty) is violated by “experiments”

(here, psychological experiments), it has to be reexamined and modiﬁed. Now,

the concept of expected utility is defined with respect to standard probability measures, which causes Ellsberg’s paradox. Thus, the first thing to look at is

modifying this additivity axiom for a measure of uncertainty. This is somewhat

similar to physics: Depending upon environments, physical laws are diﬀerent:

while Newton’s laws in mechanics are appropriate for motion of macro-objects

(at moderate velocities), we need to modify them for velocities near the speed

of light (Einstein’s relativity), or replace them when dealing with micro-objects

like particles. We emphasize an important point, and that is: replacing does not

mean necessarily “destroying”. Quantum mechanics did not destroy Newtonian

mechanics, it is just applied to another environment. Here, Kolmorogov probabil-

ity, and von Neumann’s expected utility criterion are appropriate for, say, natural

phenomena; whereas, as we will elaborate upon, quantum probability (because of

its properties of nonadditivity and noncommutativity, and not really because we

think that particles and humans have something in common, or similar, namely

“free will”, where by free will we mean the freedom of choices, regardless of the

cause of free will), seems appropriate for social sciences. Let us be clear. We will

use the term “quantum probability” because, as Richard Feynman has pointed

out to everybody, it is an uncertainty measure which has precisely two desirable

properties (nonadditivity and noncommutativity) for modeling human’s free will,

and not because of quantum mechanics!

The above mentioned “paradoxes” have triggered eﬀorts to modify stan-

dard probability (as the main ingredient in a decision theory) to various types

of “nonadditive probabilities”, such as Dempster-Shafer belief functions [5,30],

Choquet capacities [18,31], possibility theory [34], imprecise probabilities [33],

among others, see also e.g., [6,10,12,20,24,26–28].

On the other hand, behavioral economists argued that people make decisions

based on the potential value of losses and gains rather than the ﬁnal outcome,

as well as evaluating these losses and gains using certain “heuristics” [19]. Recall

that the 2017 Nobel Memorial Prize in Economic Sciences was a recognition

of an integration of economics with psychology. It is about time! One question

remains: How to model heuristics?

And more recently, and ﬁnally, the light appears! Quantum probability

appears to be the best candidate for modeling human uncertainty. See e.g.,

[2,4,7,16,29].

Since this paper aims simply at calling economists’ attention to a promising

approach to faithfully model the uncertainty under which humans and economic

agents make their decisions, we will be somewhat tutorial to get the message

across. As such, the paper is organized as follows. In Sect. 2, we recall some

basic violations of expected utility. In Sect. 3, we elaborate on some works on

nonadditive probabilities. In Sect. 4, we provide facts from psychological experi-

ments exhibiting more the inadequacy of additivity as well as commutativity of

standard probability measures, as far as modeling cognitive decision-making is

concerned. Section 5 is a tutorial on how quantum probability is built. Finally,

in Sect. 6, we indicate brieﬂy some main aspects of quantum probability calculus

in behavioral ﬁnance and economics.


As stated, our aim is to make economists aware of a promising improve-

ment of behavioral economics, namely incorporating quantum probability into a

prospect-based framework. For that, to start out in this section, we recall some

well-known violations of von Neumann’s expected utility (for decision-making

under uncertainty) because of the additivity axiom in Kolmogorov’s formulation

of probability. This will serve as a motivation for considering various approaches

to nonadditive probability, in Sect. 3.

Below, from the literature, are the experiments (violating expected utility)

conducted by psychologists pointing out that von Neumann’s program is in fact

quite limited, triggering (in a healthy spirit of natural science) new developments.

(i) The Allais paradox (1953)

Consider the following gambles:

A : $2500(0.33), $2400(0.66), $0(0.01)
B : $2400(1.00)

C : $2500(0.33), $0(0.67)

D : $2400(0.34), $0(0.66)

The experiment consists of asking the participants the following question:

First, choose between gambles A and B, then, next, choose between C and D.

You could be one participant!

It was reported that most participants “behaved” by choosing B in their ﬁrst

choice, and choosing C in their second choice. What are YOUR choices?

Let’s see if their choices are consistent with expected utility model, or can

the expected utility model explain their experimental choices? For that, consider

one participant who have chosen B over A in her ﬁrst choice, and C over D in

her second choice. Let u(.) be her utility function. If she follows the expected

utility “rule”, then she has chosen B over A in her ﬁrst choice, because

u(2400) > (.33)u(2500) + (.66)u(2400) + (.01)u(0) ⇐⇒ (.33)u(2500) < (.34)u(2400) − (.01)u(0)

In her second choice, with the same utility function u(.), she has chosen C

over D because
(.33)u(2500) + (.67)u(0) > (.34)u(2400) + (.66)u(0) ⇐⇒ (.33)u(2500) > (.34)u(2400) − (.01)u(0)
But these inequalities go in opposite directions! Although she has her utility

function, her choices were not dictated by taking expected utility, a clear viola-

tion of the expected utility model.
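To make the inconsistency concrete, here is a minimal numerical sketch (ours, not part of the original experiment; the utility functions are purely illustrative): both comparisons reduce to the sign of the single quantity δ(u) = (.33)u(2500) + (.01)u(0) − (.34)u(2400), so no single utility function can produce B over A together with C over D.

import math

# Illustrative utility functions (an assumption of this sketch, not from the paper).
# Choosing B over A requires delta(u) < 0; choosing C over D requires delta(u) > 0.
candidate_utilities = {
    "linear":   lambda x: x,
    "sqrt":     lambda x: math.sqrt(x),
    "log(1+x)": lambda x: math.log1p(x),
    "x^0.8":    lambda x: x ** 0.8,
}

for name, u in candidate_utilities.items():
    delta = 0.33 * u(2500) + 0.01 * u(0) - 0.34 * u(2400)
    print(name, "delta =", round(delta, 3),
          "| B over A:", delta < 0, "| C over D:", delta > 0)

Whatever utility function is tried, the two preferences can never hold simultaneously, which is exactly the violation described above.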

(ii) The Ellsberg paradox (1961)

Consider the following situation (not conducting an experiment!) We have

two urns: urn A contains 100 balls, of which 50 balls are red, and 50 balls are

black; urn B contains also 100 balls but in an unknown proportion of black and

red balls. Suppose we ask a person to choose one urn and one color (of balls),

then draw a ball from the chosen urn. If the color of ball drawn from the chosen

urn is her chosen color, she wins, say, $70, otherwise, she wins nothing.

Let’s ﬁnd out whether we can represent her choices by a probability. This

is a choice problem under uncertainty! An alternative θ here is the choice of an

urn and a color, e.g., θ = (A, red). When θ is preferred to θ′, we write θ ≽ θ′.
If also θ′ ≽ θ, we write θ ∼ θ′ (we are indifferent about which one); ≻ denotes
strict preference.
Suppose P is a probability on the set of alternatives, representing her preference relation, i.e., θ ≽ θ′ =⇒ P (θ) ≥ P (θ′).

Now examining the situation, it is clear that she is indiﬀerent about color,

but not about urn, so that her “behavior” is

(A, red) ≻ (B, red)
(A, black) ≻ (B, black)
implying that
P (A, red) = P (A, black)
P (B, red) = P (B, black)
But, since P is additive (so that P (A, red) + P (A, black) = 1 = P (B, red) + P (B, black)), the above imply that
P (A, red) = P (A, black) = P (B, red) = P (B, black) = 1/2
contradicting (A, red) ≻ (B, red), and (A, black) ≻ (B, black)!

The conclusion is that there is no probability to represent the person’s behav-

ior. The problem is the additivity of the probability measure P .
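The impossibility can also be verified mechanically. The following brute-force sketch (our own illustration, using the within-urn additivity constraint of the argument above) searches all percentage-valued assignments and finds none that is color-indifferent yet strictly prefers urn A for both colors.

import itertools

found = []
for a_red, b_red in itertools.product(range(101), repeat=2):    # percentages 0..100
    a_black, b_black = 100 - a_red, 100 - b_red                 # additivity within each urn
    color_indifferent = (a_red == a_black) and (b_red == b_black)
    strictly_prefers_A = (a_red > b_red) and (a_black > b_black)
    if color_indifferent and strictly_prefers_A:
        found.append((a_red / 100, b_red / 100))

print("additive representations consistent with the behavior:", found)    # -> []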

(iii) W. Edwards (Violating Bayes’ updating rule, 1968)

Consider the following experiment. There are two boxes, one contains 700 red

balls and 300 blue balls; and the other contains 300 red balls and 700 blue balls.

Subjects know the composition of these boxes. Subjects choose at random one

of the boxes, and draw, say, 12 balls (without replacement) from it. Based upon

the results of their draws, they are asked to identify which box they were drawing from (which box is more “likely” to them).

Suppose the 12 draws resulted in 8 red balls and 4 blue balls. It turns out

that most of the subjects gave an answer between 70% and 80% for the box


with more red balls, which is inconsistent with the likelihood 97% given by the

Bayesian updating formula.

Remark

(1) The computation of the posterior probability 97% is carried out as follows.

Denote by Box I the one with more red balls, and box II the other. The prior

probabilities are P (I) = P (II) = 1/2. Let X denote the number of red balls drawn in a sample of size n = 12. Its distribution is a hypergeometric H(N, D, n), with N = 1,000, D = 700 for box I, and D = 300 for box II. Specifically,

P (X = k) = C(D, k) C(N − D, n − k) / C(N, n),

where C(m, r) denotes the binomial coefficient “m choose r”. By Bayes’ rule,

P (I|X = 8) = P (X = 8|I)P (I) / [P (X = 8|I)P (I) + P (X = 8|II)P (II)]

Note that the Bayesian updating procedure is based upon additivity of prob-

ability measure, also known as the “law of total probability”.
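Readers who wish to reproduce the 97% figure can do so with the following small sketch (assuming the scipy library is available); it evaluates the two hypergeometric likelihoods and applies Bayes’ rule with equal priors.

from scipy.stats import hypergeom

N, n, k = 1000, 12, 8
like_I = hypergeom.pmf(k, N, 700, n)     # P(X = 8 | Box I), D = 700 red balls
like_II = hypergeom.pmf(k, N, 300, n)    # P(X = 8 | Box II), D = 300 red balls

# Equal priors P(I) = P(II) = 1/2 cancel in Bayes' rule.
posterior_I = like_I / (like_I + like_II)
print("P(Box I | 8 red out of 12) =", round(posterior_I, 3))    # approximately 0.97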

(2) Recall that the Bayesian approach to decision-making under “uncertainty”

is in fact under “risk”, where, according to Knight (1921), uncertainty and risk

are two diﬀerent things. The distinction is this. Risk refers to situations where

probabilities are known, whereas uncertainty refers to situations in which proba-

bilities are neither known, nor can be deduced or estimated in an objective way.

The Bayesian approach minimizes the importance of this distinction by introduc-

ing the concept of subjective probability, and proceeds as follows: when facing

Knight uncertainty, just use your own subjective probabilities, so that the prob-

lem of decision-making under uncertainty becomes a problem under risk. Clearly,

the problem with the Bayesian approach to decision-making is this. Do people

always have “probability beliefs” over any source of uncertainty (say, to update

by Bayes’ rule)?

With respect to the Bayesian approach to uncertainty analysis and its appli-

cations, there are recent works, such as [15] revealing that “its axiomatic foun-

dations are not as compelling as they seem, and that it may be irrational to

follow this approach”. This could be so since, basically, the Bayesian paradigm

commands that “when you face any source of uncertainty (epistemic or objec-

tive), you should quantify it probabilistically; in the absence of objective prob-

abilities, you should “have” your own, subjective probabilities (to guide your

decisions), and if you don’t know what the probabilities are, you should adopt

some probabilities” (you are not allowed to say “I don’t know”!). But then “such

a choice would be arbitrary, and therefore a poor candidate for a rational mode

of behavior”. As such, considerations of beliefs that cannot be quantiﬁed by

Bayesian priors, and of updating of non-Bayesian beliefs have been investigated

in the literature. See the next section.


An excellent place to see the evolution of von Neumann’s expected utility, mainly,

from (subjective) additive probability to nonadditive probability, is to follow the

journey that Fishburn traced for us, from 1970 [12] to 1988, [13]. Of course,

another text is Kreps [20] (p. 198) with his “vision” in 1988 for the future:

“These data provide a continuing challenge to the theorist, a challenge to

develop and adapt the standard models so that they are more descriptive of what

we see. It will be interesting to see what will be in a course on choice theory in

ten or twenty years time”.

Remember, we are now exactly 30 years later! What is the state-of-the-art of

the theory of choice? We will elaborate on this question throughout this paper,

but ﬁrst, here is a ﬂavor of how researchers have reacted to the additivity axiom

of standard probability. A good summary is in [31].

Perhaps Dempster [5] should be credited with emphasizing the upper probability concept, as a nonadditive set function for modeling uncertainty, triggering Shafer [30] to develop a “mathematical theory of evidence”. If P is a probability measure on a measurable space (Ω, A), then the set function P∗ : 2^Ω → [0, 1], defined by P∗(B) = sup{P (A) : A ⊆ B, A ∈ A}, satisfies P∗(∅) = 0, P∗(Ω) = 1, and is monotone of infinite order, i.e., for any k ≥ 2, and B1, B2, ..., Bk in the power set 2^Ω, with |I| denoting the cardinality of the set I,

P∗(∪_{j=1}^{k} Bj) ≥ Σ_{∅≠I⊆{1,2,...,k}} (−1)^{|I|+1} P∗(∩_{i∈I} Bi)

Note that this monotonicity of inﬁnite order is nothing else than a weakening

of Poincaré equalities of probability measures. Clearly, P∗(.) is a nonadditive

set function. More concretely, if X is a random set (i.e., a random element

whose values are sets) taking values as subsets of a set U , then its distribution

F (.) : 2^U → [0, 1], F (A) = P (X ⊆ A), behaves exactly like P∗, see e.g. [24].

Thus, although P∗ or F are nonadditive, they are somewhat related to additive

probability measures. It should be recalled that the Bayesian paradigm holds

that any source of uncertainty can and should be quantiﬁed by probabilities.

For example, if you do not have objective probabilities, you should have your

own subjective probabilities (to guide your decisions). Nonadditive measures of

uncertainty are designed when Bayesian prior cannot be quantiﬁed.
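As a small illustration of such nonadditive set functions (a sketch of ours, with a made-up mass assignment over focal sets), the distribution F of a random set, i.e., a Dempster–Shafer belief function, can be computed on a three-element frame and seen to be monotone yet nonadditive.

frame = frozenset({"a", "b", "c"})
mass = {frozenset({"a"}): 0.2, frozenset({"b", "c"}): 0.3, frame: 0.5}    # masses sum to 1

def bel(event):
    # belief of an event = total mass of the focal sets contained in it
    return sum(m for focal, m in mass.items() if focal <= event)

A, B = frozenset({"a", "b"}), frozenset({"b", "c"})
print("Bel(A) =", bel(A))              # 0.2
print("Bel(B) =", bel(B))              # 0.3
print("Bel(A ∩ B) =", bel(A & B))      # 0.0
print("Bel(A ∪ B) =", bel(A | B))      # 1.0, not 0.2 + 0.3 - 0.0: nonadditive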

A nonadditive measure of uncertainty, called a possibility measure [34] in

the context of Zadeh’s theory of fuzzy sets (as opposed to ambiguity, fuzziness or vagueness refers to situations where it is difficult to form any interpretation at the desired level of specificity), is defined axiomatically as π(.) : 2^U → [0, 1], π(∅) = 0, π(U) = 1, and for any family {Ai}i∈I of subsets of U, π(∪i∈I Ai) = sup{π(Ai) : i ∈ I}.
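On a finite set U, any such possibility measure is generated by a point function x → π({x}); a small sketch (with arbitrarily chosen illustrative values) shows the maxitive, hence nonadditive, behavior.

pi_point = {1: 0.1, 2: 0.6, 3: 1.0, 4: 0.6, 5: 0.2}    # a possibility distribution on U = {1,...,5}

def possibility(event):
    return max((pi_point[x] for x in event), default=0.0)

A, B = {1, 2}, {4, 5}
print("pi(A) =", possibility(A))                    # 0.6
print("pi(B) =", possibility(B))                    # 0.6
print("pi(A ∪ B) =", possibility(A | B))            # 0.6 (the sup), not 1.2
print("pi(U) =", possibility({1, 2, 3, 4, 5}))      # 1.0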

With respect directly to the problem of additivity in von Neumann’s expected

utility theory, it was Schmeidler [27,28] who provided a signiﬁcant framework

for nonadditive probability. It is all because of Knightian uncertainty in eco-

nomics (or “ambiguity”, a type of meaning in which several interpretations are plausible). Schmeidler used set functions, not necessarily additive, to model ambiguous beliefs in economics, see also [22].

A general form of nonadditive set functions is Choquet capacities (general-

izations of additive measures), e.g., [18], and its associated Choquet integral to

be used as a nonadditive expected utility concept [31].

A capacity on a measurable space (Ω, A ) is a set function ν(.) : A → [0, 1]

such that ν(∅) = 0, ν(Ω) = 1, and monotone increasing, i.e., A ⊆ B =⇒ ν(A) ≤

ν(B). The Choquet integral of a (real-valued) random variable X is defined as

Cν(X) = ∫₀^∞ ν(X > t) dt + ∫_{−∞}^{0} [ν(X > t) − 1] dt

It is additive only in the following sense. Two random variables X, Y are said to be comonotonic if for any ω, ω′ we have [X(ω) − X(ω′)][Y (ω) − Y (ω′)] ≥ 0. For any capacity ν(.), if X, Y are comonotonic, then Cν(X + Y ) = Cν(X) + Cν(Y ). This additivity is referred to as comonotonic additivity of the Choquet integral. Schmeidler’s

subjective probability and expected utility without additivity [28] is based on

his integral representation without additivity [27] which we reproduce here. Let

B be the class of real-valued bounded random variables, deﬁned on (Ω, A ). Let

H be a functional on B such that H(1Ω ) = 1, H(.) is increasing and comono-

tonic additive, then H(.) is a Choquet integral operator, i.e., H(.) = Cν (.) with

ν being a capacity deﬁned on A by ν(A) = H(1A ). Note that Choquet inte-

grals are used as models for risk measures in ﬁnancial econometrics, see [31] for

economic applications.
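On a finite space the defining integral reduces to a simple “layer-cake” sum, which the following sketch computes directly (the capacity ν(A) = (|A|/3)² is an arbitrary illustrative choice, not taken from the literature cited above).

def choquet(X, nu, omega):
    # X: dict point -> nonnegative value; nu: capacity on subsets; omega: list of points
    values = sorted(set(X[w] for w in omega))
    total, prev = 0.0, 0.0
    for v in values:
        level = frozenset(w for w in omega if X[w] >= v)    # the level set {X >= v}
        total += (v - prev) * nu(level)
        prev = v
    return total

omega = ["w1", "w2", "w3"]
X = {"w1": 1.0, "w2": 2.0, "w3": 4.0}
nu = lambda A: (len(A) / 3) ** 2    # nonadditive but monotone, nu(empty) = 0, nu(Omega) = 1

print("Choquet integral of X:", round(choquet(X, nu, omega), 3))    # 1.667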

Remark. It is interesting to note that, even at present times, without knowing

psychological evidence spelled out in the next section, there are still research

on topics such as set functions, capacities, “nonclassical” measure theory, non-

additive measure theory for decision theory and preference modeling. These set

functions are nonadditive but still monotone increasing. As such, they are contradicted by the “conjunction fallacy” and do not capture the noncommutativity of information.

Before pointing out some basic evidence which change the way we used to use

the standard probability calculus, let us recall that the behavioral economic

theory is based essentially on the prospect theory [19] which, in the context of

ﬁnancial econometrics, states that people make decisions based upon values of

gains and losses (rather than ﬁnal outcomes) and evaluate these quantities using

“heuristics”.

People make decisions by using heuristics. Is reasoning with heuristics irra-

tional? Well, first of all, rational or not, that is what people do! Secondly,

rationality is understood (deﬁned) in a mathematical way, e.g., via the notion of


expected utility. Think about the analogy with Newtonian and quantum mechan-

ics. If particles move the way they do, are they irrational since they do not obey

Newton’s laws?

Perhaps a word about terminology is useful. In the ﬁeld of Artiﬁcial Intelli-

gence, a fuzzy logic is not “a logic which is fuzzy”, but “it is a logic of fuzzy con-

cepts”. Here, by “quantum decision-making”, we mean “decision-making based

upon quantum probability”, where “quantum probability”, as Richard Feyn-

man pointed out, is a probability calculus similar to the one used in quantum

mechanics (nonadditive and noncommutative). The concept of “free will” of peo-

ple is used by analogy with the intrinsic randomness of particle motion, and not

any physical analogies. To predict human behavior (in making decisions under

uncertainty), academics use probabilities. For each probability calculus, we have

a deﬁned notion of “expected utility” for rationality!

In view of various violations of expected utility (as a rational way to make

decisions), behavioral decision theory was developed as an integration of psy-

chology and economics. Technically speaking, it is about developing appropriate

“probability calculus” in a given environment. In general, it “looks” like people

use “likelihood” to make their decisions under uncertainty. But likelihood is not

probability! Recall that the concept of likelihood was formulated by Fisher in his

theory of statistical estimation of population parameters. Tou counter Bayesian

approach to statistical estimation, Fisher “talked” about turning likelihood con-

cept into his “ﬁducial probability” (probability based on faith) with no success!

But psychologists performed experiments revealing that likelihood is in fact

not only nonadditive, but also not monotone increasing (as a set function),

so that all above nonadditive probability calculi seem not to be adequate for

modeling the way people make decisions.

The so-called “conjunction fallacy” in the literature [19] is this. A lady named

Linda was known as an active feminist in the past. Consider now the event A =

“She is active in the feminist movement”, and B = “She is a bank teller”.

Subjects are asked to guess the likelihoods of A, B, A ∩ B. It turns out that

subjects judged A ∩ B to be more likely than B.

Another more important evidence is the so-called “order eﬀect” [4] exhibiting

the noncommutativity of events, and hence aﬀecting the probability calculi in

standard approaches.

Perhaps this “order effect”, when put in the context of non-Boolean

logics, calls for a radical way of thinking about axioms of probability measures,

i.e., looking for a probability calculus consistent with cognitive behavior. Clearly,

on top of all, it boils down to constructing a new probability calculus which is

noncommutative (and nonadditive), and yet, it is a generalization of standard

probability calculus. Well, as we always borrow concepts and methods in physical

sciences to apply to social sciences, especially to economics, we have available a

probability calculus suitable for our needs, and that is called quantum probability

calculus. We elaborate on it a bit next.

5 How to Construct a Quantum Probability Calculus?

We wish to extend the standard probability calculus to a noncommutative one.

This type of generalization procedure is familiar in mathematics: if we cannot

extend a concept directly (e.g., a set to a fuzzy set), we do that indirectly,

namely look at some equivalent representation of that concept which can be

more suitable for extension.

Following David Hilbert’s advice “What is clear and easy to grasp attracts

us, complications deter”, let’s ﬁrst consider the simplest case of Kolmogorov

probability, namely the ﬁnite sample space, representing a random experiment

with a ﬁnite number of possible outcomes, e.g., a roll of a pair of dice. A ﬁnite

probability space is a triple (Ω, A , P ) where Ω = {1, 2, ..., n}, say, i.e., a ﬁnite

set with cardinality n, A is the power set of Ω (events), and P : A → [0, 1]

is a probability measure (P (Ω) = 1, and P (A ∪ B) = P (A) + P (B) when

A ∩ B = ∅). Note that since Ω is finite, the set-function P is determined by the density ρ : Ω → [0, 1], ρ(j) = P ({j}), with Σ_{j=1}^{n} ρ(j) = 1. A real-valued

random variable is X : Ω → R. In this ﬁnite case, of course X −1 (B(R)) ⊆ A .

The domain of P is the σ-ﬁeld A of subsets of Ω (events) which is Boolean

(commutative: A ∩ B = B ∩ A), i.e., events are commutative, with respect to

intersection of sets. We wish to generalize this setting to a non commutative one,

where “extended” events could be, in general, non commutative, with respect to

an “extension” of ∩.

For this, we need some appropriate equivalent representation for all elements

in this finite probability setting. Now since Ω = {1, 2, ..., n}, each function X : Ω → R is identified as a point in the (finitely dimensional Hilbert) space Rn, namely (X(1), X(2), ..., X(n))^t, which, in turn, is equivalent to an n × n diagonal matrix with diagonal terms X(1), X(2), ..., X(n), and zero outside (a special symmetric matrix), i.e.,

X ⇐⇒ [X] = diag(X(1), X(2), ..., X(n))

These diagonal matrices form a set Do, a commutative (with respect to matrix multiplication) subalgebra of the algebra of all n × n matrices

with real entries. As matrices act as (bounded, linear) operators from Rn → Rn ,

we have transformed (equivalently) random variables into operators on a Hilbert

space.

In particular, for each event A ⊆ Ω, its indicator function 1A : Ω → {0, 1} is

identified as an element of Do with diagonal terms 1A(j) ∈ {0, 1}. As such, each event A is identified as an (orthogonal) projection on Rn, i.e., an operator T such that T = T² = T∗ (its transpose/adjoint). Finally, the density ρ : Ω → [0, 1] is identified with the element [ρ] of Do with nonnegative diagonal terms summing to one; such a matrix is a positive operator, i.e., an operator T such that <T x, x> ≥ 0, for any x ∈ Rn

(where < ., . > denotes the scalar product of Rn ). Such an operator is necessarily

symmetric (self adjoint). Thus, a probability density is a positive operator with

unit trace. Thus, we have transformed the standard (Kolmogorov) probability

space (Ω, A , P ), with #(Ω) = n, into the triple (Rn , Po , ρ), where Po denotes

the subset of projections represented by elements of Do (i.e., with 0–1 diagonal

terms) which represent “ordinary” events; and ρ (or [ρ]), an element of Do , is a

positive operator with unit trace.

Now, keeping Rn as a ﬁnitely dimensional Hilbert space, we will proceed

to extend (Rn , Po , ρ) to a non commutative “probability space”. It suﬃces to

extend Do, a special set of symmetric matrices, to the total set of all n × n symmetric matrices, denoted as S (Rn), so that a random variable becomes an “observable”, i.e., a self-adjoint operator on Rn; a “quantum event” is simply

an arbitrary projection on Rn , i.e., an element of P (the set of all projections);

and the probability density ρ becomes an arbitrary positive operator with unit

trace. The triple (Rn , P, ρ) is called a (ﬁnitely dimensional) quantum probability

space. We recognize that quantum probability is based upon a new language, not

real analysis, but functional analysis (i.e., not on the geometry of Rn , but on its

non commutative geometry, namely linear operators on it).

Clearly, in view of the non commutativity of matrix multiplication, quantum

events (i.e., projection operators) are non commutative, in general.

Let’s pursue a little further with this ﬁnite setting. When a random variable

X : Ω → R is represented by the matrix [X], its possible values are on the diag-

onal of [X], i.e., the range of X is σ([X]), the spectrum of the matrix (operator) [X]. For A ⊆ Ω, Pr(A) is taken to be P ([1A]) = Σ_{j∈A} ρ(j) = tr([ρ][1A]). More

generally, EX = tr([ρ][X]), exhibiting the important fact that the concept of

“trace” (of matrix/operator) replaces integration, a fact which is essential when

considering an inﬁnitely dimensional (complex, separable) Hilbert space, such as

L2 (R3 , B(R3 ), dx) of squared integrable, complex-valued functions.

The spectral measure of a random variable X, represented by [X], is the

projection-valued “measure” ζ[X] : B(R) → P(Rn), ζ[X](B) = Σ_{X(j)∈B} πX(j), where πX(j) is the (orthogonal) projection on the space spanned by X(j). From it, the “quantum” probability of the event (X ∈ B), for B ∈ B(R), is taken to be P (X ∈ B) = Σ_{X(j)∈B} ρ(j) = tr([ρ]ζ[X](B)).
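A tiny numerical illustration of this identification (our own sketch; the numbers are arbitrary): with Ω = {1, 2, 3}, the density, a random variable and an event all become diagonal matrices, and probabilities, expectations and spectral events are obtained as traces.

import numpy as np

rho = np.diag([0.5, 0.3, 0.2])      # density: positive diagonal operator with unit trace
X = np.diag([10.0, 20.0, 30.0])     # a random variable X as the diagonal matrix [X]
one_A = np.diag([1.0, 1.0, 0.0])    # event A = {1, 2} as a 0-1 diagonal projection

print("tr(rho) =", np.trace(rho))                  # 1.0
print("P(A)    =", np.trace(rho @ one_A))          # 0.8 = rho(1) + rho(2)
print("E[X]    =", np.trace(rho @ X))              # 17.0

zeta_B = np.diag([0.0, 1.0, 1.0])   # spectral projection for B = [15, 35]: eigenvalues 20, 30
print("P(X in [15, 35]) =", np.trace(rho @ zeta_B))    # 0.5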

The extension of the above to arbitrary (Ω, A , P ) essentially involves the

replacement of Rn by an inﬁnitely dimensional, complex and separable Hilbert

space H. For details, see texts like Dirac (1948), Meyer (1995), Parthasarathy

(1992).

We have stated several times that quantum probability is non commutative

and non additive. We will make these properties more explicit now.

Recall that a quantum probability space is a triple (H, P(H), ρ), where

P(H) plays the role of quantum events, and for p ∈ P(H), its probability

is given by tr(ρp). Recall that observables are self adjoint operators on H, i.e.,

elements of S (H).

Noncommutativity holds in general, since, for p, q ∈ P(H), they might not commute, i.e., pq ≠ qp, so that tr(ρpq) ≠ tr(ρqp). Of course, that extends to non commuting observables

as well.

At the experiment level, the surprising non additivity of probability is

explained by the interpretation of the Schrodinger wave function ψ(x, t) as a

probability amplitude, i.e., the probability of ﬁnding an electron in a neigh-

borhood dx of R3 (at time t) is |ψ(x, t)|² dx. The well-known two-slit experiment reveals that, for two distinct holes A and B, the probability of finding electrons when only A is open is PA = |ψA(x, t)|² dx, and for B only open, PB = |ψB(x, t)|² dx. When both holes are open, wave interference leads to ψA∪B(x, t) = ψA(x, t) + ψB(x, t), so that PA∪B = |ψA∪B(x, t)|² = |ψA(x, t) + ψB(x, t)|² ≠ |ψA(x, t)|² + |ψB(x, t)|².
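In numbers (a toy sketch with arbitrarily chosen complex amplitudes), the discrepancy is exactly the interference term 2 Re(ψA ψ̄B):

psi_A = 0.6 + 0.3j    # arbitrary illustrative amplitudes
psi_B = 0.2 - 0.5j

both_open = abs(psi_A + psi_B) ** 2
separate_sum = abs(psi_A) ** 2 + abs(psi_B) ** 2
interference = 2 * (psi_A * psi_B.conjugate()).real

print(both_open, separate_sum, both_open - separate_sum, interference)
# approximately 0.68, 0.74, -0.06, -0.06: the two probabilities differ by the interference term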

It can also be seen from the probability measure μρ(.) = tr(ρ ·) on P(H).

First, P(H) is not a Boolean algebra. It is a non distributive lattice, instead.

Indeed, in view of the bijection between projections and closed subspaces of H,

we have, for p, q ∈ P(H), p ∧ q is taken to be the projection corresponding to

the closed subspace R(p) ∩ R(q), where R(p) denotes the range of p; p ∨ q is the

projection corresponding to the smallest closed subspace containing R(p)∪R(q).

You should check that, in general, p ∧ (q ∨ r) ≠ (p ∧ q) ∨ (p ∧ r), unless they commute.

On (H, P(H), ρ), the probability of the event p ∈ P(H) is μρ (p) = tr(ρp),

and if A ∈ S (H), Pr(A ∈ B) = μρ (ζA (B)) = tr(ρζA (B)), for B ∈ B(R), where

ζA is the spectral measure of A (a projection-valued measure on B(R)). With

its spectral decomposition A = Σ_{λ∈σ(A)} λPλ, the distribution of A on σ(A) is Pr(A = λ) = μρ(Pλ) = tr(ρPλ), noting that A represents a physical quantity.

Recall that on a Kolmogorov probability space (Ω, A , P ), the probability is

axiomatized as satisfying the additivity: for any A, B ∈ A ;

P (A ∪ B) = P (A) + P (B) − P (A ∩ B)

Now, on (H, P(H), ρ), where the “quantum probability” Q (under ρ), deﬁned

as, for the “quantum event” p ∈ P(H), Q(p) = tr(ρp), does not, in general,

satisfy the analogue, for arbitrary p, q ∈ P(H),

Q(p ∨ q) = Q(p) + Q(q) − Q(p ∧ q)

i.e., Q(.) is not additive. This can be seen as follows. For operators f, g ∈ S (H),

their commutator is deﬁned as

[f, g] = f g − gf

so that [f, g] ≠ 0 if f, g do not commute (i.e., f g ≠ gf ), and zero if they commute.

Then, you can check that

[p, q] = (p − q)(p ∨ q − p − q + p ∧ q)

exhibiting the equivalence

[p, q] = 0 ⇐⇒ p ∨ q − p − q + p ∧ q = 0


Now, as Q(p) = tr(ρp), and by additivity of the trace operator, we see that

p ∨ q − p − q + p ∧ q = 0 =⇒ tr(ρ(p ∨ q − p − q + p ∧ q)) =

Q(p ∨ q) − Q(p) − Q(q) + Q(p ∧ q) = 0

which is the analogue of additivity for the quantum probability Q, for example

for p, q which commute.

The non additivity of quantum probability arises since, in general, p, q ∈

P(H) do not commute, i.e., [p, q] ≠ 0. In other words, the non additivity of

quantum probability is a consequence of the non commutativity of observables

(as self adjoint operators on a Hilbert space).
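A concrete 2 × 2 sketch (our own illustration; the density and the angle are arbitrary choices) makes both failures visible: with p projecting on the line spanned by e1 and q on a distinct line, one has p ∧ q = 0 and p ∨ q = I, the commutator is nonzero, and Q(p ∨ q) ≠ Q(p) + Q(q) − Q(p ∧ q).

import numpy as np

theta = np.pi / 4
c, s = np.cos(theta), np.sin(theta)
p = np.array([[1.0, 0.0], [0.0, 0.0]])            # projection onto span{e1}
q = np.array([[c * c, c * s], [c * s, s * s]])    # projection onto span{(c, s)}
rho = np.diag([0.8, 0.2])                         # a density: positive with unit trace

p_and_q = np.zeros((2, 2))    # the ranges are two distinct lines, so R(p) ∩ R(q) = {0}
p_or_q = np.eye(2)            # the closed subspace generated by the two lines is all of R^2

Q = lambda a: float(np.trace(rho @ a))
print("commutator pq - qp:\n", p @ q - q @ p)                      # nonzero
print("Q(p v q)               =", Q(p_or_q))                       # 1.0
print("Q(p) + Q(q) - Q(p ^ q) =", Q(p) + Q(q) - Q(p_and_q))        # 1.3: additivity fails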

Remark. The quantum law of an observable A ∈ S (H) is given as μA (.) :

B(R) → [0, 1], μA (B) = tr(ρζA (B)), where ζA is the spectral measure of A.

In a paper such as this, we think it is more appropriate to give the audience

a big picture of the promising road ahead, as far as ﬁnancial econometrics is

concerned, rather than some details and algorithms on actually how to apply

“quantum-like models” (which should be the story in another day).

When data (including economic data) are available, we look at them just as

a sample of a dynamic process, i.e., just examining on how they ﬂuctuated, and

not paying any attention on where they came from. In other words, when con-

ducting empirical research, regardless whether data are “natural phenomenon”

data or data having also some “cognitive” components (e.g., decisions from eco-

nomic agents/investors, traders in markets), we treat them the same way. Having

looked at data this way, we proceed (by tradition) simply by proposing stochas-

tic models to model their dynamics (for explanation and then prediction), such

as the well-known Black-Scholes model in ﬁnancial econometrics. Clearly the

geometric Brownian motion model (describing the stochastic dynamics of asset

prices) captures randomness of natural phenomena, but does not incorporate

anything related to the eﬀects of economic agents who are in fact responsible

for the ﬂuctuations of the prices under consideration. As such, does a “tradi-

tional” stochastic model in econometrics really describe the dynamics on which

all conclusions will be derived?

Stephen Hawking nicely reminded us [17] that, following natural sciences

(i.e., physics), we should view economics (a social science) as an “eﬀective the-

ory”, i.e., there is another important factor to take into account when proposing

a model (not a “law” yet!) for dynamics of economic variables, and that is deci-

sions of economic agents (“thinking individuals”, from the existence of their free

will). Whether or not partially because of this, behavioral economics started getting the attention of researchers. Of course, the problem arises because, so far,

unlike, say, quantum mechanics, predictions in economics were not that success-

ful (!), as Hawking nicely qualiﬁed it as “moderate”. Should we ask “why?”.

One possible answer lies in the “efficient market hypothesis”, developed under the influence of P.A. Samuelson and E.F. Fama, which

is based upon the “assumption” that investors act rationally and without bias

(and new information appears at random, and inﬂuences economic prices at

random). As a consequence, using standard probability calculus, martingales are

models for dynamics of asset prices, resulting in the conclusion that “trading on

stock market is just a game of chance (luck) and not a game of skill”, despite

empirical evidence revealing that “stock dynamics is predictable to some degree”.

It is all about prediction. But prediction is a consequence of our modeling

process. Should we take a closer look at the way we used to model ﬁnancial

dynamics? Obviously, we adapt (follow) concepts and methods in natural sci-

ences to social sciences, but not “completely”. The delicate diﬀerence between

Newtonian mechanics and quantum mechanics was ignored in econometrics mod-

eling. Of course, we do not “equate” the intrinsic randomness of particle motion

with the free will of economic agents’ minds (in making decisions). But, if, unlike

Newtonian mechanics, quantum mechanics is random so that, dynamics, trajec-

tories of particles should be formulated diﬀerently, then the same spirit should

be used in economic modeling.

But as Richard Feynman pointed out to us [11], when dealing with the ran-

domness of particles, we need another probability calculus. Of course that was

his only message to probabilists and statisticians, without knowing that later

standard probability and statistics would invade empirical research in economics. The

quantum probability calculus seems strange (i.e., not applicable) to standard

statistical practices, because quantum probability exhibits “nonadditivity” and

“noncommutativity”. Well, Hawking did tell us that we have to pay attention

to psychologists because they are there precisely to help econometricians! Both

nonadditivity and noncommutativity of a measure of ﬂuctuations were discov-

ered by psychologists, invalidating expected utility in the first place. The shift to

nonadditive measures (in human decision-making affecting economic data) started a long time ago, but it looks like a separate effort only for deci-

sion theory, with no incorporation into econometric analysis. As pointed out in

this present paper, nonadditive measures, such as Choquet capacities, are not

adequate as a measure of ﬂuctuations (of economic data) since they are still

increasing set functions, and commutative. It is right here that we should follow

physics “completely” by using quantum probability calculus in economic anal-

ysis. Recent literature shows promising research in this direction. Our hope, in

an exposition such as this, is that those econometricians who are not yet aware

of this revolutionary vision, will start to consider it seriously.

References

1. Allais, M.: Le comportement de l’homme rationnel devant le risque: Critique des

postulats et axiomes de l’ecole americaine. Econometrica 21(4), 503–546 (1953)

2. Baaquie, B.E.: Quantum Finance. Cambridge University Press, Cambridge, New

York (2004)

3. Briggs, W.: Uncertainty: The Soul of Modeling, Probability, and Statistics.
Springer, New York (2016)

4. Busemeyer, J.R., Bruza, P.D.: Quantum Models of Cognition and Decision.

Cambridge University Press, Cambridge (2012)

5. Dempster, A.: Upper and lower probabilities induced by a multivalued mapping.

Ann. Math. Stat. 38, 325–339 (1967)

6. Denneberg, D.: Non-additive Measure and Integral. Kluwer Academic Press,

Dordrecht (1994)

7. Derman, E.: My Life as a Quant: Reflections on Physics and Finance. Wiley,

Hoboken (2004)

8. Diaconis, P., Skyrms, B.: Ten Great Ideas About Chance. Princeton University

Press, Princeton (2018)

9. Ellsberg, D.: Risk, ambiguity, and the savage axioms. Q. J. Econ. 75(4), 643–669

(1961)

10. Fagin, R., Halpern, J.Y.: Uncertainty, belief and probability. Comput. Intell. 7,

160–173 (1991)

11. Feynman, R.: The concept of probability in quantum mechanics. In: Berkeley Sym-

posium on Mathematical Statistics and Probability, pp. 533–541 (1951)

12. Fishburn, P.C.: Non Linear Preference and Utility Theory. Wheatsheaf Books,

Brighton (1988)

13. Fishburn, P.C.: Utility Theory for Decision Making. Wiley, New York (1970)

14. Gelman, A., Betancourt, M.: Does quantum uncertainty have a place in everyday

applied statistics? Behav. Brain Sci. 36(3), 285 (2013)

15. Gilboa, I., Marinacci, M.: Ambiguity and the Bayesian paradigm. In: Acemoglu,

D. (ed.) Advances in Economics and Econometrics, pp. 179–242. Cambridge Uni-

versity Press, Cambridge (2013)

16. Haven, E., Khrennikov, A.: Quantum Social Science. Cambridge University Press,

Cambridge (2013)

17. Hawking, S., Mlodinow, L.: The Grand Design. Bantam Books, London (2010)

18. Huber, P.J.: The use of Choquet capacities in statistics. Bull. Inst. Int. Stat. 4,

181–188 (1973)

19. Kahneman, D., Tversky, A.: Prospect theory: an analysis of decision under risk.

Econometrica 47, 263–292 (1979)

20. Kreps, D.M.: Notes on the Theory of Choice. Westview Press, Boulder (1988)

21. Lambertini, L.: John von Neumann between physics and economics: a methodolog-

ical note. Rev. Econ. Anal. 5, 177–189 (2013)

22. Marinacci, M., Montrucchio, L.: Introduction to the mathematics of ambiguity. In:

Gilboa, I. (ed.) Uncertainty in Economic Theory, pp. 46–107. Routledge, New York

(2004)

23. Meyer, P.A.: Quantum Probability for Probabilists. Lecture Notes in Mathematics.

Springer, Heidelberg (1995)

24. Nguyen, H.T.: On random sets and belief functions. J. Math. Anal. Appl. 65(3),

531–542 (1978)

25. Nguyen, H.T., Walker, A.E.: On decision making using belief functions. In: Yager,

R., Kacprzyk, J., Fedrizzi, M. (eds.) Advances in the Dempster-Shafer Theory of

Evidence, pp. 331–330. Wiley, New York (1994)

26. Parthasarathy, K.R.: An Introduction to Quantum Stochastic Calculus. Springer,

Basel (1992)

27. Schmeidler, D.: Integral representation without additivity. Proc. Am. Math. Soc.

97, 255–261 (1986)


28. Schmeidler, D.: Subjective probability and expected utility without additivity.

Econometrica 57(3), 571–587 (1989)

29. Segal, W., Segal, I.E.: The Black-Scholes pricing formula in the quantum context.

Proc. Nat. Acad. Sci. 95, 4072–4075 (1998)

30. Shafer, G.: A Mathematical Theory of Evidence. Princeton University Press,

Princeton (1976)

31. Sriboonchitta, S., Wong, W.K., Dhompongsa, S., Nguyen, H.T.: Stochastic Domi-

nance and Applications to Finance, Risk and Economics. Chapman and Hall/CRC

Press, Boca Raton (2010)

32. Von Neumann, J., Morgenstern, O.: The Theory of Games and Economic Behavior.

Princeton University Press, Princeton (1944)

33. Walley, P.: Statistical Reasoning with Imprecise Probabilities. Chapman and Hall,

London (1991)

34. Zadeh, L.A.: Fuzzy sets as a basis for a theory of possibility. J. Fuzzy Sets Syst. 1,

3–28 (1978)

My Ban on Null Hypothesis Signiﬁcance

Testing and Conﬁdence Intervals

David Trafimow

P. O. Box 30001, Las Cruces, NM 88003-8001, USA

dtrafimo@nsu.edu

Abstract. The journal, Basic and Applied Social Psychology, banned null

hypothesis signiﬁcance testing and conﬁdence intervals. Was this justiﬁed, and

if so, why? I address these questions with a focus on the different types of

assumptions that compose the models on which p-values and conﬁdence

intervals are based. For the computation of p-values, in addition to problematic

model assumptions, there also is the problem that p-values confound the

implications of sample effect sizes and sample sizes. For the computation of

conﬁdence intervals, in contrast to the justiﬁcation that they provide valuable

information about the precision of the data, there is a triple confound involving

three types of precision. These are measurement precision, precision of homo-

geneity, and sampling precision. Because it is possible to estimate all three

separately, provided the researcher has tested the reliability of the dependent

variable, there is no reason to confound them via the computation of a conﬁ-

dence interval. Thus, the ban is justiﬁed both with respect to null hypothesis

signiﬁcance testing and conﬁdence intervals.

Keywords: Models · Model assumptions · Inferential assumptions · Precision

In my new position as Executive Editor of the journal, Basic and Applied Social Psy-

chology (BASP) in 2014, I discouraged researchers from performing the null hypothesis

signiﬁcance testing (NHST) procedure (Traﬁmow 2014). However, the 2014 editorial

had very little discernible effect on BASP submissions, social psychology, or science

more generally. Stronger measures were needed, so the following year I banned NHST

from BASP (Traﬁmow and Marks 2015). At ﬁrst, most of the reaction I received was

strongly negative. Many people emailed me that the ban would destroy BASP, and a few

even expressed that the ban would destroy social psychology as a respectable area of

scientiﬁc inquiry. But did the critics exaggerate the negative effects of the ban?

As time eventually showed, the critics did not exaggerate the amount of attention

that would be paid to the editorial. Within a few months, there were over 100,000 hits

on the editorial on the BASP website, the editorial was cited countless times, and NHST

was an important topic at the American Statistical Association Symposium on Statis-

tical Inference in October of 2017, at which I presented. But in another way, the critics

did exaggerate or were just plain wrong. The ban certainly did not destroy BASP; in

fact, the impact factor more than doubled. Nor did the ban destroy social psychology as

a respectable area of scientiﬁc inquiry as, to my knowledge, no one has written an


article suggesting that the ban is a reason for reducing belief in the respectability of

social psychology as a ﬁeld of science. Much more damaging to the credibility of social

psychology, and the soft sciences more generally, is that there has been a replication

crisis (e.g., Earp and Traﬁmow 2015; Open Science Collaboration 2015; Traﬁmow, in

press).

On the contrary, much that is good has followed the ban. Dramatically

increased discussion has ensued not only in social psychology, but in many areas of

science about the (in)validity of NHST and the possibility of alternative procedures.

Returning to the Symposium on Statistical Inference, most of the speakers strongly

criticized NHST and favored the consideration of alternative procedures. And there are

new efforts in different countries, in different areas of science, to eliminate NHST, such

as a Netherlands effort in the life and health sciences. In addition, statistics textbooks

now exist, either published or in the process of being published, that discourage

NHST.1 In contrast to the expressions of negativity that immediately followed the ban,

much is changing across the sciences, and very much for the better. Some of the

presentations at the present conference TES2019 provide positive examples.

What is NHST?

To understand what is wrong with NHST, it is ﬁrst necessary to understand the p-value

and the place that the p-value has in NHST. The American Statistical Association

provided a nice characterization (Wasserstein and Lazar 2016, p. 131): “Informally, a

p-value is the probability under a speciﬁed statistical model that a statistical summary

of the data (e.g., the sample mean difference between two compared groups) would be

equal to or more extreme than its observed value” (italics added). The temptation, of

course, and what most of the experts warn against but do themselves anyhow, is make

an inverse inference that if the p-value is low, the probability of the model is low.2 The

fallacy is so pervasive that it has a name: the inverse inference fallacy. A quick way to

dramatize the fallacy is to consider the probability that a person is president of the USA

given that the person is an American citizen; versus the probability that someone is an

American citizen given that the person is president of the USA. The former conditional

probability is extremely low whereas the latter conditional probability is extremely high

(according to the American Constitution it is 1.00). Analogously, even though the p-

value might be low number, that does not mean that the statistical model need be

unlikely to be true. From the point of view of strict logic, the probability of the model

could be any number between 0 and 1, at least from a Bayesian perspective that models

can have probabilities. From a strict frequentist point of view, the model is either

correct (probability = 1) or incorrect (probability = 0) but we may not know which.

Thus, whether a Bayesian or frequentist view is taken, it would seem there is no

justiﬁcation for drawing a conclusion about the probability of the model or making an

accept/reject decision, respectively. For a theoretical analysis of why the testing pro-

cedure using p-values is not valid, see Nguyen (2016).

1 An example is the book by Briggs (2016), who is a distinguished participant at TES2019.
2 Richard Morey, in his blog (http://bayesfactor.blogspot.com/2015/11/neyman-does-science-part-1.html), has documented how even Neyman was unable to avoid misusing p-values in this way, though he warned against it himself.


But there is an important frequentist exception that invokes the notions of Type I

and Type II error. A Type I error is when the model is true and the researcher rejects it.

A Type II error is when the model is false and the researcher fails to reject it. The idea

of NHST, then, is to set an alpha level—usually .05—that serves as a threshold. If the

p-value comes in below the threshold, the researcher rejects the model whereas if the p-

value does not come in below the threshold, the researcher fails to reject the model

(which is not the same thing as accepting the model). By setting the alpha level, say at

.05, the researcher can be conﬁdent of making a Type I error only 5% of the time when

the model is true. This NHST strategy is touted as the way to sidestep the inverse

inference fallacy.
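The frequentist guarantee is easy to see in a small simulation (a sketch of ours, not from the editorial): when the full model, including a true null of no mean difference, actually holds, p-values fall below .05 in roughly 5% of replications.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_reps, n_per_group, alpha = 10_000, 30, 0.05

rejections = 0
for _ in range(n_reps):
    group1 = rng.normal(0.0, 1.0, n_per_group)
    group2 = rng.normal(0.0, 1.0, n_per_group)    # same population, so the null model is true
    _, p = stats.ttest_ind(group1, group2)
    rejections += p < alpha

print("Type I error rate:", rejections / n_reps)    # close to 0.05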

What is the Model?

If NHST provides an elegant way to sidestep the inverse inference fallacy, why am I

nevertheless against it? There is a long litany of reasons reviewed by Traﬁmow and

Earp (2017) that need not be repeated here. But it is worthwhile to hit on the most

interesting problem, the meaning of the model (Traﬁmow, submitted).

As exempliﬁed in the foregoing quotation from the American Statistical Associa-

tion, although statisticians talk virtuously about the importance of recognizing that a

computed p-value is relative to a model, these same statisticians fail to consider ade-

quately the different types of assumptions that go into the model. As I explained

recently (Traﬁmow, submitted manuscript), the different types of assumptions that go

into the model change depending on the type of research being done. For research

designed to test theories, there are at least four categories of assumptions bullet-listed

below:

• theoretical assumptions,

• auxiliary assumptions,

• statistical assumptions, and

• inferential assumptions.

Theoretical assumptions refer to assumptions that the theorist makes by choosing a

theory or set of theories from which to work; the assumptions in the theory or set of

theories are theoretical assumptions. At least some theoretical assumptions refer to

nonobservational terms. For example, in Newton’s theory, although weight is an

observational term, mass is a nonobservational term; and the difference can be seen

easily merely by considering that the same object would have the same mass, but

different weights, on different planets.

Auxiliary assumptions are not embedded in theories themselves but are necessary

to connect nonobservational terms in theories with observational terms in empirical

hypotheses. For example, when Halley (1656–1742) used Newton’s theory to predict

the reappearance of the comet that now bears his name, he made assumptions about the

presence or absence of astronomical bodies, the present position of the comet, and so

on. These auxiliary assumptions were not in Newton’s theory but were necessary to use

the theory to make predictions.

Statistical assumptions concern the summary statistics researchers use. If a mar-

keting researcher were to predict greater purchasing intentions in the advertising

condition than in the control condition, she might propose a statistical hypothesis of a


difference in means. Note that she could use other summary statistics such as medians,

modes, data frequencies above or below arbitrary percentiles, and many others. The

choice of summary statistics necessitates assumptions, even if these are tacit, about

why the summary statistics chosen are better suited for the researcher’s purposes than

other types of summary statistics. Trafimow et al. (2018; also see Speelman and McGann 2016) have shown that different summary statistics can lead to opposing

conclusions. Thus, the issue of choosing summary statistics is much more important

than researchers typically realize.

Finally, there are inferential assumptions. To arrive at a p-value, it is necessary to

assume random and independent sampling from a deﬁned population (Berk and

Freedman 2003). Depending on one’s path to the p-value, it may also be necessary to

make assumptions about the population distributions, linearity, that participants were

randomly assigned to conditions, that the manipulation “took” for all participants, or

that there was no systematic invalidity in the measurements. I emphasize that at least

some inferential assumptions are guaranteed wrong in the soft sciences, such as the

assumption of random and independent sampling from a deﬁned population (Berk and

Freedman 2003). The a priori wrongness of at least some inferential assumptions is

going to play an important part in the argument to be made.

In summary, the model is an immense monstrosity that includes theoretical

assumptions, auxiliary assumptions, statistical assumptions, and inferential assump-

tions. Theoretical assumptions include nonobservational terms, and auxiliary

assumptions connect nonobservational terms in theories to observational terms in

empirical hypotheses. Sometimes there is no need for statistical assumptions, such as

when Haley predicted the reappearance of a comet, but sometimes statistical

assumptions are necessary, as is typical in the soft sciences. Statistical assumptions

connect observational terms in empirical hypotheses to the speciﬁc summary statistics

to be computed. Finally, to bridge the gap from summary statistics to p-values, it is

necessary to make inferential assumptions.

Consequences of the Model Assumptions

The most obvious consequence of the many model assumptions, including inferential

assumptions, is that, at least in the soft sciences, the model is guaranteed to be wrong.

Box and Draper (1987) stated the issue succinctly, “Essentially, all models are wrong,

but some are useful” (p. 424). Well, then, given that the model is known wrong a priori,

it is reasonable to query whether there is any point in testing it via p-values. The usual

apology for p-values is to admit that the model is wrong a priori, but to argue, con-

sistent with the Box and Draper quotation, that it nevertheless might be close enough to

correct to be useful. Once one grants that the model might be close to being correct,

even though the model is incorrect, it seems reasonable to argue for the worth of

p-values to test model closeness to correctness.

But there is an important flaw in the p-value apology, which is that p-values are

computed conditionally upon the model being assumed correct, not upon the model

being assumed close to correct. The unfortunate fact of the matter is that there is no

way to compute a p-value based on the assumption that the model is close to correct.

As an example, it is obvious that the inferential assumption of random and independent

sampling from a deﬁned population is never true in the soft sciences (Berk and


Freedman 2003); yet this assumption is a component of the model. Well, then, how

would one compute a p-value based on the notion of various degrees of closeness of

this assumption, not to mention the myriad additional inferential assumptions? Let me

be clear that I am not advocating that researchers always should randomly sample from

deﬁned populations, only that this is an inferential assumption that must be made to

justify the p-value to be computed.3 The larger point, which is worth reiterating, is that

there is no reason for researchers interested in model closeness to compute p-values,

nor to engage in NHST.

From a philosophy of science perspective, matters are worse than I have related

thus far. Consider luminaries such as Duhem (1954) and Lakatos (1978) who

emphasized that even if obtained data seem to falsify a theory, the failure to predict can

be attributed to auxiliary assumptions, as well as to the theory. Thus, it is not clear

whether to blame the theory or auxiliary assumptions for the empirical defeat of the

theory. In addition, I have recently demonstrated that a similar problem ensues in the

event of an empirical victory (Traﬁmow 2017a); the victory could be credited to the

excellence of the theory or the auxiliary assumptions. And this work by Duhem,

Lakatos, and myself only considers theoretical and auxiliary assumptions; but not

statistical or inferential assumptions that add additional layers of complexity to an

already challenging process of evaluating theories based on empirical defeats or vic-

tories. Considering statistical assumptions, the beneﬁts typically outweigh the cost of

another layer of assumptions. Researchers in the soft sciences need means, medians,

percentiles, and so on; though they should put more thought into which descriptive

statistics to use, and test whether using different descriptive statistics implies similar or

different conclusions (Traﬁmow et al. 2018).

But it is less clear that researchers need p-values. In the ﬁrst place, as I will

demonstrate presently, unlike typical descriptive statistics, p-values provide no added

beneﬁt. Secondly, and directly to the present point, using p-values necessitates a layer

of inferential assumptions, some of which are guaranteed wrong. This guaranteed

wrongness implies that no matter how carefully the research project is performed, the

obtained p-value can be attributed to a problem with the inferential assumptions, as

well as a problem with one of the other types of assumptions. The result is a severe

impediment to attempting to draw conclusions about theories from data. Thus, the cost

of using p-values, though often not recognized, is immense. Further costs, particularly

when p-values are used for NHST, are dichotomous thinking (Greenland 2017;

Traﬁmow et al. 2018); questionable research practices to obtain p < .05 for the sake of

grants and publications (Bakker et al. 2012; John et al. 2012; Simmons et al. 2011;

Woodside 2016); published effect sizes that, on average, are much larger than true

effect sizes (Grice, 2017; Hyman, 2017; Kline, 2017; Locascio, 2017a; Locascio,

2017b; Marks, 2017; Open Science Collaboration 2015); replication crises (Earp and

Traﬁmow 2015; Halsey et al. 2015; Open Science Collaboration 2015; Traﬁmow, in

press), and other costs (see Briggs 2016; Hubbard 2016; Traﬁmow and Earp 2017;

Ziliak and McCloskey 2016; for reviews).

3

In fact, Rothman et al. (2013) provided arguments against random selection.


Do p-values convey any beneﬁts to balance out the immense costs of using them?

Consider an example. A researcher obtains a large sample size, obtains a small p-value,

and rejects the model. Given that the researcher already knew the model was incorrect

before starting, there is no new information here. A counter might be to invoke the

closeness apology; that is, the low p-value shows that the model is not even close to

being true. But this conclusion does not follow validly. First, it commits the inverse

inference fallacy discussed earlier. Second, the p-value was not computed based on the

model being close; but rather based on the model being exactly right. Therefore, the p-

value has nothing to say about whether the model is close or far from the truth. Third,

again from the point of view of the closeness of the model to the truth, the p-value

confounds the implications of sample effect size and sample size. To see this quickly,

suppose that the researcher in our example had proposed a model that speciﬁes the

predicted effect size and that the obtained effect size is close to it. The low p-value is

due to the large sample size. Well, then, the low p-value implies that the model is far

from the truth whereas the closeness of the sample effect size to the predicted one

implies that the model is close to the truth. Clearly, an intelligent researcher would

discount the p-value, not the sample effect size, though she would also keep in mind the

usual difﬁculties in drawing conclusions about theories from data. Although, in this

example, I speciﬁed a large sample size for the sake of simplicity, there is a larger

point. That is, the sample effect size is obviously relevant to assessing the closeness of

the model to the truth (though other factors are relevant too); and the sample size is

obviously relevant for assessing the extent to which the researcher should trust the

stability of the sample effect size (though other factors are relevant too). Given that the

researcher knows the sample effect size and the sample size, the p-value provides no

added beneﬁt. Worse yet, the p-value confounds the implications of the sample effect

size and the sample size. If the issue were the exact correctness of the model, perhaps

the confounding could be justiﬁed, though there would still be the inverse inference

fallacy with which to contend. But given that the issue is the closeness of the model,

there is no justiﬁcation whatsoever in confounding the clear implications of the sample

effect size and the sample size, when not confounded with each other via the com-

putation of a p-value. In summary, the combination of no added beneﬁt and immense

costs renders p-values strongly deleterious to the progress of science.

What About Research Not Designed to Test Theories?

One alternative to research designed to test theories is that researchers may wish to

perform applied research. For those researchers applying a theory, all the categories of

assumptions mentioned earlier—theoretical, auxiliary, statistical, and inferential—re-

main relevant. But it also is possible to perform applied research that is not based on

theory but depends on what might be considered substantive assumptions. However,

this change does not affect the message of the foregoing section. That is, there may be

substantive assumptions and statistical assumptions; whereas using p-values forces the

added complexity of including inferential assumptions too. Thus, even for applied

research, p-values render an already challenging task of evaluating substantive

assumptions more complex, by adding a layer of known wrong inferential assumptions.

It also is possible to imagine researchers using p-values for data ﬁshing. But if one

is data ﬁshing, it makes more sense to decide on relevant sample effect sizes for the


issue at hand, and screen based on those rather than based on p-values. Remaining with

data ﬁshing, it is easy to imagine researchers having the computer perform two com-

puter runs. Whereas one run is based on sample effect sizes, the other run could be

based on sample sizes, where the researcher programs the run on the minimum sample

size necessary for her to trust the stability of the sample effect size. Even here, whatever

assumptions the researcher needs to make to go data ﬁshing, there is nothing gained by

adding an extra layer of inferential assumptions where at least some of them are known

wrong a priori.

Finally, let us consider exploratory research, where very little is known, and the

purpose of the research is to obtain empirical facts on which to base decisions about

whether or how to continue the line of research. Perhaps NHST can be useful here.

That is, if the p-value is statistically signiﬁcant, that suggests that the researcher should

continue the line of research whereas if the p-value is not statistically signiﬁcant, that

suggests that resources would be better devoted to pursuing an alternative line of

research. The main problem with this thinking is that the p-value again is no better than

the model on which it is based. We already have seen that the model is problematic in

the context of research performed to test theories, for application, and for data ﬁshing.

The model is even more problematic for exploratory research where very little is

known and so there is even less basis for the model assumptions than usual. In addition

to the ubiquitous and wrong inferential assumption of random and independent sam-

pling from a deﬁned population; when very little is known there is even less justiﬁ-

cation than usual for inferential assumptions about distributions, linearity, and so on.

The sample effect size provides a much better indication of whether to continue the line

of investigation, and the sample size provides an indication of how much to trust the

stability of the sample effect size. The p-value provides no added beneﬁt, confounds the

implications of sample effect size and sample size, and saddles researchers with

important costs.4

The usual recommendation by authorities who eschew NHST speciﬁcally, and p-values

more generally, is that researchers should use conﬁdence intervals (see Cumming and

Calin-Jageman 2017 for a review). Contrary to this recommendation, the BASP p-value

ban included conﬁdence intervals (CIs) too. Why?

To commence, it is worth considering what CIs do not accomplish. Researchers

often believe, for example, that a 95% CI provides an interval that has a 95% proba-

bility of containing the population parameter of interest. But this is not so. As

sophisticated aﬁcionados of CIs admit, the researcher has no idea about the probability

4

The reader may wonder about p-values as used in NHST versus as used to provide continuous

indices of alleged justiﬁed worry about the model. Although both are problematic for the reasons

described, null hypothesis signiﬁcance tests are worse because of the dichotomous thinking they

encourage, and the dramatic overestimates of effect sizes in scientiﬁc literatures that they promote

(see Locascio, 2017a for an explanation). If p-values were calculated but not used to draw any

conclusions, their costs would be reduced though still without providing any added beneﬁts.


that the constructed interval contains the population parameter. In fact, from a strict

frequentist perspective, the population parameter is in the interval (100% probability)

or not (0% probability), though the researcher does not know which. Even if one does

not take a strict frequentist perspective, the best that can be said about a 95% CI is that

if one were to take many samples, approximately 95% of the 95% CIs constructed,

would contain the population parameter.5 To conclude anything about the probability

that population parameter is within a CI constructed based on a single sample is to

commit another version of the inverse inference fallacy discussed earlier.

Sophisticated CI aﬁcionados understand the foregoing but nevertheless tout CIs

because they indicate the precision of the data. Wide CIs indicate less precision and

narrow CIs indicate more precision. Are CIs useful in a precision context?

In fact, they are not, as becomes clear upon consideration that there are at least three

types of precision, and all of them influence CI widths. It is well-known that the

researcher needs to obtain the standard error of the mean to compute a conﬁdence

interval for a mean or a difference between means. The standard error of the mean is

influenced by the standard deviation and the sample size. In turn, the standard deviation

is influenced by random and systematic variance. According to classical test theory

(Gulliksen 1987; Lord and Novick 1968), the smaller the random variance, the greater

the measurement precision. In addition, the smaller the systematic variance (e.g., real

differences between people) within each group, the easier it is to distinguish the effects

of experimental manipulations in between-group analyses; Traﬁmow (2018) termed

this precision of homogeneity. Finally, the larger the sample size, the greater the

sampling precision.6 Thus, CI widths are influenced by three types of precision:

measurement precision, precision of homogeneity, and sampling precision. This triple

confounding will be discussed more later.

Traﬁmow (2018) derived equations that unconfound the joint effects of the three

types of precision, provided that the researcher obtains a good estimate of the reliability

of the dependent variable. At the population level, Eq. 1 gives the variance due to

systematic factors (true differences between participants) σ²O as a function of the reliability of the dependent variable ρXX′ and the total variance σ²X. Equation 2 gives the variance due to randomness σ²R. As researchers normally do not have access to population data and consequently must depend on sample data, Eqs. 1* and 2* perform similar functions to Eqs. 1 and 2, respectively, but based on the sample reliability rXX′ and sample variance s²X.
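Written out, and assuming the standard classical test theory decomposition that the verbal definitions above describe (a reconstruction for the reader's convenience rather than a verbatim copy of Trafimow 2018), these equations are approximately:

$$\sigma_O^2 = \rho_{XX'}\,\sigma_X^2, \qquad (1) \qquad\qquad \sigma_R^2 = (1 - \rho_{XX'})\,\sigma_X^2, \qquad (2)$$

$$s_O^2 = r_{XX'}\,s_X^2, \qquad (1^*) \qquad\qquad s_R^2 = (1 - r_{XX'})\,s_X^2. \qquad (2^*)$$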

5

Of course, even this very limited conclusion depends on the model being correct, and as we already

have seen, the model is not correct because of problematic inferential assumptions.

6

Assuming random sampling, an assumption most likely incorrect.


The smaller these systematic and random variances, the greater the precision of homogeneity and the greater the measurement precision, respectively.

In contrast to precision of homogeneity and measurement precision, the third type

of precision—sampling precision—is in no way a function of the reliability of the

dependent variable. Rather, it is a function of the sample size. As a simple example,

consider the case where a researcher collects a sample mean with the goal of estimating

the population mean. The sampling precision is simply a function of the sample size n.

To see that this is so, imagine the simple case of only one group, and the goal is to

obtain a sample mean to estimate the population mean. Traﬁmow (2017) provided an

accessible derivation of Eq. 3 below, where f is the fraction of a standard deviation

within which the researcher wishes the sample mean to be of the population mean, and

where zc is the z-score that corresponds to the desired probability of being within f .

$$n = \left(\frac{z_c}{f}\right)^2. \qquad (3)$$

For example, suppose the researcher wishes to have a 95% probability of obtaining a sample mean within .3 standard deviations of the population mean. The z-score corresponding to 95% is 1.96, and so instantiating the appropriate values into Eq. 3 gives the following, to the nearest upper whole number: n = (1.96/.3)² ≈ 43. Thus, the researcher must collect at least 43 participants to have a 95% probability of obtaining a sample mean within three-tenths of a standard deviation of the population mean.
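As an aside for readers who want to reproduce this calculation, a minimal Python sketch of Eq. 3 follows; the function name and the use of scipy's normal quantile are my own illustrative choices, not part of the original procedure.

```python
from math import ceil
from scipy.stats import norm

def a_priori_n(f, confidence=0.95):
    """Sample size so that, with the stated probability, the sample mean falls
    within f standard deviations of the population mean (Eq. 3)."""
    z_c = norm.ppf(0.5 + confidence / 2.0)  # z-score for the desired probability
    return ceil((z_c / f) ** 2)

print(a_priori_n(0.3))  # 43, matching the worked example in the text
```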

I claimed earlier that using CIs causes a confound involving the implications of the

three types of precision. The easiest way to see this is to consider an Equation Traﬁ-

mow (2018) derived that gives the standard error of the mean as a function of the three

types of precision indices: σO, σR, and n:

$$\text{Standard Error of the Mean} = SD_{\bar{X}} = \frac{\sqrt{\sigma_O^2 + \sigma_R^2}}{\sqrt{n}}. \qquad (4)$$

Equation 4 shows that very many possible combinations of values for σO, σR, and n can result in similar results for $SD_{\bar{X}}$. Thus, whatever a researcher estimates $SD_{\bar{X}}$ to be, there is no way for the researcher to know the extent to which the obtained value is because of impressive or unimpressive levels of each of the three types of precision. To render the triple confound easy to see, Fig. 1 illustrates how many values for the different levels of precision can result in the same value for $SD_{\bar{X}}$, which was arbitrarily set at 10 for every curve in Fig. 1.
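To make the confounding concrete, the short Python sketch below (my own illustration; the particular numbers are arbitrary and not taken from the chapter) shows three very different combinations of σO, σR, and n that all yield a standard error of exactly 10 under Eq. 4.

```python
import math

def standard_error(sigma_o, sigma_r, n):
    """Eq. 4: standard error of the mean from systematic variance sigma_o**2,
    random variance sigma_r**2, and sample size n."""
    return math.sqrt(sigma_o**2 + sigma_r**2) / math.sqrt(n)

# Poor homogeneity + poor measurement + tiny sample; better measurement + modest
# sample; poor measurement + large sample -- all produce the same standard error.
for sigma_o, sigma_r, n in [(10, 10, 2), (30, 10, 10), (20, 60, 40)]:
    print(sigma_o, sigma_r, n, round(standard_error(sigma_o, sigma_r, n), 6))  # 10.0
```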

The fact that it is possible to obtain separate estimates of the three types of precision

(σO, σR, n), rather than confounding their implications by using a CI, sets up a dilemma

for CI aﬁcionados who tout the precision argument as their justiﬁcation for using CIs.

Speciﬁcally, if a researcher cares about precision, there is little excuse for failing to

measure the reliability of the dependent variables, and using the appropriate equations


Fig. 1. The sampling precision n necessary to keep the standard error of the mean constant at 10 is presented along the vertical axis as a function of measurement precision σR presented along the horizontal axis and six curves representing different levels of precision of homogeneity σO.

to estimate the three types of precision separately. In this case, there is no reason to

compute a triply confounded conﬁdence interval when the researcher can have much

more ﬁne-grained information in the form of separate estimates of the three types of

precision. Alternatively, the researcher might not care about precision, and might not

measure the reliability of the dependent variable; but in the case where the researcher

does not care about precision, the precision argument fails to justify a CI. Thus, in

either case, whether the researcher cares about precision or does not care about it, there

is little reason to compute a CI.

My case against CIs can be summarized as follows. If they are used merely as

another way to perform NHST; that is, to reject the null hypothesis if the value

speciﬁed by the null hypothesis is outside the computed CI; then CIs have the same

problems that plague NHST. If CIs are used for the sake of precision—the “proper” use

according to sophisticated authorities—then there is the issue of the confounding of

three different types of precision in the standard error of the mean necessary to compute

CIs. Either way, CIs are problematic. Finally, as is true of p-values, the computation of

CIs necessitates appending a layer of inferential assumptions—some of which are

guaranteed to be wrong—on top of the other assumptions made, thereby further

complicating an already difﬁcult task of using data to come to justiﬁable conclusions.


2 Conclusion

It is possible to agree with the many problems that plague NHST and CIs but never-

theless support them based on a perceived lack of alternatives. Contrary to many

researchers’ perceptions, however, there are many alternatives. An underrated alter-

native, but a good one, is to simply not perform inferential statistics. Lest the reader

experience dismay, or even panic, at this possibility; consider that for centuries before

NHST and CIs had been invented, researchers nevertheless conducted science. Is this

because scientists of yesteryear were smarter than scientists of today, and so they made

progress without the typical inferential statistics currently prevalent and deemed nec-

essary? My argument is to the contrary. Because NHST and CIs force the researcher to

append a layer of inferential assumptions—some of which are guaranteed to be wrong

—on top of their other assumptions, researchers of yesteryear who were not burdened

with this layering had an important advantage over contemporary researchers.7 By

dispensing with NHST and CIs, contemporary researchers could rid themselves of the

unnecessary burden, and accordingly enjoy the beneﬁts of a simpler and less cum-

bersome scientiﬁc process. As a supplement to simply avoiding NHST and CIs, there is

a variety of ways to use visual displays to enhance the ability of researchers to

understand their descriptive statistics and draw useful conclusions (Valentine et al.

2015). As editor of BASP, I have supported improved descriptive statistics and visual

displays of the data.

Another possibility is to consider completely different approaches. One of these is

quantum probability (Trueblood and Busemeyer 2011; 2012) that includes an

assumption that the typically assumed commutative nature of events can be violated. That is, the typical assumption is that p(A ∩ B|H) = p(B ∩ A|H). By applying Bayes' theorem, a further implication is that p(H|A ∩ B) = p(H|B ∩ A). Based on quantum probability not assuming commutative properties, Trueblood and Busemeyer (2012)

showed that it is possible to account for much data that otherwise is difﬁcult to explain.

Yet another possibility is for researchers to use one of the a priori procedures that

have been invented only recently. The basic concept commences with the goal that

many researchers have, which is to obtain sample statistics that are good estimates of

corresponding population parameters. With this goal in mind, two obvious issues

concern the desired closeness of sample statistics to population parameters, and the

probabilities of being within the desired distances. The researcher can specify, before

data collection, speciﬁcations for distances and probabilities. By using one of a

growing inventory of a priori equations to suit different study designs, different

inferential assumptions, and different statistics (e.g., Traﬁmow 2017b; Traﬁmow and

MacDonald 2017; Traﬁmow, Wang, and Wang 2017; Wang, Wang, Traﬁmow, and

Myüz, under submission); the researcher can determine the number of participants

needed to be conﬁdent that descriptive statistics of interest are good estimates of their

corresponding population parameters. Because a priori equations are used prior to data

7

This argument should not be interpreted as indicating that contemporary researchers are at an overall

disadvantage. In fact, contemporary researchers have many advantages over the researchers of

yesteryear, including better knowledge, better technology, and others.


collection, there is no necessity to perform NHST or CIs after data collection. Rather,

the researcher who uses the a priori procedure simply performs the relevant calcula-

tions, collects the sample size indicated, and then trusts that the descriptive statistics to

be obtained are good estimates of the corresponding population parameters, based on a

priori speciﬁcations.8

Finally, of course, there are Bayesian procedures. A proper discussion of Bayesian

procedures is far beyond present scope, but the reader can gain an appreciation of the

different varieties of Bayesian thinking, along with philosophical problems associated

with each, by consulting Gillies (2000).

Clearly, then, there are alternatives to NHST and CIs, though each of the alter-

natives may have their own issues. Although the alternatives still need more consid-

eration before their unqualiﬁed acceptance, they are sufﬁcient, at least, to counter the

perception that there are no alternatives. I reiterate that there is always the option of

simply not using inferential procedures of any type which; even if the reader rejects,

quantum probability, a priori procedures, and Bayesian procedures; is superior to

NHST and CIs. My hope and expectation is that TES2019 will help fulﬁll the function

of seeing out NHST and CIs.

References

Bakker, M., van Dijk, A., Wicherts, J.M.: The rules of the game called psychological science.

Perspect. Psychol. Sci. 7(6), 543–554 (2012)

Berk, R.A., Freedman, D.A.: Statistical assumptions as empirical commitments. In: Blomberg, T.

G., Cohen, S. (eds.) Law, Punishment, and Social Control: Essays in Honor of Sheldon

Messinger. 2nd edn., pp. 235–254. Aldine de Gruyter (2003)

Box, G.E.P., Draper, N.R.: Empirical Model-Building and Response Surfaces. Wiley, New York

(1987)

Briggs, W.: Uncertainty: The Soul of Modeling, Probability and Statistics. Springer, New York

(2016)

Cumming, G., Calin-Jageman, R.: Introduction to the New Statistics: Estimation, Open Science,

and Beyond. Taylor and Francis Group, New York (2017)

Duhem, P.: The Aim and Structure of Physical Theory (P.P. Wiener, Trans). Princeton University

Press, Princeton (1954). (Original work published 1906)

Earp, B.D., Traﬁmow, D.: Replication, falsiﬁcation, and the crisis of conﬁdence in social

psychology. Front. Psychol. 6, 1–11, Article 621 (2015)

Gillies, D.: Philosophical Theories of Probability. Routledge, London (2000)

Greenland, S.: Invited commentary: the need for cognitive science in methodology. Am.

J. Epidemiol. 186, 639–645 (2017)

Gulliksen, H.: Theory of Mental Tests. Lawrence Erlbaum Associates Publishers, Hillsdale

(1987)

Halsey, L.G., Curran-Everett, D., Vowler, S.L., Drummond, G.B.: The ﬁckle P value generates

irreproducible results. Nat. Methods 12, 179–185 (2015). https://doi.org/10.1038/nmeth.3288

8

This sparse description may seem to imply that a priori procedures are simply another way to

perform power analyses. However, this is not true, and I have provided demonstrations of the

differences, including contradictory effects (Traﬁmow 2017b; Traﬁmow and MacDonald, 2017).


Hubbard, R.: Corrupt Research: The Case for Reconceptualizing Empirical Management and

Social Science. Sage Publications, Los Angeles (2016)

John, L.K., Loewenstein, G., Prelec, D.: Measuring the prevalence of questionable research

practices with incentives for truth telling. Psychol. Sci. 23(5), 524–532 (2012)

Lakatos, I.: The Methodology of Scientiﬁc Research Programmes. Cambridge University Press,

Cambridge (1978)

Lord, F.M., Novick, M.R.: Statistical Theories of Mental Test Scores. Addison-Wesley, Reading

(1968)

Nguyen, H.T.: On evidential measures of support for reasoning with integrated uncertainty: a

lesson from the ban of P-values in statistical inference. In: Huynh, V.N., et al., (eds.)

Integrated Uncertainty in Knowledge Modeling and Decision Making. Lecture Notes in

Artiﬁcial Intelligence, vol. 9978, pp. 3–15. Springer (2016)

Open Science Collaboration. Estimating the reproducibility of psychological science. Science

349(6251), aac4716 (2015). https://doi.org/10.1126/science.aac4716

Rothman, K.J., Galacher, J.E.J., Hatch, E.E.: Why representativeness should be avoided. Int.

J. Epidemiol. 42(4), 1012–1014 (2013)

Simmons, J.P., Nelson, L.D., Simonsohn, U.: False-positive psychology: undisclosed flexibility

in data collection and analysis allows presenting anything as signiﬁcant. Psychol. Sci. 22(11),

1359–1366 (2011)

Speelman, C.P., McGann, M.: Editorial: challenges to mean-based analysis in psychology: the

contrast between individual people and general science. Front. Psychol. 7, 1234 (2016)

Traﬁmow, D.: Editorial. Basic Appl. Soc. Psychol. 36(1), 1–2 (2014)

Traﬁmow, D.: Implications of an initial empirical victory for the truth of the theory and

additional empirical victories. Philos. Psychol. 30(4), 411–433 (2017a)

Traﬁmow, D.: Using the coefﬁcient of conﬁdence to make the philosophical switch from a

posteriori to a priori inferential statistics. Educ. Psychol. Meas. 77(5), 831–854 (2017b)

Traﬁmow, D.: An a priori solution to the replication crisis. Philos. Psychol. 31, 1188–1214

(2018)

Traﬁmow, D.: A taxonomy of model assumptions on which P is based and implications for added

beneﬁt in the soft sciences (under submission)

Traﬁmow, D., Amrhein, V., Areshenkoff, C.N., Barrera-Causil, C.J., Beh, E.J., Bilgiç, Y.K.,

Bono, R., Bradley, M.T., Briggs, W.M., Cepeda-Freyre, H.A., Chaigneau, S.E., Ciocca, D.R.,

Correa, J.C., Cousineau, D., de Boer, M.R., Dhar, S.S., Dolgov, I., Gómez-Benito, J.,

Grendar, M., Grice, J.W., Guerrero-Gimenez, M.E., Gutiérrez, A., Huedo-Medina, T.B., Jaffe,

K., Janyan, A., Karimnezhad, A., Korner-Nievergelt, F., Kosugi, K., Lachmair, M., Ledesma,

R.D., Limongi, R., Liuzza, M.T., Lombardo, R., Marks, M.J., Meinlschmidt, G., Nalborczyk,

L., Nguyen, H.T., Ospina, R., Perezgonzalez, J.D., Pﬁster, R., Rahona, J.J., Rodríguez-

Medina, D.A., Romão, X., Ruiz-Fernández, S., Suarez, I., Tegethoff, M., Tejo, M., van de

Schoot, R., Vankov, I.I., Velasco-Forero, S., Wang, T., Yamada, Y., Zoppino, F.C.M.,

Marmolejo-Ramos, F.: Manipulating the alpha level cannot cure signiﬁcance testing. Front.

Psychology. 9, 699 (2018)

Traﬁmow, D., MacDonald, J.A.: Performing inferential statistics prior to data collection. Educ.

Psychol. Meas. 77(2), 204–219 (2017)

Traﬁmow, D., Marks, M.: Editorial. Basic Appl. Soc. Psychol. 37(1), 1–2 (2015)

Traﬁmow, D., Marks, M.: Editorial. Basic Appl. Soc. Psychol. 38(1), 1–2 (2016)

Traﬁmow, D., Wang, T., Wang, C.: Means and standard deviations, or locations and scales? That

is the question! New Ideas Psychol. 50, 34–37 (2018)

Traﬁmow, D., Wang, T., Wang, C.: From a sampling precision perspective, skewness is a friend

and not an enemy! Educ. Psychol. Meas. (in press)


Trueblood, J.S., Busemeyer, J.R.: A quantum probability account of order effects in inference.

Cogn. Sci. 35, 1518–1552 (2011)

Trueblood, J.S., Busemeyer, J.R.: A quantum probability model of causal reasoning. Front.

Psychol. 3, 138 (2012)

Valentine, J.C., Aloe, A.M., Lau, T.S.: Life after NHST: how to describe your data without “p-

ing” everywhere. Basic Appl. Soc. Psychol. 37(5), 260–273 (2015)

Wasserstein, R.L., Lazar, N.A.: The ASA’s statement on p-values: context, process, and purpose.

Am. Stat. 70, 129–133 (2016)

Woodside, A.: The good practices manifesto: overcoming bad practices pervasive in current

research in business. J. Bus. Res. 69(2), 365–381 (2016)

Ziliak, S.T., McCloskey, D.N.: The Cult of Statistical Signiﬁcance: How the Standard Error

Costs us Jobs, Justice, and Lives. The University of Michigan Press, Ann Arbor (2016)

Kalman Filter and Structural Change

Revisited: An Application to Foreign

Trade-Economic Growth Nexus

O. J. Asemota
Legislative and Democratic Studies, Abuja, Nigeria

asemotaomos@yahoo.com,

asemota@econ.kyushu-u.ac.jp,omorogbe.asemota@nils.gov.ng

Abstract. In the last two decades, Nigeria's economy has been affected by several shocks such as the 1985-86 oil price crash, the 1997 Asian financial crisis, the 2008-2009 global financial crisis, the oil price crash that started in 2014, as well as political uncertainties in the country. Parameters of econometric models are dependent on prevailing policy and will react to policy changes. Yet, previous research on trade-growth modelling in Nigeria has assumed parameter constancy over time. Thus, this paper

constructed a time varying parameter model for trade-growth nexus and

demonstrated how the model can be useful in the detection of structural

breaks and outliers. The paper ﬁrst demonstrated via the rolling regres-

sion method that the parameters have been time dependent and pro-

ceeded with the Kalman ﬁlter to estimate the transition of the changing

parameters of the trade-growth nexus. Thereafter, it presented applica-

tions to show how the auxiliary residuals of the model can be used to

detect the time of structural breaks and outliers.

Keywords: State space model · Structural breaks · Time-varying coefficients

1 Introduction

The notion of structural change has received considerable attention in the econo-

metric literature in recent times. This is premised on various theories of stages

of economic development and growth which assumes that economic relationship

changes over time. Initially, such changes were explained in descriptive form.

However, with the introduction of regression analysis as the principal tool of

economic data processing in the 1950s and 1960s, statisticians and econome-

tricians started to describe structural changes in macroeconomics in regression

framework (Asemota and Saeki 2012). Structural change is deﬁned as a change in

I thank Tom Doan of ESTIMA for his comments when writing the RATS codes used

in the analysis.

c Springer Nature Switzerland AG 2019

V. Kreinovich and S. Sriboonchitta (Eds.): TES 2019, SCI 808, pp. 49–62, 2019.

https://doi.org/10.1007/978-3-030-04263-9_4


one or more of the parameters of an econometric model, and usually the “change

point” or “break point” is often unknown. Hansen (2001) posited that an unde-

tected structural change or break can lead to three major problems in time

series analysis: inferences can be wrong; forecasts can be inaccurate; and pol-

icy recommendations may be misleading. The econometrics of structural change

looks for systematic methods of identifying, estimating, testing and monitoring

of structural breaks.

Kim and Siegmund (1989) proposed a test to detect structural change in a

simple linear regression against two diﬀerent alternatives: one speciﬁes that only

the intercept changes while the second permits the intercept and the slope to

change. Bai and Perron (1998) tested for multiple structural changes occurring at

unknown dates in a linear regression model estimated by ordinary least squares.

Their approach was based on testing for partial structural change model where

all parameters are not subjected to shifts as against pure structural change model

where all coeﬃcients are subjected to change. Xu and Perron (2017) noted that

forecasting models are subject to instabilities, leading to imprecise and unreli-

able forecasts. Thus, they proposed a frequentist-based approach to forecast time

series in the presence of in-sample and out-of-sample breaks in the parameters

of the model. In Xu and Perron (2017) model, the parameters evolved through a

random level shift process, with the occurrence of a shift governed by a Bernoulli

process. Hauwe et al. (2011) and Maheu and Gordon (2008) proposed a Bayesian

approach to modelling structural breaks in time series models. The strength of

the Bayesian approach is that it uses the prior distribution to treat the param-

eters as random and allow the breaks to occur with some probability. However,

the Bayesian approach is sensitive to the exact prior distributions used (Xu and

Perron 2017). Harvey (1981) demonstrated the use of the Kalman ﬁlter for

obtaining maximum likelihood estimates of parameters through prediction error

decomposition. From Harvey’s work, it became clear that a wide range of

econometric models, including regression models with time-varying coeﬃcients,

autoregressive moving average (ARMA) models, and unobserved-components

time series models could be cast in state space form. Asemota (2016) noted that, despite the differences in the state space and ARIMA modelling strategies, the two models can be considered equivalent; however, the local level state space model clearly outperformed the ARIMA model at all forecasting horizons considered in that analysis.

While the state space form allows unobserved components to be incorpo-

rated into a model, the Kalman ﬁlter algorithm is used in the model estimation.

Theodosiadou et al. (2017) applied the recursive Kalman ﬁlter algorithm to

detect change points in NASDAQ daily returns and three of its stocks where the

jumps are assumed to be hidden random variables. Hamilton (1989) modelled

the instability of parameters using the Markov-switching models and Perron

(1989) incorporated structural breaks into the unit root tests. Ito et al. (2017)

noted that the time-varying parameter models are ﬂexible enough to capture

the complex nature of a macroeconomic system, thus yielding better forecasts

and a better ﬁt to data than models with constant parameters. They proposed


an approach to estimate a class of time-varying AR parameter models. Ito et al. (2017)

demonstrated that their approach yielded smoothed estimate that is identical to

the Kalman-smoothed estimate. Asemota (2012) modeled the inﬂow of tourists

to Japan using time varying parameter model and evaluated the forecasting

performance of the model.

Economic theory indicates that a deﬁnite relationship exists between a coun-

try’s trade ﬂows and its current account (Sunanda 2010). Furthermore, this

relationship can be extended to the theory that increased exports positively

contribute to the economy of a country or region. In Nigeria, the importance

of trade to the national economy cannot be overemphasized. According to the

National Bureau of Statistics (NBS, 2018) trade contributed 16.86% to Nige-

ria’s GDP at the end of 2017. This aﬃrms that trade is an important sec-

tor of the Nigerian economy. Trade creates economic opportunities for people,

income opportunities, job creation and improvement in the general standard of

living. Hence, several researchers such as Oluchukwu and Ogochukwu (2015),

Arodoye and Iyoha (2014), Omoke and Ugwuanyi (2010), Iyoha (1998), Ekpo

and Egwaikhide (1994) have examined the trade-growth nexus in Nigeria.

However, their methodology assumed constancy of parameters over time.

The assumption of the constant parameter model may not be appropriate for

the foreign trade-growth nexus since foreign trade reacts to changes in monetary

and ﬁscal policies. Thus, this paper constructed a time varying parameter model

for Nigeria’s trade- growth nexus. The time varying models allow the parameters

to change gradually over time, which is the main diﬀerence between time-varying

models and Markov switching models. It presented how the residuals in the model

can be used to detect the time of structural breaks and outliers. The rest of the

paper is structured as follows: in Sect. 2, the state space model and the Kalman

ﬁlter equations are introduced. Section 3 presents the model speciﬁcation, and

the results of the empirical application are presented in Sect. 4. The detection

of outliers and structural breaks are presented in Sect. 5, and the ﬁnal section

concludes the paper.

The state space model consists of two equations: the state equation (also called

transition or system equation) and the observation equation (also called measure-

ment equation). The measurement equation relates the observed variables (data)

and the unobserved state variables, while the transition equation describes the

dynamics of the state variables. The state-space representation of the dynamics

of yt is given by the following system of equations:

$$y_t = A X_t + H\beta_t + \epsilon_t \qquad (1)$$

$$\beta_t = F\beta_{t-1} + \nu_t \qquad (2)$$

and the disturbances εt and νt, with covariance matrices R and Q respectively, are assumed to be uncorrelated at all lags, where βt is a k × 1 vector of unobserved state variables, H is an n × k matrix that links the observed vector yt to the unobserved βt, Xt is an r × 1 vector of exogenous or predetermined observed variables, and R and Q are (n × n) and (k × k) matrices, respectively.

The Kalman ﬁlter algorithm is used in estimating the unobserved state vec-

tor βt and provides a minimum mean squared error estimate of βt given the

information set. The Kalman ﬁlter algorithm is further divided into Kalman ﬁl-

tering and Kalman smoothing. While the Kalman filter gives an estimate of βt on the basis of data observed through date t, the smoother provides an estimate of βt based on all the available data in the sample through date T.

The Kalman ﬁlter recursive algorithm consists of two stages- prediction and

updating stages. In the ﬁrst stage, an optimal predictor of yt based on all avail-

able information up to t−1 (ŷ t| t−1 ) is obtained. To achieve this, β̂ t| t−1 is cal-

culated. On observing yt , the prediction error can be calculated as η t| t−1 = yt

- ŷ t| t−1 . Thus, making use of the information in the prediction error, a more

accurate estimate of βt (β̂ t| t ) can be obtained. The celebrated Kalman ﬁlter

equations are given by following equations:

$$\eta_{t|t-1} = y_t - \hat{y}_{t|t-1} = AX_t + H\beta_t + \epsilon_t - AX_t - H\hat{\beta}_{t|t-1} = H(\beta_t - \hat{\beta}_{t|t-1}) + \epsilon_t \qquad (8)$$

$$P_{t|t} = P_{t|t-1} - P_{t|t-1}H'(HP_{t|t-1}H' + R)^{-1}HP_{t|t-1} \qquad (11)$$

where $K_t = FP_{t|t-1}H'(HP_{t|t-1}H' + R)^{-1}$ is the Kalman gain matrix, which is the

weight assigned to new information about βt contained in the prediction error1 .

1

This section draws largely from Asemota and Saeki (2012).


For a detailed derivation of the Kalman ﬁlter equations, see Hamilton (1994),

and Durbin and Koopman (2001). In Eq. (7), the uncertainty underlying β̂t| t−1 is

a function of uncertainty underlying β̂t−1| t−1 and Q, the covariance matrix of the

shocks to βt . Eq. (8) decomposed the prediction error into two components, the

first component, H(βt − β̂t|t−1), might be called “parameter uncertainty”, which reflects the fact that β̂t|t−1 will differ from the true βt, and the second component is the error due to εt, a random shock to yt. Thus in Eq. (9), the mean squared error matrix of the prediction error is a function of the uncertainty associated with β̂t|t−1 and of R, the variance of εt. The updating equation in (10) indicates

that β̂t| t is formed as a sum of β̂t| t−1 and a weighted average of the prediction

error, the weight assigned to the prediction error is the Kalman gain, Kt .
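For readers who prefer to see the recursion in code, a minimal Python sketch of the prediction and updating stages is given below. It is an illustration only: it uses the standard contemporaneous-updating form (the gain here does not fold in the transition matrix F, unlike the Kt defined above), and the function and variable names are mine rather than anything from the paper.

```python
import numpy as np

def kalman_filter(y, X, A, H, F, R, Q, beta0, P0):
    """Filtered states and one-step-ahead prediction errors for
    y_t = A x_t + H beta_t + eps_t,  beta_t = F beta_{t-1} + nu_t."""
    T = y.shape[0]
    k = F.shape[0]
    beta_filt = np.zeros((T, k))
    P_filt = np.zeros((T, k, k))
    errors = np.zeros_like(y)

    beta_prev, P_prev = beta0, P0
    for t in range(T):
        # Prediction stage: project the state and its uncertainty forward
        beta_pred = F @ beta_prev            # E[beta_t | information up to t-1]
        P_pred = F @ P_prev @ F.T + Q        # its mean squared error

        # Prediction error and its variance
        eta = y[t] - (A @ X[t] + H @ beta_pred)
        S = H @ P_pred @ H.T + R

        # Updating stage: weight the new information and revise the state
        gain = P_pred @ H.T @ np.linalg.inv(S)
        beta_filt[t] = beta_pred + gain @ eta
        P_filt[t] = P_pred - gain @ H @ P_pred

        errors[t] = eta
        beta_prev, P_prev = beta_filt[t], P_filt[t]
    return beta_filt, P_filt, errors
```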

In some applications, inference about the value of βt is based on all the sample

data. This inference is known as the smoothed estimate of βt . The smoothed

estimates provide more accurate inference on βt . The following two equations

can be iterated backwards for t = T − 1, T − 2, ..., 1, to obtain the smoothed

estimates:

$$\hat{\beta}_{t|T} = \hat{\beta}_{t|t} + P_{t|t}F'P_{t+1|t}^{-1}\left(\hat{\beta}_{t+1|T} - F\hat{\beta}_{t|t}\right) \qquad (12)$$

$$P_{t|T} = P_{t|t} + P_{t|t}F'P_{t+1|t}^{-1}\left(P_{t+1|T} - P_{t+1|t}\right)P_{t+1|t}^{-1}FP_{t|t} \qquad (13)$$

First, the Kalman ﬁlter estimates are calculated and stored. The smoothed esti-

mates for the ﬁnal date in the sample (β̂T | T and PT | T ) which are the initial

values for the smoothing, are just the last iteration of the (β̂t| t and Pt| t ). Next,

Eqs. (12) and (13) are used moving through the sample backward starting with

t = T − 1, T − 2, ..., 1.
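A corresponding backward pass can be sketched as follows (again purely illustrative; it continues the hedged filter sketch above and recomputes the one-step-ahead quantities that Eqs. (12) and (13) require from F and Q).

```python
import numpy as np

def kalman_smoother(beta_filt, P_filt, F, Q):
    """Fixed-interval smoothing in the spirit of Eqs. (12) and (13)."""
    T, k = beta_filt.shape
    beta_smooth = beta_filt.copy()
    P_smooth = P_filt.copy()
    for t in range(T - 2, -1, -1):          # move backward through the sample
        beta_pred = F @ beta_filt[t]        # beta_{t+1|t}
        P_pred = F @ P_filt[t] @ F.T + Q    # P_{t+1|t}
        J = P_filt[t] @ F.T @ np.linalg.inv(P_pred)   # smoothing gain
        beta_smooth[t] = beta_filt[t] + J @ (beta_smooth[t + 1] - beta_pred)
        P_smooth[t] = P_filt[t] + J @ (P_smooth[t + 1] - P_pred) @ J.T
    return beta_smooth, P_smooth
```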

Durbin and Koopman (2001) showed that the Kalman ﬁlter is a useful device

for recursively solving the state-space model and argues that the ﬂexibility of

the state-space model would be particularly important for applications in which

issues such as missing observations, structural breaks, outliers, and forecasting

are paramount.

3 Model Specification

growth and foreign trade using the double log linear regression model estimated

using ordinary least squares. Their model is presented in Eq. (14):

$$\ln RGDP = \beta_0 + \beta_1 \ln EX + \beta_2 \ln IMP + \beta_3 \ln EXR + \beta_4 \ln FDI + \epsilon \qquad (14)$$

where RGDP denotes Real Gross Domestic Product, EX is the volume of export, IMP is the volume of import, EXR is the exchange rate, and FDI is foreign direct investment. However, model (14) assumes parameter constancy, which cannot be guaranteed, and the model does not allow the parameters to vary across time.

Thus, the speciﬁcation might be ineﬃcient because parameters are expected to

react to changes in policy and other shocks to the economy. An alternative

ﬂexible model is the time varying coeﬃcient (TVC) model formulated in the

state space form. The state space formulation of model (14) is:

lnRGDP = ψ0t + ψ1t lnEX + ψ2t lnIMP + ψ3t lnEXR + ψ4t lnFDI + εt (15)

Equation (15) is the observation or measurement equation, while Eq. (16) is the transition or state equation which is used to simulate how the parameters evolve

over time. In most empirical studies, the speciﬁcation used for the transition

equation is the random walk process (RWP). The RWP has been proved to have

the capability to capture structural change in econometric models, see Song and

Witt (2000) and Shen et al. (2009).

$$\ln RGDP - \psi_{0t} = [\ln EX,\ \ln IMP,\ \ln EXR,\ \ln FDI]\begin{bmatrix}\psi_{1t}\\ \psi_{2t}\\ \psi_{3t}\\ \psi_{4t}\end{bmatrix} + \epsilon_t \qquad (18)$$

$$\begin{bmatrix}\psi_{1t}\\ \psi_{2t}\\ \psi_{3t}\\ \psi_{4t}\end{bmatrix} = \begin{pmatrix}1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\end{pmatrix}\begin{bmatrix}\psi_{1,t-1}\\ \psi_{2,t-1}\\ \psi_{3,t-1}\\ \psi_{4,t-1}\end{bmatrix} + \begin{bmatrix}\eta_t\\ \zeta_t\\ \xi_t\\ \pi_t\end{bmatrix} \qquad (19)$$

The ﬁrst equation is the observation equation with time varying parameters

and the other equations are the transition equations. The transition equations

describe the dynamics of the parameter which is assumed to follow a random

walk process. These state space models can be estimated by applying the Kalman

ﬁlter earlier described. In these models, the coeﬃcients can change over time,

which allows the parameters to respond diﬀerently under policy changes. The

unknown variance parameters in models (18) and (19) are estimated by the max-

imum likelihood estimation via the Kalman ﬁlter prediction error decomposition

initialized with the exact initial Kalman ﬁlter.
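As a rough guide to how such an estimation could be reproduced, the Python sketch below treats every coefficient (including the constant) as a random-walk state, evaluates the prediction error decomposition likelihood with the Kalman filter, and maximizes it over the variance parameters. It is only a sketch: the paper's estimation was carried out in RATS with the exact initial Kalman filter, whereas this illustration uses an approximate diffuse prior, and all names here are placeholders.

```python
import numpy as np
from scipy.optimize import minimize

def tvp_neg_loglik(log_vars, y, Z):
    """Negative Gaussian log-likelihood of y_t = Z_t' psi_t + eps_t with
    random-walk coefficients, via the prediction error decomposition."""
    sig_eps, *sig_state = np.exp(log_vars)      # measurement and state variances
    T, k = Z.shape
    Q = np.diag(sig_state)
    psi, P = np.zeros(k), np.eye(k) * 1e6       # approximate diffuse initialization
    loglik = 0.0
    for t in range(T):
        z = Z[t]
        f = z @ P @ z + sig_eps                 # prediction error variance
        v = y[t] - z @ psi                      # prediction error
        loglik += -0.5 * (np.log(2 * np.pi * f) + v ** 2 / f)
        gain = P @ z / f
        psi = psi + gain * v                    # updating step
        P = P - np.outer(gain, z @ P) + Q       # random walk: add Q for next period
    return -loglik

# Usage sketch (hypothetical arrays): y = ln RGDP; Z = [1, lnEX, lnIMP, lnEXR, lnFDI]
# res = minimize(tvp_neg_loglik, x0=np.zeros(1 + Z.shape[1]), args=(y, Z))
```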

Annual data from 1981 through 2016 obtained from the World Development Indicators of the World Bank are used in the analysis. Descriptive statistics for the data are given in Table 1 and the graphical display of the series is presented in Fig. 1.2

2 The RATS (version 8.3) econometric software is used for all our estimations.


Table 1. Descriptive statistics of the data

Variable       Observations  Mean     Standard deviation  Minimum   Maximum
RGDP           36            24.9647  1.1319              23.4826   27.0663
Export         36            23.6704  1.1052              21.7382   25.6994
Import         36            23.3064  1.1539              21.4911   25.2049
Exchange rate  36            3.2938   1.9477              -0.4943   5.5353
FDI            36            21.1956  1.0996              19.0581   22.9027


To examine parameter stability informally, estimation by rolling regressions was first performed. The graphical display of the rolling parameters is presented in Fig. 2. Parameter instability over time is clearly evident in Fig. 2, hence the need to consider the time varying parameter class of models in modelling the trade-growth nexus. The export and import coefficient estimates (ψ̂1) and (ψ̂2), respectively, take on the theoretically expected positive values throughout the estimated sample period. The regression coefficient for FDI (ψ̂4) started with negative values, took on positive values between 2009 and 2012, but drifted back to negative values from 2012. The coefficient of the exchange rate (ψ̂3) takes on the theoretically expected negative values throughout the sample period; however, evidence of parameter instability is noticeable in its movement over the sample period.
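A rolling-regression check of this kind can be reproduced in a few lines of Python (an illustrative sketch; the window length and variable names are placeholders rather than the paper's settings).

```python
import numpy as np

def rolling_ols(y, Z, window):
    """OLS coefficients re-estimated over a moving window, to reveal parameter drift."""
    coefs = []
    for start in range(len(y) - window + 1):
        Zw, yw = Z[start:start + window], y[start:start + window]
        beta, *_ = np.linalg.lstsq(Zw, yw, rcond=None)
        coefs.append(beta)
    return np.array(coefs)  # one row of estimated coefficients per window position
```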

Thus, the plots of the rolling regressions reveal inherent parameter instability that calls for estimation of the model using a more sophisticated technique that allows the estimated parameters to be time dependent. Thereafter, the observation and transition equations in (18) and (19) are estimated using the Kalman filter

algorithm. Table 2 presents the estimate of the model’s parameters along with the

p-values. The maximum likelihood estimates of the hyper parameters obtained

employing the BFGS optimization technique are presented in Table 2. These


hyper parameters are employed in the Kalman ﬁlter algorithm to generate the

Kalman ﬁltered and smoothed transition of the coeﬃcients of the trade-growth

model. A graphical display of the Kalman ﬁlter and smoother estimates of the

coeﬃcients of trade-growth model are presented in Figs. 3 and 4 respectively.

Table 2. Maximum likelihood estimates of the hyper parameters

Parameter       Estimate        p-value
ψ̂0t             13.4497         0.0000
ε̂t              1.0796 × 10⁻⁷
η̂t              3.8785 × 10⁻⁷   0.0513
ζ̂t              1.5002 × 10⁻⁸   0.0040
ξ̂t              1.0506 × 10⁻⁶   0.0024
π̂t              1.4085 × 10⁻⁷   0.0513
Log likelihood  10.3001

Note: ε̂t is the estimate of the measurement equation variance; η̂t, ζ̂t, ξ̂t, π̂t are the estimates of the variances of the transition equation.

From Fig. 3, the Kalman ﬁltered estimate of the export’s parameter (ψ̂1 ) gen-

erally drifted upward, and it takes on the theoretical expected sign throughout

the sample period. It trended upwards and declined around 1993, 1996–1998 and

2008/2009. The Kalman ﬁltered import coeﬃcient was equally positive through-

out the estimation period and it transition depicts a declining impact of import

on GDP growth over time. It attains a peak value of about 0.42 in 1984 and there-

after exhibits a declining pattern until it attained its minimum values between

2012 and 2013. The Kalman smoothed estimates of the export (ψ̂1 ) and import

(ψ̂2 ) parameters behave according to their theoretical expectation, and generally

exhibit upward movement, ﬂuctuating in an increasing and decreasing pattern.

One of the attractions of the state space model is that the auxiliary residuals3 of

the model are useful in the detection of outliers and structural breaks. Harvey and Koopman (1992) showed that, although the auxiliary residuals are autocorrelated even for a correctly specified model, tests constructed from the auxiliary residuals are reasonably effective in detecting and distinguishing between outliers and structural change. Consequently, the paper adopted the procedure discussed

in Harvey and Koopman (1992) and Durbin and Koopman (2001). In the state

3

Auxiliary residuals are the residuals associated with the state or transition equation.

These auxiliary residuals are estimators of the disturbances associated with the

unobserved components.


space form, outliers are detected by identifying large absolute values in the smoothed disturbances associated with the observation equation disturbance (εt), and structural breaks are detected by identifying large absolute values in the disturbances associated with the transition equations (Durbin and Koopman 2001).

The calculation of the auxiliary residuals is carried out by putting the model

in state space form and applying the Kalman ﬁlter and smoother. For an illus-

tration, let εt be the disturbance associated with the observation equation and ηt be the disturbance associated with one of the unobserved components; the standardized smoothed residuals are given by

$$\rho_t = \frac{\hat{\epsilon}_t}{\sqrt{\mathrm{var}(\hat{\epsilon}_t)}}, \qquad t = 1, 2, \ldots, T, \qquad (20)$$

$$\rho_t = \frac{\hat{\eta}_t}{\sqrt{\mathrm{var}(\hat{\eta}_t)}}, \qquad t = 1, 2, \ldots, T. \qquad (21)$$

The basic detection procedure is to plot the auxiliary residuals after they had

been standardized. The standardization is necessary because the residuals at the

beginning and end of the sample period will tend to have higher variances. In


a Gaussian model, indications of outliers and structural break arise for values

greater than 2 in absolute value, see Durbin and Koopman (2001) and Harvey

and Koopman (1992) for detailed explanation on this procedure. Applying the

above procedure, the plots of the standardized auxiliary residuals of our model

are presented in Fig. 4.
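In code, the detection rule amounts to standardizing the smoothed disturbances and flagging the dates whose absolute standardized value exceeds 2. The sketch below is my own illustration and assumes the smoothed disturbances and their variances have already been extracted from the state space output.

```python
import numpy as np

def flag_breaks(disturbance, disturbance_var, years, threshold=2.0):
    """Standardize smoothed disturbances (as in Eqs. 20-21) and flag |value| > threshold."""
    standardized = np.asarray(disturbance) / np.sqrt(np.asarray(disturbance_var))
    return [(year, round(float(s), 2))
            for year, s in zip(years, standardized) if abs(s) > threshold]

# Applied to the observation-equation disturbances this flags candidate outliers;
# applied to a transition-equation disturbance it flags candidate structural breaks.
```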

From Fig. 4, an outlier is detected in 1993, and the paper found strong evi-

dence of structural breaks in parameters of the model in 2009. Weak evidence of

structural change is also detected around 1994. The outlier detected in 1993 and

structural break detected in 1994 could be attributed to the drastic change in

policy as a result of the political turmoil in Nigeria with the arrival of the military administration led by the late General Sani Abacha. The military administration re-regulated the economy by capping exchange and interest rates, and the management of oil revenue and the macroeconomy was haphazard, translating into macroeconomic instability (Asemota and Saeki 2010).

break detected in 2009 could be attributed to the global ﬁnancial crises which

occurred in 2008–2009. Due to the interdependence of global economy, the global

ﬁnancial crises that started in 2007 aﬀected global economies including Nigeria.

The eﬀect of the crises on the Nigerian economy was not felt until mid-2008, and

the capital market was greatly aﬀected as foreign investors withdrew and repa-


triated their funds to their home countries. Similarly, the economic downturn in

the United States (a major export destination for Nigeria’s crude oil) aﬀected

the demand and price of oil. Furthermore, the oﬃcial exchange rate depreciated

by 25.6% between 2008 and 2009, reﬂecting the demand pressure relative to

supply with implications for foreign reserve. More so, inﬂation rose from 6% in

2007 to 15.1% in 2008 and remained at double digits till January 2013 when it

returned to single digit.4

Our estimated outliers and break dates are quite signiﬁcant because they cor-

respond to some important political events in Nigeria and the global ﬁnancial

crises that occurred between 2007–2009. It is important to note that while we

have attempted to explain the causes of the structural breaks, further research

on Nigeria Trade-Growth Nexus might be worthwhile. For instance, disaggrega-

tion of imports and exports may give a clearer explanation and capture more

structural changes in the trade-growth model.

Diagnostic tests are used to check the adequacy of the fitted model. The diagnostic tests considered are the heteroscedasticity test H(h), the serial correlation (Q) test, and the normality (N) test. The results of the model diagnostics are presented in Table 3.

Table 3. Diagnostic test results

Test       Statistic  p-value
Q(9-4)     46.61      0.0000
Normality  0.97       0.6144
H(12)      1.24       0.7192

The diagnostic tests for the fitted model are quite satisfactory with the exception of the serial correlation test. The results indicate that the residuals are highly serially correlated, with the Ljung-Box statistic based on the first 9 sample autocorrelations, Q(9-4) = 46.61, having a p-value of 0.0000. Harvey and Koopman (1992)

demonstrated that the auxiliary residuals in a state space model are usually auto-

correlated even for a correctly speciﬁed model. The closeness of the plot to the

45 degree line suggests that the residuals are normally distributed.5
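The serial correlation diagnostic itself is straightforward to reproduce; a small numpy sketch of the Ljung-Box Q statistic is shown below (illustrative only, since the reported figures come from the RATS estimation).

```python
import numpy as np

def ljung_box_q(residuals, lags):
    """Ljung-Box Q statistic computed from the first `lags` sample autocorrelations."""
    x = np.asarray(residuals, dtype=float)
    x = x - x.mean()
    n = len(x)
    denom = np.sum(x ** 2)
    q = 0.0
    for k in range(1, lags + 1):
        rho_k = np.sum(x[k:] * x[:-k]) / denom   # lag-k sample autocorrelation
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q
```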

4 The Impact of the Global Financial Crisis on Nigerian Economy. November 28, 2013. By KA Project Research. Available at www.proshareng.com.

5

The plot is a graphical display of ordered residuals against their theoretical quantiles.

The 45 degree line is taken as a reference line (Durbin and Koopman, 2001).


6 Conclusions

According to the National Bureau of Statistics (NBS, 2017), trade’s contribu-

tion to Nigeria’s GDP was 16.90% in 2017 and contributed 17.1% to the economy

at the end of ﬁrst quarter of 2018. The country’s economy has been aﬀected by

several shocks such as the 1985-86 crash in oil price, the 1997 Asian Financial

crisis and the global ﬁnancial crisis of 2008–2009 (Abiola and Asemota 2017).

The domestic economy is equally aﬀected by political uncertainties in the coun-

try. Thus, for proper estimation of the trade-growth nexus, it is imperative to

take these changes into consideration. Consequently, this paper was designed to

investigate structural changes in trade-growth nexus from 1981 to 2016 using

a time-varying parameter model. First, the paper applied the rolling regression

estimation strategy to the parameters of the trade-growth model to justify the

application of the TVP model. The graphical display of the rolling estimate of

the parameters conﬁrmed that the coeﬃcients were indeed unstable over time.

Thus, to estimate the parameters of the model over time and detect the timing

of the structural breaks and outliers, a time varying parameter model in state

space form was employed. The Kalman ﬁlter and smoother are very eﬀective

methods for estimating the transition of the parameters over time. The time

varying parameters were estimated as unobserved components in the state vec-

tor of the state space model. The paper demonstrated that the plots of the

auxiliary residuals of the state space model are very useful in detecting struc-

tural changes in time series. The analytical results reveal that the parameters of

Nigeria’s trade-growth nexus were severely aﬀected by the political turmoil in

Nigeria in 1993/1994 and the global ﬁnancial crises of 2008/2009. The proximity

of structural break dates detected in the study to period of important economic

and political events, such as the political turmoil of 1993 and global ﬁnancial

crises attested to the signiﬁcance of the break dates.


References

Abiola, G.A., Asemota, O.J.: Recent economic recession in Nigeria: causes and solutions.

A Paper Presented at the 2017 Institute of Public Administrator of Nigeria, Nicon

Luxury Hotel, Abuja (2017)

Anowor, O.F., Agbarakwe, H.U.: Foreign trade and Nigerian economy. Dev. Ctry. Stud.

5(6), 77–82 (2015)

Arodoye, N.L., Iyoha, M.A.: Foreign trade-economic growth nexus: evidence from Nige-

ria. CBN J. Appl. Stat. 5(1), 121–140 (2014)

Asemota, O.J.: State space versus SARIMA modeling of the Nigeria’s crude oil export.

Sri Lankan J. Appl. Stat. 17(2), 87–108 (2016). https://doi.org/10.4038/sljastats.

v17i2.7872

Asemota, O.J.: Modeling inﬂow of tourists to Japan: an evaluation of forecasting per-

formance of time varying parameter model. J. Jpn. Assoc. Appl. Econ. 6, 34–60

(2012). Studies in Applied Economics

Asemota, O.J., Saeki, C.: Structural change in macroeconomic time series: a survey with empirical applications to ECOWAS GDP. J. Kyushu Econ. Sci. Jpn. 48, 25–35 (2010). The Annual Report of Economic Science

Durbin, J., Koopman, S.: Time Series Analysis by State Space Methods. Oxford Uni-

versity Press, New York (2001)

Ekpo, A.H., Egwaikhide, F.O.: Export and economic growth in Nigeria: a reconsider-

ation of the evidence. J. Econ. Manag. 1(1), 100–115 (1994)

Hamilton, J.D.: A new approach to the economic analysis of nonstationary time series

and the business cycle. Econometrica 57, 357–384 (1989)

Hansen, B.E.: The new econometrics of structural change: dating breaks in U.S Labor

productivity. J. Econ. Perspect. 15(4), 117–128 (2001)

Harvey, A.C., Koopman, S.J.: Diagnostic checking of unobserved components time

series models. J. Bus. Econ. Stat. 10(4), 377–389 (1992)

Hauwe, S., Paap, R., Van Dijk, D.: An alternative bayesian approach to structural

breaks in time series models. Tinbergen Institute Discussion Paper (2011)

Ito, M., Noda, A., Wada, T.: An alternative estimation method of a time-varying

parameter model. Working Paper, Faculty of Economics, Keio University, Japan

(2017)

Iyoha, M.A.: An econometric analysis of the impact of trade on economic growth in

ECOWAS Countries. Niger. Econ. Financ. Rev. 3(1), 25–42 (1998)

Omoke, P.C., Ugwuanyi, C.U.: Export, domestic demand and economic growth in

Nigeria: granger causality analysis. Eur. J. Soc. Sci. 13(2), 211–218 (2010)

Perron, P.: The great crash, the oil price shock, and the unit root hypothesis. Econo-

metrica 57, 1361–1401 (1989)

Shen, S., Li, G., Song, H.: Is the time-varying parameter model the preferred approach

to tourism demand forecasting? Statistical evidence. In: Matias, A., et al. (eds.)

Advances in Tourism Economics, pp. 107–120. Physica-Verlag, Heidelberg (2009)

Song, H., Witt, S.F.: Tourism Demand Modelling and Forecasting: Modern Economet-

ric Approaches. Pergamon, Oxford (2000)

Sunanda, S.: International trade theory and policy: a review of the literature (2010).

http://www.levyinstitute.org. Accessed 23 June 2018

Theodosiadou, O., Skaperas, S., Tsaklidis, G.: Change point detection and estimation

of the two-sided jumps of asset returns using a modiﬁed Kalman Filter. Risks 5, 15

(2017). https://doi.org/10.3390/risks5010015

Xu, J., Perron, P.: Forecasting in the presence of in and out of sample breaks. Working

Paper, Shanghai University of Finance and Economics (2017)

Statisticians Should Not Tell Scientists

What to Think

Donald Bamber(B)

Irvine, CA 92697, USA

dbamber@uci.edu

Abstract. Differing views of the roles of probability and statistics in science are reviewed. Diﬀerent schools have

diﬀerent goals for statistics; that is not inappropriate. Mimetic modeling,

whose goal is to mimic Nature’s behavior, is described and advocated.

Both Bayesian analyses and some classical analyses may be appropri-

ately applied to mimetic models. Statistics should not usurp scientiﬁc

judgment and, in mimetic modeling, it does not.

Keywords: Frequentist statistics · Bayesian statistics

This paper is concerned with the application of probability and statistics to

the scientiﬁc modeling of natural phenomena. The main themes that will be

discussed are the following.

• Probability is anything that satisﬁes Kolmogorov’s axioms. Diﬀerent inter-

pretations of probability exist.

• Diﬀerent schools of thought in statistics often employ diﬀerent interpretations

of probability. One view is that probability distributions describe uncertainty;

another view is that probability distributions describe variability in Nature.

• Probabilities don’t know how they’re being interpreted. The inventor of a

model may give the probabilities in the model one interpretation. Someone

else may give those probabilities a diﬀerent interpretation. This, of course,

changes the interpretation of the model.

• Diﬀerent schools of thought in statistics have diﬀerent goals for their statisti-

cal analyses; it is not inappropriate for diﬀerent statisticians to have diﬀerent

philosophies and, therefore, diﬀerent goals.

• To select an appropriate statistical approach, a scientist must decide what

his/her goals are.

• The incorrect belief that all statisticians either have the same goals or ought

to have the same goals is a cause of pointless controversy in statistics.

• Some statistics textbooks are philosophically vague; they describe various

statistical analyses but do not explain how those analyses are justiﬁed by the

concept of probability presented in the book.



• Bayesian analyses are employed in two diﬀerent kinds of modeling (i.e., epistemic modeling and mimetic modeling). Depending on the school,

the exact same Bayesian analysis may be aimed at achieving dramatically

diﬀerent goals.

• Although Bayesian and classical/frequentist analyses are often portrayed as

being incompatible, there is a statistical school of thought (i.e., mimetic mod-

eling) that justiﬁes both Bayesian analyses and some classical/frequentist

analyses.

• Some statistical schools of thought (both those employing Bayesian analyses

and those employing classical/frequentist analyses) aim to tell scientists what

their data forces them to think. In my opinion, this is wrong. Statistics should

not usurp scientiﬁc judgment and, in mimetic modeling, it does not.

Diﬀerent scholars often have diﬀerent concepts of what probability is. For a

discussion of these diﬀerent points of view, see the book by Gillies [15], which

I reviewed in [4], and the book by Diaconis and Skyrms [12]. The discussion in

the former book is broader; in the latter book, more incisive.

In my view, there is no point in arguing over which of these concepts of

probability is better or correct [4]. Anything, that satisﬁes Kolmogorov’s axioms

for probability [16], no matter whether it is degrees of belief, or ratios of counts of

possibilities, etc., should be regarded as legitimate probability. Diﬀerent concepts

of probability may be useful for diﬀerent purposes.

In much of statistics, probability is used to describe uncertainty. For exam-

ple, in his textbook All of Statistics, Wasserman [22, p. 3] states: Probability

is a mathematical language for quantifying uncertainty. In contrast, in much of

science, probability is used as a description of variability in Nature.

Unfortunately, in much scientiﬁc work and in many statistics textbooks,

although probability is employed as a central concept, that concept is only

vaguely explained and is left unclear.

Stamp Collecting vs. Coin Collecting. There simply isn’t any conclusive argu-

ment showing that one concept of probability is correct and the others wrong.

To argue over this issue is as pointless as arguing over whether stamp collecting

is better than coin collecting. The stamp collector may argue that stamp collect-

ing is better than coin collecting because stamps are more colorful than coins.

And the stamp collector may think that the coin collector is wasting time and

resources on a foolish occupation. But the coin collector is doing what he/she

prefers and cannot be induced to change that preference by logical argument.

Similarly, it must be recognized that diﬀerent statisticians employ diﬀerent

kinds of probability because they have diﬀerent kinds of goals. Statistician A

may fail to realize that his/her goals are diﬀerent from Statistician B’s goals. As

a result, Statistician A may say to Statistician B: Your statistical methods are

no good. In reality, Statistician A would have been more accurate if he/she had

said to Statistician B: Although your statistical methods may be appropriate for

achieving your goals, they are not appropriate for achieving my goals.


A striking example of a statistician assuming that all statisticians share his goals for their analyses is provided by a statement of Dennis Lindley’s [17]:

Every statistician would be a Bayesian if he took the trouble to read the literature

thoroughly and was honest enough to admit that he might have been wrong. This

statement might be correct if every statistician had the same goals as Lindley.

But they don’t all have the same goals.

It is often said that probability quantiﬁes uncertainty. But, what is meant by

uncertainty? One answer has been that, when we talk about uncertainty, we

are talking about degree of belief in propositions/events. Furthermore, degree of

belief in an event (such as “it will rain tomorrow”) varies from person to person.

Thus, when we talk about degree of belief, we are talking about some person’s

degree of belief.

A person’s degree of belief in various events is often deﬁned in terms of those

bets about those events that the person is willing to make and those bets that

the person is not willing to make. Philosophers and probabilists have formulated a

number of arguments having a basic similarity to each other showing that, when

deﬁned in terms of bets, a person’s degrees of belief should conform to the laws

of probability. (For historical reasons that need not concern us, these arguments

are called Dutch book arguments.)

In addition to Dutch book arguments, there is another approach to justifying

personal probability that was developed by Leonard J. Savage [20]. I will not

discuss Savage’s work here as it is more complicated than Dutch book arguments.

One form of the Dutch book argument is presented here. Other forms are

discussed in [15, Chap. 4] and in [12, Chap. 2]. In fact, over a period of gener-

ations, philosophers, probabilists, and statisticians have developed a collection

of intellectually very impressive justiﬁcations for personalist probability—again

see [12, Chap. 2].

Dutch book arguments tend to be confusing because one must keep track of

multiple gambles, together with all possible outcomes of those gambles, and the

net gain or loss from the multiple gambles. Furthermore, having to keep track of

all the details in the argument makes it hard to discern the intuitive idea behind

the argument. To avoid these problems, I formulated the kind of Dutch book

argument presented here that is designed to be less confusing.

The form of the Dutch book argument presented here asks and answers the

question of what your strategy should be if you signed a contract with a gam-

bling establishment that required that your gambling be conducted according to

certain rules (described below) that are speciﬁed in the contract.

The House. Assume that a gambling establishment that we will call the House

has selected as its unit of currency some arbitrary amount, say 100 Thai baht.


Suppose that the House issues conditional promissory notes of two kinds: unit

and fractional. Consider an example of a unit note. If A is some event that hasn’t

happened yet, such as rain next week or no rain next week, then the promissory

note PN(A) promises that the House will pay the holder one unit of currency

if the event A occurs and nothing otherwise. Unit notes may be divided into

fractional parts. Suppose that 0 ≤ α ≤ 1. Then, the α-fractional part of PN(A)

is denoted α PN(A). This fractional note pays oﬀ α currency units if A occurs

and nothing otherwise.

Your Contract with the House: Divisibility of Notes. If you are the holder of a

promissory note PN1 , the House can require you to trade it for the two notes

α PN1 and (1 − α) PN1 , where 0 ≤ α ≤ 1. Conversely, if you hold the two notes

α PN1 and (1 − α) PN1 , the House can require you to trade them for the note PN1 .

Your Contract with the House: Start of Gambling Session. At the start of the

gambling session, you are required to pay one currency unit to purchase from

the House the promissory note PN(Ω), where Ω is the universal or sure event.

The note PN(Ω) is just as good as one unit of currency, because it is a promise

by the House to pay the note holder one unit of currency, no matter what.

Your Contract with the House: Trading. If you hold the promissory note PN1 ,

the House may require you to trade it for another promissory note PN2 provided

that you have agreed beforehand that the trade is fair. (How the House ﬁnds

out from you which trades you regard as fair is explained below.)

Your Contract with the House: Fractional Trades. Suppose that you have told

the House that, if demanded by the House, you assent to trade the note PN1

for the note PN2 or vice versa. Then the contract requires that you assent to

any fraction of that trade. So suppose that β is between zero and one. Having

agreed to trade the note PN1 for the note PN2 or vice versa, you also agree to

trade the note β PN1 for the note β PN2 or vice versa.

Your Contract with the House: End of Gambling Session. After executing any

trades with you that the House wishes (but only trades that you regard as fair),

the House declares the gambling session to be over and pays oﬀ any promissory

notes that you hold.

Your Contract with the House: Elicitation of Your Judgments of Fairness. It was

mentioned above that the House ﬁnds out from you what trades you consider

“fair”. This is done as follows. Given two events A and B, the House may require

you to specify a fair “exchange rate” between the promissory notes PN(A ∩ B)

and PN(A). In other words: the House requires you to specify some number α

such that you agree to trade PN(A ∩ B) for α PN(A) and, conversely, you agree

to do the trade in the reverse direction. The value of α chosen by you is denoted

exch(B|A). Thus, you agree to trade PN(A ∩ B) for exch(B|A) PN(A) and vice

versa. Abbreviation: The exchange rate exch(B|Ω) will be abbreviated exch(B).

Note that these trades are a kind of bet. If you agree that it is fair to trade

exch(B)PN(Ω) for PN(B) and vice versa, then (in essence) you are saying that you are

willing to pay the House exch(B) currency units for a promise from the House

to pay you one currency unit if the event B occurs. Conversely, you are willing

(in essence) to take a sure gain of exch(B) currency units in return for giving up


the risky opportunity to gain one currency unit if B occurs. Whichever direction

the trade is run, you are gambling.

An Aside: Exchange Rates as Operational Definitions of Degrees of Belief. We

may regard your chosen exchange rates as operationally deﬁning your degrees of

belief. Thus, the stronger your belief in the event B, the higher should be your

exchange rate exch(B) relative to exch(Ω) = 1. Analogously, the stronger your

belief in the event A ∩ B relative to your belief in the event A, the higher should

be your exchange rate exch(B|A).

Rationality of Your Judgments of “Fairness” of Trades. Have you been rational

in your judgment of which trades of promissory notes are fair? We will say that

your judgments of fairness have been irrational if there is some sequence of trades,

judged fair by you, that converts your starting stake of PN(Ω) into α PN(Ω),

where 0 ≤ α < 1. If there is such a sequence of trades and if that sequence

were executed, then you would suﬀer a sure loss. You would have started with

a promissory note that was the equivalent of one unit of currency and ended up

with a promissory note that was the equivalent of α < 1 units of currency. (If

the House has executed a sequence of trades with you that caused you a sure

loss, we will say that the House made a Dutch book against you.)

The essence of Dutch book arguments is that, if you do not wisely choose the

exchange rates exch(B|A) across events A and B, then the House can require

you to make trades where you, in eﬀect, “buy high and sell low”, resulting in sure

loss for you and a sure gain for the House. In other words, if you do not wisely

choose your exchange rates, you can come out on the losing end of arbitrage.

In particular, Dutch book arguments show that, if your chosen “fair”

exchange rates do not obey the laws of probability, then the House can exe-

cute a sequence of trades (all of which you regard as fair) that will cause you to

suﬀer a sure loss. In particular, you must choose exchange rates such that

exch(Ω) = 1 (1)

exch(A) ≥ 0. (2)

In addition, for any disjoint events A and C, you must choose exchange rates

such that

exch(A ∪ C) = exch(A) + exch(C). (3)

Dutch book arguments also show that, given any two events A and B, if you

have chosen exch(A) > 0, then you must choose exch(A ∩ B) and exch(B|A)

such that

exch(B|A) = exch(A ∩ B) / exch(A), when exch(A) > 0. (4)


If your chosen exchange rates do not conform to Eq. 4, then House can require

you to make a sequence of trades (all considered fair by you) that result in a

sure loss for you.

Proof of Eq. 4. We will show that the exchange rates chosen by you must satisfy

Eq. 4 if you do not wish to be vulnerable to a Dutch book. Suppose that you

chose exch(A) > 0, but your chosen value for exch(B|A) does not satisfy Eq. 4.

There are two cases to consider: either

exch(B|A) exch(A) < exch(A ∩ B), (5)

or the reverse inequality. Consider the case where the inequality (5) holds; the

case of the reverse inequality being analogous. Recall that you start the gam-

bling session holding the promissory note PN(Ω). Then, the House may, at its

discretion, execute the following sequence of trades with you (all of which you

consider fair).

Trade 1: The House takes from you PN(Ω) and gives you the two notes: exch(A ∩ B)PN(Ω) and [1 − exch(A ∩ B)]PN(Ω).

Trade 2: The House takes from you exch(A∩B)PN(Ω) and gives you PN(A∩B).

Trade 3: The House takes from you PN(A ∩ B) and gives you exch(B|A)PN(A).

Trade 4: Next comes a fractional trade. The House takes from you

exch(B|A)PN(A) and gives you

exch(B|A)[exch(A)PN(Ω)]. At this point you hold the notes [1 − exch(A ∩ B)]PN(Ω) and exch(B|A) exch(A) PN(Ω), which together are a sure payoff of α currency units, where α = 1 − exch(A ∩ B) + exch(B|A) exch(A).

But Eq. 5 implies that α < 1. Thus, the House has executed a sequence of trades,

regarded as fair by you, that has resulted in a sure loss for you.
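To make the mechanics of this proof concrete, here is a minimal Python sketch (not part of the original chapter; the numerical exchange rates are made-up, illustrative values). It walks through Trades 1-4 and computes the sure value of the notes held at the end: when the chosen rates violate Eq. 4 in the direction of inequality (5), that value is below the one currency unit you started with; when Eq. 4 is satisfied, this particular sequence of trades produces no sure loss.

# Minimal sketch (not from the chapter): the four trades from the proof of Eq. 4,
# executed with exchange rates that may violate exch(B|A) = exch(A & B) / exch(A).
# All numerical values below are illustrative assumptions.

def final_holdings(exch_A, exch_AB, exch_B_given_A):
    """Value (in currency units) of the notes held after Trades 1-4,
    starting from the single note PN(Omega), worth one unit for sure."""
    # Trade 1: PN(Omega) is split into exch_AB * PN(Omega) and (1 - exch_AB) * PN(Omega).
    kept = 1.0 - exch_AB                    # the fractional note you keep
    # Trade 2: exch_AB * PN(Omega)    ->  PN(A & B)
    # Trade 3: PN(A & B)              ->  exch_B_given_A * PN(A)
    # Trade 4: exch_B_given_A * PN(A) ->  exch_B_given_A * exch_A * PN(Omega)
    traded = exch_B_given_A * exch_A
    # Everything you now hold is a fraction of PN(Omega), i.e. a sure amount:
    return kept + traded

# Rates violating Eq. 4 in the direction of inequality (5):
alpha = final_holdings(exch_A=0.5, exch_AB=0.3, exch_B_given_A=0.4)
print(alpha)   # 0.9 < 1: a sure loss of 0.1 currency units

# Rates satisfying Eq. 4: exch(B|A) = 0.3 / 0.5 = 0.6
alpha = final_holdings(exch_A=0.5, exch_AB=0.3, exch_B_given_A=0.6)
print(alpha)   # 1.0: no sure loss from this sequence of trades

For the reverse inequality, the House would simply run the same trades in the opposite direction.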

Exchange Rates Must Conform to the Laws of Probability. Thus, Dutch book

arguments have shown that, if your exchange rates are rationally chosen (in the

sense of not allowing the House to trade you into a sure loss), then your exchange

rates must obey Eqs. 1, 2, 3, and 4. In other words, they must conform to the

laws of probability.

Rationality of Belief. As mentioned in an aside earlier, your chosen exchange

rates may be taken as an operational deﬁnition of your degrees of belief. Thus,

what Dutch book arguments show is that, for your degrees of belief to be rational,

they should conform to the laws of probability.


The Dutch book arguments discussed above show that, at any given moment of

time, your beliefs must conform to the laws of probability if you are rational.

Because those Dutch book arguments apply only to a moment in time, they are

said to be synchronic.

Beliefs change over time as we acquire new information. Is there a theory of

rational belief change? Yes, there is. So-called diachronic Dutch book arguments

assert that, when you acquire new information, the only way to rationally update

your beliefs in the light of the new information, is to condition on the new

information. (Just what is meant by conditioning will be explained soon.)

Diachronic Dutch book arguments are more complicated than synchronic

ones and, therefore, will not be explained in full detail here. The interested

reader may ﬁnd a full explanation in [12, Chap. 2]. The general idea is this:

You are back gambling with the House again. At a moment in time where you

don’t know whether the event A has occurred or not, the House demands that

you state what your new exchange rate between the promissory notes PN(B)

and PN(Ω) would be if you were to learn that A had occurred. Let us denote

that new exchange rate by exchA (B). A diachronic Dutch book argument shows

that, if exch(A) > 0, you must choose exchA (B) to be equal to exch(B|A). If

you choose any other value for exchA (B), then the House can demand that you

make a sequence of trades (i.e., bets) with it that will result in a sure loss for

you.

Change of Notation. Since your degrees of belief are operationally deﬁned by your

chosen exchange rates and since the latter must conform, if you are rational, to

the laws of probability, the notations exch(B), exch(B|A), and exchA (B) for

exchange rates will be replaced by the probability notations Pr(B), Pr(B|A),

and PrA (B), which will be used to represent your degrees of belief.

Epistemic Bayesianism may be summarized by the two assertions:

• At any given moment, a rational person’s degrees of belief (i.e., personal probabilities) should conform to the laws of probability.

• When new information is acquired, the only rational way to update one’s

personal probabilities (i.e., personal beliefs) is to condition those probabilities

on the new information.

Terminology. One’s old beliefs are often called prior beliefs and one’s new,

updated beliefs are often called posterior beliefs.

Putting matters loosely:

• Your beliefs at any one time should be consistent with one another; conformance to the laws of probability enforces that consistency.

• When new information forces you to change your beliefs, you should keep your

new beliefs close to your old beliefs. You achieve that closeness by conditioning


your old beliefs on the new information and, in this way, arriving at your new

beliefs.

Remark. Epistemic Bayesianism can tell you whether your beliefs are rational.

But having rational beliefs is no guarantee of having accurate beliefs.

Epistemic Bayesianism as a Vision of Rational Science. A high-level and par-

tial description of the activities of scientists is that (a) they start with opin-

ions/beliefs about Nature, (b) they collect data, and (c) they revise their opin-

ions in light of the new data that they collected. Given this characterization

of science, the epistemic Bayesian prescription for rational science, is that the

scientists’ initial opinions should conform to the laws of probability and that

they should revise their opinions by conditioning their initial opinions on their

collected data.

Is this epistemic Bayesian prescription a good recipe for doing science? I don’t

think it is. Scientists (and people in general) need to change their beliefs in ways

that epistemic Bayesianism deems irrational.

Changing Your Mind Without New Evidence is Irrational. For example, suppose

that you and your friend both have rational beliefs about upcoming baseball

games in Puerto Rico. However, your beliefs are based on ignorance, whereas

your friend’s beliefs are based on the knowledge that he has acquired as a fan

of Puerto Rican baseball. Your friend tells you what his beliefs are about the

upcoming games. But, you say to yourself, “I am as smart as my friend. Why

should I give up my beliefs and adopt his beliefs as my own?” So, you keep your

old beliefs. But, then, you think some more. You know that your friend’s beliefs

are rational. So, if you were to adopt his beliefs, your new beliefs would not

be irrational. Furthermore, your friend’s beliefs are presumably more accurate

than your own. Therefore, in order to make your beliefs more accurate, you

change your mind and adopt your friend’s beliefs as your own.

Unfortunately, by epistemic Bayesian standards, you changed your beliefs in

an irrational way. What was irrational about your belief change? Notice that

your change of mind was not prompted by any new information. Earlier, when

you learned what your friend’s beliefs were, you decided not to change your own

beliefs. It was only later that you changed your beliefs. And that belief change

was prompted by a desire for greater accuracy in your beliefs, not because you

had learned anything new.

Not receiving new information is the same as receiving information of the

occurrence of the universal/sure event Ω. If you are rational by epistemic

Bayesian standards, when you are informed of Ω, you should change your belief

in any B from Pr(B) to

PrΩ (B) = Pr(B|Ω) = Pr(B).

In other words, there should be no change in your beliefs. (This can be demon-

strated by a diachronic Dutch book argument that I will not present here.) Had


your change of mind occurred while you were gambling, the House could have

made you suﬀer a sure loss.

The irony here is that, by changing your mind so that your new belief would

be more accurate/realistic, your new belief became inconsistent with your old

belief. And, thus, your belief change was irrational by epistemic Bayesian stan-

dards. The vice of irrational belief change outweighs the virtue of increased

accuracy of belief.

Is There Room for New Ideas? Science needs to evaluate new ideas. For example,

at one time, the idea of an Ice Age was a new idea. So, let B denote the idea

that there was once a sheet of ice, thousands of meters thick, covering most

of Canada and northern United States [23, pp. 12–14]. Hundreds of years ago,

this idea would not have occurred to most people. Presumably, for such people,

Pr(B) would be zero. But, then there is no evidence A (that is credible in the

sense of having Pr(A) > 0) that can cause Pr(B|A) to be greater than zero. This

is because:

Pr(B|A) = Pr(A ∩ B)/Pr(A) ≤ Pr(B)/Pr(A) = 0/Pr(A) = 0. (6)

How then can epistemic Bayesianism explain how unthought-of ideas can ratio-

nally come to be believed?

The problem with epistemic Bayesianism is that it envisions only one way

of forming new beliefs, which is by obtaining new information and conditioning

one’s current beliefs on the new information. This process repeats again and

again as new information is acquired.

The Ball-and-Chain of Old Beliefs. To summarize: Scientists frequently need to

change their beliefs, sometimes radically. But, in the epistemic Bayesian depic-

tion of rational science, scientists’ old beliefs are like a ball-and-chain that keeps

them from changing their position much.

Pragmatic Bayesian Statistics

Pragmatic Bayesianism is very diﬀerent from epistemic Bayesianism. Unlike the

latter school of thought, the former school regards it as unimportant to precisely

specify what probability is or to precisely specify a theory of statistical inference.

To illustrate the attitudes of the pragmatic Bayesian school, I shall use the

textbook Bayesian Data Analysis, Third Edition [14], the title to be abbreviated

here BDA3. This is an excellent book. It won the 2016 DeGroot Prize from the

International Society for Bayesian Analysis. For descriptions of various Bayesian

analyses and explanations of how to carry them out, the book is superb. Per-

sonally, I have found it helpful. It is my go-to book for Bayesian statistical

methodology.

The authors of BDA3 state on p. 4 that, rather than “argue the foundations

of statistics”, they prefer to concentrate on the “pragmatic advantages of the

Bayesian framework”. The problem that this statement poses for many readers

is this: In the philosophy of pragmatism, one judges the value of a method by


the success that it achieves. But, what is “success”? One of the purposes of

the foundations of statistics is to formulate what it is that statistical inference

attempts to achieve. If we are not told what Bayesian statistics attempts to

achieve, how we can judge whether it has been successful?

BDA3’s View of Probability. BDA3 states (pp. 11–13) that probability is used

to measure uncertainty and several kinds of uncertainty are described. The book

does not restrict the application of probability to one kind of uncertainty only.

And that lack of restriction seems to be the point: A pragmatic-Bayesian statis-

tical model may include multiple kinds of uncertainty.

BDA3’s Approach to Statistical Methodology. For all its excellence, BDA3 is a

puzzling book. Although good at explaining how to do a Bayesian analysis, it

leaves unclear why you would want to do a Bayesian analysis. Of the many

diﬀerent kinds of statistical analysis that you might do, why would you want to

condition your model on the data? BDA3 doesn’t answer that question.

In contrast to BDA3, epistemic Bayesians can tell you why, in their view,

you should do Bayesian statistical analyses; the reason being that a Bayesian

analysis in which you condition your beliefs on new data is the only rational way

to update your beliefs. But that explanation is not found in BDA3.

Aside. To do Bayesian statistical analyses, one needs a broader concept of con-

ditioning than has been discussed here. For a broad treatment of conditioning,

see Chap. 5 of Pollard’s textbook on probability theory [19]; particularly rele-

vant to Bayesian statistics is Theorem <12> and the corollary Example <13>.

(To read Pollard’s book, however, you need to learn his unusual de-Finetti style

notation.) And, for ﬁnite-dimensional spaces, see the series of four papers written

by I.R. Goodman with input from colleagues [7].

We might surmise that pragmatic Bayesians agree with epistemic Bayesians

(without saying so) that the reason for doing Bayesian statistical analyses is that

such analyses are the only way to rationally update one’s beliefs. But that sur-

mise is questionable. Pragmatic Bayesians sometimes appear to be unconcerned

with whether belief change is rational.

Concern for Accuracy of Models. If pragmatic Bayesians were concerned only

with rationally updating their beliefs, they would stop further examination of

their models once they had computed the model’s posterior. But they do not

stop there. At the very beginning of BDA3 (Sect. 1.1), it is stated that there is

a further step involved in doing Bayesian statistics. After computing a model’s

posterior, one needs to evaluate how well the model ﬁts the data. This is called

model checking and is the subject of Chap. 6 in BDA3. The purpose of model

checking is to look for deﬁciencies in the model so that the model can be modiﬁed

to correct those deﬁciencies.

By the standards of epistemic Bayesianism, it is irrational to modify a model

because a model check has indicated a deﬁciency in the model. The epistemic

Bayesian viewpoint is that, once the modeler has computed a model’s posterior,

the modeler has rationally updated his/her probabilities. To modify the model

a second time because a model check shows it to be deﬁcient is irrational (by

epistemic Bayesian standards).


Thus, standard pragmatic Bayesian practice has two parts:

1. Compute model posteriors by conditioning on data. (That is rational by epis-

temic Bayesian standards.)

2. Check models to see how well they ﬁt the data and, then, modify deﬁcient

models. (That’s irrational by epistemic Bayesian standards.)

How do Pragmatic Bayesians Justify Their Practices? That is quite unclear.

Perhaps the reason for that lack of clarity is that there is not a consensus in the

pragmatic Bayesian community. They may agree on what the proper practice is,

but not on how to justify that practice.

A Justification. In Sect. 6, I present an approach to modeling that justiﬁes the

practices of pragmatic Bayesians. However, in this approach to modeling, prob-

ability is interpreted diﬀerently than in either the pragmatic or the epistemic

Bayesian communities. In this approach, probability describes variability, not

uncertainty.

Aside. It is sometimes reasonable to be fairly unconcerned with how probability

is to be interpreted when doing a Bayesian analysis. Thus, my colleagues and

I have used a Bayesian second-order probability approach to deﬁning a non-

monotonic logic [3,8–10]. The resulting logic makes sense under more than one

interpretation of probability.

In much of statistics, probability is conceived of as being a measure of uncertainty

[22, p. 3]. But, in much of science, there is a diﬀerent school of thought in which

probability distributions are used to describe the variability of Nature’s behavior.

Much of science (but not all science) is concerned with formulating models of

the mechanisms by which Nature operates. Examples: (a) The mechanism of

Newtonian gravity that causes planets to have Keplerian orbits. (b) The mecha-

nism of disease spread. Some models of these mechanisms are deterministic (e.g.,

Newtonian gravity), but others are stochastic (e.g., the spread of disease).

A Simple Model of a Stochastic Mechanism. As a particularly simple example of

a stochastic model of a natural mechanism, I present a model of human paired-

associate learning described in [2, Chap. 3] based on an experiment and theoret-

ical work of Bower [11]. I’ll call this model the “all-or-none learning model”. (It

is also known as the “one-element model”.) In the experiment, the subjects (who

were college students) were shown paired associates consisting of two consonants

paired with either the number 1 or the number 2 (e.g., mk-2 ). The subjects

were to do their best to learn these pairs. Later, when shown the left member of

a pair (e.g., mk ), the subject was supposed to respond with the right member

of the pair (i.e., 2 ). Subjects were presented with the pairs, one at a time, again

and again until the pairs had been learned to a criterion.


Human learning is a natural phenomenon that has been brought into the

laboratory by Bower’s experiment. The all-or-none learning model is a hypothesis

about the natural mechanism underlying paired-associate learning in Bower’s

experiment. The model makes the following assumptions for each subject and

each paired associate.

1. At each trial, the paired associate was in either a learned state or an unlearned

state.

2. At the start of the experiment, each paired associate was in the unlearned

state.

3. Once in the learned state, the paired associate stayed in that state.

4. On each trial, the paired associate, if currently unlearned, had a probability

c of being learned.

5. When a learned paired associate was tested, the subject would always

respond with the correct response.

6. When an unlearned paired associate was tested, the subject would guess the

correct response with probability 1/2.

7. In addition, appropriate conditional independence assumptions were made.

Taken together, these assumptions constitute a hypothesis about the natural mechanism underlying paired-associate learning.

In the data analysis, the single parameter c was estimated from the data

and the distributions of various statistics were derived from the model. E.g.,

the probability of an error on trial n, the total number of errors, trial number

of last error, and length of runs of errors. Graphs comparing predictions and

observations were made [2, Sect. 3.3]; eyeball inspection of those graphs showed

impressive ﬁts. However, some later experiments showed some deviations from

the all-or-none learning model [2, Sect. 3.5].

In the all-or-none learning model, probability is used to describe the variability of Nature. Thus, on some trials where a paired-associate was in

the unlearned state, it would be learned on that trial; on other trials, it would

not be learned. To describe Nature’s variability, the model states that learning

occurs with probability c, and no learning with probability 1 − c.
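As a concrete illustration of how the model describes this variability, the following Python sketch (mine, not from [2] or [11]; the number of item sequences, the number of trials, and the true value of c are all made-up values) generates artificial error sequences under assumptions 1-6 and recovers c from the mean total number of errors, using the model's implication that the expected total errors per item equals 1/(2c).

import random

# Illustrative sketch only: simulate artificial data from the all-or-none learning
# model and recover the learning parameter c from the mean total errors per item.
random.seed(0)

def simulate_item(c, n_trials=25):
    """Return the 0/1 error sequence for one paired associate over n_trials."""
    learned = False                     # assumption 2: starts in the unlearned state
    errors = []
    for _ in range(n_trials):
        if learned:
            errors.append(0)            # assumption 5: always correct once learned
        else:
            errors.append(0 if random.random() < 0.5 else 1)   # assumption 6: guessing
            if random.random() < c:     # assumption 4: learn with probability c
                learned = True          # assumption 3: and then stay learned
    return errors

true_c = 0.3                            # assumed "true" value, chosen arbitrarily
data = [simulate_item(true_c) for _ in range(2000)]   # 2000 subject-item sequences

# Under the model, E[total errors per item] = (1/2) * (1/c), so a simple
# method-of-moments estimate is c_hat = 1 / (2 * mean total errors).
mean_errors = sum(sum(seq) for seq in data) / len(data)
c_hat = 1.0 / (2.0 * mean_errors)
print(round(c_hat, 3))                  # should land near the assumed true value 0.3

More refined comparisons, such as the distribution of the trial number of the last error or the lengths of error runs, can be read off the same simulated sequences.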

More generally, there is a long tradition in experimental psychology of using

probability to describe natural variability [2, Sect. 1.3].

Natural Propensities. What might we mean when we say that probability

describes natural variability? Let’s go back to the example of a trial where a

paired associate is in an unlearned state at the start of the trial. The all-or-none

learning model assumes that, during the trial, the paired associate is learned

with probability c and is not learned with probability 1 − c. Call this the learn-

ing assumption.

One interpretation of the learning assumption is that Nature does not behave

deterministically; instead Nature has a propensity, describable by probability, to


sometimes behave one way and sometimes another. (Philosophers of science have

formulated a number of theories of propensity: see [15, Chaps. 6 & 7] and [12,

pp. 76–77].)

Could We Not Reinterpret Probabilities Descriptive of Nature as Degrees of

Belief ? There is a rationale for doing so. If a modeler posits that learning will

occur with probability c, then the modeler is uncertain whether learning will

occur. And it would be reasonable to infer that the modeler has degree of belief

c that learning will occur. In this way, probabilities could become measures of

uncertainty, consonant with the thinking of many statisticians.

Doing exactly this is appealing to many of my colleagues. They would inter-

pret all the probabilities in the all-or-none learning model as degrees of belief.

In addition, they would adopt a prior distribution for the learning parameter c,

thus producing a full epistemic Bayesian model.

Personally, I ﬁnd this course thoroughly unappealing. Let’s suppose that I

am the modeler. A model that started out as a description of a natural phe-

nomenon has become a description of my beliefs. But, my degrees of belief are

operationalized in terms of what bets I would make and not make. Thus, a model

that started out as a description of a natural phenomenon has become a descrip-

tion of my gambling behavior. Personally, I think that a description of Nature

is interesting, whereas a description of my gambling behavior is boring.

We turn now to the stochastic modeling of natural phenomena. Scientists generally do not regard their models as being absolutely correct descriptions of natural mechanisms that are impossible to improve. That is certainly the attitude expressed

toward the all-or-none learning model by the authors of [2] in their Sect. 3.5. So,

let us consider that model. Is it plausible that this model could be a true and complete

description of the natural mechanism of paired-associate learning? No, it’s not.

However paired-associate learning occurs, it involves huge numbers of neurons

in complex networks. It simply isn’t plausible that a simple model could be an

accurate and complete description of a reality that is almost certainly extremely

complex. So, taken as a description of reality, the all-or-none learning model is

almost certainly wrong.

Furthermore, in most domains of science, models are not absolutely correct

descriptions of natural mechanisms. Most models are almost certainly wrong.


Ideally, we would like to know what the natural mechanisms underlying natural

phenomena are. However, most natural phenomena are so complex that we can

never hope to fully and accurately describe them in our models. Consequently,

we need another way to think about what it is that models can realistically hope

to achieve.

In mimetic modeling [5,6], the goal is not to describe the natural mechanism

underlying a natural phenomenon. Rather our goal is to design an artificial

mechanism that mimics the behavior of the natural phenomenon.

Fortunately, when we take up mimetic modeling, we don’t have to throw

away all our old models that had been intended to be descriptions of natural

mechanisms. Instead, we can reinterpret those models as designs for artiﬁcial

mechanisms. In particular, the all-or-none learning model [2,11] can be reinter-

preted in that way.

Let us begin with a very general explanation of mimickers and models. More

detail will be given later. A mimicker is an artificial probabilistic mechanism

that produces behavior; it is something that we build. A mimetic model is a

design for building a mimicker; it speciﬁes the probabilities of the mimicker’s

behaviors. We don’t have to build the mimicker and run it to know how it will

behave. If we have the model for the mimicker (i.e., if we have the design of its

mechanism), we can predict the mimicker’s behavior using probability theory.

Interpretation of Probability in Mimickers. The behavior of mimickers is designed

to be variable. More particularly, a mimicker has been designed to have a propen-

sity, describable by probability, to sometimes behave one way and sometimes

another. Because a mimicker is a built device, such a propensity is an engineered

propensity.

The Goal of Mimicry. A mimicker for a natural phenomenon is an artiﬁcial prob-

abilistic mechanism whose behavior (it is hoped) will mimic the behavior of the

natural phenomenon. To be more speciﬁc, when we speak of the behavior of a

mimicker, we mean the artificial data produced by the mimicker. It is hoped

that the artiﬁcial data produced by the mimicker will resemble the empirical

data produced by the natural phenomenon. A mimicker may mimic well or it

may mimic poorly. We hope that the behavior of the mimicker approximates the

behavior of the natural phenomenon, but it may not. The better the approxima-

tion, the better the mimicker and, thus, the better the design of the mimicker

(i.e., the better the model).

When we propose a design for a mimicker of a natural phenomenon, we are

not making a claim that the artiﬁcial mechanism in the mimicker matches the

natural mechanism underlying the natural phenomenon. In particular, although

the mimicker is a stochastic mechanism, we do not claim that the natural mech-

anism is stochastic. It might be stochastic; or it might be deterministic, or even

chaotic. When we design a mimicker, all we care about is that its behavior should

approximate the behavior of the natural phenomenon.


When models are put forward as hypotheses about natural mechanisms, then, strictly speaking, those hypotheses are either correct or incorrect. And, if we

just collect enough data, virtually every such hypothesis can be discredited.

On the other hand, mimetic models are neither correct nor incorrect. Rather

they approximate the behavior of natural phenomena more or less well. How,

then, should we evaluate them?

Consumer-Magazine Paradigm for Model Evaluation. Various magazines are

published that evaluate products to help consumers decide which products to

buy. Consider automobiles, for example. A consumer magazine might evaluate

various automobiles on criteria such as: distance traveled on a tank of gas, num-

ber of passenger seats, space available for cargo, etc. The magazine does not

specify which car is best, because diﬀerent consumers have diﬀerent needs. For

example, one consumer might want more seats in a car, but another consumer

might want to minimize the frequency of fueling. As a result, diﬀerent consumers

will buy diﬀerent cars.

Analogously, diﬀerent scientists may have diﬀerent goals for a mimetic model

of a natural phenomenon. One scientist might want the model to mimic one

statistic well; the other might want a diﬀerent statistic to be mimicked well.

This might lead the two scientists to diﬀerently evaluate the adequacy of the

mimetic model.

In addition, a scientist might regard, as unimportant, a statistically signif-

icant deviation of a model’s behavior from Nature’s behavior. If the model’s

behavior showed only a small deviation from Nature’s behavior and if no better

model was known, the scientist might regard the model as “good enough”, even

though the small deviation was highly signiﬁcant.

Empirical Distinguishability of Models. For a given experiment, two dissimilar

models might mimic Nature’s behavior about equally well. To empirically dis-

tinguish these models, the challenge for the experimenter is to design a new

experiment in which Nature’s behavior is mimicked well by one model, but not

the other. For example, in the study of the temporal aspects of human cognition,

one of the goals has been to determine whether cognitive processes occur serially

or in parallel. It has been surprisingly diﬃcult to design experiments that can

distinguish serial processing from parallel processing; considerable ingenuity has

been needed to solve this problem [21].

Subjectivity. As described above, there can be considerable subjectivity in a sci-

entist’s evaluation of a mimetic model. However, this is subjectivity of judgment,

rather than the subjectivity of belief found in epistemic Bayesian modeling.

Uncertainty. The purpose of mimetic models is to imitate natural phenomena;

such models are not intended to be descriptions of a scientist’s uncertainties. For

example, a scientist might have doubts about whether a scientiﬁc instrument

measures what it purports to measure. Such an issue would be dealt with by

the scientist exercising scientiﬁc judgment and not by statistically modeling the

scientist’s uncertainties. In mimetic modeling, it is Nature that is modeled, not

the scientist.


Up to now, our discussion of mimetic models has been at such a high level that

there has been no discussion of parameters in mimetic models. We will now

discuss the role of parameters in mimetic modeling. We will show how, without

inconsistency, both classical and Bayesian statistical analyses can be applied to

mimetic models.

A mimetic model with parameters may be regarded either as (a) a partially-

speciﬁed model or as (b) a family of fully-speciﬁed models, with one fully-

speciﬁed model assigned to each possible value of the parameter vector. The

latter is the point of view taken here. When empirically evaluating a paramet-

ric mimetic model, one may use classical statistical methods to estimate the

parameter vector and, then, predictions may be derived from the fully-speciﬁed

model assigned to that parameter vector. Those predictions can then be com-

pared with the empirical observations. As an example: Eyeball comparison of

predictions and observations was the way that the all-or-none learning model

was evaluated [2, Sect. 3.3].

Parallel streams. As just described, a mimetic model with a parameter vector

may be regarded as a family of fully-speciﬁed models that are each assigned to

a parameter vector. Now each fully-speciﬁed model is a design for a mimicker,

i.e., an artiﬁcial stochastic mechanism that generates a vector (or, speaking

ﬁguratively, a stream) of artiﬁcial data. So, at this point, we have a family of

mimickers that generate parallel streams of artiﬁcial data. We call such a family

a parametric family of mimickers.

Random Choice of Data Stream. Out of this family of parallel data streams, we

want to extract just one stream. We do that by choosing one stream at random.

Speciﬁcally, we adopt a distribution over the space of parameter vectors and then

randomly select a parameter vector from that distribution. Following Bayesian

terminology, we call this distribution of parameter vectors the prior distribution.

Prior-Equipped Mimicker. From a parametric family of mimickers, we design

an amalgamated mimicker as follows. First, the amalgamated mimicker ran-

domly selects a parameter vector from the prior distribution. This parameter

vector “points at” one of the fully-speciﬁed mimickers. Then a stream of data

is generated from the“pointed-at” (fully-speciﬁed) mimicker. We call such an

amalgamated mimicker a prior-equipped mimicker.

Mimicking a Two-Stage Experiment. Suppose that we have done a two-stage

experiment investigating a natural phenomenon. There are two vectors of obser-

vations: one from the ﬁrst stage and one from the second stage.

Suppose that we construct a prior-equipped mimicker designed to imitate the

results of the experiment. Let θ denote the mimicker’s parameter vector and let

y1 and y2 denote the ﬁrst- and second-stage outputs from the mimicker. Using an


informal style of notation found in [14, pp. 6–8], we may express the three-way

joint probability density of θ, y1 , and y2 as:

p(θ, y1 , y2 ) = p(y2 |θ) p(y1 |θ) p(θ), (7)

where p(θ) denotes the prior density of θ, where p(y1 |θ) and p(y2 |θ) denote

the densities of y1 and y2 conditioned on θ, and where the mimicker has been

designed so that y1 and y2 are conditionally independent given θ.

Mimicking the Two Stages Jointly. From (7), the joint density of (y1 , y2 ) is:

p(y1 , y2 ) = ∫ p(y2 |θ) p(y1 |θ) p(θ) dθ. (8)

Mimicking Just the Second Stage. From (8), the density of y2 is:

p(y2 ) = ∫ p(y2 |θ) p(θ) dθ. (9)

Mimicking the Second Stage Conditioned on the Output of the First. From (8),

the conditional density of y2 given the value of y1 is:

p(y2 |y1 ) = ∫ p(y2 |θ) [p(y1 |θ) p(θ) / p(y1 )] dθ = ∫ p(y2 |θ) p(θ|y1 ) dθ. (10)

Remark. In (9) and in (10), the unconditional density p(y2 ) and the conditional

density p(y2 |y1 ) are expressed as weighted integrals of p(y2 |θ) with weights given

in the former case by the prior density p(θ) and, in the latter case, by the so-

called posterior density p(θ|y1 ). In standard Bayesian terminology, it is said that

the posterior density p(θ|y1 ) is obtained through “updating” the prior density

p(θ) by conditioning on the ﬁrst-stage data y1 .

Continuation of the Mimicker’s Behavior from the First Stage to the Second. In

(10), p(y2 |y1 ) is the distribution of the mimicker’s second-stage output condi-

tional on its output from the ﬁrst stage. It is the mimetic analog of the epistemic

concept of the posterior predictive distribution of y2 given y1 [14, Eq. 1.4].

Accuracy of Mimicking. Suppose that we have done a two-stage experiment with

empirical results y1emp and y2emp from the two stages. If a mimicker does a good

job of mimicking the results of the experiment, then y2emp will look like it could

plausibly come from the distribution with density p(y2 |y1emp ). But that may not

look plausible if the mimicker is not good.
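To make Eqs. (7)-(10) and this mimicry check concrete, here is a small Python sketch that is not from the chapter and rests on illustrative assumptions throughout (a uniform prior, binomial stages, and made-up sample sizes and outcomes). It builds a prior-equipped mimicker for a two-stage experiment and approximates p(y2 |y1emp ) by weighting prior draws of θ with the first-stage likelihood p(y1emp |θ).

import random
from math import comb

# Sketch with illustrative assumptions: a prior-equipped mimicker for a two-stage
# binomial experiment, and a Monte Carlo approximation of p(y2 | y1) from Eq. (10).
random.seed(1)
N1, N2 = 20, 20                       # trials per stage (assumed values)

def binom_pmf(k, n, theta):
    return comb(n, k) * theta**k * (1.0 - theta)**(n - k)

def run_mimicker():
    """One run of the prior-equipped mimicker: draw theta, then both data stages."""
    theta = random.random()                                  # p(theta): uniform prior
    y1 = sum(random.random() < theta for _ in range(N1))     # p(y1 | theta)
    y2 = sum(random.random() < theta for _ in range(N2))     # p(y2 | theta)
    return theta, y1, y2

# One artificial data stream from the prior-equipped mimicker:
theta, y1, y2 = run_mimicker()
print(y1, y2)

# Suppose the empirical first-stage result was y1_emp successes (a made-up value).
y1_emp = 14

# Approximate p(y2 | y1_emp) by weighting prior draws of theta with p(y1_emp | theta),
# which is equivalent to integrating p(y2 | theta) against the posterior p(theta | y1_emp).
thetas = [random.random() for _ in range(20000)]
weights = [binom_pmf(y1_emp, N1, th) for th in thetas]
total_w = sum(weights)
p_y2_given_y1 = [
    sum(w * binom_pmf(k, N2, th) for th, w in zip(thetas, weights)) / total_w
    for k in range(N2 + 1)
]

# Crude mimicry check: is a (made-up) second-stage result plausible under p(y2 | y1_emp)?
y2_emp = 13
print(round(p_y2_given_y1[y2_emp], 3))   # probability assigned to the observed count
print(round(sum(p_y2_given_y1), 3))      # sanity check: the approximation sums to 1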

One of the chief concerns of scientists is how well their model ﬁts the data.

Checking model ﬁt can be carried out by classical statistical methods as brieﬂy

described above in Sect. 6.3.1. Or, model checking can be carried out using the

more general pragmatic-Bayesian methods described in BDA3 [14, Chap. 6].

The purpose of model checking is two-fold: The ﬁrst purpose is to evaluate

whether the model needs to be modiﬁed to better ﬁt the data. The second

purpose is to get some clues as to what kinds of modiﬁcations should be made.


Recall that, by epistemic Bayesian standards, once a model has been updated by conditioning on data, it is irrational to further

modify the model so that it better ﬁts the data.

In contrast, in mimetic modeling, it is not at all irrational to modify a model

so that it better ﬁts data. In mimetic modeling, a model is an engineering pro-

posal for building a device that, it is hoped, will do a good job of mimicking a

natural phenomenon. If one such engineering proposal doesn’t work well, it isn’t

irrational to formulate another engineering proposal—in fact, it makes good

sense to do so.

7 Telling Scientists What to Think Versus Mimetic Modeling

There are two schools of statistical thought that attempt to dictate to scientists what their data forces them to think:

• The first school holds that a scientific hypothesis is to be accepted or rejected on the basis of a statistical test. Actually, what I am calling a school is really

a set of three related, but warring, schools: the Fisher school, the Neyman-

Pearson school and the null-hypothesis signiﬁcance-testing school [18]. By

these schools’ own analyses, the conclusion to accept a statistical hypothesis

may be in error. And, likewise, the conclusion to reject a statistical hypothesis

may be in error. So, it doesn’t make sense to regard a scientiﬁc question as

being settled on the basis of a statistical test.

• The second school is epistemic Bayesianism. That school says that, once a

model has been updated by conditioning on data, it is irrational to further

modify the model. In particular, it is irrational to modify the model so that

it better ﬁts the data.

Neither of these dictates applies when mimetic modeling is employed.

• First, mimetic modeling is not devoted to deciding whether a mimicker is

correct or wrong. When we propose a mimicker for a natural mechanism, we

know that realistically the mimicker cannot hope to capture the full complex-

ity of the natural mechanism. So, there is no point in evaluating whether the

mimicker is correct or wrong—we already know it’s wrong. Instead, mimetic

modeling aims at evaluating whether the mimicker does a good job or a poor

job of mimicking the natural phenomenon.

• Second, unlike epistemic modeling that is constrained by the “ball-and-chain”

of old beliefs, there is no problem in mimetic modeling with further modifying

a model that has already been updated by conditioning on data.


8 A Final Word

I end with a quote from the statistician Francis J. Anscombe [1]. He wrote:

The subject of statistics is itself subtle and puzzling, whereas textbooks

try to persuade the reader that all is clear and straightforward.

I heartily agree.

who often helped me by giving me insightful comments on my ideas. I have also bene-

ﬁtted from talking with and hearing the perspectives of Richard Chechile, Michael D.

Lee, Richard Shiﬀrin, and Philip L. Smith.

References

1. Anscombe, F.J.: Fisher’s ideas. Science 210, 180 (1980)

2. Atkinson, R.C., Bower, G.H., Crothers, E.J.: An Introduction to Mathematical

Learning Theory. Wiley, New York (1965)

3. Bamber, D.: Entailment with near surety of scaled assertions of high conditional

probability. J. Philos. Log. 29, 1–74 (2000)

4. Bamber, D.: What is probability? (Review of the book [15].) J. Math. Psychol. 47,

377–382 (2003)

5. Bamber, D.: Two interpretations of Bayesian statistical analyses. Unpublished talk

given at the 54th meeting of the Edwards Bayesian Research Conference, Fullerton,

California, April 2016

6. Bamber, D.: Bayes without beliefs: Mimetic interpretations of Bayesian statistics in

science. Unpublished talk given at the 49th meeting of the Society for Mathematical

Psychology, New Brunswick, New Jersey, August 2016

7. Bamber, D., Goodman, I.R., Gupta, A.K., Nguyen, H.T.: Use of the global implicit

function theorem to induce singular conditional distributions on surfaces in n

dimensions. Random Operators and Stochastic Equations. Part I. 18: 355–389.

Part II. 19: 1–43. Part III. 19: 217–265. Part IV. 19: 327–359 (2010/2011)

8. Bamber, D., Goodman, I.R., Nguyen, H.T.: Deduction from conditional knowledge.

Soft Comput. 8, 247–255 (2004)

9. Bamber, D., Goodman, I.R., Nguyen, H.T.: Robust reasoning with rules that have

exceptions: from second-order probability to argumentation via upper envelopes

of probability and possibility plus directed graphs. Ann. Math. Artif. Intell. 45,

83–171 (2005)

10. Bamber, D., Goodman, I.R., Nguyen, H.T.: High-probability logic and inheritance.

In: Houpt, J.W., Blaha, L.M. (eds.) Mathematical Models of Perception and Cog-

nition: A Festschrift for James T. Townsend, vol. 1, pp. 13–36. Psychology Press,

New York (2016)

11. Bower, G.H.: Application of a model to paired-associate learning. Psychometrika

26, 255–280 (1961)

12. Diaconis, P., Skyrms, B.: Ten Great Ideas About Chance. Princeton University

Press, Princeton (2018)

13. Efron, B.: Why isn’t everyone a Bayesian? Am. Stat. 40, 1–5 (1986)

14. Gelman, A., Carlin, J.B., Stern, H.S., Dunson, D.B., Vehtari, A., Rubin, D.B.:

Bayesian Data Analysis, 3rd edn. CRC Press, Boca Raton, Florida (2013)


16. Kolmogorov, A.N.: Grundbegriﬀe der Wahrscheinlichkeitsrechnung. Springer,

Berlin (1933). (English translation by N. Morrison, Foundations of the Theory

of Probability. Chelsea, New York (1956))

17. Lindley, D.V.: Comment (on [13]). Am. Stat. 40, 6–7 (1986)

18. Perezgonzalez, J.D.: Fisher, Neyman-Pearson or NHST? a tutorial for teaching

data testing. Front. Psychol. 6, Article 223 (2015). https://doi.org/10.3389/fpsyg.

2015.00223

19. Pollard, D.: A User’s Guide to Measure Theoretic Probability. Cambridge Univer-

sity Press, Cambridge (2002)

20. Savage, L.J.: The Foundations of Statistics. Wiley, New York (1954). (Second

revised edition, Dover Publications, New York (1972))

21. Townsend, J.T., Wenger, M.J.: The serial-parallel dilemma: a case study in a link-

age of theory and method. Psychon. Bull. Rev. 11, 391–418 (2004)

22. Wasserman, L.: All of Statistics: A Concise Course in Statistical Inference.

Springer, New York (2004)

23. Woodward, J.: The Ice Age: A Very Short Introduction. Oxford University Press,

Oxford (2014)

Bayesian Modelling Structural Changes

on Housing Price Dynamics

1

Department of Statistics, Feng Chia University, Taichung, Taiwan

{tthong,chenws}@mail.fcu.edu.tw

2

Department of Economics, Feng Chia University, Taichung, Taiwan

cuonghay@gmail.com

Abstract. This paper examines the impact of the inﬂation rate and

interest rates on housing price dynamics in the U.S. and U.K. hous-

ing markets covering the period of 1991 to 2018. We detect structural

changes based on autoregressive models having exogenous inputs (ARX)

with GARCH-type errors via Bayesian methods. This study conducts

a Bayesian model comparison among three scenario structural-change

models by designing an adaptive Markov chain Monte Carlo sampling

scheme. The results from the deviance information criterion show that

ARX-GARCH models with two structural changes are preferable over

those with no/one structural change in both countries. The estimated

locations of breakpoints in the housing returns are dissimilar when we use

diﬀerent exogenous variables, thus attesting to the importance and neces-

sity of considering structural changes in housing markets. Bayesian esti-

mation results further reveal the diﬀerent impacts of interest rates and

the inﬂation rate on the housing returns in each market. More speciﬁ-

cally, the inﬂation rate has a negative impact on the U.S. housing market

in an economic downturn (including the global ﬁnancial crisis), but no

strong relationship for the other periods and other exogenous variables.

Conversely, we note that interest rates have a reverse inﬂuence on the

U.K. housing market in a recession only and are insigniﬁcant in other

periods and other exogenous inputs. The results are consistent in one

aspect, whereby the house prices are more sensitive during the reces-

sion era.

Keywords: Interest rates · Inflation rate · MCMC method

Segmented ARX-GARCH model

1 Introduction

In modern times, national housing markets have tremendous effects on

economies. The development of a healthy housing sector can spur an economy

through higher aggregate expenditures, job creation and housing turnover. The



household durables.

A house is the most valuable thing many people may ever own, and house

prices strongly correlate with household borrowing and consumption over the

business cycle (Cloyne et al. [18], Miller, Peng, and Sklarz [31]). Most studies in

the literature point out the inﬂuence of house prices on consumption through two

main eﬀects: the wealth eﬀect and the collateral eﬀect. These two eﬀects can be

summarized as when house prices go up, homeowners become better oﬀ and feel

more conﬁdent about their ﬁnance. Some people borrow more against the value

of their home, to spend on goods and services, renovate their house, supplement

their pension, or pay oﬀ other debt. When house prices go down, homeowners

run the risk that their house will be worth less than their outstanding mortgage.

When this occurs, people are more likely to cut down on spending and hold oﬀ

from making personal investments (Aoki, Proudman, and Vlieghe [5], Bajari,

Benkard and Krainer [7], Belsky and Joel [8], Buiter [12]).

Since household consumption is an important part of the economy, house

prices are also a key driver for the whole economy and a signiﬁcant inﬂuence

in a recession (Case, Quigley, and Shiller [13], Mian, Rao, and Suﬁ [28], Mian

and Suﬁ [29]), with the 2007–2008 global ﬁnancial crisis as a typical example

for this. Aside from inﬂuencing consumption, house prices also take on other

key roles; for instance, mortgage markets are important in the transmission of

monetary policy, or adequate house prices can facilitate labor mobility within

an economy and help economies adjust to adverse shocks (Zhu [41]). Another

aspect of their importance is their close relationship to the banking sector in

which, when housing prices drop, borrowers are more likely to default on their
home loans, causing banks to lose money. Many people (mostly seniors)
have used their house as a lifetime mortgage, a type of loan that requires no
repayments before the end of the plan; instead, the lender
collects the debt from the sale of the property after the borrower dies or goes into long-
term care. Under this situation, the precise valuation and prediction of house

prices are very crucial, because any big changes in prices will lead to a great loss

for either the lender or borrower (Longstaﬀ [26]).

Many studies have noted the momentum effect of house prices on an econ-
omy. Therefore, many researchers attempt to predict house prices, look for the
influential factors that drive them, or try to detect the over-
valuation of a housing market (Tsatsaronis and Zhu [38], Englund and

Ioannides [23], Van-Nieuwerburgh and Weill [39]). Our paper enriches the litera-

ture in this ﬁeld by focusing on two of the most important factors aﬀecting house

prices as largely agreed upon by many studies: interest rates and inﬂation rate

(Dougherty and Order [20], Englund and Ioannides [23], McQuinn and O’Reilly


[27]). Compared to other research, we not only examine the inﬂuence of those

two factors on the housing market in a general way, but also investigate the phe-

nomenon that housing prices usually increase into a new benchmark and rarely

come back to their old levels; in other words, it is very likely that structural

breaks exist in these prices.

There is a stream of literature that addresses the issue of parameter insta-

bility by estimating regime-switching models and by searching for structural

breaks under a predictive relation between equity returns and explanatory vari-

ables (Pesaran and Timmermann [33], Stock and Watson [35], Brown, Song, and

McGillivray [11], Dong et al. [19]). Several studies show that housing prices
have experienced many shocks in the past. It is thus worthwhile to detect any

structural breaks in a housing market and to study in greater detail each regime

divided by break points (Pain and Westaway [32], Andrew and Meen [3]). More-

over, as Miles [30] highlights, failure to incorporate volatility clustering may

lead to an inaccurate modeling of home prices. Many studies have employed the

autoregressive conditional heteroskedastic (ARCH) family of models, initiated

by Engle [22] and Bollerslev [9], to model dynamic volatility.

To cope with the aforementioned characteristics’ impact on housing prices,

such as autoregression, exogenous factors, structural changes, volatility cluster-
ing, and parameter change, we propose modelling the housing price dynamics by
segmented autoregressive models with exogenous inputs (ARX) and GARCH-
type errors. This segmented ARX-GARCH model is also known as a piecewise ARX-
GARCH model, in which the boundaries between the segments are breakpoints.

One may interpret a breakpoint as a critical or threshold value beyond or below

which some eﬀects occur. These breakpoints are important to policymakers. In

this study the exogenous variables are either inﬂation rate or interest rates.

Choosing the number of breakpoints for the segmented ARX-GARCH model

is crucial. More precisely, we detect structural changes based on the ARX

model with GARCH-type errors and ﬁt a segmented ARX-GARCH model simul-

taneously via Bayesian methods. Chen, Gerlach, and Liu [16] illustrate accu-

rate Bayesian estimation and inference based on a time-varying heteroskedastic

regression model, which allows for multiple structural changes. In this study we

examine the eﬀect of exogenous variables on the proposed model. We choose

an optimal number of breakpoints using the deviance information criterion
(DIC) proposed by Spiegelhalter et al. [34]. DIC

is a Bayesian version or generalization of the famous Akaike Information Cri-

terion (AIC). Overall, the Bayesian structural change approach used herein is

able to detect the presence of breaks, determine the number of breaks, estimate

both the time of the occurrence and the parameter values around the time of

the breaks, and show the influence of exogenous variables on the target series.

By applying the above mentioned method to monthly house prices in the U.S.

and the U.K. from January 1991 to February 2018 and utilizing two exogenous

variables (inﬂation rate and nominal interest rates) in each country, we illustrate

that the existence of structural change is reasonable and detect two breakpoints
in both housing markets (see also Pain and Westaway [32], Andrew and Meen [3]). The
detected breakpoints are dissimilar when we change exogenous variables, and

the inﬂuences of these two exogenous variables toward housing prices in each

regime (the time periods separated by the detected breakpoints) in each coun-

try are totally diﬀerent, thus proving the essence of examining the structural

breaks model. Speciﬁcally, the inﬂation rate has a negative impact on the U.S.

housing market in an economic downturn (including the global ﬁnancial crisis),

but no strong relationship for the other periods and other exogenous variables.

By contrast, we note that interest rates have a reverse inﬂuence on the U.K.

housing market in a recession only and are insigniﬁcant in other periods and

other exogenous inputs. The results are consistent in one aspect, whereby the

house prices are more sensitive during the recession era.

The rest of the paper proceeds as follows. Section 2 introduces the detection

of structural breaks in a segmented ARX-GARCH model. Section 3 describes

Bayesian inferences for the proposed model. Section 4 presents the data descrip-

tion used herein. Section 5 displays the results and our discussion. Section 6 pro-

vides concluding remarks.

2 Detecting Structural Breaks in a Segmented ARX-GARCH Model

A large literature tests for structural breaks in
time series. Andrews [4] uses Wald, Lagrange multiplier, and likelihood ratio-like

tests for parameter instability and structural change with an unspeciﬁed num-

ber of breakpoints. Hansen [25] examines the breakpoints using the bootstrap

approach. To decide the number of breakpoints, Yao [40] exploits the Bayesian

information criterion (BIC), while Bai and Perron [6] propose a SupWald type

test. Elliott and Muller [21] suggest that the relationship between particular

variables may change substantially over time based on a J-test. However, those

studies focus on testing for the existence of structural breaks instead of analyzing

properties of the estimated breakdates, and they ignore both the autoregressive

component in returns and the conditional heteroskedasticity of asset returns,

which are often statistically significant; omitting them may distort

the results. To overcome those drawbacks, Chen et al. [16] generalize the stan-

dard return prediction model subject to the presence of structural breaks, which

includes both AR and heteroskedastic components.

Under a similar concept, we incorporate structural changes in an autoregres-

sive model with exogenous inputs (ARX), while allowing the conditional vari-

ances to follow a GARCH model. We thus consider the segmented ARX-GARCH model:


$$
r_t=\begin{cases}
\phi^{(1)\prime}\,\mathbf r_{t-1}+\psi^{(1)\prime}\,x_{t-1}+a_t, & t\le T_1,\\
\phi^{(2)\prime}\,\mathbf r_{t-1}+\psi^{(2)\prime}\,x_{t-1}+a_t, & T_1<t\le T_2,\\
\qquad\vdots & \qquad\vdots\\
\phi^{(k+1)\prime}\,\mathbf r_{t-1}+\psi^{(k+1)\prime}\,x_{t-1}+a_t, & T_k<t\le n,
\end{cases}
$$

$$
a_t=\sqrt{h_t}\,\varepsilon_t;\qquad \varepsilon_t\sim N(0,1), \tag{1}
$$

$$
h_t=\begin{cases}
\alpha_0^{(1)}+\sum_{i=1}^{m}\alpha_i^{(1)}a_{t-i}^2+\sum_{j=1}^{s}\beta_j^{(1)}h_{t-j}, & t\le T_1,\\
\alpha_0^{(2)}+\sum_{i=1}^{m}\alpha_i^{(2)}a_{t-i}^2+\sum_{j=1}^{s}\beta_j^{(2)}h_{t-j}, & T_1<t\le T_2,\\
\qquad\vdots\\
\alpha_0^{(k+1)}+\sum_{i=1}^{m}\alpha_i^{(k+1)}a_{t-i}^2+\sum_{j=1}^{s}\beta_j^{(k+1)}h_{t-j}, & T_k<t\le n.
\end{cases}
$$

Here, r_t is the asset return at time t; r_{t−1} is the 2-dimensional vector (1, r_{t−1})′,
allowing an intercept and an AR(1) term; x_{t−1} is a p-dimensional vector of
exogenous variables or leading indicators; φ^(i) = (φ0^(i), φ1^(i))′ and ψ^(i) =
(ψ1^(i), . . . , ψp^(i))′ are the regime-specific coefficient vectors;
and the volatility h_t is recognized to be time-varying. This structural change model
in (1) is piecewise linear over time.
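To make the recursion concrete, the following minimal Python sketch (not the authors' code) evaluates the conditional Gaussian log-likelihood implied by the model in (1) for GARCH(1,1) errors (m = s = 1); the function and argument names, and the initialization of the volatility at the sample variance, are illustrative assumptions.

```python
import numpy as np

def seg_arx_garch_loglik(r, x, phi, psi, alpha, breaks):
    """phi[i] = (phi0, phi1), psi[i], and alpha[i] = (a0, a1, b1) for regime i;
    breaks = sorted breakpoint times (T1, ..., Tk)."""
    n = len(r)
    h, a_prev, ll = np.var(r), 0.0, 0.0              # assumed start of the recursion
    for t in range(1, n):
        i = int(np.searchsorted(breaks, t))          # regime index: i breakpoints lie below t
        mu = phi[i][0] + phi[i][1] * r[t - 1] + psi[i] * x[t - 1]
        a0, a1, b1 = alpha[i]
        h = a0 + a1 * a_prev ** 2 + b1 * h           # GARCH(1,1) volatility recursion
        a_prev = r[t] - mu                           # a_t = r_t - mu_t
        ll += -0.5 * (np.log(2.0 * np.pi * h) + a_prev ** 2 / h)
    return ll
```

Within the MCMC scheme of Sect. 3, a function of this kind would supply the likelihood part of each conditional posterior evaluation.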

3 Bayesian Inference

In order to make inferences and compare models, we estimate the unknown

parameters of model (1) in a Bayesian framework. Define θ = (φ′, α′, γ_k′)′
as the set of all unknown parameters, where φ = (φ1′, . . . , φ_{k+1}′)′, φ_i =
(φ^(i)′, ψ^(i)′)′, γ_k = (T1, . . . , Tk)′, α = (α1′, . . . , α_{k+1}′)′, and α_i = (α0^(i),
α1^(i), . . . , αm^(i), β1^(i), . . . , βs^(i))′. The conditional likelihood function is

$$
L(\mathbf r_{2,n}\mid r_1,\theta)=\prod_{t=2}^{n} P\!\left(\frac{r_t-\mu_t}{\sqrt{h_t}}\right)\times I_{it}, \tag{2}
$$

where I_it is an indicator variable such that I_it = I(T_{i−1} < t ≤ T_i), i = 1, . . . , k + 1,
T_0 = 1, T_{k+1} = n, and μ_t = E(r_t | r_{1,t−1}, x_{t−1}) = φ^(i)′ r_{t−1} + ψ^(i)′ x_{t−1}. We consider
the following restrictions on the parameters to ensure covariance stationarity
and positive variances:

$$
\alpha_0^{(i)}>0;\quad \alpha_j^{(i)},\,\beta_j^{(i)}>0\ \text{ and }\ \sum_{j=1}^{m}\alpha_j^{(i)}+\sum_{j=1}^{s}\beta_j^{(i)}<1,\qquad i=1,\cdots,k+1. \tag{3}
$$


We assume that each φ_i follows a trivariate Gaussian prior, N3(φ_{i0}, Σ_i), i = 1, . . . , k + 1, constrained for mean stationarity,

where φi0 = 0, and Σ i is a matrix with large numbers on the diagonal ele-

ments. The prior set-up of the breakpoint parameters γ k comes from the idea

of Chen et al. [16]; we employ a continuous but constrained uniform prior on

γ k , subsequently discretizing the estimates so that they become an actual time

index.

Without loss of generality, we explain how to set up a prior for breakpoints

with k = 2. The continuous versions are constrained in two ways: the ﬁrst ensures

that T1 < T2 as required; while the second ensures that a suﬃcient sample size

exists in each regime for estimation. We assume priors for T1 and T2 are as

follows:

T1 ∼ Unif(a1 , b1 ) ; T2 |T1 ∼ Unif(a2 , b2 ),

where a1 and b1 are the 100hth and 100(1−2h)th percentiles of the set of integers

1, 2, . . . , n, respectively; e.g. h = 0.1 means that T1 ∈ (0.1n, 0.8n). Moreover, b2

is the 100(1 − h)th percentile of 1, 2, . . . , n and a2 = T1 + hn, so that at least

100h% of observations are in the range (T1 , T2 ). Consequently, the priors for T1

and T2 are uninformative and ﬂat over the region, ensuring T1 < T2 and at least

100h% of observations are in each regime. We assume that αi follows a uniform

prior, p(αi ) ∝ I(Si ), for i = 1, · · · , 3, where Si is the set that satisﬁes (3).
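As an illustration, a draw from this constrained breakpoint prior with k = 2 can be sketched as follows; the function name and the h = 0.1 default mirror the example in the text and are otherwise assumptions:

```python
import numpy as np

def draw_breakpoints(n, h=0.1, rng=None):
    rng = rng or np.random.default_rng()
    t1 = rng.uniform(h * n, (1 - 2 * h) * n)     # T1 ~ Unif(a1, b1)
    t2 = rng.uniform(t1 + h * n, (1 - h) * n)    # T2 | T1 ~ Unif(a2, b2)
    return int(round(t1)), int(round(t2))        # discretize to actual time indices
```

By construction T1 < T2 and each regime retains at least 100h% of the observations.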

The prior for each grouping of parameters, π(.), multiplied by the likelihood

function in (2) leads to the conditional posterior distributions. The conditional

posterior distributions for γ k , φi , and αi , i = 1, . . . , k + 1 have non-standard

forms. We therefore employ the Metropolis and MH (Metropolis et al., 1953;

Hastings, 1970) methods to draw the MCMC iterates for the γ k , φi and αi ,

i = 1, . . . , k +1 groups. We use an adaptive MH MCMC algorithm that combines

a random-walk Metropolis and an independent kernel MH algorithm to speed

up convergence and achieve good mixing. Many studies in the literature have

successfully utilized this technique, e.g. Chen and So [17], Chen, Gerlach, and

Lin [14,15], and Gerlach, Chen, and Chan [24], etc. Chen and So [17] illustrate

the detailed procedures of random-walk Metropolis and an independent kernel

MH algorithm.
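A schematic random-walk Metropolis update for one parameter block is sketched below; `log_post` stands for the log of the likelihood (2) times the corresponding prior, and the proposal scale `step` would be tuned adaptively during burn-in (all names are illustrative, not the authors' implementation):

```python
import numpy as np

def rw_metropolis_step(block, log_post, step, rng):
    proposal = block + step * rng.standard_normal(block.shape)
    if np.log(rng.uniform()) < log_post(proposal) - log_post(block):
        return proposal, True    # accept the move
    return block, False          # stay at the current value
```

The independent-kernel MH stage would replace the random-walk proposal by draws from a proposal density fitted to earlier iterates.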

4 Data Description

In this study we use interest rates and inﬂation rate in turn as an exogenous

variable x in Eq. (1). The dataset consists of the monthly house price index,

interest rates, and consumer price index (CPI) in the U.S. and U.K. from January

1991 to February 2018. All data are from the Federal Reserve Bank of St. Louis

database. We calculate the percentage log returns for the housing market as

rt = (ln Pt − ln Pt−1 ) × 100, where rt is the percentage log returns at time t and

Pt is the house price index at time t. We calculate the inﬂation rate, subject to an

increase in the consumer price index, as it = ((CPIt − CPIt−1 ) /CPIt−1 ) × 100,

in which it is the inﬂation rate at month t and CPIt is the consumer price index

at time t.
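For reference, these two transformations can be written in a few lines of Python; the pandas series names (`price`, `cpi`) are hypothetical:

```python
import numpy as np
import pandas as pd

def pct_log_return(price: pd.Series) -> pd.Series:
    return 100.0 * np.log(price).diff()   # r_t = (ln P_t - ln P_{t-1}) x 100

def inflation_rate(cpi: pd.Series) -> pd.Series:
    return 100.0 * cpi.pct_change()       # i_t = (CPI_t - CPI_{t-1}) / CPI_{t-1} x 100
```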

Table 1 presents the summary statistics of all variables. In general, the U.K. has higher

average housing returns and average interest rate, while the U.S. has a higher

average inﬂation rate during the covered time period. The U.K. market seems to

be more variable and volatile than the U.S. market, since all of the U.K.’s three

variables have higher variance and standard deviation than those for the U.S.

These phenomena also arise because the U.K.'s variables have

more extreme values in min and max. The Augmented Dickey-Fuller test reveals

that the inﬂation rates in both countries are stationary, while housing returns

and interest rates in both markets are non-stationary. These results hence lend

a support to the necessity of examining the housing returns with a structural

break model. Our model includes the ﬁrst-diﬀerenced monthly interest rates in

both countries, as those series are stationary and eﬃcient to be an exogenous

variable in our model.
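A sketch of this stationarity screening with statsmodels follows; the 5% cutoff and the variable name are assumptions:

```python
from statsmodels.tsa.stattools import adfuller

def adf_pvalue(series):
    return adfuller(series.dropna())[1]   # the second element is the p-value

# Difference the interest rate once when its level is non-stationary:
# rate_input = rate.diff().dropna() if adf_pvalue(rate) > 0.05 else rate
```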

Table 1. Summary statistics of monthly housing returns, interest rates, and inflation
rates.

                                Mean     Max      Min      Variance  ADF^a
U.S.  Housing return            0.296    1.156    −1.752   0.208     0.366
      Interest rate             2.970    7.170    0.110    5.143     0.438
      Interest rate change^b    −0.017   0.800    −1.960   0.051     0.010
      Inflation rate            0.191    1.377    −1.771   0.066     0.010
U.K.  Housing return            0.424    3.427    −3.161   0.866     0.128
      Interest rate             4.233    13.857   0.298    8.852     0.310
      Interest rate change^b    −0.041   0.577    −1.760   0.055     0.010
      Inflation rate            0.180    3.454    −0.964   0.147     0.020
^a P-value for the Augmented Dickey-Fuller test.
^b The first differenced interest rate.

Figures 1, 2, and 3 exhibit the time plots of housing returns, the ﬁrst diﬀer-

enced interest rates, and inﬂation rates in the U.S. and U.K. markets, respec-

tively. From Fig. 1, we observe during the 2007–2009 global ﬁnancial crisis period

that there are big drops in housing returns for both countries, but they then

recover afterwards. The first differenced interest rates in Fig. 2 also witness

a big decrease in both countries during the crisis. The U.K. also experienced

another shock in interest rates during the early 1990s. Both countries go through

their own inﬂation rate shock period as Fig. 3 presents, with the U.S. in 2007–

2008 crisis period and the U.K. in the recession of 1991–1992. Those big volatil-

ities in all three markets in both countries lend support for the necessity of

structural change detection in our study.


Fig. 1. Time plots of the U.S. monthly housing returns (upper panel) and U.K. monthly

housing returns (lower panel).

5 Results and Discussion

Motivated by Bollerslev, Chou, and Kroner [10], we consider a segmented ARX

with GARCH(1,1) volatility model in (1) since the GARCH(1,1) appears to

be suﬃcient to explain the volatility development for most return series. We

examine the eﬀects of interest rates and inﬂation rate upon housing returns

separately in each housing market. This study uses DIC to determine the optimal

number of breakpoints in each ARX-GARCH model. The best model has the

smallest DIC value. A description of the DIC procedure can be found in Truong, Chen,

and So [37].
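Recall that DIC = D̄ + pD, where D(θ) = −2 log L(θ) is the deviance, D̄ its posterior mean, and pD = D̄ − D(θ̄) the effective number of parameters. A minimal sketch from MCMC output (the names `loglik` and `draws` are assumptions) is:

```python
import numpy as np

def dic(loglik, draws):
    dev = np.array([-2.0 * loglik(th) for th in draws])  # deviance at each retained draw
    d_bar = dev.mean()
    theta_bar = np.mean(draws, axis=0)                   # posterior mean of the parameters
    p_d = d_bar + 2.0 * loglik(theta_bar)                # pD = D_bar - D(theta_bar)
    return d_bar + p_d
```

For the discrete breakpoint parameters, a posterior mode (an actual time index) would typically be plugged in instead of the mean.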

In the MCMC framework, we set up the initial values for each parameter

as φi = (0.05, 0.05, 0.05) and αi = (0.1, 0.1, 0.1); the initial value for a single

breakpoint date is the median of the sample dates, and for the two breakpoints


Fig. 2. Time plots of the ﬁrst diﬀerenced monthly interest rates of the U.S. (upper

panel) and U.K. (lower panel).

the values are the 25th and 75th percentile dates. We perform 20,000 MCMC

iterations and discard the ﬁrst 10,000 iterates for each analyzed data series.

Table 2 presents the results of three scenario structural-change models, i.e.

DIC values of each ARX-GARCH model for both markets. We first consider

k = 0, 1, and 2 and use DIC to determine how many breakpoints are ideal in each

case. The table also reports the breakpoint locations detected by our model in

each case. We do not consider a model with more than two breakpoints, because

the sample size is quite small (the sample size equals 326 for each market).

The DIC results reveal that models without structural break (k = 0) are least

preferred and two structural breaks (k = 2) are most preferred in all cases.

The result is fully in accordance with our expectation of structural change in

housing returns, lending support to the correctness of using the structural change

model. Figures 4 and 5 provide the housing returns in both countries along with


Fig. 3. Time plots of the inﬂation rate of the U.S. monthly CPI (upper panel) and the

U.K. monthly CPI (lower panel).

the breakpoint locations based on the most preferred model from Table 2. We

indicate the breakdates by vertical dashed lines.

For comparison, we also ﬁt a two-breakpoint model with Student t inno-

vations to each housing series. The results do not show superiority among all

the competing models based on the DIC criterion. Therefore, we do not report

the estimates here to save space. Tables 3 and 4 establish the posterior means,

medians, standard errors, and 95% credible intervals according to Eq. (1) with

two breakpoints for both countries. We use interest rates and inﬂation rate sepa-

rately as the exogenous variable to examine their inﬂuences on the housing prices

in each regime (we call each time period divided by breakpoints a regime and

name them as regimes I, II, and III). When considering the interest rate case,

two detected breakdates are similar in both countries: June 2006 and January

2012. Regime II in this case seems to capture the full global ﬁnancial crisis in

2007–2009, while regimes I and III capture more stable period of house prices.


Our target coefficients ψ1^(i), i = 1, 2, 3, show that the impact of interest
rates on house prices is unnoticeable in all three regimes for the U.S. market. In
contrast, the interest rates have a negative influence on the U.K. house prices in
regime II (volatile regime), while the estimates of ψ1^(1) and ψ1^(3) are insignificant
when we consider regimes I and III.

Table 2. DIC values of the segmented ARX-GARCH models used to select an optimal
number of breakpoints.

       Exogenous variable  Model   DIC        Breakpoint location(s)
U.S.   Interest rate       k = 0   −502.113
                           k = 1   −507.313   T1 = 214
                           k = 2   −556.403   T1 = 186, T2 = 253
       Inflation rate      k = 0   −501.938
                           k = 1   −545.689   T1 = 115
                           k = 2   −568.747   T1 = 114, T2 = 242
U.K.   Interest rate       k = 0   −502.133
                           k = 1   −506.516   T1 = 180
                           k = 2   −559.334   T1 = 186, T2 = 253
       Inflation rate      k = 0   173.231
                           k = 1   158.682    T1 = 200
                           k = 2   131.736    T1 = 67, T2 = 163
A bold number indicates the model with the smallest DIC value in each case.
Tj is the location of breakpoints (j = 1, 2).

The results above go against the viewpoint about the virtually uncontested

importance of interest rates toward the housing market. However, this is not

unrealistic since several studies struggle to achieve credible results concerning the

impact of interest rates (McQuinn and O’Reilly [27]). A possible reason for this

unusualness is that interest rates cannot increase or reduce the overall demand

for housing in the short run, but they just move it around. All countries including

the U.S. have to provide a huge amount of additional houses to their increasing

population every year. Faced with higher interest rates, people may delay plans

to buy a new house, and shift their demand from buying to renting. The higher

demand for renting will bring about more investors to invest in this market, and

as a result housing prices do not change quickly. The other explanation for the

ineﬀectiveness of interest rates is that they tend to rise when the economy is

growing, and in this situation people can aﬀord more and are still willing to buy

a house, which does not change the demand for both houses and house prices.

Another reason from Sutton, Mihaljek, and Subelyte [36] is that house prices

adjust to interest rate changes gradually over time, but not immediately. That

could be why the one-lag exogenous variable used in our model does not capture
the effect of interest rates on house prices in the U.S.


Fig. 4. Time plots of the U.S. housing returns with detected breakpoints in accordance

with a segmented ARX-GARCH model. The exogenous variable is interest rate (upper

panel) and inﬂation rate (lower panel). The two estimated breakdates are indicated by

dashed (red) vertical lines.

When we use the inflation rate as the exogenous variable for housing
returns, the two detected breakdates are different in the two countries (June 2000
and February 2011 for the U.S. and July 1996 and July 2004 for the U.K.). The
U.S. case still keeps the same characteristics in that regime II covers the full
global financial crisis, while regimes I and III cover the more stable time. Regime
II also displays a high level of persistence (α1^(2) + β1^(2)) across markets when we fit
a segmented ARX-GARCH model with inflation rate as the exogenous variable
for the U.S. market. In contrast, for the U.K. case, regime II captures the more
stable period while regimes I and III capture the volatile time of the market.
The different regimes' characteristics in the U.K. when using the inflation rate


Fig. 5. Time plots of the U.K. housing returns with detected breakpoints in accordance

with a segmented ARX-GARCH model. The exogenous variable is interest rate (upper

panel) and inﬂation rate (lower panel). The two estimated breakdates are indicated by

dashed (red) vertical lines.

compared to other cases are precise since regime I captures the sudden high

inﬂation rate during the early 1990s in this country and regime III captures the

global ﬁnancial crisis.

The estimations of ψ1^(i), i = 1, 2, 3, in the U.K. case suggest an irrelevant
connection between inflation rate and house prices in this country. Conversely,
the estimation of ψ1^(2) in the U.S. is negatively significant, while those of ψ1^(1) and
ψ1^(3) are not, revealing the fact that the inflation rate only has converse effects
on house prices in a more unstable period. This phenomenon leads to the main
implication that the inflation rate, which is often associated with higher prices,


Table 3. Bayesian estimation results of Segmented ARX-GARCH model for the U.S.

monthly housing returns.

Interest rate
Regime  Parameter  Mean     Median   Std. err.  2.5%     97.5%
I       φ0(1)      0.1723   0.1726   0.0287     0.1174   0.2303
        φ1(1)      0.5975   0.5967   0.0617     0.4809   0.7229
        ψ1(1)      −0.0563  −0.0558  0.0813     −0.2135  0.1097
        α0(1)      0.0258   0.0255   0.0068     0.0150   0.0402
        α1(1)      0.0690   0.0653   0.0287     0.0207   0.1404
        β1(1)      0.2441   0.2381   0.1249     0.0284   0.4976
II      φ0(2)      −0.1991  −0.1965  0.0744     −0.3503  −0.0632
        φ1(2)      0.4348   0.4339   0.1331     0.1759   0.6997
        ψ1(2)      −0.2471  −0.2432  0.2525     −0.7576  0.2463
        α0(2)      0.1118   0.1115   0.0510     0.0190   0.2110
        α1(2)      0.1916   0.1594   0.1389     0.0178   0.5582
        β1(2)      0.3888   0.3635   0.2063     0.0494   0.8386
III     φ0(3)      0.4911   0.4892   0.0643     0.3690   0.6229
        φ1(3)      0.0165   0.0193   0.1204     −0.2350  0.2484
        ψ1(3)      0.3254   0.3221   0.4400     −0.5697  1.1797
        α0(3)      0.0260   0.0260   0.0090     0.0090   0.0442
        α1(3)      0.1368   0.1023   0.1181     0.0024   0.4427
        β1(3)      0.2564   0.2231   0.1805     0.0133   0.6878
        T1 (June 2006)      186      186        2.295    181      190
        T2 (January 2012)   253      252        1.322    250      257
Inflation rate
I       φ0(1)      0.1803   0.1791   0.0445     0.0976   0.2696
        φ1(1)      0.3926   0.3952   0.1016     0.1924   0.5934
        ψ1(1)      0.0111   0.0091   0.1370     −0.2539  0.2829
        α0(1)      0.0259   0.0257   0.0042     0.0179   0.0343
        α1(1)      0.2453   0.2432   0.0690     0.1156   0.3857
        β1(1)      0.1999   0.2007   0.0578     0.0822   0.3119
II      φ0(2)      0.1525   0.1514   0.0364     0.0829   0.2261
        φ1(2)      0.8303   0.8317   0.0564     0.7170   0.9368
        ψ1(2)      −0.2293  −0.2299  0.0775     −0.3804  −0.0763
        α0(2)      0.0006   0.0006   0.0004     0.0000   0.0015
        α1(2)      0.0302   0.0302   0.0026     0.0251   0.0353
        β1(2)      0.8958   0.8959   0.0049     0.8859   0.9052
III     φ0(3)      0.4557   0.4556   0.0615     0.3401   0.5768
        φ1(3)      0.0707   0.0685   0.1218     −0.1678  0.3061
        ψ1(3)      0.0092   0.0108   0.1186     −0.2222  0.2479
        α0(3)      0.0077   0.0076   0.0016     0.0045   0.0109
        α1(3)      0.1162   0.1149   0.0393     0.0419   0.1985
        β1(3)      0.7307   0.7315   0.0325     0.6684   0.7942
        T1 (June 2000)      114      113        2.1095   110      118
        T2 (February 2011)  242      242        1.3340   236      242
Tj is the location of breakpoints (j = 1, 2).


Table 4. Bayesian estimation results of Segmented ARX-GARCH model for the U.K.

monthly housing returns.

Interest rate
Regime  Parameter  Mean     Median   Std. err.  2.5%     97.5%
I       φ0(1)      0.1825   0.1814   0.0287     0.1283   0.2396
        φ1(1)      0.5791   0.5804   0.0606     0.4615   0.6936
        ψ1(1)      0.0540   0.0558   0.0699     −0.0806  0.1849
        α0(1)      0.0283   0.0278   0.0068     0.0163   0.0430
        α1(1)      0.0692   0.0661   0.0316     0.0205   0.1409
        β1(1)      0.1941   0.1809   0.1186     0.0165   0.4319
II      φ0(2)      −0.1882  −0.1848  0.0700     −0.3349  −0.0592
        φ1(2)      0.4432   0.4442   0.1295     0.1803   0.6986
        ψ1(2)      −0.5238  −0.5211  0.2563     −1.0370  −0.0312
        α0(2)      0.1133   0.1104   0.0502     0.0189   0.2111
        α1(2)      0.1821   0.1531   0.1257     0.0114   0.5090
        β1(2)      0.3546   0.3336   0.2040     0.0284   0.8046
III     φ0(3)      0.4964   0.5000   0.0646     0.3765   0.6190
        φ1(3)      0.0132   0.0105   0.1208     −0.2109  0.2585
        ψ1(3)      −0.4124  −0.4153  0.5481     −1.5044  0.6554
        α0(3)      0.0283   0.0275   0.0083     0.0135   0.0484
        α1(3)      0.1226   0.0998   0.0944     0.0043   0.3389
        β1(3)      0.2133   0.1953   0.1363     0.0103   0.5014
        T1 (June 2006)      186      185        2.205    181      190
        T2 (January 2012)   253      252        1.282    249      256.5
Inflation rate
I       φ0(1)      0.0060   0.0035   0.1398     −0.2668  0.2750
        φ1(1)      −0.1891  −0.1900  0.1540     −0.4855  0.1082
        ψ1(1)      −0.0347  −0.0386  0.2067     −0.4548  0.3729
        α0(1)      0.3507   0.3720   0.0903     0.1492   0.4633
        α1(1)      0.2216   0.1912   0.1524     0.0124   0.5712
        β1(1)      0.4300   0.4317   0.1752     0.0925   0.7567
II      φ0(2)      0.9147   0.9165   0.1665     0.5898   1.2382
        φ1(2)      0.1616   0.1605   0.1383     −0.1048  0.4412
        ψ1(2)      −0.0269  −0.0245  0.2148     −0.4466  0.3831
        α0(2)      0.3305   0.3423   0.0864     0.1380   0.4570
        α1(2)      0.2100   0.1797   0.1411     0.0194   0.5689
        β1(2)      0.2537   0.2188   0.1756     0.0188   0.6546
III     φ0(3)      0.2698   0.2686   0.0599     0.1546   0.3879
        φ1(3)      0.2789   0.2767   0.0937     0.0945   0.4524
        ψ1(3)      −0.1717  −0.1761  0.1576     −0.4732  0.1398
        α0(3)      0.0470   0.0389   0.0344     0.0039   0.1271
        α1(3)      0.2373   0.2213   0.0999     0.0891   0.4752
        β1(3)      0.6488   0.6693   0.1426     0.3427   0.8736
        T1 (July 1996)      67       66.0000    2.1966   64.0000  72.5000
        T2 (July 2004)      163      162.0000   5.0665   157.0000 169.0000
Tj is the location of breakpoints (j = 1, 2).


Fig. 6. The ACF plots of the MCMC estimates of the parameters θ from a segmented
ARX-GARCH model for the U.S. housing returns. The exogenous variable: inflation
rate.

makes them retract during a recession. This goes against the conclusions in many

previous studies that house prices exhibit a stable inﬂation hedge (Anari and

Kolari [2], Abelson et al. [1]), but this is explainable. Recall

that the inﬂation rate is calculated using CPI, which does not include house

prices. During a stagnant economy, the inﬂation rate or higher prices mean that

households have to spend more on consumption, and their budget will be lower


Fig. 7. The trace plots of the MCMC estimates of the parameters θ from a segmented
ARX-GARCH model for the U.S. housing returns. The exogenous variable: inflation
rate.

in response. For this reason, the demand for houses will decrease, leading to a

drop in prices. Therefore, the negative causation of inﬂation rate to house prices

in a weak economic period is reasonable. Previous studies also noted this reverse

relationship (see Tsatsaronis and Zhu [38]).

To verify the estimation of our model, we provide the ACF plots and trace

plots of each estimated coeﬃcient of the segmented ARX-GARCH model. Trace


Fig. 8. The volatility estimates of a segmented ARX-GARCH model for the U.S. hous-

ing returns. The exogenous variable is the ﬁrst diﬀerenced interest rate (upper panel)

and inﬂation rate (lower panel). The two estimated breakdates are indicated by dashed

(red) vertical lines.

plots provide an important tool for assessing the mixing of a chain. Due to

limited space, we provide the results in the case of the U.S. inﬂation rate as

one example in Figs. 6 and 7. The other results are available upon request. In

summary, all ACF plots die down quickly, indicating convergence of the chains.

Upon visual inspection of the trace plots, all MCMC samples seem to mix well.

Figure 8 presents the volatility estimates for the U.S. housing returns with

detected breakpoints in accordance with the segmented ARX-GARCH model.

Even when the exogenous variables are diﬀerent, both panels capture the struc-

tural breaks in volatility very well since the most volatile periods fall entirely in

regime II, while regimes I and III cover the stable periods.


Fig. 9. The volatility estimates of a segmented ARX-GARCH model for the U.K.

housing returns. The exogenous variable is the ﬁrst diﬀerenced interest rate (upper

panel) and inﬂation rate (lower panel). The two estimated breakdates are indicated by

dashed (red) vertical lines.

Figure 9 demonstrates the volatility estimates for the U.K. housing market.

Once again, the breakpoints detected by our model capture the structural changes
of volatility very accurately. In the interest rate case, the most volatile period
falls completely in regime II, while the other regimes cover the stable times of
the market. When we use the inflation rate as the exogenous variable, even though
the detected breakdates differ from the previous case due to the sudden
fluctuation of the U.K. inflation rate during the early 1990s, those breakpoints
capture the whole stable period in regime II, while the fluctuating periods are in
regimes I and III. The results once more demonstrate the precision of our method.


6 Conclusion

This research studies the impacts of interest rates and inﬂation rate on hous-

ing prices in the U.S. and U.K. by employing the Bayesian structural changes

approach. The breakpoint detection process using DIC shows a preference for

the two-breakpoint models (k = 2) compared to the no/one breakpoint model.

These findings support the use of the structural change model on the

housing price market.

The estimated results reveal two contrasting situations in two countries. The

interest rates have an insignificant effect on house prices in the U.S., while only

having negative eﬀects on house prices in an economic downturn in the U.K.

There are several reasons for this: interest rates do not change overall demand

for housing, but rather just move it from buying demand to renting demand;

higher interest rates could be a result of a better economy in which people have

greater ability to aﬀord a new house; or interest rate changes inﬂuence house

prices gradually over time, but not immediately, and thus one lag does not

describe this relationship. When considering the inﬂation rate case, there is no

relation between it and house prices in the U.K., while it only causes a negative

impact on house prices in the U.S. during a volatile time of the economy. One

likely reason is that a higher inﬂation rate leads to a lower budget for households,

and hence demand and housing price also turn lower in response. However, the

opposite outcomes in the two countries are consistent in one aspect: house prices

in both countries are more sensitive in recession times.

This paper suggests that future research should consider the importance of exam-
ining structural changes in the housing market. Ignoring this influential char-
acteristic may lead to misleading conclusions about housing price performance.

Acknowledgements. This work was supported by the Ministry of Science
and Technology, Taiwan (MOST 107-2118-M-035-005-MY2).

References

1. Abelson, P., Joyeux, R., Milunovich, G., Chung, D.: Explaining house prices in

Australia: 1970–2003. Econ. Rec. 81, 96–103 (2005)

2. Anari, A., Kolari, J.: House prices and inﬂation. Real Estate Econ. 30, 67–84

(2002)

3. Andrew, M., Meen, G.: House price appreciation, transactions and structural

change in the British housing market: a macroeconomic perspective. Real Estate

Econ. 31, 99–116 (2003)

4. Andrews, D.W.K.: Tests for parameter instability and structural change with

unknown changepoint. Econometrica 61, 821–856 (1993)

5. Aoki, K., Proudman, J., Vlieghe, G.: House prices, consumption, and monetary

policy: a ﬁnancial accelerator approach. J. Financ. Intermediation 13, 414–435

(2004)

6. Bai, J., Perron, P.: Estimating and testing linear models with multiple structural

changes. Econometrica 66, 47–78 (1998)


7. Bajari, P., Benkard, L., Krainer, J.: House prices and consumer welfare. J. Urban

Econ. 58, 474–487 (2005)

8. Belsky, E., Joel, P.: Housing wealth eﬀects: housing’s impact on wealth accumula-

tion, wealth distribution and consumer spending. National Center for Real Estate

Research Report (2004)

9. Bollerslev, T.: Generalized autoregressive conditional heteroskedasticity. J. Econo-

metrics 31, 307–327 (1986)

10. Bollerslev, T., Chou, R.Y., Kroner, K.F.: ARCH modeling in ﬁnance: a review of

the theory and empirical evidence. J. Econometrics 52, 5–59 (1992)

11. Brown, J.P., Song, H., McGillivray, A.: Forecasting UK house prices: a time vary-

ing coeﬃcient approach. Econ. Model. 14, 529–548 (1997)

12. Buiter, W.H.: Housing wealth isn’t wealth, Working Paper, London School of

Economics and Political Science (2008)

13. Case, K.E., Quigley, J.M., Shiller, R.J.: Wealth effects revisited: 1975–2012, Work-

ing Paper (2013)

14. Chen, C.W.S., Gerlach, R., Lin, A.M.H.: Falling and explosive, dormant and rising

markets via multiple-regime ﬁnancial time series models. Appl. Stochast. Models

Bus. Ind. 26, 28–49 (2010)

15. Chen, C.W.S., Gerlach, R., Lin, E.M.H.: Volatility forecast using threshold het-

eroskedastic models of the intra-day range. Comput. Stat. Data Anal. 52, 2990–

3010 (2008)

16. Chen, C.W.S., Gerlach, R., Liu, F.C.: Detection of structural breaks in a time-

varying heteroscedastic regression model. J. Stat. Plann. Infer. 141, 3367–3381

(2011)

17. Chen, C.W.S., So, M.K.P.: On a threshold heteroscedastic model. Int. J. Forecast.

22, 73–89 (2006)

18. Cloyne, J., Huber, K., Ilzetzki, E., Kleven, H.: The Eﬀect of House Prices on

Household Borrowing: A New Approach, Working Paper (2017)

19. Dong, M.C., Chen, C.W.S., Lee, S., Sriboonchitta, S.: How strong is the relation-

ship among Gold and USD exchange rates? analytics based on structural change

models. Comput. Econ. (2017). https://doi.org/10.1007/s10614-017-9743-z

20. Dougherty, A., Order, R.V.: Inﬂation, housing costs, and the consumer price index.

Am. Econ. Rev. 72, 154–164 (1982)

21. Elliott, G., Muller, U.: Optimally testing general breaking processes in linear time

series models, Working Paper, Department of Economics, University of California,

San Diego (2003)

22. Engle, R.F.: Autoregressive conditional heteroscedasticity with estimates of the

variance of United Kingdom inﬂation. Econometrica 50, 987–1008 (1982)

23. Englund, P., Ioannides, Y.M.: House price dynamics: an international empirical

perspective. J. Hous. Econ. 6, 119–136 (1997)

24. Gerlach, R., Chen, C.W.S., Chan, N.C.Y.: Bayesian time-varying quantile fore-

casting for value-at-risk in ﬁnancial markets. J. Bus. Econ. Stat. 29, 481–492

(2011)

25. Hansen, B.: Testing for structural change in conditional models. J. Econometrics

97, 93–115 (2000)

26. Longstaﬀ, F.A.: Borrower credit and the valuation of mortgage-backed securities.

Real Estate Econ. 33, 619–661 (2005)

27. McQuinn, K., O’Reilly, G.: Assessing the role of income and interest rates in

determining house prices. Econ. Model. 25, 377–390 (2008)

28. Mian, A., Rao, K., Suﬁ, A.: Housing balance sheets, consumption, and the eco-

nomic slump. Q. J. Econ. 128 (2013)


29. Mian, A., Suﬁ, A.: House Price Gains and U.S. Household Spending from 2002 to

2006, Working Paper (2014)

30. Miles, W.: Volatility clustering in U.S. home prices. J. Real Estate Res. 30, 73–90

(2008)

31. Miller, N., Peng, L., Sklarz, M.: House prices and economic growth. J. Real Estate

Finance Econ. 42, 522–541 (2009)

32. Pain, N., Westaway, P.: Modelling structural change in the UK housing market: a

comparison of alternative house price models. Econ. Model. 14, 587–610 (1997)

33. Pesaran, M.H., Timmermann, A.: How costly is it to ignore breaks when forecast-

ing the direction of a time series? Int. J. Forecast. 20, 411–425 (2004)

34. Spiegelhalter, D.J., Best, N.G., Carlin, B.P., Vander Linde, A.: Bayesian measures

of model complexity and ﬁt. J. Roy. Stat. Soc. B 64, 583–640 (2002)

35. Stock, J.H., Watson, M.W.: Evidence on structural instability in macroeconomic

time series relations. J. Bus. Econ. Stat. 14, 11–30 (1996)

36. Sutton, G.D., Mihaljek, D., Subelyte, A.: Interest rates and house prices in the

United States and around the world, BIS Working Paper No. 665 (2017)

37. Truong, B.C., Chen, C.W.S., So, M.K.P.: Model selection of a switching mech-

anism for ﬁnancial time series. Appl. Stochast. Models Bus. Ind. 32, 836–851

(2016)

38. Tsatsaronis, K., Zhu, H.: What drives housing price dynamics: cross-country evi-

dence. BIS Q. Rev. (2014)

39. Van-Nieuwerburgh, S., Weill, P.O.: Why has house price dispersion gone up? Rev.

Econ. Stud. 77, 1567–1606 (2010)

40. Yao, Y.C.: Estimating the number of changepoints via Schwarz criterion. Stat.

Probab. Lett. 6, 181–189 (1988)

41. Zhu, M.: Opening Remarks at the Bundesbank/German Research Founda-

tion/IMF Conference (2014)

Cumulative Residual Entropy-Based

Goodness of Fit Test for Location-Scale

Time Series Model

Sangyeol Lee
Department of Statistics, Seoul National University, Seoul, Korea

sylee@stats.snu.ac.kr

Abstract. This paper studies the cumulative residual entropy (CRE)-
based goodness of fit (GOF) test for location-scale time series models.

The CRE-based GOF test for iid samples is introduced and the asymp-

totic behavior of the CRE-based GOF test and its bootstrap version is

investigated for location-scale time series models. In particular, the influ-

ence of change points on the GOF test is studied through Monte Carlo

simulations.

1 Introduction

The GOF test has been playing a central role in matching given data sets with

the best ﬁtted probabilistic models and has been applied to diverse applications

in economics, ﬁnance, engineering, and medicine. We refer to D’Agostino and

Stephens [2] for a general review of GOF tests. The testing method based on the

empirical process has been popular because it can generate several famous GOF

tests such as Kolmogorov-Smirnov, Cramér-von Mises, and Anderson-Darling

tests. For the asymptotic properties of the empirical process, see Durbin [3] for

iid samples and Lee and Wei [14] and Lee and Taniguchi [12] for autoregressive

and GARCH models. On the other hand, some authors considered the empirical

characteristic function-based GOF test, see Lee et al. [11] and the papers cited

therein.

Lee et al. [13] considered an entropy-based test, and Lee [7] and Lee et al. [8]

later showed that the entropy test performs well in time series models such as

GARCH models. Lee and Kim [9] and Kim and Lee [6] extended the entropy-

based GOF test to location-scale time series models and developed its bootstrap

test. Lee et al. [10] recently showed that a GOF test based on the CRE (Rao

et al. [19]) or the cumulative Kullback-Leibler divergence (Baratpour and Rad

[1]) compares well with or outperforms existing tests in various situations. They

particularly demonstrated the superiority to the entropy-based GOF test of Lee

et al. [13] in iid samples. Here we consider the CRE-based entropy test for


location-scale time series models and its bootstrap version, and seek their

asymptotic properties.

This paper also investigates the performance of the GOF test when the

parameter experiences a change. It is well known that ﬁnancial time series often

experience structural changes due to critical events and monetary policy changes

and ignoring them can lead to a false conclusion. The change point test has a long

history and there is a vast literature. For recent references, we refer

to Lee and Lee [15] and Oh and Lee [18], who study the CUSUM test for GARCH-

type models and general nonlinear integer-valued autoregressive models, and

the papers cited therein. As seen in other inferences, the presence of parameter

changes can undermine GOF tests and mislead practitioners to a false conclu-

sion. For example, the normality test can be rejected owing to parameter changes

as seen in our simulations. However, if the distribution family for the GOF test

is broad or ﬂexible enough, the impact of parameter changes on the GOF test

could be weakened to a certain extent. Our simulation results show that the rejec-

tion of GOF tests can be attributed to the presence of parameter changes while

the acceptance of a GOF test does not necessarily provide complete evidence for

no change points. In particular, the latter phenomenon can become more promi-

nent as the underlying model gets more complicated, for example, from iid to

time series models. In fact, it is well known that piecewise stationary generalized

autoregressive conditionally heteroscedastic (GARCH) processes induced from

parameter changes can be easily misidentiﬁed as integrated GARCH (IGARCH)

processes, see Maekawa et al. [16].

This paper is organized as follows. Section 2 introduces the CRE-based GOF

test for iid samples and reviews its asymptotic behavior based on Lee et al.

[10]. Section 3 extends the results in Sect. 2 to location-scale time series models.

Section 4 carries out a simulation study to check the inﬂuence of parameter

changes on the GOF test. Section 5 provides concluding remarks.

2 CRE-Based GOF Test for IID Samples

In this section we review the CRE-based GOF test in Lee et al. [10]. For any

density function f , the entropy-based GOF test is constructed based on the

Boltzmann-Shannon entropy deﬁned by

$$
H(f)=-\int_{-\infty}^{\infty} f(x)\log(f(x))\,dx \tag{1}
$$

(Jaynes [4]). Lee et al. [13] developed a GOF test using an approximation form

of the integral in (1), and Lee et al. [10] recently considered a GOF test using the

CRE wherein the entropy is deﬁned based on cumulative residual distributions.

That is, for any distribution function F with bounded support [0,1], we consider

the modiﬁcation of (1) as follows:

$$
IH(F)=-\int_{0}^{1} (1-F(x))\log\frac{1-F(x)}{1-x}\,dx
$$

(see Baratpour and Rad [1]).

Putting

$$
IS_m(F)=\sum_{i=1}^{m}\big(\Psi_F(s_i)-\Psi_F(s_{i-1})\big)\log\frac{\Psi_F(s_i)-\Psi_F(s_{i-1})}{\Psi_0(s_i)-\Psi_0(s_{i-1})}, \tag{2}
$$

where Ψ_F(s) = ∫_s^1 (1 − F(x)) dx and Ψ_0(s) = ∫_s^1 (1 − x) dx = 1 − s − (1 − s²)/2,
m is the number of disjoint intervals for partitioning the interval [0, 1], and
0 < s_0 < s_1 < · · · < s_m = 1 are preassigned partition points, provided that
f = F′ satisfies 0 < inf_x f(x) ≤ sup_x f(x) < ∞ and max_{1≤i≤m} |s_i − s_{i−1}| → 0
as m → ∞, one can see that as m → ∞,

$$
IS_m(F)\longrightarrow-\int_0^1 (1-F(x))\log\frac{1-F(x)}{1-x}\,dx=IH(F);
$$

see the proof in the Appendix of Lee et al. [10].

To construct a test statistic, we further consider a generalized version of (2)
by imposing some weights:

$$
IS_m^{w}(F)=\sum_{i=1}^{m} w_i\big(\Psi_F(s_i)-\Psi_F(s_{i-1})\big)\log\frac{\Psi_F(s_i)-\Psi_F(s_{i-1})}{\Psi_0(s_i)-\Psi_0(s_{i-1})}, \tag{3}
$$

where w is a vector of weights with 0 ≤ w_i ≤ 1 and Σ_{i=1}^m w_i = 1. Observe
that if F is the uniform distribution on [0, 1], IS_m^w(F) = 0.

Suppose that one wishes to test whether Xi , i = 1, . . . , n is a random sample

from an unknown cumulative distribution function F . For this task, we set up

the following hypotheses:

$$
H_0: F=F_0 \quad\text{vs.}\quad H_1: F\neq F_0.
$$

Note that the F_0(X_i) follow U[0, 1] under the null, whence the testing problem

is reduced to a uniformity test. In view of (3), as a GOF test, we consider

$$
IS_m^{w}(F_n)=\sum_{i=1}^{m} w_i\big(\Psi_n(s_i)-\Psi_n(s_{i-1})\big)\log\frac{\Psi_n(s_i)-\Psi_n(s_{i-1})}{\Psi_0(s_i)-\Psi_0(s_{i-1})}, \tag{4}
$$

where Ψ_n = Ψ_{F_n} with F_n(x) = (1/n) Σ_{i=1}^n I(F_0(X_i) ≤ x), so that Ψ_n(s) =
(1/n) Σ_{i=1}^n (F_0(X_i) − s) I(F_0(X_i) > s). Then, we reject H_0 if sup_w |IS_m^w(F_n)| is
large. The test based on (4) can be implemented using the asymptotic result as
follows (Theorem 2.1 of Lee et al. [10]): under H_0, as n → ∞,

$$
T_n:=\sqrt{n}\,\sup_{w\in W}\big|IS_m^{w}(F_n)\big|\ \xrightarrow{d}\ \sup_{w\in W}\Big|\sum_{i=1}^{m} w_i\big(IB^{\circ}(s_i)-IB^{\circ}(s_{i-1})\big)\Big|, \tag{5}
$$

where IB°(s) = −∫_s^1 B°(x) dx and B° is a Brownian bridge on [0, 1], W denotes
any subset, with a finite number of elements, of the space of weights 0 ≤ w_i ≤ 1
with Σ_{i=1}^m w_i = 1, and 0 = s_0 < s_1 < · · · < s_m = 1.


Since −IH(F) − (∫_0^1 x dF(x) − 1/2) = 0 if and only if F is a uniform distri-
bution on [0, 1] (cf. Baratpour and Rad [1]), Lee et al. [10] also suggested using
the test T̃_n := T_n + Δ_n with Δ_n = (1/√(nm)) |Σ_{i=1}^n (F_0(X_i) − 1/2)|. Their simulation
study reveals that T̃_n outperforms T̂_n in some examples.

Below, we consider the problem of testing the null and alternative hypotheses:

H_0: F ∈ {F_θ : θ ∈ Θ} vs. H_1: not H_0. To implement the test, we check whether the
transformed random variables Û_i = F_{θ̂_n}(X_i) follow a uniform distribution on

[0, 1], say, U [0, 1], where θ̂n is an estimate of true parameter θ0 . We impose the

following conditions:

(A1) F_θ has a positive density f_θ.
(A2) x ↦ ∂F_θ(x)/∂θ is uniformly continuous on (−∞, ∞), sup_{θ∈N} sup_x ‖∂²F_θ(x)/∂θ∂θ^T‖ ≤ L
for some L > 0, and sup_{θ∈N} ∫ ‖∂F_θ(x)/∂θ‖ f_θ(x) dx < ∞
for some compact neighborhood N of θ_0.
(A3) Under the null,

$$
\sqrt{n}(\hat\theta_n-\theta_0)=\frac{1}{\sqrt{n}}\sum_{i=1}^{n} l(X_i;\theta_0)+o_p(1),
$$

where l(x; θ) is measurable with ∫ l(x; θ) f_θ(x) dx = 0 and satisfies

$$
\sup_{\theta\in N}\int \|l(x;\theta)\|^{2+\delta} f_\theta(x)\,dx<\infty
$$

for some δ > 0.

As a test statistic, we can use a version similar to (5) based on IS_m^w(F̂_n) with
F̂_n(s) = (1/n) Σ_{i=1}^n I(Û_i ≤ s), namely,

$$
\hat T_n=\sqrt{n}\sup_{w\in W}\big|IS_m^{w}(\hat F_n)\big| \quad\text{and}\quad \tilde T_n=\hat T_n+\frac{1}{\sqrt{nm}}\Big|\sum_{i=1}^{n}\big(F_{\hat\theta_n}(X_i)-1/2\big)\Big|. \tag{6}
$$

In practice, we generate weights w_{ij}, i = 1, . . . , m, j = 1, . . . , J, where J
is a large integer, say 1,000, and then use w̃_{ij} = w_{ij}/(w_{1j} + · · · + w_{mj}) and s_i = i/m, i =
1, · · · , m to apply the test:

$$
\hat T_n=\sqrt{n}\max_{1\le j\le J}\Big|\sum_{i=1}^{m}\tilde w_{ij}\Big(\hat\Psi_n\big(\tfrac{i}{m}\big)-\hat\Psi_n\big(\tfrac{i-1}{m}\big)\Big)\log\Big(\frac{\hat\Psi_n(\frac{i}{m})-\hat\Psi_n(\frac{i-1}{m})}{\Psi_0(\frac{i}{m})-\Psi_0(\frac{i-1}{m})}\Big)\Big|. \tag{7}
$$
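A compact Python sketch of (7) for an iid sample, with U_i = F_0(X_i), follows; the helper names are illustrative, not part of the original method:

```python
import numpy as np

def psi_hat(u, s):
    return np.mean((u - s) * (u > s))      # (1/n) sum (U_i - s) I(U_i > s)

def psi0(s):
    return 1.0 - s - (1.0 - s ** 2) / 2.0  # integral of (1 - x) over [s, 1]

def T_hat(u, m, J=1000, rng=None):
    rng = rng or np.random.default_rng()
    s = np.arange(m + 1) / m
    d_hat = np.array([psi_hat(u, s[i]) - psi_hat(u, s[i - 1]) for i in range(1, m + 1)])
    d0 = psi0(s[1:]) - psi0(s[:-1])
    w = rng.uniform(size=(J, m))
    w /= w.sum(axis=1, keepdims=True)      # normalized random weights w-tilde_{ij}
    return np.sqrt(len(u)) * np.abs(w @ (d_hat * np.log(d_hat / d0))).max()
```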

Further, we use m = [n1/3 ] and adopt the bootstrap method:

(i) From the data X1 , . . . , Xn , obtain the MLE θ̂n .


(ii) Generate X1∗ , . . . , Xn∗ from Fθ̂n (·) to obtain T̂n , denoted by T̂n∗ , with the

preassigned m in (7) based on these random variables.

(iii) Repeat the above procedure B times and calculate the 100(1 − α)% per-

centile of the obtained B number of T̂n∗ values for given 0 < α < 1.

(iv) Reject H0 if the value of T̂n obtained from the original observations is larger

than the obtained 100(1 − α)% percentile in (iii).
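Steps (i)-(iv) then amount to the following loop, reusing `T_hat` from the sketch above; `fit_mle`, `rvs`, and `cdf` denote the fitted family's estimator, sampler, and distribution function and are hypothetical interfaces:

```python
import numpy as np

def bootstrap_critical_value(x, m, fit_mle, rvs, cdf, B=1000, alpha=0.05, rng=None):
    rng = rng or np.random.default_rng()
    theta = fit_mle(x)                             # step (i)
    t_star = []
    for _ in range(B):
        xs = rvs(theta, size=len(x), rng=rng)      # step (ii): simulate from F_theta-hat
        t_star.append(T_hat(cdf(fit_mle(xs), xs), m, rng=rng))
    return np.quantile(t_star, 1.0 - alpha)        # step (iii): 100(1 - alpha)% percentile

# step (iv): reject H0 when T_hat(cdf(theta, x), m) exceeds the returned percentile.
```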

Suppose that the bootstrap estimator θ̂_n^* satisfies

$$
\sqrt{n}(\hat\theta_n^*-\hat\theta_n)=\frac{1}{\sqrt{n}}\sum_{i=1}^{n} l(X_i^*;\hat\theta_n)+o_p^*(1). \tag{8}
$$

Proposition 1. Let T̂_n^* := √n sup_{w∈W} |IS_m^w(F̂_n^*)| with F̂_n^*(x) = (1/n) Σ_{i=1}^n
I(F_{θ̂_n^*}(X_i^*) ≤ x), and T̃_n^* = T̂_n^* + (1/√(nm)) |Σ_{i=1}^n (F_{θ̂_n^*}(X_i^*) − 1/2)|. Then, under (A1)-
(A3) and (8) and under H_0, for all −∞ < x < ∞,

$$
\big|P^*(\tilde T_n^*\le x)-P(\tilde T_n\le x)\big|\to 0\ \text{ in probability.}
$$

A similar result holds for the location-scale time series models studied in
Sect. 3. In our empirical study in Sect. 4, we use the bootstrap version of the test
in (7):

$$
\hat T_n^*=\sqrt{n}\max_{1\le j\le J}\Big|\sum_{i=1}^{m}\tilde w_{ij}\Big(\hat\Psi_n^*\big(\tfrac{i}{m}\big)-\hat\Psi_n^*\big(\tfrac{i-1}{m}\big)\Big)\log\Big(\frac{\hat\Psi_n^*(\frac{i}{m})-\hat\Psi_n^*(\frac{i-1}{m})}{\Psi_0(\frac{i}{m})-\Psi_0(\frac{i-1}{m})}\Big)\Big|. \tag{9}
$$

3 CRE-Based GOF Test for Location-Scale Time Series Models

In this section we extend the CRE-based entropy test to location-scale models:

$$
y_t=g_t(\beta_{1,0})+\sqrt{h_t(\beta_0)}\,\eta_t, \tag{10}
$$

where g and h are measurable func-
tions, Θ_m = Θ_1 × Θ_2 with compact subsets Θ_1 ⊂ R^{d1} and Θ_2 ⊂ R^{d2},
g_t(β_{1,0}) = g(y_{t−1}, y_{t−2}, . . . ; β_{1,0}) and h_t(β_0) = h(y_{t−1}, y_{t−2}, . . . ; β_0), where
β_0 = (β_{1,0}^T, β_{2,0}^T)^T denotes the true model parameter belonging to Θ_m, and {η_t}

is a sequence of iid random variables with mean zero and unit variance. Further,

{yt : t ∈ Z} is assumed to be strictly stationary and ergodic and ηt is indepen-

dent of past observations Ωs for s < t. Model (10) includes various GARCH type

models. Recently, Kim and Lee [6] veriﬁed the weak consistency of the bootstrap

entropy test based on the residuals calculated from Model (10).


In this section, we focus on the CRE-based GOF test for Model (10). To this

end, we set up the following hypotheses:

$$
H_0: F_\eta\in\{F_\vartheta:\vartheta\in\Theta\} \quad\text{vs.}\quad H_1:\ \text{not } H_0, \tag{11}
$$

where F_η denotes the innovation distribution of the model and F_ϑ can be any
family of distributions. In what follows, we assume that f_ϑ = F_ϑ′ exists and is
positive and f_ϑ is continuous, which also ensures the continuity of F_ϑ^{−1} in ϑ, due
to Scheffé's theorem.

To implement a test, we check whether the transformed random variables

$$
U_t=F_{\vartheta_0}\!\left(\frac{y_t-g_t(\beta_{1,0})}{\sqrt{h_t(\beta_0)}}\right)
$$

follow a uniform distribution on [0, 1], where ϑ0 and β0 are the true parameters.

Since the parameters are unknown, we check the departure from U [0, 1] based on

Û_t := F_{ϑ̂_n}(η̂_t) with η̂_t = (y_t − g̃_t(β̂_{1,n})) / √(h̃_t(β̂_n)), where g̃_t(β_1) = g(y_{t−1}, . . . , y_1, 0, . . . ; β_1)
and h̃_t(β) = h(y_{t−1}, . . . , y_1, 0, . . . ; β) with β = (β_1^T, β_2^T)^T ∈ Θ_m.

Lee and Kim [9] studied the asymptotic behavior of the residual empirical

process:

$$
\hat V_n(r)=\sqrt{n}\big(\hat F_n^R(r)-r\big),\qquad 0\le r\le 1,
$$

with F̂_n^R(r) = (1/n) Σ_{t=1}^n I(F_{ϑ̂_n}(η̂_t) ≤ r), where ϑ̂_n is any consistent estimator of ϑ_0

under the null, for example, the MLE. More precisely, Lee and Kim [9] showed

that under certain regularity conditions,

$$
\hat V_n(r)=V_n(r)+R_n(r)+o_p(1), \tag{12}
$$

uniformly in r, where V_n(r) = √n (F_n(r) − r) with F_n(r) = (1/n) Σ_{t=1}^n I(F_{ϑ_0}(η_t) ≤ r) and

$$
\begin{aligned}
R_n(r)={}&\sqrt{n}(\hat\beta_{1,n}-\beta_{1,0})^T E\!\left[\frac{1}{\sqrt{h_1(\beta_0)}}\frac{\partial g_1(\beta_{1,0})}{\partial\beta_1}\right] f_{\vartheta_0}\big(F_{\vartheta_0}^{-1}(r)\big)\\
&+\sqrt{n}(\hat\beta_n-\beta_0)^T E\!\left[\frac{1}{2h_1(\beta_0)}\frac{\partial h_1(\beta_0)}{\partial\beta}\right] F_{\vartheta_0}^{-1}(r)\,f_{\vartheta_0}\big(F_{\vartheta_0}^{-1}(r)\big)\\
&+\sqrt{n}(\hat\vartheta_n-\vartheta_0)^T\,\frac{\partial F_{\vartheta_0}(F_{\vartheta_0}^{-1}(r))}{\partial\vartheta}.
\end{aligned}
$$

Based on this fact, they found the limiting null distribution of the residual

entropy test and its bootstrap version.

The residual CRE test can be designed similarly to (6), that is,

$$
\hat T_n^R=\sqrt{n}\sup_{w\in W}\big|IS_m^{w}(\hat F_n^R)\big| \quad\text{and}\quad \tilde T_n^R=\hat T_n^R+\frac{1}{\sqrt{nm}}\Big|\sum_{t=1}^{n}\big(F_{\hat\vartheta_n}(\hat\eta_t)-1/2\big)\Big|. \tag{13}
$$


In implementation, using w̃_{ij} as in (7), we employ the test, similar to (7), as follows:

$$
\hat T_n^R=\sqrt{n}\max_{1\le j\le J}\Big|\sum_{i=1}^{m}\tilde w_{ij}\Big(\hat\Psi_n^R\big(\tfrac{i}{m}\big)-\hat\Psi_n^R\big(\tfrac{i-1}{m}\big)\Big)\log\Big(\frac{\hat\Psi_n^R(\frac{i}{m})-\hat\Psi_n^R(\frac{i-1}{m})}{\Psi_0(\frac{i}{m})-\Psi_0(\frac{i-1}{m})}\Big)\Big|. \tag{14}
$$

Further, we use the parametric bootstrap method below to obtain the critical

values:

(i) Based on the data y1 , . . . , yn , obtain a consistent estimator θ̂n .

(ii) Generate η1∗ , . . . , ηn∗ from Fϑ̂n (·) and obtain y1∗ , . . . , yn∗ through Eq. (10)

with β_0 replaced by its MLE β̂_n. That is, y_t^* = g̃_t(β̂_{1,n}) + √(h̃_t(β̂_n)) η_t^*. Then,
calculate T̂_n^{R*} with a preassigned m based on these random variables.

(iii) Repeat the above procedure B times and calculate the 100(1 − α)% per-

centile of the obtained B number of T̂nR∗ values.

(iv) Reject H0 if the value of T̂nR in (14) obtained from the original observations

is larger than the obtained 100(1 − α)% percentile in (iii).
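For a concrete special case of step (ii), an AR(1) location with GARCH(1,1) scale, the recursion can be sketched as below; the parameter names (c, phi, w, a, b) stand for fitted values and are illustrative assumptions:

```python
import numpy as np

def simulate_ls_series(eta_star, c, phi, w, a, b):
    n = len(eta_star)
    y = np.zeros(n)
    h = w / max(1e-12, 1.0 - a - b)        # start the recursion at the stationary variance
    y_prev, a_prev = 0.0, 0.0
    for t in range(n):
        h = w + a * a_prev ** 2 + b * h    # h_t = w + a * a_{t-1}^2 + b * h_{t-1}
        a_prev = np.sqrt(h) * eta_star[t]  # scaled innovation: a_t = sqrt(h_t) * eta_t
        y[t] = c + phi * y_prev + a_prev   # y_t = g_t(beta_1) + a_t
        y_prev = y[t]
    return y
```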

Below we discuss the weak convergence of the above bootstrap test. Let
U_t^* = F_{ϑ̂_n^*}(η̂_t^*) with residuals η̂_t^* = (y_t^* − g̃_t(β̂_{1,n}^*)) / √(h̃_t(β̂_n^*)), and define the bootstrap residual
empirical process:

$$
\hat V_n^*(r)=\sqrt{n}\big(\hat F_n^{R*}(r)-r\big),\qquad 0\le r\le 1,
$$

with F̂_n^{R*}(r) = (1/n) Σ_{t=1}^n I(F_{ϑ̂_n^*}(η̂_t^*) ≤ r).

Theorem 3.2 of Kim and Lee [6] shows that under some regularity conditions,

$$
\hat V_n^*(r)=V_n^*(r)+R_n^*(r)+o_p^*(1), \tag{15}
$$

in probability, uniformly in r, which is a bootstrap version of (12), wherein
V_n^*(r) = √n (F_n^{R*}(r) − r) with F_n^{R*}(r) = (1/n) Σ_{t=1}^n I(F_{ϑ̂_n}(η_t^*) ≤ r) and

$$
\begin{aligned}
R_n^*(r)={}&\sqrt{n}(\hat\beta_{1,n}^*-\hat\beta_{1,n})^T E\!\left[\frac{1}{\sqrt{h_t(\beta_0)}}\frac{\partial g_t(\beta_{1,0})}{\partial\beta_1^T}\right] f_{\hat\vartheta_n}\big(F_{\hat\vartheta_n}^{-1}(r)\big)\\
&+\sqrt{n}(\hat\beta_n^*-\hat\beta_n)^T E\!\left[\frac{1}{2h_t(\beta_0)}\frac{\partial h_t(\beta_0)}{\partial\beta^T}\right] F_{\hat\vartheta_n}^{-1}(r)\,f_{\hat\vartheta_n}\big(F_{\hat\vartheta_n}^{-1}(r)\big)\\
&+\sqrt{n}(\hat\vartheta_n^*-\hat\vartheta_n)^T\,\frac{\partial F_{\hat\vartheta_n}(F_{\hat\vartheta_n}^{-1}(r))}{\partial\vartheta}.
\end{aligned}
$$

In our study, we use the bootstrap version of the test in (13), similar to (9),
as follows:

$$
\hat T_n^{R*}=\sqrt{n}\sup_{w\in W}\big|IS_m^{w}(\hat F_n^{R*})\big| \quad\text{and}\quad \tilde T_n^{R*}=\hat T_n^{R*}+\frac{1}{\sqrt{nm}}\Big|\sum_{t=1}^{n}\big(F_{\hat\vartheta_n^*}(\hat\eta_t^*)-1/2\big)\Big|. \tag{16}
$$


Theorem 1. Let T̂_n^R and T̃_n^R be the ones in (13), and T̂_n^{R*} and T̃_n^{R*} be the
ones in (16). Then under (A2), (A3), (B0)-(B4) in Kim and Lee [6], and H_0
in (11), we obtain

$$
\hat T_n^R=\sup_{w\in W}\Big|\sum_{i=1}^{m} w_i\big(\bar V_n(s_i)-\bar V_n(s_{i-1})\big)+\sum_{i=1}^{m} w_i\big(\bar R_n(s_i)-\bar R_n(s_{i-1})\big)\Big|+o_p(1), \tag{17}
$$

$$
\hat T_n^{R*}=\sup_{w\in W}\Big|\sum_{i=1}^{m} w_i\big(\bar V_n^*(s_i)-\bar V_n^*(s_{i-1})\big)+\sum_{i=1}^{m} w_i\big(\bar R_n^*(s_i)-\bar R_n^*(s_{i-1})\big)\Big|+o_p^*(1), \tag{18}
$$

where V̄_n(s) = ∫_s^1 V_n(r) dr, R̄_n(s) = ∫_s^1 R_n(r) dr, V̄_n^*(s) = ∫_s^1 V_n^*(r) dr, and
R̄_n^*(s) = ∫_s^1 R_n^*(r) dr. Hence, we find that for all −∞ < x < ∞,

$$
\big|P^*\big(\hat T_n^{R*}\le x\big)-P\big(\hat T_n^R\le x\big)\big|\to 0\ \text{ in probability}, \tag{19}
$$

$$
\big|P^*\big(\tilde T_n^{R*}\le x\big)-P\big(\tilde T_n^R\le x\big)\big|\to 0\ \text{ in probability}. \tag{20}
$$

Proof. Under the null, we have Ψ̂_n^R(s) → Ψ_0(s) in probability as n → ∞, owing
to (12). Then, by using the fact that |log(1 + x) − x| ≤ x² for |x| ≤ 1/2, we can
write that on A_n := { max_{1≤i≤m} |(Ψ̂_n^R(s_i) − Ψ̂_n^R(s_{i−1}))/(Ψ_0(s_i) − Ψ_0(s_{i−1})) − 1| ≤ 1/2 },

$$
\begin{aligned}
\hat T_n^R&=\sup_w \sqrt{n}\,\Big|\sum_{i=1}^{m} w_i\big(\hat\Psi_n^R(s_i)-\hat\Psi_n^R(s_{i-1})\big)\cdot\log\Big(\frac{\hat\Psi_n^R(s_i)-\hat\Psi_n^R(s_{i-1})}{\Psi_0(s_i)-\Psi_0(s_{i-1})}-1+1\Big)\Big|\\
&=\sup_w\Big|\sum_{i=1}^{m} w_i\,\frac{\hat\Psi_n^R(s_i)-\hat\Psi_n^R(s_{i-1})}{\Psi_0(s_i)-\Psi_0(s_{i-1})}\cdot\big(\bar{\hat V}_n(s_i)-\bar{\hat V}_n(s_{i-1})\big)\Big|+\delta_n\\
&=\sup_w\Big|\sum_{i=1}^{m} w_i\big(\bar V_n(s_i)+\bar R_n(s_i)-\bar V_n(s_{i-1})-\bar R_n(s_{i-1})\big)\Big|+o_p(1),
\end{aligned}
$$

where we have used the fact that V̄̂_n(s) := ∫_s^1 V̂_n(r) dr = V̄_n(s) + R̄_n(s) + o_p(1)
owing to (12) and

$$
|\delta_n|\le \sqrt{n}\max_{1\le i\le m}\Big|\frac{\hat\Psi_n^R(s_i)-\hat\Psi_n^R(s_{i-1})}{\Psi_0(s_i)-\Psi_0(s_{i-1})}-1\Big|^2\max_{1\le i\le m}\big|\hat\Psi_n^R(s_i)-\hat\Psi_n^R(s_{i-1})\big|=o_p(1).
$$

Since P(A_n) → 1, this establishes (17) of the
theorem. Similarly, using (15), we can show that (18) holds (cf. the proof of
Theorem 2.3 of [10]). Then, in view of (17), (18), and (B4) in Kim and Lee [6],
we can see that (19) is true, and so is (20), which validates the theorem.

In implementation, we use the bootstrap version of the test in (14), similar

to (9), as follows:


Table 1. Empirical size and powers for the iid normal samples at the level of 0.05.

Size                 0.050
Power (change in μ)
  μ = 1              0.053
  μ = 2              0.087
  μ = 3              0.577
  μ = 4              0.970
  μ = 5              0.957
Power (change in σ)
  σ = 2              0.187
  σ = 3              0.587
  σ = 4              0.827
  σ = 5              0.943

Table 2. Empirical sizes for the GARCH(1,1) model with N(0,1) innovations at the
level of 0.05.

(ω, α, β)          Size
(0.1, 0.3, 0.6)    0.140
(0.1, 0.2, 0.7)    0.087
(0.1, 0.1, 0.8)    0.063

$$
\hat T_n^{R*}=\sqrt{n}\max_{1\le j\le J}\Big|\sum_{i=1}^{m}\tilde w_{ij}\Big(\hat\Psi_n^{R*}\big(\tfrac{i}{m}\big)-\hat\Psi_n^{R*}\big(\tfrac{i-1}{m}\big)\Big)\log\Big(\frac{\hat\Psi_n^{R*}(\frac{i}{m})-\hat\Psi_n^{R*}(\frac{i-1}{m})}{\Psi_0(\frac{i}{m})-\Psi_0(\frac{i-1}{m})}\Big)\Big|, \tag{21}
$$

$$
\tilde T_n^{R*}=\hat T_n^{R*}+\frac{1}{\sqrt{nm}}\Big|\sum_{t=1}^{n}\big(F_{\hat\vartheta_n^*}(\hat\eta_t^*)-1/2\big)\Big|.
$$

4 Simulation Study

Since the CRE-based GOF test has been shown to outperform several existing GOF

tests in iid samples and works well for GARCH models as seen in the real data

example of Lee et al. [10], our simulation study focuses on examining the inﬂu-

ence of the change points on GOF tests. For this task, we employ T̂nR∗ in (21)

with J = 1000, m = 5, and apply it to iid normal and GARCH(1,1) samples

with normal innovations. In the iid case, the sample is assumed to follow a N(0, 1)


Table 3. Empirical powers for the GARCH(1,1) with N(0,1) innovations at the level
of 0.05.

(ω0, α0, β0) → (ω1, α1, β1)           Power
(0.1, 0.1, 0.2) → (0.1, 0.2, 0.7)     0.397
(0.1, 0.1, 0.2) → (0.1, 0.3, 0.6)     0.387
(0.1, 0.1, 0.2) → (0.1, 0.4, 0.5)     0.377
(0.1, 0.1, 0.7) → (0.1, 0.1, 0.8)     0.105
(0.1, 0.2, 0.5) → (0.1, 0.2, 0.7)     0.183
(0.1, 0.2, 0.5) → (0.1, 0.3, 0.6)     0.240
(0.1, 0.3, 0.3) → (0.1, 0.2, 0.7)     0.277
(0.1, 0.3, 0.3) → (0.1, 0.1, 0.8)     0.243

distribution under the null and to have a change from N (0, 1) to N (μ, σ 2 ) dis-

tribution at n/2 = 50 under the alternative. Here, we consider the two cases (i)

only μ changes and σ is fixed; (ii) only σ changes and μ is fixed. In the GARCH
case, the GARCH model under the null is y_t = σ_t ε_t with σ_t² = ω + α y_{t−1}² + β σ_{t−1}²
and {ε_t} ~ iid N(0, 1). Under the alternative, (ω, α, β) is assumed to change from

(ω0 , α0 , β0 ) to (ω1 , α1 , β1 ) at t = 50. To save computational time, we use n = 100

with the number of bootstraps and repetitions = 300. Sizes and powers are cal-

culated at the level of 0.05 for diﬀerent parameter settings.
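The GARCH alternative can be generated as in the following sketch, with the parameter triple switching from par0 to par1 at the midpoint t = n/2 = 50; the function name is an assumption:

```python
import numpy as np

def garch_with_break(n, par0, par1, rng=None):
    rng = rng or np.random.default_rng()
    y, a_prev = np.zeros(n), 0.0
    h = par0[0] / (1.0 - par0[1] - par0[2])     # stationary variance of the null regime
    for t in range(n):
        w, a, b = par0 if t < n // 2 else par1  # parameters change at t = n/2
        h = w + a * a_prev ** 2 + b * h
        y[t] = np.sqrt(h) * rng.standard_normal()
        a_prev = y[t]
    return y
```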

Table 1 shows that the normality test in the iid case has no size distortion and produces higher powers as the magnitude of the parameter change gets larger, indicating that a parameter change can severely damage the GOF test. On the other hand, Tables 2 and 3 show that the normality test for GARCH(1,1) innovations has some size distortions (possibly because α + β is close to 1) and produces lower powers than that for the iid samples, which reveals the possibility that the influence of parameter changes can be reduced to a certain extent by model complexity. Our findings show that a change point test should be carefully performed in advance of conducting GOF tests.

5 Concluding Remarks

In this paper, we studied the CRE-based GOF test for location-scale time series

models and its bootstrap version. We also carried out Monte Carlo simulations to

see the inﬂuence of parameter changes on the GOF test. The result reveals that

parameter changes can substantially affect GOF tests and that a change point test should be carried out before conducting GOF tests. In particular, parameter changes in GARCH models appeared to affect the GOF test to a lesser degree than those in iid samples. This might be a reason why the GARCH model often passes model check tests well even in the presence of change points. However, more experiments are needed before a firm conclusion can be reached, because here we only considered the GARCH model with normal innovations and the CRE-based entropy test.


As such, we plan to extend this work to other sophisticated time series models, such as smooth transition GARCH models with non-normal innovations (see Khemiri [5] and Meitz and Saikkonen [17]), and to other GOF tests, including the Anderson-Darling test, with more extensive empirical studies.

Acknowledgements. This work was supported by a grant through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (No. 2018R1A2A2A05019433).

References

1. Baratpour, S., Rad, H.: Testing goodness-of-fit for exponential distribution based

on cumulative residual entropy. Commun. Stat. Theory Methods 41, 1387–1396

(2012)

2. D’Agostino, R.B., Stephens, M.A.: Goodness-of-Fit Techniques. Marcel Dekker,

Inc., New York (1986)

3. Durbin, J.: Weak convergence of the sample distribution function when parameters

are estimated. Ann. Stat. 1, 279–290 (1973)

4. Jaynes, E.T.: Information theory and statistical mechanics. Phys. Rev. 106, 620–

630 (1957)

5. Khemiri, R.: The smooth transition GARCH model: application to international

stock indices. Appl. Financ. Econ. 21, 555–562 (2011)

6. Kim, M., Lee, S.: Bootstrap entropy test for general location-scale time series

models with heteroscedasticity. J. Stat. Comput. Simul. 13, 2573–2588 (2018)

7. Lee, S.: A maximum entropy type test of fit: composite hypothesis case. Comput. Stat. Data Anal. 57, 59–67 (2013)

8. Lee, J., Lee, S., Park, S.: Maximum entropy test for GARCH models. Stat.

Methodol. 22, 8–16 (2015)

9. Lee, S., Kim, M.: On entropy test for conditionally heteroscedastic location-scale

time series models. Entropy 19(8), 388 (2017)

10. Lee, S., Park, S., Kim, B.: On entropy-type goodness of fit test based on integrated

distribution. J. Stat. Comput. Simul. 88, 2447–2461 (2018)

11. Lee, S., Maintanis, S., Cho, M.: Inferential procedures based on the integrated

empirical characteristic function. AStA Adv. Stat. Anal., 1–30 (2018)

12. Lee, S., Taniguchi, M.: Asymptotic theory for ARCH models: LAN and residual

empirical process. Statistica Sinica 15, 215–234 (2005)

13. Lee, S., Vonta, I., Karagrigoriou, A.: A maximum entropy type test of fit. Comput.

Stat. Data. Anal. 55, 2635–2643 (2011)

14. Lee, S., Wei, C.Z.: On residual empirical process of stochastic regression models

with applications to time series. Ann. Stat. 27, 237–261 (1999)

15. Lee, Y., Lee, S.: On CUSUM tests for general nonlinear integer-valued GARCH

models. Ann. Inst. Stat. Math., Online published (2018)

16. Maekawa, K., Lee, S., Tokutsu, Y., Park, S.: Cusum test for parameter changes in

GARCH(1,1) models with applications to Tokyo stock data. Far East J. Stat. 18,

15–23 (2006)

17. Meitz, M., Saikkonen, P.: Parameter estimation in nonlinear AR-GARCH models.

Econometric Theory 27, 1236–1278 (2011)

18. Oh, H., Lee, S.: Modified residual CUSUM test for location-scale time series models

with heteroscedasticity. Ann. Inst. Stat. Math., Online published (2018)

19. Rao, M., Chen, Y., Vemuri, B.C., Wang, F.: Cumulative residual entropy: a new

measure of information. IEEE Trans. Inf. Theory 50, 1220–1228 (2004)

The Quantum Formalism in Social

Science: A Brief Excursion

Emmanuel Haven1,2(B)

1

Memorial University, St. John’s, Canada

ehaven@mun.ca

2

IQSCS, Leicester, UK

science problems can begin to be re-interpreted with the aid of elements

of the formalism of quantum mechanics.

1 Introduction

It is surely not new to involve formalisms from other disciplines in social science.

The most obvious example is of course the use of mathematics in areas such as

economics and psychology. Post war economics as led by luminaries like Kenneth

Arrow and Gérard Debreu was responsible for deﬁning a plethora of economics

concepts with the aid of mathematics. Do we have an equivalent ‘School’ which

contributed towards the rigorous deﬁnition of economics concepts with the aid of

physics? The answer shall be ‘no’. Although econophysics1 published an impor-

tant array of papers, its oeuvre did not really enter mainstream economics as

was the case with the Arrow-Debreu school which mathematized economics.

This paper will, at first sight, go in a slightly different direction. We will be

concerned to lay out in a brief fashion, some of the applications of the formalism of

quantum mechanics in social science. For those readers who are completely new to

this area, we can from the outset, already mention that, although there is no main-

stream area of economics2 which will cater towards those very speciﬁc applications,

there is (now) surely a mainstream component in mathematical psychology which

uses those formalisms. The interested reader should peruse the books by Khrennikov [1], Nguyen [2] and Busemeyer [3] to get a much better idea.

This paper is meant to give a very brief overview of some of the applica-

tions without wanting to pretend to be in any way exhaustive. We will discuss in

the sequel some applications involving the area of ﬁnance, more speciﬁcally with

regards to:

– arbitrage/non arbitrage

– value versus price and pricing rules

– decision making

1

Econophysics is a movement which has endeavoured to apply statistical mechanics

concepts mostly to ﬁnance but also to economics.

2

And even less in ﬁnance!.


shall again introduce some of the quantum formalism in that discussion too.

Formalism?

As is the case with much of interdisciplinary work, knowing the true limits of such

work is incredibly diﬃcult. What do I mean? If one uses mathematics to further

one’s rigorous understanding of economics or ﬁnance, then such true limits do not

really occur. Why? Mathematics is a universal language which can be applied

to any cognitive discipline. One cannot say the same about physics. Physics

studies events which pertain to nature. No work in physics was ever conceived for

applications to the social sciences. This argument is almost perfectly intuitive

in the case of quantum physics, which studies nature at an incredibly small

scale. To come back to the argument that mathematics is a universal language

applicable to any domain of knowledge, it is precisely this argument we can use

to argue why we can borrow elements of the quantum mechanics formalism and

apply them in macroscopic environments like social science. In other words, the

mathematical apparatus of quantum mechanics is used in those applications.

Hence, no implications can be formulated which would make statements like “so

you are in fact saying that the ﬁnancial world is quantum-mechanical?3 ”

This paper can and will not be a repository where the basics of the quantum for-

malism are explained. We refer back to Nguyen [2] and Haven and Khrennikov [4]

for the basic ideas. See also Haven, Khrennikov and Robinson [5]. Essentially, one

needs to realize that a big diﬀerence between classical mechanics and quantum

mechanics is that in the latter one uses (ﬁnite or inﬁnite) dimensional Hilbert

space. Thus, a vector space is used. An essential diﬀerence between quantum

mechanics and classical mechanics is the distinction one MUST make between

measurements and states. Position is an example of a measurement and typically

we say that an ‘observable’ is measurable (i.e. they are represented by operators

which are Hermitian4 ). In classical mechanics the state and measurement are

the same thing. They are deﬁnitely diﬀerent in quantum mechanics. This is an

essential diﬀerence.

3

Those sorts of arguments I hear often. They are expected, but they also show that when individuals make such statements, they probably will not have read much of the mainstream literature on the interface of quantum mechanics and social science.

4

Think of an operator as an instruction. A Hermitian operator expressed in matrix form will essentially say this: if taking the transpose of the matrix and the complex conjugate of each of its elements yields the original matrix, then the matrix is Hermitian.
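A two-line numerical check of this property (a small illustrative sketch, not part of the original chapter) can be written as follows:

```python
import numpy as np

A = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])            # example matrix
print(np.allclose(A, A.conj().T))        # True: the conjugate transpose equals A, so A is Hermitian
```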


4 Some Applications

4.1 Arbitrage/non Arbitrage

Arbitrage is a key concept in ﬁnance and intuitively one could deﬁne an arbitrage

opportunity as a way to realize a risk free proﬁt. The absence of such proﬁts

is assumed in the derivation of academic ﬁnance models. A good example is

the Black-Scholes option pricing theory [6]. When reformulating option pricing

theory within a Hamiltonian5 framework, Baaquie [7] has shown that the Black-

Scholes Hamiltonian is not Hermitian. This non-Hermiticity of the Hamiltonian

is intimately linked to the arbitrage condition. In quantum mechanics Hamil-

tonians need to be Hermitian. In Haven, Khrennikov and Robinson [5] (p. 140

and following), we discuss how considering non Hermitian Hamiltonians may be

plausible in the context of so called open systems. In an open system there is

an interaction with an environment (i.e. one does not have an isolated system).

Khrennikova, Haven and Khrennikov [8] provide for arguments on how to use

such an open system within the context of political science. The environment is

seen there as a set of information.

We may also wish to actively involve the state function in the setup of the

so called non-arbitrage theorem in ﬁnance (see Haven and Khrennikov [9]). The

basic idea here is that a change in the state function could trigger arbitrage.

The state function is interpreted as an information wave function and is the

key input in the generation of a probability. This is a very simple application

of quantum mechanics in that the wave function (with its complex conjugate)

yields a probability. The wave function can be seen as a probability amplitude (or

probability wave). For those readers who are really new to elementary quantum

mechanics, the probability amplitude is the key device by which we can, in a

formal way, describe the well-known double slit experiment6 . This experiment

forces us to use the so called interference term in the basic quantum

probability formulation.

Another basic idea from quantum mechanics is the idea of superposition of states.

In a ﬁnancial context, we could think of so called ‘value’ states as opposed to

price states. If we work with the elements of a vector space in quantum mechanics

then we shall employ so called ‘kets’. They are often denoted with a ‘>’ symbol.

The idea is then as follows: consider the price of an asset to be a superposition of (say) four value states: $|p\rangle = b_1|v_1\rangle + b_2|v_2\rangle + b_3|v_3\rangle + b_4|v_4\rangle$, where $|b_i|^2$ is the probability of each value occurring. We need to note that such a formulation is not at all without criticism. As an example, we can query the validity of linear independence of the value states.

5 The Hamiltonian is the sum of potential and kinetic energy. In a quantum mechanical context, when the Hamiltonian becomes an operator, this forms a key part in the rendering of the so called Schrödinger partial differential equation (PDE). This PDE describes the undisturbed evolution of a state (in time dependent or time independent fashion). It is a central equation in quantum mechanics.

6 Many textbooks exist which introduce quantum physics. A great book to consider is Bowman [10].
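A tiny numerical sketch of this superposition (the amplitudes below are hypothetical and chosen only for illustration) makes the role of the $|b_i|^2$ explicit:

```python
import numpy as np

b = np.array([0.6, 0.5, 0.4, 0.3], dtype=complex)   # hypothetical amplitudes b_1, ..., b_4
b = b / np.linalg.norm(b)                            # normalize so the probabilities sum to one
probs = np.abs(b) ** 2                               # |b_i|^2: probability of each value state |v_i>
print(probs.real, probs.real.sum())                  # probabilities and their total (1.0)
```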

We can go a little further and also consider so called pricing rules. Such rules

are steeped in a little more context. Essentially, the argument revolves around

the idea that Bohmian mechanics as an interpretation of quantum mechanics,

can serve a very targeted purpose within our environment of applications in

social science. Bohmian mechanics is a physics theory and was never developed

with social science applications in mind. The interested reader is referred to

Bohm [11,12] and Bohm and Hiley [13]. In Bohmian mechanics, a key concept

is the quantum potential (which is narrowly connected with a measure of infor-

mation). It can be claimed, that for a large part, the rationale for using the

quantum formalism in social science traces back to the need of wanting to have

an information formalism. The quantum potential depends on the amplitude of

a wave function. You may recall from your high school physics that the force

is the negative gradient of the potential. As an example, say that the price of

an asset is p. Assume there exists an amplitude function R(p). The quantum

potential Q(p)7 can be formulated and its force, $-\partial Q/\partial p$, calculated. Why could this be a pricing rule? One can easily come up with examples where, for instance, if p is small and p increases, there is a negative force which resists the price going up further. However, when p is large, the resisting force decreases as the price increases. Those two cases give an idea that there is some pricing rule hiding

behind those forces. See also Haven and Khrennikov [9] for explicit examples.
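As a purely illustrative sketch of this kind of computation, one can take a hypothetical amplitude function R(p), form a quantum-potential-like quantity assuming the standard Bohmian shape Q(p) = −R''(p)/(2R(p)) (with the 'mass' set to one and no Planck constant in this macroscopic version), and inspect the sign of the resulting force −∂Q/∂p. Neither the amplitude nor the constants below come from the paper; they are placeholders.

```python
import numpy as np

p = np.linspace(0.5, 10.0, 500)           # grid of prices
R = np.exp(-(p - 5.0) ** 2 / 4.0)         # hypothetical amplitude function R(p)

dR = np.gradient(R, p)
d2R = np.gradient(dR, p)
Q = -d2R / (2.0 * R)                      # assumed Bohmian-type quantum potential (unit mass)
force = -np.gradient(Q, p)                # the candidate 'pricing rule': force = -dQ/dp
```

Plotting `force` against `p` for various amplitude choices gives a numerical counterpart of the verbal argument above.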

As we mentioned at the beginning of this paper, in mainstream mathematical

psychology, the quantum formalism has made inroads to such an extent that

nowadays it is considered as a mainstream contribution. There is a dynamic

literature on that very topic. Excellent sources to consider for much more infor-

mation on this topic are Khrennikov [1] and Busemeyer and Bruza [3]. In a

nutshell, research in this area started via the observation that in decision mak-

ing formalisms such as the various expected utility frameworks used in economics

and ﬁnance8 , there are deviations from the normative behavior (as prescribed

by the axiomatic frameworks of expected utility). One very well known paradox

is the so called Ellsberg paradox (Ellsberg [15]). In summary form, one considers

a so called two stage gamble. This is a fancy way of saying that one gambles in a

first period, call it period t, and then subsequently, at period t′ > t, one gambles again. However, the decision to gamble at time t′ is conditional upon knowing what the outcome of the gamble was at time t.

You are either informed that:
– (i) the first gamble was a win; or
– (ii) the first gamble was a loss; or
– (iii) there is no information on what the outcome was in the first gamble.

7 No Planck constant occurs in the macroscopic version of this potential!
8 If you are not sure what those expected utility frameworks are, a great book to consider for the intricacies is by Kreps [14].

This setup allows one to test the so called ‘sure thing principle’, which is part of the axiomatic structures of expected utility, and which essentially says that there is no reason for you not to prefer to gamble at time t′, if you have no information about the outcome of the gamble at time t, IF you are preferring to gamble at t′ whether you are informed you either lost or won at time t. This principle has been violated,

consistently, in many experiments. It was ﬁrst noticed in work published by Shaﬁr

and Tversky [16]. In Busemeyer and Wang [17] two approaches are juxtaposed: a

Markov approach and a quantum-like approach. Following the Markov approach,

frequencies which are obtained through real world experiments indicate that 59%

would gamble (if informed they lost) and 69% would gamble (if informed they

won). The case where no information is given at time t should give a frequency

which is the average of the above frequencies. However, in repeated experiments

this is not the case. The quantum approach remedies this situation by deﬁning

the ‘no information’ state as a superposition of both the ‘informed-won’ state and

‘informed-lost’ state. We work here in Hilbert space. Recall, from the beginning

of the paper that quantum mechanics will require this. In the use of the basic

quantum probability rule, it is the interference term which can accommodate

the observed frequencies.
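A back-of-the-envelope calculation shows how the interference term does this. With equal weights on the 'won' and 'lost' branches and the gambling frequencies quoted above, the classical (Markov) prediction for the 'no information' case is their average; the observed frequency used below (0.36) is only an illustrative figure, since the text merely states that the observed value deviates from that average.

```python
import numpy as np

p_win, p_lose, w = 0.69, 0.59, 0.5                      # frequencies quoted above; equal branch weights
classical = w * p_win + w * p_lose                      # law of total probability: 0.64

p_observed = 0.36                                       # illustrative 'no information' frequency
cross = 2 * np.sqrt(w * p_win) * np.sqrt(w * p_lose)    # magnitude of the interference term
cos_theta = (p_observed - classical) / cross            # phase needed to reproduce the observed value
print(classical, round(cos_theta, 3))                   # 0.64, about -0.44
```

In the quantum probability rule P = w·p_win + w·p_lose + 2·sqrt(w·p_win)·sqrt(w·p_lose)·cos θ, a nonzero phase θ thus absorbs exactly the deviation that the classical average cannot accommodate.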

In finance, especially in derivative pricing theory, we work with random motions which exhibit very little memory. One can explicitly embed a memory component in a random model, via the use of so called fractional Brownian

motion but this is not the object of our discussion in this paper. Another issue

is time. In the title of this section we wrote the words ‘time asymmetry’. What

is meant by this? We need to recall that classical mechanics prescribes that

a process is perfectly time reversible. We alluded above to Bohmian mechanics and the quantum potential when we discussed the pricing rule.

Nelson [18,19] has derived the quantum potential in an alternative way. In so

doing he deﬁnes the drift part of a Brownian motion (when the inﬁnitesimal

change of position of a variable is being formalized) in two diﬀerent ways: by

looking at future positions and by looking at past positions. In classical, Newto-

nian mechanics, the diﬀerence between those two drift rates is zero. However, in

the approach by Nelson, the diﬀerence is non-zero and so called osmotic velocity

then obtains. The non-zero diﬀerence also signiﬁes that there is now asymme-

try between past and future time. It can also be remarked that such non-zero

term can be related to the existence of so called Fisher information. We need to

add that Fisher information can be used in modelling information in economics

(Hawkins and Frieden [20]). We do not expand on it further in this paper.

If we focus on formalizing information in economics and ﬁnance, within a

completely mainstream setting (i.e. without any connections to physics), then

there is a well-developed literature on how to model private information (as opposed to public information). Those

types of information play a key role in determining the level of eﬃciency in a

market. Intuitively, we want to think of an eﬃcient market as a market which

becomes more eﬃcient the less individuals can inﬂuence the said market. Public

information is, as its name denotes, information which can be used by many,

whilst at the other end of the accessibility spectrum ﬁgures private information.

Detemple and Rindisbacher [21] deﬁne the concept of ‘Private information Price

of Risk (PIPR)’, as representing “the incremental price of risk assessed when pri-

vate information becomes available.” ([21], p. 190). They discuss how the infor-

mation gain, attached to the presence of private information, can be measured.

This important formalism is set within an environment which is memory-less

and there is no asymmetry between past and future time.

If we want to continue focussing on modelling information in social science,

then we could take a resolutely diﬀerent route. What if we consider a formalism

which is not memory-less and does allow for asymmetry between past and future

time? I conclude this paper with an argument that such a model can exist and I

will now expand a little on this.

The physics model upon which such a proposed formalism could be based derives from the so called ‘walking droplet’ model ([22]), which seems to feature characteristics that can also be found in quantum mechanics. See Hardesty [24]

and Bush [25]. To quote work of Wind-Willassen, Harris and Bush [23]: “Drops

bouncing on a vibrating ﬂuid bath have recently received considerable attention

for two principal reasons. ....in certain parameter regimes, the bouncers walk

horizontally through resonant interaction with their wave ﬁeld. The resulting

walkers represent the ﬁrst known example of a pilot-wave system.” (p. 082002-1).

The pilot-wave system the authors mention refers to Bohmian mechanics. For the droplet to bounce, it needs a vibrating surface. Both the droplet and surface

are made out of the same liquid. Experiments show that there are guiding

waves which inﬂuence the droplet. The droplet’s motion is inﬂuenced by wave

superposition occurring from positions occupied by the droplet in past time.

Hence, there is an embedded memory property.

Although it is diﬃcult to judge how close this is to Bohmian mechanics, the

interested reader is referred to Bush [25] and Fort et al. [26] for a discussion

on quantization arguments. One could make the argument that such a model

could hold serious promise as an explicit formalism to model information in

an economics/ﬁnance setting. But one needs to actively wonder if both (i) the

presence of memory and (ii) the asymmetry in time are desirable properties in

a ﬁnance or economics environment. Here are some initial examples of analogies

we could make:

From physics: (i) The droplet bounces at time t at a height h (in the bath);

(ii) the orbital wave is generated upon impact; (iii) two coordinates: time on

X-axis and position (height, h) on Y-axis.

By analogy to finance, (i), (ii) and (iii) become: (i) the price level is generated

at time t; (ii) it has an information impact; (iii) price level is height (on Y-axis)

and it occurs at time t (on X-axis)

What is important is that this macroscopic model is parameter rich. A key

issue then becomes how, on the basis of hard data, we want to establish analogies

between this model and economics and ﬁnance.

References

1. Khrennikov, A.: Ubiquitous Quantum Structure: from Psychology to Finance.

Springer (2010)

2. Nguyen, H.T.: Quantum Probability for Behavioral Economics. Short Course at

BUH. New Mexico State University (2018)

3. Busemeyer, J. R., Bruza, P.: Quantum Models of Cognition and Decision.

Cambridge University Press, Cambridge (2012)

4. Haven, E., Khrennikov, A.Yu.: The Palgrave Handbook of Quantum Models in

Social Science, pp. 1–17. Springer - Palgrave MacMillan (2017)

5. Haven, E., Khrennikov, A., Robinson, T.: Quantum Methods in Social Science: A

First Course. World Scientiﬁc, Chapter 10 (2017)

6. Black, F., Scholes, M.: The pricing of options and corporate liabilities. J. Polit.

Econ. 81, 637–659 (1973)

7. Baaquie, B.: Quantum Finance. Cambridge University Press, Cambridge (2004)

8. Khrennikova, P., Haven, E., Khrennikov, A.: An application of the theory of open

quantum systems to model the dynamics of party governance in the US political

system. Int. J. Theoret. Phy. 53(4), 1346–1360 (2014)

9. Haven, E., Khrennikov, A.: Quantum Social Science. Cambridge University Press,

Cambridge (2013)

10. Bowman, G.: Essential Quantum Mechanics. Oxford University Press, Oxford

(2008)

11. Bohm, D.: A suggested interpretation of the quantum theory in terms of hidden

variables. Phys. Rev. 85, 166–179 (1952a)

12. Bohm, D.: A suggested interpretation of the quantum theory in terms of hidden

variables. Phys. Rev. 85, 180–193 (1952b)

13. Bohm, D., Hiley, B.: The Undivided Universe: An Ontological Interpretation of

Quantum Mechanics. Routledge and Kegan Paul, London (1993)

14. Kreps, D.: Notes on the Theory of Choice. Westview Press (1988)

15. Ellsberg, D.: Risk, ambiguity, and the Savage axioms. Q. J. Econ. 75, 643–669 (1961)

16. Shaﬁr, E., Tversky, A.: Thinking through uncertainty: nonconsequential reasoning

and choice. Cognit. Psychol. 24, 449–474 (1992)

17. Busemeyer, J.R., Wang, Z.: Quantum information processing: explanation for inter-

actions between inferences and decisions. In: Quantum Interaction - AAAI Spring

Symposium (Stanford University), pp. 91–97 (2007)

18. Nelson, E.: Derivation of the Schrödinger equation from Newtonian mechanics.

Phys. Rev. 150, 1079–1085 (1966)

19. Nelson, E.: Stochastic mechanics of particles and ﬁelds. In: Atmanspacher, H.,

Haven, E., Kitto, K., Raine, D. (eds.) 7th International Conference on Quantum

Interaction QI 2013. Lecture Notes in Computer Science, vol. 8369, pp. 1–5 (2013)

20. Hawkins, R.J., Frieden, B.R.: Quantization in financial economics: an information-theoretic approach. In: Haven, E., Khrennikov, A. (eds.) The Palgrave Handbook of Quantum Models in Social Science: Applications and Grand Challenges. Palgrave-Macmillan Publishers (2015)

21. Detemple, J., Rindisbacher, M.: The private information of risk. In: Haven, E.

et al. (eds.) The Palgrave Handbook of Post Crisis Financial Modelling. Palgrave-

MacMillan Publishers (2015)

22. American Institute of Physics: Walking droplets: strange behavior of bouncing

drops demonstrates pilot-wave dynamics in action. Science Daily, 1 October 2013

23. Wind-Willassen, O., Molácek, J., Harris, D.M., Bush, J.: Exotic states of bouncing

and walking droplets. Phys. Fluids 25, 082002 (2013)

24. Hardesty, L.: New math and quantum mechanics: ﬂuid mechanics suggests alter-

natives to quantum orthodoxy. M.I.T. ScienceDaily, 12 September 2014

25. Bush, J.W.M.: Pilot wave hydrodynamics. Ann. Rev. Fluid Mech. 47, 269–292

(2015)

26. Fort, E., Eddi, A., Boudaoud, A., Moukhtar, J., Couder, Y.: Path-memory induced

quantization of classical orbits. Proc. Nat. Acad. Sci. USA 107(41), 17515–17520

(2010)

How Annualized Wavelet Trading

“Beats” the Market

Lanh Tran(B)

LanhTran14@gmail.com

Abstract. The market refers to the S&P 500 stock index SPY, which

is an important benchmark of U.S. stock performances, and “beating”

the market means earning a return greater than the market. The pur-

pose of this paper is to showcase an annualized wavelet trading strategy

(WT) that outperforms the market at a fast rate. The strategy is con-

tained in the website AgateWavelet.com. No prediction of market prices

is involved and using the website does not require any skills on the part of

the trader. By trading the index SPY back and forth about 4 to 5 times

a week for a year, the wavelet trader WT has an expected rate of return

approximately 26% higher than the market. The Sharpe ratios are com-

puted and they show that WT also has a higher expected risk-adjusted

return than the market. The result is a surprise since SPY has long been

considered to be a stock to buy and hold. In addition, proponents of the

Eﬃcient Market and Random Walk hypotheses claim that the market

is “unbeatable” because market prices are unpredictable. Thus WT also

provides a counterexample to this claim.

1 Introduction

The market in this paper represents the S&P 500, which is an important bench-

mark of U.S. stock performances, and “beating” the market means earning a

return greater than the market. Is there any trading strategy that can “beat”

the market? This question is of much interest to stock traders, academicians

and people with interest in business ﬁnance and economics. There is plenty of

empirical evidence and statistics against the existence of such a strategy; Barber

and Odean [3] show that traders who buy and sell frequently usually end up los-

ing more money than those who trade less often. In addition, proponents of the

random walk hypothesis (RWH) and eﬃcient market hypothesis (EMH) assert

that stocks take an unpredictable path and hence it is impossible to outperform

the overall market consistently. There is a large body of literature on the EMH

and RMH. For a bibliography, see Fama [4–6], Fama and French [7], Malkiel [8],

Tran [12] and the references therein. Readers interested in predicting the market

are referred to a recent book by Yardani [13] for more information.

Recently, Tran [12] has shown that there are strategies that “beat” the market

consistently without involving price prediction. However, these strategies are

not quite useful in real-life trading due to the long waiting time to “beat” the


market. The purpose of the current paper is to showcase a more eﬀective wavelet

trading strategy (WT) that “beats” the market at a fast rate. The new strategy

is contained in the website AgateWavelet.com. Trading decisions made by the

website depend on movements of wavelets caused by price ﬂuctuations.

Relevant data to the paper is displayed at

https://iu.box.com/s/spy64zjv0fx9fa11ndvt1zt419m5c2xj

which will be referred to as “the link”. The SPY historical data set displayed at

the link lists the dates and corresponding adjusted closing prices of SPY from

January 29th, 1993 to January 26th, 2018. The data was downloaded on January

27th, 2018 from the Yahoo Finance website online. A date of the year will often

be displayed in the same style of the SPY data set. For example, January 27th,

2018 is written as 1/27/2018 or 1/27/18.

The paper is organized as follows: Sect. 2 discusses stock market trading

in general and also presents the assumptions. Section 3 explains the website in

detail. The focus is on annualized trading which lasts exactly one year. WT is

programmed to buy 1,000 shares of SPY on the first trading day and to liquidate all her shares one year later. On each day of trading during a calendar year, the trader enters the date and corresponding price of SPY and activates the website.

The computer program in the website processes the information, and then tells

WT whether to buy, sell or hold. The exact number of shares to trade is also

speciﬁed. An example using dates and prices of SPY for a year is provided.

Section 4 discusses Sharpe ratios which are used to show that WT has a higher

risk-adjusted return than the market. Section 5 shows that WT outperforms the

market using historical data.

Geometric Brownian motion is employed in the Black-Scholes model and is

a very popular model for stock market prices. In Sect. 6, a Geometric Brownian

motion (GBM) is ﬁtted to the historical SPY data. Section 7 compares the per-

formance of WT versus the market using simulated data generated by the ﬁtted

GBM. Again, WT outperforms the market with an expected return higher than

the market by about the same percentage found in Sect. 5. The computed Sharpe

ratio shows that WT has a Sharpe ratio higher than the market.

Section 8 provides some descriptive statistics using graphs to compare the

performances of WT and the market. Section 9 explains the idea behind WT’s

strategy and Sect. 10 discusses the results of the paper and some other relevant

issues. Some material from Tran [12] is included here for completeness. The

mathematical formulas used in the paper are presented at the link.

The numbers from Sects. 4–8 demonstrate clearly that:

Based on historical data and an unlimited amount of simulated data, after a

year of trading the SPY back and forth, the trader WT has an expected return

higher than the market by about 26%. In addition, WT has a higher risk-adjusted

return than the market.

A game-theoretic argument then easily shows that WT “beats” the market

consistently by repeatedly employing the year-long strategy at the website. This


is quite surprising since it has long been considered that SPY is a stock to buy

and hold and that it is impossible to “beat” the market.

2 Stock Trading

Margin. A trader can borrow money from her brokerage to buy stocks. The collateral for the borrowed funds is the stocks and cash in the

investor’s account. A regular trader can easily get a 2:1 margin loan at any

brokerage with a margin account. Interactive Brokers (IB) is a large premier

brokerage and currently an owner of a special memorandum account at IB can

get a margin of 2.25:1. Suppose WT is allowed to hold a margin leverage of a:1

where a is 2 or higher, then for every dollar she has in her account, she can buy

up to a dollars in stocks.

Cash. If WT buys on margin and owes the brokerage money, then cash is a

negative number and she has to pay interest to the brokerage on the money that

she owes. Cash is positive if she has surplus money in her account.

Trading by WT is done through a brokerage which must satisfy the following

assumptions:

year and collects interest daily (compounded once a day). The trader gets 2%

interest for surplus cash in her account.

2.25 and checks WT’s margin every day (usually around 4:00 PM) before closing

time.

a transaction.

3.2%. However, IB does not pay interest for surplus cash in the trader’s account.

WT’s cash is almost never positive, so the interest she gets does not aﬀect her

return much. A trader with a lot of surplus cash can transfer her money to a

sweep account to collect some interest. The result of the paper would not change

much if WT gets no interest for surplus cash.

The commission charged by IB is $1.00 if the number of shares per trade is

200 or less; it is $0.005 per share if a trade involves more shares. The commission

charged by the brokerage in Assumption 3 is rather high.

Market Value. Since WT has only SPY stocks, the market value equals the

number of shares multiplied by the current SPY share price.

Account Balance. The account balance is the total value of WT’s account. It

is the net liquidation value of her account which equals the sum of market value

plus cash.


Buying Power. This is the amount of money still available for WT to buy

stocks. Since her account balance is the sum of the market value and cash, her

buying power equals a × (account balance) − (market value), where a:1 is her margin leverage. WT receives a margin call if her buying power becomes negative.

Interest. Assume that interest is collected once a day.

3 The Website

Welcome to AgateWavelet.com

Run Annualized Wavelet Trading Strategy

Enter Data. Click on “Run Annualized Wavelet Trading Strategy”, and then

enter the margin ratio that the brokerage allows you. Click on “Enter Data” and

then type in the date and price of SPY. The date needs to have two digits for

the month, two digits for the day and four digits for the year.

Let us start with an example. Consider the set of data SPY-Histo-Data in

Exhibit 1 stored in the Folder Data at the link. This set of data contains a list

of dates and prices of SPY from 1/26/18 to 1/25/19. You can cut and paste this

data set on the screen. If you choose the excel.csv ﬁle, then you need to format

the dates column so the year is displayed with 4 digits. If you choose excel.xlsx,

then the years in the dates column are already written with 4 digits, so no

formatting is needed. You should download the ﬁles before doing any copying

and pasting.

Generate Data. Click on “Generate Data” to activate the website. A table

(Results Table) appears at the bottom of the page with information instructing

the trader as to sell, buy or hold. The outputs on the upper right hand corner

are the Excel ﬁles: result.csv, result-no-zeros.csv and result.xlsx. They can be

downloaded and opened in your computer. Result.csv is a text ﬁle which can

be opened with a text editor but no text formatting can be done with this ﬁle.

Result-no-zeros.csv is the same as result.csv with days of no trading deleted

and result.xlsx contains the same information as result.csv but looks nicer. The

output ﬁles are displayed in the folder Output in Exhibit 1 at the link.

Reset Inputs. You should press “Reset Inputs” to clear the memory of the

computer at the website before you enter new data on the screen.

The variables in result.csv are similar to the variables in the tables displayed

in Tran [12]. A full detailed explanation of these variables can be found there. I

now brieﬂy describe them for completeness.

The ﬁrst three columns list respectively the trading dates, share prices at

which trades are made and the numbers of shares traded.


The fourth column lists the commissions charged by the brokerage for each

trade.

The ﬁfth column (CumCom) lists cumulative commissions, which are total

commissions paid up to the current trading day.

The sixth column (Cost) lists the cost of each trade which equals the number

of shares traded multiplied by the share price.

The seventh column (CumCost) lists the cumulative cost paid by the trader.

The eighth column (CumShares) lists the total number of shares held by the

trader after each trade.

The ninth column (MV) lists the market value of the trader’s stocks on each

trading day.

The tenth column lists the amount of cash in the trader’s account.

The eleventh column lists WT’s buying power.

The twelfth column lists the interest WT pays or collects for the cash in her

account.

The thirteenth column lists the account balance, which is equal to the market

value of WT’s shares plus cash in her account.

The fourteenth column lists WT’s return, which equals her account balance

subtracted by the sum of her original investment and cumulative commissions.

The ﬁfteenth column lists the return of the market, which is what WT’s return

would be had she bought 1,000 shares on 1/26/18 and held them without trading.

The formulas and computations to make result.csv are diplayed in the ﬁle

Formula.xlsx in Exhibit 1 at the link.

4 Sharpe Ratios

The ex-post Sharpe Ratio (see Sharpe [10,11]) is used for risk-adjusted returns.

The rates of return of WT are benchmarked against the realized values of SPY.

The computation of SR is carried out for result.csv above and the details are

displayed in the ﬁles SR.csv and SR.xlsx contained in the folder Sharpe Ratio

at the link. The market has SR equal to zero while WT has SR equal to 0.12.

Thus WT has a higher risk-adjusted return than the buy-and-hold market strategy (BH).
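For readers who want to reproduce this type of calculation from result.csv, the sketch below computes an ex-post Sharpe ratio of the differential return series (WT minus the SPY benchmark), in the spirit of Sharpe [10,11]; the two return arrays are placeholders, not the data in SR.csv.

```python
import numpy as np

def ex_post_sharpe(trader_returns, benchmark_returns):
    """Ex-post Sharpe ratio: mean of the differential returns divided by their standard deviation."""
    d = np.asarray(trader_returns) - np.asarray(benchmark_returns)
    return d.mean() / d.std(ddof=1)

rng = np.random.default_rng(1)
wt_daily = rng.normal(0.0006, 0.012, 261)     # placeholder daily returns of WT
spy_daily = rng.normal(0.0004, 0.011, 261)    # placeholder daily returns of the market
print(ex_post_sharpe(wt_daily, spy_daily))
```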

5 Comparing WT and the Market Using Historical Data

The historical data set displayed in the folder Historical Data at the link contains

6,295 dates and prices of SPY starting on 1/29/93 and ending on 1/26/18. Let us

compare the performances of WT and the market using past historical data. The

folder HistoDataPerfom (HDP) at the link contains 6,043 files in the “inputs” folder, created as follows:

random-0000.csv lists the dates and prices of SPY from 1/29/93 to 1/28/94,

random-0001.csv lists the dates and prices of SPY from 2/1/93 to 1/31/94,

...

random-6042.csv lists the dates and prices of SPY from 1/26/2017 to

1/25/2018,


The folder “outputs” contains trades made by WT using the ﬁles in the

“inputs” folder.

Upload File. This button is created to check quickly that the output ﬁles

correspond to the input ﬁles. Download “inputs” to your computer then use

the “Upload File” button to upload an input ﬁle. Click on “Generate Data” to

obtain output ﬁles.

The buying power in the output ﬁles are always positive, indicating that WT

never gets a margin call. The results from the output ﬁles are summarized in

stats.csv in the folder Stats. Some interesting statistics are:

WT and the market’s average rates of return are, respectively, equal to

13.92% and 10.79%. Thus WT earns on average a rate of return 29.03% higher

than the market. The average ex-post Sharpe ratio is about 0.046. The formulas

to compute the statistics are in the ﬁle stats.xlsx in HDP.

6 Fitting a Geometric Brownian Motion to the SPY Data

The equation for a geometric Brownian motion (GBM) is given by:

$$S_t = S_0 \exp\Big(\big(\mu - \tfrac{\sigma^2}{2}\big)t + \sigma W_t\Big),$$

where Wt is standard Brownian motion. Here, St is the value of GBM at time

t and S0 is the initial value. The parameters −∞ < μ < ∞ and σ > 0 are

constants. GBM serves as an important example of a stochastic process satisfying

a stochastic diﬀerential equation.

The mean and variance of $S_t$ are:
$$E(S_t) = S_0\exp(\mu t), \qquad \mathrm{Var}(S_t) = S_0^2\exp(2\mu t)\big(\exp(\sigma^2 t) - 1\big).$$

Let $X(t) = \log S_t - \log S_{t-1}$. Then $X(t) = \mu - \sigma^2/2 + \sigma(W_t - W_{t-1})$. Hence X(t) is normally distributed with mean μ − (σ²/2) and standard deviation

σ. A total of 6,294 values of Xt is obtained from the spy-historical.data set at the

link. Using these values, estimates of the drift parameter μ − (σ 2 /2) and standard

deviation σ are, respectively, 0.00037294 and 0.011597345. These estimates are,

respectively, the sample mean and sample standard deviation of 6,294 observations

of Xt ’s. The ﬁtting of the GBM is carried out in the ﬁle GBM-ﬁt.xlsx at the link.
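A minimal sketch of this fitting step (assuming a one-dimensional array of adjusted closing prices such as the SPY series at the link; this is not the GBM-fit.xlsx workbook itself):

```python
import numpy as np

def fit_gbm(prices):
    """Estimate the daily drift mu - sigma^2/2 and volatility sigma of a GBM from prices,
    via the log returns X_t = log S_t - log S_{t-1}."""
    x = np.diff(np.log(np.asarray(prices, dtype=float)))
    drift = x.mean()           # sample mean: about 0.00037294 for the 6,294 SPY log returns
    sigma = x.std(ddof=1)      # sample standard deviation: about 0.011597345 for the SPY data
    return drift, sigma
```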

7 Comparing WT and the Market by Simulation

This section is devoted to comparing, by simulation, the performances of WT and the market for the period of one year. The historical data ends on 1/26/2018, so


let us choose this date to start the year. On this day the price of SPY is 286.58

dollars a share. The simulated data is generated using the equation

$$S_t = 286.58\,\exp\big(0.00037294\,t + 0.011597345\,W_t\big).$$

On 1/26/2018, the trading starts with S0 = 286.58. The next trading day

is 1/29/2018 since 1/27/2018 and 1/28/2018 are weekend days. The simulated

price of SPY on 1/29/2018 is S1 .
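The sketch below generates one such simulated year in the same spirit as the website's "Random Data" button (a simplified stand-in, not the site's actual code), using the fitted values quoted above and 261 trading days:

```python
import numpy as np

def simulate_spy_year(s0=286.58, drift=0.00037294, sigma=0.011597345,
                      n_days=261, seed=None):
    """Return S_0, S_1, ..., S_{n_days} with S_t = s0 * exp(drift * t + sigma * W_t)."""
    rng = np.random.default_rng(seed)
    w = np.cumsum(rng.standard_normal(n_days))   # Brownian motion sampled at t = 1, ..., n_days
    t = np.arange(1, n_days + 1)
    return np.concatenate(([s0], s0 * np.exp(drift * t + sigma * w)))

prices = simulate_spy_year(seed=42)              # one simulated price path for the trading year
```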

Random Data. Go to the webpage and click on “Run Annualized Wavelet

Trading Strategy” and then enter the margin that you hold. Click on “Random

Data” and then on “Generate Data”. The website generates an Excel.csv ﬁle

containing simulated values of SPY for one year. There are a total of 261 values

since weekend days are excluded. You can continue by clicking on “Generate

Table” to see how WT trades with this set of data.

If you play with “Random Data” long enough, you will see that WT is more

likely to “beat” the market in a “bull year” when prices increase, and the market

is more likely to “beat” WT in a “bear” year when prices decrease. Since there

are more “bull” years than “bear” years, WT “beats” the market in the long

run.

Multi File Simulator. The Multi-File Simulator page is created to replicate

“Random data” up to 1,000 times. Go to the web page and click on Multi-File

simulator then enter the margin ratio allowed by your brokerage and the number

of times you want to replicate. Let us set the number of times to be 1,000. Click

on “Run Simulation” then wait a couple of minutes. The website generates three

ﬁles: stats.csv, inputFiles.csv and outputFiles.csv.

inputFiles.csv contains 1,000 sets of simulated data for the trading year begin-

ning on 1/26/18 and ending on 1/25/19.

outputFiles.csv contains the trades of WT corresponding to the data in the

input ﬁles.

stats.csv contains relevant statistics obtained from an analysis of the

output ﬁles.

Exhibit 2 at the link contains the results of 1,000 replications of “Random

Data” as an example. Denote the diﬀerence between the returns of WT and the

market by alpha. Denote the average return of WT and the market, respectively,

by WT Ave and Market Ave. The statistics from stats.csv are summarized below:

(1) The diﬀerence alpha is positive in 575 cases out of 1,000.

(2) WT Ave and Market Ave are, respectively, $42,561.23 and $33,784.8 for

the 1,000 replications. WT’s trading produces an average increase in return

of $8,776.43 or 25.98% for the year.

(3) WT’s expected rate of return is 100×(42, 561.23/286, 694.63)%, or 14.83%

and the market’s rate of return is 100 × (33, 784.8/286, 694.63)%, or 11.78%.

By trading occasionally, WT’s yearly rate of return is 25.89% higher than the

market.

(4) WT’s expected Sharpe ratio is taken as the average of the 1,000 Sharpe

ratios. It is approximately equal to 0.025.

(5) The “% Inc. Return” in stats.csv is the percentage increase in return due to trading. This number makes sense only when Market Ave is positive. If

you set the number of random ﬁles in the Multi-File Simulator to be 100 or

above, then Market Ave is positive with probability practically equal to 1.

The expected values reported in the introduction are computed from averages

of numerous runs by the Multi-File Simulator. Note that the buying powers of

WT in the output ﬁles in Exhibit 2 are always positive, indicating, that WT

never gets a margin call.

The use of Sharpe ratio rests on the assumption that returns are normally

distributed. Daily returns have a tendency to be heavy-tailed (see Nagahara [9],

for example).

However, yearly returns of WT and the market, being sums of 261 daily

returns, have distributions that are better approximated by normal distributions.

This follows since sums of random variables with ﬁnite means and variances tend

to normality by the central limit theorem. The graphs of the densities of WT’s

returns and market’s returns will be presented in the next section.

The calculation of Sharpe ratio using yearly returns is in ﬁle SR-

YearlyRet.xlsx in Exhibit 2. The Sharpe ratio is found to be 0.298.

The next section is devoted to provide some descriptive statistics regarding

WT’s returns and market’s returns.

8 Graphs

QQ Plot. The QQ plot is a graph of the quantiles of the trader’s returns against

the quantiles of the market’s returns. The QQ plot is displayed below.

The returns of WT and the market are sorted in increasing order and paired

together for a total of 1,000 pairs. The data set obtained is referred to as QQ-

Plot.csv displayed in the folder Quantile-QQ-Plot at the link.

The QQ-Plot.csv ﬁle can be used to compute the percentiles of the trader’s

and market returns. The 1st quartile, 2nd quartile (median) and 3rd quartile

of the returns of WT are, respectively, −$22,631.28, $34,388.36 and $94,222.46.

The 1st quartile, 2nd quartile (median) and 3rd quartile of the returns of the

market are, respectively, −$11,194.63, $28,455.37, and $69,945.37.
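The pairing behind the QQ plot is straightforward to reconstruct from the two return columns (a sketch of the sorting-and-pairing step; the inputs stand for the 1,000 simulated returns of WT and of the market):

```python
import numpy as np

def qq_pairs(trader_returns, market_returns):
    """Sort both samples and pair them quantile by quantile, one row per pair."""
    t = np.sort(np.asarray(trader_returns, dtype=float))
    m = np.sort(np.asarray(market_returns, dtype=float))
    return np.column_stack((t, m))   # column 0: trading return, column 1: market return
```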

(i) Look at the straight line going through the points (0,0) and (200,000,

200,000). A point on the QQ plot below this line indicates that WT does

better than the market and any point above it indicates otherwise.

(ii) The point where the graph intersects the horizontal axis occurs when market

price stays put after one year. This point is slightly to the left of the origin,

indicating that the market performs better than WT if market price does

not increase by the end of the trading year.

(iii) The graph lies almost in a straight line, showing that the trader’s returns

and market’s returns have similar distributions. They are only diﬀerent by

a change of scale and location.


(iv) In the upper right hand side and lower left side, the scattered points indicate

that WT’s return has a heavier tail than the market.

(v) The QQ plot indicates that WT tends to underperform the market in a

bear year and vice versa outperforms the market in a bull year. The gain

of WT over the market in a bull year is likely bigger than the gain of the

market over WT in a bear year. At any moment, WT is more likely to be

in bull than bear territory since SPY has an increasing trend. Hence WT

“beats” the market in the long run.

[QQ plot of the 1,000 paired returns: quantiles of WT's returns on the horizontal axis ("Trading Return", roughly −100,000 to 400,000) against quantiles of the market's returns on the vertical axis ("Market Return", roughly −100,000 to 200,000), together with the reference line through (0,0) and (200,000, 200,000).]

Kernel Density Estimators of Trader’s Returns and Market Returns.

Below are graphs of kernel estimators of the probability density functions (pdf)

of WT’s and market’s returns. The graphs are continuous and look nicer than

histograms.
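Such smooth estimates can be obtained, for example, with a Gaussian kernel; the snippet below is a generic sketch and does not reproduce the exact bandwidth or plotting settings used for the figures.

```python
import numpy as np
from scipy.stats import gaussian_kde

def density_on_grid(sample, grid):
    """Gaussian kernel density estimate of a return sample, evaluated on a grid of values."""
    return gaussian_kde(np.asarray(sample, dtype=float))(grid)

grid = np.linspace(-150_000, 450_000, 400)
# wt_pdf = density_on_grid(wt_returns, grid)          # wt_returns: the 1,000 simulated WT returns
# market_pdf = density_on_grid(market_returns, grid)  # market_returns: the corresponding market returns
```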

[Kernel density estimates of WT's returns ("Trading") and the market's returns ("Market"): return value on the horizontal axis, estimated density on the vertical axis.]

Note the following:

(i) The pdf of WT’s returns clearly has heavier tails than the pdf of the mar-

ket’s returns and is more skewed toward the right hand side. The kernel

density plots clearly show that WT’s returns are greater than the market

in the upper tail, but lower than the market in the lower tail. There is

a region in the middle where positive returns can be achieved with high

probability using the market strategy.

(ii) The probability that market’s returns exceed $270,000 is practically zero,

whereas the probability of this event is about .01 for WT’s returns. The

probability that market’s losses exceed $100,000.00 is practically zero,

whereas, the probability of this event is about 0.0275 for WT’s return.

(iii) The mean and standard deviation of the market’s returns are, respectively,

$33,784, and $59,193, whereas, the mean and standard deviation of the

trader’s returns are, respectively, $42,561, and $87,411.

Kernel Density Estimator of Alpha. Below is the graph of a kernel density estimator for the pdf of the alpha. Recall that alpha is the difference between the returns

of WT and the market.

[Kernel density estimate of alpha: value on the horizontal axis, estimated density on the vertical axis.]

(i) Note that alpha is a negative number if WT’s return is less than the market’s

return. The probability that alpha is negative is .425.

(ii) The distribution of alpha is skewed to the right. The mean and standard

deviation of alpha are, respectively, $8,776 and $29,412. The median of

alpha is $5,616, which is much smaller than the mean. This happens due to

the long tail of the distribution of alpha.

(iii) Suppose the distribution of alpha has mean μ and variance σ². Consider the problem of testing H0: μ = 0 against HA: μ > 0. Let $\bar{A}$ and s denote, respectively, the sample mean and sample standard deviation of alpha. The usual test rejects H0 if $\bar{A}\sqrt{1000}/s$ is large. Under H0, this statistic is approximately normally distributed with mean zero and variance 1 by the central limit theorem. A simple computation shows that the sample statistic equals $8776\sqrt{1000}/29412$, or 9.44, which is quite large. The right decision is to reject the null hypothesis H0 since the p-value of the test is practically zero.


(iv) The 95% confidence interval for alpha is $8776 \pm 1.96 \times 29412/\sqrt{1000}$, or [6953, 10599]. The 95% confidence interval for the “% Inc. Return” is [100 × 6953/33784, 100 × 10599/33784], or [20.58, 31.37].
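Both the test statistic in (iii) and the confidence interval in (iv) follow directly from the reported sample mean and standard deviation of alpha; a quick numerical check:

```python
import numpy as np

mean_alpha, sd_alpha, n = 8776.0, 29412.0, 1000
z = mean_alpha * np.sqrt(n) / sd_alpha                   # about 9.44, so H0: E(alpha) = 0 is rejected
half = 1.96 * sd_alpha / np.sqrt(n)
ci = (mean_alpha - half, mean_alpha + half)              # about (6953, 10599)
print(round(z, 2), tuple(round(c) for c in ci))
```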

9 The Idea Behind WT's Strategy

WT starts trading with a buy of 1,000 shares. The only money she ever invests

is the amount that she pays for the 1,000 shares and the commission to buy

these shares. Using margin leverage, she is allowed to buy additional shares.

The number of additional shares she can buy varies with her buying power. At

any point of time, her shares are more likely to increase in price than decrease

since SPY has an increasing trend. To outperform the market, she trades back

and forth while trying to hold on to a minimum of 1,000 shares. Her strategy

is to increase her cumulative number of shares gradually while avoiding margin

calls. She borrows on margin to pay all expenses: money to buy extra shares,

interests on the amount borrowed and trading commissions. The sum of the

interests and extra commissions she pays during the year is likely less than her

gain due to increases in prices of her additional shares. This happens since SPY

tends to inﬁnity at a suﬃciently fast rate. By the end of the year, she has a

higher expected rate of return than the market. Her Sharpe ratio is also higher

than the market.

Avoiding margin calls is the hardest part of the strategy. Note that WT

always maintains a positive buying power. She buys at price dips and attempts

to sell at higher prices. But occasionally, she has to sell at a loss. This is done

to avoid margin calls when price drops dramatically. However, since SPY tends

to increase, she sells high and buys low more often than buys high and sells low.

The important variables are interest rates, commissions, account balances

and buying power. The numbers of shares traded also vary with the depths

of price dips, heights of price peaks, among other variables. They look rather

bizarre due to the unpredictable movements of market prices.

How does WT “beat” the market consistently? Think of her trading as a

series of games with each lasting for a year. She has a higher expected return

than the market in each game. By repeatedly playing the games, she will “beat”

the market in the long run with probability one. The method of the current

paper is game-theoretic and is much simpler to employ than the complicated

asymptotic approach used in [12].

10 Discussion

1. WT can start with an amount of shares, not necessarily equal to 1,000 at the

beginning of a trading year. Suppose she starts with 2,000 shares. Then she

can buy twice as many shares as the number recommended by the website.

2. How many shares should WT begin with at the beginning of a trading year?

There is no mathematical answer to this question. She should just start with

whatever she can aﬀord.


3. SPY is considered to be a low risk stock to trade and a hedge fund manager

may be able to get a margin leverage up to 10:1 (see Ang et al. [1] and

Baker and Filbeck [2]).

4. Under the Capital Asset Pricing Model (CAPM), an investor is assumed

to be able to borrow as much as she wants at the risk-free rate. A margin

leverage of 10:1 may not then be unreasonable under CAPM. A trader

can possibly increase her expected return with a higher leverage. However

borrowing too much can decrease the trader’s risk adjusted return.

5. A trader using WT should check the price of SPY daily. She is required

to buy, sell or hold according to the guidance of WT. However, her yearly

expected return would not change much if she skips a day occasionally.

6. What if WT enters the date and price of SPY several times a day on the

screen? This will probably increase WT’s return slightly by the end of the

year. No deﬁnite answer on this is available at this time.

7. Buying or selling at prices approximately equal to the prices recommended

by WT would not aﬀect the trader’s overall return much.

8. In Sect. 4, WT earns on average a rate of return about 29.03% higher than

the market. The price of SPY is not assumed to behave according to any

model.

9. The geometric Brownian motion is used in the Black-Scholes model. It is

used in the Multi File Simulator to generate an unlimited amount of SPY

data. The ﬁtting of a GBM to the SPY historical data works well for the

purpose of comparing WT and the market. Using simulated data, WT out-

performs the market with a yearly rate of return about 25.89% higher than

the market, which is not too far below the 29.03% found for using real data.

The 26% percent reported in the abstract and introduction appears to be

a conservative estimate.

10. The general belief is that it is impossible to “beat” a GBM since technical

analysis is useless in predicting future prices. Simulation using the Multi

File Simulator clearly shows that predictability of future prices is not a

necessary condition for a trader to consistently “beat” the market. This is

quite surprising.

Exercise. Assume that WT has a 2.25:1 margin leverage from her brokerage. Use

the Multi-File Simulator at the website to show that WT’s expected annual

return is approximately 30% higher than the market. Find also the ex-post

Sharpe ratio of WT and show that WT has a higher risk-adjusted return than

the market.

References

1. Ang, A., Gorovyy, S., van Inwegen, G.B.: Hedge fund leverage. J. Financ. Econ.

102, 102–126 (2011)

2. Baker, H.K., Filbeck, G.: Hedge Funds: Structure, Strategies, and Performance.

Oxford University Press, Oxford (2017)


3. Barber, B.M., Odean, T.: Trading is hazardous to your wealth: the common stock

investment performance of individual investors. J. Financ. 55, 773–806 (2000)

4. Fama, E.F.: The behavior of stock market prices. J. Bus. 38, 34–105 (1965a)

5. Fama, E.F.: Random walks in stock prices. Financ. Anal. J. 21, 55–59 (1965b)

6. Fama, E.F.: Eﬃcient capital markets: a review of theory and empirical work. J.

Financ. 25, 383–417 (1970)

7. Fama, E.F., French, K.R.: The capital asset pricing model: theory and evidence.

J. Econ. Perspect. 18, 25–46 (2004)

8. Malkiel, B.G.: A Random walk down Wall Street, 1st edn. W. W. Norton & Co.,

New York (1973)

9. Nagahara, Y.: Non-Gaussian distribution for stock returns and related stochastic

diﬀerential equation. Financ. Eng. Jpn. Mark. 3, 121–149 (1966)

10. Sharpe, W.F.: Capital asset prices: a theory of market equilibrium under conditions

of risk. J. Financ. 19, 425–442 (1964)

11. Sharpe, W.F.: Adjusting for risk in portfolio performance measurement. J. Portfolio

Manag. 1(2), 29–34 (1975)

12. Tran, L.T.: How wavelet trading “beats” the market. J. Stock Forex Trading 6,

1–6 (2017)

13. Yardani, E.: Predicting the Markets: A Professional Autobiography. Amazon.com

(2018)

Flexible Constructions for Bivariate

Copulas Emphasizing Local Dependence

and Tonghui Wang1(B)

1

Department of Mathematical Sciences, New Mexico State University,

Las Cruces, USA

{xzhu,twang}@nmsu.edu

2

School of Statistics, Jiangxi University of Finance and Economics, Nanchang, China

qingsongshan@gmail.com

3

Graduate School, Chiang Mai University, Chiang Mai, Thailand

titansteng@gmail.com

Abstract. In this paper, a flexible construction of bivariate copulas is provided, which is a generalization of the so-called “gluing

method” and “rectangular patchwork” constructions. A probabilistic

interpretation of the construction is provided through the concept of

the threshold copula. Properties of the construction and best-possible

bounds of copulas with given threshold copulas are investigated. Exam-

ples are given for the illustration of our results.

Keywords: Local dependence · Best-possible bound

1 Introduction

For the purpose of describing the dependence among random variables, in recent

years, copulas are extensively studied by researchers and have been applied in

many ﬁelds, e.g., decision science [2], reliability theory [21,26,34], risk models

[24] and hydrology [36].

By Sklar’s theorem, the importance of copulas stems from two aspects. First,

most dependence properties of random variables can be captured by copulas,

which are independent of marginal distributions and which are, in general, easier

to be handled than the original joint distributions, e.g., [7,22,37,39,42,43,45,

46,48], etc. Second, copulas provide an eﬃcient way to construct multivariate

distributions with given marginal distributions. Therefore, it is important to

consider constructions of copulas, e.g., for bivariate copulas, [10,11,16,28,38,

47], and for multivariate copulas, [17,49], etc. In particular, the so-called “gluing
method” and “rectangular patchwork” constructions of copulas were studied by
[38] and [16], respectively, in which copulas are glued on rectangular subsets
of I². In this article, those constructions are generalized into a more flexible
framework in which copulas may be glued over arbitrary measurable subsets of I²,
and a probabilistic interpretation of the resulting construction is given through
the concept of the threshold copula [15].

The paper is organized as follows. Necessary concepts and properties of copu-

las are brieﬂy reviewed in Sect. 2. Our main results are given in Sect. 3, in which

a ﬂexible construction of bivariate copulas and its probabilistic interpretation

are given, and properties and best-possible bounds with given threshold copulas

are investigated. Conclusion is given in Sect. 4.

2 Preliminaries

A function C : I² → I, where I = [0, 1], is called a (bivariate)
copula if it satisfies the following three conditions:
(i) C(u, 0) = 0 and C(0, v) = 0 for all u, v ∈ I;
(ii) C(u, 1) = u and C(1, v) = v for all u, v ∈ I;
(iii) C is 2-increasing on I², i.e., for every 0 ≤ u1 ≤ u2 ≤ 1 and 0 ≤ v1 ≤ v2 ≤ 1,
VC([u1, u2] × [v1, v2]) = C(u2, v2) − C(u2, v1) − C(u1, v2) + C(u1, v1) ≥ 0.

Throughout this paper, we will focus on bivariate copulas. Thus, in the sequel, we

may omit the word “bivariate”. Let X and Y be random variables deﬁned on a

probability space (Ω, A , P ) with the joint distribution function H and marginal

distribution functions F and G, respectively, where Ω is a sample space, A is a

σ-algebra of Ω and P is a probability measure on A . By Sklar’s theorem [40],

there is a copula C such that for any x and y in R̄ = R ∪ {−∞, ∞},
H(x, y) = C(F(x), G(y)).    (1)
If F and G are continuous, then the copula C is unique and C is called the

copula of X and Y. There are three important copulas defined as follows:
M(u, v) = min(u, v),   W(u, v) = max(u + v − 1, 0),   Π(u, v) = uv
for all (u, v) ∈ I². Π is called the product copula or independence copula. M and
W are called, respectively, the Fréchet-Hoeffding upper bound and lower bound
of copulas. Π, M and W have the following well-known properties.
(i) Continuous random variables X and Y are independent if and only if their copula is Π;

(ii) W (u, v) ≤ C(u, v) ≤ M (u, v) for all copula C and every (u, v) ∈ I 2 ;

(iii) Let X and Y be continuous random variables. Then Y is almost surely an

increasing (or decreasing) function of X if and only if their copula is M (or

W ). For a comprehensive introduction on the theory of copulas, the reader

is referred to the monographs [18,27].
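Property (ii), the Fréchet-Hoeffding bounds, is easy to check numerically. The following minimal Python sketch evaluates the bounds on a grid for the Ali-Mikhail-Haq copula (3) used later in Example 2; the grid resolution is an arbitrary choice.

    import numpy as np

    W = lambda u, v: np.maximum(u + v - 1.0, 0.0)          # lower bound
    M = lambda u, v: np.minimum(u, v)                       # upper bound
    C_amh = lambda u, v: u * v / (1.0 - (1.0 - u) * (1.0 - v))   # AMH copula, Eq. (3)

    u, v = np.meshgrid(np.linspace(0.01, 0.99, 99), np.linspace(0.01, 0.99, 99))
    assert np.all(W(u, v) <= C_amh(u, v) + 1e-12)
    assert np.all(C_amh(u, v) <= M(u, v) + 1e-12)
    print("Frechet-Hoeffding bounds hold on the grid")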


3 Main Results

3.1 A Flexible Construction and Its Properties

First of all, let’s select a copula C0 and use P0 to denote the probability measure

induced by C0 on I 2 . C0 is called a background copula since it will determine

the “ﬂatness” of I 2 in our construction. Let P = {Ai }ni=1 be a collection of

measurable subsets of I² such that P0(Ai ∩ Aj) = 0 for all i ≠ j, where n
is an integer. Without loss of generality, we may assume P0(Ai) > 0 for each

i = 1, · · · , n. Also, let C = {Ci }ni=1 be a collection of copulas. Note that C0

and some Ci ’s may be identical and we require P and C to have the same

cardinality. For convenience, we use a triple (P, C , C0 ) to denote a collection of

subsets of I 2 , a collection of copulas and a background copula satisfying above

conditions throughout this work. Then, we have the following construction.

Definition 1. Given a triple (P, C, C0), we define a function C_{P,C,C0} : I² → I by

C_{P,C,C0}(u, v) = Σ_{i=1}^{n} a_i C_i( P0([0, u] × I | A_i), P0(I × [0, v] | A_i) ) + P0(([0, u] × [0, v]) ∩ A^c)    (2)

for all (u, v) ∈ I², where a_i = P0(A_i), i = 1, ..., n, A^c = I² \ ∪_{i=1}^{n} A_i, and P0(· | ·)
denotes the conditional probability. C_{P,C,C0} is called the copula generated by
(P, C, C0).

The following theorem shows that Deﬁnition 1 is well-deﬁned.

Theorem 1. The function CP ,C ,C0 deﬁned by Eq. (2) is a copula.

Proof. (i) For each v ∈ I,

C_{P,C,C0}(0, v) = Σ_{i=1}^{n} a_i C_i( P0({0} × I | A_i), P0(I × [0, v] | A_i) ) + P0(({0} × [0, v]) ∩ A^c) = 0.

(ii) For each v ∈ I,

C_{P,C,C0}(1, v) = Σ_{i=1}^{n} a_i C_i( P0([0, 1] × I | A_i), P0(I × [0, v] | A_i) ) + P0(([0, 1] × [0, v]) ∩ A^c)
= Σ_{i=1}^{n} P0((I × [0, v]) ∩ A_i) + P0((I × [0, v]) ∩ A^c) = P0(I × [0, v]) = v.


(iii) For every 0 ≤ u1 ≤ u2 ≤ 1 and 0 ≤ v1 ≤ v2 ≤ 1,

V_{C_{P,C,C0}}([u1, u2] × [v1, v2])
= C_{P,C,C0}(u2, v2) − C_{P,C,C0}(u2, v1) − C_{P,C,C0}(u1, v2) + C_{P,C,C0}(u1, v1)
= Σ_{i=1}^{n} a_i [ C_i( P0([0, u2] × I | A_i), P0(I × [0, v2] | A_i) ) − C_i( P0([0, u1] × I | A_i), P0(I × [0, v2] | A_i) )
    − C_i( P0([0, u2] × I | A_i), P0(I × [0, v1] | A_i) ) + C_i( P0([0, u1] × I | A_i), P0(I × [0, v1] | A_i) ) ]
  + P0(([0, u2] × [0, v2]) ∩ A^c) − P0(([0, u1] × [0, v2]) ∩ A^c) − P0(([0, u2] × [0, v1]) ∩ A^c) + P0(([0, u1] × [0, v1]) ∩ A^c)
= Σ_{i=1}^{n} a_i V_{C_i}( [P0([0, u1] × I | A_i), P0([0, u2] × I | A_i)] × [P0(I × [0, v1] | A_i), P0(I × [0, v2] | A_i)] )
  + P0(((u1, u2] × (v1, v2]) ∩ A^c) ≥ 0.

Remark 1. (i) The last term P0(([0, u] × [0, v]) ∩ A^c) in Eq. (2) can be rewritten as

P0(([0, u] × [0, v]) ∩ A^c) = a_{A^c} C_{A^c}( P0([0, u] × I | A^c), P0(I × [0, v] | A^c) )

for some copula C_{A^c}, where a_{A^c} = P0(A^c), so that it has the same form as the other
terms. However, in general, it is not easy to derive the closed form of C_{A^c}.
The benefit of using the expression P0(([0, u] × [0, v]) ∩ A^c) for the last term
of Eq. (2) is that C_{P,C,C0} is identical to C0 on A^c, i.e., C_{P,C,C0}(u, v) = C0(u, v)
for any (u, v) ∈ A^c.

(ii) In the “gluing method” [38] and “rectangular patchwork” [16] constructions,
copulas are linked only over rectangles, i.e., cartesian products of two closed
subintervals of I, but in the construction given by Definition 1, the Ai's can be any
measurable subsets of I², so our method includes their constructions. The flexibility
of our construction can also be seen from the following examples.

Example 1. Let P1 = {D1, D2, D3}, where D1 = {(u, v) ∈ I² : v ≥ 2u}, D2 = {(u, v) ∈ I² : 0.5u ≤ v ≤ 2u} and D3 = {(u, v) ∈
I² : v ≤ 0.5u}, and let C1 be a collection of three copulas (the regions labeled M and W in Fig. 1 indicate the copulas glued there). The graphs of the partition and C_{P1,C1,Π} are given in Fig. 1.


Fig. 1. The partition of I² and the graph of C_{P1,C1,Π} (axes U and V; the regions on which M and W are glued are labeled accordingly).

Similarly, let P2 = {E1, E2, E3}, where E1 = {(u, v) ∈ I² : v ≥ 1 − (1 − u)²}, E2 = {(u, v) ∈ I² : u² ≤ v ≤ 1 − (1 − u)²} and E3 = {(u, v) ∈ I² :
v ≤ u²}. The graphs of the partition and C_{P2,C2,Π} are given in Fig. 2.

The background copula C0 is nontrivial in the next example.

Example 2. Consider the Ali-Mikhail-Haq copula [3] given by

C^{AMH}(u, v) = uv / (1 − (1 − u)(1 − v)).    (3)

The copula C_{Pd,{M,W},C^AMH} generated by (Pd, {M, W}, C^AMH) is different from
C_{Pd,{M,W},Π} generated by (Pd, {M, W}, Π). The graph of C_{Pd,{M,W},C^AMH} and
the contour plot of the difference C_{Pd,{M,W},C^AMH} − C_{Pd,{M,W},Π} are given in Fig. 3.

In general, it is not easy to derive the explicit expression of a construction

deﬁned by Eq. (2). The next result gives us the general formulas for constructions

when C0 = Π and I 2 is partitioned by a straight line from the left edge u = 0

of I 2 to the right edge u = 1.



Fig. 3. CP d ,{M,W },C AM H and contour plot of CP d ,{M,W },C AM H − CP d ,{M,W },Π

Let P1 = {B1, B2}, where B1 = {(u, v) ∈ I² : v ≤ (b2 − b1)u + b1} and B2 = {(u, v) ∈ I² : v ≥ (b2 − b1)u + b1} with
0 ≤ b1, b2 ≤ 1. Let b− = min{b1, b2} and b+ = max{b1, b2}. Then the copula
generated by (P1, {C1, C2}, Π) is given by

C_{P1,{C1,C2},Π}(u, v) =

  ((b1 + b2)/2) C1( ((b2 − b1)u² + 2b1u)/(b1 + b2), 2v/(b1 + b2) ),   if 0 ≤ v ≤ b−,

  ((b1 + b2)/2) C1( ((b2 − b1)u² + 2b1u)/(b1 + b2), (2b+v − v² − b−²)/((b1 + b2)(b+ − b−)) )
    + ((2 − b1 − b2)/2) C2( ((2 − 2b1)u − (b2 − b1)u²)/(2 − b1 − b2), (v − b−)²/((b+ − b−)(2 − b1 − b2)) ),   if b− ≤ v ≤ b+,

  ((b2 − b1)u² + 2b1u)/2 + ((2 − b1 − b2)/2) C2( ((2 − 2b1)u − (b2 − b1)u²)/(2 − b1 − b2), (2v − b1 − b2)/(2 − b1 − b2) ),   if b+ ≤ v ≤ 1.    (4)

In particular, writing C_{b1,b2,C1,C2} for the copula in (4), the choices b1 = 0, b2 = 1 and b1 = 1, b2 = 0 give

C_{0,1,C1,C2}(u, v) = 0.5 C1(u², 2v − v²) + 0.5 C2(2u − u², v²),    (5)

C_{1,0,C1,C2}(u, v) = 0.5 C1(2u − u², 2v − v²) + 0.5 C2(u², v²).    (6)

Remark 2. Note that if b1 = b2 , then the above construction (4) is reduced to the

gluing construction of two copulas studied by [38]. In addition, the construction

(4) is diﬀerent from constructions studied by [23,33], in which they studied

constructions of copulas with given aﬃne sections and sub-diagonal sections,

respectively. As an example of constructions (5) and (6), graphs of C0,1,M,W

and C1,0,M,W are provided in Fig. 4.
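For readers who wish to experiment with these formulas, the following minimal Python sketch evaluates the construction (5) with C1 = M and C2 = W (the copula C_{0,1,M,W} of Fig. 4) and checks the 2-increasing property and a boundary condition on a grid; the grid size is an arbitrary choice.

    import numpy as np

    M = lambda u, v: np.minimum(u, v)
    W = lambda u, v: np.maximum(u + v - 1.0, 0.0)

    def C_01(u, v, C1=M, C2=W):                       # Eq. (5)
        return 0.5 * C1(u**2, 2*v - v**2) + 0.5 * C2(2*u - u**2, v**2)

    g = np.linspace(0.0, 1.0, 101)
    U, V = np.meshgrid(g, g, indexing="ij")
    Z = C_01(U, V)
    vol = Z[1:, 1:] - Z[1:, :-1] - Z[:-1, 1:] + Z[:-1, :-1]   # C-volumes of grid cells
    print("min cell volume:", vol.min())                       # should be >= -1e-12
    print("C(1, 0.3) =", C_01(1.0, 0.3))                       # should equal 0.3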

The diagonal section and opposite diagonal section of a copula C are the

function δC : I → I and ωC : I → I deﬁned by δC (t) = C(t, t) and

ωC (t) = C(t, 1 − t), respectively. A diagonal function is a function δ : I → I


such that δ(1) = 1, δ(t) ≤ t for all t ∈ I, and δ is increasing and 2-Lipschitz, i.e.,
|δ(t′) − δ(t)| ≤ 2|t′ − t| for all 0 ≤ t, t′ ≤ 1. An opposite diagonal function is a
function ω : I → I such that ω(t) ≤ min(t, 1 − t) for all t ∈ I and ω is 1-Lipschitz,
i.e., |ω(t′) − ω(t)| ≤ |t′ − t| for all 0 ≤ t, t′ ≤ 1. The (opposite) diagonal section of
a copula is an (opposite) diagonal function. Conversely, for any diagonal function δ

(or opposite diagonal function ω), there exist copulas with diagonal section δ (or

opposite diagonal section ω) [12]. For our constructions, we have the following

results.

The diagonal sections and opposite diagonal sections of the constructions (5) and (6) are given by

δ_{0,1,C1,C2}(t) = 0.5 C1(t², 2t − t²) + 0.5 C2(2t − t², t²),
ω_{0,1,C1,C2}(t) = 0.5 ω_{C1}(t²) + 0.5 ω_{C2}(2t − t²),    (7)

and

δ_{1,0,C1,C2}(t) = 0.5 δ_{C1}(2t − t²) + 0.5 δ_{C2}(t²),
ω_{1,0,C1,C2}(t) = 0.5 C1(2t − t², 1 − t²) + 0.5 C2(t², (1 − t)²).    (8)

The above results can be used to construct copulas with given diagonal or oppo-

site diagonal sections.

Example 3. Consider the diagonal function δ(t) = 2t3 − t4 . Let C1 (u, v) and

C2 (u, v) be F-G-M copulas [3] given by

C1 (u, v) = uv(1 + θ(1 − u)(1 − v)), and C2 (u, v) = uv(1 − θ(1 − u)(1 − v)),

where θ ∈ [−1, 1]. Then it can be shown that δ0,1,C1 ,C2 (t) = 2t3 − t4 , which is

free of θ.
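The claim of Example 3 is easy to verify numerically; the following minimal Python sketch evaluates the diagonal section of the construction (5) for the two F-G-M copulas and a few arbitrary values of θ.

    import numpy as np

    def delta_01(t, theta):
        C1 = lambda u, v: u * v * (1 + theta * (1 - u) * (1 - v))
        C2 = lambda u, v: u * v * (1 - theta * (1 - u) * (1 - v))
        return 0.5 * C1(t**2, 2*t - t**2) + 0.5 * C2(2*t - t**2, t**2)

    t = np.linspace(0.0, 1.0, 101)
    for theta in (-1.0, -0.3, 0.5, 1.0):
        assert np.allclose(delta_01(t, theta), 2*t**3 - t**4)
    print("delta_{0,1,C1,C2}(t) = 2t^3 - t^4 for all tested theta")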

Convergence properties of patchwork-type constructions of copulas have been studied by several researchers [8,13]. For our construction, we have the following result.


If the background copula C0 and the number n of subsets in P are fixed, then copulas generated by (P, C, C0) for arbitrary collections of copulas
C and arbitrary collections of subsets P converge to C0 under the uniform norm
||·||∞ as max_i{a_i = P0(A_i)} shrinks to 0.

Proof. First, note that ||M − W||∞ = 0.5 and, for the fixed C0 and any P = {A_i}_{i=1}^{n}, by Sklar's theorem there are copulas C_1^P, ..., C_n^P such that

C0(u, v) = Σ_{i=1}^{n} a_i C_i^P( P0([0, u] × I | A_i), P0(I × [0, v] | A_i) ) + P0(([0, u] × [0, v]) ∩ A^c).

For any ε > 0 and any P such that max_i{a_i = P0(A_i)} < 2ε/n,

||C_{P,C,C0} − C0||∞ ≤ Σ_{i=1}^{n} a_i ||C_i − C_i^P||∞ < n · (2ε/n) · ||M − W||∞ = ε.

For a probabilistic interpretation of our construction, we need the concept of the

threshold copula, which was deﬁned by [15] as follows.

Definition 2. Let X and Y be random variables deﬁned on a probability space

(Ω, A , P ). Suppose that B ⊆ R̄2 is such that P ({ω ∈ Ω : (X(ω), Y (ω)) ∈ B}) >

0. Then the threshold copula of X and Y over B is a copula CB such that

P (X ≤ x, Y ≤ y|(X, Y ) ∈ B) = CB P (X ≤ x|(X, Y ) ∈ B), P (Y ≤ y|(X, Y ) ∈ B) (9)

for all x, y ∈ B.

Remark 3. (i) From Deﬁnition 2, local dependence of X and Y over B can be

captured by the threshold copula of X and Y over B. In [19,20,32], several

versions of conditional copulas were deﬁned by conditional probabilities, which

are slightly diﬀerent from the threshold copula.

(ii) In general, threshold copulas of a copula C over arbitrary subsets of I² are
different from C. For example, let B1 = {(u, v) ∈ I² : v ≤ u} and B2 = {(u, v) ∈
I² : v ≥ u}. Then it can be shown that the threshold copulas of Π over B1 and B2,
respectively, are

Π_{B1}(u, v) = 2√u · min{√u, 1 − √(1 − v)} − (min{√u, 1 − √(1 − v)})²,

and

Π_{B2}(u, v) = 2√v · min{1 − √(1 − u), √v} − (min{1 − √(1 − u), √v})².

However, it can be proved that for any rectangle B = [a1, a2] × [b1, b2] ⊆ I² such
that VC(B) > 0, where C = M, W or Π, the threshold copula of C over B
is C again.
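The closed form of Π_{B1} above is easy to check by simulation; the following minimal Python sketch conditions uniform samples on B1 = {(u, v) : v ≤ u} and compares the empirical conditional probability at one arbitrary point with the formula.

    import numpy as np

    def Pi_B1(u, v):                       # closed form of the threshold copula
        m = np.minimum(np.sqrt(u), 1.0 - np.sqrt(1.0 - v))
        return 2.0 * np.sqrt(u) * m - m**2

    rng = np.random.default_rng(1)
    X, Y = rng.random(500_000), rng.random(500_000)
    inB1 = Y <= X
    X1, Y1 = X[inB1], Y[inB1]

    x, y = 0.6, 0.4                        # an arbitrary point of B1
    u = np.mean(X1 <= x)                   # P(X <= x | (X,Y) in B1)
    v = np.mean(Y1 <= y)                   # P(Y <= y | (X,Y) in B1)
    joint = np.mean((X1 <= x) & (Y1 <= y)) # P(X <= x, Y <= y | (X,Y) in B1)
    print(joint, Pi_B1(u, v))              # the two numbers should nearly agree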


A probabilistic interpretation of the construction (2) can now be given as follows; the proof is not difficult.

Theorem 2. Given a triple (P, C , C0 ), where P = {Ai }ni=1 and C = {Ci }ni=1 .

Then the threshold copula of CP ,C ,C0 over Ai is Ci , i = 1, · · · , n.

Since M and W are the best-possible bounds on the class of all copulas with
given marginals, recently the problem of finding bounds on classes of copulas
with prescribed properties has been studied by many researchers, e.g., given
values at some points of I² [25,35], given values on subsets of I² [6,14,41],
given diagonal sections [30,44], and given measures of association [5,29,31]. For the
construction (2), we have the following result.

Proposition 3. Fix a background copula C0. Let P be a collection of subsets
of I² as in the first paragraph of this section. For an arbitrary collection of copulas
C such that |C| = |P| = n and any (u, v) ∈ I²,

C_{P,{W,...,W},C0}(u, v) ≤ C_{P,C,C0}(u, v) ≤ C_{P,{M,...,M},C0}(u, v).

Since copulas with the same threshold copulas must have the same local

dependence, it is necessary to consider bounds on copulas with given thresh-

old copulas. Intuitively, for ﬁxed subsets of I 2 and ﬁxed threshold copulas, the

best-possible upper bound and lower bound should be generated by choosing

background copulas C0 = M and C0 = W, respectively. However, the following
example shows that this may not hold in general, even when the global copulas are
well defined.


Fig. 5. Contour plot of the difference C_{A,W,M} − C_{A,W,C^FGM} of Example 4.

Example 4. We equip the threshold copula W over A = [0, 0.5]2 . Then it can be

shown that CA,W,M is not always greater than CA,W,C F GM , where C F GM (u, v)

is the F-G-M copula given by C F GM (u, v) = uv(1 + (1 − u)(1 − v)). For example,

CA,W,M (0.2, 0.3) − CA,W,C F GM (0.2, 0.3) = −0.03. The graphs of CA,W,M and

C_{A,W,C^FGM}, and the contour plot of C_{A,W,M} − C_{A,W,C^FGM}, are given in Figs. 6 and
5, respectively.

Although the best-possible bounds are not generated by choosing background

copulas to be M or W in general, an ideal result holds if I 2 is partitioned by a

vertical or horizontal line.

Theorem 3. Given two copulas C1, C2 and a background copula C0, let b ∈
(0, 1), B1 = [0, b] × I and B2 = [b, 1] × I. Then

C_{{B1,B2},{C1,C2},C0}(u, v) =
  b C1( u/b, C0(b, v)/b ),   if 0 ≤ u ≤ b,
  C0(b, v) + (1 − b) C2( (u − b)/(1 − b), (v − C0(b, v))/(1 − b) ),   if b ≤ u ≤ 1,

and

C_{{B1,B2},{C1,C2},W}(u, v) ≤ C_{{B1,B2},{C1,C2},C0}(u, v) ≤ C_{{B1,B2},{C1,C2},M}(u, v)    (10)

for all copulas C0 and (u, v) ∈ I². Similar results also hold if I² is partitioned
by I × [0, b] and I × [b, 1].

Proof. For 0 ≤ u ≤ b, since W(b, v) ≤ C0(b, v) ≤ M(b, v) and C1 is nondecreasing in each argument, we have

b C1( u/b, W(b, v)/b ) ≤ b C1( u/b, C0(b, v)/b ) ≤ b C1( u/b, M(b, v)/b ).

For b ≤ u ≤ 1, let f(x) = x + (1 − b) C2( (u − b)/(1 − b), (v − x)/(1 − b) ). Then f is nondecreasing. Indeed, for any W(b, v) ≤ x ≤ x′ ≤ M(b, v),

f(x′) − f(x) = x′ − x + (1 − b) [ C2( (u − b)/(1 − b), (v − x′)/(1 − b) ) − C2( (u − b)/(1 − b), (v − x)/(1 − b) ) ]
≥ x′ − x − (1 − b) [ (v − x)/(1 − b) − (v − x′)/(1 − b) ] = 0.

Thus, C_{{B1,B2},{C1,C2},W}(u, v) ≤ C_{{B1,B2},{C1,C2},C0}(u, v) ≤ C_{{B1,B2},{C1,C2},M}(u, v) for all (u, v) ∈ I². Completely analogous steps prove the case when I² is
partitioned by I × [0, b] and I × [b, 1].
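Theorem 3 is straightforward to explore numerically; the following minimal Python sketch evaluates the vertical-split construction and checks the ordering (10) on a grid, with C1 = M, C2 = W, b = 0.4 and the AMH copula (3) as one illustrative choice of C0.

    import numpy as np

    M   = lambda u, v: np.minimum(u, v)
    W   = lambda u, v: np.maximum(u + v - 1.0, 0.0)
    AMH = lambda u, v: u * v / (1.0 - (1.0 - u) * (1.0 - v))

    def C_split(u, v, C0, C1=M, C2=W, b=0.4):
        u, v = np.asarray(u, float), np.asarray(v, float)
        left  = b * C1(u / b, C0(b, v) / b)
        right = C0(b, v) + (1 - b) * C2((u - b) / (1 - b), (v - C0(b, v)) / (1 - b))
        return np.where(u <= b, left, right)

    g = np.linspace(0.01, 0.99, 99)
    U, V = np.meshgrid(g, g, indexing="ij")
    lower = C_split(U, V, C0=W)
    mid   = C_split(U, V, C0=AMH)
    upper = C_split(U, V, C0=M)
    print(np.all(lower <= mid + 1e-12) and np.all(mid <= upper + 1e-12))   # True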

4 Conclusions

In this paper, a flexible construction of bivariate copulas is provided, which includes several constructions as special cases. Properties and a proba-

bilistic interpretation of the construction are given and best-possible bounds of

copulas with given threshold copulas are studied. However, there are still open

problems about this work.

(i) It was shown that copulas generated by (P, C, C0) converge to the background copula C0 under
the uniform norm ||·||∞ when the number n of subsets in P is fixed. Does
an analogous result hold when n is not fixed and/or under other norms,
e.g., the Sobolev norm [9] for bivariate copulas?

(ii) From Remark 3, threshold copulas of Π, M and W over rectangles, if they exist,
are still themselves. Does any other family of copulas have this property?
Or under what conditions do threshold copulas belong to the same family
as their global copula?

(iii) Example 4 shows that analogous results of Theorem 3 do not hold in general.

Can we ﬁnd best-possible bounds of copulas with given threshold copulas

over arbitrary subsets of I²? Are the two bounds still copulas?

(iv) The vine copula [4] or pair-copula construction [1] is a ﬂexible method to

construct multivariate copulas from bivariate copulas. Is it possible to com-

bine our method and vine copula method together to construct multivariate

copulas with desired local and pairwise dependence properties?

These problems will be the objects of our future research.

Acknowledgments. The authors would like to thank Dr. S. Tasena at Chiang Mai

University, Thailand, for his valuable discussions and suggestions during the prepara-

tion of this paper.


References

1. Aas, K., Czado, C., Frigessi, A., Bakken, H.: Pair-copula constructions of multiple

dependence. Insur.: Math. Econ. 44(2), 182–198 (2009)

2. Abbas, A.E.: Utility copula functions matching all boundary assessments. Oper.

Res. 61(2), 359–371 (2013)

3. Balakrishnan, N.: Continuous Multivariate Distributions. Wiley Online Library,

Hoboken (2006)

4. Bedford, T., Cooke, R.M.: Vines: a new graphical model for dependent random

variables. Ann. Stat. 30(4), 1031–1068 (2002)

5. Beliakov, G., De Baets, B., De Meyer, H., Nelsen, R., Úbeda-Flores, M.: Best-

possible bounds on the set of copulas with given degree of non-exchangeability. J.

Math. Anal. Appl. 417(1), 451–468 (2014)

6. Bernard, C., Jiang, X., Vanduﬀel, S.: A note on improved Fréchet bounds and

model-free pricing of multi-asset options by Tankov (2011). J. Appl. Probab. 49(3),

866–875 (2012)

7. Boonmee, T., Tasena, S.: Measure of complete dependence of random vectors. J.

Math. Anal. Appl. 443(1), 585–595 (2016)

8. Chaidee, N., Santiwipanont, T., Sumetkijakan, S.: Patched approximations and

their convergence. Commun. Stat.-Theory Methods 45(9), 2654–2664 (2016)

9. Darsow, W.F., Olsen, E.T.: Norms for copulas. Int. J. Math. Math. Sci. 18(3),

417–436 (1995)

10. de Amo, E., Carrillo, M.D., Fernández-Sánchez, J.: Characterization of all copulas

associated with non-continuous random variables. Fuzzy Sets Syst. 191, 103–112

(2012)

11. De Baets, B., De Meyer, H.: Orthogonal grid constructions of copulas. IEEE Trans.

Fuzzy Syst. 15(6), 1053–1062 (2007)

12. De Baets, B., De Meyer, H., Úbeda-Flores, M.: Constructing copulas with given

diagonal and opposite diagonal sections. Commun. Stat.-Theory Methods 40(5),

828–843 (2011)

13. Durante, F., Fernández-Sánchez, J., Quesada-Molina, J.J., Úbeda-Flores, M.: Con-

vergence results for patchwork copulas. Eur. J. Oper. Res. 247(2), 525–531 (2015)

14. Durante, F., Fernández-Sánchez, J., Quesada-Molina, J.J., Úbeda-Flores, M.: Cop-

ulas with given values on the tails. Int. J. Approx. Reason. 85, 59–67 (2017)

15. Durante, F., Jaworski, P.: Spatial contagion between ﬁnancial markets: a copula-

based approach. Appl. Stoch. Models Bus. Ind. 26(5), 551–564 (2010)

16. Durante, F., Saminger-Platz, S., Sarkoci, P.: Rectangular patchwork for bivariate

copulas and tail dependence. Commun. Stat.-Theory Methods 38(15), 2515–2527

(2009)

17. Durante, F., Sánchez, J.F., Sempi, C.: Multivariate patchwork copulas: a uniﬁed

approach with applications to partial comonotonicity. Insur.: Math. Econ. 53(3),

897–905 (2013)

18. Durante, F., Sempi, C.: Principles of Copula Theory. CRC Press, Boca Raton

(2015)

19. Fermanian, J.-D., Wegkamp, M.H.: Time-dependent copulas. J. Multivar. Anal.

110, 19–29 (2012)

20. Gijbels, I., Veraverbeke, N., Omelka, M.: Conditional copulas, association measures

and their applications. Comput. Stat. Data Anal. 55(5), 1919–1932 (2011)

21. Gupta, N., Misra, N., Kumar, S.: Stochastic comparisons of residual lifetimes and

inactivity times of coherent systems with dependent identically distributed com-

ponents. Eur. J. Oper. Res. 240(2), 425–430 (2015)


22. Joe, H.: Multivariate Models and Multivariate Dependence Concepts. CRC Press,

Boca Raton (1997)

23. Klement, E.P., Kolesárová, A.: Intervals of 1-lipschitz aggregation operators, quasi-

copulas, and copulas with given aﬃne section. Monatshefte für Mathematik 152(2),

151–167 (2007)

24. Malevergne, Y., Sornette, D.: Extreme Financial Risks: From Dependence to Risk

Management. Springer Science & Business Media, Heidelberg (2006)

25. Mardani-Fard, H., Sadooghi-Alvandi, S., Shishebor, Z.: Bounds on bivariate distri-

bution functions with given margins and known values at several points. Commun.

Stat.-Theory Methods 39(20), 3596–3621 (2010)

26. Navarro, J., Pellerey, F., Di Crescenzo, A.: Orderings of coherent systems with

randomized dependent components. Eur. J. Oper. Res. 240(1), 127–139 (2015)

27. Nelsen, R.B.: An Introduction to Copulas. Springer Science & Business Media,

Heidelberg (2007)

28. Nelsen, R.B., Quesada-Molina, J.J., Rodrı́guez-Lallena, J.A., Úbeda-Flores, M.:

On the construction of copulas and quasi-copulas with given diagonal sections.

Insur.: Math. Econ. 42(2), 473–483 (2008)

29. Nelsen, R.B., Quesada-Molina, J.J., Rodriı́guez-Lallena, J.A., Úbeda-Flores, M.:

Bounds on bivariate distribution functions with given margins and measures of

association. Commun. Stat.-Theory Methods 30(6), 1055–1062 (2001)

30. Nelsen, R.B., Quesada-Molina, J.J., Rodriı́guez-Lallena, J.A., Úbeda-Flores, M.:

Best-possible bounds on sets of bivariate distribution functions. J. Multivar. Anal.

90(2), 348–358 (2004)

31. Nelsen, R.B., Úbeda-Flores, M.: A comparison of bounds on sets of joint distribu-

tion functions derived from various measures of association. Commun. Stat.-Theory

Methods 33(10), 2299–2305 (2005)

32. Patton, A.J.: Modelling asymmetric exchange rate dependence. Int. Econ. Rev.

47(2), 527–556 (2006)

33. Quesada-Molina, J.J., Saminger-Platz, S., Sempi, C.: Quasi-copulas with a given

sub-diagonal section. Nonlinear Anal.: Theory Methods Appl. 69(12), 4654–4673

(2008)

34. Rychlik, T.: Copulae in reliability theory (order statistics, coherent systems). In:

Copula Theory and Its Applications, pp. 187–208. Springer, Heidelberg (2010)

35. Sadooghi-Alvandi, S., Shishebor, Z., Mardani-Fard, H.: Sharp bounds on a class

of copulas with known values at several points. Commun. Stat.-Theory Methods

42(12), 2215–2228 (2013)

36. Salvadori, G., De Michele, C., Kottegoda, N.T., Rosso, R.: Extremes in Nature: An

Approach Using Copulas, vol. 56. Springer Science & Business Media, Heidelberg

(2007)

37. Schweizer, B., Wolﬀ, E.F.: On nonparametric measures of dependence for random

variables. Ann. Stat. 9(4), 879–885 (1981)

38. Siburg, K.F., Stoimenov, P.A.: Gluing copulas. Commun. Stat.-Theory Methods

37(19), 3124–3134 (2008)

39. Siburg, K.F., Stoimenov, P.A.: A measure of mutual complete dependence. Metrika

71(2), 239–251 (2010)

40. Sklar, M.: Fonctions de répartition á n dimensions et leurs marges. Université Paris

8 (1959)

41. Tankov, P.: Improved fréchet bounds and model-free pricing of multi-asset options.

J. Appl. Probab. 48(2), 389–403 (2011)

42. Tasena, S., Dhompongsa, S.: A measure of multivariate mutual complete depen-

dence. Int. J. Approx. Reason. 54(6), 748–761 (2013)


43. Tasena, S., Dhompongsa, S.: Measures of the functional dependence of random

vectors. Int. J. Approx. Reason. 68, 15–26 (2016)

44. Úbeda-Flores, M.: On the best-possible upper bound on sets of copulas with given

diagonal sections. Soft Comput. Fusion Found. Methodol. Appl. 12(10), 1019–1025

(2008)

45. Wei, Z., Kim, D.: On multivariate asymmetric dependence using multivariate skew-

normal copula-based regression. Int. J. Approx. Reason. 92, 376–391 (2018)

46. Wei, Z., Wang, T., Nguyen, P.A.: Multivariate dependence concepts through cop-

ulas. Int. J. Approx. Reason. 65, 24–33 (2015)

47. Wisadwongsa, S., Tasena, S.: Bivariate quadratic copula constructions. Int. J.

Approx. Reason. 92, 1–19 (2018)

48. Zhu, X., Wang, T., Choy, S.B., Autchariyapanitkul, K.: Measures of mutually

complete dependence for discrete random vectors. In: Predictive Econometrics and

Big Data, pp. 303–317. Springer, Heidelberg (2018)

49. Zhu, X., Wang, T., Pipitpojanakarn, V.: Constructions of multivariate copulas. In:

Robustness in Econometrics, pp. 249–265. Springer, Heidelberg (2017)

Desired Sample Size for Estimating

the Skewness Under Skew Normal

Settings

1

Department of Mathematical Sciences, New Mexico State University,

Las Cruces, USA

{cong960,twang}@nmsu.edu

2

Department of Psychology, New Mexico State University, Las Cruces, USA

{dtrafimo,hamz}@nmsu.edu

Abstract. In this paper, the desired sample size for estimating the
skewness parameter with given closeness and confidence level is obtained
under skew normal populations. The confidence intervals for the skewness
parameter are constructed based on the desired sample sizes using two
pivots, the chi-square distribution and the F-distribution. Computer
simulations support our main results. At the end, a real data example is
provided for illustration of the constructed confidence intervals.

1 Introduction

It is well known that many data sets from ﬁnancial and biomedical ﬁelds have

skewed distributions. This is a reason why the classical normal distribution is

not so adequate to model the data from these areas even though it is popular and

easy to handle. For data that do not follow a normal distribution, it is natural

to consider the family of skew normal distributions, which extend the family of

normal distributions.

Azzalini [1] provided the probability density function of the skew normal dis-

tribution. Since Azzalini, the family of skew normal distributions has been stud-

ied by many researchers, see, e.g., Gupta [3,4], Vernic [10], and Wang et al. [11].

Theoretically, the skew normal family shares many properties of normal distri-

bution, for example, if Z denotes a random variable SN (0, 1, α), then Z 2 ∼ χ21 ,

irrespective of the skewness parameter α.

Now suppose that we have a population from the skew normal family and

want to construct the conﬁdence interval for the skewness parameter. Other than

using a signiﬁcance test, we start from the question: how many participants are

needed so we can be conﬁdent that the sample skewness estimator is close to the

population skewness parameter? For the normal case, Traﬁmow [6] provided the

answer for the location parameter by ﬁxing the probability of the diﬀerence of


sample mean and population mean within some precision f standard deviation at

conﬁdence level c. Traﬁmow [7] showed how to obtain the necessary sample size

to meet speciﬁcations for single means under the normal distribution; Traﬁmow

and MacDonald [8] extended this to k means; and Traﬁmow et al. [9] provided

a further extension to the family of skew-normal distributions, using locations

instead of means. Also, a way to estimate the desired sample size for simple

random sampling of a skewed population has been discussed by Gregoire and

Aﬄeck [5].

In this paper, we consider the skewness parameter from a skew normal popu-

lation. The paper is organized as follows. Some properties of skew normal distri-

bution are listed in Sect. 2. The methods for deriving the least required sample

size are obtained in Sect. 3, and the simulation work is provided in Sect. 4.

2 Some Properties of the Skew Normal Distribution

Definition 2.1. A random variable Z is said to have the standard skew normal
distribution, denoted by Z ∼ SN(0, 1, λ), if its probability density function is

f(z) = 2 φ(z) Φ(λz),   z ∈ R,

where φ(·) and Φ(·) are the density and cumulative distribution function of the
standard normal distribution, respectively. λ is called the skewness parameter.

There is an alternative representation of Z ∼ SN(0, 1, λ), given in the following
lemma.

Lemma 1. Suppose that δ is an arbitrary value from the interval (−1, 1). If
U0, U1 are independent standard normal variables, then

Z = δ|U0| + √(1 − δ²) U1 ∼ SN(0, 1, λ),

where λ = δ/√(1 − δ²).
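The representation in Lemma 1 is also a convenient way to simulate skew normal data. The following minimal Python sketch draws from SN(0, 1, λ) for λ = 2 and compares the sample mean with the known theoretical mean E(Z) = √(2/π) δ; the sample size is an arbitrary choice.

    import numpy as np

    rng = np.random.default_rng(0)
    lam = 2.0
    delta = lam / np.sqrt(1.0 + lam**2)
    u0 = rng.standard_normal(200_000)
    u1 = rng.standard_normal(200_000)
    z = delta * np.abs(u0) + np.sqrt(1.0 - delta**2) * u1   # Z ~ SN(0, 1, lam)
    print(z.mean(), np.sqrt(2.0 / np.pi) * delta)           # should nearly agree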

The extension of this lemma is listed below.

Proposition 2.1. Let Z0, Z1, ..., Zn be a random sample from the normal
distribution with mean 0 and variance ω², and let

Xi = δ|Z0| + √(1 − δ²) Zi

for i = 1, ..., n, where δ is a value from the interval (−1, 1). Then X1, ..., Xn are
identically distributed, i.e.,

Xi ∼ SN(0, ω², λ)

for all i = 1, ..., n, and if we denote X̄ = (1/n) Σ_{i=1}^{n} Xi, then

X̄ ∼ SN(0, ω∗², √n λ),   where   ω∗² = ((1 + (n − 1)δ²)/n) ω².


Proof. The moment generating function of |Z0| is

M_{|Z0|}(t) = 2 exp(ω²t²/2) Φ(ωt).

Note that Z0 and the Zi's are independent. It follows that all the Xi's, i = 1, ..., n,
have the same moment generating function,

M_{Xi}(t) = 2 exp(ω²t²/2) Φ(δωt),

which is the moment generating function of SN(0, ω², λ) with λ = δ/√(1 − δ²), i.e., with
1 − δ² = 1/(1 + λ²). The distribution of X̄ follows in the same way from X̄ = δ|Z0| + √(1 − δ²) Z̄ with Z̄ ∼ N(0, ω²/n).

Proposition 2.2. Let Z0, Z1, ..., Zn be a random sample from the normal
distribution with mean 0 and variance ω², and let

Xi = ξ + δ|Z0| + √(1 − δ²) Zi

for i = 1, ..., n, where δ is a value from the interval (−1, 1) and ξ ∈ R. Then
X1, ..., Xn are identically distributed, i.e.,

Xi ∼ SN(ξ, ω², λ),   i = 1, ..., n.

Proof. The proof follows immediately from the moment generating function computed in the proof
of the above proposition.

Proposition 2.3. Under the conditions given in Proposition 2.1, let X̄ and S²
be the sample mean and sample variance, respectively.
(i) S² and X̄ are independent.
(ii) Let Y = X̄²/S². Then the mean and standard deviation of Y are

E(Y) = ((n − 1)/(n − 3)) · (1 + (n − 1)δ²)/(n(1 − δ²)),   σ_Y = E(Y) √(2(n − 2)/(n − 5)),

provided n > 5.

Proof. Note that

S² = (1/(n − 1)) Σ_{i=1}^{n} (Xi − X̄)² = (1 − δ²) S_Z²,

where S_Z² = (1/(n − 1)) Σ_{i=1}^{n} (Zi − Z̄)². Since Z1, ..., Zn are independent
normal random variables, the sample mean Z̄ and the sample
variance S_Z² are independent. Also, since |Z0| is independent of Z̄ and S_Z², we
obtain that X̄ = δ|Z0| + √(1 − δ²) Z̄ and S² are independent.


From Proposition 2.1, X̄ ∼ SN(0, ((1 + (n − 1)δ²)/n) ω², √n λ). Then

√n X̄ / (√(1 + (n − 1)δ²) ω) ∼ SN(0, 1, √n λ).

Let

V1 = nX̄² / ((1 + (n − 1)δ²)ω²),   V2 = (n − 1)S² / ((1 − δ²)ω²).

Then V1 ∼ χ²₁. Note that V2 ∼ χ²_{n−1} and V1 and V2 are independent. If we
denote T = (n − 1)V1/V2, then T has the F-distribution with degrees of freedom
1 and n − 1. Thus, by the mean and standard deviation of the F-distribution, the
results are obtained.

3 Desired Sample Size with Given Confidence Level and Precision

Suppose that Z0, Z1, ..., Zn is a random sample from the normal distribution
with mean 0 and variance ω². Let

Xi = ξ + δ|Z0| + √(1 − δ²) Zi

for i = 1, ..., n and δ a value from the interval (−1, 1). From Proposition 2.2,
X1, ..., Xn are dependent random variables from a skew normal population
with location parameter ξ, scale parameter ω² and skewness parameter λ.
It is clear that there is a one-to-one correspondence between λ and δ. In this paper,
we pay attention to the confidence interval of δ² with known ξ, so that the
λ's can be obtained by solving the equation relating λ and δ.
Without loss of generality, we assume ξ = 0 in this paper.

3.1 Confidence Interval for δ² with Known ω²

In order to determine the minimum sample size n needed to be c × 100% confident
under the given sampling precision, we consider the distribution of the sample
variance S² defined above. Let γ² = 1 − δ²; then it is clear that the standard
deviation of S² is √(2/(n − 1)) γ²ω², which is proportional to γ²ω².

Theorem 3.1. Let c be the confidence level and f be the precision, which are
specified such that the error associated with the estimator S² is E = f γ²ω². More
specifically, if

P( f1 γ²ω² ≤ S² − E(S²) ≤ f2 γ²ω² ) = c,    (2)

where f1 and f2 are restricted by max(|f1|, f2) ≤ f and E(S²) is the expectation
of S², then the minimum sample size required can be obtained from

∫_L^U f(z) dz = c,    (3)

where f(z) is the density function of the chi-square distribution with n − 1 degrees of freedom;
here Z = (n − 1)S²/(γ²ω²) has the chi-square distribution with n − 1 degrees of freedom,
and, denoting L = (n − 1)(f1 + 1) and U = (n − 1)(f2 + 1), the required
n can be solved from the integral equation (3).

Remark 3.1. The value of n obtained is unique together with f1 and f2 . Also

if the conditions in Theorem 3.1 are satisfied, we can construct a c × 100% con-

fidence interval for δ 2 given by

( 1 − S²/((f1 + 1)ω²),  1 − S²/((f2 + 1)ω²) ).
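The mapping from the chi-square interval to the interval for δ² in Remark 3.1 can be sketched in a few lines of Python. In the sketch below the endpoints L and U are taken as equal-tail chi-square quantiles for simplicity; Table 1 reports the slightly different shortest-interval endpoints, so the resulting f1, f2 and interval will not match the table exactly, and the inputs are illustrative.

    from scipy.stats import chi2

    def delta2_ci_known_omega(s2, omega2, n, c=0.95):
        k = n - 1
        L = chi2.ppf((1 - c) / 2, k)            # (n-1)(f1+1)
        U = chi2.ppf(1 - (1 - c) / 2, k)        # (n-1)(f2+1)
        f1, f2 = L / k - 1.0, U / k - 1.0       # achieved left/right precision
        lower = 1.0 - s2 / ((f1 + 1.0) * omega2)
        upper = 1.0 - s2 / ((f2 + 1.0) * omega2)
        return (lower, upper), (f1, f2)

    ci, (f1, f2) = delta2_ci_known_omega(s2=0.64, omega2=3.36, n=53, c=0.95)
    print(ci, f1, f2)

In practice one would pick n from Table 1 so that max(|f1|, f2) does not exceed the desired precision f.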

3.2 Confidence Interval for δ² with Unknown ω²

From the proof of Proposition 2.3(ii), we know that S² and X̄² are both in the
chi-square distribution family. Then we can discuss the confidence interval of δ²
by using the F-distribution when ω² is unknown. Moreover, it has been shown that
the standard deviation of X̄²/S² is proportional to (1 + (n − 1)δ²)/(√n (1 − δ²)).

Theorem 3.2. Let c be the confidence level and f be the precision, which are
specified such that the error associated with the estimator X̄²/S² is E = f (1 + (n − 1)δ²)/(√n (1 − δ²)).
More specifically, if

P( f1 (1 + (n − 1)δ²)/(√n (1 − δ²)) ≤ X̄²/S² − E(X̄²/S²) ≤ f2 (1 + (n − 1)δ²)/(√n (1 − δ²)) ) = c,    (4)

where f1 and f2 are restricted by max(|f1|, f2) ≤ f and E(X̄²/S²) is the expectation
of X̄²/S², then the minimum sample size required (n > 3) can be obtained from

∫_L^U f(t) dt = c,

where f(t) is the density function of the F-distribution with 1 and n − 1 degrees of freedom, and

L = (n − 1)/(n − 3) + √n f1,   U = (n − 1)/(n − 3) + √n f2.


Proof. In Proposition 2.3 we obtained the mean of X̄²/S², and the F-distribution
based on X̄²/S² was constructed in its proof. So, by simplifying (4), we get

P(L ≤ T ≤ U) = c,   with   L = (n − 1)/(n − 3) + √n f1,   U = (n − 1)/(n − 3) + √n f2.

From this theorem, we have the following remark.

Remark 3.2. The value of n obtained is unique together with f1 and f2. Also,
if the conditions in Theorem 3.2 are satisfied, we can construct a c × 100% confidence
interval for δ², namely

( (nX̄² − US²)/(nX̄² + (n − 1)US²),  (nX̄² − LS²)/(nX̄² + (n − 1)LS²) ),

obtained by solving L ≤ T ≤ U for δ².
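A minimal Python sketch of this interval is given below. It takes L and U as equal-tail F(1, n − 1) quantiles rather than from the sample-size search of Theorem 3.2, and the summary statistics are illustrative; the interval map itself is obtained by solving L ≤ T ≤ U with T as in the proof of Proposition 2.3.

    from scipy.stats import f as f_dist

    def delta2_ci_unknown_omega(xbar, s2, n, c=0.95):
        L = f_dist.ppf((1 - c) / 2, 1, n - 1)
        U = f_dist.ppf(1 - (1 - c) / 2, 1, n - 1)
        R = n * xbar**2 / s2                      # observed n * Xbar^2 / S^2
        lower = (R - U) / (R + (n - 1) * U)       # solve L <= T <= U for delta^2
        upper = (R - L) / (R + (n - 1) * L)
        return max(lower, 0.0), min(upper, 1.0)

    print(delta2_ci_unknown_omega(xbar=1.1, s2=0.8, n=56, c=0.95))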

4 Simulation Studies

We perform computer simulations to support the derivations in Sect. 3. For given

conﬁdence level c = 0.9, 0.95 and precision f = 0.2, 0.4, 0.6, 0.8, 1, the required

values of sample size n, f1 and f2 for the known ω 2 case are listed in Table 1. For

each n and conﬁdence level c, Table 1 also lists the lengths of shortest (SL) and

Table 1. The value of sample size n, left precision f1 , right precision f2 and the lengths

of the intervals for shortest (SL) and equal tail (EL) cases with diﬀerent precision f

for conﬁdence level c = 0.95, 0.9 when ω 2 is known.

f c n f1 f2 SL EL

0.2 0.9 139 −0.1999 0.1971 54.4 54.56

0.95 201 −0.1935 0.1992 78.15 78.33

0.4 0.9 37 −0.3961 0.3857 27.38 27.72

0.95 53 −0.3759 0.3983 39.5 39.84

0.6 0.9 18 −0.5835 0.5632 18.41 18.91

0.95 25 −0.5486 0.5999 26.46 26.96

0.8 0.9 11 −0.7681 0.7383 13.69 14.37

0.95 16 −0.6886 0.7754 20.58 21.23

1 0.9 7 −0.9952 0.9644 10.08 10.96

0.95 11 −0.8345 0.9748 16.45 17.24


Table 2. The value of sample size n, left precision f1 , right precision f2 and the length

of the interval L with diﬀerent precision f for the given c = 0.95, 0.9 when ω 2 is

unknown .

f c n f1 f2 L

0.2 0.9 76 −0.117 0.1990 2.77

0.95 207 −0.0702 0.1999 3.89

0.4 0.9 22 −0.2356 0.3956 2.96

0.95 56 −0.1387 0.3981 4.02

0.6 0.9 12 −0.3528 0.5784 3.23

0.95 28 −0.2041 0.5915 4.21

0.8 0.9 8 −0.4950 0.7743 3.59

0.95 18 −0.2671 0.7820 4.45

1 0.9 6 −0.6804 0.9771 4.06

0.95 13 −0.3328 0.9840 4.75

Table 3. The relative frequency for conﬁdence intervals with conﬁdence level c = 0.95,

ω 2 = 1, and location parameter ξ = 0, under diﬀerent precision f and δ 2 .

f     n    δ² = 0   δ² = 0.1  δ² = 0.2  δ² = 0.5
0.2  201   0.9490   0.9487    0.9467    0.9515
0.4   53   0.9450   0.9514    0.9526    0.9496
0.6   25   0.9513   0.9489    0.9451    0.9506
0.8   16   0.9476   0.9455    0.9486    0.9483
1     11   0.9523   0.949     0.9478    0.9497

equal tail (EL) intervals. Table 2 shows the values of n, f1 and f2 for unknown

ω 2 case, where L is the length of the interval since the length of the shortest

interval is same as that of the one tail case.

Using Monte Carlo simulations, we compute the relative frequency (empirical coverage) for different
values of δ². Table 3 shows the result for the relative frequency of 95% confidence
intervals for f = 0.2, 0.4, 0.6, 0.8, 1, ω² = 1 and δ² = 0, 0.1, 0.2, 0.5,
and Table 4 gives the corresponding results for the unknown ω² case. All results are obtained with
M = 10000 simulation runs. The next graphs (Figs. 1 and 2) show
the density functions and the corresponding histograms for f = 0.4, 0.6 with
ω² = 1, respectively, where the brackets are the endpoints of the shortest 95%
confidence intervals and the parentheses are the endpoints for the equal tail case.
Figure 3 shows that of the F-distribution when ξ = 0 and f = 0.6, where the brackets
are the endpoints of the shortest 95% region.
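The coverage checks behind Table 3 are easy to reproduce in outline. The following minimal Python sketch generates skew-normal samples via Proposition 2.1 and records how often the interval of Remark 3.1 covers the true δ²; equal-tail chi-square endpoints are used here, so the frequencies only approximate the table, and the parameter values are illustrative.

    import numpy as np
    from scipy.stats import chi2

    rng = np.random.default_rng(0)
    n, c, omega2, delta2_true, runs = 53, 0.95, 1.0, 0.2, 2000
    k = n - 1
    f1 = chi2.ppf((1 - c) / 2, k) / k - 1.0
    f2 = chi2.ppf(1 - (1 - c) / 2, k) / k - 1.0
    delta = np.sqrt(delta2_true)
    hits = 0
    for _ in range(runs):
        z0 = rng.normal(0.0, np.sqrt(omega2))
        zi = rng.normal(0.0, np.sqrt(omega2), n)
        x = delta * abs(z0) + np.sqrt(1 - delta**2) * zi   # Proposition 2.1
        s2 = x.var(ddof=1)
        lo = 1.0 - s2 / ((f1 + 1.0) * omega2)
        hi = 1.0 - s2 / ((f2 + 1.0) * omega2)
        hits += (lo <= delta2_true <= hi)
    print(hits / runs)          # should be close to 0.95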


Table 4. The relative frequency for conﬁdence intervals with conﬁdence level c = 0.95,

and location parameter ξ = 0, under diﬀerent precision f and δ 2 for unknown ω 2 .

f     n    δ² = 0   δ² = 0.1  δ² = 0.2  δ² = 0.5
0.2  207   0.9553   0.9508    0.9499    0.9489
0.4   56   0.9483   0.9488    0.9524    0.9472
0.6   28   0.9540   0.9482    0.9486    0.9507
0.8   18   0.9501   0.9487    0.946     0.9499
1     13   0.9497   0.9466    0.9479    0.9491

Fig. 1. The density function and histogram of the chi-square distribution for ω² = 1
and f = 0.4, where the brackets are the endpoints of the shortest 95% region and the
parentheses are the endpoints of the 95% region for the equal tail case.

Fig. 2. The density function and histogram of the chi-square distribution for ω² = 1
and f = 0.6, where the brackets are the endpoints of the shortest 95% region and the
parentheses are the endpoints of the 95% region for the equal tail case.


Fig. 3. The density function and histogram of F-distribution for ξ = 0 and f = 0.6,

where the brackets are the endpoints of the shortest 95% region.

5 A Real Data Example

We provide an example for illustration of our results in this section. The data
set, which is provided in the appendix, was obtained from a study of the leaf area
index (LAI) of Robinia pseudoacacia in the Huaiping forest farm of Shaanxi
Province from June to October in 2010 (with permission of the authors). The estimated
distribution based on the data set is SN(1.2585, 1.8332², 2.4929) by
using Inferential Models, and SN(1.2585, 1.8332², 2.7966) via the MME.
For the details, see Zhu et al. [13] and Ye et al. [12]. Now suppose that the population
scale parameter is ω² = 1.8332² and consider the precision f = 0.4 and
confidence level c = 0.95; then the desired sample size is 53 by Table 1. We randomly
choose a sample of size 53, for which the sample variance is S² = 0.6398. Then, by
Remark 3.1, the 95% confidence interval for δ² is [0.6984, 0.8631], from which
the 95% confidence interval for the skewness parameter λ is obtained as
[1.5217, 2.5109]; for n = 53, the 95% confidence interval for λ under the
equal tail case is [1.5597, 2.5410].
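The computation behind this example can be reproduced from the reported summary statistics. The following minimal Python sketch uses the f1, f2 values of Table 1 (c = 0.95, f = 0.4) and the mapping of Remark 3.1; small discrepancies with the quoted endpoints may occur due to the rounding of the published inputs.

    import numpy as np

    omega2, s2 = 1.8332**2, 0.6398
    f1, f2 = -0.3759, 0.3983                     # Table 1, c = 0.95, f = 0.4
    d2_lo = 1.0 - s2 / ((f1 + 1.0) * omega2)
    d2_hi = 1.0 - s2 / ((f2 + 1.0) * omega2)
    lam = lambda d2: np.sqrt(d2 / (1.0 - d2))    # lambda = delta / sqrt(1 - delta^2)
    print((d2_lo, d2_hi), (lam(d2_lo), lam(d2_hi)))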


Appendix 1

The data set of the LAI obtained from June to October, 2010

4.87 3.32 2.05 1.50

5.00 3.02 2.12 1.46

4.72 3.28 2.24 1.55

5.16 3.63 2.56 1.27

5.11 3.68 2.67 1.26

5.03 3.79 2.61 1.37

5.36 3.68 2.42 1.87

5.17 4.06 2.58 1.75

5.56 4.13 2.56 1.81

4.48 2.92 1.84 1.98

4.55 3.05 1.94 1.89

4.69 3.02 1.95 1.71

2.54 2.78 2.29 1.29

3.09 2.35 1.94 1.34

2.79 2.40 2.20 1.29

3.80 3.28 1.56 1.10

3.61 3.45 1.40 1.04

3.53 2.85 1.36 1.08

2.51 3.05 1.60 0.86

2.41 2.78 1.50 0.70

2.80 2.72 1.88 0.82

3.23 2.64 1.63 1.19

3.46 2.88 1.66 1.24

3.12 3.00 1.62 1.14

References

1. Azzalini, A.: A class of distributions which includes the normal ones. Scand. J.

Stat. 12(2), 171–178 (1985)

2. Azzalini, A., Capitanio, A.: The Skew-Normal and Related Families, vol. 3. Cam-

bridge University Press, Cambridge (2013)

3. Gupta, A.K., Chang, F.C.: Multivariate skew symmetric distributions. Appl. Math.

Lett. 16, 643–646 (2003)

4. Gupta, A.K., Gouzalez, G., Dominguez-Molina, J.A.: A multivariate skew normal

distribution. J. Multivariate Anal. 82, 181–190 (2004)


5. Gregoire, T., Aﬄeck, D.: Estimating desired sample size for simple random sam-

pling of a skewed population. Am. Stat. (2017). https://doi.org/10.1080/00031305.

2017.1290548

6. Traﬁmow, D.: Using the coeﬃcient of conﬁdence to make the philosophical switch

from a posteriori to a priori inferential statistics. Educ. Psychol. Measur. (2016)

7. Traﬁmow, D.: Using the coeﬃcient of conﬁdence to make the philosophical switch

from a posteriori to a priori inferential statistics. Educ. Psychol. Measur. 77(5),

831–854 (2017)

8. Traﬁmow, D., MacDonald, J.A.: Performing inferential statistics prior to data col-

lection. Educ. Psychol. Measur. 77(2), 204–219 (2017)

9. Traﬁmow, D., Wang, T., Wang, C.: From a sampling precision perspective, skew-

ness is a friend and not an enemy! Educ. Psychol. Measur. 1–22 (2018). https://

doi.org/10.1177/0013164418764801

10. Vernic, R.: Multivariate skew-normal distributions with applications in insurance.

Insur. Math. Econ. 38, 413–426 (2006)

11. Wang, T., Li, B., Gupta, A.K.: Distribution of quadratic forms under skew normal

settings. J. Multivariate Anal. 100, 533–545 (2009)

12. Ye, R., Wang, T.: Inferences in linear mixed models with skew-normal random

eﬀects. Acta Mathematica Sinica English Series 31(4), 576–594 (2015)

13. Zhu, X., Ma, Z., Wang, T., Teetranont, T.: Plausibility regions on the skewness

parameter of skew normal distributions based on Inferential Models. In: Robustness

in Econometrics. Studies in Computational Intelligence, vol. 692 (2017). https://

doi.org/10.1007/978-3-319-50742-216

Why the Best Predictive Models

Are Often Diﬀerent from the Best

Explanatory Models: A Theoretical

Explanation

and Thongchai Dumrongpokaphan3

1

Faculty of Economics, Chiang Mai University, Chiang Mai, Thailand

songsakecon@gmail.com

2

University of Texas at El Paso, El Paso, TX 79968, USA

{longpre,vladik}@utep.edu

3

Department of Mathematics, Faculty of Science, Chiang Mai University,

Chiang Mai, Thailand

tcd43@hotmail.com

Abstract. It is often implicitly assumed that the statistical
models which are the best predictors also have the best explanatory
power. Lately, many examples have been provided that show that the
best predictive models are often different from the best explanatory mod-
els. In this paper, we provide a theoretical explanation for this difference.

Traditionally, many researchers who have applied statistical methods implicitly

assumed that predictive and explanatory powers are strongly correlated:

• they assumed that a statistical model that leads to accurate predictions also

provides a good explanation for the corresponding phenomenon, and

• they also assumed that models providing a good explanation for the observed

phenomena also lead to accurate predictions.

In practice, models that lead to good predictions do not always explain the

observed phenomena. Vice versa, models that nicely explain the corresponding

phenomena do not always lead to most accurate predictions.

To illustrate the diﬀerence, let us give a simple example from celestial

mechanics; see, e.g., [1]. Newton’s equations provide a very clear explanation

of why and how celestial bodies move, why the planets and satellites follow

their orbits, etc. In principle, we can predict the trajectories of celestial bodies

– and thus, their future observed positions in the sky – by directly integrating
these equations; this, however, requires a considerable amount of
computation time on modern computers.

On the other hand, people successfully predicted the observed positions of

planets way before Newton: for that, they used epicycles, i.e., in effect, trigonomet-

ric series. Such series are still used in celestial mechanics to predict the positions

of celestial bodies. They are very good for predictions, but they are absolutely

useless in explanations.

Predictive Models vs. Explanatory Models in Statistics. In statistics, the

need to diﬀerentiate between predictive and explanatory models was emphasized

and illustrated by Galit Shmueli in [3].

Remaining Problem: Why? The empirical fact that the best predictive mod-

els are often diﬀerent from the best explanatory models is currently well known

and well recognized.

But from the theoretical viewpoint, this empirical fact still remains a puz-

zle. In this paper, we provide a theoretical explanation for this empirical

phenomenon.

2 Formalizing the Problem

Need for Formalization. In order to provide a theoretical explanation for the

diﬀerence between the best predictive and the best explanatory models, we need

to ﬁrst formally describe:

• what it means for a model to be the best predictive model, and

• what it means for a model to be the best explanatory model.

What Does It Mean for a Model to Be Explanatory: Analysis of the

Problem. The “explanatory” part is intuitively understandable: we have some

equations or formulas that explain all the observed data – in the sense that all

the observed data satisfy these equations.

Of course, these equations must be checkable – otherwise, if they are formu-

lated purely in terms of complex abstract mathematics, so that no one knows

how to check whether observed data satisfy these equations or formulas, then

how can we know that the data satisﬁes them?

Thus, when we say that we have an explanatory model, what we are saying,

in eﬀect, that we have an algorithm – a program if you will – that, given the

data, checks whether the data is consistent with the corresponding equations or

formulas. From this pragmatic viewpoint, by an explanatory model, we simply

means a program.

Of course, this program must be non-trivial: it is not enough for the data

to be simply consistent with the model; being explanatory means that we must explain

all this data. For example, if we simply state that, in general, the trade volume

grows when the GDP grows, all the data may be consistent with this rule, but

this consistency is not enough: for a model to be truly explanatory, it needs to

Predictive vs. Explanatory Models 165

explain why in some cases, the growth in trade is small and in other cases, it is

huge. In other words, it must explain the exact growth rate. Of course, this is

economics, not fundamental physics, we cannot explain all the numbers based

on ﬁrst principles only, we have to take into account some quantities that aﬀect

our processes. But for the model to be truly explanatory we must be sure that,

once the values of these additional quantities are ﬁxed, there should be only one

sequence of numbers that satisﬁes the corresponding equations or formulas –

namely, the sequence that we observe (ignoring noise, of course).

This is not that diﬀerent from physics. For example, Newton’s laws of gravi-

tation allow many possible orbits of celestial bodies, but once you ﬁx the masses,

initial conditions, and initial velocities of all these bodies, then Newton’s laws

uniquely determine how these bodies will move.

In algorithmic terms, if:

• to the original program for checking whether the data satisﬁes the given

equations and/or formulas,

• we add auxiliary parts checking whether the values of additional quantities

are exactly the ones needed to explain the data,

• then the observed data is the only possible sequence of observations that is

consistent with this program.

Once we know such a program that uniquely determines all the data, we can,

in principle, ﬁnd this data – i.e., solve the corresponding equations – by simply

trying all possible combinations of possible data values until we ﬁnd the one that

satisﬁes all the corresponding conditions.

How can we describe this in precise terms? All the observations can be stored

in the computer, and in the computer, everything is stored as 0s and 1s. From

this viewpoint, the whole set of observed data is simply a binary sequence x,

i.e., a ﬁnite sequence of 0s and 1s.

The length n of this sequence is known. We know how many binary sequences

there are of each length:

• there are 2 × 2 = 2² = 4 sequences of length 2:
  – two sequences 00 and 01 that start with 0 and
  – two sequences 10 and 11 that start with 1;
• there are 2² × 2 = 2³ = 8 binary sequences of length 3:
  – two sequences 000 and 001 that start with 00,
  – two sequences 010 and 011 that start with 01,
  – two sequences 100 and 101 that start with 10, and
  – two sequences 110 and 111 that start with 11;
• in general, we have 2ⁿ sequences of length n.

There are ﬁnitely many such sequences, so we must potentially check them

all and thus, ﬁnd the desired sequence x – the only one that satisﬁes all the

required conditions.


Of course, for realistic lengths n such an exhaustive search is not feasible,
so we are talking about the potential possibility to compute – not practical compu-
tations: one does not solve Newton’s equations by trying all possible trajectories

and checking whether they satisfy Newton’s equations. But it is OK, since our

goal here is not to provide a practical solution to the problem, but rather to

provide a formal deﬁnition of an explanatory model.

For the purpose of this deﬁnition, we can associate each explanatory model

not only with the original checking program, but also with the related exhaustive-

search program p that generates the data. The exhaustive search part is easy to

program, it practically does not add to length of the original checking program.

So, we arrive at the following deﬁnition.

Definition 1. Let x be the data. By an explanatory model, we mean a program p that generates the binary
sequence x.

Comments

• The above deﬁnition, if we read it without the previous motivations part,

sounds very counter-intuitive. However, we hope that the motivation part has

convinced the reader that this strange-sounding deﬁnition indeed describes

what we usually mean by an explanatory model.

• For each data, there is at least one explanatory model – since we can always

have a program that simply prints all the bits of the given sequence x one by

one.

What Do We Mean by the Best Explanatory Model: Analysis of the

Problem. There are usually several possible explanatory models, which of them

is the best?

To formalize this intuitive notion, let us again go back to physics. Before

Newton, the motion of celestial bodies was described by epicycles. To accurately

describe the motion of each planet, we needed to know a large number of param-

eters:

• in the ﬁrst approximation, in which the orbit is a circle, we need to know the

radius of this circle, the planet’s initial position on this circle, and its velocity;

• in the second approximation, we need to know similar parameters of the ﬁrst

auxiliary circular motion that describes the deviation from the circle;

• in the third approximation, we need to know similar parameters of the

second auxiliary circular motion describing the deviation from the second-

approximation trajectory, etc.

Then came Kepler’s idea that celestial bodies follow elliptical trajectories.

Why was this idea better than epicycles? Because now, to describe the trajectory

of each celestial body, we need fewer parameters: all we need is a few parameters

that describe the corresponding ellipse.

These original parameters formed the main part of the corresponding

data checking program – and thus, of the resulting data generating program.


The fewer parameters we need, the shorter the length of the checking program – and thus, of the generating program corre-
sponding to the model.

Similarly, what Newton did was to replace all the parameters of the ellipses by
a few parameters describing the bodies themselves – and this described not only
the regular motion of celestial bodies: he also described the tides, he described
(explained) why apples fall down from a tree and how exactly, etc. Here, we also

have fewer parameters needed to explain the observed data – and thus, a much

shorter generating program.

From this viewpoint, a model is better if its generating program is shorter

– and thus, the best explanatory model is the one which is the shortest, i.e.,

the one for which the (bit) length len(p) of the corresponding program p is the

smallest possible. So, we arrive at the following deﬁnition.

Definition 2. Let x be the data. We say that an explanatory model p0 for x is
the best explanatory model if it is the shortest of all explanatory models for x,
i.e., if

len(p0) = min{len(p) : p generates x}.

What Do We Mean by the Best Predictive Model. Clearly, not all models

which are explanatory models in the sense of Deﬁnition 1 can be used for prac-

tical predictions. If using a model requires the astronomical time 2ⁿ of billions
of years, then the corresponding program is practically useless:
• if predicting where the Moon will be a year from now itself takes a year of
computations, we do not need this program: we can as well wait a year and see where the
Moon is;

• similarly, if a trade model takes 10 years of intensive computations to predict

next year’s trade balance, we do not need this program: we can as well wait

a year and see for ourselves.

For a model to be useful for predictions, it needs not just to generate the

data x but to generate them fast – as fast as possible. The corresponding overall

computation time includes both the time needed to upload this program into a

computer – which is proportional to the length len(p) of this program – and the

time t(p) needed to run this program.

From this viewpoint, the smaller this overall time len(p) + t(p), the better.

Thus, the best predictive model is the one for which this overall time is the

smallest possible. So, we arrive at the following deﬁnition.

Definition 3. Let x be the data. We say that a model p0 is the best predic-
tive model for x if its overall time len(p0) + t(p0) is the smallest among all the
explanatory models:

len(p0) + t(p0) = min{len(p) + t(p) : p generates x}.


3 Main Result

Now that we have formal definitions, we can formulate our main result. It comes
as two propositions.

Proposition 1. No algorithm is possible that, given data x, always generates the best
explanatory model for this data.

Proposition 2. There exists an algorithm that, given data x, generates the best

predictive model for this data.

Discussion. These two results clearly explain why in many cases, the best pre-

dictive models are diﬀerent from the best explanatory models:

• if they were always the same, then the algorithm from Proposition 2 would

also always generate the best explanatory models, but

• we know, from Proposition 1, that such a general algorithm is not possible.

4 Proofs

Discussion. It is usually easier to prove that an algorithm exists – all we need to

do is to provide such an algorithm. On the other hand, proving that an algorithm

does not exist is rarely easy: we need to provide some general arguments why

no tricks can lead to such an algorithm.

From this viewpoint, it is easier to prove Proposition 2 than Proposition 1.

Let us therefore start with proving Proposition 2.

Proof of Proposition 2. In line with the above idea, let us describe the cor-

responding algorithm. In this algorithm, to ﬁnd the program that generates the

given data x in the shortest possible overall time T , we start with T = 1, then

take T = 2, T = 3, etc. – until we ﬁnd the smallest value T for which such a

program exists.

For each T , we need to look for programs for which len(p) + t(p) = T . For

such programs, we have len(p) ≤ T , so we can simply try all possible binary

sequences p of length not exceeding T . There are ﬁnitely many strings of each

length, so there are ﬁnitely many strings p of length len(p) ≤ T , and we can try

them all.

For each of these strings, we ﬁrst use a compiler to check whether this string

is indeed a syntactically correct program. If it is not, we simply dismiss this

string. If the string p is a syntactically correct program, we run it for time

t(p) = T − len(p), to make sure that the overall time is indeed T . If after this

time, the program p generates the desired sequence x, this means that we have

found the desired best predictive model, so we can stop – the fact that we did

not stop our procedure earlier, when we tested smaller values of the overall time

means that no program can generate x in overall time < T and thus, the overall

time T is indeed the smallest possible.

The proposition is proven.
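As an illustration of this argument (ours, not part of the original proof), the enumeration can be sketched in code as follows; the two parameters stand for the language-dependent pieces of the proof, namely enumerating syntactically correct candidate programs of length at most T and running a candidate for a bounded number of steps to see whether it outputs x.

import java.util.List;
import java.util.function.IntFunction;
import java.util.function.BiPredicate;

// Sketch of the exhaustive search from the proof of Proposition 2.
// candidatesUpTo.apply(T) is assumed to list all syntactically correct programs
// of length <= T; runsAndOutputs.test(p, steps) is assumed to run p for at most
// `steps` steps and report whether it outputs the given data x.
static String bestPredictiveModel(
        IntFunction<List<String>> candidatesUpTo,
        BiPredicate<String, Integer> runsAndOutputs) {
    for (int T = 1; ; T++) {                        // overall time T = 1, 2, 3, ...
        for (String p : candidatesUpTo.apply(T)) {
            int steps = T - p.length();             // time budget t(p) = T - len(p)
            if (steps >= 0 && runsAndOutputs.test(p, steps)) {
                return p;                           // first success: len(p) + t(p) = T is minimal
            }
        }
    }
}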


Comment. The algorithm described in this proof is a very slow,
exhaustive-search-type algorithm that requires exponential time 2^n. Yes, this
algorithm is not practical – but practicality is not our goal. Our goal is to explain
the diﬀerence between the best predictive and the best explanatory model, and
from the viewpoint of this goal, this slow algorithm serves its purpose: it shows
that:

• the best predictive models can be computed by some algorithm – even if a very slow one, while
• as we will now prove, the best explanatory models cannot be computed by any
algorithm – even by a very slow one.

Proof of Proposition 1. The main idea behind this proof comes from the fact

that the quantity

K(x) := min{len(p) : p generates x}

is well known in theoretical computer science: it was invented by the famous

statistician A. N. Kolmogorov and is thus known as Kolmogorov complexity;

see, e.g., [2]. One of the ﬁrst results that Kolmogorov proved about his new

notion is that no algorithm is possible that, given a binary string x, would always

compute its Kolmogorov complexity K(x) [2].

This immediately implies our Proposition 1: indeed, if it was possible to

produce, for each data x, the best explanatory model p0 , then we would be

able to compute its length len(p0 ) which is exactly K(x) – and K(x) is not

computable.

The proposition is proven.

Discussion. It is worth mentioning that the notion of Kolmogorov complexity

was originally introduced for a somewhat related but still completely diﬀerent

purpose – how to separate random from non-random sequences.

In traditional statistics, the very idea that some individual sequences are
random and some are not was taboo: one could only talk about probabilities
of diﬀerent sequences. However, intuitively, everyone understands that while a
sequence of bits generated by ﬂipping a coin many times is random, a sequence
like 010101...01 in which 01 is repeated a million times is clearly not random. How

can we formally explain this intuitive diﬀerence?

Kolmogorov noticed that a sequence 0101...01 is not random because it can

be generated by a very short program: just repeat 01 many times. For example,

in Java, this program looks like this:

for (int i = 0; i < 1000000; i++) {System.out.print("01");}

On the other hand, a truly random sequence has no such pattern relating its
diﬀerent bits, so the only way to print this sequence is to literally print the whole
sequence bit by bit:

System.out.println("01...");


So, when x is not random, we can have short programs generating x. Thus, the

shortest possible length K(x) of a program generating x is much smaller than

the length len(x) of this sequence:

K(x) ≪ len(x).

On the other hand, for a truly random sequence x, you cannot generate it

by a program shorter than the above line whose length is ≈ len(x). So, in this

case,

K(x) ≈ len(x).

This idea inspired Kolmogorov to deﬁne what we now call Kolmogorov com-

plexity K(x) and to deﬁne a binary sequence x to be random if K(x) ≥ len(x) − c0 , for

some appropriate constant c0 .

Proof that Kolmogorov Complexity is Not Computable: Reminder.

In our proof of Proposition 1 we used Kolmogorov’s proof that Kolmogorov

complexity K(x) is not computable. To make our result more intuitive, it is

worth mentioning that this proof is reasonably intuitive.

The main idea behind this proof comes from the following paradox, known as the Berry paradox.

Some English expressions describe numbers. For example:

• “twelve” means 12,

• “million” means 1000000, and

• “the smallest prime number larger than 100” means 101.

There are ﬁnitely many words in the English language, so there are ﬁnitely

many combinations of less than twenty words, thus ﬁnitely many numbers which

can be described by such combinations. Hence, there are numbers which cannot

be described by such combinations. Let n0 denote the smallest of such numbers.

Therefore, n0 is “the smallest number that cannot be described in fewer than

twenty words”. But this description of the number n0 consists of 12 words –

less than 20, so n0 can be described by using fewer than twenty words – a clear

paradox.

This paradox is caused by the imprecision of natural language, but if we

replace “described” by “computed”, we get a proof by contradiction that Kol-

mogorov complexity is not computable.

Indeed, let us assume that K(x) is computable, and let L be the length of

the program that computes K(x). Binary sequences can be interpreted as binary

integers, so we can talk about the smallest of them. Then, the following program

computes the smallest sequence x for which K(x) ≥ 3L: we try all possible binary

sequences of length 1, length 2, etc., until we ﬁnd the ﬁrst sequence for which

K(x) ≥ 3L:

int x = 0;

while(K(x) < 3 * L){x++;}

This program adds just two short lines to the length-L program for computing

K(x); thus, its length is ≈ L ≪ 3L, so for the number x0 that it computes, its
Kolmogorov complexity K(x0 ) cannot exceed this length. Thus, we have K(x0 ) ≪ 3L.

On the other hand, we deﬁned x0 as the smallest number for which K(x) ≥

3L, so we have K(x0 ) ≥ 3L – a contradiction. This contradiction shows that our

assumption is wrong, and the Kolmogorov complexity is not computable.

Acknowledgments. This work was supported in part by the Center of Excellence in Econometrics, Faculty of Economics, Chiang Mai University, Thailand. We also acknowledge

the partial support of Department of Mathematics, Chiang Mai University, and of the

US National Science Foundation via grant HRD-1242122 (Cyber-ShARE Center of

Excellence).

The authors are greatly thankful to Professors Hung T. Nguyen and Galit Shmueli

for valuable discussions.

References

1. Feynman, R., Leighton, R., Sands, M.: The Feynman Lectures on Physics. Addison

Wesley, Boston (2005)

2. Li, M., Vitányi, P.M.B.: An Introduction to Kolmogorov Complexity and Its Appli-

cations. Springer, Berlin (2008)

3. Shmueli, G.: To explain or to predict? Stat. Sci. 25(3), 289–310 (2010)

Algorithmic Need for Subcopulas

Thach Ngoc Nguyen1 , Olga Kosheleva2 , Vladik Kreinovich2 , and Hoang Phuong Nguyen3

1 Banking University of Ho Chi Minh City, 56 Hoang Dieu 2, Quan Thu Duc, Thu Duc, Ho Chi Minh City, Vietnam
Thachnn@buh.edu.vn
2 University of Texas at El Paso, 500 W. University, El Paso, TX 79968, USA
{olgak,vladik}@utep.edu
3 Division Informatics, Math-Informatics Faculty, Thang Long University, Nghiem Xuan Yem Road, Hoang Mai District, Hanoi, Vietnam
nhphuong2008@gmail.com

Abstract. One of the ways to describe the dependence between two
random variables is by describing the corresponding copula. For contin-

uous distributions, the copula is uniquely determined by the correspond-

ing distribution. However, when the distributions are not continuous,

the copula is no longer unique, what is unique is a subcopula, a function

C(u, v) that has values only for some pairs (u, v). From the purely math-

ematical viewpoint, it may seem like subcopulas are not needed, since

every subcopula can be extended to a copula. In this paper, we prove,

however, that from the algorithmic viewpoint, it is, in general, not pos-

sible to always generate a copula. Thus, from the algorithmic viewpoint,

subcopulas are needed.

There are several ways to describe the probability
distribution of a random variable:

• we can use its moments,
• we can use its probability density function (pdf),
• its cumulative distribution function (cdf), etc.

Most of these types of descriptions are not always applicable:

• for a discrete distribution, there is no pdf,
• for a distribution with heavy tails, moments are sometimes inﬁnite, etc.

Out of the known representations, the representation as a cdf is the most uni-

versal, it does not seem to have limitations. In view of this, to take into account

that in econometrics, one can encounter discrete distributions (for which no pdf

is known), heavy-tailed distributions (for which moments are inﬁnite), etc., it is

reasonable to use a cdf


FX (x) = Prob(X ≤ x)

to describe a random variable X.

Similarly, to describe a joint distribution of two random variables (X, Y ), it

is reasonable to use a joint cdf FXY (x, y) = Prob(X ≤ x & Y ≤ y).
If the random variables X and Y were independent, we would have FXY (x, y) =
FX (x) · FY (y). In general, the dependence may be more complicated. It is rea-

sonable to describe this dependence by a function C(u, v) for which

FXY (x, y) = C(FX (x), FY (y)). (1)

A function with this property is known as a copula; see, e.g., [3,6–8]. Copulas have
been successfully used in many application areas, in particular, in econometrics.

Existence and Uniqueness of Copulas. It has been proven that such a

copula always exists, and that the copula function C(u, v) is itself a 2-D cdf on

the square

[0, 1] × [0, 1].

In situations when the distributions of X and Y are continuous – e.g., when

there exist pdf’s – the copula is uniquely determined. Indeed, in this case, the

value FX (x) continuously depends on x and thus, attains all possible values

between 0 and 1. So, to ﬁnd C(u, v), it is suﬃcient to ﬁnd the values x and y for

which FX (x) = u and FY (y) = v; then FXY (x, y) will give us the desired value

of C(u, v).

However, if the distribution of one of the variables – e.g., X – is discrete (for

example, there are some values which have positive probabilities), then the value

FX (x) jumps, and thus, for some intermediate values u, we do not have values

x for which FX (x) = u. In such situations, the copula is not uniquely determined,

since we can have diﬀerent values C(u, v) for this jumped-over u.

Subcopulas: Reminder. While the copula is not always unique, there is a vari-

ant of this notion which is always unique; this variant is known as a subcopula. In

precise terms, a subcopula is also deﬁned by the formula (1), the only diﬀerence

is that:

• while a copula has to be deﬁned for all possible values u ∈ [0, 1] and v ∈ [0, 1],

• a subcopula C(u, v) is only deﬁned for the values u and v which have the

form u = FX (x) and v = FY (y) for some x and y.

Subcopulas have also been successfully used in econometrics; see, e.g., [9,11,13,

14,16,17].

Main Question: Do We Need Subcopulas? From the purely mathematical

viewpoint, it may seem that we do not need subcopulas, since every subcopula can

be, in principle, extended to a copula.


However, the fact that many researchers use subcopulas seems to indicate
that, from the algorithmic viewpoint, subcopulas may not be easy to extend
to copulas. An indirect argument in support of this diﬃculty is that the known

extension proofs use non-constructive arguments such as Zorn’s Lemma (which

is equivalent to a non-constructive Axiom of Choice); see, e.g., [2].

What We Do in This Paper. In this paper, we prove that indeed, in situations

of non-uniqueness, it is not algorithmically possible to always construct a copula.

In other words, we prove that, from the algorithmic viewpoint, subcopulas are

indeed needed.

What is Computable: Main Definitions. In order to analyze when a copula

is computable and when it is not, let us recall the main deﬁnitions of computabil-

ity; for details, see, e.g., [1,4,15] (for random variables, see also [5]).

A real number x is computable if we can compute it with any given accuracy.

In other words, a number is computable if there exists an algorithm that, given

an integer n (describing the accuracy), returns a rational number rn for which

|x − rn | ≤ 2−n .
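To make this deﬁnition concrete, here is a minimal sketch (ours, not from the cited references) of how such a computable real can be represented in code: as an algorithm that, given the accuracy level n, returns an approximation; for simplicity, doubles stand in for exact rational numbers.

// A computable real number: for each n, approx(n) returns r_n with |x - r_n| <= 2^(-n).
interface ComputableReal {
    double approx(int n);
}

// Example: sqrt(2) as a computable real, approximated by bisection on [1, 2].
ComputableReal sqrtTwo = n -> {
    double lo = 1.0, hi = 2.0;
    while (hi - lo > Math.pow(2.0, -n)) {
        double mid = 0.5 * (lo + hi);
        if (mid * mid < 2.0) lo = mid; else hi = mid;
    }
    return lo;   // lo is within 2^(-n) of sqrt(2)
};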

Similarly, a function f (x) is called computable if there is an algorithm that, given a computable real number
x, returns the value f (x). In precise terms, this means that for any desired

accuracy n, we can compute a rational number rn for which |f (x) − rn | ≤ 2−n ;

in this computation, the program can pick some integer m and ask for a 2−m -

approximation to the input.

Similarly, a function f (x, y) of two variables is called computable if, given

x and y, it can compute the value f (x, y) with any given accuracy. Again, in

the process of computations, this program can pick some m and ask for a 2−m -

approximation to x and to y.

Comment. These deﬁnitions describe the usual understanding of computabil-

ity; so, not surprisingly, all usual computable functions – e.g., all elementary

functions, all continuous functions – are computable in this sense as well.

What is Not Computable. What is not computable in this sense are discon-

tinuous functions such as sign(x) which is equal:

• to −1 when x < 0,

• to 0 when x = 0, and

• to 1 when x > 0.

Indeed, if this function was computable, then we would be able to check whether

a computable real number is equal to 0 or not, and such checking is not algo-

rithmically possible; see, e.g., [4] and references therein.

Indeed, the possibility of such checking contradicts the known result that

it is not possible,


• given a program,

• to check whether this program halts or not.

Indeed, based on each program, we can form a sequence rn each element of which

is:

• equal to 2−n if the program did not yet halt by time n and

• equal to 2−t if it halted at time t ≤ n.

Then:

• If the program does not halt, this sequence describes the computable real

number

x = 0.

• If the program halts at time t, this sequence describes the computable real

number

x = 2−t > 0.

Thus, if we could check whether a real number is equal to 0 or not, we would

be able to check whether a program halts or not – and we know that this is not

algorithmically possible.

What Does It Mean for the cdf to be Computable. In real life, when we

say that we have a random variable, it means that we have a potentially inﬁnite

sequence of observations which follow the corresponding distribution. Based on

these observations, for each computable real number x, we would like to compute

the value F (x).

The value F (x) is the probability that the value of a random variable X is

≤ x. A natural practical way to estimate a probability based on a ﬁnite sample

is to estimate the frequency of the corresponding event. Thus, to estimate F (x),

a natural idea is to take n observations X1 , . . . , Xn , ﬁnd out how many of them

are ≤ x, and then compute the desired frequency by dividing the result of the

counting by n.

Even in the ideal case, when all the values Xi are measured exactly, the

frequency is, in general, diﬀerent from the probability. It is known (see, e.g.,

[12]) that for large n, the diﬀerence between the frequency f and the probability

p is approximately normally distributed, with 0 mean and standard deviation

σ = √(p · (1 − p)/n) ≤ 0.5/√n.

From the practical viewpoint, any deviation larger than 6 sigma has a probability

of less than 10−8 and is, thus, usually considered practically impossible. (If you

do not view 6 sigma as impossible, take 20 sigma; one can always come up with a

probability so small that it is practically impossible.) Thus, if for a given ε > 0,

we select n so large that 6σ ≤ 6 · 0.5/√n ≤ ε, then the resulting frequency f is
guaranteed to be ε-close to the desired probability F (x): |f − F (x)| ≤ ε, i.e.,

equivalently,

F (x) − ε ≤ f ≤ F (x) + ε.
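As an illustration (ours, not from the paper), here is how such a suﬃciently accurate frequency could be computed in the idealized case when the values Xi are observed exactly; the case of measurement inaccuracy δ is discussed next.

import java.util.function.DoubleSupplier;

// Sketch: estimate F(x) = Prob(X <= x) to within eps by a frequency over a
// sample so large that 6 * 0.5 / sqrt(n) <= eps, i.e., n >= 9 / eps^2.
static double estimateCdf(DoubleSupplier sampleX, double x, double eps) {
    long n = (long) Math.ceil(9.0 / (eps * eps));
    long count = 0;
    for (long i = 0; i < n; i++) {
        if (sampleX.getAsDouble() <= x) {
            count++;
        }
    }
    return (double) count / n;   // frequency f with |f - F(x)| <= eps (with overwhelming probability)
}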


In practice, we also need to take into account that the values Xi can only

be measured with a certain accuracy δ; see, e.g., [10]. Thus, what we compare

with the given number x are not the actual values Xi but the results X̃i of their
measurement, which are δ-close to Xi :

• If X̃i ≤ x, we cannot conclude that Xi ≤ x; we can only conclude that
Xi ≤ x + δ.
• Similarly, if X̃i > x, we cannot conclude that Xi > x; we can only conclude that
Xi > x − δ.

Thus, the only thing that we can guarantee for the observed frequency f is that

F (x − δ) − ε ≤ f ≤ F (x + δ) + ε. (2)

This is how a computable cdf is deﬁned: given a computable number
x and rational numbers ε > 0 and δ > 0, we can eﬃciently ﬁnd a rational number

f that satisﬁes the inequality (2).

A similar inequality

F (x − δ, y − δ) − ε ≤ f ≤ F (x + δ, y + δ) + ε (3)

deﬁnes what it means for a joint (2-D) cdf F (x, y) to be computable.

Comment. Note that a cdf can be discontinuous – e.g., if we have a random

variable that is equal to 0 with probability 1, then:

• F (x) = 0 for x < 0, and
• F (x) = 1 for x ≥ 0.

This cdf is discontinuous at x = 0; since, as we have mentioned, discontinuous
functions are not computable, this shows that a computable cdf is
not necessarily a computable function.

However, as we will see, when the computable cdf is continuous, it is a com-

putable function.

Proposition 1. If a computable cdf F (x) is a continuous function, then F (x) is
also a computable function.

Proof of Proposition 1. By the deﬁnition of a computable cdf – i.e., by the
inequality (2) – for every δ > 0 and ε > 0, we can compute a value fδ,ε (x) for which

F (x − δ) − ε ≤ fδ,ε (x) ≤ F (x + δ) + ε,

where fδ,ε (x) means a frequency estimated by comparing the measured values

Xi (measured with accuracy δ) with the value x, based on a sample large enough

to guarantee the accuracy ε.

In particular, applying this inequality to the points x + δ and x − δ, we conclude
that

F (x − 2δ) − 2ε ≤ fδ,ε (x − δ) − ε ≤ F (x) ≤ fδ,ε (x + δ) + ε ≤ F (x + 2δ) + 2ε. (5)

When the cdf F (x) is a continuous function, then, for each x, the diﬀerence
F (x + 2δ) − F (x − 2δ) tends to 0 as δ → 0; thus, the diﬀerence between the bounds
F (x + 2δ) + 2ε and F (x−2δ)−2ε also tends to 0 as δ → 0 and ε → 0. So, if we take δ = ε = 2−k

for k = 1, 2, . . ., we will eventually encounter an integer k for which this diﬀerence

is smaller than a given number 2−n . In this case, due to (5), the diﬀerence

between the inner bounds fδ,ε (x + δ) + ε and fδ,ε (x − δ) − ε is also ≤ 2−n . In

this case, each of these bounds can be used as the desired 2−n -approximation to

F (x).

Thus, to compute F (x) with accuracy 2−n , it is suﬃcient to compute, for

k = 1, 2, . . .,

• values fδ,ε (x + δ) + ε and fδ,ε (x − δ) − ε.

We continue these computations for larger and larger k until the diﬀerence

between fδ,ε (x + δ) + ε and fδ,ε (x − δ) − ε becomes smaller than or equal to 2−n .

Once this condition is satisﬁed, we return fδ,ε (x + δ) + ε as the desired 2−n -

approximation to F (x).

The proposition is proven.
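For readers who prefer code, here is a minimal sketch (ours) of the procedure just described; fe.f(x, delta, eps) stands for the frequency fδ,ε(x) whose existence is guaranteed by the deﬁnition of a computable cdf.

// Sketch of the procedure from the proof of Proposition 1: compute F(x) with
// accuracy 2^(-n), given access to the frequency estimates f_{delta,eps}(x).
interface FreqEstimate {
    double f(double x, double delta, double eps);   // the value f_{delta,eps}(x)
}

static double computeCdf(FreqEstimate fe, double x, int n) {
    for (int k = 1; ; k++) {
        double d = Math.pow(2.0, -k);                // delta = eps = 2^(-k)
        double upper = fe.f(x + d, d, d) + d;        // f_{delta,eps}(x + delta) + eps
        double lower = fe.f(x - d, d, d) - d;        // f_{delta,eps}(x - delta) - eps
        if (upper - lower <= Math.pow(2.0, -n)) {
            return upper;                            // a 2^(-n)-approximation to F(x)
        }
    }
}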

Comment. In the 2-D case, we can use a similar proof.

Proposition 2. There exists an algorithm that, given a computable continuous 2-D
cdf FXY (x, y), generates the corresponding copula – i.e., generates a computable
cdf C(u, v) that satisfies the formula (1).

Proposition 3. No algorithm is possible that, given a computable 2-D cdf
FXY (x, y), would generate the corresponding copula – i.e., that would generate
a computable cdf C(u, v) that satisfies the formula (1).

Comment. This result proves that from the algorithmic viewpoint, it is, in gen-

eral, not possible to always generate a copula. Thus, from the algorithmic view-

point, subcopulas are indeed needed.

Proof of Proposition 2. This proof is reasonably straightforward: it follows

the above idea of ﬁnding the copula for a continuous cdf.


Indeed:

• suppose that we are given two computable numbers u, v ∈ [0, 1], and

• we want to ﬁnd the desired approximation to the value C(u, v).

To do that, we ﬁrst ﬁnd x for which FX (x) is δ-close to u.

This value can be found as follows. First, we pick any x0 and compute FX (x0 )

with accuracy δ; we can do it, since, according to Proposition 1, for continuous

distributions, the cdf is a computable function. If we get a value which is δ-close

to u, we are done.

If the approximate value FX (x0 ) is larger than u, we take x0 − 1, x0 − 2, etc.,

until we ﬁnd a new value x− for which FX (x− ) < u.

Similarly, if the approximate value FX (x0 ) is smaller than u, we take x0 + 1,

x0 + 2, etc., until we ﬁnd a new value x+ for which FX (x+ ) > u.

In both cases, we have an interval [x− , x+ ] for which F (x− ) < u < F (x+ ).

Now, we can use bisection to ﬁnd the desired x: namely, we take a midpoint xm

of the interval. Then:

• If |FX (xm ) − u| ≤ δ, we are done.

• If this ideal inequality is not satisﬁed, then we have:

– either FX (xm ) < u

– or FX (xm ) > u.

• In the ﬁrst case, we know that the desired value x is in the half-size interval

[xm , x+ ].

• In the second case, we know that the desired value x is in the half-size interval

[x− , xm ].

• In both cases, we get a new half-size interval.

To the new interval, we apply the same procedure until we get the desired x.

Similarly, we can compute y for which FY (y) ≈ v. Now, we can take the

approximation to FXY (x, y) as the desired approximation to C(u, v).

The proposition is proven.
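Here is a minimal sketch (ours) of the bracketing-and-bisection step just described, assuming that the marginal cdf FX is continuous and can be evaluated with suﬃcient accuracy.

import java.util.function.DoubleUnaryOperator;

// Find x with |F_X(x) - u| <= delta, assuming F_X is a continuous cdf and 0 < u < 1.
static double findX(DoubleUnaryOperator FX, double u, double delta) {
    double lo = 0.0, hi = 0.0;                       // start from an arbitrary point x0 = 0
    while (FX.applyAsDouble(lo) >= u) lo -= 1.0;     // walk left until F_X(lo) < u
    while (FX.applyAsDouble(hi) <= u) hi += 1.0;     // walk right until F_X(hi) > u
    while (true) {                                   // bisection on [lo, hi]
        double mid = 0.5 * (lo + hi);
        double f = FX.applyAsDouble(mid);
        if (Math.abs(f - u) <= delta) return mid;    // delta-close: done
        if (f < u) lo = mid; else hi = mid;
    }
}
// The copula value is then approximated as C(u, v) ~ F_XY(findX(FX, u, delta), findX(FY, v, delta)).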

Proof of Proposition 3. For each real number a for which |a| ≤ 0.5, we can

form the following probability distribution Fa (x, y) on the square [0, 1] × [0, 1]:

it is uniformly distributed on a straight line segment y = 0.5 + sign(a) · (x − 0.5)

corresponding to x ∈ [0.5 − |a|, 0.5 + |a|]. Thus:

• when a > 0, we take y = x; and

• when a < 0, we take y = 1 − x.

One can easily check that Fa (x, y) is indeed a computable cdf – although it

is not always a computable function, since, e.g., for a = 0 the whole probability

distribution is concentrated at the point (0.5, 0.5).

For each a, the marginal distribution FX (x) is uniformly distributed on the

interval [0.5−|a|, 0.5+|a|] of length 2|a|. Thus, for the values x from this interval,

we have

FX (x) = (x − (0.5 − |a|))/(2|a|).


In particular, for u = v = 0.25, we should take x = y = 0.5 − 0.5|a|, and for

u = v = 0.5, we should take x = y = 0.5.

Since the distribution is symmetric, when u = v, we have the same values

x = y for which FX (x) = u and FY (y) = u.

For a < 0, when u, v ≤ 0.5, we cannot have both X ≤ x and Y ≤ y for the
corresponding values x and y (for which FX (x) = u and FY (y) = v), and thus, we get
C(u, v) = 0; in particular, C(0.5, 0.5) = 0. For a > 0, we have Y = X, and thus
C(u, v) = min(u, v); in particular, C(0.25, 0.25) = 0.25.
If we could always algorithmically generate a copula, i.e., a computable
cdf C(u, v), then, by deﬁnition of a computable cdf, we would be able to compute:

• given a,

• the value fδ,ε (x) corresponding to x = 0.375, δ = 0.125 and ε = 0.1,

i.e., the value f := f0.125,0.1 (0.375) for which

C(x − δ, x − δ) − ε ≤ f ≤ C(x + δ, x + δ) + ε,

i.e., C(0.25, 0.25) − 0.1 ≤ f ≤ C(0.5, 0.5) + 0.1.


Here:

• when a < 0, we have C(0.5, 0.5) = 0, hence f ≤ 0.1 and therefore f < 0.125;

• when a > 0, then C(0.25, 0.25) = 0.25, hence f ≥ 0.15 and therefore f > 0.125.

So, by comparing the resulting value f with 0.125, we will be able to check whether

a > 0 or a < 0 – and this is known to be algorithmically impossible; see,

e.g., [1,4,15]. This contradiction shows that it is indeed not possible to have an

algorithm that always computes the copula.

The proposition is proven.

Acknowledgments. This work was supported in part by the US National Science
Foundation via grant HRD-1242122 (Cyber-ShARE Center of Excellence).

The authors are thankful to Professor Hung T. Nguyen for valuable discussions.

References

1. Bishop, E., Bridges, D.: Constructive Analysis. Springer Verlag, Heidelberg (1985)

2. Fernández-Sánchez, J., Úbeda-Flores, M.: Proving Sklar’s theorem via Zorn’s

lemma. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 26(1), 81–85 (2018)

3. Jaworski, P., Durante, F., Härdle, W.K., Rychlik, T. (eds.): Copula Theory and

Its Applications. Springer, Heidelberg (2010)

4. Kreinovich, V., Lakeyev, A., Rohn, J., Kahl, P.: Computational Complexity and

Feasibility of Data Processing and Interval Computations. Kluwer, Dordrecht

(1997)

5. Kreinovich, V., Pownuk, A., Kosheleva, O.: Combining interval and probabilistic

uncertainty: what is computable? In: Pardalos, P., Zhigljavsky, A., Zilinskas, J.

(eds.) Advances in Stochastic and Deterministic Global Optimization, pp. 13–32.

Springer, Cham (2016)

6. Mai, J.-F., Scherer, M.: Simulating Copulas: Stochastic Models, Sampling Algo-

rithms, and Applications. World Scientiﬁc, Singapore (2017)

7. McNeil, A.J., Frey, R., Embrechts, P.: Quantitative Risk Management: Concepts,

Techniques, and Tools. Princeton University Press, Princeton (2015)

8. Nelsen, R.B.: An Introduction to Copulas. Springer, Heidelberg (2007)

9. Okhrin, O., Okhrin, Y., Schmidt, W.: On the structure and estimation of hierar-

chical Archimedean copulas. J. Econ. 173, 189–204 (2013)

10. Rabinovich, S.G.: Measurement Errors and Uncertainty: Theory and Practice.

Springer, Berlin (2005)

11. Ruppert, M.: Contributions to Static and Time-varying Copula-based Modeling of

Multivariate Association. EUL Verlag, Koeln (2012)

12. Sheskin, D.J.: Handbook of Parametric and Nonparametric Statistical Procedures.

Chapman and Hall/CRC, Boca Raton (2011)

13. Wei, Z., Wang, T., Nguyen, P.A.: Multivariate dependence concepts through cop-

ulas. Int. J. Approx. Reason. 65(1), 24–33 (2015)

14. Wei, Z., Wang, T., Panichkitkosolkul, W.: Dependence and association concepts

through copulas. In: Huynh, V.-N., Kreinovich, V., Sriboonchitta, S. (eds.) Mod-

eling Dependence in Econometrics, pp. 113–126. Springer, Cham (2014)

15. Weihrauch, K.: Computable Analysis. Springer, Berlin (2000)


16. Zhang, Y., Beer, M., Quek, S.T.: Long-term performance assessment and design

of oﬀshore structures. Comput. Struct. 154, 101–115 (2015)

17. Zhu, X., Wang, T., Choy, S.T.B., Autchariyapanitkul, K.: Measures of mutually

complete dependence for discrete random vectors. In: Kreinovich, V., Sriboon-

chitta, S., Chakpitak, N. (eds.) Predictive Econometrics and Big Data, pp. 303–

317. Springer, Cham (2018)

How to Take Expert Uncertainty into

Account: Economic Approach Illustrated

by Pavement Engineering Applications

Edgar Daniel Rodriguez Velasquez1,2 , Carlos M. Chang Albitres2 , Thach Ngoc Nguyen3 , Olga Kosheleva4 , and Vladik Kreinovich4(B)

1 Department of Civil Engineering, Universidad de Piura in Peru (UDEP), Av. Ramón Mugica 131, Piura, Peru
edgar.rodriguez@udep.pe, edrodriguezvelasquez@miners.utep.edu
2 Department of Civil Engineering, University of Texas at El Paso, 500 W. University, El Paso, TX 79968, USA
cchangalbitres2@utep.edu
3 Banking University of Ho Chi Minh City, 56 Hoang Dieu 2, Quan Thu Duc, Thu Duc, Ho Chi Minh City, Vietnam
Thachnn@buh.edu.vn
4 University of Texas at El Paso, 500 W. University, El Paso, TX 79968, USA
{olgak,vladik}@utep.edu

Abstract. In many practical situations, we rely on expert estimates. For
example, in pavement engineering, we often rely on expert graders to

gauge the condition of road segments and to see which repairs are needed.

Expert estimates are imprecise; it is desirable to take the resulting uncer-

tainty into account when making the corresponding decisions. The tra-

ditional approach is to first apply the traditional statistical methods

to get the most accurate estimate and then to take the corresponding

uncertainty into account when estimating the economic consequences

of the resulting decision. On the example of pavement engineering appli-

cations, we show that it is beneficial to apply the economic approach

from the very beginning. The resulting formulas are in good accordance

with the general way how people make decisions in the presence of risk.

In many practical situations, expert estimates are used to
help make decisions.

In some cases – e.g., in medicine – we need experts because computer-based

automated systems are not yet able to always provide a correct diagnosis: human

medical doctors are still needed.

In other cases, the corresponding automatic equipment exists, but it is much

cheaper to use human experts. For example, in pavement engineering, in princi-

ple, we can use automatic systems to gauge the condition of the road surface, to

estimate the size of cracks and other faults, but the corresponding equipment is


still reasonably expensive to use, while a human grader can make these evalua-

tions easily. The use of human graders is explicitly mentioned in the corresponding

normative documents; see, e.g., [1] (see also [5]).

Expert Estimates Come with Uncertainty. Expert estimates usually come

with uncertainty. The experts’ estimates have, at best, the accuracy of about

10–15%, up to 20%; see, e.g., [3].

This observed accuracy is in the perfect accordance with the well-known

“seven plus-minus two law” (see, e.g., [4,6]), according to which a person nor-

mally divides everything into seven plus-minus two – i.e., between 5 and 9 –

categories, and thus, has the accuracy between 1/9 ≈ 10% and 1/5 ≈ 20%.

Traditional Approach to Dealing with This Uncertainty. In the tradi-

tional approach to dealing with the expert uncertainty, we:

• ﬁrst use the traditional statistical techniques to transform the expert opinion

into the most accurate estimate of the desired quantity, and then

• if needed, we gauge the economic consequences of the resulting estimate.

The main limitation of this tra-
ditional approach is that while our ultimate objective is economic – how to best

maintain the pavement within the given budget – we do not take this objective

into account when transforming expert’s opinion into a numerical estimate.

What We Do in This Paper. In this paper, we show how to take economic

factors into account when producing the estimate. The resulting formulas are in

line with the usual way how decision makers take risk into account.

How Expert Opinions Are Transformed
into a Numerical Estimate: A Brief Reminder

Main Idea. An expert may describe his or her opinion in terms of a word from

natural language, or by providing a numerical estimate. For each such opinion –

be it a word or a numerical estimate – we can ﬁnd all the cases when this expert

expressed this particular opinion, and in all these cases, ﬁnd the actual value of

the estimated quantity q.

As a result, for each opinion, we get a probability distribution on the set of all

possible values of the corresponding quantity. This distribution can be described

either in terms of the corresponding probability density function (pdf) ρ(x), or

def

in the terms of the cumulative distribution function (cdf) F (x) = Prob(q ≤ x).

In many real-life situations, the expert uncertainty is a joint eﬀect of many

diﬀerent independent factors, each of which may be small by itself. In such

cases, we can take into account the Central Limit Theorem, according to which

the distribution of the sum of a large number of small independent random

variables is close to Gaussian (normal); see, e.g., [7]. Thus, it often makes sense

to assume that the corresponding probability distribution is normal. For a
normal distribution with mean μ and standard deviation σ, we have

F (x) = F0 ((x − μ)/σ),

where F0 (x) is the cdf of the standard normal distribution – with mean 0 and

standard deviation 1.

Based on the probability distribution, we describe the most accurate numer-

ical estimate.

Details: How to Transform the Probability Distribution Reflecting the

Expert Opinion into a Numerical Estimate. We want to have an estimate

which is as close to the actual values of the quantity q as possible.

For the same opinion of an expert, we have, in general, diﬀerent actual val-

ues q1 , . . . , qn . These values form a point (q1 , . . . , qn ) in the corresponding n-

dimensional space. Once we select a numerical value x0 corresponding to this

opinion, we will generate the value x0 in all the cases in which the expert has this

particular opinion. In other words, what we generate is the point (x0 , . . . , x0 ).

A natural idea is to select the estimate x0 for which the point (x0 , . . . , x0 )

is the closest to the point (q1 , . . . , qn ) that describes the actual values of the

corresponding quantity. In other words, we want to select the estimate x0 for

which the distance

d := √((x0 − q1 )2 + . . . + (x0 − qn )2 )

is the smallest possible.

Minimizing the distance is equivalent to minimizing its square

d2 = (x0 − q1 )2 + . . . + (x0 − qn )2 .

Diﬀerentiating the expression for d2 with respect to x0 and equating the deriva-

tive to 0, we conclude that

2(x0 − q1 ) + . . . + 2(x0 − qn ) = 0.

If we divide both sides of this equality by 2, move all the terms not related to

x0 to the right-hand side, and then divide both sides by n, we conclude that

x0 = μ,

where μ := (q1 + . . . + qn )/n.

In terms of the probability distribution, this is equivalent to minimizing the
mean square value ∫ (x − x0 )2 · ρ(x) dx, which leads to x0 = μ = ∫ x · ρ(x) dx.


Selecting an Estimate: On the Example of Pavement

Engineering

Analysis of the Problem: Possible Faults and How Much it Costs to

Repair Them. In pavement engineering, we are interested in estimating the

pavement fault index x. When the pavement is perfect, this index is 0. The

presence of any speciﬁc fault increases the value of this index.

Repairing a fault takes money; the larger the index, the more costly it is to

repair this road segment. Let us denote the cost of repairs for a road segment

with index x by c(x).

We are interested in the case when the road is regularly repaired. In this

case, the index x cannot grow too much – once there are some faults in the road,

these faults are being repaired. Thus, the values of the index x remain small. So,

we can expand the unknown function into Taylor series and keep only the ﬁrst

terms in this expansion – e.g., only linear terms:

c(x) ≈ c0 + c1 · x.

When the road segment is perfect, i.e., when x = 0, no repairs are needed,

so the cost is 0: c(0) = 0. Thus, c0 = 0, and the cost of repairs linearly depends

on the index:

c(x) ≈ c1 · x. (1)

If we do not repair a
faulty road segment, then, because of the constant traﬃc load, in the next year,

the pavement condition will become worse.

Each fault worsens. Thus, the more faults we have now, the worse will be the

situation next year. Let g(x) denote the next-year index corresponding to the

situation when this year, the index is x.

Since, as we have mentioned, it makes sense to consider small values of x, we

can safely expand the function g(x) in Taylor series and keep only linear terms

in this expansion:

g(x) ≈ q0 + q1 · x.

When the pavement is perfect, i.e., when x = 0, we usually do not expect it to

deteriorate next year, so we should have g(0) = 0. Thus, we have q0 = 0, and
g(x) ≈ q1 · x.

Since we did not repair the road segment this year, we have to repair it next

year. Next year, the index will increase from the original value x to the new

value x̃ := q1 · x. Thus, the cost of repairs will be c1 · x̃ = c1 · q1 · x.

This is the cost next year, so to compare it with the cost of this-year repairs,

we need to take into account that next year’s money is somewhat cheaper than

this year’s money: if the interest rate is r, we can invest a smaller amount

(c1 · q1 · x)/(1 + r) (2)


now, and get the desired amount c1 · q1 · x next year. The formula (2) describes

the equivalent this-year cost of not repairing the road segment this year.

Combining These Costs: What is the Economic Consequence of

Selecting an Estimate. Once we select an estimate x0 describing the qual-

ity of the road segment, we perform the repairs corresponding to this degree.

According to the formula (1), these repairs costs us the amount c1 · x0 .

If the actual value x is exactly equal to x0 , this is the ideal situation: the

road segment is repaired, and we spend exactly the amount of the money needed

to repair it. Realistically, the actual x is, in general, somewhat diﬀerent from x0 .

As a result, we waste some resources.

When the actual value x of the pavement quality is smaller than x0 , this

means that we spend too much money on repairs: e.g., we bring on heavy and

expensive equipment while a simple device would have been suﬃcient. We could

spend just c1 ·x and instead, we spend a larger amount c1 ·x0 . Thus, in comparison

with the ideal situation, we waste the amount

c1 · (x0 − x). (3)

When the actual value x of the pavement index is larger than the estimate

x0 , this means that after performing the repairs corresponding to the value x0 ,

we still have the remaining fault level x − x0 which needs to be repaired next

year. The cost of these repairs – when translated into this year’s costs – can be

found by applying the formula (2): it is

(c1 · q1 · (x − x0 ))/(1 + r). (4)

The formulas (3) and (4) describe what will be the wasted amount for each x.

By multiplying this amount by ρ(x) and integrating over x, we get the following

expression for the expected value of the waste:

W (x0 ) := ∫_0^{x0} c1 · (x0 − x) · ρ(x) dx + ∫_{x0}^∞ (c1 · q1 · (x − x0 ))/(1 + r) · ρ(x) dx. (5)

Main Idea. Instead of selecting the statistically optimal estimate

x0 = μ = ∫ x · ρ(x) dx

and gauging the expected waste related to this estimate, let us instead use the

estimate that minimizes the waste (5).

Analysis of the Problem. To ﬁnd the value x0 that minimizes the expres-

sion (5), let us diﬀerentiate this expression with respect to x0 and equate the

derivative to 0.


The expression (5) is the sum of two terms, so the derivative of the expression

(5) is equal to sum of the derivatives of these two terms.

To ﬁnd the derivative of the ﬁrst term, it is convenient to introduce an

auxiliary function

G(t, x0 ) := ∫_0^t c1 · (x0 − x) · ρ(x) dx. (6)

In terms of this auxiliary function, the ﬁrst term has the form G(x0 , x0 ). Thus,

by the chain rule, the derivative of the ﬁrst term can be described as

(d/dx0 ) G(x0 , x0 ) = ∂G(t, x0 )/∂t |t=x0 + ∂G(t, x0 )/∂x0 |t=x0 . (7)

Since the expression (6) is an integral of some expression from 0 to t, its derivative with respect to t

is simply the value of the integrated expression for x = t:

∂G(t, x0 )/∂t = c1 · (x0 − t) · ρ(t).

For t = x0 , this expression is equal to 0.

The second expression in the right-hand side of the formula (7) is an integral.

The derivative of the integral (i.e., in eﬀect, of the weighted sum) is thus equal

to the integral (i.e., to the weighted sum) of the corresponding derivatives:

∂G(t, x0 )/∂x0 = (∂/∂x0 ) ∫_0^t c1 · (x0 − x) · ρ(x) dx = ∫_0^t (∂/∂x0 ) (c1 · (x0 − x) · ρ(x)) dx. (8)

Here, (∂/∂x0 ) (c1 · (x0 − x) · ρ(x)) = c1 · ρ(x),

thus the expression (8) takes the form

∂G(t, x0 )/∂x0 = ∫_0^t c1 · ρ(x) dx = c1 · ∫_0^t ρ(x) dx.

The integral in the right-hand side of this formula is simply the value of the cdf

F (t). So, for t = x0 , it takes the form F (x0 ). Thus:

(d/dx0 ) G(x0 , x0 ) = ∂G(t, x0 )/∂x0 |t=x0 = c1 · F (x0 ). (9)

To ﬁnd the derivative of the second term in the right-hand side of the formula

(5), let us introduce another auxiliary function

H(t, x0 ) := ∫_t^∞ (c1 · q1 · (x − x0 ))/(1 + r) · ρ(x) dx. (10)


In terms of this auxiliary function, the second term in the expression (5) for

the waste function W (x0 ) has the form H(x0 , x0 ). Thus, by the chain rule, the

derivative of this second term can be described as

(d/dx0 ) H(x0 , x0 ) = ∂H(t, x0 )/∂t |t=x0 + ∂H(t, x0 )/∂x0 |t=x0 . (11)

Since the expression (10) is an integral from t to ∞, its derivative with
respect to t is simply minus the integrated expression:

∂H(t, x0 )/∂t = −(c1 · q1 · (t − x0 ))/(1 + r) · ρ(t).

For t = x0 , this expression is equal to 0.

For the second term in the right-hand side of the formula (11), the derivative

of the integral (i.e., in eﬀect, of the weighted sum) is equal to the integral (i.e.,

to the weighted sum) of the corresponding derivatives:

∂H(t, x0 )/∂x0 = (∂/∂x0 ) ∫_t^∞ (c1 · q1 · (x − x0 ))/(1 + r) · ρ(x) dx =
∫_t^∞ (∂/∂x0 ) ((c1 · q1 · (x − x0 ))/(1 + r)) · ρ(x) dx. (12)

Here, (∂/∂x0 ) ((c1 · q1 · (x − x0 ))/(1 + r)) · ρ(x) = −(c1 · q1 )/(1 + r) · ρ(x),

thus ∂H(t, x0 )/∂x0 = ∫_t^∞ (−(c1 · q1 )/(1 + r)) · ρ(x) dx = −(c1 · q1 )/(1 + r) · ∫_t^∞ ρ(x) dx.

The integral in the right-hand side of this formula is simply 1 minus the value of the

cdf F (t). So for t = x0 , it takes the form 1 − F (x0 ). Thus, this derivative takes

the following form:

∂H(t, x0 )/∂x0 |t=x0 = −(c1 · q1 )/(1 + r) · (1 − F (x0 )). (13)

As we have mentioned, the derivative of the objective function (5) – the derivative

which should be equal to 0 when we select the economically optimal estimate

x0 – is equal to the sum of the expressions (9) and (13). Thus, the optimality

condition dW (x0 )/dx0 = 0 takes the form

c1 · F (x0 ) − (c1 · q1 )/(1 + r) · (1 − F (x0 )) = 0.


If we divide both sides of this equality by c1 , move all the terms not containing

the unknown F (x0 ) to the right-hand side, and divide by the coeﬃcient at F (x0 ),

we conclude that

F (x0 ) = (q1 /(1 + r)) / (1 + q1 /(1 + r)) = q1 /(1 + r + q1 ).

This means that, as the numerical estimate x0 of the expert's opinion, we
should select not the mean of the actual values corresponding to this opinion,
but rather a quantile corresponding to the level q1 /(1 + r + q1 ):

F (x0 ) = q1 /(1 + r + q1 ), (14)

where:

• q1 is the growth rate of the pavement fault – what fault of index 1 will grow

into next year, and

• r is the interest rate – how much interest we will get if we invest $1 now.

When the faults practically do not grow, i.e., when q1 ≈ 1, then, taking into
account that r is very small, we conclude that F (x0 ) ≈ 1/2, i.e., x0 should be

the median of the corresponding probability distribution.

For symmetric distributions like normal, median and mean coincide – they

both coincide with the center of the distribution, i.e., with the value with respect

to which this distribution is symmetric. In this case, we can still use the statis-

tically optimal estimate x0 = μ.

However, in most real-life situations, when q1 ≫ 1 + r, we have (1 + r)/q1 ≪ 1,
thus, 1 + (1 + r)/q1 ≪ 2 and

F (x0 ) = q1 /(1 + r + q1 ) = 1/(1 + (1 + r)/q1 ) ≫ 0.5,

so we should select the values larger than the mean.

For the case of the normal distribution, with F (x) = F0 ((x − μ)/σ), the for-
mula (14) takes the form F0 ((x0 − μ)/σ) = q1 /(1 + r + q1 ), i.e., the form

(x0 − μ)/σ = k, (15)


where F0 (k) = q1 /(1 + r + q1 ). (16)

Thus, instead of the statistically optimal estimate x0 = μ, we need to use the

estimate

x0 = μ + k · σ. (17)

This is in line with the usual way of taking risk into account when comparing

diﬀerent alternatives: instead of comparing average gains μ, we should compare

the values μ − k · σ, where the coeﬃcient k depends on the person’s tolerance to

risk; see, e.g., [2] and references therein.

Comment. We recommend plus k · σ, since instead of maximizing gains, we

minimize losses – i.e., negative gains. When we switch from a value to the negative
of this value, then μ + k · σ becomes μ − k · σ. Indeed, μ[−x] = −μ[x], while

σ[−x] = σ[x], so

μ[−x] + k · σ[−x] = −μ[x] + k · σ[x] = −(μ[x] − k · σ[x]).

Resulting Recommendation (for Pavement Engineering and
Similar Applications). For each expert opinion, we collect all the cases in

which the expert expressed this opinion, and ﬁnd, in all these cases, the actual

values of the corresponding quantity. Based on these actual values, we compute

the mean μ and the standard deviation σ. Then, as a numerical description of

the expert’s opinion, we select the value μ + k · σ, where k is determined by the

formula (16).

This way, we can decrease the losses caused by the expert’s uncertainty.
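As a minimal numerical sketch (ours, not the authors' code) of this recommendation: given the sample mean μ and standard deviation σ of the actual values, the growth rate q1 , and the interest rate r, the estimate μ + k · σ from formulas (16)-(17) can be computed as follows; the standard normal cdf F0 is approximated here by the Abramowitz-Stegun polynomial formula, which is an implementation choice of ours.

// Compute the recommended estimate x0 = mu + k * sigma, with F0(k) = q1 / (1 + r + q1).
static double recommendedEstimate(double mu, double sigma, double q1, double r) {
    double target = q1 / (1.0 + r + q1);
    double lo = -10.0, hi = 10.0;                 // invert F0 by bisection
    for (int i = 0; i < 100; i++) {
        double mid = 0.5 * (lo + hi);
        if (stdNormalCdf(mid) < target) lo = mid; else hi = mid;
    }
    double k = 0.5 * (lo + hi);
    return mu + k * sigma;                        // formula (17)
}

// Standard normal cdf via the Abramowitz-Stegun approximation (accuracy ~1e-7).
static double stdNormalCdf(double x) {
    double t = 1.0 / (1.0 + 0.2316419 * Math.abs(x));
    double d = Math.exp(-x * x / 2.0) / Math.sqrt(2.0 * Math.PI);
    double p = d * t * (0.319381530 + t * (-0.356563782 + t * (1.781477937
            + t * (-1.821255978 + t * 1.330274429))));
    return x >= 0 ? 1.0 - p : p;                  // p approximates the tail probability
}

For example, with q1 = 1.2 and r = 0.05, the level q1 /(1 + r + q1 ) is roughly 0.53, so k is roughly 0.08 and the recommended estimate is slightly above the mean μ.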

Acknowledgments. This work was supported in part by the US National Science
Foundation grant HRD-1242122 (Cyber-ShARE Center).

References

1. ASTM International: Standard Practice for Roads and Parking Lots Pavement Con-

dition Index Surveys, International Standard D6433-18

2. Elton, E.J., Gruber, M.J., Brown, S.J., Goetzman, W.N.: Modern Portfolio Theory

and Investment Analysis. Wiley, New York (2014)

3. Metropolitan Transportation Commission (MTC): MTC Rater Certification Exam,

Streetsaver Academy, San Francisco, California (2018)

4. Miller, G.A.: The magical number seven, plus or minus two: some limits on our

capacity for processing information. Psychol. Rev. 63(2), 81–97 (1956)

5. Park, K., Thomas, N.E., Lee, K.W.: Applicability of the international roughness

index as a predictor of asphalt pavement condition. J. Transp. Eng. 133(12), 706–

709 (2007)

6. Reed, S.K.: Cognition: Theories and Application. Wadsworth Cengage Learning,

Belmont, California (2010)

7. Sheskin, D.J.: Handbook of Parametric and Nonparametric Statistical Procedures.

Chapman and Hall/CRC, Boca Raton (2011)

Quantum Approach Explains the Need

for Expert Knowledge: On the Example

of Econometrics

Songsak Sriboonchitta1 , Hung T. Nguyen1,2 , Olga Kosheleva3 , Vladik Kreinovich3(B) , and Thach Ngoc Nguyen4

1 Faculty of Economics, Chiang Mai University, Chiang Mai, Thailand
songsakecon@gmail.com, hunguyen@nmsu.edu
2 Department of Mathematical Sciences, New Mexico State University, Las Cruces, NM 88003, USA
3 University of Texas at El Paso, 500 W. University, El Paso, TX 79968, USA
{olgak,vladik}@utep.edu
4 Banking University of Ho Chi Minh City, 56 Hoang Dieu 2, Quan Thu Duc, Thu Duc, Ho Chi Minh City, Vietnam
Thachnn@buh.edu.vn

Abstract. One of the main objectives of econometrics is to understand economic
phenomena, and to ﬁnd out how to regulate these phenomena to get the

best possible results. There have been many successes in both purposes.

Companies and countries actively use econometric models in making eco-

nomic decisions. However, in spite of all the successes of econometrics,

most economically important decisions are not based only on the econo-

metric models – they also take into account expert opinions, and it has

been shown that these opinions often drastically improve the resulting

decisions. Experts – and not econometricians – are still largely in charge

of the world economics. Similarly, in many other areas of human activi-

ties, ranging from sports to city planning to teaching, in spite of all the

successes of mathematical models, experts are still irreplaceable. But

why? In this paper, we explain this phenomenon by taking into account

that many complex systems are well described by quantum equations,

and in quantum physics, the best computational results are obtained

when we allow the system to make kind of imprecise queries – the types

that experts ask.

Once Newton's
equations had been discovered, computing a trajectory of a celestial body or of

a spaceship became a purely computational problem.

There was a similar hope when the ﬁrst equations were discovered for describ-

ing economic phenomena:


• that we will be able to control economic behavior as we control spaceships,

• that eventually, all the economic problems will be resolved by appropriate

computations,

• that eventually, econometricians – researchers who know how to solve the

corresponding systems of equations, how to optimize the desired objective

function – will be largely in charge of the world economics.

Since then, econometrics has experienced a lot of success stories, but, in

spite of all these success stories, we are still not in charge: those who are in charge are
experts, CEOs, fund managers, bankers, people who may know some mathemat-

ical models, but whose main strength is in their expertise – not in knowing these

models.

Why? Why are econometricians not in charge of companies – after all, com-

panies are interested in maximizing their proﬁts, so why not let a specialist in

maximization be in charge? The fact that this is not happening en masse shows

that, in spite of all the successes of econometrics, there is a still a big advantage

in using expert knowledge.

But why? We do not have an expert with an intuitive understanding of

trajectories in charge of computing spaceship trajectories, why is it diﬀerent in

economics?

And it is not only econometrics: in many other areas of
human activity, there is also a surprising need for experts.

For example, in sports, a few decades ago, new sports mathematical methods

were developed that drastically improved our understanding of sports phenom-

ena and led to many team successes; see, e.g., [14]. At ﬁrst, the impression was

that the corresponding formulas provide a much better way of selecting team

players than the experience of even the most experienced coaches. However, it soon

turned out that relying only on the mathematical models is not a very eﬀective

strategy, that much better results can be obtained if we combine the mathemat-

ical model with the expert’s opinions; see, e.g., [9]. But why?

Same thing with smart cities. Cities often grow rather chaotically, with unin-

tended negative consequences of diﬀerent decisions, so:

• why not have a computer-based system combining all city services,

• why not optimize the functioning of the city while taking everyone’s interests

into account?

This seems to be a win-win proposition. This was the original idea behind smart

cities. This idea indeed led to many improvements and successes – but it also

turned out that by themselves, the resulting mathematical models do not always

provide us with very good results. Much better results can be obtained if we take

expert knowledge into account; see, e.g., [19].

Yet another area where experts are still (surprisingly) needed is teaching.

Every time there is a new development in teaching technology, optimistic popu-

lar articles predict that these technologies, optimized by using appropriate math-

ematical models, will eventually replace human teachers. And they don’t.


• this was predicted with current MOOCs – massive open online courses.

And these predictions turn out to be wrong. Deﬁnitely, teachers adopt new

technologies, these new technologies make teaching more eﬃcient – but attempts

to eliminate teachers completely and let an automatic system teach have not yet

been successful.

Same with medical doctors: since the very ﬁrst medicine-oriented expert sys-

tem MYCIN appeared several decades ago (see, e.g., [3]), enthusiasts have been

predicting that eventually, medical doctors will be replaced by expert systems.

Deﬁnitely, these systems help medical doctors and thus, improve the quality of

the health care, but still medical experts are very much in need.

Similar examples can be found in many other areas of human activity. But

why are experts so much needed? Why cannot we incorporate their knowledge

into automated systems that would thus replace these experts?

Why Cannot We Just Translate Expert Knowledge into Computer-

Understandable Terms: Approaches Like Fuzzy Logic Helped, But

Experts Are Still Needed. Many researchers recognized the desirabil-

ity to translate imprecise natural-language expert knowledge into computer-

understandable terms. Historically the ﬁrst successful idea of such a translation

was formulated by Lotﬁ Zadeh under the name of fuzzy logic [23]. This tech-

nique has indeed led to many successful applications; see, e.g., [2,11,15,16,18];

however, in spite of all these successes, experts are still needed. Why?

What We Do in This Paper. In this paper, we show that this unexpected need

for expert knowledge can be explained if we take into account that many complex

systems – especially systems related to econometrics and, more generally, with

human behavior – are well described by quantum equations [6,22], equations that

were originally invented to describe micro-objects of the physical world. And the

experience of designing computers that take quantum eﬀects into account has

shows, somewhat unexpectedly, that the best results are attained if instead of

asking precise questions, we ask kind of imprecise ones – we will explain this in

detail in the following sections.

Quantum Computing: A Brief Reminder

Somewhat surprisingly, quantum equations – originally developed for studying small physical

objects – have been shown to be useful in describing economic phenomena and,

more generally, any phenomena that involves human decision making; see, e.g.,

[1,10,13].

In view of the above usefulness, when thinking of the best algorithms for mak-
ing decisions in economics, it makes sense to look at how decisions are made – and how the
corresponding computations are performed – in the quantum world.


As we need to perform
more and more computations, we need to perform computations faster and faster.

In nature, there is a limitation on the speed of all possible physical processes:

according to modern physics, all the speeds are bounded by the speed of light –

c ≈ 300 000 km/sec. This may sound like a lot, but take into account that for a

typical laptop size of 30 cm, the smallest possible time that any signal need to

go from one side of the laptop to another is 30 cm divided by c, which is about 1

nanosecond, i.e., 10^−9 seconds. During this nanosecond, a usual several-gigahertz
processor – and gigahertz means 10^9 operations per second – performs several

arithmetic operations. Thus, to make it even faster, we need to make processors

even smaller. To ﬁt billions of cells of memory in a small-size computer requires

decreasing these cells to the size at which the size of each cell is of almost the

same order as the size of a molecule – and thus, quantum eﬀects, i.e., physical

eﬀects controlling micro-world, need to be taken into account.

The need to take quantum eﬀects into account when computing was ﬁrst

emphasized by the Nobelist Richard Feynman in his 1982 paper [5]. At ﬁrst,

quantum eﬀects were mainly treated as nuisance. Indeed, one of the features of

quantum physics is its probabilistic nature:

• many phenomena cannot be exactly predicted,

• we can only predict the probabilities of diﬀerent outcomes,

• and the probability that a computer will not do what we want makes the

computations less reliable.

However, later, it turned out that it is possible, as the saying goes, to make tasty

lemonade out of the sour and not-very-edible-by-themselves lemons that life gives

us: namely, it turned out that by cleverly arranging the corresponding quantum

eﬀects, we can actually speed up computations – and speed them up drastically.

Historically, the ﬁrst result showing potential beneﬁts of quantum computing was an algorithm

developed by Deutsch and Jozsa ten years after Feynman's paper; see [4] (see also

[12] for a pedagogical description of this algorithm). This algorithm solved the

following simple-sounding problem:

• given a function f (x) that transforms one bit (0 or 1) into one bit,

• check whether this function is constant, i.e., whether f (0) = f (1).

This may sound like a simple problem not worth spending time on, but it

is actually a simple case of a very important practical problem related to high

performance computing. In many applications, we have developed software that

solves the corresponding system of partial diﬀerential equations: it takes as input

the initial conditions, the boundary conditions, and produces the results. Solving

such systems of equations often requires a lot of computation time; for example:

• accurately predicting tomorrow’s weather requires several hours on the fastest

modern high performance computer, and

• predicting in which direction a tornado will move in
the next 15 min takes even longer than several hours – thus making current

predictions practically useless.

One possible way of speeding up computation is based on the fact that:

• while we include all the inputs into our parameters,

• some of current input’s bits do not actually aﬀect our results.

This is, by the way, one of the skills that physicists have – in situations like this,

ﬁguring out which inputs are important and which can be safely ignored. But

even after utilizing all the physicists’ expertise, we probably have many bits of

data that do not aﬀect tomorrow's weather, i.e., for which, whether we put the

bit value 1 or bit value 0 into the corresponding computations, we will get the

exact same result:

f (. . . , 1, . . .) = f (. . . , 0, . . .).

Now we see that the original Deutsch-Jozsa problem is indeed the simplest

case of an important practical problem – important when computing the above

simple function f (x) takes a lot of computation time. If we operate within clas-

sical physics, then we have to plug in either 0 or 1 into the given “black box” for

computing f (x). If we only plug in 0 and not 1, we will know f (0) but not f (1)

– and thus, we will not be able to know whether the values f (0) and f (1) are

the same. To check whether the given function f (x) is a constant, we therefore

need to call the function f (x) two times.
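In code, this classical baseline is trivial; the following sketch (ours) just makes explicit that two calls to f are used.

import java.util.function.IntUnaryOperator;

// Classical check whether f : {0,1} -> {0,1} is constant: two calls to f are needed.
static boolean isConstant(IntUnaryOperator f) {
    return f.applyAsInt(0) == f.applyAsInt(1);
}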

An interesting result of Deutsch and Jozsa is that in quantum computing,

we can ﬁnd the answer by using only one call to the function f (x) – in the next

section, we will explain how this is possible and how this is related to the need

for expert knowledge.

This result opened the ﬂoodgates for many other eﬃcient quantum algo-

rithms. One of the ﬁrst was Grover’s algorithm for a fast search in an unsorted

array [7,8]. The search problem is becoming more and more important every

day, with the increasing amount of data coming in. Ideally, we should sort all

this data – e.g., in alphabetic order – and thus make it easier to search, but in

practice, we often have no time for such sorting, and thus, store the data in a

non-sorted order, in memory cells

c1 , c2 , . . . , cn .

Suppose now that we want to ﬁnd a record r in this database. For example,

suppose that an act of terror has happened, the surveillance system recorded

the faces of the perpetrators, and to help stop further attacks, we want to ﬁnd if

these faces have appeared in any of the previously recorded surveillance video

recordings.

A natural way to ﬁnd the desired record is to look at all n stored records one

by one until we find it. In this process, if we look at fewer than n records,

we may thus miss the record ci containing the desired information. Thus, in

the worst case, to ﬁnd the desired record, we must spend time c · n = O(n),


where c is the average time needed to look into a single record – and within

classical physics, no faster algorithm is possible. Interestingly, Grover’s quantum

algorithm searches for the record much faster – in time proportional to √n.

There are many other known eﬀective quantum algorithms. The most well

known is Shor’s fast factorization algorithm [20,21] that enables us to factorize

large integers fast. This sounds like an academic problem until one realizes that

most computer encryption that we use – utilizing the so-called RSA algorithm

– is based on the diﬃculty of factorizing large integers. So, if Shor’s algorithm

becomes practical, we will be able to read all the encrypted messages that have

been sent so far – this is why governments and companies all over the world try

to implement this algorithm.

Comments.

• Shor’s result would not mean, by the way, that with the implementation

of quantum computing, encryption will be impossible – researchers have

invented unbreakable quantum encryption algorithms which are, by the way, already used to convey important messages. These algorithms and many other

quantum computing algorithms can be found in [17].

• In the following section, we will brieﬂy mention how exactly quantum com-

puters achieve their speedup – and how this is related to the need for experts.

How This Explains the Need for Imprecise Expert

Knowledge

One of the important specific features of the quantum world is that, in addition to classical (non-

quantum) states, we can have linear combinations (called superpositions) of these

states. This is a very non-intuitive notion; it is one of the reasons why Einstein objected to quantum physics: for example, how can one imagine a

superposition of a live cat and a dead cat? Intuitive or not, quantum physics has

been experimentally conﬁrmed – while many more intuitive alternative theories

ended up being rejected by the experiments.

Let us thus illustrate this idea on the example of quantum states of a bit (a

quantum bit is also called a qubit). In non-quantum physics, a bit has two states:

0 and 1. In quantum physics, these states are usually denoted by |0⟩ and |1⟩.
In quantum physics, in addition to the two classical states |0⟩ and |1⟩, we
also allow superpositions, i.e., states of the type c0 · |0⟩ + c1 · |1⟩, where c0 and

c1 are complex numbers. The meaning of this state is that when we read the

contents of this bit – i.e., if we try to measure whether we will get 0 or 1:

• we will get 0 with probability |c0|², and
• we will get 1 with probability |c1|².


Since we will always ﬁnd either 0 or 1, these two probabilities must add up to 1:

|c0|² + |c1|² = 1. This is the condition under which the above superposition is

physically possible.
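To make the normalization condition concrete, here is a small illustrative sketch in R (our own example, not part of the original text; the amplitudes are arbitrary choices that satisfy |c0|² + |c1|² = 1):

c0 <- complex(real = 1, imaginary = 1) / 2         # |c0|^2 = 1/2
c1 <- complex(real = 0, imaginary = 1) / sqrt(2)   # |c1|^2 = 1/2
Mod(c0)^2 + Mod(c1)^2                              # equals 1, so the state is physical
# simulate reading this qubit ten times: outcome 0 or 1 with these probabilities
sample(c(0, 1), size = 10, replace = TRUE, prob = c(Mod(c0)^2, Mod(c1)^2))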

How Superpositions Are Used in the Deutsch-Jozsa Algorithm. In the quan-

tum world, superpositions are “ﬁrst-class citizens” in the sense that:

• whatever one can do with classical states, we can do with superpositions as

well.

In particular:

• just like we can use 0 and 1 as inputs to the algorithm f (x),

• we can also use a superposition as the corresponding input.

And this is exactly the main trick behind the Deutsch-Jozsa algorithm: that

instead of using the classical state (0 or 1) as an input, we use, as the input, a

superposition state

(1/√2) · |0⟩ + (1/√2) · |1⟩,
a state in which we can get 0 or 1 with equal probability (1/√2)² = 1/2.
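To see concretely how a single call can suffice, here is a minimal numerical sketch in R of the one-bit Deutsch-Jozsa circuit (our own illustration, not taken from [4] or [12]; the helper names oracle and deutsch_jozsa are ours): the input qubit and an auxiliary qubit are put into superposition, the black box for f is applied once, and interference makes the measured input qubit reveal whether f is constant.

H  <- matrix(c(1, 1, 1, -1), nrow = 2) / sqrt(2)   # Hadamard gate
I2 <- diag(2)
# basis order |x y> = |00>, |01>, |10>, |11>; x is the input bit, y an auxiliary bit
oracle <- function(f) {                            # U_f |x, y> = |x, y XOR f(x)>
  U <- matrix(0, 4, 4)
  for (x in 0:1) for (y in 0:1) {
    U[2 * x + xor(y, f(x)) + 1, 2 * x + y + 1] <- 1
  }
  U
}
deutsch_jozsa <- function(f) {
  psi <- c(0, 1, 0, 0)                 # start in the classical state |0>|1>
  psi <- (H %x% H) %*% psi             # put both qubits into superposition
  psi <- oracle(f) %*% psi             # a single call to f, applied to the superposition
  psi <- (H %x% I2) %*% psi            # interference on the input qubit
  p1  <- sum(Mod(psi[3:4])^2)          # probability that the input qubit reads 1
  if (p1 > 0.5) "f(0) != f(1)" else "constant"
}
deutsch_jozsa(function(x) 0)           # constant function: "constant"
deutsch_jozsa(function(x) x)           # balanced function: "f(0) != f(1)"

One pass through the oracle applied to the superposition thus replaces the two classical calls.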

In the non-quantum approach, all we can do is select an index i and ask the system to

check whether the i-th record contains the desired information. In contrast, in

quantum mechanics, in addition to submitting an integer i as an input to the

database, we can also submit a superposition of diﬀerent indices:

c1 · |1⟩ + c2 · |2⟩ + . . . + ci · |i⟩ + . . . + cn · |n⟩,

as long as this superposition is physically meaningful, i.e., as long as all the

corresponding probabilities add up to 1:

|c1|² + |c2|² + . . . + |ci|² + . . . + |cn|² = 1.

This is exactly how Grover’s algorithm achieves its speedup – by having such

superpositions as queries.
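As a small illustration of this idea (again our own sketch, not the original formulation from [7,8]): the amplitudes c1, . . . , cn of such a superposition query can be simulated directly, since each Grover iteration flips the sign of the marked index and then reflects all amplitudes about their mean; after roughly (π/4) · √n iterations almost all probability sits on the desired record.

grover_probs <- function(n, target, iterations = floor(pi / 4 * sqrt(n))) {
  amp <- rep(1 / sqrt(n), n)          # uniform superposition over all n indices
  for (k in seq_len(iterations)) {
    amp[target] <- -amp[target]       # "oracle": mark the desired record
    amp <- 2 * mean(amp) - amp        # diffusion: inversion about the mean
  }
  amp^2                               # probability of reading each index
}
p <- grover_probs(n = 64, target = 17)
which.max(p)                          # 17: the marked record
max(p)                                # close to 1 after only ~6 oracle calls, vs. up to 64 classically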

A usual way to factorize a large number N is to try all possible prime factors p ≤ √N. In Shor's algorithm, crudely speaking, instead of inputting a single prime number p into the corresponding divisibility-checking algorithm, we input an appropriate superposition of the states |p⟩ corresponding to different prime numbers.

How All this Implies the Need for Experts. How can we interpret a

superposition input in commonsense terms? For example, in the search-in-the-

database problem:


• In the non-quantum case, we select one specific index i and check whether the i-th record contains the desired information.
• In quantum computing, we do not select a specific index i: the query may affect several different indices with different probabilities.

This is exactly the same effect as when an expert asks something like "does one of the earlier records contain the desired information?" – meaning maybe record

No. 1, maybe record No. 2, etc. Of course, the result of this query is also prob-

abilistic (imprecise): we do not get the exact answer to this question, we get an

imprecise answer – which would correspond to something like “possibly”.

In other words, queries like the ones asked by quantum algorithms are very

similar to imprecise queries that real experts make. The main lesson of quantum

computing is thus that:

• normally, when we start with such imprecise queries, we try to make them

more precise (“precisiate” them, to use Zadeh’s term from fuzzy logic), while

• quantum computing shows that in many important cases, it is computation-

ally more beneficial to ask such imprecise queries than to ask precise ones.

Thus, supplementing precise computations with imprecise expert-type reasoning is often beneficial – which explains the

somewhat surprising empirical need for such expert reasoning.

Acknowledgments. This work was supported in part by the Center of Excellence in Econometrics, Faculty of Economics, Chiang Mai University, Thailand. We also acknowledge

the partial support of the US National Science Foundation via grant HRD-1242122

(Cyber-ShARE Center of Excellence).

References

1. Baaquie, B.E.: Quantum Finance: Path Integrals and Hamiltonians for Options

and Interest Rates. Cambridge University Press, New York (2004)

2. Belohlavek, R., Dauben, J.W., Klir, G.J.: Fuzzy Logic and Mathematics: A His-

torical Perspective. Oxford University Press, New York (2017)

3. Buchanan, B.G., Shortliﬀe, E.H.: Rule Based Expert Systems: The MYCIN Exper-

iments of the Stanford Heuristic Programming Project. Addison-Wesley, Reading

(1984)

4. Deutsch, D., Jozsa, R.: Rapid solution of problems by quantum computation.

Proc. R. Soc. Lond. A 439, 553–558 (1992)

5. Feynman, R.P.: Simulating physics with computers. Int. J. Theor. Phys. 21(6/7),

467–488 (1982)

6. Feynman, R., Leighton, R., Sands, M.: The Feynman Lectures on Physics. Addison

Wesley, Boston (2005)

7. Grover, L.K.: A fast quantum mechanical algorithm for database search. In: Pro-

ceedings of the 28th ACM Symposium on Theory of Computing, pp. 212–219

(1996)

8. Grover, L.K.: Quantum mechanics helps in searching for a needle in a haystack.

Phys. Rev. Lett. 79(2), 325–328 (1997)


9. Grover, T.S., Wenk, S.L.: Relentless: From Good to Great to Unstoppable. Scrib-

ner, New York (2014)

10. Haven, E., Khrennikov, A.: Quantum Social Science. Cambridge University Press,

Cambridge (2013)

11. Klir, G., Yuan, B.: Fuzzy Sets and Fuzzy Logic. Prentice Hall, Upper Saddle River

(1995)

12. Kosheleva, O., Kreinovich, V.: How to introduce technical details of quantum com-

puting in a theory of computation class: using the basic case of the Deutsch-Jozsa

Algorithm. Int. J. Comput. Optim. 3(1), 83–91 (2016)

13. Kreinovich, V., Nguyen, H.T., Sriboonchitta, S.: Quantum ideas in economics

beyond quantum econometrics. In: Anh, L., Dong, L., Kreinovich, V., Thach,

N. (eds.) Econometrics for Financial Applications, pp. 146–151. Springer, Cham

(2018)

14. Lewis, M.: Moneyball: The Art of Winning an Unfair Game. W. W. Norton, New

York (2004)

15. Mendel, J.M.: Uncertain Rule-Based Fuzzy Systems: Introduction and New Direc-

tions. Springer, Cham (2017)

16. Nguyen, H.T., Walker, E.A.: A First Course in Fuzzy Logic. Chapman and

Hall/CRC, Boca Raton (2006)

17. Nielsen, M., Chuang, I.: Quantum Computation and Quantum Information. Cam-

bridge University Press, Cambridge (2000)

18. Novák, V., Perﬁlieva, I., Močkoř, J.: Mathematical Principles of Fuzzy Logic.

Kluwer, Boston (1999)

19. Schehtner, K.: Bridging the adoption gap for smart city technologies: an interview

with Rob Kitchin. IEEE Pervas. Comput. 16(2), 72–75 (2017)

20. Shor, P.: Polynomial-time algorithms for prime factorization and discrete loga-

rithms on a quantum computer. In: Proceedings of the 35th Annual Symposium

on Foundations of Computer Science, Santa Fe, New Mexico, 20–22 November 1994

(1994)

21. Shor, P.: Polynomial-time algorithms for prime factorization and discrete loga-

rithms on a quantum computer. SIAM J. Comput. 26(5), 1484–1509

(1997)

22. Thorne, K.S., Blandford, R.D.: Modern Classical Physics: Optics, Fluids, Plasmas,

Elasticity, Relativity, and Statistical Physics. Princeton University Press, Princeton

(2017)

23. Zadeh, L.A.: Fuzzy sets. Inf. Control 8, 338–353 (1965)

Applications

Monetary Policy Shocks

and Macroeconomic Variables: Evidence

from Thailand

Popkarn Arwatchanakarn(B)

popkarn.arw@mfu.ac.th

Abstract. From May, 2000 up to the present day, Thailand has imple-

mented a monetary policy of inﬂation targeting, with its central bank

(Bank of Thailand) using a short-term interest rate as the main mone-

tary instrument. A question arises as to whether the short-term policy

interest rate remains effective as the monetary policy instrument given the current uncertainty of the global economy.

Using the structural vector error correction (SVEC) model with con-

temporaneous and long-run restrictions, this paper has employed quar-

terly data for Thailand over the inﬂation targeting period of 2000q2–

2017q2 to investigate the relationship among monetary policy shocks

and some key macroeconomic variables in Thailand under the operation

of the inﬂation targeting. This study ﬁnds signiﬁcant feedback relations

among the six variables in the speciﬁed SVEC model, namely real out-

put, prices, interest rates, monetary aggregates, exchange rates and trade

balance. It also suggests that the eﬀects of monetary policy on macroeco-

nomic variables in Thailand are mostly consistent with theoretical expec-

tations. The overall results provide support to an argument that price

stability is required for sustained economic growth. More importantly,

the policy interest rate remains valid and eﬀective as the monetary instru-

ment for price stability under inflation targeting.

Thailand · Trade balance

1 Introduction

The role of price stability and monetary policy independence in Thailand has

increased since the East Asia crisis of 1997–1998. Given an institutional reform

of monetary policy, a managed-ﬂoat exchange rate system and a rule-based mon-

etary policy have been operated. From May 2000 up to the present day, Thailand

has implemented a ﬂexible form of inﬂation targeting framework, with the price

stability as the ultimate objective of monetary policy.

Under the inﬂation targeting, understanding the directions and magnitude of

the inﬂuences that drive the monetary transmission mechanism of an economy is



key to the successful conduct of monetary policy for price stability. The impact

of monetary policy will be substantial if the transmission mechanism of mone-

tary policy is completely passed through to the targeted sectors of the economy, especially the price level and economic growth. In other words, an instrument of monetary policy will be more effective when the monetary transmission mechanism is well understood and well developed [3]. On the other hand, when

either a monetary policy is not credible or monetary transmission channels do

not work eﬀectively, inﬂation targeting cannot anchor inﬂation expectation and

therefore it fails to achieve price stability. In the case of Thailand, the Bank of

Thailand (BOT) has used a short-term interest rate as the main policy instru-

ment to control inﬂation and adjust aggregate demand. The basic question arises

as to whether the eﬀects of a monetary policy shock on macroeconomic variables

are as theoretical expectation. Another question is whether the policy interest

rate remains valid and eﬀective as the monetary policy instrument in a low-

inﬂation environment and with global ﬁnancial turbulence or not.

The purpose of this study is to investigate the eﬀects of monetary shocks

on some key macroeconomic variables in Thailand under the implementation

of inﬂation targeting. We establish a structural vector error correction (SVEC)

model and impose the long run neutrality of money and contemporaneous restric-

tions to identify the monetary policy shock. The hypothesis that the short-term

policy interest rate and money remain valid transmission mechanisms of Thai

monetary policy is tested. In addition, the eﬀects of monetary policy on key

macroeconomic variables are examined. The analysis of the impulse response

functions and forecast error variance decompositions are then made to draw the

empirical ﬁndings and policy implications.

The structure of this study is organised as follows. Section 2 reviews the

literature on the conduct of monetary policy and structural vector error cor-

rection (SVEC) models. Section 3 outlines an SVEC model for investigating the

dynamic interactions among real output, prices, interest rates, monetary aggre-

gate, exchange rates and trade variables in the presence of two exogenous vari-

ables, namely the world prices of oil and the US Federal Fund rates. Section 4

reports the empirical results, including the impulse response functions and fore-

cast error variance decompositions. Finally, Sect. 5 summarises the key ﬁndings

and draws policy implication.

2 Literature Review

Since the East Asian crisis of 1997–1998, the role of monetary policy for price

stability in Thailand has increased. As part of the implementation of the Inter-

national Monetary Fund’s (IMF) stabilisation programme in 1997, Thailand has

operated a managed ﬂoating exchange rate system with some episodic capital

controls. This has therefore given the Bank of Thailand (BOT) some indepen-

dence in the conduct of monetary policy for price stability.


The BOT operated under monetary targeting during July 1997–May 2000. A monetary

aggregate was used as the instrument of monetary policy, and it acted as an

anchor for prices. Although the BOT’s operation under monetary targeting was

successful in stabilising the price level, the concern about reducing economic

growth and increasing the unemployment rate arose.

Since May 2000, the BOT has been conducting ﬂexible inﬂation targeting

(IT) with the price stability as the ultimate objective of monetary policy. In

turn, the IT policy has replaced the independent monetary policy. In addition,

it has been more relevant in achieving price stability and more effective in anchoring inflation expectations than other monetary policies. The reason behind the

abandonment of monetary targeting was the alleged instability of the demand

for money function and the claim that it had loosened the relationship among

money, output and prices [13]. Operating inﬂation targeting, the BOT discarded

monetary aggregate and has employed a short-term policy interest rate as the

main monetary instrument to adjust aggregate demand in order to maintain

both the price and output stability.

Consequently, inflation targeting has been instrumental in elevating the institu-

tional reforms and the eﬀectiveness and credibility of central banks. The optimal

inﬂation targeting would elevate the conduct of monetary policy leading to the

price stability, which is a basis of sustained economic growth. Under the Inﬂa-

tion targeting, the Bank of Thailand has successfully maintained inﬂation within

its target range and lowered vulnerability to external shocks. Overall, the Thai

economy has a satisfactory performance in terms of economic growth and

inﬂation [10].

Although inﬂation targeting has been successfully implemented in Thailand,

the problem associated with the interest-based transmission mechanism is sug-

gested [10,22]. An abandonment of monetary aggregates in the conduct of mon-

etary policy is not without cost to an economy [15]. In particular, the policy

interest rate has eﬀectively made monetary aggregates redundant, despite the

fact that monetary aggregates have been important in the inﬂationary process

of developing countries1 . It is considered unwise to ignore monetary aggregates

in the conduct of monetary policy because there is a dynamic relation among

the real output, prices, interest rates, monetary aggregate and exchange rates

[2,13]. One consequence of the operation of inflation targeting in Thailand is

the creation of money-growth volatility, which has kept inﬂation volatile and, in

turn, made both the real interest rates and real exchange rates volatile [13]. In

addition, an increase in inﬂation raises the volatility of inﬂation, which aﬀects

Thailand’s economic growth [14].

1 There is an argument that central banks do not, and cannot, impose control over

the long-term interest rate without controlling the money growth rate. The use of a

short-term interest rate as the monetary instrument makes the money-growth rate

unstable. Unstable money growth makes the inﬂation and hence the interest rate

unstable [2, 13].


Two hypotheses dominate the relation between monetary policies, exchange

rates and trade balance in an open economy. The ﬁrst hypothesis is the over-

shooting hypothesis, which involves monetary policies and exchange rates.

According to this hypothesis, a contractionary (expansionary) monetary pol-

icy results in a large initial appreciation (depreciation) of an exchange rate,

followed by subsequent depreciation (appreciation). Empirical evidence on the

exchange rate overshooting has been debatable. Grilli and Roubini [11] found

that a contractionary monetary policy initially produces a gradual appreciation,

which was then followed by a gradual depreciation. Their ﬁnding is not consis-

tent with exchange rate overshooting. However, Jang and Ogaki [17]

found evidence to support the overshooting hypothesis.

The second hypothesis is the J-curve hypothesis, which relates between mon-

etary policy and the trade balance. According to the J-curve hypothesis, a real

depreciation (appreciation) of an exchange rate lowers (increases) the relative

price of domestic goods to foreign goods, which, in turn, increases (decreases)

exports and decreases (increases) imports. As a result, the trade balance improves (deteriorates). The J-curve process takes time; it is not imme-

diate. Empirical studies that support the J-curve hypothesis include Kim [18]

and Koray and McMillin [20].

There is a growing literature on monetary policy analysis using structural vec-

tor autoregression (SVAR) and structural vector error correction (SVEC) mod-

els. The SVAR model is a standard econometric model in dynamic macroeco-

nomic analysis. It has been extensively used to analyse monetary issues espe-

cially monetary transmission mechanism [6]. However, the existing SVAR studies

have weaknesses that come from some issues, such as non-stationary variables

and short sample spans. These problems might generate unreliable, misleading

results and economic puzzles. Due to some limitations of the SVAR model, a

structural vector error correction (hereafter SVEC) model has recently emerged

as a new analytic tool for investigating the relationship between monetary policy

and macroeconomic variables.

Employment of an SVEC model has remained a challenge and has been con-

troversial in monetary policy analysis. The SVEC model, originally developed by

King, Plosser, Stock and Watson [21], relatively diﬀers from the SVAR model.

Whereas the SVAR requires only short-run (contemporaneous) restrictions, the

SVEC model requires both short-run (contemporaneous) and long-run restric-

tions. Faust and Leeper [7] criticise the weakness of model

estimation with only short-run or long-run restrictions. They also recommend

using both short-run and long-run restrictions to improve estimations. As stated

in Fung and Kasumovich [9], it is possible to impose identiﬁcation schemes on

the cointegration matrix of a VECM. In addition, Jang and Ogaki [17] point out


that the SVEC model has some advantages in systems with stochastic trends

and cointegration. It can be inferred that estimators from the SVEC model are more precise than those from the SVAR model.

The SVEC model would be superior to the SVAR model by addressing some

issues. Firstly, it allows the use of cointegration restrictions, which makes it possible

to impose long-run restrictions in order to identify shocks. Long-run restrictions

are more attractive because they are more directly related to the macroeconomic

model. The cointegration properties of the variables provide restrictions, which

can be taken into account beneﬁcially in identifying the structural shocks. It also

allows the imposition of restrictions for both short- and long-run relationships.

Secondly, it allows us to incorporate both the I(0) and I(1) nature of the data and empirically supported cointegrating relationships within the same modelling framework [8,24]. That is, an SVEC model requires fewer restrictions than an SVAR. Thirdly, it can impose restrictions about the underlying structure of

the economy and would provide a better ﬁt to the data. Therefore, it is capable

of investigating how the economy works in response to monetary policy shocks

and other shocks.

In Thailand, a number of studies have extensively used VAR and

SVAR to investigate Thailand’s transmission mechanism of monetary policy.

Examples of VAR literature for Thailand include the studies of Chareonseang

and Manakit [3]; Disyatat and Vongsinsirikul [5]; and Hesse [12]. In addition, few

studies utilize the SVAR model for the analysis of Thailand’s monetary policy,

for instance Arwatchanakarn [1]; Hossain and Arwatchanakarn [15]; Kubo [22]

and Phiromswad [26]. However, the employment of an SVEC model is somewhat

limited and remains challenging. The empirical studies on monetary policy issue

using an SVEC model include Arwatchanakarn [2] and Chucherd [4].

It would be beneﬁcial to employ the SVEC model for analysing monetary

policy issues in Thailand. Therefore, the present study aims to establish an SVEC

model to analyse the relationship between the monetary policy shocks and some

key macroeconomic variables for Thailand. To accomplish this, both plausible

short-run and long-run restrictions, which are based on economic theory and

previous studies, are imposed on the speciﬁed SVEC model.

3 Methodology

3.1 Interrelations Among the Interest Rate, Monetary Aggregate,

Real Output, Price Level, Exchange Rate and Trade Balance

Thailand has operated a managed-float exchange rate system with some episodic controls over capital flows. The Bank of Thailand currently uses a short-term policy interest rate, as the instrument of

monetary policy, to stabilise the price levels and the economy without aﬀecting

economic growth. Following the existing literature on monetary policy under

interest-based inﬂation targeting, an SVEC modelling is considered useful to

analyse the monetary transmission mechanism and mechanics of an economy,


namely real output (Y), prices (P), monetary aggregates (M), the interest rates

(PR), the exchange rates (ER) and trade balance (TB).

Accordingly, a six-variable SVEC model is established to investigate the inter-

actions among all above deﬁned variables under an exogeneity assumption that

Thailand remains exposed to shocks originating from two external variables,

namely world oil prices and foreign interest rates.2 For modelling purposes, two

external variables are assumed to aﬀect domestic variables, but they are not

aﬀected by domestic variables. Figure 1 illustrates the interrelations among the

external and domestic variables that can be expected for Thailand.

Fig. 1. The 6-variable SVEC model: the interrelations among the interest rates, monetary aggregates, prices, real output, exchange rates and trade balance with two external shocks, namely the world price of oil and the foreign interest rate.

3.2 The SVEC Model and Identification of Monetary Policy Shocks

This section briefly provides an overview of a structural vector error correction (SVEC) model. Following Lütkepohl [23], a reduced-form vector error correction (VEC) model can be specified. There are benefits in utilising the cointegrating properties of the variables; therefore, it is useful to deploy an SVEC model in the analysis of monetary policy.

Consider a reduced-form VAR model
A(L)yt = c + et   (1)
where A(L) is a matrix polynomial in the lag operator, yt is an n × 1 vector of variables, c is an n × 1 vector of constants (intercepts), and et is white noise with zero mean and covariance matrix Σe.

2 The variables in an SVEC model represent a multivariate system of endogenous variables, which maintain dynamic feedback relations. For identification purposes, some restrictions are imposed on the long-run and the short-run (or contemporaneous) relations among the variables in the SVEC model.


Assuming that all variables are at most I(1), the VEC model with cointegration rank r is represented in the following form:
Δyt = αβ′yt−1 + Π1Δyt−1 + ... + Πp−1Δyt−p+1 + et   (2)
In the case of cointegration, the matrix αβ′ is of reduced rank (r < n) and αβ′yt−1 represents the error-correction term. The α and β matrices are of dimension (n × r), where r is the cointegration rank. More specifically, α contains the adjustment coefficients and β contains the cointegration vectors. The Πi are n × n reduced-form short-run coefficient matrices.

The structural form of Eq. (2) is given by
HΔyt = Hαβ′yt−1 + HΠ1Δyt−1 + ... + HΠp−1Δyt−p+1 + ut   (3)
where the HΠi are the short-run coefficient matrices in structural form and ut is the vector of structural innovations.

In the SVEC model, the reduced form disturbances et are linearly related to

the structural innovations ut . The contemporaneous (or short run: SR) matrix

is given such that

et = H −1 ut (4)

Suppose that the process yt is influenced by two types of structural disturbances: those with a permanent impact and those with a transitory impact. To extract this information about the process yt, the vector moving average (VMA) representation of Eq. (3) is used:
yt = η(1) Σ(i=1..t) ei + η(L)et + y0   (5)

Notice that yt contains the initial condition (y0), the transitory shocks (η(L)) and the permanent shocks (η(1)). The transitory component η(L) is the infinite summation η(L) = Σ(j=0..∞) ηjL^j, whose terms converge to zero; this means that the transitory shocks have no long-run impact. The matrix η(1) expresses the long-run impact of the permanent shocks and is given by

η(1) = β⊥ [ α′⊥ ( In − Σ(i=1..p−1) Πi ) β⊥ ]−1 α′⊥   (6)
In terms of the structural innovations ui = Hei, the permanent component can equivalently be written as
η(1) Σ(i=1..t) ei = η(1)H−1 Σ(i=1..t) ui   (7)

Identifying restrictions are imposed on the long-run impact matrix and on the contemporaneous H−1 matrix. These restrictions imply that some shocks do not have a contemporaneous effect on some variables in the system.
This study deploys the SVEC model with long-run and short-run (or contemporaneous) restrictions. Having specified the restrictions, the six endogenous


variables are ordered as follows: the policy interest rate (PR); monetary aggre-

gate (M); prices (P); real output (Y); exchange rates (ER) and trade balance

(TB). This study follows the identiﬁcation scheme of Ivrendi and Guloglu [16]

and Kim and Roubini [19]. Our SVEC model treats a monetary policy shock as transitory while the other shocks are permanent3.

For a locally just-identified SVEC model [23,25], a total of 15 (n(n − 1)/2) restrictions is required. The long-run and contemporaneous restrictions are represented in the two following matrices. Apart from the cointegration structure, there are r transitory shocks. This provides r(n − r) = 5 restrictions on the long-run matrix for the permanent shocks. The policy interest rate is assumed to be transitory, and this variable undertakes the adjustment required for the cointegrating relationship to hold.

The transitory shock is described by a zero column in the long-run matrix4

(LR) as shown in Eq. (8) below:

LR = η(1)H−1 =
[ 0 ∗ ∗ ∗ ∗ ∗ ]
[ 0 ∗ ∗ ∗ ∗ ∗ ]
[ 0 ∗ ∗ ∗ ∗ ∗ ]
[ 0 ∗ ∗ ∗ ∗ ∗ ]
[ 0 ∗ ∗ ∗ ∗ ∗ ]
[ 0 ∗ ∗ ∗ ∗ ∗ ]   (8)

Assuming that the cointegrating rank (r) is equal to 1, there is only one transitory shock and there are five permanent shocks. The transitory shock is identified without further restrictions (r(r − 1)/2 = 0). However, the permanent shocks are identified by imposing at least 10 ((n − r)(n − r − 1)/2) additional restrictions on the contemporaneous matrix. Follow-

ing Kim and Roubini [19], we impose twelve restrictions in total. The likelihood

ratio test indicates that the over-identifying restrictions are not rejected. The

relation between the structural shocks and the reduced-form shocks can be expressed in

the contemporaneous form (SR) in Eq. (9):

SR = H−1 (where et = H−1ut) =
[ ∗ 0 ∗ 0 0 0 ]
[ ∗ ∗ ∗ ∗ 0 0 ]
[ ∗ ∗ ∗ 0 0 0 ]
[ ∗ ∗ ∗ ∗ 0 0 ]
[ ∗ ∗ ∗ ∗ ∗ 0 ]
[ ∗ ∗ ∗ ∗ ∗ ∗ ]   (9)

The ﬁrst row expresses the monetary policy reaction function. This study

assumes that the Bank of Thailand (BOT) sets the policy interest rate (PR)

3 We consider only one cointegrating relation. The reason behind this choice is that we treat the monetary policy shock as the only transitory shock. The identification of the long-run restrictions is also based on the assumption of money neutrality, i.e., that monetary policy does not permanently affect real variables in the long run.

4 The zeros are restricted elements and the asterisks are unrestricted elements.


after observing the price level (P) and two exogenous variables. However, the

BOT is assumed to respond to real output (Y) and the monetary aggregate (M) with a delay. The second row represents the money demand. It is assumed to con-

temporaneously react to the policy interest rate (PR), prices (P) and real output

(Y). The third row represents the price equation. The price is contemporaneously

influenced by the policy interest rate (PR) and the monetary aggregate (M). The fourth

row stands for the output equation. It is assumed to contemporaneously respond

to the policy interest rate (PR), monetary aggregate (M) and prices (P). The

ﬁfth row represents the exchange rate equation. It is contemporaneously aﬀected

by all above variables except the trade balance (TB). The sixth row represents

the trade balance, which is contemporaneously aﬀected by all variables in the

system.
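For concreteness, these two restriction patterns can be written down directly in R, in the spirit of the vars package used later for estimation (a sketch on our part; the variable ordering follows the text, free elements are coded as NA and restricted elements as 0):

vars_order <- c("PR", "M", "P", "Y", "ER", "TB")
# Long-run matrix of Eq. (8): the monetary policy (PR) shock has no permanent effect
LR <- matrix(NA, 6, 6, dimnames = list(vars_order, vars_order))
LR[, "PR"] <- 0
# Contemporaneous matrix of Eq. (9): zeros follow the row-by-row description above
SR <- matrix(NA, 6, 6, dimnames = list(vars_order, vars_order))
SR["PR", c("M", "Y", "ER", "TB")] <- 0   # policy rule reacts contemporaneously only to P
SR["M",  c("ER", "TB")]           <- 0   # money demand: PR, P and Y
SR["P",  c("Y", "ER", "TB")]      <- 0   # prices: PR and M
SR["Y",  c("ER", "TB")]           <- 0   # output: PR, M and P
SR["ER", "TB"]                    <- 0   # exchange rate: everything except TB
# the trade balance row is left fully unrestricted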

The six-variable SVEC model is estimated using quarterly data for Thailand5 .

The period of model estimation ranges from 2000q2 to 2017q2, covering the

implementation period of the inflation targeting framework.

The variables used in the speciﬁed model are as follows. The policy interest

rate (PR) is determined by the Bank of Thailand (BOT). The monetary aggre-

gate (M) is measured by the log of the narrow monetary aggregate. The prices (P) are measured by the log of the consumer price index (2010 = 100). The real output

(Y) is measured by the log of real gross domestic product. The exchange rate

(ER) is measured by the log of real eﬀective exchange rate (2010 = 100). The

foreign interest rate is the U.S. federal fund rates. The trade balance (TB) is

measured in terms of logarithm of the ratio of exports to imports6 . The main

sources of data are the International Financial Statistics of the IMF and the

Bank of Thailand. In addition, the world oil prices (WOP) are compiled from

the Federal Reserve Bank of St. Louis.

The ﬁrst step in our analysis is to examine the time-series properties of the

variables. In general, the augmented Dickey-Fuller (ADF) and the Kwiatkowski,

Phillips, Schmidt and Shin (KPSS) tests7 are commonly performed. The results

suggest that most variables under consideration have a unit root in a level form

but are stationary in the ﬁrst-order log-diﬀerence form. These results imply that

all the variables are I (1).
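A sketch of this screening step with the urca package (the series name lp is a placeholder for one of the log-transformed variables; the same commands would be repeated for every series):

library(urca)
summary(ur.df(lp, type = "trend", selectlags = "AIC"))         # ADF in levels: H0 = unit root
summary(ur.kpss(lp, type = "tau"))                              # KPSS in levels: H0 = stationarity
summary(ur.df(diff(lp), type = "drift", selectlags = "AIC"))    # ADF on the first difference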

5 In the estimation procedure, this study uses the R package vars by Pfaff [25].

6 Since the trade balance, defined as the difference between exports and imports, can take negative values, it cannot be log-transformed directly.

7 The ADF test is based on the null hypothesis that the series under testing has a unit root. The KPSS test is based on the null hypothesis that the series under consideration is stationary and hence does not have a unit root.


The second step is to determine the number of the cointegrating vectors. This

study uses the Johansen cointegration approach to examine the cointegrating relationships among the variables. The Akaike information criterion is used to select lag lengths. Four lags were sufficient to capture the dynamics of all the variables.

The trace and the maximum eigenvalue tests indicate four cointegrating rela-

tions among six variables. However, this study considers only one cointegration

relationship (r = 1).
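In terms of the R package vars by Pfaff [25], the estimation could then be sketched as follows (the data object thai_data, the exogenous-variable matrix exog and the settings are our own placeholders; LR and SR are the restriction matrices written out above):

library(urca)
library(vars)
# thai_data: quarterly series PR, M, P, Y, ER, TB over 2000q2-2017q2
vecm <- ca.jo(thai_data, type = "trace", ecdet = "const", K = 4,
              spec = "transitory", dumvar = exog)   # exog: world oil price, US federal funds rate
summary(vecm)                                       # trace test for the cointegration rank
svec <- SVEC(vecm, LR = LR, SR = SR, r = 1,         # one cointegrating relation, as in the text
             lrtest = TRUE, boot = TRUE, runs = 100)
summary(svec)                                       # estimated matrices and LR test of over-identification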

4 Empirical Results

4.1 Impulse Response Functions

Having estimated the identiﬁed model, the impulse responses for all six variables

are generated and reported in Figs. 2 and 3. The focus is on the impulse responses

of key macroeconomic variables to two monetary policy shocks. As we expected,

the impulse response functions show the interrelations among all variables.
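In the estimation sketch above, such impulse responses would be generated by the irf() method of the vars package (the horizon and bootstrap settings are our assumptions):

irf_pr <- irf(svec, impulse = "PR", boot = TRUE, runs = 100, n.ahead = 20)
irf_m  <- irf(svec, impulse = "M",  boot = TRUE, runs = 100, n.ahead = 20)
plot(irf_pr)   # responses of all six variables to a policy-rate shock, with bootstrap bands
plot(irf_m)    # responses to a monetary-aggregate shock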

Figure 2 presents the impulse responses to two monetary policy shocks,

namely policy interest rate and monetary aggregate, together with the upper

and lower conﬁdence intervals. Empirical results are shown as follows.

The first column in Fig. 2 reveals the responses of the variables specified in our model to a contractionary policy shock, i.e., an increase in the policy interest rate.

In response to the shock, the monetary aggregate (M), prices (P) and real output

(Y) fall initially and, after that, they rise to their initial baselines. These results

are consistent with theoretical expectations and the speciﬁed SVEC model does

not generate evidence of the liquidity, output and price puzzles. In the case of

exchange rate (ER), the eﬀect of a contractionary policy is a depreciation of the

domestic currency. This is evidence that the exchange rate puzzle still exists in

our model. In response to the shock, the trade balance (TB) initially improves8 .

After the initial rise, it starts to fall and is followed by a mean reversion to

its pre-shock level. These results imply that the short-term policy interest rate

remains a valid and eﬀective monetary policy instrument for achieving price

stability and improving the international trade.

The second column in Fig. 2 reveals the responses of the variables specified in our model to an expansionary policy shock, i.e., an increase in the monetary aggregate. In response

to this expansionary policy shock, the interest rate (PR) initially falls and, after

that, increases to above its pre-shock level. The eﬀect of this expansionary policy

on the price level (P) is negative and this is not consistent with the theoretical

expectations9 . In response to this expansionary policy shock, the real output

8 The improvement of the trade balance is driven by the relatively strong import contraction. The reason behind this is that the contractionary monetary policy shock shrinks the output and, in turn, reduces import demand. This effect is called the 'income absorption effect'.

9 Impulse response functions that contradict theoretical predictions are known as empirical puzzles and are often found in the monetary literature. These anomalies may come from modelling issues, e.g. identification, and from data limitations with a short time span.


(Y) increases initially and over time whereas the real eﬀective exchange rate

(ER) initially depreciates and is followed by a mean reversion to its pre-shock level.

In addition, this expansionary policy shock causes the trade balance (TB) to

worsen initially and over time10. Even though money is not an appropriate monetary instrument to maintain price stability, it could be a supplementary

instrument for stimulating economic growth and improving international trade

under the inﬂation targeting.

In addition, the responses of two monetary instruments (the policy inter-

est rate and monetary aggregate) to target variables (prices, output and trade

balance) are presented in Fig. 3. The ﬁrst column represents the responses of

the policy interest rate to shocks on target variables. First, in response to a

price shock, the policy interest rate promptly increases and remains above its

pre-shock level over time. This reveals supporting evidence that the Bank of Thailand (BOT) has the primary objective of maintaining price stability and acts

as an inﬂation ﬁghter under inﬂation targeting framework. Second, in response

to an output shock, the policy interest rate falls initially and over time. This

result does not provide a clear indication that the Bank of Thailand pursues an

objective of sustained output. Third, in response to a trade balance shock, the

policy rate initially increases and falls below its pre-shock level. However, the

adjustment of the policy interest rate is conducted as a key monetary instrument

in managing inﬂation and sustaining output.

The second column in Fig. 3 represents the responses of the monetary aggre-

gate to shocks on target variables. First, in response to a price shock, the mon-

etary aggregate increases initially and over time. This response of money is

consistent with the quantity theory of money (QTM) that a rise in prices drives

money increasing, when and if the velocity of circulation and output are con-

stant. Second, in response to an output shock, initially the money increases and

is follows by a mean reversion to its pre-shock level. This result is also consistent

with the QTM and money market equilibrium. Third, in response to a trade bal-

ance shock, initially the money falls and remains below its pre-shock level over

the time. This is not in line with theoretical expectations. However, one policy

implication emerges that monetary aggregate could be an optional instrument

for controlling inﬂation and stimulating output at least in the short run.

In addition to the impulse responses, we examine the forecast error variance decompositions of the macroeconomic variables. The variance decompositions

for the SVECM are reported in Fig. 4. The interesting results on the variances

decomposition are drawn as follows.

First, as for the policy interest rate, the most dominant source of its ﬂuctu-

ation is the variance of price level.

10 The reason behind this is that an expansionary monetary policy raises domestic income and, in turn, increases import demand, which leads to a deterioration of the trade balance.



Fig. 3. Impulse responses of the interest rate and monetary aggregate to the prices,

real output and trade balance shock in an SVEC model

Second, the price level is the dominant source of the fluctuation in money. In addition,

the policy interest rate is the second key determinant of monetary aggregate

ﬂuctuation even in the short-run.

Third, in the case of prices, the fluctuation mainly originates from the variance of the monetary aggregate within a one-year horizon. The policy interest rate contributes

about 14% to the ﬂuctuation of prices in the ﬁrst quarter. This implies that

changes in monetary aggregate (or the policy interest rate) might cause prices

to be unstable at least in short-run.

Fourth, most of the output fluctuation is explained by output's own shock, accounting for more than 80%. The price shocks play a significant role in the fluctuation in output, contributing around 7 to 13% of that fluctuation.


Fig. 4. Forecast error variance decomposition for the 6-variables SVEC model


Fifth, most of the exchange rate fluctuation is explained by the exchange rate's own shock, although it has a decreasing role

over time. The real output shock is the second largest source of the ﬂuctuation

in exchange rate in the medium-run and long-run.

Lastly, the variance decompositions of the trade balance suggest that, after the

trade balance shocks itself, the exchange rate shocks have signiﬁcant impact on

ﬂuctuation in the trade balance in the short-run. However, in the long-run, the

price shocks are the second key factor of the ﬂuctuation in the trade balance.

The overall results show that the variance of the price level appears to be an important factor in the variances of all variables in the specified model. This provides support for the view that price stability is also an important requirement for improving

trade balance and sustaining the output in the long-run.
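In the estimation sketch above, these forecast error variance decompositions would be obtained from the fevd() method (the 20-quarter horizon is our assumption):

fe <- fevd(svec, n.ahead = 20)            # one decomposition matrix per variable
plot(fe)
round(100 * fe$P[c(1, 4, 8, 20), ], 1)    # e.g. shares (%) of each shock in the price-level variance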

Overall, the empirical results, which are generated by the identiﬁed SVEC

model, show that there are feedback relations among all six variables in the sys-

tem, especially among real output, prices and monetary aggregate. The external

shocks transmit to the domestic economy contemporaneously and dynamically.

The main finding is that, under inflation targeting, the policy interest rate is adjusted as the key instrument of monetary policy in keeping inflation

and output stable. In other words, the policy interest rate remains an eﬀective

monetary instrument for achieving price stability and improving international

trade. Even though the monetary aggregate is not an appropriate monetary

instrument to maintain price stability, it could be a supplementary instrument

for stimulating economic growth and improving international trade in the short

run.

5 Conclusion

This paper has investigated the relationship among monetary policy shocks and

some key macroeconomic variables in Thailand using quarterly data over the

inﬂation targeting period of 2000q2–2017q2. The dynamic interrelations among

these variables are analysed by estimating a six-variable SVEC model, which

consists of real output, prices, interest rates, monetary aggregates, exchange

rates and trade balance, in the presence of two external shocks, namely the world

price of oil and the US federal fund rate. The overall results suggest signiﬁcant

feedback relations among the six endogenous variables in the speciﬁed model.

The overall responses of the macroeconomic variables obtained from the SVEC

model are consistent with most common theoretical expectations.

An important ﬁnding from this study is that a contractionary policy has an

important eﬀect on prices in Thailand. It reveals that monetary policy aﬀects

the prices, real output and trade balance at least in the short run. In addi-

tion, the empirical results provide conﬁrmatory evidence that price stability is

essential for sustaining economic growth and improving international trade. More

importantly, under inflation targeting, the policy interest rate remains valid

and eﬀective as the monetary instrument for achieving price stability. Monetary

aggregate also remains signiﬁcant in the conduct of monetary policy in Thailand.


Also, the exchange rate remains a valid channel through which monetary policy is transmitted to international trade via exports, imports and the trade balance.

Based on the results obtained in the paper, some policy implications for the conduct of monetary policy in Thailand can be drawn. First, achieving price stability

is essential to maintain steady economic growth. One policy implication is that

the price stability requires a credible and transparent monetary policy with

an eﬀective monetary transmission mechanism. Second, as monetary aggregate

is important in the monetary transmission mechanism, the Bank of Thailand

should not ignore monetary aggregates in its implementation of monetary policy

for price stability. The Bank of Thailand might opt for a monetary aggregate

as a supplementary instrument of monetary policy under its inﬂation targeting

framework. Third, when and if necessary, exchange rate measures, such as capital

controls and foreign exchange rate intervention, could be used to stabilise the

exchange rate, to improve international trade and to ensure the soundness of

economic stability.

References

1. Arwatchanakarn, P.: Structural vector autoregressive analysis of monetary policy

in Thailand. Sociol. Study 7(3), 133–145 (2017)

2. Arwatchanakarn, P.: Exchange rate policy, monetary policy and economic growth

in Thailand: a macroeconomic study, 1950–2016. The University of Newcastle,

Newcastle, NSW, Australia (2018)

3. Charoenseang, J., Manakit, P.: Thai monetary policy transmission in an inﬂation

targeting era. J. Asian Econ. 18(1), 144–157 (2007)

4. Chucherd, T.: Monetary and ﬁscal policy interactions in Thailand. Bank of

Thailand, Bangkok (2013)

5. Disyatat, P., Vongsinsirikul, P.: Monetary policy and the transmission mechanism

in Thailand. J. Asian Econ. 14(3), 389–418 (2003)

6. Enders, W.: Applied Econometric Time Series. Wiley, Hoboken (2010)

7. Faust, J., Leeper, E.M.: When do long-run identifying restrictions give reliable

results? J. Bus. Econ. Stat. (1997). https://doi.org/10.2307/1392338.

8. Fisher, L.A., Huh, H.S.: Identiﬁcation methods in vector-error correction models:

equivalence results. J. Econ. Surv. 28(1), 1–16 (2014)

9. Fung, B.S., Kasumovich, M.: Monetary shocks in the G-6 countries: is there a

puzzle? J. Monet. Econ. 42(3), 575–592 (1998)

10. Grenville, S., Ito, T.: An independent evaluation of the Bank of Thailand’s mone-

tary policy under the inﬂation targeting framework, 2000–2010. Bank of Thailand,

Bangkok (2010)

11. Grilli, V., Roubini, N.: Liquidity and exchange rates: puzzling evidence from the

G-7 countries. New York University, New York (1995)

12. Hesse, H.: Monetary policy, structural break and the monetary transmission mech-

anism in Thailand. J. Asian Econ. 18(4), 649–669 (2007)

13. Hossain, A.: The Evolution of Central Banking and Monetary Policy in the Asia-

Paciﬁc. Edward Elgar Publishing, Cheltenham (2015)

14. Hossain, A., Arwatchanakarn, P.: Inﬂation and inﬂation volatility in Thailand.

Appl. Econ. (2016). https://doi.org/10.1080/00036846.2015.1130215


15. Hossain, A.A., Arwatchanakarn, P.: Does money have a role in monetary policy

for price stability under inﬂation targeting in Thailand? J. Asian Econ. (2017).

https://doi.org/10.1016/j.asieco.2017.10.003

16. Ivrendi, M., Guloglu, B.: Monetary shocks, exchange rates and trade balances:

evidence from inﬂation targeting countries. Econ. Modell. 27(5), 1144–1155 (2010)

17. Jang, K., Ogaki, M.: The eﬀects of monetary policy shocks on exchange rates: a

structural vector error correction model approach. J. Jpn. Int. Econ. 18(1), 99–114

(2004)

18. Kim, S.: Eﬀects of monetary policy shocks on the trade balance in small open

European countries. Econ. Lett. 71(2), 197–203 (2001)

19. Kim, S., Roubini, N.: Exchange rate anomalies in the industrial countries: a solu-

tion with a structural VAR approach. J. Monet. Econ. 45(3), 561–586 (2000)

20. Koray, F., McMillin, W.D.: Monetary shocks, the exchange rate, and the trade

balance. J. Int. Money Financ. 18(6), 925–940 (1999)

21. King, R., Plosser, C., Stock, J., Watson, M.: Stochastic trends and economic ﬂuc-

tuations. Am. Econ. Rev. 81(4), 819–840 (1991)

22. Kubo, A.: Macroeconomic impact of monetary policy shocks: evidence from recent

experience in Thailand. J. Asian Econ. 19(1), 83–91 (2008)

23. Lütkepohl, H.: New Introduction to Multiple Time Series Analysis. Springer,

Heidelberg (2005)

24. Pagan, A.R., Pesaran, M.H.: Econometric analysis of structural systems with per-

manent and transitory shocks. J. Econ. Dyn. Control. 32(10), 3376–3395 (2008)

25. Pfaﬀ, B.: VAR, SVAR and SVEC models: implementation within R package vars.

J. Stat. Softw. 27(4), 1–32 (2008)

26. Phiromswad, P.: Measuring monetary policy with empirically grounded restric-

tions: an application to Thailand. J. Asian Econ. 38, 104–113 (2015)

Thailand’s Household Income Inequality

Revisited: Evidence from Decomposition

Approaches

and Songsak Sriboonchitta2,3

1 Bank of Thailand, Northern Region Office, Chiang Mai, Thailand
natthaphat.kingnetr@outlook.com
2 Faculty of Economics, Chiang Mai University, Chiang Mai, Thailand
supanika.econ.cmu@gmail.com, songsakecon@gmail.com
3 Puey Ungphakorn Center of Excellence in Econometrics, Chiang Mai University, Chiang Mai, Thailand

Abstract. This study examines household income inequality in Thailand in three dimensions: sources of income, industrial subgroups,

and household characteristics. The results show that the source of income

with the highest contribution to the inequality is income from businesses.

In terms of industry, we found that real estate, wholesale and retail trade,

manufacturing and agriculture experience highest income inequality. For

the analysis on household characteristics, we examined drivers for over-

all income inequality and also examined separately for each source of

income and industry subgroup. The results raise attention to the impor-

tance of households' wealth for inequality, as the inequality in financial assets and credit accessibility contributes the most to income inequality, more than that of education. In addition, we found that the key contrib-

utors of income inequality are heterogeneous across industrial subgroups.

In particular, diﬀerent types of ﬁnancial assets and funds contribute dif-

ferently to income inequality in each industrial subgroup. Therefore, in

addition to ensuring an equal opportunity in education, a more equal

access to diﬀerent types of funding is also crucial for income inequality

reduction.

Generalized Entropy · Heterogeneity · Thailand

1 Introduction

Although global income inequality has not changed much and differences

between countries have decreased, the within-country inequality has increased

signiﬁcantly from 1988 to 2008 [14]. With economic growth, incomes at the

bottom grew much slower than those at the top and did not grow at all in some

countries. This causes a rise in the income share of the top percentiles in nearly


Thailand’s Household Income Inequality Revisited 221

all countries. Several studies in both developed and developing countries have

supported these outcomes [1,4,20].

Is income inequality a problem? Literature has debated over the impacts of

inequality on economic growth as theoretically inequality can yield both positive

and negative impacts on growth. Federico Cingano summarized in [4] that the

positive impacts occur because inequality gives incentives for people to work

harder and save more. Therefore, it increases productivity and capital accumu-

lation. For the negative impacts, inequality causes under-investment in education

among the poor population due to financial market imperfections. At the aggregate

level, inequality reduces demand and may reduce technological adoption that

requires economies of scale. Finally, strong inequality can lead to political insta-

bility and social unrest. In addition to economic impacts, inequality also has

negative impacts on people’s physical and psychological health and well-being

[3,9]. As the inequality has both pros and cons, some level of inequality is desired.

However, as the economic growth has not yet improved income inequality in gen-

eral and within-country inequality is increasing in many countries,

the disadvantages of the inequality can have more serious eﬀects.

Whether the high level of inequality should be a concern also depends on the

belief in the Kuznets’ hypothesis, which states that the level of inequality will

decrease once the economy is developed to a certain point. The study [15] by Iven

Lyubimov reviews the literature on Kuznets' hypothesis and discusses multiple studies whose results contradict the hypothesis.

The key opposing study [19] is done by Thomas Piketty, who uses 100 years of

data from 20 countries to examine the relationship between economic growth and

income inequality. His results suggest that an automatic decrease in inequality

cannot be expected as the economy developed, which contradicts the Kuznets’

hypothesis.

For the drivers of inequality, according to [19], they come from two funda-

mental sources—labour earning potential and inherited wealth. Therefore, the

gap between the return to capital and economic growth accelerates inequality. In

addition, Dabla-Norris et al. [7] found that an increase in skill premium is the

key factor that increases the income inequality in developed countries, and ﬁnan-

cial deepening is the key factor that increases the income inequality in develop-

ing countries. Evidence has shown that the key sources of income inequality are

either the unequal opportunity in education and wealth or the policies, such as a

progressive tax on labour income and inherited wealth, that are not well-designed.

Glomm and Ravikumar [11] found that an equal opportunity in education through

public investment in human capital leads to a decline in income inequality. Institu-

tions and policies also play an important role in controlling the level of income and

wealth inequality. In addition, several studies [1,8] found that democracy increases

redistribution when inequality rises and, thus, reduces inequality.

For the case of Thailand, the World Bank reported the Gini coeﬃcient

of 37.80 for Thailand in 2013. Credit Suisse’s Global Wealth Databook 2017

reported that the wealth share of the top 1% in Thailand was 56.2%, which was

greater than the world average of 50.1%. The Gini coeﬃcients calculated using


the national household data from the socio-economic survey have dropped and

presented the Kuznets’ pattern [13,26]. However, both the level of inequality

and the trend depend heavily on the choices of measurement and empirical modelling. While some studies found that the inequality problem in Thailand is declining, others found contradictory evidence [13,16–18,26]. For drivers of

income inequality, it is agreed upon by several studies that unequal opportu-

nity and choices of education are a key driver of income inequality [13,16,18]. In

addition, Kilenthong [13] found that occupation, ﬁnancial access and urban and

rural locations contribute to the explanation of income inequality. Paweenawat

[18] found that family factors, such as number of children and earners, also play

an important role. Pawasutipaisit [17] uses monthly panel data to examine

upward mobility in net worth and found that the return on assets is the key

factor. Meneejuk [16] examines macroeconomic factors and found that the share of the private sector and inflation also have an impact on inequality.

In this study, we use the data from the 2015 Socio-economic Survey (SES)

to calculate the Generalized Entropy (GE) measurement for household’s income

inequality. We use household-level data instead of individual-level data because

of the possibility of task specialization within family. It is not unusual in the

Thai context that some family members may specialize in housework or child

care, while some members specialize in labour market work. Using individual-

level data to calculate income inequality can be misleading. The main reasons

for adopting the GE method for inequality measurement are the following: (1)

it is widely used for inequality measurement in the literature, (2) it is more robust compared to the Gini index when the data include zero and negative

incomes, and (3) it satisﬁes all necessary properties of a well-designed inequality

index as discussed in [12].

The income inequality then can be decomposed to examine the sources of

the inequality. This study performs three inequality decomposition approaches;

the source decomposition approach [22], the subgroup decomposition approach

[23], and the regression-based decomposition approach [6,10]. In the ﬁrst app-

roach, we ﬁrst examine which sources of household’s income contribute most to

income inequality. As the results show that proﬁts from businesses, wages from

employment, and proﬁts from farming are the key sources, the second approach

examines the role of industrial subgroups to the income inequality within each of

the three income sources. The results show that real estate, wholesale and retail

trade, manufacturing and agriculture experience the highest income inequality.

The third approach identifies drivers of income inequality. Since inequality drivers can differ across both income sources and industries, in

this part, we decompose the income inequality for each of the main sources and

industries. Following the framework introduced by Thomas Piketty in [19] and

previous empirical studies [2,6,13,18], we try to emphasize the eﬀects of human

and physical capitals on the inequality. Human capital is measured by years of

education, age and gender. Physical capital is measured by ﬁnancial assets, size

of owned land and access to the internet. In addition, we also include the credit

constraint variables, which are debt to house and land, business and agricultural

Thailand’s Household Income Inequality Revisited 223

associated with income inequality. However, the results show that ﬁnancial asset

contributes most to the inequality. The signiﬁcance of ﬁnancial assets and busi-

ness debts is consistent with [19], which emphasizes the association of capital

to income inequality. It should be noted that unobserved factors, such as ability

of household members, that aﬀect both income and ﬁnancial asset, can cause

an upward bias in the estimate of the impact of ﬁnancial asset. However, our

results show a large gap between the eﬀects of ﬁnancial assets and other inequal-

ity drivers and, thus, ﬁnancial assets should be considered as an important driver

regardless.

Although the key drivers of inequality are the same across income sources

and industries, the size of the contribution of each factor diﬀers. In particular,

diﬀerent types of ﬁnancial assets and funds contribute diﬀerently to income

inequality in each industrial subgroups.

In this section, we start with a discussion of the inequality measurement, followed by three well-known income inequality decomposition approaches: the source decomposition approach [22], the subgroup decomposition approach [23], and the regression-based decomposition approach [6,10]. Finally, the analysis process and the data used in the study are explained.

Haughton pointed out in [12] that a good measure of income inequality should

satisfy the following requirements: mean independence, population size indepen-

dence, symmetry, Pigou-Dalton transfer sensitivity, decomposability, and statis-

tical testability. Cowell introduced the so-called generalized entropy index (GE)

for inequality measurement in [5] based on the information theory introduced by

Shannon in [21] and is an improvement from the work [25] by Theil. It satisﬁes

all the mentioned criteria. The formula for the GE index can be speciﬁed as

follows:

GE(\alpha) = \frac{1}{\alpha(\alpha - 1)}\left[\frac{1}{n}\sum_{i=1}^{n}\left(\frac{y_i}{\bar{y}}\right)^{\alpha} - 1\right], \qquad (1)

where GE(α) represents the value of the generalized entropy inequality index for a given value of the parameter α, y_i denotes the i-th income observation, ȳ denotes the mean income, and α determines how sensitive the index is to the way incomes are spread in different parts of the income distribution. The higher the value of α, the more sensitive the index becomes to deviations in incomes at the upper tail. In this study, we employ α = 0, 1, and 2, as these values are well known in the literature and commonly used by leading organizations such as the IMF and the World Bank. In addition, GE(0) is also called Theil's L index and GE(1) is called Theil's T index [12].
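To make the GE index concrete, here is a small Python sketch (our own illustration, not code from the study); the α = 0 and α = 1 cases are computed through the Theil L and Theil T limits of the general formula, and the toy incomes are invented.

```python
import numpy as np

def ge_index(incomes, alpha):
    """Generalized entropy index GE(alpha) for strictly positive incomes.

    alpha = 0 and alpha = 1 are removable singularities of the general
    formula, so the Theil L and Theil T limits are used instead.
    """
    y = np.asarray(incomes, dtype=float)
    mean = y.mean()
    if alpha == 0:                       # Theil's L (mean log deviation)
        return np.mean(np.log(mean / y))
    if alpha == 1:                       # Theil's T
        return np.mean((y / mean) * np.log(y / mean))
    return (np.mean((y / mean) ** alpha) - 1.0) / (alpha * (alpha - 1.0))

incomes = np.array([12000, 18000, 25000, 40000, 90000], dtype=float)
for a in (0, 1, 2):
    print(f"GE({a}) = {ge_index(incomes, a):.4f}")
```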


One prominent downside of the GE index is that it can take any value from zero to infinity [12]: zero indicates perfect income equality and a greater value means larger income inequality. Nevertheless, unlike the conventional Gini index, which is sensitive to data with negative incomes, the GE index is more robust when researchers want to include zero and negative incomes in their analyses. According to [12], the equation for the Gini index can be specified as follows:

Gini = 1 - \frac{1}{N}\sum_{i=1}^{N}\left(y_i + y_{i-1}\right), \qquad (2)

where y_i is the i-th income observation and N is the total number of observations. Notice that, if the data contain observations with negative income, the value of the Gini index may exceed one, which violates the requirement that the Gini index be valued between 0 and 1. In addition, the Gini index lacks decomposability, meaning that the sum of the inequality of subgroups does not equal the total inequality [12]. This property is crucial in our analysis, as we are interested in decomposing the inequality by subgroups. For these reasons, it is appropriate to employ the GE index in this study, since we seek to investigate the inequality through different decomposition approaches.

The source decomposition approach of Shorrocks [22] attributes shares of the total income inequality to the different sources of income. This method requires that

y = \sum_{f=1}^{F} z_f, \qquad (3)

where y is total income, z_f denotes the amount of income from source f, and F is the total number of income sources. Then, one can investigate the share of these

income components to income inequality through the natural decomposition rule,

in a general form, that is

I(y) = \sum_{f=1}^{F} \theta_f, \qquad (4)

where I(y) is the income inequality calculated from total income y, which can be any of the inequality measures; θ_f is the absolute share of income inequality from income source f and has the same unit of measure as I(y). This rule shows that the sum of the inequality from each income source should equal the income inequality index calculated using total income. However, we are more interested in the percentage share, s_f. Thus, from Eq. (4), we can see that s_f = θ_f / I(y). Alternatively, based on [22], we can get s_f from

s_f = \frac{\mathrm{cov}[z_f, y]}{\sigma^2(y)}. \qquad (5)

Thailand’s Household Income Inequality Revisited 225

That is, the share of source f is measured as the covariance between income from source f and total income y, divided by the variance of y. In addition, it can be noticed that this step does not involve any income inequality measure. The advantage of the Shorrocks approach [22] is that the percentage share of sources to total income inequality remains the same for any inequality measure [2]. However, the absolute share may vary depending on the measurement of income inequality [6].
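As an illustration of the natural decomposition rule in Eq. (5), the sketch below computes the percentage share s_f of each income source from its covariance with total income; the toy data and source names are invented for the example.

```python
import numpy as np

# Toy household data: each array is one income source across households.
rng = np.random.default_rng(0)
n = 1000
sources = {
    "wages":    rng.gamma(2.0, 8000.0, n),
    "business": rng.gamma(1.2, 15000.0, n) * rng.binomial(1, 0.4, n),
    "farming":  rng.gamma(1.5, 5000.0, n) * rng.binomial(1, 0.5, n),
}

total = sum(sources.values())
cov = lambda a, b: ((a - a.mean()) * (b - b.mean())).mean()   # population covariance
var_total = total.var()

# Shorrocks (1982) natural decomposition: s_f = cov(z_f, y) / var(y).
shares = {name: cov(z, total) / var_total for name, z in sources.items()}
for name, s in shares.items():
    print(f"{name:8s}: {100 * s:5.1f}% of total income inequality")
print(f"sum of shares = {100 * sum(shares.values()):.1f}%")   # sums to 100% by construction
```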

The subgroup decomposition approach [23] splits the total income inequality across population subgroups into two types of inequality, a between-group component GE_b(α) and a within-group component GE_w(α), where the within-group component is

GE_w(\alpha) = \sum_{k=1}^{K} V_k^{1-\alpha} S_k^{\alpha}\, GE_k(\alpha), \qquad (7)

V_k is the proportion of people in subgroup k in the total population, S_k is the total income share belonging to subgroup k, and GE_k(α) is the GE index for subgroup k. Note that GE_b(α) assumes that all members in subgroup k receive the mean income of that group, ȳ_k. Moreover, a greater level of within-group inequality indicates higher income variation within the subgroups, while a greater level of between-group inequality indicates strong income differences between subgroups.
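The within-group and between-group components can be computed as in the sketch below, where GE_b(α) is obtained by assigning every member of a subgroup its group mean; the group labels and simulated incomes are illustrative only.

```python
import numpy as np

def ge(y, alpha=2):
    """GE(alpha) for alpha not in {0, 1}; enough for the GE(2) used here."""
    y = np.asarray(y, dtype=float)
    return (np.mean((y / y.mean()) ** alpha) - 1.0) / (alpha * (alpha - 1.0))

rng = np.random.default_rng(1)
groups = {
    "agriculture":   rng.lognormal(9.3, 0.8, 600),
    "manufacturing": rng.lognormal(9.9, 0.6, 300),
    "trade":         rng.lognormal(10.1, 0.7, 100),
}
alpha = 2
all_y = np.concatenate(list(groups.values()))

# Within-group term: sum_k V_k^(1-alpha) * S_k^alpha * GE_k(alpha)  (Eq. 7).
ge_within = 0.0
for y_k in groups.values():
    v_k = y_k.size / all_y.size          # population share of subgroup k
    s_k = y_k.sum() / all_y.sum()        # income share of subgroup k
    ge_within += v_k ** (1 - alpha) * s_k ** alpha * ge(y_k, alpha)

# Between-group term: everyone in subgroup k receives the subgroup mean.
synthetic = np.concatenate([np.full(y_k.size, y_k.mean()) for y_k in groups.values()])
ge_between = ge(synthetic, alpha)

print(f"total   GE({alpha}) = {ge(all_y, alpha):.4f}")
print(f"within  GE_w       = {ge_within:.4f}")
print(f"between GE_b       = {ge_between:.4f}")
print(f"within + between   = {ge_within + ge_between:.4f}")   # equals the total GE
```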

The decomposition by sources and subgroups are useful for policy makers

when it comes to policy prioritisation [24]. For instance, if the cause of total

income inequality is from the disparities in income among enterprises within an

industry sector, then the government should focus on improving competitiveness

of the disadvantaged enterprises in the sector. Another example is that, if the

variation in income are mainly from salary rather than return to investment, then

the government may consider prioritising policies to improve human capital and

labour market eﬃciency in order to alleviate income inequality.

However, the sources and subgroup decomposition approaches have been crit-

icised for being too restrictive [6]. The source approach, despite its flexibility to be used with any type of income inequality measure, requires that the total income and its sources follow the natural decomposition rule, which means that the total income must equal the sum of its sources. The subgroup approach requires a discrete variable for its partition criterion. However, other important socio-economic factors that have a potential effect on income distribution are continuous (e.g., age, number of earners, debt-to-income ratio, and wealth).

To overcome this issue, we will turn to the regression-based decomposition

approach [6,10].


The approach was first introduced by Fields [10] and then further improved by Cowell [6]. The idea of this approach is based on a data generating process that takes a linear form, also known as a linear regression model, that is

y_i = \beta_0 + \sum_{k=1}^{K} \beta_k x_{k,i} + \varepsilon_i, \qquad (8)

where y_i is the income of household i, x_{k,i} is the k-th household characteristic, and ε_i is an error term. As long as the conventional assumptions hold, such as exogeneity of the explanatory variables and an error term with zero mean and constant variance, one can use the OLS approach to obtain the parameter estimates and residuals [6]. Then Eq. (8) can be rearranged into

y_i = \hat{\beta}_0 + \sum_{k=1}^{K} z_{k,i} + \hat{\varepsilon}_i, \qquad (9)

where z_{k,i} = \hat{\beta}_k x_{k,i} is a composite variable. This permits the use of the source decomposition technique by Shorrocks [22] if the composite variables and residuals are treated as income sources. Thus, the contribution of household characteristics in this sense to total inequality can be calculated as

s_k(y) = \frac{\mathrm{cov}(z_k, y)}{\sigma^2(y)}, \qquad (10)

which is similar to Eq. 5 except that the composite variables are used instead

of income sources. Also note that this approach allows us to investigate the contribution of the residual to total income inequality. It is highly important that this term not be ignored, as the presence of the residual in the calculation of inequality shares can potentially affect the results [6].
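A compact sketch of the regression-based decomposition in Eqs. (8)-(10): fit OLS, form the composite variables z_k = β̂_k x_k, and compute each share (including the residual's) from its covariance with income. The covariates, coefficients, and data below are simulated purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
# Illustrative household characteristics (regressors of Eq. 8).
schooling = rng.integers(0, 17, n).astype(float)
fin_asset = rng.gamma(1.0, 50000.0, n)
internet  = rng.binomial(1, 0.5, n).astype(float)
X = np.column_stack([np.ones(n), schooling, fin_asset, internet])

# Simulated income from a linear model plus noise.
beta_true = np.array([5000.0, 900.0, 0.15, 4000.0])
income = X @ beta_true + rng.normal(0.0, 6000.0, n)

beta_hat = np.linalg.lstsq(X, income, rcond=None)[0]   # OLS estimates
resid = income - X @ beta_hat

cov = lambda a, b: ((a - a.mean()) * (b - b.mean())).mean()
var_y = income.var()

names = ["schooling", "financial assets", "internet access"]
for j, name in enumerate(names, start=1):
    z = beta_hat[j] * X[:, j]                          # composite variable z_k (Eq. 9)
    print(f"{name:17s}: {100 * cov(z, income) / var_y:5.1f}%")
print(f"{'residual':17s}: {100 * cov(resid, income) / var_y:5.1f}%")
```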

We perform our analysis using the methods discussed above with the SES data in

2015 from the National Statistical Oﬃce (NSO) of Thailand. We consider ﬁrst

the decomposition by household income sources to see which types of house-

hold income experience the largest income inequality. Then, we examine the role

of industrial subgroups to the income inequality. Lastly, the regression-based

decomposition is employed to investigate the within-group inequality. This will

allow us to assess how household characteristics contribute to the income inequal-

ity in each important occupation subgroup. Four important household charac-

teristics are considered in this study: (1) level of education of household’s highest

earner, (2) level of household ﬁnancial asset, (3) level of credit accessibility, and

(4) level of internet accessibility.

Thailand’s Household Income Inequality Revisited 227

3 Empirical Results

It can be seen from Table 1 that business-type income contributes the most to household income inequality (60.7%), followed by wages and salaries (19.8%) and farming (9.8%). This is expected, as business tends to generate higher income and involve greater risks than other types of work, as seen from its highest GE value. However, certain types of business or industry may face higher or lower income disparity than others. We will therefore examine industry subgroups to see in more detail how each sector experiences income inequality.

Table 1. Absolute income inequality and share to total inequality by income source

Income source | Absolute inequality | Share to total inequality (%)
Wages and salaries | 0.214 | 19.8
Net profit from business | 0.655 | 60.7
Net profit from farming | 0.106 | 9.8
Pensions, annuities, and other assistances | 0.019 | 1.8
Work compensations or terminated payment | 0.001 | 0.1
Money assistance from other people outside household | 0.004 | 0.3
Elderly and disability assistance from government | 0.000 | 0.0
Income from rent of house, land, and other properties | 0.009 | 0.9
Saving interest, bonds and stocks | 0.010 | 0.9
Interest of individual lending | 0.000 | 0.0
In-kind | 0.026 | 2.4
Other | 0.035 | 3.3
Total | 1.079 | 100.0

The industries with the highest income inequality are real estate, trade, manufacturing, agriculture, and construction. These industries constitute approximately 65% of the sampled households in the survey. Moreover, the within-group inequality is much higher than the between-group inequality for all three types of the GE index. This suggests that the income inequality due to different types of occupation may not be as severe as the way income is distributed within a particular line of work itself.

Figure 1 further illustrates this inequality. It can be seen that the agriculture sector constitutes the largest population and also exhibits the lowest relative household monthly income, which is approximately half of the average household income as a whole. Even worse, the income disparity within this subgroup is


Table 2. Income inequality (GE indices) and number of households by industrial subgroup

Rank | Industry | GE(0) | GE(1) | GE(2) | Households
1 | Real estate activities | 0.473 | 0.699 | 2.105 | 100
2 | Wholesale and retail trade; repair of motor vehicles and motorcycles | 0.317 | 0.446 | 1.746 | 6,158
3 | Manufacturing | 0.243 | 0.339 | 1.570 | 5,082
4 | Agriculture, forestry and fishing | 0.269 | 0.363 | 1.258 | 8,787
5 | Construction | 0.253 | 0.333 | 0.864 | 2,634
6 | Activities of households as employers | 0.324 | 0.417 | 0.828 | 264
7 | Transportation and storage | 0.277 | 0.354 | 0.765 | 945
8 | Activities of extraterritorial organizations and bodies | 0.524 | 0.522 | 0.630 | 4
9 | Administrative and support service activities | 0.249 | 0.295 | 0.507 | 389
10 | Water supply; sewerage, waste management and remediation activities | 0.247 | 0.289 | 0.463 | 103
11 | Professional, scientific and technical activities | 0.279 | 0.285 | 0.377 | 233
12 | Other service activities | 0.222 | 0.240 | 0.368 | 829
13 | Financial and insurance activities | 0.261 | 0.263 | 0.335 | 419
14 | Information and communication | 0.259 | 0.260 | 0.334 | 167
15 | Human health and social work activities | 0.232 | 0.237 | 0.327 | 900
16 | Electricity, gas, steam and air conditioning supply | 0.275 | 0.257 | 0.307 | 137
17 | Education | 0.217 | 0.217 | 0.304 | 1,750
18 | Accommodation and food service activities | 0.205 | 0.216 | 0.298 | 2,542
19 | Arts, entertainment and recreation | 0.216 | 0.217 | 0.282 | 251
20 | Public administration and defence; compulsory social security | 0.183 | 0.196 | 0.278 | 2,717
21 | Mining and quarrying | 0.154 | 0.159 | 0.191 | 85
 | Within-group inequality | 0.256 | 0.321 | 1.018 |
 | Between-group inequality | 0.056 | 0.058 | 0.061 |
 | Total inequality | 0.314 | 0.379 | 1.079 |
Note: The ranking is based on the values of GE(2) in descending order.

also high compared to the others. By contrast, households working in financial and insurance activities as well as information and communication are able to earn nearly twice the average household income, yet experience a more equal income distribution. This finding implies that certain household characteristics may play an important role in income disparity both between and within industries, and the next step is to investigate what the cause of such differences is.

Fig. 1. The GE(2) index and relative household monthly incomes working in each Thai

industry

We first start with an analysis of the potential underlying factors of income inequality as a whole. The results of the regression-based decomposition are shown in Table 3. It can be seen that total inequality mainly stems from financial assets (11%), access to credit for business (6%), education (4%), exposure to the internet (3%), and access to credit for house and land (1%). These findings are consistent with what we discussed in the previous section, where agriculture faces the lowest household income while those in financial activities earn much more. In general, this implies that industries involving skilled labour and a high utilisation of internet technology, financial assets, and credit tend to generate higher income than those that do not.

However, the income inequality within each industry may be driven by different factors. Thus, we turn our investigation to three important industrial subgroups: agriculture, manufacturing, and wholesale and retail trade, since they exhibit relatively high income inequality and involve more than half of the total population in this study.

Table 4 shows the results of household inequality decomposition of three

important subgroups. Starting with the agricultural sector, it can be seen that


Table 3. Regression-based decomposition of total household income inequality

Factor | GE(0) | GE(1) | GE(2) | Share to total inequality (%)
Female | 0.000 | 0.000 | 0.000 | 0.0
Age | 0.001 | 0.001 | 0.003 | 0.2
Years of schooling | 0.013 | 0.016 | 0.046 | 4.3
Proportion of earner | 0.000 | 0.001 | 0.002 | 0.2
Proportion of member accessing internet | 0.009 | 0.011 | 0.031 | 2.9
Size of owned land | 0.001 | 0.001 | 0.002 | 0.2
Size of rented land | 0.000 | 0.000 | 0.001 | 0.1
Financial assets | 0.035 | 0.042 | 0.119 | 10.9
Debt for house and land | 0.004 | 0.005 | 0.015 | 1.4
Debt for business activity | 0.019 | 0.024 | 0.067 | 6.2
Debt for agricultural activity | 0.001 | 0.001 | 0.002 | 0.2
Residual | 0.231 | 0.279 | 0.793 | 73.4
Total | 0.314 | 0.379 | 1.079 | 100.0
Note: The top five factors contributing to the inequality are in bold.

the important elements of income inequality in this group are slightly different from the big picture. Financial assets remain the largest source of income inequality (13%), followed by the size of land in possession (3%) and the ability to access loans for agricultural activity (4%). The effect of education, however, is rather low in this group, constituting only 0.3% of the within-group inequality. This implies that the level of education matters little for earnings in the agricultural sector. The reason may be that many farming families in Thailand still employ traditional ways of farming, passed down from one generation to the next. Without many financial assets or land at their disposal, households working rented land have to bear the rental costs, which significantly reduce their net income. Combined with the difficulty of accessing credit for agricultural activity, this denies them the opportunity to invest in better agricultural equipment and machinery, widening the income gap within this group even further.

In the case of the manufacturing sector, financial assets become an even more important source of income inequality (32%). By contrast, the inequality attributable to education or the other factors considered in this study appears insignificant. This implies that differences in household characteristics within this group are small, except for financial assets. To be able to run a manufacturing business, certain levels of these characteristics must be met; what remains is the household's wealth to initiate the business. Households with high wealth are able to invest more in their manufacturing activity than those without, and thus generate more income.

Thailand’s Household Income Inequality Revisited 231

Table 4. Share of within-group income inequality by household characteristic for three industrial subgroups (%)

Factor | Agriculture | Manufacturing | Wholesale and retail trade
Female | 0.0 | 0.3 | 0.0
Age | 0.0 | 0.1 | 0.3
Years of schooling | 0.3 | 0.9 | 0.9
Proportion of earner | 0.1 | 0.1 | 0.1
Proportion of member accessing internet | 0.7 | 1.1 | 0.7
Size of owned land | 2.7 | 0.0 | 0.3
Size of rental land | 0.7 | 0.0 | 0.0
Financial assets | 13.2 | 32.0 | 6.0
Debt for house and land | 0.1 | 0.8 | 0.5
Debt for business activity | 0.4 | 0.1 | 16.7
Debt for agricultural activity | 4.0 | 0.0 | 0.0
Residual | 77.8 | 64.5 | 74.5
Total | 100.0 | 100.0 | 100.0

For the trade subgroup, accessibility to credit is more important: it constitutes approximately 17% of the within-group inequality. Financial assets also remain an important source, although, unlike in manufacturing, they make up only about 6% of the within-group inequality. These findings suggest that households engaged in trade tend to rely on a stable cash flow to run their business. Households with better access to credit can adjust both the quantity and the type of products they offer as their market situation and consumer behaviour change. Similarly to agriculture and manufacturing, the contribution of education inequality to income inequality is not large, implying that the earners of households working in this subgroup achieve similar levels of education.

This study investigates the drivers of household income inequality in Thailand using decomposition approaches in three dimensions: sources of income, industrial subgroups, and household characteristics. The data are from the socio-economic survey conducted by the National Statistical Office in 2015. In terms of household income sources, the results showed that households doing business experience the largest income disparity, followed by employment and farming. The results further indicate that, while inequality in education remains an important factor affecting household income inequality as a whole, its contribution to the income inequality is less than that of financial assets and credit accessi-


bility. In addition, internet exposure is another key factor that should receive greater attention in order to mitigate income inequality.

Furthermore, we discover that the key contributors to income inequality are heterogeneous across the selected industrial subgroups. In all three sectors, wealth and credit accessibility have the highest contributions. However, while the inequality in households' financial assets contributes the most to income inequality in the manufacturing and agricultural sectors, the inequality in debt for business activity contributes more in the wholesale and retail trade sector. In addition to financial assets and debt, the inequality in land ownership is more important in the agricultural sector. These findings confirm that there is no one-size-fits-all policy. To effectively reduce income inequality, each policy must be carefully designed as well as prioritised for a specific industry.

Regarding the effects of education on income inequality, our analysis leads to a striking conclusion: education contributes much less than households' wealth and credit accessibility. This can be a concern, as mitigating unequal opportunity in education has been a key attempt to improve overall household income inequality, as many organizations have pointed out. Years of schooling, or traditional education, has a low contribution to income inequality, especially in the agricultural sector. It has been widely discussed in the literature that human and physical capital are at times complements in the production process. This could imply an incoherence between the content of the education provided and the accessibility of the other relevant resources needed to yield a productive outcome. For educated farmers to achieve a higher income, and thus create income differences, the sector requires a certain level of access to capital and technological transfer.

According to our findings, the policy implications are as follows. First of all, the distribution of wealth in many forms affects the distribution of income. Therefore, taxation on wealth, such as land, real estate, and financial assets, to redistribute wealth will play a central role in mitigating income inequality. However, great attention must be paid to the design of this measure, as it also has a tendency to hurt the low- and middle-income population in various respects. If the effects on the low- and middle-income population outweigh those on the rich, then the level of inequality can also increase. Secondly, internet accessibility contributes to income inequality no less than education does. Consequently, nation-wide internet access is necessary to reduce income inequality across regions. Having access to the internet not only allows a person to draw on a vast body of knowledge, which may help his or her current income-generating activities, but also opens up business and job opportunities in the future. Thirdly, pico- and nano-finance providers or peer-to-peer loan services should be developed; this will increase the opportunity for those who lack funds but are full of creativity and innovation to run their businesses.

Finally, it should be emphasized that, in order to successfully mitigate income inequality, it is essential for policies to be imposed coherently and complementarily to each other. Focusing on a single policy at a time would probably not lead to satisfactory development, as a lift in income is not simply the result of development in one driver but rather an accumulation of improvements in related drivers of income. In particular, for the policies to improve human and physical capital to be most effective, the policies must be provided coherently. As an example, consider the case of loan accessibility. It is difficult to ensure loan accessibility with decent interest rates for all when some parts of the population are not yet financially literate and equipped with enough knowledge to search for investment opportunities. The risks are high and the returns, in terms of both private and social benefits, will not match the risks, causing the relevant policies to be unsustainable.

An issue to be noted about the measurement of income inequality is that a large portion of the inequality comes from the top percentiles. Most inequality measurement relies on household surveys, in which the top percentiles are usually under-represented. For the case of Thailand, [26] used both the SES, which is a household survey, and tax return data to calculate the Gini coefficient in 2007 and 2009. The results show that the Gini coefficient decreased over the years when calculated using the household survey, but increased when calculated using the tax return data. However, the poor and the middle class are relatively well represented in household data and, thus, the inequality decomposition results of this research are applicable to the lower-income population. In addition, there might exist factors, such as the ability of household members, that affect both income and financial assets. Inadequate control for such factors can cause an upward bias in the estimate of the impact of financial assets.

Since this study employs only one time period, the dynamics of income inequality decomposition in Thailand remain to be examined. Comparing different decomposition structures over time would further clarify our understanding of income inequality development in Thailand. Another point to be considered is that this study could only explain approximately 25 to 30% of the income inequality. This suggests that the model should be enriched with more household characteristics in future research.

References

1. Alvaredo, F., Chancel, L., Piketty, T., Saez, E., Zucman, G.: Global inequality

dynamics: new ﬁndings from WID.world. Working Paper Series (23119) (2017)

2. Brewer, M., Wren-Lewis, L.: Accounting for changes in income inequality: decom-

position analyses for the UK 1978–2008. Oxford Bull. Econ. Stat. 78(3), 289–322

(2016)

3. Buttrick, N.R., Oishi, S.: The psychological consequences of income inequality. Soc.

Pers. Psychol. Compass 11(3), e12304 (2017)

4. Cingano, F.: Trends in income inequality and its impact on economic growth.

OECD Social, Employment and Migration Working Papers (163) (2014)

5. Cowell, F.A.: Generalized entropy and the measurement of distributional change.

Eur. Econ. Rev. 13(1), 147–159 (1980)

6. Cowell, F.A., Fiorio, C.V.: Inequality decompositions-a reconciliation. J. Econ.

Inequality 9(4), 509–528 (2011)

7. Dabla-Norris, E., Kochhar, K., Suphaphiphat, N., Ricka, F., Tsounta, E.: Causes

and consequences of income inequality: a global perspective. IMF Staﬀ Discussion

Notes 15/13, International Monetary Fund (2015)


8. Dodlova, M., Gioblas, A.: Regime type, inequality, and redistributive transfers in

developing countries. WIDER Working Paper 2017/30 (2017)

9. Elgar, F.J., Gariépy, G., Torsheim, T., Currie, C.: Early-life income inequality and

adolescent health and well-being. Soc. Sci. Med. 174, 197–208 (2017)

10. Fields, G.S.: Accounting for income inequality and its change: a new method, with

application to the distribution of earnings in the United States. Res. Labor Econ.

22, 1–38 (2003)

11. Glomm, G., Ravikumar, B.: Public versus private investment in human capital:

endogenous growth and income inequality. J. Polit. Econ. 100(4), 818–834 (1992)

12. Haughton, J.H., Khandker, S.R.: Handbook on Poverty and Inequality. World

Bank, Washington, DC (2009)

13. Kilenthong, W.: Finance and inequality in Thailand. Thammasat Econ. J. 34(3),

60–95 (2016)

14. Lakner, C., Milanovic, B.: Global income distribution: from the fall of the Berlin

wall to the great recession. World Bank Econ. Rev. 30(2), 203–232 (2016)

15. Lyubimov, I.: Income inequality revisited 60 years later: Piketty vs Kuznets. Russ.

J. Econ. 3(1), 42–53 (2017)

16. Meneejuk, P., Yamaka, W.: Analyzing the relationship between income inequal-

ity and economic growth: Does the Kuznets curve exist in Thailand? Bank

of Thailand Setthatat Paper (2016). https://www.bot.or.th/Thai/Segmentation/

Student/setthatat/DocLib Settha Paper 2559/M Doc Prize2 2559.pdf

17. Pawasutipaisit, A., Townsend, R.M.: Wealth accumulation and factors accounting

for success. J. Econ. 161(1), 56–81 (2011)

18. Paweenawat, S.W., McNown, R.: The determinants of income inequality in

Thailand: a synthetic cohort analysis. J. Asian Econ. 31–32, 10–21 (2014)

19. Piketty, T.: Capital in the Twenty-First Century. Harvard University Press,

Cambridge (2014)

20. Ravallion, M.: Income inequality in the developing world. Science 344(6186), 851–

855 (2014)

21. Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27(3),

379–423 (1948)

22. Shorrocks, A.F.: Inequality decomposition by factor components. Econometrica

50(1), 193–211 (1982)

23. Shorrocks, A.F.: Inequality decomposition by population subgroups. Econometrica

52(6), 1369–1385 (1984)

24. Stephen, P.J.: Analysis of income distributions. Stata Tech. Bull. 8(48) (1999)

25. Theil, H.: Economics and Information Theory. Studies in Mathematical and Man-

agerial Economics. North-Holland Pub. Co., Amsterdam (1967)

26. Vanitcharearnthum, V.: Top income shares and inequality: evidences from

Thailand. Kasetsart J. Soc. Sci. (2017, in press)

Simultaneous Conﬁdence Intervals for All

Diﬀerences of Variances of Log-Normal

Distributions

W. Thangjai and S. Niwitpong

King Mongkut's University of Technology North Bangkok, Bangkok, Thailand
wthangjai@yahoo.com, suparat.n@sci.kmutnb.ac.th

Abstract. In this paper, simultaneous confidence intervals for all differences of variances of log-normal distributions are proposed. Our

approaches are based on generalized conﬁdence interval (GCI) app-

roach and simulation-based approach. Simulation studies show that the

GCI approach has satisfactory performances for all cases. However, the

simulation-based approach is recommended for cases with equal standard deviations; otherwise, the GCI approach is recommended. Finally, a

numerical example is given to illustrate the advantages of the proposed

approaches.

Keywords: Log-normal distribution · GCI approach · Simulation-based approach

1 Introduction

The log-normal distribution has been widely used in medicine, biology, economics, and several other fields. This distribution is right-skewed and is used to describe positive data. For example, it is used in bioequivalence studies for comparing a test drug to a reference drug (see Hannig et al. [2]); in biological systems for studying the mass of cultures or the areas of plant leaves in early stages of growth, gene expression, and metabolite contents (see Schaarschmidt [10]); and in survival analysis for analysing the survival times of breast and ovarian cancer patients (see Royston [8]).

Variance is the most commonly used dispersion measure in statistics and many fields. The variance is measured in the squared units of the respective

variable. In several experimental treatments, multiple comparisons among these

treatments are common. Therefore, comparing the variance has an important

advantage. For two independent variances, conﬁdence interval estimation for the

diﬀerence between two variances has been proposed. For details, recent work

includes Herbert et al. [4], Cojbasica and Tomovica [1], Niwitpong [5], and

Niwitpong [6]. For k independent variances, constructing simultaneous conﬁ-

dence intervals for pairwise diﬀerences of variances are of interest. To our knowl-

edge, there is no previous work on constructing simultaneous conﬁdence intervals

for all diﬀerences of variances of log-normal distributions.



In this paper, the construction of simultaneous confidence intervals for all differences of variances of log-normal distributions is proposed with the GCI

approach and the simulation-based approach. The GCI approach uses general-

ized pivotal quantity (GPQ) for parameter. The concept of the GPQ is explained

by Weerahandi [15]. Thangjai et al. [12] proposed the simultaneous conﬁdence

intervals for all diﬀerences of means of normal distributions with unknown coef-

ﬁcients of variation based on the GCI approach and the method of variance esti-

mates recovery (MOVER) approach. Moreover, Thangjai et al. [13] presented the

simultaneous conﬁdence intervals for all diﬀerences of means of two-parameter

exponential distributions using the GCI approach, the MOVER approach, and

the parametric bootstrap approach. The simulation-based approach is proposed

by Pal et al. [7]. This approach uses the maximum likelihood estimates (MLEs)

for simulation and numerical computations. Therefore, the simulation-based app-

roach is applied to construct simultaneous conﬁdence intervals for all diﬀer-

ences of variances of log-normal distributions and then compare with the GCI

approach.

The rest of the article is organized as follows. In Sect. 2, the procedure for

constructing the simultaneous conﬁdence intervals for all diﬀerences of variances

of log-normal distributions is proposed. In Sect. 3, the performance of the two

approaches using simulation studies is presented. In Sect. 4, the data is used

to illustrate our approaches based on the GCI approach and simulation-based

approach. In Sect. 5, concluding remarks are presented.

For the one-sample case, let Y = (Y_1, Y_2, ..., Y_n) be independent log-normal random variables and let X = log(Y) be the corresponding independent random variables from N(μ, σ²). The normal mean and normal variance are μ and σ², respectively.

The log-normal mean and log-normal variance are equal to

\mu_Y = \exp\left(\mu + \frac{\sigma^2}{2}\right) \qquad (1)

and

\theta = \sigma_Y^2 = \exp\left(2\mu + \sigma^2\right)\cdot\left(\exp(\sigma^2) - 1\right). \qquad (2)

Following Shen [11], suppose that \bar{X} = \sum_{j=1}^{n} X_j / n and S_X^2 = \sum_{j=1}^{n} (X_j - \bar{X})^2 are jointly sufficient and complete for μ and σ², respectively. The maximum likelihood estimators of μ_Y and σ_Y² are given by

\tilde{\mu}_Y = \exp\left(\bar{X} + \frac{S_X^2}{2n}\right) \qquad (3)

and

\tilde{\theta} = \tilde{\sigma}_Y^2 = \exp\left(2\bar{X} + \frac{S_X^2}{n}\right)\cdot\left(\exp\left(\frac{S_X^2}{n}\right) - 1\right). \qquad (4)


The adjusted maximum likelihood estimators for μ_Y and σ_Y² are defined as

\hat{\mu}_Y = \exp(\bar{X})\cdot f\left(\frac{S_X^2}{2n}\right) \qquad (5)

and

\hat{\sigma}_Y^2 = \exp(2\bar{X})\cdot\left[ f\left(\frac{2S_X^2}{n}\right) - f\left(\frac{(n-2)S_X^2}{n(n-1)}\right) \right], \qquad (6)

where f(t) = 1 + t + \frac{n-1}{n+1}\cdot\frac{t^2}{2!} + \frac{(n-1)^2}{(n+1)(n+3)}\cdot\frac{t^3}{3!} + \cdots.

Note that \hat{\sigma}_Y^2 is an unbiased estimator of \sigma_Y^2. The asymptotic variance of \hat{\sigma}_Y^2 is

Var(\hat{\theta}) = Var(\hat{\sigma}_Y^2) \approx \frac{2\sigma^2}{n}\,\exp(4\mu + 2\sigma^2)\left[ 2\left(\exp(\sigma^2) - 1\right)^2 + \sigma^2\left(2\exp(\sigma^2) - 1\right)^2 \right]. \qquad (7)

For the k-sample case, let Y_i = (Y_{i1}, Y_{i2}, ..., Y_{in_i}) be a random sample from the i-th log-normal population. Therefore, the log-normal variance based on the i-th sample is

\theta_i = \exp(2\mu_i + \sigma_i^2)\cdot\left(\exp(\sigma_i^2) - 1\right). \qquad (8)

Moreover, the maximum likelihood estimator of θ_i is

\tilde{\theta}_i = \exp\left(2\bar{X}_i + \frac{S_{X_i}^2}{n_i}\right)\cdot\left(\exp\left(\frac{S_{X_i}^2}{n_i}\right) - 1\right). \qquad (9)

For i, l = 1, 2, ..., k and i ≠ l, the difference between the log-normal variances of the i-th and l-th populations is equal to

\theta_{il} = \theta_i - \theta_l = \exp(2\mu_i + \sigma_i^2)\cdot\left(\exp(\sigma_i^2) - 1\right) - \exp(2\mu_l + \sigma_l^2)\cdot\left(\exp(\sigma_l^2) - 1\right), \qquad (10)

and its estimator is

\tilde{\theta}_{il} = \tilde{\theta}_i - \tilde{\theta}_l = \exp\left(2\bar{X}_i + \frac{S_{X_i}^2}{n_i}\right)\cdot\left(\exp\left(\frac{S_{X_i}^2}{n_i}\right) - 1\right) - \exp\left(2\bar{X}_l + \frac{S_{X_l}^2}{n_l}\right)\cdot\left(\exp\left(\frac{S_{X_l}^2}{n_l}\right) - 1\right). \qquad (11)


Definition 2.1. Let X = (X1 , X2 , . . . , Xn ) be a random variable from a dis-

tribution F (x|δ), where x is an observed value of X, δ = (θ, ν) is a vector of

unknown parameters, θ is a parameter of interest, and ν is a vector of nui-

sance parameters. Weerahandi [15] defines a random quantity R(X; x, δ), which is called a generalized pivotal quantity (GPQ), if the following two conditions are satisfied:

(i) For a fixed x, the distribution of R(X; x, δ) is free of all unknown parameters.

(ii) The value of R(X; x, δ) at X = x is the parameter of interest.

In line with this general definition, the 100(1 − α)% two-sided gen-

eralized confidence interval of parameter of interest is defined by

(R(α/2), R(1 − α/2)), where R(α/2) and R(1 − α/2) denote the (α/2)-th

and the (1 − α/2)-th quantiles of R(X; x, δ), respectively.

Let \bar{X}_i and S_i^2 = \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2 / (n_i - 1) be the sample mean and sample variance, respectively. It is noted that \bar{X}_i and S_i^2 are mutually independent, with

\bar{X}_i \sim N\left(\mu_i, \frac{\sigma_i^2}{n_i}\right), \qquad V_i = \frac{(n_i - 1) S_i^2}{\sigma_i^2} \sim \chi^2_{n_i - 1}. \qquad (12)

According to Tian and Wu [14], the generalized pivotal quantities of μ_i and σ_i² based on the i-th sample are given by

R_{\mu_i} = \bar{x}_i - \frac{Z_i}{\sqrt{U_i}}\sqrt{\frac{(n_i - 1) s_i^2}{n_i}} \qquad (13)

and

R_{\sigma_i^2} = \frac{(n_i - 1) s_i^2}{V_i}, \qquad (14)

where Z_i denotes a standard normal random variable and U_i and V_i denote chi-square random variables with n_i − 1 degrees of freedom.

The generalized pivotal quantity of θ_i is

R_{\theta_i} = \exp(2 R_{\mu_i} + R_{\sigma_i^2})\cdot\left(\exp(R_{\sigma_i^2}) - 1\right), \qquad (15)

and the generalized pivotal quantity of θ_il = θ_i − θ_l is

R_{\theta_{il}} = R_{\theta_i} - R_{\theta_l}. \qquad (16)

The 100(1 − α)% two-sided simultaneous confidence intervals for all differences θ_il = θ_i − θ_l, i, l = 1, 2, ..., k, i ≠ l, based on the GCI approach are defined by

SCI_{GCI} = \left[R_{\theta_{il}}(\alpha/2),\; R_{\theta_{il}}(1 - \alpha/2)\right], \qquad (17)

where Rθil (α/2) and Rθil (1 − α/2) denote the (α/2)-th and (1 − α/2)-th quan-

tiles of Rθil , respectively.

The simultaneous conﬁdence intervals based on GCI approach can be con-

structed using Monte Carlo procedure as follows:

Algorithm 1

Step 1 Calculate the values of x̄i and s2i .

Step 2 Generate the values of Zi from standard normal distribution.

Step 3 Generate the values of Ui from chi-square distribution with ni −1 degrees

of freedom.

Step 4 Compute the values of Rμi by using formula (13).

Step 5 Generate the values of Vi from chi-square distribution with ni −1 degrees

of freedom.

Step 6 Compute the values of Rσi2 by using formula (14).

Step 7 Compute the values of Rθi by using formula (15).

Step 8 Compute the values of Rθil by using formula (16).

Step 9 Repeat Step 2 – Step 8 m = 1000 times.

Step 10 Compute the (α/2)-th and (1 − α/2)-th quantiles of Rθil .
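As a rough illustration of Algorithm 1, the sketch below implements the GPQ sampling in Python; the function name, the simulated example data, and the use of NumPy are our own choices rather than anything prescribed by the paper.

```python
import numpy as np

def gci_simultaneous_cis(samples_log, alpha=0.05, m=1000, seed=0):
    """GPQ-based intervals for all pairwise differences of log-normal variances.

    `samples_log` is a list of arrays with the log-transformed data
    (X = log Y) for each group, following Algorithm 1.
    """
    rng = np.random.default_rng(seed)
    k = len(samples_log)
    stats = [(x.mean(), x.var(ddof=1), x.size) for x in samples_log]

    # Steps 2-7: m draws of the GPQ R_theta_i for every group.
    r_theta = np.empty((k, m))
    for i, (xbar, s2, n) in enumerate(stats):
        z = rng.standard_normal(m)
        u = rng.chisquare(n - 1, m)
        v = rng.chisquare(n - 1, m)
        r_mu = xbar - z / np.sqrt(u) * np.sqrt((n - 1) * s2 / n)   # formula (13)
        r_sig2 = (n - 1) * s2 / v                                   # formula (14)
        r_theta[i] = np.exp(2 * r_mu + r_sig2) * (np.exp(r_sig2) - 1)

    # Steps 8-10: quantiles of R_theta_i - R_theta_l for each pair.
    cis = {}
    for i in range(k):
        for l in range(i + 1, k):
            diff = r_theta[i] - r_theta[l]
            cis[(i, l)] = (np.quantile(diff, alpha / 2),
                           np.quantile(diff, 1 - alpha / 2))
    return cis

# Example with three simulated log-normal groups.
rng = np.random.default_rng(42)
groups = [np.log(rng.lognormal(1.0, s, 30)) for s in (0.05, 0.10, 0.15)]
for pair, (lo, hi) in gci_simultaneous_cis(groups).items():
    print(f"theta{pair[0] + 1} - theta{pair[1] + 1}: [{lo:.4f}, {hi:.4f}]")
```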

Again, let X = log(Y) be a random variable from a normal distribution with mean μ and variance σ². Let \bar{X} = \sum_{j=1}^{n} X_j / n and S_X^2 = \sum_{j=1}^{n} (X_j - \bar{X})^2 be jointly sufficient and complete for the mean μ and the variance σ², respectively. Let θ = exp(2μ + σ²)·(exp(σ²) − 1) be the log-normal variance. Therefore, the restricted maximum likelihood estimators of μ, σ², and θ are obtained by

\tilde{\mu}_{RML} = \bar{X}, \qquad (18)

\tilde{\sigma}^2_{RML} = S_X^2, \qquad (19)

and

\tilde{\theta}_{RML} = \exp\left(2\bar{X} + \frac{S_X^2}{n}\right)\cdot\left(\exp\left(\frac{S_X^2}{n}\right) - 1\right). \qquad (20)

For i = 1, 2, ..., k, the restricted maximum likelihood estimators of μ_i, σ_i², and θ_i are obtained by

\tilde{\mu}_{i(RML)} = \bar{X}_i, \qquad (21)

\tilde{\sigma}^2_{i(RML)} = S_{X_i}^2, \qquad (22)

and

\tilde{\theta}_{i(RML)} = \exp\left(2\bar{X}_i + \frac{S_{X_i}^2}{n_i}\right)\cdot\left(\exp\left(\frac{S_{X_i}^2}{n_i}\right) - 1\right). \qquad (23)


A simulated sample is then generated from a normal distribution with the mean \tilde{\mu}_{i(RML)} in formula (21) and the variance \tilde{\sigma}^2_{i(RML)} in formula (22). Let \bar{X}_{i(RML)} and S^2_{i(RML)} be the sample mean and sample variance of the i-th simulated sample, and let S^2_{X_{i(RML)}} = \sum_{j=1}^{n_i} (X_{ij(RML)} - \bar{X}_{i(RML)})^2.

For i, l = 1, 2, ..., k and i ≠ l, the difference of the variance estimators based on the simulated samples is defined by

\tilde{\theta}_{il(RML)} = \exp\left(2\bar{X}_{i(RML)} + \frac{S^2_{X_{i(RML)}}}{n_i}\right)\cdot\left(\exp\left(\frac{S^2_{X_{i(RML)}}}{n_i}\right) - 1\right) - \exp\left(2\bar{X}_{l(RML)} + \frac{S^2_{X_{l(RML)}}}{n_l}\right)\cdot\left(\exp\left(\frac{S^2_{X_{l(RML)}}}{n_l}\right) - 1\right). \qquad (24)

The 100(1 − α)% two-sided simultaneous confidence intervals for all differences θ_il = θ_i − θ_l, i, l = 1, 2, ..., k, i ≠ l, based on the simulation-based approach are defined by

SCI_{SB} = \left[\tilde{\theta}_{il(RML),(\alpha/2)},\; \tilde{\theta}_{il(RML),(1-\alpha/2)}\right],

where θ̃il(RM L),(α/2) and θ̃il(RM L),(1−α/2) denote the (α/2)-th and the (1 − α/2)-

th quantiles of θ̃il(RM L) , respectively.

The simultaneous conﬁdence intervals based on simulation-based approach

can be constructed using Monte Carlo procedure as follows:

Algorithm 2

Step 1 Obtain the MLE of the parameters as μ̃i , σ̃i2 , and θ̃il .

Step 2 Calculate the value of μ̃i(RM L) by using formula (21), calculate the value

of σ̃i(RM

2

L) by using formula (22), and calculate the value of θ̃i(RM L) by using

formula (23).

Step 3 Generate a simulated sample X_{i(RML)} = (X_{i1(RML)}, X_{i2(RML)}, ..., X_{in_i(RML)}) from N(\tilde{\mu}_{i(RML)}, \tilde{\sigma}^2_{i(RML)}) m = 1000 times and recalculate \tilde{\theta}_{il(RML),1}, \tilde{\theta}_{il(RML),2}, ..., \tilde{\theta}_{il(RML),m}.

Step 4 Deﬁne ordered values θ̃il(RM L),(1) ≤ θ̃il(RM L),(2) ≤ . . . ≤ θ̃il(RM L),(m) .

Step 5 Compute the (α/2)-th and (1 − α/2)-th quantiles of θ̃il(RM L) .
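A corresponding sketch of the simulation-based approach (Algorithm 2) is given below. Note that we take the restricted MLE of the normal variance to be the usual MLE, the sum of squared deviations divided by n_i, when resampling; this is an assumption on our part, and all names and data are illustrative.

```python
import numpy as np

def sb_simultaneous_cis(samples_log, alpha=0.05, m=1000, seed=0):
    """Simulation-based intervals (in the spirit of Algorithm 2) for all
    pairwise differences of log-normal variances.

    Assumption: the variance used for resampling is the MLE
    sum((x - xbar)^2) / n of the normal variance.
    """
    rng = np.random.default_rng(seed)
    k = len(samples_log)
    stats = [(x.mean(), x.var(ddof=0), x.size) for x in samples_log]

    # Step 3: regenerate each group m times from N(mu_hat, sigma2_hat) and
    # recompute the plug-in log-normal variance estimate (formula 23).
    theta_sim = np.empty((k, m))
    for i, (mu_hat, sig2_hat, n) in enumerate(stats):
        sim = rng.normal(mu_hat, np.sqrt(sig2_hat), size=(m, n))
        ssq = ((sim - sim.mean(axis=1, keepdims=True)) ** 2).sum(axis=1) / n
        theta_sim[i] = np.exp(2 * sim.mean(axis=1) + ssq) * (np.exp(ssq) - 1)

    # Steps 4-5: order the pairwise differences and read off the quantiles.
    cis = {}
    for i in range(k):
        for l in range(i + 1, k):
            diff = theta_sim[i] - theta_sim[l]
            cis[(i, l)] = (np.quantile(diff, alpha / 2),
                           np.quantile(diff, 1 - alpha / 2))
    return cis

rng = np.random.default_rng(7)
groups = [np.log(rng.lognormal(1.0, s, 50)) for s in (0.05, 0.05, 0.05)]
for pair, (lo, hi) in sb_simultaneous_cis(groups).items():
    print(f"theta{pair[0] + 1} - theta{pair[1] + 1}: [{lo:.5f}, {hi:.5f}]")
```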

3 Simulation Studies

In this section, simulation studies are used to evaluate the performance of the proposed approaches. A simulation study based on 5000 simulation runs has been done to compare the coverage probabilities, average lengths, and standard errors of the two approaches: the GCI approach and the simulation-based approach.

The coverage probabilities of two simultaneous conﬁdence intervals can be

obtained using Monte Carlo procedure as follows:


Algorithm 3

Step 1 Generate Xi from normal distribution with mean μi and variance σi2 .

Step 2 Calculate x̄i and si (the observed values of X̄i and Si ).

Step 3 Construct the simultaneous conﬁdence intervals based on the GCI app-

roach from Algorithm 1 and record whether or not all the values of θil = θi − θl ,

i, l = 1, 2, . . . , k, i ≠ l, are in their corresponding SCI_GCI.

Step 4 Construct the simultaneous conﬁdence intervals based on the simulation-

based approach from Algorithm 2 and record whether or not all the values of

θ_il = θ_i − θ_l, i, l = 1, 2, . . . , k, i ≠ l, are in their corresponding SCI_SB.

Step 5 Repeat Step 1 – Step 4 M = 5000 times.

Step 6 Compute the coverage probability from the fraction of times that all

θ_il = θ_i − θ_l, i, l = 1, 2, . . . , k, i ≠ l, are in their corresponding simultaneous

conﬁdence intervals.
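Algorithm 3 can be sketched in the same style. The repetition counts below are much smaller than the paper's 5000 runs so that the example finishes quickly, the GPQ construction is inlined from the earlier sketch, and all names are our own.

```python
import numpy as np

def estimate_coverage(n_sizes, sigmas, mu=1.0, alpha=0.05, n_rep=500, m=500, seed=0):
    """Monte Carlo coverage of the GPQ-based simultaneous intervals
    (Algorithm 3), with reduced repetition counts for illustration."""
    rng = np.random.default_rng(seed)
    k = len(n_sizes)
    true_theta = [np.exp(2 * mu + s ** 2) * (np.exp(s ** 2) - 1) for s in sigmas]
    hits = 0
    for _ in range(n_rep):
        data = [rng.normal(mu, s, n) for n, s in zip(n_sizes, sigmas)]
        # GPQ draws of R_theta_i for each group (Steps 2-7 of Algorithm 1).
        r_theta = []
        for x in data:
            n, xbar, s2 = x.size, x.mean(), x.var(ddof=1)
            z = rng.standard_normal(m)
            u = rng.chisquare(n - 1, m)
            v = rng.chisquare(n - 1, m)
            r_mu = xbar - z / np.sqrt(u) * np.sqrt((n - 1) * s2 / n)
            r_s2 = (n - 1) * s2 / v
            r_theta.append(np.exp(2 * r_mu + r_s2) * (np.exp(r_s2) - 1))
        # Check whether every true difference lies in its interval.
        covered = True
        for i in range(k):
            for l in range(i + 1, k):
                d = r_theta[i] - r_theta[l]
                lo, hi = np.quantile(d, [alpha / 2, 1 - alpha / 2])
                if not (lo <= true_theta[i] - true_theta[l] <= hi):
                    covered = False
        hits += covered
    return hits / n_rep

print(estimate_coverage([20, 20, 20], [0.05, 0.10, 0.15]))
```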

Table 1. The coverage probabilities (CP), average lengths (AL) and standard errors

(s.e.) of 95% two-sided simultaneous conﬁdence intervals for all diﬀerences of variances

of log-normal distributions: 3 sample cases.

(n1, n2, n3) | (σ1, σ2, σ3) | GCI: CP | GCI: AL (s.e.) | Simulation-based: CP | Simulation-based: AL (s.e.)
(20,20,20) | (0.05,0.05,0.05) | 0.9486 | 0.0447 (0.0038) | 0.9482 | 0.0314 (0.0026)
 | (0.05,0.10,0.15) | 0.9495 | 0.2539 (0.0656) | 0.9009 | 0.1793 (0.0455)
 | (0.15,0.15,0.15) | 0.9531 | 0.4518 (0.0428) | 0.9523 | 0.3104 (0.0285)
(30,30,30) | (0.05,0.05,0.05) | 0.9526 | 0.0328 (0.0023) | 0.9526 | 0.0260 (0.0018)
 | (0.05,0.10,0.15) | 0.9521 | 0.1856 (0.0468) | 0.9133 | 0.1481 (0.0369)
 | (0.15,0.15,0.15) | 0.9468 | 0.3284 (0.0251) | 0.9466 | 0.2567 (0.0193)
(50,50,50) | (0.05,0.05,0.05) | 0.9499 | 0.0234 (0.0013) | 0.9507 | 0.0204 (0.0011)
 | (0.05,0.10,0.15) | 0.9501 | 0.1340 (0.0332) | 0.9299 | 0.1174 (0.0289)
 | (0.15,0.15,0.15) | 0.9524 | 0.2315 (0.0138) | 0.9531 | 0.1999 (0.0116)
(100,100,100) | (0.05,0.05,0.05) | 0.9506 | 0.0156 (0.0006) | 0.9519 | 0.0145 (0.0006)
 | (0.05,0.10,0.15) | 0.9535 | 0.0892 (0.0217) | 0.9411 | 0.0836 (0.0203)
 | (0.15,0.15,0.15) | 0.9503 | 0.1524 (0.0065) | 0.9502 | 0.1418 (0.0060)
(200,200,200) | (0.05,0.05,0.05) | 0.9483 | 0.0107 (0.0003) | 0.9479 | 0.0103 (0.0003)
 | (0.05,0.10,0.15) | 0.9459 | 0.0616 (0.0151) | 0.9403 | 0.0597 (0.0146)
 | (0.15,0.15,0.15) | 0.9529 | 0.1042 (0.0034) | 0.9519 | 0.1005 (0.0032)

The simulation study was performed with the following values of the factors:

(1) sample cases: k = 3 and k = 5; (2) population means: μ1 = μ2 = . . . =

μk = 1; (3) population standard deviations: σ1 , σ2 , . . . , σk ; (4) sample sizes:

n1 , n2 , . . . , nk ; (5) signiﬁcance level: α = 0.05. The speciﬁc combinations are

given in the following two tables.

Tables 1 and 2 report the coverage probabilities, average lengths and standard

errors of the simultaneous conﬁdence intervals for k = 3 and 5 sample cases,


respectively. From Tables 1 and 2, the coverage probabilities of the GCI approach

are close to the nominal conﬁdence level 0.95 for all cases. For ni ≤ 100, the

coverage probabilities of the simulation-based approach are close to the nominal

conﬁdence level 0.95 when the standard deviations are same values, whereas the

coverage probabilities underestimate the nominal conﬁdence level 0.95 when the

standard deviations are diﬀerent values. For ni > 100, the coverage probabilities

of the simulation-based approach are close to the nominal conﬁdence level 0.95.

Comparing the average lengths and standard errors, it is seen that the average

lengths and standard errors of the simulation-based approach are smaller than

those of the GCI approach for all cases.

4 An Empirical Application

Consider the data taken from Hand et al. [3], Schaarschmidt [10], and Sadooghi-Alvandi and Malekzadeh [9]. The data set consists of 57 observations of nitrogen

bound bovine serum albumin in k = 3 groups of mice. The data were categorized

into three groups depending on the type of mice: normal mice (Group 1), alloxan-

induced diabetic mice (Group 2), and alloxan-induced diabetic mice treated with

insulin (Group 3). The summary statistics are as follows: n1 = 20, n2 = 18,

n3 = 19, x̄1 = 4.859, x̄2 = 4.867, x̄3 = 4.397, s21 = 0.927, s22 = 0.850, s23 = 0.696,

s2X1 = 17.613, s2X2 = 14.45, s2X3 = 12.528, θ1 = 56612.670, θ2 = 46406.680, and

θ3 = 11904.000.

Table 2. The coverage probabilities (CP), average lengths (AL) and standard errors

(s.e.) of 95% two-sided simultaneous conﬁdence intervals for all diﬀerences of variances

of log-normal distributions: 5 sample cases.

(n1, ..., n5) | (σ1, ..., σ5) | GCI: CP | GCI: AL (s.e.) | Simulation-based: CP | Simulation-based: AL (s.e.)
(20,20,20,20,20) | (0.05,0.05,0.05,0.05,0.05) | 0.9494 | 0.0447 (0.0026) | 0.9495 | 0.0314 (0.0018)
 | (0.05,0.05,0.10,0.15,0.15) | 0.9503 | 0.2617 (0.0434) | 0.9052 | 0.1842 (0.0300)
 | (0.15,0.15,0.15,0.15,0.15) | 0.9508 | 0.4528 (0.0290) | 0.9490 | 0.3108 (0.0193)
(30,30,30,30,30) | (0.05,0.05,0.05,0.05,0.05) | 0.9494 | 0.0328 (0.0015) | 0.9488 | 0.0260 (0.0012)
 | (0.05,0.05,0.10,0.15,0.15) | 0.9510 | 0.1922 (0.0305) | 0.9230 | 0.1530 (0.0240)
 | (0.15,0.15,0.15,0.15,0.15) | 0.9518 | 0.3273 (0.0168) | 0.9513 | 0.2559 (0.0129)
(50,50,50,50,50) | (0.05,0.05,0.05,0.05,0.05) | 0.9518 | 0.0234 (0.0009) | 0.9512 | 0.0204 (0.0007)
 | (0.05,0.05,0.10,0.15,0.15) | 0.9516 | 0.1370 (0.0208) | 0.9334 | 0.1196 (0.0181)
 | (0.15,0.15,0.15,0.15,0.15) | 0.9489 | 0.2316 (0.0093) | 0.9489 | 0.2001 (0.0079)
(100,100,100,100,100) | (0.05,0.05,0.05,0.05,0.05) | 0.9497 | 0.0156 (0.0004) | 0.9500 | 0.0145 (0.0004)
 | (0.05,0.05,0.10,0.15,0.15) | 0.9480 | 0.0918 (0.0136) | 0.9394 | 0.0859 (0.0127)
 | (0.15,0.15,0.15,0.15,0.15) | 0.9507 | 0.1526 (0.0044) | 0.9508 | 0.1419 (0.0040)
(200,200,200,200,200) | (0.05,0.05,0.05,0.05,0.05) | 0.9480 | 0.0107 (0.0002) | 0.9489 | 0.0103 (0.0002)
 | (0.05,0.05,0.10,0.15,0.15) | 0.9478 | 0.0632 (0.0092) | 0.9435 | 0.0611 (0.0089)
 | (0.15,0.15,0.15,0.15,0.15) | 0.9513 | 0.1044 (0.0022) | 0.9512 | 0.1007 (0.0021)


The 95% two-sided simultaneous confidence intervals for all pairwise differences of variances are given in Table 3. It is clear from the table that the simulation-based approach gives shorter intervals than the GCI approach.

Table 3. The 95% two-sided simultaneous conﬁdence intervals for all pairwise diﬀer-

ences of variances of log-normal distributions.

Difference | GCI: Lower | GCI: Upper | Simulation-based: Lower | Simulation-based: Upper
Group 1/Group 2 | –764974.00 | 663809.50 | –241820.00 | 192428.80
Group 1/Group 3 | –907064.70 | 55546.79 | –272166.90 | 16233.17
Group 2/Group 3 | –755857.60 | 74883.13 | –231398.70 | 15659.84

5 Concluding Remarks

In this paper, the GCI approach and the simulation-based approach are introduced to construct simultaneous confidence intervals for all differences of variances of log-normal distributions. The coverage probabilities, average lengths, and standard errors are considered. Simulation studies showed that the coverage probabilities of the GCI approach are close to the nominal confidence level 0.95 for all cases. The coverage probabilities of the simulation-based approach are very close to the nominal confidence level 0.95 for all cases with the same standard deviations; for different standard deviations, the simulation-based approach has coverage probabilities below the nominal confidence level 0.95. The average lengths of the simulation-based approach are slightly shorter than those of the GCI approach for all cases with the same standard deviations. Hence, the simulation-based approach is recommended for equal standard deviations; otherwise, the GCI approach is recommended.

Acknowledgements. This research was funded by King Mongkut's University of Technology North Bangkok. Contract no. KMUTNB-61-DRIVE-006.

References

1. Cojbasica, V., Tomovica, A.: Nonparametric conﬁdence intervals for population

variance of one sample and the diﬀerence of variances of two samples. Comput.

Stat. Data Anal. 51, 5562–5578 (2007)

2. Hannig, J., Lidong, E., Abdel-Karim, A., Iyer, H.: Simultaneous ﬁducial generalized

conﬁdence intervals for ratios of means of lognormal distributions. Austrian J. Stat.

35, 261–269 (2006)

3. Hand, D.J., Daly, F., McConway, K., Lunn, D., Ostrowski, E.: A Handbook of

Small Data Sets. Chapman and Hall/CRC, London (1994)


4. Herbert, R.D., Hayen, A., Macaskill, P., Walter, S.D.: Interval estimation for the

diﬀerence of two independent variances. Commun. Stat. Simul. Comput. 40, 744–

758 (2011)

5. Niwitpong, S.: Conﬁdence intervals for the diﬀerence of two normal population

variances. World Acad. Sci. Eng. Technol. 80, 602–605 (2011)

6. Niwitpong, S.: A note on coverage probability of conﬁdence interval for the diﬀer-

ence between two normal variances. Appl. Math. Sci. 6, 3313–3320 (2012)

7. Pal, N., Lim, W.K., Ling, C.H.: A computational approach to statistical inferences.

J. Appl. Probab. Stat. 2, 13–35 (2007)

8. Royston, P.: The lognormal distribution as a model for survival time in cancer,

with an emphasis on prognostic factors. Statistica Neerlandica 55, 89–104 (2001)

9. Sadooghi-Alvandi, S.M., Malekzadeh, A.: Simultaneous conﬁdence intervals for

ratios of means of several lognormal distributions: a parametric bootstrap app-

roach. Comput. Stat. Data Anal. 69, 133–140 (2014)

10. Schaarschmidt, F.: Simultaneous conﬁdence intervals for multiple comparisons

among expected values of log-normal variables. Comput. Stat. Data Anal. 58,

265–275 (2013)

11. Shen, W.H.: Estimation of parameters of a lognormal distribution. Taiwan. J.

Math. 2, 243–250 (1998)

12. Thangjai, W., Niwitpong, S., Niwitpong, S.: Simultaneous conﬁdence intervals for

all diﬀerences of means of normal distributions with unknown coeﬃcients of vari-

ation. In: Studies in Computational Intelligence, vol. 753, pp. 670–682 (2018)

13. Thangjai, W., Niwitpong, S., Niwitpong, S.: Simultaneous conﬁdence intervals for

all diﬀerences of means of two-parameter exponential distributions. In: Studies in

Computational Intelligence, vol. 760, pp. 298–308 (2018)

14. Tian, L., Wu, J.: Inferences on the common mean of several log-normal populations:

the generalized variable approach. Biom. J. 49, 944–951 (2007)

15. Weerahandi, S.: Generalized conﬁdence intervals. J. Am. Stat. Assoc. 88, 899–905

(1993)

Conﬁdence Intervals for the Inverse Mean

and Diﬀerence of Inverse Means

of Normal Distributions with Unknown

Coeﬃcients of Variation

King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand

wthangjai@yahoo.com, {sa-aat.n,suparat.n}@sci.kmutnb.ac.th

Abstract. This paper proposes confidence intervals for a single inverse mean and the difference of two inverse means

of normal distributions with unknown coeﬃcients of variation (CVs).

The conﬁdence intervals for the inverse mean with unknown coeﬃcient

of variation (CV) were constructed based on the generalized conﬁdence

interval (GCI) approach and the large sample approach. The generalized

conﬁdence interval and large sample conﬁdence interval for the inverse

mean with unknown CV were compared with the generalized conﬁdence

interval for the inverse mean of Niwitpong and Wongkhao [5]. Moreover,

the conﬁdence intervals for the diﬀerence of inverse means with unknown

CVs were constructed using the GCI approach, large sample approach

and method of variance estimates recovery (MOVER) approach and then

compared with existing conﬁdence interval for the diﬀerence of inverse

means based on the GCI approach of Niwitpong and Wongkhao [6]. The

coverage probability and average length of the conﬁdence intervals were

evaluated by a Monte Carlo simulation. Carrying out the simulation stud-

ies, the results showed that the generalized conﬁdence interval provides

the best conﬁdence interval for the inverse mean with unknown CV. The

generalized conﬁdence interval and the MOVER conﬁdence interval for

the diﬀerence of inverse means with unknown CVs perform well in terms

of the coverage probability and average length. Finally, two real datasets

are analyzed to illustrate the proposed conﬁdence intervals.

Monte Carlo simulation

1 Introduction

In statistics, a normal distribution is the most important probability distribution.

The mean and variance of the normal population are denoted μ and σ 2 , respec-

tively. The sample mean x̄ is the uniformly minimum variance unbiased (UMVU)

estimator of the normal population mean μ. Searls [9] introduced the minimum



mean squared error (MMSE) estimator for the normal population mean with

known CV, where the CV is deﬁned as σ/μ. However, the CV needs to be esti-

mated in practice. Therefore, Srivastava [10] proposed the UMVU estimator to

estimate the normal population mean with unknown CV. For more details about

the mean of normal distribution with unknown CV, see the research papers of

Sahai [7], Sahai and Acharya [8], Sodanin et al. [11], and Thangjai et al. [13].

The inverse mean is the reciprocal of mean, 1/μ. It has been used in experi-

mental nuclear physics, econometrics, and biological sciences. Several researchers

have been studied conﬁdence interval estimation for the inverse mean of normal

distribution. For example, Niwitpong and Wongkhao [5] constructed the new

conﬁdence intervals for the inverse mean. Niwitpong and Wongkhao [6] pro-

posed the new conﬁdence intervals for the diﬀerence of inverse means. Thangjai

et al. [12] investigated the performance of the conﬁdence intervals for the com-

mon inverse mean based on the GCI approach and the large sample approach.

Thangjai et al. [14] extended the research work of Thangjai et al. [12]. Thangjai

et al. [14] proposed the adjusted MOVER approach to construct the conﬁdence

interval for the common inverse mean.

Suppose X = (X1 , X2 , . . . , Xn ) is a random sample of size n from all possible

distributions. Let L(X) and U (X) be the lower limit and the upper limit of the

conﬁdence interval for the mean corresponding to a given nominal conﬁdence

level 1 − α. By deﬁnition, this means that if X = (X1 , X2 , . . . , Xn ) is an inde-

pendent and identically distributed (i.i.d.) sample from the actual distribution,

then the actual mean M will be between L(X) and U (X). It can be written as

P(L(X) ≤ M ≤ U(X)) = 1 − α. Taking reciprocals of all three quantities (provided the limits are positive), it follows that 1/U(X) ≤ 1/M ≤ 1/L(X). This means that if a random sample

is taken, then 1/M will be between 1/U (X) and 1/L(X) with nominal conﬁdence

level 1 − α. That is P (1/U (X) ≤ 1/M ≤ 1/L(X)) = 1 − α. In other words, if

(L(X), U (X)) is the conﬁdence interval for the mean, then (1/U (X), 1/L(X)) is

automatically a conﬁdence interval for the inverse mean.
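A minimal sketch of this endpoint-inversion argument, using an ordinary t-interval for the mean as the starting interval (our choice for illustration; the positivity check is needed for the inversion to be valid):

```python
import numpy as np
from scipy import stats

def t_interval(x, conf=0.95):
    """Ordinary t confidence interval (L(X), U(X)) for the mean."""
    n = len(x)
    xbar = np.mean(x)
    se = np.std(x, ddof=1) / np.sqrt(n)
    t = stats.t.ppf(0.5 + conf / 2, df=n - 1)
    return xbar - t * se, xbar + t * se

def inverse_mean_interval(x, conf=0.95):
    """CI for 1/mu obtained by inverting the endpoints, assuming L(X) > 0."""
    low, up = t_interval(x, conf)
    if low <= 0:
        raise ValueError("endpoint inversion needs a strictly positive lower limit")
    return 1.0 / up, 1.0 / low

rng = np.random.default_rng(3)
x = rng.normal(10.0, 2.0, 30)
print("CI for mu   :", t_interval(x))
print("CI for 1/mu :", inverse_mean_interval(x))
```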

To our knowledge, no paper exists on the inverse mean and the difference of inverse means of normal distributions with unknown CVs. Therefore, this paper fills this gap by developing novel approaches, extending the works of Thangjai et al. [12], Thangjai et al. [13], and Thangjai et al. [14], to construct confidence intervals for the single inverse mean and the difference of inverse means

of normal distributions with unknown CVs. The conﬁdence intervals for the

single inverse mean with unknown CV were constructed based on the general-

ized conﬁdence interval and the large sample conﬁdence interval and compared

with the generalized conﬁdence interval for the inverse mean of Niwitpong and

Wongkhao [5]. Moreover, the conﬁdence intervals for the diﬀerence of inverse

means with unknown CVs were proposed using the generalized conﬁdence inter-

val, large sample conﬁdence interval, and method of variance estimates recovery

(MOVER) conﬁdence interval. Three conﬁdence intervals were compared with

the generalized conﬁdence interval for the diﬀerence of inverse means of Niwit-

pong and Wongkhao [6].

CIs for Inverse Mean and Diﬀerence of Inverse Means 247

This paper is organized as follows. In Sect. 2, the conﬁdence intervals for the

single inverse mean with unknown CV are described. In Sect. 3, the conﬁdence

intervals for the diﬀerence of inverse means with unknown CVs are provided. In

Sect. 4, simulation results are presented to evaluate the coverage probabilities and

average lengths of the proposed approaches. Section 5 illustrates the proposed

approaches using two examples. Finally, Sect. 6 summarizes this paper.

Distribution with Unknown Coeﬃcient of Variation

mean μ and variance σ 2 . The CV is deﬁned as the standard deviation divided

by the mean, τ = σ/μ. Let X̄ and S 2 be sample mean and sample variance for

X, respectively. Hence, the estimator of the CV is τ̂ = σ̂μ̂ = X̄

S

. Also, let x̄ and

2 2

s be the observed sample of X̄ and S , respectively.

Searls [9] proposed the following minimum mean squared error (MMSE) esti-

mator for the mean of normal population with variance

μ nμ

η= = . (1)

1 + (σ 2 /nμ2 ) n + (σ 2 /μ2 )

introduced by Srivastava [10]. The estimator of Srivastava [10] has the following

form

X̄ nX̄

η̂ = = . (2)

2

1 + S /nX̄ 2 n + S 2 /X̄ 2

From Eqs. (1) and (2), the inverse mean of normal population with unknown

CV θ and the estimator of θ are

1 1 + σ 2 /nμ2 n + σ 2 /μ2

θ= = = (3)

η μ nμ

and

1 1 + S 2 /nX̄ 2 n + S 2 /X̄ 2

θ̂ = = = . (4)

η̂ X̄ nX̄

Theorem 2.1. Let X = (X1 , X2 , . . . , Xn ) be a random sample from N (μ, σ 2 ).

Let X̄ and S 2 be a sample mean and a sample variance of X, respectively. Let

θ be the inverse mean of normal population with unknown CV and let θ̂ be an

estimator of θ. The mean and variance of θ̂ are

σ2 2σ 6 + 4nμ2 σ 4

E(θ̂) = 1 + + ·θ (5)

nμ2 + σ 2 (nμ2 + σ 2 )3

248 W. Thangjai et al.

and

2

1 1 σ2 2σ 4 + 4nμ2 σ 2

V ar(θ̂) = + · · 1 +

μ μ nμ2 + σ 2 (nμ2 + σ 2 )2

⎛ 2 ⎞

nσ 2 2 2σ 4 + 4nμ2 σ 2

⎜ ·

σ2 ⎟

+

⎜ nμ2 + σ 2 n (nμ2 + σ 2 )2 ⎟

· ⎜ 2 + 2⎟

. (6)

⎝ nσ 2 4

2σ + 4nμ σ 2 2 nμ ⎠

n+ · 1+

nμ2 + σ 2 (nμ2 + σ 2 )2

X̄ ∼ N (μ, σ 2 /n). Then the mean and variance of nX̄ are

According to Thangjai et al. [13], the mean and variance of X̄ 2 have the

following form

nμ2 + σ 2 2σ 4 + 4nμ2 σ 2

E X̄ 2 = and V ar X̄ 2 = .

n n2

The mean and the variance of S 2 /X̄ 2 are

2

S nσ 2 2σ 4 + 4nμ2 σ 2

E = · 1+ 2

X̄ 2 nμ2 + σ 2 (nμ2 + σ 2 )

and 2

S2 nσ 2 2 2σ 4 + 4nμ2 σ 2

V ar = · + 2 .

X̄ 2 nμ2 + σ 2 n (nμ2 + σ 2 )

Therefore, the mean and variance of n + (S 2 /X̄ 2 ) are deﬁned by

S2 nσ 2 2σ 4 + 4nμ2 σ 2

E n+ 2 =n+ · 1+ 2

X̄ nμ2 + σ 2 (nμ2 + σ 2 )

and

2

S2 S2 nσ 2 2 2σ 4 + 4nμ2 σ 2

V ar n + 2 = V ar = · + 2 .

X̄ X̄ 2 nμ2 + σ 2 n (nμ2 + σ 2 )

CIs for Inverse Mean and Diﬀerence of Inverse Means 249

n + (S 2 /X̄ 2 )

E θ̂ = E

nX̄

E n + S 2 /X̄ 2 V ar nX̄

= · 1+ 2

E nX̄ E nX̄

σ2 2σ 4 + 4nμ2 σ 2 n2 μ2 + nσ 2

= 1+ · 1 + 2 ·

nμ2 + σ 2 (nμ2 + σ 2 ) n2 μ3

σ2 2σ 6 + 4nμ2 σ 4 nμ2 n + σ 2 /μ2

= 1+ + 3 ·

nμ2 + σ 2 (nμ2 + σ 2 ) n2 μ3

2 2

σ2 2σ 6 + 4nμ2 σ 4 n + σ /μ

= 1+ 2 2

+ 3 ·

nμ + σ (nμ2 + σ 2 ) nμ

σ2 2σ 6 + 4nμ2 σ 4

= 1+ + 3 ·θ

nμ2 + σ 2 (nμ2 + σ 2 )

and

n + (S 2 /X̄ 2 )

V ar θ̂ = V ar

nX̄

2 2 2

E n + S /X̄ V ar n + S 2 /X̄ 2 V ar nX̄

= · 2 + 2

E nX̄ E n + S 2 /X̄ 2 E nX̄

⎛ ⎞2

nσ 2 2σ 4 + 4nμ2 σ 2

⎜ n + nμ2 + σ 2 · 1 + ⎟

⎜ (nμ2 + σ 2 )

2

⎟

=⎜⎜ ⎟

nμ ⎟

⎝ ⎠

⎛ 2
⎞

nσ 2 2 2σ 4 + 4nμ2 σ 2

⎜ · + ⎟

⎜ nμ2 + σ 2 n (nμ2 + σ 2 )

2

nσ 2 ⎟

⎜ ⎟

· ⎜
2 + ⎟

⎜ n2 μ2 ⎟

⎝ nσ 2 2σ 4 + 4nμ2 σ 2 ⎠

n+ · 1+ 2

nμ2 + σ 2 (nμ2 + σ 2 )

2

1 1 σ2 2σ 4 + 4nμ2 σ 2

= + · · 1+ 2

μ μ nμ2 + σ 2 (nμ2 + σ 2 )

⎛ 2
⎞

nσ 2 2 2σ 4 + 4nμ2 σ 2

⎜ · + ⎟

⎜ nμ2 + σ 2 n (nμ2 + σ 2 )

2

σ2 ⎟

⎜ ⎟

· ⎜
2 + ⎟.

⎜ nμ2 ⎟

⎝ nσ 2 2σ 4 + 4nμ2 σ 2 ⎠

n+ · 1+ 2

nμ2 + σ 2 (nμ2 + σ 2 )

Hence, Theorem 2.1 is proved.

250 W. Thangjai et al.

of Normal Distribution with Unknown Coeﬃcient of Variation

Deﬁnition 2.1. Let X = (X1 , X2 , . . . , Xn ) be a random sample from a distri-

bution F (x|γ) which depends on a vector of parameters γ = (θ, ν) where θ is

parameter of interest and ν is possibly a vector of nuisance parameters. Weer-

ahandi [15] deﬁnes a generalized pivotal quantity R(X, x, θ, ν) for conﬁdence

interval estimation, where x is an observed value of X, as a random variable

having the following two properties:

(i) R(X, x, θ, ν) has a probability distribution that is free of all unknown param-

eters.

(ii) The observed value of R(X, x, θ, ν), X = x, is the parameter of interest.

Let R(α) be the 100(α) -th percentile of R(X, x, θ, ν). Along these lines,

(R(α/2), R(1 − α/2)) becomes a 100(1 − α)% two-sided generalized confi-

dence interval for the parameter of interest.

Recall that

(n − 1) S 2

= V ∼ χ2n−1 , (7)

σ2

where V is chi-squared distribution with n − 1 degrees of freedom. Therefore,

the generalized pivotal quantity for σ 2 is

(n − 1) s2

Rσ 2 = . (8)

V

The mean has the following form

Z (n − 1) s2

μ ≈ x̄ − √ , (9)

U n

where Z and U denote standard normal distribution and chi-square distribution

with n − 1 degrees of freedom, respectively. Therefore, the generalized pivotal

quantity for μ is

Z (n − 1) s2

Rμ = x̄ − √ . (10)

U n

The generalized pivotal quantity for θ is

n + Rσ2 /Rμ2

Rθ = , (11)

nRμ

Therefore, the 100(1−α)% two-sided conﬁdence interval for the inverse mean

of normal distribution with unknown CV based on the GCI approach is

CIs for Inverse Mean and Diﬀerence of Inverse Means 251

Normal Distribution with Unknown Coeﬃcient of Variation

θ̂ − E(θ̂) θ̂ − θ

Z= = . (13)

V ar(θ̂) V ar(θ̂)

Therefore, the 100(1−α)% two-sided conﬁdence interval for the inverse mean

of normal distribution with unknown CV based on the large sample approach is

CILS.θ = θ̂ − z1−α/2 V ar(θ̂), θ̂ + z1−α/2 V ar(θ̂) , (14)

where θ̂ is deﬁned in Eq. (4), V ar(θ̂) is deﬁned in Eq. (6) with μ and σ 2 replaced

by x̄ and s2 , respectively, and z1−α/2 denotes the (1 − α/2)-th quantile of the

standard normal distribution.

of Normal Distributions with Unknown Coeﬃcients

of Variation

2 2

mean μX and variance σX . Let X̄ and SX be sample mean and sample variance

2 2

for X, respectively. Also, let x̄ and sX be the observed sample of X̄ and SX ,

respectively. Furthermore, let Y = (Y1 , Y2 , . . . , Ym ) be a random sample from a

normal distribution with mean μY and variance σY2 . Let Ȳ and SY2 be sample

mean and sample variance for Y , respectively. Also, let ȳ and s2Y be the observed

sample of Ȳ and SY2 , respectively. Also, X and Y are independent.

Let δ = θX −θY be the diﬀerence of inverse means with unknown CVs, where

θX and θY are the inverse means with unknown CVs of X and Y , respectively.

The estimator of δ is

2

n + SX /X̄ 2 m + SY2 /Ȳ 2

δ̂ = θ̂X − θ̂Y = − , (15)

nX̄ mȲ

2

dom samples from N (μX , σX ) and N (μY , σY2 ), respectively. Let X and Y be

2

independent. Suppose that X̄ and SX are a sample mean and a sample vari-

ance for X, respectively. Also, suppose that Ȳ and SY2 are a sample mean and

a sample variance for Y , respectively. Let θX and θY be the inverse means with

252 W. Thangjai et al.

θY and let δ̂ be an estimator of δ. The mean and variance of δ̂ are

2 6

σX 2σX + 4nμ2X σX

4

E(δ̂) = 1+ + · θ X

nμ2X + σX2 (nμ2X + σX

2 )3

σY2 2σY6 + 4mμ2Y σY4

− 1+ + · θ Y (16)

mμ2Y + σY2 (mμ2Y + σY2 )3

and

2

4

2

1 1 σX 2σX + 4nμ2X σX 2

V ar(δ̂) = + · · 1 +

μX μX nμ2X + σX 2 (nμ2X + σX 2 )2

⎛ 2 ⎞

2 4

nσX 2 2σX + 4nμ2X σX2

⎜ ·

σ2 ⎟

+

⎜ nμ2X + σX 2 n (nμ2X + σX 2 )2

⎟

· ⎜ + X2 ⎟

⎝ nσX2 4

2σ + 4nμX σX 2 2 2 nμ X⎠

n+ 2 2 · 1+ X 2 2 2

nμX + σX (nμX + σX )

2

2

1 1 σY 2σY4 + 4mμ2Y σY2

+ + · · 1 +

μY μY mμ2Y + σY2 (mμ2Y + σY2 )2

⎛ 2 ⎞

mσY2 2 2σY4 + 4mμ2Y σY2

⎜ ·

σY2 ⎟

+

⎜ mμ2Y + σY2 m (mμ2Y + σY2 )2 ⎟

· ⎜ 2 + ⎟(17)

.

⎝ mσY2 2σY4 + 4mμ2Y σY2 mμ2Y ⎠

m+ · 1+

mμ2Y + σY2 (mμ2Y + σY2 )2

Let δ̂ be an estimator of δ which is deﬁned by

2

n + (SX /X̄ 2 ) m + (SY2 /Ȳ 2 )

δ̂ = − .

nX̄ mȲ

2

n + (SX /X̄ 2 ) m + (SY2 /Ȳ 2 )

E δ̂ = E −

nX̄ mȲ

2

n + (SX /X̄ 2 ) m + (SY2 /Ȳ 2 )

=E −E

nX̄ mȲ

2 6

σX 2σX + 4nμ2X σX

4

= 1+ 2 + (nμ2 + σ 2 )3 · θX

nμ2X + σX X X

σY2 2σY6 + 4mμ2Y σY4

− 1+ + · θ Y

mμ2Y + σY2 (mμ2Y + σY2 )3

CIs for Inverse Mean and Diﬀerence of Inverse Means 253

and

2

n + (SX /X̄ 2 ) m + (SY2 /Ȳ 2 )

V ar δ̂ = V ar −

nX̄ mȲ

2 2

n + (SX /X̄ ) m + (SY2 /Ȳ 2 )

= V ar + V ar

nX̄ mȲ

2

4

2

1 1 σX 2σX + 4nμ2X σX 2

= + · · 1+

μX μX nμ2X + σX 2 (nμ2X + σX 2 )2

⎛ 2 ⎞

2 4

nσX 2 2σX + 4nμ2X σX2

⎜ · + 2 ⎟

⎜ nμ2X + σX 2 n (nμ2X + σX 2 )2

σX ⎟

· ⎜ + ⎟

⎝ nσX2

2σX 4

+ 4nμ2X σX2 2 nμ2X ⎠

n+ · 1+

nμ2X + σX 2 (nμ2X + σX 2 )2

2

1 1 σY2 2σY4 + 4mμ2Y σY2

+ + · · 1+

μY μY mμ2Y + σY2 (mμ2Y + σY2 )2

⎛ 2 ⎞

mσY2 2 2σY4 + 4mμ2Y σY2

⎜ ·

σY2 ⎟

+

⎜ mμ2Y + σY2 m (mμ2Y + σY2 )2 ⎟

· ⎜ + 2 ⎟ .

⎝ mσY 2 4

2σY + 4mμY σY 2 2 2 mμ Y ⎠

m+ · 1+

mμ2Y + σY2 (mμ2Y + σY2 )2

Hence, Theorem 3.1 is proved.

Means of Normal Distributions with Unknown Coeﬃcients

of Variation

Deﬁne

(n − 1) SX

2

(m − 1) SY2

2 = VX ∼ χ2n−1 and = VY ∼ χ2m−1 , (18)

σX σY2

where VX and VY are chi-squared distributions with n − 1 and m − 1 degrees of

2

freedom. Therefore, the generalized pivotal quantities for σX and σY2 are

(n − 1) s2X (m − 1) s2Y

R σX

2 = and RσY2 = . (19)

VX VY

The means are given by

ZX (n − 1) s2X ZY (m − 1) s2Y

μX ≈ x̄ − √ and μY ≈ ȳ − √ , (20)

UX n UY m

where ZX and ZY denote standard normal distributions and UX and UY denote

chi-square distributions with n − 1 and m − 1 degrees of freedom, respectively.

Therefore, the generalized pivotal quantities for μX and μY are

ZX (n − 1) s2X ZY (m − 1) s2Y

RμX = x̄ − √ and RμY = ȳ − √ . (21)

UX n UY m

254 W. Thangjai et al.

2

n + R σX 2 /R

μX m + RσY2 /Rμ2 Y

R δ = R θX − R θY = − . (22)

nRμX mRμY

Therefore, the 100(1 − α)% two-sided conﬁdence interval for the diﬀerence

of inverse means of normal distributions with unknown CVs based on the GCI

approach is

CIGCI.δ = (Rδ (α/2) , Rδ (1 − α/2)) , (23)

where Rδ (α) denote the 100(α)-th percentile of Rδ .

Means of Normal Distributions with Unknown Coeﬃcients of

Variation

δ̂ − E(δ̂) δ̂ − δ

Z= = . (24)

V ar(δ̂) V ar(δ̂)

Therefore, the 100(1 − α)% two-sided conﬁdence interval for the diﬀerence

of inverse means of normal distributions with unknown CVs based on the large

sample approach is

CILS.δ = δ̂ − z1−α/2 V ar(δ̂), δ̂ + z1−α/2 V ar(δ̂) , (25)

where z1−α/2 denotes the (1−α/2)-th quantile of the standard normal distribution.

for the Diﬀerence of Inverse Means of Normal Distributions

with Unknown Coeﬃcients of Variation

According to Niwitpong and Wongkhao [5], the conﬁdence intervals for the

inverse means of X and Y are

√ √

n n

(lX , uX ) = √ , √ (26)

dX SX + nX̄ −dX SX + nX̄

and √ √

m m

(lY , uY ) = √ , √ , (27)

dY SY + mȲ −dY SY + mȲ

where dX and dY are an upper (1 − α/2)-th quantiles of the t-distributions with

n − 1 and m − 1 degrees of freedom, respectively.

CIs for Inverse Mean and Diﬀerence of Inverse Means 255

Donner and Zou [2] introduced the conﬁdence interval estimation for the

diﬀerence of parameters of interest using by the MOVER approach. Let Lδ and

Uδ be the lower limit and upper limit of the conﬁdence interval of the diﬀerence

of two parameters, respectively. The lower limit and upper limit are given by

Lδ = θ̂X − θ̂Y − (θ̂X − lX )2 + (uY − θ̂Y )2 (28)

and

Uδ = θ̂X − θ̂Y + (uX − θ̂X )2 + (θ̂Y − lY )2 , (29)

where θ̂X and θ̂Y are deﬁned in Eq. (15), lX and uX are deﬁned in Eq. (26), and

lY and uY are deﬁned in Eq. (27).

Therefore, the 100(1 − α)% two-sided conﬁdence interval for the diﬀerence of

inverse means of normal distributions with unknown CVs based on the MOVER

approach is

CIM OV ER.δ = (Lδ , Uδ ) , (30)

where Lδ and Uδ are deﬁned in Eqs. (28) and (29), respectively.

4 Simulation Studies

The proposed conﬁdence intervals in Sects. 2 and 3 were compared the perfor-

mance of the conﬁdence intervals in term of coverage probability and average

length. In this section, there are two simulation studies. First, the proposed conﬁ-

dence intervals in Sect. 2 were compared with the generalized conﬁdence interval

for the inverse mean of normal distribution which introduced by Niwitpong and

Wongkhao [5]. Second, the proposed conﬁdence intervals in Sect. 3 were com-

pared with the generalized conﬁdence interval for the diﬀerence of inverse means

of normal distributions which presented by Niwitpong and Wongkhao [6]. The

nominal conﬁdence level was set at 1 − α = 0.95. The conﬁdence interval was

chosen when the values of the coverage probability greater than or close to the

nominal conﬁdence level and also having the shortest average length.

Firstly, the performances of conﬁdence intervals for the inverse mean of nor-

mal distribution with unknown CV were compared. The generalized conﬁdence

interval for the inverse mean with unknown CV was deﬁned as CIGCI.θ , the

large sample conﬁdence interval for the inverse mean with unknown CV was

deﬁned as CILS.θ , and the generalized conﬁdence interval for the inverse mean

of Niwitpong and Wongkhao [5] was deﬁned as CIN W . The data were generated

from a normal distribution with the population mean μ = 1, the population

standard deviation σ = 0.01, 0.03, 0.05, 0.07, 0.09, 0.10, 0.30, 0.50 and 0.70, and

the sample size n = 20, 30, 50, 100 and 200. Table 1 shown the coverage proba-

bilities and average lengths of 95% two-sided conﬁdence intervals for θ and 1/μ.

The results indicated that the large sample conﬁdence intervals CILS.θ have the

coverage probabilities under nominal conﬁdence level of 0.95 when the sample

size is small and have the coverage probabilities close to 0.95 when the sample

size is large. The coverage probabilities of the generalized conﬁdence intervals

256 W. Thangjai et al.

intervals of Niwitpong and Wongkhao [5] CIN W . The average lengths of CIGCI.θ

are a bit shorter than the average lengths of CIN W for σ ≤ 0.30, whereas the

average lengths of CIN W are shorter than the average lengths of CIGCI.θ for

σ > 0.30.

Secondly, the coverage probabilities and average lengths of the proposed con-

ﬁdence intervals for the diﬀerence of inverse means with unknown CVs were

obtained and hence compared with the generalized conﬁdence interval for the

diﬀerence of inverse means of Niwitpong and Wongkhao [6]. The generalized

conﬁdence interval for the diﬀerence of inverse means with unknown CVs was

deﬁned as CIGCI.δ , the large sample conﬁdence interval for the diﬀerence of

inverse means with unknown CVs was deﬁned as CILS.δ , the MOVER con-

ﬁdence interval for the diﬀerence of inverse means with unknown CVs was

deﬁned as CIM OV ER.δ , and the generalized conﬁdence interval for the diﬀer-

ence of inverse means of Niwitpong and Wongkhao [6] was deﬁned as CID.N W .

Two data were generated from X ∼ N (μX , σX 2

) and Y ∼ N (μY , σY2 ), where the

population means μX = μY = 1, the population standard deviations σX = 0.10,

σY = 0.01, 0.03, 0.05, 0.07, 0.09, 0.10, 0.30, 0.50 and 0.70, and the sample sizes

(n, m) = (20, 20), (30, 30), (20, 30), (50, 50), (100, 100), (50, 100) and (200,

200). Tables 2 and 3 showed the coverage probabilities and the average lengths

of 95% two-sided conﬁdence intervals for the diﬀerence of inverse means with

unknown CVs, respectively. The results indicated that the coverage probabili-

ties of the CIGCI.δ , CIM OV ER.δ and CID.N W close to nominal conﬁdence level

of 0.95, whereas CILS.δ provides the coverage probabilities under 0.95. In almost

all cases, the CIGCI.δ and CIM OV ER.δ yield the average lengths shorter than

the CID.N W . Therefore, the CIGCI.δ and CIM OV ER.δ perform well in terms of

the coverage probability and average length for the conﬁdence intervals for the

diﬀerence of inverse means with unknown CVs.

5 An Empirical Application

In this section, some real data are used to illustrate the proposed conﬁdence

intervals.

Example 1. Consider the data taken from Niwitpong [4] and Thangjai et al. [13].

This data shows the number of defects in 100,000 lines of code in a particular

type of software program. The observations are as follows 48, 54, 50, 38, 39,

48, 48, 38, 42, 52, 42, 36, 52, 55, 40, 40, 40, 43, 43, 40, 48, 46, 48, 48, 52,

48, 50, 48, 52, 52, 46, and 45. Niwitpong [4] presented that the data is ﬁtted

by normal distribution. The summary statistics are as follows n = 32, x̄ =

45.9688, s2 = 27.7732, and 1/x̄ = 0.0218. The proposed conﬁdence intervals

given in Sect. 2 were used to compute the 95% two-sided conﬁdence intervals

for the inverse mean with unknown CV. The generalized conﬁdence interval is

CIGCI.θ = (0.0209, 0.0227) with the interval length of 0.0018. The large sample

conﬁdence interval is CILS.θ = (–0.8687, 0.9122) with the interval length of

1.7809. In comparison, the generalized conﬁdence interval for the inverse mean

CIs for Inverse Mean and Diﬀerence of Inverse Means 257

Table 1. The coverage probabilities (CP) and average lengths (AL) of 95% two-sided

conﬁdence intervals for the inverse mean of normal distribution with unknown CV.

CP AL CP AL CP AL

20 0.01 0.9530 0.0092 0.9404 0.0086 0.9580 0.0095

0.03 0.9534 0.0278 0.9408 0.0260 0.9590 0.0285

0.05 0.9494 0.0467 0.9368 0.0437 0.9546 0.0479

0.07 0.9468 0.0648 0.9322 0.0605 0.9494 0.0664

0.09 0.9528 0.0837 0.9384 0.0781 0.9568 0.0857

0.10 0.9484 0.0930 0.9350 0.0868 0.9558 0.0953

0.30 0.9468 0.2931 0.9326 0.2653 0.9520 0.2957

0.50 0.9504 0.5417 0.9334 0.4584 0.9548 0.5266

0.70 0.9474 0.9668 0.9264 0.7041 0.9532 0.8444

30 0.01 0.9508 0.0074 0.9378 0.0071 0.9534 0.0075

0.03 0.9464 0.0222 0.9360 0.0213 0.9494 0.0226

0.05 0.9490 0.0370 0.9388 0.0354 0.9520 0.0376

0.07 0.9498 0.0519 0.9406 0.0497 0.9532 0.0528

0.09 0.9512 0.0671 0.9394 0.0641 0.9534 0.0681

0.10 0.9468 0.0742 0.9356 0.0710 0.9494 0.0755

0.30 0.9490 0.2288 0.9404 0.2149 0.9520 0.2304

0.50 0.9498 0.4074 0.9402 0.3683 0.9532 0.4019

0.70 0.9536 0.6422 0.9438 0.5462 0.9576 0.6059

50 0.01 0.9484 0.0056 0.9398 0.0055 0.9504 0.0057

0.03 0.9528 0.0170 0.9488 0.0165 0.9548 0.0171

0.05 0.9504 0.0282 0.9446 0.0275 0.9530 0.0285

0.07 0.9452 0.0397 0.9404 0.0386 0.9490 0.0400

0.09 0.9520 0.0511 0.9464 0.0498 0.9546 0.0516

0.10 0.9462 0.0567 0.9400 0.0552 0.9488 0.0572

0.30 0.9514 0.1729 0.9438 0.1665 0.9554 0.1735

0.50 0.9546 0.2994 0.9494 0.2827 0.9580 0.2972

0.70 0.9454 0.4449 0.9428 0.4086 0.9474 0.4328

100 0.01 0.9508 0.0040 0.9482 0.0039 0.9512 0.0040

0.03 0.9530 0.0119 0.9508 0.0117 0.9532 0.0119

0.05 0.9494 0.0198 0.9472 0.0195 0.9520 0.0199

0.07 0.9574 0.0277 0.9530 0.0274 0.9574 0.0278

0.09 0.9520 0.0357 0.9502 0.0352 0.9534 0.0358

0.10 0.9522 0.0396 0.9510 0.0391 0.9534 0.0397

0.30 0.9504 0.1200 0.9482 0.1178 0.9512 0.1202

0.50 0.9566 0.2033 0.9552 0.1979 0.9560 0.2028

0.70 0.9490 0.2920 0.9462 0.2810 0.9506 0.2888

200 0.01 0.9498 0.0028 0.9488 0.0028 0.9502 0.0028

0.03 0.9502 0.0084 0.9492 0.0083 0.9516 0.0084

0.05 0.9530 0.0139 0.9510 0.0138 0.9520 0.0140

0.07 0.9534 0.0195 0.9528 0.0194 0.9544 0.0196

0.09 0.9510 0.0251 0.9488 0.0249 0.9496 0.0251

0.10 0.9512 0.0279 0.9504 0.0277 0.9514 0.0280

0.30 0.9460 0.0839 0.9460 0.0832 0.9472 0.0840

0.50 0.9452 0.1410 0.9450 0.1391 0.9458 0.1408

0.70 0.9504 0.1998 0.9506 0.1961 0.9514 0.1988

258 W. Thangjai et al.

Table 2. The coverage probabilities of 95% two-sided conﬁdence intervals for the

diﬀerence of inverse means of normal distributions with unknown CVs.

20 20 0.10 0.01 0.9548 0.9416 0.9534 0.9600

0.03 0.9512 0.9394 0.9524 0.9562

0.05 0.9552 0.9402 0.9558 0.9622

0.07 0.9564 0.9448 0.9586 0.9624

0.09 0.9590 0.9444 0.9592 0.9640

0.10 0.9516 0.9402 0.9520 0.9562

0.30 0.9490 0.9328 0.9476 0.9548

0.50 0.9486 0.9306 0.9488 0.9548

0.70 0.9446 0.9310 0.9436 0.9484

30 30 0.10 0.01 0.9478 0.9382 0.9492 0.9532

0.03 0.9542 0.9476 0.9558 0.9584

0.05 0.9574 0.9472 0.9568 0.9594

0.07 0.9508 0.9424 0.9518 0.9566

0.09 0.9550 0.9470 0.9536 0.9574

0.10 0.9536 0.9454 0.9536 0.9566

0.30 0.9520 0.9414 0.9510 0.9542

0.50 0.9438 0.9384 0.9454 0.9494

0.70 0.9504 0.9380 0.9508 0.9552

20 30 0.10 0.01 0.9510 0.9374 0.9502 0.9556

0.03 0.9510 0.9384 0.9510 0.9544

0.05 0.9498 0.9372 0.9490 0.9534

0.07 0.9548 0.9424 0.9546 0.9578

0.09 0.9498 0.9404 0.9512 0.9560

0.10 0.9558 0.9464 0.9564 0.9602

0.30 0.9588 0.9484 0.9576 0.9616

0.50 0.9530 0.9414 0.9524 0.9558

0.70 0.9484 0.9362 0.9480 0.9540

50 50 0.10 0.01 0.9500 0.9440 0.9516 0.9540

0.03 0.9496 0.9424 0.9482 0.9516

0.05 0.9518 0.9478 0.9522 0.9536

0.07 0.9516 0.9470 0.9526 0.9540

0.09 0.9506 0.9472 0.9498 0.9522

0.10 0.9556 0.9500 0.9554 0.9568

0.30 0.9504 0.9440 0.9506 0.9516

0.50 0.9488 0.9384 0.9474 0.9522

0.70 0.9500 0.9434 0.9464 0.9512

(continued)

CIs for Inverse Mean and Diﬀerence of Inverse Means 259

Table 2. (continued)

100 100 0.10 0.01 0.9528 0.9510 0.9538 0.9550

0.03 0.9518 0.9490 0.9516 0.9526

0.05 0.9492 0.9458 0.9494 0.9506

0.07 0.9574 0.9544 0.9562 0.9564

0.09 0.9580 0.9558 0.9588 0.9570

0.10 0.9482 0.9490 0.9502 0.9514

0.30 0.9544 0.9520 0.9556 0.9564

0.50 0.9472 0.9464 0.9492 0.9494

0.70 0.9470 0.9450 0.9484 0.9472

50 100 0.10 0.01 0.9450 0.9402 0.9454 0.9462

0.03 0.9496 0.9446 0.9490 0.9512

0.05 0.9502 0.9452 0.9506 0.9522

0.07 0.9522 0.9480 0.9512 0.9534

0.09 0.9488 0.9426 0.9486 0.9508

0.10 0.9470 0.9428 0.9470 0.9506

0.30 0.9468 0.9412 0.9458 0.9468

0.50 0.9482 0.9438 0.9482 0.9510

0.70 0.9518 0.9502 0.9534 0.9534

200 200 0.10 0.01 0.9478 0.9470 0.9490 0.9480

0.03 0.9440 0.9442 0.9446 0.9454

0.05 0.9480 0.9476 0.9484 0.9510

0.07 0.9532 0.9506 0.9520 0.9536

0.09 0.9472 0.9472 0.9484 0.9486

0.10 0.9490 0.9480 0.9494 0.9502

0.30 0.9522 0.9536 0.9540 0.9542

0.50 0.9452 0.9460 0.9464 0.9458

0.70 0.9464 0.9472 0.9480 0.9472

of Niwitpong and Wongkhao [5] is CIN W = (0.0208, 0.0227) with the interval

length of 0.0019. It is seen that CIGCI.θ performs better than CILS.θ and CIN W

in the sense that the length of CIGCI.θ is shorter than CILS.θ and CIN W .

Example 2. The data is previously considered by Lee and Lin [3] and Thangjai

et al. [13]. The data shows the carboxyhemoglobin levels for nonsmokers and

cigarette smokers. The data are ﬁtted by normal distributions. For nonsmok-

ers, the summary statistics are as follows n = 121, x̄ = 1.3000, s2X = 1.7040,

and 1/x̄ = 0.7692. For cigarette smokers, the summary statistics are as follows

m = 75, ȳ = 4.1000, s2Y = 4.0540, and 1/ȳ = 0.2439. The diﬀerence between

1/x̄ = and 1/ȳ = is 0.5253. The 95% two-sided conﬁdence intervals for the

260 W. Thangjai et al.

Table 3. The average lengths of 95% two-sided conﬁdence intervals for the diﬀerence

of inverse means of normal distributions with unknown CVs.

20 20 0.10 0.01 0.0937 0.0874 0.0935 0.0960

0.03 0.0971 0.0907 0.0970 0.0995

0.05 0.1039 0.0972 0.1040 0.1064

0.07 0.1140 0.1067 0.1142 0.1168

0.09 0.1253 0.1173 0.1255 0.1283

0.10 0.1317 0.1232 0.1319 0.1349

0.30 0.3074 0.2790 0.3024 0.3105

0.50 0.5500 0.4667 0.5195 0.5354

0.70 1.0989 0.7194 0.8592 0.8704

30 30 0.10 0.01 0.0747 0.0714 0.0745 0.0759

0.03 0.0777 0.0743 0.0776 0.0789

0.05 0.0833 0.0798 0.0833 0.0846

0.07 0.0910 0.0872 0.0911 0.0924

0.09 0.1002 0.0959 0.1002 0.1017

0.10 0.1053 0.1009 0.1054 0.1070

0.30 0.2416 0.2272 0.2392 0.2434

0.50 0.4169 0.3772 0.4036 0.4115

0.70 0.6480 0.5513 0.5994 0.6126

20 30 0.10 0.01 0.0933 0.0871 0.0931 0.0955

0.03 0.0957 0.0895 0.0956 0.0981

0.05 0.1001 0.0939 0.1001 0.1023

0.07 0.1068 0.1005 0.1069 0.1092

0.09 0.1147 0.1082 0.1149 0.1172

0.10 0.1197 0.1130 0.1198 0.1221

0.30 0.2485 0.2332 0.2464 0.2508

0.50 0.4187 0.3789 0.4058 0.4137

0.70 0.6494 0.5524 0.6010 0.6141

50 50 0.10 0.01 0.0569 0.0555 0.0569 0.0575

0.03 0.0592 0.0577 0.0592 0.0597

0.05 0.0634 0.0618 0.0634 0.0640

0.07 0.0692 0.0675 0.0692 0.0699

0.09 0.0765 0.0745 0.0765 0.0771

0.10 0.0802 0.0782 0.0802 0.0809

0.30 0.1822 0.1759 0.1813 0.1831

0.50 0.3042 0.2877 0.2991 0.3025

0.70 0.4450 0.4092 0.4284 0.4336

(continued)

CIs for Inverse Mean and Diﬀerence of Inverse Means 261

Table 3. (continued)

100 100 0.10 0.01 0.0398 0.0393 0.0398 0.0400

0.03 0.0414 0.0408 0.0414 0.0416

0.05 0.0443 0.0438 0.0443 0.0445

0.07 0.0484 0.0478 0.0484 0.0486

0.09 0.0533 0.0527 0.0534 0.0535

0.10 0.0560 0.0554 0.0561 0.0563

0.30 0.1264 0.1242 0.1260 0.1267

0.50 0.2072 0.2017 0.2056 0.2067

0.70 0.2949 0.2837 0.2900 0.2917

50 100 0.10 0.01 0.0567 0.0552 0.0567 0.0572

0.03 0.0579 0.0564 0.0579 0.0585

0.05 0.0601 0.0587 0.0601 0.0607

0.07 0.0632 0.0617 0.0632 0.0637

0.09 0.0669 0.0655 0.0669 0.0675

0.10 0.0692 0.0678 0.0693 0.0698

0.30 0.1328 0.1303 0.1325 0.1333

0.50 0.2113 0.2054 0.2096 0.2108

0.70 0.2986 0.2873 0.2938 0.2956

200 200 0.10 0.01 0.0280 0.0278 0.0280 0.0281

0.03 0.0291 0.0289 0.0291 0.0292

0.05 0.0311 0.0310 0.0311 0.0312

0.07 0.0340 0.0338 0.0340 0.0341

0.09 0.0375 0.0373 0.0375 0.0376

0.10 0.0394 0.0392 0.0394 0.0395

0.30 0.0887 0.0879 0.0885 0.0888

0.50 0.1438 0.1418 0.1431 0.1435

0.70 0.2016 0.1978 0.1999 0.2006

diﬀerence of inverse means with unknown CVs were computed using the pro-

posed conﬁdence intervals given in Sect. 3. The generalized conﬁdence interval

is CIGCI.δ = (0.4107, 0.7017) with the interval length of 0.2910. The large sam-

ple conﬁdence interval is CILS.δ = (0.3158, 0.7461) with the interval length of

0.4303. The MOVER conﬁdence interval is CIM OV ER.δ = (0.4032, 0.6962) with

the interval length of 0.2930. In comparison, the generalized conﬁdence interval

for the diﬀerence of inverse means of Niwitpong and Wongkhao [6] is CID.N W =

(0.3988, 0.7070) with the interval length of 0.3082. It is found that CIGCI.δ and

CIM OV ER.δ provide the lengths shorter than CILS.δ and CID.N W . Therefore,

CIGCI.δ and CIM OV ER.δ perform better than CILS.δ and CID.N W .

262 W. Thangjai et al.

This paper is extension of previous works of Thangjai et al. [12], Thangjai et al.

[13] and Thangjai et al. [14]. The performance of the new estimators and well-

established estimators were compared. The coverage probability and average

length of the conﬁdence intervals were used for a comparative study. The new

estimator of the inverse mean with unknown CV is θ and the new estimator

of the diﬀerence of inverse means with unknown CVs is δ. The well-established

estimators for the inverse mean and the diﬀerence of inverse means are 1/μ and

(1/μX ) − (1/μY ), respectively.

For the single inverse mean, the generalized conﬁdence interval (CIGCI.θ )

and the large sample conﬁdence interval (CILS.θ ) for the new estimator were

compared with the generalized conﬁdence interval (CIN W ) for 1/μ of Niwitpong

and Wongkhao [5]. The results indicated that CIGCI.θ and CIN W perform well

in term of the coverage probability. For σ ≤ 0.30, CIGCI.θ provides the average

lengths shorter than CIN W . For σ > 0.30, the average lengths of CIN W are

shorter than the average lengths of CIGCI.θ . It can be concluded that CIGCI.θ

is suggested for the single inverse mean with unknown CV when the value of

population standard deviation is small (σ ≤ 0.30).

For the diﬀerence of inverse means, the generalized conﬁdence interval

(CIGCI.δ ), the large sample conﬁdence interval (CILS.δ ) and the MOVER con-

ﬁdence interval (CIM OV ER.δ ) for the new estimator are compared with the

generalized conﬁdence interval (CID.N W ) for (1/μX ) − (1/μY ) of Niwitpong

and Wongkhao [6]. The coverage probabilities of the CIGCI.δ , CIM OV ER.δ

and CID.N W close to nominal conﬁdence level of 0.95. Moreover, CIGCI.δ and

CIM OV ER.δ yield the average lengths shorter than the CID.N W in almost all

cases. Therefore, CIGCI.δ and CIM OV ER.δ perform well in terms of the cover-

age probability and average length. However, the MOVER conﬁdence interval is

easy to use because it has the simple formula. Hence, the MOVER conﬁdence

interval is recommended as an interval estimator for the diﬀerence of inverse

means with unknown CVs.

Technology North Bangkok. Contract no. KMUTNB-61-DRIVE-006.

References

1. Blumenfeld, D.: Operations Research Calculations Handbook. Boca Raton,

New York (2001)

2. Donner, A., Zou, G.Y.: Closed-form conﬁdence intervals for function of the normal

mean and standard deviation. Stat. Methods Med. Res. 21, 347–359 (2010)

3. Lee, J.C., Lin, S.H.: Generalized conﬁdence intervals for the ratio of means of two

normal populations. J. Stat. Plann. Infer. 123, 49–60 (2004)

4. Niwitpong, S.: Conﬁdence intervals for the normal mean with a known coeﬃcient

of variation. Far East J. Math. Sci. 97, 711–727 (2015)

5. Niwitpong, S., Wongkhao, A.: Conﬁdence interval for the inverse of normal mean.

Far East J. Math. Sci. 98, 689–698 (2015)

CIs for Inverse Mean and Diﬀerence of Inverse Means 263

6. Niwitpong, S., Wongkhao, A.: Conﬁdence intervals for the diﬀerence between

inverse of normal means. Adv. Appl. Stat. 48, 337–347 (2016)

7. Sahai, A.: On an estimator of normal population mean and UMVU estimation of

its relative eﬃciency. Appl. Math. Comput. 152, 701–708 (2004)

8. Sahai, A., Acharya, R.M.: Iterative estimation of normal population mean using

computational-statistical intelligence. Comput. Sci. Tech. 4, 500–508 (2016)

9. Searls, D.T.: The utilization of a known coeﬃcient of variation in the estimation

procedure. J. Am. Stat. Assoc. 59, 1225–1226 (1964)

10. Srivastava, V.K.: A note on the estimation of mean in normal population. Metrika

27, 99–102 (1980)

11. Sodanin, S., Niwitpong, S., Niwitpong, S.: Generalized conﬁdence intervals for the

normal mean with unknown coeﬃcient of variation. In: AIP Conference Proceed-

ings, vol. 1775, pp. 030043-1–030043-8 (2016)

12. Thangjai, W., Niwitpong, S., Niwitpong, S.: Inferences on the common inverse

mean of normal distribution. In: AIP Conference Proceedings, vol. 1775, pp.

030027-1–030027-8 (2016)

13. Thangjai, W., Niwitpong, S., Niwitpong, S.: Conﬁdence intervals for mean and

diﬀerence of means of normal distributions with unknown coeﬃcients of variation.

Mathematics 5, 1–23 (2017)

14. Thangjai, W., Niwitpong, S., Niwitpong, S.: On large sample conﬁdence intervals

for the common inverse mean of several normal populations. Adv. Appl. Stat. 51,

59–84 (2017)

15. Weerahandi, S.: Generalized conﬁdence intervals. J. Am. Stat. Assoc. 88, 899–905

(1993)

Confidence Intervals for the Mean

of Delta-Lognormal Distribution

King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand

m.patcharee@uru.ac.th, sa-aat.n@sci.kmutnb.ac.th, suparatn@kmutnb.ac.th

vals for the mean are proposed in this research. This will be achieved

using a method of variance estimates recovery (MOVER) and gener-

alized conﬁdence interval (GCI) based on weighted beta distribution

by Hannig, and MOVER based on variance stabilized transformation

(VST). These are then compared with GCI based on VST. The coverage

probabilities and average lengths are performances from the presented

methods, computed via Monte Carlo simulation. Our results showed that

MOVER based on VST is the recommended method under situations of

slight probability of being zero and large coeﬃcient of variation in small

to moderate sample sizes. Ultimately, rainfall data in Chiang Mai was

used to illustrate all of the presented methods.

Method of variance estimates recovery

Generalized conﬁdence interval · Rainfall

1 Introduction

Data consist of two characteristics: contained zeros and positive observations

which have a lognormal distribution. This is called delta-lognormal distribution,

ﬁrst discovered by Aitchison [1]. The following data sets ﬁt the delta-lognormal

in many research areas such as medicine, environment and ﬁshery. For exam-

ple, the diagnostic test charges of patient group were utilized in the Callahan’s

study [3,4,10,15,19,22]. The airborne chlorine concentration was recorded at an

industrial site, United States [10,16,17,19]. The red cod density was inspected

by the National Institute of Water and Atmospheric Research, New Zealand

[8,21]. Also, the monthly rainfall totals in Bloemfontein and Kimberley cities

were surveyed and collected by the South African Weather Service [9].

One of the parameters of interest is mean or expected value of a random

variable in statistical inference. For applications, the mean has been utilized to

apply in several ﬁelds. In medicine, it was used to compare the outpatient cost

between before and after a Medicaid policy change, Indiana state, United States

[2], as well as to investigate medical cost of patient groups: patients with type

c Springer Nature Switzerland AG 2019

V. Kreinovich and S. Sriboonchitta (Eds.): TES 2019, SCI 808, pp. 264–274, 2019.

https://doi.org/10.1007/978-3-030-04263-9_20

Conﬁdence Intervals for the Mean of Delta-Lognormal Distribution 265

I diabetes and patients being treated for diabetic ketoacidosis [5]. In environ-

ment, it was used to estimate airborne concentrations in the area of industrial

sites [10,16,19], and to analyze the monthly rainfall totals in Bloemfontein and

Kimberley cities, South African [9]. In pharmacokinetics, it was used to examine

the maximum concentration (Cmax) in men from an alcohol interaction study

[13,18], and to assess the relative carboxyhemoglobin levels for two groups: non-

smokers and cigarette smokers [14].

Furthermore, several researchers have continually studied and developed the

methods to construct conﬁdence intervals for mean in delta-lognormal. For exam-

ple, Zhou and Tu [22] showed that the bootstrap method had the best accuracy

under small sample size, while small skewness was excluded. Tian and Wu [17]

showed that the adjusted signed log-likelihood ratio statistic outperformed in

terms of coverage probabilities and symmetry of upper and lower tail error prob-

abilities. Fletcher [8] recommended the proﬁle-likelihood in situations of error

rates within 1% (lower limit) or 3% (upper limit) of the nominal level, except for

small sample sizes and moderate to high skewness levels. Li et al. [15] suggested

the ﬁducial approach has highly accurate coverage and fairly low bias. Wu and

Hsieh [21] conﬁrmed that the asymptotic generalized pivotal quantity satisﬁed

in terms of coverage, expected interval lengths and reasonable relative biases.

Finally, Hasan and Krishnamoorthy [10] proposed the modifying of Tian’s [19]

generalized conﬁdence interval, which was precision and satisfactory in terms of

coverage, and maintained the balanced tail error rates better than the existing

methods.

As mentioned before, delta-lognormal distribution has been applied in real

life. Also, various methods have been proved and developed to ﬁnd the best

method to construct conﬁdence intervals for delta-lognormal mean. However,

these methods still have the restrictions in a few situations. Therefore, the aim

of this study is importantly to search for a better conﬁdence interval for the mean

using the MOVER and GCI based on weighted beta distribution by Hannig [11],

abbreviated by MOVER-1 and GCI-1, respectively, and MOVER based on VST

(MOVER-2). The three mentioned methods are compared with GCI based on

VST of Wu and Hsieh [21], abbreviated by GCI-2. The outline of this article

is systematized as follows: the methods are elaborated to establish conﬁdence

intervals for the delta-lognormal mean in Sect. 2. Section 3, the numerical results

are detailed to show the performances in terms of coverage probability and aver-

age lengths of all methods. The rainfall amount in Chiang Mai is illustrated with

the proposed methods in Sect. 4. This article closes with a brief discussion and

conclusion.

2 Methods

bution, denoted as Δ(μ, σ 2 , δ) where P (Yi = 0) and n0 ∼ B(n, δ); n0 stands

for the number of zero values. The distribution function of Y was presented by

Aitchison [1], Tian and Wu [17], deﬁned as

266 P. Maneerat et al.

2 δ ; yi = 0

H(yi ; μ, σ , δ) = (1)

δ + (1 − δ)G(yi ; μ, σ 2 ) ; yi > 0

denoted as LN (μ, σ 2 ). There are three maximum likelihood estimates as follows:

1 1

n1 n1

2 n0

μ̂ = ln yi , σ̂ 2 = (ln yi − μ̂) and δ̂ =

n1 j=1 n1 j=1 n

which are the estimates for μ, σ 2 and δ, respectively. The n1 stands for the

number of nonzero values; n = n0 + n1 . The mean, variance and coeﬃcient of

variation (CV) of Yi are

σ2

θ = (1 − δ) exp μ + (2)

2

λ = (1 − δ) exp 2μ + σ 2 exp σ 2 + δ − 1 (3)

exp (σ ) + δ − 1

2

τ= (4)

1−δ

Then, we obtained that the logarithm of the mean can be written as

σ2

ξ = ln(1 − δ) + μ + (5)

2

There are four presented methods to establish conﬁdence intervals for ξ are as

below.

The ideas of generalized conﬁdence interval was ﬁrst presented by Weerahandi

[20]. This is regarded as a basic method to construct conﬁdence interval for

the interesting parameter based on the concept of generalized pivotal quantity

(GPQ), elaborated as

Let Y = (Y1 , Y2 , ..., Yn ) be a random sample from the probability density

function, denoted as f (yi ; δ, η) where δ and η stand for the interesting and nui-

sance parameters, respectively. Let y = (y1 , y2 , ..., yn ) be a observed value of Y .

The R(Y ; y, δ, η) is considered as the generalized pivotal quantity if it satisﬁes

the conditions as follows:

(i) Given Y , the probability distribution of R(Y ; y, δ, η) is free of all unknown

parameters.

(ii) The observed value of R(Y ; y, δ, η), denoted as r(y; y, δ, η), depends on the

parameter of interest.

Therefore, we obtain that CIgci = [Rα/2 , R1−α/2 ] becomes the 100(1 − α)%

two-sided conﬁdence interval for the interesting parameter δ based on GCI where

Rα stands for the αth percentile of R(Y ; y, δ, η). Now, the GPQs of μ, σ 2 and δ

Conﬁdence Intervals for the Mean of Delta-Lognormal Distribution 267

are considered to construct conﬁdence intervals for ξ based on GCI. First, the

GPQs of μ and σ 2 were proposed by Krishnamoorthy and Mathew [12] as

W (n1 − 1)σ̂ 2

Rμ = μ̂ − √ (6)

n1 U

(n1 − 1)σ̂ 2

Rσ2 /2 = (7)

2U

2

2

where W = (μ̂−μ)/ (n1n−1)σ̂

1U

and U = (n1 −1)σ̂

σ2 stand for the random variables

of standard normal and chi-square distribution with n1 − 1 degrees of freedom,

respectively. Also, both variables are independent.

The generalized ﬁducial quantity (GFQ) of δ was presented by Hannig [11]. The

mentioned study found that the combination of two beta distribution, weighted

by 12 , is the best GFQ for δ. Next, the mentioned GFQ was also used with

the research of Li et al. [15]. Additionally, it also follows the concept of GPQ.

Consequently, the GPQ of δ is applied in this study, which is

1

Rδ.H ∼ [Beta(n0 , n1 + 1) + Beta(n0 + 1, n1 )] (8)

2

By Eqs. (6), (7) and (8), the GPQs of μ, σ 2 and δ satisﬁed the two conditions of

Weerahandi [20], the GPQ of ξ is given by

As a result, the 100(1 − α)% two-sided conﬁdence interval for ξ based on GCI-1

is

CIgci.H = [Lgci.H , Ugci.H ] = [Rξ.H (α/2), Rξ.H (1 − α/2)] (10)

where Rξ.H (α) stands for the αth percentile of Rξ.H .

Dasgupta [6] showed that the coverage probabilities of VST was essentially better

than the Wald interval. Next, Wu and Hsieh [21] applied the GPQ of δ based on

VST to construct intervals for delta-lognormal mean, which is

T

Rδ.vst = sin arcsin δ̂ − √

2

(11)

2 n

√ √

where T = 2 n arcsin δ̂ − arcsin δ ∼ N (0, 1); n → ∞. By three pivots of

(6), (7) and (11), the GPQ for ξ is deﬁned as

268 P. Maneerat et al.

where the random variables T , W , U are clearly independent. Recall that the

GPQs of μ, σ 2 and δ satisﬁes the conditions for a general pivotal quantity so that

the GPQ of ξ is also satisﬁed. Therefore, the 100(1 − α)% two-sided conﬁdence

interval for ξ based on GCI-2 is

This simple method was proposed by Donner and Zou [7]. Under normal distri-

bution, their study solved the problem using the estimates of variance recovered

form conﬁdence intervals, these individually computed mean and standard devi-

ation. These mentioned concepts are adopted in this study. Here μ, σ 2 and δ are

the parameters of delta-lognormal. Surely, the mean ξ is a function of mentioned

n1 2

parameters. For σ 2 , the unbiased estimate is σ̂ 2 = i=1 (ln Xi − μ̂) /(n1 − 1),

then

(n1 − 1) σ̂ 2

U= ∼ χ2n1 −1 (14)

σ2

where

χ2n1 −1 stands for chi-square distribution with n1 − 1 degrees of freedom.

σ̂ 2 stands for the sample variance of logarithm of positive observations.

At α signiﬁcant level, the coverage probability of χ2n1 −1 is deﬁned to estimate

σ 2 as

P χ2α2 ,n1 −1 ≤ χ2n1 −1 ≤ χ21− α2 ,n1 −1 = 1 − α (15)

(n1 − 1) σ̂ 2 (n1 − 1) σ̂ 2

CIσ2 = [lσ2 , uσ2 ] = , (16)

χ21− α ,n1 −1 χ2α ,n1 −1

2 2

(n1 − 1)σ̂ 2

W = (μ̂ − μ)/ ∼ N (0, 1) (17)

n1 U

Which is the random variable based on central limit theorem (CLT). To estimate

sample mean, the coverage probability of W is written at α signiﬁcant level as

P W α2 < W < W1− α2 = 1 − α (18)

⎡ ⎤

(n − 1)σ̂ 2 (n − 1)σ̂ 2

CIμ = [lμ , uμ ] = ⎣μ̂ − W1− α2 ⎦

1 1

, μ̂ + W1− α2 (19)

n1 U n1 U

Conﬁdence Intervals for the Mean of Delta-Lognormal Distribution 269

For δ, there are two presented methods to construct conﬁdence intervals for δ,

comprising of weighted beta distribution by Hannig [11] and VST as below.

The weighted beta distribution by Hannig [11]

This study applied the weighted beta distribution by Hannig [11] to establish

conﬁdence interval for δ where B1 ∼ Beta(n0 , n1 +1) and B2 ∼ Beta(n0 +1, n1 ).

One obtains that

Betaw = (B1 + B2 )/2 (20)

Then, the 100(1 − α)% conﬁdence interval for δ based on the weighted beta

distribution is

The variance stabilizing transformation

Dasgupta [6] presented the VST to construct conﬁdence interval for δ. Recall

that n0 ∼ B(n, δ) and δ̂ ∼ N (δ, δ(1 − δ)/n). By applying the delta theorem, one

obtains that √ d

n(δ̂ − δ) → N (0, δ(1 − δ)) (22)

√

A variance-stabilizing transformation is g (δ) = √ 1/2 dδ = arcsin δ so that

δ(1−δ)

g(n0 ) = arcsin nn0 is also the VST, and then

√ √ d

n arcsin δ̂ − arcsin δ → N (0, 1/4) (23)

√
√ d

Similarly, W = 2 n arcsin δ̂ − arcsin δ → N (0, 1). Therefore, the 100(1 −

α)% conﬁdence interval for δ based on VST is given by

1 1

= sin2 arcsin δ̂ − W1− α2 √ , sin2 arcsin δ̂ + W1− α2 √ (24)

2 n 2 n

Now, the method of variance estimates recovery is considered. Let

σ2

ξ = ξ1 + ξ2 + ξ3 = ln(1 − δ) + μ + (25)

2

Using μ̂, σ̂ 2 and δ̂ from the sample, the estimate of ξ is ξˆ = ξˆ1 + ξˆ2 + ξˆ3 =

2

ln(1 − δ̂) + μ̂ + σ̂2 . The 100(1 − α)% two-sided conﬁdence interval for ξ2 + ξ3 is

ﬁrst focused based on MOVER, that is

where

lσ 2 2

lξ2 +ξ3 = (ξˆ2 + ξˆ3 ) − (ξˆ2 − lμ )2 + (ξˆ3 − 2 )

270 P. Maneerat et al.

u

uξ2 +ξ3 = (ξˆ2 + ξˆ3 ) + (uμ − ξˆ2 )2 + ( 2σ2 − ξˆ3 )2

Next, the previous step is combined. Then, the 100(1−α)% two-sided conﬁdence

interval for ξ based on the MOVER approach is given by

where

Lξ = ξˆ − [ξˆ1 − ln(1 − uδ )]2 + [(ξˆ2 + ξˆ3 ) − lξ2 +ξ3 ]2

Uξ = ξˆ + [ln(1 − lδ ) − ξˆ1 ]2 + [uξ2 +ξ3 − (ξˆ2 + ξˆ3 )]2

The lδ and uδ depend on the conﬁdence interval for δ as below.

The 100(1 − α)% two-sided conﬁdence interval for ξ based on MOVER-1 is given

by

CIm.H = [Lm.H , Um.H ] (28)

where

Lm.H = ξˆ − [ξˆ1 − ln(1 − uδH )]2 + [(ξˆ2 + ξˆ3 ) − lξ2 +ξ3 ]2

Um.H = ξˆ + [ln(1 − lδH ) − ξˆ1 ]2 + [uξ2 +ξ3 − (ξˆ2 + ξˆ3 )]2

The 100(1 − α)% two-sided conﬁdence interval for ξ based on MOVER-2 is given

by

CIm.vst = [Lm.vst , Um.vst ] (29)

where

Lm.vst = ξˆ − [ξˆ1 − ln(1 − uδv )]2 + [(ξˆ2 + ξˆ3 ) − lξ2 +ξ3 ]2

Um.vst = ξˆ + [ln(1 − lδv ) − ξˆ1 ]2 + [uξ2 +ξ3 − (ξˆ2 + ξˆ3 )]2

3 Simulation Studies

Monte Carlo simulation is used to assess performances of the presented meth-

ods, including coverage probability (CP) and average length (AL). The conﬁ-

dence intervals are established by four methods: GCI-1, GCI-2, MOVER-1 and

MOVER-2. To choose a recommended method, there are two important criteria:

the coverage probabilities are at least or close to the nominal conﬁdence level

(1 − α) and also the narrowest average length.

The simulation settings consist of the mean μ = 0; the sample sizes n =

10, 20, 30, 50, 100; the probabilities of additional zero δ = 0.2, 0.5, 0.8 and the

coeﬃcient of variation τ = 0.2, 0.5, 1, 2. The following cases of n = 10, 20, δ = 0.8,

and τ = 0.2, 0.5, 1, 2 are similarly excluded with Fletcher [8], Wu and Hsieh [21].

Although, the expected number of non-zeros were beneath 10, the situations

Conﬁdence Intervals for the Mean of Delta-Lognormal Distribution 271

Table 1. The coverage probabilities (CP) and average lengths (AL) of 95% two-sided

conﬁdence intervals for ξ of delta-lognormal

272 P. Maneerat et al.

of n = 30, δ = 0.8 and τ = 0.2, 0.5, 1, 2 were not discarded in this simulation

study because a few methods performed well in the mentioned combinations.

The nominal conﬁdence level was set to 0.95. A total of 10,000 random samples

were generated for each sample size and parameter setting. Also, the number of

pivotal quantities was 5000 for GCI-1 and GCI-2 methods.

The numerical results are shown in Table 1. This indicates that the coverage

probabilities of MOVER-1 were mostly lower than the nominal conﬁdence level

except in the following cases: [n = 50, 100, δ = 0.2, τ = 1, 2] and [n = 100,

δ = 0.5, τ = 1, 2]. When the large τ were excluded, the MOVER-2 performed

well in terms of coverage probabilities. If its average lengths were also considered,

the performances of MOVER-2 satisﬁed the criterion when δ = 0.2, τ = 1, 2 in

sample sizes n = 20, 30, 50. For GCI-1, its coverage probabilities underestimated

the nominal conﬁdence level in all cases. Conversely, GCI-2 provided coverage

probabilities and average lengths to maintain the target, especially in cases of

δ = 0.5, 0.8 and τ = 0.2, 0.5, 1 in sample sizes n = 30, 50, 100.

4 An Empirical Application

To conﬁrm the simulation results, the rainfall data (mm/day) were recorded

though May 2017 by the Chiang Mai weather station, Northern meteorological

center, Thailand. In general, the rainy season is during mid-May to October in

Thailand. In this period, Thai agriculturist cultivate early when dealing with rice

farming and other plants, so rainfall quantities are one of the most important

factors for plant growth. For the survey, the sample sizes were 31 days, including 8

and 23 days for zero and positive-valued rainfall, respectively, detailed in Table 2.

Table 2. The amount of rainfall recorded by the Chiang Mai weather station on May

2017.

Day 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Rainfall 0.0 2.7 0.0 9.9 5.5 0.0 0.8 4.0 6.9 0.0 0.1 0.0 13.0 0.0 0.5 36.2

Day 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

Rainfall 54.8 124.8 0.0 0.8 0.1 0.0 8.6 21.2 53.7 9.7 7.4 14.6 2.7 32.7 0.7

For the test of normality, the p-value of Kolmogorov-Smirnov tests was 0.8697

for the logarithm of positive rainfall on May 2017. Moreover, the mentioned data

contained zero observations. This amount of rainfall clearly ﬁts the character-

istics of delta-lognormal distribution. In fact, the estimated rainfall mean was

θ̂ = 24.79 while the approximate logarithm of the mean of rainfall was ξˆ = 3.21

where μ̂ = 1.64, σ̂ 2 = 3.73, δ̂ = 0.26 and τ̂ = 7.43. Next, the 95% two-sided

conﬁdence intervals for θ were computed based on GCI-1, GCI2, MOVER-1 and

MOVER-2, shown in Table 3. This dataset showed that the conﬁdence intervals

for the mean of rainfall are consistent with the numerical result in the previous

section.

Conﬁdence Intervals for the Mean of Delta-Lognormal Distribution 273

Table 3. Summary of result for the amount of rainfall: the 95% two-sided conﬁdence

intervals for θ using four methods.

Lower limit 8.88 8.62 10.26 10.11

Upper limit 204.95 223.28 7123.99 7138.77

This paper presented the following methods: GCI-1, MOVER-1 and MOVER-2,

and compared them with GCI-2 to establish a conﬁdence interval for the mean

of delta-lognormal distribution. Likewise, Monte Carlo simulation numerically

evaluated the performances of all methods which are considered as coverage

probability and average length. The ﬁndings can be brieﬂy summarized as fol-

lows: MOVER-2 is regarded as a recommended method when there is a small

probability of having zero and large CV in small and moderate sample sizes.

Furthermore, GCI-2 satisﬁed the criteria under the situations of large probabil-

ity of obtaining zero and small CV in moderate and large sample sizes. However,

MOVER-2 is not as complicated as GCI-2 if the two methods are compared

by computation. On the contrary, GCI-1 and MOVER-1 are not recommended

methods because they both have low accuracy in terms of coverage probabili-

ties and average lengths. From the above simulation results, the performance of

GCI-2 matched up with the study of Wu and Hsieh [21].

References

1. Aitchison, J.: On the distribution of a positive random variable having a discrete

probability mass at the origin. J. Am. Stat. Assoc. 50, 901–908 (1955)

2. Bebu, I., Mathew, T.: Comparing the means and variances of a bivariate log-normal

distribution. Stat. Med. 27, 2684–2696 (2008)

3. Callahan, C.M., Kesterson, J.G., Tierney, W.M.: Association of symptoms of

depression with diagnostic test charges among older adults. Ann. Internal Med.

126, 426–432 (1997)

4. Chen, Y.-H., Zhou, X.-H.: Generalized conﬁdence intervals for the ratio or diﬀer-

ence of two means for lognormal populations with zeros, UW Biostatistics Working

Paper Series (2006)

5. Chen, Y.-H., Zhou, X.-H.: Interval estimates for the ratio and diﬀerence of two

lognormal means. Stat. Med. 25, 4099–4113 (2006)

6. Dasgupta, A.: Asymptotic Theory of Statistics and Probability Springer Texts in

Statistics. Springer, New York (2008)

7. Donner, A., Zou, G.Y.: Closed-form conﬁdence intervals for functions of the normal

mean and standard deviation. Stat. Methods Med. Res. 21, 347–359 (2010)

8. Fletcher, D.: Conﬁdence intervals for the mean of the delta-lognormal distribution.

Environ. Ecol. Stat. 15, 175–189 (2008)

9. Harvey, J., van der Merwe, A.J.: Bayesian confdence intervals for means and vari-

ances of lognormal and bivariate lognormal distributions. J. Stat. Plann. Infer.

142, 1294–1309 (2012)

274 P. Maneerat et al.

10. Hasan, M.S., Krishnamoorthy, K.: Conﬁdence intervals for the mean and a per-

centile based on zero-inﬂated lognormal data. J. Stat. Comput. Simul. 88, 1499–

1514 (2018)

11. Hannig, J.: On generalized fducial inference. Stat. Sinica 19, 491–544 (2009)

12. Krishnamoorthy, K., Mathew, T.: Inferences on the means of lognormal distri-

butions using generalized p-values and generalized conﬁdence intervals. J. Stat.

Plann. Infer. 115, 103–121 (2003)

13. Krishnamoorthy, K., Oral, E.: Standardized likelihood ratio test for comparing

several log-normal means and conﬁdence interval for the common mean. Stat.

Methods Med. Res. 0, 1–23 (2015)

14. Lee, J.C., Lin, S.-H.: Generalized conﬁdence intervals for the ratio of means of two

normal populations. J. Stat. Plann. Infer. 123, 49–60 (2004)

15. Li, X., Zhou, X., Tian, L.: Interval estimation for the mean of lognormal data with

excess zeros. Stat. Probab. Lett. 83, 2447–2453 (2013)

16. Owen, W.J., DeRouen, T.A.: Estimation of the mean for lognormal data containing

zeroes and left-censored values, with applications to the measurement of worker

exposure to air contaminants. Biometrics 36, 707–719 (1980)

17. Tian, L., Wu, J.: Conﬁdence intervals for the mean of lognormal data with excess

zeros. Biometrical J. Biometrische Zeitschrift 48, 149–156 (2006)

18. Tian, L., Wu, J.: Inferences on the common mean of several log-normal populations:

the generalized variable approach. Biometrical J. 49, 944–951 (2007)

19. Tian, L.: Inferences on the mean of zero-inﬂated lognormal data: the generalized

variable approach. Stat. Med. 24, 3223–3232 (2005)

20. Weerahandi, S.: Generalized conﬁdence intervals. J. Am. Stat. Assoc., 899–905

(1993)

21. Wu, W.-H., Hsieh, H.-N.: Generalized conﬁdence interval estimation for the mean

of delta-lognormal distribution: an application to New Zealand trawl survey data.

J. Appl. Stat. 41, 1471–1485 (2014)

22. Zhou, X.H., Tu, W.: Conﬁdence intervals for the mean of diagnostic test charge

data containing zeros. Biometrics 2000, 1118–1125 (2000)

The Interaction Between Fiscal Policy,

Macroprudential Policy and Financial

Stability in Vietnam-An Application

of Structural Equation Modeling

Nguyen Ngoc Thach1(B) , Tran Thi Kim Oanh2 , and Huynh Ngoc Chuong3

1

International Economics Faculty, Banking University of Ho Chi Minh City,

36 Ton That Dam Street, District 1, HCMC, Vietnam

thachnn@buh.edu.vn

2

Banking University of Ho Chi Minh City,

36 Ton That Dam Street, District 1, HCMC, Vietnam

kimoanhtdnh@gmail.com

3

University of Economics and Law, Vietnam National University - HCMC,

Quarter 3, Linh Xuan Ward, Thu Duc District, Ho Chi Minh City, Vietnam

chuonghn@tdmu.edu.vn

tial policy as well as their interaction on ﬁnancial stability in Vietnam

during the global economic crisis 2008–2009. Using Structural Equation

Modeling (SEM), the study shows that both ﬁscal policy and macropru-

dential policy have a great impact on the ﬁnancial stability. In particular,

ﬁscal policy has a negative impact while macroprudential policy shows

a positive eﬀect on the ﬁnancial stability in Vietnam. Besides, evidences

that indicate a negative relation between the ﬁscal policy and macro-

prudential policy in Vietnam are also implied in the study’s outcomes.

With those results, the authors come to conclusion that Vietnam should

execute macroeconomic policies, especially the ﬁscal policy and macro-

prudential policy, with caution and consideration of their interaction in

order to take the best advantage of their coordination towards ﬁnancial

stability.

Fiscal policy · Macroprudential policy

1 Introduction

Vietnam’s economy has been increasingly being integrated into the global mar-

ket. However, Vietnamese ﬁnancial system’s development still remains at low

level with critical dependence on the banking sector. Similar to other developing

countries, Vietnam’s rapid growth over the last three decades is mainly fueled

by domestic and foreign investment funds through the ﬁnancial system. It can

c Springer Nature Switzerland AG 2019

V. Kreinovich and S. Sriboonchitta (Eds.): TES 2019, SCI 808, pp. 275–288, 2019.

https://doi.org/10.1007/978-3-030-04263-9_21

276 N. N. Thach et al.

macroeconomic situations, which enhances Vietnam’s economic growth.

The US ﬁnancial crisis in 2007 caused several negative impacts not only on its

economic growth but also on others worldwide, which triggered a global economic

crisis. To restrict this spread, countries developed mechanisms based on the

combination of macroeconomic policies, speciﬁcally the tripartile coordination

of monetary, ﬁscal and macroprudential policies to stabilize the macroeconomic

situation as well as the ﬁnancial system. In this context, macroprudential policy

which aims to prevent or limit contagion eﬀects of economic shocks has received

increasing attention from central banks and ﬁnancial institutions. Since being

ﬁrst introduced by Basel Committee on Banking Supervision and the Bank of

England in the 1970s, macroprudential policy has been studied and applied more

and more widely in several countries due to the recent global crisis.

In Vietnam, the term “macroprudential policy” has recently been men-

tioned in many colloquia. Especially in 2013, Vietnam National Assembly’s Eco-

nomic Commission compiled criteria for the assessment of macroprudential pol-

icy model according to international experience. However, Vietnam has not yet

developed a full set of macroprudential policy tools itself up to now. Only few

tools are applied with the view to minimizing the risks of macroeconomic insta-

bility to the ﬁnancial system. Additionally, there is a limitation in researches

on this policy in Vietnam. The existing papers mainly focus on three issues: (i)

the coordination between ﬁscal and monetary policies in Vietnam; (ii) the eﬀec-

tiveness of individual implementation of ﬁscal and macroprudential policies to

pursue ﬁnancial stability in Vietnam; (iii) the coordination between monetary

and macroprudential policies in the context of economic instability. Currently,

there is no researches on the coordination between ﬁscal and macroprudential

policies with the aim of stabilizing Vietnam’s ﬁnancial system. For those rea-

sons, this paper aims to address two issues: (i) evaluating the individual impact

of ﬁscal and macroprudential policies on the ﬁnancial stability; (ii) estimating

the interaction between ﬁscal and macroprudential policies in ﬁnancial stability

in Vietnam.

2 Theoretical Background

Various researchers and institutions have formed their own definitions of the term "financial stability". Although these definitions are not identical, most of them share some common aspects, as listed below:

• Financial stability reflects the ability of the financial system to perform its core functions, such as transferring savings to investments;
• Financial stability is defined in terms of the system's ability to be resistant to economic shocks;
• Financial stability has a positive impact on the real sector.


Unlike "monetary stability", whose analytical framework has already been developed and accepted worldwide, there is not yet an analytical framework or standard measurement for "financial stability", according to Gadanecz and Jayaram [3]. Basically,

ﬁnancial stability can be characterized by the complex interactions of diﬀerent

sectors in a ﬁnancial system and it is not easily measured by any single indica-

tor. It is obvious that within a national ﬁnancial system, markets have diﬀerent

volatility indices. The fact that every change in each index aﬀects other indices

is indeed a major challenge for policy makers. Therefore, there is a need for an aggregate indicator that groups the indices of different financial markets in order to capture the key features of the volatility of the financial system.

In research performed on Macau, Cheang and Choy [2] use two groups of indicators: the financial vulnerability indicator (FVI) and the regional economic capacity index (RECI). In their financial stability index, the FVI carries a weight of 60% while the RECI accounts for 40%. Also, Morales and Estrada [10] construct a stability index for the financial system in Colombia through three weighting approaches applied to selected variables: profitability (return on assets (ROA), return on equity (ROE), ratio of net loan losses to total loan portfolio); probability of default; and liquidity (ratio of liquid liabilities to liquid assets, ratio of interbank funds to liquid assets). It is indicated that all three weighting approaches yield a similar financial stability index.

Another researcher, Morris [11], builds measures of financial stability in Jamaica using four equally weighted indicators: a financial development index, a financial vulnerability index, a financial soundness index and a world economic climate index. On the other hand, Petrovska and Mihajlovska [14] developed an aggregate financial stability index for Macedonia with five key variables, in which insolvency was weighted 0.25, while credit risk, profitability, liquidity risk and currency risk were weighted 0.25, 0.2, 0.25 and 0.05 respectively.

Finally, Albulescu [1] and Morris [11] used a total of 18 indicators classified into three sub-indices: a financial development index (4 indicators), a financial vulnerability index (8 indicators) and an insolvency index (3 indicators). It is recognized that the measures of financial stability constructed by Albulescu [1] and Morris [11] assess the stability of the financial system quite comprehensively. In fact, the IMF Global Macroprudential Policy Instruments (GMPI) also uses this benchmark in its research (Table 1).
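To make the idea of such an aggregate indicator concrete, the sketch below combines normalized sub-indices into a single financial stability index with fixed weights, in the spirit of the weighting schemes cited above (for instance, Petrovska and Mihajlovska [14]). The sub-index names, weights and values are illustrative assumptions, not figures from this chapter.

```python
# Minimal sketch: build an aggregate financial stability index (FSI) as a
# weighted sum of sub-indices. Names, weights and values are illustrative only.
from typing import Dict


def aggregate_fsi(sub_indices: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of sub-indices, each assumed already normalized to [0, 1]."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    return sum(weights[name] * sub_indices[name] for name in weights)


# Hypothetical weights echoing a 0.25/0.25/0.2/0.25/0.05 scheme
weights = {"insolvency": 0.25, "credit_risk": 0.25, "profitability": 0.20,
           "liquidity_risk": 0.25, "currency_risk": 0.05}
sub_indices = {"insolvency": 0.6, "credit_risk": 0.4, "profitability": 0.7,
               "liquidity_risk": 0.5, "currency_risk": 0.3}
print(aggregate_fsi(sub_indices, weights))  # a single FSI value in [0, 1]
```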

2.2.1 Fiscal Policy

According to Krugman and Wells [8], ﬁscal policy can be understood as a gov-

ernment’s interventions in the tax system and expenditure in order to achieve

macroeconomic objectives such as economic growth, full employment or price

stability. Governments usually implement discretionary or automatic ﬁscal pol-

icy in which discretionary ﬁscal policy implies the government’s actions to adjust

spending and/or income in accordance with their decisions. Based on the impact

on output, fiscal policy can be categorized into expansionary and contractionary policies.


Table 1. Financial stability indicators by sub-index and their sources

Financial development index:
• Market capitalization/GDP (Albulescu [1], Morris [11], Svirydzenka [19])
• Total credit/GDP (Albulescu [1], Morris [11], Svirydzenka [19])
• Interest spread (Albulescu [1], Morris [11], Svirydzenka [19])
• Herfindahl-Hirschmann Index (HHI) (Albulescu [1], Morris [11], Svirydzenka [19])

Financial vulnerability index:
• Inflation rate (Albulescu [1], Morris [11])
• General budget deficit/surplus (%GDP) (Albulescu [1], Morris [11], Svirydzenka [19])
• Current account deficit/surplus (%GDP) (Albulescu [1], Morris [11], Svirydzenka [19])
• Change in real effective exchange rate (REER) (Albulescu [1], Morris [11])
• Non-governmental credit/total credit (Albulescu [1], Morris [11])
• Loans/deposits (Albulescu [1], Morris [11])
• Deposits/M2 (Albulescu [1], Morris [11])

Solvency:
• Non-performing loans/total loans (Albulescu [1], Morris [11])
• Bank equity/total assets (Albulescu [1], Morris [11])
• Probability of default (Albulescu [1], Morris [11])

Expansionary fiscal policy is usually used to boost output growth during a downturn or recession; when there is no inflationary pressure, conditions are favorable for expansionary policy. On the contrary, the government implements contractionary policy to slow growth and inflationary pressure when the economy is overheated, with exhausted resources and high inflation. However, changes in the budget deficit or surplus result not only from discretionary fiscal policy but also from automatic stabilizers, which are defined as tools that adjust automatically over the economic cycle.

While the purpose of fiscal policy (or monetary policy) is to stabilize output after economic shocks, macroprudential policy helps to anticipate and prevent shocks before they occur. This policy uses prudential instruments to limit systemic risks, thereby mitigating default risks and acting proactively against the


build-up of systemic financial risks, which may cause severe consequences for the real sector, as in Nier et al. [12].

Macroprudential policy consists of institutional frameworks, tools and a workforce. In addition, to achieve high efficiency, a strong mechanism of coordination among stakeholders is necessary, as is harmonized interaction among macroeconomic policies, particularly the coordination between fiscal and macroprudential policies, which contributes to achieving financial stability. In fact, it is macroprudential policy that monitors macroeconomic policies and ensures their consistent interaction, thus leading to financial stability.

According to the IMF's [6] research results, there is no standard macroprudential policy for all countries. The choice of policy instruments depends on the exchange rate regime, the level of development, and the vulnerability of a financial system to shocks. Countries usually apply several tools rather than a single one, and then coordinate them to act against the cyclical nature of the economy. There are many ways to categorize macroprudential instruments, two of which are the subject dimension and the risk dimension. (i) According to the subject dimension, macroprudential tools can be classified into tools affecting borrowers' behavior and tools of capital control to mitigate the risk of unstable investment flows. (ii) In terms of the risk dimension, there are credit-related tools limiting loans based on the loan-to-value ratio (LTV), debt-to-income ratio (DTI), foreign currency exposure, credit ceilings or credit growth; liquidity-related tools limiting the net open position and maturity mismatch and regulating reserves; and capital-related tools such as countercyclical/time-varying capital requirements, time-varying/dynamic provisioning and restrictions on profit distribution.
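As a toy illustration of the borrower-based, credit-related tools mentioned above, the sketch below checks a loan application against loan-to-value (LTV) and debt-to-income (DTI) caps. The thresholds are hypothetical and are not regulatory values cited in this chapter.

```python
# Toy illustration of borrower-based credit tools (LTV and DTI caps).
# The cap values are hypothetical, not regulatory figures from the chapter.
from dataclasses import dataclass


@dataclass
class LoanApplication:
    loan_amount: float
    collateral_value: float
    annual_debt_service: float
    annual_income: float


def passes_caps(app: LoanApplication, ltv_cap: float = 0.8, dti_cap: float = 0.4) -> bool:
    """Return True if the application satisfies both the LTV and DTI limits."""
    ltv = app.loan_amount / app.collateral_value
    dti = app.annual_debt_service / app.annual_income
    return ltv <= ltv_cap and dti <= dti_cap


print(passes_caps(LoanApplication(800, 1_000, 30, 100)))  # True: LTV 0.8, DTI 0.3
```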

As analyzed above, financial stability is the outcome of the interaction among macroeconomic policies. In this study, the authors focus on the interaction between fiscal policy and macroprudential policy, two key components of macroeconomic policy.

With a view to achieving macroeconomic objectives, especially financial stability, good coordination between the government bodies responsible for macroprudential policy and fiscal policy is required. While the purpose of fiscal policy is to enhance economic growth by affecting aggregate demand, macroprudential policy is aimed at stabilizing the financial system. On the one hand, the government's use of fiscal policy to accelerate economic growth can adversely affect the efficiency of macroprudential policy, meaning that financial stability is achieved only at increasing cost. On the other hand, if macroprudential policy is implemented ineffectively, meaning that financial stability is achieved at high cost or not achieved at all, it will have negative impacts on the real sector; in other words, this implies adverse effects on the final objectives of fiscal policy. In summary, inefficient coordination between macroprudential and fiscal policies will lead to failure in pursuing the policy objectives (financial stability and economic growth) as planned. Specifically, ineffective performance


of ﬁscal policy aiming at high GDP growth can have an adverse impact on the

goal of macroprudential policy - ﬁnancial stability.

In more detail, the major impacts of fiscal policy on macroprudential policy are classified as follows.

First, the positive impacts include:

• Fiscal policy is an instrument to control capital inflows and outflows. For example, taxation and charges on holders of foreign financial assets may reduce those assets' attractiveness to residents.
• An effective fiscal policy allows rapid economic growth to be achieved while public resources are used economically. Fast economic development contributes to solving plenty of problems, one of which is financial stability.
• An appropriate approach to fiscal policy, such as a countercyclical policy, allows for a considerable increase in financial stability.

Second, the negative impacts include:

• Expansionary discretionary fiscal policy is likely to cause large long-term budget deficits and high public debt. Budget deficits are often offset from two main sources: (i) public debt and (ii) central bank borrowing. Increasing public debt often creates a "crowding out effect", which reduces private investment and economic growth in the long run. If the government borrows from the central bank to offset the budget deficit, inflation will rise, accompanied by a decline in aggregate demand and in economic growth. In addition, a budget deficit offset by attracting short-term international capital, which fluctuates strongly and is prone to sudden reversals, adversely affects exchange rates and foreign exchange reserves and can lead to a current account deficit. It is clear that the instability of international capital flows has a negative impact on financial stability.

• The negative impacts of public debt are expressed in the following aspects: (i) if public debt is too high, then through the "crowding out effect" it limits potential economic growth, and as a result the ability to prevent a financial crisis is weakened; (ii) excessive government borrowing leads to high spending on debt service and the need to cut budgets for socio-economic development, including financial sustainability; (iii) if the government borrows heavily abroad, the risk of the national financial system depending on the international financial system is enormous; (iv) public debt can be considered an expected tax: sooner or later the debt, together with interest, must be paid, and for long-term debt the payment is transferred to future generations.

• For countries facing serious public debt, fiscal repression is a common tool of fiscal policy. It includes measures such as the placement of government debt with controlled financial institutions (pension funds, central banks); open or hidden limits on bank lending rates; limitation of international capital flows; increases in the reserve requirement ratio; and increased capital requirements on banks and


retirement funds [15,16]. All of the above enhance government control over financial resources in order to reduce public debt. In essence, fiscal repression is a latent tax imposed on the operation of financial intermediaries.

The impacts of macroprudential policy on fiscal policy can be described as follows.

First, the positive impacts include:

• Financial stability is a critically important objective of public policy, and macroprudential policy uses prudential tools to prevent or minimize systemic risks and to ensure the objective of financial stability, which allows sustainable economic growth to be built. Thus, effective implementation of macroprudential policy ensures stable economic growth and a steady increase in the income of economic entities, contributing to stable tax revenues. A good source of tax revenues not only maintains the public administration system but also helps solve social problems, creates the essential conditions for long-term economic growth, and ensures the government's reserves in case of emergencies such as a financial crisis. In other words, thanks to strong budget revenues, fiscal policy has more room to carry out the government's regular or emergency tasks and becomes more flexible in solving urgent problems.

Second, the negative impacts include:

• On the contrary, if macroprudential policy is implemented ineffectively, it will cause an increase in financial instability and systemic risks, eventually having a negative impact on economic growth. In that case, budget revenues collected from economic entities will decline, and consequently fiscal policy becomes less effective.
• Risks deriving from the inefficient performance of macroprudential policy result in severe trade deficits, considerable inflows and outflows of international capital, and asset bubbles, along with interest rate, inflation, liquidity and national debt risks and sudden changes in market sentiment.

Based on theoretical and empirical studies on financial stability, fiscal policy and macroprudential policy, and on data for the period 2000–2015 collected from the IMF, the ADB and Orbis Bank Focus, the authors construct a research model of the relation between fiscal policy and macroprudential policy in pursuing financial stability in Vietnam in the context of the global economic crisis of 2008–2009 (Table 2).
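As a small illustration of how data of this kind can be organized for estimation, the sketch below loads an annual dataset and constructs the crisis dummy listed in Table 2. The file name, column names and the exact cut-off convention (here, 2008 onward) are assumptions for illustration, not details taken from the chapter.

```python
# Minimal sketch of assembling the annual dataset and the crisis dummy.
# File name and column names are hypothetical.
import pandas as pd

df = pd.read_csv("vietnam_indicators_2000_2015.csv")  # one row per year
# Crisis dummy as in Table 2: D = 0 before 2008, D = 1 afterwards
# (here coded as 2008 onward; the chapter's exact cut-off may differ)
df["crisis_dummy"] = (df["year"] >= 2008).astype(int)
print(df[["year", "crisis_dummy"]].head(12))
```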

In order to analyze the relation between fiscal policy and macroprudential policy in stabilizing Vietnam's financial system during economic crises, the authors use SEM to evaluate the statistically significant relations between latent variables and observed variables. With SEM, observed variables can be used to measure latent variables. Moreover, as SEM provides plenty of evidence for forming simple and easy-to-follow indicators, it is especially advantageous for models which have many variables representing complex indicators.


Table 2. Variables of the research model: latent variables, observed indicators, expected signs and sources

Fiscal policy:
• Tax revenue/GDP; expected sign: -; Galati and Moessner [4]
• Government debt/GDP; Albulescu [1], Galati and Moessner [4]
• Expense/GDP; Galati and Moessner [4]

Macroprudential policy:
• Non-performing loans/total loans; expected sign: +; Lee et al. [9]
• Credit growth; Lee et al. [9]
• Total loans/total deposits; Lee et al. [9]

World economic indicators:
• Average world economic growth; expected sign: +/-; Albulescu [1] and Morris [11]
• Average world inflation growth; Albulescu [1] and Morris [11]

Economic crisis:
• Dummy D (D = 0 for the period before 2008, D = 1 for the period after 2008); expected sign: -; Thanh and Tuan [13]

Economic growth:
• Real GDP growth; expected sign: +; Morgan and Pontines [10]

Last but not least, SEM is also helpful in estimating and separating the direct and indirect impacts of the variables in the model.
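A sketch of how a model of this kind could be specified in Python with the semopy package is given below. The chapter does not state which software the authors used, and the variable names and model structure shown here are illustrative, loosely following the variables of Table 2 rather than reproducing the authors' exact model; financial stability is treated here as a single observed composite index for simplicity.

```python
# Sketch of an SEM specification in lavaan-style syntax via the semopy package.
# Variable names and structure are illustrative, loosely based on Table 2.
import pandas as pd
from semopy import Model

MODEL_DESC = """
Fiscal =~ tax_gdp + debt_gdp + expense_gdp
Macroprudential =~ npl_ratio + credit_growth + loans_to_deposits
fin_stability ~ Fiscal + Macroprudential + crisis_dummy + gdp_growth
Fiscal ~~ Macroprudential
"""

data = pd.read_csv("vietnam_indicators_2000_2015.csv")  # hypothetical data file
model = Model(MODEL_DESC)   # measurement and structural parts in one description
model.fit(data)             # estimates the model (maximum likelihood by default)
print(model.inspect())      # parameter estimates, standard errors, p-values
```

In this notation, the `=~` lines form the measurement part (latent policy constructs measured by observed indicators), the `~` line is the structural part, and `~~` allows the two policy constructs to covary, which is the interaction of interest in this study.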

In the SEM approach, path diagrams, which resemble flowcharts, play a fundamental role in structural modeling. They show variables interconnected with lines that indicate correlations as well as causal flows. With a classic equation such as

Y = aX + e   (1)

we can use arrows to represent the relationships between variables (both observed and latent). Latent variables are placed in ovals or circles (including the error term), while observed variables are placed in boxes (Fig. 1).
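For reference, the general form of such a model can be written in standard SEM (LISREL-type) notation; the equations below are the textbook formulation, which the chapter does not spell out, rather than the authors' specific model.

```latex
% General SEM form: structural part (relations among latent variables)
% and measurement parts (indicators loading on the latent variables).
\begin{align}
  \eta &= B\eta + \Gamma\xi + \zeta, \\
  y    &= \Lambda_y \eta + \varepsilon, \\
  x    &= \Lambda_x \xi + \delta,
\end{align}
```

where the latent variables (eta for endogenous, xi for exogenous) correspond to the ovals of the path diagram, y and x are their observed indicators drawn as boxes, and zeta, epsilon and delta are the error terms.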


Based on this theoretical background, the authors propose the following SEM (Fig. 2) as the concept model: the macroeconomic variables have complicated relations, while the estimation and separation of direct and indirect effects among factors must be conducted in the context of a limited database. One solution is to perform concurrent estimation of the regression equations by estimating and simulating the data probabilistically and in loops to ensure robust results. This is also a strength of analyzing with the SEM

method. In thi