Formulation and Testing of Scientific Hypotheses in The Presence of Uncertainty

Vol. 5, 2020-01
Hugo Hernandez
ForsChem Research, 050030 Medellin, Colombia
hugo.hernandez@forschem.org
doi: 10.13140/RG.2.2.36317.97767
Abstract
The present essay is intended as a brief explanation and a provocative discussion of the
mechanism of hypothesis formulation and testing, used for the construction of human
knowledge and the advance of Science. Simply stated, hypotheses are possible answers to a
specific research question. The validity of those hypotheses can be tested, in principle, from
the observation of experimental results. However, the presence of randomness and
uncertainty in the experimental validation process does not allow us to reject or accept
hypotheses with absolute certainty. Thus, the experimental validation of hypotheses yields
highly likely, although not absolute, conclusions. Experimental validation of hypotheses
requires formulating a complete set of mutually exclusive hypotheses for a particular research
question. All hypotheses in the set must be scientific, that is, it should be possible to subject
them to experimental validation. A conclusion about the validity of a particular hypothesis can
be reached after contrasting the experimental data in the presence of uncertainty to the result
predicted by the hypothesis. This procedure involves several important assumptions, and for
that reason, statistical hypothesis testing results must always be carefully interpreted in order
to avoid reaching wrong conclusions. One such assumption, for example, is the confidence
level. The selection of the confidence level used in the analysis is arbitrary, and it should
depend on the particular research question considered and the degree of uncertainty involved.
By increasing the confidence level, rejecting a hypothesis becomes more difficult. However, it is
usually easier to reject a hypothesis than to prove it. Furthermore, it is virtually impossible to
validate a model or theory. Thus, not-yet-rejected “working” models or paradigms, integrating
our current scientific knowledge, are useful but not necessarily valid. Even the so-called laws of
Nature are not universally valid; they are just highly likely to hold true every time they are
tested. Since models and theories cannot be validated with absolute confidence, Science will
never reach a state of perfection. However, this also implies that Science will be
continuously progressing and evolving, based only on the experimental merit of the novel and
better hypotheses formulated by scientists.
15/01/2020 ForsChem Research Reports Vol. 5, 2020-01 (1 / 16)

www.forschem.org
Keywords
Experimentation, Hypotheses, Models, Observation, Philosophy, Research, Schrödinger,
Science, Scientific Method, Statistical Tests, Theories, Uncertainty
1. Hypotheses
Hypotheses are statements about a certain property of a particular population of elements.

Hypotheses also represent possible answers to a specific question or problem (research
question). The validity of hypotheses can be tested from the observation of experimental
results. Hypothesis testing is one of the most important mechanisms for the scientific
construction of human knowledge. The fundamental basis of building knowledge by
hypothesis testing relies on simultaneously testing pairs of mutually exclusive hypotheses. This
was clearly stated by René Descartes [1] using his “famous” Eudoxus§ character: “It is
impossible that one and the same thing should exist and at the same time not exist.”
Descartes’ statement might seem in contradiction with the Copenhagen interpretation of
quantum mechanics, clearly explained by Schrödinger’s thought experiment of a cat (initially
alive) trapped in a box with a deadly radioactive atom.[2] Before opening the box, the cat is in
two different superposed states: Dead and alive. Only after opening the box, its true state
becomes determined. However, Schrödinger’s example can also be used to explain hypothesis
testing. The box represents our lack of knowledge about a specific research topic. The cat
represents the population of interest, related to the research topic. The state (vital status) of
the cat (dead or alive) is the property of interest about the population. Each possible state of
the cat is a hypothesis. The two hypotheses considered (dead or alive) are exhaustive (all
possible outcomes are considered) and mutually exclusive (they cannot be simultaneously
valid). Opening the box is equivalent to obtaining empirical information about the state of the
cat. Before opening the box, none of the possible hypotheses can be rejected because there is
not enough information to reach a conclusion. Thus, both mutually exclusive hypotheses
(states) co-exist simultaneously as non-rejected (or not yet rejected), similarly to the
superposed states described by quantum mechanics. Please notice that being non-rejected does
not imply being valid. By opening the box, the co-existence of both hypotheses as non-rejected
is no longer possible, because there is information available to reach a conclusion.
§ According to Dugald Murdoch and Robert Stoothoff, Eudoxus means literally “famous”, although the
Greek root also suggests “one of sound judgment”.
When both hypotheses are exhaustive and mutually exclusive (as in Schrödinger’s cat in a box
example), proving the falseness of one hypothesis automatically implies the validity of the
other. However, it is not always possible to be absolutely certain of the falseness of a particular
hypothesis. For example, if you are observing the experiment by remote video (avoiding
radioactive contamination), and after opening the box the cat is not moving, it might be dead
but it might also be unconscious, paralyzed, or just sleeping (alive in any of these cases). Thus
an absolute conclusion is not possible. Depending on the position of the cat, one hypothesis
might be more likely to be valid than the other, but additional information is required to reach a
decisive conclusion. This can happen because there is uncertainty in our mechanism used for
obtaining information (system of measurement), or because it is not possible to observe the
whole population of elements, but only a partial sample. Furthermore, if the partial sample is
not truly representative of the population, uncertainty increases. In other words, randomness
does not allow us to reject or accept hypotheses with absolute certainty.
The fundamental steps in any hypothesis testing procedure are the following:
1. Formulating the research question and the corresponding set of possible hypotheses.
2. Planning and executing the validation experiments.
3. Evaluating the validity of the different hypotheses in the set, from the experimental
information obtained.
2. Formulation of Hypotheses
2.1. Research Questions
Research is “a detailed study of a subject, especially in order to discover (new) information or
reach a (new) understanding.”[3] The information and understanding involved in research only
need to be new to the person carrying out the study. Thus, basically any question with
an unknown answer (at least to the person interested in it) is a research question.
In order to obtain correctly formulated hypotheses, the research questions must clearly define:
1. The population of elements subject of study (subject population).
2. The conditions under which the subject population is observed (observation
conditions).
3. The specific property (or properties) observed in the subject population under the
observation conditions (observed property).
Returning to Schrödinger’s example, we may formulate the following research question:

Q1: What is the vital status (observed property) of the cat (subject population) after
being trapped in a box with a deadly radioactive atom during a half-life period
(observation conditions)?
The subject population, the observation conditions and the observed property usually require
employing previously defined concepts. For example, how is the vital status of the cat defined?
What determines if a cat is dead or alive? If the concept is not clear for every possible audience
interested in these results, then it must be clarified. We might need
to add, for example, that the heartbeat of the cat over a certain period of time defines whether
it is dead or alive. Also, when the observed property is not directly measured, but indirectly
determined using particular measuring devices, it should also be clear how the value of such
property will be obtained. Using ambiguous concepts or incomplete information may lead to
erroneous generalizations of the results obtained. In this example, the population is a single
cat. Thus, the result obtained from this experiment cannot be generalized to all populations of
cats or living beings eventually becoming trapped in the box. If that were the case, the
research question should clearly state it:
Q2: What is the vital status (observed property) of any living being (subject population)
after being trapped in a box with a deadly radioactive atom during a half-life period
(observation conditions)?
The second research question, being more general, provides more valuable information and
knowledge. However, it is more difficult to reach an absolute conclusion since the validation
experiments should be done with all living beings.
2.2. Structure of a Hypothesis
Hypotheses must be clearly associated with a particular research question. If the research
question is not mentioned, the hypothesis must then include: 1) the subject population, 2) the
observation conditions, and 3) the observed property. In addition, a hypothesis must contain
the following elements:
4. A reference value (either numerical or categorical) for the observed property
(reference value).
5. A relation between the observed property for the subject population under the
observation conditions, and the reference value (proposed relation).
The reference value must be consistent with the observed property; otherwise the hypothesis
becomes inherently invalid. If we say that the vital status of the cat after leaving the trap is
, that statement is clearly wrong because the answer is completely unrelated to the
question.

In Schrödinger’s example the vital status of the cat (observed property) can be related to two
different categorical reference values: dead and alive. The proposed relations may be equality
or inequality. Thus, the following hypotheses can be proposed to research question Q1:
H1: The vital status of the cat IS DEAD after being trapped in a box with a deadly
radioactive atom during a half-life period. (Reference value: DEAD, proposed relation:
Equality).
H2: The vital status of the cat IS ALIVE after being trapped in a box with a deadly
radioactive atom during a half-life period. (Reference value: ALIVE, proposed relation:
Equality).
H3: The vital status of the cat IS NOT DEAD after being trapped in a box with a deadly
radioactive atom during a half-life period. (Reference value: DEAD, proposed relation:
Inequality).
H4: The vital status of the cat IS NOT ALIVE after being trapped in a box with a deadly
radioactive atom during a half-life period. (Reference value: ALIVE, proposed relation:
Inequality).
All these hypotheses include all five basic elements, assuming that the research question has not
been previously presented.
Two hypotheses are mutually exclusive when both cannot be simultaneously valid. On the other
hand, if two hypotheses are exhaustive (all possible outcomes are considered), they cannot be
simultaneously rejected since one of them should necessarily be valid. If two hypotheses are
exhaustive and mutually exclusive, they cannot be simultaneously valid or simultaneously
rejected. They can, however, be simultaneously non-rejected until sufficient experimental
information is available. H1 and H2 are exhaustive and mutually exclusive. H3 and H4 are also
exhaustive and mutually exclusive. If a certain hypothesis is valid when another hypothesis is
valid, and the former is rejected when the latter is also rejected, then the two hypotheses are
equivalent. Otherwise they are different hypotheses. In our example, the hypotheses H1 and H4
are equivalent. Also, hypotheses H2 and H3 are equivalent. For that reason, H1 and H3, as well
as H2 and H4, are also exhaustive and mutually exclusive.
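The definitions above can be checked mechanically. The following is a minimal sketch, assuming that each hypothesis is represented by the set of outcomes for which it would be valid; the names `OUTCOMES`, `exhaustive`, `mutually_exclusive` and `equivalent` are illustrative and do not come from the original text.

```python
# Outcome space for research question Q1 (assumed: only two vital statuses).
OUTCOMES = frozenset({"DEAD", "ALIVE"})

# Each hypothesis is modeled as the set of outcomes for which it is valid.
H1 = frozenset({"DEAD"})   # IS DEAD
H2 = frozenset({"ALIVE"})  # IS ALIVE
H3 = OUTCOMES - {"DEAD"}   # IS NOT DEAD
H4 = OUTCOMES - {"ALIVE"}  # IS NOT ALIVE


def exhaustive(*hypotheses):
    """True if every possible outcome is covered by at least one hypothesis."""
    return frozenset().union(*hypotheses) == OUTCOMES


def mutually_exclusive(a, b):
    """True if the two hypotheses cannot be valid at the same time."""
    return a.isdisjoint(b)


def equivalent(a, b):
    """True if both hypotheses are valid for exactly the same outcomes."""
    return a == b


print(exhaustive(H1, H2), mutually_exclusive(H1, H2))  # True True
print(equivalent(H1, H4), equivalent(H2, H3))          # True True
print(exhaustive(H1, H3), mutually_exclusive(H1, H3))  # True True
```

Under this representation, a hypothesis equal to `OUTCOMES` would be inherently valid, and an empty one inherently invalid.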
Certain hypotheses might be inherently valid (or inherently true) or inherently
invalid (or inherently false), depending on the particular definitions of the elements involved. In
these cases, no testing is required. For example, if being alive was a necessary condition for
being considered a cat, then H1 would be inherently invalid, and H2 inherently valid.

2.3. Sets of Hypotheses
A set of hypotheses is a group of different hypotheses considered as possible solutions of the
research question. Thus, H1 and H4 cannot be considered in the same set of hypotheses, since
they are not different. Furthermore, a set of hypotheses is complete if it is exhaustive, that is,
when the hypotheses included in the set together cover all possible results. Otherwise it is
incomplete. The hypotheses in the set do not necessarily have to be mutually exclusive.
However, when hypotheses are not mutually exclusive, the experimental results might not be
conclusive. It is therefore preferable to formulate complete sets of mutually exclusive
hypotheses.
Depending on the number of hypotheses considered in the set, we may have:
• Singular sets: Formed by just one single hypothesis. If a singular set is complete, it
becomes inherently valid (by definition), and no test is required. For example:
H5: The state of the cat IS DEAD OR ALIVE after being trapped in a box with a
deadly radioactive atom during a half-life period. (Reference value: DEAD OR
ALIVE, proposed relation: Equality).
This hypothesis is inherently valid since it covers any possible outcome of the research
question. Thus, no experiment is required for testing the validity of this hypothesis.
• Binary or binomial sets: They are formed by two hypotheses of the same research
question. In this case, testing is always required for evaluating the validity of each
hypothesis.
• Multiple or multinomial sets: In this case, there are three or more different hypotheses
for answering the same research question. Testing is also required for evaluating the
validity of each hypothesis.
Considering Q1, if the vital status only refers to dead or alive, a complete binary set of mutually
exclusive hypotheses is obtained with H1 and H2, for example. On the other hand, if the vital
status of the cat may also refer to: Seriously injured, slightly injured, healthy, asleep, awake,
etc. then a multiple set of hypotheses can be obtained. For this example, they are not mutually
exclusive. The cat might be found alive, healthy and awake after opening the box. Thus,
different hypotheses can be simultaneously valid in this case. Being either healthy or awake
inherently implies that the cat is alive. However, being alive does not imply being healthy or
awake. Thus, they are different hypotheses. Being alive is an entailed hypothesis of either being
healthy or awake (entailing hypotheses). If an entailing hypothesis is proved valid, the validity of
its entailed hypotheses is automatically proven. If an entailing hypothesis is proved invalid,
nothing can be said about its entailed hypotheses.
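The entailment relation described above can be illustrated with a short sketch. It assumes, purely for illustration, that the state of the cat is described by three yes/no properties (alive, healthy, awake) and that a dead cat is neither healthy nor awake; the names `STATES` and `entails` are hypothetical.

```python
from itertools import product

# Assumed outcome space: every combination of three yes/no properties,
# excluding impossible ones (a dead cat is neither healthy nor awake).
STATES = [
    (alive, healthy, awake)
    for alive, healthy, awake in product([True, False], repeat=3)
    if alive or not (healthy or awake)
]


def is_alive(state):
    return state[0]


def is_healthy(state):
    return state[1]


def is_awake(state):
    return state[2]


def entails(entailing, entailed):
    """True if `entailed` is valid in every state where `entailing` is valid."""
    return all(entailed(s) for s in STATES if entailing(s))


print(entails(is_healthy, is_alive))  # True: healthy implies alive
print(entails(is_awake, is_alive))    # True: awake implies alive
print(entails(is_alive, is_healthy))  # False: alive does not imply healthy
```

Because at least one state satisfies both `is_healthy` and `is_awake`, these two hypotheses are not mutually exclusive, which is the situation described in the text.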

The formulation of the research question may have an effect on the type of hypotheses set
obtained. Let us consider the following re-formulation of research question Q1:
Q3: Is the cat (subject population) healthy and awake (observed properties) after being
trapped in a box with a deadly radioactive atom during a half-life period (observation
conditions)?
In this case we might directly think of two possible hypotheses for answering Q3:
H6: The cat IS HEALTHY AND AWAKE after being trapped in a box with a deadly
radioactive atom during a half-life period.
H7: The cat IS NOT HEALTHY AND AWAKE after being trapped in a box with a deadly
radioactive atom during a half-life period.
H6 and H7 represent a complete binary set of mutually exclusive hypotheses.
2.4. Scientific Hypotheses
A final remark is needed in this Section regarding the meaning of a scientific hypothesis. Any
hypothesis can be considered a scientific hypothesis if its validity can be experimentally tested.
This relatively simple definition has different implications:
• The subject population (or at least a representative sample of the population) must be
available for observation.
• The observation conditions can be attained and maintained during observation of the
population. If this is not the case, such hypothesis is considered non-scientific until
technological developments make those conditions feasible.
• The properties of interest about the population can be observed (directly measured or
determined from other properties**). While accurate and precise observations are
desirable, they are not a condition for considering a hypothesis as scientific. However,
accuracy and precision will have an important effect on the uncertainty during the
validation of the hypothesis.
** Indirect determination of observed properties usually relies on previously established paradigms, valid
under certain specific conditions. If for any reason those paradigms are not valid, the conclusion
obtained might be wrong.
Let us consider as an example the following hypothesis, where none of the previous conditions
are met:
H8: The density (observed property) of the soul of the cat (subject population) IS LESS
THAN 1 ng/m³ (reference value) after being trapped in a box in the center of the Earth
with a deadly radioactive atom during a half-life period (observation conditions).
Hypothesis H8 cannot be proved valid, but also cannot be disproved because a validation
experiment is impossible. Thus, H8 is a non-scientific hypothesis††.
Notice that the scientific character of the hypothesis is already determined from the
formulation of the research question.
In the following Sections, the discussion refers only to scientific hypotheses, since
non-scientific hypotheses cannot be experimentally validated.
3. Validation Experiments
In general, a hypothesis can only be truly validated after ALL elements in the population have
been observed. Otherwise, the possibility of being a false hypothesis will remain, since one
single element of the population may disprove it (assuming a reliable, exact measurement
system for the observed property). Thus, it is usually easier to reject a hypothesis than to prove
it.
Depending on the observed property, on the range of possible observed values, and on the
proposed relation, partial results can be conclusive. Let us consider the popular election of a
certain representative between two candidates (binary set of hypotheses). Depending on the
partial results obtained, a definitive winner (or loser) can be announced before counting the
totality of votes. In close competitions however, the winner (or loser) can only be defined after
counting all the votes.
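The election example can be made concrete with a small sketch. It assumes exactly two candidates and a known number of votes still to be counted; the function name `decided_winner` and the vote counts are hypothetical.

```python
def decided_winner(votes_a, votes_b, uncounted):
    """Return the certain winner given partial counts, or None if still open.

    A candidate has certainly won when the current lead exceeds the number
    of votes not yet counted (a possible tie is treated as undecided here).
    """
    if votes_a - votes_b > uncounted:
        return "A"
    if votes_b - votes_a > uncounted:
        return "B"
    return None


print(decided_winner(600, 250, 150))  # A: B cannot catch up
print(decided_winner(480, 470, 50))   # None: all votes must be counted
```

In the first call the partial result is already conclusive, while in the second (a close competition) no hypothesis can be rejected until the remaining votes are counted.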
Only for complete binary sets of mutually exclusive hypotheses, rejection of one hypothesis
implies acceptance of the alternative hypothesis. In complete multiple sets of mutually
exclusive hypotheses, all but one hypothesis must be rejected in order to prove validity. For
incomplete sets of hypotheses, it is not possible to prove validity of a certain hypothesis by
rejecting all others. Furthermore, if the set contains inclusive hypotheses, it might be impossible
to reject all other hypotheses. That is why complete sets of mutually exclusive hypotheses
should always be used for validation.
†† Non-scientific hypotheses cannot (and must not) be denied ipso facto by a scientist. They are only
outside the scope of Science and scientific activity. Thus, scientists may have their own sets of personal
non-scientific beliefs, and this is perfectly compatible with their profession.

A particular hypothesis of a research question involving a very large population of elements
might never be validated, unless all other hypotheses (in a complete set of mutually exclusive
hypotheses) have been rejected. By decreasing the size of the population, hypotheses
validation becomes easier but the conclusion obtained will also become less relevant and less
important. Hypotheses involving larger populations and wider ranges of observation
conditions are more appealing to science, but they are more difficult to validate. Absolute
certainty about the validity of the most relevant scientific hypotheses (generalized to limitless
populations) is virtually impossible. Hypotheses validated with absolute (100%) certainty are
usually less relevant from a scientific perspective, because they commonly refer to small,
limited populations.
One of the keys for successful research is asking the right research questions, leading to the
formulation of a suitable set of hypotheses (complete set of mutually exclusive hypotheses),
which can be easily validated with a minimum number of experiments (less time and lower
costs). For that reason, complete binary sets of mutually exclusive hypotheses are preferred,
since fewer experiments are required to reject the false hypothesis than for proving the validity
of the complementary hypothesis. Also, since the goal is rejecting false hypotheses,
experiments can be designed with that purpose in mind. That is, experiments should be
designed trying to reject the less likely hypotheses first.
4. Statistical Hypothesis Testing
Unless the subject population is limited to one element, or all elements of the population
present an exactly identical (deterministic) behavior, observations of individual elements are
not relevant properties. Instead, overall (statistical) properties of the population obtained from
different observations are required. Such properties may include:
• For categorical variables: Mode and Probability (or proportion) of occurrence of
specific outcomes.
• For numerical variables: Central tendency properties (arithmetic mean, geometric
mean, harmonic mean, median, mode, midrange, etc.), Dispersion properties (standard
deviation, variance, range, interquartile range, mean absolute deviation, etc.), Relative
position properties (quantiles, ranking, etc.), Probability of occurrence of a range of
values, Probability density of a specific outcome, Probability density functions and
cumulative probability functions (reconstructed from experimental data [4]).
All these properties are usually determined or estimated from a representative sample of the
population. However, the behavior of the sample does not necessarily correspond exactly to
the behavior of the whole population. This is an important source of uncertainty in the
validation of hypotheses, and a reason why conclusions can only be reached with a limited
confidence (<100%). The lack of accuracy and precision of the systems of measurement used for
the experimental observation of the properties of the population also contributes to the
uncertainty in the validation of hypotheses. For that reason, even after measuring all elements
in the population, a 100% confident conclusion might never be reached (unless the hypothesis is
inherently valid or inherently invalid).
Let us for example assume that we want to know the proportion (p) of cats surviving
Schrödinger’s experiment described by research question Q4:
Q4: What is the proportion of cats surviving after being trapped in a box with a deadly
radioactive atom during a half-life period?
This, in principle, is a limitless population since it considers any living cat trapped in the box.
Thus, only a sample of elements in the population can be tested. Different complete binary sets
of mutually exclusive hypotheses are possible for this question, mathematically and
grammatically expressed as follows:
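Set A: HA1: $p = p_0$
The proportion of cats surviving after being trapped in a box with a deadly radioactive
atom during a half-life period IS (exactly) $p_0$.
HA2: $p \neq p_0$
The proportion of cats surviving after being trapped in a box with a deadly radioactive
atom during a half-life period IS NOT (exactly) $p_0$.
$p_0$ is a reference value for the proportion of cats. Thus, in order to avoid an inherently invalid
hypothesis it can only take values between 0 and 1 ($0 \le p_0 \le 1$).
Set B: HB1: $p \ge p_0$
The proportion of cats surviving after being trapped in a box with a deadly radioactive
atom during a half-life period IS GREATER THAN OR EQUAL TO $p_0$.
HB2: $p < p_0$
The proportion of cats surviving after being trapped in a box with a deadly radioactive
atom during a half-life period IS LESS THAN $p_0$.
Set C: HC1: $p \le p_0$
The proportion of cats surviving after being trapped in a box with a deadly radioactive
atom during a half-life period IS LESS THAN OR EQUAL TO $p_0$.
HC2: $p > p_0$
The proportion of cats surviving after being trapped in a box with a deadly radioactive
atom during a half-life period IS GREATER THAN $p_0$.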

Only for the particular cases of $p_0 = 0$ or $p_0 = 1$, a single (although reliable) experiment is
enough for rejecting one of the hypotheses and reaching a confident conclusion. Otherwise,
more experiments are needed. Also, by increasing the number of experiments performed, the
reliability of the conclusion may increase.
Now, given that uncertainty is always present (during sampling, during observation and/or
during the adjustment of the experimental conditions), the observed property will be found
with a certain confidence (probability) $C$ in the interval $[\hat{p} - \delta_L, \hat{p} + \delta_U]$, where $\hat{p}$ is an
estimation of $p$ in the presence of uncertainty, and $\delta_L$ and $\delta_U$ represent lower and upper
absolute deviations with respect to $\hat{p}$ in the limits of the interval of confidence C. Thus, the
probability of finding the observed property in the confidence interval is:
$P(\hat{p} - \delta_L \le p \le \hat{p} + \delta_U) = C$ (4.1)
The selection of the confidence level is arbitrary, and it will depend on the particular research
question considered, and the degree of uncertainty involved. If the conclusion of the research
question may have an impact on the health of a population, very high confidence levels would
be desirable (e.g. >99%). By increasing the confidence level, more effort is required to reject a
false hypothesis and thus, the time and cost of experimental validation may significantly
increase, particularly when high levels of uncertainty are involved.
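The effect of the confidence level on the interval can be sketched numerically. The text does not prescribe a method for obtaining the interval limits; the Wilson score interval for a proportion is used below only as one common choice, and the function name `wilson_interval` and the sample data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(survivors, n, confidence):
    """Wilson score interval for a proportion (an assumed method, not the author's)."""
    p_hat = survivors / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided normal quantile
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical data: 14 of 20 cats survive (estimated proportion 0.70).
for confidence in (0.90, 0.95, 0.99):
    low, high = wilson_interval(14, 20, confidence)
    print(f"C = {confidence:.2f}: [{low:.3f}, {high:.3f}]")
```

The interval widens as the confidence level increases, so fewer reference values fall outside it and rejecting a hypothesis becomes more difficult. The lower and upper deviations with respect to the estimate are generally not equal with this method.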
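Considering uncertainty, the original hypotheses can then be transformed (with a confidence
$C$, and with $\delta_L$ and $\delta_U$ the lower and upper deviations of the confidence interval around the estimate $\hat{p}$) into‡‡:
Set A*: HA*1: $p_0 - \delta_U \le \hat{p} \le p_0 + \delta_L$
HA*2: $\hat{p} < p_0 + \delta_L$ or $\hat{p} > p_0 - \delta_U$
Set B*: HB*1: $\hat{p} \ge p_0 - \delta_U$
HB*2: $\hat{p} < p_0 + \delta_L$
Set C*: HC*1: $\hat{p} \le p_0 + \delta_L$
HC*2: $\hat{p} > p_0 - \delta_U$
‡‡ HA*1: Given $p \ge \hat{p} - \delta_L$ and $p \le \hat{p} + \delta_U$, and assuming $p = p_0$, then $p_0 \ge \hat{p} - \delta_L$ and $p_0 \le \hat{p} + \delta_U$,
which can be transformed into $\hat{p} \le p_0 + \delta_L$ and $\hat{p} \ge p_0 - \delta_U$, or equivalently $p_0 - \delta_U \le \hat{p} \le p_0 + \delta_L$.
HA*2: Given $p \ge \hat{p} - \delta_L$ and $p \le \hat{p} + \delta_U$, and assuming $p \neq p_0$, or equivalently $p < p_0$ or $p > p_0$, then
$\hat{p} - \delta_L \le p < p_0$ or $p_0 < p \le \hat{p} + \delta_U$ can be expressed as $\hat{p} - \delta_L < p_0$ or $\hat{p} + \delta_U > p_0$, or
equivalently as $\hat{p} < p_0 + \delta_L$ or $\hat{p} > p_0 - \delta_U$.
HB*1: $p_0 \le p \le \hat{p} + \delta_U$, thus $\hat{p} \ge p_0 - \delta_U$
HB*2: $\hat{p} - \delta_L \le p < p_0$, thus $\hat{p} < p_0 + \delta_L$
HC*1: $\hat{p} - \delta_L \le p \le p_0$, thus $\hat{p} \le p_0 + \delta_L$
HC*2: $p_0 < p \le \hat{p} + \delta_U$, thus $\hat{p} > p_0 - \delta_U$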

These transformed sets of hypotheses remain complete but they no longer contain mutually
exclusive hypotheses. This can be graphically evidenced in Figure 1, where all the sets contain
overlapping regions. In particular, hypothesis HA*2 is inherently valid because it contains the
complete space of possible outcomes. This means that hypothesis HA2 becomes always valid in
the presence of uncertainty, and therefore lacks any scientific value.
Figure 1. Graphical representation of the validity of hypotheses considered in each of the
transformed sets of hypotheses A*, B* and C* for the proportion of surviving cats. Blue region:
Only Hypothesis 1 is valid. Red region: Only Hypothesis 2 is valid. Purple (overlaid) region:
Hypotheses 1 and 2 are both simultaneously valid.
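All other transformed hypotheses can be tested after defining new sets of hypotheses which
involve only mutually exclusive hypotheses. Thus, the following binary sets are obtained:
Set A*1: HA*1: $p_0 - \delta_U \le \hat{p} \le p_0 + \delta_L$
HA*1a: $\hat{p} < p_0 - \delta_U$ or $\hat{p} > p_0 + \delta_L$
Set B*1: HB*1: $\hat{p} \ge p_0 - \delta_U$
HB*1a: $\hat{p} < p_0 - \delta_U$
Set B*2: HB*2: $\hat{p} < p_0 + \delta_L$
HB*2a: $\hat{p} \ge p_0 + \delta_L$
Set C*1: HC*1: $\hat{p} \le p_0 + \delta_L$
HC*1a: $\hat{p} > p_0 + \delta_L$
Set C*2: HC*2: $\hat{p} > p_0 - \delta_U$
HC*2a: $\hat{p} \le p_0 - \delta_U$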
The subscript a included in the notation of the new hypotheses represents an alternative or
complement to the original hypothesis.
Now, let us consider the hypotheses set A*1. If HA*1a is rejected (proved invalid), HA*1 is
automatically accepted (proved valid). However, rejecting HA*1a does not imply that HA*2 (which
is inherently valid), and much less HA2, are invalid. In addition, if HA*1 is rejected, HA*1a is
automatically accepted, but again, it does not necessarily imply rejecting HA1. Rejecting HA*1 with
a high confidence level indicates that HA1 is highly likely invalid, but it does not prove it. For that
reason, statistical hypothesis testing results must always be carefully interpreted in order to
avoid reaching wrong conclusions.
A similar analysis can be done for all other sets of hypotheses (B*1, B*2, C*1 and C*2).
Particularly sets A*1, B*1 and C*1 are more commonly used for statistical hypothesis testing.
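As an illustration of how sets A*1, B*1 and C*1 lead to a decision, consider the following minimal sketch. It assumes that the estimate (`p_hat`), the reference value (`p0`) and the lower and upper deviations of the confidence interval (`d_low`, `d_up`) are already known; all function names and numbers are hypothetical.

```python
def retain_a1(p_hat, p0, d_low, d_up):
    """HA*1 (p = p0 under uncertainty): p0 lies inside the confidence interval."""
    return p_hat - d_low <= p0 <= p_hat + d_up


def retain_b1(p_hat, p0, d_up):
    """HB*1 (p >= p0 under uncertainty): the upper limit reaches p0."""
    return p_hat + d_up >= p0


def retain_c1(p_hat, p0, d_low):
    """HC*1 (p <= p0 under uncertainty): the lower limit does not exceed p0."""
    return p_hat - d_low <= p0


# Hypothetical numbers: estimate 0.70 with interval [0.48, 0.86] at confidence C.
p_hat, d_low, d_up = 0.70, 0.22, 0.16
print(retain_a1(p_hat, 0.50, d_low, d_up))  # True: p0 = 0.50 is not rejected
print(retain_a1(p_hat, 0.90, d_low, d_up))  # False: p0 = 0.90 is rejected
print(retain_b1(p_hat, 0.90, d_up))         # False: "p >= 0.90" is rejected
print(retain_c1(p_hat, 0.90, d_low))        # True: "p <= 0.90" is not rejected
```

A `False` result rejects the transformed hypothesis only at the chosen confidence level; as discussed above, it indicates that the original hypothesis is highly likely invalid without proving it.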
The reformulation of the sets requires determining the confidence interval limits, which
depend on the particular probability distribution of the observed property. The probability
distribution can be obtained from experimental data,[4] resulting either in arbitrary functions
or in typical functions (such as the Normal distribution, Student’s t distribution, the χ² distribution,
Fisher’s F distribution, etc.). Use of typical functions is the basis for most common statistical
tests of hypotheses, differing only in the determination of the particular confidence intervals.
These tests have been widely described in statistics textbooks.[5-8]
Even though the proportion was used as an example for discussing the mathematical
foundation of statistical hypothesis testing, this analysis is also valid for any other property
observed in the population. Furthermore, the reference value ($p_0$ in the example) does not
necessarily need to be a constant value. It can also be generally expressed by an arbitrary
non-linear function as follows:
$y_0 = f(\mathbf{u}, \mathbf{x}, \mathbf{z}, t)$ (4.2)
where $y_0$ is the reference value (non-constant) for the observed property $y$, $\mathbf{u}$ represents a
vector of external conditions, $\mathbf{x}$ is a vector of internal conditions of the subject population, $\mathbf{z}$ is
a vector of spatial coordinates or any other type of variable used to classify the elements in the
population, and $t$ is time. The function $f$ is basically a mathematical model (or theory) of the
behavior of $y$. Please also notice that a model is usually represented by the corresponding
hypothesis:
H9: $y = f(\mathbf{u}, \mathbf{x}, \mathbf{z}, t)$
Since there are practically limitless different conditions where a model can be tested, it is also
virtually impossible to validate a model. If the model is proved invalid in the presence of
uncertainties for a particular set of conditions, the model becomes likely to be invalid. As the
number of different conditions where the model is rejected under uncertainty increases, the
model is each time more likely to be invalid. Even though models cannot be validated with
absolute confidence, the likeliness of being valid can be compared for different models (or
theories). Models which are least likely to be rejected under certain conditions are considered
“working” models or paradigms (under the particular conditions considered). Paradigms are,
however, not necessarily valid.
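The comparison of competing models can be sketched as a simple tally. The example below assumes that, for each tested condition, the observed property is available as a confidence interval, and that a model is rejected at a condition when its prediction falls outside that interval; the function `count_rejections`, the two candidate models and the data are all hypothetical.

```python
def count_rejections(model, observations):
    """Count the conditions under which the model prediction falls outside
    the confidence interval of the observed value."""
    rejected = 0
    for condition, (low, high) in observations:
        prediction = model(condition)
        if not (low <= prediction <= high):
            rejected += 1
    return rejected


# Hypothetical observations: (condition, confidence interval of the property).
observations = [(1.0, (0.8, 1.2)), (2.0, (3.7, 4.3)), (3.0, (8.5, 9.5))]


def linear(x):
    return 2.0 * x


def quadratic(x):
    return x**2


print(count_rejections(linear, observations))     # 2
print(count_rejections(quadratic, observations))  # 0
```

With these made-up data the quadratic model would be retained as the working model, which only means that it has not been rejected under the conditions tested, not that it is valid.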
Different working models of the same population under the same conditions may co-exist as
long as there is not enough experimental information to make the rejection of any of them more likely.
Now, since the number of possible models describing a certain population is countless,
rejecting a model as likely invalid does not imply that any remaining model is valid (even if no
other models or theories have been proposed).
5. Concluding Remarks
Scientific knowledge can be considered as the current set of paradigms (working models or
theories) available to mankind regarding the behavior of everything in the Universe. Since
models and theories cannot be confidently validated, Science will never reach a state of
perfection. On the contrary, Science will be permanently improving by increasing the amount
of evidence for rejecting, with high likelihood, invalid models and theories, while retaining
only the best working models and theories.
For a scientist, absolutely valid paradigms should not exist (excluding, of course, those
inherently valid in their formulation, as is the case of theorems). Assuming the absolute
validity of a non-inherently valid paradigm makes it a scientific dogma, and scientific dogmas
endanger the progress of Science. Furthermore, the laws of Nature should not be conceived as
scientific dogmas. A law of Nature is defined as “an empirical truth of great generality, conceived
of as a physical (but not a logical) necessity, and consequently licensing counterfactual
conditionals.”[9] Thus, a law of Nature is not necessarily universally valid; it is just that, by
inductive reasoning, it is very probable that it will again hold true the next time that it is
tested.[10] This is simply a manifestation of the faith of scientists in probabilities, and not in
the law itself.
On the other hand, theories highly likely to be invalid should not be absolutely rejected as long
as there is any slight possibility of their being valid. This is because uncertainty in our current
systems of measurement, in our current technologies for controlling the experimental conditions,
and in our sampling procedures might have affected the past experimental results, leading to
erroneous conclusions. Thus, Science is continuously evolving by improving accuracy and
precision of our systems of measurement, by improving our technologies for controlling and
expanding the range of experimental conditions, by improving our sampling procedures
(and/or increasing sample sizes), by reaching new subject populations, by proposing
novel/better models and theories, and thus, by improving our current set of working models
and theories explaining our Universe (scientific knowledge).
The acceptance of a particular theory by the vast majority of scientists (or of the human
population in general) does not make it valid. In that sense, Science should not be a democracy;
otherwise, scientific revolutions would never be possible.[11] In addition, scientific authorities cannot dictate
whether a theory is valid or not (without any verifiable experimental evidence). Thus, Science
should not be an aristocracy and much less an autocracy. Science should be a meritocracy, where
models and theories are accepted or rejected based only on the objective evaluation of
experimental results, considering the effect of uncertainty, of course. Scientific policies should
encourage the search for different, alternative models and theories to be tested, which will
inevitably promote scientific progress. Scientists should focus on searching for evidence
to reject current working models and theories (or to reduce their likelihood of being valid).
This will strengthen those paradigms more likely to be valid, while weakening those less likely
to be valid. The mission of a scientist should not be proving theories or hypotheses, but
disproving them. By rejecting hypotheses, the number of coexistent non-rejected hypotheses
will be reduced, and science will progress that way.
Acknowledgments
The author gratefully acknowledges Prof. Dr. Silvia Ochoa (Universidad de Antioquia) for
helpful discussions on the topic.
This research did not receive any specific grant from funding agencies in the public,
commercial, or not-for-profit sectors.
References
[1] Descartes, R. (1985). The Search for Truth by means of the Natural Light. In: The
Philosophical Writings of Descartes. Translated by J. Cottingham, R. Stoothoff and D. Murdoch.
Volume 2. Cambridge: Cambridge University Press. p. 416.
[2] Schrödinger, E. (1935). Die gegenwärtige Situation in der Quantenmechanik.
Naturwissenschaften, 23, 807-812.
[3] Research (noun). Cambridge Dictionary Online. Cambridge University Press. Retrieved
January 11, 2020 from https://dictionary.cambridge.org/dictionary/english/research.
[4] Hernandez, H. (2018). Comparison of Methods for the Reconstruction of Probability Density
Functions from Data Samples. ForsChem Research Reports, 3, 2018-12. doi:
10.13140/RG.2.2.30177.35686.
[5] Walpole, R. E., Myers, R. H., Myers, S. L., & Ye, K. (1993). Probability and statistics for
engineers and scientists. New York: Macmillan.
[6] Montgomery, D. C., & Runger, G. C. (2010). Applied statistics and probability for engineers.
John Wiley & Sons.
[7] Devore, J. L. (2011). Probability and Statistics for Engineering and the Sciences. Cengage
Learning.
[8] Hartshorn, S. (2015). Hypothesis Testing: A Visual Introduction to Statistical Significance.
[9] Law of Nature (noun). Collins Dictionary Online. Collins. Retrieved January 15, 2020 from
https://www.collinsdictionary.com/dictionary/english/law-of-nature.
[10] Davies, P. (1992). The Mind of God: The Scientific Basis for a Rational World. Simon &
Schuster.
[11] Kuhn, T. S. (2012). The Structure of Scientific Revolutions. 4th Ed. The University of Chicago
Press.