
RESEARCH IN ELT I

QUANTITATIVE APPROACH

BY
DR.SUWANDI M.PD

IKIP PGRI SEMARANG

PART I. PRELIMINARY UNDERSTANDING


A. The Development of Modern Problem Solving
Many centuries ago human beings were trying to develop what seemed to be sensible
explanations. However, the explanations were often rejected if they seemed to conflict
with the dogma of religious authority. Curious men who raised questions were often
punished, and even put to death, when they persisted in expressing doubts suggested by
such unorthodox explanations of natural phenomena.
The reliance on empirical evidence or personal experience challenged the sanction of
vested authority and represented an important step in the direction of scientific inquiry.
Such pragmatic observation, however, was largely unsystematic and further limited by
the lack of an objective method. The observer was likely to overgeneralize on the basis of
incomplete experience or evidence, to ignore complex factors operating simultaneously,
or to let his feelings and prejudices influence both his observations and his conclusions. It
was only when man began to think systematically about thinking itself that the era of
logic began. The first systematic approach to reasoning, attributed to Aristotle and the
Greeks, was the deductive method. The categorical syllogism was one model of thinking
that prevailed among other philosophers. Syllogistic reasoning established a logical
relationship between a major premise, a minor premise, and a conclusion. A major
premise is a self-evident assumption, previously established by metaphysical truth or
dogma, concerning a relationship; a minor premise is a particular case related to the
major premise. The logical relationship between these premises leads to an inescapable
conclusion.
A typical Aristotelian categorical syllogism follows:
Major premise: All birds have wings.
Minor premise: A dove is a bird.
Conclusion: A dove has wings.
This deductive method, moving from the general assumption to the specific application, made an
important contribution to the development of modern problem solving, but it was not
fruitful in arriving at new truths. The acceptance of an incomplete or false major premise
based on old dogmas or unreliable authority could only lead to error.

Centuries later, Francis Bacon advocated the application of direct observation of
phenomena, arriving at conclusions or generalizations through evidence of many
individual observations. This inductive process of moving from specific observations to
the generalization freed logic from some of the hazards and limitations of deductive
thinking.
The method of inductive reasoning proposed by Bacon, a method new to the field of
logic, but widely used by the scientists of his time, was not hampered by false premises,
by the inadequacies and ambiguities of verbal symbolism, or by the absence of
supporting evidence. But the inductive method alone did not provide a completely
satisfactory system for the solution of problems. Random collections of individual
observations without a unifying concept or focus often obscured investigations and therefore
rarely led to a generalization or theory.
The major premise of the older deductive method was gradually replaced by an
assumption or hypothesis which was subsequently tested by the collection and logical
analysis of data. This deductive-inductive method is now recognized as an example of a
scientific approach.
John Dewey suggested a pattern that is helpful in identifying the elements of a
deductive-inductive process:
The method of science
1. Identification and definition of the problem
2. Formulation of a hypothesis: a hunch, an assumption, or an intelligent guess
3. Collection, organization, and analysis of data
4. Formulation of conclusions
5. Verification, rejection, or modification of the hypothesis by the test of its consequences
in a specific situation.
B. What is Research?
Research is commonly defined as a systematic approach to providing answers to
questions (Tuckman, 1978: 1). This definition implies that research has three
basic components: a systematic approach, answers, and questions. In
other words, someone who wants to carry out research has a question that needs an
answer, and the way to find that answer is to use a systematic approach.
In line with Tuckman's definition, Nunan (1992) states that research is a process
which involves: (a) defining a problem, (b) stating an objective, and (c) formulating a
hypothesis. It involves gathering information, classification, analysis, and interpretation
to see to what extent the initial objective has been achieved. He further states that research is
carried out in order to:
- get a result with scientific methods objectively, not subjectively.
- solve problems, verify the application of theories, and lead on to new insights.
- enlighten both the researcher and any interested readers.
- prove or disprove new or existing ideas, to characterize phenomena (i.e., the language
characteristics of a particular population), and to achieve personal and community aims,
that is, to satisfy the individual's quest but also to improve community welfare.
- prove or disprove, demystify, carry out what is planned, support a point of view,
uncover what is not known, satisfy inquiry, discover the cause of a problem, find
the solution to a problem, etc.
Best (1977: 8) states that research is the systematic and objective analysis and
recording of controlled observations that may lead to the development of generalizations,
principles, or theories, resulting in prediction and ultimate control of many events that
may be consequences or causes of specific activities. From the definitions
above, we can conclude that we basically do research because there is a problem that
needs a solution, which may involve finding the relationship between two or more
variables. Research demands accurate observation and description. The researcher uses
quantitative, numerical measuring devices, the most precise means of description. The
researcher selects valid and reliable instruments in order to collect the data. Research
involves gathering new data from primary or first-hand sources or using existing data
for a new purpose. Last but not least, research emphasizes the development of
generalizations, principles, or theories that will be helpful in predicting future
occurrences.
Basic research is concerned with the relationship between two or more variables. It is
carried out by identifying a problem, examining selected relevant variables through a
literature review, constructing a hypothesis where possible, creating a research design to
investigate the problem, collecting and analyzing appropriate data, and then drawing
conclusions about the relationships of the variables. The purpose of basic research is to
develop a model, or a theory, that identifies all the relevant variables in a particular
environment and hypothesizes about their relationship. Then, using the findings of basic
research, it is possible to develop a product, "product" here being used to include, for
example, a given curriculum, a particular teacher-training program, a textbook, or an
audio-visual aid.

C. STEPS IN THE RESEARCH PROCESS


There are several steps in conducting research. The following steps, each
of which will be discussed briefly, are: identifying a problem,
constructing a hypothesis, reviewing the literature, identifying and labeling variables,
constructing operational definitions, manipulating and controlling variables,
constructing a research design, identifying and constructing devices for observation
and measurement, constructing questionnaires and interview schedules, carrying out
statistical analyses, and writing a research report.
1. Identifying a Problem
Choosing a suitable problem is the most difficult phase for a student. Select a
research problem which is interesting and practical. There are some characteristics of a
problem that should be taken into consideration in order to get the most suitable one.
a. It should ask about a relationship between two or more variables.
In this kind of problem the researcher manipulates a minimum of one variable to
determine its effect on other variables, as opposed to a purely descriptive study in
which the researcher observes, or in some way measures, the frequency of
appearance of a particular variable in a particular setting. For example, in a
descriptive study the problem might be: How many students in school X have I.Q.s
in excess of 120? By contrast, the problem "Are boys more likely than girls to have
I.Q.s in excess of 120?" asks about a relationship between two variables, sex and I.Q.
b. It should be stated clearly and unambiguously, usually in question form.
For instance:
What is the relationship between I.Q. and achievement?
Do students learn more from a directive teacher or a nondirective teacher?
Is there a relationship between racial background and dropout rate?
c. It should be testable by empirical methods.
A problem should be testable by empirical methods, that is, through the collection of
data. Moreover, for a student's purposes, it should lend itself to study by a single
researcher, on a limited budget, within a year. The nature of the variables included in
the problem is a good clue to its testability.
d. It should not represent a moral or ethical position.
Questions about ideals or values are often more difficult to study than questions about
attitudes or performance. Examples of problems that would be difficult to test are:
Should people disguise their feelings? Should children be seen and not heard?
2. Constructing a Hypothesis
Once a problem has been identified, the researcher often employs the logical processes
of deduction and induction to formulate an expectation of the outcome of the study. That
is, he or she hypothesizes about the relationship between the concepts identified in the
problem. A hypothesis, a suggested answer to the problem, has the following
characteristics:
a. It should hypothesize upon a relationship between two or more variables.
b. It should be stated clearly and unambiguously in the form of a declarative
sentence.
c. It should be testable; that is, it should be possible to restate it in an operational
form that can then be evaluated based on data.
Question: Is there a drinking problem among students?
Thesis: There is a drinking problem among students at this college.

The thesis is a statement of belief. The opposite belief gives the opposite thesis: There
is no drinking problem on campus. The next step is to sharpen the thesis, which means
thinking of ways to identify what we mean by "drinking problem". The refining of the
thesis might go like this:
Thesis 1. There are students on this campus who have a drinking problem.
Thesis 2. Students who drink excessively have problems.
Thesis 3. Students who drink excessively have problems with their school work.
Suppose we decide to focus this topic on the individual students who get into trouble
through drinking. Let's follow the question-and-answer process through a refining of the
thesis:
Question: Are there students on campus who drink excessively?
Thesis: There are students on campus who drink excessively.
Question: Do students who drink excessively get into trouble from drinking?
Thesis: Excessive drinking gets students into trouble.
Question: What kind of trouble does excessive drinking get students into?
Thesis: Students who drink excessively have problems with their academic work.
Thesis: Students who drink excessively have problems getting along with their
peers.
Thesis: Students who drink excessively have problems with their families.

The thesis becomes a tool for working out your idea. Once you have sharpened your
thesis, the next step is to develop testable hypotheses. These are the cutting edges
necessary to carve out some information for the report. The sharper the hypothesis, the
neater the information.

A hypothesis is a tentative proposition about a relationship between two or more
phenomena, a proposition that can be empirically tested to be either true or false. These
phenomena are called variables, which must be measurable. The hypothesis must specify
how the variables are related. A thesis that states "variables A and B are related" may
have hypotheses such as "the greater A, the greater B" or "the more A, the less B".
Hypothesis: Excessive drinkers will have lower grade point averages than will
moderate drinkers or teetotalers among the students.
Hypothesis: Excessive drinkers are more frequently dismissed from college than are
moderate drinkers or teetotalers among the students.
Hypothesis: Excessive drinkers will have grade point averages that decline faster than
those of moderate drinkers or nondrinkers among the students.
Hypotheses can likewise be derived from the problem statements given earlier. Other
examples are as follows:
Hypothesis: I.Q. and achievement are positively related.
Hypothesis: Directive teachers are more effective than nondirective teachers.
Hypothesis: The dropout rate is higher for black students than for white students.
Where do hypotheses come from?
Given a problem statement, for example, "Are A and B related?", there are three possible
hypotheses. These are:
a. Yes, as A increases so does B.
b. Yes, as A increases B decreases.
c. No, A and B are unrelated.
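The three possible answers to "Are A and B related?" can be illustrated with a small sketch that looks at the sign of the Pearson correlation coefficient. This is only an illustration under stated assumptions: the scores are invented, the function names are hypothetical, and the cut-off of 0.1 is an arbitrary choice (a real study would use a significance test rather than a fixed threshold).

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation of two paired lists (assumes both lists vary)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def direction_of_relationship(xs, ys, threshold=0.1):
    """Label the observed relationship; the threshold is an assumed cut-off."""
    r = pearson_r(xs, ys)
    if r > threshold:
        return "positive"   # as A increases, so does B
    if r < -threshold:
        return "negative"   # as A increases, B decreases
    return "none"           # no clear linear relationship observed


# Invented I.Q. and achievement scores for six students.
iq = [95, 100, 105, 110, 120, 130]
achievement = [60, 62, 70, 71, 80, 88]
print(direction_of_relationship(iq, achievement))  # prints "positive"
```

The sketch only describes the sample at hand; whether the relationship holds in the population is what the hypothesis test is for.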
After deciding that the relationship between variables A and B is the problem to be
studied, the researcher has two logical processes to draw upon in developing a
hypothesis. These processes are called deduction and induction. Deduction is a process
that goes from the general to the specific. For example, one may deduce that people
spend less time doing what they do well because they are able to perform efficiently. It
can also be generally stated that people spend more time doing the things they do well
because they like them and less time doing the things they do poorly because they prefer
to avoid them. Thus, when general expectations about events, based on presumed
relationships between variables, are used to arrive at more specific expectations, that
process is called deduction.
In the inductive process the researcher starts with specific observations and combines
them to produce a more general statement of relationship, namely a hypothesis. Induction
begins with data and observations (empirical events) and proceeds towards hypotheses
and theories, while deduction begins with theories and general hypotheses and proceeds
towards specific hypotheses (or anticipated observations).
Constructing Alternative Hypotheses
From any problem statement, it is generally possible to derive more than one
hypothesis. As an example, a researcher interested in the possible relationship between
birth order and achievement motivation asks: Are firstborn children more likely to pursue
higher education than later-born children? There are three possible hypotheses:
a. Firstborns are more likely to pursue higher education than later-borns.
b. Firstborns are less likely to pursue higher education than later-borns.
c. Firstborns and later-borns are equally likely to pursue higher education.
Researchers formulate hypotheses using induction and deduction, thus giving due
consideration to both potentially relevant theory and prior research findings. Since one of
the goals of research is to produce the pieces for generalizable bodies of theory that will
provide answers to practical problems, the researcher, where possible, should try to work
out of or toward a general theoretical base. Hypothesis construction and testing enable
researchers to generalize their findings beyond the specific conditions in which they were
obtained.
3. Reviewing the Literature
Every serious piece of research includes a review of relevant research (more extensive
in a dissertation, for example, than in a journal article, where space is at a premium).
Research begins with ideas and concepts that are related to one another through
hypotheses, that is, expected or anticipated relationships. These expectations are then
tested by transforming or operationalizing the ideas and concepts into procedures for the
collection of data. Results or findings based on these data are then interpreted and
extended by converting them into new concepts. Ideas and concepts to some extent come
from the collective body of prior work referred to as the literature. For example, reference to
relevant studies helps to uncover the following:
- ideas about variables that have proven important and unimportant in a given field
of study,
- information about work that has already been done and which can be
meaningfully extended or applied,
- meanings of and relationships between variables that you have chosen to study
and wish to hypothesize about.

4. Identifying and Labeling Variables
There are several kinds of variables which must be considered in research, among
them the independent variable, dependent variable, moderator variable, control variable,
and intervening variable. The following is a brief description of each.
a. The independent variable
The independent variable is that factor which is measured, manipulated, or selected by the
experimenter to determine its relationship to an observed phenomenon. If an
experimenter studying the relationship between two variables, X and Y, asks
himself, "What will happen to Y if I make X greater or smaller?", he is thinking of
variable X as his independent variable. He considers it independent because he is
interested in how it affects another variable, not in what affects it.
b. The dependent variable
The dependent variable is a response variable or output. The dependent variable is
that factor which is observed and measured to determine the effect of the
independent variable, that is, that factor that appears, disappears, or varies as the
experimenter introduces, removes, or varies the independent variable.
c. The moderator variable

The moderator variable is defined as that factor which is measured, manipulated,
or selected by the experimenter to discover whether it modifies the relationship of
the independent variable to an observed phenomenon. Consider this illustration.
Suppose you want to compare the effectiveness of a visual approach (using
pictures) with an auditory approach (using audiotapes) for teaching grammar. Suppose you
further suspect that while one method may be more effective for students who
learn best in a visual mode, the other may be more effective for those who learn
best in an auditory mode. When all students are tested for achievement at the end
of the semester, the results of the two approaches may appear to be the same; but
when visual learners are separated from auditory ones, the two approaches may
have different results in each subgroup. If so, learning mode would be seen to
moderate the relationship between instructional approach (the independent
variable) and effectiveness (the dependent variable). The data analysis used is
usually analysis of variance or regression analysis. The following are two
examples of moderator variables.
1). Hypothesis 1. Male experimenters get more effective performances from
both male and female subjects than do female experimenters, but they are
singularly most effective with male subjects.
Independent variable: the sex of the experimenter
Moderator variable: the sex of the subject
Dependent variable: effectiveness of performance of subjects
2). Hypothesis 2. Grade-point average and intelligence are more highly
correlated for boys than for girls.
Independent variable: either grade-point average or intelligence may be
considered the independent variable; the other, the dependent variable.
Moderator variable: sex (boys versus girls)
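The visual versus auditory illustration above can be made concrete with a small sketch. The scores below are invented so that the two approaches look identical when all students are pooled but differ sharply once learners are separated by learning mode. The record layout and function names are assumptions made for this example, and comparing cell means is only a descriptive first look, not the analysis of variance itself.

```python
from statistics import mean

# Invented end-of-semester grammar scores:
# (teaching approach, student's learning mode, score)
records = [
    ("visual", "visual", 85), ("visual", "visual", 87),
    ("visual", "auditory", 65), ("visual", "auditory", 67),
    ("auditory", "visual", 66), ("auditory", "visual", 64),
    ("auditory", "auditory", 86), ("auditory", "auditory", 88),
]


def mean_by(rows, key):
    """Average the score (third field) within each group defined by key."""
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row[2])
    return {group: mean(scores) for group, scores in groups.items()}


def interaction_contrast(cells):
    """Difference between the approach effect for visual and auditory learners."""
    gap_visual_learners = cells[("visual", "visual")] - cells[("auditory", "visual")]
    gap_auditory_learners = cells[("visual", "auditory")] - cells[("auditory", "auditory")]
    return gap_visual_learners - gap_auditory_learners


# Pooled over learning mode: the two approaches look the same.
overall = mean_by(records, key=lambda row: row[0])
# Split by learning mode: each approach works best for "its" learners.
cells = mean_by(records, key=lambda row: (row[0], row[1]))

print(overall)
print(cells)
print(interaction_contrast(cells))
```

A contrast far from zero suggests that learning mode moderates the effect of the teaching approach; a two-way analysis of variance would then test whether that interaction is statistically reliable.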
d. The control variable
Control variables are defined as those factors which are controlled by the
experimenter to cancel out or neutralize any effect they might otherwise have on the
observed phenomenon. While the effects of control variables are neutralized, the effects of
moderator variables are studied. The following are examples of control variables.
Hypothesis 1. Among boys there is a correlation between physical size and social
maturity, but for girls in the same age group there is no
correlation between these two variables.
Control variable: age
Hypothesis 2. Under intangible reinforcement conditions, middle-class children will learn
significantly better than lower-class children.
Control variable: reinforcement conditions
e. The intervening variable
An intervening variable is that factor which theoretically affects the observed
phenomenon but cannot be seen, measured, or manipulated; its effect must be inferred
from the effects of the independent and moderator variables on the observed phenomenon.
Consider the intervening variable in the following hypothesis: Teachers given more
positive feedback experiences will have more positive attitudes toward children than
teachers given fewer positive feedback experiences.
Independent variable: number of positive feedback experiences for the teacher
Intervening variable: the teacher's self-esteem
Dependent variable: positiveness of the teacher's attitude toward students

PART II. QUANTITATIVE APPROACH TO RESEARCH
A. THE RESEARCH SPECTRUM
In order to place the steps and procedures already described in the previous discussion
and to outline the sequence of activities in research that form the basis for this book,
a schematic of the research spectrum is presented in the following figure.

[Figure omitted. Box labels: problem; rel. theory; rel. findings; hypotheses; predictions; measurement dev.; exp. design; findings]

Figure 1: The research spectrum


Note that the research spectrum is intended to show the flow of ideas in conducting
research. Research begins with a problem and the utilization of both theories and
findings, located through a thorough literature search, in arriving at hypotheses. These
hypotheses contain variables which must be labeled and then operationally defined to
construct predictions. These steps might be considered the logical stages of research.
These stages are followed by the methodological stages which culminate in the
development of research design, the development of measures, and finally in the findings
themselves.

B. QUANTITATIVE APPROACH TO RESEARCH


Creswell (2003: 18) states that there are three major approaches to research:
the quantitative approach, the qualitative approach, and the mixed methods approach.
A quantitative approach is one in which the investigator primarily uses postpositivist
claims for developing knowledge (i.e., cause and effect thinking, reduction to specific
variables and hypotheses and questions, use of measurement and observation, and the test
of theories), employs strategies of inquiry such as experiments and surveys, and
collects data on predetermined instruments that yield statistical data.
A qualitative approach is one in which the inquirer often makes knowledge claims
based primarily on constructivist perspectives (i.e., the multiple meanings of individual
experiences, meanings socially and historically constructed, with an intent of developing
a theory or pattern) or advocacy/participatory perspectives (i.e., political, issue-oriented,
collaborative, or change-oriented) or both. It also uses strategies of inquiry such as
narratives, phenomenologies, ethnographies, grounded theory studies, or case studies.
The researcher collects open-ended, emerging data with the primary intent of developing
themes from the data.
A mixed methods approach is one in which the researcher tends to base knowledge
claims on pragmatic grounds (e.g., consequence-oriented, problem-centered, and
pluralistic). It employs strategies of inquiry that involve collecting data either
simultaneously or sequentially to best understand research problems. The data collection
also involves gathering both numeric information (e.g., on instruments) as well as text
information (e.g., on interviews) so that the final database represents both quantitative
and qualitative information.
In this part, the quantitative approach will be the focus of the discussion.
According to Creswell (2003: 153), survey and experimental research methods belong to
the quantitative approach. Therefore, the essential components of a survey and of an
experimental research method will be discussed.
C. SURVEY
A survey design provides a quantitative or numeric description of trends, attitudes, or
opinions of a population by studying a sample of that population. From sample results,
the researcher generalizes or makes claims about the population. In an experiment,
investigators may also identify a sample and generalize to a population; however, the
basic intent of an experiment is to test the impact of a treatment (or an intervention) on
an outcome, controlling for all other factors that might influence that outcome.
Cohen, Manion and Morrison (2007: 206) state that a survey has several
characteristics. Typically, a survey:
- gathers data on a one-shot basis and hence is economical and efficient
- represents a wide target population (hence there is a need for careful sampling)
- generates numerical data
- provides descriptive, inferential and explanatory information
- manipulates key factors and variables to derive frequencies (e.g. the numbers
registering a particular opinion or test score)
- gathers standardized information (i.e. using the same instruments and questions
for all participants)
- ascertains correlations (e.g. to find out if there is any relationship between gender
and scores)
- presents material which is uncluttered by specific contextual factors
- captures data from multiple choice, closed questions, test scores or observation
schedules
- supports or refutes hypotheses about the target population
- generates accurate instruments through their piloting and revision
- makes generalizations about, and observes patterns of response in, the targets of
focus
- gathers data which can be processed statistically
- usually relies on large-scale data gathering from a wide population in order to
enable generalizations to be made about given factors or variables.

The following are examples of surveys:
- opinion polls, which refute the notion that only opinion polls can catch opinions
- test scores (e.g. the results of testing students nationally or locally)
- students' preferences for particular courses (e.g. humanities, sciences)

D. THE SURVEY DESIGN


Three prerequisites to the design of any survey are: the specification of the exact
purpose of the inquiry; the population on which it is to focus; and the resources that are
available.
1) The Purpose of the Inquiry/Survey Research
- Identify the purpose of the survey research. This purpose is to generalize from a
sample to a population so that inferences can be made about some characteristic,
attitude, or behavior of this population.
- Indicate why a survey is the preferred type of data collection procedure for the
study. In this rationale, consider the advantages of survey design, such as the
economy of the design and the rapid turnaround in data collection. Discuss the
advantage of identifying attributes of a large population from a small group of
individuals.
- Indicate whether the survey will be cross-sectional, with the data collected at one
point in time, or whether it will be longitudinal, with data collected over time.
- Specify the form of data collection, whether it is self-administered questionnaires,
interviews, structured record reviews to collect financial, medical, or school
information, or structured observation.

2) The Population and Sample
- Identify the population in the study. Also state the size of this population, if size
can be determined, and the means of identifying individuals in the population.
- Identify whether the sampling design for this population is single-stage or multistage
(called clustering). Cluster sampling is ideal when it is impossible or impractical
to compile a list of the elements composing the population. A single-stage
sampling procedure is one in which the researcher has access to names in the
population and samples the people (or other elements) directly. In a multistage,
clustering procedure, the researcher first samples groups or organizations (or
clusters), obtains names of individuals within groups or clusters, and then samples
within the clusters.
- Identify the selection process for individuals. It is recommended to select a
random sample in which each individual in the population has an equal
probability of being selected.
- Identify whether the study will involve stratification of the population before
selecting the sample. Stratification means that specific characteristics of
individuals (e.g., both females and males) are represented in the sample and
the sample reflects the true proportion of individuals with certain characteristics in
the population.
- Discuss the procedure for selecting the sample from available lists. The most
rigorous method for selecting the sample is to choose individuals using a random
numbers table, a table available in many introductory statistics texts.
- Indicate the number of people in the sample and the procedures used to compute
this number. In survey research, it is recommended that one use a sample size
formula available in many survey texts.
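The sampling choices listed above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function names are hypothetical, a pseudo-random number generator stands in for the random numbers table, and the sample size function uses Cochran's formula with a finite population correction, which is one commonly cited formula (the text does not say which formula is intended).

```python
import math
import random


def simple_random_sample(population, n, seed=None):
    """Single-stage sampling: every individual has an equal chance."""
    return random.Random(seed).sample(population, n)


def stratified_sample(population, stratum_of, n, seed=None):
    """Proportional stratified sampling: each stratum keeps its population share.

    Rounding may make the total differ slightly from n for awkward proportions.
    """
    rng = random.Random(seed)
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample


def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Multistage sampling: first sample clusters, then individuals within them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}


def cochran_sample_size(population_size, margin=0.05, z=1.96, p=0.5):
    """Cochran's formula with finite population correction (an assumed choice)."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population_size))


# Invented population: 60 female and 40 male students.
students = [("F", i) for i in range(60)] + [("M", i) for i in range(40)]
by_sex = stratified_sample(students, stratum_of=lambda s: s[0], n=10, seed=1)
print(sum(1 for s in by_sex if s[0] == "F"), "female,",
      sum(1 for s in by_sex if s[0] == "M"), "male")  # prints "6 female, 4 male"

# Invented clusters: three schools with 20 students each.
schools = {"School A": list(range(20)),
           "School B": list(range(20, 40)),
           "School C": list(range(40, 60))}
print(two_stage_cluster_sample(schools, n_clusters=2, n_per_cluster=5, seed=1))

# Sample size for a population of 1000 at a 5% margin of error, 95% confidence.
print(cochran_sample_size(1000))  # prints 278
```

Fixing the seed makes a draw reproducible, which is useful when the selection procedure has to be reported in the proposal.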

3) Instrumentation
As part of rigorous data collection, the proposal developer also provides detailed
information about the actual survey instrument to be used in the proposed study. Consider
the following:
- Name the survey instrument used to collect data in the research study. Discuss
whether it is an instrument designed for this research, a modified instrument, or
an intact instrument developed by someone else.
- When using an existing instrument, describe the established validity and reliability of
scores obtained from past use of the instrument. This means reporting efforts by
authors to establish validity, that is, whether one can draw meaningful and useful
inferences from scores on the instruments. The three traditional forms of validity
to look for are content validity (i.e., Do the items measure the content they were
intended to measure?), predictive or concurrent validity (i.e., Do scores predict a
criterion measure? Do results correlate with other results?), and construct validity
(i.e., Do items measure hypothetical constructs or concepts?). In more recent
studies, construct validity has also included whether the scores serve a useful
purpose and have positive consequences when used.
- Include sample items from the instrument so that readers can see the actual items
used. In an appendix to the proposal, attach sample items from the instrument or
the entire instrument.
- Indicate the major content sections in the instrument, such as the cover letter, the
items (e.g., demographics, attitudinal items, behavioral items, factual items) and
the closing instructions. Also mention the type of scales used to measure the items
on the instrument, such as continuous scales (e.g., strongly agree to strongly
disagree) and categorical scales (e.g., yes/no, rank from highest to lowest
importance).
- Discuss plans for pilot testing or field testing the survey and provide a rationale
for these plans. The testing is important to establish the content validity of an
instrument and to improve questions, format, and the scales. Indicate the number
of people who will test the instrument and the plans to incorporate their comments
into final instrument revisions.

4) Variables in the Study
It is also important to relate the variables, the research questions, and the items on the
survey instrument so that a reader can easily determine how the researcher will use
the questionnaire items. Table 1.2 illustrates such relations using hypothetical data.

Table 1.2
Variable name | Research question | Item on survey
Independent variable: prior publications | Descriptive research question: How many publications did the faculty member produce prior to receipt of the doctorate? | See questions 11, 13, 14, 15: publication counts before doctorate for journal articles, books, conference papers, book chapters.
Dependent variable: grants funded | Descriptive question: How many grants has the faculty member received in the last 3 years? | See questions 16, 17, and 18: grants from foundations, federal grants, state grants.
Control variable: tenure status | Descriptive research question: Is the faculty member tenured? | See question 19: tenured (yes/no).

5) Data Analysis
In the proposal, present information about the steps involved in analyzing the data. It
is recommended to present them as a series of steps.
Step 1. Report information about the number of members of the sample who did and did
not return the survey. A table with numbers and percentages describing respondents and
non-respondents is a useful tool to present this information.
Step 2. Discuss the method by which response bias will be determined. Response bias is
the effect of non-responses on survey estimates. Bias means that if non-respondents had
responded, their responses would have substantially changed the overall results of the
survey. Mention the procedures used to check for response bias, such as wave analysis or
a respondent/non-respondent analysis. In wave analysis, the researcher examines returns
on select items week by week to determine if average responses change. This rests on the
assumption that those who return surveys in the final weeks of the response period are
nearly non-respondents; if their responses begin to change, a potential exists for
response bias. An alternative check is to contact a few non-respondents by phone
and determine if their responses differ substantially from those of respondents. This
constitutes a respondent/non-respondent check for response bias.
Step 3. Discuss a plan to provide a descriptive analysis of data for all independent and
dependent variables in the study. This analysis should indicate the means, standard
deviations, and range of scores for these variables.
Step 4. If the proposal contains an instrument with scales or a plan to develop scales
(combining items into scales), identify the statistical procedure for accomplishing this.
Also mention reliability checks for the internal consistency of the scales.
Step 5. Identify the statistics and the statistical computer program for testing the major
questions or hypotheses in the proposed study.
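The wave analysis mentioned in Step 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the (week, score) input format, and the choice to compare the final week against all earlier weeks are hypothetical and do not come from the text.

```python
from statistics import mean


def wave_means(returns):
    """Average score on one selected item for each return week.

    `returns` is a list of (week_number, score) pairs (hypothetical format).
    """
    by_week = {}
    for week, score in returns:
        by_week.setdefault(week, []).append(score)
    return {week: mean(scores) for week, scores in sorted(by_week.items())}


def late_wave_shift(returns, last_n_weeks=1):
    """Mean of the latest wave(s) minus the mean of all earlier waves.

    A value far from zero suggests that late returners (treated here as
    stand-ins for non-respondents) answer differently.
    """
    weeks = sorted({week for week, _ in returns})
    if len(weeks) <= last_n_weeks:
        raise ValueError("need at least one earlier wave to compare against")
    late_weeks = set(weeks[-last_n_weeks:])
    late = [score for week, score in returns if week in late_weeks]
    early = [score for week, score in returns if week not in late_weeks]
    return mean(late) - mean(early)
```

How large a shift counts as "substantial" is a judgment the researcher still has to justify in the proposal; the sketch only makes the week-by-week comparison explicit.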

PART III. QUANTITATIVE METHOD DESIGN

This part presents essential steps in designing a quantitative method for a research
proposal or study with specific focus on experimental and survey modes of inquiry.
A survey design provides a quantitative or numeric description of trends, attitudes, or
opinions of a population by studying a sample of that population. From sample results,
the researcher generalizes or makes claims about the population. In an experiment,
investigators may also identify a sample and generalize to a population; however, the
basic intent of an experiment is to test the impact of a treatment (or an intervention) on an
outcome, controlling for all other factors that might influence that outcome. As one form
of control, researchers randomly assign individuals to groups. When one group receives
a treatment and the other group does not, the experimenter can isolate whether it is the
treatment and not the characteristics of individuals in a group (or other factors) that
influences the outcome.
A. COMPONENTS OF AN EXPERIMENTAL METHOD PLAN
An experimental method discussion follows a standard form: participants, materials,
procedures, and measures. The intent in this section is to highlight key topics to be
addressed in an experimental method proposal. The following questions serve as an
overall guide to these topics.
Table 1.3 A checklist of questions for designing an experimental procedure.
------ Who are the participants in the study? To what populations do these participants
belong?
------ How were the participants selected? Was a random selection method used?
------ How will the participants be randomly assigned? Will they be matched? How?
------ How many participants will be in the experimental and control groups?
------ What is the dependent variable(s) in the study? How will it be measured? How
many times will it be measured?
------ What is the treatment condition(s)? How was it operationalized?
------ Will variables be covaried in the experiment? How will they be measured?
------ What experimental research design will be used?
What would a visual model of this design look like?
------ What instrument(s) will be used to measure the outcome in the study? Why was it
chosen? Does it have established validity and reliability?
------ Will a pilot test of the experiment be conducted?
------ What statistics will be used to analyze the data?
PARTICIPANTS
Readers need to know about the selection, assignment, and number of participants who
will participate in the experiment. Consider the following suggestions when writing an
experimental research proposal:
- Describe the selection process for participants as either random or nonrandom.
The participants might be selected by random selection or random sampling.

- If random assignment is made, discuss how the project will involve randomly
assigning individuals to the treatment groups. For example, individual #1 goes to
group 1, individual #2 goes to group 2, and so forth, so that there is no systematic
bias in assigning the individuals.
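A minimal sketch of random assignment follows, assuming the participants are first shuffled and then dealt out to groups in turn (first to group 1, second to group 2, and so on). The function name `randomly_assign` and the `seed` parameter are hypothetical and are introduced only for illustration.

```python
import random


def randomly_assign(participants, n_groups=2, seed=None):
    """Shuffle the participants, then deal them out to the groups in turn."""
    rng = random.Random(seed)  # a seed makes the assignment reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = [[] for _ in range(n_groups)]
    for position, person in enumerate(shuffled):
        groups[position % n_groups].append(person)
    return groups


# Example: ten participants split into an experimental and a control group.
experimental, control = randomly_assign(range(1, 11), n_groups=2, seed=42)
```

Because the order is randomized before dealing, taking turns between groups introduces no systematic bias, and the group sizes differ by at most one.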

VARIABLES
Generally speaking, experiments are carried out in order to explore the strength of
relationships between variables. The label given to the variable that the experimenter
expects to influence the other is called the independent variable. In our case this would be
the teaching method. The variable upon which the independent variable is acting is
called the dependent variable; in our case, the test scores. Thus, clearly identify the
independent variable and the dependent variable in the experiment. The dependent
variable is the response or the criterion variable that is presumed to be caused by or
influenced by the independent treatment conditions (and any other independent
variables).
INSTRUMENTATION AND MATERIALS
During an experiment, one makes observations or obtains measures using instruments at
a pre- or posttest (or both) stage of the procedures. Therefore, the researcher should
describe the instrument or instruments participants complete in the experiment, typically
completed before the experiment begins and at its end. Indicate the established validity
and reliability of the scores on instruments, the individuals who developed them, and any
permissions needed to use them.
Thoroughly discuss the materials used for the experimental treatment. One group, for
example, may participate in a special computer-assisted learning plan used by a teacher in
a classroom. This plan might involve handouts, lessons, and special written instruction to
help students in this experimental group learn how to study a subject using computers. A
pilot test of these materials may also be discussed, as well as any training required of
individuals to administer the materials in a standard way.
EXPERIMENTAL PROCEDURES
The specific experimental design procedures also need to be identified.
- Identify the type of experimental design in the proposed study. The types
available in experiments are pre-experimental designs, true experiments,
quasi-experiments, and single-subject designs. With a pre-experimental design, the
researcher studies a single group and provides an intervention during the
experiment. This design does not have a control group to compare with the
experimental group. In quasi-experiments, the investigator uses control and
experimental groups but does not randomly assign participants to groups (e.g.,
they may be intact groups available to the researcher). In a true experiment, the
investigator randomly assigns the participants to treatment groups. A
single-subject design or N of 1 design involves observing the behavior of a single
individual (or a small number of individuals) over time.

In discussing experimental design a few symbols are used.

R: random selection of subjects or assignment of treatments to experimental groups
X: experimental variable manipulated
C: control variable
O: observation or test
A line between levels indicates equated groups.

In the examples below, this notation is used to illustrate pre-experimental,
quasi-experimental, true experimental, and single-subject designs.
PRE-EXPERIMENTAL DESIGNS
The least adequate of designs is characterized by
a. lack of a control group, or
b. a failure to provide for the equivalence of a control group.

1. The One-Shot Case Study
This design involves an exposure of a group to a treatment followed by a measure.
Group A    X    O

2. The One-Group Pre-Test Post-Test Design
This design includes a pre-test measure followed by a treatment and a post-test for a
single group.
Group A    O1    X    O2

3. a. Static Group Comparison or Post-test Only With Non-equivalent Groups
Experimenters use this design after implementing a treatment. After the treatment, the
researcher selects a comparison group and provides a post-test to both the
experimental group(s) and the comparison group(s).
Group A    X    O
Group B         O

b. Alternative Treatment Posttest-Only With Nonequivalent Groups Design
This design uses the same procedure as the static group comparison with the
exception that the nonequivalent comparison group receives a different treatment.
Group A    X1    O
Group B    X2    O

QUASI-EXPERIMENTAL DESIGNS
1. Nonequivalent (Pretest and Posttest) Control-Group Design
In this design, a popular approach to quasi-experiments, the experimental group A
and the control group B are selected without random assignment. Both groups
take a pretest and posttest. Only the experimental group receives the treatment.
Group A    O    X    O
Group B    O         O

2. Single-Group Interrupted Time-Series Design
In this design, the researcher records measures for a single group both before and
after a treatment.
Group A    O  O  O  O    X    O  O  O  O

3. Control-Group Interrupted Time-Series Design
A modification of the Single-Group Interrupted Time-Series Design in which two
groups of participants, not randomly assigned, are observed over time. A treatment
is administered to only one of the groups (i.e., Group A).
Group A    O  O  O  O    X    O  O  O  O
Group B    O  O  O  O         O  O  O  O


TRUE EXPERIMENTAL DESIGNS


1. Pretest-Posttest Control-Group Design
A traditional, classical design, this procedure involves random assignment of
participants to two groups. Both groups are administered a pretest and a
posttest, but the treatment is provided only to experimental group A.
Group A    R    O    X    O
Group B    R    O         O

2. Posttest-Only Control-Group Design
This design controls for any confounding effects of a pretest and is a popular
experimental design. The participants are randomly assigned to groups, a
treatment is given only to the experimental group, and both groups are measured
on the posttest.
Group A    R    X    O
Group B    R         O

3. Solomon Four-Group Design
A special case of a 2 x 2 factorial design, this procedure involves the random
assignment of participants to four groups. Pretests and treatments are varied for
the four groups. All groups receive a posttest.
Group A    R    O    X    O
Group B    R    O         O
Group C    R         X    O
Group D    R              O

The Procedure
One needs to describe in detail the procedure for conducting the experiment. A reader
should be able to see the design being used, the observations, the treatment, and the
timeline of activities.
Discuss a step-by-step approach for the procedure in the experiment. For example, Borg
and Gall (1989, p. 679) outlined six steps typically used in the procedure for a
pretest-posttest control-group design with matching:
1. Administer measures of the dependent variable or a variable closely
correlated with the dependent variable to the research participants.
2. Assign participants to matched pairs on the basis of their scores on the
measures described in step 1
3. Randomly assign one member of each pair to the experimental group and
the other member to the control group
4. Expose the experimental group to the experimental treatment and
administer no treatment or an alternative treatment to the control group.
5. Administer measures of the dependent variables to the experimental and
control groups.
6. Compare the performance of the experimental and control groups on the
posttest(s) using tests of statistical significance.
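Steps 1 to 3 of the matching procedure above can be sketched as follows. This is an illustration only: it assumes one pretest score per participant, forms pairs by ranking the scores and pairing neighbors, and silently leaves out the last participant when the number is odd. The function name and input format are hypothetical.

```python
import random


def matched_pair_assignment(pretest_scores, seed=None):
    """Pair participants with adjacent pretest scores, then randomly split each pair.

    `pretest_scores` maps a participant identifier to a pretest score.
    Returns (experimental_group, control_group).
    """
    rng = random.Random(seed)
    # Step 2: rank by score so that neighbors form matched pairs.
    ranked = sorted(pretest_scores, key=pretest_scores.get)
    experimental, control = [], []
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        # Step 3: a random shuffle decides which member receives the treatment.
        rng.shuffle(pair)
        experimental.append(pair[0])
        control.append(pair[1])
    return experimental, control
```

Pairing on ranked scores is one simple matching rule; a study may instead match on several variables, in which case the pairing step would need a different distance measure.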
STATISTICAL ANALYSIS
Tell the reader about the types of statistical analysis that will be used during the
experiment.
Report the descriptive statistics calculated for observations and measures at the pretest or
posttest stage of experimental designs. These statistics are means, standard deviations,
and ranges.
Indicate the inferential statistical tests used to examine the hypotheses in the study.
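As an illustration of the statistics named above, the sketch below computes the mean, standard deviation, and range for a set of scores, and a Welch t statistic as one possible inferential test for comparing two groups. The choice of Welch's t is an assumption made for this example; the text does not prescribe a particular test, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def describe(scores):
    """Descriptive statistics for one set of observations."""
    return {
        "n": len(scores),
        "mean": mean(scores),
        "sd": stdev(scores),  # sample standard deviation
        "range": (min(scores), max(scores)),
    }


def welch_t(group_a, group_b):
    """Welch's t statistic for the difference between two group means."""
    standard_error = sqrt(
        stdev(group_a) ** 2 / len(group_a) + stdev(group_b) ** 2 / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical posttest scores for an experimental and a control group.
experimental = [78, 85, 90, 72, 88]
control = [70, 75, 80, 68, 77]
summary = describe(experimental)
t_value = welch_t(experimental, control)
```

The sketch stops at the test statistic; obtaining a p-value requires the t distribution, which a statistical package would normally supply.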
B. Components of a Survey Method Plan
The design of a survey method section follows a standard format: the population and
sample, instrumentation, variables in the study and data analysis.

The Survey Design


Begin the discussion by reviewing the purpose of a survey and the rationale for its
selection as a design in the proposed study. This discussion can:

- Identify the purpose of survey research. This purpose is to generalize from a
sample to a population so that inferences can be made about some characteristic,
attitude, or behavior of this population.

- Indicate why a survey is the preferred type of data collection procedure for the study.
In this rationale, consider the advantages of survey designs, such as the economy
of the design and the rapid turnaround in data collection. Discuss the advantage of
identifying attributes of a large population from a small group of individuals.

- Indicate whether the survey will be cross-sectional, with the data collected at one
point in time, or whether it will be longitudinal, with data collected over time.

- Specify the form of data collection: self-administered questionnaires, interviews,
etc.

The Population and Sample


Specify the characteristics of the population and the sampling procedure. This discussion
will focus on essential aspects of the population and sample to describe in a research
plan.
- Identify the population in the study. Also state the size of this population.

- Identify whether the sampling design for this population is single stage or multistage
(called clustering). Cluster sampling is ideal when it is impossible or impractical
to compile a list of the elements composing the population. A single-stage
sampling procedure is one in which the researcher has access to names in the
population and can sample the people (or other elements) directly. In a
multistage, clustering procedure, the researcher first samples groups or
organizations (or clusters), obtains names of individuals within groups or clusters,
and then samples within the clusters.

- Identify the selection process for individuals. A random sample, in which each
individual in the population has an equal probability of being selected, is
recommended.

- Identify whether the study will involve stratification of the population before
selecting the sample. Stratification means that specific characteristics of
individuals (e.g., both females and males) are represented in the sample and the
sample reflects the true proportion of individuals with certain characteristics of
the population.

- Indicate the number of people in the sample and the procedures used to compute
this number. In survey research it is recommended to use a sample size formula.
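The sample size and stratification points above can be illustrated with a short sketch. The text does not name a particular sample size formula, so the example assumes Cochran's formula with a finite population correction; the default margin of error (0.05), z value (1.96), and proportion (0.5) are conventional choices, not values from the text. The function names and the strata used in the example are hypothetical.

```python
import math
import random


def cochran_sample_size(population_size, margin_of_error=0.05, z=1.96, p=0.5):
    """Sample size from Cochran's formula with a finite population correction."""
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)


def stratified_sample(strata, sample_size, seed=None):
    """Draw a random sample from each stratum in proportion to its size.

    `strata` maps a stratum name (e.g. "female") to a list of its members.
    Rounding can make the total differ slightly from `sample_size`.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, k)
    return sample


# Example: a population of 1000 made up of 600 females and 400 males.
n = cochran_sample_size(1000)
strata = {"female": list(range(600)), "male": list(range(600, 1000))}
sample = stratified_sample(strata, 100, seed=1)
```

Proportional allocation keeps each stratum's share of the sample equal to its share of the population, which is what the stratification point above asks for.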

Instrumentation
As part of rigorous data collection, the proposal developer also provides detailed
information about the actual survey instrument to be used in the proposed study. Consider
the following:

- Name the survey instrument used to collect data in the research study. Discuss
whether it is an instrument designed for this research, a modified instrument, or
an intact instrument developed by someone else.

- To use an existing instrument, describe the established validity and reliability of
scores obtained from past use of the instrument. Also discuss whether scores
resulting from past use of the instrument demonstrate reliability.

- Include sample items from the instrument so that readers can see the actual items
used. In an appendix to the proposal, attach sample items from the instrument or
the entire instrument.

- Discuss plans for pilot testing or field testing the survey and provide a rationale
for these plans. This testing is important to establish the content validity of an
instrument and to improve questions, format, and the scales. Indicate the number
of people who will test the instrument and the plans to incorporate their comments
into final instrument revisions.

- For a mailed survey, identify steps for administering the survey and for following
up to ensure a high response rate.

Variables in the Study


Although readers of a proposal learn about the variables in earlier sections of the
proposal, it is useful in the method section to relate the variables to the specific questions
on the instrument. At this stage in a research plan, one technique is to relate the variables,
the research questions, and items on the survey instrument so that a reader can easily
determine how the researcher will use the questionnaire items.
Data Analysis
In the proposal, present information about the steps involved in analyzing the data. It is
recommended to present them as a series of steps.
Step 1. Report information about the number of members of the sample who did and did
not return the survey. A table with numbers and percentages describing
respondents and non-respondents is a useful tool to present this information.
Step 2. Discuss the method by which response bias will be determined. Response bias is
the effect of non-responses on survey estimates. Bias means that if non-respondents had
responded, their responses would have substantially changed the overall results of the
survey.
Step 3. Discuss a plan to provide a descriptive analysis of data for all independent and
dependent variables in the study. This analysis should indicate the means, standard
deviations, and range of scores for these variables.
Step 4. If the proposal contains an instrument with scales or a plan to develop scales
(combining items into scales), identify the statistical procedure (e.g., factor analysis) for
accomplishing this. Also mention reliability checks for the internal consistency of the
scales (e.g., the Cronbach's alpha statistic).
Step 5. Identify the statistics and the statistical computer program for testing the major
questions or hypotheses in the proposed study. Provide a rationale for the choice of
statistical test and mention the assumptions associated with the statistic.
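The internal consistency check mentioned in Step 4 can be made concrete with a short sketch of Cronbach's alpha. It assumes the responses are stored as one list of item scores per respondent, with no missing values; the function name and data layout are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(responses):
    """Cronbach's alpha for a scale.

    `responses` is a list with one entry per respondent; each entry is a
    list of that respondent's scores on the k items of the scale.
    """
    k = len(responses[0])
    # Variance of each item across respondents.
    item_variances = [
        pvariance([respondent[i] for respondent in responses]) for i in range(k)
    ]
    # Variance of the respondents' total scores.
    total_variance = pvariance([sum(respondent) for respondent in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: four respondents answering a three-item scale.
alpha = cronbach_alpha([[4, 5, 4], [2, 3, 2], [5, 5, 4], [1, 2, 2]])
```

Values closer to 1 indicate that the items vary together. The function fails with a division by zero if every respondent has the same total score, so a real analysis should check for that case first.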

An Example of a Survey Method


Below is an example of a survey method section that illustrates many of the steps
mentioned above. This excerpt (used with permission) comes from a journal article
reporting a study of factors affecting student attrition in one small liberal arts college.
Methodology
The site of this study was a small (enrollment 1000) religious, coeducational, liberal
arts college in a Midwestern city with a population of 175,000 people.
The dropout rate the previous year was 5%. Dropout rates tend to be highest among
freshmen and sophomores, so an attempt was made to reach as many freshmen and
sophomores as possible by distribution of the questionnaire through classes. Research on
attrition indicates that males and females drop out of college for different reasons.
Therefore, only women were analyzed in this study.
During April 1979, 169 women returned questionnaires. A homogeneous sample of
135 women who were 25 years old or younger, unmarried, full-time, U.S. citizens, and
Caucasian was selected for this analysis to exclude some possible confounding variables.
Of these women, 71 were freshmen, 55 were sophomores, and 9 were juniors. Of the
students, 95% were between the ages of 18 and 21. This sample is biased toward
higher-ability students, as indicated by scores on the ACT test.
Concurrent and convergent validity (D. T. Campbell & Fiske, 1959) of these measures
was established through factor analysis and was found to be at an adequate level.
Reliability of the factors was established through the coefficient alpha. The constructs
were represented by 25 measures (multiple items combined on the basis of factor
analysis to make indices) and 7 measures that were single-item indicators.
Multiple regression and path analysis were used to analyze the data.
In the causal model, intent to leave was regressed on all variables which preceded it in
the sequence. Intervening variables significantly related to intent to leave were then
regressed on organizational variables, personal variables, environmental variables, and
background variables.

PART IV: THE TOOLS OF RESEARCH


This part of the book moves to a finer-grained account of instruments for collecting
data, how they can be used, and how they can be constructed. Several main kinds of data
collection instruments are identified to enable researchers to decide on the most
appropriate instruments for data collection, and to design such instruments.
1. QUESTIONNAIRES
At this preliminary stage of design, it can sometimes be helpful to use a guide for
questionnaire construction.
A Guide for Questionnaire Construction.
A. Decisions about question content
1. Is the question necessary? Just how will it be useful?
2. Are several questions needed on the subject matter of this question?
3. Do respondents have the information necessary to answer the question?
4. Does the question need to be more concrete, specific, and closely related to the
respondent's personal opinions?
5. Is the question content sufficiently general and free from spurious concreteness and
specificity?
6. Do the replies express general attitudes and only seem to be as specific as they sound?
7. Is the question content biased or loaded in one direction, without accompanying
questions to balance the emphasis?
8. Will the respondents give the information that is asked for?
B. Decisions about question wording
1. Can the question be misunderstood? Does it contain difficult or unclear phraseology?
2. Does the question adequately express the alternatives with respect to the point?
3. Is the question misleading because of unstated assumptions or unseen implications?
4. Is the wording biased? Is it emotionally loaded or slanted towards a particular kind of
answer?
5. Is the question wording likely to be objectionable to the respondent in any way?
6. Would a more personalized wording of the question produce better results?
7. Can the question be better asked in a more direct or a more indirect form?
C. Decisions about form of response to the question
1. Can the question best be asked in a form calling for a check answer (or short answer of a
word or two, or a number), a free answer, or a check answer with a follow-up answer?
2. If a check answer is used, which is the best type for this question: dichotomous,
multiple choice, or scale?
3. If a checklist is used, does it cover adequately all the significant alternatives without
overlapping and in a defensible order? Is it of reasonable length? Is the wording of
items impartial and balanced?
4. Is the form of response easy, definite, uniform, and adequate for the purpose?
D. Decisions about the place of the question in the sequence
1. Is the answer to the question likely to be influenced by the content of preceding
questions?
2. Is the question led up to in a natural way? Is it in correct psychological order?
3. Does the question come too early or too late from the point of view of arousing interest
and receiving sufficient attention, avoiding resistance, and so on?
The intention of this guide for questionnaire construction is to make sure that the
questionnaire:

- is clear on its purpose

- is clear on what needs to be included or covered in the questionnaire in order to
meet the purposes

- is exhaustive in its coverage of the elements of inclusion

- asks the most appropriate kinds of question

- elicits the most appropriate kinds of data to answer the research purposes and sub-questions

- asks for empirical data.

TYPES OF QUESTIONNAIRE ITEMS


1. CLOSED AND OPEN QUESTIONS COMPARED
Questionnaires that call for short, check responses are known as the restricted, or
closed-form, type. They provide for marking a yes or no, a short response, or checking an item
from a list of suggested responses. The following example illustrates the closed-form
item.
Why did you choose to take your graduate work at this university? Kindly indicate three reasons
in order of importance, using number 1 for the most important, 2 for the 2nd most
important, and 3 for the 3rd most important.
                                      Rank
(a) Convenience of transportation    ......
(b) Advice of a friend               ......
(c) Reputation of institution        ......
(d) Expense factor                   ......
(e) Scholarship aid                  ......
(f) Other                            ......
Highly structured, closed questions are useful in that they can generate frequencies of
response amenable to statistical treatment and analysis. They are directly to the point and
deliberately more focused than open-ended questions.
2. OPEN FORM QUESTIONS
The open-form, or unrestricted, type of questionnaire calls for a free response in the
respondent's own words. The following open-form item seeks the same type of
information as the previous closed-form item:
Why did you choose to take your graduate work at this university?
Note that no clues are given. The respondents reveal their frame of reference and possibly
the reasons for their responses. Open questions are useful if the possible answers are unknown or
the questionnaire is exploratory, or if there are so many possible categories of response
that a closed question would contain an extremely long list of options. They also enable
respondents to answer as much as they wish, and are particularly suitable for
investigating complex issues, to which simple answers cannot be provided. Closed
questions do not enable respondents to add any remarks, qualifications, and explanations
to the categories. Open questions, on the other hand, enable respondents to write a free
account in their own terms, to explain and qualify their responses, and to avoid the
limitations of pre-set categories of response, but they can lead to irrelevant and redundant
information. The following is another example of a questionnaire.
1. Male ...... Female ......
2. Age ......
3. Marital status: single ...... married ...... divorced/separated ......
4. Number of dependent children ......; their ages ......
5. Number of other dependents ......
6. Highest degree held ......
7. Years of teaching experience ......
8. Years of teaching at present school ......
9. Teaching level: primary ...... intermediate ...... upper grades ...... Jr. H.S. ......
H.S. ......; if secondary, your major teaching area ......
10. Enrollment of your school ......
11. Your average class size ......
12. Population of your community or school district ......
13. Your principal is: male ...... female ......
In the following questions kindly check the appropriate column:
a. excellent b. good c. fair d. poor
14. How does your salary schedule compare with those of similar school districts?   a  b  c  d
15. How would you evaluate the adequacy of teaching materials and supplies?   a  b  c  d
Etc.

CHARACTERISTICS OF A GOOD QUESTIONNAIRE


1. It deals with a significant topic, one the respondent will recognize as important
enough to warrant spending his or her time on. The significance should be clearly
and carefully stated on the questionnaire, or in the letter that accompanies it.
2. It seeks only that information which cannot be obtained from other sources such
as school reports or census data.
3. It is as short as possible, only long enough to get the essential data. Long
questionnaires frequently find their way into the wastebasket.
4. It is attractive in appearance, neatly arranged, and clearly duplicated or printed.
5. Directions are clear and complete, important terms are defined, each question
deals with a single idea, all questions are worded as simply and clearly as
possible, and the categories provide an opportunity for easy, accurate, and
unambiguous responses.
6. The questions are objective, with no leading suggestions as to the responses
desired. Leading questions are just as inappropriate on a questionnaire as they
are in a court of law.
7. Questions are presented in good psychological order, proceeding from general to
more specific responses. This order helps respondents to organize their own
thinking, so that their answers are logical and objective. It may be well to present
questions that create a favorable attitude before proceeding to those that may be a
bit delicate or intimate. If possible, annoying or embarrassing questions should
be avoided.
8. It is easy to tabulate and interpret.

VALIDITY AND RELIABILITY OF QUESTIONNAIRES


Basic to the validity of a questionnaire are the right questions phrased in the least
ambiguous way. The question of content validity is: Do the items sample a significant
aspect of the purpose of the investigation?
The meaning of all terms must be clearly defined so that they have the same meaning to
all respondents. The researcher needs all the help that he or she can get; suggestions from
colleagues and experts in the field of inquiry may reveal some ambiguities that can be
removed and some items that do not contribute to its purpose. A panel of experts may
rate the instrument in terms of how effectively it samples significant aspects of its
purpose, providing estimates of content validity.
It is possible to estimate the predictive validity of a questionnaire by follow-up
observation of respondent behavior at the present time or at some time in the future.
Reliability of questionnaires may be inferred by a second administration of the
instrument, comparing the responses with those of the first. Reliability may also be
estimated by comparing responses of an alternate form with the original form.
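Comparing a second administration with the first is commonly summarized with a correlation coefficient. The sketch below assumes Pearson's r as that summary, which is a choice made for this example rather than something the text specifies; the function name and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between scores from two administrations.

    `first` and `second` hold the same respondents' scores in the same order.
    """
    n = len(first)
    mean_first = sum(first) / n
    mean_second = sum(second) / n
    covariance = sum(
        (x - mean_first) * (y - mean_second) for x, y in zip(first, second)
    )
    spread_first = sqrt(sum((x - mean_first) ** 2 for x in first))
    spread_second = sqrt(sum((y - mean_second) ** 2 for y in second))
    return covariance / (spread_first * spread_second)


# Hypothetical total scores for five respondents on two occasions.
test_retest = pearson_r([20, 25, 31, 18, 27], [22, 24, 30, 17, 29])
```

The same function applies to the alternate-form comparison: pass the original-form scores as `first` and the alternate-form scores as `second`.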

2. THE OPINIONNAIRE, OR ATTITUDE SCALE


The information form that attempts to measure the attitude or belief of an individual is
known as an opinionnaire, or attitude scale. Since the terms opinion and attitude are not
synonymous, a clarification is necessary.
How people feel, or what they believe, is their attitude. But it is difficult, if not
impossible, to describe and measure attitude. Researchers must depend upon what people
say are their beliefs and feelings. This is the area of opinion. Through the use of
questions, or by getting people's expressed reactions to statements, a sample of their
opinions is obtained. From these statements of opinion, one may infer or estimate their
attitudes, that is, what they really believe.
Two procedures commonly used in opinion research are the Thurstone Technique
and the Likert Method.
Thurstone Technique
The first method of attitude assessment is known as the Thurstone Technique of Scaled
Values. A number of statements, usually twenty or more, that express various points of
view toward a group, institution, idea, or practice are gathered. They are then submitted
to a panel of judges, each of whom arranges them in eleven groups, ranging from
one extreme to another in position. This sorting by each judge yields a composite
position for each of the items. When there has been marked disagreement between the
judges in assigning a position to an item, that item is discarded. For items that are
retained, each is given its median scale value, between one and eleven, as established by
the panel.
The list of statements is then given to the subjects, who are asked to check the statements
with which they agree. The median value of the statements that they check establishes
their score, or quantifies their opinion.
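The two medians in the Thurstone procedure (the judges' median position for each statement, and the median scale value of the statements a subject checks) can be sketched as follows. The statement identifiers and values are hypothetical, and the step of discarding statements on which judges disagree is left out of the sketch.

```python
from statistics import median


def item_scale_value(judge_positions):
    """Scale value of one statement: the median of the positions (1 to 11)
    assigned to it by the judges."""
    return median(judge_positions)


def thurstone_score(scale_values, endorsed):
    """A subject's score: the median scale value of the statements checked.

    `scale_values` maps a statement identifier to its scale value;
    `endorsed` lists the identifiers the subject agreed with.
    """
    return median([scale_values[statement] for statement in endorsed])


# Hypothetical scale values for four retained statements.
scale_values = {"s1": 2.0, "s2": 5.5, "s3": 9.0, "s4": 10.5}
score = thurstone_score(scale_values, ["s2", "s3", "s4"])
```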
Likert Method
The second method, the Likert Method of Summated Ratings, which can be carried out
without the panel of judges, has yielded scores very similar to those obtained by the
Thurstone method. The coefficient of correlation between the two scales was reported as
+.92 in one study. Since the Likert Scale takes less time to construct, it offers an
interesting possibility for the student of opinion research.
The Likert scaling technique assigns a scale value to each of the five responses. Thus, the
instrument yields a total score for each respondent, and a discussion of each individual
item, while possible, is not necessary. Starting with a particular point of view, all
statements favoring this position would be scored:
                         Scale value
a. strongly agree             5
b. agree                      4
c. undecided                  3
d. disagree                   2
e. strongly disagree          1
For statements opposing this point of view, the items are scored in the opposite order:
                         Scale value
a. strongly agree             1
b. agree                      2
c. undecided                  3
d. disagree                   4
e. strongly disagree          5
The following statements represent opinions, and your agreement or disagreement will be
determined on the basis of your particular beliefs. Kindly check your position on the
scale as the statement first impresses you. Indicate what you believe, rather than what you
think you should believe.
a. I strongly agree
b. I agree
c. I am undecided
d. I disagree
e. I strongly disagree
1. Heaven does not exist as an actual place or location.
2. God sometimes sets aside natural law, performing miracles.
3. Hell does not exist.
4. The devil exists as an actual person.
5. God is a cosmic force, rather than an actual person.
6. There is a final day of judgment for all who have lived on earth.
etc.
If the opinionnaire consisted of 30 statements or items, the following score values would
be revealing:
30 x 5 = 150 Most favorable response possible
30 x 3 = 90 A neutral attitude
30 x 1 = 30 Most unfavorable attitude
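As a rough sketch of summated scoring, the Python below adds up one respondent's item scores. It assumes the scale values run from 5 (strongly agree) to 1 (strongly disagree) for favorable statements, which is consistent with the 30 x 5 = 150 maximum above, and reverses them for opposing statements. The function name, item numbers and responses are hypothetical.

```python
# Scale values for statements that favor the point of view being measured.
FAVORABLE = {"a": 5, "b": 4, "c": 3, "d": 2, "e": 1}


def likert_total(responses, reverse_keyed):
    """Sum the item scores for one respondent.

    responses: dict mapping item number to the letter checked (a to e).
    reverse_keyed: set of item numbers whose statements oppose the point
    of view, so that they are scored in the opposite order.
    """
    total = 0
    for item, choice in responses.items():
        value = FAVORABLE[choice]
        if item in reverse_keyed:
            value = 6 - value  # 5 becomes 1, 4 becomes 2, 3 stays 3
        total += value
    return total


# Hypothetical three-item opinionnaire in which item 2 opposes the view.
responses = {1: "a", 2: "e", 3: "c"}
print(likert_total(responses, reverse_keyed={2}))  # 5 + 5 + 3 = 13
```

With 30 items, a respondent who marks the neutral option throughout would total 30 x 3 = 90 under this scheme.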

3. TEST
ACHIEVEMENT TESTS
Achievement tests attempt to measure what an individual has learned, that is, his or her
present level of performance. Most tests used in schools are achievement tests. They are
particularly helpful in determining individual or group status in academic learning.
Achievement test scores are used in placing, advancing, or retaining students at particular
grade levels. They are used in diagnosing strengths and weaknesses, and as a basis for
awarding prizes, scholarships, or degrees. Frequently, achievement test scores are used
in evaluating the influence of courses of study, teachers, teaching methods, and other
factors considered to be significant in educational practice.
In planning a test, the researcher can proceed as follows:
1. Identify the purposes of the test.
2. Identify the test specification.
3. Select the contents of the test.
4. Consider the format of the test.
5. Write the test items.
6. Consider the layout of the test.
7. Consider the timing of the test.
8. Plan the scoring of the test.
Ad 1. IDENTIFY THE PURPOSES OF THE TEST
The purposes of a test are several: for example, to diagnose a student's strengths,
weaknesses and difficulties, to measure achievement, to measure aptitude and potential,
and to identify readiness for a program.
Formative testing is undertaken during a program and is designed to monitor students'
progress during that program, to measure achievement of sections of the program, and to
diagnose strengths and weaknesses. It is typically criterion-referenced.
Diagnostic testing is an in-depth test to discover particular strengths, weaknesses and
difficulties that a student is experiencing, and is designed to expose causes and specific
areas of weakness or strength. This type of test is criterion-referenced.
Summative testing is the test given at the end of the program and is designed to measure
achievement, outcomes, or mastery. This might be criterion-referenced or
norm-referenced, depending to some extent on the use to which the results will be put.

Ad 2. IDENTIFY THE TEST SPECIFICATION


The test specification includes:
- which program objectives and student learning outcomes will be addressed
- which content areas will be addressed
- the relative weightings, balance and coverage of items
- the total number of items in the test
- the number of questions required to address a particular element of a program or
  learning outcome
- the exact items in the test.

Ad 3. SELECT THE CONTENTS OF THE TEST


Here the test is subject to item analysis. Gronlund and Linn (1990) suggest that an item
analysis will need to consider:
- the suitability of the format of each item for the (learning) objective
  (appropriateness)
- the ability of each item to enable students to demonstrate their performance of the
  (learning) objective (relevance)
- the clarity of the task for each item
- the straightforwardness of the task
- the independence of each item (i.e. where the influence of other items of the test
  is minimal and where successful completion of one item is not dependent on
  successful completion of another)
- the adequacy of coverage of each (learning) objective by the items of the test.

Ad 4. CONSIDER THE FORMAT OF THE TEST


The researcher will need to consider whether the test will be undertaken individually or
in a group, and what form it will take. An oral test, for example, can be conducted if the
researcher feels that reading and writing will obstruct the true purpose of the test.

Ad 5. WRITE THE TEST ITEM


Hanna (1993:139-41) and Cunningham (1998) provide several guidelines for
constructing short-answer items:
- make the blanks close to the end of the sentence
- keep the blanks the same length
- ensure that there can be only a single correct answer
- avoid putting several blanks close to each other (in a sentence or paragraph) such
  that the overall meaning is obscured
- only make blanks of key words or concepts, rather than trivial words
- ensure that students know exactly the kind and specificity of the answer required
- use short-answer items for testing knowledge recall.

CONSTRUCTING MULTIPLE-CHOICE ITEMS
1) Design each item to measure a specific objective.
Consider this item, introduced and then revised, in the sample test above.
Multiple-choice item, revised
Voice: Where did George go after the party last night?
S reads: a. Yes, he did.
b. Because he was tired.
c. To Elaine's place for another party.
d. Around eleven o'clock.

The specific objective being tested here is comprehension of wh-questions. Distractor (a)
is designed to ascertain that the student knows the difference between an answer to a
wh-question and a yes/no question. Distractors (b) and (d), as well as the key item (c), test
comprehension of the meaning of "where" as opposed to "why" and "when." The objective has
been directly addressed.
On the other hand, here is an item that was designed to test recognition of the correct
word order of indirect questions.

Multiple-choice item, flawed
Excuse me, do you know ______?
a. where is the post office
b. where the post office is
c. where post office is
Distractor (a) is designed to lure students who don't know how to frame indirect
questions and therefore serves as an efficient distractor. But what does distractor (c)
actually measure? In fact, the missing definite article ("the") is an unintentional clue, a
flaw that could cause the test-taker to eliminate (c) automatically. In the process, no
assessment has been made of indirect questions in this distractor. Can you think of a
better distractor for (c) that would focus more clearly on the objective?
2) State both stem and options as simply and directly as possible.
We are sometimes tempted to make multiple-choice items too wordy. A good rule
of thumb is to get directly to the point. Here is an example.
Multiple-choice cloze test item, flawed
My eyesight has really been deteriorating lately. I wonder if I need glasses. I think
I'd better go to the ______ to have my eyes checked.
a. pediatrician
b. dermatologist
c. optometrist
You might argue that the first two sentences of this item give it some authenticity and
accomplish a bit of schema setting. But if you simply want a student to identify the type
of medical professional who deals with eyesight issues, those sentences are superfluous.
Moreover, by lengthening the stem, you have introduced a potentially confounding
lexical item, "deteriorate," that could distract the student unnecessarily.
Another rule of succinctness is to remove needless redundancy from your options. In the
following item, "which were" is repeated in all three options. It should be placed in the
stem to keep the item as succinct as possible.

Multiple-choice item, flawed
We went to visit the temples, ______ fascinating.
a. which were beautiful
b. which were especially
c. which were holy
3) Make certain that the intended answer is clearly the only correct one.
In the proposed unit test described earlier, the following item appeared in the
original draft:
Multiple-choice item, flawed
Voice: Where did George go after the party last night?
S reads: a. Yes, he did.
b. Because he was tired.
c. To Elaine's place for another party.
d. He went home around eleven o'clock.
A quick consideration of distractor (d) reveals that it is a plausible answer, along
with the intended key, (c). Eliminating unintended possible answers is often the most
difficult problem of designing multiple-choice items. With only a minimum of
context in each stem, a wide variety of responses may be perceived as correct.
4) Use item indices to accept, discard, or revise items.
The appropriate selection and arrangement of suitable multiple-choice items on a
test can best be accomplished by measuring items against three indices: item
facility (or item difficulty), item discrimination (sometimes called item
differentiation), and distractor analysis. Although measuring these factors on
classroom tests would be useful, you probably will have neither the time nor the
expertise to do this for every classroom test you create, especially one-time tests.
But they are a must for standardized norm-referenced tests that are designed to be
administered a number of times and/or administered in multiple forms.

(1) Item facility (or IF) is the extent to which an item is easy or difficult for the
proposed group of test-takers. You may wonder why that is important if in your
estimation the item achieves validity. The answer is that an item that is too easy
(say 99 percent of respondents get it right) or too difficult (99 percent get it
wrong) really does nothing to separate high-ability and low-ability test-takers. It is
not really performing much "work" for you on a test.
IF simply reflects the percentage of students answering the item correctly. The
formula looks like this:
$$\mathrm{IF} = \frac{\text{number of Ss answering the item correctly}}{\text{total number of Ss responding to that item}}$$
For example, if you have an item on which 13 out of 20 students respond
correctly, your IF index is 13 divided by 20 or .65 (65 percent). There is no
absolute IF value that must be met to determine if an item should be included in
the test as is, modified, or thrown out, but appropriate test items will generally
have IFs that range between .15 and .85. Two good reasons for occasionally
including a very easy item (.85 or higher) are to build in some affective feelings
of "success" among lower-ability students and to serve as warm-up items. And
very difficult items can provide a challenge to the highest-ability students.
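The IF calculation is simple enough to check by hand, but a small Python sketch may help when many items are involved. The function names are chosen here for illustration, and the .15 to .85 band is the rule of thumb quoted above, not a fixed standard.

```python
def item_facility(num_correct, num_responding):
    """Proportion of responding students who answered the item correctly."""
    if num_responding <= 0:
        raise ValueError("at least one student must respond to the item")
    return num_correct / num_responding


def in_usual_range(facility, low=0.15, high=0.85):
    """True if the IF falls inside the .15 to .85 band mentioned in the text."""
    return low <= facility <= high


facility = item_facility(13, 20)
print(facility)                  # 0.65
print(in_usual_range(facility))  # True
```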
(2) Item discrimination (ID) is the extent to which an item differentiates
between high- and low-ability test-takers. An item on which high-ability students
(who did well in the test) and low-ability students (who didn't) score equally well
would have poor ID because it did not discriminate between the two groups.
Conversely, an item that garners correct responses from most of the high-ability
group and incorrect responses from most of the low-ability group has good
discrimination power.
Suppose your class of 30 students has taken a test. Once you have calculated final
scores for all 30 students, divide them roughly into thirds, that is, create three
rank-ordered ability groups including the top 10 scores, the middle 10, and the
lowest 10. To find out which of your 50 or so test items were most powerful in
discriminating between high and low ability, eliminate the middle group, leaving
two groups with results that might look something like this on a particular item:
Item #23                      # Correct    # Incorrect
High-ability Ss (top 10)          7             3
Low-ability Ss (bottom 10)        2             8

Using the ID formula (7 - 2 = 5; 5 / 10 = .50), you would find that this item has an ID
of .50, or a moderate level.
The formula for calculating ID is
$$\mathrm{ID} = \frac{\text{high group \# correct} - \text{low group \# correct}}{\tfrac{1}{2} \times \text{total of your two comparison groups}} = \frac{7 - 2}{\tfrac{1}{2} \times 20} = \frac{5}{10} = .50$$
The result of this example item tells you that the item has a moderate level of ID. High
discriminating power would approach a perfect 1.0, and no discriminating power at all
would be zero. In most cases, you would want to discard an item that scores near zero.
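The procedure just described (rank the students, keep the top and bottom thirds, compare correct answers) can be sketched as follows. The records are hypothetical: they are constructed so that 7 of the top 10 and 2 of the bottom 10 students answer the item correctly, matching the worked example. The function name and data layout are assumptions made for this illustration.

```python
def item_discrimination(records):
    """Compute ID for one item.

    records: list of (total_test_score, answered_item_correctly) pairs,
    one per student. The top and bottom thirds by total score are compared
    and the middle group is ignored.
    """
    ranked = sorted(records, key=lambda r: r[0], reverse=True)
    third = len(ranked) // 3
    if third == 0:
        raise ValueError("need at least three students")
    high, low = ranked[:third], ranked[-third:]
    high_correct = sum(1 for _, ok in high if ok)
    low_correct = sum(1 for _, ok in low if ok)
    return (high_correct - low_correct) / (0.5 * (len(high) + len(low)))


# Hypothetical class of 30: total scores 30 down to 1.
top = [(30 - i, i < 7) for i in range(10)]          # 7 of 10 correct
middle = [(20 - i, i % 2 == 0) for i in range(10)]  # ignored by the index
bottom = [(10 - i, i < 2) for i in range(10)]       # 2 of 10 correct

print(item_discrimination(top + middle + bottom))  # 0.5
```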
(3) Distractor efficiency is one more important measure of a multiple-choice item's
value in a test, and one that is related to item discrimination. The efficiency of distractors
is the extent to which (a) the distractors "lure" a sufficient number of test-takers,
especially lower-ability ones, and (b) those responses are somewhat evenly distributed
across all distractors. Those of you who have a fear of mathematical formulas will be
happy to read that there is no formula for calculating distractor efficiency and that an
inspection of a distribution of responses will usually yield the information you need.
Consider the following. The same item (#23) used above is a multiple-choice item with
five choices, and responses across upper- and lower-ability students are distributed as
follows:

Choices                  A     B     C*    D     E
High-ability Ss (10)     ...   ...   7     0     2
Low-ability Ss (10)      ...   ...   2     0     0
*Note: C is the correct response


No mathematical formula is needed to tell you that this item successfully attracts seven of
the ten high-ability students toward the correct response, while only two of the
low-ability students get this one right. As shown above, its ID is .50, which is acceptable, but
the item might be improved in two ways: (a) Distractor D doesn't fool anyone. No one
picked it, and therefore it probably has no utility. A revision might provide a distractor
that actually attracts a response or two. (b) Distractor E attracts more responses (2) from
the high-ability group than the low-ability group (0). Why are good students choosing
this one? Perhaps it includes a subtle reference that entices the high group but is "over the
head" of the low group, and therefore the latter students don't even consider it.
The other two distractors (A and B) seem to be fulfilling their function of attracting some
attention from lower-ability students.
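Although no formula is involved, the inspection itself can be automated for a long test. The sketch below flags the two problems discussed above. The counts for C, D and E follow the discussion in the text; the counts for A and B are hypothetical placeholders chosen only so that each group sums to ten. The function name and the wording of the notes are assumptions made for this illustration.

```python
CORRECT = "C"

# Response counts per choice for item #23.
# C, D and E follow the text; A and B are hypothetical placeholders.
high = {"A": 0, "B": 1, "C": 7, "D": 0, "E": 2}
low = {"A": 3, "B": 5, "C": 2, "D": 0, "E": 0}


def review_distractors(high, low, correct):
    """Return a short note for each distractor (every choice but the key)."""
    notes = {}
    for choice in high:
        if choice == correct:
            continue
        if high[choice] + low[choice] == 0:
            notes[choice] = "chosen by no one"
        elif high[choice] > low[choice]:
            notes[choice] = "attracts more high-ability than low-ability students"
        else:
            notes[choice] = "working as intended"
    return notes


for choice, note in review_distractors(high, low, CORRECT).items():
    print(choice, "-", note)
```

With these counts, D is reported as chosen by no one and E as attracting more high-ability students, which are the two revisions suggested in the discussion.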
Ad 6. CONSIDER THE LAYOUT OF THE TEST
Deciding on the layout will include the following factors:
- the nature, length and clarity of the instructions, for example, what to do, how long
  to take, how much to do, how many items to attempt, what kind of response is
  required (e.g., a single word, a sentence, a paragraph, a formula, a number, a
  statement etc.), how and where to enter the response, where to show the working
  out of a problem, where to start new answers (e.g., in a separate booklet)
- whether only one answer is required to a multiple-choice item, or more than one
- what marks are to be awarded for which parts of the test
- the progression from the easy to the more difficult items of the test (i.e., the
  location and sequence of items)
- the visual layout of the page, for example avoiding overloading students with
  visual material or words
- the setting out of the answer sheets or locations so that they can be entered onto
  computers and read by optical mark readers and scanners (if appropriate).
The layout of the test should be such that it supports the completion of the test
and that this is done as efficiently and as effectively as possible for the student.

Ad 7. CONSIDER THE TIMING OF THE TEST


The timing refers to two areas: when the test will take place (the day of the week, month,
time of day) and the time allowances to be given to the test and its component items.
With regard to the former, in part this is a matter of reliability, for the time of the day or
week etc. might influence how alert, motivated or capable a student might be. With
regard to the latter, the researcher will need to decide what time restrictions are being
imposed and why; for example, is the pressure of a time constraint desirable (to show
what a student can do under time pressure) or an unnecessary impediment, putting a
time boundary around something that need not be bounded?
Ad 8. PLAN THE SCORING OF THE TEST
The awarding of scores for different items of the test is a clear indication of the relative
significance of each item; the weightings of each item are addressed in their scoring. It
is important to ensure that easier parts of the test attract fewer marks than more difficult
parts of it; otherwise a student's result might be artificially inflated by answering many
easy questions and fewer more difficult questions.

PART V. VALIDITY AND RELIABILITY


DEFINING VALIDITY
Validity is an important key to effective research. If a piece of research is invalid then it is
worthless. Validity is thus a requirement for both quantitative and qualitative/naturalistic
research.
While earlier versions of validity were based on the view that it was essentially a
demonstration that a particular instrument in fact measures what it purports to measure,
more recently validity has taken many forms. The following are several kinds of
validity:
1. Content validity
2. Criterion-related validity
3. Construct validity
4. Internal validity
5. External validity
6. Concurrent validity
7. Face validity

Ad 1. Content Validity
To demonstrate this form of validity the instrument must show that it fairly and
comprehensively covers the domain or items it purports to cover. In other words, content
validity refers to the degree to which the test actually measures, or is specifically related
to, the traits for which it was designed. It shows how adequately the test samples the
universe of knowledge, attitudes, and skills that a student is expected to master. Content
validity is based upon careful examination of course textbooks, syllabi, objectives, and
the judgments of subject matter specialists. The criterion of content validity is often assessed
by a panel of experts in the field who judge its adequacy, but there is no numerical way to
express it.
Ad 2. Criterion-related validity
Criterion-related validity is a broad term that refers to two different criteria of time frame
in judging the usefulness of a test.
1. Predictive validity refers to the usefulness of a test in predicting some future
performance, such as the degree of usefulness of the Scholastic Aptitude Test
taken in high school in predicting college grades.
2. Concurrent validity refers to the usefulness of a test in closely relating to such
other measures as present academic grades, teacher ratings, or scores on another
test of known validity.
As an illustration of predictive validity, if a test is designed to pick out good candidates for
appointment as shop foremen, and test scores show a high positive correlation
with actual success on the job, the test has a degree of predictive validity,
whatever factors it actually measures. It predicts well. It serves a useful purpose.
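The "high positive correlation" mentioned above is usually expressed as a Pearson correlation coefficient between test scores and the criterion measure. The sketch below computes it directly from the standard definition. The aptitude scores and later grades are hypothetical numbers invented for the illustration, and `pearson_r` is a helper written here rather than a library function.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a list has no variation")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: aptitude test scores and the same students' later grades.
aptitude = [52, 60, 65, 70, 78, 85]
later_grades = [2.1, 2.6, 2.4, 3.0, 3.3, 3.7]

validity_coefficient = pearson_r(aptitude, later_grades)
print(round(validity_coefficient, 2))
```

A coefficient close to +1 would suggest that the test predicts the later criterion well. The same calculation, applied to a criterion measured at the same time as the test, would bear on concurrent validity.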
Ad 3. Construct validity
Construct validity is the degree to which scores on a test can be accounted for by the
explanatory constructs of a sound theory. If one were to study such a construct as
dominance, one would hypothesize that people who have this characteristic behave
differently from those who do not. Theories can be built describing
how dominant people behave in a distinctive way. If this is done, dominant people can be
identified by observation of their behavior, rating or classifying them in terms of the
theory. A test could then be designed to have construct validity to the degree that the test
scores are systematically related to the judgments made by observation of behavior
identified by the theory as dominant.
Ad 4. Internal validity
Internal validity seeks to demonstrate that the explanation of a particular event, issue or
set of data which a piece of research provides can actually be sustained by the data. In
some degree this concerns accuracy, which can be applied to quantitative and qualitative
research. The findings must describe accurately the phenomena being researched.
In ethnographic research internal validity can be addressed in several ways:
- using low-inference descriptors
- using multiple researchers
- using participant researchers
- using peer examination of data
- using mechanical means to record, store and retrieve data
In experimental research internal validity refers to the extent to which the factors that
have been manipulated (independent variables) actually have a genuine effect on the
observed consequences (dependent variables) in the experimental setting.
Ad 5. External validity
External validity refers to the degree to which the results can be generalized to the wider
population, cases or situations. The issue of generalization is problematical. According to
another view, external validity is the extent to which the variable relationships can be
generalized to non-experimental situations: other settings, other treatment variables,
other measurement variables, and other populations.
Ad 6. Concurrent validity
To demonstrate concurrent validity, the data gathered from using one instrument must
correlate highly with data gathered from using another instrument. For example, suppose
it was decided to research a student's problem-solving ability. The researcher might observe
the student working on a problem, or might talk to the student about how she is tackling
the problem, or might ask the student to write down how she tackled the problem. Here the
researcher has three different data-collecting instruments: observation, interview and
documentation respectively. If the results all agreed (concurred) that, according to given
criteria for problem-solving ability, the student demonstrated a good ability to solve a
problem, then the researcher would be able to say with greater confidence (validity) that
the student was good at problem-solving than if the researcher had arrived at that
judgment simply from using one instrument.
Ad 7. Face validity
Face validity refers to the degree to which a test "looks right" and appears to measure the
knowledge or abilities it claims to measure, based on the subjective judgment of the
examinees who take it, the administrative personnel who decide on its use, and other
psychometrically unsophisticated observers.
Face validity will likely be high if learners encounter:
- a well-constructed, expected format with familiar tasks,
- a test that is clearly doable within the allotted time limit,
- items that are clear and uncomplicated,
- directions that are crystal clear,
- tasks that relate to their course work (content validity), and
- a difficulty level that is a reasonable challenge.

DEFINING RELIABILITY
A test is reliable to the extent that it measures consistently, from one time to another. In
tests that have a high coefficient of reliability, errors of measurement have been reduced
to a minimum. Reliable tests, whatever they measure, yield comparable scores upon
repeated administration. An unreliable test would be comparable to a stretchable rubber
yardstick that yielded different measurements each time it was applied.
A test may be reliable, even though it is not valid. A valid test is always reliable.
The reliability of a test is expressed as a coefficient of correlation. There are a number of
methods used to compute various coefficients of reliability.
1. Stability (test-retest). The scores on a test may be correlated with scores on
a second administration of the test at a later date.
2. Equivalent or parallel forms. The scores on a test may be correlated with scores
on an alternative form of the test: scores on form A with scores on form B.
3. Internal consistency.
a. Split halves: odd vs. even items. The scores on the odd-numbered items
are correlated with the scores on the even-numbered items. Since this is
actually comparing scores on two shorter tests, the coefficient is modified
by the application of the Spearman-Brown formula:
$$\text{Reliability} = \frac{2r}{1 + r}$$
where r = the actual correlation between the halves of the instrument. Let
us say that the correlation coefficient between the halves is 0.85; in this
case the Spearman-Brown formula is set out thus:
$$\text{Reliability} = \frac{2 \times 0.85}{1 + 0.85} = \frac{1.70}{1.85} = 0.919$$
b. Split halves: first half with second half. On some, but not on all types of
tests, it may be feasible to correlate the scores on the first half with those
of the second half.
c. A combination of a. and b., which is an estimate of the average reliability
of such possible combinations of split halves (first half vs. second half,
odd vs. even items, etc.). The Kuder-Richardson formula is used for this
purpose.
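The odd-even split-half procedure in 3a can be sketched as below. The sketch first correlates each student's score on the odd-numbered items with the score on the even-numbered items, then applies the Spearman-Brown formula given above. The item matrix is hypothetical, and the helper names are chosen for this illustration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(r):
    """Step up a half-test correlation r to full-test reliability: 2r / (1 + r)."""
    return 2 * r / (1 + r)


def split_half_reliability(item_matrix):
    """item_matrix: one row per student, 1 (right) or 0 (wrong) per item."""
    odd = [sum(row[0::2]) for row in item_matrix]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_matrix]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd, even))


# The worked example from the text: r = 0.85 between the halves.
print(round(spearman_brown(0.85), 3))  # 0.919

# Hypothetical results for five students on a six-item test.
matrix = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(round(split_half_reliability(matrix), 3))
```

The first-half versus second-half variant in 3b would differ only in how the rows are sliced.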

PART VI: QUOTATION BASED ON THE PUBLICATION MANUAL OF THE AMERICAN
PSYCHOLOGICAL ASSOCIATION (APA)
Material quoted from another author's work or from one's own previously published
work, material duplicated from a test item, and verbatim instructions to subjects should
be reproduced word for word. Incorporate a short quotation (fewer than 40 words) in text
and enclose the quotation with double quotation marks. The following examples illustrate
the application of APA style to direct quotation of a source. When quoting, always
provide the author, year, and specific page citation in the text and include a complete
reference in the reference list.
Quotation 1.
He stated, "The 'placebo effect,' . . . disappeared when behaviors were studied in this
manner" (Smith, 1982, p. 276), but he did not clarify which behaviors were studied.
Quotation 2.
Smith (1982) found that "the 'placebo effect,' which had been verified in previous
studies, disappeared when [his own and others'] behaviors were studied in this
manner" (p. 276).
Quotation 3.
Smith (1982) found the following:
     The "placebo effect," which had been verified in previous studies, disappeared when
     behaviors were studied in this manner. Furthermore, the behaviors were never
     exhibited again [italics added], even when reel [sic] drugs were administered. Earlier
     studies were clearly premature in attributing the results to a placebo effect. (p. 276)

a. Accuracy
Direct quotations must be accurate. Except as noted in sections 3.4 to 3.5, the
quotation must follow the wording, spelling, and interior punctuation of the
original source, even if the source is incorrect.
If any incorrect spelling, punctuation, or grammar in the source might confuse
readers, insert the word sic, underlined and bracketed (i.e., [sic]), immediately
after the error in the quotation (see section 3.4 for the use of brackets). Always
check the typed copy against the source to ensure that no discrepancies occur.
b. Double or Single Quotation Marks
In text. Use double quotation marks for quotations in text. Use single quotation
marks within double quotation marks to set off material that in the original source
was enclosed in double quotation marks (see quotation 2).
In block quotations. Do not use any quotation marks to enclose block quotations.
Use double quotation marks to enclose any quoted material within a block
quotation (see quotation 3).
c. Changes From the Source Requiring No Explanation
The first letter of the first word in a quotation may be changed to a capital or
lowercase letter. The punctuation mark at the end of a sentence may be changed to
fit the syntax. Single quotation marks may be changed to double quotation marks
and vice versa. Any other changes (e.g., italicizing words for emphasis or omitting
words) must be explicitly indicated.
d. Changes From the Source Requiring Explanations
Omitting material. Use three ellipsis points (. . .) within a sentence to indicate that
you have omitted material from the original source (see quotation 1). Use four
points to indicate any omission between two sentences (literally a period followed
by three spaced dots). Do not use ellipsis points at the beginning or end of any
quotation unless, in order to prevent misinterpretation, you need to emphasize that
the quotation begins or ends in midsentence.
Inserting material. Use brackets, not parentheses, to enclose material (additions or
explanations) inserted in a quotation by some person other than the original author
(see quotation 2).
Adding emphasis. If you want to emphasize a word or words in a quotation,
underline the word or words (underlined manuscript copy will be set in italic
type). Immediately after the underlined words, insert within brackets the words
"italics added," that is, [italics added] (see quotation 3).

Reference Citation in Text


Citation of an author's work in text documents your work, briefly identifies the source for
readers, and enables them to locate the source of information in the alphabetical reference
list at the end of an article.
1. One Work by a Single Author
APA journals use the author-date method of citation: that is, the surname of the author
and the year of publication are inserted in the text at the appropriate point:
Smith (1983) compared reaction times
In a recent study of reaction times (Smith, 1983)
If the name of the author appears as part of the narrative, as in the first example, cite only
the year of publication in parentheses. Otherwise, place both the name and the date,
separated by a comma (as in the second example), in parentheses. In the rare case in
which both the year and the author are given as part of the textual discussion, do not add
parenthetical information:
In 1983, Smith compared
Within a paragraph, you need not include the year in subsequent references to a study as
long as the study cannot be confused with other studies cited in the article:
In a recent study of reaction times, Smith (1983) described the method. . . . Smith
also found

2. One Work by Two or More Authors
When a work has two authors, always cite both names every time the reference occurs in
text.
When a work has more than two authors and fewer than six authors, cite all authors the
first time the reference occurs; in subsequent citations include only the surname of the
first author followed by "et al." (not underlined and with no period after "et") and the year.
Example:
Williams, Jones, Smith, Brader, and Torrington (1983) found (first citation)
Williams et al. (1983) found (subsequent citation)
3. Corporate Authors
Example of the name of a corporate author (e.g., association, government agency) that
may be abbreviated:
Entry in reference list:
National Institute of Mental Health. (1981).
First text citation: (National Institute of Mental Health [NIMH], 1981)
Subsequent text citation: (NIMH, 1981)
4. Works With No Author or With an Anonymous Author
When a work has no author, cite in text the first two or three words of the reference list
entry (usually the title) and the year. Use double quotation marks around the title of an
article or chapter and underline the title of a periodical or book:
on free care ("Study Finds," 198)
the book College Bound Seniors (1979)
5. Personal Communications
Personal communications may be letters, memos, telephone conversations, and the
like. Because they do not provide recoverable data, personal communications are not
included in the reference list. Cite personal communications in text only. Give the
initials as well as the surname of the communicator and provide as exact a date as
possible:
J.O. Reiss (personal communication, April 18, 1983)
(J.O. Reiss, personal communication, April 18, 1983)
6. Newspaper Articles
Ordinarily newspaper items are not listed in the bibliography, but they are cited in
footnotes.
Footnote form:
1. Editorial in The Indianapolis News, January 6, 1968.
2. Associated Press Dispatch, The Milwaukee Journal, December 24, 1968, 1.
3. The Chicago Daily News, February 3, 1968, 6.
Bibliography
Best, John W. 1981. Research in Education: Fourth Edition. London: Prentice-Hall
International, Inc.
Becker Jr., Leonard. 1976. Encounter With Sociology: The Term Paper. San
Francisco: Boyd & Fraser Publishing Company.
Brown, H. Douglas. 2004. Language Assessment: Principles and Classroom
Practices. New York: Pearson Education, Inc.
Cohen, Louis; Manion, Lawrence; and Morrison, Keith. 2007. Research Methods in
Education. New York: Routledge.
Creswell, John W. 2003. Research Design: Qualitative & Quantitative Approaches.
California: Sage Publications Ltd.
Locke, Lawrence F.; Spirduso, Waneen Wyrick; and Silverman, Stephen J. 1976.
Proposals that Work: A Guide for Planning Research. Columbia: Teachers College.
Publication Manual of the American Psychological Association. 1974.
Washington D.C.: The Association.