Department of Public Health

QUANTITATIVE RESEARCH METHODS

GLOSSARY OF TERMS

Acknowledgement

This glossary of terms has been developed by the Department of Public Health, University of Liverpool,
and draws on a number of sources including ‘Measuring Health’ by Ann Bowling, ‘A-Z of Medical
Statistics’ by Filomena Pereira-Maxwell and ‘Epidemiological studies: a practical guide’ by Alan
Silman.

(Version 1.0 September 2002)

Glossary of Terms – Quantitative Research Methods


Glossary of terms

Accuracy quality of a measurement which is both correct and precise. The validity of a
measure depends on (among other factors) its accuracy.
Acquiescence response set (‘yes-saying’) respondents will more frequently endorse a
statement than disagree with its opposite.
Actuarial records public records about the demographic characteristics of the population
served.
Age-specific rate rate or frequency of occurrence of an event in a defined age group.
Analysis of variance (ANOVA) significance test for comparing the means of a quantitative
variable between three or more groups (an extension of the independent samples t-test).
Archives ongoing records maintained by institutions within society.
Assumptions specific conditions required by significance tests and other statistical methods
in order to produce valid results.
Attributable risk difference in the risk of a particular event between two groups (typically
the difference between the risk in an exposed population and that in an unexposed
population). Also known as absolute risk difference.
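As a minimal sketch of the attributable risk calculation described above, the following Python fragment subtracts the risk in an unexposed group from the risk in an exposed group. The counts and the helper name risk are invented for illustration and do not come from any study.

```python
def risk(events, total):
    """Proportion of a group that experiences the event."""
    return events / total

# Hypothetical counts, invented for illustration only.
risk_exposed = risk(30, 100)    # 30 events among 100 exposed people
risk_unexposed = risk(10, 100)  # 10 events among 100 unexposed people

# Attributable risk (absolute risk difference).
attributable_risk = risk_exposed - risk_unexposed
print(f"Attributable risk: {attributable_risk:.2f}")
```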
Attrition loss of sample members over time in longitudinal and experimental research.
Bias deviation in one direction of the observed value from the true value of the variable being
measured (as opposed to random error).
Bivariate statistics descriptive statistics for the analysis of the association between two
variables (e.g. contingency tables, correlation).
Blind concealing the assignment of people to experimental or control group in experiments.
Concealment can be from the subjects only (‘single blind’), or from both the subjects and
the research personnel carrying out the intervention and assessment (‘double blind’).
Box-and-whisker plot graphical method for displaying ordinal variables. Also useful to
describe quantitative variables which have a skewed distribution.
Case a single unit in a study (e.g. a person or setting, such as a clinic, hospital).
Case study a research method which focuses on the circumstances, dynamics and complexity
of a single case, or a small number of cases.
Case-control study analytical observational study which aims to investigate the relationship
between an exposure or risk factor and one or more outcomes. This is done by selecting a
group of subjects known to have the outcome or disease of interest (cases) and comparing
their exposure with that of a group of subjects known not to have the disease in question
(controls).
Categorical variable variable whose values represent different categories or classes of the
same feature.
Causal hypothesis a statement that it is predicted that one phenomenon will be the result of
one or more other phenomena that precede it in time.
Causal relationships observed changes (‘the effect’) in one variable are the result of prior
changes in another.
Censoring in the context of follow-up studies, the event (e.g. death) of a subject is said to be
censored if the same event is not observed within the follow-up period for that subject.
Note: Loss to follow-up frequently leads to censoring since the event remains unknown.

Central limit theorem the sampling distribution of the mean approaches a Normal
distribution as the sample size increases, whatever the distribution of the variable in the
population.
Central tendency (a) Mean: the arithmetic mean, or average, is a measure of central
tendency in a population or sample. The mean is defined as the sum of the values divided
by the total number of cases involved. (b) Median: this is the middle value of the
observations when listed in ascending order; it bisects the observations (i.e. the point below
which 50 per cent of the observations fall). (c) Mode: a measure of central tendency based
on the most common value in the distribution (i.e. the value of X with the highest
frequency).
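The three measures of central tendency can be illustrated with Python's standard statistics module. This is only a sketch: the list of values is hypothetical and chosen so that the mean, median and mode all differ.

```python
import statistics

# Hypothetical observations, already in ascending order.
values = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(values)      # sum of the values / number of cases
median_value = statistics.median(values)  # middle of the ordered observations
mode_value = statistics.mode(values)      # most frequently occurring value

print("mean:", mean_value, "median:", median_value, "mode:", mode_value)
```

With an even number of observations, as here, the median is taken as the midpoint of the two middle values.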
Clinical trial an experimental study (trial) where the participants are patients.
Closed question the question is followed by predetermined response choices into which the
respondent’s reply must be placed.
Cluster a sample unit which consists of a group of elements, for example a school.
Cluster sampling probability sampling involving the selection of groupings (clusters) and
selecting the sample units from the clusters.
Coding the assignation of (usually numerical) codes to each category of each variable.
Cohort the population has a common experience or characteristic which defines the sampling
(e.g. all born in the same year).
Cohort study analytical observational study which aims to investigate the relationship
between an exposure or risk factor and one or more outcomes. This is done by following
up two or more cohorts over a period of time (also known as longitudinal study). The level
of exposure of each cohort is established at the beginning of the study.
Concept an abstraction representing an object or phenomenon.
Conditional logistic regression regression method for paired or matched dichotomous data.
Common application for the analysis of case-control studies where cases and controls have
been individually matched.
Confidence interval a range of values, calculated from a sample, which is expected to
contain the true population value with the level of confidence specified (e.g. 95%).
Confounding factors an extraneous factor (a factor other than the variables under study), not
controlled for, distorts the results. An extraneous factor only confounds when it is related
to dependent variables and to the independent variables under investigation. It makes them
appear connected when their association is, in fact, spurious.
Contingency table used to summarise the association between two categorical variables. The
rows represent the different levels of one of the variables and the columns represent the
different levels of the other variable.
Continuous variable quantitative variable which can theoretically take any value within a
given range (e.g. height and weight).
Control group In experimental studies, the group that is not exposed to the independent
variable (intervention); in case-control studies, the group that does not have the disease or
condition of interest (not cases).
Control variable a variable used to test the possibility that an empirically observed
relationship between an independent and dependent variable is spurious.
Correlation linear association between two quantitative or ordinal variables, measured by a
correlation coefficient.
Correlation coefficient measure of the linear association between quantitative or ordinal
variables.

Cox regression regression method for modeling survival times. Also called proportional
hazards model since it assumes the ratio of risks (or hazard ratio) of the event (e.g. death),
at any particular time, between any two groups being compared, to be constant.
Cross-sectional at one point in time.
Cross-over design study design for a clinical trial in which all patients are given the two or
more treatments under investigation, such that each patient acts as his or her own control.
Cross-sectional study type of observational study with subjects being observed on just one
occasion.
Crude estimates estimates which are obtained without controlling for confounding factors as
opposed to adjusted estimates.
Data cleaning after the data have been entered on to the computer they are checked to detect
and correct errors and inconsistent codes.
Deduction a theoretical or mental process of reasoning by which the investigator starts off
with an idea, and develops a theory and hypothesis from it; then phenomena are assessed in
order to determine whether the theory is consistent with the observations.
Degrees of freedom measure used in significance tests and other statistical procedures,
which reflects the sample size(s) of the study group(s) used in an investigation.
Dependent variable(s) the variable the investigator wishes to explain - the dependent
variable is the expected outcome of the independent variable.
Determinism assumes that everything is caused by some factor in a predictable way;
explanations that are based on a few narrowly defined factors to the exclusion of all others.
Dispersion a summary of a spread of cases in a figure (measures include quartiles,
percentiles, deciles, standard deviation and the range).
Dummy variable in the context of regression, dummy or indicator variables are created
whenever it is necessary to incorporate a categorical variable (with more than two values)
into a regression model.
Ecological studies research where the unit of observation is a group of people rather than an
individual (e.g. schools, cities, nations).
Effect size a numerical index of the magnitude of an observed association.
Empirical based on observation.
Empiricism a philosophical approach that the only valid form of knowledge is that which is
gathered by use of the senses; explanations should be based on actual observations, rather
than theoretical statements.
Experiment a scientific method used to establish cause and effect relationships between the
independent and dependent variables. At its most basic, the experiment is a situation in
which the independent (experimental) variable is fixed by manipulation by the investigator
or by natural occurrence. The true experimental method involves the random allocation of
participants to experimental and control groups. Ideally, participants are assessed before
and after the manipulation of the independent variable in order to measure its effects on the
dependent variable.
Experimental group the group that is exposed to the independent variable (intervention) in
experimental research.
Explanatory variable see independent variable.
Exposure factor which is thought to be associated with the development or prevention of a
given condition or outcome.

Factor analysis multivariate method that analyses correlations between sets of observed
measurements, with the view to estimating the number of different factors which explain
these correlations.
Field research research which takes place in a natural setting.
Fit (goodness of) measure of how well a theoretical distribution of a specified model fits a
set of data. Based on the comparison of observed and expected frequencies.
Follow-up study longitudinal or prospective study in which information is collected by
following subjects over a period of time, thus allowing temporal relationships to be
investigated.
Frequency distribution the number of observations of each of the values of a given variable.
Geometric mean anti-log of a mean calculated from observations which have been
transformed to a log scale.
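The definition of the geometric mean can be followed step by step in a short Python sketch: transform to a log scale, take the mean, then take the anti-log. The values are hypothetical and natural logarithms are assumed, although any base gives the same result provided the anti-log uses the same base.

```python
import math

# Hypothetical positive measurements (logs are undefined for zero or negatives).
values = [1.0, 10.0, 100.0]

log_values = [math.log(v) for v in values]      # transform to a log scale
mean_log = sum(log_values) / len(log_values)    # mean on the log scale
geometric_mean = math.exp(mean_log)             # anti-log of that mean

arithmetic_mean = sum(values) / len(values)     # for comparison only
print(f"geometric mean: {geometric_mean:.2f}, arithmetic mean: {arithmetic_mean:.2f}")
```

For skewed data such as these, the geometric mean is much less influenced by the largest value than the arithmetic mean.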
Gold standard in the context of diagnostic testing, it refers to a valid diagnostic tool which
consistently gives the correct diagnosis (i.e. reliable and accurate).
Hazard ratio measure of relative risk used in survival studies.
Heterogeneity (or lack of homogeneity). Term usually employed in the context of meta-
analyses, when the results or estimates from individual studies appear to have different
magnitude. Tests of heterogeneity are available to assess the extent of this variation.
Hierarchical data data on different levels or layers (e.g. household, individual member of
household).
Hypothesis a tentative solution to a research question, expressed in the form of a prediction
about the relationship between the dependent and independent variables.
Hypothetico-deductive method beginning with a theory and, in a deductive way, deriving
testable hypotheses from it, the hypotheses are then tested by gathering and analysing data
and the theory is supported or refuted (see deduction).
Incidence The number of new cases (e.g. of disease) occurring in a population in a defined
period of time, expressed as a rate.
Independent variable(s) the explanatory variable - the variable hypothesised to explain the
dependent variable(s).
Induction begins with the observation and measurement of phenomena and then develops
ideas and general theories about the universe of interest.
Inferential statistics these enable the researcher to make inferences about the characteristics
of the population of interest on the basis of observations made on a sample of that
population.
Information bias misclassification of, for example, people’s responses due to error or bias.
Intention-to-treat analysis in the context of randomized controlled trials, analysis in which
subjects are kept in the treatment groups to which they were allocated, even if they
change that treatment during the course of the study for health reasons, or through lack of
compliance, etc.
Interaction the direction and/or magnitude of the association between two variables depends
on the value of one or more other variables; also known as effect modification.
Intercept in the context of regression models, the intercept is a constant value, specific to
any given model, which represents the estimated value of the outcome (dependent) variable
when the explanatory (independent) variable is equal to zero.

Interquartile range measure of the variability of a set of measurements. It is the interval
delimited by the 25th and 75th percentiles and comprises 50% of the observations of a
variable.
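A sketch of the interquartile range using statistics.quantiles from the Python standard library (available from Python 3.8). Several conventions exist for estimating percentiles from a sample; this example assumes the function's default method, so other software may give slightly different quartiles for the same data. The data are hypothetical.

```python
import statistics

# Hypothetical measurements.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 returns the three cut points that divide the data into quarters:
# the 25th percentile, the median and the 75th percentile.
q1, q2, q3 = statistics.quantiles(data, n=4)

interquartile_range = q3 - q1
print(f"25th percentile: {q1}, 75th percentile: {q3}, IQR: {interquartile_range}")
```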
Interval data the data points (classes) are ordered and the size of the difference between the
points is specified, but the zero point and unit of measurement are arbitrary (e.g.
temperature - the zero point differs on the two scales commonly used).
Interview a research method which involves a trained interviewer asking questions and
recording respondents’ replies. Interview questions can be structured (printed on a
questionnaire with set question wording and pre-coded response categories), semi-
structured (mostly open-ended questions, i.e. with no pre-coded response categories) or
unstructured and in-depth (listed topics about which interviewers probe respondents for
their views and experiences).
Kaplan-Meier method method of determining survival probability over a period of time, in
which the probabilities are calculated at exact points in time where an event of interest has
occurred in the study group(s). This information can be used to construct a survival curve.
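The Kaplan-Meier calculation can be sketched in a few lines of Python. The function name kaplan_meier and the follow-up data are hypothetical. The sketch assumes the usual convention that a subject censored at the same time as an event is still counted as at risk at that time.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time an event occurred.

    times  : follow-up time for each subject
    events : 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving time t.
            survival *= (at_risk - deaths) / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up times (months) and event indicators.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"time {t}: estimated survival {s:.2f}")
```

Plotting the returned pairs as a step function gives the survival curve referred to in the definition.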
Leading question question phrased in a way that leads the respondent to believe that a
certain reply is expected.
Lead-time bias in the context of screening or early diagnosis, type of bias which is caused by
detection of disease at a presymptomatic stage, without, however, the possibility of
offering a better treatment than that which can be offered to symptomatic patients.
Life table table where survival (or failure) experience of a group of people, or cohort, over a
follow-up period, is recorded. The cumulative chance of surviving various time intervals
can be used from the life table to construct a survival curve and also used to calculate life
expectancy. Also called the Actuarial Table.
Logistic regression regression method for modeling with categorical outcome variables (for
example: died/did not die; smoker/not smoker, etc.).
Longitudinal at more than one point in time.
Matching selection of controls in case-control studies, with a view to ensure a similar
distribution of important risk and/or prognostic factors (frequently age and sex) in the two
study groups (cases and controls). Where matching has been carried out, matched analysis
is the preferred and more efficient approach.
Median measure of the centre of a distribution. As opposed to the mean, it is said to be a
robust measure, given that it is not greatly affected by the presence of outliers.
Meta-analysis quantitative synthesis of primary data to produce an overall summary statistic.
Missing data information that is not available for a particular case (e.g. person) for which
other information is available (e.g. owing to item non-response).
Model in the context of regression, a model is an equation which summarises the relationship
between an outcome (dependent) variable and one or more explanatory (independent)
variables.
Multiple regression process of fitting a regression model with more than one explanatory
(independent) variable, as opposed to simple regression analysis. In particular, multiple
regression is used in cases where it is necessary to adjust for confounders or check for the
presence of interactions.
Multiple significance testing multiple significance tests which are carried out on the same
body of data. The likelihood of a Type I error increases with the number of tests carried
out. For an alpha level of 0.05 (5%), if 20 tests are carried out then, on average, one of
these test results would be expected to be significant even in the absence of a real
effect (Type I error). In general, the risk of this error can be reduced by applying a more stringent
level of probability, for example 0.01. The Bonferroni method provides a formal means
of setting the significance level for a given number of hypothesis tests.
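The arithmetic behind the 20-test example can be sketched as follows. The sketch assumes the tests are independent, which is a simplification, and uses the simplest form of the Bonferroni method (dividing the alpha level by the number of tests). Variable names are illustrative.

```python
alpha = 0.05   # significance level for each individual test
n_tests = 20   # number of tests carried out on the same data

# Expected number of falsely significant results if no real effect exists.
expected_false_positives = alpha * n_tests

# Probability of at least one Type I error, assuming independent tests.
prob_at_least_one = 1 - (1 - alpha) ** n_tests

# Bonferroni: per-test level that keeps the overall level at about alpha.
bonferroni_level = alpha / n_tests

print(f"expected false positives: {expected_false_positives:.1f}")
print(f"P(at least one Type I error): {prob_at_least_one:.2f}")
print(f"Bonferroni per-test level: {bonferroni_level:.4f}")
```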
Multivariate statistics analysis of three or more variables simultaneously; for example, they
can explain the association of two variables after adjusting for one or more others (e.g.
multiple and logistic regression analysis, factor analysis).
Nominal data the classes are mutually exclusive, but have no intrinsic order or value (e.g.
classification of cities: Berlin, London, Milan, Paris, Stockholm).
Normal distribution a mathematically defined curve which is an ideal or a theoretical
distribution that occurs frequently in real life, especially in sampling. The Normal
distribution is a symmetrical, bell-shaped curve, rising smoothly from a small number of
cases at both extremes to a large number of cases in the middle; and the average (mean)
corresponds to the peak of the distribution; it is enveloped by a curve and equation.
Non-parametric methods statistical methods for the analysis of data which do not conform
with the requirements for parametric methods. Methods are based on the ranks given to the
values of the observations, rather than the actual observations.
Null hypothesis a statement that there is no relationship between the dependent and
independent variables.
Odds the ratio of the number of times an event occurs to the number of times it does not occur,
out of a given number of chances.
Odds ratio (OR) ratio of two odds, often used in epidemiological studies as a measure of
relative risk, to compare exposed vs non-exposed or intervention vs control groups.
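A short sketch of how odds and the odds ratio are obtained from a two-by-two table. The counts are hypothetical and the variable names are illustrative only.

```python
# Hypothetical two-by-two table.
exposed_with_event = 20
exposed_without_event = 80
unexposed_with_event = 10
unexposed_without_event = 90

# Odds: number of times the event occurs / number of times it does not.
odds_exposed = exposed_with_event / exposed_without_event
odds_unexposed = unexposed_with_event / unexposed_without_event

# Odds ratio: ratio of the two odds.
odds_ratio = odds_exposed / odds_unexposed
print(f"odds (exposed): {odds_exposed:.3f}")
print(f"odds (unexposed): {odds_unexposed:.3f}")
print(f"odds ratio: {odds_ratio:.2f}")
```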
One-sided test significance test which only explores one alternative hypothesis (effect in one
direction only) to the null hypothesis, when comparisons are being made.
Ordinal data classes which can be placed in rank order (e.g. bigger than, preferred to) but in
which the amount by which one class is bigger than/preferred is not specified (e.g.
behaviour and attitudes: much more, more, about the same, less, much less; strongly agree,
agree, neither agree nor disagree, disagree, strongly disagree; social class I professional, II
semiprofessional, III non-manual, III manual, IV semi-skilled, V unskilled).
Outcome variable also dependent or response variable. It represents the characteristic or
measurement which is used to test the main hypothesis in an investigation.
Outliers values in a set of observations, which are much higher or lower than the majority of
values. An important consequence of the presence of outliers is that the data may not be
normally distributed.
Over-matching in the context of matching, occurs when cases and controls are matched for
variables which are closely associated with the exposure(s) under study.
P value P is the symbol for the probability associated with the outcome of a test of a null
hypothesis, i.e. the probability of obtaining a result at least as extreme as the one observed
if the null hypothesis were true (as in P < 0.05). Statistical tests exist which, in appropriate
study designs and samples, can test for the probability of observing the values obtained.
[Note that p (small p) is used for proportions].
Paradigm a set of ideas (hypotheses) about the phenomena under inquiry.
Paradigm shift this occurs if, over time, evidence accumulates which refutes, or is
incompatible with, the paradigm, and thus the old paradigm is replaced by the new one.
Parametric methods statistical methods of data analyses which rely on one or more
distributional assumptions for the data being analysed, commonly Normality (data follow
an approximate Normal distribution).
Perspective a way of interpreting empirical phenomena.

Placebo inactive or dummy treatment which, in a clinical trial, is given to the control group
in order to prevent information biases, since it enables both patients and researchers to
remain blind to the treatments given.
Poisson regression regression method for the analysis of counts (e.g. number of cases of a
rare disease in different geographical areas) and rates (e.g. mortality rates).
Population attributable risk measure of the impact an exposure or risk factor has on a given
population, in terms of excess risk of disease attributable to that factor in a given
population.
Positivism positivism aims to discover laws using quantitative methods and emphasises
positive facts. It assumes that human behaviour is a reaction to (i.e. determined by)
external stimuli and that it is possible to observe and measure social phenomena, using the
principles of the natural scientist, and to establish a reliable and valid body of knowledge
about its operation based on empiricism and the hypothetico-deductive method.
Power calculation a measure of how likely the study is to produce a statistically significant
result for a difference between groups of a given magnitude (i.e. the ability to detect a true
difference and avoid a Type II error).
Precision the ability of a measure to detect small changes in a variable.
Prediction forecast of the value for a variable, based on the knowledge of the value of at
least one other variable.
Predictive values in the context of diagnostic testing, predictive values measure how useful a
test is in practice. The positive predictive value (PPV) of a test is the probability of
actually having a condition given that the test result is positive. The negative predictive
value (NPV) is the probability of not having the condition given that the test is negative.
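The two predictive values can be computed directly from the counts of test results against true disease status. The following sketch uses hypothetical counts. In practice both values depend on how common the condition is in the group tested.

```python
# Hypothetical results of a diagnostic test compared with a gold standard.
true_positives = 90    # test positive, condition present
false_positives = 30   # test positive, condition absent
false_negatives = 10   # test negative, condition present
true_negatives = 870   # test negative, condition absent

# PPV: probability of having the condition given a positive test.
ppv = true_positives / (true_positives + false_positives)

# NPV: probability of not having the condition given a negative test.
npv = true_negatives / (true_negatives + false_negatives)

print(f"PPV: {ppv:.2f}, NPV: {npv:.3f}")
```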
Predictor variable explanatory or independent variable. In the context of regression, it
refers to a variable used to determine or predict the values of another variable called the
outcome (dependent variable).
Prevalence (point) the number of cases (e.g. of disease) in a population at one point in time,
expressed as a ratio of the population’s size. Note: prevalence is a ratio not a rate.
Prevalence (period) the number of cases (e.g. of disease) in a population over a specified
period of time, expressed as a ratio of the population’s size.
Prospective study collection of data over the forward passage of time (future).
Publication bias type of bias which arises due to selective publication in medical journals of
articles which report statistically significant results.
Qualitative research social research which is carried out in the field (natural settings) and
analysed largely in non-statistical ways.
Quantitative research the measurement and analysis of observations in a numerical way.
Random error the errors in the study (usually from the sampling) randomly vary and sum to
zero over enough cases; random error results in an estimate being equally likely to be
above or below the true value. Random error, of itself, results in lower precision, but not
bias.
Random sampling this gives each of the units in the target population a calculable and (for
simple random sampling) an equal probability of being selected.
Randomisation assignment at random of people to experimental and control groups in
experiments.
Randomised controlled trial (RCT) clinical trial where at least two treatment groups are
compared, one of them serving as the control group, and treatment allocation is carried out
using a random, unbiased method.

Range a measure of dispersion that is based on the lowest and highest values observed.
Ratio data scores are assigned on a scale with equal intervals and a true zero point.
Rate summary measure which conveys the idea of risk over time. The denominator is
expressed as person-time at risk and the numerator is the number of occurrences of a
particular event.
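A minimal sketch of a rate with a person-time denominator. The follow-up times and number of events are hypothetical, and expressing the result per 1000 person-years is simply one common choice of scale.

```python
# Hypothetical follow-up: years at risk contributed by each of five subjects.
person_years = [2.0, 3.5, 1.5, 5.0, 8.0]
events_observed = 4  # occurrences of the event during follow-up

total_person_time = sum(person_years)        # denominator: person-time at risk
rate = events_observed / total_person_time   # events per person-year
rate_per_1000 = rate * 1000                  # events per 1000 person-years

print(f"{rate_per_1000:.0f} events per 1000 person-years")
```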
Reactive (Hawthorne) effect a guinea pig effect (awareness of being studied). If people feel
they are being tested they may feel the need to create a good impression, or if the study
stimulates new interest in the topic under investigation then the results will be distorted.
Regression statistical method used to describe an association between variables, and for the
purpose of prediction. In simple linear regression, the relationship between the outcome
(dependent) variable (y) and the explanatory (independent) variable (x), is summarized by
means of a model with the formula for a straight line (y = a + bx), where ‘a’ is the intercept
and ‘b’ the slope or ‘regression coefficient’. See also multiple regression, logistic
regression and Cox proportional hazards regression.
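The straight-line model y = a + bx can be fitted by ordinary least squares, sketched here in plain Python. The function name fit_line and the data are hypothetical. The data were chosen to lie exactly on a line so that the fitted intercept and slope are easy to check by hand.

```python
def fit_line(x, y):
    """Least-squares estimates of the intercept a and slope b in y = a + bx."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx              # slope (regression coefficient)
    a = mean_y - b * mean_x    # intercept: predicted y when x is zero
    return a, b

# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

a, b = fit_line(x, y)
predicted = a + b * 6  # prediction for a new value x = 6
print(f"intercept a = {a}, slope b = {b}, prediction at x = 6: {predicted}")
```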
Regression coefficient or slope of the line of best fit. It represents the increments predicted
in the outcome variable for each unit increase in the explanatory (independent) variable.
Reductionism the view that the phenomenon of interest can be explained within the lowest
level of investigation (e.g. in biology, the cellular or chemical level). In sociology, this is
known as atomism, which argues that the social system is no more than a collection of
individuals, and in order to understand the social system we simply need to understand
individuals.
Relative risk the incidence rate for the condition in the population exposed to a phenomenon
divided by the incidence rate in the non-exposed population. Relative risk is a ratio.
Relativism no single system of knowledge or beliefs (or ‘social facts’) exists; it is dependent
on context (i.e. culture).
Reliability the extent to which the measure is consistent when repeat measurements are
made, and minimises random error (its repeatability).
Regression to the mean an extreme measurement of a variable of interest which contains a
degree of random error; on subsequent measurements, this value will tend to return to
normal. The implication is that if a group of patients with a severe disease rating at a
particular point in time have been selected for study, they may improve in the short term
independently of any intervention simply because of the random variation inherent in the
disease.
Repeatability refers to the variability of repeated measurements taken under similar
conditions. Synonymous with reliability.
Research design this refers to the strategy of the research - how the sampling is conducted,
whether a descriptive or experimental design is selected, whether control groups are
needed, what variables need to be operationalised and measured, what analyses will be
conducted.
Research methods, or techniques these are the methods of data collection: interview,
telephone, postal surveys, diaries and analyses of documents, observational methods and so
on. They also describe the instruments that are to be used.
Residuals in the context of regression, residuals are the numerical differences between
observed and predicted values. The analysis of the pattern of residuals is useful in
determining the appropriateness of a particular model to the data it proposes to describe.
Response rate the number (most usefully expressed as a percentage) of people who respond
positively to the invitation to take part in the study.

Responsiveness a measure of the association between the change in the observed score and
the change in the true value of the construct (see also sensitivity).
Retrospective study collection of data over past time (looking backwards).
Risk factors factors, in particular the subject’s characteristics, history of disease, family
history, occupational exposure, and socioeconomic and demographic factors, which
increase an individual’s probability of disease when compared to individuals in whom the
factors in question are absent (or have a lower level).
Risk ratio ratio of the risk of an event in one group (exposure or intervention) to that in
another group (control). The term relative risk is sometimes used as a synonym of risk
ratio.
Sample a subset of a population.
Sample size Number of subjects required in a study so that differences thought to be
clinically important can also be detected as statistically significant, if indeed they do exist
(avoidance of Type II error).
Sampling techniques used to obtain a subset of a population without the expense of
conducting a census (census means gathering of information from all members of a
population).
Sampling distribution the distribution of means of all possible different samples of n
observations that can be obtained from this population. It has a mean equal to the
population mean. It is a Normal distribution (assuming the sample size is reasonably large).
Sampling error any sample is just one of an almost infinite number that might have been
selected, all of which can produce slightly different estimates. Sampling error is the
probability that any one sample is not completely representative of the population from
which it was drawn.
Sampling frame a list of the sampling units from which the sample can be drawn.
Scatterplot graphical method for displaying the association between two quantitative
variables. Note: it is a good idea to present a scatterplot when assessing the correlation
between two variables, or when regression models are developed, to assess the type of
relationship (linear or other) present.
Screening clinical, laboratory, radiological or other tests, carried out for the purpose of
identifying risk factors for disease (usually done on healthy, asymptomatic individuals).
Selection bias bias in the sample obtained. Systematic differences between a sample and its
source population, usually caused by inappropriate sampling.
Sensitivity ability of the actual gradations in a measurement scale’s scores to reflect
changes in the variable adequately; in the context of diagnostic testing, the probability of
correctly identifying an affected person (‘case’).
Sensitivity analysis a method for making plausible assumptions about the margins of errors
in the results, and assessing whether they affect the implications of the results. The margins
of error can be calculated using the confidence intervals of the results or they can be
guessed.
Simple random sample a probability sampling method that gives each sampling unit an
equal chance of being selected in the sample.
Simple regression regression in which a single explanatory (independent) variable is used in
a model predicting an outcome (dependent variable).
Skewed distribution a distribution in which more observations fall on one side of the mean
than the other.
Specificity a measure of the probability of correctly identifying a non-affected person (i.e.
non-case) with the measure.

Spurious association an observed association between the independent and dependent
variables which is false (spurious) because the association is caused by a third extraneous
variable which intervenes. If the latter is controlled the observed association disappears.
Standard deviation this is the most common measure of dispersion of continuous variables.
It is based on the difference of values from the mean value (the spread of individual results
round a mean value); it is the square root of the arithmetic mean of the squared deviations
from the mean.
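
The definition above can be followed step by step in a short sketch. The data values are hypothetical. The calculation divides by the number of observations, as the definition states; many texts and software packages instead divide by n - 1 when estimating the standard deviation from a sample, which is what `statistics.stdev` does.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements

mean = sum(values) / len(values)
# Squared deviation of each value from the mean.
squared_deviations = [(x - mean) ** 2 for x in values]
# Square root of the arithmetic mean of the squared deviations.
population_sd = math.sqrt(sum(squared_deviations) / len(values))

# Sample version, dividing by n - 1 rather than n.
sample_sd = statistics.stdev(values)

print(population_sd, sample_sd)
```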
Standard error this is a measure of the uncertainty in a sample statistic; the standard
deviation of the sampling distribution is called the standard error. It is related to the
population variation. The standard error of a mean is the standard deviation of the
population divided by the square root of the sample size. The formula is given in standard
statistical texts.
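
A minimal sketch of the standard error of a mean follows. Because the population standard deviation is rarely known, the sketch assumes it is estimated by the sample standard deviation; the function name and the data are hypothetical.

```python
import math
import statistics


def standard_error_of_mean(values):
    """Estimate the standard error of the mean as SD / sqrt(n)."""
    # Assumption: the sample standard deviation stands in for the
    # unknown population standard deviation.
    return statistics.stdev(values) / math.sqrt(len(values))


values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements
se = standard_error_of_mean(values)
print(f"mean = {statistics.mean(values)}, standard error = {se:.3f}")
```

Larger samples give smaller standard errors, since the sample size appears in the denominator.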
Standardisation statistical method used to compare rates in populations with different age
structures (or differences in other attributes such as social class).
Standardised mortality rate deaths (e.g. per 1000 of the population) standardised for age.
Standardised mortality ratio compares the observed number of deaths in the index
population (a particular region or group of interest), with the expected number of deaths.
This is an indirect method of standardisation. Thus, for age standardisation, the expected
deaths are determined by applying age-specific rates of a standard population (e.g. the
country) to the age-specific populations in the index population, and taking the sum. The
observed/expected ratio is usually expressed as a percentage, with 100% being the value
for the standard population.
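
The indirect standardisation described above can be sketched as follows. The age bands, rates, population sizes and observed deaths are all hypothetical figures chosen for illustration.

```python
# Age-specific death rates in the standard population (deaths per person per year).
standard_rates = {"0-44": 0.001, "45-64": 0.005, "65+": 0.04}

# Age-specific population sizes in the index population.
index_population = {"0-44": 20000, "45-64": 10000, "65+": 5000}

observed_deaths = 297  # deaths actually recorded in the index population

# Expected deaths: apply each standard rate to the matching index
# population and take the sum over age groups.
expected_deaths = sum(standard_rates[age] * index_population[age]
                      for age in index_population)

# Observed/expected ratio expressed as a percentage.
smr = 100 * observed_deaths / expected_deaths
print(f"expected = {expected_deaths:.0f}, SMR = {smr:.0f}%")
```

With these figures the expected number of deaths is 270, so the ratio is 110%, i.e. 10% more deaths than expected if the index population had experienced the standard population's age-specific rates.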
Statistical significance significance at the 5 per cent (0.05) level means that, if the null
hypothesis is true, five times in 100 the results could have occurred by chance, i.e. if the
test was performed 100 times, on about five occasions significant results would occur by
chance alone. See also multiple significance testing.
Stepwise regression method of selection of variables to be included as explanatory variables
in multiple regression models. Can be carried out using forwards or backwards selection.
Stratification computation of estimates or significance tests for each ‘stratum’ (level) of a
classifying variable. Used when considering confounding.
Stratified sampling method of sampling which aims to produce a sample that is
representative of all strata in a given population. A stratum, for example, might be an
administrative area.
Survey a method of collecting information from a sample of the population of interest
(known as a sample survey).
Survival analysis analysis of survival studies (where the outcome may be death or any other
event of interest), usually concerned with predicting length of survival given a number of
characteristics or prognostic factors, or with comparing the survival experiences of two or
more groups of individuals. See also Life Tables, Kaplan-Meier curves and Cox
regression.
Systematic error errors in the study that make an estimate more likely to fall consistently
either above or below the true value, depending upon the nature of the systematic error in
any particular case.
Systematic research the process of research should be based on an agreed set of rules and
processes which are rigorously adhered to, and against which the research can be
evaluated.
Systematic review of the literature review prepared with a systematic approach to
minimising biases and random errors, and including components on materials and methods.

Systematic (random) sampling a sample in which every kth case (for example, every 10th
case) is selected from the population (n) (with a random starting point).
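
A minimal sketch of selecting every kth case from a random starting point is given below. The function name and the numbered sampling frame are hypothetical; the optional `rng` argument is included only so that the draw can be reproduced.

```python
import random


def systematic_sample(population, k, rng=None):
    """Select every kth case, starting at a random point among the first k."""
    rng = rng or random.Random()
    start = rng.randrange(k)  # random starting index in 0 .. k-1
    return population[start::k]


# Hypothetical sampling frame of 100 numbered cases, taking every 10th case.
frame = list(range(1, 101))
sample = systematic_sample(frame, k=10, rng=random.Random(0))
print(sample)
```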
Theory a set of logically interrelated propositions and their implications.
Transformation data manipulations which attempt to find the right measurement scale,
usually for variables which are not normally distributed. Logarithmic transformation is
most commonly used. See also Geometric mean.
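
As an illustration of a logarithmic transformation, the sketch below takes the mean on the log scale and back-transforms it, which gives the geometric mean. The data values are hypothetical; `statistics.geometric_mean` (available from Python 3.8) is shown only as a cross-check.

```python
import math
import statistics

values = [1, 10, 100]  # hypothetical positively skewed measurements

# Transform to the log scale, where the values are evenly spaced.
log_values = [math.log(x) for x in values]
mean_log = sum(log_values) / len(log_values)

# Back-transforming the mean of the logs gives the geometric mean.
geometric_mean = math.exp(mean_log)

print(geometric_mean, statistics.geometric_mean(values), statistics.mean(values))
```

Here the geometric mean is close to 10, well below the arithmetic mean of 37, which is pulled upwards by the largest value.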
Trend (test for) Special form of standard significance tests to assess whether there is
evidence of a trend, used when the grouping variable is ordinal.
Two-sided test significance test which explores both alternative hypotheses to the null
hypothesis.
Type I error (or alpha error) the error of rejecting a true null hypothesis.
Type II error (or beta error) the failure to reject (i.e. acceptance of) a null hypothesis when
it is actually false.
Univariate statistics descriptive statistics for the analysis (description) of one variable (e.g.
frequency distributions, statistics of central tendency and dispersion).
Validity, external the extent to which the research findings can be generalised to the wider
population of interest and applied to different settings.
Validity, internal the extent to which the instrument is really measuring what it purports to
measure.
Variable an indicator assumed to represent the underlying construct or concept, produced by
the operationalisation of the latter.
Variance measure of spread or variability of quantitative measurements. Variance is the
square of the standard deviation.
