
HANDOUT 1 FOR THE COVERAGE OF THE FINAL EXAM: PART I: ANALYSIS OF QUANTITATIVE DATA (POLIT CHAPTERS 21 & 22)

Types of Statistics:
1. Descriptive statistics: used to describe and synthesize data
2. Inferential statistics: used to make inferences about the population based on sample data

Review: LEVELS OF MEASUREMENT
1. Nominal measurement: involves assigning numbers to classify characteristics into categories
2. Ordinal measurement: involves sorting objects based on their relative standing on an attribute
3. Interval measurement: occurs when objects are rank-ordered on a scale that has equal distances between points on the scale
4. Ratio measurement: occurs when there are equal distances between score units and there is a rational, meaningful zero

Descriptive Statistics: Frequency Distribution (to condense data): a systematic arrangement of numeric values on a variable from lowest to highest, with a count of the number of times each value was obtained
Frequency distributions can be described in terms of:
- Shape
- Central tendency (measures of central tendency; also descriptive statistics)
- Variability (also descriptive statistics, along with the range and percentiles)

Construction of Frequency Distribution:
- Can be presented in tabular form (counts and percentages)
- Can be presented graphically
  - Histograms
  - Frequency polygons
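As a rough illustration of the tabular form, the sketch below builds a frequency distribution (value, count, percentage) in Python. It borrows the ten scores that this handout uses in its central tendency examples; the function name `frequency_distribution` is a hypothetical helper, not something from Polit.

```python
from collections import Counter

# The ten scores used in this handout's central tendency examples.
scores = [2, 3, 3, 3, 4, 5, 6, 7, 8, 9]


def frequency_distribution(values):
    """Return (value, count, percentage) rows ordered from lowest to highest value."""
    counts = Counter(values)
    n = len(values)
    return [(value, counts[value], 100.0 * counts[value] / n) for value in sorted(counts)]


table = frequency_distribution(scores)

print("Value Count Percent")
for value, count, pct in table:
    print(f"{value:>5} {count:>5} {pct:>6.1f}%")
```

Plotting the counts against the values as bars would give a histogram; joining the same points with lines would give a frequency polygon.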

Shapes of Frequency Distribution
1. Symmetry
- Symmetric
- Skewed (asymmetric)
  - Positive skew (long tail points to the right)
  - Negative skew (long tail points to the left)

Examples of Skewed Distributions

Copyright 2008 Wolters Kluwer Health | Lippincott Williams & Wilkins

2. Peakedness (how sharp the peak is)
3. Modality (number of peaks)
- Unimodal (1 peak)
- Bimodal (2 peaks)
- Multimodal (2+ peaks)

Note: The shape of a distribution could be described as, for example: symmetric, unimodal, not too peaked, not too flat.

MEASURES OF CENTRAL TENDENCY: indexes of the "typicalness" of a set of scores that come from the center of the distribution

1. Mode: the most frequently occurring score in a distribution; useful mainly as a gross descriptor, especially of nominal measures
2 3 3 3 4 5 6 7 8 9 Mode = 3

2. Median: the point in a distribution above which and below which 50% of cases fall; useful mainly as a descriptor of the typical value when the distribution is skewed
2 3 3 3 4 5 6 7 8 9 Median = 4.5

3. Mean: the sum of all scores divided by the total number of scores (the average); the most stable and widely used indicator of central tendency
2 3 3 3 4 5 6 7 8 9 Mean = 5.0
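The three worked examples above can be checked with Python's standard `statistics` module. This is only a quick verification sketch on the handout's own ten scores.

```python
import statistics

# The ten scores from the worked examples above.
scores = [2, 3, 3, 3, 4, 5, 6, 7, 8, 9]

mode = statistics.mode(scores)      # most frequently occurring score
median = statistics.median(scores)  # midpoint; averages the two middle scores when n is even
mean = statistics.mean(scores)      # sum of scores divided by number of scores

print("Mode:", mode)
print("Median:", median)
print("Mean:", mean)
```

Because the number of scores is even, the median is the average of the fifth and sixth ordered scores (4 and 5).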

Weighted Mean: the overall average of the responses/perceptions of the study respondents, computed by weighting each scale value by the number of respondents who chose it

Scale | Interval  | Adjectival Description* | Interpretation**
5     | 4.51-5.00 |                         |
4     | 3.51-4.50 |                         |
3     | 2.51-3.50 |                         |
2     | 1.51-2.50 |                         |
1     | 1.00-1.50 |                         |

* Adjectival Description as used in the questionnaire; could be in terms of agreeability, frequency, etc.
** Interpretation as answer or interpretation, based on the SOP
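A minimal sketch of how a weighted mean might be computed for one questionnaire item and then matched to the interval table above. The response counts are hypothetical, invented only for illustration, and the helper names `weighted_mean` and `scale_band` are assumptions rather than anything defined in the handout.

```python
# Hypothetical tally for one item: scale value -> number of respondents who chose it.
responses = {5: 12, 4: 20, 3: 10, 2: 5, 1: 3}

# Lower limit of each interval from the table above, paired with its scale point.
BANDS = [(4.51, 5), (3.51, 4), (2.51, 3), (1.51, 2), (1.00, 1)]


def weighted_mean(counts):
    """Weight each scale value by its respondent count, then divide by total respondents."""
    total_respondents = sum(counts.values())
    return sum(value * n for value, n in counts.items()) / total_respondents


def scale_band(mean_value):
    """Return the scale point whose interval contains the mean (rounded to 2 decimals)."""
    rounded = round(mean_value, 2)
    for lower_limit, scale_point in BANDS:
        if rounded >= lower_limit:
            return scale_point
    raise ValueError("weighted mean is below the lowest interval")


wm = weighted_mean(responses)
band = scale_band(wm)
print(f"Weighted mean = {wm:.2f}, falls in the interval for scale point {band}")
```

The adjectival description and interpretation for the resulting scale point would then be read from the questionnaire and the SOP, which the table leaves blank.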

VARIABILITY OF DISTRIBUTIONS: the degree to which scores in a distribution are spread out or dispersed
- Homogeneity: little variability
- Heterogeneity: great variability

Two Distributions of Different Variability


INDEXES OF VARIABILITY

- Range: highest value minus lowest value
- Standard deviation (SD): the average amount of deviation of scores from the mean of a distribution
- Variance: the standard deviation squared
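The three indexes can be illustrated on the same ten scores used for the central tendency examples. The sketch assumes the sample formulas (n - 1 in the denominator), which is what `statistics.variance` and `statistics.stdev` compute; the handout itself does not say which denominator it intends.

```python
import statistics

# The ten scores from the central tendency examples.
scores = [2, 3, 3, 3, 4, 5, 6, 7, 8, 9]

value_range = max(scores) - min(scores)         # highest value minus lowest value
sample_sd = statistics.stdev(scores)            # sample SD (n - 1 denominator, an assumption)
sample_variance = statistics.variance(scores)   # sample variance; equals the SD squared

print("Range:", value_range)
print(f"SD: {sample_sd:.2f}")
print(f"Variance: {sample_variance:.2f}")
```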

Standard Deviations in a Normal Distribution


CONTINGENCY TABLE (CROSS-TABS)
- A two-dimensional frequency distribution; frequencies of two variables are crosstabulated
- Cells at intersection of rows and columns display counts and percentages
- Variables must be nominal or ordinal

Contingency Table for Gender and Smoking Status Relationship
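As a sketch of how such a table is built, the code below crosstabulates two nominal variables and prints counts with row percentages. The records are hypothetical and are not the data from the Polit figure; `crosstab` is an illustrative helper name.

```python
from collections import Counter

# Hypothetical (gender, smoking status) records, invented for illustration only.
records = (
    [("female", "nonsmoker")] * 10
    + [("female", "smoker")] * 6
    + [("male", "nonsmoker")] * 8
    + [("male", "smoker")] * 8
)


def crosstab(pairs):
    """Return a nested dict: row category -> column category -> cell count."""
    cells = Counter(pairs)
    row_labels = sorted({row for row, _ in pairs})
    col_labels = sorted({col for _, col in pairs})
    return {row: {col: cells[(row, col)] for col in col_labels} for row in row_labels}


table = crosstab(records)

for row_label, row in table.items():
    row_total = sum(row.values())
    for col_label, count in row.items():
        pct = 100 * count / row_total
        print(f"{row_label:<7} {col_label:<10} n={count:>3} ({pct:.1f}% of row)")
```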


CORRELATION PROCEDURES
- Indicate direction and magnitude of relationship between two variables
- Used with ordinal, interval, or ratio measures
- Can be shown graphically (scatter plot)
- Correlation coefficient (usually Pearson's r) can be computed
- With multiple variables, a correlation matrix can be displayed

Various Relationships Graphed on Scatter Plots
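A minimal sketch of Pearson's r computed from its definition (the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations). The data are made up to show a perfect positive, a perfect negative, and a moderately strong positive relationship; `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, with at least 2 pairs")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross_products / math.sqrt(ss_x * ss_y)


x = [1, 2, 3, 4, 5]  # made-up scores

r_positive = pearson_r(x, [2, 4, 6, 8, 10])   # y rises exactly with x
r_negative = pearson_r(x, [10, 8, 6, 4, 2])   # y falls exactly as x rises
r_moderate = pearson_r(x, [2, 1, 4, 3, 5])    # positive but imperfect relationship

print(f"{r_positive:.2f} {r_negative:.2f} {r_moderate:.2f}")
```

The sign of r gives the direction of the relationship and its absolute value gives the magnitude.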


DESCRIBING RISK
- Absolute risk is the proportion of people who experienced an undesirable outcome in each group
- Absolute risk reduction (ARR) expresses the estimated proportion of people who would be spared from an adverse outcome through exposure to an intervention
- Relative risk (RR) is the estimated proportion of the original risk of an adverse outcome that persists among people exposed to the intervention
- Relative risk reduction (RRR) is the estimated proportion of untreated risk that is reduced through exposure to the intervention

ODDS RATIO: ratio of the odds for the treated versus the untreated group, with the odds reflecting the proportion of people with the adverse outcome relative to those without it

USING INFERENTIAL STATISTICS TO TEST HYPOTHESES: a means of drawing conclusions about a population (i.e., estimating population parameters), given data from a sample; based on laws of probability
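Before moving on to sampling distributions, the risk indexes defined under DESCRIBING RISK can be made concrete with a small sketch. The counts (10 of 100 treated and 20 of 100 untreated people with the adverse outcome) are hypothetical, and `risk_indexes` is an illustrative helper name.

```python
def risk_indexes(adverse_treated, n_treated, adverse_untreated, n_untreated):
    """Compute the risk indexes from counts of adverse outcomes in two groups."""
    ar_treated = adverse_treated / n_treated        # absolute risk, treated group
    ar_untreated = adverse_untreated / n_untreated  # absolute risk, untreated group
    arr = ar_untreated - ar_treated                 # absolute risk reduction
    rr = ar_treated / ar_untreated                  # relative risk
    rrr = arr / ar_untreated                        # relative risk reduction
    # Odds: people with the adverse outcome relative to those without it.
    odds_treated = adverse_treated / (n_treated - adverse_treated)
    odds_untreated = adverse_untreated / (n_untreated - adverse_untreated)
    return {
        "AR treated": ar_treated,
        "AR untreated": ar_untreated,
        "ARR": arr,
        "RR": rr,
        "RRR": rrr,
        "OR": odds_treated / odds_untreated,
    }


# Hypothetical trial counts, invented for illustration only.
result = risk_indexes(adverse_treated=10, n_treated=100,
                      adverse_untreated=20, n_untreated=100)

for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With these counts RR and RRR both come out at one half, which is a coincidence of the chosen numbers: RRR is always 1 minus RR.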

Sampling Distribution of the Mean: a theoretical distribution of means for an infinite number of samples drawn from the same population
Characteristics:
(1) Is normally distributed (at least approximately, when samples are reasonably large)
(2) Has a mean that equals the population mean
(3) Has a standard deviation (SD) called the standard error of the mean (SEM)
(4) SEM is estimated from a sample SD and the sample size
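The sketch below illustrates characteristics (2) to (4). It first estimates the SEM from one sample as SD divided by the square root of N, using the ten scores from the central tendency examples. It then approximates a sampling distribution by drawing 2,000 random samples from a made-up population of the integers 1 to 100; the population, the sample size of 25, and the seed are all assumptions chosen only for illustration.

```python
import math
import random
import statistics

# SEM estimated from a single sample: sample SD divided by the square root of N.
scores = [2, 3, 3, 3, 4, 5, 6, 7, 8, 9]
sem_estimate = statistics.stdev(scores) / math.sqrt(len(scores))
print(f"Estimated SEM from one sample: {sem_estimate:.2f}")

# Simulated sampling distribution (a finite stand-in for the theoretical, infinite one).
rng = random.Random(0)             # fixed seed so the run is repeatable
population = list(range(1, 101))   # hypothetical population
sample_size = 25

sample_means = [
    statistics.mean(rng.choices(population, k=sample_size))  # sampling with replacement
    for _ in range(2000)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_sem = statistics.pstdev(population) / math.sqrt(sample_size)

print(f"Population mean: {statistics.mean(population)}")
print(f"Mean of sample means: {mean_of_means:.2f}")
print(f"SD of sample means: {sd_of_means:.2f}")
print(f"Population SD / sqrt(n): {theoretical_sem:.2f}")
```

The mean of the simulated sample means should land close to the population mean, and their SD close to the population SD divided by the square root of the sample size.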

Sampling Distribution


FORMS OF STATISTICAL INFERENCE

1. Estimation of parameters
Used to estimate a single parameter (e.g., a population mean). Two forms of estimation:
(1) Point estimation: calculating a single statistic to estimate the population parameter (e.g., the mean birth weight of infants born in the U.S.)
(2) Interval estimation: calculating a range of values within which the parameter has a specified probability of lying. A confidence interval (CI) is constructed around the point estimate; the upper and lower limits are the confidence limits.

2. Hypothesis testing (more common)
- Based on rules of negative inference: research hypotheses are supported if null hypotheses can be rejected
- Involves statistical decision making to either accept the null hypothesis or reject the null hypothesis
- Researchers compute a test statistic with their data, then determine whether the statistic falls in the critical region (at or beyond the critical value) of the relevant theoretical distribution
- If the value of the test statistic indicates that the null hypothesis is improbable, the result is statistically significant
- A nonsignificant result means that any observed difference or relationship could have resulted from chance fluctuations

Two types of incorrect decisions:
(1) Type I error (false positive): a null hypothesis is rejected when it should not be rejected. The risk of a Type I error is controlled by the level of significance (alpha), e.g., α = .05 or .01.
(2) Type II error (false negative): failure to reject a null hypothesis when it should be rejected

Two-tailed tests: hypothesis testing in which both ends of the sampling distribution are used to define the region of improbable values
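To make the interval estimation described above concrete, here is a rough sketch of a 95% confidence interval around a sample mean. It assumes the normal (z) distribution for the multiplier, which is a simplification: with a sample as small as these ten scores, a t-based multiplier would normally be used and would give a somewhat wider interval. The function name `confidence_interval` is hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def confidence_interval(sample, confidence=0.95):
    """Return (lower limit, upper limit) around the sample mean, using a z multiplier."""
    point_estimate = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    # z value that leaves (1 - confidence) / 2 in each tail; an assumption, see lead-in.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return point_estimate - z * sem, point_estimate + z * sem


# The ten scores from the central tendency examples; their mean (5.0) is the point estimate.
scores = [2, 3, 3, 3, 4, 5, 6, 7, 8, 9]
lower, upper = confidence_interval(scores)

print(f"Point estimate: {statistics.mean(scores)}")
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```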

Critical Regions in the Sampling Distribution for a Two-Tailed Test: IVF Attitudes Example


One-tailed tests: the critical region of improbable values is entirely in one tail of the distribution, the tail corresponding to the direction of the hypothesis
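The difference between the two kinds of test can be sketched with critical values from the standard normal distribution. This assumes a z test statistic and, for the one-tailed case, a hypothesis in the positive direction; the helper names and the example statistic of 1.80 are hypothetical.

```python
from statistics import NormalDist


def critical_z(alpha, tails):
    """Critical value of z: alpha is split across both tails when tails == 2."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1 - alpha / tails)


def is_significant(z, alpha=0.05, tails=2):
    """True if z falls in the critical region (one-tailed: upper tail only, by assumption)."""
    cutoff = critical_z(alpha, tails)
    if tails == 2:
        return abs(z) >= cutoff
    return z >= cutoff


two_tailed_cutoff = critical_z(0.05, tails=2)
one_tailed_cutoff = critical_z(0.05, tails=1)
print(f"Two-tailed critical z: +/-{two_tailed_cutoff:.2f}")
print(f"One-tailed critical z: {one_tailed_cutoff:.2f}")

z_observed = 1.80  # hypothetical test statistic
print("Two-tailed significant:", is_significant(z_observed, tails=2))
print("One-tailed significant:", is_significant(z_observed, tails=1))
```

Because all of alpha sits in one tail, the one-tailed cutoff is less extreme, so the same statistic can be significant one-tailed but not two-tailed.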

Critical Region in the Sampling Distribution for a One-Tailed Test: IVF Attitudes Example


HYPOTHESIS TESTING PROCEDURE:
1. Select an appropriate test statistic
2. Establish the level of significance (e.g., α = .05)
3. Select a one-tailed or a two-tailed test
4. Compute test statistic with actual data
5. Calculate degrees of freedom (df) for the test statistic
6. Obtain a tabled value for the statistical test
7. Compare the test statistic to the tabled value
8. Make decision to accept or reject null hypothesis

COMPARISON BETWEEN PARAMETRIC AND NON-PARAMETRIC STATISTICS

PARAMETRIC | NON-PARAMETRIC (Distribution-free Statistics)
Involve the estimation of a parameter | Do not estimate parameters
Require measurements on at least an interval scale | Involve variables measured on a nominal or ordinal scale
Involve several assumptions (e.g., that variables are normally distributed in the population) | Have less restrictive assumptions about the shape of the variables' distribution than parametric tests
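The eight-step hypothesis testing procedure listed above can be traced with a parametric example, a t-test for independent groups. The sketch assumes equal variances (pooled-variance formula) and uses two made-up groups of five scores; the single tabled value (2.306 for df = 8, two-tailed, α = .05) is taken from a standard t table. The function name `independent_t` is hypothetical.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2                                            # step 5
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df


# Steps 1-3: test statistic = independent-groups t; alpha = .05; two-tailed test.
ALPHA = 0.05
TABLED_T_TWO_TAILED_05 = {8: 2.306}  # excerpt from a standard t table (df -> critical t)

# Hypothetical scores for two independent groups, invented for illustration.
group_a = [5, 6, 7, 8, 9]
group_b = [2, 3, 4, 5, 6]

t_value, df = independent_t(group_a, group_b)      # steps 4 and 5
tabled_value = TABLED_T_TWO_TAILED_05[df]          # step 6
reject_null = abs(t_value) >= tabled_value         # step 7

# Step 8: decision.
print(f"t({df}) = {t_value:.2f}, tabled value = {tabled_value}")
print("Decision:", "reject the null hypothesis" if reject_null else "do not reject the null hypothesis")
```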

COMMONLY USED BIVARIATE STATISTICAL TESTS (INFERENTIAL)
1. t-Test: tests the difference between two means
- t-Test for independent groups (between subjects)
- t-Test for dependent groups (within subjects)
2. Analysis of variance (ANOVA): tests the difference between 3+ means
- One-way ANOVA
- Multifactor (e.g., two-way) ANOVA
- Repeated measures ANOVA (within subjects)
3. Pearson's r: a parametric test of correlation; tests that the relationship between two variables is not zero; used when measures are on an interval or ratio scale
4. Chi-square test: a nonparametric test; tests the difference in proportions in categories within a contingency table

Power Analysis: a method of reducing the risk of Type II errors and estimating their occurrence
- With power = .80, the risk of a Type II error (β) is 20%
- Frequently used to estimate how large a sample is needed to reliably test hypotheses
Four components in a power analysis:
(1) Significance criterion (α)
(2) Sample size (N)
(3) Population effect size (γ): the magnitude of the relationship between research variables
(4) Power (1 - β): the probability of obtaining a significant result

Note: Chapter 23 is not discussed here because its methods are not commonly used by student researchers; you may still read it.
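As a rough illustration of how the four power analysis components fit together, the sketch below solves for sample size given the other three. It assumes a two-tailed comparison of two group means, an effect size expressed as a standardized mean difference, and the common normal-approximation formula n per group = 2 × ((z for α/2 + z for power) / effect size)²; exact t-based calculations give slightly larger samples. The function name `n_per_group` and the effect sizes tried are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-tailed, two-group mean comparison."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance criterion, two-tailed
    z_power = NormalDist().inv_cdf(power)          # power = 1 - beta
    # Normal approximation (an assumption; see lead-in), rounded up to whole people.
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


n_medium = n_per_group(0.5)   # hypothetical effect size of half a standard deviation
n_small = n_per_group(0.2)    # a smaller effect requires a much larger sample

print(f"Effect size 0.5: about {n_medium} per group")
print(f"Effect size 0.2: about {n_small} per group")
```

Holding α and power fixed, the required sample grows quickly as the expected effect shrinks, which is why the effect size estimate matters so much when planning a study.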
