
Measures of Association

1. Explain how levels of measurement affect choices about measures of central tendency, dispersion and association.

(You don't have to explain levels of measurement as such - just how they affect our choices.) For each measure, please use an example to show how we interpret numeric results. (10 points)

The levels of measurement proposed by Stevens (1946) are the most widely used four-level classification, and they constrain our choices. The level of measurement affects what we can do with the data, including which transformations are permissible, which statistics we calculate, and which statistical procedures we use. With nominal data, the numbers assigned to categories are arbitrary and serve only as names for a characteristic. For example, if I am setting up a coding scheme for gender in a voting group, I might code male as 1 and female as 2. Because the numbers are arbitrary, each one is just a name for a category, shared by every case in that category. Because the codes imply no ranking or numeric order, the only measure of central tendency available is the mode (which here is simply the larger of the two gender categories). Dispersion can be described by three measures: the index of diversity (the probability that two people chosen at random come from different categories), the index of qualitative variation (the index of diversity as a proportion of its maximum possible value), and the entropic measure (from information theory), which help to identify

With nominal data, you can use frequencies to describe the results and determine the mode(s) (e.g., the most frequent religious affiliation), but you can't calculate a mean or a standard deviation. The average religious affiliation doesn't even make sense. What's the average ZIP code? You could have assigned any number to any religion, so a different coding would give a different average for the same data. You can use tests like chi-square and tests based on the binomial distribution to answer your research questions, but most other statistical procedures are inappropriate.
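To show how the numeric results for nominal data are read, here is a minimal Python sketch of the mode and the three dispersion measures named above. The voting group of six men and four women is made up for illustration, the function name nominal_summary is hypothetical, and the index of diversity is assumed to be computed for draws with replacement (one minus the sum of squared proportions).

```python
from collections import Counter
from math import log2

def nominal_summary(codes):
    """Mode, index of diversity, IQV and entropy (in bits) for nominal codes."""
    counts = Counter(codes)
    n = len(codes)
    k = len(counts)  # number of categories actually observed
    props = [c / n for c in counts.values()]
    mode = counts.most_common(1)[0][0]
    # Chance that two cases drawn at random (with replacement) differ.
    diversity = 1 - sum(p * p for p in props)
    # Diversity as a share of its maximum for k categories, which is 1 - 1/k.
    iqv = diversity / (1 - 1 / k) if k > 1 else 0.0
    # Average number of bits needed to identify a case's category.
    entropy = -sum(p * log2(p) for p in props)
    return {"mode": mode, "diversity": diversity, "iqv": iqv, "entropy": entropy}

# Hypothetical voting group: 1 = male, 2 = female (the codes are arbitrary).
votes = [1] * 6 + [2] * 4
summary = nominal_summary(votes)
print(summary)
```

With these made-up counts the mode is category 1 (male), the index of diversity is 0.48 (a 48% chance that two randomly chosen members differ in gender), the IQV is 0.96 (96% of the diversity possible with two categories), and the entropy is about 0.97 bits. Recoding the genders as 7 and 3 would change none of these figures, which is why they suit nominal data.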

2. For PRE measures mentioned above, explain the sense in which errors are reduced.

PRE stands for Proportional Reduction in Error. PRE measures are designed to tell us how much we can reduce our errors in predicting an outcome if we know how two variables are linked. Each measure must identify the type of error involved and specify how to count it when we don't know how the DV of interest is linked to an IV and how to count it when we do. The measures of error we use depend heavily on the level of measurement, so there are nominal, ordinal, and interval/ratio PRE measures.

Nominal

Guttman's lambda is the most widely used and tells us the proportion by which we can reduce our errors in guessing a case's score on the dependent variable if we know how the dependent is linked to the independent. Because lambda is a proportion, we can read it as a percentage and conclude that our errors have been reduced by that percentage from their original level. Like other measures of association, lambda ranges from zero, where knowing the IV does not reduce our errors at all, to one, where it eliminates them; a value only slightly above zero suggests that the two variables are almost independent. However, useful as lambda is, it ignores the impact of an IV on any category of the DV but the mode. Because it focuses only on modal categories, it does not detect even major differences in other categories, and for data like these we want a measure sensitive to effects on all categories of the DV. One such measure, based on information theory, is the uncertainty coefficient: the proportion by which we can reduce our uncertainty about where cases lie in a table if we know how the two variables are linked. Uncertainty is measured in bits, the units of information needed to eliminate it, and the coefficient is the share of the uncertainty about the DV that is removed once the IV is known.

Ordinal

Gamma gives the proportion by which we can reduce our errors in predicting the direction of pairs if we know how the two variables are linked, and it can show either a positive or a negative relationship. Pairs in which the case with the higher score on one variable also has the higher score on the other are called concordant pairs (positive relationship); pairs in which the case with the higher score on one variable has the lower score on the other are called discordant pairs (negative relationship). Pairs in which the two cases have the same score on either variable are tied and are neither concordant nor discordant.
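As a rough illustration of the lambda and gamma calculations described above, the following Python sketch counts errors and pairs directly from a table of counts. The 2x2 table is invented, the function names are hypothetical, rows are assumed to hold the DV and columns the IV, and for gamma both variables are assumed to be ordered from low to high.

```python
def guttman_lambda(table):
    """Lambda for predicting the row variable (DV) from the column variable (IV).

    table[i][j] is the number of cases in DV category i and IV category j.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    # Errors without the IV: guess the overall modal DV category for every case.
    errors_without = n - max(row_totals)
    # Errors with the IV: guess the modal DV category within each IV column.
    columns = list(zip(*table))
    errors_with = sum(sum(col) - max(col) for col in columns)
    return (errors_without - errors_with) / errors_without

def gamma(table):
    """Gamma from concordant and discordant pairs; tied pairs are ignored."""
    concordant = discordant = 0
    rows, cols = len(table), len(table[0])
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):      # second case is higher on the rows
                for m in range(cols):
                    if m > j:                 # and higher on the columns
                        concordant += table[i][j] * table[k][m]
                    elif m < j:               # but lower on the columns
                        discordant += table[i][j] * table[k][m]
    return (concordant - discordant) / (concordant + discordant)

# Hypothetical counts: rows = DV (low, high), columns = IV (low, high).
table = [[30, 10],
         [20, 40]]
lam = guttman_lambda(table)
gam = gamma(table)
print(lam, gam)
```

With these made-up counts, guessing without the IV produces 40 errors and guessing with it produces 30, so lambda is 0.25: knowing the IV reduces our errors by 25%. There are 1,200 concordant and 200 discordant pairs, so gamma is about 0.71: predicting that every untied pair is concordant reduces our errors in guessing direction by about 71%.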
Knowing which kind of pair is more numerous lets us guess that kind for every pair, and so make fewer errors than we would by guessing at random. Somers' dxy (an extension of gamma) focuses on the effect of the IV across all the cases that might be influenced by it, something gamma is unable to do. Somers argued that because gamma does not penalize for ties, and can take on a value of 1.00 even when many of the cases are off the main diagonal, it is not the best representation of perfect association. He therefore proposed a measure that, for a 2x2 table, equals the difference in proportions between columns and, in general, gives the excess of concordant pairs over discordant (or vice versa) as a proportion of all pairs distinguished on the independent variable. This takes into account the pairs tied on the dependent variable, on the assumption that half of them, once distinguished, would be concordant and half discordant, so our success in predicting direction is assessed more realistically. Somers' d thus has an advantage over gamma, but unlike gamma it is not symmetric: gamma gives the same value whichever variable is treated as dependent, while Somers' d in general does not. Two other measures related to gamma are Yule's Q, which is a special case of gamma for 2x2 tables, and the odds ratio, of which Q is a function. Yule's Q gives the proportion by which we can reduce our errors in guessing what kind of pairs are present if we know how the variables are linked. It is commonly used for nominal dichotomies, where it is the excess of pairs favouring one type of association over those favouring the other, as a proportion of all pairs favouring one or the other, an interpretation comparable to that of gamma for ordinal variables. The odds ratio, used as a measure of the affinity between groups or categories, is a cross-product ratio: it is obtained by dividing the product of one pair of diagonally opposite cell values by the product of the other pair. Because Q is a function of the odds ratio, it shares the odds ratio's independence of the marginal totals (which Somers' dxy does not), reflecting their exclusive focus on the inner structure of a table, in the sense of ratios among the cell counts within it.

Interval/Ratio

Pearson's r, which is a standardized covariance, tells us whether the positive contributions to the covariance outweigh the negative ones or vice versa, and so whether the two variables rise together or move in opposite directions. Squaring r gives it a PRE interpretation: r² is the proportion of the variance in one variable that is accounted for by its linear relationship with the other, while r itself gives the number of standard deviations by which one variable changes, on average, for a change of one standard deviation in the other (e.g., if r = .35, then r² = .1225, so 12.25% of the variance is accounted for). As in the nominal and ordinal cases, there is a special case of r: Spearman's rho, which is Pearson's r calculated on the rank positions of x and y rather than on their raw scores. Rho therefore refers to the ability of one set of ranks, rather than one set of raw scores, to predict another, and it can be reported alongside Pearson's r.
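The relationship between r, r² and rho described above can be checked with a short Python sketch. The five pairs of scores are invented, the function names are hypothetical, and the ranking helper assumes there are no tied scores.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's r: the covariance scaled by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Each product is a positive or negative contribution to the covariance.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

def ranks(values):
    """Rank positions from 1 (smallest); assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to ranks instead of raw scores."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical interval-level scores for five cases.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 10]
r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(r, r ** 2, rho)
```

For these made-up scores r is about 0.80 and r² is 0.648, so about 65% of the variance in y is accounted for by its linear relationship with x. Rho is 0.80: it is computed from the ranks alone, so the extreme score of 10 counts only as the highest rank.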

3. Explain what conditional tables are, and how they may clarify the relationships between variables. (5 points)

A conditional table is a cross-tabulation that includes only the cases which meet a certain condition. Sociologists recognize that an association in a two-way table may arise when one variable causes the other, when the two affect each other, or when each is influenced by a third variable. To determine what causes what, we examine whether the strength, visibility, and direction of the association change under various conditions. The Columbia approach, a series of tests developed by the Columbia school, divides the cases into subgroups, each shown in a partial (conditional) table in which a third variable, called a test factor, is fixed at one value. Since the test factor is fixed within each table, it cannot affect the link between the other two variables there, so it helps us sort out the links between them. Conditional tables can clarify the relationship between variables in five ways, four of which can be shown with a double decker graph and the fifth with log-linear modeling.

1. Specification/moderation, where the association between two variables is different for subsamples with different values of a third variable. A double decker graph displays the three-way table so that the differences can be specified precisely.
2. Causal chain, where path analysis focuses on the variables that lie in the middle of a chain (also called mediators), which helps sort out the inner workings between variables.
3. Spurious association, where the observed correlation between two variables exists because each is affected by a common cause.
4. Distortion, where the relationship between two variables is reversed when we control for the third.
5. Conditional independence, which occurs when two variables are unrelated once we have taken another (or others) out of the picture. With three variables, we can have more than one two-way interaction as long as the remaining pair of variables, though not mutually independent, is conditionally independent.
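To make the idea of fixing a test factor concrete, here is a minimal Python sketch that builds the overall two-way table and the two conditional tables from a three-way set of counts, then compares their odds ratios. The counts are invented so that the association is spurious, and the names two_way and odds_ratio are hypothetical.

```python
# Hypothetical counts keyed by (test factor z, independent x, dependent y).
counts = {
    (0, 0, 0): 40, (0, 0, 1): 10, (0, 1, 0): 8,  (0, 1, 1): 2,
    (1, 0, 0): 2,  (1, 0, 1): 8,  (1, 1, 0): 10, (1, 1, 1): 40,
}

def two_way(counts, z_value=None):
    """Cross-tabulate x by y, keeping only cases with z == z_value if given."""
    table = [[0, 0], [0, 0]]
    for (z, x, y), n in counts.items():
        if z_value is None or z == z_value:
            table[x][y] += n
    return table

def odds_ratio(table):
    """Cross-product ratio of a 2x2 table (1.0 means no association)."""
    return (table[0][0] * table[1][1]) / (table[0][1] * table[1][0])

overall = two_way(counts)                            # test factor ignored
partials = {z: two_way(counts, z) for z in (0, 1)}   # one table per value of z

print("overall", overall, odds_ratio(overall))
for z, table in partials.items():
    print("z =", z, table, odds_ratio(table))
```

In the overall table the odds ratio is about 5.4, which looks like a strong association between x and y. In each conditional table the odds ratio is exactly 1.0, so once the test factor is held constant x and y are unrelated: with these made-up counts the original association is spurious, and x and y are conditionally independent given z.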