
Stat 1211 Fall 2013

Rohit Patra Department of Statistics Columbia University rohit@stat.columbia.edu

3 September, 2013


What is Statistics?

Statistics is concerned with methods for collecting, summarizing, and analyzing data in order to draw conclusions based on the information contained in the data.

Applications:
  Economics: data on unemployment, interest rates, stocks
  Clinical trials: Is drug A better than drug B?
  Weather data: Is global warming real?
  Agriculture: Does variety C have higher yields than variety D?

In short, Statistics is the study of any phenomenon exhibiting uncertainty and variation.


General Procedure in a Statistical Research Project


(1) Planning and design (experimental design)
(2) Data collection
(3) Data summarization: graphical (histogram, pie charts ...) and descriptive statistics (mean, variance ...)
(4) Data analysis and interpretation (statistical inference)
(5) Conclusions about the population


Population and sample

Population: all objects/individuals of a particular type.
Sample: a (representative) subset of the population.
Variable: any characteristic of a unit/individual, e.g., height, weight, gender ...

Two types of variables: categorical (qualitative) and quantitative.
  Categorical: gender, eye color, blood type ...
  Quantitative: temperature, age, height, income ...


Numerical summary of data

Measures of location (center)

Data: $X_1, X_2, \ldots, X_n$ ($n$ = sample size)

Sample mean: $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
  Easy to calculate; affected by outliers.

Sample median: $\tilde{X}$ = middle observation of the ordered data (if $n$ is odd, the $(n+1)/2$-th observation; if $n$ is even, the average of the $n/2$-th and $(n/2+1)$-th observations).
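As a rough illustration of the two measures of location, here is a minimal Python sketch. The function names `sample_mean` and `sample_median` and the data values are hypothetical, chosen only for this example; they are not from the course material. The second sample swaps the largest value for an extreme one to show how the mean moves while the median stays put.

```python
def sample_mean(xs):
    """Sum of the observations divided by the sample size n."""
    return sum(xs) / len(xs)


def sample_median(xs):
    """Middle observation of the sorted data.

    If n is odd this is the (n + 1)/2-th observation; if n is even it is
    the average of the n/2-th and (n/2 + 1)-th observations (1-indexed).
    """
    s = sorted(xs)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2


data = [2, 3, 5, 7, 11]            # hypothetical sample
with_outlier = [2, 3, 5, 7, 110]   # same sample, largest value replaced by an outlier

print(sample_mean(data), sample_median(data))                  # 5.6 5
print(sample_mean(with_outlier), sample_median(with_outlier))  # 25.4 5
```

The outlier pulls the mean from 5.6 to 25.4, while the median is 5 in both samples.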


Measures of variability

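Range: $R = \max_{1 \le i \le n} X_i - \min_{1 \le i \le n} X_i$

Sample variance: $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 = \frac{1}{n-1} \left( \sum_{i=1}^{n} X_i^2 - n \bar{X}^2 \right)$

Standard deviation: $S = \sqrt{S^2}$

Define $Y_i = X_i + b$. Then $s_Y^2 = s_X^2$.
Define $Y_i = a X_i$. Then $s_Y^2 = a^2 s_X^2$ and $s_Y = |a| s_X$.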
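The measures of variability and the shift and scale rules can be checked numerically. The sketch below is only an illustration: the function names, the sample `x`, and the constants `a` and `b` are hypothetical. It computes the sample variance both from the definition (divisor n - 1) and from the shortcut form, then compares the variance of the shifted data X + b and the scaled data aX with the original.

```python
import math


def sample_range(xs):
    """Largest observation minus smallest observation."""
    return max(xs) - min(xs)


def sample_variance(xs):
    """Definition: sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)


def sample_variance_shortcut(xs):
    """Shortcut form: (sum of squares - n * mean^2) / (n - 1)."""
    n = len(xs)
    xbar = sum(xs) / n
    return (sum(x * x for x in xs) - n * xbar ** 2) / (n - 1)


def sample_sd(xs):
    """Standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(xs))


x = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]   # hypothetical sample
a, b = -3.0, 10.0                     # hypothetical scale and shift constants

shifted = [xi + b for xi in x]        # Y_i = X_i + b
scaled = [a * xi for xi in x]         # Y_i = a * X_i

print(sample_range(x))
print(sample_variance(x), sample_variance_shortcut(x))
print(sample_variance(shifted))                  # same as the variance of x
print(sample_variance(scaled), a ** 2 * sample_variance(x))
print(sample_sd(scaled), abs(a) * sample_sd(x))
```

The shortcut form is algebraically identical to the definition, but in floating point it can lose accuracy when the mean is large relative to the spread, so the definition is the safer default in code.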


Percentiles
The p-th percentile is the data value which has p% of the observations falling at or below it.
Median is the 50th percentile.
$Q_1$ = 1st quartile = 25th percentile (median of the lower half of the ranked data)
$Q_3$ = 3rd quartile = 75th percentile (median of the upper half of the ranked data)
Inter-quartile range: $\mathrm{IQR} = Q_3 - Q_1$ (not affected by extreme obs.)
Five-number summary: minimum, $Q_1$, median, $Q_3$, maximum.
A box plot is a visual representation of the five-number summary.
Outliers: observations bigger than $Q_3 + 1.5\,\mathrm{IQR}$ or smaller than $Q_1 - 1.5\,\mathrm{IQR}$.

Take home points

Overview of course
Descriptive statistics
Graphical statistics
Sample mean, median, variance, standard deviation


Probability

Probability provides methods for quantifying the chances, or likelihood, associated with various outcomes.
Example: Will it rain tomorrow? Will the Republicans win? Will I get 3 Heads (H) in 3 coin tosses?

Experiment

Experiment: action or process that generates data (outcomes).
Examples: (a) flip a coin and record H or T; (b) toss a coin three times; (c) roll a pair of dice; (d) flip a coin until the first H.


Sample space S: the set of all possible outcomes of an experiment.
Examples:
(a) S = {H, T}
(b) S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
(c) S = {(1, 1), (1, 2), . . . , (1, 6), . . . , (6, 6)}
(d) S = {H, TH, TTH, TTTH, . . .}

Event: Any subset of S

(a) Let A be the event that the coin lands H; A = {H}

(b) Let B be the event that we get exactly two tails; B = {HTT, THT, TTH}

(c) Let C be the event that the first die shows 1; C = {(1, 1), (1, 2), . . . , (1, 6)}

(d) Let D be the event that the number of coin flips needed is 2; D = {TH}
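The sample spaces and events in examples (b), (c), and (d) can be enumerated directly. The following is a minimal sketch using Python's `itertools.product`; the variable names (`S_b`, `S_c`, `first_outcomes_d`) are hypothetical. Since the sample space in (d) is infinite, only its first few outcomes are listed.

```python
from itertools import product

# (b) Toss a coin three times: every length-3 string over {H, T}.
S_b = {"".join(t) for t in product("HT", repeat=3)}
# Event B: exactly two tails.
B = {s for s in S_b if s.count("T") == 2}

# (c) Roll a pair of dice: all ordered pairs (i, j) with 1 <= i, j <= 6.
S_c = set(product(range(1, 7), repeat=2))
# Event C: the first die shows 1.
C = {(i, j) for (i, j) in S_c if i == 1}


# (d) Flip until the first H: the sample space is infinite, so list only
# the first k outcomes H, TH, TTH, ...
def first_outcomes_d(k):
    return ["T" * i + "H" for i in range(k)]


# Event D: exactly 2 flips are needed.
D = {s for s in first_outcomes_d(10) if len(s) == 2}

print(len(S_b), sorted(B))       # 8 ['HTT', 'THT', 'TTH']
print(len(S_c), sorted(C))
print(first_outcomes_d(4), D)
```

Each event is built as a subset of its sample space, which mirrors the definition of an event as any subset of S.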


Set relations

$A^c$: the complement of $A$ is the event that $A$ does not occur.
(Union) $A \cup B$: the set of all outcomes in $A$, or in $B$, or in both.
(Intersection) $A \cap B$: the set of all outcomes that are in both $A$ and $B$.
Null set: the event containing no outcomes, denoted by $\emptyset$.
Mutually exclusive/disjoint events: $A$ and $B$ are disjoint if they have no outcomes in common, i.e., $A \cap B = \emptyset$.
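Python's built-in `set` type maps closely onto these relations, so a small sketch may help. The experiment (one roll of a die) and the events `A`, `B`, `C` below are hypothetical examples, not taken from the slides.

```python
# Hypothetical experiment: one roll of a six-sided die.
S = {1, 2, 3, 4, 5, 6}

A = {2, 4, 6}   # the roll is even
B = {1, 2, 3}   # the roll is at most 3
C = {1, 3, 5}   # the roll is odd

complement_A = S - A          # complement: outcomes in S where A does not occur
union_AB = A | B              # union: outcomes in A, or in B, or in both
intersection_AB = A & B       # intersection: outcomes in both A and B
null_set = set()              # the event containing no outcomes

print(sorted(complement_A))       # [1, 3, 5]
print(sorted(union_AB))           # [1, 2, 3, 4, 6]
print(sorted(intersection_AB))    # [2]
print(A & C == null_set)          # True: A and C are disjoint
print(A.isdisjoint(B))            # False: A and B share the outcome 2
```

Note that the complement is always taken relative to the sample space, which is why it is written as `S - A` here.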


Probability axioms

(1) $P(A) \ge 0$ for any event $A$
(2) $P(S) = 1$
(3) If $A_1, A_2, \ldots$ are disjoint, then $P(A_1 \cup A_2 \cup \cdots) = P(A_1) + P(A_2) + \cdots$

Properties

$P(\emptyset) = 0$
$P(A) + P(A^c) = 1$ for any event $A$
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
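The axioms and properties can be checked on a small finite example. The sketch below makes an assumption the slides do not state: all 36 outcomes of rolling a pair of dice are treated as equally likely, so the probability of an event is its size divided by 36. The function `prob` and the events `A` and `B` are hypothetical names for this illustration. Exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction
from itertools import product

# Sample space for rolling a pair of dice.
S = set(product(range(1, 7), repeat=2))


def prob(event):
    """Probability under the ASSUMED equally-likely model: |event| / |S|."""
    return Fraction(len(event & S), len(S))


A = {(i, j) for (i, j) in S if i == 1}       # first die shows 1
B = {(i, j) for (i, j) in S if i + j == 7}   # the two dice sum to 7

# Axioms (1) and (2), and P(empty set) = 0.
print(prob(A) >= 0, prob(S) == 1, prob(set()) == 0)

# Complement rule: P(A) + P(A^c) = 1.
print(prob(A) + prob(S - A))

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B).
print(prob(A | B), prob(A) + prob(B) - prob(A & B))   # 11/36 11/36
```

Here A and B overlap only in the outcome (1, 6), so subtracting P(A and B) = 1/36 corrects for counting that outcome twice.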
