
INTRODUCTION TO STATISTICS

Expected Outcomes

Able to define basic terminologies of statistics.

Able to apply the basic steps in the statistical problem-solving methodology for various applications.

Able to summarise and analyse data using measures of central tendency, measures of variation and measures of position.

Able to relate the concepts of accuracy and precision of data using a game of darts.

Able to conduct exploratory data analysis that includes numerical data analysis and various graphical displays.

Able to plot and interpret a normal probability plot.

SZS2017

CONTENT

1.1 Statistical Terminologies

1.2 Statistical Problem Solving Methodology

1.3 Review on Descriptive Statistics

1.3.1 Measures of Central Tendency

1.3.2 Measures of Variation

1.3.2.1 Accuracy and Precision

1.3.3 Measures of Position

1.3.4 Descriptive Statistics Using Microsoft Excel

1.4 Exploratory Data Analysis

1.4.1 Outliers

1.4.2 Box Plot

1.5 Normal Probability Plot


1.1 STATISTICAL TERMINOLOGIES

Define the meaning of statistics, population, sample, parameter, statistic, descriptive statistics and inferential statistics.

Discuss the importance of statistics in daily lives.


1.1.1 What is Statistics?

Most people become familiar with probability and statistics through radio, television, newspapers, and magazines. For example, the following statements were found in newspapers:

Ten thousand parents in Malaysia have chosen StemLife as their trusted stem cell bank.

The death rate from lung cancer was 10 times higher for smokers compared to nonsmokers.

The average cost of a wedding is nearly RM10,000 in Malaysia.

In Malaysia, the median salary for men with a bachelor’s degree is RM 30,000 per year, while the median salary for women with a bachelor’s degree is RM 29,000 per year.

Globally, an estimated 500,000 children under the age of 15 live with Type 1 diabetes.

Women who eat fish once a week are 29% less likely to develop heart disease.


What is Statistics?

The science of conducting studies to collect, organise, summarise, analyse, present, interpret and draw conclusions from data.

The collection and analysis of data are the most important parts of research methodology.

Researchers must have a basic knowledge of statistics before starting any research or study involving data analysis.

Statistics is also used to analyse the results of surveys and as a tool in scientific research to make decisions based on controlled experiments, estimation, prediction, and quality control.


1.1.2 Why Do We Need Statistics?

Basic knowledge of statistics is needed in any discipline or field of research or study (in almost all fields of human endeavour) that involves data analysis.

The methods of statistics allow researchers to design a valid experiment and to draw reliable conclusions or interpretations from the data they produce and analyse.

Examples:

team scored during a football season.

In public health, a doctor might be concerned with the number of children who are infected with the H1N1 virus during a certain year.

In education, an educator might want to know whether the performance of students in the current semester is better than in the previous semester.


1.1.2 Why Do We Need Statistics?

Knowledge of statistics may help you in situations such as:

a. A university admission director needs to find an effective way of selecting students. He designs a statistical study to see if there is a significant relationship between the SPM result and the GPA achieved by first year students at his university. If there is a strong relationship, a high SPM result will become an important criterion for admission.

b. A management consultant wants to compare a client’s investment return for this year with related figures from last year. He summarises the revenue and cost data from both periods and finds the relationship between these two variables. Based on his findings, he presents his recommendations to his client.

A variable is a characteristic or attribute that can assume different values. These values are data. A variable is called a random variable if its values are determined by chance.


1.1.2 Why Do We Need Statistics?

Knowledge of statistics may help you in situations such as:

c. Suppose that a manager of Unisex Hair Stylist claims that 90% of the customers are satisfied with the services. If a consumer activist feels that this is an exaggerated statement that might require legal action, the activist can use statistical inference techniques to decide whether or not to sue the manager. Therefore, the knowledge gained from studying statistics can enhance awareness and help people become better consumers.

d. People can make intelligent decisions about what products to purchase based on consumer studies, about government spending based on utilisation studies, and so on.


1.1.3 Population and Sample

Population (N)

A complete collection of measurements, outcomes, objects or individuals under study.

Tangible: finite, and the total number of subjects is fixed and could be listed.

→ all computers in a room, all female students in a university, or all electrical components manufactured in a day, etc.

Conceptual (Intangible): all values that might possibly have been observed, with an unlimited number of subjects.

→ simulated data from a computer or instrument, number of germs on the human body, all experimental data such as all measurements of the length of a metal rod, etc.

Sample (n)

A subset of the population that is observed.

Parameter and Statistic

Parameter

A numerical value that represents a certain population characteristic.

The average weight of students from a population of students in a university.

The percentage of defective components in a population of electrical components manufactured in a day.

Statistic

A numerical value that represents a certain sample characteristic.

The average weight for a sample of female students selected from all students in a university.

The percentage of defective components in a sample of 100 electrical components.

Symbols

Mean (Average): $\mu$ (parameter), $\bar{x}$ (statistic)

Variance: $\sigma^2$ (parameter), $s^2$ (statistic)

Standard deviation: $\sigma$ (parameter), $s$ (statistic)

Proportion: p

EXAMPLE 1.1

A travel agent claims that the average number of rooms in large hotels in

Pahang is 500 and the standard deviation is 165. A sample of seven hotels in

Genting Highlands is selected and the average number of rooms is found to be

435 with standard deviation of 15.

Based on the above example:

The sample selected is seven large hotels in Genting Highlands.

The population under study is tangible since there are finite numbers of

large hotels in Pahang.

The characteristic (variable) is number of rooms.

The parameters are $\mu = 500$ and $\sigma = 165$ since they describe the population characteristics.

The statistics are $\bar{x} = 435$ and $s = 15$ since they describe the sample characteristics.

EXERCISE 1.1.3

The number of first year students at a residential college is 317 students. An IQ

pre-test is given to all of them in their first week. The dean of admission

collected data on 27 of them and found their mean score on the IQ pre-test was

51. The mean for the entire first year students was estimated to be

approximately 51. A subsequent computer analysis of all first year students

showed that the true mean (population mean) is 52.

Based on the above statement, answer the following questions.

a) What is the population?

b) Is the population tangible or conceptual?

c) What is the sample?

d) What is the variable of the study?

e) Which number describes a parameter?

f) Which number describes a statistic?

Answers: a) 317 first year students; b) tangible; c) 27 first year students; d) IQ pre-test score; e) 52; f) 51

1.1.4 Descriptive and Inferential Statistics

Descriptive statistics

Includes the process of data collection, data organisation, data classification, data summarisation, and data presentation obtained from the sample.

Used to describe the characteristics of the sample.

Used to determine whether the sample represents the target population by comparing the sample statistic and the population parameter.

EXAMPLE: Ten thousand parents in Malaysia have chosen Takaful Insurance as their trusted life insurance agency.

Inferential statistics

Involves a process of generalisation, estimation, hypothesis testing, prediction and determination of relationships between variables.

Used to describe, infer, estimate or approximate the characteristics of the target population.

Used when we want to draw a conclusion from the data obtained from the sample.

EXAMPLE: The death rate of lung cancer was 10 times higher for smokers compared to nonsmokers.

Overview of descriptive and inferential statistics

EXERCISE 1.1.4

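Determine whether each statement below involves descriptive statistics or inferential statistics.

In Malaysia, the median salary for men with a bachelor’s degree is RM 30,000 per year, while the median salary for women with a bachelor’s degree is RM 29,000 per year.

Globally, an estimated 500,000 children under the age of 15 live with Type 1 diabetes.

heart attacks in men over 70 years of age.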

1.1.5 Role of the Computer in Statistics

1. Spreadsheets

Microsoft Excel & Lotus 1-2-3

2. Statistical Packages

AMOS, eViews, MINITAB, R, SAS, SmartPLS,

SPSS and SPlus


Data Analysis Application Tools in EXCEL

2. Formulas

File → Options → Add-Ins → Analysis ToolPak → OK → Data → Data Analysis

Choose Analysis ToolPak and click Go.

Tick Analysis ToolPak and click OK.

→ Now we can use the Data Analysis application in Microsoft Excel to analyse data.

1.2 STATISTICAL PROBLEM-SOLVING METHODOLOGY

Outline the six basic steps in the statistical problem-solving methodology.

Identify various sampling methods.

Classify type of data and level of measurement.


Statistical Problem-Solving Methodology

1.2.1 Identify the Problem or Opportunity

The researchers must clearly understand and define the objective of the study before conducting any research. Possible questions that could be asked before starting any study are given as follows.

What are the possible variables that are related to the study?

Can the study goal be achieved through simple counts or measurements of the group?

What possible treatments should be imposed on the group, and what are their responses?

Should the experiment be performed on the group?

Do the data come from a population or a sample?

If samples are needed, what sample size is appropriate? How should they be taken?

Characteristics of Sample

A sample is a subset of a population.

The population is a complete group of people, companies, hospitals, stores, universities, students, etc., that share some set of characteristics.

A census involves the whole population and has a greater likelihood of non-sampling errors.

Sampling error arises when the statistical characteristics of a population are estimated from a subset, or sample, of that population.

The difference between the sample value and the population value is considered the sampling error.

A non-sampling error is an error that is not due to sampling. For example, in a survey, mistakes may occur in the selection of people.
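The definition of sampling error above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the population of hotel room counts is randomly generated (hypothetical data, not taken from the slides), and the helper name `sampling_error` is introduced here for convenience.

```python
import random
import statistics

def sampling_error(population, n, seed=0):
    """Draw a simple random sample of size n and compare its mean with the population mean."""
    rng = random.Random(seed)
    sample = rng.sample(population, n)       # sampling without replacement
    x_bar = statistics.mean(sample)          # sample statistic
    mu = statistics.mean(population)         # population parameter
    return x_bar, mu, x_bar - mu             # sampling error = statistic - parameter

# Hypothetical population: number of rooms in 200 hotels
pop_rng = random.Random(42)
population = [pop_rng.randint(100, 900) for _ in range(200)]

x_bar, mu, error = sampling_error(population, n=7)
print(f"sample mean = {x_bar:.1f}, population mean = {mu:.1f}, sampling error = {error:.1f}")

# A census uses the whole population, so the sampling error vanishes
_, _, census_error = sampling_error(population, n=len(population))
print("census sampling error:", census_error)
```

Non-sampling errors (for example, recording mistakes) are not captured by this calculation; a census removes sampling error only.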


Characteristics of Sample Size

The larger the sample size, the smaller the magnitude of the sampling error.

Studies using the survey method need a larger sample size since participation in a survey is voluntary.

Studies using mail response need a much larger sample size. Normally, the response rate is as low as 20%-30%.

The ideal sample size in a study should be large enough to serve as an adequate representation of the population, so that the results can be generalised to the overall population.

The optimal sample size depends on the statistical distribution used and on the purpose of generalisation to the whole population.

Researchers may refer to Krejcie and Morgan (1970) as a guideline to obtain an adequate sample size.
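As a rough illustration of the guideline mentioned above, the sketch below evaluates the sample-size formula commonly attributed to Krejcie and Morgan (1970), $s = \chi^2 N P(1-P) / \left(d^2 (N-1) + \chi^2 P(1-P)\right)$. The default constants ($\chi^2 = 3.841$, $P = 0.5$, $d = 0.05$) and the function name are assumptions made for this example; the published table should be consulted for actual studies, and its rounding may differ slightly from the ceiling used here.

```python
import math

def krejcie_morgan(N, chi2=3.841, P=0.5, d=0.05):
    """Suggested sample size for a finite population of size N.

    chi2: chi-square value for 1 degree of freedom at 95% confidence (assumed default)
    P:    assumed population proportion (0.5 gives the largest sample size)
    d:    desired margin of error as a proportion (assumed default 0.05)
    """
    numerator = chi2 * N * P * (1 - P)
    denominator = d ** 2 * (N - 1) + chi2 * P * (1 - P)
    return math.ceil(numerator / denominator)

for N in (317, 1000, 5000):
    print(N, krejcie_morgan(N))
```

The required sample grows much more slowly than the population, which is why a few hundred respondents can be adequate even for several thousand subjects.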


1.2.2 Deciding on the Method of Data Collection

Data must be collected as completely as possible, and must be accurate and relevant to the problem, in order to solve it.

secondary data)

It is similar to historical or observed data.

The availability of the data depends on the primary and secondary

resources of document, evidence that includes interviews, observation

method, minutes of meeting, formal policy statement etc.

Example: Rainfall data collected from the Malaysian Meteorological Department are secondary data.


1.2.2 Deciding on the Method of Data Collection

Data could be obtained in 3 ways:

In an experimental study, the researcher manipulates one of the variables and studies how the manipulation influences other variables, provided that the treatments and the subjects are assigned to groups randomly.

Example: Blood glucose level data obtained from diabetic patients before and after a treatment are an example of experimental data.

questionnaire):

Observations VS interviews


Observation method

In qualitative research: used to study behaviours or events, the context that surrounds them, and the relationship between the behaviour and the event.

In quantitative research: used to collect data regarding the number of occurrences in a specific period of time, or the duration of a very specific behaviour or event.

The detailed descriptions or data collected in qualitative research can be converted later to numerical data and analysed quantitatively.

The observation method can be used to study the physical environment, social interactions, physical activities, non-verbal communications, and planned and unplanned activities.

Example: A study of customers’ behaviour towards types of brands in a certain shopping complex is an example of an observational study.

Interview method

The purpose of interview in collecting data is to find out what is in or on

someone else’s mind.

Interview data can easily become biased and misleading if the interviewed

person is aware of the perspective of the interviewer.

It is very important to make sure the person being interviewed does not

hold any preconceived notions regarding the outcome of the study.

Interviews range from quite informal and completely open-ended to very

formal with the questions predetermined and asked in a standard manner.

Usually, interviews are used to gather information regarding an individual’s

experience and knowledge; his/her opinions, beliefs, and feelings, and

demographic data.

Example: An interviewer is interested in gathering information on the way nurses organise their care in hospital wards and conducts an interview session.

Other Methods of Data Collection

• Questionnaires and surveys (Quantitative + Qualitative).

• Opinions (Qualitative + Quantitative).

• Projective technique and psychological tests (both).

• Proxemics – Study of people’s use of space and its relationship to culture.

• Kinesics – Study of body movement, or how people communicate nonverbally.

• Street Ethnography – Concentrates on a person becoming a part of the place under study.

• Narratives – Study of people’s individual life stories.

• Triangulation – The use of multiple data collection techniques (triangulation of data permits the verification and validation of qualitative data).

EXERCISE 1.2.2

Identify each of the following studies as being either observational or

experimental.

a) Subjects were randomly assigned to two groups, and one group was given a herb and the other group a placebo. After 6 months, the numbers of respiratory tract infections in each group were compared.

b) A researcher stood at a busy intersection to see if the colour of an

automobile a person drives is related to running red lights or not.

c) A researcher finds that people who are more hostile have higher

total cholesterol levels than those who are less hostile.

d) Subjects are randomly assigned to four groups. Each group is

placed on one of four special diets—a low-fat diet, a high-fish diet, a

combination of low-fat diet and high-fish diet, and a regular diet.

After 6 months, the blood pressures of the groups are compared to

see if diet has any effect on blood pressure or not.


1.2.3 Collecting the Data (Sampling Techniques)

Sampling is the process of selecting a few samples from a population to become the basis for estimating or predicting the prevalence of an unknown piece of information, situation or outcome regarding the bigger group.

i. Non-probability sampling (judgment, voluntary, convenience):

• The sample is collected based on the judgment of the experimenter.

• Resulting samples might be biased.

ii. Probability sampling (random, systematic, stratified, cluster):

• The chance of selection is known before the sample is picked.

• Resulting samples are unbiased.

a non-probability data or probability data.


Sampling Techniques

Nonprobability sampling:

• Judgment

• Voluntary

• Convenience

• Others: Snowball, Quota

Probability sampling:

• Random

• Systematic

• Cluster

• Stratified

• Others: Multi-stage, K-Sampling, Nested

A) Nonprobability Sampling Methods

Judgment sampling

Data is selected based on the opinion of one or more experts.

Example: A political campaign manager intuitively picks certain voting districts as reliable places to measure the public opinion of his candidates.

Voluntary sampling

Questions are posed to the public by publishing them over radio or television via phone, short message, email etc. The resulting sample tends to over-represent individuals who have strong opinions.

Example: A call-in radio show asks its listeners to participate in surveys on controversial topics such as abortion, affirmative action, gun control, politics, etc.

Convenience sampling

The data selected is an “easy sample”; also called haphazard or accidental sampling. The researcher obtains units or people who are most conveniently available.

Example: A surveyor will stand in one location and ask passersby the questions.

B) Probability Sampling Methods

1. Random sampling

• Each data item is numbered, and then the data are selected using a chance or random method such as random numbers.

• When a sample is chosen at random, it is said to be an unbiased sample.

• A random sample can be selected with or without replacement.

Example:

Suppose a lecturer wants to study the physical fitness levels of students at his/her university. There are 5000 students enrolled at the university, and he/she wants to draw a sample of size 100 to take a physical fitness test.

She could obtain a list of all 5000 students, number them from 1 to 5000 and then randomly invite the 100 students corresponding to the selected numbers to participate in the study.


Generating Random Number

• Generating random numbers is an important step in obtaining a random sample.

• With random numbers, each number has an equal chance of being selected.

• Random numbers can be generated from a calculator, software, or a random number table.

Example: Suppose the data are numbered from 1 to 100 and we want to choose five samples only. Using the R language, we can use the command “sample(1:100, 5)”. The resulting output is five numbers selected at random.
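For readers without R, a comparable draw can be sketched with Python's standard `random` module. This is an illustrative equivalent, not part of the original slides; the seed value and variable names are arbitrary choices made here.

```python
import random

rng = random.Random(2017)  # fixed seed (arbitrary) so the draw can be repeated

# Five numbers from 1..100 without replacement, comparable to R's sample(1:100, 5)
five = rng.sample(range(1, 101), 5)
print("without replacement:", five)

# Five numbers from 1..100 with replacement (the same number may appear twice)
five_with_replacement = rng.choices(range(1, 101), k=5)
print("with replacement:", five_with_replacement)

# The lecturer's example: 100 student numbers out of a list numbered 1..5000
students = rng.sample(range(1, 5001), 100)
print("first ten invited students:", sorted(students)[:10])
```

Removing the fixed seed (using `random.Random()`) gives a different sample on every run, which is the usual behaviour in practice.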


B) Probability Sampling Methods

2. Systematic sampling

• A set of data is numbered from 1 to N: $x_1, x_2, \ldots, x_N$.

• The first data item is selected randomly from the numbers 1 to k, where $k = N/n$ and n is the sample size.

• The subsequent items are selected at every k-th interval to produce n samples.

Example:

Suppose a lecturer wants to study the physical fitness levels of students at his/her university and he/she wants to draw a sample of size 100 to take a physical fitness test. She obtains a list of all 5000 students, numbers them from 1 to 5000 and randomly picks one of the first 50 students (k = 5000/100 = 50) on the list. If the first picked number is 30, then the 30th student on the list is invited first. Then she invites every 50th name on the list after this random start (the 80th student, the 130th student and so on) to produce a sample of 100 students to participate in the study.
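The systematic selection in this example can be sketched in Python as follows. The function name `systematic_sample` and its parameters are illustrative assumptions; the sketch also assumes N is an exact multiple of n, as in the example.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return n positions from a list numbered 1..N, taking every k-th item.

    k = N // n is the sampling interval; the start is drawn at random
    from 1..k unless it is given explicitly.
    """
    k = N // n
    if start is None:
        start = random.Random(seed).randint(1, k)
    return [start + i * k for i in range(n)]

# The lecturer's example: N = 5000 students, n = 100, random start = 30
selected = systematic_sample(5000, 100, start=30)
print(selected[:3], "...", selected[-1])
```

With a start of 30 and k = 50, the selected positions are 30, 80, 130 and so on up to 4980, giving exactly 100 students.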


B) Probability Sampling Methods

3. Stratified sampling

• The population is divided into groups according to some characteristic that is important to the study, and then the sample is selected from each group using random or systematic sampling.

• The characteristics are homogeneous (similar) within each group but heterogeneous (dissimilar) among the groups.

Example:

Assume that, because of different lifestyles, the level of physical fitness is different between male and female students. To account for this variation in lifestyle, the population of students can easily be stratified into male and female students.

The random method or the systematic method can be used to select the participants. As an example, she could use random sampling to choose 50 male students and systematic sampling to choose another 50 female students, or vice versa.
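A minimal Python sketch of stratified sampling is given below. For simplicity it uses simple random sampling inside every stratum (the slide's example mixes random and systematic selection), and the stratum sizes of 2600 male and 2400 female students are hypothetical figures chosen only for illustration.

```python
import random

def stratified_sample(strata, per_stratum, seed=0):
    """Draw a simple random sample of the requested size from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, per_stratum[name])
            for name, members in strata.items()}

# Hypothetical strata: student ID numbers split by gender
strata = {
    "male": list(range(1, 2601)),       # IDs 1..2600
    "female": list(range(2601, 5001)),  # IDs 2601..5000
}

sample = stratified_sample(strata, {"male": 50, "female": 50})
print({name: len(ids) for name, ids in sample.items()})
```

Every stratum is guaranteed to appear in the sample, which is the main difference from a simple random sample of 100 students.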


B) Probability Sampling Methods

4. Cluster sampling

• The population is divided into groups or clusters, then some of those clusters are randomly selected and all members from those selected clusters are chosen.

• Cluster sampling can reduce cost and time.

• Members within each cluster are heterogeneous, while the clusters are homogeneous (similar) to one another.

• We can choose more than one cluster.

Example:

Assume that, because of different lifestyles, the level of physical fitness is different between 1st year, 2nd year, 3rd year and senior students. To account for this variation in lifestyle, the population of students can easily be clustered into four categories.

Then, she can randomly select one or more clusters and choose all students in those clusters as the participants. For example, all 2nd year students are chosen as the participants.
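Cluster sampling can be sketched in Python along the same lines. The cluster sizes below are hypothetical, and `cluster_sample` is a name introduced only for this illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly pick whole clusters and return every member of the chosen clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # cluster names picked at random
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical clusters: student ID numbers grouped by year of study
clusters = {
    "1st year": list(range(1, 1501)),
    "2nd year": list(range(1501, 2801)),
    "3rd year": list(range(2801, 4001)),
    "senior": list(range(4001, 5001)),
}

chosen, participants = cluster_sample(clusters, n_clusters=1)
print(chosen, len(participants))
```

Unlike stratified sampling, only the selected clusters contribute participants, and every member of a selected cluster is included.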


Advantages and Disadvantages of Each Sampling Technique

Judgement Sampling

When to use: When the population is too large.

Advantages:
- Fast and conclusive.

Disadvantages:
- Biased since it is based on the opinion of one or more experts only.

Voluntary Sampling

When to use: When the members of the population are convenient to be sampled.

Advantages:
- Fast response.
- Easy to obtain larger sample sizes.

Disadvantages:
- Samplings are too random.
- Sometimes not reliable.
- Degree of generalisability is questionable.

Convenience Sampling

When to use: When the members of the population are convenient to be sampled.

Advantages:
- Fast and easy.
- Convenient and inexpensive.

Disadvantages:
- Samplings are too random.
- Sometimes not reliable.
- Degree of generalisability is questionable.

Random Sampling

When to use: When the members of the population are similar to one another on important variables.

Advantages:
- Uses a table of random numbers.
- Each data item has an equal chance of being selected.
- Ensures a high degree of representativeness.

Disadvantages:
- High cost.
- Time consuming for a large sample size.
- Tedious.

Systematic Sampling

When to use: When the members of the population are similar to one another on important variables.

Advantages:
- Relatively easy to construct, execute, compare and understand.
- The process can be controlled.
- Good for tight budget research.
- Ensures a high degree of representativeness.
- No need to use a table of random numbers.

Disadvantages:
- There is a risk of data manipulation.
- Not the best method if the researcher does not know the background of the population.
- Less random than simple random sampling.

Stratified Sampling

When to use: When the population is heterogeneous and contains several different groups, some of which are related to the topic of the study.

Advantages:
- Variety of samples.
- Ensures a high degree of representativeness of all the strata or layers in the population.

Disadvantages:
- Time consuming.
- Tedious.

Cluster Sampling

When to use: When the population consists of units rather than individuals.

Advantages:
- Less energy and money.
- Easy and convenient.
- Saves time.

Disadvantages:
- Possibly, members of units are different from one another, decreasing the technique’s effectiveness.

Random Data Generation From Normal Distribution

$X \sim N(\mu, \sigma^2)$ or $Z \sim N(0, 1)$, where $\mu$ is the mean and $\sigma^2$ is the variance.
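As an alternative to a spreadsheet tool, normally distributed data can be generated with Python's standard library. The sketch below is illustrative only: the values of the mean and standard deviation (500 and 165) and the seed are assumptions chosen for the example. Note that `random.gauss` takes the standard deviation $\sigma$, not the variance $\sigma^2$.

```python
import random
import statistics

rng = random.Random(1)          # fixed seed (arbitrary) for a repeatable run
mu, sigma = 500, 165            # assumed mean and standard deviation

# 1000 draws from X ~ N(mu, sigma^2)
data = [rng.gauss(mu, sigma) for _ in range(1000)]

# Standardising each value gives draws that follow Z ~ N(0, 1)
z = [(x - mu) / sigma for x in data]

print("sample mean of X:", round(statistics.mean(data), 1))
print("sample std dev of X:", round(statistics.stdev(data), 1))
print("sample mean of Z:", round(statistics.mean(z), 3))
```

The sample mean and standard deviation will be close to, but not exactly equal to, the chosen parameters, which is the sampling error at work.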


Random Data Generation From Poisson Distribution

$X \sim \mathrm{Po}(\lambda)$, where $\lambda$ is the average value.
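Python's standard library has no built-in Poisson generator, so the sketch below uses Knuth's multiplication method as one simple way to draw Poisson values. This is a technique chosen for the illustration, not the method behind any particular spreadsheet tool, and the value $\lambda = 4$ is an arbitrary assumption.

```python
import math
import random
import statistics

def poisson_draw(lam, rng):
    """Draw one Poisson(lam) value using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()       # multiply by a fresh uniform(0, 1) number
        if p <= threshold:
            return k            # k uniforms were multiplied before crossing the threshold
        k += 1

rng = random.Random(7)          # fixed seed (arbitrary) for a repeatable run
lam = 4                         # assumed average value
counts = [poisson_draw(lam, rng) for _ in range(5000)]

print("sample mean:", round(statistics.mean(counts), 2))
print("sample variance:", round(statistics.variance(counts), 2))
```

For a Poisson distribution both the mean and the variance equal $\lambda$, so the two printed values should be close to each other. The method is suitable for small $\lambda$ only, since the loop length grows with $\lambda$.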


EXERCISE 1.2.3

In each of these statements, identify the type of sampling method used.

community and he wants to pick a probability sampling of 50 samples.

He uses a random number table to pick one of the first 20 voters

(1000/50 = 20) on the list. The table gave him the number of 16, so he

selects the 16th voter on the list as the first selected number. Then he

picks every 20th name after the first random number start (the 36th voter, the 56th voter, etc.) until 50 samples are obtained.

city into small blocks. Each block containing a cluster is surveyed. A

number of clusters are selected for the sample, and all the households

in a cluster are surveyed. Less energy and money are expended if an

interviewer stays within a specific area rather than traveling across

stretches of the cities.


EXERCISE 1.2.3

In each of these statements, identify the type of sampling method used.

growing pattern or when surface differences are observed for a soil. For

example, differences may occur in soil color which may be the result of many

factors. A researcher is called to judge a particular shade of colour to be

typical for a sample at certain sites. Then from these sites, samples are

drawn.

d) The population of university professors is divided into groups according to

their rank (instructor, assistant professor, etc.) and several are selected from

each group to make up a sample.

e) A surveyor stands outside a shop in the East Coast Mall and randomly selects

people to participate in a quiz.

f) A quality engineer wants to inspect rolls of wallpaper in order to obtain

information on the rate at which flaws in the printing are occurring. She

decides to draw a sample of 50 rolls of wallpaper from a day’s production. At

the end of each hour, for 5 consecutive hours, she takes the 10 most

recently produced rolls and counts the number of flaws on each.


MIND EXPANDING EXERCISES

1. Statistics can be applied across many disciplines and fields of research, and in almost all fields of human endeavour. Based on this statement, suggest reasons why statistics is important.

determine the age distribution of their listeners. Describe in detail how you would select a sample of at least 3000 listeners. Choose the best sampling techniques and state the reason. The sampling techniques can be mixed or combined.

1.2.4 Classifying and Summarising the Data

In this step, the collected data are organised properly for further study and investigation.

Data that have been collected during the sampling process are called raw data.

The simplest way to organise raw data systematically is by using a data array.

A data array is an arrangement of data items in either ascending or descending order (sorting).
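Building a data array amounts to sorting the raw data, as the short Python sketch below shows. The raw values are hypothetical and serve only as an illustration.

```python
# Hypothetical raw data, in the order they were collected
raw = [23, 7, 15, 42, 7, 31]

ascending = sorted(raw)                  # data array in ascending order
descending = sorted(raw, reverse=True)   # data array in descending order

print(ascending)
print(descending)
```

`sorted` returns a new list and leaves the raw data unchanged, so the original collection order is still available if needed.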

1.2.4.1 Classifying

identify items with the same characteristics & arranging them into

groups or classes.

Data could be classified by its type or by its level of measurement.

1.2.4.2 Summarisation

Graphical & Descriptive statistics ( tables, charts, measures of central

tendency, measures of variation, measures of position)


Example of Raw Data

by column or row


1.2.4.1 Data Classification

A variable is a characteristic or attribute that can assume different values.

Variables whose values are determined by chance are called random variables.

Data can be classified or measured:

- As quantitative or qualitative type

- By level of measurement of data

Type of Data

Qualitative (Categorical/Attributes)

Data that refers to a classification name according to some characteristic or attribute.

Data is classified using code numbers (1, 2, …).

• Nominal Data: The values cannot be ranked. Gender, race, citizenship, colour, etc.

• Ordinal Data: The values can be ranked and a Likert scale is used. Feeling (dislike-like), colour (dark-bright), etc.

Quantitative (Numerical)

Data can be counted or measured.

Data can be ordered or ranked.

• Discrete Data: The values can be counted and are finite. Number of students, number of cats, number of defects, etc.

• Continuous Data: The values can be placed within two specified values, are obtained by measuring, have boundaries, and shall be rounded to the required decimal places. Weight, age, salary, temperature, etc.

Levels of Measurement of Data

Nominal-level

Description: Classifies data into mutually exclusive (non-overlapping), exhaustive categories in which no order or ranking can be imposed on the data.

Examples: Zip code (4, 5, 6, …), post code (25000, 25600, …), gender (female, male), eye colour (blue, brown, green, hazel), political affiliation, religion, nationality.

Ordinal-level

Description: Classifies data into categories that can be ranked; however, precise differences between the ranks do not exist.

Examples: Grade (A, B, C, D, etc.), judging (first place, second place, etc.), rating scale (poor, good, excellent), colour (light blue, …, dark blue).

Interval-level

Description: Ranks the data, and precise differences between units of measure do exist; however, there is no meaningful zero.

Examples: IQ test, temperature, shoe size.

Ratio-level

Description: Possesses all the characteristics of interval measurement, and there exists a true zero.

Examples: Height, weight, time, salary.

EXERCISE 1.2.4.1

1. The SuperMotor Marketing Corporation has asked you for information about the car you drive. For each question, identify the type of data requested as either attribute data or numeric data. When attribute data is requested, identify the variable as either nominal or ordinal. When numeric data is requested, identify the variable as either discrete or continuous. Then, identify the level of measurement for each variable.

b) In what city was your car made?

c) How many people can be seated in your car?

d) What is the distance traveled from your home to your school?

e) What is the color of your car?

f) How many cars are in your household?

g) What is the length of your car?

h) What is the normal operating temperature (in °C) of your car’s engine?

i) What petrol mileage (km/l) do you get in city driving?

j) Who made your car?

k) How many cylinders are there in your car’s engine?

l) How many kilometres have you put on your car’s current set of tyres?


EXERCISE 1.2.4.1

2. The chart shows the number of job-related injuries for each of the

transportation industries for 1998.

Industry        Number of injuries
Railroad        4520
Intercity bus   5100
Subway          6850
Trucking        7144
Airline         9950

a) What are the variables under study?

b) Categorise each variable either as qualitative or quantitative.

c) Categorise each quantitative variable either as discrete or

continuous.

d) Categorise each qualitative variable either as nominal or ordinal.

e) Identify the level of measurement for each variable.


1.2.4.2 Data Summarisation

1) Descriptive statistics (refer Section 1.3)

Typically used to confirm conjectures about the data.

Quantitative data: measures of central tendency, measures of

variation (dispersion) and measures of position.

Qualitative data (non-numeric quality (attribute) or category):

measure the relative frequency for a particular characteristic

and calculate its percentage.

b) Graphical Summary

Organise the data in some meaningful way by constructing a

frequency distribution (refer Appendix A.1) for quantitative or

qualitative data.

A frequency distribution is the organisation of raw data in
table form, using classes and frequencies.


Graphical Statistics

The purpose of graphs in statistics is to convey the data to the viewer in pictorial
form and to get the audience’s attention in a publication or a presentation.


Histogram, Frequency Polygon, Ogive

Histogram
For quantitative data.
Describes a grouped frequency data distribution.
Displays the data by using contiguous vertical bars of various heights to represent the frequency of the classes.

Frequency Polygon
For quantitative data.
Describes a grouped frequency data distribution.
Displays the data by using lines that connect points plotted for the frequencies at the midpoints of the classes.
The frequencies are represented by the heights of the points.

Ogive
For quantitative data.
Represents the cumulative frequencies for the classes in a grouped frequency data distribution.
Visually represents how many values are below a certain upper class boundary.

Distribution Shapes for Histogram


Bar Chart, Pareto Chart, Pie Chart

Bar Chart
For quantitative data, the bars represent the mean values.
For qualitative data, the heights or lengths of the bars represent the frequencies of the data.
The bars can be vertical or horizontal.

Pareto Chart
Used to represent a frequency distribution for a categorical variable.
The frequencies are displayed by the heights of vertical bars which are arranged in decreasing order.

Pie Chart
A circle that is divided into sections or wedges according to the percentage of frequencies in each category of the distribution.
Pie charts show the relationship between the classes in a set of data and the whole data.

Stem and Leaf Plot, Time Series Graph

Time Series Graph
Represents data that occur over a specific period of time.
For analysis, we look at the trend or pattern (increasing or decreasing) that occurs over the time period.
Further analysis will look at the slope or the steepness of the line (rapid increase or decrease).

Stem and Leaf Plot
The leading digit is plotted as the stem and the trailing digit as the leaf to form groups or classes.
A key indicator is used to define the stem and leaf values.
If the plot is rotated to a horizontal position, we can see the shape of the data distribution.
For a mixture stem and leaf plot, the shape of the distribution for the left side may be seen by reflecting the plot to the right side.
We may analyse the variability of the data by looking at the spread of the stem and leaf plot.
A stem and leaf plot is also good at showing the range, minimum, maximum, mode, gaps, clusters, and outliers.

Selection of appropriate statistical

techniques for data summarisation

Type of Data / Descriptive Statistics / Graphical Summary

Quantitative (ratio scale)
  Descriptive statistics: Mean, Median, Mode, Range, Standard Deviation, Interquartile range (IQR = Q3 − Q1)
  Graphical summary: Histogram, Bar Chart (bar representing means), stem and leaf plot, Boxplot

Symmetrical Distribution
  Descriptive statistics: Mean, Median, Mode, Range, Standard Deviation
  Graphical summary: Histogram, Bar Chart (bar representing means)

…
  Descriptive statistics: … range (IQR = Q3 − Q1)
  Graphical summary: … plot, Boxplot

Categorical (Nominal)
  Descriptive statistics: Mode, Counts, Percentage
  Graphical summary: Pie Chart, Bar Chart

(Ordinal, Likert Scale)
  Descriptive statistics: … Percentage


1.2.5 Presenting and

Analysing the Data

Analysed information given by the

Descriptive statistics (refer topic 1.3)

Graphical summary (graph and chart)

study.

confidence interval, hypothesis testing, ANOVA, goodness of fit

test, contingency table, regression, correlation, etc.


BASIC INFERENTIAL STATISTICS

Statistical Analysis Characteristics

Confidence Intervals (CHAPTER 2)
An estimated range of values which is likely to include an unknown population parameter, $\theta$, with a specified probability (confidence level).
The interval is usually written as $(a, b)$ or $a < \theta < b$.

Hypothesis Testing (CHAPTER 3)
A statement (claim or conjecture or assertion) concerning a parameter or parameters of one or more populations.
• Statistical analysis for one population (mean, variance, proportion)
• Statistical analysis for two populations (mean, variance, proportion)

Analysis of Variance (ANOVA) (CHAPTER 4)
Statistical analysis for three or more population means.
• One-way ANOVA
• Two-way ANOVA and Post Hoc Test

Linear Regression Analysis (CHAPTER 5)
A statistical measure that attempts to determine the strength of the relationship between dependent (y) and independent variables (x).
• Simple linear regression analysis and correlation (y vs x).
• Multiple linear regression analysis and correlation (y vs $x_i$).
• Model selection technique to choose a parsimonious model that best fits the data.

Statistical Analysis for Categorical Data (CHAPTER 6)
1. Tests concerning frequency distributions for categorical data (Goodness of Fit)
2. Tests concerning specific probability distributions (Goodness of Fit)
3. Test the independence of two variables (Contingency Table)
4. Test the homogeneity of proportions (Contingency Table)

ADVANCED INFERENTIAL STATISTICS

Statistical Analysis Characteristics

Experimental Design (DOE)
Planning, conducting, analysing and interpreting controlled tests to evaluate the factors that control the value of a parameter or group of parameters.
Example: ANOVA, Single factor experiment, Randomized Blocks, Latin Squares and Related Design, Factorial Design, Response Surface Methodology, Nested and Split-Plot Design

Time Series Analysis
Modelling, making inference and producing forecasts of time series data for future observations. Time series models are built to represent the serially correlated series, trends, or seasonal effects.
Example: Linear Time Series, Linear Stationary Models (AR, MA, ARMA), Linear Nonstationary Models (ARIMA, SARMA), Box-Jenkins Models, Volatility Models (ARCH, GARCH), Hybrid models

Multivariate Analysis
A central tool whenever many variables need to be considered at the same time.
Example: Mean Vector and Covariance Matrix Estimation, MANOVA, Principal Component Analysis, Factor Analysis, Canonical Correlation Analysis, Discriminant Analysis, Cluster Analysis

Statistical Quality Control (SQC)
Quality improvement through the use of modern statistical methods for quality control.
Example: Variables Control Charts, Attribute Control Charts, Time-Weighted Control Charts, Multivariate Control Charts


Statistical Modelling
Mathematical equations that relate one or more random variables and possibly other non-random variables, concerning the generation of some sample data and similar data from a larger population.
• Examples of statistical models: Generalised Linear Model, Dependence model, Regression, Bayesian, Markov chain, Random effect and mixed model
• The process involves: parameter estimation, data generation, missing values, outlier detection, simulation study, bootstrap, goodness of fit test

Data Mining
A computing process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.
Example: Decision Tables, Decision Trees, Classification Rules, Association Rules, Clustering, Advanced linear model, Bayesian, Instance-based Learning

Circular Statistics
A branch of statistics that involves circular data, which deal with direction or cyclic time. Circular data are measured in degrees (0°, 360°] or radians (0, 2π].
Example: orientation of an animal, direction of wind and waves, days of the week, compass direction, waves of sound, human perception under various conditions, the orientation of ridges of fingerprints, the orientation of sand grains from a beach, deaths due to a disease at various times in a year, and astronomical observations.


Advanced Regression Analysis
• Polynomial Regression: y is modelled as an nth degree polynomial in x.
• Multivariate Regression: Y is a matrix with a series of multivariate dependent measurements and X is a matrix of observations on independent variables.
• Generalized Linear Model: A flexible generalization of ordinary linear regression that allows for response variables that have error distribution models other than a normal distribution.
• Logistic Regression: A regression model where the dependent variable is categorical.
• Nonlinear Regression: The observational data are modelled by a function which is a nonlinear combination of the model parameters and depends on one or more independent variables.
• Error in Variables: A regression model that accounts for measurement errors in the independent variables.

1.2.6 Make the decision

and conclusion

The researchers can make decisions in order to achieve the
objective and goal of the research and choose the option
that represents the ‘best’ solution to the problem.

the researchers and quality of the information.


1.3 REVIEWS ON

DESCRIPTIVE

STATISTICS

Summarise the data using measures of central

tendency, such as the mean, median, mode, and

midrange.

Describe the data using measures of variation, such

as the range, variance, standard deviation and

coefficient of variation.

Identify the position of a data value in a data set

using measures of position such as quartiles, deciles,

and percentiles.


Reviews on

Descriptive Statistics

about the data.

We can summarise data using measures of central tendency,

measures of variation, and measures of position.

Some classify these types of measures as traditional
statistics.

If the measurement describes a population
characteristic, it is called a parameter.

If the measurement describes a sample characteristic,
it is called a statistic.


RULE OF THUMB FOR DECIMAL

PLACES

be rounded to four (4) decimal places.

2. If the unit is given (in cm, minute, day, etc.), the value should

be rounded to that unit’s decimal places.


TIPS: Descriptive Statistics using

Scientific Calculator

Casio fx-570MS

STEP 2: Data summary

Shift 1 →

Shift 2 →

STEP 3: Clear data → Shift CLR 1

Casio fx-570ES

STEP 2: Data summary:

Shift 1 → 3: Sum →

Shift 1 → 4: Var →

STEP 3: Clear data → Shift 9

Note:

The notations used in the calculator are $n$ for the sample size, $\bar{x}$ for the sample mean, $x\sigma_n$ or $\sigma_x$ for the population
standard deviation, and $x\sigma_{n-1}$ or $s_x$ for the sample standard deviation.


1.3.1 Measures of Central Tendency

Measures of central tendency are also called measures of

average

1. mean,
2. median,
3. mode, and
4. midrange.

These measures can roughly describe the shape of the distribution of a data set.

The measures of central tendency are used to describe an
entire set of observations with a single value representing the
central or middle value of the data set.


Midrange (MR)

Is a rough estimate of the middle

$\text{MR} = \dfrac{\text{lowest value} + \text{highest value}}{2}$

EXAMPLE 1.3:

If the data set is 1, 3, 5, 7, 7, 8, then the calculated midrange is $\text{MR} = \dfrac{1 + 8}{2} = 4.5$.

Properties of Midrange

A rough estimate of the average

Can be affected by one extremely high or low value (outlier).


Mean

Is the sum of the values divided by the total number of values

$\mu = \dfrac{\sum_{i=1}^{N} x_i}{N}$, where $N$ = population size

$\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n}$, where $n$ = sample size

If the data set is 1, 3, 5, 7, 7, 8, then

‒ the calculated mean is $\mu = 5.1667$ if the data is taken from the population.
The value is a true mean or a parameter.

‒ the calculated mean is $\bar{x} = 5.1667$ if the data is taken from the sample.
The value is a sample mean or a statistic.


Median

Is the middle number of n ordered data (smallest to largest)

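If n is odd: $\text{Median (MD)} = x_{\frac{n+1}{2}}$

If n is even: $\text{Median (MD)} = \dfrac{x_{\frac{n}{2}} + x_{\frac{n}{2}+1}}{2}$

If the data set is 1, 3, 5, 7, 7, 8, then the calculated median is $\text{Median} = \dfrac{x_3 + x_4}{2} = \dfrac{5 + 7}{2} = 6$.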


Mode

Is the most commonly occurring value in a data series

EXAMPLE 1.4:

a) If the data set is 1, 6, 3, 7, 8, 5, then no mode exists.

b) If the data set is 1, 6, 3, 7, 8, 3, 5, then the mode is 3.

c) If the data set is 1, 6, 3, 7, 3, 8, 7, 5, 3, 7, then the modes are 3 and 7.

Properties of Mode

The mode is used when the most typical case is desired.

The mode can be used when the data are nominal.

The mode is not always unique.

A data set can have more than one mode, or the mode may not

exist for a data set.


Identify the Shapes of Data

Distribution

Symmetric: Mean = Median = Mode

Positively skewed / right-skewed: Mean > Median > Mode

Negatively skewed / left-skewed: Mean < Median < Mode

→ The shape of the distribution may be identified by observing the

position of the mode value.


EXAMPLE 1.3

If the data set is 1, 3, 5, 7, 7, 8, then

‒ the calculated mean is $\mu = 5.1667$ if the data is taken from the
population. The value is a true mean or a parameter.

‒ the calculated mean is $\bar{x} = 5.1667$ if the data is taken from the sample.
The value is a sample mean or a statistic.

‒ the calculated median is $\text{Median} = \dfrac{x_3 + x_4}{2} = 6$.

‒ the mode is 7.

‒ the shape of the distribution is negatively skewed since
Mean < Median < Mode.
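
The values in this example can be checked in a few lines of code. The following is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative only and are not part of the course material.

```python
from statistics import mean, median, multimode

# Data set from the worked example
data = [1, 3, 5, 7, 7, 8]

# Midrange: average of the lowest and highest values
midrange = (min(data) + max(data)) / 2

# Mean: sum of the values divided by the number of values
avg = mean(data)

# Median: middle value of the ordered data (average of x3 and x4 here, since n is even)
med = median(data)

# multimode returns every value that shares the highest frequency
modes = multimode(data)

print(f"midrange = {midrange}")
print(f"mean     = {avg:.4f}")
print(f"median   = {med}")
print(f"mode(s)  = {modes}")
```

Note that `multimode` returns all values when no value repeats, so a result as long as the data set itself corresponds to the "no mode exists" case described earlier.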


Properties of Mean and Median

The mean is unique, and not necessarily one of the data values.

The mean is affected by extremely high or low values and if it occurs, the

mean may not be the appropriate average to use.

The mean is used in computing other statistics, such as variance.

The mean cannot be computed for an open-ended frequency distribution.

The mean varies less than the median or mode when samples are taken from

the same population and all three measures are computed for these samples.

The mean is not an appropriate average to use if the shape of distribution is

skewed.

The median is used when one must find the center or middle value of a data

set.

The median is used when one must determine whether the data values fall into the upper half or
lower half of the distribution.

The median is affected less than the mean by extremely high or extremely low

values.


EXAMPLE 1.5

An extreme value, say 21, is added to the data set in Example 1.3. The new
data set is 1, 3, 5, 7, 7, 8, 21. Assume that the data is taken from a sample, then

‒ the calculated mean is $\bar{x} = 7.4286$. The mean is affected by
outliers and may not be the appropriate average to use. This new average
value no longer represents the centre of the data set.

‒ the calculated median is $\text{Median} = x_4 = 7$. This new average value is …

‒ the mode is 7.

‒ the calculated midrange is $\text{MR} = \dfrac{1 + 21}{2} = 11$. The midrange is easily
affected by outliers.

‒ the shape of the distribution is positively skewed since the mode is the smallest
value as compared with the mean and median values.

An extremely high or extremely low value that occurs in a data set is called an outlier.
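
The sensitivity of each average to the added value 21 can be seen by computing the measures before and after. This is a small sketch only; the helper name `summary` is an assumption made for illustration.

```python
from statistics import mean, median

def summary(values):
    """Return the mean, median and midrange of a list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "midrange": (min(values) + max(values)) / 2,
    }

original = [1, 3, 5, 7, 7, 8]      # data set from Example 1.3
with_outlier = original + [21]     # the same data with the extreme value added

before = summary(original)
after = summary(with_outlier)

# Print each measure side by side to show how far it moves
for name in before:
    print(f"{name:9s} before = {before[name]:.4f}   after = {after[name]:.4f}")
```

The median moves by one unit, while the mean and especially the midrange are pulled towards the extreme value.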

EXERCISE 1.3.1

1. Determine the shape of distribution of the following

data.

b) Mean = 25, Mode = 13, Median = 17

c) Mean = 5, Mode = 73, Median = 17

d) 11.4, 11.6,12.6,12.7, 12.8, 13.3, 13.3, 13.6, 13.7,

13.8

a) symmetric b) right-skewed c) left-skewed d) Mean = 12.88, Median = 13.05, mode = 13.3, left-skewed
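
The comparison rule used in these answers can be written as a short function. This is a sketch under the rule of thumb stated on the distribution shapes slide (comparing mean, median and mode); the function name `shape` and the fallback label "undetermined" are assumptions for illustration.

```python
from statistics import mean, median, multimode

def shape(mean_value, median_value, mode_value):
    """Classify a distribution by comparing its mean, median and mode."""
    if mean_value == median_value == mode_value:
        return "symmetric"
    if mean_value > median_value > mode_value:
        return "right-skewed"
    if mean_value < median_value < mode_value:
        return "left-skewed"
    return "undetermined"  # the three measures do not follow a clear order

# Parts (b) and (c): the summary values are given directly
print(shape(25, 17, 13))
print(shape(5, 17, 73))

# Part (d): the measures are computed from the raw data first
d = [11.4, 11.6, 12.6, 12.7, 12.8, 13.3, 13.3, 13.6, 13.7, 13.8]
d_mean, d_median, d_mode = mean(d), median(d), multimode(d)[0]
print(round(d_mean, 2), round(d_median, 2), d_mode, shape(d_mean, d_median, d_mode))
```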


EXERCISE 1.3.1

2. The following set of data represents the number of hospitals

for selected countries.

123 108 195 138 115 179 119 148 147 180

146 178 189 108 193 114 179 147 108 128

164 174 128 159 193 175

b) Are the average values calculated in (a) parameters or
statistics? Why?

c) What is the distribution type that describes the data?

d) What is the best measure of average of this set of data?

Why?


1.3.2 Measures of Variation/Dispersion

Measures of variation or measures of dispersion are measures

that determine the spread of data values.

1. Range: the simplest measure of variation.
2. Variance,
3. Standard deviation, and
4. Coefficient of variation: more meaningful and popular measures that describe the variability of data more accurately.

Variance and standard deviation are used quite often in

inferential statistics.


Range (R)

Is the difference between the highest value and the lowest value in a
data set

EXAMPLE 1.6:

Suppose the data set is 1, 6, 3, 7, 8, 5, then the calculated range is $R = 8 - 1 = 7$.

Properties of Range

The simplest measure of variation.

Easily affected by one extremely high or low value (outliers).


Variance

Is the average of the squares of the distance each value is from the mean.

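$\sigma^2 = \dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$, where $N$ = population size

$s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$, where $n$ = sample size

Standard Deviation

Is the square root of the variance

$\sigma = \sqrt{\dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$, where $N$ = population size

$s = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$, where $n$ = sample size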


Properties of Variance & Standard Deviation

The variance is the average of the squares of the distance each value

is from the mean.

If the data values are near the mean, the variance will be smaller.

If the data values are far from the mean, the variance will be larger.

The squared distance is used since the sum of the signed distances from the mean will
always be zero.

Variance is never negative.

The units of the variance are the square of the units of the data.

Standard deviation is the square root of the variance.

Standard deviation is a measure of the deviations of the values from the
mean.

Standard deviation is never negative.

The units of the standard deviation are the same as the units of the data.


Coefficient of Variation

Is the standard deviation divided by the mean.

$\text{CVar} = \dfrac{\sigma}{\mu} \times 100\%$, for population

$\text{CVar} = \dfrac{s}{\bar{x}} \times 100\%$, for sample

Properties of CVar

The result is expressed as percentage.

A parameter/statistic that allows user to compare the standard deviations

when the units are different (the variables are different).


EXAMPLE 1.6

Suppose the data set is 1, 6, 3, 7, 8, 5, then

‒ the calculated variance is $\sigma^2 = 5.6667$ and the standard deviation is $\sigma = 2.3805$
if the data is taken from the population. These values are called parameters.

‒ the calculated variance is $s^2 = 6.8$ and the standard deviation is $s = 2.6077$ if the
data is taken from the sample. These values are called statistics.

‒ the calculated sample mean is $\bar{x} = 5$. Hence the sample coefficient of variation
is $\text{CVar} = \dfrac{2.6077}{5} \times 100\% = 52.15\%$.
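
The population and sample versions differ only in the divisor ($N$ versus $n - 1$). As a rough check of the figures in this example, the sketch below uses Python's standard `statistics` module, where `pvariance`/`pstdev` divide by $N$ and `variance`/`stdev` divide by $n - 1$. Variable names are illustrative.

```python
from statistics import mean, pstdev, pvariance, stdev, variance

data = [1, 6, 3, 7, 8, 5]

# Treating the data as a population (divide by N)
pop_var = pvariance(data)
pop_sd = pstdev(data)

# Treating the data as a sample (divide by n - 1)
samp_var = variance(data)
samp_sd = stdev(data)

# Sample coefficient of variation, expressed as a percentage
cvar = samp_sd / mean(data) * 100

print(f"population: variance = {pop_var:.4f}, sd = {pop_sd:.4f}")
print(f"sample:     variance = {samp_var:.4f}, sd = {samp_sd:.4f}")
print(f"sample CVar = {cvar:.2f}%")
```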


Why we Need Measures of Variation

• Measures of variation can be a judgment about how well the

measures of average illustrate or depict the data.

• It is also called measure of variation because it can measure the

variability that exists in a data set.

• It can be used when the measures of central tendency do not give

any significant meaning or not needed/practical.

EXAMPLE:

Suppose we wish to compare the performance of two groups of students
in a test. The mean values are the same for both data sets,
so you might conclude that the two groups of students
performed equally well in the test. However, if the data sets are
examined graphically as shown in Figure 1.10, a different conclusion
might be drawn.


Examining Data Sets Graphically

Students are given the same test and the mean score is
calculated as 66.67 marks for each group of students.

The mean values are the same but the spread or variation of the
test scores is quite different.

The test scores for students from Group B are more consistent and
less variable.

When the mean values are equal, the larger the data range is, the
more variable the data.


Comparing Two Data Sets

A smaller standard deviation, $\sigma_1 < \sigma_2$, indicates that:

POPULATION 1 is POPULATION 2 is

Less dispersed More dispersed

Less spread More spread

Less variable (small variation) More variable (large variation)

More consistent Less consistent

More precise Less precise

More accurate Less accurate

Better data Worse data


EXAMPLE 1.7

The following data represents the age (in years) of lecturers in two faculties at UMP.

FIST: 24, 25, 26, 27, 30, 31, 31, 32, 36, 40, 43, 44, 45

FKEE: 22, 25, 25, 25, 28, 33, 34, 36, 37, 40, 41, 43, 48, 51, 53

For these sample data sets, find the standard deviations. Then, identify which data set

is more consistent and less dispersed. What can you say about the variation of age for

lecturers in both faculties?

Solution:

$s_{FIST} = 7.4670$ years, $s_{FKEE} = 9.9460$ years

$s_{FIST} < s_{FKEE}$, so the FIST data is more consistent and less dispersed.

The variation of ages for lecturers in FIST is smaller and less dispersed
compared to FKEE lecturers.
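
The comparison in this example can be reproduced as follows. This is a minimal sketch that treats both data sets as samples (divisor $n - 1$), as the question states; the dictionary layout is an illustrative choice.

```python
from statistics import stdev

# Ages (in years) of lecturers, as listed in Example 1.7
ages = {
    "FIST": [24, 25, 26, 27, 30, 31, 31, 32, 36, 40, 43, 44, 45],
    "FKEE": [22, 25, 25, 25, 28, 33, 34, 36, 37, 40, 41, 43, 48, 51, 53],
}

# Sample standard deviation for each faculty
spread = {faculty: stdev(values) for faculty, values in ages.items()}

# The faculty with the smaller standard deviation is the more consistent one
more_consistent = min(spread, key=spread.get)

for faculty, s in spread.items():
    print(f"s_{faculty} = {s:.4f} years")
print(f"More consistent: {more_consistent}")
```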


EXERCISE 1.3.2 (Q1&Q2)

1. Which of the following set of sample data is less variable?

Method A: 79 73 78 76 80 75 82 70 77

Method B: 80 85 78 79 75 73 70 60 65

$s_A = 3.6742$, $s_B = 7.8493$

lifetime (in hours) from two different brands. Which brand of
battery performed better?

A: 4.2, 6.7, 7.3, 7.5, 8.0, 8.5, 8.7, 8.8, 9.2, 9.3

B: 9.6, 9.7, 9.8, 9.9, 10.1, 10.2, 11.0, 11.0, 11.0, 11.1

$s_A = 1.5$ hours, $s_B = 0.6$ hours


Comparing Two Data Sets with

different units/variable

If the two samples do not have the same units of measurement or the

variables are different, the variance and standard deviation for each

sample cannot be compared directly.

between the number of sales of car for a year and the commission (in

RM) made by the salesperson. It is very clear that these two

variables have two different units.

Hence, the best way to compare the variability within these two

variables is by using the coefficient of variation.

variable than the variable two.


EXERCISE 1.3.2 (Q3)

3. The average age of the accountants at a huge company is 31

years with a standard deviation of 4 years. The average

salary of the accountants is RM 44255 per year with a

standard deviation of RM 780. Compare the variations of

age and income.

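$\text{CVar}_{\text{age}} = \dfrac{4}{31} \times 100\% = 12.90\%$, $\text{CVar}_{\text{income}} = \dfrac{780}{44255} \times 100\% = 1.76\%$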
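
Because age is in years and salary is in RM, the two standard deviations cannot be compared directly, but the coefficients of variation can. The sketch below applies the sample formula to the figures exactly as stated in the question (4 and 31 years; RM 780 and RM 44255), which gives about 12.90% for age and about 1.76% for income. The function name is an assumption for illustration.

```python
def coefficient_of_variation(sd, mean_value):
    """Standard deviation divided by the mean, as a percentage."""
    return sd / mean_value * 100

cv_age = coefficient_of_variation(4, 31)            # years
cv_income = coefficient_of_variation(780, 44255)    # RM per year

print(f"CVar (age)    = {cv_age:.2f}%")
print(f"CVar (income) = {cv_income:.2f}%")

# The variable with the larger coefficient of variation is relatively more variable
more_variable = "age" if cv_age > cv_income else "income"
print(f"Relatively more variable: {more_variable}")
```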


Other Properties of Standard Deviation

Used to determine the number of data values that fall within a
specified interval in a distribution.

section or range of data.

It can be seen that about 95% of the data values fall within 𝜇 − 2𝜎
and 𝜇 + 2𝜎.


1.3.2.1 Accuracy and Precision

Concept (Validity and Reliability)

→ The concept is important to ensure that data collected from an

experiment or observation is good, valid, and reliable.

Accuracy
Accuracy is how close a measured value is to the ‘true’ measurement.
No measurement/device is perfect (it can easily be inaccurate and lead to false measurements).
There is still a tolerance for error.
Accuracy must be accounted for in your results.
The bigger the difference between the measured and the true values, the less accurate (less valid) the measurement.

Precision
Precision is how close the measured values are to each other, or how consistent your results are for the same phenomena over several measurements.
Precision as a measure of variation must be accounted for in your calculations and results.
The precision of a measurement is the size of the unit used to make a measurement. The smaller the unit, the more precise (more reliable) the measurement.


Game of Darts

[Figure: four dartboards]

… (close to the mark) measurements, but not very precise, since the darts are spread out everywhere
• Valid but not reliable

… without accuracy
• Very consistent, but not near the mark
• Not valid but reliable

… imprecision
• Not valid and not reliable

… precise.
• Valid and reliable
• Very good measurement


EXERCISE 1.3.2 (Q4)

4. Identify each situation as either accurate or precise or both.

a) If you are playing football and you always hit the left goal post

instead of scoring.

b) A candy manufacturer claims that each packet contains 20 candies.

A sample of packets has 18, 21, 19, 21, 19, 20, 22 candies,

respectively. The average is 20 candies with an error of 1 candy.

c) A manufacturer claims that each chocolate packet contains 20

chocolates. A sample of packets have 17, 18, 18, 17, 18, 17, 17

chocolates, respectively.

d) In an experiment, with five trials, the end results of the five trials for

whatever is being tested are: 35 kg, 36 kg, 36 kg, 35 kg, 36 kg. The

actual value (as found in a scientific data book) is meant to be 42 kg.

e) In an experiment, with five trials, the average value is 35 kg. The

actual value (as found in a scientific data book) is meant to be 35 kg.
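
One way to put numbers on these two ideas is to measure accuracy by the gap between the sample mean and the true value, and precision by the sample standard deviation. The sketch below applies that reading to the packet counts in parts (b) and (c); the function name `assess` and the choice of these two summary measures are assumptions for illustration, not a prescribed method.

```python
from statistics import mean, stdev

def assess(measurements, true_value):
    """Return (bias, spread): the mean error and the sample standard deviation."""
    bias = mean(measurements) - true_value   # closeness to the true value (accuracy)
    spread = stdev(measurements)             # closeness to each other (precision)
    return bias, spread

claimed = 20  # both manufacturers claim 20 items per packet

candies = [18, 21, 19, 21, 19, 20, 22]       # part (b)
chocolates = [17, 18, 18, 17, 18, 17, 17]    # part (c)

bias_b, spread_b = assess(candies, claimed)
bias_c, spread_c = assess(chocolates, claimed)

print(f"(b) bias = {bias_b:+.4f}, spread = {spread_b:.4f}")
print(f"(c) bias = {bias_c:+.4f}, spread = {spread_c:.4f}")
```

A bias near zero points to accurate measurements, and a small spread points to precise ones; how small is "small enough" depends on the tolerance chosen for the application.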


MIND EXPANDING EXERCISES

4. In what sense are the mean, median, mode and midrange measures
of the “centre” of a data set?

in a statistics class or the IQ scores of 30 teenagers watching a

movie? Why?

measures as compared to mean and variance for non-normal data.

7. A JDT football fan records the number on the jersey of each player

in a game. Does it make sense to calculate the mean of those

numbers? Why or why not?


MIND EXPANDING EXERCISES

8. In an analysis of the accuracy of weather forecasts, the actual high
temperatures are compared to the high temperatures predicted one day earlier
and the high temperatures predicted five days earlier. Listed below are the errors
between the predicted temperatures and the actual high temperatures for 14
consecutive days in Kuala Lumpur.

(Actual high) − (High predicted one day earlier): 2, 2, 0, 0, −3, −2, 1, −2, 8, 1, 0, −1, 0, 1

(Actual high) − (High predicted five days earlier): 0, −3, 2, 5, −6, −9, 4, −1, 6, −2, −2, −1, 6, −4

a) Do the means and medians of the errors indicate that the temperatures

predicted one day in advance are more accurate than those predicted

five days in advance, as we might expect?

b) Do the standard deviations of the errors indicate that the temperatures

predicted one day in advance are more accurate than those predicted

five days in advance, as we might expect?


ME.8 (solution)

Mean median sd

1.5000 1.0000 2.4152

3.8333 4.5000 2.4014


MIND EXPANDING EXERCISES

9. A data set consists of 20 values that are fairly close together. Another

value is included, but this new value is an outlier (very far away from

the other values). How is the standard deviation affected by the

outlier? No effect? A small effect? Or a large effect?

deviation of 10. Meanwhile, scores on the economics test have a mean

of 55 and a standard deviation of 5. Which is relatively better: a score

of 85 on a psychological test or a score of 45 on an economics test?

11. When designing the production procedure for batteries used in heart

pacemakers, an engineer specifies that “the batteries must have a

mean life greater than 10 years, and the standard deviation of the

battery life can be ignored.” If the mean battery life is greater than 10

years, can the standard deviation be ignored? Why or why not?


1.3.3 Measures of Position

Describe where a specific data value falls within the data set or its

relative position based on percentiles, deciles and quartiles in

comparison with other data values

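Describing the position of the data value (increasing order)

Percentiles: split the data into 100 equal parts. $P_i = x_{\frac{in}{100}} = x_c$

Deciles: split the data into 10 equal parts. $D_i = x_{\frac{in}{10}} = x_c$

Quartiles: split the data into 4 equal parts. $Q_i = x_{\frac{in}{4}} = x_c$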


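$P_i = x_{\frac{in}{100}} = x_c$, $D_i = x_{\frac{in}{10}} = x_c$, $Q_i = x_{\frac{in}{4}} = x_c$

If c is not a whole number, round c up to the next whole number.

If c is a whole number, then use $Q_i, D_i, P_i = \dfrac{x_c + x_{c+1}}{2}$.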


EXAMPLE 1.9

A manufacturer measured the volume of a sample of 11 bottles of chemical

solvents. The results are recorded (in millilitres) as follows.

40 45 38 25 42 31 30 44 26 27 36

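Show that $Q_1$ is equivalent to $P_{25}$, $Q_2$ is equivalent to $P_{50}$, $Q_3$ is equivalent to $P_{75}$, and $D_i$
is equivalent to $P_{10i}$, where $i = 1, 2, \ldots, 9$.

Ordered data: 25, 26, 27, 30, 31, 36, 38, 40, 42, 44, 45

Quartiles and percentiles

$Q_1 = x_{\frac{1(11)}{4}} = x_{2.75} \rightarrow x_3 = 27$; $P_{25} = x_{\frac{25(11)}{100}} = x_{2.75} \rightarrow x_3 = 27$

$Q_2 = x_{\frac{2(11)}{4}} = x_{5.50} \rightarrow x_6 = 36$; $P_{50} = x_{\frac{50(11)}{100}} = x_{5.50} \rightarrow x_6 = 36$

$Q_3 = x_{\frac{3(11)}{4}} = x_{8.25} \rightarrow x_9 = 42$; $P_{75} = x_{\frac{75(11)}{100}} = x_{8.25} \rightarrow x_9 = 42$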


EXAMPLE 1.9

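Deciles and percentiles

$D_3 = x_{\frac{3(11)}{10}} = x_{3.3} \rightarrow x_4 = 30$; $P_{30} = x_{\frac{30(11)}{100}} = x_{3.3} \rightarrow x_4 = 30$

$D_5 = x_{\frac{5(11)}{10}} = x_{5.5} \rightarrow x_6 = 36$; $P_{50} = x_{\frac{50(11)}{100}} = x_{5.5} \rightarrow x_6 = 36$

$D_7 = x_{\frac{7(11)}{10}} = x_{7.7} \rightarrow x_8 = 40$; $P_{70} = x_{\frac{70(11)}{100}} = x_{7.7} \rightarrow x_8 = 40$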
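
The position rule used in this example (compute c = i·n/k, round up if c is not whole, otherwise average $x_c$ and $x_{c+1}$) can be sketched as a single function. Note that this follows the slide's rule and will not always agree with the interpolation methods used by spreadsheet or library quantile functions. The name `position_value` and its parameters are assumptions for illustration.

```python
def position_value(data, i, parts):
    """Return the i-th quantile when the data are split into `parts` equal parts.

    parts = 4 gives quartiles, 10 gives deciles, 100 gives percentiles.
    """
    if not 0 < i < parts:
        raise ValueError("i must satisfy 0 < i < parts")
    ordered = sorted(data)
    n = len(ordered)
    # c = i * n / parts, kept as whole part and remainder to avoid float error
    whole, remainder = divmod(i * n, parts)
    if remainder == 0:
        # c is a whole number: average x_c and x_(c+1) (1-based positions)
        return (ordered[whole - 1] + ordered[whole]) / 2
    # c is not a whole number: round up to the next position
    return ordered[whole]

volumes = [40, 45, 38, 25, 42, 31, 30, 44, 26, 27, 36]  # Example 1.9 (ml)

quartiles = [position_value(volumes, i, 4) for i in (1, 2, 3)]
percentiles = [position_value(volumes, i, 100) for i in (25, 50, 75)]
print("Q1, Q2, Q3    =", quartiles)
print("P25, P50, P75 =", percentiles)

# Each decile D_i matches the percentile P_(10 i)
for i in range(1, 10):
    assert position_value(volumes, i, 10) == position_value(volumes, 10 * i, 100)
```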


EXERCISE 1.3.3

1. Given a set of data as 9 2 1 4 3 7 5 4 6 .

b) Find the value that corresponds to the 3rd quartile.

are shown below.

9 22 11 14 13 3 7 15 18 16

b) Find the score that corresponds to the 7th decile.

1) 4, 6 2) 8, 15.5


Why We need Measures of Position?

Percentiles are one of the measures of position often used in
educational and health-related fields to indicate the position
of an individual in a group.

A percentile is not a percentage value. The ith percentile, $P_i$, is a
value such that i% of the data are less than or equal to $P_i$ and
(100 − i)% are greater than or equal to $P_i$.

EXAMPLE:

If a student obtained 82 marks out of 100 in a test, he/she
scored 82%. However, there is no indication of his/her

position with respect to the rest of the class. On the other hand,

if his/her score corresponds to the 75th percentile, then he/she

did better than 75% of the students in his/her class.


Why We need Measures of Position?

Quartiles can be used as a rough measurement of variability.

The interquartile range (IQR) is defined as the difference between Q3 and Q1, IQR = Q3 − Q1, and is the range
of the middle 50% of the data.

It is used to identify outliers, and to measure variability in
exploratory data analysis (Section 1.4).

The smaller the value of the IQR, the smaller the variation in the
data.

It is useful for showing the variability of the data set: whether it is more
variable, more dispersed, more spread, or more consistent.


1.3.4 Descriptive Statistics

Using Microsoft Excel


Interpreting Descriptive Statistics

Using Microsoft Excel (Example 1.9)

A firm is conducting a study to compare two different physical

arrangements of its assembly line. The arrangement with the smaller

variance in the number of finished units produced per day will be adopted

as the new arrangement of its assembly line.

→ $\bar{x}_1 < \bar{x}_2$: on average, Assembly Line 2 produced more
finished units per day.

of Assembly Line 1 is more consistent, less dispersed,

less spread, less variable (small variation), and more

precise. Therefore the arrangements of Assembly

Line 1 will be adopted as the new arrangement.

negatively skewed or left-skewed since

Mean < Median < Mode. The skewness value is

negative too.

negatively skewed or left-skewed since the mode is

the highest value compared to mean and median. The

skewness value is negative too.


The skewness value for Assembly Line 2 is more negative than that for Assembly Line 1. Hence the distribution of data from Assembly Line 2 is more skewed to the left, indicating that Assembly Line 2 produced more finished units per day.

$\bar{x}_1 \pm \text{Confidence Level} = 491.1 \pm 17.1 = (474.0, 508.2)$.

Hence, we are 95% confident that the population mean number of finished units per day for Assembly Line 1 lies between 474 and 509 units.

$\bar{x}_2 \pm \text{Confidence Level} = 499.4 \pm 25.2 = (474.2, 524.6)$.

Hence, we are 95% confident that the population mean number of finished units per day for Assembly Line 2 lies between 475 and 525 units.
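The interval arithmetic above can be checked with a short Python sketch. It assumes that the "Confidence Level" figure from Excel's descriptive statistics output is the margin of error to add to and subtract from the sample mean; the function name `interval_from_margin` is hypothetical and used only for illustration.

```python
def interval_from_margin(mean, margin):
    """Return (lower, upper) for mean ± margin."""
    return (mean - margin, mean + margin)

# Means and margins as quoted in Example 1.9.
line1 = interval_from_margin(491.1, 17.1)
line2 = interval_from_margin(499.4, 25.2)

for name, (lower, upper) in (("Assembly Line 1", line1), ("Assembly Line 2", line2)):
    # A wider interval reflects a less precise estimate of the population mean.
    print(f"{name}: ({lower:.1f}, {upper:.1f}), width = {upper - lower:.1f}")
```

The interval for Assembly Line 1 is narrower, which agrees with the earlier observation that its output is less variable.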


MIND EXPANDING EXERCISES

12. A lecturer is interested in investigating students’ performance in a statistics course based on their carry marks and their final examination scores. The descriptive statistics and graph are given below. From the analyses, comment on the students’ performance based on carry marks and final examination scores.

MIND EXPANDING EXERCISES

ME.12


MIND EXPANDING EXERCISES

13. A study is conducted to compare the performance of male and female

students in the statistics course for final examination scores. The

data, descriptive statistics and graph of the final examination scores

are presented as follows. Based on the analysis, answer the following

questions:

Female: 72 62 83 65 60 74 66 68 57 63 61 76 60 78 34 70 59 63 86 43 90 87

Male: 58 81 86 68 70 77 54 54 72 41 33 52 70 37 67 39 74 32 8 33 27 23 54

MIND EXPANDING EXERCISES

ME.13

a) State the mean and standard deviation for both groups and give your

comment.

b) Based on the graph shown, give your comment.


MIND EXPANDING EXERCISES

14. People with diabetes must monitor and control their blood glucose level. The

goal is to maintain fasting plasma glucose between 90 and 130 mg/dl. The

data presented below give the fasting plasma glucose for two groups, before

treatment and after treatment. Answer the following questions:

b) Give the first five data in the ‘before’ group and last five data in the ‘after’

group.

c) Identify the median and mode in each group.

d) Describe the shape of the distribution of data in each group.

e) Is there any outlier in the groups?

f) What are the advantages of using stem and leaf plot?

g) Which data is more dispersed (consistent)?

h) Based on the descriptive analysis done in Excel, why do you think that

the dispersion for both groups using variance is different from variance

given by IQR?


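MIND EXPANDING EXERCISES

ME.14

 Before | Stem | After
      8 |    7 |
        |    8 |
    6 5 |    9 |
      3 |   10 |
      2 |   11 |
        |   12 | 8 8
      4 |   13 |
7 5 8 1 |   14 |
    3 8 |   15 | 8 9
        |   16 | 3 4 0
    2 2 |   17 |
        |   18 | 8
        |   19 | 5 8
      0 |   20 |
        |   21 |
        |   22 | 7 6 3 1 0
        |   23 |
        |   24 |
      5 |   25 |
        |   26 |
      1 |   27 |
        |   28 | 3
        |   29 |
        |   30 |
        |   31 |
        |   32 |
        |   33 |
        |   34 |
      9 |   35 |

Key: 14|1 = 141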

1.4 EXPLORATORY DATA ANALYSIS

Identify outliers.

Draw and interpret a boxplot.


Exploratory Data Analysis

Traditional Method | Exploratory Data Analysis
Frequency distribution | Stem and leaf plot
Histogram | Boxplot
Mean | Median
Standard deviation | Interquartile range (IQR = Q3 − Q1)

The purpose of exploratory data analysis is to discover any gaps or patterns in the data.

For symmetric data, the appropriate measure of central tendency

is mean and for variability is standard deviation or variance.

For skewed data, the appropriate measure of central tendency is

median and for measure of variability is interquartile range (IQR).


RECALL: Selection of appropriate statistical techniques for data summarisation

Type of Data | Descriptive Statistics | Graphical Summary
Quantitative (ratio scale) | Mean, Median, Mode, Range, Standard Deviation, Interquartile range (IQR = Q3 − Q1) | Histogram, Bar Chart (bar representing means), stem and leaf plot, Boxplot
Symmetrical Distribution | Mean, Median, Mode, Range, Standard Deviation | Histogram, Bar Chart (bar representing means)
… | … range (IQR = Q3 − Q1) | … plot, Boxplot
Categorical (Nominal) | Mode, Counts, Percentage | Pie Chart, Bar Chart
(Ordinal, Likert Scale) | … Percentage | …

Histogram, Stem and Leaf OR Boxplot?

Histogram
Advantages:
‒ Can graph huge data sets easily.
‒ The shape of distribution can be easily described.
‒ You could change the intervals of the histogram to see which gives a better description of the data.
‒ Great for comparing data.
‒ Can show trends in the data clearly.
Disadvantages:
‒ Not good for small data set.
‒ It is difficult to simplify all the data into one scale.

Stem and Leaf
Advantages:
‒ Very easy to construct.
‒ Shows the real value of data.
‒ Can show range, minimum & maximum, gaps & clusters, and outliers easily.
‒ May observe the mode.
‒ Can identify the shape of distribution.
Disadvantages:
‒ Not good for small data set or very large data set.
‒ Not visually appealing.
‒ Does not easily indicate measures of centrality for large data sets.

Boxplot
Advantages:
‒ Good for small or large data sets.
‒ It displays the range and distribution of data along a number line.
‒ Can show outliers.
Disadvantages:
‒ Original data is not clearly shown in the box plot.
‒ Mean and mode cannot be identified in a box plot.

1.4.1 Outliers

An outlier is an extremely high or extremely low data value when compared with the rest of the data values.

Outliers can arise from:

‒ measurement or observational error,
‒ a writing or typing error,
‒ a data value obtained from a subject that is not in the defined population, or
‒ a legitimate data value that occurred by chance.

When a distribution is symmetric or normal, data values that are more than three standard deviations from the mean can be considered suspected outliers (refer to Figure 1.11).

An outlier can strongly affect the mean and standard deviation of a

variable.
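As a rough illustration of this sensitivity, the Python sketch below compares the mean, median and sample standard deviation of a small data set with and without its extreme value. The nine values are illustrative only, and the variable names are hypothetical.

```python
import statistics

values = [19, 2, 1, 4, 3, 7, 5, 4, 6]       # illustrative data; 19 is the extreme value
trimmed = [v for v in values if v != 19]    # the same data with the extreme value removed

for label, data in (("with 19", values), ("without 19", trimmed)):
    print(
        f"{label:>10}: mean = {statistics.mean(data):.2f}, "
        f"median = {statistics.median(data)}, "
        f"std dev = {statistics.stdev(data):.2f}"
    )
```

The mean and standard deviation both drop once the extreme value is removed, while the median stays at 4, which is one reason the median and IQR are preferred for skewed data.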


Recall: Other Properties of Standard Deviation

Used to determine the number of data values that fall within a specified interval in a distribution.

section or range of data.

It can be seen that about 95% of data values fall within 𝜇 − 2𝜎 and 𝜇 + 2𝜎.

Position of Outliers

A data value x is an outlier if it is less than the lower boundary value, Q1 − 1.5(Q3 − Q1), or exceeds the upper boundary value, Q3 + 1.5(Q3 − Q1), for the data set.

EXAMPLE 1.11

The number of credits in business courses for eight job applicants is

shown here:

9, 12, 15, 27, 33, 45, 63, 72.

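Find the first and third quartiles for the above data. Is there any outlier in the above data?

$Q_1 = x_{\left(\frac{1(8)}{4}\right)} = x_{(2)} \;\Rightarrow\; Q_1 = \frac{x_2 + x_3}{2} = \frac{12 + 15}{2} = 13.5$

$Q_3 = x_{\left(\frac{3(8)}{4}\right)} = x_{(6)} \;\Rightarrow\; Q_3 = \frac{x_6 + x_7}{2} = \frac{45 + 63}{2} = 54$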
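The quartile positions and the outlier check for this example can be sketched in Python. The position rule used here is inferred from the worked examples in this chapter and is an assumption: compute k = i·n/4; if k is a whole number, average the k-th and (k+1)-th sorted values, otherwise round k up. Other textbooks and software use different quartile conventions. The function names `quartile` and `find_outliers` are hypothetical.

```python
import math

def quartile(sorted_data, i):
    """i-th quartile (i = 1, 2, 3) using the position k = i*n/4 (1-based)."""
    n = len(sorted_data)
    k = i * n / 4
    if k.is_integer():
        k = int(k)
        # Whole-number position: average the k-th and (k+1)-th values.
        return (sorted_data[k - 1] + sorted_data[k]) / 2
    # Fractional position: round up to the next position.
    return sorted_data[math.ceil(k) - 1]

def find_outliers(data):
    """Return Q1, Q3, the lower and upper boundaries, and any flagged values."""
    xs = sorted(data)
    q1, q3 = quartile(xs, 1), quartile(xs, 3)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    flagged = [x for x in xs if x < lower or x > upper]
    return q1, q3, lower, upper, flagged

credits = [9, 12, 15, 27, 33, 45, 63, 72]
q1, q3, lower, upper, flagged = find_outliers(credits)
print(q1, q3)          # 13.5 54.0
print(lower, upper)    # -47.25 114.75
print(flagged)         # [] (no value falls outside the boundaries)
```

Under these assumptions, none of the eight credit values lies outside the boundaries, so the data set has no outliers.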


EXERCISE 1.4.1

1. Given 19 2 1 4 3 7 5 4 6. Find outliers if any.

Answer: Q1 = 3, Q3 = 6; 19 is an outlier.

outliers if any.

Answer: Q1 = 5, Q3 = 11; 21 is an outlier.


1.4.2 Boxplots

A boxplot (box and whiskers plot) is a graphical representation of the five-number summary of a data set, together with any outliers.

The five-number summary consists of:

‒ The lowest value of the data set (minimum)
‒ The lower quartile Q1 (1st quartile or 25th percentile)
‒ The median (2nd quartile or 50th percentile)
‒ The upper quartile Q3 (3rd quartile or 75th percentile)
‒ The highest value of the data set (maximum)

Outliers, if any, are plotted in addition to the five-number summary.

Types of Boxplots

A Horizontal boxplot

A Vertical boxplot


EXAMPLE 1.12

The following back-to-back stem and leaf plot represents a sample of the ages of teachers in two schools.

     School A | Stem | School B
  9 7 7 5 5 4 |  2   | 2
8 7 6 2 1 1 0 |  3   | 3 4 6 7
              |  4   | 0 1 3 4 5 7
            7 |  5   | 1 3 4

[key: 3|4 → 34]

Given that for School B, Q1 = 36, Q2 = 42, Q3 = 47 and there is no outlier, draw boxplots for both schools on the same x-axis. Then compare the shapes, averages, and variability of both age distributions.

Measure | School A | School B
Minimum | 24 | 22
1st quartile | $Q_1 = x_{\left(\frac{1(14)}{4}\right)} = x_{(3.5)} \to x_{(4)} = 27$ | $Q_1 = 36$
2nd quartile / Median | $Q_2 = \frac{x_7 + x_8}{2} = 30.5$ | $Q_2 = 42$
3rd quartile | $Q_3 = x_{\left(\frac{3(14)}{4}\right)} = x_{(10.5)} \to x_{(11)} = 36$ | $Q_3 = 47$
Maximum | 38 (excluding the outlier) | 54
Outliers | $Q_1 - 1.5(Q_3 - Q_1) = 27 - 1.5(36 - 27) = 13.5$; $Q_3 + 1.5(Q_3 - Q_1) = 36 + 1.5(36 - 27) = 49.5$. Since 57 > 49.5, 57 is an outlier. | no outlier
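The School A column can be cross-checked with a short Python sketch that decodes the stem and leaf plot into raw ages and applies the outlier boundaries. The leaves below are read from the School A side of the plot and listed in ascending order; the quartiles Q1 = 27 and Q3 = 36 are taken from the table rather than recomputed, and all variable names are hypothetical.

```python
import statistics

# Stem -> leaves for School A, as read from the plot (key: 3|4 means 34).
school_a_plot = {
    2: [4, 5, 5, 7, 7, 9],
    3: [0, 1, 1, 2, 6, 7, 8],
    5: [7],
}

# Decode each stem/leaf pair into an age.
ages = sorted(10 * stem + leaf for stem, leaves in school_a_plot.items() for leaf in leaves)

q1, q3 = 27, 36                      # quartiles quoted in the example
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

outliers = [a for a in ages if a < lower or a > upper]
inside = [a for a in ages if lower <= a <= upper]

print("n =", len(ages))
print("median =", statistics.median(ages))
print("boundaries =", (lower, upper))
print("outliers =", outliers)
print("whiskers run from", min(inside), "to", max(inside))
```

The whisker ends are the smallest and largest ages that are not outliers, which is why the table reports a maximum of 38 for School A even though the sample contains 57.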


Information Obtained from a Boxplot

1. If the median is near the centre of the box, the distribution is approximately

symmetric.

2. If the median falls to the left of the centre of the box, the distribution is positively

skewed.

3. If the median falls to the right of the centre of the box, the distribution is

negatively skewed.

Suppose the median is near the centre of the box (approximately symmetric):

4. If the lines are about the same length, the distribution is approximately

symmetric.

5. If the right line is longer than the left line, the distribution is positively skewed.

6. If the left line is longer than the right line, the distribution is negatively skewed.

If the boxplots for two or more data sets are graphed on the same axis, the

distributions can be compared using their central tendency (average) and

variability values.

To compare the average, use the location of the medians.

To compare the variability, use the length of the IQR.
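Rules 1 to 3 can be sketched as a small Python function. This is an illustration only: it assumes a horizontal boxplot with values increasing from left to right, and the function name `box_shape` and its tolerance parameter `tol` (how close to the centre counts as "near") are hypothetical, since the slides do not quantify "near".

```python
def box_shape(q1, median, q3, tol=0.0):
    """Classify the shape from the position of the median inside the box."""
    centre = (q1 + q3) / 2
    if abs(median - centre) <= tol:
        return "approximately symmetric"
    # Median left of the centre: the upper half of the box is longer.
    if median < centre:
        return "positively skewed"
    return "negatively skewed"

# Quartiles from Example 1.12.
print("School A:", box_shape(27, 30.5, 36))
print("School B:", box_shape(36, 42, 47))
```

With `tol=0.0` any off-centre median is reported as skewed; a larger tolerance would classify School B (median only 0.5 years right of the centre) as approximately symmetric, so the choice of tolerance is a judgment call.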


EXAMPLE 1.12 solution

Shape:

Based on the location of the median, School A has a right-skewed distribution, where most of the teachers’ ages are concentrated at the lower end (< 30 years old). However, School B has a left-skewed distribution, where most of the teachers’ ages are greater than 42 years old.

Average:

Based on the median, 50% of the teachers at School A are younger than 30.5 years, whereas 50% of the teachers at School B are younger than 42 years. On average, the teachers at School B are older than the teachers at School A.

EXAMPLE 1.12 solution

Variability:

Based on the IQR, for School A, IQRA = 9 years, where the middle 50% of the teachers are aged between 27 and 36 years. Meanwhile, for School B, IQRB = 11 years, where the middle 50% of the teachers are aged between 36 and 47 years. Hence, the variation in teachers’ ages at School B is higher than at School A (IQRA < IQRB).

Range:

Without the outlier, teachers’ ages at School A vary less, from a minimum of 24 years to a maximum of 38 years, compared with School B, where ages range from a minimum of 22 years to a maximum of 54 years.

Boxplot for Special Case

In some cases, we cannot use the general guideline as given above to interpret the

boxplot.

Boxplot is not the best graphical representation to describe a data set if the sample

size of the data set is too small.

The existence of outliers also may affect the boxplot.

Therefore, in such cases, we have to use the descriptive statistics to identify the

distribution of the data set.


EXERCISE 1.4.2 (Q1)

1. Plot a boxplot for the following data. Then describe the data.

a) 3.2, 5.9, 4.3, 6.9, 4.5, 8.0, 4.7, 8.9, 5.7, 11.9

1.4.2 (Q1) solution

EXERCISE 1.4.2(Q2)

2. Two samples of ten springs made out of the steel rods supplied by

two different companies were compared. The measurement of

flexibility (in N/m) for each spring was recorded as follows. Compare

the distributions using box-plots.

8.8 9.2 9.3

Company B: 9.6 9.7 9.8 9.9 10.1 10.2 11.0 11.0 11.0 11.1

companies.

Answers:
Company A: Min = 6.7, Q1 = 7.3, Q2 = 8.25, Q3 = 8.8, Max = 9.3, 4.2 is an outlier, left-skewed
Company B: Min = 9.6, Q1 = 9.8, Q2 = 10.15, Q3 = 11.0, Max = 16.4, no outlier, right-skewed

1.4.2 (Q2) solution

EXERCISE 1.4.2 (Q3)

3. The following Table presents viscosity (in Pascal) of chemical substance from

three (3) batches of chemical process.

Batches Viscosity

Batch A 13.3 14.1 14.3 14.5 14.5 14.6 14.8 15.2 15.3 15.3

Batch B 13.3 13.7 14.1 14.5 14.9 15.2 15.3 15.4 15.6 15.8

Batch C 13.4 13.7 14.1 14.3 14.3 14.8 15.1 15.8 16.4 16.9

Measure | Batch A | Batch B | Batch C
1st quartile | 14.30 | 14.10 | ____
Median | 14.55 | ____ | 14.55
3rd quartile | ____ | 15.40 | 15.80
Outlier | No | ____ | No

b) Draw three boxplots on the same x-axis by using the information in (a).

c) Compare the boxplots in terms of shape and variability.

Answers: Batch A: Q3 = 15.2, right-skewed; Batch B: Q2 = 15.05, no outlier, left-skewed; Batch C: Q1 = 14.1, right-skewed

1.4.2 (Q3) solution

[Figure: boxplots of viscosity for Batch A, Batch B and Batch C]

MIND EXPANDING EXERCISES

ME.15


MIND EXPANDING EXERCISES

15. An experiment was conducted to assess the potency of various constituents of

orchard sprays in repelling honeybees. Individual cells of dry comb were filled

with measured amounts of lime Sulphur emulsion in sucrose solution. Seven

different concentrations of lime Sulphur ranging from a concentration of 1/100

to 1/1,562,500 in successive factors of 1/5 were used as well as a solution

containing no lime Sulphur (A, B, C, D, E, F, G, H). The responses for the

different solutions were obtained by releasing 100 bees into the chamber for

two hours, and then measuring the decrease in volume of the solutions in the

various cells. Based on the figure below, answer the following questions:

a) Which concentration has outlier(s)?

b) Group the concentration according to their shape of distribution.

c) Which concentration has the most consistent data? Why?

d) Which concentration has the most variable data? Why?

e) H is the concentration of ‘no lime sulphur’. What is the use of

concentration H?

f) What conclusion can you draw from this experiment?


1.5 NORMAL PROBABILITY PLOT

Normal Probability Plots

The easiest way to check whether the sample distribution is normal or not.

The most plausible normal distribution is the one whose mean and standard deviation

are the same as the sample mean and standard deviation.

STEP 1 : Sort the data in ascending order and denote each sorted value as $x_i$, $i = 1, \ldots, n$.

STEP 2 : Number the sorted data from 1 to n.

STEP 3 : Calculate the probability value for each $x_i$ using $p_i = \frac{i - 0.5}{n}$.

STEP 4 : Plot $p_i$ versus $x_i$.

If the plotted points lie approximately on a straight line, the data is approximately normally distributed.
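The steps above can be sketched in Python using only the standard library. The six-value sample is illustrative, and the function name `plotting_positions` is hypothetical. As an addition that is not part of the steps on this slide, the sketch also converts each p_i to a standard normal quantile z_i; plotting z_i against x_i is a common variant in which normal data fall on a straight line on ordinary axes.

```python
from statistics import NormalDist

def plotting_positions(data):
    """Steps 1 to 3: sort the data and pair each x_i with p_i = (i - 0.5) / n."""
    xs = sorted(data)
    n = len(xs)
    return [(x, (i - 0.5) / n) for i, x in enumerate(xs, start=1)]

sample = [3.01, 3.35, 4.79, 5.96, 7.89, 9.15]   # illustrative sample
points = plotting_positions(sample)

# Optional variant (assumption): standard normal quantile for each p_i.
z_scores = [NormalDist().inv_cdf(p) for _, p in points]

for (x, p), z in zip(points, z_scores):
    print(f"x = {x:5.2f}   p = {p:.4f}   z = {z:+.3f}")
```

Step 4 then amounts to a scatter plot of the printed (x, p) pairs, or of the (x, z) pairs for the variant, using any plotting tool.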


Testing Normality using Software

Instead of plotting manually, we can obtain the plot from software such as SPSS, Minitab and Excel. The normality of the data can also be tested using the Kolmogorov-Smirnov, Anderson-Darling or Shapiro-Wilk tests.

EXAMPLE 1.13

The figure above is known as the normal probability plot. Since the data lie approximately on a straight line, the data are approximately normally distributed.

EXERCISE 1.5

1. A sample of size six is drawn. The sample, arranged in

increasing order, is

3.01 3.35 4.79 5.96 7.89 9.15

Do these data appear to come from an approximately normal

distribution?

14-year period.

2084 1497 1014 910 899 870 859

848 837 826 815 750 737 637

Do these data appear to come from an approximately normal

distribution?

Answers: 1) yes 2) no

1.5 (Q1) solution

[Figure: normal probability plot for the Q1 data]

1.5 (Q2) solution

[Figure: normal probability plot for the Q2 data, with Pi on the vertical axis and xi on the horizontal axis]

CONCLUSION

• The applications of statistics are

many and varied. People

encounter them in everyday life,

such as in reading newspapers or

magazines, listening to the radio,

or watching television.

descriptive statistics techniques

discussed in this chapter

together, the student is now able

to collect, organize, summarize

and present data.

Thank You

NEXT: Chapter 2 Sampling Distribution and Confidence Interval


