Data Science Assignment (Probability Manual)

Umair Sajid Minhas
F141ABCSE010
4-Mar-17
Data Sciences
SUBMITTED TO: SIR BABER YAQOOB
TABLE OF CONTENTS
Introduction
Mathematical representation
Mean
  Example
Mode
  Example
Median
  Example
Standard Deviation
Variance
Correlation
Probability Distribution
Normal Distribution
Binomial Distribution
  Formula
  Example
Poisson
  Formula
Discrete
Regression
  Example
Linear Regression
Non-Linear Regression
Logistic Regression
Statistical Inference
INTRODUCTION
Data science is the application of computational and statistical
techniques to address or gain insight into some problem in the real
world.
We use mathematical operations to analyze data. Some statistical
formulas are defined below.
MATHEMATICAL REPRESENTATION
MEAN
For a population or a sample, the mean is the arithmetic average of all
values. The mean is a measure of central tendency or location.
EXAMPLE
Mean of 2, 10, 23, 42, 21, 32, 43, 53, 54:
\[ \text{Mean} = \frac{2+10+23+42+21+32+43+53+54}{9} = \frac{280}{9} \approx 31.1 \]
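The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the calculation; the variable names are chosen here for illustration.

```python
# Values taken from the mean example in this manual.
values = [2, 10, 23, 42, 21, 32, 43, 53, 54]

# Arithmetic mean: the sum of the values divided by how many there are.
mean = sum(values) / len(values)

print(f"sum = {sum(values)}, n = {len(values)}, mean = {mean:.1f}")
# prints: sum = 280, n = 9, mean = 31.1
```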
MODE
The mode is a value that occurs with the greatest frequency in a
population or a sample. It could be considered as the single value most
typical of all the values.
EXAMPLE
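As an illustration, the mode can be found by counting how often each value occurs. The sketch below reuses the quiz scores from the median example in this manual (90, 92, 93, 88, 95, 88, 97, 87, 98); the variable names are assumptions made for this sketch.

```python
from collections import Counter

# Quiz scores borrowed from the median example in this manual.
scores = [90, 92, 93, 88, 95, 88, 97, 87, 98]

# Count the occurrences of each value, then keep the most frequent one(s).
counts = Counter(scores)
highest = max(counts.values())
modes = sorted(value for value, count in counts.items() if count == highest)

print(modes, highest)  # prints: [88] 2
```

A data set can have more than one mode, which is why the sketch collects every value that reaches the highest count.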
MEDIAN
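The median is the number that is halfway into the set when the values
are sorted in order. If the set has an even number of values, the
median is the average of the two middle values.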
EXAMPLE
During the first marking period, Nicole's math quiz scores were 90, 92,
93, 88, 95, 88, 97, 87, and 98. What was the median quiz score?
Ordering the data from least to greatest, we get: 87, 88, 88, 90, 92,
93, 95, 97, 98. The median quiz score was 92.
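The same steps (sort, then take the middle value) can be sketched in Python. The function name `median` is chosen here for illustration; the even-length branch is included for completeness even though this example has nine scores.

```python
def median(values):
    """Return the middle value of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one middle value.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Quiz scores as given in the question.
scores = [90, 92, 93, 88, 95, 88, 97, 87, 98]

print(sorted(scores))  # prints: [87, 88, 88, 90, 92, 93, 95, 97, 98]
print(median(scores))  # prints: 92
```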
STANDARD DEVIATION
\[ S = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}} \]
Where $S$ is the standard deviation of a sample.
$\sum$ = sum of
$X$ = each value in the data set.
$\bar{X}$ = mean of all values in the data set.
$N$ = number of values in the data set.
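A minimal sketch of this calculation in Python follows. It assumes the usual sample formula, with N - 1 in the denominator, and reuses the values from the mean example. The function name `sample_std` is hypothetical; the standard library's `statistics.stdev` is used only as a cross-check.

```python
import math
import statistics

# Values reused from the mean example in this manual.
values = [2, 10, 23, 42, 21, 32, 43, 53, 54]


def sample_std(data):
    """Sample standard deviation, assuming an N - 1 denominator."""
    n = len(data)
    mean = sum(data) / n
    # Sum of squared deviations of each value from the mean.
    squared_deviations = sum((x - mean) ** 2 for x in data)
    return math.sqrt(squared_deviations / (n - 1))


s = sample_std(values)
print(f"S = {s:.2f}")

# Cross-check against the standard library implementation.
print(math.isclose(s, statistics.stdev(values)))
```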
VARIANCE
Variance is the square of the standard deviation. The variance of a
random variable X is the expected value of the squared deviation from
the mean.
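To make the "expected value of the squared deviation" concrete, the sketch below computes the variance of one roll of a fair six-sided die. The die is an example chosen here, not one taken from the manual. Exact fractions are used so that no rounding is involved.

```python
import math
from fractions import Fraction

# Assumed example: a fair die, each face has probability 1/6.
outcomes = range(1, 7)
p = Fraction(1, 6)

# Expected value (mean) of the random variable.
mean = sum(x * p for x in outcomes)

# Variance: expected value of the squared deviation from the mean.
variance = sum((x - mean) ** 2 * p for x in outcomes)

# Standard deviation is the square root of the variance.
std = math.sqrt(variance)

print(mean, variance)  # prints: 7/2 35/12
print(f"{std:.4f}")
```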
CORRELATION
Correlation (from co-, meaning "together", and relation) is a
statistical technique that can show whether and how strongly pairs of
variables are related.
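One common measure of correlation is the Pearson correlation coefficient, which ranges from -1 to +1. The manual does not name a specific coefficient, so the choice of Pearson here is an assumption, and the two small data sets are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (denominator terms).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up data: y rises with x in the first pair, falls in the second.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]
falling = [10, 8, 6, 4, 2]

print(pearson_r(x, rising))   # perfectly positive relationship
print(pearson_r(x, falling))  # perfectly negative relationship
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.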
PROBABILITY DISTRIBUTION
A listing of all the values a random variable can assume, together with
their corresponding probabilities, makes up a probability distribution.
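As a small illustration, the sketch below lists the probability distribution of the number of heads in two tosses of a fair coin. The coin example is an assumption made here and does not come from the manual.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two fair coin tosses: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# Random variable: the number of heads in each outcome.
head_counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability of each value = favourable outcomes / total outcomes.
distribution = {
    heads: Fraction(count, len(outcomes))
    for heads, count in sorted(head_counts.items())
}

for heads, probability in distribution.items():
    print(heads, probability)

# The probabilities of a distribution always add up to 1.
print(sum(distribution.values()))  # prints: 1
```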
NORMAL DISTRIBUTION
When we gather data, we often see a particular pattern in the data,
called a normal distribution. In a normal distribution the data are
symmetrically distributed around the mean, so a histogram of the data
forms a bell curve. The normal distribution is also known as the
Gaussian distribution.
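The bell shape comes from the normal density function. The sketch below writes that density out by hand and checks its symmetry around the mean. The mean of 0 and standard deviation of 1 are assumed values for illustration, and `statistics.NormalDist` (Python 3.8 or later) is used for the cumulative probability.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)


# Assumed parameters: a standard normal distribution.
mu, sigma = 0.0, 1.0

# Symmetry: points equally far from the mean have the same density.
left = normal_pdf(mu - 1.5, mu, sigma)
right = normal_pdf(mu + 1.5, mu, sigma)
print(math.isclose(left, right))

# Share of the distribution within one standard deviation of the mean.
dist = NormalDist(mu, sigma)
within_one_sd = dist.cdf(mu + sigma) - dist.cdf(mu - sigma)
print(f"{within_one_sd:.4f}")  # roughly 68% of the data
```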
BINOMIAL DISTRIBUTION
FORMULA
\[ P(x) = \binom{n}{x} p^{x} q^{n-x}, \qquad \binom{n}{x} = \frac{n!}{x!\,(n-x)!} \]
S (success) and F (failure) denote the two possible categories for
every outcome, with probabilities P(S) = p and P(F) = q = 1-p.
n indicates the fixed number of trials.
x indicates the number of successes (any whole number from 0 to n).
p indicates the probability of success for any one trial.
q indicates the probability of failure (not success) for any one trial.
P(x) indicates the probability of getting exactly x successes in n
trials.
EXAMPLE
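As a worked illustration, the sketch below computes P(x) with the standard binomial formula, P(x) = C(n, x) p^x q^(n-x). The scenario (10 tosses of a fair coin, exactly 3 heads) is assumed here and does not come from the manual. `math.comb` requires Python 3.8 or later.

```python
import math


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1.0 - p  # probability of failure in one trial
    return math.comb(n, x) * p ** x * q ** (n - x)


# Assumed scenario: 10 tosses of a fair coin, exactly 3 heads.
n, p = 10, 0.5
print(binomial_pmf(3, n, p))  # prints: 0.1171875

# The probabilities over all possible x (0 to n) add up to 1.
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
print(math.isclose(total, 1.0))
```

Here C(10, 3) = 120 and 0.5^10 = 1/1024, so P(3) = 120/1024.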
POISSON
The Poisson distribution is a discrete probability distribution for the
count of events that occur randomly in a given interval of time.
FORMULA
\[ P(X) = \frac{\lambda^{X} e^{-\lambda}}{X!} \]
$X$ = the number of events.
$\lambda$ = mean number of events per interval.
$e$ = Euler's number, a constant (e = 2.71828...).
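A short sketch of the Poisson probability, P(X) = lambda^X e^(-lambda) / X!, follows. The rate of 3 events per interval is an assumed value for illustration, and the function name `poisson_pmf` is hypothetical.

```python
import math


def poisson_pmf(x, lam):
    """Probability of exactly x events when the mean per interval is lam."""
    return lam ** x * math.exp(-lam) / math.factorial(x)


# Assumed rate: on average 3 events per interval.
lam = 3.0
for x in range(5):
    print(x, f"{poisson_pmf(x, lam):.4f}")

# Summing over enough values of x gets arbitrarily close to 1.
total = sum(poisson_pmf(x, lam) for x in range(50))
print(math.isclose(total, 1.0))
```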
DISCRETE
A discrete distribution describes the probabilistic properties of a
variable that takes pre-defined values, either finitely many or
countably infinitely many. Unlike a continuous distribution, which has
an uncountably infinite number of outcomes, a discrete distribution is
characterized by a countable set of possible observations.
REGRESSION
A regression equation is used in statistics to find out what
relationship, if any, exists between sets of data.
EXAMPLE:
If you measure a child's height every year, you might find that the
child grows about 3 inches a year. That trend (growing three inches a
year) can be modeled with a regression equation.
LINEAR REGRESSION
Linear regression is a commonly used form of predictive analysis.
Linear regression requires a linear model. It attempts to model the
relationship between two variables by fitting a linear equation to
observed data.
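A minimal sketch of fitting a line by ordinary least squares follows. The manual does not name a fitting method, so least squares is an assumption. The ages and heights are made-up numbers that echo the child-height example (growth of 3 inches a year); they are not measurements.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: age in years and height in inches.
ages = [2, 3, 4, 5, 6]
heights = [34, 37, 40, 43, 46]

slope, intercept = fit_line(ages, heights)
print(slope, intercept)  # prints: 3.0 28.0

# Use the fitted equation to predict the height at age 7.
print(intercept + slope * 7)  # prints: 49.0
```

The slope is the estimated growth per year, which is the trend the regression equation captures.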
NON-LINEAR REGRESSION
Non-linear regression is a form of regression analysis in which
observational data are modeled by a function which is a nonlinear
combination of the model parameters and depends on one or more
independent variables.
LOGISTIC REGRESSION
Logistic regression is used to describe data and to explain the
relationship between a binary (success-failure) outcome (response)
variable and one or more predictor variables.
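Logistic regression models the probability of success with the logistic (sigmoid) function applied to a linear combination of the predictors. The sketch below only evaluates such a model; it does not fit one. The coefficients `B0` and `B1` and the "hours studied" predictor are hypothetical values chosen for illustration.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients, not estimated from any real data.
B0 = -4.0  # intercept
B1 = 1.0   # effect of one extra unit of the predictor


def predict_probability(x):
    """Modeled probability of success for predictor value x."""
    return sigmoid(B0 + B1 * x)


# Hypothetical predictor: hours studied; outcome: pass (success) or fail.
for hours in (2, 4, 6):
    print(hours, round(predict_probability(hours), 3))
```

With these assumed coefficients the modeled probability of success is exactly 0.5 at x = 4 and rises as x increases.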
STATISTICAL INFERENCE
Statistical inference means drawing conclusions based on data. There
are many contexts in which inference is desirable, and there are many
approaches to performing inference.
