
Chapter 6: Statistical Hydrology

6.1 Introduction
Random variable
A variable whose outcome varies from trial to trial as the experiment is repeated.
There are two types: discrete and continuous.
A discrete random variable is described by a probability distribution (represented by a histogram, table or formula);
a continuous random variable is described by a probability density function (probabilities are areas under the curve).
Discrete Random Variable
A random variable which may take on only a countable number of distinct values
Continuous Random Variable
A random variable which can take any value within an interval, i.e. an uncountably infinite number of possible values.
Independent Random Variables
Two random variables X and Y say, are said to be independent if and only if the value of X has
no influence on the value of Y and vice versa.
Sample
A set of observations of a random variable.
Population
A hypothetical infinite set having constant statistical properties.
Variable
A characteristic of the sample, for example the depth of rainfall.
Variate (x)
An individual observation, i.e. a value of the variable.
Why is the concept of probability important in hydrology?
Because hydrological phenomena are random in nature.
Objectives of statistics in hydrology
1. Interpretation of observations.
2. Search for hydrologic probabilistic regularities.
3. Extraction of maximum information from hydrologic data.
4. Presentation of hydrologic information.

Various Measures in statistics


Measures of central tendency
Mean: average
Median: Middle value in rank
Mode: Most frequent value
Measures of dispersion (precision)
Range: Difference between largest and smallest
Variance: spread
Standard deviation: square root of variance
Measures of association
Correlation

Frequency
For discrete random variable, the number of occurrences of a variate is generally called frequency. When
the number of occurrences of a variate, or the frequency, is plotted against the variate as the abscissa, a
pattern of distribution is obtained. The pattern is called the frequency distribution.
Interpretation of probability
Classical interpretation
o Outcome: each possible distinct result, event: collection of outcomes
o P(Event E) = Number of favorable outcomes/Total number of outcomes
o Assumption: All outcomes equally likely
Relative frequency interpretation
o If an experiment is conducted n times and event E occurs on $n_e$ of these trials, the probability of event E is approximately
$$P(\text{Event } E) \approx \frac{n_e}{n}$$

Sample
Relative frequency function
Cumulative frequency function

Population
Probability density function
Probability distribution function

Relative frequency function


If the number of observations $n_i$ in interval i is divided by the total number of observations n, the result is called the relative frequency function $f_s(x)$:
$$f_s(x_i) = \frac{n_i}{n}$$
It is the sample estimate of the probability that an observation falls in interval i.
Cumulative frequency function
The sum of the values of the relative frequencies up to a given point is the cumulative frequency function $F_s(x)$:
$$F_s(x_i) = \sum_{j=1}^{i} f_s(x_j)$$

Cumulative Distribution Function(CDF)


Function giving the probability that the random variable X is less than or equal to x, for every
value x.
$$F(x) = P(X \le x) \quad \text{for } -\infty < x < \infty$$

For a discrete random variable, the cumulative distribution function is found by summing up the
probabilities.
For a continuous random variable, the cumulative distribution function is the integral of its
probability density function.

Properties of CDF

$0 \le \text{CDF} \le 1$
Continuous, has derivative
Non-decreasing function

Probability Density Function(PDF)


Representation of randomness for continuous random variable
Equal to the derivative of the cumulative distribution function
Can be integrated to obtain the probability that the random variable takes a value in a given interval.

$$f(x) = \frac{dF(x)}{dx}$$

Flood frequency
Flood frequency refers to the probability of occurrence of a flood. If a flood of certain magnitude occurs
m times in n years, then frequency of flood is m/n.
Recurrence interval or return period (T)
Return period is defined as the average interval of time T within which an event of given magnitude will
be equalled or exceeded at least once. In hydrology, it is the average interval between the occurrence of
flood equal to or greater than a given magnitude. The return period is widely used in hydrologic
frequency analysis.
Risk (R)
The probability that the event ($x \ge x_T$) occurs at least once over a period of n successive years is called the risk (R). R represents the probability of failure of a structure designed for return period T.
$$R = 1 - \left(1 - \frac{1}{T}\right)^n$$

Formulae
a. Probability of occurrence of an event with return period T in any one year: $P = \frac{1}{T}$
b. Probability of non-occurrence of the event in any one year: $1 - P = 1 - \frac{1}{T}$
c. Probability of non-occurrence in n successive years: $P_n = (1 - P)^n = \left(1 - \frac{1}{T}\right)^n$
d. Probability of occurrence at least once in n successive years: $1 - P_n = 1 - \left(1 - \frac{1}{T}\right)^n$
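The relations between return period, annual exceedance probability and risk can be checked numerically. The following is a minimal Python sketch; the function names and the example values (T = 100 years, n = 50 years) are illustrative assumptions, not part of the notes.

```python
def annual_exceedance_probability(T):
    """Probability that the T-year event is equalled or exceeded in any one year."""
    return 1.0 / T


def risk(T, n):
    """Probability that the T-year event occurs at least once in n successive years."""
    p = annual_exceedance_probability(T)
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    T, n = 100, 50  # hypothetical design return period and project life
    print(f"P = {annual_exceedance_probability(T):.3f}")
    print(f"Risk over {n} years = {risk(T, n):.3f}")
```

Even a structure designed for the 100 year flood carries a risk of roughly 40% over a 50 year life, which is why the risk formula is usually preferred to the return period alone when selecting a design flood.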

Test of significance
Larger samples tend to follow the normal curve, and the properties of the normal curve are used to test the significance of the difference between two samples.
Level of significance: a difference in sample means greater than $2\sigma$ (two standard errors) is considered significant because the probability of this occurring by chance is less than 5%.
The t test
When the number of observations in a sample is small, i.e. when n < 30, the distribution of the ratio $(\bar{x} - \mu)/s$ is generally not normal and the tests based on the normal curve fail. In such cases Student's t test is used, so named because it was published under the pseudonym "Student".

$$t = \frac{(\bar{x} - \mu)\,\sqrt{n}}{s}$$

where n = size of the sample
$\mu$ = mean of the population
$\bar{x}$ = mean of the sample
s = standard deviation of the sample, calculated from
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

If the calculated value of t is smaller than the value obtained from the t table, the difference is not significant.
Design flood
A flood adopted for the design of a structure on considerations of safety, economy, life expectancy and probable damage is called the design flood.
Depending upon the magnitude, floods can be classified into the following classes.
a. Ordinary flood: a flood that is sure to be equaled in magnitude once or more in the estimated life of the project.
b. Frequency based flood (FBF): a design flood estimated using flood frequency analysis.
c. Standard Project Flood (SPF): a flood that is likely to be exceeded in magnitude only on rare occasions. Generally, it is equal to 40 to 60% of the probable maximum flood (PMF). The SPF is computed from the standard project storm, based on storms that have occurred over the project area under consideration or over adjoining areas with similar hydrometeorological and basin characteristics.
d. Probable Maximum Flood (PMF): the flood that might occur under the worst meteorological and hydrological conditions. In other words, the PMF is the extreme flood that is physically possible in a region as a result of the most severe combination of conditions.
Binomial distribution
Each trial has 2 possible states: occurrence or non-occurrence of event
Probability of occurrence is constant for all trials
Trials are statistically independent
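The probability of occurrence of the event exactly r times in n successive years is given by the binomial distribution:
$$P(r, n) = {}^{n}C_r\, p^r q^{n-r} = \frac{n!}{r!\,(n-r)!}\, p^r q^{n-r}$$
where p = probability of occurrence in any one year and q = 1 - p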
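As an illustration of the binomial distribution applied to floods, the sketch below computes the probability that an event with annual probability p occurs exactly r times in n years. The function name and the example (a 50 year flood in a 10 year period) are assumptions chosen for illustration.

```python
from math import comb


def binomial_probability(r, n, p):
    """Probability of exactly r occurrences in n independent years,
    each with probability of occurrence p (and q = 1 - p)."""
    q = 1.0 - p
    return comb(n, r) * p**r * q ** (n - r)


if __name__ == "__main__":
    p = 1.0 / 50  # hypothetical 50 year flood
    for r in range(3):
        print(f"P({r} occurrences in 10 years) = {binomial_probability(r, 10, p):.4f}")
```

Setting r = 0 recovers the probability of non-occurrence in n years, $(1 - P)^n$, so one minus that value is the risk.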

6.2 Continuous probability distributions commonly used in hydrology


a. Normal distribution
The normal distribution arises from the central limit theorem, which states that if a sequence of random variables $X_i$ are independently and identically distributed with mean $\mu$ and variance $\sigma^2$, then the distribution of the sum of n such random variables,
$$y = \sum_{i=1}^{n} X_i ,$$
tends towards the normal distribution with mean $n\mu$ and variance $n\sigma^2$ as n becomes large.

The PDF of the normal distribution is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right], \quad -\infty < x < \infty$$
where $\mu$ = population mean and $\sigma$ = population standard deviation are the parameters of the distribution. The values of the parameters are estimated as
$$\mu = \bar{x}, \quad \sigma = S_x$$
where $\bar{x}$ = sample mean and $S_x$ = sample standard deviation.
If $z = \frac{x - \mu}{\sigma}$, then $Z \sim N(0, 1)$, which is called the standard normal distribution.

Properties of normal distribution


Bell shaped
Symmetric about mean
Unbounded
For the normal curve, the mean, median and mode coincide. The normal (Gaussian) frequency distribution is the most important in statistical theory. Most hydrological data are not normally distributed, but they can sometimes be normalized by transformations such as taking logarithms or cube roots of the sample values. Hydrological variables such as annual precipitation, which are the sum of the effects of many independent events, tend to follow the normal distribution.
b. Lognormal distribution
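If the random variable Y = log X is normally distributed, then X is said to be lognormally distributed. The PDF of the lognormal distribution is given by
$$f(x) = \frac{1}{x\,\sigma_y\sqrt{2\pi}} \exp\left[-\frac{(y-\mu_y)^2}{2\sigma_y^2}\right], \quad x > 0$$
where y = log x.

The parameters of the distribution are estimated as
$$\mu_y = \bar{y}, \quad \sigma_y = S_y$$
where $\bar{y}$ = sample mean and $S_y$ = sample standard deviation of y.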
Properties of Lognormal distribution


X ranges from 0 to $\infty$ (bounded below at zero).
X is positively skewed.
The distribution tends to be symmetric as $\sigma_y$ decreases.
c. Exponential distribution
The exponential distribution is useful for instantaneously and independently occurring events, e.g. the occurrence of precipitation or of floods. The PDF of the exponential distribution is given by
$$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0$$
where $\lambda$ = parameter, which is estimated by
$$\lambda = \frac{1}{\bar{x}}$$
d. Gamma distribution
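The probability density function of the gamma distribution is given by
$$f(x) = \frac{\lambda^{\beta} x^{\beta-1} e^{-\lambda x}}{\Gamma(\beta)} \quad \text{for } x \ge 0$$
where $\lambda$ and $\beta$ are parameters, estimated by
$$\lambda = \frac{\bar{x}}{S_x^2}, \quad \beta = \left(\frac{\bar{x}}{S_x}\right)^2$$
The symbol $\Gamma(\beta)$ denotes the gamma function.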

The gamma distribution is useful to find the time taken for a particular event to occur in a Poisson process
(instantaneously and independently occurring event). The gamma distribution has a smoothly varying
form and is useful for describing skewed hydrological variables without the need for log transformation,
for example, distribution of depth of precipitation in storms. The distribution has lower bound at zero,
which is a disadvantage for applications to hydrological variables that have lower bound larger than zero.
e. Pearson type III (Three parameter gamma) distribution
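The PDF of the Pearson type III distribution is given by
$$f(x) = \frac{\lambda^{\beta} (x-\varepsilon)^{\beta-1} e^{-\lambda(x-\varepsilon)}}{\Gamma(\beta)}, \quad x \ge \varepsilon$$
where $\lambda$, $\beta$ and $\varepsilon$ are parameters, estimated by
$$\lambda = \frac{S_x}{\sqrt{\beta}}, \quad \beta = \left(\frac{2}{C_s}\right)^2, \quad \varepsilon = \bar{x} - S_x\sqrt{\beta}$$
This distribution is also called the three parameter gamma distribution because it adds one more parameter ($\varepsilon$, the lower bound) to the gamma distribution. By the method of moments, the three sample moments (mean $\bar{x}$, standard deviation $S_x$ and coefficient of skewness $C_s$) can be transformed into the three parameters $\lambda$, $\beta$ and $\varepsilon$. This distribution can be used to describe the distribution of annual maximum floods.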
f. Log-Pearson type III (LPIII) distribution
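If log X follows a Pearson type III distribution, then X is said to follow the Log-Pearson type III distribution.
The PDF of the LPIII distribution is given by
$$f(x) = \frac{\lambda^{\beta} (y-\varepsilon)^{\beta-1} e^{-\lambda(y-\varepsilon)}}{x\,\Gamma(\beta)}, \quad \log x \ge \varepsilon$$
where y = log x and
$$\lambda = \frac{S_y}{\sqrt{\beta}}, \quad \beta = \left[\frac{2}{C_s(y)}\right]^2, \quad \varepsilon = \bar{y} - S_y\sqrt{\beta}$$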

The log transformation reduces the skewness of the transformed data. This distribution is widely used for
the frequency analysis of the annual maximum floods.
The detailed explanation of this distribution is given in section 8.12.
g. Gumbel (Extreme value type I) distribution
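The PDF of the Gumbel distribution is given by
$$f(x) = \frac{1}{\alpha} \exp\left[-\frac{x-u}{\alpha} - \exp\left(-\frac{x-u}{\alpha}\right)\right], \quad -\infty < x < \infty$$
with parameters
$$\alpha = \frac{\sqrt{6}\,S_x}{\pi}, \quad u = \bar{x} - 0.5772\,\alpha$$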

6.3 Statistical techniques for frequency analysis


Hydrologic extremes: floods, drought, severe storms
Objective of frequency analysis: to relate the magnitude of extreme events to their frequency of
occurrence through the use of probability distribution
Application of result of frequency analysis
For the design of dams, bridge, culverts, and flood control structures
To determine the economic value of flood control works
To delineate flood plains

Commonly used Statistical techniques based on probability distribution


I. Gumbel's distribution
Gumbel's distribution (extreme value type I) is the most widely used distribution for the analysis of floods, maximum rainfall, etc., and it is general practice to fit it to the flood discharges of rivers. Gumbel (1941) proposed this concept; he defined the flood as the largest of the 365 daily flows in a year.
The cumulative probability distribution function of Gumbel's distribution is
$$F(x) = \exp[-\exp(-y)], \quad -\infty < x < \infty$$
where
$$y = \frac{x - u}{\alpha}$$
Parameters:
$$\alpha = \frac{\sqrt{6}\,S_x}{\pi}, \quad u = \bar{x} - 0.5772\,\alpha$$
$\bar{x}$ and $S_x$ are the mean and standard deviation of the flood series.

Reduced variate, $y_T$:
$$y_T = -\ln\left[\ln\left(\frac{T}{T-1}\right)\right]$$
Value of variate for recurrence interval T:
$$x_T = u + \alpha\, y_T$$

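Flood frequency study using Gumbel's method

The value of the variate $x_T$ with return period T is
$$x_T = \bar{x} + K\,S_x$$
where
$x_T$ = value of the variate with return period T
$\bar{x}$ = mean of the variate
$S_x$ = standard deviation of the variate
K = frequency factor
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
$$S_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$$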

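Computation of K
Method 1: Using the mean and standard deviation of the reduced variate
$$K = \frac{y_T - \bar{y}_n}{S_n}$$
$y_T$ = reduced variate, which is given by
$$y_T = -\ln\left[\ln\left(\frac{T}{T-1}\right)\right]$$
$\bar{y}_n$ = reduced mean
$S_n$ = reduced standard deviation
$\bar{y}_n$ and $S_n$ are both functions of the sample size N and are obtained from tabulated values.
Method 2: Computation of K from the K-T relationship for large samples
For large samples (N > 100), K can be computed by Chow's formula. (Note: if the table or the values of $\bar{y}_n$ and $S_n$ are not given, use this formula as well, assuming a large sample.)
$$K = -\frac{\sqrt{6}}{\pi}\left\{0.5772 + \ln\left[\ln\left(\frac{T}{T-1}\right)\right]\right\}$$
Or,
$$K = -\left[0.45 + 0.7797\,\ln\left(\ln\frac{T}{T-1}\right)\right]$$
Alternatively, as $N \to \infty$, $\bar{y}_n \to 0.577$ and $S_n \to 1.2825$.
So, as N tends to infinity, K can also be computed from
$$K = \frac{y_T - 0.577}{1.2825}$$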

Procedure to estimate the flood magnitude for given return period using Gumbels method
1. Compute the mean, $\bar{x}$, and the standard deviation, $S_x$, of the given data.
2. Compute the frequency factor K using method 1 for a small sample size or method 2 for a large sample size.
3. Compute $x_T = \bar{x} + K\,S_x$.
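The procedure above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function names and the annual peak series are hypothetical, and the defaults for the reduced mean and reduced standard deviation are the large-sample limits (0.577 and 1.2825). For a small sample, the tabulated values for the actual sample size N should be passed instead.

```python
import math
import statistics


def reduced_variate(T):
    """Gumbel reduced variate y_T = -ln[ln(T / (T - 1))], valid for T > 1."""
    return -math.log(math.log(T / (T - 1.0)))


def frequency_factor(T, y_n=0.577, s_n=1.2825):
    """Frequency factor K = (y_T - y_n) / S_n.
    Defaults are the large-sample limits; pass tabulated values for small N."""
    return (reduced_variate(T) - y_n) / s_n


def gumbel_flood(peaks, T, y_n=0.577, s_n=1.2825):
    """Flood magnitude x_T = mean + K * S_x for return period T."""
    mean = statistics.mean(peaks)
    s_x = statistics.stdev(peaks)  # sample standard deviation (n - 1 divisor)
    return mean + frequency_factor(T, y_n, s_n) * s_x


if __name__ == "__main__":
    # Hypothetical annual maximum discharges in m3/s, for illustration only
    peaks = [1200, 980, 1560, 2100, 870, 1340, 1750, 1100, 2600, 1450]
    for T in (10, 50, 100):
        print(f"T = {T:>3} years: x_T = {gumbel_flood(peaks, T):.0f} m3/s")
```

With the large-sample constants, K is close to zero at T = 2.33 years, so the estimated flood at that return period is essentially the mean of the series.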

To verify whether the given data follow the assumed Gumbel's distribution

Plot the values of $x_T$ for different return periods on semi-log, log-log or Gumbel probability paper and check whether the plot is a straight line.

For large N, the mean annual flood corresponds to T = 2.33 years.


Confidence limit
Limit within which the true value is expected to lie with a given probability based on
sampling errors

Formula to compute confidence limit

Confidence limits for the variate $x_T$:
$$CL = x_T \pm f(c)\,S_e$$
CL = confidence limit
$x_T$ = value of the variate for the given return period T
f(c) = function of the confidence probability c (obtained from the table below)
$S_e$ = probable error, which is given by
$$S_e = b\,\frac{S_x}{\sqrt{N}}, \quad b = \sqrt{1 + 1.3K + 1.1K^2}$$
where K is the frequency factor
$S_x$ = standard deviation of the sample
N = sample size

Confidence probability, c (%):  50     68     80     90     95     99
f(c):                           0.674  1.00   1.282  1.645  1.96   2.58
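A short Python sketch of the confidence limit calculation follows. It assumes the commonly quoted form of the probable error for Gumbel's method, $S_e = b\,S_x/\sqrt{N}$ with $b = \sqrt{1 + 1.3K + 1.1K^2}$; the function name and the input values in the example are hypothetical.

```python
import math

# f(c) values from the table above, keyed by confidence probability in percent
F_C = {50: 0.674, 68: 1.00, 80: 1.282, 90: 1.645, 95: 1.96, 99: 2.58}


def confidence_limits(x_T, K, s_x, N, c=95):
    """Lower and upper confidence limits of x_T.
    Assumes S_e = b * s_x / sqrt(N), with b = sqrt(1 + 1.3 K + 1.1 K^2)."""
    b = math.sqrt(1.0 + 1.3 * K + 1.1 * K**2)
    S_e = b * s_x / math.sqrt(N)
    half_width = F_C[c] * S_e
    return x_T - half_width, x_T + half_width


if __name__ == "__main__":
    # Hypothetical values: x_T in m3/s, frequency factor, sample std dev, sample size
    lower, upper = confidence_limits(x_T=3200.0, K=3.137, s_x=520.0, N=30, c=95)
    print(f"95% confidence limits: {lower:.0f} to {upper:.0f} m3/s")
```

The interval widens as K grows, reflecting the larger sampling uncertainty of floods extrapolated to long return periods.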

II. Log-Pearson type III (LPIII) distribution

The Log-Pearson type III distribution is extensively used in the USA for frequency analysis of annual maximum floods. In this method the variate is first transformed into logarithmic form (base 10).
Steps for computation of flood using Log-Pearson type III
1. Transform the peak discharges (X) to logarithms of base 10: $y = \log_{10} X$.
2. Compute the mean ($\bar{y}$), standard deviation ($S_y$) and coefficient of skewness ($C_s$) of y.
3. Obtain the value of the frequency factor ($K_T$) for $C_s$ and the required return period (T) from the table for the Log-Pearson type III distribution, or by using the formulae.
4. Compute $y_T = \bar{y} + K_T\,S_y$.
5. Flood of return period T: $X_T = \text{antilog}(y_T) = 10^{y_T}$.

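Formulae to compute $K_T$
$$p = \frac{1}{T}$$
$$w = \left[\ln\left(\frac{1}{p^2}\right)\right]^{1/2} \quad (0 < p \le 0.5)$$
$$z = w - \frac{2.515517 + 0.802853\,w + 0.010328\,w^2}{1 + 1.432788\,w + 0.189269\,w^2 + 0.001308\,w^3}$$
$$K_T = z + (z^2 - 1)\,k + \frac{1}{3}(z^3 - 6z)\,k^2 - (z^2 - 1)\,k^3 + z\,k^4 + \frac{1}{3}k^5$$
$$k = \frac{C_s}{6}$$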
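The Log-Pearson type III steps can be sketched in Python as below. Several pieces are assumptions rather than content of the notes: the standard normal variate z uses a widely used rational approximation, $K_T$ uses the usual polynomial in k = Cs/6, the skewness uses the common sample estimator $C_s = n\sum(y-\bar{y})^3 / [(n-1)(n-2)S_y^3]$, and the peak series is hypothetical. Tabulated $K_T$ values should be preferred where available.

```python
import math


def standard_normal_variate(T):
    """Approximate standard normal z with exceedance probability p = 1/T (T > 1)."""
    p = 1.0 / T
    # The approximation holds for 0 < p <= 0.5; use symmetry for p > 0.5.
    tail = p if p <= 0.5 else 1.0 - p
    w = math.sqrt(math.log(1.0 / tail**2))
    z = w - (2.515517 + 0.802853 * w + 0.010328 * w**2) / (
        1.0 + 1.432788 * w + 0.189269 * w**2 + 0.001308 * w**3
    )
    return z if p <= 0.5 else -z


def frequency_factor(T, Cs):
    """Approximate Pearson type III frequency factor K_T for skewness Cs."""
    z = standard_normal_variate(T)
    k = Cs / 6.0
    return (z + (z**2 - 1) * k + (z**3 - 6 * z) * k**2 / 3.0
            - (z**2 - 1) * k**3 + z * k**4 + k**5 / 3.0)


def lp3_flood(peaks, T):
    """Flood of return period T from annual peaks (needs at least 3 values)."""
    y = [math.log10(x) for x in peaks]           # step 1: log transform
    n = len(y)
    y_bar = sum(y) / n                           # step 2: moments of y
    S_y = math.sqrt(sum((v - y_bar) ** 2 for v in y) / (n - 1))
    Cs = n * sum((v - y_bar) ** 3 for v in y) / ((n - 1) * (n - 2) * S_y**3)
    y_T = y_bar + frequency_factor(T, Cs) * S_y  # steps 3 and 4
    return 10.0 ** y_T                           # step 5: antilog


if __name__ == "__main__":
    # Hypothetical annual maximum discharges in m3/s, for illustration only
    peaks = [1200, 980, 1560, 2100, 870, 1340, 1750, 1100, 2600, 1450]
    for T in (10, 50, 100):
        print(f"T = {T:>3} years: X_T = {lp3_flood(peaks, T):.0f} m3/s")
```

When the skewness of the logarithms is zero, $K_T$ reduces to z and the method coincides with the lognormal distribution.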

6.4 Graphical method for frequency analysis


Probability plotting

Special probability paper or preparation of plot in graph for linearization


Fitting the data with straight line for interpolation and extrapolation.

Probability paper
Ordinate: value of x, e.g. flow
Abscissa: return period, exceedance probability or reduced variate
The ordinate and abscissa are designed so that the fitted data plot close to a straight line
Purpose of plot: linearization of data for interpolation, extrapolation and comparison
Plotting position
A probability value is assigned to each data point to be plotted.
Method
Arrange the data in descending order and assign ranks (m) starting from 1.
Compute the plotting position, $P(X \ge x_m)$, and the return period T (T = 1/P).
Plot the given data versus P or T.
Fit a straight line.
California formula for plotting position: $P(X \ge x_m) = m/n$ (simplest formula)
Weibull formula for plotting position: $P(X \ge x_m) = m/(n+1)$ (widely used)
where m = rank, n = total number of values
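The plotting position method can be illustrated with a few lines of Python. The sketch below uses the Weibull formula; the function name and the three-value data set are illustrative assumptions.

```python
def weibull_plotting_positions(data):
    """Return (rank m, value, exceedance probability P, return period T) rows,
    with the data ranked in descending order and P = m / (n + 1)."""
    ranked = sorted(data, reverse=True)
    n = len(ranked)
    rows = []
    for m, x in enumerate(ranked, start=1):
        p = m / (n + 1)
        rows.append((m, x, p, 1.0 / p))
    return rows


if __name__ == "__main__":
    # Hypothetical annual peaks in m3/s
    for m, x, p, T in weibull_plotting_positions([1560, 980, 2100]):
        print(f"m = {m}, x = {x}, P = {p:.2f}, T = {T:.2f} years")
```

Unlike the California formula, the Weibull formula never assigns P = 1 to the smallest value, which is one reason it is preferred for plotting on probability paper.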
Preparation of Gumbel probability paper on ordinary graph paper and finding the flood of required frequency

The relationship between a given variable such as discharge (x) and the reduced variate ($y_T$) is linear. For constructing Gumbel probability paper, the return period T can be computed for different values of the reduced variate $y_T$ from $T = 1/[1 - \exp(-\exp(-y_T))]$, as shown in the table below.

y_T        :  -2    -1    0     1     2     3      4      5     6       7
T (years)  :  1.0   1.1   1.6   3.2   7.9   20.6   55.1   149   403.9   1097

First mark the values of discharge on the Y-axis and the values of $y_T$ (say from -2 to 7) on the X-axis. For each value of $y_T$, the corresponding return period T (as shown in the table above) is marked on the X-axis below $y_T$.
Prepare a table of plotting positions and plot the data on the graph. Fit a straight line and extrapolate it to find floods of different frequencies.
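The return periods marked under each reduced variate on the X-axis follow from inverting $y_T = -\ln[\ln(T/(T-1))]$. The Python sketch below shows this inversion; the function name is an assumption used only for illustration.

```python
import math


def return_period_from_reduced_variate(y):
    """Invert y_T = -ln[ln(T / (T - 1))] to get T = 1 / (1 - exp(-exp(-y)))."""
    return 1.0 / (1.0 - math.exp(-math.exp(-y)))


if __name__ == "__main__":
    for y in range(-2, 8):
        T = return_period_from_reduced_variate(y)
        print(f"y_T = {y:>2}  ->  T = {T:.1f} years")
```

Because the axis is linear in $y_T$ but labelled in T, data that follow Gumbel's distribution plot as a straight line against the T labels.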

[Figure: Gumbel probability paper. X-axis: reduced variate $y_T$ with the corresponding return period T (years); Y-axis: discharge.]

(Alternatively, return period on log scale and discharge on linear scale will also plot as straight line. )

1.1.1 Rational Method

The rational method is a commonly used method for computing peak discharge from small basins.
The idea behind this method is that if rainfall of intensity i begins instantaneously and
continues indefinitely, the rate of runoff will increase until the time of concentration ($t_c$),
when all of the basin is contributing to flow at the outlet. After $t_c$, the runoff remains constant
at its peak value for the remaining period of rainfall ($t - t_c$). After the cessation of rain, the
runoff recedes gradually and becomes zero at time $t_c$ after the end of the peak.
The product of rainfall intensity i and basin area (A) is the inflow rate for the system. The
peak discharge is given by
$Q_p = C\,i\,A$ (in FPS units)
where $Q_p$ = peak discharge
C = runoff coefficient
A = basin area (in acres in FPS units)
i = mean intensity of rainfall for a duration equal to the time of concentration ($t_c$) and an
exceedance probability P
The runoff coefficient C is the ratio of runoff to rainfall and represents the total cumulative effect of
watershed losses. It depends on initial losses, depression storage, nature of soil, surface slope,
degree of saturation, rainfall intensity, and the geology and geohydrological characteristics
of the basin.
C varies from 0 to 1.

[Figure: Runoff hydrograph due to uniform rainfall. Y-axis: runoff and rainfall rates; X-axis: time (t). Labels: rainfall, peak value, end of rainfall, recession, runoff.]

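In SI units (computing Q in m3/s, given i in mm/hr and A in km2):
$$Q_p = C\,i\,A = C \left(\frac{i}{1000 \times 3600}\right)\left(A \times 10^6\right) = \frac{C\,i\,A}{3.6} = 0.278\,C\,i\,A$$

If A is in hectares (ha) and i is in mm/hr, then Q in m3/s is
$$Q_p = C\,i\,A = C \left(\frac{i}{1000 \times 3600}\right)\left(A \times 10^4\right) = \frac{C\,i\,A}{360}$$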
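The unit conversions in the rational formula are easy to get wrong, so a small Python sketch may help. The function name, the `area_unit` argument and the example values are assumptions made for illustration.

```python
def rational_peak_discharge(C, i_mm_per_hr, area, area_unit="km2"):
    """Peak discharge in m3/s from the rational formula.
    i is in mm/hr; area is in km2 (divide by 3.6) or ha (divide by 360)."""
    if not 0.0 <= C <= 1.0:
        raise ValueError("runoff coefficient C must lie between 0 and 1")
    if area_unit == "km2":
        return C * i_mm_per_hr * area / 3.6
    if area_unit == "ha":
        return C * i_mm_per_hr * area / 360.0
    raise ValueError("area_unit must be 'km2' or 'ha'")


if __name__ == "__main__":
    # Hypothetical basin: C = 0.5, i = 36 mm/hr, A = 10 km2 (= 1000 ha)
    print(rational_peak_discharge(0.5, 36.0, 10.0, "km2"))
    print(rational_peak_discharge(0.5, 36.0, 1000.0, "ha"))
```

Both calls describe the same basin and should give the same discharge, which is a quick check that the two conversion factors are consistent.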
Assumptions
The computed peak rate of runoff at the outlet point is a function of the average
rainfall rate during tc.
tc employed is the time for runoff to become established and flow from the most
remote part of the basin to the outlet.
Rainfall intensity is constant throughout the storm duration.
To get Qp, we need tc, i and C.
Rational method is useful for small catchments up to 50 sq. km
C is the runoff coefficient
Table 7.1: Runoff coefficients for rational formula
Type of basin                              C
Rocky and impermeable                      0.8 - 1.0
Slightly permeable, bare                   0.6 - 0.8
Cultivated or covered with vegetation      0.4 - 0.6
Cultivated absorbent soil                  0.3 - 0.4
Sandy soil                                 0.2 - 0.3
Heavy forest                               0.1 - 0.3

For a non-homogeneous basin, divide it into sub-basins, obtain C for each sub-basin and compute
the area-weighted average: $C = \frac{\sum C_i A_i}{\sum A_i}$

In the absence of data on rainfall intensity, i shall be estimated by
$$i = \frac{K\,T^{a}}{(t_c + b)^{n}}$$
where
T = return period = 1/P, where P = probability of exceedance
$t_c$ = time of concentration (hours)
K, a, b, n: constants

K, a, b and n are to be defined for the particular site. For Nepal, their values may be
assumed to be those for Northern India, i.e. K = 5.92, a = 0.162, b = 0.5 and n = 1.013. The
time of concentration $t_c$ is required in hours in the above equation. It shall be estimated by the
Kirpich formula, which in the form below gives $t_c$ in minutes and must therefore be divided by 60:
$$t_c = 0.019478\,L^{0.77}\,S^{-0.385}$$
where L is the maximum length of travel of water in m and S is the slope, equal to H/L, H
being the difference in elevation between the remotest point of the basin and the outlet in
m.
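A hedged Python sketch of the two relations follows. It assumes the intensity relation has the form $i = K T^a / (t_c + b)^n$ with the Northern India constants quoted above, and that the Kirpich coefficient 0.019478 (L in m) yields $t_c$ in minutes, so the result is converted to hours before use. The units of i follow the calibration of the constants, which the notes do not state (cm/hr is commonly quoted for this set, but this should be verified). Function names and the example basin are hypothetical.

```python
def kirpich_tc_minutes(L_m, H_m):
    """Time of concentration in minutes from the Kirpich formula.
    L_m: maximum length of travel of water (m); H_m: elevation difference (m)."""
    S = H_m / L_m  # slope
    return 0.019478 * L_m**0.77 * S**-0.385


def rainfall_intensity(T_years, tc_hours, K=5.92, a=0.162, b=0.5, n=1.013):
    """Intensity from i = K * T^a / (tc + b)^n, with tc in hours.
    Default constants are those quoted in the text for Northern India."""
    return K * T_years**a / (tc_hours + b) ** n


if __name__ == "__main__":
    # Hypothetical basin: 1000 m flow length, 10 m elevation drop
    tc_min = kirpich_tc_minutes(1000.0, 10.0)
    tc_hr = tc_min / 60.0
    print(f"tc = {tc_min:.1f} min = {tc_hr:.2f} hr")
    print(f"i (T = 25 years) = {rainfall_intensity(25, tc_hr):.2f}")
```

The computed intensity must be converted to mm/hr before it is substituted into the SI form of the rational formula.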
Applications of rational method: for design of storm sewers, channels, and other drainage
structures
Limitations of rational method
Applicable to small basins (up to 50 km2)
Duration of rainfall must be at least equal to $t_c$
Gives only the peak; does not give the complete hydrograph
C is assumed to be the same for all storms
Rainfall intensity must be constant over the entire basin during $t_c$.
1.1.2 Empirical Methods

All regional formulae are based on statistical correlation of the observed peak and
important catchment properties.
$Q_p = f(A)$, where $Q_p$ = peak discharge and A = basin area
Empirical formulae shall be used only when a more accurate method for flood prediction
cannot be applied because of lack of data. For flood prediction in ungauged basins of Nepal,
the empirical formulae discussed in the following sections may be used with great caution
and proper justification.
Modified Dickens Method

Using Dickens' method, the T year flood discharge $Q_T$, in m3/s, shall be determined as
$$Q_T = C_T\,A^{0.75}$$
where A is the total basin area in sq. km and $C_T$ is the modified Dickens constant proposed
by the Irrigation Research Institute, Roorkee, India, based on frequency studies on
Himalayan rivers. This constant shall be computed as
$$C_T = 2.342\,\log(0.6T)\,\log\left(\frac{1185}{p}\right) + 4$$
$$p = 100\,\frac{a + 6}{A + a}$$
where a is the perpetual snow area in sq. km and T is the return period in years.
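The modified Dickens calculation can be sketched in Python as below. The sketch assumes the constant takes the form $C_T = 2.342 \log(0.6T) \log(1185/p) + 4$ with $p = 100(a + 6)/(A + a)$ and base-10 logarithms; the function name and the example basin are hypothetical.

```python
import math


def modified_dickens_flood(A_km2, snow_area_km2, T_years):
    """T year flood (m3/s) by the modified Dickens method.
    Assumes C_T = 2.342 log10(0.6 T) log10(1185 / p) + 4,
    with p = 100 (a + 6) / (A + a)."""
    p = 100.0 * (snow_area_km2 + 6.0) / (A_km2 + snow_area_km2)
    C_T = 2.342 * math.log10(0.6 * T_years) * math.log10(1185.0 / p) + 4.0
    return C_T * A_km2**0.75


if __name__ == "__main__":
    # Hypothetical basin: 100 sq. km with no perpetual snow cover
    print(f"Q_100 = {modified_dickens_flood(100.0, 0.0, 100):.0f} m3/s")
```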

Fuller's Method

Although developed for basins in the United States of America, Fuller's formula may be
used to estimate flood discharges in the ungauged basins of Nepal for comparison
purposes. Using this method, the maximum instantaneous flood discharge $Q_{max}$ in m3/s shall
be estimated as
$$Q_{max} = Q_T\left[1 + 2\left(\frac{A}{2.59}\right)^{-0.3}\right]$$
where $Q_T$ is the maximum 24 hour flood with a frequency of once in T years, in m3/s, and A is the
basin area in sq. km. $Q_T$ shall be given by
$$Q_T = Q_{av}\,(1 + 0.8 \log T)$$
in which $Q_{av}$ is the yearly average 24 hour flood over a number of years, in m3/s, given by
$$Q_{av} = C_f\,A^{0.8}$$
where $C_f$ is Fuller's coefficient, varying between 0.18 and 1.88. For Nepal, $C_f$ may be taken as
the average of these values, i.e. equal to 1.03.
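Fuller's three relations chain together, which the following Python sketch makes explicit. It assumes $Q_{av} = C_f A^{0.8}$, $Q_T = Q_{av}(1 + 0.8 \log_{10} T)$ and $Q_{max} = Q_T[1 + 2(A/2.59)^{-0.3}]$; the function name and the example basin area are hypothetical.

```python
import math


def fuller_flood(A_km2, T_years, C_f=1.03):
    """Return (Q_av, Q_T, Q_max) in m3/s by Fuller's method.
    C_f defaults to 1.03, the value suggested in the text for Nepal."""
    Q_av = C_f * A_km2**0.8                                # yearly average 24 hour flood
    Q_T = Q_av * (1.0 + 0.8 * math.log10(T_years))         # maximum 24 hour flood, T years
    Q_max = Q_T * (1.0 + 2.0 * (A_km2 / 2.59) ** -0.3)     # maximum instantaneous flood
    return Q_av, Q_T, Q_max


if __name__ == "__main__":
    # Hypothetical basin of 250 sq. km, 100 year return period
    Q_av, Q_T, Q_max = fuller_flood(250.0, 100)
    print(f"Q_av = {Q_av:.0f}, Q_T = {Q_T:.0f}, Q_max = {Q_max:.0f} m3/s")
```

The ratio of instantaneous to 24 hour flood decreases with basin area, consistent with large basins having flatter flood peaks.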
Horton's Formula

Horton's formula may be used to compute the flood $q_{tr}$, in m3/s/sq. km, equaled or
exceeded in a T year return period using the relation
$$q_{tr} = 71.2\,\frac{T^{0.25}}{A^{0.5}}$$
where A is the drainage area in sq. km.

WECS Formula

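In the Nepalese context, the Water and Energy Commission Secretariat (WECS) developed empirical
relationships for estimating floods of different frequencies.
The formula for the 2 year return period is
$$Q_2 = 1.8767\,(A_{3000} + 1)^{0.8783}$$
The formula for the 100 year return period is
$$Q_{100} = 14.63\,(A_{3000} + 1)^{0.7342}$$
where $A_{3000}$ = basin area (km2) below 3000 m elevation
For other return periods,
$$Q_T = \exp(\ln Q_2 + S\,\sigma), \quad \sigma = \frac{\ln(Q_{100}/Q_2)}{2.326}$$
where $Q_T$ = flood of T year return period (m3/s), S = standard normal variate, $\sigma$ = parameter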

Values of T and S
T (years):  2   5      10     20     50     100    500    1000  10000
S:          0   0.842  1.282  1.645  2.054  2.326  2.878  3.09  3.719
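The WECS procedure interpolates between the 2 year and 100 year floods on a lognormal scale using the standard normal variate S. The Python sketch below illustrates this. The regression coefficients are the values commonly quoted for the WECS method and are an assumption here; they should be checked against the original WECS report before use. The function name and the example area are hypothetical.

```python
import math


def wecs_flood(A3000_km2, S):
    """Flood of a given return period (m3/s) by the WECS approach.
    A3000_km2: basin area below 3000 m elevation; S: standard normal variate.
    Coefficients are commonly quoted values, assumed here for illustration."""
    Q2 = 1.8767 * (A3000_km2 + 1.0) ** 0.8783
    Q100 = 14.63 * (A3000_km2 + 1.0) ** 0.7342
    sigma = math.log(Q100 / Q2) / 2.326
    return math.exp(math.log(Q2) + S * sigma)


if __name__ == "__main__":
    # Hypothetical basin with 500 km2 below 3000 m; S values from the table above
    for T, S in ((2, 0.0), (10, 1.282), (50, 2.054), (100, 2.326)):
        print(f"T = {T:>3} years: Q_T = {wecs_flood(500.0, S):.0f} m3/s")
```

By construction, S = 0 returns the 2 year flood and S = 2.326 returns the 100 year flood.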
