
Vol. 5, 2020-03

Optimal Samples from Selected Probability Distributions

Hugo Hernandez
ForsChem Research, 050030 Medellin, Colombia
hugo.hernandez@forschem.org

doi: 10.13140/RG.2.2.27628.72327

Abstract

Optimal sampling is a deterministic approach for discretizing continuous probability
distributions. The purpose of optimal sampling is to find a set of values such that the first
natural normalized moments of the original distribution are preserved. Normalization consists,
in this case, of taking the k-th root of the corresponding k-th moment. This report presents
examples of optimal sample sets obtained for a selection of families of probability
distributions (Uniform distribution, Normal distribution, Exponential distribution, Maxwell-
Boltzmann distribution, χ² distribution, Log-normal distribution, Integrated-normal
distribution, and Epanechnikov parabolic distribution), each one represented by a standard
distribution, considering different sample sizes (5, 10, 15, 20, 25, 30 and 50) and using a fixed
number of normalized moments (the first 5 natural moments). The performance of each optimal
sample, in terms of matching the normalized moments of the original distribution, is also
reported. These optimal samples can be used to represent the behavior of the full
distribution, for example, when analyzing the effects of uncertainty in non-linear systems, or
to reduce the number of elements required by Monte Carlo simulations.

Keywords

Deterministic sampling, Moments, Monte Carlo simulation, Optimal sampling, Optimization,
Probability Distributions, Random variables, Standard Transformation

1. Introduction

A previous report [1] presented and discussed different types of discretization methods of
probability distributions (sampling) including random, deterministic, and randomistic sampling
methods. One particular type of deterministic sampling method, optimal sampling, was
described. The purpose of optimal sampling is finding a set of values such that the first
moments of the original distribution are preserved, by solving the following optimization
problem:

20/02/2020 ForsChem Research Reports Vol. 5, 2020-03 (1 / 23)


www.forschem.org

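$$\min_{x_1, \dots, x_n} \sum_{k=1}^{m} \left( \left( \frac{1}{n} \sum_{i=1}^{n} x_i^k \right)^{1/k} - \left( M_k(x) \right)^{1/k} \right)^2$$

(1.1)

where $x_i$ are the elements of the deterministic sample of size $n$, $m$ is the number of integer
moments considered in the objective function, and $M_k(x)$ is the $k$-th moment of the distribution
of the sampled variable $x$, given by:

$$M_k(x) = E(x^k) = \int_{-\infty}^{\infty} x^k \rho_x(x) \, dx$$

(1.2)

where $E(\cdot)$ denotes the expected value operator, and $\rho_x(x)$ represents the probability
density function of $x$. The moments used in the objective function are normalized by using the
$1/k$ power of the $k$-th moment.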
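As an illustration of Eqs. (1.1) and (1.2), the following minimal Python sketch evaluates the
objective function for a given sample. It assumes that the objective is the sum of squared
differences between normalized sample moments and normalized distribution moments, and that a
signed root is used so that odd moments remain real when they are zero or negative. The function
names and the signed-root convention are assumptions of this sketch, not taken from the report.
The sample is the n = 5 sample of the standard normal distribution listed in Table 4.

```python
import math

def normalized_moment(sample, k):
    """k-th natural sample moment raised to the power 1/k."""
    m = math.fsum(x ** k for x in sample) / len(sample)
    # Assumption: signed root, so zero or negative odd moments stay real.
    return math.copysign(abs(m) ** (1.0 / k), m)

def objective(sample, exact_normalized):
    """Sum of squared deviations between normalized moments (k = 1, 2, ...)."""
    return sum((normalized_moment(sample, k) - target) ** 2
               for k, target in enumerate(exact_normalized, start=1))

# n = 5 sample of the standard normal distribution (Table 4)
normal_5 = [-1.61880, -0.25339, 0.00000, 0.25339, 1.61880]
# Exact normalized moments of the standard normal: 0, 1, 0, 3^(1/4), 0
normal_exact = [0.0, 1.0, 0.0, 3.0 ** 0.25, 0.0]

value = objective(normal_5, normal_exact)
print(f"{value:.2e}")
```

The printed value should be close to the objective function reported for n = 5 in Table 5
(2.13E-03).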

The number of moments considered in the objective function (1.1), $m$, should not be larger than
the selected sample size, $n$. In practice, taking into account the most relevant moments
considered in the description of random variables, $m$ should be at least 4. As the number
of moments considered increases and the sample size decreases, it becomes more difficult to reach
a good agreement between the moments of the original distribution and the moments of the
sample.

During optimization, the decision variables should remain within certain probable limits§
determined by the sample size. This can be done by directly defining the limits of the variables:

( ) ( )
(1.3)

where $P_x$ is the cumulative probability distribution function of $x$, and $P_x^{-1}$ is the inverse
of the cumulative probability function. The inverse function used can be the exact analytical
function, or any empirical function providing a relatively good fit.

The initial values of the decision variables used in the optimization can be defined as the
interval median heuristic sample given by:

$$x_i = P_x^{-1}\left( \frac{2i - 1}{2n} \right), \quad i = 1, \dots, n$$

(1.4)

§ While not being strictly necessary, this constraint forces a representative distribution of the
elements of the sample within the possible range of values.

Any other set of initial values can be used, as long as constraints represented by Eq. (1.3)
are satisfied. For example, different randomistic samples [1] can be used as starting points
when local optima are observed.
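The interval median starting point can be sketched as follows. The sketch assumes that the
cumulative probability range is split into n equal intervals and that each element is the inverse
cumulative function evaluated at the median probability of its interval, (2i - 1)/(2n). The helper
names are hypothetical; the normal inverse function comes from the Python standard library.

```python
import math
from statistics import NormalDist

def interval_median_sample(inverse_cdf, n):
    """Inverse CDF evaluated at the median probability of n equal intervals."""
    return [inverse_cdf((2 * i - 1) / (2 * n)) for i in range(1, n + 1)]

def exponential_inverse_cdf(p):
    """Inverse cumulative function of the exponential distribution with mean 1."""
    return -math.log(1.0 - p)

normal_start = interval_median_sample(NormalDist().inv_cdf, 5)
exponential_start = interval_median_sample(exponential_inverse_cdf, 5)

print([round(v, 5) for v in normal_start])
print([round(v, 5) for v in exponential_start])
```

Either list can then be passed to an optimizer as the initial value of the decision variables.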

The purpose of this report is to present examples of optimal sample sets obtained for a
selection of families of probability distributions, considering different sample sizes ($n$) and
using a fixed number of moments ($m = 5$, for practical purposes). Only standard probability
distributions [3] are sampled, given that the sample of any other distribution in the
family can be easily obtained using the following transformation:

$$x = a + b z$$

(1.5)

where $x$ is the sample obtained for any distribution in the family, $z$ is the sample obtained for
the standard distribution of the family, and $a$ and $b$ are constant parameters characteristic of
the family, defined according to Table 1.

Table 1. Parameters used in the definition of standard random variables

Type of Standard Transformation    $a$          $b$
Type I (Unbounded)                 $E(x)$       $\sqrt{Var(x)}$
Type II (Lower bounded)            $\min(x)$    $E(x) - \min(x)$
Type II (Upper bounded)            $\max(x)$    $E(x) - \max(x)$
Type III (Bounded)                 $\min(x)$    $\max(x) - \min(x)$
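As a small illustration of how a standard sample may be reused for another member of the same
family, the sketch below assumes that the transformation is linear in the standard sample
(x = a + b z), with a and b chosen according to the type of standard transformation: mean and
standard deviation for Type I, lower limit and mean minus lower limit for Type II (lower bounded),
and lower limit and range for Type III. The function name and the target parameters are
hypothetical; the standard samples are the n = 5 columns of Tables 2, 4 and 6.

```python
def from_standard(z_sample, a, b):
    """Map a standard sample z to x = a + b * z (assumed linear transformation)."""
    return [a + b * z for z in z_sample]

# Type I: normal distribution with mean 10 and standard deviation 2
standard_normal_5 = [-1.61880, -0.25339, 0.00000, 0.25339, 1.61880]
normal_sample = from_standard(standard_normal_5, a=10.0, b=2.0)

# Type III: uniform distribution between 2 and 6 (a = minimum, b = range)
standard_uniform_5 = [0.08063, 0.33025, 0.47489, 0.70076, 0.91348]
uniform_sample = from_standard(standard_uniform_5, a=2.0, b=4.0)

# Type II (lower bounded): exponential distribution with minimum 0 and mean 3
standard_exponential_5 = [0.02296, 0.22318, 0.51085, 0.91636, 3.31589]
exponential_sample = from_standard(standard_exponential_5, a=0.0, b=3.0)

print(normal_sample)
print(uniform_sample)
print(exponential_sample)
```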

The families of probability distributions considered in this report are: Uniform distribution,
Normal distribution, Exponential distribution, Maxwell-Boltzmann distribution, χ² distribution,
Log-normal distribution, Integrated-normal distribution, and Epanechnikov's parabolic
distribution. Standard distributions for each family, some of which have been previously
reported,[4] and their corresponding tables of optimal samples for different sizes are
presented in Section 2.

2. Optimal Samples of Selected Standard Probability Distributions

This Section contains a basic description of the selected standard distributions (Uniform,
Normal, Exponential, Maxwell-Boltzmann, χ², Log-normal, Integrated-normal, and
Epanechnikov) and the corresponding tables of optimal samples obtained for different sample
sizes ($n$ = 5, 10, 15, 20, 25, 30, and 50). In all cases, the objective function considers only the
first 5 natural moments of each distribution. However, the objective function may include more
moments in order to obtain a sample more representative of the original distribution.

The optimization procedure used to obtain the sets of optimal samples was the following. First,
a non-linear generalized reduced gradient (GRG) optimization is performed, using the
corresponding interval median heuristic sample as the starting point. The tolerance of the
objective function (maximum final value allowed) was set to $5 \times 10^{-10}$,** corresponding to an
average tolerance of $10^{-5}$ for the deviation of each normalized moment considered in the
objective function (the same resolution considered for each value in the sample). If the GRG
method does not find an objective function below $5 \times 10^{-10}$, the optimization is repeated
from 10 different randomistic samples as starting points, in order to overcome local minima and
to reduce the gap with the tolerance limit. If none of the randomistic starting points achieves
the requested tolerance in the objective function, the best sample obtained becomes the starting
point for an iterative search optimization method. This iterative search method works by
changing each element of the sample (one by one) using an adaptive variable step, bounded
between a minimum and a maximum step size, in both the positive and negative directions,
updating the sample when the objective function improves, and repeating the loop until no
further improvement is obtained (less than 0.1% in an iteration cycle). If the objective function
tolerance limit is not reached after this iterative search procedure, the sample with the best
objective function obtained is reported. This is usually the case when the sample size considered
is small (particularly if it is less than or equal to the number of moments considered in the
objective function), or when the samples of a distribution usually contain elements with large
values (e.g. exponential, log-normal, integrated-normal, or χ² distributions). This "brute force"
approach seems to work well compared to other methods because the objective function has a large
number of local minima (affecting the effectiveness of gradient-based methods), and the overall
search region considered is too wide (affecting the efficiency of stochastic-based search methods).
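The last stage of the procedure (the element-by-element iterative search) could be sketched as
shown below. This is only a minimal illustration under stated assumptions: the step limits, the
rule that halves the step when a cycle improves the objective by less than 0.1%, the cycle limit
and all function names are hypothetical choices, not the settings used to produce the tables. The
example refines the interval median sample of the Type III standard uniform distribution for
n = 5, whose exact k-th moment is 1/(k + 1).

```python
import math

def normalized_moment(sample, k):
    """k-th natural sample moment raised to the power 1/k (signed root)."""
    m = math.fsum(x ** k for x in sample) / len(sample)
    return math.copysign(abs(m) ** (1.0 / k), m)

def objective(sample, targets):
    """Sum of squared deviations between normalized moments."""
    return sum((normalized_moment(sample, k) - t) ** 2
               for k, t in enumerate(targets, start=1))

def iterative_search(start, targets, lower, upper, step_max=1e-2,
                     step_min=1e-6, tolerance=5e-10, max_cycles=2000):
    """Change one element at a time and keep only the improving moves."""
    sample, best, step = list(start), objective(start, targets), step_max
    for _ in range(max_cycles):
        if best <= tolerance or step < step_min:
            break
        previous = best
        for i in range(len(sample)):
            for delta in (step, -step):
                trial = list(sample)
                trial[i] = min(max(trial[i] + delta, lower), upper)
                value = objective(trial, targets)
                if value < best:
                    sample, best = trial, value
        # Hypothetical adaptation rule: halve the step when a full cycle
        # improves the objective by less than 0.1 %.
        if previous - best < 1e-3 * previous:
            step /= 2.0
    return sample, best

n = 5
start = [(2 * i - 1) / (2 * n) for i in range(1, n + 1)]       # interval medians
targets = [(1.0 / (k + 1)) ** (1.0 / k) for k in range(1, 6)]  # uniform on [0, 1]
initial = objective(start, targets)
sample, final = iterative_search(start, targets, lower=0.0, upper=1.0)
print(f"{initial:.2e} -> {final:.2e}")
```

Because only improving moves are accepted, the search can stop at a local minimum, which is
consistent with the need for several starting points described above.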

The use of tolerances also implies that a "close to optimal" solution might have been found,
but not necessarily the global optimum. Furthermore, different sets of samples might result in
an objective function below the tolerance limit, and therefore, multiple "optimal" sets with
similar performance might be obtained, depending on the starting point of the optimization.
However, only one set is reported for each distribution and each sample size.

For each optimal sample obtained, the first 5 normalized natural moments of the distribution
(denoted by a sample size $n = \infty$) are compared to the normalized sample moments in order to
show the optimal sample performance.
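The exact normalized moments used as reference in this comparison can be computed from closed-form
moments. The sketch below does so for four of the standard distributions, using well-known
expressions: 1/(k + 1) for the uniform distribution on [0, 1], k! for the exponential distribution
with mean 1, the double factorial (k - 1)!! for even moments of the standard normal, and
6/((k + 2)(k + 3)) for a parabolic density 6z(1 - z) on [0, 1]. The function names and the signed
root for zero moments are assumptions of this sketch.

```python
import math

def normalized(moment, k):
    """Signed k-th root of a natural moment."""
    return math.copysign(abs(moment) ** (1.0 / k), moment)

def uniform_moment(k):
    return 1.0 / (k + 1)

def exponential_moment(k):
    return float(math.factorial(k))

def normal_moment(k):
    # 0 for odd k, (k - 1)!! = 1 * 3 * ... * (k - 1) for even k
    return 0.0 if k % 2 else float(math.prod(range(1, k, 2)))

def epanechnikov_moment(k):
    return 6.0 / ((k + 2) * (k + 3))

families = {
    "uniform": uniform_moment,
    "normal": normal_moment,
    "exponential": exponential_moment,
    "epanechnikov": epanechnikov_moment,
}
exact = {name: [round(normalized(f(k), k), 5) for k in range(1, 6)]
         for name, f in families.items()}
for name, values in exact.items():
    print(name, values)
```

The resulting values can be compared with the last column of the performance tables in the
following subsections.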

** The value for the objective function tolerance is the same for all distributions because only
standard distributions are employed.


2.1. Type III Standard Uniform Random Variable

Probability density function:

$$\rho_z(z) = 1, \quad 0 \le z \le 1$$

(2.1)

Figure 1. Probability Density Function of the Type III Standard Uniform Distribution

Natural moments of the distribution:

$$M_k(z) = \int_0^1 z^k \, dz = \frac{1}{k + 1}$$

(2.2)

Figure 2. First Natural Normalized Moments of the Type III Standard Uniform Distribution


Table 2. Selected Optimal Samples for the Type III Standard Uniform Distribution
n = 5     n = 10    n = 15    n = 20    n = 25    n = 30    n = 30    n = 50    n = 50
                                                  (1-15)    (16-30)   (1-25)    (26-50)
0.08063   0.04446   0.03018   0.02496   0.00001   0.01656   0.51649   0.01089   0.51007
0.33025   0.14921   0.11936   0.07469   0.06501   0.05058   0.55002   0.02970   0.53011
0.47489   0.25125   0.13334   0.12434   0.12000   0.08379   0.58344   0.04970   0.55011
0.70076   0.35229   0.24989   0.17406   0.13544   0.11696   0.61685   0.06976   0.57013
0.91348   0.45105   0.29502   0.22396   0.18694   0.15006   0.65024   0.08971   0.59010
          0.54921   0.37726   0.27407   0.21558   0.18313   0.68364   0.10977   0.61011
          0.64750   0.43504   0.32439   0.28000   0.21630   0.71696   0.12969   0.63012
          0.74759   0.48271   0.37487   0.29464   0.24956   0.75035   0.14976   0.65013
          0.85262   0.59285   0.42541   0.34233   0.28279   0.78364   0.16974   0.67014
          0.95482   0.61001   0.47591   0.37077   0.31606   0.81693   0.18974   0.69011
                    0.70721   0.52624   0.40578   0.34943   0.85028   0.20982   0.71012
                    0.75871   0.57630   0.44583   0.38281   0.88360   0.22976   0.73008
                    0.83752   0.62604   0.51564   0.41619   0.91691   0.24987   0.75012
                    0.90368   0.67546   0.52478   0.44966   0.95022   0.26978   0.77008
                    0.96723   0.72463   0.59581   0.48310   0.98352   0.28983   0.79009
                              0.77378   0.60645                       0.30988   0.81008
                              0.82323   0.66193                       0.32992   0.83009
                              0.87348   0.71975                       0.34989   0.85010
                              0.92516   0.74460                       0.36994   0.87006
                              0.97904   0.76682                       0.38996   0.89002
                                        0.81971                       0.41001   0.91009
                                        0.87080                       0.43000   0.93005
                                        0.89552                       0.45003   0.95004
                                        0.93001                       0.47004   0.97005
                                        0.98586                       0.49007   0.99003

Table 3. Performance of the Optimal Samples for the Type III Standard Uniform Distribution
Sample size (n)             5          10         15         20         25         30         50         ∞
Objective function          3.00E-09   4.10E-12   1.66E-12   4.38E-11   5.00E-10   4.95E-10   4.31E-10   0
Normalized moment, k = 1    0.50000    0.50000    0.50000    0.50000    0.50000    0.50000    0.49999    0.50000
Normalized moment, k = 2    0.57734    0.57735    0.57735    0.57735    0.57735    0.57734    0.57736    0.57735
Normalized moment, k = 3    0.62996    0.62996    0.62996    0.62997    0.62995    0.62997    0.62997    0.62996
Normalized moment, k = 4    0.66878    0.66874    0.66874    0.66874    0.66876    0.66875    0.66874    0.66874
Normalized moment, k = 5    0.69879    0.69883    0.69883    0.69883    0.69882    0.69882    0.69882    0.69883


2.2. Type I Standard Normal Distribution

Probability density function:

$$\rho_z(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$$

(2.3)

Figure 3. Probability Density Function of the Type I Standard Normal Distribution

Natural moments of the distribution:

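$$M_k(z) = \int_{-\infty}^{\infty} \frac{z^k}{\sqrt{2\pi}} e^{-z^2/2} \, dz = \begin{cases} 0, & k \text{ odd} \\ \prod_{j=1}^{k/2} (2j - 1), & k \text{ even} \end{cases}$$

(2.4)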

Figure 4. First Natural Normalized Moments of the Type I Standard Normal Distribution


Table 4. Selected Optimal Samples for the Type I Standard Normal Distribution
 n = 5     n = 10    n = 15    n = 20    n = 25    n = 30    n = 30    n = 50    n = 50
                                                   (1-15)    (16-30)   (1-25)    (26-50)
-1.61880  -1.94385  -2.09715  -2.20106  -2.27187  -2.33788   0.00209  -2.50169   0.00007
-0.25339  -0.87203  -1.21085  -1.42678  -1.57031  -1.65182   0.10186  -1.90656   0.06367
 0.00000  -0.60669  -0.92362  -1.08218  -1.25355  -1.35835   0.18893  -1.65628   0.11536
 0.25339  -0.30482  -0.68401  -0.87647  -1.05572  -1.18885   0.27604  -1.46399   0.16627
 1.61880  -0.00476  -0.48450  -0.71678  -0.84168  -0.99398   0.36422  -1.33767   0.21803
           0.00476  -0.25343  -0.57586  -0.75239  -0.89101   0.45604  -1.21413   0.26972
           0.30482  -0.12609  -0.44332  -0.62259  -0.75720   0.55097  -1.11499   0.32205
           0.60669   0.00000  -0.31476  -0.50260  -0.65239   0.65239  -1.02385   0.37542
           0.87203   0.12609  -0.18827  -0.39156  -0.55097   0.75720  -0.92016   0.42961
           1.94385   0.25343  -0.06267  -0.28458  -0.45604   0.89101  -0.85737   0.48795
                     0.48450   0.06267  -0.17999  -0.36422   0.99398  -0.79692   0.53186
                     0.68401   0.18827  -0.07323  -0.27604   1.18885  -0.72775   0.60221
                     0.92362   0.31476   0.00000  -0.18893   1.35835  -0.66369   0.66369
                     1.21085   0.44332   0.07323  -0.10186   1.65182  -0.60221   0.72775
                     2.09715   0.57586   0.17999  -0.00209   2.33788  -0.53186   0.79692
                               0.71678   0.28458                      -0.48795   0.85737
                               0.87647   0.39156                      -0.42961   0.92016
                               1.08218   0.50260                      -0.37542   1.02385
                               1.42678   0.62259                      -0.32205   1.11499
                               2.20106   0.75239                      -0.26972   1.21413
                                         0.84168                      -0.21803   1.33767
                                         1.05572                      -0.16627   1.46399
                                         1.25355                      -0.11536   1.65628
                                         1.57031                      -0.06367   1.90656
                                         2.27187                      -0.00007   2.50169

Table 5. Performance of the Optimal Samples for the Type I Standard Normal Distribution
Sample size (n)             5          10         15         20         25         30         50         ∞
Objective function          2.13E-03   1.30E-11   4.97E-10   8.02E-12   1.86E-11   4.87E-10   2.95E-10   0
Normalized moment, k = 1    0.00000    0.00000    0.00000    0.00000    0.00000    0.00000    0.00000    0.00000
Normalized moment, k = 2    1.03629    1.00000    1.00000    1.00000    1.00000    1.00000    1.00000    1.00000
Normalized moment, k = 3    0.00000    0.00000    0.00000    0.00000    0.00000    0.00000    0.00000    0.00000
Normalized moment, k = 4    1.28758    1.31607    1.31605    1.31608    1.31607    1.31605    1.31606    1.31607
Normalized moment, k = 5    0.00000    0.00000    0.00000    0.00000    0.00000    0.00000    0.00000    0.00000


2.3. Type II Standard Exponential Distribution

Probability density function:

$$\rho_z(z) = e^{-z}, \quad z \ge 0$$

(2.5)

Figure 5. Probability Density Function of the Type II Standard Exponential Distribution

Natural moments of the distribution:

$$M_k(z) = \int_0^{\infty} z^k e^{-z} \, dz = k!$$

(2.6)

Figure 6. First Natural Normalized Moments of the Type II Standard Exponential Distribution


Table 6. Selected Optimal Samples for the Type II Standard Exponential Distribution
n = 5     n = 10    n = 15    n = 20    n = 25    n = 30    n = 30    n = 50    n = 50
                                                  (1-15)    (16-30)   (1-25)    (26-50)
0.02296   0.10531   0.06899   0.05129   0.04082   0.03390   0.76214   0.02020   0.73397
0.22318   0.22308   0.14310   0.10536   0.08338   0.06899   0.83625   0.04082   0.77653
0.51085   0.30181   0.22314   0.16252   0.12783   0.10536   0.84285   0.06187   0.82098
0.91636   0.35673   0.31015   0.22314   0.17435   0.14310   0.91629   0.08338   0.86750
3.31589   0.51086   0.40547   0.28768   0.22314   0.18232   1.00330   0.10536   0.91629
          0.69318   0.47998   0.35667   0.27444   0.22314   1.09861   0.12783   0.96758
          0.91633   0.51083   0.43078   0.32850   0.26570   1.20397   0.15082   1.02165
          1.20407   0.62861   0.51083   0.38566   0.31015   1.32176   0.17435   1.07881
          1.60945   0.76214   0.59784   0.44629   0.35667   1.45529   0.19845   1.13943
          3.94987   0.91629   0.61970   0.51083   0.40546   1.60944   0.22314   1.17992
                    1.09861   0.69315   0.57982   0.45676   1.79176   0.24846   1.20397
                    1.32176   0.79851   0.65393   0.51083   2.01490   0.27444   1.27297
                    1.60944   0.91629   0.73397   0.56798   2.30259   0.30110   1.34707
                    2.01490   1.04982   0.73821   0.62861   2.70805   0.32850   1.42712
                    4.32971   1.20397   0.82098   0.69315   4.99171   0.35667   1.51413
                              1.38629   0.91629                       0.38566   1.60944
                              1.60944   1.02165                       0.41551   1.71480
                              1.89712   1.13943                       0.44629   1.83258
                              2.30259   1.27297                       0.47803   1.96611
                              4.60267   1.42712                       0.51082   2.12026
                                        1.60944                       0.54473   2.30259
                                        1.83258                       0.57982   2.52573
                                        2.12026                       0.61619   2.81341
                                        2.52573                       0.65393   3.21888
                                        4.81619                       0.69315   5.48799

Table 7. Performance of the Optimal Samples for the Type II Standard Exponential Distribution
Sample size (n)             5          10         15         20         25         30         50         ∞
Objective function          8.04E-02   2.22E-02   9.88E-03   5.39E-03   3.31E-03   2.20E-03   6.54E-04   0
Normalized moment, k = 1    0.99785    0.98707    0.98821    0.99028    0.99215    0.99370    0.99758    1.00000
Normalized moment, k = 2    1.55860    1.46635    1.43743    1.42469    1.41812    1.41443    1.40967    1.41421
Normalized moment, k = 3    1.95521    1.90248    1.87556    1.85972    1.84944    1.84232    1.82793    1.81712
Normalized moment, k = 4    2.22102    2.24330    2.24420    2.24199    2.23935    2.23686    2.22954    2.21336
Normalized moment, k = 5    2.40411    2.49951    2.53565    2.55462    2.56623    2.57402    2.58939    2.60517


2.4. Type II Standard Maxwell-Boltzmann Distribution

Probability density function:[5]

$$\rho_z(z) = \frac{32}{\pi^2} z^2 e^{-4 z^2 / \pi}, \quad z \ge 0$$

(2.7)

Figure 7. Probability Density Function of the Type II Standard Maxwell-Boltzmann Distribution

Natural moments of the distribution:[6]


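$$M_k(z) = \int_0^{\infty} \frac{32}{\pi^2} z^{k+2} e^{-4 z^2 / \pi} \, dz = \begin{cases} \frac{2}{\sqrt{\pi}} \left( \frac{\pi}{4} \right)^{k/2} \prod_{j=1}^{(k+1)/2} j, & k \text{ odd} \\ \left( \frac{\pi}{8} \right)^{k/2} \prod_{j=1}^{k/2} (2j + 1), & k \text{ even} \end{cases}$$

(2.8)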

Figure 8. First Natural Normalized Moments of the Type II Standard Maxwell-Boltzmann Distribution


Table 8. Selected Optimal Samples for the Type II Standard Maxwell-Boltzmann Distribution
n = 5     n = 10    n = 15    n = 20    n = 25    n = 30    n = 30    n = 50    n = 50
                                                  (1-15)    (16-30)   (1-25)    (26-50)
0.47601   0.26335   0.19663   0.24713   0.14037   0.25852   0.98620   0.17190   0.97270
0.85437   0.62615   0.54659   0.40515   0.44616   0.30411   1.02061   0.23507   0.99301
0.87292   0.75372   0.64027   0.51758   0.52451   0.41383   1.05618   0.33111   1.01364
1.05620   0.83617   0.71805   0.60054   0.58684   0.48808   1.09331   0.40560   1.03468
1.74306   0.91898   0.78805   0.67150   0.64027   0.54660   1.13244   0.48000   1.05618
          1.04923   0.85437   0.73596   0.68303   0.63844   1.17413   0.52061   1.07825
          1.10160   0.91969   0.79562   0.71491   0.68050   1.21914   0.55526   1.10096
          1.20366   0.98621   0.85223   0.75596   0.71806   1.26850   0.58616   1.12443
          1.32745   0.99583   0.90332   0.79576   0.75372   1.29543   0.61440   1.14876
          1.91945   1.05619   0.95266   0.83446   0.78805   1.34580   0.64027   1.17412
                    1.13244   1.00328   0.87303   0.82148   1.38890   0.66478   1.19538
                    1.21915   1.05618   0.91135   0.85437   1.48990   0.68819   1.21415
                    1.32370   1.11259   0.96541   0.88701   1.61964   0.71071   1.25004
                    1.65661   1.17240   0.99128   0.91969   1.75539   0.73251   1.28047
                    1.96629   1.22171   1.03160   0.95266   2.12910   0.75371   1.30528
                              1.28096   1.07493                       0.77445   1.33963
                              1.35031   1.12087                       0.79480   1.38842
                              1.44541   1.19541                       0.81485   1.43804
                              1.61417   1.22681                       0.83468   1.46826
                              2.06119   1.31866                       0.85437   1.53825
                                        1.35867                       0.87397   1.59944
                                        1.44319                       0.89353   1.67642
                                        1.55411                       0.91313   1.76741
                                        1.73329                       0.93282   1.91776
                                        2.07908                       0.95266   2.19434

Table 9. Performance of Optimal Samples for the Type II Standard Maxwell-Boltzmann Distribution
Sample size (n)             5          10         15         20         25         30         50         ∞
Objective function          1.02E-05   9.24E-08   1.72E-07   1.37E-08   6.40E-08   8.63E-09   1.89E-08   0
Normalized moment, k = 1    1.00051    0.99998    1.00000    0.99999    1.00000    0.99999    0.99999    1.00000
Normalized moment, k = 2    1.08373    1.08549    1.08539    1.08544    1.08543    1.08544    1.08545    1.08540
Normalized moment, k = 3    1.16328    1.16224    1.16228    1.16237    1.16230    1.16238    1.16235    1.16245
Normalized moment, k = 4    1.23523    1.23344    1.23359    1.23332    1.23344    1.23331    1.23333    1.23325
Normalized moment, k = 5    1.29756    1.29911    1.29900    1.29915    1.29909    1.29916    1.29915    1.29917


2.5. Type II Standard χ² Distribution

Probability density function:

$$\rho_z(z) = \frac{(\nu/2)^{\nu/2}}{\Gamma(\nu/2)} z^{\nu/2 - 1} e^{-\nu z / 2}, \quad z \ge 0$$

(2.9)

Figure 9. Probability Density Function of the Type II Standard χ² Distribution for different values of ν

Natural moments of the distribution:

$$M_k(z) = \int_0^{\infty} z^k \rho_z(z) \, dz = \left( \frac{2}{\nu} \right)^k \frac{\Gamma(k + \nu/2)}{\Gamma(\nu/2)}$$

(2.10)

Figure 10. First Natural Normalized Moments of the Type II Standard χ² Distribution for different values
of ν

Particularly, only the optimal samples of the standard χ² distribution with ν = 5 are presented.


Table 10. Selected Optimal Samples for the Type II Standard χ² Distribution (ν = 5)
n = 5     n = 10    n = 15    n = 20    n = 25    n = 30    n = 30    n = 50    n = 50
                                                  (1-15)    (16-30)   (1-25)    (26-50)
0.44114   0.32206   0.26309   0.13922   0.04931   0.00000   0.91992   0.00000   0.89982
0.46851   0.46850   0.37423   0.32206   0.28780   0.26309   0.97179   0.15038   0.93010
0.73110   0.59998   0.46850   0.39876   0.35394   0.32206   1.02637   0.20627   0.96121
1.02638   0.65212   0.55668   0.46850   0.41311   0.37423   1.08423   0.24999   0.99327
2.28276   0.73110   0.64327   0.53492   0.46850   0.42255   1.14609   0.29625   1.02637
          0.87030   0.73109   0.59998   0.52179   0.46850   1.15077   0.35394   1.06065
          1.02638   0.78773   0.66502   0.57402   0.51301   1.21289   0.38414   1.09626
          1.21289   0.82251   0.73109   0.62593   0.55668   1.28584   0.41311   1.13336
          1.45786   0.91993   0.79918   0.67812   0.59998   1.36667   0.44115   1.17216
          2.63206   1.02638   0.87029   0.73109   0.64327   1.45786   0.46850   1.21288
                    1.14610   0.94555   0.78536   0.68688   1.56322   0.49533   1.25582
                    1.28584   0.95202   0.84142   0.73109   1.68910   0.52179   1.30130
                    1.45786   1.02638   0.89982   0.77620   1.84728   0.54799   1.34976
                    1.68910   1.11462   0.96121   0.82250   2.06395   0.57402   1.39967
                    2.83348   1.21289   1.02637   0.87029   3.16954   0.59998   1.40173
                              1.32514   1.06983                       0.62593   1.45786
                              1.45786   1.09627                       0.65196   1.51905
                              1.62304   1.17217                       0.67812   1.58649
                              1.84728   1.25583                       0.70448   1.66184
                              2.97335   1.34977                       0.73109   1.74752
                                        1.45786                       0.75803   1.84728
                                        1.58649                       0.78536   1.96732
                                        1.74752                       0.81313   2.11925
                                        1.96732                       0.84142   2.32887
                                        3.08122                       0.87029   3.41450

Table 11. Performance of Optimal Samples for the Type II Standard χ² Distribution (ν = 5)
Sample size (n)             5          10         15         20         25         30         50         ∞
Objective function          3.58E-03   5.19E-04   1.61E-04   7.20E-05   3.63E-05   1.98E-05   3.52E-06   0
Normalized moment, k = 1    0.98998    0.99733    1.00039    1.00036    1.00008    1.00020    1.00014    1.00000
Normalized moment, k = 2    1.20109    1.18229    1.17920    1.18019    1.18119    1.18167    1.18270    1.18322
Normalized moment, k = 3    1.39440    1.37291    1.36567    1.36330    1.36216    1.36135    1.36052    1.36082
Normalized moment, k = 4    1.54698    1.54523    1.54228    1.54018    1.53880    1.53784    1.53601    1.53446
Normalized moment, k = 5    1.66184    1.68937    1.69739    1.70026    1.70174    1.70276    1.70428    1.70514


2.6. Type II Standard Log-Normal Distribution

Probability density function:

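$$\rho_z(z) = \frac{1}{z \sigma \sqrt{2\pi}} \exp\left( -\frac{\left( \ln z + \sigma^2/2 \right)^2}{2 \sigma^2} \right), \quad z > 0$$

(2.11)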

Figure 11. Probability Density Function of the Type II Standard Log-normal Distribution for different
values of σ

Natural moments of the distribution:

$$M_k(z) = \int_0^{\infty} z^k \rho_z(z) \, dz = \exp\left( \frac{k (k - 1) \sigma^2}{2} \right)$$

(2.12)
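A short derivation may help connect the natural moments of the standard log-normal distribution
with the last column of Table 13. It assumes that the standard variable can be written as
$z = \exp(\sigma u - \sigma^2/2)$, with $u$ a standard normal variable, so that $E(z) = 1$; this
parametrization and the symbol $\sigma$ are assumptions of the sketch, chosen because they
reproduce the tabulated values for $\sigma = 0.5$. Using $E(e^{t u}) = e^{t^2/2}$:

$$E(z^k) = e^{-k \sigma^2 / 2} \, E\left( e^{k \sigma u} \right) = e^{-k \sigma^2 / 2} \, e^{k^2 \sigma^2 / 2} = e^{k (k - 1) \sigma^2 / 2}, \qquad \left( E(z^k) \right)^{1/k} = e^{(k - 1) \sigma^2 / 2}$$

For $\sigma = 0.5$ the normalized moments for $k = 1, \dots, 5$ are $e^{0} = 1$,
$e^{0.125} \approx 1.13315$, $e^{0.25} \approx 1.28403$, $e^{0.375} \approx 1.45499$ and
$e^{0.5} \approx 1.64872$.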

Figure 12. First Natural Normalized Moments of the Type II Standard Log-normal Distribution for different
values of σ


Table 12. Selected Optimal Samples for the Type II Standard Log-normal Distribution (σ = 0.5)
n = 5     n = 10    n = 15    n = 20    n = 25    n = 30    n = 30    n = 50    n = 50
                                                  (1-15)    (16-30)   (1-25)    (26-50)
0.39447   0.46497   0.41663   0.38773   0.36775   0.35276   0.88250   0.31604   0.88250
0.57938   0.51100   0.50642   0.46497   0.43712   0.41663   0.92020   0.36775   0.90491
0.77750   0.57938   0.57937   0.52559   0.49041   0.46497   0.95978   0.40560   0.92795
1.00168   0.67896   0.58358   0.57937   0.53674   0.50642   1.00168   0.43712   0.95170
2.16803   0.77750   0.64632   0.62986   0.57937   0.54405   1.04640   0.46497   0.97624
          0.88250   0.71152   0.63824   0.61992   0.57937   1.09458   0.49041   1.00168
          1.00168   0.77750   0.67896   0.65940   0.61326   1.14706   0.51419   1.02813
          1.14706   0.84635   0.72786   0.68287   0.64631   1.20498   0.53674   1.05573
          1.34422   0.92020   0.77750   0.69848   0.67895   1.26993   0.55839   1.08463
          2.51251   1.00168   0.82876   0.73770   0.71151   1.34422   0.57937   1.11500
                    1.09458   0.88250   0.77750   0.72097   1.43149   0.59983   1.14706
                    1.20498   0.93973   0.81834   0.74428   1.53786   0.61992   1.18107
                    1.34422   1.00168   0.86065   0.77750   1.67494   0.63975   1.21735
                    1.53786   1.07001   0.90491   0.81144   1.86927   0.65940   1.25628
                    2.71992   1.14706   0.95170   0.84635   3.08634   0.67895   1.29836
                              1.23646   1.00168                       0.69847   1.34422
                              1.34422   1.05573                       0.71803   1.39471
                              1.48174   1.11500                       0.73769   1.45097
                              1.67494   1.18107                       0.75749   1.51462
                              2.87010   1.25628                       0.77749   1.58803
                                        1.34422                       0.79776   1.67494
                                        1.45097                       0.81833   1.78165
                                        1.58803                       0.83706   1.92012
                                        1.78165                       0.83928   2.11773
                                        2.98842                       0.86065   3.36695

Table 13. Performance of Optimal Samples for the Type II Standard Log-normal Distribution (σ = 0.5)
Sample size (n)             5          10         15         20         25         30         50         ∞
Objective function          9.09E-03   2.93E-03   1.49E-03   9.12E-04   6.22E-04   4.53E-04   1.84E-04   0
Normalized moment, k = 1    0.98421    0.98998    0.99274    0.99436    0.99544    0.99620    0.99786    1.00000
Normalized moment, k = 2    1.16615    1.14438    1.13787    1.13501    1.13351    1.13264    1.13141    1.13315
Normalized moment, k = 3    1.33594    1.31429    1.30498    1.29977    1.29644    1.29413    1.28935    1.28403
Normalized moment, k = 4    1.47400    1.47447    1.47234    1.47042    1.46885    1.46756    1.46419    1.45499
Normalized moment, k = 5    1.58021    1.61124    1.62281    1.62895    1.63277    1.63538    1.64078    1.64872


2.7. Type II Standard Integrated-Normal Distribution

Probability density function:[7]

( )

(2.13)

Figure 13. Probability Density Function of the Type II Standard Integrated-Normal Distribution

Natural moments of the distribution:

( ) ∫ ( )
√ √ √
(2.14)

Figure 14. First Natural Normalized Moments of the Type II Standard Integrated-Normal Distribution


Table 14. Selected Optimal Samples for the Type II Standard Integrated-Normal Distribution
n = 5     n = 10    n = 15    n = 20    n = 25    n = 30    n = 30    n = 50    n = 50
                                                  (1-15)    (16-30)   (1-25)    (26-50)
0.00000   0.00222   0.01037   0.00629   0.00427   0.00312   0.44924   0.00131   0.44924
0.07292   0.06081   0.03085   0.02111   0.01427   0.01037   0.52223   0.00427   0.49194
0.27659   0.09257   0.03514   0.04335   0.02913   0.02111   0.60505   0.00863   0.53796
0.69945   0.21762   0.07292   0.07291   0.04867   0.03513   0.69945   0.01427   0.58762
4.73271   0.32408   0.12439   0.09464   0.07291   0.05238   0.80766   0.02111   0.64131
          0.54963   0.19131   0.11017   0.10206   0.07291   0.93252   0.02913   0.69945
          0.91972   0.27659   0.15577   0.13087   0.09685   1.07781   0.03832   0.76255
          1.47961   0.38468   0.21076   0.13647   0.12438   1.24853   0.04867   0.83119
          2.31436   0.52223   0.27659   0.17657   0.15388   1.45149   0.06019   0.90605
          5.50443   0.69945   0.35520   0.22301   0.15577   1.69626   0.07291   0.98794
                    0.93252   0.44924   0.27659   0.19131   1.99660   0.08686   1.07781
                    1.24853   0.56231   0.33834   0.23142   2.37317   0.10206   1.17680
                    1.69626   0.69945   0.40956   0.27659   2.85832   0.11858   1.28628
                    2.37317   0.86780   0.49194   0.32743   3.50584   0.13646   1.40791
                    6.36680   1.07781   0.58762   0.38468   7.46079   0.15576   1.54373
                              1.34546   0.69945                       0.17656   1.69626
                              1.69626   0.83119                       0.19894   1.86867
                              2.17369   0.98794                       0.20899   2.06499
                              2.85832   1.17680                       0.22301   2.29043
                              6.81407   1.40791                       0.24885   2.55182
                                        1.69626                       0.27659   2.85832
                                        2.06499                       0.30637   3.22252
                                        2.55182                       0.33834   3.66215
                                        3.22252                       0.37267   4.20309
                                        7.16726                       0.40956   8.31647

Table 15. Performance of Optimal Samples for the Type II Standard Integrated-Normal Distribution
Sample size (n)             5          10         15         20         25         30         50         ∞
Objective function          5.44E-01   3.23E-01   1.08E-01   7.01E-02   5.01E-02   3.78E-02   1.56E-02   0
Normalized moment, k = 1    1.15635    1.14651    0.99768    0.99456    0.99394    0.99408    0.99562    1.00000
Normalized moment, k = 2    2.14334    1.97881    1.87118    1.83352    1.81122    1.79633    1.76470    1.73205
Normalized moment, k = 3    2.77087    2.63715    2.65163    2.62056    2.59866    2.58212    2.54054    2.46621
Normalized moment, k = 4    3.16534    3.12389    3.25639    3.25997    3.25990    3.25848    3.24989    3.20109
Normalized moment, k = 5    3.43022    3.48320    3.71085    3.75599    3.78535    3.80637    3.85486    3.93628


2.8. Type III Standard Epanechnikov Distribution

Probability density function:[8]

$$\rho_z(z) = 6 z (1 - z), \quad 0 \le z \le 1$$

(2.15)

Figure 15. Probability Density Function of the Type III Standard Epanechnikov Distribution

Natural moments of the distribution:

$$M_k(z) = \int_0^1 6 z^{k+1} (1 - z) \, dz = \frac{6}{(k + 2)(k + 3)}$$

(2.16)

Figure 16. First Natural Normalized Moments of the Type III Standard Epanechnikov Distribution


Table 16. Selected Optimal Samples for the Type III Standard Epanechnikov Distribution
n = 5     n = 10    n = 15    n = 20    n = 25    n = 30    n = 30    n = 50    n = 50
                                                  (1-15)    (16-30)   (1-25)    (26-50)
0.16062   0.11708   0.09718   0.08920   0.07467   0.06649   0.50492   0.04331   0.50191
0.40078   0.24378   0.19495   0.15258   0.15489   0.14507   0.52749   0.12560   0.50937
0.50000   0.32895   0.25593   0.23517   0.19246   0.17585   0.54988   0.14098   0.52485
0.59922   0.40816   0.31343   0.25790   0.23061   0.20855   0.57217   0.16191   0.53903
0.83938   0.49164   0.36723   0.31707   0.26661   0.23965   0.59619   0.18186   0.55340
          0.50836   0.41811   0.33233   0.30141   0.26940   0.62135   0.20130   0.56779
          0.59184   0.46797   0.38778   0.33418   0.29826   0.64733   0.22027   0.58251
          0.67105   0.50000   0.42563   0.36596   0.32580   0.67420   0.23882   0.59726
          0.75622   0.53203   0.45577   0.39652   0.35267   0.70174   0.25664   0.61223
          0.88292   0.58189   0.48609   0.42631   0.37865   0.73060   0.27465   0.62756
                    0.63277   0.51391   0.45459   0.40381   0.76035   0.29159   0.64306
                    0.68657   0.54423   0.48151   0.42783   0.79145   0.30844   0.65893
                    0.74407   0.57437   0.50000   0.45012   0.82415   0.32492   0.67508
                    0.80505   0.61222   0.51849   0.47251   0.85493   0.34107   0.69156
                    0.90282   0.66767   0.54541   0.49508   0.93351   0.35694   0.70841
                              0.68293   0.57369                       0.37244   0.72535
                              0.74210   0.60348                       0.38777   0.74336
                              0.76483   0.63404                       0.40274   0.76118
                              0.84742   0.66582                       0.41749   0.77973
                              0.91080   0.69859                       0.43221   0.79870
                                        0.73339                       0.44660   0.81814
                                        0.76939                       0.46097   0.83809
                                        0.80754                       0.47515   0.85902
                                        0.84511                       0.49063   0.87440
                                        0.92533                       0.49809   0.95669

Table 17. Performance of Optimal Samples for the Type III Standard Epanechnikov Distribution
Sample size (n)             5          10         15         20         25         30         50         ∞
Objective function          7.32E-10   4.97E-10   4.84E-10   2.28E-10   2.70E-10   2.61E-10   3.72E-10   0
Normalized moment, k = 1    0.50000    0.50000    0.50000    0.50000    0.50000    0.50000    0.50000    0.50000
Normalized moment, k = 2    0.54773    0.54773    0.54773    0.54773    0.54773    0.54773    0.54773    0.54772
Normalized moment, k = 3    0.58482    0.58481    0.58481    0.58481    0.58481    0.58481    0.58481    0.58480
Normalized moment, k = 4    0.61479    0.61479    0.61479    0.61479    0.61479    0.61479    0.61479    0.61479
Normalized moment, k = 5    0.63970    0.63971    0.63971    0.63971    0.63971    0.63971    0.63971    0.63972


3. Final Remarks

Optimal sampling is a deterministic approach for representing continuous probability
distributions in a discrete way. The main idea behind optimal sampling is to closely represent
the behavior of the distribution by matching the first normalized sample moments to the
exact normalized distribution moments. Such normalization consists simply of taking the k-th
root of the corresponding k-th moment. As the name indicates, optimal sampling requires the
optimization of an objective function (squared differences between normalized moments),
using the elements of the sample as decision variables.

This particular optimization problem is challenging because multiple local optima are present in
the wide search region considered. Different optimization methods can be used for finding the
optimal samples, including gradient-based methods, stochastic-based methods, and heuristic
search methods. Using a combination of different methods is particularly useful for achieving
the solution in an efficient and effective way. The starting point for the optimization is usually
the interval-median sample, another deterministic sampling method based on finding the
median of each probability interval represented by each element in the sample.
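
The interval-median starting point can be sketched as follows, assuming that the sample of size n splits the probability range into n intervals of equal probability, so that element i is the quantile at cumulative probability (i - 0.5)/n. That reading of the method, and the helper name, are assumptions made for illustration. The standard Normal distribution is used as the example because its inverse cumulative distribution function is available in the Python standard library.

```python
from statistics import NormalDist


def interval_median_sample(inverse_cdf, n):
    """Quantiles at the medians of n equal-probability intervals."""
    return [inverse_cdf((i - 0.5) / n) for i in range(1, n + 1)]


# Starting point of size 5 for the standard Normal distribution
start = interval_median_sample(NormalDist().inv_cdf, 5)
print([round(x, 4) for x in start])
```

Any distribution with a computable inverse cumulative distribution function can be passed as `inverse_cdf`.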

As can be inferred, the size of the sample plays a key role in determining the elements of an
optimal sample. Small samples may be unable to reproduce the first normalized moments,
particularly when the number of moments to be matched is large. All optimal samples reported
here considered only the first 5 normalized moments. Larger samples, on the other hand, have
more degrees of freedom for adjusting the sample moments to the distribution moments. As the
size of the sample increases, the optimal sample approaches the interval-median sample. For
very large sample sizes (e.g. >1000), the interval-median sample can be considered an optimal
sample, avoiding the need for optimization.
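
The effect of the sample size can be illustrated with a case that needs no optimization at all. The sketch below uses the uniform distribution on [0, 1], whose interval medians are simply the interval midpoints and whose exact normalized moment of order k is (1/(k + 1))^(1/k). The choice of distribution, the sum-of-squares objective, and the function names are assumptions made for illustration.

```python
def normalized_moment(sample, order):
    """Sample moment of the given order, raised to the power 1/order."""
    raw = sum(x ** order for x in sample) / len(sample)
    return raw ** (1.0 / order)


def interval_median_uniform(n):
    """Interval medians of the uniform distribution on [0, 1] (midpoints)."""
    return [(i - 0.5) / n for i in range(1, n + 1)]


def objective_uniform(n, orders=range(1, 6)):
    """Sum of squared normalized-moment errors of the interval-median sample."""
    sample = interval_median_uniform(n)
    return sum(
        # Exact normalized moment of order k for U(0, 1): (1/(k+1))^(1/k)
        (normalized_moment(sample, k) - (1.0 / (k + 1)) ** (1.0 / k)) ** 2
        for k in orders
    )


for n in (5, 50, 1000):
    print(n, objective_uniform(n))
```

The printed objective values shrink as n grows, which is the behavior described above for large samples.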

The tables of optimal samples included in this report show the values of the elements of each
sample (considering different sample sizes: 5, 10, 15, 20, 25, 30 and 50), as well as their
performance in terms of the objective function used in the optimization and the values of the
normalized moments, for different representative standard distributions (Uniform, Normal,
Exponential, Maxwell-Boltzmann, χ², Log-normal, Integrated-normal, and Epanechnikov).
Standard distributions allow generalizing the results to a whole family of probability
distributions. Type I (mean zero and variance one) and Type II (mean one, positive values only)
standard distributions are unbounded on at least one side, and thus elements with very large
absolute values are possible in a sample. Those large values have a significant influence on the
normalized moments, particularly on the higher moments, causing a mismatch between the
sample moments and the distribution moments, especially for smaller sample sizes. Type III
(bounded between 0 and 1) distributions can be fitted with relative ease, achieving low values
of the objective function.

Another important consideration is the symmetry of the distribution. For perfectly symmetric
distributions, optimizing only half of the elements (either the lower or the upper half) in the
sample is more efficient than using all elements as decision variables. The other half of the
elements is easily found by reflecting the values of the first half about the mean. This approach
also allows improving the objective function in unbounded distributions, as is the case of the
standard Normal distribution.
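
The reflection step can be sketched as follows. The function name and the `include_center` flag (used for odd sample sizes, where the middle element coincides with the mean) are illustrative assumptions; the example values are rounded interval medians of the standard Normal distribution, not an optimal sample.

```python
def reflect_half(lower_half, mean=0.0, include_center=False):
    """Build a symmetric sample from its lower half by reflection about the mean."""
    # Each lower element x maps to 2*mean - x; reversing keeps ascending order
    upper_half = [2.0 * mean - x for x in reversed(lower_half)]
    center = [mean] if include_center else []
    return list(lower_half) + center + upper_half


# Only two decision variables are needed for a symmetric sample of size 5
full = reflect_half([-1.2816, -0.5244], mean=0.0, include_center=True)
print(full)
```

With this construction the sample mean and every odd central moment match those of a symmetric distribution by design, so the optimizer only has to adjust the even moments.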

From the families of continuous distributions considered, the χ² and the Log-normal
distributions involve additional parameters in their probability density functions (the degrees
of freedom for χ², and a shape parameter for the Log-normal). For each of those families,
optimal samples were obtained for only one particular parameter value. The selection of the
parameter values was rather arbitrary, as they were used only for illustrative purposes. In both
cases, it is also possible to obtain close-to-optimal samples by transforming optimal samples of
the standard Normal distribution. This is possible because optimal samples can be used for
representing the non-linear behavior of the full distribution. Thus, they can also be used for
analyzing the effects of uncertainty in non-linear systems, or for reducing the number of
elements required by Monte Carlo simulations.
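
As a hedged illustration of such a transformation, the sketch below maps a standard Normal sample to a Log-normal sample with mean one, using the fact that exp(σz - σ²/2) has mean one when z is standard Normal. The value of `SIGMA` is hypothetical and is not the parameter value used in this report, and the Normal sample is an interval-median stand-in rather than one of the optimal samples tabulated here.

```python
import math
from statistics import NormalDist


def to_lognormal_mean_one(z_sample, sigma):
    """Map standard Normal values to a Log-normal distribution with mean one."""
    # exp(sigma*z - sigma^2/2) has mean exactly one when z is standard Normal
    return [math.exp(sigma * z - 0.5 * sigma ** 2) for z in z_sample]


n = 50
# Stand-in Normal sample (interval medians), not an optimal sample from the report
z_sample = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

SIGMA = 0.5  # hypothetical shape parameter, chosen only for illustration
x_sample = to_lognormal_mean_one(z_sample, SIGMA)
print(sum(x_sample) / n)  # sample mean, to be compared with the exact mean of one
```

Replacing `z_sample` with an optimal sample of the standard Normal distribution would be the closer analogue of the procedure described above.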

Acknowledgments

The author gratefully acknowledges Prof. Jaime Aguirre (Universidad Nacional de Colombia)
for proof-reading the manuscript.

This research did not receive any specific grant from funding agencies in the public,
commercial, or not-for-profit sectors.

References

[1] Hernandez, H. (2019). Discretization of Probability Distributions: Random, Deterministic and
Randomistic Sampling. ForsChem Research Reports, 4, 2019-11. doi:
10.13140/RG.2.2.11389.92643.

[2] Hernandez, H. (2019). Goodness-of-fit of Randomistic Models. ForsChem Research Reports,
4, 2019-10. doi: 10.13140/RG.2.2.35386.34248.

[3] Hernandez, H. (2018). Multidimensional Randomness, Standard Random Variables and
Variance Algebra. ForsChem Research Reports, 3, 2018-02. doi: 10.13140/RG.2.2.11902.48966.

[4] Hernandez, H. (2018). Expected Value, Variance and Covariance of Natural Powers of
Representative Standard Random Variables. ForsChem Research Reports, 3, 2018-08. doi:
10.13140/RG.2.2.15187.07205.

[5] Hernandez, H. (2017). Standard Maxwell-Boltzmann distribution: Definition and properties.
ForsChem Research Reports, 2, 2017-2. doi: 10.13140/RG.2.2.29888.74244.

[6] Hernandez, H. (2017). Standard Maxwell-Boltzmann Distribution: Additional Nonlinear and
Multivariate Properties. ForsChem Research Reports, 2, 2017-14. doi:
10.13140/RG.2.2.35761.07520.

[7] Hernandez, H. (2018). Integrating Functions of Random Variables. ForsChem Research
Reports 2018-07. doi: 10.13140/RG.2.2.23660.87680.

[8] Epanechnikov, V. A. (1969). Non-parametric Estimation of a Multivariate Probability Density.
Theory of Probability & Its Applications, 14(1), 153-158.
