5, 2020-03
Hugo Hernandez
ForsChem Research, 050030 Medellin, Colombia
hugo.hernandez@forschem.org
doi: 10.13140/RG.2.2.27628.72327
Abstract
Keywords
1. Introduction
A previous report [1] presented and discussed different types of discretization methods of
probability distributions (sampling) including random, deterministic, and randomistic sampling
methods. One particular type of deterministic sampling method, optimal sampling, was
described. The purpose of optimal sampling is finding a set of $n$ values such that the first $m$
moments of the original distribution are preserved, by solving the following optimization
problem:
$$\min_{x_1,\dots,x_n} \sum_{k=1}^{m} \left( \left( \frac{1}{n} \sum_{i=1}^{n} x_i^k \right)^{1/k} - \left( E(X^k) \right)^{1/k} \right)^2$$
(1.1)
where $x_i$ are the elements of the deterministic sample of size $n$, $m$ is the number of integer
moments considered in the objective function, and $E(X^k)$ is the $k$-th moment of the distribution
of the sampled variable $X$ given by:
$$E(X^k) = \int_{-\infty}^{\infty} x^k \rho_X(x) \, dx$$
(1.2)
where $E(\cdot)$ denotes the expected value operator, and $\rho_X(x)$ represents the probability
density function of $X$. The moments used in the objective function are normalized by using the
$1/k$ power of the $k$-th moment.
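As an illustration, the objective function can be sketched in a few lines of Python. The function names below are hypothetical, and the signed root used for negative odd moments is an assumption made here so that symmetric distributions can be handled; it is not stated in the report. The sketch evaluates the n = 5 sample of the Type III standard Uniform distribution listed in Table 2.

```python
import math

def normalized_moments(sample, m=5):
    """Return the first m normalized sample moments, (mean of x^k)^(1/k)."""
    n = len(sample)
    moments = []
    for k in range(1, m + 1):
        raw = sum(x ** k for x in sample) / n
        # Signed root (an assumption) so that negative odd moments stay real.
        moments.append(math.copysign(abs(raw) ** (1.0 / k), raw))
    return moments

def objective(sample, target_moments):
    """Sum of squared deviations between sample and target normalized moments."""
    sample_moments = normalized_moments(sample, len(target_moments))
    return sum((s - t) ** 2 for s, t in zip(sample_moments, target_moments))

# Standard uniform distribution on [0, 1]: E(X^k) = 1 / (k + 1).
uniform_targets = [(1.0 / (k + 1)) ** (1.0 / k) for k in range(1, 6)]

# n = 5 sample reported in Table 2.
sample_n5 = [0.08063, 0.33025, 0.47489, 0.70076, 0.91348]
print(objective(sample_n5, uniform_targets))
```

The printed value should be of the same order as the objective function reported for this sample in Table 3, up to the rounding of the tabulated elements.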
The number of moments $m$ considered in the objective function (1.1) should not be larger than
the selected sample size $n$. In practice, taking into account the most relevant moments
considered in the description of random variables, $m$ should be at least 4. As the number
of moments considered increases and the sample size decreases, it becomes more difficult to
reach a good agreement between the moments of the original distribution and the moments of the
sample.
During optimization, the decision variables should remain within certain probable limits
determined by the sample size. (While not strictly necessary, this constraint forces a
representative distribution of the elements of the sample within the possible range of values.)
This can be done by directly defining the limits of the variables:
$$F_X^{-1}\left(\frac{i-1}{n}\right) \le x_i \le F_X^{-1}\left(\frac{i}{n}\right)$$
(1.3)
where $F_X$ is the cumulative probability distribution function of $X$, and $F_X^{-1}$ is the inverse of the
cumulative probability function. The inverse function used can be the exact analytical function,
or any empirical function providing a relatively good fit.
The initial values of the decision variables used in the optimization can be defined as the
interval median heuristic sample given by:
$$x_i^{(0)} = F_X^{-1}\left(\frac{2i-1}{2n}\right)$$
(1.4)
Any other set of initial values can be used, as long as constraints represented by Eq. (1.3)
are satisfied. For example, different randomistic samples [1] can be used as starting points
when local optima are observed.
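The limits of Eq. (1.3) and the starting point of Eq. (1.4) can be sketched as follows, assuming that each element is confined to its own probability interval, between the quantiles at (i-1)/n and i/n, and starts at the median of that interval. The example uses the standard exponential distribution because its inverse cumulative function is available in closed form; all function names are hypothetical.

```python
import math

def exponential_inverse_cdf(p):
    """Inverse CDF of the standard exponential distribution (mean one)."""
    return -math.log(1.0 - p)

def interval_bounds(n, inverse_cdf, upper_end=math.inf):
    """Limits of each element: F^-1((i-1)/n) <= x_i <= F^-1(i/n).

    upper_end is the value used for F^-1(1); it is infinite for
    distributions with an unbounded upper tail.
    """
    lower = [inverse_cdf((i - 1) / n) for i in range(1, n + 1)]
    upper = [inverse_cdf(i / n) for i in range(1, n)] + [upper_end]
    return lower, upper

def interval_median_sample(n, inverse_cdf):
    """Starting point: the median of each of the n probability intervals."""
    return [inverse_cdf((2 * i - 1) / (2 * n)) for i in range(1, n + 1)]

lower, upper = interval_bounds(20, exponential_inverse_cdf)
start = interval_median_sample(5, exponential_inverse_cdf)
print(upper[0], start[0])
```

With these limits the sample elements remain sorted during the optimization, since consecutive intervals only touch at their end points.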
The purpose of this report is to present examples of optimal sample sets obtained for a
selection of families of probability distributions, considering different sample sizes ($n$) and
using a fixed number of moments ($m = 5$, for practical purposes). Only standard probability
distributions [3] are sampled, given that the sample of any other distribution in the
family can be easily obtained using the following transformation:
$$x_i = a + b \, z_i$$
(1.5)
where $x_i$ is the sample obtained for any distribution in the family, $z_i$ is the sample obtained for
the standard distribution of the family, and $a$ and $b$ are constant parameters characteristic of
the family, defined according to Table 1.
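A minimal sketch of this transformation is shown below, assuming a linear (location and scale) mapping, which is consistent with the definitions of the Type I, Type II and Type III standard distributions given in the final remarks. The function name, the argument names, and the target mean and standard deviation are illustrative choices, not values from the report.

```python
def from_standard(z_sample, location=0.0, scale=1.0):
    """Map a standard sample z to x = location + scale * z."""
    return [location + scale * z for z in z_sample]

# n = 5 optimal sample of the Type I standard Normal distribution (Table 4).
z_normal = [-1.61880, -0.25339, 0.00000, 0.25339, 1.61880]

# Type I (mean zero, variance one): location = mean, scale = standard deviation.
# Illustrative target: a Normal distribution with mean 10 and standard deviation 2.
x_normal = from_standard(z_normal, location=10.0, scale=2.0)
print(x_normal)
```

For a Type II family (mean one) the same function would be called with `location=0` and `scale` equal to the desired mean, and for a Type III family (bounded between 0 and 1) with `location` equal to the minimum and `scale` equal to the range.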
The families of probability distributions considered in this report are: Uniform distribution,
Normal distribution, Exponential distribution, Maxwell-Boltzmann distribution, $\chi^2$ distribution,
Log-normal distribution, Integrated-normal distribution, and Epanechnikov's parabolic
distribution. Standard distributions for each family, some of which have been previously
reported [4], and their corresponding tables of optimal samples for different sizes are
presented in Section 2.
This section contains a basic description of the selected standard distributions (Uniform,
Normal, Exponential, Maxwell-Boltzmann, $\chi^2$, Log-normal, Integrated-normal, and
Epanechnikov) and the corresponding tables of optimal samples obtained for different sample
sizes ($n$ = 5, 10, 15, 20, 25, 30, and 50). In all cases, the objective function considers only the first
5 natural moments of each distribution. However, the objective function may include more
moments in order to obtain a sample more representative of the original distribution.
The optimization procedure used to obtain the sets of optimal samples was the following: First,
a non-linear generalized reduced gradient (GRG) optimization is performed, using the
corresponding interval median heuristic sample as the starting point. The tolerance of the objective
function (maximum final value allowed) was set to $5 \times 10^{-10}$,** corresponding to an average
tolerance of $10^{-5}$ for the deviation of each normalized moment considered in the objective
function (the same resolution considered for each value in the sample). If the GRG method
does not find an objective function value below $5 \times 10^{-10}$, the optimization is repeated from 10
different randomistic samples as starting points, in order to overcome local minima and to
reduce the gap with the tolerance limit. If none of the randomistic starting points achieves the
requested tolerance in the objective function, the best sample obtained becomes the starting
point for an iterative search optimization method. This iterative search method works by
changing each element of the sample (one by one) using an adaptive, bounded variable step
in both the positive and negative directions, updating the sample when the
objective function improves, and repeating the loop until no further improvement is obtained
(less than 0.1% in an iteration cycle). If the objective function tolerance limit is not reached after
this iterative search procedure, the sample with the best objective function obtained is
reported. This is usually the case when the sample size considered is small (particularly if it is
less than or equal to the number of moments considered in the objective function), or when the
samples of a distribution usually contain elements with large values (e.g. exponential,
log-normal, integrated-normal, or $\chi^2$ distributions). This "brute force" approach seems to work
well compared to other methods because the objective function has a large number of
local minima (affecting the effectiveness of gradient-based methods), and the overall search
region considered is too wide (affecting the efficiency of stochastic-based search methods).
The use of tolerances also implies that a "close to optimal" solution might have been found,
but not necessarily the global optimum. Furthermore, different sets of samples might result in
an objective function below the tolerance limit, and therefore multiple "optimal" sets with
similar performance might be obtained, depending on the starting point of the optimization.
However, only one set is reported for each distribution and each sample size.
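The iterative search step can be sketched as a simple coordinate-wise search. This is a simplified illustration and not the author's implementation: the initial and minimum step sizes are hypothetical values, the step is halved whenever a full cycle brings no improvement, and the search stops when the step falls below its minimum instead of using the 0.1% improvement criterion. The example refines the interval median sample of the standard uniform distribution for n = 5.

```python
import math

def normalized_moments(sample, m):
    n = len(sample)
    out = []
    for k in range(1, m + 1):
        raw = sum(x ** k for x in sample) / n
        out.append(math.copysign(abs(raw) ** (1.0 / k), raw))
    return out

def objective(sample, targets):
    moments = normalized_moments(sample, len(targets))
    return sum((a - b) ** 2 for a, b in zip(moments, targets))

def iterative_search(start, targets, lower, upper, step=1e-2, min_step=1e-6):
    """Coordinate-wise search: move one element at a time, keep improvements."""
    x = list(start)
    best = objective(x, targets)
    while step >= min_step:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[i] + delta
                if not (lower[i] <= trial <= upper[i]):
                    continue  # stay inside the limits of element i
                previous, x[i] = x[i], trial
                value = objective(x, targets)
                if value < best:
                    best, improved = value, True
                    break
                x[i] = previous  # undo a move that did not help
        if not improved:
            step /= 2.0  # shrink the step once a full cycle brings no gain
    return x, best

n = 5
targets = [(1.0 / (k + 1)) ** (1.0 / k) for k in range(1, 6)]
lower = [(i - 1) / n for i in range(1, n + 1)]
upper = [i / n for i in range(1, n + 1)]
start = [(2 * i - 1) / (2 * n) for i in range(1, n + 1)]

initial_value = objective(start, targets)
sample, final_value = iterative_search(start, targets, lower, upper)
print(initial_value, final_value, sample)
```

Because each accepted move strictly lowers the objective function, the search cannot cycle, but like any local method it may stop at a local minimum; this is why the report combines it with gradient-based runs from several starting points.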
For each optimal sample obtained, the first 5 normalized natural moments of the distribution
(denoted by a sample size $n = \infty$) are compared to the normalized sample moments in order to
show the optimal sample performance.
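The reference values in the last column of each performance table can be reproduced from closed-form moments. The sketch below does this for four of the distributions, using standard results for their raw moments (1/(k+1) for the uniform distribution on [0, 1], the double factorial (k-1)!! for even moments of the standard normal, k! for the standard exponential, and 6/((k+2)(k+3)) for the Epanechnikov density 6z(1-z) on [0, 1]). The dictionary and function names are hypothetical.

```python
import math

def double_factorial(k):
    """k!! for integer k, with k!! = 1 for k <= 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

# Raw natural moments E(Z^k) of some standard distributions.
raw_moment = {
    "uniform": lambda k: 1.0 / (k + 1),
    "normal": lambda k: 0.0 if k % 2 else float(double_factorial(k - 1)),
    "exponential": lambda k: float(math.factorial(k)),
    "epanechnikov": lambda k: 6.0 / ((k + 2) * (k + 3)),
}

def normalized(name, m=5):
    """Normalized moments E(Z^k)^(1/k) for k = 1..m."""
    return [raw_moment[name](k) ** (1.0 / k) for k in range(1, m + 1)]

for name in raw_moment:
    print(name, [round(v, 5) for v in normalized(name)])
```

These values serve as the targets of the objective function, so they only need to be computed once per distribution.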
** The value of the objective function tolerance is the same for all distributions because only
standard distributions are employed.
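The probability density function of the Type III standard Uniform variable $Z$ is:
$$\rho_Z(z) = \begin{cases} 1, & 0 \le z \le 1 \\ 0, & \text{otherwise} \end{cases}$$
(2.1)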
Figure 1. Probability Density Function of the Type III Standard Uniform Distribution
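The natural moments of the Type III standard Uniform variable $Z$ are:
$$E(Z^k) = \int_0^1 z^k \, dz = \frac{1}{k+1}$$
(2.2)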
Figure 2. First Natural Normalized Moments of the Type III Standard Uniform Distribution
Table 2. Selected Optimal Samples for the Type III Standard Uniform Distribution
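n = 5   n = 10   n = 15   n = 20   n = 25   n = 30 (1-15)   n = 30 (16-30)   n = 50 (1-25)   n = 50 (26-50)
(Each row holds the next element of every sample, in ascending order. The n = 30 and n = 50 samples are split into two columns each, covering the element ranges shown in parentheses. Once a column has listed all its elements, it is left out of the remaining rows.)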
0.08063 0.04446 0.03018 0.02496 0.00001 0.01656 0.51649 0.01089 0.51007
0.33025 0.14921 0.11936 0.07469 0.06501 0.05058 0.55002 0.02970 0.53011
0.47489 0.25125 0.13334 0.12434 0.12000 0.08379 0.58344 0.04970 0.55011
0.70076 0.35229 0.24989 0.17406 0.13544 0.11696 0.61685 0.06976 0.57013
0.91348 0.45105 0.29502 0.22396 0.18694 0.15006 0.65024 0.08971 0.59010
0.54921 0.37726 0.27407 0.21558 0.18313 0.68364 0.10977 0.61011
0.64750 0.43504 0.32439 0.28000 0.21630 0.71696 0.12969 0.63012
0.74759 0.48271 0.37487 0.29464 0.24956 0.75035 0.14976 0.65013
0.85262 0.59285 0.42541 0.34233 0.28279 0.78364 0.16974 0.67014
0.95482 0.61001 0.47591 0.37077 0.31606 0.81693 0.18974 0.69011
0.70721 0.52624 0.40578 0.34943 0.85028 0.20982 0.71012
0.75871 0.57630 0.44583 0.38281 0.88360 0.22976 0.73008
0.83752 0.62604 0.51564 0.41619 0.91691 0.24987 0.75012
0.90368 0.67546 0.52478 0.44966 0.95022 0.26978 0.77008
0.96723 0.72463 0.59581 0.48310 0.98352 0.28983 0.79009
0.77378 0.60645 0.30988 0.81008
0.82323 0.66193 0.32992 0.83009
0.87348 0.71975 0.34989 0.85010
0.92516 0.74460 0.36994 0.87006
0.97904 0.76682 0.38996 0.89002
0.81971 0.41001 0.91009
0.87080 0.43000 0.93005
0.89552 0.45003 0.95004
0.93001 0.47004 0.97005
0.98586 0.49007 0.99003
Table 3. Performance of the Optimal Samples for the Type III Standard Uniform Distribution
Sample size (n)            5         10        15        20        25        30        50        ∞
Objective function         3.00E-09  4.10E-12  1.66E-12  4.38E-11  5.00E-10  4.95E-10  4.31E-10  0
Normalized moment, k = 1   0.50000   0.50000   0.50000   0.50000   0.50000   0.50000   0.49999   0.50000
Normalized moment, k = 2   0.57734   0.57735   0.57735   0.57735   0.57735   0.57734   0.57736   0.57735
Normalized moment, k = 3   0.62996   0.62996   0.62996   0.62997   0.62995   0.62997   0.62997   0.62996
Normalized moment, k = 4   0.66878   0.66874   0.66874   0.66874   0.66876   0.66875   0.66874   0.66874
Normalized moment, k = 5   0.69879   0.69883   0.69883   0.69883   0.69882   0.69882   0.69882   0.69883
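The probability density function of the Type I standard Normal variable $Z$ is:
$$\rho_Z(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$$
(2.3)
and its natural moments are:
$$E(Z^k) = \int_{-\infty}^{\infty} \frac{z^k}{\sqrt{2\pi}} e^{-z^2/2} \, dz = \begin{cases} 0, & k \text{ odd} \\ \prod_{j=1}^{k/2} |2j-1|, & k \text{ even} \end{cases}$$
(2.4)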
Figure 4. First Natural Normalized Moments of the Type I Standard Normal Distribution
Table 4. Selected Optimal Samples for the Type I Standard Normal Distribution
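n = 5   n = 10   n = 15   n = 20   n = 25   n = 30 (1-15)   n = 30 (16-30)   n = 50 (1-25)   n = 50 (26-50)
(Each row holds the next element of every sample, in ascending order. The n = 30 and n = 50 samples are split into two columns each, covering the element ranges shown in parentheses. Once a column has listed all its elements, it is left out of the remaining rows.)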
-1.61880 -1.94385 -2.09715 -2.20106 -2.27187 -2.33788 0.00209 -2.50169 0.00007
-0.25339 -0.87203 -1.21085 -1.42678 -1.57031 -1.65182 0.10186 -1.90656 0.06367
0.00000 -0.60669 -0.92362 -1.08218 -1.25355 -1.35835 0.18893 -1.65628 0.11536
0.25339 -0.30482 -0.68401 -0.87647 -1.05572 -1.18885 0.27604 -1.46399 0.16627
1.61880 -0.00476 -0.48450 -0.71678 -0.84168 -0.99398 0.36422 -1.33767 0.21803
0.00476 -0.25343 -0.57586 -0.75239 -0.89101 0.45604 -1.21413 0.26972
0.30482 -0.12609 -0.44332 -0.62259 -0.75720 0.55097 -1.11499 0.32205
0.60669 0.00000 -0.31476 -0.50260 -0.65239 0.65239 -1.02385 0.37542
0.87203 0.12609 -0.18827 -0.39156 -0.55097 0.75720 -0.92016 0.42961
1.94385 0.25343 -0.06267 -0.28458 -0.45604 0.89101 -0.85737 0.48795
0.48450 0.06267 -0.17999 -0.36422 0.99398 -0.79692 0.53186
0.68401 0.18827 -0.07323 -0.27604 1.18885 -0.72775 0.60221
0.92362 0.31476 0.00000 -0.18893 1.35835 -0.66369 0.66369
1.21085 0.44332 0.07323 -0.10186 1.65182 -0.60221 0.72775
2.09715 0.57586 0.17999 -0.00209 2.33788 -0.53186 0.79692
0.71678 0.28458 -0.48795 0.85737
0.87647 0.39156 -0.42961 0.92016
1.08218 0.50260 -0.37542 1.02385
1.42678 0.62259 -0.32205 1.11499
2.20106 0.75239 -0.26972 1.21413
0.84168 -0.21803 1.33767
1.05572 -0.16627 1.46399
1.25355 -0.11536 1.65628
1.57031 -0.06367 1.90656
2.27187 -0.00007 2.50169
Table 5. Performance of the Optimal Samples for the Type I Standard Normal Distribution
Sample size (n)            5         10        15        20        25        30        50        ∞
Objective function         2.13E-03  1.30E-11  4.97E-10  8.02E-12  1.86E-11  4.87E-10  2.95E-10  0
Normalized moment, k = 1   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000
Normalized moment, k = 2   1.03629   1.00000   1.00000   1.00000   1.00000   1.00000   1.00000   1.00000
Normalized moment, k = 3   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000
Normalized moment, k = 4   1.28758   1.31607   1.31605   1.31608   1.31607   1.31605   1.31606   1.31607
Normalized moment, k = 5   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000   0.00000
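The probability density function of the Type II standard Exponential variable $Z$ is:
$$\rho_Z(z) = e^{-z}, \quad z \ge 0$$
(2.5)
and its natural moments are:
$$E(Z^k) = \int_0^{\infty} z^k e^{-z} \, dz = k!$$
(2.6)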
Figure 6. First Natural Normalized Moments of the Type II Standard Exponential Distribution
Table 6. Selected Optimal Samples for the Type II Standard Exponential Distribution
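n = 5   n = 10   n = 15   n = 20   n = 25   n = 30 (1-15)   n = 30 (16-30)   n = 50 (1-25)   n = 50 (26-50)
(Each row holds the next element of every sample, in ascending order. The n = 30 and n = 50 samples are split into two columns each, covering the element ranges shown in parentheses. Once a column has listed all its elements, it is left out of the remaining rows.)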
0.02296 0.10531 0.06899 0.05129 0.04082 0.03390 0.76214 0.02020 0.73397
0.22318 0.22308 0.14310 0.10536 0.08338 0.06899 0.83625 0.04082 0.77653
0.51085 0.30181 0.22314 0.16252 0.12783 0.10536 0.84285 0.06187 0.82098
0.91636 0.35673 0.31015 0.22314 0.17435 0.14310 0.91629 0.08338 0.8675
3.31589 0.51086 0.40547 0.28768 0.22314 0.18232 1.0033 0.10536 0.91629
0.69318 0.47998 0.35667 0.27444 0.22314 1.09861 0.12783 0.96758
0.91633 0.51083 0.43078 0.32850 0.26570 1.20397 0.15082 1.02165
1.20407 0.62861 0.51083 0.38566 0.31015 1.32176 0.17435 1.07881
1.60945 0.76214 0.59784 0.44629 0.35667 1.45529 0.19845 1.13943
3.94987 0.91629 0.61970 0.51083 0.40546 1.60944 0.22314 1.17992
1.09861 0.69315 0.57982 0.45676 1.79176 0.24846 1.20397
1.32176 0.79851 0.65393 0.51083 2.0149 0.27444 1.27297
1.60944 0.91629 0.73397 0.56798 2.30259 0.30110 1.34707
2.01490 1.04982 0.73821 0.62861 2.70805 0.32850 1.42712
4.32971 1.20397 0.82098 0.69315 4.99171 0.35667 1.51413
1.38629 0.91629 0.38566 1.60944
1.60944 1.02165 0.41551 1.7148
1.89712 1.13943 0.44629 1.83258
2.30259 1.27297 0.47803 1.96611
4.60267 1.42712 0.51082 2.12026
1.60944 0.54473 2.30259
1.83258 0.57982 2.52573
2.12026 0.61619 2.81341
2.52573 0.65393 3.21888
4.81619 0.69315 5.48799
Table 7. Performance of the Optimal Samples for the Type II Standard Exponential Distribution
Sample size (n)            5         10        15        20        25        30        50        ∞
Objective function         8.04E-02  2.22E-02  9.88E-03  5.39E-03  3.31E-03  2.20E-03  6.54E-04  0
Normalized moment, k = 1   0.99785   0.98707   0.98821   0.99028   0.99215   0.99370   0.99758   1.00000
Normalized moment, k = 2   1.55860   1.46635   1.43743   1.42469   1.41812   1.41443   1.40967   1.41421
Normalized moment, k = 3   1.95521   1.90248   1.87556   1.85972   1.84944   1.84232   1.82793   1.81712
Normalized moment, k = 4   2.22102   2.24330   2.24420   2.24199   2.23935   2.23686   2.22954   2.21336
Normalized moment, k = 5   2.40411   2.49951   2.53565   2.55462   2.56623   2.57402   2.58939   2.60517
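The probability density function of the Type II standard Maxwell-Boltzmann variable $Z$ is:
$$\rho_Z(z) = \frac{32}{\pi^2} z^2 e^{-4z^2/\pi}, \quad z \ge 0$$
(2.7)
and its natural moments are:
$$E(Z^k) = \int_0^{\infty} z^k \rho_Z(z) \, dz = \frac{2}{\sqrt{\pi}} \left( \frac{\pi}{4} \right)^{k/2} \Gamma\left( \frac{k+3}{2} \right)$$
(2.8)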
Figure 8. First Natural Normalized Moments of the Type II Standard Maxwell-Boltzmann Distribution
Table 8. Selected Optimal Samples for the Type II Standard Maxwell-Boltzmann Distribution
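n = 5   n = 10   n = 15   n = 20   n = 25   n = 30 (1-15)   n = 30 (16-30)   n = 50 (1-25)   n = 50 (26-50)
(Each row holds the next element of every sample, in ascending order. The n = 30 and n = 50 samples are split into two columns each, covering the element ranges shown in parentheses. Once a column has listed all its elements, it is left out of the remaining rows.)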
0.47601 0.26335 0.19663 0.24713 0.14037 0.25852 0.9862 0.17190 0.9727
0.85437 0.62615 0.54659 0.40515 0.44616 0.30411 1.02061 0.23507 0.99301
0.87292 0.75372 0.64027 0.51758 0.52451 0.41383 1.05618 0.33111 1.01364
1.05620 0.83617 0.71805 0.60054 0.58684 0.48808 1.09331 0.40560 1.03468
1.74306 0.91898 0.78805 0.67150 0.64027 0.54660 1.13244 0.48000 1.05618
1.04923 0.85437 0.73596 0.68303 0.63844 1.17413 0.52061 1.07825
1.10160 0.91969 0.79562 0.71491 0.68050 1.21914 0.55526 1.10096
1.20366 0.98621 0.85223 0.75596 0.71806 1.2685 0.58616 1.12443
1.32745 0.99583 0.90332 0.79576 0.75372 1.29543 0.61440 1.14876
1.91945 1.05619 0.95266 0.83446 0.78805 1.3458 0.64027 1.17412
1.13244 1.00328 0.87303 0.82148 1.3889 0.66478 1.19538
1.21915 1.05618 0.91135 0.85437 1.4899 0.68819 1.21415
1.32370 1.11259 0.96541 0.88701 1.61964 0.71071 1.25004
1.65661 1.17240 0.99128 0.91969 1.75539 0.73251 1.28047
1.96629 1.22171 1.03160 0.95266 2.1291 0.75371 1.30528
1.28096 1.07493 0.77445 1.33963
1.35031 1.12087 0.79480 1.38842
1.44541 1.19541 0.81485 1.43804
1.61417 1.22681 0.83468 1.46826
2.06119 1.31866 0.85437 1.53825
1.35867 0.87397 1.59944
1.44319 0.89353 1.67642
1.55411 0.91313 1.76741
1.73329 0.93282 1.91776
2.07908 0.95266 2.19434
Table 9. Performance of Optimal Samples for the Type II Standard Maxwell-Boltzmann Distribution
Sample size (n)            5         10        15        20        25        30        50        ∞
Objective function         1.02E-05  9.24E-08  1.72E-07  1.37E-08  6.40E-08  8.63E-09  1.89E-08  0
Normalized moment, k = 1   1.00051   0.99998   1.00000   0.99999   1.00000   0.99999   0.99999   1.00000
Normalized moment, k = 2   1.08373   1.08549   1.08539   1.08544   1.08543   1.08544   1.08545   1.08540
Normalized moment, k = 3   1.16328   1.16224   1.16228   1.16237   1.16230   1.16238   1.16235   1.16245
Normalized moment, k = 4   1.23523   1.23344   1.23359   1.23332   1.23344   1.23331   1.23333   1.23325
Normalized moment, k = 5   1.29756   1.29911   1.29900   1.29915   1.29909   1.29916   1.29915   1.29917
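The probability density function of the Type II standard $\chi^2$ variable $Z$ with $\nu$ degrees of freedom is:
$$\rho_Z(z) = \frac{(\nu/2)^{\nu/2}}{\Gamma(\nu/2)} z^{\nu/2 - 1} e^{-\nu z/2}, \quad z \ge 0$$
(2.9)
Figure 9. Probability Density Function of the Type II Standard $\chi^2$ Distribution for different values of $\nu$
The natural moments of the Type II standard $\chi^2$ variable are:
$$E(Z^k) = \int_0^{\infty} z^k \rho_Z(z) \, dz = \left( \frac{2}{\nu} \right)^k \frac{\Gamma(\nu/2 + k)}{\Gamma(\nu/2)}$$
(2.10)
Figure 10. First Natural Normalized Moments of the Type II Standard $\chi^2$ Distribution for different values
of $\nu$
Particularly, only the optimal samples of the standard $\chi^2$ distribution with $\nu = 5$ are presented.
Table 10. Selected Optimal Samples for the Type II Standard $\chi^2$ Distribution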
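n = 5   n = 10   n = 15   n = 20   n = 25   n = 30 (1-15)   n = 30 (16-30)   n = 50 (1-25)   n = 50 (26-50)
(Each row holds the next element of every sample, in ascending order. The n = 30 and n = 50 samples are split into two columns each, covering the element ranges shown in parentheses. Once a column has listed all its elements, it is left out of the remaining rows.)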
0.44114 0.32206 0.26309 0.13922 0.04931 0.00000 0.91992 0.00000 0.89982
0.46851 0.46850 0.37423 0.32206 0.28780 0.26309 0.97179 0.15038 0.9301
0.73110 0.59998 0.46850 0.39876 0.35394 0.32206 1.02637 0.20627 0.96121
1.02638 0.65212 0.55668 0.46850 0.41311 0.37423 1.08423 0.24999 0.99327
2.28276 0.73110 0.64327 0.53492 0.46850 0.42255 1.14609 0.29625 1.02637
0.87030 0.73109 0.59998 0.52179 0.46850 1.15077 0.35394 1.06065
1.02638 0.78773 0.66502 0.57402 0.51301 1.21289 0.38414 1.09626
1.21289 0.82251 0.73109 0.62593 0.55668 1.28584 0.41311 1.13336
1.45786 0.91993 0.79918 0.67812 0.59998 1.36667 0.44115 1.17216
2.63206 1.02638 0.87029 0.73109 0.64327 1.45786 0.46850 1.21288
1.14610 0.94555 0.78536 0.68688 1.56322 0.49533 1.25582
1.28584 0.95202 0.84142 0.73109 1.6891 0.52179 1.3013
1.45786 1.02638 0.89982 0.77620 1.84728 0.54799 1.34976
1.68910 1.11462 0.96121 0.82250 2.06395 0.57402 1.39967
2.83348 1.21289 1.02637 0.87029 3.16954 0.59998 1.40173
1.32514 1.06983 0.62593 1.45786
1.45786 1.09627 0.65196 1.51905
1.62304 1.17217 0.67812 1.58649
1.84728 1.25583 0.70448 1.66184
2.97335 1.34977 0.73109 1.74752
1.45786 0.75803 1.84728
1.58649 0.78536 1.96732
1.74752 0.81313 2.11925
1.96732 0.84142 2.32887
3.08122 0.87029 3.4145
Table 11. Performance of Optimal Samples for the Type II Standard $\chi^2$ Distribution
Sample size (n)            5         10        15        20        25        30        50        ∞
Objective function         3.58E-03  5.19E-04  1.61E-04  7.20E-05  3.63E-05  1.98E-05  3.52E-06  0
Normalized moment, k = 1   0.98998   0.99733   1.00039   1.00036   1.00008   1.00020   1.00014   1.00000
Normalized moment, k = 2   1.20109   1.18229   1.17920   1.18019   1.18119   1.18167   1.18270   1.18322
Normalized moment, k = 3   1.39440   1.37291   1.36567   1.36330   1.36216   1.36135   1.36052   1.36082
Normalized moment, k = 4   1.54698   1.54523   1.54228   1.54018   1.53880   1.53784   1.53601   1.53446
Normalized moment, k = 5   1.66184   1.68937   1.69739   1.70026   1.70174   1.70276   1.70428   1.70514
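The probability density function of the Type II standard Log-normal variable $Z$ with parameter $\sigma$ is:
$$\rho_Z(z) = \frac{1}{z \sigma \sqrt{2\pi}} \exp\left( -\frac{\left( \ln z + \sigma^2/2 \right)^2}{2\sigma^2} \right), \quad z > 0$$
(2.11)
Figure 11. Probability Density Function of the Type II Standard Log-normal Distribution for different
values of $\sigma$
The natural moments of the Type II standard Log-normal variable are:
$$E(Z^k) = \int_0^{\infty} z^k \rho_Z(z) \, dz = \exp\left( \frac{k(k-1)\sigma^2}{2} \right)$$
(2.12)
Figure 12. First Natural Normalized Moments of the Type II Standard Log-normal Distribution for different
values of $\sigma$
Table 12. Selected Optimal Samples for the Type II Standard Log-normal Distribution ($\sigma = 0.5$)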
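n = 5   n = 10   n = 15   n = 20   n = 25   n = 30 (1-15)   n = 30 (16-30)   n = 50 (1-25)   n = 50 (26-50)
(Each row holds the next element of every sample, in ascending order. The n = 30 and n = 50 samples are split into two columns each, covering the element ranges shown in parentheses. Once a column has listed all its elements, it is left out of the remaining rows.)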
0.39447 0.46497 0.41663 0.38773 0.36775 0.35276 0.8825 0.31604 0.8825
0.57938 0.51100 0.50642 0.46497 0.43712 0.41663 0.9202 0.36775 0.90491
0.77750 0.57938 0.57937 0.52559 0.49041 0.46497 0.95978 0.40560 0.92795
1.00168 0.67896 0.58358 0.57937 0.53674 0.50642 1.00168 0.43712 0.9517
2.16803 0.77750 0.64632 0.62986 0.57937 0.54405 1.0464 0.46497 0.97624
0.88250 0.71152 0.63824 0.61992 0.57937 1.09458 0.49041 1.00168
1.00168 0.77750 0.67896 0.65940 0.61326 1.14706 0.51419 1.02813
1.14706 0.84635 0.72786 0.68287 0.64631 1.20498 0.53674 1.05573
1.34422 0.92020 0.77750 0.69848 0.67895 1.26993 0.55839 1.08463
2.51251 1.00168 0.82876 0.73770 0.71151 1.34422 0.57937 1.11500
1.09458 0.88250 0.77750 0.72097 1.43149 0.59983 1.14706
1.20498 0.93973 0.81834 0.74428 1.53786 0.61992 1.18107
1.34422 1.00168 0.86065 0.77750 1.67494 0.63975 1.21735
1.53786 1.07001 0.90491 0.81144 1.86927 0.65940 1.25628
2.71992 1.14706 0.95170 0.84635 3.08634 0.67895 1.29836
1.23646 1.00168 0.69847 1.34422
1.34422 1.05573 0.71803 1.39471
1.48174 1.11500 0.73769 1.45097
1.67494 1.18107 0.75749 1.51462
2.87010 1.25628 0.77749 1.58803
1.34422 0.79776 1.67494
1.45097 0.81833 1.78165
1.58803 0.83706 1.92012
1.78165 0.83928 2.11773
2.98842 0.86065 3.36695
Table 13. Performance of Optimal Samples for the Type II Standard Log-normal Distribution ($\sigma = 0.5$)
Sample size (n)            5         10        15        20        25        30        50        ∞
Objective function         9.09E-03  2.93E-03  1.49E-03  9.12E-04  6.22E-04  4.53E-04  1.84E-04  0
Normalized moment, k = 1   0.98421   0.98998   0.99274   0.99436   0.99544   0.99620   0.99786   1.00000
Normalized moment, k = 2   1.16615   1.14438   1.13787   1.13501   1.13351   1.13264   1.13141   1.13315
Normalized moment, k = 3   1.33594   1.31429   1.30498   1.29977   1.29644   1.29413   1.28935   1.28403
Normalized moment, k = 4   1.47400   1.47447   1.47234   1.47042   1.46885   1.46756   1.46419   1.45499
Normalized moment, k = 5   1.58021   1.61124   1.62281   1.62895   1.63277   1.63538   1.64078   1.64872
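The probability density function of the Type II standard Integrated-normal variable $Z$ is:
$$\rho_Z(z) = \frac{e^{-z/2}}{\sqrt{2\pi z}}, \quad z > 0$$
(2.13)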
Figure 13. Probability Density Function of the Type II Standard Integrated-Normal Distribution
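The natural moments of the Type II standard Integrated-normal variable $Z$ are:
$$E(Z^k) = \int_0^{\infty} z^k \rho_Z(z) \, dz = \frac{2^k}{\sqrt{\pi}} \Gamma\left( k + \frac{1}{2} \right) = \prod_{j=1}^{k} (2j-1)$$
(2.14)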
Figure 14. First Natural Normalized Moments of the Type II Standard Integrated-Normal Distribution
Table 14. Selected Optimal Samples for the Type II Standard Integrated-Normal Distribution
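n = 5   n = 10   n = 15   n = 20   n = 25   n = 30 (1-15)   n = 30 (16-30)   n = 50 (1-25)   n = 50 (26-50)
(Each row holds the next element of every sample, in ascending order. The n = 30 and n = 50 samples are split into two columns each, covering the element ranges shown in parentheses. Once a column has listed all its elements, it is left out of the remaining rows.)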
0.00000 0.00222 0.01037 0.00629 0.00427 0.00312 0.44924 0.00131 0.44924
0.07292 0.06081 0.03085 0.02111 0.01427 0.01037 0.52223 0.00427 0.49194
0.27659 0.09257 0.03514 0.04335 0.02913 0.02111 0.60505 0.00863 0.53796
0.69945 0.21762 0.07292 0.07291 0.04867 0.03513 0.69945 0.01427 0.58762
4.73271 0.32408 0.12439 0.09464 0.07291 0.05238 0.80766 0.02111 0.64131
0.54963 0.19131 0.11017 0.10206 0.07291 0.93252 0.02913 0.69945
0.91972 0.27659 0.15577 0.13087 0.09685 1.07781 0.03832 0.76255
1.47961 0.38468 0.21076 0.13647 0.12438 1.24853 0.04867 0.83119
2.31436 0.52223 0.27659 0.17657 0.15388 1.45149 0.06019 0.90605
5.50443 0.69945 0.35520 0.22301 0.15577 1.69626 0.07291 0.98794
0.93252 0.44924 0.27659 0.19131 1.99660 0.08686 1.07781
1.24853 0.56231 0.33834 0.23142 2.37317 0.10206 1.17680
1.69626 0.69945 0.40956 0.27659 2.85832 0.11858 1.28628
2.37317 0.86780 0.49194 0.32743 3.50584 0.13646 1.40791
6.36680 1.07781 0.58762 0.38468 7.46079 0.15576 1.54373
1.34546 0.69945 0.17656 1.69626
1.69626 0.83119 0.19894 1.86867
2.17369 0.98794 0.20899 2.06499
2.85832 1.17680 0.22301 2.29043
6.81407 1.40791 0.24885 2.55182
1.69626 0.27659 2.85832
2.06499 0.30637 3.22252
2.55182 0.33834 3.66215
3.22252 0.37267 4.20309
7.16726 0.40956 8.31647
Table 15. Performance of Optimal Samples for the Type II Standard Integrated-Normal Distribution
Sample size (n)            5         10        15        20        25        30        50        ∞
Objective function         5.44E-01  3.23E-01  1.08E-01  7.01E-02  5.01E-02  3.78E-02  1.56E-02  0
Normalized moment, k = 1   1.15635   1.14651   0.99768   0.99456   0.99394   0.99408   0.99562   1.00000
Normalized moment, k = 2   2.14334   1.97881   1.87118   1.83352   1.81122   1.79633   1.76470   1.73205
Normalized moment, k = 3   2.77087   2.63715   2.65163   2.62056   2.59866   2.58212   2.54054   2.46621
Normalized moment, k = 4   3.16534   3.12389   3.25639   3.25997   3.25990   3.25848   3.24989   3.20109
Normalized moment, k = 5   3.43022   3.48320   3.71085   3.75599   3.78535   3.80637   3.85486   3.93628
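The probability density function of the Type III standard Epanechnikov variable $Z$ is:
$$\rho_Z(z) = 6z(1-z), \quad 0 \le z \le 1$$
(2.15)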
Figure 15. Probability Density Function of the Type III Standard Epanechnikov Distribution
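The natural moments of the Type III standard Epanechnikov variable $Z$ are:
$$E(Z^k) = \int_0^1 6 z^{k+1} (1-z) \, dz = \frac{6}{(k+2)(k+3)}$$
(2.16)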
Figure 16. First Natural Normalized Moments of the Type III Standard Epanechnikov Distribution
Table 16. Selected Optimal Samples for the Type III Standard Epanechnikov Distribution
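n = 5   n = 10   n = 15   n = 20   n = 25   n = 30 (1-15)   n = 30 (16-30)   n = 50 (1-25)   n = 50 (26-50)
(Each row holds the next element of every sample, in ascending order. The n = 30 and n = 50 samples are split into two columns each, covering the element ranges shown in parentheses. Once a column has listed all its elements, it is left out of the remaining rows.)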
0.16062 0.11708 0.09718 0.08920 0.07467 0.06649 0.50492 0.04331 0.50191
0.40078 0.24378 0.19495 0.15258 0.15489 0.14507 0.52749 0.12560 0.50937
0.50000 0.32895 0.25593 0.23517 0.19246 0.17585 0.54988 0.14098 0.52485
0.59922 0.40816 0.31343 0.25790 0.23061 0.20855 0.57217 0.16191 0.53903
0.83938 0.49164 0.36723 0.31707 0.26661 0.23965 0.59619 0.18186 0.55340
0.50836 0.41811 0.33233 0.30141 0.26940 0.62135 0.20130 0.56779
0.59184 0.46797 0.38778 0.33418 0.29826 0.64733 0.22027 0.58251
0.67105 0.50000 0.42563 0.36596 0.32580 0.67420 0.23882 0.59726
0.75622 0.53203 0.45577 0.39652 0.35267 0.70174 0.25664 0.61223
0.88292 0.58189 0.48609 0.42631 0.37865 0.73060 0.27465 0.62756
0.63277 0.51391 0.45459 0.40381 0.76035 0.29159 0.64306
0.68657 0.54423 0.48151 0.42783 0.79145 0.30844 0.65893
0.74407 0.57437 0.50000 0.45012 0.82415 0.32492 0.67508
0.80505 0.61222 0.51849 0.47251 0.85493 0.34107 0.69156
0.90282 0.66767 0.54541 0.49508 0.93351 0.35694 0.70841
0.68293 0.57369 0.37244 0.72535
0.74210 0.60348 0.38777 0.74336
0.76483 0.63404 0.40274 0.76118
0.84742 0.66582 0.41749 0.77973
0.91080 0.69859 0.43221 0.79870
0.73339 0.44660 0.81814
0.76939 0.46097 0.83809
0.80754 0.47515 0.85902
0.84511 0.49063 0.87440
0.92533 0.49809 0.95669
Table 17. Performance of Optimal Samples for the Type III Standard Epanechnikov Distribution
Sample size (n)            5         10        15        20        25        30        50        ∞
Objective function         7.32E-10  4.97E-10  4.84E-10  2.28E-10  2.70E-10  2.61E-10  3.72E-10  0
Normalized moment, k = 1   0.50000   0.50000   0.50000   0.50000   0.50000   0.50000   0.50000   0.50000
Normalized moment, k = 2   0.54773   0.54773   0.54773   0.54773   0.54773   0.54773   0.54773   0.54772
Normalized moment, k = 3   0.58482   0.58481   0.58481   0.58481   0.58481   0.58481   0.58481   0.58480
Normalized moment, k = 4   0.61479   0.61479   0.61479   0.61479   0.61479   0.61479   0.61479   0.61479
Normalized moment, k = 5   0.63970   0.63971   0.63971   0.63971   0.63971   0.63971   0.63971   0.63972
3. Final Remarks
This particular optimization problem is challenging because multiple local optima are present in
the wide search region considered. Different optimization methods can be used for finding the
optimal samples, including gradient-based methods, stochastic-based methods, and heuristic
search methods. Using a combination of different methods is particularly useful for achieving
the solution in an efficient and effective way. The starting point for the optimization is usually
the interval-median sample, another deterministic sampling method based on finding the
median of each probability interval represented by each element in the sample.
As can be inferred, the size of the sample plays a key role in the determination of the
elements in an optimal sample. Small-sized samples may become unable to represent the first
$m$ normalized moments, particularly for large values of $m$. All optimal samples reported here
considered only the first 5 normalized moments. On the other hand, larger samples have more
degrees of freedom for adjusting the sample moments to the distribution moments. As the size
of the sample increases, the optimal sample approaches the interval median sample. For very
large sample sizes (e.g. >1000), the interval median sample can be considered an optimal
sample, avoiding the use of optimization.
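The convergence of the interval median sample can be checked numerically. The sketch below evaluates the objective function of the interval median sample of the standard uniform distribution on [0, 1] for increasing sample sizes, assuming the interval median is the quantile at (2i-1)/(2n) and using the first 5 normalized moments; the function name is hypothetical.

```python
def objective_interval_median_uniform(n, m=5):
    """Objective function of the interval median sample, uniform on [0, 1]."""
    # For the uniform distribution the quantile function is the identity.
    sample = [(2 * i - 1) / (2 * n) for i in range(1, n + 1)]
    total = 0.0
    for k in range(1, m + 1):
        sample_moment = (sum(x ** k for x in sample) / n) ** (1.0 / k)
        exact_moment = (1.0 / (k + 1)) ** (1.0 / k)
        total += (sample_moment - exact_moment) ** 2
    return total

values = {n: objective_interval_median_uniform(n) for n in (10, 100, 1000)}
for n, value in values.items():
    print(n, value)
```

The objective function decreases quickly as n grows, which illustrates why no optimization is needed for very large samples. The rate of decrease shown here applies to this bounded distribution only; distributions with unbounded tails can be expected to converge more slowly.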
The tables of optimal samples included in this report show the values of the elements of each
sample (considering different sample sizes: 5, 10, 15, 20, 25, 30 and 50), but also their
performance in terms of the objective function used in the optimization and the values of the
normalized moments, for different representative standard distributions (Uniform, Normal,
Exponential, Maxwell-Boltzmann, $\chi^2$, Log-normal, Integrated-normal, and Epanechnikov).
Standard distributions allow generalizing the results to a whole family of probability
distributions. Type I (mean zero, and variance one) and Type II (mean one, positive values only)
standard distributions are unbounded on at least one side, and thus, elements with very large
absolute values are possible in a sample. Those large values have a significant influence on the
normalized moments, particularly on the higher moments, causing a mismatch between the
sample moments and the distribution moments, particularly for smaller sample sizes. Type III
(bounded between 0 and 1) distributions can be fitted with relative ease, achieving low values
of the objective function.
Another important consideration is the symmetry of the distribution. For perfectly symmetric
distributions, optimizing only half of the elements (either the lower or the upper half) in the
sample is more efficient than using all elements as decision variables. The other half of the
elements is easily found by reflecting the values of the first half about the mean. This approach
also allows improving the objective function in unbounded distributions, as is the case of the
standard Normal distribution.
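The reflection step can be sketched as follows. The function name is hypothetical, and placing the middle element exactly at the mean for odd sample sizes is an assumption, although it agrees with the odd-sized samples of the standard Normal distribution in Table 4.

```python
def reflect_half(lower_half, n, center=0.0):
    """Build a symmetric sample of size n from its lower half."""
    # Mirror each value about the center, keeping ascending order.
    upper_half = [2.0 * center - x for x in reversed(lower_half)]
    # Odd sample sizes get one element at the center (an assumption).
    middle = [center] if n % 2 else []
    return list(lower_half) + middle + upper_half

# First two elements of the n = 5 standard Normal sample in Table 4.
lower_half = [-1.61880, -0.25339]
sample = reflect_half(lower_half, n=5)
print(sample)
```

With this construction only the lower half of the elements are decision variables, and all odd sample moments about the mean vanish by design.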
From the families of continuous distributions considered, the $\chi^2$ and the Log-normal
distributions involve additional parameters in the probability density functions (the degrees of
freedom $\nu$ for $\chi^2$, and $\sigma$ for the Log-normal). For those families, optimal samples for only one
particular parameter value were considered. In the case of $\chi^2$, it was $\nu = 5$, and for the
Log-normal distribution, it was $\sigma = 0.5$. The selection of the parameters was rather arbitrary as they
were used only for illustrative purposes. In both cases, it is also possible to obtain
close-to-optimal samples by transforming optimal samples of the standard Normal distribution. The
latter is possible because optimal samples can be used for representing the non-linear behavior
of the full distribution. Thus, they can also be used for analyzing the effects of uncertainty in
non-linear systems, or for reducing the number of elements in Monte Carlo simulations.
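As a small illustration of this idea, the sketch below pushes the n = 10 optimal sample of the standard Normal distribution (Table 4) through the exponential transformation that turns a standard normal variable into a log-normal variable with mean one. The value of the shape parameter is an illustrative choice, and the variable names are hypothetical.

```python
import math

# n = 10 optimal sample of the Type I standard Normal distribution (Table 4).
z = [-1.94385, -0.87203, -0.60669, -0.30482, -0.00476,
     0.00476, 0.30482, 0.60669, 0.87203, 1.94385]

sigma = 0.5  # log-normal shape parameter (illustrative value)

# If Z is standard normal, exp(sigma*Z - sigma^2/2) is log-normal with mean one.
x = [math.exp(sigma * zi - 0.5 * sigma ** 2) for zi in z]

sample_mean = sum(x) / len(x)
sample_second_moment = sum(xi ** 2 for xi in x) / len(x)
exact_second_moment = math.exp(sigma ** 2)

print(sample_mean, sample_second_moment, exact_second_moment)
```

Only ten function evaluations are needed here, compared with the thousands of random draws a plain Monte Carlo estimate of the same moments would typically use. The transformed sample is close to optimal but not optimal, since the normal sample only matches the first five moments of the normal distribution.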
Acknowledgments
The author gratefully acknowledges Prof. Jaime Aguirre (Universidad Nacional de Colombia)
for proof-reading the manuscript.
This research did not receive any specific grant from funding agencies in the public,
commercial, or not-for-profit sectors.
References
[4] Hernandez, H. (2018). Expected Value, Variance and Covariance of Natural Powers of
Representative Standard Random Variables. ForsChem Research Reports, 3, 2018-08. doi:
10.13140/RG.2.2.15187.07205.