
E. Statistical evaluation of PRNGs


E.1. ENT battery of statistical tests¹
The ENT battery of statistical tests (i.e., entropy, chi-square test, arithmetic mean, Monte Carlo value of π and serial correlation coefficient) is useful for evaluating pseudo-random number generators for encryption and other applications where the information density of a file is of interest.

E.1.1. Monte Carlo π value approximation


In this test, each successive sequence of six bytes is used as 24-bit X and Y co-ordinates within a square. If the distance of the randomly-generated point is less than the radius of a circle inscribed within the square, the six-byte sequence is considered a hit. The percentage of hits can be used to calculate the value of π.
How?!
Let us consider a circle of radius R, as shown in the figure below. As we know, the area A of this circle can be computed with the aid of eq. (1). From this formula, we can extract the value of π, as shown in eq. (2). Furthermore, if we consider only the area of the circle included in the first quadrant (i.e., the area marked with green, which hereafter will be denoted as Q), the value of π can be computed with the aid of eq. (3). Now, the denominator within eq. (3) can be geometrically interpreted as the area of the square of side R (i.e., the area marked with red, which hereafter will be denoted as S). Therefore, it all reduces to the approximation of the ratio between Q and S (hereafter denoted as ρ), as shown in eq. (4).
The Monte Carlo approximation of π involves N randomly selected points {(x_i, y_i)} in the unit square and determining the ratio ρ = h/N, where h is the number of points that satisfy x_i² + y_i² ≤ 1.

A = π·R²  (1)

π = A / R²  (2)

π = 4·Q / R²  (3)

π = 4·(Q/S) = 4·ρ  (4)
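
For illustration, eq. (4) reduces to a few lines of MATLAB. The sketch below uses the built-in rand generator as the point source and an arbitrary sample size N, i.e., it is a minimal stand-in for the byte-stream construction used by ENT:

% Minimal sketch of the Monte Carlo estimate of eq. (4); rand stands in
% for the tested generator and N is an arbitrary sample size.
N = 50000;                       % number of random points
x = rand(N, 1);                  % X coordinates in the unit square
y = rand(N, 1);                  % Y coordinates in the unit square
h = sum(x.^2 + y.^2 <= 1);       % hits inside the quarter circle
Pi_est = 4 * h / N               % pi estimate, i.e., 4*rho, cf. eq. (4)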

¹ http://www.fourmilab.ch/random/
No. of points   π value approximation   abs. err. [%]
100             3.191919191919192       1.6019434687655515
500             3.174348697394790       1.0426572575399710
1000            3.159159159159159       0.5591592388431780
5000            3.154223084616923       0.4022861641530620
10000           3.149114911491149       0.2394409056425770
50000           3.141182823656473       0.0130452919429850
Execution time/point: 0.00357 s, resp., 0.00401 s.

Remarks:
1. Analysing the graphs above², one can observe that at the beginning of the simulation the results fluctuate wildly, whilst near the end they converge to the point of interest (i.e., the value of π). This is expected behavior, since the fewer points taken into consideration, the more weight each new one carries (i.e., after considering the first two points, the 3rd one can exert a great influence on the average value, whilst after considering the first 10⁶ points, the (10⁶+1)-st one can't exert that much influence on the average value).
2. If someone runs the same simulation model twice (i.e., the Monte Carlo π value approximation, e.g., with 1000 points taken into consideration), they will surely get two good results (i.e., in the sense of π value approximation) but totally different ones (e.g., π = 3.(159) with ε = 0.559%, resp., π = 3.(167) with ε = 0.814%). This fact strongly suggests that neither answer should be taken for granted.
3. Related to the previous two remarks, and generally speaking, even if computer simulation models can offer an expected result quite quickly, the more precision is required, the larger (sometimes exponentially so) the number of trials must be (i.e., in our case, the number of points taken into consideration). Thus, e.g., if 10⁷ points are taken into account, a much better estimate is computed, at a slower pace and with only slight differences after each 10⁶ points considered (resp., with close to the same answer every time the same simulation model is run). Now, this doesn't mean that at the end of the simulation the right answer was acquired but, with statistical comfort, it can be decided that enough points were considered.

In a more practical approach, being given the file MC_Pi_Aprox_binfile.txt (i.e., a pseudo-random bit string, of size 256 KB), using the pseudocode presented in Appendix E, with an associated relative error ε = 0.105046659574485%, the estimate is π = 3.138292515449760³.

² Generated using the pseudocode in Appendix D, resp., Appendix E
³ Similar results with the ones presented in Appendix F, i.e., as computed by the ENT statistics battery
E.1.2. Entropy
In this test, the information density of the contents of the file is evaluated and expressed as a number of bits per character. For example, the entropy of an 8-bit grayscale image should be, at least theoretically, equal to 8 bits. In practice, however, the resulting entropy is smaller than the ideal one. The smaller the resulting entropy, the greater the degree of predictability, a fact which threatens the security of encryption systems.
The entropy of a file (e.g., an image, in our case) can be computed with the aid of eq. (5), where p_i represents the probability of a given grayscale value and is computed as shown in eq. (6).

H = - Σ_{i=1}^{B} p_i · log₂(p_i)  (5)

p_i = c_i / T  (6)

In the above eqs., B represents the number of bins (i.e., the number of shades), c_i represents the number of counts (i.e., the number of pixels) for a given shade and T represents the total number of pixels.
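
For illustration, eqs. (5)-(6) can be evaluated directly in MATLAB. The sketch below assumes the Image Processing Toolbox (for imhist) and uses cameraman.tif merely as a stand-in 8-bit grayscale test image:

img = imread('cameraman.tif');   % stand-in 8-bit grayscale image
c = imhist(img, 256);            % counts c_i for the B = 256 bins
p = c / numel(img);              % probabilities p_i, eq. (6)
p = p(p > 0);                    % drop empty bins (0*log2(0) := 0)
H = -sum(p .* log2(p))           % global Shannon entropy, eq. (5)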

For our case (i.e., entropy of 8-bit grayscale images), there are some weaknesses exhibited by the so-called global Shannon entropy and thus a more powerful qualitative assessment implies the use of the local Shannon entropy⁴ (a minimal sketch is given after this list):
1. Inaccuracy: the global Shannon entropy sometimes fails to measure the true randomness of an image. Unlike the global Shannon entropy, the (k, T_B) local Shannon entropy can capture local image block randomness, a measure that might not be correctly reflected in the global Shannon entropy score⁵.
2. Inconsistency: the term global is commonly inconsistent for images with various sizes, making the global Shannon entropy unsuitable as a universal measure. However, the (k, T_B) local Shannon entropy is able to measure image randomness using the same set of parameters regardless of the various sizes of the test images and thus provides a relatively fair comparison of image randomness among multiple images⁶.
3. Low efficiency: the global Shannon entropy measure requires the pixel information of an entire image, which is costly when the test image is large. However, the local entropy measure requires only a portion of the total pixel information⁷.
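
A minimal sketch of the (k, T_B) local Shannon entropy follows: it samples k random 44 × 44 blocks (T_B = 1936 pixels) and averages their per-block entropies. Note that, unlike the reference procedure, this sketch does not enforce non-overlapping blocks, and cameraman.tif is again only a stand-in image:

img = imread('cameraman.tif');           % stand-in 8-bit grayscale image
k = 30; b = 44;                          % k blocks of b*b = 1936 pixels
H = zeros(k, 1);
for i = 1:k
    r0 = randi(size(img,1) - b + 1);     % random top-left block corner
    c0 = randi(size(img,2) - b + 1);     % (overlap is NOT checked here)
    blk = double(img(r0:r0+b-1, c0:c0+b-1));
    p = histcounts(blk(:), 0:256, 'Normalization', 'probability');
    p = p(p > 0);
    H(i) = -sum(p .* log2(p));           % per-block entropy, eq. (5)
end
localH = mean(H)                         % (k, T_B) local Shannon entropy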

⁴ Y. Wu, Y. Zhou, G. Saveriades, S. Agaian, J.P. Noonan, P. Natarajan, Local Shannon entropy measure with statistical tests for image randomness, Inf. Sci. 222 (2013) 323-342
⁵ Idem, Fig. 3 and Fig. 4, p. 331
⁶ Idem, Table 2, p. 332
⁷ Idem, Fig. 10, p. 340
For the computation of the local Shannon entropy, k = 30 non-overlapping blocks of pixels (each of them having T_B = 1936 pixels, taken from the ciphered image subjected to the local entropy assessment) were considered, the results being scrutinized with respect to the acceptance intervals at the 5%, 1% and 0.1% significance levels, as shown in the table below.

Local entropy critical values (k = 30, T_B = 1936):
  5% level:   h*_left = 7.901901305, h*_right = 7.903037329
  1% level:   h*_left = 7.901722822, h*_right = 7.903215812
  0.1% level: h*_left = 7.901515698, h*_right = 7.903422936

Test image   Global entropy   Local entropy   5%       1%       0.1%
Lenna        7.9968           7.9026          Passed   Passed   Passed

In a more practical approach, being given the file MC_Pi_Aprox_binfile.txt (i.e., a pseudo-random bit string, of size 256 KB), using the MATLAB command entropy(·), the computed entropy is H = 0.999968091102078⁸.

E.1.3. Chi-square test


The graph that depicts the distribution of pixel values within a file (e.g., our image), by representing their number relative to each intensity level (viz., counts vs. bins), is called a histogram.

For truly random files a uniform distribution is expected (i.e., so as to conceal the clues needed in statistical attacks). In order to assess whether the distribution within a histogram does approach the features of a uniform distribution (i.e., equiprobable frequency counts), its goodness-of-fit is subjected to testing (that is, with the aid of the chi-square test).
The chi-square test is the most well-known statistic used to test the agreement between observed and theoretical (expected) distributions, independence and homogeneity⁹.
The chi-square goodness-of-fit test checks if a sample of data (i.e., the frequency counts, in our case) came from a population with a specific distribution (i.e., a uniform distribution), by comparing the distribution of the sampled data against another distribution whose expected frequencies are known.

⁸ Similar results with the ones presented in Appendix F, i.e., as computed by the ENT statistics battery
⁹ S.D. Bolboacă, L. Jäntschi, A.F. Sestraş, R.E. Sestraş and D.C. Pamfil, Pearson-Fisher Chi-Square statistic revisited, Information 2 (2011) 528-545
The chi-square goodness-of-fit test works under the null hypothesis that there is no statistical difference between the observed values and the theoretical (expected) values, that is, the sampled data follows the assumed distribution, with respect to a significance level α.
Thus, the value of the χ²-test is given by eq. (7)¹⁰, where O_i is the observed frequency associated with the i-th frequency class and E_i is the expected frequency calculated from the theoretical distribution law.

χ² = Σ_i (O_i - E_i)² / E_i  (7)

The χ² value, computed by applying eq. (7) over the sampled data, is compared against the critical value χ²_{255, 0.05} = 293.25¹¹ and if it is lower than the critical value then the null hypothesis is accepted.
Based on the returned values, i.e., χ² = 285.39, which indicates that the null hypothesis is accepted at the default 5% significance level, resp., the hypothesis's p-value p = 0.0926 > α = 0.05, one can conclude that the data in the histogram's values vector came from a uniform distribution.
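
A hedged MATLAB sketch of this computation follows, testing an image histogram against the uniform expectation; chi2cdf is from the Statistics and Machine Learning Toolbox, and cameraman.tif is only a stand-in test image:

img = imread('cameraman.tif');     % stand-in 8-bit grayscale image
O = imhist(img, 256);              % observed frequencies O_i
E = numel(img) / 256;              % expected frequency under uniformity
chi2 = sum((O - E).^2 ./ E);       % eq. (7), 255 degrees of freedom
p = 1 - chi2cdf(chi2, 255)         % accept H0 at 5% if chi2 < 293.25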

E.1.4. Serial correlation coefficient

This quantity measures the extent to which each bit/byte in the file depends upon the previous one. For random sequences, this value (which can be positive or negative) will be close to zero.

Being given the file MC_Pi_Aprox_binfile.txt (i.e., a pseudo-random bit string, of size 256 KB), using the MATLAB command autocorr(·), the computed serial autocorrelation coefficients (i.e., for different lags) are shown in the graph above, the mean value of the serial autocorrelation coefficients being 0.000101.
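
The lag-1 coefficient can also be sketched with base MATLAB only (corrcoef), without the Econometrics Toolbox autocorr; the random bit vector below is merely a stand-in for the tested file:

b = randi([0 1], 1, 1e5);              % stand-in for the tested bit string
C = corrcoef(b(1:end-1), b(2:end));    % correlation with the shifted copy
r1 = C(1, 2)                           % close to 0 for random sequences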

¹⁰ R.E. Boriga, A.-C. Dăscălescu, A.-V. Diaconu, A new fast image encryption scheme based on 2D chaotic maps, IAENG Int. J. Comp. Sci. 41 (4) (2014) 249-258
¹¹ http://www.fourmilab.ch/rpkp/experiments/analysis/chiCalc.html
E.2. NIST Statistical Test Suite (i.e., NIST 800-22)¹²
Generators suitable for use in cryptographic applications may need to meet stronger requirements than for other applications. In particular, their outputs must be unpredictable in the absence of knowledge of the inputs. Some criteria for characterizing (i.e., in relation to cryptanalysis) and selecting appropriate generators are discussed in NIST Special Publication 800-22 Rev. 1a (April 2010).
In what follows, we briefly introduce the logistic map, which is the basic building block of the pseudo-random bit generator (PRBG) that will be subjected to randomness assessment using the NIST Statistical Test Suite.
The logistic map is defined by eq. (8), where x_n ∈ [0, 1] is the state variable and r ∈ [0, 4] is the system parameter:

x_{n+1} = r · x_n · (1 - x_n)  (8)
In the figure below, i.e., subplot (a), the map function f(x) is shown, for the system parameter r = 4 (a.k.a., the attractor of the map function f(x)). It can be observed that the map function is symmetric about the midpoint of the interval [0, 1]; resp., this iterative map shows a strange, complex behavior, where the map function never repeats its history. This peculiar behavior is termed chaos and, more precisely, it can be described by the phrase sensitivity to initial conditions. In subplot (b), two trajectories of the logistic map are shown, two trajectories which start nearby (i.e., with a difference of just 10⁻¹² between the two initial seeding points, that is, x₀¹ = 0.709364831851 and x₀² = 0.709364831852, resp., r = 4 in both cases) and soon diverge exponentially and exhibit no correlation between them. In subplot (c), the complete dynamical behavior of the logistic map is shown by using the bifurcation plot (i.e., a plot illustrating the qualitative changes in the dynamical behavior of the logistic map as a function of the system parameter r). One can notice from the bifurcation diagram that the map function is surjective on the complete interval [0, 1] only at r = 4, i.e., every value of f(x) in the interval [0, 1] is an image of at least one value of x in the same interval [0, 1]. Subplot (d)¹³ showcases the Lyapunov exponent, i.e., eq. (9), which is a quantitative measure of chaos (that is, a positive Lyapunov exponent indicates chaotic behavior).

λ = lim_{N→∞} (1/N) · Σ_{n=1}^{N} ln |f′(x_n)|  (9)

A random bit generator (RBG) is a device/algorithm which outputs a sequence of statistically independent and unbiased binary digits. Such a generator requires a naturally occurring (i.e., non-deterministic) source of randomness. In most practical environments, designing a hardware device or software program to exploit the natural source of randomness and produce a bit sequence free from biases and correlations is a difficult task. In such situations, the problem can be ameliorated by replacing an RBG with a PRBG. A pseudo-random bit generator (PRBG) is a deterministic algorithm which uses a truly random binary sequence of length k as input (i.e., the seed) and produces a binary sequence of length l ≫ k, called a pseudo-random sequence (i.e., it appears to be random)¹⁴.
For our study purposes, the bit sequence is generated by comparing the outputs of the logistic map with the threshold value t = 0.5:

B(x_n) = { 1, x_n > t
           0, x_n < t      (10)
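
A minimal sketch of this PRBG follows, iterating eq. (8) and applying the discretization rule (10); the seed is one of the values used above, and a production implementation would typically also discard an initial transient:

r = 4; t = 0.5; n = 1e4;          % map parameter, threshold, output length
x = 0.709364831851;               % seed x_0
bits = zeros(1, n);
for i = 1:n
    x = r * x * (1 - x);          % logistic map iteration, eq. (8)
    bits(i) = x > t;              % threshold discretization, eq. (10)
end
mean(bits)                        % should be close to 0.5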

¹² http://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-22r1a.pdf
¹³ Generated using the pseudocode shown in Appendix B (i.e., LM's bifurcation diagram and Lyapunov exponent), resp., Appendix C (i.e., LM's attractor and sensibility to initial conditions)
¹⁴ Most of the above text (i.e., red colored, and with the exception of the subplots referred to within the text) is reproduced from V. Patidar, K.K. Sud, A pseudo random bit generator based on chaotic logistic map and its statistical testing, Informatica 33 (2009) 441-452
E.2.1. Frequency (Monobit) Test
The focus of the test is the proportion of zeros and ones for the entire input sequence. The purpose of this test is to determine whether the numbers of ones and zeros in a sequence are approximately the same as would be expected for a truly random sequence. The test assesses the closeness of the fraction of ones to 1/2, that is, the number of ones and zeros in a sequence should be about the same. All subsequent tests depend on the passing of this test.

Test Description: (i) the zeros and ones of the input sequence ε, of length n, are converted to values of -1, resp., +1 and are added together to produce the sum S_n; (ii) compute the statistic s_obs, with the aid of eq. (11); then, (iii) compute the sequence's P-value, using eq. (12); (iv) if the computed P-value is less than the significance level α = 0.01, then conclude that the sequence is non-random.

s_obs = |S_n| / √n  (11)

P-value = erfc(s_obs / √2)  (12)
2
Input size recommendation: it is recommended that each sequence to be tested consist of a minimum of
100 bits.
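
In MATLAB, steps (i)-(iv) reduce to a few lines; the random vector below is only a stand-in for one of the tested sub-sequences:

bits = randi([0 1], 1, 16384);     % stand-in sub-sequence
n = numel(bits);
Sn = sum(2*bits - 1);              % step (i): +/-1 conversion and sum
s_obs = abs(Sn) / sqrt(n);         % eq. (11)
Pval = erfc(s_obs / sqrt(2))       % eq. (12); non-random if < 0.01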

For our study, the file NIST_Test_LM.txt will be used, i.e., a pseudo-random bit string generated using the logistic map and the discretization rule (10). The input sequence, having 2,097,152 bits in length, is divided into 128 non-overlapping blocks, i.e., each 16,384-bit sub-sequence being subjected to the statistical assessment.

There are several ways to interpret a set of P-values computed by an empirical test of randomness, i.e., examination of the proportion of sequences that pass a certain statistical test (that is, the relative number of P-values that are greater than the significance level α = 0.01), resp., uniformity testing of the P-values (that is, the P-values computed for random sequences should be uniformly distributed on the interval [0, 1)).
The acceptable proportion of passing sequences should fall within the interval defined by eq. (12′), i.e., 0.99 ± 0.0264 for the significance level α = 0.01 and the number of tested sequences m = 128.

(1 - α) ± 3·√( α·(1 - α) / m )  (12′)

The P-values computed by a single test should be uniformly distributed on the interval [0, 1). Hence, the uniformity of the P-values forms a hypothesis and it can be tested by a statistical test, i.e., a χ² goodness-of-fit test. For this, the interval [0, 1) is divided into 10 sub-intervals [i/10, (i+1)/10), i = 0, …, 9, and the test checks whether the number of P-values within each sub-interval is close to m/10¹⁵.
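
Both interpretations can be sketched as below; chi2cdf is from the Statistics and Machine Learning Toolbox, and the P-values are random stand-ins:

alpha = 0.01; m = 128;
b = 3 * sqrt(alpha*(1 - alpha)/m);          % half-width, approx. 0.0264
interval = [(1-alpha) - b, (1-alpha) + b]   % acceptance interval, eq. (12')
Pvals = rand(1, m);                         % stand-in P-values
F = histcounts(Pvals, 0:0.1:1);             % counts per sub-interval
chi2 = sum((F - m/10).^2 / (m/10));         % GoF statistic, 9 deg. of freedom
PvalT = 1 - chi2cdf(chi2, 9)                % uniformity P-value of P-values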

¹⁵ Most of the above text (i.e., red colored) is reproduced from M. Sýs, Z. Říha, V. Matyáš, K. Marton, A. Suciu, On the interpretation of results from the NIST Statistical Test Suite, Romanian Journal of Information Science and Technology 18 (1) (2015) 18-32
E.2.2. Frequency Test within a Block
The focus of the test is the proportion of ones within M-bit blocks. The purpose of this test is to determine whether the frequency of ones in an M-bit block is approximately M/2, as would be expected under an assumption of randomness.

Test Description: (i) partition the input sequence into N = ⌊n/M⌋ non-overlapping blocks (with any unused bits being discarded); (ii) determine the proportion π_i of ones in each M-bit block, using eq. (13); then, (iii) compute the χ² statistic (14); (iv) compute the sequence's P-value, using eq. (15); (v) if the computed P-value is less than the significance level α = 0.01, then conclude that the sequence is non-random.

π_i = ( Σ_{j=(i-1)M+1}^{iM} ε_j ) / M,  i = 1, …, N  (13)

χ² = 4M · Σ_{i=1}^{N} (π_i - 1/2)²  (14)

P-value = igamc(N/2, χ²/2)  (15)

Input size recommendation: it is recommended that each sequence to be tested consist of a minimum of 100 bits (i.e., n ≥ 100); resp., the block size M should be selected such that M ≥ 20, M > 0.01n and N < 100.

For our study, each of the 128 input sub-sequences (i.e., n = 16,384 bits) will be divided into N = 16 non-overlapping blocks (i.e., M = 1,024 bits).
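
A sketch with these parameters (n = 16,384, M = 1,024) follows; MATLAB's gammainc(·, ·, 'upper') plays the role of igamc, and the bit vector is a stand-in:

bits = randi([0 1], 1, 16384);               % stand-in sub-sequence
M = 1024; N = floor(numel(bits)/M);
pi_i = mean(reshape(bits(1:N*M), M, N), 1);  % eq. (13): ones ratio per block
chi2 = 4 * M * sum((pi_i - 0.5).^2);         % eq. (14)
Pval = gammainc(chi2/2, N/2, 'upper')        % eq. (15); non-random if < 0.01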

E.2.3. Runs Test


The focus of the test is the total number of runs in the sequence, where a run is an uninterrupted sequence of identical bits. A run of length k consists of exactly k identical bits and is bounded before and after by a bit of the opposite value. The purpose of the runs test is to determine whether the number of runs of ones and zeros of various lengths is as expected for a random sequence. In particular, this test determines whether the oscillation between such zeros and ones is too fast or too slow.

Test Description: (i) compute the pre-test proportion π of ones in the input sub-sequence, using eq. (16); (ii) determine if the prerequisite frequency test is passed, i.e., if |π - 1/2| < τ, where τ is defined by eq. (17); then, (iii) compute the test statistic V_n(obs), with the aid of eq. (18); (iv) compute the sequence's P-value, using eq. (19); (v) if the computed P-value is less than the significance level α = 0.01, then conclude that the sequence is non-random.

π = ( Σ_j ε_j ) / n  (16)

τ = 2 / √n  (17)

V_n(obs) = Σ_{k=1}^{n-1} r(k) + 1, where r(k) = 0 if ε_k = ε_{k+1}, and r(k) = 1 otherwise  (18)

P-value = erfc( |V_n(obs) - 2nπ(1 - π)| / ( 2·√(2n)·π·(1 - π) ) )  (19)
Input size recommendation: it is recommended that each sequence to be tested consist of a minimum of
100 bits.
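
The test reduces to counting bit transitions; a sketch, with a stand-in sub-sequence:

bits = randi([0 1], 1, 16384);           % stand-in sub-sequence
n = numel(bits);
p = mean(bits);                          % eq. (16)
if abs(p - 0.5) >= 2/sqrt(n)             % prerequisite check, eq. (17)
    Pval = 0.0                           % frequency test failed
else
    Vobs = sum(bits(1:end-1) ~= bits(2:end)) + 1;                 % eq. (18)
    Pval = erfc(abs(Vobs - 2*n*p*(1-p)) / (2*sqrt(2*n)*p*(1-p)))  % eq. (19)
end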

E.2.4. Test for the Longest Run of Ones in a Block


The focus of the test is the longest run of ones within M-bit blocks. The purpose of this test is to determine whether the length of the longest run of ones within the tested sequence is consistent with the length of the longest run of ones that would be expected in a random sequence. Note that an irregularity in the expected length of the longest run of ones implies that there is also an irregularity in the expected length of the longest run of zeros. Therefore, only a test for ones is necessary.

Test Description: (i) divide the sequence into N M-bit blocks; (ii) tabulate the frequencies ν_i of the longest runs of ones in each block into K + 1 categories, where each cell contains the number of blocks whose longest run of ones has a given length; (iii) compute the χ² statistic (20); (iv) compute the sequence's P-value, using eq. (21); (v) if the computed P-value is less than the significance level α = 0.01, then conclude that the sequence is non-random.
For our study, each of the 128 input sub-sequences (i.e., n = 16,384 bits) will be divided into N = 2,048 non-overlapping blocks (i.e., M = 8 bits). Doing this, only four ranges of frequencies will be tabulated, i.e., ν₀ will hold the frequencies of longest runs that are less than or equal to 1, ν₁ will hold the frequencies of longest runs that are equal to 2, ν₂ will hold the frequencies of longest runs that are equal to 3, resp., ν₃ will hold the frequencies of longest runs that are equal to or greater than 4.

χ² = Σ_{i=0}^{K} (ν_i - N·π_i)² / (N·π_i)  (20)

where, for M = 8-bit blocks, K = 3, resp., π₀ = 0.2148, π₁ = 0.3672, π₂ = 0.2305 and π₃ = 0.1875.¹⁶

P-value = igamc(K/2, χ²/2)  (21)
Input size recommendation: it is recommended that each sequence to be tested consist of a minimum of 128 bits for M = 8-bit blocks, 6,272 bits for M = 128-bit blocks and 750,000 bits for M = 10⁴-bit blocks.
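
A sketch for the M = 8 parameters above follows; the run-length extraction pads each block with zeros, so that runs of ones appear as gaps between consecutive zero positions:

bits = randi([0 1], 1, 16384);               % stand-in sub-sequence
M = 8; N = floor(numel(bits)/M);
blocks = reshape(bits(1:N*M), M, N)';
v = zeros(1, 4);                             % nu_0 .. nu_3
for i = 1:N
    z = find([0, blocks(i,:), 0] == 0);      % zero positions, padded
    L = max(diff(z) - 1);                    % longest run of ones in block i
    idx = min(max(L, 1), 4);                 % categories: <=1, 2, 3, >=4
    v(idx) = v(idx) + 1;
end
p = [0.2148 0.3672 0.2305 0.1875];           % pi_i for M = 8, K = 3
chi2 = sum((v - N*p).^2 ./ (N*p));           % eq. (20)
Pval = gammainc(chi2/2, 3/2, 'upper')        % eq. (21), igamc(K/2, chi2/2)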

E.2.5. Binary Matrix Rank Test


The focus of the test is the rank of disjoint sub-matrices of the entire sequence. The purpose of this test is to check for linear dependence among fixed-length substrings of the original sequence. Note that this test also appears in the DIEHARD battery of tests¹⁷.

Test Description: (i) divide the input sequence into M·Q-bit disjoint blocks, i.e., N = ⌊n/(M·Q)⌋ such blocks, and collect the M·Q-bit segments into M×Q matrices¹⁸ (i.e., each row of the matrix is filled with a successive Q-bit block); (ii) determine the binary rank of each matrix; (iii) with F_{M-i} representing the number of matrices with the rank M - i (i.e., for i = 0, 1), resp., with N - F_M - F_{M-1} representing the number of remaining matrices, compute the χ² statistic (22); (iv) compute the sequence's P-value, using eq. (23); (v) if the computed P-value is less than the significance level α = 0.01, then conclude that the sequence is non-random.

χ² = (F_M - p_M·N)² / (p_M·N) + (F_{M-1} - p_{M-1}·N)² / (p_{M-1}·N) + (N - F_M - F_{M-1} - p_{M-2}·N)² / (p_{M-2}·N)  (22)¹⁹

P-value = e^{-χ²(obs)/2}  (23)

Input size recommendation: the minimum number of bits to be tested must be such that n ≥ 38·M·Q, i.e., at least 38 matrices are created (in our case, for M = Q = 8, each sequence to be tested should consist of a minimum of 2,432 bits).

¹⁶ The π_i values for the other cases, i.e., K = 5 and M = 128, K = 5 and M = 512, K = 5 and M = 1000, resp., K = 6 and M = 10000, are listed in Section 3.4 of the NIST 800-22 standard; whilst the corresponding ranges of frequencies are listed in Section 2.4.4 of the same standard
¹⁷ G. Marsaglia, DIEHARD Statistical Tests, [http://www.stat.fsu.edu/pub/diehard/]
¹⁸ Usually, M = Q, i.e., square matrices are taken into consideration
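
A sketch for M = Q = 8 follows; gfrank (Communications Toolbox) computes the rank over GF(2), and the expected proportions p used here are the asymptotic values from the NIST document, standing in for the exact M = 8 proportions referred to in note 19:

bits = randi([0 1], 1, 16384);               % stand-in sub-sequence
M = 8; Q = 8; N = floor(numel(bits)/(M*Q));
p = [0.2888 0.5776 0.1336];                  % assumed (asymptotic) proportions
F = zeros(1, 2);                             % counts of rank M and rank M-1
for i = 1:N
    A = reshape(bits((i-1)*M*Q+1 : i*M*Q), Q, M)';  % rows = Q-bit segments
    rk = gfrank(A, 2);                       % binary rank over GF(2)
    if rk == M
        F(1) = F(1) + 1;
    elseif rk == M-1
        F(2) = F(2) + 1;
    end
end
cnt = [F, N - sum(F)];                       % the three rank categories
chi2 = sum((cnt - N*p).^2 ./ (N*p));         % eq. (22)
Pval = exp(-chi2/2)                          % eq. (23)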
Table 1 summarizes partial NIST STS results (i.e., for all tests described above). Each row of this table corresponds to one test. The values in the columns C1, C2, …, C10 represent the number of P-values that fall within the intervals [0.0, 0.1), [0.1, 0.2), …, [0.9, 1.0). The value in the column Proportion represents the proportion of sequences that pass a given test (i.e., with respect to eq. (12′)).
The uniformity testing of the P-values is shown in Appendix A. For this test, a small computed P-value of the P-values (e.g., the case of the uniformity testing of the P-values for the Binary Matrix Rank Test, i.e., Appendix A, last row) indicates a problem of the generator, but it is hard to identify a concrete weakness.
Table 1. Partial NIST STS results

TEST   C1  C2  C3  C4  C5  C6  C7  C8  C9  C10  Proportion  Result
2.1    12  23  11  12  10  13  15  09  13  09   0.9922      PASSED
2.2    12  07  07  15  12  10  12  18  17  18   0.9844      PASSED
2.3    07  11  10  15  14  18  13  12  18  10   0.9844      PASSED
2.4    09  04  12  11  15  16  15  11  15  20   0.9922      PASSED
2.5    12  10  15  17  22  18  12  09  10  03   0.9766      PASSED

¹⁹ p_{M-i} represents the expected proportion of binary matrices with the rank M - i; for M = 8 the expected proportions are computed here: [link]
E.3. Supplementary testing of random images
E.3.1. The uniformity of the bit distribution within each bit-plane
The bit arrangement in each pixel of an 8-bit greyscale image makes it possible for the most significant bits to contain a large percentage of the total information of a pixel. Subsequently, the most significant bit-planes will provide most of the inherent visual information carried by the image. Therefore, it is desirable that not only the position of each pixel but also its value be changed during the confusion stage²⁰,²¹. To assess whether and to what extent pixel values are changed during the confusion stage, an analysis of the percentage of bit value information within each bit-plane was performed.
The percentages of bit value information for each bit-plane within the Lenna plain-image, resp., its scrambled and ciphered versions, are summarized in Table 2 (a short sketch of the per-bit-plane computation is given after the table). We can easily observe the fact that, even after the confusion stage alone, the bit distribution within each bit-plane is much more uniform. This fact can be visually assessed by comparing Fig. 1.a) with Fig. 1.b) and Fig. 1.c), respectively Fig. 2.a) with Fig. 2.b) and Fig. 2.c). To be noticed the fact that the subsequent diffusion process further improves the uniformity of the bit distribution within each bit-plane.

Fig. 1 Higher bit-planes of the Lenna testing image, i.e., the 8th bit-plane:
a) plain-image, b) scrambled image and c) ciphered image

Fig. 2 Higher bit-planes of the Lenna testing image, i.e., the 7th bit-plane:
a) plain-image, b) scrambled image and c) ciphered image

²⁰ W. Zhang et al., A symmetric color image encryption algorithm using the intrinsic features of bit distribution, Commun Nonlinear Sci Numer Simulat 18 (2013) 584-600
²¹ Y. Zhou et al., Image encryption using binary bit-plane, Signal Processing 100 (2014) 197-207
Table 2
Percentage of bit value information for each bit-plane
Lenna testing image vs. its scrambled and ciphered versions (percentage of 1s)
Test image Stage 8th bit 7th bit 6th bit 5th bit 4th bit 3rd bit 2nd bit 1st bit
plain (%) 69.785 52.055 57.626 48.289 50.154 50.142 49.719 50.203
Lenna scrambled (%) 50.142 50.166 49.899 50.511 50.265 50.050 50.641 50.285
ciphered (%) 50.099 50.148 49.872 50.199 50.003 50.218 49.936 50.038
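
The percentages of Table 2 can be reproduced with the short sketch below; bitget extracts bit-plane b of an 8-bit image, and cameraman.tif is merely a stand-in for the tested images:

img = imread('cameraman.tif');             % stand-in 8-bit grayscale image
pct = zeros(1, 8);
for b = 1:8
    plane = bitget(img, b);                % b = 8 is the most significant plane
    pct(b) = 100 * mean(double(plane(:))); % percentage of 1s in the bit-plane
end
pct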

E.3.2. Correlation between neighboring higher bit-planes

Besides the bit value distribution within the image's bit-planes, some other intrinsic features among different bit-planes should be assessed, i.e., the correlation coefficients between neighboring higher bit-planes. There are strong correlations among the higher bit-planes of a plain-image and a good cryptographic algorithm should considerably reduce them.
For the evaluation of the correlation coefficients between the neighboring higher bit-planes, the three most significant bit-planes of the Lenna testing plain-image, resp., its scrambled and ciphered versions, were divided into sixteen non-overlapping blocks, 64 × 64 bits each. The correlation coefficients, computed for each pair of these blocks, as shown in Fig. 3, are summarized in Table 3 (i.e., as mean value and standard deviation from the mean).
Analyzing Fig. 3 and the data summarized in Table 3, we can conclude that the proposed confusion strategy considerably reduces the correlation coefficients among the higher bit-planes of the processed plain-images. These correlation coefficients are further reduced after the subsequent diffusion stage of the proposed encryption algorithm.

Fig. 3 Correlation coefficients within the Lenna plain-image vs. scrambled images:
a) 8th and 7th bit-planes' blocks, b) 7th and 6th bit-planes' blocks, and c) 8th and 6th bit-planes' blocks

Table 3
Correlation coefficients between higher bit-planes
Lenna testing plain-image vs. its scrambled and ciphered versions

Bit planes   Stage       Mean     Std_dev.
8th/7th      plain       0.3878   0.2270
             scrambled   0.0283   0.0116
             ciphered    0.0042   0.0163
8th/6th      plain       0.2599   0.2164
             scrambled   0.0087   0.0198
             ciphered    0.0025   0.0199
7th/6th      plain       0.0080   0.2955
             scrambled   0.0079   0.0091
             ciphered    0.0046   0.0111
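
The block-wise coefficients of Table 3 can be sketched as follows, for the 8th/7th bit-plane pair of a 256 × 256 stand-in image (corrcoef flattens its two matrix inputs into column vectors):

img = imread('cameraman.tif');               % stand-in 256x256 image
p8 = double(bitget(img, 8));                 % 8th (most significant) plane
p7 = double(bitget(img, 7));                 % 7th plane
cc = zeros(4, 4);
for i = 1:4
    for j = 1:4
        r = (i-1)*64 + (1:64);               % 64x64 block indices
        c = (j-1)*64 + (1:64);
        C = corrcoef(p8(r, c), p7(r, c));
        cc(i, j) = C(1, 2);
    end
end
[mean(cc(:)), std(cc(:))]                    % cf. Table 3 (mean, std_dev)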
E.3.3. Gray difference and degree of scrambling
The gray degree of scrambling²² represents, in essence, a qualitative measure of the scrambling effect achieved over an image and is defined as follows:

GDD = [ E(GD′(i, j)) - E(GD(i, j)) ] / [ E(GD′(i, j)) + E(GD(i, j)) ],  (24)

where E(GD(i, j)) and E(GD′(i, j)) represent the mean gray differences of the original image and the scrambled image, respectively, being defined as follows:

E(GD(i, j)) = [ Σ_{i=2}^{M-1} Σ_{j=2}^{N-1} GD(i, j) ] / [ (M - 2)(N - 2) ],  (25)

where M, resp., N represent the image's dimensions and GD(i, j) represents the gray difference between the pixel of coordinates (i, j) and its neighboring pixels, computed as follows:

GD(i, j) = (1/4) Σ_{(i′, j′)} [ P(i, j) - P(i′, j′) ]²,  (26)

where (i′, j′) ∈ {(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)} and P(i, j) denotes the gray value of the pixel at coordinates (i, j).
In our case, the degree of scrambling within the image obtained at the output of the proposed confusion strategy is GDD = 0.9550.
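
Eqs. (24)-(26) translate directly into the sketch below; the row/column permutation used as the "scrambling" is only a toy stand-in for the proposed confusion strategy, and the local-function form requires saving the sketch as a script file (R2016b or newer):

P = double(imread('cameraman.tif'));              % stand-in plain image
S = P(randperm(size(P,1)), randperm(size(P,2)));  % toy row/column scrambling
GDD = (meanGD(S) - meanGD(P)) / (meanGD(S) + meanGD(P))   % eq. (24)

function e = meanGD(I)                        % E(GD(i,j)), eq. (25)
    [M, N] = size(I);
    s = 0;
    for i = 2:M-1
        for j = 2:N-1
            nb = [I(i-1,j), I(i+1,j), I(i,j-1), I(i,j+1)];
            s = s + mean((I(i,j) - nb).^2);   % GD(i,j), eq. (26)
        end
    end
    e = s / ((M-2)*(N-2));
end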

E.3.4. Adjacent pixel correlation coefficients

In plain-images, due to the pixels' arrangement, there exists a high correlation between adjacent pixels in the horizontal, vertical and diagonal directions. One goal of a good image encryption algorithm, among the others already stipulated, is to minimize as much as possible the typical correlation between adjacent pixels.
Fig. 5 plots the distribution of horizontally adjacent pixels in the plain, scrambled and ciphered images. As can be seen, whilst within the plain-image the adjacent pixels have values equal or close to each other, in the ciphered image the adjacent pixels' values suffer great changes (i.e., the inherent correlation has been weakened).

Fig. 5 Horizontal adjacent pixel correlation before and after image scrambling, resp., ciphering of the Lenna plain-image: a) before scrambling; b) after scrambling; c) after ciphering

²² R. Ye et al., A novel image scrambling and watermarking scheme based on cellular automata, in: Proc. of the 2008 IEEE Int. Symp. on Electronic Commerce and Security, 2008, 938-941
Numerically, the APCCs for the testing image of choice are: APCC_h = -0.00019, APCC_v = 0.0018, resp., APCC_d = 0.0023.
For the previously computed APCCs, under the null hypothesis of zero correlation (i.e., using Student's t-distribution)²³,²⁴, the corresponding p-values have been calculated and summarized in Table 4. It is clear that the computed p-values are much greater than 5%, which implies that the null hypothesis is accepted. Therefore, these results indicate that the proposed encryption algorithm successfully reduces the adjacent pixel correlation coefficients.

Table 4
p-values of adjacent pixel correlation coefficients

Test image: Lenna (size: 256x256; degrees of freedom: 65534)

Direction    APCC       Statistic t   p-value
horizontal   -0.00019   -0.0486       96.38%
vertical     0.0018     0.4607        64.67%

²³ Y. Wu et al., 2D Sudoku associated bijections for image scrambling, Information Sciences 327 (2016) 91-109
²⁴ A. Donner, B. Rosner, On inferences concerning a common correlation coefficient, Appl. Stat. (1980) 69-76
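
The horizontal APCC and its t statistic can be sketched as below; tcdf is from the Statistics and Machine Learning Toolbox, cameraman.tif is a stand-in 256 × 256 image, and note that the horizontal pair count here is 256 × 255, slightly below the 65536 pixels behind the 65534 degrees of freedom quoted in Table 4:

img = double(imread('cameraman.tif'));   % stand-in 256x256 image
x = img(:, 1:end-1);                     % pixels and their horizontal
y = img(:, 2:end);                       % right-hand neighbors
C = corrcoef(x(:), y(:));
rho = C(1, 2);                           % horizontal APCC
n = numel(x);                            % number of adjacent pairs
t = rho * sqrt((n - 2) / (1 - rho^2));   % Student t statistic
p = 2 * (1 - tcdf(abs(t), n - 2))        % two-sided p-value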
APPENDIX A
APPENDIX B
clear; clc;

Npre  = 1000;   % iterations discarded as transient
Nplot = 200;    % iterations kept for plotting

% set up the bifurcation-diagram axes
subplot(2, 2, 1);
hold off; plot(NaN);
title('Bifurcation diagram of the Logistic Map', 'FontSize', 10);
xlabel('r', 'FontSize', 10);
ylabel('x_n', 'FontSize', 10);
set(gca, 'ylim', [0 1], 'xlim', [2.5 4.0]);
hold on;

% set up the Lyapunov-exponent axes
subplot(2, 2, 3);
hold off; plot(NaN);
title('Lyapunov exponent of the Logistic Map', 'FontSize', 10);
xlabel('r', 'FontSize', 10);
ylabel('\lambda', 'FontSize', 10);
set(gca, 'ylim', [-3 2], 'xlim', [2.5 4.0]);
hold on;
refline([0 0]);
hold on;

x = zeros(Nplot, 1);
for r = 2.5:0.01:4.0
    % discard the transient so the plotted points lie on the attractor
    x(1) = 0.70936483085807256;
    for n = 1:Npre
        x(1) = r*x(1)*(1-x(1));
    end
    % iterate further, accumulating ln|f'(x)| = ln|r*(1-2x)| for eq. (9)
    L_exp = 0;
    for n = 1:Nplot-1
        x(n+1) = r*x(n)*(1-x(n));
        L_exp  = L_exp + log(abs(r*(1-2*x(n+1))));
    end
    subplot(2, 2, 1);
    plot(r*ones(Nplot,1), x, '.b', 'markersize', 1);
    hold on;
    subplot(2, 2, 3);
    plot(r, L_exp/(Nplot-1), 'or', 'markersize', 2);  % average, eq. (9)
    pause(0.001);
    hold on;
end
APPENDIX C

clear; clc;

% set up the attractor axes
subplot(2, 2, 2);
hold off; plot(NaN);
title('Attractor of the Logistic Map', 'FontSize', 10);
xlabel('x_n', 'FontSize', 10);
ylabel('x_n_+_1', 'FontSize', 10);
set(gca, 'xlim', [0 1], 'ylim', [0 1]);
hold on;

% set up the sensitivity-to-initial-conditions axes
subplot(2, 2, 4);
hold off; plot(NaN);
title('Sensibility of the Logistic Map', 'FontSize', 10);
xlabel('r=4, x_0^1=0.709364831851, x_0^2=0.709364831852', 'FontSize', 9);
ylabel('x_n_+_1^1, x_n_+_1^2', 'FontSize', 10);
hold on;

% attractor: plot (x_n, x_{n+1}) pairs of a single trajectory at r = 4
x_n = 0.70936483085807256;
r   = 4.0;
for i = 1:250
    x_n_n = r*x_n*(1-x_n);
    subplot(2, 2, 2);
    plot(x_n, x_n_n, 'ob', 'markersize', 2);
    pause(0.001);
    hold on;
    x_n = x_n_n;
end

% sensitivity: two trajectories whose seeds differ by only 1e-12
x_1 = 0.709364831851;
x_2 = 0.709364831852;
r   = 4.0;
x_1_n = zeros(1, 10000);          % preallocate both trajectories
x_2_n = zeros(1, 10000);
for i = 1:10000
    x_1_n(i) = r*x_1*(1-x_1);
    x_2_n(i) = r*x_2*(1-x_2);
    x_1 = x_1_n(i);
    x_2 = x_2_n(i);
end
subplot(2, 2, 4);
plot(0:1:25, x_1_n(20:45), '-or');   % a window where divergence is visible
hold on;
plot(0:1:25, x_2_n(20:45), '-ob');
APPENDIX D
clc; clear;

n = 49999;                            % number of random points

figure(1)
subplot(1,2,1);
% Pi_Aprox_Chart is a user-supplied helper that draws the quarter circle
Plot_pi_Chart = Pi_Aprox_Chart(0, pi/2, 0, 0, 1);
hold on;
subplot(1,2,2);
semilogx(NaN);                        % create axes with logarithmic x-scale
axis([1 n 2.5 4]);
set(gca, 'xgrid', 'on');
hline = refline([0 pi]);              % reference line at y = pi
hline.Color = 'r';
hold on;

counter = 0;                          % number of hits inside the circle
Pi = zeros(1, n);                     % preallocate the running estimates

for i = 1:n
    tstart = tic;
    x = rand(1, 1);
    y = rand(1, 1);
    if (x^2 + y^2) <= 1               % hit: inside the quarter circle
        counter = counter + 1;
        subplot(1,2,1);
        plot(x, y, 'r*');
        hold on;
    else                              % miss
        subplot(1,2,1);
        plot(x, y, 'b*');
        hold on;
    end
    Pi(i) = 4 * (counter / i);        % running estimate, eq. (4)
    subplot(1,2,2);
    semilogx(i, Pi(i), '-bo', 'MarkerSize', 2);
    hold on;
    %pause(0.0001);
    telapsed = toc(tstart);
end

format long
Pi(n)
Err = ((Pi(n) - pi)/pi)*100
telapsed
APPENDIX E
clear; clc;

% read the ASCII '0'/'1' file and convert to a numeric 0/1 row vector
fileID = fopen('MC_Pi_Aprox_binfile.txt');
Q = fread(fileID);
fclose(fileID);
Q = (Q - 48)';

n = 43690;                            % floor(2097152 / 48) points

figure(1)
subplot(1,2,1);
% Pi_Aprox_Chart is a user-supplied helper that draws the quarter circle
Plot_pi_Chart = Pi_Aprox_Chart(0, pi/2, 0, 0, 1);
hold on;
subplot(1,2,2);
semilogx(NaN);                        % create axes with logarithmic x-scale
axis([1 n 2.5 4]);
set(gca, 'xgrid', 'on');
hline = refline([0 pi]);              % reference line at y = pi
hline.Color = 'r';
hold on;

counter = 0;                          % number of hits inside the circle
Pi = zeros(1, n);                     % preallocate the running estimates

for i = 1:n
    % each point consumes 48 bits: 24 for X, 24 for Y (bi2de is from the
    % Communications Toolbox); dividing by 2^24 - 1 maps into [0, 1]
    x = bi2de(Q((((i-1)*48)+1):(((i-1)*48)+24)), 'left-msb') / 16777215;
    y = bi2de(Q((((i-1)*48)+25):(((i-1)*48)+48)), 'left-msb') / 16777215;
    if (x^2 + y^2) <= 1               % hit: inside the quarter circle
        counter = counter + 1;
        subplot(1,2,1);
        plot(x, y, 'r*');
        hold on;
    else                              % miss
        subplot(1,2,1);
        plot(x, y, 'b*');
        hold on;
    end
    Pi(i) = 4 * (counter / i);        % running estimate, eq. (4)
    subplot(1,2,2);
    semilogx(i, Pi(i), '-bo', 'MarkerSize', 2);
    hold on;
    pause(0.001);
end

format long
Pi(n)
Err = ((Pi(n) - pi)/pi)*100
APPENDIX F

F:\...\ENT>ent -b -c Test_1.txt

Value Char Occurrences Fraction


0 1041598 0.496675
1 1055546 0.503325
Total: 2097144 1.000000

Entropy = 0.999968 bits per bit.


Optimum compression would reduce the size of this 2097144-bit file by 0 percent.
Chi square distribution for this 2097144-bit file is 92.77, and randomly would exceed
this value less than 0.01 percent of the times.
Arithmetic mean value of data bits is 0.5033 (0.5 = random).
Monte Carlo value for Pi is 3.138292515 (error 0.11 percent).
Serial correlation coefficient is -0.004859 (totally uncorrelated = 0.0).

F:\...\ENT>ent -b -c Test_2.txt

Value Char Occurrences Fraction


0 3125817 0.496836
1 3165631 0.503164
Total: 6291448 1.000000

Entropy = 0.999971 bits per bit.


Optimum compression would reduce the size of this 6291448-bit file by 0 percent.
Chi square distribution for this 6291448-bit file is 251.95, and randomly would exceed
this value less than 0.01 percent of the times.
Arithmetic mean value of data bits is 0.5032 (0.5 = random).
Monte Carlo value for Pi is 3.130425495 (error 0.36 percent).
Serial correlation coefficient is -0.006937 (totally uncorrelated = 0.0).
