Honkanen Heli

Obtaining Parton Distribution Functions
from Self-Organizing Maps
Heli Honkanen, ISU & UVa

In collaboration with:
Simonetta Liuti (UVa, physics)
Joseph Carnahan, Yannick Loitiere, Paul Reynolds (UVa, cs)
Heli Honkanen SPIN 2008 1

Omnipresent bias
Theoretical bias: bias introduced by researchers through the precise structure of the model they use, which invariably constrains the form of the solutions
Systematic bias: bias introduced by algorithms, such as optimization algorithms, which may favor some results in ways that are not justified by their objective functions but instead depend on the internal operation of the algorithm
• PDFs are always present in hadronic processes involving high virtualities:
$\sigma(x,Q^2),\ F_2(x,Q^2) \sim \sum_{i=q,\bar{q},g} f_{i/h}(x,Q^2) \otimes \hat{\sigma}^{(i)}(x,Q^2)$
• Knowledge of PDFs and their errors is crucial in calculations of new physics and measurements at the LHC

PDF fast facts
• In principle, moments of $F_2$ are calculable on the lattice; in practice, PDFs need to be extracted from measurements
• Needed also for $x$, $Q^2$ combinations not available in DIS, DY, ... data ⇒ parametrization
• Specific to the incoming hadron, independent of the hard scattering process ⇒ universal
• Subject to scale evolution: once known at one scale $Q_0^2$, they can be predicted for other $Q^2$
• Current methods: Global Analysis & Neural Networks

Extracting PDFs I: Global analysis
• Initial scale ($Q_0 \sim 1\,{\rm GeV} \le Q^{\min}_{\rm dat}$) ansatz:
$f_{i/h}(x,Q_0) = a_0\, x^{a_1} (1-x)^{a_2}\, P(x; a_3, \ldots)$
• Evolve to higher scale
• Compute all the available observables
• Compare with all the available data, e.g.
$\chi^2 = \sum_{\rm expt.} \sum_{i,j=1}^{N_e} ({\rm Data}_i - {\rm Theor}_i)\, V^{-1}_{ij}\, ({\rm Data}_j - {\rm Theor}_j)$
• Adjust parameters and repeat until the global minimum is found
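The χ² comparison step can be sketched in a few lines of Python. This is a minimal illustration, assuming NumPy is available; the function names `chi2` and `global_chi2` and the toy numbers are hypothetical and not taken from any global-analysis code.

```python
import numpy as np

def chi2(data, theory, cov):
    """chi^2 = r^T V^{-1} r for one experiment, with r = data - theory."""
    r = np.asarray(data, float) - np.asarray(theory, float)
    # Solve V z = r rather than forming the explicit inverse of V.
    return float(r @ np.linalg.solve(cov, r))

def global_chi2(experiments):
    """Sum chi^2 over experiments, each given as (data, theory, cov)."""
    return sum(chi2(d, t, v) for d, t, v in experiments)

# Toy numbers, invented purely for illustration.
data = np.array([1.0, 2.0])
theory = np.array([0.5, 2.5])
cov = np.diag([0.25, 0.25])
total = global_chi2([(data, theory, cov)])
print(total)
```

In a real fit the `theory` array would come from evolving the parametrized PDFs and computing each observable, and the minimizer would call `global_chi2` repeatedly while adjusting the parameters.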

• Errors estimated with the Hessian method:
$(\Delta X)^2 = \Delta\chi^2 \sum_{i,j} \frac{\partial X}{\partial y_i}\, (H^{-1})_{ij}\, \frac{\partial X}{\partial y_j}$
• “Estimates for the current major global analyses are that something like
∆χ2 = 50 − 100 corresponds to a ∼ 90% confidence interval.”
• Differences between current sets are ∼ the size of the estimated errors
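The role of the tolerance Δχ² in the Hessian error formula can be made concrete with a short Python sketch. It assumes NumPy, a Hessian matrix `H` in the same convention as the formula above, and an already computed gradient of the observable; the function name and the numbers are hypothetical.

```python
import numpy as np

def hessian_error(grad_x, hessian, delta_chi2):
    """Delta X = sqrt(Delta chi^2 * g^T H^{-1} g), with g_i = dX/dy_i."""
    g = np.asarray(grad_x, float)
    variance = delta_chi2 * (g @ np.linalg.solve(hessian, g))
    return float(np.sqrt(variance))

# Toy two-parameter example (numbers invented for illustration).
H = np.array([[2.0, 0.0],
              [0.0, 8.0]])
grad = np.array([1.0, 2.0])
err_1 = hessian_error(grad, H, 1.0)      # tolerance Delta chi^2 = 1
err_100 = hessian_error(grad, H, 100.0)  # tolerance Delta chi^2 = 100
print(err_1, err_100)
```

Because the error scales as the square root of Δχ², moving from Δχ² = 1 to Δχ² = 100 inflates every quoted uncertainty by a factor of 10, which is why the choice of tolerance matters so much.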

Uncertainties on Uncertainties
• Choice of statistical estimator ⇒ global χ2 is not adequate as
shown by inconsistencies from different data sets
• Error analysis ⇒ ambiguities in the usage of data from
different experiments
• Parametrization dependence ⇒ bias from the functional forms chosen at the initial scale, $Q_0^2$
• Theoretical assumptions ⇒ s, s̄, c quark content, details of
evolution (NNLO, large/small x resummation,...)

Extracting PDFs II: Neural Network Approach
(The NNPDF Collaboration)
• State of the NN represented by the weight vector
$\omega = \left(\omega^{(1)}_{11}, \omega^{(2)}_{11}, \ldots, \theta^{(1)}_1, \theta^{(2)}_1, \ldots\right)$
• $\omega_{ij}$ (weights) and $\theta_i$ (thresholds) are free parameters to be determined by the fitting procedure

Neural Network Schematically
Output of the i-th neuron in the l-th layer:
$\xi^{(l)}_i = g\left(h^{(l)}_i\right), \quad i = 1, \ldots, n_l, \quad l = 2, \ldots, L,$
where the nonlinear activation function
$g(x) = \frac{1}{1+\exp(-x)}$ ($g(x) = x$ for the last layer)
is evaluated on a linear combination of the outputs $\xi^{(l-1)}_j$ of all neurons in the previous layer,
$h^{(l)}_i = \sum_{j=1}^{n_{l-1}} \omega^{(l-1)}_{ij}\, \xi^{(l-1)}_j - \theta^{(l)}_i$
Example: for the (1-2-1) case:
$\xi^{(3)}_1 = \frac{\omega^{(2)}_{11}}{1 + e^{\theta^{(2)}_1 - \xi^{(1)}_1 \omega^{(1)}_{11}}} + \frac{\omega^{(2)}_{12}}{1 + e^{\theta^{(2)}_2 - \xi^{(1)}_1 \omega^{(1)}_{21}}} - \theta^{(3)}_1$
General architecture: $\sum_{l=1}^{L-1} (n_l\, n_{l+1} + n_{l+1})$ parameters
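A minimal Python sketch of the feed-forward recursion and the parameter count, assuming NumPy. It follows the layer rule $h^{(l)}_i = \sum_j \omega^{(l-1)}_{ij} \xi^{(l-1)}_j - \theta^{(l)}_i$ with a sigmoid in the hidden layers and a linear last layer; the function names and the toy weights are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(xi, weights, thresholds):
    """Propagate the input xi through the network.

    weights[k] has shape (n_{k+2}, n_{k+1}) and thresholds[k] has shape
    (n_{k+2},), i.e. one pair per transition between consecutive layers.
    """
    xi = np.asarray(xi, float)
    last = len(weights) - 1
    for k, (w, theta) in enumerate(zip(weights, thresholds)):
        h = w @ xi - theta
        # Linear activation in the last layer, sigmoid elsewhere.
        xi = h if k == last else sigmoid(h)
    return xi

def n_parameters(arch):
    """Number of weights plus thresholds for an architecture such as (2, 5, 3, 1)."""
    return sum(n * m + m for n, m in zip(arch[:-1], arch[1:]))

# Toy (1-2-1) network with invented parameters.
w = [np.array([[1.0], [2.0]]), np.array([[3.0, 4.0]])]
th = [np.array([0.0, 0.0]), np.array([0.5])]
out = forward([0.0], w, th)
print(out, n_parameters((1, 2, 1)), n_parameters((2, 5, 3, 1)))
```

For the (2-5-3-1) architecture the count gives 37 free parameters per network, which illustrates how much more flexible this is than a functional form with a handful of parameters.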

NNPDF algorithm
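• Monte Carlo sampling of the data:
$F^{({\rm art})(k)}_i = \left(1 + r^{(k)}_N \sigma_N\right) \left( F^{({\rm exp})}_i + \sum_{p=1}^{N_{\rm sys}} r^{(k)}_p \sigma_{i,p} + r^{(k)}_i \sigma_{i,s} \right), \quad k = 1, \ldots, N_{\rm rep}$
• Use neural networks as universal unbiased interpolating functions for each replica (= individual fit for each replica)
$\chi^{2\,(k)}[\omega] = \frac{1}{N_{\rm dat}} \sum_{i,j=1}^{N_{\rm dat}} \left( F^{({\rm art})(k)}_i - F^{({\rm net})(k)}_i \right) \left({\rm cov}^{-1}\right)_{ij} \left( F^{({\rm art})(k)}_j - F^{({\rm net})(k)}_j \right)$
– Genetic algorithm for ω
• Global minimum given by the average over the sample of trained NNs
$\chi^2 = \frac{1}{N_{\rm dat}} \sum_{i,j=1}^{N_{\rm dat}} \left( F^{({\rm exp})}_i - \left\langle F^{({\rm net})}_i \right\rangle_{\rm rep} \right) \left({\rm cov}^{-1}\right)_{ij} \left( F^{({\rm exp})}_j - \left\langle F^{({\rm net})}_j \right\rangle_{\rm rep} \right)$
• The uncertainty on the final result is found from the variance of the Monte Carlo samples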
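The replica generation can be sketched in Python as follows. This is an illustration only, assuming NumPy and assuming that all random numbers r are drawn from a unit Gaussian (the slide does not state their distribution); the function name, argument names and toy data are hypothetical.

```python
import numpy as np

def make_replicas(f_exp, sigma_stat, sigma_sys, sigma_norm, n_rep, rng):
    """Generate n_rep artificial data sets from the experimental points.

    f_exp, sigma_stat: shape (n_dat,); sigma_sys: shape (n_dat, n_sys);
    sigma_norm: overall normalization uncertainty (scalar).
    """
    f_exp = np.asarray(f_exp, float)
    sigma_stat = np.asarray(sigma_stat, float)
    sigma_sys = np.asarray(sigma_sys, float)
    n_dat, n_sys = sigma_sys.shape
    r_norm = rng.standard_normal((n_rep, 1))      # one number per replica
    r_sys = rng.standard_normal((n_rep, n_sys))   # shared by all points of a replica
    r_stat = rng.standard_normal((n_rep, n_dat))  # independent for every point
    shifted = f_exp + r_sys @ sigma_sys.T + r_stat * sigma_stat
    return (1.0 + r_norm * sigma_norm) * shifted

rng = np.random.default_rng(0)
f_exp = np.array([1.0, 2.0, 3.0])
reps = make_replicas(f_exp, [0.1, 0.1, 0.1], np.full((3, 2), 0.05), 0.02, 1000, rng)
central = reps.mean(axis=0)          # average over replicas
spread = reps.std(axis=0, ddof=1)    # uncertainty from the replica variance
print(central, spread)
```

In the full procedure one network is fitted to each row of `reps`, and the same mean and standard deviation are then taken over the network outputs rather than over the artificial data themselves.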

NN results for nonsinglet PDF and gluon
• Architecture of the NN (2-5-3-1)
[Figure: two panels of $x g(x, Q_0^2)$ versus $x$ ($x$ from $10^{-5}$ to 1), Nrep = 100, comparing CTEQ6.5, MRST2001E, Alekhin02 and NNPDF1.0]
0809.3716 [hep-ph]

Things to consider
• MC sampling eliminates the problem of choosing a suitable
value of ∆χ2
– Not tied to use of NN ⇒ How would a functional form fit
behave in MC sampling?
• NN training fully automated
• What happens when the data is sparse (nPDFs, GPDs)?
– no control over the parameters
• How to implement information not given directly by the data?
– nonperturbative models, lattice calculations
• Are bigger error bars really what is needed?

Give up this...

...for this!
• Introduce “Researcher Insight” instead of “Theoretical bias”

Extracting PDFs III: Self-Organizing maps
• The SOM is an algorithm used to visualize and interpret large high-dimensional data sets (a subtype of neural networks)
• The map attempts to represent all the available observations with optimal accuracy using a restricted set of models
• Widely used in several fields of research
• SOM is a set of vectors that are isomorphic to the data samples
used for training (PDFs, observables, RGB color triplets...),
arranged e.g. as a 2-D rectangular grid
• Each vector $V_i$, a cell, is assigned spatial coordinates
• Distance metric $M_{\rm map}$ (here: $L_1$) determines the topology of the map
• Implementation proceeds in 3 steps: initialization, training and
clustering

Initializing SOM

Training the SOM
$V_i(t+1) = V_i(t)\,\left(1 - w(t)\, N_{j,i}(t)\right) + S_j(t)\, w(t)\, N_{j,i}(t)$
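One training step of this update rule can be sketched in Python, here on RGB triplets. The sketch assumes NumPy, an L1 distance both for finding the best-matching cell and on the grid, and a Gaussian neighbourhood function; the neighbourhood shape, the function name and all numbers are assumptions made for illustration.

```python
import numpy as np

def train_step(cells, coords, sample, w, radius):
    """Apply V_i <- V_i (1 - w N) + S w N to every cell i.

    cells: (n_cells, dim) map vectors; coords: (n_cells, 2) grid coordinates;
    sample: (dim,) training vector S; w: learning rate; radius: neighbourhood size.
    """
    # Best-matching cell j: smallest L1 distance between sample and map vector.
    j = np.argmin(np.abs(cells - sample).sum(axis=1))
    # Neighbourhood N_{j,i}: Gaussian in the L1 grid distance (assumed form).
    d = np.abs(coords - coords[j]).sum(axis=1)
    n = np.exp(-0.5 * (d / radius) ** 2)
    wn = (w * n)[:, None]
    return cells * (1.0 - wn) + sample * wn, j

rng = np.random.default_rng(1)
side = 4
coords = np.array([(a, b) for a in range(side) for b in range(side)], float)
cells = rng.random((side * side, 3))   # random RGB triplets as initial map
sample = np.array([1.0, 0.0, 0.0])     # a "red" training sample
before = np.abs(cells - sample).sum(axis=1)
cells_new, winner = train_step(cells, coords, sample, w=0.5, radius=1.0)
after = np.abs(cells_new - sample).sum(axis=1)
print(winner, before[winner], after[winner])
```

The winning cell moves the furthest towards the sample and its grid neighbours move less, which is what produces the smooth ordering of similar vectors on a trained map. In practice `w` and `radius` are decreased as training proceeds.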

Training the SOM II
In the end, on a properly trained SOM, cells that are topologically close to each other will contain map vectors which are similar to each other.
• Data that are introduced (clustered) on a trained SOM get distributed according to their similarity ⇒ each map vector represents a class of similar data

“Colors” Example

Step 1 - Automated minimization: ENVPDF
First iteration:
• Use existing PDF sets as a guideline:
– For each flavour separately, randomly select one of the ranges [0.5, 1], [1.0, 1.5] or [0.75, 1.25] times any of the {PDF} = {CTEQ6 (or 4), CTEQ5, MRST02, Alekhin, GRV98} sets at $Q_0 = 1.3$ GeV
– Set a value for each $x_{\rm data}$ randomly within the selected range (uniform distribution), apply smoothing
– Scale the combined set ${\rm PDF}^{\rm comb}_i$ to obey the sum rules; linear interpolation between $\{x_{\rm data}\}$
• Initialize an $N \times N$ SOM such that $V_i = \{{\rm PDF}^{\rm comb}_i, F_2^i\}$
• Batch train (in $N_{\rm step}$ steps); training “data”: $4N^2$ ${\rm PDF}^{\rm comb}$ sets (= “database”)
• Similarity criterion: similarity of observables $F_2(x_{\rm data}, Q^2_{\rm data})$
– Always rescale $\{{\rm PDF}^{\rm comb}_i\}$ to obey the sum rules after updating the $V_i$
– Evolution as in CTEQ6
• After training, compute $\chi^2$ against experimental data for every PDF set on the map; pick the $N_{\rm init}$ best to start a new iteration with a whole new SOM
• DIS data (H1, ZEUS, BCDMS) only for now
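The generation of one candidate distribution in the first iteration can be sketched in Python. This is a toy illustration, assuming NumPy: the reference shape, the moving-average smoothing, and the single number-type sum rule imposed here are stand-ins chosen for brevity, and all function names are hypothetical. Only the three ranges are taken from the slide.

```python
import numpy as np

RANGES = [(0.5, 1.0), (1.0, 1.5), (0.75, 1.25)]  # ranges quoted on the slide

def integrate(x, y):
    """Trapezoidal integral of y over the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def smooth(y, width=3):
    """Moving average with edge padding (assumed smoothing; width must be odd)."""
    pad = width // 2
    padded = np.pad(y, pad, mode="edge")
    return np.convolve(padded, np.ones(width) / width, mode="valid")

def generate_candidate(x, ref_pdf, target_integral, rng):
    """Scale a reference PDF pointwise within a random range, smooth, renormalize."""
    lo, hi = RANGES[rng.integers(len(RANGES))]
    factors = rng.uniform(lo, hi, size=x.size)   # one factor per x_data point
    cand = smooth(ref_pdf * factors)
    # Rescale so the toy sum rule (integral equals target_integral) holds.
    return cand * target_integral / integrate(x, cand)

rng = np.random.default_rng(2)
x = np.linspace(0.01, 1.0, 50)
ref = x ** -0.5 * (1.0 - x) ** 3   # toy valence-like shape, not a real PDF set
cand = generate_candidate(x, ref, 2.0, rng)
print(integrate(x, cand))
```

A full implementation would do this flavour by flavour, impose the momentum and valence sum rules on the combined set, and produce $4N^2$ such sets for the SOM database.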

ENVPDF algorithm II
Later iterations:
• For each selected init PDF, use the best nearest-neighbour PDFs to establish a 1σ envelope
• For each flavour at each $x_{\rm data}$, jitter around the init PDF within the selected range (Gaussian distribution), then smooth
• Scale the combined set to obey the sum rules; linear interpolation between $\{x_{\rm data}\}$
• Preserve PDF variety by using $N_{\rm orig}$ first-iteration generators in turn with $N_{\rm init}$ Gaussian generators
• Initialize an $N \times N$ SOM and batch train in $N_{\rm step}$ steps with $4N^2$ database sets + $N_{\rm init}$ “mother” sets

Input quality
PDF        LO χ²/N*   NLO χ²/N*
Alekhin    3.34       29.1
CTEQ6      1.67       2.02
CTEQ5      3.25       6.48
CTEQ4      2.23       2.41
MRST02     2.24       1.89
GRV98      8.47       9.58
*These are the χ²/N values for the quoted initial-scale PDF sets evolved with CTEQ6 DGLAP settings; no kinematical cuts or normalization factors were imposed on the experimental data. We do not claim that these values describe the quality of the quoted PDF sets.

ENVPDF results I
SOM      Nstep   Norig   Case   LO χ²/N   NLO χ²/N
5x5      5       2       1      1.04      1.08
5x5      5       0       1      1.41      -
5x5      5       2       2      1.14      1.25
15x15    5       6       1      1.00      1.07
15x15    5       6       2      1.13      1.18
[Figure: χ²/N versus iteration (0 to 50) for the 5x5, Nstep=5 SOM, LO (left panel) and NLO (right panel); curves show the best of 10 and worst of 10 for Case 1 and Case 2]

ENVPDF results II: LO
[Figure: LO results versus x ($10^{-5}$ to 1) at Q=1.3 GeV and Q=3.0 GeV: xuV, xu and scaled gluon (0.25*xg, 0.1*xg) for the 5x5, Nstep=5, Case 1 SOM (legend: χ²/N 1.2), compared with CTEQ6 and MRST02]
(χ²/N) = 1.065, σ = 0.014 ⇒ ∆χ² = 10

ENVPDF results III: NLO
[Figure: NLO results versus x ($10^{-5}$ to 1) at Q=1.3 GeV and Q=3.0 GeV: xuV, xu and scaled gluon (0.25*xg, 0.85*xg) for the 5x5, Nstep=5, Case 1 SOM (legend: χ²/N 1.2), compared with CTEQ6 and MRST02]
(χ²/N) = 1.122, σ = 0.029 ⇒ ∆χ² = 20

Step 2 - Interactive GUI
The method is extremely open to user interaction
• Build an interactive GUI, let the user set the shape of the
envelope
• Replace jittering with an NN (or functional form); generators then sample the NN weight vector (or parameters)
• Clustering criteria could be anything that can be
mathematically formulated ⇒ project desired quality out of the
map
• Study of flexible points (“opportunities for adapting and fine
tuning”), e.g. DGLAP variables, data selection, SOM params,
theoretical assumptions,...
• Extend to nPDFs and GPDs...
