\section{Beyond standard fitting}
\label{sec:beyond standard fitting}

In the previous Sections, we have seen the capabilities of \zfit{} for fitting 
a model to data. The inherent flexibility and the powerful functionality of the 
core parts allow the library to be extended beyond the usual feature set of 
general model fitting libraries in HEP.
To perform more specialised fits, the components of \zfit{} can be used as 
building blocks of a higher level fitter. In order to keep the size and feature 
set of \zfit{} at a reasonable level, the project is split into multiple, 
tightly coupled packages. In this Section, we take a closer look at the 
different packages that extend \zfit{} beyond the simple model fitting we have 
seen so far.

The main package is the core library \zfit{}, described in this thesis up to 
now, which provides all the fundamental building blocks and is in itself a 
self-contained fitting library. Content-wise, \zfit{} offers a rather 
field-agnostic selection and does not contain models and tools too specific to 
HEP. The focus is on a stable and solid implementation together with the API 
and format definitions.

More field-specific content can be added in other libraries such as \zphys{}. 
This repository is meant to be the place for community contributions focusing 
on HEP-specific content. Guidelines, examples and automated tests are planned 
to be put in place in order to lower the threshold for contributions. 
Furthermore, it will also contain functions dealing with physical quantities 
such as kinematics, which can serve as building blocks for models. Higher level 
interfaces that use \zfit{} to build specific models are also intended to be 
placed in this repository, for example a whole amplitude analysis framework 
currently under development, which will be explained in the following Section.

Additionally, the project contains libraries that \zfit{} depends on but which 
are themselves self-contained and have been factored out of the core and 
extension libraries into standalone packages. This allows projects other than 
\zfit{} to make use of them as well. An example is \texttt{phasespace}, which 
generates the four-momenta of particles from a decay, taking the correct 
kinematics into account. The package is explained in more detail in 
Sec.~\ref{sec:phasespace}.


\subsection{Amplitude fits}
\label{sec:dalitz}

\zamp{} is a higher level fitting library built with elements from \zfit{} in 
order to perform amplitude analyses, including Dalitz plot analyses; it is 
currently under active development and in an early stage.
In the following, this extension within the scope of the \zfit{} project is 
discussed by showing an implementation example of the decay 
\decay{\Dz}{\Kp\pim\piz}. As part of this, an additional library, 
\zphasespace, is introduced, which covers an essential ingredient for 
generating samples in amplitude fits.

In order to perform amplitude fits, \zamp{} contains
\begin{itemize}
	\item shapes such as the Breit-Wigner distribution that can be used to 
	model resonances;
	\item helper classes to build PDFs, for example ones that efficiently 
	cache intermediate results specific to amplitude fits; and
	\item a higher level interface to build amplitude fits in a transparent 
	manner, similar to other specialised amplitude fitters in HEP. This removes 
	the need for low level handling \textit{but} keeps the flexibility to 
	replace any part of it by a custom object.
\end{itemize}
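As an illustration of the first point, the shape of a resonance can be modelled 
with a relativistic Breit-Wigner amplitude. The following is a minimal sketch 
in plain Python with NumPy; the function name and the constant-width 
approximation are illustrative choices and not the actual \zamp{} API.

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
import numpy as np

def relativistic_breit_wigner(m2, mres, width):
    """Breit-Wigner amplitude A(m2) with a constant width.

    m2:    invariant mass squared of the resonance candidate
    mres:  nominal resonance mass
    width: nominal resonance width
    """
    # A(m2) = 1 / (mres^2 - m2 - i * mres * width)
    return 1.0 / (mres**2 - m2 - 1j * mres * width)

# the intensity |A|^2 peaks at m2 = mres^2
m2 = np.linspace(0.5, 1.3, 1001)
intensity = np.abs(relativistic_breit_wigner(m2, mres=0.892, width=0.047))**2
\end{python}
\end{minipage}
\end{center}

The complex return value is essential: in a coherent sum of several such 
amplitudes, the relative phases produce the interference patterns.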

In order to perform toy studies as described in Sec.~\ref{sec:performance}, an 
efficient way of sampling from a model is needed. While accept-reject is a 
universal and well-working method, its efficiency can be very low in higher 
dimensional spaces and/or with peaky model shapes. To avoid these 
inefficiencies, samples can be produced using importance sampling as described 
in Appendix \ref{appendix:sampling techniques}, which requires a distribution 
that approximately resembles the shape of the model. In \zamp{}, the approach 
is to use the kinematic constituents of the decay particles, which are then 
transformed to the desired variables used in the model. This is a very general 
approach that allows all sorts of variables to be constructed from the 
kinematics. On the downside, in order to produce the particles with realistic 
kinematics, a phasespace generator is needed. While there is an implementation 
in \root, the TGenPhaseSpace class, no equivalent is available in pure Python. 
Therefore, a library porting the above, with an extension to real world 
experiment kinematics and implemented using TF, has been created.
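The gain from importance sampling can be illustrated with a one-dimensional 
sketch: a narrow peak sampled with a uniform accept-reject proposal has a tiny 
acceptance, while a proposal resembling the peak keeps every event with a 
well-behaved weight. The numbers and the Cauchy proposal below are 
illustrative choices only.

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
gamma = 0.001  # width of the peak

def shape(x):
    # narrow Lorentzian peak on [0, 1] with maximum value 1
    return gamma**2 / ((x - 0.5)**2 + gamma**2)

# uniform accept-reject: the acceptance equals the integral of the shape
x = rng.uniform(0.0, 1.0, n)
u = rng.uniform(0.0, 1.0, n)
efficiency = np.mean(u < shape(x))  # ~ pi * gamma, well below one percent

# importance sampling: propose from a Cauchy resembling the peak and keep
# every event with the weight shape(x) / proposal_density(x)
x_imp = 0.5 + gamma * rng.standard_cauchy(n)
proposal = gamma / (np.pi * ((x_imp - 0.5)**2 + gamma**2))
weights = shape(x_imp) / proposal
\end{python}
\end{minipage}
\end{center}

Here the proposal is exactly proportional to the shape, so all weights are 
equal; in realistic cases the weights fluctuate mildly instead of most events 
being rejected.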

More details on the \zamp{} library, especially on the higher level interface, 
will be provided in a future paper and go beyond the scope of this thesis.

\subsection{phasespace}
\label{sec:phasespace}

The kinematics of a particle are described by a four-dimensional tuple, its 
four-momentum. In a decay of a parent particle to lighter children particles, 
the kinematics are constrained by the fundamental physical laws of momentum 
and energy conservation.
With
\begin{align*}
	p^{0,1,2} = p^{x, y, z} \qquad p^3 = \sqrt{p^2 + m^2c^2}
\end{align*}
this reads as
\begin{equation}
\label{eq:momentum_conservation}
	p_{parent}^{\mu} = \sum_i p_{i}^{\mu}
\end{equation}
with $p_i$ being the four-momenta of all children particles. For a two-body 
decay $A\rightarrow BC$, there are six free variables from the momenta of the 
two children. Using Eq.~\ref{eq:momentum_conservation}, two degrees of freedom 
are left in the decay kinematics, which lead to a distribution. Furthermore, 
the mass is only fixed for stable particles and follows a distribution for 
resonances. To sample from the phasespace within the \zfit{} project, a 
package named \zphasespace{} was created. Its purpose is to generate arbitrary 
decays obeying the physical constraints discussed before and expressed in 
Eq.~\ref{eq:momentum_conservation}. Since it uses TF as the backend, it 
integrates seamlessly into \zfit{}.
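For the two-body case, the magnitude of the children momenta in the parent 
rest frame is fixed by the masses alone through the K\"all\'en triangle 
function. The following is a minimal sketch; the function names are 
illustrative and the masses are approximate PDG values in GeV.

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
import math

def kallen(a, b, c):
    # Kallen triangle function lambda(a, b, c)
    return a**2 + b**2 + c**2 - 2*a*b - 2*b*c - 2*c*a

def breakup_momentum(M, m1, m2):
    """Momentum |p*| of either child in the rest frame of a parent with
    mass M decaying to children with masses m1 and m2 (natural units)."""
    return math.sqrt(kallen(M**2, m1**2, m2**2)) / (2 * M)

# D0 -> K- pi+ with approximate masses in GeV
p_star = breakup_momentum(1.8648, 0.4937, 0.1396)
\end{python}
\end{minipage}
\end{center}

For equal children masses this reduces to $\sqrt{M^2/4 - m^2}$, as can be 
checked directly.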

The algorithm used in \zphasespace{} is the Raubold and Lynch method for 
$n$-body events as described in Ref.~\cite{James:1968gu}. In principle, the 
algorithm builds a tree where every node has two leaves representing a 
two-body decay. It calculates the available kinetic energy that will be 
assigned to the particles from the difference of the parent rest mass and the 
sum of all the children rest masses. The latter are either fixed or drawn from 
the mass distribution in the case of short-lived particles,
\begin{equation*}
	E_{kin} = m_{parent} - \sum_{i} f_{i}^{mass} (m_{i}^{min}, m_{i}^{max})
\end{equation*}
with $f_{i}^{mass}$ being the mass distribution of particle $i$, $m_i^{min}$ 
the minimum mass recursively determined from the children masses of particle 
$i$, and $m_i^{max}$ the available energy from the top particle minus the 
masses of the other particles in the decay. The remaining $E_{kin}$ is 
randomly split into fractions along all the decaying particles in the tree, 
where each fraction is the kinetic energy for the boost of this particle.

Given the mass of each particle in the tree and its kinetic energy, particles 
are recursively generated starting from the top of the tree, \emph{i.e.}, with 
the lightest particles. Each parent particle randomly generates its two 
children in the available phasespace. Then the whole decay tree is boosted to 
the parent momentum. This continues until the top particle is reached, which 
is not boosted by default. However, an additional argument to the generation 
method allows it to be boosted as well. This can be used to reproduce the 
kinematics occurring in actual colliders.
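A single node of this tree, a two-body decay generated isotropically in the 
parent rest frame and then boosted, can be sketched in plain NumPy. This is a 
simplified illustration and not the \zphasespace{} implementation; the 
four-momenta follow the $(p^x, p^y, p^z, E)$ convention from above.

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
import numpy as np

rng = np.random.default_rng(0)

def two_body_decay(M, m1, m2):
    """Generate one A -> B C decay isotropically in the rest frame of A.
    Returns the two four-momenta as (px, py, pz, E) in natural units."""
    # breakup momentum from energy and momentum conservation
    p = np.sqrt((M**2 - (m1 + m2)**2) * (M**2 - (m1 - m2)**2)) / (2 * M)
    cos_t = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * np.pi)
    sin_t = np.sqrt(1.0 - cos_t**2)
    direction = np.array([sin_t * np.cos(phi), sin_t * np.sin(phi), cos_t])
    p1 = np.concatenate((p * direction, [np.hypot(p, m1)]))
    p2 = np.concatenate((-p * direction, [np.hypot(p, m2)]))
    return p1, p2

def boost_z(p4, beta):
    """Boost a four-momentum along the z axis with velocity beta."""
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    pz, E = p4[2], p4[3]
    return np.array([p4[0], p4[1],
                     gamma * (pz + beta * E), gamma * (E + beta * pz)])

# decay at rest, then boost the children to the parent momentum
k, pi = two_body_decay(1.8648, 0.4937, 0.1396)
k_lab, pi_lab = boost_z(k, 0.9), boost_z(pi, 0.9)
\end{python}
\end{minipage}
\end{center}

Repeating this for every node from the lightest particles upwards and boosting 
each subtree yields the full decay chain.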

\subsubsection*{Usage}
The library has a main class \pyth{GenParticle} that describes the particle's 
mass, either fixed or as a distribution, and has a method to set the children 
particles it decays to. This allows an arbitrary decay chain to be built in an 
object-oriented way. As an example, the decay \decay{\Dz}{\Kstar(892) 
(\rightarrow{\Kp\pim})\piz}, which will be implemented as an amplitude fit in 
the next Section, would be implemented as follows

\begin{center}
\begin{minipage}{\textwidth}
\label{py:reso_particle}
\begin{python}
import phasespace as phsp

kplus = phsp.GenParticle("K+", mass=KPLUS_MASS)
piminus = phsp.GenParticle("Pi-", mass=PIMINUS_MASS)
pizero = phsp.GenParticle("Pi0", mass=PIZERO_MASS)
kstar = phsp.GenParticle("K*", mass=kstar_mass_func)
dzero = phsp.GenParticle("D0", mass=DZERO_MASS)

kstar.set_children(kplus, piminus)
dzero.set_children(kstar, pizero)
\end{python}
\end{minipage}
\end{center}

Here, \pyth{kstar_mass_func} is a function sampling from the mass distribution of a \Kstar. This is a custom function that can be implemented by the user.

Having specified the decay chain, two methods can be used to generate the 
actual particles: either \pyth{generate_tensor}, which returns the 
four-momenta as a Tensor that can be used directly inside \zfit{}, or 
\pyth{generate}, which returns the same information as a NumPy ndarray. 
Internally, a TF session is called and the computation is run.

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
weights, particles = dzero.generate(1000)  # generate 1000 particles
\end{python}
\end{minipage}
\end{center}

The weights correspond to the single events and quantify the probability of 
each generated event. The returned \pyth{particles} is a dictionary containing 
the momenta of each particle. For example

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
kstar_kinematics = particles['K*']
kstar_x = kstar_kinematics[:, 0]
kstar_energy = kstar_kinematics[:, 3]
\end{python}
\end{minipage}
\end{center}
and the format of the kinematics is (number of events, components).

As a shortcut for simple decays, a high level function \pyth{generate_decay} 
is available, which allows a decay to be described with stacked lists of 
masses.

\subsection{Dalitz implementation}

As an example of an amplitude fit, a Dalitz plot analysis of the decay 
\decay{\Dz}{\Kp\pim\piz} is implemented using some parts of \zamp{} together 
with \zfit{} and \zphasespace. The implementation of the resonances and their 
shapes follows the analysis in Ref.~\cite{PhysRevLett.103.211801}. The 
amplitude is built using the isobar approach, which calculates the coherent 
sum of the individual contributions, the intermediate resonances. The 
implemented resonances are $\rhomeson (770)$, $\rhomeson (1700)$, 
$\Kstarz (892)$, $\Kstarp (892)$, $\Kstarz (1430)$, $\Kstarp (1430)$ and 
$K_2^*(1430)$.

For amplitude analyses, dedicated fitting libraries exist, and using a general 
purpose fitter like \zfit{} for this task is rather unusual. In the following, 
we go through the required elements in a top-down approach and see how the 
problem can be separated into smaller pieces using the functionality of 
\zfit{}. Although unimportant technicalities are left out and the example is 
only a sketch of the actual implementation, it is rather advanced and involves 
functionality only explained in the Appendices.

The observables in this case are the invariant masses of pairs of the different 
children 
particles, which are $m_{\Kp\pim}$, $m_{\Kp\piz}$ and $m_{\pim\piz}$. The whole 
PDF $f_{tot}$ is of the form
\begin{equation}
\label{eq:total amp}
	f_{tot}(m_{\Kp\pim}, m_{\Kp\piz}, m_{\pim\piz}) = \abs{\sum_{i}^{n_{amp}} 
	c_i A_i(m_{\Kp\pim}, m_{\Kp\piz}, m_{\pim\piz})}^2
\end{equation}
with $c_i$ being the complex coefficient of each amplitude $A_i$. Expanding 
this expression yields cross ($i\neq j$) and square ($i=j$) terms $b_{ij}$ of 
the form
\begin{equation*}
	b_{ij} = c_i c_j^* A_i A_j^*
\end{equation*}
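The identity behind this expansion, $\abs{\sum_i c_i A_i}^2 = \sum_i b_{ii} + 
2\,\mathrm{Re}\sum_{i<j} b_{ij}$, can be verified numerically with a few 
arbitrary complex numbers:

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
# arbitrary complex coefficients and amplitude values
coefs = rng.normal(size=4) + 1j * rng.normal(size=4)
amps = rng.normal(size=4) + 1j * rng.normal(size=4)

# left-hand side: |sum_i c_i A_i|^2
lhs = np.abs(np.sum(coefs * amps))**2

# right-hand side: square terms plus twice the real part of the cross
# terms, using b_ij = c_i c_j* A_i A_j*
b = lambda i, j: coefs[i] * np.conj(coefs[j]) * amps[i] * np.conj(amps[j])
square = sum(b(i, i) for i in range(4)).real
cross = sum(b(i, j) for i, j in combinations(range(4), 2)).real
rhs = square + 2 * cross
\end{python}
\end{minipage}
\end{center}

Since $b_{ji} = b_{ij}^*$, only the terms with $i \leq j$ need to be computed.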

To implement this in \zfit{}, a custom \pyth{Func} called 
\pyth{AmplitudeProduct} is created that takes two coefficients and two 
amplitudes and calculates the above products $b_{ij}$. Before looking at its 
components, we can build the global PDF as defined in Eq.~\ref{eq:total amp}

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
class SumAmplitudeSquaredPDF(zfit.pdf.BaseFunctor):
	def __init__(self, obs, coefs, amps, ...):
		self._cross_terms = [AmplitudeProduct(c1, c2, a1, a2)
							 for (c1, a1), (c2, a2)
							 in combinations(zip(coefs, amps), 2)]

		self._square_terms = [AmplitudeProduct(coef, coef, amp, amp)
							  for coef, amp in zip(coefs, amps)]
		...

	def _unnormalized_pdf(self, x):
		value = tf.reduce_sum([2. * amp.func(x=x) for amp in self._cross_terms]
							  + [amp.func(x=x) for amp in self._square_terms],
							  axis=0)  # sum over the amplitude terms
		return tf.real(value)  # convert it to a real number
\end{python}
\end{minipage}
\end{center}

As the next piece, we need the amplitude product. Since it is complex in 
general, it is again split into smaller parts that represent the real and 
imaginary parts by a class \pyth{AmplitudeProductProjection}.

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
class AmplitudeProduct(zfit.func.BaseFunctorFunc):
	def __init__(self, coef1, coef2, amp1, amp2, ...):
		prod_real = AmplitudeProductProjection(amp1, amp2, proj=tf.math.real)
		prod_imag = AmplitudeProductProjection(amp1, amp2, proj=tf.math.imag)
		super().__init__(funcs=[prod_real, prod_imag],
						 params={'coef1': coef1, 'coef2': coef2}, ...)
		...

	def _func(self, x):
		coef1 = self.params['coef1']
		coef2 = self.params['coef2']
		prod_real, prod_imag = self.funcs
		coeffs = coef1 * tf.conj(coef2)
		return coeffs * tf.complex(prod_real.func(x), prod_imag.func(x))
\end{python}
\end{minipage}
\end{center}

Splitting the real and imaginary parts has the advantage of keeping both of 
them real and therefore also their integrals. This allows full use to be made 
of the \zfit{} integration. Since these parts are \textit{independent} of any 
coefficient, the integral value can be cached.

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
class AmplitudeProductProjection(zfit.func.BaseFunctorFunc):
	def __init__(self, amp1: ZfitFunc, amp2: ZfitFunc, proj, ...):
		self.projector = proj
		super().__init__(funcs=[amp1, amp2], ...)
		self._cache_integral = None

	def _func(self, x):
		amp1, amp2 = self.funcs
		return self.projector(amp1.func(x) * tf.conj(amp2.func(x)))

	def _single_hook_integrate(self, limits, norm_range, name):
		integral = self._cache_integral
		if integral is None:
			integral = super()._single_hook_integrate(limits=limits,
													  norm_range=norm_range,
													  name=name)
			integral = zfit.run(integral)  # simplified
			self._cache_integral = integral
		return integral
\end{python}
\end{minipage}
\end{center}
where we used the hook for the integration as described in Appendix 
\ref{appendix:basemodel}.

All that is left now are the coefficients, which are just parameters, and the 
resonances. Using the
Breit-Wigner function from 
\zphys{},\footnote{The Breit-Wigner is currently an open merge request.} we can 
build them

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
resonances = [
	RelativisticBreitWigner(rho_770_plus_mass,
							obs=zfit.Space('m2pipi', limits=...)),  # rho(770)
	...
	]
coeffs = [
	zfit.ComplexParameter.from_polar('c_rho770', 1.0, 0.0),
	...
	]
\end{python}
\end{minipage}
\end{center}
and combining all of the above we can finally create the whole model

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
obs = zfit.Space(['m2kpim', 'm2kpi0', 'm2pipi'], limits=...)
pdf = SumAmplitudeSquaredPDF(obs=obs, coefs=coeffs, amps=resonances, ...)
\end{python}
\end{minipage}
\end{center}

This PDF is quite an accomplishment: while the implementation is more 
extensive than in all previous examples, \textit{none} of the above parts is 
\textit{really} complicated to implement, and only basic \zfit{} functionality 
is used. Nonetheless, with these few lines of code, \zfit{} extends its 
functionality into the world of amplitude fitters, where usually only 
dedicated tools exist.

A missing part is the sampling: since the kinematics of the particles are not 
encoded in the PDF, they are needed inside the sampling. For that, we can 
create a decay with the \zphasespace{} package as shown in the example above. 
For simplicity, the Breit-Wigner distribution is used to model the mass shape. 
Building the decay and sampling from it can be registered with the PDF as 
importance sampling.\footnote{Since this feature is not yet fully public in 
\zfit{}, no explicit example is shown. The mechanism will be similar to the 
registration of an integral.}

\begin{figure}[tbp]
\begin{subfigure}{.43\textwidth}
\includegraphics[width=\textwidth]{figs/dalitz_paper_cut.png}
% \caption{}
\end{subfigure}%
\begin{subfigure}{.57\textwidth}
\includegraphics[width=\textwidth]{figs/dalitz_kpim_kpiz.png}
% \caption{}
\end{subfigure}%
\caption{Dalitz plot of the invariant masses $\Kp\pim$ and $\Kp\piz$. The left 
histogram is taken from the paper, while the one on the right was sampled from 
a distribution built with \zfit{} using \zphasespace{} as the sampler.}
\label{fig:dalitz_comparison}
\end{figure}



In order to compare the \zfit{} implementation with the measured data, a 
sample is drawn from the PDF. Note that the right plot in 
Fig.~\ref{fig:dalitz_comparison}, drawn with \zfit{}, is currently not 
representative, since an instability in the sampling can change the sample 
strongly depending on a single proposed event.\footnote{As described in 
\ref{sec:sampling techniques}, weights going to zero, as occurring in the 
phasespace, can cause large problems.} While the two distributions do not 
agree, it is notable that the resonances are taken into account and that the 
phasespace is sampled with the correct borders. This unveils another current 
limitation of \zfit{}: only rectangular limits are possible. As a future 
extension, arbitrary limits for \zspace{} will be implemented.

What was shown in this example are the low level components needed to build an 
amplitude analysis using \zfit{}. Most of the work above is straightforward 
but cumbersome and can be hidden behind a higher level interface that takes 
care of the correct assignment of observables, resonances and compositions. 
Furthermore, by adding an additional layer around the resonances, the 
\zphasespace{} decays can be kept together with the actual resonances, which 
allows the phasespace to be an integral part of the amplitude. A higher level 
interface, which leaves the user with the specification of the resonances and 
coefficients but still offers the possibility to replace any part shown above, 
is currently under development in \zamp.