\section{\zfit{} introduction}
\label{sec:quickstart}

\zfit{} was created to fill the gap of a pure Python model fitting library for 
HEP. We will now have a look at how it is structured and how it is supposed to 
fill this gap. Model fitting as implemented in \zfit{} is split into five 
essential parts. To introduce them and \zfit{} itself, an example with the sum 
of a Gaussian and an exponential PDF will be implemented. This example can be 
thought of as a fit to an invariant mass distribution with a signal and a 
background component.

Let us assume we are interested in an observable $x$ within a range from $5$
to $10$. In \zfit{}, this is expressed with a \zspace{} defining our domain

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
limits = zfit.Space(obs="x", limits=(5, 10))
\end{python}
\end{minipage}
\end{center}

\zfit{} can handle data from a variety of different sources. In this case, we 
load the data \pyth{data_np} from a numpy array

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
data = zfit.Data.from_numpy(array=data_np, obs=limits)
\end{python}
\end{minipage}
\end{center}

Since the data was specified with \pyth{limits} as its observables, it is 
automatically cut to lie within the \pyth{limits} range. In this context, the 
observable \textit{x} can be thought of as the name of a column in a data 
frame.
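The array \pyth{data_np} is assumed to come from somewhere upstream. For a 
self-contained run, it could for instance be generated as a toy sample with 
numpy; the sample composition below is purely illustrative and chosen to 
roughly match the model built later in this section:

```python
import numpy as np

rng = np.random.default_rng(42)
# Toy sample: a Gaussian "signal" around 7 plus an exponential "background";
# points outside the (5, 10) range are cut away by zfit once the Data
# object is created with the limits from above
signal = rng.normal(loc=7.0, scale=1.5, size=5000)
background = 5.0 + rng.exponential(scale=10.0, size=5000)
data_np = np.concatenate([signal, background])
```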

Next, the model needs to be built. We create the Gaussian PDF
with two free parameters, \texttt{mu} and \texttt{sigma}. Using $7$ 
and $1.5$ as 
initial values, respectively, this is done as

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
mu = zfit.Parameter("mu", 7)
sigma = zfit.Parameter("sigma", 1.5)
\end{python}
\end{minipage}
\end{center}

The Gaussian is created in the observable \textit{x}, using the
parameters from before, as

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
gauss = zfit.pdf.Gauss(obs=limits, mu=mu, sigma=sigma)
\end{python}
\end{minipage}
\end{center}

The exponential PDF is created analogously. A fixed value of $-0.1$ is used 
for the exponent parameter $\lambda$, as in $e^{\lambda x}$, and can be given 
directly to the PDF\footnote{Alternatively, a \pyth{Parameter} 
with the argument \pyth{floating} set to \pyth{False} can be created.}

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
exponential = zfit.pdf.Exponential(obs=limits, lambda_=-0.1)
\end{python}
\end{minipage}
\end{center}

To build the sum, an additional free parameter is used to describe the 
fraction of the first PDF. It is initialised to $0.5$ and bounded between $0$ 
and $1$

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
frac = zfit.Parameter("fraction", 0.5, 0, 1)
model = zfit.pdf.SumPDF(pdfs=[gauss, exponential], fracs=frac)
\end{python}
\end{minipage}
\end{center}

Now that the model is built, we can define the loss by combining it with the 
data. Here, an unbinned negative log-likelihood (NLL) will be used

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
nll = zfit.loss.UnbinnedNLL(model=model, data=data)
\end{python}
\end{minipage}
\end{center}
which needs to be minimised in order to find the optimal parameters. 
To achieve this, a minimiser such as Minuit is needed. Once created, its 
\pyth{minimize} method is used to minimise the previously built loss

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
minimizer = zfit.minimize.MinuitMinimizer()
result = minimizer.minimize(nll)
\end{python}
\end{minipage}
\end{center}
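Conceptually, the minimiser varies the free parameters until the loss reaches 
its smallest value. This idea can be illustrated with a toy brute-force grid 
scan in numpy; this is not how Minuit works internally, which relies on 
gradient-based methods:

```python
import numpy as np

def toy_loss(mu):
    # Toy quadratic loss with its minimum at mu = 7
    return (mu - 7.0) ** 2

# Scan candidate values and keep the one with the smallest loss;
# real minimisers use gradients instead of an exhaustive scan
candidates = np.linspace(5.0, 10.0, 501)
best_mu = candidates[np.argmin(toy_loss(candidates))]
```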

The outcome of this minimisation is stored in a \pyth{FitResult} object. 
Whether the minimisation converged successfully can be checked with

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
has_converged = result.converged
\end{python}
\end{minipage}
\end{center}

The free parameters of the model are updated in-place with the values obtained 
from the 
minimisation. This implies that the shape of the model has changed now, since 
it depends on the parameters. While a 
parameter can change again, the \pyth{result} stores the values from the 
minimisation as immutable numbers. They can be accessed as 
follows\footnote{Note that \pyth{mu}, the parameter itself, and not 
\pyth{"mu"}, its name, is used as the key.}

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
mu_value = result.params[mu]["value"]
\end{python}
\end{minipage}
\end{center}

The value of the parameter is incomplete without an estimate 
of its uncertainty. For an accurate estimation, we can use 
\pyth{error}, an advanced method that takes all correlations among the 
parameters into account

\begin{center}
\begin{minipage}{\textwidth}
\begin{python}
errors = result.error()
\end{python}
\end{minipage}
\end{center}


This simple yet complete example demonstrates how model fitting in \zfit{} 
works. Each part offers far more functionality than shown here, but the 
structure of the workflow, split into five independent parts as shown in 
Fig.~\ref{fig:fit_workflow}, remains the same, no matter how complicated a 
fit may be.

\begin{figure}[tbp]
	\includegraphics[width=\textwidth]{images/fit_workflow.jpg}
	\caption{Fitting workflow in \zfit{}. Model building is the largest part. 
	Models combined with data can be used to create a loss. A minimiser finds 
	the optimal values and returns them as a result. Estimations of the 
	parameter uncertainties can then be made.}
	\label{fig:fit_workflow}
\end{figure}



\begin{description}
	\item[Model building]
	The construction of models is the core of \zfit{} and involves Functions 
	and PDFs. The difference between them is that the latter are normalised 
	to one over a certain domain. Model building includes a set of convenient 
	base classes that make it easy to create a custom model, as explained in 
	Sec. \ref{sec:model}. Furthermore, composed models involving sums, 
	products and more are available.
	\item[Data] 
	Any kind of data needs to be loaded and converted into a well-defined 
	\zfit{} format. The \pyth{Data} class takes care of this and offers 
	several formats to load from, which can then be used by models. The aim 
	here is to provide a simple way of loading data from different formats 
	into \zfit{} and applying cuts.
	\item[Loss] 
	This is the core definition of the problem. It uses the model and data 
	objects to calculate a single number that quantifies the discrepancy 
	between the model and the data. Typically, a binned or unbinned NLL or a 
	\chisq{} is used, but \zfit{} offers the freedom to implement, in a 
	straightforward way, any desired loss that is not already available. From 
	this step onward, it is irrelevant which data or models are 
	\textit{actually} used: only the number and its gradients with respect to 
	the model's parameters matter.
	\item[Minimisation] 
	Given a loss, the minimiser minimises its value with respect to the free 
	parameters of the models. In \zfit{}, several
	algorithms are implemented by wrapping existing minimisation libraries.
	\item[Result and Errors]
	After each minimisation, a \texttt{FitResult} object is created. It 
	stores all the information about the minimisation process and allows, 
	amongst other things, to check whether the convergence was successful. 
	The result also includes the parameters and their values at the minimum. 
	Furthermore, the loss and the minimiser itself are stored in the result. 
	Using both of them, an estimation of the parameter uncertainties can be 
	made. For this purpose, some simple algorithms are provided by \zfit{}, 
	but more sophisticated uncertainty estimations can be made using the 
	objects made available by the \pyth{FitResult}.
\end{description}
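To make the loss step concrete: for a normalised model $f(x;\theta)$ and data 
points $x_i$, the unbinned NLL is $-\sum_i \ln f(x_i;\theta)$. A minimal numpy 
sketch for a Gaussian model, illustrative only and not the \zfit{} 
implementation:

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    # Normalised Gaussian density: the "model"
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def unbinned_nll(x, mu, sigma):
    # One density evaluation per data point, summed in log space
    return -np.sum(np.log(gauss_pdf(x, mu, sigma)))

x = np.array([6.0, 7.0, 8.0])
# Parameters closer to the data give a smaller NLL
nll_good = unbinned_nll(x, mu=7.0, sigma=1.5)
nll_bad = unbinned_nll(x, mu=9.0, sigma=1.5)
```

Minimising this quantity with respect to \texttt{mu} and \texttt{sigma} is 
exactly what the minimiser does in the workflow above.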

This formalisation is a powerful approach: separating model fitting into these 
fine building blocks makes it possible to improve and maintain the individual 
parts almost independently.
Most importantly, it reveals a surprising similarity to the field of 
deep 
learning: apart from the last step, the workflow is \textit{exactly} the same. 
Using a 
deep learning framework as the backend for a model fitting library therefore 
seems like an 
obvious choice to consider.