\subsection{Requirements}
\label{sec:requirements}

A few features are crucial for a model fitting library. An important part is the model building itself, but a library should also offer a convenient, transparent way to create the loss and to perform the minimisation. Especially in HEP, the following features are essential:

\begin{itemize}
\item PDFs are by definition normalised over a certain range. In most other libraries and fields, the domain is assumed to be $(-\infty, \infty)$. In HEP this is practically never the case and a finite normalisation range is used.
\item Fits in HEP are often more than one-dimensional. The framework should therefore extend naturally to higher dimensions.
\item Building and combining models from basic shapes like Gaussian or exponential functions suffices only for simpler cases and is often not enough for more complicated or analysis-specific models. Therefore, a convenient way to implement custom models has to be provided.
\item Reasonable scaling with the data size and the model complexity is a key criterion. This is especially hard to achieve in combination with the ability to specify custom models, since the latter usually requires the parallelisation to be implemented by the user.
\item While the minimisation of the loss yields an optimal value for each parameter, in HEP it is crucial to also know the uncertainty of that value. This requires the library to handle parameters and their uncertainties transparently and to provide the flexibility for advanced statistical treatments.
\end{itemize}
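To make the normalisation requirement concrete: for an unnormalised shape $h(x \mid \theta)$ with parameters $\theta$, a finite, possibly multi-dimensional normalisation range $\mathcal{D}$ and a data set $\{x_i\}_{i=1}^{N}$, the PDF and the corresponding unbinned negative log-likelihood take the standard form
\begin{equation}
f(x \mid \theta) = \frac{h(x \mid \theta)}{\int_{\mathcal{D}} h(x' \mid \theta) \, \mathrm{d}x'} \,, \qquad
-\ln \mathcal{L}(\theta) = -\sum_{i=1}^{N} \ln f(x_i \mid \theta) \,,
\end{equation}
such that $\int_{\mathcal{D}} f(x \mid \theta) \, \mathrm{d}x = 1$ holds by construction for any value of $\theta$.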
\subsection{Existing libraries}
\label{sec:landscape}

Model fitting itself is nothing new; in fact, a lot of model fitting libraries are already available, some of them written in Python and covering a scope similar to that of \zfit{}. Building a new fitting library from scratch therefore sounds like reinventing the wheel and should be avoided if not necessary. But as discussed in Sec.~\ref{sec:Introduction}, fundamental changes in the computing architecture are leading to vectorised paradigms. Additionally, the need in HEP for larger and more complicated, yet still flexible, fits requires keeping up with the state of the art in computing, and this sometimes requires a reinvention. However, it is imperative to make sure that no existing library already fulfils these needs or can be extended to do so. And even when concluding that a new library is the way to go, as much as possible should be learned and taken from existing libraries in order to reinvent as little as necessary. In the following, an overview of existing libraries is given.

\subsubsection{General fitting}

Fitting models to data is a task performed in a variety of fields beyond HEP. Several general fitting libraries exist in Python, but they often contain functionality not actually needed in HEP, such as means, variances or survival functions, while lacking central features like a custom normalisation range or the extension to more than one dimension.

\begin{itemize}
\item Scipy\cite{software:scipy} is the go-to library for scientific calculations in Python and provides an extensive toolbox of statistical and numerical methods. It contains a module with distributions that have proven to be stable and to work well; a short usage sketch is given after this list. Downsides of the package include an implementation that is not optimised for parallelisation and the lack of support for composite models.
\item lmfit\cite{software:lmfit} shares a lot of its design, in terms of naming and concepts, with \zfit{}. It is built for model fitting and has parameters, minimisers, fit results and more. It lacks more advanced features such as normalisation ranges for PDFs and good scalability: it is built on top of numpy, a fast numerical library for Python, and scipy, which strongly limits the potential for massive parallelisation.
\item TensorFlow Probability\cite{DBLP:journals/corr/abs-1711-10604} provides a library for statistical reasoning. Its focus is on analytical functions and it only marginally extends to numerical and Monte Carlo methods, which limits its application to analytically integrable functions. Interestingly, it contains a lot of features that can be used inside or together with \zfit{}, such as Bayesian inference with MCMC samplers and analytic functions with integrals already implemented in TF.
\end{itemize}
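To illustrate the Scipy distributions module mentioned above, the following minimal sketch (with purely hypothetical toy data and parameter values) performs an unbinned maximum-likelihood fit of a Gaussian. The fit itself is a one-liner, but the PDF is implicitly normalised over $(-\infty, \infty)$: a finite normalisation range requires a dedicated distribution such as \texttt{truncnorm}, whose bounds are given in units of the width, and no generic mechanism exists for arbitrary custom shapes.

\begin{verbatim}
import numpy as np
from scipy import stats

# Hypothetical toy sample, values chosen purely for illustration.
rng = np.random.default_rng(0)
data = rng.normal(loc=0.2, scale=1.1, size=10000)

# Unbinned maximum-likelihood fit of a Gaussian; the PDF is
# normalised over (-inf, inf), not over a finite range.
mu, sigma = stats.norm.fit(data)

# A finite range needs a special-purpose distribution: truncnorm
# expects its bounds in units of sigma, relative to mu.
a, b = (-3.0 - mu) / sigma, (3.0 - mu) / sigma
pdf_at_zero = stats.truncnorm.pdf(0.0, a, b, loc=mu, scale=sigma)
\end{verbatim}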
\subsubsection{HEP specific}

A wide range of specialised fitters exists in HEP. The overview here is limited to general-purpose fitters that can be used from Python.

\begin{description}
\item \roofit\cite{Verkerke:2003ir} is the de-facto standard tool for fitting in HEP. Models are built using classes and provide automatic normalisation and integration; a minimal usage sketch is given at the end of this section. Binned as well as unbinned fits are supported. \roofit itself extends beyond that and also offers an extensive plotting and statistics module. While the library has proven itself in numerous analyses over the years, and the model building part of \zfit{} is in fact inspired by the core of \roofit, there are several shortcomings that \zfit{} is meant to address:
\begin{itemize}
\item \roofit is not a native Python library but can only be accessed through the Python bindings to \root. Since \roofit manages its own memory in C++ while Python uses garbage collection as well, this can lead to memory leaks and completely undefined behaviour.
\item Since the Python interface is a thin wrapper around the C++ classes, it does not integrate well with the scientific Python stack.
\item In terms of flexibility, \roofit can be extended to a certain degree with custom classes in pure C++. Especially when used from Python, however, it does not provide a convenient way to define custom PDFs.
\item While there are improvements in the pipeline, it is not natively optimised to run vectorised on multiple cores or on accelerators such as GPUs.
\item Since its usage requires \root, the installation and setup are typically not lightweight.
\end{itemize}
\item probfit\cite{software:probfit} is a fitting library written in Python that mainly uses Cython to perform the heavy computations. This limits both the performance and the implementation of custom PDFs, which makes a possible extension hard. Since it provides only limited features, including new ones would require a large extension together with a major conceptual overhaul.
\item pyhf\cite{software:pyhf} is a re-implementation of HistFactory from \root in Python. It uses TensorFlow as well as other libraries, including PyTorch and Numpy, as backends. It is designed purely for binned template fits and does not extend its functionality beyond that point.
\item The CMS Combine Tool\cite{higgsanalysis_combinedlimit} contains a subpart that implements template fits in TF. Several useful pieces, such as likelihood profiling and a minimiser in pure TF, have been implemented there. However, it does not extend its functionality further and is currently not available as a stand-alone package.
\item TensorFlow Analysis\cite{tensorflow_analysis} is a library with a simple, functional approach: the loss is built with TF and Minuit\cite{James:1975dr} is used directly inside\footnote{This also requires the \root package to be installed.} to find the minimum. It offers a lot of physics content for creating models. While the lightweight approach comes with a lot of flexibility, the library also leaves quite some work to the user; for example, it offers nothing close to model composition with automatic normalisation. Notably, in its current state the library lacks Python 3 support. Its importance has to be stressed nonetheless: it demonstrated the feasibility of using TF for unbinned likelihood fits with complex models and was a major inspiration for the development of \zfit{}.
\item[TensorProb] is a model fitting library in Python that uses TF as the backend. It was built with a goal similar to that of \zfit{}, namely to provide a model fitting library in Python using TF, but with a more experimental approach. It offers models that provide integration and sampling. It is based on older TF versions and is strongly limited in functionality. Most importantly, the project never grew out of its experimental status and has been discontinued; it now recommends using \zfit{} instead.
\end{description}

While the discussed model fitting libraries have different strengths and weaknesses, no single one fully fulfils the needs of HEP. It is worth pointing out, however, that the concepts and designs they demonstrate, and even certain functionality that can be used directly with \zfit{}, are essential pieces in the development of \zfit{}.
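For comparison with the Scipy sketch above, a minimal \roofit model built through the Python bindings might look as follows. This is only an illustrative sketch with hypothetical parameter ranges and values, and it assumes a working \root installation.

\begin{verbatim}
import ROOT

# Observable with a finite range; RooFit normalises the PDF over it.
x = ROOT.RooRealVar("x", "x", -5.0, 5.0)
mean = ROOT.RooRealVar("mean", "mean", 0.0, -1.0, 1.0)
sigma = ROOT.RooRealVar("sigma", "sigma", 1.0, 0.1, 5.0)
gauss = ROOT.RooGaussian("gauss", "gauss", x, mean, sigma)

# Toy sample and unbinned maximum-likelihood fit.
data = gauss.generate(ROOT.RooArgSet(x), 10000)
result = gauss.fitTo(data, ROOT.RooFit.Save())
result.Print()
\end{verbatim}

Model building, normalisation over the range of \texttt{x} and the unbinned fit are handled by the framework; the shortcomings listed above concern memory management, extensibility from Python and integration with the scientific Python stack rather than this core workflow.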