\subsection{Model} \label{sec:model} Building models is the core competence of \zfit{}. As seen in the example in Sec. \ref{sec:quickstart}, this can be done in a simple manner by using already implemented models and possibly combining them, but models can also be completely custom built. Within \zfit{}, there are two basic types of models to cover most cases: Functions and Probability Density Functions (PDFs). The basic features of a model include \begin{itemize} \item Each model is defined inside a \zspace{}. Its dimensions are ``observables", simple string identifiers as previously discussed. \item A model implements a function that returns its value depending on some data. This is either \pyth{pdf} for a PDF or \pyth{func} for a Function. \item Full as well as partial integration over a model is possible. This includes numerical as well as analytical integration, the latter if available. \item A sample following the model's shape can be generated using numerical or analytical methods, the latter only if available. \end{itemize} The value function as well as the integration and sampling are implemented to return pure Tensors. Depending on the task, higher-level methods are also implemented, providing either a more intuitive, imperative behaviour or a significantly more performant execution for certain cases such as repeated pseudo-experiments. The main differences between a PDF and a Function concern the normalisation and the output dimensionality, which leads to a few subtle distinctions. \begin{description} \item[PDF] A PDF $f(x)$ is only well-defined with a given normalisation range. This defines the normalisation constant so that the integral over the PDF equals one, \begin{equation} \label{eq:norm_range} \sum_{i}\int_{l_i}^{u_i} f(x) \,\mathrm{d}x = 1, \end{equation} with $i$ indexing all the limits that make up the domain and $l_i$ and $u_i$ denoting the lower and upper limits, respectively. A PDF object has three special attributes, which are \begin{itemize} \item the probability density function \pyth{pdf}. It always returns a rank-one Tensor, i.e. a simple vector, whose length equals the number of data points. \item the probability density function \textit{without} the normalisation constant, \pyth{unnormalised_pdf}. In cases where only the shape is needed, using this function is less expensive, especially if the normalisation has to be computed numerically. \item the limits that define the normalisation constant of Eq. \ref{eq:norm_range}. They can be set using the \pyth{set_norm_range} method. \end{itemize} \item[Function] A Function is in a way simpler and more general than a PDF. It takes the same input as a PDF but returns a value with dimensionality $\mathbb{R}^m$. It can be used to transform values or as a building block for more complex expressions. It has the method \pyth{func} to evaluate its value and no other special attributes. \end{description} \subsubsection{Parametrization} \label{sec:parametrization} Models can be parametrized by \pyth{Parameter} objects, which can be used in the implementation of the shape function like any other Tensor-like object when building the model. In the following, we will first have a look at the \pyth{Parameter} itself; afterwards, we will see how exactly parameters are used with a model. There is a distinction between dependent and independent parameters: only the latter can be changed directly and have limits, while the former can be any arbitrary combination of independent parameters.
An independent \pyth{Parameter} \begin{itemize} \item has a name with a purely descriptive purpose; \item has an initial value; \item may have lower and upper limits; \item is either floating or not, independent of whether limits were specified; \item has a step size indicating the order of magnitude of changes of the parameter. It can be given explicitly; otherwise, it is automatically inferred. A well-chosen step size improves the minimization process and can be critical, mostly in the absence of limits and when the model depends only weakly on the parameter, so that large steps are required to change the model. A sketch of creating parameters is shown at the end of this section. \end{itemize} Currently, the shape of parameters is implicitly restricted to a scalar.\footnote{This comes from the fact that the PDF will receive input data of varying size, which is not controlled by the user, such as when doing numerical integration. Any parameter therefore has to be able to broadcast seamlessly.} As a consequence, a parameter cannot simply be treated like data as is possible in \roofit. A function \pyth{Parameter} which depends on the data itself will most likely be available in the future. The name of a parameter, as any other name of a single object in \zfit{}, is purely descriptive. There is purposely\footnote{Contrary to the \root framework. If the need ever arises, adding this as an additional feature is relatively simple.} no direct way provided to access parameters by name: instead of using names, the actual parameter object is passed around. This has the advantage of avoiding double bookkeeping of the parameters, where a reference to the object as well as to a string would be needed. The name is mandatory for parameters, as opposed to other objects in \zfit{}, since matching the value of a parameter to the right name is a critical task during and after a minimization, when reading off the right value. Composed parameters are dependent parameters. In general, they can be any Tensor, i.e. the result of any kind of operation. They can depend on zero, one or more independent parameters, as composed parameters are arbitrary functions; therefore, operations such as shifting and scaling are included as a trivial subset. For more details on the actual implementation and the dependency management see Appendix \ref{appendix:dependency management}. Composed parameters can currently neither be floating nor have limits, since the independent parameters a composed parameter may depend on can have arbitrary relations and have to be restricted themselves. In order to restrict a parameter, arbitrary constraints can be given to the loss instead. Every model can depend on multiple parameters, both dependent and independent. Each parameter that parametrises the model has a name specific to the model and is given on instantiation. For example, a \texttt{Gauss} has parameters named \texttt{mu} and \texttt{sigma}, as in the example in Sec. \ref{sec:quickstart}. They are stored in a mapping attribute named \pyth{params}. \begin{center} \begin{minipage}{\textwidth} \begin{python} mu_param = gauss.params["mu"] sigma_param = gauss.params["sigma"] \end{python} \end{minipage} \end{center} Notice that this does not contradict the statement above that single objects \textit{cannot} be accessed by \textit{their} name. Each object has a unique identifier, the name, but objects can have names for their constituents that are \textit{not} unique, like \texttt{mu}. This simply describes a part of the PDF and any \texttt{Gauss} will have a parameter \texttt{mu}.
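To make the parameter interface concrete, the following is a minimal sketch of creating an independent and a composed parameter. The names and values are purely illustrative, and the \pyth{ComposedParameter} signature shown assumes a recent \zfit{} version. \begin{center} \begin{minipage}{\textwidth} \begin{python}
import zfit

# independent parameter: name, initial value, optional lower/upper
# limits and an explicit step size (otherwise inferred automatically)
mu = zfit.Parameter("mu", 1.2, -5., 5., step_size=0.01)
mu.floating = False  # temporarily fix the parameter
mu.floating = True
mu.set_value(1.5)  # directly assign a new value

# dependent (composed) parameter: an arbitrary function of other
# parameters; it can neither be floating nor have limits of its own
shifted_mu = zfit.ComposedParameter(
    "shifted_mu",
    lambda params: params["mu"] + 0.5,
    params={"mu": mu})
\end{python} \end{minipage} \end{center}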
\subsubsection{Implementing a custom PDF} An essential feature of \zfit{} is the ability to simply create custom models. There is a large freedom in building models from the \pyth{BaseModel} class, since it takes care of most boilerplate and has well-defined entry points that can be customized. The full implementation details and possibilities for customization are described in Appendix \ref{appendix:basemodel}. For the most common use cases though, there exists a simple way of creating a custom model. The \pyth{ZPDF}, basically a more user-friendly wrapper around \pyth{BasePDF}, can be used as a base class in these simple cases. The following function will be used as the PDF shape \begin{equation} \label{eq:simple_linear_model} f(x, y) = a \cdot x^2 + b \cdot y^4. \end{equation} To implement this function, \pyth{_unnormalized_pdf} has to be overwritten. For the vast majority of custom models, this is the only method that needs to be overwritten. Changing other methods, especially \pyth{_pdf}, is an advanced feature and only needed in special cases. For more details on the customization and the possibility of hooking into the calls see Appendix \ref{appendix:basemodel}. This is a two dimensional PDF with two parameters. To implement it in \zfit{}, a new class is created \begin{center} \begin{minipage}{\textwidth} \label{py:custom_pdf_linear_model} \begin{python}
class EvenPolyPDF(zfit.pdf.ZPDF):
    """Implementation of f(x, y) = a*x^2 + b*y^4"""
    _PARAMS = ['a', 'b']
    _N_OBS = 2  # since two dimensional

    def _unnormalized_pdf(self, x):
        xdata, ydata = x.unstack_x()
        a = self.params['a']
        b = self.params['b']
        return a * xdata ** 2 + b * ydata ** 4
\end{python} \end{minipage} \end{center} Note that we only need the \textit{shape} of the function and do not need to take care of the normalisation, as numerical methods are already implemented in the base class. We can see the advantage of the preprocessing that is done by the base class, especially the reordering of the data \pyth{x}. It is a \zdata{} object and calling the \pyth{unstack_x} method returns a list\footnote{Or a single Tensor for the one dimensional case, if not specified differently in the arguments.} containing a Tensor for each column, sorted according to the model's observables. The length of the list has to coincide with the specified \pyth{_N_OBS}, the number of observables. It is not mandatory to specify this field in a model, and sometimes it cannot be known in advance, in which case it can simply be omitted. The naming of the parametrization of the function is defined with \pyth{_PARAMS}. These exact names have to be used when creating an instance of the model. The parameters are then stored in the \pyth{params} dictionary and extracting them is usually the first step inside \pyth{_unnormalized_pdf}. Documentation plays an important role here: it defines the names of the parameters that have to be used and what they represent in the function, but it is also crucial to communicate the ordering of the data. In this case, a user can see that the first observable and the corresponding column in the data will be used as \textit{x} in the function. This PDF is already complete and works out of the box. We can create an instance and use its methods. As an example, the observables of the instance will be called ``xobs" and ``yobs" with the normalisation range going from $0$ to $10$ and from $0$ to $5$, respectively. \begin{center} \begin{minipage}{\textwidth} \label{py:instance_custom_pdf_linear_model} \begin{python}
param_a = zfit.Parameter("a", 1.)
param_b = zfit.Parameter("b", 2.)
x_obs = zfit.Space("xobs", limits=(0, 10))
y_obs = zfit.Space("yobs", limits=(0, 5))
obs = x_obs * y_obs

poly_model = EvenPolyPDF(a=param_a, b=param_b, obs=obs)

prob = poly_model.pdf(...)  # some data needed

x_limits = zfit.Space("xobs", limits=(3, 5))
y_limits = zfit.Space("yobs", limits=(1, 3))
integral_limits = x_limits * y_limits
integral = poly_model.integrate(integral_limits)

sample = poly_model.sample(n=1000)
\end{python} \end{minipage} \end{center} Using the \pyth{integrate} method, we obtain the integral $i$ of our normalised PDF $f_{normed}$ \begin{align} \label{eq:model integral} i &= \int_3^5 \mathrm{d}x \int_{1}^{3} \mathrm{d}y ~f_{normed} (x, y) \\ &\stackrel{(\ref{eq:pdf from func})}{=} \frac{\int_3^5 \mathrm{d}x \int_{1}^{3} \mathrm{d}y ~f(x, y)}{\int_0^{10} \mathrm{d}x \int_{0}^{5} \mathrm{d}y ~f(x, y)} \end{align} with $x$ and $y$ corresponding to the observables ``xobs" and ``yobs", respectively, and $f(x, y)$ the unnormalised PDF as defined in Eq. \ref{eq:simple_linear_model}. Using Eq. \ref{eq:pdf from func}, we end up with a precise formulation of what is \textit{actually} executed in \zfit{}. In other words, the integral was calculated over the limits $3$ to $5$ and $1$ to $3$, respectively, and normalised over the normalisation range \texttt{obs}. The latter is defined on initialisation and taken as the default normalisation range. As mentioned before, the limits could also be multiplied in a different order, resulting in \pyth{integral_limits} having the order (``yobs", ``xobs"). The model takes care internally (see also Appendix \ref{sec:public_methods}) that the right limits end up at the right place. The \pyth{sample} method is used to generate 1000 points from the model. With the instance created, probability densities for given points can also be calculated by calling the \pyth{pdf} method with some \zdata{} object called \pyth{data} as follows \begin{center} \begin{minipage}{\textwidth} \begin{python} probs = poly_model.pdf(data) \end{python} \end{minipage} \end{center} It is worth noting that none of the operations are executed yet; what the methods return are Tensors, as described in Sec. \ref{sec:implementation}. To run the actual computations, it is necessary to call \pyth{zfit.run(...)} on any Tensor. Running \begin{center} \begin{minipage}{\textwidth} \begin{python}
probs_np = zfit.run(probs)
integral_np = zfit.run(integral)
\end{python} \end{minipage} \end{center} returns an actual numpy array for \pyth{probs_np} and a Python float for \pyth{integral_np}. For further, advanced customization of the PDF, methods can be overridden as described in Appendix \ref{sec:public_methods}. However, in most use cases there exist better ways: for example, to add an analytic integral to the model, overwriting \pyth{_analytic_integrate} should be the last resort. Instead, integrals can be registered with a model, defined either over a specific range only or over any range, and covering either all dimensions or only a subset. A priority attribute allows specifying a preference for one method over others. A typical use case for this feature is a special integral that is known exactly, for example the value of the integral over the full space of a Gaussian shaped model.\footnote{As already hinted in Eq.
\ref{eq:model integral}, it is important to note that the integral \textit{to be implemented} is over the unnormalised PDF as implemented in the \pyth{_unnormalized_pdf} method.} An integral over both dimensions is registered here, but partial integrals could be added as well \begin{equation} \label{integral_xy} \int_{x_0}^{x_1} \int_{y_0}^{y_1} f(x, y) \,\mathrm{d}y \,\mathrm{d}x = \frac{a}{3} \left(x_1^3 - x_0^3\right) \left(y_1 - y_0\right) + \frac{b}{5} \left(y_1^5 - y_0^5\right) \left(x_1 - x_0\right). \end{equation} In case a full integral is requested but no analytic integral over all dimensions is available, the fallback of \pyth{integrate} looks for partial integrals. If available, it uses them and numerically integrates over the remaining dimensions instead of using the unnormalised PDF. The implementation of the integral in \zfit{} is done by registering it with the PDF \begin{center} \begin{minipage}{\textwidth} \begin{python}
def integral_full(x, limits, norm_range, params, model):
    # unpack the lower and upper limits of both axes
    (lower,), (upper,) = limits.limits
    x_lower, y_lower = lower
    x_upper, y_upper = upper
    a = params['a']
    b = params['b']

    x_lower, y_lower, x_upper, y_upper = (
        ztf.convert_to_tensor(val)
        for val in (x_lower, y_lower, x_upper, y_upper))

    # analytic double integral of a*x^2 + b*y^4 over both axes
    return (a / 3 * (x_upper ** 3 - x_lower ** 3) * (y_upper - y_lower)
            + b / 5 * (y_upper ** 5 - y_lower ** 5) * (x_upper - x_lower))

lower = ((zfit.Space.ANY_LOWER, zfit.Space.ANY_LOWER),)
upper = ((zfit.Space.ANY_UPPER, zfit.Space.ANY_UPPER),)
limits = zfit.Space.from_axes(axes=(0, 1), limits=(lower, upper))
EvenPolyPDF.register_analytic_integral(func=integral_full, limits=limits)
\end{python} \end{minipage} \end{center} Since the observables that will be assigned to each axis are not yet known at this point, the \zspace{} has to be defined with axes, not with observables. This follows the principle of using axes inside the model as explained in Sec. \ref{sec:spaces and dimensions}. Instead of \pyth{ANY_LOWER} and \pyth{ANY_UPPER}, the integral could also be defined over a specific domain by using plain Python floats. The PDF will now automatically use this integral if possible. Otherwise, for example when a partial integral is requested, it will silently fall back to numerical integration. This example demonstrates how to implement a shape that only depends on the data and parameters. Models that also depend on other models, such as the \pyth{SumPDF} from Sec. \ref{sec:quickstart}, need to be built from a \pyth{Functor}. The only change this implies is an additional argument to the base class, the models it depends on, in order to track the dependencies correctly. This is described in more detail in Appendix \ref{appendix:functor}. \subsubsection{Sampling} Sampling from a model can be done in two different ways to cover two distinct use cases. On the one hand, the method \pyth{sample} returns a pure Tensor, which behaves like any other sampling operation in TF, for example \pyth{tf.random.uniform}. This can be useful for specific and mostly advanced cases, but the overall behaviour can be unintuitive for inexperienced users, as discussed in Appendix \ref{appendix:tensorflow}. On the other hand, sampling from a model is typically used to perform toy studies, which consist of generating events according to the model and afterwards fitting the model, with randomly initialized parameters, to the sampled data. With \pyth{create_sampler}, a \zdata{}-like object is created which handles the sampling from the model and the correct storage of the generated data. As any other \zdata{}, it can be used to build a loss. The actual data is sampled by invoking \pyth{resample} and stays unchanged until it is invoked again.
This sampler keeps a dependency on the original model and uses it to sample, keeping the parameter values it was created with. If desired, any parameter that has changed in between will be set to its new value when \pyth{resample} is called, effectively changing the shape that is sampled from.\footnote{To be clear: the default behaviour is that the sampler samples from the exact distribution that it was created with \textit{in terms of parameter values}.} To illustrate the use of the toy sampler, let's look at a toy study example. Note that the actual (large) \textit{objects} needed, such as the model, minimizer, loss etc., are created \textit{outside} of the loop and only the necessary \textit{methods} are called \textit{inside}. In general, this removes boilerplate code from repeated object creation, but it is especially important when working with a graph-based backend. We assume that a model and a minimizer have already been built, as for example shown in Sec. \ref{sec:quickstart}, and we generate 1000 events according to our model. \begin{center} \begin{minipage}{\textwidth} \begin{python}
import numpy as np

sampler = model.create_sampler(n=1000)
nll = zfit.loss.UnbinnedNLL(model, sampler)

for _ in range(n_toys):
    sampler.resample()  # now the sampling happens
    for param in model.get_dependents(only_floating=True):
        param.set_value(np.random.normal())  # any initial value
    result = minimizer.minimize(nll)
    if result.converged:  # check if the fit was successful
        ...  # save results in a list or similar
\end{python} \end{minipage} \end{center} The sampling is implemented with the accept-reject method, which works with arbitrary shapes. If an analytic inverse integral is registered, it is used instead, providing a more efficient way of sampling. More details about the different techniques and the use of importance sampling are described in Appendix \ref{appendix:sampling techniques}. \subsubsection{Extended PDFs} In addition to the shape, a PDF can carry more information, namely a yield, in which case we refer to it as an Extended PDF. The yield is a scale that describes the magnitude of the PDF, typically reflecting the number of events in the data sample. The output of \pyth{pdf}, the normal probability density function, can be multiplied by it, which results in a number density. This implies that multiplying an integral by the yield returns the expected number of events in the integrated range instead of the probability.\footnote{\textit{Currently}, the integral of an extended PDF is multiplied by the yield \textit{by default}, but this behaviour is meant to change in a future version.} To count, for example, how many signal particles are in a sample, a composite PDF consisting of a background and a signal component, both extended, can be fitted to the data. The \pyth{SumPDF} used to build this composition does not need a fraction if the PDFs are already extended, but uses the yields, normalised to one, as fractions. In this case, integrating the PDF that represents the signal shape over the whole range returns the number of signal events. To create an extended PDF, \pyth{create_extended} can be used. This method returns a copy of the PDF with a yield added to it.
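As an illustration, the following is a minimal sketch of building an extended signal-plus-background model. The observable, the parameter values and the choice of two Gaussian shapes are purely illustrative. \begin{center} \begin{minipage}{\textwidth} \begin{python}
import zfit

obs = zfit.Space("x", limits=(-5, 5))

# shapes: a narrow "signal" and a wide "background" Gaussian
mu = zfit.Parameter("mu", 0.)
sigma_sig = zfit.Parameter("sigma_sig", 0.5)
sigma_bkg = zfit.Parameter("sigma_bkg", 3.)
signal_shape = zfit.pdf.Gauss(mu=mu, sigma=sigma_sig, obs=obs)
bkg_shape = zfit.pdf.Gauss(mu=mu, sigma=sigma_bkg, obs=obs)

# yields are parameters; create_extended returns an extended copy
sig_yield = zfit.Parameter("sig_yield", 1000., 0., 1e6)
bkg_yield = zfit.Parameter("bkg_yield", 5000., 0., 1e6)
signal = signal_shape.create_extended(sig_yield)
background = bkg_shape.create_extended(bkg_yield)

# no fractions needed: the normalised yields act as fractions
model = zfit.pdf.SumPDF([signal, background])

# integrating the extended signal PDF over the full range returns
# a Tensor with the expected number of signal events
n_sig = signal.integrate(limits=obs)
\end{python} \end{minipage} \end{center} After a fit to data, evaluating \pyth{n_sig} with \pyth{zfit.run} would return the fitted number of signal events in the integrated range.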