The pybats-detection package
André F. B. Menezes
Maynooth University
Joint work with Eduardo G. Pinheiro and Hélio S. Migon
October 05, 2022
Maynooth, TSE seminar
pybats-detection packagepybats-detection is a Python package developed for Bayesian monitoring and intervention in univariate time series data.Harrison and Stevens (1976) introduced the Bayesian Dynamic Linear Model (DLM).
Theoretical developments were consolidated in an excellent book by West and Harrison (1997).
Two major works are:
West and Harisson (1986): Automatic monitoring, and adaptation in the class of DLMs.
West and Harisson (1989): Bayesian subjective intervention.
When the time series abruptly changes at a point in time.
This could happen because of changes in the location, scale or other parameters of the process.
Analyses of structural changes in ecological time series (ASCETS) by Östman et al. (2020)
Study of ecological indicators over time such as:
Detecting structural change in ecological time series and adapting is fundamental to improve future forecasts as well as for inference purposes.
Dynamic Linear Models (DLMs) are a natural extension of linear (regression) models.
Go further to the well-known Kalman filter.
Based on Bayes’ theorem.
Takes into account the sequential nature of the data.
\[\begin{eqnarray} Y_t &=& \mathbf{F}_t^\top\,\boldsymbol{\theta}_t + \nu_t, \qquad \nu_t \sim N[0, V_t] \label{eq:dlm_observation} \\ \label{eq:dlm_state_evolution} \boldsymbol{\theta}_t &=& \mathbf{G}_t\,\boldsymbol{\theta}_{t-1} + \boldsymbol{\omega}_t, \quad \boldsymbol{\omega}_t \sim N[\mathbf{0}, \mathbf{W}_t] \end{eqnarray}\]
\(\mathbf{G}_t\) is a known \(p \times p\) matrix, referred as state evolution matrix at time \(t\).
\(V_t\) is the observation variance.
\(\mathbf{W}_t\) is the state evolution variance-covariance matrix.
The quadruple \(\{ \mathbf{F}, \mathbf{G}, V, \mathbf{W}\}_t\) describes the class of DLMs.
A very useful model in the class of DLM is the linear growth or second-order polynomial model.
Let be \(\mathbf{F}_t = (1, 0)^ \top\) and \(\mathbf{G}_t = \left( \begin{matrix} 1 &1 \\ 0 & 1 \end{matrix}\right)\) we obtain the linear growth model: \[\begin{eqnarray} y_t &=& \mu_t + \nu_t \\ \mu_t &=& \mu_{t-1} + \beta_t + \delta\mu_t \\ \beta_t &=& \beta_{t-1} + \delta\beta_t \end{eqnarray}\]
Suitable for medium-term forecasts in time series with trend patterns.
Define the local Bayes factor, \({B_t(k)}\);
Create an alternative DLM (\(M_1\)) describing a level or scale shift; and
Define the subjective and automatic interventions.
The local Bayes factor is defined as
\[B_t(M_0, M_1) = \dfrac{p(y_t \mid M_0)}{p(y_t \mid M_1)}\] where \(p(y_t \mid M_i)\) is the predictive density distribution and \(M_0\) and \(M_1\) denote the current and alternative models, respectively.
At time \(t\), the monitor will be based on the most recent \(k\) observations, which are defined as: \(B_t(k)=\frac{p(y_t,\ldots, y_{t-k+1} \mid M_0)}{p(y_t,\ldots, y_{t-k+1} \mid M_1)}\). Two other important quantities are cumulative Bayes factor and the run-length: \[\begin{eqnarray*} L_t&=& B_t \, \min_{1 \le k \le t}\{1, L_{t-1}\}\\ l_t&=& 1+ l_{t-1}\,\times \, I_{(-\infty,1)}(L_{t-1}) \end{eqnarray*}\] where \(B_t=B_t(0)\).
Evaluating the model consistency of \(y_t\) is equivalent to evaluating for the one-step-ahead standardized forecasting error (\(e_t\)) distribution which is \[ e_t \sim N[0, 1] \] and so \[ p(e_t \mid M_0) \propto \exp\left\{-0.5 e^2_t \right\} \]
A natural alternative model, \(M_1\), assumed that \(e_t\) has a non-zero mean \(h\), which has p.d.f. \[ p(e_t \mid M_1) \propto \exp\left\{-0.5 (e_t - h)^2\right\} \]
Thus, the local Bayes factor at time \(t\) is given by: \[ B_t = \exp\left\{-0.5 (h^2 - 2 h e_t)\right\} \]
The sensitivity of the monitor depends on:
pybats-detectionpybats-detection is built under Object Oriented Programming using Python classes
The three main classes of pybats-detection are: Smoothing, Intervention, and Monitoring.
The classes depends on the pybats.dglm.dlm object from the PyBATS package (Lavine and Cron, 2021). A DLM can be initialized in PyBATS as follows:
Monitoring classAn instance of the Monitoring class can be initialized as follows:
The fit method of Monitoring performs the automatic monitoring.
Further details can be found in package vignettes available at:
quick_start: quick reference guide with step-by-step examples.
pybats_detection: detailed explanation about usability.
The data is simulated from a Normal generating process \(N[\mu, \sigma^2]\) with \(\sigma^2 = 0.5^2\)
And \(\mu = (100, 104, 98)\) for the first \(40\) observations, the following \(30\), and the last \(30\), respectively.
A level DLM is used to described the data. The model is given by: \[\begin{eqnarray} Y_t &=& \mu_t + \nu_t, \qquad \nu_t \sim N[0, V_t] \label{eq:level_dlm} \\ \nonumber \mu_t &=& \mu_{t-1} + \omega_t, \quad \omega_t \sim N[0, W_t] \end{eqnarray}\] where the evolution \(\omega_t\) represents the stochastic change in level between times \(t - 1\) and \(t\).
Monitoring classUpper potential outlier detected at time 41 with H=2.581e-16, L=2.581e-16 and l=1
Upper potential outlier detected at time 42 with H=2.477e-13, L=2.477e-13 and l=1
Upper potential outlier detected at time 43 with H=6.881e-05, L=6.881e-05 and l=1
Lower potential outlier detected at time 71 with H=1.438e-26, L=1.438e-26 and l=1
Lower potential outlier detected at time 72 with H=2.959e-21, L=2.959e-21 and l=1
Lower potential outlier detected at time 73 with H=7.951e-08, L=7.951e-08 and l=1
One-step-ahead forecasts with 95% credible interval.
pybats-detection provides methods for identifying structural changes and outliers in Bayesian Dynamic Linear Models.
The methodology is scalable and efficient for analysing real time series data.
To extend the intervention and monitoring for other class of models, such as transfer functions, DGLM and DGEGM.
Harrison, P.J., Stevens, C.F., 1976. Bayesian forecasting. Journal of the Royal Statistical Society. Series B (Methodological) 38, 205–247. doi:https:// doi.org/10.1111/j.2517-6161.1976.tb01586.x.
Lavine, I., Cron, A., 2021. PyBATS: Bayesian time series modeling and fore- casting. URL: https://lavinei.github.io/pybats/. python package version 0.0.5.
West, M., Harrison, J., 1989. Subjective intervention in formal models. Journal of Forecasting 8, 33–53. doi:10.1002/for.3980080104.
West, M., Harrison, P.J., 1986. Monitoring and adaptation in Bayesian fore- casting models. Journal of the American Statistical Association 81, 741–750. doi:10.1080/01621459.1986.10478331.
West, M., Harrison, P.J., 1997. Bayesian Forecasting and Dynamic Models. 2nd ed., Springer, New York.