(1) Overview
Introduction
Lithium-ion batteries have become a cornerstone of recent research on the electrification of several technological sectors. An important aspect of such research is to devise computational tools to obtain and simulate suitable battery models for various optimisation and control algorithms. To this end, several programming packages/toolboxes have been developed to obtain and simulate battery models with varying levels of physics incorporated into the model structure. Such packages generally utilise one of the following model structures, namely (i) the physics-based Doyle-Fuller-Newman (DFN) model structure involving a set of coupled partial differential equations, (ii) the single-particle model (SPM) structure, which can be considered as a simplification of the DFN model structure, and (iii) the semi-empirical linear time-invariant (LTI) equivalent-circuit model (ECM) structure. A few examples of the developed packages are (i) LIONSIMBA [1], a MATLAB toolbox for battery simulation using the DFN model, (ii) TOOFAB [2], a MATLAB toolbox for efficient simulation of the DFN model by selecting desired model simplifications, (iii) PyBaMM [3], a Python package for battery simulation using various mathematical models, including the DFN, SPM, and LTI-ECM, (iv) PyBOP [4], a Python package that works in conjunction with PyBaMM for the estimation of model parameters using suitable optimization methods, (v) BattMo [5], a MATLAB/Python/Julia package for battery simulation using the DFN model for 1-, 2- or 3-dimensional geometries, and (vi) SLIDE [6], a collection of C++ code files for the simulation of battery degradation by incorporating various degradation models into the SPM. Notably, the aforementioned packages utilise complex model structures, such as the DFN and SPM, to achieve high-fidelity battery simulations that can be computationally expensive for control-oriented applications, such as battery management systems (BMS).
On the other hand, the LTI-ECMs provide the desired simplicity for such applications, but fail to generalise across a wide range of battery operating conditions, thereby forcing the user to adopt workarounds such as constructing lookup tables comprising several models [7]. Quite naturally, a package for identifying battery models that offer both computational efficiency and generalizability over wide operating conditions may be more desirable for control-oriented BMS applications.
A relatively less investigated model class in the battery literature is that of linear parameter-varying (LPV) models, which preserve the linear input–output structure of the LTI models while approximating nonlinear battery behaviour using the so-called scheduling variable. The LPV models can be regarded as intermediate between the LTI-ECMs and nonlinear electrochemical models in terms of model complexity, though they remain closer to the LTI-ECMs than to the electrochemical models. A methodology for identifying battery models using the LPV framework has recently been proposed in our papers [8, 9, 10]. Accordingly, a Python package named PyBatteryID has been developed to facilitate the identification of battery models using the proposed methodology, where the identified models can depend on various user-specified signals, such as state-of-charge (SOC), current magnitude, current direction and temperature. It may be noted that toolboxes/packages exist to identify LPV models for general systems, for example (i) LPVcore [11], and (ii) deepSI (LPV-SUBNET) [12]. However, to the best of the authors’ knowledge, PyBatteryID is the first open-source package to enable LPV model identification specifically tailored for batteries.
The PyBatteryID package describes the model dependence on various suitable signals, such as SOC, by generating a dictionary of basis functions according to user-specified functional forms and complexity. For instance, the user may specify combinations of the basis functions 1∕s and log(s) up to the second order, where s represents the SOC. Furthermore, the package allows the user to specify multiple optimization methods in a sequence for model estimation, for instance, the least absolute shrinkage and selection operator (LASSO) can be used to perform the so-called variable selection, followed by the ridge regression method to refine (re-estimate) the parameters corresponding to the selected variables or model terms. Other noteworthy features provided in the package include (i) modelling of the hysteresis behaviour exhibited by certain battery chemistries, such as the lithium iron-phosphate (LFP) batteries, and (ii) generation of suitable current profiles for obtaining informative identification datasets. In summary, the battery models identified using the PyBatteryID package possess the following advantages, namely, the models (i) comprise a handful of terms that need to be simply added to obtain the voltage output, and (ii) offer good generalizability across a wide range of battery operating conditions, thereby rendering the models favourable for control-oriented (BMS) applications.
Battery model structure
The PyBatteryID package considers the battery voltage output as the sum of the battery electromotive force (EMF), also known as the open-circuit voltage, and the battery overpotentials, where the former needs to be determined a priori using suitable experiments such as the galvanostatic intermittent titration technique (GITT) or low-current cycling [13]. For the battery overpotentials, the package employs the model structure proposed in [9] using the LPV framework, which can be considered as an intermediate paradigm between LTI systems and nonlinear systems. Essentially, an LTI system can be upgraded to an LPV system by considering the model parameters as functionally dependent on appropriate variables, which are collectively referred to as the scheduling variable p. For instance, the package allows the scheduling dependence of the model parameters to be defined using the SOC, current magnitude, current direction and temperature. Furthermore, LPV systems admit several representations relating the model output(s) to the model input(s), for instance, the state-space representation and the input–output representation, of which only the latter will be discussed in this paper. Note that the overall battery model structure remains nonlinear due to the presence of the battery EMF, regardless of whether the overpotential dynamics are modelled using an LTI or LPV model structure. Nevertheless, for brevity and with a slight abuse of terminology, the term ‘LPV (or LTI) model’ will henceforth refer to the full battery model, which represents the overpotential dynamics using the LPV (or LTI) model structure.
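The decomposition of the terminal voltage into an EMF part and an overpotential part can be illustrated with a small, package-independent sketch. The table keys, the numerical values and the choice of piecewise-linear interpolation are assumptions made purely for illustration and do not reflect how PyBatteryID stores or evaluates the EMF internally.

```python
from bisect import bisect_right

# Hypothetical EMF table: keys and numbers are illustrative only.
EMF_TABLE = {
    "soc_values": [0.0, 0.5, 1.0],
    "voltage_values": [3.0, 3.7, 4.2],
}


def emf(soc, table=EMF_TABLE):
    """Piecewise-linear EMF lookup, clamped to the tabulated SOC range."""
    xs, ys = table["soc_values"], table["voltage_values"]
    if soc <= xs[0]:
        return ys[0]
    if soc >= xs[-1]:
        return ys[-1]
    j = bisect_right(xs, soc)  # xs[j - 1] <= soc < xs[j]
    weight = (soc - xs[j - 1]) / (xs[j] - xs[j - 1])
    return ys[j - 1] + weight * (ys[j] - ys[j - 1])


def overpotential(terminal_voltage, soc):
    """Overpotential obtained by subtracting the EMF from the measured voltage."""
    return terminal_voltage - emf(soc)


def voltage_from_overpotential(overpotential_value, soc):
    """Terminal voltage reconstructed as EMF plus overpotential."""
    return emf(soc) + overpotential_value
```

The first helper is what an identification step would need (measured voltage to overpotential), whereas the second is what a simulation step would need (modelled overpotential back to voltage).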
Input–output LPV representation
A brief overview of the battery model structure can be given as follows. Consider an nth-order LTI input–output representation for the battery overpotentials as given by
$$y_k + \sum_{i=1}^{n} a_i\, y_{k-i} = \sum_{i=0}^{n} b_i\, u_{k-i}, \tag{1}$$
where $y_k$ represents the model output at time instant $k$, $u_k$ the model input at time instant $k$, and $a_i, b_i$ the model parameters. The above LTI representation can be converted to an LPV representation by rendering the parameters $a_i, b_i$ as functions of the scheduling variable $p$. Concretely,
$$y_k + \sum_{i=1}^{n} (a_i \diamond p)_k\, y_{k-i} = \sum_{i=0}^{n} (b_i \diamond p)_k\, u_{k-i}, \tag{2}$$
where the notation $(\square \diamond p)_k$ corresponds to having an arbitrary dependence on the variable $p$ at different time instants, e.g., on $p_k$, $p_{k-1}$, and so on. In the following, the so-called shifted-form input–output representation will be considered, which assumes a special p-dependence structure, that is, $(a_i \diamond p)_k = a_i(p_{k-i})$ and $(b_i \diamond p)_k = b_i(p_{k-i})$. Accordingly, the above input–output representation can be given by
$$y_k + \sum_{i=1}^{n} a_i(p_{k-i})\, y_{k-i} = \sum_{i=0}^{n} b_i(p_{k-i})\, u_{k-i}, \tag{3}$$
where the parameter corresponding to the output or input term lagged by $j$ time instants depends on the value of the scheduling variable also lagged by $j$ time instants, that is, $p_{k-j}$. This choice of p-dependence is motivated by the analytical convenience of the shifted-form input–output representation, as explained in [9].
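To make the shifted-form dependence concrete, the following minimal sketch simulates a recursion in which the coefficient of every term lagged by i samples is evaluated at the scheduling variable lagged by i samples. The sign convention (output terms moved to the right-hand side with a minus sign), the zero initial conditions, the scalar scheduling variable and all numerical values are assumptions of this sketch rather than details taken from the package.

```python
def simulate_shifted_lpv(u, p, a, b):
    """Simulate y_k = -sum_i a_i(p_{k-i}) y_{k-i} + sum_i b_i(p_{k-i}) u_{k-i}.

    u, p : input and scheduling sequences of equal length
    a    : list of n callables a_1..a_n
    b    : list of n + 1 callables b_0..b_n
    Samples before k = 0 are treated as zero (assumption of this sketch).
    """
    n = len(a)
    if len(b) != n + 1:
        raise ValueError("b must contain one more coefficient than a")
    y = []
    for k in range(len(u)):
        y_k = b[0](p[k]) * u[k]
        for i in range(1, n + 1):
            if k - i < 0:
                break
            # Each lagged term uses the scheduling value with the same lag.
            y_k += b[i](p[k - i]) * u[k - i] - a[i - 1](p[k - i]) * y[k - i]
        y.append(y_k)
    return y


# First-order example with an SOC-like scheduling variable (illustrative numbers).
A_COEFFS = [lambda p: -0.9]
B_COEFFS = [lambda p: 0.01 + 0.02 * (1.0 - p), lambda p: 0.0]
Y_EXAMPLE = simulate_shifted_lpv([1.0, 1.0, 1.0], [0.5, 0.5, 0.5], A_COEFFS, B_COEFFS)
```

Replacing the callables by constants recovers the LTI case, which is why the same routine can serve both model classes.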
Definition of p-components
As mentioned earlier, the package allows SOC, current magnitude, current direction and temperature as the components of the variable p, that is, $p \triangleq [\, s \;\; \delta \;\; |u| \;\; T \,]^{\top}$, where s represents the SOC, δ the current direction, |u| the current magnitude, and T the temperature. Note that the value of SOC s can be determined using a first-order difference equation as given by
$$s_{k+1} = s_k + \frac{\tau}{Q}\, u_k, \tag{4}$$
where τ represents the model sampling time, Q the battery capacity, and u the battery current. Additionally, the current direction δ has also been defined using a first-order difference equation [9], as can be given by
where ε is a current-dependent parameter as given by
with ε0 and ε1 as the hyperparameters. In effect, various current-direction trajectories can be generated by choosing certain values of ε0 and ε1.
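The first-order SOC update described above amounts to Coulomb counting, which can be sketched as follows. The sketch assumes that a positive current charges the cell, that the sampling time is given in seconds and that the capacity is given in ampere-hours; these conventions and the function name are assumptions and may differ from those used in the package.

```python
def soc_trajectory(current_a, soc_init, tau_s, capacity_ah):
    """Return [s_0, s_1, ..., s_N] for N current samples.

    Assumed convention: positive current charges the cell.
    """
    capacity_as = capacity_ah * 3600.0  # ampere-hours to ampere-seconds
    soc = [soc_init]
    for u_k in current_a:
        # s_{k+1} = s_k + tau * u_k / Q
        soc.append(soc[-1] + tau_s * u_k / capacity_as)
    return soc
```

For example, charging a 2.85-Ah cell at 2.85 A for 3600 samples of one second each moves the SOC from 0 to approximately 1.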
Selection of suitable basis functions
Once the components of the variable p have been defined, the next step can be given by the specification of suitable basis functions for each p-component. For instance, the following basis functions for the SOC, current magnitude, current direction and temperature have been considered in our recent papers,
where 𝒢s, 𝒢|u|, 𝒢δ and 𝒢T represent the sets of basis functions for SOC, current magnitude, current direction and temperature, respectively; see our recent papers [9, 10] for the motivation behind choosing such functional forms for each p-component. Subsequently, a larger set of basis functions can be generated that includes individual basis functions specified above as well as their higher-order combinations. Formally, a set 𝒢(l) can be defined representing all the considered basis functions as given by
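where l represents the nonlinearity order specifying the order of basis-function combinations. Finally, the model parameters $a_i(p_{k-i})$ and $b_i(p_{k-i})$ can be expanded using the set 𝒢(l) as given by
$$a_i(p_{k-i}) = \sum_{j=1}^{r} a_{ij}\, g^{(j)}(p_{k-i}), \qquad b_i(p_{k-i}) = \sum_{j=1}^{r} b_{ij}\, g^{(j)}(p_{k-i}), \tag{8}$$
where $\{a_{ij}\}$ and $\{b_{ij}\}$ represent the to-be-estimated free model parameters, $g^{(j)}$ the jth element in 𝒢(l), and r the number of elements in the set 𝒢(l).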
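The generation of a dictionary containing the individual basis functions together with their higher-order combinations can be sketched as below. The inclusion of a constant term and of repeated factors (e.g., squares), as well as the labels and the function name, are assumptions of this sketch; the exact combination rule used by the package is not reproduced here.

```python
import math
from itertools import combinations_with_replacement


def build_dictionary(basis, order):
    """Constant term plus all products of up to `order` basis functions.

    basis : dict mapping a label to a callable of the scheduling variable p
    order : nonlinearity order l
    """
    terms = {"1": lambda p: 1.0}
    labels = sorted(basis)
    for m in range(1, order + 1):
        for combo in combinations_with_replacement(labels, m):
            # Bind `combo` as a default argument so each term keeps its own factors.
            terms["*".join(combo)] = (
                lambda p, combo=combo: math.prod(basis[name](p) for name in combo)
            )
    return terms


# Example: 1/s and log(s) combined up to the second order.
SOC_BASIS = {
    "1/s": lambda p: 1.0 / p["s"],
    "log[s]": lambda p: math.log(p["s"]),
}
TERMS = build_dictionary(SOC_BASIS, order=2)
```

Under these assumptions, two basis functions combined up to the second order yield r = 6 dictionary elements, including the constant term; the constant term alone corresponds to the LTI case.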
Implementation and architecture
This section details various functions provided by the PyBatteryID package that enable the battery model identification using the LPV framework. In this regard, the following aspects of PyBatteryID will be discussed, namely (1) defining the model structure, (2) identifying a battery model, (3) simulating a battery model, and (4) a summary of various utilities provided in the package. Figure 1 shows a structural overview of the PyBatteryID package.

Figure 1
A structural overview of the PyBatteryID package.
Installing PyBatteryID
The PyBatteryID package can be installed using the pip package manager via the following command
pip install pybatteryid
Alternatively, the user can download the source code files directly from the GitHub repository.
Defining the Battery Model Structure
The model structure can be defined using the ModelStructure class provided in the package that takes model sampling time and battery capacity as the inputs, as given by

Subsequently, the package requires the battery EMF function to be known a priori, which can then be used to calculate the battery overpotentials from the measured battery terminal voltage as given by yk = Vk–VEMF(sk), where y represents the battery overpotential, V the battery terminal voltage, VEMF the battery EMF, and s the battery SOC. Essentially, the EMF function can be defined using the add_emf_function provided by the ModelStructure class, which requires a Python dictionary consisting of a set of SOC values and the corresponding EMF values as given by

Additionally, the package allows the definition of a temperature-dependent EMF function $V_{\mathrm{EMF}}(s,T)$ using the first-order approximation, that is,
$$V_{\mathrm{EMF}}(s,T) \approx V_{\mathrm{EMF}}(s,T_{\mathrm{ref}}) + (T - T_{\mathrm{ref}})\, \frac{\partial V_{\mathrm{EMF}}}{\partial T}(s), \tag{9}$$
where the functions $V_{\mathrm{EMF}}(s,T_{\mathrm{ref}})$ and $\partial V_{\mathrm{EMF}}/\partial T$ can be obtained experimentally using the potentiometric method; see [10] for the detailed procedure. Now, the add_emf_function requires a dictionary of four elements to define a temperature-dependent EMF function, namely, (i) a set of SOC values, (ii) the corresponding EMF values, (iii) the value of the reference temperature $T_{\mathrm{ref}}$ at which the EMF values are obtained, and (iv) the values of $\partial V_{\mathrm{EMF}}/\partial T$ corresponding to the SOC values given as the first element. Concretely,

The next step for the model identification procedure can be given by specifying the basis functions as given in (7), which are needed to explain the dependence of the model parameters on the p-variables. In this regard, the package allows the user to conveniently specify various functional forms of the p-variables as strings, which are automatically parsed by the package. Note that the characters s, i, d and T are reserved for the SOC s, current u, sign of the current sgn[u] and temperature T, respectively. The following functional forms are currently supported by the package,
No functional transformation. In this case, the variables are specified as is, e.g., s, and T.
Inverse. The string format given by 1/□ needs to be specified to invert a variable, e.g., 1/s, and 1/T.
Logarithm. The logarithm of a variable can be specified using the string log[□], e.g., log[s].
Low-pass filtering. A variable can also be low-pass filtered according to (5) using the string □[ε0, ε1]. Note that such an operation only makes sense for the current direction d, e.g., d[0.01,0.99].
Exponential. For certain basis functions, e.g., 𝒢|u| and 𝒢T, it may be needed to perform multiplication, addition, taking the absolute value, or even raising the variable to a certain power before taking the exponential of the variable. In such cases, the string of the form exp[c*[|a*□+b|]^d] can be used, where a, b, c, d represent suitable numbers. For instance, the string exp[[0.00366*T+1]^-1] can be used to specify the basis function $\exp\big((0.00366\,T + 1)^{-1}\big)$, where T represents the temperature in degrees Celsius. Furthermore, if the exponent d equals 0.5, the string can also be specified as exp[c*sqrt[|a*□+b|]], e.g., exp[0.05*sqrt[|i|]].
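How such strings may be turned into callable basis functions can be illustrated with a deliberately small parser that recognises only four of the forms listed above. It is a hypothetical sketch written for this text, it covers only a subset of the syntax, and it is not the parser implemented in PyBatteryID.

```python
import math
import re

_VAR = r"([sidT])"
_NUM = r"(-?\d+(?:\.\d+)?)"


def parse_basis(spec):
    """Map a basis-function string to a callable of p (a dict of variables)."""
    spec = spec.replace(" ", "")
    match = re.fullmatch(_VAR, spec)
    if match:  # no transformation, e.g. "s"
        name = match.group(1)
        return lambda p: p[name]
    match = re.fullmatch(r"1/" + _VAR, spec)
    if match:  # inverse, e.g. "1/s"
        name = match.group(1)
        return lambda p: 1.0 / p[name]
    match = re.fullmatch(r"log\[" + _VAR + r"\]", spec)
    if match:  # logarithm, e.g. "log[s]"
        name = match.group(1)
        return lambda p: math.log(p[name])
    match = re.fullmatch(r"exp\[" + _NUM + r"\*sqrt\[\|" + _VAR + r"\|\]\]", spec)
    if match:  # exponential of a scaled square root, e.g. "exp[0.05*sqrt[|i|]]"
        scale, name = float(match.group(1)), match.group(2)
        return lambda p: math.exp(scale * math.sqrt(abs(p[name])))
    raise ValueError(f"unsupported basis-function string: {spec!r}")
```

A call such as parse_basis("1/s")({"s": 0.5}) then evaluates the inverse of the SOC, whereas an unrecognised string raises a ValueError rather than being silently ignored.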
Accordingly, the package provides the add_basis_functions function to specify the basis functions using a Python list of the strings described above, as can be given by

Modelling batteries with hysteresis. Multiple approaches can be adopted using the package to incorporate the hysteresis phenomenon in the model structure, for instance, by introducing a second input that represents maximum hysteresis overpotential as investigated in [8]. Alternatively, the hysteresis phenomenon can be considered as an extremely sluggish relaxation process compared to other battery chemistries, as done in [14], which can then be modelled by including suitable current-trajectories δ mimicking such sluggish relaxation as part of the basis functions without introducing the second input. Nevertheless, the latter approach has not been thoroughly investigated by the authors and will not be discussed further. Accordingly, the second input can be introduced in the model structure using the add_hysteresis_function function provided by the ModelStructure class, as given by

Furthermore, the package allows additional basis functions that are specific to the second input – the hysteresis overpotential, using another list of identifier strings as a second input to the add_basis_functions function. For instance,

Identifying the Battery Model
Following the definition of the battery model structure, the battery model of a certain model order n and nonlinearity order l can be identified using a predefined list of regression methods; see (1) and (7) for the definition of n and l, respectively. More precisely, a linear regression setting can be constructed to estimate the free model parameters in (8) as given by
$$Y = \Phi\,\theta, \tag{10}$$
where $Y \in \mathbb{R}^{N}$ represents the output vector containing the $N$ output measurements $y_k$, $\Phi \in \mathbb{R}^{N \times M}$ the regression matrix containing the regression vectors corresponding to each output measurement, and $\theta \in \mathbb{R}^{M}$ the unknown parameter vector. Note that the regression vector contains all the candidate model terms which can be generated using the set 𝒢(l) as given by
where
Note that ⊗ denotes the Kronecker product, represents the vector containing the output and the input terms, the vector containing the basis functions, and M = r(2n+1). Furthermore, the parameter vector θ can be expressed in terms of the free model parameters as given by
where {aij}, {bij} correspond to the free model parameters.
To solve the regression problem in (10), the package allows the usage of either or both of the following methods, namely (i) the least absolute shrinkage and selection operator (LASSO), and (ii) the ridge regression method. Quite briefly, the LASSO and ridge regression methods can be summarised as follows
$$\hat{\theta}_{\mathrm{LASSO}} = \arg\min_{\theta}\; \lVert Y - \Phi\theta \rVert_2^2 + \lambda_1 \lVert \theta \rVert_1,$$
where $\hat{\theta}_{\mathrm{LASSO}}$ represents the LASSO estimate, and $\lambda_1$ the hyperparameter controlling the amount of regularization via the $\ell_1$-norm. Note that the LASSO attempts to induce sparsity in the parameter estimate by setting some of its elements to exactly zero depending on the value of $\lambda_1$, thereby removing the corresponding candidate terms from the model equation [15]. Subsequently,
$$\hat{\theta} = \arg\min_{\theta}\; \lVert Y - \Phi\theta \rVert_2^2 + \lambda_2 \lVert \theta \rVert_2^2,$$
where $\hat{\theta}$ represents the final model estimate, and $\lambda_2$ the hyperparameter controlling the amount of regularization via the $\ell_2$-norm; when the two methods are applied in sequence, the minimisation is restricted to the terms retained by the LASSO. In this regard, the package currently provides wrappers for the following algorithms,
lasso.cvxopt: A LASSO algorithm provided along with the CVXOPT package, which uses λ1 = 1 [16].
lassocv.sklearn: A LASSO algorithm provided by the Scikit-learn package that uses cross-validation to find the optimal value of λ1 [17].
ridgecv.sklearn: A ridge regression algorithm provided by the Scikit-learn package that uses cross-validation to find the optimal value of λ2 [17].
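The idea of variable selection with the LASSO followed by a ridge re-estimation of the retained terms can be sketched without external dependencies using coordinate descent. The sketch below is not the CVXOPT or Scikit-learn algorithm wrapped by the package; it assumes fixed values of λ1 and λ2 (no cross-validation), non-zero regressor columns, and a tiny hand-made dataset with orthogonal columns chosen so that the result is easy to verify.

```python
def soft_threshold(value, threshold):
    """Shrink `value` towards zero by `threshold`, returning exactly 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def coordinate_descent(phi, y, lam, penalty, sweeps=100):
    """Minimise ||y - phi*theta||^2 + lam*||theta||_1 ("l1") or + lam*||theta||_2^2 ("l2")."""
    n_cols = len(phi[0])
    theta = [0.0] * n_cols
    for _ in range(sweeps):
        for j in range(n_cols):
            rho, z = 0.0, 0.0
            for row, target in zip(phi, y):
                others = sum(row[k] * theta[k] for k in range(n_cols) if k != j)
                rho += row[j] * (target - others)  # correlation with partial residual
                z += row[j] ** 2
            if penalty == "l1":
                theta[j] = soft_threshold(rho, lam / 2.0) / z
            else:
                theta[j] = rho / (z + lam)
    return theta


def lasso_then_ridge(phi, y, lam1, lam2):
    """Select terms with the LASSO, then re-estimate only those terms with ridge."""
    theta_lasso = coordinate_descent(phi, y, lam1, "l1")
    selected = [j for j, value in enumerate(theta_lasso) if value != 0.0]
    reduced = [[row[j] for j in selected] for row in phi]
    refit = coordinate_descent(reduced, y, lam2, "l2")
    theta = [0.0] * len(theta_lasso)
    for j, value in zip(selected, refit):
        theta[j] = value
    return theta_lasso, theta


# Toy data: y = 2*c0 + 0.05*c1 - 1*c2 with mutually orthogonal columns.
PHI = [[1.0, 1.0, 1.0], [1.0, -1.0, 1.0], [1.0, 1.0, -1.0], [1.0, -1.0, -1.0]]
Y = [1.05, 0.95, 3.05, 2.95]
THETA_LASSO, THETA_FINAL = lasso_then_ridge(PHI, Y, lam1=1.0, lam2=0.1)
```

In this toy problem the weakly contributing second column is removed by the LASSO, and the ridge step subsequently reduces the shrinkage bias on the two retained coefficients, which illustrates the motivation for running the two methods in sequence.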
Now, the battery models can be identified using the function identify_model, which requires the following arguments, namely the (i) identification dataset, (ii) model order n, (iii) nonlinearity order l, and (iv) list of regression methods. Note that the desired algorithms for the regression methods need to be specified as a list, in the order in which they are to be applied, using the identifier strings (e.g., lasso.cvxopt). For instance, a battery model with n = 3 and l = 4 can be identified as follows

where the argument identification_dataset needs to be given using a Python dictionary as follows

Additionally, in this example, the argument optimizers specifies cross-validated LASSO and ridge regression methods to be performed sequentially, which ultimately results in (i) variable selection using LASSO, that is, the terms corresponding to zero values in the LASSO estimate are automatically removed, and then (ii) re-estimation of the parameters corresponding to the remaining candidate terms; see [9] for the motivation behind augmenting the LASSO with the ridge regression method.
In the case of temperature-dependent battery model identification, the argument identification_dataset accepts an additional temperature_values key corresponding to the temperature measurements of the battery. Furthermore, as explained in [10], the temperature-dependent model identification may require multiple identification experiments, each resulting in a separate regression problem Yi = Φiθ. In this regard, the package provides two strategies to combine multiple regression problems, namely (i) concatenate, and (ii) interleave; see [10] for more details. An example of temperature-dependent model identification can be given by

where identification_datasets represents a list of Python dictionaries corresponding to each identification experiment, that is, [{…}, {…}, …, {…}].
Simulating the Battery Model
The identified battery model can be simulated for a given current profile to obtain the output voltage using the simulate_model function as given by
voltage_output = simulate_model(model, current_profile)
where model represents the identified battery model (an instance of the Model class), and current_profile a Python dictionary containing the current values, initial SOC and initial voltage values depending on the model order, as can be given by

Again, in case of a temperature-dependent battery model, an additional temperature_values key is required in the current_profile dictionary corresponding to the temperature measurements of the battery.
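The role of the initial SOC and the initial voltage value(s) in a simulation can be illustrated with a first-order, package-independent sketch in which the terminal voltage is obtained as the EMF plus a simulated overpotential. The function name, the argument names, the first-order LTI overpotential model, the sign conventions and the way the initial voltage is used to initialise the recursion are all assumptions of this sketch and are not taken from the simulate_model implementation.

```python
def simulate_voltage(current, soc_init, voltage_init, tau_s, capacity_ah,
                     emf, a1, b0, b1):
    """Simulate V_k = EMF(s_k) + y_k with y_k = -a1*y_{k-1} + b0*u_k + b1*u_{k-1}.

    The first voltage sample is taken as given and only serves to
    initialise the overpotential y_0 = V_0 - EMF(s_0).
    """
    capacity_as = capacity_ah * 3600.0
    soc = soc_init
    y_prev = voltage_init - emf(soc_init)
    voltages = [voltage_init]
    for k in range(1, len(current)):
        soc += tau_s * current[k - 1] / capacity_as  # Coulomb counting
        y_k = -a1 * y_prev + b0 * current[k] + b1 * current[k - 1]
        voltages.append(emf(soc) + y_k)
        y_prev = y_k
    return voltages


def linear_emf(soc):
    """Illustrative EMF curve; a real cell would use a measured table."""
    return 3.0 + 1.2 * soc


# Relaxation at rest: the overpotential decays while the EMF stays constant.
RELAXATION = simulate_voltage([0.0, 0.0, 0.0], 0.5, 3.65, 1.0, 2.85,
                              linear_emf, a1=-0.5, b0=0.01, b1=0.0)
```

A model of order n would require n initial voltage values in the same manner, since n lagged overpotentials must be known before the recursion can start.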
Various Utility Functions
The package provides various utility functions that can enhance user experience during the battery model identification procedure. In the following, such utility functions will be briefly described along with suitable example usages.
Generating identification input signal. The package allows the generation of a current profile that can be used as an input signal during the identification experiments; see [9] for details regarding the current profile design and the associated parameters. More precisely, the function generate_current_profile can be used as follows

Saving/Loading models to/from a file. The identified battery model can be saved to (respectively, loaded from) a file using the function save_model_to_file (respectively, load_model_from_file). For instance, an identified model model can be saved and then loaded from a file as given by

Analysing an experimental dataset. Occasionally, it may be important to check various details of an experimental dataset for a better understanding of the dataset. In this regard, the package provides a utility function analyze_dataset, which provides various details for a dataset, such as experimental time, extracted charge during the experiment, voltage range, SOC range and temperature range. An example usage can be given by

Other utility functions. Other worth-mentioning utility functions can be given by invert_voltage_function and print_model_details. The former function can be used to calculate the initial SOC for a given experimental dataset by inverting the EMF function using the initial voltage and temperature (if applicable) values as inputs. The latter function can print the model terms along with the model order and nonlinearity order of an identified battery model.
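The idea of obtaining the initial SOC by inverting the EMF function can be sketched with a simple bisection search. The sketch assumes that the EMF increases monotonically with the SOC on the interval [0, 1] and uses an illustrative linear EMF curve; the numerical method used by invert_voltage_function itself is not reproduced here.

```python
def invert_monotonic(func, target, lower=0.0, upper=1.0, tol=1e-9):
    """Find x in [lower, upper] with func(x) = target for an increasing func."""
    if not func(lower) <= target <= func(upper):
        raise ValueError("target voltage lies outside the EMF range")
    while upper - lower > tol:
        middle = 0.5 * (lower + upper)
        if func(middle) < target:
            lower = middle
        else:
            upper = middle
    return 0.5 * (lower + upper)


def linear_emf(soc):
    """Illustrative, strictly increasing EMF curve."""
    return 3.0 + 1.2 * soc


INITIAL_SOC = invert_monotonic(linear_emf, 3.6)
```

Such an inversion is only meaningful if the cell has been at rest long enough for the measured initial voltage to be close to the EMF.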
An Illustrative Example
This section presents an example of using the PyBatteryID package to perform model identification for a 2.85-Ah NMC cylindrical battery cell. Figure 2(a) shows the identification current profile, which was chosen with the intended model application in mind, namely realistic drive-cycle-like conditions. The following code snippet can be used to obtain an LPV model that depends on SOC, current magnitude, and current direction as discussed in the previous section.

Figure 2
Model identification results for a 2.85-Ah NMC battery cell; (a) and (b) show the identification and validation current profiles, respectively, and (c) compares the measured voltage with the simulated voltage obtained using an LTI and an LPV model.

Additionally, the performance of the identified LPV model can be compared with that of an LTI model to understand the utility of the model structure employed by the PyBatteryID package. Essentially, the same code snippet can be used to obtain an LTI model by providing an empty list of the basis functions, that is,

Once the two models are obtained, the corresponding model performance can be validated using a suitable current profile as shown in Figure 2(b), which represents the driving conditions of a travel route in the Netherlands. To do so, the following code snippet can be used to obtain the simulated voltage.

Figure 2(c) shows the measured and the simulated voltage curves corresponding to the validation current profile in Figure 2(b), for the two models. Note that the corresponding root-mean-squared error (RMSE) and mean absolute error (MAE) values can be given for the LTI and LPV models as (RMSE, MAE) = (83.949 mV, 60.023 mV) and (RMSE, MAE) = (9.414 mV, 6.342 mV), respectively. For more examples, the reader is referred to the examples folder of the package repository at https://github.com/tue-battery/PyBatteryID/tree/develop/examples.
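For completeness, the two error measures used above can be computed from a measured and a simulated voltage sequence as in the following sketch. The function names and the short example sequences are illustrative only and are unrelated to the reported results.

```python
import math


def rmse(measured, simulated):
    """Root-mean-squared error between two equally long sequences."""
    squared = [(m - s) ** 2 for m, s in zip(measured, simulated)]
    return math.sqrt(sum(squared) / len(squared))


def mae(measured, simulated):
    """Mean absolute error between two equally long sequences."""
    absolute = [abs(m - s) for m, s in zip(measured, simulated)]
    return sum(absolute) / len(absolute)


MEASURED_V = [3.70, 3.65, 3.60]
SIMULATED_V = [3.71, 3.65, 3.58]
RMSE_MV = 1000.0 * rmse(MEASURED_V, SIMULATED_V)  # volts to millivolts
MAE_MV = 1000.0 * mae(MEASURED_V, SIMULATED_V)
```

Since the RMSE weights large deviations more heavily than the MAE, reporting both gives an indication of whether the error is dominated by a few large excursions.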
Quality control
Various tests have been written to ensure that the package functions correctly. Specifically, the tests cover (i) the extraction of basis functions from the user-provided basis function strings, (ii) the combination of the user-provided basis functions according to the nonlinearity order, (iii) the formation of resulting signal trajectories corresponding to each candidate model term (i.e., each column of the regression matrix), and (iv) the construction of the regression matrix and the corresponding output vector. The aforementioned tests can be run using the command pytest in the root folder. Note that the tests are run automatically using GitHub Actions whenever new code is pushed to the repository, which helps to ensure that the package remains functional. In addition, the code is linted with pylint to enforce strict Python coding conventions.
An important aspect of the functionality of the package can be given by the resulting model quality, which is generally assessed in terms of the accuracy of model simulation. Notably, the resulting model accuracy does not depend on any specific battery chemistry or form factor; rather, the user needs to provide (i) sufficiently representative basis functions, and (ii) sufficiently informative identification dataset(s), to ensure adequate model accuracy. In this regard, the recommendations for selecting the basis functions and generating current and temperature profiles to obtain the identification datasets are given in the examples folder of the GitHub repository. It may be noted that less representative basis functions, or less informative datasets, may not necessarily break the functionality of the package, though the resulting model quality may be unacceptable.
Several further recommendations are given to ensure proper functionality of the package. First, the EMF function should cover the same or a larger SOC range than the identification and validation datasets. Second, the model order n and nonlinearity order l should not be too high; the recommended range is n ≤ 4 and l ≤ 4, since otherwise the optimisation methods currently provided (e.g., lassocv.sklearn, ridgecv.sklearn) may fail to yield adequate models. Third, the battery temperature may vary spatially throughout the cell geometry during battery operation. Accordingly, the user may fix the temperature sensor at a specific location on the cell, preferably where the temperature variations are significant, to promote consistency in the temperature measurements.
(2) Availability
Operating system
The package can be used on any operating system where Python can be run (GNU/Linux, macOS, Windows).
Programming language
Python 3.12+
Dependencies
PyBatteryID requires the following packages, namely (i) numpy 2.1.0 or higher, (ii) cvxopt 1.3.2 or higher, (iii) scikit-learn 1.5.1 or higher, (iv) rich 13.8.0 or higher, (v) matplotlib 3.9.2 or higher, and (vi) pytest 8.3.2 or higher (for running tests).
List of contributors
Muiz Sheikh
Software location
Name: Zenodo
Persistent identifier: https://doi.org/10.5281/zenodo.15481664
Licence: BSD-3-Clause license
Publisher: Muiz Sheikh
Version published: 3.0.0
Date published: 21/05/25
Code repository
Name: GitHub
Persistent identifier: https://github.com/tue-battery/PyBatteryID
Licence: BSD-3-Clause license
Date published: 18/10/23
Language
English
(3) Reuse potential
The PyBatteryID package can be used to identify sparse and computationally lightweight battery models, and therefore may find its direct application in a practical battery management system. For instance, the battery models may be employed for estimating battery SOC as well as keeping track of the battery health over the battery lifetime. Additionally, the package allows the generation of suitable input current and temperature profiles for obtaining informative identification datasets, which may then be used by researchers to identify battery models with alternative model structures, such as DFN or neural-network-based models; see the examples folder in the GitHub repository for a quick guide on how to generate such profiles. Moreover, the package may be valuable for researchers seeking simpler battery models for the modelling, estimation and control of larger battery systems (battery packs) involving multiple battery cells.
Various features are planned for future releases of the package, namely (i) allowing the user to evaluate the identification datasets using a suitable informativity measure, (ii) providing recommendations for the basis functions depending on the battery behavior and the given model application, (iii) allowing the user to estimate a (temperature-dependent) EMF function using various experiments, such as GITT and low-current cycling, and (iv) developing a graphical user interface (GUI) for a no-code user experience.
Users are encouraged to open GitHub issues to report problems or request new features, and to use GitHub discussions for general support queries. Contributions to the package are also welcome and may be submitted via GitHub pull requests.
Competing Interests
The authors have no competing interests to declare.
