Statistical Characterisation of GNSS Data for a Stationary Receiver using Non-Gaussian Distributions

Abu Bantu; Józef Wiora

doi:10.2478/msr-2025-0037

Introduction

Accurately characterising datasets is essential for effective statistical modelling, particularly when dealing with asymmetric and heavy-tailed data samples. Global Navigation Satellite System (GNSS) latitude data are examples of such samples [1]. GNSS is a satellite constellation that provides signals from space, transmitting positioning and timing data to receivers on Earth [2]. This system includes GPS (USA), GLONASS (Russia), Galileo (Europe), and BeiDou (China) [3]. It is essential for applications such as navigation, surveying, mapping, and scientific research, where precise location data are required. However, GNSS positioning errors arise from multiple sources, including ionospheric and tropospheric delays, satellite clock drift, multipath interference, and receiver noise [4]. To account for these errors, classical statistical models often assume a Gaussian distribution of positioning errors. However, empirical studies have revealed deviations from normality, suggesting that errors exhibit heavy-tailed and skewed characteristics [5]. These deviations challenge conventional error modelling approaches and highlight the need for alternative probability distributions that better capture the statistical properties of positioning errors.

Residual pseudorange errors in single-frequency differential GNSS systems arise from sources such as signal multipath, increased receiver noise due to interference, residual differential troposphere error, and ionospheric gradients [6]. To address these errors, the study proposed modelling the total pseudo-range error distribution by combining the probability density functions of individual error sources, moving beyond classical Gaussian assumptions. As GNSS errors originate from various sources, modelling their total impact requires alternative statistical approaches. The application of non-classical error theory of measurements (NETM) was proposed to better handle these deviations, particularly for large datasets (n > 500) [7]. In urban environments, GNSS positioning performance is affected by outlier measurements, including multipath effects and non-line-of-sight (NLOS) receptions.

These outliers present a significant challenge in GNSS modelling. To address these effects, Wen et al. [8] proposed a graduated non-convexity factor graph optimisation technique, which effectively reduced the impact of outliers and improved positioning accuracy. In addition to outlier mitigation, enhancing state estimation under non-Gaussian noise conditions has also been a key research focus. Raitoharju et al. [9] introduced an algorithm that integrates empirical noise models into Kalman filtering, enabling improved estimation accuracy in scenarios where Gaussian assumptions fail. This approach improved estimation accuracy by incorporating non-Gaussian noise characteristics, making it particularly useful when measurement errors deviate from Gaussian assumptions.

Although methods such as Kalman filtering can improve state estimation, selecting an appropriate statistical distribution remains a critical challenge. Several studies have examined the suitability of alternative distributions, including Cauchy, Student's t, Laplace, lognormal, and skew-normal, for data fitting and modelling applications. For example, Sun-Yong Choi et al. [10] applied twelve different distributions, including Cauchy, Laplace, normal, and Student's t, to model stock index returns, using information criteria and goodness-of-fit tests to identify the best-fit distribution. Alzaatreh et al. [11] proposes a family of generalised Cauchy distributions, denoted as T-Cauchy(Y), using the T-(RY) framework. This family is generated using the quantile functions of uniform, exponential, log-logistic, logistic, extreme value, and Fréchet distributions. It provides a flexible approach for modelling heavy-tailed data, demonstrating greater adaptability than classical Gaussian models. Aryal et al. [12] investigated the Laplace probability distribution and its derivatives, focusing on their applicability in modelling real-world problems. The study highlights the growing interest in constructing flexible parametric classes of probability distributions and examines how the Laplace distribution and its variants have been widely applied in statistical modelling.

Building on these findings, this study systematically evaluates the suitability of non-Gaussian distributions, including the Laplace, skew-normal, skew-t, and generalised hyperbolic (GH) distributions, for modelling GNSS latitude data from a stationary receiver. These models are applied using weighted maximum likelihood estimation (WMLE) to achieve robust parameter estimation and confidence interval (CI) construction.

The novelty of this work lies in developing a unified statistical framework that combines empirical GNSS data with non-Gaussian modelling and WMLE-based estimation to quantify measurement uncertainty more accurately under real-world, heavy-tailed conditions. This approach moves beyond the classical Gaussian assumption, providing a comparative evaluation of flexible asymmetric distributions using multiple goodness-of-fit criteria such as log-likelihood, Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), and root mean squared error (RMSE).

The added value of this study lies in its direct application of metrological principles from the Guide to the Expression of Uncertainty in Measurement (GUM) and the International Vocabulary of Metrology (VIM) to non-Gaussian error modelling. This enhances the statistical reliability and interpretability of GNSS uncertainty assessment, bridging the gap between theoretical distribution fitting and practical measurement evaluation.

To achieve this, we analyse empirical GNSS latitude datasets and employ rigorous statistical methodologies for model evaluation. The rest of this paper is structured as follows: Section 2 describes the methods and materials used in our research. Section 3 details the selected synthetic data distribution; Section 4 presents goodness-of-fit tests to compare the selected models; Section 5 presents the results and discussion; and Section 6 provides the conclusion.

Synthetic data distributions

Laplace distribution

The Laplace distribution is a continuous probability distribution characterised by a sharp peak at the mean and heavier tails than the normal distribution [13]. It can be derived as the difference between two independent exponential distributions: $\begin{matrix} X = E_{1} - E_{2}, & E_{1}, E_{2} \sim exp (λ) \end{matrix} .$ \matrix{{X = {E_1} - {E_2},} & {{E_1},{E_2} \sim \;\exp \;(\lambda)}\cr}.

The probability density function (PDF) of the Laplace distribution is [14] (1) $f (x_{i}; μ_{0}, σ) = \frac{1}{2 σ} e^{- \frac{x - μ|}{σ}},$ f({x_i};{\mu_0},\sigma) = {1 \over {2\sigma}}{e^{- {{\left| {x - \mu} \right|} \over \sigma}}}, where μ₀ is the location parameter and σ is the scale parameter, which must be strictly positive (σ > 0), and xi is the data point. The Laplace distribution has a variance of 2σ² and exhibits exponentially decaying tails, which are heavier than those of the normal distribution but lighter than those of the Cauchy distribution. The Cauchy distribution is a continuous probability distribution characterised by its location parameter (mean) μ₀ and scale parameter σ [15].

Skew-normal distribution

The skew-normal distribution extends the normal distribution by introducing a shape parameter (α) to control skewness. It is defined by three parameters: location (μ₀), which controls the centre of distributions; scale (σ), which determines the spread; and shape (α) [16]. This distribution is constructed by multiplying the normal PDF by a transformation of its cumulative distribution function (CDF). A random variable X follows a skew-normal distribution if X = μ₀ + σZ, where Z follows a standard skew-normal distribution. Thus, the skew-normal PDF is given by (2) $f (x_{j}) = 2 ϕ (x_{j}) Φ (α x_{i}),$ f({x_j}) = 2\phi ({x_j})\Phi (\alpha {x_i}), where the standard normal PDF is (3) $ϕ (x_{i}) = \frac{1}{\sqrt{2 π}} e^{- x_{i}^{2} / 2}$ \phi ({x_i}) = {1 \over {\sqrt {2\pi}}}{e^{- x_i^2/2}} and the standard normal CDF is (4) $Φ (x_{i}) = \int_{- \infty}^{x} ϕ (t) dt = \frac{1}{2} 1 + erf (\frac{x_{i}}{\sqrt{2}})],$ \Phi ({x_i}) = \mathop \smallint \nolimits_{- \infty}^x \phi (t)dt = {1 \over 2}\left[ {1 + {\rm{erf}}\left({{{{x_i}} \over {\sqrt 2}}} \right)} \right], where erf (error function) is defined in [6]. For a general skew-normal distribution with μ₀ and σ, we standardise the variable x_i as (5) $z = \frac{x_{i} - μ_{0}}{σ} .$ z = {{{x_i} - {\mu_0}} \over \sigma}.

Substituting (5) into (2), the general skew-normal PDF is (6) $f (x_{i} | μ_{0}, σ, α) = \frac{2}{σ} ϕ (\frac{x_{i} - μ_{0}}{σ}) Φ (α \frac{x_{i} - μ_{0}}{σ}) .$ f({x_i}|{\mu_0},\sigma,\alpha) = {2 \over \sigma}\phi \left({{{{x_i} - {\mu_0}} \over \sigma}} \right)\Phi \left({\alpha {{{x_i} - {\mu_0}} \over \sigma}} \right).

Generalised hyperbolic distribution

It is a flexible class of continuous probability distributions capable of modelling heavy tails and skewness. It includes many other distributions as special cases, such as the normal, Laplace, hyperbolic, variance gamma, and normal-inverse Gaussian (NIG) distributions [17]. The key distinction between the GH distribution and the standard hyperbolic distribution is the inclusion of the index parameters α, resulting in a five-parameter family (α, λ, β, σ, and μ). The PDF of the GH distribution arises from the expression of X as a normal mean-variance mixture, where the mixing distribution is a generalised inverse Gaussian (GIG) (7) $X | W = w \sim N (μ + β w, w σ^{2}),$ X|W = w \sim {\cal N}\left({\mu+ \beta w,\;w{\sigma^2}} \right), where W is a positive mixing variable distributed as the GIG, and where γ > 0 is another scale-like parameter (8) $W \sim GIG (α, σ^{2}, γ^{2})$ W \sim {\rm{GIG}}(\alpha,{\sigma^2},{\gamma^2}) with the PDF (9) $\begin{matrix} \begin{matrix} f_{W} (w) = \frac{{(γ / σ)}^{α}}{2 K_{α} (σ γ)} w^{α - 1} exp (- \frac{1}{2} (\frac{σ^{2}}{w} + γ^{2} w)), & w > 0 \end{matrix} \\ X \sim GH (α, λ, β, σ, μ), \end{matrix}$ \matrix{{\matrix{{{f_W}\left(w \right) = {{{{(\gamma /\sigma)}^\alpha}} \over {2{K_\alpha}\left({\sigma \gamma} \right)}}{w^{\alpha- 1}}\;\exp \;\left({- {1 \over 2}\left({{{{\sigma^2}} \over w} + {\gamma^2}w} \right)} \right),} & {w > 0}\cr}}\cr{X \sim {\rm{GH}}\left({\alpha,\lambda,\beta,\;\sigma,\mu} \right),}\cr} where the parameters are: α ∈ ℝ (shape), λ > 0 (tail heaviness), the |β| < λ controls skewness, and it satisfies the constraint, α > 0 (scale), and μ ∈ ℝ (location). In the one-dimensional case, the GH distribution has the following PDF (10) $\begin{matrix} f_{X} (x_{i}; α, λ, β, σ, μ) = a (α, λ, β, σ) . {exp}^{β (x_{i} - μ)} \cdot \frac{K_{α - \frac{1}{2}} (λ \sqrt{σ^{2} + {(x_{i} - μ)}^{2}})}{{(\sqrt{σ^{2} + {(x_{i} - μ)}^{2}})}^{(1 / 2 - α)}}, & (x_{i} \in ℝ) \end{matrix}$ \matrix{{{f_X}\left({{x_i};\alpha,\lambda,\beta,\sigma,\mu} \right) = a\left({\alpha,\lambda,\beta,\sigma} \right).{{\exp}^{\beta \left({{x_i} - \mu} \right)}} \cdot {{{K_{\alpha- {1 \over 2}}}\left({\lambda \sqrt {{\sigma^2} + {{({x_i} - \mu)}^2}}} \right)} \over {{{(\sqrt {{\sigma^2} + {{({x_i} - \mu)}^2}})}^{\left({1/2 - \alpha} \right)}}}},} & {({x_i} \in {\mathbb R})}\cr} where a(α,λ,β,σ) is a norming constant, K_α(⋯) is the modified Bessel function of the third kind (see [18]), and a(α,λ,β,σ) is a norming constant given by (11) $a (α, λ, β, σ) = \frac{{(λ^{2} - β^{2})}^{α / 2}}{\sqrt{2 π} λ^{α - 1 / 2} K_{α} (σ \sqrt{λ^{2} - β^{2}})} .$ a\left({\alpha,\lambda,\beta,\sigma} \right) = {{{{({\lambda^2} - {\beta^2})}^{\alpha /2}}} \over {\sqrt {2\pi} {\lambda^{\alpha- 1/2}}{K_\alpha}\left({\sigma \sqrt {{\lambda^2} - {\beta^2}}} \right)}}.

Overall, the GH has the following PDF after substituting (11) into (10) (12) $f_{X} (x_{i}; α, λ, β, σ, μ) = \frac{{(λ^{2} - β^{2})}^{α / 2}}{\sqrt{2 π} λ^{α - 1 / 2} K_{α} (σ \sqrt{λ^{2} - β^{2}})} \cdot \frac{K_{α - \frac{1}{2}} (λ \sqrt{σ^{2} + {(x_{i} - μ)}^{2}})}{{(\sqrt{σ^{2} + {(x_{i} - μ)}^{2}})}^{(1 / 2 - α)}} \cdot e^{β (x_{l} - μ)} .$ {f_X}\left({{x_i};\alpha,\lambda,\beta,\sigma,\mu} \right) = {{{{({\lambda^2} - {\beta^2})}^{\alpha /2}}} \over {\sqrt {2\pi} {\lambda^{\alpha- 1/2}}{K_\alpha}\left({\sigma \sqrt {{\lambda^2} - {\beta^2}}} \right)}} \cdot {{{K_{\alpha- {1 \over 2}}}\left({\lambda \sqrt {{\sigma^2} + {{({x_i} - \mu)}^2}}} \right)} \over {{{(\sqrt {{\sigma^2} + {{({x_i} - \mu)}^2}})}^{\left({1/2 - \alpha} \right)}}}} \cdot {e^{\beta \left({{x_l} - \mu} \right)}}.

Skew-t distribution

It is an extension of the Student's t-distribution that allows for skewness as well as heavy tails. The Student's t-distribution models the sample mean of a normally distributed population when the sample size is small and the standard deviation is unknown [19]. The skew-t distribution is useful for modelling asymmetric data with outliers or heavy-tailed behaviour [20]. Its distribution arises from a normal mean-variance mixture. The random variables Y and X are then defined as follows (13) $Y = β Z| + \frac{1 - β}{2} \cdot Z$ Y = \beta \left| Z \right| + {{1 - \beta} \over 2} \cdot Z and (14) $X = μ + σ \cdot \frac{Y}{\sqrt{V / v}},$ X = \mu+ \sigma\cdot {Y \over {\sqrt {V/v}}}, where: Z ∼ 𝒩 (0, 1) is a standard normal variable, $V \sim χ_{v}^{2}$ V \sim \chi_v^2 is a chi-squared variable with ν degrees of freedom, β ∈ (−1, 1) is the skewness parameter, and μ and σ > 0 are the location and scale parameters. The distribution of X is skewed and heavy-tailed and is called the skew-t distribution. Its overall PDF is (15) $f (x_{j}; μ, σ, α, v) = \frac{2}{σ} \cdot t_{v} (\frac{x_{i} - μ}{σ}) \cdot T_{v + 1} (α \cdot \frac{x_{i} - μ}{σ} \cdot \sqrt{\frac{v + 1}{v + {(\frac{x_{i} - μ}{σ})}^{2}}}) .$ f\left({{x_j};\mu,\sigma,\alpha,v} \right) = {2 \over \sigma} \cdot {t_v}\left({{{{x_i} - \mu} \over \sigma}} \right) \cdot {T_{v + 1}}\left({\alpha\cdot {{{x_i} - \mu} \over \sigma} \cdot \sqrt {{{v + 1} \over {v + {{\left({{{{x_i} - \mu} \over \sigma}} \right)}^2}}}}} \right).

Justification for the choice distributions

The Laplace, skew-normal, skew-t, and GH distributions were selected for their complementary ability to represent key non-Gaussian characteristics commonly observed in GNSS data, namely asymmetry, heavy tails, and excess kurtosis.

The Laplace model serves as a simple reference with moderate tail behaviour, while the skew-normal introduces a shape parameter to accommodate data asymmetry. The skew-t extends this flexibility by modelling both skewness and pronounced tail heaviness. The GH distribution provides the most general framework, encompassing several of these models as special cases and offering superior adaptability for complex GNSS error structures.

Together, these distributions form a progressive hierarchy from simple to highly flexible models, enabling a systematic evaluation of non-Gaussian behaviour in GNSS measurements. Broader generalisations, such as the generalised alpha-skew-t, variance gamma, or NIG distributions, could further enhance modelling accuracy by offering additional control over tail behaviour and asymmetry, and are recommended for future investigation.

Goodness-of-fit tests

A goodness-of-fit test is a statistical method used to determine how well a set of observed data matches a specific theoretical distribution or model [21].

To evaluate the suitability of various statistical models for GNSS data, it is essential to consider distributions capable of capturing asymmetry and heavy tails. Among the candidate models are the Laplace, skew-normal, GH, and skew-t distributions, each offering different levels of flexibility in representing real-world data characteristics. Parameter estimation for these distributions is typically performed using the WMLE method. This is a modified form of maximum likelihood estimation (MLE) that applies weights to reduce bias, particularly for individuals with extreme response patterns [22]. The estimated parameter vector θ^* is defined as (16) $θ^{*} = arg max_{θ} \sum_{i = 1}^{n} w_{j} log L (θ | x_{j}),$ {\theta^*} = \;\arg \mathop {\max}\limits_\theta\sum\limits_{i = 1}^n {{w_j}\log {\cal L}(\theta |{x_j})}, where:

θ is the vector of parameters to be estimated,
x_i is the i-th data point,
w_i is the weight assigned to x_i, satisfying $Σ_{i = 1}^{n} w_{i} = 1$ \Sigma_{i = 1}^n{w_i} = 1 ,
ℒ(θ | x_i) is the likelihood of observing x_i given parameters of θ.

This approach provides robust parameter estimates and enables the construction of 95 % CIs for the fitted models. CIs are determined using the percentile method, which relies on the quantiles of the fitted distribution. The general formula for the CI is given by [23] (17) $CI = F^{- 1} (α, params), F^{- 1} (1 - α, params)],$ {\rm{CI}} = \left[ {{F^{- 1}}(\alpha,\,{\rm{params}}),\,{F^{- 1}}(1 - \alpha,\,{\rm{params}})} \right], where F⁻¹ (α, params) and F⁻¹ (α, params) correspond to the lower and upper quantile functions (percent-point function, PPF) of the fitted distribution. The confidence level is set at 95 %, leading to the following values for α, which represents the significance level used to determine the probability of the CI not capturing the true parameter. In a two-tailed CI, the total significance level α is symmetrically divided between the two tails of the distribution. This means each tail contains a probability of $\frac{α}{2}$ {\alpha\over 2} . $\begin{array}{l} \frac{α}{2} = \frac{1 - confidence level}{2} = 0.025 (for a 95 % CI), \\ 1 - \frac{α}{2} = 0.975 . \end{array}$ \matrix{{{\alpha\over 2} = {{1 - {\rm{confidence}}\,{\rm{level}}} \over 2} = 0.025\,\,\,\,\,({\rm{for}}\,a\,95\,\% \,{\rm{CI}}),} \hfill\cr{1 - {\alpha\over 2} = 0.975.} \hfill\cr}

In this study, the constructed CIs correspond to the expected range of the measured quantity (GNSS latitude) as represented by each fitted distribution. Thus, the intervals indicate the range within which the true value of the measurand is expected to lie with a 95 % confidence level.

To evaluate the adequacy of statistical models in representing GNSS error distributions, various goodness-of-fit metrics are employed. These include the log-likelihood, AIC, BIC, and RMSE. Together, these measures provide a rigorous framework for identifying the most appropriate distributional model.

Log-likelihood function (ℒ)

The log-likelihood function is a tool for evaluating how well a probability distribution fits the observed data. It quantifies the data compatibility with the assumed model by summing the logarithms of the PDF values at each observation. Higher values indicate a better fit.

For n independent observations X = {x₁, x₂, . . . ,x_n} and a PDF f (x | θ), the log-likelihood is defined as [24] (18) $L (θ | X) = \sum_{i = 1}^{n} ln f (x_{i} | θ),$ {\cal L}(\theta |X) = \sum\limits_{i = 1}^n {\ln f({x_i}|\theta)}, where θ represents the estimated parameters of the distribution, n is the number of data points, and f (x_i | θ ) is the PDF evaluated at x_i. Equations (19–23) are obtained by substituting the respective model distribution expressions into (18) to derive the log-likelihood function for each distribution. Starting with the Laplace distribution (1) substituted into (18), the log-likelihood function is (19) $\begin{array}{l} L (μ_{0}, σ) & = \sum_{i = 1}^{n} ln \frac{1}{2 σ} e^{- \frac{x_{i} - μ_{0}|}{σ}}] \\ = - n ln (2 σ) - \sum_{i = 1}^{n} \frac{x_{i} - μ_{0}|}{σ} . \end{array}$ \matrix{{{\cal L}\left({{\mu_0},\sigma} \right)} \hfill & {= \sum\limits_{i = 1}^n {\ln \;\left[ {{1 \over {2\sigma}}{e^{- {{\left| {{x_i} - {\mu_0}} \right|} \over \sigma}}}} \right]}} \hfill\cr{} \hfill & {=- n\ln \;(2\sigma) - \sum\limits_{i = 1}^n {{{\left| {{x_i} - {\mu_0}} \right|} \over \sigma}}.} \hfill\cr}

The log-likelihood function of the skew-normal distribution is (20) $L (μ_{0}, σ, α) = \prod_{i = 1}^{n} f (x_{j} | μ_{0}, σ, α) .$ {\cal L}\left({{\mu_0},\sigma,\alpha} \right) = \prod\limits_{i = 1}^n {f({x_j}|{\mu_0},\sigma,\alpha)}.

After substituting (6) into (18), the log-likelihood function is (21) $L (μ_{0}, σ, α) = \prod_{i = 1}^{n} \frac{2}{σ} ϕ (\frac{x_{i} - μ_{0}}{σ}) Φ (α \frac{x_{i} - μ_{0}}{σ})$ {\cal L}\left({{\mu_0},\sigma,\alpha} \right) = \prod\limits_{i = 1}^n {{2 \over \sigma}\phi \left({{{{x_i} - {\mu_0}} \over \sigma}} \right)\Phi \left({\alpha {{{x_i} - {\mu_0}} \over \sigma}} \right)} taking the logarithm of (21) yields the log-likelihood function (22) $\begin{array}{l} log L (μ_{0}, σ, α) = \sum_{i = 1}^{n} log \frac{2}{σ} ϕ (\frac{x_{i} - μ_{0}}{σ}) Φ (α \frac{x_{i} - μ_{0}}{σ})] \\ = n log 2 - n log σ + \sum_{i = 1}^{n} log ϕ (\frac{x_{i} - μ_{0}}{σ}) + \sum_{i = 1}^{n} log Φ (α \frac{x_{i} - μ_{0}}{σ}) . \end{array}$ \matrix{{\log {\cal L}\left({{\mu_0},\sigma,\alpha} \right) = \sum\limits_{i = 1}^n {\log \left[ {{2 \over \sigma}\phi \left({{{{x_i} - {\mu_0}} \over \sigma}} \right)\Phi \left({\alpha {{{x_i} - {\mu_0}} \over \sigma}} \right)} \right]}} \hfill\cr{\,\,\,\,\,\,\,\,\,\,\, = n\log 2 - n\log \sigma+ \sum\limits_{i = 1}^n {\log \phi \left({{{{x_i} - {\mu_0}} \over \sigma}} \right)} \; + \sum\limits_{i = 1}^n {\log \Phi \left({\alpha {{{x_i} - {\mu_0}} \over \sigma}} \right).}} \hfill\cr}

By substituting (12) into (18), the log-likelihood function of the GH distribution is as follows (23) $\begin{array}{l} log L (α, λ, β, σ, μ) = & \sum_{i = 1}^{n} \frac{α}{2} log (λ^{2} - β^{2}) - (α - \frac{1}{2}) log λ \\ - log (\sqrt{2 π}) - log K_{α} (σ \sqrt{λ^{2} - β^{2}}) \\ + log K_{α - \frac{1}{2}} (λ \sqrt{σ^{2} + {(x_{i} - μ)}^{2}}) \\ - (\frac{1}{2} - α) log (\sqrt{σ^{2} + {(x_{i} - μ)}^{2}}) \\ + β (x_{i} - μ)] . \end{array}$ \matrix{{\log {\cal L}\left({\alpha,\lambda,\beta,\sigma,\mu} \right) =} \hfill & {\sum\limits_{i = 1}^n {\left[ {{\alpha\over 2}\log ({\lambda^2} - {\beta^2}) - \left({\alpha- {1 \over 2}} \right)\log \lambda} \right.}} \hfill\cr{} \hfill & {\,\,\,\,\, - \log (\sqrt {2\pi}) - \log {K_\alpha}(\sigma \sqrt {{\lambda^2} - {\beta^2}})} \hfill\cr{} \hfill & {\,\,\,\,\, + \log {K_{\alpha- {1 \over 2}}}\left({\lambda \sqrt {{\sigma^2} + {{({x_i} - \mu)}^2}}} \right)} \hfill\cr{} \hfill & {\,\,\,\,\, - \left({{1 \over 2} - \alpha} \right)\log \left({\sqrt {{\sigma^2} + {{({x_i} - \mu)}^2}}} \right)} \hfill\cr{} \hfill & {\,\,\left. {\,\, + \beta \left({{x_i} - \mu} \right)} \right].} \hfill\cr}

Substituting the skew-t PDF (15) into (18), the for log-likelihood is (24) $\begin{array}{l} ln L (μ, σ, α, v) = & \sum_{i = 1}^{n} ln (\frac{2}{σ}) + ln t_{v} (\frac{x_{i} - μ}{σ}) + ln T_{v + 1} \\ (α \cdot \frac{x_{i} - μ}{σ} \cdot \sqrt{\frac{v + 1}{v + {(\frac{x_{i} - μ}{σ})}^{2}}})] . \end{array}$ \matrix{{\ln {\cal L}\left({\mu,\sigma,\alpha,v} \right) =} \hfill & {\sum\limits_{i = 1}^n {\left[ {\ln \left({{2 \over \sigma}} \right) + \ln {t_v}\left({{{{x_i} - \mu} \over \sigma}} \right) + \ln {T_{v + 1}}} \right.}} \hfill\cr{} \hfill & {\left. {\left({\alpha\cdot {{{x_i} - \mu} \over \sigma} \cdot \sqrt {{{v + 1} \over {v + {{({{{x_i} - \mu} \over \sigma})}^2}}}}} \right)} \right].} \hfill\cr}

Akaike Information Criterion (AIC)

The AIC is a widely used metric for model selection, balancing goodness-of-fit with model complexity and fit quality. It is defined as [25]: (25) $AIC = - 2 L (θ) + 2 k,$ {\rm{AIC}} =- 2{\cal L}(\theta) + 2k, where k denotes the number of parameters in the model. For the Laplace distribution, which has (k = 2), the AIC is given by AIC = −2ℒ(θ) + 2 · 2, making it directly comparable based on log-likelihood values. Similarly, the skew-normal and skew-t distributions, each with three parameters (k = 3), have AIC values expressed as AIC = −2ℒ(θ) + 2 · 3. The AIC for the GH distribution is calculated as AIC = −2ℒ(θ) + 2 · 5, where 5 represents the number of estimated parameters. A lower AIC indicates a better-fitting model with minimal complexity.

Bayesian Information Complexity (BIC)

The BIC is a model selection tool that imposes a stronger penalty on model complexity than the AIC. The BIC is defined as [26] (26) $BIC = - 2 L (θ) + k ln (n) .$ {\rm{BIC}} =- 2{\cal L}(\theta) + k\ln (n).

The penalty term k ln(n) ensures that the BIC strongly discourages overfitting, especially for large datasets.

The Laplace distribution is characterised by two parameters (k = 2). Based on (25), the corresponding BIC values are calculated using the formula: BIC = −2ℒ(θ) + 2 · ln(n).

Similarly, the skew-normal and skew-t distributions each involve three parameters (k = 3). According to (25), their BIC values are computed as: BIC = −2ℒ(θ) + 3 · ln(n).

The GH distribution involves five parameters (k = 5) and BIC is expressed as: BIC = −2ℒ(θ) + 5 · ln(n). A lower BIC indicates a better model fit while imposing a stronger penalty on model complexity than the AIC, thereby helping to prevent overfitting.

Root Mean Squared Error (RMSE)

RMSE is a widely used metric for quantifying the accuracy of a model or estimator by measuring the average deviation between observed and predicted values. By squaring the differences, RMSE penalises larger errors more heavily, providing a sensitive measure of overall fit [27]. It is particularly useful for comparing multiple models or candidate distributions in terms of their ability to represent the observed data. In our GPS latitude data analysis, we use RMSE to evaluate how closely each fitted probability distribution aligns with the observed measurements, thereby establishing a quantitative basis for comparing model performance. Formally, for a set of n observed values y_i and corresponding fitted values ŷ_i derived from a candidate distribution, RMSE is defined as (27) $RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - \hat{y})}^{2}},$ {\rm{RMSE}} = \sqrt {{1 \over n}\sum\limits_{i = 1}^n {{{({y_i} - \hat y)}^2}}}, where y_i denotes the observed GPS latitude values, (ŷ_i) represents the predicted or sampled values from the fitted distribution, and (y_i − ŷ_i) indicates the residual (error) for each observation. A lower RMSE indicates better model accuracy (small average error), while a higher value suggests larger deviations between the predictions and the actual values.

The RMSE for the Laplace distribution with mean μ₀ and variance 2σ² can be directly computed as (28) $RMSE = \sqrt{2 σ^{2} + {(\hat{y} + μ_{0})}^{2}} .$ {\rm{RMSE}} = \sqrt {2{\sigma^2} + {{(\hat y + {\mu_0})}^2}}.

For the skew-normal distribution with mean $\frac{α σ}{α - 1}$ {{\alpha \sigma} \over {\alpha- 1}} and variance $\frac{α σ^{2}}{{(α - 1)}^{2} (α - 2)}$ {{\alpha {\sigma^2}} \over {{{(\alpha- 1)}^2}\left({\alpha- 2} \right)}} , it is (29) $RMSE = \sqrt{\frac{α σ^{2}}{{(α - 1)}^{2} (α - 2)} + {(\hat{y} - \frac{α σ}{α - 1})}^{2}}, α > 2$ {\rm{RMSE}} = \sqrt {{{\alpha {\sigma^2}} \over {{{(\alpha- 1)}^2}\left({\alpha- 2} \right)}} + {{(\hat y - {{\alpha \sigma} \over {\alpha- 1}})}^2}},\,\,\alpha> 2 and undefined for α ≤ 2.

For the GH distribution with parameters α, λ, β, σ, and μ, the mean and variance are (30) $𝔼 [Y] = μ + σ \cdot \frac{K_{α + 1} (σ \sqrt{λ^{2} - β^{2}})}{K_{α} (σ \sqrt{λ^{2} - β^{2}})} \cdot \frac{β}{\sqrt{λ^{2} - β^{2}}},$ {\mathbb E}[Y] = \mu+ \sigma\cdot {{{K_{\alpha+ 1}}(\sigma \sqrt {{\lambda^2} - {\beta^2}})} \over {{K_\alpha}(\sigma \sqrt {{\lambda^2} - {\beta^2}})}} \cdot {\beta\over {\sqrt {{\lambda^2} - {\beta^2}}}}, where K_α(·) is the modified Bessel function of the third kind. The variance of the GH distribution is given by (31) $Var (Y) = σ^{2} \frac{K_{α + 1} (σ \sqrt{λ^{2} - β^{2}})}{K_{α} (σ \sqrt{λ^{2} - β^{2}})} + \frac{K_{α + 2} (σ \sqrt{λ^{2} - β^{2}})}{K_{α} (σ \sqrt{λ^{2} - β^{2}})} - {(\frac{K_{α + 1} (σ \sqrt{λ^{2} - β^{2}})}{K_{α} (σ \sqrt{λ^{2} - β^{2}})})}^{2}] .$ {\rm{Var}}(Y) = {\sigma^2}\left[ {{{{K_{\alpha+ 1}}(\sigma \sqrt {{\lambda^2} - {\beta^2}})} \over {{K_\alpha}(\sigma \sqrt {{\lambda^2} - {\beta^2}})}} + {{{K_{\alpha+ 2}}(\sigma \sqrt {{\lambda^2} - {\beta^2}})} \over {{K_\alpha}(\sigma \sqrt {{\lambda^2} - {\beta^2}})}} - {{\left({{{{K_{\alpha+ 1}}(\sigma \sqrt {{\lambda^2} - {\beta^2}})} \over {{K_\alpha}(\sigma \sqrt {{\lambda^2} - {\beta^2}})}}} \right)}^2}} \right].

Then, the RMSE is (32) $RMSE = \sqrt{Var (Y) + {(\hat{y} - 𝔼 [Y])}^{2}} .$ {\rm{RMSE}} = \sqrt {{\rm{Var}}(Y) + {{(\hat y - {\mathbb E}[Y])}^2}}.

For a skew-t distribution with location parameter μ, scale σ , skewness λ , and degrees of freedom ν, the mean and variance (for ν > 2 and ν > 4, respectively) are given by (33) $𝔼 [Y] = μ + σ δ \sqrt{\frac{v}{π}} \cdot \frac{Γ (\frac{v - 1}{2})}{Γ (\frac{v}{2})}$ {\mathbb E}[Y] = \mu+ \sigma \delta \sqrt {{v \over \pi}}\cdot {{\Gamma \left({{{v - 1} \over 2}} \right)} \over {\Gamma \left({{v \over 2}} \right)}} (34) $Var (Y) = σ^{2} (\frac{v}{v - 2} - δ^{2} \cdot \frac{v}{π} \cdot {(\frac{Γ (\frac{v - 1}{2})}{Γ (\frac{v}{2})})}^{2}),$ {\rm{Var}}(Y) = {\sigma^2}\left({{v \over {v - 2}} - {\delta^2} \cdot {v \over \pi} \cdot {{\left({{{\Gamma \left({{{v - 1} \over 2}} \right)} \over {\Gamma \left({{v \over 2}} \right)}}} \right)}^2}} \right), where $δ = \frac{λ}{\sqrt{1 + λ^{2}}}$ \delta= {\lambda\over {\sqrt {1 + {\lambda^2}}}} . The RMSE is defined as (35) $RMSE = \sqrt{Var (Y) + {(\hat{y} - 𝔼 [Y])}^{2}} .$ {\rm{RMSE}} = \sqrt {Var(Y) + {{(\hat y - {\mathbb E}[Y])}^2}}.

The presented formulas [(28), (29), (32), and (35)] explicitly define how the RMSE values were computed for each fitted distribution. Each expression corresponds to the RMSE calculation applied to a specific model, quantifying the average squared deviation between the observed GPS latitude data and the values predicted by the fitted distribution. These computed RMSE values are reported in Table 1, allowing direct comparison of the fitting accuracy among the candidate distributions.

Table 1.

Estimated parameters with 95 % CIs and performance metrics for the fitted models applied to GNSS latitude data.

Model	Parameters						CI^*	ℒ	AIC	BIC	RMSE
	μ^*	σ	ν	α	λ	β
	×10⁻⁵	×10⁻⁵	×10³					×10⁶	×10⁶	×10⁶
Laplace	864	20	–	–	–	–	[805; 924]	1.16	−2.33	−2.33	3.5 · 10⁻⁴
Skew-normal	857	21	–	0.50	–	–	[825; 904]	1.13	−2.27	−2.27	2.5 · 10⁻⁴
GH	845	0.06	–	0.05	1.39	0.43	[841; 917]	1.20	−2.40	−2.40	2.8 · 10⁻⁴
Skew-t	861	14	5.0	–	–	–	[825; 897]	1.16	−2.32	−2.32	2.6 · 10⁻⁴

Note: The values marked with an asterisk (μ and CI) are expressed as least significant digits relative to a base latitude of 50.28000° N and scaled by 10⁻⁵.

Materials and methods

To collect GNSS sample data, we deployed an L76k GPS module at a fixed outdoor location. This module, known for its compact design and high positioning accuracy, was connected to a Raspberry Pi, which served as the control system. The Raspberry Pi handled module configuration, data logging, and real-time monitoring, ensuring continuous and reliable data collection throughout the experiment.

During 72 hours of continuous operation, the GPS module recorded 1 303 376 rows of National Marine Electronics Association (NMEA) sentences. These included various sentence types such as:

$GPRMC: Recommended Minimum Specific GNSS data,
$GPGGA: Global Positioning System fix data,
$GPGLL: Geographic Position Latitude / Longitude,
$GPVTG: Track made good and ground speed,
$GPGSA: GNSS DOP and Active Satellites,
$GPGSV: GNSS Satellites in View.

From this dataset, 162 922 $GPRMC sentences were extracted for further analysis. These sentences were selected because they include essential positional and signal quality parameters such as fix quality, satellite count, horizontal dilution of precision (HDOP), and latitude / longitude coordinates. Python scripts were used to efficiently filter, parse, and structure the extracted data into a log file.

The GNSS latitude data distribution was first characterised before fitting multiple non-Gaussian distributions (Laplace, skew-normal, skew-t, and GH). Parameter estimation and CIs for each distribution were determined using WMLE to evaluate measurement reliability and quantify uncertainty. In addition, model performance was assessed using log-likelihood, AIC, BIC, and RMSE.

Results and discussion

Fig. 1 presents the PDF and CDF for the latitude data, compared with four fitted distributions using WMLE: Laplace, skew-normal, GH, and skew-t. For numerical consistency, the latitude values were normalised by subtracting the reference latitude of 50.28000° N, so the values reported in Table 1 correspond to the least significant digits of the measured coordinates. For example, the Laplace model mean value 864 × 10⁻⁵ represents an actual latitude of approximately 50.28864° N, with a 95 % CI of [50.28805°, 50.28924°]. The same interpretation applies to the other models listed in Table 1. In Fig. 1(a), the empirical PDF (blue histogram) exhibits a right-skewed and heavy-tailed structure. The Laplace distribution fits the peak well but fails to capture the tail behaviour, while the skew-normal improves on the asymmetry but still under-represents the tail. In contrast, the skew-t distribution closely matches the empirical density across the entire range, accurately modelling the sharp peak and heavy right tail. The GH distribution exhibits the best alignment with the empirical data, particularly capturing the skewness and heavy tail behaviour, which aligns with the goodness-of-fit metrics in Table 1. Fig. 1(b) shows the empirical CDF along with the fitted CDFs. The Laplace distribution deviates significantly around the central region, while the skew-normal offers better central alignment but lacks precision in the tails. GH again provides the closest match to the empirical CDF, with skew-t exhibiting slightly superior tail conformity. Overall, both PDF and CDF analyses confirm that the GH distribution offers the best fit for the observed GNSS latitude data, effectively capturing its skewed and heavy-tailed nature.

Table 1 presents the estimated parameters, CIs, and performance metrics for four fitted models applied to GPS latitude data. Among these, the GH distribution demonstrates the best overall performance, with the highest log-likelihood (ℒ = 1.20 × 10⁶) and the lowest AIC and BIC values (−2.40 × 10⁶) for both, indicating an optimal balance between goodness-of-fit and model complexity. It also achieves a low RMSE (2.8 × 10⁻⁴), confirming its accuracy in approximating the observed data. The skew-t model also performs competitively, with a slightly lower RMSE (2.6 × 10⁻⁴) and a similar log-likelihood, although its AIC and BIC values are marginally less favourable than those of the GH. The GH distribution incorporates additional parameters, α, λ , and β, to model heavy tails, kurtosis, and skewness, respectively. Notably, the small value of α = 0.05 suggests mild asymmetry, highlighting the distribution's suitability for capturing the non-Gaussian characteristics of the data. The skew-normal model performs moderately well, capturing skewness through its shape parameter (α = 0.50), but its RMSE (2.5 × 10⁻⁴) and lower likelihood suggest it is less flexible in modelling heavy tails. In contrast, the Laplace model performs worse, with the highest RMSE (3.5 × 10⁻⁴) and the lowest likelihood, failing to capture the data's asymmetry and heavy tails. Overall, the GH distribution provides the most reliable representations of the latitude data based on both fit quality and parameter efficiency. Fig. 1 and Table 1 together evaluate the statistical models' fit to GPS latitude data, supporting both qualitative and quantitative analysis.

This study uses probability–probability (P–P) plots to visually assess the agreement between empirical latitude data and theoretical distributions. These plots compare the CDF of the observed data with that of a specified theoretical model by plotting empirical probabilities against theoretical ones, making them a valuable tool for evaluating goodness-of-fit [28].

Fig. 2 shows the P–P plots for the residual analysis of four fitted distributions: Laplace, skew-normal, GH, and skew-t-against the empirical CDF of the latitude data. The closer the blue curve is to the red dashed line (representing a perfect fit), the better the distribution matches the empirical data. In Fig. 2(c), the GH distribution aligns most closely with the diagonal, indicating the best overall fit and demonstrating its effectiveness in capturing both skewness and kurtosis in the data. The skew-normal and skew-t distributions in Fig. 2(b) and (d) show a generally good fit, though slight deviations at the lower tails and centre suggest minor limitations in capturing the data's asymmetry or kurtosis. In contrast, the Laplace distribution in Fig. 2(a) deviates more noticeably from the diagonal, especially in the lower and upper tails, indicating a poor fit to the empirical data. This is consistent with Laplace's limited ability to capture asymmetry and heavy tails. Overall, the GH distribution demonstrates superior performance in residual behaviour, confirming its suitability for modelling the GPS latitude data compared to the other distributions analysed.

In summary, the P–P plots confirm that the GH distribution provides the best empirical fit, followed by the skew-t and skew-normal models, with Laplace performing the worst. These visual results are consistent with the quantitative metrics from the likelihood and RMSE analyses.

Conclusions

This study critically assessed four probability distributions: Laplace, skew-normal, GH, and skew-t-for modelling skewed and heavy-tailed data using GNSS latitude data as an example. Using the WMLE, we estimated parameters and CIs to quantify uncertainty. We also evaluated model performance using log-likelihood, AIC, BIC, and RMSE.

The analysis showed that, although the Laplace model offers a simple symmetric representation, it performed the weakest overall. Its limited ability to capture heavy tails and asymmetry resulted in the highest RMSE, lowest likelihood, and the widest CIs. The skew-normal model improved fit by capturing asymmetry and producing tighter intervals, yet it struggled with heavy tails.

The skew-t distribution provided a more balanced fit, effectively modelling both skewness and moderate tail behaviour. It performed well across all metrics, with only minor deviations in the lower tails and centre. The GH distribution proved most robust, closely aligning with empirical data and excelling in both statistical and visual evaluations. Its additional shape, skewness, and kurtosis parameters enabled it to capture complex distributional features, outperforming all other models. Both GH and skew-t showed tighter CIs, indicating more reliable parameter estimates.

The novelty of this work lies in establishing a unified statistical framework that integrates WMLE-based parameter estimation with multiple non-Gaussian models to evaluate GNSS measurement uncertainty under real-world, non-Gaussian noise conditions. Unlike previous research, which focused mainly on Gaussian assumptions, this study demonstrates that heavy-tailed and asymmetric models, particularly GH and skew-t, significantly improve distributional fit and the accuracy of uncertainty quantification.

These results highlight the importance of flexible, asymmetric models for accurate GNSS error characterisation. While RMSE quantifies overall deviation, it overlooks distributional nuances evident in models such as skew-normal, which achieve low RMSE but misrepresent tail structure. In contrast, GH and skew-t models deliver superior performance across all criteria. AIC and BIC further support their suitability by balancing fit and complexity, ensuring robustness and generalisability to new datasets.

This work has practical implications for GNSS-based positioning, calibration, and metrological applications, particularly in environments where impulsive or non-Gaussian noise dominates. Future work will extend this framework to dynamic GNSS scenarios and explore its integration with advanced filtering and real-time uncertainty estimation techniques.

Statistical Characterisation of GNSS Data for a Stationary Receiver using Non-Gaussian Distributions

Full Article

Paradigm

My account