Evaluation of height-diameter equations for predicting dominant height using data from Estonian forest research plots

Kaev, Taavi; Padari, Allar; Tarmu, Toomas; Kaimre, Paavo; Kiviste, Andres

Introduction

Modelling of the height-diameter relationship is important for predicting the mean heights of trees of given diameters at breast height growing in stands of specified age, site index, and density (Burkhart & Tomé, 2012). Height-diameter models, also known as height-diameter curves or simply height curves (van Laar & Akça, 2007), are employed in many applications, such as stand-table projection and individual tree growth and yield simulators (Burkhart et al., 1972; Burkhart & Tomé, 2012). The relationship between diameter and height within an even-aged stand is curvilinear, and many non-linear regression equations have been proposed to fit a stand height-diameter curve (Çatal & Carus, 2018; Cui et al., 2022; Curtis, 1967; Huang et al., 2000, 1992; Lebedev, 2020; Mehtätalo et al., 2015; Misik et al., 2016; Padari, 1994; Seki & Sakici, 2022; Sharma, 2010; Soares & Tomé, 2002; Sonmez, 2009; Temesgen et al., 2007; van Laar & Akça, 2007; Vargas-Larreta et al., 2009).

Height curve equations should also meet the following mathematical properties: (A) monotonic increase, (B) presence of an upper height asymptote and (C) presence of an inflection point. The shape of a height-diameter curve should be sigmoidal rather than concave, and the height at zero diameter should equal breast height (1.3 meters) (Lei & Parresol, 2001). Equations that meet these mathematical properties are usually non-linear with respect to their parameters. Non-linear regression requires good starting values for the parameters and may still fail to converge (Mehtätalo & Lappi, 2020). Conversely, linear regression does not require any starting values and never fails to converge. Because the parameters of height-diameter curves are often highly correlated (Sims, 2022), non-linear fitting of empirical tree height-diameter data with equations having more than two parameters often fails to converge (Kangur et al., 2021; Paulo et al., 2011). To estimate the parameters of a non-linear equation using linear regression, the equation is typically transformed into a form that is linear with respect to its parameters. This is done to obtain the best possible starting values for non-linear regression. The parameter estimates obtained through linearization are biased (Nilson, 2002a), but approximate estimates are always obtained, and non-linear regression started from these values is more likely to converge.

In generalized height-diameter models, additional stand-level variables, such as quadratic mean diameter, dominant diameter, average height, dominant height, basal area, and age, are incorporated alongside the measured diameter at breast height. The inclusion of these variables enhances the accuracy of height predictions (Lebedev, 2020; Mehtätalo & Lappi, 2020; Nilson, 2002b; Temesgen & v. Gadow, 2004).

In Estonia, stand height is traditionally defined using mean height, whereas in many other countries, dominant height – calculated as the average height of the 100 largest trees by diameter per hectare – is used as the primary stand height indicator (Hanus et al., 1999; Tarmu et al., 2020; Woollons, 2003). Although several height-diameter equations have been evaluated in Estonia, no significant differences in mean height predictions have been observed (Padari, 1994). However, height-diameter equations specifically designed for predicting dominant height in Estonian forests remain unexplored.
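The definition of dominant height quoted above can be sketched in a few lines of Python. This is an illustration only: the function name, the (diameter, height) tuple layout and the rounding of the number of dominant trees are assumptions made here, and the sketch presumes that a height is available for every tree on the plot, which is not the case in sampled height data.

```python
def dominant_height(trees, plot_area_ha, per_hectare=100):
    """Mean height (m) of the largest-diameter trees on a plot.

    trees: list of (diameter_cm, height_m) pairs, one per tree (assumed layout).
    plot_area_ha: plot area in hectares.
    per_hectare: number of largest trees per hectare that count as dominant.
    """
    # Number of dominant trees for a plot of this size (assumption: at least one).
    n_dom = max(1, round(per_hectare * plot_area_ha))
    # Rank trees by diameter, largest first, and keep the n_dom largest.
    largest = sorted(trees, key=lambda t: t[0], reverse=True)[:n_dom]
    return sum(height for _, height in largest) / len(largest)
```

On a hypothetical 0.03 ha plot, the three largest trees by diameter would determine the dominant height.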

This study aims to evaluate various height-diameter equations for their applicability in predicting dominant height, utilizing individual tree height and diameter measurement data from the Estonian Network of Forest Research Plots (ENFRP).

Material and Methods

Research data

The evaluation of height-diameter models was conducted using empirical data obtained from the Estonian Network of Forest Research Plots (ENFRP). The measurement of the plots in the network started in 1995 (Kiviste et al., 2015; Kiviste & Hordo, 2002). The network primarily represents stands with different age, density and species composition growing on the mineral soils of Estonia (Kiviste & Hordo, 2002). The permanent sample plots of the ENFRP network are located all over Estonia (Figure 1). Each year, repeated measurements are conducted on more than 100 permanent sample plots. Each permanent sample plot is remeasured at 5-year intervals. The plots have recorded tree locations in terms of distance and azimuth from the centre of the plot, tree diameter at breast height and identified tree species, as well as any damage codes and severity if damage was present. In addition to diameter at breast height, the height and crown base height were also measured for every fifth tree and for trees with the largest diameters (Kiviste et al., 2015). ENFRP data were verified to eliminate measurement errors (outliers).

This study used ENFRP data from 1995 to 2021, covering 1,084 permanent plots measured 4,134 times. The dataset includes 157,523 height and diameter measurements, with some trees measured once and others multiple times, resulting in multiple height-diameter pairs per tree. The ENFRP tree measurement data were grouped into 21,335 cohorts, defined by plot number, measurement year, canopy layer, and tree species. Metrics calculated for each cohort included diameter and height counts, quadratic mean diameter, dominant tree measurements, and empirical dominant height. For height-diameter curve analysis, 5,278 cohorts with at least nine height-diameter measurement pairs (127,506 pairs in total) were selected (Table 1).
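The cohort construction described above might look roughly as follows. The record layout and key names are hypothetical (the ENFRP database schema is not given in the text), and the quadratic mean diameter is computed here from all diameters in the cohort, which is an assumption of this sketch.

```python
from collections import defaultdict
from math import sqrt


def build_cohorts(records, min_pairs=9):
    """Group tree records into cohorts and keep those with enough height-diameter pairs.

    records: iterable of dicts with hypothetical keys 'plot', 'year', 'layer',
             'species', 'd' (cm) and 'h' (m, or None when height was not measured).
    """
    groups = defaultdict(list)
    for r in records:
        # A cohort is defined by plot number, measurement year, canopy layer and species.
        groups[(r["plot"], r["year"], r["layer"], r["species"])].append(r)

    cohorts = {}
    for key, trees in groups.items():
        pairs = [(t["d"], t["h"]) for t in trees if t["h"] is not None]
        if len(pairs) < min_pairs:
            continue  # too few height-diameter pairs for curve fitting
        # Quadratic mean diameter: square root of the mean squared diameter.
        qmd = sqrt(sum(t["d"] ** 2 for t in trees) / len(trees))
        cohorts[key] = {"n_d": len(trees), "n_h": len(pairs), "qmd": qmd, "pairs": pairs}
    return cohorts
```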

Table 1.

Distribution of cohorts by canopy layer and tree species (the number of cohorts that include at least one dominant tree height-diameter measurement pair is presented in brackets).

	Upper canopy layer	Secondary canopy layer	Undergrowth	Advanced regeneration
Pinus sylvestris L.	2077 (1968)	10	0	0
Picea abies (L.) H. Karst.	1039 (940)	697	0	47
Betula spp.	993 (706)	58	0	1
Populus tremula L.	157 (144)	2	0	0
Alnus glutinosa (L.) Gaertn.	81 (49)	0	0	0
Alnus incana (L.) Moench	60 (15)	1	0	0
Salix spp.	22 (8)	0	0	0
Fraxinus excelsior L.	5 (5)	5	0	0
Tilia cordata Mill.	2	1	0	0
Acer platanoides L.	2	8	0	1
Quercus robur L.	1 (1)	0	0	0
Sorbus aucuparia L.	0	0	1	0
Corylus avellana L.	0	0	7	0

On average, the cohorts contained 20 height-diameter measurement pairs. Most upper canopy layer cohorts included height-diameter measurement pairs for dominant trees.

Selection of height-diameter equations for mathematical analysis

We acquired a number of potential equations for height-diameter regression from the forestry literature: (A) equations that are linear with respect to the parameters, (B) non-linear equations that can be transformed into a linear form with respect to the parameters, and (C) non-linear equations that cannot be linearized.

The height-diameter equations acquired from the forestry science literature for this study were standardized and are presented in Table 2. Equations whose general form has three parameters were handled by fixing the third parameter c as a constant (column c of Table 3). In such equations, parameters a and b were estimated on the height-diameter dataset, while the value of the third parameter c was taken from the literature or adjusted by trial and error so that the average prediction error of the model was minimized for all trees and for dominant trees.

Table 2.

The standardized height-diameter equations evaluated in the study (h – tree height (m); d – tree diameter at breast height (cm); a, b – estimated parameters; c – parameter fixed as constant).

No	Equation	Linear form	References	Decreasing region	Height at zero diameter	Height asymptote	Diameter at the inflection point
F1	$h = 1.3 + \frac{d^{c}}{(a + b d^{c})}$	$\frac{d^{c}}{(h - 1.3)} = a + b d^{c}$	(Hoßfeld, 1823; Kiviste et al., 2002)	a<=0 \| b<=0	1.3	$1.3 + \frac{1}{b}$	${(\frac{a (c - 1)}{b (c + 1)})}^{\frac{1}{c}}$
F1a	$h = 1.3 + \frac{d^{c}}{(a + b d^{c})}$	$\frac{1}{(h - 1.3)} = b + \frac{a}{d^{c}}$	(Hoßfeld, 1823; Kiviste et al., 2002)	a<=0 \| b<=0	1.3	$1.3 + \frac{1}{b}$	${(\frac{a (c - 1)}{b (c + 1)})}^{\frac{1}{c}}$
F2	$h = 1.3 + {(\frac{d}{(a + b d)})}^{c}$	$\frac{d}{{(h - 1.3)}^{\frac{1}{c}}} = a + b d$	(Näslund, 1936)	a<=0 \| b<=0	1.3	$1.3 + \frac{1}{b^{c}}$	$\frac{a (c - 1)}{2 b}$
F3	$h = 1.3 + a d^{b}$	$\ln (h - 1.3) = \ln a + b \ln d$	(Curtis, 1967)	a<=0 \| b<=0 \| b>1	1.3	–	–
F4	$h = 1.3 + a e^{\frac{b}{d}}$	$\ln (h - 1.3) = \ln a + \frac{b}{d}$	(Curtis, 1967)	a<=0 \| b>=0	1.3	1.3 + a	$\frac{- b}{2}$
F5	$h = 1.3 + a {(\frac{d}{d + c})}^{b}$	$\ln (h - 1.3) = \ln a + b \ln (\frac{d}{d + c})$	(Curtis, 1967)	a<=0 \| b<=0	1.3	1.3 + a	$\frac{c (b - 1)}{2}$
F5a	$h = 1.3 + \frac{a d}{{(1 + d)}^{b}}$	$\ln (\frac{d}{(h - 1.3)}) = a + b \ln (1 + d)$	(Mehtätalo et al., 2015)	a<=0 \| b<=0	1.3	–	–
F6	$h = 1.3 + e^{a + \frac{b}{(d + 1)}}$	$\ln (h - 1.3) = a + \frac{b}{(d + 1)}$	(Wykoff et al., 1982)	b>=0	$1.3 + e^{a + b}$	$1.3 + e^{a}$	$\frac{- b}{2} - 1$
F7	$h = 1.3 + a d e^{- b d}$	$\ln (\frac{(h - 1.3)}{d}) = \ln a - b d$	(Lebedev, 2020)	a<=0 \| b<=0 \| d>1/b	1.3	1.3	$\frac{2}{b}$
F8	$h = 1.3 + a {(\ln (1 + d))}^{b}$	$\ln (h - 1.3) = \ln a + b \ln (\ln (1 + d))$	(Lebedev, 2020)	a<=0 \| b<=0	1.3	–	–
F9	$h = 1.3 + a {(1 - e^{- b d})}^{c}$	–	(Chapman, 1961; Meyer, 1940; Nigul et al., 2021; Richards, 1959)	a<=0 \| b<=0	1.3	1.3 + a	$\frac{\ln c}{b}$
F10	$h = a + b \log d$	–	(Curtis, 1967)	b<=0	–∞	–	–
F11	$h = a + b \frac{1}{d^{2}}$	–	(Curtis, 1967)	a<=0 \| b>=0	–∞	a	–
F12	$h = a + b d$	–	Line	b<=0	a	–	–
F13	$h = a + \frac{b}{d}$	–	Hyperbola	a<=0 \| b>=0	–∞	a	–
F14	$h = a + b \sqrt{d}$	–	Square root	a<=0 \| b<=0	a	–	–

Table 3.

Goodness of fit statistics of the two-parameter height-diameter curves used in the study. The meanings of the column headings are presented in Table 4.

No	Noc	c	Nfail	Ndecr	Nlow	Nhd	RMSE	MAD	ME	ME5%	MEdom	MEDdom	a	b
F1	1	1.3	0	68	0	127034	1.406	1.202	-0.0043	0.0034	-0.0805	-0.0796	0.7829	0.0365
F1	2	1.5	1	33	0	127034	1.401	1.198	-0.0013	0.0070	-0.0312	-0.0261	1.1954	0.0393
F1	3	1.5117	1	33	0	127023	1.401	1.198	-0.0011	0.0072	-0.0282	-0.0231	1.2262	0.0395
F1	4	1.58	1	30	0	127034	1.401	1.198	-0.0001	0.0085	-0.0108	-0.0033	1.4226	0.0404
F1	5	1.5896	0	29	0	127034	1.401	1.197	0.0000	0.0086	-0.0084	-0.0013	1.4519	0.0405
F1	6	1.59	0	29	0	127034	1.400	1.197	0.0000	0.0086	-0.0083	-0.0013	1.4532	0.0405
F1	7	1.6	0	29	0	127034	1.400	1.197	0.0002	0.0088	-0.0057	0.0008	1.4855	0.0407
F1	8	1.6118	0	29	0	127034	1.400	1.197	0.0004	0.0090	-0.0027	0.0035	1.5250	0.0408
F1	9	1.6655	0	28	0	127034	1.400	1.197	0.0012	0.0100	0.0112	0.0187	1.7148	0.0414
F1	10	1.7	0	28	0	127034	1.400	1.197	0.0017	0.0107	0.0202	0.0277	1.8505	0.0417
F1	11	1.7128	0	28	0	127034	1.400	1.197	0.0019	0.0109	0.0235	0.0318	1.9038	0.0419
F1	12	2	0	22	0	127034	1.405	1.200	0.0064	0.0161	0.0998	0.1067	3.5648	0.0444
F1a	13	2	0	22	0	127034	1.405	1.200	0.0064	0.0161	0.0998	0.1067	3.5648	0.0444
F1	14	3	1	27	6	127024	1.460	1.260	0.0220	0.0313	0.3722	0.3642	35.725	0.0496
F2	15	1	1	304	0	127025	1.419	1.221	-0.0084	-0.0013	-0.1499	-0.1605	0.4339	0.0308
F2	16	2	0	22	0	127034	1.405	1.202	-0.0019	0.0062	-0.1022	-0.1036	0.8874	0.1844
F2	17	3	0	21	0	127034	1.404	1.201	0.0000	0.0086	-0.0838	-0.0808	0.9421	0.3274
F2	18	4	0	21	0	127034	1.403	1.199	0.0009	0.0097	-0.0742	-0.0685	0.8905	0.4343
F2	19	5	1	21	0	127025	1.403	1.200	0.0015	0.0104	-0.0682	-0.0610	0.8180	0.5140
F2	20	10	4	21	0	126925	1.404	1.203	0.0025	0.0115	-0.0560	-0.0458	0.5417	0.7183
F2	21	20	7	19	0	126909	1.405	1.205	0.0029	0.0121	-0.0498	-0.0386	0.3109	0.8479
F2	22	200	91	14	0	125759	1.404	1.204	0.0034	0.0124	-0.0450	-0.0332	0.0354	0.9837
F2	23	1000	1030	2	0	108218	1.389	1.192	0.0040	0.0132	-0.0458	-0.0333	0.0073	0.9967
F2	24	2000	2055	0	0	85463	1.372	1.178	0.0049	0.0140	-0.0517	-0.0208	0.0036	0.9984
F3	25	–	4	310	0	126977	1.448	1.251	-0.0062	0.0022	-0.2843	-0.2888	4.8087	0.4344
F4	26	–	8	21	0	126839	1.406	1.206	0.0034	0.0126	-0.0434	-0.0319	26.941	-7.1578
F5	27	1	3	21	0	126849	1.405	1.204	0.0022	0.0113	-0.0595	-0.0519	27.433	7.6564
F5	28	1	1	21	57	127025	1.407	1.206	0.0029	0.0119	-0.0477	-0.0362	28.308	7.0243
F3	29	–	2	125	8	127007	1.443	1.243	-0.0044	0.0040	-0.2785	-0.2835	5.6670	0.4007
F4	30	–	6	21	100	126864	1.409	1.212	0.0038	0.0130	-0.0311	-0.0194	27.867	-6.5447
F6	31	1	0	21	0	127034	1.404	1.200	0.0012	0.0100	-0.0752	-0.0727	3.3291	-8.1500
F5	32	1.1	3	21	0	126849	1.405	1.204	0.0021	0.0112	-0.0609	-0.0540	27.468	7.0015
F5	33	1.3	2	21	0	126986	1.404	1.201	0.0020	0.0110	-0.0638	-0.0574	27.536	5.9887
F5	34	8	0	20	0	127034	1.407	1.204	-0.0012	0.0069	-0.1285	-0.1388	30.173	1.3628
F5	35	0.5	6	21	0	126758	1.406	1.206	0.0027	0.0119	-0.0518	-0.0411	27.192	14.808
F1	36	1.59	1	21	0	127034	1.400	1.197	0.0000	0.0086	-0.0083	-0.0013	17.143	0.2572
F1	37	1.6	0	22	0	127034	1.400	1.197	0.0002	0.0088	-0.0057	0.0008	17.145	0.2553
F5a	38	1	12	285	0	126822	1.444	1.247	-0.0071	0.0011	-0.2627	-0.2675	5.5728	0.6048
F7	39	–	2	282	0	127013	1.414	1.213	-0.0045	0.0024	0.0617	0.0654	1.7051	0.0293
F8	40	–	3	20	0	126997	1.426	1.230	-0.0040	0.0040	-0.2284	-0.2337	4.0636	1.3151
F9	41	1	327	187	0	121663	1.417	1.220	-0.0075	0.0004	-0.0475	-0.0516	23.967	0.0839
F9	42	0.80312	856	23	0	109582	1.410	1.216	-0.0062	0.0033	-0.0946	-0.1033	24.840	0.0669
F9	43	1.18444	242	66	0	123709	1.410	1.210	-0.0049	0.0028	-0.0023	-0.0064	23.109	0.0988
F9	44	1.19374	230	65	0	123885	1.409	1.210	-0.0048	0.0029	-0.0001	-0.0039	23.090	0.0993
F9	45	1.19	233	67	0	123846	1.409	1.210	-0.0048	0.0029	-0.0010	-0.0050	23.102	0.0991
F10	46	–	0	3262	55	127034	1.420	1.221	0.0000	0.0080	-0.1941	-0.1922	-1.5402	7.1452
F11	47	–	0	21	226	127034	1.590	1.340	0.0000	-0.0015	0.4462	0.4068	21.162	-752.43
F12	48	–	0	136	0	127034	1.493	1.289	0.0000	0.0118	-0.4066	-0.4012	10.427	0.4127
F13	49	–	0	21	172	127034	1.466	1.249	0.0000	0.0046	0.1151	0.1165	25.051	-108.27
F14	50	–	0	1585	11	127034	1.443	1.245	0.0000	0.0090	-0.3161	-0.3112	3.2249	3.4302

Table 4.

The meanings of the column headings of Table 3.

	Meaning
No	Height-diameter equation number (as presented in Table 2)
Noc	Height-diameter curve number
Nfail	Number of cohorts for which the non-linear regression did not converge.
Ndecr	Number of cohorts for which the estimates for parameters a and b resulted in the height-diameter curve decreasing.
Nlow	Number of height-diameter measurement pairs for which the model predicted a height lower than 1.3 m.

The following characteristics were calculated from the trimmed dataset, from which cohorts with extreme distributions were excluded:
Nhd	The number of height-diameter measurement pairs at the individual tree level.
RMSE	The root mean square error of height prediction.
MAD	The median absolute error of height prediction.
ME	The mean error of height prediction.
ME5%	The trimmed mean of the height prediction error (excluding the largest and smallest 5% of the data).
MEdom	The mean height prediction error based on dominant tree measurement data.
MEDdom	The median value of height prediction error based on dominant tree measurement data.
a	The median values of parameter a, estimated for cohorts with non-linear regression
b	The median values of parameter b, estimated for cohorts with non-linear regression
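As an illustration of how the error statistics in Table 4 can be computed, a minimal Python sketch is given below. It assumes that the prediction error is defined as observed minus predicted height (consistent with a positive ME indicating underestimation), that RMSE is the plain root mean square of the errors without a degrees-of-freedom correction, and that the trimmed mean drops the same number of values from each tail. The function name and these details are assumptions of the sketch, not a description of the code used in the study.

```python
from math import sqrt
from statistics import mean, median


def fit_statistics(observed, predicted, trim=0.05):
    """Error statistics for height prediction; error = observed - predicted (m)."""
    errors = sorted(o - p for o, p in zip(observed, predicted))
    k = int(len(errors) * trim)            # number of values dropped from each tail
    trimmed = errors[k:len(errors) - k]    # slice form stays valid when k == 0
    return {
        "RMSE": sqrt(mean(e * e for e in errors)),
        "MAD": median(abs(e) for e in errors),   # median absolute error
        "ME": mean(errors),
        "ME5%": mean(trimmed),
    }
```

The dominant-tree statistics (MEdom, MEDdom) would be obtained in the same way after restricting the input to measurement pairs of dominant trees.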

Table 2 presents the linear forms of the non-linear equations F1–F8 with respect to their parameters a and b, enabling their use in software systems that cannot perform non-linear regression analysis. Additionally, the estimates of parameters a and b obtained through linear regression were used as starting values for non-linear regression analysis to improve convergence and achieve more accurate parameter estimates. Some equations in Table 2 were adapted from forest growth and yield equations. Equation F1 was also applied with the Nilson (2002b) parametrization (height-diameter curve numbers 36 and 37 in Table 3): $h = 1.3 + \frac{a}{1 - b (1 - {(\frac{D}{d})}^{c})}$, where h – tree height (m); d – tree diameter at breast height (cm); D – cohort’s quadratic mean diameter at breast height (cm); a, b and c – model parameters. In this parametrization, the predicted height at d = D is a + 1.3, which corresponds to the mean height of the cohort.
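The Nilson (2002b) form can be rearranged algebraically into equation F1: multiplying the numerator and denominator by $d^{c}$ gives F1 with $a_{F1} = b D^{c} / a$ and $b_{F1} = (1 - b) / a$. The sketch below checks this numerically. The function names are ours, and the value of D in the usage note is hypothetical.

```python
def hossfeld(d, a, b, c):
    """Equation F1: h = 1.3 + d^c / (a + b d^c)."""
    return 1.3 + d ** c / (a + b * d ** c)


def nilson(d, a, b, c, D):
    """Nilson (2002b) form: h = 1.3 + a / (1 - b (1 - (D / d)^c))."""
    return 1.3 + a / (1.0 - b * (1.0 - (D / d) ** c))


def nilson_to_hossfeld(a, b, c, D):
    """Map Nilson parameters (a, b) to the F1 parameters (a, b) for the same c and D."""
    return b * D ** c / a, (1.0 - b) / a
```

For example, with a = 17.145, b = 0.2553, c = 1.6 and a hypothetical D of 20 cm, `nilson(d, ...)` and `hossfeld(d, *nilson_to_hossfeld(...), 1.6)` agree at any diameter, and `nilson(20, ...)` returns 1.3 + a. This equivalence of the two curve families is in line with the identical fit statistics reported for the two parametrizations in Table 3.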

The intrinsically non-linear equation F9 (Table 2) is adapted from the well-known Chapman-Richards forest growth function. It has been applied for modelling height-diameter curves of the Järvselja old-growth forest in Estonia (Kangur et al., 2021), for modelling the height-diameter relationship of Scots pine and birch stands in Finland (Mehtätalo, 2005), and in many other studies on height-diameter curves (Duan et al., 2018; Huang et al., 1992; Sharma & Parton, 2007; Sharma, 2010).

This study also examined some simple smoothing equations with linear parameters (Table 2, F10–F14), which have been widely used, and whose authorship is difficult to trace. The references presented in Table 2 highlight various works where these equations have been previously used and may not refer to the original authors who first introduced the equations.

For each equation used in the study, the following mathematical properties were outlined to assess their suitability for application as a height-diameter curve (Table 2):

Tree height at zero diameter.
Parameter regions where the height curve is decreasing (decreasing region).
The height asymptote as the diameter increases to infinity.
The diameter at the inflection point of the height-diameter curve.
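For equation F1, the four properties listed above follow directly from the closed-form expressions in Table 2. A minimal sketch is given below; the function names are ours, and treating the inflection point as absent when c <= 1 is an assumption based on the Table 2 formula becoming non-positive in that case.

```python
def f1_height(d, a, b, c):
    """Equation F1: h = 1.3 + d^c / (a + b d^c)."""
    return 1.3 + d ** c / (a + b * d ** c)


def f1_properties(a, b, c):
    """Properties of equation F1 as listed in Table 2, for given a, b and c."""
    props = {
        "height_at_zero": 1.3,
        # Table 2 lists a <= 0 or b <= 0 as the decreasing region of F1.
        "decreasing": a <= 0 or b <= 0,
        "asymptote": None,
        "inflection_diameter": None,
    }
    if not props["decreasing"]:
        props["asymptote"] = 1.3 + 1.0 / b
        if c > 1:
            props["inflection_diameter"] = (a * (c - 1) / (b * (c + 1))) ** (1.0 / c)
    return props
```

With the median parameter values reported in Table 3 for c = 1.6 (a = 1.4855, b = 0.0407), this gives an asymptote of roughly 25.9 m and an inflection point at a diameter just under 4 cm. These medians do not describe any single cohort, so the numbers are indicative only.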

Parameter estimation for the height-diameter curves

In this study, we evaluated 50 different two-parameter height-diameter curves based on the equations presented in Table 2 with various fixed values of parameter c (Table 3). Parameters a and b were estimated for all 50 height-diameter curves on the dataset of height-diameter measurement pairs by 5,278 cohorts (Table 1), resulting in a total of 263,900 parameter estimates. For non-linear equations that could be transformed into a linear form with respect to parameters a and b (Table 2, F1–F8), the parameters were initially estimated using linear regression with the linearized equation. These estimates were then refined using non-linear regression analysis, with the starting values obtained from the linear regression. For the non-linear equation F9 and the linear equations F10–F14, parameters a and b were estimated using non-linear regression or linear regression, respectively.
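The two-stage procedure (linear regression on the linearized form, followed by non-linear refinement) can be sketched for equation F1 as follows. The study itself used the R functions lm and nls; the dependency-free Python below is only a stand-in that uses ordinary least squares and a hand-written Gauss-Newton iteration with step halving. The function names, tolerances and stopping rule are assumptions of this sketch.

```python
def linear_start(d, h, c):
    """OLS on the linear form d^c / (h - 1.3) = a + b d^c; returns (a, b)."""
    x = [di ** c for di in d]
    y = [xi / (hi - 1.3) for xi, hi in zip(x, h)]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def sse(d, h, a, b, c):
    """Sum of squared height residuals of equation F1."""
    return sum((hi - 1.3 - di ** c / (a + b * di ** c)) ** 2 for di, hi in zip(d, h))


def gauss_newton(d, h, c, a, b, max_iter=50, tol=1e-10):
    """Refine (a, b) on the original height scale; returns (a, b, converged)."""
    for _ in range(max_iter):
        jaa = jab = jbb = ga = gb = 0.0
        for di, hi in zip(d, h):
            x = di ** c
            den = a + b * x
            r = hi - 1.3 - x / den      # residual on the height scale
            ja = -x / den ** 2          # derivative of the prediction w.r.t. a
            jb = -x * x / den ** 2      # derivative of the prediction w.r.t. b
            jaa += ja * ja
            jab += ja * jb
            jbb += jb * jb
            ga += ja * r
            gb += jb * r
        det = jaa * jbb - jab * jab
        if det <= 0.0:
            return a, b, False          # singular normal equations
        da = (jbb * ga - jab * gb) / det
        db = (jaa * gb - jab * ga) / det
        old, step = sse(d, h, a, b, c), 1.0
        # Halve the step until the residual sum of squares no longer increases.
        while step > 1e-10 and sse(d, h, a + step * da, b + step * db, c) > old:
            step /= 2.0
        if step <= 1e-10:
            return a, b, False          # no improving step found
        a, b = a + step * da, b + step * db
        if old - sse(d, h, a, b, c) <= tol * (old + tol):
            return a, b, True
    return a, b, False
```

A typical call would be `a0, b0 = linear_start(d, h, 1.6)` followed by `gauss_newton(d, h, 1.6, a0, b0)`. The linear fit minimizes errors on the transformed scale, which is why its estimates are biased on the height scale and are used here only as starting values.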

For further evaluation, the parameter estimates obtained using both linear and non-linear regression for each cohort were saved in a combined table. The table also included additional variables for each cohort, such as the quadratic mean diameter of the cohort, the number of measured tree heights, the quadratic mean diameter of the stand, the quadratic mean diameter of the upper canopy layer and the arithmetic mean diameter of dominant trees.

Goodness of fit of height-diameter curves

The primary criterion for evaluating height-diameter curves was the successful convergence of non-linear regression analysis for the data of all cohorts. The number of cohorts where parameter estimation failed to converge is presented in the column Nfail of Table 3. Next, the remaining curves had to provide height predictions above 1.3 meters (h > 1.3) for any diameter (d > 0). The number of height-diameter measurement pairs with a predicted height below 1.3 meters is shown in the column Nlow of Table 3. Curves that did not meet these two criteria were excluded from the subsequent process of selecting the best equation.

The number of cohorts with decreasing height-diameter curves is shown in the column Ndecr of Table 3. The goal was to minimize the number of cohorts producing such curves.
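The two shape checks can be illustrated with equation F13, for which Table 2 lists a <= 0 or b >= 0 as the decreasing region. The sketch below is an assumption about how such screening might be coded; the function names are ours.

```python
def f13(d, a, b):
    """Equation F13 (hyperbola): h = a + b / d."""
    return a + b / d


def f13_is_decreasing(a, b):
    """Decreasing region of F13 as listed in Table 2."""
    return a <= 0 or b >= 0


def count_low_predictions(predict, diameters, floor=1.3):
    """Number of diameters for which the predicted height falls below breast height."""
    return sum(1 for d in diameters if predict(d) < floor)
```

With the median F13 parameters from Table 3 (a = 25.051, b = -108.27) the curve is increasing, but it predicts heights below 1.3 m for diameters of 3 cm and 4 cm, which is the kind of case counted in Nlow.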

The following goodness of fit statistics (columns Nhd, RMSE, MAD, ME, ME5%, MEdom, MEDdom in Table 3) were calculated from a trimmed dataset, from which cohorts that produced a decreasing curve shape for more than 10 of the evaluated height-diameter curves were excluded. This approach partially mitigates the impact of erroneous measurement data on the evaluation of the height-diameter curve’s performance.

The final comparison of height-diameter curves was based on the height prediction statistics of the ENFRP data (Table 3). The possibility of transitioning to the use of dominant height, as adopted in many other countries, was also considered for Estonia. Therefore, the selected height-diameter curve had to predict dominant tree heights accurately in addition to the heights of all trees. We selected 15 height-diameter curves that met the first three criteria and ranked them on the basis of the median errors of dominant height predictions. We also studied the median errors of dominant height predictions by the most important tree species (Scots pine (Pinus sylvestris L.), Norway spruce (Picea abies (L.) H. Karst.) and birch species (Betula spp.)) and by cohort mean diameter groups (D < 15, 15 < D < 21, 21 < D < 27, and D > 27).

The data analysis was performed in the R open-source software environment (R Core Team, 2025). Linear and non-linear regression analyses (Aho, 2014) were performed using the R functions lm and nls, respectively.

Results

Linear equations

The mean height prediction error (ME in Table 3) for the linear equations observed in this study (F10–F14) is close to zero because linear regression analysis ensures that the sum of residuals equals zero. However, the residual standard errors and mean height prediction errors of dominant trees (RMSE and MEdom in Table 3) of the linear equations were considerably higher than those of the nonlinear equations. None of the linear equations met the criteria (see Ndecr and Nlow in Table 3) established in this study for selecting the best height-diameter curve. Therefore, the zero mean prediction error is irrelevant for assessing the suitability of the linear equations as a universal height-diameter curve.
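The zero mean error follows from the least-squares normal equations. As a brief sketch, write any of F10–F14 as $h = a + b\,g(d)$, where $g(d)$ is shorthand introduced here (not notation from Table 2) for $\log d$, $1/d^{2}$, $d$, $1/d$ or $\sqrt{d}$. Setting the derivative of the residual sum of squares with respect to the intercept a to zero gives

$$\frac{\partial}{\partial a} \sum_{i=1}^{n} {(h_{i} - a - b\,g(d_{i}))}^{2} = - 2 \sum_{i=1}^{n} (h_{i} - a - b\,g(d_{i})) = 0 \quad \Rightarrow \quad \sum_{i=1}^{n} (h_{i} - \hat{h}_{i}) = 0$$

where $\hat{h}_{i}$ is the fitted height. The result holds within each fitted cohort and only because the intercept a is estimated freely; it says nothing about how the errors are distributed over the diameter range.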

Non-linear equations

All the different height-diameter curve variations for equations F1–F2 and F4–F6 produced a biologically plausible curve shape. For the height-diameter equation (F1 in Table 2) adapted from the Hossfeld IV growth function, it was observed that higher values of the exponent parameter c caused the model to, on average, slightly underestimate tree height compared to the measured values (ME > 0 in Table 3). Conversely, lower values of parameter c resulted in the model, on average, slightly overestimating tree height (ME < 0 in Table 3). The model’s height predictions were closest to the measured values when parameter c was approximately 1.6 (Table 3).

For height-diameter curves based on Näslund’s equation (F2), the impact of the exponent parameter c was also assessed. While applying higher values of parameter c reduced the mean prediction error for dominant height (MEdom in Table 3) and decreased the number of cohorts with descending height-diameter curves (Ndecr in Table 3), it significantly increased the number of cohorts for which parameter estimation during non-linear regression analysis failed to converge (Nfail in Table 3). Näslund’s equation with an exponent c = 1 (a hyperbolic equation) frequently results in descending height-diameter curves for many cohorts (Ndecr = 304).

The equation F3 is adapted from an allometric relationship which does not have a height asymptote. The residual standard error (RMSE in Table 3) and mean height prediction error of dominant trees (MEdom in Table 3) are considerably larger than those for most height-diameter curves. Considering the above, equation F3 is unsuitable for use as a universal height-diameter curve.

The height-diameter curve with equation F7 decreases when the diameter is larger than 1/b (approximately 33 cm for most cohorts). The height-diameter curve with equation F8 does not have a height asymptote, leading to an overestimation of height predictions for dominant trees (Table 3). Therefore, equations F7 and F8 are unsuitable for further application as universal height-diameter curves.
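The turning point of equation F7 can be verified by differentiating $h = 1.3 + a d e^{- b d}$ with respect to d:

$$\frac{\mathrm{d}h}{\mathrm{d}d} = a e^{- b d} (1 - b d)$$

For a > 0 and b > 0 the derivative is positive for d < 1/b, zero at d = 1/b and negative for d > 1/b. The curve therefore peaks at $h = 1.3 + \frac{a}{b e}$ and then declines towards 1.3 m, the height asymptote listed for F7 in Table 2.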

The intrinsically non-linear equation F9, adapted from the Chapman-Richards growth function, failed to converge for a large number of cohorts (Nfail in Table 3). With smaller values of parameter c, the number of non-converged cohorts (Nfail) was greater than with larger values of c, whereas the number of cohorts with a decreasing height-diameter curve (Ndecr) was smaller. Table 3 shows that the Chapman-Richards function (F9 in Table 2), which is widely used in forest growth and yield studies, did not converge reliably on the height-diameter data of the cohorts.

The best height-diameter equation

Table 3 presents, for the 50 height-diameter curves, compliance with the criteria (convergence, height at zero diameter, and monotonic increase) and the goodness of fit statistics based on data from 5,278 cohorts. A total of 33 cohorts produced unsuitable height-diameter curve shapes for more than 10 of the equations analysed in this study. These cohorts were excluded from the error estimation process to prevent inaccuracies caused by measurement errors. Illogical height values were predicted for 635 height-diameter measurements across 8 height-diameter curves. For the remaining 17 curves, the number of cohorts with a declining height-diameter curve was considered. Based on this criterion, equation F1 with parameter c fixed at 1.3 and equation F12 were excluded, as the number of cohorts with declining height-diameter curves for these equations was more than twice as high as for the remaining equations. Among the remaining 15 height-diameter curves, the median error of the dominant tree height predictions was used for ranking.

Finally, equation F1 with parameter c fixed at 1.6 had the median error of dominant tree height closest to zero across the dataset of all ENFRP cohorts (Table 3). Equation F1 with the Nilson (2002b) parametrization (height-diameter curve 37 in Table 3) gave the same goodness of fit results as height-diameter curve 7. Surprisingly, the widely used Näslund’s equation did not perform best in fitting the height-diameter relationship on ENFRP data and was outperformed in all cases by the Hossfeld IV equation (Figure 2 and Table 3).

Figure 2 shows similar trends of the 15 selected height-diameter curves on dominant height prediction error by tree species and cohort mean diameter classes. The greater the cohort mean diameter, the smaller the systematic error of dominant height prediction.

Also, as the cohort mean diameter increases, the systematic errors of dominant height predictions by the different curves decrease and converge. Evidently, the Hossfeld IV equation (F1 in Table 2) is the best for fitting the height-diameter relationship on ENFRP cohort data; however, the exponent c depends on tree species and stand development status (cohort mean diameter). Figure 3 shows that the Hossfeld IV equation with an exponent of 1.6 is the best for describing the pooled dataset of all tree species. The Hossfeld IV equation with an exponent of 1.5896 showed the best fit for pine data, while an exponent of 1.6118 showed the best fit for spruce and birch data. Therefore, the three-parameter Hossfeld IV equation should perhaps be used as the basis for further elaboration of a generalized mixed-effects height-diameter model.

Discussion

Representations of different equations

Upon closer examination of the equations (Table 2), it became evident that many of them share a similar general formula, with variations in equation structure resulting from different mathematical transformations. For instance, Näslund’s equation (F2), Curtis’ equation (height-diameter curve numbers 27–28 in Table 3), and Nilson’s equation (height-diameter curve numbers 36–37 in Table 3) are transformations of the Hossfeld IV equation (height-diameter curve numbers 1–14 in Table 3). Consequently, the classification of height-diameter equations by the names of different authors is somewhat arbitrary. Similarly, Näslund’s and Hossfeld’s equations with an exponent of 1 (height-diameter curve number 15) produce hyperbolic curves with a similar graphical shape and have analogous formula structures.

Mathematical properties of height-diameter equations

When evaluating equations, it is important to first identify which equations produce an illogical shape for the height-diameter curve. Just as the mathematical properties of tree growth functions (Kiviste et al., 2002) should align with the natural laws governing the relationship between tree height and age, the properties of tree height-diameter curve equations should reflect the natural patterns of the relationship between tree height and diameter. The requirements for height-diameter curve equations are as follows: (A) When the argument (diameter) is zero, the equation’s output (height) should be 1.3 meters; (B) The equation must not be decreasing (in the positive domain of the argument); (C) The equation should have an asymptote (as the diameter increases); (D) The equation must be concave, i.e. the second derivative of the height-diameter curve is negative in the domain d > 0 (this does not necessarily apply for small diameter values); and (E) The equation may also have an inflection point (in the range of small diameters at breast height).

It is evident that when tree diameter at breast height is zero, tree height is 1.3 metres. However, it is questionable whether a height-diameter curve equation applied to large-diameter trees should satisfy this condition. In our opinion, the condition is still important for developing a universal generalized height-diameter curve that would also work for trees with small diameters. The non-linear height-diameter equations F1–F9 are standardized so that the height at zero diameter is 1.3 meters. The same approach can be applied to the linear equations F12 and F14 by setting the parameter a to 1.3. The linear equations F10, F11, and F13 have a value of negative infinity at zero diameter and therefore do not meet the criterion, which is reflected in a negative height prediction error (Table 3).

Concavity is required because, generally, as a tree’s diameter increases, its height growth slows. Convexity, however, would cause the height-curve equation to act contrary to this rule, although the rule may not apply to undergrowth or advanced regeneration. According to Lei & Parresol (2001), a height-diameter curve should be sigmoidal rather than concave. A sigmoidal curve shape assumes that the diameter of young trees grows more than their height. This is illogical considering the competition between trees in the early growth stage. Therefore, we believe that only the deceleration of height growth relative to diameter growth is significant, and this is ensured by a concave curve shape.

The existence of an inflection point is relevant to growth curves when height increment in the early years is still small. Conversely, the necessity of an inflection point in the height-diameter curve is debatable, as tree height generally increases more than tree diameter at small diameter ranges. Paulo et al. (2011) found that the necessity of an inflection point in height curve equations is questionable. Lam & Ducey (2024) found that different inflection point diameter values are impacted by the shape of the curve and vary across different equations, but did not confirm the inflection point as a necessary component of the height-diameter curve. We consider that the presence of an inflection point is not necessary for height-diameter curves, regarding the biological relationships between tree height and diameter at breast height. However, if the inflection point lies outside the domain of a function (i.e. the inflection point diameter is smaller than the smallest tree diameter in the stand), the presence of an inflection point does not significantly impact the equation’s performance either.

The effect of linearization

As noted by Nilson (2002a), parameter estimates obtained by linearizing non-linear equations are biased. However, when comparing the linearizable equations (F1-F8) with the non-linearizable equation (F9), we observed that obtaining starting values through linearization improved the convergence of the non-linear regression, whereas using previously acquired parameter values as starting values still resulted in convergence failure for the data of many cohorts. A possible reason is that linearization allows suitable starting values to be determined through linear regression. Nevertheless, convergence still failed for some cohorts. Various solutions have been proposed for dealing with convergence issues. For example, it is recommended to test different starting values for the parameters and to transition gradually from a simpler model structure to a more complex one (Mehtätalo & Lappi, 2020).
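The two-step procedure (linear regression on the linearized form, then non-linear refinement from those starting values) can be sketched as follows. This is a minimal illustration under stated assumptions, not the fitting code used in the study. It assumes a Hossfeld IV-type curve h = 1.3 + d^c / (a + b d^c) with a fixed exponent c, which linearizes to d^c / (h - 1.3) = a + b d^c, and it uses a plain Gauss-Newton iteration in place of a statistical package. All function names and data are hypothetical, and every tree must be taller than 1.3 m for the transformation to be defined.

```python
BREAST_HEIGHT = 1.3  # meters


def predict(d, a, b, c=1.6):
    """Assumed curve form: h = 1.3 + d^c / (a + b * d^c)."""
    x = d ** c
    return BREAST_HEIGHT + x / (a + b * x)


def sse(diameters, heights, a, b, c=1.6):
    """Sum of squared height residuals on the original (untransformed) scale."""
    return sum((h - predict(d, a, b, c)) ** 2 for d, h in zip(diameters, heights))


def linearized_start(diameters, heights, c=1.6):
    """Ordinary least squares on d^c / (h - 1.3) = a + b * d^c; returns (a, b)."""
    xs = [d ** c for d in diameters]
    ys = [x / (h - BREAST_HEIGHT) for x, h in zip(xs, heights)]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def gauss_newton(diameters, heights, a, b, c=1.6, iterations=50, tol=1e-10):
    """Refine (a, b) by minimizing squared height residuals with Gauss-Newton."""
    xs = [d ** c for d in diameters]
    for _ in range(iterations):
        jaa = jab = jbb = ga = gb = 0.0
        for x, h in zip(xs, heights):
            denom = a + b * x
            resid = h - (BREAST_HEIGHT + x / denom)
            da = -x / denom ** 2      # derivative of the prediction w.r.t. a
            db = -x * x / denom ** 2  # derivative of the prediction w.r.t. b
            jaa += da * da
            jab += da * db
            jbb += db * db
            ga += da * resid
            gb += db * resid
        # Solve the 2x2 normal equations (J'J) * step = J'r directly.
        det = jaa * jbb - jab * jab
        step_a = (jbb * ga - jab * gb) / det
        step_b = (jaa * gb - jab * ga) / det
        a, b = a + step_a, b + step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b


# Synthetic cohort: hypothetical "true" parameters plus fixed perturbations.
diameters = [8.0, 12.0, 16.0, 20.0, 24.0, 28.0, 32.0, 36.0]
noise = [0.3, -0.2, 0.1, -0.3, 0.2, -0.1, 0.25, -0.15]
heights = [predict(d, 2.0, 0.03) + e for d, e in zip(diameters, noise)]

a0, b0 = linearized_start(diameters, heights)
a1, b1 = gauss_newton(diameters, heights, a0, b0)
print("linearized start:", a0, b0, sse(diameters, heights, a0, b0))
print("refined         :", a1, b1, sse(diameters, heights, a1, b1))
```

The linear step minimizes residuals of the transformed variable rather than of height, which is why its estimates are biased; the refinement step then minimizes the height residuals directly, starting from those values.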

Perspectives for practical application

The results of our study serve as the basis for developing a new generalized height-diameter model for Estonian forests, which would be based on the extensive ENFRP dataset. In this study, we did not evaluate the height-diameter equations separately for cohort subsets by tree species. We propose that using the same equation for all tree species is practical. In this approach, differentiation between tree species would be achieved solely through variations in parameter values. Non-linear mixed modelling could be a rational method to accomplish this.

Our study evaluated height-diameter curves in terms of the accuracy of dominant height prediction. Using dominant height as the stand height indicator avoids the challenges and potential misinterpretations associated with using mean height as the representative stand height. The study by Tarmu et al. (2020), based on Estonian data, demonstrated that dominant height is less affected by thinning than mean height. In Estonia, determining stand maturity for harvesting involves not only the mean age of a stand but also a maturity diameter based on the mean diameter at breast height of a stand (Rules of Forest Management, 2007). During thinning, the removal of trees with smaller diameters at breast height increases the mean diameter of a stand, potentially allowing harvesting maturity to be induced artificially. Since dominant height and dominant diameter are based on the dimensions of the largest trees, their use can theoretically eliminate this issue.

In Estonian forestry, there is not yet an agreed-upon definition of dominant height in mixed stands. In the ENFRP dataset, we defined the dominant height of a plot as the average height of the dominant trees on the plot (the 100 thickest trees per hectare, regardless of tree species). The number of dominant trees on a plot is scaled from 100 per hectare in proportion to the plot area. For dominant trees without a measured height, we predicted the height with the height-diameter model calibrated to the corresponding plot cohort (tree species).
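The plot-level definition above can be sketched in a few lines. This is a hypothetical illustration rather than the ENFRP processing code: the function name, the tuple layout, the `predict_height(species, dbh_cm)` callable, and the rounding rule for the number of dominant trees are all assumptions.

```python
def dominant_height(trees, plot_area_ha, predict_height, trees_per_ha=100):
    """Mean height of the thickest trees on a plot, all species pooled.

    trees: list of (species, dbh_cm, height_m) tuples; height_m is None
        when the height of the tree was not measured.
    predict_height: callable (species, dbh_cm) -> height_m, standing in for
        the height-diameter model calibrated to the tree's cohort.
    """
    if not trees:
        raise ValueError("plot has no trees")
    # Scale 100 trees per hectare to the plot area. Rounding to the nearest
    # integer with a minimum of one tree is an assumption of this sketch.
    n_dominant = max(1, round(trees_per_ha * plot_area_ha))
    thickest = sorted(trees, key=lambda t: t[1], reverse=True)[:n_dominant]
    heights = [
        h if h is not None else predict_height(species, dbh)
        for species, dbh, h in thickest
    ]
    return sum(heights) / len(heights)


# Hypothetical 0.02 ha plot, so the two thickest trees are dominant.
plot = [
    ("pine", 30.0, 25.0),
    ("spruce", 28.0, None),  # height not measured, predicted instead
    ("pine", 20.0, 18.0),
    ("birch", 15.0, 16.0),
]
print(dominant_height(plot, 0.02, lambda species, dbh: 23.0))
```

In this example the dominant height is the mean of one measured and one predicted height, which shows how the accuracy of the height-diameter curve for the thickest trees feeds directly into the dominant height estimate.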

Conclusions

Many of the equations previously used as height-diameter curves are biologically implausible due to their mathematical properties. The Estonian Network of Forest Research Plots (ENFRP) dataset is sufficiently large and diverse to evaluate different height-diameter curves.

The estimation of parameters for non-linear height-diameter curves did not converge on the data from some ENFRP cohorts. Linear regression can provide optimal starting values for linearized non-linear height-diameter curves.

When comparing the mean prediction error of different height-diameter curves across all tree data, many curves showed similar estimates. However, in the comparison of prediction errors for dominant height, the differences in curve performance were more distinct.

The Hossfeld IV equation (F1 in Table 3) with an exponent c of 1.6 was the most accurate height-diameter curve for predicting dominant height based on ENFRP data when all tree species were analysed together; however, the exponent c depends on tree species and stand development status (the cohort’s mean diameter). For individual tree species, the most suitable exponent value differed slightly (1.5896 for pine and 1.6118 for spruce and birch data).

The results of this study can be used in developing generalized height-diameter models. When developing such a model, it is essential to examine the dependence of model parameters on various stand variables and to apply non-linear mixed-effects modelling for more precise results.

Supplementary Materials

To illustrate the height-diameter curve evaluation process, a series of scatterplots was generated for each assessed curve to showcase its performance. Each set of scatterplots was saved as a separate graphical file. Some scatterplots depicted the parameter value ranges derived from both linear and non-linear regression analysis, while others highlighted the differences in parameter estimates obtained through these two regression methods. Additionally, the graphs presented various error metrics for predicting both mean height and dominant height. These error metrics were analysed using datasets consisting of all tree measurements as well as datasets stratified by individual tree species. The summarized graphical files for all 50 height-diameter curve variations have been saved separately with unique identifiers in the digital archive DSpace of the Estonian University of Life Sciences (http://hdl.handle.net/10492/9022), where they are freely accessible.
