
Integrative model-based assessment of heavy metal contamination and source apportionment in groundwater of West Bengal, India

Open Access | Oct 2025


I.
Introduction

Globally, over 1.5 billion people rely entirely on groundwater as their primary source of drinking water [1]. Groundwater fulfills approximately 22% of industrial use, 69% of agricultural needs, and around 8% of domestic requirements [2]. However, with growing population pressure, insufficient soil conservation measures, rapid urban growth, intensified agriculture, and expanding industrial and mining activities, the demand for water has escalated significantly. Simultaneously, pollutants from human activities, including chemical fertilizers, pesticides, and industrial and household waste, often percolate through soil layers and bedrock, ultimately contaminating the underlying groundwater [3]. Heavy metals (HMs) and metalloids are recognized as major pollutants due to their high toxicity to living organisms, particularly when present in their dissolved ionic forms in water [4]. Even a highly toxic HM ion slightly exceeding its acceptable limit can pose a higher risk to water safety than a less toxic HM present at much higher levels. Some of the key pollutants often found in groundwater are arsenic (As), manganese (Mn), lead (Pb), chromium (Cr), iron (Fe), nickel (Ni), and copper (Cu), as highlighted by recent research [5,6].

The Bengal Delta Plain, which spans Bangladesh and West Bengal, India, has garnered global scientific attention due to the presence of As in the groundwater at concentrations far exceeding the WHO’s acceptable limit (10 μg ⋅ L−1) [7]. In West Bengal, India, groundwater often contains a mix of various dissolved HMs at different levels, which indicates that people can be unknowingly exposed to several harmful substances (Pb, Ni, and Cr, along with As) at a time [8]. When people are exposed to multiple metals simultaneously, the combined effects can differ from the impact of each metal individually; the metals may either enhance (synergistic) or counteract (antagonistic) each other. For instance, selenium (Se) has been shown to reduce As toxicity [9], while Ni can work synergistically with As to increase the risk of lung cancer [10]. Hence, the presence of multiple HMs may lead to several health risks for the local population, making it essential to assess their distribution and identify their sources for effective monitoring and management. To determine the probable sources of HMs and understand their distribution patterns, various modeling and machine learning approaches, such as positive matrix factorization (PMF), geographic information system (GIS) mapping, and self-organizing maps (SOMs), have been employed. PMF is widely acknowledged and recommended by the USEPA [11] as an effective technique for identifying pollution sources. However, its application in water source analysis has been relatively limited, with only a few researchers employing it [12]. The effectiveness of the PMF model largely depends on the researchers’ interpretation of background information from the study area, which can restrict its precision and applicability [13]. To enhance the reliability of source identification, PMF was integrated with Pearson correlation analysis in this study. Similarly, GIS techniques facilitate the analysis of HM mobility patterns [14]. According to Burgos et al. [15], GIS also helps characterize the spatial distribution of HMs by applying the principles of regionalized variability. Additionally, SOMs, a type of unsupervised learning algorithm, are effective in addressing complex non-linear problems [16]. Due to this capability, SOM has been extensively utilized as a reliable method for identifying and interpreting HM distribution patterns.

Although several studies have examined the presence of various HMs in drinking water across West Bengal, most have focused on HM contamination in the groundwater of Greater Kolkata, and information on the distribution of different HMs in other regions of West Bengal remains scarce. The primary goals of this study are: (a) to assess the concentrations of various toxic HMs and the quality of groundwater in comparison to World Health Organization (WHO) standards, (b) to determine natural/anthropogenic sources and source allocation of HMs using the PMF model, and (c) to analyze the distribution pattern of HMs in the study area through GIS and SOM models. Through its integrated multi-model approach, this study offers valuable insights and aids in pinpointing major pollution sources, enabling strategic interventions and effective management of water pollution.

II.
Materials and Methods
a.
Study area and sample collection

South 24 Parganas, a district in West Bengal, India, is known for having groundwater HM levels, mainly As, exceeding the WHO-recommended limit of 10 μg ⋅ L−1 [17]. The selected study sites, Baruipur (site 1: 22°19′ N, 88°29′ E) and Bhangar (site 2: 22°31′ N, 88°34′ E), are adjacent blocks of the district (Figure 1). To ensure a representative dataset, a total of 32 groundwater samples were randomly collected from tube wells located in residential areas. These tube wells are commonly used by the local population for daily household activities, such as cooking, washing vegetables, bathing, and cleaning. Prior to sampling, each hand-operated tube well was pumped multiple times to flush out stagnant water, ensuring that fresh groundwater was collected. Samples were gathered in pre-cleaned, acid-washed polyethylene (PE) bottles. To prevent the precipitation of dissolved iron and maintain HMs in solution, the water samples were immediately acidified on-site using concentrated nitric acid (HNO3) to bring the pH below 1. All samples were stored at 4°C until further analysis.

Figure 1:

Sample location map of the two sites (Baruipur and Bhangar) in West Bengal, India.

b.
Evaluation of physicochemical parameters and HM concentration

The physicochemical parameters, pH and electrical conductivity (EC), were measured using pH meter-335 (Systronics, Ahmedabad, India) and EC meter-307 (Systronics), respectively [18]. The HMs (Ni, Pb, Fe, Cu, Cr, and Mn) in the studied water samples were measured using an atomic absorption spectrophotometer (AAS 816, Systronics). As levels were determined using an atomic absorption spectrophotometer combined with a hydride generator system. To ensure quality control, a certified reference material (SRM 2710, National Institute of Standards and Technology (NIST)) was utilized. The relative standard deviation (RSD) was measured to determine the precision of the method. The RSD values for As, Mn, Cr, Pb, Ni, Cu, and Fe were 4.3%, 2.1%, 3.3%, 2.7%, 3.8%, 5.2%, and 4.6%, respectively.

c.
Source and pattern distribution of HMs employing different models

PMF was employed as a comprehensive model for source apportionment [19]. The model utilized two key input files: the concentrations of the evaluated parameters (HM concentration) and the corresponding uncertainty values calculated through specific equations [20]. The analysis was conducted using EPA-PMF version 5.0 (United States Environmental Protection Agency (USEPA)), with the optimal number of factors determined by minimizing the Q value, which represents the model’s best fit to the data [21].

(i) $$Y_{ab} = \sum_{r=1}^{q} g_{ar} f_{rb} + e_{ab}$$

(ii) $$S = \sum_{a=1}^{I} \sum_{b=1}^{m} \left( \frac{Y_{ab} - \sum_{r=1}^{q} g_{ar} f_{rb}}{u_{ab}} \right)^{2}$$

(iii) $$\text{For } c \le \mathrm{MDL}: \quad u_{ab} = \frac{5}{6}\,\mathrm{MDL}$$

(iv) $$\text{Else}: \quad u_{ab} = \sqrt{\left( \text{Error fraction} \times c \right)^{2} + \left( \frac{\mathrm{MDL}}{2} \right)^{2}}$$

Here, a, b, and r index the samples, metals, and sources, respectively, and I, m, and q are the corresponding totals; Yab = concentration of metal b in sample a (μg ⋅ L−1); gar = contribution of source r to sample a (μg ⋅ L−1); frb = concentration of metal b in source r; eab = residual portion; S = the objective function, reported by the model as the Q value; uab = uncertainty of metal b in sample a; c = measured concentration; error fraction = measurement uncertainty expressed as a percentage; and Method Detection Limit (MDL) = the species-specific detection limit of the method [12].
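As a rough illustration of Equations (ii) to (iv), the following Python sketch computes a per-measurement uncertainty and the uncertainty-scaled sum of squared residuals. It is not the EPA-PMF implementation; the function names and all numeric inputs are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def uncertainty(c, mdl, error_fraction):
    """Equations (iii) and (iv): uncertainty u_ab for one measurement."""
    if c <= mdl:
        return 5.0 / 6.0 * mdl
    return math.sqrt((error_fraction * c) ** 2 + (mdl / 2.0) ** 2)


def objective(Y, G, F, U):
    """Equation (ii): sum of squared, uncertainty-scaled residuals.

    Y[a][b]: measured concentration of metal b in sample a
    G[a][r]: contribution of source r to sample a
    F[r][b]: concentration of metal b in source r
    U[a][b]: uncertainty of metal b in sample a
    """
    total = 0.0
    for a, row in enumerate(Y):
        for b, y_ab in enumerate(row):
            modelled = sum(G[a][r] * F[r][b] for r in range(len(F)))
            total += ((y_ab - modelled) / U[a][b]) ** 2
    return total


# Hypothetical values for illustration only (not data from the study).
u_low = uncertainty(c=0.5, mdl=1.0, error_fraction=0.10)    # below the MDL
u_high = uncertainty(c=30.0, mdl=1.0, error_fraction=0.10)  # above the MDL
```

In the first call the concentration is below the detection limit, so the uncertainty depends on the MDL alone; in the second it is dominated by the error-fraction term.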

GIS was employed to map the distribution of HM contamination in groundwater using the inverse distance weighting (IDW) interpolation technique in ArcGIS version 10.4 (Environmental Systems Research Institute, ESRI) [14].
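The idea behind IDW can be sketched in a few lines of Python: the estimate at an unsampled location is a weighted average of nearby measurements, with weights that fall off with distance. This is a minimal sketch, not the ArcGIS tool; the function name, the planar coordinates, the well values, and the power of 2 are assumptions, since the study does not report its interpolation settings.

```python
import math


def idw(x, y, samples, power=2.0):
    """Estimate a value at (x, y) from (xi, yi, value) samples."""
    num = 0.0
    den = 0.0
    for xi, yi, value in samples:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return value  # exactly on a sampled point
        w = 1.0 / d ** power  # closer samples get larger weights
        num += w * value
        den += w
    return num / den


# Hypothetical well positions (arbitrary planar units) and As values.
wells = [(0.0, 0.0, 10.0), (2.0, 0.0, 30.0)]
midpoint = idw(1.0, 0.0, wells)  # equidistant from both wells
```

At the midpoint both wells carry equal weight, so the estimate is the plain average of the two values; moving toward either well pulls the estimate toward that well's measurement.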

The SOM was utilized to interpret the complex dataset and detect patterns of HMs in water samples [22]. In the component plane, areas with similar color intensities indicate a positive correlation, while differing colors reflect negative correlations. During the self-learning process, neurons are mapped onto a two-dimensional grid to enhance visual interpretation. The number of neurons was determined using a heuristic equation as outlined by Bhuiyan et al. [23]. SOM analysis was carried out using R software version 4.3.0 (R Project for Statistical Computing).

(v) $$p = 5 \times \sqrt{n}$$

Here, “p” is the number of nodes (neurons), and “n” is the number of input data points.
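A short worked example of the heuristic, assuming that n is the number of groundwater samples (32 in this study), is sketched below. The function name is hypothetical, and the study does not state the grid size it finally used, so the result should be read only as what the formula suggests.

```python
import math


def som_nodes(n):
    """Equation (v): heuristic node count p = 5 * sqrt(n)."""
    return 5.0 * math.sqrt(n)


p = som_nodes(32)     # assumes n = 32 samples
p_rounded = round(p)  # a grid needs a whole number of nodes
```

For 32 samples the heuristic gives roughly 28 nodes; in practice the count is then adjusted to fit a rectangular or hexagonal grid.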

d.
Statistical analysis

Statistical analysis was conducted to assess the variability in HM concentrations and water quality parameters. The Shapiro–Wilk test was used to check the normality of the data, while Bartlett’s test was applied to evaluate the homogeneity of variances. After confirming that the data followed a normal distribution and exhibited homogeneity of variances, a t-test was carried out to compare the means of the two sites. Mean separation was done using a 5% significance level (p < 0.05) with a 95% confidence interval. Pearson correlation coefficients were calculated to explore relationships between different HMs and water quality parameters. All statistical procedures were performed using R software version 4.3.0 (R Project for Statistical Computing).
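The study ran these procedures in R. The Python sketch below only illustrates the arithmetic behind two of them, the two-sample t statistic and the Pearson correlation coefficient. The pooled-variance form of the t statistic is an assumption that fits the reported homogeneity of variances; the normality and variance tests are not reproduced, and the sample values are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t(x, y):
    """Student's two-sample t statistic with pooled variance."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(sp2 * (1.0 / nx + 1.0 / ny))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical measurements for illustration only.
site1 = [1.0, 2.0, 3.0]
site2 = [2.0, 4.0, 6.0]
t = pooled_t(site1, site2)
r = pearson_r(site1, site2)
```

The t statistic would then be compared against the t distribution with nx + ny - 2 degrees of freedom to obtain the p-value used for the 5% significance threshold.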

III.
Results and Discussion
a.
Physicochemical properties of groundwater

Physicochemical properties including pH and EC were measured and illustrated in Figure 2. In site 1, the pH levels of the water samples were within the neutral range (7.06–7.27), with an average of 7.15 ± 0.059. In contrast, water from site 2 showed slightly alkaline characteristics, with pH values ranging from 7.51 to 7.83. The alkalinity observed in site 2 is likely due to household practices (such as the use of soaps and detergents) as well as agricultural inputs, which eventually seep into the groundwater and alter its composition. Similarly, the EC levels were measured as 0.89 ± 0.054 in site 1 and 1.20 ± 0.048 in site 2, as shown in Figure 2. The higher EC observed in site 2 could be attributed to the greater concentration of HMs present in the area. These findings are consistent with the observations reported by Shrivastava et al. [24].

Figure 2:

Physicochemical attributes (pH and EC) of groundwater samples from the two sites. EC, electrical conductivity.

b.
Background of HM concentration in the studied location

All groundwater samples collected from the two study sites were tested for HMs (As, Mn, Cr, Ni, Pb, Cu, and Fe), and the results are presented in Figure 3. These values were compared against the WHO drinking water guidelines [25, 26] to assess whether the metal concentrations fall within safe limits. The findings highlight arsenic (As) as a major concern, particularly in site 2 (Bhangar: 34.31 ± 9.07 μg ⋅ L−1), where all tube well samples exceeded the WHO-recommended limit of 10 μg ⋅ L−1. In contrast, arsenic levels in site 1 (Baruipur: 4.64 ± 0.89 μg ⋅ L−1) remained within the permissible range. Long-term exposure to arsenic-contaminated water continues to be one of the most pressing public health issues in these communities [27]. Villagers commonly use water from these tube wells for everyday purposes, such as drinking, cooking, and washing, as it is generally believed to be safer [28]. Therefore, it is important to test the water before considering it a safe source for consumption, especially in terms of As levels. Despite the known risks, skin lesions are still commonly observed in the study area. From our field observations, it was evident that people continue to use tube well water for daily tasks, such as cooking and washing. In fact, many still occasionally drink this water, particularly farmers as they work in the agricultural fields. This highlights the need to explore the presence and impact of other HMs in the groundwater, even when arsenic levels appear within safe limits, to better understand and reduce potential health risks for the local population.

Figure 3:

HM concentrations of groundwater samples from the two sites. HM, heavy metal.

Unlike As, Fe is an essential nutrient for the human body, with a recommended daily intake of about 5 mg [29]. In our groundwater analysis, Fe levels ranged from 1,560 μg ⋅ L−1 to 2,750 μg ⋅ L−1 in site 2, with an average concentration of 2,011.87 ± 433.50 μg ⋅ L−1 across all samples. Similarly, in site 1, the average Fe level was 1,800.62 ± 340.34 μg ⋅ L−1 (Figure 3). According to BIS [26], the acceptable limit for iron in drinking water is 300 μg ⋅ L−1. Based on this standard, most of the samples collected from both sites are considered unsuitable for drinking due to excessive iron content. This is an important issue that warrants close attention. High levels of iron in groundwater are quite common, as documented by CGWB [30] and Gautam et al. [31]. While elevated iron levels can affect the taste of water, stain clothes, and cause rusting, iron is generally considered far less harmful than many other HMs. Given these factors, there is a strong argument that Fe should not fall under the non-relaxable category, and a maximum permissible limit should be established. Levels of another HM, Mn, were also found to be high in the groundwater samples, especially at site 2 (Figure 3). Although the WHO has recently discontinued its health-based guideline value for Mn, which was previously set at 400 μg ⋅ L−1 [26], the Mn concentrations in this study generally fell within that former limit at both sites. However, a significantly higher level (mean: 136 μg ⋅ L−1) was detected at site 2 than at site 1 (where Mn levels ranged from 57.6 μg ⋅ L−1 to 92.5 μg ⋅ L−1). Mn contamination in water sources is often linked to a combination of natural geological processes, domestic waste, and industrial effluents. Elevated levels of manganese in drinking water can be particularly harmful to human health, with infants being especially at risk due to its potential neurotoxic effects. This metal is commonly present in both surface and groundwater, arising from natural environmental factors as well as human activities, such as mining operations and industrial discharge [29].
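To make the comparison with the limits concrete, the sketch below divides the mean As and Fe concentrations reported above by the limits cited in the text (10 μg ⋅ L−1 for As and 300 μg ⋅ L−1 for Fe). The dictionary layout and function name are illustrative choices, not part of the study's workflow.

```python
# Limits as cited in the text (ug/L): WHO for As, BIS acceptable limit for Fe.
LIMITS = {"As": 10.0, "Fe": 300.0}

# Mean concentrations reported in the text (ug/L).
MEANS = {
    "site 1": {"As": 4.64, "Fe": 1800.62},
    "site 2": {"As": 34.31, "Fe": 2011.87},
}


def exceedance(means, limits):
    """Return mean / limit per site and metal (a ratio above 1 exceeds the limit)."""
    return {
        site: {metal: value / limits[metal] for metal, value in metals.items()}
        for site, metals in means.items()
    }


ratios = exceedance(MEANS, LIMITS)
```

On these reported means, As at site 2 is roughly 3.4 times its limit while site 1 stays below it, and Fe is about 6 to 6.7 times its acceptable limit at both sites.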

The other toxic HMs in groundwater, such as Cr, Cu, Ni, and Pb, were evaluated as per the WHO guidelines and BIS standards and are depicted in Figure 3. The mean values for all these elements in the groundwater of our study area were below the guideline values [25,26]. Site 2 nevertheless showed relatively higher HM content in the groundwater samples (Cr: 2.20 ± 0.28, Cu: 44.55 ± 7.97, Pb: 2.06 ± 0.36, and Ni: 5.1 ± 0.62 μg ⋅ L−1) than site 1 (Cr: 1.23 ± 0.26, Cu: 25.2 ± 7.89, Pb: 1.17 ± 0.24, and Ni: 3.14 ± 0.59 μg ⋅ L−1). Cu and Ni are important HMs, but when their concentrations surpass safe limits, they can become harmful to the ecosystem and human health. Elevated Cu and Ni levels are often linked to discharges from chemical industries, waste from nearby factories, and household waste. High levels of Cu and Ni in water can pose serious health risks, including the potential to cause cancer [29]. Cr is typically present in the environment at very low concentrations. However, when its concentration exceeds the safe limit, it becomes a cause for concern. One possible reason for elevated Cr levels in contaminated water is the use of soaps and detergents containing chromium compounds during activities such as bathing and washing [32]. Pb is highly toxic and tends to accumulate in the kidneys and bones of animals over time. In the context of the study area, its presence could be linked to the use of leaded fuel in vehicles and generators, battery discharge, household wastewater, agricultural runoff, and waste from both humans and animals [33]. Overall, the findings show that As and Fe levels were relatively higher than those of other metals. To effectively manage this pollution, the most critical step is accurately identifying the sources of contamination and understanding their distribution patterns.

c.
HM pattern recognition using different model analyses between two sites
c.i.
Source allocation of HMs by PMF model

The PMF model was utilized to investigate and identify the sources of seven HMs in the study sites. The model was executed 20 times to identify the lowest and most stable Q value, and four factors were selected as optimal [34]. The signal-to-noise (S/N) ratios for all seven selected HMs were above 2, suggesting strong data integrity and validating the model’s reliability. The regression analysis between observed and predicted concentrations, with R2 values greater than 0.9, indicates a strong correlation and confirms the model’s robustness. Figures 4A–4C present the contribution of each HM to the PMF output, along with detailed factor profiles and a correlation matrix. In this study, the PMF analysis revealed that Factor 1 had the highest contribution to As, Cr, Mn, and Ni, accounting for 68.2%, 37%, 29.1%, and 23.4%, respectively. Factor 2 showed strong loadings for Cu (48.1%), As (16.3%), and Fe (23.5%). Factor 3 was predominantly associated with Fe (42.4%) and Ni (18.9%), while Factor 4 showed notable contributions from Pb, Cu, and Ni, with loadings of 48.3%, 26.9%, and 32.9%, respectively (Figures 4A and 4B). Factor 1 accounted for 31.08% of the total variance and was primarily influenced by the presence of As, Cr, Mn, and Ni in the groundwater (Figure 4A). This factor is largely associated with a lithogenic origin, linked to the natural composition of the earth’s crustal material. The factor was mainly influenced by arsenic, which showed a strong positive correlation with other HMs (mainly Cr, Mn, and Ni) (Figure 4C). In the Bengal Delta Basin, As emerges as the most dominant contaminant, followed by Cr, Ni, and Mn [35]. Hence, Factor 1 is likely associated with a natural or geogenic source. Factor 2, primarily comprising Cu, As, Pb, and Fe, accounted for 23.40% of the total variance (Figure 4B). The elevated levels of Cu, Pb, As, and Fe in agricultural soils can be attributed to the widespread use of insecticides, pesticides, herbicides, and fertilizers [36]. Additionally, the rapid application of these agrochemicals to boost crop yields has led to significant HM contamination. Over time, these HMs leach from the soil into the underlying groundwater, contaminating the water sources. Therefore, Factor 2 is likely linked to agricultural activities. Factor 3 contributed 14.67% of the total variance and was mainly influenced by elevated levels of Fe and Ni. These HMs are commonly associated with industrial activities, particularly sources such as waste from the steel industry, mining residues, metal smelting, wastewater discharge, and alloy manufacturing [37,38]. Hence, Factor 3 was identified as being associated with industrial activities. Lastly, Factor 4 accounted for 30.85% of the total variance, with a primary contribution from Pb, followed by Cu and Ni. The primary contributors to Pb pollution include lead-acid batteries, vehicular fuel combustion, and catalytic converters [39]. Additionally, Ni and Cu are largely associated with vehicle emissions, road traffic, tire wear, and industrial smelting processes [39]. Based on this analysis, Factor 4 is likely linked with traffic-related emissions. These findings suggest that the PMF model effectively identified and quantified the sources of HM contamination in groundwater. In both site 1 and site 2, natural processes and traffic-related emissions emerged as the primary contributors to HM pollution.

Figure 4:

Source allocation of HMs in the studied area: (A) factor contribution percentage of each HM, (B) HM factor profile by PMF model, and (C) Pearson correlation coupled with PMF factors to evaluate the relationship among HMs. HMs, heavy metals; PMF, positive matrix factorization.

c.ii.
Spatial distribution of HMs by IDW-GIS model

The geostatistical method using IDW interpolation revealed the spatial distribution patterns of seven HMs in groundwater between two sites and is depicted in Figures 5A and 5B. Based on the concentration data from field sampling locations, the IDW analysis indicated that elevated arsenic levels (ranging from 17.22 μg ⋅ L−1 to 47.09 μg ⋅ L−1) in groundwater were primarily concentrated in the northwestern region of site 2 (Figure 5A). In contrast, site 1 showed arsenic concentrations mainly in its western region. Likewise, for other HMs, such as Pb, Cu, Ni, Mn, and Cr, their concentrations were primarily elevated in the northern and northwestern parts of site 2 (Figure 5B). However, iron (Fe) showed a different spatial pattern, with higher concentrations observed in the southern region (Figure 5A). Similarly, in site 1, the northern and western areas exhibited elevated levels of all the other HMs. Our results align with those reported by Ghosh et al. [14], who used GIS mapping to illustrate the spatial distribution of HMs. This analysis, therefore, serves as a useful tool to understand not only the concentration levels and spread of these HMs but also to infer their possible sources.

Figure 5:

(A) Spatial distribution pattern of HMs (As, Cr, and Mn) of study area through IDW interpolation. Here, site 1: brown to blue shade and site 2: green to light shade.

Figure 5:

(B) Spatial distribution pattern of HMs (Cu, Ni, Pb, and Mn) of study area through IDW interpolation. Here, site 1: brown to blue shade and site 2: green to light shade. HMs, heavy metals; IDW, inverse distance weighting.

c.iii.
Pattern distribution of HMs by SOM

The SOM technique helps uncover important patterns that are often difficult to detect using conventional methods, as illustrated in Figure 6. In this study, SOM was used as a quantitative tool to map the distribution of toxic HMs in groundwater while also aiding in the classification and identification of different pollution sources. To highlight the significance of each variable represented by SOM units, color-ranked plots were created to visually reflect the similarity between samples based on their proximity within hexagonal spaces. The unified distance matrix (U-matrix) is formed by comparing the weight vectors of each neuron with its neighboring units. In the component planes, similar colors across variables suggest a positive correlation, whereas contrasting colors imply a negative relationship [22]. Figure 6 displays the component planes resulting from the SOM analysis, with each variable represented individually. The visualization offers a zone-specific perspective for better understanding the spatial patterns of each parameter. Color gradients were used to create the SOM planes, where deep red indicated areas with the lowest concentrations and lighter shades (toward white) represented areas with higher concentration levels. The concentrations of As, Mn, Ni, and Cr were predominantly higher in the neurons located from the upper right to the lower right corner. Iron (Fe) showed a distinct distribution pattern, with elevated levels observed in neurons spanning from the upper left to the right side. Higher levels of Cu and Pb were concentrated on the lower right. When comparing the two sites, neurons from the upper to lower left quadrant were mainly associated with site 1, whereas those from the lower right to upper right were linked to site 2. Ultimately, the SOM algorithm generated a U-matrix based on sample locations (Figure 6), which revealed that the majority of neurons (10 in total) corresponded to site 2. Based on the SOM results, it can be inferred that As and Fe were the most dominant contaminants among the HMs in the study area, with the majority of HM pollution primarily concentrated in site 2.
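The U-matrix construction described above can be sketched as follows: for each node, average the Euclidean distance between its weight vector and those of its grid neighbors. This is a simplified illustration only. It assumes a rectangular grid with four neighbors per node, whereas the study reports a hexagonal layout produced in R, and the function name and toy weights are hypothetical.

```python
import math


def u_matrix(weights):
    """Mean distance from each node's weight vector to its grid neighbours.

    weights[i][j] is the weight vector of the node at row i, column j.
    """
    rows, cols = len(weights), len(weights[0])
    result = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            dists = []
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols:
                    dists.append(math.dist(weights[i][j], weights[ni][nj]))
            # A single-node map has no neighbours; report 0.0 in that case.
            result[i][j] = sum(dists) / len(dists) if dists else 0.0
    return result


# Hypothetical 1 x 3 map: two similar nodes and one distant node.
toy = [[(0.0, 0.0), (0.0, 1.0), (0.0, 5.0)]]
u = u_matrix(toy)
```

Large U-matrix values mark boundaries between clusters, which is how a map of this kind can separate groups of samples such as the two sampling sites.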

Figure 6:

Distribution pattern of each HM and water quality parameter in the study area through SOM, site-wise HM concentration distribution maps, and U-matrix clustering depicting the two sampling sites. EC, electrical conductivity; HMs, heavy metals; SOM, self-organizing map.

IV.
Conclusion

HMs in groundwater pose a significant threat to public health. This study aimed to assess the concentration, origin, and spatial distribution patterns of HMs in groundwater across the study area, utilizing various modeling approaches. The results highlight that the groundwater in site 1 exhibited a neutral pH range, while site 2 showed slightly alkaline characteristics. As and Fe levels in site 2 exceeded the permissible limits; whereas, the concentrations of other HMs remained within acceptable levels. The combined use of multiple modeling approaches offers a more precise understanding of the potential sources and spatial spread of HMs in the region. The PMF analysis indicated that natural processes and traffic emissions are the primary contributors to HM contamination. GIS-based spatial mapping further highlighted that the northern part of the study area exhibited the highest levels of HM pollution. Additionally, SOM analysis effectively analyzed and confirmed the distribution patterns of these HMs across the region. This research offers insights into HM concentrations, pinpoints potential sources of contamination, and highlights the need for effective remediation strategies to address groundwater pollution on a global scale.

Language: English
Submitted on: Apr 9, 2025
Published on: Oct 4, 2025
Published by: Professor Subhas Chandra Mukhopadhyay
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Soumi Banerjee, Saibal Ghosh, Arijit Dasgupta, Sonali Banerjee, published by Professor Subhas Chandra Mukhopadhyay
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.