Introduction
Water is an essential resource for life; however, its availability and quality are increasingly challenged by global issues such as pollution, climate change, and inadequate water governance (UNESCO 2021). Brazil, which contains approximately 12% of the planet’s surface freshwater (Valenti et al. 2021), grapples with significant inequalities in access to this vital resource, and faces serious water management challenges.
It is estimated that around 35 million Brazilians lack access to safe drinking water, and only 46% of the sewage generated in the country undergoes any form of treatment (Instituto Trata Brasil 2023). This precarious situation directly impacts public health, accounting for approximately 60% of hospital admissions in the Unified Health System (SUS) due to waterborne diseases (FUNASA 2010). This water paradox is exacerbated by the unequal distribution of resources: While the North and Center-West regions contain over 80% of the country’s water reserves, they are home to only a small fraction of the population, leaving the remaining regions under constant hydric stress (ANA 2021).
These challenges are particularly acute in biomes experiencing intense anthropogenic pressure, such as the Atlantic Forest. This region, which hosts high biodiversity and a significant portion of Brazil’s population, suffers from uncontrolled urbanization, deforestation, and pollution. Spanning 17 states, it is one of the most heavily impacted environments, particularly along the Atlantic coast, since the country’s colonization (Colombo and Joly 2010; Guedes Pinto and Voivodic 2021; Romanelli et al. 2022). The original forest cover has been drastically reduced, with some studies estimating that only 7–16% of the original forest remains (Martins Neto et al. 2022; Ribeiro et al. 2009), while more recent estimates based on the Atlantic Forest Law suggest that up to 24% remains (MapBiomas 2022). Nevertheless, it continues to be listed among the global biodiversity hotspots that are critically threatened (Myers et al. 2000), and it is extremely vulnerable to the effects of climate change (Aleixo et al. 2010; Javeline et al. 2019).
The 2014–2015 water crisis in southeastern Brazil dramatically illustrated this vulnerability. Strategic reservoirs, such as the Cantareira System, were reduced to just 5% of their capacity, necessitating emergency measures—including water rationing and inter-basin transfers—to secure the water supply for the São Paulo and Campinas metropolitan regions (Escobar 2015; Jacobi et al. 2015). This scenario underscored critical gaps in infrastructure and water resource governance.
In this context, citizen science emerges as an innovative and complementary alternative to state-led environmental monitoring, fostering decentralized data generation and enhancing community engagement in the conservation of natural resources (Heigl et al. 2019; Sauermann et al. 2020). The term citizen science generally refers to projects with scientific objectives that involve the participation of non-specialist volunteers, typically coordinated by scientists or research institutions (Heigl et al. 2019; Vohland et al. 2021). This model has expanded significantly in recent decades, driven by the proliferation of communication tools (such as the internet, social media, and mobile phones), low operational costs, incentives from funding agencies, and its applicability in large-scale studies, including those addressing climate change, biodiversity conservation, invasive species, and water quality (Burgess et al. 2017; Silvertown et al. 2013; Queiroz-Souza et al. 2023).
Citizen science demonstrates significant methodological adaptability, enabling remote participation in data collection, analysis, and interpretation—often at spatial scales that would be unfeasible for individual scientists (Kobori et al. 2016). However, challenges persist regarding the scientific reliability of the data, the social inclusion of participants, and the effective utilization of this information by public managers (Fritz et al. 2022; Wang et al. 2015). To address these barriers, guiding principles for citizen science have been proposed, including the ten principles outlined by ECSA (European Citizen Science Association) (2015), which advocate for scientific excellence, environmental protection, and active public engagement in decision-making processes.
In water quality monitoring, citizen science has demonstrated significant potential to address data gaps and enhance community engagement (Jovanovic et al. 2019). According to San Llorente Capdevila et al. (2020), the success of such initiatives relies on three key factors: citizen characteristics (knowledge, motivation, experience), institutional capacities (organization, funding, motivation), and the quality of their interactions (support structures, communication, and feedback). Recognizing this potential, the United Nations has begun to promote the use of citizen science as a strategic tool to support public policies and contribute to the achievement of the United Nations Sustainable Development Goals (UN SDGs), particularly SDG 6 (clean water and sanitation) and indicator 6.3.2, which measures the proportion of water bodies with good ambient water quality (Quinlivan et al. 2020).
In Brazil, the work of the SOS Mata Atlântica Foundation is particularly noteworthy. This non-governmental organization, established in the 1980s, aims to mobilize society in defense of the Atlantic Forest biome. The foundation is actively involved in protecting areas, restoring forests, and improving water quality. To date, it has planted over 42 million native trees across more than 23,000 hectares in nine states and 550 municipalities (SOS Mata Atlântica 2022). In 1991, the organization launched the “Observando o Tietê” (Watching the Tietê) program, which originated from a social mobilization effort to restore this vital river, resulting in the collection of 1.2 million signatures in support of its cleanup. Despite facing structural limitations, the initiative generated significant public interest in the issue of water quality.
Starting in 2015, the program was expanded and renamed “Observando os Rios” (Watching the Rivers; OoR), establishing itself as a model of participatory citizen science. By 2023, it engaged approximately 2,700 volunteers organized into 250 groups, monitoring 230 rivers across 106 municipalities in 17 Brazilian states (SOS Mata Atlântica 2022). After completing training, participants conduct monthly samplings and evaluate nine physicochemical and organoleptic parameters, generating a water quality index (WQI) based on the National Sanitation Foundation Water Quality Index (NSFWQI) model, originally proposed by Brown et al. (1970) and inspired by Horton’s initial framework developed for the Ohio River Valley Water Sanitation Commission in 1965 (Prado et al. 2010). The index encompasses the following parameters: dissolved oxygen (DO), fecal coliforms, pH, biochemical oxygen demand (BOD), nitrate (NO3), phosphate (PO4), temperature, turbidity, and total solids, each weighted differently in the final WQI score.
By involving ordinary citizens in the collection of environmental data, the OoR program not only broadens the spatial coverage of water monitoring but also enhances public awareness of socio-environmental issues, strengthens participatory governance, and redefines community engagement in scientific endeavors (Bonney et al. 2016). This initiative integrates local knowledge, methodological rigor, and social mobilization to promote the sustainability of Brazil’s water resources. The methodological simplicity of the WQI and its replicable nature empower communities to become knowledge producers, while simultaneously fostering environmental awareness and encouraging accountability in public policy.
However, the reliability of these data in comparison to official monitoring methods necessitates systematic evaluation. In this context, reliability refers to the internal consistency of the measurements collected by volunteers across time and space, while validity refers to the extent to which these measurements correspond to the official standards established by the state monitoring agency. Therefore, this study compares WQI data collected by OoR groups in three mesobasins with records from the São Paulo State Environmental Company (CETESB). It assesses the consistency of the results, the adequacy of the parameters used, and the socio-environmental impacts generated by public engagement. This analysis contributes to broader discussions on the role of citizen science in water governance and its potential institutionalization as a complementary strategy for environmental monitoring and action.
Methodology
Study area
This study analyzes 187 water bodies, including major rivers and their tributaries, across 50 municipalities in three mesobasins located in the state of São Paulo, all within the Atlantic Forest biome: Alto Tietê, Sorocaba/Médio Tietê, and Piracicaba-Capivari-Jundiaí (PCJ). Collectively, these basins encompass an area of 31,782 km² and provide water to approximately 28 million people. They face significant anthropogenic pressures, including urbanization, industrialization, and intensive agriculture (CBH-AT et al. 2020) (Figure 1).

Figure 1
Map of the study mesobasins (Alto Tietê, Sorocaba/Médio Tietê, and Piracicaba-Capivari-Jundiaí [PCJ]), highlighting the main monitored rivers.
Ecologically, these mesobasins contain critical remnants of the Atlantic Forest, which are threatened by deforestation and pollution. From a socioeconomic perspective, they are characterized by industrial, urban, and agricultural activities, high population density, and social inequality, all of which exacerbate the impacts on water resources (Ribeiro et al. 2009).
The basins were selected for their significance in water management and their representativeness within the OoR program. Alto Tietê encompasses an area of 5,775 km² and includes 39 municipalities, among which is the São Paulo Metropolitan Region, home to 21 million inhabitants. The region has 36% native vegetation cover and 14 reservoirs; however, it is also affected by domestic pollution and water crises (Javeline et al. 2019). Sorocaba/Médio Tietê covers an area of 11,829 km² and comprises 37 municipalities with a population of 2 million inhabitants. The region is affected by industrial and agricultural pollution, as well as deficiencies in sanitation (ANA 2021). The PCJ region encompasses an area of 14,178 km², comprises 71 municipalities, and has a forest cover of 13.5%. It is also home to the Cantareira System and is notable for its participatory governance facilitated by river basin committees (Colombo and Joly 2010).
Data collection
Table 1 presents the list of parameters and their respective scoring criteria used in the construction of the OoR WQI.
Table 1
Parameters, units, and scoring criteria used in the Observando os Rios WQI. Each parameter receives a score of 1, 2, or 3; a score of 0 indicates that the parameter was not verified or measured.
| Parameter | Unit | Score 1 | Score 2 | Score 3 | Score 0 |
|---|---|---|---|---|---|
| Turbidity (Tb) | JTU | Tb > 100 | 40 < Tb < 100 | Tb < 40 | Not verified/measured |
| Total coliforms (TC) | TC | Positive | – | Negative | Not verified/measured |
| Dissolved oxygen (DO) | mg L⁻¹ | DO < 4 | 4 < DO < 6 | DO > 6 | Not verified/measured |
| Biochemical oxygen demand (BOD) | mg L⁻¹ | BOD > 8 | 4 < BOD < 8 | BOD < 4 | Not verified/measured |
| Hydrogen potential (pH) | – | (pH < 5) or (pH > 9) | (7 < pH < 9) or (5 < pH < 6) | (6 < pH < 7) | Not verified/measured |
| Nitrate (NO3) | ppm | NO3 > 20 | 5 < NO3 < 20 | NO3 < 5 | Not verified/measured |
| Phosphate (PO4) | ppm | PO4 > 2 | 1 < PO4 < 2 | PO4 < 1 | Not verified/measured |
| Settleable solids (Ss) | mm | Ss > 3 | Ss < 3 | None | Not verified/measured |
| Floating trash (Ft) | – | Frequent | Few | None | Not verified/measured |
| Smell (Sm) | – | Fetid | Weak | None | Not verified/measured |
| Fish (Fh) | – | None | Few | Many | Not verified/measured |
| Red larvae and worms (R/lw) | – | Frequent | Few | None | Not verified/measured |
| Transparent/dark worms (TD/lw) | – | None | Few | Frequent | Not verified/measured |
| Foams (Fs) | – | Frequent | Few | None | Not verified/measured |
OoR WQI data were collected between 2002 and 2023 by trained volunteers at fixed, georeferenced sites, including rivers, streams, lakes, and reservoirs, with varying sampling frequencies. The WQI comprises 14 parameters: 8 analytical (turbidity, total coliforms, DO, BOD, pH, NO3, PO4, and settleable solids) and 6 organoleptic/biological indicators (floating trash, odor, fish, red larvae, transparent/dark larvae, and foams). Analytical parameters were measured with LaMotte kits following the manufacturer’s standardized protocol, as outlined in Supplemental File 1: Table S1.
The additional procedures employed by the program were based on an internally developed methodology, also described in Table S1, and validated in accordance with the criteria established by Rocha et al. (2004). For each sampling, volunteers assigned a score from 1 to 3 to each parameter based on established criteria. A score of 0 was recorded when a parameter was not assessed.
The OoR WQI ranges from 14 to 42 points, while CETESB’s WQI spans 0 to 100. Both indices use five qualitative categories, ranging from Excellent to Terrible, but they differ in their value ranges and color codes. For standardization purposes, we adopted the Viridis color palette, which is designed to be accessible for individuals with visual impairments (Figure 2).

Figure 2
Water Quality Index (WQI) qualification scales – colorimetric and numerical – for Observando os Rios program (OoR) and the São Paulo State Environmental Agency (CETESB), and percentage representation of each qualitative category across the different methods.
Four conversion approaches were implemented to facilitate comparison.
Direct Linear Conversion – This method assumes that the OoR WQI can be linearly rescaled from its original range (14 to 42) to the CETESB scale (0 to 100), without considering the methodological differences between the indices.
$$\mathrm{OoR\_CET} = \frac{\mathrm{IQA\_OoR} - 14}{28} \times 100$$
Where:
• IQA_OoR is the original index value, which ranges from 14 to 42.
• The range of the OoR scale is 28 (42 – 14).
• The result is a rescaled value on a scale of 0 to 100.
Partial Conversion Using Compatible Parameters – This method uses only the eight analytical parameters that are common to both indices (OoR and CETESB). Because each parameter is scored from 1 to 3, the possible total for these eight parameters in the OoR index ranges from 8 to 24.
$$\mathrm{OoR\_CET\_8} = \frac{\mathrm{IQA\_OoR\_8par} - 8}{16} \times 100$$
Where:
• IQA_OoR_8par represents the total score derived from the eight compatible parameters.
• The possible score range is 16 (24 – 8).
• The result converts the partial score to a scale of 0 to 100.
Weighted Conversion by Parameter – This approach adjusts the scores of the eight compatible parameters, weighting them according to the weights defined in CETESB’s WQI. The weights are proportionally redistributed, as temperature is not utilized in OoR. Each parameter, denoted as i, is normalized on a scale from 0 (poor) to 1 (excellent).
$$\mathrm{OoR\_CET\_8^{*}} = 100 \times \sum_{i=1}^{8} w_i \, \frac{P_i - 1}{2}$$
Where:
• Pi represents the score for parameter i in the OoR, with values ranging from 1 to 3.
• wi is the adjusted relative weight for parameter i, based on CETESB’s WQI weights.
• The weighted sum is converted to a scale of 0 to 100.
Adjustment for Incomplete Sampling – When not all parameters are recorded (e.g., in cases of partial or faulty sampling), this formula adjusts the conversion by considering only the parameters that were effectively measured.
$$\mathrm{WQI}_{\mathrm{adj}} = \frac{S_n - n}{2n} \times 100$$
Where:
• Sn represents the sum of the scores for n available parameters, each of which can take values ranging from 1 to 3.
• n represents the number of parameters considered in the sampling process.
• The total possible range is 2n, that is, the maximum total (3n) minus the minimum total (n).
This conversion does not apply to the weighted approach (OoR_CET_8*).
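The four conversions can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the CETESB weights are assumed values that do not appear in this text, and the pairing of OoR parameters with CETESB parameters (for example, nitrate with nitrogen and settleable solids with total residue) is likewise an assumption.

```python
# Assumed CETESB WQI weights (not given in this text). Temperature is left
# out because OoR does not record it; the rest are renormalized to sum to 1.
ASSUMED_CETESB_WEIGHTS = {
    "dissolved_oxygen": 0.17,
    "coliforms": 0.15,
    "ph": 0.12,
    "bod": 0.10,
    "nitrogen": 0.10,
    "phosphorus": 0.10,
    "turbidity": 0.08,
    "solids": 0.08,
}


def direct_linear(iqa_oor):
    """Rescale the 14-parameter OoR total (14 to 42) to 0-100."""
    return (iqa_oor - 14) / 28 * 100


def partial_8(iqa_oor_8par):
    """Rescale the total of the 8 analytical parameters (8 to 24) to 0-100."""
    return (iqa_oor_8par - 8) / 16 * 100


def weighted_8(scores, weights=ASSUMED_CETESB_WEIGHTS):
    """Weighted conversion; `scores` maps each of the 8 parameters to 1, 2, or 3."""
    if set(scores) != set(weights):
        raise ValueError("all eight compatible parameters are required")
    total_weight = sum(weights.values())
    # (score - 1) / 2 maps 1 -> 0 (poor) and 3 -> 1 (excellent).
    return 100 * sum(
        weights[name] / total_weight * (score - 1) / 2
        for name, score in scores.items()
    )


def adjusted(scores):
    """Rescale using only measured parameters (a score of 0 means not assessed)."""
    measured = [s for s in scores if s != 0]
    n = len(measured)
    if n == 0:
        raise ValueError("no parameter was measured")
    return (sum(measured) - n) / (2 * n) * 100
```

For example, a sample in which every one of the 14 parameters scores 2 totals 28 points and maps to 50 on the 0 to 100 scale under both `direct_linear` and `adjusted`.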
These conversions enabled both qualitative and quantitative comparisons. Because OoR and CETESB classifications rely on discrete categories, strict one-to-one matching can overstate disagreements near class boundaries. For this reason, we assessed both exact agreement and agreement within ±1 category. This tolerance, also adopted by Quinlivan et al. (2020), accounts for the inherent uncertainty of interval-based classifications and prevents minor boundary effects from being misinterpreted as substantive discrepancies.
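The two agreement measures can be illustrated with a short Python sketch. The function name is hypothetical, and the labels of the intermediate categories ("Good" and "Bad") are assumptions, since the text names only the range from Excellent to Terrible and the "Regular" class.

```python
# Ordered from worst to best; "Good" and "Bad" are assumed labels.
CLASSES = ["Terrible", "Bad", "Regular", "Good", "Excellent"]
RANK = {name: position for position, name in enumerate(CLASSES)}


def agreement(oor_classes, cetesb_classes, tolerance=0):
    """Share of paired classifications that differ by at most `tolerance` classes."""
    if len(oor_classes) != len(cetesb_classes) or not oor_classes:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        abs(RANK[a] - RANK[b]) <= tolerance
        for a, b in zip(oor_classes, cetesb_classes)
    )
    return hits / len(oor_classes)


oor = ["Regular", "Good", "Bad", "Regular"]
cetesb = ["Regular", "Regular", "Bad", "Terrible"]
exact = agreement(oor, cetesb)                    # 0.5
within_one = agreement(oor, cetesb, tolerance=1)  # 0.75
```

In this made-up example, two of four years match exactly, and a third differs by a single class, so the tolerant measure rises to 75%.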
Statistical analyses
We compared OoR converted indices (OoR_CET, OoR_CET_8, OoR_CET_8*) across mesobasins using the Kruskal–Wallis test, followed by pairwise post-hoc comparisons with Benjamini–Hochberg false discovery rate (BH-FDR) correction (α = 0.05). Quantitative agreement between annual means of the converted OoR WQIs and CETESB’s WQI was assessed with Welch’s two-sample t-tests; years with n < 2 per group were excluded. Qualitative agreement among five WQI classes (Excellent–Terrible) was evaluated with Cohen’s Kappa and interpreted using standard benchmarks.
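As a rough illustration of the qualitative agreement statistic, unweighted Cohen's Kappa can be computed from two paired lists of class labels as sketched below. This is a plain-Python sketch with a hypothetical function name; it is not the irr routine used in the study, and whether a weighted variant was applied is not stated in the text.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's Kappa for two paired sequences of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: share of pairs with identical labels.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the marginal proportions, summed over labels.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)
```

With `["Good", "Good", "Regular", "Regular"]` against `["Good", "Regular", "Regular", "Regular"]`, observed agreement is 0.75, chance agreement is 0.5, and Kappa is 0.5.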
To assess temporal patterns in monitoring effort and data consistency, we applied nonparametric trend tests (Mann–Kendall) and estimated monotonic slopes using the Theil–Sen estimator (α = 0.05). Trends in completeness were modeled with a bias-corrected binomial generalized linear model (GLM), where the response was the probability that a sample contained all 14 parameters; model significance was evaluated by Wald tests. For parameter-specific “success rates” (annual proportion of events with a successful measurement), we likewise tested monotonic trends via Mann–Kendall/Theil–Sen. Unless noted otherwise, two-sided p-values are reported and multiple comparisons were adjusted with BH-FDR. All analyses were conducted in R (v4.3.x), using base stats and the packages rcompanion (post-hoc and BH-FDR utilities), irr (Cohen’s Kappa; Sim and Wright 2005; McHugh 2012), and Kendall (Mann–Kendall and Theil–Sen).
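The trend statistics can be sketched in Python as follows. This is a simplified illustration with hypothetical function names, not the R routines used in the study: it computes Kendall's tau without a tie correction and derives the two-sided p-value from the normal approximation, so results may differ slightly from exact or tie-corrected implementations.

```python
from itertools import combinations
from statistics import NormalDist, median


def mann_kendall(values):
    """Return (tau, two-sided p) for a series ordered in time; ties not corrected."""
    n = len(values)
    if n < 3:
        raise ValueError("at least three observations are needed")
    # S counts later-minus-earlier increases minus decreases over all pairs.
    s = sum((later > earlier) - (later < earlier)
            for earlier, later in combinations(values, 2))
    tau = s / (n * (n - 1) / 2)
    variance = n * (n - 1) * (2 * n + 5) / 18
    # Continuity-corrected z score.
    if s > 0:
        z = (s - 1) / variance**0.5
    elif s < 0:
        z = (s + 1) / variance**0.5
    else:
        z = 0.0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return tau, p_value


def theil_sen_slope(years, values):
    """Median of the pairwise slopes between all observations."""
    points = list(zip(years, values))
    return median(
        (v2 - v1) / (y2 - y1)
        for (y1, v1), (y2, v2) in combinations(points, 2)
        if y2 != y1
    )
```

For a made-up series of annual sample counts `[10, 12, 15, 14, 20]` over 2019 to 2023, nine of the ten pairs increase, giving tau = 0.8, and the median pairwise slope is 2.5 samples per year.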
Data consistency
The quality of OoR data was assessed based on the total number of samples, the success rate of parameters, and completeness, which is defined as the proportion of parameters recorded per sample. In this context, the success rate refers to the annual proportion of monitoring events in which each water quality parameter was successfully measured within a given mesobasin. This indicator reflects the operational completeness of the dataset rather than the analytical accuracy of results.
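A minimal Python sketch of these two indicators is given below, assuming each sample is stored as a mapping from parameter name to score, with 0 (or a missing key) marking a parameter that was not assessed. The parameter keys and function names are hypothetical.

```python
# Hypothetical keys for the 14 OoR parameters.
PARAMETERS = [
    "turbidity", "total_coliforms", "dissolved_oxygen", "bod", "ph",
    "nitrate", "phosphate", "settleable_solids", "floating_trash", "smell",
    "fish", "red_larvae", "transparent_dark_worms", "foams",
]


def completeness(sample, parameters=PARAMETERS):
    """Proportion of parameters recorded in one sample."""
    recorded = sum(1 for name in parameters if sample.get(name, 0) != 0)
    return recorded / len(parameters)


def success_rates(samples, parameters=PARAMETERS):
    """Per-parameter share of sampling events with a recorded score."""
    if not samples:
        raise ValueError("at least one sample is required")
    return {
        name: sum(1 for s in samples if s.get(name, 0) != 0) / len(samples)
        for name in parameters
    }
```

Applying `success_rates` to the samples of one mesobasin and one year gives the annual success rate described above, while `completeness(sample) == 1.0` identifies samples that contain all 14 parameters.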
Socio-environmental analysis
The socio-environmental analysis examined the following aspects:
Capillarity: The quantity of monitored water bodies within each mesobasin.
Engagement: The total number of samples, groups, and municipalities involved in the study.
Resilience: The ability of a group to remain active over time, including during the COVID-19 pandemic.
Results
Comparison of Observando os Rios’s Conversions
Differences between OoR_CET, OoR_CET_8, and OoR_CET_8*
In the Alto Tietê mesobasin, the Kruskal–Wallis test revealed statistically significant differences among the medians of OoR_CET, OoR_CET_8, and OoR_CET_8* (H = 245.5; p = 3.454 × 10⁻⁵⁴). Post-hoc comparisons confirmed significant differences between all pairs: OoR_CET versus OoR_CET_8 (p = 3.347 × 10⁻³¹), OoR_CET versus OoR_CET_8* (p = 9.428 × 10⁻⁷), and OoR_CET_8 versus OoR_CET_8* (p = 9.968 × 10⁻⁴⁷).
Quantitative water quality index (Observando os Rios versus São Paulo State Environmental Agency)
Annual means and standard deviations were calculated for the OoR conversions (OoR_CET, OoR_CET_8, OoR_CET_8*) and CETESB’s WQI on a scale of 0 to 100 (Supplemental File 2: Table S2).
The WQI analysis encompassed the Alto Tietê, Sorocaba/Médio Tietê, and Piracicaba-Capivari-Jundiaí (PCJ) mesobasins from 2003 to 2022, with a total of 2,692, 2,304, and 402 samples collected, respectively. Welch’s t-test was employed to assess statistical differences between the converted indices and CETESB’s WQI (α = 0.05).
The weighted conversion (OoR_CET_8*) demonstrated the highest similarity to CETESB’s WQI, with non-significant differences (p > 0.05) observed in several years, particularly in the PCJ basin (2010, 2012, 2013, 2016–2020), Sorocaba/Médio Tietê (2004–2006, 2008, 2010–2014, 2018–2022), and Alto Tietê (2004–2008, 2013, 2022). This convergence reflects the weight adjustments made in OoR_CET_8*, which align it more closely with CETESB’s methodology. Years with low sampling, such as 2015 and 2022 in the PCJ basin (n = 1), limit the robustness of the findings, whereas larger sample sizes (e.g., 2007: 449 in Alto Tietê and Sorocaba/Médio Tietê) enhance reliability.
Qualitative water quality index (Observando os Rios versus São Paulo State Environmental Agency)
Figure 3 presents the WQI classifications from OoR (original and converted) and CETESB, categorized from Excellent to Terrible, for each mesobasin.

Figure 3
Annual mean values of the Observando os Rios Water Quality Index (IQA_OoR; A), the CETESB Water Quality Index (E), and OoR-based conversions (OoR_CET; B, OoR_CET_8; C, and OoR_CET_8*; D) by mesobasin. Colors indicate classification categories based on the CETESB scale (0–100), except for column A, which uses the original OoR scale.
The comparative analysis between the Observando os Rios (OoR) classifications and the official CETESB WQI revealed substantial regional variation in agreement rates across the analyzed mesobasins (Figure 4).

Figure 4
Agreement between Observando os Rios (OoR) and CETESB water-quality classifications across the three hydrographic mesobasins (Alto Tietê, Sorocaba/Médio Tietê, and Piracicaba–Capivari–Jundiaí). Bars indicate exact agreement, and the dashed line represents agreement within ±1 classification category.
In the Alto Tietê mesobasin, the original OoR classification achieved the highest level of concordance, with 90% exact agreement relative to CETESB, outperforming all conversion-based approaches (OoR_CET: 60%; OoR_CET_8: 40%; OoR_CET_8*: 56%). When a tolerance of ±1 category was considered, all methods reached 100% agreement, indicating strong qualitative consistency between citizen and official assessments.
In Sorocaba/Médio Tietê, the original OoR classification showed only 10% exact agreement with CETESB, representing the lowest convergence among the three basins. The conversion-based indices improved performance substantially, with OoR_CET and OoR_CET_8* reaching 70% exact agreement and OoR_CET_8 reaching 55%. With the ±1-category tolerance, all methods—including the original index—achieved at least 95% agreement, indicating that discrepancies were generally limited to a single qualitative class.
In the Piracicaba–Capivari–Jundiaí (PCJ) mesobasin, the original OoR classification recorded 65% exact agreement, whereas the conversions produced lower concordance (OoR_CET: 30%; OoR_CET_8: 40%; OoR_CET_8*: 50%). Even in this more heterogeneous system, the ±1 category agreement exceeded 95%, indicating that deviations were generally within a single classification step.
Overall, these results indicate that while exact numerical equivalence between indices can vary across regions, the qualitative patterns and management implications derived from citizen-collected data remain strongly aligned with those obtained through official monitoring programs.
A detailed comparison between the OoR program and CETESB was conducted at co-monitored stretches of the Tietê River, located within the Alto Tietê mesobasin. Both institutions monitor corresponding points in this area. The evaluated CETESB sites include Biritiba-Mirim (ID2050), Mogi das Cruzes (ID2090), Suzano (ID3120), Itaquaquecetuba (ID3130), Guarulhos (ID4150), and São Paulo (ID4170/4180/4200), with the latter represented by the annual mean across the three São Paulo stations. The objective of this analysis was to assess the robustness of concordance in segments historically associated with poor water quality (Figure 5) and to explore potential systematic trends introduced by the OoR methodology. This focus is justified by the predominance of the “Regular” category in CETESB’s long-term averages for the mesobasins—a range that, due to its broad amplitude, encompasses 32.1% of the OoR scale.

Figure 5
Agreement between Observando os Rios (OoR) and CETESB water-quality classifications at co-monitored sites along the Tietê River (Alto Tietê mesobasin). Bars indicate exact agreement, and the dashed line represents agreement within ±1 classification category.
Across these co-monitored sites, exact agreement between OoR and CETESB classifications ranged from 0% to 100%, with a mean concordance of approximately 33%. When a tolerance of ±1 category was considered, agreement increased substantially, varying from 20% to 100% and yielding an average of 71.4%, indicating that most discrepancies were confined to adjacent qualitative classes. Cohen’s Kappa values for these stretches ranged from 0.12 to 0.68, with a mean of 0.43, reflecting overall moderate agreement but with pronounced spatial variability, especially in highly urbanized segments.
Number of samples, success rates, and completeness
The analysis of water quality monitoring data from the Alto Tietê, Sorocaba/Médio Tietê, and PCJ river basins (2002–2023) revealed distinct patterns in monitoring effort and analytical consistency. Figure 6 synthesizes these patterns by presenting, for each mesobasin and year, (i) the proportion of samples that included 11, 12, 13, or 14 parameters and (ii) the total number of sampling events, represented by labeled circular markers.

Figure 6
Annual sample completeness (number of parameters measured per sampling event, ranging from 11 to 14) and total number of sampling events in the Alto Tietê, Sorocaba/Médio Tietê, and Piracicaba–Capivari–Jundiaí mesobasins (2002–2023).
Although analytical completeness improved in certain periods, the trajectories differed among basins. In Alto Tietê, completeness rose mainly from 2014 onward after a mid-2000s decline. In Sorocaba/Médio Tietê, it peaked between 2006 and 2013 before decreasing in recent years. In the PCJ basin, completeness was initially high, then declined, and later increased again, reflecting a fluctuating rather than monotonic pattern.
Sampling effort, however, exhibited strong year-to-year variability. In Alto Tietê, the number of collected samples ranged from 1 in 2002 (and none in 2011–2012) to a peak of 684 in 2015. Sorocaba/Médio Tietê reached its maximum monitoring intensity in 2007 (449 samples) but declined to 8 in 2022, followed by a partial recovery in 2023. In the PCJ basin, sampling effort increased gradually until 2017 (92 samples), declined markedly between 2018 and 2022 (minimum of 2 samples), and rose again in 2023 (76 samples). A Mann–Kendall trend analysis confirmed a significant upward trend only in PCJ (τ = 0.31, p = 0.045; Theil–Sen slope = +2 samples yr⁻¹), suggesting gradual expansion of monitoring capacity in this basin.
A bias-corrected binomial regression indicated an annual increase of approximately 5% in the likelihood that a sample contained all 14 parameters (p < 0.001), confirming long-term improvements in completeness across the network. Despite the sharp decline in 2020 caused by the COVID-19 pandemic, the dataset shows consistent recovery in all basins by 2023, without long-term loss of monitoring capacity.
Success rates for individual parameters (Supplemental File 3: Table S3) were generally high, particularly for turbidity, DO, , foams, debris, odor, fish, and settleable solids, which frequently approached 100%. showed near-complete success after 2009. In contrast, pH and biological indicators (red and transparent/dark worms) exhibited greater variability. Mann–Kendall tests detected a significant upward trend in pH success rates in Alto Tietê (τ = 0.49, p = 0.003) and Sorocaba/Médio Tietê (τ = 0.43, p = 0.008), while presented a slight but significant decline in both basins (τ ≈ –0.43, p < 0.03). Biological indicators showed no long-term trends and presented expected inconsistencies linked to visual identification and site conditions.
Importantly, the increased completeness over time did not produce a measurable effect on the agreement between OoR and CETESB classifications. This result is consistent with the mathematical structure of the OoR index: Missing analytical parameters are compensated internally within the WQI calculation, preventing structural bias. Therefore, the absence of a statistical correlation between completeness and concordance is interpreted as evidence of the effectiveness of this compensatory mechanism.
Monitoring network coverage, community participation, and engagement
The volume of sampling conducted by the OoR program in these basins, particularly in the past decade and in the years since the lifting of COVID-19 restrictions, frequently surpassed the number of samples collected by the state environmental agency (CETESB) in the Alto Tietê and Sorocaba/Médio Tietê mesobasins (Figure 7). In the PCJ region, however, CETESB maintained higher sampling volumes across the three evaluated periods.

Figure 7
Comparison of the number of water-quality sampling events conducted in the study mesobasins by the Observando os Rios program (OoR) and by CETESB.
This result reflects the broad reach of the program: While CETESB focuses on large rivers, the OoR network often includes small or neglected water bodies, located in areas that are difficult for official monitoring teams to access or of particular importance to the local community.
A total of 187 different water bodies were monitored across the three mesobasins (Supplemental File 4: Table S4), including reservoirs, urban streams, and small, hard-to-access tributaries. Several of these sites even lacked official names, being identified instead by local residents according to neighborhood or historical references—a reflection of the program’s strong territorial embeddedness. Figure 8 (panels 8a–c) illustrates this spatial diversity by showing the annual distribution of sampling events across municipalities: Highlighted colors correspond to the most representative locations in each mesobasin, while all others appear as “other municipalities.”

Figure 8
Annual sampling effort per municipality in each mesobasin (a–c). Highlighted colors indicate the most representative municipalities, while all others appear in light gray as “other municipalities.”
In the Alto Tietê basin (Figure 8a), São Paulo city alone accounted for nearly 70% of all monitoring events, including the program’s highest single-municipality peak in 2004. This predominance is expected, as São Paulo is both Brazil’s largest metropolis and the birthplace of the OoR program (2002), currently hosting 41 active volunteer groups. Municipalities such as Mogi das Cruzes and Suzano also contributed substantially to regional sampling efforts.
In the Sorocaba/Médio Tietê basin (Figure 8b), overall activity was lower, but cities such as Itu—home to SOS Mata Atlântica’s headquarters—stood out as key operational hubs, especially during the 2010s. Salto, known for its frequent foam pollution events on the Tietê River, and Ibiúna, where monitoring first began in this basin, also exhibited consistent engagement.
In the PCJ basin (Figure 8c), the monitoring structure shifted markedly after 2016 with the emergence of more organized groups in Campinas and Amparo, often supported by corporate partnerships. Earlier, in the 2000s, Mairiporã had been the only participating municipality, while from 2007 onward, Indaiatuba became important before later discontinuing activities. This trajectory reflects the dynamic participation patterns characteristic of the basin.
In certain municipalities, continuous monitoring of major rivers such as the Camanducaia and Anhumas in the PCJ basin (Amparo and Campinas) is carried out by volunteer teams formed through partnerships with local companies. A similar situation is observed in the Médio Tietê basin, where neighborhood associations, public schools, and universities (e.g., Salto, Itu, Tietê, and Cabreúva) manage regional monitoring activities—comparable in frequency to the work of public sanitation agencies.
Through its inception in the mid-1990s, expansion in the 2000s, and consolidation in the 2010s, OoR has experienced fluctuations in volunteer engagement but has maintained continuous operation. Average group size varied by mesobasin: from 4 to 28 participants in Alto Tietê, from 6 to 58 participants per event in Sorocaba/Médio Tietê, and from 2 to 19 participants in PCJ.
Discussion
The analysis of data from the OoR program, along with its comparison to the official CETESB indicators in the Alto Tietê, Sorocaba/Médio Tietê, and PCJ mesobasins over two decades (2002–2023), revealed both the strengths and limitations of citizen science in water quality monitoring. The implementation of methodological conversions and adjustments, such as parameter weighting (OoR_CET_8*), was crucial for enhancing the reliability of the data produced by OoR (understood here as their temporal consistency and agreement with official reference values, as defined in the Introduction), thereby facilitating greater alignment with CETESB’s WQI, particularly in years with a higher number of samples.
The Kruskal–Wallis test revealed statistically significant differences among the three index versions (OoR_CET, OoR_CET_8, and OoR_CET_8*) in the Alto Tietê and Sorocaba/Médio Tietê basins (p < 0.05). This finding indicates that the method of parameter conversion significantly influences the results. In contrast, the lack of statistical significance in the PCJ basin (p = 0.05789) may be attributed to the lower sample density, which diminishes analytical power. The OoR_CET_8* conversion was found to be the most consistent with CETESB’s methodology, particularly in years with more extensive sampling (e.g., 2007). This suggests that the technical calibration of the weights assigned to OoR parameters is an effective strategy for aligning citizen science with official monitoring methods.
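The comparison among the three index versions can be illustrated with a minimal Python sketch of the Kruskal–Wallis test. It is not the authors' analysis code: the function names and the sample values are hypothetical, and the p-value uses the usual chi-square approximation, which for three groups (two degrees of freedom) reduces to exp(−H/2) and is only approximate for small samples.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_three_groups(a, b, c):
    """Return (H, p) for three independent samples, with tie correction."""
    groups = [list(a), list(b), list(c)]
    if any(len(g) == 0 for g in groups):
        raise ValueError("every group needs at least one observation")
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)

    # Tie correction: divide H by 1 - sum(t^3 - t) / (n^3 - n).
    tie_counts = {}
    for v in pooled:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in tie_counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    h /= correction

    # With three groups, H is referred to a chi-square distribution with
    # 2 degrees of freedom, whose upper-tail probability is exp(-H / 2).
    p = math.exp(-h / 2)
    return h, p


# Invented annual index values, for illustration only (not study data).
oor_cet = [41.0, 43.5, 39.8, 45.2, 42.1]
oor_cet_8 = [44.3, 46.0, 42.7, 47.9, 45.5]
oor_cet_8_star = [48.8, 50.1, 46.4, 52.3, 49.0]

h_stat, p_value = kruskal_wallis_three_groups(oor_cet, oor_cet_8, oor_cet_8_star)
print(f"H = {h_stat:.3f}, p = {p_value:.4f}")
```

Because the test works on ranks rather than raw values, it makes no normality assumption, which suits index values that are bounded and often skewed.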
When considering the temporal dimension, the comparison encompassed 18–20 years of overlapping data per mesobasin, providing a robust basis for assessing consistency. Exact agreement between OoR and CETESB classifications varied considerably—90% in Alto Tietê, 10% in Sorocaba/Médio Tietê, and 65% in PCJ—while the ±1 category agreement exceeded 95% in all cases, indicating that most discrepancies occurred within adjacent qualitative levels. This pattern aligns with previous validation studies showing that categorical water quality assessments often require tolerance for adjacent classes, as small deviations near category boundaries may not represent substantive disagreement. Quinlivan et al. (2020) adopted the same rationale when comparing citizen science classifications with laboratory-derived benchmarks, explicitly accepting ±1 class deviation to accommodate boundary uncertainty and colorimetric variability.
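The two agreement measures discussed above can be sketched as follows. This is an illustrative example only: it assumes that the qualitative classes are coded as consecutive integers on a shared ordinal scale (0 for the most degraded class, increasing with quality), and both the function name and the yearly values are hypothetical.

```python
def agreement_rates(classes_a, classes_b, tolerance=1):
    """Return (exact, within_tolerance) agreement as fractions of paired cases.

    Both inputs hold ordinal class codes (integers) for the same years.
    """
    if len(classes_a) != len(classes_b):
        raise ValueError("both series must cover the same years")
    if not classes_a:
        raise ValueError("at least one paired year is required")
    pairs = list(zip(classes_a, classes_b))
    exact = sum(a == b for a, b in pairs) / len(pairs)
    within = sum(abs(a - b) <= tolerance for a, b in pairs) / len(pairs)
    return exact, within


# Invented yearly classifications, for illustration only (not study data).
oor_classes = [2, 2, 3, 1, 2]
cetesb_classes = [2, 3, 3, 1, 4]

exact, within_one = agreement_rates(oor_classes, cetesb_classes)
print(f"exact: {exact:.0%}, within ±1 class: {within_one:.0%}")
```

Reporting both figures separates disagreements near a class boundary from disagreements of two or more classes, which only the second figure leaves out.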
Kappa coefficients confirmed substantial agreement in Alto Tietê (0.74), moderate in Sorocaba/Médio Tietê (0.51), and moderate to substantial in PCJ (0.58), supporting earlier findings that citizen-science datasets can achieve meaningful levels of agreement with official references when supported by structured protocols and sustained engagement (Buytaert et al. 2014).
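For readers unfamiliar with the statistic, Cohen's kappa corrects the observed agreement for the agreement expected by chance. The sketch below is a minimal illustration: it assumes the unweighted form of kappa (the text does not state whether a weighted variant was used), and the function name and sample classifications are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa for two paired series of class labels."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both series must have the same length")
    n = len(labels_a)
    if n == 0:
        raise ValueError("at least one paired case is required")

    # Observed agreement: share of cases placed in the same class.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of the marginal class frequencies,
    # summed over every class used by either series.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_chance = sum(
        (freq_a[c] / n) * (freq_b[c] / n) for c in set(freq_a) | set(freq_b)
    )
    if p_chance == 1:
        raise ValueError("kappa is undefined when both series use one class")
    return (p_observed - p_chance) / (1 - p_chance)


# Invented yearly classifications, for illustration only (not study data).
oor_classes = ["regular", "regular", "bad", "bad", "good", "regular"]
cetesb_classes = ["regular", "bad", "bad", "bad", "good", "regular"]

print(f"kappa = {cohen_kappa(oor_classes, cetesb_classes):.2f}")
```

Unweighted kappa treats every disagreement as equally serious. A weighted variant would penalize adjacent-class disagreements less, which matters when most discrepancies fall within one class.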
A finer-scale analysis along the Tietê River, covering 4–15 years of comparable co-monitoring depending on the site, revealed pronounced spatial variability. Exact agreement ranged from 0% to 100%, with an overall mean of approximately 33%, while agreement within ±1 category ranged from 20% to 100% (mean = 82%). Sites such as Itaquaquecetuba and Suzano exhibited the highest coherence (up to 100%), whereas in stretches such as São Paulo and Guarulhos the OoR classifications consistently underestimated degradation relative to CETESB. This pattern is likely linked to the broad amplitude of the “Regular” category in the OoR index (~32.1% of the scale), which can smooth short-term or localized deterioration more aggressively than CETESB’s formulation. These findings indicate that most divergences between OoR and CETESB classifications were restricted to a single qualitative step.
However, more detailed analyses along the Tietê River revealed an average Kappa value of only 0.43, indicating moderate agreement, with notable variation among river segments. For example, segments such as Mogi das Cruzes showed high agreement, whereas others, such as Guarulhos, exhibited substantial discrepancies. At these critical sites, CETESB consistently classified water quality into more degraded categories, suggesting that the broader range of the “Regular” category in the OoR—representing 32.1% of the scale—may obscure episodes of severe environmental degradation by smoothing over changes that, under the official system, would result in more pronounced class downgrades.
These limitations do not invalidate the role of citizen science; rather, they qualify it. They underscore the importance of methodological refinements—such as a potential redefinition of qualitative categories—and the complementary use of continuous metrics alongside discrete classifications. The experience accumulated by the OoR also demonstrates that, even with basic instruments, volunteers can produce coherent and socially relevant time series, provided there is ongoing technical support and training (Haklay 2012; Dickinson et al. 2012).
The analysis of data completeness reinforces the methodological robustness of this study. Physicochemical parameters, such as DO, , and , exhibited high consistency over the years. In contrast, parameters like pH demonstrated significant improvements, achieving 100% reliability between 2020 and 2022. Conversely, biological indicators, including larvae and worms, displayed more inconsistent results. These variations can be attributed to challenges in observation in turbid rivers, the subjectivity of visual criteria, and the potential lack of technical training among monitoring groups (San Llorente Capdevila et al. 2020). Additionally, parameters such as foam, fish presence, and BOD introduced notable biases. Sporadic foam formation, low fish sightings in polluted rivers, and the overestimation of BOD underscore the limitations of the colorimetric method and direct observation, particularly when conducted by non-specialist citizens (Dickinson et al. 2012; Haklay 2012).
Despite these challenges, the OoR program stands out for its extensive outreach and social engagement. It encompasses 187 water bodies, including many not monitored by CETESB—such as urban streams and unnamed rivers often designated by local communities—highlighting its territorial breadth and grassroots relevance. This network’s resilience was evident during the COVID-19 pandemic: after a sharp decline in 2020, the program recovered over 80% of its capacity by 2023, demonstrating strong volunteer commitment and adaptability (Pocock et al. 2017).
Regional recovery patterns varied. Sorocaba/Médio Tietê showed the fastest rebound, driven by corporate-supported groups that maintained activities through 2021, while Alto Tietê and PCJ recovered from 2022 onward. By 2023, PCJ even surpassed pre-pandemic levels, reflecting strengthened partnerships with schools, universities, and companies. A shift was also observed after pandemic restrictions were lifted, with main rivers monitored more frequently than tributaries, likely reflecting the operational capacity of structured, resource-supported groups.
Although SOS Mata Atlântica (2022) reported nationwide declines—54% fewer groups, 48% fewer monitored rivers, and 30% fewer municipalities—the smaller reduction in active volunteers (–23%, from 3,500 to 2,700) underscores the persistence and social resilience of the network. These patterns confirm that citizen science programs like OoR can sustain continuity and growth even under adverse conditions.
Conclusion
This study reinforces the potential of citizen science, as exemplified by the OoR program, to complement official environmental monitoring networks through extensive territorial coverage, continuous data generation, and strong community engagement. Beyond producing technically consistent results aligned with official indices, the program has demonstrated remarkable social resilience, particularly in its recovery from the disruptions caused by the COVID-19 pandemic.
The capacity of OoR to reach water bodies neglected by institutional monitoring, combined with its role in fostering environmental education, public participation, and local stewardship, highlights its strategic value for integrated water resource governance. Future improvements should focus on standardizing methodologies, expanding digital tools, and strengthening partnerships with the private sector, academia, and local communities to enhance both data quality and societal impact. As such, citizen science emerges not as an auxiliary effort but as a fundamental component of inclusive, adaptive, and sustainable water management strategies.
Data Accessibility Statement
All data used in this study are available upon request from the SOS Mata Atlântica Foundation and from CETESB’s public database.
Supplementary Files
The Supplementary files for this article can be found as follows:
Supplemental File 1: Table S1
Parameters and measurement protocols used in the OoR Water Quality Index (WQI). DOI: https://doi.org/10.5334/cstp.836.s1
Supplemental File 2: Table S2
Summary of annual mean and standard deviation of WQI values for OoR conversions and CETESB measurements across mesobasins. DOI: https://doi.org/10.5334/cstp.836.s2
Supplemental File 3: Table S3
Percentage of successful sampling per parameter and year. DOI: https://doi.org/10.5334/cstp.836.s3
Supplemental File 4: Table S4
List of monitored water bodies by municipality in the Alto Tietê, Sorocaba/Médio Tietê, and PCJ mesobasins. DOI: https://doi.org/10.5334/cstp.836.s4
Supplemental File 5
Full Portuguese-language version of the manuscript. DOI: https://doi.org/10.5334/cstp.836.s5
Ethics and Consent
This study did not involve human participants or animals. All analyses were based exclusively on secondary data obtained from existing environmental monitoring programs (OoR and CETESB). Therefore, ethical approval was not required.
Acknowledgements
The authors would like to thank GEPURA – Grupo de Estudo e Práticas do Uso Racional da Água, a student-led group composed of undergraduate students from the University of São Paulo, for their collaboration in the studies that contributed to the development of this article.
Competing Interests
The authors Aline da Silva Cruz and Gustavo Veronesi are employees of “SOS Mata Atlântica”, the NGO responsible for the “Observando os Rios” program, which constitutes a potential conflict of interest. However, they provided the data and answered questions that were highly relevant to the development of this study, and they derive no particular gain from its publication.
