RColSim: An R-Based Open-Source Regional Water Management Model for the Columbia River Basin

Open Access | Apr 2026

Introduction

The Columbia River Basin (CRB) is the largest supplier of hydropower in the United States and generates more than 40% of the U.S. hydroelectric energy supply. Additionally, the agricultural sector of the region produces more than 7% of U.S. agricultural commodities in terms of economic value. The CRB’s drainage basin extends to seven U.S. states and the Canadian province of British Columbia. Its dams affect many other sectors and stakeholders, such as urban water utilities, fisheries, the recreation industry, and river-based navigation. They also play a crucial role in protecting the region from flooding dangers. The CRB water system carries significant cultural and spiritual importance for various societies, such as Indigenous residents of the region in both the United States and Canada [1, 2].

This paper presents RColSim (Figure 1), an open-source, script-based model developed to simulate the complex water infrastructure systems of the Columbia River Basin. Written in the R programming language, the model is designed to run in parallel on high-performance computing clusters, enabling large-scale ensemble simulations. The script-based structure also facilitates integration with other hydrologic land-surface models and socioeconomic system-dynamics tools. Together, these features make RColSim well-suited for evaluating the impacts of climate scenarios and operational management decisions on diverse water stakeholders in the Columbia River Basin. The model can also support applications such as exploratory policy analysis [3] and uncertainty quantification.

Figure 1

Conceptual schematic of RColSim and the six major sub-basins that RColSim simulates.

The underlying algorithms used in RColSim were initially developed in a monthly time-step, Stella-based system-dynamics model called ColSim, which was used in various studies to explore climate change impacts on the integrated system [4, 5], the economic value of long-lead climate forecasts [6], flood control optimization [7], Pacific Northwest energy impacts [8], and agricultural production [9]. Despite the wide application of the original program, Stella is not a freely available programming platform. Due to the limitations of its coding environment, the original model could not be scripted to run efficiently in ensemble mode and was not compatible with studies requiring high-performance parallel computing.

The Columbia River system is expected to undergo various types of stressors in the future, including land-use change, climate change, increasing hydrometeorological extremes, and revision of the transboundary agreement between the United States and Canada [10, 11, 12, 13, 14]. To explore the ramifications of these transformations, a myriad of scientific papers and projects have focused on the CRB during the last three decades. However, a comprehensive investigation of the impacts of these stressors often necessitates carrying out intensive computational experiments to conduct optimization, sensitivity analysis, and bottom-up assessment. These experiments allow us to investigate system behaviour and best management practices under unknown future climatic and socioeconomic uncertainties and stressors [15, 16, 17, 18].

Currently, the authors are unaware of any open-source, script-based water-management models of the region that can be deployed on computer clusters for tens of thousands of simulations. The simulation tools used to represent the infrastructural and institutional details of the Columbia River have been either overly simplistic or proprietary. RColSim responds to these diverse needs and provides a script-based, open-source model that can be used in future studies that aim to improve the planning and management of water resources in the CRB under deep uncertainty. For example, Hall et al. (2024) [19] leveraged the script-based architecture of RColSim to evaluate the impacts of climate change on water supply and demand in the Columbia River Basin. In their framework, RColSim was coupled with several biophysical models, including VIC-CropSyst [20], which simulates land-surface hydrology and agricultural systems. Within this integrated modeling platform, VIC-CropSyst generated projections of naturalized streamflow and irrigation demand over a 30-year horizon, which were then provided as inputs to RColSim. RColSim subsequently simulated the effects of human water management decisions, infrastructure operations (e.g., dam rule curves), water rights, and institutional constraints. Together, the coupled models enabled a comprehensive assessment of the combined influences of climatic and anthropogenic stressors on regional water supply and demand.

Implementation and architecture

RColSim builds on the original, widely used conceptual algorithms of ColSim while taking advantage of the free availability and data-processing features of the R programming language. Both ColSim and RColSim follow the reservoir rule curves that guide system operation in the real world. These rule curves (Figure 3) include the upper flood-protection rule curve (the dam stage cannot rise above this level), the lower-limit operating rule curve (the dam stage cannot fall below this level), the assured- and variable-refill rule curves (the assured, or fixed, rule curve is based on an analysis of historical records and is unaffected by a specific year’s projected hydrological conditions, while the variable rule curve responds to each year’s projected water supply conditions), and the critical rule curves (which guide reservoir operation for hydropower generation during low-flow years). Each week, system operators select a rule curve based on the season, the projected water availability during that season, and flood-control projections; RColSim closely follows this operational logic. We have also modified several aspects of the code in RColSim. For example, unlike the original ColSim model, RColSim treats irrigation demand as a dynamic variable that changes on a weekly basis. This approach allows for a more accurate and simultaneous representation of supply and demand dynamics, particularly when coupled with external agro-hydrological models. For example, several past and ongoing projects have used the VIC-CropSyst [20] model to simulate irrigation requirements [21]. We have also improved RColSim simulations compared to the original model by updating all the rule curves and implementing additional environmental flow targets.
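The bounding role of the rule curves can be illustrated with a minimal sketch. The function name, the storage units, and the curve values are hypothetical and are not taken from the RColSim source. The sketch only shows how a weekly storage target might be kept between a flood-control upper rule curve and a lower operating limit.

```r
# Hypothetical illustration: keep a weekly storage target between the
# flood-control upper rule curve (URC) and the lower operating limit.
# All names and numbers are assumptions, not RColSim code.
bound_storage <- function(target, urc, lower_limit) {
  stopifnot(all(lower_limit <= urc))
  # Raise targets that fall below the lower limit, then cap at the URC
  pmin(pmax(target, lower_limit), urc)
}

# Four example weeks (arbitrary volume units)
urc         <- c(100, 90, 80, 95)
lower_limit <- c(20, 20, 25, 30)
target      <- c(110, 50, 10, 95)

bounded <- bound_storage(target, urc, lower_limit)
print(bounded)
```

In this toy case the first target is capped at the upper curve, the third is raised to the lower limit, and the other two are left unchanged.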

RColSim has also been used to assess the impacts of climate change on water supply and demand in the Columbia River Basin. To achieve this, RColSim was integrated with the VIC-CropSyst model, which simulates land-surface hydrology and agricultural systems. VIC-CropSyst simulates the naturalized streamflow entering each dam from upstream over the next thirty years, along with irrigation water demands across the CRB. The VIC-CropSyst results were then used as RColSim inputs to represent the effect of climatic changes in the region. RColSim in turn simulates the effects of human water management decisions, water infrastructure, water rights, and institutions. The combination of these two models provides a comprehensive understanding of how human and climatic stressors can affect water supply and demand in the region.

Furthermore, the original ColSim model represented the upper and middle Snake River dams as two hypothetical integrated dams. While these dams do not contribute significantly to the overall water supply of the Columbia River, lumping them together limits our ability to explore research questions in those specific regions (e.g., the headwaters of the Snake River). RColSim addresses this issue by unpacking the original hypothetical cluster dams into individual reservoir units. RColSim also includes dams along smaller tributaries, such as the Chelan dam, that were not considered in the original ColSim model.

RColSim simulates the operation of 46 storage and run-of-river dams in the Columbia River system using a weekly time step. The purpose of the model is to capture the operations and streamflow effects on each dam, which jointly reduce flood risk while meeting agricultural, energy production, and environmental demands in the system. The model is able to simulate the system-wide constraints imposed by the Columbia River Treaty between the United States and Canada. In other words, each reservoir in the system simultaneously handles its dam-specific operation goals (e.g., minimum flows, and rule curve operations) while contributing to system-wide flood protection, energy production, and support for environmental flows. RColSim simulates six main tributaries and dam systems of the CRB: the upper, middle, and lower Columbia basins; Kootenay River Basin; Snake River Basin; and Pend Oreille (Figure 1).

RColSim’s Inputs and Outputs

RColSim requires naturalized streamflow inputs for the major dams in the Columbia River Basin, as well as incremental flows between upstream and downstream dams (i.e., water added to the system between reservoirs). The model also includes default out-of-stream water demands (e.g., agriculture and municipal use), but users can substitute these with their own demand datasets. Other inputs, such as reservoir rule curves, can likewise be updated by users. In terms of outputs, RColSim produces streamflow, water allocation, and hydropower generation results. Because of its script-based design, users can also extend the model to generate additional outputs tailored to their research needs and applications.
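To make the role of these inputs concrete, the sketch below shows a generic weekly reservoir mass balance in which inflow is the sum of an upstream release and an incremental flow. It is a textbook-style illustration under stated assumptions, not RColSim's implementation; the function name, arguments, and numbers are all hypothetical.

```r
# Generic weekly reservoir mass balance (illustrative; not RColSim code).
# inflow = release from the upstream dam + incremental flow between dams.
# All volumes share the same arbitrary units per week.
step_reservoir <- function(storage, upstream_release, incremental_flow,
                           diversion, target_release, capacity) {
  inflow      <- upstream_release + incremental_flow
  available   <- storage + inflow
  diverted    <- min(diversion, available)              # out-of-stream demand
  release     <- min(target_release, available - diverted)
  new_storage <- available - diverted - release
  spill       <- max(0, new_storage - capacity)         # water above capacity
  list(storage  = new_storage - spill,
       outflow  = release + spill,
       diverted = diverted)
}

res <- step_reservoir(storage = 500, upstream_release = 120,
                      incremental_flow = 30, diversion = 40,
                      target_release = 100, capacity = 480)
str(res)
```

The returned outflow would then serve as the upstream release for the next dam downstream, which is why incremental flows are needed at every node.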

Model Code Architecture

The structure of the RColSim code was intentionally designed to follow a procedural, or function-centric, paradigm. RColSim comprises ten essential code components, which are described in the list below and in Figure 2. The choice of a procedural paradigm is rooted in the fact that each dam in the Columbia River Basin has distinctive operational details, so there is little overlap between the code for one dam and the code for another. We avoided an object-oriented implementation on the presumption that it would introduce unnecessary structural complexity and demand a substantial additional investment of development time while yielding little to no discernible benefit.

Figure 2

Code architecture of RColSim.

  1. RColSim_main.R: The entire simulation procedure is controlled and orchestrated by the model’s main script (RColSim_main.R), which oversees the simulation, imports input files, initializes the model, and generates output files. Several categories of functions are imported into the main script at the beginning of the simulation; these are provided by the components listed below.

  2. Global control file: Provides RColSim with its primary simulation input information, encompassing the start and end date of the simulation and the location of the key streamflow and water diversion inputs to the model.

  3. load_functions.R: This module is the most central part of RColSim that provides the model with all the dam-specific operational details that facilitate the simulation of the Columbia River System. The dam-specific operations include upstream water supply estimation, simulation of water release to downstream dams, flood protection procedures, calculation of off-stream water diversion from each reservoir, and the contribution of each dam to the overall Columbia River system performance indicators related to hydropower production and environmental conditions. The load functions module is imported into the R environment by the main program at the beginning of the simulation, and it in turn imports dam-specific functions from separate scripts named for each dam. All functions remain accessible to all model parts throughout the model execution.

  4. initialize_model.R: The initialization module defines the system’s initial conditions for RColSim. For example, the starting water level for each dam is defined in this module, and users can change it if needed.

  5. dataframes.R: Defines all the data frames that are used during an RColSim simulation to track the key model components. These data frames also define the model outputs, so any additional model outputs need to be defined in this module.

  6. PMFs.R: RColSim calculates systems performance indicators at the end of each time step. These indicators include various out-of-stream water demand, hydropower generation, and environmental conditions performance metrics.

  7. supply_demand.R: Spatial and temporal aggregation of the streamflow and water demand input data.

  8. flow_to_ColSim.R: Creates the global input file and writes the time series input data used by the RColSim functions.

  9. read_rule_curves.R: As discussed earlier, RColSim is a rule-based model, meaning that dam operations are partly controlled by predefined water retention and release rules. This module reads all the rule curve inputs for each dam and ensures that they are available during the simulation (Figure 3).

  10. switches.R: As a water resource management model, RColSim relies on diverse inputs. One of the key inputs to the model is the criteria that define various modes of behaviour, triggering distinct operational choices throughout the simulation. For instance, this module specifies under what streamflow conditions fish protection takes precedence over reservoir refill.
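The function-centric organization described in this list can be sketched schematically as follows. This is an assumed, simplified structure and not an excerpt from RColSim: the release rules, coefficients, and flow values are invented, and only the dam acronyms (MI, AR) follow Table 1. The sketch shows dam-specific logic held in plain functions, a pre-allocated data frame for outputs, and a weekly loop that routes releases downstream.

```r
# Schematic of a procedural weekly loop (assumed structure, not RColSim code).
# Dam-specific release rules are plain functions, looked up by dam acronym.
release_rules <- list(
  MI = function(inflow) 0.8 * inflow,   # hypothetical rule for Mica
  AR = function(inflow) 0.9 * inflow    # hypothetical rule for Keenleyside
)

n_weeks        <- 3
inflow_MI      <- c(100, 120, 90)       # made-up naturalized inflow to MI
incremental_AR <- c(10, 15, 5)          # made-up incremental flow, MI to AR

# Pre-allocated output data frame, one row per weekly time step
out <- data.frame(week = seq_len(n_weeks),
                  MI_release = NA_real_,
                  AR_release = NA_real_)

for (w in seq_len(n_weeks)) {
  out$MI_release[w] <- release_rules$MI(inflow_MI[w])
  # The downstream dam receives the upstream release plus incremental flow
  out$AR_release[w] <- release_rules$AR(out$MI_release[w] + incremental_AR[w])
}
print(out)
```

Because each dam has its own function, changing the operating logic of one dam does not require touching the others, which is the main motivation the text gives for this paradigm.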

Figure 3

Seasonality of system operation and rule curve selection process for operation of the Columbia River Basin. Each week one of these rule curves is selected for the operation of each dam in the CRB water system. Overall, water stored behind dams cannot be more than the flood control URCs. In terms of the minimum operation level (CRC and ORCLL), the rule curves cannot drop below the dam water level during the worst historical water supply year.

RColSim Target User Community

RColSim is designed and developed to specifically simulate the water management and infrastructural systems of the Columbia River Basin. Therefore, the model’s target user community is academic, private sector, and agency-level researchers, in addition to decision-makers who focus on opportunities and vulnerabilities in the Columbia River system. While the model is not transferable to other river systems, Columbia River studies [4, 5, 6, 9, 22, 23] have informed a wide array of researchers and decision-makers investigating other snow-dominated basins that are sensitive to climate change [24, 25, 26] with competing water demands. Like the CRB, these water systems (e.g. the Mekong River basin) are often constrained by complex institutional and infrastructural frameworks, including transboundary treaties. Thus, RColSim can also broadly serve as a template and computational framework for the development of detailed water management models of other regional-scale water resources systems.

Region-Specific vs. Generic Water Transfer Models

There are two main categories of water management modelling platforms: 1) generic transferable models; and 2) region-specific models [27]. Generic water management models usually provide objects that represent generic dams, river flow, hydropower plants, out-of-stream diversions, and other water control devices. Users can treat these objects as building blocks and assemble them into a model of their specific river system. Examples of these models include RiverWare [28], WaterGAP [29], WASMOD-M [30], H08 [31] and VIC-ResOpt [32]. The main advantage of these models is their transferability and reusability. The main drawback is their limited ability to accurately represent the real-world details of river system infrastructures, operation characteristics, water laws, and their specific system-wide objectives [22, 27, 33, 34]. Region-specific models, such as RColSim, can depict the infrastructural and operational details of their respective river basins with a higher level of granularity and fidelity. However, they lack the flexibility that would make them immediately transferable to other regions. Examples of these models include StateMod [35], CALFEWS [36], and CalSim [37]. Overall, region-specific models are expected to outperform generic models in regional water systems planning and operations. As a region-specific model, RColSim is therefore not directly transferable to other regions.

Quality control

We compared observed versus RColSim-simulated reservoir signature (defined as the difference between mean monthly streamflow with and without the reservoir effect) at various system reservoirs to assess the performance of our river-system model and show that the model can capture the overall storage dynamics of the system reasonably well (Figure 4 and Table 1). There are, however, discrepancies that can stem from many simplifying assumptions applied during the representation of the complex CRB water system. There are also model limitations associated with uncertainties in streamflow input data, irrigation and energy demands, and rule curves used in the model. The model operates at a native weekly time step. For performance evaluation at the monthly scale, the weekly simulation results were aggregated into monthly values as a post-processing step.
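The post-processing described here can be sketched in base R as follows. The flows are synthetic, the variable names are hypothetical, and two choices are assumptions rather than the authors' documented procedure: each weekly value is assigned to the calendar month of its start date, and the signature is computed as regulated minus naturalized flow.

```r
# Illustrative only: synthetic weekly flows, not RColSim output.
dates <- seq(as.Date("1979-10-01"), by = "week", length.out = 104)
day_of_year <- as.numeric(format(dates, "%j"))

weekly <- data.frame(
  month       = as.integer(format(dates, "%m")),
  naturalized = 1000 + 500 * sin(2 * pi * day_of_year / 365),
  regulated   = 1000 + 200 * sin(2 * pi * day_of_year / 365)
)

# Mean flow by calendar month (assumption: weeks are assigned to the
# month in which they start)
monthly <- aggregate(cbind(naturalized, regulated) ~ month,
                     data = weekly, FUN = mean)

# Reservoir signature (assumption: regulated minus naturalized)
monthly$signature <- monthly$regulated - monthly$naturalized
print(monthly)
```

The same aggregation can be applied to observed and simulated series, after which the two signatures can be compared month by month.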

Figure 4

Simulated vs. observed long-term mean change in reservoir storage by calendar month, post-processed from the original weekly time step simulation from 1979 to 2007.

Table 1

Performance metrics for RColSim simulated vs. observed monthly dam outflow. These metrics include the Pearson Correlation Coefficient (r), Mean Error (ME), Kling-Gupta Efficiency (KGE), Normalized Root Mean Square of Error (NRMSE), and Volumetric Efficiency (VE). The simulation period is from August 1979 to September 2007. The original simulation time step was weekly, which was aggregated to monthly to construct this table. The table also provides information about dam location in the basin and acronyms used in RColSim to refer to each dam.

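| Dam name | Basin | RColSim name | ME | NRMSE (%) | r | KGE | VE |
|---|---|---|---|---|---|---|---|
| Bonneville | Lower Columbia | BON | 2497.94 | 54.3 | 0.88 | 0.81 | 0.85 |
| The Dalles | Lower Columbia | DA | 2511.92 | 52.4 | 0.88 | 0.83 | 0.85 |
| John Day | Lower Columbia | JD | 1941.09 | 52.8 | 0.88 | 0.83 | 0.85 |
| McNary | Lower Columbia | MCN | 2456.03 | 52.9 | 0.88 | 0.84 | 0.84 |
| Libby | Kootenay | LB | -384.66 | 114.4 | 0.39 | 0.39 | 0.51 |
| Bonners Ferry | Kootenay | BONF | -574.7 | 105.3 | 0.49 | 0.48 | 0.62 |
| Corra Linn | Kootenay | CL | -1256.83 | 63.8 | 0.81 | 0.8 | 0.78 |
| Duncan | Kootenay | DU | -166.57 | 88.4 | 0.57 | 0.55 | 0.48 |
| Brownlee | Snake River | BR | 737.65 | 90.2 | 0.78 | 0.52 | 0.6 |
| Dworshak | Snake River | DW | 376.35 | 121.6 | 0.34 | 0.33 | 0.35 |
| Hells Canyon | Snake River | HC | 758.84 | 86.1 | 0.8 | 0.54 | 0.62 |
| Ice Harbor | Snake River | IH | -362.35 | 40.2 | 0.93 | 0.87 | 0.8 |
| Little Goose | Snake River | LIG | 2463.91 | 45.3 | 0.93 | 0.78 | 0.78 |
| Lower Monumental | Snake River | LM | 2603.76 | 45.2 | 0.93 | 0.78 | 0.79 |
| Lower Granite | Snake River | LG | 2068.19 | 45.1 | 0.93 | 0.79 | 0.79 |
| Grand Coulee | Mid-Columbia | GC | -3569.27 | 81.8 | 0.67 | 0.67 | 0.81 |
| Priest Rapids | Mid-Columbia | PR | 1786.24 | 71 | 0.77 | 0.76 | 0.82 |
| Rock Island | Mid-Columbia | RI | 1645.69 | 70.2 | 0.77 | 0.76 | 0.83 |
| Rocky Reach | Mid-Columbia | RR | 1645.91 | 73.7 | 0.75 | 0.74 | 0.82 |
| Wanapum | Mid-Columbia | WA | 1599.91 | 70.3 | 0.77 | 0.76 | 0.83 |
| Wells | Mid-Columbia | WE | 1639.44 | 75.2 | 0.74 | 0.73 | 0.82 |
| Chief Joseph | Mid-Columbia | CJ | 1547.78 | 81.2 | 0.69 | 0.69 | 0.81 |
| Albeni Falls | Pend Oreille | AF | -1068.71 | 49.1 | 0.87 | 0.8 | 0.78 |
| Boundary | Pend Oreille | BD | -941.56 | 46.9 | 0.89 | 0.84 | 0.79 |
| Cabinet | Pend Oreille | CB | -840.27 | 43.3 | 0.9 | 0.83 | 0.77 |
| Hungry Horse | Pend Oreille | HH | -89.69 | 141.5 | 0.11 | 0.1 | 0.27 |
| Kerr | Pend Oreille | KE | -252.83 | 79.4 | 0.64 | 0.6 | 0.66 |
| Noxon Rapids | Pend Oreille | NOX | -774.48 | 45.2 | 0.89 | 0.81 | 0.76 |
| Mica | Upper Columbia | MI | -904.03 | 83.8 | 0.61 | 0.59 | 0.67 |
| Revelstoke | Upper Columbia | REV | -1173.34 | 95.3 | 0.53 | 0.52 | 0.75 |
| Keenleyside (Arrow) | Upper Columbia | AR | -2359.78 | 93.4 | 0.56 | 0.56 | 0.7 |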
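For readers who want to reproduce metrics of the kind reported in Table 1, the sketch below computes them in base R under commonly used definitions. Several choices are assumptions, since the text does not state them: ME is taken as simulated minus observed, KGE follows the original three-component formulation (correlation, variability ratio, bias ratio), NRMSE is normalized by the standard deviation of the observations, and VE is one minus the ratio of summed absolute errors to summed observations. The function name and the example numbers are hypothetical.

```r
# Performance metrics under commonly used definitions (assumptions noted).
perf_metrics <- function(sim, obs) {
  stopifnot(length(sim) == length(obs), length(obs) > 1)
  r     <- cor(sim, obs)                       # Pearson correlation
  me    <- mean(sim - obs)                     # mean error (sim minus obs)
  alpha <- sd(sim) / sd(obs)                   # variability ratio
  beta  <- mean(sim) / mean(obs)               # bias ratio
  kge   <- 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)
  rmse  <- sqrt(mean((sim - obs)^2))
  nrmse <- 100 * rmse / sd(obs)                # assumption: normalized by sd(obs)
  ve    <- 1 - sum(abs(sim - obs)) / sum(obs)  # volumetric efficiency
  c(ME = me, NRMSE_pct = nrmse, r = r, KGE = kge, VE = ve)
}

# Made-up monthly outflows for illustration
obs <- c(100, 150, 200, 250, 300)
sim <- c(110, 140, 210, 240, 310)
metrics <- perf_metrics(sim, obs)
print(round(metrics, 3))
```

If the published values were computed with a different NRMSE normalization (for example, by the mean or the range of the observations), the corresponding line would need to be changed to match.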

System-Level (Regression) Tests

RColSim includes a built-in system-level (regression) test. This test is performed at six major reservoirs in the system (Mica, Keenleyside, Grand Coulee, Ice Harbor, Duncan, and The Dalles). At each site, the output of RColSim is compared against a pre-executed benchmark, and any discrepancies are flagged. This feature helps ensure that future modifications do not unintentionally alter model behaviour and thereby enhances the model’s robustness and reuse potential.

Finally, the Supplemental Materials of this paper provide an additional comparison between observed and simulated dam outflow at all CRB dams represented in RColSim. The Supplemental Materials also describe the original datasets used to conduct the RColSim simulations for this comparison, as well as the observed dam outflow datasets.
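The idea behind such a regression test can be sketched as follows. This is not the content of RColSim's own test script; the function name, the tolerance, and the example values are hypothetical, and only the site acronyms follow Table 1.

```r
# Sketch of a benchmark comparison (hypothetical names and tolerance).
# Returns the names of sites whose current output differs from the benchmark.
compare_to_benchmark <- function(current, benchmark, tol = 1e-6) {
  stopifnot(identical(names(current), names(benchmark)))
  flagged <- vapply(names(current), function(site) {
    !isTRUE(all.equal(current[[site]], benchmark[[site]], tolerance = tol))
  }, logical(1))
  names(flagged)[flagged]
}

# Made-up outflow series at two sites; GC differs in the second week
benchmark <- list(MI = c(10, 20, 30), GC = c(50, 60, 70))
current   <- list(MI = c(10, 20, 30), GC = c(50, 61, 70))

changed <- compare_to_benchmark(current, benchmark)
print(changed)
```

Comparing with a tolerance rather than exact equality avoids false alarms from floating-point differences across platforms while still catching changes in model logic.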

Model Limitations and Future Work

There are tributaries within the Columbia system that are currently excluded from RColSim because their water contribution relative to the overall annual flow at the Columbia River scale is negligible. Examples of these tributaries include the Yakima and Walla Walla. Therefore, future studies can more explicitly incorporate operational details of dams in those regions and enable RColSim to answer broader ranges of questions at the subbasin scale.

In addition, the model is not currently accessible through any application programming interface. Future work that enables this capability could significantly enhance RColSim’s accessibility and utility. RColSim is intentionally designed as a script-based model and was not transformed into a formal R package, which gives researchers a higher degree of flexibility in using the model and fine-tuning it to their specific requirements. However, future work exploring the benefits of creating a formal CRAN R package could offer valuable insights and guide the development of RColSim in subsequent phases. We also recommend future work inspired by this framework that uses an object-oriented programming (OOP) paradigm, either in R (S3 or S4) or in other programming languages more focused on OOP (Python, C++, etc.). The current implementation also relies on an architecture with shared global state and implicit execution order, which limits modularity, testability, and formal API definition in this version of the model; these issues are expected to be addressed in future model releases. It is our hope that future studies will undertake these kinds of modifications in an open-source environment.

(2) Availability

Operating system

Linux/UNIX, Windows, Mac.

Programming language

R Programming Language

Additional system requirements

There are no specific system requirements.

Dependencies

RColSim is compatible with the base version of R and does not require any additional R libraries except the xts (eXtensible Time Series) package.

List of contributors

  1. Malek, Keyvan (lead and corresponding author)

  2. Yourek, Matthew

  3. Adam, Jennifer

  4. Hamlet, Alan F.

  5. Rajagopalan, Kirti

  6. Reed, Patrick

Software location

Archive

Code repository

Language

English

(3) Reuse potential

The RColSim model has been developed in the R programming language. The steps necessary to conduct a simulation are summarized below. Interested readers can contact Keyvan Malek (keyvanmalek@gmail.com), Matthew Yourek (matthew.yourek@wsu.edu), and Jennifer Adam (jcadam@wsu.edu) for additional information.

The manuscript includes a stand-alone user manual document that provides more comprehensive information on the model input data preparation and simulation process. The users can also refer to RColSim’s GitHub page (https://github.com/keyvan-malek/RColSim) for more information on various aspects of model simulations and data pre-processing.

In summary, RColSim, as an R-native model, requires the R programming language to be installed. Apart from the xts package, the model only uses functions and libraries that are available in base R, so no other supplementary libraries are needed during program execution. The current version of RColSim operates on a weekly time step and therefore requires streamflow and irrigation demand inputs at the same weekly cadence. Moreover, these inputs must be calculated for each model streamflow input node, as shown in Figure 5.

Figure 5

Incremental drainage areas for each dam simulated in RColSim. The red triangles mark the location of dams with reservoirs, and the black stars indicate the location of run-of-river dams. The light gray boundaries delineate the incremental drainage area between each downstream dam and the immediately upstream dam(s).

RColSim relies on a global control file that specifies the main simulation parameters. This file is currently located in the “inputs/global_input_files/” directory (e.g., “GIF_Historical_baseline_supply_and_demand”), but users are free to choose any file name and location for it. In that case, the new file path needs to be set in line 46 of the “RColSim_main.R” file. Additionally, there are various other inputs that can be customized for specific purposes, such as the rule curves found in “RColSim/inputs/default_rule_curves”.

RColSim includes a system test functionality that facilitates the comparison of its simulations at six key system reservoirs (Mica, Keenleyside, Grand Coulee, Ice Harbor, Duncan, and The Dalles) with benchmark simulations that have been previously validated. These reservoirs are chosen to represent the primary tributaries of the Columbia River system. Users can initiate the test by executing the “run_tests.sh” shell script. The test functions, which are customizable, can be accessed within the “tests” folder and the “test_RColSim.R” R script.

Additional File

The additional file for this article can be found as follows:

Supplemental Materials

Abbreviations

| No. | Description | Abbreviation |
|---|---|---|
| 1 | Columbia River Basin | CRB |
| 2 | No-Regulation, No-Irrigation | NRNI |
| 3 | Bonneville Power Administration | BPA |
| 4 | Critical Rule Curve | CRC |
| 5 | Operating Rule Curves Lower Limit | ORCLL |
| 6 | Upper Rule Curve | URC |
| 7 | Assured Rule Curve | ARC |
| 8 | Variable Rule Curve | VRC |
| 9 | Pearson Correlation Coefficient | r |
| 10 | Mean Error | ME |
| 11 | Kling-Gupta Efficiency | KGE |
| 12 | Normalized Root Mean Square of Error | NRMSE |
| 13 | Volumetric Efficiency | VE |

Competing Interests

The authors have no competing interests to declare.

DOI: https://doi.org/10.5334/jors.406 | Journal eISSN: 2049-9647
Language: English
Submitted on: Nov 23, 2021 | Accepted on: Mar 10, 2026 | Published on: Apr 1, 2026
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Keyvan Malek, Matthew Yourek, Jennifer Adam, Alan Hamlet, Kirti Rajagopalan, Patrick Reed, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.