More than two thirds of the earth's surface is covered by water, which is a vital resource for all people and living things (Alsdorf and Lettenmaier, 2003, Jesus et al., 2022). Water quality (WQ) should be monitored to control diseases associated with poor water quality (Liu et al., 2022, Manoiu et al., 2022), especially in drinking water for humans and animals. WQ can be determined analytically in the laboratory; however, this approach is inefficient, time-consuming, and expensive (Shanahan et al., 1998). This highlights the need to monitor WQ using intelligent systems, especially when real-time data are available (Alavi et al., 2021, Tao et al., 2018). WQ monitoring is also essential for tracking human activities and freshwater availability for humans and natural ecosystems (Akhtar et al., 2021). The current challenges of exponential population growth, intensive agriculture, urbanization, industrialization, and the interaction of complex systems have raised concerns about adequate water availability under extreme climate scenarios (Uddin et al., 2021, Yan et al., 2022). A recent United Nations study found that 80% of health problems in low-income countries, which kill 1.5 million people a year, are associated with poor water quality (United Nations, 2017). Assessing and predicting WQ indicators with advanced modeling and analytical approaches is needed to determine whether water is suitable for a specific use (Tiyasha et al., 2020). Estimating WQ accurately and quickly makes it possible to develop appropriate treatment or safety measures for contaminated water (Tiyasha et al., 2021). Given the above, a proper understanding of surface water quality is highly desirable to ensure the safety of the end user and the sustainability of water resources.
Because WQ is a serious concern for surface water used by humans, animals and food production, and because the analytical evaluation of WQ indicators is laborious and time-consuming, there is an increasing need to develop advanced models that predict and forecast WQ parameters from different input variables, guiding decision makers in managing the water crisis across sectors and end users.
Improved management of surface water depends on the development of reliable computer-aided models for monitoring water quality. The main advantage of modeling WQ-related variables is therefore cost-effective and reliable management of water resources (Zhang et al., 2022). The basic reason is that these models are highly reliable and can accurately predict the variables associated with WQ (Al-Sulttani et al., 2021). Establishing an accurate WQ monitoring system is essential to ensure better WQ management in a given basin or catchment. One of the critical components of effective water resources management is a better understanding of the WQ-related properties of surface water (Khaleefa and Kamel, 2021, Zhu et al., 2022). Such an understanding would support the development of new, more cost-effective water treatment processes and improve the sustainability of WQ.
Among the many WQ variables, electrical conductivity (EC) is one of the key salinity indicators for optimizing irrigation and other water uses (Thompson et al., 2010). The EC of water samples is usually expressed in microsiemens per centimeter (µS/cm) and can be determined with an EC meter (Golnabi, 2011). EC is a significant indicator of WQ because it is governed by total dissolved solids (TDS). Since EC depends on TDS and is directly related to the dissolved ionic components in water, including magnesium (Mg²⁺), sulfate (SO₄²⁻), sodium (Na⁺), chloride (Cl⁻) and calcium (Ca²⁺), it is a crucial indicator of the presence of contaminants in water (Ahmadianfar et al., 2022). The ionic composition can have a significant negative impact on drinking WQ, in addition to affecting plant development. EC is therefore critically important in determining salinity risk to agriculture and drinking water.
Another important water parameter is the bicarbonate ion (HCO₃⁻), which indicates alkaline water and acts as a buffer that neutralizes acids (Fernandez et al., 2021). The bicarbonate concentration in water samples, generally expressed in milliequivalents per liter (mEq/L), was determined by sulfuric acid titration with a methyl orange indicator (Anderson and Yang, 1992). Bicarbonate is of great interest in the textile industry, in beverages and in cooling towers (Van Wyk and Scarpa, 1999). A pH value greater than 7.5 is another indication of high HCO₃⁻ concentration (Arshad and Shakoor, 2017). A bicarbonate salt forms when the negatively charged oxygen atom of the ion combines with a positively charged ion to form an ionic compound. Many bicarbonates are soluble in water at standard pressure and temperature; sodium bicarbonate, in particular, contributes to total dissolved solids, a typical measure used to determine WQ (Geor et al., 2013). Hence, the prior determination of HCO₃⁻ can contribute significantly to addressing various concerns related to freshwater resources, human health, and other uses such as irrigation water.
Models of WQ variables provide accurate assessment and prediction of surface WQ for efficient water resource management. This is one of the most important cutting-edge components of aquatic systems research (Zhang et al., 2022). Accurate simulation of WQ parameters is crucial for optimal resource management. Although traditional process-based modeling techniques provide excellent predictions of WQ parameters, they have several limitations: they are specific to a catchment area, suited only to certain types of stochastic data, and affected by data redundancy. For example, some models work with data sets that require a lot of computation time and contain unknown input data. In addition, WQ is influenced by many factors, so traditional data processing approaches are not efficient enough to solve this problem. These factors also show a complex non-linear relationship with the WQ parameters being predicted.
A literature review was conducted of the different WQ modeling approaches used to increase forecast accuracy. Several mathematical models based on statistical approaches have been developed, but some limitations still need to be addressed (Beck, 2013, Zieminska-Stolarska & Skrzypski, 2012). One of the main limitations of statistical models is that they assume linear, normally distributed relationships between predictors and response (Waghmare and Kiwne, 2017, Wu and Chau, 2006). Recent advances in soft computing have drawn remarkable interest from researchers in artificial intelligence (AI) models (Kang et al., 2017). AI models are increasingly used to solve environmental problems because of their exceptional ability to handle complex non-linear problems and their independence from a prior understanding of physical processes (Bayatvarkeshi et al., 2020, Jamei et al., 2021). This ability to handle highly non-linear systems, together with growing parallel computing capabilities, motivates researchers to implement machine learning (ML) models. Various versions of AI models have been adopted for river WQ simulation, such as those of Yaseen et al. (2018), complementary models (Song and Yao, 2022), joint models (Alnahit et al., 2022), and others (Anmala and Turuganti, 2021). Although AI models have been widely applied to stream WQ parameters in the literature, there is still room to explore new versions of AI models that can capture the natural behavior of WQ parameters more comprehensively. Fig. 1 shows the different ML approaches implemented for stream WQ simulation.
Over the last decade, environmental engineers have seen an exponential increase in the number of studies searching for the most capable computational models for surface WQ simulation, coinciding with important advances in modeling. Surface WQ is a natural problem that is difficult to understand fully. To assess it more accurately, new hybrid forms of AI models are being developed. This is due to some limitations of standalone AI models, such as the need for internal tuning of model parameters, data pooling, data cleaning, preprocessing, and other factors. One of the most challenging problems in modeling WQ indices in streams is choosing the correct and optimal combination of candidate inputs from the large number of available variables to feed ML models. This research gap shows that there is an urgent need for an efficient strategy based on powerful preprocessing. The main objective of this study is to provide two new ML frameworks based on Boruta data filtering, namely Boruta-XGBoost (BXGB) and Boruta-Extra Tree (BET), together with an integrated BSR scheme, coupled with the LSTM, ELNET, KRR and ERNN approaches, to simulate two WQ indicators (EC and HCO₃⁻) at the Soosan Plain and Shaloo Bridge stations on the Karun River in Iran. Three top candidate input combinations for EC and HCO₃⁻ were studied for eight hybrid models, namely BXGB-ERNN, BET-ERNN, BXGB-KRR, BET-KRR, BXGB-ELNET, BET-ELNET, BXGB-LSTM and BET-LSTM. In addition, various graphical and statistical validation tools were used to evaluate the performance of the models.
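The eight hybrid model names listed above follow from pairing each of the two Boruta-based feature filters with each of the four predictive models. The pairing can be sketched in Python as follows; the function name `hybrid_model_names` is hypothetical, and the snippet only enumerates the combinations named in the text, it does not reproduce the authors' implementation.

```python
# Feature filters and predictive models named in the text.
SELECTORS = ["BXGB", "BET"]
PREDICTORS = ["ERNN", "KRR", "ELNET", "LSTM"]


def hybrid_model_names(selectors=SELECTORS, predictors=PREDICTORS):
    """Return 'filter-predictor' labels, grouped by predictor (hypothetical helper)."""
    return [f"{s}-{p}" for p in predictors for s in selectors]


if __name__ == "__main__":
    for name in hybrid_model_names():
        print(name)
```

Each label denotes a two-stage pipeline: the filter selects the input features, and the predictor is then trained on the selected features only.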
Study area and data description
The Karun River, with an upstream catchment area of 67,257 km² and an extension of about 152 km, was selected as the study area. Located in southwestern Iran, this river originates in the Zagros Mountains and runs from northwest to southeast. Eventually, this river empties into the Persian Gulf. The average annual rainfall in this basin is 620 mm. In the present study, two hydrometric stations including Shaloo Bridge in Khuzestan Province and Soosan Plain in Kohgiluyeh and
Development and customization of models
One of the challenging problems in developing ML-based models with a large number of input features is selecting the correct and optimal combination of candidate inputs to feed the ML models. A widely used strategy in the literature is linear correlation scoring based on Pearson's correlation coefficient (Jamei et al., 2022b). In previous studies, such correlation-based input selection methods were regarded as a simple strategy suited to linear relationships.
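The linear correlation scoring described above can be illustrated with a short sketch. It ranks candidate inputs by the absolute value of Pearson's correlation coefficient with the target, and it also shows why the strategy is limited to linear relationships: an input that drives the target quadratically scores zero. The function names, feature names and numbers are hypothetical and chosen only for illustration; they are not taken from the study's data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no linear information
    return sxy / sqrt(sxx * syy)


def rank_inputs(features, target):
    """Rank candidate inputs by |r| with the target, strongest first."""
    scores = {name: pearson_r(values, target) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


if __name__ == "__main__":
    # Hypothetical data: the target equals the square of 'quadratic_driver'.
    target = [9, 4, 1, 0, 1, 4, 9]
    features = {
        "quadratic_driver": [-3, -2, -1, 0, 1, 2, 3],
        "linear_driver": [9.5, 4.2, 1.1, 0.3, 0.9, 4.1, 8.8],
    }
    for name, r in rank_inputs(features, target):
        print(f"{name}: r = {r:.3f}")
```

In this toy case the quadratic driver fully determines the target yet receives a score of zero, which is the kind of non-linear dependence that a purely linear score cannot detect.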
Results and Analysis
Based on the simulations, the BXGB-ERNN, BET-ERNN, BXGB-KRR, BET-KRR, BXGB-ELNET, BET-ELNET, BXGB-LSTM and BET-LSTM models were investigated extensively using evaluation metrics for the simulated HCO₃⁻ and EC parameters in the Karun River.
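The text here does not name the evaluation metrics. As an assumption, the sketch below computes two metrics commonly used to compare simulated and observed WQ series, the root mean square error (RMSE) and the correlation coefficient (R). The function names and sample values are hypothetical and are not results from this study.

```python
from math import sqrt


def rmse(observed, simulated):
    """Root mean square error between observed and simulated values."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return sqrt(sum(squared) / len(squared))


def correlation(observed, simulated):
    """Pearson correlation coefficient (R) between the two series."""
    n = len(observed)
    if n != len(simulated) or n < 2:
        raise ValueError("series must have equal length (at least 2)")
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    if var_o == 0 or var_s == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_o * var_s)


if __name__ == "__main__":
    # Hypothetical values in arbitrary units, for illustration only.
    observed = [1.0, 2.0, 3.0, 4.0]
    simulated = [2.0, 3.0, 4.0, 5.0]
    print("RMSE:", rmse(observed, simulated))
    print("R:", correlation(observed, simulated))
```

The toy series differ by a constant offset, so R is at its maximum while RMSE is not zero; this is one reason error-based and correlation-based metrics are usually reported together.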
Table 5 provides the results for the training and testing periods based on the input combinations C1, C2 and C3, where C1 appears to be the optimal input combination for simulating the HCO₃⁻ parameter. The BET-ERNN model achieved this
In this study, a hybrid modeling framework is developed to simulate the EC and HCO₃⁻ parameters, evaluated with goodness-of-fit metrics, based on BSR algorithms together with the BET and BXGB approaches integrated with the ERNN, ELNET, KRR and LSTM models. The BET-ERNN and BXGB-ERNN hybrids simulate EC and HCO₃⁻ better with the C1, C2 and C3 input combinations than the BXGB-KRR, BET-KRR, BXGB-ELNET, BET-ELNET, BXGB-LSTM and BET-LSTM models.
Fig. 12 shows boxplots of hybrid models based on BET and BXGB
This research examined the feasibility of hybrid models with multilevel preprocessing for simulating the EC and HCO₃⁻ parameters of the Karun River. The results suggest that the BET and BXGB models were suitable for obtaining the best input features to improve model accuracy. The results of the advanced modeling, together with the input feature schemes, showed that the BET-ERNN hybrid model performed best, followed by BXGB-ERNN, for the EC and HCO₃⁻ simulations
Conflict of Interest Statement
The authors declare that they are not aware of any competing financial interests or personal relationships that may have influenced the work described in this paper.
© 2023 Institution of Chemical Engineers. Published by Elsevier Ltd. All rights reserved.