In an article published in the journal Scientific Reports, researchers from Italy and Colombia developed an innovative approach to predict the transport and decay of substances like chlorine in water distribution networks (WDNs). They used evolutionary polynomial regression (EPR) to generate symbolic formulas for estimating the concentration of a substance at any network node based on the concentration at the source node and the reaction rate parameter.
Background
WDNs are systems of pipes that transport and deliver clean water to people. In these systems, water quality depends on the transport and decay of substances that can be either contaminants or disinfectants, such as chlorine. Chlorine is widely used to disinfect water and prevent the growth of pathogens. It also reacts with organic and inorganic compounds, forming disinfection by-products (DBPs) that can have adverse health effects. Therefore, monitoring chlorine transport and decay in WDNs is crucial to ensure a safe and adequate water supply.
The conventional method to model transport and decay in WDNs is based on solving differential equations that describe the advective diffusion and the kinetic reaction of chlorine across the pipe network. The kinetic reaction can be formulated using different orders of reaction, depending on the characteristics of the substance and the water quality. The most common models are first- and second-order kinetics, in which the decay rate is typically taken to be proportional to the concentration or to its square, respectively.
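The article does not write out the kinetic equations. As a minimal sketch, assuming the common bulk-decay form dC/dt = -k·C^n with n = 1 or n = 2 and a constant k, the closed-form solutions can be evaluated as follows. The function names, units, and numeric values are illustrative and are not taken from the paper.

```python
import math


def first_order(c0: float, k: float, t: float) -> float:
    """Solution of dC/dt = -k*C: exponential decay from c0."""
    return c0 * math.exp(-k * t)


def second_order(c0: float, k: float, t: float) -> float:
    """Solution of dC/dt = -k*C**2: hyperbolic decay from c0."""
    return c0 / (1.0 + k * c0 * t)


if __name__ == "__main__":
    c0 = 1.0  # assumed source concentration, mg/L
    k = 0.5   # assumed rate parameter (1/day for n=1, L/(mg*day) for n=2)
    for t in (0.0, 1.0, 2.0):  # assumed water age in days
        print(t, round(first_order(c0, k, t), 4), round(second_order(c0, k, t), 4))
```

Both functions take only the source concentration, the rate parameter, and a time, which are the same quantities the article lists as inputs to the symbolic formulas.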
About the Research
In the paper, the authors investigated the mechanism of substance transport and decay in WDNs using a data-driven approach based on symbolic machine learning, a technique that generates explicit and interpretable models from data in the form of symbolic formulas. The study used EPR, which combines evolutionary optimization and machine learning to search for optimal polynomial models. EPR models are polynomial expressions that relate the output variable to the input variables.
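To make the idea of searching for a polynomial structure concrete, here is a toy sketch. It is not the EPR algorithm used in the study: it replaces the evolutionary search with an exhaustive scan over a small set of candidate exponents, fits a single-term model y ≈ a·x1^e1·x2^e2 by least squares, and runs on synthetic data. All names and values are hypothetical.

```python
import itertools

CANDIDATE_EXPONENTS = (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0)


def fit_single_term(rows, targets):
    """Return (sse, coefficient, exponents) of the best y ~ a * x1**e1 * x2**e2."""
    best = None
    for exps in itertools.product(CANDIDATE_EXPONENTS, repeat=2):
        # Candidate monomial evaluated on every sample.
        z = [x1 ** exps[0] * x2 ** exps[1] for x1, x2 in rows]
        # Closed-form least-squares coefficient for a one-term linear model.
        a = sum(zi * yi for zi, yi in zip(z, targets)) / sum(zi * zi for zi in z)
        sse = sum((yi - a * zi) ** 2 for zi, yi in zip(z, targets))
        if best is None or sse < best[0]:
            best = (sse, a, exps)
    return best


# Synthetic, noise-free data generated from y = 3 * x1 / sqrt(x2).
rows = [(x1, x2) for x1 in (0.5, 1.0, 2.0) for x2 in (1.0, 4.0, 9.0)]
targets = [3.0 * x1 * x2 ** -0.5 for x1, x2 in rows]

sse, a, exps = fit_single_term(rows, targets)
print(exps, round(a, 6), sse)
```

On this synthetic data the scan recovers the generating exponents and coefficient. The method described in the article goes further: it searches the structure with evolutionary optimization and judges candidate models on complexity and physical consistency as well as accuracy.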
The researchers used EPR to analyze data generated by water quality simulations performed on three WDNs, including a simple branched network, a small-looped network, and a real network with both branches and loops, using first or second-order kinetics of substance decay. The assumed substance was chlorine. However, the results can be generalized for any substance having similar kinetics. The data consisted of the substance concentration at each node of the network, as well as the hydraulic and water quality variables that influence the transport and decay process, such as the flow rate, the velocity, the concentration at the source node, the reaction rate parameter, the water age, and the travel time along the shortest paths.
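One of the inputs listed above, the travel time along the shortest path from the source, can be computed with a standard shortest-path search. The sketch below uses Dijkstra's algorithm on a small hypothetical looped network in which each pipe's weight is its length divided by its flow velocity. The network, node names, and numbers are invented for illustration, and the article does not state which algorithm the authors used.

```python
import heapq


def shortest_travel_times(pipes, source):
    """Dijkstra over directed pipes given as (from_node, to_node, length_m, velocity_m_s)."""
    graph = {}
    for start, end, length, velocity in pipes:
        graph.setdefault(start, []).append((end, length / velocity))  # seconds
        graph.setdefault(end, [])
    times = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        t, node = heapq.heappop(queue)
        if t > times.get(node, float("inf")):
            continue  # stale queue entry, a shorter time was already found
        for neighbour, dt in graph[node]:
            candidate = t + dt
            if candidate < times.get(neighbour, float("inf")):
                times[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return times


# Hypothetical network: pipe directions follow an assumed steady-state flow.
pipes = [
    ("R", "A", 500.0, 1.0),  # 500 s
    ("R", "B", 600.0, 0.5),  # 1200 s
    ("A", "C", 400.0, 0.5),  # 800 s
    ("B", "C", 300.0, 1.0),  # 300 s
]
print(shortest_travel_times(pipes, "R"))
```

Node C can be reached through A (1300 s) or through B (1500 s); the search keeps the faster route, and that travel time is the value that would be passed to a model as an input.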
The study applied EPR to the data, using different sets of inputs, to obtain predictive symbolic models of the substance concentration at any node. The EPR models were evaluated in terms of accuracy, complexity, and physical consistency. They were also compared with the analytical solutions of first- and second-order kinetics to assess whether EPR can identify the order of reaction from data.
Research Findings
The results showed that the EPR models provided simple and explicit formulas that accurately predict the substance concentration at any node of the network, given the concentration at the source node, the reaction rate parameter, and the water age or the travel time along the shortest paths. The models showed high physical consistency, resembling the analytical solutions of first- or second-order kinetics, depending on the data. This means that EPR can discover the intrinsic mechanism of substance transport and decay from data and distinguish between different orders of reaction.
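To illustrate what distinguishing reaction orders from data can look like, the sketch below uses a classical linearization rather than EPR. Assuming the decay law dC/dt = -k·C^n, ln(C0/C) grows linearly with time when n = 1, and 1/C - 1/C0 grows linearly with time when n = 2. Fitting a line through the origin to each transform and comparing the residuals selects the order and gives an estimate of k. The function names and the synthetic data are assumptions.

```python
import math


def slope_and_sse(times, values):
    """Least-squares line through the origin: values ~ k * times."""
    k = sum(t * v for t, v in zip(times, values)) / sum(t * t for t in times)
    sse = sum((v - k * t) ** 2 for t, v in zip(times, values))
    return k, sse


def identify_order(c0, times, concentrations):
    """Return (order, k) for whichever linearised model fits better."""
    first = slope_and_sse(times, [math.log(c0 / c) for c in concentrations])
    second = slope_and_sse(times, [1.0 / c - 1.0 / c0 for c in concentrations])
    return (1, first[0]) if first[1] <= second[1] else (2, second[0])


# Synthetic water ages (hours) and concentrations from second-order decay.
c0, k_true = 1.0, 0.2
ages = [1.0, 2.0, 4.0, 8.0, 16.0]
conc = [c0 / (1.0 + k_true * c0 * t) for t in ages]

print(identify_order(c0, ages, conc))
```

The two transforms live on different scales, so comparing raw residuals is only safe for clean data such as this; with noisy measurements a normalized measure such as the coefficient of determination would be a better criterion.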
The study demonstrated that the EPR models were robust and generalizable, as they performed well on different WDNs, including a real one, and under different scenarios of substance concentration and reaction rate parameters. They were also validated on unseen data generated by water quality simulations with a different reaction rate parameter for each pipe, with satisfactory results.
The results also indicated that the travel time along the shortest paths can be used as a surrogate for the water age: models using travel time were as accurate as, or only slightly less accurate than, models using water age. This implies that substance transport and decay in WDNs are mainly determined by the shortest paths between the source and destination nodes, with secondary paths having a minor influence.
Applications
The new approach can provide an accurate estimate of the substance concentration at any node of the network, given the concentration at the source node and the reaction rate parameter, without solving the differential equations. It could also support real-time monitoring and control of water quality and aid in the planning and design of water distribution systems. In addition, the EPR models can be used to calibrate the kinetic model parameters from real data, providing both a physical interpretation and a range of values for the reaction rate parameter.
Conclusion
In summary, the novel approach is efficient and effective for understanding and predicting the decay of substances in water distribution networks. This approach holds the potential for water quality analysis, calibration, and optimization in drinking water infrastructures.
The researchers acknowledged limitations and challenges and suggested that future work could incorporate transient flow dynamics and unsteady conditions, address uncertainty in model parameters such as the reaction rate, and extend the approach to model interactions between multiple substances, including chlorine and DBPs. They also recommended integrating the EPR models into optimization frameworks for water quality management and investigating whether the approach transfers to other domains, such as wastewater networks and environmental systems.