key: cord-0829993-9na5gqbg
authors: Doorn, Neelke
title: Artificial Intelligence in the Water Domain: Opportunities for Responsible Use
date: 2020-09-29
journal: Sci Total Environ
DOI: 10.1016/j.scitotenv.2020.142561
sha: aad6b22cbae6c7907d27332f6b636c1d18b3d2f2
doc_id: 829993
cord_uid: 9na5gqbg

Recent years have seen a rise of techniques based on artificial intelligence (AI). With that have also come initiatives for guidance on how to develop “responsible AI” aligned with human and ethical values. Compared to sectors like energy, healthcare, or transportation, the use of AI-based techniques in the water domain is relatively modest. This paper presents a review of current AI applications in the water domain and develops some tentative insights as to what “responsible AI” could mean there. Building on the reviewed literature, four categories of application are identified: modeling, prediction and forecasting, decision support and operational management, and optimization. We also identify three insights pertaining to the water sector in particular: the use of AI techniques in general, and many-objective optimization in particular, that allow for a pluralism of values and changing values; the use of theory-guided data science, which can avoid some of the pitfalls of strictly data-driven models; and the ability to build on experiences with participatory decision-making in the water sector. These insights suggest that the development and application of responsible AI techniques for the water sector should not be left to data scientists alone, but requires concerted effort by water professionals and data scientists working together, complemented with expertise from the social sciences and humanities.

genetic computation are widely used in the water domain. The Hidden Markov Model is a statistical technique that can be used to recover a data sequence (time series) that is not immediately observable, but where other data that depend on the sequence are observable.
It is therefore also considered of relevance for modelling activities in the water domain. After that, the same search string was applied again, but now to the "topic", which includes not only the article title but also the article abstract and keywords. This way, more technical papers that did not use any of the more general AI-related terms or the application field in the title were also identified. Since this second search led to too large a set of papers to consider (>12,000), only the 100 most-cited papers of this second search were included. While the search strategy may have a slight bias towards less technical and more applied papers, we consider this justified as the aim is to identify the most relevant types of applications of AI. Also, with the addition of the more specific AI techniques in the search string, these terms are expected to cover a sufficiently broad range of AI applications in the water domain. All subject areas in the database were included. The search was performed on all article types published before 2020. The following search strategy was used.

(1) TITLE/TOPIC: ("Artificial Intelligen*" OR AI OR "Machine Learning" OR ((Evolutionary OR Genetic) AND (Algorith* OR Computation*)) OR "Hidden Markov") AND TITLE/TOPIC: Water AND TIMESPAN: 1900
(2) TITLE: ("Artificial Intelligen*" OR AI OR "Machine Learning" OR ((Evolutionary OR Genetic) AND (Algorith* OR Computation*)) OR "Hidden Markov") AND TITLE: (ethic* OR responsible) AND TIMESPAN: 1900

After both searches, the academic literature found was assessed in two steps. This resulted in 855 unique publications in Search_Water and 216 in Search_Ethics. Their abstracts were then sifted for relevance, to exclude papers not about AI and water (Search_Water) or AI and ethics (Search_Ethics). This produced a set of 601 relevant papers on AI and water (Search_Water), and 187 on AI and ethics (Search_Ethics), which were then scanned on a full-paper basis.
Papers in Search_Water that did not describe a specific application of AI were excluded from further analysis. Papers that did not focus explicitly on the water domain were also excluded. Papers in Search_Ethics that did not discuss specific ethical issues, values, principles, or guidelines of AI were excluded from the search set. See Figure 1 for a flow diagram of the selection process.

Looking at the results of Search_Water, the applications vary widely in terms of scope and real-world impact, but it is still possible to define some basic categories of application. In Section 5, the scope and impact of the different applications are discussed in more detail. The four main categories of application currently being discussed in the literature are modeling, prediction and forecasting, decision support and operational management, and optimization. A very few papers discuss cybersecurity, but this is too little to warrant discussion as a separate category. In this section, only the four main categories are discussed, but these should not be considered as exclusive or exhaustive.

Within the set of articles in Search_Water, 130 (out of 601; 22%) referred to modeling. In these articles, the use of AI was typically described in terms of improving accuracy or reducing uncertainty, or as an efficient way to gather information that would otherwise be difficult to establish. Within this category, a few papers that used AI in computational fluid mechanics were also found. Here, AI was mostly used as an alternative to solving the algebraic equations numerically (e.g. Shang 2005). One example where AI has been used to improve the accuracy of a model is a modeling study of the Andong Dam watershed in Korea. Seo et al. (2015) describe two approaches to model water reservoir systems. Conventional modeling (that is, not using AI techniques) uses statistical models based on time series analysis.
However, most of these are linear models, which makes them less suitable for modeling complex hydrological systems that show highly non-linear and non-stationary behavior. The authors compare two AI techniques to model the non-linear behavior of the Andong Dam watershed: artificial neural networks and adaptive neuro-fuzzy inference systems. Artificial neural networks (also known as connectionist models) contain a set of algorithms loosely inspired by the way the human brain works. These models are designed to recognize patterns by considering examples, from which they generate identifying characteristics, without being programmed with task-specific rules (Dawson and Wilby 2001). Adaptive neuro-fuzzy inference systems combine neural networks with fuzzy logic models, which are mathematical means of representing vagueness and imprecise information and which are therefore especially suitable when handling data that is vague and lacks certainty (Jang 1993). Seo et al. observe improved behavior of their models as compared to conventional methods. The added value of AI here lies in its ability to model non-linear systems, which improves the accuracy of these models compared to conventional ones based on linear input-output relations.

Many applications used genetic or evolutionary computation to reduce uncertainty. For example, Yu et al. (2019) used a so-called noisy genetic algorithm (NGA) in the context of sustainable water reservoir operation under stochastic inflow conditions. Sustainable reservoir operation requires that operation rules are not only based on satisfying utility demands but also take into account the necessary environmental flow conditions downstream. Meeting both the utility demand and the environmental flow demand is challenging given the stochastic nature of the inflow conditions.
Existing studies in sustainable reservoir operation are often based on deterministic inflow conditions (such as historical inflows) as an input for optimization, but these prove to be inaccurate. Alternatively, Monte Carlo simulation or comparable simulation tools can be used to handle the stochastic variables. However, these often require several runs of the tool, which is computationally demanding. By contrast, NGAs can perform well without having to sample a large number of realizations in each optimization. The NGA differs from a standard GA in that the fitness function is replaced by a sampling fitness function based on expected fitness rather than deterministic fitness. In an empirical study of the Tanghe reservoir in China, Yu et al. found a reduction in the computation time of 90% compared to a Monte Carlo simulation of the same reservoir and an increased performance in terms of the utility demand and the environmental flows demand.

Studies that use ML to interpret images or other products of remote sensing to obtain spatial information about some relevant property are an example where AI generates information that would otherwise be difficult to establish. Huang et al. (2015) used ML techniques to infer different types of bodies of water from urban high-resolution remote-sensing images. They found that AI has clear added value compared to conventional methods based on in-situ measurements, in particular in classifying different types of bodies of water and in the associated water extraction. Similar applications are reported by Acharya et al. (2019), who compared different ML techniques to provide information on surface water extraction in Nepal, and by Bair et al. (2018), who used ML techniques to estimate the so-called snow water equivalent (SWE) in mountain watersheds. What these papers have in common is that the spatial distribution of the property being looked for is heterogeneous, so that sparse networks of sensors fail to characterize that heterogeneity.
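To make the noisy genetic algorithm idea described earlier in this section more tangible, the following minimal Python sketch replaces a deterministic fitness with a sampling fitness, that is, the mean score over a handful of randomly drawn inflow realizations. It is not the implementation of Yu et al. (2019): the toy reservoir, the function names (`deterministic_fitness`, `sampling_fitness`, `noisy_ga`), the inflow distribution, and all numerical values are assumptions chosen purely for illustration.

```python
import random
import statistics


def deterministic_fitness(release, inflow, demand=50.0, env_flow=20.0):
    """Score one inflow realization (higher is better; 0 means no shortfall).

    All quantities are hypothetical volumes per time step.
    """
    supplied = min(release, inflow)      # cannot release more than arrives
    downstream = inflow - supplied       # water left for the river downstream
    shortfall = max(0.0, demand - supplied) + max(0.0, env_flow - downstream)
    return -shortfall


def sampling_fitness(release, rng, n_samples=5):
    """Estimate expected fitness from a few sampled inflows (assumed Gaussian)."""
    inflows = [max(0.0, rng.gauss(80.0, 15.0)) for _ in range(n_samples)]
    return statistics.mean(deterministic_fitness(release, q) for q in inflows)


def noisy_ga(generations=40, pop_size=20, seed=0):
    """Very small GA over a single decision variable: the target release."""
    rng = random.Random(seed)
    population = [rng.uniform(0.0, 100.0) for _ in range(pop_size)]
    for _ in range(generations):
        # Rank candidates by their noisy (sampled) fitness estimate.
        ranked = sorted(population,
                        key=lambda r: sampling_fitness(r, rng),
                        reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            child = 0.5 * (a + b) + rng.gauss(0.0, 2.0)  # crossover + mutation
            children.append(min(100.0, max(0.0, child)))
        population = parents + children
    # Final pick uses a larger sample to reduce the noise in the comparison.
    return max(population, key=lambda r: sampling_fitness(r, rng, n_samples=200))


if __name__ == "__main__":
    print("suggested release:", round(noisy_ga(), 1))
```

The design choice illustrated here is that each candidate is scored on only a few sampled inflows per generation, instead of running a full Monte Carlo evaluation for every candidate; the selection step then has to tolerate noisy fitness estimates.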
Within the set of articles in Search_Water, 78 (out of 601; 13%) referred to prediction and/or forecasting. In these articles, the use of AI was typically described in terms of its ability to develop predictive models that do not rely on site-specific characteristics or that are applicable to a wide range of environmental conditions. Although there is a clear overlap with the modeling category discussed above, and quite a few papers are identified as referring both to modeling and to prediction/forecasting, the papers discussed in the category prediction and forecasting all have a time dimension, which is not the case for all those categorized under modeling.

For example, Singh and Gupta (2012) developed different predictive models to forecast the formation of trihalomethanes (THMs) in chlorinated waters, which poses a high risk to humans. For adequate control of THMs, their levels in the water need to be known, but determining those levels in the laboratory is too costly and time-consuming. With AI-based predictive models, THM formation can be predicted based on parameters that are easier to determine, such as water pH, temperature, and the concentration of more easily measured chemical compounds, such as bromide. The results indicate that the AI-based models are capable of capturing the complex non-linear relationship between water conditions and corresponding THM formation.

Another example of prediction is a study that explores the performance of ML models for the forecasting of water quality parameters in coastal waters fed by contributing streams that carry potentially polluted water (Alizadeh et al. 2018). In this case, hourly recorded water quality parameters of salinity, temperature and turbidity in Hilo Bay (Hawaii, USA) were combined with flow data from the contributing Wailuku River.
Several ML techniques were used to investigate the impact of the river's flow on the water quality parameters from the current time up to 2 hours ahead. The researchers found that water quality parameters can be forecast adequately up to several hours in advance, which may provide valuable information for environmental management and monitoring in coastal areas.

Within the set of articles in Search_Water, 183 (out of 601; 30%) referred explicitly or implicitly to decision support and operational management. This category again partly overlaps with the others, but because of its own specific focus it deserves separate consideration. Many articles in this category focus on event detection and early warning.

One example where AI was used for the detection of accidental water contamination is found in the paper by Arnon et al. (2019). It focuses on water contamination from organic components, which is traditionally detected by inferring "routine" water patterns from measurements of indicator or surrogate physical and chemical parameters of presumably non-contaminated water. Deviations from these routine patterns are then taken as a sign of contamination. Typical physical and chemical parameters used to describe these routine patterns are the water's turbidity, oxidation-reduction potential, free chlorine, and conductivity. However, non-contaminating processes, such as changes in maintenance and operational procedures, may also lead to changes in the physical and chemical parameters, rendering identification of "real" contamination difficult. Hence, this procedure is conditional on stable background conditions. In this paper, a methodology is developed to infer water contamination from the spectroscopic properties of water (UV absorbance spectra). The challenge is to select the relevant features for the given classification problem ("contaminated" or "potable").
Based on an affinity measure combining Pearson correlation and Euclidean distance between the tested absorbance spectrum and the characteristic absorbance of the water types flowing in the water distribution system, an algorithm was developed to detect water anomalies. The innovative part was in the use of a feature selection algorithm that strengthened the relevant differences between the potable and contaminated water. The authors used Gram-based amplification, which allows for computationally efficient feature selection (Ramona, Richard, and David 2012). A database of uncontaminated water sources was used to train the detection algorithm. The algorithm proved successful in detecting contaminants at relatively low concentrations with a very low rate of false positives, under stochastic and highly varying water characteristics.

Bagriacik et al. (2018) compared different AI-based models to detect damage to pipes in the aftermath of an earthquake. The models were evaluated in terms of their ability to accurately predict the total number and approximate spatial distribution of damaged pipes, to correctly classify each individual pipe as damaged or not, and to describe the relative importance of pipe and earthquake attributes in predicting damage.

Yuan et al. (2019) used ML techniques to distinguish eruption and precursory signals of a geyser under noisy environmental conditions. Building on seismic data from the Chimayó geyser (New Mexico, USA), the authors developed a method to filter the relevant signals from background noises, such as daily temperature variations, animal movement near the geyser, and human activity, and to classify the filtered data into three classes of geyser state: remnant noise, precursor, and eruption states. The proposed ML approach demonstrates an ability to extract eruption and precursory signals from background noise, which makes it very suitable for providing real-time actionable information.
Although developed for the detection of geyser eruptions, this model has strong usage potential for other noisy environments where the detection of anomalies from time series is challenging (Yuan et al. 2019).

A few articles combined the quantitative and physical engineering models of some socio-physical system with models that simulate the behavior of the human actors within the system. In the context of the detection and prevention of water contamination, Zechman (2013) developed a model that combined agent-based modelling with evolutionary algorithms to develop and evaluate different threat management strategies. Mewes and Schumann (2019) developed an agent-based irrigation planning model with a machine learning-based training component that is able to identify the current hydrological situation and adapt irrigation and cropping schemes accordingly within the model at runtime.

Within the set of articles in Search_Water, 211 (out of 601; 35%) referred to optimization. In these articles, the use of AI was typically described in terms of its ability to provide necessary information to optimize processes or to identify optimal solutions to complex, often ill-structured problems. Although the distinction between modeling and prediction and forecasting on the one hand and optimization on the other is not watertight, optimization tools generally have the capability of systematically determining optimal water planning and management decisions. However, this may come at the expense of sacrificing accurate representation of the underlying water system behavior. Although advances in AI may challenge the very distinction between optimization and the applications discussed under modeling (Labadie 2014), for the sake of providing an overview of different types of applications, it may still be useful to describe optimization as a separate category. Guo et al. (2017) describe the use of AI for precision irrigation.
Accurate information on the water status of a plant root system is essential for precision irrigation, but also difficult to establish. In this study, ML techniques were combined with phenotyping to develop a discrimination method for plant root zone water status in a greenhouse, thereby allowing for more precise and more economical allocation of water.

A different type of optimization was used in the study by Strobl and Robillard (2006), who explored the usefulness of AI techniques in the design of a water quality monitoring network: a classical yet complex optimization problem that requires an optimal configuration of sensors to ensure maximum information extraction from the water quality data collected. In this study, the authors explored potentially applicable AI techniques, including expert systems, artificial neural networks, genetic algorithms, and fuzzy logic systems. The review included a relatively large number of similar applications that used AI for the design of some system or water network.

Another example of optimization is found in the paper by Kumar et al. (2018), where AI was used to allocate resources for pro-active maintenance. Around the world, many vital infrastructures are showing their age, which prompts the question as to what constitutes an adequate maintenance scheme for them. Although maintaining infrastructure that is still in a healthy condition is usually cheaper than repairing failing infrastructure, resources are often limited. The authors focus on water supply infrastructures, which are particularly vulnerable to water main breaks. These not only cause major disruptions to everyday life for residents and businesses, but are costly as well. They also typically occur without any prior warning, making it difficult to schedule maintenance and to decide which mains should be replaced proactively before a break occurs. Moreover, the magnitude and history of several factors contributing to deterioration and failure of the mains are often unknown.
These include factors such as the relevant chemical parameters of the water inside the pipe and in the soil, pipe production parameters, and external loading. Using historical data on which mains have failed previously, descriptors of pipes, and other data sources, the authors applied ML techniques to assess the risk of a water main breaking within the next three years. The resulting model provides risk scores for each of the city's blocks, allowing the authorities to schedule maintenance proactively rather than reactively repairing or replacing mains that have broken. In addition to optimized allocation of resources, the authors also see a potential for the model to provide insights into factors that are important in predicting water main failures.

Lastly, quite a number of papers referred to the use of multi-objective and many-objective optimization techniques for trade-off analysis. Multi-objective (in the case of two or three objectives) and many-objective optimization (in the case of more than three objectives) use genetic computation techniques to generate alternatives for complex planning problems, enabling the discovery of the key trade-offs that need to be made between relevant decision parameters (Kasprzyk et al. 2013). This approach is especially relevant for design processes that are multi-dimensional and characterized by uncertainty, of which there are many in the water domain. Examples discussed include water supply (Kasprzyk et al. 2013), urban storm-water runoff (di Pierro, Khu, and Savic 2006), and water quality management (Chatterjee et al. 2017). Strikingly, these were also the articles that paid most attention to how end users (planners, water managers) could use these techniques in daily practice.

This rough overview shows the potential of AI-based techniques in the water domain.
Because of the complexity of many processes in this sector, the strength of data-driven over theory-driven models may be that they are better capable of capturing non-linearity in the relevant physical processes. This is facilitated by the increasing availability of relevant sensor data. When it comes to design and planning, we see many applications focusing on the optimal design of water distribution networks. Many of the approaches described in the academic papers have also found their way to real-life applications (Van Thienen et al. 2018). When it comes to the application of ML to decision-making, Hadjimichael et al. (2016) report that only a very small proportion of the academic literature on ML and water deals with decision-making, which suggests that water is lagging behind broadly comparable sectors like energy and logistics (Van Thienen 2019).

One of the discussions within the field of AI generally is how to develop "responsible" AI; that is, how to make sure that it is consistent with important ethical values. To assess the extent to which this discussion is also taking place within the subset of AI literature focusing on water, we evaluated how many of the papers in the list of publications from Search_Ethics are cited by any of the papers in Search_Water. Web of Science identified 782 articles that cited one or more of those in the set Search_Ethics, but there was no overlap between this set of citing articles and the publications in the set Search_Water. In other words, none of the identified articles on AI in the water domain cited any of the articles on AI and ethics in Web of Science. This suggests that there is little or no discussion of the ethical aspects of AI in the water domain (see footnote 1).

Looking at how ethics in relation to AI is discussed in the literature, we see two main strands.
One focuses on the ethical challenges that AI and ML pose, often linked to specific application domains, such as warfare (Russell 2015), medicine (Char, Shah, and Magnus 2018), or labor (Torresen 2018). The other focuses on the values or principles that should guide AI and ML developments (Cath 2018; Dignum 2018). In the past five years, we have also seen a growth of non-academic institutions developing principles and guidelines for AI. Combining these different strands suggests a general convergence on the relevant ethical principles, but at the same time also substantive divergence on how these principles are to be interpreted, let alone implemented in different domains (Jobin, Ienca, and Vayena 2019). Arguably, the five most prominent principles mentioned in the literature are transparency, justice and fairness, responsibility and accountability, privacy, and non-maleficence, each of which can be linked to concerns raised by AI applications.

Footnote 1: Of course, it cannot be ruled out that some relevant gray literature on ethics and AI/ML or some that uses slightly different terminology is cited by the articles in Search_Water, but the lack of any citations to the articles in Search_Ethics at least suggests that the topic of ethics does not feature prominently in the literature on water and AI/ML.

Transparency is often mentioned in response to the perceived black-box nature of AI in general and ML in particular. Decision-making by AI systems, especially those based on deep learning techniques, is frequently said to be opaque, in the sense that, once the algorithms are trained with data, it is practically impossible to examine the internal structure of those algorithms. This makes it difficult to understand why and how they produce a particular outcome (Winfield and Jirotka 2018).

relevance. Some scholars stress that responsibility is a matter concerning the people constructing and using the technology and not something that applies to the technology itself.
Despite their ability to learn, AI systems are ultimately artifacts constructed by people (Dignum 2018). Others develop guidelines on how to have humans or society "in the loop" (Rahwan 2018) or conceptualize what it means to have "meaningful human control" (Santoni de Sio and van den Hoven 2018), so that ultimately there are people who could intervene and who could also be held accountable in case of undesirable decisions.

Privacy relates to the use of data and the need to protect people's right to privacy. Approaches like "privacy by design" stress the need to minimize the collection of data and to prevent privacy issues from materializing rather than solving them once they occur (Cavoukian 2010). Others view privacy primarily as a value or right to be protected through regulation (Gürses and del Alamo 2016).

Non-maleficence is a general principle stemming from medical ethics (Beauchamp and Childress 2001[1994]), indicating an intention to avoid needless harm or injury. In the context of AI, it is often interpreted as the need to protect safety and security (Beil et al. 2019). While the appeal to safety and security could be read as a principle independent from the ones discussed above, many appeals to non-maleficence concern the prevention of infringements of personal privacy and the responsibility to avoid misuse (Floridi and Cowls 2019), which suggests that the different principles have some overlap.

Application of the five ethical principles to the use of AI techniques in the water domain shows that those principles require further operationalization into action-guiding design recommendations for specific AI systems in the water domain. In order to do so, let us first look at what characterizes the water domain and discuss a few typical AI applications in context.
Typical of the water domain is that it is a largely public domain (Rogers, Silva, and Bhatia 2002) and that water is seen as a resource that should be available to all people, at least in some sufficient quantity and quality (Gleick 1998). Secondly, many management issues in the water domain are characterized by scarcity, both of the water itself and in terms of the space and funds available for building and upgrading the relevant infrastructure, and by conflicting uses (Hajkowicz and Collins 2007). Additionally, many water problems have a spatial character (Gerlak et al. 2018), often also in combination with a dedicated infrastructure (Clark, Hakim, and Ostfeld 2011). Lastly, and this holds especially in relation to climate change, many problems in the water sector are characterized by uncertainty (WWAP 2012). All these aspects pose ethical challenges in themselves, even without the use of AI (e.g. Groenfeldt 2013; Brown and Smith 2010; Doorn 2013). While these features are not unique to water and not all papers in the review explicitly address all these aspects, many of the papers touch upon at least some of them.

Based on these characterizations of the water domain and the AI applications discussed in Section 4, let us look at some concrete applications that are already in use, or whose use in the near future is conceivable, and that may have a substantial impact on society and the natural environment. While the scientific literature is obviously concerned with the most cutting-edge AI techniques, more conventional data science techniques are already deployed by the sector at a larger scale.

A first typical example concerns wastewater-based (or sewage-based) epidemiology (WBE), which refers to the analysis of pollutants and biomarkers in raw wastewater, using techniques from biochemistry and bioinformatics (Lorenzo and Picó 2019).
It can be used to provide real-time information on the production and consumption of legal and illegal drugs, exposure to certain agents (e.g. pesticides or persistent organic pollutants), incidence of specific diseases (e.g. diabetes or cancer), and determination of some lifestyle consequences (e.g. exposure to personal care products or consumption of doping substances). Recently, WBE has been used in the early detection of SARS-CoV-2 (Orive, Lertxundi, and Barcelo 2020). Although current WBE is based on rather conventional data science, the data may be used as input for algorithmic decision-making and profiling, for example for the detection of consumption or production of illegal drugs. The technique then becomes ethically sensitive, prompting questions related to privacy and non-discrimination. Increased police surveillance in areas with allegedly high illegal drug use may go against principles of non-discrimination. While the use of WBE for early infection detection may warrant low aggregation levels of relevant data to allow for specific profiling, this may be problematic when done for the purpose of surveillance in criminal investigations. Currently, samples for WBE are mostly taken at the waste water treatment plant, and thus at high aggregation levels. However, the technique in principle allows for analysis at lower aggregation levels, such as streets or households, which prompts the need to implement safeguards so that personal data is not used for illegitimate purposes in those cases.

A second example concerns algorithmic water allocation. Many applications discussed in Section 4 involved the optimization of water allocation and a better match of water supply and demand. This may eventually result in algorithmic decisions on how to allocate water. While the prudent allocation of scarce resources is to be welcomed, optimization ultimately depends on the formulation of the objective function to be optimized.
The water ethics literature already discusses how existing approaches aimed at improving "water efficiency" and "water productivity" often deprive smallholders of water use rights and harm local livelihood and production strategies (Boelens and Vos 2012), and this effect may be exacerbated with increased focus on optimization. Hence, naïve optimization may have negative effects on distributive justice and even conflict with the legally recognized human right to water. A more radical critique along these lines could be derived from the so-called commonwealth-of-life idea. While the use of AI or data science is not an explicit topic within the water ethics literature, proponents of the commonwealth-of-life idea argue that the very notion of water allocation already reflects an overly anthropocentric attitude towards water as a "resource" to be managed by and to the benefit of humans (Brown 2010). This denies the intrinsic value of water itself. The use of algorithms may add to this concern.

Another example of optimization concerns water user profiling. Water demand data at the household level, combined with, for example, GIS data and census data, allow for evaluation of demand information in a way that it "can help water utilities to plan the water supply system in an optimal manner to meet demand" (Ghavidelfar, Shamseldin, and Melville 2017: p. 2184). However, the use of this same information may also prompt ethical questions when it affects the water accessibility of some specific users.

Also related to water use is the increased use of nudging tools. Largely drawing from psychological research in decision-making, nudges aim at influencing people to make better decisions by presenting choices in specific ways, while leaving their freedom of choice intact (Thaler and Sunstein 2008).
Nudges are often presented as an effective alternative to traditional monetary or regulatory policy interventions to achieve some public goal, even though critics worry that nudges are manipulative (White 2013) or coercive (Hausman and Welch 2010) and thereby a violation of human dignity and autonomy. Generic nudging in the water sector is already commonplace, for example when households or farmers are provided with information on how their own water use relates to that of comparable users to "nudge" them into more conservative behavior (e.g. Chabé-Ferret et al. 2019; Miranda, Datta, and Zoratto 2019). While generic nudges seem justifiable conditional on due respect for personal autonomy and human dignity (Schmidt and Engelen 2020), AI-based techniques open the possibility for targeted micronudging, where nudges are tailored to specific individuals. This opens a whole new set of ethical questions beyond human dignity and autonomy, especially when used by public or semi-public actors to achieve a public goal, such as water saving. These personalized nudges often involve data or tools from private companies, which are not under democratic control in the way that public or semi-public actors are (Schmidt 2017). Moreover, when some users are treated differently than others, or when some user groups are nudged into behavior that is beneficial to other people whereas others are not, targeted nudging may conflict with basic demands of justice and equal treatment.

A last example concerns the use of AI techniques for algorithmic investment prioritization generally and maintenance scheduling specifically. The discussion of pipe leak detection in Section 4 is an example where AI is used to detect which parts of a water infrastructure are in need of improvement (Kumar et al. 2018).
While it seems in principle a rational approach to schedule maintenance based on failure probabilities, when combined with data about the expected damage, the algorithm may put the infrastructure parts in the more affluent areas higher on the list. This prompts questions about justice: who gets better infrastructure and for what reasons?

These examples show that responsible AI in the water sector is not only a matter of constraining certain types of use or application, but also one of harnessing its potential (Taddeo and Floridi 2018). The use of AI techniques may help us to address some of the challenges that the water domain is currently faced with, but it may at the same time pave the way to a world characterized by profiling and one-sided optimization. Responsible AI is therefore as much a question of responsible AI techniques as one of responsible governance. And while the use of AI techniques may introduce new ethical concerns related to privacy, transparency, and non-maleficence, it may also alter ongoing discussions in the water ethics literature. Strikingly, the legally recognized human right to water is already formulated in terms of access to relevant information (CESCR 2002) and this may become only more relevant with the increased use of AI and data science techniques. Discussions on responsibility in relation to water now primarily focus on the forward-looking responsibility to act, for example in relation to flood prevention (Doorn 2016) or water service delivery (Koehler 2018), but with the increased use of AI and algorithmic decision-making, the attention may shift towards backward-looking notions of responsibility, such as accountability and liability, which focus not so much on "who should act" but more on "who should repair" or "who should explain how a decision came about".
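The maintenance-scheduling concern raised above can be made concrete with a minimal sketch. It assumes a simple risk score (failure probability multiplied by a consequence measure); the segment names, probabilities, damage figures, and household counts are hypothetical and are not taken from the reviewed studies. The point is only that the ranking depends on how the consequence is measured.

```python
from dataclasses import dataclass


@dataclass
class PipeSegment:
    name: str
    failure_prob: float     # hypothetical annual failure probability
    expected_damage: float  # hypothetical monetary damage if the segment fails
    households: int         # hypothetical number of households served


def rank_by_monetary_risk(segments):
    """Rank segments by failure probability times expected monetary damage."""
    return sorted(segments,
                  key=lambda s: s.failure_prob * s.expected_damage,
                  reverse=True)


def rank_by_households_at_risk(segments):
    """Rank segments by failure probability times households affected."""
    return sorted(segments,
                  key=lambda s: s.failure_prob * s.households,
                  reverse=True)


# Illustrative data only: property values make damage higher in the affluent area,
# even though the other segment fails more often and serves more households.
segments = [
    PipeSegment("affluent_district", 0.05, 900_000.0, 400),
    PipeSegment("low_income_district", 0.10, 200_000.0, 1200),
]

if __name__ == "__main__":
    print([s.name for s in rank_by_monetary_risk(segments)])
    print([s.name for s in rank_by_households_at_risk(segments)])
```

With these illustrative numbers, the monetary-damage score puts the affluent district first, whereas counting affected households puts the low-income district first. Neither score is neutral; choosing between them is a value judgment, not a purely technical one.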
Based on these concrete examples and the discussion of general AI principles, we can now formulate some general insights on what it means to develop responsible AI specifically for the water domain, including some promising opportunities for the use of AI.

First, consider the opportunities for responsible use. At the abstract level, the different ethical principles call for AI techniques that are aligned to human values (Blumenstock 2018). The challenge, therefore, is to build systems purposed to maximize the realization of these values (Russell 2016). In the literature, it has been suggested that systems be equipped with suitable utility functions that govern how options are evaluated by the system (Bostrom 2014). The use of evolutionary algorithms for many-objective optimization may provide a concrete starting point to broaden the implementation of values in AI systems. Although many-objective optimization is in itself primarily an optimization technique, and thereby also prone to some of the drawbacks of optimization discussed above, on a more abstract level the method reflects the ideal of value pluralism, that is, the recognition that there is a plurality of legitimate values or goals worth striving for (Doorn 2019). A study on water allocation in the Lower Rio Grande Valley, Texas, found that the use of many-objective optimization resulted in water management strategies that were more adaptive to conditions of water shortage and that would not have been identified as promising with traditional optimization techniques (Kasprzyk, Reed, and Hadka 2016). Recently, distributive justice has been included as a new decision criterion in a flood risk management many-objective optimization problem in the Netherlands (Ciullo et al. 2020). Additionally, many-objective optimization is currently also deployed to deal with uncertainty (Nicklow et al. 2010). However, uncertainty not only applies to climatic and demographic conditions, but also to what humans value.
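To illustrate how many-objective optimization keeps several values in view without fixing their relative weights in advance, the sketch below filters a handful of allocation options down to the non-dominated (Pareto-optimal) set. It is a brute-force filter, not an evolutionary algorithm; evolutionary algorithms are typically used to approximate this set when the options cannot be enumerated. The option names, the three objectives, and all scores are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on
    at least one. All objectives are assumed to be minimized."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(options):
    """Return the options that are not dominated by any other option."""
    return {
        name: scores
        for name, scores in options.items()
        if not any(dominates(other, scores)
                   for other_name, other in options.items()
                   if other_name != name)
    }


# Hypothetical allocation options scored on (cost, supply shortfall, inequity),
# all to be minimized. The values are illustrative only.
options = {
    "A": (10.0, 0.30, 0.20),
    "B": (12.0, 0.10, 0.25),
    "C": (15.0, 0.10, 0.05),
    "D": (16.0, 0.35, 0.30),
}

if __name__ == "__main__":
    for name, scores in pareto_front(options).items():
        print(name, scores)
```

Here option D is discarded because option A is at least as good on every objective, while A, B, and C remain as distinct trade-offs. Which of the remaining options to choose is left to deliberation rather than being settled by a single pre-weighted objective function.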
So the real challenge is not to optimize for specific values upfront, but rather to develop AI techniques that are flexible with regard to the values to be optimized (Van de Poel 2018). This seems especially relevant in the water domain, where long-term decisions often need to be made, for example regarding infrastructures that are designed to last for a relatively long lifespan. Ideally, we should be able to account for changing use or changing appreciation of those infrastructures (e.g. Linsen, Mostert, and Van der Zaag 2015). Where current adaptive algorithms may provide the necessary flexibility for real-time decision support (Wong and Kerkez 2016), we are still a long way from developing algorithms that have this more distant time horizon. First steps in this direction are the introduction of models for decision-making under deep uncertainty; for example, the use of many-objective optimization tools to discern decision-relevant scenarios within a wide range of future scenarios (Kwakkel 2019), where scenarios can also account for variation of values and changing societal values. Ultimately, the practical and responsible use of optimization techniques in general, and evolutionary algorithms in particular, depends on the quality of the problem formulations (Maier et al. 2014).

Regarding the ethical concerns voiced, AI models are usually contrasted with theory-driven models, a contrast that is seen both as an advantage and as a drawback. The advantage is that data-driven models are capable of providing useful outcomes in complex situations where existing theories are not able to capture the non-linear aspects of the system. The drawback of these data-driven models is that they are black boxes with no physical meaning and little explanatory power. Here, a combination of data-driven models and physics-based solutions may provide a fruitful middle way (Sun and Scanlon 2019).
In the water domain, a field with a strong physical science base, so-called theory-guided data science (TGDS) may provide a promising way to keep the best of both approaches. By introducing dependencies with sufficient grounding in physical principles, TGDS models have a better chance to represent causal relationships. Since TGDS develops models consistent with scientific principles, these models also achieve better generalizability than models that are purely data-driven (Karpatne et al. 2017). This may address some of the transparency concerns that apply to applications that have no or less physical grounding.

Lastly, many of the dystopian scenarios involving AI really boil down to the undemocratic use of data and lack of involvement by stakeholders actually affected by the outcomes of AI-based technologies. Especially given the spatial and public nature of water and much water infrastructure, participation and public engagement are high on the water ethics agenda (Sharp 2017). In recent decades, the water domain has gained ample experience in democratizing water policy and in involving stakeholders in decision-making on water-related issues, so as to improve its democratic legitimacy and to avoid water policies that go against important public and ethical values (e.g. Pigmans et al. 2019; Basco-Carrera et al. 2017; Mostert 2003). While a few articles in the review already addressed the issue of participation (Smith, Kasprzyk, and Dilling 2019; Lerma et al. 2015; Lewis and Randall 2017), this aspect could still be further strengthened. Using AI in the water domain requires similar attention to these democratic questions and so should be approached as something requiring more than just data expertise (Blumenstock 2018). It is therefore important that AI-based water interventions are developed in collaboration between the people who understand the problems and context and the data experts.
Ideally, some basic training in data science will be part of the training of the future generation of water professionals (Sun and Scanlon 2019).

This paper presents a review and a rough categorization of different types of applications of AI in the water domain. These categories are: modeling, prediction and forecasting, decision support and operational management, and optimization. Comparison of the reviewed literature on AI in the water domain and recent literature on AI and ethics shows that little attention is paid to the ethical aspects of AI applications in the water domain. Recent literature on AI and ethics suggests a list of five principles that should be taken into account when developing AI applications. These are: transparency, justice and fairness, responsibility and accountability, privacy, and non-maleficence. Application of AI in the water domain remains somewhat limited, compared to its use in other sectors (Hadjimichael, Comas, and Corominas 2016). This provides an opportunity to avoid and to learn from the mistakes made elsewhere (Van Thienen 2019).

This paper has discussed three insights pertaining to the water domain in particular. The first is the use of AI-based optimization techniques that optimize on several dimensions. Many-objective optimization may be instrumental in addressing some of the pressing ethical challenges pertaining to water, namely competing uses and uncertainty (Doorn 2019). Its success in this regard strongly depends on the quality of the problem formulations and the values included in them (Maier et al. 2014). Second, some of the drawbacks of AI in other sectors can be partly compensated because many applications in the water domain are also informed by physics. This allows for theory-guided data science, which can avoid some of the pitfalls of strictly data-driven models. Third, many of the ethical concerns with respect to AI deal with the lack of stakeholder involvement in data-driven decision-making.
Fortunately, this is increasingly recognized and there is ample experience with participatory decision-making in the water domain. This suggests that the development and application of responsible AI techniques for the water domain should not be left to data scientists alone, but requires concerted effort by water professionals and data scientists working together, complemented with expertise from the social sciences and humanities.

Figure 1

Evaluation of Machine Learning Algorithms for Surface Water Extraction in a Landsat 8 Scene of Nepal
Effect of river flow on the quality of estuarine and coastal waters using machine learning models
Water characterization and early contamination detection in highly varying stochastic background water, based on Machine Learning methodology for processing real-time UV-Spectrophotometry
A DARPA Perspective on AI [date accessed
Engineering Ethics
Hydrological modelling using artificial neural networks
Deep Learning: Methods and Applications, Foundations and Trends in Signal Processing
From single-objective to multiple-objective multiple-rainfall events automatic calibration of urban storm water runoff models using genetic algorithms
Ethics in artificial intelligence: introduction to the special issue
Water and justice: Towards an ethics for water governance
AI and Its New Winter: from Myths to Realities
A Unified Framework of Five Principles for AI in Society
Water security: A review of place-based research
A Multi-Scale Analysis of Single-Unit Housing Water Demand Through Integration of Water Consumption, Land Use and Demographic Data
The human right to water
Genetic Algorithms in Search, Optimization and Machine Learning
Water Ethics: A Values Approach to Solving the Water Crisis
Discrimination of plant root zone water status in greenhouse production based on phenotyping and machine learning techniques
Privacy Engineering: Shaping an Emerging Field of Research and Practice
Do machine learning methods used in data mining enhance the potential of decision support systems? A review for the urban water sector
A Review of Multiple Criteria Analysis for Water Resource Planning and Management
Engineering Ethics: Concepts and Cases
The Elements of Statistical Learning: Data Mining, Inference, and Prediction
Artificial Intelligence: The Very Idea
Debate: To Nudge or Not to Nudge*
Avoiding Another AI Winter
Combining Pixel- and Object-Based Machine Learning for Identification of Water-Body Types From Urban High-Resolution Remote-Sensing Imagery
Ethically Aligned Design: A Vision for Prioritizing Human Well-being with Autonomous and Intelligent Systems
ANFIS: adaptive-network-based fuzzy inference system
The global landscape of AI ethics guidelines
Machine learning: Trends, perspectives, and prospects
Theory-guided data science: A new paradigm for scientific discovery from data
Many objective robust decision making for complex environmental systems undergoing change
Battling Arrow's Paradox to Discover Robust Water Management Alternatives
Exploring policy perceptions and responsibility of devolved decision-making for water service delivery in Kenya's 47 county governments
Automated Design of Both the Topology and Sizing of Analog Electrical Circuits Using Genetic Programming
Using Machine Learning to Assess the Risk of and Prevent Water Main Breaks
A generalized many-objective optimization approach for scenario discovery
Advances in Water Resources Systems Engineering: Applications of Machine Learning
Assessment of evolutionary algorithms for optimal operating rules design in real Water Resource Systems
Solving multi-objective water management problems using evolutionary computation
Infrastructure and adaptive management in an eco-hydrological Delta: Lessons learned from design and construction of the Haringvliet Sluices
Wastewater-based epidemiology: current status and future prospects, Current Opinion in Environmental Science & Health
Evolutionary algorithms and other metaheuristics in water resources: Current status
Some philosophical problems from the standpoint of artificial intelligence
Machines Who Think: A Personal Inquiry Into the History and Prospects of Artificial Intelligence
The potential of combined machine learning and agent-based models in water resources management
Steps toward Artificial Intelligence
Saving Water with a Nudge (or Two): Evidence from Costa Rica on the Effectiveness and Limits of Low-Cost Behavioral Interventions on Water Use
Preferred Reporting Items for Systematic Reviews and Meta-Analyses: The PRISMA Statement
The Dartmouth College Artificial Intelligence Conference: The Next Fifty years
The challenge of public participation
Limitations of the current stock of ideas about problem-solving
State of the Art for Genetic Algorithms and Beyond in Water Resources Planning and Management
Early SARS-CoV-2 outbreak detection by sewage-based epidemiology
Ethical challenges regarding artificial intelligence in medicine from the perspective of scientific editing and peer review
Value deliberation to improve stakeholder participation in water governance
Society-in-the-loop: programming the algorithmic social contract
Multiclass Feature Selection With Kernel Gram-Matrix-Based Criteria
Water is an economic good: How to use prices to promote equity, efficiency, and sustainability
Ethics of artificial intelligence
Meaningful Human Control over Autonomous Systems: A Philosophical Account
The Power to Nudge
The ethics of nudging: An overview
Daily water level forecasting using wavelet decomposition and artificial intelligence techniques
Application of artificial intelligence CFD based on neural network in vapor-water two-phase flow
XXII. Programming a computer for playing chess
Reconnecting People and Water: Public Engagement and Sustainable Urban Water Management (Routledge
The logic of heuristic decision making
Artificial intelligence based modeling for predicting the disinfection by-products in water
Testing the potential of Multiobjective Evolutionary Algorithms (MOEAs) with Colorado water managers
Artificial intelligence technologies in surface water quality monitoring
How can Big Data and machine learning benefit environment and water management: a survey of methods, applications, and future directions
How AI can be a force for good
Water-Quality Indices Based on Fuzzy Logic and Other Methods of Artificial Intelligence
Nudge: Improving decisions about health, wealth, and happiness
A Review of Future and Ethical Perspectives of Robotics and AI
Mind, LIX: 4330460.
Van de Poel, Ibo R Responsible AI for the water sector?
Explorations in Data Mining for the Water Sector
Administration by Algorithm? Public Management Meets Public Sector Machine Learning
The Manipulation of Choice: Ethics and Libertarian Paternalism
Ethical governance is essential to building trust in robotics and artificial intelligence systems
Real-time environmental sensor data: An application to water quality using web services
The United Nations World Water Development Report 4: Managing Water under Uncertainty and Risk
Sustainable Water Resource Management of Regulated Rivers under Uncertain Inflow Conditions Using a Noisy Genetic Algorithm
Using Machine Learning to Discern Eruption in Noisy Environments: A Case Study Using CO2-Driven Cold-Water Geyser

This research was supported by a grant from the Dutch National Research Council NWO (grant no. VI.Vidi.195.119).