key: cord-0505009-85v2jgx0
authors: Muhlenbach, Fabrice
title: A Methodology for Ethics-by-Design AI Systems: Dealing with Human Value Conflicts
date: 2020-10-15
journal: nan
DOI: nan
sha: 5dd512e8d4d2c244251fb8647b9d93e65092c62b
doc_id: 505009
cord_uid: 85v2jgx0
The introduction of artificial intelligence into activities traditionally carried out by human beings produces abrupt changes. This is not without consequences for human values. This paper is about designing and implementing models of ethical behaviors in AI-based systems. More specifically, it presents a methodology for designing systems that take ethical aspects into account at an early stage while finding an innovative solution to prevent human values from being affected. Two case studies are presented in which AI-based innovations complement economic and social proposals using this methodology: one in the field of culture, operated by a private company, the other in the field of scientific research, supported by a state organization.
Artificial intelligence (or "AI") is often defined as the simulation on a machine of so-called "intelligent" processes. Being both an applied and a theoretical field, this discipline of computer science covers the range from weak AI (can machines act intelligently?) to strong AI (can machines really think?) [1]. Over the past decade, the first approach has brought AI back to the forefront, especially with the development of new machine learning techniques such as deep learning models [2]. This technique has produced extremely effective AI applications for pattern recognition and for information selection in decision making, with programs that extract information from raw data and learn to improve their skills from existing examples. With this learning process, AI systems can perform complex tasks in place of humans. However, the arrival of this new technique has brought a number of ethical issues.
Firstly, artificial intelligence programs reason in a simplistic way, but the real world is complex and full of unexpected events, which the machine has great difficulty dealing with. Secondly, when an AI program learns from data collected on past situations, it performs statistical inferences and transforms correlations between variables into implication relationships. This can lead to problems with dramatic consequences, such as gender bias in a resume analysis support system that denies women access to managerial positions [3], or racial bias in a legal decision support system used to predict future criminals [4]. Developers are concerned with optimizing specific criteria, e.g., efficiency and usability. For instance, online e-commerce marketplaces seek to optimize the relevance of the connection between supply and demand. They collect as much information as possible from consumers and suppliers, and use a whole set of AI-derived strategies to maximize the matching between the different parties. Attention-economy-based companies seek to keep their users on their websites and apps for as long as possible in order to show them advertising. The collection of personal data leads to personalization, which may have certain advantages, such as recommending more relevant content, but also raises important problems relating to the preservation of privacy, attacks on democracy [5], or the limitation of access to diversity [6]. Intelligent system algorithms are black boxes that are impossible to understand; they are unregulated and difficult to question when bias is present, and in some cases they amplify inequalities [7]. The highlighting of these problems has led to reactions from States and civil society.
Some initiatives, for example in the USA (Asilomar AI principles, "AI Now" Institute...), in Canada (Montréal Declaration for a Responsible Development of Artificial Intelligence), or in France (Villani Report "For a Meaningful Artificial Intelligence"), have produced reports, charters, lists of principles or laws that move in the direction of taking ethical concerns into account early in the development of AI systems. However, little information is provided on how to make this possible. For designing and implementing models of ethical behaviors, this paper proposes a methodology for building ethics-by-design AI systems. This methodology is explained and illustrated by means of two innovative applications developed during PhD works supervised by the author. We will explain how these AI applications are able to respond to the problem of human value conflicts.
The main idea of the three-step methodology proposed here is to take advantage of collective intelligence. The introduction of technologies derived from artificial intelligence creates risks of undermining values on which the different stakeholders may have contrasting views.
First Step: Innovations in an Economic and Social World
The first step of the methodology is to set up an economic and social model respectful of human values, in which the contribution of AI technologies can be integrated as a complement to the skills of human beings. The quote "They who can give up essential Liberty to obtain a little temporary Safety, deserve neither Liberty nor Safety", attributed to Benjamin Franklin, suggests that certain values are necessarily contradictory: liberty/privacy vs. security/safety. When resources are limited, choices need to be made. These choices will necessarily follow certain values and favor certain behaviors supported by some groups of individuals over others.
However, when adding a new technology such as AI, we must find a way not to optimize a single value at the expense of others. It is important always to try to find an alternative and innovative solution that guarantees respect for all human values. For example, in our study in the field of predictive judicial analytics [8], we came to the conclusion that the values of knowledge and trust are essential to lend credibility to decisions produced by intelligent systems. If deep learning models cannot guarantee understanding by specialists (i.e., lawyers), they must be replaced by other, more comprehensible models (rule-based approaches, Bayesian methods); otherwise these AI models will no longer be able to fulfil their role as a tool to serve justice and help defend litigants.
The methodology continues with the clarification, for all stakeholders, of the main values involved in the introduction of a new AI system, by setting up a decision-making tool called the "ethical matrix." This tool was initially proposed by Ben Mepham to facilitate judgements on bioethical questions [9], in order to identify the values that are threatened. By focusing more specifically on a small number of ethical principles (e.g., respect for well-being, for autonomy and for justice) on a given subject (e.g., the impact of new technologies in food and agriculture), it is possible to elicit the problems and concerns of different stakeholders or interest groups. The selected ethical principles constitute values that form the columns of the matrix. The rows of the matrix correspond to the stakeholders affected by the issue in question. Each cell of the matrix specifies the main criterion to be satisfied for a given stakeholder with respect to a given principle.
One of the most important difficulties of the ethical matrix approach concerns the choice of the values to use.
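The matrix layout described above (stakeholders as rows, ethical principles as columns, one criterion per cell) can be pictured as a small data structure. The following is only a minimal Python sketch under stated assumptions: the class name `EthicalMatrix`, its methods, the two stakeholder groups and the cell texts are hypothetical placeholders chosen for illustration and are not taken from the matrices built in this paper. Only the three principles come from the example given in the text.

```python
from dataclasses import dataclass, field


@dataclass
class EthicalMatrix:
    """Stakeholders as rows, ethical principles as columns, one criterion per cell."""

    stakeholders: list[str]
    principles: list[str]
    cells: dict[tuple[str, str], str] = field(default_factory=dict)

    def set_criterion(self, stakeholder: str, principle: str, criterion: str) -> None:
        # Reject rows or columns that were not declared up front.
        if stakeholder not in self.stakeholders:
            raise KeyError(f"unknown stakeholder: {stakeholder}")
        if principle not in self.principles:
            raise KeyError(f"unknown principle: {principle}")
        self.cells[(stakeholder, principle)] = criterion

    def row(self, stakeholder: str) -> list[str]:
        # Criteria for one stakeholder, in column order; empty string if unset.
        return [self.cells.get((stakeholder, p), "") for p in self.principles]

    def missing_cells(self) -> list[tuple[str, str]]:
        # Cells that still have to be discussed with the stakeholders.
        return [
            (s, p)
            for s in self.stakeholders
            for p in self.principles
            if (s, p) not in self.cells
        ]


# Principles come from the example in the text; the stakeholders and the
# criteria below are hypothetical placeholders, not a published matrix.
matrix = EthicalMatrix(
    stakeholders=["consumers", "producers"],
    principles=["well-being", "autonomy", "justice"],
)
matrix.set_criterion("consumers", "well-being", "safety and acceptability")
matrix.set_criterion("consumers", "autonomy", "informed choice")
matrix.set_criterion("producers", "justice", "fair treatment in trade")

print(matrix.row("consumers"))
print(matrix.missing_cells())
```

Listing the cells that are still empty is one simple way to see which stakeholder and principle pairs have not yet been discussed; it is a convenience of this sketch, not a step prescribed by the ethical matrix approach itself.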
A value is a way of being and acting for a person or a group of people, and people consider that certain behaviors resulting from a value are more desirable than others. From Antiquity (e.g., with Plato) to the present day, authors have proposed lists of fundamental values, such as the axiology (defined as the philosophical science of values) proposed by the philosopher Max Scheler [10], or the "Theory of Basic Human Values" proposed by the social psychologist Shalom H. Schwartz [11]. The arrival of artificial intelligence highlights a specific value: efficiency. By taking into account a greater number of variables over a greater number of examples than any human being is capable of accumulating in a lifetime, intelligent programs are able to accomplish very complex tasks brilliantly and effortlessly. However, this efficiency, introduced in a disruptive manner in fields which until now felt protected, such as medicine or justice, seems to be predatory for other values. Faced with this difficult task, it seems important to us to select a list of basic values, for example the 10-item list of values selected by the Montréal Declaration for a Responsible Development of Artificial Intelligence [12]: well-being, respect for autonomy, protection of privacy and intimacy, solidarity, democratic participation, equity, diversity inclusion, prudence, responsibility, and sustainable development. However, this list is neither mandatory nor exhaustive. Depending on the application context, values that are more likely than others to be affected by an AI technology must replace certain values among the 10 or be added to this list. Values can differ greatly from one individual to another, or from one social group to another. It is therefore recommended to take into account all possible stakeholders, even those who cannot express themselves (e.g., animals, nature).
In the field of artificial intelligence, the stakeholders are mainly researchers, developers, manufacturers, providers, policymakers, and users [13]. In the generic ethical matrix manual [14], a protocol has been proposed for these stakeholders to meet in workshops so that the diversity of opinions can be taken into account.
In the last step, the plural visions provided by the matrix, and their consequences for some human values, make it possible to draw up guidelines for integrating human values throughout the design and development process, following the principles of "Value Sensitive Design" (VSD) introduced by Batya Friedman and her colleagues in the late 1980s [15]. The VSD theory has influenced the Dutch approach to responsible innovation [16]. In this approach, human values are integrated throughout the design process, and development follows a three-phase investigation: conceptual, empirical and technological [17]. There are many different methods of applying this theoretical approach in practice to real-world problems [18]. Among other advice, the authors recommend starting with a value, a technology or a context of use. They then propose identifying the direct and indirect stakeholders, as well as the benefits and harms for each stakeholder group. The benefits and harms are then mapped onto the corresponding values, allowing the identification of potential value conflicts.
With the advent of the Internet, it was predicted that this would be an opportunity for musical creation aimed at a niche audience. Unfortunately, this did not happen. Apart from a few notable exceptions, access to music via the Internet has only strengthened existing trends: popular music, artists or genres have become even more popular, and lesser-known artists have found it even more difficult to find their audience, producing an impoverishment of musical diversity [19].
One of the reasons for the lack of diversity is linked to what Eli Pariser calls the "filter bubble" [6]. Personalized recommendations are based on the processing of big data: the algorithms recommend to individuals products that are similar, or products appreciated by individuals sharing common traits, or popular products. With machine learning, recommendation algorithms identify people's profiles, extract the most prototypical traits, and make only recommendations corresponding to a caricature of everyone's interests. The modus operandi of recommender systems may suit mainstream consumers who want to go directly to the greatest hits. However, this "superstar effect", where the winner takes all [19], impoverishes musical diversity and does not allow musical niches to emerge, ending up stifling a musical creativity that can no longer reconnect with its audience. If the digital music market is held only by major record labels supporting a musical offer limited to a few superstars, are the music and musical artists produced by independent record labels destined to disappear? In addition, we have become accustomed to paying for the container (e.g., a subscription to an Internet service provider, a nice smartphone) but not for the cultural content as such (e.g., music and videos on online video-sharing platforms, mobile applications on app stores). Since most new digital uses are free, how can people be persuaded to pay for access to unknown musical content produced by indie labels?
According to the methodology proposed here, the first step consists of a response from the economic and social world. Such a response exists in the form of a company offering the general public innovative solutions to discover independent creations. The French social economy company 1Dlab (footnote 4) has decided to take up this challenge.
To answer the question of how to support the varied musical creation produced by independent record labels, this company has found a two-level solution. The first-level solution is economic. The company proposes a fairer remuneration model for the creators (artists and content producers). The business plan follows a business-to-many (B2M) model. The company provides a music streaming platform (1DTouch, diMusic) but also works in the physical world to build a network of places and actors that favors new ways of sharing, meeting, discovering and creating cultural diversity. Access to the streaming platform is funded by a subscription fee paid by the project's local partners (libraries, showrooms, works councils, administrative divisions...) for their members. The second-level solution was proposed in the PhD work of Pierre-René Lhérisson, supervised by Pierre Maret and the author of this paper: it is a technical innovation that allows the general public to discover pleasant independent creations through bold recommendation tools, designed following an ethics-by-design approach [20].
The views on values such as music (or, more generally, culture as a whole), equity, diversity, trust and usefulness differ greatly depending on whether the stakeholders are the consumers (who listen to music), the creators of cultural content (artists), the media service providers or the independent record labels. Following the second step of our methodology, we build an ethical matrix (Table I) that shows how to develop an innovative solution capable of meeting all the criteria. In the third step of the methodology, we follow an approach inspired by the VSD theory to integrate, early in the design, the values defined in the previous step for the development of a fair tool for recommending indie content.
Footnote 4: 1Dlab - Innovation, Culture & Territories: http://en.1d-lab.eu/
First of all, we developed a distance measure between musical items (music, songs, artists, music genres) based on different criteria (text information about the music, properties extracted from the sound signal, musical classification performed by experts) as well as on the perceptual sensation of distance obtained through an experiment with listeners. Then we proposed an "optimal diversity function" using this multicriteria distance measure in order to find a compromise between the values of diversity and efficiency (or usefulness). To obtain this optimal diversity function, we projected the distances between items onto the (negative) Mexican hat wavelet function to establish a clear distinction between similar, dissimilar, and sufficiently dissimilar items (Figure 1).
Fig. 1. Mexican hat and optimal diversity function. The optimal diversity is obtained when the items are perceived as neither too similar nor too dissimilar.
For example, suppose a listener enjoys trip hop, a musical genre resulting from the fusion of hip hop and electronic music. Based on this information, it is unreasonable to recommend music located at the other ends of the spectrum of musical genres (e.g., classical music or hard rock). On the contrary, it is more relevant to recommend slightly different items to the listener, such as acid house, indie soul or dub, depending on whether the musical orientation leans respectively towards jazz, funk or reggae (Figure 2). Each item (music, song, artist or musical genre) is located in a multidimensional space where it is always possible to find other items at more or less close distances, and therefore with a more or less optimal diversity value. In order to guarantee respect for the equity value, it is thus possible to orient the direction of diversity towards basins of attraction where artists are less recommended than others, in order to help them be discovered.
Fig. 2. Musical genres and optimal diversity. Compared to a given item (here, the trip hop genre), the diversity will be optimal when it is far enough from the item in question, but not too far (e.g., acid house, electro jazz, dub or indie soul can be recommended genres because the diversity is optimal).
To maintain the user's trust in the recommendations made by the streaming platform, the system indicates to the user that the proposed recommendations are bold, i.e., that they do not target the music the listener is used to listening to, but that there is a chance the user may still like them. If, after having tried these boldly recommended tracks, listeners express their dissatisfaction, the personalization can adapt by decreasing the distance of optimum diversity, and the system can thus propose music closer to what they already appreciate. If, on the other hand, listeners have eclectic tastes and want bolder discoveries, the distance of optimum diversity can be increased for these people. Finally, by taking into account the opinions of the various players in a music platform system at a very early stage, it is possible to propose an innovative solution guaranteeing respect for ethical values that are a priori contradictory.
IV. CASE STUDY 2: RESEARCH-PAPER RECOMMENDER SYSTEMS IN SCIENTIFIC DIGITAL LIBRARIES
Access to knowledge is a major problem for human beings. This issue affects populations of all ages and at all levels: education for all the children of the planet, the fight against (digital) illiteracy, access to expert knowledge for all scientists, etc.
In the latter case, there are notable initiatives that are part of an open science perspective, but articles published in scientific journals or renowned conference proceedings still remain mainly the property of scientific publishers, and they are most often accessed by purchasing articles or by subscribing to the publisher's platform. Yet this access to knowledge is necessary for the production of knowledge because researchers are science prosumers: researchers "consume" and "produce" knowledge, and this activity takes one of the two following paths: exploration or exploitation. In the exploitation phase, researchers use their existing knowledge to produce new knowledge by following a discovery process specific to their discipline. In the exploratory phase, researchers update their knowledge by discussing with colleagues in their laboratory, reading books or scientific articles, or attending seminars and conferences. Exploratory research is an essential phase of a researcher's activity [21], and exploitation and exploration are two complementary phases that feed on each other. In addition to the cost of accessing these sources of knowledge, even if these scientific digital platforms are multidisciplinary, the way they are designed and the means they offer to access interesting information do not promote knowledge transfer between disciplines. Each scientific discipline has its own vocabulary, its own jargon. Without knowing the right keywords to enter in the search engine of the digital library platform, scientists find themselves isolated in a "filter bubble" [6] that prevents them from accessing the content of articles from other disciplines. However, results can be obtained when research transcends the limits of a discipline. It is well established that exchanges (knowledge transfers) between disciplines are very fruitful [22].
Faced with these problems, the French state research organization (CNRS) has proposed a solution that is part of the social and economic world, corresponding to the first step in our methodology, by launching an ambitious project in two parts. The first part concerns the creation of a scientific digital library accessible to all French research institutions, called ISTEX [23], resulting from a massive resource acquisition policy carried out by entering into operating contracts with many international and French-speaking publishers. The second part concerns the exploitation of these ISTEX resources with text and data mining techniques [24]. Together with the PhD student Hussein Al-Natsheh and with Djamel A. Zighed (co-supervisor), we had the opportunity to lead a four-person team working on this goal: assisting users of ISTEX resources by offering recommender systems that promote disciplinary diversity.
For researchers (whether they produce or seek knowledge), scientific publishers, research organizations and public authorities, interests diverge greatly, and their respective views on values such as knowledge, equity, diversity, trust or usefulness do not involve the same issues, which can lead to conflicts, as shown in the ethical matrix (second step: Table II). In the third step of the methodology, we integrate the values of the ethical matrix into the design of an innovative solution, following the VSD approach. Here, we focus on the value of diversity, which is threatened by the value of a precision-based usefulness. As part of the projects funded by the CNRS and intended to benefit from the ISTEX platform, we have proposed to develop a research-paper recommender system whose purpose is to favor diversity.
Instead of focusing solely on the accuracy of results, as most other systems do [25], we propose a system able to recommend scientific papers through a semantic similarity model [26] based on computational linguistics methods and word embedding techniques [27], as shown in Figure 3. In this system, the user does not enter a set of keywords but a list of articles dealing with a given subject of interest. The content (abstract, plain text) of these articles is projected into a multidimensional space constructed from the text representation of the scientific papers of the multidisciplinary digital library (blue arrow in Figure 3). As the representation space is built from semantic similarities resulting from the proximity of terms in the texts, a more or less distant neighborhood yields articles to recommend (green arrows) that are more or less diverse with respect to the input articles. In the experiments we launched, we collaborated with sports science researchers who were working on a test to detect motor skills. Thanks to our system, we were able to recommend research papers to the sports science researchers, deemed relevant by them, from different disciplines such as psychology, linguistics and defectology (e.g., studies on sign language used by deaf people, another form of complex movements with meaning). This kind of transfer of knowledge between disciplines has hitherto never been established.
Fig. 3. Semantic-similarity based recommendation of scientific papers. First, a vector representation of the scientific terms is created with word embedding techniques from the contents of the research articles of the library (represented by white dots in a multidimensional space). Second, the interesting papers provided by the user are projected into the vector representation space (represented by red dots) to define a target area of interest. Third, the neighboring articles of the target (black dots) in the vector representation space are activated to recommend diverse items, sometimes coming from other disciplines, but connected through semantic links to the input papers.
By examining in detail the different values involved using the ethical matrix, it was possible to explain them better and to realize that the usefulness of the system does not necessarily imply the quest for precision: diversity also matters. The early integration of the diversity value in the design of the research-paper recommender system allows researchers in the exploratory phase to have access to knowledge from various sources. With this recommender system, the ISTEX platform promotes scientific diversity and allows all scientific disciplines to be represented with equity and all their articles to be recommended. From an epistemological point of view, this diversity is essential because otherwise a researcher using a scientific digital library will only have access to sources of information from his/her own discipline. As a result, the documents found will only confirm the initial point of view of the researcher using this digital library, thus calling into question one of the golden rules of science, namely the property of "falsifiability" [28].
AI is often associated with robotization, and robotization can be understood as "automation" but also as "the process of turning a human being into a robot." That is why the arrival of AI can scare some individuals, giving rise to feelings of technophobia, with the fear of mass unemployment due to jobs being performed by robots or intelligent computer systems. Moreover, there is a rejection of AI because this technology may be seen as a kind of automated and not human, if not "inhuman", thinking. The species of the human being is Homo sapiens, which means "wise man" in Latin.
The human being is not simply characterized as being intelligent, that is to say having an ability to find solutions to complex problems, but also has the nature of a discerning, wise, and sensible being. The events that we are currently experiencing (the COVID-19 crisis, climate change) produce abrupt changes in all areas, bringing new challenges in the fields of health, economy, ecology, security, justice, science, culture... With the collective intelligence of human beings, let AI and ICT be the means pushing us to be innovative in order to respond to these problems in an appropriate manner, and may this response be given wisely in order to guarantee respect for human values.
REFERENCES
[1] Artificial Intelligence: A Modern Approach
[2] The Deep Learning Revolution
[3] Amazon scraps secret AI recruiting tool that showed bias against women
[4] Machine bias
[5] Mindf*ck: Cambridge Analytica and the Plot to Break America
[6] The Filter Bubble: What The Internet Is Hiding From You
[7] Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy
[8] Artificial intelligence and law: What do people really want? Example of a French multidisciplinary working group
[9] A framework for the ethical analysis of novel foods: the ethical matrix
[10] Formalism in Ethics and Non-formal Ethics of Values
[11] Universals in the content and structure of values: theoretical advances and empirical tests in 20 countries
[12] Montréal declaration for a responsible development of artificial intelligence
[13] Responsible Artificial Intelligence: How To Develop And Use AI In A Responsible Way
[14] Ethical matrix manual
[15] Value Sensitive Design: Shaping Technology with Moral Imagination
[16] Handbook of Ethics, Values, and Technological Design: Sources, Theory, Values and Application Domains
[17] Value sensitive design and information systems
[18] A survey of value sensitive design methods
[19] Digital music and the "death of the long tail"
[20] Fair recommendations through diversity promotion
[21] Exploratory search: From finding to understanding
[22] Scientific Discovery: Computational Explorations of the Creative Process
[23] Scientific and Technical Information Department - CNRS, White Paper - Open Science in a Digital Republic
[24] White Paper - Open Science in a Digital Republic - Strategic Guide
[25] Research-paper recommender systems: a literature survey
[26] Semantic search-by-examples for scientific topic corpus expansion in digital libraries
[27] Distributed representations of words and phrases and their compositionality
[28] The Logic of Scientific Discovery. London and New York: Routledge Classics