key: cord-0227274-06ekdrnt
authors: Perera, Harsha; Hoda, Rashina; Shams, Rifat Ara; Nurwidyantoro, Arif; Shahin, Mojtaba; Hussain, Waqar; Whittle, Jon
title: The Impact of Considering Human Values during Requirements Engineering Activities
date: 2021-11-30
journal: nan
DOI: nan
sha: 0345a72aee2b1091264683e9b53a1c2b9bee0483
doc_id: 227274
cord_uid: 06ekdrnt

Human values, or what people hold important in their life, such as freedom, fairness, and social responsibility, often remain unnoticed and unattended during software development. Ignoring values can lead to values violations in software that can result in financial losses, reputation damage, and widespread social and legal implications. However, embedding human values in software is not only non-trivial but also generally an unclear process. Commencing as early as during the Requirements Engineering (RE) activities promises to ensure fit-for-purpose and quality software products that adhere to human values. But what is the impact of considering human values explicitly during early RE activities? To answer this question, we conducted a scenario-based survey where 56 software practitioners contextualised requirements analysis towards a proposed mobile application for the homeless and suggested values-laden software features accordingly. The suggested features were qualitatively analysed. Results show that explicit considerations of values can help practitioners identify applicable values, associate purpose with the features they develop, think outside-the-box, and build connections between software features and human values. Finally, drawing from the results and experiences of this study, we propose a scenario-based values elicitation process -- a simple four-step takeaway as a practical implication of this study.

Software is an inextricable part of our lives and social fabric. People expect software to demonstrate and respect human values, that is, what people hold important in their life [1], such as social justice, freedom, independence, fairness, accessibility, and tradition. Unsurprisingly, values violations through software applications often end up creating undesired consequences such as financial losses [2], reputation damage [3], and even loss of lives [4]. For example, Facebook (now renamed Meta) recently changed WhatsApp's terms and conditions, leaving users no choice but to grant Facebook access to their personal data, including phone number and behaviour, or lose their WhatsApp account [5]. People accused Facebook of violating their trust and freedom to choose, and this change led millions of WhatsApp users to migrate to alternative messaging apps, such as Telegram and Signal [5]. In a more severe example, the "Blue Whale Challenge", a game conducted through social media apps, was responsible for the death of 153 teenagers around the world [6]. The game presented its players with 50 tasks in 50 days, and the 50th task was to take your own life [7], possibly violating all human values imaginable.

Engineering human values into software is challenging due to their ill-defined nature in the software context [8]. Ferrario et al. describe values as 'often invisible and taken for granted' [9]. Thew and Sutcliffe argue that stakeholder values, motivations and emotions are not explicitly addressed in the requirements processes [10]. Further, Harbers and others propose that "explicitly identifying and considering stakeholder values during requirements elicitation, identification and analysis will lead to software that better supports human values" [11]. There are commendable recent efforts in Requirements Engineering (RE) that support the explicit consideration of human values in software, such as values-based requirements engineering (VBRE) [10], HuValue, a values-based design tool [12], and the value story workshop [11]. All of these studies seem to hold a common assumption that explicitly considering human values in RE would make software better aligned with values. In this study, we challenge this assumption by examining it with real-world software practitioners who engage in RE activities, with the following research question:

To address this research question, we designed a scenario-based survey, where a scenario was presented to the respondents and, while explicitly thinking about human values, they were asked to suggest features that would satisfy the requirements given in the scenario. We discuss the survey design and flow in detail in Section 3.1, as well as our reflections on its design and outcomes in Section 5.2.

Given the detailed nature of the scenario-based survey, taking an average of 30 minutes to complete, we were pleased to receive responses from 56 software practitioners who engaged in RE activities on a regular basis. Apart from demographics and values familiarity details, the responses mainly consisted of the descriptions of the suggested features with due consideration of human values. We used these feature descriptions as the primary unit of qualitative analysis. Using socio-technical grounded theory (STGT) for data analysis [13], the characteristics of the suggested features were rigorously analysed to inform us of the impact of considering human values explicitly during RE activities in terms of helping practitioners design more values-focused software features.

This research makes the following key contributions:

• Providing empirical evidence of the impact of considering human values during RE activities in improving values identification and mapping.
• Promoting scenario-based thinking as an effective tool for operationalizing human values in RE.
• Introducing a four-step scenario-based values elicitation process as a practical takeaway for RE practitioners to consider human values in their day-to-day requirements analysis work and for researchers to adapt and use in similar contexts.
• Introducing scenario-based surveys as an effective and flexible research tool for addressing, at scale, complex research questions that require experiential evidence and flexibility under physical and time constraints such as those imposed by the Covid-19 pandemic related work-from-home conditions.

Human Values Definition: Human values are defined by Schwartz as standards that we use to judge the appropriateness of attitudes, traits or virtues [1]. Meanwhile, seven different definitions of human values are summarized as "guiding principles of what people consider important in life" [15]. In software engineering contexts, human values represent the characteristics of software that are considered as important for the stakeholders [16]. They contain but are not limited to values of ethical importance, often known as ethics.

Human Values Representation: Since the 1950s, social scientists have been searching for the most useful way to conceptualize basic human values [17]. In 1973, Rokeach captured 36 human values and organized them into two categories as terminal values and instrumental values [18]. In 1980, Hofstede divided values into two categories, desired (what people actually desire) and desirable (what people think ought to be desired) [19]. In 1992, Schwartz introduced the theory of basic human values, which is assessed across 82 countries [1]. It identified ten motivationally-distinct values categories and measured them using 58 distinct values [1], [14]. Although there are many more classifications for human values [15], in this research, we use Schwartz's theory, which is the most cited and widely applied classification not only in the social sciences but also in other disciplines [9], [10]. Figure 1 depicts Schwartz's values categories, their definitions and individual values.

Research on human values in technology design and development has been ongoing since the 1970s [20]. Attempts to consider human values during technology design and development have been of interest particularly in the field of Human-Computer Interaction (HCI). The first attempt came from Batya Friedman, who proposed an approach called Value-Sensitive Design (VSD) to elicit values and integrate them in technology design [20]. According to Friedman et al., "Value Sensitive Design is a theoretically grounded approach to the design of technology that accounts for human values in a principled and comprehensive manner throughout the design process" [21]. Friedman et al. also explored the conceptual, empirical, and technical aspects of VSD and provided suggestions accordingly to use VSD [21]. However, VSD is often questioned for limiting itself to values with ethical or moral importance [22], [23]. While morality or ethics may judge right from wrong, values do not necessarily have an ethical import all the time (authority, ambitious, capable, pleasure for example). Considering only a subset of human values makes VSD incomplete to address the challenge of integrating human values into software [24]. Moreover, translating identified human values into corresponding design features in the system is an underdeveloped activity in VSD [11], [25]. [28]. There are also a few studies that address values in different phases of the software development life-cycle (SDLC). For example, a recent study proposed a dashboard tool for software repositories to address values-related issues during SDLC [29]. Meanwhile, other studies attempt to address values in a specific phase of the SDLC, such as requirements and design, as presented in the following subsections.

The design phase is considered a potential place to "realize values" [30]. For this reason, several studies proposed approaches to support values consideration in the design process of technology. One of the approaches was used by Aldewereld et al. to propose a framework that creates explicit links between the values and the corresponding architectural and design decisions to maintain the values during development [31]. This framework, called Value-Sensitive Software Development (VSSD), uses the 'Design for Values' approach [31]. In another study, Hussain et al. proposed a framework to consider human values in design patterns [32].

[Figure 1: Schwartz's values categories, their definitions, and individual values. Recovered entries: Benevolence: preserving and enhancing the welfare of those with whom one is in frequent personal contact, e.g., helpful, honest, forgiving, responsible, true friendship, mature love. Universalism: understanding, appreciation, tolerance, and protection for the welfare of all people and for nature, e.g., broadminded, social justice, equality, world at peace, world of beauty, unity with nature, wisdom, protecting the environment.]

Failure to distinguish between user and system requirements may lead to soft issues in RE, such as politics and people's feelings, motivations and values [10]. However, RE offers relatively little guidance to deal with them, and human values are rarely considered among soft issues compared to quality aspects of software such as privacy or security [11]. Detweiler and Harbers explain this neglect as 'thinking about values is not common practice in RE' [33]. However, to address human values in software, it is necessary to capture them in the requirements during the RE activities. Here, we acknowledge recent but isolated RE approaches that explicitly recognise human values. Value-Based Requirements Engineering (VBRE) [10] uses stakeholders' values, motivations, and emotions (VME) to elicit and analyze soft issues of the software. However, VBRE identifies the RE process management implications that values bring about, rather than providing proper guidance to convert identified values to features of the system [11]. Duboc et al. considered non-technical aspects of software, such as ethics, power, politics, and values, by utilizing critical system thinking in the early requirements engineering process [34]. Another couple of studies suggest two different modelling languages to model emotions [35], [36]. A more recent effort in RE to address human values is the HuValue tool, which supports designers in considering human values in their design [12].

While each of these studies has its limitations (for example, VBRE provides little guidance to convert identified values to features of the system [11]), they hold a common assumption that explicitly considering human values in RE would make software better aligned with values. However, RE research is required to demonstrate such an impact. To the best of our knowledge, no research has examined this assumption with empirical evidence. Therefore, this study aims to investigate the impact of considering human values explicitly in RE activities on software features.

We conducted scenario-based survey research to study the impact of explicit consideration of human values in the early Requirements Engineering (RE) activities (e.g., requirements analysis) on software design. The scenario-based survey was in-depth and involved a hypothetical case to consider. The approach was used to overcome the challenges of conducting in-person workshops during the Covid-19 pandemic situation. Using a survey, we wanted to reach a broader population of software practitioners involved in RE activities.

In Australia, around 2.5 million people over the age of 15 experience homelessness at some point in their lives. Just over one-third of these people are driven out of their homes and into poverty due to family or domestic violence. A government official recently added: "There is nothing lonelier than being homeless … sense of connection is a critical thing. It's a means for people to find a connection". Following the idea, the government launched the project WeCare as a combination of technology, innovation and great love.

WeCare, a mobile application, is to facilitate homeless people across the country. This app's main objective is to act as a platform that connects homeless people (henceforth service seeker), service providers, and the government. Service seekers currently find these service providers by other means such as free localised printed guides in everyday places such as train stations or shops. Three hundred fifty thousand (350000) service providers within Melbourne provide housing, meals, healthcare, counselling, legal advice and addiction treatment for homeless people in Melbourne.

The WeCare app should serve service seekers by offering a location-based, comprehensive directory of essential support services such as shelters, food, health services, near-by social workers.

Other Requirements of the WeCare App:

• For health-related services, the providers may trace the service seeker's location with his/her consent.

• Service seekers should be able to search for a service or service provider using the app.

• Also, the app should give location-based navigation to reach services.

The app should be easy to use, given the minimal digital literacy level of the (majority of) service seekers.

The app should respect their privacy and ensure their security.

The app should be able to periodically send location-based service suggestions to help the service seekers whenever possible.

WeCare -Home for All 

The overall methodology consists of three stages, namely, a pilot study, a main study, and data analysis (see Fig. 2). First, as a preparation, we set up the survey goals, designed the research flow, and drafted the scenario along with the initial survey questionnaire. We started the pilot study after obtaining approval from the Monash University Human Research Ethics Committee (MUHREC) (project number 25278).

Pilot Study: We conducted a pilot study with four industry practitioners selected from our contacts to assess our survey design and, in particular, the clarity of the survey questions and the comprehensibility of the scenario. The pilot participants were asked to use the 'think-aloud' technique, in which they voice-recorded their feedback to the survey questions (if any) while doing the survey [37]. Then we carried out short discussions with pilot participants to elicit further suggestions. Based on the analysis of the think-aloud voice recordings and the researcher's notes of the discussions, we improved the phrasing of a few questions and added external links to access Schwartz's model in the question descriptions. Further, we updated several exit points of the survey and streamlined the survey logic. However, none of these changes affected the principal design or the intention of the survey. Rather, they served to improve the flow, comprehensibility, and information needs of the respondents. Once finalised, we continued to conduct the main study. The following subsections discuss the final survey design, data collection, and data analysis approaches in detail.

The first section of the survey collected demographic information about the participants, including their job roles and experience in the software industry (see Fig. 4 ). Further, the section questioned to what extent they elicit, analyse, prioritise and design software requirements as a part of their job. Finally, the section evaluated their level of familiarity with human values. Schwartz's theory of basic human values was used to define human values, and in the online survey platform, the cards shown in Fig.1 were displayed as clickable areas to pick the values that they were already familiar with. This served the dual purposes of gauging the participants' prior knowledge of human values and introducing them to (or reminding them of) the Schwartz's model.

The survey was developed around a hypothetical scenario of a proposed mobile app (WeCare) for homeless people in Australia. Fig. 3 depicts the scenario presented to survey participants, which was written through a values-neutral lens, without explicit mention of any human values. The scenario laid out the objective, context of use, and key requirements. Through the introduction of a common scenario, our aim was to make the survey experience uniform across all the participants, who varied in their demographic aspects such as job roles, project experiences, and geographical locations.

After introducing the scenario, we proposed five standard features for the WeCare app. In this research, we identify standard features as app functionalities that are common in almost all the apps, without considering any specific scenario. We suggested the following five standard features and asked participants to select the features they wanted to see in the WeCare App.

• Register - this feature allows the users to provide their necessary information and register with the application.
• Login - this feature allows the users to provide a correct username/email and password to login to the system.
• Login (social media) - this feature allows the users to use existing social media to login to the system.
• Search - this feature allows the users to search within the application. Any settings, information matching with the search string will be the output.
• FAQ - frequently asked questions are listed and answered.

These standard features were suggested upfront in order to save the participants' time spent on coming up with such features while brainstorming in the follow-up sections of the survey. Further, they indirectly acted as example templates that the participants could follow when asked to suggest their own features in the upcoming questions of the survey.

As depicted using the gray diamond shapes in Fig.4 , after the standard feature selection, the participants reached the first decision point (DP1) of the survey, where the participants decided whether more features are needed to accommodate the requirements mentioned in the scenario other than standard features. If yes, they were given a chance to suggest up to seven new features to the WeCare App. If not, the participants were directed through the alternate route demonstrated using dashed lines in the Survey flow (see Fig. 4 ).

When suggesting features, the participants were given a chance to mention the values category they had in mind. The options list also included a 'none of the values' option to indicate that the feature was suggested from a values-neutral point of view. Then, the participants were presented with a 3-minute video that further explained the importance of having values in software in general. Afterwards, the participants reached the second decision point (DP2), where they were given a chance to change the suggested features or keep them as suggested. We hoped that the video would help participants modify their suggested features to be better aligned with human values. We discuss the response to the video later in the Reflections section 5.2.

Until this point of the survey, participants suggested features and linked them with human values, which is a bottom-up approach. The final section of the survey attempted a top-down approach with values triggers. This section showed participants the ten values categories, their definitions, and examples as values triggers and asked whether they could identify any features that align with the given human values. This approach allowed them to start with a broader range of values and suggest new features, in addition to the features the participants had suggested earlier. All the participants were directed to this section, including those who said 'no' in decision point 1. This marked the endpoint of the survey. We discuss the effect of values triggers in Section 4 and Section 6.
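The survey flow described above can be sketched as a small routing function. This is only an illustrative sketch, not the Qualtrics logic used in the study: the class, field, and stage names are hypothetical, and we assume that the alternative route taken after a 'no' at DP1 leads directly to the values triggers section, which the text does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_NEW_FEATURES = 7  # participants could suggest "up to seven new features" at DP1


@dataclass
class Response:
    """Hypothetical record of one participant's answers."""
    wants_more_features: bool                       # answer at decision point 1 (DP1)
    suggested: List[str] = field(default_factory=list)
    revised: Optional[List[str]] = None             # DP2: None means "keep as suggested"


def survey_route(response: Response) -> List[str]:
    """Return the ordered survey stages a participant passes through."""
    stages = ["scenario", "standard_features", "DP1"]
    if response.wants_more_features:
        if len(response.suggested) > MAX_NEW_FEATURES:
            raise ValueError("at most seven new features can be suggested")
        stages += ["suggest_features", "video", "DP2"]
        stages.append("keep_features" if response.revised is None else "revise_features")
    # Assumption: the alternative route skips straight to the values triggers.
    # All participants reach this section, including those who said 'no' at DP1.
    stages += ["values_triggers", "end"]
    return stages
```

For instance, a participant who answers 'no' at DP1 would pass through `scenario`, `standard_features`, `DP1`, `values_triggers`, and `end` under these assumptions.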

In this survey, we intended to target software practitioners involved in RE-related activities. Therefore, we used a non-probabilistic purposive sampling technique in the study [38]. We used the Qualtrics platform to design the survey and, subsequently, advertised the survey as an anonymous survey link (without any email logging) to RE communities via social media (LinkedIn, Facebook and Twitter) and email lists. The survey attracted nearly 70 practitioners; however, data cleansing resulted in 56 usable responses, as we removed responses that did not reach the endpoint in the survey (see Fig. 4). Considering the detailed, scenario-based, and partly open-ended nature of the survey, it took approximately 30 minutes to complete on average and generated a significant amount of qualitative data to analyse. The effort to attract RE-related participants was successful, as 42 of the 56 (75%) said they were involved in eliciting, analysing, prioritising, or designing software requirements as a part of their job at least a couple of times a week. Another six participants (10.7%) were involved in RE activities at least a couple of times a month, while the remaining participants mentioned that they were involved in RE activities a couple of times a year or very rarely. Participant demographics are presented in the Results section 4.1.
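The reported shares can be reproduced with simple arithmetic. The following is a minimal sketch that uses only the counts stated above; the variable and function names are our own, and the size of the last group is derived as the remainder of the 56 usable responses rather than reported directly.

```python
TOTAL_USABLE = 56  # usable responses after data cleansing

# Counts stated in the text, keyed by frequency of involvement in RE activities.
re_involvement = {
    "at least a couple of times a week": 42,
    "at least a couple of times a month": 6,
}
# The remaining participants (a couple of times a year or very rarely)
# are derived as the remainder, not taken from a reported figure.
re_involvement["a couple of times a year or very rarely"] = (
    TOTAL_USABLE - sum(re_involvement.values())
)


def share(count: int, total: int = TOTAL_USABLE) -> float:
    """Percentage of the usable sample, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {group: share(count) for group, count in re_involvement.items()}
```

Under this calculation the two reported groups come out at 75.0% and 10.7%, which matches the figures given in the text.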

The data collected from the 56 participants included quantitative and qualitative data; therefore, we used mixed-method analysis to derive the results. Quantitative data mainly emerged from the first section of the survey (demographics and values familiarity of participants). The quantitative data was analysed using Qualtrics reports and Google spreadsheets by the first author of the paper. After the scenario introduction, the survey produced qualitative data, which mainly consisted of the features suggested by the participants. In this survey, we use those suggested features as the primary unit of analysis to understand the effect of explicit consideration of human values in RE activities.

The data analysis involved three of the authors as the analysts. All of the analysts had a decent understanding of human values and experiences in conducting qualitative analysis. We applied Socio-Technical Grounded Theory (STGT) for Data Analysis [13] to analyse the qualitative data, using techniques such as open coding, constant comparison, and writing memos. Since the survey responses provided sufficient qualitative data to apply the coding techniques but were not enough (say, as compared to in-depth interview responses) for full theory development, a limited application of STGT for data analysis was found suitable [13] . We selected this approach over other qualitative analysis techniques, such as thematic analysis, because of its (a) rigour that led to multi-dimensional results (presented in section 4) that are original, relevant, and dense as evidenced by the depth of the categories; and (b) reflective practices such as memo writing that led to layered insights and reflections (presented in section 5.1 and 5.2).

Open coding was used to identify the codes from the suggested features. The suggested features were shared through the open text boxes in the survey and served as the raw qualitative data on which analysis was applied. Using constant comparison, the codes were grouped into concepts and concepts into categories. An example of the analysis is presented below.

Raw Data: "Push notifications should be sent [from] time to time based on the location of [the] service seeker on nearby service providers"

Code: Location-based suggestions

Similarly, other codes such as recommendation services were derived from the suggested features. These codes were combined to form a higher-level concept, functional requirements.

Concept: Functional Requirements

Similar concepts were combined to form a category. In this case, the concepts functional requirements and non-functional requirements were combined to form a higher-level category, requirements type.

Category: Requirements Type

• Requirements Type: This category captures the concepts functional requirements (e.g., feature eFR04-'services near me - allow users to browse local services') and non-functional requirements (NFRs) (e.g., feature eNR07-'access to information with less number of clicks/swipes'). We use the prefixes FR and NR to identify these classifications respectively.
• Feature Granularity: The third category to be derived from the data analysis was Granularity. Since participants were free to suggest features as they liked, without any format constraints or specific guidance, the responses varied in the level of granularity of the features. For example, some of the suggested features were described at the level of implementation details (e.g., feature eFR19-'search should be able to filter by different categories'), which would normally be captured as tasks by software teams. On the other hand, some other suggested features were pitched at a more abstract level, without implementation details, otherwise known as user stories. Finally, some were described at an even more abstract level (e.g., feature eNR08-'clear and straight forward UI') that could serve as high-level guidelines, themes, or epics; depending on their relative importance, they can be applied as an overarching principle or broken down into specific user stories. This is similar to Bick et al.'s findings and categorisation of agile backlog items as ranging from a coarse-grained (theme or epic) level to a fine-grained (task) level [39].
• Expected Outcome: During the analysis, we observed that features could be categorised based on whether they were suggested using information that was within the scope of the given scenario or by bringing ideas from outside of the scenario (i.e., 'thinking outside-the-box'). For example, feature eFR06-'push notifications should be sent [from] time to time based on the location of service seeker on nearby service providers' is well within the WeCare app scenario shared with the participants (see Fig. 3). However, feature uFR04-'public to list out items they are willing to donate' was well outside the details of the WeCare app scenario, as the general public was never mentioned as a stakeholder of this application. This led us to classify some suggested features as expected (prefix e) and others as unexpected (prefix u), from the viewpoint of the scenario exercise. This helped us add an additional layer of detail to our categorisation of functional and non-functional requirements, captured by adding the prefix 'e' where the feature was expected and the prefix 'u' where it was unexpected (e.g., eFR, uFR, eNR, and uNR).

[Table: examples of suggested features with their codes, concepts, and categories (partially recovered)]

Feature | Raw data | Code | Concept | Category
uFR03 | "Ability to recommend the service provider to a friend, Articulate how the data captured while user sign will be used." | Recommendation services | Functional Requirements | Requirements Type
eNR07 | "Access to information with less number of clicks/swipes" | Usability | Non-Functional Requirements | Requirements Type
uNR01 | "Provide physical locations where users can access services at a kiosk or the like, if they don't have a phone to use the app" | Accessibility | Non-Functional Requirements | Requirements Type
eFR12 | "Suggestions - notifications for relevant services" | Suggestions | Epic/Theme level | Granularity
eNR08 | "Clear and straight forward UI" | UX | Epic/Theme level | Granularity
uFR07 | "Portal to connect with each others to build friendship/support without revealing identity" | Create secure interaction | Epic/Theme level | Granularity
eFR21 | "As a user, I should be able to reserve a service" | Reserve service | User story level | Granularity
uFR04 | "Public to list out items they are willing to donate" | Donation listing | User story level | Granularity
eFR19 | "Search should be able to filter by different categories" | Search filters | Task level | Granularity
uFR06 | "Providing all possible options under the sex of the person" | Gender options | Task level | Granularity
eFR04 | "Services near me - allow users to browse local services" | Search services | Expected feature | Expected Outcome
eFR20 | "When the user want to on board on specify service provider, he/she should be put his credential as a token of responsibility, thus the provider could have capacity planning beforehand" | Capacity planning | Expected feature | Expected Outcome
eNR10 | "App should clearly make statement about privacy and which data is being used by the company" | Privacy policy | Expected feature | Expected Outcome
uFR04 | "Public to list out items they are willing to donate" | Public donation | Unexpected feature | Expected Outcome
uFR05 | "Ability for the homeless to create value through their art/creations (similar to fair trade) facilitate by a platform connected to the apps" | Sell products | Unexpected feature | Expected Outcome
uNR02 | "Customize according to the ages" | Personalized UI | Unexpected feature | Expected Outcome

While inferring, we further identified two different levels based on how easily a suggested feature can be related to a particular value. Consider feature eFR08-'Forum - a place inside the app where users can publish posts and add comments'. This feature can be easily linked to being helpful because a forum and its public posts let users help each other, i.e. direct inferred mapping. If we take the inference a step further, a forum may also align with being curious or looking for friendship or a sense of belonging, i.e. indirect inferred mapping. Altogether, we use three levels of values mappings in this study as follows:

• Mapping level 1 - Direct values mapping (mainly in VAL category)
• Mapping level 2 - Direct inferred values mapping
• Mapping level 3 - Indirect inferred values mapping

These are denoted by the superscripts x^1, x^2, and x^3 respectively on the value names (x) in Tables 2, 3, and 4.
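To make the three mapping levels concrete, the sketch below encodes them as an enumeration and annotates value names in the superscript style described above. The names `MappingLevel`, `annotate`, and `forum_mappings` are hypothetical, and the example mappings are the ones given in the text for the forum feature (eFR08).

```python
from enum import IntEnum


class MappingLevel(IntEnum):
    """The three levels of values mapping used in the study."""
    DIRECT = 1             # value named explicitly for the feature
    DIRECT_INFERRED = 2    # value easily inferred from the feature
    INDIRECT_INFERRED = 3  # value inferred one step further


def annotate(value: str, level: MappingLevel) -> str:
    """Render a value name with its mapping level, e.g. 'helpful^2'."""
    return f"{value}^{int(level)}"


# Mappings described in the text for feature eFR08 (the forum).
forum_mappings = {
    "helpful": MappingLevel.DIRECT_INFERRED,
    "curious": MappingLevel.INDIRECT_INFERRED,
    "friendship": MappingLevel.INDIRECT_INFERRED,
    "sense of belonging": MappingLevel.INDIRECT_INFERRED,
}

labels = [annotate(value, level) for value, level in forum_mappings.items()]
```

Using an ordered enumeration keeps the levels comparable, so a lower number always indicates a more direct link between a feature and a value.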

As a part of the STGT process, we wrote 'memos' to document the insights generated while performing the open coding activities. While the open coding provided valuable results through categorising the features in terms of human values, requirements types, granularity, and expected outcomes, the memos helped to surface nuanced insights of this study. We draw on these memos in Sections 5.1 and 6, where we share our insights and discussions. The following is an example of a memo we created.

Probing participants with human values worked, as nearly half of the features were suggested after probing with values in the last question. Is it easier for practitioners to think from values to features (top to bottom) rather than from features to values? Some of the value categories received their first feature just because we probed participants with values!

In this section, we present the findings from the survey analysis. First, we present the outcome of the first section of the survey -understanding the participant (see Fig. 4 ), including participant demographics and their values familiarity. Then we present the outcome of the rest of the survey questions, mainly the feature categorisation, where we present 66 features across five categories -human values (VAL), expected functional requirements (eFR), unexpected functional requirements (uFR), expected non-functional requirements (eNR), and unexpected non-functional requirements (uNR). We discuss each of these categories and insights of the respective features.
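The feature identifiers used throughout the results (for example eFR04 or uNR01) follow the prefix scheme introduced earlier: 'e' or 'u' for the expected outcome, followed by FR or NR for the requirements type, and a running number. A minimal sketch of a parser for this scheme is given below; the function and type names are hypothetical, and identifiers in the VAL category are deliberately not handled because their format is not described here.

```python
import re
from typing import NamedTuple


class FeatureId(NamedTuple):
    outcome: str           # "expected" or "unexpected"
    requirement_type: str  # "functional" or "non-functional"
    number: int


# e/u prefix, then FR/NR, then a running number (e.g. "eFR04", "uNR01").
_FEATURE_ID = re.compile(r"^(e|u)(FR|NR)(\d+)$")


def parse_feature_id(feature_id: str) -> FeatureId:
    """Split a feature identifier into its outcome, type, and number."""
    match = _FEATURE_ID.match(feature_id)
    if match is None:
        raise ValueError(f"not an eFR/uFR/eNR/uNR identifier: {feature_id!r}")
    outcome = "expected" if match.group(1) == "e" else "unexpected"
    req_type = "functional" if match.group(2) == "FR" else "non-functional"
    return FeatureId(outcome, req_type, int(match.group(3)))
```

For example, `parse_feature_id("uFR04")` yields an unexpected functional requirement with number 4, which corresponds to the donation listing feature quoted earlier.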

Out of the 56 participants, 20 ( which were the most common job roles among participants. The authors re-categorised some of the similar job roles into commonly known job roles to the best of their knowledge and experience. For example, the requirements engineer and business analyst roles were categorised as business analysts. Most participants (26, 46.43%) had 1-5 years of work experience in the software industry, while 20 (35.71%) participants had 5-10 years of experience. We also had three participants (5.36%) with 20 to 25 years of experience in the software industry. We have summarised the demographics of the participants in Fig. 5. To calculate the average years of experience, we use the midpoint of each year category as its representative value (for example, if a participant selected 5-10 years as their experience, we assumed they had 7.5 years of experience). The overall average across the 56 survey participants was calculated as 6.07 years of experience.
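The midpoint-based averaging described above can be sketched as follows. This is a minimal illustration, not the authors' analysis script; the function name `midpoint_average` and the bin representation are assumptions. Only three bin counts are reported in the text (26, 20, and 3 participants), so the example input is incomplete and is not expected to reproduce the reported 6.07-year average.

```python
def midpoint_average(bins):
    """Average years of experience from (low, high, count) bins.

    Each participant in a bin is assumed to have the bin's midpoint
    as their years of experience, e.g. 7.5 for the 5-10 bin.
    """
    total_people = sum(count for _, _, count in bins)
    if total_people == 0:
        raise ValueError("no participants")
    weighted = sum((low + high) / 2 * count for low, high, count in bins)
    return weighted / total_people


# Bins reported in the text: 1-5 years (26), 5-10 years (20), 20-25 years (3).
# The remaining 7 of the 56 participants are not broken down in the text
# and are therefore left out of this illustration.
reported_bins = [(1, 5, 26), (5, 10, 20), (20, 25, 3)]
print(round(midpoint_average(reported_bins), 2))
```

Using bin midpoints is a common simplification when only ranges are collected; the result depends on the chosen bin boundaries.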

Most participants (70.37%) were either extremely familiar (3.7%), very familiar (29.63%), or moderately familiar (37.04%) with the values. Fig. 6 shows these levels of familiarity with human values. The follow-up question revealed the value categories that participants often consider when they develop software in general. Participants were allowed to select multiple values categories. The percentage of familiarity for each value category is presented in Fig. 7. Unsurprisingly, Security, the well-known software quality aspect, scored the highest popularity (62%), while Hedonism was the least popular values category (18%). We discuss the way this value category popularity may have affected the suggested features under #FamiliarityImpact in the Insights section (Section 5.1).

Next, we discuss the results of the survey after the introduction of the scenario. The first task was to select the standard features from a given list. The standard features Registration, Search, and FAQs were selected by more than 75% of the participants, while Login with or without social media was less popular (around 65%). This outcome indicates that the participants thought about the privacy of users, i.e., homeless people. In line with this, suggested features such as eFR03 (see Table 3) and eNR13 (see Table 4) also proposed letting users remain anonymous within the WeCare platform.

We identified 17 values or values-related qualities [VAL01 to VAL17] suggested by the participants (see Table 2). These suggestions demonstrate that practitioners are capable of identifying values that are aligned with a given scenario. A feature like VAL12, 'The app should not asking private data that is not adhere with tradition ... user could choose which service provider that provide food that adhere to his/her religion', shows that the participant made a clear link between the requirement and the value Tradition. Further, we identified evidence suggesting that explicit thinking about human values can alter a typical software feature to better align with values requirements. For example, a standard chat function would enable users to connect with other users; however, VAL16 (see Table 2) suggests connecting people based on 'common parameters of individuals', i.e., mutual interest. This example demonstrates that values thinking during RE activities can give an extra dimension to typical software features by adding a purpose, i.e., answering why someone wants to develop a particular feature in the first place. However, some of the suggested features were very short in description and used the same terms as the values themselves (e.g., VAL01-'Helpful', VAL02-'Responsible', VAL03-'Forgiving'). Such short descriptions and listings of value names do not help with operationalizing such features in practice. This aligns with one of our previous findings [40]: the inability to translate human values into features is one of the common challenges in operationalizing human values in SE. It also highlights the importance of identifying the granularity level of features that are brainstormed at the early stages of RE. We further discuss this idea in Section 4.6.

Table 2 (fragment):
VAL12 | "The app should not asking private data that is not adhere with tradition / data that is not related to the provided service. For example, asking "religion" for "food" can be changed with providing the food menu, so the user could choose which service provider that provide food that adhere to his/her religion"
"Key success indicators -a personalised metric created and set by the individual service seeker in a checklist, that they can tick off to enable self-approval or self-worth." | Successful^1 | Achievement

Through the STGT data analysis, detailed in Section 3, out of the 66 suggested features, 31 were identified as functional requirements (eFR01 to eFR24 (Table 3) and uFR01 to uFR07 (Table 5)), while 18 features were categorised as non-functional requirements (eNR01 to eNR15 (Table 4) and uNR01 to uNR03 (Table 5)). The functional requirements in Table 3 mainly addressed the scenario requirements. We identified different subcollections within the eFR features that align with major functional components of the WeCare app, such as login and registration, location-based features, and rating & feedback (see Table 3). We found that participants suggested similar functional components; however, the descriptions of the functions gave different dimensions to the feature. For example, feature eFR10 suggests having 'chat service / hotline' while feature eFR11 suggests the same with a purpose as 'support -get support for contacting services from a help desk, should be accessible throughout the app' (see Table 3). The latter also expresses the need for accessibility, a quality aspect of the feature, in addition to what it should do.

Moreover, the participants were found to draw ideas from real-world applications to suggest some features for the WeCare app scenario. For instance, feature eFR17 proposes 'the public who are part of the service system to be tiered based on their contribution and thereafter recognized (similar to how contributors to google maps/local guides are treated today)'. Such features (e.g., eFR03, eFR05, eFR17, eNR12, uFR05) indicate that if given a scenario, practitioners would be able to draw ideas from similar experiences and contexts, which supports scenario-based thinking as potentially an effective tool toward operationalizing human values in RE.

Suggested non-functional requirement features mainly addressed quality aspects such as portability, accessibility, usability, and privacy. Under the values mapping activity, we mapped these features to Schwartz's values as demonstrated in Table 4 and Table 5. Most of these mappings were at level 2 or 3, i.e., inferred values mappings with direct or indirect links. On a related note, human values are often confused with non-functional requirements. For example, eNR09 to eNR15 (see Table 4) mainly address the privacy of users, which could be categorised as an NFR or a value. However, the feature descriptions were closer to the quality aspects of the app; therefore, we categorised them as non-functional requirements.

Similar to functional requirements, some non-functional features were suggested with their purpose. For example, feature uNR03 (see Table 5) suggested developing a feature called 'remove my data' because 'user should have the choice to put/remove their data on the server'. Such outcomes suggest that, for both functional and non-functional requirements, values thinking encourages practitioners to think about the purpose of the feature they develop.

A part of this scenario-based survey activity was to map the suggested features to human values categories in Schwartz's theory of basic human values [41]. Participants mapped all (66) suggested features to values categories. All the features are visualised under their participant-classified value categories, superimposed on the Schwartz model's circular structure (see Fig. 7).

Table 3 (fragment; columns: ID | feature description | value items | value category):
"Search should be able to filter by different categories" | Capability^2, Helpful^3 | Self-direction
Other features
eFR20 | "When the user want to on board on specify service provider, he/she should be put his credential as a token of responsibility, thus the provider could have capacity planning beforehand" | Capable^3, Helpful^2, Security^3 | Benevolence
eFR21 | "As a user, I should be able to reserve a service" | Independent^2, Capable^2 | Achievement
eFR22 | "Favourites -this feature will allow the service seekers to list the type of services they are interested in from a list of existing services, so that they can be marked as favourites."
"Dashboard with icons showing access to services" | Helpful^3 | Self-direction
eFR24 | "Ability to select the details and content to be displayed" | Freedom^2, Independence^2 | Self-direction

Fig. 7. Schwartz's theory (circular structure) [14] (adapted from [42]). Words in black boxes are values categories. All the suggested features are superimposed over Schwartz's values categories.

We identified three granularity levels at which the suggested features were pitched: epic/theme level, user story level, and task level. We have shown the feature granularity in Fig. 7 using different outlines for the feature bubbles. For the features in the VAL category (yellow bubbles), we did not define any granularity level as almost all of them are still at a highly abstract level. We found that most of the Expected Functional Requirements (eFR) (light blue bubbles in Fig. 7) were suggested at either the user story level or the task level. Practitioners work with such functional requirements every day, and perhaps that allowed the participants to produce more fine-grained features [39]. For example, though we did not request participants to use any particular format when suggesting features, one of the participants suggested a feature using the user story format: 'as a user, I should be able to reserve a service' (eFR21).

Expected non-functional requirements (eNR) (light green bubbles) and unexpected functional requirements (uFR) (dark blue bubbles) had a mixed result in terms of granularity between epic/theme level and user story level. It was noted that the non-functional and unexpected features were suggested at more coarse-grained levels, which aligns with similar findings in agile planning contexts [39]. For example, the three unexpected non-functional requirements (uNR) items (dark green bubbles) support this argument, as two of them were pitched at the epic level, while the other was pitched at the user story level.

Converting abstract concepts such as human values into actionable software tasks is one of the critical challenges in operationalizing human values in SE [40]. These findings suggest that considering human values while conducting RE activities may reveal features with different granularity levels. Therefore, being conscious about the level of granularity of the features in software design may help operationalize human values in RE. For instance, while conducting RE activities, a practitioner may explicitly label the granularity of the design choices and try to brainstorm features that are less abstract and closer to well-defined task levels. To give an example, one participant suggested feature eNR06-'Easy navigation', which is pitched at the epic/theme level, while another participant suggested eNR05-'Home page with easy access icons for most popular/important services', which is a more actionable task level feature.

We found a total of 39 expected features, both functional (eFR01 to eFR24) and non-functional (eNR01 to eNR15) and 10 unexpected features (uFR01 to uFR07 and uNR01 to uNR03). As explained in section 3, our definition of expected features is bound to whether the feature emerged directly from the given information in the scenario (expected) or whether the participants drew on ideas outside the scenario and suggested the feature (unexpected).
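The category counts reported in this section can be cross-checked with simple arithmetic. The short Python sketch below uses only the ID ranges stated in the text; the dictionary and variable names are introduced here purely for illustration.

```python
# Number of features per category, taken from the ID ranges in the text.
feature_counts = {
    "VAL": 17,  # VAL01 to VAL17
    "eFR": 24,  # eFR01 to eFR24
    "uFR": 7,   # uFR01 to uFR07
    "eNR": 15,  # eNR01 to eNR15
    "uNR": 3,   # uNR01 to uNR03
}

functional = feature_counts["eFR"] + feature_counts["uFR"]
non_functional = feature_counts["eNR"] + feature_counts["uNR"]
expected = feature_counts["eFR"] + feature_counts["eNR"]
unexpected = feature_counts["uFR"] + feature_counts["uNR"]
total = sum(feature_counts.values())

print(functional, non_functional, expected, unexpected, total)
# prints: 31 18 39 10 66
```

The totals agree with the figures given in the text: 31 functional, 18 non-functional, 39 expected, 10 unexpected, and 66 features overall.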

We found that some of the feature descriptions among the expected features show that the scenario-based approach eased the participants' thinking process and helped them develop the purpose of the feature they proposed, i.e., why do you need this feature? For example, eFR14 said 'reviews/rating system for service providers. So future service seekers will be able to get an understanding of how responsible service providers are'. With the use of 'so', the participant went on to describe the rationale for why the feature is required. In this case, by providing a feature to review the quality of the service, it seems they want to ensure service providers are responsible, thereby focusing on the service seekers' quality of life. In feature eFR20, a participant suggested that 'When the user wants to on board on specify [specific] service provider, he/she should be put his credential as a token of responsibility; thus the provider could have capacity planning beforehand'. These examples, in particular, demonstrate that participants had thought about the different stakeholders involved and had considered the purpose of the feature from both the service seeker's and the service provider's points of view. This leads to an interesting discussion on values trade-offs [43], where one stakeholder's values might positively or negatively affect the (rest of the) values of the same stakeholder or another stakeholder.

We found the unexpected features to be interesting ideas that demonstrated creativity and the ability of the participants to draw ideas from their professional software engineering experience and/or their personal worldviews as software users. For example, the feature uNR01-'provide physical locations where users can access services at a kiosk or the like, if they don't have a phone to use the app' not only goes beyond the scenario information but also challenges the scenario assumption that the homeless have access to mobile phones. The participant proposes to make everyone capable of using the WeCare app with or without a mobile phone, a precise alignment with the value category Universalism. Further, in the feature uFR05, a participant suggests using the WeCare platform as a source of income by 'create value through their [users'] art/creations (similar to fair trade) facilitate by a platform connected to the apps'. uFR04 extended the stakeholder list of the scenario by suggesting the public 'to list out items they are willing to donate'.

These examples suggest that explicit consideration of human values can extend the thinking boundaries of practitioners and enable them to come up with more features and feature options, and to identify more stakeholders and their roles, all of which make the software design better aligned with human values.

Based on careful consideration of the results and drawing on our memos, we present some key insights.

#FamiliarityImpact: Based on the evidence, we were able to draw an insight about the potential impact of values familiarity on values elicitation. In subsection 4.2, we discussed the values familiarity of participants (presented as percentages in grey outlined boxes in Fig. 7, next to the value category name). The categories with relatively higher familiarity levels visibly have more suggested features. For example, Benevolence (familiarity 44%), Self-direction (58%), and Security (62%) have absorbed 23, 15, and 10 features, respectively, making them the top three value categories with the largest number of features. Similarly, less familiar categories like Hedonism and Power have fewer suggested features (see Fig. 7). This pattern suggests that the more aware of and familiar with particular values practitioners are, the more features aligned with such values they are likely to derive. However, Conformity is an exception in this regard: nearly half the participants (46%) acknowledged their familiarity with this value category, but only one feature was suggested in this category (uNR02).

#ValuesTriggering: We tried both approaches: feature-driven value mapping (brainstorm features, then map to values) and values-driven feature mapping (triggering with values to brainstorm features). At first glance, it seems both approaches are almost equally effective, as 29 out of 66 features were suggested at the last "values triggers" question of the survey (see Fig. 4). However, the fact that the participants were able to collectively derive 29 additional features, on top of the ones they had already identified, as a direct result of values triggering can be seen as a significant impact of the values-driven feature mapping approach, and of the process of applying both approaches. In addition to its impact on deriving more features, the value triggers were also seen to impact outside-the-box thinking and the elicitation of unexpected features. A majority of the unexpected features (9 out of 10) were suggested in response to the values triggering question. More surprisingly, 6 of these were suggested by participants who had initially responded that they did not have more features to add earlier on in the survey (at decision point DP1, see Fig. 4) but went on to identify more features using the values-driven feature mapping approach.

#ValuesConflict: Using a circular structure for his model, Schwartz explains the interlinks between values. Values located closer to each other are complementary, whereas values further apart tend to be in tension with each other [41]. Following this principle, the features listed in a category such as Benevolence may complement the features in the Universalism category, while features listed under Self-direction may conflict with implementing features listed under Security, and vice versa. For example, the feature VAL13, in the Tradition value category, suggests 'display the content on the app based on the user's traditional values and origin', while the feature eFR03 from the Self-direction category, on the opposite end of the model, requests to implement 'guest access', which can be an obstacle to collecting the data, such as the user's traditional values and origin, necessary to implement the feature VAL13. This leads us to our next insight about values trade-offs.

We identified that values trade-offs could occur for the same stakeholder (as described above) or between different stakeholders. For example, eFR10 and eFR11 (see Table 3) both suggest having chat services/hotline or a help desk. These features are helpful for service seekers; however, such external services might negatively affect the wealth of the government, i.e., the cost of recruiting people, conducting training, and maintaining the help desk. Though it was not our intention to handle values trade-offs, including prioritisation, in this study, we acknowledge the importance of handling them and will continue our future research to resolve them.
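The proximity principle behind #ValuesConflict (nearby categories tend to be complementary, distant ones tend to be in tension) can be sketched as a distance on a circle. The Python snippet below is a simplified illustration under stated assumptions: the ten categories are treated as equally spaced positions in their usual order around Schwartz's circle, whereas in the model itself Conformity and Tradition share one wedge. The names `circular_distance` and `likely_tension`, and the threshold of 3 steps, are hypothetical choices made for this example and are not taken from the study.

```python
# Ten value categories in their order around Schwartz's circle.
# Assumption: equally spaced positions (a simplification of the model).
CIRCLE = [
    "Self-direction", "Stimulation", "Hedonism", "Achievement", "Power",
    "Security", "Conformity", "Tradition", "Benevolence", "Universalism",
]


def circular_distance(a: str, b: str) -> int:
    """Number of steps between two categories along the shorter arc."""
    i, j = CIRCLE.index(a), CIRCLE.index(b)
    d = abs(i - j)
    return min(d, len(CIRCLE) - d)


def likely_tension(a: str, b: str, threshold: int = 3) -> bool:
    """Flag a pair as potentially conflicting (threshold is arbitrary)."""
    return circular_distance(a, b) >= threshold


print(circular_distance("Benevolence", "Universalism"))  # adjacent categories
print(circular_distance("Self-direction", "Security"))   # opposite categories
print(likely_tension("Tradition", "Self-direction"))
```

Such a check can only flag candidate conflicts for discussion; whether two features actually trade off against each other still depends on their concrete descriptions and the stakeholders involved.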

Conducting this scenario-based survey study was an interesting experience. We share some reflections which may be helpful for other researchers and practitioners.

We further analysed the demographics of practitioners who proposed the unexpected features. Practitioners who proposed unexpected features had an average of 9 years of experience, while the same for the entire sample was 6.07 years. Moreover, six out of ten features were suggested by practitioners who held senior positions, mainly the BA roles, suggesting that maturity in the software industry allows practitioners to consider a wide range of issues associated with software requirements and design, to think-outside-the-box where the 'box' is the given scenario, and to elicit specific values.

The choice of the scenario is likely to have an impact on the values categories elicited from the scenario-based survey. In this study, the scenario used was based on a proposed mobile app for homeless people. While the scenario itself was written with a values-neutral lens, the nature of the application domain, i.e., providing shelter for the homeless, is likely to have elicited certain categories of values over others. For example, Benevolence and Self-direction were the value categories that elicited the largest number of value items, both as self-identified by the participants and as inferred through the analysis of the features. These categories align with the values perceived to be demanded by the users (e.g., helpful, responsible) and those seen to be supported by the app (e.g., freedom, independence), respectively. It may also explain why only one feature was suggested in the Conformity category, despite nearly half the participants being familiar with this value category.

#ResearchAdaptations: We had originally planned to conduct the study as in-person workshops. However, the global Covid-19 pandemic imposed strict and extended lockdowns in Melbourne, as in many parts of the world, forcing us to consider other ways of continuing our research study. The research team brainstormed alternatives. Considering Zoom fatigue and the need for schedule flexibility for better work-life balance while working from home during such challenging times, we decided to proceed with a survey that could be filled in asynchronously in the participant's own time. We played with the idea of embedding the scenario in the survey. At first, this seemed difficult, but after several rounds of reviews, we managed to design a reasonable survey flow. We were pleasantly surprised by the number of participants and their sincere engagement with the survey, spending 30 minutes on average and eliciting 66 features altogether. Our experience suggests that a well-designed scenario-based survey is a reasonable data collection technique, especially under physical and time constraints.

We had hoped that a visual introduction of values through a 3-minute video (see Fig. 4) would help participants modify their suggested features (in the step after the video) to be better aligned with human values. However, the feature modification step was highly unpopular, as almost all the participants marked suggested features as either 'values are already considered' or 'keep feature as it is'. The level of values familiarity of participants in this study, as depicted in Fig. 6, suggests that a majority of participants (70.37%) were well-positioned to extract values. Therefore, it is likely that they were confident about their initial feature suggestions. However, such a video introduction might be helpful with a different cohort of participants with less values awareness.

We conducted a scenario-based survey research study to address the research question, what is the impact of considering human values explicitly in the early requirements engineering activities? In response to the RQ, the results show that considering human values explicitly while conducting requirements analysis registers several impacts. It helps practitioners to:

• identify human values that are applicable to a given scenario (VAL),
• associate purpose with the features they develop in their day-to-day life (eFR, eNR), considering the important why question, instead of jumping into software development,
• think outside-the-box, beyond the given scenario, and draw ideas from their life experiences (uFR, uNR), and
• build connections between software features and human values (eFR, eNR, uFR, uNR).

Overall, the explicit consideration of human values in the early RE activities has a strong potential to enable practitioners to concretely identify and align human values with software requirements, previously identified as a key challenge of operationalizing human values in software engineering [8]. Furthermore, we argue that explicit consideration is valuable and essential to developing software that demonstrates and respects human values. Such explicit consideration is likely to lend purpose to SE practitioners while developing software, as they can clearly identify and make connections between the requirements they are fulfilling and stakeholders' values. Given the success of the scenario-based survey, we also suggest that scenario-based thinking is a good approach to implementing the connection between features and values.

Based on the results and our experience, we propose a scenario-based values elicitation process as a practical implication and takeaway of this study (see Fig. 8).

Step 1: Designing scenarios. This step involves some members of the software team coming up with a scenario that captures the standard requirements of the software being designed. This is ideally done by the manager, business analyst, or a values champion [27], in consultation with the product owner or customer representative. Our scenario WeCare in Fig. 3 can be consulted as a guide for this purpose. The team may also wish to make use of personas to further develop the scenario.

Step 2: Feature-driven value mapping. This step follows a bottom-up approach of coming up with features for the scenario, followed by mapping them to Schwartz's values model.

Step 3: Values-driven feature mapping. Once the identified features are mapped to values, the next step is to follow a top-down approach to ensure good coverage across the value categories. To do this, the team considers the values in the Schwartz model and comes up with more features, or refines the identified features, to align with the value categories. While it may seem intuitive to try to achieve good coverage across all values, the software being designed may be naturally inclined to map to some values over others, so it may not be a good idea to force even coverage.

Step 4: Granularity for implementation. The last step is to check the granularity, feature type, and expected outcome of the feature. In terms of granularity, as discussed in Findings (Section 4), features may fall anywhere between a highly abstract level, i.e., closer to human values, and a fine-grained level, i.e., closer to implementation. We propose this step as a decision point, where practitioners may go back to brainstorm further to bring the suggested feature closer to the task level, and thus closer to the operationalization of human values. Similarly, practitioners may fine-tune the suggested features in iterations until satisfied with the requirements type (functional requirements and NFRs) and expected outcome (in-scope and out-of-scope features). This step might also help to identify the indirect stakeholders of the scenario (like the public in the WeCare app).

Another approach could be to start Step 2 with more guidance and structure than we did in the survey, asking the participants to suggest the features with enough detail so that implementation details can be drawn from them and associated tasks can be written up. The choice of approach (open format vs structured format) depends on who is participating. For example, it may be possible to work at a fine-grained task level if developers are participating. However, managers, customers, business analysts, and other people in non-technical roles are more likely to express features as user stories or themes/general guidelines. Generally, a mix of roles is recommended to achieve healthy discussions of values and optimal mapping at the practical implementation level.
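The coverage check between Step 2 and Step 3 can be sketched in a few lines. The Python example below is a minimal illustration, not tooling from this study: the function name `coverage` and the dictionary-based representation are assumptions. It groups the features mapped in Step 2 by value category and lists the categories that have no feature yet, which can then serve as triggers for Step 3. The three example mappings are taken from the Table 3 rows for eFR20, eFR21, and eFR24.

```python
from collections import defaultdict

# The ten value categories of Schwartz's model.
SCHWARTZ_CATEGORIES = [
    "Self-direction", "Stimulation", "Hedonism", "Achievement", "Power",
    "Security", "Conformity", "Tradition", "Benevolence", "Universalism",
]


def coverage(feature_to_category):
    """Group feature IDs by value category and list uncovered categories."""
    by_category = defaultdict(list)
    for feature_id, category in feature_to_category.items():
        if category not in SCHWARTZ_CATEGORIES:
            raise ValueError(f"unknown value category: {category}")
        by_category[category].append(feature_id)
    uncovered = [c for c in SCHWARTZ_CATEGORIES if c not in by_category]
    return dict(by_category), uncovered


# Step 2 output for three features, as mapped in Table 3.
step2 = {
    "eFR20": "Benevolence",
    "eFR21": "Achievement",
    "eFR24": "Self-direction",
}

covered, triggers = coverage(step2)
print(covered)
print(triggers)  # categories that could be used as value triggers in Step 3
```

As noted in Step 3, an empty category is a prompt for discussion rather than a defect: some software is naturally inclined towards certain values, so the list of uncovered categories should not be read as a target for even coverage.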

Internal Validity: The structure and questions of the scenario-based survey may have introduced some threats to the internal validity. All the authors reviewed the survey questions and agreed upon them. Furthermore, we improved the survey questions based on feedback that we received in the pilot phase. More specifically, the comprehensibility of the scenario used in the survey might be a concern since it was about a specific social group, i.e., homeless people in Australia. It is unknown to what extent each participant was familiar with this context. We used the pilot phase to understand the level of comprehensibility of the scenario introduction and improved the scenario description to be easier to follow. To provide further support, hyperlinks giving access to the scenario (Fig. 3) were added where necessary.

The subjective judgment in the process of mapping the suggested feature descriptions to human values may also become a source of threat to internal validity. We mitigated this potential bias by employing three analysts who worked individually during the coding process as prescribed in investigator triangulation [39] . Additionally, these analysts also had a similar understanding of human values in SE/RE.

Construct Validity: Possible threats to construct validity may arise from the participants' and the analysts' understanding of human values. The human values theory we used might have been completely new to the survey participants, which may have led to misinterpretations of values. We identified the lack of familiarity with the provided human values definitions and examples among the participants in the pilot phase. In the main study, our strategy to help participants better understand human values was to add hyperlinks, where necessary, to an external document. The external document contained detailed information about Schwartz's theory, including the values circular model. As for the analysts, they all had a decent understanding of human values and had research experience on human values in software.

External Validity: Given the number of survey participants, we accept that the findings and conclusions may not be applicable to the entire global RE community. Nevertheless, the participants came from 15 different job roles and had a range of years of experience (less than a year to 25 years), which can be a reasonable representation of the RE community.

The demand for software that reflects human values is increasing, and Requirements Engineering (RE) has a crucial responsibility of designing software features to demonstrate desired human values. This scenario-based survey contributes to identifying the impact of explicit consideration of human values during RE activities. Our survey attracted 56 participants who are mainly involved in RE activities in their day-to-day life. The results suggest practitioners may confidently consider human values during RE activities as it allows them to (i) identify values related to a given scenario, (ii) associate purpose with the features they develop, (iii) be creative as well as draw ideas from life experiences, and (iv) build connections between human values and software features.

Further, we find that human values alignment with features could be done effectively using either feature-driven value mapping (brainstorm features, then map to values) or values-driven feature mapping (triggering with values to brainstorm features). Finally, we propose scenario-based values elicitation, a four-step takeaway process for practitioners to use scenarios to elicit values and develop a value-aligned feature list for a given scenario(s).

In future work, we will address several points. First, as discussed, we will look for potential guidelines, tools and techniques that can handle explicit values consideration in RE and support the scenario-based values elicitation process to be effectively practiced in the industry. Second, we plan to continue to research on #ValuesTrade-offs and potentially extend the scenario-based values elicitation process to accommodate and evaluate values trade-offs. Third, we will explore ways to extend and trace the explicit consideration of human values throughout subsequent steps in the requirements engineering tasks. Finally, we will further improve the scenario-based survey technique reflecting on learning from this study to be an effective research tool.

An overview of the Schwartz theory of basic values

Over $119bn wiped off Facebook's market cap after growth shock

Fairness testing: testing software for discrimination

Molly Russell: Instagram bans graphic self-harm images after suicide of UK teen

WhatsApp loses millions of users to rivals Telegram and Signal amid fears of increased data sharing with Facebook

Is the blue whale game among adolescents just a media hype?

'Blue whale challenge': A game or crime?

Operationalizing human values in software: a research roadmap

Software engineering for 'social good': Integrating action research, participatory design, and agile development

Value-based requirements engineering: method and experience

Embedding stakeholder values in the requirements engineering process

Huvalue: a tool to support design students in considering human values in their design

Socio-technical grounded theory for software engineering

Universals in the content and structure of values: Theoretical advances and empirical tests in 20 countries

Developing a meta-inventory of human values

Human values in software development artefacts: A case study on issue discussions in three android applications

Basic human values: theory, methods, and application

The nature of human values

International studies of management & organization

Design for values: An introduction

The handbook of information and computer ethics

Handbook of Ethics, Values, and Technological Design: Sources, Theory, Values and Application Domains

Values-first se: research principles in practice

Advancing the study of human values in software engineering

Principles for valuesensitive agent-oriented software engineering

Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering

How can human values be addressed in agile methods? a case study on safe

Measuring human values in software engineering

Towards a human values dashboard for software development: An exploratory study

Making values explicit during the design process

Handbook of Ethics, Values, and Technological Design: Sources, Theory, Values and Application Domains

Integrating social values into software design patterns

Value stories: Putting human values into requirements engineering

Critical requirements engineering in practice

Towards a requirements language for modeling emotion in videogames

Don't leave me untouched: Considering emotions in personal alarm use and development

Getting access to what goes on in people's heads? reflections on the think-aloud technique

Sampling in software engineering research: A critical review and guidelines

Coordination challenges in large-scale software development: a case study of planning misalignment in hybrid settings

Human values in software engineering: Contrasting case studies of practice

Universals in the content and structure of values: Theoretical advances and empirical tests in 20 countries

The common cause handbook

Value sensitive design: Theory and methods