key: cord-0043989-2cgr6mbf
authors: Ruiz, Marcela; Hasselman, Björn
title: Can We Design Software as We Talk?: A Research Idea
date: 2020-05-05
journal: Enterprise, Business-Process and Information Systems Modeling
DOI: 10.1007/978-3-030-49418-6_22
sha: 5c0b3507241504460ee7218b2a037beb1cd460e7
doc_id: 43989
cord_uid: 2cgr6mbf

In the context of digital transformation, speeding up the time-to-market of high-quality software products is a big challenge. Main challenges. Software quality correlates with the success of requirements engineering (RE) sessions. RE sessions require software analysts to collect all relevant material, usually captured as written notes, flip charts, pictures, etc. Afterwards, comprehensible requirements need to be specified for software implementation and testing. These activities are mostly performed manually, which delays the process and diminishes software quality attributes such as reliability, usability and comprehensibility, ultimately devaluing the software. Innovative aspects. This research idea paper proposes a framework for automating the tasks of requirements specification. The proposed framework involves computational mechanisms that enable the automatic generation of software design while requirements are discussed. The innovative aspect of this research comes from digitally transforming the software development life cycle (SDLC), where requirements are generated “on the fly” and virtual reality systems are in place. Potential to make change. The proposed framework has the potential to renovate the role of software analysts, who can experience a substantial reduction of manual tasks, more efficient communication, dedication to more analytical tasks, and assurance of software quality from the conception phases. This research idea paper introduces the framework for automating the task of requirements specification and reports our progress. We conclude the paper by outlining lessons learnt and future lines of work.

Living in a digital era, service providers are challenged to offer services to their customers through a wide spectrum of channels. The constant introduction of new devices and technology challenges organisations to provide rapid improvements of their IT infrastructure, while ensuring the best possible user experience. Digital transformation stimulates the adaptation of existing business models and the creation of new ones, while society adapts to new ways of interacting with services. Software systems are omnipresent in digital transformation processes, making software quality crucial to ensuring successful digital transformation [1].

This research project is supported by ZHAW Digital and the Digitalisation Initiative of Zürich Universities DIZH.

Software quality correlates with the success of requirements engineering (RE) sessions [2], which makes RE a crucial phase of the software development life cycle (SDLC) [3]. The agile movement has proposed user stories as a minimal but complete language for the specification of software requirements [4]. This language has proven successful and has been widely adopted by software developers [5]. Software requirements are collected during RE sessions in the shape of pictures, flip chart notes, documentation, etc. Later on, the relevant information is digitalised in order to specify a set of comprehensible user stories to be used during the development phases. Digitalising the discussed requirements demands extra effort from software analysts [6]. This complexity is magnified because software development often does not take place in a single geographical location: big companies make use of software providers located in different continents, and teleworking currently poses big challenges to collaborative requirements engineering [7].

The main objective of our research endeavour is to reduce the time-to-market of software products by automating the task of requirements specification while requirements are discussed. In this research idea paper, we introduce a framework for automating the task of requirements specification (see Fig. 1). We conceive a requirements engineering room where participants discuss requirements that are automatically specified in the shape of user stories and transformed into software prototypes. In this room, we incorporate virtual reality tools like double robots for the embodiment of remote participants, and interactive boards with collaborative tools, as they have been shown to facilitate access to, and real-time editing of, discussed requirements [8]. By implementing the proposed framework in practical settings, the SDLC is expected to go through a process of digital transformation in which software analysts experience a significant reduction of manual tasks. User stories will be generated on the fly during the session, and virtual reality systems will allow efficient communication. Software analysts are empowered to focus on meaningful tasks such as analysing and prioritising the created user stories. Software quality is then assured from the first release. The framework components are still in the design and evaluation phases. In particular, this paper reports our first steps towards automating the specification of user stories "on the fly" during RE elicitation sessions. We discuss the design and illustrate the use of the DEEP LEARNING CLASSIFIER and ONTOLOGY CRAWLER components presented in Fig. 1. Setting up the requirements engineering room and generating software prototypes are part of our short-term research plans.

Paper Organisation: After reviewing related work in Sect. 2, we introduce our advances on providing automatic specification of user stories in Sect. 3. We summarise the design of two components: a deep learning classifier in Sect. 3 

In the field of requirements engineering, several related works approach the challenge of automating requirements specification from different angles. We analyse these approaches based on: (a) the requirements source: audio recordings/transcripts from requirements meetings, tweets, bug reports, user stories, existing documentation, domain repository; (b) the generated requirements specification, in the shape of: meeting minutes, knowledge extraction, tweet classification, relevant topics, remedied user stories, meeting summaries, and user stories; (c) the existing validation or evaluation: laboratory demonstration, comparative experiment; (d) whether or not tool support exists; and (e) whether or not the approach has been applied in practice.

Some works focus on supporting software requirements specification by generating meeting minutes. For instance, Kaiya et al. [9] propose a tool to support requirement elicitation meetings by recording the sessions and providing an assistant tool to manage the recordings and mark the important points via hypertext. The authors conclude that further collaboration mechanisms need to be incorporated to facilitate real-time editing of requirements and knowledge sharing. After Murray et al. [10] developed a natural language processing approach to summarize emails and conversations in general, more projects involving textual sources appeared. Especially in the field of machine learning, multiple techniques were developed to extract information relevant to requirements engineering from different written sources [4, 11-14].

Rodeghero et al. [13] proposed a machine learning classification algorithm trained to recognise user story information [15]. The authors found that information about software functionality and requirements rationale can be identified by means of classification algorithms. Nevertheless, no information about the role can be automatically extracted. Another tool-assisted approach to dynamic requirement elicitation was introduced by Abad et al. [14]. The tool extracts relevant snippets and simultaneously uses a third-party API to recognize the tone and intentions of the speakers making the statements.

Taking the aforementioned research works as a reference, we aim to cover the gap of providing complete user stories from spoken software requirements during elicitation sessions (see the last row in Table 1).

Our goal is to elicit complete user stories including information related to "Role, Function and Rationale". Based on the research work presented by Rodeghero et al. [13], we propose to classify software functionality in terms of functional and non-functional requirements, as well as to identify the requirements' rationale from requirement elicitation sessions. Our research strategy is summarised in Fig. 2. In short, our research idea is to build a deep learning algorithm that can be further trained by providing labelled requirements elicitation sessions. For identifying missing roles, we propose to make use of existing ontologies that provide information about the typical roles belonging to the context in which software elicitation sessions take place.
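The three-part structure targeted here can be illustrated with a minimal Java sketch. It assumes that function and rationale arrive from the classifier while the role is filled in later from an ontology; the class name `UserStorySketch`, its methods, the "As a ..., I want to ..., so that ..." rendering and the example strings are hypothetical illustrations and are not taken from our prototypes.

```java
/** Hypothetical sketch of a user story with role, function and rationale. */
class UserStorySketch {
    final String role;      // assumed to come from the ontology crawler
    final String function;  // assumed to come from the classifier
    final String rationale; // assumed to come from the classifier

    UserStorySketch(String role, String function, String rationale) {
        this.role = role;
        this.function = function;
        this.rationale = rationale;
    }

    private static boolean isPresent(String s) {
        return s != null && !s.trim().isEmpty();
    }

    /** A story counts as complete only when all three parts are present. */
    boolean isComplete() {
        return isPresent(role) && isPresent(function) && isPresent(rationale);
    }

    /** Returns a copy of this story with the given role filled in. */
    UserStorySketch withRole(String newRole) {
        return new UserStorySketch(newRole, function, rationale);
    }

    /** Renders the story in a common template; rejects incomplete stories. */
    String render() {
        if (!isComplete()) {
            throw new IllegalStateException("incomplete user story");
        }
        return "As a " + role + ", I want to " + function
                + ", so that " + rationale + ".";
    }

    public static void main(String[] args) {
        // Illustrative values only: no role was detected in the conversation.
        UserStorySketch partial = new UserStorySketch(
                null, "remove products", "stock stays accurate");
        System.out.println(partial.isComplete());
        System.out.println(partial.withRole("product manager").render());
    }
}
```

Keeping the role as a separate, initially missing field mirrors the observation that roles are rarely mentioned in elicitation conversations and have to be supplied from another source.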

In this paper, we summarise the deep learning classifier (see Sect. 3.1) and the ontology crawler (see Sect. 3.2) components. For implementation purposes, we chose the Java language because it guarantees portability and its popularity results in maintained and tested frameworks that we can use. Java offers a sophisticated deep learning framework, DL4J, and the Java OWL API for handling ontology files. The components will later provide the data to be used by the user story assembler component (out of the scope of this paper).

We propose a deep learning classifier based on the work of Rodeghero et al. [13]. We use deep learning specifically because our intention is to imitate the classification process that they carried out with machine learning. In this way, we can compare performance values in subsequent experiments.

A turn is an established unit of analysis in natural language processing, used instead of single sentences. It describes the span during which one person speaks in a conversation, in between other speakers. To represent a turn in a learnable format, we use the word embeddings provided by the model described by Pennington et al. in "GloVe: Global Vectors for Word Representation" [17]. Here, words are represented as multidimensional, real-valued vectors. Pictured in a three-dimensional space, similar words lie closer together than words that differ semantically. Several pre-trained word embedding models are available. The available representation dimensions depend on the model but range from 50 to 300. The models also differ in terms of topic and number of tokens.

For our initial development process, we used the smallest set, "Wikipedia 2014 + Gigaword 5", which consists of 6 billion tokens, with a representation of 50 dimensions.
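To make the representation concrete, the following is a minimal Java sketch, assuming that a turn is embedded by averaging the vectors of its known words. How word vectors are combined into a turn vector is not specified above, so the averaging step, the class name `TurnEmbeddingSketch` and the toy three-dimensional vectors are illustrative assumptions rather than our implementation. In practice, the vectors would be loaded from a pre-trained GloVe file with 50 or more dimensions.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: a turn as the average of its word vectors. */
class TurnEmbeddingSketch {

    /** Toy 3-dimensional vectors standing in for pre-trained embeddings. */
    static Map<String, double[]> toyVectors() {
        Map<String, double[]> vectors = new HashMap<>();
        vectors.put("add", new double[] {0.9, 0.1, 0.0});
        vectors.put("remove", new double[] {0.8, 0.2, 0.1});
        vectors.put("product", new double[] {0.1, 0.9, 0.2});
        vectors.put("fast", new double[] {0.0, 0.1, 0.9});
        return vectors;
    }

    /** Averages the vectors of all known words; unknown words are skipped. */
    static double[] embed(String turn, Map<String, double[]> vectors, int dim) {
        double[] sum = new double[dim];
        int known = 0;
        for (String token : turn.toLowerCase(Locale.ROOT).split("\\W+")) {
            double[] v = vectors.get(token);
            if (v == null) {
                continue; // out-of-vocabulary token (or empty string)
            }
            for (int i = 0; i < dim; i++) {
                sum[i] += v[i];
            }
            known++;
        }
        if (known > 0) {
            for (int i = 0; i < dim; i++) {
                sum[i] /= known;
            }
        }
        return sum; // all zeros when no word of the turn is known
    }

    /** Cosine similarity; returns 0 if either vector has zero length. */
    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return (na == 0 || nb == 0) ? 0.0 : dot / Math.sqrt(na * nb);
    }

    public static void main(String[] args) {
        Map<String, double[]> vectors = toyVectors();
        double[] t1 = embed("Please add the product", vectors, 3);
        double[] t2 = embed("Remove that product", vectors, 3);
        double[] t3 = embed("It must be fast", vectors, 3);
        System.out.println("similar turns:    " + cosine(t1, t2));
        System.out.println("dissimilar turns: " + cosine(t1, t3));
    }
}
```

With these toy values, the two turns about products end up closer together than the turn about speed, which is the property the classifier relies on.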

The implementation of the deep learning classifier is available in our public GitHub repository at https://github.com/lmruizcar/requirements classifier. An example of classification is presented in Fig. 3. In its current state, the model performs about as well as random guessing because we still lack data for training purposes. As mentioned in [16], the lack of data from requirements elicitation sessions is an obstacle in this type of investigation. Our model differentiates between three labels: None (0), Non-Functional (1) and Functional (2). A caveat of this deep learning approach is that it only indirectly accounts for the fact that a turn can be labelled both 1 and 2. Whereas [13] built multiple binary classifiers, each of which analysed the turn, our approach uses a SoftMax layer whose output is interpretable as probabilities. A turn that falls into both categories would have probabilities of around 0.5 for both labels, which can be interpreted individually but is not represented in the standard evaluation method of machine learning classifiers.

Rodeghero et al. stated that, in conversations in requirement elicitation meetings, "only 0.5% discussed role" [13], which is why they concluded that it is not feasible to extract role information from transcripts. For a complete user story, however, this role information is crucial. An established practice in information science is the use of ontologies to organize data and reduce complexity. We therefore propose the ontology crawler component. The ontology crawler searches an ontology for defined entities and their restrictions to identify possible roles in the required context. As input, it takes an ontology from a file in Web Ontology Language (OWL) format. As output, it generates a list of foundational user stories, each consisting of a role, an action and an object. For this, we have implemented a prototype based on the Java OWL API.
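The crawler's input/output contract can be sketched in plain Java without the OWL API. The sketch below assumes that the ontology has already been reduced to simple subject-property-filler restrictions and that the set of role classes is known; the class `OntologyCrawlerSketch`, the `Restriction` type, the entity names and the sentence template are hypothetical and only illustrate how a role, an action and an object could be combined. Reading restrictions from an actual OWL file is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of deriving foundational user stories. */
class OntologyCrawlerSketch {

    /** Stand-in for a restriction such as "ProductManager add some Product". */
    static final class Restriction {
        final String subject;
        final String property;
        final String filler;

        Restriction(String subject, String property, String filler) {
            this.subject = subject;
            this.property = property;
            this.filler = filler;
        }
    }

    /** Emits one role-action-object story per restriction on a known role. */
    static List<String> crawl(List<Restriction> restrictions, Set<String> roles) {
        List<String> stories = new ArrayList<>();
        for (Restriction r : restrictions) {
            if (roles.contains(r.subject)) {
                stories.add("As a " + r.subject + ", I want to "
                        + r.property + " " + r.filler);
            }
        }
        return stories;
    }

    public static void main(String[] args) {
        // Illustrative entities only; a real run would read them from an ontology.
        List<Restriction> restrictions = Arrays.asList(
                new Restriction("ProductManager", "add", "Product"),
                new Restriction("ProductManager", "remove", "Product"),
                new Restriction("Customer", "buy", "Product"),
                new Restriction("Product", "hasPrice", "Price")); // not a role
        Set<String> roles = new HashSet<>(Arrays.asList("ProductManager", "Customer"));
        crawl(restrictions, roles).forEach(System.out::println);
    }
}
```

Restrictions whose subject is not a role, such as the price of a product, are skipped, so only statements that describe what an actor can do become foundational user stories.
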
The prototype is available in our public GitHub repository at https://github.com/lmruizcar/ontology crawler. Figure 4 exemplifies the generation of foundational user stories for the SmallShop case. A product manager needs to be able to add products to the shop and remove them, e.g., if they are not in stock anymore. The customers that use the shop want to buy 

The current paper presents a research idea for automating the process of requirements specification. We propose a framework consisting of a requirements engineering room and components to support the automatic generation of user stories and software prototypes. This paper presents results from the implementation of the components for the automatic generation of user stories while requirements are discussed. Our efforts led to the development of prototypes for a deep learning classifier and an ontology crawler. Initial results are promising and indicate the feasibility of the proposed research idea. The prototypes are made available in our public GitHub repository to motivate further research in the field.

In the near future, we plan to keep evolving the prototype of the user story assembler component. In this way, we can obtain full user stories from elicitation sessions. For our framework to mature and be implemented in practical settings, we envision building a flexible environment that supports the plug-and-play of the components that make up the framework. In this way, we can incorporate alternative components while ensuring proper interoperability. In addition to evaluating the extent to which our solution improves the generation of user stories in terms of efficiency, we plan to evaluate the intention to use and the usability of the framework from the perspective of requirements engineers. In the long term, we plan to investigate the quality of software engineering recordings to further adjust and improve our framework.

[1] Challenges of the digital transformation in software engineering

[2] Does quality of requirements specifications matter? Combined results of two empirical studies

[3] The role of requirement engineering in software development life cycle

[4] Agile requirements engineering with user stories

[5] Working software over comprehensive documentation - rationales of agile teams for artefacts usage

[6] FlexiSketch: a lightweight sketching and metamodeling approach for end-users

[7] RE challenges in multi-site software development organizations

[8] Sketching and notation creation with FlexiSketch team: evaluating a new means for collaborative requirements elicitation

[9] Design of a hyper media tool to support requirements elicitation meetings

[10] Summarizing spoken and written conversations

[11] A little bird told me: mining tweets for requirements and software evolution

[12] Summarizing software artifacts: a case study of bug reports

[13] Detecting user story information in developer-client conversations to generate extractive summaries

[14] ELICA: an automated tool for dynamic extraction of requirements relevant information

[15] TraceLab components for generating extractive summaries of user stories

[16] Behavior-informed algorithms for automatic documentation generation

[17] GloVe: global vectors for word representation