key: cord-0038943-av9e296k
authors: Joubert, Anke; Murawski, Matthias; Bühler, Julian; Bick, Markus
title: Happiness and Big Data – Theoretical Foundation and Empirical Insights for Africa
date: 2020-03-06
journal: Responsible Design, Implementation and Use of Information and Communication Technology
DOI: 10.1007/978-3-030-44999-5_37
sha: e1ad46544249aa05701560e160e20de4e0814351
doc_id: 38943
cord_uid: av9e296k

Big data has gained academic relevance over the last decade and is also of interest to other role-players such as governments, businesses and the general public. Based on our previous work on the Big Data Readiness Index (BDRI) we place the focus on one under-investigated aspect of big data: the linkage to happiness. The BDRI, applied on Africa, includes the topic of happiness within the digital wellbeing driver, but the link between the two topics requires further investigation. Thus, two underlying questions emerge: what is the relation between happiness and big data? And how does Africa perform in digital wellbeing? This paper includes a structured literature review highlighting five key clusters indicating this link. Furthermore, we present some first empirical insights using the BDRI focusing on Africa. Overall, the African continent performs best in the social inclusion cluster of happiness, with the most room for improvement in the job creation cluster.

Big data analytics is a research field that has gained academic relevance over the past 12 years. Big data is not only of interest for academics, but also for governments, businesses and society. This is due to its ability to use existing data for improved operational efficiency, better decision making, facilitation of innovation and delivery of solutions with a social and developmental impact, amongst other things [1, 2]. In this paper, we place the focus on an under-investigated aspect of big data: the linkage to the topic of happiness.
As shown by Wang et al., this linkage has been addressed in only a few studies, although it is becoming increasingly important [3]. Aside from concrete economic measures, happiness is increasingly considered an important aspect of country indices and rankings, as indicated, for instance, by the United Nations (UN)-published World Happiness Report or the emergence of the concept of Gross National Happiness (GNH) [4]. Happiness research stretches across domains, playing a large role in philosophy, psychology, religion, environmental studies, healthcare, politics and economics [5]. Throughout history, large events have affected how people define happiness. The age of big data is expected to also have an impact on human happiness. In a previous paper [6], we have suggested the Big Data Readiness Index (BDRI), which can be used to compare big data readiness on a country level. The BDRI, which is built on the five prominent v's (volume, variety, velocity, veracity and value), includes the topic of happiness under veracity, more specifically under the driver digital wellbeing. There is a need for deeper research into the relation between happiness and big data, focusing on a stronger theoretical foundation. Based on this, we formulate the following research question that will be answered by conducting a structured literature review:

RQ1: What is the relation between happiness and big data?

The second objective of this paper is to present some empirical insights on big data and happiness. Therefore, we apply the BDRI driver digital wellbeing using open data focused on the African continent. Our interest in Africa is grounded in a massive underrepresentation of this continent in big data research, resulting in a research gap. Furthermore, our BDRI has been developed in the context of the particular aspects of Africa [6]. Thus, the second research question is:

RQ2: How do African countries perform in terms of digital wellbeing?
The remainder of this paper is organized as follows: We begin with a general introduction of the topics of happiness and big data, as well as our research focus on Africa in Sect. 2. We present our literature review in Sect. 3, before the research questions are answered in Sect. 4, based on our findings. We conclude the paper in Sect. 5.

Constantinescu [7] explains how the concept of happiness has always been present and is constantly being reshaped. Definitions of happiness place the focus on different aspects, for instance: a virtue that could be obtained through values accepted by society that represented the good, true and beautiful [8], balance between nature and the world's will [7] or long life, riches, health, love of virtue, and a natural death [9]. The industrial revolution moved the definition of happiness to include consumerism through the increased aspiration to own material goods [7]. Twentieth century media spread this conception of success through owning. The financial crisis questioned the absence of morals within happiness through consumerism [8]; consequently, literature on the failure of consumerism to create happiness emerged. Seeing how happiness has always played a role in society and changed according to civilization's circumstances makes it evident that the age of analytics will also have an impact on human happiness. Criticism of happiness research points to the fact that happiness is a highly individual trait and thus aggregating it across a population can be misleading [7]. Taking this into consideration, indexes should carefully consider how different individuals would experience different circumstances, and weight and collect indicators accordingly. Therefore, a Human Wellbeing Index (HWI) together with an Ecosystem Wellbeing Index (EWI) have been developed, striving to find a balance between good human and good ecosystem conditions.
The HWI was built on five categories of wellbeing: health and population, wealth, knowledge and culture, community and equity [9] . In July 2011, the UN General Assembly invited member states to measure the level of happiness of their citizens and use this as a guide for developing public policies, followed by the first UN summit on happiness and wellbeing in early 2012. From this year onwards, the UN published an annual World Happiness Report. The World Happiness Report takes variables such as GDP per capita, social support, life expectancy at birth, freedom to make life choices and the level of corruption into account [7] . In this study we understand happiness as an individual attribute that can be defined differently among societies and individuals. It points towards the positive personal or societal experience or value, driven through different factors referring to life quality and purpose, societal equality, social interaction, human rights, and the surrounding environment. While measuring development and setting up policies or rules, it is necessary to consider happiness. There is no globally accepted definition for the term 'big data'. The complexity of defining this term increased due to a shared origin and usage between academia, industry, media together with wide public interest. According to Ward and Baker [10] , large international IT role-players also provide conflicting definitions of big data, including: a process of applying serious computing power, including techniques such as machine learning and artificial intelligence (Microsoft) or the derivation of value from traditional relational databases by augmenting it with new sources of unstructured data (Oracle). Fosso-Wamba et al. 
[11] summarize some descriptions of the impact of big data in previous literature: the next big thing in innovation; the fourth paradigm of science; the next frontier for innovation, competition, and productivity; and a revolution in science and technology. The diverse definitions provided by various stakeholders led to the emergence of literature that attempts to establish a common definition. The first academic concepts associated with big data describe big data using the three v's: volume, velocity, and variety. This approach has been reiterated in various studies [12] [13] [14]. Volume refers to the size and magnitude of data, whereas velocity refers to the speed and frequency of generating data. Variety highlights the fact that big data is generated from a large number of sources, formats and types that include structured and unstructured data [14]. A fourth v for value and a fifth for veracity, sometimes referred to as verification, were subsequently suggested. Value emphasizes the importance of extracting economic benefits from data [15, 16]. Veracity stresses the importance of data quality and includes security measures and techniques to assure trustworthy big data analysis results [17, 18]. This paper considers big data as the overarching approach of collecting, managing, processing and gaining value from the 5 Vs (volume, variety, velocity, veracity and value). The BDRI measures big data readiness with these five v's as core components.

Data-related analysis and academic studies focusing on the African continent are scarce due to limited data availability, old and inaccurate data, limited coverage of the available data and a small reference research pool [6]. Due to these challenges, Africa has often been referred to as the continent of missing data. Duermeijer et al. [19] confirm that Africa produces less than 1% of the world's research even though 12.5% of the world's population is from Africa.
With regard to published big data related literature up to 2015, most authors are from China, the USA, Australia, the UK and Korea [20]. This unequal geographic coverage raises the question of whether this is due to a global big data divide or a lack of the essential knowledge needed to undertake big data studies [6]. If big data implementation excludes the developing world, it can lead to greater inequalities. Additional challenges such as data availability and infrastructure hinder easy implementation of big data analytics, but the implementation of big data itself can have large positive effects that alleviate these initial shortcomings [21]. With the spread of smart mobile devices in developing countries enabling the collection of multiple data inputs for possible big data solutions, big data research no longer has to be restricted to the developed world. Bifet [22] estimates that 80% of mobile phones are located in developing countries. These developments will increase data availability in Africa and allow new sources of collected data to emerge. Furthermore, Africa has a population of 1.2 billion people, of whom around 60% are under 25 years old [23]. This growing number of future digital natives, who will increasingly be generating data, shows the research potential in previously unconnected countries. Our focus on Africa will attempt to fill this geographical research gap and increase the research pool in order to encourage other authors to focus on this high-potential region. We will make use of open data in order to make a country-based comparison, using the BDRI [6] and specifically focusing on the digital wellbeing driver within the veracity component to see how happiness and big data are connected to answer RQ2.

A core step which links the theoretical foundation with the empirical analysis in our paper is a structured literature review on the two leading terms, "happiness" and "big data".
Lively discussions exist among information systems (IS) researchers about how to approach a literature review. Some authors argue for a comprehensive approach that covers all articles somehow related to the review topic [e.g., 24]; others, such as vom Brocke et al. [25], explicitly argue for excluding articles while being as transparent as possible. For this review, we apply a hybrid approach of systematicity [26], because we are particularly interested in the combination of the two terms "happiness" and "big data". Thus, we exclude articles that can be associated with only one of the key terms to achieve a holistic review of all scientific articles that combine both terms. In the following phase, we use three exclusion criteria to filter the articles in a funneling process. The final search string for the queries consequently is:

happiness AND 'big data'

Following the guideline by Webster and Watson [27], we screened three major databases for scientific publications: EBSCO, JSTOR, and Web of Science. This guarantees a good coverage of articles positioned beyond the boundaries of IS, which is in line with the authors' advice for a literature review in an interdisciplinary and connected field such as IS [27]. The results that cover all available fields in the databases' advanced search options (e.g., title, keywords, abstract) are presented in the following section.

Search results for "happiness" alone, which we initially used to crosscheck our combined search string, revealed more than 20,000 articles. Combined with the second term "big data", the queries applied to the three databases listed a total of 154 peer-reviewed journal articles. We applied a set of three major criteria to limit the number of articles to a reasonable and suitable number for our purposes. First, we downloaded and screened the title, keywords and abstracts of all articles and used this as the first methodological screening criterion [26].
In this step, duplicated articles, non-English articles and certain types of articles such as short editor's notes serving as a preface or front and back matter were eliminated (A). Furthermore, we eliminated articles that are directly related to the medical or health sector: they use happiness in the context of wellbeing after medical treatment, which is out of scope for this study (B). In a final step, we conducted an individual screening of all the articles and applied final content-related criteria. All articles that refer only once to either of the two core terms were removed. These articles appear correctly as search results but provide at most a very vague connection between the two terms. Additionally, we eliminated articles that are out of scope (C). In total, 20 articles with a strong relation between the terms "happiness" and "big data" remained after the three elimination steps. The initial pool of articles, those excluded at each step, as well as those that remained and form our final sample are summarized in Table 1.

Based on the literature review results (cf. Table 1), we extracted a total of 20 peer-reviewed articles, which we then clustered regarding potential linkages between happiness and big data. We screened the articles and aggregated them according to central themes, research areas and methodology. The resulting clusters, which are also found in the BDRI, will be explained in this section, specifically indicating how big data could affect each cluster and how this in turn affects happiness, thereby establishing the general relation between big data and happiness.

A first cluster we could identify can be associated with jobs, specifically job creation and increased productivity. According to one paper we associated with this cluster, Frey and Stutzer [28], unemployment has a strong negative effect on the individual as well as an entire society.
Consequently, lowering unemployment potentially leads to greater levels of happiness, greater bonds towards a community, and life satisfaction [29]. This effect is even stronger for younger adults [30]. The implementation of big data concepts can support the process of job creation in general, but its effect differs across target groups within a society. New jobs will predominantly increase the happiness of a minority that, owing to a reasonable level of education, has the option to work in big data fields. To address job creation coherently, we put our focus on big data concepts that also impact existing jobs. Heeks [31] uses Bhutan as an example where e-agriculture can assist farmers, who represent the majority of the population. In this example, big data concepts support agricultural extension, better planting, cropping, animal husbandry, and monitoring market prices for better market revenue. Implementation of big data analytics can not only improve productivity within existing jobs but also lead to the creation of new jobs. Evidently, a meaningful source of income and higher productivity have positive impacts on the level of happiness in society, which explains how big data can contribute to happiness through the cluster of job creation and productivity.

Big data can also be used to reduce socio-economic inequalities, our second cluster. Investments in new technologies can have a reasonable positive impact on multiple users simultaneously. They can benefit from one investment, such as a shared computer in a school. In return, research results indicate that countries with poorer socio-economic conditions, e.g., less or no investment, also perform worse in happiness studies [8]. Other meta studies in the review addressed socio-economic equality using big data concepts to investigate links between various factors that consolidate a society, such as religiousness [32] or a predominant mindset [33].
This cluster links big data to happiness through the potential impact of big data on reducing socio-economic inequalities, which creates a happiness surplus through having a more equal society.

Furthermore, the literature review revealed a third cluster referring to social inclusion, which covers closer relationships with others, including friends and family, that ultimately increase happiness. Communication networks such as social media play a key role in this cluster. This has been analyzed by researchers with the help of big data concepts. For example, Dodds et al. [34] used large-scale text and word analyses to measure what they call 'societal happiness' on Twitter. Kosinski et al. [35] predicted 'sensitive personal attributes' including happiness based on nearly 60,000 digital Facebook records. Similarly, helping others, or the Good Samaritan effect, is shown to increase the happiness of the helper as well as of the one being helped [36]. However, a lack of emotional expressions communicated by others via social media services tends to lower an individual user's level of happiness [37]. As big data has the potential to provide the opportunity of higher connectivity, this cluster improves the societal happiness of those experiencing a higher degree of social inclusion.

A fourth cluster called good governance comprises governmental aspects such as political participation and trust in the government's policies and performance [38], where big data can help increase the level of transparency towards citizens [39]. Democracy itself increases happiness as the majority of citizens have voted to get their preferred policies in place [28]. The government can use big data technologies for improved decision making and for implementing policies more effectively.
Two real-life examples include the use of mobile phone data and airtime credit purchases to estimate food security in East Africa and the mining of citizen feedback data in order to gain input for government decision making in Indonesia [40, 41]. This reveals that big data can lead to better governance, which in turn will impact overall happiness, as shown by happiness studies. In addition, our literature review revealed contexts of happiness beyond the politically driven ones, namely the life satisfaction of people as influenced by their residential status. Jokela et al. [42] used big data techniques to analyze more than 56,000 records. This study found a strong link between "life satisfaction" and "neighborhood characteristics", which were set up by public authorities.

The final cluster, healthy environment, has some connections with good governance but with a stronger focus on environmental aspects. As people, we are highly dependent on our environment for daily living, especially in developing regions such as Africa, where agriculture is still the largest sector in more than half of the countries [5]. Big data concepts that address superordinate living conditions are particularly relevant. One example from our literature review is a study by Zhao et al. [43] linking wellbeing, amongst other things, with economic growth and green space in larger Chinese cities. Results reveal a significant positive relationship between wellbeing and a high percentage of green space in a city. A healthy environment includes low levels of pollution, good water quality as well as the concept of sustainability, which is gaining popularity especially among millennials [44]. Cloutier et al. [45] show how well individual cities and communities embrace sustainable practices and how these practices translate to opportunities for residents to pursue happiness.
Applying big data to create smart cities will lead to a more sustainable and effective society, and by doing so the healthy environment cluster is the fifth identified cluster through which big data relates to higher levels of happiness.

Before discussing our specific empirical findings for Africa in the next section, we would like to address the initially raised research question RQ1 by summarizing that the relation between happiness and big data can be constituted by the five extracted clusters that are found in the BDRI: job creation, socio-economic equality, social inclusion, good governance, and healthy environment.

We use the BDRI to zoom into digital wellbeing. The BDRI was developed using the five v's (volume, variety, velocity, veracity and value) as components, with sub-drivers. Digital wellbeing is a driver under veracity, including indicators for the five previously extracted clusters: job creation including improved productivity, socio-economic equality, social inclusion, good governance and a healthy environment. The digital wellbeing driver is based on happiness research. Heeks [31] suggests a model that looks at substrates of happiness and unhappiness to link ICT to happiness. This has been adapted in the BDRI to incorporate possible links of big data as a technological priority to happiness. Other BDRI components such as trust and security deal with eliminating unhappiness caused by implementing big data technologies; thus, digital wellbeing focuses on five clusters contributing to societal happiness. Indicators used to measure these five happiness clusters include, amongst others, the unemployment rate to indicate the opportunity of job creation and improved productivity through big data innovation [46]. The human development index score, which takes three characteristics into account (a long and healthy life, access to knowledge and a decent standard of living), is used to show socio-economic equality [47].
Policies for social inclusion and equity, including gender equality policies, equity of public resource use, social protection policies and policies for institutional sustainability, are included in the social inclusion measure [48]. Political participation, political rights, legitimacy of policy, electoral process, power to govern, workers' rights, the right to collective bargaining, freedom of expression, association and the press, and electoral self-determination are included as measures for good governance [49]. The indicator for a healthy environment measures sustainability by considering the share of renewable electricity in total electricity generated by all types of plants, and other factors [50].

Isolating and aggregating these indicators of the BDRI digital wellbeing driver shows how Africa performs in terms of the big data related happiness clusters (cf. Fig. 1). All countries in the upper quintile, with the exception of Cameroon, lie in the Southern Hemisphere, of which Sao Tome & Principe, Gabon and Kenya lie on the equator. Looking at the different regions, such as North Africa, East Africa and Southern Africa, Southern Africa is also the top performing region in the overall BDRI. Thus, it is interesting to have a look at the top 10 BDRI performers and look at differences in terms of digital wellbeing. The BDRI top ten includes mostly coastal and island nations, with Rwanda as the only landlocked exception [6]. Even though this group of 10 countries outperforms their peers in terms of big data readiness, there is a large gap in terms of their performance in digital wellbeing. Missing data allows only limited comparison in two of the five happiness clusters: job creation and social inclusion. Using various indicators to aggregate the five clusters of digital wellbeing in the BDRI gives some insights on the performance of African countries, not only within big data readiness, but also within the specific digital wellbeing driver.
These indicators were selected following the Design Science Research approach [15]. Namibia, ranking sixth in the overall BDRI (cf. Fig. 2), performs best in the digital wellbeing driver. This desert nation performs well in the fields for which data is available, showing its highest performance in the healthy environment cluster. Amongst many factors included in the BDRI digital wellbeing driver, Namibia's mineral wealth, which is well managed from a political and legal perspective, has allowed it to avoid the 'resource curse' seen across Africa, where countries rich in resources have low growth [51]. South Africa, in second position in both the overall BDRI and the digital wellbeing driver, performs well compared to its African peers in all clusters except healthy environment. This field includes sustainability through the proxy of the share of renewable electricity in total electricity. South Africa derives 70% of its total primary energy supply from coal, making this economy a large producer of greenhouse gases [52]. Kenya is the third-best performer in digital wellbeing from the BDRI top 10, followed by the island nations Seychelles and Mauritius in fourth and fifth place. Of the BDRI top ten, Morocco ranks lowest in digital wellbeing, with a low comparative score in most digital wellbeing clusters. Amongst other clusters, the indicators aggregated to proxy job creation show Morocco to have high labor market inefficiency, with high minimum wages and labor taxes being the main points of concern [46].

To answer research question RQ2: major differences in digital wellbeing exist across the 54 African countries. An overall view on digital wellbeing in Africa shows that on average the continent performs best in the cluster of social inclusion. Overall, the countries struggle most with socio-economic equality and job creation. This calls for policy intervention to focus on decreasing socio-economic inequality and reducing unemployment.
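
The aggregation of cluster indicators into a single digital wellbeing score, as described in this section, can be illustrated with a minimal sketch. It is not the BDRI implementation: the min-max normalization, the equal weighting of indicators, the treatment of missing values and the country names and figures are all assumptions made for illustration, since the text does not specify these details.

```python
from statistics import mean

# Hypothetical indicator values for illustration only; these are NOT BDRI data.
# None marks a missing observation (missing data is a known issue for Africa).
RAW = {
    "Country A": {"unemployment": 5.0, "hdi": 0.70, "renewables": 60.0},
    "Country B": {"unemployment": 25.0, "hdi": 0.50, "renewables": None},
    "Country C": {"unemployment": 15.0, "hdi": 0.60, "renewables": 20.0},
    "Country D": {"unemployment": None, "hdi": 0.40, "renewables": 100.0},
    "Country E": {"unemployment": 10.0, "hdi": 0.55, "renewables": 40.0},
}

# Assumption: a lower unemployment rate is better, so its scale is inverted.
LOWER_IS_BETTER = {"unemployment"}


def min_max_scores(raw, indicator):
    """Rescale one indicator to [0, 1] across countries, keeping gaps as None."""
    observed = [v[indicator] for v in raw.values() if v[indicator] is not None]
    lo, hi = min(observed), max(observed)
    scores = {}
    for country, values in raw.items():
        x = values[indicator]
        if x is None:
            scores[country] = None
            continue
        s = 0.5 if hi == lo else (x - lo) / (hi - lo)
        scores[country] = 1.0 - s if indicator in LOWER_IS_BETTER else s
    return scores


def driver_scores(raw):
    """Average the available normalized indicators per country (equal weights assumed)."""
    indicators = sorted({name for values in raw.values() for name in values})
    per_indicator = {name: min_max_scores(raw, name) for name in indicators}
    result = {}
    for country in raw:
        available = [per_indicator[name][country] for name in indicators
                     if per_indicator[name][country] is not None]
        result[country] = mean(available) if available else None
    return result


def upper_quintile(scores):
    """Return the top 20% of scored countries (at least one), best first."""
    ranked = sorted((c for c in scores if scores[c] is not None),
                    key=lambda c: scores[c], reverse=True)
    return ranked[:max(1, len(ranked) // 5)]


scores = driver_scores(RAW)
top = upper_quintile(scores)
```

Averaging only the indicators that are available keeps countries with gaps in the comparison, but it also means that their scores rest on fewer clusters, which is one reason why missing data limits comparability.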
In a previous paper [6] , we have suggested the Big Data Readiness Index (BDRI) that measures big data readiness on a country level and by using open data specifically applied this index to Africa. The BDRI includes the topic of happiness under veracity within the driver digital wellbeing. By completing a structured literature review, this paper strengthens the theoretical foundation for the under-investigated relation of big data and happiness. Five overall clusters form the link between happiness and big data: job creation including improved productivity, social-economic equality, social inclusion, good governance and a healthy environment. Using the BDRI digital wellbeing driver allowed a closer look into happiness across Africa. Empirical findings show that the top 20% of countries all lie within the Southern hemisphere, with Cameroon as the only exception. Although African countries vary highly regarding ranking and performance within each happiness cluster, Namibia, South Africa and Kenya are the top three performers within overall digital wellbeing from the countries in the BDRI top 10. Our study also has limitations which can be used as avenues for future studies based on the BDRI. It could be beneficial in terms of the model design to investigate causality and the direction of effects between happiness and big data. Regarding the country-specific analysis, we focused on Africa in its entirety and highlighted specific countries, which revealed significant results (cf. Fig. 2 ). Future studies could distinctively focus on the integration of cultural studies though, such as GLOBE or Hofstede, and compare the cultural clusters within Africa with other cultural clusters outside Africa. This would also help overcome the issue of scarce data availability for many countries in Africa, which makes empirical research in this area more difficult. A larger data pool would also be beneficial for consolidating the BDRI results. 
Creating strategic business value from big data analytics: a research framework Big data and city living -what can it do for us Towards felicitous decision making: an overview on challenges and trends of Big Data #iamhappybecause: Gross National Happiness through Twitter analysis and big data How good is life in Africa? Digging deeper than GDP Big data readiness index -Africa in the age of analytics A new design of happiness in the context modern world Sustainable Development Solutions Network The Wellbeing of Nations. A Country-by-Country Index of Quality of Life and the Environment Undefined By Data: A Survey of Big Data Definitions How 'big data' can make big impact: Findings from a systematic review and a longitudinal case study Effects of data set features on the performances of classification algorithms Big data: the management revolution Design science in information systems research Business intelligence and analytics: from big data to big impact Digital workplaces The value of Big Data in servitization Africa generates less than 1% of the world's research; data analytics can change that. An in-depth analysis of the continent's research reveals promising developments -and strategies for continued improvement Critical analysis of Big Data challenges and analytical methods Reflections on societal and business model transformation arising from digitization and big data analytics: a research agenda Mining big data in real time Digital planet: how competitiveness and trust in digital economies vary across the world Understanding frameworks and reviews Reconstructing the giant: on the importance of rigour in documenting the literature search process What literature review is not: diversity, boundaries and recommendations Analyzing the past to prepare for the future: writing a literature review What can economists learn from happiness research? 
Re-considering the linkage between the antecedents and consequences of happiness The rising well-being of the young Information technology and gross national happiness A data mining and data visualization approach to examine the interrelationships between life satisfaction, secularization and religiosity Mining-based lifecare recommendation using peer-to-peer dataset and adaptive decision feedback. Peer-To-Peer Temporal patterns of happiness and information in a global social network: hedonometrics and Twitter Private traits and attributes are predictable from digital records of human behavior The good Samaritan effect: a lens for understanding patterns of participation Experimental evidence of massive-scale emotional contagion through social networks Transforming by metrics that matter -progress, participation, and the national initiatives of fixing well-being indicators Big data: the next frontier for innovation, competition, and productivity Estimating Food Consumption and Poverty Indices with Mobile Phone Data UN Global Pulse: Mining Citizen Feedback Data for Enhanced Local Government Decision-Making Geographically varying associations between personality and life satisfaction in the London metropolitan area An analysis of well-being determinants at the City Level in China using big data Sustainable millennials The Sustainable Neighborhoods for Happiness Index (SNHI): a metric for assessing a community's sustainability and potential influence on happiness United Nations Development Programme: Human Development Data World Bank Group: CPIA database -policies for social inclusion/equity cluster average (1 = low to 6 = high Political participation Institutions and the resource curse Energy policies for sustainable development in South Africa