key: cord-0507675-hyhe1wn2 authors: Wang, Junda; Zhang, Xupin; Luo, Jiebo title: How COVID-19 Has Changed Crowdfunding: Evidence From GoFundMe date: 2021-06-18 journal: nan DOI: nan sha: 30b72a85ea2b3e4c9be699eeb8a654abbe4ed1fb doc_id: 507675 cord_uid: hyhe1wn2 While the long-term effects of COVID-19 are yet to be determined, its immediate impact on crowdfunding is nonetheless significant. This study takes a computational approach to more deeply comprehend this change. Using a unique data set of all the campaigns published over the past two years on GoFundMe, we explore the factors that have led to the successful funding of a crowdfunding project. In particular, we study a corpus of crowdfunded projects, analyzing cover images and other variables commonly present on crowdfunding sites. Furthermore, we construct a classifier and a regression model to assess the significance of features based on XGBoost. In addition, we employ counterfactual analysis to investigate the causality between features and the success of crowdfunding. More importantly, sentiment analysis and the paired sample t-test are performed to examine the differences in crowdfunding campaigns before and after the COVID-19 outbreak that started in March 2020. First, we note that there is significant racial disparity in crowdfunding success. Second, we find that sad emotion expressed through the campaign's description became significant after the COVID-19 outbreak. Considering all these factors, our findings shed light on the impact of COVID-19 on crowdfunding campaigns. In recent years, with the development of the Internet, there are more ways to raise money online. GoFundMe, an American for-profit crowdfunding platform that encourages people to create online crowdfunding projects about their life events such as illnesses and accidents, is a good example. Even if crowdfunding becomes more and more convenient, the success rate of crowdfunding is still not high. 
To date, little is known about what makes a campaign successful, and identifying the factors that lead to crowdfunding success has become a critical research goal [1]. In the current study, we analyze GoFundMe to extract the most critical factors that contribute to crowdfunding success. We collect data on 36,370 campaigns on GoFundMe and split them into two parts: one covers the period before the COVID-19 outbreak (2019), and the other covers the period after the outbreak (2020). This split lets us analyze whether the influential factors have changed under the pandemic. Some researchers have argued that the emotional elements conveyed through text and facial expressions are likely to attract donors [2]. We extract the faces from each cover image and infer their emotion using the Baidu Application Programming Interface (API). In addition, we extract and infer individual-level features such as gender, race, age, beauty, target, location, followers, shares, distinct donors, family status, and facial attractiveness, as well as the duration of crowdfunding. Text is one of the most basic channels of information transfer, and some researchers have argued that textual elements, including descriptions, reviews, and emotion, have a considerable impact on the success of crowdfunding [3]. Therefore, we incorporate text emotion into our models through a text scoring model whose scores serve as a feature. These characteristics together allow a comprehensive and in-depth analysis. Given a page visitor's sympathy with a specific project topic, a longer visit duration increases the probability that the visitor donates, and thus the success of the crowdfunding [3]. Hence, the aesthetic and technical scores of the cover image are also considered as factors affecting the success of crowdfunding [4].
Based on the above data, we propose three hypotheses:
• Hypothesis 1: The basic features of crowdfunding, fundraiser and descriptions have a significant impact on fund-raising success. There are significant differences between before and after the COVID-19 outbreak.
The goal of this study is to analyze the main factors that affect crowdfunding campaigns and the impact of COVID-19 on these factors. Notably, we define a success group of campaigns that have raised more than their target amounts, and a failure group otherwise. Moreover, we build prediction models for both classification of success/failure and regression of the raised amount. We use XGBoost for both tasks and report the list of essential factors. Finally, we perform a counterfactual experiment to analyze the influence of different factors and the impact of COVID-19 on these factors. To the best of our knowledge, this is the first study examining the impact of COVID-19 on crowdfunding campaigns.
In the past few years, more platforms have been created to provide online crowdfunding, such as Kickstarter, Indiegogo, and GoFundMe. In particular, crowdfunding platforms aim to be intermediaries between sponsors and fundraisers. They can also promote the ideas of the fundraiser and call for support from potential investors [5]. In addition to the factors mentioned above, the funding goal [6] and project duration [7] have a considerable impact on the success of crowdfunding. A growing number of studies have investigated crowdfunding success using data-mining methods. Specifically, they predict crowdfunding success and analyze the factors that are significant to crowdfunding. Mollick argued that it is difficult to measure success and performed a manual analysis to distinguish between success and failure [8]. Yuan identified the importance of topics and sentiment in affecting the success of crowdfunding [9].
Notably, Mitra and Gilbert predicted the success of crowdfunding by analyzing text using LIWC [10]. Lauren examined the relationship between the emotion of the cover image and crowdfunding success [2] and found that crowdfunding platforms rely on emotions to attract sponsors.
We focus on GoFundMe to analyze the crucial factors contributing to success. Specifically, we crawl all the available 36,370 crowdfunding campaigns on GoFundMe and divide them into two parts. One part was collected before August 2019, while the other was collected after August 2020. The purpose of this division is to analyze whether COVID-19 has had an impact on people's attitude towards crowdfunding and whether the more influential or less influential factors have changed. The features of the dataset are summarized below and shown in Table 1. The features that can be directly extracted from the website are as follows: launch date, cover image, description, category, current amount, goal amount, # of followers, # of shares, and # of donors. We refer to these features as the basic features.
• Quality Scores: We use the pre-trained model NIMA to obtain the aesthetic and technical scores of each cover image [11].
• Text Features: First, we merge the titles and descriptions of the campaigns and use them as our text data. Next, we employ VADER [12] to evaluate each text and obtain three sentiment scores: positive, negative, and neutral. Finally, we examine the text using LIWC to obtain more detailed text sentiment scores: sad scores, anger scores, and anxiety scores. Beyond these scores, both the titles and descriptions of the campaigns are essential to crowdfunding success. Their latent effects are not easy to measure directly; therefore, we train an XLNet model to predict the success of the crowdfunding campaign from the text and capture these effects [13].
• Image Features: We compare the DeepFace API [14] with the Baidu API, and also consider previous research comparing the Baidu API with competing APIs [15]. In the end, we choose the Baidu API, a platform that provides reliable face recognition services. The Baidu API returns the facial attractiveness, age, race, emotion, and gender of each face in the cover image (also referred to as the profile image). For simplicity, we take the mean attractiveness of the faces when an image contains more than one person. For the age feature, we calculate the mean age and also derive two counts: the number of children and the number of older adults. People under 15 years old are defined as children, while those over 60 are regarded as older adults. We use the number of people of each race as a feature. The Baidu API distinguishes four races: Black, White, Asian, and Others. In terms of gender, we obtain the number of males and females. Finally, we extract the emotion of each face (happy, sad, grimace, neutral, or angry).
There are 21 different categories of campaigns, and their distribution is shown in Figure 1. Some of these categories are related to the success rate of crowdfunding, while others are not. Specifically, we use t-tests to identify the categories that have the most significant impact on crowdfunding success. We find that the impact of some categories on crowdfunding success has changed over the past two years. Since it makes no sense to compare p values directly, we compare whether their significance levels have changed. For example, the significance levels of the
To analyze the contribution of influential factors to success, we employ the XGBoost method and divide the features into three categories: basic features, text features, and image features.
We feed (1) basic features, (2) basic features plus text features, and (3) all the features into an XGBoost model to obtain the accuracy, F1 score, precision, and recall, respectively. We also train an XGBoost model to construct a regression model with the ratio/percentage of success as the output. It is evident that the inferred features significantly improve the model performance (Table 3) . To extract statistically significant features, we construct a logistic regression model for the classification problem. The result is shown in Table 4 and validates Hypothesis 1. To further analyze the impact of each feature on the success of crowdfunding, we conduct several counterfactual analysis experiments. Since we have too many features, we analyze the statistically significant features in a separate analysis. We compare the impact of the above mentioned features on crowdfunding success before and after the outbreak of COVID-19 and further analyze whether the COVID-19 pandemic has changed people's concerns and priorities. To have a better understanding of the causality between the aforementioned features and crowdfunding success rate, we employ counterfactual analysis, which has been widely used in comparative inquiry [16] . We test the remaining two of the three hypotheses formulated in the beginning based on the results of the counterfactual analysis experiments. Table 2 has changed, indicating that the impact of COVID-19 on these three categories is profound. This shows that the COVID-19 pandemic has changed individuals' preferences and individuals focus more on medical or charity topics than dreams, arts and charity topics. These results validate Hypothesis 2. It is worth mentioning that the ratio in the table changes little because of the multiple kinds of categories. Eliminating the impact of one of these categories cannot significantly change the overall success, but it does not mean that these categories are not significant. 
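As a rough illustration of the kind of counterfactual comparison described above, the following Python sketch scores a set of campaigns with a logistic model, then switches one category indicator off for every campaign and scores them again. The feature names (`is_medical`, `log_goal`), the coefficients, the 0.5 decision threshold, and the toy campaigns are all assumptions made for illustration; they are not the fitted model or the data of this study, and the exact intervention used in the experiments may differ.

```python
import math
from typing import Callable, Dict, List

Campaign = Dict[str, float]

# Hypothetical coefficients of an already-fitted logistic model (illustrative only).
WEIGHTS: Dict[str, float] = {"is_medical": 1.5, "log_goal": -0.8}
BIAS = 0.2

# Toy campaigns (made up): a category indicator and a scaled goal amount.
CAMPAIGNS: List[Campaign] = [
    {"is_medical": 1.0, "log_goal": 1.0},
    {"is_medical": 1.0, "log_goal": 2.0},
    {"is_medical": 0.0, "log_goal": 0.1},
    {"is_medical": 0.0, "log_goal": 1.0},
]


def predict_success_prob(campaign: Campaign) -> float:
    """P(success) = sigmoid(bias + sum of weight * feature value)."""
    z = BIAS + sum(w * campaign.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def success_rate(campaigns: List[Campaign], threshold: float = 0.5) -> float:
    """Share of campaigns whose predicted probability reaches the threshold."""
    hits = sum(predict_success_prob(c) >= threshold for c in campaigns)
    return hits / len(campaigns)


def counterfactual_rate(
    campaigns: List[Campaign], intervene: Callable[[Campaign], Campaign]
) -> float:
    """Apply one intervention to a copy of every campaign, then re-predict."""
    altered = [intervene(dict(c)) for c in campaigns]
    return success_rate(altered)


def remove_category(name: str) -> Callable[[Campaign], Campaign]:
    """Build an intervention that switches a category indicator off."""
    def intervene(campaign: Campaign) -> Campaign:
        campaign[name] = 0.0
        return campaign
    return intervene


factual = success_rate(CAMPAIGNS)
without_medical = counterfactual_rate(CAMPAIGNS, remove_category("is_medical"))
print(f"factual success rate:       {factual:.2f}")
print(f"without the medical effect: {without_medical:.2f}")
```

On this toy data the predicted success share falls from 0.75 to 0.25 once the category indicator is removed. The same campaigns are scored twice and only the intervened feature differs, so the gap between the two rates is attributable to that feature under the assumed model.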
Improving the effect of features
First, we divide the remaining features into basic features, text features, and image features after removing the category features. Combined with the counterfactual experiment, we focus on the analysis of the statistically significant features. Table 6 shows the results of the counterfactual analyses of statistically significant features using logistic regression. When each of these features is increased by its mean value, the percentage of success increases significantly.
• Goal Amount: From the results of the counterfactual experiment, the feature with the most significant causal relationship with success is the goal amount. The displayed funding goal functions as a signal of the difficulty/complexity of the project. In general, the higher the crowdfunding goal is, the lower the success rate of the project is [7]. Our counterfactual experiment results are consistent with these previous results.
• Duration: According to the experimental results, the duration of crowdfunding campaigns has a positive impact on success. The longer a crowdfunding project lasts, the more opportunity people have to find and share it (p < 0.001). The campaigns receive more donations with an increased number of shares (p < 0.001). The relationship between the number of donors and success is significant (p < 0.001). The number of donors has a positive impact on crowdfunding success, as shown by the counterfactual experiment results.
• Emotion of text: Text scores obtained by XLNet have a significant positive impact on crowdfunding success since the outbreak of COVID-19. This implies that there is a statistically significant relationship between text and success. Further analysis shows that there is a significant relationship between emotion and success rate. For example, anxiety (p < 0.1) and sadness (p < 0.1) of text have a negative impact on success. This result is consistent with our counterfactual experiment.
• Technical Score: Technical scores obtained by the pre-trained NIMA model have a significant positive impact on crowdfunding success since the outbreak of COVID-19 (p < 0.01). This implies that there is a statistically significant relationship between the quality of the images and success.
For image features, we find that race has different levels of influence on the success of crowdfunding since the outbreak of COVID-19. Among all races, Black has the most significant impact on success (p < 0.05). Although Black people have historically been less likely to be funded than people of other races, the effect was less significant before COVID-19 (p < 0.1). In other words, Black people are even less likely to be funded during the COVID-19 pandemic. In the whole data set, Asian also has a statistically significant impact on the success of crowdfunding (p < 0.05) after the COVID-19 outbreak, while before the outbreak there was no statistically significant impact. These results support Hypothesis 3. We suspect that the former might be related to the root cause of the #BlackLivesMatter movement and the latter might be related to the root cause of the #StopAsianHate movement. These are two interesting directions to investigate in depth in future work.
• Emotion of cover image: We examine the relationship between the emotion of the cover image and success, and find that the relationship is significant. According to the counterfactual experiments, sadness has a negative effect on success, while happiness has a positive effect on success (p < 0.1). People are more willing to donate to those whose cover images carry positive valence than to those whose images carry negative valence. This is an interesting and somewhat surprising discovery, as donors seem to like those who are upbeat and optimistic about life.
• Children: From the results of the logistic regression model, we find that the impact of children on crowdfunding success would be significantly positive if the campaign was related to children. In addition, we confirm that the effect is indeed positive as it is consistent with the regression model.
We further analyze the relationship between crowdfunding success and the topics in each category by using LDA (Latent Dirichlet Allocation) topic modeling [17]. For each category, we employ LDA to divide it into five topics. We build a regression model to analyze which topics have the most positive impact on the success rate of crowdfunding and which topics have the most negative impact. We focus on analyzing the categories that experience significant changes before and after the COVID-19 outbreak. The results are shown in Figure 2, where c is the correlation coefficient between success and topics. The first value represents the correlation coefficient before the COVID-19 outbreak, while the second value represents the correlation coefficient after the COVID-19 outbreak. We find that the frequencies of some words in the same category are very high, resulting in a high overlap between different topics. Therefore, we add the words whose frequency is in the top ten of all topics to the stop-word list. Next, we apply an LDA model again to generate the topics and repeat this process until no such word remains. In the end, the coherence of the topic model is 0.34803. In addition, we find that there are some significant changes in the Other category before and after the COVID-19 outbreak. Because the Other category includes various topics, we construct a separate topic model for this category to analyze the impact of its topics on the success rate of crowdfunding.
We choose the number of topics that yields the best coherence value. We find that before COVID-19, topics in the Other category are relatively scattered, and we divide them into four topics (coherence score: 0.35192, as shown in Figure 3). Interestingly, topics are relatively concentrated after the COVID-19 outbreak, as we can only divide the category into two topics (coherence score: 0.32625, as shown in Figure 4). We find that before COVID-19, the Other category includes dreams, gifts, children, travel, Honeymoon & Wedding, and other topics. Combined with the results above, we find that most of these topics have a negative impact on the success of crowdfunding. However, after the COVID-19 outbreak, most topics focus on family, children, friends, and medical care. These topics have a positive impact on the success of crowdfunding. This suggests that the significance of the Other category has changed because the topics under it changed between before and after the COVID-19 outbreak.
Fig. 2: Topics under different categories. A red value indicates that the topic has a positive impact on success, while a green value indicates that the topic has a negative impact on success. Note those topics with changes in the polarity of the impact before and after the COVID-19 pandemic.
Fig. 3: Topics under the Other category that had significant effect on success before the COVID-19 outbreak. We generate four topics with the best coherence value: 0.35192.
Fig. 4: Topics under the Other category that had significant effect on success during the COVID-19 pandemic. We generate two topics with the best coherence value: 0.32625.
This study has three limitations. The first limitation is the number of crawled campaigns. Second, we cannot control for every possible factor, and the nature of this study design leaves the possibility of residual confounding. Finally, we did not consider some dynamic factors, such as the change in the amount of a single donation.
In conclusion, our study analyzes the changes in the significant features impacting crowdfunding success before and after the COVID-19 outbreak and validates three hypotheses. The study results suggest a substantial difference in some categories between before and after the COVID-19 outbreak. While dreams, travel, or other topics were less likely to be funded before the COVID-19 outbreak, people began to make donations to these campaigns after the outbreak. People also began to pay more attention to medical, accident, and charity campaigns. Some categories have not changed before or after the COVID-19 outbreak. For example, campaigns involving babies, family, funerals, and memorials have always attracted donations more easily. In contrast, campaigns involving sports, weddings, and missions have not attracted donations easily. More importantly, we find that there are significant differences in crowdfunding success among different races. COVID-19 has made it more challenging for Black and Asian people to raise money because the pandemic exacerbated such existing social disparities. Our study can also guide and support fundraisers and crowdfunding companies like GoFundMe in raising money. For example, we find that both the emotion of the text and the emotion of the cover image impact the success of crowdfunding. In addition, we find that sadness, anxiety, or anger have no effect or even a negative effect on crowdfunding. Compared with these negative emotions, positive emotions are more likely to get a campaign funded. Therefore, when fundraisers write the description or upload the cover image, in addition to describing their misfortune and difficulties, they should also express more of their optimistic attitude towards life. Furthermore, our success prediction model can help the company provide feedback scores to fundraisers, making it easier for fundraisers to succeed.
In the future, we will examine a longer period of campaigns, as opposed to the two-year data employed in this study. In addition, we will take other dynamic factors, such as individual donations, into consideration.
[1] The elements of a successful crowdfunding campaign: A systematic literature review of crowdfunding performance
[2] Emotional delivery in pro-social crowdfunding success
[3] The recipe of successful crowdfunding campaigns
[4] What contributes to a crowdfunding campaign's success? Evidence and analyses from GoFundMe data
[5] Crowdfunding: Tapping the right crowd
[6] Exploring entrepreneurial legitimacy in reward-based crowdfunding
[7] Crowdfunding practices in and outside the US
[8] The dynamics of crowdfunding: An exploratory study
[9] The determinants of crowdfunding success: A semantic text analytics approach
[10] The language that gets people to give: Phrases that predict success on Kickstarter
[11] NIMA: Neural image assessment
[12] VADER: A parsimonious rule-based model for sentiment analysis of social media text
[13] XLNet: Generalized autoregressive pretraining for language understanding
[14] LightFace: A hybrid deep face recognition framework
[15] Benchmarking commercial emotion detection systems using realistic distortions of facial image datasets
[16] Mobility network models of COVID-19 explain inequities and inform reopening
[17] Latent Dirichlet allocation