key: cord-0197990-2fo2es6r
authors: Zhao, Yu; Wei, Shaopeng; Guo, Yu; Yang, Qing; Li, Qing; Zhuang, Fuzhen; Liu, Ji; Kou, Gang
title: Bankruptcy Prediction via Mixing Intra-Risk and Conductive-Risk
date: 2022-02-01
journal: nan
DOI: nan
sha: db8c10c23f2fe0c1b7d4a2064ca2c7a3a71a022f
doc_id: 197990
cord_uid: 2fo2es6r

Bankruptcy risk prediction for Small and Medium-sized Enterprises (SMEs) is a crucial step for financial institutions when making loan decisions and for identifying early warning signals in regional economies. However, previous studies in both finance and AI research consider only either the intra-risk or the conductive-risk, ignoring their interactions and combined effect for simplicity. This paper is the first to consider both risks simultaneously, together with their joint effect, in bankruptcy prediction. Specifically, we first propose an LSTM-based enterprise intra-risk encoder that learns intra-risk from statistically significant enterprise risk indicators derived from basic business information and litigation information. Afterward, we propose an enterprise conductive-risk encoder based on enterprise relational information from the enterprise knowledge graph for its conductive-risk embedding. In particular, the conductive-risk encoder is equipped with both the newly proposed Hyper-Graph Neural Networks (Hyper-GNNs) and Heterogeneous Graph Neural Networks (Heter-GNNs), which model conductive-risk from two different aspects, i.e. common risk factors based on hyperedges and direct diffusion risk from the neighbors, respectively. With the two kinds of encoders, a unified framework is designed to simultaneously capture intra-risk and conductive-risk for bankruptcy prediction. To evaluate our model, we collect multi-source real-world SME data and build a novel benchmark dataset, SMEsD. We provide open access to the dataset, which is expected to further promote financial risk analysis research. Experiments on SMEsD against nine SOTA baselines demonstrate the effectiveness of the proposed model for bankruptcy prediction.

Small and medium-sized enterprises (SMEs) contribute up to 40% of national gross domestic product (GDP) in emerging economies and also provide more than 50% of employment worldwide 1 . SMEs' financial risk prediction is of great importance for both government policymakers and financial institution loan decisions [1] , [2] . Previous enterprise risk studies in both finance and AI research typically either focus on an enterprise's internal financial statements and related financial news, or analyze risk diffusion through simulation [3] , [4] , [5] , [6] , [7] , [8] , [9] . However, most of them only consider either the enterprise intra-risk or the conductive-risk, ignoring their interactions and combined effect for simplicity. Building a uniform framework for enterprise bankruptcy prediction that takes advantage of both enterprise intra-risk and conductive-risk is a non-trivial and challenging task, due to the multi-source heterogeneity of the intra-risk data and the multiplex conductive-risk relations of enterprises [10] .

1. https://www.worldbank.org/en/topic/smefinance

To address this problem, we propose a novel enterprise bankruptcy prediction method that mixes enterprise intra-risk and conductive-risk. Firstly, we propose an enterprise intra-risk encoder based on Long Short-Term Memory (LSTM), which leverages rich features from the enterprise basic business information and the litigation information for intra-risk mining. After an extensive statistical significance analysis of the correlation between the enterprises' basic intelligence (including the enterprise basic attributes and litigation information here) and their bankruptcy risk, as reported in Table 1 , we select 12 statistically significant indices for intra-risk encoder learning (please refer to Section 2.1 for more details of the significance analysis). Secondly, we propose an enterprise conductive-risk encoder based on enterprise relational information from the enterprise knowledge graph (EKG) to embed its conductive-risk, which is also known as the risk momentum spillover effect in finance [11] . Figure 1 gives a toy example of an EKG, from which we can find that enterprises have two kinds of relations, namely hyperedges and pair-wise heterogeneous relations (please refer to Section 2.2 for more analysis details). Hence, we accordingly propose two different sub-models, i.e. Hyper-Graph Neural Networks (Hyper-GNNs) and Heterogeneous Graph Neural Networks (Heter-GNNs), to equip the conductive-risk encoder for modeling risk diffusion on the EKG. In particular, the Hyper-GNNs aims to mine hyperedges in the EKG, such as the same industry and the same area, which is beneficial for enterprise risk prediction. For instance, the majority of mask and vaccine manufacturing enterprises in the medical industry boomed during the COVID-19 outbreak, while the catering industry faced a common, high bankruptcy risk under the same epidemic situation.
The Heter-GNNs captures the direct conductive-risk factors from an enterprise's neighboring enterprises. For example, when an enterprise defaults on a loan, the financial situation of its related creditors may deteriorate. Such events have critical effects on a creditor's business and could lead to its bankruptcy. With the two encoders mentioned above, we propose a uniform framework to sufficiently capture enterprise intra-risk as well as conductive-risk for bankruptcy prediction. Figure 2 shows the overall architecture of the proposed method.

TABLE 1 The statistical significance analysis on the correlation between the enterprises' basic intelligence (including the enterprise basic attributes and the enterprise litigation information here) and their bankruptcy risk. The symbols ***, ** and * denote that the statistical result is significant at the 99%, 95% and 90% level, respectively.

In Fintech literature, especially in the SMEs research field, few researchers publicly provide their experimental benchmark datasets for reproduction 2 . This negative phenomenon may be caused by the sensitivity and rarity of SMEs' financial data, and the resulting data deficiency greatly impedes the development of SMEs' intelligence research [14] . In this paper, we collect multi-source SMEs data and publicly provide a new dataset (SMEsD) for reproduction. Hopefully, SMEsD will become a significant benchmark dataset for SMEs bankruptcy prediction and further boost the development of financial risk studies, especially SMEs bankruptcy research. Experimental results on SMEsD demonstrate that our proposed model is able to sufficiently capture enterprise intra-risk as well as conductive-risk for bankruptcy prediction.

2. For example, a series of top conference papers, such as SemiGNN [12] in ICDM 2019; HACUD [13] in AAAI 2019; ST-GNN [14] in IJCAI 2020; AMG-DP [15] in CIKM 2020; TemGNN [16] in SDM 2021; PC-GNN [17] in WWW 2021, do not provide their datasets for reproduction.

The contributions of our work are fourfold:

• We conduct exploratory data analysis to show that the enterprise intelligence, i.e. the enterprise basic attributes, the litigation information and the enterprise knowledge graph, has an impact on bankruptcy risk prediction for SMEs.

• We propose a novel framework for inferring enterprise bankruptcy by mixing both its intra-risk and conductive-risk. To the best of our knowledge, this work is the first attempt to consider both risks simultaneously, together with their joint effect, in bankruptcy prediction.

• Under this framework, we utilize an LSTM-based encoder to mine enterprise intra-risk from its basic intelligence. We propose a novel GNNs-based conductive-risk encoder, including Hyper-GNNs and Heter-GNNs, to calculate conductive-risk through the hyperedges and pair-wise heterogeneous relations in the enterprise knowledge graph.

• We propose a new benchmark dataset (SMEsD) to evaluate the proposed method, which is also expected to further promote enterprise financial risk analysis. The empirical experiments on our constructed dataset demonstrate that the proposed method can successfully mix enterprise intra-risk and conductive-risk for bankruptcy prediction.

In this section, we conduct an exploratory analysis of the relation between the enterprise intelligence (i.e. the enterprise basic attributes, litigation information, and enterprise knowledge graph) and bankruptcy risk. We first give the statistical correlation and independent-sample t-Test results between the basic attributes as well as the lawsuit features of the enterprises and their bankruptcy status. Afterward, we introduce the conductive-risk analysis on the enterprise knowledge graph for bankruptcy prediction.

We collect 11,523 civil lawsuits of 4,229 Chinese SMEs from 2000 to 2021, and the basic attributes of these enterprises. Table 1 summarizes the statistical analysis of the correlation and t-Test between the enterprises' basic intelligence (i.e. the enterprise basic attributes and litigation information; see Definition 1) and their bankruptcy risk. The first part of Table 1 refers to the enterprise basic attributes, i.e. established time, registered capital, and paid-in capital. The remaining rows of Table 1 cover the most significant features of lawsuits, i.e. lawsuit cause, court level, verdict and duration of action (DOA). We give the analysis results as follows:

Enterprise Attribute. The first part is about the enterprise basic attributes from business information, including established time counted in months, and registered capital and paid-in capital counted in units of ten thousand yuan. From Table 1 , we can find that:

• All three indicators are significantly negatively correlated with bankruptcy. In the t-Test, the indicators of surviving enterprises are significantly higher than those of bankrupted enterprises.

• This indicates that the longer an enterprise has been established and the larger its registered capital and paid-in capital, the lower its probability of going bankrupt.

Lawsuit Cause. We explore the correlation between lawsuit causes and enterprise bankruptcy. From Table 1 , we find that both types of lawsuit causes, i.e. loan contract dispute and sales contract dispute, are significantly correlated with enterprise bankruptcy. Specifically, the correlation coefficient between the number of loan contract disputes and enterprise bankruptcy is 0.122, which is statistically significant at the 99% level. The correlation coefficient between sales contract disputes and enterprise bankruptcy is 0.077, which is also statistically significant at the 99% level. These findings confirm that bankrupted enterprises tend to have more loan contract and sales contract disputes, which is in line with our intuition. Meanwhile, we can observe that the average number of loan contract disputes is 1.80 for surviving enterprises and 2.23 for bankrupted enterprises. The difference between these two averages is significant at the 95% level by t-Test, which reaffirms the correlation between enterprise bankruptcy and loan contract disputes. We can obtain a similar conclusion from the statistical results for sales contract disputes. In sum, we can find that:

• The numbers of loan contract disputes and sales contract disputes are both significantly positively correlated with enterprise bankruptcy.
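The two statistics used throughout this analysis can be illustrated with a minimal, standard-library-only sketch. It assumes that the correlation is a Pearson correlation between a count feature and a 0/1 bankruptcy label, and it uses Welch's unequal-variance form of the independent-sample t statistic; the text does not state which variant was used. The toy counts are hypothetical and are not taken from SMEsD.

```python
import math
from statistics import mean, variance

def pearson_r(x, y):
    """Pearson correlation; with a 0/1 label this is the point-biserial correlation."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples (assumed variant)."""
    va, vb = variance(sample_a), variance(sample_b)
    se = math.sqrt(va / len(sample_a) + vb / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se

# Hypothetical toy data: number of loan contract disputes per enterprise.
surviving = [1, 2, 1, 3, 2, 1, 2, 2]
bankrupt = [2, 3, 2, 4, 1, 3]

counts = surviving + bankrupt
labels = [0] * len(surviving) + [1] * len(bankrupt)  # 1 = bankrupt

r = pearson_r(counts, labels)    # positive when bankrupt firms have more disputes
t = welch_t(bankrupt, surviving) # positive when the bankrupt mean is larger
```

A positive `r` together with a positive `t` corresponds to the pattern reported for loan and sales contract disputes; turning `t` into a significance level additionally requires the t distribution, which is omitted here.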

Court Level of Lawsuit. The court level of a lawsuit is another factor related to enterprise risk. There are four court levels, i.e., grassroots people's court, intermediate people's court, higher people's court and supreme people's court, in ascending order. Most lawsuits are dealt with by grassroots people's courts, while some with large underlying assets are brought to an intermediate court directly. If a litigant disagrees with the verdict, it can appeal to a higher-level court. From Table 1 , we can find that:

• The number of grassroots court lawsuits is significantly positively correlated with enterprise bankruptcy.

• The lawsuit numbers of both the intermediate people's court and the higher people's court are significantly negatively correlated with enterprise bankruptcy.

These findings indicate that bankrupted enterprises tend to have more grassroots court lawsuits and fewer intermediate and higher people's court lawsuits. This may be because involvement in a large number of grassroots people's court lawsuits inherently implies enterprise financial risk. In contrast, the number of higher people's court lawsuits an enterprise is involved in reflects its strong capacity to deal with such lawsuits, as well as its larger business scale. The t-Test also confirms this conclusion.

Verdict. We divide the results of lawsuits into four types, i.e. plaintiff winner, plaintiff loser, defendant winner, and defendant loser, according to the different litigant status and the verdict. From Table 1 , we can observe that:

• Enterprises that are plaintiff winners are less likely to go bankrupt, i.e. the correlation is significantly negative.

• Enterprises that are defendant losers are more prone to bankruptcy, i.e. the correlation is significantly positive.

The correlation coefficients of the two types of verdicts are both significant at the 99% level, which confirms the importance of the lawsuit result. This is because becoming a plaintiff winner in a lawsuit is good news for an enterprise, whereas losing as a defendant indicates risk for it. Besides, we can draw the same conclusion from the t-Test on the difference in the average numbers of the two types of lawsuit results between bankrupted and surviving enterprises.

Duration of Action. Inspired by [18] , we divide the duration of action (DOA) into two types, i.e. less than two years and more than two years. From Table 1 , we can find that:

• The correlation between the number of lawsuits in the last two years and enterprise bankruptcy is significantly positive.

• The correlation between the number of lawsuits more than two years old and enterprise bankruptcy is significantly negative.

These findings indicate that bankrupted enterprises tend to have more lawsuits in the two years before bankruptcy. More lawsuits mean more direct risk for an enterprise, especially lawsuits in the most recent two years. On the other hand, involvement in a large number of lawsuits more than two years earlier implies that an enterprise has gone through many disputes and still survives, which suggests that the enterprise has a large business scale and is strong enough to confront various challenges.

Initially, the conductive effect was applied to stock movement prediction [19] , which indicates that a stock's fluctuation is partially affected by its related stocks. Here, conductive-risk indicates that the risk generated from an enterprise tends to diffuse through the enterprise knowledge graph to its neighboring enterprises, which is ubiquitous in real market circumstances [14] , [20] , [21] , [22] . Figure 1 shows a toy example of an enterprise knowledge graph extracted from our newly generated dataset SMEsD, from which we can find that enterprises have two types of relations, i.e. hyperedges and pair-wise heterogeneous relations. (i) There are three types of hyperedges in the enterprise knowledge graph (see Definition 2), i.e., industry, area and stakeholder, colored with red, yellow and green blocks, respectively. For example, enterprise A, enterprise D and enterprise E are in the same city, so they are influenced by the same regional policy, such as the same tax administration and economic policy, and face similar regional risk. Hence, we accordingly propose Hyper-Graph Neural Networks to model such conductive-risk. (ii) There are 7 types of pair-wise heterogeneous relations among enterprises and persons (see Definition 3). Both person 1 and enterprise F invest in enterprise A, where the edge widths indicate distinct investment shares. Person 2 and person 3 are stakeholders, such as manager, stockholder and supervisor, of enterprise A and enterprise F, respectively. Enterprise A has two branch companies, i.e., enterprise C and enterprise B. Besides, enterprise A has potential business relations with enterprise D and enterprise E as a result of previous loan contract and deal contract disputes with them. Here, we use Heterogeneous Graph Neural Networks to model this kind of conductive-risk.

Definition 1. Enterprise basic intelligence. Enterprise basic intelligence consists of two parts, i.e., the enterprise basic attributes and the enterprise litigation information as in Table 1 , which can be formulated as

$$A_i = \left(b_i, \{j_i^k\}_k\right),$$

where $b_i$ denotes the basic attributes of enterprise i and $j_i^k$ denotes a specific lawsuit k related to enterprise i, including lawsuit cause, court level of lawsuit, verdict and time interval of action.

Definition 2. Enterprise hyper-graph. An enterprise hypergraph can be defined as G hyper = (V e , E, T hyper ). Here, V e denotes the set of enterprise nodes. E = {hp 1 , hp 2 , ...} denotes the hyperedge set. T hyper = {Ω 1 , Ω 2 , ..., Ω M } denotes the hyperedge type set, and |T hyper | > 1 here. The hyperedge type mapping function ψ assigns each hyperedge a type: ψ(hp) ∈ T hyper . The relationship between enterprise nodes and hyperedges can be represented by an incidence matrix H ∈ R |V e |×|E| with elements defined as:

$$h(v, hp) = \begin{cases} 1, & \text{if } v \in hp, \\ 0, & \text{otherwise,} \end{cases}$$

where v ∈ V e denotes an enterprise node, and hp ∈ E denotes a hyperedge.
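As a concrete illustration of Definition 2, the following minimal sketch builds the incidence matrix of a small typed hypergraph using the standard membership convention (an entry is 1 when the enterprise belongs to the hyperedge, 0 otherwise). The node names follow the toy example of Figure 1, but the hyperedge memberships and the helper name `build_incidence` are hypothetical.

```python
def build_incidence(nodes, hyperedges):
    """Return H as a list of rows; H[v][e] = 1 if node v belongs to hyperedge e."""
    index = {v: i for i, v in enumerate(nodes)}
    H = [[0] * len(hyperedges) for _ in nodes]
    for e, (_etype, members) in enumerate(hyperedges):
        for v in members:
            H[index[v]][e] = 1
    return H

nodes = ["A", "B", "C", "D", "E"]
# (hyperedge type, member enterprises); memberships are hypothetical
hyperedges = [
    ("industry", {"A", "B", "C"}),
    ("area", {"A", "D", "E"}),
    ("stakeholder", {"B", "D"}),
]

H = build_incidence(nodes, hyperedges)

# Node degree: number of hyperedges a node belongs to.
node_degree = [sum(row) for row in H]
# Hyperedge degree: number of nodes a hyperedge contains.
edge_degree = [sum(H[v][e] for v in range(len(nodes))) for e in range(len(hyperedges))]
```

Keeping the hyperedge type next to each member set makes it straightforward to slice H into one incidence matrix per type, which is what a type-wise hypergraph convolution would consume.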

Definition 3. Enterprise heterogeneous-graph. An enterprise heterogeneous graph is defined as a connected graph G hete = (V, L, T , R, W). V denotes the set of all nodes, with V = V e ∪ V p , where V e and V p denote the node sets of enterprises and persons, respectively, and V e ∩ V p = ∅. L denotes the link set, and W denotes the edge weights. The graph is associated with two functions: (i) a node type mapping function φ : V → T , so that each node v ∈ V belongs to one particular type φ(v) ∈ T in the node type set T ; (ii) a link class mapping function ψ : L → R, so that each link belongs to one relation type in R.
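One possible in-memory layout for the graph of Definition 3 is sketched below. The class name `HeteroGraph`, the relation names `invest` and `branch`, and the numeric investment shares are all hypothetical and serve only to illustrate typed nodes, typed links and optional edge weights.

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class HeteroGraph:
    # node id -> node type ("enterprise" or "person"); plays the role of the type map
    node_type: dict = field(default_factory=dict)
    # relation name -> list of (source, target, weight); weight is None if unweighted
    links: dict = field(default_factory=lambda: defaultdict(list))

    def add_node(self, node, ntype):
        self.node_type[node] = ntype

    def add_link(self, relation, src, dst, weight=None):
        self.links[relation].append((src, dst, weight))

    def neighbors(self, node, relation):
        """Sources that point at `node` under one relation, with edge weights."""
        return [(s, w) for s, d, w in self.links[relation] if d == node]

g = HeteroGraph()
for name in ["A", "B", "C", "F"]:
    g.add_node(name, "enterprise")
g.add_node("P1", "person")

# Weighted relation: investment share (hypothetical numbers).
g.add_link("invest", "P1", "A", weight=0.3)
g.add_link("invest", "F", "A", weight=0.7)
# Unweighted relation: branch companies of A.
g.add_link("branch", "B", "A")
g.add_link("branch", "C", "A")

investors_of_a = g.neighbors("A", "invest")
```

Grouping links by relation keeps weighted and unweighted relations separate, which matches the later distinction between weighted and attention-based aggregation.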

Problem 1. Enterprise bankruptcy prediction. Given multi-source enterprise data, which consists of the enterprise basic intelligence A, an enterprise heterogeneous hypergraph G hyper and an enterprise heterogeneous graph G hete , we aim to learn enterprise risk embeddings that consider both the intra-risk and the conductive-risk. Based on the enterprises' representations, we conduct the bankruptcy prediction task, which can be treated as a binary classification problem.

Enterprise Intra-Risk In general, traditional enterprise risk analysis methods mainly consider financial indicators, such as profitability index, operating efficiency and solvency, using multivariate discriminant analysis [23] , [24] , [25] , or machine learning methods, such as SVM and Decision Tree [26] , [27] , [28] . For example, Erdogan et al. [29] proposed an ensemble method utilizing SVM as base classifiers for commercial bank bankruptcy.

Many studies also utilize neural networks to improve prediction accuracy [3] , [30] . Hosaka et al. [31] proposed to transform financial ratios into an image and to utilize convolutional networks for bankruptcy prediction. Recently, many studies have concentrated on utilizing text information, such as financial reports and conference calls, for mining enterprise intra-risk. For instance, Borochin et al. [32] found that conference call tones were negatively related to measures of firm value uncertainty from the equity options market. Li et al. [5] collected a large-scale multi-modal dataset named MAEC, and experiments showed the effectiveness of the proposed dataset for volatility forecasting. Liu et al. [7] constructed six pre-training tasks trained on both general corpora and financial domain corpora, which enabled the model to capture finance-specific semantic information. However, SMEs usually lack formal financial reports as well as public conference calls, which brings a challenge to risk analysis for SMEs. On the other hand, there are abundant risk sources, such as relevant lawsuits, which have been shown to be significantly related to enterprise credit risk [18] but have not been well utilized in previous works.

Enterprise Conductive-Risk Enterprise conductive-risk also plays a vital role in risk analysis as there is no enterprise that is absolutely independent from others. In financial studies, some works are proposed to utilize interconnections between firms or assets for risk analysis [33] , [34] , [35] . For example, Eisenberg et al. [33] took interconnections among firms into consideration in obligation clearing mechanism research. Elsinger et al. [34] proposed to assess systemic financial stability with a network model of inter-bank loan. Acemoglu et al. [36] provided a framework for studying the relationship between the financial network architecture and the likelihood of systemic failures considering conductive risk and found that financial contagion exhibited a form of phase transition as inter-bank connections increase.

However, most previous studies explore the effect of conductive-risk through simulation [8] , [9] , which cannot be applied in real scenarios.

Graph Neural Networks (GNNs) utilize deep neural networks for graph representation learning and have achieved great success on various graph tasks, such as node classification [37] , [38] , link prediction [39] and community detection [40] . GNNs have also contributed to traditional scenarios, such as recommender systems [41] , [42] , natural language processing [43] , [44] and computer vision [45] , [46] . We refer the readers to [47] for a survey of graph neural networks. Enterprise interconnections naturally form a heterogeneous graph, consisting of enterprise nodes, person nodes and the connections among them. In the Fintech field, some works have applied GNNs to model various risks. For example, SemiGNN [12] utilized multi-view labeled and unlabeled data for fraud detection. Hu et al. [15] proposed to jointly model various relations and objects as well as the rich attributes on nodes and edges for loan default detection. CCR-GNN [48] addressed the problem of corporate credit rating via graph neural networks. Yang et al. [14] mined supply chain relationships and conducted lift prediction on a collected supply chain dataset. Kosasih et al. [49] posed the supply chain visibility problem as a link prediction problem via GNNs. Pan et al. [10] utilized a triple-layer attention network for bankruptcy prediction considering different metapath-based neighbors.

Hypergraphs have shown a strong capacity for modeling high-order relationships and have been utilized in many areas, such as social recommendation [50] , [51] and computer vision [52] , [53] . In enterprise risk modeling, there exist a huge number of hyperedges among enterprises and related persons, which makes the hypergraph a suitable tool. However, few works apply hypergraph neural networks to this area.

Despite the success of previous studies, few consider both intra-risk and conductive-risk simultaneously, together with their joint effect, in bankruptcy prediction. Moreover, most of them fail to sufficiently mine risk information because of the hybrid risk sources and multiplex relations involved. Meanwhile, few works provide open access data for later research, which restricts the development of risk analysis tasks such as bankruptcy prediction, default prediction and credit rating.

In this section, we introduce the overall architecture of the proposed method, as shown in Figure 2 . The proposed model consists of three main parts: (I) the Enterprise Intra-Risk Encoder, which uses the statistically significant enterprise features in Table 1 ; (II) the Enterprise Conductive-Risk Encoder, which consists of two sub-modules: (a) Hyper-Graph Neural Networks (Hyper-GNNs) using the enterprise hypergraph, and (b) Heterogeneous Graph Neural Networks (Heter-GNNs) using the enterprise heterogeneous graph; and (III) Enterprise Bankruptcy Prediction. Different from previous works, we adopt a hierarchical mechanism in both Hyper-GNNs and Heter-GNNs to sufficiently utilize the multiplex heterogeneous hyperedges and heterogeneous relations. Next, we give the details of each part.

The enterprise intra-risk encoder aims to learn an enterprise's own risk embedding using the enterprise basic intelligence, i.e. enterprise basic attributes and enterprise litigation information, which is formally given in Definition 1. As Fig. 3 shows, firstly, for each enterprise node v i ∈ V e , we use b i in Definition 1 as its basic attribute features. Secondly, the lawsuit event j k i of enterprise i contains four significant attributes, i.e., lawsuit cause, court level of lawsuit, verdict and duration of action, as analyzed in Section 2.1. For the first three attributes, we map each of them into a latent space and then concatenate them to get the lawsuit representation s k i . Inspired by [54] , we utilize a time decay function Decayer to weight each lawsuit representation so as to make better use of the time information in relevant lawsuit events. Specifically, we calculate the time interval ∆ k i between the occurrence time of each related lawsuit and the enterprise's observation time. For bankrupted enterprises, the observation time is set as the bankruptcy time, while for surviving enterprises the observation time is set as the present time.

As the most recent two years' lawsuits play an important role for enterprise risk prediction [18] , we assign lower w when performing time weight decay for the recent two years' lawsuits. We divide lawsuits into L periods {T 1 , T 2 , ..., T L }. Then, we sum lawsuit information in the same time period based on decayed lawsuit representations. Afterwards, we utilize LSTM [55] to aggregate lawsuit information from different time periods as follows:

Thirdly, we also randomly generate an embedding u i for enterprise i from the standard normal distribution as a supplementary embedding, since we believe its hidden risk is always unknown. Finally, we concatenate the basic attribute features, the litigation embedding and the supplementary embedding, and project the result into a new latent space as follows:

h i denotes the output of intra-risk representation of enterprise i. || denotes concatenation operation. W e ∈ Rd ×d is a trainable matrix.
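The time-decay and period-summation steps of the intra-risk encoder can be sketched as follows. The exact form of Decayer is not given in the text, so this sketch assumes an exponential decay exp(-w * Δ) with a smaller rate `w_recent` for lawsuits within two years of the observation time; the rates, the two-year period length and the function names are all hypothetical. The returned sequence is what a recurrent aggregator such as an LSTM would then consume.

```python
import math

def decay_weight(delta_years, w_recent=0.1, w_old=0.5):
    """Assumed exponential decay; a smaller rate keeps recent lawsuits near weight 1."""
    w = w_recent if delta_years <= 2.0 else w_old
    return math.exp(-w * delta_years)

def aggregate_periods(lawsuits, num_periods, dim, period_years=2.0):
    """Sum decayed lawsuit vectors per time period.

    lawsuits: list of (delta_years, vector) pairs, where delta_years is the gap
    between the lawsuit and the enterprise's observation time.
    Returns num_periods vectors ordered from the oldest to the most recent period.
    """
    buckets = [[0.0] * dim for _ in range(num_periods)]
    for delta, vec in lawsuits:
        # Lawsuits older than the last period boundary fall into the oldest bucket.
        idx = min(int(delta // period_years), num_periods - 1)
        weight = decay_weight(delta)
        for d in range(dim):
            buckets[idx][d] += weight * vec[d]
    return buckets[::-1]  # oldest period first, as input to a sequence model

# Hypothetical lawsuit representations for one enterprise.
lawsuits = [(0.5, [1.0, 0.0]), (1.5, [0.0, 1.0]), (5.0, [1.0, 1.0])]
sequence = aggregate_periods(lawsuits, num_periods=3, dim=2)
```

Ordering the periods chronologically means the final recurrent state is dominated by the most recent period, which is consistent with the emphasis on the last two years of lawsuits.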

The hypergraph plays an important role in bankruptcy prediction, as a hyperedge reflects common factors that enterprises are confronted with. Thus, it is natural to utilize the hypergraph to capture common risk information, such as an industry recession, changes in regional economic policy and guarantee risk caused by shared stakeholders. As shown in Fig. 4 , since different types of hyperedges contribute to node representations to different degrees, we assign different weights to them when aggregating node representations. Specifically, following Feng et al. [56] , we first calculate the hypergraph convolution module as follows:

where Θ Ωm ∈ R |Ve|×|Ve| denotes the convolution module under hypergraph type Ω m . W is the node weight matrix, which we set to the identity matrix so that all weights are equal. D e denotes the hyperedge degree matrix. Afterwards, we conduct hypergraph convolution under hypergraph type Ω m as follows:

where H l+1 Ωm denotes the learned representations under hypergraph type Ω m at layer l + 1, I − Θ Ωm denotes the hypergraph Laplacian, and W hp ∈ R d×d is a trainable matrix that is shared across the different types of hypergraphs. Then we aggregate the hypergraph convolution representations of the different types as follows:

where z i ∈ R d is the learned hypergraph comprehensive representation of enterprise i, Ωm is a trainable parameter, which denotes the importance of hypergraph Ω m for all enterprise nodes.
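A minimal sketch of the type-wise hypergraph convolution and aggregation is given below. It assumes the normalized operator of Feng et al. [56], i.e. Θ = D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2} with W read as a diagonal weight matrix over hyperedges (identity by default), and it omits the trainable matrix and the non-linearity. Whether the paper propagates with Θ itself or with another operator is not recoverable from the text, so the `propagate` step and the fixed type weights are assumptions.

```python
import math

def hypergraph_theta(H, edge_weights=None):
    """Theta[u][v] = sum_e w_e * H[u][e] * H[v][e] / d_e / sqrt(d_u * d_v)."""
    n, m = len(H), len(H[0])
    w = edge_weights if edge_weights is not None else [1.0] * m
    d_e = [sum(H[v][e] for v in range(n)) for e in range(m)]         # hyperedge degrees
    d_v = [sum(w[e] * H[v][e] for e in range(m)) for v in range(n)]  # node degrees
    theta = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            if d_v[u] == 0 or d_v[v] == 0:
                continue  # isolated nodes receive no hypergraph signal
            shared = sum(w[e] * H[u][e] * H[v][e] / d_e[e]
                         for e in range(m) if d_e[e] > 0)
            theta[u][v] = shared / math.sqrt(d_v[u] * d_v[v])
    return theta

def propagate(theta, X):
    """One linear propagation step: Theta @ X (no trainable weight, no activation)."""
    return [[sum(theta[u][v] * X[v][k] for v in range(len(X)))
             for k in range(len(X[0]))] for u in range(len(theta))]

def combine_types(reps_by_type, type_weights):
    """Weighted sum over hyperedge types of the per-type node representations."""
    types = list(reps_by_type)
    n = len(reps_by_type[types[0]])
    dim = len(reps_by_type[types[0]][0])
    return [[sum(type_weights[t] * reps_by_type[t][i][k] for t in types)
             for k in range(dim)] for i in range(n)]

X = [[2.0], [4.0]]            # toy features of two enterprises
H_industry = [[1], [1]]       # both enterprises share one industry hyperedge
H_area = [[1, 0], [0, 1]]     # each enterprise is alone in its area hyperedge
reps = {
    "industry": propagate(hypergraph_theta(H_industry), X),
    "area": propagate(hypergraph_theta(H_area), X),
}
z = combine_types(reps, {"industry": 0.5, "area": 0.5})
```

In the toy case, the shared industry hyperedge averages the two enterprises' features, while the singleton area hyperedges leave them unchanged; the type weights then blend the two views.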

We propose the Heter-GNNs to sufficiently utilize the multiplex interactions among enterprises and persons. Specifically, we first aggregate entity-level information and then relation-level information in a hierarchical mechanism, as shown in Fig. 5 .

We initialize the person nodes' representations in the same way as for enterprises in 5.1. Then we perform a node-type-specific transformation to project the enterprise and person node representations into the same latent space as follows:

where W φ(vi) ∈ R d×d is a node-type-specific trainable weight matrix. h i ∈ R d and h′ i ∈ R d are the original and transformed node representations, respectively. Then we conduct entity-level aggregation. For weighted edges, such as holder investment, we directly set the ratio of contribution capital as the edge weight. For unweighted relations, we adopt an attention mechanism to assign weights to the representations of node v i 's neighbors as follows:

where W 1 Φ k ∈ R d ×d is a trainable matrix and LeakyReLU is an activation function. To make the weights comparable, we utilize the Softmax function to normalize the weights across all choices of j as follows:

where r Φ k im is the m-th element of the aggregated Φ k unweighted relation representation for node v i . α Φ k ijm is the m-th dimension of the normalized importance of node j related to node i under unweighted relationship Φ k , N Φ k i denotes node i's neighbors under unweighted relationship Φ k . For weighted edges, we implement node level aggregation as follows:

η Φ k ij denotes the normalized importance that node j has for node i under the weighted relation, w Φ k ij denotes the original edge weight between node i and node j, such as contribution capital. W 2 Φ k ∈ R d ×d is a trainable matrix. r Φ k i denotes the learned aggregated representation of node i's neighbors under weighted relationship Φ k .
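The two entity-level aggregation modes described above can be sketched as follows. For unweighted relations, the sketch applies LeakyReLU to raw per-dimension scores and normalizes them with a softmax over neighbors, dimension by dimension; how the raw scores are produced from the trainable matrix is not shown and is left as an input. For weighted relations, it assumes that the normalized importance is the edge weight divided by the sum of the node's incoming edge weights, and it omits the trainable projection. Function names and toy values are hypothetical.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def aggregate_unweighted(scores, neighbor_feats):
    """Per-dimension softmax attention over neighbors.

    scores[j][m]: raw attention score of neighbor j in dimension m.
    neighbor_feats[j][m]: transformed representation of neighbor j.
    """
    dim = len(neighbor_feats[0])
    out = []
    for m in range(dim):
        acts = [leaky_relu(s[m]) for s in scores]
        top = max(acts)                         # subtract max for numerical stability
        exps = [math.exp(a - top) for a in acts]
        total = sum(exps)
        out.append(sum(e / total * h[m] for e, h in zip(exps, neighbor_feats)))
    return out

def aggregate_weighted(edge_weights, neighbor_feats):
    """Neighbors weighted by normalized edge weight (assumed: weight / sum of weights)."""
    total = sum(edge_weights)
    dim = len(neighbor_feats[0])
    return [sum(w / total * h[m] for w, h in zip(edge_weights, neighbor_feats))
            for m in range(dim)]

# Two neighbors with equal scores: attention reduces to a plain average.
r_unweighted = aggregate_unweighted([[0.0, 0.0], [0.0, 0.0]],
                                    [[1.0, 2.0], [3.0, 4.0]])
# Two investors holding 30 and 70 units of contribution capital.
r_weighted = aggregate_weighted([30.0, 70.0], [[1.0, 0.0], [0.0, 1.0]])
```

Normalizing per dimension lets a neighbor matter more in some latent dimensions than in others, which a single scalar attention weight per neighbor cannot express.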

To fully capture risk information implied in different relationships, we perform transformer based attention mechanism:

where g ik denotes the relation level importance that relation Φ k has for node i, W Q , W K ∈ R d ×d are trainable matrices, b Q , b K ∈ R d are trainable parameters, µ Φ k is a trainable parameter used to adjust the scale of learned importance, which is relation type specific. Similarly, we utilize Softmax function to normalize learned attention and aggregate relation level representations as follows:

where β ik denotes the normalized importance of relation Φ k for node i, h i is the learned aggregated risk information for node i. Afterwards, we utilize a residual connection to merge node original representations and learned risk information as follow:

where z̃ i is the final representation of node i, and λ is a trainable parameter that balances the conductive-risk through the heterogeneous graph against the inner risk. σ is an activation function, for which we choose GELU [57] .
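For orientation, one plausible instantiation of the relation-level attention and residual merge, written only in terms of the symbols defined above, is shown below. The exact expressions are not recoverable from the text: the bilinear query-key score, the absence of a scaling factor, the symbol $\hat{h}_i$ for the aggregated risk information, and the convex combination controlled by $\lambda$ are all assumptions.

```latex
\begin{aligned}
g_{ik} &= \mu_{\Phi_k}\,\bigl(W_Q h'_i + b_Q\bigr)^{\top}\bigl(W_K r_i^{\Phi_k} + b_K\bigr), \\
\beta_{ik} &= \frac{\exp(g_{ik})}{\sum_{k'} \exp(g_{ik'})}, \qquad
\hat{h}_i = \sum_{k} \beta_{ik}\, r_i^{\Phi_k}, \\
\tilde{z}_i &= \sigma\bigl(\lambda\, \hat{h}_i + (1-\lambda)\, h'_i\bigr).
\end{aligned}
```

Under this reading, the query comes from the node itself and the keys come from its per-relation neighbor summaries, so the softmax over $k$ decides which relation type carries the most risk information for that node.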

We sum the learned representations of Hyper-GNNs and Heter-GNNs and utilize a fully connected layer to transform the learned node representations for bankruptcy prediction, as in Figure 2 (III).

where W p is a trainable matrix and b p is the bias vector. Finally we train our model by minimizing the cross-entropy loss.

where Y L is the set of labeled nodes. y i and ỹ i are the ground truth and the predicted label for node i, respectively.
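The prediction head and training loss can be sketched as follows, assuming a two-class softmax output and a cross-entropy that is summed (not averaged) over the labeled nodes; the text fixes neither choice. The weight values and representations are hypothetical placeholders.

```python
import math

def softmax(logits):
    top = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(z_hyper, z_heter, W, b):
    """Sum the two conductive-risk views, then apply one fully connected layer."""
    z = [a + c for a, c in zip(z_hyper, z_heter)]
    logits = [sum(w * x for w, x in zip(row, z)) + bias for row, bias in zip(W, b)]
    return softmax(logits)

def cross_entropy(probs, labels, labeled_nodes):
    """Negative log-likelihood of the true class, summed over labeled nodes."""
    return -sum(math.log(probs[i][labels[i]]) for i in labeled_nodes)

# Hypothetical two-dimensional representations for two enterprises.
z_hyper = [[0.2, 0.1], [0.4, 0.3]]
z_heter = [[0.1, 0.5], [0.0, 0.2]]
W = [[0.0, 0.0], [0.0, 0.0]]   # untrained weights: both classes equally likely
b = [0.0, 0.0]

probs = [predict(zh, zt, W, b) for zh, zt in zip(z_hyper, z_heter)]
labels = [0, 1]                # 1 = bankrupt
loss = cross_entropy(probs, labels, labeled_nodes=[0, 1])
```

With zero weights every node gets probability 0.5 per class, so the loss equals 2 ln 2, which is a convenient sanity check before training.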

To examine the performance of the proposed model on bankruptcy prediction, we manually collect and pre-process a real-world SMEs dataset, named SMEsD. To the best of our knowledge, this dataset is the largest multi-modal bankruptcy prediction dataset that contains abundant multi-dimensional information. SMEsD consists of 4,229 SMEs and related persons in China from 2014 to 2021, which constitute a multiplex enterprise knowledge graph. All enterprises are associated with their basic business information and lawsuit events spanning from 2000 to 2021. Specifically, the enterprise business information includes registered capital, paid-in capital and established time. Each lawsuit consists of the associated plaintiff, defendant, subjects, court level, result and timestamp. Table 2 gives the statistics of SMEsD. The dataset contains two types of nodes, i.e., enterprise and person. For the enterprise heterogeneous graph, there exist 7 types of relationships between enterprises and persons. The holder investment relationship is weighted by the contribution capital and the other edges are unweighted. Besides, the loan and deal edges are constructed from lawsuits if enterprises have these two types of disputes. For the hypergraph, there exist three types of hyperedges, i.e. industry, area and stakeholder. We split SMEsD into training, validation and testing sets according to the bankruptcy time of the seed enterprises.
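A time-based split of this kind can be sketched as below. The cut-off dates, the enterprise identifiers and the function name `split_by_time` are hypothetical, and the sketch assumes that each seed enterprise carries a single observation date (the bankruptcy time for bankrupted enterprises); how surviving enterprises are assigned is not specified in the text.

```python
from datetime import date

def split_by_time(seeds, train_end, valid_end):
    """Chronological split: earlier observations train, later ones validate and test.

    seeds: list of (enterprise_id, observation_date) pairs.
    """
    train, valid, test = [], [], []
    for enterprise_id, when in seeds:
        if when <= train_end:
            train.append(enterprise_id)
        elif when <= valid_end:
            valid.append(enterprise_id)
        else:
            test.append(enterprise_id)
    return train, valid, test

# Hypothetical seed enterprises and cut-off dates.
seeds = [
    ("e1", date(2016, 5, 1)),
    ("e2", date(2019, 8, 15)),
    ("e3", date(2020, 3, 1)),
    ("e4", date(2021, 1, 10)),
]
train_ids, valid_ids, test_ids = split_by_time(
    seeds, train_end=date(2019, 12, 31), valid_end=date(2020, 12, 31)
)
```

Splitting along time rather than at random prevents information from later bankruptcies leaking into the training set.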

To measure the effectiveness of our method, we compare the proposed model with four types of state-of-the-art (SOTA) methods:

(1) conventional machine learning based methods, which only consider enterprise lawsuit information, in the form of the frequency of lawsuit attributes, together with basic business information; (2) hypergraph neural network based methods, which take high-order relationships among enterprises into consideration and are thus able to detect common risks that enterprises face; (3) homogeneous GNN based methods, which exploit the abundant connections among enterprises and are thus able to capture conductive-risk; (4) heterogeneous GNN based methods, which are able to distinguish the multiplex relationships in the EKG.

• Support Vector Machine (SVM) [58]: a model that uses support vectors to separate the feature space into different classes.

• Hypergraph neural networks (HGNN) [56] : a model proposed to utilize high-order relationship information in graphs.

• Hypergraph Wavelet Neural Network (HWNN) [59]: a recently proposed model that uses a wavelet basis instead of a Fourier basis to perform localized hypergraph convolution.

Homogeneous GNNs (HomoG) Based Methods

• Graph Convolutional Networks (GCN) [60]: a popular model that averages neighbors' information during the message passing process.

• Graph Attention Networks (GAT) [38]: a model that uses an attention mechanism to assign different weights to neighbors during information aggregation.

• Relational Graph Convolutional Networks (RGCN) [61] : an advanced extension of GCN, which takes relation information into consideration by giving different weights for different relationships.

• Heterogeneous Graph Neural Network (HetGNN) [62]: a multi-modal heterogeneous graph model that uses a Bi-LSTM to process multi-modal information and then applies an attention mechanism to fuse heterogeneous information.

• Heterogeneous Graph Attention Network (HAN) [63]: one of the earliest models to implement hierarchical attention on graph neural networks based on meta-paths.

• interpretable and efficient Heterogeneous Graph Convolutional Network (ie-HGCN) [64]: a SOTA model that first performs object-level aggregation and then aggregates type-level information based on different meta-paths.

For all baseline methods, we compute enterprise risk information by counting the occurrences of each lawsuit attribute and concatenating these counts with the enterprise basic business attributes to form enterprise risk representations. In addition, for GNN based methods, we use the same initialization as our model to assign representations to enterprises and persons. We implement MIXR and the baselines with PyTorch and PyTorch Geometric (PyG). We refer to THU-HyperG [65] for constructing hypergraphs. We implement the baselines using their official code and default parameters. All neural network based models are trained with the SGD optimizer [66] and the Cosine Annealing Learning Rate Scheduler [67]. We set the input dimension to 64 and the output dimension to 12 for each model. We run all methods for 200 epochs and, to alleviate overfitting, update the models according to the improvement of both comprehensive indicators on the validation set, i.e., accuracy and F1 score. We report the results of all methods on the testing set.

Table 3 shows the evaluation results against nine state-of-the-art baselines, from which we observe that our proposed method outperforms all baselines for enterprise bankruptcy prediction in terms of all metrics on our newly built dataset SMEsD. Specifically, MIXR achieves state-of-the-art performance with improvements of 3.62%, 0.56%, 3.77% and 3.28% on accuracy, precision, recall and F1 score, respectively, which confirms the capability of our method in mixing enterprise intra-risk and conductive-risk for bankruptcy prediction.

Analysis. (1) SVM achieves good performance on recall, because lawsuit information and enterprise basic information are highly correlated with enterprise bankruptcy. However, SVM achieves poor performance on accuracy because of overfitting. (2) HWNN performs better than HGNN because it considers different types of hyperedges, which shows the necessity of considering hypergraph heterogeneity. (3) The SOTA HeterG baseline, i.e., ie-HGCN, performs better on the two comprehensive indicators, i.e., accuracy and F1 score, than the ML, HyperG and HomoG models, which affirms the ability of heterogeneous graphs to capture conductive-risk.
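For reference, the four reported metrics can be computed from binary predictions as in the sketch below. The label vectors are toy values rather than SMEsD results, and treating label 1 as the bankrupt class is an assumption. The example is chosen to show how a classifier can reach high recall while its accuracy remains low.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for binary labels."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels: every positive is found, but many negatives are misclassified.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Because accuracy and F1 both penalize false positives, they serve as the two comprehensive indicators, whereas recall alone does not.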

To evaluate the effectiveness of the different components of the proposed model MIXR, we conduct an ablation experiment. The three ablated variants are as follows: (1) MIXR w/o Intra-risk, which removes the inner risk encoder; (2) MIXR w/o Hyper-GNNs, which removes the Hierarchical Hypergraph encoder; (3) MIXR w/o Hete-GNNs, which removes the hierarchical risk encoder module. The results are shown in Figure 6. Removing any of the heterogeneous graph, the hypergraph or the risk encoder degrades performance, which demonstrates the effectiveness of the three proposed modules. Specifically, the proposed model MIXR outperforms MIXR w/o Intra-risk, which confirms the importance of lawsuit information in bankruptcy prediction. Meanwhile, MIXR w/o Intra-risk achieves the worst performance among the three ablated variants, which verifies the effectiveness of risk information. We therefore highlight the design for capturing lawsuit risk information. Compared with MIXR w/o Hyper-GNNs, the proposed model MIXR also achieves better performance, which demonstrates the contribution of hypergraphs. This is because enterprises in the same industry, in the same area or with the same stakeholders usually face similar external risks, such as industry recession, changes in regional economic policy and guarantee risk, which can be detected by hypergraphs. For MIXR w/o Hete-GNNs, the performance also decreases, which confirms that utilizing the multiplex heterogeneous relationships in the enterprise knowledge graph strengthens the capacity of the model.

As shown in Figure 7, the proposed MIXR achieves the best performance compared with all variants. Specifically, MIXR performs better than Risk-Frequency, which again demonstrates the risk representation capacity of the inner risk encoder. This is because our model not only uses lawsuit risk information in terms of frequency but also takes advantage of the time interval associated with each lawsuit, which is shown in Table 1 to be significantly correlated with enterprise bankruptcy. Compared with Hyper-HGNN, the proposed model MIXR also does better, because it is able to distinguish different types of hyperedges and assign different importance weights to the learned representations. We can also observe that replacing the Hierarchical Risk Encoder with RGCN leads to a performance decrease, from which we conclude that the proposed Hierarchical Risk Encoder captures the conductive-risk embedded in multiplex relationships more fully. The Cascade variant also performs worse than MIXR, which demonstrates the superiority of the designed architecture of the proposed model.

We examine the effects of two critical hyper-parameters of MIXR, i.e., the input dimension and the lawsuit risk information dimension, whose default values are 64 and 20, respectively. Impact of input dimension. As shown in Figure 8 (a), the performance first rises as the dimension increases up to 64 and then falls as the dimension increases further. This may be because a model with too low a dimension fails to represent the abundant information of nodes, whereas a high dimension introduces too much noisy information and thus restricts the capacity of the proposed model MIXR.

Impact of lawsuit risk information dimension. We can observe from Figure 8 (b) that the model performance first increases, reaches its peak at 20, and then decreases as the dimension rises further. This is mainly because the total number of lawsuit attributes in the SMEsD dataset is 20, so both lower and higher lawsuit risk dimensions lead to a performance decrease.

In this paper, we propose to model enterprise bankruptcy risk by mixing its intra-risk and conductive-risk. Under this framework, we propose a novel method that is equipped with an LSTM-based intra-risk encoder and a GNN-based conductive-risk encoder. Specifically, the intra-risk encoder captures enterprise intra-risk using statistically correlated indicators derived from the basic business information and litigation information. The conductive-risk encoder consists of hypergraph neural networks and heterogeneous graph neural networks, which model conductive-risk from two aspects, i.e., hyperedges and the multiplex heterogeneous relations in the enterprise knowledge graph, respectively. To evaluate the proposed model, we collect multi-source SME data and build a new dataset, SMEsD, on which the experimental results demonstrate the superiority of the proposed method. The dataset is expected to become a significant benchmark for SME bankruptcy prediction and to further promote the development of financial risk research.

Shaopeng Wei received the B.S. degree from Huazhong Agricultural University in 2019 and is now a Ph.D. student at Southwestern University of Finance and Economics. His research interests include graph learning and its applications in Fintech and recommendation systems.

Yu Guo received the B.S. degree from Chengdu Normal University in 2020 and is now a master's candidate at Southwestern University of Finance and Economics. His research interests include natural language processing and enterprise risk forecasting.

Qing Yang received the B.S. degree from Southwestern University of Finance and Economics in 2021, and now is a master candidate in Southwestern University of Finance and Economics. Her research interests include enterprise risk forecasting.

Job creation versus job shedding and the role of smes in economic development

Loan managers' trust and credit access for smes

A neural network model for bankruptcy prediction

A binary classification method for bankruptcy prediction

Maec: A multimodal aligned earnings conference call dataset for financial risk prediction

Redefining financial constraints: A text-based analysis

Finbert: A pre-trained financial language representation model for financial text mining

Supply network structure, visibility, and risk diffusion: A computational approach

Heterogeneity and network structure in the dynamics of diffusion: Comparing agent-based and differential equation models

Heterogeneous graph attention network for small and medium-sized enterprises bankruptcy prediction

Returns to buying winners and selling losers: Implications for stock market efficiency

A semi-supervised graph attentive network for financial fraud detection

Cash-out user detection based on attributed heterogeneous information network with a hierarchical attention mechanism

Financial risk analysis for smes with graph-based supply chain mining

Loan default analysis with multiplex graph learning

Temporal-aware graph neural network for credit risk prediction

Pick and choose: A gnn-based imbalanced learning approach for fraud detection

Evaluating the credit risk of smes using legal judgments

Shared analyst coverage: Unifying momentum spillover effects

Bankruptcy spillover effects on strategic alliance partners

Spreading the misery? sources of bankruptcy spillover in the supply chain

Financial firm bankruptcy and contagion

Financial ratios, discriminant analysis and the prediction of corporate bankruptcy

Logit versus discriminant analysis: A specification test and application to corporate bankruptcies

A multi-industry bankruptcy prediction model using back-propagation neural network and multivariate discriminant analysis

Comparative analysis of data mining methods for bankruptcy prediction

Measuring firm performance using financial ratios: A decision tree approach

Dynamic bankruptcy prediction models for european enterprises

A novel approach for panel data: An ensemble of weighted functional margin svm models

Using neural network ensembles for bankruptcy prediction and credit scoring

Bankruptcy prediction using imaged financial ratios and convolutional neural networks

The effects of conference call tones on market perceptions of value uncertainty

Systemic risk in financial systems

Risk assessment for banking systems

A simulation-based risk network model for decision support in project risk management

Systemic risk and stability in financial networks

Semi-supervised classification with graph convolutional networks

Graph attention networks

Watch your step: Learning node embeddings via graph attention

Position-aware graph neural networks

Sequential recommendation with graph neural networks

Dual graph enhanced embedding neural network for ctr prediction

Every document owns its structure: Inductive text classification via graph neural networks

Iterative gnn-based decoder for question generation

Graph r-cnn for scene graph generation

Rgcnn: Regularized graph cnn for point cloud segmentation

Graph representation learning: a survey

Every corporation owns its structure: Corporate credit ratings via graph neural networks

A machine learning approach for predicting hidden links in supply chain with graph neural networks

Music recommendation by unified hypergraph: combining social media information and music content

Self-supervised multi-channel hypergraph convolutional network for social recommendation

Adahgnn: Adaptive hypergraph neural networks for multi-label image classification

Semi-dynamic hypergraph neural network for 3d pose estimation

Streaming graph neural networks

Long short-term memory

Hypergraph neural networks

Gaussian error linear units (gelus)

Least squares support vector machine classifiers

Heterogeneous hypergraph embedding for graph classification

Semi-supervised classification with graph convolutional networks

Modeling relational data with graph convolutional networks

Heterogeneous graph neural network

Heterogeneous graph attention network

Interpretable and efficient heterogeneous graph convolutional network

Hypergraph learning: Methods and practices

On the importance of initialization and momentum in deep learning

Sgdr: Stochastic gradient descent with warm restarts

The authors would like to thank all anonymous reviewers in advance. This research has been partially supported by grants from the National Natural