title: The fragility of opinion formation in a complex world
authors: Medo, Matúš; Mariani, Manuel S.; Lü, Linyuan
date: 2020-10-23

With vast amounts of high-quality information at our fingertips, how is it possible that many people believe that the Earth is flat and vaccination harmful? Motivated by this question, we quantify the implications of an opinion formation mechanism whereby an uninformed observer gradually forms opinions about a world composed of subjects interrelated by a signed network of mutual trust and distrust. We show numerically and analytically that the observer's resulting opinions are highly inconsistent (they tend to be independent of the observer's initial opinions) and unstable (they exhibit wide stochastic variations). Opinion inconsistency and instability increase with the world complexity represented by the number of subjects, which can be prevented by suitably expanding the observer's initial amount of information. Our findings imply that even an individual who initially trusts credible information sources may end up trusting the deceptive ones if at least a small number of trust relations exist between the credible and deceptive sources.

Identifying potential mechanisms behind the formation of opinions in society is vital to understanding how polarization emerges [1], how misinformation spreads and can be prevented [2], and how science can be effectively communicated to the public [3]. Despite recent advances in understanding how opinions propagate in social networks [1], how artificial agents promote low-credibility content in social media [4, 5], and how rapidly misinformation spreads compared to reliable content [2, 6], misinformation still thrives in our society.
This is well exemplified by the recent growth of anti-vaccination views and the related existence of anti-vaccination clusters in online social networks [7]. The popularity of unreliable opinions, which is especially dangerous during global emergencies such as the recent COVID-19 pandemic [8], calls for a deeper investigation of the possible drivers behind the process whereby individuals form opinions in a society. We show that, even in the absence of a social network of influence, inconsistent and unstable opinions can emerge from a process whereby an uninformed individual forms opinions about a world composed of interrelated subjects. Existing models of opinion formation [1, 9-12] and cultural dynamics [13] focus on opinion or culture propagation on a social network of influence. Departing from existing models, we develop a modeling approach that focuses on the process whereby an individual observer forms opinions about a set of interrelated subjects.

* matus.medo@unifr.ch

In existing models [10-12], an individual can form opinions on distinct topics as the result of independent realizations of an opinion propagation process on a social network of influence. By contrast, the proposed model takes into account the connections among topics [14], which we find to be a key determinant of opinion inconsistency. Inspired by Heider's social balance theory [15-17], and its validation on data on armed conflicts among countries [18, 19] and large-scale social media [20-22], our model assumes that an individual observer gradually forms opinions on a set of subjects connected by signed links representing positive and negative relations. The subjects on which the opinions are formed can represent governments, politicians, news media, or other entities that belong to two different camps. While such systems tend to form macroscopic structures such as two opposing camps [18, 23, 24], these structures are generally imperfect.
Two countries, for example, can belong to the same alliance whose members generally have positive relations, yet their mutual relation can be negative due to historic or economic reasons (consider the two NATO members, Greece and Turkey, and their long-term issues). In science, it has repeatedly occurred that a Nobel prize recipient endorsed conspiracy theories, as was recently the case with Luc Montagnier's controversial claims on the origin of the ongoing COVID-19 pandemic. Forming a reliable opinion about a complex subject requires effortful reasoning. However, psychological research indicates that humans tend to be driven rather by simple heuristics when forming opinions about complex topics, sometimes reaching opinions that violate basic logic rules [25, 26]. The limitations of our cognition have important consequences. For example, the susceptibility to partisan fake news was recently found to be driven more by "lazy reasoning" than by partisan bias [27]. This motivates us to study the problem of an observer who starts with opinions on a small set of subjects (seed opinions) and applies a local rule (heuristics) to form opinions on the remaining subjects by relying only on their explicit signed relations. We find that even a small fraction of misleading links in the relation network (e.g., a link of mutual trust between a scientific and a low-credibility information source) leads to resulting opinions that are inconsistent with the observer's seed opinions and that vary significantly between model realizations. We determine analytically the relation between average opinion consistency and the world complexity, represented by the number of subjects, which demonstrates that opinion consistency decays as the world complexity increases. This decay can be prevented by suitably increasing the observer's initial number of independent opinions.
Although opinion consistency depends on network topology and can be improved by considering a more sophisticated local opinion formation mechanism, our main conclusions are robust to variations of the network topology and the opinion-formation mechanism. Our findings point to the inherent fragility of the opinion formation process in a world composed of many interrelated subjects and, at the same time, suggest strategies to increase its reliability. Since subjects may represent co-existing scientific or low-credibility information sources, our model presents a contributing mechanism for how misinformation sources may gain their audience. We consider an individual observer who gradually develops opinions on a world composed of N interrelated subjects (see Fig. 1A). The number of subjects represents the complexity of the world. Each opinion is for simplicity assumed to take one of three possible states: no opinion, a positive opinion (trust), or a negative opinion (distrust). The observer's opinions can be formally represented by an N-dimensional opinion vector o whose element o_i represents the opinion on subject i; o_i ∈ {−1, 0, 1} corresponds to a negative opinion, no opinion, and a positive opinion, respectively. The subjects form a signed undirected network of relations. These relations are represented by a symmetric N × N relation matrix R whose element R_ij represents the trust relation between subjects i and j; R_ij ∈ {−1, 0, 1} corresponds to a negative relation, no relation, and a positive relation, respectively.
We emphasize the main difference between this setting and traditional opinion formation models based on propagation on networks of social influence [9-12]: in existing models, simulating the opinion formation on N subjects would require running N independent realizations of the opinion formation process, which would miss the interconnectedness among subjects; by contrast, in the proposed approach, the interconnectedness among subjects is naturally encoded in the relation matrix R. The observer's opinion formation starts from an initial condition where the observer has an initial opinion on a "seed" subset of subjects, S (seed opinions). The observer then gradually forms an opinion on each of the remaining subjects via sequential opinion formation, until opinions on all subjects are formed. Once formed, the opinions are not updated. In one step, a target subject i is chosen at random from the pool of subjects with no opinion (o_i = 0). The observer then attempts to form an opinion on i. From all subjects j with an opinion (o_j ≠ 0) that are adjacent to i (R_ij ≠ 0), we choose one subject at random (source subject). The opinion o_i is then set to o_j R_ji (see Fig. 1B). As a result, a positive opinion on i is formed if either: (1) the observer has a positive opinion on j and the relation between j and i is positive ("the friends of my friends are also my friends") or (2) the observer has a negative opinion on j and the relation between j and i is negative (formalizing the ancient proverb "the enemies of my enemies are my friends"). A negative opinion on i is formed otherwise. Note that this mechanism produces a balanced triad consisting of the observer and subjects i and j (in Heider's original sense of heterogeneous triads that can include both individuals as well as objects [15, 28]). The observer then continues with the next subject until opinions on all subjects have been formed.
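To make the rule concrete, here is a minimal Python sketch of the sequential process (not the authors' code; the dict-of-dicts encoding of R and the revisiting of target subjects whose neighbors all lack opinions are our assumptions):

```python
import random

def form_opinions(R, seed_opinions, rng=None):
    """Sequential opinion formation with the random neighbor rule.

    R: symmetric signed relations as a dict of dicts, R[i][j] in {-1, +1}.
    seed_opinions: dict mapping seed subjects to their opinions (+1 or -1).
    Returns the opinion dict o; subjects unreachable from the seeds keep
    no opinion and are absent from o.
    """
    rng = rng or random.Random()
    o = dict(seed_opinions)
    pending = set(R) - set(o)
    while pending:
        # target subjects that already have at least one opinionated neighbor
        ready = [i for i in sorted(pending) if any(j in o for j in R[i])]
        if not ready:
            break  # the rest of the network is unreachable from the seeds
        i = rng.choice(ready)                        # random target subject
        j = rng.choice([j for j in R[i] if j in o])  # random source subject
        o[i] = o[j] * R[i][j]  # friend-of-friend / enemy-of-enemy heuristic
        pending.discard(i)
    return o
```

On a balanced network the outcome is deterministic regardless of the random choices, in line with the special case noted below.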
This opinion formation process, which we refer to as the random neighbor rule as it forms opinions using neighboring subjects chosen at random, is purposely simple as it intends to imitate an observer with limited cognitive resources (see [27] for a recent account on susceptibility to fake news driven by "lack of reasoning"). We study a more thorough process (majority rule) below. The opinion formation outcome is not deterministic (except for the special case when all paths in the subject network are balanced; see section S1 in Supporting Material, SM) as it is influenced by the order in which subjects are chosen for opinion formation as well as by the source subject choices. For a given relation network and a set of seed opinions, individual realizations of the process correspond to a population of independent individuals or, alternatively, various possible "fates" of a single individual. We study outcomes of multiple model realizations to characterize statistical properties of the resulting opinions. For simulations on synthetic relation networks, we additionally average over various network realizations to remove possible effects of a specific network topology on the results.

Figure 1: (A) Starting from a small set of seed opinions and a world of unknown subjects (gray nodes), an observer gradually forms opinions (black circles, ±) on all the subjects. The subjects are interconnected by mutual relations of trust (solid green lines) or distrust (dotted red lines). (B) The formed opinion is determined as a product of the opinion on the source subject and the sign of the relation between the source subject and the target subject. A positive opinion is formed when the source opinion and the relation are both positive or both negative; a negative opinion is formed otherwise.

B. Opinion formation simulations on synthetic networks.

We now study the opinion formation model on a specific relation network where the subjects form two camps.
This scenario is relevant to various real situations [18, 23, 24]: The two camps can represent two opposing political parties (such as Democrats and Republicans), standard news outlets and false news outlets, or scientists and conspiracy theorists, for example. In synthetic networks, each camp consists of N/2 subjects. Every subject is connected by signed links with z random subjects, thus creating a random network of trust with average degree z. If subjects from the same camp are linked, the sign of their relation is +1 with probability 1 − β and −1 otherwise. Similarly, if subjects from different camps are linked, the sign of their relation is −1 with probability 1 − β and +1 otherwise. Parameter β ∈ [0, 0.5] thus plays the role of structural noise. As β grows, negative relations become more common within each camp and positive relations become more common across the camps. When β = 0.5, the two camps become indistinguishable by definition. The network's level of structural balance [16, 29] is the ratio of the number of balanced triads to all triads in the network. In our case,

B = (1 − β)^3 + 3β^2(1 − β) = [1 + (1 − 2β)^3]/2,   (1)

which corresponds to either all links of a triad (the first term) or exactly one link of a triad (the second term) respecting the two-camp structure, producing a balanced triad as a result. B decreases monotonically with β, from B = 1 at β = 0 to B = 1/2 at β = 0.5. The equation can be inverted, yielding β = [1 − (2B − 1)^{1/3}]/2, which can be used to write our results in terms of B instead of β. We assume that the observer has initially a positive opinion on N_S seed subjects from camp 1, and we examine whether the observer ends up with a positive opinion on other subjects from camp 1 and a negative opinion on subjects from camp 2, or not. If the two camps represent scientists and conspiracy theorists, for example, the corresponding practical question is whether an observer who initially trusts a scientist would end up predominantly trusting scientists or conspiracy theorists.
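The two-camp construction and the balance relation in Eq. (1) can be sketched as follows (a sketch under stated assumptions: the exact wiring of the random relation network in the paper may differ, and the function names are ours; here links are drawn as random pairs until the mean degree reaches z):

```python
import random

def structural_balance(beta):
    """Fraction of balanced triads in the two-camp network, Eq. (1)."""
    return (1 + (1 - 2 * beta) ** 3) / 2

def noise_from_balance(B):
    """Inverse of Eq. (1): beta = (1 - (2B - 1)^(1/3)) / 2 for B >= 1/2."""
    return (1 - (2 * B - 1) ** (1 / 3)) / 2

def two_camp_network(N, z, beta, rng):
    """Random signed network with two camps of N/2 subjects each.

    Random pairs are linked until the mean degree reaches z; each link
    respects the camp structure with probability 1 - beta.
    """
    camp = [1 if i < N // 2 else -1 for i in range(N)]
    R = {i: {} for i in range(N)}
    while sum(len(d) for d in R.values()) < N * z:  # N*z/2 undirected links
        i, j = rng.sample(range(N), 2)
        if j in R[i]:
            continue  # link already present
        correct = camp[i] * camp[j]  # sign respecting the two-camp structure
        sign = -correct if rng.random() < beta else correct
        R[i][j] = R[j][i] = sign
    return R, camp
```

With β = 0 every link respects the camps and the network is perfectly balanced; Eq. (1) then connects any β > 0 to the observable fraction of balanced triads.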
Without noise (β = 0), the opinion formation leads to a definite outcome: a positive opinion on all subjects from camp 1 and a negative opinion on all subjects from camp 2. In such a case, we say that the opinions are perfectly consistent with the underlying two-camp structure of the relationship network among the subjects. Opinion consistency of a resulting opinion vector, o, can be measured as

C = (1/(N − N_S)) Σ_{i ∉ S} o_i T_i,   (2)

where S is the set of N_S seed subjects and T represents the ground-truth structure of the relation network (in our case, T_j = 1 for j from camp 1 and T_j = −1 for j from camp 2). If the observer's opinions are chosen at random, the resulting consistency is zero on average. A zero or small consistency value thus indicates that the observer's opinions are independent of the seed opinions and thus inconsistent with the two-camp structure of the relationship network. Negative consistency is also possible: the observer starts with a positive opinion on subjects from camp 1 but ends with more positive opinions in camp 2 than in camp 1. Knowing that opinion consistency is one in the absence of noise, how does it change as the noise parameter β grows? Numerical simulations for a set of 100 subjects and one seed opinion show (Fig. 2A) that opinion consistency decreases rapidly with β. Indeed, if the relationship between consistency and noise were linear, we would expect C_0(β) := 1 − 2β, which starts at one when β = 0 and reaches zero when β = 0.5 as the two camps then cannot be distinguished by definition. By contrast, we observe a substantially faster decay of the mean consistency µ_C(β). In addition, the consistency values vary strongly between model realizations. For β = 0.02, for example, mean consistency is only 0.80 and there are model realizations with consistency below 0.54 and above 0.97 (the 10th and 90th percentile, respectively, of the obtained consistency values for z = 4).
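The consistency metric of Eq. (2) translates directly into code (a sketch; representing o and T as dictionaries keyed by subject is our choice):

```python
def consistency(o, T, seeds):
    """Opinion consistency, Eq. (2): the mean of o_i * T_i over all
    non-seed subjects; +1 for perfect agreement with the ground truth,
    -1 for total disagreement, and 0 on average for random opinions."""
    non_seed = [i for i in T if i not in seeds]
    return sum(o[i] * T[i] for i in non_seed) / len(non_seed)
```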
This means that even when the noise is small, some sets of formed opinions are in dramatic disagreement with the observer's seed opinion. To appreciate the level of noise in real data, Moore [18] reported that 80% of triads among Middle East countries are balanced. Eq. (1) shows that such a level of structural balance is achieved at β ≈ 0.08 in our two-camp networks. In Fig. 2B, mean opinion consistency at β = 0.08 is as low as 0.42 (for z = 10). These results confirm our initial hypothesis that a realistic level of noise leads to the adoption of a large fraction of opinions that do not align with the observer's initial opinion. The opinion formation with the two-camp structure can be studied analytically under the assumption of homogeneous mixing [30]. It is advantageous to study the problem in terms of the number of formed opinions, n, and the number of consistent opinions, c (that is, the opinions that are consistent with the seed opinions and the two-camp structure). By rewriting the sum in Eq. (2) in terms of n and c, the consistency of a complete opinion vector becomes C = (2c − N − N_S)/(N − N_S). When the observer forms a new opinion, n increases by one and c either increases by one (if the new opinion is consistent) or remains constant. We introduce the probability distribution of c when n opinions have been formed, P(c; n), for which the master equation (see Methods for the derivation) has the form

P(c; n) = [β + (1 − 2β)(c − 1)/(n − 1)] P(c − 1; n − 1) + [1 − β − (1 − 2β)c/(n − 1)] P(c; n − 1).   (3)

The initial condition P(N_S; N_S) = 1 represents that all N_S seed opinions are consistent. Eq. (3) can be solved numerically and the obtained solution P(c; n) can be used to compute the corresponding mean opinion consistency. The numerical solution agrees well with the model simulations (Fig. 2B), in particular when the relation network is not sparse (z ≳ 10). Eq. (3) allows us to investigate the dependence between opinion consistency and the world complexity, represented by the number of subjects, N. A surprising finding is that as the number of subjects increases, the distribution of C obtained by solving Eq.
(3), P(C), does not approach a well-defined limit distribution, but instead steadily shifts towards C = 0 and becomes narrower in the process (Figure 2C). We study this behavior by computing the mean opinion consistency, µ_C(N), and the standard deviation of consistency, σ_C(N). Multiplying Eq. (3) with c and summing it over c = N_S, . . . , N yields the recurrence equation

c(n) = [1 + (1 − 2β)/(n − 1)] c(n − 1) + β   (4)

with the initial condition c(N_S) = N_S (the seed opinions are assumed to be correct). This recurrence equation can be solved in general, yielding

c(n) = n/2 + (N_S/2) Γ(N_S) Γ(n + 1 − 2β) / [Γ(N_S + 1 − 2β) Γ(n)].   (5)

For N_S = 1, the corresponding mean consistency is

µ_C(N) = [Γ(N + 1 − 2β)/(Γ(2 − 2β) Γ(N)) − 1] / (N − 1)   (6)

with the leading contribution

µ_C(N) ≈ N^{−2β}/Γ(2 − 2β).   (7)

This shows that the mean opinion consistency vanishes in the limit N → ∞. The leading-term contribution to σ_C(N) is also proportional to N^{−2β} when β ≤ 1/4. When β > 1/4, the leading term becomes proportional to N^{−1/2}. These analytic results agree with numerical simulations of the model (Figure 3). The behavior demonstrated by Figures 2B and 3, and supported by the analytic solution above, has important consequences. It shows that as the world complexity increases, the formed opinions become on average less consistent with the seed opinions and the two-camp structure of the subject network. Crucially, the opinion consistency is zero in the limit of an infinite number of subjects for any positive level of noise, β, in the subject relation network: in the limit of an infinite-complexity world, even a tiny amount of noise is enough to nullify opinion consistency. These results are robust with respect to variations of the relation network structure and to using more than one seed opinion (see Sec. S3 in SM). The convergence of opinion consistency to zero as N → ∞ can be avoided if the number of seed opinions grows linearly with N so that the fraction of seed opinions remains constant. Assuming that N_S = f_S N, Eq.
(5) can be used to show that the mean consistency approaches

µ_C^∞ = f_S (f_S^{2β−1} − 1)/(1 − f_S)

in the limit N → ∞ and that the standard deviation of consistency vanishes as 1/√N (see Sec. S2 in SM). This scaling relation determines the necessary proportion of seed opinions, f_S, needed to achieve a desired opinion consistency, µ_C, for given β. These results are confirmed by numerical simulations shown in Fig. 3C,D. Despite having a positive limit value, opinion consistency still decreases quickly with noise in the relationship network when f_S is small (see Fig. S2 in SM). While our findings hold qualitatively when a different topology of the relation network is used, the mean opinion consistency values are heavily affected by the network topology (see Figure 4). We run the opinion formation model on a growing preferential attachment network, a configuration model (CM) network with a power-law degree distribution, and Watts-Strogatz networks with various values of the rewiring probability, p_r (see Sec. S3.3 in SM for details on the network construction). We find that networks with broad degree distributions lead to higher opinion consistency which decays more slowly with N (see Fig. S5 in SM) than in the previously studied random networks. By contrast, Watts-Strogatz networks yield lower opinion consistency which further decreases as the networks become more regular through lowering the rewiring probability, p_r.

D. Opinion formation using the majority rule.

The results described above hold for the opinion formation model where a random neighbor of a target subject is chosen as the reference. We chose this model to study the consequences of a cognitively easy opinion formation model. At this stage, one might object that the observed sensitivity of opinion consistency to noise might be because each formed opinion directly relies on only one previously formed opinion, and it might disappear if the observer incorporates the information from more neighbors before forming an opinion.
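Before turning to the majority rule, the analytic results above can be cross-checked numerically. A sketch under stated assumptions: the closed-form expressions below are our reconstruction of the Gamma-function solution and its leading term from the surrounding derivation, with log-gamma used to avoid overflow at large N, and `limit_consistency` encoding the reconstructed large-N limit at a fixed seed fraction f_S:

```python
from math import exp, lgamma

def mean_consistency_recurrence(N, beta, N_S=1):
    """Iterate the mean-value recurrence c(n) = [1 + (1-2b)/(n-1)] c(n-1) + b
    from c(N_S) = N_S, then convert the mean number of consistent opinions
    into mean consistency via mu_C = (2c - N - N_S) / (N - N_S)."""
    c = float(N_S)
    for n in range(N_S + 1, N + 1):
        c = c * (1 + (1 - 2 * beta) / (n - 1)) + beta
    return (2 * c - N - N_S) / (N - N_S)

def mean_consistency_closed(N, beta, N_S=1):
    """Closed-form solution of the recurrence (our reconstruction):
    c(n) = n/2 + (N_S/2) G(N_S) G(n+1-2b) / [G(N_S+1-2b) G(n)],
    evaluated with log-gamma to avoid overflow."""
    g = 1 - 2 * beta
    h = (N_S / 2) * exp(lgamma(N_S) + lgamma(N + g)
                        - lgamma(N_S + g) - lgamma(N))
    return (2 * (N / 2 + h) - N - N_S) / (N - N_S)

def leading_term(N, beta):
    """Leading large-N behaviour for N_S = 1: mu_C ~ N^(-2b) / Gamma(2-2b)."""
    return N ** (-2 * beta) * exp(-lgamma(2 - 2 * beta))

def limit_consistency(f_S, beta):
    """Reconstructed N -> infinity limit of the mean consistency when a
    fixed fraction f_S of subjects carries seed opinions."""
    return f_S * (f_S ** (2 * beta - 1) - 1) / (1 - f_S)
```

At β = 0 the limit gives µ_C = 1 for any f_S, and for a single seed opinion the closed form reproduces the N^(−2β) decay discussed above.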
To rule out this potential argument, we investigate a model where all neighbors of a target subject are considered before forming the opinion. Denote the numbers of neighbors leading to the adoption of a positive and a negative opinion (determined as in Figure 1B) as n_P and n_N, respectively. If n_P > n_N, the observer forms a positive opinion. If n_N > n_P, the observer forms a negative opinion. If n_P = n_N, a random opinion is formed. We refer to this as the majority opinion formation rule. It is more demanding than the original random neighbor rule as it assumes that the observer carefully collects all evidence for forming an opinion on a target subject. The majority rule is nevertheless still a local rule as it only considers direct neighbors of a target node. Using the majority rule, a scaling analogous to Figure 3 can be observed (Figure 5A,B). The important difference is that the scaling exponent now depends on both β and z, with a higher mean degree, z, generally leading to µ_C(N) and σ_C(N) decaying more slowly with N. Except for the smallest used noise and the highest used degree (β = 0.05 and z = 50), all fitted slopes are significantly positive. Since the majority rule does not lend itself to analytical computation, whether the limit of µ_C(N) is indeed positive when β is sufficiently small and z is sufficiently high remains an open question. Figure 5C shows results for a fixed fraction of seed nodes. We see that when the network density is low (z = 4), the majority rule achieves results that are comparable to those of the random neighbor rule. When z increases, the majority rule leads to significantly more consistent opinions than the random neighbor rule. It has to be noted, though, that when z is large, the cost for the observer to collect and analyze all the information needed for opinion-making is large too.

E. Opinion formation simulations on real networks.
The trust consistency metric requires information on the ground-truth structure of the relation network (such as the assignment of subjects to one of the two camps in the case of a two-camp structure). Before analyzing empirical data, we aim to introduce a proxy for opinion consistency that does not require such information, which is typically not available for real data. To this end, we introduce opinion stability, S, which measures the extent to which elements of the opinion vector are the same in independent realizations of the opinion formation model (see Methods for the definition). If an opinion on a given subject always ends up positive (or always negative), it is a sign of a robust opinion and it contributes positively to opinion stability. Small opinion stability indicates that the opinion formation outcomes are highly volatile and, in turn, that they do not comply with the division of subjects into camps in synthetic networks. It can be shown that when the relation network's level of structural balance is one, opinion stability is one as well. In synthetic worlds, the opinion stability metric behaves as required when the relation network is sufficiently dense (z ≳ 10): S = 1 in synthetic networks when β = 0 and S is close to zero when β = 0.5 (see Sec. S4 in SM). In fact, the values of opinion consistency and opinion stability are nearly the same for all β values. The main reason for this agreement between stability, S, and consistency, C, is that high opinion consistency can only be achieved when the opinions in question are the same in all model realizations, which in turn leads to high opinion stability. Crucially, opinion stability vanishes as the number of subjects grows to infinity, similarly to what we observed for opinion consistency (see Fig. S8 in SM). Equipped with the opinion stability metric, we can assess opinion formation in empirical worlds represented by empirical signed networks.
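The exact definition of opinion stability lies in the paper's Methods, which are not part of this excerpt; the following is one plausible formalization with the stated properties (S = 1 when all realizations agree on every subject, S near 0 when signs vary randomly). It is our assumption, not the paper's formula:

```python
def opinion_stability(opinion_vectors):
    """A plausible stability measure: the average, over subjects, of the
    absolute mean opinion across independent realizations of the model."""
    subjects = list(opinion_vectors[0])
    M = len(opinion_vectors)
    return sum(abs(sum(o[i] for o in opinion_vectors)) / M
               for i in subjects) / len(subjects)
```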
We first use signed networks derived from United Nations General Assembly (UNGA) votes in individual sessions, where countries that vote similarly are connected with positive links and countries that vote differently are connected with negative links (see Methods for the data description). Figure 6A shows part of the network corresponding to the latest completed UNGA session 74 (2019-2020). The displayed loop is unbalanced as the product of its link weights is −1. As a result, the outcome of opinion formation using the random neighbor rule is not deterministic: assuming a positive seed opinion on Italy, the formed opinion on Russia is negative if it is made using the path ITA-FRA-RUS or positive if it is made using the path ITA-USA-RUS. This outcome variability then directly translates into the results shown in Figure 6B,C, where two different realizations of the random neighbor rule are shown to demonstrate the high variability of opinions despite a high level of structural balance of the respective UNGA network (in this case, B = 0.86). This agrees with our results in Figure 2B where opinion consistency decreases quickly with β and displays large fluctuations. Finally, Figures 6D-F show that the majority rule yields substantially more stable opinions and that the stability difference between the random neighbor rule and the majority rule tends to grow as the level of structural balance decreases. The number of nodes in the UNGA datasets is limited by the number of countries participating in the assembly's voting (the number of nodes grows from 53 in the 1st assembly to 191 in the 74th). To be able to observe the scaling of opinion stability similar to the scaling of opinion consistency in synthetic data (Fig. 3), we thus use signed trust networks from two popular online services: Slashdot [31, 32] and Epinions [32] (see Methods for the data description).
Note that while Slashdot and Epinions are social networks, our model still differs from classical models of opinion formation on social networks as it concerns opinion-making of an observer, not opinion-making of each individual member of the social network. Nodes in the given social networks represent interconnected subjects on which opinions are made. We create multiple subsets of each network with progressively increasing numbers of nodes (see Methods for details). We find that a stability-complexity tension is present in the real worlds (Fig. 7, panels D and E): opinion stability consistently decreases with the number of subjects. The fitted scaling exponents are 0.40 and 0.20 for Slashdot and Epinions, respectively. These values cannot be directly compared with the scaling exponent 2β that we derived for opinion consistency in random networks, as the Slashdot and Epinions networks are manifestly non-random. Building on the understanding that we gained by analyzing simulations on synthetic worlds, we can conclude that the levels of noise in the two real relation networks are high, which makes opinion formation using the random neighbor rule unreliable. We finally apply the majority rule on real networks. In agreement with previous results, we observe that the resulting opinion stability is almost always higher than that achieved by the random neighbor rule and the difference generally grows as the network's level of structural balance decreases (see Fig. S10 in SM). As N increases, the average opinion stability still vanishes with N, albeit more slowly than is the case for the random neighbor rule (see Figure 7C,D). We can thus again conclude that the majority rule does not solve the fundamental problem identified by our work: the formed opinions become progressively less reliable as the system size grows.
We demonstrated that sequential opinion formation based on an explicit network of trust and distrust is inherently fragile: even a small amount of noise leads to inconsistent and unstable outcomes. This suggests that to prevent the spreading of misinformation in large-scale online systems, it is paramount that there exist no trust links from credible to low-credibility sources of information. If a tiny fraction of such misleading links exists, an observer who starts trusting credible information sources may end up trusting a substantial number of low-credibility sources. If the same happens for a large number of observers, a cluster of misinformed individuals (such as anti-vaccination clusters [7]) can thrive. The more complex our world, the more fragile the process: there is a tension between opinion consistency/stability and the world's complexity. An individual observer can compensate for this increasing fragility by forming an independent (i.e., not derived from the trust network) initial opinion on a larger number of subjects, before resorting to the trust network to form an opinion on the remaining subjects. For example, a person forming opinions about a set of interconnected websites (some of them based on scientific content, some of them promoting conspiracy material) can increase the opinion consistency/stability by first carefully evaluating the trustworthiness of a large number of websites, and only then relying on relations between already trusted/distrusted websites to form opinions on the remaining ones. This suggests that policies to increase the consistency/stability of collective opinion may aim to promote the formation of individuals' independent opinions about a substantial number of subjects.
Within the studied framework, the majority rule yields better results, yet: (1) the majority rule is more laborious than the random neighbor rule as it assumes collecting all direct evidence on each target node, and (2) opinion consistency and stability under the majority rule still vanish when the network is not sufficiently dense and the noise is not sufficiently low. From the observer's standpoint, our work focuses on opinion formation mechanisms with low cognitive demands, which opens the way to studying more sophisticated mechanisms and to understanding the trade-off between the robustness of the resulting opinions and the observer's cognitive costs required to form the opinions. In the real world, additional influences (social influence, in particular) and heuristics are likely to be at work, and a high level of heterogeneity across observers is expected. Whether additional influences and mechanisms further increase or mitigate the fragility of the individuals' opinion formation process is a priori unclear. More sophisticated opinion formation models and their calibration to empirical data hold promise to shed light on this fundamental process for our interconnected societies.

Methods

When the ground-truth division of subjects into camps is known, we compute the consistency of a given opinion vector o with the ground-truth assignment T using Eq. (2). The seed opinions are excluded from the computation of consistency as these opinions are consistent by construction. The consistency values range from −1 for the opinion vector that totally disagrees with the ground truth (except for the seed opinions, the observer has positive opinions on all subjects from camp 2 and negative opinions on all subjects from camp 1) to +1 for the opinion vector that perfectly matches the ground-truth camps. The consistency of a random opinion vector is zero on average.
Note that we assume here that opinions are eventually formed on all subjects, which is the case in all simulations presented here. In numerical simulations, we average over independent model realizations on multiple realizations of the synthetic two-camp networks to estimate the mean opinion consistency. In Figure 2, we complement the mean with the 10th-90th percentile range of the consistency values. In Figure 3, we assess the uncertainty of the mean consistency with the standard deviation of the mean consistency over 10,000 bootstrapped sets of results; the displayed error bars are three times that value. The proposed opinion formation model using the random neighbor rule can be studied analytically for the two-camp relation network. We use the number of formed opinions, n, and the number of "consistent" opinions, c, as the variables describing the process. Initially, n = N_S and c = N_S, because all seed opinions are assumed to be consistent (they are all positive opinions on subjects from camp 1 or, more generally, positive opinions on subjects from camp 1 and negative opinions on subjects from camp 2). We introduce the probability distribution of c consistent opinions after n opinions are formed, P(c; n), with normalization Σ_c P(c; n) = 1. The initial condition is P(N_S; N_S) = 1, in line with the description above. To find P(c; n) for n > N_S, we write the general master equation P(c; n) = P(c − 1; n − 1) W(c − 1 → c; n − 1) + P(c; n − 1) [1 − W(c → c + 1; n − 1)]. The transition probability W(c − 1 → c; n − 1) corresponds to making a consistent opinion in a situation when n − 1 opinions have been made, of which c − 1 are consistent. If the target subject (on which the opinion is to be formed) is from camp 1, W(c − 1 → c; n − 1) is the probability that the observer decides to form a positive opinion on the target subject. Assume now that there are t_1 trusted (i.e., with a positive opinion) subjects from camp 1, d_1 distrusted (i.e., with a negative opinion) subjects from camp 1, t_2 trusted subjects from camp 2, and d_2 distrusted subjects from camp 2.
The probability of forming a positive opinion on the target node is n_P/(n_P + n_N), where n_P and n_N are the numbers of neighbors of the target subject that, when chosen, would result in forming a positive and a negative opinion, respectively. The expected value of n_P is proportional to t_1(1 − β) + d_1 β + t_2 β + d_2(1 − β) = (c − 1)(1 − β) + (n − c)β, where we used that t_1 + d_2 = c − 1 and t_1 + d_1 + t_2 + d_2 = n − 1. Note that it is the random structure of the relation network that allowed us to write such a simple expression. Similarly, n_N is proportional to t_1 β + d_1(1 − β) + t_2(1 − β) + d_2 β = (c − 1)β + (n − c)(1 − β). Taken together, these formulas give us the transition probability W(c − 1 → c; n − 1) = n_P/(n_P + n_N) = [c(1 − 2β) + β(n + 1) − 1]/(n − 1). It can be easily checked that the form of W(c − 1 → c; n − 1) is the same when the target subject is from camp 2. By plugging this W(c − 1 → c; n − 1) in Eq. (9), we obtain the master equation Eq. (3) which describes how P(c; n) changes as n grows. Note that the fact that W(c − 1 → c; n − 1) = n_P/(n_P + n_N) implies that the same master equation (and thus the same opinion formation model) is obtained by a seemingly more thorough observer who first evaluates all neighbors of the target subject and counts the numbers of subjects whose choice would result in forming a positive and a negative opinion, n_P and n_N, respectively. Based on n_P and n_N, the opinion on the target subject can be formed in a probabilistic manner: positive with probability n_P/(n_P + n_N) and negative otherwise. The outcome is thus the same as choosing one neighbor of the target subject at random and forming the opinion accordingly. Opinion consistency assumes that the ground truth division of subjects in camps is known, but that is not the case for most real datasets. To overcome this difficulty, we introduce another metric to assess the formed opinions, opinion stability.
For a given relation network, we fix the opinion on subject i and use R independent model realizations to compute the average opinion ō_j for all other subjects. If the formed opinions on subject j are stable, they are the same in all or most realizations and the value ō_j is thus close to +1 or −1. By contrast, volatile formed opinions result in ō_j close to zero. We then compute the average opinion stability with respect to the seed subject i as S_i = (1/(N − 1)) Σ_{j≠i} |ō_j|, where the absolute value reflects the fact that both ō_j = 1 and ō_j = −1 are signs of stable opinions on subject j. Note that the seed opinion is again excluded from the summation. Opinion stability is S_i = 1 when all realizations yield the same opinions on all subjects. For random opinions, however, the stability is not zero due to the absolute value in Eq. (10), which is never negative. In that case, ō_j follows the normal distribution with zero mean and standard deviation 1/√R. It can be shown that the mean of |ō_j| is then √(2/(Rπ)), which represents the expected opinion stability of a random trust vector. We thus transform Eq. (10) as S_i = [(1/(N − 1)) Σ_{j≠i} |ō_j| − √(2/(Rπ))] / [1 − √(2/(Rπ))] to obtain the final formula for opinion stability. Its values range from zero, on average, when opinions on all subjects are random, to one when opinions on all subjects are the same in all model realizations. Individual S_i values can be used to characterize the stability of opinions based on a seed opinion on subject i or aggregated to represent the overall opinion stability. In simulations on synthetic relation networks, we use 100 independent network realizations and compute opinion stability for a randomly chosen node. In simulations on real relation networks, we present opinion stability results for 100 nodes chosen at random. See Sec. S4 in SM for a comparison between opinion consistency and opinion stability. We test the opinion formation model on three distinct real datasets. The UNGA dataset contains the votes by countries at United Nations General Assemblies [33].
We use the state ideal point positions in one dimension estimated in [33] from the voting data [34] to quantify "state positions toward the US-led liberal order". The dataset contains all 74 general assemblies held in the years 1946-2020; assembly 19 is ignored because of faulty data. For each assembly, the estimated state positions x_i can be directly translated into distances |x_i − x_j| between the states. We generate one signed network for each general assembly by first removing all countries with fewer than 20 votes (at most 7 countries were removed from a single session) and then representing state distances below the 33.33rd percentile (for the given general assembly) as positive links and state distances above the 66.67th percentile as negative links, keeping the network's giant component. The numbers of nodes and links in the network increase progressively from 53 and 919, respectively, in the 1st general assembly to 191 and 12,096, respectively, in the 74th. The numbers of positive and negative links are identical by construction in each network; the level of structural balance ranges from 0.86 to 0.98. The Slashdot dataset represents the social network of the Slashdot website, where the users can tag each other as friends or foes [31, 32, 35]. While the original network is not symmetric, we represent it as symmetric, neglecting the mutual links whose signs do not agree (less than 1% of all links) and finally keeping only the giant component. The resulting Slashdot network comprises 82,052 nodes and 498,527 signed links. The fraction of negative links is 0.236 and the network's level of structural balance is B = 0.867. The Epinions dataset represents the social trust network of the website's users [32, 36]. After the same processing as we apply to the Slashdot data, the resulting Epinions network comprises 119,070 nodes and 701,569 signed links. The fraction of negative links is 0.168 and the network's level of structural balance is B = 0.905.
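The percentile-based construction used for the UNGA networks can be sketched as follows. This is a simplified sketch with numpy: the removal of countries with few votes and the giant-component extraction are omitted, and the function name is ours:

```python
import numpy as np

def signed_network_from_positions(x, lo=33.33, hi=66.67):
    """Build a signed edge list from 1-D ideal-point positions.

    Pairs closer than the lo-th percentile of all pairwise distances get a
    positive link; pairs farther than the hi-th percentile get a negative
    link; intermediate pairs stay unlinked. (The giant-component step used
    in the paper is omitted here.)
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    d = np.array([abs(x[i] - x[j]) for i, j in pairs])
    d_lo, d_hi = np.percentile(d, [lo, hi])
    edges = []
    for (i, j), dist in zip(pairs, d):
        if dist < d_lo:
            edges.append((i, j, +1))
        elif dist > d_hi:
            edges.append((i, j, -1))
    return edges
```

By construction, roughly one third of all pairs become positive links and one third become negative links, so the two counts are (nearly) identical, as stated above for the UNGA networks.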
To study the dependence of results on the network size, we created small subsets of the large Slashdot and Epinions networks by choosing a random node and gradually including its nearest neighbors, second-nearest neighbors, and so on, until a target number of nodes is reached. We created 100 independent networks for each network size, each of them starting from a node chosen at random. An alternative construction by choosing a given number of nodes or links at random would produce very sparse networks whose sparsity would directly impact the opinion formation process (see Fig. S7 in SM). In the two-camp case, β = 0 leads to the ideal two-camp structure without any noisy links. Assuming a positive seed opinion on a subject from camp 1, the resulting opinions are positive for all subjects from camp 1 and negative for all subjects from camp 2. The opinion consistency and stability are then one. A more general proposition is as follows: When all loops in the relation network are balanced, the opinions formed from a single seed opinion are deterministic. The notion of loop balance is a direct generalization of balanced and imbalanced triads: a loop is balanced if the product of edge signs along the loop is one; an imbalanced loop has the product of edge signs along the loop equal to −1. To prove this proposition, consider a loop L that contains the seed node s, and take any other node i in this loop. The loop can now be split in two independent paths, Γ_1 and Γ_2, leading from s to i. According to the probabilistic rule, the opinion o_i formed using path Γ from s to i can be written as o_i = o_s Π_{e∈Γ} R(e), where R(e) is the sign of edge e in the relation network. Since Γ_1 ∪ Γ_2 = L, we can write Π_{e∈L} R(e) = [Π_{e∈Γ_1} R(e)] [Π_{e∈Γ_2} R(e)]. Now if loop L is balanced, then Π_{e∈L} R(e) = 1 on the left-hand side.
As a result, the two terms on the right-hand side must have the same sign, which then directly implies that o_i is the same regardless of whether Γ_1 or Γ_2 is used to form the opinion on node i: the opinion-formation outcome is deterministic when only loop L is used. If all loops in the relation network are balanced, then the outcome is deterministic regardless of which paths are used to form the opinions. When all opinions formed using a given seed opinion are always the same, the opinion stability metric is one. Conversely, if at least one imbalanced loop is present in the relation network, then there is some randomness in the resulting opinions and the resulting opinion stability is less than one. Finally, note that it is possible that all triads in the network are balanced (hence the level of structural balance, B, is one), yet some longer loops are imbalanced (see G. Facchetti, G. Iacono, and C. Altafini, "Computing global structural balance in large-scale signed social networks", PNAS 108, 20953, 2011 for how to evaluate global structural balance in a signed network). As described in the main text, the opinion formation model can be analyzed in terms of the number of consistent opinions, c, and the number of formed opinions, n. For the two-camp relation network, we derived the master equation Eq. (S1), which describes how the probability distribution P(c; n) relates to the "previous step" probability distributions P(c − 1; n − 1) and P(c; n − 1). Beyond solving Eq. (S1) numerically, we study the properties of its solution analytically. While the first idea is to study the limit distribution of the fraction of consistent opinions, c/N, such a limit distribution does not exist as the simulations show that the variance of this distribution goes to zero as N grows (see Figures 2C and 3 in the main text).
The decrease of the standard deviation of consistency, σ_C(N), albeit very slow when β is small, means that a non-degenerate limit distribution does not exist: C approaches a fixed value in the thermodynamic limit. Since the limit behavior of Eq. (S1) cannot be studied directly, we focus instead on computing the first and second moments of c. To this end, we first multiply Eq. (S1) with c and sum it over c = 1, . . . , N to obtain ⟨c⟩(n) = ⟨c⟩(n − 1)[1 + (1 − 2β)/(n − 1)] + β (S2), which relates ⟨c⟩ in two consecutive steps of the model. The initial condition of Eq. (S2) is ⟨c⟩(1) = 1 (the first seed opinion is by definition consistent). When β = 0, this equation is solved by ⟨c⟩(n) = n, as expected. When β = 0.5 (when the two-camp structure ceases to exist), the solution is ⟨c⟩(n) = (n + 1)/2. Eq. (S2) can be also solved in general, leading to ⟨c⟩(n) = n/2 + Γ(n + 1 − 2β)/[2Γ(n)Γ(2 − 2β)]. The leading-order contribution of the second term is n^(1−2β)/[2Γ(2 − 2β)]. When c of the n opinions are consistent, the remaining n − c opinions are not. Since the seed opinion is not included in the evaluation of opinion consistency defined in Eq. (2) in the main text, the corresponding consistency can thus be written as [c − 1 − (n − c)]/(n − 1) = (2c − n − 1)/(n − 1). When c = n, we obtain C = 1. By contrast, when c = 1 (only the seed opinion is consistent), we obtain C = −1 (maximally inconsistent opinions). The obtained result for ⟨c⟩(n) can thus be used to find the average opinion consistency as C(n) = [2⟨c⟩(n) − n − 1]/(n − 1), whose leading contribution is in turn C(n) = 1/[Γ(2 − 2β)n^(2β)], which agrees with the simulation results in Figure 3 in the main text. The behavior of ⟨c²⟩(n) can be studied analogously. When Eq. (S1) is multiplied with c² and summed over c = 1, . . . , N, we obtain S_n = S_{n−1}[1 + 2(1 − 2β)/(n − 1)] + F_{n−1}[2β + (1 − 2β)/(n − 1)] + β (S4), where we simplify the notation by introducing ⟨c⟩(n) := F_n and ⟨c²⟩(n) := S_n (here F and S stand for the first and the second moment, respectively). By subtracting the second power of Eq. (S2) from Eq. (S4), we obtain V_n = V_{n−1}[1 + 2(1 − 2β)/(n − 1)] + (1 − 2β)² F_{n−1}[1 − F_{n−1}/(n − 1)]/(n − 1) + β(1 − β) (S5), where V_n := S_n − F_n².
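Because the transition probability is linear in c, taking the expectation of the master equation closes on the mean itself, giving a one-line recursion for ⟨c⟩(n) that can be iterated numerically. A sketch (our own code, using the limiting cases β = 0 and β = 0.5 and the leading-order scaling quoted above as checks):

```python
import math

def mean_consistency(n_max, beta):
    """Iterate the mean recursion implied by the master equation:
        <c>(n) = <c>(n-1) * (1 + (1 - 2*beta)/(n - 1)) + beta,  <c>(1) = 1,
    which holds because the transition probability W is linear in c.
    Returns the mean opinion consistency C(n) = (2<c>(n) - n - 1)/(n - 1).
    """
    c = 1.0
    for n in range(2, n_max + 1):
        c = c * (1 + (1 - 2 * beta) / (n - 1)) + beta
    return (2 * c - n_max - 1) / (n_max - 1)

def leading_order(n, beta):
    """Leading-order prediction C(n) ~ 1 / (Gamma(2 - 2*beta) * n**(2*beta))."""
    return 1 / (math.gamma(2 - 2 * beta) * n ** (2 * beta))
```

For β = 0 the recursion yields C = 1 (all opinions consistent), for β = 0.5 it yields C = 0 (no camp structure), and for intermediate β the ratio of the iterated C(n) to the leading-order prediction approaches one as n grows.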
Once we know V_n, the relation C = (2c − n − 1)/(n − 1) implies that the standard deviation of consistency can be found as σ_C(n) = 2√V_n/(n − 1). Using the previously derived solution for F_n, Eq. (S5) can be solved in general (Eq. (S6)). The leading contributions of the first two terms are proportional to n^(2−4β). The last term is linear in n, which becomes the leading contribution when β > 1/4. When β ≤ 1/4, σ_c ∼ n^(1−2β) and consequently σ_C ∼ n^(−2β). When β > 1/4, σ_c ∼ n^(1/2) and consequently σ_C ∼ n^(−1/2). This is confirmed by Figure S1, which shows the scaling µ_C ∼ N^(−2β) for all β values but σ_C ∼ N^(−1/2) for β > 1/4. When N_S = 1, Eq. (S8) simplifies to Eq. (S6). The leading contribution can be found to be again proportional to N^(−2β) (when β ≤ 1/4) or to N^(−1/2) (when β > 1/4). The basic two-camp setting assumes that subjects 1, . . . , N/2 form camp 1 and subjects N/2 + 1, . . . , N form camp 2. Within the camps, the links are positive with probability 1 − β and negative otherwise. Across the camps, the links are positive with probability β and negative otherwise. Here β ∈ [0, 0.5] plays the role of a noise parameter: as β grows, the distinction between the two camps vanishes. The topology of the network is assumed to be random, with each node having a fixed degree z. To achieve this, we assign z "stubs" to each node and gradually match nodes with free stubs whilst avoiding self-loops and multiple links between the same pair of nodes. It is possible that a small number of stubs cannot be matched at the end; those stubs are discarded. Figure S2 shows that the scaling of opinion consistency with N changes when the number of seed opinions, instead of being fixed, grows with N as f_S N. We see that µ_C(N) does not converge to zero but to a positive value; the limit of σ_C(N) remains the same. To study the limit N → ∞ in this case, we can use the same approach as above. In particular, we can directly use Eq.
(S7) and plug in N_S = f_S N, where f_S is the fraction of seed opinions. The resulting mean consistency µ_C(N) is given in Eq. (11) in the main text. The main difference from the previous two cases (one seed opinion and N_S seed opinions) is that µ_C(N) does not vanish in the limit N → ∞. The leading contribution to σ_C(N) can be found to be proportional to 1/√N. These results are confirmed by the numerical simulations shown in Figure S2. We now finally investigate how the network topology influences the resulting opinion consistency. To this end, we run the model on distinct kinds of synthetic networks:
1. Random networks with a power-law degree distribution: Each node is first assigned two stubs and the remaining zN − 2N stubs are then distributed one by one with probability directly proportional to the node "activity" value. Node activity values, a, are power-law distributed as 1/a³ in the range [1, ∞), thus leading to a degree distribution with a power-law tail (see Figure S4). The rest of the construction is as described in Section S3.1.
2. Preferential attachment networks: Starting with two nodes connected by a link, one new node is introduced in each step and creates z/2 links by choosing target nodes with probability directly proportional to their degree. Each existing node can be chosen at most once. In the beginning of the simulation, when the number of available nodes is smaller than z/2, the number of created links is adjusted correspondingly. The network is grown until it contains N nodes.
3. Watts-Strogatz networks: Starting from a periodic regular 1D lattice where each node has z neighbors, one end of each of the existing links is rewired with the rewiring probability p_r. As p_r grows from 0 to 1, the resulting networks transition from the regular lattice limit to the random network limit, respectively.
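The Watts-Strogatz construction in point 3 can be sketched as follows. This is our own minimal implementation: when a proposed rewiring would create a self-loop or a duplicate link, the original link is simply kept, which slightly deviates from textbook variants that redraw the target:

```python
import random

def watts_strogatz_edges(n, z, p_r, seed=0):
    """Watts-Strogatz sketch: a ring where every node links to its z/2
    nearest neighbours on each side; one end of each link is then rewired
    with probability p_r (self-loops and duplicate links are rejected,
    in which case the original link is kept). Returns a set of frozensets.
    """
    rng = random.Random(seed)
    edges = {frozenset((i, (i + k) % n)) for i in range(n)
             for k in range(1, z // 2 + 1)}
    rewired = set()
    for e in list(edges):
        if rng.random() < p_r:
            keep, _ = tuple(e)                  # keep one (arbitrary) end
            new = rng.randrange(n)
            cand = frozenset((keep, new))
            if new != keep and cand not in edges and cand not in rewired:
                edges.remove(e)
                rewired.add(cand)
    return edges | rewired
```

Every original link is either kept or replaced by exactly one rewired link, so the total number of links Nz/2 is preserved for any p_r; for p_r = 0 the regular lattice (all degrees equal to z) is recovered.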
For each kind of network, the N nodes are assigned at random to two camps of equal size and the link signs are generated in the same way as for the original random networks with fixed degree. All results shown here are for z = 4 and β = 0.1 (we use a lower z value here to make the heterogeneous degree distribution of preferential attachment networks more pronounced). As in Figure 3 in the main text, we run the model once on each of 1,000 network realizations (z = 4). We see that while the scaling behaviors µ_C(N) ∼ N^(−γ) and σ_C(N) ∼ N^(−γ) are maintained for all considered network topologies, the scaling exponent γ is strongly influenced by the network topology. In agreement with the comparison presented in Figure 4 in the main text, we see that relation networks with broad degree distributions (preferential attachment networks and random networks with power-law degree distributions) yield a lower exponent γ (compared to the original random network with fixed degree), albeit the consistency variations are substantially higher and vanish slowly (see panel B). The Watts-Strogatz networks yield significantly higher exponents γ: the lower the rewiring probability, the faster µ_C(N) vanishes with N. (Figure caption: panels for two network sizes, N = 1000 in panel B; one seed opinion and z = 10 are used in all simulations. Opinion stability is computed by running R model realizations with a fixed relation network and a fixed seed opinion; the results are then averaged over 100 network realizations. Opinion stability depends only weakly on the number of realizations, R.) We use R = 1000 in all other synthetic-data simulations with opinion stability and R = 100 in the computationally more intensive real-data simulations. Opinion consistency does, as we have already seen in the main text, depend on z, in particular when the noise level is low.
The dependence of consistency on z reflects the fact that in a sparse network, a single inverted link in the relation network can lead to the formation of a substantial cluster of opinions inconsistent with the ground truth. Opinion stability depends on z even more strongly, and this dependence is counterintuitive: as z increases (more relation information is added), opinion stability decreases. In fact, S can produce an illusion of stable opinions even when β = 0.5 and the relation network has no structure at all (note that this is properly acknowledged by opinion consistency, C, which is then zero for all z values). To understand this apparent opinion stability, consider a tree-like relation network. On such a network, the opinions formed for a given seed opinion are always the same, which in turn yields S = 1. However, these "stable" opinions do not reflect any specific structure in the relation network other than the relation network being a tree. Opinion stability therefore becomes a meaningful measure only when the mean network degree is z ≳ 10. At that point, the relation network makes it possible for the opinion formation process to have various outcomes, and a stable opinion on a given subject thus becomes truly informative. FIG. S13. The difference between the opinion stability achieved by the majority rule, S_m, and the opinion stability achieved by the probabilistic rule, S_p, against the subset's level of structural balance. We show here results for Slashdot (left) and Epinions (right) subsets with 100 nodes (top row) and 300 nodes (bottom row). Of the 400 displayed subsets, only one Slashdot subset with 100 nodes exhibits S_m − S_p < 0.
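For reference, the opinion stability computation (the raw mean of |ō_j| rescaled by the random-vector baseline √(2/(Rπ))) can be sketched as follows; the function name and argument layout are ours:

```python
import math

def opinion_stability(mean_opinions, R=1000):
    """Opinion stability for one seed subject.

    mean_opinions: average opinions o_j over R model realizations, with
    the seed subject already excluded. The raw mean of |o_j| is rescaled
    so that random opinions score ~0 (their expected |o_j| is
    sqrt(2/(R*pi))) and perfectly reproducible opinions score 1.
    """
    raw = sum(abs(o) for o in mean_opinions) / len(mean_opinions)
    baseline = math.sqrt(2 / (R * math.pi))
    return (raw - baseline) / (1 - baseline)
```

When all average opinions are ±1 (identical outcomes in every realization), the function returns 1; when they are all close to zero, it returns a value close to zero, matching the rescaled metric described above.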