

Gallagher et al., Sci. Adv. 2021; 7 : eabc9800     17 March 2021

S C I E N C E  A D V A N C E S  |  R E S E A R C H  A R T I C L E


N E T W O R K  S C I E N C E

A clarified typology of core-periphery structure 
in networks
Ryan J. Gallagher1*, Jean-Gabriel Young2,3, Brooke Foucault Welles1,4

Core-periphery structure, the arrangement of a network into a dense core and sparse periphery, is a versatile de-
scriptor of various social, biological, and technological networks. In practice, different core-periphery algorithms 
are often applied interchangeably despite the fact that they can yield inconsistent descriptions of core-periphery 
structure. For example, two of the most widely used algorithms, the k-cores decomposition and the classic two-block 
model of Borgatti and Everett, extract fundamentally different structures: The latter partitions a network into a 
binary hub-and-spoke layout, while the former divides it into a layered hierarchy. We introduce a core-periphery 
typology to clarify these differences, along with Bayesian stochastic block modeling techniques to classify networks 
in accordance with this typology. Empirically, we find a rich diversity of core-periphery structure among networks. 
Through a detailed case study, we demonstrate the importance of acknowledging this diversity and situating 
networks within the core-periphery typology when conducting domain-specific analyses.

INTRODUCTION
Core-periphery structure is a fundamental network pattern, referring 
to the presence of two qualitatively distinct components: a dense 
“core” of tightly connected nodes and a sparse “periphery” of nodes 
loosely connected to the core and among each other. This pattern 
has helped explain a broad range of networked phenomena, including 
online amplification (1), cognitive learning processes (2), technological 
infrastructure organization (3,  4), and critical disease-spreading 
conduits (5). It applies so seamlessly across domains because it pro-
vides a succinct mesoscale description of a network’s organization 
around its core. By decomposing a network into core and peripheral 
nodes, core-periphery structure separates central processes from 
those on the margin, allowing us to more precisely classify the func-
tional and dynamical roles of nodes with respect to their structural 
position. The analytic generality of this approach, together with the 
relative ubiquity of core-periphery structure among networks, makes 
core-periphery structure an indispensable methodological concept 
in the network science inventory.

Several methods and algorithms exist for extracting core-periphery 
structure from networks (6). They take on a variety of mathematical 
and algorithmic forms, ranging from statistical inference (7–9), 
spectral decomposition (10, 11), and diffusion mapping (12) to motif 
counting (13), geodesic tracing (10), and model averaging (14). These 
algorithms exhibit a creative diversity of approaches for extracting 
core-periphery structure, and each is motivated by imagery of how 
core and peripheral nodes connect to one another. However, under-
neath the different high-level descriptions of each model, there are 
varying and often inconsistent assumptions about how the core and 
periphery are mutually connected and how core-periphery structure 
is reflected in a network. As a result, despite the importance of 
core-periphery decomposition in answering substantive domain 
questions outside of network science, practitioners looking to apply 
these methods to their own fields are left without a warning that 

each algorithm has a different vision of what “core-periphery struc-
ture” actually means. This threatens the ability of researchers to draw 
valid conclusions about the structure and dynamics of numerous 
networks. By introducing a core-periphery typology that distinguishes 
between two qualitatively and quantitatively distinct structures and 
by providing statistical techniques for determining where networks 
fall within that typology, we intend to make the distinction between 
various core-periphery structures clear and enable reliable network 
inferences by scholars and practitioners.

The two types of core-periphery characterizations in our typology 
are well exemplified by two of the most popular approaches for 
identifying core-periphery structure in networks. The first, which we 
refer to as the “two-block model,” is rooted in a definition originally 
proposed by Borgatti and Everett (15). Their mathematical formu-
lation of core-periphery structure proposes that nodes are arranged 
into two groups, the core and the periphery, such that “core nodes are 
adjacent to other core nodes, core nodes are adjacent to some periphery 
nodes, and periphery nodes do not connect with other periphery nodes” 
(15). This paints a hub-and-spoke picture of core-periphery structure: 
There is a central hub of interwoven nodes and a periphery 
that radiates out from that hub. The hub-and-spoke core-periphery 
formulation forms the backbone of network science methodology 
because it underlies many of the more sophisticated models that 
have been developed since Borgatti and Everett’s foundational work 
(7,  9,  14). Furthermore, because the two-block formulation was 
originally proposed in the language of block models (16), it is often 
the de facto statistical definition of core-periphery structure for many 
network scientists.

The second core-periphery characterization is reflected in the 
widely used k-cores decomposition. The k-core of a network is the 
largest subset of nodes in the network such that every node has at 
least k connections to other nodes in that subset (17). The k-cores 
define a hierarchy of k-shells, each of which consists of all the nodes 
in the k-core but not the (k + 1)-core. A decomposition in terms of 
the k-cores highlights a network’s core-periphery structure and is 
typically obtained by iteratively removing the k-shells (18), starting 
with peripheral low-degree nodes in the outer shells and working 
toward embedded high-degree nodes in the inner cores. This algo-
rithmic pruning process is accompanied by a suite of evocative 

1Network Science Institute, Northeastern University, Boston, MA 02115, USA. 2Center 
for the Study of Complex Systems, University of Michigan, Ann Arbor, MI 48109, USA. 
3Department of Computer Science, University of Vermont, Burlington, VT 05405, USA. 
4Department of Communication Studies, Northeastern University, Boston, MA 02115, 
USA.
*Corresponding author. Email: gallagher.r@northeastern.edu

Copyright © 2021 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC).


language: The periphery is described as a series of “shells” (17), “onion 
layers” (19), or “leaves” (20), while the core is referred to as the 
“epicenter” (1), “corona” (21), or “nucleus” (4). The language of the 
k-cores decomposition conjures up an image of a layered core- 
periphery structure composed of a nested sequence of layers that 
funnel toward a core. The scalable algorithm of the k-cores decom-
position (20) has made it a practical tool for studying networks of all 
sizes, meaning that a number of applied network analyses implicitly 
assume a layered network arrangement.
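The pruning procedure behind the k-cores decomposition can be sketched in a few lines of Python. This is a minimal illustration, assuming an undirected simple network given as an edge list; the function name `k_shells` and the toy network are hypothetical and are not taken from the paper or from any library.

```python
from collections import defaultdict

def k_shells(edges):
    """Return {node: k} where k is the index of the k-shell the node belongs to."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops (simple network assumed)
            adj[u].add(v)
            adj[v].add(u)
    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # Nodes with remaining degree <= k cannot belong to the (k + 1)-core.
        to_remove = [n for n in remaining if degree[n] <= k]
        if not to_remove:
            k += 1
            continue
        while to_remove:
            node = to_remove.pop()
            if node not in remaining:
                continue  # already pruned via another neighbor
            remaining.discard(node)
            shell[node] = k
            for nbr in adj[node]:
                if nbr in remaining:
                    degree[nbr] -= 1
                    if degree[nbr] <= k:
                        to_remove.append(nbr)
    return shell

# Toy network: a triangle (a, b, c) with a two-node tail (d, e) hanging off a.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d"), ("d", "e")]
print(k_shells(toy_edges))
```

In the toy network, node d has degree 2 but lands in the 1-shell, because removing the leaf e leaves it with a single remaining connection. Degree alone therefore does not determine a node's shell; it only bounds it from above.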

By these accounts, it is clear that the layered and hub-and-spoke 
characterizations provide distinct descriptions of core-periphery 
structure. The differences between the hub-and-spoke and layered 
core-periphery characterizations are more than a linguistic sleight 
of hand though; they can have repercussive consequences for sub-
stantive network analyses. In what follows, we first show that the 
structures extracted by the two most widely used algorithms, the 
two-block model and the k-cores decomposition, diverge quantita-
tively across many empirical networks. To establish a statistically 
principled way of comparing these two classes of models, we then 
formulate both the hub-and-spoke and layered core-periphery struc-
tures as stochastic block models (22), which allow us to encode the 
qualitative differences between the two characterizations and for-
mulate an information-theoretic criterion of model fit (8, 23). With 
these tools, we analyze a suite of empirical networks and find a rich 
diversity of core-periphery structure that spans across the core- 
periphery typology. We finish with a case study of hashtag activism 
amplification and emphasize how the choice of core-periphery model 
critically affects the interpretation of substantive results. Our typology 
clarifies the distinct core-periphery structures that can emerge in 
networks and provides a methodologically sound approach for dis-
entangling those structures in practice.

RESULTS
Inconsistent core-periphery partitions
We start by showing that the hub-and-spoke structure explicitly 
extracted by the two-block model (15) and the layered structure 

implicitly suggested by the k-cores decomposition (17) provide funda-
mentally different descriptions of core-periphery structure for the 
same networks. To this end, we draw upon the Koblenz Network 
Collection (KONECT) (24), a diverse network repository that spans 
a number of social, biological, and technological domains. For each 
KONECT network, we extract the binary partition of nodes accord-
ing to the two-block model (25) and the nested partition of nodes 
according to the k-cores decomposition (20). We then measure the 
distance between these partitions via the variation of information 
(VI) (26, 27) and present the pairwise comparisons in Fig. 1. The VI 
is an information-theoretic measure and is therefore expressed in bits 
per node. Intuitively, it can be thought of as the sum of information 
not shared by the two partitions. Hence, the more distant or dissimilar 
two partitions are, the larger the VI.
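As a rough illustration of this measure, the following Python sketch computes the VI between two partitions as the sum of the two conditional entropies, H(X | Y) + H(Y | X). It assumes each partition is given as a list of labels aligned by node; the function name is illustrative, and base-2 logarithms are used so that the result is in bits.

```python
from collections import Counter
from math import log2

def variation_of_information(labels_a, labels_b):
    """VI between two partitions of the same node set, in bits."""
    if len(labels_a) != len(labels_b):
        raise ValueError("Both partitions must label the same nodes.")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    vi = 0.0
    for (a, b), n_ab in joint.items():
        # Each term contributes to H(A | B) + H(B | A).
        vi -= (n_ab / n) * (log2(n_ab / count_a[a]) + log2(n_ab / count_b[b]))
    return vi

# A binary core/periphery split compared with itself and with an unrelated split.
two_block = [0, 0, 1, 1]
unrelated = [0, 1, 0, 1]
print(variation_of_information(two_block, two_block))  # 0.0
print(variation_of_information(two_block, unrelated))  # 2.0
```

Identical partitions share all of their information and are at distance 0, while the two statistically independent binary partitions above are at distance 2 bits, the sum of their individual entropies.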

Across network domains, we find that the core-periphery parti-
tions identified by the two-block model and k-cores decomposition 
are quite dissimilar, with an overall median VI of 2.9 bits per node. 
A normalized version of the VI, which can only be consistently interpreted 
for individual networks and not across datasets (26), yields a 
median of 35% of the maximal value across domains. In other words, 
for each individual network, the partitions are about a third as distant 
as possible. Furthermore, the differences in outcomes are not exhibited 
uniformly across domains. Some classes of networks (e.g., social, 
animal, and infrastructure networks) see relatively more agreement 
between the two-block and k-cores partitions. For other network types 
(e.g., authorship, hyperlink, and software networks), however, the 
two core-periphery algorithms almost always extract distant struc-
tures. Even within domains, there can be a wide heterogeneity in the 
similarities: For example, the range of distances is 7.2 bits for com-
munication networks, 4.9 bits for infrastructure networks, and 4.2 bits 
for hyperlink networks. We find even less agreement between algo-
rithms when we use other measures to compare partitions, such as the 
adjusted (28) or reduced (29) mutual information. We are able to 
somewhat improve the agreement by matching the partitions on sizes, 
to correct for the discrepancy between the number of k-cores and the 
number of groups in the two-block structure, but nonetheless, the 
partitions disagree in general (see the Supplementary Materials).

Fig. 1. Distributions of distances between partitions extracted by the two-block model (15) and the k-cores decomposition (18) across network domains. The 
two-block and k-cores partitions are dissimilar across a variety of networks and network types, indicating that they generally extract different core-periphery structures. 
Distance is measured by the VI (26), where higher values indicate more dissimilar partitions. Thick lines in each domain’s box plot indicate the median difference and are 
also represented as the color of each box. The distributions furthest to the left indicate the kernel density estimate and box plot of the distance distribution across all 
networks. Detailed results are reported in the Supplementary Materials.


The results of Fig. 1 do not imply that one algorithm or the other 
is intrinsically flawed but, rather, that the algorithms do not agree in 
general. If one’s goal is only to describe a network, then this dis-
agreement is not an issue because each algorithm simply provides its 
own description of the network, whatever that may be (30). However, 
there is a strict statistical sense in which the algorithms cannot both 
characterize a given network equally well: Each core-periphery partition 
corresponds to a statistical description of the network, one of 
which will necessarily be more concise and precise than the other 
(31, 32). So, if we want to make network inferences based on core- 
periphery structure, then we need to call on methods that can iden-
tify models that give better statistical descriptions of a network’s 
structure than others. Selecting models, however, requires that we 
have statistical models in the first place, and the notions of core- 
periphery structure that we have applied so far have only been defined 
through algorithms. We therefore turn to Bayesian stochastic block 
models (8) to establish the missing connection between the two.

Core-periphery stochastic block models
The stochastic block model is a general statistical model of a network’s 
mesoscale structure (22). At its core, it assumes that nodes belong to 
different groups, or “blocks,” such as the core and the periphery. 
These blocks then specify the probability that any two nodes are 
connected. More formally, suppose that we have the adjacency matrix 
A of an unweighted, undirected, simple network with N nodes. We 
assume that there are a fixed number of B blocks, and let the block 
assignments of all the nodes be recorded in θ, a vector of length N 
where θi = r indicates that node i belongs to block r. The probability 
that any two nodes in the network are connected is given by p, a B × 
B matrix where prs is the probability that a node in block r connects 
to a node in block s. This is the defining characteristic of the sto-
chastic block model: The block assignments completely determine 
the probability of connection between any two nodes.
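To make the generative reading of the model concrete, here is a minimal Python sketch that draws an adjacency matrix given block assignments and a block connectivity matrix. It assumes independent Bernoulli edges and 0-based block indices (the text counts blocks from 1); the function name and the example values are illustrative only.

```python
import random

def sample_sbm(assignments, p, seed=None):
    """Draw a symmetric 0/1 adjacency matrix with no self-loops.

    assignments[i] is the block of node i; p[r][s] is the probability
    that a node in block r connects to a node in block s.
    """
    rng = random.Random(seed)
    n = len(assignments)
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The block labels alone determine the connection probability.
            if rng.random() < p[assignments[i]][assignments[j]]:
                adjacency[i][j] = adjacency[j][i] = 1
    return adjacency

# Two core nodes (block 0) and three periphery nodes (block 1).
assignments = [0, 0, 1, 1, 1]
p = [[0.9, 0.4],
     [0.4, 0.05]]
for row in sample_sbm(assignments, p, seed=7):
    print(row)
```

Only the upper triangle is sampled and then mirrored, which keeps the network undirected and simple, matching the assumptions stated for A.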

In practice, we do not know the block assignments of the nodes 
θ or the probability of connection between blocks p. We are interested, 
then, in the distribution P(θ, p ∣ A), the probability that we 
have a particular arrangement of nodes into blocks and connections 
between them, given our observed network data A. Applying Bayes’ 
rule, we have

$$P(\theta, p \mid A) \propto P(A \mid \theta, p)\, P(\theta)\, P(p) \tag{1}$$

The posterior distribution P(θ, p ∣ A) is proportional to three 
components: the likelihood P(A ∣ θ, p) of the network A, the 
prior on the block assignments P(θ), and the prior on the block 

connectivity matrix P(p). We outline the standard setup of the likeli-
hood and block assignment prior in Materials and Methods. For con-
structing core-periphery stochastic block models, our main concern 
is with the prior on the block connectivity matrix. When we know 
that we want to model core-periphery structure specifically, we only 
want to consider particular arrangements of connection probabilities 
p and that prior knowledge should be reflected in P(p). This is a 
different view than is usually taken for block models: Rather than 
assuming nothing about a network’s structure and applying a gen-
eral, unconstrained block model, we intentionally seek and encode 
core-periphery structure in our models.

We propose a core-periphery typology that contains two struc-
tures: the hub-and-spoke structure and the layered structure. Both 
characterizations can be phrased in the language of block models by 
arranging the block connectivities of p in different ways, as depicted 
in Fig. 2. Through the Bayesian approach to stochastic block 
modeling, we can alter the prior P(p) to encode these different 
arrangements and constrain the model to adhere to those structures 
(30). The constrained models allow us to only consider networks with 
respect to the core-periphery typology and classify them appropriately 
according to the structure that they exhibit. From here onward, we 
use “hub-and-spoke” to refer to either the theoretical typology 
characterization or the stochastic block model implementation, de-
pending on context. However, we only use “two-block model” or 
“two-block algorithm” when specifically referencing Borgatti and 
Everett’s implementation (15). Similarly, we use “layered” to refer 
to the typology characterization or the block model but only “k-cores” 
to refer to the heuristic algorithm.

The hub-and-spoke characterization specifies two blocks, one 
for the core and one for the periphery. If we let the core be denoted 
by the first block and the periphery by the second, then the original 
two-block model presented by Borgatti and Everett (15) can be re-
covered by setting p11 = 1, p12 = 1, and p22 = 0. We consider a relaxation 
of this structure (7), which allows for flexibility in the connections 
by only requiring p11 > p12 > p22. This configuration (shown in 
Fig. 2A) conveys the intuition of the hub-and-spoke structure: There 
is a densely connected core moderately connected with a periphery 
that is only loosely connected among itself. Statistically, we enforce 
this constraint through a uniform prior over all block matrices that 
satisfy 0 < p22 < p12 < p11 < 1. In notation, we write

$$P(p) \propto \mathbb{1}_{\{0 < p_{22} < p_{12} < p_{11} < 1\}} \tag{2}$$

where 𝟙 is the indicator function that takes on the value 1 if the 
constraint is satisfied and 0 otherwise.
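The constrained prior can be illustrated with a short Python sketch: one function evaluates the indicator, and another draws from the uniform distribution over the ordered region by sorting three independent uniform draws. Both function names are hypothetical, and this is only a sketch of the constraint, not the inference procedure used in the paper.

```python
import random

def hub_and_spoke_indicator(p11, p12, p22):
    """1 if the densities satisfy 0 < p22 < p12 < p11 < 1, else 0."""
    return 1 if 0 < p22 < p12 < p11 < 1 else 0

def sample_hub_and_spoke_densities(rng):
    """Draw (p11, p12, p22) uniformly over the ordered region.

    Sorting three independent Uniform(0, 1) draws yields a uniform
    draw over all ordered triples.
    """
    p22, p12, p11 = sorted(rng.random() for _ in range(3))
    return p11, p12, p22

print(hub_and_spoke_indicator(0.8, 0.3, 0.05))  # 1: relaxed hub-and-spoke
print(hub_and_spoke_indicator(0.3, 0.8, 0.05))  # 0: periphery links exceed core links
print(hub_and_spoke_indicator(1.0, 1.0, 0.0))   # 0: on the boundary of the region

rng = random.Random(0)
print(sample_hub_and_spoke_densities(rng))
```

Because the inequalities are strict, the idealized setting p11 = 1, p12 = 1, and p22 = 0 lies on the boundary of the constrained region and is not itself covered by the indicator.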

Fig. 2. The core-periphery typology formalized through block model representations of the hub-and-spoke and layered structures. Each figure depicts the block 
connectivity matrix p, where darker colors indicate higher densities of links. (A) The hub-and-spoke model is defined according to two blocks, where p11 > p12 > p22. 
(B) The layered core-periphery model is defined according to 𝓁 layers, which are ordered as p1 > p2 > … > p𝓁.


We can similarly formulate the layered block model. For conve-
nience, we let prr = pr and assume that there are 𝓁 layers, equal to the 
number of blocks B. To configure the layered structure shown in 
Fig. 2B, we first specify

$$p_{rs} = p_{\max(r,s)} \tag{3}$$

which binds the matrix p into layers. Similar to the hub-and-spoke 
model, we then order the layers through a uniform prior over all p 
that satisfy 0 < p_𝓁 < p_{𝓁−1} < … < p_1 < 1:

$$P(p) \propto \mathbb{1}_{\{0 < p_{\ell} < p_{\ell-1} < \dots < p_{1} < 1\}} \tag{4}$$

The layered model can be seen as a special case of the hub-and-
spoke model when 𝓁 = 2 and p12 ≈ p22. In this block structure, 
peripheral nodes in the second layer connect to any other node, 
whether in the core or periphery, with the same probability. For 𝓁 ≥ 
3, any node is agnostic to the specific role of nodes that are in cores 
more central than itself, connecting to them all with the same prob-
ability. In this sense, the layered model can be viewed as a sequence 
of nested and increasingly dense subgraphs.
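The layered construction can be sketched as follows, assuming the layer densities are supplied from the innermost layer outward and indexed from 0 rather than 1. The function names are illustrative.

```python
def layered_block_matrix(layer_densities):
    """Build p with p[r][s] = density of layer max(r, s)."""
    ell = len(layer_densities)
    return [[layer_densities[max(r, s)] for s in range(ell)] for r in range(ell)]

def satisfies_layered_ordering(layer_densities):
    """Check that the densities lie in (0, 1) and strictly decrease outward."""
    bounded = all(0 < d < 1 for d in layer_densities)
    decreasing = all(a > b for a, b in zip(layer_densities, layer_densities[1:]))
    return bounded and decreasing

densities = [0.9, 0.5, 0.1]  # innermost layer first
print(satisfies_layered_ordering(densities))  # True
for row in layered_block_matrix(densities):
    print(row)
# [0.9, 0.5, 0.1]
# [0.5, 0.5, 0.1]
# [0.1, 0.1, 0.1]
```

Each row makes the "agnostic" property visible: a node in the outermost layer connects to every other node with the same probability, 0.1, regardless of that node's layer.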

Recall that we are introducing these priors to determine the type 
of core-periphery structure that best describes a particular network. 
In practice, we find these structures by computing the most likely 
core-periphery position of each node in each model (see Materials 
and Methods for details). Doing so for all nodes yields both hub-
and-spoke and layered partitions. While the hub-and-spoke and 
layered formulations may seem innocuous modifications of existing 
stochastic block models, they substantially reduce computational 
tractability (8, 16). To infer the distributions of θ and p for 
the two models, we therefore have to resort to a Gibbs sampling 
procedure, detailed in the Supplementary Materials.

We compare the inferred core-periphery structures to select the 
most statistically appropriate among the two. Formally, for a partition 
θℋ inferred through the hub-and-spoke model ℋ and a partition θℒ 
inferred through the layered model ℒ, we want to identify which 
model and its assignment of block labels to nodes is a better fit of the 
network data A. The answer is given by the posterior odds ratio (23)

$$\Lambda = \frac{P(\theta_{\mathcal{H}}, \mathcal{H} \mid A)}{P(\theta_{\mathcal{L}}, \mathcal{L} \mid A)} \tag{5}$$

If the posterior odds ratio Λ > 1, then the hub-and-spoke model 
better characterizes the core-periphery structure of the network, 
while Λ < 1 implies that the layered model is a better descriptor. 
Assuming that we are agnostic about the models a priori, and so 
P(ℋ) = P(ℒ) = 1/2, we can equivalently consider

$$-\log \Lambda = \Sigma_{\mathcal{H}} - \Sigma_{\mathcal{L}} = -\log P(A, \theta_{\mathcal{H}} \mid \mathcal{H}) + \log P(A, \theta_{\mathcal{L}} \mid \mathcal{L}) \tag{6}$$

the difference between description lengths of the hub-and-spoke and 
layered models. The description length, Σℳ = − log P(A, θℳ ∣ ℳ), 
of a model ℳ describes how well that model can compress the in-
formation expressed by a network’s structure (23, 33). A model that 
is able to efficiently describe a network with a smaller number of 
parameters is a better descriptor of the network and will have a 
minimal description length. So, if the hub-and-spoke model ℋ is a 
better descriptor of a network’s core-periphery structure than the 

layered model ℒ, then we will have a posterior odds ratio where Λ > 1 
and, equivalently, a negative difference in description lengths, 
Σℋ − Σℒ < 0. We use the description length to quantify model fit, by 
considering either the pairwise difference in description lengths be-
tween two models or the minimum description length (MDL) across 
many models (see the Supplementary Materials for numerical details). 
This measure allows us to distinguish which block model most aptly 
describes a particular network and properly situate it within the 
core-periphery typology.
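The step from the posterior odds ratio to the difference in description lengths can be spelled out as a short derivation. Here Λ denotes the posterior odds ratio of Eq. 5 and θ_ℳ the partition inferred under model ℳ; the only ingredients are the product rule and the equal model priors assumed above.

```latex
\begin{align}
\Lambda
  &= \frac{P(\theta_{\mathcal{H}}, \mathcal{H} \mid A)}
          {P(\theta_{\mathcal{L}}, \mathcal{L} \mid A)}
   = \frac{P(A, \theta_{\mathcal{H}} \mid \mathcal{H})\, P(\mathcal{H}) / P(A)}
          {P(A, \theta_{\mathcal{L}} \mid \mathcal{L})\, P(\mathcal{L}) / P(A)} \\
  &= \frac{P(A, \theta_{\mathcal{H}} \mid \mathcal{H})}
          {P(A, \theta_{\mathcal{L}} \mid \mathcal{L})}
     \quad \text{when } P(\mathcal{H}) = P(\mathcal{L}), \\
-\log \Lambda
  &= -\log P(A, \theta_{\mathcal{H}} \mid \mathcal{H})
     + \log P(A, \theta_{\mathcal{L}} \mid \mathcal{L})
   = \Sigma_{\mathcal{H}} - \Sigma_{\mathcal{L}}.
\end{align}
```

The evidence P(A) cancels because both models are evaluated on the same network, so Λ > 1 holds exactly when Σℋ < Σℒ, that is, when the hub-and-spoke model gives the shorter description.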

We briefly note connections of our core-periphery block models 
to prior work. With respect to the hub-and-spoke model, Zhang et al. 
(7) identified the ordering p11 > p12 > p22 as a relaxed version of the 
hub-and-spoke block structure introduced by Borgatti and Everett. 
However, they did not formally encode this constraint in their model 
and, instead, relied on the susceptibility of stochastic block models 
to heterogeneous degree distributions (16) to retrieve core-periphery 
structure. We examine the suitability of this assumption in the 
Supplementary Materials. With respect to the layered model, Borgatti 
and Everett (15) presented a special case of our model where 𝓁 = 2, 
p1 = 1, and p2 = 0. However, given that those binary layer densities 
imply a network that consists of a connected core component sur-
rounded by a cloud of isolate periphery nodes, they only briefly re-
marked on the model’s limited conceptual utility. Regardless, it is 
important to emphasize that while the more general layered block 
model introduced here shares the intuition of nested, increasingly 
dense layers with the k-cores decomposition, it is distinct from that 
algorithm. The layered block model is a fuzzier interpretation of how 
core-periphery structure can be reflected through layers; it allows 
for low degree nodes to be placed in core layers if they are embedded 
among other core nodes, while, by definition, nodes can only be in 
higher k-shells if they have a higher degree. In general, then, the block 
assignments from the layered model will not exactly align with the 
partition produced by the k-cores decomposition.

Synthetic network experiments
As an essential validation step, we experimentally verify that our 
two models of core-periphery properly recover hub-and-spoke and 
layered structures when we know that they exist within a network. 
Our first experiment measures the capacity for the block models to 
discern between hub-and-spoke and layered structures. We gener-
ate synthetic networks according to the stochastic block model and 
design p to have a known, ground truth core-periphery block 
arrangement. We configure p according to a two-parameter model 
(see Materials and Methods for details). The first parameter interpolates 
between hub-and-spoke and layered core-periphery structure: 
When it equals 0, the network has a known hub-and-spoke core-periphery 
structure, and when it equals 1, the network has a known layered structure 
consisting of three layers. The second parameter defines the 
structural clarity: When it equals 1, the network is random and neither 
model should be able to infer structure, and for large values, the networks 
have a well-defined core-periphery structure.

The results, shown in Fig. 3A, demonstrate that the block models 
effectively discern the two types of planted core-periphery structures. 
Within each regime of the interpolation parameter, the MDL appropriately 
identifies the correct model as a better description of the 
network structure, and it is more confident as the parameter reaches 
the boundaries of its range, at which point only that structure definitively 
exists in the network. For low values of structural clarity, for 
which the networks are purely random, the models have approximately 


equal MDLs, and neither is strongly designated better in terms of 
model fit.

Our second experiment tests the layered model’s ability to iden-
tify the appropriate number of layers in a synthetic network with a 
known layered structure. We start with a synthetic network of six 
equally sized layers and progressively reduce the effective number 
of layers by merging layers until there are only two layers (see 
Materials and Methods for details). For each fixed number of layers 
in the synthetic network, we run multiple layered core-periphery 
models for different choices of the parameter 𝓁, which designates 
the number of layers to infer. The results, given in Fig. 3B, indicate 
that the average MDL accurately identifies the number of layers that 
exist in each synthetic network. This demonstrates that we can not 
only use the MDL for model selection between the hub-and-spoke 
and layered models, per the results of the first experiment, but also 
use it for choosing the number of layers.

Core-periphery diversity of empirical networks
Having validated the core-periphery block models and the use of 
the MDL as a measure of model fit, we establish the diversity of core- 
periphery structure expressed by empirical networks. For all net-
works with up to 200,000 nodes in the KONECT dataset (24), we 
infer partitions according to each of the hub-and-spoke and layered 
core-periphery models (see Materials and Methods for details). Re-
call that we use the terminology hub-and-spoke to refer to the 
stochastic block model (SBM) implementation and two-block to refer 
specifically to Borgatti and Everett’s original implementation (15). 
Similarly, layered refers to the SBM implementation, and k-cores 
refers to the heuristic algorithm.

In Fig. 4A, we show the breadth of network structure exhibited 
across the core-periphery typology. Both the hub-and-spoke and layered 
core-periphery structures are expressed with widely varying 
intensity across all types of networks; neither model 
is a universal best descriptor of core-periphery structure. Some 
classes of networks seem to be generally well described by either just 
the hub-and-spoke characterization or just the layered characteri-
zation, but many more show a range of structure across the core- 
periphery spectrum. Communication networks in KONECT, for 
example, exhibit the full range of core-periphery prevalence across 
both characterizations. The diversity of core-periphery structure in 
these empirical networks demonstrates the danger in assuming a 
core-periphery type a priori and the need to situate a network within 
the core-periphery typology to mitigate downstream network 
mischaracterizations.

We also observe that a smaller portion of networks do not strongly 
exhibit either a hub-and-spoke or layered structure. In Fig. 4B, we 
show that these networks are often the ones for which both core- 
periphery block models extract partitions that are similar or identical. 
As discussed earlier when deriving the block models, the layered 
model with 𝓁 = 2 layers is a special case of the hub-and-spoke model. 
A number of smaller networks in particular, including all of the 
animal networks, are best modeled with two layers, and so, both 
models extract similar partitions and have similar MDLs. On the 
other hand and as expected, we also find that the partitions become 
less similar as one core-periphery model or the other is preferred 
according to the MDL.

Last, we connect the core-periphery stochastic block models back 
to the motivating algorithms, the two-block model and the k-cores 
decomposition. In Fig. 4C, we show that networks that are equally 
well modeled by either the layered or hub-and-spoke block model 
are also those for which the two-block model and k-cores decompo-
sition extract similar partitions. Furthermore, the networks that have 
the most distant two-block and k-cores core-periphery partitions are 
also those that most strongly exhibit a hub-and-spoke or layered 

Fig. 3. Synthetic network experiments validating the core-periphery block models and the use of MDL as a measure of model fit. (A) The difference in MDL per 
edge between the layered model and hub-and-spoke model on networks constructed to have varying degrees of each structure. Negative values (blue) indicate that the 
hub-and-spoke model is a better model fit, while positive values (red) indicate that the layered model is a better fit. The MDL accurately discerns which model is the best 
descriptor of the ground truth network structure, as indicated by the gradual transition from blue to red as the true structure varies from hub-and-spoke to layered. (B) The 
average MDL per edge of layered models on networks with planted layered structure. For each fixed number of actual layers in the synthetic networks (vertical axis), we 
run the inference with a varying number of modeled layers 𝓁 (horizontal axis). Stars (⋆) indicate the number of modeled layers 𝓁 that yields the lowest MDL for a fixed 
number of planted layers. The alignment of the stars on the diagonal shows that the layered model properly discerns the number of layers used in the generative model.


structure according to the description length. Figure 4D complements 
these findings by comparing the distances between the inferred par-
titions according to the stochastic block models and the partitions 
of the two-block model and k-cores decomposition. The 
hub-and-spoke partitions found with the SBM are consistently closer to the 
two-block partitions than to the k-cores partitions. The 
relationship between the layered partitions and the k-cores partitions is less 
sharp, with layered and hub-and-spoke partitions being about equally 
distant from the k-cores partitions on average. These results provide 
evidence that the two-block model is representative of the hub-and-
spoke characterization. The layered characterization, however, finds 
partitions that are not quite the same as the k-cores algorithm. This 
is due, in part, to the fact that it aggregates nodes in fewer layers. 
However, as mentioned earlier, the layered block model, along with 
the hub-and-spoke model, is more flexible than the k-cores 
decomposition, which requires core nodes to be high-degree nodes. Instead, 
and critically, both block models allow low-degree nodes to be 
core nodes if they are embedded among other core nodes. So, the 
similar distances of the hub-and-spoke and layered partitions from 
the k-cores partitions are also partly explained by how both block 
models allow for a more fluid interpretation of what it means for a 
node to be in the core or the periphery.
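
The partition distances discussed here are reported in bits according to the variation of information (VI). As a minimal sketch of how such a distance can be computed for two label vectors, assuming the standard definition VI = H(X∣Y) + H(Y∣X) with base-2 logarithms (the function name and toy partitions are illustrative and are not taken from the authors' code):

```python
from collections import Counter
from math import log2

def variation_of_information(labels_a, labels_b):
    """VI in bits between two partitions given as equal-length label sequences."""
    if len(labels_a) != len(labels_b):
        raise ValueError("partitions must label the same nodes")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    vi = 0.0
    for (a, b), n_ab in count_ab.items():
        # Each joint cell contributes -p_ab * [log2(p_ab / p_a) + log2(p_ab / p_b)]
        vi -= (n_ab / n) * (log2(n_ab / count_a[a]) + log2(n_ab / count_b[b]))
    return vi

# Toy usage: a two-block split compared with an unrelated two-block split.
distance = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])
```

The distance is zero when two partitions differ only by a relabeling of blocks, which is why it is suited to comparing outputs of different core-periphery methods.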

Case study: Hashtag activism amplification
To emphasize the importance of distinguishing between hub-and-
spoke and layered core-periphery structures, we briefly conclude 
with a case study of hashtag activism amplification. Social media are 
notable for creating spaces where historically disenfranchised indi-
viduals can come together and share their stories at an unprecedented 
scale (34–36). Hashtag activism, in particular, has been a critical vehi-
cle for driving those marginalized voices into the mainstream public 
sphere (35, 36), as exemplified by hashtags such as #BlackLivesMatter 

Fig. 4. Structural diversity across the core-periphery typology. (A) Distributions of differences in MDL between the best-fit hub-and-spoke and layered models, by 
network domain. Different networks within and across domains are better modeled by hub-and-spoke or layered structure, indicating that there is no one universal best 
descriptor of core-periphery structure. (B) Difference in MDL plotted against the distance between the hub-and-spoke and layered partitions for each KONECT network. 
Networks that have similar MDLs are generally those where both core-periphery models extracted similar partitions. (C) Difference in MDL plotted against the distance 
between the two-block and k-cores partitions for each KONECT network. The stronger the disagreement between heuristics, the clearer the fit for either of the SBM-based 
core-periphery models. (D) Distance between the best-fit block model (either hub-and-spoke, indicated by blue, or layered, indicated by red) and the two-block and k-cores 
partitions. Histograms show the marginal distributions of distances, where dashed lines indicate the mean distance. Partition distance in all subplots is measured in bits 
according to the VI.

and #MeToo (35, 37). The amplification of those voices is a funda-
mentally networked process; the core consists of those who are most 
visible exactly because many peripheral amplifiers share the core’s 
posts through emergent crowdsourcing (1). Core-periphery struc-
ture is a natural network model for such amplification processes.

Although those at the periphery of hashtag activism events are 
sometimes derided as “slacktivists,” Barberá et al. (1) demonstrated 
that the periphery contributes significantly to the amplification of 
core protest voices. We perform a similar analysis on the retweet 
network of the hashtag #MeToo, a hashtag that highlighted the per-
vasiveness of sexual violence against women by creating a space for 
them to publicly disclose their experiences (35, 37). We fit hub-and-
spoke and layered core-periphery models to the #MeToo network 
and calculate the coreness of each individual according to each model. 
The coreness, which varies from 0 to 1, indicates whether an individual 
is more likely to be situated in the periphery or core, respectively (see 
Materials and Methods for details). In line with prior work (1), we 
operationalize amplification by measuring the total reach as the sum 
of the number of followers that each individual in the network has.

To quantify the relationship between amplification and the core- 
periphery structure, we iteratively remove individuals from the 
retweet network according to their coreness, decomposing the net-
work from the periphery to the core, and measure how the hashtag 
reach varies (see Fig. 5). We observe that the cumulative reach, the 
total number of possible followers exposed to the hashtag, declines 
sharply for both the hub-and-spoke and layered models as the 
abundance of peripheral amplifiers is removed. This is consistent 
with the findings of Barberá et al. (1). However, the reach drops more 
rapidly for the hub-and-spoke model (MDL = 7.6 bits per edge, the 
best-fit model overall) than for any of the layered models and, in 
particular, the best-fit layered model with 𝓁 = 4 layers (MDL = 9.9 bits per 
edge). Comparatively, the layered models markedly underestimate 
the contribution of the periphery to the early reach of #MeToo; taking 
the reach of the hub-and-spoke model as the expected value, the 
estimates of reach by the best-fit layered model have a percent error 
of 63% at a coreness threshold of 0.1 and a percent error of 36% at a 
threshold of 0.2. Hence, we see that using a network model that 
does not properly describe the core-periphery structure of a hashtag 
activism network notably misestimates how amplification varies 
across core and peripheral participants.
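
The peripheral-removal analysis can be sketched in a few lines. The example below assumes that each individual has a coreness score and a follower count, that a node is kept when its coreness is at least the threshold (the exact inclusion rule is an assumption), and that percent error is taken relative to a reference curve. All function names and toy values are hypothetical.

```python
def reach_fraction(coreness, followers, thresholds):
    """Fraction of total reach retained by nodes with coreness >= each threshold."""
    total = sum(followers)
    return [
        sum(f for c, f in zip(coreness, followers) if c >= t) / total
        for t in thresholds
    ]

def percent_error(estimate, expected):
    """Percent error of an estimate relative to an expected (reference) value."""
    return 100.0 * abs(estimate - expected) / abs(expected)

# Toy data: four peripheral accounts and one core account (values are made up).
coreness = [0.0, 0.05, 0.15, 0.3, 0.9]
followers = [100, 200, 300, 400, 1000]
curve = reach_fraction(coreness, followers, thresholds=[0.0, 0.1, 0.2, 0.5])
```

Computing one such curve per model and applying `percent_error` at a shared threshold gives the kind of comparison reported above, with the hub-and-spoke curve playing the role of the expected value.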

This example illustrates why it is critical to account for the core- 
periphery typology to make sound network inferences. Qualitatively, 
the Bayesian block models give us a succinct description of the 
#MeToo retweet network, informing us that it is best described as a 
hub-and-spoke structure that serves to broadcast a small set of core 
voices, rather than a layered structure with many connections among 
those disclosing at the periphery. Quantitatively, using the MDL to 
select the hub-and-spoke model as the best fit to our network data 
allows us to confidently estimate the periphery’s contribution to the 
hashtag’s reach. This measure can be used to compare across in-
stances of hashtag activism and assess the effectiveness of peripheral 
amplification or to develop interventions to counteract amplification 
manipulation tactics, such as those deployed by social bots and 
coordinated information operations.

DISCUSSION
We have presented a typology of core-periphery structure that 
raises the important distinction between two characterizations: 
hub-and-spoke and layered. These structures, which are reflected in 
two of the most widely used core-periphery algorithms (15, 18), often 
yield starkly different descriptions of a network’s core-periphery 
layout. To elucidate the typology, we have formulated two Bayesian 
stochastic block models that statistically encode the hub-and-spoke and 
layered structures. By applying description length as an information- 
theoretic measure of model fit across a large network database, we 
have shown empirically that networks express a rich variety of core- 
periphery structure. Through a case study of online amplification 
of hashtag activism, we have demonstrated that the choice of core- 
periphery model used to describe a network affects the substantive 
interpretation of the network’s structure and function, indicat-
ing the need to distinguish between hub-and-spoke and layered 
structures.

While a number of algorithms exist for extracting core-periphery 
structure, they generally take a vaguely intuitive view of core-periphery 
structure: We have core-periphery structure when we have a core of 
densely connected nodes and a sparse periphery around that core. 
Our work challenges the ambiguity of this definition and demon-
strates that there are at least two distinct ways that we can concep-
tualize core-periphery structure, neither of which is a universal 
descriptor across all networks. Although there is no universal way 
of describing core-periphery structure, our typology classifies net-
works into distinct categories based on their specific connectivity 
patterns between the core and periphery. Within these categories, 

Fig. 5. Core-periphery amplification of the hashtag #MeToo during its first 
12 hours of use in October 2017. Reach is measured as the cumulative number of 
followers among those in the network. Curves show how the fraction of total reach 
decomposes as the coreness threshold for inclusion into the retweet network is 
increased. The solid blue curve indicates the best-fit hub-and-spoke curve (and best 
fit overall); the solid red line indicates the best-fit layered curve (𝓁 = 4 layers), and 
lighter red lines indicate other layered models with 2 to 20 layers. Markers on the 
vertical axis indicate the reach after removing nodes with coreness of exactly 0. 
The histogram above the plot shows the distribution of coreness among nodes in 
the network for each best-fit model.

there is the potential to unveil larger structural patterns that cut 
across networks and domains; networks may systematically exhibit 
layered or hub-and-spoke structures as a result of a number of 
phenomena, including domain-specific generative processes, network- 
specific data collection traditions, and context-specific network 
constraints. Likely, all of these play a role in explaining why hub-
and-spoke and layered structures emerge in networks. However, 
there is significant work to be done across a variety of social, biological, 
and technological domains to disentangle exactly how and under 
what conditions those factors affect core-periphery structure. Our 
work shows that there are core-periphery commonalities across net-
works and domains that are still unexplained and that a universal 
notion of core-periphery structure cannot be taken as a given.

In light of this lack of universal organization, we must adjust our 
practical approach to measuring core-periphery structure going for-
ward. Researchers and network practitioners need to be more explicit 
about their theories of core-periphery structure and more deliberate 
in what method they choose to infer that structure. In cases where 
it is not theoretically clear what kind of core-periphery structure 
should be reflected in a network—hub-and-spoke, layered, or 
otherwise—researchers should use the tools that we have introduced 
here to make informed decisions about what best describes a given 
network at hand. The core-periphery stochastic block models that 
we have developed require that we explicitly restrict the search space 
of the statistical inference to a smaller subset of network structures. 
These restrictions reduce the expressivity of the model but, at the 
same time, allow substantive domain experts to guide models as they 
apply core-periphery models to network datasets, which we argue is 
critical (30). In the context of our hashtag activism case study, using 
constrained block models allowed us to focus our inquiry on core- 
periphery structure, the theoretical network model that describes 
the structure of online amplification (1). If we were to use a general 
stochastic block model, which could return community structure, 
disassortativity, or any other mesoscale pattern, then we would not 
be able to properly align our domain knowledge with the network 
analysis. Constrained core-periphery block models force researchers 
to be more intentional about how they describe core-periphery 
structure and, in turn, help researchers theorize about core-periphery 
network effects in a more precise and statistically sound manner.

Both the conceptual typology and the statistical methods that we 
have presented are only the first step in a broader line of work that 
interrogates how core-periphery structure is reflected in networks. 
There are natural methodological extensions for network scientists 
to develop, which would expand the range of the core-periphery 
typology. Our models focus on identifying a single core-periphery 
structure in a network, but there could be multiple or even interacting 
sets of cores and peripheries (9). Expanding the scope of the core- 
periphery typology to include this multiplicity would allow for 
detailed network descriptions that encode the interactions between 
cores, peripheries, and communities. The same modeling approach 
could be used to incorporate edge weights, to extend the typology in 
terms of core-periphery cohesiveness, and directionality, to extend 
the typology in terms of in-cores and out-cores (38). When we ex-
press these extensions and formulations of core-periphery struc-
ture (39) in the language of Bayesian block models, we have a 
statistically consistent framework for adjudicating between them 
and determining which best describes any given network. By pre-
senting a core-periphery typology with accompanying statistical 
models that are readily extensible and generalizable, we have 
provided the foundation for unifying the notion of core-periphery 
structure both methodologically and theoretically.

Network scientists increasingly recognize that there is no “ground 
truth” structure of networks (40); there are only models that do and 
do not help us address particular questions. Our constrained block 
models, typology, and measure of model fit make it possible to 
more acutely answer questions about core and peripheral dynamics 
that could not previously be answered (39). As researchers and 
practitioners use our methods to be more deliberate about the kind of 
core-periphery structure that they want to describe, they will un-
doubtedly raise questions that cannot be answered with the current 
network models at hand. This presents an opportunity for network 
scientists to fill those methodological gaps and present models that, 
themselves, may open doors to new theories and questions. Our 
core-periphery typology and models clarify the ways in which 
core-periphery algorithms can be applied to networks and provide 
an example of how we, as both domain experts and network scien-
tists, can begin to better align our structural methodology with our 
substantive questions.

MATERIALS AND METHODS
KONECT network data
At the time of data collection, the KONECT collection (24) consisted of 
261 networks representing a variety of network domains. Networks 
in the collection may be undirected, directed, or bipartite, and they 
can contain multiedges and self-loops. The edges themselves can be 
unweighted, weighted, signed, or temporal. We take the following 
preprocessing steps: (i) Weighted edges are treated as unweighted, 
and all multiedges are collapsed to a single edge. (ii) Self-loops are 
disregarded. (iii) Directed edges are treated as undirected. (iv) Only 
the largest weakly connected component is considered.
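
As a rough illustration of preprocessing steps (i) to (iv), the following sketch works on a plain edge list using only the Python standard library. The function name and input format are assumptions made for the example; it is not the authors' pipeline.

```python
from collections import defaultdict, deque

def preprocess(edges):
    """Return the simple, undirected largest connected component as a set of edges.

    `edges` is an iterable of (u, v) or (u, v, weight) tuples; weights,
    direction, multiedges, and self-loops are all discarded.
    """
    simple = set()
    for edge in edges:
        u, v = edge[0], edge[1]
        if u == v:
            continue  # (ii) disregard self-loops
        # (i) and (iii): a frozenset ignores weight, multiplicity, and direction
        simple.add(frozenset((u, v)))

    adjacency = defaultdict(set)
    for edge in simple:
        u, v = tuple(edge)
        adjacency[u].add(v)
        adjacency[v].add(u)

    # (iv) breadth-first search for the largest connected component; once edges
    # are symmetrized this is the largest weakly connected component.
    seen, largest = set(), set()
    for start in adjacency:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node] - component:
                component.add(neighbor)
                queue.append(neighbor)
        seen |= component
        if len(component) > len(largest):
            largest = component
    return {edge for edge in simple if edge <= largest}

# Toy usage: a weighted, directed multigraph with a self-loop and a small second component.
cleaned = preprocess([(1, 2, 0.5), (2, 1, 3.0), (2, 2, 1.0), (2, 3, 1.0), (4, 5, 1.0)])
```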

We exclude all temporal and dynamic networks in KONECT to 
avoid the ambiguity in choosing a time scale to define static networks. 
We also exclude all networks that were marked as “incomplete” in 
KONECT (24). Last, we exclude bipartite networks because they 
should be modeled with stochastic block models that can account 
for their special structure and high local density when projected 
(41, 42). After these preprocessing and inclusion criteria, we are left 
with 142 networks, listed in the Supplementary Materials.

We note that the simplifications made during preprocessing likely 
affect the core-periphery modeling of the KONECT networks. A 
node’s strength, the sum of its edge weights, often correlates with its 
degree (43), such that the weight-agnostic models may underestimate 
the cohesiveness of the core and how tightly peripheral nodes connect 
to it. Converting directed edges into undirected ones also forces 
symmetry upon the adjacency matrix, which can obscure other prom-
inent patterns particular to directed networks (39) that may more 
comprehensively describe the core-periphery organization.

We infer block models for all KONECT networks with up to 
200,000 nodes, a total of 95 of the 142 networks. The networks with 
more than 200,000 nodes are concentrated on a small set of network 
domains: 25% are social networks, 25% are hyperlink networks, and 
23% are communication networks. Fitting larger networks is possi-
ble but requires considerable computation.

Stochastic block model formulation
Recall that we let A be the adjacency matrix of an unweighted, un-
directed, simple network with N nodes. Nodes are assigned to a 

fixed number of B blocks, represented by $\theta$, a vector of length N 
where $\theta_i = r$ indicates that node i belongs to block r. Connections 
between blocks are specified by the B × B matrix p, where $p_{rs}$ is the 
probability that a node in block r is connected to a node in block s. 
The posterior distribution of the parameters $\theta$ and p given our 
network data A is written as

$$P(\theta, p \mid A) \propto P(A \mid \theta, p)\, P(\theta)\, P(p) \qquad (7)$$

Earlier, we constructed the prior P(p) on the block connectivity 
matrix, which is the primary alteration needed for the core-periphery 
block models

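$$P_{\mathcal{H}}(p) = 3! \cdot \mathbb{1}_{\{0 < p_{22} < p_{12} < p_{11} < 1\}} \qquad (8)$$

and

$$P_{\mathcal{L}}(p) = \ell! \cdot \mathbb{1}_{\{0 < p_{\ell} < p_{\ell-1} < \dots < p_{1} < 1\}} \qquad (9)$$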

where the leading numerical factors ensure normalization. Here, we 
think of blocks as layers, so we let B = 𝓁, where 𝓁 is the number 
of layers.
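
To make the role of the normalizing factors concrete, here is a minimal sketch of the two priors as log-densities. The ordered region 0 < p_𝓁 < … < p_1 < 1 occupies a fraction 1/𝓁! of the unit cube, so a uniform density on it equals 𝓁!. The function names are hypothetical.

```python
from math import factorial, inf, log

def log_prior_hub_and_spoke(p11, p12, p22):
    """Log-density of Eq. 8: uniform over 0 < p22 < p12 < p11 < 1."""
    return log(factorial(3)) if 0 < p22 < p12 < p11 < 1 else -inf

def log_prior_layered(p):
    """Log-density of Eq. 9 for layer densities p = [p_1, ..., p_l], innermost first."""
    ordered = all(a > b for a, b in zip(p, p[1:]))
    inside = all(0 < x < 1 for x in p)
    return log(factorial(len(p))) if ordered and inside else -inf
```

Any connectivity matrix that violates the ordering receives zero prior mass (a log-density of negative infinity), which is how the constraint restricts inference to core-periphery structure.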

We otherwise use a standard formulation of the likelihood 
$P(A \mid \theta, p)$ and the block assignments prior $P(\theta)$ (8, 16). The network 
likelihood rests upon the cornerstone assumption of the stochastic 
block model: Connections are independently generated on the 
basis of only the block assignments of nodes. Let $m_{rs}$ be the number 
of edges that exist between blocks r and s, and let $M_{rs}$ be the 
maximum number of edges that could potentially exist between the 
two blocks. This number equals $n_r n_s$ for two different blocks of size 
$n_r$ and $n_s$, and $n_r(n_r - 1)/2$ when considering the internal edges of 
block r. The likelihood can then be calculated as the product of 
independent Bernoulli processes across edges and then aggregated 
at the block level to yield

$$P(A \mid \theta, p) = \prod_{r \le s} p_{rs}^{m_{rs}} (1 - p_{rs})^{M_{rs} - m_{rs}} \qquad (10)$$
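
The block-level likelihood of Eq. 10 is straightforward to evaluate in log space. The sketch below assumes an undirected simple graph stored as an edge list, block labels stored in a dictionary, and connection probabilities keyed by ordered block pairs; these data structures and names are illustrative choices, not the authors' implementation.

```python
from collections import Counter
from math import log

def sbm_log_likelihood(edges, theta, p):
    """Log of Eq. 10 for an undirected simple graph.

    edges: iterable of (i, j) node pairs, each undirected edge listed once
    theta: dict mapping node -> block label
    p:     dict mapping (r, s) with r <= s -> connection probability in (0, 1)
    """
    sizes = Counter(theta.values())
    m = Counter()
    for i, j in edges:
        r, s = sorted((theta[i], theta[j]))
        m[(r, s)] += 1
    log_like = 0.0
    for (r, s), p_rs in p.items():
        # M_rs: n_r * n_s between distinct blocks, n_r (n_r - 1) / 2 within a block
        M_rs = sizes[r] * sizes[s] if r != s else sizes[r] * (sizes[r] - 1) // 2
        m_rs = m[(r, s)]
        log_like += m_rs * log(p_rs) + (M_rs - m_rs) * log(1 - p_rs)
    return log_like
```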

The constraints on p yield a more compact form for this likelihood 
(see the Supplementary Materials). The last missing piece of 
the model is the layer assignment prior $P(\theta)$. The prior on $\theta$ can 
be expressed in three parts (8). First, we consider the probability $P(\ell)$ 
of choosing a particular number of layers $\ell$, which is always $\ell = 2$ 
for the hub-and-spoke model and a free parameter for the layered 
model. Next, given the number of layers, we consider the probability 
$P(n \mid \ell)$ of drawing a particular sequence of layer sizes 
$n = \{n_1, n_2, \dots, n_\ell\}$. Last, given the layer sizes, we determine the probability 
$P(\theta \mid n)$ of seeing a particular allotment of nodes to layers. All 
together in notation, the prior on the block assignments $\theta$ is 
expressed as

$$P(\theta) = P(\theta \mid n)\, P(n \mid \ell)\, P(\ell) = \frac{\prod_r n_r!}{N!} \binom{N-1}{\ell-1}^{-1} N^{-1} \qquad (11)$$
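
For concreteness, the three factors of Eq. 11 can be evaluated in log space with log-gamma functions, which avoids overflow for networks with many nodes. This is a minimal sketch that implements the expression as printed, including the N⁻¹ factor for the number of layers; the function names are hypothetical.

```python
from collections import Counter
from math import lgamma, log

def log_factorial(n):
    """log(n!) computed through the log-gamma function."""
    return lgamma(n + 1)

def log_partition_prior(theta, num_layers):
    """Log of Eq. 11 for a sequence of layer labels theta and a given number of layers."""
    N = len(theta)
    sizes = Counter(theta)
    # P(theta | n) = prod_r n_r! / N!
    log_assign = sum(log_factorial(n_r) for n_r in sizes.values()) - log_factorial(N)
    # P(n | l) = 1 / binom(N - 1, l - 1)
    log_binom = (log_factorial(N - 1) - log_factorial(num_layers - 1)
                 - log_factorial(N - num_layers))
    # P(l) = 1 / N
    return log_assign - log_binom - log(N)
```

For four nodes split evenly into two layers, the three factors are 1/6, 1/3, and 1/4, so the prior probability is 1/72.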

With these three parts of the model specified, we can calculate 
the posterior probability of the model. For more details on the sto-
chastic block model formulation, see (8, 16, 44).

We fit the model with a Metropolis-within-Gibbs algorithm, 
detailed in the Supplementary Materials. We estimate the true layer of 
a node by selecting the layer that maximizes its marginal posterior 
distribution over layers. In doing so, we average over the random 
fluctuations found in real systems and avoid overfitting to a particular 
partition when there are many similar optima (45, 46).
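
The marginal estimate can be illustrated with a short sketch that takes a list of sampled partitions and returns, for each node, the layer it occupies most often. The sampler itself is not shown, and the tie-breaking rule is an arbitrary choice made for this example.

```python
from collections import Counter

def marginal_layer_estimates(samples):
    """Pick, for each node, the layer it occupies most often across posterior samples.

    samples: list of partitions, each a list where entry i is the layer of node i.
    """
    num_nodes = len(samples[0])
    estimates = []
    for i in range(num_nodes):
        counts = Counter(sample[i] for sample in samples)
        # Ties are broken toward the smaller layer index (an arbitrary choice).
        best = min(counts, key=lambda layer: (-counts[layer], layer))
        estimates.append(best)
    return estimates

# Toy usage with three sampled partitions of three nodes.
layers = marginal_layer_estimates([[1, 2, 2], [1, 2, 1], [1, 1, 2]])
```

Because each node is assigned independently from its own marginal, the result need not coincide with any single sampled partition.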

Synthetic networks
Discernment experiment
The first synthetic network experiment tests the ability of the core- 
periphery models to discern between hub-and-spoke and layered 
structures (see Fig. 3A). We generate networks through the stochastic 
block model according to block matrices given by

  
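$$\begin{bmatrix} p\delta & p & p(1-\lambda) + \frac{p}{\delta}\lambda \\ p & p\lambda + \frac{p}{\delta}(1-\lambda) & \frac{p}{\delta} \\ p(1-\lambda) + \frac{p}{\delta}\lambda & \frac{p}{\delta} & \frac{p}{\delta} \end{bmatrix} \qquad (12)$$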

where p > 0 is the baseline density of the network, $\delta \in [1, 1/p]$ is the 
structural clarity parameter, and $\lambda \in [0,1]$ is the interpolation 
parameter. The structural clarity parameter $\delta$ determines the 
prevalence of core-periphery structure in the network. When $\delta = 1$, the 
network as a whole is simply an Erdős-Rényi random network with 
density p. When $\delta \gg 1$, the core-periphery structure is well defined. 
The interpolation parameter $\lambda$ specifies whether a layered or 
hub-and-spoke core-periphery structure is reflected in the network. 
When $\lambda = 0$, the block densities arrange in such a way that the 
network is effectively generated from two blocks and a hub-and-spoke 
structure is present. When $\lambda = 1$, the network exhibits a three-layer 
structure. We note that $\lambda = 1/2$ holds no special meaning in this 
interpolation: The structure smoothly transitions from one type to 
the other as $\lambda$ is varied.

For the experiment, each synthetic network consists of n = 
10,000 nodes, divided equally among the three blocks. We set 
p = 0.0075 and generate networks over the parameter ranges $\lambda \in [0,1]$ 
and $\delta \in [1,4]$. See the “Block model inference and parameters” 
section for details on inference of the experimental network 
structure.
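
A small generator for such networks might look like the following sketch. It assumes that the block matrix of Eq. 12 takes the form encoded in `block_matrix`, a reading chosen to be consistent with the limiting cases described in the text (two effective blocks at λ = 0, three layers at λ = 1, and a uniform density p at δ = 1). The function names are hypothetical, and the pairwise sampling loop is only practical for networks far smaller than those used in the experiment.

```python
import random

def block_matrix(p, delta, lam):
    """3x3 connectivity matrix interpolating hub-and-spoke (lam=0) to layered (lam=1)."""
    mix_outer = p * (1 - lam) + (p / delta) * lam   # block 1 <-> block 3
    mix_inner = p * lam + (p / delta) * (1 - lam)   # within block 2
    low = p / delta
    return [
        [p * delta, p,         mix_outer],
        [p,         mix_inner, low],
        [mix_outer, low,       low],
    ]

def sample_sbm(block_sizes, probs, seed=None):
    """Sample an undirected simple graph; returns (block labels, edge list)."""
    rng = random.Random(seed)
    labels = [r for r, size in enumerate(block_sizes) for _ in range(size)]
    edges = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            # Each node pair is an independent Bernoulli trial
            if rng.random() < probs[labels[i]][labels[j]]:
                edges.append((i, j))
    return labels, edges

# Toy usage: 30 nodes in three equal blocks, fully layered structure.
labels, edges = sample_sbm([10, 10, 10], block_matrix(0.1, 2.0, 1.0), seed=0)
```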
Number of layers experiment
The second synthetic network experiment tests the ability of the 
layered model to identify the number of layers in synthetic layered 
networks (see Fig. 3B). We first generate a network G via the layered 
stochastic block model, where G has n = 10,000 nodes evenly split 
among 𝓁 = 6 layers, where layers are connected according to an 
initial connectivity matrix $p^{(G)}$. We then consider a new network $G_k$ of 
the same number of nodes and layers but where $p_r^{(G_k)} = p_r^{(G)}$ 
for r < k and $p_r^{(G_k)} = q_k$ for r ≥ k. This is a network where the inner layers 
have the same density as in G but where the outermost layers are 
effectively merged because they have the same density $q_k$. We set $q_k$ 
such that the overall average degree $\langle k \rangle$ of G is preserved, i.e., the 
average degree of $G_k$ is $\langle k \rangle$ for all k. The merged layers density $q_k$ 
preserving $\langle k \rangle$ is given by

$$q_k = \frac{\binom{n}{2} \sum_{r=k}^{\ell} p_r^{(G)} + n^2 \sum_{r=k}^{\ell} (r-1)\, p_r^{(G)}}{\binom{n}{2} (\ell - k + 1) + n^2 \sum_{r=k}^{\ell} (r-1)} \qquad (13)$$
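
Equation 13 can be checked numerically with a short sketch. It treats n as the number of nodes in each layer, which is an interpretation rather than something the text states, and it assumes that layer r connects internally and to each of the r − 1 inner layers with density p_r. The function names and toy densities are hypothetical.

```python
from math import comb

def merged_density(p, k, n):
    """Eq. 13: common density q_k for layers k..l that preserves the expected edge count.

    p: layer densities [p_1, ..., p_l] (p[0] is layer 1)
    k: first merged layer (1-indexed)
    n: number of nodes per layer (an assumed reading of n in Eq. 13)
    """
    ell = len(p)
    within = comb(n, 2)   # node pairs inside one layer
    between = n * n       # node pairs between two distinct layers
    numerator = sum((within + between * (r - 1)) * p[r - 1] for r in range(k, ell + 1))
    denominator = sum(within + between * (r - 1) for r in range(k, ell + 1))
    return numerator / denominator

def expected_edges(p, n):
    """Expected edge count when layer r links to itself and to inner layers at p_r."""
    return sum((comb(n, 2) + n * n * (r - 1)) * p_r for r, p_r in enumerate(p, start=1))

# Toy usage: merge layers 4 to 6 of a six-layer network with 10 nodes per layer.
densities = [0.1, 0.05, 0.02, 0.01, 0.005, 0.002]
q4 = merged_density(densities, k=4, n=10)
```

Replacing the densities of layers 4 to 6 with `q4` leaves `expected_edges` unchanged, which is the sense in which the average degree is preserved.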

For each choice of k ∈ [2,6], we make 10 networks generated from 
the same block matrix. We define the block matrix $p^{(G)}$ of the original 
network such that $p_1 = 0.002$, $p_6 = 0.1$, and $p_r$ for 1 < r < 6 is 
geometrically distributed between $p_1$ and $p_6$. See the “Block model inference 
and parameters” section for details on inference of the experimental 
network structure.

Hashtag activism case study
For the hashtag activism case study, we consider all of the tweets 
containing the hashtag #MeToo that were posted in the 12 hours after 
actress Alyssa Milano’s “me too” tweet, which catalyzed the hashtag 
campaign on 15 October 2017 [see (37) for further data details]. The 
use of human-generated data gathered from Twitter was reviewed 
and approved by the Institutional Review Board at Northeastern 
University. In the first 12 hours, there were 208,926 tweets. We con-
struct a retweet network from these tweets by representing individ-
uals as nodes and retweets as edges. For the purpose of core-periphery 
modeling, we treat edges as undirected and unweighted and remove 
self-loops from the network. We model the largest weakly connected 
component of the network (47), which consists of 74,214 nodes and 
130,277 edges.

We measure the coreness of each individual by taking into ac-
count all the potential core-periphery descriptions identified by a 
model. We can consider the average block or layer $\langle \theta_i \rangle$ of a node as a 
measure of its distance to the core of the network and use that to 
define coreness as

$$c_i = 1 - \frac{1}{\ell} \sum_{r=1}^{\ell} r\, P(\theta_i = r \mid A) \qquad (14)$$

In this expression, 𝓁 is the number of blocks and $P(\theta_i = r \mid A)$ is 
the probability that node i takes on block assignment r. The latter 
probability is the marginal distribution of $\theta_i$, formally defined as

$$P(\theta_i = r \mid A) = \sum_{\theta} P(\theta \mid A)\, \mathbb{1}_{\{\theta_i = r\}} \qquad (15)$$

Coreness varies between 0 and 1, where an individual positioned 
consistently in the core will have a higher coreness score.
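
As a minimal sketch of Eqs. 14 and 15, the code below approximates a node's marginal distribution by the fraction of sampled partitions that place it in each layer and then converts that marginal into a coreness score. Layer 1 is taken to be the innermost layer, and the function names are hypothetical.

```python
def marginals_from_samples(samples, node, num_layers):
    """Eq. 15 approximated by the fraction of sampled partitions with theta_i = r."""
    counts = [0] * num_layers
    for partition in samples:
        counts[partition[node] - 1] += 1
    return [c / len(samples) for c in counts]

def coreness(marginals):
    """Eq. 14: c_i = 1 - (1/l) * sum_r r * P(theta_i = r | A).

    marginals: list of length l where marginals[r - 1] = P(theta_i = r | A).
    """
    ell = len(marginals)
    expected_layer = sum(r * prob for r, prob in enumerate(marginals, start=1))
    return 1 - expected_layer / ell

# Toy usage: node 0 sits in layer 1 in three of four sampled two-layer partitions.
samples = [[1, 2], [1, 1], [2, 2], [1, 2]]
c0 = coreness(marginals_from_samples(samples, node=0, num_layers=2))
```

Under Eq. 14 as written, a node that is always placed in the outermost layer receives a coreness of 0, and a node that is always placed in the innermost layer receives 1 − 1/𝓁.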

Block model inference and parameters
For the discernment experiment, we run the hub-and-spoke and 
layered models three times each for each $(\delta, \lambda)$ parameter tuple. For 
each model, we use the best model according to the MDL. For each 
run of each model, we sweep over 250 Gibbs samples and let each 
Markov chain Monte Carlo (MCMC) simulation run for 10 times 
the number of nodes in the network (see the Supplementary 
Materials for numerical details). We use samples from the second half of 
the Gibbs sampling chain to infer the parameters $\hat{\theta}$, the most 
probable block labels. We use $10^7$ samples to approximate the MDL (see 
the Supplementary Materials for numerical details).

For the layered experiment, we consider L, the actual number of 
layers in each synthetic network, and 𝓁, the fixed parameter in the 
layered model. For each L, there are $N_L = 10$ networks. For each of 
those networks, we run the layered models three times and choose 
the best model from those three runs according to the MDL. We 
then average the MDL over the $N_L$ networks to get the average MDL 
per (L, 𝓁) pair. We perform inference similar to the discernment 
experiment but instead use 20 steps per node for the MCMC chains. 
To account for more layers than the previous experiment, we use 
$10^8$ samples to approximate the MDL.

For each KONECT network and the #MeToo case study network, 
we run both the hub-and-spoke and layered models three times each. 
For each model, we choose the best run, as determined by the 
MDL. For the KONECT networks, we run layered models for 𝓁 ∈ 
[2,10] and use the model with the best MDL across all choices of 𝓁. 
For the #MeToo network, we vary 𝓁 in the range 𝓁 ∈ [2,20] and 
choose the best model overall (thick red line in Fig. 5) and for each 
individual choice of 𝓁 (light red lines in Fig. 5) according to the 
MDL. We use 200 Gibbs samples for the KONECT and #MeToo 
models and infer partitions according to the second half of each 
chain. For the MCMC chains, we use 10 steps per node. We use $10^8$ 
samples to estimate the MDL.
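
The selection procedure (several runs per setting, keeping the run with the lowest description length) can be sketched as below. The `fit` argument is a hypothetical callable standing in for one inference run; it is not a function from any real library, and the stand-in used in the usage example simply returns a made-up description length.

```python
def best_fit(fit, graph, layer_counts, runs=3):
    """Return (mdl, num_layers, result) with the smallest description length.

    fit: hypothetical callable fit(graph, num_layers, seed) -> (mdl, result)
         that performs one inference run.
    """
    best = None
    for num_layers in layer_counts:
        for seed in range(runs):
            mdl, result = fit(graph, num_layers, seed)
            if best is None or mdl < best[0]:
                best = (mdl, num_layers, result)
    return best

def fake_fit(graph, num_layers, seed):
    """Stand-in for a real inference run; the returned MDL values are made up."""
    return abs(num_layers - 4) + 0.1 * seed, (num_layers, seed)

# Toy usage: sweep layered models with 2 to 10 layers, three runs each.
mdl, chosen_layers, result = best_fit(fake_fit, graph=None, layer_counts=range(2, 11))
```

For the hub-and-spoke model, the same function applies with `layer_counts=[2]`.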

SUPPLEMENTARY MATERIALS
Supplementary material for this article is available at 
http://advances.sciencemag.org/cgi/content/full/7/12/eabc9800/DC1

REFERENCES AND NOTES
 1. P. Barberá, N. Wang, R. Bonneau, J. T. Jost, J. Nagler, J. Tucker, S. González-Bailón, The 
critical periphery in the growth of social protests. PLOS ONE 10, e0143611 (2015).

 2. D. S. Bassett, N. F. Wymbs, M. P. Rombach, M. A. Porter, P. J. Mucha, S. T. Grafton, 
Task-based core-periphery organization of human brain dynamics. PLOS Comput. Biol. 9, 
e1003171 (2013).

 3. J. I. Alvarez-Hamelin, L. Dall’Asta, A. Barrat, A. Vespignani, K-core decomposition 
of Internet graphs: Hierarchies, self-similarity and measurement biases. Netw. Heterog. 
Media 3, 371–393 (2008).

 4. S. Carmi, S. Havlin, S. Kirkpatrick, Y. Shavitt, E. Shir, A model of Internet topology using 
k-shell decomposition. Proc. Natl. Acad. Sci. U.S.A. 104, 11150–11154 (2007).

 5. M. Kitsak, L. K. Gallos, S. Havlin, F. Liljeros, L. Muchnik, H. E. Stanley, H. A. Makse, 
Identification of influential spreaders in complex networks. Nat. Phys. 6, 888–893 (2010).

 6. F. D. Malliaros, C. Giatsidis, A. N. Papadopoulos, M. Vazirgiannis, The core decomposition 
of networks: Theory, algorithms and applications. VLDB J. 29, 61–92 (2020).

 7. X. Zhang, T. Martin, M. E. J. Newman, Identification of core-periphery structure 
in networks. Phys. Rev. E 91, 032803 (2015).

 8. T. P. Peixoto, Bayesian stochastic blockmodeling, in Advances in Network Clustering and 
Blockmodeling, P. Doreian, V. Batagelj, A. Ferligoj, Eds. (Wiley, ed. 1, 2019), pp. 289–332.

 9. S. Kojaku, N. Masuda, Finding multiple core-periphery pairs in networks. Phys. Rev. E 96, 
052313 (2017).

 10. M. Cucuringu, P. Rombach, S. H. Lee, M. A. Porter, Detection of core–periphery structure in 
networks using spectral methods and geodesic paths. Eur. J. Appl. Math. 27, 846–887 (2016).

 11. F. Tudisco, D. J. Higham, A nonlinear spectral method for core-periphery detection 
in networks. SIAM J. Math. Data Sci. 1, 269–292 (2019).

 12. F. Della Rossa, F. Dercole, C. Piccardi, Profiling core-periphery network structure by 
random walkers. Sci. Rep. 3, 1467 (2013).

 13. C. Ma, B.-B. Xiang, H.-S. Chen, M. Small, H.-F. Zhang, Detection of core-periphery structure 
in networks based on 3-tuple motifs. Chaos 28, 053121 (2018).

 14. M. P. Rombach, M. A. Porter, J. H. Fowler, P. J. Mucha, Core-periphery structure 
in networks. SIAM J. Appl. Math. 74, 167–190 (2014).

 15. S. P. Borgatti, M. G. Everett, Models of core/periphery structures. Soc. Netw. 21, 375–395 
(2000).

 16. B. Karrer, M. E. J. Newman, Stochastic blockmodels and community structure in networks. 
Phys. Rev. E 83, 016107 (2011).

 17. B. Bollobás, Modern Graph Theory (Springer, 1998).

 18. S. B. Seidman, Network structure and minimum degree. Soc. Netw. 5, 269–287 (1983).

 19. L. Hébert-Dufresne, J. A. Grochow, A. Allard, Multi-scale structure and topological 
anomaly detection via a new network statistic: The onion decomposition. Sci. Rep. 6, 
31708 (2016).

 20. V. Batagelj, M. Zaveršnik, An O(m) algorithm for cores decomposition of networks. 
arXiv:cs/0310049 [cs.DS] (2003).

 21. A. V. Goltsev, S. N. Dorogovtsev, J. F. F. Mendes, k-core (bootstrap) percolation on complex 
networks: Critical phenomena and nonlocal effects. Phys. Rev. E 73, 056101 (2006).

 22. P. W. Holland, K. B. Laskey, S. Leinhardt, Stochastic blockmodels: First steps. Soc. Netw. 5, 
109–137 (1983).

 23. T. P. Peixoto, Nonparametric Bayesian inference of the microcanonical stochastic block 
model. Phys. Rev. E 95, 012317 (2017).

 24. J. Kunegis, KONECT: The Koblenz Network Collection, in Proceedings of the 22nd 
International Conference on World Wide Web (ACM, 2013), pp. 1343–1350.



 25. S. Z. W. Lip, A fast algorithm for the discrete core/periphery bipartitioning problem. 
arXiv:1102.5511 [physics.soc-ph] (2011).

 26. M. Meilă, Comparing clusterings—An information based distance. J. Multivar. Anal. 98, 
873–895 (2007).

 27. A. J. Gates, Y.-Y. Ahn, CluSim: A Python package for calculating clustering similarity. 
J. Open Source Softw. 4, 1264 (2019).

 28. N. X. Vinh, J. Epps, J. Bailey, Information theoretic measures for clusterings comparison: 
Variants, properties, normalization and correction for chance. J. Mach. Learn. Res. 11, 
2837–2854 (2010).

 29. M. E. J. Newman, G. T. Cantwell, J.-G. Young, Improved mutual information measure 
for clustering, classification, and community detection. Phys. Rev. E 101, 042304 (2020).

 30. J.-G. Young, G. St-Onge, P. Desrosiers, L. J. Dubé, Universality of the stochastic block 
model. Phys. Rev. E 98, 032309 (2018).

 31. S. C. Olhede, P. J. Wolfe, Network histograms and universality of blockmodel 
approximation. Proc. Natl. Acad. Sci. U.S.A. 111, 14722–14727 (2014).

 32. T. P. Peixoto, Parsimonious module inference in large networks. Phys. Rev. Lett. 110, 
148701 (2013).

 33. D. J. C. MacKay, Information Theory, Inference and Learning Algorithms (Cambridge Univ. 
Press, ed. 1, 2003).

 34. Y. Benkler, The Wealth of Networks: How Social Production Transforms Markets and 
Freedom (Yale Univ. Press, 2006).

 35. S. Jackson, M. Bailey, B. Foucault Welles, #HashtagActivism: Networks of Race and Gender 
Justice (MIT Press, 2020).

 36. Z. Papacharissi, Affective Publics: Sentiment, Technology, and Politics (Oxford Univ. Press, 2015).
 37. R. J. Gallagher, E. Stowell, A. G. Parker, B. Foucault Welles, Reclaiming stigmatized 
narratives: The networked disclosure landscape of #MeToo. Proc. ACM Hum. Comput. 
Interact. 3, 1–30 (2019).

 38. J. P. Boyd, W. J. Fitzgerald, M. C. Mahutga, D. A. Smith, Computing continuous 
core/periphery structures for social relations data with MINRES/SVD. Soc. Netw. 32, 125–137 (2010).

 39. P. Csermely, A. London, L.-Y. Wu, B. Uzzi, Structure and dynamics of core/periphery 
networks. J. Complex Netw. 1, 93–123 (2013).

 40. L. Peel, D. B. Larremore, A. Clauset, The ground truth about metadata and community 
detection in networks. Sci. Adv. 3, e1602548 (2017).

 41. D. B. Larremore, A. Clauset, A. Z. Jacobs, Efficiently inferring community structure 
in bipartite networks. Phys. Rev. E 90, 012805 (2014).

 42. M. Gerlach, T. P. Peixoto, E. G. Altmann, A network approach to topic models. Sci. Adv. 4, 
eaaq1360 (2018).

 43. A. Barrat, M. Barthélemy, R. Pastor-Satorras, A. Vespignani, The architecture of complex 
weighted networks. Proc. Natl. Acad. Sci. U.S.A. 101, 3747–3752 (2004).

 44. T. P. Peixoto, Efficient Monte Carlo and greedy heuristic for the inference of stochastic 
block models. Phys. Rev. E 89, 012804 (2014).

 45. A. Decelle, F. Krzakala, C. Moore, L. Zdeborová, Asymptotic analysis of the stochastic 
block model for modular networks and its algorithmic applications. Phys. Rev. E 84, 
066106 (2011).

 46. P. Zhang, C. Moore, Scalable detection of statistically significant communities 
and hierarchies, using message passing for modularity. Proc. Natl. Acad. Sci. U.S.A. 111, 
18144–18149 (2014).

 47. M. Newman, Networks (Oxford Univ. Press, 2018).

Acknowledgments: We thank A. Clauset and D. Larremore for the initial inspiration and 
depiction of the layered block model presented in this work. We also thank N. Beauchamp, 
A. Clauset, L. Torres, and J. Davis for conversations early in the project and B. Klein for 
assistance and advice on the data visualization. Funding: This work was supported, in part, by 
equipment and computing resources from NVIDIA Corporation and Northeastern University’s 
Discovery computing cluster. J.-G.Y. was supported by a James S. McDonnell Foundation 
Postdoctoral Fellowship Award. Author contributions: R.J.G. and B.F.W. conceptualized the 
project. R.J.G. and J.-G.Y. developed the methods, designed all experiments, and validated all 
results. B.F.W. and J.-G.Y. supervised the project. R.J.G. implemented and validated all 
computer code, curated all data, and wrote the initial draft. All authors reviewed and edited 
the final manuscript. Competing interests: The authors declare that they have no competing 
interests. Data and materials availability: All data needed to evaluate the conclusions in the 
paper are present in the paper and/or the Supplementary Materials. The Python code for 
inferring the hub-and-spoke and layered core-periphery models and evaluating their model fit 
is freely available online at https://github.com/ryanjgallagher/core_periphery_sbm. The 
KONECT dataset used in this work is freely available at http://konect.cc/. The Twitter data 
underlying the #MeToo case study is available at the University of Michigan’s Inter-University 
Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/ICPSR/studies/37447) 
upon submission and acceptance of a Restricted Data Use Agreement. Additional information 
related to this paper may be requested from the authors.

Submitted 26 May 2020
Accepted 29 January 2021
Published 17 March 2021
10.1126/sciadv.abc9800

Citation: R. J. Gallagher, J.-G. Young, B. F. Welles, A clarified typology of core-periphery structure 
in networks. Sci. Adv. 7, eabc9800 (2021).


A clarified typology of core-periphery structure in networks
Ryan J. Gallagher, Jean-Gabriel Young and Brooke Foucault Welles

Sci Adv 7 (12), eabc9800.
DOI: 10.1126/sciadv.abc9800

ARTICLE TOOLS http://advances.sciencemag.org/content/7/12/eabc9800

SUPPLEMENTARY MATERIALS http://advances.sciencemag.org/content/suppl/2021/03/15/7.12.eabc9800.DC1

REFERENCES This article cites 37 articles, 6 of which you can access for free
http://advances.sciencemag.org/content/7/12/eabc9800#BIBL

PERMISSIONS http://www.sciencemag.org/help/reprints-and-permissions

Use of this article is subject to the Terms of Service (http://www.sciencemag.org/about/terms-service).

Science Advances (ISSN 2375-2548) is published by the American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. The title Science Advances is a registered trademark of AAAS.

Copyright © 2021 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC).
