Revising paradigms of fractal complex networks

It will soon be two decades since it was first shown that some real networks (such as the World Wide Web [WWW] and various biological networks) have fractal properties1,2. This means that, when covered with non-overlapping boxes in which the maximum distance between any two nodes is less than \(l_B\), they exhibit the power-law scaling1,2,3,4,5:

$$\begin{aligned} N_B(l_B)/N\simeq l_B^{-d_{B}}, \end{aligned}$$
(1)

where \(N_{B}(l_B)\) is the number of boxes of diameter \(l_B\) needed to cover the network, and \(d_B\) is the fractal (or box) dimension of the network of size N. Such fractal networks are also said to be self-similar, because their power-law degree distributions,

$$\begin{aligned} P(k)\sim k^{-\gamma }, \end{aligned}$$
(2)

remain invariant under a renormalization scheme6,7, according to which a new network emerges from the original one when the nodes belonging to the same box in the original network are replaced by a single supernode in the renormalized network. In this case, two supernodes are connected if, in the original network, there is at least one link between the nodes of the corresponding boxes.
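The box-covering renormalization described above can be sketched in a few lines of code. The following is a minimal illustration of the scheme, not the optimized algorithm of Song et al.: it covers a small graph, stored as an adjacency dict, with non-overlapping boxes in which any two nodes are at distance less than \(l_B\), and then replaces each box with a supernode. The greedy covering heuristic and all function names are our own assumptions for this sketch.

```python
from collections import deque

def bfs_dist(adj, src):
    """Shortest-path distances from src (unweighted BFS)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def greedy_box_cover(adj, l_B):
    """Cover the network with non-overlapping boxes in which the distance
    between any two nodes is less than l_B (greedy heuristic, not the
    optimized algorithm of Song et al.)."""
    dists = {u: bfs_dist(adj, u) for u in adj}  # all pairs; small graphs only
    uncovered, boxes = set(adj), []
    while uncovered:
        seed = min(uncovered)
        box = {seed}
        for v in sorted(uncovered - {seed}):
            # unreachable nodes default to distance l_B and are excluded
            if all(dists[u].get(v, l_B) < l_B for u in box):
                box.add(v)
        boxes.append(box)
        uncovered -= box
    return boxes

def renormalize(adj, boxes):
    """Replace each box with a supernode; two supernodes are linked iff
    at least one link connects their boxes in the original network."""
    label = {v: i for i, box in enumerate(boxes) for v in box}
    radj = {i: set() for i in range(len(boxes))}
    for u in adj:
        for v in adj[u]:
            if label[u] != label[v]:
                radj[label[u]].add(label[v])
                radj[label[v]].add(label[u])
    return radj

# Example: a chain of 9 nodes renormalized with l_B = 3
adj = {i: set() for i in range(9)}
for i in range(8):
    adj[i].add(i + 1)
    adj[i + 1].add(i)
boxes = greedy_box_cover(adj, 3)  # three boxes of three consecutive nodes
radj = renormalize(adj, boxes)    # renormalized network: a 3-node chain
```

On this toy chain, the renormalized network is again a chain, illustrating the self-similarity that the renormalization scheme is meant to probe.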

Here, at least two critical remarks can be made. The first is that an analogous invariance of the degree distribution with respect to the box-covering renormalization scheme is also observed in networks that do not satisfy Eq. (1) (well-known examples in this respect are the internet and Barabási-Albert (BA) networks2,8,9). The second is that it is not entirely clear which structural characteristic of fractal networks exhibits geometric self-similarity and remains invariant10 under the described renormalization. Clearly, the power-law node degree distribution cannot be considered such a characteristic, because it is intrinsically invariant under the rescaling of the degree11. Its invariance under box-covering renormalization may only suggest the existence of some (presumably) degree-dependent network measure, whose self-similarity under the renormalization procedure could result in the observed invariance of the degree distribution. One argument supporting this statement is that random networks whose degree distribution is not a power law can also exhibit fractal properties (the best example in this regard is the giant component of classical random graphs near the percolation transition).

If the above remarks, indicating an incomplete understanding of fractality in complex networks, are valid, the pertinent questions are: What are the real origins and potential consequences of fractality in complex networks? What determines a network’s fractal dimension? Indeed, several studies have been published over the years that focus on exploring the origins of fractality12,13,14,15,16,17,18,19. However, these efforts have not led to a consensus. Thus, there is a lack of realistic (and not just deterministic20,21, or reflecting the renormalization procedure2,22,23) fractal network models that would allow testing the role of fractality in the context of geometry-involving issues24, such as navigability, localization of information sources, prediction of hidden network connections, etc. These issues are of particular importance given the confirmed fractal properties of various information, biological, and even social networks (see e.g.25,26). The goal of this article is to initiate far-reaching changes in this state of affairs.

In what follows, we will first argue that the correct scale-dependent network measure, which is self-similar (i.e. geometrically invariant) under the \(l_B\)-box-covering renormalization procedure, is the normalized mass of the box - \(\mu (L,k)=m(L,k)/\!\langle m\rangle \), where \(m(L,k)\) is the number of nodes in a box of diameter \(L\ge l_B\) and hub degree k, and \(\langle m\rangle =N/N_B(L)=L^{d_B}\) (1) is the average mass of non-overlapping boxes of this diameter. It should be emphasized here that, although the definition of the box is the same throughout the paper, we distinguish between \(l_B\)-boxes, used to renormalize the network, and L-boxes (where \(L\ge l_B\)), whose self-similarity we examine. This distinction is crucial for understanding the main idea of the paper.

Then, we show that one of the consequences of this result is the previously discovered scaling relation between the degree \(k'\) of a supernode in the renormalized network and the degree k of the hub of the corresponding \(l_B\)-box in the network before renormalization: \(k'=l_B^{-d_k}k\), where \(d_k\) is only one of four scaling exponents that characterize the microscopic structure of a fractal complex network and determine its box dimension. We also show that if a fractal complex network has a power-law node degree distribution (which is traditionally referred to as the scale-free property), then the box mass distribution also follows a power law and is invariant under the box renormalization procedure. Furthermore, the characteristic exponents of both distributions are related to the microscopic scaling exponents describing the masses of the boxes, thus bridging local self-similarity and global scale invariance in fractal complex networks. Lastly, we successfully verify our findings in real networks from various fields (information - the World Wide Web, biological - the human brain, and social - scientific collaboration networks) and in several fractal network models.

Local self-similarity and global scale-invariance in fractal networks

Geometric self-similarity

In classical fractals27, which reproduce themselves at different space scales, self-similarity manifests itself in the scale-invariant equation10, which describes how the mass m(L) of the system changes with its linear size L:

$$\begin{aligned} m(bL)=\mu (b)\,m(L), \end{aligned}$$
(3)

where \(b>0\). In theoretical physics, this type of equation is, for example, encountered in the theory of critical phenomena11,28. Mathematically, this equation defines a homogeneous function. Its solution is simply a power law:

$$\begin{aligned} m(L)=AL^{d_f}, \end{aligned}$$
(4)

which, in the case of fractals, determines their fractal dimension, \(d_f=\ln \mu /\ln b\), and leads to the well-known scaling relation29:

$$\begin{aligned} m(bL)=b^{d_f}m(L). \end{aligned}$$
(5)

Moving forward, to address the problem of geometric self-similarity in complex networks, we first argue that Eq. (1) can be treated as a special case of Eq. (5). Then, building on this observation, we assume that Eq. (5) is itself a special case of a more general equation, in which the masses of the system and of its parts (identified with the number of nodes in the network and the number of nodes in different L-boxes extracted from it, respectively) depend not only on the diameter of the examined set of nodes (i.e. the entire network or a box) but also on the degree of the best-connected node in this set. This assumption leads us to a consistent scaling theory of fractal complex networks.

Figure 1

Schematic illustration of the idea of geometric self-similarity in complex networks using the example of the fractal model of nested BA networks (for the definition of the model, see “Methods” section). Part (a) of the figure shows that the network can be subdivided into parts—boxes of a given diameter—each of which is (at least approximately) a reduced-size copy of the entire network. In the top picture, one such box, marked in red, is extracted from the original network and treated as a new network (shown below). It is divided again into new, smaller boxes, some of which are marked with different colours. Both the macroscopic and microscopic characteristics of this new network (represented by green squares in Fig. 2) are similar to those of the original network (indicated by navy circles in Fig. 2). Part (b) of the figure illustrates the renormalization procedure applied to the same network as in part (a). The top original network is divided into boxes of a fixed diameter, some of which are marked with different colours. In the new network after renormalization (shown below), these boxes are replaced by nodes with the corresponding colours. Again, the macroscopic and microscopic characteristics of the network after renormalization (represented by red triangles in Fig. 2) are similar to those of the original network.

To grasp the relation between Eqs. (1) and (5), it is enough to analyse the meaning of Eq. (5), which can be interpreted in two ways. More directly, it states that if one considers a smaller part of the system, say of size \(L'=bL\) (with \(b<1\)), then \(m(L')\), as compared with m(L), is decreased by a factor \(\mu (b)=b^{d_f}\), which depends only on b. However, this equation also applies to the masses of the system at two different scales, or resolutions, which, from a formal point of view, can be treated as two stages of some renormalization procedure applied to the system. (A network-based illustration of these two interpretation schemes is shown in Figs. 1 and 2a–c.) Accordingly, to make Eq. (5) more operational, it can be rewritten as:

$$\begin{aligned} m'(L')=b^{d_f}m(L), \end{aligned}$$
(6)

where the notation with the apostrophe is introduced to indicate the relation between the mass of the system before renormalization, m(L), and its mass after renormalization, \(m'(L')\). Now, it is easy to see that Eq. (1) is indeed a special case of Eq. (6), with: \(d_f=d_B\), \(b=l_B^{-1}\), \(m(L)=N\), and \(m'(L')=N_B(l_B)\), where L and \(L'\) stand for the diameters of the network before and after renormalization, respectively.

In what follows, to extend the concept of geometric self-similarity to fractal complex networks, we assume that Eq. (6) can be rewritten in the form:

$$\begin{aligned} m'(L',k')=l_B^{-d_B}m(L,k), \end{aligned}$$
(7)

where \(d_B\) is the box dimension of fractal networks, whereas \(m(L,k)\) and \(m'(L',k')\) stand for the number of nodes and supernodes in the same box, before and after its renormalization with boxes of diameter \(l_B<L\), respectively. In other words, in Eq. (7), \(m'(L',k')\) is equal to the number of \(l_B\)-boxes used to cover the initial box of mass \(m(L,k)\). As indicated in this equation, during renormalization, when \(l_B\)-boxes are replaced with supernodes, not only the mass of the initial box changes, but also its diameter (from L to \(L'\)) and the degree of its hub (from k to \(k'\), where \(k'\) is the degree of the best-connected \(l_B\)-box within the initial L-box).

Now, since Eq. (7), like Eqs. (5) and (6), defines a generalized homogeneous function28 of the form:

$$\begin{aligned} m(L,k)=B\,L^\alpha k^\beta , \end{aligned}$$
(8)

after its substitution into (7), we obtain several scaling relations characterizing fractal networks. The first relation reads:

$$\begin{aligned} L'=L/l_B^{\;d_L}=L/l_B, \end{aligned}$$
(9)

where \(d_L=1\) is a direct consequence of the applied renormalization procedure, assuming perfect tiling of the network with boxes of diameter \(l_B\) each2. The second relation has the form of the long-confirmed empirical relation1,

$$\begin{aligned} k'=k/l_B^{\;d_k}, \end{aligned}$$
(10)

but in the context of Eq. (7), which applies to boxes of any diameter \(L\ge l_B\), the range of its applicability is much wider than previously thought (according to our notation, it was limited to the case of \(L=l_B\) and \(L'=1\)). Finally, taken together Eqs. (7)–(10) give the following scaling relation:

$$\begin{aligned} d_B=\alpha d_L+\beta d_k=\alpha +\beta d_k, \end{aligned}$$
(11)

which is one of the most important results of this article.

Figure 2

Macroscopic characteristics of fractal complex networks: (a) the number of boxes \(N_B(l_B)\) needed to cover the considered networks as a function of the box diameter \(l_B\), (b) the node degree distributions P(k), and (c) the distributions of normalized masses of L-boxes \(P(\mu )\), for \(L=3\). To construct these graphs, a nested BA network of size \(N\simeq 5\cdot 10^4\) and diameter \(d=475\) was created (these data are marked with navy circles, see also Table 1). To analyse the self-similarity of the network parts, the original network was covered with boxes of diameter \(l_B=40\), and the largest box, of size \(M\simeq 1.4\cdot 10^3\), was extracted as a new network (these data are marked with green squares). To create the renormalized network of size \(N'\simeq 6.1\cdot 10^3\), the original network was covered with boxes of size \(l_B=6\), and then each of these \(N_B(6)=N'\) boxes was replaced with a supernode (these data are marked with red triangles). In graph (d), the results of repeated \(l_B\)-renormalizations of a single box of size \(m\simeq 3.5\cdot 10^4\) and diameter \(L=300\) are shown, which allow for an alternative determination of the box dimension of the studied fractal network (see the description in the main text of the paper).

According to Eq. (11), the box dimension \(d_B\) of fractal networks is determined solely by the scaling exponents characterizing the microscopic structure of the network at the box level. In particular, as follows from Eq. (7), besides the method of determining the box dimension that involves counting non-overlapping boxes (see Fig. 2a), \(d_B\) can also be obtained by successive renormalizations of a single L-box with smaller boxes of a given diameter \(l_B<L\) (see Fig. 2d). The decreasing sequence of renormalized box masses obtained in this way, \(m,m',m'',\dots ,m^{(i)},m^{(i+1)},\dots \), when presented on a graph as points \((m^{(i)},m^{(i+1)})\), can be used to determine the coefficient \(l_B^{-d_B}\) in Eq. (7). Repeating this procedure for different values of \(l_B\), one obtains a set of points \((l_B,l_B^{-d_B})\) which, when fitted with a straight line on a double logarithmic scale, gives the same value of \(d_B\) as that resulting from the classical method based on Eq. (1).
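As a numerical illustration of this alternative procedure, the sketch below (our own, using an artificially generated mass sequence rather than one measured on a network) recovers \(d_B\) from the ratio of consecutive renormalized masses, which Eq. (7) predicts to equal \(l_B^{-d_B}\):

```python
import math

def estimate_d_B(masses, l_B):
    """Estimate the box dimension from a decreasing sequence of
    renormalized box masses m, m', m'', ...  By Eq. (7),
    m^(i+1) = l_B**(-d_B) * m^(i), so the slope of the points
    (m^(i), m^(i+1)) equals l_B**(-d_B)."""
    ratios = [nxt / cur for cur, nxt in zip(masses, masses[1:])]
    coef = sum(ratios) / len(ratios)  # average slope of (m_i, m_{i+1})
    return -math.log(coef) / math.log(l_B)

# Synthetic mass sequence with a known dimension (illustration only)
d_B_true, l_B, m0 = 1.6, 3, 10_000
masses = [m0 * l_B ** (-d_B_true * i) for i in range(5)]
d_B_est = estimate_d_B(masses, l_B)
```

In practice, one would fit the points \((m^{(i)},m^{(i+1)})\) by least squares and repeat the fit for several values of \(l_B\), as described above; the averaging of ratios here is the simplest stand-in for that fit.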

The form of Eq. (11) is also very suggestive. It is the sum of two components, each of which is the product of scaling exponents relating to specific quantities characterizing the mass of the box before and after renormalization. In classical fractals, in which the mass of the box depends only on its linear size L, this sum has only one component. For this reason, in classical fractals, the fractal dimension can be determined by either of two equivalent methods: the box-covering method or the cluster-growing method. However, this is not the case for fractal complex networks1,30, where \(\alpha \), playing the role of the spreading dimension, only describes how the mass of the box, Eq. (8), varies with its diameter:

$$\begin{aligned} m(bL,k)=b^{\alpha }m(L,k), \end{aligned}$$
(12)

where \(b>0\). In a similar vein, the second addend in (11), which is further called the mass exponent (in analogy to the degree exponent, \(d_k\)),

$$\begin{aligned} d_m=\beta d_k, \end{aligned}$$
(13)

only characterizes how the local network density (understood as the number of nodes within a local area of diameter \(L=L'\)) changes as a result of renormalization:

$$\begin{aligned} m'(L,k')=l_B^{-d_m}m(L,k). \end{aligned}$$
(14)

Finally, an observation of great importance for the scaling theory of fractal complex networks (see “Scale-free property” section) is that when the box mass m(Lk) (8) is divided by the average mass \(\langle m\rangle =N/N_B(L)=L^{d_B}\) (1) one gets the normalized mass:

$$\begin{aligned} \mu (L,k)=\frac{m(L,k)}{\langle m\rangle }=B\,L^{-d_m}k^\beta , \end{aligned}$$
(15)

which turns out to be the invariant of the \(l_B\)-renormalization procedure, since

$$\begin{aligned} \mu (L,k)=\frac{m(L,k)}{\langle m\rangle }=\frac{l_B^{\,d_B}m'(L',k')}{L^{d_B}}=\frac{m'(L',k')}{L'^{\,d_B}}= \frac{m'(L',k')}{\langle m'\rangle }=\mu '(L',k'). \end{aligned}$$
(16)

Indeed, the normalized box mass (15) is a local network measure that behaves the same regardless of the scale of observation, as the following scaling relations clearly describe:

$$\begin{aligned} \mu (bL,k)=b^{-d_m}\mu (L,k), \end{aligned}$$
(17)

and

$$\begin{aligned} \mu '(L,k')=l_B^{-d_m}\mu (L,k). \end{aligned}$$
(18)

A proper perspective on the meaning of these two relations is gained when comparing them with the corresponding relations for classical fractals, namely Eqs. (5) and (6). From this perspective, the scaling exponent \(d_m\) appears to be the self-similarity dimension of fractal complex networks, which, remarkably, is different from the box dimension \(d_B\).

Scale-free property

At this point, we would like to emphasize that our considerations so far have made no use of scale-free node degree distributions, whose invariance under the renormalization procedure is considered an attribute of fractal networks1,2,3. Interestingly, this clearly shows the otherwise obvious fact that fractal networks need not have the scale-free property. Nevertheless, when they do have this property, then both the node degree distribution, \(P(k)\sim k^{-\gamma }\), and the box mass distribution, \(P(m)\sim m^{-\delta }\), are invariant under the box-covering renormalization procedure, their invariance being a consequence of the already discussed geometric self-similarity of boxes and of the scale-free property of the distribution of normalized masses, \(P(\mu )\sim \mu ^{-\delta }\), from which P(m) inherits its characteristic exponent \(\delta \) (see Fig. 2b,c).

To show this, let us assume that \(P(\mu )\) is scale-free:

$$\begin{aligned} P(\mu ;L)\sim \mu ^{-\delta }, \end{aligned}$$
(19)

where, by writing \(P(\mu ;L)\) instead of \(P(\mu )\), we emphasize that all boxes in the network have the same diameter L. To clarify, this distribution refers to the normalized masses of non-overlapping boxes of diameter L used to cover the network. The invariance of this distribution in networks after \(l_B\)-renormalization is due to property (16), which implies that \(P(\mu ;L)=P(\mu ';L')\), i.e.

$$\begin{aligned} P'(\mu ';L')\sim \mu '^{-\delta }. \end{aligned}$$
(20)

Now, having the relationship between \(\mu \), m and k (15) and using it together with Eq. (19) in the balance equations between the corresponding distributions, i.e. \(P(\mu )d\mu =P(m)dm\) and \(P(\mu )d\mu =P(k)dk\), it is easy to show that

$$\begin{aligned} P(m;L)\sim m^{-\delta }, \end{aligned}$$
(21)

and

$$\begin{aligned} P(k;L)\sim k^{-\gamma }, \end{aligned}$$
(22)

where the characteristic exponent \(\gamma \) is given by:

$$\begin{aligned} \gamma =1+\beta (\delta -1). \end{aligned}$$
(23)

The invariance of these distributions in networks after \(l_B\)-renormalization is obvious due to Eq. (20).

The above reasoning shows that the geometric self-similarity of the boxes (16)–(18) and the scale-free distribution of their normalized masses (19) by themselves guarantee the invariance of P(k) and P(m) under renormalization. Another consequence of these two assumptions (i.e. self-similarity and scale-freeness), which is not obvious, although it may seem so at first glance, is the independence of \(P(\mu ;L)\) (19) from the diameter L of the boxes (of course, the same applies to \(P'(\mu ';L')\)). In general, this feature can be shown to be true by comparing the numbers of boxes having the same hub nodes when the network is covered with boxes of different diameters. Because the diameter and the degree of the hub determine the mass of the box, such a comparison comes down to comparing the numbers of boxes with the given normalized masses:

$$\begin{aligned} N_B\!\left( \mu \!(L,k)\right) \,d\!\mu (L,k)=N_B\!\left( \mu \!(bL,k)\right) \,d\!\mu (bL,k), \end{aligned}$$
(24)

where the relationship between the considered masses is determined by Eq. (17). Making the appropriate substitutions in this equation, i.e. \(N_B(\mu (L,k))=N_B(L)\,P(\mu ;L)\sim L^{-d_B}(L^{-d_m}k^\beta )^{-\delta }\) and \(N_B(\mu (bL,k))=N_B(bL)\,P(\mu ;bL)\sim (bL)^{-d_B}P(\mu ;bL)\), cf. Eqs. (1), (15) and (19), not only do we confirm that the distribution \(P(\mu ;bL)\) is scale-free regardless of b (19), but we also obtain a new relation between the scaling exponents:

$$\begin{aligned} \delta =1+\frac{d_B}{d_m}. \end{aligned}$$
(25)

Interestingly, using Eqs. (13) and (23), the above relation can be easily transformed into the well-known relation1

$$\begin{aligned} \gamma =1+\frac{d_B}{d_k}. \end{aligned}$$
(26)
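The mutual consistency of the scaling relations (13), (23), (25), and (26) can be verified directly; the sketch below uses arbitrary illustrative values of the three independent exponents (not values taken from any network in this paper):

```python
# Three independent exponents (arbitrary illustrative values)
d_B, d_k, beta = 1.9, 0.55, 0.8

d_m = beta * d_k                      # Eq. (13)
delta = 1 + d_B / d_m                 # Eq. (25)
gamma_micro = 1 + beta * (delta - 1)  # Eq. (23), via microscopic exponents
gamma_macro = 1 + d_B / d_k           # Eq. (26), via macroscopic exponents

# Both routes to gamma must agree
assert abs(gamma_micro - gamma_macro) < 1e-12
```

The agreement is algebraic, not numerical coincidence: substituting Eqs. (13) and (25) into Eq. (23) gives \(\gamma =1+\beta \,d_B/(\beta d_k)=1+d_B/d_k\), i.e. Eq. (26).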

At this point, a natural question to ask is: How is it possible that the relation (26) was originally derived without having to refer to the self-similarity of the boxes? In fact, as we show below, self-similarity cannot be ignored, and the derivation described in Ref.1 takes it into account, albeit implicitly.

More precisely, in the original reasoning leading to Eq. (26), one starts with the following equation:

$$\begin{aligned} N(k)dk=N'(k')dk', \end{aligned}$$
(27)

where N(k) (respectively, \(N'(k')\)) is the number of nodes with k (respectively, \(k'\)) links in the network before (after) renormalization. Then, substitutions are made in this equation: \(N(k)=NP(k)\) and \(N'(k')=N'P'(k')\), where N and \(N'=N_B(l_B)=Nl_B^{-d_B}\) (1) stand for the number of nodes in the network before and after renormalization, respectively. These substitutions lead to the following density balance equation:

$$\begin{aligned} P(k)dk=l_B^{-d_B}P'(k')dk', \end{aligned}$$
(28)

from which Eq. (26) is obtained under the assumptions that both node degree distributions are scale-free with the same scaling exponent, i.e. \(P(k)\sim k^{-\gamma }\) and \(P'(k')\sim k'^{-\gamma }\), and that Eq. (10) holds between k and \(k'\). It should be emphasized, however, that what underlies the validity of these assumptions is the geometric self-similarity of the boxes and the scale-free distribution of their masses. Furthermore, this derivation is itself a special case of more general considerations, in which the starting point is the following equation:

$$\begin{aligned} N_B\!\left( \mu \!(L,k)\right) \,d\!\mu (L,k)=N_B'\!\left( \mu '\!(L,k')\right) \,d\!\mu '(L,k'), \end{aligned}$$
(29)

whose logic is similar to that behind Eq. (24). To explain, the left-hand side of Eq. (29) represents the number of boxes with diameter L and hub degree k in the network before renormalization, while the right-hand side is the number of boxes with the same diameter L and hubs of degree \(k'\) in the network after \(l_B\)-renormalization. The numbers of these boxes must match because hubs of degree \(k'\) in the network after renormalization arise from those \(l_B\)-boxes in the network before renormalization that contained hubs of degree k. The relation between the masses of the considered boxes is given by Eq. (18).

Interestingly, for an arbitrary value of L, the scaling analysis of Eq. (29) leads to the scaling relation (25). However, when \(L\!=\!1\) is assumed, Eq. (29) can be transformed into Eq. (27). In particular, the left-hand side of Eq. (29) becomes: \(N_B(\mu (1,k))d\mu (1,k)=NP(\mu (1,k))\frac{d\mu (1,k)}{dk}dk=\beta Nk^{-\beta (\delta -1)-1}dk=\beta Nk^{-\gamma }dk=\beta N(k)dk\), where Eqs. (1), (19), (15), and (23) were used one by one. Similar transformations applied to the right-hand side of Eq. (29) give: \(N_B'(\mu '(1,k'))d\mu '(1,k')=\beta N'(k')dk'\), which was to be shown.

From microscopic to macroscopic scaling exponents in real and model-based fractal networks

All the scaling exponents discussed in this article, which describe fractal complex networks, can be divided into two groups. The first group refers to the macroscopic characteristics of the network (\(d_B\), \(\gamma \), and \(\delta \)), and the second group includes the exponents that characterize the network structure at the microscopic level (\(d_k\), \(d_m\), \(\alpha \) and \(\beta \)). Interestingly, the exponents from both groups are related to each other and, as in the scaling theory of critical phenomena, only a few of them, three to be exact, are independent. The choice of the three fundamental exponents depends on the focus of the study. Here, to validate our results in real and model-based fractal networks, we take the easier-to-measure macroscopic exponents as independent. This choice results in the following set of test relations, cf. Eqs. (25) and (26):

$$\begin{aligned} d_m=\frac{d_B}{\delta -1},\;\;\;\;\;\;\;\;\;\;d_k=\frac{d_B}{\gamma -1}, \end{aligned}$$
(30)

and, cf. Eqs. (11) and (13):

$$\begin{aligned} \alpha =\frac{\delta -2}{\delta -1}\,d_B,\;\;\;\;\;\;\;\;\;\;\beta =\frac{\gamma -1}{\delta -1}, \end{aligned}$$
(31)

of which only the relation for \(d_k\) (30) has previously been verified in real1 and model2 networks; the results of validating relations (31) are summarized below.
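In code, the test relations (30) and (31) amount to simple arithmetic. The helper below is our own sketch, with illustrative input values rather than the empirical exponents from Table 1; it also checks the internal consistency condition \(d_B=\alpha +\beta d_k\), Eq. (11):

```python
def microscopic_exponents(d_B, gamma, delta):
    """Microscopic exponents from the macroscopic ones, Eqs. (30)-(31)."""
    d_m = d_B / (delta - 1)
    d_k = d_B / (gamma - 1)
    alpha = (delta - 2) / (delta - 1) * d_B
    beta = (gamma - 1) / (delta - 1)
    # Consistency with Eq. (11): d_B = alpha + beta * d_k
    assert abs(alpha + beta * d_k - d_B) < 1e-12
    return d_m, d_k, alpha, beta

# Illustrative macroscopic exponents (not taken from Table 1)
d_m, d_k, alpha, beta = microscopic_exponents(d_B=1.9, gamma=2.6, delta=3.0)
```

The built-in assertion passes for any valid input, since \(\alpha +\beta d_k=d_B(\delta -2)/(\delta -1)+d_B/(\delta -1)=d_B\).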

The real networks analyzed in this paper come from various fields and represent information, social, and biological networks. We analyzed: (1) a sample of the WWW with nodes corresponding to web pages and links standing for hyperlinks31; (2) a coauthorship network (DBLP), where nodes are scientists and edges are placed between two scientists if they have co-authored a paper32,33; (3) a functional brain network, which reflects the correlation between the activity of different areas in the human brain34,35. In addition to real networks, we have also analyzed several fractal network models, including our own network model, which is based on nested BA networks36, the Song–Havlin–Makse (SHM) model2 and (u,v)-flowers20. Detailed information on all these networks (real and synthetic) can be found in “Methods” section.

Table 1 Values of the scaling exponents for various fractal networks.

Table 1 presents the theoretical and empirical values of the scaling exponents of all analyzed networks. The theoretical values, which are given in brackets, are of two types. For the deterministic model-based networks—the SHM model and (u,v)-flowers—their values can be calculated using the appropriate formulas, the details of which are provided in “Methods” section (more precisely: “SHM model” and “(u,v)-Flowers” subsections, respectively). For real networks and for the numerical model of nested BA, the theoretical values of \(\alpha \) and \(\beta \) were calculated from Eqs. (31) using the empirical values of the macroscopic exponents.

Correspondingly, the empirical values of the scaling exponents were calculated from Figs. 3 and 4 according to the following protocol (the same for each network): First, we determined the box dimension \(d_B\) of these networks, resulting from tiling the network with boxes of different sizes \(l_B\), see Figs. 3 and 4a–c. To this end, we used the algorithm developed by Song et al.37, and in the case of the deterministic models of fractal networks shown in Fig. 4, we additionally analysed the tiling consistent with their deterministic construction procedures, finding that it uses a much smaller number of boxes than Song’s method. We confirmed that the value of \(d_B\) after renormalization (even multiple times) remains the same as before renormalization, see Fig. 2a. We then examined the invariance of the distributions P(k) and \(P(\mu )\). The given values of \(l_B\) refer to the diameter of the boxes used to renormalize the network. As already stated, \(l_B=1\) refers to the original network - before renormalization. In the case of the \(P(\mu )\) distributions, the diameters L of the boxes whose mass was studied are also given. With respect to these distributions, the provided values of \(l_B\) and L should be read as follows: The relevant distribution \(P(\mu )\) refers to the network that was first renormalized with boxes of diameter \(l_B\) and then covered with non-overlapping boxes of diameter L. Regarding \(P(\mu )\), however, due to the low statistical reliability of the data for \(l_B,L>1\), in this paper we only present data for the largest networks (i.e. the WWW and model-based networks). It should be noted that in all the networks we studied, both distributions are scale-invariant, with well-defined characteristic exponents \(\gamma \) and \(\delta \) (see Figs. 3 and 4d–i).
Lastly, having determined the macroscopic scaling exponents \(d_B\), \(\gamma \), and \(\delta \), we were able to calculate the theoretical values of the local exponents \(\alpha \) and \(\beta \), Eqs. (31), which we used to obtain the adequately rescaled box masses needed to determine their empirical values (see Figs. 3 and 4j–l). In particular, to obtain the empirical value of \(\alpha \), the masses of all the internally connected boxes, obtained while tiling the network with different \(l_B\)-boxes, were divided by the hub’s degree raised to the power of the theoretically obtained \(\beta \). Such rescaled masses \(m/k^\beta \) were then plotted against the actual diameters of the boxes, \(L<l_B\), which had been determined individually for each box. A similar procedure was applied to determine the empirical value of \(\beta \). (For more details, see the subsection titled “Numerical calculation of microscopic scaling exponents” in the “Methods” section.)
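The final fitting step can be sketched as a least-squares fit on a double logarithmic scale. The code below is our own illustration: it uses synthetic boxes generated exactly from Eq. (8), rather than boxes extracted from a real network, and recovers \(\alpha \) from the rescaled masses \(m/k^\beta \):

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

# Synthetic boxes (L, k, m) obeying m = B * L**alpha * k**beta exactly, Eq. (8)
B, alpha_true, beta = 2.0, 1.3, 0.8
boxes = [(L, k, B * L ** alpha_true * k ** beta)
         for L in (2, 3, 5, 8) for k in (4, 9, 16)]

# Rescale the masses by k**beta and fit their diameter dependence
rescaled = [(L, m / k ** beta) for L, k, m in boxes]
alpha_fit = loglog_slope([L for L, _ in rescaled], [y for _, y in rescaled])
```

With real boxes the points scatter around the power law and the fitted slope is only an estimate; here, by construction, the fit recovers \(\alpha \) exactly. Fitting \(\beta \) proceeds analogously, with \(m/L^\alpha \) plotted against k.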

Figure 3

Scale-invariant and self-similar scaling in real fractal networks. The graphs placed in the same column refer to the same network (i.e. WWW, brain and DBLP, respectively, starting from the left), and those placed in the same row to the same scaling relation. In particular, the following graphs show: (ac) A log-log plot of \(N_B\) versus \(l_B\) revealing the fractal nature of the studied network according to Eq. (1). (df) Invariance of the node degree distribution P(k) under the renormalization for different box sizes \(l_B\) (the case of \(l_B=1\) corresponds to the original network). (gi) Invariance of the normalized mass box distribution \(P(\mu )\) (where L represents diameter of the considered boxes). (jl) Scaling of the masses of boxes according to Eq. (8). (See the description given in the main text.).

Figure 4

Scale-invariant and self-similar scaling in model-based fractal networks. As in Fig. 3, the graphs placed in the same column refer to the same model of fractal networks (i.e. the nested BA networks, the SHM model, and (u,v)-flowers, respectively, starting from the left), and those placed in the same row to the same scaling relation. The presentation of data in this figure compared to Fig. 3 differs only in that the graphs relating to deterministic models show two types of points: closed and open. For these models, closed points refer to the box-covering method resulting from their deterministic construction procedure, which uses a significantly smaller number of boxes than the Song’s algorithm, whose results correspond to open points (see the main text of the paper for more detailed explanation). Table 1 shows the results obtained based on the closed points.

Interestingly, in the case of the deterministic fractal network models, only the box-covering method that takes into account the network construction procedure (while using a smaller number of boxes than Song’s method) leads to microscopic exponents consistent with their theoretical predictions (cf. Fig. 4k,l and Table 1). In the case of these networks, the poor performance of Song’s method (e.g., compare Fig. 7a vs. Fig. 7b) is especially visible in the range of small masses of the \(P(\mu )\) distributions (see the inset graphs in Fig. 4h,i). We suspect that this latter observation may explain the occurrence of two different scaling behaviours (for small and large \(\mu \)) in \(P(\mu )\) of other fractal networks (cf. Figs. 3g–i and 4g).

Perspectives

The origins and consequences of fractality constitute one of the three main research directions in the geometry of complex networks24, next to the hyperbolic geometry of hidden network spaces38,39 and the geometry induced by dynamic processes in networks40,41,42. Although these three geometries are defined differently, owing to the different definitions of distance in each of them, there is no doubt that they must be closely related to each other. While these relationships have yet to be explored, evidence of their existence can be found in our results.

For example, when examining deterministic models of fractal networks (the SHM model and (u,v)-flowers, see “SHM model” and “(u,v)-Flowers” subsections, respectively), we noticed that while macroscopic scaling exponents are very stable, in the sense that they do not depend on the box-covering method37,43, this may not be the case for microscopic exponents. In particular, in the mentioned models, gathering nodes according to their kinship (which is optimal, because it corresponds to the smallest number of boxes) gives values of the microscopic exponents closest to their theoretical predictions. Since the degree of kinship can be thought of as a distance in some metric space (the space of kinship), this observation is important. In fact, the fractality of these models may be considered a feature they inherit from their kinship spaces. Here, natural questions arise, such as whether the fractality of real complex networks may result from the properties of hidden (similarity-based) metric spaces44. Similar studies on community structure confirm the existence of such a relationship45,46,47. The mention of community structure is not entirely accidental here because, as the example of the DBLP network shows (in which the removal of weak ties reveals its fractal properties; see also26,33), the fat-tailed community size distribution48,49 may result from the scale-invariant distributions of box masses observed in (not necessarily tree-like) fractal skeletons13,14 of these networks.

The second thread that we would like to emphasize concerns the geometry induced by diffusion-like dynamic processes in networks40,41,42. In classical fractals, this kind of geometry is closely related to the cluster-growing method of calculating their fractal dimensions, which is in fact a way of measuring distance27. In complex networks, establishing an analogous relationship has not been possible so far due to the lack of theoretical foundations distinguishing between the box dimension \(d_B\), Eq. (7) (which can be determined by the box-covering method), and the spreading dimension \(\alpha \), Eq. (12) (which corresponds to the cluster-growing method). It seems that the scaling theory of fractal complex networks presented in this paper has the potential to break this impasse. This is all the more likely since, in its general findings, with box masses depending not only on the diameter of the boxes but also on the degree of the best-connected node inside the box, the theory refers to the well-established heterogeneous (degree-based) mean-field theory commonly used to study dynamical processes on complex networks50.

Methods

Real and model-based fractal networks analysed in the paper

The real networks analysed include:

  • WWW network: The web subset analysed consists of 326 k web pages that are linked if there is a URL link from one page to another31. This dataset has been analysed for fractal properties in many other papers (see e.g.1,2). It is publicly available in many network repositories (e.g.52).

  • DBLP coauthorship network: DBLP is a digital library of article records published in computer science32,53. In this study, we use the 12th version of the dataset (DBLP-Citation-network V12, released in April 2020, which contains information on approximately 4.9 M articles published mostly during the last 20 years). We processed the raw DBLP data ourselves into the form of a coauthorship network, from which we extracted the network backbone by imposing a threshold on the minimum number of joint papers (\(\ge 25\)) two scientists should have. This procedure significantly reduces the size of the studied network (from 2.9 M nodes and 12.5 M links to 2.5 k nodes and 3.2 k edges), but thanks to it the network becomes naturally fractal.

  • Human brain networks: The networks are based on functional magnetic resonance imaging (fMRI). The fMRI data consists of temporal series, known as the blood oxygen level dependent (BOLD) signals, from different brain regions. To build brain networks, the correlations \(C_{ij}\) between the BOLD signals are calculated and the two nodes (brain regions) are connected if \(C_{ij}\) is greater than some threshold value T. In our case we assume \(T=0.85\). The brain networks analysed here were used in34,35 and can be found at54.
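For illustration, the described thresholding step can be sketched as follows (a minimal example of our own; the \(3\times 3\) correlation matrix is synthetic and does not come from the referenced fMRI data):

```python
import numpy as np

def threshold_network(C, T=0.85):
    """Binary adjacency matrix from a correlation matrix: two nodes are
    connected when C_ij > T; self-loops are removed."""
    A = (np.asarray(C) > T).astype(int)
    np.fill_diagonal(A, 0)
    return A

# synthetic 3x3 correlation matrix (illustration only)
C = np.array([[1.0, 0.9, 0.3],
              [0.9, 1.0, 0.1],
              [0.3, 0.1, 1.0]])
A = threshold_network(C, T=0.85)     # only the pair (0, 1) is connected
```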

Figure 5
figure 5

Single step of the construction procedure of the nested BA network. First, one node is chosen with probability proportional to its degree (in the figure, \(k=4\)). Then the node is replaced by the corresponding BA network, whose best-connected node has the same degree, \(k=4\), as the removed one. Green edges of the removed node are reconnected to randomly selected nodes of the newly created subnetwork.

The studied models of fractal networks include:

  • SHM model: The details of the model are presented in “SHM model” subsection, where local scaling exponents for this model were also derived.

  • (u,v)-Flowers: The details of the model are presented in “(u,v)-Flowers” subsection, where local scaling exponents for this model were also derived.

  • Nested BA networks: The nested BA network model has three parameters: N - the number of nodes, \(k_{max}\) - the degree of the best connected node in the network, and m - the number of edges by which the newly created node connects to the already existing nodes. The network evolution procedure is as follows:

    1. First, a BA network with the hub of degree \(k_{max}\) is created (that is, the network grows until one of the nodes reaches degree \(k_{max}\)).

    2. Then, as long as the size of the network is less than N (see Fig. 5):

       (a) a node is chosen proportionally to its degree k and it is replaced with a BA subnetwork with the largest node degree k;

       (b) edges that were connected to the removed node are reconnected to randomly selected nodes of the newly created subnetwork.
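For illustration, the above procedure can be sketched in Python as follows (a minimal implementation of our own, with the adjacency stored as a dictionary of neighbour sets; all function and parameter names are ours):

```python
import random

def ba_graph(m, k_stop, label=0):
    """Grow a BA network (nodes labelled label, label+1, ...) until its
    best-connected node reaches degree k_stop."""
    adj = {label + i: set() for i in range(m + 1)}
    for i in adj:                                  # complete seed graph
        adj[i] = {j for j in adj if j != i}
    pool = [v for v in adj for _ in adj[v]]        # degree-weighted pool
    nxt = label + m + 1
    while max(len(adj[v]) for v in adj) < k_stop:
        targets = set()
        while len(targets) < m:                    # m distinct targets
            targets.add(random.choice(pool))
        adj[nxt] = set(targets)
        for v in targets:
            adj[v].add(nxt)
            pool += [v, nxt]
        nxt += 1
    return adj

def nested_ba(N, k_max, m, seed=None):
    """Nested BA network: grow a BA seed until the hub reaches k_max,
    then repeatedly replace degree-chosen nodes by BA subnetworks."""
    random.seed(seed)
    adj = ba_graph(m, k_max)
    while len(adj) < N:
        v = random.choice([u for u in adj for _ in adj[u]])  # prob ~ k
        sub = ba_graph(m, len(adj[v]), label=max(adj) + 1)
        old_neigh = adj.pop(v)
        for u in old_neigh:
            adj[u].discard(v)
        adj.update(sub)
        sub_nodes = list(sub)
        for u in old_neigh:            # rewire dangling edges at random
            w = random.choice(sub_nodes)
            adj[u].add(w)
            adj[w].add(u)
    return adj

g = nested_ba(300, 20, 2, seed=7)      # a small example network
```

Note that for \(k\le m\) the replacing subnetwork degenerates to the complete seed graph on \(m+1\) nodes; a production implementation would treat this case, as well as ties in the stopping rule, more carefully.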

Numerical calculation of microscopic scaling exponents

In Figs. 3j–l and 4j–l we presented the microscopic scaling exponents \(\alpha \) and \(\beta \). Their values result from fitting a straight line to the set of points marked with blue circles and red triangles, respectively. These points represent the geometric means over logarithmically equal-sized bins of the original data, which were obtained as follows. First, we estimated the range of \(l_B\) for which the dependence of \(N_B\) on \(l_B\) is linear in log-log scale. In Fig. 6a, which shows an example analysis of the nested BA model, this range is indicated by a gray rectangle. Then, for each \(l_B\) in this range, we performed box covering, obtaining a triple of values (m, L, k) for each box. Based on the set of such triples and the theoretically calculated value of the \(\alpha \) or \(\beta \) exponent, a set of points \((k, m/L^\alpha )\) or \((L,m/k^\beta )\) was created, respectively. In Fig. 6d, the latter set is represented by yellow circles. Having this raw data, logarithmic binning was performed and the geometric mean was calculated for each bin. Figure 6d, in addition to the blue points denoting the geometric means, also shows the geometric standard deviations and the fitted line, whose slope corresponds to the \(\alpha \) value we are looking for.
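The binning and fitting steps described above can be sketched as follows (our own minimal implementation, tested here on a synthetic pure power law rather than on actual box-covering data):

```python
import numpy as np

def log_binned_geomean(x, y, n_bins=20):
    """Geometric means of y over logarithmically equal-sized bins of x.
    Returns bin centres and geometric means (empty bins are dropped)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    edges = np.logspace(np.log10(x.min()), np.log10(x.max()), n_bins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    centres, gmeans = [], []
    for b in range(n_bins):
        sel = idx == b
        if sel.any():
            centres.append(np.sqrt(edges[b] * edges[b + 1]))
            gmeans.append(np.exp(np.log(y[sel]).mean()))
    return np.array(centres), np.array(gmeans)

def loglog_slope(x, y):
    """Slope of a straight-line fit in log-log scale."""
    slope, _ = np.polyfit(np.log(x), np.log(y), 1)
    return slope

# sanity check on a synthetic pure power law, y = x^1.5
x = np.logspace(0, 3, 2000)
cx, gm = log_binned_geomean(x, x ** 1.5, n_bins=15)
slope = loglog_slope(cx, gm)           # should be close to 1.5
```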

Figure 6
figure 6

Numerical calculation of microscopic scaling exponents. Detailed description of this figure is given in the text.

If we restrict our analysis to one specific value of \(l_B\) (for example, in Fig. 6b we take \(l_B = 32\), while in Fig. 6c we take \(l_B = 8\)), we end up with a much smaller set of triples that would not allow for reliable results. These limitations are particularly severe for the small real networks, i.e. the DBLP and brain networks, with sizes \(N<3\cdot 10^3\) each. The blue lines in Fig. 6b,c are not a result of fitting but are shown for comparison purposes only.

Microscopic scaling exponents for deterministic fractal network models

In this section, we derive exact formulas for the microscopic scaling exponents (\(\alpha \) and \(\beta \)) characterising deterministic fractal network models.

SHM model

In the Song–Havlin–Makse (SHM) model2, at \(t=0\), the network starts to grow from two nodes connected by one link. Then, in each subsequent time step, the next \((t+1)\)-generation of the network recursively emerges from the t-generation: s new nodes are attached to the endpoints of each link of the previous generation, old links are removed from the network, and new links are created in place of those removed, connecting pairs of offspring-nodes attached to the endpoints of the deleted links.
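A single generation of this construction can be sketched as follows (our own minimal implementation; edges are kept as a plain list of pairs, and one offspring pair is bridged per removed link):

```python
def shm_step(edges, s, next_node):
    """One SHM generation: attach s offspring to each endpoint of every
    link, remove the old link, and bridge one offspring pair instead."""
    new_edges = []
    for u, v in edges:
        u_kids = list(range(next_node, next_node + s)); next_node += s
        v_kids = list(range(next_node, next_node + s)); next_node += s
        new_edges += [(u, c) for c in u_kids] + [(v, c) for c in v_kids]
        new_edges.append((u_kids[0], v_kids[0]))  # replaces the old link
    return new_edges, next_node

# three generations with s = 2: each step multiplies E by n = 2s + 1 = 5
edges, nxt = [(0, 1)], 2
for _ in range(3):
    edges, nxt = shm_step(edges, 2, nxt)
```

Since the network stays a tree, \(E=N-1\) holds in every generation, and the degree of each of the two initial nodes multiplies by s per generation, in agreement with Eq. (32).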

As a result of this construction procedure, in successive generations, every node i increases its degree multiplicatively:

$$\begin{aligned} k_i(t, t_i)=s\,k_i(t-1, t_i)=s^{\Delta t}, \end{aligned}$$
(32)

where

$$\begin{aligned} \Delta t=t-t_i, \end{aligned}$$
(33)

is the time that has elapsed since the node appeared in the network for the first time, at time \(t_i\), assuming that its initial degree was equal to \(k_i(t_i,t_i)=1\) (in fact, the initial degree of one newcomer out of s is \(k_i(t_i,t_i)=2\); however, since the nodes’ initial degrees do not affect our further calculations, we will not be concerned with this minor oversight).

It is also easy to see that similar multiplicative dynamics are shown by the diameter \(L_i\) and mass \(M_i\) of the largest box in which the i-th node of degree \(k_i(t,t_i)\) acts as a hub (such a box consists of all nodes that can be treated as the offspring of i, i.e. those for which the i-th node is the parent, grandparent, great-grandparent, etc.). Specifically, the maximum diameter of such a box is given by:

$$\begin{aligned} L_i(t,t_i)=a\,L_i(t-1,t_i)=a^{\Delta t}, \end{aligned}$$
(34)

with \(a=3\) (the value of a results from the construction procedure of the model, since removing the old edges and replacing them with new ones, which are created only between newly added nodes, is formally equivalent to replacing the old edges with paths of length 3, see Ref.2) and \(L_i(t_i,t_i)=1\), whereas its mass satisfies the following recurrence relation:

$$\begin{aligned} M_i(t,t_i)=nM_i(t-1,t_i)=n^{\Delta t}, \end{aligned}$$
(35)

where \(n=2s+1\) and \(M_i(t_i,t_i)=1\). The reasoning behind the factor n is the following: at time t, the largest box that can be created around the node i consists of the same nodes that formed its largest box at time \(t-1\) (the number of which is: \(M_i(t-1,t_i)\)) and all descendants of those nodes that were created in the last time step (the number of which, due to the tree-like structure of the network, is: \(2s(M_i(t-1,t_i)-1)\simeq 2sM_i(t-1,t_i)\)). This leads to the following relation: \(M_i(t,t_i)=M_i(t-1,t_i)+2sM_i(t-1,t_i)\), which is equivalent to the first part of Eq. (35).

At this point, it is worth noting a few remarks regarding Eqs. (34) and (35).

First, Eq. (34), when applied to the largest boxes with the oldest nodes, i.e. those from which the network’s evolution began, shows how the diameter of the entire network changes in subsequent generations:

$$\begin{aligned} L(t)=L_i(t,0)=a\,L_i(t-1,0)=a\,L(t-1). \end{aligned}$$
(36)

Correspondingly, by applying Eq. (35) to these boxes, one finds the analogous formula for the total number of nodes in the network:

$$\begin{aligned} N(t)=M_i(t,0)=n\,M_i(t-1,0)=n\,N(t-1). \end{aligned}$$
(37)

In Ref.2, the above recurrence relations, Eqs. (36) and (37), together with Eq. (32), were used to derive exact expressions for the box dimension of the considered fractal network model, cf. Eq. (1):

$$\begin{aligned} d_B=\frac{\ln n}{\ln a}, \end{aligned}$$
(38)

and for its degree exponent, cf. Eq. (8):

$$\begin{aligned} d_k=\frac{\ln s}{\ln a}, \end{aligned}$$
(39)

thus enabling verification of the scaling relation (17), according to which, in this model, the characteristic exponent of the degree distribution is given by:

$$\begin{aligned} \gamma =1+\frac{d_B}{d_k}=1+\frac{\ln n}{\ln s}. \end{aligned}$$
(40)

The second remark regarding Eqs. (34) and (35) is that boxes containing hubs of degrees \(k_i(t,t_i)>1\) may have diameters and masses smaller than \(L_i(t,t_i)\) and \(M_i(t,t_i)\), respectively. For example, when the diameter of the i-th box is \(l_i=1\), the box is confined to the node itself and, as a result, its mass is equal to \(m_i=1\). Similarly, when \(l_i=a=3\), the box, apart from the hub itself, also contains all its neighbours, making the mass of the box equal to \(m_i=1+k_i\simeq k_i\). More generally, the diameter of the i-th box can be equal to:

$$\begin{aligned} l_i=a^\tau , \end{aligned}$$
(41)

where

$$\begin{aligned} 0\le \tau \le \Delta t, \end{aligned}$$
(42)

with the value of \(\tau \) affecting its mass, which can be determined from:

$$\begin{aligned} m_i=n^\tau k_i(t-\tau ,t_i). \end{aligned}$$
(43)

In fact, the rationale behind Eq. (43) is the same as for Eq. (35). The only difference between \(M_i\) and \(m_i\) is that the initial condition for the multiplicative growth of the latter is \(m_i=1+k_i(t-\tau ,t_i)\simeq k_i(t-\tau ,t_i)\) and not just \(M_i(t_i,t_i)=1\). Now, substituting Eq. (32) into (43), one gets:

$$\begin{aligned} m_i=\left( \frac{n}{s}\right) ^\tau k_i(t,t_i). \end{aligned}$$
(44)

Then, using Eq. (41) in (44), one obtains the following relation for the mass of the box as a function of its diameter and hub’s degree, cf. Eq. (9):

$$\begin{aligned} m_i=l_i^{\alpha }k_i^{\beta }, \end{aligned}$$
(45)

where the local scaling exponents are given by:

$$\begin{aligned} \alpha =\frac{\ln n-\ln s}{\ln a}, \end{aligned}$$
(46)

and

$$\begin{aligned} \beta =1. \end{aligned}$$
(47)

It is easy to see that, together with the previously obtained expressions for \(d_B\) (38) and \(d_k\) (39), the above expressions for \(\alpha \) and \(\beta \) satisfy the scaling relation \(d_B=\alpha +d_k\,\beta \), Eq. (10), which was derived in the main text of the paper.
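These expressions can also be verified numerically, e.g. for \(s=2\) (an arbitrary choice; \(a=3\) is fixed by the construction):

```python
from math import log, isclose

s, a = 2, 3                          # a = 3 is fixed by the construction
n = 2 * s + 1
d_B = log(n) / log(a)                # Eq. (38)
d_k = log(s) / log(a)                # Eq. (39)
alpha = (log(n) - log(s)) / log(a)   # Eq. (46)
beta = 1.0                           # Eq. (47)

assert isclose(d_B, alpha + d_k * beta)              # Eq. (10)
assert isclose(1 + d_B / d_k, 1 + log(n) / log(s))   # Eq. (40)
```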

Figure 7
figure 7

(2,2)-Flowers of generation \(t=5\). (a) The network covered with boxes of diameter \(l_B=8\) according to the degree of kinship of the nodes. (b) The network covered with boxes of diameter \(l_B=8\) according to the algorithm developed by Song et al.37. It is clear that in the studied deterministic network, Song’s random algorithm performs worse than the covering according to the kinship space.

(u,v)-flowers

In the deterministic fractal network model called (u,v)-flowers20, networks start to grow, at \(t=0\), from two nodes, the so-called initial hubs, connected by one link. Then, subsequent \((t+1)\)-generations of the model are obtained from t-generations by replacing each link by two parallel paths of \(u>1\) and \(v\ge u\) links. An essential property of this construction procedure, not obvious at first glance, is its equivalence to another procedure in which, to obtain the \((t+1)\)-generation, one produces \(w=u+v\) copies of the previous t-generation and then joins the copies at their initial hubs.
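The first (link-replacement) method of construction can be sketched as follows (our own minimal implementation, with edges stored as a list of pairs):

```python
def flower_step(edges, u, v, next_node):
    """One generation of the (u,v)-flower: replace every link by two
    parallel paths of u and v links, respectively."""
    new_edges = []
    for a, b in edges:
        for length in (u, v):
            inner = list(range(next_node, next_node + length - 1))
            next_node += length - 1
            path = [a] + inner + [b]
            new_edges += list(zip(path, path[1:]))
    return new_edges, next_node

# (2,2)-flower: E_t = 4^t, Eq. (48), and N_t = 4 N_{t-1} - 4, Eq. (49)
edges, nxt = [(0, 1)], 2
for _ in range(3):
    edges, nxt = flower_step(edges, 2, 2, nxt)
```

The degree of each initial hub indeed doubles in every generation, consistent with Eq. (51).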

From the second method of construction, it is easy to see20 that the number of links in (u,v)-flowers of generation \(t>0\) is given by:

$$\begin{aligned} E_t=wE_{t-1}=w^t, \end{aligned}$$
(48)

the number of nodes is:

$$\begin{aligned} N_t=wN_{t-1}-w\;\propto \;w^t, \end{aligned}$$
(49)

and the diameter of the networks grows as:

$$\begin{aligned} L_t\propto u^t. \end{aligned}$$
(50)

Furthermore, by construction, the networks have only nodes of degree

$$\begin{aligned} k_i=2^n, \end{aligned}$$
(51)

where \(n=1,2,\dots ,t\), and their node degree distribution is scale-free (2) with the characteristic exponent equal to:

$$\begin{aligned} \gamma =1+\frac{\ln w}{\ln 2}. \end{aligned}$$
(52)

It was also shown that the box dimension (1) of (u,v)-flowers is:

$$\begin{aligned} d_B=\frac{\ln w}{\ln u}, \end{aligned}$$
(53)

and their degree exponent (8) is:

$$\begin{aligned} d_k=\frac{\ln 2}{\ln u}, \end{aligned}$$
(54)

in accordance with the scaling relation (14).

In what follows, we show that the local scaling exponents, \(\alpha \) and \(\beta \) (9), of the model are given by:

$$\begin{aligned} \alpha =\frac{\ln w-\ln 2}{\ln u}, \end{aligned}$$
(55)

and

$$\begin{aligned} \beta =1, \end{aligned}$$
(56)

respectively, so their values satisfy the scaling relation (10).

We first consider the scaling exponent \(\beta \). From Eq. (9), it follows that if the degree of the hub inside the box increases x times, then the mass of the box will increase \(x^\beta \) times. Correspondingly, the second method of construction of (u,v)-flowers assumes that in successive generations of these networks, the degrees of the initial hubs double, i.e. \(x=2\), which is due to the merger of two initial hubs from two copies of the network of the previous generation. Moreover, since the merged copies are identical, the masses of the boxes with the initial hubs also double, i.e. \(x^\beta =2\). Thus, we come to the conclusion that the masses of the boxes are proportional to the degrees of their hubs, which gives \(\beta =1\), i.e. Eq. (56).

To find \(\alpha \), we again consider boxes with the initial hubs of degree \(k_i=2^t\), Eq. (51), in networks of generation \(t>0\). Such boxes can be of various diameters. For example, when the diameter of the box is twice the diameter \(L_{t-1}\) of the network of the \((t-1)\)-generation, then the mass of the box is twice the number of nodes \(N_{t-1}\) in the network of the \((t-1)\)-generation. In general, when the box has a diameter of \(2L_n\) (with \(0<n<t\)), its mass is equal to (cf. Fig. 7a):

$$\begin{aligned} m_i(2L_n,2^t)=2^{t-n}N_n\propto 2^{t-n}w^n. \end{aligned}$$
(57)

Comparing the above relationship with Eq. (9) one gets:

$$\begin{aligned} m_i(2L_n,2^t)\propto 2^t\left( \frac{w}{2}\right) ^n=2^t(u^t)^\alpha , \end{aligned}$$
(58)

where \(\alpha =(\ln w-\ln 2)/\ln u\), cf. Eq. (55).
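As with the SHM model, the consistency of these exponents with the scaling relation (10) is immediate to verify numerically, e.g. for the (2,3)-flower (an arbitrary choice):

```python
from math import log, isclose

u, v = 2, 3                          # an arbitrary (u,v)-flower, u > 1
w = u + v
d_B = log(w) / log(u)                # Eq. (53)
d_k = log(2) / log(u)                # Eq. (54)
alpha = (log(w) - log(2)) / log(u)   # Eq. (55)
beta = 1.0                           # Eq. (56)

assert isclose(d_B, alpha + d_k * beta)    # scaling relation, Eq. (10)
```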