## Introduction

Bitcoin, the pioneering cryptocurrency, has brought about an unprecedented revolution in the payment industry1. Despite its traction and success over the last ten years, the original blockchain – the technological infrastructure underlying Bitcoin – suffers from some limitations that may hinder the future growth and adoption of the cryptocurrency. One of the major issues is the scalability of the system: the current number of transactions validated via this platform is between 3 and 7 transactions per second, compared for instance to thousands of transactions handled by the Visa circuit2. The lack of scalability is mainly caused by constraints on throughput of transactions, with the block size fixed at 1MB, and by the high latency – with a new block created on average only every ten minutes. Those limitations are imposed to safeguard the security of the platform against malicious attacks and are difficult to relax without major changes in the protocol.

The main solutions proposed to address the scalability issue include (i) changes to the main protocol (consensus algorithm, parameters) and (ii) sidechains and second-layer solutions (see3 for a recent technical review). Notable examples of type-(i) solutions include new consensus protocols, which would allow a faster issuance of new blocks among other new features4. Sidechains are blockchains “connected” to the main Bitcoin blockchain such that Bitcoins can be transferred bidirectionally between the main and side blockchain5. At the same time, sidechains are completely separate ecosystems whose technical features or issues would not be shared with the main blockchain. The Lightning Network (LN), instead, is a so-called second-layer technology built on top of the Bitcoin blockchain to provide “off-chain” fast payment channels between users6. By off-chain we mean that not all transactions are settled and stored on the main blockchain. In a nutshell, the idea of a Lightning channel is the following: two parties lock the same amount of money as collateral and open a channel for a certain period of time. During this time, they can then exchange money back and forth through the channel, and only the netted transaction will be eventually validated and stored on the main blockchain. If one party is malicious and does not correctly update the balance, the other can keep the collateral posted by the malicious party, as a form of insurance. Any two users can open a channel and all other participants can use one or more existing channels to route transactions off-chain upon payment of a fee to channel “owners”. The scalability problem could be solved if a sufficient number of channels were opened, implying that the Lightning Network spans across the whole pool of users of the main blockchain.

The Lightning Network topology is, indeed, relevant to understand the resilience of the system to attacks or random failures and its robustness. Measures of the network structure based on empirical data – such as degree distribution, assortativity, shortest paths length – provide an indication of the efficiency of payments’ routing and features of the system (i.e. average number of channels per user, clusters and communities, etc.)7. Experiments on random or targeted nodes removal from the network give information on the system resilience by monitoring when the original network is broken into multiple isolated clusters8,9,10. In the Lightning Network case, it has been shown that some types of targeted attacks – aimed at consuming, for instance, the channels’ liquidity of specific nodes – may yield severe consequences for the resilience of the network in terms of average payment flow and reachability11.

The topology of the network is in turn driven by users’ economic incentives to relay transactions “off-chain”. Moreover, as the Lightning fees are set by channels’ owners, an important question is how high such fees should be set in order to guarantee profits, while providing at the same time the right incentives for Bitcoin users to participate in the Lightning Network. In a recent work on simple network topologies (i.e. bidirectional channels and star graphs), the authors have estimated the demand for transactions on the main Bitcoin blockchain compared to the LN, the level of LN fees that would cover maintenance costs of the channel and their implication for the overall network security12. Indeed, transacting on the Lightning Network might impact the security of the main blockchain network by inducing a decrease in the amount of fees collected by the miners for the validation of blockchain transactions. Moreover, LN’s transaction fees have been empirically studied using a traffic simulator13.

Fees on the main blockchain are used as incentives to miners (i.e. nodes capable of validating transactions and generating new blocks) to contribute to the security of the platform. The blockchain security is associated with the platform’s decentralization, hence to the miners’ total computing power14,15. Normally, users “compete” to set up the minimal fees that would ensure their transaction to be validated within a given timeframe, as miners try to maximize the total amount of fees per block. A strand of the literature has been investigating the Bitcoin fee set-up mechanisms, the miners’ incentives and their potential correlation with risks of attacks and manipulation of the transaction history. In16 the authors use a game-theoretic model to investigate the factors influencing the value of Bitcoin fees, while in17 they also examine the interplay between fees and security of the platform, theoretically showing that the current fee model may not be sustainable in the long run. Alternative fee mechanisms have also been proposed, for instance based on auction models16, and compared with the existing one to highlight weaknesses and possible improvements.

The Bitcoin ecosystem has been already extensively investigated using approaches based on complex networks. The transactions network has been studied to understand latency issues and propagation mechanisms in peer-to-peer systems18 and inefficiencies of the process of permanent inclusion of the transactions on the blockchain19. Global and local structural properties of the users’ network in Bitcoin have also provided insights on boom-and-bust events20 and Bitcoin price dynamics21. Moreover, data on users’ behaviour and spending patterns have been used to understand the global state of the crypto-economy22 and the drivers of the growth of the network23. More generally, our paper taps into the growing literature on quantitative investigations of the cryptocurrencies landscape, including models of pricing and adoption of tokens24,25,26,27, analysis of the market structure28,29,30,31,32,33,34, price prediction based on sentiment and social interactions35,36,37,38,39,40,41,42,43, dynamical analysis of informational efficiency44, and centralization of the Bitcoin economy45.

In this paper, we investigate under which conditions in terms of blockchain and Lightning fees, average wealth and volume of transactions per user, a Lightning Network that spans a sizeable fraction of Bitcoin users – thus solving the scalability problem – emerges. We model the emergence of the Lightning Network as a (bond) percolation process on a graph, exploring how different conditions may impact its feasibility46. In particular, we consider fitness-dependent network models47,48,49,50 where the probability of creating a new edge depends on intrinsic node features collectively denoted node fitness. In the LN case, the node fitness will be defined in terms of the node wealth and activity (i.e. volume of transactions). The viability of the Lightning Network will be characterized in terms of the presence (or not) of a giant connected cluster of nodes: a non-fragmented network would, indeed, guarantee a smooth relay of payments and information between users and will incentivize off-chain transactions. Our model depends on parameters that can be all obtained – or at least estimated – from publicly available data, and is fairly robust against different choices of distributions of parameters and fitness kernels.

The paper is organized as follows. In the following section we provide a quick overview of the main Bitcoin blockchain and the main ideas behind LN. In Section “Model setup” we describe our model and provide the relevant theory, which is then applied to two specific wealth distributions (uniform and exponential) in the subsections “Uniform wealth distribution” and “Exponential wealth distribution”, respectively. In the “Results” section, we discuss the results of numerical simulations, and we provide some conclusions and outlook in Section “Discussion”. The Appendices are devoted to technical aspects of percolation theory on networks and are included to make the paper self-contained.

### The Bitcoin blockchain and Lightning Network

In this section, we summarize the main features of the Bitcoin main blockchain and Lightning Network payment layer. The Bitcoin blockchain is a distributed, shared ledger that immutably records transactions among peers in the network1,14. Transactions are bundled in blocks and chained together via cryptographic primitives to ensure that any change at any point in the transaction history would invalidate the full record. Transactions are validated for correctness, temporarily stored in memory pools and then arranged in the blocks data structure by miners: multiple miners compete using computational power to validate the next block of the chain, and therefore earn the associated reward for the service and the transaction fees, according to the Proof-of-Work consensus algorithm. Depending on the usage of the network and due to limitations in block size, waiting times can peak around 30 minutes (the typical range is around 6–8 minutes), while blockchain fees per transaction exhibit a broad range of variability, from a few cents to 40–50 USD (data taken from https://www.blockchain.com/charts).

The idea behind the creation of the Lightning Network6 is, therefore, to devise a network for frequent and fast micro-transactions that can be performed at low transactions fees. The basic components of the Lightning Network are payment channels (schematically shown in Fig. 1, panel A), enabling trustless transfers between users. In the typical payment channel implementation, a theoretically unlimited amount of payments can be made, with only two transactions broadcast on the blockchain. In addition to a reduction of the number of blockchain transactions and associated costs, payment channels also offer the advantage of speed and, importantly, the ability of users to recover their funds if one of the parties is malicious.

A channel is established between two parties by locking an initial amount of funds, for instance m Bitcoins for each user, on the main blockchain, which represents the maximum amount of Bitcoins that can be transferred over the channel. Funds are locked on so-called 2-of-2 multisignature addresses14, which can be unlocked upon providing the signature of both interested parties. For instance, user A wishes to send r Bitcoins to user B: she signs a transaction, sends it to B, who will sign it and send it back to A. Only the first transaction is recorded on the main blockchain. At each time step in the lifetime of the channels, the users keep sending back and forth signed transactions that can be at any point consensually broadcast on the main blockchain to close the channel and redeem the net amount of funds. To prevent fraudulent behavior, for instance user B not acknowledging the receipt of a payment from A, a refund option is always included in any exchange. The refund option can be unilaterally unlocked and submitted to the blockchain after a certain amount of time $${t}^{\star }$$ has elapsed from the moment the channel was first established. Every new refund option is indeed signed by both parties, signaling therefore that they are in agreement with the terms of the refund, which may be exercised unilaterally at a later time. In the worst-case scenario, one party would simply submit the original refund transaction created contextually with the opening of the channel.
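The netting mechanism described above can be illustrated with a minimal sketch. It is a toy bookkeeping model only: signatures, refund transactions and the timelock are deliberately left out, amounts are integers (satoshis) to avoid rounding issues, and the class name `ToyChannel` and its methods are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ToyChannel:
    """Toy two-party channel: balances change off-chain, settlement happens once."""

    balance_a: int  # funds currently owned by A inside the channel
    balance_b: int  # funds currently owned by B inside the channel
    updates: int = 0  # number of off-chain balance updates so far

    def pay(self, sender: str, amount: int) -> None:
        """Move `amount` from `sender` ('A' or 'B') to the other party."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        if sender == "A":
            if amount > self.balance_a:
                raise ValueError("insufficient channel balance for A")
            self.balance_a -= amount
            self.balance_b += amount
        elif sender == "B":
            if amount > self.balance_b:
                raise ValueError("insufficient channel balance for B")
            self.balance_b -= amount
            self.balance_a += amount
        else:
            raise ValueError("sender must be 'A' or 'B'")
        self.updates += 1

    def close(self) -> Tuple[int, int]:
        """Return the net balances that a single closing transaction would settle."""
        return self.balance_a, self.balance_b


# Both parties lock m = 100000 satoshis, then exchange three payments.
channel = ToyChannel(100_000, 100_000)
channel.pay("A", 30_000)
channel.pay("B", 10_000)
channel.pay("A", 50_000)
print(channel.close(), channel.updates)
```

In this toy run three payments are exchanged, but only the opening of the channel and the final netted balances would need to reach the main blockchain.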

Payments can also be relayed via the Lightning Network when two parties are not directly connected by a Lightning channel, provided there exists a path linking them indirectly via existing channels owned by third parties. Exploiting an existing path to route the payments may often prove more convenient, as the two interested parties need not open a new channel, therefore saving the associated costs in blockchain fees. Channels’ owners are indeed owed “routing fees” to allow payments through their channel, but at the moment those fees are very competitive (~4 orders of magnitude less than the Bitcoin blockchain, data taken from https://1ml.com/statistics and https://bitcoinfees.info). In Fig. 1, Panel B we show an example of an indirect routing path between users 1 and 4. One of the biggest issues of the Lightning Network is the limitation in liquidity. Payments are made by effectively having intermediaries forward collateral across multiple channels: this means that if party 1 is transferring Bitcoins to party 4, each relaying channel needs to have at least the transferred amount available in the direction of the payment.
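The liquidity constraint on a routing path can be sketched as follows. This is a simplified illustration under stated assumptions: routing fees are ignored, balances are integers, and the names `can_route` and `route` as well as the example balances are hypothetical and not taken from any Lightning implementation.

```python
from typing import Dict, List, Tuple

# balances[(u, v)] is the amount node u can currently push towards node v.
Balances = Dict[Tuple[int, int], int]


def can_route(path: List[int], amount: int, balances: Balances) -> bool:
    """True if every hop on the path holds at least `amount` in the payment direction."""
    return all(balances.get((u, v), 0) >= amount for u, v in zip(path, path[1:]))


def route(path: List[int], amount: int, balances: Balances) -> bool:
    """Forward `amount` along `path` if liquidity allows; shift balances on each hop."""
    if not can_route(path, amount, balances):
        return False
    for u, v in zip(path, path[1:]):
        balances[(u, v)] -= amount
        balances[(v, u)] = balances.get((v, u), 0) + amount
    return True


# Path 1 -> 2 -> 3 -> 4; the middle channel holds only 20 in the forward direction.
balances = {(1, 2): 50, (2, 1): 50, (2, 3): 20, (3, 2): 80, (3, 4): 60, (4, 3): 40}
print(route([1, 2, 3, 4], 30, balances))  # blocked by the 2 -> 3 hop
print(route([1, 2, 3, 4], 15, balances))  # fits on every hop
```

The first payment fails because a single hop lacks liquidity in the required direction, even though the other channels could carry it.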

To prevent dishonest behavior in the transfer from party 1 to 4 via party 2 and 3 (see Fig. 1, Panel B), Party 1 will lock the Bitcoins with a secret key known only by the receiver: when party 4 receives the Bitcoins from party 3, the secret is revealed and every player can collect their coins and fees14.

## Model Setup

In this section, we model the emergence of the Lightning Network as a (bond) percolation process. We consider N agents, who are all able to reciprocally transfer Bitcoins using the main blockchain and – if economically convenient – to open a channel on the Lightning Network and transact “off chain”. We introduce the node capacity (or wealth) wi of node i, a random variable extracted from a pdf Π(w), which is proportional to the maximum amount of Bitcoins that node i can lock in a Lightning channel it partakes in. We will consider two explicit examples for the wealth distribution (uniform and exponential) in the following, with qualitatively similar results.

Two nodes are more likely to open a Lightning channel if they expect to submit a large number of transactions over a given period of time. Therefore, we introduce for each node i a quantity ℓi that represents its “activity” in terms of the average number of transactions node i sends through each channel in the network. The average number of transactions is also a random variable extracted from the discrete distribution $$\widehat{\Pi }(\ell )$$ over non-negative integers. We also include the costs associated with transacting over one of the two networks (main blockchain only or blockchain and Lightning). These costs can be fixed per transaction (base fee) or can be calculated as a percentage of the value transferred (fee rate).

• c, Lightning channel maintenance/usage base fee: Using the LN channel provided by an operator or other users to transfer coins carries an associated LN fee. Opening a channel has also maintenance costs (fee setup, market and nodes monitoring, connections) and costs related to locking Bitcoins and providing liquidity in the channel.

• ϕ, main blockchain fee rate: We assume that a fraction ϕ of the value transferred in each transaction needs to be paid by the sender to have it included in blocks and validated by miners.

The probability of opening a new LN channel between two nodes $${p}_{ij}^{{\rm{LN}}}$$ can be modeled as a function of (i) the costs associated with opening the LN channel (if the costs are significantly smaller than using the Bitcoin blockchain, there is an incentive for the users towards opening the channel), (ii) users’ affinity (the more likely are users to transact over a period of time τ, the higher the benefits of opening a channel), (iii) the wealth of the nodes (nodes wishing to open a LN channel have to lock a minimal amount of Bitcoins on the main blockchain as collateral).

The growth of the Lightning Network can be modeled as a bond percolation process on a set of N nodes representing Bitcoin users. The edges then represent new Lightning channels being opened. In particular, we construct the bond percolation model considering fitness-dependent networks47,48,49,50. In fitness models, the network topology is determined by (i) an attachment kernel f(x, y), describing the probability that a node with fitness x will connect to a node with fitness y, and (ii) the distribution of fitness ρ(x) across nodes.

The network we consider has a fixed number of nodes N – corresponding to all Bitcoin users that may decide to switch to the LN – and is sparse, i.e. the number M of edges satisfies M ≪ N². If we consider nodes i and j having fitness xi and xj respectively, a LN channel, i.e. an edge between them, is added with probability

$${p}_{ij}^{{\rm{LN}}}=f({x}_{i},{x}_{j}) \sim {\mathcal{O}}(1/N).$$
(1)

The resulting network is undirected if f(x, y) = f(y, x), which is a sensible requirement: indeed, opening a LN channel between two nodes will require a “symmetric” commitment from both nodes to lock Bitcoins on the main blockchain. In our model, we will consider bond percolation only: number and “state” of the nodes (e.g. occupied/unoccupied or infected/susceptible) will not change.

In the context of the LN network, we define the fitness xi of node i as the simplest increasing function of both capacity and volume of transactions, i.e.

$${x}_{i}={w}_{i}({\ell }_{i}+1),$$
(2)

where wi represents the wealth of the node, and ℓi + 1 ≥ 1 its “activity” in terms of number of transactions expected to be sent through the channel. Note that we assume that all nodes are potentially active, xi > 0 for all i. As in48, we consider the fitness to be defined in the interval [0, ∞).

Given this definition of the node fitness, the fitness distribution can be calculated from the wealth and activity distributions, Π(w) and $$\widehat{\Pi }(\ell )$$ respectively, as

$$\rho (x)=\sum _{\ell \ \ge \ 0}\widehat{\Pi }(\ell ){\int }_{0}^{\infty }{\rm{d}}w\Pi (w)\delta (x-w(\ell +1)).$$
(3)

If we imagine links are added one at a time at a given rate, from the kernel f(x, y) we can derive the probability that a node with fitness x increases its degree by one as47

$$\lambda (x,N)=\frac{1}{N}\frac{{\int }_{0}^{\infty }{\rm{d}}yf(x,y)\rho (y)}{\kappa },$$
(4)

where $$N\kappa \sim {\mathcal{O}}(1)$$ is the average degree of the network with N nodes, with

$$\kappa ={\int }_{0}^{\infty }{\int }_{0}^{\infty }{\rm{d}}x{\rm{d}}y\rho (x)\rho (y)f(x,y).$$
(5)

We also define λ(x) = Nλ(x, N) and rewrite it as

$$\lambda (x)=\frac{1}{\kappa }{\int }_{0}^{\infty }{\rm{d}}yf(x,y)\rho (y).$$
(6)

Note that λ(x) clearly satisfies the following normalization condition

$${\int }_{0}^{\infty }\lambda (x)\rho (x){\rm{d}}x=1.$$
(7)

The degree distribution P(k) for large N is given by

$$P(k)={\int }_{0}^{\infty }{\rm{d}}x\rho (x)\frac{{{\rm{e}}}^{-N\kappa \lambda (x)}{[N\kappa \lambda (x)]}^{k}}{k!},$$
(8)

whose average degree is $$\langle k\rangle =N\kappa$$, as shown in detail in Appendix A.

In the following, we will assume that the activity distribution is Poisson with average $$\bar{n}$$, $$\widehat{\Pi }(\ell )=\exp (-\bar{n}){\bar{n}}^{\ell }/\ell !$$, and that the connectivity kernel models the effects of blockchain and LN fees as follows

$$f(x,y)=\frac{\mu }{N}\Theta (x\phi -c)\Theta (y\phi -c),$$
(9)

where Θ(z) is the Heaviside step function. The interpretation of this kernel is as follows: agent i expects to interact with μ other agents, which we assume for simplicity to be chosen randomly.

The probability of interacting with a given agent j is equal to μ/N for all j. Agent i wishes to transfer an amount wi(ℓi + 1) (corresponding to ℓi + 1 transactions of size wi) to each of them, and is willing to open a Lightning channel if the cost of maintaining it (c) is lower than the cost of transferring the money through the blockchain (wi(ℓi + 1)ϕ). The same considerations apply to its counterpart j.

We define

$${N}_{+}=\mathop{\sum }\limits_{i=1}^{N}\Theta ({x}_{i}\phi -c)$$
(10)

the number of nodes with “high” fitness, for whom it is economically viable to engage in a LN. Note that N+ is a random variable, which depends on the realization of the fitnesses. We define the average fraction $${f}_{+}=\langle {N}_{+}\rangle /N$$.
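As an illustration of Eq. (10), the fraction of high-fitness nodes can be estimated by direct sampling. The sketch below assumes a uniform wealth distribution on [0, w0] and Poisson activity with mean n̄ (the choices made in this paper); the function names, the seed and the parameter values are hypothetical. Since the standard library has no Poisson sampler, a simple multiplication method (adequate for small means) is used.

```python
import math
import random


def sample_poisson(mean: float, rng: random.Random) -> int:
    """Draw a Poisson variate by multiplying uniforms (suitable for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def estimate_f_plus(n_nodes: int, w0: float, n_bar: float,
                    c: float, phi: float, seed: int = 0) -> float:
    """Monte Carlo estimate of N_+/N for one realization of the fitnesses."""
    rng = random.Random(seed)
    n_plus = 0
    for _ in range(n_nodes):
        w = rng.uniform(0.0, w0)          # wealth of the node
        ell = sample_poisson(n_bar, rng)  # activity of the node
        x = w * (ell + 1)                 # fitness, Eq. (2)
        if x * phi > c:                   # step function in Eq. (10)
            n_plus += 1
    return n_plus / n_nodes


print(estimate_f_plus(n_nodes=20_000, w0=1.0, n_bar=2.0, c=1.0, phi=1.0))
```

Averaging this estimate over several seeds approximates f+, up to fluctuations that shrink as N grows.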

The network constructed via the sequential deposition of links (as described above) may undergo a percolation transition46,47,51,52 as a function of f+, such that – beyond a critical value of f+ – a giant connected component of NC nodes emerges, whose fractional average size $$S=\langle {N}_{C}\rangle /N$$ remains finite as N → ∞. We stress that in any fixed instance NC ≤ N+, since some high-fitness nodes may still not engage in LN (see Fig. 2). In our language, this connected component represents the set of nodes that not only exploit Lightning channels to exchange wealth off-chain between nearest neighbors, but may also transfer wealth to any “distant node”, routing the transaction via connected paths. It is therefore of paramount importance to understand under which conditions on the average wealth, average volume of transactions, and routing fees this transition may happen, and what finite fraction of nodes it will involve.

With the choice of the kernel in (9), the topology of the resulting Lightning Network of N+ nodes is that of an Erdős-Rényi (E-R) graph with average degree equal to $$\mu {f}_{+} \sim {\mathcal{O}}(1)$$. At odds with the standard model of E-R graphs, in our case the size of the graph N+ is itself a random variable, which depends on the parameters of the model. In fact, once f+ has been obtained, the model can be mapped onto a site percolation problem on random networks, where each node is occupied with probability f+, and the emergence of a viable Lightning Network corresponds to the emergence of a giant component of occupied nodes53.

The relevant percolation theory is summarized in Appendix B to make the paper self-contained.

### Uniform wealth distribution

We now take Π(w) – the pdf of wealth across nodes – as uniform in the interval [0, w0]. Hence, we have

$${\rho }^{(u)}(x)=\sum _{\ell \ \ge \ 0}\frac{{{\rm{e}}}^{-\bar{n}}{\bar{n}}^{\ell }}{\ell !}{\int }_{0}^{{w}_{0}}\frac{{\rm{d}}w}{{w}_{0}}\delta (x-w(\ell +1))\,,$$
(11)

where the superscript (u) refers to uniform wealth distribution. Simplifying we obtain

$${\rho }^{(u)}(x)=\frac{1}{{w}_{0}}\sum _{\ell \ \ge \ \lceil \frac{x}{{w}_{0}}-1\rceil }\frac{{{\rm{e}}}^{-\bar{n}}{\bar{n}}^{\ell }}{\ell !(\ell +1)}=\,\frac{1}{{w}_{0}\bar{n}}\left(1-\frac{\Gamma \left(\lceil \frac{x}{{w}_{0}}\rceil ,\bar{n}\right)}{\Gamma \left(\lceil \frac{x}{{w}_{0}}\rceil \right)}\right),$$
(12)

where $$\Gamma (a,x)={\int }_{x}^{\infty }{t}^{a-1}{{\rm{e}}}^{-t}{\rm{d}}t$$ is the upper incomplete Gamma function, and $$\lceil {z}\rceil$$ denotes the smallest integer not smaller than z. In this case, it follows from (6) and (9) that

$${\lambda }^{(u)}(x)=\frac{1}{{f}_{+}^{(u)}}\Theta (x\phi -c),$$
(13)

where $${f}_{+}^{(u)}$$ is the average fraction of high-fitness nodes and is given by

$${f}_{+}^{(u)}={\int }_{c/\phi }^{\infty }{\rm{d}}x{\rho }^{(u)}(x)=\frac{1}{{w}_{0}}\sum _{\ell \ \ge \ 0}\frac{{{\rm{e}}}^{-\bar{n}}{\bar{n}}^{\ell }}{\ell !}{\int }_{0}^{{w}_{0}}{\rm{d}}w\Theta (w(\ell +1)-c/\phi )\,,$$
(14)

which requires w > c/[(ℓ + 1)ϕ], in turn constraining $$c/[(\ell +1)\phi ]\ \le \ {w}_{0}\to \ell \ \ge \ \lceil \frac{c}{{w}_{0}\phi }-1\rceil$$ (which may also be negative). Therefore

$${f}_{+}^{(u)}=\frac{1}{{w}_{0}}\sum _{\ell =\max \left(0,\lceil \frac{c}{{w}_{0}\phi }-1\rceil \right)}\frac{{{\rm{e}}}^{-\bar{n}}{\bar{n}}^{\ell }}{\ell !}{\int }_{\frac{c}{(\ell +1)\phi }}^{{w}_{0}}{\rm{d}}w=\Psi (0)-\frac{c}{\phi {w}_{0}}\Psi (-1)\,,$$
(15)

where

$$\Psi (t)=\sum _{\ell =\max \left(0,\lceil \frac{c}{{w}_{0}\phi }-1\rceil \right)}\frac{{{\rm{e}}}^{-\bar{n}}{\bar{n}}^{\ell }}{\ell !}{(\ell +1)}^{t}.$$
(16)
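Equations (15) and (16) can be evaluated numerically with a short routine. The sketch below sums the Poisson-weighted terms of Ψ(0) and Ψ(−1) directly; the infinite sums are truncated well beyond the bulk of the Poisson distribution, which is an assumption suited to moderate n̄, and the function names are hypothetical.

```python
import math


def poisson_pmf(ell: int, mean: float) -> float:
    """Poisson probability of `ell`, computed in log space for stability."""
    if mean == 0.0:
        return 1.0 if ell == 0 else 0.0
    return math.exp(-mean + ell * math.log(mean) - math.lgamma(ell + 1))


def f_plus_uniform(w0: float, n_bar: float, c: float, phi: float) -> float:
    """Average fraction of high-fitness nodes, Eq. (15), for uniform wealth on [0, w0]."""
    ratio = c / (phi * w0)
    ell_min = max(0, math.ceil(ratio - 1))
    # Truncation point of the sums: far in the upper tail of the Poisson weights.
    ell_max = ell_min + int(n_bar + 12.0 * math.sqrt(n_bar)) + 50
    psi_0 = 0.0   # Psi(0)  in Eq. (16)
    psi_m1 = 0.0  # Psi(-1) in Eq. (16)
    for ell in range(ell_min, ell_max):
        weight = poisson_pmf(ell, n_bar)
        psi_0 += weight
        psi_m1 += weight / (ell + 1)
    return psi_0 - ratio * psi_m1


print(f_plus_uniform(w0=1.0, n_bar=1.0, c=2.0, phi=1.0))
```

As a consistency check, for n̄ = 0 only the ℓ = 0 term survives and the expression reduces to 1 − c/(ϕw0) when c < ϕw0, and to zero otherwise.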

The evaluation of P(u)(k) from (8) requires some care, as λ(u)(x) is zero if x < c/ϕ. Splitting the integration region, we get

$$\begin{array}{lll}{P}^{(u)}(k) & = & {\delta }_{k,0}{\int }_{0}^{c/\phi }{\rm{d}}x{\rho }^{(u)}(x)+\frac{{{\rm{e}}}^{-\mu {f}_{+}^{(u)}}{(\mu {f}_{+}^{(u)})}^{k}}{k!}{\int }_{c/\phi }^{\infty }{\rm{d}}x{\rho }^{(u)}(x)\\ & = & (1-{f}_{+}^{(u)}){\delta }_{k,0}+{f}_{+}^{(u)}\frac{{{\rm{e}}}^{-\mu {f}_{+}^{(u)}}{(\mu {f}_{+}^{(u)})}^{k}}{k!}\end{array}.$$
(17)

The interpretation of (17) is quite neat: on average, the network contains a fraction 1 − f+ of isolated (low-fitness) nodes, and a fraction f+ of high-fitness nodes that may (or may not) partake in the LN, establishing sparse random connections with an average of μf+ other high-fitness nodes. Computing now the generating function (35)

$${G}_{0}^{(u)}(s)=1-{f}_{+}^{(u)}+{f}_{+}^{(u)}\exp (\mu {f}_{+}^{(u)}(s-1)),$$
(18)

it follows from Eq. (36) that

$${G}_{1}^{(u)}(s)=\frac{{G}_{0}^{{(u)}^{{\prime} }}(s)}{{G}_{0}^{{(u)}^{{\prime} }}(1)}=\exp (\mu {f}_{+}^{(u)}(s-1))\,.$$
(19)

The general theory (see Appendix B, in particular Eq. (54)) then implies that the equation determining 0 < $${\xi }^{\star }$$ ≤ 1 is

$${\xi }^{\star }=\exp [\mu {f}_{+}^{(u)}({\xi }^{\star }-1)].$$
(20)

The average size of the giant component thus reads from Eq. (53)

$${S}^{(u)}=(1-{\xi }^{\star }){f}_{+}^{(u)}\,,$$
(21)

and the condition in Eq. (51) for the giant component to appear is

$$\mu {f}_{+}^{(u)} > 1.$$
(22)

The interpretation of this condition is fairly obvious: the giant connected component can only arise if “fit” nodes open on average more than one channel with other fit nodes (see Figs. 3 and 4).
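The self-consistency equation (20) has no closed-form solution, but it can be solved by fixed-point iteration. The sketch below is one simple way to do so (an assumption of this illustration, not necessarily the procedure used for the figures): starting from ξ = 0, the iteration increases monotonically towards the smallest solution, from which Eq. (21) gives the giant component size. The function name and tolerances are hypothetical.

```python
import math


def giant_component_fraction(mu: float, f_plus: float,
                             tol: float = 1e-12, max_iter: int = 100_000) -> float:
    """Solve xi = exp(mu * f_plus * (xi - 1)) and return S = (1 - xi) * f_plus."""
    mean_degree = mu * f_plus
    if mean_degree <= 1.0:
        # Below the threshold of Eq. (22) the only solution is xi = 1, so S = 0.
        return 0.0
    xi = 0.0
    for _ in range(max_iter):
        new_xi = math.exp(mean_degree * (xi - 1.0))
        converged = abs(new_xi - xi) < tol
        xi = new_xi
        if converged:
            break
    return (1.0 - xi) * f_plus


for mu in (0.5, 1.5, 2.0, 4.0):
    print(mu, giant_component_fraction(mu, f_plus=1.0))
```

Convergence slows down close to the critical point μf+ = 1, where more iterations (or a root-finding method on the same equation) may be preferable.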

### Exponential wealth distribution

We now take Π(w) to be the exponential pdf with mean w0. The fitness distribution now becomes

$${\rho }^{(e)}(x)=\sum _{\ell \ \ge \ 0}\frac{{{\rm{e}}}^{-\bar{n}}{\bar{n}}^{\ell }}{\ell !}{\int }_{0}^{\infty }\frac{{\rm{d}}w}{{w}_{0}}{{\rm{e}}}^{-\frac{w}{{w}_{0}}}\delta (x-w(\ell +1))=\frac{1}{{w}_{0}}\sum _{\ell \ \ge \ 0}\frac{{{\rm{e}}}^{-\bar{n}}{\bar{n}}^{\ell }}{\ell !(\ell +1)}{{\rm{e}}}^{-\frac{x}{{w}_{0}(\ell +1)}}.$$

As in the uniform wealth case

$${\lambda }^{(e)}(x)=\frac{1}{{f}_{+}^{(e)}}\Theta (x\phi -c),$$
(23)

where this time $${f}_{+}^{(e)}$$ reads

$${f}_{+}^{(e)}={\int }_{c/\phi }^{\infty }{\rm{d}}x{\rho }^{(e)}(x)=\frac{1}{{w}_{0}}\sum _{\ell \ \ge \ 0}\frac{{{\rm{e}}}^{-\bar{n}}{\bar{n}}^{\ell }}{\ell !}{\int }_{0}^{\infty }{\rm{d}}w{{\rm{e}}}^{-w/{w}_{0}}\Theta (w(\ell +1)-c/\phi )\,,$$
(24)

which requires $$\ell \ \ge \ \lceil \frac{c}{\phi w}-1\rceil$$. Therefore,

$${f}_{+}^{(e)}=1-\frac{1}{{w}_{0}}{\int }_{0}^{\infty }{\rm{d}}w{{\rm{e}}}^{-w/{w}_{0}}\frac{\Gamma \left(\lfloor \frac{c}{\phi w}\rfloor ,\bar{n}\right)}{\Gamma \left(\lfloor \frac{c}{\phi w}\rfloor \right)}\,,$$
(25)

where $$\lfloor {z}\rfloor$$ denotes the largest integer smaller than z. As in the uniform-wealth case

$${P}^{(e)}(k)=(1-{f}_{+}^{(e)}){\delta }_{k,0}+{f}_{+}^{(e)}\frac{{{\rm{e}}}^{-\mu {f}_{+}^{(e)}}{(\mu {f}_{+}^{(e)})}^{k}}{k!}.$$
(26)

Now, consider the solution 0 < $${\eta }^{\star }$$ ≤ 1 of

$${\eta }^{\star }=\exp [\mu {f}_{+}^{(e)}({\eta }^{\star }-1)]\,.$$
(27)

Then, the average size of the giant component reads

$${S}^{(e)}=(1-{\eta }^{\star }){f}_{+}^{(e)}\,,$$
(28)

and the condition for the giant component to appear reads $$\mu {f}_{+}^{(e)} > 1$$ (see Fig. 5).
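For numerical purposes, the w-integral in Eq. (24) can be carried out term by term: for each ℓ, the integral of the exponential density over w > c/[(ℓ + 1)ϕ] equals exp(−c/[(ℓ + 1)ϕw0]), leaving a single Poisson-weighted sum. The sketch below evaluates this sum and checks the threshold condition; the truncation of the sum and the function names are assumptions made for illustration.

```python
import math


def poisson_pmf(ell: int, mean: float) -> float:
    """Poisson probability of `ell`, computed in log space for stability."""
    if mean == 0.0:
        return 1.0 if ell == 0 else 0.0
    return math.exp(-mean + ell * math.log(mean) - math.lgamma(ell + 1))


def f_plus_exponential(w0: float, n_bar: float, c: float, phi: float) -> float:
    """Fraction of high-fitness nodes for exponential wealth with mean w0."""
    ell_max = int(n_bar + 12.0 * math.sqrt(n_bar)) + 50  # truncation of the sum
    total = 0.0
    for ell in range(ell_max):
        survival = math.exp(-c / ((ell + 1) * phi * w0))  # P(w > c / ((ell + 1) phi))
        total += poisson_pmf(ell, n_bar) * survival
    return total


def giant_component_exists(mu: float, f_plus: float) -> bool:
    """Threshold condition mu * f_plus > 1."""
    return mu * f_plus > 1.0


f_e = f_plus_exponential(w0=1.0, n_bar=2.0, c=1.0, phi=1.0)
print(f_e, giant_component_exists(mu=2.0, f_plus=f_e))
```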

## Results

We present numerical simulations on networks of N = 5 × 10⁴ nodes, generated by sequential deposition of links with probability as in Eq. (1), using the kernel in Eq. (9). In Fig. 3, where we use a uniform distribution of wealth on [0, w0] with w0 = 1, 2, we plot the average size S(u) of the connected component as a function of $$\bar{n}$$, the average volume of transactions to be deployed on the LN, for varying values of the fees ratio c/ϕ. Fixing a certain average fraction S(u) of nodes – which can reach each other via a connected LN path – and increasing the ratio c/ϕ between the LN and main-blockchain fees, we observe that a larger average volume of LN transactions is required to make the off-chain network financially sustainable. Increasing w0, and hence the average wealth, would push the curves upwards: as more liquidity becomes available across nodes, more and more players may get involved in the LN for the same level of routing fees. In Fig. 4, the average size S(u) of the connected component is plotted instead as a function of $${f}_{+}^{(u)}$$, the fraction of high-fitness nodes, for different values of μ, showing that the transition between S = 0 and S > 0 happens at $${f}_{+}^{(u)}=1/\mu$$, as predicted by the condition in Eq. (22). In Fig. 5, we observe qualitatively the same phenomenon, this time for an exponential distribution of wealth.

To find the size of the largest connected component, we use a breadth-first search algorithm54: starting from a source node s, we label it as belonging to cluster #1. We then explore its neighborhood and assign all nodes reachable from s to cluster #1 as well. The algorithm proceeds recursively until either the whole network has been labelled, or no unlabelled nodes can be further reached. In the latter case, we select another random source among the unlabelled nodes, assign it the label #2, and restart the procedure to find another cluster. At the end, all disjoint clusters have been identified, and their size recorded. In our plots, we monitor the size of the largest cluster.
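The cluster-labelling procedure just described can be sketched as follows. This is a generic breadth-first search written for illustration, not the code used for the figures: it assumes the network is given as an adjacency dictionary in which every node appears as a key, and it picks new sources in dictionary order rather than at random, which does not affect the cluster sizes.

```python
from collections import deque
from typing import Dict, List


def largest_component_size(adjacency: Dict[int, List[int]]) -> int:
    """Label connected clusters by breadth-first search; return the largest size."""
    label: Dict[int, int] = {}
    largest = 0
    next_label = 1
    for source in adjacency:
        if source in label:
            continue  # already assigned to an earlier cluster
        label[source] = next_label
        queue = deque([source])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in label:
                    label[neighbor] = next_label
                    queue.append(neighbor)
        largest = max(largest, size)
        next_label += 1
    return largest


# Three clusters: {0, 1, 2}, {3, 4} and the isolated node 5.
graph = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3], 5: []}
print(largest_component_size(graph))
```

Each node is enqueued once and each edge is inspected twice, so the cost is linear in the number of nodes plus edges.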

In Fig. 6, we plot the phase diagram in the (ϕ, c) plane for the uniform wealth distribution model (very similar results are obtained for the exponential wealth distribution, not shown). The colors from blue to yellow represent (from low to high) the values of $${\bar{n}}^{\star }$$, the minimal average volume of transactions that need to be deployed to make a LN financially viable for a given value of LN and main-blockchain fees, c and ϕ, respectively. We observe a transition between two regimes, signalled by the red line: one (region 1) where the LN fees are sufficiently low (compared to main-blockchain fees) that any volume of transactions (however low, $${\bar{n}}^{\star }=0$$ strictly) can be transferred off-chain and still be financially viable, the other (region 2) where the LN fees are sufficiently high that agents may be discouraged from opening channels and transferring wealth off-chain unless there is a minimal volume of transactions to be deployed ($${\bar{n}}^{\star } > 0$$ strictly). The higher the ratio c/ϕ, the less convenient it is to open LN channels for a fixed value of transactional activity.
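One way to compute the threshold activity for a given pair of fees is to search for the smallest n̄ at which the condition in Eq. (22) holds. The sketch below does this by bisection for the uniform wealth model, assuming that f+ grows with n̄ (larger activity can only raise the fitness); the search range, tolerance and function names are hypothetical choices, and this is not necessarily how Fig. 6 was produced.

```python
import math
from typing import Optional


def f_plus_uniform(w0: float, n_bar: float, c: float, phi: float) -> float:
    """Eq. (15), summed term by term with a truncated Poisson tail."""
    ratio = c / (phi * w0)
    ell_min = max(0, math.ceil(ratio - 1))
    ell_max = ell_min + int(n_bar + 12.0 * math.sqrt(n_bar)) + 50
    total = 0.0
    for ell in range(ell_min, ell_max):
        if n_bar == 0.0:
            weight = 1.0 if ell == 0 else 0.0
        else:
            weight = math.exp(-n_bar + ell * math.log(n_bar) - math.lgamma(ell + 1))
        total += weight * (1.0 - ratio / (ell + 1))
    return total


def minimal_activity(mu: float, w0: float, c: float, phi: float,
                     n_max: float = 100.0, tol: float = 1e-6) -> Optional[float]:
    """Smallest n_bar with mu * f_plus > 1, or None if not reached below n_max."""
    target = 1.0 / mu
    if f_plus_uniform(w0, 0.0, c, phi) > target:
        return 0.0  # region 1: viable for any activity
    if f_plus_uniform(w0, n_max, c, phi) <= target:
        return None  # no transition inside the searched range
    low, high = 0.0, n_max
    while high - low > tol:
        mid = 0.5 * (low + high)
        if f_plus_uniform(w0, mid, c, phi) > target:
            high = mid
        else:
            low = mid
    return high


print(minimal_activity(mu=2.0, w0=1.0, c=0.4, phi=1.0))  # region 1
print(minimal_activity(mu=2.0, w0=1.0, c=2.0, phi=1.0))  # region 2
```

In this sketch, region 1 corresponds to μ(1 − c/(ϕw0)) > 1, because at n̄ = 0 the expression in Eq. (15) reduces to 1 − c/(ϕw0) when c < ϕw0.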

## Discussion

In summary, we have presented a simple fitness-based network model for the emergence of a connected set of nodes exchanging wealth off-chain, whose average fractional size S remains finite as N → ∞. The percolation transition resulting from sequential deployment of edges is studied numerically and analytically as a function of a limited set of parameters that we expect can in principle be inferred from empirical or synthetic13 data: w0 (related to the average wealth jointly owned by the agents), $$\bar{n}$$ (the average volume of transactions that can be handled off-chain), c and ϕ (the fees associated with off-chain and on-chain transactions, respectively) and μ (the average number of channels per node). As a matter of fact, different platforms are currently being offered – but only at a test stage – where users can experience the Lightning Network services in a simulated environment. Already at this early stage in the development of a fully operational payment system, some useful data can be gathered: for instance, the platform ‘1ML’ (https://1ml.com/statistics) currently aggregates information about ~10000 nodes sharing ~30000 channels, with an average capacity per node of ~1000 USD, and a base fee per transaction of around 0.000072 USD. Similarly, for the Bitcoin blockchain we can gather an estimate of ~0.52 USD as base fee per transaction, as well as more accurate figures about number of transactions per day and average transaction values (data available at https://www.blockchain.com/en/charts).

The function f(x, y) in Eq. (9) has been selected as the simplest nontrivial attachment kernel that favors a link (i.e. the opening of a Lightning channel) whenever the fitness of both concurring nodes (in terms of exchangeable wealth and volume of predicted activity) exceeds a financially viable threshold. We have checked that “smoothing” the 0/1 kernel in Eq. (9), e.g. by multiplying the thetas by xy/(1 + xy) or $$1-\exp (-(x+y))$$, has negligible effects on the results, while making the analytical treatment unnecessarily more complicated. Similarly, the model is fairly insensitive to the details of the full probability distribution of wealth that is used (see however23 for a data-driven analysis of Bitcoin wealth distribution), while being flexible enough to generate a desired degree distribution P(k) via a different choice of the attachment kernel f(x, y)55. A percolation transition separates a phase where no sustainable LN can be formed from a phase where the fees being charged, the total available wealth and the average activity conspire to make off-chain payments a viable option for a finite fraction of the network in the limit N → ∞. The transition is elucidated analytically and numerically, with excellent agreement.

In the future, this investigation can be extended in the following ways:

• A mechanism for the dynamical update of wealth as more channels are opened and funds are locked may be introduced to investigate the liquidity constraints of the network in more detail. Dynamically generated wealth inequalities and concentration may be detected by means of centrality measures.

• The resilience of the network can be studied under different types of attacks and compared with available empirical results7,11.

• Different choices of the kernel f(x, y) (e.g. non-factorized) may also be explored. This could lead to networks with heterogeneous (heavy-tailed) degree distributions, which seems to be in line with recent empirical studies7.

Once the development and implementation of the Lightning Network technology have reached maturity, it will be possible to gather data to calibrate our model, which can then serve as a driver for policy changes and as guidance for the design of incentive mechanisms.

## Appendix A: Degree distribution P(k)

Following47, the probability $${p}_{M,N}(k|x)$$ that a node in a large undirected graph with N nodes and $$M\ll {N}^{2}$$ edges has degree k, given that its fitness is x, follows the recursion

$${p}_{M+1,N}(k+1| x)={p}_{M,N}(k+1| x)[1-2\lambda (x,N)]+2{p}_{M,N}(k| x)\lambda (x,N).$$
(29)

The interpretation is easy: the probability of having a node with degree k + 1 after an edge addition (M + 1) is equal to the probability that the node already had degree k + 1 times the probability that the new edge does not have any of its two terminal points attached to it ([1 − 2λ(x, N)]), plus the probability that the node had degree k times the probability that the new edge has one of its two terminal points connected to it (2λ(x, N)).

Multiplying both sides of Eq. (29) by $${s}^{k}$$ and summing over k ≥ 0, we obtain the following equation for $${F}_{M,N}(s| x)={\sum }_{k\ge 0}{s}^{k}{p}_{M,N}(k| x)$$

$$\begin{array}{lll}{F}_{M+1,N}(s| x)-{F}_{M,N}(s| x) & = & 2{F}_{M,N}(s| x)(s-1)\lambda (x,N)+{F}_{M+1,N}(0| x)\\ & & -{F}_{M,N}(0| x)(1-2\lambda (x,N)).\end{array}$$
(30)

For large M, Eq. (30) can be rewritten as an ordinary differential equation of the form $$\frac{\partial F}{\partial M}=2(s-1)\lambda (x,N)F+{\left[\frac{\partial F}{\partial M}+2\lambda (x,N)F\right]}_{s=0}$$, with solution

$${F}_{M,N}(s|x)=\exp \left[\frac{2M}{N}\lambda (x)(s-1)\right]\,,$$
(31)

where we recall that we defined λ(x, N) = λ(x)/N, and we used the initial condition $${F}_{0,N}(s|x)=1$$, which follows from the fact that in a network with zero edges $${p}_{0,N}(k|x)={\delta }_{k,0}$$.
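For completeness, the step from Eq. (30) to Eq. (31) can be spelled out as follows. This is our reading of the derivation and it assumes that the recursion (29) extends to degree zero in the natural way: a node keeps degree zero only if the new edge avoids it, so that $${p}_{M+1,N}(0|x)=[1-2\lambda (x,N)]{p}_{M,N}(0|x)$$. Since $${F}_{M,N}(0|x)={p}_{M,N}(0|x)$$, the terms evaluated at s = 0 in Eq. (30) then cancel, and one is left with

$$\frac{\partial F}{\partial M}=2(s-1)\lambda (x,N)F\quad \Rightarrow \quad {F}_{M,N}(s|x)={F}_{0,N}(s|x)\exp \left[2M\lambda (x,N)(s-1)\right],$$

which takes the form of Eq. (31) once the initial condition $${F}_{0,N}(s|x)=1$$ is imposed and λ(x, N) is expressed through the N-independent rate λ(x).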

Taylor-expanding around s = 0 and noting that 2M/N = Nκ, we obtain the degree distribution conditional on the fitness x of the node

$${p}_{M,N}(k| x)=\frac{{{\rm{e}}}^{-N\kappa \lambda (x)}{(N\kappa \lambda (x))}^{k}}{k!}.$$
(32)

Marginalizing with respect to x, we eventually obtain the probability that a node has degree k (irrespective of its fitness) as

$$P(k)={\int }_{0}^{\infty }{\rm{d}}x\,{p}_{M,N}(k| x)\rho (x)={\int }_{0}^{\infty }{\rm{d}}x\,\frac{{{\rm{e}}}^{-N\kappa \lambda (x)}{(N\kappa \lambda (x))}^{k}}{k!}\rho (x),$$
(33)

which correctly implies

$$\langle k\rangle =\sum _{k\ \ge \ 0}kP(k)=N\kappa ,$$
(34)

using (7).
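Equations (33) and (34) can also be verified numerically for a specific choice of fitness density. The sketch below uses a midpoint rule and purely illustrative inputs that are not taken from the paper: a uniform density ρ(x) on [0, 1] and λ(x) = 2x. It assumes that Eq. (7) is the normalization ∫ρ(x)λ(x)dx = 1, which is what Eq. (34) requires; the name `mean_degree` stands for the product Nκ.

```python
import math

def marginal_degree_pmf(mean_degree, lam, rho, x_max, k_max, n_grid=4000):
    """Midpoint-rule approximation of Eq. (33) on [0, x_max], for k <= k_max."""
    dx = x_max / n_grid
    pmf = [0.0] * (k_max + 1)
    for i in range(n_grid):
        x = (i + 0.5) * dx
        rate = mean_degree * lam(x)            # N * kappa * lambda(x)
        term = math.exp(-rate) * rho(x) * dx   # Poisson weight for k = 0
        for k in range(k_max + 1):
            pmf[k] += term
            term *= rate / (k + 1)             # Poisson weight for k + 1
    return pmf

# Illustrative (assumed) choices:
mean_degree = 4.0          # plays the role of N * kappa
rho = lambda x: 1.0        # uniform fitness density on [0, 1]
lam = lambda x: 2.0 * x    # satisfies int rho(x) lam(x) dx = 1

pmf = marginal_degree_pmf(mean_degree, lam, rho, x_max=1.0, k_max=60)
total = sum(pmf)
mean_k = sum(k * p for k, p in enumerate(pmf))
print(f"sum_k P(k) = {total:.6f}, <k> = {mean_k:.6f}")
```

Under these assumptions the normalization should come out close to 1 and the mean degree close to the chosen value of Nκ, in line with Eq. (34).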

## Appendix B: Giant component

The generating function of the probability that a node has degree k is denoted by

$${G}_{0}(s)=\sum _{k\ \ge \ 0}P(k){s}^{k}.$$
(35)

We introduce the generating function G1(s) of the (normalized) probability that by following a randomly chosen edge we reach a node with degree k

$${G}_{1}(s)=\frac{\sum _{k\ \ge \ 1}kP(k){s}^{k-1}}{\sum _{k\ \ge \ 0}kP(k)}=\frac{{G}_{0}{\prime} (s)}{{G}_{0}{\prime} (1)}=\frac{{G}_{0}{\prime} (s)}{N\kappa }.$$
(36)

This is because the node we reach by following a randomly chosen edge has degree distribution kP(k)/Nκ rather than just P(k) – since a randomly chosen edge is more likely to lead to a node of higher degree.

We also define the generating function of the size distribution ψ(t) of the connected components that can be reached by following a randomly chosen edge

$${H}_{1}(x)=\sum _{t\ \ge \ 1}\psi (t){x}^{t}.$$
(37)

Moreover, we indicate with H0(x) the generating function of the probability that a randomly chosen node belongs to a connected component of size t

$${H}_{0}(x)=\sum _{t\ge 1}Q(t){x}^{t}.$$
(38)

More precisely Eq. (37) must be interpreted as

$${H}_{1}(x)={{\rm{lim}}}_{N\to \infty }\mathop{\sum }\limits_{t=1}^{N}\psi (t,N){x}^{t}\,,$$
(39)

where ψ(t, N) is the probability that, in a network with N nodes, by following a randomly chosen link we reach a component of size t ≤ N, and similarly for H0(x) in Eq. (38). Assuming that the typical component sizes are finite and that the chances of a component containing a closed loop of edges are negligible for sufficiently large N, the distribution of components generated by H1(x) can be obtained as follows47,51,52. Let us denote by ζ(t|k) the probability that a node with degree k belongs to a component of size t

$$\zeta (t| k)=\sum _{{t}_{1}\ \ge \ 1}\cdots \sum _{{t}_{k}\ \ge \ 1}\delta \left(t-1,\mathop{\sum }\limits_{m=1}^{k}{t}_{m}\right)\mathop{\prod }\limits_{m=1}^{k}\psi ({t}_{m})\,,$$
(40)

where δ(a, b) is the Kronecker delta. Indeed, the sum of the sizes of the components that can be reached by following the k edges departing from the node must be equal to t − 1, and each of these sizes is drawn from the distribution ψ(t).

Marginalizing over the degree distribution, we obtain the probability Q(t) that a randomly chosen node belongs to a component of size t as

$$Q(t)=\sum _{k\ge 0}P(k)\zeta (t|k).$$
(41)

Computing H0(x) from (38)

$$\begin{array}{lll}{H}_{0}(x) & = & \sum _{t\ge 1}Q(t){x}^{t}=\sum _{t\ge 1}{x}^{t}\sum _{k\ge 0}P(k)\zeta (t|k)\\ & = & \sum _{k\ge 0}P(k)\sum _{t\ge 1}{x}^{t}\sum _{{t}_{1}\ge 1}\cdots \sum _{{t}_{k}\ge 1}\delta \left(t-1,\mathop{\sum }\limits_{m=1}^{k}{t}_{m}\right)\mathop{\prod }\limits_{m=1}^{k}\psi ({t}_{m})\\ & = & x\sum _{k\ge 0}P(k)\sum _{{t}_{1}\ge 1}\cdots \sum _{{t}_{k}\ge 1}{x}^{\sum _{m}{t}_{m}}\mathop{\prod }\limits_{m=1}^{k}\psi ({t}_{m})\\ & = & x\sum _{k\ge 0}P(k){\left[\sum _{t\ge 1}\psi (t){x}^{t}\right]}^{k}=x{G}_{0}({H}_{1}(x)),\end{array}$$
(42)

where we have used (35) and (37). The calculation for H1(x) is analogous, with the replacement $$P(k)\to \frac{kP(k)}{{\sum }_{k{\prime} }k{\prime} P(k{\prime} )}$$. Summarizing, the two equations to be solved together are

$${H}_{0}(x)=x{G}_{0}({H}_{1}(x)),$$
(43)
$${H}_{1}(x)=x{G}_{1}({H}_{1}(x)).$$
(44)
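The algebra leading to Eq. (43) can be checked on a small example by building Q(t) explicitly from Eqs. (40) and (41) and comparing both sides of the identity. In the sketch below the degree distribution and ψ(t) are arbitrary illustrative inputs (assumptions, not values from the model); in particular ψ(t) is not the self-consistent solution of Eq. (44), so the check concerns only the identity H0(x) = x G0(H1(x)).

```python
def convolve(a, b):
    """Distribution of the sum of two independent sizes (lists indexed by size)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out

def component_size_pmf(degree_pmf, psi):
    """Q(t) from Eqs. (40)-(41); psi and the result are indexed by size t."""
    t_max = 1 + (len(degree_pmf) - 1) * (len(psi) - 1)
    q = [0.0] * (t_max + 1)
    branch_sum = [1.0]   # k = 0: the sum of zero branch sizes equals 0
    for p_k in degree_pmf:
        for total, prob in enumerate(branch_sum):
            q[total + 1] += p_k * prob   # t = 1 + t_1 + ... + t_k
        branch_sum = convolve(branch_sum, psi)   # add one more branch
    return q

# Arbitrary illustrative inputs:
degree_pmf = [0.2, 0.5, 0.2, 0.1]   # P(k) for k = 0..3
psi = [0.0, 0.6, 0.3, 0.1]          # psi(t) for t = 0..3, with psi(0) = 0
x = 0.7

q = component_size_pmf(degree_pmf, psi)
h1 = sum(p * x**t for t, p in enumerate(psi))         # H1(x), Eq. (37)
g0_of_h1 = sum(p * h1**k for k, p in enumerate(degree_pmf))  # G0(H1(x)), Eq. (35)
h0 = sum(p * x**t for t, p in enumerate(q))           # H0(x), Eq. (38)

print(f"H0(x) = {h0:.12f}, x*G0(H1(x)) = {x * g0_of_h1:.12f}")
```

The two printed numbers should agree to machine precision, since the k-fold convolution of ψ is exactly what the k-th power of H1(x) generates.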

The average size ⟨t⟩ of the connected components is given from (38) as

$$\langle t\rangle =\sum _{t\ge 1}tQ(t)={H}_{0}{\rm{{\prime} }}(1).$$
(45)

$${H}_{0}{\prime} (1)$$ can be obtained from (43) as

$${H}_{0}{\prime} (1)={G}_{0}({H}_{1}(1))+{G}_{0}{\prime} ({H}_{1}(1)){H}_{1}{\prime} (1)\,.$$
(46)

Note that from (37) it follows that H1(1) = 1 (by normalization of ψ(t)). Similarly, from (35), we have that G0(1) = 1 (by normalization of P(k)). Eq. (46) can be therefore simplified as follows

$${H}_{0}{\prime} (1)=1+{G}_{0}{\prime} (1){H}_{1}{\prime} (1).$$
(47)

We can then compute $${H}_{1}{\prime} (1)$$ using (44)

$${H}_{1}{\prime} (1)={G}_{1}({H}_{1}(1))+{G}_{1}{\prime} ({H}_{1}(1)){H}_{1}{\prime} (1).$$
(48)

As before, we can simplify it using the fact that H1(1) = 1 and that $${G}_{1}(1)=\frac{{\sum }_{k}kP(k){s}^{k-1}}{{\sum }_{k}kP(k)}{| }_{s=1}=1$$ (see (36)), obtaining:

$${H}_{1}{\prime} (1)=1+{G}_{1}^{{\prime} }(1){H}_{1}{\prime} (1)\Rightarrow {H}_{1}{\prime} (1)=\frac{1}{1-{G}_{1}{\prime} (1)}.$$
(49)

Substituting (49) in (47) yields

$$\langle t\rangle ={H}_{0}{\prime} (1)=1+\frac{{G}_{0}{\prime} (1)}{1-{G}_{1}{\prime} (1)},$$
(50)

which diverges when

$$1-{G}_{1}{\prime} (1)=0$$
(51)

or equivalently (using (36)) when $${G}_{0}^{\prime\prime }(1)=N\kappa$$, signalling the emergence of the giant component.
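Equation (50) only involves the first two factorial moments of the degree distribution, which makes it easy to evaluate. The sketch below is a minimal illustration for a plain Poisson degree distribution with mean c (an assumed test case, not the model's P(k)), for which G0′(1) = c and G1′(1) = c, so that ⟨t⟩ = 1/(1 − c) and the divergence of Eq. (51) occurs at c = 1.

```python
import math

def mean_component_size(degree_pmf):
    """Eq. (50): <t> = 1 + G0'(1) / (1 - G1'(1)), valid below the transition."""
    g0_prime = sum(k * p for k, p in enumerate(degree_pmf))             # G0'(1) = <k>
    g0_second = sum(k * (k - 1) * p for k, p in enumerate(degree_pmf))  # G0''(1)
    g1_prime = g0_second / g0_prime                                     # G1'(1), from Eq. (36)
    if g1_prime >= 1.0:
        raise ValueError("G1'(1) >= 1: Eq. (50) does not apply")
    return 1.0 + g0_prime / (1.0 - g1_prime)

def poisson_pmf(mean, k_max=80):
    """Poisson probabilities for k = 0..k_max, built iteratively."""
    pmf = [math.exp(-mean)]
    for k in range(1, k_max + 1):
        pmf.append(pmf[-1] * mean / k)
    return pmf

# The mean component size grows without bound as c approaches 1 from below.
for c in (0.5, 0.9, 0.99):
    print(f"c = {c}: <t> = {mean_component_size(poisson_pmf(c)):.4f}")
```

The same function can be applied to any tabulated degree distribution, for instance a numerical evaluation of Eq. (33), as long as the tail beyond the truncation carries negligible weight.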

When the giant component has formed, H0(x) and H1(x) (see Eq. (37), (38), (39)) become the sum of two contributions: one where the sum is restricted to components of size t ~ o(N), and the other restricted to (giant) components of size $$t \sim {\mathcal{O}}(N)$$. Assuming that there is only one such giant component, Eq. (38) for x = 1 can then be written as

$$1={H}_{0}^{(f)}(1)+S,$$
(52)

where $${H}_{0}^{(f)}(1)$$ (and similarly $${H}_{1}^{(f)}(1)$$) satisfy Eqs. (43) and (44), as they include ~o(N) contributions for N → ∞ coming from components other than the giant one, whereas S = NC/N is the fraction of nodes that belong to the giant component.

Therefore (from (43) and (44))

$$S=1-{G}_{0}({\xi }^{\star }),$$
(53)

where $${\xi }^{\star }$$ satisfies

$${\xi }^{\star }={G}_{1}({\xi }^{\star }).$$
(54)
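Equations (53) and (54) lend themselves to a simple fixed-point iteration. The sketch below is one possible numerical treatment, not necessarily the procedure used for the results in the main text. It builds G0 and G1 from the mixed-Poisson form of Eq. (33), i.e. G0(s) = ∫dx ρ(x) exp[Nκλ(x)(s − 1)], using a midpoint rule, and iterates ξ ← G1(ξ) starting from ξ = 0. The density ρ(x) uniform on [0, 1] and the rate λ(x) = 2x are illustrative assumptions; for this choice the condition (51) gives a threshold at Nκ = 3/4.

```python
import math

def make_generating_functions(mean_degree, lam, rho, x_max, n_grid=2000):
    """Return G0 and G1 for the mixed-Poisson degree distribution of Eq. (33)."""
    dx = x_max / n_grid
    nodes = [(i + 0.5) * dx for i in range(n_grid)]
    weights = [rho(x) * dx for x in nodes]
    rates = [mean_degree * lam(x) for x in nodes]   # N * kappa * lambda(x)
    norm = sum(w * r for w, r in zip(weights, rates))   # G0'(1)

    def g0(s):
        return sum(w * math.exp(r * (s - 1.0)) for w, r in zip(weights, rates))

    def g1(s):
        # G1(s) = G0'(s) / G0'(1), Eq. (36)
        return sum(w * r * math.exp(r * (s - 1.0))
                   for w, r in zip(weights, rates)) / norm

    return g0, g1

def giant_component_fraction(g0, g1, n_iter=2000, tol=1e-12):
    """Solve xi = G1(xi) by iteration from xi = 0, then return S = 1 - G0(xi)."""
    xi = 0.0
    for _ in range(n_iter):
        new_xi = g1(xi)
        converged = abs(new_xi - xi) < tol
        xi = new_xi
        if converged:
            break
    return 1.0 - g0(xi)

# Illustrative (assumed) fitness density and rate:
rho = lambda x: 1.0
lam = lambda x: 2.0 * x

for mean_degree in (0.5, 2.0):   # below and above the threshold Nkappa = 3/4
    g0, g1 = make_generating_functions(mean_degree, lam, rho, x_max=1.0)
    s_giant = giant_component_fraction(g0, g1)
    print(f"Nkappa = {mean_degree}: S = {s_giant:.4f}")
```

Starting the iteration from ξ = 0 makes the sequence increase monotonically towards the smallest fixed point in [0, 1], which is the physically relevant root: below the threshold it is ξ = 1 (so S vanishes), above it a root ξ < 1 appears and S becomes finite. Convergence slows down close to the transition, so the iteration cap and tolerance may need adjusting there.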