Quantum communication complexity beyond Bell nonlocality

Efficient distributed computing offers a scalable strategy for solving resource-demanding tasks such as parallel computation and circuit optimisation. Crucially, the communication overhead introduced by the allotment process should be minimised -- a key motivation behind the communication complexity problem (CCP). Quantum resources are well-suited to this task, offering clear strategies that can outperform classical counterparts. Furthermore, the connection between quantum CCPs and nonlocality provides information-theoretic insights into fundamental quantum mechanics. Here we connect quantum CCPs with a generalised nonlocality framework -- beyond the paradigmatic Bell's theorem -- by incorporating the underlying causal structure, which governs the distributed task, into a so-called nonlocal hidden variable model. We prove that a new class of communication complexity tasks can be associated with Bell-like inequalities, whose violation is both necessary and sufficient for a quantum gain. We experimentally implement a multipartite CCP akin to the guess-your-neighbour-input scenario, and demonstrate a quantum advantage when multipartite Greenberger-Horne-Zeilinger (GHZ) states are shared among three users.

Quantum technology enables applications ranging from fundamentally secure cryptography [1] to quantum teleportation [2,3] and ultimately the quantum internet, as enabled by the distribution of global-scale entanglement resources [4-6]. Quantum correlations, created in quantum networks, can be harnessed to enhance the efficiency of distributed information processing, i.e., by reducing communication cost; this is exemplified by the use of shared entanglement in communication complexity problems (CCPs) [7-9]. This will be of interest in near-term quantum computing platforms, where many medium-sized nodes are linked to scale up the computational capabilities [10,11] and where CCPs will naturally arise.
Distributed computing represents a highly versatile method of solving demanding tasks by taking a global target function and splitting the input among multiple users. The users act on their inputs locally to solve a global problem, with some communication allowed between the users. CCPs provide the necessary framework for evaluating the ultimate performance of these architectures, notably the minimum communication overhead needed to achieve the task [12-14]. Recent developments of CCPs have adopted the use of non-classical resources [7], and an updated definition for evaluating the complexity of a problem. One might also be interested in obtaining the highest probability of successfully evaluating the target function with a fixed amount of communication [8,15-17]. In spite of Holevo's theorem [18], which shows that quantum states cannot reduce the cost of transmitting a classical message, if our aim is to compute a function of that message, as in the classic CCP setting, quantum resources such as entanglement can provide an improvement. Remarkably, it has been proven that quantum advantages in CCPs can be mapped to the violation of Bell inequalities [17,19], thus establishing an important link between two key concepts of computer science and quantum theory. Moreover, it is widely believed that communication complexity should scale with the size of the input data. Interestingly, this non-triviality of a CCP can be seen as an informational principle for why Nature cannot be more nonlocal than what is achievable within quantum theory [20,21].
The connection between Bell inequalities and communication complexity has only been proven for the standard notion of Bell nonlocality, which contrasts quantum mechanics with local hidden variable (LHV) models. In this standard scenario, one considers a number of separate parties who share a common source of correlations but cannot communicate, an arrangement that is severely limiting, particularly in the context of CCPs. The study of Bell nonlocality has however produced much more general and stronger notions of nonlocality [22-31]; these include scenarios that allow subsets of parties involved in a Bell experiment to communicate, while the classical description now considers nonlocal hidden variable (NLHV) models. So far the connection between CCPs and this generalised Bell nonlocality has not been investigated. That is precisely our aim.
Here we show that NLHV models define a new and more general class of CCPs. As opposed to the standard scenario, where each party has an exclusive part of the input data required to compute the desired function, in the generalised case the input data can be distributed in arbitrary manners. Every NLHV model defines a specific pattern in which the input data is distributed among the parties, see Fig. 1 (A). Moreover, every full correlator Bell inequality bounding the classical correlations in such NLHV models not only defines a target function in the CCP, but also the corresponding probability of success in trying to compute it with a restricted amount of communication between the parties. Thus, the violation of these Bell inequalities is a necessary and sufficient condition for a quantum advantage in generalised CCPs. This establishes the first general connection between this new form of nonclassicality emerging from NLHV models and a relevant quantum information task. We experimentally investigate a three-party CCP task with a causal structure inspired by the guess-your-neighbour-input (GYNI) nonlocal game, and demonstrate an increased winning probability when using quantum resources which violate the associated Bell-like inequality. This exemplifies the need to consider the underlying causal structure when defining the Bell-like inequalities for generalised CCPs.

Bell scenarios with communication and nonlocal hidden variable models.
In a standard Bell scenario, each of n distant parties receives an input $x_i$ and outputs $a_i$ (with $i = 1, \dots, n$). A local hidden variable description implies that the observed probability distribution $p(a_1, \dots, a_n|x_1, \dots, x_n) = p(\vec{a}|\vec{x})$ can be decomposed as

$$p(\vec{a}|\vec{x}) = \sum_{\lambda} p(\lambda) \prod_{i=1}^{n} p(a_i|x_i,\lambda), \tag{1}$$

where $\lambda$ is a classical random variable accounting for all correlations observed between the measurement outcomes of the distant parties. In turn, in a quantum description, Born's rule implies that

$$p(\vec{a}|\vec{x}) = \mathrm{Tr}\left[\left(M^{x_1}_{a_1} \otimes \cdots \otimes M^{x_n}_{a_n}\right)\rho\right], \tag{2}$$

where $M^{x_i}_{a_i}$ are measurement operators and $\rho$ describes the quantum state shared between the parties. As shown by Bell [33], there are quantum correlations (2) that cannot be written as (1). This is the phenomenon known as Bell nonlocality, and it is witnessed by the violation of Bell inequalities [19,33], linear constraints on the probabilities that must be respected by any distribution of the form (1).
In spite of its importance, the usual Bell scenario is rather restrictive, in that no communication can take place between the parties. Alternatively, one can think of an external agent that generates a sequence of values of n random variables $\{x_i\}$ and sends (possibly overlapping) subsets of this sequence to each of the n parties involved in the Bell test. In this more general scenario the i-th party can receive a total of $l_i$ inputs that we label as $x_{i,j} \in \{x_i\}$ with $j = 1, \dots, l_i$ and $x_{i,1} = x_i$. Let the set of inputs of party i be organised in a vector $\vec{x}_i = \{x_{i,j}|j = 1, \dots, l_i\}$. A classical description is then given by an NLHV model,

$$p(\vec{a}|\vec{x}) = \sum_{\lambda} p(\lambda) \prod_{i=1}^{n} p(a_i|\vec{x}_i,\lambda), \tag{3}$$

which can be graphically represented by a directed acyclic graph where each measurement outcome $a_i$ has a set of parents $x_{i,j}$. See Fig. 1 (B) and Fig. 1 (C) for examples. Analogously to Eq. (2), the set of quantum correlations in this extended Bell scenario is described as

$$p(\vec{a}|\vec{x}) = \mathrm{Tr}\left[\left(M^{\vec{x}_1}_{a_1} \otimes \cdots \otimes M^{\vec{x}_n}_{a_n}\right)\rho\right], \tag{4}$$

that is, the measurement settings of each party may now depend on subsets of $\{x_i\}$, denoted as $\vec{x}_i$ for the subset held by party i, and for different parties these might overlap. From a broad perspective, we are imposing a given causal structure on the experiment, one in which parts of the input of a given party can also be known by other distant parties. Similarly to the usual Bell's theorem, we will be interested in whether: i) there are quantum correlations, Eq. (4), that do not have a classical description as in Eq. (3); and ii) this nonclassicality can be harnessed in the processing of information, in particular in CCPs.
The answer to the first question will inherently depend on the specific causal structure under analysis, but positive examples are known [22,24-30] and will be explored in more detail below. Before that, a general answer to the second question is the central theoretical result of this paper.

Communication Complexity and Bell inequalities.
Without loss of generality, in a usual CCP involving n participants [15], each party i receives two bits, $x_i$ and $y_i$, and can broadcast to all other parties just a one-bit message. As an example of such a CCP, one can consider that the parties want to schedule an appointment: their local inputs represent their availability in different time slots, and the function they want to compute relates to finding a time slot when all of them are available. However, one can think of more general scenarios where the schedule (or part of it) of one of the participants is known to the others. Having this in mind, in our generalised CCP each party i has access to the random variables $\vec{x}_i = \{x_{i,j}|j = 1, \dots, l_i\} \subset \{x_i|i = 1, \dots, n\}$ and $y_i$, where $x_i, y_i \in \{\pm 1\}$ (see Fig. 1 (A)). The values of the variables $x_i$ are drawn from a joint probability distribution $q(x_1, \dots, x_n)$, while the $y_i$'s are independently drawn from a uniform distribution. Furthermore, the parties are also allowed to share correlated systems and use their measurement outcomes $a_i \in \{\pm 1\}$ in the execution of the protocol. Here, we follow closely the conceptual framework of the seminal results in Ref. [15], the first to provide a general connection between LHV models and CCPs. As in Ref. [15], the goal is for each party to evaluate a binary function

$$f(x_1, \dots, x_n, y_1, \dots, y_n) = \prod_{i=1}^{n} y_i \, S[Q(x_1, \dots, x_n)], \tag{5}$$

where $Q(x_1, \dots, x_n)$ is a function of all inputs, and $S[Q] = Q/|Q|$ is the sign function, with the restriction that each party i can only broadcast a single bit $m_i = m_i(x_{i,1}, \dots, x_{i,l_i}, y_i, a_i)$ to every other party j. If party i guesses $G_i(\vec{x}, \vec{y})$ for the function $f(\vec{x}, \vec{y})$, its probability of success is given by

$$P_i = \frac{1}{2^n} \sum_{\vec{x},\vec{y}} q(\vec{x}) \, \delta_{G_i(\vec{x},\vec{y}),\, f(\vec{x},\vec{y})}, \tag{6}$$

where $\delta_{G_i, f} = 1$ if $G_i(\vec{x}, \vec{y}) = f(\vec{x}, \vec{y})$ and 0 otherwise. Consider now a general Bell inequality of the form

$$\sum_{x_1, \dots, x_n = \pm 1} Q(x_1, \dots, x_n) \, E_{x_1, \dots, x_n} \le B^C_n, \tag{7}$$

where $Q(x_1, \dots, x_n)$ is the coefficient of $E_{x_1, \dots, x_n}$, which stands for the full correlation function, and $B^C_n$ is the classical bound associated with a given causal structure described by the NLHV decomposition in Eq. (3). Then, our main theoretical result is to show that a violation of such a Bell inequality is necessary and sufficient to lead to a quantum advantage in a CCP related to the computation of the function in Eq. (5). This is stated in the following theorem, the proof of which is given in the Methods and Supplementary Material.
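Theorem 1. Given a Bell inequality of the form (7), the optimal classical probability of success $P^C_i$ of party i computing the function in Eq. (5), with inputs distributed according to $q(x_1, \dots, x_n) = |Q(x_1, \dots, x_n)|/\Gamma$, is limited by

$$P^C_i \le \frac{1}{2}\left(1 + \frac{B^C_n}{\Gamma}\right), \tag{10}$$

with $\Gamma = \sum_{x_1, \dots, x_n = \pm 1} |Q(x_1, \dots, x_n)|$. Moreover, using the correlations shared between the parties there is a protocol achieving

$$P_i = \frac{1}{2}\left(1 + \frac{B_n}{\Gamma}\right), \tag{11}$$

where $B_n$ is the value of the left-hand side of (7) attained by the shared correlations, thus showing that a violation of the Bell inequality (7) is both necessary and sufficient for an advantage in the CCP.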
This result shows that every full correlator Bell inequality that displays a quantum violation is associated with a CCP with quantum advantage, even in scenarios where the parties can communicate. As detailed in the Methods and Supplementary Material, to achieve the probability of success (11), the message $m_i$ communicated from one party to all others should be $m_i = y_i a_i$, that is, the product of its input $y_i$ with the measurement outcome $a_i$. Interestingly, as we show next, there are inequalities that do not show quantum violations in standard Bell scenarios but are violated if such communication is allowed. In order to illustrate the theorem, we present a Bell inequality along with its corresponding CCP, associated with a tripartite Bell scenario with communication related to the guess-your-neighbour's-input (GYNI) scenario [32]. We then proceed to implement this scenario experimentally. As a second illustration of the Theorem, we introduce the 'Svetlichny' scenario [22] in the Supplementary Material.

Guess-your-neighbour's-input scenario.
The causal structure for this scenario (see Fig. 1 (B)), akin to the guess-your-neighbour's-input game [32], was introduced in [24], leading to a new type of multipartite nonlocality.Classical correlations (3) are bounded by the inequality [24] with
According to the Theorem, the classical probability of success in computing the associated function is $P^C_{Suc} \le 7/8 = 0.875$. As shown in [24], a quantum violation of this inequality is not possible if party i only has access to the input $x_i$. If, however, the three parties share a GHZ state of the form

$$|\mathrm{GHZ}\rangle = \frac{1}{\sqrt{2}}\left(|000\rangle + |111\rangle\right)$$

and are able to choose their measurements according to the GYNI causal structure depicted in Fig. 1 (B), the inequality (12) can be violated up to $B_G \approx 7.39$. This results in a higher probability of success, $P_{Suc} \approx 0.962$, a quantum advantage in this CCP. As shown in the Supplementary Material, resorting to a generalisation of the NPA hierarchy [34] (a secondary but still relevant technical contribution of our results), this is the optimal quantum value.
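As a rough illustration of how full correlators of a GHZ state can be evaluated numerically, the following sketch computes $\langle O_1 \otimes O_2 \otimes O_3 \rangle$ for the standard three-qubit GHZ state $(|000\rangle + |111\rangle)/\sqrt{2}$. The equatorial observables $\cos\phi\,\sigma_x + \sin\phi\,\sigma_y$ and the function names are assumptions chosen for illustration; they are not the optimal settings of the GYNI scenario.

```python
import cmath
import math

def equatorial_observable(phi):
    """2x2 observable cos(phi) X + sin(phi) Y, with eigenvalues +1 and -1."""
    return [[0, cmath.exp(-1j * phi)],
            [cmath.exp(1j * phi), 0]]

def ghz_state():
    """Non-zero amplitudes of (|000> + |111>)/sqrt(2), keyed by (i, j, k)."""
    amp = 1 / math.sqrt(2)
    return {(0, 0, 0): amp, (1, 1, 1): amp}

def full_correlator(state, observables):
    """<psi| O1 (x) O2 (x) O3 |psi> for a sparse three-qubit state."""
    o1, o2, o3 = observables
    total = 0j
    for (i, j, k), bra in state.items():
        for (l, m, n), ket in state.items():
            total += bra.conjugate() * o1[i][l] * o2[j][m] * o3[k][n] * ket
    return total.real

# Illustrative settings only: X X X and X Y Y.
e_xxx = full_correlator(ghz_state(), [equatorial_observable(0.0)] * 3)
e_xyy = full_correlator(
    ghz_state(),
    [equatorial_observable(0.0),
     equatorial_observable(math.pi / 2),
     equatorial_observable(math.pi / 2)],
)
```

For this family of measurements the correlator equals $\cos(\phi_1 + \phi_2 + \phi_3)$, so the two illustrative settings give perfectly correlated and perfectly anticorrelated outcomes respectively.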
Experimental implementation. We experimentally investigate the GYNI scenario by producing a tripartite GHZ state encoded in the polarisation of telecom-wavelength photons. We measure the correlation terms defined by inequality (12) to demonstrate that the experimentally observed state can violate the inequality, which is both necessary and sufficient for a quantum advantage in the CCP. For each correlation term, we implement the optimal measurement settings in Alice, Bob and Charlie's polarisation analysers and record the photon statistics for all $2^3$ outcomes to evaluate the expectation values as shown in Fig. 3. See Methods for details on the optimal measurement settings. From our measurements we obtain a correlation value of $B_G = 7.023 \pm 0.036$, representing a violation of the inequality (12) by 28 standard deviations with respect to the classical bound of $B^C_G = 6$. Using Eq. (11), the observed violation translates to a probability of correctly computing the CCP in this GYNI scenario of $P_{Suc} = 0.9389 \pm 0.0049$.
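The mapping between a Bell value and a success probability can be checked with a few lines of code. The sketch below assumes the relation $P = \frac{1}{2}(1 + B/\Gamma)$ and takes $\Gamma = 8$, a value inferred here from the quoted pair (classical bound 6, success probability 7/8) rather than stated explicitly in the text; the function and variable names are illustrative.

```python
def success_probability(bell_value, gamma):
    """Success probability (1/2)(1 + B / Gamma) associated with a Bell value B."""
    return 0.5 * (1.0 + bell_value / gamma)

# Assumption: Gamma = 8, inferred from the classical bound 6 <-> 7/8.
GAMMA_GYNI = 8

classical = success_probability(6, GAMMA_GYNI)        # classical bound
quantum_max = success_probability(7.39, GAMMA_GYNI)   # quoted maximal violation
measured = success_probability(7.023, GAMMA_GYNI)     # measured Bell value

print(f"classical bound : {classical:.4f}")
print(f"quantum maximum : {quantum_max:.4f}")
print(f"from measurement: {measured:.4f}")
```

Under these assumptions the three central values agree with the figures quoted in the text to the stated precision; the sketch does not attempt to reproduce the experimental uncertainty.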
We implement the tripartite GYNI protocol in a faithful round-by-round execution to verify the general connection between CCPs and NLHV models established in this work. In each round of the protocol, we distribute a GHZ state to three users, and use the NIST randomness beacon [35] to generate randomised input data $(x_1, x_2, x_3, y_1, y_2, y_3)$. Each user receives their input bits $\{x_i, x_j, y_i\}$, updates their polarisation analysers with their respective measurement settings $M^{(x_i,x_j)}$, and records a single coincidence event to obtain their respective outcome, $a_i$, for the round (see Methods for details). Finally, every user announces their one-bit message, $m_i = y_i \cdot a_i$, evaluates a guess for the round and compares the joint result with the value of the target function, obtaining a pass/fail for the round.
We perform the protocol for a total of 10100 rounds and observe successful outcomes for 9403 rounds, corresponding to a probability of success of $P_{Suc} = 0.9310$. The experimentally measured probability of success is slightly smaller than the value estimated from the measured violation of the inequality. This is likely due to drifts in the setup when performing the protocol over the large number of rounds. For the round-by-round execution we collected two days' worth of statistics, whilst the measurements pertaining to the inequality violation were recorded in less than an hour. The majority of data acquisition time in both cases is owed to slow rotation stages that update measurement settings. Our implementation of the protocol obtained

Each user performs projective measurements on their qubit using a quarter-wave plate (QWP), half-wave plate (HWP), and PBS. Single photons are detected using superconducting nanowire single-photon detectors (SNSPD) and time-tagged for coincidence measurements within a 1 ns window.
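The round-by-round figures can be put in context with a simple binomial estimate. The sketch below is a minimal calculation that assumes independent, identically distributed rounds (an idealisation, given the setup drifts mentioned above) and compares the observed success frequency with the classical bound of 7/8; the function name is illustrative.

```python
import math

def binomial_summary(successes, rounds, classical_bound):
    """Observed frequency, its binomial standard error, and the excess
    over the classical bound in units of that standard error."""
    p_hat = successes / rounds
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / rounds)
    excess_sigma = (p_hat - classical_bound) / std_err
    return p_hat, std_err, excess_sigma

# Round counts quoted in the text; 7/8 is the classical bound.
p_hat, std_err, excess_sigma = binomial_summary(9403, 10100, 7 / 8)

print(f"observed success frequency: {p_hat:.4f}")
print(f"binomial standard error   : {std_err:.4f}")
print(f"excess over 7/8 (sigma)   : {excess_sigma:.1f}")
```

This is only a back-of-the-envelope check; a rigorous analysis would need to account for drifts and any memory effects between rounds.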

DISCUSSION
Generalisations of Bell's theorem to more complex causal networks, and in particular those involving communication between the parties, are attracting growing attention. On the one hand, they unveil new [30,36] and sometimes stronger [22,27] kinds of quantum nonlocality. On the other hand, the practical use of this nonclassicality in the processing of information has so far been limited to specific cases such as device-independent entanglement quantification [37], closing attacks in multipartite cryptographic protocols [38] and game theory [39]. Here we proposed a general approach by showing that nonlocal hidden variable models introduce a new class of communication complexity problems that contain previous versions [15] as special cases. The Bell inequalities bounding classical correlations in such models can be mapped to the functions to be computed in the associated communication problem. Further, the violations of these inequalities provide a necessary and sufficient condition for a quantum advantage over the best possible classical protocol. Our results are theoretically proven in full generality and validated for a specific scenario in an experiment based on a high-fidelity tripartite GHZ state, demonstrating a quantum violation of the Bell inequality (12), akin to the guess-your-neighbour-input scenario [24,32].
Our implementation of the protocol in its entirety, using randomised measurement settings on a round-by-round basis, demonstrates the advantage in a CCP task when using nonlocality as a quantum resource. Future quantum networks will enable the implementation of increasingly sophisticated CCP-related tasks over distances, e.g., linking a cluster of network nodes.
Here the entanglement resources are produced in the telecom regime, enabling low-loss transmission in optical fibres connecting network nodes.Additionally, the investigation of different CCP tasks associated with "network-friendly" multipartite entangled states such as graph states would further provide utility in more generalised scenarios.
In spite of the generality of our results, a few relevant questions still require further in-depth analysis. As we show here, the violation of a full correlator Bell inequality is a necessary and sufficient condition for a quantum advantage, even in NLHV models related to Bell scenarios with communication. But for which NLHV models are such violations possible? Initial attempts [24] have provided partial answers in the case where the quantum correlations are nonsignalling, that is, the quantum measurements do not make use of the inputs of other parties. The answer to the more general case remains open. Further, we have focused here on full correlator Bell inequalities, and it is known that, in the more general case, the violation of Bell inequalities does not necessarily lead to quantum improvements in standard CCPs [40]. In view of that, analysing under which conditions quantum advantages in generalised CCPs can also be connected with Bell inequalities involving marginals and more measurement outcomes is an interesting question for future research.

Entangled photon source.
We employ two parametric down conversion (PDC) sources to create the polarisation-encoded GHZ state. Each source consists of a 30 mm aperiodically poled KTP (aKTP) crystal designed to produce spectrally pure photon pairs at 1550 nm in the Type-II configuration [41]. This is achieved through an optimised domain engineering technique, where aperiodic poling achieves a non-linear Phase Matching Function (PMF) that approximates a Gaussian, resulting in near-optimal biphoton spectral purity [42]. This approach allows our photon sources to operate without lossy narrowband filters (the interference filters have a nominal full-width-half-maximum bandwidth of 8.8 nm), allowing higher heralding and collection efficiencies while maintaining high-visibility non-classical interference, as required for producing multi-photon states efficiently. Our crystals are designed for matching a transform-limited sech-shaped pump spectrum with a pulse duration of 1.3 ps. The domain-engineered crystal is embedded in a Sagnac loop which generates polarisation entanglement between the photon pair. A lens with a 50 cm nominal focal length is used to focus the pump field into each crystal, leading to a source brightness of ∼2400 pairs/mW/s and heralding efficiencies of ∼60%. With 50 mW of pump power we witness an interference visibility of 94.2 ± 1.5% between photons generated from independent sources without any filtering. In addition, the picosecond laser is spatially multiplexed, attaining a 320 MHz repetition rate. This is implemented using two free-space delay loops built from 50:50 beamsplitters (BS) and mirrors. This allows the peak power per pulse to be reduced, lowering the probability of unwanted multi-photon events at the same pump power.
One photon from each source interferes non-classically on a PBS such that, conditional on measuring one photon in each output detector set, Alice (A), Bob (B), Charlie (C), and Trigger (T), the quantum state of the overall four-photon system is, where $|H\rangle \equiv |0\rangle$ and $|V\rangle \equiv |1\rangle$ in the logical basis encoding [43]. We note the phase shift $\vartheta$ is intrinsic to the optical components in our setup and can be compensated by local operations on any one of the entangled qubits. Our setup uses standard polarisation measurement analysers, which consist of a QWP, HWP and PBS where the output modes are fibre coupled to SNSPDs, to perform arbitrary projective measurements on each qubit.
To obtain the three-qubit GHZ state for this experiment we project the Trigger photon onto the state $|\vartheta\rangle \doteq (|H\rangle + e^{-i\vartheta}|V\rangle)/\sqrt{2}$, which ensures entanglement among the remaining photons and simultaneously implements the phase correction. Detecting a photon after the PBS projects the remaining three photons onto the following state, up to a local bit-flip which is implemented in the respective user's measurement stage with the use of the polarisation fibre controller. We perform quantum state tomography and reconstruct the density matrix to characterise the general properties of the GHZ state. We observe the fidelity to the ideal state to be F = 0.9508 ± 0.0031 and a state purity of P = 0.9255 ± 0.0058. Uncertainties are reported for one standard deviation and obtained by Monte Carlo sampling using 200 runs, assuming Poissonian statistics.

Measurement settings.
In our experiment the measurement setting in each round for each user is determined by two input bits $\{x_i, x_j\}$, distributed as per the GYNI scenario. As such, each user has four possible projective measurement settings, $M^{(x_i,x_j)}$, for Alice, Bob and Charlie. The correlators, $E_{x_1 x_2 x_3}$, measured in our experiment are expressed in conventional notation, in which the subscript indices denote the local variable $x_i$ assigned to each user prior to distribution to their neighbour. As such, we make use of the following look-up table to determine the measurements that are performed by each user:

Measurement setting   Alice   Bob   Charlie

Our measurement apparatus allows us to perform arbitrary projective measurements by using the HWP and QWP to rotate the measurement basis. Placement of detectors behind both outputs of the PBS enables us to obtain outcomes spanning the full basis set.

Calculating correlations.
In the experiment each user's measurement stage is equipped with two detectors to measure both outputs of the PBS. This allows us to directly sample the joint outcomes for a given basis defined by the measurement settings of each user. For example, to evaluate the correlator $E_{+++}$ we set the measurement waveplates to implement $M_1^{(+,+)}$, $M_2^{(+,+)}$, and $M_3^{(+,+)}$ for Alice, Bob and Charlie respectively. We record the three-fold coincidence events according to the outcome detector patterns and evaluate

$$E_{+++} = \frac{\sum_{i,j,k} i\,j\,k\,C_{ijk}}{\sum_{i,j,k} C_{ijk}},$$

where $C_{ijk}$ are the number of coincidences, and the indices $i, j, k \in \{+, -\}$ (read as $\pm 1$ in the product) denote the outcomes for Alice, Bob and Charlie respectively. Finally, we note that the other correlators are evaluated in the same way, following the measurement settings outlined previously.
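The evaluation of a full correlator from three-fold coincidences can be sketched as follows. This is a minimal illustration assuming the correlator is the count-weighted average of the product of the three ±1 outcomes; the function name and the example counts are hypothetical and are not experimental data.

```python
from itertools import product

def correlator_from_counts(counts):
    """Full correlator from three-fold coincidence counts.

    `counts` maps outcome triples (i, j, k), each entry +1 or -1,
    to the number of recorded coincidences for that detector pattern.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no coincidences recorded")
    signed = sum(i * j * k * c for (i, j, k), c in counts.items())
    return signed / total

# Hypothetical counts for illustration only (not experimental data).
example_counts = {outcome: 0 for outcome in product((+1, -1), repeat=3)}
example_counts[(+1, +1, +1)] = 240
example_counts[(+1, -1, -1)] = 250
example_counts[(-1, +1, -1)] = 245
example_counts[(-1, -1, +1)] = 235
example_counts[(+1, +1, -1)] = 10
example_counts[(-1, -1, -1)] = 20

example_E = correlator_from_counts(example_counts)
```

Patterns with an even number of −1 outcomes contribute positively and those with an odd number contribute negatively, so a state with strong GHZ-type correlations in the chosen basis yields a correlator close to ±1.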

Sketch of the Theorem's Proof.
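The full proof of (10) is rather lengthy and presented in the Supplementary Material. Here we focus on (11). First notice that full correlators can be written as

$$E_{x_1, \dots, x_n} = P_{x_1, \dots, x_n}\left(\prod_{i=1}^{n} a_i = 1\right) - P_{x_1, \dots, x_n}\left(\prod_{i=1}^{n} a_i = -1\right). \tag{16}$$

Inequality (7) can then be rewritten as

$$\sum_{x_1, \dots, x_n = \pm 1} q^*(x_1, \dots, x_n) \, P_{x_1, \dots, x_n}\left(\prod_{i=1}^{n} a_i = S[Q(x_1, \dots, x_n)]\right) \le \frac{1}{2}\left(1 + \frac{B^C_n}{\Gamma}\right), \tag{17}$$

in which $q^*(x_1, \dots, x_n) = |Q(x_1, \dots, x_n)|/\Gamma$.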
The protocol proceeds as follows.
Each party i chooses a measurement to perform according to its inputs $\{x_{i,j}|j = 1, \dots, l_i\}$, obtaining outcome $a_i$. Each party then broadcasts to all other parties the message $m_i = a_i y_i \in \{\pm 1\}$. In the final step all parties make the same guess about the function f to be computed, given by

$$G(\vec{x}, \vec{y}) = \prod_{i=1}^{n} m_i = \prod_{i=1}^{n} y_i a_i. \tag{18}$$

A comparison between the guess of each party (18) and the function (5) to be computed shows that, given a sequence of inputs $x_1, \dots, x_n, y_1, \dots, y_n$, the success probability is independent of $y_1, \dots, y_n$ and given by $P_{x_1, \dots, x_n}\left(\prod_{i=1}^{n} a_i = S[Q(x_1, \dots, x_n)]\right)$. Hence, since the variables $x_1, \dots, x_n$ are drawn according to a distribution $q(x_1, \dots, x_n)$ and the variables $y_1, \dots, y_n$ are drawn uniformly, the final probability of success is

$$P = \sum_{x_1, \dots, x_n = \pm 1} q(x_1, \dots, x_n) \, P_{x_1, \dots, x_n}\left(\prod_{i=1}^{n} a_i = S[Q(x_1, \dots, x_n)]\right). \tag{19}$$

A comparison with (17), taking $q = q^*$, leads directly to (11).
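The broadcast protocol can be checked by brute force in a small case. The sketch below uses the two-party CHSH expression in a standard Bell scenario as a stand-in (an assumption made for illustration; it is not the GYNI inequality of the main text) and enumerates all deterministic local responses. For each one it compares the success probability of the protocol $m_i = y_i a_i$ with $\frac{1}{2}(1 + B/\Gamma)$, taking the inputs distributed as $|Q|/\Gamma$. It only examines this particular protocol, so it illustrates the relation behind (11) rather than the upper bound over all classical protocols.

```python
from itertools import product

INPUTS = (+1, -1)

def Q(x1, x2):
    """Illustrative CHSH coefficients: +1 except for (x1, x2) = (-1, -1)."""
    return -1 if (x1, x2) == (-1, -1) else 1

GAMMA = sum(abs(Q(x1, x2)) for x1, x2 in product(INPUTS, repeat=2))

def sign(value):
    return 1 if value > 0 else -1

def bell_value(a1, a2):
    """Bell expression for deterministic responses a1[x1], a2[x2]."""
    return sum(Q(x1, x2) * a1[x1] * a2[x2]
               for x1, x2 in product(INPUTS, repeat=2))

def protocol_success(a1, a2):
    """Success probability when each party broadcasts m_i = y_i * a_i
    and everyone guesses the product of the messages."""
    total = 0.0
    for x1, x2 in product(INPUTS, repeat=2):
        q = abs(Q(x1, x2)) / GAMMA            # input distribution |Q| / Gamma
        for y1, y2 in product(INPUTS, repeat=2):
            guess = (y1 * a1[x1]) * (y2 * a2[x2])
            target = y1 * y2 * sign(Q(x1, x2))
            total += q * 0.25 * (guess == target)   # y's are uniform
    return total

strategies = [dict(zip(INPUTS, outs)) for outs in product(INPUTS, repeat=2)]
results = [(bell_value(a1, a2), protocol_success(a1, a2))
           for a1 in strategies for a2 in strategies]

best_bell = max(b for b, _ in results)
best_success = max(p for _, p in results)
```

In this toy case the best deterministic strategy reaches the CHSH local bound, and every strategy satisfies the linear relation between Bell value and success probability.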
Recall that the distributed function that the parties should be able to compute is The fidelity between the function and Alice's guess, ( f , G A ), is defined as: x 1 ,x 2 ,x 3 ,y 1 ,y 2 ,y 3 =−1 ∑ q 1 ,q 2 ,q 3 ,q 4 ,q 5 G q 1 q 2 q 3 q 4 q 5 x q 1 1 y Notice that from equation (S13), we have: and since m B (x 2 , y 2 , x 1 ) ∈ {−1, 1}, we must have: The same argument can be applied on equation (S14) to obtain: and to equation (S15) to obtain: Using these relations, we can define: This leads to the following expression: which, from inequality (S9), is clearly bounded by: Once the success probability reads: We have that: The same holds for Bob, given the symmetry of the problem.On the other hand, the most general guess Charlie may provide is: q 1 ,q 2 ,q 3 ,q 4 G q 1 q 2 q 3 q 4 x q 1 3 y The fidelity between function f (x 1 , x 2 , x 3 , y 1 , y 2 , y 3 ) and Charlie's guess, ( f , G C ), reads: .
Notice that, from equation (S13), we have: and since m B (x 2 , y 2 , x 1 ) ∈ {−1, 1}, we must have: The same argument can be applied on equation (S12) to obtain: and to equation (S25) to obtain: Using these relations, we can define: These lead to: which, from inequality (S9), is clearly bounded by: Once the success probability reads: we have that: This shows, once again, that Theorem 1 holds.

GUESS-YOUR-NEIGHBOUR-INPUT SCENARIO
Performing a brute-force optimization over all pure qubit states and projective measurements, we have found that the maximum violation of the inequality (12) in the main text is given by $S_G \approx 7.391$. As discussed below, this value is very close to the numerical value obtained by running an extension of the NPA hierarchy [34] for a Bell scenario with communication. In that case, the optimization is performed over all possible quantum states and measurements, but in general it only provides an upper bound on the maximum quantum value.
Regarding the maximum quantum violation with qubits.The projective measurement for each of the three parties can be written as where σ = (σ x , σ y , σ z ) are the Pauli matrices.By performing a brute-force numerical optimization the best state is a GHZ state and the optimal measurements are given by The fidelity ( f , G A ) reads: x 1 ,x 2 ,x 3 ,y 1 ,y 2 ,y 3 =−1 ∑ q 1 ,q 2 ,q 3 ,q 4 ,q 5 G q 1 q 2 q 3 q 4 q 5 x q 1 1 y Notice that from equation S44, we have: 1 and once m B (x 2 , y 2 , x 1 ) ∈ {−1, 1} we must have: The same argument can be applied on equation S45 to obtain: and to equation S46 to obtain: Using these relations, we can define: Leading to: Which, given inequality S40, clearly is bounded to: Once the success probability reads: We have that: proving that theorem 1 holds.Notice that this result holds for every party given the symmetry of the problem.

GENERALIZATION OF THE NPA HIERARCHY TO SCENARIOS WITH COMMUNICATION
In this section we show how to estimate an upper bound on the success of quantum strategies.Since a direct computation of the best quantum strategy is a difficult problem, we resort to successive approximations of the set of quantum probabilities by a Navascues-Pironio-Acin (NPA) hierarchy of supersets [34], with supersets of higher levels in the hierarchy contained in all sets of lower level, in a way that the sequence converges exactly to the quantum set as the level goes to infinity.
To employ this technique in a scenario that involves signalling among the parties, we assume that a genuine quantum realization of any given probability distribution implies the existence of measurement operators $M^{x,x'}_a$ for each possible combination of local and nonlocal inputs (represented by $x$ and $x'$, respectively). Effectively, this means that we take the signalling scenario as a particular case of a larger nonsignalling scenario, where local-input alphabets are augmented to incorporate the influence of communicating parties. This notion is illustrated in Fig. S2.
In particular, a quantum realization for a distribution compatible with the guess-your-neighbour's-input scenario, shown in Fig. 1b of the main text, is given by positive semidefinite operators $M^{x,u}_a$, $M^{y,v}_b$, $M^{z,w}_c$ and a density matrix $\rho$ such that

$$P(abc|(x,u)\,(y,v)\,(z,w)) = \mathrm{Tr}\left[\rho \, M^{x,u}_a \otimes M^{y,v}_b \otimes M^{z,w}_c\right]$$

and $\sum_a M^{x,u}_a = \mathbb{1}$ (and similarly for the other parties), which in turn realize the observed distribution when the identifications $u = z$, $v = x$, $w = y$ are made, so that $P(abc|xyz) = P(abc|(x, z)\,(y, x)\,(z, y))$.
FIG. S2. Adaptation of the NPA method to signalling scenarios. We find an augmented, nonsignalling scenario that includes the signalling one as a particular case. Nonlocal influences in the original model, such as X → B in the figure, are then mediated by local variables (X′ in the example), which are considered as independent variables. The original problem is recovered when the particular choice x′ = x is made.
We then solve the approximate compatibility problem with a standard semidefinite program (SDP). The problem consists of finding a truncated, positive semidefinite moment matrix compatible with the augmented probability, with the corresponding internal constraints and the additional constraint that the reduced distribution matches the observed one. Formally, the SDP is given by

$$\begin{aligned} \text{Given} \quad & P_{\rm sig},\ k \\ \text{find} \quad & P,\ M \\ \text{subject to} \quad & M \succeq 0, \\ & M \in \mathrm{NPA}_k(P), \\ & P|_{x' = x} = P_{\rm sig}, \end{aligned}$$

where the optimization variable $P$ corresponds to a distribution in the augmented scenario and $P_{\rm sig}$ is the distribution of interest in the signalling scenario. $M$ is a moment matrix, truncated according to the level of approximation desired (set as $k$ in the program description).
A valid moment matrix is one with entries $M_{ij}$ compatible with $\mathrm{Tr}(A_i A_j^{\dagger} \rho)$, for some density matrix $\rho$ and operators $A_i$ in a set $S = \{\mathbb{1}\} \cup S_1 \cup \dots \cup S_k$, with $S_n = \{AB \mid A \in S_1, B \in S_{n-1}\}$ for $n > 1$, and $S_1$ a nonempty set of operators, chosen, in our case, as the set of POVM elements $M^{x,u}_a$, $M^{y,v}_b$, $M^{z,w}_c$, with extra terms corresponding to combinations of three of these operators. Constraints over $M$ correspond to these compatibility conditions. In particular, membership in $\mathrm{NPA}_k$ corresponds to linear equality constraints among the entries of $M$, and between those entries and elements of the distribution $P$, imposed by the expected structure.
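The recursive construction of the operator set can be illustrated symbolically. The sketch below builds the words labelling a level-k moment matrix from a list of generator symbols; it is only a bookkeeping illustration with hypothetical function names. It assumes Hermitian generators (so that the adjoint of a word is the reversed word) and applies none of the operator identities (commutation between parties, idempotence or orthogonality of projectors) that a practical implementation would use to reduce the set.

```python
from itertools import product

def operator_words(generators, level):
    """All words of length 0..level over the generators.

    The empty tuple () stands for the identity operator.
    """
    words = [()]
    for length in range(1, level + 1):
        words.extend(product(generators, repeat=length))
    return words

def moment_matrix_labels(words):
    """Label entry (i, j) by the word A_i A_j^dagger.

    For Hermitian generators, the adjoint of a word is the reversed word.
    """
    return [[wi + tuple(reversed(wj)) for wj in words] for wi in words]

# Three symbolic generators, second level of the hierarchy.
words = operator_words(["A", "B", "C"], 2)
labels = moment_matrix_labels(words)
```

Without any reduction rules the number of words grows geometrically with the level, which is why practical implementations rely on the operator identities to keep the SDP tractable.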
In this way, we have obtained a violation of $S_G \approx 7.393$ using moments generated by the second level of the hierarchy, augmented with the extra terms ABC + AAB + AAC + BCB. Considering Svetlichny's scenario (Fig. S1) instead, and using inequality (S9) of the main text, we have obtained $S_S \approx 5.6568$, practically coinciding with the quantum bound $4\sqrt{2}$, already at the second level of the hierarchy with no extra terms. We used Peter Wittek's ncpol2sdpa library for Python [44] to obtain the moment matrix structure for the SDP, and we solved it using the MOSEK solver [45].
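As a sanity check of the value $4\sqrt{2}$, the following sketch evaluates a Svetlichny expression on a three-qubit GHZ state. It assumes the standard form of Svetlichny's inequality, in which correlators with zero or one primed setting enter with a plus sign and those with two or three primed settings with a minus sign; this may differ in labelling from inequality (S9). The measurement angles are one known optimal choice in the equatorial plane, not necessarily the settings used by the authors.

```python
import cmath
import math
from itertools import product


def equatorial_observable(phi):
    """2x2 matrix of cos(phi) X + sin(phi) Y, as nested lists."""
    return [[0, cmath.exp(-1j * phi)], [cmath.exp(1j * phi), 0]]


def ghz_correlator(phis):
    """Expectation value of a product of three equatorial observables
    on the GHZ state (|000> + |111>)/sqrt(2)."""
    amp = 1 / math.sqrt(2)
    ghz = {(0, 0, 0): amp, (1, 1, 1): amp}
    ops = [equatorial_observable(phi) for phi in phis]
    total = 0j
    for bra, ket in product(ghz, repeat=2):
        element = ghz[bra] * ghz[ket]      # amplitudes are real
        for q in range(3):
            element *= ops[q][bra[q]][ket[q]]
        total += element
    return total.real


def svetlichny_value(unprimed, primed):
    """Sum of the eight correlators with the assumed sign convention."""
    total = 0.0
    for choice in product((0, 1), repeat=3):
        phis = [primed[q] if choice[q] else unprimed[q] for q in range(3)]
        sign = -1 if sum(choice) >= 2 else 1
        total += sign * ghz_correlator(phis)
    return total


# Assumed optimal angles: each primed setting adds pi/2 to the total phase.
unprimed = (-math.pi / 4, 0.0, 0.0)
primed = (math.pi / 4, math.pi / 2, math.pi / 2)
value = svetlichny_value(unprimed, primed)
```

Each correlator equals the cosine of the sum of the three angles, so all eight signed terms contribute $\sqrt{2}/2$ and `value` agrees with $4\sqrt{2}$ up to floating-point error.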

PROOF OF THEOREM 1 IN THE GENERAL CASE
Here, we consider the general case in which a source distributes $2n$ bits $\{x_j, y_j\}_{j=1,\dots,n}$ among $n$ separated local parties. Party $i$ receives the bits $\{x_i, y_i\}$ plus $l_i - 1$ other bits from the set $\{x_j\}_{j \neq i}$. For convenience, we relabel the inputs communicated to party $i$ using two sub-indices: the first denotes the party taken as reference and the second is a sequential label. For instance, all variables $x_k$ that are communicated to party $i$ are relabelled as $x_{i,j}$ with $j \in \{1, \dots, l_i\}$ (in particular, we take $x_{i,1} = x_i$). Using the same notation, we mark with a tilde the variables in $\{x_j, y_j\}_{j=1,\dots,n}$ that are not communicated to party $i$, forming the set $\{\tilde{x}_{i,j}\}_{j=1,\dots,n-l_i}$.
Each party $j$ broadcasts a message to the other parties; the message that party $j$ sends to party $i$ is denoted $m_{i,j}$.

FIG. 1. Distributed computation in a CCP scenario. (A) A central agent allocates resources among a network of $n$ users who independently produce outcomes to solve a collective task. Each user $i$ receives $y_i$ and a subset $x_i$ of the variables $\{x_1, \dots, x_n\}$ according to some underlying causal structure, indicated by the shaded pink region. Each user processes their input data along with shared correlations and broadcasts a one-bit message $m_i$ to all others before they compute a function $f(x_1, \dots, x_n, y_1, \dots, y_n)$. When quantum resources are allowed, shared quantum correlations that violate a Bell-like inequality bounding the corresponding NLHV model provide an advantage in the probability of success of the task over its classical counterpart. (B) Causal structure of the three-party GYNI scenario [24,32] experimentally investigated. The input of each party $x_i$ is communicated to its neighbour to the right (or alternatively, to the left). The variable $\lambda$ stands for pre-shared (classical) correlations among the parties, used to produce their respective outcomes $a_i$. (C) Causal structure of Svetlichny's scenario [22], as discussed in the Supplementary Material.

FIG. 2. Experimental layout. (A) Conceptual layout of the three-user GYNI protocol. The tripartite GHZ state is distributed among the three users, who locally measure their photon based on input bits $\{x_i, x_j\}$. To compute the target function, each user broadcasts a one-bit message using their measured outcome and a local bit, $y_i$. (B) We create the GHZ state using two polarisation-entangled photon-pair sources and a linear-optics fusion gate. Each source is implemented with an aperiodically poled KTP crystal embedded in a Sagnac loop that is optically pumped bidirectionally using a picosecond mode-locked laser; see Methods for details. Down-converted photons are separated from the pump laser using dichroic mirrors (DM) and interference filters (IF), then coupled into single-mode fibres. One photon from each source non-classically interferes on a polarising beamsplitter (PBS), creating the three-photon GHZ state conditioned on measuring the fourth photon as a Trigger. Each user performs projective measurements on their qubit using a quarter-wave plate (QWP), half-wave plate (HWP), and PBS. Single photons are detected using superconducting nanowire single-photon detectors (SNSPD) and time-tagged for coincidence measurements within a 1 ns window.

FIG. 3. Experimental results. Experimentally measured correlation terms $E_{x_1 x_2 x_3}$ belonging to the nonlocal Bell inequality in the GYNI scenario, where $x_1, x_2, x_3 = \pm 1$. Orange bars represent the theoretical values assuming the optimal measurement settings, while yellow bars show the experimentally observed values. We evaluate the inequality and report a correlation value of $B_G = 7.023 \pm 0.036$. Error bars, estimated by Monte Carlo sampling using $N = 200$ runs and assuming Poissonian statistics, are omitted as they are too small to be visible.
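The error estimation mentioned in the caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the coincidence counts are hypothetical, the correlator is taken as the parity-weighted average of the eight three-fold outcome counts, and each count is resampled from a Poisson distribution with the observed value as its mean.

```python
import math
import random
import statistics
from itertools import product


def sample_poisson(rng, mean):
    """Draw one Poisson sample with Knuth's multiplication method
    (adequate for the modest means used in this sketch)."""
    threshold = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def correlator(counts):
    """Parity-weighted average: outcomes with even a+b+c count as +1."""
    total = sum(counts.values())
    signed = sum((-1) ** sum(outcome) * n for outcome, n in counts.items())
    return signed / total


def monte_carlo_error(counts, runs=200, seed=0):
    """Standard deviation of the correlator over Poisson-resampled counts."""
    rng = random.Random(seed)
    samples = []
    for _ in range(runs):
        resampled = {o: sample_poisson(rng, n) for o, n in counts.items()}
        samples.append(correlator(resampled))
    return statistics.stdev(samples)


# Hypothetical counts for one measurement setting (not experimental data).
counts = {o: (400 if sum(o) % 2 == 0 else 30)
          for o in product((0, 1), repeat=3)}
estimate = correlator(counts)
error = monte_carlo_error(counts)
```

For the full inequality, the same resampling would be applied to every setting before combining the correlators, so that the spread of the combined value gives its uncertainty.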