The standard model (SM) of particle physics provides precise predictions for the properties and interactions of fundamental particles, which have been confirmed by numerous experiments since the inception of the model in the 1960s. However, it is clear that the model is incomplete. The SM is unable to explain cosmological observations of the dominance of matter over antimatter, the apparent dark matter content of the Universe, or the patterns seen in the interaction strengths of the particles. Particle physicists have therefore been searching for ‘new physics’, that is, new particles and interactions that can explain the SM’s shortcomings.

One method to search for new physics is to compare measurements of the properties of hadron decays, where hadrons are bound states of quarks, with their SM predictions. Measurable quantities can be predicted precisely in the decays of a charged beauty hadron, B+, into a charged kaon, K+, and two charged leptons, +. The B+ hadron contains a beauty antiquark, \(\overline{b}\), and the K+ a strange antiquark, \(\overline{s}\), such that at the quark level the decay involves a \(\overline{b}\to \overline{s}\) transition. Quantum field theory allows such a process to be mediated by virtual particles that can have a physical mass larger than the energy available in the interaction. In the SM description of such processes, these virtual particles include the electroweak force carriers, the γ, W± and Z0 bosons, and the top quark (Fig. 1, left). Such decays are highly suppressed1, and the fraction of B+ hadrons that decay into this final state (the branching fraction, \({{{\mathcal{B}}}}\)) is on the order of 106 (ref. 2).

Fig. 1: Contributions to B+ → K++ decays in the SM and possible new physics models.
figure 1

A B+ meson, consisting of \(\overline{b}\) and u quarks, decays into a K+, containing \(\overline{s}\) and u quarks, and two charged leptons, +. Left: the SM contribution involves the electroweak bosons γ, W+ and Z0, and the up-type quarks ū, \(\bar{c}\) and \(\bar{t}\). Right: a possible new physics contribution to the decay with a hypothetical leptoquark (LQ) which, unlike the electroweak bosons, could have different interaction strengths with the different types of leptons.

A distinctive feature of the SM is that the different leptons, electron (e), muon (μ) and tau (\(\tau^{-}\)), have the same interaction strengths. This is known as ‘lepton universality’. The only exception to this is due to the Higgs field, since the lepton–Higgs interaction strength gives rise to the differing lepton masses mτ > mμ > me. The suppression of \(\overline{b}\to \overline{s}\) transitions is understood in terms of the fundamental symmetries on which the SM is built. Conversely, lepton universality is an accidental symmetry of the SM, which is not a consequence of any axiom of the theory. Extensions to the SM that aim to address many of its shortfalls predict new virtual particles that could contribute to \(\overline{b}\to \overline{s}\) transitions (Fig. 1, right) and could have non-universal interactions, hence giving branching fractions of B+ → K++ decays with different leptons that differ from the SM predictions. Whenever a process is specified in this paper, the inclusion of the charge-conjugate mode is implied.

Calculation of the SM predictions for the branching fractions of B+ → K+μ+μ and B+ → K+e+e decays is complicated by the strong nuclear force that binds together the quarks into hadrons, as described by quantum chromodynamics (QCD). The large interaction strengths preclude predictions of QCD effects with the perturbation techniques used to compute the electroweak force amplitudes, and only approximate calculations are currently possible. However, the strong force does not couple directly to leptons, hence its effect on the B+ → K+μ+μ and B+ → K+e+e decays is identical. The ratio between the branching fractions of these decays is therefore predicted with \({{{\mathcal{O}}}}(1 \%)\) precision3,4,5,6,7,8. Due to the small masses of both electrons and muons compared with that of b quarks, this ratio is predicted to be close to unity, except where the value of the dilepton invariant mass-squared (q2) significantly restricts the phase space available to form the two leptons. Similar considerations apply to decays with other B hadrons, B → Hμ+μ and B → He+e, where B= B+, B0, \({B}_{s}^{0}\) or \({\Lambda }_{b}^{0}\), and H can be, for example, an excited kaon, K*0, or a combination of particles such as a proton and charged kaon, pK. The ratio of branching fractions, RH (refs. 9,10), is defined in the dilepton mass-squared range \({q}_{\min }^{2} < {q}^{2} < {q}_{\max }^{2}\) as

$${R}_{H}\equiv \frac{\int\nolimits_{{q}_{\min }^{2}}^{{q}_{\max }^{2}}\frac{{{{\rm{d}}}}{{{\mathcal{B}}}}\ \ (B\to H{\mu }^{+}{\mu }^{-})}{{{{\rm{d}}}}{q}^{2}}{{{\rm{d}}}}{q}^{2}}{\int\nolimits_{{q}_{\min }^{2}}^{{q}_{\max }^{2}}\frac{{{{\rm{d}}}}{{{\mathcal{B}}}}\ \ (B\to H{e}^{+}{e}^{-})}{{{{\rm{d}}}}{q}^{2}}{{{\rm{d}}}}{q}^{2}}\,.$$

For decays with H = K+ and H = K*0 such ratios, denoted by RK and RK*0, respectively, have previously been measured by the Large Hadron Collider beauty (LHCb)11,12, Belle13,14 and BaBar15 collaborations. For RK the LHCb measurements are in the range 1.1 < q2 < 6.0 GeV2 c−4, whereas for RK*0, the ranges are 0.045 < q2 < 1.1 GeV2 c−4 and 1.1 < q2 < 6.0 GeV2 c−4. These ratios have been determined to be 2.1–2.5 standard deviations below their respective SM expectations3,4,5,6,7,16,17,18,19,20,21,22. The analogous ratio has also been measured for \({\Lambda }_{b}^{0}\) decays with H = pK and is compatible with unity at the level of one standard deviation23.

These decays all proceed via the same \(\overline{b}\to \overline{s}\) quark transition, and the results have therefore further increased interest in measurements of angular observables24,25,26,27,28,29,30,31,32,33,34 and branching fractions35,36,37,38 of decays mediated by \(\overline{b}\to \overline{s}{\mu }^{+}{\mu }^{-}\) transitions. Such decays also exhibit some tension with the SM predictions but the extent of residual QCD effects is still the subject of debate3,21,39,40,41,42,43,44,45,46,47. A consistent model-independent interpretation of all these data is possible via a modification of the \(\overline{b}\to \overline{s}\) coupling strength48,49,50,51,52,53,54. Such a modification can be realized in new physics models with an additional heavy neutral boson or with leptoquarks. Other explanations of the data involve a variety of extensions to the SM, such as supersymmetry, extended Higgs–boson sectors and models with extra dimensions. References to the extensive literature describing these new physics models can be found in the Supplementary Information. Tension with the SM is also seen in the combination of several ratios that test lepton universality in \(\overline{b}\to \overline{c}{\ell }^{+}{\nu }_{\ell }\) transitions55,56,57,58,59,60,61,62,63.

In this paper, a measurement of the RK ratio is presented based on proton–proton collision data collected with the LHCb detector at the CERN’s Large Hadron Collider (Methods). The data were recorded during 2011, 2012 and 2015–2018 with centre-of-mass energy of the collisions of 7, 8 and 13 TeV and correspond to an integrated luminosity of 9 fb−1. Compared with the previous LHCb RK result11, the experimental method is essentially identical but the analysis uses an additional 4 fb−1 of data collected in 2017 and 2018. The results supersede those of the previous LHCb analysis.

The analysis strategy aims to reduce systematic uncertainties induced in modelling the markedly different reconstruction of decays with muons in the final state, compared with decays with electrons. These differences arise due to the significant bremsstrahlung radiation emitted by the electrons and the different detector subsystems that are used to identify electron and muon candidates (Methods). The major challenge of the measurement is then correcting for the efficiency of the selection requirements used to isolate signal candidates and reduce background. To avoid unconscious bias, the analysis procedure was developed and the cross-checks described below performed before the result for RK was examined.

In addition to the process discussed above, the K++ final state is produced via a \({B}^{+}\to {X}_{q\overline{q}}{K}^{+}\) decay, where \({X}_{q\overline{q}}\) is a bound state (meson) such as the J/ψ. The J/ψ meson consists of a charm quark and antiquark, \(c\overline{c}\), and is produced resonantly at q2 = 9.59 GeV2c−4. This ‘charmonium’ resonance subsequently decays into two leptons, J/ψ → +. The B+ → J/ψ( → +)K+ decays are not suppressed and hence have a branching fraction orders of magnitude larger than that of B+ → K++ decays. These two processes are separated by applying a requirement on q2. The 1.1 < q2 < 6.0 GeV2 c−4 region used to select B+ → K++ decays is chosen to reduce the pollution from the J/ψ resonance and the high-q2 region that contains contributions from further excited charmonium resonances, such as the ψ(2S) and ψ(3770) states, and from lighter \(s\overline{s}\) resonances, such as the ϕ(1020) meson. In the remainder of this paper, the notation B+ → K++ is used to denote only decays with 1.1 < q2 < 6.0 GeV2 c−4, which are referred to as non-resonant, whereas B+ → J/ψ( → +)K+ decays are denoted resonant.

To help overcome the challenge of modelling precisely the different electron and muon reconstruction efficiencies, the branching fractions of B+ → K++ decays are measured relative to those of B+ → J/ψK+ decays64. Since the J/ψ → + branching fractions are known to respect lepton universality to within 0.4% (refs. 2,65), the RK ratio is determined via the double ratio of branching fractions

$${R}_{K}=\frac{{{{\mathcal{B}}}}\ \ ({B}^{+}\to {K}^{+}{\mu }^{+}{\mu }^{-})}{{{{\mathcal{B}}}}\ \ ({B}^{+}\to J/\psi (\to {\mu }^{+}{\mu }^{-}){K}^{+})}/\frac{{{{\mathcal{B}}}}\ \ ({B}^{+}\to {K}^{+}{e}^{+}{e}^{-})}{{{{\mathcal{B}}}}\ \ ({B}^{+}\to J/\psi (\to {e}^{+}{e}^{-}){K}^{+})}\ .$$

In this equation, each branching fraction can be replaced by the corresponding event yield divided by the appropriate overall detection efficiency (Methods), as all other factors needed to determine each branching fraction individually cancel out. The efficiency of the non-resonant B+ → K+e+e decay therefore needs to be known only relative to that of the resonant B+ → J/ψ( → e+e)K+ decay, rather than relative to the B+ → K+μ+μ decay. As the detector signature of each resonant decay is similar to that of its corresponding non-resonant decay, systematic uncertainties that would otherwise dominate the calculation of these efficiencies are suppressed. The yields observed in these four decay modes and the ratios of efficiencies determined from simulated events then enable an RK measurement with statistically dominated uncertainties. As detailed below, percent-level control of the efficiencies is verified with a direct comparison of the B+ → J/ψ( → e+e)K+ and B+ → J/ψ( → μ+μ)K+ branching fractions in the ratio

$${r}_{J/\psi }={{{\mathcal{B}}}}\ ({B}^{+}\to J/\psi (\to {\mu }^{+}{\mu }^{-}){K}^{+})/{{{\mathcal{B}}}}\ ({B}^{+}\to J/\psi (\to {e}^{+}{e}^{-}){K}^{+}),$$

which does not benefit from the same cancellation of systematic effects.

Candidate B+ → K++ decays are found by combining the reconstructed trajectory (track) of a particle identified as a charged kaon, together with the tracks from a pair of well-reconstructed oppositely charged particles identified as either electrons or muons. The particles are required to originate from a common vertex, displaced from the proton–proton interaction point, with good vertex-fit quality. The techniques used to identify the different particles and to form B+ candidates are described in Methods.

The invariant mass of the final state particles, m(K++), is used to discriminate between signal and background contributions, with the signal expected to accumulate around the known mass of the B+ meson. Background originates from particles selected from multiple hadron decays, referred to as combinatorial background, and from specific decays of B hadrons. The latter also tend to accumulate around specific values of m(K++). For the muon modes, the residual background is combinatorial and, for the resonant mode, there is an additional contribution from B+ → J/ψπ+ decays with a pion misidentified as a kaon. For the electron modes, in addition to combinatorial background, other specific background decays contribute significantly in the signal region. The dominant such background for the non-resonant and resonant modes comes from partially reconstructed B(0,+) → K+π(−,0)e+e and B(0,+) → J/ψ( → e+e)K+π(−,0) decays, respectively, where the pion is not included in the B+ candidate. Decays of the form \({B}^{+}\to {\overline{D}}^{0}(\to {K}^{+}{e}^{-}{\overline{\nu }}_{e}){e}^{+}{\nu }_{e}\) also contribute at the level of \({{{\mathcal{O}}}}(1 \%)\) of the B+ → K+e+e signal; and there is also a contribution from B+ → J/ψ( → e+e)K+ decays, where a photon is emitted but not reconstructed. The kinematic correlation between m(K+e+e) and q2 means that, irrespective of misreconstruction effects, the latter background can only populate the m(K+e+e) region well below the signal peak.

After the application of the selection requirements, the resonant and non-resonant decays are clearly visible in the mass distributions (Fig. 2). The yields in the two B+ → K++ and two B+ → J/ψ( → +)K+ decay modes are determined by performing unbinned extended maximum-likelihood fits to these distributions (Methods). For the non-resonant candidates, the m(K+e+e) and m(K+μ+μ) distributions are fitted with a likelihood function that has the B+ → K+μ+μ yield and RK as fit parameters and the resonant decay mode yields incorporated as Gaussian-constraint terms. The resonant yields are determined from separate fits to the mass, mJ/Ψ(K++), formed by kinematically constraining the dilepton system to the known J/ψ mass2 and thereby improving the mass resolution.

Fig. 2: Candidate invariant mass distributions.
figure 2

Distribution of the invariant mass m(J/ψ)(K++) for candidates with electron (left) and muon (right) pairs in the final state for the non-resonant B+ → K++ signal channels (top) and resonant B+ → J/ψ( → +)K+ decays (bottom). The fit projection is superimposed, with dotted lines describing the signal contribution and solid areas representing each of the background components described in the text and listed in the legend. Part. reco. refers to partially reconstructed B hadron decays. In the resonant-mode distributions, some fit components are too small to be visible. Uncertainties on the data points are statistical only and represent one standard deviation, calculated assuming Poisson-distributed entries. The y axis in each panel shows the number of candidates in an interval of the indicated width.

Simulated events are used to derive the two ratios of efficiencies needed to form RK using equation (2). Control channels are used to calibrate the simulation to correct for the imperfect modelling of the B+ production kinematics and various aspects of the detector response. The overall effect of these corrections on the measured value of RK is a relative shift of (+3 ± 1)%. When compared with the 20% shift that these corrections induce in the measurement of rJ/ψ, this demonstrates the robustness of the double-ratio method in suppressing systematic biases that affect the resonant and non-resonant decay modes similarly.

The systematic uncertainty (Methods) from the choice of signal and background mass-shape models in the fits is estimated by fitting pseudo-experiments with alternative models that still describe the data well. The effect on RK is at the 1% level. A comparable uncertainty arises from the limited size of the calibration samples, with negligible contributions from the calibration of the B+ production kinematics and modelling of the selection and particle-identification efficiencies. Systematic uncertainties that affect the ratios of efficiencies influence the measured value of RK and are taken into account using constraints on the efficiency values. Correlations between different categories of selected events and data-taking periods are taken into account in these constraints. The combined statistical and systematic uncertainty is then determined by scanning the profile likelihood, and the statistical contribution to the uncertainty is isolated by repeating the scan with the efficiencies fixed to their fitted values.

The determination of the rJ/ψ ratio requires control of the relative selection efficiencies for the resonant electron and muon modes and does not therefore benefit from the cancellation of systematic effects in the double ratio used to measure RK. Given the scale of the corrections required, comparison of rJ/ψ with unity is a stringent cross-check of the experimental procedure. In addition, if the simulation is correctly calibrated, the measured rJ/ψ value will not depend on any variable. The rJ/ψ ratio is therefore also computed as a function of different kinematic variables. Even though the non-resonant and resonant samples are mutually exclusive as a function of q2, there is significant overlap between them in the quantities on which the efficiency depends, such as the laboratory-frame momenta of the final-state particles or the opening angle between the two leptons. This is because a given set of values for the final-state particles’ momenta and angles in the B+ rest frame will result in a distribution of such values when transformed to the laboratory frame.

The value of rJ/ψ is measured to be 0.981 ± 0.020. This uncertainty includes both statistical and systematic effects, where the latter dominate. The consistency of this ratio with unity demonstrates control of the efficiencies well in excess of that needed for the determination of RK. In the measurement of the rJ/ψ ratio, the systematic uncertainty is dominated by the imperfect modelling of the B+ production kinematics and the modelling of selection requirements, which have a negligible impact on the RK measurement. No significant trend is observed in the differential determination of rJ/ψ as a function of any considered variable. An example distribution, with rJ/ψ determined as a function of B+ momentum component transverse to the beam direction, pT, is shown in Fig. 3. Assuming that the observed rJ/ψ variation in such distributions reflects genuine mis-modelling of the efficiencies, rather than statistical fluctuations, and taking into account the spectrum of the relevant variables in the non-resonant decay modes, a total shift on RK is computed for each of the variables examined. The resulting variations are typically at the permille level and hence well within the estimated systematic uncertainty on RK. Similarly, computations of the rJ/ψ ratio in bins of two kinematic variables also do not show any trend and are consistent with the systematic uncertainties assigned on the RK measurement.

Fig. 3: Differential rJ/ψ measurement.
figure 3

The distributions of the B+ transverse momentum (pT, left) and the ratio rJ/ψ (right) relative to its average value \(< {r}_{J/\psi } >\) as a function of pT. The pT spectrum of the B+ → J/ψK+ decays is similar to that of the corresponding B+ → K++ decays such that the measurement of rJ/ψ tests the kinematic region relevant for the RK measurement. The lack of any dependence of the value of \({r}_{J/\psi }/ < {r}_{J/\psi } >\) as a function of B+ pT demonstrates control of the efficiencies. Uncertainties on the data points are statistical only and represent one standard deviation.

In addition to B+ → J/ψK+ decays, clear signals are observed from B+ → ψ(2S)K+ decays. The double ratio of branching fractions, Rψ(2S), defined by

$$\begin{array}{l}{R}_{\psi (2S)}\\=\frac{{{{\mathcal{B}}}}\ ({B}^{+}\to \psi (2S)(\to {\mu }^{+}{\mu }^{-}){K}^{+})}{{{{\mathcal{B}}}}\ ({B}^{+}\to J/\psi (\to {\mu }^{+}{\mu }^{-}){K}^{+})}/\frac{{{{\mathcal{B}}}}\ ({B}^{+}\to \psi (2S)(\to {e}^{+}{e}^{-}){K}^{+})}{{{{\mathcal{B}}}}\ ({B}^{+}\to J/\psi (\to {e}^{+}{e}^{-}){K}^{+})}\ ,\end{array}$$

provides an independent validation of the double-ratio analysis procedure and further tests the control of the efficiencies. This double ratio is expected to be close to unity2 and is determined to be 0.997 ± 0.011, where the uncertainty includes both statistical and systematic effects, the former of which dominates. This can be interpreted as a world-leading test of lepton flavour universality in ψ(2S) → + decays.

The fit projections for the m(K++) and mJ/Ψ(K++) distributions are shown in Fig. 2. The fit is of good quality, and the value of RK is measured to be

$${R}_{K}(1.1 < {q}^{2} < 6.0\,{{{{\rm{GeV}}}}}^{2}\,{c}^{-4})=0.84{6}_{-0.039-0.012}^{+0.042+0.013}\ ,$$

where the first uncertainty is statistical and the second systematic. Combining the uncertainties gives \({R}_{K}=0.84{6}_{-\ 0.041}^{+\ 0.044}\). This is the most precise measurement to date and is consistent with the SM expectation, 1.00 ± 0.01 (refs. 3,4,5,6,7), at the level of 0.10% (3.1 standard deviations), giving evidence for the violation of lepton universality in these decays. The value of RK is found to be consistent in subsets of the data divided on the basis of data-taking period, different selection categories and magnet polarity (Methods). The profile likelihood is given in Methods. A comparison with previous measurements is shown in Fig. 4.

Fig. 4: Comparison between RK measurements.
figure 4

In addition to the LHCb result, the measurements by the BaBar15 and Belle13 collaborations, which combine B+ → K++ and \({B}^{0}\to {K}_{{{{\rm{S}}}}}^{0}{\ell }^{+}{\ell }^{-}\) decays, are also shown. The vertical dashed line indicates the SM prediction. Uncertainties on the data points are the combination of statistical and systematic and represent one standard deviation.

The 3,850 ± 70 B+ → K+μ+μ decay candidates that are observed are used to compute the B+ → K+μ+μ branching fraction as a function of q2. The results are consistent between the different data-taking periods and with previous LHCb measurements37. The B+ → K+e+e branching fraction is determined by combining the value of RK with the value of \({{{\rm{d}}}}{{{\mathcal{B}}}}\ ({B}^{+}\to {K}^{+}{\mu }^{+}{\mu }^{-})/{{{\rm{d}}}}{q}^{2}\) in the region 1.1 < q2 < 6.0 GeV2 c−4 (ref. 37), taking into account correlated systematic uncertainties. This gives

$$\begin{array}{rcl}\frac{{{{\rm{d}}}}{{{\mathcal{B}}}}\ ({B}^{+}\to {K}^{+}{e}^{+}{e}^{-})}{{{{\rm{d}}}}{q}^{2}}(1.1 < {q}^{2} < 6.0\,{{{{\rm{GeV}}}}}^{2}{c}^{-4})\\=(28.{6\ }_{-\ 1.4}^{+\ 1.5}\ \pm 1.3)\times 1{0}^{-9}\ {c}^{4}\,{{{{\rm{GeV}}}}}^{-2}\ .\end{array}$$

The 1.9% uncertainty on the B+ → J/ψK+ branching fraction2 gives rise to the dominant systematic uncertainty. This is the most precise measurement of this quantity to date and, given the large (\({{{\mathcal{O}}}}(10 \%)\)) theoretical uncertainty on the predictions7,66, is consistent with the SM.

A breaking of lepton universality would require an extension of the gauge structure of the SM that gives rise to the known fundamental forces. It would therefore constitute a significant evolution in our understanding and would challenge an inference based on a wealth of experimental data in other processes. Confirmation of any effect beyond the SM will clearly require independent evidence from a wide range of sources.

Measurements of other RH observables with the full LHCb dataset will provide further information on the quark-level processes measured. In addition to affecting the decay rates, new physics can also alter how the decay products are distributed in phase space. An angular analysis of the electron mode, where SM-like behaviour might be expected in the light of the present results and those from \(\overline{b}\to \overline{s}{\mu }^{+}{\mu }^{-}\) decays, would allow the formation of ratios between observable quantities other than branching fractions, enabling further precise tests of lepton universality16,18,31,67,68. The hierarchical effect needed to explain the existing \(\overline{b}\to \overline{s}{\ell }^{+}{\ell }^{-}\) and \(\overline{b}\to \overline{c}{\ell }^{+}{\nu }_{\ell }\) data, with the largest effects observed in tau modes, then muon modes and little or no effects in electron modes, suggests that studies of \(\overline{b}\to \overline{s}{\tau }^{+}{\tau }^{-}\) transitions are also of great interest69,70. There are excellent prospects for all of the above and further measurements with the much larger samples that will be collected with the upgraded LHCb detector from 2022 and, in the longer term, with the LHCb Upgrade II71. Other experiments should also be able to determine RH ratios, with the Belle II experiment in particular expected to have competitive sensitivity72. The ATLAS and CMS experiments may also be able to contribute73,74.

In summary, in the dilepton mass-squared region 1.1 < q2 < 6.0 GeV2c−4, the ratio of branching fractions for B+ → K+μ+μ and B+ → K+e+e decays is measured to be \({R}_{K}=0.84{6}_{-\ 0.041}^{+\ 0.044}\). This is the most precise measurement of this ratio to date and is compatible with the SM prediction with a P value of 0.10%. The significance of this discrepancy is 3.1 standard deviations, giving evidence for the violation of lepton universality in these decays.


Experimental setup

The Large Hadron Collider (LHC) is the world’s highest-energy particle accelerator and is situated approximately 100 m underground, close to Geneva, Switzerland. The collider accelerates two counter-rotating beams of protons, guided by superconducting magnets located around a 27 km circular tunnel, and brings them into collision at four interaction points that house large detectors. The LHCb experiment75,76 is instrumented in the region covering the polar angles between 10 and 250 mrad around the proton beam axis, in which the products from B hadron decays can be efficiently captured and identified. The detector includes a high-precision tracking system with a dipole magnet, providing measurements of momentum and impact parameter (IP), defined for charged particles as the minimum distance of a track to a primary proton–proton interaction vertex (PV). Different types of charged particles are distinguished using information from two ring-imaging Cherenkov detectors, a calorimeter and a muon system76.

Since the associated data storage and analysis costs would be prohibitive, the experiment does not record all collisions. Only potentially interesting events, selected using real-time event filters referred to as triggers, are recorded. The LHCb trigger system has a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage that uses all the information from the detector, including the tracking, to make the final selection of events to be recorded for subsequent analysis. The trigger selection algorithms are based on identifying key characteristics of B hadrons and their decay products, such as high pT final-state particles, and a decay vertex that is significantly displaced from any of the PVs in the event.

For the RK measurement, candidate events are required to have passed a hardware trigger algorithm that selects either a high-pT muon, or an electron, hadron or photon with high transverse energy deposited in the calorimeters. The B+ → K+μ+μ and B+ → J/ψ( → μ+μ)K+ candidates must be triggered by one of the muons, whereas B+ → K+e+e and B+ → J/ψ( → e+e)K+ candidates must be triggered in one of three ways: by either one of the electrons, by the kaon from the B+ decay or by other particles in the event that are not decay products of the B+ candidate. In the software trigger, the tracks of the final-state particles are required to form a displaced vertex with good fit quality. A multivariate algorithm is used for the identification of displaced vertices consistent with the decay of a B hadron77,78.

Analysis description

The analysis technique used to obtain the results presented in this paper is essentially identical to that used to obtain the previous LHCb RK measurement, described in ref. 11, and only the main analysis steps are reviewed here.

Event selection

Kaon and muon candidates are identified using the output of multivariate classifiers that exploit information from the tracking system, the ring-imaging Cherenkov detectors, the calorimeters and the muon chambers. Electrons are identified by matching tracks to particle showers in the electromagnetic calorimeter (ECAL) and using the ratio of the energy detected in the ECAL to the momentum measured by the tracking system. An electron that emits a bremsstrahlung photon due to interactions with the material of the detector downstream of the dipole magnet results in the photon and electron depositing their energy in the same ECAL cells and therefore in a correct measurement of the original energy of the electron in the ECAL. However, a bremsstrahlung photon emitted upstream of the magnet will deposit energy in a different part of the ECAL than the electron, which is deflected in the magnetic field. For each electron track, a search is therefore made in the ECAL for energy deposits around the extrapolated track direction before the magnet that are not associated with any other charged tracks. The energy of any such deposit is added to the electron energy that is derived from the measurements made in the tracker. Bremsstrahlung photons can be added to none, either or both of the final-state e+ and e candidates.

To suppress background, each final-state particle is required to have sizeable pT and to be inconsistent with coming from a PV. The particles are required to originate from a common vertex, with good vertex-fit quality, that is displaced significantly from all of the PVs in the event. The PVs are reconstructed by searching for space points where an accumulation of track trajectories is observed. A weighted least-squares method is then employed to find the precise vertex position. The B+ momentum vector is required to be aligned with the vector connecting one of the PVs in the event (below referred to as the associated PV) and the B+ decay vertex. The value of q2 is calculated using only the lepton momenta, without imposing any constraint on the m(K++) mass.

The m(K++) mass ranges and the q2 regions used to select the different decay modes are shown in Extended Data Table 1. The selection requirements applied to the non-resonant and resonant decays are otherwise identical. For the muon modes, the superior mass resolution allows a fit in a reduced m(K++) mass range compared with the electron modes. For the electron modes, a wider mass region is needed to perform an accurate fit, but the range chosen suppresses any significant contribution from decays with two or more additional pions that are not reconstructed. The residual contribution from such decays is considered as a source of systematic uncertainty. Resolution effects similarly motivate the choice of non-resonant q2 regions, with a lower limit that excludes contributions from ϕ-meson decays and an upper limit that reduces the tail from B+ → J/ψ( → e+e)K+ decays. The proportion of signal candidates that migrate in and out of the q2 region of interest is on the order of 10%. This effect is accounted for using simulation (Extended Data Figs. 7 and 8).

Cascade background of the form \(B\to {H}_{c}(\to {K}^{+}{\ell }^{-}{\overline{\nu }}_{\ell }X){\ell }^{+}{\nu }_{\ell }Y\), where Hc is a hadron containing a c quark (D0, D+, \({D}_{s}^{+}\), \({\Lambda }_{c}^{+}\)), and X, Y are particles that are not included in the B+ candidate, are suppressed by requiring that the kaon–lepton invariant mass is in the region m(K+) > \(m_{D^0}\), where \(m_{D^0}\) is the known D0 mass2. For the electron mode, this requirement is illustrated in Extended Data Fig. 1 (left). Analogous background sources with a misidentified particle are reduced by applying a similar veto, but with the lepton mass hypothesis changed to that of a pion (denoted [→π]). In the muon case, Kμ[→π] combinations with mass smaller than \(m_{D^0}\) are rejected. In the electron case, a ±40 MeV c−2 window around the D0 mass is used to reject candidates where the veto is applied without the bremsstrahlung recovery, that is, on the basis of only the measured track momenta. The mass distributions are shown in Extended Data Fig. 1. The electron and muon veto cuts differ given the relative helicity suppression of π+ → +ν decays. This causes misidentification backgrounds to populate a range of Kμ masses but only a peak in the Ke mass. The veto requirements retain 97% of B+ → K+μ+μ and 95% of B+ → K+e+e decays passing all other selection requirements.

Background from other exclusive B-hadron decays requires at least two particles to be misidentified. These include the decays B+ → K+π+π, and misreconstructed B+ → J/ψ( → +)K+ and B+ → ψ(2S)( → +)K+ decays. In the latter two decays, the kaon is misidentified as a lepton and the lepton (of the same electric charge) as a kaon. Such background is reduced to a negligible level by particle identification criteria. Background from decays with a photon converted into an e+e pair are also negligible due to the q2 selection.

Multivariate selection

A boosted decision tree (BDT) algorithm79 with gradient boosting80 is used to reduce combinatorial background. For the non-resonant muon mode and for each of the three different trigger categories of the non-resonant electron mode, a single BDT classifier is trained for the 7 and 8 TeV data, and an additional classifier is trained for the 13 TeV data. The BDT output is not strongly correlated with q2, and the same classifiers are used to select the respective resonant decays. To train the classifier, simulated non-resonant B+ → K++ decays are used as a proxy for the signal and non-resonant K++ candidates selected from the data with m(K++) > 5.4 GeV c−2 are used as a background sample. The k-folding technique is used in the training and testing81. The classifier includes the following variables: the pT of the B+, K+ and dilepton candidates, and the minimum and maximum pT of the leptons; the B+, dilepton and K+\({\chi }_{{{{\rm{IP}}}}}^{2}\) with respect to the associated PV, where \({\chi }_{{{{\rm{IP}}}}}^{2}\) is defined as the difference in the vertex-fit χ2 of the PV reconstructed with and without the considered particle; the minimum and maximum \({\chi }_{{{{\rm{IP}}}}}^{2}\) of the leptons; the B+ vertex-fit quality; the statistical significance of the B+ flight distance; and the angle between the B+ candidate momentum vector and the direction between the associated PV and the B+ decay vertex. The pT of the final state particles, the vertex-fit χ2 and the significance of the flight distance have the most discriminating power. For each of the classifiers, a requirement is placed on the output variable to maximize the predicted significance of the non-resonant signal yield. For the electron modes that dictate the RK precision, this requirement reduces the combinatorial background by approximately 99% while retaining 85% of the signal mode. The muon BDT classifier has similar performance. In both cases, for both signal and background, the efficiency of the BDT selection has negligible dependence on m(K++) and q2 in the regions used to determine the event yields.

Calibration of simulation

The simulated data used in this analysis are produced using the software described in refs. 82,83,84,85,86,87,88. Bremsstrahlung emission in the decay of particles is simulated using the PHotos software in the default configuration89, which is observed to agree with an independent quantum electrodynamics calculation at the level of 1% (ref. 5).

Simulated events are weighted to correct for the imperfect modelling using control channels. The B+ production kinematics are corrected using B+ → J/ψ( → +)K+ events. The particle identification performance is calibrated using data, where the species of particles in the final state can be unambiguously determined purely on the basis of the kinematics. The calibration samples consist of D*+ → D0( → Kπ+)π+, J/ψ → μ+μ and B+ → J/ψ( → e+e)K+ decays, from which kaons, muons and electrons, respectively, can be selected without applying particle identification requirements. The performance of the particle identification requirements is then evaluated from the proportion of events in these samples which fulfil the particle identification selection criteria. The trigger response is corrected using weights applied to simulation as a function of variables relevant to the trigger algorithms. The weights are calculated by requiring that simulated B+ → J/ψ( → +)K+ events exhibit the same trigger performance as the control data. The B+ → J/ψ( → +)K+ events selected from the data have also been used to demonstrate control of the electron track reconstruction efficiency at the percent level90. Whenever B+ → J/ψ( → +)K+ events are used to correct the simulation, the correlations between calibration and measurement samples are taken into account in the results and cross-checks presented in this paper. The correlation is evaluated using a bootstrapping method to recompute the yields and efficiencies many times with different subsets of the data.

Likelihood fit

An unbinned extended maximum-likelihood fit is made to the m(K+e+e) and m(K+μ+μ) distributions of non-resonant candidates. The value of RK is a fit parameter, which is related to the signal yields and efficiencies according to

$$\begin{array}{rcl}{R}_{K}&=&\frac{N({B}^{+}\to {K}^{+}{\mu }^{+}{\mu }^{-})}{\varepsilon ({B}^{+}\to {K}^{+}{\mu }^{+}{\mu }^{-})}\cdot \frac{\varepsilon ({B}^{+}\to {K}^{+}{e}^{+}{e}^{-})}{N({B}^{+}\to {K}^{+}{e}^{+}{e}^{-})}\\ &&\times \frac{\varepsilon ({B}^{+}\to J/\psi (\to {\mu }^{+}{\mu }^{-}){K}^{+})}{N({B}^{+}\to J/\psi (\to {\mu }^{+}{\mu }^{-}){K}^{+})}\cdot \frac{N({B}^{+}\to J/\psi (\to {e}^{+}{e}^{-}){K}^{+})}{\varepsilon ({B}^{+}\to J/\psi (\to {e}^{+}{e}^{-}){K}^{+})}\ ,\end{array}$$

where N(X) indicates the yield of decay mode X and ε(X) is the efficiency for selecting decay mode X. The resonant yields are determined from separate fits to mJ/Ψ(K++). In the fit for RK, these yields and the efficiencies are incorporated as Gaussian-constraint terms.

To take into account the correlation between the selection efficiencies, the m(K+e+e) and m(K+μ+μ) distributions of non-resonant candidates in each of the different trigger categories and data-taking periods are fitted simultaneously, with a common value of RK. The relative fraction of partially reconstructed background in each trigger category is also shared across the different data-taking periods.

The mass-shape parameters are derived from the calibrated simulation. The four signal modes are modelled by multiple Gaussian functions with power-law tails on both sides of the peak91,92, although the differing detector response gives different shapes for the electron and muon modes. The signal mass shapes of the electron modes are described with the sum of three distributions, which model whether the ECAL energy deposit from a bremsstrahlung photon was added to both, either or neither of the e± candidates. The expected values from simulated events are used to constrain the fraction of signal decays in each of these categories. These fractions are observed to agree well with those observed in resonant events selected from the data. To take into account residual differences in the signal shape between data and simulation, an offset in the peak position and a scaling of the resolution are allowed to vary in the fits to the resonant modes. The corresponding parameters are then fixed in the fits to the relevant non-resonant modes. This resolution scaling changes the migration of candidates into the q2 region of interest by less than 1%.

For the modelling of non-resonant and resonant partially reconstructed backgrounds, data are used to correct the simulated Kπ mass spectrum for B(0,+) → K+π(−,0)e+e and B(0,+) → J/ψ( → e+e)K+π(−,0) decays93. The calibrated simulation is used subsequently to obtain the m(K++) mass shape and relative fractions of these background components. To accommodate possible lepton universality violation in these partially reconstructed processes, which are underpinned by the same \(\overline{b}\to \overline{s}\) quark-level transitions as those of interest, the overall yield of such decays is left to vary freely in the fit. The shape of the B+ → J/ψπ+ background contribution is taken from simulation, but the size with respect to the B+ → J/ψK+ mode is constrained using the known ratio of the relevant branching fractions2,94 and efficiencies.

In the fits to non-resonant B+ → K+e+e candidates, the mass shape of the background from B+ → J/ψ( → e+e)K+ decays with an emitted photon that is not reconstructed is also taken from simulation and, adjusting for the relevant selection efficiency, its yield is constrained to the value from the fit to the resonant mode within its uncertainty. In all fits, the combinatorial background is modelled with an exponential function with a freely varying yield and shape.

The fits to the non-resonant (resonant) decay modes in different data-taking periods and trigger categories are shown in Extended Data Fig. 2 (Extended Data Fig. 3). For the resonant modes, the results from independent fits to each period/category are shown. Conversely, the non-resonant distributions show the projections from the simultaneous fit across data-taking periods and trigger categories that is used to obtain RK. The fitted yields for the resonant and non-resonant decays are given in Extended Data Table 2.

The profile likelihood for the fit to the non-resonant decays is shown in Extended Data Fig. 4. The likelihood is non-Gaussian in the region RK > 0.95 due to the comparatively low yield of B+ → K+e+e events. Following the procedure described in refs. 11,12, the P value is computed by integrating the posterior probability density function for RK, having folded in the theory uncertainty on the SM prediction, for RK values larger than the SM expectation. The corresponding significance in terms of standard deviations is computed using the inverse Gaussian cumulative distribution function for a one-sided conversion.

A test statistic is constructed that is based on the likelihood ratio between two hypotheses with common (null) or different (test) RK values for the part of the sample analysed previously (7, 8 and part of the 13 TeV data) and for the new portion of the 13 TeV data. Using pseudo-experiments based on the null hypothesis, the data suggest that the RK value from the new portion of the data is compatible with that from the previous sample with a P value of 95%. Further tests give good compatibility for subsamples of the data corresponding to different trigger categories and magnet polarities.

The departure of the profile likelihood shown in Extended Data Fig. 4 from a normal distribution stems from the definition of RK. In particular, in the RK ratio, the denominator is affected by larger statistical uncertainties than the numerator, owing to the larger number of non-resonant muonic signal candidates. However, the intervals of the likelihood distribution are found to be the same when estimated with 1/RK as the fit parameter.

Additional cross-checks

The rJ/ψ single ratio is used to perform a number of additional cross-checks. The distribution of this ratio as a function of the angle between the leptons and the minimum pT of the leptons is shown in Extended Data Fig. 5, together with the spectra expected for the resonant and non-resonant decays. No significant trend is observed in either rJ/ψ distribution. Assuming that the deviations observed are genuine mis-modelling of the efficiencies, rather than statistical fluctuations, a total shift of RK at a level less than 0.001 would be expected due to these effects. This estimate takes into account the spectrum of the relevant variables in the non-resonant decay modes of interest and is compatible with the estimated systematic uncertainties on RK. Similarly, the variations seen in rJ/ψ as a function of all other reconstructed quantities examined are compatible with the systematic uncertainties assigned. In addition, rJ/ψ is computed in two-dimensional intervals of reconstructed quantities (Extended Data Fig. 6). Again, no significant trend is seen.

Systematic uncertainties

The majority of the sources of systematic uncertainty affect the relative efficiencies between non-resonant and resonant decays. These are included in the fit to RK by allowing the relative efficiency to vary within Gaussian constraints. The width of the constraint is determined by adding the contributions from the different sources in quadrature. Correlations in the systematic uncertainties between different trigger categories and run periods are taken into account. Systematic uncertainties affecting the determination of the signal yield are assessed using pseudo-experiments generated with variations of the fit model. Pseudo-experiments are also used to assess the degree of bias originating from the fitting procedure. The bias is found to be 1% of the statistical precision and thus negligible with respect to other sources of systematic uncertainty.

For the non-resonant B+ → K+e+e decays, the systematic uncertainties are dominated by the modelling of the signal and background components used in the fit. The effect on RK is at the 1% level. A significant proportion (0.7%) of this uncertainty comes from the limited knowledge of the spectrum in B(0,+) → K+π(−,0)e+e decays. In addition, a 0.2% systematic uncertainty is assigned for the potential contribution from partially reconstructed decays with two additional pions. An uncertainty comparable to that from the modelling of the signal and background components is induced by the limited sizes of calibration samples. Other sources of systematic uncertainty, such as the calibration of B+ production kinematics, the trigger calibration and the determination of the particle identification efficiencies, contribute at the few-permille or permille level, depending strongly on the data-taking period and the trigger category.

The uncertainties on parameters used in the simulation model of the signal decays affect the q2 distribution and hence the selection efficiency. These uncertainties are propagated to an uncertainty on RK using predictions from the FLAVIO software package7 but give rise to a negligible effect. Similarly, the differing q2 resolution between data and simulation, which alters estimates of the q2 migration, has negligible impact on the result.