Measurement of the Higgs boson width and evidence of its off-shell contributions to ZZ production

Since the discovery of the Higgs boson in 2012, detailed studies of its properties have been ongoing. Besides its mass, its width - related to its lifetime - is an important parameter. One way to determine this quantity is by measuring its off-shell production, where the Higgs boson mass is far away from its nominal value, and relating it to its on-shell production, where the mass is close to the nominal value. Here, we report evidence for such off-shell contributions to the production cross section of two Z bosons with data from the CMS experiment at the CERN Large Hadron Collider. We constrain the total rate of the off-shell Higgs boson contribution beyond the Z boson pair production threshold, relative to its standard model expectation, to the interval [0.0061, 2.0] at 95% confidence level. The scenario with no off-shell contribution is excluded at a $p$-value of 0.0003 (3.6 standard deviations). We measure the width of the Higgs boson as $\Gamma_{\mathrm{H}}$ = 3.2 $_{-1.7}^{+2.4}$ MeV, in agreement with the standard model expectation of 4.1 MeV. In addition, we set constraints on anomalous Higgs boson couplings to W and Z boson pairs.


The standard model (SM) of particle physics provides an elegant description for the masses and interactions of fundamental particles. These are fermions, which are the building blocks of ordinary matter, and gauge bosons, which are the carriers of the electroweak (EW) and strong forces. In addition, the SM postulates the existence of a quantum field responsible for the generation of the masses of fundamental particles through a phenomenon known as the Brout-Englert-Higgs mechanism. This field, known as the Higgs field [1][2][3], interacts with SM particles, giving them mass, as well as with itself. The field carrier is a massive, scalar (spin-0) particle known as the Higgs (H) boson. Nearly half a century after its postulation, it was finally observed in 2012 with a mass $m_{\mathrm{H}}$ of around 125 GeV by the ATLAS and CMS Collaborations [4][5][6] at the CERN Large Hadron Collider (LHC). Given the unique role the H boson plays in the SM, studies of its properties are a major goal of particle physics.
Apart from mass, another important property of a particle is its lifetime τ. Only a few fundamental particles are stable; others, including the H boson, exist only for a fleeting moment before disintegrating into other, lighter species. The Heisenberg uncertainty principle [7] provides a direct connection between the lifetime of a particle and the uncertainty in its mass, a property known as the particle's width, Γ. Any unstable particle (often referred to as a resonance) has a finite lifetime, with shorter τ corresponding to broader Γ. The two quantities are related through the Planck constant, h, as Γ = h/(2πτ). Even with perfect experimental resolution, the observed mass of an unstable particle will not be constant across a series of measurements (e.g., of the invariant mass of its decay products $i$, which is calculated from the sums of their energies, $E_i$, and momenta, $\vec{p}_i$, as $\sqrt{(\sum_i E_i)^2 - |\sum_i \vec{p}_i|^2}$). The possible mass values are distributed according to a characteristic relativistic Breit-Wigner distribution [8] with a nominal mass value corresponding to the maximum of the Breit-Wigner, and with width parameter Γ.
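The invariant mass computation mentioned above can be sketched in a few lines of Python. This is a minimal illustration in natural units (c = 1); the function name and the two-particle example are hypothetical and are not taken from the analysis software.

```python
import math

def invariant_mass(particles):
    """Invariant mass of a set of particles, in natural units (c = 1).

    Each particle is a tuple (E, px, py, pz) with all entries in GeV.
    """
    e_sum = sum(p[0] for p in particles)
    px = sum(p[1] for p in particles)
    py = sum(p[2] for p in particles)
    pz = sum(p[3] for p in particles)
    m2 = e_sum**2 - (px**2 + py**2 + pz**2)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(m2, 0.0))

# Hypothetical example: two massless decay products emitted back to back,
# each carrying 62.5 GeV, reconstruct to a parent mass of 125 GeV.
pair = [(62.5, 0.0, 0.0, 62.5), (62.5, 0.0, 0.0, -62.5)]
print(invariant_mass(pair))
```

For an unstable particle, repeating this calculation over many decays would give values spread around the nominal mass according to the Breit-Wigner distribution, even for a perfect detector.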
Particles are understood to be on the mass shell (on-shell) if their mass is close to the nominal mass value, and off-shell if their mass takes a value far away from it. By the aforementioned property of the Breit-Wigner line shape, particles are generally more likely to be produced on-shell than off-shell when energy and momentum conservation allows it. Scattering amplitudes ($A$) for off-shell particle production, followed by a specific decay final state, may be modified further by interference with other processes, which is large and destructive in the case of the H boson. In this specific case, writing $A = H + C$, with $H$ standing for the H boson contribution and $C$ for other interfering contributions, we will use the term "off-shell production" as a shorthand for the $|H|^2$ term in $|A|^2$.
For broad resonances, the width can be obtained by directly measuring the Breit-Wigner line shape, e.g., as was done in the case of the Z boson, measured to have a mass of $m_{\mathrm{Z}} = 91.188 \pm 0.002$ GeV and a width of $\Gamma_{\mathrm{Z}} = 2.495 \pm 0.002$ GeV at the CERN Large Electron Positron collider [9]. The H boson is expected to live three orders of magnitude longer, with a theoretically predicted width of $\Gamma_{\mathrm{H}} = 4.1$ MeV (0.0041 GeV) [10], and a deviation from the SM prediction would indicate the existence of new physics. This width is too small to be measured directly from the line shape because of the limited mass resolution of order 1 GeV achievable with the present LHC detectors. Another direct way of measuring the H boson width would be to measure its lifetime by means of its decay length and use the relationship $\Gamma_{\mathrm{H}} = h/(2\pi\tau_{\mathrm{H}})$, but its lifetime is still too short ($\tau_{\mathrm{H}} = 1.6 \times 10^{-22}$ s) to be detectable directly. The present experimental limit on this quantity is $\tau_{\mathrm{H}} < 1.9 \times 10^{-13}$ s at 95% confidence level (CL) [11], nine orders of magnitude above the SM lifetime.
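As a quick numerical check of the width-lifetime relation, the following sketch converts between the two quantities. The constant is the Planck constant expressed in MeV s; the function names are illustrative only.

```python
import math

# Planck constant h in MeV s (4.135667696e-15 eV s).
H_PLANCK_MEV_S = 4.135667696e-21

def lifetime_from_width(width_mev):
    """Lifetime in seconds for a width in MeV, using Gamma = h / (2 pi tau)."""
    return H_PLANCK_MEV_S / (2.0 * math.pi * width_mev)

def width_from_lifetime(tau_s):
    """Width in MeV for a lifetime in seconds, using the same relation."""
    return H_PLANCK_MEV_S / (2.0 * math.pi * tau_s)

# SM expectation for the H boson width quoted in the text.
tau_sm = lifetime_from_width(4.1)
print(f"tau_H (SM) = {tau_sm:.2e} s")

# Z boson width quoted in the text (2.495 GeV = 2495 MeV), for comparison.
tau_z = lifetime_from_width(2495.0)
print(f"tau_H / tau_Z = {tau_sm / tau_z:.0f}")
```

With the SM width of 4.1 MeV this gives a lifetime of about 1.6 × 10⁻²² s, consistent with the value quoted above.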
The value of $\Gamma_{\mathrm{H}}$ can be extracted with much better precision through a combined measurement of on-shell and off-shell H boson production. In the decay of an H boson with $m_{\mathrm{H}} \approx 125$ GeV to a pair of massive gauge bosons V (V = W or Z, with masses around 80.4 or 91.2 GeV, respectively), we have $m_{\mathrm{V}} < m_{\mathrm{H}} < 2m_{\mathrm{V}}$. Therefore, when the H boson is produced on-shell (with the VV invariant mass $m_{\mathrm{VV}} \sim m_{\mathrm{H}}$), one of the V bosons must be off-shell to satisfy four-momentum conservation. Once the H boson is produced off-shell with large enough invariant mass $m_{\mathrm{VV}} > 2m_{\mathrm{V}}$ (off-shell H boson production region), the V bosons themselves are produced on-shell. Since the Breit-Wigner mass distribution of either the H or V boson maximizes at their respective nominal masses, the rate of off-shell H boson production above the V boson pair production threshold is enhanced with respect to what one would expect from the Breit-Wigner line shape of the H boson alone.
The measurement of the higher part of the $m_{\mathrm{VV}}$ spectrum can then be used to establish off-shell H boson production. The ratio of off-shell to on-shell production rates allows for a measurement of $\Gamma_{\mathrm{H}}$ [12,13] via the cross section proportionality relations

$$\sigma_p^{\text{on-shell}} \propto \frac{g_p^2 g_d^2}{\Gamma_{\mathrm{H}}} \propto \mu_p \quad \text{and} \quad \sigma_p^{\text{off-shell}} \propto g_p^2 g_d^2 \propto \mu_p \Gamma_{\mathrm{H}},$$

where $g_p$ and $g_d$ are the couplings associated with the H boson production and decay modes, respectively, and $\mu_p$ is the on-shell H boson signal strength in the production mode being considered. Each signal strength is defined as the ratio of the H boson squared amplitude in the measured cross section to that predicted in the SM. The off-shell H boson signal strength, $\mu_p^{\text{off-shell}}$, can be expressed as $\mu_p \Gamma_{\mathrm{H}}$ in each production mode, and the scenario with no off-shell production becomes equivalent to the limiting case $\Gamma_{\mathrm{H}} = 0$. For the rest of this article, we concentrate on the ZZ decay channel, i.e., $g_d$ corresponding to the H → ZZ decay. The CMS and ATLAS Collaborations have previously used this method to set upper limits on $\Gamma_{\mathrm{H}}$ as low as 9.2 MeV at 95% CL [14,15].
It is important to distinguish between two types of H boson production modes: the gluon fusion gg → H → ZZ process, where the H boson is produced via its couplings to fermions, and the EW processes, which involve HVV (i.e., HWW or HZZ) couplings. The top row of Fig. 1 shows the Feynman diagrams for the dominant contributions to the gg (top left) process, and the EW processes of vector boson fusion (VBF, top center) and VH (top right). A more complete set of diagrams for the EW process is shown in Extended Data Figs. 1 and 2. Because different H boson couplings are involved in the gg and EW processes, we extract two off-shell signal strength parameters: $\mu_{\mathrm{F}}^{\text{off-shell}}$ for the gg mode and $\mu_{\mathrm{V}}^{\text{off-shell}}$ for the EW mode. We also consider an overall off-shell signal strength parameter $\mu^{\text{off-shell}}$ with different assumptions on the ratio $R^{\text{off-shell}}_{\mathrm{V,F}} = \mu_{\mathrm{V}}^{\text{off-shell}} / \mu_{\mathrm{F}}^{\text{off-shell}}$.

A major challenge arises from the fact that there are other sources of ZZ pairs in the SM (continuum ZZ production); see, for example, the bottom row of Fig. 1. These contributions, particularly those from $q\bar{q}$ → ZZ, are typically much larger than the contribution from off-shell H → ZZ. In addition, some of the amplitudes from continuum ZZ processes interfere with the H boson amplitudes because they share the same initial and final states. For example, the amplitudes in the first column of Fig. 1, or those in the second column, interfere with each other; the amplitude shown in the lower right panel (shown more generically in Extended Data Fig. 3) does not interfere with any of the other diagrams as we omit the negligible contribution of $q\bar{q}$ → H → ZZ that would interfere with it.
Figure 1: Feynman diagrams for ZZ production: those that involve the H boson (top), and those that give rise to continuum ZZ production (bottom). The interaction displayed at tree level in each diagram is meant to progress from left to right. Each straight, curvy, or curly line refers to the different set of particles denoted. Straight, solid lines with no arrows indicate the line could refer to either a particle or an antiparticle, whereas those with forward (backward) arrows refer to a particle (an antiparticle).

The interference between the H boson and continuum ZZ amplitudes is destructive [16][17][18][19][20][21]. This destructive interference plays a key role in the SM as it is one of the contributions that unitarizes the scattering of massive gauge bosons, keeping the computation of the cross section for ZZ production in proton-proton (pp) collisions finite [16][17][18][19]. Figure 2 shows the interplay between the H boson production modes and the interfering continuum amplitudes, illustrating the growing importance of their destructive interference as $m_{\mathrm{ZZ}}$ grows in the two final states included in the analysis, ZZ → 2ℓ2ν and ZZ → 4ℓ. In the parametrization of the total cross section, contributions from this type of interference between the H boson and continuum ZZ amplitudes scale with $\sqrt{\mu_{\mathrm{F}}^{\text{off-shell}}}$ and $\sqrt{\mu_{\mathrm{V}}^{\text{off-shell}}}$ for the gg and EW modes, respectively.
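The scaling of the different contributions with the off-shell signal strength can be illustrated with a short sketch. It assumes that, in a given bin, the pure H boson term scales linearly with the signal strength, the interference term with its square root, and the continuum term not at all. The function name and the per-bin yields are hypothetical and chosen only for illustration.

```python
import math

def expected_yield(mu, n_signal, n_interference, n_continuum):
    """Expected events in one bin for an off-shell signal strength mu >= 0.

    n_signal:       |H|^2 yield at mu = 1
    n_interference: interference yield at mu = 1 (negative if destructive)
    n_continuum:    |C|^2 yield, independent of mu
    """
    if mu < 0:
        raise ValueError("mu must be non-negative")
    return mu * n_signal + math.sqrt(mu) * n_interference + n_continuum

# Hypothetical per-bin yields (not taken from the analysis).
n_sig, n_int, n_cont = 10.0, -15.0, 40.0
for mu in (0.0, 0.25, 1.0, 4.0):
    print(mu, expected_yield(mu, n_sig, n_int, n_cont))
```

Because the destructive interference term grows as the square root of the signal strength, the total yield in this toy example first decreases as the signal strength rises from zero and only later increases.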
In this article, we study off-shell H boson decays to ZZ → 2ℓ2ν, and on-shell as well as off-shell H boson decays to ZZ → 4ℓ (ℓ = µ or e), using a sample of pp collisions at 13 TeV collected by the CMS experiment at the LHC. The selection and analysis of the off-shell ZZ → 2ℓ2ν data sample are described in detail in this article, and are based on data collected between 2016 and 2018, corresponding to an integrated luminosity of 138 fb$^{-1}$. For the ZZ → 4ℓ mode, we use previously published CMS off-shell (2016 and 2017 data sets, 78 fb$^{-1}$ [15]) and on-shell (2015 [15,22] and 2016-2018 [23] data sets, 2.3 fb$^{-1}$ and 138 fb$^{-1}$, respectively) results.
Information on the off-shell signal strengths, $\Gamma_{\mathrm{H}}$, and constraints on possible beyond-the-SM (BSM) anomalous couplings are extracted from combined fits over several kinematic distributions of the selected 2ℓ2ν and 4ℓ events. While off-shell events are the ones solely used to establish the presence of off-shell H boson production, the measurement of $\Gamma_{\mathrm{H}}$ relies on the combination of on-shell and off-shell data.
Figure 2: Shown are the distributions for the 2ℓ2ν invariant mass, $m_{2\ell2\nu}$, from the gg → 2ℓ2ν process on the left panel, and the 4ℓ invariant mass, $m_{4\ell}$, from the EW ZZ(→ 4ℓ) + qq processes on the right. These processes involve the H boson ($|H|^2$) and interfering continuum ($|C|^2$) contributions to the scattering amplitude, shown in black and gold, respectively. The dashed green curve represents their direct sum without the interference ($|H|^2 + |C|^2$), and the solid magenta curve represents the sum with interference included ($|H + C|^2$). Note that the interference is destructive, and its importance grows as the mass increases. The integrated luminosity is taken to be 1 fb$^{-1}$, so these distributions are equivalent to the differential cross section spectra $d\sigma/dm_{2\ell2\nu}$ (left) and $d\sigma/dm_{4\ell}$ (right). The distributions are shown after requiring that all charged leptons satisfy $p_{\mathrm{T}} > 7$ GeV and $|\eta| < 2.4$, and that the invariant mass of any charged lepton pair with same flavor and opposite charge is greater than 4 GeV. Here, $p_{\mathrm{T}}$ denotes the magnitude of the momentum of these leptons transverse to the pp collision axis, and η denotes their pseudorapidity, defined as $-\ln[\tan(\theta/2)]$ using the angle θ between their momentum vector and the collision axis. Calculations for the gg → 4ℓ and EW ZZ(→ 2ℓ2ν) + qq processes exhibit similar qualitative properties. The details of the Monte Carlo programs used for these calculations are given in the Methods section.

Because ... compared to the direct constraint from Ref. [11]. The inclusion of the 2ℓ2ν data also allows the lower limits on $\mu_{\mathrm{V}}^{\text{off-shell}}$ to reach within ∼65% of its best fit value, compared to the weaker constraints from 4ℓ data alone, which reach within ∼90% of the 4ℓ-only best fit value [15].
The $m_{\mathrm{ZZ}}$ line shape is sensitive to the potential presence of anomalous HVV couplings [10,11,15,24,25,26]. Thus, BSM physics could affect the ratio of off-shell to on-shell H boson production rates, and therefore the measurement of $\Gamma_{\mathrm{H}}$. We test the effect of these couplings on the $\Gamma_{\mathrm{H}}$ measurement and constrain the contribution from these couplings themselves. In parametrizing anomalous HVV contributions, we adopt the formalism of Ref. [15] with the scattering amplitude ... Here, the polarization vector (four-momentum) of the vector boson $V_i$ is denoted by ... The BSM couplings $a_2$, $a_3$, and $1/\Lambda_1^2$ (denoted generically as $a_i$) are assumed to be real and can take negative values, with the κ factors in Ref. [15] absorbed into the definition of $1/\Lambda_1^2$. The first two are coefficients for generic CP-conserving and CP-violating higher dimensional operators, respectively, while $1/\Lambda_1^2$ is the coefficient for the first-order term in the expansion of a SM-like tensor structure with an anomalous dipole form factor in the invariant masses of the two V bosons. In what follows, we will use the shorthand "$a_i$ hypothesis" to refer to the scenario where all BSM HVV couplings other than $a_i$ itself are zero.
Throughout this work, we assume that the gluon fusion loop amplitudes do not receive new physics contributions apart from a rescaling of the SM amplitude. Possible modifications of the $m_{\mathrm{ZZ}}$ line shape [26,27] are neglected based on existing LHC constraints [28][29][30].

2ℓ2ν analysis considerations
The 2ℓ2ν analysis is based on the reconstruction of Z → ℓℓ decays with a second Z boson decaying to neutrinos that escape detection. The momentum of the undetected Z boson transverse to the pp collision axis can be measured through an imbalance across all remaining particles, i.e., missing transverse momentum ($p_{\mathrm{T}}^{\mathrm{miss}}$, or $\vec{p}_{\mathrm{T}}^{\,\mathrm{miss}}$ in vector form). Thus, the analysis requires large $p_{\mathrm{T}}^{\mathrm{miss}}$ as the Z → νν signature.
The event selection is sensitive to the tail of the instrumental $p_{\mathrm{T}}^{\mathrm{miss}}$ resolution in pp → Z+jets events that constitute an important reducible background. This contribution is estimated through a study of a data control region (CR) of γ+jets events, where $p_{\mathrm{T}}^{\mathrm{miss}}$ is purely instrumental as it is in Z+jets events.
Processes such as pp → $t\bar{t}$ or WW result in nonresonant dilepton final states of same (e$^+$e$^-$ and µ$^+$µ$^-$) and opposite flavor (e$^\pm$µ$^\mp$) with the same probability and the same kinematic properties. Thus, their background contribution to the 2ℓ2ν signal, which includes two leptons of the same flavor, is estimated from an opposite-flavor eµ CR.
Other backgrounds from $q\bar{q}$ → ZZ, $q\bar{q}$ → WZ with W → ℓν and an undetected lepton, and the small contribution from tZ production are estimated from simulation. A third CR of trilepton events, consisting mostly of $q\bar{q}$ → WZ events, is used to constrain the $q\bar{q}$ → WZ background and, most importantly, the large $q\bar{q}$ → ZZ background. The ability to constrain $q\bar{q}$ → ZZ from $q\bar{q}$ → WZ is based on the similarity in the physics of these processes.
Further details on event selection, kinematic observables, and the methods to estimate the different contributions are discussed in the Methods section.

2ℓ2ν kinematic observables
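The analysis of off-shell H boson events is based on $m_{\mathrm{ZZ}}$. This quantity is computed from the reconstructed momenta in the 4ℓ final state as the invariant mass of the 4ℓ system, $m_{4\ell}$. However, because of the undetected neutrinos, we can only use the transverse mass $m_{\mathrm{T}}^{\mathrm{ZZ}}$, defined below, as a proxy for $m_{\mathrm{ZZ}}$ in the 2ℓ2ν final state. First, we identify $\vec{p}_{\mathrm{T}}^{\,\mathrm{miss}}$ as the transverse momentum vector of the Z boson decaying into neutrinos. Since there is no information on the longitudinal momenta of the neutrinos, $m_{\mathrm{T}}^{\mathrm{ZZ}}$ is then computed as the invariant mass of the ZZ pair with all longitudinal momenta set to zero. This results in a variable with a distribution that peaks at $m_{\mathrm{ZZ}}$, with a long tail towards lower values. The definition of $m_{\mathrm{T}}^{\mathrm{ZZ}}$ is

$$\left(m_{\mathrm{T}}^{\mathrm{ZZ}}\right)^2 = \left[\sqrt{\left(p_{\mathrm{T}}^{\ell\ell}\right)^2 + m_{\ell\ell}^2} + \sqrt{\left(p_{\mathrm{T}}^{\mathrm{miss}}\right)^2 + m_{\mathrm{Z}}^2}\right]^2 - \left|\vec{p}_{\mathrm{T}}^{\,\ell\ell} + \vec{p}_{\mathrm{T}}^{\,\mathrm{miss}}\right|^2,$$

where $p_{\mathrm{T}}^{\ell\ell}$ and $m_{\ell\ell}$ are the dilepton transverse momentum and invariant mass, respectively, and $m_{\mathrm{Z}}$, the Z boson pole mass, is taken to be 91.2 GeV.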
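The transverse mass can be computed as in the following minimal sketch. It assumes the standard construction in which each Z boson contributes a transverse energy $\sqrt{p_{\mathrm{T}}^2 + m^2}$, with the Z → νν candidate assigned the pole mass of 91.2 GeV. The function and argument names are illustrative and do not come from the analysis code.

```python
import math

M_Z = 91.2  # Z boson pole mass in GeV, as quoted in the text

def transverse_mass_zz(pt_ll, phi_ll, m_ll, pt_miss, phi_miss, m_z=M_Z):
    """Transverse mass of the ZZ system in GeV.

    pt_ll, phi_ll, m_ll: dilepton transverse momentum, azimuth, invariant mass
    pt_miss, phi_miss:   missing transverse momentum magnitude and azimuth
    """
    # Transverse energies with all longitudinal momenta set to zero.
    et_ll = math.sqrt(pt_ll**2 + m_ll**2)
    et_miss = math.sqrt(pt_miss**2 + m_z**2)
    # Vector sum of the two transverse momenta.
    px = pt_ll * math.cos(phi_ll) + pt_miss * math.cos(phi_miss)
    py = pt_ll * math.sin(phi_ll) + pt_miss * math.sin(phi_miss)
    mt2 = (et_ll + et_miss) ** 2 - (px**2 + py**2)
    # Guard against tiny negative values from floating-point rounding.
    return math.sqrt(max(mt2, 0.0))

# Hypothetical event: the dilepton system and the missing transverse
# momentum are back to back, each with 200 GeV of transverse momentum.
mt = transverse_mass_zz(200.0, 0.0, 91.2, 200.0, math.pi)
print(f"mT(ZZ) = {mt:.1f} GeV")
```

In this back-to-back configuration the vector sum vanishes, so the result is simply the sum of the two transverse energies.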
The kinematic quantity $p_{\mathrm{T}}^{\mathrm{miss}}$ itself is used as another observable to discriminate processes with genuine, large $p_{\mathrm{T}}^{\mathrm{miss}}$ against the Z+jets background. Finally, in events with at least two jets, we use matrix element (MELA [26]) kinematic discriminants that distinguish the VBF process from the gg process or SM backgrounds. These discriminants are the $\mathcal{D}^{\mathrm{VBF}}_{\mathrm{2jet}}$-type kinematic discriminants used in Refs. [15,23], and are based on the four-momenta of the H boson and the two jets leading in $p_{\mathrm{T}}$.

Data interpretation
The results for the off-shell signal strength parameters $\mu_{\mathrm{F}}^{\text{off-shell}}$, $\mu_{\mathrm{V}}^{\text{off-shell}}$, and $\mu^{\text{off-shell}}$, and the H boson width $\Gamma_{\mathrm{H}}$ are extracted from binned extended maximum likelihood fits over several kinematic distributions following the parametrization in Ref. [15]. In this parametrization, all mass dependencies are absorbed into the distributions for the various terms contributing to the likelihood, and the off-shell signal strength parameters, or $\Gamma_{\mathrm{H}}$, are kept mass-independent. Over different data periods and event categories, 117 multidimensional distributions are used in the fit: 42 for off-shell 2ℓ2ν data (10 867 events), including 18 distributions from the trilepton WZ CR (8541 events), and 18 and 57 for off-shell and on-shell 4ℓ data (1407 off-shell and 621 on-shell events), respectively.
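To illustrate the idea of a binned likelihood scan over a signal strength, the sketch below builds a toy negative log-likelihood from independent Poisson bins and evaluates it on a grid. This is a strongly simplified stand-in for the actual fit: there are no nuisance parameters, the per-bin templates are hypothetical, and the function names are not those of any analysis package.

```python
import math

def poisson_nll(observed, expected):
    """-ln L for independent Poisson bins, dropping the constant ln(n!) terms."""
    nll = 0.0
    for n, nu in zip(observed, expected):
        if nu <= 0:
            raise ValueError("expected yields must be positive")
        nll += nu - n * math.log(nu)
    return nll

def likelihood_scan(observed, templates, mu_values):
    """Return -2 Delta ln L for each mu in mu_values.

    templates: one (n_signal, n_interference, n_continuum) tuple per bin,
               with the signal scaling as mu and the interference as sqrt(mu).
    """
    nll = []
    for mu in mu_values:
        expected = [mu * s + math.sqrt(mu) * i + c for s, i, c in templates]
        nll.append(poisson_nll(observed, expected))
    best = min(nll)
    return [2.0 * (value - best) for value in nll]

# Hypothetical two-bin example; the "data" equal the mu = 1 prediction.
templates = [(10.0, -15.0, 40.0), (5.0, -6.0, 12.0)]
observed = [35, 11]
mu_grid = [0.0, 0.5, 1.0, 1.5, 2.0]
scan = likelihood_scan(observed, templates, mu_grid)
for mu, q in zip(mu_grid, scan):
    print(f"mu = {mu:.1f}  -2 dlnL = {q:.3f}")
```

In the real analysis, the fit is performed simultaneously over all distributions with systematic uncertainties profiled; the toy only shows how the signal strength enters the expected yields.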
In the 2ℓ2ν data sample, the value of $m_{\mathrm{T}}^{\mathrm{ZZ}}$ is required to be greater than 300 GeV. Depending on the number of jets ($N_j$), this sample is binned in $m_{\mathrm{T}}^{\mathrm{ZZ}}$ and $p_{\mathrm{T}}^{\mathrm{miss}}$ ($N_j < 2$), or $m_{\mathrm{T}}^{\mathrm{ZZ}}$, $p_{\mathrm{T}}^{\mathrm{miss}}$, and the $\mathcal{D}^{\mathrm{VBF}}_{\mathrm{2jet}}$-type kinematic discriminants ($N_j \geq 2$). For the 4ℓ samples, the binning is in $m_{4\ell}$ and MELA discriminants, which are sensitive to differences between the H boson signal and continuum ZZ production, or the interfering amplitudes, or anomalous HVV couplings. These variables are listed in Table II of Ref. [15] for 4ℓ off-shell data, under 'Scheme 2' in Table IV of Ref. [23] for on-shell 2016-2018 data, and in Table 1 of Ref. [15] for on-shell 2015 data. The $m_{4\ell}$ range is required to be within 105-140 GeV for 4ℓ on-shell data, or above 220 GeV for 4ℓ off-shell data.
Theoretical uncertainties in the kinematic distributions include the simulation of extra jets (up to 20% depending on $N_j$), and the quantum chromodynamic (QCD) running scale and parton distribution function (PDF) uncertainties in the cross section calculation (up to 30% and 20%, respectively, depending on the process, and $m_{\mathrm{T}}^{\mathrm{ZZ}}$ or $m_{4\ell}$). These are particularly important in the gg process since it cannot be constrained by the trilepton WZ CR. Theory uncertainties also include those associated with the EW corrections to the $q\bar{q}$ → ZZ and WZ processes, which reach 20% at masses around 1 TeV [31, 32].
Experimental uncertainties include uncertainties in the lepton reconstruction and trigger efficiency (typically 1% per lepton), the integrated luminosity (between 1.2% and 2.5%, depending on the data-taking period [33-35]), and the jet energy scale and resolution [36], which affect the counting of jets, as well as the reconstruction of the VBF discriminants.

Figure 3: ... from the 4ℓ off-shell signal region are displayed on the right. The stacked histogram displays the distribution after a fit to the data with SM couplings, with the blue filled area corresponding to the SM processes that do not include H boson interactions, and the pink filled area adding processes that include H boson and interference contributions. The gold dot-dashed line shows the fit to the no off-shell hypothesis. The black points with error bars as uncertainties at 68% CL show the observed data, which is consistent with the prediction with SM couplings within one standard deviation. The last bins contain the overflow. The requirements on the missing transverse momentum $p_{\mathrm{T}}^{\mathrm{miss}}$ in 2ℓ2ν events, and the $\mathcal{D}_{\mathrm{bkg}}$-type kinematic background discriminants (see Table II of Ref. [15]) in 4ℓ events are applied in order to enhance the H boson signal contribution. The values of integrated luminosity displayed correspond to those included in the off-shell analyses of each final state. The bottom panels show the ratio of the data or dashed histograms to the SM prediction (stacked histogram). The black horizontal line in these panels marks unit ratio.

Evidence for off-shell contributions, and width measurement
The constraints on $\mu_{\mathrm{F}}^{\text{off-shell}}$, $\mu_{\mathrm{V}}^{\text{off-shell}}$, $\mu^{\text{off-shell}}$, and $\Gamma_{\mathrm{H}}$ are summarized in Table 1, where we show the "observed" results, i.e., those extracted from data, as well as the "expected" ones, i.e., those based on the SM and our understanding of selection efficiencies, backgrounds, and systematic uncertainties. The two sets of results are consistent with statistical fluctuations in the data. The constraint on $\Gamma_{\mathrm{H}}$ at 95% confidence level corresponds to $7.7 \times 10^{-23} < \tau_{\mathrm{H}} < 1.3 \times 10^{-21}$ s in H boson lifetime.
The profile likelihood scans in the $\mu_{\mathrm{F}}^{\text{off-shell}}$ and $\mu_{\mathrm{V}}^{\text{off-shell}}$ plane are shown on the left panel of Fig. 4; scans over the individual signal strengths are in Extended Data Fig. 8. Likelihood scans over $\Gamma_{\mathrm{H}}$ are displayed in the right panel of Fig. 4. These scans always include information from the 4ℓ on-shell data, and the three cases displayed correspond to adding the 4ℓ off-shell data alone, the 2ℓ2ν off-shell data alone, or adding both. The steepness of the slope of the log-likelihood curves near $\mu^{\text{off-shell}} = 0$ and $\Gamma_{\mathrm{H}} = 0$ MeV is caused by the interference terms between the H boson and continuum ZZ production amplitudes that scale with $\sqrt{\mu^{\text{off-shell}}}$ or $\sqrt{\Gamma_{\mathrm{H}}}$, respectively.
The no off-shell scenario with µ off-shell = 0, or Γ H = 0 MeV is excluded at a p-value of 0.0003 (3.6 standard deviations). The p-value calculation is checked with pseudoexperiments and the Feldman-Cousins prescription [37]. As described in greater detail in the Methods section, the exclusion is illustrated in Extended Data Fig. 9 through a comparison of the total number of events in each off-shell signal region bin predicted for the fit of the data to the no off-shell scenario, and the best fit. Constraints on Γ H are stable within 1 MeV (0.1 MeV) for the upper (lower) limits when testing the presence of anomalous HVV couplings. More results on these anomalous couplings are discussed in the Methods section, and can be found in Extended Data Fig. 8
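The correspondence between the quoted p-value and the number of standard deviations can be checked numerically. The sketch below assumes the two-sided Gaussian convention, which is equivalent to the tail probability of a chi-square distribution with one degree of freedom evaluated at the test statistic; this convention is an assumption made here because it reproduces the quoted pair of numbers, and the function names are illustrative.

```python
import math

def p_value_from_significance(z):
    """Two-sided Gaussian tail probability for a significance of z s.d."""
    return math.erfc(z / math.sqrt(2.0))

def p_value_from_test_statistic(q):
    """Tail probability of a chi-square with one degree of freedom at q."""
    return math.erfc(math.sqrt(q / 2.0))

p = p_value_from_significance(3.6)
print(f"p-value for 3.6 s.d. = {p:.4f}")

# The same number is obtained from q = z^2 in the chi-square form.
print(f"p-value for q = 12.96 = {p_value_from_test_statistic(3.6**2):.4f}")
```

Under this convention, 3.6 standard deviations corresponds to a p-value of about 0.0003, in line with the values quoted above.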

Experimental setup
The CMS apparatus [39] is a multipurpose, nearly hermetic detector, designed to trigger on [40] and identify muons, electrons, photons, and charged or neutral hadrons [41][42][43]. A global reconstruction algorithm, particle-flow (PF) [44], combines the information provided by the all-silicon inner tracker and by the crystal electromagnetic and brass-scintillator hadron calorimeters (ECAL and HCAL, respectively), operating inside a 3.8 T superconducting solenoid, with data from gas-ionization muon detectors interleaved with the solenoid return yoke, to build jets, missing transverse momentum, tau leptons, and other physics objects [36, 45, 46]. In the following discussion up to likelihood scans, we will focus on the details of the 2ℓ2ν analysis. Analysis details for the off-shell 4ℓ data can be found in Ref. [15], 2015 on-shell 4ℓ data in Refs. [15,22], and 2016-2018 on-shell 4ℓ data in Ref. [23].

Physics objects
Events in the 2ℓ2ν signal region, the eµ CR, and the trilepton WZ CR are selected using single-lepton and dilepton triggers. The efficiencies of these selections are measured using orthogonal triggers, i.e., jet or $p_{\mathrm{T}}^{\mathrm{miss}}$ triggers, and events triggered on a third, isolated lepton, or a jet. They range between 78% and 100%, depending on the flavor of the leptons, and $p_{\mathrm{T}}$ and η of the dilepton system, taking lower values at lower $p_{\mathrm{T}}$. Photon triggers are used to collect events for the γ+jets CR. The photon trigger efficiency is measured using a tag-and-probe method [47] in Z → ee events, with one electron interpreted as a photon with tracks ignored, as well as through a study of γ events. The efficiency is found to range from ∼55% at 55 GeV in photon $p_{\mathrm{T}}$ to ∼95% at photon $p_{\mathrm{T}} > 220$ GeV.
Jets are reconstructed using the anti-$k_{\mathrm{T}}$ algorithm [48] with a distance parameter of 0.4. Jet energies are corrected for instrumental effects, as well as for the contribution of particles originating from additional pp interactions (pileup). A multivariate technique is used to suppress jets from pileup interactions [49]. For the purpose of this analysis, we select jets with $p_{\mathrm{T}} > 30$ GeV and $|\eta| < 4.7$, and they must be separated from any lepton or photon of interest by $\Delta R = \sqrt{(\Delta\phi)^2 + (\Delta\eta)^2} > 0.4$, where φ is the azimuthal angle measured in radians. Jets within $|\eta| < 2.5$ ($|\eta| < 2.4$ for 2016 data) can be identified as b jets using the DEEPJET algorithm [50] with a loose working point. The efficiency of this working point ranges between 75% and 95%, depending on $p_{\mathrm{T}}$, η, and the data period.
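The angular separation used in the jet selection can be sketched as follows. The only subtlety is that the azimuthal difference has to be wrapped into the range from −π to π before it is combined with the pseudorapidity difference. The function names and the example objects are hypothetical.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)  # now in [0, 2*pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation Delta R = sqrt(dphi^2 + deta^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def is_separated(jet, objects, min_dr=0.4):
    """True if the jet is farther than min_dr from every listed object.

    jet and each object are (eta, phi) tuples.
    """
    return all(delta_r(jet[0], jet[1], o[0], o[1]) > min_dr for o in objects)

# Hypothetical jet close to a lepton across the phi = +-pi boundary.
jet = (0.0, 3.1)
leptons = [(0.0, -3.1), (1.5, 0.2)]
print(delta_r(jet[0], jet[1], leptons[0][0], leptons[0][1]))
print(is_separated(jet, leptons))
```

Without the wrapping step, the first lepton would appear to be 6.2 rad away in azimuth instead of about 0.08 rad, and the overlapping jet would be kept by mistake.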
The missing transverse momentum vector $\vec{p}_{\mathrm{T}}^{\,\mathrm{miss}}$ is estimated from the negative of the vector sum of the transverse momenta of all PF candidates. Dedicated algorithms [51] are used to eliminate events featuring cosmic ray contributions, beam-gas interactions, beam halo, or calorimetric noise.
The algorithms to reconstruct leptons are described in detail in Ref. [41] for muons and Ref. [42] for electrons. Muons are identified using a set of requirements on individual variables, while electrons are identified using a boosted decision tree algorithm. Leptons of interest in this analysis are expected to be isolated with respect to the activity in the rest of the event. A measure of isolation is computed from the flux of photons and hadrons reconstructed by the PF algorithm that are within a cone of $\Delta R < 0.3$ built around the lepton direction, including corrections from the contributions of pileup. We define loose and tight isolation requirements for muons (electrons) with $p_{\mathrm{T}} > 5$ GeV and $|\eta| < 2.4$ ($|\eta| < 2.5$). The efficiency of loose selection for muons (electrons) ranges from ∼85% (65-75%, depending on η) at $p_{\mathrm{T}} = 5$ GeV to > 90% (> 85%) at $p_{\mathrm{T}} > 25$ GeV. The additional requirements for tight selections reduce efficiencies by 10-15%.
Photons are reconstructed from energy clusters in the ECAL not linked to charged tracks, with the exception of converted photons [42]. Their energies are corrected for shower containment in the ECAL crystals and energy loss due to conversions in the tracker with a multivariate regression. In this analysis, we consider photons with p T > 20 GeV and |η| up to 2.5, with requirements on shower shape and isolation used to identify isolated photons and separate them from hadronic jets. The selection requirements are tightened in the γ+jets CR, which leads to selection efficiencies in the range 50-75%, depending on p T and η.

Event simulation
The signal Monte Carlo (MC) samples are generated for an undecayed H boson for gg, VBF, ZH, and WH productions using the POWHEG 2 [52][53][54][55] program at next-to-leading order (NLO) in QCD at various H boson pole masses, ranging from 125 GeV to 3 TeV. The generated H bosons are decayed to four-fermion final states through intermediate Z bosons using the JHUGEN [26] program, with versions between 6.9.8 and 7.4.0.
These samples are reweighted using the MELA matrix element package, which interfaces with the JHUGEN and MCFM [13,56,57,58] matrix elements, following the same reweighting techniques used in Ref. [15] to obtain the final ZZ event sample, including the H boson contribution, the continuum, and their interference. The MELAANALYTICS package developed for Ref. [15] is used to automate matrix element computations and to account for the extra partons in the NLO simulation. The gg generation is rescaled with the next-to-NLO (NNLO) QCD K-factor, differential in $m_{\mathrm{VV}}$, and an additional uniform K-factor of 1.10 for the next-to-NNLO cross section computed at $m_{\mathrm{H}} = 125$ GeV [10]. Furthermore, the pole mass values of the top quark (173 GeV) and the bottom quark (4.8 GeV) [59] are used in the massive loop calculations for the generation of this process. The difference that would be introduced by using the $\overline{\mathrm{MS}}$ renormalization scheme for these masses is found to be within the systematic uncertainties after accounting for the effects on both the H boson and continuum ZZ amplitudes.
The tree-level Feynman diagrams in Fig. 1 illustrate the complete set contributing to the gg → ZZ process on the leftmost top and bottom panels, and some of the diagrams contributing to the EW ZZ production associated with two fermions on the middle and top right panels. Extended Data Figs. 1 and 2 display the full set of diagrams for the EW process.
The $q\bar{q}$ → ZZ and WZ MC samples are also generated with POWHEG 2, applying EW NLO corrections for two on-shell Z and W bosons [31,32], and NNLO QCD corrections as a function of $m_{\mathrm{VV}}$ [60]. The tree-level Feynman diagrams for these noninterfering continuum contributions are illustrated in Extended Data Fig. 3. Samples for the tZ+X processes, or other processes contributing to the CRs, are generated using MADGRAPH5_aMC@NLO at NLO or LO precision using the FxFx [61] or MLM [62] schemes, respectively, to match jets from matrix element calculations and parton shower.
The parton shower and hadronization are modeled with PYTHIA (8.205

Signal region selection requirements
Events in the 2ℓ2ν final state are required to have two opposite-sign, same-flavor leptons (µ$^+$µ$^-$ or e$^+$e$^-$) satisfying tight isolation requirements with $p_{\mathrm{T}} > 25$ GeV, $m_{\ell\ell}$ within 15 GeV of $m_{\mathrm{Z}}$, and $p_{\mathrm{T}}^{\ell\ell} > 55$ GeV. Additional requirements are imposed to reduce contributions from Z+jets and $t\bar{t}$ processes as follows. Events with b-tagged jets, additional loosely isolated leptons of $p_{\mathrm{T}} > 5$ GeV, or additional loosely identified photons with $p_{\mathrm{T}} > 20$ GeV are vetoed. To further improve the effectiveness of the lepton veto, events with isolated reconstructed tracks of $p_{\mathrm{T}} > 10$ GeV are removed. This requirement is also effective against one-prong τ decays.
The value of p miss T is required to be > 125 GeV (> 140 GeV) for N j < 2 (≥ 2). Requirements are imposed on the unsigned azimuthal opening angles (∆φ) between p miss T and other objects in the event in order to reduce contamination from p miss

Matrix element kinematic discriminants
In events with $N_j \geq 2$, we use two MELA kinematic discriminants for the VBF process, $\mathcal{D}^{\mathrm{VBF}}_{\mathrm{2jet}}$ and $\mathcal{D}^{\mathrm{VBF},a2}_{\mathrm{2jet}}$ [15]. Each of these discriminants consists of a ratio of two matrix elements, or equivalently a ratio of event-by-event probability functions, expressed in terms of the four-momenta of the H boson and the two jets leading in $p_{\mathrm{T}}$. The four-momentum of the H boson in the 2ℓ2ν channel is approximated by taking the η of the Z → 2ν candidate, together with its sign, to be the same as that of the Z → 2ℓ candidate. This approximation is found to be adequate through MC studies.
In both discriminants, one of the matrix elements is always computed for the SM H boson production through gluon fusion. The remaining matrix element is computed for the SM VBF process in $\mathcal{D}^{\mathrm{VBF}}_{\mathrm{2jet}}$, so this discriminant improves the sensitivity to the EW H boson production. The $\mathcal{D}^{\mathrm{VBF},a2}_{\mathrm{2jet}}$ discriminant also computes the remaining matrix element for the VBF process, but under the $a_2$ HVV coupling hypothesis instead of the SM scenario. We find that this second discriminant brings additional sensitivity to SM backgrounds as well as being sensitive to the $a_2$ HVV coupling hypothesis by design. When anomalous HVV contributions are considered, the $a_2$ hypothesis used in the computation is replaced by the appropriate $a_i$ hypothesis to optimize sensitivity for the coupling of interest.

Control regions
As already mentioned, Z+jets events are a background to the $2\ell2\nu$ signal selection. This can occur because of resolution effects in $p_{\mathrm{T}}^{\text{miss}}$ and the large cross section for this process. Since γ+jets and Z+jets have similar production and $p_{\mathrm{T}}^{\text{miss}}$ resolution properties, the Z+jets contributions at high $p_{\mathrm{T}}^{\text{miss}}$ can be estimated from a γ+jets CR [68].
In this CR, all event selection requirements are the same as those in the signal region, except that the photon replaces the Z → $\ell\ell$ decay. The $m_{\mathrm{T}}^{\mathrm{ZZ}}$ kinematic variable is constructed using the photon $p_{\mathrm{T}}$ in place of $p_{\mathrm{T}}^{\ell\ell}$, and $m_{\mathrm{Z}}$ in place of $m_{\ell\ell}$. Only photons in the barrel region (i.e., $|\eta| < 1.44$) are considered for $N_j < 2$ to eliminate beam halo events that can mimic the γ + $p_{\mathrm{T}}^{\text{miss}}$ signature. Reweighting factors are extracted as a function of photon $p_{\mathrm{T}}$, photon $\eta$ (when $N_j \geq 2$), and the number of observed pp collisions by matching the corresponding distributions in γ+jets sidebands at low $p_{\mathrm{T}}^{\text{miss}}$ (< 125 GeV) to those of Z+jets sidebands with the same requirement, separately in each $N_j$ category. These reweighting factors are then applied to the high-$p_{\mathrm{T}}^{\text{miss}}$ γ+jets data sample. This technique to estimate the background from the data is verified using closure tests from the simulation by comparing the Z+jets and reweighted γ+jets MC distributions over each kinematic observable.
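The matching of sideband distributions can be pictured with a one-dimensional simplification: a per-bin ratio of Z+jets to γ+jets sideband yields in a single variable, applied afterwards as per-event weights. The sketch below assumes this simplified form; the actual procedure uses several variables and is done per $N_j$ category. All function names and the binning are hypothetical.

```python
import numpy as np

def bin_index(values, bin_edges):
    """Bin index per value, or -1 outside [bin_edges[0], bin_edges[-1])."""
    idx = np.digitize(values, bin_edges) - 1
    idx[(idx < 0) | (idx >= len(bin_edges) - 1)] = -1
    return idx

def sideband_counts(values, bin_edges):
    """Unweighted event counts per bin."""
    idx = bin_index(values, bin_edges)
    return np.bincount(idx[idx >= 0], minlength=len(bin_edges) - 1)

def reweighting_factors(gamma_values, z_values, bin_edges):
    """Per-bin ratio of Z+jets to gamma+jets sideband yields (0 if empty)."""
    n_gamma = sideband_counts(gamma_values, bin_edges)
    n_z = sideband_counts(z_values, bin_edges)
    factors = np.zeros(len(bin_edges) - 1)
    np.divide(n_z, n_gamma, out=factors, where=n_gamma > 0)
    return factors

def event_weights(values, factors, bin_edges):
    """Look up the factor for each event; events outside the binning get 0."""
    idx = bin_index(values, bin_edges)
    weights = np.zeros(len(idx))
    inside = idx >= 0
    weights[inside] = factors[idx[inside]]
    return weights
```

In this picture, the factors would be derived from the low-$p_{\mathrm{T}}^{\text{miss}}$ sidebands and the weights applied to the high-$p_{\mathrm{T}}^{\text{miss}}$ γ+jets events.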
Contributions to the γ+jets CR from events with genuine, large $p_{\mathrm{T}}^{\text{miss}}$ from the Z(→ $\nu\nu$)γ, W(→ $\ell\nu$)γ, and W(→ $\ell\nu$)+jets processes are subtracted in the final estimate of the instrumental $p_{\mathrm{T}}^{\text{miss}}$ background. The first two are estimated from simulation, where the Zγ contribution is corrected based on the observed rate of Z(→ $\ell\ell$)γ. The W+jets contribution is estimated from a single-electron sample selected with requirements similar to those in the γ+jets CR. Representative distributions for this estimate are shown in Extended Data Fig. 5.
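The subtraction amounts to a bin-by-bin difference between the reweighted CR data and the summed genuine-$p_{\mathrm{T}}^{\text{miss}}$ predictions. The following is a minimal sketch under that reading; the function name and the dictionary layout are hypothetical, and no treatment of negative bins or uncertainties is attempted.

```python
import numpy as np

def instrumental_met_estimate(cr_data, genuine_met):
    """Subtract genuine-MET yields from reweighted gamma+jets CR data.

    cr_data: per-bin yields of the reweighted CR data.
    genuine_met: dict mapping a process label to its per-bin yields.
    """
    cr_data = np.asarray(cr_data, dtype=float)
    total_genuine = np.sum(
        [np.asarray(v, dtype=float) for v in genuine_met.values()], axis=0
    )
    return cr_data - total_genuine
```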
Processes such as pp → $\mathrm{t\bar{t}}$ and pp → WW, including nonresonant H boson contributions, can produce two leptons and large $p_{\mathrm{T}}^{\text{miss}}$ without a resonant Z → $\ell\ell$ decay. The kinematic properties of the dilepton system in these processes are the same for any combination of lepton flavors e or µ. These nonresonant ee or µµ background processes are therefore estimated from an eµ CR. This CR is constructed by applying the same requirements used in the signal selection except for the flavor of the leptons. Data events are reweighted to account for differences in trigger and reconstruction efficiencies between the eµ final state and the ee or µµ final states. Representative distributions for this estimate are shown in Extended Data Fig. 6.
A third CR selects trilepton $\mathrm{q\bar{q}}$ → WZ events. These events are used to constrain the normalization and kinematic properties of the $\mathrm{q\bar{q}}$ → ZZ and WZ continuum contributions. The Z → $\ell\ell$ candidate is identified from the opposite-sign, same-flavor lepton pair with $m_{\ell\ell}$ closest to $m_{\mathrm{Z}}$, and the value of $m_{\ell\ell}$ for this Z candidate is required to be within 15 GeV of $m_{\mathrm{Z}}$. Trigger requirements are only placed on this Z candidate. The remaining lepton is identified as the lepton from the W decay ($\ell_{\mathrm{W}}$). The leading-$p_{\mathrm{T}}$ lepton from the Z decay is required to satisfy $p_{\mathrm{T}} > 30$ GeV, and the remaining leptons are required to satisfy $p_{\mathrm{T}} > 20$ GeV.
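The lepton assignment described above can be sketched as follows. This is an illustrative simplification: leptons are treated as massless, the Z boson mass value is an assumed constant, and the dictionary layout and function names are hypothetical. Trigger, isolation, and $p_{\mathrm{T}}^{\text{miss}}$ requirements are not modeled.

```python
import itertools
import math

M_Z = 91.19  # GeV, approximate Z boson mass (assumed value for this sketch)

def invariant_mass(l1, l2):
    """Invariant mass of two massless leptons given (pt, eta, phi)."""
    m2 = 2.0 * l1["pt"] * l2["pt"] * (
        math.cosh(l1["eta"] - l2["eta"]) - math.cos(l1["phi"] - l2["phi"])
    )
    return math.sqrt(max(m2, 0.0))

def select_wz_candidate(leptons, window=15.0):
    """Return (z_pair, w_lepton) for a three-lepton event, or None."""
    if len(leptons) != 3:
        return None
    best = None
    for i, j in itertools.combinations(range(3), 2):
        a, b = leptons[i], leptons[j]
        # Require same flavor and opposite charge for the Z candidate.
        if a["flavor"] != b["flavor"] or a["charge"] * b["charge"] >= 0:
            continue
        dm = abs(invariant_mass(a, b) - M_Z)
        if best is None or dm < best[0]:
            best = (dm, i, j)
    if best is None or best[0] > window:
        return None
    _, i, j = best
    w_lepton = leptons[3 - i - j]  # the one index not used by the Z pair
    z_pair = sorted((leptons[i], leptons[j]),
                    key=lambda lep: lep["pt"], reverse=True)
    # Leading Z lepton: pT > 30 GeV; remaining leptons: pT > 20 GeV.
    if z_pair[0]["pt"] <= 30.0 or z_pair[1]["pt"] <= 20.0:
        return None
    if w_lepton["pt"] <= 20.0:
        return None
    return z_pair, w_lepton
```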
Similar to the signal region, requirements are imposed on the unsigned ∆φ between p miss T and other objects in the event in order to reduce contamination from the Z+jets and qq → Zγ processes: ∆φ miss > 1.0 between p miss T and p T for the Z candidate, ∆φ

Likelihood scans
As mentioned in the discussion of data interpretation, the likelihood is constructed from several multidimensional distributions binned over the different event categories. Profile likelihood scans over $\mu_{\mathrm{F}}^{\text{off-shell}}$, $\mu_{\mathrm{V}}^{\text{off-shell}}$, $\mu^{\text{off-shell}}$, and $\Gamma_{\mathrm{H}}$ are shown in Extended Data Fig. 8. When testing the effects of anomalous HVV couplings, we perform fits to the data in which all BSM couplings in the model are set to zero except the one being tested. Because the only remaining degree of freedom is the ratio of this BSM coupling to the SM-like coupling, $a_1$, the probability densities are parametrized in terms of the effective, signed on-shell cross section fraction $f_{ai}$ for each of the $a_i$ couplings, where the sign of the phase of $a_i$ relative to $a_1$ is absorbed into the definition of $f_{ai}$ [23]. The constraints on $\Gamma_{\mathrm{H}}$ are found to be stable within 1 MeV (0.1 MeV) for the upper (lower) limits under the different anomalous HVV coupling conditions, and they are summarized in Extended Data Table 1.
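For readers unfamiliar with profile likelihood scans, the toy below shows the general idea on a three-bin counting model. It is not the analysis likelihood: the model (a signal strength `mu` scaling a signal template, plus one background normalization `theta` with a Gaussian constraint) and all names and numbers are assumptions made for illustration, and the nuisance parameter is profiled on a simple grid rather than with a minimizer.

```python
import numpy as np

def nll(mu, theta, observed, signal, background, theta_unc=0.1):
    """Negative log-likelihood: Poisson bins plus a Gaussian constraint."""
    expected = mu * signal + theta * background
    poisson = np.sum(expected - observed * np.log(expected))
    constraint = 0.5 * ((theta - 1.0) / theta_unc) ** 2
    return poisson + constraint

def profile_scan(mu_values, observed, signal, background,
                 theta_grid=np.linspace(0.5, 1.5, 201)):
    """Return -2 * delta ln L at each mu, minimizing over theta on a grid.

    The reference minimum is the smallest value found on the scan grid.
    """
    profiled = np.array([
        min(nll(mu, th, observed, signal, background) for th in theta_grid)
        for mu in mu_values
    ])
    return 2.0 * (profiled - profiled.min())

# Toy inputs: observed counts equal to the mu = 1, theta = 1 expectation.
signal = np.array([2.0, 3.0, 4.0])
background = np.array([10.0, 5.0, 1.0])
observed = signal + background
mu_values = np.linspace(0.0, 3.0, 31)
scan = profile_scan(mu_values, observed, signal, background)
```

Intervals are then read off where the scanned curve crosses fixed thresholds, with the no-signal hypothesis corresponding to the value at `mu = 0`.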
In addition, we provide a simplified illustration for the exclusion of the no off-shell hypothesis in Extended Data Fig. 9. In this figure, the total number of events in each bin of the likelihood from the $2\ell2\nu$ and $4\ell$ off-shell regions is compared between the fit of the data to the no off-shell scenario ($N_{\text{no off-shell}}$) and the best fit ($N_{\text{best fit}}$). Events can then be rebinned over the ratio $N_{\text{no off-shell}}/(N_{\text{no off-shell}} + N_{\text{best fit}})$ extracted from each bin, and these rebinned distributions can then be compared at different $\Gamma_{\mathrm{H}}$ values. In particular, we compare the observed and expected event distributions over this ratio under the best fit scenario and the scenario with no off-shell H boson production, in order to illustrate which bins bring the most sensitivity to the exclusion of the no off-shell scenario. The exclusion is most apparent in the last two bins displayed in this figure. We note, however, that the full power of the analysis ultimately comes from the different bins over the multidimensional likelihood, and that this figure only serves to condense the information for illustration.
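The rebinning over the ratio can be sketched as a weighted histogram: each likelihood bin contributes its yield to the ratio bin it falls into. The code below is a minimal sketch assuming per-bin predicted yields are available as flat arrays; the function name, argument names, and ratio binning are hypothetical.

```python
import numpy as np

def rebin_by_ratio(n_no_offshell, n_best_fit, yields, ratio_edges):
    """Sum `yields` over likelihood bins grouped by N_no / (N_no + N_best).

    n_no_offshell, n_best_fit: predicted yields per likelihood bin under
    the two fits. yields: the quantity to rebin (observed counts or one
    of the predictions). Bins with zero total prediction are skipped.
    """
    n_no = np.asarray(n_no_offshell, dtype=float)
    n_best = np.asarray(n_best_fit, dtype=float)
    yields = np.asarray(yields, dtype=float)
    denom = n_no + n_best
    keep = denom > 0
    ratio = n_no[keep] / denom[keep]
    rebinned, _ = np.histogram(ratio, bins=ratio_edges, weights=yields[keep])
    return rebinned
```

Because the ratio lies between 0 and 1 by construction, edges spanning that range capture every bin with a nonzero prediction.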
When we perform separate likelihood scans over the three $f_{ai}$ fractions, only the corresponding BSM parameter is allowed to be nonzero in the fit. Profile likelihood scans for $f_{a2}$, $f_{a3}$, and $f_{\Lambda 1}$ under different fit conditions are shown in Extended Data Fig. 8, and the summary of the allowed intervals at 68% and 95% CL is presented in Extended Data Table 1.

Data availability
Tabulated results are provided in the HEPData record for this analysis [38]. Release and preservation of data used by the CMS Collaboration as the basis for publications is guided by the CMS data preservation, reuse, and open access policy.

Code availability
The CMS core software is publicly available on GitHub (https://github.com/cms-sw/cmssw).

Extended Data
Here, f refers to any $\ell$, $\nu$, or q. The tree-level diagrams featuring VBF production are grouped together in the upper row, and those featuring VH production are grouped in the lower row. The interaction displayed in each diagram is meant to progress from left to right. Each straight, curvy, or curly line refers to the different set of particles denoted. Straight, solid lines with no arrows indicate the line could refer to either a particle or an antiparticle, whereas those with forward (backward) arrows refer to a particle (an antiparticle).
Figure 2: Feynman diagrams for the EW continuum ZZ production contributions. Here, f refers to any $\ell$, $\nu$, or q. The tree-level diagrams featuring vector boson scattering (VBS) production are grouped together in the upper half, and those featuring VZZ production are grouped in the lower half. The interaction displayed in each diagram is meant to progress from left to right. Each straight, curvy, or curly line refers to the different set of particles denoted. Straight, solid lines with no arrows indicate the line could refer to either a particle or an antiparticle, whereas those with forward (backward) arrows refer to a particle (an antiparticle).
Extended Data Figure 3: Feynman diagram for the $\mathrm{q\bar{q}}$ → ZZ and $\mathrm{q\bar{q}}$ → WZ processes. Both processes are represented at tree level with a single diagram. These two processes constitute the major irreducible, noninterfering background contributions in the off-shell region. The interaction displayed in each diagram is meant to progress from left to right. Each straight, curvy, or curly line refers to the different set of particles denoted. Straight, solid lines with no arrows indicate the line could refer to either a particle or an antiparticle, whereas those with forward (backward) arrows refer to a particle (an antiparticle).
> 200 GeV to enrich H boson contributions. The color legend for the stacked or dot-dashed histograms is given above the plots. The stacked histogram is split into the following components: gg (light pink) and EW (dark pink) ZZ production, instrumental $p_{\mathrm{T}}^{\text{miss}}$ background (purple), nonresonant processes (gray), the $\mathrm{q\bar{q}}$ → ZZ (blue) and $\mathrm{q\bar{q}}$ → WZ (green) processes, and tZ+X production, where X refers to any other particle. Postfit refers to individual fits of the data (shown as black points with error bars as uncertainties at 68% CL) to the combined $2\ell2\nu$ + $4\ell$ sample, including the WZ control region, and assuming either SM H boson parameters (stacked histogram with the hashed band as the total postfit uncertainty at 68% CL) or no off-shell H boson production (dot-dashed gold line). The middle panels along the vertical show the ratio of the data or dashed histograms to the stacked histogram, and the lower panels show the predicted relative contributions of each process. The rightmost bins contain the overflow.
Extended Data Figure 5: Distributions of $m_{\mathrm{T}}^{\mathrm{ZZ}}$ in the different $N_j$ categories of the γ+jets CR. The distributions of the transverse ZZ invariant mass are displayed for the $N_j = 0$, $N_j = 1$, and $N_j \geq 2$ jet multiplicity categories from left to right. The missing transverse momentum requirement $p_{\mathrm{T}}^{\text{miss}} > 200$ GeV is applied in the $N_j \geq 2$ category to focus on the region more sensitive to off-shell H boson production. The stacked histogram shows the predictions for contributions with genuine, large $p_{\mathrm{T}}^{\text{miss}}$, or the instrumental $p_{\mathrm{T}}^{\text{miss}}$ background from the γ+jets simulation. Contributions with genuine, large $p_{\mathrm{T}}^{\text{miss}}$ are split as those coming from the more dominant Z(→ $\nu\nu$)γ (teal), W(→ $\ell\nu$)γ (purple), and W(→ $\ell\nu$)+jets (yellow) processes, and other small components (red). The prediction for instrumental $p_{\mathrm{T}}^{\text{miss}}$ background from simulation is shown in light pink. The black points with error bars as uncertainties at 68% CL show the observed CR data. The distributions are reweighted with the γ → $\ell\ell$ transfer factors extracted from the $p_{\mathrm{T}}^{\text{miss}} < 125$ GeV sidebands. The rightmost bins include the overflow. In these distributions, we find a discrepancy between the observed data and the predicted distributions because the reweighted γ+jets samples have inaccurate $p_{\mathrm{T}}^{\text{miss}}$ response and the simulation is at LO in QCD. Therefore, we use the difference between the observed data and the genuine-$p_{\mathrm{T}}^{\text{miss}}$ contributions to model the instrumental $p_{\mathrm{T}}^{\text{miss}}$ background instead of using simulation for this estimate.
(right) kinematic VBF discriminants are shown in the $2\ell2\nu$ signal region, $N_j \geq 2$ category. The stacked histogram shows the predictions from simulation, which consists of nonresonant contributions from WW (green) and $\mathrm{t\bar{t}}$ (gray) production, or other small components (orange). The black points with error bars as uncertainties at 68% CL show the prediction from the eµ CR data. While only the data is used in the final estimate of the nonresonant background, we note that predictions from simulation already agree well with the data estimate.
in different $N_j$ categories of the WZ control region. The postfit distributions of the transverse WZ invariant mass are displayed for the $N_j = 0$, $N_j = 1$, and $N_j \geq 2$ jet multiplicity categories of the WZ → $3\ell1\nu$ control region from left to right. Postfit refers to a combined $2\ell2\nu$ + $4\ell$ fit, together with this control region, assuming SM H boson parameters. The stacked histogram is shown with the hashed band as the total postfit uncertainty at 68% CL. The color legend is given above the plots, with the different contributions referring to the WZ (light green), ZZ (blue), Z+jets (dark green), Zγ (yellow), $\mathrm{t\bar{t}}$ (gray), and tV+X (brown, with X being any other particle) production processes, as well as the small EW ZZ production component (dark pink). The black points with error bars as uncertainties at 68% CL show the observed data. The middle panels along the vertical show the ratio of the data to the total prediction, and the lower panels show the predicted relative contributions of each process. The rightmost bins contain the overflow.
Extended Data Figure 9: Distributions of ratios of the numbers of events in each off-shell signal region bin. The ratios are taken after separate fits to the no off-shell hypothesis ($N_{\text{no off-shell}}$) and the best overall fit ($N_{\text{best fit}}$) with the observed $\Gamma_{\mathrm{H}}$ value of 3.2 MeV in the SM-like HVV couplings scenario. The stacked histogram displays the predicted contributions (pink from the $4\ell$ off-shell and green from the $2\ell2\nu$ off-shell signal regions) after the best fit, with the hashed band representing the total postfit uncertainty at 68% CL, and the gold dot-dashed line shows the predicted distribution of these ratios for a fit to the no off-shell hypothesis. The black solid (hollow) points, with error bars as uncertainties at 68% CL, represent the observed $2\ell2\nu$ and $4\ell$ ($4\ell$-only) data. The first and last bins contain the underflow and the overflow, respectively. The bottom panel displays the ratio of the various displayed hypotheses or observed data to the prediction from the best fit. The integrated luminosity reaches only up to 138 fb$^{-1}$ since on-shell $4\ell$ events are not displayed.