Gravitational-wave physics and astronomy in the 2020s and 2030s

The 100 years since the publication of Albert Einstein’s theory of general relativity saw significant development of the understanding of the theory, the identification of potential astrophysical sources of sufficiently strong gravitational waves and development of key technologies for gravitational-wave detectors. In 2015, the first gravitational-wave signals were detected by the two US Advanced LIGO instruments. In 2017, Advanced LIGO and the European Advanced Virgo detectors pinpointed a binary neutron star coalescence that was also seen across the electromagnetic spectrum. The field of gravitational-wave astronomy is just starting, and this Roadmap of future developments surveys the potential for growth in bandwidth and sensitivity of future gravitational-wave detectors, and discusses the science results anticipated to come from upcoming instruments. In the past few years, gravitational-wave observations provided stunning insights into some of the most cataclysmic events in the Universe, heralding a bright future for gravitational-wave physics and astronomy. This is a Roadmap for the field in the coming two decades. Gravitational-wave observations of binary black hole and neutron star mergers by LIGO and Virgo in the past five years have opened a completely new window on the Universe. The gravitational-wave spectrum, extending from attohertz to kilohertz frequencies, provides a fertile ground for exploring many fundamental questions in physics and astronomy. Pulsar timing arrays currently probe the nanohertz to microhertz frequency band to detect gravitational-wave remnants from past mergers of super-massive black holes. The space-based Laser Interferometer Space Antenna (LISA) will target gravitational-wave sources from microhertz up to hundreds of millihertz and trace the evolution of black holes from the early Universe through the peak of the star formation era. 
Einstein Telescope and Cosmic Explorer, two future ground-based observatories now under development for the 2030s, are pursuing new technologies to achieve a tenfold increase in sensitivity to study compact object evolution to the beginning of the star formation era.

The past five years have witnessed a revolution in astronomy. The direct detection of gravitational waves (GW) emitted from the binary black hole (BBH) merger GW150914 (Fig. 1) by the Advanced Laser Interferometer Gravitational-Wave Observatory (LIGO) detector 1 on September 14, 2015 (ref. 2) was a watershed event, not only in demonstrating that GWs could be directly detected but more fundamentally in revealing new insights into these exotic objects and the Universe itself. On August 17, 2017, the Advanced LIGO and Advanced Virgo 3 detectors jointly detected GW170817, the merger of a binary neutron star (BNS) system 4, an equally momentous event leading to the observation of electromagnetic (EM) radiation emitted across the entire spectrum through one of the most intense astronomical observing campaigns ever undertaken 5.
Coming nearly 100 years after Albert Einstein first predicted their existence 6, but doubted that they could ever be measured, the first direct GW detections have undoubtedly opened a new window on the Universe. The scientific insights emerging from these detections have already revolutionized multiple domains of physics and astrophysics, yet they are 'the tip of the iceberg', representing only a small fraction of the future potential of GW astronomy. As is the case for the Universe seen through EM waves, different classes of astrophysical sources emit GWs across a broad spectrum ranging over more than 20 orders of magnitude, and require different detectors for the range of frequencies of interest (Fig. 2).
In this Roadmap, we present the perspectives of the Gravitational Wave International Committee (GWIC, https://gwic.ligo.org) on the emerging field of GW astronomy and physics in the coming decades. The GWIC was formed in 1997 to facilitate international collaboration and cooperation in the construction, operation and use of the major GW detection facilities worldwide. Its primary goals are: to promote international cooperation in all phases of construction and scientific exploitation of GW detectors, to coordinate and support long-range planning for new instruments or existing instrument upgrades, and to promote the development of GW detection as an astronomical tool, exploiting especially the potential for multi-messenger astrophysics. Our intention in this Roadmap is to present a survey of the science opportunities and to highlight the future detectors that will be needed to realize those opportunities. The recent remarkable discoveries in GW astronomy have spurred the GWIC to re-examine and update the GWIC roadmap originally published a decade ago 7 .
We first present an overview of GWs, the methods used to detect them and some scientific highlights from the past five years. Next, we provide a detailed survey [...] neutron stars. All ground-based detectors use enhanced Michelson interferometry with suspended mirrors to directly measure a GW's phase and amplitude. The detection of audio-band GWs places extremely stringent demands on the isolation of the mirrors from local forces and disturbances. The two US-based Advanced LIGO detectors 1 have L = 4 km arm lengths, whereas the European-based Advanced Virgo 3 and the Japan-based KAGRA 8,9 have L = 3 km arms. Typical strains from astrophysical sources are on the order of 10^−21 or less; thus, displacement sensitivities δL of less than ~10^−18 m are required to detect GWs with a sufficient signal-to-noise ratio (SNR). This is an incredibly small displacement; for comparison, the radius of a proton is ~8.5 × 10^−16 m.
A schematic view of Advanced LIGO, shown in Fig. 4, illustrates the configuration of the current generation of ground-based detectors. The mirrors are suspended from multi-stage pendulum systems such that, above the resonant frequencies of the suspension system (typically around 1 Hz), they can be effectively treated as in free fall (that is, in a local inertial frame) in the direction of light propagation. These suspensions and accompanying seismic isolation systems reduce the undesired test-mass motion induced by ambient ground motion by about a factor of 10^12 from 1 Hz to 10 Hz (refs 3,10). In addition to seismic noise, there are three primary noise sources that currently limit interferometer sensitivity: thermal noise, that is, random displacements of the mirror surfaces produced by thermally fluctuating stresses in the mirror coatings, substrates and suspensions 11; Newtonian (or dynamic gravity gradient) noise arising from earth (ground) and atmospheric density perturbations directly exerting dynamic forces on the mirrors 12; and quantum noise resulting both from vacuum fluctuations of the EM field that limit phase resolution in the readout photodetector (so-called 'shot noise') and from displacements of the mirrors via quantum radiation pressure noise (QRPN), in which the random arrival times of the momentum-carrying photons impart stochastic impulses (or 'kicks') to the mirrors 13.
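The numbers quoted above can be checked with a short calculation. The sketch below assumes the simplest possible relation between strain and arm-length change, δL = h × L, and ignores geometric factors of order unity (source orientation, detector response); the function name is illustrative only.

```python
# Minimal sketch: arm-length change produced by a given GW strain,
# assuming delta_L = h * L (order-unity geometric factors are ignored).

PROTON_RADIUS_M = 8.5e-16  # proton radius quoted in the text


def arm_length_change(strain: float, arm_length_m: float) -> float:
    """Return the arm-length change (in metres) for a dimensionless strain."""
    return strain * arm_length_m


if __name__ == "__main__":
    detectors = {
        "Advanced LIGO (4 km)": 4000.0,
        "Advanced Virgo / KAGRA (3 km)": 3000.0,
    }
    for name, length_m in detectors.items():
        for strain in (1e-21, 2.5e-22):
            delta_l = arm_length_change(strain, length_m)
            ratio = delta_l / PROTON_RADIUS_M
            print(f"{name}, h = {strain:.1e}: delta_L = {delta_l:.1e} m "
                  f"({ratio:.1e} proton radii)")
```

Under this assumption, a strain of 10^−21 changes a 4-km arm by 4 × 10^−18 m, and a strain of 2.5 × 10^−22 corresponds to the ~10^−18 m level mentioned above, a few hundred to a thousand times smaller than a proton radius.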
The effect of QRPN is diminished as the mirror mass increases, and both QRPN and shot noise can be reduced by injecting quantum-engineered squeezed vacuum states of light into the interferometer 14 . Thermal noise manifests itself in a variety of ways in mirror coatings, mirror substrates and suspensions 15 ; it can be understood from a statistical mechanics perspective as infinitesimal internal motions of macroscopic objects at non-zero temperatures caused by intrinsic dissipation (or mechanical loss) in the system. In addition to these fundamental noise sources, a very large number of technical noises must be identified and overcome, which broadly group into laser frequency and intensity noises, acoustically and seismically driven scattered light noises, sensor and actuator noises, stochastic forces from electrical and magnetic fields, and, potentially, energy deposited by energetic particles. (More details about these noise sources are presented in the last section, where we discuss some of the challenges to building future ground-based detectors.) To deliver the best science, a network of globally distributed interferometers functioning as a unified detector is required. The Advanced LIGO and Advanced Virgo detectors have actively searched the GW sky in a highly coordinated campaign during a series of observing runs carried out from 2015. Figure 5 shows the sensitivities of the LIGO and Virgo interferometers during the 'O2' observing run; in the latest 'O3' run, the detectors have achieved sensitivities sufficient to detect BBH mergers on a weekly basis 16 .
The KAGRA detector recently joined LIGO and Virgo to form the LIGO-Virgo-KAGRA network; the LIGO-India 17 interferometer will be joining later in this decade, dramatically improving the ability of the network to confidently detect and locate GW events 18 and providing new methods for testing alternative theories of gravity through enhanced ability to resolve GW polarizations 19 .
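The statement above that QRPN diminishes as the mirror mass increases can be illustrated with the textbook expressions for a simple Michelson interferometer without arm cavities, power recycling or squeezing. This is a deliberately simplified sketch, not a model of Advanced LIGO: the formulas are the standard single-bounce estimates, and the laser power, mirror mass and wavelength used in the demo are illustrative assumptions.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 299_792_458.0        # speed of light, m/s


def shot_noise_strain(power_w, arm_length_m, wavelength_m):
    """Shot-noise strain amplitude spectral density (1/sqrt(Hz)).

    Simple-Michelson estimate: h = (1/L) * sqrt(hbar*c*lambda / (2*pi*P)).
    It is independent of frequency and of mirror mass.
    """
    return math.sqrt(HBAR * C * wavelength_m / (2 * math.pi * power_w)) / arm_length_m


def radiation_pressure_strain(freq_hz, power_w, mirror_mass_kg,
                              arm_length_m, wavelength_m):
    """Radiation-pressure strain amplitude spectral density (1/sqrt(Hz)).

    Simple-Michelson estimate:
    h = (1/(m * f^2 * L)) * sqrt(hbar*P / (2*pi^3*c*lambda)).
    """
    amplitude = math.sqrt(HBAR * power_w / (2 * math.pi**3 * C * wavelength_m))
    return amplitude / (mirror_mass_kg * freq_hz**2 * arm_length_m)


if __name__ == "__main__":
    # Illustrative (assumed) parameters: 1064 nm light, 4 km arms, 10 Hz.
    for mass_kg in (40.0, 80.0):
        h_rp = radiation_pressure_strain(10.0, 1e3, mass_kg, 4000.0, 1064e-9)
        print(f"m = {mass_kg:.0f} kg: radiation-pressure strain = {h_rp:.2e} /sqrt(Hz)")
    print(f"shot-noise strain = {shot_noise_strain(1e3, 4000.0, 1064e-9):.2e} /sqrt(Hz)")
```

In this simplified picture, doubling the mirror mass halves the radiation-pressure term, whereas raising the laser power lowers shot noise and raises radiation-pressure noise, which is why the two cannot be reduced together without techniques such as squeezed-light injection.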
The LIGO-Virgo observations have, in a few years, already produced revelations about some of the most energetic and cataclysmic processes in the Universe. From GW150914, and more recent BBH mergers observed by the LIGO Scientific and Virgo collaborations, it is now known that there is a population of black holes paired in orbitally bound binary systems that evolve through the emission of GWs and merge in less than a Hubble time (the age of the Universe); that black holes of many tens and even hundreds of solar masses exist in nature; and that the properties of the observed black holes are entirely consistent with GR to within current measurement limits 16,20-28. The BNS detection GW170817 and subsequent observations in the EM domain collectively comprise the first demonstration of GW-EM multi-messenger astronomy, providing an astounding wealth of knowledge, including the first definitive link between BNS merger progenitors and short gamma-ray bursts 29-37; the first definitive observation of a kilonova 38-46; conclusive spectroscopic proof that BNS mergers produce heavy elements through r-process nucleosynthesis 40,47-52; the first demonstration that GWs travel at the same speed as light to better than a few parts in 10^15 (ref. 29); and an independent method for measuring the Hubble constant using detected GWs as a 'standard siren' for determining the absolute distance to the source 53-55. Additionally, the Advanced LIGO and Advanced Virgo detections have enabled tests of GR in the strong gravity regime that were inaccessible to other experiments and astronomical observations 56,57, motivating research on many fronts in fundamental physics and astrophysics. This represents only a brief overview of the recent discoveries and, as we discuss in detail below, captures only a fraction of the potential science afforded by future GW observations.

Michelson interferometer
A device for precisely measuring small differential displacements using a laser light source that is split into two perpendicular paths (arms) by a beamsplitter and reflected back to recombine at the beamsplitter. Relative displacements between the two arms produce phase shifts, leading to a change in the intensity of the light leaving the interferometer.

Thermal noise
Intrinsic noise resulting from microscopic atomic motions in bulk matter at finite temperatures.

Quantum radiation pressure noise
Noise resulting from fluctuations in the momentum imparted to the interferometer mirrors when light reflects off their surface.

BBH mergers
The collision and fusion of two orbitally bound black holes to form a more massive black hole.

www.nature.com/natrevphys

Space-based detectors
When launched in the mid-2030s, the Laser Interferometer Space Antenna (LISA) 58 will possess a breathtaking scientific portfolio. LISA will explore much of the GW Universe in the frequency band from 100 μHz to 100 mHz. A constellation of three satellites separated by 2.5 × 10^9 m in an Earth-trailing orbit, LISA will be capable of detecting the first seed black holes formed out to redshifts z ~ 20 or more 59, and intermediate-mass and 'light' super-massive coalescing black hole systems in the 10^2–10^7 M⊙ (solar mass) range, thus tracing the evolution of black holes from the early Universe through the peak of the star formation era. Through detections of extreme mass ratio inspirals (EMRIs, binary systems with mass ratios as small as ~10^−6) 60, LISA will directly map the curvature of spacetime at the event horizons of massive black holes, yielding even more precise tests of GR in the strong gravitational field regime. LISA might also detect stellar-mass BBH systems years before they are detectable by ground-based detectors 61 [...]

Figure caption: The gravitational-wave spectrum probed by strain-sensitive gravitational-wave detectors, ranging from 10^−9 Hz to more than 1,000 Hz. The source classes are shown above the spectrum and the detectors below. The portion of the gravitational-wave spectrum below 10^−9 Hz probed through measurements of the cosmic microwave background polarization is not shown.

Multi-messenger astronomy
A new field that explores the Universe collectively using the information carried by photons, gravitational waves, neutrinos and cosmic rays.

r-Process nucleosynthesis
Stands for 'rapid neutron capture nuclear process', whereby a nucleus rapidly increases its atomic number by repeatedly capturing neutrons in a neutron-rich environment.

Standard siren
A gravitational-wave source whose signal can be used to determine the absolute distance to the source.

Extreme mass ratio inspirals
The orbit of a binary system in which the mass of the more massive object exceeds that of the less massive object by a factor of ~10,000 or more.

Nature Reviews Physics

[...] a consortium of European national agencies, as well as NASA. It convincingly demonstrated some of the key performance requirements for the full LISA mission, most notably the displacement sensitivity and control of spurious acceleration noise required for LISA. More on LISA science is presented in the next section, whereas the LISA and LPF detector technology is discussed in detail in the last section.

PTAs
Pulsar timing arrays (PTAs) 64-67 explore the nanohertz portion of the GW spectrum ranging from 10^−9 to 10^−6 Hz. Rather than using laser light to measure variations in detector length as ground-based and space-based detectors do, a PTA measures variations in the radio frequency pulse arrival times at the Earth from an array of millisecond pulsars 68,69 (Fig. 6).
Pulsars are rotating neutron stars that act like cosmic lighthouses, appearing as periodic pulsating radio sources. Because millisecond pulsars, pulsars with periods between roughly 1.4 and 30 ms, possess rotational stabilities comparable with the best atomic clocks, they are ideal timing sources. Once effects such as rotational spin-down, astrometric position and motion, and orbital effects from binary companions are accounted for, the pulse arrival times can be precisely modelled and predicted to fractions of a microsecond for up to decades into the future 70 , and variations arising from GW perturbations can be measured. Distortions in the spacetime around Earth or the pulsars will produce systematics in timing residuals (deviations of the measured pulse arrival times relative to the predicted arrival times), and, crucially, spatially correlated systematics in the timing residuals of the array of pulsars across the sky 71 . A GW emitted from a single binary system passing the pulsar-Earth system will cause two frequency components in the time series of the timing residuals: one from the spacetime variations at the pulsar ('pulsar term'), the other from variations at the Earth ('Earth term'), with different frequencies resulting from changes in the orbital frequency of the emitting source during the time it takes for the radio pulses to travel to the Earth. The top panel of Fig. 7 shows the expected detection in the form of the Hellings and Downs curve, the correlated response of a pair of pulsar-Earth baselines to a stochastic GW background averaged over all sky positions and polarizations as a function of the angle between the pulsar pair-Earth baselines 71 .
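For readers who want to see the shape of the Hellings and Downs curve described above, the sketch below evaluates the standard closed-form expression for an isotropic, unpolarized stochastic background. The normalization used here, in which two distinct pulsars at vanishing angular separation have a correlation of 0.5, is one common convention and is an assumption of this sketch; other normalizations differ by a constant factor.

```python
import math


def hellings_downs(angle_rad: float) -> float:
    """Expected timing-residual correlation for two distinct pulsars.

    angle_rad is the angle between the two pulsar-Earth baselines.
    Uses Gamma = (3/2) x ln x - x/4 + 1/2 with x = (1 - cos(angle)) / 2.
    """
    x = (1.0 - math.cos(angle_rad)) / 2.0
    if x == 0.0:
        # x * ln(x) -> 0 as x -> 0, so the limiting value is 1/2.
        return 0.5
    return 1.5 * x * math.log(x) - x / 4.0 + 0.5


if __name__ == "__main__":
    for degrees in (0, 30, 60, 82, 120, 180):
        value = hellings_downs(math.radians(degrees))
        print(f"{degrees:3d} deg: {value:+.3f}")
```

The curve starts at 0.5, becomes negative for pulsar pairs separated by roughly 50° to 120° (reaching about −0.15 near 82°) and rises again to 0.25 for pulsars on opposite sides of the sky. It is this characteristic angular pattern, rather than the amplitude of the residuals in any single pulsar, that distinguishes a GW background from other sources of correlated noise.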
Pulsars are observed at monthly or more rapid cadences in order to sample and measure changing properties, such as the position of the pulsar (that is, proper motion) and varying dispersion due to the interstellar medium. In addition, they must be observed for roughly one half-hour per observation to average over enough of the pulses to mitigate the effects of jitter induced by astrophysical and receiver noise. The observations themselves cover very wide bandwidths (>GHz) or occur near-simultaneously at multiple radio frequencies in order to correct for the effects of interstellar dispersion. Pulsar timing instruments must have fine frequency resolution (~1 MHz) to correct for these effects, coupled with high time resolution in order to sufficiently sample the roughly millisecond-wide radio pulses.
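The need for ~1 MHz channels can be motivated with the standard cold-plasma dispersion law, in which the arrival delay scales as DM/ν². The sketch below uses the conventional dispersion constant of about 4.149 ms GHz² cm³ pc⁻¹; the dispersion measure and observing frequency in the demo are illustrative assumptions, not values taken from the text.

```python
# Dispersion constant in ms GHz^2 cm^3 / pc (conventional value).
K_DM_MS = 4.148808


def dispersion_delay_ms(dm_pc_cm3: float, freq_ghz: float) -> float:
    """Delay (ms) of a pulse at freq_ghz relative to infinite frequency."""
    return K_DM_MS * dm_pc_cm3 / freq_ghz**2


def smearing_across_channel_ms(dm_pc_cm3: float, freq_ghz: float,
                               channel_width_mhz: float) -> float:
    """Difference in delay (ms) between the bottom and top of one channel."""
    half_width_ghz = channel_width_mhz / 2000.0
    low = dispersion_delay_ms(dm_pc_cm3, freq_ghz - half_width_ghz)
    high = dispersion_delay_ms(dm_pc_cm3, freq_ghz + half_width_ghz)
    return low - high


if __name__ == "__main__":
    dm = 30.0    # assumed dispersion measure, pc cm^-3
    freq = 1.4   # assumed centre frequency, GHz
    print(f"Total delay at {freq} GHz: {dispersion_delay_ms(dm, freq):.1f} ms")
    for width_mhz in (1.0, 100.0):
        smear = smearing_across_channel_ms(dm, freq, width_mhz)
        print(f"Smearing across a {width_mhz:.0f} MHz channel: {smear * 1e3:.0f} microseconds")
```

For these assumed values, the smearing within a 1 MHz channel is below 0.1 ms, well under the millisecond-wide pulse, whereas a 100 MHz channel would smear the pulse by several milliseconds and wash it out before any correction could be applied.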
As each pulsar needs to be timed for about a year (equivalent to one Earth orbit) to be properly localized and understood, PTA experiments must have years-long durations. In practice, the lower end of the frequency window is given by the length of the data set (currently about 1 nHz), whereas the upper end is given by the cadence of the timing observations (currently about 1 μHz). Timing residual amplitudes of about 100 ns or less are resolved for the best timed millisecond pulsars.
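The two ends of the PTA frequency window follow from simple sampling arguments. The sketch below takes the lowest accessible frequency to be the inverse of the data span and, as an assumption of this sketch, takes the highest to be the Nyquist frequency of evenly spaced observations; real PTA data are unevenly sampled, so the upper limit is only indicative.

```python
SECONDS_PER_DAY = 86_400.0
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def lowest_frequency_hz(data_span_years: float) -> float:
    """Lowest GW frequency resolved by a data set of the given length (1/T)."""
    return 1.0 / (data_span_years * SECONDS_PER_YEAR)


def nyquist_frequency_hz(cadence_days: float) -> float:
    """Highest frequency sampled by evenly spaced observations (1 / (2 dt))."""
    return 1.0 / (2.0 * cadence_days * SECONDS_PER_DAY)


if __name__ == "__main__":
    for span in (10.0, 15.0, 30.0):
        print(f"{span:4.0f} yr data span -> f_low  = {lowest_frequency_hz(span):.2e} Hz")
    for cadence in (7.0, 14.0, 30.0):
        print(f"{cadence:4.0f} d cadence   -> f_high = {nyquist_frequency_hz(cadence):.2e} Hz")
```

A data span of a few decades gives a lower limit of about 1 nHz, and weekly to monthly cadences give upper limits of a few times 10^−7 Hz to nearly 10^−6 Hz, consistent with the window quoted above.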
Today, there are three major PTAs: the Parkes PTA 72 in Australia, the European PTA Consortium 65 and the NANOGrav 73 consortium in North America. These arrays, which collectively form the International Pulsar Timing Array 74 (IPTA), regularly achieve sub-microsecond timing on over 100 millisecond pulsars (MSPs). PTA science is often sensitivity-limited, and many of the MSPs being discovered in recent surveys have flux densities that often require hour-long observations with 100-m class (or larger) telescopes to achieve the requisite sub-microsecond timing. The Five-hundred-meter Aperture Spherical Telescope (FAST) (500 m diameter) and MeerKAT (64 antennas × 13.7 m diameter) telescopes have been commissioned, and are now commencing regular MSP timing, joining many existing 64-100-m class facilities in the Northern Hemisphere, and the Parkes 64-m telescope in the Southern Hemisphere. Figure 7 illustrates the radio telescopes used for pulsar timing experiments around the globe. NANOGrav has used two telescopes, the Arecibo Observatory (AO) in Puerto Rico and the Green Bank Telescope (GBT) in West Virginia, with each telescope providing roughly half of the sensitivity to GWs. NANOGrav currently observes almost 80 MSPs, about half at the AO and the other half at the GBT, and is seeing the first indications of a signal consistent with GWs 75.

Timing residuals
Deviations of the measured pulsar pulse arrival times relative to the modelled arrival times based on the known physics of pulsar emissions.

Hellings and Downs curve
The predicted angular correlation of the timing residuals of an ensemble of independent pairs of pulsars as seen from Earth resulting from the presence of a gravitational-wave background.

The recent loss of the AO poses significant challenges. In the short term, to minimize the loss of sensitivity to the stochastic background of GWs, NANOGrav is going to move most of the pulsars observed at the AO to the GBT, requiring roughly double the amount of time currently used at the GBT. Longer term, the US community will need a replacement for the AO (such as the DSA-2000 concept 76 ). Legacy AO observations will anchor combined future data sets, allowing us to characterize the low-frequency GW universe and glean unique insights into galaxy evolution and cosmology.
The most promising GW sources in the nanohertz band are super-massive (10^7–10^10 M⊙) binary black holes (SMBBHs) that form via the collisions of massive galaxies. The astrophysical stochastic gravitational-wave background (ASGWB) produced by the cosmic population of slowly inspiralling SMBBHs across the Universe 77-79 is the first signal likely to be detected, due to the very long lifetime in the detection band and the relatively small rate of systems in the final coalescence phase. As sensitivity improves, this may be followed by the observation of individual SMBBHs 80-82; parallel EM observations can both help recover GW signals and allow for richer physics to be extracted. The detection of the ASGWB will reveal essential information about the formation of the large-scale structure of the Universe, determine the rates of galaxy mergers and definitively resolve the 'final parsec' problem 83, that is, the theoretical difficulty of shrinking the orbit of a SMBBH, via the scattering of stars, by a factor of ~100 after its formation at a separation of ~1 pc. PTA measurements are currently probing the expected range of astrophysical signals 84-87 and, based on recent results, a detection of the ASGWB may be imminent 79. The detection of individual SMBBHs will allow for combined EM and GW multi-messenger observations and, although only a handful are expected, the scientific return of these discoveries will be immense 88.
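The scale of the 'final parsec' problem can be illustrated with the classic Peters (1964) estimate of the time for a circular binary to merge through GW emission alone, t = 5c⁵a⁴ / (256 G³ m₁ m₂ (m₁ + m₂)). The sketch below is an order-of-magnitude illustration under stated assumptions (circular orbit, no stellar or gas interactions, an equal-mass binary of 10^9 M⊙ chosen for the demo); it is not a calculation taken from the Roadmap.

```python
G = 6.67430e-11          # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0        # speed of light, m/s
M_SUN = 1.98847e30       # solar mass, kg
PARSEC = 3.0857e16       # parsec, m
YEAR = 3.15576e7         # Julian year, s
HUBBLE_TIME_YR = 1.4e10  # approximate age of the Universe, yr


def gw_merger_time_yr(m1_solar: float, m2_solar: float,
                      separation_pc: float) -> float:
    """Peters merger time (years) for a circular binary driven only by GWs."""
    m1 = m1_solar * M_SUN
    m2 = m2_solar * M_SUN
    a = separation_pc * PARSEC
    seconds = 5.0 * C**5 * a**4 / (256.0 * G**3 * m1 * m2 * (m1 + m2))
    return seconds / YEAR


if __name__ == "__main__":
    for separation in (1.0, 0.1, 0.01):
        t = gw_merger_time_yr(1e9, 1e9, separation)
        verdict = "longer" if t > HUBBLE_TIME_YR else "shorter"
        print(f"a = {separation:5.2f} pc: t_merge ~ {t:.1e} yr "
              f"({verdict} than a Hubble time)")
```

Because the merger time scales as the fourth power of the separation, shrinking the orbit by a factor of ~100 shortens it by a factor of 10^8: for the assumed masses, from a few times 10^11 yr at 1 pc (far longer than the age of the Universe) to a few thousand years at 0.01 pc. Some other mechanism must therefore bridge the gap before GW emission can take over.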

Cosmic microwave background polarization
The lowest frequencies of the GW spectrum, down to approximately 10^−18 Hz, are populated by a stochastic background of remnant primordial GWs produced during the Big Bang. Standard inflationary cosmology predicts a GW spectrum too feeble to be detectable by current ground-based detectors, LISA or PTAs, although some extensions of inflation and more exotic models, including first-order phase transitions and topological defects, predict primordial GW energy densities that can be detected across the frequency bands 89,90. EM-based measurements of the cosmic microwave background (CMB) polarization may reveal signs of the remnant primordial GWs 91. As CMB polarization measurements are based on a fundamentally different detection method from that of their higher frequency counterparts, this approach is not discussed further here.

Stochastic background
An incoherent background of gravitational waves produced either by a large ensemble of independent gravitational-wave sources or in the earliest moments of the primordial universe.

Upcoming physics and astronomy with GWs
In the coming decades, the new observational window of GW astronomy promises to deliver data that will transform the landscape of physics, addressing some of the most pressing problems in fundamental physics, astrophysics and cosmology 88,92-95.

Fundamental physics
GW observations, because they explore the most extreme conditions of spacetime and of matter, can serve as unsurpassed probes of fundamental physics. In this section, we will look at the power of this new tool in exploring gravity and matter at their most extreme.
Testing GR and modified theories of gravity. GR has been a tremendously successful theory in explaining current astronomical observations and laboratory experiments 99-101. Nevertheless, there is a general consensus that GR is, at best, incomplete, representing an approximation to a more complete theory that cures some or all of its problems 102. These issues include the loss of information down a black hole 103, which contradicts unitary evolution of physical states in quantum mechanics; the inevitability of spacetime singularities 104,105, for example, at the centre of a black hole where physical quantities such as the density and curvature of spacetime become infinitely large; a cosmological constant that is responsible for the late-time accelerated expansion of the Universe 106,107, whose value cannot be accounted for in the standard model of particle physics 108; and the lack of a viable formulation of quantum gravity, which might resolve all of these problems but has, so far, been elusive. These difficulties have led to increased interest in searching for GR violations in observations in the hope that they will provide clues to an alternative theory of gravity.
The spacetime curvature at the horizon of a black hole of mass M and radius R ~ 2GM/c^2 goes as κ = GM/(c^2 R^3) = c^4/(8 G^2 M^2), where G is the gravitational constant and c is the speed of light. Because κ is larger for lighter black holes, binary coalescences of the lightest astrophysical black holes are the regions of strongest gravity that we know of and are ideal for testing strong field predictions of GR 101,102. Sub-solar-mass black hole binaries, should they exist, would have even greater curvature. Although neutron stars are lighter than astrophysical black holes, they are not as compact and, hence, probe smaller curvature scales. Black holes also probe the regions of greatest compactness (or dimensionless gravitational potential), defined as Φ = GM/(c^2 R). Past experiments such as the Cassini spacecraft 109 and the double pulsar orbital decay 110 verified the validity of GR in regimes where fields are moderately strong and/or velocities are small compared with the speed of light (see Fig. 9). Current and future experiments, such as the Event Horizon Telescope (EHT) 111 and the GRAVITY instrument 112, explore the validity of GR near massive black holes and, hence, in the small curvature, but high compactness, regime. X-ray observations by the NICER experiment 113 probe GR in the high curvature and large compactness regime of neutron stars 114, whereas GW observations of stellar-mass black holes by ground-based detectors (area denoted by 'GW ground' in Fig. 9) and LISA probe curvature and compactness on a wide range of scales: stellar-mass black holes of ~5-100 M⊙ (mostly ground-based observatories, but also LISA for sources that are close by), intermediate-mass black holes of 10^2–10^4 M⊙ (ground-based observatories and LISA) and super-massive black holes (SMBHs) of 10^5–10^10 M⊙ (LISA at the lower end and PTAs at the higher end of the mass range), offering tests of GR over ten orders of magnitude in length scale and twenty orders of magnitude in curvature.
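To make the curvature and compactness scales concrete, the sketch below evaluates them for a few representative objects. It takes the horizon curvature to be GM/(c²R³) with R = 2GM/c², and the compactness to be GM/(c²R), which is our reading of the definitions in this section; the neutron star mass and radius used in the demo are illustrative assumptions.

```python
G = 6.67430e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 299_792_458.0    # speed of light, m/s
M_SUN = 1.98847e30   # solar mass, kg


def schwarzschild_radius_m(mass_solar: float) -> float:
    """Horizon radius R = 2GM/c^2 of a non-spinning black hole, in metres."""
    return 2.0 * G * mass_solar * M_SUN / C**2


def curvature_per_m2(mass_solar: float, radius_m: float) -> float:
    """Characteristic curvature GM/(c^2 R^3) at radius R, in m^-2."""
    return G * mass_solar * M_SUN / (C**2 * radius_m**3)


def compactness(mass_solar: float, radius_m: float) -> float:
    """Dimensionless gravitational potential GM/(c^2 R)."""
    return G * mass_solar * M_SUN / (C**2 * radius_m)


if __name__ == "__main__":
    for mass in (5.0, 100.0, 1e4, 1e10):
        r = schwarzschild_radius_m(mass)
        print(f"BH {mass:8.0e} Msun: R = {r:.2e} m, "
              f"kappa = {curvature_per_m2(mass, r):.2e} m^-2, "
              f"Phi = {compactness(mass, r):.2f}")
    # Assumed neutron star: 1.4 Msun, 12 km radius.
    print(f"NS  1.4 Msun, 12 km: kappa = {curvature_per_m2(1.4, 12e3):.2e} m^-2, "
          f"Phi = {compactness(1.4, 12e3):.2f}")
```

With these definitions every non-spinning black hole has Φ = 0.5 at the horizon, whereas κ falls as 1/M²; going from ~5 M⊙ to 10^10 M⊙ therefore spans roughly ten orders of magnitude in length scale and nearly twenty in curvature, in line with the ranges quoted above.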
In addition to probing the strong field predictions of GR, the vast cosmological distances over which GWs travel (redshifts in excess of z ~ 20 both in the case of LISA and future ground-based detectors) will greatly constrain violations of local Lorentz invariance and the graviton mass 99. Violations of Lorentz invariance or a non-zero graviton mass could cause dispersion in the observed waves and, hence, help to discover new physics predicted by certain quantum gravity theories. At the same time, propagation effects could also reveal the presence of large extra spatial dimensions that lead to different values for the luminosity distance to a source inferred by GW and EM observations (see refs 115,116) or cause birefringence of the waves predicted in certain formulations of string theory, as discussed in refs 117,118. In certain modified gravity theories, GWs have more than two polarizations (in contradiction with GR); the presence of such additional degrees of freedom could be explored by future detector networks 99,119.

Equation of state of ultra-high density matter.
Neutron star cores can reach densities many times that of nuclear saturation density (~2 × 10^17 kg m^−3), making them the regions of the highest matter density known in the Universe 120. With colliding BNS (and neutron star-black hole binaries), one can probe the structure of cold, ultra-high-density matter 121. Some 50 years after pulsars (neutron stars that beam pulsed radio signals towards Earth) were first discovered 122, a good understanding of the physics of neutron stars is still lacking, especially regarding the composition of their inner core. It is possible that the core is simply a gigantic nucleus or, alternatively, a sea of free quarks and gluons. Nucleons in neutron star cores could undergo a phase transition to quark-gluon plasma 123,124 at the super-high densities that might be found in the hyper-massive neutron stars 125,126 that form in the aftermath of the merger of two neutron stars and live briefly before collapsing to a black hole. GWs emitted during the last few cycles of inspiral of neutron stars and the ringdown of the remnant carry the crucial signatures of the properties of their cores and the hot dense matter equation of state.
A neutron star in a binary system can tidally deform its companion, which can cause the system to inspiral and merge more rapidly 127. The tidal deformation is greater for neutron stars with larger radii or cores with stiffer equations of state, as in the case of hadronic matter. Conversely, cores composed of unbounded quark matter will have smaller radii and be harder to deform; thus, tidal effects would not alter the orbital evolution as much. The tidal effects are directly encoded in the emitted GWs, but the effect arises at fifth-post-Newtonian order (that is, at O((v/c)^10) in the post-Newtonian expansion) 127. Moreover, the post-merger phase could also produce GW emission from the (initially) highly deformed hyper-massive neutron star 128; the spectral features of the emitted signal can be mapped directly to the equation of state of 'hot' dense matter, including possible phase transitions 129-131.
GW170817 ruled out some of the stiffest equations of state and determined the radii of companions to be in the range 9-13 km (refs 4,132). Third-generation ground-based observatories will constrain the radius to within a few hundred metres and help measure central densities and pressures in neutron stars 92. They will also detect post-merger waveforms and provide constraints for 'hot' equations of state 129-131 and possibly offer clues of quark deconfining phase transitions 123,124.
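A back-of-the-envelope estimate shows why the radius measurement matters so much for the density. The sketch below computes the mean density of a neutron star over the 9-13 km radius range quoted above and compares it with the nuclear saturation density of ~2 × 10^17 kg m^−3; the mass of 1.4 M⊙ is an illustrative assumption, and the mean density is only a lower bound on the central density.

```python
import math

M_SUN = 1.98847e30                 # solar mass, kg
NUCLEAR_SATURATION_DENSITY = 2e17  # kg m^-3, value quoted in the text


def mean_density_kg_m3(mass_solar: float, radius_m: float) -> float:
    """Mean density of a uniform sphere with the given mass and radius."""
    volume = 4.0 / 3.0 * math.pi * radius_m**3
    return mass_solar * M_SUN / volume


if __name__ == "__main__":
    assumed_mass = 1.4  # Msun, illustrative value
    for radius_km in (9.0, 11.0, 12.0, 13.0):
        rho = mean_density_kg_m3(assumed_mass, radius_km * 1e3)
        ratio = rho / NUCLEAR_SATURATION_DENSITY
        print(f"R = {radius_km:4.1f} km: mean density = {rho:.2e} kg m^-3 "
              f"({ratio:.1f} x nuclear saturation)")
```

Because the density scales as 1/R³, the difference between 9 km and 13 km changes the mean density by a factor of about three, from roughly 4.6 to 1.5 times the saturation density for the assumed mass. Narrowing the radius to within a few hundred metres would pin this down to within about 10%.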

Exploring dark matter properties with GW observations.
Black hole and neutron star mergers could provide unique insight into the nature of dark matter 133 , another long-standing problem in astrophysics and cosmology. Much of what is known about dark matter comes from its gravitational influence on the dynamics of stars and galaxies and the CMB. Dark matter, in extensions of the standard model, is conceived to be comprised of weakly interacting massive particles (WIMP) 134 or, possibly, extremely light particles, such as axions, which were proposed to solve the strong charge-parity (CP) problem in quantum chromodynamics (QCD) 135 .
Efforts are underway to make a direct detection of WIMP and axionic dark matter in laboratory experiments; however, these experiments have not been successful to date. GW observations could help infer the properties of dark matter in several ways, although any inference drawn will still be indirect evidence of their properties. For example, in the presence of a dark matter fluid, BBHs would experience a drag force that would alter their general relativistic orbital dynamics. This signature could be extracted from the frequency evolution of the observed GW 136 .
Luminosity distance
The distance computed from the luminosity of a specific emitting source. For gravitational waves emitted by binary inspirals, the luminosity distance is an absolute standard reference determined by the amplitude of the gravitational wave detected on Earth.

Tidal deformation
The physical distortion of an object (such as a neutron star) caused by extreme gravitational field gradients present near massive compact objects, such as stellar black holes and neutron stars.

Post-Newtonian expansion
Expansion in powers of the ratio of the velocity of an object that creates the gravitational field to the speed of light, used for finding an approximate solution of the Einstein field equations in general relativity.

Compton wavelength
Equal to the wavelength of a particle (photon or graviton) whose energy is the same as the mass of that particle, defined as $\lambda_{\mathrm{Compton}} = h/(mc)$. General relativity predicts that $\lambda_{\mathrm{Compton}}$ for the graviton is infinite.

Figure caption (pulsar timing). The pulsar beams sweep across the radio antenna and are detected using a radio receiver at ~GHz frequencies. The data are corrected for delays due to the interstellar medium ('de-dispersion') and then folded at the period of the pulsar to create an integrated profile. This profile is then cross-correlated with a noise-free template profile to calculate a time of arrival (TOA), using a high-precision reference clock at the observatory.

Dark matter could be composed, at least in part, of ultralight bosons such as QCD axions 137 , dark photons or other light particles 138 , spanning a wide mass range of $10^{-33}$–$10^{-10}$ eV (refs 137,138). The Compton wavelength of ultralight bosons in the mass range $10^{-20}$–$10^{-10}$ eV corresponds to the horizon size of black holes of mass 10–$10^{10}$ $M_\odot$. Although these ultralight fields may not interact with other standard model particles, the equivalence principle implies that their gravitational interaction with, for instance, black holes could have observable consequences. For example, bosonic fields whose Compton wavelength matches the horizon scale of an astrophysical black hole could form bound states (often called 'gravitational atoms') around black holes and extract their rotational energy and angular momentum via the mechanism of superradiance 139,140 . This would result in a Bose-Einstein condensate that acts as a source of continuously emitted GWs. Ground-based detectors would explore the higher end of the mass range from $10^{-13}$ eV to $10^{-10}$ eV, which corresponds to QCD axions. LISA could explore the presence of even lighter bosonic fields.
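The correspondence between boson mass and black hole mass quoted above can be checked to order of magnitude. The sketch below is an illustration under a stated assumption: it equates the reduced Compton wavelength $\hbar/(mc)$ with the gravitational radius $GM/c^2$, which is one of several possible conventions for "matching the horizon scale" and is not taken from the text.

```python
HBAR = 1.0546e-34   # reduced Planck constant, J s
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8         # speed of light, m/s
M_SUN = 1.989e30    # solar mass, kg
EV = 1.602e-19      # joules per electronvolt

def matching_boson_mass_ev(bh_mass_msun):
    """Boson rest energy (eV) for which hbar/(m c) equals G M / c^2 (assumed convention)."""
    m_bh = bh_mass_msun * M_SUN
    return HBAR * C**3 / (G * m_bh) / EV

# Black hole masses spanning the range discussed in the text.
for m_bh in (10.0, 1e6, 1e10):
    print(m_bh, matching_boson_mass_ev(m_bh))
```

With this convention a 10 $M_\odot$ black hole pairs with a boson of roughly $10^{-11}$ eV and a $10^{10}$ $M_\odot$ black hole with roughly $10^{-20}$ eV, consistent with the ranges stated in the text.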
Primordial black holes have also been proposed to constitute dark matter and gained attention after LIGO's first discovery of stellar black holes with unusually large masses 141 . Searches have also been performed for sub-solar-mass black holes, but no detection has been made so far, leading to some of the best upper limits on the fraction of dark matter in black holes of mass 0.2–1.0 $M_\odot$ (ref. 142). The existence of sub-solar-mass black holes would be considered definitive proof that they were produced in the primordial Universe, as stellar evolution cannot produce black holes below about 3 $M_\odot$.

Cosmology
From an observational cosmology point of view, the past few decades have witnessed a series of compounding problems that persist, including the accelerated expansion of the Universe 106,107 , the tension between the local and early Universe measurements of the Hubble constant 143 and the lack of direct observation of dark matter 144 . As we discuss below, these are each prime examples of outstanding puzzles in modern physics where GW observations are bound to lead to progress.
The reason why GW observations have the potential to make fundamental contributions to cosmology is that merging binary systems are 'standard sirens', that is, the signal contains direct information about the luminosity distance to the source 53,145 . This information can be extracted from the signal, provided the detector network can localize the sky position of the source and measure the polarization of the signal.
Hubble constant. LIGO and Virgo, together with EM observations, made their first measurement of the Hubble constant using the standard siren GW170817 (ref. 54), and the measurement is continuously being refined 146 . Stand-alone measurements are also possible via a statistical association between a GW source and nearby galaxies 147 . The current ground-based network, recently augmented by KAGRA and later this decade by LIGO-India, is expected to improve the measurement of the Hubble constant to a precision of a few percent. This approach does not rely on astronomical distance ladders, so it provides an important check on systematic errors and other assumptions used in other methods 148 . GW measurements may help resolve the existing tension between the two principal Hubble constant measurement methods 149,150 , clarifying whether the tension is due to measurement issues or new physics.
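As a rough illustration of the standard-siren idea, the sketch below applies the low-redshift Hubble law $H_0 \approx cz/d_L$ to a GW-measured luminosity distance and an EM-measured redshift. The redshift, distance and per-event error are hypothetical numbers chosen for illustration, and the $1/\sqrt{N}$ improvement assumes independent, equally informative events; published analyses also treat peculiar velocities and selection effects.

```python
import math

C_KM_S = 299_792.458  # speed of light, km/s

def hubble_constant_low_z(redshift, d_l_mpc):
    """Low-redshift Hubble law H0 ~ c z / d_L in km/s/Mpc (valid only for z << 1)."""
    return C_KM_S * redshift / d_l_mpc

def combined_fractional_error(single_event_error, n_events):
    """Assumes n independent, equally informative events: error falls as 1/sqrt(n)."""
    return single_event_error / math.sqrt(n_events)

# Hypothetical event: host-galaxy redshift 0.01, GW luminosity distance 43 Mpc.
print(hubble_constant_low_z(0.01, 43.0))
# Hypothetical 15% per-event uncertainty combined over 25 events.
print(combined_fractional_error(0.15, 25))
```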
Dark energy. Big Bang cosmology is largely consistent with GR. However, the accelerated expansion of the Universe in its recent history cannot be explained by the theory, indicating that the theory fails, that our cosmological principles are too simple or that there exists an exotic form of matter-energy density, termed 'dark energy' 151 . Many dedicated telescopes are being built to try to ascertain the nature of dark energy. GW observations offer an independent tool for understanding the acceleration and the nature of dark energy, via the ability of BNS and black holes to serve as GW 'standard sirens'. LISA and 3G detectors will reach to higher redshifts, enabling them to measure the amount of dark energy and, possibly, the dark energy equation of state, even without counterpart identifications. The large population of binary mergers out to z = 3 or more that is expected to be detected by the 3G network will allow for a test of cosmological isotropy: do the distances to these events vary around the sky in a statistically significant way? On smaller angular scales, these distances will also allow for independent estimates of weak lensing, mapping the dark matter. The large variety of sources observed by LISA will provide different classes of standard sirens. Stellar black hole binaries at z ≲ 0.2 (ref. 152), EMRIs at z ≲ 1 (ref. 153) and SMBBHs up to z ≈ 10 (ref. 154) will enable precision cosmology across the entire astrophysically relevant redshift range.
With a population of compact binary mergers observed with 3G detectors, and their redshifts obtained by follow-up EM observations, it will be possible to accurately measure cosmological parameters such as the dark matter and dark energy densities, and the equation of state of dark energy 155 , giving a completely independent and complementary measurement of the dynamics of the Universe.

Astrophysical and primordial stochastic backgrounds.
The ASGWB that will be detected by PTAs also contains cosmological information. The properties of the ASGWB depend on the formation and evolution of cosmological source populations 86 . PTA measurements of the ASGWB produced by SMBBHs, the most promising GW source in that band, will constrain the evolution of the SMBHs that become quasi-stellar objects and active galactic nuclei (AGN). In addition, PTAs are sensitive to GWs produced by fundamental physical phenomena such as phase transitions in the early Universe, cosmic strings and inflation, all of which would provide unique windows into high-energy and early-Universe physics [156][157][158] . Finally, just as for ground-based and space-based detectors, EM counterparts to individual SMBBH systems will allow for new measurements of the Hubble constant.

GW astrophysics
Box 1 | Fundamental questions addressed through gravitational-wave observations
• What is the physics of stellar core collapse? How often do core-collapse supernovae occur 257

Formation and evolution of compact stars. Binaries formed from pairs of compact stellar remnant objects such as white dwarfs, neutron stars or black holes are efficient emitters of GWs. As these binaries form through various channels and enter the GW-driven regime of their evolution, where GW radiation determines the orbital dynamics, they sweep up in frequency through first the mHz band of space-based detectors and, eventually, into the Hz to kHz band of ground-based interferometers 159 . The more massive systems (such as GW150914) among these binaries will first sweep through the LISA band, crossing to the ground-based frequency band a few years later. LISA will allow precise determination of the sky location and time of coalescence weeks in advance, making it possible to schedule massive and deep EM coverage of the sky at the time of merger. Such measurements can yield clues as to the likely progenitor systems and their evolution, providing an important constraint on models of stellar evolution. Detailed measurements of GWs from individual systems can also provide information about the internal structure of the objects involved, such as (discussed above) the equation of state of neutron stars involved in a neutron star-neutron star binary. Through interleaved operation and improvement of ground-based detectors, an ever larger statistical sample of black hole-black hole, neutron star-neutron star and neutron star-black hole binary coalescences will be observed, enabling the reconstruction of the formation and evolution of these systems along their cosmic history 160,161 .
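The statement that a GW150914-like system crosses from the LISA band to the ground-based band within a few years can be illustrated with the leading-order (quadrupole) time to coalescence for a circular binary, $t_c = \frac{5}{256}\,(G\mathcal{M}/c^3)^{-5/3}(\pi f_{\mathrm{GW}})^{-8/3}$, where $\mathcal{M}$ is the chirp mass. In this minimal sketch the component masses (36 and 29 solar masses) and the starting frequency of 0.016 Hz are illustrative assumptions, and redshift and eccentricity are ignored.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg
YEAR_S = 3.156e7   # seconds per year

def chirp_mass_msun(m1_msun, m2_msun):
    """Chirp mass (m1 m2)^(3/5) / (m1 + m2)^(1/5), in solar masses."""
    return (m1_msun * m2_msun) ** 0.6 / (m1_msun + m2_msun) ** 0.2

def time_to_coalescence_s(m1_msun, m2_msun, f_gw_hz):
    """Leading-order time to merger from GW frequency f_gw for a circular binary."""
    tau = G * chirp_mass_msun(m1_msun, m2_msun) * M_SUN / C**3  # chirp mass in seconds
    return (5.0 / 256.0) * tau ** (-5.0 / 3.0) * (math.pi * f_gw_hz) ** (-8.0 / 3.0)

# Assumed masses and frequencies, for illustration only.
print(time_to_coalescence_s(36.0, 29.0, 0.016) / YEAR_S)  # years left at 16 mHz
print(time_to_coalescence_s(36.0, 29.0, 10.0))            # seconds left at 10 Hz
```

The steep $f_{\mathrm{GW}}^{-8/3}$ dependence is what separates years of observation in the mHz band from seconds in the ground-based band.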
With the arrival of LISA in the 2030s, tens of thousands of BBH and BNS systems will be added to the existing catalogue 162 . The population of white dwarf-white dwarf binaries in the Milky Way will also be unveiled, enabling a range of astrophysical investigations, from the structure of our own galaxy to the connection between white dwarf-white dwarf binaries and type Ia supernovae. Beyond the Milky Way, hundreds of stellar-mass black hole binaries far from coalescence will be added to the event count, thus, providing precious complementary information to that gathered by ground-based detectors.
SMBH growth and evolution. SMBHs having masses in the range from $10^6$ to $10^{10}$ $M_\odot$ and inhabiting the centres of galaxies also frequently form binaries by pairing with other compact objects 163 . This may happen in the aftermath of a galaxy-galaxy merger, when two SMBHs pair with each other, resulting in the formation of a SMBBH, or as a result of dynamical processes in dense stellar nuclei, in which the capture of a stellar remnant (black hole, neutron star or white dwarf) by a SMBH initiates an EMRI (ref. 60). Both classes of sources are of capital importance in piecing together the puzzle of cosmic structure formation. Today, it is known that virtually every massive galaxy hosts a SMBH at its centre, that galaxies merge frequently, that protogalaxies were already in place at z > 10 and that quasars were already shining at z > 7. These pieces of evidence led to the framework of hierarchical structure formation, whereby galaxies grow by accreting gas cooling along the filaments of the cosmic web and by merging with other galaxies. LISA has the capability to detect mergers of black holes in the mass range $10^3$–$10^7$ $M_\odot$ out to their formation redshift, including for z > 20 for some mass ranges 59 . Only GW detectors can observe individual objects at such early times. SMBBH mergers trace the assembly of their hosts from the formation of the first protogalaxies following the dark ages, well beyond the epoch of reionization, up to now. By inferring the redshifts of these events from their luminosity distances, LISA can follow the evolution of large-scale structure over time and, by exploring the demographics of black hole seeds (including their masses and spins), LISA can test models of how early black holes grow into the massive and SMBHs we see today in galaxies and quasars.
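Inferring a redshift from a luminosity distance requires assuming a cosmological model. The sketch below is one possible illustration: it integrates the flat ΛCDM distance-redshift relation numerically and inverts it by bisection. The parameter values (H0 = 70 km/s/Mpc, matter density 0.3) and the function names are assumptions made for this example, not values taken from the text.

```python
import math

C_KM_S = 299_792.458  # speed of light, km/s

def luminosity_distance_mpc(z, h0=70.0, omega_m=0.3, steps=2000):
    """d_L = (1+z) (c/H0) * integral_0^z dz'/E(z') in flat LambdaCDM (trapezoid rule)."""
    def inv_e(x):
        return 1.0 / math.sqrt(omega_m * (1.0 + x) ** 3 + (1.0 - omega_m))
    dz = z / steps
    total = 0.5 * (inv_e(0.0) + inv_e(z))
    for i in range(1, steps):
        total += inv_e(i * dz)
    return (1.0 + z) * (C_KM_S / h0) * total * dz

def redshift_from_distance(d_l_mpc, z_max=30.0, tol=1e-6):
    """Invert d_L(z) by bisection; d_L increases monotonically with z."""
    lo, hi = 0.0, z_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if luminosity_distance_mpc(mid) < d_l_mpc:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

d = luminosity_distance_mpc(1.0)
print(d, redshift_from_distance(d))
```

Distances beyond the value at `z_max` are returned as approximately `z_max`, so the bracket should be widened for sources expected at higher redshift.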
This information is complemented on the one hand by the observation of EMRIs up to z ≈ 1 and on the other hand by the PTA detection 164 of the stochastic GW background produced by the most massive black hole binaries in the Universe. EMRIs can probe the population of inactive (thus, otherwise invisible) SMBHs, providing invaluable insights into the low-mass end of the SMBH mass function down to the mass scales of dwarf galaxies. The properties of individual inspirals (such as eccentricity and orbital inclination) carry information on the dynamical processes governing the evolution of dense relativistic systems, offering a unique laboratory for testing strong gravity.
At the super-massive end of the mass spectrum, PTAs are expected to reveal the cosmic population of inspiralling SMBBHs that inhabit the largest galaxies in the Universe 82,164 . These objects are in a frequency range inaccessible to LISA and ground-based detectors. Outstanding questions such as the precise occupation fraction of SMBHs in galaxies, the merger rate of galaxies, the relation between galaxy masses and the masses of the SMBHs they host, the efficiency of pairing of SMBHs and the nature of their dynamical interaction with the environments at the cores of galaxies will be answered by deciphering the information encoded in the amplitude and shape of the ASGWB spectrum 88 . The detection of the ASGWB will definitively resolve the 'final parsec' problem, proving that SMBHs can merge and possibly elucidating their dynamical interactions in the cores of galaxies. The dominant dynamical processes are expected to be the scattering of stars on orbits that intersect the galactic core or interactions with a circumbinary disk. PTAs probe frequencies at the interface between the environment-driven regimes (when the SMBHs are far apart) and GW-dominated regimes (when the SMBH separations are below a milliparsec). Each dynamical mechanism also predicts different inspiral timescales compared with estimates that assume GW-driven inspiral, so measuring the ASGWB spectrum can provide clear evidence of which dynamical processes dominate in these massive galaxy hosts. Individual SMBBH systems are expected to be detected after the detection of the ASGWB. Studies of individual systems, coupled with EM observations, will allow probing the astrophysical processes driving mergers even further, and determine how the importance of various processes depends on the properties of the galaxies 88 .

Multi-messenger astronomy with GWs

Dawn of a new multi-messenger era
The detection of GWs from the inspiral and merger of the first BNS system GW170817 (ref. 4) marked the start of an era of multi-messenger astronomy incorporating GW observations 5 . The extensive multi-wavelength, multi-year follow-up campaign of GW170817 enabled the detection of counterparts in almost all the EM bands, confirming that the merger of a binary system of neutron stars powers high-energy transients, such as short gamma-ray bursts [29][30][31][32][33][34][35][36][37] and kilonovae [38][39][40][41][42][43][44][45][46] . This unique multi-messenger detection 5 showed the potential of multi-messenger astronomy impacting our knowledge of relativistic astrophysics 36,37,165,166 , radioactively powered transients, nucleosynthesis and heavy-element enrichment of the Universe 47-52 , and the physics of dense nuclear matter 132,[167][168][169][170][171] . It also showed the importance of population studies required to disentangle the microphysics of the source and its interaction with the environment, from the source geometry and energetics. Increasing the number of joint detections will make it possible to determine the equation of state of neutron stars 171 , to probe the properties of different components of the mass ejected during and after the merger [172][173][174] , to understand if the BNS mergers are the primary channel of formation of heavy elements and the details of the nuclear physics relevant to nucleosynthesis 175 , and to understand the structure of the relativistic jets and the physics behind their formation 176,177 .

Multi-messenger facilities
During the current decade (2020-2030), the transient sky will be explored by new observatories and surveys, which will probe a range of frequency bands and timescales with better sensitivity than ever before. The improved sensitivity will be crucial to follow up GW signals coming from larger distances accessible by the upgrades of the LIGO, Virgo and KAGRA detectors, and the third generation of GW detectors. In the optical band, the Vera C. Rubin Observatory 178 will commence operation in the early 2020s and serve as a unique resource for deep, multi-colour searches for optical counterparts over hundreds of square degrees. The James Webb Space Telescope (JWST) 179 and a new generation of 30-40-m class telescopes, such as the European Southern Observatory Extremely Large Telescope (ESO-ELT) 180 , the Giant Magellan Telescope (GMT 181 ) and the Thirty Meter Telescope (TMT) 182 , will allow characterization of the nature of the GW source following the temporal evolution and spectral properties of the faint emission through deep imaging and spectroscopy. The high angular resolution and sensitivity of these telescopes will enable probing the local environment of the source, the properties of the host galaxy and the possible presence of star clusters, providing insights into the formation and evolution of compact objects. The ultraviolet transient sky will soon be monitored by ULTRASAT 183 . Its unprecedented large field of view in the ultraviolet range and its rapid real-time response make it ideal for the follow-up of GW signals.
The high-energy sky is currently monitored by sensitive, large field-of-view survey instruments, including NASA's Neil Gehrels Swift 184 , Fermi 185 190 , and some are envisioned, such as the mission concept All-sky Medium Energy Gamma-ray Observatory (AMEGO) 191 . The sensitivity of the Advanced Telescope for High ENergy Astrophysics (Athena) X-ray observatory 192 , which is expected to be launched around 2030, and the ambitious Lynx project 193 , proposed in the USA, will be of great value to detect fainter X-ray sources, such as the X-ray afterglow emission from relativistic jets observed off-axis. Mission concepts, such as the Transient High Energy Sky and Early Universe Surveyor (THESEUS 194 ) and the Transient Astrophysics Probe (TAP 195 ), are designed to have a unique combination of instruments for gamma-ray, X-ray and infrared transient detections to catch non-thermal and thermal emissions from GW sources.
In the radio band, the Square Kilometre Array (SKA) 196 and the next-generation Very Large Array (ngVLA) 197 will have unprecedented sensitivity, excellent angular resolution and faster survey speed, which will make them ideal for survey studies and transient detections. These new radio facilities are capable of detecting the possible prompt radio burst 198 signals produced by ultra-relativistic jets with timescales of weeks and by the sub-relativistic merger ejecta with timescales of a few years 199 .
Turning to particle detectors, the Cherenkov Telescope Array (CTA) 200 will explore the GeV-TeV sky with a deeper sensitivity than previous instruments. The large field of view, the flexibility to map very large and arbitrary sky patches, and the rapid response time (within about 30 s) make CTA the ideal instrument to detect a possible very-high-energy gamma-ray counterpart of a GW signal. Joint GW-neutrino observations with IceCube and KM3NeT may reveal coincident emissions of high-energy neutrinos from BNS mergers or other energetic astrophysical phenomena 201 .

Probing SMBBH counterparts
Unlike stellar-mass black holes, SMBH coalescences resulting from the collision and merger of galaxies are expected to take place in environments with significant amounts of gas 202 . This leads to the exciting possibility of EM signals associated with LISA detections, although the exact nature of such signals is, as yet, unclear. Several counterparts, including precursors, prompt transients and afterglows have been proposed in the literature 203 , but pinning down the distinctive nature of the emission of SMBBH coalescences will require detailed general-relativistic magnetohydrodynamic simulations that include radiative transfer, and is currently an active area of theoretical and numerical research 204,205 . Detecting and identifying EM counterparts of SMBBH coalescences will be challenging. Because of its frequency response, the bulk of LISA events are expected to involve fairly light (<$10^6$ $M_\odot$), high-redshift (z > 3) systems, which will make it challenging to achieve deep coverage on a typical deg$^2$ sky localization with LISA 59 . This challenge is coordinated by the ESA, which is planning a significant overlap between LISA and the upcoming X-ray satellite Athena 206 , with the goal of discovering X-ray signatures from merging SMBBHs 207 . The scientific payoffs of a coincident detection would be very valuable, enabling the study of the host environment, shedding light on the formation and evolution of SMBHs and their galaxies, allowing, for the first time, detailed studies of accretion physics on SMBHs of known masses and spins (extracted from the GW signal) 208 .
The large population of white dwarf-white dwarf binaries detected by LISA will provide a rich arena for multi-messenger studies 209 . Taking advantage of the complementary nature of EM and GW observations, it will be possible to reveal information about orbital geometries, object sizes and mass transfer. Unlike many other GW sources, the white dwarf binaries detected by LISA evolve slowly on human timescales, making them persistent multi-messenger sources. In fact, several dozen systems that LISA will observe with high SNR have already been observed electromagnetically 210 . The population of these known 'verification binaries' is growing through the work of the Zwicky Transient Facility 211 and GAIA 212 , for example, and is expected to increase significantly with surveys such as the Vera C. Rubin Observatory 213 before being greatly expanded by LISA itself.
At nanohertz frequencies, PTAs will enable the individual detection of several SMBBHs 80,82 of M > $10^9$ $M_\odot$ at z < 1 in their adiabatic inspiral phase. These are the most massive binaries in the low-redshift Universe, which can only be hosted in extremely massive galaxies 214 . It will be possible to rank the most likely hosts within the PTA localization area 81 and use time-domain surveys (such as the Vera C. Rubin Observatory), as well as available spectroscopic observations, to look for periodic AGNs matching the period of the detected GW and search for other spectral signatures indicative of a possible binary 215,216 . The secure identification of counterparts will be critical to understand the distinctive signatures of SMBBHs, distinguishing them from regular AGNs. Those signatures can then be searched for in large AGN data sets to identify the expected much larger population of SMBBHs with periods larger than several years (emitting GWs below the frequency range probed by PTAs 217 ). EM counterpart identification for SMBH coalescences will also enable the study of the host environments of SMBHs, shedding light on the formation and evolution of black holes and their galaxies.
With future radio telescopes such as the SKA and the ngVLA, improved timing precision will enable much more precise distance measurements of pulsars used in a PTA 218 . With distances known to within a GW wavelength, the so-called 'pulsar term' can be used to determine the position of a single GW to arcsecond precision, hence, allowing multi-messenger follow-up observations 219 .
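The requirement that pulsar distances be known to within a GW wavelength can be put into numbers with $\lambda_{\mathrm{GW}} = c/f_{\mathrm{GW}}$. In the following sketch the 10 nHz frequency and the 1 kpc pulsar distance are illustrative assumptions.

```python
C = 299_792_458.0     # speed of light, m/s
PARSEC_M = 3.0857e16  # metres per parsec

def gw_wavelength_pc(f_gw_hz):
    """Gravitational wavelength c / f_gw, expressed in parsecs."""
    return C / f_gw_hz / PARSEC_M

def required_fractional_precision(f_gw_hz, pulsar_distance_pc):
    """Distance precision of one GW wavelength as a fraction of the pulsar distance."""
    return gw_wavelength_pc(f_gw_hz) / pulsar_distance_pc

# Assumed values: a 10 nHz GW and a pulsar 1 kpc away.
print(gw_wavelength_pc(1e-8))
print(required_fractional_precision(1e-8, 1000.0))
```

At 10 nHz the wavelength is close to one parsec, so under these assumptions the distance to a pulsar at 1 kpc would need to be known to roughly one part in a thousand.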

Future GW detectors
The ambitious GW science opportunities summarized in the previous section will be enabled in the coming decade by upgrades to the existing ground-based and PTA detectors, and in the next decade by completely new detectors capable of achieving significant sensitivity increases or, in the case of LISA, completely new observational bands. Below, we survey the needed advances in each detector waveband and discuss the prospects for addressing them.

Next-generation ground-based detectors
The present generation of observatories have arm lengths of 3 km (Advanced Virgo, KAGRA) and 4 km (Advanced LIGO), respectively. Given the fixed arm length L, any increase in strain sensitivity for these observatories will be accomplished through reducing displacement noise that limits the measurement precision of δL.
Both Advanced LIGO and Advanced Virgo are implementing mid-scale upgrade programmes, designated as A+ and AdV+, that aim to improve the sensitivities of existing observatories by more than two times their current levels 220,221 . The primary planned upgrades include improved mirrors with lower thermal noise optical coatings, better squeezed light performance, including frequency-dependent optical squeezing 222 , and an improved GW readout method based on balanced homodyne detection 223,224 . LIGO-India, slated for operation late in this decade, is planned to come online in the A+ configuration.
Future ground-based detectors are targeting as much as a tenfold increase in sensitivity over the existing network. The key change will be an increase in the baseline arm length L. Two 3G detector concepts, ET in Europe and CE in the USA, are currently being pursued in parallel. ET is currently envisioned as an underground infrastructure in Europe housing three interferometers in a triangular configuration, each with 10-km arm lengths 96 . CE retains the 'L' interferometer configuration used in the current interferometers but increases the arm lengths to up to 40 km (ref. 97). CE is planned for two-stage implementation: CE1 will primarily use tested Advanced LIGO technology (albeit with heavier mirrors and higher laser power), whereas CE2 may use newer technologies described in more detail below. An artist's concept of ET is shown in Fig. 10a. The ET and CE configurations and lengths are shown for comparison in Fig. 10b. Additionally, external forces on the mirrors will be reduced, via better seismic filtering, reduced thermal noise and choice of site. Refinements to the sensing system will allow more precise measurements of the positions of the test mirrors, for example, by improving the ability to measure δL, to further improve their sensitivities.
As compellingly demonstrated by the detection of the BNS merger GW170817 (ref. 4), multi-messenger astronomy is one of the driving scientific motivations for building 3G detectors. The angular response (or antenna pattern) of a single interferometer to a GW is essentially omnidirectional 18 . To sufficiently resolve the sky location of a GW source, a network of widely separated observatories is needed. As a standalone detector, ET has some capability to identify the sky positions of transient GW sources by virtue of its ability to resolve polarization (through its triangular geometry and multiple co-located interferometers), whereas a single CE has very little capability to determine position. To localize a large fraction of sources within z ~ 1.5 with ≤10 deg$^2$ resolution, at least three interferometers operated as a single, global-scale detector are needed.
Designing 3G interferometers, housed in suitable observatories and capable of meeting the long-term ambitious science goals presented in the previous section, requires not only specifying a number of key detector design parameters but also understanding the interdependencies and design trade-offs to be made.
Homodyne detection
A detection method used in precision measurement in which the signal carrier is compared with a reference at the carrier frequency.

Interferometer arm length. Increasing the baseline L by lengthening the interferometer arms is, perhaps, as the strain equation h = δL/L implies, the most straightforward path to improved sensitivity. However, increasing the arm length places demands on other detector design aspects. For example, larger beam sizes (due to diffraction of the laser beam) require larger diameter mirrors, placing more stringent requirements on the mirror substrate material and the reflective coating in terms of homogeneity, uniformity and surface figure error. Moreover, when the arm length L approaches half the GW signal wavelength $\lambda = c/f_{\mathrm{GW}}$, where $f_{\mathrm{GW}}$ is the GW frequency, the sensitivity is reduced. For targeting the detection of the post-merger ringdown of a BNS collision, the sensitivity of the detector at 4 kHz is important; a 40-km-long arm length reduces the sensitivity for these signals (due to the presence of a null in sensitivity caused by the 3.75 kHz free spectral range in a 40-km cavity) and a shorter arm length would be better for this particular science goal. In addition, costs for the vacuum system and facility (on the Earth's surface or underground) housing the detector scale approximately linearly with length and must be taken into consideration. Models that combine astrophysical goals and measurement limitations have been developed to determine optimal arm lengths 97 .
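The two relations used in this paragraph are simple enough to evaluate directly. The sketch below computes the free spectral range of an arm cavity, $c/(2L)$, and the arm-length change implied by $h = \delta L/L$. The strain value of $10^{-21}$ is an illustrative assumption, and the function names are chosen for this example.

```python
C = 299_792_458.0  # speed of light, m/s

def free_spectral_range_hz(arm_length_m):
    """Free spectral range c / (2 L) of an arm cavity of length L."""
    return C / (2.0 * arm_length_m)

def arm_length_change_m(strain, arm_length_m):
    """Differential length change delta L = h * L for strain h."""
    return strain * arm_length_m

# Arm lengths from the text: 4 km (Advanced LIGO), 10 km (ET) and 40 km (CE).
for length_m in (4e3, 10e3, 40e3):
    print(length_m, free_spectral_range_hz(length_m), arm_length_change_m(1e-21, length_m))
```

For a 40-km arm the free spectral range evaluates to about 3.75 kHz, matching the figure quoted in the text, whereas for a 4-km arm it lies near 37 kHz, far above the band of interest.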

Mirror substrate material and temperature.
Lowering the mirror temperature can reduce thermal noise both through direct scaling with kT (here, k is the Boltzmann constant and T the temperature of the mirror) and through the temperature dependence of the mechanical loss, especially in crystalline materials. Although room-temperature fused silica is used in most current GW detectors, its performance degrades when cooled. Potential future detector optical substrate materials include crystalline silicon and sapphire, with mirror masses of up to 320 kg currently envisioned to suppress QRPN. Growing single-crystal substrates possessing high homogeneity, low optical absorption and low internal stress birefringence will require significant research and development. Ultralow temperatures ≤5 K offer the highest possible reduction in thermal noise; however, the engineering challenges are formidable. As one example, a heat link will be required to extract heat (deposited by the laser interferometer sensing system) from the mirror without introducing excessive displacement noise, thermal noise or otherwise compromising design requirements. Silicon may have an advantage in that its coefficient of thermal expansion exhibits a zero crossing at 123 K (ref. 225), thus, reducing thermo-elastic noise 226 and, equally important from an operational standpoint, minimizing thermal lensing induced by heating from absorption of the laser light. In addition, it is feasible to extract heat via radiative cooling at 123 K (as the environment can be engineered to be significantly colder than the substrate), possibly alleviating the need for physically contacted heat links.
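The direct kT scaling can be illustrated with a rough estimate. The sketch below assumes that the thermal-noise power scales as the product of temperature and mechanical loss, so that the displacement amplitude scales as the square root of that product. The 295 K room temperature and the default unchanged loss are assumptions made for illustration; real materials have strongly temperature-dependent loss, as noted above.

```python
import math

def thermal_amplitude_ratio(t_cold_k, t_warm_k, loss_ratio=1.0):
    """Amplitude ratio sqrt((T_cold / T_warm) * loss_ratio), assuming noise power ~ T * loss."""
    return math.sqrt(t_cold_k / t_warm_k * loss_ratio)

# Assumed room temperature of 295 K compared with 123 K and 5 K operation.
print(thermal_amplitude_ratio(123.0, 295.0))
print(thermal_amplitude_ratio(5.0, 295.0))
```

Under these assumptions, cooling from room temperature to 123 K alone lowers the thermal-noise amplitude by only about a third, which is why the accompanying change in mechanical loss and the choice of substrate material matter so much.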
Mirror coatings. Future interferometers will also require reduced coating thermal noise 11 to achieve their ultimate sensitivity, with the aim of a ten times reduction over the current state of the art. Recent research on ion beam sputtered amorphous coatings has produced factors of a few reduction at 1.06-μm wavelengths used in today's detectors 227 , with evidence that better performance may be achieved out to 2 μm. Small-aperture crystalline coatings have also shown promise 228 but suffer from increased optical loss and challenges in scaling up to large apertures. This appears to be the noise source least susceptible to a straightforward engineering solution. Furthermore, scaling the fabrication process up to coating large-aperture optics will require significant engineering development effort.

Laser wavelength. Silicon as a substrate material is being explored for both ET and CE. However, silicon is opaque at 1.06 μm (the wavelength used in current detectors), necessitating the use of wavelengths between 1.5 and 2.1 μm to be able to transmit light through Fabry-Perot input cavity mirrors, beamsplitters and auxiliary optics. Longer wavelengths have advantages and disadvantages. They are less susceptible to optical scatter and loss, important for harnessing and preserving the full impact of squeezed quantum states. There may also be evidence of lower coating mechanical loss at longer wavelengths. Conversely, longer wavelengths place tighter requirements on interferometry (the ability to 'split a fringe'). In addition, the diffraction-limited beam size increases linearly with the wavelength, requiring larger aperture mirrors. Developing suitably large diameter (≈80 cm) and massive silicon substrates with low levels of absorption is a key technology goal for future generation instruments. Finally, frequency-dependent squeezing 229 will be used to further minimize shot noise and QRPN. The requisite nonlinear optical devices (such as phase modulators) exist at 1.06 μm but are in need of significant development at longer wavelengths.
Laser power. Shot noise is the dominant limit to interferometer sensitivities at high frequencies in a simple interferometer. The choice of mirror substrate material dictates the choice of laser wavelength, which in turn constrains the laser technologies that can be used. Lasers operating at one micron can be scaled to 500 W with the required frequency/intensity stability and mode quality 230 ; however, development work is needed to achieve the required power and stability for lasers operating in the 1.5-2.1 μm range.
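The interplay between wavelength and power can be sketched numerically. The example below assumes the textbook scaling for a simple Michelson interferometer, in which the shot-noise-limited strain amplitude goes as the square root of wavelength divided by power, with arm length, recycling gains and readout held fixed. The function names are illustrative, and the 500 W reference is simply the one-micron figure quoted above.

```python
import math

REF_WAVELENGTH_M = 1.06e-6  # wavelength of current detectors, from the text
REF_POWER_W = 500.0         # one-micron laser power quoted in the text


def relative_shot_noise(power_w, wavelength_m):
    """Shot-noise amplitude relative to the reference laser.

    Assumes the simple-Michelson scaling sqrt(wavelength / power) with
    everything else (arm length, recycling gains, readout) held fixed.
    """
    return math.sqrt((wavelength_m / REF_WAVELENGTH_M) * (REF_POWER_W / power_w))


def power_for_equal_shot_noise(wavelength_m):
    """Laser power at wavelength_m that matches the reference shot noise."""
    return REF_POWER_W * wavelength_m / REF_WAVELENGTH_M


for wavelength in (1.5e-6, 2.1e-6):
    print(f"{wavelength * 1e6:.2f} um: x{relative_shot_noise(REF_POWER_W, wavelength):.2f} "
          f"at 500 W, or {power_for_equal_shot_noise(wavelength):.0f} W to break even")
```

Under this assumption, moving to a longer wavelength at fixed power raises the shot noise, which is one reason why the power and stability of lasers in the 1.5-2.1 μm range matter.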
Low-frequency performance. The low-frequency observing cut-off is a critical parameter for 3G detectors. Increasing sensitivity at frequencies below 10-20 Hz (the lower edge of the band in which the current generation of detectors operates) will enable detections of intermediate-mass black hole mergers in the range of 10²-10⁴ M⊙ and shed light on how heavier black holes form. Mirror suspension systems currently have fundamental resonances in the 1-Hz range; future detectors' suspensions must push to lower resonant frequencies for greater isolation in the 1-10-Hz range. Suspension stages may need to be made from silicon or sapphire, which can be cooled to reduce thermal noise. Minimizing Newtonian noise necessitates finding a site with low environmental noise. An underground location (as is planned for ET) should reduce the Newtonian noise, although it requires care in preserving the quiet environment through observatory design. In addition, Newtonian noise subtraction approaches 231 need to be designed and tested to reach the planned factor of ten reduction to meet ET and CE performance goals.

Observatory network configuration. Multi-messenger astronomy requires accurate and relatively precise localization of GW events. The current network has four km-scale observatories in operation: LIGO Hanford, LIGO Livingston, Virgo and KAGRA. The addition of LIGO-India later this decade will further improve the sky localization capability 18 . Detailed studies of networks 232,233 show that a third detector to complement ET or CE in the Southern Hemisphere (needed for detecting the majority of the events within z ~ 1.5 with error boxes smaller than 10 deg²) would form a powerful array, and studies are ongoing in Australia for possible implementation.
Interferometer vacuum systems. Observatory vacuum systems are a critical infrastructure component for 3G observatories. The laser light used to probe the arm lengths must travel in an ultrahigh vacuum to avoid pathlength fluctuations due to the polarizability of residual gas, and the beam tube must not introduce scattered light.
As it is currently envisioned, CE will require two 40-km beam tubes, of 1.2 m diameter, at pressures less than 10⁻⁹ torr, with stringent requirements on partial pressures of molecular hydrogen, water and select hydrocarbons. As the vacuum system comprises much of the cost of a 3G observatory, 'value engineering' these systems is a high priority. Efforts are already underway exploring the use of low-carbon steel and nested vacuum systems 234 .
Civil engineering. Whether the new observatories are nominally on the surface of the Earth (such as CE) or underground (such as ET), there will be significant costs associated with the civil engineering. Site location, acquisition and preparation can present significant practical challenges, and can impact the configuration of a worldwide array. Addressing these primary challenges (and many other challenges not discussed here) will require a sustained and globally coordinated R&D programme to be undertaken before ET and CE conceptual designs can be finalized. To have operating 3G detectors in the 2030s, facility construction should commence in this decade, requiring R&D and prototyping efforts currently underway to be ramped up significantly. Several efforts are of sufficiently large scale that industrial partnerships will be essential to succeed. Examples include the development of mirror optical coatings and mirror substrates with the requisite optical and mechanical properties. Also essential in the near term is the development of a prototype interferometer test bed for interferometry and laser development at the laboratory scale.
In the longer term, upgrading one or more of the existing LIGO facilities to a full km-scale interferometer using 3G technologies is a particularly appealing path to a full-scale 3G interferometer network in that it will both test 3G technologies almost 'at scale' and deliver a detector with considerably more sensitivity than the current second-generation detectors. The LIGO Voyager detector design 98 is being explored as a possible upgrade to the existing LIGO observatories late in this decade. Voyager can potentially achieve a twofold sensitivity increase when compared against Advanced LIGO Plus. In addition, the Neutron star Extreme Matter Observatory (NEMO, ref. 235) has been proposed in Australia as a 4-km observatory targeting neutron star GW astrophysics, aiming to have sensitivity comparable with ET and CE at frequencies above 2 kHz.

Future space-based detectors
Space-based detectors measure differential strain using an approach that is similar to that of their terrestrial counterparts. The primary difference between space-based designs such as LISA (Fig. 11 and ref. 58) and terrestrial interferometers is the size of the baselines: L is currently 3 or 4 km for existing ground-based designs, with plans for 10 or 40 km for future designs, versus L ~ 2.5 × 10⁶ km for the LISA design.

Squeezed quantum states
Quantum states of light in which the uncertainties in two conjugate intrinsic quantities (such as amplitude and phase) are manipulated to decrease the measurement uncertainty in one quantity while simultaneously increasing the uncertainty in the conjugate quantity, such that the Heisenberg uncertainty relation is preserved.

Newtonian noise
Displacement noise in gravitational-wave interferometers produced by dynamic gravitational field gradients arising from density fluctuations in the Earth's crust and atmospheric pressure fluctuations.

Nature Reviews | Physics
The longer baselines have two effects. First, the displacement sensitivity that the interferometric metrology system needs in order to achieve an equivalent GW strain sensitivity is roughly a factor of a million less stringent for space-based interferometers than for ground-based interferometers. Second, the optimal response to GWs is shifted to the mHz frequency band, roughly the inverse of the light travel time across the ~10⁶-km baselines. These differences change the nature of the noise sources that limit the instrument's response. Many of the fundamental limitations on measurement, such as seismic noise, and thermal noise in optics and coatings, that must be aggressively addressed in ground-based interferometers are not an issue for space-based interferometers due to the larger size of the displacements. However, the longer wave periods (hours to seconds) and increased duration of the signals (hours to years) demand that the designers of space-based interferometers be more concerned with the very-low-frequency gravitational and EM stochastic forces due to thermal drifts. Furthermore, the space-based instruments must operate in, and protect their test masses from, the harsh environment of space (caused by the exposure to various types of radiation, as well as large thermal gradients produced by asymmetric solar illumination of the spacecraft) and without direct human intervention.
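The first of these effects follows from simple arithmetic, sketched below under the assumption that the arm-length change is just the strain multiplied by the baseline, ignoring geometry- and frequency-dependent response factors of order unity. The strain value is a hypothetical amplitude chosen only for illustration.

```python
def arm_length_change_m(strain, baseline_m):
    """Order-of-magnitude displacement: strain times baseline.

    Geometry- and frequency-dependent response factors of order unity are ignored.
    """
    return strain * baseline_m


STRAIN = 1e-21            # hypothetical strain amplitude, chosen for illustration
GROUND_BASELINE_M = 4e3   # 4-km arms
LISA_BASELINE_M = 2.5e9   # 2.5 x 10^6 km

ground = arm_length_change_m(STRAIN, GROUND_BASELINE_M)
space = arm_length_change_m(STRAIN, LISA_BASELINE_M)
print(f"ground: {ground:.1e} m, space: {space:.1e} m, ratio: {space / ground:.2e}")
```

For the same strain, the displacement to be measured grows in proportion to the baseline, which is where the factor of roughly a million comes from.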
Aside from these differences between ground-based and space-based GW interferometers, the fundamental limitations on sensitivity are the same: stray forces on the test masses and a limited precision in the measurement of the separation of the test masses. The first limit is how well the test masses, which act as fiducial inertial test particles, approximate an inertial frame. This can be characterized by the spectrum of residual acceleration from non-gravitational forces on the test mass. For first-generation space-based interferometers, the required level of accelerations is on the order of femto-gs (1 g = 9.81 m s⁻²). The specific requirement for LISA is for the acceleration noise from non-gravitational forces on each test mass to be less than 3 × 10⁻¹⁵ m s⁻²/√Hz in the mHz band, with relaxations at both higher and lower frequencies 58 . When expressed as an acceleration, this is similar to the stability of the test masses in the second-generation ground-based interferometers that made the historic first detections of GWs.
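As a rough check on the scale of this requirement, the sketch below converts the quoted acceleration noise into units of g and into an equivalent displacement noise. It assumes a free test mass, for which a sinusoidal acceleration of amplitude a at frequency f corresponds to a displacement a/(2πf)². The 1-mHz evaluation frequency and the function names are illustrative choices.

```python
import math

G_M_S2 = 9.81  # standard gravity used in the text


def in_femto_g(accel_m_s2):
    """Express an acceleration in units of 1e-15 g."""
    return accel_m_s2 / G_M_S2 / 1e-15


def equivalent_displacement(accel_asd, freq_hz):
    """Displacement ASD (m/sqrt(Hz)) of a free mass driven by accel_asd at freq_hz."""
    return accel_asd / (2.0 * math.pi * freq_hz) ** 2


LISA_ACCEL_ASD = 3e-15  # m s^-2 / sqrt(Hz), requirement quoted in the text
print(f"{in_femto_g(LISA_ACCEL_ASD):.2f} femto-g per sqrt(Hz)")
print(f"{equivalent_displacement(LISA_ACCEL_ASD, 1e-3):.1e} m/sqrt(Hz) at 1 mHz")
```

The strong 1/f² dependence of the displacement equivalent is why stray forces matter most at the low-frequency end of the band.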
A practical consequence of working in space is that the strength of the local gravitational field, which is ~1 g for ground-based interferometers, is significantly lower, typically limited by the mass distribution of the spacecraft itself (and, perhaps, nearby celestial bodies, depending on the orbit). This allows a conceptually simple technique to be used to realize the low-disturbance test mass: simply let it go and let it drift. Forces from sunlight and residual particles in the space environment can be avoided by allowing the test mass to drift within a shielded housing inside the spacecraft. To prevent the spacecraft from running into the test mass, and to limit disturbances from electrostatic and gravitational couplings between the spacecraft and the test mass, a control system known as drag-free control will be used to precisely steer the spacecraft to follow the test mass 236 . Drag-free control was incorporated into early design concepts for space-based interferometers, and was first demonstrated 237,238 by the LPF spacecraft in 2016. At the core of the LPF were two gravitational reference sensors (GRS), each consisting of a test mass, an electrostatic sensing and control assembly, and a surrounding vacuum enclosure. The test masses were cubes of a gold-platinum alloy, approximately 4 cm on a side and with a mass of ~2 kg (Fig. 12). These were separated from their hollow cubic housing by gaps of a few millimetres. Electrodes on the inner faces of the housing allowed the position and orientation of the test mass to be sensed in all six kinematic degrees of freedom. These same electrodes could be used to apply forces and torques to the test mass electrostatically. The two GRS assemblies were placed approximately 38 cm apart, with an optical bench placed between them. 
This optical bench was used to perform a series of interferometric measurements to determine the relative position of the two test masses and the position of one test mass relative to the spacecraft. During the LPF tests, the spacecraft operated in drag-free mode around one of the two test masses, with the second test mass electrostatically suspended as a witness. Although the design requirements for the LPF were deliberately relaxed from those of LISA, the LPF performed far better than the Pathfinder requirements (Fig. 12), and was able to demonstrate a performance that was significantly better than the LISA requirements 63,237 .
Relative to ground-based interferometers, the scale of the GW-induced displacements in space-based interferometers is significantly larger (~10⁻¹² m for space-based versus ~10⁻¹⁸ m for ground-based). As a result, the interferometric techniques for space are comparatively simple: there is no need for resonant cavities and complex interferometer topologies, such as power-recycling and signal-recycling cavities, as shown in Fig. 4. However, the measurement must be made over long distances and between separate platforms with large relative motion. For LISA, the relative velocity between each pair of spacecraft is several m s⁻¹ over the separation distance of ~2.5 × 10⁶ km. The first challenge is simply getting enough light from one spacecraft to another. Using a 30-cm-aperture telescope to both transmit and receive the beam on each spacecraft, diffraction over these baselines results in a reduction in power by a factor of roughly 10¹⁰ at the collection point. To reach a shot-noise-limited displacement noise in the picometre range requires that roughly 1 W of 1-μm laser light leaves the telescope. This is a moderately high level for space-qualified continuous-wave lasers and likely requires the use of a multi-stage laser system. Because of these large diffraction losses, there is no hope of reflecting the light back to the originating spacecraft, as would be done in a classical Michelson interferometer. Instead, a triangular constellation of three identical satellites is used. Each spacecraft simultaneously transmits its own laser signal while receiving signals from each of the other two in the constellation. The incoming and outgoing beams are combined to interfere on an optical bench, resulting in a fringe pattern that encodes information about each laser's intrinsic frequency noise, the orbital motion of the spacecraft and small fluctuations from passing GWs. These fringe patterns are tracked and recorded on each of the three satellites and transmitted to the ground. 
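The link budget described above can be reproduced to order of magnitude with a short calculation. This sketch rests on two stated assumptions: an idealized, loss-free far-field estimate for a uniformly illuminated aperture, in which the received fraction is (A/λL)² with A the telescope area, and a shot-noise phase error of √(hν/P) rad/√Hz converted to displacement through λ/2π. A real Gaussian beam, clipping and optical losses lower the received power further, so the geometric estimate is an upper bound. The function names are illustrative.

```python
import math

PLANCK_J_S = 6.62607015e-34
LIGHT_SPEED_M_S = 299_792_458.0


def received_fraction(aperture_m, wavelength_m, distance_m):
    """Idealized fraction of transmitted power collected by an identical telescope.

    Far-field estimate for a uniformly illuminated aperture; no optical losses.
    """
    area = math.pi * aperture_m ** 2 / 4.0
    return (area / (wavelength_m * distance_m)) ** 2


def shot_noise_displacement(received_power_w, wavelength_m):
    """Shot-noise-limited displacement ASD in m/sqrt(Hz) for the received power."""
    photon_energy = PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m
    phase_asd = math.sqrt(photon_energy / received_power_w)  # rad/sqrt(Hz)
    return wavelength_m / (2.0 * math.pi) * phase_asd


fraction = received_fraction(0.30, 1e-6, 2.5e9)
print(f"idealized received fraction: {fraction:.1e}")

# Using the factor-of-1e10 reduction quoted in the text for 1 W transmitted
print(f"displacement noise: {shot_noise_displacement(1.0 / 1e10, 1e-6):.1e} m/sqrt(Hz)")
```

The idealized estimate lands within about an order of magnitude of the quoted reduction, and the resulting displacement noise falls in the picometre range in either case.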
The set of measurements is then combined using a technique called time-delay interferometry (TDI), in which linear combinations of each signal are formed, suppressing the intrinsic phase noise of the lasers, while retaining the GW information 239,240 .
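The idea behind TDI can be shown with a toy model. The sketch below is a deliberate simplification, not the LISA algorithm: it uses round-trip measurements in two arms of a single interferometer, static arm lengths and whole-sample delays, whereas the real analysis works with one-way links between three spacecraft, time-varying arms and fractional delays. All names and numerical values are hypothetical.

```python
import math
import random


def delayed(series, n):
    """Delay a sampled series by n samples, padding the start with zeros."""
    return [0.0] * n + series[:len(series) - n]


def round_trip(laser, signal, n):
    """Returning light (laser noise delayed by n) beaten against the local laser."""
    back = delayed(laser, n)
    return [back[i] - laser[i] + signal[i] for i in range(len(laser))]


def tdi_x(y1, y2, n1, n2):
    """Unequal-arm Michelson combination.

    Each arm's data is also delayed by the other arm's round-trip time,
    so every laser-noise term appears twice with opposite sign.
    """
    y1_d = delayed(y1, n2)
    y2_d = delayed(y2, n1)
    return [(y1[i] - y1_d[i]) - (y2[i] - y2_d[i]) for i in range(len(y1))]


def rms(series):
    return math.sqrt(sum(v * v for v in series) / len(series))


random.seed(0)
N, N1, N2 = 4000, 17, 23  # hypothetical sample count and round-trip delays (samples)
laser = [random.gauss(0.0, 1.0) for _ in range(N)]  # laser phase noise, arbitrary units
gw = [1e-6 * math.sin(2.0 * math.pi * i / 200) for i in range(N)]  # toy signal

y1 = round_trip(laser, gw, N1)
y2 = round_trip(laser, [-g for g in gw], N2)  # opposite sign in the second arm

plain = [a - b for a, b in zip(y1, y2)]  # simple Michelson difference
x = tdi_x(y1, y2, N1, N2)
print(f"plain difference rms: {rms(plain):.3e}")
print(f"TDI combination rms:  {rms(x):.3e}")
```

Because the arms are unequal, the simple difference is dominated by laser noise, whereas the delayed combination cancels it to numerical precision and leaves a filtered version of the signal.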
The basic principles of the LISA displacement measurement have been validated through a combination of numerical experiments 241 and demonstrations using laboratory analogues 242,243 . More recently, an opportunity to validate several key components of the LISA metrology system in flight has been realized with the Laser Ranging Instrument (LRI) on board the GRACE-Follow On (GRACE-FO) spacecraft, which measures the distance between two satellites via optical interferometry. Several of the LRI parameters, including the delivered light powers, Doppler rates and laser phase noise, are similar to those for LISA. The first LRI results demonstrate nanometre-level ranging over the 210-km link, meeting the design goals of the LRI and within a few orders of magnitude of the LISA requirements, despite the LRI operating with a single link and, therefore, being limited by laser frequency noise 244 .
The current planned launch date for LISA is 2034 to match the ESA's schedule for its 'L' (Large) missions. The ESA and its European and US partners are currently engaged in a concerted effort to complete developments of technologies that were not demonstrated on the LPF or GRACE-FO or which require minor modifications. These developments will be completed by the time the mission enters its implementation phase, planned for the mid-2020s. Whereas on-time launch of LISA appears feasible, LISA may not be the sole member of the first generation of space-based GW observatories. Several Chinese-led efforts are currently in development, including the TianQin (ref. 245) concept for a geocentric constellation and the Taiji (ref. 246) concept for a heliocentric constellation, both of which use an architecture similar to LISA, but with somewhat different instrumental parameters.

Multi-stage laser system
A laser system consisting of a seed laser ('master oscillator'), which is amplified to greater power levels using power amplifiers.

Time-delay interferometry
An algebraic method to produce linear combinations of interferometry signals from separated spacecraft used in LISA that subtracts the intrinsic phase noise of the lasers, while retaining the gravitational-wave signal.
To reach sensitivities beyond this first generation of observatories will require additional investments in basic research. Concepts for second-generation space-based GW observatories include ones targeting frequency bands above 247 and below that of LISA. DECihertz Interferometer Gravitational wave Observatory (DECIGO) 248,249 is a future Japanese space mission with a frequency band of 0.1-10 Hz, mainly aimed at the detection of primordial GWs. DECIGO consists of three spacecraft, which form three Fabry-Perot Michelson interferometers, with an arm length of 1,000 km. Concepts beyond that of LISA 250 to look at even more massive systems have also been discussed, as well as detectors similar to LISA with increased sensitivity 251 . An alternative would be the deployment of networks of space-based observatories, which could greatly improve angular resolution and enable further multi-messenger science 252 . The specific technology required would depend on the science target and mission architecture. Examples include more powerful yet stable lasers, large but still dimensionally stable telescopes, inertial sensors with even better performance than those flown on the LPF or, perhaps, more compact and affordable elements that would allow for large networks of space-based interferometers to be deployed.

Future PTAs
In the near term, improvements to the PTA network will focus on improved receivers with much wider radio bandwidths being developed and deployed on several telescopes 253 . These will increase sensitivity dramatically by allowing for better removal of interstellar propagation effects and integrating more of the emitted radio signals. Wider bandwidth receivers will also allow for greater observing efficiency, obviating the need for observations at multiple frequencies. These new facilities and instruments will greatly enhance the sensitivity of the individual and global PTAs, but will also increase the already difficult task of merging the heterogeneous data from all of the individual telescopes in the array; due to this difficulty, the current best IPTA limits on the stochastic GW background are less constraining than the best individual PTA limit 86 .
Significant human and computational resources will be needed to make the IPTA project a success. Great progress has been made in developing new GW detection algorithms that properly account for a number of sources of noise in PTA data. As an example, planetary ephemerides have been found to be too inaccurate for PTA experiments, but a new software package can properly model these ephemeris errors while performing the GW searches 86,254 .
Other issues that must be carefully considered include: long-term pulse profile evolution, the ability to accurately model changes in the electron column density along the line of sight and pulse jitter (pulse-to-pulse deviations from the average). Pulse jitter is pulsar-dependent and defines a minimum dwell time, regardless of telescope sensitivity, to achieve a given timing precision. For the celebrated bright MSP PSR J0437-4715, observations of less than an hour can never achieve residuals below 40 ns. Others, such as PSR J1909-3744, have much lower levels of jitter, nearer 10 ns. Timing arrays are starting to factor jitter limits into their observing plans so that large-aperture facilities are not 'wasted' on targets that do not benefit from increased SNRs.
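The link between jitter and minimum dwell time can be made concrete with a small calculation. It assumes that jitter is uncorrelated from pulse to pulse, so that the jitter floor averages down as the inverse square root of the observing time. The 40 ns floor for a one-hour observation is the PSR J0437-4715 figure quoted above; the function names and the 20 ns target are illustrative.

```python
import math


def jitter_floor_ns(jitter_1hr_ns, obs_hours):
    """Jitter-limited timing residual after obs_hours of observation.

    Assumes uncorrelated pulse-to-pulse jitter, averaging down as 1/sqrt(time).
    """
    return jitter_1hr_ns / math.sqrt(obs_hours)


def hours_to_reach(jitter_1hr_ns, target_ns):
    """Minimum dwell time (hours) for the jitter floor to fall to target_ns."""
    return (jitter_1hr_ns / target_ns) ** 2


J0437_JITTER_1HR_NS = 40.0  # one-hour jitter floor quoted in the text

print(f"floor after 4 h: {jitter_floor_ns(J0437_JITTER_1HR_NS, 4.0):.0f} ns")
print(f"dwell time for 20 ns: {hours_to_reach(J0437_JITTER_1HR_NS, 20.0):.0f} h")
```

Under this scaling, a larger telescope does not shorten the required dwell time once jitter dominates, which is the reasoning behind matching targets to facilities.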
Looking further into the future (see, for example, ref. 255), the SKA and the proposed ngVLA will further enhance detection and science prospects, and China has other large-aperture radio telescopes planned, such as the 110-m Xinjiang QTT. Existing timing limits are near where many models predicted the stochastic background might be, and there is a good chance that an individual source may be separable from the background within the next decade. The major threat to PTA science is the increasingly crowded radio spectrum: satellites, aircraft and ground-based transmitters make growing use of the once sparsely populated 300-MHz to 3-GHz band for terrestrial navigation and communications.

Conclusions
In just a few years of using instruments capable of recording the waveforms of signals, ground-based GW observatories have made seminal contributions to the fields of GR, fundamental physics and astrophysics. The multi-messenger characterization of the first observable BNS coalescence dramatically enhanced our understanding of extreme states of nuclear matter and the astrophysics of kilonovae.
The scientific potential for the field of GW science in the next few decades is considerable, afforded by the prospects of upgrades to existing observatories in this decade and the construction or launch of new observatories in the 2030s. There are clear paths to improvements in both the sensitivity of the instruments and the range of frequencies. For ground-based detectors, sensitive to astrophysics of up to ~1,000-M⊙ compact objects, the network of detectors of 3-km and 4-km scale will both improve and grow in the coming decade, and the future planned ET and CE observatories offer the possibility of a quieter environment, implementation of detectors of greater complexity and longer arms. Hence, all stellar-mass coalescing BBH systems in the Universe will be within detection reach.
The LISA space-based detector will deliver sensitivity to signals from SMBHs, with exquisite resolution of signal waveforms, completing the survey of the Universe for binaries up to some 10⁶ M⊙. The PTAs will continue to evolve with new antenna networks, more sensitive and wider-band receivers, and discovery of additional pulsar 'clocks', providing unique information on the dynamics of the very largest galaxies in the Universe. Together with EM and particle detectors, these instruments will provide quantitative and qualitative new insights into physics, astrophysics, cosmology and astronomy. GW detectors have, indeed, opened a new window onto the Universe.

Published online 14 April 2021
Planetary ephemerides
A precise measure of the trajectory of planets in the Solar System required for accurate modelling of pulsar radio pulse arrival times.

www.nature.com/natrevphys