Waveguide Integrated Superconducting Single Photon Detectors Implemented as Coherent Perfect Absorbers

At the core of an ideal single photon detector is an active material that ideally absorbs and converts photons to discriminable electronic signals. A large active material volume favours high-efficiency absorption, but often at the expense of conversion efficiency, noise, speed and timing accuracy. The present work demonstrates how the concept of coherent perfect absorption can be used to relax this trade-off for a waveguide-integrated superconducting nanowire single photon detector. A very short (8.5$\mu$m long) and narrow (8$\times$35nm$^2$) U-shaped NbTiN nanowire atop a silicon-on-insulator waveguide is turned into a perfect absorber by etching an asymmetric nanobeam cavity around it. At 2.05K, the detectors show $\sim$96$\pm$12% on-chip quantum efficiency for 1545nm photons with an intrinsic dark count rate $<$0.1Hz. The estimated timing jitter is $\sim$53ps full-width at half-maximum and the reset time is $<$7ns, both extrinsically limited by readout electronics. This architecture is capable of pushing ultra-compact detector performance to ideal limits, and so promises to find a myriad of applications in quantum optics.

All the optical components on a SOI platform, including the detectors, should operate at telecom-compatible infrared wavelengths. Superconducting nanowire single photon detectors (SNSPDs) 38,39 represent the most promising stand-alone infrared photon counting technology, so it is not surprising that nanowires placed on top of optical waveguides have figured prominently in recent demonstrations of integrated single photon detectors 35,40-44 . In the travelling wave (TW) configuration employed in these detectors, photons propagating down micron-wide waveguides are evanescently absorbed by a critically biased superconducting nanowire over a distance of tens of microns, triggering an easily detectable normal state transition in the nanowire 39 . Various TW SNSPDs have achieved high quantum efficiency (up to 91% 35 ), low noise (down to <0.01Hz 45 ), fast recovery time (<10ns 35 ), or accurate timing response (<20ps of jitter 35 ), but no single integrated detector performs well in all of these critical categories. If truly scalable quantum information on-a-chip is to become practical, then detectors that exceed all of these individual performance metrics are needed.
The evanescent absorption in the TW geometry intrinsically sets a minimum coupling length for an efficient detector and, correspondingly, a minimum nanowire length (typically between 40µm and 400µm 35,41,42 depending on the nanowire layout and the host material system). A higher packing density of lower noise 46 , faster 47,48 and more accurate detectors 49 would be possible if the nanowire length could be further reduced. The obvious challenge is the apparent trade-off in absorption efficiency 50 (shorter wires offer less absorbing volume). This paper describes a successful strategy for circumventing this trade-off by embedding a short (8.5µm), ultra-narrow (8×35nm$^2$) superconducting nanowire within a high quality factor microcavity specifically designed to turn the overall detector into a coherent perfect absorber 51-53 (see scanning electron microscope images in Fig. 1). Implemented on a SOI platform, the detector has an ultra-small footprint (0.5×7.0µm$^2$), smaller than any reported to date, and also incorporates a built-in optical filter. It absorbs and detects nearly 100% of ∼1545nm light in the waveguide while exhibiting a sub-Hz intrinsic dark count rate and a fast recovery time of <7ns.

Figure 1 caption: Insets on the left are scanning electron microscope images (nanowire is colored) zoomed into the detector region. The dashed blue line encloses components held at cryogenic temperature (2.05K). The photons generated by a laser are delivered to the detector through waveguides, grating couplers, a hole in a cold shield surrounding the chip (S) and room-temperature optics (P: polarizer, L: lens, BS: beam-splitter, W: cryostat windows, D$_1$: power meter). GC-WG-GC (grating coupler, waveguide, grating coupler) devices, together with a second set of room-temperature optics that involves D$_2$, are used for calibration, imaging and alignment purposes (I: iris). The nanowire is biased by a voltage source, a resistor ($R$ = 100Ω), and an inductor ($L$ = 100nH). The detection signal is delivered to a low noise amplifier (LNA) through a coupling capacitor ($C$ = 22pF) for transmission to a room-temperature amplifier (AMP) and a counter through a coaxial cable.

Results
Design Concept
In order for a short nanowire to absorb virtually every photon incident along a waveguide, it is placed within an asymmetric optical microcavity defined by etching a series of holes on either side of the nanowire (see Fig. 2a). The cavity containing the weakly absorbing nanowire is designed to function as a coherent perfect absorber 51-53 . Ideally (ignoring all other losses), the reflectivity of the back mirror (right side) is unity, so all of the light incident from the left must be either absorbed or reflected. At the resonant frequency of the cavity, $\omega_R$, a portion of the light incident from the left excites a resonating cavity mode with amplitude $A$.
The normalized power reflected back into the incident waveguide is then $|1 - \sqrt{2/\tau_r}\,A|^2$, and the power absorbed by the nanowire is $2|A|^2/\tau_A$, where $\tau_r$ and $\tau_A$ are the time constants associated with the decay of $A$ into the waveguide and the nanowire, respectively 54 . Power conservation in this idealized scenario requires $|1 - \sqrt{2/\tau_r}\,A|^2 + 2|A|^2/\tau_A = 1$, which in general determines how much of the incident light reflects and how much is transferred into the cavity. However, if the left side mirror reflectivity is designed such that $\tau_r = \tau_A$, then all of the incident light is transferred to the cavity and absorbed. From a different perspective, the scattering matrix (M) describing the dielectric mirrors and nanowire in a CPA-SNSPD is such that the incident photons excite the detector into an eigenvector of M with eigenvalue equal to zero 51 .
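The matching condition can be made explicit with a short worked derivation. It is a sketch that assumes the standard steady-state temporal coupled-mode equation for a single-port cavity driven exactly on resonance by a unit-power incident wave; that equation is an assumption here, since the text only quotes the resulting power expressions.

```latex
% Assumed steady state on resonance (slowly varying amplitude, unit incident power):
%   0 = -(1/\tau_r + 1/\tau_A) A + \sqrt{2/\tau_r}
\begin{align}
A &= \frac{\sqrt{2/\tau_r}}{1/\tau_r + 1/\tau_A}, \\
\left|1 - \sqrt{2/\tau_r}\,A\right|^2
  &= \left(\frac{\tau_r - \tau_A}{\tau_r + \tau_A}\right)^2, \\
\frac{2|A|^2}{\tau_A}
  &= \frac{4\,\tau_r \tau_A}{(\tau_r + \tau_A)^2}.
\end{align}
% The two powers sum to 1, and the reflected power vanishes when \tau_r = \tau_A.
```

Under these assumptions the absorbed fraction depends only on the ratio $\tau_r/\tau_A$, which is why an arbitrarily weak absorber can still absorb everything once the input mirror is matched to it.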
In the ideal scenario above, an arbitrarily short nanowire (arbitrarily long $\tau_A$) can absorb all of the incident radiation as long as the reflectivity of the input mirror can be precisely tuned close to 100% (i.e. long $\tau_r$). In practice there will also be some additional scattering losses associated with the cavity, so its overall quality factor, $Q$, will be given by
$$\frac{1}{Q} = \frac{1}{Q_r} + \frac{1}{Q_A} + \frac{1}{Q_{scatt}},$$
where $Q_r$, $Q_A$ and $Q_{scatt}$ are the quality factors associated with decay into the waveguide, absorption in the nanowire, and scattering, respectively. The above analysis indicates that detectors with absorption efficiency $\eta_A \sim 1$ can be achieved using very short and narrow nanowires as long as dielectric reflectors can be designed such that $Q_A = Q_r \ll Q_{scatt}$. The minimum nanowire length ($L_{NW}$) is dictated by how tightly localized the high-$Q_{scatt}$ cavity mode is, and therefore high-index-contrast host materials like SOI are ideal. Finally, it is noted that this approach necessarily limits the bandwidth over which the detector absorbs efficiently to ∼$\omega_R/Q$. However, for many quantum information processing applications, optical filters are placed in front of detectors to minimize spurious counts due to stray photons 55,56 . The CPA detector described here integrates the filter and detector into a single, compact unit.
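The effect of a finite $Q_{scatt}$ on the peak absorption can be estimated numerically. The following is a minimal sketch, assuming that the three loss rates simply add (1/Q = 1/Q_r + 1/Q_A + 1/Q_scatt) and that the cavity is driven on resonance; the function names and the value Q_r = Q_A = 560 are illustrative assumptions, not values quoted in the text.

```python
# Sketch only: on-resonance absorption of a single-port cavity with three
# loss channels (waveguide coupling, nanowire absorption, scattering).
# q_r = q_a = 560 is an illustrative assumption; q_scatt = 5.6e4 is the
# lower bound on the scattering-limited Q used in the design discussion.


def loaded_q(q_r, q_a, q_scatt):
    """Overall Q when the loss rates add: 1/Q = 1/Q_r + 1/Q_A + 1/Q_scatt."""
    return 1.0 / (1.0 / q_r + 1.0 / q_a + 1.0 / q_scatt)


def resonant_absorption(q_r, q_a, q_scatt):
    """Fraction of the incident power absorbed by the nanowire at resonance."""
    g_r, g_a, g_s = 1.0 / q_r, 1.0 / q_a, 1.0 / q_scatt
    return 4.0 * g_r * g_a / (g_r + g_a + g_s) ** 2


# Matched mirrors with a small scattering loss.
matched = resonant_absorption(560.0, 560.0, 5.6e4)
# Matched mirrors with no scattering loss (ideal coherent perfect absorber).
lossless = resonant_absorption(560.0, 560.0, float("inf"))
# Input mirror twice as leaky as the absorber: part of the light reflects.
mismatched = resonant_absorption(280.0, 560.0, float("inf"))

print(f"matched, Q_scatt = 5.6e4: {matched:.4f}")
print(f"matched, no scattering:   {lossless:.4f}")
print(f"mismatched (Q_r = Q_A/2): {mismatched:.4f}")
print(f"loaded Q:                 {loaded_q(560.0, 560.0, 5.6e4):.1f}")
```

With these assumed values the matched cavity absorbs about 99% of the incident light, and the loaded Q of roughly 280 corresponds to an absorption bandwidth of about λ/Q ≈ 5.5nm at 1545nm.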

Design Details
The CPA-SNSPDs are formed by patterning holes in a silicon nanobeam (silicon on SiO$_2$, 190nm thick, 500nm wide) to form an asymmetric cavity around a U-shaped nanowire (NbTiN, 8nm thick, $W_{NW}$ wide) that lies on top of the nanobeam, as illustrated in Fig. 2a. Nanobeam cavities 54,57 allow low-loss coupling to waveguides, high $Q_{scatt}$, low mode volume, and an accessible near field, all of which are useful features for CPA-SNSPDs. The back (perfect) reflector consists of 10 holes, 6 of fixed radius ($R_m$ = 100nm) and 4 with shrinking radii, down to $R_1$ = 50nm, toward the input waveguide to impedance match the Bloch mode to the waveguide mode. The front (partial) reflector has $N_r$ holes, all of which are linearly tapered from a maximum hole radius of $R_r$ down to $R_1$ = 50nm on both sides. These 1D photonic crystals 54 were designed (see simulation methods) to have a band-gap centered at 1545nm. To design the CPA, $N_r$ = 14 and $R_r$ = 100nm are fixed, and the cavity length ($L_c$) in the absence of the nanowire is adjusted to get a mode at the detector operation wavelength of 1545nm with a moderate $Q$ = 5.6×10$^4$. This ensures appropriate design of the tapers and consequently a high enough $Q_{scatt}$ > 5.6×10$^4$. When the nanowire is added to the same cavity, the mode profile and wavelength stay almost fixed, but the new $Q \simeq Q_A(W_{NW})$ is substantially reduced, even for the smallest $W_{NW}$ = 20nm (see squares on [...] shows that 1% of the total 1.1% is lost to the substrate. $Q_{scatt}$ could be further increased by undercutting the nanobeam, and the back-transmission could be further reduced by adding more holes.

Applying both strategies, the losses would be reduced to ∼0.2% and the absorption could peak close to 99.8%. For comparison, the simulated absorption of the same 8.5µm long nanowire, but without the cavity, is 29%; a 160µm long nanowire would be needed to obtain 99.8% absorption in the absence of the cavity. The huge reduction of $L_{NW}$ without compromising absorption efficiency ($\eta_A$) promises a detector that is not only efficient in absorption, but also superior in the efficiency of converting absorbed photons to electric pulses ($\eta_D$). It also promises fewer dark counts, higher speed, and a more compact footprint, as demonstrated and discussed below.
[...] $R_{Ph}$, into the strip waveguides ($\eta_C$), as needed for on-chip efficiency measurements ($\eta_C$ is shown by circles on Fig. 3a).

Efficiency and Dark Count Characterization
To measure the quantum efficiency and noise performance, three different count rates were determined (see methods): the photon count rate (PCR), measured when the input grating couplers were excited by the laser; the background (mostly blackbody) count rate (BCR), measured under the same conditions but with the laser off; and the dark count rate (DCR), measured when a cylindrical brass shield (50µm thick) without any holes surrounded the sample holder. The BCR exceeds the DCR because the large cryostat windows, which were unshielded for the PCR and BCR measurements, allow intense blackbody radiation to enter the cryostat. The quantum efficiency (QE = $\eta_A\eta_D$) was deduced from the measured rates as
$$\mathrm{QE} = \frac{\mathrm{PCR} - \mathrm{BCR}}{\eta_C R_{Ph}},$$
where $R_{Ph}$ is the rate of photons incident on the cryostat windows and $\eta_C$ is the efficiency with which these photons are coupled into the strip waveguide (see methods).
Figure 3a shows the system quantum efficiency, QE$\eta_C$, for several SNSPDs, all biased at ∼90% of the experimentally determined critical current ($I_C$). Triangles are for two CPA-SNSPDs, while squares are for a device with the same nanowire layout but without any cavity holes, effectively a TW structure. The TW device exhibits a broad spectral response that is a scaled-down version of $\eta_C$ (expected because of the small $\eta_A$ of such a short TW SNSPD). In contrast, the CPA devices show resonant spectra that almost ideally sample $\eta_C$. Note that the off-resonant QE$\eta_C$ falls to negligibly small values compared to the peak (only 0.3% of the peak QE$\eta_C$), confirming a negligible contribution of non-guided photons to the peak PCR. Although the efficiency numbers reported in Fig. 3a are small, a relevant number for integrated quantum optics applications is the efficiency of detecting photons that are already in the waveguide. With the same symbol conventions, Fig. 3b shows the on-chip QEs at ∼0.9$I_C$ obtained by normalizing the measured QE$\eta_C$ by $\eta_C$. The QE for devices without the cavity stays flat and small, as both $\eta_A$ and $\eta_D$ are broadband 59 and $\eta_A$ is expected to be much less than unity.
The QE for the CPA devices peaks close to unity (equal to unity within the uncertainty of the measurements), signaling very efficient $\eta_A$ and appropriate functioning of the CPA design, as well as very high $\eta_D$. Shown on the same figure, with filled circles and a dashed line (best Lorentzian fit), is the simulated QE = $\eta_A$ (assuming $\eta_D$ = 1) for the designed CPA detector. As can be seen, the fabricated CPA devices perform close to the target QE. The small difference in the resonant wavelength is due to fabrication imperfections, and could be reduced by using better fabrication tools.
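The way the on-chip QE follows from the three measured quantities can be sketched in a few lines. This is an illustration only, built on the relation QE$\eta_C$ = (PCR − BCR)/$R_{Ph}$ given in the methods; the function name and all numerical inputs below are hypothetical and are not measured values from this work.

```python
# Sketch only: on-chip quantum efficiency from measured count rates.
# All numbers passed in below are hypothetical, chosen purely to
# illustrate the arithmetic; they are not data from the paper.


def on_chip_qe(pcr_hz, bcr_hz, photon_rate_hz, eta_c):
    """Return QE = (PCR - BCR) / (eta_C * R_Ph)."""
    if photon_rate_hz <= 0.0:
        raise ValueError("photon rate must be positive")
    if not 0.0 < eta_c <= 1.0:
        raise ValueError("coupling efficiency must lie in (0, 1]")
    return (pcr_hz - bcr_hz) / (eta_c * photon_rate_hz)


# Hypothetical inputs: 1e6 photons/s at the cryostat windows, 1% coupling
# into the strip waveguide, 9700 counts/s with the laser on, 100 counts/s off.
qe = on_chip_qe(pcr_hz=9700.0, bcr_hz=100.0, photon_rate_hz=1.0e6, eta_c=0.01)

# A +/-12% relative uncertainty, as adopted for the error bars in the methods.
qe_uncertainty = 0.12 * qe

print(f"QE = {qe:.2f} +/- {qe_uncertainty:.2f}")
```

Subtracting the BCR before normalizing removes counts that are present with the laser off, so only laser-induced counts contribute to the QE.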
In addition to wavelength, the quantum efficiency of a CPA-SNSPD is a function of bias current. Figure 3c shows the QE versus bias current at a fixed wavelength for the same devices with the same symbols. The lines are best fits to the experimental data using a double sigmoidal function. A typical bias current of ∼0.9$I_C$ for each device is marked with a big square. The QEs of the two CPA devices (triangles), measured at their resonant frequencies, follow a single sigmoidal shape with saturation at 1.04 and 0.99. These devices perform so well that the small difference between their measured saturated QE and unity is less than our experimental uncertainties. However, their QEs at ∼0.9$I_C$ are ∼2% smaller than the saturation level, showing a high $\eta_D$ of ∼98%. This value, together with the simulated value of $\eta_A$ = 98.4%, implies an upper bound for QE = $\eta_A\eta_D$ of ∼96% at ∼0.9$I_C$. The QE for the non-cavity device (squares), measured at 1550nm, follows a more complex double sigmoidal curve, signaling the presence of a material or geometrical constriction in the nanowire (in agreement with its smaller $I_C$). However, the fitted sigmoid saturates at QE = $\eta_A$ = 27% ($\eta_D$ = 1 at saturation), close to the simulated $\eta_A$ of 29%.
For comparison, a TW device with a simulated $\eta_A$ of 90% was also fabricated using a relatively long nanowire ($W_{NW}$ = 35nm, length 57.2µm). The QE of this device (diamonds on Fig. 3c) changes considerably more gradually with the bias current than that of the CPA structures, most probably because of more non-uniformities along its long absorbing length. At ∼0.9$I_C$ the QE is only 89% of its saturated level of ∼92%. This clearly demonstrates the advantage of reducing $L_{NW}$ in CPA detectors, which allows high $\eta_D$ and high $\eta_A$, and therefore a high QE.
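The absorption figures quoted for nanowires without a cavity can be cross-checked with a simple length-scaling estimate. The sketch below assumes single-exponential (Beer-Lambert-like) absorption along the nanowire, which is an assumption of this example rather than a statement from the simulations; the function names are illustrative.

```python
import math

# Sketch only: assumes the travelling-wave absorption follows
# eta_A(L) = 1 - exp(-alpha * L) with one attenuation constant alpha.


def attenuation_per_um(eta_a, length_um):
    """Attenuation constant (1/um) implied by absorption eta_a over length_um."""
    return -math.log(1.0 - eta_a) / length_um


def absorption(alpha_per_um, length_um):
    """Absorbed fraction after length_um of nanowire."""
    return 1.0 - math.exp(-alpha_per_um * length_um)


def length_for(alpha_per_um, eta_target):
    """Nanowire length (um) needed to reach the target absorption."""
    return -math.log(1.0 - eta_target) / alpha_per_um


# Calibrate on the simulated 29% absorption of the 8.5 um wire without cavity.
alpha = attenuation_per_um(0.29, 8.5)

eta_57 = absorption(alpha, 57.2)       # compare with the 90% TW design
length_998 = length_for(alpha, 0.998)  # compare with the quoted 160 um

print(f"alpha           = {alpha:.4f} per um")
print(f"eta_A(57.2 um)  = {eta_57:.3f}")
print(f"L for 99.8%     = {length_998:.0f} um")
```

Under this assumption the 29% figure extrapolates to about 90% at 57.2µm and to a length of roughly 154µm for 99.8% absorption, in line with the simulated values quoted in the text.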
Quite apart from being efficient, a good detector must be low noise. Figure 3d shows DCR and BCR measurements for the devices of Fig. 3c with the same symbols. The DCR starts from a plateau-like level at around 0.1Hz, followed by a fast exponential increase, as expected for nanowire detectors. The plateau likely originates from black-body radiation that enters the cryostat from the top or even through the thin brass foil; it can be reduced by using better shields 45,55 . At ∼0.9$I_C$ the DCR for the non-cavity 57.2µm long device is ∼1.1Hz, whereas the DCR for the CPA device is ∼0.1Hz. This confirms another advantage of the CPA design: a substantially reduced DCR. Note that a CPA-SNSPD at ∼0.9$I_C$ (see the vertical dashed line connecting Figs. 3c and 3d) maintains its high QE over a bias current range for which the DCR stays negligibly small (i.e. almost ideal detection performance). The BCR versus bias current curves show QE-like shapes, which further suggests that the BCR originates from black-body or stray photons rather than being intrinsic to the detectors, like the DCR. This is in agreement with other studies that show how the BCR can be diminished by using well-shielded fiber-coupled cryostats and inline cold filters 45,55 .
Timing Performance Characterization
Figure 4a shows a waveform histogram of amplified photon detection pulses from the detector when biased at 3µA and excited by a CW laser. An inductor, $L$ = 100nH, was externally placed in series with the nanowire, and the two were looking into an impedance of $R$ = 100Ω (see Fig. 1). The measured pulse has a negative polarity, making it compatible with the counter, and shows an under-damped shape because of the frequency cut-offs of the chain of amplifiers (10MHz to 1.2GHz). The counts are restored about 7ns after an initial detection event at the falling edge, approximately five $L/R$ time constants, as expected for an LR circuit. The external inductor was used in the measurement setup to ensure a smooth over-damped return of bias current to the nanowire, and therefore to avoid afterpulses 60 . However, this inductor can in principle be removed and replaced with a more careful read-out circuit to further reduce the reset time to its intrinsic limit, which is known to scale with nanowire length 47,48,61 .
As for the timing jitter, the energy stored in a CPA cavity decays with a time constant $\tau = Q_{CPA}/\omega_R$. This gives $\tau$ = 0.23ps for the CPA-SNSPD designs reported here. Since $\tau$ is much smaller than the minimum reported timing jitter for nanowire detectors (<15ps), this cavity decay time constant should not have any adverse effect on the measured timing jitter. As an estimate of the timing jitter of this system, a single photon detection trace at 5µA is shown in Fig. 4b. The associated root-mean-square electrical noise is $\sigma_n$ = 1.4mV, and the slope of the falling edge is $K$ = 62.7mV/ns. This yields an estimated timing jitter of 2.355$\sigma_n/K$ = 53ps full-width at half-maximum. This jitter is limited by the performance of the read-out circuits, and is consistent with reports on SNSPDs with comparable bias currents 62 .
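The two timing estimates above can be reproduced with a few lines of arithmetic. This is a minimal sketch using only the values quoted in the text; the function names are illustrative, and the cavity Q is inferred here from the stated decay time rather than quoted from the paper.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def jitter_fwhm_ps(noise_rms_mv, slope_mv_per_ns):
    """Noise-limited jitter estimate: FWHM = 2.355 * sigma_n / K, in ps."""
    return 2.355 * noise_rms_mv / slope_mv_per_ns * 1000.0


def cavity_q_from_decay(tau_s, wavelength_m):
    """Q implied by an energy decay time tau at the resonant wavelength."""
    omega_r = 2.0 * math.pi * SPEED_OF_LIGHT_M_PER_S / wavelength_m
    return omega_r * tau_s


# Values quoted in the text: sigma_n = 1.4 mV, K = 62.7 mV/ns.
jitter = jitter_fwhm_ps(1.4, 62.7)

# Inferred, not quoted: the Q that corresponds to tau = 0.23 ps at 1545 nm.
q_cpa = cavity_q_from_decay(0.23e-12, 1545e-9)

print(f"estimated jitter (FWHM): {jitter:.1f} ps")
print(f"implied cavity Q:        {q_cpa:.0f}")
```

The factor 2.355 converts a Gaussian standard deviation into a full-width at half-maximum, so the estimate treats the electrical noise as Gaussian.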

Conclusion and Outlook
The demonstrated impact of coherent perfect absorption on the performance of a single photon detector paves the way for integrating hundreds of ultra-high performance detectors on a chip. The silicon host material facilitates high-performance CPA-SNSPDs because of its high index contrast, while also offering the potential of integrating on-chip electronic circuits with the detectors to further boost the performance, and to handle the complexity of scaling up the read-out circuits.
This latter point is especially interesting in light of studies that show the compatibility of CMOS transistors with the cryogenic environment 63,64 . The projected superior speed performance of CPA-based detectors, combined with their built-in filtered QE and advanced fiber-to-waveguide coupling methods 65 , will also make these detectors ideal for implementing fast quantum communication systems.

Methods
Simulations
Frequency domain finite element solvers (COMSOL, Inc.) are used for numerical simulations. The index of refraction for silicon is set to 3.45 (measured close to 1550nm at cryogenic temperatures 66 ), and the environment is assumed to be superfluid helium ($n$ = 1.03). The index of refraction for NbTiN is set to 4.17+i5.62 67 . Eigenmode analysis is used for all Q factor simulations, while the simulated spectra are obtained by using available numeric port boundary conditions.

Fabrication
The devices were fabricated on NbTiN-coated (8nm thick, STAR Cryoelectronics Inc.) SOI wafers with a silicon device layer thickness of 200nm and a buried oxide thickness of 1220nm. The superconducting thin film has a critical temperature $T_C$ = 7.16K and a critical current density $J_C(T = 0)$ = 7.57×10$^6$A/cm$^2$ (see supplementary information for details). Positive electron-beam (e-beam) resist (ZEP520A from ZEON Corp.) was spin-coated at 1800rpm and hot-plate baked at 180°C for 3min to make a 600nm thick film. Contact pads and the first set of alignment marks were defined in a 25keV e-beam lithography machine (dose 110µC/cm$^2$) and developed in o-xylene at 20°C for 60s, followed by a 30s soak in IPA and a DI water rinse. The chip was rinsed in 140:1 BHF:H$_2$O for 60s to wet-etch ∼1nm of NbTiN surface oxides, and was immediately transferred to an e-beam evaporator chamber to deposit an 8nm/90nm titanium-gold bilayer, followed by lift-off in sonicated chlorobenzene. The chip was coated again with 600nm thick ZEP520A, after which the photonic structures and a second set of alignment marks were e-beam written at 140µC/cm$^2$ using the original gold alignment marks. The sample was then developed in [...] at 2800rpm and the same baking conditions were used to coat 180nm of resist on flat areas and 110nm on waveguides that are surrounded by relatively deep trenches. The second alignment marks were uncovered by e-beam lithography and development to allow sharp imaging of the marks for the last lithography step. The nanowires were written using these uncovered marks at 78mC/cm$^2$, for which ZEP520A acts as a negative resist 69 . All of the resist except the area exposed by the high-dose e-beam was removed by 5min exposure to ultraviolet radiation ($\lambda$ = 320nm, ∼3W/cm$^2$) and a 1min rinse in chlorobenzene. A final RIE in 15:2 CF$_4$:O$_2$ for ∼40s was used to etch the unprotected NbTiN and 10nm of silicon.
SEM images of several samples indicate that this process yields better than ±20nm alignment of the nanowires with the nanobeam cavities.
Measuring $\eta_C$
It is very difficult to directly measure $\eta_C$, the coupling efficiency between the photons incident on the cryostat windows at room temperature and the strip waveguide, as there is no direct means of accessing the strip waveguide inside the cryostat. However, $\eta_C$ can be inferred indirectly but accurately by incorporating several test devices on the chip, as follows.
The first test device is laid out as two grating couplers connected by a long waveguide (see Fig. 1) that has a U-shaped nanowire on top (the same nanowire geometry as the TW SNSPD devices presented, but without contacts). Seven of these devices, with nanowire lengths ($L_{NW}$) from 0 to 60µm, were made and their transmission ($T$) measured. A linear fit to $\log(T/T(L_{NW} = 0))$ versus $L_{NW}$ yields $\eta_A$ versus $L_{NW}$ for devices without cavities. At $L_{NW}$ = 57.2µm the measured $\eta_A$ = 92.8% and the best linear fit gives 91.7%, both in good agreement with the designed $\eta_A$ of 90%. The QE$\eta_C$ = (PCR − BCR)/$R_{Ph}$ at 1550nm for the 57.2µm long SNSPD is then measured versus bias current (scaled version of the diamonds on Fig. 3c). Using the best sigmoidal fit to the measured points, the maximum saturated QE$\eta_C$ (for which $\eta_D$ = 1) and the QE$\eta_C$ at 0.9$I_C$ were determined and divided to yield $\eta_D$ = 89% at 0.9$I_C$. Then, by measuring QE$\eta_C$ versus wavelength at 0.9$I_C$ and equating the QE at 1550nm to $\eta_A\eta_D$ = 0.917×0.89 (solid horizontal line on Fig. 3b), $\eta_C$ is calculated as shown by the circles on Fig. 3a.
A nice feature of the above procedure is that the uncertainty in the absolute calibration of the power meter used to measure $R_{Ph}$, the rate of photons incident on the cryostat windows, is mostly embedded in $\eta_C$ rather than in the QE. This keeps the estimated QE values largely independent of the meter's calibration. However, there may be variations in $\eta_C$ among different devices on the chip, causing errors in the measured QE. To evaluate these variations, 14 identical test devices were included on the chip, each laid out as two grating couplers connected by a long waveguide. Measuring transmission through these devices, a variation of ±12% relative to the mean value is observed for wavelengths close to 1550nm. Noting that the QE measurements, like the transmission measurements, involve two couplers, and considering that the above measurement of $\eta_A$ for $L_{NW}$ = 57.2µm yields an $\eta_A$ very close to that expected, the uncertainty bars on the QE shown on Fig. 3 were set at ±12%.
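The calibration chain described above can be sketched as a log-linear fit followed by two ratios. The transmission data below are synthetic and noise-free: they are generated from a hypothetical attenuation constant (0.0435 per µm) at assumed nanowire lengths, purely to show the procedure. The system efficiency used in the last step is also hypothetical.

```python
import math

# Sketch only. Synthetic inputs (assumptions of this example, not data):
# seven nanowire lengths in um and a noise-free exponential transmission.
LENGTHS_UM = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
ALPHA_ASSUMED_PER_UM = 0.0435
TRANSMISSIONS = [math.exp(-ALPHA_ASSUMED_PER_UM * x) for x in LENGTHS_UM]


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Step 1: linear fit to log(T / T(L_NW = 0)) versus L_NW.
log_t = [math.log(t / TRANSMISSIONS[0]) for t in TRANSMISSIONS]
slope, intercept = fit_line(LENGTHS_UM, log_t)

# Step 2: absorption of the 57.2 um wire predicted by the fitted line.
eta_a = 1.0 - math.exp(slope * 57.2)

# Step 3: eta_D at 0.9 I_C is the ratio of QE*eta_C at 0.9 I_C to its
# saturated value; the text reports 89%.
eta_d = 0.89

# Step 4: eta_C follows from a measured system efficiency QE*eta_C at the
# same wavelength and bias. The value 0.008 is hypothetical.
qe_eta_c_measured = 0.008
eta_c = qe_eta_c_measured / (eta_a * eta_d)

print(f"fitted slope = {slope:.4f} per um")
print(f"eta_A(57.2)  = {eta_a:.3f}")
print(f"eta_A*eta_D  = {eta_a * eta_d:.3f}")
print(f"eta_C        = {eta_c:.4f}")
```

Because the reference device's QE is fixed by $\eta_A\eta_D$ rather than by the absolute photon rate, an error in the power meter calibration shifts $\eta_C$ but largely cancels in the QE of the other devices.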

Acknowledgements
Financial support from the Canadian Institute for Advanced Research and the Natural Sciences and Engineering Research Council is gratefully acknowledged.