Ultrafast optical circuit switching for data centers using integrated soliton microcombs

Due to the slowdown of Moore’s law, it will become increasingly challenging to efficiently scale the network in current data centers utilizing electrical packet switches as data rates grow. Optical circuit switches (OCS) represent an appealing option to overcome this issue by eliminating the need for expensive and power-hungry transceivers and electrical switches in the core of the network. In particular, optical switches based on tunable lasers and arrayed waveguide grating routers are quite promising due to the use of a passive core, which increases fault tolerance and reduces management overhead. Such an OCS network can offer high bandwidth and low network latency in an energy-efficient and scalable data center network. To support dynamic data center workloads efficiently, however, it is critical to switch between wavelengths at nanosecond (ns) timescales. Here we demonstrate ultrafast OCS based on a microcomb and semiconductor optical amplifiers (SOAs). Using a photonic integrated Si3N4 microcomb, sub-ns (<520 ps) switching along with 25-Gbps non-return-to-zero (NRZ) and 50-Gbps four-level pulse amplitude modulation (PAM-4) burst mode data transmission is achieved. Further, we use a photonic integrated circuit comprising an indium phosphide-based SOA array and an arrayed waveguide grating to show sub-ns switching (<900 ps) along with 25-Gbps NRZ burst mode transmission, providing a path towards a more scalable and energy-efficient wavelength-switched network for data centers in the post-Moore’s law era.

of the disaggregated transceiver design is that the multiwavelength comb source can be shared across many (e.g. 64) nodes (racks or servers) in parallel instead of using 64 separate comb sources, using a split-and-amplify architecture (Fig. 1a). The source can thus be treated as a shared infrastructure element such as the power source in today's data centers. This allows for an appealing division of functionality since the power consumed by the comb source is amortized as the power efficiency of the end-to-end system converges to the efficiency of the amplifiers while allowing for a high-quality and stable light source that can be rapidly wavelength-tuned (cf. discussion on power analysis and Supplementary information (SI)).
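The amortization argument can be illustrated with a small sketch. The wattages used below are placeholders chosen only for illustration (they are not measurements from this work), and the model assumes an ideal, lossless 1:N splitter followed by one amplifier per node.

```python
import math


def ideal_split_loss_db(n_outputs: int) -> float:
    """Power reduction per output of an ideal (lossless) 1:N splitter, in dB."""
    return 10.0 * math.log10(n_outputs)


def power_per_node_w(comb_power_w: float, amp_power_w: float, n_nodes: int) -> float:
    """Electrical power attributed to one node: its own amplifier plus an
    equal share of the shared comb source."""
    return amp_power_w + comb_power_w / n_nodes


# Hypothetical values, for illustration only.
COMB_W = 16.0  # electrical power of the shared comb source
AMP_W = 2.0    # electrical power of one per-node amplifier

for n in (1, 8, 64):
    print(n, round(ideal_split_loss_db(n), 2), power_per_node_w(COMB_W, AMP_W, n))
```

As the number of nodes grows, the comb's share shrinks as 1/N and the per-node power approaches the amplifier power, which is the convergence described above.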
Here, we use a Si3N4-based soliton microcomb as a multiwavelength source to show ultrafast optical wavelength switching. In a proof-of-concept experiment, we demonstrate switching of >20 individual comb channels at sub-ns time scales (<520 ps) using discrete SOAs. Further, 25-GBd (baud rate) non-return-to-zero on-off keying (NRZ) and four-level pulse amplitude modulation (PAM-4) burst mode transmission systems along with ultrafast switching are shown. Then, a more compact switching system consisting of a PIC-based AWG and SOAs is implemented to show sub-ns switching and 25-GBd NRZ burst mode transmission, indicating the potential of such a miniaturized system to mitigate power and scaling issues.

Results
OCS architecture based on soliton microcomb. Figure 1 shows the OCS architecture containing the soliton microcomb, the SOAs and AWGs in the switch, and the AWGRs that route the data across many racks (cf. Methods). This architecture allows further parallelization of different resources by sharing the soliton across many racks and modulators for power efficiency and parallel data transmission, respectively (Fig. 1a, b). The multi-wavelength source is generated by pumping a packaged Si3N4 microresonator fabricated using the photonic Damascene reflow process 38,39, enabling a mean intrinsic Q-factor (Q0) of >15 million. Initially, a multi-soliton state is initiated by performing a scan over the resonance (forward tuning) 20, and then a single soliton is generated via backward switching 40 (Fig. 2b). The soliton is amplified using a low-noise and compact EDFA, resulting in a comb with a maximum power of up to −4 dBm and an optical signal-to-noise ratio (OSNR) of >34 dB. The post-amplified soliton, shown in Fig. 2c, is de-multiplexed using a 48-channel AWG with 100-GHz spacing (~1525-1564 nm) providing 30 dB isolation. The individual comb channels are initially switched using discrete SOAs with a small-signal gain of ~11-13 dB at 1550 nm. Figure 2d shows 10-90% rise and fall times of 493 and 395 ps, respectively, for a single microcomb carrier centered at 1555 nm (CH 37 of the AWG) when applying a current of 120 mA to operate the SOA. Similarly, more than 20 comb channels (1540-1564 nm) in the C-band are tested individually to show sub-ns switching (cf. SI). Although the L-band was not tested because no L-band AWG was available, the current results indicate that more than 40 comb channels in the L-band could be used as well.
To scale to thousands of nodes, we can take advantage of the fact that today's servers or top-of-the-rack (ToR) switches comprise multiple channels. For example, the recently announced NVIDIA Ampere A100 GPU 41 supports 2.4 Tbps of bandwidth by combining 48 channels, each operating at 50 Gbps. Therefore, assuming 40 wavelengths and SOAs per transmitter, we can connect up to 48 × 40 = 1920 other nodes. If, instead, we consider a rack-based deployment, the latest ToR switches 42 have 512 SERDESes (i.e., 256 uplinks) that could interconnect up to 256 × 40 = 10,240 racks, which is higher than the number of racks in even large data centers today.
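The scaling estimate is a product of the number of uplink channels and the number of selectable wavelengths per channel. A minimal sketch of that arithmetic follows; the function names are illustrative and not taken from the paper.

```python
def reachable_nodes(uplink_channels: int, wavelengths_per_channel: int) -> int:
    """Number of destinations reachable when every uplink channel can select
    any one of `wavelengths_per_channel` wavelengths."""
    return uplink_channels * wavelengths_per_channel


def aggregate_bandwidth_tbps(channels: int, gbps_per_channel: float) -> float:
    """Total bandwidth of `channels` parallel channels, in Tbps."""
    return channels * gbps_per_channel / 1000.0


# GPU example from the text: 48 channels at 50 Gbps, 40 wavelengths each.
print(aggregate_bandwidth_tbps(48, 50))  # 2.4
print(reachable_nodes(48, 40))           # 1920
```

The same function applies to the rack-based case by passing the number of ToR uplinks instead of the GPU channel count.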
However, this also means that a node is directly connected to any other node in the data center through only one of its uplink channels. To ensure that any pair of nodes can still communicate with their full bandwidth, the network routes traffic between any pair of nodes through all other nodes in the network. Such detour routing imposes a throughput overhead; however, the throughput is at most 2× worse than that of an ideal switch, so the overhead can be compensated by doubling the per-node network bandwidth. Furthermore, detour routing offers several advantages, including the fact that it obviates the need for explicitly scheduling network traffic, which has been a key bottleneck for practical and deployable optical switching in data centers. We provide more details of our network architecture and the trade-offs imposed by our topology and routing choice in a separate paper 18.
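The factor-of-two bound can be checked numerically with a simple load model. The sketch below assumes that every flow is split evenly over all n nodes as intermediates (the shares assigned to the source and the destination themselves travel directly), in the spirit of Valiant load balancing, and that each node divides its bandwidth equally over its n − 1 direct links. These are modeling assumptions for illustration and not necessarily the exact scheme of ref. 18.

```python
def detour_link_loads(traffic):
    """Load on every directed link a->b when each flow is split evenly over
    all n nodes as intermediates. Link a->b carries 1/n of everything a
    sends (first hop) plus 1/n of everything b receives (second hop)."""
    n = len(traffic)
    sent = [sum(row) for row in traffic]
    received = [sum(traffic[i][j] for i in range(n)) for j in range(n)]
    return [
        [(sent[a] + received[b]) / n if a != b else 0.0 for b in range(n)]
        for a in range(n)
    ]


def max_rate_fraction(traffic, per_node_bandwidth=1.0):
    """Largest factor by which `traffic` can be scaled before the busiest
    link exceeds its capacity of per_node_bandwidth / (n - 1)."""
    n = len(traffic)
    link_capacity = per_node_bandwidth / (n - 1)
    busiest = max(max(row) for row in detour_link_loads(traffic))
    return link_capacity / busiest


# Permutation traffic: node i sends at unit rate to node i+1 only.
n = 8
permutation = [[1.0 if j == (i + 1) % n else 0.0 for j in range(n)] for i in range(n)]
print(max_rate_fraction(permutation))
```

Under this model the sustainable fraction is n / (2(n − 1)), which stays above one half for any n; this corresponds to the at-most-2× overhead quoted above.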
Ultra-fast wavelength switching and data transmission via discrete SOAs. For a proof-of-concept system-level demonstration, fast switching within four different comb channels is performed. Figure 2e shows sub-ns switching between four different comb channels with a wavelength separation of ~5.6 nm. A guard zone of 2.56 ns is used to allow smooth switching between adjacent comb channels, while an external reference clock (timing board) aligns the different switching signals. Distinct currents are applied to the SOAs to achieve a constant output power and to compensate for the non-uniform comb power per line or SOA gain (Fig. 2e). While sub-ns switching of four channels with a maximum separation of 20 nm has been demonstrated, the maximum channel separation is mainly limited by the optical band-pass filter (OBF) (cf. SI).
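The 2.56 ns guard zone corresponds to 64 symbols at 25 GBd. The short sketch below relates symbol counts to durations, using the burst format listed in the Fig. 3 caption (32 header, 1024 payload and 64 guard symbols); the variable and function names are illustrative.

```python
SYMBOL_RATE_BAUD = 25e9                 # 25 GBd symbol rate
HEADER, PAYLOAD, GUARD = 32, 1024, 64   # symbols per burst (Fig. 3 caption)


def duration_ns(symbols: int, symbol_rate: float = SYMBOL_RATE_BAUD) -> float:
    """Duration of `symbols` symbols in nanoseconds."""
    return symbols / symbol_rate * 1e9


burst_symbols = HEADER + PAYLOAD + GUARD
payload_fraction = PAYLOAD / burst_symbols

print(round(duration_ns(GUARD), 2))          # guard zone in ns
print(round(duration_ns(burst_symbols), 2))  # full burst in ns
print(round(payload_fraction, 3))            # share of symbols carrying payload
```

With these numbers, the guard zone occupies 64 of 1120 symbols per burst, so the switching overhead stays at a few percent of the burst duration.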
In the following experiment, we show 25 Gbps (NRZ) and 50 Gbps (PAM-4) burst mode data transmission while switching between four comb channels using the setup shown in Fig. 3b. The four optical carriers after switching are further amplified to overcome the insertion loss (~7 dB) of the 20-GHz Mach-Zehnder modulator (MZM), which is operated at the quadrature point. In addition to eliminating comb channels in the next-order FSR of the AWG, the OBF is used to suppress the out-of-band amplified spontaneous emission (ASE) noise of the SOAs and the EDFA. The burst mode sequence at a 25-GBaud symbol rate, generated by the arbitrary waveform generator, is applied to the MZM with random sequences of 2^15 and 2^16 bits for NRZ and PAM-4, respectively. The signal is detected on a fast photodiode with 50-GHz bandwidth, and the electrical waveform is amplified using a transimpedance amplifier (TIA). The amplified waveform is acquired using a real-time oscilloscope with a sampling rate of 160 GSamples/s. The digital signal processing (DSP), explained in detail in ref. 17, is performed offline to obtain the bit error ratio (BER). The BER versus received optical power (ROP) of the system, shown in Fig. 3c, is characterized by changing the optical power of the incoming waveform via a variable optical attenuator (VOA). A BER below 5 × 10^−5, which is the threshold for forward error correction (FEC) in data center transmission systems corresponding to the KP4 FEC 43, is achieved for both NRZ and PAM-4 at a ROP of ~−12 dBm and ~−8 dBm, respectively. The BER error floor for PAM-4 at a ROP of >−6 dBm emerges due to ASE and AWG crosstalk.
Fig. 1 a Interconnection of 64 racks via an arrayed waveguide grating router (AWGR) for implementing fast OCS. In this model, distinct wavelengths are assigned to each rack at each time slot. At each receiver, a 10-20 nanosecond (ns) time slot is assigned for each rack on a round-robin basis for data transmission. The switching module is placed on the top-of-rack (ToR) switch. A single comb source that is post-amplified via cascaded amplifiers to attain high optical power per line can be distributed among many racks. The 64 individual combs (comb 1, ..., comb 64) split from a central frequency comb generator (FCG) can be distributed across 64 different racks as a multiwavelength laser, making this architecture more power-efficient and flexible. The multiple data-carrying optical carriers are routed using a passive AWGR to the assigned racks. b Each comb channel is transmitted to SOAs after de-multiplexing, where a control signal (turning the current on/off) is applied to switch between the comb channels at sub-ns time scales. The comb channels (10-ns slots) are encoded with data using Mach-Zehnder modulators (MZMs) and transmitted to the relevant racks. The multiple MZMs shown here indicate that this architecture can be scaled further to establish links between more racks. c The multi-wavelength source based on the chip-scale soliton microcomb is generated by pumping with a single laser. d, e Microscope images of a Si3N4 microresonator (d) and a photonic chip (e) containing an AWG and SOAs. The inset in (d) shows a false-color SEM image of the coupling section of a Si3N4 microresonator.
Fig. 3 Experimental demonstration of burst mode NRZ and PAM-4 transmission using discrete SOAs while switching. a A stream of multiple data packets showing data transmission along with sequential switching between the four comb channels. A single burst waveform sequence consists of header, payload, and guard zone containing 32, 1024, and 64 symbols, respectively. b The signal after the frequency comb generator (FCG) and the optical switching unit (OSU) is amplified using a compact EDFA to compensate for the losses of the 20-GHz Mach-Zehnder modulator (MZM). Then, it is filtered using a wide-band optical bandpass filter (OBF) (~20 nm) to reject amplified spontaneous emission (ASE) noise from the SOAs and the EDFA. The data are encoded on the modulator using an arbitrary waveform generator (DAC). A fast photodiode (PD) is used to detect the signal. The electrical signal is amplified using a trans-impedance amplifier (TIA) and finally captured by an oscilloscope (OSC). c The bit error ratio (BER) of four different comb channels while switching between them, using two modulation formats: non-return-to-zero (NRZ) and four-level pulse amplitude modulation (PAM-4). A performance below the forward error correction (FEC) threshold is achieved for NRZ and PAM-4 for a received optical power of ~−12 dBm and ~−8 dBm, respectively.
Photonic integrated circuit based optical switching and data transmission. Next, integrated indium phosphide (InP) chip-based SOAs and an AWG are used to show 25-Gbps NRZ burst mode data transmission along with fast OCS using a soliton microcomb. Figure 4b shows the photonic integrated circuit (PIC) based wavelength selector with dimensions of 6 mm × 8 mm. The reflection of the light from the high-reflection (HR) coated facet allows simultaneous use of an AWG with 32 channels spaced by 50 GHz as multiplexer and demultiplexer. This simplifies the wavelength alignment procedure and reduces the footprint of the device. Nineteen SOAs are connected via integrated waveguides to AWG output channels. The wavelength alignment of the comb channels to the AWG is performed by changing the temperature of the PIC, resulting in seven comb channels matching the AWG. This can also be realized by changing the temperature of the Si3N4 chip. Figure 4c shows the optical spectrum of the AWG-aligned comb channels, indicating >20 dB isolation from adjacent channels. Initially, the 10-90% rise and fall times of the PIC are characterized by performing simultaneous switching between two comb channels. The maximum (minimum) experimentally observed switching time is ~820 ps (~375 ps). Moreover, the overshoot in the switching signal, as seen in Fig. 4d, arises due to impedance mismatch between the on-chip SOA electrodes and the RF probes 17. Then, a burst mode data transmission demonstration with 25-Gbps NRZ modulation is performed. A BER below the FEC threshold is obtained when switching between two channels in different combinations for a ROP >−11 dBm. While channels 40 and 41 have approximately the same optical power, channel 41 shows better BER performance as the crosstalk from channel 41 to 40 is about 8 dB lower than the crosstalk from channel 40 to 41. The PAM-4 burst mode transmission demonstration requires further improvement in the output power of the comb due to low in- and out-coupling in the packaged Si3N4 chip. The main reason behind the low coupling efficiency (15%) is an additional 2 dB splicing loss between UHNA and SMF-28 fiber, which can be reduced to <0.2 dB by using state-of-the-art splicing instruments 44. Similarly, the AWG crosstalk and insertion loss of the PIC can be further improved, along with the use of lower-FSR microcombs (25/50 GHz), to enhance the overall performance of this architecture.
Nevertheless, the current results show the potential of a soliton microcomb as a suitable multi-wavelength source for 25 GBd NRZ burst mode transmission along with fast switching.

Discussion
Regarding the power consumption, the current multi-wavelength source consumes a total electrical power of ~30 W (cf. SI) while providing more than 60 carriers with an optical power >−20 dBm (~500 mW electrical power per carrier). The electrical power consumption can be reduced to <193 mW per carrier (15.5 W total) by reducing the splicing loss between UHNA and SMF fibers 44, implementing on-chip actuators 45 instead of a bulk temperature controller, and using a power-efficient, compact distributed feedback (DFB) laser as the CW pump 35,37. By optimizing the microresonator dispersion design, it is possible to generate 122 comb channels with an optical power >−14 dBm without needing any post-amplification 30. Furthermore, an amplifier configuration with separate amplifiers for the C- and L-bands would give a comb with an optical power per line of ~13 dBm, as mentioned in the SI of ref. 30.
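The per-carrier figure quoted above follows from dividing the total electrical power by the number of carriers, and the optical powers are given on the logarithmic dBm scale. The helper names in the following sketch are illustrative only.

```python
import math


def dbm_to_mw(p_dbm: float) -> float:
    """Convert optical power from dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def mw_to_dbm(p_mw: float) -> float:
    """Convert optical power from milliwatts to dBm."""
    return 10.0 * math.log10(p_mw)


def electrical_power_per_carrier_mw(total_w: float, n_carriers: int) -> float:
    """Electrical power attributed to each carrier, in milliwatts."""
    return total_w * 1000.0 / n_carriers


# Current source: ~30 W in total for 60 carriers.
print(electrical_power_per_carrier_mw(30, 60))  # 500.0

# Optical power of a carrier at the quoted -20 dBm lower bound, in mW.
print(dbm_to_mw(-20))
```

Since 60 is the lower bound on the number of carriers, 500 mW is an upper estimate of the electrical power per carrier for the current source.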
More importantly, this high-power comb source can be shared among multiple racks by adding a hierarchy of passive optical splitters and amplifiers for better power and resource utilization (Fig. 1a). The soliton microcomb source distributed among 32 racks provides carriers with P_opt ~ −4 dBm while consuming 2.57 W (1.115 W) of electrical power per rack when using a state-of-the-art commercial EDFA (on-chip amplifier 46), making it a highly power-efficient and flexible solution for the data center (cf. SI). More broadly, the flexibility of sharing the comb across many racks, with the fast wavelength selection done on the rack itself, means that the overall electrical power efficiency per channel approaches the power efficiency of the EDFAs, with the comb power as only a small contributor. This indicates that an optimized shared comb source would consume an electrical power comparable to that of other multi-wavelength source solutions, including recent techniques that use a bank of tunable lasers as a multi-wavelength source 18,47. Since the wavelength tuning is done on the bank of tunable lasers itself, sharing it between multiple racks would lead to increased complexity when synchronizing the wavelength switching between the racks, due to the varying time delays between the bank of tunable lasers and the racks. As a matter of fact, one tunable laser bank per rack would be more optimal for a time-multiplexed solution 47, which would require at least 2 × 32 tunable lasers for switching between 32 racks instead of a single amplified comb chip. Moreover, a comb does not require additional complex algorithms for fast switching and wavelength stabilization, thus offering an appealing division of functionality by leveraging a complex yet highly shared light source.
Fig. 4 Sub-ns optical circuit switching (OCS) and data transmission using on-chip SOAs and an on-chip AWG along with a soliton microcomb. a Schematic of the setup used to perform the OCS and data transmission. The multi-wavelength optical carriers, generated via the frequency comb generator (FCG), are coupled to an InP chip containing an AWG and SOAs via an optical circulator. The coupled optical carriers are aligned to the AWG by changing the temperature of the InP chip. The aligned carriers are transmitted to the integrated SOAs; if one of them is biased, then the AWG channel (waveguide) connected to that particular SOA is reflected from the high-reflection coated facet of the chip, while non-biased SOAs block the light. The reflected channels are coupled back via an anti-reflection coated optical fiber for encoding the information using the data transmission unit (DTU). b Microscope image of the PIC showing the SOAs (red arrows) and the AWG. c Optical spectrum of comb channels with different spacings while switching after the InP chip, indicating more than 20 dB isolation from adjacent AWG channels (crosstalk). The red curve shows CH 36 and 42 with 4.8 nm wavelength spacing, the blue curve CH 35 and 42 (5.6 nm) and the purple curve CH 40 and 41 (0.8 nm). d The sub-ns wavelength switching between two different comb channels using on-chip SOAs. The overshoot in the switching signal is due to impedance mismatch between the high-speed radio-frequency (RF) probes and the on-chip electrodes. This effect can be minimized by optimizing the drive signal. e The left and right panels show zoomed-in views of the switching signals between two different comb channels (CH 35 and CH 42). f The bit error ratio (BER) performance of the 25 GBd NRZ PIC-based switching system for different combinations of two comb channels.
In this work, we demonstrate the possibility of achieving nanosecond OCS using a chip-based microcomb for future power-efficient and low-latency data centers. More than twenty individual comb channels in the C-band with a power >−20 dBm are switched in <520 ps using discrete SOAs. The OCS system with 25-GBd NRZ and PAM-4 burst mode data transmission is shown while switching between different comb channels. Further, a PIC containing on-chip SOAs and an AWG is implemented to show sub-ns switching (<900 ps) and 25-GBd NRZ transmission. The current demonstration can provide a route to a fully integrated, fast-tunable transceiver providing dense carriers for wavelength switching to meet the power and latency requirements posed by future cloud workloads.

Methods
Switching architecture. The link between two racks is established only via a single wavelength (comb tooth) in specific time slots (t64, t128, ...) as shown in Fig. 1a. The soliton microcombs provide many coherent optical carriers assigned to distinct racks. The switching operation is performed by applying a control signal to the SOAs, e.g., switching from the t1 to the t2 data slot is done by applying an on signal to the second SOA and off signals to all other SOAs. A trigger signal from an external reference clock is used to align the switching control and data-encoding units.
Soliton microcomb generation. A compact fiber laser (Koheras BASIK) is amplified using an EDFA. Then, ASE noise is filtered out using a narrow optical bandpass filter. The Si3N4 chip is packaged by splicing ultrahigh numerical aperture (UHNA) fiber to standard SMF-28 fiber, with a chip-through (fiber-chip-fiber) coupling efficiency of 15% 49. A single soliton is initiated at an input power of 450 mW in the bus waveguide by applying a custom-designed ramp voltage. The deterministic soliton initiation and backward tuning are controlled via a computer interface. The strong pump line is filtered out using an OADM. Then the soliton is amplified using a low-noise EDFA before de-multiplexing. The power conversion from the CW pump to the single soliton is ~2% in the current device. An intrinsically coherent single soliton state is preferred because of its low line-to-line power variation within the 3 dB bandwidth (smooth spectral envelope).
Photonic integrated SOAs and AWG. The InP-based wavelength selector PIC incorporates 23 SOAs, of which 4 are used as references. The other 19 SOAs are connected to a 1 × 32 AWG which acts as a multiplexer and demultiplexer. One of the PIC facets is high-reflection coated so that the light is reflected back through the SOAs and AWG to the input waveguide. The reflective single-AWG design reduces the footprint compared to using two AWGs and avoids wavelength misalignment of the AWGs. The wavelength selector PIC was designed using the JEPPIX foundry and fabricated at Fraunhofer HHI.
Switching control unit. The switching control unit supplies the bias currents and electrical switching signals to the SOAs. A negative voltage is applied to drive the SOAs at the zero level. The current flowing through the SOAs at the zero level was around 1 μA. To optimize the switching between two different channels, the upcoming channel starts switching on while the current channel is still switching off. The extinction ratio (ER) appears to be degraded since the signal intensity shows the sum of the channels and does not drop to the zero level during the switching event. The ER of the waveform in Fig. 2d is ~15 dB. It is difficult to estimate the actual ER as the signal was measured using a sampling scope and the zero level of the signal is buried in noise. This unit also controls the clock and time synchronization of the switching signals. Time synchronization with <100 ps accuracy is achieved in the current study 18.
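The on/off control described under "Switching architecture" amounts to a one-hot gate pattern that advances with every time slot. The sketch below assumes zero-based slot and SOA indices and a plain round-robin order; the overlap between the falling and rising edges of consecutive channels and the guard zone are not modeled.

```python
def soa_control(slot: int, n_soas: int) -> list:
    """One-hot gate pattern for a time slot: only SOA number (slot mod n_soas)
    receives the 'on' signal, all other SOAs are switched off."""
    active = slot % n_soas
    return [i == active for i in range(n_soas)]


def schedule(n_slots: int, n_soas: int) -> list:
    """Round-robin control patterns for consecutive time slots."""
    return [soa_control(t, n_soas) for t in range(n_slots)]


# Four comb channels, as in the discrete-SOA experiment.
for t, pattern in enumerate(schedule(5, 4)):
    print(t, ["on" if gate else "off" for gate in pattern])
```

Because exactly one SOA is on in each slot, the wavelength passed to the modulator, and hence the destination selected by the AWGR, is determined by the slot index alone.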

Data availability
The data used to produce the plots within this paper are available at https://doi.org/10.5281/zenodo.4588562.