Introduction

The rapid advancement of fifth-generation (5G) wireless technology towards higher data rates, lower latency, and ultra-high reliability has revolutionized many aspects of our lives and has served as a catalyst for the leap-forward development of smart cities and smart homes. Devices that facilitate people’s lives through autonomous decision-making have become essential in such smart scenarios. In recent years, metasurfaces have emerged as novel two-dimensional (2D) artificially engineered structures that exhibit extraordinary abilities to control electromagnetic waves. Leveraging their advantages of small size, low cost, and flexible design, metasurfaces have been extensively studied in various fields, including beam manipulation1,2,3,4, wireless power transfer (WPT)5,6,7, radar cross section (RCS) reduction8,9,10,11, orbital angular momentum (OAM) vortex electromagnetic waves12,13,14,15, holographic imaging16,17,18, and so on. The advent of digital coding programmable metasurfaces19,20 further offers unprecedented degrees of freedom to dynamically switch metasurface functions through controllable components or materials, such as positive-intrinsic-negative (PIN) diodes21,22,23,24,25, varactors26,27,28, and micro-electro-mechanical systems (MEMS)29,30,31. Compared with conventional phased antenna arrays, which require complicated and expensive T/R components to realize tunable beam steering, programmable metasurfaces composed of extremely simple architectures can achieve versatile electromagnetic wave manipulation in a low-cost and real-time manner. Various coding strategies, such as space digital coding, time digital coding, and time-space digital coding, have been merged into programmable metasurface designs, enabling many fascinating applications, for instance, nonlinear harmonic modulations32,33,34, non-reciprocal responses35,36, transmission-reflection integrated beam manipulations37,38, and so on. Equipped with sensors, programmable metasurfaces can sense the ambient environment and achieve self-adaptive function adjustment through an unmanned sensing-feedback system39,40,41, and accordingly the concept of the information metasurface (IMS)42,43,44 has emerged.

IMSs have opened up new possibilities for smart cities and smart homes. Various electronic and mechanical sensors, including gyroscopes for orientation45, brain-computer interfaces for neural signals46, and cameras for computer vision47, have been integrated into metasurfaces to realize localization and tracking, human action recognition, intention-based imaging, and other tasks in an intelligent way. On the other hand, speech is one of the most effective tools for human communication. Intelligent voice interaction between humans and metasurfaces offers a more natural, convenient, and non-contact method by leveraging natural language as a medium for information transfer. By incorporating an intelligent voice interaction function into metasurfaces, the resulting intelligent system can “listen” to human commands and respond accordingly, significantly extending the intelligent scope and depth of metasurfaces.

In this article, an innovative voice-interactive IMS information and power transmission system tailored for smart conference room scenarios has been developed by integrating intelligent voice interaction technology, IMS technology, smart tagging technology, and wireless communication technology. Integrating voice interaction into the IMS brings distinctive advantages for the manipulation of electromagnetic waves, including remote control, touchless operation, simple structure, and greater flexibility in use. By synergizing speech recognition, speech synthesis, target detection, wireless communication, and power transfer, the resulting self-determining closed-loop intelligent system, operating in either a user voice instruction mode or an autonomous perceptual mode, enables targeted data transmission to individuals and wireless power transmission to rechargeable devices within the conference room. With effortless user control through voice commands and insightful voice feedback, the voice-interactive IMS system facilitates real-time data collection, transmission, and exploitation regarding individuals and rechargeable devices, thereby promoting more potential smart applications of IMS technology.

Results

Architecture of the IMS system

The concept of the proposed voice-interactive IMS system for smart conference room scenarios is illustrated in Fig. 1. The system, called “RIS I”, consists of our custom-designed core processing system, programmable metasurface, detection tags system, and video capture and transmission system. The IMS system can operate in two modes, instruction mode and autonomous mode, with the user selecting the desired mode through voice instructions. In the instruction mode, the user actively chooses the recipients of the video information and the wireless power transmitted by the programmable metasurface via voice instructions, whereas in the autonomous mode, “RIS I” intelligently detects the recipients in the conference room and transmits the video information and wireless power to them in a self-decision-making way. The proposed system incorporates an information feedback architecture based on the designed detection tags system, allowing the IMS system to make intelligent judgments. For example, the detection tags system monitors the actual usage of seats within the conference room in real time. The video information is transmitted exclusively to the occupied seats, while transmission to unoccupied seats requires confirmation through the user’s voice commands. The video capture system, composed of a camera and a video transmission module, is used to capture and transmit the video information. In addition to the video transmission, wireless power can be intelligently transmitted to rechargeable devices depending on whether the charging board is occupied. On the basis of either the user’s voice commands or data from the detection tags, the core processing system generates the control signals, which change the coding sequences of the programmable metasurface through the PIN control circuit board to achieve dynamic manipulation of the electromagnetic beam. Multiple directional beams generated by the metasurface ensure effective signal coverage of the occupied places in the conference room, enhancing signal transmission quality while minimizing unnecessary energy consumption. A well-developed intelligent algorithm is implemented in the system to optimize signal transmission efficiency through precise control of the electromagnetic beam. Note that the keyword “RIS I” serves as a safeguard against false triggering of the intelligent system when it is not in working mode; the “RIS I” system enters the working mode only if the user utters the pre-set keyword. Similarly, the intelligent system can be put into sleeping mode through a close command, allowing users to easily manage its operation. It is worth noting that the proposed intelligent system can seamlessly switch between video transmission for people on the seats and wireless power supply for electronic devices on the charging boards with only modifications to the control signals of the coding sequences, highlighting the flexibility and applicability of the intelligent system. Meanwhile, the voice interaction provides a natural and intuitive way for users to interact with devices, allowing for hands-free operation and a more seamless user experience.
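The wake-word, sleep-command, and mode-selection logic described above can be condensed into a simple gating routine. The following Python sketch is purely illustrative; the state names and the exact command strings are our own assumptions and do not reproduce the actual firmware of the core processing system.

```python
# Minimal sketch of the wake-word and mode gating described above.
# State names and command strings are illustrative assumptions, not the
# actual firmware of the "RIS I" core processing system.

from enum import Enum, auto

class Mode(Enum):
    SLEEPING = auto()      # system ignores all commands except the keyword
    INSTRUCTION = auto()   # user selects targets by voice
    AUTONOMOUS = auto()    # targets selected from detection-tag data

WAKE_WORD = "RIS I"        # pre-set keyword that brings the system into working mode
SLEEP_COMMAND = "close"    # hypothetical close command that puts the system to sleep

def step(mode: Mode, utterance: str) -> Mode:
    """Advance the operating mode after one recognized utterance."""
    if mode is Mode.SLEEPING:
        # Guard against false triggering: only the keyword wakes the system.
        return Mode.INSTRUCTION if utterance == WAKE_WORD else Mode.SLEEPING
    if utterance == SLEEP_COMMAND:
        return Mode.SLEEPING
    if utterance == "autonomous mode":
        return Mode.AUTONOMOUS
    if utterance == "instruction mode":
        return Mode.INSTRUCTION
    return mode  # all other commands are handled by the beam-control logic
```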

Fig. 1: Schematic of the voice interactive IMS system.
figure 1

The proposed voice assistant “RIS I” receives and recognizes voice commands from users in real time, uses the data received from the detection tags to capture the seat usage in the conference room, and then manipulates the electromagnetic beam of the programmable metasurface through the PIN control board to selectively transmit information in accordance with the user’s instructions, while communicating with the user through voice and information feedback.

Design of the programmable metasurface

The structure of the proposed programmable metasurface is illustrated in Fig. 2a. It consists of 12-by-12 linearly polarized 1-bit elements, and each element is equipped with a single PIN diode (Skyworks SMP1340-040LF)48. The element comprises two substrates bonded together. The top substrate is F4BM with a relative permittivity of 3.5 and a loss tangent of 0.003, while the bottom one is FR4 with a relative permittivity of 4.4 and a loss tangent of 0.026. The bonding layer is Tu872_1080 with a relative permittivity of 3.65. The thicknesses of the three substrates are H1 = 3 mm, H2 = 0.075 mm, and H3 = 0.5 mm, respectively. Two identical square copper patches connected by a PIN diode are located on the upper surface of the F4BM substrate, and a DC-driven layer is fabricated on the bottom surface of the FR4 substrate (Fig. 2b). A copper plate located on the bottom surface of F4BM serves as the ground plane for the RF and DC circuits. One of the patches is connected to the ground through two metal through-vias, serving as the cathode connection of the diode, while the other patch is connected to the lowermost DC-driven layer through a metal through-via, acting as the anode connection of the diode. To achieve high isolation between the RF and DC signals, the bias line is placed at the weaker part of the electric field, and a fan-shaped patch on the DC-driven layer is employed to ensure a choking effect. Details of the metasurface element can be found in Supplementary Note 1.

Fig. 2: The programmable metasurface of the voice interactive IMS system and its reflective properties.
figure 2

a The programmable metasurface consists of 12-by-12 1-bit unit cells, each of which is integrated with a PIN diode. b The bottom view of the element. c The equivalent circuits of the PIN diode in the ON and OFF states. d The reflection amplitude and reflection phase of the proposed 1-bit unit cells. e The front view of the proposed voice-interactive IMS system. f The back view of the proposed voice-interactive IMS system.

When a metasurface element is normally illuminated by a plane wave with an x-polarized electric field, its reflective response can be changed by switching the state of the PIN diode. The equivalent circuits of the PIN diode in the OFF and ON states are illustrated in Fig. 2c. We encode the element with the OFF state as unit cell 0 and the element with the ON state as unit cell 1. The reflection characteristics of the 1-bit unit cells are simulated using the commercial software ANSYS Electromagnetics Suite, wherein the element is enclosed by periodic boundaries along the x and y directions and excited by Floquet ports along the z direction. The reflection phases of the element in the two states differ by 180° ± 20° in the band of 5.5~6.5 GHz, with the reflection amplitude better than -0.8 dB (Fig. 2d). With the proposed element, a metasurface comprising 12-by-12 elements is developed, as depicted in Fig. 2e and f. The entire metasurface aperture is 316 mm × 333.5 mm (6.32λ0 × 6.67λ0), where λ0 is the free-space wavelength at the center frequency of 6 GHz. To provide efficient control over the PIN diodes, 12 socket interfaces are incorporated at the edge of the DC-driven layer, each of which controls a row of 12 PIN diodes (see Supplementary Note 1 for details). An offset square horn is employed as the feed to illuminate the metasurface, with its phase center at the point (0, -82, 357) mm. The offset configuration effectively reduces the feed blockage, and the feed location is chosen to balance the spillover efficiency and the illumination efficiency49,50. More details about the feed antenna location can be found in Supplementary Note 2.

Multifunctional wave manipulations of metasurface

The diverse functions of the proposed programmable metasurface are demonstrated by changing the digital coding sequences, which are obtained through advanced optimization techniques (see Supplementary Note 3 for details). The designed metasurface can achieve multiple functions, including beam scanning, power focusing, generation of OAM waves, RCS reduction, and holography. The solution details of the digital coding sequences for each function are given in Supplementary Note 3.
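To give an intuitive picture of how a coding sequence relates to a desired beam direction, the sketch below computes a 1-bit pattern from the standard reflectarray phase-compensation principle. It is a simplified illustration rather than the optimization procedure of Supplementary Note 3; the uniform element pitch is an assumption derived from the quoted aperture size, and the feed position is taken from the text.

```python
# Illustrative 1-bit coding calculation for steering a single beam toward
# (theta, phi), based on the standard reflectarray phase-compensation principle.
# This is a simplified sketch, not the optimization routine of Supplementary
# Note 3; the uniform pitch is assumed from the 333.5 mm aperture dimension.

import numpy as np

C0, F0 = 3e8, 6e9                      # speed of light, center frequency
K0 = 2 * np.pi * F0 / C0               # free-space wavenumber (rad/m)
N = 12                                 # 12-by-12 elements
PITCH = 0.3335 / N                     # assumed uniform element pitch (m)
FEED = np.array([0.0, -0.082, 0.357])  # feed phase center (m), from the text

def one_bit_coding(theta_deg: float, phi_deg: float) -> np.ndarray:
    """Return a 12x12 matrix of 0/1 element states steering the beam to (theta, phi)."""
    theta, phi = np.radians(theta_deg), np.radians(phi_deg)
    idx = (np.arange(N) - (N - 1) / 2) * PITCH
    x, y = np.meshgrid(idx, idx, indexing="ij")
    d_feed = np.sqrt((x - FEED[0])**2 + (y - FEED[1])**2 + FEED[2]**2)
    # Required compensation phase for a pencil beam along (theta, phi).
    phase = K0 * d_feed - K0 * np.sin(theta) * (x * np.cos(phi) + y * np.sin(phi))
    # 1-bit quantization: assign each element to the nearer of the two states
    # (reflection phases separated by roughly 180 degrees).
    wrapped = np.mod(phase, 2 * np.pi)
    return ((wrapped > np.pi / 2) & (wrapped <= 3 * np.pi / 2)).astype(int)

coding = one_bit_coding(30, 90)  # e.g. a beam 30 degrees off broadside in the yoz plane
```

The quantization step in the last two lines is also the source of the 1-bit phase quantization error discussed below for large scanning angles.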

In a smart conference room scenario, beam scanning and wireless power transfer are employed to accomplish information and energy transmission, respectively. It is worth noting that the wave manipulation performance of the metasurface in the xoz and yoz planes is similar; for practical application purposes, the results in the yoz plane are presented. To evaluate the beam scanning performance, we established a measurement environment within a microwave anechoic chamber, as depicted in Fig. 3a and b. The measurement system comprises a feeding antenna (HD-70SGAH15N), a receiving antenna (HD-10200DRHA10S), and a vector network analyzer (VNA, Agilent Technologies N5244A). To ensure precise measurements, a carefully designed bracket, which securely holds the metasurface and the feeding antenna together, is positioned on a rotating platform. The feeding antenna is connected to the transmitting port of the VNA as the signal source, while the receiving antenna is connected to the receiving port of the VNA to capture the reflected waves. The receiving antenna is situated at a distance of about 2.67 m from the metasurface to ensure far-field measurement. Furthermore, to minimize potential interference between the PIN control board and the metasurface, the PIN control board is positioned horizontally at the back of the metasurface. Real-time control of the switching state of each PIN diode on the metasurface is achieved through preset coding sequences.

Fig. 3: The microwave anechoic chamber measurement environment and measured results of the proposed programmable metasurface.
figure 3

a, b The microwave anechoic chamber measurement environment for the far-field beam scanning characterization. c, d The microwave anechoic chamber measurement environment for the near-field energy focusing measurement. e, f The single- and dual-beam scanning results of the proposed metasurface. g Electric field distribution on the observation plane obtained from the measurement.

To elucidate the wireless power transfer performance of the proposed metasurface, an experimental setup has been established, as depicted in Fig. 3c and d. In this setup, the feeding antenna and the metasurface are positioned on a platform. A waveguide probe (HD-70WOEWPN) at a distance of about 0.6 m from the feeding antenna is connected to the receiving port of the VNA to capture the reflected electromagnetic wave. The probe is moved in a plane with a fixed step of 20 mm, measuring near-field amplitude and phase characteristics. The scanning plane covers a range of 600 mm × 600 mm.

The experimental results for the single-beam scanning, dual-beam scanning, and energy-focusing capabilities of the proposed metasurface are presented in Fig. 3e–g, respectively. It can be observed from Fig. 3e and f that both single-beam and dual-beam scanning within a coverage range of ±50° are achieved. As the scanning angle increases from 0° to 50°, the gain for single-beam scanning gradually decreases from 18.4 dBi to 14.98 dBi, accompanied by a slight increase in sidelobe level. Similar behavior is observed for dual-beam scanning. This degradation arises from the phase quantization error of the 1-bit coding pattern and can be further reduced by applying a higher-bit coding scheme. Nonetheless, the measured results demonstrate that the obtained sidelobe levels remain relatively low, ensuring that the metasurface can meet information transmission requirements without significant interference. The simulated results and the digital coding sequences for beam scanning can be found in Supplementary Note 4. Regarding the wireless power transfer performance of the proposed metasurface, the measured electric field distribution on the observation plane is illustrated in Fig. 3g, clearly showing a well-focused spot at the designated position of (0, 100, 600) mm. Overall, the designed metasurface exhibits favorable performance in beam scanning and near-field energy focusing, making it a suitable candidate for wireless information and power transmission applications. More simulated and measured results for other functions, such as the generation of OAM waves, RCS reduction, and holography, can be found in Supplementary Note 4.
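For the energy-focusing function, the required element states can be derived from the total path length from the feed, via each element, to the focal point. The sketch below follows this textbook principle under the same geometry assumptions as the beam-steering sketch above; it is an illustration, not the coding sequence actually used in the experiment.

```python
# Illustrative 1-bit coding for focusing energy at a near-field point, based on
# the total feed-element-focus path length.  Geometry assumptions (uniform
# pitch, feed position) match the beam-steering sketch; this is not the actual
# experimental coding sequence.

import numpy as np

C0, F0 = 3e8, 6e9
K0 = 2 * np.pi * F0 / C0
N, PITCH = 12, 0.3335 / 12
FEED = np.array([0.0, -0.082, 0.357])   # feed phase center (m)
FOCUS = np.array([0.0, 0.100, 0.600])   # focal spot used in the experiment (m)

def focusing_coding() -> np.ndarray:
    idx = (np.arange(N) - (N - 1) / 2) * PITCH
    x, y = np.meshgrid(idx, idx, indexing="ij")
    d_feed = np.sqrt((x - FEED[0])**2 + (y - FEED[1])**2 + FEED[2]**2)
    d_focus = np.sqrt((x - FOCUS[0])**2 + (y - FOCUS[1])**2 + FOCUS[2]**2)
    # Elements whose total path phase falls in the opposite half-cycle are flipped
    # so that contributions add up roughly in phase at the focal point.
    wrapped = np.mod(K0 * (d_feed + d_focus), 2 * np.pi)
    return ((wrapped > np.pi / 2) & (wrapped <= 3 * np.pi / 2)).astype(int)
```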

Architecture of intelligent information processing algorithm

An intelligent information processing algorithm architecture for wireless information and power transmission in a smart conference room scenario is proposed, as shown in Fig. 4. This architecture aims to efficiently generate the coding sequences used for intelligent beam manipulation based on the occupancy information of the seats and charging boards. Note that the algorithmic procedures of the intelligent system are identical for wireless information transmission and wireless power transfer; here, only the algorithmic architecture for wireless information transmission is described.

Fig. 4: The architecture of the intelligent information processing algorithm.
figure 4

After receiving a voice command from the user, the core processing system intelligently recognizes the command, collects and processes the seat usage information in the smart meeting room, and applies the DC bias voltages corresponding to the coding sequences to each PIN diode through the PIN control board, thereby realizing the beam manipulation according to the user’s command.

In the instruction mode, the user is able to control the transmission direction of the video information captured by the camera through voice commands. For example, as depicted in Fig. 4, assume that the user wants to send the video information to the person on seat 1 using a speech command. Firstly, the detection tags on all seats in the conference room wirelessly transmit the real-time seat usage data to the core processing system. Secondly, the core processing system quickly compares the user’s voice command with the commands stored in the pre-set voice library to comprehend the user’s intention of sending the video information to seat 1. In the meantime, the core processing system uses the data from the detection tags system to determine whether there is a person on seat 1. If seat 1 is occupied, the core processing system sends the confirmed command message to the PIN control board, indicating that the video information should be sent to seat 1. Thirdly, upon receiving the control instructions from the core processing system, the PIN control circuit system retrieves the corresponding coding sequences from the pre-set coding sequence library. These coding sequences are then instantly transmitted to the programmable metasurface, steering the electromagnetic beam toward seat 1.

On the other hand, in the autonomous mode, there is no need for user involvement. The detection tags system and core processing system autonomously collect, transmit, receive, and process the seat usage information in the conference room. Subsequently, the core processing system intelligently transmits the video information to everyone in the room without the need for any voice commands from the user. Overall, the proposed intelligent system architecture enables flexible and efficient wireless information transmission in a smart conference room.
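The two operating modes can be summarized in a short control loop. In the Python sketch below, recognize_command, get_tag_data, send_coding, speak, and multi_beam_coding are hypothetical placeholders for the speech recognition, tag communication, PIN control, and speech synthesis interfaces; they do not correspond to the actual firmware calls.

```python
# Hedged sketch of the decision flow described above.  All interface functions
# (recognize_command, get_tag_data, send_coding, speak, multi_beam_coding) are
# placeholders, not the real firmware API of the core processing system.

CODING_LIBRARY = {}   # seat id -> pre-set 1-bit coding sequence for that seat

def instruction_mode(recognize_command, get_tag_data, send_coding, speak):
    seat = recognize_command()          # e.g. "send the video to seat 1" -> 1
    occupied = get_tag_data()           # dict: seat id -> True/False from the tags
    if occupied.get(seat, False):
        send_coding(CODING_LIBRARY[seat])
    else:
        # Voice feedback: ask the user whether to transmit to an empty seat.
        speak(f"Seat {seat} is unoccupied. Transmit anyway?")
        if recognize_command() == "yes":
            send_coding(CODING_LIBRARY[seat])

def autonomous_mode(get_tag_data, send_coding, multi_beam_coding):
    occupied = get_tag_data()
    targets = [seat for seat, taken in occupied.items() if taken]
    # The pre-set library would supply a multi-beam sequence covering all targets.
    send_coding(multi_beam_coding(targets))
```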

Hardware circuit platform of the proposed system

To optimize the size and cost of the intelligent system, a compact core processing system based on a 32-bit microcontroller unit (MCU) has been developed, which includes a speech recognition module, a speech synthesis module, and a wireless communication module.

A microcontroller (STMicroelectronics STM32F103ZET6) serves as the core control chip, offering rich functionalities and communication ports, as illustrated in Fig. 5a. Real-time voice commands from the user are captured by a microphone and processed by the speech recognition module. An LD3320 chip, which employs speaker-independent automatic speech recognition technology, has been incorporated for speech recognition. By defining specific keywords, it can filter out irrelevant speech and accurately recognize Chinese speech, achieving a recognition accuracy of over 90%. Upon receiving a voice command, the microcontroller compares the received voice signal with the predefined voice library and selects the best-matching voice sequence as the final recognized command. The microcontroller then assigns the corresponding coding sequence based on the voice command and transmits the command signal to the PIN control board through the data transmission interface, enabling precise manipulation of the electromagnetic beam.

Fig. 5: The hardware circuit platform of the proposed system.
figure 5

a The hardware circuit platform for core processing system. b The hardware circuit platform for detection tags system. c, d The hardware circuit platform for PIN control system.

For speech synthesis, we have integrated the SYN6658 Chinese speech synthesis chip into the system. This chip receives real-time text data through asynchronous serial communication and converts it into synthesized speech. Additionally, to monitor the real-time usage of the seats and the charging boards in the conference room, we utilize the NRF24L01 wireless communication module to receive data from the detection tags. Moreover, an LCD screen provides real-time visualization of the intelligent system’s current status, displaying the digital coding sequences currently applied to the metasurface.

To accurately acquire the utilization information of the seats and charging boards in real time, a detection tags system consisting of a 32-bit MCU (STMicroelectronics STM32F103C8T6), a wireless communication module (NRF24L01), and two sensor circuit modules, i.e., a long-range sensor module SR602 and a short-range sensor module TCRT5000, has been introduced, as shown in Fig. 5b. The human body infrared sensor SR602 offers highly sensitive detection with a sensing distance of up to 3 m, while the TCRT5000 is a reflective optical sensor with a sensing distance of up to 2.5 cm. The cooperation of the two sensor modules improves the detection precision. It is worth pointing out that the detection tags system offers low power consumption and fast, accurate data transmission. Furthermore, each detection tag is assigned a unique ID address, facilitating the identification of data received from different transmitters.
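The sketch below illustrates one plausible way the two sensor readings could be combined and reported together with the tag ID; the fusion rule and payload layout are our own assumptions for illustration and are not taken from the tag firmware.

```python
# Illustrative sensor fusion and payload format for a detection tag.
# The AND fusion rule and the 2-byte payload layout are assumptions made for
# illustration; the actual STM32/NRF24L01 tag firmware is not reproduced here.

import struct

def fuse(sr602_motion: bool, tcrt5000_reflect: bool) -> bool:
    """Report "occupied" only when both sensors agree, which suppresses
    false positives that either sensor might produce on its own."""
    return sr602_motion and tcrt5000_reflect

def build_payload(tag_id: int, occupied: bool) -> bytes:
    # 1-byte unique tag ID + 1-byte occupancy flag, sent over the NRF24L01 link.
    return struct.pack("BB", tag_id, int(occupied))

def parse_payload(payload: bytes) -> tuple[int, bool]:
    tag_id, flag = struct.unpack("BB", payload)
    return tag_id, bool(flag)
```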

To enable independent control of the DC bias voltage for each element on the metasurface, we have designed a PIN control board capable of driving 144 channels, as illustrated in Fig. 5c and d. The PIN control board also uses the STM32F103ZET6 microcontroller. To simplify the design and minimize the number of control lines, 8-bit serial-to-parallel shift registers (Fuman Electronics 74HC164D) are employed to extend the control line ports. Additionally, 8-bit drivers (Fuman Electronics 74HC245TS) are used to supply the proper driving current for each PIN diode. To provide a real-time display of the operating state of the metasurface, an array of indicator lights is formed by connecting a light-emitting diode (LED) in series with each channel. Each PIN diode is connected in series with a resistor and an LED, so the ON and OFF states of each LED correspond to unit cell 1 and unit cell 0 on the metasurface. By observing the changes in the LED array, a visual indication of the electromagnetic beam state can be obtained. The connection between the PIN control circuit board and the programmable metasurface is described in Supplementary Note 5.
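As a rough illustration of how a 12-by-12 coding matrix maps onto the 144 serial-to-parallel channels, the sketch below flattens the matrix and clocks it out bit by bit. The bit ordering and the GPIO helper functions (set_data, pulse_clock) are assumptions for illustration only; the real board layout and firmware may order the channels differently.

```python
# Illustrative packing of the 12x12 coding matrix into the serial bit stream
# clocked into the chained 8-bit shift registers on the PIN control board.
# Bit ordering and the GPIO helpers are assumptions, not the actual firmware.

def coding_to_bitstream(coding):
    """Flatten a 12x12 matrix of 0/1 states into 144 bits, row by row."""
    return [int(bit) for row in coding for bit in row]

def shift_out(bits, set_data, pulse_clock):
    """Clock the bits into the register chain, one rising edge per bit.
    Bits shifted in first propagate farthest along the chain, so the list is
    sent in reverse to keep bits[0] aligned with the first channel (assumed)."""
    for bit in reversed(bits):
        set_data(bit)     # present the bit on the serial data input
        pulse_clock()     # one clock edge shifts it into the chain
```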

Experimental setup and measurement

To validate the functionality and effectiveness of the voice-interactive IMS system, comprehensive experimental demonstrations of information transmission and wireless power transfer controlled by voice commands have been conducted, as depicted in Fig. 6a and b.

Fig. 6: Experimental setup and environment of the proposed system.
figure 6

a The experimental setup and environment for the wireless information transmission. b The experimental setup and environment for the wireless power transfer.

In the wireless information transmission experiment, video information is first captured by a camera. The collected video is then converted into a signal at 5.805 GHz by the video transmission module and delivered to the feeding horn of the metasurface. The core processing system constantly monitors the occupancy of each seat in the conference room. Before initiating video information transmission, it first checks whether the target seat is occupied. If the seat is occupied, the video information is transmitted accordingly. However, if the seat is unoccupied, speech feedback from the voice interaction system is used to assist the user in deciding whether to proceed with the current information transmission operation.

The setup for the wireless information transmission experiment is composed of a video capture camera, a video transmission module connected through RF cables to the transmitting horn (HD-70SGAH15N) serving as the feeding source of the metasurface, detection tag modules, and video receivers operating at 5.805 GHz, as shown in Fig. 7a. In this scenario, the video transmission module operates in the low-power mode with an output power of -43.47 dBm (see Supplementary Note 6 for details). Two distinct application scenarios have been designed according to the two operation modes: instruction mode and autonomous mode. In the instruction mode, the seat that receives the video information is assigned based on the user’s voice commands, while the seat usage information is obtained by the detection tag system. When the given seat is not occupied, the core processing system reminds the user of the unoccupied seat through voice communication and inquires whether to proceed with the video information transmission; with the user’s permission, the transmission is enabled. Details can be found in Supplementary Movie 1. On the other hand, the autonomous mode empowers the core processing system to independently detect the occupancy of all seats in the conference room. It autonomously transmits the video information based on the received detection data, without the need for the user to send individual voice commands. This autonomous capability enhances the flexibility and convenience of the system, providing a more user-friendly experience. The video is well transmitted when the seat is occupied (Fig. 7b); conversely, there is no video transmission for the unoccupied seat (Fig. 7c). Details can be found in Supplementary Movie 2. Additionally, voice responses have been incorporated into the system’s design, ensuring that users receive informative feedback for each voice command. This further enhances the intelligence and user satisfaction of the interactive system, creating a more immersive and comfortable experience during information transmission.

Fig. 7: Experiments of wireless information transmission and wireless power transmission for the proposed system.
figure 7

a The experiment scenarios of the wireless information transmission. b Measured video transmission with the seat occupied. c No video transmission with the seat unoccupied. d The experiment scenarios of the wireless power transmission. e Measured ambient power level without a phone on the rechargeable board. f Measured transmission power level with a phone on the rechargeable board.

In the wireless power transmission experiment, by using voice commands, the electromagnetic waves generated by the designed IMS system can be focused at a single location or at several given positions in the near-field region, realizing point-to-point or point-to-multipoint power transmission. Moreover, the system can be turned off by a voice command.

The setup for the wireless power transmission experiment is the same as that for the wireless information transmission experiment, except that a receiving antenna (HD-10200DRHA10S) operating at 5.805 GHz is employed instead of the video receivers to receive the wireless power, as shown in Fig. 7d. In the power transfer scenario, video information is captured by a camera and transmitted by the video transmission module operating in the high-power mode with an output power of 26.89 dBm (see Supplementary Note 6 for details). The receiving antenna is positioned at the pre-defined energy focus position of (0, 100, 600) mm, and a power sensor is used to measure the received power. The ambient power level measured by the power sensor is about -35.68 dBm when the metasurface is not in working mode (Fig. 7e). When a phone on the charging board is detected, the wireless power carried by the video signal is focused onto the designated charging board through a voice command, and the peak power level increases to 8.73 dBm (Fig. 7f). More details about the wireless power transfer measurement can be found in Supplementary Movie 3.
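For context, converting the two quoted readings to a linear scale (simple unit arithmetic based on the values above) gives

$$P_{\text{peak}} = 10^{8.73/10}\ \text{mW} \approx 7.5\ \text{mW}, \qquad P_{\text{ambient}} = 10^{-35.68/10}\ \text{mW} \approx 2.7\times10^{-4}\ \text{mW},$$

i.e., the focused beam raises the measured power at the charging board by about 44.4 dB, a factor of roughly 2.8 × 10^4.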

Discussion

In this paper, a groundbreaking application of the voice-interactive IMS system in the context of wireless video information transmission and wireless power transfer in smart conference rooms has been introduced. By seamlessly integrating voice interaction technology with IMS technology, we have developed a cutting-edge solution that enhances the intelligence and user experience within conference room environments.

The key innovation of our system lies in merging humanlike intelligent sensing abilities into the IMS to realize highly efficient information and power transmission in a touchless manner in a smart conference room. Real-time data collection on the presence of individuals and rechargeable devices is achieved by the detection tags system and managed by the advanced core processing system. The incorporation of voice commands empowers users to effortlessly interact with the system, enabling rapid information acquisition and intentional transmission operations for both individuals and rechargeable devices within the conference room. This research marks an important step towards the practical implementation of voice-interactive IMS systems and demonstrates great potential for applying wireless information transmission and power transfer to various smart environments.

Methods

Details of the programmable metasurface

A programmable metasurface with 12-by-12 elements has been presented in this work, which consists of 1-bit x-polarized elements operating in the frequency band of 5.5~6.5 GHz. It is designed with the commercial software ANSYS Electromagnetics Suite, manufactured using printed circuit board technology, and measured in a microwave anechoic chamber. The radiating layer and the metal layer are fabricated on the upper and lower surfaces of the dielectric substrate F4BM with a relative permittivity of 3.5, respectively. The DC bias feed layer is printed on the lower surface of the dielectric substrate FR4 with a relative permittivity of 4.4, and the two dielectric substrates are bonded together by the dielectric substrate Tu872_1080 with a relative permittivity of 3.65. Photographs of the fabricated prototype are shown in Fig. 2e and f. A PIN diode (Skyworks SMP1340-040LF) is integrated into each element to regulate the phase of the reflected electromagnetic wave, which enables the manipulation of the electromagnetic beam according to the reflective array principle.

Measurement setups

The electromagnetic beam manipulation capability of the proposed programmable metasurface has been characterized in the microwave anechoic chamber, as illustrated in Fig. 3. The feeding antenna and the metasurface are installed on a well-designed bracket, wherein the feeding antenna (HD-70SGAH15N) is connected to the transmitting port of the VNA (Agilent Technologies N5244A), and the receiving antenna (HD-10200DRHA10S) is connected to the receiving port of the VNA to measure the amplitude and phase of the reflected electromagnetic waves.

In the wireless information transmission experiments, the proposed intelligent system can operate in two different modes, i.e., instruction mode and autonomous mode. The instruments used include a feeding antenna, an IMS array, a camera and video transmission module, video receivers, and detection tag modules, as shown in Fig. 7a. In the instruction mode, before the captured video signal is directed for transmission by a voice command, the core processing system first detects the usage of the corresponding position. The information transmission operation continues only when the seat is occupied; otherwise, the user’s reconfirmation is required to complete the information transmission operation (see Supplementary Movie 1 for details). In the autonomous mode, the core processing system automatically collects and processes the usage information of all seats in the intelligent meeting room scenario and automatically selects the optimal beam modulation to achieve effective coverage of all occupied seats (see Supplementary Movie 2 for details).

In the wireless energy transmission experiment, a feeding antenna, an IMS array, a camera and video transmission module, a receiving antenna, a power sensor (ROHDE & SCHWARZ NRP8SN), and a detection tag module are used together, as shown in Fig. 7d. When the detection tag system detects a rechargeable device, the user can control the electromagnetic wave of the IMS by voice command to focus the wireless power on that device; otherwise, wireless power is not transmitted (see Supplementary Movie 3 for details).