Flexible and broadband colloidal quantum dots photodiode array for pixel-level X-ray to near-infrared image fusion

Combining information from multispectral images into a fused image is informative and beneficial for human or machine perception. Currently, multiple photodetectors with different response bands are used, which require complicated algorithms and systems to solve the pixel and position mismatch problem. An ideal solution would be pixel-level multispectral image fusion, which involves multispectral image using the same photodetector and circumventing the mismatch problem. Here we presented the potential of pixel-level multispectral image fusion utilizing colloidal quantum dots photodiode array, with a broadband response range from X-ray to near infrared and excellent tolerance for bending and X-ray irradiation. The colloidal quantum dots photodiode array showed a specific detectivity exceeding 10¹² Jones in visible and near infrared range and a favorable volume sensitivity of approximately 2 × 10⁵ μC Gy⁻¹ cm⁻³ for X-ray irradiation. To showcase the advantages of pixel-level multispectral image fusion, we imaged a capsule enfolding an iron wire and soft plastic, successfully revealing internal information through an X-ray to near infrared fused image.


Reviewer #1 (Remarks to the Author):
This work proposed an ideal solution to achieve pixel-level multispectral image fusion by flexible and broadband colloidal quantum dots photodiode array. It is a comprehensive work starting from the image fusion design to device performance measurements and strengthened by the pixel-level image fusion from X-ray to infrared.
The experiments are delicately designed, and the conclusions are well supported. The performance of the photodiode array for Vis-NIR and X-ray is comparable to that of the commercial InGaAs (NIR) and α-Se (X-ray) detectors, suggesting great potential in flexible electronics. Overall, I found that the results are very solid and the concepts are new; it should be published in this journal after addressing some minor issues.
1. As far as we know, the CQDs have more complicated surface states, which may have less radiation hardness than their bulk counterpart. Why do PbS CQDs have much better X-ray robustness compared to their bulk counterpart?
Response: We thank the reviewer for this helpful comment. As shown in supplementary … S16). Overall, this result is very interesting and worthy of further investigation.
2. In the abstract, the authors claim that the X-ray sensitivity is 2×10⁴ μC Gy⁻¹ cm⁻³, but the sensitivity is 17.8 μC Gy⁻¹ cm⁻² in the introduction, please clarify.

Response:
We are thankful for the reviewer's comment. We revised the description of 2×10⁴ μC Gy⁻¹ cm⁻³ as volume sensitivity and 17.8 μC Gy⁻¹ cm⁻² as area sensitivity.

(Line 28-29, Page 1)
We revised the manuscript accordingly "The CQDs photodiode array showed a specific detectivity exceeding 10¹² Jones in visible and NIR range and a favorable volume sensitivity of approximately 2×10⁴ μC Gy⁻¹ cm⁻³ for X-ray irradiation."

(Line 21, Page 4)
We revised the manuscript accordingly "It could be operated at a very low voltage (0.1-1.25 V) with an area sensitivity of 17.8 μC Gy⁻¹ cm⁻² …".
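The relation between the area and volume figures can be checked with a short calculation. The sketch below assumes (the response does not state how the conversion was done) that the volume sensitivity is simply the area sensitivity divided by the active-layer thickness, and it uses the ~900 nm PbS CQDs layer thickness quoted elsewhere in this letter. The function name is illustrative only.

```python
def volume_sensitivity(area_sensitivity, thickness_nm):
    """Convert an area sensitivity (uC Gy^-1 cm^-2) into a volume
    sensitivity (uC Gy^-1 cm^-3) by dividing by the layer thickness.

    Assumption: the whole active layer contributes uniformly.
    """
    thickness_cm = thickness_nm * 1e-7  # 1 nm = 1e-7 cm
    return area_sensitivity / thickness_cm


# 17.8 uC Gy^-1 cm^-2 over a 900 nm thick layer
s_vol = volume_sensitivity(17.8, 900.0)
print(f"{s_vol:.3g} uC Gy^-1 cm^-3")
```

Under this assumption the result is roughly 2×10⁵ μC Gy⁻¹ cm⁻³, the order of magnitude stated in the abstract at the top of this file.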
3. In Figure 2i, why does the detector have the same −3 dB frequency under different biases?
Response: We thank the reviewer for raising this concern. The response time of PbS CQDs photodetectors is limited by various factors including drift time, diffusion time, and RC (resistor-capacitor) time. The photocurrent of the device saturates at a low reverse bias (−0.5 V), indicating that the PbS CQDs layer is completely depleted under a bias of −0.5 to −2 V, as shown in Fig. 2d. Therefore, the response time of our PbS CQDs devices is determined by the drift time and RC time. According to a previous report, as the active area of a CQDs device decreases, the response rate significantly increases. Therefore, the RC time primarily limits the response rate of the CQDs photodetector. [doi.org/10.1016/j.matt.2020.12.017] The −3 dB frequency of our CQDs devices is mainly limited by geometrical capacitance rather than bias voltage.
4. In the article, the thickness of the detector is only 900 nm, why not increase the film thickness to enhance X-ray absorption?
Response: We thank the reviewer for raising this concern. 900 nm is a balanced thickness for our detector considering its NIR and X-ray detection performance. We made PbS CQDs photodiodes with different thicknesses of the CQDs layer and added their photoresponses as supplementary Fig. S5. A thicker CQDs layer enhances X-ray and NIR absorption. The high penetration depth of X-rays means that photogenerated carriers are created within or near the depleted region, which facilitates their effective extraction. The photoresponse to X-ray is therefore enhanced by increasing the thickness of the CQDs layer. However, for NIR illumination, the photogenerated carriers are mainly at the surface of the CQDs layer, far from the depletion region, resulting in low extraction efficiency and hence lower performance. Considering these contradictory requirements, 900 nm is the optimal thickness for our device.
5. In Figure 2, the PbS CQD-EDT layer and C60 layer were labeled in energy band diagram (2c), but not in 2a and 2b.

Response:
We are thankful for the reviewer's reminder. We added clear labels in Figure 2a and 2b.
6. Details of the image fusion process need to be added like what weight factor was used.

Response:
We are thankful for the reviewer's reminder. We added the detailed information of the image fusion process in Materials and Methods.

Imaging fusion
The photocurrent matrices under different light sources were 8-bit normalized in a range of 0-1. The imaging matrices were obtained by weighted summation of the normalized photocurrent matrices pixel by pixel. The quality of fused image could be improved by optimizing the weight factors of X-ray, visible and NIR photocurrent matrices. For images in this paper, the optimal weight factors of X-ray, visible and NIR photocurrent matrices were respectively 0.25, 0.125 and 0.625.
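
The weighted summation described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that "8-bit normalized in a range of 0-1" means min-max scaling quantized to 256 levels, and the function names and the toy 2×2 photocurrent values are hypothetical. Only the weight factors (0.25, 0.125, 0.625) come from the text.

```python
def normalize_8bit(matrix):
    """Min-max scale a 2D list to [0, 1], quantized to 256 levels.

    Assumed reading of '8-bit normalized in a range of 0-1'.
    """
    flat = [value for row in matrix for value in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # Flat image: no contrast to preserve.
        return [[0.0 for _ in row] for row in matrix]
    return [[round((value - lo) / span * 255) / 255 for value in row]
            for row in matrix]


def fuse(xray, visible, nir, weights=(0.25, 0.125, 0.625)):
    """Pixel-by-pixel weighted sum of the three normalized matrices."""
    w_x, w_v, w_n = weights
    n_x, n_v, n_n = (normalize_8bit(m) for m in (xray, visible, nir))
    return [
        [w_x * a + w_v * b + w_n * c for a, b, c in zip(row_x, row_v, row_n)]
        for row_x, row_v, row_n in zip(n_x, n_v, n_n)
    ]


# Toy 2x2 photocurrent matrices (made-up values, arbitrary units).
xray = [[0.0, 1.0], [2.0, 4.0]]
visible = [[10.0, 20.0], [30.0, 50.0]]
nir = [[5.0, 5.5], [6.0, 9.0]]

fused = fuse(xray, visible, nir)
for row in fused:
    print([f"{value:.3f}" for value in row])
```

Because the three weights sum to 1 and every pixel shares the same index in all three matrices, the fused values stay in the 0-1 range and no registration step is needed.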

Reviewer #2 (Remarks to the Author):
The authors provide well-quantified measurements that show that a colloidal solid composed of PbS quantum dots can exhibit x-ray, visible, and NIR performance metrics that are comparable or superior to other direct-detection technologies.
1. Abstract: "Image fusion extracts and combines information from multispectral images into a fused image, which is informative and beneficial for human or machine perception. However, currently multiple photodetectors with different response bands are used, which require complicated algorithm and system to solve the pixel and position mismatch problem." The text could use a good grammar edit throughout. For instance, the second sentence of the above should be written "Currently, (you don't need the however) multiple photodetectors with different response regimes are used, which requires complicated algorithms and systems to solve the …" (pluralize "algorithm" and "system"). Even the first sentence is redundant "Image fusion …. into a fused image…."…. Instead, I would suggest "Combining information from multispectral images into a fused image is informative and beneficial for human or machine perception. (or some such)" Anyway, I won't English edit the rest of the paper but suggest you have someone do that (especially, pluralizing the various nouns throughout the paper).

Response:
We are thankful for the reviewer's suggestions. We polished the manuscript and pluralized the various nouns throughout the paper as below.
Combining information from multispectral images into a fused image is informative and beneficial for human or machine perception. Currently, multiple photodetectors with different response bands are used, which require complicated algorithms and systems to solve the pixel and position mismatch problem. An ideal solution would be pixel-level multispectral image fusion (PLMSIF), which involves multispectral image using the same photodetector and circumventing the mismatch problem. Here we presented the potential of PLMSIF utilizing colloidal quantum dots (CQDs) photodiode array, with a broadband response range from X-ray to near infrared (NIR) and excellent tolerance for bending and X-ray irradiation. The CQDs photodiode array showed a specific detectivity exceeding 10¹² Jones in visible and NIR range and a favorable volume sensitivity of approximately 2×10⁴ μC Gy⁻¹ cm⁻³ for X-ray irradiation. To showcase the advantages of PLMSIF, we imaged a capsule enfolding an iron wire and soft plastic, successfully revealing internal information through an X-ray to NIR fused image.
Multi-spectral image fusion is a technique that extracts the most pertinent information from different-wavelength source images into a unified image, with the goal of providing richer and more valuable information for subsequent applications, such as machine vision¹, autonomous vehicles², medical diagnosis³ and other artificial intelligences⁴. Existing approaches for multi-spectral image fusion typically rely on vision algorithms, including multi-scale transformation⁵, deep learning⁶, etc., at the cost of resolution mismatch, overloaded computing resources and complicated systems⁷. With the advancement of photodetectors that have broader response range, pixel-level image fusion can be a more practical approach, where multi-spectral images are captured using just one photodetector. This approach simplifies imaging processes and systems, with the additional benefits of conserving computational resources and reducing energy consumption. For example, traditional InGaAs photodetectors have been modified to broaden their response range from 0.9−1.7 μm to 0.4−1.7 μm for visible-infrared pixel-level image fusion⁸, yielding more informative images in inclement weather.
Pixel-level multi-spectral image fusion (PLMSIF) of X-ray, visible and infrared is highly desired in various areas such as medical imaging⁹, security monitoring¹⁰ and nondestructive testing¹¹. In medical imaging, the X-ray image emphasizes the inorganic skeleton texture, while the visible image supports the assessment of appearance, and the infrared image provides a detailed description of organic tissue structure. Combining X-ray, visible and infrared images into one single image can effectively and comprehensively construct the complete medical atlas, as realized by the traditional approach (Fig. 1a) using three individual photodetectors for X-ray, visible, infrared and then applying a vision algorithm. This system requires complex vision algorithms and extensive computing resources to compensate for the differences in pixel position and resolution between the three types of photodetectors, impeding the development of artificial intelligence in medical imaging. To meet the increasingly active demand for comfortable and real-time medical imaging, wearable and flexible photodetectors also need to be developed to fit irregular biological surfaces and improve comfort level. However, as far as we know, there is no report on one single flexible photodetector capable of capturing X-ray, visible and infrared images to achieve image fusion (Fig. 1b). This new approach is very appropriate for flexible lensless imaging, such as biomedical measurement and medical diagnosis¹².
Various materials such as halide perovskites¹²,¹³, organic semiconductors¹⁴, two-dimensional materials¹⁵,¹⁶ and colloidal quantum dots (CQDs)¹⁷,¹⁸ have emerged, enabling flexible and wide detection range beyond traditional silicon and InGaAs photodetectors. Halide perovskites are ultra-sensitive and have a low detection limit for X-ray and visible detection due to their high absorption coefficient and high μτ product, but they show poor performance for infrared detection owing to their large bandgap¹⁹,²⁰. Organic semiconductors have achieved ultra-low dark current, large linear dynamic range and excellent flexibility but with limited response range and poor X-ray irradiation resistance²¹. Two-dimensional materials such as graphene exhibit fast photoresponse and ultra-broadband response from visible to terahertz, but they are too thin to efficiently absorb X-ray and have limited capacities for imaging array²². PbS CQDs are widely recognized for their excellent visible and infrared photodetection capabilities, which are attributed to their tunable bandgap, high absorption coefficient and low-temperature solution processing²³⁻²⁵. Actually, these materials contain heavy element Pb which is a strong absorber for X-ray because the absorption coefficient of X-ray is proportional to the fourth power of atomic number (Pb, 82). Furthermore, as shown in our manuscript, PbS CQDs exhibit much better X-ray robustness compared to their bulk counterpart. Hence, PbS CQDs are at least one of the best choices for the pixel-level X-ray to infrared image fusion.
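
To give a feel for the fourth-power scaling mentioned in the paragraph above, the sketch below compares Pb with Si and Se on atomic number alone. This is a deliberately crude illustration: it ignores density, photon energy, absorption edges and the fact that the real absorbers are compounds or amorphous films, so the ratios are not absorption-coefficient predictions. The helper name is hypothetical; the atomic numbers are standard values.

```python
# Atomic numbers of the elements being compared.
ATOMIC_NUMBER = {"Pb": 82, "Se": 34, "Si": 14}


def z4_ratio(element, reference):
    """Ratio of Z^4 between two elements (per-atom scaling only)."""
    return (ATOMIC_NUMBER[element] / ATOMIC_NUMBER[reference]) ** 4


for reference in ("Si", "Se"):
    ratio = z4_ratio("Pb", reference)
    print(f"Pb vs {reference}: Z^4 ratio of about {ratio:.0f}")
```

On this per-atom basis Pb exceeds Si by about three orders of magnitude and Se by more than one, which is the intuition behind choosing a Pb-containing absorber.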

Response:
We are thankful for the reviewer's reminder. We corrected the error in Fig. 1. (Line 1, Page 19)
3. Intro, pg 2: "Fusing X-ray, visible and infrared images as one single image could effectively and comprehensively construct the whole medical atlas as realized by the traditional approach (Fig. 1a) using three individual photodetectors for X-ray, visible, infrared and then applying vision algorithm." Utilizing the same pixels for all wavelength bands can make the fused-image formation more computationally straightforward, but you should also comment on any performance costs associated with using the same readout plane. For instance: larger pixels for x-rays needed compared to visible in order to increase detection efficiency because of the far lower photon fluence of the source; secondary electron escape from x-ray-induced photoelectrons if the pixel size is too small; potential loss of NIR and visible image fidelity because of needs of x-ray imager. Is the cost in performance of using a single readout structure sufficiently small that the computational image processing gains compensate?
Response: We thank the reviewer for raising this concern. In this work, we propose a new approach to simplify the complex computational processes during multispectral image fusion. Considering the far lower photon fluence and much weaker convergence of the X-ray source, commercial X-ray imaging systems have a large pixel size and no lens. Similar to the commercial X-ray system, our imaging system also has a large pixel size, which is beneficial to sensitive photoresponses to X-ray, visible and NIR light. If used for an optical camera with a lens, our imaging system would need an expensive large-aperture lens. Hence, our approach to fuse X-ray, visible and NIR images by one single photodetector is appropriate for flexible lens-free imaging, such as biomedical measurement and medical diagnosis [doi.org/10.1038/s41928-019-0354-7].

(Line 6-8, Page 3)
We revised the manuscript accordingly "This new approach could be useful for flexible lens-free imaging, such as biomedical measurement and medical diagnosis¹²."
4. Intro, pg. 4: "Van der Waals interaction between adjacent dots allows slipping of CQDs without broken bonds and new defects under bending state (Fig. 1e), which supports desirable flexibility of CQDs devices." (Just for your information) even if the CQDs are chemically bonded (via oriented attachment for instance), the radius of curvature between neighboring QDs is sufficiently small (for small particles) that large scale macroscopic bending is possible.
Response: We thank the reviewer for raising this discussion. We agree with your viewpoint. We calculated the strain of the bent PbS CQDs film and added the detailed description in the article as below.

(Line 13, Page 4)
We revised the manuscript accordingly "Van der Waals interaction between adjacent dots allows slipping of CQDs without broken bonds and new defects under bending state (Fig. 1e), which supports desirable flexibility of CQDs devices (Supplementary …

Response: We are thankful for the reviewer's reminder. We added the detailed description in the article as below.

Response: We thank the reviewer for raising this concern. In this work, we present the design of a simple large-area imaging system to assess the feasibility of capturing multiple images using a single photodetector. The design of this imaging system mainly refers to the commercial X-ray thin-film-transistor (TFT) detector array. The commercial a-Se flat panel X-ray detectors (e.g. Hologic and ANRAD) typically have over 100 µm pixel size [doi.org/10.3390/qubs5040029]. In order to achieve better X-ray imaging, we designed a larger pixel size of 900 µm to increase X-ray absorption and hence improve the X-ray response. The pixel size can be reduced for higher-resolution lens-free imaging and further for the optical camera with lens. In addition, this lens-free imaging system with large pixel size is very suitable for biomedical …

The XD of the ZnO/PbS CQDs heterojunction is approximately 370 nm at zero bias, 600 nm at −1 V, 770 nm at −2 V, and 900 nm at −3 V. The depletion width in the n-type ZnO (xn) and p-type PbS CQDs (xp) layers can be calculated using the following formula: … The calculated maximum depletion width in the ZnO layer is approximately 90 nm. We experimentally determined the optimal thickness of the ZnO layer to be 120 nm as shown in Fig. R1a. The primary function of NiOx is to act as an electron blocking layer, which can reduce carrier recombination. But its deep valence band maximum forms a hole transport barrier that hinders the extraction of holes as shown in Fig. 2c. The optimal thickness of the NiOx layer is about 40 nm according to the J-V tests (Fig. R1b).

8. Results: pg. 5: "…adequate X-ray absorption. The active layer of PbS CQDs was fabricated by spin-coating with a thickness of ~900 nm." Please define your definition of "adequate x-ray absorption". How does the x-ray response (in whatever metric) vary for a greater or reduced number of layer-by-layer depositions?
Response: We thank the reviewer for raising this concern. The absorption efficiency of PbS to 50 keV X-ray photons versus thickness is shown in Fig. S11a. As the film's thickness increases, the X-ray absorption efficiency steadily increases until it reaches 90% at a thickness of ~400 μm. We made PbS CQDs photodiodes with different thicknesses of the CQDs layer and added their photoresponses as supplementary Fig. S5. A thicker CQDs layer enhances X-ray and NIR absorption. The high penetration depth of X-rays means that photogenerated carriers are created within or near the depleted region, which facilitates their effective extraction. The photoresponse to X-ray is therefore enhanced by increasing the thickness of the CQDs layer. However, the carriers photogenerated by NIR illumination are mainly at the surface of the CQDs layer, which is outside the depletion region, and hence suffer from low extraction efficiency. The photoresponse to NIR is optimal when the CQDs thickness is 900 nm. When the CQDs thickness exceeds the optimal value (~900 nm), incomplete carrier extraction causes a severe drop in EQE to NIR.

(Line 13-15, Page 5)
We revised the manuscript accordingly "The active layer of PbS CQDs was fabricated by spin-coating with a thickness of ~900 nm enabling ~5% X-ray absorption (supplementary Fig. S5 and S11)."

… show that as the CQDs size increases, the carrier extraction is still efficient due to the matched band energy alignment.

Response: We thank the reviewer for raising this concern. The EQE of photodiodes is limited not just by the absorption efficiency of the light-absorbing layer, but also by the extraction efficiency of photogenerated carriers. As shown in Fig … We are working on improving the mobility of our CQD film so that thicker film could be used for better X-ray and NIR detection performance.

Response: We thank the reviewer for raising this concern. The response time of … where ID is the drain current, L and W are the channel length (10 μm) and channel width (180 μm) respectively, VG and VTH are the gate voltage and threshold voltage, and Ci is the capacitance per unit area of the dielectric layer. The mobility of the PbS CQDs film is measured as ~4.63×10⁻³ cm²/V·s (Fig. R4b).

12. … Fig. 3a.

(Line 2-3, 7-8, Page 8)
We revised the manuscript accordingly "…which is higher than some typical semiconductors such as Si and α-Se on account of its large average atomic number. … Bulk PbS and PbS CQDs with higher absorption coefficient than traditional Si and a-Se allow thinner film to achieve adequate X-ray absorption."

… S12). The defect depth of the PbS CQDs film decreases from 0.122 eV to 0.101 eV after X-ray irradiation…… The deep understanding of this positive effect needs further investigation." Yes on the last question, but these are nice measurements. However, why did you limit the stability study to short times (minutes or hours)…. How is the stability over many days or weeks?

Response:
We are thankful for the reviewer's comment. We monitored the photoresponse of PbS CQDs film under X-ray irradiation (5.5 mGy_air s⁻¹) for longer times. We supplemented the stability of PbS CQDs film under X-ray irradiation in the article as below. The photoresponse of 7 PbS CQDs films remains stable under X-ray irradiation for one week.

(Supporting Information)
We revised the supporting information file accordingly.
15. Results, pg. 9: "…and slightly decreases by 5% at bending angle of 60° possibly due to the ITO breaking." Did you ensure that the exposed surface area is the same?
Response: We thank the reviewer for raising this concern. We bent the CQDs photodiode at various angles and then released it to its original flat state for the photoresponse tests. Hence, the exposed surface area is the same in the photoresponse tests. Through morphology characterization as shown in Fig. R5, we observed stripe-like cracks on the surface of the ITO film after 60° bending. We suspect that the slight degradation of device performance was due to ITO damage after 60° bending.

Response:
We are thankful for the reviewer's reminder. We corrected the error in the captions.

(Supporting Information)
We revised the supporting information file accordingly "Fig …

1. The detector structure is standard, and performance is not superior either; many demonstrations have been demonstrated already. Very quick search, we can find PbS QD photodetectors with a responsivity of 373 A/W and a detectivity of 10^13 Jones (Nanotechnology 32 195502) much better than the current manuscript. The X-ray response is stated as "compete well with the reported X-ray direct detectors" but the reference is from 2003. How can it compare with new results such as Nat Commun 9, 2926 (2018)? The possible significance here might be the array structure and X-ray detection with a photodetector device. But an array is just an incremental engineering demonstration, and I have no doubt that previous PbS photodetector devices in literature respond to X-rays as well.
2. 900 nm thickness of PbS is stated to be determined by the diffusion and drift length of photogenerated carriers and adequate X-ray absorption. This statement is very standard, all researchers know such information but how to get 900nm is a mystery. Is it really optimized or simply a one-shot?
Response:
We thank the reviewer for the reminder. We corrected the error in Fig. 1.

(Line 1, Page 19)

3. Intro, pg 2: "Fusing X-ray, visible and infrared images as one single image could effectively and comprehensively construct the whole medical atlas as realized by the traditional approach (Fig. 1a) using three individual photodetectors for X-ray, visible, infrared and then applying vision algorithm." Utilizing the same pixels for all wavelength bands can make the fused-image formation more computationally straightforward, but you should also comment on any performance costs associated with using the same readout plane. For instance: larger pixels for x-rays needed compared to visible in order to increase detection efficiency because of the far lower photon fluence of the source; secondary electron escape from x-ray-induced photoelectrons if the pixel size is too small; potential loss of NIR and visible image fidelity because of needs of x-ray imager. Is the cost in performance of using a single readout structure sufficiently small that the computational image processing gains compensate?
Response: We thank the reviewer for raising this concern. In this work, we propose a new approach to simplify the complex computational processes involved in multispectral image fusion. Because of the far lower photon fluence and much weaker convergence of the X-ray source, commercial X-ray imaging systems have a large pixel size and no lens. Like commercial X-ray systems, our imaging system also has a large pixel size, which benefits the sensitivity of its response to X-ray, visible and NIR light. If used in an optical camera with a lens, our imaging system would need an expensive large-aperture lens. Hence, our approach of fusing X-ray, visible and NIR images with one single photodetector is appropriate for flexible lens-free imaging, such as biomedical measurement and medical diagnosis [doi.org/10.1038/s41928-019-0354-7].

(Line 6-8, Page 3)
We revised the manuscript accordingly: "This new approach could be useful for flexible lens-free imaging, such as biomedical measurement and medical diagnosis 12."

4. Intro, pg. 4: "Van der Waals interaction between adjacent dots allows slipping of CQDs without broken bonds and new defects under bending state (Fig. 1e), which supports desirable flexibility of CQDs devices." (Just for your information) even if the CQDs are chemically bonded (via oriented attachment for instance), the radius of curvature between neighboring QDs is sufficiently small (for small particles) that large scale macroscopic bending is possible.
Response: We thank the reviewer for raising this point. We agree with your viewpoint. We calculated the strain of the bent PbS CQDs film and added a detailed description to the article, as below.
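For readers who want a quick estimate of the bending strain discussed here, a common first-order approximation for a thin film on a flexible substrate is strain ≈ (substrate thickness + film thickness) / (2 × bending radius). The sketch below assumes this simplified formula (neutral plane at mid-thickness, similar elastic moduli); it is not necessarily the calculation used in the supplementary material, and the substrate thickness and bending radius in the example are hypothetical.

```python
def bending_strain(substrate_thickness_um: float,
                   film_thickness_um: float,
                   bend_radius_mm: float) -> float:
    """Approximate surface strain of a film on a bent substrate.

    Uses strain = (t_substrate + t_film) / (2 * R), which assumes the
    neutral plane sits at mid-thickness (simplifying assumption).
    """
    if bend_radius_mm <= 0:
        raise ValueError("bend radius must be positive")
    total_thickness_um = substrate_thickness_um + film_thickness_um
    bend_radius_um = bend_radius_mm * 1000.0
    return total_thickness_um / (2.0 * bend_radius_um)


if __name__ == "__main__":
    # Hypothetical 100 um substrate, 0.9 um CQD film, 10 mm bending radius.
    strain = bending_strain(100.0, 0.9, 10.0)
    print(f"approximate strain: {strain * 100:.2f} %")
```

In this approximation the strain is dominated by the substrate thickness, so the sub-micrometre CQD layer itself contributes little to the total.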

(Line 13, Page 4)
We revised the manuscript accordingly: "Van der Waals interaction between adjacent dots allows slipping of CQDs without broken bonds and new defects under bending state (Fig. 1e), which supports desirable flexibility of CQDs devices (Supplementary …

Response: We thank the reviewer for the reminder. We added a detailed description to the article, as below.

(Line 6-7, Page 6)
We revised the manuscript accordingly: "The device exhibits a low dark current density of 50.9 nA/cm² at −1 V bias and a high rectification ratio of around 1000 at ±1 V bias, where the bandgap of our PbS CQDs is 1.18 eV."

6. Results and Discussion, pg. 4: "The as-prepared flexible 100×100 PbS CQDs photodiode array in the inset of Fig. 2a shows 20×20 mm² active area with 0.9×0.9 mm² pixel area and 0.1 mm pixel pitch patterned by a shadow mask." Why did you choose this pixel size (very large for optical camera image)?
Response: We thank the reviewer for raising this concern. In this work, we present the design of a simple large-area imaging system to assess the feasibility of capturing multiple images using a single photodetector. The design of this imaging system mainly follows that of commercial X-ray thin-film-transistor (TFT) detector arrays. Commercial a-Se flat-panel X-ray detectors (e.g. Hologic and ANRAD) typically have pixel sizes over 100 µm [doi.org/10.3390/qubs5040029]. To achieve better X-ray imaging, we designed a larger pixel size of 900 µm to increase X-ray absorption and hence improve the X-ray response. The pixel size can be reduced for higher-resolution lens-free imaging and, further, for an optical camera with a lens. In addition, this lens-free imaging system with large pixel size is very suitable for biomedical …

… The depletion width X_D of the ZnO/PbS CQDs heterojunction is approximately 370 nm at zero bias, 600 nm at −1 V, 770 nm at −2 V, and 900 nm at −3 V. The depletion widths in the n-type ZnO (x_n) and p-type PbS CQDs (x_p) layers can be calculated using the following formula: …

The calculated maximum depletion width in the ZnO layer is approximately 90 nm. We experimentally determined the optimal thickness of the ZnO layer to be 120 nm, as shown in Fig. R1a. The primary function of NiOx is to act as an electron-blocking layer, which can reduce carrier recombination. However, its deep valence band maximum forms a hole transport barrier that hinders the extraction of holes, as shown in Fig. 2c. The optimal thickness of the NiOx layer, determined through J-V tests, is about 40 nm (Fig. R1b).

Table R1. Parameters of the ZnO and NiOx layers in the optimal PbS CQDs device (column header: Thickness).

8. Results: pg. 5: "…adequate X-ray absorption. The active layer of PbS CQDs was fabricated by spin-coating with a thickness of ~900 nm." Please define your definition of "adequate x-ray absorption". How does the x-ray response (in whatever metric) vary for a greater or reduced number of layer-by-layer depositions?
Response: We thank the reviewer for raising this concern. The absorption efficiency of PbS for 50 keV X-ray photons versus thickness is shown in Fig. S11a. As the film thickness increases, the X-ray absorption efficiency steadily increases until it reaches 90% at a thickness of ~400 μm. We made PbS CQDs photodiodes with different CQDs layer thicknesses and added their photoresponses as supplementary Fig. S5.
A thicker CQDs layer enhances X-ray and NIR absorption. Because of the high penetration depth of X-rays, carriers are photogenerated within or near the depletion region, which facilitates their effective extraction. The photoresponse to X-rays is therefore enhanced by increasing the thickness of the CQDs layer. However, the carriers photogenerated by NIR illumination are mainly at the surface of the CQDs layer, which lies outside the depletion region, and hence suffer from low extraction efficiency. The photoresponse to NIR is optimal when the CQDs thickness is ~900 nm. When the CQDs thickness exceeds this optimal value, incomplete carrier extraction causes a severe drop in EQE for NIR light.
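The thickness dependence of X-ray absorption described above can be sketched with a single-exponential (Beer-Lambert) model. The sketch below is illustrative only: it assumes monoenergetic 50 keV photons and a uniform film, derives an effective attenuation coefficient solely from the ~90% at ~400 μm figure quoted above, and does not model the lower effective density of a CQD film. The function names are introduced here for illustration.

```python
import math


def attenuation_from_reference(thickness_um: float, absorbed: float) -> float:
    """Effective linear attenuation coefficient (1/um) from one data point."""
    if not 0.0 < absorbed < 1.0 or thickness_um <= 0:
        raise ValueError("need 0 < absorbed < 1 and thickness > 0")
    return -math.log(1.0 - absorbed) / thickness_um


def absorbed_fraction(thickness_um: float, mu_per_um: float) -> float:
    """Beer-Lambert absorbed fraction: 1 - exp(-mu * t)."""
    if thickness_um < 0:
        raise ValueError("thickness must be non-negative")
    return 1.0 - math.exp(-mu_per_um * thickness_um)


# Reference point taken from the response text: ~90% at ~400 um (50 keV).
MU_PER_UM = attenuation_from_reference(400.0, 0.90)

if __name__ == "__main__":
    for t_um in (10.0, 100.0, 400.0, 1000.0):
        print(f"{t_um:7.1f} um -> {absorbed_fraction(t_um, MU_PER_UM):.3f}")
```

For films much thinner than the attenuation length, the absorbed fraction grows almost linearly with thickness, which is consistent with the statement that additional CQD layers strengthen the X-ray response.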

(Line 13-15, Page 5)
We revised the manuscript accordingly: "The active layer of PbS CQDs was fabricated by spin-coating with a thickness of ~900 nm enabling ~5% X-ray absorption (supplementary Fig. S5 and S11)."

9. Result, pg. 5: "The energy band alignment of PbS CQDs photodiode in Fig. 2c promotes efficient extraction of photo-generated electrons and holes and reduces recombination at electrodes." Did you study the performance effect of altering the QD size in order to modify the alignment on the valence band? From Fig. 2c, it looks like a slightly smaller QD may improve the alignment.
Response: We thank the reviewer for raising this concern. The energy band structure of PbS CQDs is shown in Fig. R2a …

Response: We thank the reviewer for raising this concern. The EQE of photodiodes is limited not just by the absorption efficiency of the light-absorbing layer, but also by the extraction efficiency of photogenerated carriers. As shown in Fig. R3 … S5). We are working on improving the mobility of our CQD film so that a thicker film can be used for better X-ray and NIR detection performance.

Response: We thank the reviewer for raising this concern. The response time of … where I_D is the drain current, L and W are the channel length (10 μm) and channel width (180 μm), respectively, V_G and V_TH are the gate voltage and threshold voltage, and C_i is the capacitance per unit area of the dielectric layer. The mobility of the PbS CQDs film is measured as ~4.63×10⁻³ cm²/V·s (Fig. R4b).

12. … 'C' means the dark current density calculated from the given dark current and device area in the article.
13. Result, pg. 8: "PbS with higher absorption coefficient allows thinner film to achieve adequate X-ray absorption.". You should mention though that the effective density of your QD film is less than the bulk and the polycrystalline film presumably.

Response:
We thank the reviewer for the comment. We supplemented the description of the effective density of the PbS CQDs film in the article as below. Based on the energy dispersive spectroscopy (EDS) results presented in Table S2, …

(Line 2-3, 7-8, Page 8)

We revised the manuscript accordingly: "…which is higher than some typical semiconductors such as Si and α-Se on account of its large average atomic number. … Bulk PbS and PbS CQDs (supplementary Table S2) with higher absorption coefficient than traditional Si and a-Se allow thinner film to achieve adequate X-ray absorption."

… respond to X-rays as well.

Response:
We thank the reviewer for the appreciation of the main idea of our manuscript: multispectral image fusion with a single detector array is compelling compared with the existing approach using multiple detectors and complicated algorithms, as shown in Fig. R6. We first answer the concerns briefly:
1. This is the first report of pixel-level multi-spectral image fusion by one single sensor. This method could avoid pixel mismatch, overloaded computing resources and complicated systems compared with traditional methods using multiple sensors.
2. Finding a material with a good response from X-ray all the way to infrared is not easy; PbS CQDs are such a carefully chosen material.
3. The performance of our PbS CQD device in both infrared and X-ray detection is among the best in the field.
Please read the detailed response in the following: Multi-spectral image fusion can combine the most valuable information from different- …

… nA/cm²) by employing an all-inorganic ligand and transport layer structure, coupled with meticulous optimization of the device structure, film thickness, and other key parameters. We measured the total current noise spectrum of the PbS CQD photodiodes with a lock-in amplifier, and the corresponding measured detectivity (7.5×10¹² Jones at 1 kHz) is the highest among the reported flexible PbS CQDs photodiodes (Fig. R8).

In general, we demonstrated a flexible PbS CQDs photodiode array with an ultra-broadband response range from X-ray to near infrared that is compatible with silicon-based or flexible TFT readout circuits. Operating at an exceptionally low bias voltage (0-0.1 V), this array demonstrates outstanding performance in detecting X-ray, visible and infrared light, thus satisfying the application requirements for pixel-level multi-spectral image fusion.
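As a reminder of how a detectivity value is obtained from a measured noise current, the sketch below uses the standard definition D* = R·sqrt(A·Δf)/i_n together with the shot-noise-limited estimate D* = R/sqrt(2·q·J_dark). It is a generic illustration, not the authors' analysis script. The pixel area (0.9 × 0.9 mm²) and the dark current density (12.6 nA/cm²) are taken from the text, while the responsivity of 0.5 A/W and the 1 Hz bandwidth are hypothetical placeholders.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19  # exact SI value


def specific_detectivity(responsivity_a_per_w: float, area_cm2: float,
                         bandwidth_hz: float, noise_current_a: float) -> float:
    """D* in Jones (cm Hz^0.5 / W) from a measured noise current."""
    return responsivity_a_per_w * math.sqrt(area_cm2 * bandwidth_hz) / noise_current_a


def shot_noise_limited_detectivity(responsivity_a_per_w: float,
                                   dark_current_density_a_per_cm2: float) -> float:
    """Upper-bound D* in Jones if dark-current shot noise were the only noise."""
    return responsivity_a_per_w / math.sqrt(
        2.0 * ELEMENTARY_CHARGE_C * dark_current_density_a_per_cm2)


if __name__ == "__main__":
    pixel_area_cm2 = 0.09 * 0.09      # 0.9 mm x 0.9 mm pixel (from the text)
    dark_density = 12.6e-9            # A/cm^2 (from the text)
    responsivity = 0.5                # A/W, hypothetical placeholder
    d_shot = shot_noise_limited_detectivity(responsivity, dark_density)
    print(f"shot-noise-limited D*: {d_shot:.2e} Jones")
```

The shot-noise estimate ignores 1/f and thermal noise, so a detectivity derived from the measured total noise spectrum, as done in the response, is the more conservative figure.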

(Line 28-30, Page 8)
We revised the manuscript accordingly: "It should be noted that the volume sensitivity of the device is about 2×10⁵ μC Gy⁻¹ cm⁻³ at the lowest bias voltage of 0~0.1 V, which is comparable with that of the reported flexible X-ray direct detectors using new materials (supplementary Table S3) 34."

2. 900 nm thickness of PbS is stated to be determined by the diffusion and drift length of photogenerated carriers and adequate X-ray absorption. This statement is very standard, all researchers know such information but how to get 900nm is a mystery. Is it really optimized or simply a one-shot?

(Supporting Information)
Response: We thank the reviewer for raising this concern. 900 nm is a balanced thickness for our detector, considering both its NIR and X-ray detection performance. We made PbS CQDs photodiodes with different CQDs layer thicknesses and added their photoresponses as supplementary Fig. S5. A thicker CQDs layer enhances X-ray and NIR absorption. Because of the high penetration depth of X-rays, carriers are photogenerated within or near the depletion region, which facilitates their effective extraction. The photoresponse to X-rays is therefore enhanced by increasing the thickness of the CQDs layer. However, for NIR illumination, the photogenerated carriers are mainly at the surface of the CQDs layer, far from the depletion region, resulting in low extraction efficiency and hence lower performance. Considering these conflicting requirements, 900 nm is the optimal thickness for our device.

3. Basically, I did not learn much new knowledge from this manuscript rather than seeing a fancy demonstration, which is worth publishing but in a specialized journal.

Response:
We thank the reviewer for the valuable remarks. In this work, we demonstrated a flexible PbS CQDs photodiode array with an ultra-broadband response range from X-ray to near infrared, which shows impressive performance: a low dark current density, a high detectivity under visible-near infrared illumination and a comparable sensitivity under X-ray irradiation. The main innovations of this work are as follows.
1. We demonstrated a simple method for pixel-level multi-spectral image fusion by one single sensor for the first time, avoiding pixel mismatch, overloaded computing resources and complicated systems compared with traditional methods. This new approach could be useful for flexible lens-free imaging, such as biomedical measurement and medical diagnosis.
2. This work systematically showed a flexible and broadband PbS CQDs photodiode array for pixel-level image fusion from X-ray to near-infrared. This array achieves the lowest dark current (12.6 nA/cm²) and the highest measured detectivity (7. …

… PbS CQDs have a large specific surface area and are quasi-amorphous, and their surface contains many unsaturated bonds and vacancies (Fig. R10b). The energy of the X-ray photons probably promotes ligand migration and defect annihilation, and therefore leads to enhanced device performance.