Hybrid low-voltage physical unclonable function based on inkjet-printed metal-oxide transistors

Modern society is striving for digital connectivity that demands information security. As an emerging technology, printed electronics is a key enabler for novel device types with free form factors, customizability, and the potential for large-area fabrication while being seamlessly integrated into our everyday environment. At present, information security is mainly based on software algorithms that use pseudo random numbers. In this regard, hardware-intrinsic security primitives, such as physical unclonable functions, are very promising to provide inherent security features comparable to biometrical data. Device-specific, random intrinsic variations are exploited to generate unique secure identifiers. Here, we introduce a hybrid physical unclonable function, combining silicon and printed electronics technologies, based on metal oxide thin film devices. Our system exploits the inherent randomness of printed materials due to surface roughness, film morphology and the resulting electrical characteristics. The security primitive provides high intrinsic variation, is non-volatile, scalable and exhibits nearly ideal uniqueness.

1. page 1 line 15 "high entropy". The paper claims a 28 bit PUF. But there are 8 inverters. The produced bits have 0.45 bias with similar bits on other devices. I guess high entropy is a relative term. You might want to elaborate.
2. Page 1 lines 21 to 29: you are not distinguishing between the true source of randomness vs. something that uses that to produce pseudorandomness (e.g., a cryptographic algorithm). PUF you are describing is in the former category. The "28 bit" PUF does not provide sufficient entropy into the cryptographic algorithms given that there are 8 inverters and 0.45 bias across devices.
3. Page 1 line 32. "PUFs used as hardware security keys". PUFs can be used to derive the keys. A function is not a key. Also the 22% error in your PUF would preclude getting a bit stable key unless lots of post processing is done (and that would be difficult with printed electronics).
4. Page 1 lines 28-29. Lofstrom paper (24) is not a PUF. It does not define its circuit as having an input and therefore is not a function. PUFs came after Gassend 2002 paper which requires input (challenge) and an output with output determined by manufacturing variation. Butterfly PUF (22) is a bistable PUF, not delay PUF. The first delay PUF paper is arguably the Gassend CCS 2002 paper.
5. Page 4 -5 lines 122 to 127. It is unclear how the 8 inverters are expanded to 28-bit challenge using permutations. Might want to describe that more. Also elaborate whether the bits generated is true entropy or pseudo-entropy.
6. Page 6 line 177. It seems like mu_m of 44.5% represent a min entropy reduction of -log(.55) reducing 8 bit entropy to 6.8 bits. This assumes adversary can use other like devices to break another via across device bit correlation.
7. Page 8 line 226. A reliability of 78.5%, meaning error rate > 20% is actually quite high in the silicon PUF world. For identification, this would require an even longer response length for reliable authentication (for a given false positive and false negative goal).
8. Page 8 line 234 "root of trust" usually refers to a bit-stable key. An error rate of 20%+ is difficult to address even with error correction. And on a printed circuit I am not sure how that's going to be addressed. There is a lot of future work here.
9. Page 10 lines 316 to 326. Why use w and t, two different variables. These two eqns look almost the same. Isn't one just 100 minus the other?
However, to address the concern of reviewer 1 and also to include an answer to a similar comment (comment 2.5) of reviewer 2, we have added an additional figure (Figure 1h) to clearly show the challenge-response mechanism and, within that, the digital response (answers point 1) extracted from the inverter output voltages (answers point 3). The variations (answers point 2) are now described in the manuscript in line 93 and are partly shown in Figure 1b, where the surface roughness and thickness of the various layers are discussed. Through these extensions we have now created a more comprehensive picture early in the manuscript (section 2.1 of Results) to address all three points of reviewer 1. The full challenge-response mechanism of the scaled-up system that forms a 28-bit response, including the permutation-based challenge addressing, is further described in the supplementary information.

Changes to the manuscript:
Line 93-100, new, (added): In our hybrid PUF approach, we exploit the implicit random variations caused by the material composition, layer thickness and roughness, as well as interface properties between several printed layers as a source of randomness for hardware security. These variations are reflected in the electrical characteristics of our printed EGTs and corresponding inverter structures. By addressing two inverters (Inv_ak, Inv_bk) simultaneously and comparing their output voltages, one output bit is generated, based on the voltage difference ∆Vout. To enable a comprehensive understanding of the challenge-response mechanism and the corresponding effects that lead to the response bits, we also track the individual inverter output voltages at the comparator input with an analog-to-digital converter (ADC). The full mechanism of the challenge-response generation is shown in Figure 1h. The exact challenge configuration to generate a 28-bit response is further described in the supplementary information. The inverter pair (Inv_ak, Inv_bk) addressing is provided by the permutation-based sub-challenge c_k. The comparator output generates the corresponding sub-response r_k, based on the voltage difference ∆Vout between Inv_ak and Inv_bk.

Comment 1.4:
As there are other works related to printed PUFs in the literature, the authors should provide a qualitative comparison among them (if a quantitative comparison is non-trivial). This is to clearly distinguish this work from the others. At the moment the clarity here needs improvement.

Response 1.4:
We have included a table qualitatively comparing our work with other printed PUFs (Table R1.4), which is included in the revised supplementary information. As the table and our revised introduction show, the conceptual novelty of our work is the realization and comprehensive experimental study, including the implementation, fabrication and integration, of a printed electronic PUF in an embedded hybrid system. Our achievements, compared to the related works, are:
- extracting physical randomness from thin metal oxide devices in a differential circuit design to form a complex, 28-bit PUF response.
- first experimental access to full security metrics such as uniqueness, reliability, bit aliasing, false acceptance rate and false rejection rate, based on hidden variations.
- statistical analysis based on 15 integrated PUF cores and 3300 measured PUF responses.
- the first novel material and manufacturing system that, owing to its complexity, can be compared with silicon PUFs.
Prior studies on printed PUFs focused on standalone output voltage measurements of mostly single-bit functions. Our approach consequently addresses the design of a hybrid PUF starting from the level of printable materials, the formation of the thin film electrical devices and their circuit behavior, up to the embedded system. Only this coherent design, fabrication and characterization approach allows for the understanding and calculation of the security metrics based on a specific challenge-response mechanism. In that context, we believe our analysis could serve as a benchmark for novel material-based PUFs to compare their security metrics in the future. It should be noted that our approach requires a large amount of statistical data and reproducibility of the utilized novel materials and technology. Furthermore, with this work we want to give full insight into the environmental effects influencing PUF operation, such as temperature, humidity and noise, as visualized and discussed in the Results section of the manuscript.
In summary, we hope that our results on the hybrid PUF system motivate further research on security devices and systems based on novel materials and provide a promising exploitation path for the design of secure lightweight identification devices in the IoT.

Changes to supplementary information:
Line 29-36, new: We qualitatively compare this work with other state-of-the-art, experimentally evaluated PE-based PUFs. To the best of our knowledge, no fully verified PE-based PUF including the evaluation of all security metrics has been reported in the literature yet. However, evaluating the PUF security metrics is important to enable a qualitative comparison between different PUF implementations. Nonetheless, currently existing comparable results are listed in Supplementary Table 2. We compare the used printing technology, the type of response generation (electrical or optical), the PUF type (weak or strong PUF), the presented response bit width, as well as experimentally verified PUF security metrics. This includes the uniqueness, reliability, bit aliasing, bit errors, false-acceptance rate (FAR), and false-rejection rate (FRR).

Comment 1.5:
Paper(s) on PUFs were recently published in Nature Electronics (within the last two months). Please do a search and cite these in the introduction to assist motivating the importance and directions in the field.

Response 1.5:
In addition to the already cited references [34,35,38,42,44,45,46], we added the latest (within the last two months) PUF-related publication from Nature Electronics (Gao et al.).

Reviewer #2:
This is a very well written paper with very interesting and well documented results in terms of creating a printed PUF circuit that can serve as a unique intrinsic identifier. The methods and the results are described in great detail that would allow another to replicate the experiment. The material is of definite interest to the research community. There are several areas where this paper can be improved:

Comment 2.1:
Page 1 line 15 "high entropy". The paper claims a 28 bit PUF. But there are 8 inverters. The produced bits have 0.45 bias with similar bits on other devices. I guess high entropy is a relative term. You might want to elaborate.

Response 2.1:
We fully agree that the term 'high entropy' is not well defined in this context and needs clarification. In the context of information theory, entropy relates to the uncertainty of an outcome and is interlinked with randomness. This general definition is in principle applicable to all kinds of uncertainties, e.g. in probability theory, chaos theory, statistics and cryptography, to name but a few. Due to the nature of additive printing processes, ink-substrate interactions, as well as intrinsic material properties based on thin film morphology and interface effects, PE technology is subject to higher variation than silicon-based electronics. This leads to greater uncertainty, in terms of unpredictability, in the circuit characteristics. For that reason, we talk about high entropy in terms of random/uncontrollable variations. In summary, this discussion mainly refers to the physical entropy source.
The NIST specification 800-90 recommends determining the so-called min-entropy to assess the worst-case information entropy of a binary source. The min-entropy metric is based on the distribution of 0s and 1s. The probability of having a 0 is denoted by $p_0$, whereas the probability of a 1 is $p_1$. In general, the min-entropy is calculated according to Equation (1):

$$H_{\min} = -\log_2\big(\max(p_0, p_1)\big) \qquad (1)$$

In the context of PUFs, the min-entropy metric considers the bias of the PUF responses. As shown in Figure 4a in the manuscript, the average bit aliasing, which corresponds to the bias, is $\mu_m$ = 44.5 % for the fabricated and evaluated hybrid PUFs. This leads to a min-entropy of $H_{\min} = -\log_2(0.555) = 0.849$ per bit, which denotes that the actual information content of a 28-bit response is limited to 28 × 0.849 ≈ 23.772 bits. However, for our simulation results we determined a mean bit aliasing value of $\mu_m$ = 49.8 %, which leads to a min-entropy value of $H_{\min} = -\log_2(0.502) = 0.994$. This value indicates the theoretical performance that can be expected for greater sample sizes in our approach.
Finally, we want to summarize that there are various interpretations of the term 'entropy'. As suggested by the reviewer, we will now distinguish between the entropy source (which refers to the variations and materials properties) and the information entropy of the binary PUF responses.
To reduce possible misunderstandings, we changed the wording in the abstract accordingly. Furthermore, we include the calculations as shown above, in the supplementary information and discuss this topic.
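The min-entropy figures quoted in this response can be checked with a few lines of Python. This is only a minimal sketch: the function name `min_entropy_per_bit` is hypothetical, and it assumes, as the reported values suggest, that the logarithm is taken to base 2 and that every response bit shares the same average bias.

```python
import math

def min_entropy_per_bit(p_one: float) -> float:
    """Min-entropy (in bits) of one biased bit with P(1) = p_one."""
    p_max = max(p_one, 1.0 - p_one)  # probability of the more likely symbol
    return -math.log2(p_max)

RESPONSE_BITS = 28  # response width stated in the text

# Bit-aliasing values quoted in the response (experiment and simulation).
for label, bit_aliasing in (("experiment", 0.445), ("simulation", 0.498)):
    h = min_entropy_per_bit(bit_aliasing)
    print(f"{label}: H_min = {h:.3f} per bit, "
          f"{RESPONSE_BITS * h:.2f} bits per {RESPONSE_BITS}-bit response")
```

Multiplying the unrounded per-bit value by 28 gives a total that differs slightly from 23.772, which corresponds to rounding the per-bit value to 0.849 first.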

Changes to the supplementary information:
Line 65 ff., new: 1.

Entropy of the PUF responses
To determine the entropy of random numbers, the min-entropy estimation is widely used, as recommended in the NIST specification 800-90. The min-entropy metric is based on the distribution of 0s and 1s. The probability of having a 0 is denoted by $p_0$, whereas the probability of a 1 is $p_1$. In general, the min-entropy is calculated according to Equation (1):

$$H_{\min} = -\log_2\big(\max(p_0, p_1)\big) \qquad (1)$$

In the context of PUFs, the min-entropy metric considers the bias of the PUF responses. As shown in Figure 4a in the manuscript, the average bit aliasing, which corresponds to the bias, is $\mu_m$ = 44.5 % for the fabricated and evaluated hybrid PUFs. This leads to a min-entropy of $H_{\min} = -\log_2(0.555) = 0.849$, which denotes that the actual information content of a 28-bit response is limited to ≈23.772 bits. However, for our simulation results we determined a mean bit aliasing value of $\mu_m$ = 49.8 %, which leads to a min-entropy value of $H_{\min} = -\log_2(0.502) = 0.994$. This value indicates the theoretical performance that can be expected for greater sample sizes in our approach.

Comment 2.2:
Page 1 lines 21 to 29: you are not distinguishing between the true source of randomness vs. something that uses that to produce pseudorandomness (e.g., a cryptographic algorithm). PUF you are describing is in the former category. The "28 bit" PUF does not provide sufficient entropy into the cryptographic algorithms given that there are 8 inverters and 0.45 bias across devices.

Response 2.2:
As pointed out by the reviewer and explained in Response 2.1, the PUF responses in their current form, given the amount of statistical data, do not hold a high enough entropy for cryptographic applications. As noted in lines 30-31 of the manuscript, the design goal of the hybrid PUF was to investigate PUFs for secure, lightweight identification of IoT devices. In general, PUFs use a natural entropy source that is expected to be random to generate PUF responses. Therefore, true random numbers can be generated by PUFs. However, most physical realizations of PUFs suffer from systematic errors and noise, making it difficult to obtain truly random responses without post-processing, which provides another unwanted attack vector.
The experimentally demonstrated bias (μ = 44.5 %) based on 15 PUF cores for the hybrid PUF and the theoretical bias of μ = 49.8 % based on Python simulations are already promising. In principle, the circuit complexity and architecture, including the addressing and readout circuits, could be optimized for cryptographic applications.

Changes in the manuscript:
Line 190, new: The experimentally obtained bit aliasing of μ = 44.5 % indicates a bias of the PUF responses towards logic '0'. However, the simulated theoretical bit aliasing for the presented hybrid PUF is μ = 49.8 %. This shows that, with the presented approach, the bit aliasing can be improved to provide a close to true random bit sequence suitable for cryptographic applications. The hybrid PUF's response entropy and its capabilities regarding identification are discussed in the supplementary information.

Comment 2.3:
Page 1 line 32. "PUFs used as hardware security keys". PUFs can be used to derive the keys. A function is not a key. Also, the 22% error in your PUF would preclude getting a bit stable key unless lots of post processing is done (and that would be difficult with printed electronics).

Response 2.3:
Thank you for pointing this out; we have refined our language and corrected the sentence accordingly. We agree that reducing the error rates by increasing the reliability is a crucial point to be addressed in future work on PE-based PUFs. This can be achieved with temperature-compensated low-overhead architectures, passivation and encapsulation.

Changes in the manuscript:
Line 249-251, new: Regarding future work in the area of printed PUFs, several points need to be further investigated. As shown in our results, temperature stability needs to be improved to increase reliability and to enable bit-stable PUF responses.

Comment 2.4:
Page 1 lines 28-29. Lofstrom paper (24) is not a PUF. It does not define its circuit as having an input and therefore is not a function. PUFs came after Gassend 2002 paper which requires input (challenge) and an output with output determined by manufacturing variation. Butterfly PUF (22) is a bistable PUF, not delay PUF. The first delay PUF paper is arguably the Gassend CCS 2002 paper.

Response 2.4:
We agree that the challenge-response mechanism, which defines a PUF and its functionality, is not declared in the Lofstrom paper. The work of Gassend et al. [14] shows and discusses the challenge-response functionality for the first time. We therefore decided to replace the reference to the Lofstrom paper [25] with related work on analog PUFs by Yang et al. [24]. We also updated the references for the Butterfly PUF, which is classified as a bistable PUF.

Comment 2.5:
Page 4 -5 lines 122 to 127. It is unclear how the 8 inverters are expanded to 28-bit challenge using permutations. Might want to describe that more. Also elaborate whether the bits generated is true entropy or pseudo-entropy.

Response 2.5:
To show how the 28-bit challenge-response mechanism is generated from 8 inverters, we include an additional figure (Figure 1h) and further show the permutation sequence of a challenge. The inverter inputs and outputs are readdressed in lexicographic order to derive a 28-bit response. Therefore, with M = 8 inverters, each inverter address occurs M-1 times in the response. Furthermore, the challenge-response mechanism is described in more detail in the supplementary information. As elaborated in our response to comment 2.1, we have clarified the term random entropy source, have shown the potential of reaching a min-entropy of 1, and have elaborated on the current experimental results of the hybrid PUF system. The corresponding changes in the manuscript have been listed above (comment 2.1). Since true random numbers are not biased towards logic '0' or '1', the results obtained from our statistical analysis of the hybrid PUF show that the PUF responses are pseudo-random. However, most computer-generated random numbers are generated using (computational) deterministic cryptographic algorithms. In practice, pseudorandom numbers can be sufficient, even for security-critical applications.
At this point we want to note that these issues exist for most PUF designs and are not limited to the hybrid PUF.

Comment 2.6:
Page 6 line 177. It seems like mu_m of 44.5% represent a min entropy reduction of -log(.55) reducing 8 bit entropy to 6.8 bits. This assumes adversary can use other like devices to break another via across device bit correlation.
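The pair addressing described here can be illustrated with a short Python sketch. It assumes that each sub-challenge selects one unordered pair of the 8 inverters and that the pairs are enumerated in lexicographic order; the comparator polarity (bit is 1 when the first inverter has the higher output voltage) and the voltage values are made up for illustration and are not taken from the manuscript.

```python
from collections import Counter
from itertools import combinations

M = 8  # number of inverters stated in the text

# Lexicographically ordered inverter pairs (a_k, b_k), one per sub-challenge c_k.
pairs = list(combinations(range(M), 2))

# How often each inverter address appears across all pairs.
occurrences = Counter(index for pair in pairs for index in pair)

def response(v_out, pairs):
    """Hypothetical comparator model: r_k = 1 if V_out(a_k) > V_out(b_k), else 0."""
    return [1 if v_out[a] > v_out[b] else 0 for a, b in pairs]

# Illustrative inverter output voltages in volts (not measured data).
v_out = [0.52, 0.47, 0.61, 0.50, 0.44, 0.58, 0.49, 0.55]
bits = response(v_out, pairs)

print(len(pairs), "pairs; occurrences per inverter:", set(occurrences.values()))
print("response:", "".join(map(str, bits)))
```

With M = 8 the enumeration yields M(M-1)/2 = 28 pairs, and every inverter address appears in M-1 = 7 of them, which matches the counts given in the response.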

Response 2.6:
We are not sure that we interpret this comment correctly. We agree that a μ of 44.5 % represents a non-ideal value and could result in a security threat. We have included detailed calculations of the min-entropy in our response to comment 2.1, as well as in the supplementary material of the revised manuscript.

Comment 2.7:
Page 8 line 226. A reliability of 78.5%, meaning error rate > 20% is actually quite high in the silicon PUF world. For identification, this would require an even longer response length for reliable authentication (for a given false positive and false negative goal).

Response 2.7:
We fully agree that the reliability at this point is relatively low in comparison to state-of-the-art silicon-based PUFs. This stems from various effects: current Si-based PUF implementations deploy stabilization methods, such as temporal majority voting, to achieve high reliability and uniqueness and low bit-error rates. However, this post-processing does not come without a cost, as helper data is required that stores information for stable PUF response generation. This provides an attackable weak spot in hardware-based security. The presented reliability of the hybrid PUF is based on raw PUF responses only. The reliability values are comparable to, if not better than, those of the first Si-based PUF implementations. One should note that our hybrid PUF presents the first fully evaluated device utilizing emerging technologies such as inkjet printing and metal oxide materials. However, we agree that the reliability could be improved in the future. Possible solutions include proper passivation, encapsulation and research into low-overhead, temperature-stable PE-based architectures. We therefore added a discussion on improving the reliability and highlight future research perspectives regarding inkjet-printed PUFs and their materials. To the best of our knowledge, our work shows the first realization of a PE-based PUF with comprehensive statistics on security metrics based on experimental data. For the time being it is not possible to compare the reliability results with other related works not based on silicon, due to the novelty of our work.
To determine the identification capabilities of the hybrid PUFs, we determine the fuzziness of the PUF responses. To this end, we compute and plot the intra-HD and inter-HD distributions based on experimental and simulation data. Figures R2.7 (a) and R2.7 (b) show the corresponding distributions. The area enclosed between both curves determines the fuzziness of the responses. The area can be split by the identification threshold (th) into the left region, which is also referred to as the false-acceptance rate (FAR), and the right region, the so-called false-rejection rate (FRR). In the ideal case, when the enclosed area is zero, all PUF responses can be distinguished without errors. However, the ideal case is never reached without post-processing, since fabricated raw PUFs are subject to variations induced by the imperfect manufacturing process as well as by changing operating conditions.
Figure R2.7: Intra-HD and inter-HD distributions on the basis of (a) experimental data, and (b) simulation data.
The FAR and FRR values are calculated according to Equation (R2.7-1) and Equation (R2.7-2), respectively:

$$\mathrm{FAR} = \int_{-\infty}^{th} P_{\mathrm{inter}}(x)\,dx \qquad \text{(R2.7-1)}$$

$$\mathrm{FRR} = \int_{th}^{\infty} P_{\mathrm{intra}}(x)\,dx \qquad \text{(R2.7-2)}$$

where $th$ is the identification threshold and $P_{\mathrm{inter}}(\cdot)$ and $P_{\mathrm{intra}}(\cdot)$ are the probability density functions of the inter-HD and intra-HD distributions. Since the FAR and FRR values are typically very small numbers, it is common practice to use the $\log_{10}(\cdot)$ representation. Typical FAR and FRR values used in identification systems range from -3 to -12 (after post-processing). The FAR and FRR values depend on the selected identification threshold value. Two often used approaches to set the identification threshold are (1) to use the intersection point between both distributions and (2) to use the so-called equal-error rate (EER), where FAR=FRR. First, we use the experimental response data and calculate the FAR and FRR values for the former identification threshold. The resulting values are FAR=-2.23 and FRR=-1.71. Furthermore, we compute the values for the EER, which result in FAR=FRR=-1.83. To the best of our knowledge, this is the first assessment of the identification capabilities of a PE-based PUF. Even in the more mature research field of silicon-based PUFs, such detailed statistical evaluations are rare. At this point we want to note that our evaluations are based on raw PUF responses without additional post-processing, such as error correction. For our simulation data, the resulting values are FAR=-2.21 and FRR=-2.68. For the equal-error rate (EER) the values are FAR=FRR=-2.32. The results show that the experimental results are in good agreement with our simulations. Additional post-processing could further improve the identification capabilities of the hybrid PUF and could be tackled in future work.
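As an illustration of how such FAR and FRR values can be obtained from Gaussian fits, the following Python sketch evaluates both rates at a given threshold and locates the equal-error-rate threshold by bisection. The means and standard deviations are placeholders chosen for illustration, not the fitted parameters of the hybrid PUF, and the helper names `far_frr` and `equal_error_threshold` are hypothetical.

```python
import math
from statistics import NormalDist

def far_frr(intra: NormalDist, inter: NormalDist, th: float):
    """FAR: inter-HD probability mass below th. FRR: intra-HD mass above th."""
    far = inter.cdf(th)
    frr = 1.0 - intra.cdf(th)
    return far, frr

def equal_error_threshold(intra, inter, lo, hi, iters=60):
    """Bisection on FAR - FRR, which increases monotonically with th."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        far, frr = far_frr(intra, inter, mid)
        if far < frr:
            lo = mid  # threshold too small: too many false rejections
        else:
            hi = mid  # threshold too large: too many false acceptances
    return 0.5 * (lo + hi)

# Placeholder Gaussian fits of the fractional Hamming distance (assumed values).
intra = NormalDist(mu=0.10, sigma=0.06)
inter = NormalDist(mu=0.50, sigma=0.10)

th_eer = equal_error_threshold(intra, inter, intra.mean, inter.mean)
far, frr = far_frr(intra, inter, th_eer)
print(f"EER threshold = {th_eer:.3f}")
print(f"log10(FAR) = {math.log10(far):.2f}, log10(FRR) = {math.log10(frr):.2f}")
```

The same `far_frr` helper can be evaluated at the intersection point of the two density curves to obtain the first type of threshold mentioned above.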

Changes in the supplementary information:
Line 77, new: 1.6 Hybrid PUF identification
To investigate the identification capabilities of the hybrid PUF, we perform evaluations based on the intra-HD and inter-HD distributions. The intra-HD is a measure of the reproducibility of the PUF responses for a fixed challenge and under the impact of changing operating conditions. The inter-HD indicates the uniqueness of the responses generated by different PUFs. Supplementary Figure 4a shows the intra- and inter-HD distributions for our measured PUF responses under humidity and voltage variations. The solid black line shows the intra-HD Gaussian distribution, whereas the dash-dot red line shows the inter-HD Gaussian distribution. The enclosed area below both lines divides into the two regions to the left and right of the intersection value. The left region is denoted as the false-acceptance rate (FAR), whereas the right one is the false-rejection rate (FRR). To ensure proper identification, both the FAR and FRR should be minimized and the PUF responses should contain enough entropy with respect to the sample size. As there is an overlap between the intra- and inter-HD distributions, some PUFs cannot be distinguished without additional post-processing. The x-value of the intersection is the ideal threshold value to distinguish between PUF responses when applying a binning technique. If the HD between the database entry and the measured response is less than this threshold, the identification is successful. If there is no overlap between both distributions, the identification can be assumed to be error-free, provided the threshold is placed somewhere in between. Supplementary Figure 4b shows the corresponding intra- and inter-HD Gaussian distributions for the simulated data of 150 printed PUF cores. The simulation data used has been generated in our prior work and refers to the worst-case considerations, where a noise level of 10 mV is applied.
The standard deviation of the inter-HD can be expected to decrease for larger sample sizes. The overlap in the plot between both distributions is small, which implies a low identification error of the hybrid PUFs. A three-bit error correction reduces the intra-HD to zero and thereby also eliminates the overlapping area. To further assess the identification capabilities based on the raw PUF responses (without additional error correction), we compute the FAR and FRR values based on experimental and simulation data. The FAR and FRR values are calculated according to Equation (2) and Equation (3), respectively:

$$\mathrm{FAR} = \int_{-\infty}^{th} P_{\mathrm{inter}}(x)\,dx \qquad (2)$$

$$\mathrm{FRR} = \int_{th}^{\infty} P_{\mathrm{intra}}(x)\,dx \qquad (3)$$

where $th$ is the identification threshold and $P_{\mathrm{inter}}(\cdot)$ and $P_{\mathrm{intra}}(\cdot)$ are the probability density functions of the inter-HD and intra-HD distributions. Since the FAR and FRR values are typically very small numbers, it is common practice to use the $\log_{10}(\cdot)$ representation. Typical FAR and FRR values used in identification systems range from -3 to -12 (after post-processing). The FAR and FRR values depend on the selected identification threshold value. Two often used approaches to set the identification threshold are (1) to use the intersection point between both distributions and (2) to use the so-called equal-error rate (EER), where FAR=FRR. We use the experimental response data and calculate the FAR and FRR values for the former identification threshold. The resulting values are FAR=-2.23 and FRR=-1.71. Furthermore, we compute the values for the EER, which result in FAR=FRR=-1.83. To the best of our knowledge, this is the first assessment of the identification capabilities of a PE-based PUF. Even in the more mature research field of silicon-based PUFs, such detailed statistical evaluations are rare. At this point we want to note that our evaluations are based on raw PUF responses without additional post-processing, such as error correction. For our simulation data, the resulting values are FAR=-2.21 and FRR=-2.68. For the EER the values are FAR=FRR=-2.32.
The results show that the experimental results are in good agreement with our simulations. However, additional post-processing could further improve the identification capabilities of the hybrid PUF.

Comment 2.8:
Page 8 line 234 "root of trust" usually refers to a bit-stable key. An error rate of 20%+ is difficult to address even with error correction. And on a printed circuit I am not sure how that's going to be addressed. There is a lot of future work here.
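The threshold-based identification described in this section can be sketched in Python as follows. The device names, the 8-bit toy responses and the threshold of 0.25 are invented for illustration (the hybrid PUF uses 28-bit responses), and `fractional_hd` and `identify` are hypothetical helper names.

```python
def fractional_hd(a, b):
    """Fractional Hamming distance between two equal-length bit sequences."""
    if len(a) != len(b):
        raise ValueError("responses must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def identify(measured, database, threshold):
    """Return the closest enrolled ID if its HD is below threshold, else None."""
    if not database:
        return None
    best_id = min(database, key=lambda dev: fractional_hd(measured, database[dev]))
    if fractional_hd(measured, database[best_id]) < threshold:
        return best_id
    return None

# Toy enrollment database with made-up responses.
database = {
    "puf_A": [0, 1, 1, 0, 1, 0, 0, 1],
    "puf_B": [1, 1, 0, 0, 0, 1, 1, 0],
}

noisy_a = [0, 1, 1, 0, 1, 0, 1, 1]  # puf_A re-measured with one flipped bit
unknown = [1, 0, 1, 1, 1, 1, 0, 0]  # response of a device that was not enrolled

print(identify(noisy_a, database, threshold=0.25))  # intra-HD case
print(identify(unknown, database, threshold=0.25))  # inter-HD case
```

A re-measured response is accepted as long as its intra-HD stays below the threshold, while a response from a different device is rejected when its inter-HD to every enrolled entry is at or above it; the overlap of the two distributions determines how often either decision fails.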

Response 2.8:
Concerning the quantitative analysis of the error rates, please refer to our response to your comment 2.7. In addition, we would like to point out that, from an information security point of view, the "root of trust" describes a source that can always be trusted. Typically, in cryptographic systems the security depends on binary keys to encrypt or decrypt data and to generate digital signatures, to name but a few uses. The root of trust is inaccessible from the outside and guarantees the authenticity of the overall system. Most commonly, the root of trust comprises functionalities for random number generation, key derivation, secure memory, etc. Device-unique keys are often injected into the system during the manufacturing process by an external party, which poses a potential security threat. However, the advantage of injected keys is their perfect reproducibility. On the other hand, PUFs are an option to derive binary keys from an inherent entropy source. In the context of a root of trust, PUFs can provide a higher security level, since no external party is needed to inject device-unique keys. Furthermore, PUFs use intrinsic storage (variations), which mitigates the threats emanating from memory leakage attacks.
PE technology, particularly its unique property of decentralized manufacturing, allows circuits to be fabricated at the manufacturing site and not necessarily at a foundry. In this context, PE-based PUFs can benefit from trusted supply chains, which further enhances the security level of the keys. Consequently, PE-based PUFs can enhance the overall security level of a root of trust by covering secure key derivation and trusted manufacturing.
To address the reviewer's comment regarding possible future work we refer to response 2.3 and