Provable Quantum Advantage in Randomness Processing

Quantum advantage is notoriously hard to find and even harder to prove. For example, the class of functions computable with classical physics coincides exactly with the class computable quantum-mechanically. It is strongly believed, but not proven, that quantum computing provides an exponential speed-up for a range of problems, such as factoring. Here we address a computational scenario of "randomness processing" in which quantum theory provably yields not only a resource reduction over classical stochastic physics, but also a strictly larger class of problems that can be solved. Beyond new foundational insights into the nature and malleability of randomness, and the distinction between quantum and classical information, these results also offer the potential of developing classically intractable simulations with currently accessible quantum technologies.

almost surely after a finite number of coin-flips. Slightly more precisely: for all $p$, the procedure to construct $f(p)$ must define disjoint sets $S_1$ and $S_2$ whose elements are finite strings, such that with probability 1 the sequence of flips produced has exactly one of these strings as an initial segment. For any sequence of flips the output is heads if the initial segment is in $S_1$ and tails if it is in $S_2$, and thus an output is produced almost surely in finite time.
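As a concrete illustration of such prefix sets, the following minimal sketch simulates von Neumann's classic procedure for the constant function $f(p) = 1/2$. It is not taken from the text above: the function names and the simulated coin are illustrative assumptions. Reading the flips in pairs, $S_1$ consists of the strings that end with the first unequal pair being (1, 0), and $S_2$ of those ending with (0, 1).

```python
import random

def p_coin(p, rng):
    """Simulate one flip of a coin with heads probability p (1 = heads)."""
    return 1 if rng.random() < p else 0

def fair_coin_from_p_coin(p, rng):
    """Von Neumann's procedure: flip in pairs until the two flips differ.

    Output heads on (1, 0) and tails on (0, 1); equal pairs are discarded.
    Returns (output, number_of_flips_used).
    """
    flips = 0
    while True:
        a, b = p_coin(p, rng), p_coin(p, rng)
        flips += 2
        if a != b:
            return a, flips

def estimate(p, trials=20000, seed=0):
    """Empirical heads frequency of the constructed coin (hypothetical helper)."""
    rng = random.Random(seed)
    outputs = [fair_coin_from_p_coin(p, rng)[0] for _ in range(trials)]
    return sum(outputs) / trials
```

For any $p$ strictly between 0 and 1 the loop terminates with probability 1, which is the "almost surely in finite time" requirement in miniature.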
The topic of which functions are constructible, how easily they can be constructed, and their applications goes by the name "Bernoulli Factory" [2][3][4][5]. Crucially, in 1995 a theorem of Keane and O'Brien [2] determined the exact set of functions constructible from a classical coin of unknown bias $p$. Loosely speaking, a function $f : S \subseteq [0, 1] \to [0, 1]$ is constructible if and only if (a) it is continuous, (b) it does not touch 0 or 1 within its domain and (c) it does not approach 0 or 1 exponentially quickly at any edge of its domain (see the supplementary materials for a precise statement). While this allows such surprising functions as e cos p, √p, it also rules out important ones such as the "probability amplification" function $f(p) = 2p$, which is central to certain stochastic simulation protocols. Moreover, it says nothing about the resources required to actually construct the functions: often an infeasibly large number of coin-flips is required. The scenario described for the Bernoulli factory shares similarities with a Turing machine; however, it is worth emphasizing that there are differences between it and a fully universal scenario. While both possess a target function to be "computed", the Bernoulli factory, with its unbounded, probabilistic i.i.d. input, is in a sense a simpler, arguably more tractable, model. This makes the Bernoulli Factory an ideal candidate with which to establish quantum-mechanical results that are provably beyond the reach of classical physics.
The central results we present in this work are:
1. Quantum Bernoulli Factories allow the construction of a strictly larger class of functions than allowed in stochastic classical physics.
2. Quantum Bernoulli Factories provide dramatic improvements in terms of resource requirements over a range of classically constructible functions.
The classical Bernoulli Factory (CBF) can be easily described within a quantum-mechanical setting via (arbitrarily many) copies of a qubit prepared in the mixed qubit state

$$\rho_p = p\,|0\rangle\langle 0| + (1 - p)\,|1\rangle\langle 1|$$

for unknown $p \in S \subseteq [0, 1]$. Here the computational basis $\{|0\rangle, |1\rangle\}$ of the two-dimensional qubit Hilbert space $\mathcal{H}$ denotes a fiducial projective measurement that extracts classical data. In contrast, a quantum-mechanical extension of the classical coin states has coherences in this basis. Our goal is to contrast the fundamental processing of such classical randomness with the quantum randomness attainable in a quantum Bernoulli factory (QBF). It should be emphasised, however, that our desired output is still classical. We refer to the quantum-mechanical extension of the coin state as a quoin, and accordingly it is described by the coherent state

$$|p\rangle = \sqrt{p}\,|0\rangle + \sqrt{1 - p}\,|1\rangle.$$

For this state, measurement in the computational basis $\{|0\rangle, |1\rangle\}$ returns the probability distribution $(p, 1 - p)$, and so by restricting to stochastic mixing in this basis, together with algorithmic processing, we see that the classical Bernoulli Factory setting is recovered as a special case of the quantum-mechanical one. In particular, the stochastic mixing available in the classical factory is a special case of the unitary operations available in the quantum setting.

We now demonstrate that the full set of quantum-mechanical operations allows a strictly larger class of functions than allowed classically. A crucial demonstration of the superior power of the quantum-mechanical Bernoulli Factory is given by the probability amplification function $f(p) = 2p$. This function is impossible to construct classically, since it attains the value $f(p) = 1$ for $p = 1/2$, and so traditional work-arounds involve "chopping" the function as it approaches $p = 1/2$ and forming a truncated function, $f(p) = \min(2p, 1 - \epsilon)$ for some fixed $0 < \epsilon < 1$.
This approximate function does satisfy the conditions of the Keane-O'Brien theorem; however, the number of coins needed to produce it scales very poorly with $\epsilon$ (see [6,7] for examples). In contrast, we show that within a QBF it is possible to efficiently construct the classically impossible probability amplification function

$$f_\wedge(p) = \begin{cases} 2p, & p \in [0, 1/2], \\ 2(1 - p), & p \in (1/2, 1], \end{cases}$$

as shown in Fig. 1.

Our method is as follows. The target function admits the alternative representation $f_\wedge(p) = 1 - \sqrt{1 - 4p(1 - p)}$, which in turn possesses an expansion of the form

$$f_\wedge(p) = \sum_{k=1}^{\infty} q_k \,\bigl(4p(1 - p)\bigr)^k, \qquad q_k = \frac{1}{(2k - 1)\,2^{2k}} \binom{2k}{k},$$

where $q_k$ is a probability distribution. Since within the (classical or quantum) Bernoulli Factory we can generate any constant distribution, we first construct an integer output $k$ with probability $q_k$ and then, conditioned on this output, construct the function $g_k(p) = (4p(1 - p))^k$. The functions $\{g_k\}$ are classically inaccessible for all $k > 0$. We also note that $g_k(p) = g_1(p)^k$, and so our task reduces to constructing the $k = 1$ case. This is easily achieved by considering a Bell-basis measurement on two quoins. The probability that we obtain $|\psi^+\rangle\langle\psi^+|$ or $|\phi^-\rangle\langle\phi^-|$ is 1/2; however, the probability of obtaining the outcome $|\psi^+\rangle\langle\psi^+|$, conditioned on obtaining one of these two outcomes, is $4p(1 - p) = g_1(p)$.

Putting everything together, to construct an $f_\wedge(p)$ coin we output an index $k$ with probability $q_k$ and then construct $k$ $g_1(p)$-coins using $O(k)$ quoins. If $k$ outcomes of heads in a row are obtained from the $g_1(p)$-coins then heads is output; otherwise tails is output. This provides an exact construction of the function $f_\wedge$, as claimed.

We can provide a clearer account of this construction by adapting a method in [8], where we represent the above method as a random walk on a ladder [Fig. 2]. One begins at the point marked "Start" and flips a $g_1(p)$-coin to decide where to move next. Once on the ladder, at any vertex we step up the ladder with probability $g_1(p)/2$, or down the ladder with the same probability; otherwise we move across.
If we reach the bottom left corner we output heads, while if we reach the bottom right corner we output tails; this means that if the very first flip is tails we output tails immediately. The probability of outputting heads can be shown to be $f_\wedge(p)$.

The construction that we have provided uses two-qubit measurements in an entangled basis, and so one might think that entanglement is required for any quantum advantage. Surprisingly, this is not the case: the following theorems determine the exact class of functions $f : [0, 1] \to [0, 1]$ that are constructible within a QBF using only single-qubit operations, and are the main results of this work. In fact the only type of operations required for our proofs are the unitaries which construct a coin that gives output one with probability
3. $\forall z \in Z$ there exist constants $c, \delta > 0$ and an integer $k < \infty$ such that
4. $\forall w \in W$ there exist constants $c, \delta > 0$ and an integer $k < \infty$ such that

The proof of Theorem 1 is too long to include here and is provided in the supplementary materials. The main idea is similar to the construction of the probability amplification function, and involves arriving at a convex decomposition in terms of functions that are explicitly constructible using quantum operations on quoins. It is clear that the conditions of the theorem are a natural generalization of the classical case, except that now the function is allowed to go (polynomially quickly) to 0 and 1 at a finite number of points over the interval $[0, 1]$. Moreover, this implies that the scaling of resources within the interior no longer behaves as in the classical case, where a large number of coins is required if the function approaches 0 or 1 at, for example, $p = 1/2$. Instead, the scaling for any point $x \in [0, 1]$ behaves like that at the end-points $p = 0, 1$.
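The probability-amplification construction described earlier (sample an index $k$, then demand $k$ heads from $g_1(p)$-coins) can be checked numerically. The sketch below is a classical simulation under stated assumptions: the quoin is taken to be $\sqrt{p}\,|0\rangle + \sqrt{1-p}\,|1\rangle$, the Bell-measurement statistics are sampled directly rather than produced by a quantum device, and the weights $q_k = \binom{2k}{k} / ((2k-1)4^k)$ are the Taylor coefficients of $1 - \sqrt{1-x}$. For convenience the index $k$ is sampled lazily, one step at a time, which has the same distribution as drawing $k$ up front. All function names are hypothetical.

```python
import random
from math import comb, sqrt

def bell_outcome_probs(p):
    """Bell-basis outcome probabilities for two quoins sqrt(p)|0> + sqrt(1-p)|1>."""
    return {
        "psi+": 2 * p * (1 - p),
        "psi-": 0.0,                  # the two-quoin state is symmetric
        "phi+": 0.5,
        "phi-": (2 * p - 1) ** 2 / 2,
    }

def g1_coin(p, rng):
    """Heads with probability 4p(1-p): repeat until psi+ (heads) or phi- (tails)."""
    probs = bell_outcome_probs(p)
    names, weights = list(probs), list(probs.values())
    while True:
        outcome = rng.choices(names, weights=weights)[0]
        if outcome == "psi+":
            return 1
        if outcome == "phi-":
            return 0

def q(k):
    """Coefficient of x**k in the expansion of 1 - sqrt(1 - x)."""
    return comb(2 * k, k) / ((2 * k - 1) * 4 ** k)

def f_wedge_exact(p):
    return 1 - sqrt(1 - 4 * p * (1 - p))

def f_wedge_series(p, terms=400):
    x = 4 * p * (1 - p)
    return sum(q(k) * x ** k for k in range(1, terms + 1))

def f_wedge_coin(p, rng):
    """Heads iff the first K flips of the g_1 coin are all heads, with K ~ q_k."""
    survival = 1.0                    # P(K >= k)
    k = 1
    while True:
        if g1_coin(p, rng) == 0:
            return 0                  # a tails occurred within the first K flips
        if rng.random() < q(k) / survival:
            return 1                  # K = k and all k flips were heads
        survival -= q(k)
        k += 1
```

Away from $p = 1/2$ the loop in `f_wedge_coin` ends quickly because each $g_1(p)$-coin shows tails with fixed probability; the sketch makes no claim about resource counts on actual hardware.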
One straightforward generalization is that we do not require the target function be defined at all points inside the interval [0, 1], and can allow more extreme behaviours (such as rapidly increasing oscillations or sharp discontinuities) in the functions that we construct. To this end we have the following theorem.
Theorem 2. A function $f : (0, a_1) \cup (a_1, a_2) \cup \ldots \cup (a_n, 1) \to [0, 1]$ is constructible with quoins and a finite set of single-qubit unitaries if $f$ is continuous on its domain and there exists a finite list $\{a_1, a_2, \ldots, a_n\}$, which contains $\{a_1, a_2, \ldots, a_n\}$, and an integer $k$ such that for all $p \in (0, 1)$, where
The proof of Theorem 2 follows a slightly different construction to that of Theorem 1, and is also provided in the supplementary materials. The above two theorems both relate to single-qubit operations and provide a broad class of constructible functions. We conjecture that multi-qubit unitaries do not extend the set of quantumly constructible functions, but do provide additional speed-ups, as is illustrated by the example of the function $g_1$ constructed from Bell measurements.
It is important to note that in addition to extending the class of functions, the quantum-mechanical Bernoulli Factory provides dramatic speed-ups for certain functions that are classically accessible. For example, consider the function $f_\alpha : [0, 1] \to [0, 1]$ given by $f_\alpha(p) = \alpha h_a(p)$ with $0 < \alpha < 1$, which is easily constructed via a convex combination of $h_a$ (which requires just a single quoin) and the function 0. Since the function $h_a$ is inaccessible classically, the construction of the function $f_\alpha$ necessarily requires a rapidly increasing number of classical coins as $\alpha$ tends to 1, in stark contrast to the quantum-mechanical case.
Up to this point we have been concerned with constructing, from a coherent input, a target distribution $f(p)$ as a classical probability distribution. A more sophisticated goal is to construct a coherent output from a coherent input. Specifically, given an unbounded number of input quoins $|p\rangle$, what output quoins $|f(p)\rangle$ can be obtained through arbitrary quantum operations on the input string? This is related to exact sampling tasks of the following form: given a classical algorithm that can efficiently sample bit strings $x_i$ with associated probabilities $p_i$, does there exist an efficient quantum algorithm which outputs the state $\sum_i \sqrt{p_i}\,|x_i\rangle$? This problem has been called q-sampling [9,10], and q-samples constructible using the simple techniques of [9][10][11] form a useful starting point in many modern quantum algorithms [12][13][14][15] and enable q-sampling of various useful distributions [16][17][18]. It should be noted that the question of which efficient classical sampling algorithms allow for creation of an efficient q-sample is remarkably subtle. For instance, it is classically trivial to uniformly sample the $n!$ adjacency matrices corresponding to permutations of the vertex labellings of an $n$-vertex graph. Such a q-sample would easily allow for an efficient quantum algorithm to solve graph isomorphism, but despite attempts from many researchers no efficient procedure has been found.

The distinguishing features of quantum and classical information are subtle, and often well-hidden. Paradigmatic examples have already appeared in single-party cryptography [19], two-party cryptography, and communication complexity [20]. Of arguably broader significance is to determine the computational abilities allowed by quantum physics. Quantum computing does not allow new functions to be constructed, and the speed-ups, whilst strongly supported by evidence, remain unproven.
The work presented here provides a computational scenario in which quantum mechanics has strict superiority over classical physics and, by virtue of requiring only single-qubit manipulations, appears vastly easier to attain experimentally.

Supplementary Methods
In order to prove our two theorems we will need to define certain sets and probability measures. We define Ω as the set of all infinite sequences X = (X 1 , X 2 , X 3 , ...) such that X i ∈ {0, 1}. A flip of "heads" corresponds to the value 1.
We define cylinder sets A probability measure P p takes an open subset of Ω to the probability of it occurring. In the simplest case, that of classical coins with no processing, every value of p ∈ [0, 1], determines a probability measure P p on Ω such that P p (X i = 1) = p for every i.
For any subset S of Ω we define the outer measure, required for when S is not open, as An event is a subset of Ω which is measurable with respect to P p for every p.
The only events we are interested in discussing are those for which we can say whether they occur after some finite number of coin flips. We call these events discernible with respect to a given probability distribution. An event is discernible if the probability that we need to flip infinitely many coins to decide whether it occurs is zero. More formally:
Definition 1: An event $S$ is discernible with respect to a probability measure $P_p$ if $P_p(S) + P_p(S^c) = 1$. This may be the case for some values of $p$ and not others.
In our quantum factory, the ability to apply different unitaries to different coins requires some further definitions.
The total sample space is given by is the set of all ordered n-tuples of infinite {0, 1} strings. Subsets are then products of cylinder sets.
is the set of all ordered n-tuples of infinite {0, 1} strings which begin with the finite strings v 1 , v 2 , ..., v n respectively. Again illustrating with an example, the set is the set of all ordered 3-tuples of infinite {0, 1} strings which begin with 1, 01 and 111 respectively.
We also have different probability distributions in the quantum cases, defined by a value p and finite set of unitary matrices. These unitaries describe the algorithmic freedom we have to process the unknown quantum state, and are determined by the target function f (p). For each Ω i we associate a unitary U i of the form which yields a coin that gives output one with probability when measured in the computational basis.
We call the set of unitaries G = {U i } and so the quantum probability distribution is a function of both p and G.
We are interested in functions where we can construct sets and probability measures such that there is an event that occurs with probability f (p) and is discernible.
is q-constructible if there exists an event $S$ and a probability measure $P_{p,G}$ such that $S$ is discernible with respect to $P_{p,G}$ and $P_{p,G}(S) = f(p)$.
The following definitions will be used for Theorem 1.
3. $\forall z \in Z$ there exist constants $c, \delta > 0$ and an integer $k < \infty$ such that
4. $\forall w \in W$ there exist constants $c, \delta > 0$ and an integer $k < \infty$ such that
We will need some more definitions for Theorem 2.
Definition 5: A function f : [0, 1] → [0, 1] is finitely q-constructible if it is q-constructible and there exists some N such that after observing the first N bits of the string one can say with certainty whether the output will be 0 or 1.
Definition 6: A function $\alpha : \Omega \to \mathbb{N}$ is discernible if $\alpha^{-1}(n)$ is discernible for every $n$.
We now prove Proposition 1, which is the main result needed to adapt the classical result to the quantum case.
Proposition 1: For any SPB function f (p) there exists a q-constructible pair of bounding functions L(p) and U (p) such that These two conditions ensure that will be a good approximation to f (p), which will be an important step later on. We will outline an explicit construction of these bounding functions, which are key to Theorem 1.
Consider some candidates for L(p), U (p) which do not quite work, namely A n (p), B n (p) defined by These bounding functions have a minimum separation of one third rather than one half. This is because some of the modifications we will use will slightly increase this value, meaning we require an initial buffer.
From Bernstein's proof of Weierstrass' approximation theorem [2] we know that A n (p), B n (p) converge uniformly in n to 2 3 f (p) , 1 3 + 2 3 f (p) , and so would be good bounding functions had we the freedom to take n to infinity. They are also manifestly constructible (within even a classical Bernoulli factory). However in the vicinity of the zeros and ones of f (p) we have a problem, as easily illustrated with the example[SupFig.1].
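The difficulty near zeros can be made concrete with a small numerical sketch. The definitions of $A_n$ and $B_n$ are not reproduced above, so the code assumes they are the degree-$n$ Bernstein polynomials of $\tfrac{2}{3} f$ and $\tfrac{1}{3} + \tfrac{2}{3} f$, which is consistent with the stated limits. The target $f(p) = (2p - 1)^2$ is a hypothetical example, not the one shown in the supplementary figure. A Bernstein polynomial of $g$ is classically constructible: flip $n$ coins, count $k$ heads, then output heads with probability $g(k/n)$.

```python
from math import comb

def bernstein(g, n, p):
    """Degree-n Bernstein polynomial of g at p.

    Equals the heads probability of: flip n p-coins, count k heads,
    then output heads with probability g(k / n).
    """
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) * g(k / n)
               for k in range(n + 1))

def f(p):
    # Hypothetical target with a zero at p = 1/2 and ones at p = 0 and p = 1.
    return (2 * p - 1) ** 2

def A(n, p):
    """Assumed candidate lower bound: Bernstein polynomial of (2/3) f."""
    return bernstein(lambda x: (2 / 3) * f(x), n, p)

def B(n, p):
    """Assumed candidate upper bound: Bernstein polynomial of 1/3 + (2/3) f."""
    return bernstein(lambda x: 1 / 3 + (2 / 3) * f(x), n, p)
```

For this $f$ one finds $A_n(1/2) = 2/(3n) > 0 = f(1/2)$ for every finite $n$, so the candidate lower bound overshoots the target at its zero no matter how large $n$ is taken, while away from the zero (for instance at $p = 0.3$) it sits below $f$.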
The strategy for dealing with this will be to multiply A n (p) by a generalized Heaviside-type function (that must itself be q-constructible). We focus on L (p) whenever it is obvious how to define the corresponding procedure for U (p). Defining is performed prior to a computational basis measurement. Bernoulli's inequality [3] (1 + x) n ≥ 1 + nx can be used to show that We can use such functions to modify A n (p), to try and force it to always be a lower bound for f (p). For the same example consider now the product A 100 (p)T 400,5,1/2 (p)T 400,1,3/4 (p): There are a couple of issues we need to be careful about [SupFig.3]: Issue 1. When we "push down" A n (p) we increase its distance from B n (p) , which itself is going to be "pushed up" at other (possibly nearby) locations. So we run the risk of increasing the distance between the lower-and upper-bounding functions to more than 1/2. You can see A(p) being flattened too much around p = 0.5 Issue 2. We want the generalized Heaviside function to go towards zero over a finite width interval around z that is not too narrow -it must include all the problematic points at which A n (p) becomes larger than f (p). You can see this happening around p = 0.75. When this happens we can either increase the interval width or alternatively increase n so that the points where A n (p) crosses f (p) "slide" inwards.
Since h z (z) = 0 and h z (p) is convex we readily see that, denoting by µ i the two solutions to h z (p) = 1/M, any large integer M defines a finite interval I(M, z) := [µ 1 , µ 2 ] that (strictly) contains the point z. The interval width decreases as M is increased, but is finite for all M < ∞. Inside/outside I(M, z) standard theorems for continuous functions ensure that the convergence (in m) to 0 or 1 respectively is uniform.
A series expansion around the point $p = z$ makes clear the reason for conditions 3 and 4 in Definition 1: for functions that rise exponentially slowly from any zero $z$, we would not be able to choose parameters $M$ and $m$ such that we can push $A_n(p)$ below $f(p)$ in the vicinity of $z$.
We now have all ingredients in place to outline a procedure for constructing our lower-bounding function, which will take the form for some appropriate choice of parameters $n, m_i, M_i$. For each zero $z_i \in Z$ we first choose large integers $M_i$ such that we fix non-overlapping intervals $I(M_i, z_i)$ which are smaller than those determined by the associated $\delta$ (see Definition 1) for that $z_i$. We ensure they also do not overlap with the similar intervals $I(M_i, w_i)$ used in the construction of $U(p)$. However, these intervals remain finite so that, as $n$ is increased, uniform convergence ensures that all the problematic points where $A_n(p) \ge f(p)$ end up strictly contained within the $I(M_i, z_i)$, so $L(p) \le f(p)$.
Finally, and critically, we also ensure that This is possible, because f is continuous and the interval contains a zero of f so for sufficiently large n there will be an interval containing z for which U (p) < 1/2. The importance of this condition is that now, even as our lower bounding function is taken close to 0 within the intervals I (M i , z i ) , our upper-bounding function (which converges uniformly to 1 3 + 2 3 f (p) within I (M i , z i )) can be brought within a distance 1/2 from the lower bounding function.
In summary, we now have that $L(p)$ converges uniformly on $[0, 1] \setminus \bigcup_i I(M_i, z_i)$ to $\tfrac{2}{3} f(p)$ and on $\bigcup_i I(M_i, z_i)$ to 0 as the remaining free parameters $n, m_i$ are increased. Similarly we construct $U(p)$ from $B_n(p)$; it converges uniformly on $[0, 1] \setminus \bigcup_i I(M_i, w_i)$ to $\tfrac{1}{3} + \tfrac{2}{3} f(p)$ and on $\bigcup_i I(M_i, w_i)$ to 1. The conditions of Proposition 1 can clearly be met with some suitably large (but finite) choice of the remaining free parameters.
Proof. Let $L_k(p)$ and $U_k(p)$ denote lower- and upper-bounding functions satisfying Proposition 1, for the sequence of SPB functions $f_k(p)$, where Since $U(p)$ and $L(p)$ are finitely q-simulable there exists some $N$ after which we can say with certainty whether they occur or not. If neither occur, the output is heads; if both occur, the output is tails; if only $L(p)$ occurs, simulate the functions again.
This procedure produces an output with probability $1 - U(p) + L(p)$ on each trial and so the probability of this outputting heads is Similarly we have that $f(p)$ can be convexly decomposed Using ancillary randomness to sample an index $k$ with probability $\left(\tfrac{3}{4}\right)^{k-1} \tfrac{1}{4}$, we then construct a $g_k(p)$ coin, and we have shown that any SPB function can be constructed.
To show that any function constructible with quoins and a finite number of unitaries must be SPB is far simpler.
With the exception of the trivial functions f (p) = 0, 1, if a function is constructible then there is some number of flips k such that after k flips the probability of outputting 0 and the probability of outputting 1 are both greater than zero.
The probability of outputting 1 must be less than f (p) but be at least some constant multiplied by the least probable string which in turn must be greater than the expression below.
This yields condition three of Definition 3 and half of condition two. Repeating the process with output one yields condition four and the other half of condition two. Finally, we recall that a sum of continuous functions is a continuous function; since the probability measure on open sets gives continuous functions and the event is made up of such sets, the probability of the event must be a continuous function. Thus we have shown that any function constructible with quoins and a finite number of unitaries must be SPB, and we have completed the proof of Theorem 1.
For the classical Bernoulli factory there are two different approaches which yield the same result. When we applied both approaches to the quantum case we found that they led us in different directions. When emulating the approach due to Keane and O'Brien, the proof of the first theorem arose naturally; however, when we tried to prove the first theorem with the approach due to Wästlund, various complications arose. This led to us proving Theorem 2 instead.
Theorem 2: A function f : (0, a 1 ) ∪ (a 1 , a 2 ) ∪ ... ∪ (a n , 1) → [0, 1] is constructible with quoins and a finite set of single qubit unitaries if f is continuous on its domain and there exists a finite list {a 1 , a 2 , ...a n }, which contains {a 1 , a 2 , ...a n }, and integer k such that for all p ∈ (0, 1), where This second theorem gives stranger functions than the first. At the set of excluded points {a i } the function can undergo a discontinuity or even have no definite value. Were the functions extended over the whole interval (0, 1) we would find we only require that the functions be piecewise continuous. The list {a 1 , a 2 , ...a n } contains all the points at which f (p) is not defined and any points within the domain where f (p) takes the value zero or one. We prove Theorem 2 by showing that we can pinpoint p to within an interval with high confidence and then approximate f (p); by averaging over many such approximations we achieve an exact sample. There are obvious parallels with promise problems.
The proof can be summarised as follows:
• It is shown in Proposition 2 that one can guess which of a set of intervals $p$ lies in with a small error which depends on $p$.
• It is shown in Proposition 3 that we can choose our intervals such that there is a value $q_i$ which differs from the possible values of $f(p)$ in the interval by some small error which depends on $p$.
• It is shown in Proposition 3 that these two results lead to an approximation of $f(p)$ which differs from the true function by a small error which depends on $p$.
• As with Theorem 1, it is shown that if we can create good approximations we can create a sampling algorithm.
Proposition 2: If $(0, a_1) \cup (a_1, a_2) \cup \ldots \cup (a_n, 1)$ is covered by countably many open subintervals $W_i$, $i \in \mathbb{N}$, and $k$ is a positive integer, then there is a discernible function $\alpha : \Omega \to \mathbb{N}$ such that for every $p \in (0, 1)$,
Proposition 2 states that, given a set of intervals covering the domain of the function, we can pinpoint $p$ to within one of them after a finite number of tosses, with the probability that our guess is incorrect being less than $a^k(p)$. This is an essential step in our proof.

Proof
In order to guess which interval we are in we must narrow down a potentially infinite number of intervals to a finite number. To this end we wish to restrict ourselves to a union of closed intervals so that we can invoke compactness.
For each a ∈ {a 1 , a 2 , ...a n } we flip h a i (p)-coins until we have observed nk + 2k + 1 tails. In addition we flip p-coins until we have observed nk + 2k + 1 tails and nk + 2k + 1 heads. For each case we call the number of flips before we see the required results m a i , m 0 for p-coins, and we call the maximum value m max . Now for some a i the probability of the required results occurring after m a i is.
We can use this to rule out a region around $a_i$ of size $2\epsilon$ as follows.
The same procedure can also show that the probability that $p \in (0, \epsilon]$ or $p \in [1 - \epsilon, 1)$ is also below $a^k(p)$. This means we can say that $p \in [\epsilon, a_1 - \epsilon] \cup [a_1 + \epsilon, a_2 - \epsilon] \cup \ldots \cup [a_n + \epsilon, 1 - \epsilon]$ with probability greater than $1 - a^k(p)$. It is possible for some of these regions to be large enough that they merge; this is not a problem. Now that we are restricted to a union of closed intervals we can, by compactness, choose a finite number of the $W_i$ which cover this set. We can also find $\delta > 0$ such that $a^k(p) \ge \delta$ throughout this region. From the finite open cover we can choose a closed interval $F_i \subset W_i$ for each $i$ such that the $\{F_i\}$ cover our union of closed intervals. Now we estimate $p$ by flipping classical coins and calculating the frequency of heads; our estimated value is We do this with $N$ flips, $N$ being chosen such that We now define our discernible function $\alpha$ as $\alpha(f_N \in F_i) = i$. Recall that the $\{F_i\}$ are conditioned on $m_{\max}$, so the function is not as simple as it appears.
We have now shown that such a discernible function exists.
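The frequency-estimation step can be sketched as follows. The condition used above to choose $N$ is not reproduced in the text, so this sketch substitutes Hoeffding's inequality, $P(|f_N - p| \ge \epsilon) \le 2 e^{-2N\epsilon^2}$, as one standard way to pick a sufficient $N$; this is an assumption, not necessarily the authors' choice. The interval list and function names are hypothetical.

```python
import math
import random

def flips_needed(eps, delta):
    """Smallest N with 2 * exp(-2 * N * eps**2) <= delta (Hoeffding bound)."""
    return math.ceil(math.log(2 / delta) / (2 * eps ** 2))

def guess_interval(p, intervals, n_flips, rng):
    """Estimate p by the heads frequency of n_flips classical coins and
    return the index of the first closed interval containing the estimate
    (None if no interval contains it)."""
    heads = sum(rng.random() < p for _ in range(n_flips))
    freq = heads / n_flips
    for i, (lo, hi) in enumerate(intervals):
        if lo <= freq <= hi:
            return i
    return None
```

For example, `flips_needed(0.05, 0.01)` gives the number of flips after which the frequency lies within 0.05 of $p$ except with probability at most 0.01. In the proof the tolerated error must shrink with $a^k(p)$, which this simplified sketch does not model.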
Proposition 3: If $f$ is any continuous function $(0, a_1) \cup (a_1, a_2) \cup \ldots \cup (a_n, 1) \to (0, 1)$, and $l$ is any positive integer, then there is a function $g$ representing the probability of a discernible event such that for every $p$ in $(0, a_1) \cup (a_1, a_2) \cup \ldots \cup (a_n, 1)$,
Proposition 3 uses Proposition 2 to show that we can find constructible functions which are good approximations to $f(p)$. This allows us to complete our proof of Theorem 2 in a manner similar to that for Theorem 1: we sample randomly from a set of approximations to $f(p)$ in such a way that on average we sample $f(p)$.

Proof
Since f is continuous, we can choose countably many subintervals W 1 , W 2 , W 3 , ... covering (0, a 1 ) ∪ (a 1 , a 2 )... ∪ (a n , 1), and numbers q 1 , q 2 , q 3 , ... so that for every p ∈ W i We have shown in Proposition 2 that we can guess the interval with some small error probability. The above statements show that we can choose our intervals such that they have the q values we need. The idea is that we guess our interval, W i , and that we have chosen our intervals such that f (p) can be approximated as q i within them.
In the event that $f$ takes the value zero or one at a point within its domain, we assign a function $q_i(p)$ to an interval containing that point such that for every $p \in W_i$

$$|f(p) - q_i(p)| < a^{l+1}(p)$$

using the techniques from Theorem 1. For ease of notation we will continue the proof assuming this is not necessary. By Proposition 2, there is a discernible function $\alpha : \Omega \to \mathbb{N}$ such that for every $p$, Once $\alpha$ is known, which we know will be in finite time, we then use an event which occurs with probability $q_\alpha$.
Thus the total event has probability Hence for every p, the error |f (p) − g(p)| is bounded by the chance of guessing the wrong interval plus the difference between q i and f (p).
Recalling that and therefore So we have shown the total error is less than the error we are allowed.
We can now complete the proof of Theorem 2 by showing that this ability to create approximations allows us to create an algorithm made from successive approximations.

Proof
Let $f : (0, a_1) \cup (a_1, a_2) \cup \ldots \cup (a_n, 1) \to [0, 1]$ be any continuous function satisfying for some $k$. If we apply Proposition 3 to $f$ with $l = k + 1$ we find that there exists a function $f_1$, representing the probability of a discernible event, such that, for every $p$, We then have Since $f(p) < 1 - a^k(p)$, we see that Similarly, Since $f(p) > a^k(p)$, we see that Combining these gives which can be rearranged as If we repeat this argument, replacing $f(p)$ by $2f(p) - f_1(p)$ and $k$ by $k + 1$, we find that there exists a function $f_2$ which is the probability of a discernible event.
Continuing in this way we can find functions $f_1, f_2, f_3, \ldots$, each representing the probability of a discernible event, such that for every $n$, If we replace the lower and upper bounds by 0 and 1 and divide by $2^n$ we get Letting $n \to \infty$, we obtain

$$f(p) = \sum_{n=1}^{\infty} \frac{1}{2^n} f_n(p).$$

So if we choose a value $n$ with probability $p_n = 1/2^n$ and then sample $f_n(p)$, we have sampled $f(p)$ exactly.
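The final sampling step (choose $n$ with probability $2^{-n}$, then flip an $f_n(p)$-coin) can be sketched as below. The family `toy_f_n` is a purely hypothetical stand-in for the approximating functions produced by Proposition 3, chosen only so that the mixture has a simple closed form, and the coins are simulated directly from known values rather than built from quoins.

```python
import random

def sample_index(rng):
    """Return n >= 1 with probability 2**-n, by counting fair flips
    up to and including the first one that ends the run."""
    n = 1
    while rng.random() < 0.5:
        n += 1
    return n

def mixture_coin(f_n, p, rng):
    """Flip a coin whose heads probability is sum_n 2**-n * f_n(n, p)."""
    n = sample_index(rng)
    return 1 if rng.random() < f_n(n, p) else 0

def toy_f_n(n, p):
    # Hypothetical family: p for odd n, p**2 for even n.
    # The mixture is then (2/3) * p + (1/3) * p**2.
    return p if n % 2 == 1 else p * p

def estimate(p, trials=40000, seed=0):
    rng = random.Random(seed)
    return sum(mixture_coin(toy_f_n, p, rng) for _ in range(trials)) / trials
```

With this toy family, `estimate(0.6)` should be close to $\tfrac{2}{3}(0.6) + \tfrac{1}{3}(0.36) = 0.52$, illustrating that only the weighted average of the $f_n$ needs to equal the target.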
To show that any function constructible with quoins and a finite number of unitaries must obey the conditions of Theorem 2 is far simpler.
With the exception of the trivial functions f (p) = 0, 1, if a function is constructible then there is some number of flips k such that after k flips the probability of outputting 0 and the probability of outputting 1 are both greater than zero.
The probability of outputting 1 must be less than f (p) but be at least some constant multiplied by the least probable string which in turn must be greater than the expression below.
The least probable string after $k$ flips occurs with probability $\min_{a_i} (1 - h_{a_i}(p))^k$. Repeating the process with output zero and combining the two yields
as required.
Finally we recall that a sum of continuous functions is a continuous function and since the probability measure on open sets give continuous functions and the event is made up of such sets the probability of the event must be a continuous function. It should be emphasised that although the extensions of many of these functions are piecewise continuous the functions themselves are always continuous on their domain. This completes the proof of Theorem 2.