Lean4FHE

Fully homomorphic encryption (FHE) is an exciting area of research, but how can we be certain that our implementations of FHE schemes are valid? In this project, we investigated verification in the context of FHE compilation. To begin, we briefly discuss the motivation for studying this topic.
Why verify?
While verification gives arguably the highest level of confidence in program behavior, this level of assurance is not without its costs. In particular, verification of whole programs is complex, and the choice of when to use it is context-dependent.
Good specifications are hard. Making a good specification, and furthermore knowing that a specification is good, is challenging. As an example, if one wanted to verify something as simple as a sorting function, a specification like

`f(l) = l'` where `l'` is sorted and `len(l) = len(l')`

is insufficient, because it does not guarantee that the output list is a permutation of the input list. For this reason, verification is best-suited for areas where desirable properties are well-defined and standardized.

Easy to specify does not mean easy to implement. Ease of specification should not be confused with ease of implementation. Indeed, verification is most valuable when implementations are susceptible to subtle errors. Because verified programs demonstrate correct behavior in all cases, including pathological cases which testing might not capture, devoting effort to verification can be especially useful when these tricky failure modes are harder to detect.
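Returning to the sorting specification above, a small Python sketch makes the gap concrete: the hypothetical function below (`fake_sort` is our own illustrative name) satisfies the weak specification while being blatantly wrong.

```python
def fake_sort(l):
    """Meets the weak specification ("output is sorted" and "lengths match")
    without sorting anything: it discards the input entirely."""
    return [0] * len(l)

out = fake_sort([3, 1, 2])
assert out == sorted(out)          # the output is sorted...
assert len(out) == len([3, 1, 2])  # ...and has the right length...
assert sorted([3, 1, 2]) != out    # ...but is not a permutation of the input
```

A correct specification would additionally require that the output be a permutation (multiset equality) of the input.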
Correctness must matter enough to justify the cost. Finally, because verification is time-consuming and laborious, it is best applied in cases where the correctness of the program is important, to have the greatest possible impact.
Why verify compilation?
Looking at these qualities that a good candidate for verification should have, we see that compilation fits well into this niche:
The specification for a compiler is fairly straightforward: compilation should preserve observable behavior, which is itself easily defined in terms of the semantics of both input and output languages.
At the same time, compilers do a great deal of work while still preserving that behavioral invariant -- analysis, lowering across various representations, and especially optimization. This creates a vast space for potential bugs; detecting them was the goal of the seminal paper Finding and Understanding Bugs in C Compilers. The authors found a variety of errors in widely-used compilers like GCC and LLVM, underscoring both the difficulty of correct compilation and the level of effort required to locate these errors from a compiler-engineering perspective. It is especially noteworthy that these bugs were not caught by the extensive testing infrastructure that exists for the C programming language.
Because compilers are ubiquitous, and their correctness (or lack thereof) has serious implications on the correctness of compiled programs, verifying compilers is clearly attractive from the angle of utility as well.
Verification of a realistic compiler has been accomplished before, with CompCert, a verified optimizing C compiler. In the previously-mentioned bug detection paper, it was assessed along with other C compilers, and notably its middle-end verified layer had no errors whatsoever. We see this as a success story for verified compilation, despite its complexity.
Why verify FHE compilation?
Returning now to FHE compilation, we see similar desiderata for verification:
- Beyond general compilation invariants, security properties are also generally well-understood and have been described formally.
- As in the case of compilation down to machine code, compilers producing and transforming circuits would benefit from assurance that the code implementing these pen-and-paper designs is sound.
- Lastly, because FHE compilation has additional security and privacy considerations, even a verification of adherence to a partial specification would be useful in improving confidence of correctness.
For these reasons, we aim to investigate the feasibility of verifying part of an FHE compilation pipeline in a theorem prover.
Background
We now discuss the relevant background for our project.
Interactive Theorem Provers (ITP)
Interactive theorem provers are a broad class of software which encode a general-purpose logic like ZF set theory or the Calculus of Constructions, and allow a user to write and prove statements in that logic. Many theorem provers, like Lean, Agda, Idris, and Rocq, are based on dependent type theory, a highly-expressive extension of the lambda calculus. Proofs can be automatically checked for correctness, meaning one only needs to trust the definitions themselves, and the kernel of the ITP, in order to attain a high level of confidence in the results.
Lean and Mathlib
Lean is an ITP based on dependent type theory, which we selected over other similar ITPs like Rocq and Agda primarily because of Mathlib, an extensive collection of mathematical definitions and theorems that are already implemented and verified.
While Lean offers robust support for many kinds of mathematics, and automatically checkable proofs, it is important to highlight the areas in which Lean struggles as compared to pen-and-paper proofs. Critically, cryptographic work often models adversaries using probabilistic polynomial-time Turing machines. However, this is triply challenging to work with in Lean, which prefers to reason over discrete spaces, has limited library support for runtime cost analysis, and is somewhat unwieldy to use with Turing machines.
For these reasons, we will focus on properties of fully-homomorphic encryption that can more easily be verified in Lean. In the context of this project, this specifically is the correctness of the translation from plaintext circuit to ciphertext circuit, which allows us to sidestep difficulties with polynomial adversaries. This said, there is interesting recent work on adding support for symbolic security guarantees in Lean, then automatically translating these into proofs of computational security. Investigating FHE in the context of that paper's methods may be an interesting direction for future work.
Fully Homomorphic Encryption (FHE)
Fully homomorphic encryption is a powerful cryptographic technique that allows for arbitrary computations to be performed on encrypted data, all without revealing the underlying plaintext. That is, for a function f, and with encryption and decryption functions enc and dec, it should be possible to obtain a translated function f' such that f(x) = dec(f'(enc(x))). Of note is the fact that FHE has seen deployment by industrial users like Apple, suggesting that although the overhead of FHE is often too high to be practical, use cases are emerging.
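As a toy illustration of the equation f(x) = dec(f'(enc(x))), and emphatically not a secure scheme, consider a shift "cipher" over $\mathbb{Z}_q$ that happens to be homomorphic for addition of a public constant (all names and parameters below are illustrative):

```python
q = 101   # toy modulus
key = 42  # toy "secret key" (this scheme is trivially breakable)

def enc(x):
    return (x + key) % q

def dec(y):
    return (y - key) % q

def f(x):
    """The plaintext function: add a public constant."""
    return (x + 7) % q

def f_prime(c):
    """The translated function, operating directly on ciphertexts."""
    return (c + 7) % q

# The FHE correctness equation holds for every input:
for x in range(q):
    assert f(x) == dec(f_prime(enc(x)))
```

Real FHE schemes make this equation hold for arbitrary computable f, while remaining secure; that combination is what took decades to achieve.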
There are a variety of schemes for FHE, which may operate on different input spaces and have varying performance characteristics. For this post, we will focus on the GSW scheme, despite the fact that other schemes like CKKS tend to see more use. We chose to work with GSW because it is fairly straightforward, and we wanted to explore the feasibility of verifying FHE compilation in the simplest setting initially.
The GSW Encryption Scheme
We now describe the encryption scheme our compiler targets. GSW, introduced by Gentry, Sahai, and Waters in 2013, is one of the simplest fully homomorphic encryption schemes. It encrypts a single bit by hiding it inside a matrix, in such a way that a party with the secret key can extract it but an adversary without it cannot. Addition of ciphertexts is just matrix addition, and multiplication is matrix multiplication preceded by a bit-decomposition step that we describe below.
Ciphertexts as approximate eigenvectors
Fix $n, q \in \mathbb{Z}^{\ge 2}$, and let $\ell = \lceil \log_2(q)\rceil$ and $m = n\ell$. A ciphertext encrypting a message $\mu \in \lbrace 0,1 \rbrace $ is a matrix $C \in \mathbb{Z}_q^{n \times m}$ satisfying
$$\mathbf{s}^\top C = \mu \cdot \mathbf{s}^\top G + \mathbf{e}, \qquad (\star)$$
where:
- $\mathbf{s} \in \mathbb{Z}_q^n$ is the secret key,
- $\mathbf{e} \in \mathbb{Z}_q^m$ is a small noise vector, and
- $G \in \mathbb{Z}_q^{n \times m}$ is the gadget matrix, the fixed public matrix $G = I_n \otimes g^\top$ with $g^\top = \begin{bmatrix} 1 & 2 & 4 & 8 & \dots & 2^{\ell-1} \end{bmatrix}$.
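The invariant $(\star)$ can be checked numerically. The following toy Python sketch (tiny illustrative parameters, a fixed seed, and no attempt at security; it is not the Lean development) constructs a ciphertext satisfying $(\star)$ by hand and verifies that the recovered noise is exactly $\mathbf{e}$:

```python
import random
random.seed(0)  # fixed seed so the sketch is reproducible

q, n = 97, 2                 # toy modulus and dimension (not secure!)
l = 7                        # ceil(log2 97)
m = n * l

g = [1 << i for i in range(l)]   # gadget vector (1, 2, 4, ..., 64)
# Gadget matrix G = I_n ⊗ g^T, an n x m matrix over Z_q.
G = [[g[j - i * l] if i * l <= j < (i + 1) * l else 0 for j in range(m)]
     for i in range(n)]

s = [3, 5]                   # toy secret key; last entry invertible mod q
e = [random.choice([-1, 0, 1]) for _ in range(m)]  # small noise vector
mu = 1                       # the message bit

# s^T G, the "eigenvalue" part of the invariant.
sG = [sum(s[i] * G[i][j] for i in range(n)) for j in range(m)]
target = [(mu * sG[j] + e[j]) % q for j in range(m)]

# Build C satisfying s^T C = mu * s^T G + e (mod q): pick the first row
# at random, then solve for the last row using the inverse of s[-1] mod q.
C = [[random.randrange(q) for _ in range(m)] for _ in range(n - 1)]
inv = pow(s[-1], -1, q)
C.append([(inv * (target[j] - sum(s[i] * C[i][j] for i in range(n - 1)))) % q
          for j in range(m)])

# Check (*): s^T C - mu * s^T G equals the small noise e (mod q).
lhs = [sum(s[i] * C[i][j] for i in range(n)) % q for j in range(m)]
assert all((lhs[j] - mu * sG[j]) % q == e[j] % q for j in range(m))
```

A real key-generation procedure instead samples $C$ from an LWE distribution so that the invariant holds without the encryptor "solving" for it, but the algebra being checked is the same.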
GSW Homomorphic Operations
Addition. Given ciphertexts $C_1, C_2$ encrypting $\mu_1, \mu_2$ respectively, their sum $C_{\text{add}} = C_1 + C_2$ encrypts $\mu_1 + \mu_2$. To see why, we simply add the two instances of $(\star)$:
$$\mathbf{s}^\top C_{\text{add}} = \mathbf{s}^\top (C_1 + C_2) = (\mu_1 + \mu_2) \mathbf{s}^\top G + (\mathbf{e}_1 + \mathbf{e}_2).$$
The result is a valid ciphertext with noise $\mathbf{e}_1 + \mathbf{e}_2$, which remains small as long as the input noises are small.
Multiplication. Multiplication is more delicate. A naive attempt like $C_1 \cdot C_2$ would not preserve the invariant, and would also cause the noise to blow up. Instead, GSW uses the gadget matrix's inverse $G^{-1}$ (not a matrix inverse in the usual sense, since $G$ is not square), which maps each column of its input to its binary decomposition. Concretely, $G^{-1}: \mathbb{Z}_q^{n \times m} \to \lbrace 0,1 \rbrace ^{m \times m}$ satisfies $G \cdot G^{-1}(A) = A$ for any matrix $A$, and its output has small entries (only 0s and 1s).
The homomorphic product is defined as $C_{\text{mult}} = C_1 \cdot G^{-1}(C_2)$. To verify correctness:
$$\mathbf{s}^\top C_{\text{mult}} = \mathbf{s}^\top C_1 \cdot G^{-1}(C_2) = (\mu_1 \mathbf{s}^\top G + \mathbf{e}_1) G^{-1}(C_2) = \mu_1 \mathbf{s}^\top C_2 + \mathbf{e}_1 G^{-1}(C_2).$$
Expanding $\mathbf{s}^\top C_2$, we get:
$$= \mu_1(\mu_2 \mathbf{s}^\top G + \mathbf{e}_2) + \mathbf{e}_1 G^{-1}(C_2) = \mu_1 \mu_2 \mathbf{s}^\top G + \underbrace{\mu_1 \mathbf{e}_2 + \mathbf{e}_1 G^{-1}(C_2)} _{\text{new noise}}.$$
Because $G^{-1}(C_2)$ has entries in $ \lbrace 0,1 \rbrace $, the term $\mathbf{e}_1 G^{-1}(C_2)$ stays controlled: the noise grows, but only by a small amount.
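The identity $G \cdot G^{-1}(A) = A$ is mechanical to check. A minimal Python sketch (toy parameters matching the earlier description; `gadget` and `g_inv` are our own illustrative names, not the Lean definitions):

```python
import random
random.seed(1)

q, n = 97, 2                 # toy parameters
l = 7                        # ceil(log2 97)
m = n * l
g = [1 << i for i in range(l)]

def gadget():
    """The gadget matrix G = I_n ⊗ g^T as an n x m list of lists."""
    return [[g[j - i * l] if i * l <= j < (i + 1) * l else 0
             for j in range(m)] for i in range(n)]

def g_inv(A):
    """Bit-decompose an n x m matrix A into an m x m 0/1 matrix D with G*D = A."""
    rows = []
    for i in range(n):
        for bit in range(l):
            rows.append([(A[i][j] >> bit) & 1 for j in range(m)])
    return rows

A = [[random.randrange(q) for _ in range(m)] for _ in range(n)]
G, D = gadget(), g_inv(A)
recon = [[sum(G[i][k] * D[k][j] for k in range(m)) % q for j in range(m)]
         for i in range(n)]
assert recon == A                                  # G * G^{-1}(A) = A
assert all(x in (0, 1) for row in D for x in row)  # entries are bits
```

The reconstruction is exact because every entry of $A$ is below $2^\ell$, so its $\ell$ binary digits recombine to the original value.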
The corresponding Lean definitions are as follows, where `bitDecompMatrix` is $G^{-1}$:

```lean
def homAdd (c1 c2 : Ciphertext p) : Ciphertext p :=
  ⟨c1.mat + c2.mat⟩

def homMult (c1 c2 : Ciphertext p) : Ciphertext p :=
  ⟨c1.mat * bitDecompMatrix p.l p.q c2.mat p.hl⟩
```

The noise growth problem
Note that bit decomposition does not eliminate noise growth. Each homomorphic multiplication increases the noise, and after enough operations the noise can exceed $q/4$, at which point decryption fails. This limits the depth of circuits that can be evaluated: a fresh ciphertext can support only a bounded number of multiplications before becoming undecryptable.
A scheme that can evaluate circuits up to some fixed depth is called somewhat/leveled homomorphic (SWHE). To achieve fully homomorphic encryption, one can use bootstrapping, which homomorphically evaluates the decryption circuit to refresh a ciphertext's noise.
From Boolean circuits to GSW circuits
Since messages are single bits, one might expect homomorphic addition to give XOR and multiplication to give AND. Multiplication does give AND, since the product of two bits is always a bit. But homomorphic addition operates in $\mathbb{Z}_q$, not $\mathbb{Z}_2$: adding two encryptions of $1$ yields an encryption of $2$, not $0$. So homAdd alone does not give us XOR.
Instead, we build the universal gate NAND: $\text{NAND}(\mu_1, \mu_2) = 1 - \mu_1 \cdot \mu_2$. For binary inputs this always lands in $ \lbrace 0,1 \rbrace $, since $\mu_1 \cdot \mu_2 \in \lbrace 0,1 \rbrace$. Because NAND is functionally complete, any Boolean circuit can be evaluated homomorphically using only this gate, and our compiler targets NAND as its only operation.
```lean
def homNAND (c1 c2 : Ciphertext p) : Ciphertext p :=
  let prod := homMult p c1 c2
  ⟨gadgetMatrix p.n p.l p.q - prod.mat⟩
```

Note that `homNAND` computes $G - C_1 \cdot G^{-1}(C_2)$, where $G$ serves as a noiseless encryption of $1$.
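On plaintext bits, the identity $1 - \mu_1 \mu_2$ does reproduce the NAND truth table, which a one-liner confirms:

```python
# NAND(a, b) = 1 - a*b on bits, matching the usual truth table.
nand = lambda a, b: 1 - a * b
table = {(a, b): nand(a, b) for a in (0, 1) for b in (0, 1)}
assert table == {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 0}
```

Since the inputs are bits, the result always lands back in $\lbrace 0, 1 \rbrace$, so chained NAND gates never leave the message space.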
Noise growth in NAND gates. Since NAND combines an addition and a multiplication, we can ask how much noise it introduces. Suppose both input ciphertexts have noise bounded by $B$ in the infinity norm, and that the messages $\mu_1, \mu_2 \in \lbrace 0,1 \rbrace $ are binary. The NAND noise is the multiplication noise $\mu_1 \cdot \mathbf{e}_2 + \mathbf{e}_1 G^{-1}(C_2)$ (negated, but negation preserves the norm). The addition with the constant $1$ contributes no noise, since $1$ can be encrypted noiselessly as the gadget matrix $G$. By the triangle inequality, the NAND noise is bounded by:
$$|| \mu_1 \cdot \mathbf{e}_2 || _{\infty} + || \mathbf{e}_1 G^{-1}(C_2) || _{\infty} \leq (B + m \cdot B) = (m+1) \cdot B,$$
where the first term uses the fact that $\mu_1 \in \lbrace 0,1 \rbrace $, and the second uses the fact that $G^{-1}(C_2)$ has binary entries. Each NAND gate therefore multiplies the noise by a factor of at most $(m+1)$. For a circuit of depth $d$, the noise after evaluation is at most $(m+1)^d \cdot B_0$, where $B_0$ is the initial noise. Decryption remains correct as long as this quantity stays below $q/4$, which the scheme's parameters must be chosen to satisfy.
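These bounds translate directly into a parameter budget. A quick Python sketch (toy numbers, purely illustrative; `max_depth` is our own name) computes the largest circuit depth a given parameter set supports:

```python
import math

def max_depth(n, q, B0):
    """Largest d with 4 * (m+1)^d * B0 < q, where m = n * ceil(log2 q)."""
    m = n * math.ceil(math.log2(q))
    d = 0
    while 4 * (m + 1) ** (d + 1) * B0 < q:
        d += 1
    return d

# Example: a toy 30-bit modulus with n = 4 supports only a few NAND levels.
print(max_depth(n=4, q=2**30, B0=1))  # prints 4
```

The exponential growth of $(m+1)^d$ is exactly why leveled schemes need either very large moduli or bootstrapping to evaluate deep circuits.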
What is actually proved
Altogether, our Lean development establishes an end-to-end correctness guarantee for the compiler. At the lowest level, we verify that the gadget matrix inverse is a true right inverse ($G \cdot G^{-1}(A) = A$), and that each homomorphic operation (addition, multiplication, and NAND) preserves the approximate-eigenvector invariant $(\star)$. Building on this, we prove a quantitative noise bound: each NAND gate multiplies the noise by at most $(m+1)$, so after a circuit of depth $d$ the noise is at most $(m+1)^d \cdot B_0$ (`compile_bounded`). The top-level theorem `compile_noise_small` then ties everything together: given a Boolean circuit whose depth fits within the parameter budget, and encrypted inputs whose noise is at most $B_0$, the compiled FHE circuit produces a ciphertext that (1) encrypts the same value the plaintext circuit would produce, and (2) has noise $w$ satisfying $4 || w || _{\infty} < q$, which is exactly the condition needed for decryption to succeed. The parameter constraint $4 \cdot (m+1) ^{ d _{\max} } \cdot B_0 < q$ is enforced statically in the `GSWParams` structure, so any circuit that type-checks against these parameters is guaranteed to decrypt correctly. In other words, as summarized in the figure below, the proof covers functional correctness of the compiler (the encrypted computation matches the plaintext one) and the quantitative soundness of the noise budget (the result is actually decryptable), but does not address the semantic security of the encryption scheme itself.
Looking Ahead: Roadmap to Bootstrapping
Our compiler is currently leveled: it handles circuits up to a fixed maxDepth. Bootstrapping would make it fully homomorphic by refreshing ciphertext noise mid-computation. The key steps are:
- Define decryption. Add a `decrypt` function that recovers the plaintext bit by checking whether an entry of $\mathbf{s}^\top C$ is closer to $0$ or $\lfloor q/2 \rfloor$.
- Express decryption as a Boolean circuit. This requires non-trivial sub-circuits, and is the most labor-intensive step. The depth of this circuit directly constrains the GSW parameters (since $4 \cdot (m+1)^{d_{\text{dec}}} \cdot B_0 < q$ must hold), and we have yet to explore how these requirements interact with practical parameter choices.
- Define bootstrapping. Using precomputed encryptions of the secret key bits, feed a noisy ciphertext's bits into the compiled decryption circuit. The output is a fresh ciphertext with noise independent of the original.
- Prove correctness and extend the compiler. Correctness composes `compile_correct` with `decrypt_correct`; the noise bound follows from `compile_bounded`. With bootstrapping available, the compiler can insert refresh operations at regular intervals, removing the depth restriction entirely.
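The rounding decision in the first step is simple to prototype. A hypothetical Python sketch of the rule (the name `round_to_bit` and the parameters are illustrative, not from the Lean development):

```python
q = 97  # toy modulus

def round_to_bit(v):
    """Decode a noisy value in Z_q: 1 if v is closer to q//2 than to 0."""
    v = v % q
    dist_to_zero = min(v, q - v)       # distance on the cycle Z_q
    dist_to_half = abs(v - q // 2)
    return 1 if dist_to_half < dist_to_zero else 0

# Small noise around 0 decodes to 0; small noise around q//2 decodes to 1.
assert round_to_bit(3) == 0
assert round_to_bit(q - 2) == 0        # wraps around to "near 0"
assert round_to_bit(q // 2 + 4) == 1
```

Expressing this comparison as a Boolean circuit on the bits of $v$ is where the real work lies, since comparators and modular arithmetic inflate the circuit depth.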
Conclusion
We presented Lean4FHE, a verified compiler from Boolean NAND circuits to GSW homomorphic encryption circuits, written and proven correct in Lean 4 with Mathlib. Our end-to-end theorem guarantees that for any circuit fitting within the parameter budget, the compiled FHE circuit produces the correct plaintext result and remains decryptable. While the current system is leveled, the codebase is structured to support bootstrapping as a natural extension. We hope this project demonstrates that verified FHE compilation is feasible, and that the techniques developed here can serve as a first step for verified compilation targeting richer FHE schemes.
