Privacy-Preserving Biometric Matching Using Homomorphic Encryption
Gaetan Pradel and Chris Mitchell
2021-11-24

Biometric matching involves storing and processing sensitive user information. Maintaining the privacy of this data is thus a major challenge, and homomorphic encryption offers a possible solution. We propose a privacy-preserving biometrics-based authentication protocol based on fully homomorphic encryption, where the biometric sample for a user is gathered by a local device but matched against a biometric template by a remote server operating solely on encrypted data. The design ensures that 1) the user's sensitive biometric data remains private, and 2) the user and client device are securely authenticated to the server. A proof-of-concept implementation building on the TFHE library is also presented, which includes the underlying basic operations needed to execute the biometric matching. Performance results from the implementation show how complex it is to make FHE practical in this context, but it appears that, with implementation optimisations and improvements, the protocol could be used for real-world applications.

This paper proposes a privacy-preserving biometric-based authentication protocol based on fully homomorphic encryption (FHE), designed for use in the case where the biometric sample for a user is gathered by a local device but matched against a biometric template by a remote server. The goal is to enable this to occur without the remote server, modelled as an honest-but-curious adversary, gaining access to any of the sensitive biometric data. The privacy-preserving and authentication properties of the protocol are formally established. A proof-of-concept C/C++ implementation building on the TFHE library due to Chillotti et al. [2] has also been developed, in which face matching is used as the biometric. Performance results from this implementation are presented. The results of the implementation confirm the difficulty of making FHE practical in such a scenario, but we suspect that, with optimisations and improvements, the protocol could be used for real-world applications. As part of the proof-of-concept, all the elementary operations necessary to execute the protocol using FHE were implemented. Thus, as a side contribution, we have provided a set of elementary arithmetic routines in the ciphertext domain 1 , which could be useful for other prototype implementations.

a) Homomorphic encryption: Homomorphic encryption allows one to perform computations on encrypted data, without ever decrypting it. This enables users to perform operations in untrusted environments. The idea of performing computations on encrypted data was introduced in 1978 by Rivest, Shamir and Adleman [3]. While many homomorphic schemes have been proposed [4]-[7], it was not until 2009 that Gentry presented [8] the first FHE scheme, based on ideal lattices. Gentry's breakthrough rests on a technique called bootstrapping. An FHE scheme based on Gentry's blueprint enables an arbitrary number of additions and multiplications, i.e. any function, to be computed on encrypted data. Since then, many other schemes have been proposed [9]-[14], including schemes not using the bootstrapping technique.
For example, in 2012, Brakerski, Gentry and Vaikuntanathan [15] presented a scheme based on the ring version of the Learning With Errors problem, introduced by Regev [16] . A second type of FHE scheme was introduced by Gentry, Sahai and Waters [17] . This scheme was further improved [18] , [19] , and most recently by Chillotti et al. [2] , [20] . b) Biometric authentication: The use of biometrics for authentication has been discussed for several decades, and has seen growing use. International organisations suggest passwordless 2 systems for authentication, and biometrics can solve this issue. Advances mean that in some circumstances biometric recognition algorithms perform better than humans, even for face recognition [21] . Nonetheless, biometric authentication faces a range of challenges [22] , in particular regarding the protection of users' sensitive data. Biometric data, such as a fingerprint, is fixed for a lifetime, meaning that its use gives rise to significant privacy concerns. Ideally, biometric data should not be processed without protection or anonymisation. Homomorphic encryption offers a possible solution to this problem [22] , as it allows the authentication provider to perform biometric matching on (encrypted) data, while protecting the privacy of sensitive biometric data. c) Related work: The use of homomorphic cryptography in the context of biometric matching is not new [23] , [24] . However, most previous work uses partially homomorphic encryption and not FHE. Some of this work has promising performance results, e.g. Blanton and Gasti [25] who calculate the Hamming distance between two iris feature vectors in only 150 ms. However, because of the additive-only (partially homomorphic) characteristic of the encryption schemes they use, they are not able to evaluate a circuit much more complex than for Hamming distance. Yasuda et al. [26] used a homomorphic scheme that also enables multiplications in the ciphertext domain, but still only compute the Hamming distance between two biometric vectors; moreover, the approach is vulnerable against malicious attackers [27] . Back in 2008, Bringer and Chabanne [28] proposed an authentication protocol based on the homomorphic properties of two partially homomorphic encryption schemes. Biometric matching based on FHE has been previously proposed; perhaps the first example is the private face verification system of Troncoso-Pastoriza et al. [29] . Cheon et al. [30] proposed Ghostshell, a tool that works on iris templates, that is computationally costly. More recently, Boddeti [31] showed how to execute a secure face matching using the Fan-Vercauteren FHE scheme [32] and obtained practical results by packing the ciphertexts in a certain way. d) Structure of the paper: Section II introduces the notions necessary for the rest of the paper. Sections III and IV are the core of the paper, presenting the design and security properties of the protocol. Finally, Sections V and VI give results from the protocol implementation and conclude the paper. II. PRELIMINARIES N, Z, R and B represent the sets of natural numbers, integers, reals and bits, respectively. We next introduce some formal security notions. For more complete versions of Definitions 1-5, see Goldreich [33] . We say a function f : N → R is negligible if for every polynomial p there exists an N such that, for all n > N : Definition 2 (Probability ensemble). Let I be a countable index set. A probability ensemble indexed by I is a sequence of random variables indexed by I. 
Namely, any X = (X i ) i∈I , where each X i is a random variable, is an ensemble indexed by I. for all i. Then X and Y are said to be indistinguishable in polynomial-time if, for every probabilistic polynomial-time algorithm D : B n → B, every polynomial p, and all sufficiently large n: We write X c ≈ Y. Remark 1. We follow common practice and refer to computational indistinguishability instead of indistinguishability in polynomial-time. Then the statistical distance function ∆ : N → R is defined as: Remark 2. If the ensembles X and Y are statistically indistinguishable, then they are also computationally indistinguishable. The converse is not true. Definition 6 (Adversary). An adversary A for a cryptographic scheme is a polynomial-time algorithm (or a set of polynomialtime algorithms) that models a real-world attacker. It is equipped with defined computational resources and capabilities, and is designed to attack the security of the scheme typically as a participant in a security game. Definition 7 (Challenger). A challenger for a cryptographic scheme is a polynomial-time algorithm (or a set of polynomialtime algorithms) that models a real-world instance of the scheme. It is usually assumed to possess unlimited computational resources and capabilities, and is viewed as a 'black box' which responds to queries made by an adversary in a security game. Definition 8 (Security game). A security game models an attack on a cryptographic scheme involving an adversary and a challenger. Definition 9 (Advantage). In the context of a cryptographic scheme and a security game for this scheme, the advantage of an adversary is a function of the probability that the adversary wins the security game that measures the adversary's improvement over random choice. We next formally introduce homomorphic encryption and certain associated notions. For more complete versions of these definitions, see Armknecht et al. [34] . Definition 10 (Homomorphic Encryption scheme). A homomorphic encryption scheme E for a circuit family Π consists of four PPT algorithms (KeyGen, Enc, Dec, Eval) with the following properties. • (sk, pk, evk) ← KeyGen(1 λ ). Given the security parameter λ ∈ N, KeyGen outputs a key triple made up of a secret key sk, a public key pk and an evaluation key evk. The plaintext space M and the ciphertext space M are determined by pk. • m ← Enc(pk, m). Given a public key pk and a plaintext m ∈ M, Enc outputs a ciphertext m ∈ M. • m ⊥ ← Dec(sk, m). Given a secret key sk and a ciphertext m, Dec outputs either the plaintext m ∈ M if m ← Enc(pk, m) or ⊥. • m ← Eval(evk, π, m). Given an evaluation key evk, a circuit π ∈ Π, where Π is a circuit family (see Appendix A for details) and a ciphertext m ∈ M, Eval outputs another ciphertext m ∈ M. Remark 3. Depending on the scheme, the evaluation key evk might be part of, or equal to, the public key pk. For simplicity of presentation, here and throughout we assume that the circuit input to Eval has input size corresponding to the size of the input ciphertext(s). Definition 10, and those below, holds for a range of types of plaintext, including both bit strings and vectors of plaintexts. Some algorithms, such as KeyGen, take as input a security parameter λ, which will be denoted as such throughout this paper unless stated otherwise. This input is usually written in unary representation 1 λ because we want an algorithm that runs in time polynomial in the size of λ to be considered as efficient. 
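For illustration, the four algorithms of Definition 10 can be captured by the following C++ interface sketch. The type and function names are ours, chosen only to mirror the definition, and do not correspond to the API of any particular library.

#include <cstdint>
#include <optional>
#include <vector>

// Illustrative interface for Definition 10; names are ours, not a real library API.
using Plaintext  = std::vector<uint8_t>;   // element of the plaintext space M
using Ciphertext = std::vector<uint8_t>;   // element of the ciphertext space
using Circuit    = std::vector<uint8_t>;   // opaque encoding of a circuit pi in the family Pi
using Key        = std::vector<uint8_t>;

struct KeyTriple { Key sk, pk, evk; };      // (sk, pk, evk) <- KeyGen(1^lambda)

class HomomorphicScheme {
public:
    virtual ~HomomorphicScheme() = default;
    virtual KeyTriple keyGen(unsigned lambda) = 0;
    // c <- Enc(pk, m)
    virtual Ciphertext enc(const Key& pk, const Plaintext& m) = 0;
    // m <- Dec(sk, c), or std::nullopt to model the failure symbol
    virtual std::optional<Plaintext> dec(const Key& sk, const Ciphertext& c) = 0;
    // c' <- Eval(evk, pi, c): apply the circuit pi entirely in the ciphertext domain
    virtual Ciphertext eval(const Key& evk, const Circuit& pi, const Ciphertext& c) = 0;
};

A concrete FHE scheme such as TFHE (used in Section V) instantiates these operations with its own key and ciphertext types; in TFHE the role of the evaluation key is played by the cloud key.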
We refer to the outputs of Enc as 'fresh ciphertexts' and those of Eval as 'evaluated ciphertexts'. Definition 11 (Correctness). Suppose E = (KeyGen, Enc, Dec, Eval) is a homomorphic encryption scheme with security parameter λ. We say E is correct for a circuit family Π if E correctly decrypts both fresh and evaluated ciphertexts, namely, for all λ ∈ N, the following two conditions hold. • Suppose (sk, evk, pk) ← KeyGen(1 λ ). If m ∈ M and m ← Enc(pk, m) then m ← Dec(sk, m). Else ⊥← Dec(sk, m). • For any key triple (sk, evk, pk) ← KeyGen(1 λ ), any circuit π ∈ Π, any plaintext m ∈ M and any ciphertext m ∈ M with m ← Enc(pk, m), if m ← Eval(evk, π, m) then Dec(sk, m ) → π(m). Definition 12 (Indistinguishability under Chosen-Plaintext Attacks security game). Suppose E = (KeyGen, Enc, Dec, Eval) is a homomorphic encryption scheme with security parameter λ. Suppose also that A is a PPT adversary. The indistinguishability under chosen-plaintext attacks (IND-CPA) security game is as follows. 1) A challenger runs (sk, pk, evk) ← KeyGen(1 λ ) and shares pk with A. 2) A generates two distinct plaintexts {m 0 , m 1 } and submits a query to the challenger to request the encryption of one of them with pk. 3) The challenger chooses i ∈ B uniformly at random, computes m ← Enc(m i , pk) and sends m to A. 4) A outputs a pair (m j , m), where j ∈ B, and wins the game if i = j. We denote this security game by IND-CPA A E (1 λ ) and a win in an instance of this security game by IND-CPA A E (1 λ ) = 1. Definition 13 (Advantage for the IND-CPA security game). Suppose E = (KeyGen, Enc, Dec, Eval) is a homomorphic encryption scheme with security parameter λ. Suppose A is an adversary in the IND-CPA security game. The advantage of A with respect to E, denoted Adv E A (λ), is defined to be: We now describe the privacy-preserving biometric matching protocol. In fact we give two descriptions: in §III-A we give an informal introduction, explaining the motivation for the design, and then in §III-B we give a formal description which we use as the basis for the analysis in Section IV. For simplicity of presentation we suppose that the public key pk and the evaluation key evk are equal. We describe a protocol involving two parties, a client C and a server S, where C is acting on behalf of user U . C wishes to access a certain service, not offered by S, which requires an initial authentication of the user U associated with C to S. The process of authentication uses sensitive biometric data such as face images or iris information for U that is gathered by C. If S successfully authenticates U , S sends an ID token τ , to C. C can now use τ to access the requested service. Note that C is trusted by S to correctly gather a fresh biometric sample from U . In the protocol, S verifies that the gathered sample matches the appropriate user template, and also authenticates C to S. Note that the protocol neither provides authentication of S to C nor provides encryption of transferred messages; it is implicitly assumed that these properties are provided by the communications channel, e.g. using a server-authenticated TLS session. In the description below, Step 0 (registration) is performed once before use of the protocol. Steps 1-4 of the protocol are performed every time the user U wishes to be authenticated to S (via C). Step 0: Registration C generates a key pair (sk C , pk C ) for a homomorphic encryption scheme E, and obtains by some means a biometric template t for its associated user U . 
C then encrypts t as t ← Enc(pk C , t) and sends t to S via a trusted channel. S stores t, and subsequently uses it for biometric matching when the protocol is executed (see Step 2). In the remainder of this description we suppose that S, by some means, is assured of the identity of U and that the encrypted biometric template t for U is genuine.

Step 1: Initialisation C takes a fresh biometric sample s from U and, using E, computes an encrypted version s ← Enc(pk C , s) and sends it to S.

Step 2: Construction of the Matching Token
Phase 1: Matching Computation We suppose that S is equipped with a biometric matching function f : M × M → B which inputs a biometric template and a biometric sample and outputs an indication of whether there is a sufficiently close match between them. S evaluates f in the ciphertext domain on the encrypted sample and template, obtaining b ← Eval(pk C , π f , (s, t)), where π f ∈ Π f , the circuit family associated with E which implements f, and where b is the encrypted version of a boolean b indicating the success or not of the biometric matching, i.e. b ← Enc(pk C , f (s, t)). In a naïve version of the protocol, S now sends C the encrypted matching result b; C decrypts it to obtain b ← Dec(sk C , b), and sends b to S. S can now use b to decide whether or not to generate the ID Token τ. For obvious reasons this is not secure (b is not authenticated), and hence we need a slightly more elaborate protocol. In order to enable S to authenticate C, we introduce the notion of a Matching Token, denoted by y. In Phase 2 this token is constructed by S as a function of b (whilst still encrypted) in such a way that S can, when provided by C with a decrypted version of the token in Step 4, (a) verify its authenticity, and (b) determine the value of b.

Phase 2: Signature Computation We suppose S has an implementation, in the ciphertext domain of E, of the function g : B × B λ × B λ → B λ defined by g(b, r 0 , r 1 ) = r b , i.e. g returns r 0 if b = 0 and r 1 if b = 1. S first selects two random values r 0 , r 1 ∈ B λ uniformly at random, and stores them for use in Step 4. S next computes r 0 ← Enc(pk C , r 0 ) and r 1 ← Enc(pk C , r 1 ). In the encrypted domain of E (under pk C ), S now uses b, r 0 and r 1 to compute the encrypted matching token y as y ← Eval(pk C , π g , (b, r 0 , r 1 )), where π g ∈ Π g , the circuit family associated with E which implements g. That is, S obtains y ← Enc(pk C , g(b, r 0 , r 1 )) although, of course, S does not have access to b; i.e. at this stage S does not know whether or not the biometric matching succeeded. S now sends y to C. Note that this part of the protocol requires S to retain the random values r 0 and r 1 until Step 4, and hence the protocol is stateful.

Step 3: Decryption of y C receives y from S and computes y ← Dec(sk C , y). At this point it is still the case that neither C nor S knows whether the biometric matching succeeded. C only possesses a string which looks random, and S cannot decrypt any data encrypted with pk C . C now sends y to S.

Step 4: Authentication of C
Phase 1: Verification S receives y from C, and checks whether it is equal to r 0 or r 1 . If so, S has successfully authenticated C; if not, S rejects C.
Phase 2: Token generation S generates an ID Token τ where τ ← ACCEPT if y = r 1 , and τ ← REJECT otherwise, and sends it to C. As a result, C has a valid ID Token, which can be used to access the desired service, if and only if the biometric matching was successful and S has authenticated C.

We now formally present the protocol, referred to as P. The protocol is summarised in Figure 1, where λ E is the security parameter of E. Protocol initialisation, described immediately below, assumes Step 0 has been successfully completed. Input to C: C has a biometric sample s, and a key pair (sk C , pk C ) generated with a homomorphic encryption scheme E.
This is represented by the tuple (s, sk C ). We denote the plaintext space and the ciphertext space associated with E by M E and M E respectively. Input to S: S has an encrypted biometric template t ← Enc(pk C , t) generated by C in a pre-computation phase. This is represented by the tuple (t). The following functions are used by S. The above two initialisations are expressed formally as P : Common input: Both parties know the homomorphic encryption scheme E and the public key pk C generated by C. Protocol transcript: Take a fresh biometric sample t from U to be used as template; c) t ← Enc(pk C , t). The protocol P is an instance of an interactive proof system, as defined by Menezes et al. [35, Chapter 10] . We next show that P is a proof of knowledge, i.e. it has the properties of completeness and soundness. The following definition is adapted from [35, Chapter 10] . Definition 15 (Completeness). An authentication protocol is complete if, given an honest client C and an honest server S, the protocol succeeds with overwhelming probability (i.e. S accepts C's claim). Theorem 1 (Completeness). The protocol P is complete. Proof: Suppose P is run with an honest client C that sends a validly constructed value s ← Enc(pk C , s) for a sample s to server S. We consider two cases. (a) Suppose the sample s matches the template t, i.e. suppose f (s, t) = 1. Then, by definition, b = Enc(pk C , 1), and thus y = r 1 . Hence, if y ← Dec(sk C , y) then y = r 1 . Thus, S accepts C. (b) Suppose the sample s does not match the template t, i.e. suppose f (s, t) = 0. Then, by definition, b = Enc(pk C , 0), and thus y = r 0 . Hence, if y ← Dec(sk C , y) then y = r 0 . Thus, S does not accept C. That is, S accepts C if and only if the sample s matches the template t. The following definition is adapted from [35, Chapter 10] . Definition 16 (Soundness). An authentication protocol is sound if there exists an expected polynomial time algorithm A with the following property: if a dishonest client C (impersonating C) can with non-negligible probability successfully execute the protocol with S, then A can be used to extract from C knowledge (essentially equivalent to C's secret) which with non-negligible probability allows successful subsequent protocol executions. We first need the following preliminary result. Lemma 1. Suppose a client C * engages in the protocol P with the server S, using sample s C * , and that S accepts C * . It follows that: (a) the sample s C * matches the template t held by S; (b) C * has access to the value r 1 chosen by S in Step 2b of P. Proof: (a) Since S accepts C * , it immediately follows from Theorem 1 that the sample s C * matches the template t. (b) In Step 4 of P, S accepts C * if and only if the value y sent by C * to S in Step 3 equals r 1 . The result follows. We can now give our main result. Theorem 2. The protocol P is sound. Proof: Suppose P is run with a dishonest client C , impersonating an honest client C, that sends a validly constructed value s C ← Enc(pk C , s C ) for a sample s C to server S in Step 1 of P. Suppose also that there is a nonnegligible probability that C is accepted. We need to establish that C can, with non-negligible probability, engage in further successful protocol executions with S. Since S accepts C in the protocol execution with nonnegligible probability, by Lemma 1 we know that C with nonnegligible probability has access to r 1 , which was provided to C in encrypted form in Step 2 of P. 
Hence C must have access to an oracle O that, given an input encrypted using C's public key, with non-negligible probability returns its decrypted version. Assume a subsequent instance of the same protocol P. Step 1, C uses the sample s * C = s C , computes s * C using the public key of C, and sends it to S. Step 2 is executed as specified by S, where the two random values chosen by S are denoted by r * 0 and r * 1 . Clearly s * C matches t (from (a) above), and hence the value y * sent to C will satisfy y * = r * 1 . Step 3, C uses oracle O which will, with nonnegligible probability, correctly decrypt y * ; that is, the value y * output by O will satisfy y * = r * 1 with nonnegligible probability. C then sends y * to S. Step 4, since y * = r * 1 with non-negligible probability, S will accept C with non-negligible probability. That is, there exists a PPT algorithm A, using O as a subroutine, that for any instance of P can be used to arrange that C will be accepted by S with non-negligible probability. We suppose the protocol P is carried out in the real world between a challenger and an adversary. In the real world, adversaries can play the role of the client or the server. We suppose adversaries are static, i.e. they cannot change their role within an instance of the protocol, and cannot play both roles at the same time. We distinguish between two classes of adversary: • Honest-but-curious adversaries execute a protocol honestly, although 'on the side' they can make any other calculations with the purpose of obtaining information to which they are not entitled. • Malicious adversaries execute a protocol in ways not permitted in the specification, perform any calculations, and use any means to obtain information. In our setting, we model an honest-but-curious adversary. One of the main goals of P is to give C (and U ) assurance regarding the privacy of biometric data shared with S, i.e. all samples and templates. As we next show, this property relies on the IND-CPA security (see Definition 14) of the homomorphic encryption scheme. Definition 17 (Privacy-preserving). If a biometric authentication protocol preserves the privacy of the biometric data of the client against a honest-but-curious adversary (a server or external party), then the protocol is privacy-preserving. Definition 18 (Privacy-preserving game). Suppose the precomputation phase of the protocol P is run with an honest client C that sends a validly constructed encrypted value s ← Enc(pk C , s) for template t ← Enc(pk C , t) to server S. Suppose also that A is a PPT adversary. The privacypreserving game is as follows. 2) The challenger encrypts the two samples as s 0 ← Enc(pk C , s 0 ) and s 1 ← Enc(pk C , s 1 ). 3) The challenger sends {s 0 , s 1 , t} to A. 4) A outputs a pair (s j , t), where j ∈ B, and wins the game if i = j. We denote this security game by PRI-PRE A P (1 λ ), where λ is the security parameter of the homomorphic encryption scheme E and a win in an instance of this security game by PRI-PRE A P (1 λ ) = 1. Definition 19 (Advantage for the PRI-PRE game). Suppose that P, λ and A are as in Definition 18. The advantage of the adversary with respect to P, denoted by Adv P A (λ), is defined to be: Definition 20 (PRI-PRE security). Suppose P, A and λ are as in Definition 19. If the advantage Adv P A (λ) for A in the PRI-PRE game is negligible, then P is PRI-PRE, i.e. privacypreserving. Theorem 3. The protocol P is privacy-preserving. Proof: Suppose that the protocol P is not privacypreserving, i.e. 
by Definition 20, there exists an adversary A that has a non-negligible advantage in the privacy-preserving game. By definition this means that A has a distinguisher D that distinguishes, with non-negligible probability, which of two encrypted samples s 0 and s 1 will match an encrypted template t. We next construct an adversary B against the IND-CPA security of E. Suppose B generates a triple of values (s, s , t) satisfying f (s, t) = 1 and f (s , t) = 0. B now submits the pair (s, s ) to a challenger in the IND-CPA security game. B receives back from the challenger the ciphertext s * , where s * equals either s or s (with equal probability). B first computes s and t from s and t, and then runs the distinguisher D with inputs s * and s as the encrypted samples and t as the encrypted template. If D returns s * (which we call event e X ), then B outputs (s, s * ) in the IND-CPA game. If D returns s (which we call event e Y ), then B outputs (s , s * ) in the IND-CPA game. To evaluate the probability that B wins the game, we consider two cases. • Suppose s * = s (event e A which has probability 0.5). Then the two encrypted samples s * and s submitted to D both match the template. Hence the probability that D will return s * (event e X ) = the probability it returns s (event e Y ) = 0.5. • Suppose s * = s (event e B which also has probability 0.5). Then of the two encrypted samples s * and s submitted to D, only s will match the template. Hence the probability that D will return s (event e Y ) is 0.5 + p, where p > 0 is non-negligible (this follows since D is a distinguisher). Hence we have: Pr(e A ∧ e X ) = Pr(e A )Pr(e X ) = 0.5 2 = 0.25; and Pr(e B ∧ e Y ) = Pr(e B )Pr(e Y ) = 0.5(0.5 + p) = 0.25 + 0.5p. If e X occurs then, by assumption, B outputs (s, s * ) in the IND-CPA game. The probability this wins is simply Pr(e A |e X ). Similarly, if e Y occurs then the probability of B winning is Pr(e B |e Y ). Hence, since events e X and e Y are mutually exclusive, the probability that B wins the game is: By definition the advantage for B is 2(0.5(1 + p)) − 1 = 2p, which is non-negligible since p is non-negligible. This contradicts the assumption that E is IND-CPA secure, and hence P is privacy-preserving. We next show that Steps 2-4(a) of P constitute a secure authentication protocol. We follow the approach of Boyd et al. [36] , based on the Bellare-Rogaway model [37] , adapting a proof of Blake-Wilson and Menezes [38] . We first give an informal definition of entity authentication. Definition 21 (Menezes et al. [35] ). Entity authentication is the process whereby one party is assured (through acquisition of corroborative evidence) of the identity of a second party involved in a protocol, and that the second has actually participated (i.e. is active at, or immediately prior to, the time the evidence is acquired). Steps 2-4(a) of P by design constitute a unilateral entity authentication protocol, i.e. only C authenticates to S. Before formally defining the authentication notion, we need the concept of matching conversations due to Bellare and Rogaway [37] . We suppose that an adversary A has access to an infinite family of oracles denoted by Ω i a,b , where a and b are in the space of participants of a protocol, i ∈ N denotes the i-th instance of a protocol, and the oracle behaves as if entity a is performing protocol P in the belief it is communicating with the entity b for ith time. 
For any oracle Ω i a,b , its conversation for instance i is the following n-tuple K = (t 1 , α 1 , β 1 ), (t 2 , α 2 , β 2 ), ..., (t n , α n , β n ) where at time t j , the oracle Ω i a,b received α j and sent β j (1 ≤ j ≤ n). We can now define matching conversations, again following Bellare and Rogaway [37, Definition 4.1] . We assume that the number of moves n in a protocol is odd (n even is investigated by Boyd et al. [36] ). Definition 23 (Matching conversations). Suppose P is a nmove protocol, where n = 2k − 1 for some integer k. Run P and suppose oracles Ω i a,b and Ω j b,a engage in conversations K i and K j , respectively. If there exist t 0 < t 1 < ... < t n and α 1 , β 1 , ..., α k , β k such that K i is prefixed by and K j is prefixed by then K j is a matching conversation to K i . ∅ means that the oracle has no input, because it initiates the protocol; we call it an initiator oracle; otherwise, an oracle is a responder oracle. * means that the oracle has no output, because the protocol ends with this last move. Informally, this means that conversation K j of Ω j b,a (a responder oracle) matches conversation K i of Ω i a,b (an initiator oracle). We also need the following definition, which has been modified for the unilateral (as opposed to mutual) authentication case. is an adversary. Suppose also that when P is run against A there exists an initiator oracle Ω i a,b with a conversation K i in the ACCEPT state but no oracle Ω j b,a has a conversation matching with K i . We denote this event by No-Match A P and its probability by Pr(No-Match A P ). These preliminaries enable us to state the following key definition. Note that this definition corresponds to the case where the protocol responder (entity b) is authenticated by the protocol initiator (entity a), i.e. in the case of protocol P where the server is entity a and the client is entity b. Definition 25 (Secure unilateral authentication protocol). A protocol P is a secure unilateral entity authentication protocol if for every adversary A: 1) If Ω i a,b and Ω j b,a have matching conversations, then the initiator oracle Ω i a,b accepts; 2) Pr(No-Match A P ) is negligible. The first condition refers to completeness. The second condition says that the only way for an adversary to corrupt an honest responder oracle to the ACCEPT state is to relay the messages in the protocol without modification, i.e. an adversary can only observe and relay messages. We can now state the main result. Proof: Since for the purposes of the Theorem we are ignoring Steps 1, 4(b) and 4(c) of P, the server is the protocol initiator and the client is the responder, although the reverse is true for P in its entirety. Suppose λ is the security parameter of the underlying homomorphic encryption scheme E. Suppose also that Steps 2-4(a) of P do not form a secure authentication protocol. From Theorem 1, we know that P is complete, i.e. that the first condition of Definition 25 holds. Thus the second condition does not hold, i.e. there exists a PPT adversary A such that Pr(No-Match A P ) is non-negligible. We say that A succeeds against Ω i a,b if, at the end of A's operation, there exists an initiator oracle Ω i a,b with a conversation K i in the ACCEPT state but no oracle Ω j b,a has a conversation K j matching with K i . We denote the probability that A succeeds against the initiator oracle Ω i a,b by Pr(A succeeds) = p. Then, by assumption, p is non-negligible. Suppose also A possesses the public key pk A of a genuine client. 
We next construct an adversary B from A against the IND-CPA security of E. We consider the details of the conversation of the oracle Ω i S,C . Since we only consider Steps 2-4(a) of P, we have n = 3. Suppose the conversation for Ω i S,C is K i = (t 0 , ∅, α 1 ), (t 2 , β 1 , α 2 ), where at time t 0 the oracle sent α 1 and at time t 2 the oracle received β 1 and sent α 2 . Then it follows that we have α 1 = y = Enc(pk C , r w ) (where w is 0 or 1), β 1 = y, and α 2 = *, where we ignore the ID token τ since its construction is independent of the design of the protocol. Since A is successful against Ω i S,C with probability p, it follows that y ∈ {r 0 , r 1 } with probability p. Since r 0 and r 1 are chosen uniformly at random for each conversation instance, and since we are also assuming that there is no matching conversation, A must have a means of recovering r w from Enc(pk C , r w ) which works with probability at least p. Hence A must have access to an oracle O which, when given an input encrypted using the public key of C, with non-negligible probability returns its decrypted version. However, since A does not have access to the private key of C, this oracle can immediately be used to construct an adversary B against the IND-CPA security of E. This gives the desired contradiction, and hence it follows that P is a secure unilateral authentication protocol.

The protocol has been implemented using the C/C++ Fully Homomorphic Encryption over the Torus (TFHE) library due to Chillotti et al. [39]. One feature of TFHE is that it implements gate bootstrapping, i.e. at each evaluated gate the bootstrapping method is executed. This enables the evaluation of arbitrary circuits on encrypted data. In practice, TFHE offers the fastest gate bootstrapping in the literature, namely of the order of 13 milliseconds per gate on a single core; however, "bootstrapped bit operations are still about one billion times slower than their plaintext equivalents" [2]. In Section II, we described a homomorphic encryption scheme as a public key encryption system. The TFHE scheme is symmetric, but can easily be used in the context of P because it provides a pair of keys: a secret key sk and a cloud key ck. In the context of P (see Section III), sk is kept secret and used by the client C to encrypt and decrypt data. C sends ck to the server S during the registration phase. S is then able to compute arbitrary circuits on data encrypted under sk using ck, without being able to decrypt the data. For further information on the design and security of TFHE see Chillotti et al. [2].

We chose facial recognition as the biometric method for our proof-of-concept implementation for two main reasons: it is a mature technology (see, for example, the NIST report [40]) and one that suits the homomorphic setting. For our purposes, facial samples and templates are vectors x = (x 1 , ..., x n ) ∈ (Z m ) n , where Z m is the set of the integers modulo m (for some m). Samples and templates are compared using Euclidean distance, as defined below.

Definition 26 (Euclidean distance). Suppose x, y ∈ (Z m ) n . The Euclidean distance between x and y is defined to be d(x, y) = √((x 1 − y 1 ) 2 + ... + (x n − y n ) 2 ).

To simplify calculations, we used the square of the distance as the metric. As in the following definition, a sample and a template are deemed to match if the square of the distance is at most B, for some B.

Definition 27 (Match). Suppose x, y ∈ (Z m ) n and B ∈ Z. Then x and y are deemed to match (with respect to the threshold B) if and only if d(x, y) 2 ≤ B, i.e. the matching function f of Section III-B outputs 1 if and only if d(x, y) 2 ≤ B.

For comparison purposes, when verifying the correctness of the implementation, we also implemented the Manhattan distance, defined below.
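Before turning to the Manhattan distance, the following plaintext-domain C++ sketch records the matching rule just described (squared Euclidean distance compared with the threshold B). It is only a correctness reference: in the protocol the same computation is carried out bit-wise in the ciphertext domain using the subroutines of Appendix A, and the function and variable names here are ours.

#include <cstdint>
#include <vector>

// Plaintext-domain reference for the matching function f:
// f(x, y) = 1 iff d(x, y)^2 <= B, with coordinates taken in Z_m.
bool euclidean_match(const std::vector<uint32_t>& x,
                     const std::vector<uint32_t>& y,
                     uint64_t B, uint32_t m) {
    uint64_t d2 = 0;                               // accumulates the squared distance
    for (std::size_t i = 0; i < x.size(); ++i) {
        uint64_t xi = x[i] % m, yi = y[i] % m;     // representatives in [0, m)
        uint64_t diff = (xi > yi) ? xi - yi : yi - xi;
        d2 += diff * diff;                         // (x_i - y_i)^2
    }
    return d2 <= B;                                // match iff within threshold B
}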
Definition 28 (Manhattan distance). Suppose x, y ∈ (Z m ) n , and let |z| denote the absolute value of z. The Manhattan distance between x and y is defined to be d M (x, y) = |x 1 − y 1 | + ... + |x n − y n |.

To obtain performance results, the implementation was run on an Ubuntu 20.04.1 LTS 64-bit machine with 8 GB of RAM and a four-core Intel(R) Core(TM) i3-6100 CPU @ 3.70 GHz. TFHE was used with its default parameter set, which achieves 110-bit cryptographic security [39]. We chose to use biometric vectors of length 128 (i.e. n = 128) because it is a likely real-world value. To obtain timing figures, we first measured the 'homomorphic' (ciphertext domain) computation times for most of the arithmetic and bit comparison subroutines given in Appendix A. For comparison purposes, we also implemented and measured the performance of all the subroutines in the plaintext domain. Table I summarises the results. It is clear that homomorphic computations have a substantial performance cost, with a slowdown factor of at least 10 6 . This finding is in line with previous work [34], despite the optimisations included in the TFHE library [20]. Building on the implementations of fundamental operations, we implemented a naive version of P. The performance results are shown in Table II, and confirm that the current proof-of-concept implementation is certainly not practical, and needs considerable optimisation in order to be usable in practice. For comparison we also show computation results in the plaintext domain. Note that none of the performance results given in Table I include the encryption and decryption time. These results demonstrate the importance of optimising the design of an algorithm and its implementation. The performance results are not only due to the homomorphic paradigm, but also because we implemented the most naive routines, without any optimisations or parallelisation. We project from these results that, with an optimised and targeted implementation, P could be practical in the real world.

To conclude, we showed that, implemented naively, homomorphic encryption does not meet the performance criteria for practical use, since a user cannot wait for a few hours to be authenticated in most (if not all) authentication use cases. Indeed, Nah [41] showed that a typical user will not tolerate a wait of more than two seconds for a web page to appear. Nonetheless, there are considerable possibilities for optimisation, and the implementation and design of P can be enhanced in various ways, as we next briefly discuss. The most obvious improvement would be from the algorithmics perspective. As explained above, all the subroutines are implemented in a very naïve way. There exist various public libraries and APIs that could be used to add parallel computing features; one example is OpenMP. Many of the subroutines have for loops in which all iterations are independent. Finally, perhaps the most effective optimisation would be to mix FHE schemes, as proposed by Boura et al. [42], [43]. Existing libraries are optimised for certain targeted homomorphic computations; the main idea is to switch between libraries, choosing the most efficient for each homomorphic computation. In our case, the arithmetic subroutines would be faster on libraries other than TFHE; however, bit comparisons are much better handled by the TFHE library. This idea is practically effective, as shown by Lou et al. [44], who present Glyph, a tool which switches between the TFHE [39] and BGV [15] cryptosystems.
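To make the TFHE setting of this section concrete, the following sketch shows client-side key generation and bit-wise encryption of a single sample coordinate using the TFHE gate-bootstrapping API. The 110-bit security level matches the parameter choice quoted above; the 16-bit coordinate width, the fixed seed and all names are illustrative assumptions rather than the exact code of the proof-of-concept.

#include <tfhe/tfhe.h>
#include <tfhe/tfhe_io.h>
#include <cstdint>

int main() {
    // Generate parameters and the secret/cloud key pair (client side).
    const int minimum_lambda = 110;                       // security level quoted in the text
    TFheGateBootstrappingParameterSet* params =
        new_default_gate_bootstrapping_parameters(minimum_lambda);
    uint32_t seed[] = {314, 1592, 657};                   // fixed seed for illustration only
    tfhe_random_generator_setSeed(seed, 3);
    TFheGateBootstrappingSecretKeySet* sk =
        new_random_gate_bootstrapping_secret_keyset(params);
    const TFheGateBootstrappingCloudKeySet* ck = &sk->cloud;  // to be sent to the server S
    (void)ck;

    // Encrypt one 16-bit coordinate of a biometric vector, least significant bit first.
    const int kBits = 16;                                 // assumed coordinate width
    const int16_t coord = 4242;                           // example coordinate value
    LweSample* enc_coord = new_gate_bootstrapping_ciphertext_array(kBits, params);
    for (int i = 0; i < kBits; ++i)
        bootsSymEncrypt(&enc_coord[i], (coord >> i) & 1, sk);

    // ... repeat for all n = 128 coordinates, then transfer the ciphertexts to S ...

    delete_gate_bootstrapping_ciphertext_array(kBits, enc_coord);
    delete_gate_bootstrapping_secret_keyset(sk);
    delete_gate_bootstrapping_parameters(params);
    return 0;
}

On the server side, the cloud key ck is all that is needed to run the gate-bootstrapped operations used by the subroutines of Appendix A.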
We presented the design and a proof-of-concept implementation of a novel privacy preserving authentication protocol based on fully homomorphic encryption. Human authentication is based on biometric matching, implemented in the proof-of-concept using face matching. In the implementation, all underlying operations are executed using FHE, including biometric matching, Euclidean distance computation, and integer comparison. We showed that the protocol is privacypreserving and a secure unilateral authentication protocol if the underlying homomorphic encryption scheme is IND-CPA. The implementation results are for a naive and unoptimised version, i.e. the worst-case scenario. However, producing it involved developing a set of elementary routines in the ciphertext domain that can be used as low-level building blocks in other applications. The results confirm that FHE is not practical in a naive worst-case model, and real-world implementations would require optimisations. However, the results suggest that, with already identified improvements, the protocol can be made ready for real-world adoption. There are number of possible directions for future work in improving performance. First, as identified in §V-B, mixing FHE schemes to take advantage of the best of each scheme (see [43] , [44] ) would significantly benefit performance without compromising the IND-CPA security of the homomorphic encryption scheme. Better algorithmics and implementation design is also an obvious target for improvement. Another possibility would be to change the biometric matching paradigm. Deep Learning is known to be useful in this context, and the performance in particular for face matching has been much improved recently thanks to initiatives such as that of NIST 3 . However, when such deep learning techniques are used in combination with homomorphic encryption, only the inference phase is run homomorphically and the training phase is run on clear data (see e.g. [45] , [46] ). To achieve the level of security we showed in this paper with FHE, both phases need to be executed in the ciphertext domain. However, encrypting both phases may not be straightforward to achieve, as recent experience shows that it is costly [44] , [47] , despite improvements in making FHE practical. Informally, a Boolean circuit is a directed acyclic graph with internal nodes marked by elements of {∧, ∨, ¬}. Nodes with no in-going edges are called input nodes, and nodes with no outgoing edges are called output nodes. A node marked ¬ may have only one outgoing edge. Computation in the circuit begins with placing input bits on the input nodes (one bit per node) and proceeds as follows. If the outgoing edges of a node (of in-degree d) marked ∧ (similarly for nodes marked ∨ and ¬) have values v 1 , v 2 , ..., v d then the node is assigned the value ∧ d i=1 v i . The output of the circuit is read from its output nodes. The size of a circuit is the number of its edges. A polynomialsize circuit family is an infinite sequence of Boolean circuits π 1 , π 2 , ... such that, for every n, the circuit π n has n input nodes and size p(n), where p is a polynomial fixed for the entire family. Definition 32 (Circuit). Let B be a basis. 
A Boolean circuit over B with n inputs and m outputs is a tuple π = (V, E, α, β, ω), where (V, E) is a finite directed acyclic graph, α : E → N is an injective function, β : V → B ∪ {x 1 , x 2 , ..., x n }, and ω : V → {y 1 , y 2 , ..., y m }∪{ * }, such that the following conditions hold: 1) If v ∈ V has in-degree 0, then β(v) ∈ {x 1 , x 2 , ..., x n } or β(v) is a 0-ary Boolean function (i.e. a Boolean constant) from B. 2) If v ∈ V has in-degree k > 0, then β(v) is a k-ary Boolean function from B or a family of Boolean functions from B. 3) For every i, 1 ≤ i ≤ n, there is at most one node v ∈ V such that β(v) = x i . 4) For every i, 1 ≤ i ≤ m, there is at most one node v ∈ V such that ω(v) = y i . A Boolean circuit π with n inputs and m outputs computes a Boolean function f : B n → B m . Definition 33 (Circuit family). Let B be a basis. A circuit family over B is a sequence Π = (π 0 , π 1 , π 2 , ...), where for every n ∈ N, π n is a circuit over B with n inputs. Let f n be the function computed by π n . Then we say that Π computes the function f : B * → B * , defined for every w ∈ B * by f (w) def = f |w| (w). Remark 5. For simplicity of presentation, we often abuse our notation slightly by considering circuit families (π n ) n∈N , where π n has p(n) rather than n input bits, for some fixed polynomial p. Algorithm 1 implements the function f defined in §III-B. Algorithm 1: Pseudo-code of the biometric matching f Input : x, y ∈ (Z m ) n and B ∈ Z As stated by Crawford et al. [49] , a key step for practical homomorphic encryption is to implement basic routines and tools, e.g. binary arithmetic, and make them available for use and optimisation. We implemented the following basic arithmetic functions, needed to calculate Euclidean distance (see §V-A). In each case pseudo-code (using mainly logic) is provided below. Apart from the specified functions, we also used the bitwise routines implemented in the TFHE library 4 . All the functions are presented as they are executed in the plaintext domain, although the implementations of those routines are specific to the ciphertext domain. 1-bit addition We denote naive binary addition by 1bit add. Two bits a and b are XOR-ed with carry; the carry is updated and returned for use in another 1-bit addition as part of n-bit addition. Algorithm 2 implements the 1-bit addition routine. 
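As a sketch of how such a routine looks in the ciphertext domain, the 1-bit addition of Algorithm 2 can be written directly in terms of TFHE's gate-bootstrapped operations; the function and temporary names below are ours, and key management and error handling are omitted.

#include <tfhe/tfhe.h>

// Homomorphic 1-bit full adder corresponding to 1bit_add (Algorithm 2):
//   res       = a XOR b XOR carry_in
//   carry_out = (a AND b) XOR (a AND carry_in) XOR (b AND carry_in)
// Every boots* call performs one gate-bootstrapped operation under the cloud key ck.
void hom_1bit_add(LweSample* res, LweSample* carry_out,
                  const LweSample* a, const LweSample* b, const LweSample* carry_in,
                  const TFheGateBootstrappingCloudKeySet* ck) {
    LweSample* t = new_gate_bootstrapping_ciphertext_array(4, ck->params);
    bootsXOR(&t[0], a, b, ck);             // a XOR b
    bootsXOR(res, &t[0], carry_in, ck);    // sum bit
    bootsAND(&t[1], a, b, ck);             // a AND b
    bootsAND(&t[2], a, carry_in, ck);      // a AND carry_in
    bootsAND(&t[3], b, carry_in, ck);      // b AND carry_in
    bootsXOR(&t[0], &t[1], &t[2], ck);
    bootsXOR(carry_out, &t[0], &t[3], ck); // carry-out bit
    delete_gate_bootstrapping_ciphertext_array(4, t);
}

An n-bit adder (Algorithm 3) simply chains this routine over the bits of its operands, propagating the encrypted carry.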
REFERENCES
[1] Privacy-preserving biometric matching using homomorphic encryption
[2] TFHE: fast fully homomorphic encryption over the torus
[3] A method for obtaining digital signatures and public-key cryptosystems
[4] Evaluating 2-DNF formulas on ciphertexts
[5] A new public key cryptosystem based on higher residues
[6] A new public-key cryptosystem as secure as factoring
[7] Public-key cryptosystems based on composite degree residuosity classes
[8] A fully homomorphic encryption scheme
[9] Efficient fully homomorphic encryption from (standard) LWE (FOCS)
[10] Lattice-based FHE as secure as PKE
[11] Fully homomorphic encryption over the integers
[12] Homomorphic evaluation of the AES circuit
[13] Fully homomorphic encryption with relatively small key and ciphertext sizes
[14] Faster fully homomorphic encryption
[15] (Leveled) fully homomorphic encryption without bootstrapping
[16] On lattices, learning with errors, random linear codes, and cryptography
[17] Homomorphic encryption from learning with errors: conceptually-simpler, asymptotically-faster, attribute-based
[18] Faster bootstrapping with polynomial error
[19] FHEW: bootstrapping homomorphic encryption in less than a second
[20] Faster fully homomorphic encryption: bootstrapping in less than 0.1 seconds
[21] Surpassing human-level face verification performance on LFW with GaussianFace
[22] Privacy-preserving biometric authentication: challenges and directions
[23] SCiFI - a system for secure face identification
[24] Efficient binary conversion for Paillier encrypted values
[25] Secure and efficient protocols for iris and fingerprint identification
[26] Packed homomorphic encryption based on ideal lattices and its application to biometrics
[27] Attacks on privacy-preserving biometric authentication
[28] An authentication protocol with encrypted biometric data
[29] Fully private noninteractive face verification
[30] Ghostshell: secure biometric authentication using integrity-based homomorphic evaluations
[31] Secure face matching using fully homomorphic encryption
[32] Somewhat practical fully homomorphic encryption
[33] Basic Techniques
[34] A guide to fully homomorphic encryption
[35] Handbook of Applied Cryptography
[36] Protocols for Authentication and Key Establishment, Second Edition, ser. Information Security and Cryptography
[37] Entity authentication and key distribution
[38] Entity authentication and authenticated key transport protocols employing asymmetric techniques
[39] TFHE: fast fully homomorphic encryption library
[40] Facial Recognition Technology (FRT), National Institute of Standards and Technology (NIST)
[41] A study on tolerable waiting time: how long are web users willing to wait
[42] Chimera: combining ring-LWE-based fully homomorphic encryption schemes
[43] CHIMERA: combining ring-LWE-based fully homomorphic encryption schemes
[44] Glyph: fast and accurately training deep neural networks on encrypted data
[45] Fast homomorphic evaluation of deep discretized neural networks
[46] CryptoNets: applying neural networks to encrypted data with high throughput and accuracy
[47] Towards deep neural network training on encrypted data
[48] Introduction to Circuit Complexity - A Uniform Approach, ser. Texts in Theoretical Computer Science
[49] Doing real work with FHE: the case of logistic regression
[50] Multiplication of multidigit numbers on automata
[51] Improved secure integer comparison via homomorphic encryption

APPENDIX
We next formally introduce notions related to circuits. For more complete versions of these definitions, see Vollmer [48]. Definition 29 (Boolean function). A Boolean function is a function f : B n → B for some n ∈ N. Definition 30 (Family of Boolean functions).
A family of Boolean functions is a sequence f = (f n ) n∈N , where f n is an n-ary Boolean function A basis is a finite set consisting of Boolean functions and families of Boolean functions Algorithm 2: Pseudo-code of 1-bit addition Input : a, b, carry in ∈ B Output: res, carry out ∈ B res ← a XOR b XOR carry in ; carry out ← (a AND b) XOR (a AND carry in ) XOR (b AND carry in ); return (res, carry out )We denote naive bitwise addition by nbit add. This routine uses 1bit add and applies to all bits of two n-bit numbers. Algorithm 3 implements the n-bit addition routine.Algorithm 3: Pseudo-code of n-bit addition Input : a, b ∈ B n Output: res ∈ B n+1 carry ∈ B; carry ← 0; for i ← 1 to n do (res, carry) ← 1bit add(a i , b i , carry) return res Two's complement We implemented subtraction as addition between a number and the two's complement of the other number. Thus, we require this subroutine. We denote byå the two's complement of a and by twos the two's complement function. Algorithm 4 implements the two's complement routine.Algorithm 4: Pseudo-code of two's complement Input : a ∈ B n Output:å ∈ B n+1 for i ← 1 to n do a i ← a i XOR 1 a ← nbit add(å, 1); returnå The absolute value was required when calculating the Manhattan distance (see §V-A). We denote this function by abs. MSB(a) below outputs the Most Significant Bit of a. Algorithm 5 implements the absolute value routine.tmp ← nbit add(a, mask); for i ← 1 to n do |a| i ← tmp i XOR mask i return |a| n-bit subtraction As explained above, when subtracting b from a, the routine adds a tob. We denote this routine by sub. After the addition, if the final carry denoted carry f over the sum is 1, the result is positive and remains unchanged. If not (i.e. carry f is 0), the result is negative thus its two's complement is returned. The cost of a branching condition being too great, a sequence of instructions is used instead, leading to the genuine result. Algorithm 6 implements the subtraction routine.Algorithm 6: Pseudo-code of n-bit subtraction Input : a, b ∈ B n Output: res, tmp, var ∈ B n+1 b ← twos(b); tmp ← nbit add(a,b); for i ← 1 to n + 1 do var i ← carry f res ← twos(tmp AND var) OR (tmp AND var); return res We implemented a naive multiplication algorithm; however, other algorithms have smaller complexity, e.g. Karatsuba multiplication [50] . Implementing this is left for future work. Algorithm 7 implements this routine denoted by mult. Input : a, b ∈ B n Output: res ∈ B 2n res ← 0 2n ; tmp ← 0 2n ; for i ← 1 to n do for j ← 1 to n do tmp i+j ← a j AND b i res ← nbit add(res, tmp) return res Secure integer comparison has been studied for a long time [51] . The first solution was probably that of Yao [52] through the Millionaires' problem. Integer comparison is very costly in terms of computation when using FHE; this is why it is usually better to avoid computing such an operation. Moreover it can also be difficult to articulate in ciphertext spaces. In TFHE, this operation is done using logic gates, and a proposal for implementation is published in the tutorial section in [39] . The authors use a MUX gate in their function, which is exhaustively explained in [2, Section 3.4] . The authors provide two functions, one to compare bitwise and the other to compare two binary numbers, denoted by 1bit comp and nbit comp, respectively. We adapted their function in our implementation. Algorithm 8 implements the 1-bit comparison routine.n-bit comparison This routine performs a comparison of two n-bit numbers using the previous routine. 
Algorithm 9 implements the n-bit comparison routine.Algorithm 9: Pseudo-code of n-bit comparison Input : a, b ∈ B n Output: res ← a?b : carry carry, tmp ∈ B; carry ← 0; for i ← 1 to n do tmp ← 1bit comp(a i , b i , carry)for i ← 1 to n do res ← MUX(carry, b i , a i ) return res
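To illustrate how Algorithms 8 and 9 translate into the ciphertext domain, the sketch below expresses them with TFHE's bootstrapped gates, in the spirit of the comparator outlined in the TFHE tutorial [39]. The names are ours, and the final loop selects the bit-wise minimum of the two encrypted inputs, mirroring the last loop of Algorithm 9.

#include <tfhe/tfhe.h>

// 1-bit comparison step (cf. Algorithm 8): keep the running carry when the two
// bits are equal, otherwise replace it by a_i. After scanning from the least to
// the most significant bit, the carry encrypts 1 iff a > b.
void hom_1bit_comp(LweSample* carry, const LweSample* a, const LweSample* b,
                   const TFheGateBootstrappingCloudKeySet* ck) {
    LweSample* eq = new_gate_bootstrapping_ciphertext_array(1, ck->params);
    bootsXNOR(eq, a, b, ck);            // 1 iff a_i == b_i
    bootsMUX(carry, eq, carry, a, ck);  // eq ? carry : a_i (same aliasing pattern as the TFHE tutorial comparator)
    delete_gate_bootstrapping_ciphertext_array(1, eq);
}

// n-bit comparison (cf. Algorithm 9): compute the encrypted comparison carry and
// then use it to select, bit by bit, the smaller of the two encrypted inputs.
void hom_nbit_comp(LweSample* res, const LweSample* a, const LweSample* b,
                   int n, const TFheGateBootstrappingCloudKeySet* ck) {
    LweSample* carry = new_gate_bootstrapping_ciphertext_array(1, ck->params);
    bootsCONSTANT(carry, 0, ck);                       // carry starts at 0
    for (int i = 0; i < n; ++i)
        hom_1bit_comp(carry, &a[i], &b[i], ck);
    for (int i = 0; i < n; ++i)
        bootsMUX(&res[i], carry, &b[i], &a[i], ck);    // carry ? b_i : a_i
    delete_gate_bootstrapping_ciphertext_array(1, carry);
}

Each iteration performs a small constant number of gate-bootstrapped operations, so the cost of the comparison grows linearly with the bit width n.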