FHE1: Introduction to practical homomorphic encryption 15 Mar 2019

Background

It is not hard to be fascinated when first hearing about homomorphic encryption.

Homomorphic encryption schemes are asymmetric cryptosystems that support the evaluation of functions on encrypted data. For example with RSA, ciphertexts can be easily be multiplied. The result is a ciphertext containing the (encrypted) result of the multiplication:

Enc(a) * Enc(b) = Enc(a*b)

This is already cool, but actually there is something even cooler: homomorphic evaluation of arbitrary (computable) functions!

The first cryptosystem supporting this has been proposed by Craig Gentry in 2009 in his seminal thesis. That cryptosystem operates on bits and can evaluate arbitrary computable functions in polynomial time. This was the first fully homomorphic cryptosystem.

This of course enables a myriad of possible applications from private cloud computing to protecting the computations on untrusted devices. For example a team at Microsoft developed a POC of a neural network (called CryptoNets) that processes encrypted data. For more exciting results like private database queries or protected genome analysis check out the publication list of the Homomorphic Encryption group at Microsoft.

But let’s clarify the terminology a bit more:

Since 2009 many researchers including Gentry himself have been working on various improvements, improving performance to a somewhat practical level. Here is a (probably incomplete) list of important research papers:

Practical usage

In the following I will show you how to actually use homomorphic encryption. For that purpose I give some examples using the Pyfhel Python package. This library uses (currently in version 2.0) the SEAL library by Microsoft as a backend.

For basic usage we can do something like this:

import Pyfhel
he = Pyfhel.Pyfhel()
he.contextGen(p=10000)
he.keyGen()
# now everything is initialized
half_the_truth = he.encryptInt(21)
two = he.encryptInt(2)
the_truth = he.multiply(half_the_truth, two)
print(he.decryptInt(the_truth))
#=> prints 42

Interestingly, we can also do this:

pi = he.encryptFrac(3.141592)
two_frac = he.encryptFrac(2.0)
result = he.multiply(pi, two_frac)
print(he.decryptFrac(result))
#=> prints 6.283183999825269

Let me point out why this is so interesting:

Why is this?

Plaintext space

To answer this question let me explain a little bit more about how the SEAL library works. To that end let’s discuss how plaintexts look like and what parameters control the cryptosystem in which ways.

The plaintexts that are being manipulated by the SEAL library are actually quite abstract. The plaintext space is a polynomial ring over a prime integer ring. That means plaintexts are polynomials with integers modulo a prime number as coefficients and the polynomials are themselves taken modulo a big irreducible (something like prime) polynomial.

This already touches upon some of the parameters we can set:

Additionally there is the Coefficient modulus that is harder to select and hidden by the Pyfhel library. Instead of it, in Pyfhel there is a security parameter that can be set to 128, 192 and 256 bit.

Noise budget & Encodings

To understand what affect the different parameters have, we have to introduce the concept of noise budget and look at the different data encodings.

The following code snippet helps to visualize the noise budget of ciphertexts:

ctxt1 = he.encryptInt(2)
print(he.decryptInt(ctxt1))
#=> prints 2
print(he.noiseLevel(ctxt1))
#=> prints 30
ctxt2 = he.add(ctxt1, two)
print(he.decryptInt(ctxt1))
#=> prints 4
print(he.noiseLevel(ctxt1))
#=> prints 29
ctxt3 = he.multiply(ctxt2, two)
print(he.decryptInt(ctxt2))
#=> prints 8
print(he.noiseLevel(ctxt2))
#=> prints 6
ctxt4 = he.multiply(ctxt3, two)
print(he.decryptInt(ctxt3))
#=> prints -7906467554648178793
print(he.noiseLevel(ctxt3))
#=> prints 0

Each ciphertext has an associated noise budget and with each operation the noise budget shrinks. The noise impact of additions is a lot smaller than the one of multiplications. Also depleting the noise budget is not a good idea: it makes the ciphertexts indecipherable.

Note: the he.noiseLevel function can be used to check the noise budget, but it needs access to the private key.

All parameters affect the noise budget of a freshly encrypted ciphertext:

This means that there is a trade-off between the number of homomorphic multiplications (and less strongly additions) that can be performed and the performance.

Now let’s talk about the data encodings. Despite the fact that under the hood, SEAL operates on polynomials, it is possible to encrypt integers and fractional numbers. Additionally there is a ‘batch encoding’ that enables the encryption of vectors of integers.

Here is an overview:

We already saw a code snippet for the integer and fractional encoding, so now lets see how the batch encoding is used:

he2 = Pyfhel.Pyfhel()
# to allow batch encoding, we need a polynomial modulus p
# so that the polynomial modulus divides p-1
he2.contextGen(p=40961, m=4096, flagBatching=True)
he2.keyGen()
# we need to generate rotation keys to enable rotation
# operations
he2.rotateKeyGen(20)

array1 = he2.encryptBatch([1,2,3,4])
array2 = he2.encryptBatch([2,2,2,2])

# the third argument (True) is used to store the result in
# a new ciphertext (otherwise it would be stored in array1)
array_mult = he2.multiply(array1, array2, True)
print(he2.decryptBatch(array_mult)[:10])
#=> prints [2, 4, 6, 8, 0, 0, 0, 0, 0, 0]

rotated_row = he2.rotate(array1, 1, True)
print(he2.decryptBatch(rotated_row)[:10])
#=> prints [2, 3, 4, 0, 0, 0, 0, 0, 0, 0]
print(he2.decryptBatch(rotated_row)[2040:2050])
#=> prints [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]

Relinearization

There is one important thing that is missing: relinearization. For this it is important to know that ciphertexts have a size that is 2 for a fresh ciphertext and grows with multiplications. The problem is that multiplying ciphertexts of bigger size depletes the noise budget even faster.

To help this problem, there is a relinearization operation that shrinks ciphertexts back to size 2.

print(array_mult.size())
#=> prints 3

mult2 = he2.multiply(array_mult, array_mult, True)
print(mult2.size())
#=> prints 5
print(he2.noiseLevel(mult2))
#=> prints 20

he2.relinKeyGen(10,3)
he2.relinearize(array_mult)
print(array_mult.size())
#=> prints 2

mult2_new = he2.multiply(array_mult, array_mult, True)
print(mult2_new.size())
#=> prints 3
print(he2.noiseLevel(mult2_new))
#=> prints 26

We see that the multiplying the relinearized array_mult results in a ciphertext with slightly higher remaining noise budget.

This already concludes this introductory post on practically applying homomorphic encryption. We talked about what HE is, had a quick look at some of the important research papers and discussed how to actually use it with a nice and simple Python package that wraps the SEAL library by Microsoft.

In upcoming posts we will have a closer look at how the batch encoding can be used for some optimized computations, e.g. for neural networks.