参考资料：

代数基础

质数
- 质数，也称素数，是指只能被1和本身整除的自然数，即大于1的自然数中，除了1和它本身以外，没有其它的因数。比如2、3、5、7、11等都是质数，而4、6、8、9等则不是质数。
$Z_{n}^{*}$
- 表示模 $n$ 意义下，即 $0 n - 1$ 中所有与 $n$ 互质的元素的集合。
$Z_{n^{2}}^{*}$
- 表示模 $n^{2}$ 意义下的可逆元素集合。具体地，对于正整数 $n$ ， $Z_{n^{2}}^{*}$ 包含了模 $n^{2}$ 意义下与 $n^{2}$ 互质的所有元素。也就是说，如果 $x$ 属于 $Z_{n^{2}}^{*}$ ，那么 $x$ 满足以下两个条件：
  1. $x$ 和 $n^{2}$ 互质，即 $g c d (x, n^{2}) = 1$ ；
  2. $x$ 在模 $n^{2}$ 意义下有逆元，即存在 $y$ 使得 $x y \equiv 1 (m o d n 2)$ 。
欧拉函数 $φ (n)$
- $n = pq$ ，其中 $p$ 和 $q$ 是两个质数
- 欧拉函数 $φ (n) = (p - 1) (q - 1)$ 即为 $Z_{n^{2}}^{*}$ 中的元素个数
- 更进一步， $Z_{n^{2}}^{*}$ 的元素个数为 $φ (n 2) = n φ (n)$
卡迈克尔函数(Carmichael’s Function)
$λ (n) = l c m (p - 1, q - 1)$ ，满足 $aλ (n) \equiv 1 (m o d n)$ ，其中 $a$ 与 $n$ 互质。
- 根据卡迈克尔定理(Carmichael’s Theorem)，对于任意元素 $ω \in Z_{n^{2}}^{*}$
  - $ω^{λ} = 1 m o d n$
  - $ω^{nλ} = 1 m o d n^{2}$
n次剩余
- 在数论中，给定一个正整数 $a$ 和一个奇素数 $p$ ，如果存在整数 $x$ 使得 $x 2 \equiv a (m o d p)$ ，那么称 $a$ 是模 $p$ 的二次剩余（Quadratic Residue，QR），否则称 $a$ 是模 $p$ 的二次非剩余（Quadratic Non-Residue，QNR）。
- 类似地，如果存在整数 $x$ 使得 $x^{n} \equiv a (m o d p)$ ，那么称 $a$ 是模 $p$ 的 $n$ 次剩余（nth Power Residue），否则称 $a$ 是模 $p$ 的 $n$ 次非剩余（nth Power Non-Residue）。
- 当 $n = 2$ 时， $n$ 次剩余就是二次剩余。 $n$ 次剩余和二次剩余一样，具有很多重要的应用，例如在加密算法和密码学中被广泛使用。
复合剩余类问题（Composite Residuosity Problem，CRP）
- 在模 $n$ 意义下，给定两个整数 $a$ 和 $b$ ，判断是否存在整数 $x$ ，使得 $a \equiv x 2 (m o d n)$ 且 $b \equiv x 2 (m o d n)$ 同时成立。其中， $n$ 是一个合数。

$(0, 1) *$ 0, 1比特串，长度任意 $Z_{q}^{*} = x \in Z_{q} ∣ g c d (x, q) = 1, Z_{q}^{*} = {x \in Z_{q} ∣ g cd (x, q) = 1},$ 其中 $Z_{q} = 0, 1, \dots, q - 1, g c d (x, q) = x, q$ 的最大公因数。 $Z_{q}^{*}$ 关于模 q 乘法运算构成阿贝尔群，当 q 为素数时为循环群。

模

对整数 $a, b$

a m o d b = r (a + b) m o d c = (a m o d c + b m o d c) m o d c (a * b) m o d c = ((a m o d c) * (b m o d c)) m o d c a^{b} m o d c = (a m o d c)^{b} m o d c

同余

\begin{matrix} a\ \mod n = b\ \mod n\ a \equiv b\ (\mod n) \ 当k为整数时\ a \equiv b\ (\mod n) ⇒ ak \equiv bk\ (\mod n) \ 当k,n 互质时\ ak \equiv bk\ (\mod n) ⇒ a \equiv b\ (\mod n) \ \end{matrix}

- 模逆 $a,b$关于模$n$互为模倒数，也称模反元素或者模逆。

\begin{matrix} ab \equiv 1\ (mod n) \ b \equiv \frac{1}{a}(mod\ n)\ \end{matrix}

在模下的除法可以用对应模逆的乘法来表示。分数取模等于求分母的模逆。

\begin{matrix} b\ mod\ n = 1/a\ mod\ n\ (c/a)mod\ n = ((c\ mod\ n)(1/a\ mod\ n))mod\ n = (cb)mod\ n\ \end{matrix}

求模逆元可以通过遍历 ```python # find modular multiplicative inverse of 'a' under modulo 'n' def modular_multiplicative_inverse(a: int, n: int) -> int: for i in range(n): if (a * i) % n == 1: return i raise ValueError('{} has no multiplicative inverse under modulo {}'.format(a, n)) ``` 遍历效率不高，需要改进 [什么是模逆元-CSDN博客](https://blog.csdn.net/aaron67/article/details/109006977#:~:text=%E6%A8%A1%E9%80%86%E5%85%83%20%E5%8F%82%E8%80%83%E5%80%92%E6%95%B0%EF%BC%88%20xy%20%3D%201%20%EF%BC%89%E7%9A%84%E5%AE%9A%E4%B9%89%EF%BC%8C%E5%AF%B9%E6%95%B4%E6%95%B0%20a%20%E5%92%8C,b%20%E2%89%A1%20a%E2%88%921%20%28mod%20n%29%20%E7%B1%BB%E4%BC%BC%E4%BA%8E%E5%AE%9E%E6%95%B0%E9%99%A4%E6%B3%95%EF%BC%8C%E5%9C%A8%E6%A8%A1%20n%20%E4%B8%8B%E7%9A%84%E9%99%A4%E6%B3%95%E5%8F%AF%E4%BB%A5%E7%94%A8%E5%92%8C%E5%AF%B9%E5%BA%94%E6%A8%A1%E9%80%86%E5%85%83%E7%9A%84%E4%B9%98%E6%B3%95%E6%9D%A5%E8%A1%A8%E8%BE%BE%E3%80%82) # 同态加密 同态加密（Homomorphic Encryption，HE）是将数据加密后，对加密数据进行运算处理，之后对数据进行解密，解密结果等同于数据未进行加密，并进行同样的运算处理。同态加密的概念最初在1978年，由Ron Rivest，Leonard Adleman和Michael L. Dertouzos共同提出，旨在解决在不接触数据的前提下，对数据进行加工处理的问题。 满足下面的性质:

\begin{matrix} 加法同态：E(a)+E(b) = E(a+b) \ 乘法同态：E(a)E(b) = E(ab) \end{matrix}

从**明文空间**和**密文空间**的角度来看，密文空间具有特定的算符。明文空间的加法对应密文空间的 ⊕ ，明文空间的乘法对应密文空间的 ⊙ 。 | 明文空间 | 密文空间 | | ---- | ------ | | 2 | Enc(2) | | + | o+ | | 3 | Enc(3) | | 5 | Enc(5) | HE是一种特殊的加密方法，它允许直接对加密数据执行计算，如加法和乘法，而计算过程不会泄露原文的任何信息。计算的结果仍然是加密的，拥有密钥的用户对处理过的密文数据进行解密后，得到的正好是处理后原文的结果。 目前，同态加密支持的运算主要为加法运算和乘法运算。根据支持的计算类型和支持程度，同态加密可以分为以下三种类型： - **半同态加密**（Partially Homomorphic Encryption, **PHE**）：只支持加法或乘法中的一种运算。其中，只支持加法运算的又叫加法同态加密（Additive Homomorphic Encryption, AHE）；**PHE 的优点是原理简单、易实现，缺点是仅支持一种运算(加法或乘法)**； - **部分同态加密**（Somewhat Homomorphic Encryption, **SWHE**）：可同时支持加法和乘法运算，但支持的计算次数有限； - **全同态加密**（Fully Homomorphic Encryption, **FHE**）：支持任意次的加法和乘法运算。FHE 有以下类别：基于理想格的 FHE 方案、基于 LWE/RLWE 的 FHE 方案等等。**FHE 的优点是支持的算子多并且运算次数没有限制，缺点是效率很低，目前还无法支撑大规模的计算。** 同态加密算法包括**全同态**（FHE）、**部分同态**（SWHE）和**半同态**（PHE）三种。FHE支持无限次的乘法和加法运算，当前算法复杂度高，实际使用较少。SWHE支持有限次的加法和乘法运算。PHE只支持加法或乘法运算中的一种。加法同态的算法有Paillier算法、DGK算法、OU算法、基于格密码的方案等。乘法同态有我们常见的RSA算法、ElGamal算法等。PHE在实际中使用的较多。 **FHE在计算有限次乘法后需要较复杂的去除噪声的操作，经典的通用MPC协议通信开销较大，而TEE的安全性高度依赖于硬件厂商，无法提供密码学上严谨的安全性。** **PHE的高效、支持无限次加法或乘法的特点，使其成为隐私计算的重要基本组件，可辅助完成多种隐私计算功能：** ## 中国剩余定理 中国剩余定理（Chinese Remainder Theorem, CRT），又称为孙子定理，源于《孙子算经》，是数论中的一个关于一元线性同余方程组的定理，说明了一元线性同余方程组有解的准则以及求解方法。 通用方程： $$ \left\{ \begin{matrix} x\equiv a_0(mod\ n_0)\\ x\equiv a_1(mod\ n_1)\\ x\equiv a_2(mod\ n_2)\\ ...\\ x\equiv a_k(mod\ n_k) \end{matrix} \right.

解法流程：

计算所有模数的乘积 $n = \prod_{i = 0}^{k} n_{i}$
计算 $m_{i} = n / n_{i}, c_{i} = m_{i} * m_{i}^{- 1}$
解为： $x = \sum_{i = 0}^{k} a_{i} c_{i} (m o d n)$

Paillier：最著名的半同态加密方案

在Paillier算法出现之前，基于公钥加密的算法主要有两个分支：

以RSA为代表的，基于大数因数分解难题的公钥加密算法
以ElGama为代表的，基于大数离散对数难题的公钥加密算法 Paillier加密算法，由Pascal Paillier于1999年发表，给出了公钥加密算法的一个新的分支领域。Paillier基于复合剩余类难题，满足加法同态和数乘同态，具有非常高效的运行时性能。

Paillier是一个支持加法同态的公钥密码系统 [1]，由Paillier在1999年的欧密会（EUROCRYPT）上首次提出。此后，在PKC’01中提出了Paillier方案的简化版本[26][8]，是当前Paillier方案的最优方案。在众多PHE方案中，Paillier方案由于效率较高、安全性证明完备的特点，在各大顶会和实际应用中被广泛使用，是隐私计算场景中最常用的PHE实例化方案之一。

其他的支持加法同态的密码系统还有DGK [5]、OU [6]和基于格密码的方案[12]等。其中，DGK方案的密文空间相比Paillier更小，加解密效率更高，但由于算法的正确性和安全性在学术界没有得到广泛研究和验证，且我们的实验表明算法的加解密部分存在缺陷，不推荐在工业界代码中使用。OU和基于格的加法同态计算效率更高，也是PHE不错的候选项。其中OU的在方案中的使用频率相对较低，而基于格的方案密文大小较大，在一些特定场景有自身的优势。

算法思路

Paillier 半同态加密算法的安全性基于复合剩余类问题（Decisional Composite Residusity Assumption，DCRA）的困难性，即在给出 $y$ 和 $n$ 的情况下，很难判断模 $n^{2}$ 的 $n$ 次剩余是否存在: $z \equiv y^{n} (m o d n^{2})$ 。

加解密过程

密钥生成 KeyGeneration

选择两个随机的大质数 $p$ 和 $q$ ，满足 $g c d (pq, (p - 1) (q - 1)) = 1$ ，且满足 $p, q$ 长度相等。
计算出 $n = p \cdot q$ 和 $λ (n) = l c m (p - 1, q - 1)$
选取随机数 $g ， g \in Z_{n^{2}}^{*}$ ，且满足 $μ = (L (g^{λ} m o d n^{2}))^{- 1} m o d n$ 存在， $L$ 函数定义为 $L (x) = \frac{x - 1}{n}$
得到公钥为 $(n, g)$ ，私钥为 $(λ, μ)$

加密 Encryption

对于明文 $m, m \in Z n, 0 \leq m \leq n$ ，选择随机数 $r \in Z_{n}^{*}, r < n, g c d (r, n) = 1$
$c = E n c (m, n, g, r) = g^{m} r^{n} (m o d n^{2})$

解密 Decryption

输入密文 $c, c \in Z_{n^{2}}^{*}$
$m = Dec (c, λ, μ) = L (c^{λ} m o d n^{2}) * μ (m o d n) = \frac{L ( c ^{λ} m o d n ^{2} )}{L ( g ^{λ} m o d n ^{2} )} (m o d n)$

加法同态

对任意明文 $m_{1}, m_{2} \in Z_{N}$ 和任意 $r_{1}, r_{2} \in Z_{N}^{*}$ ，对应密文 $c_{1}, c_{2}$ ，满足：

c_{1} \cdot c_{2} = E (m_{1}, r_{1}) \cdot E (m_{2}, r_{2}) = g^{m_{1} + m_{2}} \cdot (r_{1} + r_{2})^{N} m o d N^{2} D (c_{1} \cdot c_{2}) = (m_{1} + m_{2}) m o d N

即密文乘等于明文加

算法证明

高效加速

如果取 $g = n + 1$

g^{m} = (n + 1)^{m} =

前面 $m - 1$ 项都是 $n^{2}$ 的倍数，在模 $n^{2}$ 下，把模指数运算简化为1次模运算，即

g^{m} = mn + 1 m o d n^{2}

蒙哥马利模乘

蒙哥马利模乘算法简介_montgomery modular multiplication-CSDN博客从数学上介绍模运算： What is modular arithmetic? (article) | Khan Academy 介绍模逆元什么是模逆元-CSDN博客

详细的从算法、代码实现讨论，并且结合一个例子。高效幂模算法探究：Montgomery算法解析 - FreeBuf网络安全行业门户

sv实现用SV写一个蒙哥马利模乘的参考模型 - 知乎 (zhihu.com)

蒙哥马利算法

详细的讲解，给出伪代码

蒙哥马利算法详解_蒙德马利法-CSDN博客

模乘， $x * y (m o d N)$
约减， $t * ρ^{- 1} (m o d N)$
模幂， $x^{y} (m o d N)$ 集合是整数模N后得到的： $Z_{N} = 0, 1, 2, ..., N - 1$ ，其在base-b进制下有 $l_{N}$ 位这样的集合称为N的剩余环，任何属于这个集合Z的x满足下面条件

正整数
最大长度 $l_{N}$ 蒙哥马利算法计算基于这个集合上的运算剩余环上重要的运算有简单运算如加减，复杂运算如乘对于取模操作，一般有以下几种方法
根据以下公式，来计算取模操作取模运算即为 $t - (N * ⌊ t / N ⌋)$ 这种解法有以下特征

整个计算过程是基于标准的数字表示
不需要预计算（也就是提前计算一些变量，以备使用）
涉及到一个除法操作，非常费时和复杂

用Barrett reduction算法，这篇文章不细说，但是有以下特征

基于标准的数字表示
不需要预计算
需要 $2 (l_{N} + 1)^{2}$ 次数乘运算

用蒙哥马利约减，也就是下面要讲的算法，有以下特征

不是基于标准的数字表示（后文中有提到，是基于蒙哥马利表示法）
需要预计算
需要 $2 l_{N}^{2}$ 次数乘运算

蒙哥马利预备知识

1. 除法转换

计算 $x \cdot y$ 其实是诸位乘法相加在十进制下又可以做简化（霍纳法则）例如： $x \cdot y = (y_{0} * x) + 10 (y_{1} * x + 10 * (y_{2} * x + 10 * 0))$ 对于 $x \cdot y /1000$ ，也可以类似处理在剩余域下的模如 $x \cdot y /1000 (m o d 667)$ ，其中除于10会有小数，但可以选择一个u， $u \cdot 667 \equiv 0 (m o d 667)$ ， $u * 667 + r$ 可以被10整除在某些情况，除于可以转化为进制操作，因此不需要除法

2. 约减转换

考虑两个算法：

输入 $x, y$ ，计算 $x \cdot y \cdot ρ^{- 1}$
输入 $t$ ，计算 $t \cdot ρ^{- 1}$ $x \cdot y (m o d 667) = ((x \cdot 1000) \cdot (y \cdot 1000) /1000) /1000 (m o d 667)$ 转化为了上面两个算法的结合即模乘和模约减蒙哥马利算法的特点就是基于蒙哥马利表示法，将输入参数转化。定义下面概念
蒙哥马利参数给定N，N在b进制下（如2进制）有l位，gcd(N,b)=1，预计算几个值
$ρ = b^{k}$ 指定一个最小的k，使得 $b^{k} > N$
$ω = - N^{- 1} (m o d ρ)$ 这两个参数就是用来计算前面演变中的1000和u
蒙哥马利表示法对于x, $0 \leq x \leq N - 1$ ，x的蒙哥马利表示法为 $x = x \cdot ρ (m o d N)$

蒙哥马利约减

给定整数t， $t \cdot ρ^{- 1} (m o d N)$ 约减可以当作模乘当x=1的特例，也可以用来计算模，如将 $m \cdot ρ$ 约减可得 $m (m o d N)$

蒙哥马利模乘

\overset{x}{^} = x \cdot ρ (m o d N) \overset{y}{^} = y \cdot ρ (m o d N) t = \overset{x}{^} \cdot \overset{y}{^} = (x \cdot y) \cdot ρ^{2} 约减 : \hat{t} = (x \cdot y) \cdot ρ (m o d N)

输入参数 $\overset{x}{^}, \overset{y}{^}$ ，结果 $(x \cdot y) \cdot ρ (m o d N)$ 也即 $(x \cdot y) \cdot ρ^{- 1} (m o d N)$

b = 10
N = 1000
b^k > N => k = 3
p = b^k = 1000
w = -667^{-1}(mod p) = 997

x = 421; xhat = 123
y = 422; yhat = 456
xhat * yhat * p^-1 = 547(mod 667)

蒙哥马利模幂

普通幂运算，通过分解，转化为一系列模乘用蒙哥马利做改变：

正确性与安全性

C++ 实现

同态加密算力开销如何弥补？港科大等提出基于FPGA实现的同态加密算法硬件加速方案_澎湃号·湃客_澎湃新闻-The Paper FPGA-Based Hardware Accelerator of Homomorphic Encryption for Efficient Federated Learning

A Programmable SoC-Based Accelerator for Privacy-Enhancing Technologies and Functional Encryption

Abstract

privacy-enhancing technologies (PETs) based on additively homomorphic encryption (AHE) Present: a hardware/software (HW/SW) codesign for programmable systems-on-chip (SoCs) Impelmentation: a microcode-based multicore architecture for accelerating various PETs using AHE with large integer modular arithmetic

introduction

Many privacy problems can be avoided by encrypting the data before storing it into the cloud. Partially homomorphic encryption allows computing a limited set of operations over encrypted data.(AHE Paillier encryption) AHE are simpler and more efficient than FHE. Paillier & SEDs on the cyphertext & FE contributions:

a programmable Soc-based architecture for accelerating large modular arithmetic(for AHE); accelerator utilizes a microcode-based architecture; a multicore structure that allows exploiting the inherent parallelism
Instantiate in Soc and use the implementation for SEDs and FE scheme for inner products

preliminaries

A. Additively Homomorphic Encryption

AHE allows computing additions with ciphertexts

Paillier Encryption

Key Generation: Given a security parameter k(e.g., k = 2048), choose two random primes p, q of the length k/2, N = p*q, select a group generator g for the multiplicative group $Z_{N^{2}}^{*} t h e p u b l i c k ey p k - (N, g); secre t k eys k -$ \lambda = lcm(p-1, q-1)$
Encryption: Take a message $m \in Z_{N}$ and pk (N, g), select a random $r \in Z_{N}^{*}$ $E n c (p k, m) = g^{m} \cdot r^{N} m o d (N^{2})$
Decryption: take $∣ m ∣ \in Z_{N^{2}}$ and sk $λ$ $Dec (s k, ∣ m ∣) = \frac{L ( ∣ m ∣ ^{λ} m o d N ^{2} )}{L ( g ^{λ} m o d N ^{2} )} m o d N, L (u) = (u - 1) / N$

B. Use Cases of Paillier Encryption

privacy-preserving computation of SEDs between the user’s query and the service provider’s database
FE that is a new paradigm for encryption that permits the service provider to compute only a specific function

Privacy-Preserving SEDs over Encrypted Data

Y = (y_{0}, y_{1}, ..., y_{n - 1})

$y$ 有m个分量， $y_{i, j} \in Z_{l},$ ，l认为是2的幂，本文取2^8，取 $x, x_{i} \in Z_{l}$ 是列向量。 Y is in the possession of a server and x is a query sent by a user server - calculate encrypted distances |di| between the encrypted query |x| and each vector yi in Y.

Multi-Input FE for Inner Products

C. MIFE-IP Based on Paillier Encryption

III. Algorithms

A Long Integer Modular Arithmetic

bottleneck - long integer modular arithmetic operations, especially modular multiplication(MM) base modular operations on the Montgomery modular arithmetic; use algorithms for modular addition (MA) and modular subtraction (MS), montgomery reduction (MR), radix-2k Montgomery MM, and left-to-right modular exponen-tiation (ME) The implementation of these algorithms are constant time(CT) except left-to-right ME.

B. Squared Euclidean Distances

IV architecture

HW: the computationally heavy long integer modular arithmetic (e.g., MMs and MEs) SW: the computation of auxiliary operations

inherent parallelisms in the use cases → multicore architecture including multiple parallel and programmable cryptography processor(CP) cores | Each CP core can be programmed
the ciphertexts are large → data communication between the CP cores and the SW becomes a bottleneck

A. High-Level HW/SW Codesign

HW/SW co-design 3-level Data Memory: L1 is located in each CP core; L2 is shared for all CP cores; L3 is the SW side

1. SW

controlling all operations in HW and peripherals(DDR3, I/O), computations not supported in HW. communication between SW&HW: 4 HP ports for transfering data; 2 GP ports for commands and status transfer

2. HW

CP cores for perfroming the actual computations and many supporting modules for data communication and storage connected in an AXI-based structure. Multicore architecture is organized into M parallel clusters where each cluster contains N parallel CP cores. Data communication between SW/HW is done via HP/AXI DMA; 4 AXI memory interconnect - M/4 AXI DMA blocks L2-DMEM contains a single-port RAM and a smart router(round-robin arbiter) for controlling the access of the CP cores to the memory

3. Configuration

B. Cryptography Core

Main idea: compact, programmable, and high-performance processor for large integer modular arithmetic optimized for the resources of modern FPGAs Implemented based on a microprogramming architecture instead of FSMs

1) External Interface Unit:

2) Arithmetic Unit:

3 parts: 1) source registers; 2) arithmetic blocks; and 3) an output register 3 main source reg (loaded from either the L1-DMEM, output of arithmetic unit, 0, present value) → inputs for MMAA & MAS 1 output reg → source reg/L1-DMEM auxiliary part: one bit of a vector for extracting the exponent bit of ME
MAS/MMAA(72bit) MAS - add/sub & 1-bit carry/borrow MMAA - $(I n_{0} * I n_{1} + I n_{2}) m o d 2^{72}$ MA/MS are computed by applying MAS iteratively $r_{i} = b_{i} a_{i} + s_{i, j}$ can be implemented by iterating the MMAA block

a) MMAA block

2 parts: 1) multiply-adder block, 2) complementary part for accumulation and modular operations MultiAdd: 6-level pipelined; 3 72-bit inputs → 1 144-bit output lower 72-bit part of the output is accumulated 73-bit result: 72-bit lower part is Outlow; 1-bit MSB for the next accumulation higher 72-bit for the next accumulation MultiAdd 72-bit higher part is the higher part of the previous MMAA MultiAdd as the primary computing block, is implemented using DSP

b) MAS block

works in 6 modes

3. Data Memory Unit

L1-DMEM is the local memory for each CP and used for storing data for algorithm L1-DMEM - duplicated RAM: one true dual-port/ one simple dual-port → two reads and one write in one cycle

4. Address Generation and Control Unit

generating read and write addresses of L1-DMEM and making control decisions for loop iterations constructed based on five categories of control registers: RDR1_x, RDR2_x, WRR_x, CVR_x, LCR_y

5. Instruction Memory Unit

IMEM stores microcodes(sequences of instructions)

C. Implementation of the Use Cases in the HW/SW Codesign

Paillier Enc&dec enc is computed in HW using MMs&MEs; SW controls data transfers and commands dec consists of 2 phase: 1. precomputations(u, ME in HW, division/ modular inversion)
SED
MIFE-IP

D. Security Model

assumption: Paillier are secure and focus on information leakage from the implementation; no physical access or malware, only timing side-channel attacks

V result and analysis

HPCA

介绍了一些关于存储结构

除法

参考High-Speed ASIC Implementation of Paillier Cryptosystem with Homomorphism 使用恢复余数法

【HDL系列】除法器(1)——恢复余数法 - 知乎 (zhihu.com)

A Programmable SoC-Based Accelerator forPrivacy-Enhancing Technologies and Functional Encryption

Quartz 4

Explorer

pillar

代数基础

模

Paillier：最著名的半同态加密方案

算法思路

加解密过程

密钥生成 KeyGeneration

加密 Encryption

解密 Decryption

加法同态

算法证明

高效加速

蒙哥马利模乘

蒙哥马利算法

蒙哥马利预备知识

1. 除法转换

2. 约减转换

蒙哥马利约减

蒙哥马利模乘

蒙哥马利模幂

正确性与安全性

C++ 实现

A Programmable SoC-Based Accelerator for Privacy-Enhancing Technologies and Functional Encryption

Abstract

introduction

preliminaries

A. Additively Homomorphic Encryption

B. Use Cases of Paillier Encryption

Privacy-Preserving SEDs over Encrypted Data

Multi-Input FE for Inner Products

C. MIFE-IP Based on Paillier Encryption

III. Algorithms

A Long Integer Modular Arithmetic

B. Squared Euclidean Distances

IV architecture

A. High-Level HW/SW Codesign

1. SW

2. HW

3. Configuration

B. Cryptography Core

1) External Interface Unit:

2) Arithmetic Unit:

a) MMAA block

b) MAS block

3. Data Memory Unit

4. Address Generation and Control Unit

5. Instruction Memory Unit

C. Implementation of the Use Cases in the HW/SW Codesign

D. Security Model

V result and analysis

HPCA

除法

Graph View

Table of Contents

Backlinks