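ECDH-PSI Protocol#

Algorithm Flow#

The algorithm runs in two phases: the first phase is the handshake, and the second phase is the main body of the algorithm. The overall flow is shown below: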

../_images/ecdh-psi-flow.png

Handshake#

The HandshakeRequest message used in the handshake is defined as follows:

interconnection/handshake/entry.proto#
// unified protocol for interconnection
message HandshakeRequest {
  // Handshake request version number, currently 2
  int32 version = 1;

  //** META INFO **//

  // The sender's rank
  int32 requester_rank = 2;

  //** AI/BI algorithm layer **//

  // enum AlgoType
  repeated int32 supported_algos = 3;

  // Detailed handshake parameters of each algorithm, in one-to-one
  // correspondence with supported_algos
  // SS-LR: learning_rate, optimizer, normalize
  // ECDH-PSI: Nothing, skip
  repeated google.protobuf.Any algo_params = 4;

  //** Secure operator layer **//

  // The ops required by the AI/BI algorithm are listed here
  // op = enum OpType
  // ECDH-PSI: Nothing, skip
  repeated int32 ops = 5;
  repeated google.protobuf.Any op_params = 6;

  //** Cryptographic protocol layer **//

  // protocol_family = enum ProtocolFamily
  //  SS: Protocol: [Semi2K, ABY3], FieldType, BeaverConfig, SerializeFormat
  //  ECC: Hash2Curve, EcGroup, SerializeFormat
  //  PHE: Protocol: [Paillier, EcElgamal], SerializeFormat
  repeated int32 protocol_families = 7;
  repeated google.protobuf.Any protocol_family_params = 8;

  //** Data IO **//

  // Defines the input and result output formats of the AI/BI algorithm,
  // excluding the format of intermediate interaction data
  // PSI: item_count, result_to_rank
  // SS-LR: sample_size, feature_num, has_label, etc.
  google.protobuf.Any io_param = 9;
}

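HandshakeRequest mainly carries the following information:

  1. The protocol version number

  2. The requester's rank at the transport layer

  3. The specific algorithms the requester wants to use, for example ECDH-PSI

  4. The detailed parameters of each algorithm; ECDH-PSI ignores this field

  5. The types of secure operators used; ECDH-PSI ignores this field

  6. The detailed parameters of each secure operator; ECDH-PSI ignores this field

  7. The cryptographic protocol families used, for example the ECC family

  8. The detailed parameters of each protocol family; for example, the ECC family must specify the elliptic curve type, the hash algorithm, and similar parameters

  9. The input and result output formats of the algorithm; for example, ECDH-PSI must specify whether the result is visible to both A and B or to only one party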
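To make the field list above concrete, here is a minimal Python sketch of how a requester might fill in a HandshakeRequest that offers only ECDH-PSI. It uses a plain dataclass as a stand-in for the generated protobuf class, which is an assumption for illustration; real code would use the classes generated from entry.proto. The helper name `make_ecdh_psi_request` and the dictionary placeholders for the nested proposals are hypothetical. Leaving `algo_params`, `ops` and `op_params` empty assumes that "skip" means the fields are simply omitted.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional

# Values copied from the AlgoType / ProtocolFamily enums in entry.proto.
ALGO_TYPE_ECDH_PSI = 1
PROTOCOL_FAMILY_ECC = 1


@dataclass
class HandshakeRequest:
    """Plain-Python stand-in for the protobuf message (illustration only)."""
    version: int = 0
    requester_rank: int = 0
    supported_algos: List[int] = field(default_factory=list)
    algo_params: List[Any] = field(default_factory=list)
    ops: List[int] = field(default_factory=list)
    op_params: List[Any] = field(default_factory=list)
    protocol_families: List[int] = field(default_factory=list)
    protocol_family_params: List[Any] = field(default_factory=list)
    io_param: Optional[Any] = None


def make_ecdh_psi_request(rank: int, ecc_proposal: Any, io_proposal: Any) -> HandshakeRequest:
    """Hypothetical helper: build a request that offers only ECDH-PSI over ECC."""
    return HandshakeRequest(
        version=2,  # current handshake version
        requester_rank=rank,
        supported_algos=[ALGO_TYPE_ECDH_PSI],
        # algo_params, ops and op_params stay empty: ECDH-PSI skips them
        # (assumption: "skip" means the fields are left unset).
        protocol_families=[PROTOCOL_FAMILY_ECC],
        protocol_family_params=[ecc_proposal],
        io_param=io_proposal,
    )


# Placeholder dictionaries stand in for the nested proposal messages.
req = make_ecdh_psi_request(
    rank=0,
    ecc_proposal={"ec_suits": []},
    io_proposal={"item_num": 1000, "result_to_rank": -1},
)
```

The three list pairs (`supported_algos`/`algo_params`, `ops`/`op_params`, `protocol_families`/`protocol_family_params`) mirror the three layers of the message, so a request for another algorithm would change only the entries in those lists.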

The values of the supported_algos field in HandshakeRequest are taken from the AlgoType enum, defined as follows:

interconnection/handshake/entry.proto#
enum AlgoType {
  ALGO_TYPE_UNSPECIFIED = 0;
  ALGO_TYPE_ECDH_PSI = 1;
  ALGO_TYPE_SS_LR = 2;
}

The values of the protocol_families field in HandshakeRequest are taken from the ProtocolFamily enum, defined as follows:

interconnection/handshake/entry.proto#
enum ProtocolFamily {
  PROTOCOL_FAMILY_UNSPECIFIED = 0;
  PROTOCOL_FAMILY_ECC = 1;
  PROTOCOL_FAMILY_SS = 2;
  PROTOCOL_FAMILY_PHE = 3;
}

If the protocol family is ECC, the protocol_family_params field in HandshakeRequest has the following format:

interconnection/handshake/protocol_family/ecc.proto#
message EccProtocolProposal {
  repeated int32 supported_versions = 1;

  // list of <curve, hash, hash2curve_strategy> suits
  repeated EcSuit ec_suits = 2;

  // ref enum PointOctetFormat
  // Serialization format of points
  repeated int32 point_octet_formats = 3;

  // Whether to enable the optimization method: secondary ciphertext truncation
  bool support_point_truncation = 4;
}

The EcSuit message used by the ec_suits field of EccProtocolProposal is defined as follows:

interconnection/handshake/protocol_family/ecc.proto#
// Suit of <curve, hash, hash2curve_strategy>
message EcSuit {
  // ref enum CurveType
  int32 curve = 1;

  // ref enum HashType
  int32 hash = 2;

  // ref enum HashToCurveStrategy
  int32 hash2curve_strategy = 3;
}

If the algorithm is ECDH-PSI, the io_param field in HandshakeRequest has the following format:

interconnection/handshake/algos/psi.proto#
message PsiDataIoProposal {
  repeated int32 supported_versions = 1;

  // How many items I have, i.e. the total number of PSI items
  // to be intersected.
  int64 item_num = 2;

  // Which rank can receive the PSI results.
  //
  // NOTES:
  //   `-1`: all parties can get the intersection results
  //   `>= 0`: only the party with the corresponding rank gets the results
  int32 result_to_rank = 3;
}
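The result_to_rank convention can be captured in a small predicate. The following Python sketch is illustrative only: the function name `receives_result` is hypothetical, and treating negative values other than `-1` as invalid is an assumption, since the message comments define only `-1` and `>= 0`.

```python
ALL_PARTIES = -1  # sentinel documented in psi.proto


def receives_result(my_rank: int, result_to_rank: int) -> bool:
    """Return True if the party with `my_rank` should obtain the intersection."""
    if result_to_rank == ALL_PARTIES:
        return True
    if result_to_rank < 0:
        # Assumption: any other negative value is rejected as malformed.
        raise ValueError(f"invalid result_to_rank: {result_to_rank}")
    return my_rank == result_to_rank
```

For example, with `result_to_rank = 1` only the party of rank 1 learns the intersection, while `-1` lets every party learn it.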

The HandshakeResponse message, which carries the result of the handshake request, is defined as follows:

interconnection/handshake/entry.proto#
message HandshakeResponse {
  // response header
  ResponseHeader header = 1;

  //** AI/BI algorithm layer **//

  // algos = enum AlgoType
  int32 algo = 2;

  // Detailed handshake parameters of the algorithm
  // SS-LR: learning_rate, optimizer, normalize
  // ECDH-PSI: Nothing, skip
  google.protobuf.Any algo_param = 3;

  //** Secure operator layer **//

  // The ops required by the AI/BI algorithm are listed here
  // op = enum OpType
  // ECDH-PSI: Nothing, skip
  repeated int32 ops = 4;
  repeated google.protobuf.Any op_params = 5;

  //** Cryptographic protocol layer **//

  // protocol_family = enum ProtocolFamily
  //  SS: Protocol: [Semi2K, ABY3], FieldType, BeaverConfig, SerializeFormat
  //  ECC: Hash2Curve, EcGroup, SerializeFormat
  //  PHE: Protocol: [Paillier, EcElgamal], SerializeFormat
  repeated int32 protocol_families = 6;
  repeated google.protobuf.Any protocol_family_params = 7;

  //** Data IO **//

  // Defines the input and result output formats of the AI/BI algorithm,
  // excluding the format of intermediate interaction data
  // PSI: item_count, result_to_rank
  // SS-LR: sample_size, feature_num, has_label, etc.
  google.protobuf.Any io_param = 8;
}

ResponseHeader is defined as follows:

interconnection/common/header.proto#
syntax = "proto3";

package org.interconnection;

// 31100xxx is the code segment reserved for engine white-box interconnection
enum ErrorCode {
  OK = 0;

  GENERIC_ERROR = 31100000;
  UNEXPECTED_ERROR = 31100001;
  NETWORK_ERROR = 31100002;

  INVALID_REQUEST = 31100100;
  INVALID_RESOURCE = 31100101;

  HANDSHAKE_REFUSED = 31100200;
  UNSUPPORTED_VERSION = 31100201;
  UNSUPPORTED_ALGO = 31100202;
  UNSUPPORTED_PARAMS = 31100203;
}

message ResponseHeader {
  int32 error_code = 1;
  string error_msg = 2;
}
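A requester would normally inspect the ResponseHeader before reading the rest of the response. The sketch below shows one way to do that in Python. The `ErrorCode` values are copied from header.proto; the `HandshakeError` exception and the `check_header` function are hypothetical names introduced for illustration, and mapping unrecognised codes to `UNKNOWN` is an assumption.

```python
from enum import IntEnum


class ErrorCode(IntEnum):
    """Mirror of the ErrorCode enum in header.proto."""
    OK = 0
    GENERIC_ERROR = 31100000
    UNEXPECTED_ERROR = 31100001
    NETWORK_ERROR = 31100002
    INVALID_REQUEST = 31100100
    INVALID_RESOURCE = 31100101
    HANDSHAKE_REFUSED = 31100200
    UNSUPPORTED_VERSION = 31100201
    UNSUPPORTED_ALGO = 31100202
    UNSUPPORTED_PARAMS = 31100203


class HandshakeError(Exception):
    """Hypothetical exception raised when the peer rejects the handshake."""


def check_header(error_code: int, error_msg: str) -> None:
    """Raise HandshakeError unless the header reports OK."""
    if error_code == ErrorCode.OK:
        return
    try:
        name = ErrorCode(error_code).name
    except ValueError:
        name = "UNKNOWN"  # code outside the enum above
    raise HandshakeError(f"{name} ({error_code}): {error_msg}")
```

Because `error_code` is declared as a plain `int32` rather than as the enum type, a receiver may see values it does not recognise, which is why the lookup is guarded.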

If the protocol family is ECC, the protocol_family_params field in HandshakeResponse has the following format:

interconnection/handshake/protocol_family/ecc.proto#
message EccProtocolResult {
  int32 version = 1;

  // The chosen suit
  EcSuit ec_suit = 2;

  // The chosen octet format
  int32 point_octet_format = 3;

  // optimization method: secondary ciphertext truncation
  // -1 means disable this optimization (do not truncate)
  int32 bit_length_after_truncated = 4;
}
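The proposal lists several candidates for each parameter, while the result carries exactly one choice per parameter, so the responder has to select among them. The Python sketch below shows one possible selection rule, "first offered candidate that is also supported locally". That rule is an assumption, not something the messages prescribe. The function names, the tuple representation of EcSuit, and the integer values used for curves, hashes and formats are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

EcSuit = Tuple[int, int, int]  # (curve, hash, hash2curve_strategy)


@dataclass
class EccProtocolResult:
    """Plain-Python stand-in for the protobuf message (illustration only)."""
    version: int
    ec_suit: EcSuit
    point_octet_format: int
    bit_length_after_truncated: int


def first_common(offered: Sequence, supported: Sequence):
    """Return the first offered item that is also supported locally, else None."""
    for item in offered:
        if item in supported:
            return item
    return None


def negotiate_ecc(
    offered_versions: Sequence[int],
    offered_suits: Sequence[EcSuit],
    offered_formats: Sequence[int],
    peer_supports_truncation: bool,
    local_versions: Sequence[int],
    local_suits: Sequence[EcSuit],
    local_formats: Sequence[int],
    local_truncation_bits: int,
) -> Optional[EccProtocolResult]:
    """Pick one value per parameter, or return None if any has no overlap."""
    version = first_common(offered_versions, local_versions)
    suit = first_common(offered_suits, local_suits)
    fmt = first_common(offered_formats, local_formats)
    if version is None or suit is None or fmt is None:
        return None  # the caller could answer with UNSUPPORTED_PARAMS
    # -1 disables truncation, as documented on bit_length_after_truncated.
    use_truncation = peer_supports_truncation and local_truncation_bits > 0
    bits = local_truncation_bits if use_truncation else -1
    return EccProtocolResult(version, suit, fmt, bits)


result = negotiate_ecc(
    offered_versions=[1, 2],
    offered_suits=[(1, 1, 1), (2, 1, 1)],  # placeholder enum values
    offered_formats=[1],
    peer_supports_truncation=True,
    local_versions=[2],
    local_suits=[(2, 1, 1)],
    local_formats=[1, 2],
    local_truncation_bits=96,
)
```

Treating the suit as an indivisible triple matters: the responder has to accept a whole `<curve, hash, hash2curve_strategy>` combination that the requester offered, not mix a curve from one suit with a hash from another.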

If the algorithm is ECDH-PSI, the io_param field in HandshakeResponse has the following format:

interconnection/handshake/algos/psi.proto#
message PsiDataIoResult {
  int32 version = 1;

  // Which rank receives the PSI results.
  //
  // NOTES:
  //   `-1`: all parties can get the intersection results
  //   `>= 0`: only the party with the corresponding rank gets the results
  int32 result_to_rank = 2;
}

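Protobuf Transport#

Protobuf messages are transmitted with the P2P transport protocol defined in the Transport Layer White-Box Interconnection Protocol (《传输层白盒互联互通协议》). The key of each transmission is generated by the method defined in that protocol, and the value is the binary string produced by protobuf serialization.

Main Algorithm#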

../_images/ecdh-psi-algo.png
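As a rough illustration of the double-encryption idea behind ECDH-PSI, the following Python sketch runs a toy two-party intersection inside one process. It assumes the figure follows the standard two-party construction, in which each party encrypts its own hashed items with its private key and the peer encrypts them a second time. To stay within the standard library it deliberately swaps elliptic-curve scalar multiplication for modular exponentiation in a small prime field, so it is not secure and is not the negotiated ECC suite. All function names are hypothetical.

```python
import hashlib
import secrets
from typing import List

P = 2**127 - 1  # a Mersenne prime; toy group only, NOT a secure choice


def hash_to_group(item: str) -> int:
    """Toy stand-in for hash-to-curve: hash, map into [1, P-1], then square."""
    h = int.from_bytes(hashlib.sha256(item.encode()).digest(), "big")
    return pow(1 + h % (P - 1), 2, P)


def encrypt(values: List[int], key: int) -> List[int]:
    """Commutative 'encryption': raise every value to the secret key mod P."""
    return [pow(v, key, P) for v in values]


def run_toy_psi(items_a: List[str], items_b: List[str]) -> List[str]:
    """Return the intersection as party A would learn it."""
    key_a = secrets.randbelow(P - 3) + 2
    key_b = secrets.randbelow(P - 3) + 2

    # First-stage ciphertexts: each party encrypts its own hashed items.
    enc_a = encrypt([hash_to_group(x) for x in items_a], key_a)
    enc_b = encrypt([hash_to_group(y) for y in items_b], key_b)

    # Second-stage ciphertexts: each party encrypts the peer's values again.
    dual_a = encrypt(enc_a, key_b)       # done by B, order of A's items kept
    dual_b = set(encrypt(enc_b, key_a))  # done by A on B's values

    # (h^a)^b == (h^b)^a, so equal items yield equal second-stage values.
    return [x for x, d in zip(items_a, dual_a) if d in dual_b]


common = run_toy_psi(["alice", "bob", "carol"], ["bob", "dave", "carol"])
```

Keeping A's second-stage values in A's original order is what lets A map matching ciphertexts back to its own plaintext items.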

Steps 2 and 4 of the algorithm transmit data in the EcdhPsiCipherBatch format, which is defined as follows:

interconnection/runtime/ecdh_psi.proto#
// ECDH PSI ciphertext transmission
message EcdhPsiCipherBatch {
  // The type hint for each message (ciphertext type).
  //
  // "enc": the first stage ciphertext
  //
  // "dual.enc": the second stage ciphertext
  //
  // Mainly used to distinguish first-stage from second-stage ciphertexts.
  string type = 1;

  // The batch index. Starts from 0.
  int32 batch_index = 3;

  // Is last batch flag
  bool is_last_batch = 4;

  // Count of items in this batch.
  // count == 0 is allowed for last batch
  int32 count = 6;

  // The packed all in one ciphertext for this batch.
  //
  // The first stage ciphertext takes 256 bits for each ciphertext element.
  // However, the second stage ciphertext takes 96 bits each. According to PSI
  // papers, we do not need to send all 256 bit for the final ciphertext. The
  // number of bits needed to compare is `Log(MN) + 40` given a 40 bits
  // statistical security parameter. TODO (add paper link here).
  //
  // We define each bucket has less than 2^28 items, i.e. about 270 million
  // items, which is general enough for various psi algorithms.
  //
  // NOTE: we do not use `repeated` here to save overhead of metadata.
  bytes ciphertext = 7;
}

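The ciphertext field stores points on the elliptic curve: each point is serialized according to the point_octet_format agreed in the handshake, and the serialized points are stored contiguously, one after another.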
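The batching and packing rules can be sketched as follows. This Python example uses a dataclass as a stand-in for the generated protobuf class and assumes that every serialized point has the same fixed length, so that the receiver can split the packed buffer using `count` alone. That holds for fixed-width encodings but is an assumption about the negotiated point_octet_format. The names `make_batches` and `split_points` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class EcdhPsiCipherBatch:
    """Plain-Python stand-in for the protobuf message (illustration only)."""
    type: str
    batch_index: int
    is_last_batch: bool
    count: int
    ciphertext: bytes


def make_batches(points: List[bytes], type_hint: str, batch_size: int) -> Iterator[EcdhPsiCipherBatch]:
    """Yield batches of at most batch_size points, flagging the final one.

    An empty input still yields one batch with count == 0 and
    is_last_batch == True, so the receiver knows the stream has ended.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    n_batches = max(1, -(-len(points) // batch_size))  # ceiling division
    for i in range(n_batches):
        chunk = points[i * batch_size:(i + 1) * batch_size]
        yield EcdhPsiCipherBatch(
            type=type_hint,
            batch_index=i,
            is_last_batch=(i == n_batches - 1),
            count=len(chunk),
            ciphertext=b"".join(chunk),  # points stored back to back
        )


def split_points(batch: EcdhPsiCipherBatch, point_size: int) -> List[bytes]:
    """Recover the individual serialized points from a packed batch."""
    if len(batch.ciphertext) != batch.count * point_size:
        raise ValueError("ciphertext length does not match count * point_size")
    data = batch.ciphertext
    return [data[i:i + point_size] for i in range(0, len(data), point_size)]


# Five dummy 32-byte "points" sent as first-stage ciphertexts, two per batch.
points = [bytes([i]) * 32 for i in range(5)]
batches = list(make_batches(points, "enc", batch_size=2))
recovered = [p for b in batches for p in split_points(b, 32)]
```

Checking `len(ciphertext) == count * point_size` on receipt is a cheap way to reject a malformed or truncated batch before any point is deserialized.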