前言
往期推送过一个蒙哥马利算法的介绍,如果要实现蒙哥马利模乘的硬件模块,那么一个参考模型是必不可少的,这一期将利用SV实现一个简单的参考模型,这个参考模型可以直接用于功能仿真
根据以往推送中的运算流程进行建模
类的定义
class BN;rand bit [127:0] num [32:0];string name;constraint c {num[32]==0;num[0][0]==1;};function new(string name="A");this.name = name;endfunctionendclass : BN
定义一个大数的类,计算的位宽是4096,而使用的基是128bit也就是基为\(2^{128}\),数组大小定义为33,用于处理数运算时的溢出。
大数显示
// BN displayfunction void BN_display(bit flag=0);string s;s={s,name,":"};if(flag==1)s={s,$sformatf("%h",num[32])};for (int i=1; i<32+1; i++) begins={s,$sformatf("%h",num[32-i])};ends={s,"\n"};$display(s);endfunction : BN_display
默认不打印num[32]
大数移位
// BN_shift
function void BN_shift();for (int i=0; i<32; i++) beginnum[i] = num[i+1];endnum[32] = 0;
endfunction : BN_shift
蒙哥马利算法中有/b的操作,我们选取\(2^{128}\)为基,那么这个操作就能使用移位实现,运算的大数为无符号数,最高为补0
大数乘法
// BN_mulfunction void BN_mul(bit [127:0] a,BN ans);bit [255:0] temp;bit [127:0] carry;for (int i = 0; i < 32; i++)begintemp = num[i] * a + carry;ans.num[i] = temp[127:0];carry = temp>>128;endans.num[32] = carry;endfunction : BN_mul
实现的是4096*128位的乘法,用于计算\(y_i \cdot x\)和\(u \cdot N\)
大数加法
// BN_addfunction automatic void BN_add(input BN a,b,ref BN c);bit [128:0] psum;bit carry;carry =0;for (int i = 0; i < 32+1; i++)beginpsum = a.num[i] + b.num[i]+ carry;c.num[i] = psum;carry = psum[128];endendfunction : BN_add
计算两个大数的加法
大数减法
利用取补码,然后相加实现
取补码
取反加一
// BN_comfunction void BN_com(ref BN c);BN temp,b;temp = new("temp");b = new("b");for (int i=0; i<32+1; i++) begintemp.num[i]=~this.num[i];b.num[i]=0;endb.num[0]=1;BN_add(temp,b,c);endfunction : BN_com
实现减法
// BN_subfunction automatic void BN_sub(input BN a,b,ref BN c);BN com;com =new("com");b.BN_com(com);BN_add(a,com,c);endfunction : BN_sub
计算\(\omega\)
计算原理如下图,下图有一处错误,循环变量中的\(\omega\)和我们所求的\(\omega\)并不是一个变量,这里指的是我们基的bit数
// BN_wfunction void BN_w(output bit [127:0] w);bit[127:0] t;bit[255:0] tt;t=1;tt=0;for (int i=0; i<127; i++) begintt=t*t;t=tt[127:0];tt=t*num[0];t=tt[127:0];$display("%h",t);endtt=0-t;w=tt[127:0];endfunction
大数比较
最后输出时要检查r是否小于N
// BN_cmp return 1 when this >= afunction bit BN_cmp(input BN a);for (int i=0; i<32+1; i++) beginif(num[32-i]<a.num[32-i])return 0;if(num[32-i]>a.num[32-i])return 1;endreturn 1;endfunction
蒙哥马利模乘
有了上面的函数,我们就能实现蒙哥马利模乘啦
// BN_subfunction automatic void mont_mul(input BN x,y,N,ref BN r);BN temp1,temp2,temp3;bit [127:0] u,w;bit [255:0] ut;temp1=new("temp1");temp2=new("temp2");temp3=new("temp3");N.BN_w(w);$display("%h",w);x.BN_display();y.BN_display();N.BN_display();r.BN_display();for (int i=0; i<32; i++) beginut=y.num[i]*x.num[0];u=ut[127:0];ut=r.num[0]+u;u=ut[127:0];ut=u*w;u=ut[127:0];x.BN_mul(y.num[i], temp1);N.BN_mul(u, temp2);BN_add(r, temp1, temp3);BN_add(temp3, temp2, r);r.BN_shift();endif(r.BN_cmp(N)) beginBN_sub(r,N,temp1);r.num=temp1.num;end$display("x*y*p^-1 mod N:");r.BN_display();endfunction : mont_mul
tb测试
顶层文件内容为
module tb;import BN_pkg::*;initial beginBN x,y,N,r;x=new("x");y=new("y");N=new("N");r=new("r");void'(x.randomize());void'(y.randomize());void'(N.randomize());mont_mul(x,y,N,r);endendmodule
我们运行一次
那么如何验证是否正确呢,我们使用python检查一下,python自带了大数运算的库,可以自动的计算大数模乘。
下面的python代码用也模拟了一遍蒙哥马利模乘的流程,与直接模乘进行对比。蒙哥马利模乘的计算结果实际上是\(x \cdot y \cdot \rho^{-1} mod N\),所以要将结果再乘以一个\(\rho\)才能进行对比。把上述打印出来的结果复制到python中运行,进行检查。
# tranfer long num to array
def tranBN(bn,bn_a):bn_t=bnfor x in range(0,32):bn_a[x] = bn_t % 0x100000000000000000000000000000000bn_t = bn_t >> 128# -n^(-1) mod 2^128
def get_w(n):nw=n % 0x100000000000000000000000000000000w=1;for x in range(0,128):w=w*w % 0x100000000000000000000000000000000w=w*nw % 0x100000000000000000000000000000000# print("w:%x" %w)w=0x100000000000000000000000000000000-wreturn w# p mod n
def get_p():t=1for x in range(0,4096):t=(t<<1)t=t%creturn tif __name__ == '__main__':a = 0xdaaf6a1e33805c0f68e58a95cc39b4c64594dbc4f2b0ba88aae650f51e467e1c4f10ba102d77eb77c1547e0c40e6d7aeb05539c308ea01dafb6da336492100ab2cdd38a580091aaec64d74192431c00cce4f4c752498e88aaa5ccc010b2317db8e01c0660e1dc9ba01154024448965f8209721d39158422ef2e1817ac4240be53bfc0f05b7336e172e271c9e9fcd38057746bbe8f5bb1907ab681ae012395e78e531f5291340108b4f8b182614a29fa0c7a44032229fe3fb3af01a5577cf335f318c1ecc70b613e7532ab85dc087c618020e949640cb14a3dbf634fa0b48f0098c9e9ee4861a5e6193f2a9241e28d1f4d3c9a8f11c460943dbd7b7b06f18fe75454e20593388dcaa8b98aabe293987d22e2725251d6ebf2729cde05db076ed775b7f369d1f9e1109812960b8b76e333bcca8aaa98931c2937cadb68a4ffc6c54eff9a6bcb77da76dc02fcb83167105319dd5a25f19d6ef0b214927120635e665afe46f681259247978d4a6853bb3cac03bc554d07003496f6b8b624bfec45f4cfb24acded0aeb074e8f70df1813ebb26bd5fe26be2a627684d793a8a052e3a9476a1d9697dd9e27beb4db7ad01eb8b0a3b5c7717d716ebd30727cc7786a17f09b04d6b94a56d9f70ac514e026f42834486e6a0852cef4808c7222cc02e908f2ab225f93e316612d1fd67359987ab7a23be6348b73f6764e60de2edf72c500ebfb = 0x14b8f4973aacf4927678c823e8c5988fe76caf5b4241b1a32954ccb35729ff3573f617bc077c8d165ec5270c0b863fc231ae96dd5d933e9a98abdaf3d6e852e98149945ab1a9a90e38e07c3017c1273b18598d87b59a289de9d7c5bc5c6f64cccdbcbec42c289c8b1b799f8454cba6b89e5976a84c19217d64ddde5af42e37ab465928d068deaa3a0270b8d062dbe0b737667c3afd065871532081e72bc1f79e1d7ebd1fb933ec3555a8e986f949f72ca11bc2fbe4c704b20838c68b707d9f3db1d8ae45b44b6bd36a58bfbf7d565347a6c6e20130c84f1bad77f6251e81dfb6ffa9a508d64db7d2fe48b5e4ebe68e7c8d62cdf5ab1c2ca8c2d2e835a1423acbef65956c980dfb62b3a405b9efbc93283d5071c2129b831481c537cc5be8f1d2723f1168f797bde736c1f73054d7d0dc97538fba25bb3e38703934d8fc46ad22eb23ea409184c3dba8241efc92ce5a6728f4385da637bc23ef7acb506d0543804ae7d660926a82406f9d3206376d5454466ecde2246a125c99aebdf16743d55cfb1c4ab0fdb8387320d541a94e3c5aa6038466eaa18682a163d571db3214de448b3d4d7a632bc60f0a524a041cd6e72a75dbc9f6bb63743df3c3c0d4649a28bd0bbeee569182303a66b830a2273b8df05c712adadf2bcb75244a66826265da778e0c3b45a20d6c962fd203e708ff62dd29b9edd96f2afd2bfe92014968e439efc = 0xa35f4906ed176bc241535c78955d02f4d0ac5376d736ae280077887200c758b7781b4432fa8baca2a81ad6fb0817051a00fccf8e15c63048681bcf8342b56433abd550affa489b289cd4f0482adce321c8cf4374ce15267692dfc8b0da108f4bb0e922d4a28402ef785c2516f6296486f8505ac3df05c0f953acce65e2dc5f1e59965ded73fa18ffb482ad1a2e5433d4df8211de12a3e7a71a1a084fed671fb11eeaf76f640c4fd549ea307b6622f798f027786e79232206de1507281d84c719209d408bc85f9ed2e1b82ecf72ff805a45221dc712c45a8dbc375e9b64227ec6b659a75fc5b5e051e776bcd9f4f6d82ebaff89a48c8494d6ed072372b846156af229994baab390ec57c00130255acc2cdf975783df4678153f0ca51b854425b1568b5b8b53239f50dd39fc53c3d41827a0687c435f6de5e98843def3fb7b0f7e701cdfb51517d6628392bd9291c16282556f5581766dd6a0a426a35312237399f93ad69502592c0f6d1864ba0b75600ee04cb406bcb833bc98527a0ac1249c6a918456b06f24611770c1708426b4d9041f7fe83be68fbc7018e461951d234ebf00227b4301911e24055c745203c888276f4db0c05f66514ae4e6b4bf4c8914e36c4a94bf57bf807dd40c7572d1a99c27d9f58af0877bb217c081d750d5edbe3c45eafc3ea6786560fa819873452cc8bffc7ab998ab70496b77fdadffb7e7262dt = 1a_a=[0 for i in range(32)]b_a=[0 for i in range(32)]c_a=[0 for i in range(32)]tranBN(a,a_a)tranBN(b,b_a)tranBN(c,c_a)w=get_w(c)r = 0
# mont mul , get a*b*p^-1 mod cfor x in range(0,32):r0 = r%0x100000000000000000000000000000000u = (r0+b_a[x]*a_a[0])*wu = u % 0x100000000000000000000000000000000r = r+b_a[x]*a+u*c# check r[127:0] == 0if r%0x100000000000000000000000000000000 == 0:passelse:print("r[127:0]!=0!!!!")print(r%0x100000000000000000000000000000000)r = r >> 128# print("%x"%r)if r>c:r=r-cprint("a*b*p^-1 mod c:")print("%x"%r)t=get_p()# compare to pythonr_s=(a*b) % cr_m=(r*t) % cprint("python result: \n%x" %r_s)print("mntmul result: \n%x" %r_m)if r_s==r_m:print("result match!")else:print("result dismatch!")
python中模拟蒙哥马利的结果与sv中一致
而python中模拟蒙哥马利和直接模乘结果也一致
注:结果太长,只截了一部分