A 90-minute online presentation for zkStudyClub, delivered 2020-06-01. I present a new idea with a demonstrated 5% speed-up for multi-scalar multiplication. When combined with precomputation, this method could yield upwards of 20% speed-up.
1. Multi-scalar multiplication: state of the art and new ideas
presented at zkStudyClub
Gus Gutoski
zkTeam, ConsenSys R&D
June 1, 2020
Gus Gutoski (zkTeam, ConsenSys R&D) MSM: SotA and new ideas June 1, 2020 1 / 57
2. Introduction
The multi-scalar multiplication problem (MSM)
Also known as. Multi-exponentiation, multi-exp.
Parameters. A cyclic group G whose order |G| has bit length b.
(Example. BLS or BN elliptic curves have |G| ≈ 2^256, so b = 256.)
Input. Group elements G1, . . . , Gn in G called inputs.
Integers a1, . . . , an between 0 and |G| called scalars.
Output. The group element a1G1 + · · · + anGn called the output.
Goal. Minimize the number of group (+) operations as a function of n.
Naive solution. Use double-and-add to compute each ai Gi , then add them all up.
Expected group ops: 1.5bn ≈ 384n.
Can we do better? (<sarcasm> No. Let’s all go home. </sarcasm>)
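To make the baseline concrete, here is a minimal Python model of the naive solution (an illustrative sketch only: plain integers under addition stand in for the group G, and the names `double_and_add`, `naive_msm`, and the op-counting convention are mine, not gnark's):

```python
def double_and_add(a, G, counter):
    """Compute a*G by left-to-right double-and-add, counting group (+) ops.
    A doubling and an addition each count as one op."""
    acc = 0                      # group identity
    for bit in bin(a)[2:]:
        acc = acc + acc          # double
        counter[0] += 1
        if bit == "1":
            acc = acc + G        # add
            counter[0] += 1
    return acc

def naive_msm(scalars, points):
    """Naive MSM: compute each a_i * G_i separately, then add them all up."""
    counter = [0]
    total = 0
    for a, G in zip(scalars, points):
        total += double_and_add(a, G, counter)
        counter[0] += 1          # add into the running total
    return total, counter[0]
```

For b-bit scalars of average Hamming weight b/2 this counts roughly 1.5b ops per scalar, matching the 384n figure for b = 256.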
4. Motivation: zero-knowledge provers
Example: Groth16 protocol (gross simplification)
Let n denote the size of the secret inputs x accepted by program P.
The proving key for program P contains (among other things) n group elements
G1, . . . , Gn.
Given a size-n secret input x for program P, the prover deduces integers a1, . . . , an and
computes G := a1G1 + · · · + anGn. The proof contains G (among other things).
5. Motivation: zero-knowledge provers
Are you motivated yet?
Program P takes size-n secret input =⇒ a ZKP prover must do a MSM of n points.
Example. Zcash spend program: n ≈ 4 × 10^4.
Example. Rollup (a scalability solution): the bigger the n, the better.
Goal. Programs with n = 10^7, 10^8, or more.
MSM accounts for 80% of prover work.
Justin Drake: “Focus on multi-exponentiation, forget about FFTs.” From the Zero Knowledge podcast, 2020-03-11.
Takeaway
Multi-scalar multiplication (MSM) dominates prover costs. Prover costs dominate ZKP costs.
Improvements for MSM immediately yield improvements in ZKP efficiency.
6. Overview
State of the art: the bucket method
zkTeam’s implementation in gnark: [GitHub]
BLS or BN curves (b ≈ 256).
Number of group (+) ops scales like
16n + (constant)
Compare: naive method scales like 384n. That’s a 24× improvement!
7. Overview
Overview of the bucket method
Succinct description in the paper 2012/549. (Section 4, “Overlap in the Pippenger
approach”.)
High-level strategy:
1. Reduce one b-bit MSM to several c-bit MSMs for manageable c ≤ b.
2. [Interesting part.] Use tricks to solve the c-bit MSMs. (See next section.)
3. Combine c-bit MSMs into the final b-bit MSM.
8. Overview
Step 1. Reduce b-bit MSM to c-bit MSM (1)
Choose c ≤ b. Write each scalar a1, . . . , an in binary. Partition binary scalars into c-bit parts.
Example. Given b = 12, choose c = 3. Each 12-bit scalar a is partitioned into 3-bit parts.
Given the scalar a = 1368 we write a = (2, 5, 3, 0):
1368 in binary : 010 | 101 | 011 | 000
part values    :  2  |  5  |  3  |  0
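This partition step is just bit slicing; a toy Python sketch (the function name is mine):

```python
def partition(a, b, c):
    """Split a b-bit scalar into b/c parts of c bits each,
    most significant part first (as displayed above)."""
    assert b % c == 0
    mask = (1 << c) - 1
    parts = [(a >> (c * i)) & mask for i in range(b // c)]  # LSB part first
    return parts[::-1]                                      # MSB part first
```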
9. Overview
Step 1. Reduce b-bit MSM to c-bit MSM (2)
Deduce b/c instances of c-bit MSM from the partitioned scalars.
Example, continued. (b, c, b/c) = (12, 3, 4). Partition each scalar ai = (ai,1, ai,2, ai,3, ai,4).
The 4 c-bit MSM instances T1, . . . , T4 are given by
T1 := a1,1G1 + · · · + an,1Gn
...
T4 := a1,4G1 + · · · + an,4Gn
10. Overview
Step 3. Combine c-bit MSMs into the final b-bit MSM
The usual way: double c times then add.
Example, continued. (b, c, b/c) = (12, 3, 4). Combine T1, . . . , T4 into the final answer T:
1. T ← T1
2. For j = 2, . . . , 4:
   2.1 T ← 2^c T (double c times)
   2.2 T ← T + Tj
Final answer: T = a1G1 + · · · + anGn.
Computation cost in group (+) ops:
(b/c − 1) × (c + 1) = b − c + b/c − 1    (1)
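In code the combine step is a short loop; a toy sketch over integers, where doubling is just adding an element to itself (the name `combine` is mine):

```python
def combine(windows, c):
    """Combine window results T_1, ..., T_{b/c} (most significant first)
    into the final answer: double c times, then add the next window."""
    T = windows[0]
    for Tj in windows[1:]:
        for _ in range(c):
            T = T + T            # double c times: c group (+) ops
        T = T + Tj               # one more group (+) op
    return T
```

With b/c windows this loop performs (b/c − 1)(c + 1) group ops, as in Eq. (1).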
11. Overview
Step 2. Use tricks to solve the c-bit MSMs
Ready for tricks?
12. Core
Each input goes into a bucket
6G1 + 2G2 + 6G3 + 1G4 + 7G5 + 2G6 + 3G7 + · · ·
2^c − 1 buckets: 1 2 3 4 5 6 7
bucket sums: S1 S2 S3 S4 S5 S6 S7
Each input is added into the bucket labelled by its scalar: G1 (scalar 6) goes into bucket 6, G2 (scalar 2) into bucket 2, G3 into bucket 6, G4 into bucket 1, G5 into bucket 7, and so on.
17. Core
Sum the contents of each bucket
2^c − 1 buckets: 1 2 3 4 5 6 7
bucket sums: S1 S2 S3 S4 S5 S6 S7
S1 ← G4 + G9 + G14    S3 ← G7 + G13    S5 ← G8    S7 ← G5 + G12
S2 ← G2 + G6          S4 ← G10         S6 ← G1 + G3 + G11
Expected cost to compute S1, . . . , S7 in group (+) ops:
n − (2^c − 1) = n − 2^c + 1    (2)
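A sketch of this bucket-sum step in Python (integers stand in for points; `bucket_sums` is my name for it):

```python
def bucket_sums(digits, points, c):
    """Add each point into the bucket labelled by its c-bit digit and
    return the bucket sums S[1..2^c - 1]; digit 0 contributes nothing."""
    S = [0] * (1 << c)           # S[m] is the sum for bucket m; S[0] unused
    for a, G in zip(digits, points):
        if a:
            S[a] = S[a] + G      # one group (+) op per nonzero digit
    return S
```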
18. Core
Combine the bucket sums to get the answer
Desired output a1G1 + · · · + anGn equals
S1 + 2S2 + 3S3 + · · · + 7S7
This is not obvious, but easy to check: an input with scalar m lands in bucket m, so it contributes m copies of itself to the term mSm.
This is another instance of MSM with inputs S1, . . . , S7 and scalars 1, . . . , 7.
The number of inputs, 2^c − 1, is fixed.
The scalars 1, . . . , 2^c − 1 are known in advance.
19. Core
A fast way to combine the bucket sums
The desired sum
S1 + 2S2 + 3S3 + · · · + 7S7
is computed via
S7
+ (S7 + S6)
+ (S7 + S6 + S5)
...
+ (S7 + S6 + S5 + S4 + S3 + S2 + S1)
Computation cost in group (+) ops:
2 × (2^c − 2) + 1 = 2^{c+1} − 3    (3)
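The same running-sum trick in Python (a sketch; `accumulate` is my name, and each `+` on the right-hand side below is one group (+) op):

```python
def accumulate(S):
    """Compute S[1] + 2*S[2] + ... + m*S[m] using the running-sum trick:
    keep a suffix sum S[m] + ... + S[j] and add it into a grand total.
    Each bucket label j contributes to exactly j of the suffix sums."""
    running = 0                  # S[m] + S[m-1] + ... + S[j]
    total = 0                    # sum of all suffix sums so far
    for Sj in reversed(S[1:]):   # j = m, m-1, ..., 1
        running = running + Sj
        total = total + running
    return total
```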
20. Efficiency
Total cost for the bucket method: theory
Expected cost in group (+) ops (from Eqs. (1), (2), (3)):
(b/c)(n + 2^c − 2) + b − c + b/c − 1 ≈ (b/c)(n + 2^c)    (4)
Minimum occurs at c ≈ log n. At first glance, asymptotic scaling looks like O(bn/log n).
Beware! We must have c ≤ b, so we cannot choose c ≈ log n when n > 2^b.
For n > 2^b scaling reverts to O(n).
Example (b = 1). n/log n scaling is impossible; O(n) is the best we can do.
Example (b = 256). n can never reach 2^256, so n/log n scaling is achievable.
21. Efficiency
Total cost for the bucket method: practice
Eq. (4): (b/c)(n + 2^c)
A large instance is n = 10^7 (so log n ≈ 23).
For gnark with b = 256 we observed peak performance at c = 16. This yields the cost claimed on slide 6: 16n + 2^20.
Puzzle: Why c = 16 instead of c = log n? Other concerns:
Memory use scales with 2^c. Eventually, memory is a bottleneck.
Fewer edge cases if c divides b.
Example. 256-bit scalars are stored in four 64-bit limbs. It’s annoying if a c-bit part straddles two limbs.
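Putting the three steps together gives a compact model of the whole bucket method (a toy over integers, for checking the logic rather than measuring speed; all names are mine):

```python
def bucket_msm(scalars, points, b, c):
    """Bucket-method MSM over a toy additive group (plain integers):
    for each c-bit window (most significant first), fill buckets,
    accumulate them with the running-sum trick, then shift and add."""
    assert b % c == 0
    mask = (1 << c) - 1
    T = 0
    for w in reversed(range(b // c)):        # most significant window first
        S = [0] * (1 << c)                   # buckets for this window
        for a, G in zip(scalars, points):
            digit = (a >> (c * w)) & mask
            if digit:
                S[digit] += G
        running = total = 0                  # running-sum accumulation
        for Sj in reversed(S[1:]):
            running += Sj
            total += running
        T = (T << c) + total                 # double c times, then add
    return T
```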
22. Improvements
Ideas to improve upon the bucket method
1. Parallelism: yes
2. Precomputation: not really, unless combined with item 4
3. Low Hamming-weight representations: no
4. [New!] [Elliptic curves only] Signed digits, and generalization: yes
23. Parallelism
Parallelism for the bucket method
Is the bucket method faster on multiple cores? Yes!
Natural boundary for parallel computation: each c-bit MSM is independent of the rest.
scalars : b-bit decimal : c-bit binary parts
a1 : 1368 : 010 | 101 | 011 | 000
a2 : 819  : 001 | 100 | 110 | 011
...
an : 2709 : 101 | 010 | 010 | 101
b/c cores :   1     2     3     4   (one core per column)
Easily make full use of up to b/c cores.
Example: (b, c) = (256, 16) =⇒ easy to use up to 256/16 = 16 cores.
Increased memory use: each core needs its own 2^c buckets.
24. Parallelism
Even more parallelism for the bucket method?
Sometimes we have more than b/c cores available. Can we use all of them? Yes, but. . .
Another natural boundary: partition the inputs
Inefficiency. Two MSM instances of size n/2 cost more group (+) ops than one MSM instance of size n.
gnark does not do this; parallelism is limited to b/c cores.
25. Precomputation
Precomputation for the bucket method
Inputs G1, . . . , Gn are known in advance. Can we use this to our advantage? Sort of.
Idea. Precompute a bunch of points and store them. Examples:
For each input G: 2G, 3G, . . . , (2^c − 1)G.
For each input G: 2^k G, 2^{2k} G, . . . , 2^{mk} G for some k, m.
Various subsets of inputs: (G1 + G2 + G3), (G1 + G2), (G1 + G3), (G2 + G3).
Goal. A smooth trade-off between precomputed storage and run time.
Problem. Large MSM instances already use most available memory.
Example. For n = 10^8, gnark needs 58GB to store enough BLS12-377 curve points to produce a ZKP for a program with size-n secret input.
Perhaps we could store extra points on disk. But disk reads might be too slow.
Experimentation needed.
26. Precomputation
Naive precomputation for the bucket method
For each input G precompute 2G, 3G, . . . , (2^c − 1)G.
Recall. The goal of c-bit MSM is to compute a1G1 + · · · + anGn for c-bit scalars
a1, . . . , an.
If the points aiGi are already in storage then there’s nothing left to do!
No need to compute bucket sums S1, . . . , S_{2^c−1}.
No need to accumulate bucket sums S1 + 2S2 + · · · + (2^c − 1)S_{2^c−1}.
Number of group (+) ops reduces to
(b/c) n + b − c + b/c − 1 ≈ (b/c) n
Extra storage space required is (2^c − 2)n points.
Extreme example b = c = 256. If we store 2^256 n points then we need only n group (+) ops.
Realistic example (n, b, c) = (2^23, 256, 16). Extra storage is 550 billion elliptic curve points!
27. Precomputation
A trade-off for naive precomputation
For each input G precompute k points: (2^c − k)G, . . . , (2^c − 1)G.
Run the bucket method only for the first 2^c − 1 − k buckets instead of 2^c − 1:
Compute only S1, . . . , S_{2^c−1−k}.
Accumulate only S1 + 2S2 + · · · + (2^c − 1 − k)S_{2^c−1−k}.
Total cost in group (+) ops:
(b/c)(n + 2^c − k − 2) + b − c + b/c − 1 ≈ (b/c)(n + 2^c − k)
Extra storage: kn points. Choose k as big as you can store.
Takeaway
Small storage capacity =⇒ negligible improvement.
Non-negligible improvements can only be achieved with very large storage capacity.
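Plugging numbers into the trade-off makes the takeaway concrete. A quick model of the cost formula above (the function name is mine; parameter values are the ones used elsewhere in the talk):

```python
def bucket_cost(n, b, c, k=0):
    """Approximate group (+) op count (b/c) * (n + 2^c - k),
    with k precomputed multiples stored per input."""
    return (b // c) * (n + (1 << c) - k)

# Storing k = 1000 extra points per input (1000n points in total!)
# saves well under 0.1% of the work at n = 10^6, b = 256, c = 16.
base = bucket_cost(10**6, 256, 16)
with_precomp = bucket_cost(10**6, 256, 16, k=1000)
saving = (base - with_precomp) / base
```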
28. Low Hamming-weight representations
Lessons from speed-ups for single-scalar multiplication
Problem. (Single-) scalar multiplication.
Input. Scalar a, group element G.
Output. Group element aG.
Standard method: double-and-add.
Cost increases with the Hamming weight of a.
Examples. 8-bit scalars:
128 binary: 10000000 7 group (+) ops
170 binary: 10101010 10 group (+) ops
240 binary: 11110000 10 group (+) ops
255 binary: 11111111 14 group (+) ops
Idea. Use a different (non-binary) encoding for scalars with lower average Hamming
weight.
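The op counts in the table follow a simple formula; a one-line Python check (the function name is mine):

```python
def ops_double_and_add(a):
    """Group (+) ops for left-to-right double-and-add on scalar a:
    (bit length - 1) doublings plus (Hamming weight - 1) additions."""
    return (a.bit_length() - 1) + (bin(a).count("1") - 1)
```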
29. Low Hamming-weight representations
Example: non-adjacent form (NAF)
Like binary except digits can be {−1, 0, 1}.
Requires group (−) ops to cost the same as group (+) ops.
Elliptic curve groups have this property.
Non-zero digits are never adjacent.
Average Hamming density is 1/3. (Compare: 1/2 for binary.)
Examples.
128 NAF: 0 1 0 0 0 0 0 0 0 7 group (+) ops
170 NAF: 0 1 0 1 0 1 0 1 0 10 group (+) ops
240 NAF: 1 0 0 0 −1 0 0 0 0 8 group (+) ops, 1 group (−) op
255 NAF: 1 0 0 0 0 0 0 0 −1 8 group (+) ops, 1 group (−) op
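NAF is easy to compute digit by digit; a short Python sketch (least significant digit first; `naf` is my name for it):

```python
def naf(a):
    """Non-adjacent form of a: digits in {-1, 0, 1}, least significant first.
    An odd value takes digit +1 or -1 so that the remainder is divisible by 4,
    which forces the next digit to be 0 (hence 'non-adjacent')."""
    digits = []
    while a > 0:
        if a & 1:
            d = 2 - (a % 4)      # +1 if a = 1 (mod 4), else -1
            a -= d
        else:
            d = 0
        digits.append(d)
        a >>= 1
    return digits
```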
30. Low Hamming-weight representations
Example: double-base number system (DBNS)
Scalars written as a linear combo of terms 2^i 3^j. (Compare: only powers 2^i for binary.)
Digit set can be binary {0, 1} or larger.
Highly redundant—each scalar has many representations.
Example. 127 has 783 representations with digit set {0, 1}. Minimum Hamming weight is 3. (Compare: 7 for binary.) There are 3 such representations:
127 = 2^2·3^3 + 2^1·3^2 + 2^0·3^0
127 = 2^2·3^3 + 2^4·3^0 + 2^0·3^1
127 = 2^5·3^1 + 2^0·3^3 + 2^2·3^0
Hamming density for a b-bit scalar is O(1/log b). (Compare: 1/2 for binary.)
31. Low Hamming-weight representations
Can low Hamming-weight representations improve the bucket method?
No!
The cost of the bucket method increases with the number of possible scalars.
There are 2^c possible c-bit scalars =⇒ we always need 2^c buckets, regardless of how those scalars are encoded.
The cost of the bucket method could be reduced if we have a guarantee that some scalars never (or rarely) occur.
Scalar encodings cannot provide such a guarantee.
More advanced techniques can give such a guarantee. (Example: Pippenger’s algorithm.)
The big question is whether the cost of establishing the guarantee outweighs its benefits.
That’s a discussion for another talk...
32. Improvement: exploit cheap group inversion
New idea: exploit cheap group inversion
Inspiration: elliptic curve group inversion is (almost) free.
Given G ∈ G, it’s cheap to compute −G via (x, y) → (x, −y).
Currently: c-bit scalars written with digit set {0, . . . , 2^c − 1}.
Instead, allow negative digits, e.g. {−2^{c−1}, . . . , 2^{c−1} − 1}.
If scalar a > 0 for point G then add G to bucket Sa as usual.
If scalar a < 0 for point G then add −G to bucket S|a|.
No need for buckets Sa for a > 2^{c−1}.
We have eliminated half the buckets!
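A sketch of one c-bit window with signed digits (a toy over integers, where negating a "point" is free, mirroring cheap curve negation; all names are mine):

```python
def signed_window_msm(digits, points, c):
    """One window of the bucket method with digits in
    {-2^(c-1), ..., 2^(c-1) - 1}: a point with digit -a is negated
    (cheap for elliptic curves) and added into bucket a, so only
    2^(c-1) buckets are needed instead of 2^c - 1."""
    S = [0] * ((1 << (c - 1)) + 1)   # buckets 1 .. 2^(c-1)
    for a, G in zip(digits, points):
        if a > 0:
            S[a] += G
        elif a < 0:
            S[-a] += -G              # cheap group inversion
    running = total = 0              # running-sum accumulation as before
    for Sj in reversed(S[1:]):
        running += Sj
        total += running
    return total
```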
33. Improvement: exploit cheap group inversion Buckets for c-bit MSM, revisited
Example: 3-bit MSM with negative digits
(−2)G1 + 2G2 + (−2)G3 + 1G4 + (−1)G5 + 2G6 + 3G7 + · · ·
2^{c−1} buckets: 1 2 3 4
bucket sums: S1 S2 S3 S4
A digit −a sends the negated point into bucket a: −G1 goes into bucket 2, G2 into bucket 2, −G3 into bucket 2, G4 into bucket 1, −G5 into bucket 1, and so on.
38. Improvement: exploit cheap group inversion Buckets for c-bit MSM, revisited
Sum the contents of each bucket
2^{c−1} buckets: 1 2 3 4
bucket sums: S1 S2 S3 S4
S1 ← G4 − G5 + G9 − G12 + G14    S3 ← G7 + G13
S2 ← −G1 + G2 − G3 + G6 − G11    S4 ← G10
39. Improvement: exploit cheap group inversion How much improvement?
Combine the bucket sums to get the answer
Like before, the desired output a1G1 + · · · + anGn equals
S1 + 2S2 + 3S3 + 4S4
Instead of 2^c − 1 buckets, we now have only 2^{c−1} buckets.
Bucket accumulation works exactly as before, except with half the buckets.
Bucket accumulation cost drops from ∼ 2^c to ∼ 2^{c−1}.
40. Improvement: exploit cheap group inversion How much improvement?
Total cost of the improved bucket method
New approximate cost in group (+) ops:
(b/c)(n + 2^{c−1})
for your choice of c.
Option 1: Use the same c as before and enjoy a 50% saving in bucket accumulation costs.
Option 2: Set c ← c + 1, which reduces the multiple of n:
(b/(c + 1))(n + 2^c)
41. Improvement: exploit cheap group inversion How much improvement?
How much improvement?
Under option 2 the multiple of n improves by the factor c/(c + 1).
Example. c = 19 =⇒ 5% improvement (ignoring bucket accumulation cost).
As discussed on slide 21, there might be other reasons not to change c.
Concrete improvement
I implemented a stupid PoC in gnark keeping c = 16 (option 1) and observed a 5.7% speed improvement for n = 10^6 inputs. [GitHub]
42. Improvement: exploit cheap group inversion Scalars with negative digits
How to express scalars with negative digits?
In the basic bucket method it’s easy to partition b-bit scalars into c-bit parts.
Example. (b, c) = (12, 3). Given a = 1368 we write a = (2, 5, 3, 0), most significant part first:
1368 in binary : 010 | 101 | 011 | 000
part values    :  2  |  5  |  3  |  0
In general: we are given for free a0, . . . , a_{b/c−1} from {0, . . . , 2^c − 1} with
a = Σ_{i=0}^{b/c−1} ai 2^{ci}
We need to find a0, . . . , a_{b/c−1} from {−2^{c−1}, . . . , 2^{c−1} − 1} with
a = Σ_{i=0}^{b/c−1} ai 2^{ci}
43. Improvement: exploit cheap group inversion Scalars with negative digits
Lending: like borrowing, except the opposite
function Signed-Digits(a0, . . . , a_{b/c−1})
    for i ← 0, . . . , b/c − 1 do
        if ai ≥ 2^{c−1} then
            assert: i ≠ b/c − 1       ▷ No overflow for the final digit!
            ai ← ai − 2^c             ▷ Force this digit into {−2^{c−1}, . . . , 2^{c−1} − 1}
            ai+1 ← ai+1 + 1           ▷ Lend 2^c to the next digit
        end if
    end for
    return a0, . . . , a_{b/c−1}
end function
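A direct Python translation of Signed-Digits (digits least significant first; `signed_digits` is my name for it):

```python
def signed_digits(parts, c):
    """Convert base-2^c digits in {0, ..., 2^c - 1} (least significant first)
    into digits in {-2^(c-1), ..., 2^(c-1) - 1} by lending 2^c upward.
    Requires headroom in the final digit, i.e. |G| fits comfortably in b bits."""
    out = list(parts)
    for i in range(len(out)):
        if out[i] >= 1 << (c - 1):
            assert i != len(out) - 1, "overflow in final digit"
            out[i] -= 1 << c         # force digit into the signed range
            out[i + 1] += 1          # lend 2^c to the next digit
    return out
```

For a = 1368 with c = 3, the unsigned digits (0, 3, 5, 2) become (0, 3, −3, 3), since 5 = −3 + 8.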
44. Improvement: exploit cheap group inversion Scalars with negative digits
On the efficiency of conversion to signed digits
Signed-Digits works only if |G| fits comfortably into b bits.
Example. For BLS12-377, |G| is 253 bits, typically stored in 256 bits =⇒ 3 unused bits in the final digit =⇒ overflow cannot occur.
Conversion to signed digits has a cost. Fortunately, that cost seems to be negligible.
What’s the most efficient way to compute signed digits?
In my stupid PoC I allocated separate memory for signed digits: n × b bits of additional memory use. For n = 10^6 that’s 32MB.
This memory can be saved if you compute ai on the fly. But it seems you need to compute
a0, . . . , ai−1 first, so there’s lots of repeated computation.
Careful! This problem becomes much worse when we generalize this idea.
45. Generalization: exploit cheap scalar multiplication
Can we do more?
Recall: The ability to cheaply compute (−1)G allowed us to reduce bucket accumulation
cost by a factor of 1/2.
Question. Suppose there are scalars µ1, . . . , µk (including 1) for which we can cheaply
compute µ1G, . . . , µkG. Can we reduce bucket accumulation cost by a factor of 1/k?
Dare to dream: the approximate cost in group (+) ops becomes
(b/c)(n + 2^c/k)    or    (b/(c + log k))(n + 2^c)
The multiple of n improves by the factor c/(c + log k).
Example. If (c, k) = (16, 16) then that’s a ∼20% improvement for MSM!
46. Generalization: exploit cheap scalar multiplication
Example: one extra scalar, and its combos with −1
Suppose we can cheaply compute (−1)G, λG.
We can also cheaply compute −λG.
Think: (µ1, µ2, µ3, µ4) = (1, −1, λ, −λ) so k = 4 instead of 2
Suppose we can write scalars using digit set
{0, 1, . . . , 2^c,  −1, . . . , −2^c,  λ, . . . , λ2^c,  −λ, . . . , −λ2^c}
This digit set has size ∼ 4 × 2^c and requires only ∼ 2^c buckets.
In this case, the new λ doubled k. (Hooray!) But that won’t always happen.
47. Generalization: exploit cheap scalar multiplication
Full generalization
Suppose we can cheaply multiply by µ1, . . . , µk.
Suppose we can write scalars using digit set
∪_{i=0,...,2^c; j=1,...,k} {i µj}
This digit set has size ∼ k 2^c and requires only ∼ 2^c buckets.
48. Generalization: exploit cheap scalar multiplication Endomorphism multiplication for elliptic curves
Cheap multiplication for elliptic curves
Consider an elliptic curve of the form
E : y^2 = x^3 + b
for some b in a prime field Fp. (e.g. BLS curves, . . . ).
Let G ⊆ E(Fp) be a prime order subgroup. Let β be a cube root of 1 mod p. For any G ∈ G
the map φ : (x, y) → (βx, y) acts as
φ : G → λG
where λ is a cube root of 1 mod |G|.
Takeaway
Computing λG can be implemented with a single multiplication modulo p—only slightly more
costly than computing −G and much cheaper than a group (+) op.
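This endomorphism is easy to sanity-check numerically. The sketch below uses secp256k1 (y^2 = x^3 + 7) only because its parameters are widely published; the BLS curves from the talk behave the same way. All helper names are mine, and the cube roots are computed on the fly rather than quoted:

```python
# Check that phi(x, y) = (beta*x, y) acts as multiplication by a cube
# root of unity lambda on a curve y^2 = x^3 + b. Toy affine arithmetic,
# not production code.
p = 2**256 - 2**32 - 977                 # secp256k1 base field prime
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ec_add(P, Q):
    """Affine point addition; None is the identity."""
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p == 0:
        return None                      # P + (-P)
    if P == Q:
        m = 3 * P[0] * P[0] * pow(2 * P[1], -1, p) % p
    else:
        m = (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p) % p
    x = (m * m - P[0] - Q[0]) % p
    return (x, (m * (P[0] - x) - P[1]) % p)

def ec_mul(k, P):
    """Double-and-add scalar multiplication."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

def cube_root_of_unity(m):
    """A nontrivial cube root of 1 mod m (requires m = 1 mod 3)."""
    g = 2
    while pow(g, (m - 1) // 3, m) == 1:  # skip cubic residues
        g += 1
    return pow(g, (m - 1) // 3, m)

beta = cube_root_of_unity(p)             # beta^3 = 1 mod p
lam = cube_root_of_unity(n)              # lam^3 = 1 mod |G|
phiG = (beta * G[0] % p, G[1])           # one multiplication mod p
# phi acts as multiplication by lam or lam^2, depending on which root we found
assert phiG in (ec_mul(lam, G), ec_mul(lam * lam % n, G))
```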
49. Generalization: exploit cheap scalar multiplication Endomorphism multiplication for elliptic curves
More cheap scalars?
φ gives us 2 new scalars λ, λ2.
Combos with −1 yield a total of k = 6 cheap scalars:
(µ1, . . . , µ6) = (1, −1, λ, −λ, λ2
, −λ2
)
Observe: new scalars do not always double k
Example. λ · λ^2 = 1, which is not another new scalar.
More endomorphisms like φ?
Galbraith-Lin-Scott find more in 2008/194. But you need to work in an extension field F_{p^k} instead of Fp.
We would need to convince everyone to switch to a different base field.
It might be worth it: Hu-Longa-Xu demonstrate a speed-up for single-scalar multiplication. [link]
50. Generalization: exploit cheap scalar multiplication Strange digit sets
Complication: large multiples
−1 is a well-behaved scalar:
Cheap to convert scalars to signed digits.
Overflow is easy to quantify and easy to avoid.
Other scalars λ might be very large:
Example. For BLS12-377, λ is 129 bits.
How do we convert scalars to digit sets containing λ?
Overflow might kill us.
51. Generalization: exploit cheap scalar multiplication Strange digit sets
First attempt to generalize Signed-Digits
Suppose:
We have a scalar a whose ith digit ai is large: 2^{c−1} ≤ ai < 2^c.
We want to map it to a small bucket label a′i with 0 ≤ a′i < 2^{c−1}.
We can write
λ a′i = ai + d 2^c
for some integers a′i, d with a′i in the desired range.
Example. We saw (λ, d) = (−1, −1).
Then we can do the following:
Set ai ← ai + d 2^c and ai+1 ← ai+1 − d.
(Pray that we do not overflow!)
During bucket accumulation for G ∈ G, add the point λG into bucket a′i.
Problem. If λ is large then d will be very large =⇒ digits become very large =⇒ badness
52. Generalization: exploit cheap scalar multiplication Strange digit sets
Mitigation: borrow from higher digits
Write
λ a′i = ai + d1 2^c + d2 2^{2c} + · · · + dℓ 2^{ℓc}
for d1, . . . , dℓ in some reasonable digit set.
Need to set several digits instead of just one:
ai+1 ← ai+1 − d1
...
ai+ℓ ← ai+ℓ − dℓ
53. Generalization: exploit cheap scalar multiplication Strange digit sets
Problem: that’s fine until we run out of digits
We cannot completely avoid overflow:
Example. If |G| is 256 bits and λ is 128 bits (like BLS12-377) then this optimization can
be used for only the lower half of digits =⇒ 50% performance penalty.
Example. If |G|, λ have equal bit lengths then this optimization cannot be used at all.
Open problem. Can we do better?
54. A boost for precomputation
Precomputation, revisited
On slide 27 we observed a trade-off for precomputed storage: the cost of the bucket method can be reduced to
(b/c)(n + 2^c − k)
at a cost of storing kn extra group elements. That’s not a very good trade-off.
Precomputation gives a better trade-off when combined with the new method of signed digits:
(b/c)(n + 2^c/k)
for (k − 1)n extra storage. (Thanks to Alexandre Belling.)
55. A boost for precomputation
Precomputation + signed digits = significant improvement
Advantage. We can choose λ to play nice with signed digits (e.g. λ = 2, 3, . . . )
Faster, simpler conversion to the new digit set; less worry of overflow.
Each new precomputed multiple λG significantly increases the number k of cheap scalars
µ1, . . . , µk at our disposal.
Example.
Suppose we start with 6 cheap scalars µ1, . . . , µ6. (Perhaps obtained from endomorphisms.)
Each additional precomputed multiple λ adds 6 new cheap scalars: λµ1, . . . , λµ6.
On slide 45 we estimated a ∼20% improvement for k ≥ 16. (Ignoring the cost to convert digit sets!)
We could exceed this target k = 16 with only 2n extra storage.
Example. Even with only 2 scalars (1, −1) we can reach k = 16 with 7n extra storage.
I did not implement this improvement.
56. Get off my lawn
Summary of open problems
Find an efficient way to write scalars using the digit set from slide 47 without overflow.
Prior art for single-scalar multiplication ([GLV], [HPX]) uses an SVP lattice solution: can it be adapted to MSM?
Find more scalars µ that admit cheap scalar multiplication.
The best lead I know is 2008/194 and follow-ups.
57. Get off my lawn
Fin
Thank you!