SlideShare a Scribd company logo
1 of 23
Download to read offline
Reaping and breaking keys at scale:
when crypto meets big data
Nils Amiet
Yolan Romailler
August 2018 — DEF CON 26
Warning: These slides are not up to date
Get the latest slides from:
https://research.kudelskisecurity.com/
Warning: if you want to test your keys live
You can go to our website:
keylookup.kudelskisecurity.com
and submit your key to test it against our dataset!
If the key is already known, results are immediate,
otherwise you’ll have to check back later.
We’ll come back to this during our demo.
The problem
● Asymmetric cryptography relies on two type of keys:
○ the public key, that anybody can use to encrypt/verify data.
○ the private key, that should remain secret and allows for decryption/signing of the said data.
● Tons of public keys are out there:
○ TLS/SSL
○ SSH
○ PGP, …
● Can public keys leak data about the private keys?
Crypto recap
● RSA (Rivest–Shamir–Adleman)
○ Choose two large prime numbers p and q
○ Public key (n, e)
■ with n = p * q
■ and some e such that e and λ(n) are coprime
○ Private key (n, d) where d ≡ e^−1 (mod λ(n))
○ Message encryption
■ c ≡ m^e (mod n)
○ Ciphertext decryption
■ m ≡ c^d (mod n)
○ RSA security relies on the hardness of the integer factorization problem
Crypto recap (cont’d)
● ECC (“Elliptic Curve Cryptography”)
○ Security based on the hardness of the EC discrete logarithm problem
■ It is hard to retrieve the integer d, knowing the points G and Q=dG
○ Working with an elliptic curve C
○ Private key is an integer d
○ Public key is a point Q = (x, y) = dG
■ where (x, y) are the coordinates of the point on a given known curve
Passive attacks on public keys
● The Return of Coppersmith’s Attack (ROCA)
● RSA modulus factorization (Batch GCD)
● Invalid parameters
○ DSA generator
○ Key sizes
○ Invalid curve attacks
★ Batch GCD already used in 2010, 2012, 2016 to break weak keys
○ on datasets <100M keys
★ These are all known attacks!
★ And they are completely passive, the target is left unaware
Collecting public keys
● X.509 certificates
○ HTTPS, on IPv4 range and on CT domain names (>260 million)
○ IMAP(S), POP3(S), SMTP(S)
● SSH keys
○ Github.com, Gitlab.com
○ SSH host key scans (port 22)
● PGP keys
○ Github.com, Keybase.io
○ Key dumps from SKS pool of PGP key servers
Keys (millions) per key container type
Keys collected per data source
● PGP keys
○ 9.5M on SKS key servers
○ 220k on Keybase.io
○ 6k on Github.com
● SSH keys
○ 71M from CRoCS dataset
○ 17M from SSH scans
○ 4.5M on Github.com
○ 1.2M on Gitlab.com
● X.509 certificates
○ > 200M from HTTPS scans
○ 1-2M each from SMTP(S), POP3(S) and IMAP(S) scans
Our public keys stash: Big Brother style
● Attacks like RSA Batch GCD work best with larger datasets
○ More keys = more chances of finding common factors
● We collected as many public keys as we could
○ > 343,492,000 unique keys and growing
○ collection made over 1 year
Key types
● RSA 324M 324476553
● ECC 13M 13975895
● DSA 2.5M 2568969
● ElGamal 2.4M 2468739
● GOST R 34.10-2001 1k 1759
● Other <1k 217
What to do with the (big) data
1. Collect raw data
2. Parse raw data
3. Run additional tests on parsed data
4. Ingest parsed data into database
5. Run queries on distributed database
6. ???
7. Profit
Toolbox
To collect data:
● Fingerprinting with cert/key grabbing: Scannerl with custom modules
● To parse it: Python code
● To ingest it: NiFi and HDFS
● To study it: Presto
To break keys:
● Batch GCD on RSA keys, using a custom distributed version
● ROCA attack on vulnerable RSA keys
● Sanity checks on EC keys
Demo
[Live demo, using https://keylookup.kudelskisecurity.com/]
[Factoring a key with the GCD, rebuilding the private key out of the factors]
Behind the scenes
● Distributed fingerprinting using 50 Scannerl slaves
● Batch-GCD:
○ 280 vCPUs cluster across 7 nodes
○ 1.4 TB storage for intermediate results
● Data Lake with 10+ data nodes
● Storage: over 15TB including all raw data
Results: RSA keys
Broken keys:
● >4k RSA keys vulnerable to ROCA
○ 33% of size 2048 (weak), 64% of size 4096 (should be fine)
○ Mostly PGP keys (97%)
○ Found vulnerable keys on Keybase.io, Github.com and Gitlab.com! Check your keys!
● >200k RSA keys factored through batch GCD
○ Real-world breakable/broken keys!
○ 207k X.509 certificates, allowing for MitM attacks
■ at least 261 certs currently in use, 1493 certs used over last year
○ >1200 SSH keys, allowing for MitM attacks
○ 6 PGP keys, allowing for decryption, impersonation and other evil
Results: RSA keys
Unsurprisingly, many routers are concerned:
Results: RSA keys
D-Link problem
Results: ECC keys
● The adoption rate of ECC differs greatly depending on the source:
○ X509 and PGP are steadily adopting ECC
● Most common curves for SSH:
○ secp256r1 97,68%
○ secp521r1 1,87%
○ Curve25519 0,37%
○ secp384r1 0,07%
Growth of ECC keys
Scan failure
Fun facts
● Some keys are used as both PGP keys, SSH keys and/or X509 certs!
● PGP subkey/master key ratio
○ 50.5% master keys
○ 49.5% subkeys
○ Most people have only one subkey?!
● At least 361 keys we could factor had more than 2 factors!
● DSA is dead (OpenSSL deprecated it in 2015):
○ only 3106 X.509 certs seen over last year
○ < than 0.55% of SSH keys are DSA based
Conclusion
● Mind your keys!
● Anybody can do the same kind of silent attack!
And maybe they already do...
● Find our open source code on Github
○ https://github.com/kudelskisecurity
○ https://github.com/kudelskisecurity/scannerl
● Find more results and analysis on our blog
○ https://research.kudelskisecurity.com

More Related Content

More from Priyanka Aash

Cyber Crisis Management.pdf
Cyber Crisis Management.pdfCyber Crisis Management.pdf
Cyber Crisis Management.pdfPriyanka Aash
 
CISOPlatform journey.pptx.pdf
CISOPlatform journey.pptx.pdfCISOPlatform journey.pptx.pdf
CISOPlatform journey.pptx.pdfPriyanka Aash
 
Chennai Chapter.pptx.pdf
Chennai Chapter.pptx.pdfChennai Chapter.pptx.pdf
Chennai Chapter.pptx.pdfPriyanka Aash
 
Cloud attack vectors_Moshe.pdf
Cloud attack vectors_Moshe.pdfCloud attack vectors_Moshe.pdf
Cloud attack vectors_Moshe.pdfPriyanka Aash
 
Stories From The Web 3 Battlefield
Stories From The Web 3 BattlefieldStories From The Web 3 Battlefield
Stories From The Web 3 BattlefieldPriyanka Aash
 
Lessons Learned From Ransomware Attacks
Lessons Learned From Ransomware AttacksLessons Learned From Ransomware Attacks
Lessons Learned From Ransomware AttacksPriyanka Aash
 
Emerging New Threats And Top CISO Priorities In 2022 (Chennai)
Emerging New Threats And Top CISO Priorities In 2022 (Chennai)Emerging New Threats And Top CISO Priorities In 2022 (Chennai)
Emerging New Threats And Top CISO Priorities In 2022 (Chennai)Priyanka Aash
 
Emerging New Threats And Top CISO Priorities In 2022 (Mumbai)
Emerging New Threats And Top CISO Priorities In 2022 (Mumbai)Emerging New Threats And Top CISO Priorities In 2022 (Mumbai)
Emerging New Threats And Top CISO Priorities In 2022 (Mumbai)Priyanka Aash
 
Emerging New Threats And Top CISO Priorities in 2022 (Bangalore)
Emerging New Threats And Top CISO Priorities in 2022 (Bangalore)Emerging New Threats And Top CISO Priorities in 2022 (Bangalore)
Emerging New Threats And Top CISO Priorities in 2022 (Bangalore)Priyanka Aash
 
Cloud Security: Limitations of Cloud Security Groups and Flow Logs
Cloud Security: Limitations of Cloud Security Groups and Flow LogsCloud Security: Limitations of Cloud Security Groups and Flow Logs
Cloud Security: Limitations of Cloud Security Groups and Flow LogsPriyanka Aash
 
Cyber Security Governance
Cyber Security GovernanceCyber Security Governance
Cyber Security GovernancePriyanka Aash
 
Web Application Penetration Testing
Web Application Penetration Testing Web Application Penetration Testing
Web Application Penetration Testing Priyanka Aash
 
Hardware Security on Vehicles
Hardware Security on VehiclesHardware Security on Vehicles
Hardware Security on VehiclesPriyanka Aash
 
Web hacking using Cyber range
Web hacking using Cyber rangeWeb hacking using Cyber range
Web hacking using Cyber rangePriyanka Aash
 
Hacking IoT with EXPLIoT Framework
Hacking IoT with EXPLIoT FrameworkHacking IoT with EXPLIoT Framework
Hacking IoT with EXPLIoT FrameworkPriyanka Aash
 
Creating New Models To Combat Business Email Compromise
Creating New Models To Combat Business Email CompromiseCreating New Models To Combat Business Email Compromise
Creating New Models To Combat Business Email CompromisePriyanka Aash
 
Cyberterrorism. Past, Present, Future
Cyberterrorism. Past, Present, FutureCyberterrorism. Past, Present, Future
Cyberterrorism. Past, Present, FuturePriyanka Aash
 
Rethinking Application Security for cloud-native era
Rethinking Application Security for cloud-native eraRethinking Application Security for cloud-native era
Rethinking Application Security for cloud-native eraPriyanka Aash
 

More from Priyanka Aash (20)

Cyber Crisis Management.pdf
Cyber Crisis Management.pdfCyber Crisis Management.pdf
Cyber Crisis Management.pdf
 
CISOPlatform journey.pptx.pdf
CISOPlatform journey.pptx.pdfCISOPlatform journey.pptx.pdf
CISOPlatform journey.pptx.pdf
 
Chennai Chapter.pptx.pdf
Chennai Chapter.pptx.pdfChennai Chapter.pptx.pdf
Chennai Chapter.pptx.pdf
 
Cloud attack vectors_Moshe.pdf
Cloud attack vectors_Moshe.pdfCloud attack vectors_Moshe.pdf
Cloud attack vectors_Moshe.pdf
 
Stories From The Web 3 Battlefield
Stories From The Web 3 BattlefieldStories From The Web 3 Battlefield
Stories From The Web 3 Battlefield
 
Lessons Learned From Ransomware Attacks
Lessons Learned From Ransomware AttacksLessons Learned From Ransomware Attacks
Lessons Learned From Ransomware Attacks
 
Emerging New Threats And Top CISO Priorities In 2022 (Chennai)
Emerging New Threats And Top CISO Priorities In 2022 (Chennai)Emerging New Threats And Top CISO Priorities In 2022 (Chennai)
Emerging New Threats And Top CISO Priorities In 2022 (Chennai)
 
Emerging New Threats And Top CISO Priorities In 2022 (Mumbai)
Emerging New Threats And Top CISO Priorities In 2022 (Mumbai)Emerging New Threats And Top CISO Priorities In 2022 (Mumbai)
Emerging New Threats And Top CISO Priorities In 2022 (Mumbai)
 
Emerging New Threats And Top CISO Priorities in 2022 (Bangalore)
Emerging New Threats And Top CISO Priorities in 2022 (Bangalore)Emerging New Threats And Top CISO Priorities in 2022 (Bangalore)
Emerging New Threats And Top CISO Priorities in 2022 (Bangalore)
 
Cloud Security: Limitations of Cloud Security Groups and Flow Logs
Cloud Security: Limitations of Cloud Security Groups and Flow LogsCloud Security: Limitations of Cloud Security Groups and Flow Logs
Cloud Security: Limitations of Cloud Security Groups and Flow Logs
 
Cyber Security Governance
Cyber Security GovernanceCyber Security Governance
Cyber Security Governance
 
Ethical Hacking
Ethical HackingEthical Hacking
Ethical Hacking
 
Web Application Penetration Testing
Web Application Penetration Testing Web Application Penetration Testing
Web Application Penetration Testing
 
Hardware Security on Vehicles
Hardware Security on VehiclesHardware Security on Vehicles
Hardware Security on Vehicles
 
Web hacking using Cyber range
Web hacking using Cyber rangeWeb hacking using Cyber range
Web hacking using Cyber range
 
Hacking IoT with EXPLIoT Framework
Hacking IoT with EXPLIoT FrameworkHacking IoT with EXPLIoT Framework
Hacking IoT with EXPLIoT Framework
 
Telecom Security
Telecom SecurityTelecom Security
Telecom Security
 
Creating New Models To Combat Business Email Compromise
Creating New Models To Combat Business Email CompromiseCreating New Models To Combat Business Email Compromise
Creating New Models To Combat Business Email Compromise
 
Cyberterrorism. Past, Present, Future
Cyberterrorism. Past, Present, FutureCyberterrorism. Past, Present, Future
Cyberterrorism. Past, Present, Future
 
Rethinking Application Security for cloud-native era
Rethinking Application Security for cloud-native eraRethinking Application Security for cloud-native era
Rethinking Application Security for cloud-native era
 

Recently uploaded

SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 

Recently uploaded (20)

SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 

Reaping and breaking keys at scale: when crypto meets big data

  • 1. Reaping and breaking keys at scale: when crypto meets big data Nils Amiet Yolan Romailler August 2018 — DEF CON 26
  • 2. Warning: These slides are not up to date Get the latest slides from: https://research.kudelskisecurity.com/
  • 3. Warning: if you want to test your keys live You can go to our website: keylookup.kudelskisecurity.com and submit your key to test it against our dataset! If the key is already known, results are immediate, otherwise you’ll have to check back later. We’ll come back to this during our demo.
  • 4. The problem ● Asymmetric cryptography relies on two type of keys: ○ the public key, that anybody can use to encrypt/verify data. ○ the private key, that should remain secret and allows for decryption/signing of the said data. ● Tons of public keys are out there: ○ TLS/SSL ○ SSH ○ PGP, … ● Can public keys leak data about the private keys?
  • 5. Crypto recap ● RSA (Rivest–Shamir–Adleman) ○ Choose two large prime numbers p and q ○ Public key (n, e) ■ with n = p * q ■ and some e such that e and λ(n) are coprime ○ Private key (n, d) where d ≡ e^−1 (mod λ(n)) ○ Message encryption ■ c ≡ m^e (mod n) ○ Ciphertext decryption ■ m ≡ c^d (mod n) ○ RSA security relies on the hardness of the integer factorization problem
  • 6. Crypto recap (cont’d) ● ECC (“Elliptic Curve Cryptography”) ○ Security based on the hardness of the EC discrete logarithm problem ■ It is hard to retrieve the integer d, knowing the points G and Q=dG ○ Working with an elliptic curve C ○ Private key is an integer d ○ Public key is a point Q = (x, y) = dG ■ where (x, y) are the coordinates of the point on a given known curve
  • 7. Passive attacks on public keys ● The Return of Coppersmith’s Attack (ROCA) ● RSA modulus factorization (Batch GCD) ● Invalid parameters ○ DSA generator ○ Key sizes ○ Invalid curve attacks ★ Batch GCD already used in 2010, 2012, 2016 to break weak keys ○ on datasets <100M keys ★ These are all known attacks! ★ And they are completely passive, the target is left unaware
  • 8. Collecting public keys ● X.509 certificates ○ HTTPS, on IPv4 range and on CT domain names (>260 million) ○ IMAP(S), POP3(S), SMTP(S) ● SSH keys ○ Github.com, Gitlab.com ○ SSH host key scans (port 22) ● PGP keys ○ Github.com, Keybase.io ○ Key dumps from SKS pool of PGP key servers
  • 9. Keys (millions) per key container type
  • 10. Keys collected per data source ● PGP keys ○ 9.5M on SKS key servers ○ 220k on Keybase.io ○ 6k on Github.com ● SSH keys ○ 71M from CRoCS dataset ○ 17M from SSH scans ○ 4.5M on Github.com ○ 1.2M on Gitlab.com ● X.509 certificates ○ > 200M from HTTPS scans ○ 1-2M each from SMTP(S), POP3(S) and IMAP(S) scans
  • 11. Our public keys stash: Big Brother style ● Attacks like RSA Batch GCD work best with larger datasets ○ More keys = more chances of finding common factors ● We collected as many public keys as we could ○ > 343,492,000 unique keys and growing ○ collection made over 1 year
  • 12. Key types ● RSA 324M 324476553 ● ECC 13M 13975895 ● DSA 2.5M 2568969 ● ElGamal 2.4M 2468739 ● GOST R 34.10-2001 1k 1759 ● Other <1k 217
  • 13. What to do with the (big) data 1. Collect raw data 2. Parse raw data 3. Run additional tests on parsed data 4. Ingest parsed data into database 5. Run queries on distributed database 6. ??? 7. Profit
  • 14. Toolbox To collect data: ● Fingerprinting with cert/key grabbing: Scannerl with custom modules ● To parse it: Python code ● To ingest it: NiFi and HDFS ● To study it: Presto To break keys: ● Batch GCD on RSA keys, using a custom distributed version ● ROCA attack on vulnerable RSA keys ● Sanity checks on EC keys
  • 15. Demo [Live demo, using https://keylookup.kudelskisecurity.com/] [Factoring a key with the GCD, rebuilding the private key out of the factors]
  • 16. Behind the scenes ● Distributed fingerprinting using 50 Scannerl slaves ● Batch-GCD: ○ 280 vCPUs cluster across 7 nodes ○ 1.4 TB storage for intermediate results ● Data Lake with 10+ data nodes ● Storage: over 15TB including all raw data
  • 17. Results: RSA keys Broken keys: ● >4k RSA keys vulnerable to ROCA ○ 33% of size 2048 (weak), 64% of size 4096 (should be fine) ○ Mostly PGP keys (97%) ○ Found vulnerable keys on Keybase.io, Github.com and Gitlab.com! Check your keys! ● >200k RSA keys factored through batch GCD ○ Real-world breakable/broken keys! ○ 207k X.509 certificates, allowing for MitM attacks ■ at least 261 certs currently in use, 1493 certs used over last year ○ >1200 SSH keys, allowing for MitM attacks ○ 6 PGP keys, allowing for decryption, impersonation and other evil
  • 18. Results: RSA keys Unsurprisingly, many routers are concerned:
  • 20. Results: ECC keys ● The adoption rate of ECC differs greatly depending on the source: ○ X509 and PGP are steadily adopting ECC ● Most common curves for SSH: ○ secp256r1 97,68% ○ secp521r1 1,87% ○ Curve25519 0,37% ○ secp384r1 0,07%
  • 21. Growth of ECC keys Scan failure
  • 22. Fun facts ● Some keys are used as both PGP keys, SSH keys and/or X509 certs! ● PGP subkey/master key ratio ○ 50.5% master keys ○ 49.5% subkeys ○ Most people have only one subkey?! ● At least 361 keys we could factor had more than 2 factors! ● DSA is dead (OpenSSL deprecated it in 2015): ○ only 3106 X.509 certs seen over last year ○ < than 0.55% of SSH keys are DSA based
  • 23. Conclusion ● Mind your keys! ● Anybody can do the same kind of silent attack! And maybe they already do... ● Find our open source code on Github ○ https://github.com/kudelskisecurity ○ https://github.com/kudelskisecurity/scannerl ● Find more results and analysis on our blog ○ https://research.kudelskisecurity.com