SemFuzz: Semantics-based Automatic
Generation of Proof-of-Concept Exploits
CCS 2017
Wei You, Kai Chen, XiaoFeng Wang, et al.
Indiana University, UCAS, and others
Abstract
● Exploits for input-validation flaws can be generated automatically, though this is
hard and rare
● Less understood are the implications of other bug-related information (CVE entries,
etc.); such information can facilitate exploit generation
● They present a tool called SemFuzz that leverages vulnerability-related text to
guide the automatic generation of PoC exploits
○ Target: Linux kernel, using CVE reports and git logs
○ Covers UAF, memory corruption, information leaks, etc.
● 18 of 112 CVEs triggered successfully, plus 1 zero-day and 1 undisclosed vulnerability found
Background
● Implication of “other” Information
● Challenges in automatic exploit
generation
Vulnerability Life Cycle
● System updates are often slow
● Miscreants are often given a large time frame (30
days on average), during which they can leverage
the information exposed by public patches to
recover hidden bugs
● Less understood, however, are the implications of
other information
○ CVE entries, git logs, bug descriptions posted on forums and blogs
● Can such information also be leveraged
for the automatic construction of complicated
exploits?
Challenges in AEG
● Attack on input-validation flaws
○ Symbolic execution
○ Constraint solving is known to be difficult
■ Non-linear, incomplete constraints
● Other types of vulnerabilities are more complicated
and cannot be fixed by a simple patch
○ Sometimes a whole chunk of code needs to be replaced
SemFuzz
● Design
● Semantic Information Retrieving
● Semantic Guided Fuzzing
Design
● Semantic Information
Retrieving
○ NLP
○ affected version, vulnerability
type, vulnerable functions, critical
variables, system calls
● Semantics-based Fuzzing
○ Generate seeds
○ Mutate
■ Coarse-level
■ Fine-grained
○ Event Listener
Semantic Information Retrieving
● Natural Language Processing
○ Part-of-Speech (POS) Tagging, Phrase Parsing and Syntactic Parsing
● Generating parse tree
○ Represent the syntactic structure of a sentence according to a Context-Free Grammar (CoFG)
S: sentence, NP: noun phrase, VP: verb phrase, JJ: adjective, NN: noun.
“the whole skb len is dangerous”
Semantic Information Retrieving
● Affected Version: Regular expression
● Vulnerability Type: Match Candidate Types List
● Vulnerable Functions: Code Diff
● Critical Variables: Match Symbol Table
● System Call:
○ 2 types, prepare environment or trigger the bug
○ Sometimes no syscall in bug description
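
The first two extraction steps above can be sketched in Python. This is only an illustration: the regular expression, the candidate type list, and the sample sentence are assumptions made up for this example, not the patterns SemFuzz itself uses.

```python
import re

# Hypothetical pattern for version phrases such as "before 4.6.3".
VERSION_RE = re.compile(r"\b(before|through)\s+(\d+(?:\.\d+)+)")

# Hypothetical candidate list of vulnerability types (lowercase phrases).
CANDIDATE_TYPES = [
    "use-after-free",
    "memory corruption",
    "information leak",
    "denial of service",
]

def extract_version(text):
    """Return (qualifier, version) for the first version phrase, or None."""
    match = VERSION_RE.search(text)
    return (match.group(1), match.group(2)) if match else None

def extract_types(text):
    """Return every candidate type mentioned in the text, in list order."""
    lowered = text.lower()
    return [t for t in CANDIDATE_TYPES if t in lowered]

# Made-up sentence written in the style of a CVE description.
sample = ("Use-after-free vulnerability in the Linux kernel before 4.6.3 "
          "allows local users to cause a denial of service.")

print(extract_version(sample))
print(extract_types(sample))
```

Plain substring matching is the simplest possible stand-in here; a real pipeline would work on the parse tree rather than on raw text.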
Syscall
● Build a knowledge base
○ Linux Programmer's Manual (LPM)
● Correlate the keywords
to domain-specific
concepts
○ e.g. link MSG_MORE to
the flags parameter of the
sendto system call
● Select the system call
that can cover the most
keywords
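
The selection rule above (pick the system call that covers the most keywords) can be sketched as follows. The knowledge base contents are hypothetical and hand-written for this example; in the slides the knowledge base is built from the manual pages.

```python
# Hypothetical knowledge base: syscall name -> keywords associated with it.
KNOWLEDGE_BASE = {
    "sendto": {"socket", "send", "msg_more", "flags", "udp"},
    "recvfrom": {"socket", "receive", "flags"},
    "open": {"file", "path", "flags"},
}

def select_syscall(keywords, kb=KNOWLEDGE_BASE):
    """Return the syscall whose keyword set covers the most report keywords.

    Returns None when no syscall covers any keyword, which models the
    "sometimes no syscall in bug description" case.
    """
    wanted = {k.lower() for k in keywords}
    best = max(kb, key=lambda name: len(kb[name] & wanted))
    return best if kb[best] & wanted else None

print(select_syscall(["MSG_MORE", "socket", "flags"]))
print(select_syscall(["deadlock"]))
```

Ties are broken by dictionary order in this sketch; the slides do not say how ties are handled.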
Semantics-Guided Fuzzing
● Environment Setup
○ Syzkaller-based framework
● Generating the seed input
● Coarse mutation
○ Find a system call sequence
● Fine-grained mutation
○ Mutation on variable
○ Monitor “critical variables”
● Trigger the vulnerability
KCOV: kernel code coverage API
Parameter Monitor: observe parameters of kernel functions (kf) instead of critical variables, using control-/data-flow analysis (C/DFA)
Out-Box Loader: capture abnormal events, KASAN, UBSAN, etc.
Seed Input
● First, put all retrieved syscalls together
○ this gives an incomplete seed input
○ fill in all parameters, including structures (learned from the LPM)
○ e.g. socket and sendto also need the bind syscall
● Second, correlate other system calls with the retrieved ones
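
A minimal sketch of the two steps above, assuming a hand-written prerequisite table and default argument table. Both tables, and the `build_seed` helper, are hypothetical; they only illustrate how a retrieved syscall can be completed into an ordered seed sequence.

```python
# Hypothetical table: syscall -> syscalls that must run before it.
PREREQS = {
    "bind": ["socket"],
    "sendto": ["socket", "bind"],
}

# Hypothetical defaults for parameters the report did not mention.
DEFAULT_ARGS = {
    "socket": {"domain": "AF_INET", "type": "SOCK_DGRAM", "protocol": 0},
    "bind": {"fd": "<socket result>", "addr": "127.0.0.1:0"},
    "sendto": {"fd": "<socket result>", "buf": b"A" * 8, "flags": 0},
}

def build_seed(retrieved, overrides=None):
    """Order the retrieved syscalls after their prerequisites and fill arguments."""
    overrides = overrides or {}
    order = []

    def visit(name):
        if name in order:
            return
        for dep in PREREQS.get(name, []):
            visit(dep)  # prerequisites go first
        order.append(name)

    for name in retrieved:
        visit(name)

    seed = []
    for name in order:
        args = dict(DEFAULT_ARGS.get(name, {}))
        args.update(overrides.get(name, {}))  # values taken from the report
        seed.append((name, args))
    return seed

seed = build_seed(["sendto"], {"sendto": {"flags": "MSG_MORE"}})
for name, args in seed:
    print(name, args)
```

The prerequisite table is assumed to be acyclic; a real seed generator would also have to thread return values such as the socket descriptor between calls.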
Coarse-level Mutation
● Mutate the input and check the distance between the
execution trace and the vulnerable function
○ measured as the shortest path
○ inputs that shorten it become new seed inputs
● Construct a reverse call graph
○ Backward reachability analysis
○ Modify GCC to collect call info
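
The distance check above can be sketched with a breadth-first search over a reversed call graph. The call edges and function names below are invented placeholders, not real kernel functions, and the scoring helper is an assumption about how a trace could be compared against the target.

```python
from collections import deque

# Hypothetical call edges (caller -> callees), as a compiler pass might record them.
CALLS = {
    "sys_a": ["f1"],
    "f1": ["f2"],
    "f2": ["vuln_fn"],
    "sys_b": ["g1"],
}

def reverse_graph(calls):
    """Build callee -> callers from caller -> callees."""
    rev = {}
    for caller, callees in calls.items():
        for callee in callees:
            rev.setdefault(callee, []).append(caller)
    return rev

def distances_to(target, calls):
    """Backward BFS: dist[f] is the fewest calls needed to get from f to target."""
    rev = reverse_graph(calls)
    dist = {target: 0}
    queue = deque([target])
    while queue:
        fn = queue.popleft()
        for caller in rev.get(fn, []):
            if caller not in dist:
                dist[caller] = dist[fn] + 1
                queue.append(caller)
    return dist

def trace_distance(trace, dist):
    """Smallest distance from any traced function to the target, or None."""
    reachable = [dist[f] for f in trace if f in dist]
    return min(reachable) if reachable else None

dist = distances_to("vuln_fn", CALLS)
print(trace_distance(["sys_a", "f1"], dist))
print(trace_distance(["sys_b", "g1"], dist))
```

Under this sketch, a mutated input would be kept as a new seed when its trace distance is smaller than that of its parent.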
Fine-grained Mutation
● Mutate the values of system call parameters
● Observe only the function parameters that the critical variables depend on
○ Found through data-flow analysis (DFA) and control-flow analysis (CFA)
● Measure input quality using the distance between basic blocks (BBLs)
e: entry BBL
p: patch BBL
b: current BBL
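
One way to turn the three blocks e, p, and b into a score is sketched below. The slides do not give the formula, so the normalization used here (remaining distance from b to p, relative to the full distance from e to p) is purely an assumption for illustration, as is the toy control-flow graph.

```python
from collections import deque

def shortest(cfg, src, dst):
    """Number of CFG edges on the shortest path src -> dst, or None if unreachable."""
    seen = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return seen[node]
        for nxt in cfg.get(node, []):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return None

def input_quality(cfg, e, p, b):
    """Assumed score in [0, 1]: 1.0 when b is the patch block, lower when farther away."""
    total = shortest(cfg, e, p)
    remaining = shortest(cfg, b, p)
    if total is None or remaining is None:
        return 0.0  # the patch block cannot be reached from here
    if total == 0:
        return 1.0  # entry block is the patch block
    return max(0.0, 1.0 - remaining / total)

# Toy control-flow graph of one function: basic block -> successor blocks.
CFG = {"e": ["b1", "b2"], "b1": ["b3"], "b2": [], "b3": ["p"]}

print(input_quality(CFG, "e", "p", "b3"))
print(input_quality(CFG, "e", "p", "b2"))
```

Any monotone function of the remaining distance would serve the same purpose; the point is only that inputs ending closer to the patch block rank higher.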
Evaluation
● Effectiveness
● Performance
● Findings
● Cases
Effectiveness
● Environment
○ x86/x86_64 Linux kernel from 4.0 to 4.11
○ KCOV ported to versions before 4.6
○ KASAN & UBSAN enabled
○ Vulnerabilities that require specific devices are filtered out
○ Time limit: 48 hours
● Generated PoC exploits for 18 (16%) of the CVEs
○ 5 of the 18 have been studied before; the others had no known trigger
● For the remaining 94
○ 49% reach the vulnerable function
○ 20% reach the patched block
Performance
● Faster than Syzkaller
○ 13.2h VS 33.9h
○ 18 VS 7 (trigger vulnerabilities)
● Corner cases
○ Specific condition
○ Race Condition
Findings
● More vulnerable functions lower the chance of triggering the vulnerability
○ So do more critical variables
● More precise information works better
● Unknown Vulnerabilities
○ 0day: CVE-2017-6347
○ Undisclosed vulnerability
Cases
● 0day: CVE-2017-6347
○ In the fuzzing process of CVE-2016-4794
■ a UAF vulnerability in the Berkeley Packet Filter
(bpf) subsystem
○ Same syscall sequence with different params
● Undisclosed vulnerability
○ In the fuzzing process of CVE-2016-3841
■ a UAF vulnerability in the networking subsystem
○ 18 vulnerable functions/patches
○ triggered in another protocol
Thanks!
