Analysis of malicious PDF
by
Abdul Adil
Open Info.sec Community
Disclaimer: Neither I nor the organizers are responsible for any damage or actions resulting from your use of the information provided.
Who am I?
• Information security enthusiast & developer.
• Certified in OCJP, CEH.
• You can reach me at:
Codestudio8.wordpress.com
Linkedin.com/in/abduladil02
Facebook.com/abduladil02
Twitter.com/abduladil02
Abdul.Adil@connectica.in or AbdulAdil02@gmail.com
What are you going to learn?
• What is a PDF?
• Internals of PDF.
• Strings in PDF.
• Scanning PDFs with VirusTotal.
• Demo.
• Conclusion.
What is a pdf?
• PDF stands for Portable Document Format.
• The file extension is “.pdf”.
• It is a file format used to present documents in a manner independent
of application software, hardware, and operating systems.
• Developed by Adobe Systems; work began in 1991 and PDF 1.0 was released in 1993.
• Interactive features include AcroForms, rich media…
• PDF 1.7 was released in 2006 and standardized as ISO 32000-1 in 2008 (PDF 2.0 followed in 2017).
First Malware of PDF
• PDF attachments carrying viruses were first discovered in 2001.
• The virus, named OUTLOOK.PDFWorm or Peachy, used Microsoft
Outlook to mail itself out as an attachment embedded in an Adobe PDF file.
• It could be activated in Adobe Acrobat, but not in Acrobat Reader.
Structure of pdf
Internals of pdf
• Header: probably the simplest section. It is a single line that specifies the PDF language
version, e.g. %PDF-1.1.
• Body: generally contains most of the PDF code. This section is a list of objects that
describe how the final document will look.
• Cross-reference table: contains the byte offset of every object, so that PDF software (e.g. a
reader) can access any object directly without reading through the whole file to find
it. Starts with the keyword ‘xref’.
• Trailer: a PDF reader begins reading at the end of the file, where this
last section is located. The trailer contains essential data, which are, from the top to the bottom of
the trailer:
a. the number of entries in the cross-reference table (field /Size),
b. a reference to the document's root object, the catalog (field /Root),
c. the offset (in bytes) of the cross-reference table, given after the startxref keyword (the line just above the %%EOF line).
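As a rough illustration of the layout described above, the following Python sketch pulls the header version, /Size, /Root and the startxref offset out of a tiny hand-written PDF skeleton. The `SAMPLE` bytes and the `read_trailer` helper are made up for this example; real files can have several trailers, cross-reference streams and other variations that this sketch does not handle.

```python
import re

# Hand-written, deliberately incomplete PDF skeleton (illustration only).
SAMPLE = (
    b"%PDF-1.1\n"
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"xref\n0 2\n"
    b"0000000000 65535 f \n"
    b"0000000009 00000 n \n"
    b"trailer\n<< /Size 2 /Root 1 0 R >>\n"
    b"startxref\n58\n"
    b"%%EOF\n"
)


def read_trailer(data: bytes) -> dict:
    """Return the header version, /Size, /Root and the startxref offset."""
    header = re.match(rb"%PDF-(\d\.\d)", data)
    # Readers start at the end of the file, so look at the last trailer only.
    start = data.rfind(b"trailer")
    tail = data[start:] if start >= 0 else data
    size = re.search(rb"/Size\s+(\d+)", tail)
    root = re.search(rb"/Root\s+(\d+)\s+(\d+)\s+R", tail)
    xref = re.search(rb"startxref\s+(\d+)\s+%%EOF", tail)
    if not (header and size and root and xref):
        raise ValueError("not a classic header/xref/trailer layout")
    return {
        "version": header.group(1).decode(),
        "size": int(size.group(1)),
        "root": (int(root.group(1)), int(root.group(2))),
        "startxref": int(xref.group(1)),
    }


info = read_trailer(SAMPLE)
print(info)
# The startxref offset should point at the 'xref' keyword itself.
print(SAMPLE[info["startxref"]:info["startxref"] + 4])
```

In this skeleton the offset after startxref lands exactly on the `xref` keyword, which is the jump a reader makes before looking up any object.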
Xref table structure
[Figure: an annotated xref table listing 14 objects, with entries marked as either free or in use.]
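The free and in-use markings correspond to the last field of each xref entry: `f` for a free object and `n` for one in use. Below is a minimal Python sketch that reads such a table, assuming a single subsection in the classic text format; the `XREF` sample and the `parse_xref_section` helper are hypothetical and not taken from the slides.

```python
# Made-up xref section with three entries (object 0 is the free list head).
XREF = "\n".join([
    "xref",
    "0 3",
    "0000000000 65535 f",
    "0000000009 00000 n",
    "0000000074 00000 n",
])


def parse_xref_section(text: str) -> list:
    """Parse one classic xref subsection into a list of entry dicts."""
    lines = text.strip().splitlines()
    if lines[0].strip() != "xref":
        raise ValueError("section does not start with the xref keyword")
    # The second line gives the first object number and the entry count.
    first, count = map(int, lines[1].split())
    entries = []
    for number, line in enumerate(lines[2:2 + count], start=first):
        offset, generation, flag = line.split()
        entries.append({
            "object": number,
            "offset": int(offset),          # byte offset (for in-use entries)
            "generation": int(generation),
            "in_use": flag == "n",          # 'n' = in use, 'f' = free
        })
    return entries


for entry in parse_xref_section(XREF):
    state = "in use" if entry["in_use"] else "free"
    print(entry["object"], entry["offset"], state)
```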
Take a close look before you proceed!
Tools to analyze pdf files
• Download Didier Stevens' PDF tools from http://blog.didierstevens.com/programs/pdf-tools/
• pdf-parser.py: This tool will parse a PDF document to identify the fundamental elements used in
the analyzed file. It will not render a PDF document.
• pdfid.py: This tool is not a PDF parser, but it will scan a file to look for certain PDF keywords,
allowing you to identify PDF documents that contain (for example) JavaScript or execute an action
when opened. PDFiD will also handle name obfuscation.
• Other tools: peepdf.py
• Online tools:
a. virustotal.com
b. wepawet (http://wepawet.iseclab.org)
c. pdfexaminer (www.malwaretracker.com)
d. jsunpack.jeek.org
e. PDF Stream Dumper.
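The name obfuscation mentioned for PDFiD relies on the fact that a PDF name may write any character as `#` followed by two hex digits, so `/J#61vaScript` means `/JavaScript`. A minimal Python sketch of undoing that escaping is shown here; `normalize_name` is a hypothetical helper for illustration, not PDFiD's actual code.

```python
import re


def normalize_name(name: bytes) -> bytes:
    """Replace each #xx hex escape in a PDF name with the byte it encodes."""
    return re.sub(
        rb"#([0-9A-Fa-f]{2})",
        lambda match: bytes([int(match.group(1), 16)]),
        name,
    )


print(normalize_name(b"/J#61vaScript"))    # 0x61 is 'a'
print(normalize_name(b"/#4Aava#53cript"))  # 0x4A is 'J', 0x53 is 'S'
```

A plain string search for `/JavaScript` would miss both inputs, which is why a scanner has to normalize names before matching.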
Strings in pdf
• obj, endobj, stream, endstream, xref, trailer, startxref, /Page, /Encrypt, /ObjStm,
/JS, /JavaScript, /AA, /OpenAction, /JBIG2Decode, /RichMedia, /Launch, /XFA.
• Almost every PDF document contains the first 7 words (obj
through startxref), although stream and endstream are slightly less universal.
• /Page gives an indication of the number of pages in the PDF
document. Most malicious PDF documents have only one page, e.g. a
“You won a lottery” mail.
• /Encrypt indicates that the PDF document has DRM or needs a
password to be read.
• /ObjStm counts the number of object streams. An object stream is a
stream object that can contain other objects, and can therefore be
used to obfuscate objects (by using different filters).
Strings in pdf
• /JS and /JavaScript indicate that the PDF document contains
JavaScript. Almost all malicious PDF documents that I’ve found in the
wild contain JavaScript (to exploit a JavaScript vulnerability and/or to
execute a heap spray). Of course, you can also find JavaScript in PDF
documents without malicious intent.
• /AA and /OpenAction indicate an automatic action to be performed
when the page/document is viewed. All malicious PDF documents
with JavaScript I’ve seen in the wild had an automatic action to
launch the JavaScript without user interaction.
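The triage idea in this section (count the telling names, then check whether JavaScript is paired with an automatic action) can be sketched in a few lines of Python. This is a simplified stand-in in the spirit of PDFiD, not its implementation: the helper names and the `SAMPLE` bytes are made up, and decoding `#xx` escapes across the whole file is a crude shortcut.

```python
import re

NAMES = [b"/Page", b"/Encrypt", b"/ObjStm", b"/JS", b"/JavaScript", b"/AA",
         b"/OpenAction", b"/JBIG2Decode", b"/RichMedia", b"/Launch", b"/XFA"]

# Made-up fragment: a catalog with /OpenAction pointing at a JavaScript action.
SAMPLE = (
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R /OpenAction 3 0 R >>\nendobj\n"
    b"3 0 obj\n<< /S /JavaScript /JS (app.alert(1)) >>\nendobj\n"
    b"4 0 obj\n<< /Type /Page >>\nendobj\n"
)


def decode_hex_escapes(data: bytes) -> bytes:
    """Undo #xx name obfuscation (applied to the whole buffer for simplicity)."""
    return re.sub(rb"#([0-9A-Fa-f]{2})",
                  lambda m: bytes([int(m.group(1), 16)]), data)


def count_names(data: bytes) -> dict:
    """Count each name, refusing matches that continue (e.g. /Pages for /Page)."""
    data = decode_hex_escapes(data)
    counts = {}
    for name in NAMES:
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode()] = len(re.findall(pattern, data))
    return counts


def launches_script_automatically(counts: dict) -> bool:
    """True when JavaScript and an automatic action are both present."""
    has_js = counts["/JS"] > 0 or counts["/JavaScript"] > 0
    has_auto = counts["/AA"] > 0 or counts["/OpenAction"] > 0
    return has_js and has_auto


counts = count_names(SAMPLE)
print(counts)
print(launches_script_automatically(counts))
```

A positive result only says the file deserves a closer look with pdf-parser.py or peepdf; benign documents can carry JavaScript and an /OpenAction too.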
Demo
• Let’s see a demo:
1. pdf-parser.py
2. pdfid.py
3. peepdf
4. Metasploit
A quick glance at a malicious action snippet
Drawbacks of pdfid.py
• Because PDFiD is just a string scanner (supporting name obfuscation),
it will also generate false positives. For example, a simple text file
starting with %PDF-1.1 and containing words from the list will also be
identified as a PDF document.
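The false positive described above is easy to reproduce with any plain string scanner. The Python sketch below uses a made-up `naive_scan` function (a stand-in, not PDFiD itself) on a text note that merely mentions the keywords, then adds one possible extra check for the file-structure markers. That second check is an assumption of this example, not something the slides attribute to PDFiD, and an attacker can satisfy it easily, so it only filters out accidental matches.

```python
# A plain text note that starts like a PDF and mentions suspicious names.
NOTE = b"%PDF-1.1\nNotes from the talk: watch for /JavaScript and /OpenAction.\n"


def naive_scan(data: bytes) -> dict:
    """String scanning only: header magic plus raw keyword counts."""
    return {
        "looks_like_pdf": data.startswith(b"%PDF-"),
        "javascript_hits": data.count(b"/JavaScript"),
        "openaction_hits": data.count(b"/OpenAction"),
    }


def has_pdf_structure(data: bytes) -> bool:
    """Extra sanity check: a startxref keyword and a trailing %%EOF marker."""
    return b"startxref" in data and data.rstrip().endswith(b"%%EOF")


print(naive_scan(NOTE))         # flagged, although it is only a text note
print(has_pdf_structure(NOTE))  # the structural markers are missing
```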
What can you do?
• Scan PDF files with an anti-malware application.
• Scan with online scanners like virustotal.com and malwr.com (Cuckoo).
You can’t stop stupidity!!