SlideShare a Scribd company logo
1 of 69
Download to read offline
Caring for
file formats
Caring for
file formats
Ange Albertini
Troopers 2016
Ange Albertini
Troopers 2016
TL;DR
● Attack surface with file formats is too big.
● Specs are useless (just a nice ‘guide’), not representing reality.
● We can’t deprecate formats because we can’t preserve and we can’t define how
they really work
● We need open good libraries to simplify landscape, and create a corpus to
express the reality of file format, which gives us real “documentation”.
● Then we can preserve and deprecate older format, which reduces attack surface.
● From then on, we can focus on making the present more secure.
● We don’t need “new” formats: we need ‘alive’ specs and files corpus.
Otherwise specs will always diverge from reality.
Ange Albertini
reverse engineering &
visual documentation
@angealbertini
ange@corkami.com
http://www.corkami.comWelcome to my talk!
I make polyglots (multi-type files),
schizophrenics (multi-behavior)...
I tried to explain file formats with cows…
But that didn’t really tell why people should care.
1
3DES
I really like to play with file formats...
AESK
AESK
JPG
JAR
(ZIP + CLASS)
PDF
FLV
PNG
2
I’m a part of PoC||GTFO,
for which I’m a file format
user and abuser.
PoC||GTFO: many file formats
● Articles
PDFLaTeX PDFBook Inkscape GhostScript Scribus Blender Gimp Fontforge
PDFFont Mutool
● Proof of Concept
Qpdf Xpdf Ruby Python Bash Truecrypt Wavpack Audacity Baudline Sox Tar
Zip MkIsoFS LSnes PngOpt JpegSnoop AdvPNG Nasm Qemu BPGEnc
And many custom scripts handling file formats in unconventional ways…
I'm interested about hardware preservation
and digital preservation.
My interests
● Using file formats
○ graphics, 3d, music…
● Abusing file formats
○ polyglot, schizophrenia, hash collisions…
● Preserving file formats
○ Retro-gaming, digital archeology...
A miserable little pile of secrets
Not just a sequence of binary
What is a file format?
If you [/your program] generate
a picture of any kind,
you might want to export
the result to something
that you can re-use later.
(same for any form of information)
A computer dialect
to communicate
between communities.
What is a file format?
File formats are
community connectors.
Don’t think so?
Try exporting everything as XML ;)
Most people don’t care
about <actor>
They only care about <roles>
We mostly care about the input/output.
Example:
We don’t care about GIF
We mostly care about its characteristics
and how easy it is to use.
No need to be emotional,
and stay in our comfort zone.
We don’t really care
about file formats.
We care about their caracteristics.
Not groundbreaking,
but supported “everywhere”.
Why should infosec care?
Fuzz formats. Blame “bad” devs.
Collect CVEs. Boast your ego.
10 PRINT “SOLVED ANYTHING YET?”
20 GOTO 10
Attack surface
● 1 OS = N supported formats
● For each format:
○ How many parsers?
○ For each parser:
■ Which version, compiler...
The PGM or PPM
formats are the easiest
way to convert any data
in valid grayscale or
RGB pictures.
But most people don’t
know it’s supported out
of the box by many
softwares.
We should reduce the attack surface.
How many unsuspected supported
[sub-]formats and parsers?
https://lcamtuf.blogspot.com/2014/10/psa-dont-run-strings-on-untrusted-files.html
How many file formats supported
by your browser ?
By your OS?
How many do you really need ?
Think “embedded”.
Capacity is still too cheap:
we keep stacking formats/features,
which doesn’t solve anything.
It’s a problem everywhere.
We keep losing ground.
<!--
PoC||GTFO 10
“Pokemon plays Twitch”
1. Exploit a GameBoy game via input
2. Take over the Super GameBoy
3. Take over the Super Nintendo
The file itself can perform the exploit
(on the hardware or an emulator).
The payload displays the article.
-->
PoC||GTFO 10 is a PoC-ception:
- a PDF article describing the exploit
- a file performing the exploit
(to display the article)
“young celebs”
What they were supposed to be
doesn’t really matter.
What file formats were supposed to be
doesn't matter anymore,
what they are now is all we care.
Security cares about current reality,
not obsolete theory.
We can blame bad parsers.
What about the file formats?
If the map is unclear enough, you’ll get lost anyway.
A blurry file format will never lead to a clean parser.
use a ready-made translator:
an import/export library
Write your own:
read the specs.
2 ways to communicate
Landscapes
To exploit hash collisions, I abused JPEG.
To abuse JPEG “everywhere”, just abuse LibJPEG.
JPEG format’s landscape
in practice, JPEG is LibJPEG turbo v6
● de facto standard
○ later versions not used (different API)
Even if you create your own JPEG library,
you want to have full LibJPEG compatibility.
JPEG format is defined by LibJPEG.
I made extremely custom PDFs for each reader.
These "extreme" PDFs fail on any other reader.
PDF’s current landscape
PDF: 6 interpretations of the specs
● specs are even more useless
One good open library:
a unified attack surface
Fuzz it, pwn everyone ?
True, but also fixed for everyone!
Is diversity really good?
We’re all supposed to use the same file format.
Diversity is good?
Attack surface is worse.
Unofficial substandards.
In any cases...
Specs are merely an introduction guide.
A free set of examples w/ corner cases.
A grammar ?
PDF’s future
PDF/E (engineer): 3d crap
PDF/A (archiving): already 8 flavours
Specs:
● specs are now commercial
● the main implementation is not open
● no set of free files.
And all countries preserve their culture with that format?!?!
We’re waiting for a new disaster...
many file formats are
abandoned
One specs. then nothing.
It’s like knowing about someone
only from a baby’s picture.
<!--
PoC||GTFO 11
PoC||GTFO 11 is a webserver serving itself, with its own HTML page
extracting its own attachments from its ZIP.
$ruby pocorgtfo11.pdf
Listening for connections on port 8080.
To listen on a different port,
re-run with the desired port as a command-line argument.
A neighbor at 127.0.0.1 is requesting /
A neighbor at 127.0.0.1 is requesting /ajax/feelies.json
A neighbor at 127.0.0.1 is requesting /favicon.png
$unzip -l pocorgtfo11.pdf
Archive: pocorgtfo11.pdf
Length Date Time Name
-------- ---- ---- ----
0 03-16-16 13:37 4am/
25955 03-11-16 15:06 4am/Stickybear Math 2 (4am crack).txt
[...]
3241 03-16-16 13:37 wafflehouse.txt
-------- -------
8177332 23 files
-->
PoC||GTFO 11 is self-aware:
a PDF that serves itself (HTTP quine),
parses its own ZIP to serve its archived feelies.
Important question
Do you still sleep
with a teddy bear?
Kids really deprecate stuff
Our computers still handle always more
and more file formats.
⇒ The attack surface just keeps growing.
Obsolete formats are
still omnipresent
Formats, sub-formats, features...
Because it’s unclear
if we can go back.
We’d be too afraid to deprecate them.
Yet we deprecate
for security.
Example for PDF:
JPEG-compressed text
is not supported anymore
(it could bypass security).
Windows PE format
becomes stricter
(deprecates packers)
For example,
EPUB 3.1 suddenly killed
backward compatibility.
http://blog.kbresearch.nl/2016/03/10/the-future-of-epub-a-first-look-at-the-epub-3-1-editors-draft/
Sometimes,
it’s not even for
security reasons
We don’t need
new file formats.
It’s the same problem again if
eventually their specs stop reflecting reality.
Even dictionaries have
regular updates,
to reflect reality.
Story time
Digipres = PDF worshippers. 150 years of availability?
● Non free specs + closed source software?
Here comes the grim reaper:
● Fix your stuff or it will be killed (like Flash)
We store our knowledge. What about files born digital?
Not infosec, but worrying.
veraPDF and its test files:
a great initiative.
PE.corkami.com: my own collection of hand-made executables and "documentation" (completely free).
Some of these failed a lot of software...
Consequence of my PE page+corpus
● 'corkami-proof' software
● raises the bar for everyone
● become a hub of knowledge
○ "I can't share the sample", but from the knowledge,
my own file will be shared
⇒ even useful for the original contact
Conclusion
Attack surface
Too many (sub)formats
Too many parsers (= no good open lib)
Specs
Specs shouldn’t be a religious text
● Worshipped, but outdated and worthless
Specs should reflect reality (a law)
● updated, enforced, realistic, freely available
A good open lib
Deprecation
Deprecation is a natural cycle, and yet...
We are afraid to deprecate because
no file format is fully preserved:
● open, up to date specs
● free test coverage
But it won’t happen...
...until a great disaster ?
It ends up on CNN, with a logo & a website :)
Ack
Phil Fabrice Travis Sergey
Micah Kurt QKumba Hanno...
Thank you!
Caring for
file formats
corkami.com
@angealbertini
Hail to the king, baby!

More Related Content

What's hot

Let's write a PDF file
Let's write a PDF fileLet's write a PDF file
Let's write a PDF fileAnge Albertini
 
Multilingual sites in plone
Multilingual sites in ploneMultilingual sites in plone
Multilingual sites in ploneRamon Navarro
 
Code quality. Patch quality
Code quality. Patch qualityCode quality. Patch quality
Code quality. Patch qualitymalcolmt
 
Why I Love Python
Why I Love PythonWhy I Love Python
Why I Love Pythondidip
 
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6Andrei Zmievski
 
GDG Helwan Introduction to python
GDG Helwan Introduction to pythonGDG Helwan Introduction to python
GDG Helwan Introduction to pythonMohamed Hegazy
 
What is Python? (Silicon Valley CodeCamp 2014)
What is Python? (Silicon Valley CodeCamp 2014)What is Python? (Silicon Valley CodeCamp 2014)
What is Python? (Silicon Valley CodeCamp 2014)wesley chun
 
Writing Fast Code (JP) - PyCon JP 2015
Writing Fast Code (JP) - PyCon JP 2015Writing Fast Code (JP) - PyCon JP 2015
Writing Fast Code (JP) - PyCon JP 2015Younggun Kim
 

What's hot (10)

Let's write a PDF file
Let's write a PDF fileLet's write a PDF file
Let's write a PDF file
 
Multilingual sites in plone
Multilingual sites in ploneMultilingual sites in plone
Multilingual sites in plone
 
Code quality. Patch quality
Code quality. Patch qualityCode quality. Patch quality
Code quality. Patch quality
 
Using unicode with php
Using unicode with phpUsing unicode with php
Using unicode with php
 
Why I Love Python
Why I Love PythonWhy I Love Python
Why I Love Python
 
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
The Good, the Bad, and the Ugly: What Happened to Unicode and PHP 6
 
GDG Helwan Introduction to python
GDG Helwan Introduction to pythonGDG Helwan Introduction to python
GDG Helwan Introduction to python
 
Neo4j
Neo4jNeo4j
Neo4j
 
What is Python? (Silicon Valley CodeCamp 2014)
What is Python? (Silicon Valley CodeCamp 2014)What is Python? (Silicon Valley CodeCamp 2014)
What is Python? (Silicon Valley CodeCamp 2014)
 
Writing Fast Code (JP) - PyCon JP 2015
Writing Fast Code (JP) - PyCon JP 2015Writing Fast Code (JP) - PyCon JP 2015
Writing Fast Code (JP) - PyCon JP 2015
 

Viewers also liked

Connecting communities
Connecting communitiesConnecting communities
Connecting communitiesAnge Albertini
 
Binary art - Byte-ing the PE that fails you (extended offline version)
Binary art - Byte-ing the PE that fails you (extended offline version)Binary art - Byte-ing the PE that fails you (extended offline version)
Binary art - Byte-ing the PE that fails you (extended offline version)Ange Albertini
 
TASBot - the perfectionist
TASBot - the perfectionistTASBot - the perfectionist
TASBot - the perfectionistAnge Albertini
 
Preserving arcade games - 31c3
Preserving arcade games -  31c3Preserving arcade games -  31c3
Preserving arcade games - 31c3Ange Albertini
 
Exploring the Portable Executable format
Exploring the Portable Executable formatExploring the Portable Executable format
Exploring the Portable Executable formatAnge Albertini
 

Viewers also liked (7)

Connecting communities
Connecting communitiesConnecting communities
Connecting communities
 
Binary art - Byte-ing the PE that fails you (extended offline version)
Binary art - Byte-ing the PE that fails you (extended offline version)Binary art - Byte-ing the PE that fails you (extended offline version)
Binary art - Byte-ing the PE that fails you (extended offline version)
 
nikhil resume
nikhil resumenikhil resume
nikhil resume
 
TASBot - the perfectionist
TASBot - the perfectionistTASBot - the perfectionist
TASBot - the perfectionist
 
Hacks in video games
Hacks in video gamesHacks in video games
Hacks in video games
 
Preserving arcade games - 31c3
Preserving arcade games -  31c3Preserving arcade games -  31c3
Preserving arcade games - 31c3
 
Exploring the Portable Executable format
Exploring the Portable Executable formatExploring the Portable Executable format
Exploring the Portable Executable format
 

Similar to Caring for file formats

The challenges of file formats
The challenges of file formatsThe challenges of file formats
The challenges of file formatsAnge Albertini
 
Schizophrenic files v2
Schizophrenic files v2Schizophrenic files v2
Schizophrenic files v2Ange Albertini
 
What every C++ programmer should know about modern compilers (w/ comments, AC...
What every C++ programmer should know about modern compilers (w/ comments, AC...What every C++ programmer should know about modern compilers (w/ comments, AC...
What every C++ programmer should know about modern compilers (w/ comments, AC...Sławomir Zborowski
 
Infrastructure as code might be literally impossible part 2
Infrastructure as code might be literally impossible part 2Infrastructure as code might be literally impossible part 2
Infrastructure as code might be literally impossible part 2ice799
 
Introduction to Programming in Go
Introduction to Programming in GoIntroduction to Programming in Go
Introduction to Programming in GoAmr Hassan
 
Go language presentation
Go language presentationGo language presentation
Go language presentationparamisoft
 
Codebits Handivi
Codebits HandiviCodebits Handivi
Codebits Handivicfpinto
 
C unix ipc
C unix ipcC unix ipc
C unix ipcrgx112
 
Python enterprise vento di liberta
Python enterprise vento di libertaPython enterprise vento di liberta
Python enterprise vento di libertaSimone Federici
 
Semantic web, python, construction industry
Semantic web, python, construction industrySemantic web, python, construction industry
Semantic web, python, construction industryReinout van Rees
 
PDF - Secrets - 140519092839-phpapp01
PDF - Secrets - 140519092839-phpapp01PDF - Secrets - 140519092839-phpapp01
PDF - Secrets - 140519092839-phpapp01dumbfuckery
 
Python @ PiTech - March 2009
Python @ PiTech - March 2009Python @ PiTech - March 2009
Python @ PiTech - March 2009tudorprodan
 
Messing with binary formats
Messing with binary formatsMessing with binary formats
Messing with binary formatsAnge Albertini
 
Pigaios: A Tool for Diffing Source Codes against Binaries (Hacktivity 2018)
Pigaios: A Tool for Diffing Source Codes against Binaries (Hacktivity 2018)Pigaios: A Tool for Diffing Source Codes against Binaries (Hacktivity 2018)
Pigaios: A Tool for Diffing Source Codes against Binaries (Hacktivity 2018)Joxean Koret
 
Ange Albertini and Gynvael Coldwind: Schizophrenic Files – A file that thinks...
Ange Albertini and Gynvael Coldwind: Schizophrenic Files – A file that thinks...Ange Albertini and Gynvael Coldwind: Schizophrenic Files – A file that thinks...
Ange Albertini and Gynvael Coldwind: Schizophrenic Files – A file that thinks...Area41
 
BUD17-104: Scripting Languages in IoT: Challenges and Approaches
BUD17-104: Scripting Languages in IoT: Challenges and ApproachesBUD17-104: Scripting Languages in IoT: Challenges and Approaches
BUD17-104: Scripting Languages in IoT: Challenges and ApproachesLinaro
 
Infrastructure as code might be literally impossible
Infrastructure as code might be literally impossibleInfrastructure as code might be literally impossible
Infrastructure as code might be literally impossibleice799
 

Similar to Caring for file formats (20)

The challenges of file formats
The challenges of file formatsThe challenges of file formats
The challenges of file formats
 
Schizophrenic files v2
Schizophrenic files v2Schizophrenic files v2
Schizophrenic files v2
 
What every C++ programmer should know about modern compilers (w/ comments, AC...
What every C++ programmer should know about modern compilers (w/ comments, AC...What every C++ programmer should know about modern compilers (w/ comments, AC...
What every C++ programmer should know about modern compilers (w/ comments, AC...
 
NISO Webinar: Software Preservation and Use: I Saved the Files But Can I Run ...
NISO Webinar: Software Preservation and Use: I Saved the Files But Can I Run ...NISO Webinar: Software Preservation and Use: I Saved the Files But Can I Run ...
NISO Webinar: Software Preservation and Use: I Saved the Files But Can I Run ...
 
Infrastructure as code might be literally impossible part 2
Infrastructure as code might be literally impossible part 2Infrastructure as code might be literally impossible part 2
Infrastructure as code might be literally impossible part 2
 
Introduction to Programming in Go
Introduction to Programming in GoIntroduction to Programming in Go
Introduction to Programming in Go
 
Go language presentation
Go language presentationGo language presentation
Go language presentation
 
Codebits Handivi
Codebits HandiviCodebits Handivi
Codebits Handivi
 
U2 l05 lossy vs.lossless compression
U2 l05 lossy vs.lossless compressionU2 l05 lossy vs.lossless compression
U2 l05 lossy vs.lossless compression
 
C unix ipc
C unix ipcC unix ipc
C unix ipc
 
Python enterprise vento di liberta
Python enterprise vento di libertaPython enterprise vento di liberta
Python enterprise vento di liberta
 
Semantic web, python, construction industry
Semantic web, python, construction industrySemantic web, python, construction industry
Semantic web, python, construction industry
 
PDF - Secrets - 140519092839-phpapp01
PDF - Secrets - 140519092839-phpapp01PDF - Secrets - 140519092839-phpapp01
PDF - Secrets - 140519092839-phpapp01
 
Python @ PiTech - March 2009
Python @ PiTech - March 2009Python @ PiTech - March 2009
Python @ PiTech - March 2009
 
Messing with binary formats
Messing with binary formatsMessing with binary formats
Messing with binary formats
 
Pigaios: A Tool for Diffing Source Codes against Binaries (Hacktivity 2018)
Pigaios: A Tool for Diffing Source Codes against Binaries (Hacktivity 2018)Pigaios: A Tool for Diffing Source Codes against Binaries (Hacktivity 2018)
Pigaios: A Tool for Diffing Source Codes against Binaries (Hacktivity 2018)
 
Schizophrenic files
Schizophrenic filesSchizophrenic files
Schizophrenic files
 
Ange Albertini and Gynvael Coldwind: Schizophrenic Files – A file that thinks...
Ange Albertini and Gynvael Coldwind: Schizophrenic Files – A file that thinks...Ange Albertini and Gynvael Coldwind: Schizophrenic Files – A file that thinks...
Ange Albertini and Gynvael Coldwind: Schizophrenic Files – A file that thinks...
 
BUD17-104: Scripting Languages in IoT: Challenges and Approaches
BUD17-104: Scripting Languages in IoT: Challenges and ApproachesBUD17-104: Scripting Languages in IoT: Challenges and Approaches
BUD17-104: Scripting Languages in IoT: Challenges and Approaches
 
Infrastructure as code might be literally impossible
Infrastructure as code might be literally impossibleInfrastructure as code might be literally impossible
Infrastructure as code might be literally impossible
 

More from Ange Albertini

Technical challenges with file formats
Technical challenges with file formatsTechnical challenges with file formats
Technical challenges with file formatsAnge Albertini
 
Relations between archive formats
Relations between archive formatsRelations between archive formats
Relations between archive formatsAnge Albertini
 
Abusing archive file formats
Abusing archive file formatsAbusing archive file formats
Abusing archive file formatsAnge Albertini
 
You are *not* an idiot
You are *not* an idiotYou are *not* an idiot
You are *not* an idiotAnge Albertini
 
An introduction to inkscape
An introduction to inkscapeAn introduction to inkscape
An introduction to inkscapeAnge Albertini
 
Exploiting hash collisions
Exploiting hash collisionsExploiting hash collisions
Exploiting hash collisionsAnge Albertini
 
Preserving arcade games
Preserving arcade gamesPreserving arcade games
Preserving arcade gamesAnge Albertini
 
Hide Android applications in images
Hide Android applications in imagesHide Android applications in images
Hide Android applications in imagesAnge Albertini
 
Let's play with crypto! v2
Let's play with crypto! v2Let's play with crypto! v2
Let's play with crypto! v2Ange Albertini
 
Malicious Hashing: Eve’s Variant of SHA-1
Malicious Hashing: Eve’s Variant of SHA-1Malicious Hashing: Eve’s Variant of SHA-1
Malicious Hashing: Eve’s Variant of SHA-1Ange Albertini
 

More from Ange Albertini (16)

Technical challenges with file formats
Technical challenges with file formatsTechnical challenges with file formats
Technical challenges with file formats
 
Relations between archive formats
Relations between archive formatsRelations between archive formats
Relations between archive formats
 
Abusing archive file formats
Abusing archive file formatsAbusing archive file formats
Abusing archive file formats
 
TimeCryption
TimeCryptionTimeCryption
TimeCryption
 
You are *not* an idiot
You are *not* an idiotYou are *not* an idiot
You are *not* an idiot
 
KILL MD5
KILL MD5KILL MD5
KILL MD5
 
No more dumb hex!
No more dumb hex!No more dumb hex!
No more dumb hex!
 
Beyond your studies
Beyond your studiesBeyond your studies
Beyond your studies
 
An introduction to inkscape
An introduction to inkscapeAn introduction to inkscape
An introduction to inkscape
 
Exploiting hash collisions
Exploiting hash collisionsExploiting hash collisions
Exploiting hash collisions
 
Infosec & failures
Infosec & failuresInfosec & failures
Infosec & failures
 
Preserving arcade games
Preserving arcade gamesPreserving arcade games
Preserving arcade games
 
Let's talk about...
Let's talk about...Let's talk about...
Let's talk about...
 
Hide Android applications in images
Hide Android applications in imagesHide Android applications in images
Hide Android applications in images
 
Let's play with crypto! v2
Let's play with crypto! v2Let's play with crypto! v2
Let's play with crypto! v2
 
Malicious Hashing: Eve’s Variant of SHA-1
Malicious Hashing: Eve’s Variant of SHA-1Malicious Hashing: Eve’s Variant of SHA-1
Malicious Hashing: Eve’s Variant of SHA-1
 

Recently uploaded

Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...aditisharan08
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 

Recently uploaded (20)

Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 

Caring for file formats

  • 1. Caring for file formats Caring for file formats Ange Albertini Troopers 2016 Ange Albertini Troopers 2016
  • 2. TL;DR ● Attack surface with file formats is too big. ● Specs are useless (just a nice ‘guide’), not representing reality. ● We can’t deprecate formats because we can’t preserve and we can’t define how they really work ● We need open good libraries to simplify landscape, and create a corpus to express the reality of file format, which gives us real “documentation”. ● Then we can preserve and deprecate older format, which reduces attack surface. ● From then on, we can focus on making the present more secure. ● We don’t need “new” formats: we need ‘alive’ specs and files corpus. Otherwise specs will always diverge from reality.
  • 3. Ange Albertini reverse engineering & visual documentation @angealbertini ange@corkami.com http://www.corkami.comWelcome to my talk!
  • 4. I make polyglots (multi-type files), schizophrenics (multi-behavior)...
  • 5. I tried to explain file formats with cows… But that didn’t really tell why people should care.
  • 6. 1 3DES I really like to play with file formats... AESK AESK JPG JAR (ZIP + CLASS) PDF FLV PNG 2
  • 7. I’m a part of PoC||GTFO, for which I’m a file format user and abuser.
  • 8. PoC||GTFO: many file formats ● Articles PDFLaTeX PDFBook Inkscape GhostScript Scribus Blender Gimp Fontforge PDFFont Mutool ● Proof of Concept Qpdf Xpdf Ruby Python Bash Truecrypt Wavpack Audacity Baudline Sox Tar Zip MkIsoFS LSnes PngOpt JpegSnoop AdvPNG Nasm Qemu BPGEnc And many custom scripts handling file formats in unconventional ways…
  • 9. I'm interested about hardware preservation and digital preservation.
  • 10. My interests ● Using file formats ○ graphics, 3d, music… ● Abusing file formats ○ polyglot, schizophrenia, hash collisions… ● Preserving file formats ○ Retro-gaming, digital archeology...
  • 11. A miserable little pile of secrets Not just a sequence of binary What is a file format?
  • 12. If you [/your program] generate a picture of any kind, you might want to export the result to something that you can re-use later. (same for any form of information)
  • 13. A computer dialect to communicate between communities. What is a file format?
  • 14. File formats are community connectors. Don’t think so? Try exporting everything as XML ;)
  • 15. Most people don’t care about <actor> They only care about <roles> We mostly care about the input/output.
  • 16. Example: We don’t care about GIF We mostly care about its characteristics and how easy it is to use. No need to be emotional, and stay in our comfort zone.
  • 17. We don’t really care about file formats. We care about their caracteristics. Not groundbreaking, but supported “everywhere”.
  • 18. Why should infosec care? Fuzz formats. Blame “bad” devs. Collect CVEs. Boast your ego. 10 PRINT “SOLVED ANYTHING YET?” 20 GOTO 10
  • 19. Attack surface ● 1 OS = N supported formats ● For each format: ○ How many parsers? ○ For each parser: ■ Which version, compiler...
  • 20. The PGM or PPM formats are the easiest way to convert any data in valid grayscale or RGB pictures. But most people don’t know it’s supported out of the box by many softwares.
  • 21. We should reduce the attack surface. How many unsuspected supported [sub-]formats and parsers? https://lcamtuf.blogspot.com/2014/10/psa-dont-run-strings-on-untrusted-files.html
  • 22. How many file formats supported by your browser ? By your OS? How many do you really need ? Think “embedded”.
  • 23. Capacity is still too cheap: we keep stacking formats/features, which doesn’t solve anything. It’s a problem everywhere. We keep losing ground.
  • 25. “Pokemon plays Twitch” 1. Exploit a GameBoy game via input 2. Take over the Super GameBoy 3. Take over the Super Nintendo
  • 26. The file itself can perform the exploit (on the hardware or an emulator). The payload displays the article.
  • 27. --> PoC||GTFO 10 is a PoC-ception: - a PDF article describing the exploit - a file performing the exploit (to display the article)
  • 28.
  • 29. “young celebs” What they were supposed to be doesn’t really matter.
  • 30. What file formats were supposed to be doesn't matter anymore, what they are now is all we care. Security cares about current reality, not obsolete theory.
  • 31. We can blame bad parsers. What about the file formats? If the map is unclear enough, you’ll get lost anyway. A blurry file format will never lead to a clean parser.
  • 32. use a ready-made translator: an import/export library Write your own: read the specs. 2 ways to communicate
  • 34. To exploit hash collisions, I abused JPEG. To abuse JPEG “everywhere”, just abuse LibJPEG.
  • 35. JPEG format’s landscape in practice, JPEG is LibJPEG turbo v6 ● de facto standard ○ later versions not used (different API) Even if you create your own JPEG library, you want to have full LibJPEG compatibility. JPEG format is defined by LibJPEG.
  • 36. I made extremely custom PDFs for each reader.
  • 37. These "extreme" PDFs fail on any other reader.
  • 38. PDF’s current landscape PDF: 6 interpretations of the specs ● specs are even more useless
  • 39. One good open library: a unified attack surface Fuzz it, pwn everyone ? True, but also fixed for everyone! Is diversity really good? We’re all supposed to use the same file format.
  • 40. Diversity is good? Attack surface is worse. Unofficial substandards.
  • 41. In any cases... Specs are merely an introduction guide. A free set of examples w/ corner cases. A grammar ?
  • 42. PDF’s future PDF/E (engineer): 3d crap PDF/A (archiving): already 8 flavours Specs: ● specs are now commercial ● the main implementation is not open ● no set of free files. And all countries preserve their culture with that format?!?! We’re waiting for a new disaster...
  • 43. many file formats are abandoned One specs. then nothing. It’s like knowing about someone only from a baby’s picture.
  • 45. PoC||GTFO 11 is a webserver serving itself, with its own HTML page extracting its own attachments from its ZIP. $ruby pocorgtfo11.pdf Listening for connections on port 8080. To listen on a different port, re-run with the desired port as a command-line argument. A neighbor at 127.0.0.1 is requesting / A neighbor at 127.0.0.1 is requesting /ajax/feelies.json A neighbor at 127.0.0.1 is requesting /favicon.png $unzip -l pocorgtfo11.pdf Archive: pocorgtfo11.pdf Length Date Time Name -------- ---- ---- ---- 0 03-16-16 13:37 4am/ 25955 03-11-16 15:06 4am/Stickybear Math 2 (4am crack).txt [...] 3241 03-16-16 13:37 wafflehouse.txt -------- ------- 8177332 23 files
  • 46. --> PoC||GTFO 11 is self-aware: a PDF that serves itself (HTTP quine), parses its own ZIP to serve its archived feelies.
  • 48. Do you still sleep with a teddy bear?
  • 49. Kids really deprecate stuff Our computers still handle always more and more file formats. ⇒ The attack surface just keeps growing.
  • 50. Obsolete formats are still omnipresent Formats, sub-formats, features...
  • 51. Because it’s unclear if we can go back. We’d be too afraid to deprecate them.
  • 52. Yet we deprecate for security. Example for PDF: JPEG-compressed text is not supported anymore (it could bypass security).
  • 53. Windows PE format becomes stricter (deprecates packers)
  • 54. For example, EPUB 3.1 suddenly killed backward compatibility. http://blog.kbresearch.nl/2016/03/10/the-future-of-epub-a-first-look-at-the-epub-3-1-editors-draft/ Sometimes, it’s not even for security reasons
  • 55. We don’t need new file formats. It’s the same problem again if eventually their specs stop reflecting reality.
  • 56. Even dictionaries have regular updates, to reflect reality.
  • 57. Story time Digipres = PDF worshippers. 150 years of availability? ● Non free specs + closed source software? Here comes the grim reaper: ● Fix your stuff or it will be killed (like Flash) We store our knowledge. What about files born digital? Not infosec, but worrying.
  • 58. veraPDF and its test files: a great initiative.
  • 59. PE.corkami.com: my own collection of hand-made executables and "documentation" (completely free).
  • 60. Some of these failed a lot of software...
  • 61. Consequence of my PE page+corpus ● 'corkami-proof' software ● raises the bar for everyone ● become a hub of knowledge ○ "I can't share the sample", but from the knowledge, my own file will be shared ⇒ even useful for the original contact
  • 63. Attack surface Too many (sub)formats Too many parsers (= no good open lib)
  • 64. Specs Specs shouldn’t be a religious text ● Worshipped, but outdated and worthless Specs should reflect reality (a law) ● updated, enforced, realistic, freely available A good open lib
  • 65. Deprecation Deprecation is a natural cycle, and yet... We are afraid to deprecate because no file format is fully preserved: ● open, up to date specs ● free test coverage
  • 66. But it won’t happen... ...until a great disaster ? It ends up on CNN, with a logo & a website :)
  • 67. Ack Phil Fabrice Travis Sergey Micah Kurt QKumba Hanno...