Chema Alonso, José Palazón “Palako”
Tactical Fingerprinting using metadata, hidden info and lost data using FOCA
2003 – a piece of history
  The Iraq war was about to start.
  The US wanted the UK to be an ally.
  The US sent a document “proving” the existence of weapons of mass destruction.
  Tony Blair presented the document to the UK Parliament.
  Parliament asked Tony Blair: “Has someone modified the document?”
  He answered: No.
2003 – MS Word bytes Tony Blair
What kind of data can be found?
  Metadata:
    Information stored to describe the document.
    For example: creator, organization, etc.
  Hidden information:
    Information stored internally by programs and not editable.
    For example: template paths, printers, database structure, etc.
  Lost data:
    Information left in documents through human mistakes or negligence; it was not intended to be there.
    For example: links to internal servers, data hidden by formatting, etc.
Metadata lifecycle (diagram). Labels: Metadata, Hidden info, Lost data; Wrong management, Bad format conversion, Unsecure options; New apps or program versions; Search engines, Spiders, Databases; Embedded files.
Metadata created by Google
Lost Data
Lost data everywhere
Public server
So… are people aware of this?
  The answer is NO.
  Almost nobody is cleaning documents.
  Companies publish thousands of documents without first cleaning them of:
    Metadata.
    Hidden info.
    Lost data.
Sample: FBI.gov
  Total: 4841 files
Are they clean?
  Total: 1075 files
How many files is my company publishing?
Sample: Printer info found in ODF files returned by Google
Google Sets prediction
Sample: Info found in a PDF file
What files store metadata, hidden info or lost data?
  Office documents:
    OpenOffice documents.
    MS Office documents.
    PDF documents.
      XMP.
    EPS documents.
  Graphic documents:
    EXIF.
    XMP.
  And almost everything…
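As an illustration of how such document properties can be read, here is a minimal sketch for zip-based office formats. It assumes the usual part names (`meta.xml` for ODF, `docProps/core.xml` and `docProps/app.xml` for OOXML); the function name, the sample document and its values are hypothetical and built in memory only for the demo.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Parts that usually carry document properties in zip-based office formats
# (assumption: ODF uses meta.xml, OOXML uses docProps/*.xml).
METADATA_PARTS = ("meta.xml", "docProps/core.xml", "docProps/app.xml")


def read_office_metadata(data: bytes) -> dict:
    """Return {local tag name: text} for leaf elements of known metadata parts."""
    found = {}
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = set(archive.namelist())
        for part in METADATA_PARTS:
            if part not in names:
                continue
            root = ET.fromstring(archive.read(part))
            for element in root.iter():
                text = (element.text or "").strip()
                if text and len(element) == 0:
                    # Drop the "{namespace}" prefix ElementTree adds to tags.
                    local_name = element.tag.rsplit("}", 1)[-1]
                    found.setdefault(local_name, text)
    return found


def build_sample_odf() -> bytes:
    """Build a tiny, hypothetical ODF-like archive in memory for the demo."""
    meta = (
        '<office:document-meta '
        'xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" '
        'xmlns:meta="urn:oasis:names:tc:opendocument:xmlns:meta:1.0" '
        'xmlns:dc="http://purl.org/dc/elements/1.1/">'
        '<office:meta>'
        '<meta:initial-creator>jfoo</meta:initial-creator>'
        '<dc:creator>johnnyf</dc:creator>'
        '<meta:generator>ExampleOffice/1.0</meta:generator>'
        '</office:meta></office:document-meta>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        archive.writestr("meta.xml", meta)
    return buffer.getvalue()


sample_metadata = read_office_metadata(build_sample_odf())
print(sample_metadata)
```

For a real file, the bytes could come from `open(path, "rb").read()`; a sketch like this only covers declared metadata, not hidden info or lost data.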
Pictures with GPS info…
  EXIFREADER
  http://www.takenet.or.jp/~ryuuji/
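EXIF stores a GPS position as three rational numbers (degrees, minutes, seconds) plus a reference letter (N/S or E/W). The sketch below shows how such values can be turned into decimal coordinates; the function name and the sample coordinates are hypothetical, and reading the tags out of an actual image is left to a tool such as the ones named on these slides.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert EXIF-style ((num, den), (num, den), (num, den)) to decimal degrees."""
    degrees, minutes, seconds = (Fraction(num, den) for num, den in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and west longitudes are negative in decimal notation.
    if ref in ("S", "W"):
        value = -value
    return float(value)


# Hypothetical values as they might appear in GPSLatitude / GPSLongitude.
latitude = dms_to_decimal(((40, 1), (25, 1), (0, 1)), "N")
longitude = dms_to_decimal(((3, 1), (42, 1), (0, 1)), "W")
print(latitude, longitude)
```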
Demo: Looking for EXIF information in ODF file
Even videos with users…
  http://video.techrepublic.com.com/2422-14075_11-207247.html
And of course, printed text
What can be found?
  Users:
    Creators.
    Modifiers.
    Users in paths:
      C:\Documents and settings\jfoo\myfile
      /home/johnnyf
  Operating systems.
  Printers:
    Local and remote.
  Paths:
    Local and remote.
  Network info:
    Shared printers.
    Shared folders.
    ACLs.
  Internal servers:
    NetBIOS name.
    Domain name.
    IP address.
  Database structures:
    Table names.
    Column names.
  Device info:
    Mobiles.
    Photo cameras.
  Private info:
    Personal data.
  History of use.
  Software versions.
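The "users in paths" item can be illustrated with a short sketch: user names are pulled out of Windows profile paths and Unix home directories found in extracted text. The two regular expressions are assumptions about common path layouts, not an exhaustive list, and the function name is hypothetical.

```python
import re

# Assumed layouts: "X:\Documents and Settings\<user>\..." or "X:\Users\<user>\..."
WINDOWS_PROFILE = re.compile(
    r'[A-Za-z]:\\(?:Documents and Settings|Users)\\([^\\/:*?"<>|\r\n]+)',
    re.IGNORECASE,
)
# Assumed layout: "/home/<user>/..."
UNIX_HOME = re.compile(r"/home/([A-Za-z0-9._-]+)")


def users_from_paths(text):
    """Return the set of user names that appear inside profile or home paths."""
    users = set()
    for pattern in (WINDOWS_PROFILE, UNIX_HOME):
        users.update(match.group(1) for match in pattern.finditer(text))
    return users


# The two example paths from the slide.
sample_text = r"C:\Documents and settings\jfoo\myfile" + "\n" + "/home/johnnyf"
found_users = users_from_paths(sample_text)
print(sorted(found_users))
```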
How can metadata be extracted?
  The info is in the file in raw format:
    Binary.
    ASCII.
  Therefore hex or ASCII editors can be used:
    HexEdit.
    Notepad++.
    Bintext.
  Special tools can be used:
    ExifReader.
    ExifTool.
    Libextractor.
    Metagoofil.
    …
  …or just open the file!
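Because the info sits in the file in raw form, even a few lines of code can surface it. The following is a minimal sketch of what a strings-style viewer does: it collects runs of printable characters, both as plain ASCII and as UTF-16LE (the encoding often used inside binary office files). The function name, the minimum length and the sample bytes are assumptions made for the demo.

```python
import re


def printable_strings(data: bytes, min_length: int = 4):
    """Return printable ASCII and UTF-16LE runs of at least min_length characters."""
    ascii_runs = re.findall(rb"[\x20-\x7e]{%d,}" % min_length, data)
    utf16_runs = re.findall(rb"(?:[\x20-\x7e]\x00){%d,}" % min_length, data)
    results = [run.decode("ascii") for run in ascii_runs]
    results += [run.decode("utf-16-le") for run in utf16_runs]
    return results


# Hypothetical binary blob mixing an ASCII field and a UTF-16LE printer name.
sample_blob = (
    b"\x00\x01Author: jfoo\x00\xff"
    + "HP LaserJet".encode("utf-16-le")
    + b"\x00\x00"
)
extracted = printable_strings(sample_blob)
print(extracted)
```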
Tools: Libextractor
Tools: MetaGoofil
  http://www.edge-security.com/metagoofil.php
  Yes, also Google…
Your FBI user
Your UN user
Your Scotland Yard user
Your Carabinieri user
Your White House user
Yes, we can!
Drawbacks
  These tools only extract metadata.
  They do not look for hidden info.
  They do not look for lost data.
  They do no post-analysis.
Only metadata
  http://gnunet.org/libextractor/demo.php3
Not very good with XML files (SXW, ODF, OOXML)
Google is [almost] GOD
Filetype or Extension?
FOCA
  Fingerprinting Organizations with Collected Archives.
  Searches for documents in Google and Bing.
  Downloads files automatically.
  Extracts metadata, hidden info and lost data.
  Clusters the information.
  Analyzes the info to fingerprint the network.
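The clustering step can be pictured with a small sketch. This is not FOCA's implementation; it only shows the general idea of grouping per-document findings by category so that each user, printer or server lists the documents it was seen in. The record layout, names and values are hypothetical.

```python
from collections import defaultdict


def cluster_findings(documents):
    """Group findings as {category: {value: [documents where it was seen]}}."""
    clusters = defaultdict(lambda: defaultdict(list))
    for document in documents:
        for category, values in document["findings"].items():
            for value in values:
                clusters[category][value].append(document["name"])
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {category: dict(values) for category, values in clusters.items()}


# Hypothetical extraction results for two downloaded documents.
documents = [
    {"name": "report.doc",
     "findings": {"users": ["jfoo"], "printers": ["HP LaserJet"]}},
    {"name": "budget.odt",
     "findings": {"users": ["jfoo", "johnnyf"], "servers": ["fileserver01"]}},
]
profile = cluster_findings(documents)
print(profile)
```

A value that shows up in many documents (here the user `jfoo`) is a stronger hint about the internal network than a one-off finding.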
Demo: FOCA
FOCA Online
  http://www.informatica64.com/FOCA
Solutions?
First: Clean all public documents
Clean your documents: MS Office 2k7
Clean your documents: MS Office 2k3 & XP
  http://www.microsoft.com/downloads/details.aspx?displaylang=en&FamilyID=144e54ed-d43e-42ca-bc7b-5446d34e5360
OLE streams
  Found in MS Office binary format files.
  Store information about the OS.
  Are not cleaned with these tools.
  FOCA finds this info.
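Binary MS Office files are OLE compound files, which start with a fixed 8-byte signature followed by a small header. The sketch below only recognizes the container and reads two header fields (major version and sector size); it does not walk the streams themselves. The function name and the synthetic header are made up for the demo.

```python
import struct

# Signature at offset 0 of every OLE compound file.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def describe_ole_header(data: bytes):
    """Return basic header fields if data looks like an OLE compound file, else None."""
    if len(data) < 32 or data[:8] != OLE_MAGIC:
        return None
    # Offsets 24-31: minor version, major version, byte order, sector shift.
    minor, major, byte_order, sector_shift = struct.unpack_from("<HHHH", data, 24)
    return {
        "major_version": major,
        "minor_version": minor,
        "little_endian": byte_order == 0xFFFE,
        "sector_size": 1 << sector_shift,
    }


# Synthetic header: signature, 16-byte CLSID of zeros, then the four fields.
sample_header = OLE_MAGIC + bytes(16) + struct.pack("<HHHH", 0x003E, 3, 0xFFFE, 9)
ole_info = describe_ole_header(sample_header)
print(ole_info)
```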
Demo: Looking for info in cleaned document
OpenOffice cleaning options
  Only metadata.
  Not cleaning hidden info.
  Not cleaning lost data.
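One way to check whether a "cleaned" ODF file still carries hidden info is to look beyond `meta.xml`. The sketch below assumes that printer details live in `settings.xml` as `config:config-item` entries whose name contains "printer" (for example `PrinterName`); the function name, the sample archive and the printer value are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

CONFIG_NS = "urn:oasis:names:tc:opendocument:xmlns:config:1.0"


def printer_settings(odf_bytes: bytes) -> dict:
    """Return non-empty printer-related config items found in settings.xml."""
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as archive:
        if "settings.xml" not in archive.namelist():
            return {}
        root = ET.fromstring(archive.read("settings.xml"))
    leftovers = {}
    for item in root.iter("{%s}config-item" % CONFIG_NS):
        name = item.get("{%s}name" % CONFIG_NS, "")
        value = (item.text or "").strip()
        if "printer" in name.lower() and value:
            leftovers[name] = value
    return leftovers


def build_sample_archive() -> bytes:
    """Build a tiny, hypothetical ODF-like archive with only settings.xml."""
    settings = (
        '<office:document-settings '
        'xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" '
        'xmlns:config="%s"><office:settings>'
        '<config:config-item-set config:name="ooo:configuration-settings">'
        '<config:config-item config:name="PrinterName" config:type="string">'
        'HP LaserJet on PRINTSRV01</config:config-item>'
        '<config:config-item config:name="AddFrameOffsets" config:type="boolean">'
        'false</config:config-item>'
        '</config:config-item-set></office:settings>'
        '</office:document-settings>' % CONFIG_NS
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("settings.xml", settings)
    return buffer.getvalue()


remaining = printer_settings(build_sample_archive())
print(remaining)
```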
Cleaning documents
  OOMetaExtractor
  http://www.codeplex.org/oometaextractor
Demo: OpenOffice “Security” Options…
Are you safe relying on your users?
IIS MetaShield Protector
  http://www.metashieldprotector.com
Second: Beg Google to delete all the cached files
Don't trust your users!!!
Don't complain about your job!!
PS: This file also has metadata
Thanks
Authors
  Chema Alonso, chema@informatica64.com
  Jose Palazón “Palako”, palako@lateatral.com
  Enrique Rando, Enrique.rando@juntadeandalucia.es
  Alejandro Martín, amartin@informatica64.com
  Francisco Oca, froca@informatica64.com
  Antonio Guzmán, antonio.guzman@urjc.es
Metadata Security: MetaShield Protector
