- Hi, my name is Godfrey Nolan, president of RIIS LLC, a mobile development company in Southfield, MI.
- Welcome to the conference; I'm really looking forward to the next couple of days.
- Author of Decompiling Java and now Decompiling Android, due out tomorrow.
- Free copy of the book if you visit our booth and drop off your business card.
2. Why are we here
We're here to talk about how to protect your Android source code, and to look at the tools and techniques that I've encountered while writing the book and as part of our security practice.
- Hear no evil, see no evil
Some of the excuses I regularly hear on why decompilation can be ignored are:
Create a good Android application and continue to improve on it, and the source will protect itself.
Good support and regular upgrades are much better ways of protecting your code than any tools or techniques.
Understanding your own code after 6 months is hard, so how can anyone else understand reverse engineered code?
However, the issue is not the quality of your code. APKs are client-side apps which often communicate with backend systems, so you need to protect any usernames, passwords or API keys that are exposed when the code is decompiled.
- Decompile APK
Now let's see how someone would decompile an APK. I'm going to use a simple app called Agile and Beyond, which we did for an agile conference in 2011.
ACTION - open the app using the ddms shortcut, show a couple of pages and refresh
ACTION - pull the file, show the code and compare it to the original, explain how it's not perfect (Dataservice)
    adb shell
    su
    ls /data/app
    ls /data/app-private
    adb pull /data/app/com.riis.agile.agileandbeyond.android-1.apk .
    dex2jar\dex2jar com.riis.agile.agileandbeyond.android-1.apk
- Raising the bar
As you can see it's pretty easy to decompile the code; the hardest thing to do is to get the USB drivers. We have ProGuard, yes we do, but nobody is using it. Out of 100 APKs we downloaded, 1 was obfuscated correctly and that was a PhoneGap project.
4. Why is it so easy (cont'd)
Currently the main reason why decompilation is a problem for Android is its close relationship to Java. As you're probably aware, the Java code that you write gets compiled into a classes.dex file. And if you're not, then this is a good time to take a look inside an APK file, which is basically a zip file.
ACTION - rename apk to zip and then unzip the file.
Regardless of what IDE or command line tool you use for your builds, your Java code gets compiled first into Java class files and then into the classes.dex file using the 'dx' command that comes with the Android SDK. The format of the classes.dex file is completely different from the Java class file. I don't know if you've been following the ongoing court case between Oracle and Google, but I had presumed that the classes.dex format came out of Google trying to avoid paying licensing fees to Sun or Oracle. However, now I'm pretty sure that they were trying to create a minimalist format for small phones which would have to run multiple virtual machines. If you want to convince yourself, take a look at the size of a classes.dex file and compare it to the jar file that dex2jar produces from it. The classes.dex file is typically a lot smaller than the corresponding jar file.
Xiaobo Pan from Hangzhou in China created a tool called dex2jar that reverses the dx process and converts the classes.dex file back into a jar file, so it can be decompiled using any of the many Java decompilers. It's not 100% perfect, so it's usually quite hard to recompile the code, but in most cases it's good enough to provide a lot of valuable information to a hacker.
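The "an APK is basically a zip file" point can also be shown in a few lines of Java. This is only a minimal sketch using the standard java.util.zip classes; the class name ApkEntryLister and the in-memory stand-in archive are made up for illustration, and a real APK will contain many more entries.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: lists the entries of any zip-format stream, such as an APK.
class ApkEntryLister {

    // Reads the stream as a zip archive and returns the entry names in order.
    static List<String> listEntries(InputStream in) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    // Builds a tiny stand-in archive in memory so the sketch runs without a real APK.
    // The entry contents are placeholders, not valid dex or manifest data.
    static byte[] fakeApk() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (String name : new String[] {"AndroidManifest.xml", "classes.dex", "resources.arsc"}) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("placeholder".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Pass a path to a real .apk as the first argument, or run with no
        // arguments to list the in-memory stand-in.
        InputStream in = args.length > 0
                ? new FileInputStream(args[0])
                : new ByteArrayInputStream(fakeApk());
        for (String name : listEntries(in)) {
            System.out.println(name);
        }
    }
}
```

Run against a pulled APK, the listing should include the single classes.dex file alongside AndroidManifest.xml and the resources.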
5. Why is it so easy (cont'd)
In this slide I'm showing the format of the Java class file on the left hand side and the classes.dex file on the right hand side. The main difference is that there is only one classes.dex file in an APK, whereas there are multiple class files in a jar file.
ACTION - show the class file in xml and the classes.dex file in xml
The classes.dex file has:
Different structure
Different opcodes
Register based, not stack based
Multiple DVMs on device
The bytecode lives in the data section of the classes.dex file and in the code attribute section of the class file. This bytecode gets reverse engineered back into source code. And although dex2jar changes the Android bytecode back into Java bytecode, there is no reason why the classes.dex file can't be reverse engineered into Java source directly.
By the way, the principal way of protecting your files is obfuscation. Obfuscators work by renaming variables in the constant pool of the class file or in the ids area of the dex file.
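One quick way to see that the two formats really are different is to look at the first few bytes of each file. The sketch below is illustrative only: the class name MagicSniffer is made up, and it checks nothing beyond the leading magic bytes (0xCAFEBABE for a class file, "dex\n" followed by a version string for a dex file).

```java
import java.util.Arrays;

// Hypothetical helper: tells a Java class file from a dex file by its magic bytes.
class MagicSniffer {

    // Every Java class file starts with the four bytes 0xCAFEBABE.
    static final byte[] CLASS_MAGIC = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};

    // A dex file starts with "dex\n"; a version string such as "035\0" follows.
    static final byte[] DEX_MAGIC = {'d', 'e', 'x', '\n'};

    // Returns "class", "dex" or "unknown" based only on the leading bytes.
    static String identify(byte[] header) {
        if (startsWith(header, CLASS_MAGIC)) {
            return "class";
        }
        if (startsWith(header, DEX_MAGIC)) {
            return "dex";
        }
        return "unknown";
    }

    // True when data begins with all the bytes of prefix.
    static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }

    public static void main(String[] args) {
        byte[] dexHeader = {'d', 'e', 'x', '\n', '0', '3', '5', 0};
        byte[] classHeader = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 50};
        System.out.println(identify(dexHeader));
        System.out.println(identify(classHeader));
    }
}
```

Everything after the magic bytes (the string, type and method ids and the data section in a dex file; the constant pool and attributes in a class file) needs a real parser such as the disassemblers mentioned in this talk.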
6. Possible Exploits
OK, you've seen how easy it is to gain access to the source, so what exactly does that mean? What are the possible exploits?
ACTION - show API key, usernames and passwords, and credit card information
The names and locations have been changed to protect the innocent.
While I don't think gaining access to one person's credit card information is a huge issue, you'll probably find that it's not PCI compliant and you'll fail an audit. There is also the possibility that someone could recompile your app, or a modified version of your app, and harvest usernames and passwords.
7. Downloading APKs
Let's take another look at downloading APKs from the phone or device onto your computer for decompilation.
Back up the app using Astro File Manager, then use the SD card to get it off the phone.
ACTION - show Astro File Manager
Rooting the phone: Z4root uses the rage against the cage exploit. It spins up as many adb shells as your phone can handle and the last one standing is rooted. There are similar exploits, such as GingerBreak for Gingerbread and Superboot for Ice Cream Sandwich.
If your APK is out there and it's got any number of downloads, you can be pretty sure that it's been shared on any number of forums.
ACTION – do a search for xda-developers forum fandango apk and click on [Q] fandango apk
So if there are any security issues in old APKs, be aware that simply fixing them and doing a marketplace update isn't going to be enough if the old APK still has the keys to the castle.
Explain the difference between disassemblers and decompilers.
We've already talked about the Android Debug Bridge or adb, dex2jar and JD-GUI.
We also saw that dx can be used to compile Java classes into classes.dex files. It's also worth looking at its output log, as it's one of the more complete disassemblers and does a really good job of pulling apart the classes.dex file.
ACTION – show dx command and output log
dexdump is another disassembler, more in the vein of javap, that also comes with the Android SDK.
ACTION – show dexdump command and output
Dedexer is an alternative to dx's log file, written by Gabor Paller in Hungary. Personally I really like dedexer as its output is easy to parse, so I used it a lot in the book as a good starting point so that I didn't have to convert the hexadecimal bytes before parsing the classes.dex file.
ACTION – show dedexer command and output
smali and baksmali continue the Icelandic motif that you'll find everywhere in Android: baksmali means disassembler and smali means assembler, to go along with Dalvik (a fishing town in Iceland) as in the Dalvik Virtual Machine.
AXMLPrinter2 converts the compressed AndroidManifest.xml in an APK back into a readable format.
Let's use apktool to show smali and AXMLPrinter2.
ACTION – run apktool d
So what can you do to protect yourself? The two options are obfuscation and, if you want to go a step further, the Android NDK or Native Development Kit. We've already said that iPhones have less of an issue with decompilation as the code is compiled down into a binary. Well, the good news is that you can do the same using the Android NDK. You can write your code in C++ and use the NDK to compile it into a library that can be included in your APK. The bad news is that you need a different version for each chip you are targeting. Almost every phone and tablet runs ARM, so unless your APK is going to be running on an Intel chip you should be OK.
You need to know what types of obfuscation are out there so you can decide what makes sense for you and just how high you want to raise the bar.
Christian Collberg wrote a paper called "A Taxonomy of Obfuscating Transformations", which is where I took the list from. We can break obfuscations into 3 main types, namely Layout, Control and Data. The more transformations you employ, the less likely it is that anyone or any tool can understand the original source. 99% of the obfuscation that you'll find in early Java obfuscators was layout obfuscation.
ACTION – show layout.java
The concept behind control obfuscations is to confuse anyone looking at decompiled source by breaking up the control flow of the source. Functional blocks that belong together are broken apart, and functional blocks that don't belong together are intermingled to make the source much more difficult to understand. If you remember Goto Considered Harmful, well, the holy grail of obfuscation is to do just that: interleave gotos in the bytecode so that the control flow becomes irreducible, or almost impossible to decompile.
ACTION – show Control.java and Interleave.java
Data obfuscations reshape the data into less natural forms to create confusion when someone is looking at your code.
The best example of a data obfuscation is from ProGuard. In this demo we're using WordPress' open source Android app so we can compare the original source to the obfuscated source. This is taken from the book and was a method chosen at random from the WordPress source.
From the mapping file we know that the public static void escapeHtml(Writer writer, String string) and public static void unescapeHtml(Writer writer, String string) methods have been pushed to a separate file, r.java, which uses data obfuscation and is basically unintelligible.
ACTION – show EscapeUtilsBefore.java and EscapeUtilsAfter.java
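To make the first two categories concrete, here is a small hand-written illustration. It is not the output of ProGuard, DashO or any other tool, and the class and method names are invented: the same method is shown in readable form, after a layout obfuscation (meaningful names replaced), and after a simple control obfuscation (an always-true "opaque predicate" that adds a branch which never runs).

```java
// Hypothetical, hand-written illustration of layout and control obfuscation.
class ObfuscationSketch {

    // Original, readable form: adds up the positive values in the array.
    static int sumOfPositives(int[] values) {
        int total = 0;
        for (int value : values) {
            if (value > 0) {
                total += value;
            }
        }
        return total;
    }

    // Layout obfuscation: identical logic, but every name has lost its meaning.
    static int a(int[] a) {
        int b = 0;
        for (int c : a) {
            if (c > 0) {
                b += c;
            }
        }
        return b;
    }

    // Control obfuscation: c * (c + 1) is always even, so the outer test is
    // always true and the else branch is dead code that only adds confusion.
    static int b(int[] a) {
        int b = 0;
        for (int c : a) {
            if ((c * (c + 1)) % 2 == 0) {
                if (c > 0) {
                    b += c;
                }
            } else {
                b -= c; // never executed
            }
        }
        return b;
    }

    public static void main(String[] args) {
        int[] sample = {3, -1, 4, 0, -5};
        // All three calls compute the same result.
        System.out.println(sumOfPositives(sample));
        System.out.println(a(sample));
        System.out.println(b(sample));
    }
}
```

A reader of the decompiled a or b has to work out from the body alone what the method is for, which is exactly the extra effort obfuscation is meant to cost.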
Once you have obfuscated the code, all the method renaming can make it difficult to debug an APK once it's been made available in the marketplace. Thankfully each obfuscator has a mapping file that shows you which methods got renamed to what.
ACTION – show slide mapping.txt
ProGuard also has a retrace.jar file which can be used in conjunction with the mapping.txt file and your stack trace to help you debug what happened.
    java -jar retrace.jar mapping.txt stackfile.trace
However, this can become a nightmare if you have multiple updates, which of course will have different obfuscations each time. So you need to come up with some solution, such as storing the mapping.txt files in your Subversion or GitHub repositories, so you can always get back to them for each version of the APK.
Unit testing is also an issue with obfuscation, as you can see some of the methods can change quite dramatically. Currently we're doing unit testing before obfuscation, and then integration or functional tests after obfuscation along with some automated UI tests. There are ways to tell the ProGuard.cfg file to ignore files for unit testing, but it didn't work very well for us and we ended up not obfuscating too many classes. I would love to hear anyone's input on this after the class.
As you can see from the Obfuscation Theory slide, obfuscation is defactoring as opposed to refactoring, so it may seem very counterintuitive to many. I'm not advocating that you start employing bad programming practices; what I am advocating is the use of a tool to automate the process.
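To show what the mapping file gives you, here is a minimal sketch that reads only the class-level lines of a ProGuard-style mapping.txt (the unindented "original -> obfuscated:" lines) and builds a lookup from obfuscated name back to original name. It is not a replacement for retrace.jar: it ignores the indented member lines, and the class name MappingLookup and the sample names in main are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: maps obfuscated class names back to their original names.
class MappingLookup {

    // Parses the unindented "original.Name -> obfuscated.Name:" lines of a
    // mapping file; indented field and method lines are skipped.
    static Map<String, String> parseClassMappings(String mappingText) {
        Map<String, String> obfuscatedToOriginal = new HashMap<>();
        for (String line : mappingText.split("\n")) {
            if (line.isEmpty() || Character.isWhitespace(line.charAt(0)) || line.startsWith("#")) {
                continue; // blank line, member line or comment
            }
            String trimmed = line.trim();
            int arrow = trimmed.indexOf(" -> ");
            if (arrow < 0 || !trimmed.endsWith(":")) {
                continue; // not a class mapping line
            }
            String original = trimmed.substring(0, arrow);
            String obfuscated = trimmed.substring(arrow + 4, trimmed.length() - 1);
            obfuscatedToOriginal.put(obfuscated, original);
        }
        return obfuscatedToOriginal;
    }

    public static void main(String[] args) {
        // Made-up sample in the same shape as a real mapping file.
        String sample =
                "com.example.app.DataService -> com.example.app.a:\n"
              + "    java.lang.String url -> a\n"
              + "    void refresh() -> b\n"
              + "com.example.app.LoginActivity -> com.example.app.b:\n";
        Map<String, String> classes = parseClassMappings(sample);
        System.out.println(classes.get("com.example.app.a"));
        System.out.println(classes.get("com.example.app.b"));
    }
}
```

Because every build can rename things differently, a lookup like this is only meaningful when it is fed the mapping.txt that belongs to the exact APK version the stack trace came from.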
WordPress has an open source Android app that is great for comparing obfuscators. It's a large real world app where you have access to the source code. Let's take a look at the unobfuscated and obfuscated jars for the WordPress app.
ACTION – show unobfuscated.jar, show wordpress_proguard, show wordpress_dasho
Run ProGuard
ACTION – show proguard.out and explain about shrinking
ACTION – run ProGuard and run DashO
ACTION – show libriis-jni.so
I found out yesterday that the launch has been pushed back to June 20th. The book is currently available on Apress' alpha books, which is an ebook format, and you can pre-order it on Amazon. Also, we're giving away a free book to anyone who drops off their business card at our booth tomorrow. We'll ship a signed copy to you once it comes out.
The book has lots of the same code I've shown here, and it also has DexToXML, a classes.dex disassembler, and DexToSource, which is the first Android decompiler. But before you get all excited, it's not very comprehensive. It'll do the examples in the book and not much else. Both are written in ANTLR, and it's pretty easy to follow the code and extend it if you really want to join me in my obsession with parsing bytecode.
If you want to follow the latest developments with the book and the parsing tools that we're working on, please go to decompilingandroid.com, and if you want to send me an email, my address is firstname.lastname@example.org. If you're interested in having us help you secure your code, or want to learn more about our Android development projects, you can find out more about RIIS at http://www.riis.com or visit us at our booth tomorrow.
Had some good news today: one of our clients has instituted some fixes that stop their iPhone app from being cracked, which is a good segue into the future of decompilation. In the early days Mocha was the big decompiler. HoseMocha put a stop to that by adding an extra pop bytecode which Mocha couldn't handle. We're looking at ways to do that with dex2jar and whatever else comes along. It's a mini arms race between the hackers and the security folks.
Protecting Source Code
Hear no evil, see no evil
Decompiling APK demo
Raising the bar
Easy access to APKs
APK design
Nobody using obfuscation
According to DuoSecurity
Over 50% of Android phones are rootable
See Xray.io for more information
Vulnerabilities: ASHMEM, Exploid, Gingerbreak, Levitator, Mempodroid, etc.