Security in mobile banking apps

Security in mobile banking applications
Alexandre Teyar
Tutor: Prof. Abdelmalek Benzekri
Supervisors:
Prof. Christian Rohner
Prof. Thiemo Voigt
A thesis presented for the degree of
Master Engineering
Department of Information Technology
Uppsala University
Sweden
September 18, 2015


Acknowledgment
At the very outset of this thesis report, I would like to express my special gratitude to
Mr. Christian Rohner and Mr. Thiemo Voigt, who gave me the great opportunity to come
to Sweden to pursue my master thesis within the Department of Information Technology
at Uppsala University. I would also like to thank them for their guidance and supervision
throughout the master thesis period and for their support in completing the project.
I express my profound and sincere gratitude to Mr. Abdelmalek Benzekri for his
support and understanding of the issues that I encountered during my stay in Uppsala.
I extend my acknowledgements to Mrs. Jackie Leroux and Mrs. Anna-Lena Forsberg
for their involvement in my administrative procedures, and for the support that Anna-Lena
showed in my constant quest to find accommodation.
I also acknowledge with a deep sense of reverence my parents and the members of my
family, who have always supported me.
Last but not least, my thanks and appreciation go to all the working staff of the
Department of Information Technology, who made me feel comfortable among them during
my stay at Uppsala University.

Abstract
It has become a habit for more and more users to manage their bank accounts using
dedicated banking applications on their smart-phones. Consequently, the necessary
precautions must be taken to protect sensitive data, including banking credentials, from
cyber-criminal attacks and to guarantee safe and secure communications.
The purpose of this thesis is to examine and evaluate the wireless security of mobile
application communications with a focus on banking applications.
A large static code analysis showed that security flaws in the wireless communications
of mobile applications are widespread. We also saw that the tool performing this static
code analysis generates a significant number of false positives. In this thesis, we try to
reduce that number. We address this issue with a new analysis method that combines a
dynamic code analysis with a manual analysis of log files: we trace the applications'
runtime method calls (dynamic code analysis) and then apply a procedure that uses those
traces to identify the critical functions involved in the (in)security of the applications'
wireless communications (manual log file analysis).
Résumé
It has become a habit for more and more users to manage their bank accounts using
dedicated banking applications on their mobile phones. Consequently, the necessary
precautions must be taken to protect sensitive data, including banking credentials, from
cyber-criminal attacks and to guarantee safe and secure communications.
The aim of this thesis is to examine and evaluate the security of the wireless
communications of mobile applications, with a particular focus on banking applications.
A large static code analysis showed us that there are significant security flaws in the
wireless communications of mobile applications. We also saw that the tool performing this
static code analysis generates a good number of false positives. In this thesis, we try to
reduce that number. We address this problem with a new analysis method comprising a
dynamic code analysis and a manual analysis of log files. To do so, we trace the method
calls of running applications (dynamic code analysis); we then have a procedure for
identifying, from these log files, the critical functions involved in the security of the
wireless communications of mobile applications (manual log file analysis).

Contents
1 Introduction
2 Background and Related Work
2.1 Secure connection
2.1.1 Certificates
2.1.2 SSL
2.1.3 Man in the middle attack
2.2 SSL implementation in the Android's platform[4]
2.2.1 Pinning
2.2.2 Blacklisting
2.3 Android Platform[7][8]
2.3.1 Linux Kernel
2.3.2 Libraries and Android Runtime
2.3.3 Application Frameworks
2.3.4 Applications
2.4 Related Work
3 Static code analysis on the Google Play Store
3.1 Analysis context
3.1.1 Challenges
3.2 Static code analysis[12]
3.3 Outcomes
3.4 Manual analysis of insecure banking applications
4 Identifying the critical code section in the certificate validation process
4.1 Reverse engineering[13]
4.2 Generation of the applications traces
4.3 SSL certificate validation critical code section identification
5 Conclusion
5.1 My feedback
A Appendix Graphs
B Appendix Tables

List of Figures
2.1 The structure of a certificate
2.2 HTTP vs HTTPS
2.3 SSL handshake
2.4 Man in the middle attack in a SSL context
2.5 Browser certificate warning message
2.6 Interfaces of the javax.net.ssl library
2.7 Android stack
2.8 Android Sandboxing
2.9 Compilation
2.10 Android Application Framework Bundles
2.11 Android Activity Lifecycle
2.12 Eclipse Project Explorer
2.13 The structure of an APK
3.1 Overview of the hermes script
3.2 Types of verifiers observed
3.3 Applications containing a bad verifier grouped by the year of publication or last update, and percentage of such applications in that group
3.4 Applications containing a bad verifier grouped by download, and percentage of such applications in that group
3.5 Applications containing a bad verifier grouped by rating, and percentage of such applications in that group
3.6 (In)Security cases
4.1 Decompiling process of an Android app
4.2 Disassembly process of an Android app
4.3 Overview of the smali-code-injector script
4.4 A method call entry
4.5 Application trace
4.6 Intra-method calls concept
4.7 Indented trace
4.8 The manual analysis procedure
4.9 Dynamic code analysis tool chain
A.1 Usage of the Internet permission
A.2 Distribution of verifier types over categories

List of Tables
B.1 Distribution of applications with internet permission over categories
B.2 Distribution of applications over verifier types
B.3 Applications with bad verifier grouped by the year they were published or last updated, and share of such applications in that group
B.4 Applications with bad verifier grouped by their download number, and share of such applications in that group
B.5 Applications with bad verifier grouped by their rating, and share of such applications in that group

Chapter 1
Introduction
Nowadays, smart-phones can be viewed as "pocket computers" or "digital wallets". We
carry them with us almost every day, and their processing power and memory capacity
have increased to the point that they can now run almost any kind of application.
They also contain personal and sensitive data including, but not limited to, the real-time
user location, passwords and banking credentials. Moreover, smart-phones are designed to
be nomadic devices that move from one network to another, including untrusted networks
where their data are exposed to wireless attacks, making them an easy and privileged
target for cyber criminals.
The purpose of this thesis is to examine and evaluate the wireless security of mobile
application communications, with a focus on banking applications, which are critical
financial applications. This thesis uses the Android platform, which is currently the
world's most used smart-phone operating system, with an 84.6 percent share of global
smart-phone shipments in 2014 according to research by Strategy Analytics, and over
1,400,000 available applications as of November 2014 according to AppBrain Stats.
A key to the success of Android smart-phones lies in the Google Play Store and Android
software development, which are relatively open and unrestricted (see Chapter 3).
Anyone can publish an application on the Google Play Store without any strong
verification from Google. This offers both developers and users more flexibility and
freedom, but it also creates significant security challenges.
The Android ecosystem is all about communicating: a large number of mobile
applications use web services or are client-server based. This is motivated by the
developers' wish to extend the functionality of their applications and to provide custom
services for their clients.
Applications can communicate using the HTTP protocol, which provides no security
at all and makes it easy to intercept data. But they can also use the HTTPS protocol,
which basically is HTTP over SSL. In theory HTTPS makes it harder, if not impossible,
to intercept data.
However, in 2012, Fahl et al. analysed 13,500 applications from the Android Market
(known today as the Google Play Store) and found that 1,074 of them use HTTPS
incorrectly, with configurations that accept either any certificate or any certificate
signed by a trusted Certificate Authority.
It is in this context that we decided to write this thesis on "Security in mobile
banking applications". In this thesis, the security aspect of mobile applications is limited
to the wireless communication of Android devices. In other words, we study the security
threats posed by the misuse of SSL in Android wireless communications, with a focus on
banking applications.
In Chapter Background and Related Work, we provide an overview of the technologies
used in this thesis as well as the related work on Android SSL security.
In Chapter Static code analysis on the Google Play Store, we perform a static
code analysis on 4,795 different applications from the Google Play Store to detect whether
they implement broken certificate verifiers that may make them vulnerable to wireless
attacks. We also manually analyse some banking applications that the static code analysis
labelled as vulnerable to wireless attacks. This manual analysis consists of real-world
man-in-the-middle attacks that test the wireless security of the banking applications. It
revealed the unreliability of the static code analysis: the selected banking applications
turned out to be secure against wireless attacks even though they contain incorrect
certificate verifiers.
We noticed that the static code analysis has a high false-positive rate. To improve on
it, we performed an additional code analysis. In Chapter Identifying the critical code
section in the certificate validation process, we aim to identify the critical code
section in the certificate validation process of applications using dynamic code analysis.
We also attempt to automate the identification of this critical code section, to provide a
new method of evaluating the wireless security of Android applications.
Finally, in Chapter Conclusion, we conclude with the contributions of this work and
the improvements that can be made to complete the dynamic code analysis method that we
started to develop.

Chapter 2
Background and Related Work
This background chapter reviews the main concepts related to the establishment of secure
connections and their different implementations on the Android platform. We also provide
a quick introduction to the Android platform stack.
2.1 Secure connection
A secure connection protects the communication between the communicating parties from
being eavesdropped on or tampered with through a man in the middle attack. A secure
connection relies on the SSL protocol¹, which uses a certificate system to guarantee
authentication, data integrity, and data confidentiality.
2.1.1 Certificates
Authentication is a central concern in the SSL protocol and is ensured by the use of
digital certificates. "Authentication is the process of determining whether someone or
something is, in fact, who or what it is declared to be." [1]
"A digital certificate certifies the ownership of a public key by the named subject of the
certificate" [2]. It also carries a number of mandatory and optional fields and attributes
related to the subject's identity (see Figure 2.1). In short, a digital certificate can be
viewed as a sort of "identity card". This allows others (relying parties) to rely upon
signatures or assertions made by the private key that corresponds to the certified public key.
¹ https://tools.ietf.org/html/rfc6101

Figure 2.1: The structure of a certificate
Source: https://i-technet.sec.s-msft.com/dynimg/IC196464.gif
The X.509 certificate model is the most commonly used certificate model; nonetheless,
the OpenPGP certificate model is also supported in SSL.
An X.509 certificate can either be self-signed or signed by a Certificate Authority (see
Section 2.1.1). If the other party's certificate is self-signed, the verifier needs to have
this self-signed certificate hard-coded into its trust pool. In all other cases, the verifier
needs to trust the CA that signed the other party's certificate, or a CA that signed the
intermediate CA (chain of trust).
Certificate Authorities
"In cryptography, a Certificate Authority or Certification Authority (CA) is an entity that
issues digital certificates" [2]. A CA is a third party trusted both by the subject (owner)
of the certificate and by the party relying upon the certificate. Usually, client software
such as browsers includes a set of trusted CA certificates.
2.1.2 SSL
This section reviews the Secure Sockets Layer (SSL) protocol and how it is used in the
HTTPS protocol.
HTTP is a protocol for sending requests and receiving answers, each request and
answer consisting of detailed headers and (possibly) some content. HTTP traffic is
unencrypted: all the data are sent in plain text, which allows an attacker to intercept and
modify them without being detected by any of the participants in the communication.
The HTTP protocol is therefore vulnerable to eavesdropping and tampering.
To deal with these issues, the HTTPS protocol was created in 1994 by Netscape
Communications and became a standard in May 2000. HTTP is meant to run over a
bidirectional tunnel for arbitrary binary data; when that tunnel is an SSL/Transport Layer
Security (TLS) connection securing the communication between two hosts, the whole thing
is called "HTTPS".
Figure 2.2: HTTP vs HTTPS
Source: https://www.instantssl.com/images/http-vs-https.png
The terms SSL and TLS⁵ are often used interchangeably or in conjunction with each
other (SSL/TLS), as TLS is the standardized successor of SSL.
SSL handshake[3]
Before the client and the server can begin to exchange application data over SSL, the
encrypted tunnel must be negotiated: the client and the server must agree on the version
of the SSL protocol, choose the cipher suite, and verify certificates if necessary. The
agreement upon these parameters can be compared to a handshake, therefore it is called
the SSL handshake (see Figure 2.3).
Figure 2.3: SSL handshake
Source: https://www.identrustssl.com/images/learn ssl diagram.gif
⁵ https://www.ietf.org/rfc/rfc5246.txt

• The client sends a ClientHello specifying information such as the highest supported
TLS version, a random number, a list of suggested cipher suites and compression
methods.
• The server answers with a ServerHello containing the chosen protocol version, a
random number, and the cipher suite and compression method chosen from those
offered by the client. The server may also send its X.509 Certificate message and a
ServerKeyExchange message (depending on the selected cipher suite, this may be
omitted by the server). When the server is done with the handshake negotiation, it
sends a ServerHelloDone.
• The client may respond with a ClientKeyExchange message, which may contain
a PreMasterSecret, a public key, or nothing (again, this depends on the selected
cipher suite). The PreMasterSecret is encrypted using the public key of the server
certificate.
• After this negotiation phase, both the server and the client compute a common secret
key called the MasterSecret from the random numbers and the PreMasterSecret.
During the second phase, the client sends a ChangeCipherSpec message, which
basically means: "Everything I tell you from now on will be authenticated (and
encrypted if encryption parameters were present in the server certificate)".
• Finally, the client sends an authenticated and encrypted Finished message
containing a hash and MAC over the previous handshake messages.
• The server then follows the same procedure.
This is an example of one-way SSL, in which only the server authenticates itself with a
certificate. There is also two-way SSL, in which the client also has to authenticate itself
with a certificate. One-way SSL is the most used SSL handshake method.
At this point, the "handshake" is complete and both parties can start to send
application data securely (encrypted and authenticated). However, if an error occurs
during the SSL handshake, for instance when the client fails to verify the server's
certificate, the SSL handshake is aborted.
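To illustrate the key derivation step of the handshake, the following Java sketch computes a MasterSecret from a PreMasterSecret and the two random numbers. It assumes the TLS 1.2 construction of RFC 5246 (a pseudo-random function built on HMAC-SHA256); SSL 3.0 derives its keys differently. The class and method names are hypothetical and chosen for illustration only.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;

// Sketch of the TLS 1.2 master secret derivation (RFC 5246, sections 5 and 8.1).
// Assumption: HMAC-SHA256 as the PRF hash; not the SSL 3.0 construction.
final class MasterSecretSketch {

    // HMAC-SHA256 over the concatenation of the given parts.
    private static byte[] hmac(byte[] key, byte[]... parts) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            for (byte[] part : parts) {
                mac.update(part);
            }
            return mac.doFinal();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    private static byte[] concat(byte[] a, byte[] b) {
        byte[] result = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, result, a.length, b.length);
        return result;
    }

    // PRF(secret, label, seed) = P_SHA256(secret, label + seed), truncated to `length` bytes.
    static byte[] prf(byte[] secret, String label, byte[] seed, int length) {
        byte[] labelSeed = concat(label.getBytes(StandardCharsets.US_ASCII), seed);
        byte[] a = labelSeed;                            // A(0) = label + seed
        byte[] out = new byte[length];
        int filled = 0;
        while (filled < length) {
            a = hmac(secret, a);                         // A(i) = HMAC(secret, A(i-1))
            byte[] block = hmac(secret, a, labelSeed);   // HMAC(secret, A(i) + label + seed)
            int n = Math.min(block.length, length - filled);
            System.arraycopy(block, 0, out, filled, n);
            filled += n;
        }
        return out;
    }

    // Both peers call this with the same inputs and obtain the same 48-byte secret.
    static byte[] masterSecret(byte[] preMasterSecret, byte[] clientRandom, byte[] serverRandom) {
        return prf(preMasterSecret, "master secret", concat(clientRandom, serverRandom), 48);
    }
}
```

Because the derivation is deterministic, the client and the server obtain the same MasterSecret without ever sending it over the network; only the PreMasterSecret travels, encrypted with the server's public key.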
2.1.3 Man in the middle attack
A man in the middle attack (MITMA) is an attack where the attacker is in a position
to intercept the messages sent between two parties who believe they are directly commu-
nicating with each other (see Figure 2.4). In a passive MITMA, the attacker can only
eavesdrop (spy) on the communication, and in an active MITMA, the attacker can also
tamper (alter) with the communication.

Figure 2.4: Man in the middle attack in a SSL context
Source: http://www.consumer.ftc.gov/sites/default/files/blog/spoofed-security-certificate.png
MITMAs against mobile devices are somewhat easier to execute than against tradi-
tional desktop computers, since the use of mobile devices frequently occurs in changing
and untrusted networks.
Specifically, the use of open access points and evil twin attacks make MITMAs against
mobile devices a serious threat not to mention other tricks such as DNS or ARP poisoning
that can be performed to obtain a MITM position within a given network.
In addition, most of the effective defences against MITMAs can only be found on
router or server-side.
A failure in certificate validation may indicate that someone is eavesdropping on or
tampering with the communication (see Figure 2.5). In this case, the peer gets the
attacker's certificate instead of the other peer's certificate (see Figure 2.4). However, a
successful certificate validation does not always imply that the communication is safe.
Figure 2.5: Browser certificate warning message
Source: http://img15.hostingpics.net/pics/509896certwarning.png
2.2 SSL implementation in the Android’s platform[4]
An application might use SSL incorrectly, such that malicious entities are able to
intercept the application's data over the network (see Section 2.1.3). This is a
consequence of the possibility of overriding the native implementation of the SSL
interfaces in the javax.net.ssl library, which is used extensively by Android (see Figure 2.6).

Figure 2.6: Interfaces of the javax.net.ssl library
Source: http://oi57.tinypic.com/2s775z6.jpg
The two main Java entities responsible for the certificate validation on Android clients
are:
• TrustManagers: "TrustManagers are responsible for managing the trust material
that is used when making trust decisions, and for deciding whether credentials
presented by a peer should be accepted." [5]
• HostnameVerifier: "Verifies that the specified hostname is allowed within the
specified SSL session." [6]
This native implementation is by nature not vulnerable to MITMAs. However, Android
developers must use a custom implementation to use self-signed, untrusted or unknown
certificates. The main reason why developers use self-signed certificates is that being
certified by a trusted CA costs money, and most Android developers do not want to
spend money when they are not even sure about the profitability of their applications. The
safe way to use a self-signed certificate is to add the CA used to sign the certificate to
the application's trusted CAs. This method is known as certificate pinning (see Section
2.2.1) and requires implementing custom verifiers. Therefore, a custom verifier does not
necessarily mean an insecure application; the actual problem is that there are still
developers who are not aware of those methods and choose the easy way of accepting
any certificate without any verification.
Some implementations of the SSL interfaces are very dangerous for applications
running on untrusted networks, because their lack of security opens the door to MITMAs.
The implementations that offer no security are the following:
• The use of trust managers that do not check certificate chains from remote servers,
making it possible for an MITMA to succeed.
Verifying certificates to ensure that they are signed by a known and trusted
Certificate Authority (CA) is an integral part of certificate-based, client-server
communication.
• The replacement of platform hostname verifiers by application hostname verifiers
that do not verify the hostname of the remote server.
Having a trust manager that checks certificates is not sufficient in this case, as the
attacker may have a certificate signed by a trusted certificate authority and may
present a valid certificate chain. Therefore, to prevent a MITMA, the hostname of
the server extracted from the CA-issued certificate must match the hostname of the
server that the application intends to connect to.

• Applications ignoring SSL errors when they use WebKit to render server pages in
mobile applications.
With server communications that use SSL/TLS, any errors generated should be
caught. Otherwise we open up the vulnerable applications to MITM attacks that
may exploit vulnerabilities such as Javascript Binding Over HTTP (JBOH).
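In Java, the first two patterns of the list above typically look like the following. This is a minimal sketch intended to help recognise such code, not code taken from any analysed application; the class name BrokenVerifiers and the helper accepts are hypothetical.

```java
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.X509TrustManager;
import java.security.cert.CertificateException;
import java.security.cert.X509Certificate;

// INSECURE patterns, shown only so that they can be recognised. Do not use.
final class BrokenVerifiers {

    // A trust manager whose check methods never throw: every chain is accepted.
    static final X509TrustManager TRUST_ALL = new X509TrustManager() {
        @Override
        public void checkClientTrusted(X509Certificate[] chain, String authType) {
            // no verification at all
        }

        @Override
        public void checkServerTrusted(X509Certificate[] chain, String authType) {
            // no verification at all
        }

        @Override
        public X509Certificate[] getAcceptedIssuers() {
            return new X509Certificate[0];
        }
    };

    // A hostname verifier that accepts any hostname for any session.
    static final HostnameVerifier ALLOW_ALL = (hostname, session) -> true;

    // Hypothetical helper: true when the trust manager raises no objection to the chain.
    static boolean accepts(X509TrustManager trustManager, X509Certificate[] chain) {
        try {
            trustManager.checkServerTrusted(chain, "RSA");
            return true;
        } catch (CertificateException | IllegalArgumentException e) {
            return false;
        }
    }
}
```

A trust manager signals rejection only by throwing a CertificateException, so an empty checkServerTrusted body silently accepts the certificate presented by an attacker.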
2.2.1 Pinning
"An application can further protect itself from fraudulently issued certificates by a
technique known as pinning"[4]. This technique basically restricts "an application's trusted
CAs to a small set known to be used by the application's servers. This prevents the
compromise of one of the other 100+ CAs in the system from resulting in a breach of the
applications secure channel"[4].
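One common way to realise pinning with the javax.net.ssl API is to initialise a TrustManagerFactory with a KeyStore that contains only the pinned certificates. The sketch below assumes that the pinned certificates have already been loaded (for example from a file bundled with the application); the class PinnedContext and its method forPins are hypothetical names.

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;
import java.io.IOException;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.cert.Certificate;

// Sketch: an SSLContext that trusts only the certificates passed in.
final class PinnedContext {

    static SSLContext forPins(Certificate... pinned) {
        try {
            // Start from an empty key store instead of the system trust store.
            KeyStore store = KeyStore.getInstance(KeyStore.getDefaultType());
            store.load(null, null);
            for (int i = 0; i < pinned.length; i++) {
                store.setCertificateEntry("pin-" + i, pinned[i]);
            }

            // The resulting trust managers accept only chains anchored in `store`.
            TrustManagerFactory factory =
                    TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            factory.init(store);

            SSLContext context = SSLContext.getInstance("TLS");
            context.init(null, factory.getTrustManagers(), null);
            return context;
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException("could not build pinned SSLContext", e);
        }
    }
}
```

The socket factory of the returned context can then be handed to the HTTPS client of the application, for instance with HttpsURLConnection.setSSLSocketFactory. Hostname verification is left at its default and remains necessary.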
2.2.2 Blacklisting
"SSL relies heavily on CAs to issue certificates to only the properly verified owners of
servers and domains. In rare cases, CAs are either tricked or, in the case of Comodo or
DigiNotar, breached, resulting in the certificates for a hostname to be issued to someone
other than the owner of the server or domain.
In order to mitigate this risk, Android has the ability to blacklist certain certificates or
even whole CAs. While this list was historically built into the operating system, starting
in Android 4.2 this list can be remotely updated to deal with future compromises"[4].
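The idea of a blacklist can be sketched in a few lines of Java. The sketch below is not Android's implementation: it assumes that blacklisted keys are identified by the SHA-256 digest of their encoded form, and the class KeyBlacklist and its methods are hypothetical.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

// Sketch of a public key blacklist keyed by SHA-256 fingerprints (assumption).
final class KeyBlacklist {

    private final Set<String> blocked = new HashSet<>();

    // Hex-encoded SHA-256 digest of an encoded public key.
    static String fingerprint(byte[] encodedKey) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(encodedKey);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    void block(byte[] encodedKey) {
        blocked.add(fingerprint(encodedKey));
    }

    boolean isBlocked(byte[] encodedKey) {
        return blocked.contains(fingerprint(encodedKey));
    }
}
```

In a verifier, the encoded key would come from certificate.getPublicKey().getEncoded() for each certificate of the chain, and the chain would be rejected as soon as one key is blocked.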
2.3 Android Platform[7][8]
Android is a mobile operating system (OS) currently developed by Google. The Android
OS consists of several layers, each with its own purpose (see Figure 2.7). In this section,
we give a high-level overview of the Android operating system stack. The focus of this
work is on layer 4 (Applications) and layer 2 (Libraries and Android Runtime).
Figure 2.7: Android stack
Source: http://img15.hostingpics.net/pics/301709androidstack.jpg

2.3.1 Linux Kernel
Android is built on top of the Linux kernel. This choice was made because Linux
is a mature open-source operating system that provides a pre-built, already maintained
kernel, which includes many readily available hardware abstraction components. Linux is
also a highly secure system: all Android applications run as separate Linux processes with
permissions set by the Linux system (see Section 2.3.1).
This gives Android developers a complete, sane basis to start with and the possibility
of modifying the Linux kernel to fit their needs (limited processing power, memory
capacity and battery life, for instance).
Sandboxing
Running each application as a separate process with different permissions is known as
sandboxing. The sandbox isolates an application's data and code execution from other
applications. One application cannot access data in another application's sandbox without
explicit permission.
Figure 2.8: Android Sandboxing
Source: http://www.pix-host.com/allimages/49100858.jpg
2.3.2 Libraries and Android Runtime
Native Libraries
The native libraries are C/C++ libraries, often taken from the open source community in
order to provide necessary services to the Android application layer. Some of the libraries
are:
• WebKit: A fast web-rendering engine used by Safari, Chrome, and other browsers.
• SQLite: A full-featured SQL database.
• OpenGL: 3D graphics libraries.
• OpenSSL: The secure sockets layer.
• And many others.
Android Dalvik Virtual Machine
Android uses Java as its primary programming language. Java code is compiled into
bytecode, which is then executed on a Java Virtual Machine (VM). Thanks to the Java VM,
Java code can be executed on every machine that runs a Java VM without having to be
recompiled.
For Android, a team at Google created a new virtual machine named Dalvik, designed
specifically for mobile devices, which have constraints on battery life and processing power.
This makes the compilation process a bit different from standard Java (see Figure 2.9).
After the Java bytecode has been created, it is compiled once more, using the Dalvik
compiler, into Dalvik bytecode. It is this Dalvik bytecode that is then executed on the
Dalvik VM.
Figure 2.9: Compilation
Source: http://img15.hostingpics.net/pics/253629dvmjvm.png
Dalvik is based on JIT (just-in-time) compilation: each time an application is run,
the part of the code required for its execution is translated (compiled) to machine code
at that moment. As the user progresses through the application, additional code is
compiled and cached, so that the system can reuse it while the app is running. Since JIT
compiles only a part of the code, it has a smaller memory footprint and uses less physical
space on the device. However, there is a lag between the moment the user launches the
application and the moment it is running.
2.3.3 Application Frameworks
The Application Framework sits on top of the native libraries, the Android runtime and
the Linux kernel. This framework comes pre-installed with high-level building blocks that
developers can use to program applications. The following are the most important
application framework components for Android development in general.
Figure 2.10: Android Application Framework Bundles
Source: http://ows.edb.utexas.edu/sites/default/files/users/nqamar/AppFW.JPG

Activity Manager
An activity is a single, focused thing. Activities can run in the foreground, giving direct
interaction to the user (e.g. the current window or tab), they can run as background
services, or they can be embedded in other activities.
The entire lifecycle is defined by certain methods or states as shown in Figure 2.11.
Figure 2.11: Android Activity Lifecycle
Source: http://ows.edb.utexas.edu/sites/default/files/users/nqamar/activity.JPG
Contents Provider
A content provider handles data across applications globally. Android comes with a set
of built-in content providers to handle multimedia data, contacts, etc. Developers can
create their own providers for flexibility, or they can incorporate their data into one of
the existing providers. For our specific application, we are interested in content://browser
to access online data through the browser interface.
View System
The view system binds together all the classes that handle graphical user interface (GUI)
elements. All view elements are arranged in a single hierarchical tree. They can be created
from Java code or declared in XML layout files. One good thing that we noticed about
Android development is the extensive use of XML files. These files provide a nice
abstraction between the back-end Java code and the layout elements. Designing UI
elements is done in the same fashion as HTML-based web design.
Resource Manager
The Resource Manager handles all non-code assets. These can be anything from icons to
graphics or text. Such resources reside under the res directory, as can be seen in the
Eclipse Project Explorer in Figure 2.12.
All the icons and design work that we have done so far using Adobe Illustrator will
reside under these layout directories.
Figure 2.12: Eclipse Project Explorer
Source: http://ows.edb.utexas.edu/sites/default/files/users/nqamar/explorer.jpg
Location Services
This bundle supports fine-grained location providers such as GPS and coarse-grained
location providers such as cell phone triangulation. The LocationManager system service
is the central component of the location framework. This system service provides an
underlying API to access device location information. Besides the LocationManager class,
several other classes from android.location are important to location-aware applications,
such as Geocoder, GpsSatellite and LocationProvider.
2.3.4 Applications
At the top layer of the Android stack there is the Applications (or apps) layer. These
applications are what end users find valuable about Android. They can come pre-installed
on the device or can be downloaded from the Google Play Store.
The file format used to distribute and install application software and middleware
onto Google’s Android operating system is the Application Package File (APK) (see
Figure 2.13). APK files are basically ZIP file formatted packages based on the JAR file
format, with .apk file extensions.
Figure 2.13: The structure of an APK
Source: http://www.etechtube.com/wp-content/uploads/2014/09/directories-of-apk-file.jpg
The classes.dex file is the Dalvik Executable file that contains the compiled code of the
application and of the Java libraries it uses. This file is the most important material in
the context of our work. Indeed, by applying reverse engineering techniques to it, we can
obtain the decompiled Java source code and the disassembled (Smali) code of any
application.
This Java and assembly code is then used in Chapter 4 to evaluate the wireless security
of the studied applications with the help of well-known Android security tools.
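Because an APK is a ZIP archive, its entries can be inspected with any ZIP library. The
following is a minimal sketch in Python, using only the standard zipfile module; the helper
names and the in-memory demo archive are illustrative assumptions, not part of the tools
used in this thesis.

```python
import io
import zipfile


def list_apk_entries(apk_file):
    """Return the names of all entries stored in an APK (a ZIP archive)."""
    with zipfile.ZipFile(apk_file) as apk:
        return apk.namelist()


def read_classes_dex(apk_file):
    """Return the raw bytes of classes.dex, or None if the entry is absent."""
    with zipfile.ZipFile(apk_file) as apk:
        if "classes.dex" not in apk.namelist():
            return None
        return apk.read("classes.dex")


def build_demo_apk():
    """Build a tiny in-memory archive that only mimics the layout of an APK."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as apk:
        apk.writestr("AndroidManifest.xml", b"<manifest/>")
        apk.writestr("classes.dex", b"dex\n035\x00")
        apk.writestr("res/layout/main.xml", b"<LinearLayout/>")
    buffer.seek(0)
    return buffer
```

With a real application, the path of the .apk file would be passed instead of the demo
buffer, and the extracted classes.dex would then be handed to the reverse engineering
tools.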

2.4 Related Work
In 2012, Georgiev et al. showed in "The most dangerous code in the world: validating
SSL certificates in non-browser software"[9] that the certificate validation process of
SSL, the de facto standard for secure Internet communications, is completely broken
in many security-critical applications and libraries. They highlighted the importance of
proper authentication when dealing with encryption: improper use of certificates results
in a badly authenticated communication, in which an attacker can easily remove the
encryption without either party noticing the attack, thereby gaining access to sensitive
data such as passwords and becoming able to modify the content of the traffic.
They demonstrated with concrete examples that the root causes of these vulnerabilities are
badly designed APIs of SSL implementations and data-transport libraries, which present
developers with a confusing array of settings and options. They focused their work on
non-browser software, including diverse applications and libraries on Linux, Windows,
Android, and iOS.
Later in 2012, Fahl et al., in "Why Eve and Mallory Love Android: An Analysis
of Android SSL (In)Security"[10], performed a static code analysis on a total of 13,500
applications from the Android Market (now called the Google Play Store) using
mallodroid, a Python script developed as part of their work. This static code
analysis was designed to detect applications with an incorrect HTTPS configuration that
accepts either any certificate or any certificate signed by a trusted Certificate
Authority. They found that 1,074 of the 13,500 analysed applications are vulnerable to
known wireless attacks such as Man-In-The-Middle Attacks. They limited their study to
incorrect uses of SSL configuration in Android applications.
This thesis focuses on the wireless security of secure communications on the Android
platform. It differs from the previous related papers in that we do not study the
Android API and documentation, although we provide an overview of the concepts that are
critical to understanding the certificate validation process on the Android platform.
We do, however, repeat the static code analysis that was developed by Fahl et al. in 2012
and then performed by Christoffer Brodd-Reijer, a previous master thesis student at
Uppsala University, in 2014. The purpose of repeating the Fahl et al. static code analysis
is to evaluate the current wireless security of the Google Play Store applications.
Furthermore, we manually test the results and the reliability of the static code analysis
method in the context of banking applications. We also explore the possibility of
developing another method, based on dynamic code analysis, to detect broken certificate
validation in Android applications.

Chapter 3
Static code analysis on the Google
Play Store
In this thesis work, we examine a large sample of applications, all downloaded from the
Google Play Store. This analysis is performed using static code analysis.
The Google Play Store, originally called the Android Market, is Google's official
digital distribution platform. Among other things, it allows users to browse and download
applications published through Google. In 2015, it "has reached more than 1.43 million
applications published and over 50 billion downloads"[11].
The static code analysis consists in determining which classes and methods a given
application contains and what those methods return. This allows the detection of any
certificate validator inside the code of the application that would accept invalid
certificates.
First, we analyse 4,795 applications from all the categories, and then we manually test
the accuracy of the obtained results for the most used Scandinavian banking applications
that have been labelled as potentially vulnerable by the static code analysis. The manual
test consists in exposing the applications to MITMAs using pandora-box¹, a Python script
that has been created as part of this research to perform wireless attacks.
3.1 Analysis context
The exclusive nature of the Google Play Store as the Android digital distribution platform,
together with the huge number of downloaded applications, makes it a critical link in the
evaluation of the wireless security of Android applications. Therefore, it is important to
draw statistics on the Google Play Store.
The Google Play Store is structured in the 27 following categories:
• Book and References
• Business
• Comics
• Communication
• Education
• Entertainment
• Finance
• Games
• Health and Fitness
• Libraries and Demo
• Lifestyle
• Live Wallpaper
• Media and Video
• Medical
• Music and Audio
• News and Magazines
• Personalization
• Photography
• Productivity
• Shopping
• Social
• Sports
• Tools
• Transportation
• Travel and Local
• Weather
• Widget
¹ https://github.com/AresS31/pandora-box
Each category is then divided into the following three sub-categories, each split into
free and paid applications:
• Popular
– Free
– Paid
• New
– Free
– Paid
• Top grossing
– Free
– Paid
As material for the static code analysis, we select applications from different categories
and with different publication years, download counts, and user ratings. With these
criteria, we have all the data required to analyse how the wireless security of the
applications distributed through the Google Play Store evolves. Perhaps more importantly,
we also obtain an estimate of the number of users using insecure applications.
3.1.1 Challenges
We try to keep an even number of applications for each category, but some of the
categories contain very few applications, and the Google Play Store allows retrieving
"only" 500 applications under each of the six subcategories of all 27 categories. This
gives a theoretical maximum of 81,000 applications.
However, hermes, the Python script that we use to carry out the static code analysis,
is "obsolete" in the sense that it is not compatible with the latest version of the Google
protocols. We had to spend a considerable amount of time to make it operable.
Indeed, some of the requests that we send to the Google Play Store servers are rejected
because of a protection against crawlers that has been added to these servers. This
protection requires solving a captcha after too many requests have been sent. We work
around it by waiting X seconds between each request.
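The waiting strategy described above can be sketched as follows. This is only an
illustration under stated assumptions: fetch is a hypothetical callable standing in for a
request to the store, and the delay value is left to the caller because the thesis does
not state it. It is not the code used by hermes.

```python
import time


def fetch_with_delay(app_ids, fetch, delay_seconds, sleep=time.sleep):
    """Call fetch(app_id) for each id, pausing between consecutive requests.

    fetch is a hypothetical callable supplied by the caller; it is expected to
    return the downloaded data, or None when the server rejects the request.
    """
    results = {}
    for index, app_id in enumerate(app_ids):
        if index > 0:
            sleep(delay_seconds)  # wait before every request except the first
        results[app_id] = fetch(app_id)
    return results
```

Passing the sleep function as a parameter keeps the sketch easy to test without actually
waiting.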
Nevertheless, the fairness of the static code analysis is compromised by the unpredictable
behaviour of the Google Play servers, which do not always deliver the requested
applications. Still, we obtained a sample of 4,795 different applications (about seventeen
times smaller than the theoretical sample size) with at least 100 applications for each of
the 27 categories (this figure remains to be verified). This sample size is sufficient in
the context of our work, and it takes around 50 processing hours to analyse it.
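The sample size figures quoted in this section can be checked with a few lines of
arithmetic. The variable names below are ours; the numbers come from the text.

```python
APPS_PER_LISTING = 500   # applications retrievable per subcategory listing
SUBCATEGORIES = 6        # Popular, New, Top grossing, each Free and Paid
CATEGORIES = 27

theoretical_max = APPS_PER_LISTING * SUBCATEGORIES * CATEGORIES
collected = 4795
ratio = theoretical_max / collected

print(theoretical_max)   # 81000
print(round(ratio, 1))   # 16.9, i.e. about seventeen times smaller
```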
3.2 Static code analysis[12]
We carry out the Google Play Store static code analysis using the following existing tools:
• hermes²: A Python script that automates the download, analysis, and statistics
generation for Android applications.
• googleplay-api³: An unofficial Python API that lets you search, browse, and download
Android applications from the Google Play Store.
• mallodroid⁴: A Python script that searches for broken SSL certificate validation in
given applications.
• androguard⁵: A reverse engineering framework for Android applications.
² https://github.com/ephracis/hermes
³ https://github.com/egirault/googleplay-api
⁴ https://github.com/sfahl/mallodroid
We perform the static code analysis in three steps. First, we browse the Google Play
Store and list the applications to process. Second, we download and analyse the listed
applications one at a time, erasing each one afterwards. Finally, we convert the results
into graphs and tables.
The hermes script performs these three steps. It was developed by Christoffer
Brodd-Reijer, a previous master thesis student at Uppsala University. The hermes script
uses several third-party tools to perform these tasks. It uses the googleplay-api to
interact with the Google Play Store. For the static code analysis, it uses mallodroid,
created by Fahl et al.; mallodroid is a module of the androguard framework. Figure 3.1
shows how these components are connected.
Figure 3.1: Overview of the hermes script
Source:[12]
hermes fetches metadata for each application when browsing the Google Play Store.
This metadata includes the author of the application, the date of its last update, its
rating, the number of times it has been downloaded, whether or not it requires the
Internet permission, its price, and the category it belongs to.
Once the list of applications to analyse in each category is created, hermes starts the
static code analysis, but only on the free applications that require the Internet
permission. Indeed, we discard all the applications that do not require the Internet
permission because they do not connect to any network, so studying what type of SSL
certificate verifier they implement would be a waste of time and resources. We also
discard all the non-free applications because we would have to pay for them on the Google
Play Store, and since we are running the static code analysis on several thousand
applications, the cost would be too high. The remaining applications are then downloaded
and analysed one by one using the imported mallodroid module. Finally, the results are
saved as additional metadata for each processed application.
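The selection rule described above (free applications that require the Internet
permission) can be sketched as a simple filter. The AppMeta record and its field names are
hypothetical and only mirror the metadata listed in the text; they are not the data
structures used by hermes.

```python
from dataclasses import dataclass


@dataclass
class AppMeta:
    package: str
    price: float            # 0.0 means the application is free
    needs_internet: bool    # True if the Internet permission is requested


def select_for_analysis(apps):
    """Keep only free applications that request the Internet permission."""
    return [app for app in apps if app.price == 0.0 and app.needs_internet]
```

Applied to the sample of this thesis, such a filter is what reduces the downloaded
applications to the subset that is actually analysed.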
The mallodroid static code analysis consists in searching for certain pieces of code in
the application Java source code in order to detect insecure code. These insecure pieces
of code are the following:
• Custom TrustManager: A custom TrustManager is required for handling self-signed
certificates or situations where the signing CA is unknown to the platform. A
TrustManager is insecure if its checkServerTrusted method always returns normally
without ever throwing an exception. This is highly insecure, as it accepts any
certificate without proper validation.
• Custom HostnameVerifier: Like the TrustManager, a custom HostnameVerifier can
sometimes be necessary but must not itself be insecure. A HostnameVerifier whose
verify method always returns true is insecure. Another type of insecurity arises
when the application instantiates an AllowAllHostnameVerifier object.
• Insecure SSLCertificateSocketFactory: Applications are also scanned for code that
calls the static getInsecure method of the SSLCertificateSocketFactory class, as this
returns an SSLSocketFactory with all the verification checks disabled.
⁵ https://github.com/androguard/androguard
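To make the first check more concrete, the sketch below flags a checkServerTrusted method
whose body neither throws nor calls anything, working on disassembled Smali text. It is a
simplified text heuristic written for illustration; it is not how mallodroid is
implemented, and the function name is our own.

```python
import re

# Matches a whole checkServerTrusted method, from its .method line to .end method.
METHOD_RE = re.compile(
    r"^\.method[^\n]*\bcheckServerTrusted\(.*?^\.end method",
    re.DOTALL | re.MULTILINE,
)


def has_trivial_check_server_trusted(smali_text):
    """Return True if a checkServerTrusted body neither throws nor calls anything."""
    for match in METHOD_RE.finditer(smali_text):
        body_lines = match.group(0).splitlines()[1:-1]
        instructions = [
            line.strip() for line in body_lines
            if line.strip() and not line.strip().startswith((".", "#"))
        ]
        # A body without any throw or invoke instruction cannot reject a certificate.
        if not any(ins.startswith(("throw", "invoke-")) for ins in instructions):
            return True
    return False
```

A method that delegates to another verifier or throws a CertificateException contains an
invoke or throw instruction and is therefore not flagged by this heuristic.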
3.3 Outcomes
During the static code analysis, a total of 4,795 applications were found and downloaded
from the Google Play Store using the googleplay-api. Of these, 4,419 applications
(92.16%) required the Internet permission, which means that they are likely to establish
network communications with remote servers. Among these 4,419 applications, 3,754 were
available free of charge. As stated in Section 3.2, only the applications that are free of
charge and require the Internet permission are analysed. This resulted in 3,754
applications being analysed using the static code analysis method.
The static code analysis results that follow use some terms that first need to be
defined:
• Custom verifier: An application containing either a custom TrustManager or a
custom HostnameVerifier is classified as containing a custom verifier.
• Naïve verifier: A custom verifier is considered naïve if it contains an empty
checkServerTrusted or verify method.
• Bad verifier: A custom verifier is considered bad if it contains an
AllowAllHostnameVerifier or a call to SSLCertificateSocketFactory->getInsecure().
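The terminology above can be summarised as a small decision function. The boolean inputs
and the order in which the labels are tested ("bad" before "naive" before "custom") are
our own assumptions for illustration; the thesis does not specify how overlapping cases
are resolved.

```python
def classify_verifier(has_custom, has_empty_method, uses_allow_all, uses_get_insecure):
    """Map the findings for one application to a verifier label."""
    if uses_allow_all or uses_get_insecure:
        return "bad"
    if has_custom and has_empty_method:
        return "naive"
    if has_custom:
        return "custom"
    return "native"  # no custom verifier: the platform defaults are used
```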
A full breakdown, using the above terminology, of the applications that have undergone
the static code analysis is shown in Figure 3.2. This figure shows that 51.31% of the
applications contain either a native verifier or no verifier at all. Only 8.74% of the
applications contain a correct, secure custom verifier. The remaining 39.95% are
applications that make incorrect use of SSL and thus open the door to possible wireless
attacks. In 2014, Christoffer Brodd-Reijer[12] carried out a similar static code analysis
and observed that 29.06% of the applications in his sample implemented insecure code.
Keeping the differences in the analysis context in mind (refer to Section 3.1), we notice
an increase in insecure applications.
Figure 3.2: Types of verifiers observed (bar chart of the number of apps, from 0 to 4,000,
per verifier type: Native, Custom, Naive, Bad)

This increase in insecure applications over time is also confirmed by the results for the
applications containing bad verifiers grouped by publication year (see Figure 3.3). In
Figure 3.3, we can only compare the applications released or last updated in 2014 and
2015, since we do not have enough data for the other years (see Section 3.1). In 2015, we
notice an increase of 11.71% in the applications containing bad verifiers in comparison
with 2014. This number is similar to the increase of 10.89% in insecure applications in
the static code analysis performed by Christoffer Brodd-Reijer in February 2014[12].
Figure 3.3: Applications containing a bad verifier grouped by the year of publication or
last update (Unknown, 2008 to 2015), and percentage of such applications in that group
It is also interesting to notice that the most downloaded applications are, surprisingly,
not more secure than the less downloaded ones. It is even the opposite: there are 9.87%
more insecure applications among the applications that have been downloaded more than
100,000,000 times than among the applications that have only been downloaded
between 0 and 99 times (see Figure 3.4).

Figure 3.4: Applications containing a bad verifier grouped by download count (0-99,
100-9,999, 10,000-999,999, 1,000,000-99,999,999, 100,000,000+), and percentage of such
applications in that group
The case of application ratings is similar to that of downloads. An application with
more stars than another one is not necessarily more secure. This makes sense, since users
usually rate applications regardless of their security. In Figure 3.5, we observe that,
between the applications rated with zero or one star and the applications rated with four
or five stars, there is an increase of 19.57% in the applications containing a bad
verifier.
Figure 3.5: Applications containing a bad verifier grouped by rating (0-1, 1-2, 2-3, 3-4,
4-5, Unknown), and percentage of such applications in that group
Finally, these results show that almost one in two applications available on the
Google Play Store is likely to be insecure, and that neither the download numbers, nor the
publication year, nor the rating of an application can be used as a trust index to decide
whether or not it is safe to download it. Users who wish to use the applications
distributed on the Google Play Store need to be aware of their security flaws and should
only connect to trusted networks. However, all the results shown in this section and in
Section 5.1 must be taken with a grain of salt: it happens that applications containing a
bad or naïve verifier are in reality secure against wireless attacks (see Section 3.4).
More graphs and tables are available in Section 5.1.

3.4 Manual analysis of insecure banking applications
From the collected results (see Section 3.3), we focus on the insecure applications
classified under the Business category. Several banking applications are among them.
Therefore, we compile a list of a few of the biggest Scandinavian banking applications in
order to carry out real wireless attacks on each of them.
To perform the wireless attacks, we develop and use the pandora-box script, a Python
script created as part of this thesis work. The pandora-box script allows a user to do the
following:
• To create a rogue access-point.
• To perform an evil twin attack.
• To sniff all kinds of traffic, including SSL traffic.
• To de-authenticate users from their legitimate Access Point (AP).
• To monitor the rogue AP.
• To boost-up the network wireless interface power.
• To spoof the MAC addresses of the used network interface cards.
After testing our banking applications with the pandora-box script, we are surprised to
observe that all of the applications labelled as insecure by mallodroid are in fact secure
(see Figure 3.6). According to the static code analysis with mallodroid, we should
have been able to intercept the SSL traffic of the selected banking applications, which
was not the case. They do contain insecure SSL certificate verifiers, but it seems that
those insecure verifiers might never be used at runtime. Therefore, they can be considered
false positives, since the main purpose of mallodroid is to detect applications that are
vulnerable to wireless attacks by checking whether they contain incorrectly implemented
SSL certificate verifiers.
Figure 3.6: (In)Security cases
Under a MITMA, the banking applications do not connect through our AP and instead
display an error message saying, in essence, that an error occurred and that the network
settings should be checked. This message means that the verifiers implemented in those
applications do not validate our AP certificate during the SSL handshake. The reason in
this case is a hostname mismatch: the applications expect a certificate from XXX.XXX
(the banking server) and instead receive one from YYY.YYY (the address of the computer
running the pandora-box script).
Given this finding, it is interesting to understand why mallodroid labels these banking
applications as insecure. It turns out that mallodroid is not one hundred percent
reliable: it can generate false positives, since the static code analysis takes into
account neither whether a piece of code is ever used nor the conditions under which it is
executed (for instance, if the user is an administrator then run an insecure verifier,
else run a secure verifier).
From this point, finding a way to improve the accuracy of the analysis becomes relevant
in the context of our work, and we decide to perform a dynamic code analysis (see Chapter
4).

Chapter 4
Identifying the critical code
section in the certificate validation
process
As stated in Section 3.4, we cannot entirely rely on mallodroid, the script that detects
broken SSL certificate validation. This script basically checks whether insecure
TrustManager, HostnameVerifier, or SSLCertificateSocketFactory classes are present in the
code, without verifying whether this code is actually called or whether the certificate
validation is done by other means.
Therefore, we decided to develop another analysis technique that is more reliable
because it generates no false positives. This technique is based on a dynamic code
analysis that consists in generating log files of the applications at runtime. Those log
files are then manually analysed to detect any sign of broken SSL certificate validation.
To do this analysis, we first need to generate the log files of the execution flow of the
applications that we want to examine (we call those log files traces). We do this by
adding a stack trace to every method contained in those applications. We use reverse
engineering and code injection techniques to achieve this first step. Once the
applications are traced, we run them to generate two dynamic log files (with and without
undergoing a MITMA) containing their runtime method calls. Finally, we manually analyse
the generated traces, following a specific procedure that we developed in order to locate
the critical code sections involved in the SSL certificate validation for further
analysis.
4.1 Reverse engineering[13]
In this thesis work, we used reverse engineering techniques to study how SSL validation
works in the selected banking applications.
Reverse engineering is the ability to recover source code from an executable. This
technique is used, among other things, to examine how a program works or to evade
security mechanisms.
In other words, reverse engineering can be described as a method or process of modifying
a program in order to make it behave in the manner that the reverse engineer desires.
We used the following popular Android reverse engineering tools:
• For the decompilation process, we used d2j-dex2jar¹ to generate a .jar file from
the classes.dex file of an application. Then, we used JD-gui² to decompile the Java byte
code (contained in the .jar file) into human-readable Java source code.
¹ https://github.com/pxb1988/dex2jar
² http://jd.benow.ca/
Figure 4.1: Decompiling process of an Android app.
• For the disassembly process, we used baksmali³ to generate the .smali files of an
application from its classes.dex file. Baksmali is a word that comes from the Icelandic
language and literally means disassembler.
Figure 4.2: Disassembly process of an Android app.
• For the recompilation process, we used smali⁴ to assemble .smali files into a valid
application. Smali is also a word that comes from the Icelandic language and literally
means assembler.
d2j-dex2jar, JD-gui, smali and baksmali are all pre-installed on Santoku⁵, which is a
Linux distribution.
We used both methods, decompilation and disassembly, to obtain the disassembled
Smali source code and the decompiled Java source code. However, we mainly worked on
the disassembled Smali source code because, in spite of the difficulty of dealing with a
large number of files in assembly language, it is easy to reassemble .smali files into a
valid .apk. This is not the case for decompiled Java source code: at this time, there is
no known technique to recompile decompiled Java byte code back into an .apk.
Therefore, we only used the decompiled Java source code to get an overview and a better
understanding of the application implementations. Since Java is a high-level programming
language, .java files are easier to read and understand than .smali files, which are
written in Smali, a low-level assembly language.
Moreover, in addition to the impossibility of compiling the decompiled Java source code
back into an application, decompiled Java source code is not reliable. Decompiling an
application can be viewed as translating a Chinese text into English, then into Hebrew,
and finally back into Chinese. The resulting source code often makes no sense, with, for
example, methods that are never terminated; Java decompilers handle loops and multiple
conditions very poorly.
4.2 Generation of the applications traces
To generate the application traces, we used the Dalvik Debug Monitor Server (DDMS)⁶,
which is an Android debugging tool. We used its logcat function, which allows users to
print and log application debugging messages using the Java call Log.d(String tag, String
msg) or a similar method of the android.util.Log class.
³ https://github.com/JesusFreke/smali
⁴ https://github.com/JesusFreke/smali
⁵ https://santoku-linux.com/
⁶ http://developer.android.com/tools/debugging/ddms.html

We developed smali-code-injector⁷, a Python script that performs the code injection as
well as the entire reverse engineering chain of Android applications (see Figure 4.3):
• Disassembling of Android applications.
• Code injection (inside the .smali files).
• Re-assembling of the .smali files into Android applications.
• Signing of Android applications with jarsigner⁸.
Figure 4.3: Overview of the smali-code-injector script
The smali-code-injector script is capable of injecting any kind of code. The code
is injected directly into the .smali files of the applications, and the script deals with
register allocation and dependencies by parsing the assembly files and extracting the
relevant information, such as the number of registers, the register types, the method
names, and so on.
smali-code-injector can also be used for other purposes, such as hijacking genuine
Android applications in order to obtain unlimited access to the resources and data of the
smartphones on which the applications are installed. However, for this thesis work we
restricted the usage of smali-code-injector to its tracing feature (the printStackTrace
feature). We used this feature to inject a stack trace into every method contained in the
studied applications; commonly, several thousand methods are traced.
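To give an idea of what such an injection involves, the sketch below inserts a logging
call at the start of each method of a .smali file. It is a simplified illustration and not
the smali-code-injector implementation: it logs the method name with Log.d instead of
printing a stack trace, it only handles methods that declare .locals, and it skips methods
with more than 14 local registers because the injected invoke-static instruction can only
address registers v0 to v15. Shifting the parameter registers of large methods may also
require further fixes that are not handled here.

```python
import re

LOCALS_RE = re.compile(r"^(\s*)\.locals\s+(\d+)\s*$")
LOG_D = "Landroid/util/Log;->d(Ljava/lang/String;Ljava/lang/String;)I"


def inject_trace(smali_text, tag="TRACE"):
    """Insert a Log.d call at the start of every method that declares .locals."""
    output = []
    class_name = "?"
    method_name = None
    for line in smali_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(".class "):
            class_name = stripped.split()[-1]
        elif stripped.startswith(".method "):
            method_name = stripped.split()[-1]
        elif stripped == ".end method":
            method_name = None

        match = LOCALS_RE.match(line)
        if match is None or method_name is None:
            output.append(line)
            continue

        indent, count = match.group(1), int(match.group(2))
        if count > 14:
            # The two extra registers would fall outside v0..v15: leave untouched.
            output.append(line)
            continue

        tag_reg, msg_reg = f"v{count}", f"v{count + 1}"
        output.append(f"{indent}.locals {count + 2}")
        output.append(f'{indent}const-string {tag_reg}, "{tag}"')
        output.append(f'{indent}const-string {msg_reg}, "{class_name}->{method_name}"')
        output.append(f"{indent}invoke-static {{{tag_reg}, {msg_reg}}}, {LOG_D}")
    return "\n".join(output) + "\n"
```

The two new registers are appended after the existing locals, so the original instructions
of the method keep using the same local register numbers.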
The application traces consist of a sequence of entries, each representing a method
call (see Figure 4.4). The traces keep growing as long as the applications are running
and calling methods.
Figure 4.4: A method call entry
⁷ https://github.com/AresS31/smali-code-injector
⁸ jarsigner

Figure 4.5 shows an example of an application trace.
Figure 4.5: Application trace
For a human reader, traces like the one shown in Figure 4.5 may not be easy to read and
understand. The shortcomings of this kind of trace are that it neither highlights the
intra-method calls (refer to Figure 4.6 for the intra-method calls concept) nor announces
the end of a method.
Figure 4.6: Intra-method calls concept
Therefore, to address those problems, we created indentation-fixer⁹, a Python script
that formats the traces generated with the smali-code-injector printStackTrace feature
and the Android debugger into a friendlier format. This new formatting gives users a
better understanding of the application execution flow by using an indentation system as
well as a tag system (see Figure 4.7 for an example of a log file generated by
smali-code-injector and then processed with indentation-fixer).
Figure 4.7: Indented trace
⁹ https://github.com/AresS31/indentation-fixer
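The idea of an indentation and tag system can be sketched as follows. The input format (a
list of (depth, method) pairs, where depth is the call depth derived from the stack trace)
and the tag syntax are assumptions made for this illustration; they do not reproduce the
actual input or output of indentation-fixer.

```python
def indent_trace(entries, indent="  "):
    """Format (depth, method) pairs as an indented trace with end tags."""
    lines = []
    open_calls = []  # stack of (depth, method) entries that are not closed yet
    for depth, method in entries:
        # Close every call that cannot be an ancestor of the new entry.
        while open_calls and open_calls[-1][0] >= depth:
            closed_depth, closed_method = open_calls.pop()
            lines.append(f"{indent * closed_depth}</{closed_method}>")
        lines.append(f"{indent * depth}<{method}>")
        open_calls.append((depth, method))
    # Close whatever is still open at the end of the trace.
    while open_calls:
        closed_depth, closed_method = open_calls.pop()
        lines.append(f"{indent * closed_depth}</{closed_method}>")
    return lines
```

With such a layout, the calls made inside a method appear indented between its opening
and closing tags, which makes both the intra-method calls and the end of each method
visible.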

4.3 SSL certificate validation critical code section identifi-
cation
We tried to design a procedure for finding the method(s) handling the SSL certificate
validation in a given application. This procedure is based on the dynamic traces
that we generated in Section 4.2.
This procedure involves the generation of two different traces for the same application.
The first trace is generated with the application running in a normal/neutral
environment. The second trace is generated whilst performing a MITMA on the
application; the MITMA is performed with the pandora-box script (see Section 3.4).
The traces that are generated with smali-code-injector and the Android debugger are
usually very long: they typically contain hundreds of thousands of lines. Manually
analysing such long traces is a considerable amount of work. We addressed this issue by
adding a new feature to the smali-code-injector script. This filtering feature allows
users to inject their code only into methods containing specific keywords. Some
specificities of the Smali language made this feature very easy to implement. Indeed,
Smali, like most low-level assembly languages, represents the arguments and return value
of a method only by their types, and object types are represented by the path of the file
containing the type (Ljava/lang/Integer;, Ljava/lang/Boolean;, and so on).
Consequently, we pre-compiled a list of keywords that includes all the different
variable types that deal with networking and SSL.
With this additional step, we reduced the length of the traces from hundreds of
thousands of entries to "only" a few thousand entries without losing any data relevant to
our study of the SSL certificate validation.
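The filtering step can be illustrated by a check on the .method line of a Smali method,
since the argument and return types appear there as type paths. The keyword list below is
a hypothetical example; the thesis does not enumerate the keywords that were actually
used.

```python
# Hypothetical keyword list: type path prefixes related to networking and SSL.
NETWORK_TYPE_KEYWORDS = (
    "Ljavax/net/ssl/",
    "Ljava/security/cert/",
    "Ljava/net/",
)


def is_network_related(method_line, keywords=NETWORK_TYPE_KEYWORDS):
    """Return True if a Smali .method line mentions one of the type keywords."""
    return method_line.lstrip().startswith(".method") and any(
        keyword in method_line for keyword in keywords
    )
```

Only the methods for which this check succeeds would then receive the injected tracing
code.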
Then, we used the Linux diff command to generate a log file containing the differences
between the two traces. The resulting log file, which we call the diff-log, is almost
never empty because, even when an application is secure against wireless attacks, there
are always differences between two running instances of the application. For example,
a method may generate a different id for each session, return the current time, or
generate random numbers. Those differences are noise in the context of our study.
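The comparison of the two traces can also be reproduced in Python. The sketch below uses
the standard difflib module as a stand-in for the Linux diff command used in this thesis;
the function name and the sample method names in the usage note are illustrative.

```python
import difflib


def diff_traces(normal_trace, attack_trace):
    """Return the added (+) and removed (-) lines between two traces."""
    diff = difflib.unified_diff(
        normal_trace, attack_trace,
        fromfile="normal", tofile="mitma", lineterm="", n=0,
    )
    lines = list(diff)[2:]  # the first two lines are the ---/+++ file headers
    # Hunk markers start with "@@"; trace lines always start with "+" or "-".
    return [line for line in lines if not line.startswith("@@")]
```

For example, comparing the traces ["connect()", "verify()", "send()"] and ["connect()",
"verify()", "showError()"] yields ["-send()", "+showError()"]: the method that is only
called in the normal run, followed by the one that is only called under attack.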
Finally, we manually analyse the diff-log. The procedure is to start at the beginning of
the diff-log and to look at the first differences; applications containing methods with
explicit names are easier to handle in this analysis. For each difference that we judge
relevant, we note the line where the method appears in the corresponding trace, and we go
back to that original trace to examine the application execution flow before the noted
line and see what may have caused the difference. We also open the decompiled Java source
code of the application (or the Smali source, for people skilled in the Smali language)
and look at what the method in question does. We repeat this operation as many times as
necessary until we find the method responsible for the SSL certificate validation (see
Figure 4.8).

Figure 4.8: The manual analysis procedure
We automated most of this analysis process by using our own tool chain containing
smali-code-injector, pandora-box, indentation-fixer and the Linux diff command (see Figure
4.9). Only the manual analysis procedure remains to be automated (see Figure 4.8). In
Chapter 5, we discuss possible improvements to this method.
Figure 4.9: Dynamic code analysis tool chain

Chapter 5
Conclusion
We performed a static code analysis on 4,795 applications from the Google Play Store and
obtained some disturbing results regarding the wireless security of the applications
distributed through this platform. However, after manually analysing some Scandinavian
banking applications, we found that static code analysis is not a reliable method for
detecting insecure applications because it generates numerous false positives. From this
observation, we decided to develop a new analysis method that would not generate any
false positives.
The main reason for these false positives is that mallodroid, the script that searches
for broken certificate verifiers, takes as input the whole source code of an application,
including code that is never called at runtime (noise). Therefore, we developed a dynamic
analysis method. First, we traced all the methods contained in the studied banking
applications in order to generate log files of their execution flow, which give us all the
functions that are actually called and used by those applications. From this point, we
analysed the traces manually. Three log files were generated for each studied application:
one with the application undergoing a MITMA, another with the application running in a
normal environment, and a third containing the differences between the first two. Then,
following a strict procedure that we developed, which consists in tracing back all the
differences that seem relevant, we were able to locate the method(s) responsible for the
SSL certificate validation. Future work can improve this new method.
We could, for example, try to automate the manual analysis of the log files, which is
currently very laborious and time-consuming. We could also develop a hybrid method that
would give mallodroid as input only the code that is actually executed at runtime. Now
that our dynamic code analysis method can generate a list of all the methods called
during an application's runtime, this would not be very difficult to do. We could thereby
get rid of the noise, which would significantly improve the accuracy of mallodroid. We
can also think about linking all the results obtained regarding the wireless security of
Android applications to a database, so that users could check the wireless security of the
applications they wish to install (this would be very important for critical applications
such as banking applications).
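The core of the hybrid method is a pruning step that keeps only the methods seen in a
runtime trace before the static analysis is run. The Python sketch below shows this idea
under simplifying assumptions: the decompiled application is represented as a hypothetical
dictionary mapping method identifiers to their code, and each trace line is assumed to
contain exactly one such identifier. It does not use or reproduce the actual input format
of mallodroid.

```python
def executed_methods(trace_lines):
    """Collect the set of method identifiers that appear in a runtime trace."""
    return {line.strip() for line in trace_lines if line.strip()}


def prune_to_executed(all_methods, trace_lines):
    """Keep only the methods that were actually called at runtime.

    all_methods: dict mapping a method identifier to its code (hypothetical
    representation of the decompiled application).
    """
    called = executed_methods(trace_lines)
    return {name: code for name, code in all_methods.items() if name in called}
```

In practice the pruned result would have to be written back as Smali or Java files before
being handed to the static analyser, a step that is left out of this sketch.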
5.1 My feedback
This thesis was really interesting, and I do not regret choosing it over the other
proposal that I received at the time. The topic interested me a lot and, since I wish to
work in IT security, doing good work was my main motivation.
However, the thesis approach surprised me at first. Indeed, the fact that I had no
specific mission or problem statement was a little disturbing at first, since I was not
used to this. But after a while we were able to define precisely the direction to take
and what to do. Moreover, I think that six months was too short a period for one person
to complete such a big project. Nevertheless, I think that I laid the first brick of an
ambitious project, and I hope that someone will continue this work. All in all, this
thesis was a great insight into the world of research and development and doctoral
studies.
Through this thesis, I acquired and extended a lot of valuable knowledge, including but
not limited to Python, Shell, Smali, Java, debugging, the Android operating system, reverse
engineering, wireless security and Android security.
On a non-technical level, I learnt how to work autonomously, mainly because of the distance
between where I lived and the University and my lack of transportation. I also learnt how
to motivate myself and how to manage my time, although I had weekly meetings with my
supervisor. I also gained a lot on a personal level from living in another country and
meeting people from all around the world.


Appendix A
Appendix Graphs
[Figure: horizontal bar chart with one entry per Google Play category (the categories listed in Table B.1). Axes: Apps (0 to 400) and Percentage (0 to 100). Legend: Applications, Percentage.]
Figure A.1: Usage of the Internet permission

[Figure: horizontal bar chart with one entry per Google Play category. Axis: Apps (0 to 400). Legend: Native, Custom, Naive, Bad.]
Figure A.2: Distribution of verifier types over categories.

Appendix B
Appendix Tables
Category              Total  With Internet permission    Share
Travel and Local        151                       151  100.00%
Comics                   65                        65  100.00%
News and Magazines      267                       265   99.25%
Social                  302                       299   99.01%
Game                    400                       396   99.00%
Shopping                142                       138   97.18%
Widgets                 353                       343   97.17%
Lifestyle               136                       132   97.06%
Entertainment           336                       326   97.02%
Education               229                       221   96.51%
Sports                  141                       136   96.45%
Transportation           78                        75   96.15%
Books and Reference     263                       251   95.44%
Health and Fitness      109                       104   95.41%
Finance                 109                       104   95.41%
Business                258                       245   94.96%
Music and Audio         191                       181   94.76%
Media and Video         304                       287   94.41%
Weather                 195                       184   94.36%
Photography             220                       202   91.82%
Communication           278                       252   90.65%
Tools                   285                       258   90.53%
Medical                 165                       144   87.27%
Productivity            233                       200   85.84%
Live Wallpaper          268                       226   84.33%
Personalization         272                       216   79.41%
Libraries and Demo      123                        70   56.91%
Total                  4795                      4419   92.16%
Table B.1: Distribution of applications with internet permission over categories

Verifier type    Apps  Percentage
Native or none   1926      51.31%
Custom            328       8.74%
Naive             546      14.54%
Bad               954      25.41%
Table B.2: Distribution of applications over verifier types
Year     Bad verifier (apps)  Bad verifier (share of group)
Unknown                    0                          0.00%
2008                       0                          0.00%
2009                       0                          0.00%
2011                       0                          0.00%
2010                       0                          0.00%
2013                       7                          6.42%
2012                       3                         13.64%
2014                      93                         16.26%
2015                     851                         27.97%
Table B.3: Applications with bad verifier grouped by the year they were published or last
updated, and share of such applications in that group
Downloads             Bad verifier (apps)  Bad verifier (share of group)
0-99                                  169                         22.99%
100-9,999                             140                         22.22%
10,000-999,999                        281                         22.88%
1,000,000-99,999,999                  341                         31.26%
100,000,000+                           23                         32.86%
Table B.4: Applications with bad verifier grouped by their download number, and share
of such applications in that group
Rating   Bad verifier (apps)  Bad verifier (share of group)
0-1                       11                          7.33%
1-2                        1                          5.00%
2-3                       41                         21.24%
3-4                      332                         26.02%
4-5                      569                         26.90%
Unknown                    0                          0.00%
Table B.5: Applications with bad verifier grouped by their rating, and share of such
applications in that group
