App behavioral analysis using system calls
Prajit Kumar Das
Advisor: Anupam Joshi, PhD
Co-Advisor: Tim Finin, PhD
Problem statement
We present Heimdall, a framework for studying system calls made by mobile apps in order to determine an app’s behavior class and match that behavior to the app’s stated purpose.
UMBC Ebiquity Research Group | MobiSec 2017 | Tuesday, May 2, 2017
Image courtesy: Android App Market
Motivation: App issues
Motivation: Permission inadequacy
Pre-Marshmallow Marshmallow
Do you read privacy policies and do you understand them?
Source: lifehacker
Motivation: Software Limitation
Related Work
• System calls used for software analysis Kosoresow’97
• Three areas of research for mobile app analysis
  • Malware classification Zhou’12
  • NLP techniques Pandita’13, Gorla’14
  • Taint tracking Enck’10
• Google PHA taxonomy Google Android Security’16
• PrivacyGrade: Grading The Privacy Of Smartphone Apps
• Expectation and purpose: understanding users’ mental models of mobile app privacy through crowdsourcing. Lin’12
• Modeling users’ mobile app privacy preferences, Lin’14
Architecture
Download module
Annotations module
System call capture module
Mobile device emulator
Feature generation module
Classification module
Classifier output
Dataset distribution
10 annotated categories, 20 Google Play categories – 75% tool and productivity
Experimental setup
• 1560 apps
• 534 successfully executed
• strace can only be used on emulator
• UI/Application exerciser tool used
• Android 6.0.1 December 2015 build used
• SVM, MLP, J48 and NB used
• Best F1-score from MLP at 0.44
• 1-hot vectors – Call present or absent
• TF-IDF weight vectors – Uniqueness and significance of system calls for app
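The two feature encodings listed above can be sketched roughly as follows, assuming one strace-style text log per app. The regular expression, function names and sample trace lines are hypothetical, and the TF-IDF variant used here (term frequency normalized by trace length, natural-log inverse document frequency) is an assumption, since the slides do not say which variant Heimdall uses.

```python
import math
import re
from collections import Counter

# Assumed line shape: an optional PID prefix, then "name(args...) = result".
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?([a-z_][a-z0-9_]*)\(")


def syscall_counts(trace_lines):
    """Count how often each system call name appears in one app's trace."""
    counts = Counter()
    for line in trace_lines:
        match = SYSCALL_RE.match(line)
        if match:  # signal and exit lines do not match and are skipped
            counts[match.group(1)] += 1
    return counts


def one_hot(counts, vocabulary):
    """1 if the call is present in the trace, 0 if absent."""
    return [1 if call in counts else 0 for call in vocabulary]


def tf_idf(all_counts, vocabulary):
    """One TF-IDF vector per app; calls made by every app get weight 0."""
    n_apps = len(all_counts)
    doc_freq = {
        call: sum(1 for counts in all_counts if call in counts)
        for call in vocabulary
    }
    vectors = []
    for counts in all_counts:
        total = sum(counts.values()) or 1
        vectors.append([
            (counts[call] / total) * math.log(n_apps / doc_freq[call])
            if doc_freq[call] else 0.0
            for call in vocabulary
        ])
    return vectors


# Hypothetical traces for two apps.
calculator = syscall_counts([
    'openat(AT_FDCWD, "/data/calc.db", O_RDONLY) = 3',
    'read(3, "", 4096) = 0',
    'read(3, "", 4096) = 0',
    "close(3) = 0",
])
todo_list = syscall_counts([
    'openat(AT_FDCWD, "/data/todo.db", O_RDWR) = 4',
    'pwrite64(4, "milk", 4, 0) = 4',
    "close(4) = 0",
    "+++ exited with 0 +++",
])
vocabulary = sorted(set(calculator) | set(todo_list))
print(vocabulary)
print(one_hot(calculator, vocabulary), one_hot(todo_list, vocabulary))
print(tf_idf([calculator, todo_list], vocabulary))
```

With this weighting, calls that every app makes (here openat and close) contribute nothing, while calls that only some apps make carry the signal.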
Similar TF-IDF scores - Scientific calculator vs To-do list
Results: Google categories
• Google’s categories are developer provided
• Could be misleading
• System calls could classify behavior better
• Additional features required
Results: Annotated classes
• System calls not totally useless
• 1-hot vs TF-IDF
Classifier   TF-IDF   1-hot
MLP          0.44     0.26
SVM-Poly     0.31     0.23
SVM-RBF      0.32     0.21
J48          0.27     0.31
NB           0.27     0.27
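The table reports F1-scores. As a reminder of what that metric measures in a multi-class setting, here is a small sketch that computes per-class F1 and their unweighted mean. The slides do not say how scores were averaged across the 10 classes, so macro-averaging is an assumption, and the labels and predictions are made up for illustration.

```python
from collections import Counter


def f1_per_class(y_true, y_pred):
    """Per-class F1 = 2 * precision * recall / (precision + recall)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1  # predicted this class, but it was wrong
            fn[truth] += 1  # missed an instance of the true class
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical labels for four apps.
y_true = ["alarm_clock", "alarm_clock", "to_do_list", "to_do_list"]
y_pred = ["alarm_clock", "to_do_list", "to_do_list", "to_do_list"]
print(f1_per_class(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

Macro-averaging treats every class equally, which matters here because the class sizes in the dataset are very uneven.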
Challenges faced
• Annotating app behavior class
• Emulator instabilities
• App limitations/bugs
• User credentials required
• Multi-behavior apps
Conclusions
Can system calls be used to distinguish between how an app “behaves” and its perceived/stated purpose?
• System calls – insufficient as features
• Emulator – better ones required
• Coarser behavior classes required
Future work
• System call bi-grams, tri-grams – for capturing call sequences
• Better emulator – Genymotion
• Malware classification – less scope (state-of-the-art is 96.7%)
• Generating contextual policies from behavior estimation
• Features planned for future:
  • App descriptions
  • App ratings
• PHA app analysis – require samples for Mobile Unwanted Software
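The bi-gram and tri-gram idea from the first bullet can be sketched as below: instead of counting individual calls, count short runs of consecutive calls so that ordering is captured. The function names and the two call sequences are hypothetical, not taken from Heimdall.

```python
from collections import Counter


def ngrams(calls, n):
    """All runs of n consecutive system calls, in order of appearance."""
    return [tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)]


def ngram_counts(calls, n):
    """Frequency of each n-gram in one app's call sequence."""
    return Counter(ngrams(calls, n))


# Two hypothetical apps that make the same calls in a different order.
app_a = ["openat", "read", "read", "close"]
app_b = ["openat", "read", "close", "read"]

print(set(app_a) == set(app_b))   # same call set, so same 1-hot vector
print(ngram_counts(app_a, 2))
print(ngram_counts(app_b, 2))
print(ngrams(app_a, 3))
```

The two sequences are indistinguishable to presence/absence features but yield different bi-gram counts, which is the extra information sequence features are meant to add.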
Questions?

Editor's Notes

  • #3 RQ1: Can system calls be used to distinguish between how an app “behaves” and its perceived/stated purpose?
  • #4 Why should we care? Apps collect user data: emails, messages, documents and sensor data, all of it highly personal. Can’t app permissions handle privacy and security of data? App permissions are “take it or leave it”. A user may be okay with sharing location in a public place but not in a private place, and there is no way to control that. Use a privacy and security module to implement context-dependent rules.
  • #9
    play cat                freq
    alarm_clock             128
    battery_saver           93
    drink_recipes           15
    file_explorer           72
    lunar_calendar          12
    pdf_reader              22
    scientific_calculator   61
    to_do_list              102
    video_playback          5
    wifi_analyzer           24
  • #11 We had 61 scientific calculators and 102 to-do list apps among the 534 executed apps, a combined 163 apps or about 31%. The two categories make very similar system calls even though they are different app categories:
    fstatat - get file status relative to a directory file descriptor
    ftruncate - truncate a file to a specified length
    pwrite - write to a file descriptor at a given offset
    sigreturn - return from signal handler and cleanup stack frame
    sendmsg - send a message on a socket
    sigaction - examine and change a signal action
  • #13 System calls aren’t useless: random choice among 10 classes gives a probability of 0.1, while with TF-IDF features we get 0.44 at best and 0.27 at worst. Most classifiers perform better with TF-IDF features than with 1-hot features (J48 is the exception). This is understandable because, just as in document classification, the presence of a system call is significant, but the presence of a unique system call is a better feature than the presence of a particular system call or a group of system calls.