Software Obfuscation on the Go:
A Large-Scale Empirical Study on
Mobile App Obfuscation
Pei Wang, Qinkun Bao, Li Wang, Shuai Wang, Zhaofeng Chen†,
Tao Wei†, and Dinghao Wu
The Pennsylvania State University
†Baidu X-Lab
Market of Mobile Apps Grows Fast
~29% annual growth 2013–2017
Yet Mobile Developers are Troubled
● Piracy
○ Piracy rates of popular apps can approach 95%
● Exploitable vulnerabilities
○ OAuth2 bugs affecting apps with millions of users
● Repackaging
○ Benign-looking malware
● Fraudulent campaigns
○ A million-dollar underground economy
Software Obfuscation Comes to the Rescue
Obfuscation: Program transformations that make
software difficult to understand and analyze
● Raise the bar of reverse engineering
● Buy time for more permanent security solutions
Many Obfuscation Techniques Proposed
The Problem
● Most obfuscation research focuses on the desktop
environment
● Much less is known about obfuscation for mobile
A Large-Scale Empirical Study on Obfuscated
Mobile Apps
● Investigate software obfuscation in real-world mobile
app development
○ (RQ1) Characteristics of obfuscated apps
○ (RQ2) Popular obfuscation patterns
○ (RQ3) Possible impacts from app reviews
○ (RQ4) Effectiveness of obfuscation
● Focus on iOS
○ Important platform deserving more attention from academia
Methodology
1. Problem Scoping
2. Sample Collection
3. Sample Inspection
Step #1 — Problem Scoping
● Too many obfuscation algorithms, cannot afford to
consider them all
● Which ones are most widely applied?
Ask Google
● Search for commercial and open-source obfuscation tools
● Pick the top 4 most popular obfuscation methods for iOS
apps
○ Symbol renaming
○ String literal encryption
○ Decompilation disruption
○ Control flow flattening
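To make two of the listed methods concrete, here is a minimal sketch in Python. It is an illustration only and not taken from the study: real iOS obfuscators work on compiled code, and the XOR key, function names, and the GCD example are all assumptions chosen for brevity. The first part hides a string literal behind a trivial XOR cipher that is undone at run time; the second rewrites a loop as a state-machine dispatcher, which is the basic idea behind control flow flattening.

```python
# Hypothetical illustration; names and the XOR key are assumptions.

KEY = 0x5A  # toy single-byte key, far weaker than anything used in practice


def encrypt_literal(text: str) -> bytes:
    """Done at build time: store the literal as XOR-scrambled bytes."""
    return bytes(b ^ KEY for b in text.encode("utf-8"))


def decrypt_literal(blob: bytes) -> str:
    """Done at run time: recover the literal just before it is used."""
    return bytes(b ^ KEY for b in blob).decode("utf-8")


def gcd_plain(a: int, b: int) -> int:
    """Original code: the loop structure is obvious to a reader."""
    while b != 0:
        a, b = b, a % b
    return a


def gcd_flattened(a: int, b: int) -> int:
    """Same logic, flattened: every basic block hangs off one dispatcher."""
    state = 0
    while True:
        if state == 0:      # loop condition
            state = 1 if b != 0 else 2
        elif state == 1:    # loop body
            a, b = b, a % b
            state = 0
        elif state == 2:    # exit block
            return a
```

Both transformations preserve behavior while hiding intent: the literal no longer appears verbatim in the stored bytes, and the loop's shape is replaced by jumps through a `state` variable.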
Step #2 — Sample Collection
● Start with the entire App Store
○ 1,145,582 app instances crawled from Feb. to Oct. 2016
○ Include different versions of the same app
● Identify obfuscated apps
○ Technically difficult to automatically detect all 4 obfuscations in
a million apps
Assumption
Developers who picked heavyweight methods would very
likely also pick lightweight methods
● Set symbol renaming as the baseline
● Detect apps obfuscated with symbol renaming first
● Manually identify other obfuscations in the narrowed set
of candidates
Detecting Scrambled Symbol Names
Insight: Human-written source code is “natural” in the sense
that it can be described by statistical language models such as
n-grams
(Abram Hindle, Earl T. Barr, Zhendong Su, Mark Gabel, and Premkumar Devanbu.
On the Naturalness of Software. ICSE '12.)
Detecting Scrambled Symbol Names
● Segment symbol names into a sequence of words
● Measure how "surprising" the sequence is under an
n-gram language model
○ "Unnatural" names are considered to be obfuscated
● Use cross-entropy for measurement
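The detection idea above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the study's detector: the bigram order, add-one smoothing, the camelCase segmentation rule, and the tiny training list are all choices made here for illustration. The per-word cross-entropy is computed as H = -(1/N) Σ log2 p(w_i | w_{i-1}), so names built from unfamiliar word sequences score higher.

```python
import math
import re
from collections import Counter


def segment(name: str) -> list[str]:
    """Split an identifier such as 'loadUserProfile' into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


class BigramModel:
    """Bigram language model over name words with add-one smoothing."""

    def __init__(self, names):
        self.bigrams = Counter()
        self.contexts = Counter()
        vocab = set()
        for name in names:
            words = ["<s>"] + segment(name)
            vocab.update(words[1:])
            for prev, word in zip(words, words[1:]):
                self.bigrams[(prev, word)] += 1
                self.contexts[prev] += 1
        # +1 reserves probability mass for a single "unknown word" slot.
        self.vocab_size = len(vocab) + 1

    def prob(self, prev: str, word: str) -> float:
        return (self.bigrams[(prev, word)] + 1) / (
            self.contexts[prev] + self.vocab_size
        )

    def cross_entropy(self, name: str) -> float:
        """Average bits per word; higher means more 'surprising'."""
        words = ["<s>"] + segment(name)
        pairs = list(zip(words, words[1:]))
        if not pairs:  # nothing to segment: treat as a single unknown word
            return math.log2(self.vocab_size)
        return -sum(math.log2(self.prob(p, w)) for p, w in pairs) / len(pairs)


# Hypothetical training names standing in for a corpus of unobfuscated code.
model = BigramModel([
    "loadUserProfile", "saveUserProfile", "loadImageData",
    "updateUserSettings", "viewDidLoad", "setImageData",
])

# Rank candidate names from most to least surprising.
ranked = sorted(["loadUserImage", "xqZkVbQw"],
                key=model.cross_entropy, reverse=True)
```

With a realistic corpus, the same ranking step would be applied per app (for example, by averaging scores over all of an app's symbol names) before manual verification of the top candidates.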
Sample Collection Summary
● Rank apps by likelihood of obfuscation
● Manually verify top 6,600 samples
● Identified 601 true positives, grouped into 539 apps
Step #3 — Sample Inspection
● Manually confirm if the sampled apps are protected by
the other 3 methods
○ Over 600 man-hours to analyze 601 app instances
● Pay special attention to obfuscated third-party libraries
○ Apps may become passively obfuscated by linking against
obfuscated libraries
Results
(RQ1) Characteristics of obfuscated apps
(RQ2) Popular obfuscation patterns
(RQ3) Possible impacts from app reviews
(RQ4) Effectiveness of obfuscation
Impact of Third-Party Libraries
● 75.1% of analyzed apps contain
obfuscated third-party libraries
● Only 24.9% of the studied apps are
actively obfuscated by developers
● The other 63.8% are passively obfuscated
due to the inclusion of obfuscated
libraries
● Advertising libraries are the most popular,
included by 48.1% of the studied apps
● Payment and banking libraries are the
second most popular, included by 18.7%
of the apps
App Distribution by Categories
● Apps related to finance,
medical, copyrighted digital
content, and personal activities
are more likely to be obfuscated
● Games are more likely to be
passively obfuscated
● There is a similar distribution
for obfuscated libraries of each
category
Results
(RQ1) Characteristics of obfuscated apps
(RQ2) Popular obfuscation patterns
(RQ3) Possible impacts from app reviews
(RQ4) Effectiveness of obfuscation
Obfuscation Patterns
Finding 1: The majority of apps apply obfuscation at
module-level, suggesting a wide adoption of automated
tools
Obfuscation Patterns
Finding 2: More costly obfuscation algorithms are less popular
Obfuscation Patterns
Finding 3: Apps of certain categories tend to be more heavily obfuscated than others
Obfuscation Patterns
Finding 4: 27 of the 195 actively obfuscated apps were
unobfuscated at the beginning of the crawling period,
suggesting that a growing number of mobile developers are
becoming interested in software protection
Results
(RQ1) Characteristics of obfuscated apps
(RQ2) Popular obfuscation patterns
(RQ3) Possible impacts from app reviews
(RQ4) Effectiveness of obfuscation
Case Studies
Found two cases that suggest app review can either
stimulate or restrain the adoption of obfuscation
Case #1
● A third-party library tried to use private APIs and
circumvent app review through obfuscation
● Caught by Apple in 2015; all apps that included the library
were reportedly removed
○ The library developers then found new obfuscation methods to
evade detection
Case #2
● A reputable security service provider submitted a simple
poker game app that is extremely heavily obfuscated
● Obfuscator developers may want to test what
obfuscation algorithms are acceptable to Apple
Results
(RQ1) Characteristics of obfuscated apps
(RQ2) Popular obfuscation patterns
(RQ3) Possible impacts from app reviews
(RQ4) Effectiveness of obfuscation
Preliminary Penetration Tests
● Test the resilience of the obfuscated samples by simple
reverse engineering
○ Dump symbol names and string literals
○ Manually identify interesting keywords like "secret" and
"private key"
● 33 of 195 actively obfuscated apps still leak certain
sensitive information
○ Obfuscations failed to be as effective as they could have been
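The "simple reverse engineering" step can be approximated with a short script. The sketch below is an assumption-laden stand-in for the manual process described on the slide: it pulls printable ASCII runs out of raw binary bytes (much like the Unix `strings` tool) and flags those containing the two keywords the slide mentions. The function names, the minimum string length, and the sample bytes are hypothetical.

```python
import re

# Keywords taken from the slide; a real audit would use a longer list.
KEYWORDS = ("secret", "private key")


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII at least min_len characters long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def leaked_keywords(data: bytes, keywords=KEYWORDS) -> list[str]:
    """Return extracted strings that contain any keyword, case-insensitively."""
    hits = []
    for text in printable_strings(data):
        lowered = text.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append(text)
    return hits


# Hypothetical bytes standing in for a slice of an app binary: one scrambled
# class symbol and one string literal that was left unencrypted.
sample = b"\x00\x01_OBJC_CLASS_$_a1b2\x00kClientSecretToken\x00\xff\xfeab\x00"
suspicious = leaked_keywords(sample)
```

A hit does not prove a leak on its own, which is why the study pairs the dump with manual inspection; the script only narrows down where to look.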
Conclusion
● First large-scale empirical study of mobile obfuscation
● Filled a gap in academic studies of the iOS App Store
● Better understanding of how obfuscation is used in
real-world mobile development
● The practice of obfuscating mobile apps requires
improvement

Editor's Notes

  1. RQ = research question (a term commonly used by the software engineering community)