Mining apps for anomalies
Perspectives on Data Science for Software Engineering
Agenda
• Specifications
• APP MINING
• DETECTING ABNORMAL BEHAVIOR
• CHABADA
• TREASURE OF DATA
• OBSTACLES
Specifications
• Does the program do what it is
supposed to do?
• Will it continue to do so in the future?
• How to define what it’s supposed to do?
Formal Methods
Flappy Bird
• Your aim is to move a little bird up and down
such that it does not hit an obstacle.
• As a developer you list undesired properties (no
crash, no spying).
• How to specify gameplay to a computer?
• Can we teach a computer how to check a
program against expectations?
• Learn what program behavior is normal in a
given context?
APP MINING
App mining leverages common knowledge in thousands of
apps to automatically learn what is “normal” behavior—
and in contrast, automatically identify “abnormal” behavior.
APP MINING
• Leverage the knowledge encoded into the hundreds
of thousands of apps available in app stores
• Determine what would be normal behavior, to
detect what would be abnormal behavior
• Guide programmers and users toward better security
and usability
Apps in app stores have three features
1. Apps come with all sorts of metadata, such as names, categories,
and user interfaces. All of these can be associated with program
features, so you can, for instance, associate program behavior with
descriptions.
2. Apps are pretty much uniform. They use the same libraries, which in turn
use fairly recent designs. All this makes apps easy to analyze,
execute, and test, and consequently easy to compare.
3. Apps are redundant. There are plenty of apps that all address
similar problems. This is in sharp contrast to open source programs.
This redundancy in apps allows us to learn common patterns of how
problems are addressed and, in return, detect anomalies.
DETECTING ABNORMAL BEHAVIOR
The problem with “normal” behavior is that it varies according to the
app’s purpose:
• If an app sends out text messages, that would normally be a sign of
malicious behavior—unless it is a messaging application, where
sending text messages is one of the advertised features.
• If an app continuously monitors your position, this might be
malicious behavior—unless it is a tracking app that again advertises
this as a feature.
• Simply checking for a set of predefined “undesired” features is not
enough—if the features are clearly advertised, then it is reasonable
to assume the user tolerates, or even wants these features, because
otherwise, she would not have chosen the app.
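The point of the last bullet can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the behavior labels (`send_sms`, `track_location`, and so on) and the function name are hypothetical, and a real analysis would have to derive both sets from the app's code and its store description.

```python
# Hypothetical behavior labels, chosen for illustration only.
SENSITIVE = {"send_sms", "track_location", "read_contacts"}

def unadvertised_behaviors(observed, advertised):
    """Return the sensitive behaviors an app shows but does not advertise."""
    return sorted((set(observed) & SENSITIVE) - set(advertised))

# A messaging app that advertises SMS: nothing to flag.
messenger = unadvertised_behaviors(
    observed={"send_sms", "read_contacts"},
    advertised={"send_sms", "read_contacts"},
)
# A flashlight app that silently sends SMS: flagged.
flashlight = unadvertised_behaviors(
    observed={"toggle_led", "send_sms"},
    advertised={"toggle_led"},
)
print(messenger)   # []
print(flashlight)  # ['send_sms']
```

The same behavior (`send_sms`) is accepted for one app and flagged for the other, which is why a fixed blacklist alone cannot decide.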
Introducing CHABADA
• To determine what is normal, we thus must assess program behavior together with its description. If the
behavior is advertised then it’s fine; if not, it may come as a surprise to the user, and thus should be flagged.
• This is the idea we followed in our first app mining work, the CHABADA tool.
• CHABADA is a general tool to detect mismatches between the behavior of an app and its description.
• Applied to a set of 22,500 apps, CHABADA can detect 74% of novel malware, with a false positive rate
below 10%.
• Our recent MUDFLOW prototype, which learns normal data flows from apps, can even detect more than
90% of novel malware leaking sensitive data.
“Checking App Behavior Against Descriptions of Apps”
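For readers unfamiliar with the two figures quoted above, the following sketch shows how a detection rate and a false positive rate are usually computed from classification counts. The function names are ours, and the counts are made up for illustration; they are not CHABADA's evaluation data.

```python
def detection_rate(true_positives, false_negatives):
    """Fraction of malicious apps that were flagged (also called recall)."""
    return true_positives / (true_positives + false_negatives)

def false_positive_rate(false_positives, true_negatives):
    """Fraction of benign apps that were wrongly flagged."""
    return false_positives / (false_positives + true_negatives)

# Made-up counts for illustration only.
print(detection_rate(true_positives=8, false_negatives=2))       # 0.8
print(false_positive_rate(false_positives=1, true_negatives=19)) # 0.05
```

The two rates use different denominators: the first is measured over malicious apps only, the second over benign apps only.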
CHABADA
• CHABADA starts with a (large) set of apps to be analyzed.
• It first applies tried-and-proven natural language
processing techniques (stemming and topic analysis with
LDA, Latent Dirichlet Allocation) to abstract the app descriptions
into topics.
• It builds clusters of those apps whose topics have the
most in common. Thus, all apps whose descriptions refer
to messaging end up in a “Messaging” cluster.
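The description-clustering step can be illustrated with a deliberately simplified sketch. It does not implement LDA: as a stand-in, it reduces each description to a set of crudely stemmed terms and groups apps greedily by Jaccard similarity. The app names, descriptions, stopword list, and the `threshold` value are all assumptions made for the example.

```python
import re

STOPWORDS = {"a", "and", "the", "to", "with", "your", "for", "of", "in"}

def stem(word):
    """Very crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def terms(description):
    """Lowercase, tokenize, drop stopwords, and stem a description."""
    words = re.findall(r"[a-z]+", description.lower())
    return {stem(w) for w in words if w not in STOPWORDS}

def jaccard(a, b):
    """Share of terms two descriptions have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(descriptions, threshold=0.2):
    """Greedy clustering: join the first cluster whose seed app is similar enough."""
    clusters = []  # list of (seed_terms, [app names])
    for name, text in descriptions.items():
        app_terms = terms(text)
        for seed, members in clusters:
            if jaccard(seed, app_terms) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((app_terms, [name]))
    return [members for _, members in clusters]

# Hypothetical apps and descriptions.
apps = {
    "ChatNow": "Send messages and photos to your friends",
    "QuickText": "Messaging app for sending text messages to friends",
    "TrailMap": "Track your hiking route with GPS maps",
}
print(cluster(apps))  # [['ChatNow', 'QuickText'], ['TrailMap']]
```

The two messaging apps share stemmed terms such as "messag", "send", and "friend", so they land in one cluster, while the hiking app starts its own. A topic model would generalize better than exact term overlap, which is presumably why the tool relies on one.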
CHABADA
• Within each cluster, CHABADA will now search for outliers
regarding app behavior.
• To capture behavior, CHABADA simply uses the set of API calls contained in each app; these
are easy to extract using simple static analysis tools.
• CHABADA uses tried-and-proven outlier analysis techniques,
which provide a ranking of the apps in a cluster, depending
on how far away their API usage is from the norm. Those
apps that are ranked highest are the most likely outliers.
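The ranking idea can be sketched as follows. The slides do not name the outlier technique, so this example substitutes a simple one: each app is scored by its mean Jaccard distance to the other apps in its cluster, and the apps are sorted by that score. The cluster, the app names, and the API labels are hypothetical.

```python
def jaccard_distance(a, b):
    """1 minus the share of API calls two apps have in common."""
    union = a | b
    return 1 - len(a & b) / len(union) if union else 0.0

def rank_outliers(api_sets):
    """Rank apps by mean Jaccard distance to the other apps, most unusual first."""
    scores = {}
    for name, apis in api_sets.items():
        others = [s for other, s in api_sets.items() if other != name]
        if not others:  # a single app has nothing to be compared with
            scores[name] = 0.0
            continue
        scores[name] = sum(jaccard_distance(apis, o) for o in others) / len(others)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical "Weather" cluster with simplified API labels.
weather_cluster = {
    "SunnyDay": {"internet", "location"},
    "RainRadar": {"internet", "location"},
    "StormWatch": {"internet", "location", "notifications"},
    "WeatherPlus": {"internet", "location", "send_sms", "read_contacts"},
}
for name, score in rank_outliers(weather_cluster):
    print(f"{name}: {score:.2f}")
```

Here the hypothetical WeatherPlus app ranks first because sending SMS and reading contacts are unusual for its cluster; in a "Messaging" cluster the same API calls would not stand out.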
A TREASURE OF DATA …
A number of ideas that app stores all make possible:
1. Future techniques will tie program analysis to user interface analysis.
2. Mining user interaction may reveal behavior patterns we could reuse in various contexts.
3. Violating behavior patterns may also imply usability issues. If a button named “Login” does nothing, for
instance, it would be very different from the “Login” buttons used in other apps, and would hopefully be
flagged as an anomaly.
4. Given good test generators, one can systematically explore the dynamic behavior and gain information on
concrete text and resources accessed.
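The “Login” button example from the list above could be checked along the following lines. This is a sketch under assumptions: the observations are hypothetical triples of app, button label, and observed action, and "differs from the majority action for that label" is our stand-in for whatever anomaly measure a real tool would use.

```python
from collections import Counter

def unusual_buttons(observations):
    """Flag (app, label, action) triples whose action differs from the majority for that label."""
    by_label = {}
    for app, label, action in observations:
        by_label.setdefault(label, []).append((app, action))
    flagged = []
    for label, entries in by_label.items():
        majority, _ = Counter(action for _, action in entries).most_common(1)[0]
        flagged += [(app, label, action) for app, action in entries if action != majority]
    return flagged

# Hypothetical UI observations gathered across several apps.
observations = [
    ("AppA", "Login", "open_session"),
    ("AppB", "Login", "open_session"),
    ("AppC", "Login", "no_effect"),
    ("AppD", "Login", "open_session"),
]
print(unusual_buttons(observations))  # [('AppC', 'Login', 'no_effect')]
```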
OBSTACLES
1. Getting apps is not hard, but not easy either. Besides the official stores, there is no publicly available repository
of apps where you could simply download thousands of apps, because that would violate copyright.
2. For apps, there’s no easily accessible source code, version, or bug information. If you monitor a store for a
sufficient time, you may be able to access and compare releases, but that’s it. Vendors are not going to help you, and
open source is limited. Fortunately, app bytecode is not too hard to get through.
3. Metadata is only a very weak indicator of program quality. Lots of one-star reviews may refer to a recent price
increase or political reasons; but reviews talking about crashes or malicious behavior might give clear signs.
4. Never underestimate developers. Vendors typically have a pretty clear picture of what their users do. If you think
you can mine metadata to predict release dates, reviews, or sentiments, talk to vendors first and check your
proposal against the realities of app development.
Any Questions?
Thank You.