Software Analytics: Data Analytics
for Software Engineering and Security
Tao Xie
Department of Computer Science
University of Illinois at Urbana-Champaign, USA
taoxie@illinois.edu
In Collaboration with Microsoft Research and NC State University
New Era…Software itself is changing...
Software Services
How people use software is changing…
• From: Individual; Isolated; Not much data/content generation
• To: Social; Collaborative; Huge amount of data/artifacts generated anywhere anytime
How software is built & operated is changing…
• From: Code centric; Long product cycle; Experience & gut-feeling; In-lab testing; Centralized development
• To: Data pervasive; Continuous release; Informed decision making; Debugging in the large; Distributed development
• … …
Software Analytics
Software analytics is to enable software
practitioners to perform data exploration and
analysis in order to obtain insightful and
actionable information for data-driven tasks
around software and services.
Dongmei Zhang, Yingnong Dang, Jian-Guang Lou, Shi Han, Haidong Zhang, and Tao Xie. Software
Analytics as a Learning Case in Practice: Approaches and Experiences. In MALETS 2011
http://research.microsoft.com/en-us/groups/sa/malets11-analytics.pdf
http://research.microsoft.com/en-us/groups/sa/
http://research.microsoft.com/en-us/news/features/softwareanalytics-052013.aspx
Data sources
Runtime traces
Program logs
System events
Perf counters
…
Usage log
User surveys
Online forum posts
Blog & Twitter
…
Source code
Bug history
Check-in history
Test cases
Eye tracking
MRI/EMG
…
Target audience – software practitioners
Developer
Tester
Program Manager
Usability engineer
Designer
Support engineer
Management personnel
Operation engineer
Output – insightful information
• Conveys meaningful and useful understanding or
knowledge towards completing the target task
• Not easily attainable via directly investigating raw data
without aid of analytics technologies
• Example
– It is easy to count the number of re-opened bugs, but how to
find out the primary reasons for these re-opened bugs?
Output – actionable information
• “So what” -- enables software practitioners to come up
with concrete solutions towards completing the target
task
• Example
– Why were bugs re-opened?
• A list of bug groups, each with the same reason for re-opening
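The "list of bug groups" output can be sketched as a simple grouping step. This is a minimal illustration, assuming each re-opened bug has already been assigned a reason label; the class name, bug IDs, and reason strings are hypothetical and do not come from the talk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: turn "bug id -> re-open reason" into "reason -> bug ids".
class ReopenedBugGroups {

    // Groups bug IDs by their re-open reason; IDs within a group are sorted.
    static Map<String, List<Integer>> groupByReason(Map<Integer, String> reasonByBugId) {
        Map<String, List<Integer>> groups = new TreeMap<>();
        // Iterate in ascending bug-id order so each group's list is sorted.
        for (Map.Entry<Integer, String> entry : new TreeMap<>(reasonByBugId).entrySet()) {
            groups.computeIfAbsent(entry.getValue(), reason -> new ArrayList<>())
                  .add(entry.getKey());
        }
        return groups;
    }

    public static void main(String[] args) {
        Map<Integer, String> reasons = new TreeMap<>();
        reasons.put(101, "incomplete fix");        // illustrative labels only
        reasons.put(102, "could not reproduce");
        reasons.put(103, "incomplete fix");
        System.out.println(groupByReason(reasons));
    }
}
```

The hard part in practice is deriving the reason labels from raw bug data, which is what the analytics has to supply; the grouping itself is the easy, actionable presentation.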
Research topics & technology pillars
• Research topics (vertical): Software Users, Software Development Process, Software System
• Technology pillars (horizontal): Information Visualization, Data Analysis Algorithms, Large-scale Computing
Outline
• Overview of Software Analytics
• Software Engineering Tasks
– XIAO: Scalable code clone analysis
– SAS: Incident management of online services
• Mobile App Security Tasks
– WHYPER: NLP on app descriptions
– AppContext: Machine learning to classify malware
XIAO
Scalable code clone analysis
2012
http://research.microsoft.com/jump/175199
XIAO: Code Clone Analysis
• Motivation
– Copy-and-paste is a common developer behavior
– A real tool widely adopted internally and externally
• XIAO enables code clone analysis in the following way
– High tunability
– High scalability
– High compatibility
– High explorability
High tunability – what you tune is what you get
• Intuitive similarity metric: effective control of the degree of syntactical differences between two code snippets

Snippet 1:
    for (i = 0; i < n; i++) {
        a++;
        b++;
        c = foo(a, b);
        d = bar(a, b, c);
        e = a + c;
    }

Snippet 2:
    for (i = 0; i < n; i++) {
        c = foo(a, b);
        a++;
        b++;
        d = bar(a, b, c);
        e = a + d;
        e++;
    }
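To make the idea of a tunable similarity threshold concrete, here is a minimal sketch that scores the loop bodies of the two snippets above. It is not XIAO's actual metric, which the slides do not define; it assumes a simple statement-level score (twice the longest common subsequence divided by the total number of statements), and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a tunable similarity check; NOT XIAO's real metric.
class CloneSimilarity {

    // Length of the longest common subsequence of two statement lists.
    static int lcs(List<String> a, List<String> b) {
        int[][] dp = new int[a.size() + 1][b.size() + 1];
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                if (a.get(i - 1).equals(b.get(j - 1))) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp[a.size()][b.size()];
    }

    // Similarity in [0, 1]; 1.0 means identical statement sequences.
    static double similarity(List<String> a, List<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0;
        }
        return 2.0 * lcs(a, b) / (a.size() + b.size());
    }

    // The threshold is the tuning knob: lower values tolerate more differences.
    static boolean isClone(List<String> a, List<String> b, double threshold) {
        return similarity(a, b) >= threshold;
    }

    public static void main(String[] args) {
        List<String> first = Arrays.asList(
            "a++;", "b++;", "c = foo(a, b);", "d = bar(a, b, c);", "e = a + c;");
        List<String> second = Arrays.asList(
            "c = foo(a, b);", "a++;", "b++;", "d = bar(a, b, c);", "e = a + d;", "e++;");
        System.out.println("similarity = " + similarity(first, second));
        System.out.println("clone at 0.5: " + isClone(first, second, 0.5));
        System.out.println("clone at 0.8: " + isClone(first, second, 0.8));
    }
}
```

With this score, the reordered and modified statements lower the similarity, so whether the pair is reported depends on the threshold chosen.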
High explorability
1. Clone navigation based on source tree hierarchy
2. Pivoting of folder level statistics
3. Folder level statistics
4. Clone function list in selected folder
5. Clone function filters
6. Sorting by bug or refactoring potential
7. Tagging
1. Block correspondence
2. Block types
3. Block navigation
4. Copying
5. Bug filing
6. Tagging
Scenarios & Solutions
Quality gates at milestones
• Architecture refactoring
• Code clone clean up
• Bug fixing
Post-release maintenance
• Security bug investigation
• Bug investigation for sustained engineering
Development and testing
• Checking for similar issues before check-in
• Reference info for code review
• Supporting tool for bug triage
Online code clone search
Offline code clone analysis
Benefiting developer community
Available in Visual Studio 2012 RC
Searching similar snippets
for fixing bug once
Finding refactoring
opportunity
More secure Microsoft products
Code Clone Search service integrated into
workflow of Microsoft Security Response Center
Over 590 million lines of code indexed across
multiple products
Real security issues proactively identified and
addressed
Example – MS Security Bulletin MS12-034
Combined Security Update for Microsoft Office, Windows, .NET Framework, and
Silverlight, published: Tuesday, May 08, 2012
Three publicly disclosed vulnerabilities and seven privately reported ones were involved. Specifically,
one was exploited by the Duqu malware to execute arbitrary code when a user opened
a malicious Office document
Insufficient bounds check within the font parsing subsystem of win32k.sys
Cloned copy in gdiplus.dll, ogl.dll (Office), Silverlight, Windows Journal viewer
Microsoft Technet Blog about this bulletin
However, we wanted to be sure to address the vulnerable code wherever it appeared
across the Microsoft code base. To that end, we have been working with Microsoft
Research to develop a “Cloned Code Detection” system that we can run for every
MSRC case to find any instance of the vulnerable code in any shipping product. This
system is the one that found several of the copies of CVE-2011-3402 that we are
now addressing with MS12-034.
SAS
Incident management of online services
http://research.microsoft.com/apps/pubs/?id=202451
Motivation
Incident Management (IcM) is a critical task to
assure service quality
• Online services are increasingly popular & important
• High service quality is the key
Incident Management: Workflow
1. Detect a service issue
2. Alert On-Call Engineers (OCEs)
3. Investigate the problem
4. Restore the service
5. Fix root cause via postmortem analysis
Incident Management: Characteristics
• Shrink-Wrapped Software Debugging
– Root Cause and Fix
– Debugger
– Controlled Environment
• Online Service Incident Management
– Workaround
– No Debugger
– Live Data
Incident Management: Challenges
Large volume and noisy data
Highly complex problem space
No knowledge of entire system
Knowledge not well organized
SAS: Incident management of online services
SAS was developed and deployed to reduce MTTR (Mean Time To Restore)
by automatically analyzing monitoring data
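As a reminder of what the MTTR figure measures, the following minimal sketch averages the time from detection to restoration over a set of incidents. The `Incident` record, its field names, and the timestamps are hypothetical; the slides do not say how SAS or Service X computes or records this.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: Mean Time To Restore over a list of incidents.
class MttrExample {

    // Assumed incident shape: when the issue was detected and when service was restored.
    static final class Incident {
        final Instant detected;
        final Instant restored;

        Incident(Instant detected, Instant restored) {
            this.detected = detected;
            this.restored = restored;
        }
    }

    // Average of (restored - detected), truncated to whole seconds.
    static Duration meanTimeToRestore(List<Incident> incidents) {
        if (incidents.isEmpty()) {
            throw new IllegalArgumentException("at least one incident is required");
        }
        long totalSeconds = 0;
        for (Incident incident : incidents) {
            totalSeconds += Duration.between(incident.detected, incident.restored).getSeconds();
        }
        return Duration.ofSeconds(totalSeconds / incidents.size());
    }

    public static void main(String[] args) {
        List<Incident> incidents = Arrays.asList(
            new Incident(Instant.parse("2011-06-01T10:00:00Z"), Instant.parse("2011-06-01T10:30:00Z")),
            new Incident(Instant.parse("2011-06-02T08:00:00Z"), Instant.parse("2011-06-02T09:30:00Z")));
        System.out.println("MTTR = " + meanTimeToRestore(incidents));
    }
}
```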
Design Principles of SAS
• Automating Analysis
• Handling Heterogeneity
• Accumulating Knowledge
• Supporting human-in-the-loop (HITL)
Techniques Overview
• System metrics
– Identifying Incident Beacons
• Transaction logs
– Mining Suspicious Execution Patterns
• Historical incidents
– Mining Historical Workaround Solutions
Industry Impact of SAS
Deployment
• SAS deployed to
worldwide datacenters for
Service X (serving
hundreds of millions of
users) since June 2011
• OCEs now heavily depend
on SAS
Usage
• SAS helped successfully
diagnose ~76% of the
service incidents assisted
with SAS
Outline
• Overview of Software Analytics
• Software Engineering Tasks
– XIAO: Scalable code clone analysis
– SAS: Incident management of online services
• Mobile App Security Tasks
– WHYPER: NLP on app descriptions
– AppContext: Machine learning to classify malware
“Conceptual” Model
APP DEVELOPERS
• App Functional Requirements (informal: app description, etc.)
• App Security Requirements (permission list, etc.)
• App Code
APP USERS
• User Functional Requirements
• User Security Requirements
Requirements: App Description
App Code
App Permissions
App Security Requirements: Permission List
Example Android App: Angry Birds
• Focus on permission → app descriptions
• Permissions (protecting user-understandable resources) should be discussed
• What do users expect (w.r.t. app functionalities)?
– GPS Tracker: record and send location
– Phone-Call Recorder: record audio during phone call
WHYPER: Text Analytics for Mobile Security
App Description Sentence
Permission
Linkage
Pandita et al. WHYPER: Towards Automating Risk Assessment of Mobile Applications. USENIX Security 2013
http://web.engr.illinois.edu/~taoxie/publications/usenixsec13-whyper.pdf
WHYPER Overview
Application Market
WHYPER
DEVELOPERS
USERS
• Enhance user experience while installing apps
• Enforce functionality disclosure on developers
• Complement program analysis to ensure justifications
Natural Language Processing on App Description
• “Also you can share the yoga exercise to your friends via Email and SMS.”
– Implication of using the contact permission
– Permission sentences
• Confounding effects:
– Certain keywords such as “contact” have a confounding meaning
– E.g., “... displays user contacts, ...” vs “... contact me at abc@xyz.com”.
• Semantic inference:
– Sentences describe a sensitive action w/o referring to keywords
– E.g., “share yoga exercises with your friends via Email and SMS”
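The two problems listed above can be reproduced with a naive keyword matcher. This sketch is a baseline for illustration only, not WHYPER's technique; the class and method names are hypothetical, and the sentences are the ones quoted on the slide.

```java
import java.util.Locale;

// Hypothetical baseline: flag a sentence if it contains a permission keyword.
class KeywordMatcher {

    // Case-insensitive substring check; deliberately ignores sentence meaning.
    static boolean mentions(String sentence, String keyword) {
        return sentence.toLowerCase(Locale.ROOT).contains(keyword.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String keyword = "contact";
        // Describes reading the contact list: flagged, as intended.
        System.out.println(mentions("... displays user contacts, ...", keyword));
        // Only gives a support address: also flagged (confounding effect).
        System.out.println(mentions("... contact me at abc@xyz.com", keyword));
        // Implies contact access without the keyword: missed (needs semantic inference).
        System.out.println(mentions("share yoga exercises with your friends via Email and SMS", keyword));
    }
}
```

Both failure modes come from matching surface strings, which is why the approach described in the talk analyzes sentence structure and semantics instead.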
NLP + Semantic Graphs/Ontologies Derived from Android API Documents
• Synonym analysis
• Ex. non-permission sentence: “You can now turn recordings into ringtones.”
• Functionality that allows users to create ringtones from previously recorded sounds but does NOT require permission to record audio
• False positive due to using synonym: (turn, start)
• Limitations of Semantic Graphs
• Ex. permission sentence: “blow into the mic to extinguish the flame like
a real candle”
• false negative due to failing to associate “blow into” with “record”
• Automatic mining from user comments and forums
Challenges
Not All Malware Developers Are “Dumb” or “Lazy”
Example Malicious App
Benign? Malicious?
Our Insight
Different goals of benign apps vs. malware.
• Benign apps
– Meet requirements from users (as delivering utility)
• Malware
– Trigger malicious behaviors frequently (as maximizing profits)
– Evade detection (as prolonging lifetime)
Differentiating characteristics
Mobile malware (vs. benign apps)
– Frequently enough to meet the need: frequent
occurrences of imperceptible system events;
• E.g., many malware families trigger malicious behaviors via
background events.
– Not too frequently for users to notice anomaly:
indicative states of external environments
• E.g., Send premium SMS every 12 hours
Balance!!!
Example of malicious app
Methods shown in the call graph: DummyMainMethod(), ActionReceiver.OnReceive(), ContextWrapper.StartService(), MainService.OnCreate(), MainService.b(), SplashActivity.OnCreate(), SendTextActivity$4.onClick(), SendTextActivity$5.run(), SmsManager.sendTextMessage()

ActionReceiver.OnReceive()
    Date date = new Date();
    if (date.getHours() > 23 || date.getHours() < 5) {
        ContextWrapper.StartService(MainService);
        …

Check guarding another call to SmsManager.sendTextMessage():
    long last = db.query("LastConnectTime");
    long current = System.currentTimeMillis();
    if (current - last > 43200000) {
        SmsManager.sendTextMessage();
        db.save("LastConnectTime", current);
        …

The app will send an SMS when
• user clicks a button in the app
– SendTextActivity$4.onClick to SmsManager.sendTextMessage
• phone signal strength changes (frequent) and current time is within 11PM-5 AM (not too frequent, user not around)
– Android.intent.action.SIG_STR
– if (date.getHours() > 23 || date.getHours() < 5) {
• user enters the app (frequent) and (current time - time when last msg sent) > 12 hours (not too frequent)
– if (current - last > 43200000) {
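The two context checks in this example can be isolated as small predicates. This is a minimal sketch, not code from the app: the class and method names are hypothetical, and the night window is taken from the caption (11PM-5AM), so the sketch assumes `hour >= 23` rather than the `> 23` comparison shown in the fragment.

```java
import java.time.Duration;
import java.time.LocalTime;

// Hypothetical sketch of the two context factors guarding the SMS calls.
class SmsTriggerConditions {

    // 12 hours in milliseconds, the 43200000 constant from the fragment.
    static final long TWELVE_HOURS_MS = Duration.ofHours(12).toMillis();

    // Calendar factor: true between 11PM and 5AM, when the user is likely away.
    // Assumes the window stated in the caption, hence ">= 23".
    static boolean inNightWindow(LocalTime now) {
        int hour = now.getHour();
        return hour >= 23 || hour < 5;
    }

    // Database + system-time factor: true once more than 12 hours have passed
    // since the stored "LastConnectTime" value.
    static boolean enoughTimeElapsed(long lastMillis, long currentMillis) {
        return currentMillis - lastMillis > TWELVE_HOURS_MS;
    }

    public static void main(String[] args) {
        System.out.println(inNightWindow(LocalTime.of(23, 30)));
        System.out.println(inNightWindow(LocalTime.of(12, 0)));
        System.out.println(enoughTimeElapsed(0L, TWELVE_HOURS_MS + 1));
    }
}
```

Neither predicate is suspicious on its own; what matters for the analysis is that they sit on the path between a frequent, imperceptible event and a security-sensitive call.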
AppContext
• Capture differentiating characteristics with
contexts of security-sensitive behavior.
• Leverage contexts in machine learning
(classification) to differentiate malware and
benign apps.
Yang et al. AppContext: Differentiating Malicious and Benign Mobile App Behavior Under Contexts. ICSE 2015.
http://taoxie.cs.illinois.edu/publications/icse15-appcontext.pdf
Techniques
• Abstraction for expressing context of security-
sensitive behaviors, e.g., a permission protected
API method.
– To precisely capture the differentiating
characteristics
• Inter-component analysis for extracting contexts
– To identify entry point for activation events
– To connect control flows for context factors
Context of security-sensitive behavior
• Activation events:
• E.g., signal strength changes
• Context factors:
• Environmental attributes for affecting security-
sensitive behavior’s invocation (or not)
• E.g., current system time
AppContext - Workflow
CG: Call Graph; ECG: Extended CG; RICFG: Reduced ICFG
Context-based Security-Behavior Classification
Transforming, Labelling, Training, Classifying
Step 1. Transform contexts for each app’s security behavior as features
Context1: (Event: Signal strength changes), (Factor: Calendar)
Context2: (Event: Entering app), (Factor: Database, SystemTime)
Context3: (Event: Clicking a button)
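One way to picture the transforming step is a fixed-order binary vector with one slot per activation event and context factor. The slides do not give AppContext's actual feature encoding, so this is only an assumed one-hot scheme; the vocabulary strings and the class name are hypothetical labels for the three contexts listed above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: encode a context (events + factors) as a binary feature vector.
class ContextFeatures {

    // Assumed fixed vocabulary; the position in this list is the feature index.
    static final List<String> VOCABULARY = Arrays.asList(
        "event:SignalStrengthChanges",
        "event:EnteringApp",
        "event:ClickingButton",
        "factor:Calendar",
        "factor:Database",
        "factor:SystemTime");

    // Returns 1 at each index whose vocabulary entry is present in the context.
    static int[] encode(Set<String> context) {
        int[] features = new int[VOCABULARY.size()];
        for (int i = 0; i < VOCABULARY.size(); i++) {
            features[i] = context.contains(VOCABULARY.get(i)) ? 1 : 0;
        }
        return features;
    }

    public static void main(String[] args) {
        Set<String> context1 = new HashSet<>(
            Arrays.asList("event:SignalStrengthChanges", "factor:Calendar"));
        Set<String> context2 = new HashSet<>(
            Arrays.asList("event:EnteringApp", "factor:Database", "factor:SystemTime"));
        Set<String> context3 = new HashSet<>(Arrays.asList("event:ClickingButton"));
        System.out.println(Arrays.toString(encode(context1)));
        System.out.println(Arrays.toString(encode(context2)));
        System.out.println(Arrays.toString(encode(context3)));
    }
}
```

Vectors of this shape are what a classifier such as an SVM would consume in the training and classifying steps.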
Context-based
Security-Behavior Classification (Cont.)
Transforming Labelling Training Classifying
Systematically label security-sensitive method calls as
malicious based on the existing malware signatures
Support Vector Machine (SVM)
• SVM is resilient to over-fitting
• SVM can handle high dimension data such as our
context factor data (dimension reduction may be
another option).
Evaluation
Subjects: 846 Android apps
• 633 benign apps: randomly selected from popular
apps on Google Play.
• 202 malicious apps: collected from three different malware datasets (Genome, VirusShare, Contagio).
• 11 open source apps: randomly selected from F-Droid.
Research Questions
• RQ1: How effective is AppContext in identifying
malware?
• RQ2: How do activation events and context factors
in our context definition contribute to the
effectiveness of malware identification?
• RQ3: How accurate is our static analysis in inferring
contexts?
Evaluation
Complete Context has higher precision (87.7%)
and recall (95.0%)
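For reference, precision and recall are used here in their standard sense. The symbols below are not on the slide: TP is assumed to count malicious method calls correctly flagged, FP benign calls flagged as malicious, and FN malicious calls that were missed.

```latex
\[
\text{precision} = \frac{TP}{TP + FP},
\qquad
\text{recall} = \frac{TP}{TP + FN}
\]
```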
Evaluation
Activation events effectively help identify malicious
method calls without context factors
Evaluation
Context factors effectively help identify malicious
behaviors triggered by UI events or malicious
behaviors with no activation events
Limitations
• False negatives
– Malicious behaviors triggered by UI events and
without context factors.
• UI events have less indication of the maliciousness of a
security-sensitive method call
• False positives
– Reflective method calls, dynamic code loading in
benign apps.
– Uncommon security-sensitive method calls used in
benign apps.
Conclusion
• Research topics (vertical): Software Users, Software Development Process, Software System
• Technology pillars (horizontal): Information Visualization, Data Analysis Algorithms, Large-scale Computing
Q & A
http://taoxie.cs.illinois.edu/
Contact: taoxie@illinois.edu

More Related Content

What's hot

ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...ACM Chicago
 
Intelligent Software Engineering: Synergy between AI and Software Engineering...
Intelligent Software Engineering: Synergy between AI and Software Engineering...Intelligent Software Engineering: Synergy between AI and Software Engineering...
Intelligent Software Engineering: Synergy between AI and Software Engineering...Tao Xie
 
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William Enck
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William EnckHotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William Enck
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William EnckTao Xie
 
User Expectations in Mobile App Security
User Expectations in Mobile App SecurityUser Expectations in Mobile App Security
User Expectations in Mobile App SecurityTao Xie
 
Software Analytics: Towards Software Mining that Matters
Software Analytics: Towards Software Mining that MattersSoftware Analytics: Towards Software Mining that Matters
Software Analytics: Towards Software Mining that MattersTao Xie
 
Software Mining and Software Datasets
Software Mining and Software DatasetsSoftware Mining and Software Datasets
Software Mining and Software DatasetsTao Xie
 
Big(ger) Data in Software Engineering
Big(ger) Data in Software EngineeringBig(ger) Data in Software Engineering
Big(ger) Data in Software EngineeringMehdi Mirakhorli
 
Measuring Agile Software Development
Measuring Agile Software DevelopmentMeasuring Agile Software Development
Measuring Agile Software DevelopmentMiroslaw Staron
 
Analytics for software development
Analytics for software developmentAnalytics for software development
Analytics for software developmentThomas Zimmermann
 
PhD Proposal talk
PhD Proposal talkPhD Proposal talk
PhD Proposal talkRay Buse
 
Opinion Mining for Software Engineering
Opinion Mining for Software EngineeringOpinion Mining for Software Engineering
Opinion Mining for Software EngineeringAlexander Serebrenik
 
Empirical evaluation in 2020: how big, how beautiful?
Empirical evaluation in 2020: how big, how beautiful?Empirical evaluation in 2020: how big, how beautiful?
Empirical evaluation in 2020: how big, how beautiful?Massimiliano Di Penta
 
Put Your Hands in the Mud: What Technique, Why, and How
Put Your Hands in the Mud: What Technique, Why, and HowPut Your Hands in the Mud: What Technique, Why, and How
Put Your Hands in the Mud: What Technique, Why, and HowMassimiliano Di Penta
 
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTrivadis
 
Applying Machine Learning to Network Security Monitoring - BayThreat 2013
Applying Machine Learning to Network Security Monitoring - BayThreat 2013Applying Machine Learning to Network Security Monitoring - BayThreat 2013
Applying Machine Learning to Network Security Monitoring - BayThreat 2013Alex Pinto
 
Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...
Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...
Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...Alex Pinto
 
Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...
Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...
Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...Alex Pinto
 
Biting into the Jawbreaker: Pushing the Boundaries of Threat Hunting Automation
Biting into the Jawbreaker: Pushing the Boundaries of Threat Hunting AutomationBiting into the Jawbreaker: Pushing the Boundaries of Threat Hunting Automation
Biting into the Jawbreaker: Pushing the Boundaries of Threat Hunting AutomationAlex Pinto
 
BSidesLV 2013 - Using Machine Learning to Support Information Security
BSidesLV 2013 - Using Machine Learning to Support Information SecurityBSidesLV 2013 - Using Machine Learning to Support Information Security
BSidesLV 2013 - Using Machine Learning to Support Information SecurityAlex Pinto
 

What's hot (20)

ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...
ACM Chicago March 2019 meeting: Software Engineering and AI - Prof. Tao Xie, ...
 
Intelligent Software Engineering: Synergy between AI and Software Engineering...
Intelligent Software Engineering: Synergy between AI and Software Engineering...Intelligent Software Engineering: Synergy between AI and Software Engineering...
Intelligent Software Engineering: Synergy between AI and Software Engineering...
 
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William Enck
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William EnckHotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William Enck
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William Enck
 
User Expectations in Mobile App Security
User Expectations in Mobile App SecurityUser Expectations in Mobile App Security
User Expectations in Mobile App Security
 
Software Analytics: Towards Software Mining that Matters
Software Analytics: Towards Software Mining that MattersSoftware Analytics: Towards Software Mining that Matters
Software Analytics: Towards Software Mining that Matters
 
Software Mining and Software Datasets
Software Mining and Software DatasetsSoftware Mining and Software Datasets
Software Mining and Software Datasets
 
Big(ger) Data in Software Engineering
Big(ger) Data in Software EngineeringBig(ger) Data in Software Engineering
Big(ger) Data in Software Engineering
 
Measuring Agile Software Development
Measuring Agile Software DevelopmentMeasuring Agile Software Development
Measuring Agile Software Development
 
Analytics for software development
Analytics for software developmentAnalytics for software development
Analytics for software development
 
PhD Proposal talk
PhD Proposal talkPhD Proposal talk
PhD Proposal talk
 
Opinion Mining for Software Engineering
Opinion Mining for Software EngineeringOpinion Mining for Software Engineering
Opinion Mining for Software Engineering
 
Empirical evaluation in 2020: how big, how beautiful?
Empirical evaluation in 2020: how big, how beautiful?Empirical evaluation in 2020: how big, how beautiful?
Empirical evaluation in 2020: how big, how beautiful?
 
My life as a cyborg
My life as a cyborg My life as a cyborg
My life as a cyborg
 
Put Your Hands in the Mud: What Technique, Why, and How
Put Your Hands in the Mud: What Technique, Why, and HowPut Your Hands in the Mud: What Technique, Why, and How
Put Your Hands in the Mud: What Technique, Why, and How
 
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - TrivadisTechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
TechEvent 2019: Artificial Intelligence in Dev & Ops; Martin Luckow - Trivadis
 
Applying Machine Learning to Network Security Monitoring - BayThreat 2013
Applying Machine Learning to Network Security Monitoring - BayThreat 2013Applying Machine Learning to Network Security Monitoring - BayThreat 2013
Applying Machine Learning to Network Security Monitoring - BayThreat 2013
 
Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...
Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...
Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...
 
Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...
Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...
Secure Because Math: A Deep-Dive on Machine Learning-Based Monitoring (#Secur...
 
Biting into the Jawbreaker: Pushing the Boundaries of Threat Hunting Automation
Biting into the Jawbreaker: Pushing the Boundaries of Threat Hunting AutomationBiting into the Jawbreaker: Pushing the Boundaries of Threat Hunting Automation
Biting into the Jawbreaker: Pushing the Boundaries of Threat Hunting Automation
 
BSidesLV 2013 - Using Machine Learning to Support Information Security
BSidesLV 2013 - Using Machine Learning to Support Information SecurityBSidesLV 2013 - Using Machine Learning to Support Information Security
BSidesLV 2013 - Using Machine Learning to Support Information Security
 

Similar to Software Analytics: Data Analytics for Software Engineering and Security

Software Analytics: Towards Software Mining that Matters (2014)
Software Analytics:Towards Software Mining that Matters (2014)Software Analytics:Towards Software Mining that Matters (2014)
Software Analytics: Towards Software Mining that Matters (2014)Tao Xie
 
Software Analytics - Achievements and Challenges
Software Analytics - Achievements and ChallengesSoftware Analytics - Achievements and Challenges
Software Analytics - Achievements and ChallengesTao Xie
 
Software Security Assurance for DevOps
Software Security Assurance for DevOpsSoftware Security Assurance for DevOps
Software Security Assurance for DevOpsBlack Duck by Synopsys
 
Build Security into the Software with Sparrow
Build Security into the Software with SparrowBuild Security into the Software with Sparrow
Build Security into the Software with SparrowJason Sohn
 
Programming languages and techniques for today’s embedded andIoT world
Programming languages and techniques for today’s embedded andIoT worldProgramming languages and techniques for today’s embedded andIoT world
Programming languages and techniques for today’s embedded andIoT worldRogue Wave Software
 
DevSecOps : an Introduction
DevSecOps : an IntroductionDevSecOps : an Introduction
DevSecOps : an IntroductionPrashanth B. P.
 
Finding Zero-Days Before The Attackers: A Fortune 500 Red Team Case Study
Finding Zero-Days Before The Attackers: A Fortune 500 Red Team Case StudyFinding Zero-Days Before The Attackers: A Fortune 500 Red Team Case Study
Finding Zero-Days Before The Attackers: A Fortune 500 Red Team Case StudyDevOps.com
 
Code Quality - Security
Code Quality - SecurityCode Quality - Security
Code Quality - Securitysedukull
 
Building Your Application Security Data Hub - OWASP AppSecUSA
Building Your Application Security Data Hub - OWASP AppSecUSABuilding Your Application Security Data Hub - OWASP AppSecUSA
Building Your Application Security Data Hub - OWASP AppSecUSADenim Group
 
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in FirmwareUsing Static Binary Analysis To Find Vulnerabilities And Backdoors in Firmware
Using Static Binary Analysis To Find Vulnerabilities And Backdoors in FirmwareLastline, Inc.
 
The Magic Of Application Lifecycle Management In Vs Public
The Magic Of Application Lifecycle Management In Vs PublicThe Magic Of Application Lifecycle Management In Vs Public
The Magic Of Application Lifecycle Management In Vs PublicDavid Solivan
 
Find Out What's New With WhiteSource May 2018- A WhiteSource Webinar
Find Out What's New With WhiteSource May 2018- A WhiteSource WebinarFind Out What's New With WhiteSource May 2018- A WhiteSource Webinar
Find Out What's New With WhiteSource May 2018- A WhiteSource WebinarWhiteSource
 
Building an Open Source AppSec Pipeline - 2015 Texas Linux Fest
Building an Open Source AppSec Pipeline - 2015 Texas Linux FestBuilding an Open Source AppSec Pipeline - 2015 Texas Linux Fest
Building an Open Source AppSec Pipeline - 2015 Texas Linux FestMatt Tesauro
 
Leverage DevOps & Agile Development to Transform Your Application Testing Pro...
Leverage DevOps & Agile Development to Transform Your Application Testing Pro...Leverage DevOps & Agile Development to Transform Your Application Testing Pro...
Leverage DevOps & Agile Development to Transform Your Application Testing Pro...DevOps.com
 
Leverage DevOps & Agile Development to Transform Your Application Testing Pro...
Leverage DevOps & Agile Development to Transform Your Application Testing Pro...Leverage DevOps & Agile Development to Transform Your Application Testing Pro...
Leverage DevOps & Agile Development to Transform Your Application Testing Pro...DevOps for Enterprise Systems
 
Leverage DevOps & Agile Development to Transform Your Application Testing Pro...
Leverage DevOps & Agile Development to Transform Your Application Testing Pro...Leverage DevOps & Agile Development to Transform Your Application Testing Pro...
Leverage DevOps & Agile Development to Transform Your Application Testing Pro...Deborah Schalm
 
2014 01-ticosa
2014 01-ticosa2014 01-ticosa
2014 01-ticosaPharo
 
Meetic Backend Mutation With Symfony
Meetic Backend Mutation With SymfonyMeetic Backend Mutation With Symfony
Meetic Backend Mutation With SymfonymeeticTech
 
Software Security Assurance for DevOps - Hewlett Packard Enterprise + Black Duck
Software Security Assurance for DevOps - Hewlett Packard Enterprise + Black DuckSoftware Security Assurance for DevOps - Hewlett Packard Enterprise + Black Duck
Software Security Assurance for DevOps - Hewlett Packard Enterprise + Black DuckBlack Duck by Synopsys
 

Software Analytics: Data Analytics for Software Engineering and Security

  • 1. Software Analytics: Data Analytics for Software Engineering and Security Tao Xie Department of Computer Science University of Illinois at Urbana-Champaign, USA taoxie@illinois.edu In Collaboration with Microsoft Research and NC State University
  • 2. New Era…Software itself is changing... Software Services
  • 3. How people use software is changing…
  • 4. Individual Isolated Not much data/content generation How people use software is changing…
  • 5. How people use software is changing… Individual Isolated Not much data/content generation
  • 6. How people use software is changing… Individual Social Isolated Not much data/content generation Collaborative Huge amount of data/artifacts generated anywhere anytime
  • 7. How software is built & operated is changing…
  • 8. How software is built & operated is changing… Data pervasive Long product cycle Experience & gut-feeling In-lab testing Informed decision making Centralized development Code centric Debugging in the large Distributed development Continuous release … …
  • 9. How software is built & operated is changing… Data pervasive Long product cycle Experience & gut-feeling In-lab testing Informed decision making Centralized development Code centric Debugging in the large Distributed development Continuous release … …
  • 10. Software Analytics Software analytics is to enable software practitioners to perform data exploration and analysis in order to obtain insightful and actionable information for data-driven tasks around software and services. Dongmei Zhang, Yingnong Dang, Jian-Guang Lou, Shi Han, Haidong Zhang, and Tao Xie. Software Analytics as a Learning Case in Practice: Approaches and Experiences. In MALETS 2011 http://research.microsoft.com/en-us/groups/sa/malets11-analytics.pdf
  • 11. Software Analytics Software analytics is to enable software practitioners to perform data exploration and analysis in order to obtain insightful and actionable information for data-driven tasks around software and services. http://research.microsoft.com/en-us/groups/sa/ http://research.microsoft.com/en-us/news/features/softwareanalytics-052013.aspx
  • 12. Data sources Runtime traces Program logs System events Perf counters … Usage log User surveys Online forum posts Blog & Twitter … Source code Bug history Check-in history Test cases Eye tracking MRI/EMG …
  • 13. Target audience – software practitioners
  • 14. Target audience – software practitioners Developer Tester
  • 15. Target audience – software practitioners Developer Tester Program Manager Usability engineer Designer Support engineer Management personnel Operation engineer
  • 16. Output – insightful information • Conveys meaningful and useful understanding or knowledge towards completing the target task • Not easily attainable via directly investigating raw data without aid of analytics technologies • Example – It is easy to count the number of re-opened bugs, but how to find out the primary reasons for these re-opened bugs?
  • 17. Output – actionable information • “So what” -- enables software practitioners to come up with concrete solutions towards completing the target task • Example – Why were bugs re-opened? • A list of bug groups, each sharing the same reason for re-opening
  • 18. Research topics & technology pillars Software Users Software Development Process Software System Vertical Horizontal Information Visualization Data Analysis Algorithms Large-scale Computing
  • 19. Outline • Overview of Software Analytics • Software Engineering Tasks – XIAO: Scalable code clone analysis – SAS: Incident management of online services • Mobile App Security Tasks – WHYPER: NLP on app descriptions – AppContext: Machine learning to classify malware
  • 20. XIAO Scalable code clone analysis 2012 http://research.microsoft.com/jump/175199
  • 21. XIAO: Code Clone Analysis • Motivation – Copy-and-paste is a common developer behavior – A real tool widely adopted internally and externally • XIAO enables code clone analysis in the following way – High tunability – High scalability – High compatibility – High explorability
  • 22. High tunability – what you tune is what you get • Intuitive similarity metric: effective control of the degree of syntactical differences between two code snippets for (i = 0; i < n; i ++) { a ++; b ++; c = foo(a, b); d = bar(a, b, c); e = a + c; } for (i = 0; i < n; i ++) { c = foo(a, b); a ++; b ++; d = bar(a, b, c); e = a + d; e ++; }
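The idea of a tunable similarity threshold on slide 22 can be illustrated with a small sketch. The slides do not give XIAO's actual metric, so this example substitutes a simple stand-in: the longest common subsequence (LCS) of statements, divided by the length of the longer snippet. The class name `CloneSimilaritySketch`, the method names, and the threshold values are all hypothetical; the two statement lists are the loop bodies shown on the slide.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative stand-in only: NOT XIAO's metric, which the slides do not specify.
final class CloneSimilaritySketch {
    // Loop bodies of the two snippets on slide 22, one statement per element.
    static final List<String> SNIPPET_A = Arrays.asList(
        "a++", "b++", "c = foo(a, b)", "d = bar(a, b, c)", "e = a + c");
    static final List<String> SNIPPET_B = Arrays.asList(
        "c = foo(a, b)", "a++", "b++", "d = bar(a, b, c)", "e = a + d", "e++");

    // Classic dynamic-programming LCS over statement sequences.
    static int lcsLength(List<String> a, List<String> b) {
        int[][] dp = new int[a.size() + 1][b.size() + 1];
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                dp[i][j] = a.get(i - 1).equals(b.get(j - 1))
                    ? dp[i - 1][j - 1] + 1
                    : Math.max(dp[i - 1][j], dp[i][j - 1]);
            }
        }
        return dp[a.size()][b.size()];
    }

    // Similarity in [0, 1]: shared statements relative to the longer snippet.
    static double similarity(List<String> a, List<String> b) {
        int longer = Math.max(a.size(), b.size());
        return longer == 0 ? 1.0 : (double) lcsLength(a, b) / longer;
    }

    // The tunable part: a lower threshold tolerates more syntactic difference.
    static boolean isClone(List<String> a, List<String> b, double threshold) {
        return similarity(a, b) >= threshold;
    }
}
```

Under this stand-in metric the two snippets share three statements in order (`a++`, `b++`, `d = bar(a, b, c)`), so raising or lowering the threshold decides whether the pair is reported as a clone.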
  • 23. High explorability 1. Clone navigation based on source tree hierarchy 2. Pivoting of folder level statistics 3. Folder level statistics 4. Clone function list in selected folder 5. Clone function filters 6. Sorting by bug or refactoring potential 7. Tagging 1. Block correspondence 2. Block types 3. Block navigation 4. Copying 5. Bug filing 6. Tagging
  • 24. Scenarios & Solutions Quality gates at milestones • Architecture refactoring • Code clone clean up • Bug fixing Post-release maintenance • Security bug investigation • Bug investigation for sustained engineering Development and testing • Checking for similar issues before check-in • Reference info for code review • Supporting tool for bug triage Online code clone search Offline code clone analysis
  • 25. Benefiting developer community Available in Visual Studio 2012 RC Searching for similar snippets to fix a bug once Finding refactoring opportunities
  • 26. More secure Microsoft products Code Clone Search service integrated into workflow of Microsoft Security Response Center Over 590 million lines of code indexed across multiple products Real security issues proactively identified and addressed
  • 27. Example – MS Security Bulletin MS12-034 Combined Security Update for Microsoft Office, Windows, .NET Framework, and Silverlight, published: Tuesday, May 08, 2012 Three publicly disclosed vulnerabilities and seven privately reported vulnerabilities were involved. Specifically, one was exploited by the Duqu malware to execute arbitrary code when a user opened a malicious Office document Insufficient bounds check within the font parsing subsystem of win32k.sys Cloned copy in gdiplus.dll, ogl.dll (Office), Silverlight, Windows Journal viewer Microsoft TechNet Blog about this bulletin: “However, we wanted to be sure to address the vulnerable code wherever it appeared across the Microsoft code base. To that end, we have been working with Microsoft Research to develop a ‘Cloned Code Detection’ system that we can run for every MSRC case to find any instance of the vulnerable code in any shipping product. This system is the one that found several of the copies of CVE-2011-3402 that we are now addressing with MS12-034.”
  • 28. SAS Incident management of online services http://research.microsoft.com/apps/pubs/?id=202451
  • 29. Motivation Incident Management (IcM) is a critical task to assure service quality • Online services are increasingly popular & important • High service quality is the key
  • 30. Incident Management: Workflow Detect a service issue Alert On-Call Engineers (OCEs) Investigate the problem Restore the service Fix root cause via postmortem analysis
  • 31. Incident Management: Characteristics Shrink-Wrapped Software Debugging Root Cause and Fix Debugger Controlled Environment Online Service Incident Management Workaround No Debugger Live Data
  • 32. Incident Management: Challenges Large volume and noisy data Highly complex problem space No knowledge of entire system Knowledge not well organized
  • 33. SAS: Incident management of online services SAS, developed and deployed to effectively reduce MTTR (Mean Time To Restore) via automatically analyzing monitoring data Design Principle of SAS • Automating Analysis • Handling Heterogeneity • Accumulating Knowledge • Supporting human-in-the-loop (HITL)
  • 34. Techniques Overview • System metrics – Identifying Incident Beacons • Transaction logs – Mining Suspicious Execution Patterns • Historical incidents – Mining Historical Workaround Solutions
  • 35. Industry Impact of SAS Deployment • SAS deployed to worldwide datacenters for Service X (serving hundreds of millions of users) since June 2011 • OCEs now heavily depend on SAS Usage • SAS helped successfully diagnose ~76% of the service incidents assisted with SAS
  • 36. Outline • Overview of Software Analytics • Software Engineering Tasks – XIAO: Scalable code clone analysis – SAS: Incident management of online services • Mobile App Security Tasks – WHYPER: NLP on app descriptions – AppContext: Machine learning to classify malware
  • 37. “Conceptual” Model APP DEVELOPERS APP USERS App Functional Requirements App Security Requirements User Functional Requirements User Security Requirements informal: app description, etc. permission list, etc. App Code
  • 40. “Conceptual” Model APP DEVELOPERS APP USERS App Functional Requirements App Security Requirements User Functional Requirements User Security Requirements informal: app description, etc. permission list, etc. App Code
  • 41. Example Android App: Angry Birds
  • 42. WHYPER: Text Analytics for Mobile Security o Focus on permission → app descriptions o permissions (protecting user-understandable resources) should be discussed o What do users expect (w.r.t. app functionalities)? o GPS Tracker: record and send location o Phone-Call Recorder: record audio during phone call App Description Sentence Permission Linkage Pandita et al. WHYPER: Towards Automating Risk Assessment of Mobile Applications. USENIX Security 2013 http://web.engr.illinois.edu/~taoxie/publications/usenixsec13-whyper.pdf
  • 43. WHYPER Overview Application Market WHYPER DEVELOPERS USERS Pandita et al. WHYPER: Towards Automating Risk Assessment of Mobile Applications. USENIX Security 2013 http://web.engr.illinois.edu/~taoxie/publications/usenixsec13-whyper.pdf • Enhance user experience while installing apps • Enforce functionality disclosure on developers • Complement program analysis to ensure justifications
  • 44. Natural Language Processing on App Description • “Also you can share the yoga exercise to your friends via Email and SMS.” – Implication of using the contact permission – Permission sentences • Confounding effects: – Certain keywords such as “contact” have a confounding meaning – E.g., “... displays user contacts, ...” vs “... contact me at abc@xyz.com”. • Semantic inference: – Sentences describe a sensitive action w/o referring to keywords – E.g., “share yoga exercises with your friends via Email and SMS” NLP + Semantic Graphs/Ontologies Derived from Android API Documents
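The two problems named on slide 44 can be seen with a deliberately naive baseline. The sketch below is not WHYPER, which the slide describes as NLP plus semantic graphs; it is a plain substring match, shown only to make the failure modes concrete. The class and method names are hypothetical, and the sentences are the ones quoted on the slide.

```java
import java.util.Locale;

// Naive keyword baseline, for illustration only; WHYPER itself uses NLP and semantic graphs.
final class KeywordMatchSketch {
    // Flags a sentence as permission-related if it contains the keyword (case-insensitive).
    static boolean mentionsKeyword(String sentence, String keyword) {
        return sentence.toLowerCase(Locale.ROOT)
                       .contains(keyword.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String keyword = "contact";
        // Genuine permission sentence: flagged, as intended.
        System.out.println(mentionsKeyword("... displays user contacts, ...", keyword));
        // Confounding effect: also flagged, although it only gives an email address.
        System.out.println(mentionsKeyword("... contact me at abc@xyz.com", keyword));
        // Semantic inference: not flagged, although sharing with friends implies contacts.
        System.out.println(mentionsKeyword(
            "share yoga exercises with your friends via Email and SMS", keyword));
    }
}
```

The second call is a false positive and the third a false negative, which is the gap the slide attributes to keyword-only approaches.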
  • 45. Challenges • Synonym analysis • Ex. non-permission sentence: “You can now turn recordings into ringtones.” • functionality that allows users to create ringtones from previously recorded sounds but NOT requiring permission to record audio • false positive due to using synonym: (turn, start) • Limitations of Semantic Graphs • Ex. permission sentence: “blow into the mic to extinguish the flame like a real candle” • false negative due to failing to associate “blow into” with “record” • Automatic mining from user comments and forums
  • 46. Not All Malware Developers Are “Dumb” or “Lazy”
  • 48. Not All Malware Developers Are “Dumb” or “Lazy” Benign? Malicious?
  • 49. Our Insight Different goals of benign apps vs. malware. • Benign apps – Meet requirements from users (as delivering utility) • Malware – Trigger malicious behaviors frequently (as maximizing profits) – Evade detection (as prolonging lifetime)
  • 50. Differentiating characteristics Mobile malware (vs. benign apps) – Frequently enough to meet the need: frequent occurrences of imperceptible system events; • E.g., many malware families trigger malicious behaviors via background events. – Not too frequently for users to notice anomaly: indicative states of external environments • E.g., Send premium SMS every 12 hours Balance!!!
  • 51. Example of malicious app. [Call-graph nodes: ActionReceiver.OnReceive(), MainService.OnCreate(), MainService.b(), DummyMainMethod(), SplashActivity.OnCreate(), SendTextActivity$4.onClick(), SendTextActivity$5.run(), ContextWrapper.StartService(), SmsManager.sendTextMessage()] Code fragments shown: (1) Date date = new Date(); if (date.getHours() > 23 || date.getHours() < 5) { ContextWrapper.StartService(MainService); … (2) long last = db.query("LastConnectTime"); long current = System.currentTimeMillis(); if (current - last > 43200000) { SmsManager.sendTextMessage(); db.save("LastConnectTime", current); … The app will send an SMS when • user clicks a button in the app. Highlighted: SendTextActivity$4.onClick, SmsManager.sendTextMessage
  • 52. Example of malicious app. [Call-graph nodes: ActionReceiver.OnReceive(), MainService.OnCreate(), MainService.b(), DummyMainMethod(), SplashActivity.OnCreate(), SendTextActivity$4.onClick(), SendTextActivity$5.run(), ContextWrapper.StartService(), SmsManager.sendTextMessage()] Code fragments shown: (1) Date date = new Date(); if (date.getHours() > 23 || date.getHours() < 5) { ContextWrapper.StartService(MainService); … (2) long last = db.query("LastConnectTime"); long current = System.currentTimeMillis(); if (current - last > 43200000) { SmsManager.sendTextMessage(); db.save("LastConnectTime", current); … The app will send an SMS when • phone signal strength changes (frequent) • current time is within 11 PM to 5 AM (not too frequent, user not around). Highlighted: if (date.getHours() > 23 || date.getHours() < 5) {, android.intent.action.SIG_STR
  • 53. Example of malicious app. [Call-graph nodes: ActionReceiver.OnReceive(), MainService.OnCreate(), MainService.b(), DummyMainMethod(), SplashActivity.OnCreate(), SendTextActivity$4.onClick(), SendTextActivity$5.run(), ContextWrapper.StartService(), SmsManager.sendTextMessage()] Code fragments shown: (1) Date date = new Date(); if (date.getHours() > 23 || date.getHours() < 5) { ContextWrapper.StartService(MainService); … (2) long last = db.query("LastConnectTime"); long current = System.currentTimeMillis(); if (current - last > 43200000) { SmsManager.sendTextMessage(); db.save("LastConnectTime", current); … The app will send an SMS when • user enters the app (frequent) • (current time - time when last msg sent) > 12 hours (not too frequent). Highlighted: if (current - last > 43200000) {
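The two guards in the example slides can be isolated as pure functions to make the "frequent, but not too frequent" balance concrete. This is a minimal sketch, not code from the app: `TriggerGuards`, `inNightWindow`, and `rateLimitElapsed` are hypothetical names. One assumption is labeled in the code: since `Date.getHours()` returns 0 through 23, a literal `> 23` test never holds, so the sketch uses `>= 23` to match the 11 PM to 5 AM window the slide describes.

```java
// Hypothetical helper class; names are illustrative, not taken from the app.
final class TriggerGuards {
    // 12 hours in milliseconds; equals the literal 43200000 in the slide code.
    static final long TWELVE_HOURS_MS = 12L * 60 * 60 * 1000;

    // Context factor 1: the hour of day (0-23).
    // ASSUMPTION: ">= 23" is used so that 11 PM is inside the window.
    static boolean inNightWindow(int hourOfDay) {
        return hourOfDay >= 23 || hourOfDay < 5;
    }

    // Context factor 2: time elapsed since the last SMS was sent.
    static boolean rateLimitElapsed(long lastSentMs, long nowMs) {
        return nowMs - lastSentMs > TWELVE_HOURS_MS;
    }

    public static void main(String[] args) {
        System.out.println("23:00 in window? " + inNightWindow(23));
        System.out.println("12:00 in window? " + inNightWindow(12));
        long now = System.currentTimeMillis();
        System.out.println("13 h since last send? "
                + rateLimitElapsed(now - 13L * 60 * 60 * 1000, now));
    }
}
```

Both guards read environment state (clock, stored timestamp) rather than user input, which is what the later slides call context factors.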
  • 54. AppContext • Capture differentiating characteristics with contexts of security-sensitive behavior. • Leverage contexts in machine learning (classification) to differentiate malware and benign apps. Yang et al. AppContext: Differentiating Malicious and Benign Mobile App Behavior Under Contexts. ICSE 2015. http://taoxie.cs.illinois.edu/publications/icse15-appcontext.pdf
  • 55. Techniques • Abstraction for expressing the context of security-sensitive behaviors, e.g., a call to a permission-protected API method – To precisely capture the differentiating characteristics • Inter-component analysis for extracting contexts – To identify entry points for activation events – To connect control flows for context factors
  • 56. Context of security-sensitive behavior • Activation events – E.g., signal strength changes • Context factors – Environmental attributes that affect whether the security-sensitive behavior is invoked – E.g., current system time
  • 57. AppContext - Workflow CG: Call Graph; ECG: Extended CG; RICFG: Reduced ICFG
  • 58. Context-based Security-Behavior Classification. Pipeline: Transforming, Labelling, Training, Classifying. Step 1. Transform the contexts of each app’s security behavior into features • Context1: (Event: Signal strength changes), (Factor: Calendar) • Context2: (Event: Entering app), (Factor: Database, SystemTime) • Context3: (Event: Clicking a button)
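One plausible way to picture Step 1 is a binary (one-hot style) encoding of each context over a shared vocabulary of events and factors. The slide does not specify the encoding, so this is an assumption; `ContextFeatures`, the `event:`/`factor:` token format, and the token names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical encoder: each context is a list of "event:..." / "factor:..." tokens.
final class ContextFeatures {
    // Collect every token seen in any context, in sorted order.
    static List<String> vocabulary(List<List<String>> contexts) {
        TreeSet<String> vocab = new TreeSet<String>();
        for (List<String> context : contexts) {
            vocab.addAll(context);
        }
        return new ArrayList<String>(vocab);
    }

    // Binary vector: 1 where the context contains the vocabulary token, else 0.
    static int[] encode(List<String> context, List<String> vocab) {
        int[] vector = new int[vocab.size()];
        for (int i = 0; i < vocab.size(); i++) {
            vector[i] = context.contains(vocab.get(i)) ? 1 : 0;
        }
        return vector;
    }

    public static void main(String[] args) {
        // The three contexts listed on the slide, with made-up token spellings.
        List<String> c1 = Arrays.asList("event:SignalStrengthChanged", "factor:Calendar");
        List<String> c2 = Arrays.asList("event:EnterApp", "factor:Database", "factor:SystemTime");
        List<String> c3 = Arrays.asList("event:ClickButton");
        List<List<String>> all = Arrays.asList(c1, c2, c3);

        List<String> vocab = vocabulary(all);
        System.out.println(vocab);
        for (List<String> c : all) {
            System.out.println(Arrays.toString(encode(c, vocab)));
        }
    }
}
```

Vectors of this shape are what a classifier such as an SVM would consume in the training and classifying stages.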
  • 59. Context-based Security-Behavior Classification (Cont.) Pipeline: Transforming, Labelling, Training, Classifying • Labelling: systematically label security-sensitive method calls as malicious based on existing malware signatures • Training and classifying: Support Vector Machine (SVM) – SVM is resilient to over-fitting – SVM can handle high-dimensional data such as our context factor data (dimension reduction may be another option)
  • 60. Evaluation Subjects: 846 Android apps • 633 benign apps: randomly selected from popular apps on Google Play. • 202 malicious apps: collected from three different malware datasets (Genome, VirusShare, Contagio). • 11 open-source apps: randomly selected from F-Droid.
  • 61. Research Questions • RQ1: How effective is AppContext in identifying malware? • RQ2: How do activation events and context factors in our context definition contribute to the effectiveness of malware identification? • RQ3: How accurate is our static analysis in inferring contexts?
  • 62. Evaluation Complete Context has higher precision (87.7%) and recall (95.0%)
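For readers who want the definitions behind these two figures, the sketch below computes precision and recall from confusion-matrix counts and the harmonic mean (F1) from the reported percentages. The F1 value is derived here by arithmetic and is not stated on the slide; the class name `Metrics` is hypothetical.

```java
import java.util.Locale;

// Standard definitions; tp/fp/fn are true-positive, false-positive, false-negative counts.
final class Metrics {
    static double precision(int tp, int fp) {
        return tp / (double) (tp + fp);
    }

    static double recall(int tp, int fn) {
        return tp / (double) (tp + fn);
    }

    // Harmonic mean of precision and recall.
    static double f1(double precision, double recall) {
        return 2 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        // Precision and recall as reported on the slide for the complete context.
        double f = f1(0.877, 0.950);
        System.out.println(String.format(Locale.ROOT, "F1 = %.3f", f)); // prints "F1 = 0.912"
    }
}
```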
  • 63. Evaluation Activation events effectively help identify malicious method calls without context factors
  • 64. Evaluation Context factors effectively help identify malicious behaviors triggered by UI events or malicious behaviors with no activation events
  • 65. Limitations • False negatives – Malicious behaviors triggered by UI events and without context factors. • UI events give less indication of the maliciousness of a security-sensitive method call • False positives – Reflective method calls and dynamic code loading in benign apps. – Uncommon security-sensitive method calls used in benign apps.