Understanding Android Fragmentation with Topic
Analysis of Vendor-Specific Bugs
Dan Han, Chenlei Zhang, Xiaochao Fan, Abram Hindle, Kenny Wong and Eleni Stroulia
Department of Computing Science
University of Alberta
Outline
• Introduction
• Previous Work
• Methodology
• Comparing Topic Models
• Fragmentation Topic Analysis
• Fragmentation Discussion
• Conclusion
Introduction
Hardware-Based Fragmentation
http://www.android.com/devices/?country=all
Software-Based Fragmentation
http://www.blackeco.com/petites-speculations-autour-de-la-prochaine-version-dandroid/
Why do we care?
• More than 20 Android device manufacturers
• Multiple Android versions
• Hundreds of different Android devices
• Stakeholders: developers, users
What do we do in this study?
• Goal: search for evidence of Android fragmentation within the Android ecosystem, based on Android bug reports
• Approach: apply topic models and topic analysis
Previous Work
Topic Model and Topic Analysis
Topic Model: a statistical model for discovering abstract topics that occur in a collection of documents.
• Latent Dirichlet Allocation (LDA)
Topic Analysis: extracting and evaluating the topics from a corpus of text documents through topic models
• Traceability recovery: Asuncion et al., Lukins et al., Hindle et al.
• Feature location: Marcus et al., Poshyvanyk et al., Grant et al.
• Software evolution and trend analysis: Thomas et al., Martie et al.
Differences between previous work and our work
• Previous work applied unsupervised topic models, e.g. LDA
• We applied Labeled-LDA, a supervised topic model, to analyze topic evolution
• We compared the performance of LDA and Labeled-LDA on our dataset
LDA and Labeled-LDA
Labeled-LDA
• So far a novel method in software engineering
• Requires manual labeling
• Supervised topic modeling algorithm
• Only predicts the relevance between each document and its labels
LDA
• Well studied in software engineering
• Unsupervised topic modeling algorithm
• Needs the documents and the number of topics N as input
• Predicts the relevance between each document and all N topics
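The contrast between the two models can be sketched in a few lines of Python. This is only an illustrative toy, not the tooling used in the study: the function name `gibbs_topic_model`, the hyperparameters `alpha` and `beta`, and the three tiny "bug reports" are all assumptions made for the example. It uses collapsed Gibbs sampling and treats Labeled-LDA as LDA in which each document may only use the topics that correspond to its labels.

```python
import random
from collections import defaultdict

def gibbs_topic_model(docs, allowed_topics, n_topics, iters=200,
                      alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler (hypothetical helper, not the study's code).

    allowed_topics[d] lists the topic ids document d may use:
    plain LDA allows every topic, Labeled-LDA only the document's labels.
    Returns a document-topic matrix whose rows sum to 1.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [defaultdict(int) for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    # Random initial topic for every word, restricted to the allowed topics.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.choice(allowed_topics[d])
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment, then resample it.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in allowed_topics[d]
                ]
                z = rng.choices(allowed_topics[d], weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    # Smoothed relevance of each allowed topic; disallowed topics stay at 0.
    matrix = []
    for d, doc in enumerate(docs):
        ks = allowed_topics[d]
        denom = len(doc) + alpha * len(ks)
        row = [0.0] * n_topics
        for k in ks:
            row[k] = (doc_topic[d][k] + alpha) / denom
        matrix.append(row)
    return matrix

# Made-up miniature corpus; the labels are illustrative only.
labels = ["bluetooth", "keyboard"]
docs = [
    "bluetooth headset connect car".split(),
    "keyboard typing text landscape".split(),
    "bluetooth headset keyboard typing".split(),
]
doc_labels = [["bluetooth"], ["keyboard"], ["bluetooth", "keyboard"]]
index = {name: k for k, name in enumerate(labels)}

# Labeled-LDA: each document is scored only against its own labels.
labeled = gibbs_topic_model(docs, [[index[l] for l in ls] for ls in doc_labels], n_topics=2)
# LDA: each document is scored against all N topics (here N = 2).
plain = gibbs_topic_model(docs, [[0, 1]] * len(docs), n_topics=2)
```

In the Labeled-LDA run a document labeled only "bluetooth" has zero relevance to "keyboard" by construction, whereas the LDA run assigns every document some relevance to every topic, and its topics carry no names until someone labels them afterwards.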
Difference between a topic and a label
Topic:
• A word distribution extracted from bug reports by topic models
Label:
• The annotation of a document
Methodology
Case Study
• Android bug reports, 2008-2011, 20,000+
• Vendor-specific bug reports (HTC vs. Motorola)
  - HTC: 1503
  - Motorola: 1058
http://www.puremobile.co.uk/insiderblog/wp-content/uploads/2011/08/Motorola-Mobile_logo.jpg
http://www.finestdaily.com/news/htc-jetstream-to-be-launched-on-september-6t.html/attachment/htc_cmyk_white_strapline
Create labels for Android bug reports
• Feature-oriented labels for Android bug reports
• Android labels
  - Features in Android versions, e.g. Language, Bluetooth
  - Popular applications, e.g. Google Maps, Gmail
  - Hardware of Android devices, e.g. Keyboard, GPS
Label Android bug reports
• 60 person-hours of manual labeling effort
• Labeled bug reports are now public
• HTC: 72 labels in total
• Motorola: 58 labels in total
Apply Labeled-LDA
Apply LDA
• Try a range of N to find the most distinct topics
• Label each topic using our manual labels for the bug reports of HTC and Motorola
• 2 hours of labeling effort
Comparing LDA and Labeled-LDA
• Each topic model generates a document-topic matrix
• Determine whether LDA generates results similar to those of Labeled-LDA
• Compute and compare the Jaccard similarity of the documents related to each topic generated by LDA and Labeled-LDA
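The comparison step can be sketched as follows. The slides do not say how a document is judged "related" to a topic, so the relevance cut-off `threshold`, the helper names, and the two small matrices below are assumptions chosen for illustration. The Jaccard similarity of two document sets A and B is |A ∩ B| / |A ∪ B|.

```python
def related_docs(doc_topic, topic, threshold):
    """Ids of documents whose relevance to `topic` exceeds the (assumed) cut-off."""
    return {d for d, row in enumerate(doc_topic) if row[topic] > threshold}

def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b|; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def pairwise_jaccard(lda, llda, threshold=0.25):
    """Matrix whose entry [i][j] compares LDA topic i with Labeled-LDA topic j."""
    n_lda, n_llda = len(lda[0]), len(llda[0])
    return [
        [jaccard(related_docs(lda, i, threshold), related_docs(llda, j, threshold))
         for j in range(n_llda)]
        for i in range(n_lda)
    ]

# Made-up document-topic matrices: 4 documents (rows) by 2 topics (columns).
lda_matrix = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.05, 0.95]]
llda_matrix = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.0, 1.0]]

similarity = pairwise_jaccard(lda_matrix, llda_matrix)
# Assuming LDA topic i has already been matched to label i, the diagonal
# holds the similarities between corresponding topics.
mean_diagonal = sum(similarity[i][i] for i in range(len(similarity))) / len(similarity)
```

A diagonal value near 1 would mean that both models relate almost the same bug reports to a given label; values near 0 mean the two models disagree about which reports belong to it.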
Comparing Topic Models
Comparing Topic Models in HTC
[Figure: pairwise Jaccard similarity between each topic in LDA and Labeled-LDA; axes: Labeled-LDA, LDA]
Diagonal Entries in HTC
[Figure: diagonal entries of the HTC similarity matrix; axes: Labeled-LDA, LDA]
Comparing Topic Models in Motorola
[Figure: pairwise Jaccard similarity between each topic in LDA and Labeled-LDA; axes: LDA, Labeled-LDA]
Conclusion of comparing LDA and Labeled-LDA
• Mean Jaccard similarities of the diagonal entries are 0.2 for HTC and 0.08 for Motorola
• The numbers of bug reports related to the same labels in LDA and Labeled-LDA are different (χ² tests: p<0.01) for both HTC and Motorola
• Labeled-LDA produced more feature-relevant topics than LDA
Topic Analysis
Categorized Topics
[Figure: HTC and Motorola topics categorized into common topics and unique topics]
Common Topics
Both vendors share many identical topic words
Label: bluetooth
HTC: bluetooth, headset, car, connect, device, connection, version, data, app, desire, 2.2, work, connects, behavior, 2.1
Motorola: bluetooth, headset, droid, device, connected, connect, devices, calls, car, issue, connection, 2.2, car, pair, time
[Figure: relevance of common topic “bluetooth” in HTC and Motorola; annotations: Android 2.1, Android 2.2]
Common Topics
Topics of each vendor tend to have vendor-specific terms
Label: display
HTC: screen, version, desire, behavior, app, home, number, code, final, press, sure, user, black, new, power
Motorola: droid, screen, button, correct, home, display, behavior, landscape, 2.1, menu, bar, xoom, device, user, status
http://www.motorola.com/xoom
http://en.wikipedia.org/wiki/Motorola_droid
http://en.wikipedia.org/wiki/HTC_Desire
Unique Topics in HTC
Label: keyboard
HTC: keyboard, input, text, key, version, number, typing, on-screen, mode, field, landscape, virtual, keys, type, message
Motorola: keyboard, droid, keys, text, press, space, box, open, device, key, app, software, 2.0.1, landscape
[Figure: relevance of unique topic “keyboard” in HTC; annotations: Android 1.5, Android 2.1]
Unique Topics in Motorola
Label: GPS
HTC: gps, data, position, location, maps, google, time, lock, wrong, icon, turn, home, latitude, unit, tag, available
Motorola: maps, gps, google, app, droid, location, application, navigation, map, device, traffic, time, upgrade, turn, route
[Figure: relevance of unique topic “GPS” in Motorola; annotation: Android 2.2]
Discussion
Fragmentation Discussion
• Software-Based Fragmentation
  - New features and changes contribute to the bug reports
  - Difficult to test across all vendors and product lines
[Figure: relevance of common topic “bluetooth” in HTC and Motorola; annotations: Android 2.1, Android 2.2]
Fragmentation Discussion
• Hardware-Based Fragmentation
  - Different product lines were associated with different topics
  - Evident from differing bug topics and product-specific issues
Label: display
HTC: screen, version, desire, behavior, app, home, number, code, final, press, sure, user, black, new, power
Motorola: droid, screen, button, correct, home, display, behavior, landscape, 2.1, menu, bar, xoom, device, user, status
Conclusion
• Found how fragmentation is manifested within Android between HTC and Motorola
  - Incompatibility issues
  - Portability issues
• Compared the performance of Labeled-LDA and LDA
  - Labeled-LDA produced more feature-relevant topics than LDA
  - Labeled-LDA needs more manual effort
• Labeled dataset: http://softwareprocess.es/static/Fragmentation.html (http://goo.gl/SwGDT)
• Could be useful for building project dashboards, process mining and software process recovery


Editor's Notes

  1. According to the time-series evolution of each topic's relevance, we categorized the topics into common topics and unique topics. The common topics are labels shared between both vendors that show a similar evolution of average relevance over time. The unique topics have significantly different relevance over time and are more specific to one vendor than to the other.
  2. In our study, there are 14 common topics shared between the vendors. Let’s take a look at one of the most frequent topics, bluetooth. This table shows the “bluetooth” topic and, for each vendor, its top 15 terms as generated by Labeled-LDA. We can see that both vendors share many identical topic words. The bottom figure shows the evolution of the average relevance of “bluetooth” in HTC and Motorola over time. From the figure, we can observe that the bluetooth topic has a cross-vendor peak with the release of both Android 2.1 and 2.2.
  3. Another feature of common topics is that the topics of each vendor tend to have vendor-specific terms. By reading the bug reports, we found that in HTC, 9 topics share the term “desire”, which refers to the HTC Desire phone. Motorola topics share the terms “Droid” and “Xoom”, which refer to the Motorola Droid phone and the Xoom tablet. This suggests that different product lines face different issues. For example, there are display issues between the Motorola Droid phone and the Xoom tablet because of the screen size.
  4. Keyboard is one of the unique topics in HTC. We can see that there are two peaks in the figure. By reading the bug reports, we found that most HTC devices have no physical keyboard, so the virtual keyboard feature is frequently used by HTC users. In contrast, Motorola’s Android devices tend to have physical keyboards, which might explain the lack of keyboard bug activity in the Motorola bug reports. The figure shows that HTC keyboard relevance peaks and then drops off, while keyboard relevance in Motorola is steady. This behavior suggests that hardware and software configuration dictate the importance of the keyboard topic.
  5. GPS is one of the unique topics in Motorola. By reading the bug reports and the brief history of Android releases, we found that Motorola and HTC used different GPS software before Android 2.2. The Motorola Droid smartphone uses Google Maps as its GPS navigation service from Android 2.2 onward. As a result, this new feature contributes three peaks for Motorola in the figure.
  6. In our study, we found how fragmentation is manifested within Android by comparing and contrasting the bug reports of two Android smartphone vendors: HTC and Motorola. Based on Labeled-LDA topic analysis, we found that different topics tended to be associated with different products, providing further evidence of vendor-specific fragmentation. As a result, hardware-based fragmentation in Android is evident from differing bug topics and product-specific issues. Moreover, the word lists of the different topics show that software-based and hardware-based fragmentation within Android appears through incompatibility issues and portability issues. We also compared the performance of Labeled-LDA and LDA, and found that Labeled-LDA produced more feature-relevant topics than LDA. However, applying Labeled-LDA needs more effort than applying LDA. You can download our labeled dataset from the following link. Finally, our findings can be used to build project dashboards and to support process mining and software process recovery.
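The "average relevance over time" curves mentioned in these notes could be computed along the following lines. This is a sketch under stated assumptions: the monthly window, the choice to average over all reports filed in a window, the helper name `average_relevance_by_month`, and the sample reports are all hypothetical and are not taken from the study.

```python
from collections import defaultdict
from datetime import date

def average_relevance_by_month(reports, topic):
    """Mean relevance of `topic` per (year, month) window.

    `reports` is an iterable of (filing_date, {topic: relevance}) pairs.
    Reports that do not mention the topic count as relevance 0.0
    (an assumption of this sketch).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for filed, relevance in reports:
        key = (filed.year, filed.month)
        sums[key] += relevance.get(topic, 0.0)
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sorted(sums)}

# Made-up bug reports with per-topic relevance scores.
reports = [
    (date(2010, 1, 5), {"bluetooth": 0.8}),
    (date(2010, 1, 20), {"keyboard": 1.0}),
    (date(2010, 5, 21), {"bluetooth": 0.6, "display": 0.4}),
    (date(2010, 5, 25), {"bluetooth": 1.0}),
]
bluetooth_curve = average_relevance_by_month(reports, "bluetooth")
```

Computing one such curve per vendor for the same label and comparing their shapes is one plausible way to separate common topics (similar curves) from unique topics (clearly different curves).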