Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Android Apps and User Feedback: A Dataset or Software Evolution and Quality Improvement

326 views

Published on

Nowadays, Android represents the most popular mobile platform
with a market share of around 80%. Previous research showed that data contained in user reviews and code change history of mobile apps represent a rich source of information for reducing software maintenance and development effort, increasing customers’ satisfaction. Stemming from this observation, we present in this paper a large dataset of Android applications belonging to 23 different apps categories, which provides an overview of the types of feed-back users report on the apps and documents the evolution of the related code metrics. The dataset contains about 395 applications of the F-Droid repository, including around 600 versions, 280,000 user reviews and more than 450,000 user feedback (extracted withspecific text mining approaches). Furthermore, for each app versioning our dataset, we employed the Paprika tool and developed several Python scripts to detect different code smells and compute code
quality indicators. The paper discusses the potential usefulness of
the dataset for future research in the field.

  • Be the first to comment

  • Be the first to like this

Android Apps and User Feedback: A Dataset or Software Evolution and Quality Improvement

  1. 1. AndroidAppsand User Feedback: ADatasetfor Software Evolutionand Quality Improvement WorkshoponAppMarketAnalytics -WAMA2017 G.Grano,A. Di Sorbo, F. Mercaldo, C.Visaggio G. Canfora, S. Panichella ✉ grano@ifi.uzh.ch giograno90
  2. 2. OUTLINE → Context → Motivationand relevance → Description ofthe dataset → Enabled Research Giovanni Grano @ s.e.a.l. 2
  3. 3. Google Play Store 3 millions ofapps 65 billions ofdownloads ~ 13$ billions revenues Giovanni Grano @ s.e.a.l. 3
  4. 4. AppStores → newparadigm rich source ofinformation: appdescriptions, changelogs user reviews Giovanni Grano @ s.e.a.l. 4
  5. 5. Findings from mobile store: DirectandActionable impacts forappdeveloperteams1 1 Martin, Sarro, Jia, Zhang, Harman, A Survey of App Store Analysis for Software Engineering, TSE 16 Giovanni Grano @ s.e.a.l. 5
  6. 6. Initialresearch focused on classification2 and summarization3 ofuser reviews 3 Di Sorbo, Panichella, Alexandru, Shimagaki, Visaggio, Canfora,Gall, What would users change in my app? Summarizing app reviews for recommending software changes, FSE 16 2 Panichella, Di Sorbo, Guzman, Visaggio, Canfora, Gall, How can i improve my app? Classifying user reviews for software maintenance and evolution, ICSME 15 Giovanni Grano @ s.e.a.l. 6
  7. 7. Evolution is guided by requests in user reviews4,5 stores lack in functionalities 5 Palomba, Linares-Vásquez, Bavota, Oliveto, Di Penta, Poshyvanyk, Lucia, User reviews matter! Tracking crowdsourced reviews to support evolution of successful apps, ICSME 15 4 Palomba, Salza, Ciurumelea, Panichella, Gall, Ferrucci, De Lucia, Recommending and localizing change requests for mobile apps based on user reviews, ICSE 17 Giovanni Grano @ s.e.a.l. 7
  8. 8. Our Dataset: ~ 280k user reviews 395application 22 code quality metrics 8 code smells Giovanni Grano @ s.e.a.l. 8
  9. 9. DatasetConstruction We built the dataset in two phases: → DataCollection FDroid + Google Play Store →Analysis Phase Classification + apk analsys Giovanni Grano @ s.e.a.l. 9
  10. 10. DataCollection → FDroid Crawler for meta-data ~ 1,929 apps → PlayStore Matching Removed not matched apps or older than 2014 Giovanni Grano @ s.e.a.l. 10
  11. 11. DataCollection → ReviewCrawler Mining reviews for 965 apps →Version Matching Based on release and post date → Filtering Version with less than 10 review. 288k reviews for 629 versions of 395 apps! Giovanni Grano @ s.e.a.l. 11
  12. 12. Analysis → User Reviews Classification » Two-level taxonomy → CodeAnalysis » Code Quality Indicators » Code Smells Giovanni Grano @ s.e.a.l. 12
  13. 13. User Reviews Classification URMTaxonomyModel Two-level taxonomy » Intention ARDOC6 : reviews classifier based on NLP+SA+TA » Topic SURF3 : topic classifier based on topics- related keyword and n-grams 3 Di Sorbo, Panichella, Alexandru, Shimagaki, Visaggio, Canfora,Gall, What would users change in my app? Summarizing app reviews for recommending software changes, FSE 16 6 Panichella, Sorbo, Guzman, Visaggio, Canfora, Gall, ARdoc: app reviews development oriented classifier, FSE 16 Giovanni Grano @ s.e.a.l. 13
  14. 14. Intention Categories Category Definition Information Giving Informs users or developers about app aspects Information Seeking Attemps to obtain informations or help Feature Requests Expresses idea, suggestions for enhancing the app Problem Discovery Unexpected behaviour or issues Other Anything not in previous categories Giovanni Grano @ s.e.a.l. 14
  15. 15. Examples Problem Discovery, Update/Version I can’t access my SD card with the new update which makes this app and the ery money I donated worthless. Feature Request, Feature Functionality I would give 5 stars if there was a way to move emails from the delete folder back into the inbox folder. Giovanni Grano @ s.e.a.l. 15
  16. 16. Some numbers... Topic Sentences FR PD IS IG Other App 117,409 4,879 11,089 1,600 11,943 87,898 GUI 37,620 3,381 5,034 705 3,560 2,4940 Contents 16,819 1,315 1,973 434 1,620 11,477 Download 7,853 333 1,346 363 830 4,981 Company 1672 118 190 57 152 1,155 Feature 173,847 15,480 27,810 4,342 14,972 111,243 Improvement 8,281 1,005 304 54 755 6,163 Pricing 4,016 142 216 62 559 3,037 Resources 3071 155 375 50 263 2228 Update/ Version 21,669 1,358 3,886 548 2,423 13,454 Model 22,044 1,308 3,397 459 2,055 14,825 Security 2,392 212 313 65 218 1,584 Other 189,784 630 2,019 1,402 2,842 182,891 TOTAL 606,477 30,316 57,952 10,141 42,192 465,876 Giovanni Grano @ s.e.a.l. 16
  17. 17. CodeAnalysisapks →apktool→ smali bytecode smali bytecode → python scripts → metrics available metrics @ githubwiki Giovanni Grano @ s.e.a.l. 17
  18. 18. Code Metrics → DimensionalMetrics → ComplexityMetrics → Object-Oriented Metrics →Android-Oriented Metrics Giovanni Grano @ s.e.a.l. 18
  19. 19. CodeAnalysis smali bytecode → Paprika→ smells » Blob Class (BLOB) » Swiss Army Knife (SAK) » Long Method (LM) » Complex Class (CC) » Internal Getter/Setter (IGS) » Member Ignoring Method (MIM) » No Low Memory Resolver (NLMR) » Leaking Inner Class (LIC) code smells @ githubwiki Giovanni Grano @ s.e.a.l. 19
  20. 20. Data Sharing → CSVFiles→ Relational Database Giovanni Grano @ s.e.a.l. 20
  21. 21. CSVFiles →Versions id, package name, category,version, release date 1125,org.tomdroid,Productivity,0.7.5,January 16 2014 → Reviews id, package name,text,category,version, release date, stars,version id 7bd1c70a-afc9-11e6-93ea-c4b301cdf627 org.tomdroid Don't sync it online. The whole app crashed. I had to reinstall it. Lost my notes. As long as you keep it in ur sd card it works good August 24 2015 3 1125 Giovanni Grano @ s.e.a.l. 21
  22. 22. → Sentences id,text, intention,topic 7bd1c70a-afc9-11e6-93ea-c4b301cdf627 Don't sync it online. INFORMATION GIVING, Other 7bd1c70a-afc9-11e6-93ea-c4b301cdf627 The whole app crashed. PROBLEM DISCOVERY, App 7bd1c70a-afc9-11e6-93ea-c4b301cdf627 I had to reinstall it. OTHER, App-Update/Version 7bd1c70a-afc9-11e6-93ea-c4b301cdf627 Lost my notes. OTHER, Contents-Feature/Functionality 7bd1c70a-afc9-11e6-93ea-c4b301cdf627 As long as you keep it in ur sd card it works good OTHER, Feature/Functionality Giovanni Grano @ s.e.a.l. 22
  23. 23. → User metrics id, package name, no.reviews, no.sentences, rating, FR, %FR, PD, % PD → Code Metrics id, package name, <allmetric names> → Code Smells id, package name, <allsmellnames> Giovanni Grano @ s.e.a.l. 23
  24. 24. RelationalDB Giovanni Grano @ s.e.a.l. 24
  25. 25. Research Opportunities
  26. 26. undestanding how code quality affects reviewsand rating for different categories Giovanni Grano @ s.e.a.l. 26
  27. 27. observe consequences on code quality while integrating user feedback intotheappcodebase Giovanni Grano @ s.e.a.l. 27
  28. 28. studyco-evolutiontrends of quality metrics, code smells and user feedback for sequentialreleases Giovanni Grano @ s.e.a.l. 28
  29. 29. thanks foryour attentiondataset@ GitHub ✉ grano@ifi.uzh.ch giograno90

×