MetaScience: Holistic Approach for Research Modeling and Analysis


This article (presented at the ER2016 conference) proposes a conceptual schema providing a holistic view of conference-related information (e.g., authors, papers, committees and topics). This schema is automatically and incrementally populated with data available online.

A number of data analysis and visualization algorithms are applied on top of this data to provide meaningful information to prospective authors, PC members and conference steering committees.

Published in: Software

  1. 1. MetaScience: An Holistic Approach for Research Modeling. Valerio Cosentino, Javier Cánovas & Jordi Cabot. ICREA – Open University of Catalonia. @softmodeling
  2. 2.
  3. 3. Both need to work together to have a healthy community
  4. 4. How can we use conceptual models to help?
  5. 5. Nurture your Community
  6. 6. Data vs. information; non-trivial analysis (integration of different sources); conceptual schema as unifying representation; separation of data collection from data analysis
  7. 7. Conceptual Schema
  8. 8. DB schema
  9. 9. DB schema
  10. 10. (image: Flickr/skepticalview)
  11. 11. Conferences and papers: DBLP Table mapping
  12. 12. PCs and topics: web scraping
  13. 13. + a variety of other (overlapping) sources & formats
  14. 14. Community Analysis for everybody (image: Flickr/leg0fenris)
  15. 15. Single metrics, e.g. authors per paper (plot annotation: ρ = 0.88)
  16. 16. Single metrics, e.g. seniority of researchers
  17. 17. Graph-based analysis: co-authorship graphs; prolific authors; frequent collaborations; clusters; betweenness (bridge authors)
  18. 18. Historical / temporal view, e.g. PC analysis. Are PC members active in the conference? 60 out of the 99 members from 2015 did not publish in the previous 3 editions. Are active members being ignored? Only 7 researchers published constantly from 2012 to 2014; 3 of them were PC members in 2015, while the remaining 4 were not.
  19. 19. Newcomers: % of papers with all authors new to the main track; % of papers with all authors new to the conference
  20. 20. NL / topic analysis, e.g. top-30 keywords for the last 10 editions, from paper abstracts and from topics of interest
  21. 21. And there’s much more... Collaborating with the complex systems group: rich club ordering, small-world behaviour, inter-conference analysis…
  22. 22. CORE analysis (shameless plug)
  23. 23. Tool Support (image: Flickr/JDHancock)
  24. 24. Gitana: Integrated analysis (ER’15) of coding platforms, issue trackers, communication channels and code review tools
  25. 25.
  26. 26. Challenges (image: Flickr/TimPainter)
  27. 27. The conceptual schema (CS) keeps growing
  28. 28. Data Collection limitations
  29. 29. And even more challenges… Paper classification: no clear distinction between paper types; characteristics change from one edition to another (e.g. number of pages for short papers). Committee / topics data: conference edition web sites may no longer be available (partial solution: Wayback Machine); committee data is similar across conferences, but there is no common “standard”. Entity resolution: researchers can use different names (partial solution: DBLP provides aliases); researcher names may appear misspelled (mostly in committee data).
  30. 30. Let’s work together jordi.cabot@ @softmodeling modeling-
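
The single metrics, the PC activity check and the newcomers ratio mentioned in the slides can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the MetaScience implementation: the `Paper` record, the three function names and the toy data are all hypothetical, researcher names are assumed to be already resolved to unique identifiers, and editions are assumed to be yearly.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Paper:
    """Hypothetical, heavily simplified slice of a conference schema."""
    year: int
    authors: frozenset  # researcher identifiers, assumed already resolved


def authors_per_paper(papers, year):
    """Average number of authors per paper for one edition."""
    sizes = [len(p.authors) for p in papers if p.year == year]
    return mean(sizes) if sizes else 0.0


def inactive_pc_members(papers, pc_members, pc_year, window=3):
    """PC members of pc_year with no paper in the previous `window` editions."""
    recent_authors = set()
    for p in papers:
        if pc_year - window <= p.year < pc_year:
            recent_authors |= p.authors
    return set(pc_members) - recent_authors


def newcomer_paper_share(papers, year):
    """Fraction of papers in `year` whose authors all have no earlier paper."""
    earlier_authors = set()
    for p in papers:
        if p.year < year:
            earlier_authors |= p.authors
    current = [p for p in papers if p.year == year]
    if not current:
        return 0.0
    all_new = [p for p in current if not (p.authors & earlier_authors)]
    return len(all_new) / len(current)


# Toy data, invented for illustration only (not taken from the slides).
PAPERS = [
    Paper(2013, frozenset({"ana", "bo"})),
    Paper(2014, frozenset({"ana"})),
    Paper(2015, frozenset({"cy", "dee"})),
    Paper(2015, frozenset({"ana", "eli", "fay"})),
]
PC_2015 = {"ana", "bo", "gus"}

if __name__ == "__main__":
    print(authors_per_paper(PAPERS, 2015))
    print(inactive_pc_members(PAPERS, PC_2015, 2015))
    print(newcomer_paper_share(PAPERS, 2015))
```

Keeping the metrics as plain functions over already-collected records mirrors the separation of data collection from data analysis named on slide 6: the same functions would work regardless of whether the records came from DBLP or from scraped conference sites.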