Dynamic Social Network Analysis (and more!) with eResearch Tools

    Dynamic Social Network Analysis (and more!) with eResearch Tools - Presentation Transcript

    1. Dynamic Social Network Analysis (and more!) with eResearch Tools
      • Andrea Wiggins, iSchool @ Syracuse University
      • 21 July 2008
    2. eResearch for FLOSS
      • An approach to research using cyberinfrastructure
      • Collaborative and transparent, like FLOSS
      • Large-scale shared data sets
        • FLOSSmole
        • Notre Dame SourceForge dumps
        • CVSanalY
        • Etc…
      • Uses tools and analyses that allow sharing among researchers to support open science ideals
        • Taverna Workbench
        • MyExperiment.org
    3. Using Taverna
      • Scientific analysis workflow tool
        • Open source development led by the myGrid team
        • Target users are the UK life sciences community
      • Create analysis workflows by connecting modular components through input/output ports
        • Produces (rigorous) analyses that are replicable, self-documenting, and easy to share
        • Components include WSDL SOAP web services, Beanshell, RShell, and local Java shims
      • Collaboratively developing our workflows
    4. Replicating FLOSS Research as eResearch
      • Replicating a selection of FLOSS papers and presentations, currently in progress
      • Demonstrating utility and viability of eResearch approaches for FLOSS and social science
      • Building reusable, customizable analysis components specific to FLOSS research, e.g. for data selection, sociomatrix generation for SNA, etc.
      • Extending the original research analysis by parameterization (inputs, thresholds) and implementing “future work” suggestions of authors (plus our own ideas, of course)
    5. Social dynamics of FLOSS communications
      • Replication of Howison, Inoue & Crowston, 2006
        • Compute dynamic network centrality of projects from trackers for 120 projects
      • Extension
        • Added exponentially-decayed edge weighting function (needs sensitivity testing)
        • Made sliding window adjustable
        • Can apply to any threaded communication venue for which data is available
        • Completed: all venues for 2 projects; queued: 216 projects with 635 venues!
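
The extension described on this slide can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the workflow's actual implementation: the function names, the `(sender, recipient, day)` message layout, the half-life form of the decay, the `min_weight` cutoff, and the choice of Freeman out-degree centralization as the centrality measure are all hypothetical.

```python
from collections import defaultdict


def decayed_edge_weights(messages, window_end, window_days, half_life_days):
    """Sum exponentially decayed weights per (sender, recipient) pair.

    messages: iterable of (sender, recipient, day) tuples (assumed layout).
    Only messages with 0 <= age < window_days are counted, where
    age = window_end - day. Each message contributes 0.5 ** (age / half_life).
    """
    weights = defaultdict(float)
    for sender, recipient, day in messages:
        age = window_end - day
        if 0 <= age < window_days and sender != recipient:
            weights[(sender, recipient)] += 0.5 ** (age / half_life_days)
    return dict(weights)


def outdegree_centralization(weights, min_weight=0.0):
    """Freeman out-degree centralization on the thresholded, binary network.

    An edge is kept when its decayed weight is strictly above min_weight.
    Returns a value in [0, 1]; 1.0 corresponds to a pure out-star.
    """
    edges = {edge for edge, w in weights.items() if w > min_weight}
    nodes = {node for edge in edges for node in edge}
    n = len(nodes)
    if n < 2:
        return 0.0
    outdegree = {node: 0 for node in nodes}
    for sender, _recipient in edges:
        outdegree[sender] += 1
    top = max(outdegree.values())
    return sum(top - d for d in outdegree.values()) / ((n - 1) ** 2)


def sliding_centralization(messages, first_end, last_end, step_days,
                           window_days, half_life_days, min_weight=0.0):
    """Centralization for each window end from first_end to last_end inclusive."""
    messages = list(messages)
    series = []
    end = first_end
    while end <= last_end:
        w = decayed_edge_weights(messages, end, window_days, half_life_days)
        series.append((end, outdegree_centralization(w, min_weight)))
        end += step_days
    return series
```

Because the window length, step, half-life, and cutoff are all parameters, the same functions can be rerun over a grid of values, which is one way the sensitivity testing mentioned above could be approached.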
    6. Workflow for Dynamic SNA
    7. Dynamic SNA Across FLOSS Communication Channels
      • Clearly a lot of variation across channels (user, developer & trackers), no easily observed patterns except overall trend toward decentralization
      • Implications: carefully match theoretical constructs to data sampling, as different venues are very likely to yield different results, which significantly impacts interpretations
    8. “Do the Rich Get Richer?”
      • Replication of OSCon 2004 presentation by Conklin
        • Demonstrate scale-free distribution of developers among projects
      • Almost there
        • A little more analysis to replicate
      • Hoping to extend to dynamic analysis of preferential attachment
        • Showing change to project sizes over time
        • Comparing evolution and growth across repositories
    9. Workflow for Rich Get Richer
      • Using a single FLOSSmole summary statistic
      • Very simple workflow, can expand analysis considerably
      • Analysis of over 65K projects completes in under 3 minutes!
      [Figure: Scale-free Developer Distribution in FLOSS]
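
A rough sketch of the kind of analysis this workflow performs, assuming the summary statistic is a developer count per project. The function names are hypothetical, and the least-squares fit on log-log axes is only a quick diagnostic of a heavy-tailed distribution, not a rigorous test for a power law and not necessarily the method used in the original presentation.

```python
import math
from collections import Counter


def size_distribution(dev_counts):
    """Map team size -> number of projects with that many developers."""
    return Counter(c for c in dev_counts if c > 0)


def loglog_slope(distribution):
    """Least-squares slope of log(frequency) against log(team size).

    A roughly straight line with negative slope on log-log axes is
    consistent with (but does not prove) a scale-free distribution.
    """
    points = [(math.log(size), math.log(freq))
              for size, freq in distribution.items()]
    if len(points) < 2:
        raise ValueError("need at least two distinct team sizes")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx
```

Calling `loglog_slope(size_distribution(counts))` on a list of per-project developer counts returns the fitted slope; repeating this on snapshots from different dates would be one way to start on the dynamic analysis mentioned in the previous slide.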
    10. “Identifying success and tragedy of FLOSS Commons”
      • Replication of English & Schweik, 2007
        • Classification of project success by stage of growth for 110K projects as of August 2006
        • Requires data from 2 repositories, FLOSSmole & ND
      • Extension
        • Parameterized all thresholds, making sensitivity analysis possible
        • Added 2 more options for a criterion test, one suggested by the authors in the article
        • Limitation: slightly less available data in FLOSSmole, 94K projects as of April 2005
    11. Workflow for Success-Abandonment Classification
    12. Classifying FLOSS Projects
      • Very complex data requirements; meshing across repositories
      • Difficult to scale and resource intensive
      • Already using this workflow for project sampling
      • For small (non-random) sample of 54 projects:
        • 64% growth, 17% initiation, 19% null (i.e. missing data)
          • Indeterminate Growth: 18.9%
          • Success Growth: 39.6%
          • Tragedy Growth: 7.5%
          • Other: 34%
      amsn,downloaded,growth,enough.releases,active,ok.release.rate,true,SG
      anjuta,downloaded,growth,enough.releases,active,ok.release.rate,false,SG
      anon,downloaded,growth,enough.releases,inactive,fast.release.rate,false,TG
      etc…
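
To illustrate what parameterized thresholds make possible, here is a minimal sketch of a growth-stage classifier. Everything in it is an assumption for illustration: the field names, the default threshold values, the decision rule, and the `IG` abbreviation are hypothetical and are not taken from English & Schweik or from the actual workflow. The release-rate criterion and the true/false column seen in the sample output are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical defaults, chosen only so the example runs.
    min_downloads: int = 1
    min_releases: int = 3
    max_days_inactive: int = 365


@dataclass(frozen=True)
class Project:
    # Hypothetical fields; real data would be meshed from two repositories.
    name: str
    downloads: int
    releases: int
    days_since_last_activity: int


def classify(project, t=Thresholds()):
    """Return 'INITIATION', 'SG', 'TG', or 'IG' under the assumed rule."""
    if project.releases == 0:
        return "INITIATION"  # no release yet; not classified further here
    if project.days_since_last_activity > t.max_days_inactive:
        return "TG"  # tragedy in growth: treated as abandoned
    if (project.downloads >= t.min_downloads
            and project.releases >= t.min_releases):
        return "SG"  # success in growth
    return "IG"  # indeterminate in growth


def sensitivity(projects, min_release_values):
    """Count 'SG' classifications for each candidate min_releases value."""
    return {
        m: sum(classify(p, Thresholds(min_releases=m)) == "SG" for p in projects)
        for m in min_release_values
    }
```

Running `sensitivity(projects, range(1, 6))` shows how the number of projects labeled successful shifts as a single threshold moves, which is the kind of check that hard-coded thresholds would not allow.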
    13. Future Directions
      • Replication of “Evolution & Growth in Large Libre Software Projects” by Robles et al., 2005
      • Prototyping OWL ontology of FLOSS communication data, already in use with RDF & SPARQL
      • Cross-linking data, analyses, and papers
      • Increasing scale of analyses to thousands of projects
      • Extending analyses, sensitivity testing to strengthen findings
      • Building reusable analysis components to share, enabling cumulative research
    14. Thanks!
      • More at floss.syr.edu/publications/


    A presentation for the OSS Watch Expert Workshop on …


    © All Rights Reserved
