Little eScience
Andrea Wiggins
June 18, 2009
Overview

• Background


• Exposition: Sociology of Science


  • Broad generalizations about science


• Example: FLOSS Research


  • Little science context for eScience research


• Expectations: What next?


                                           http://www.flickr.com/photos/pmtorrone/304696349/
My Background

• BA: Maths with economics


• Nonprofit & IT industry work


  • Adult literacy, nonprofit management support,
    professional theatre


  • Web analytics


• MSI: Human-computer interaction,
  complex systems & network science


• PhD: Information science & technology
Science

• Systematic investigation for the production of knowledge


  • Scientific method emphasizes reproducibility


  • Not all phenomena are reproducible...


• Many categories


  • Experimental, applied, social, etc.


  • Categories are not mutually exclusive


                                            http://www.flickr.com/photos/radiorover/419414206/
Paradigms & Revolutions

• Kuhn - Laws, theories, applications & instrumentation that create
  coherent traditions of scientific research


• Paradigms help us direct
  our research, but limit
  our view of the world


• New technologies can
  lead to scientific revolutions
  by revealing anomalies




                                           http://www.flickr.com/photos/weichbrodt/644302381/
Normal Science

• Kuhn - “normal science” is research based on broadly accepted scientific
  paradigms


• Shared paradigms are based on rules
  and standards for scientific practice


• Key requirement: agreement on
  focus and conduct of research


  • ∃(Grand Challenges) | Discipline




                                         http://www.flickr.com/photos/themadlolscientist/2421152973/
Big Science

• de Solla Price - “Big Science” is...


   • Inherently paradigmatic


   • Always normal science


• Produces detailed insights into
  the minutiae of phenomena
  studied in the paradigm




                                         http://www.flickr.com/photos/31333486@N00/1883498062/
Pre-paradigmatic Science

• Paradigms require agreement on...


  • Epistemology


  • Ontology


  • Methodology


• Most social sciences are pre-paradigmatic


  • Primarily exploratory research


  • Very little replication
                                           http://www.flickr.com/photos/askpang/327577395/
Little Science

• de Solla Price - “Little Science” is a
  romanticized precursor to Big Science,
  featuring lone, long-haired geniuses
  misunderstood by society, etc.


• If it’s not Big Science, it’s Little Science


   • Pre-paradigmatic and fraught with ambiguity


   • Often fundamentally exploratory


   • Epistemological/theoretical/methodological
     divergence among researchers
                                                 http://www.flickr.com/photos/mrjoax/2548045246/
Social Science

• Social science is real science: the goal is systematic knowledge production


• Focuses on the study of the social life of human groups and individuals


• IMHO, fundamentally more difficult than
  “hard” sciences due to infinite
  complexity of social phenomena


• Replicability is a major challenge
  with respect to scientific method


• Not all social science can or should
  aspire to replicability
                                            http://www.flickr.com/photos/smiteme/2379629501/
Normalizing Science

• Becoming a normal science requires community and convergence


  • ∃(community) ≠ ∃(agreement)


• Establishing grand challenges and
  methods are primary tasks
  of normalizing


• Resistance to change is pervasive




                                      http://www.flickr.com/photos/9036026@N08/2949211479/
Scientific Collaboration

• Collaboration requires common focus, if not also epistemology and ontology


• Challenging enough in normal sciences


• Harder in pre-paradigmatic research


• Economics: systemic disincentives to
  collaborate, versus potential benefits
  and ideals of science




                                          http://www.flickr.com/photos/richardsummers/542738965/
Big Science Collaboration

• LHC, CERN, etc.


  • Thousands of collaborators


  • Complex but coordinated,
    at least somewhat centralized


• Requires shared goals and resources,
  plus (lots of) communication


  • Only happens in normal sciences


                                         http://www.flickr.com/photos/8767020@N08/531355152/
Little Science Collaboration

• A professor & a grad student, give or take


   • Localized goals and resources


      • -> localized research practices


• Small research teams


   • Fundamentally difficult to achieve
     consensus that allows larger groups


   • Restricts the ability to obtain funding
     and undertake ambitious projects
                                               http://www.flickr.com/photos/lamazone/2735939345/
Scientific Collaboration Requirements

• Shared goals


  • Establishes focus of research


• Shared research resources


  • Both social and artifactual


  • Social aspects include
    training and community
    socialization

                                     we can has share?
                                    http://www.flickr.com/photos/ryanr/142455033/
Historical Research Artifacts

• Letters, Books, Journals, Lectures


• Also technologies: methods, instrumentation


• Sharing?


   • Recordkeeping is not always
     a researcher’s main priority


   • Without records, there’s not
     much to share except the
     research outputs

                                          http://www.flickr.com/photos/smailtronic/1535870363/
Today’s Research Artifacts

• Large scale datasets, scripts, software, workflows, papers, images, video,
  audio, annotations, ephemera, web sites...


   • “Research objects” -
     bundling all the pieces together


   • Hybrids of boundary objects
     and touchstones


• Technologies -> scientific revolution!


   • Open science

                                           http://www.flickr.com/photos/smiteme/2379630899/
Example: FLOSS Research

• Phenomenological & interdisciplinary


  • Software engineering,
    Information Systems,
    Anthropology,
    Sociology,
    CSCW,
    etc...


• Ethos


  • (Idealistic) combination
    of open source values
    and scientific values
                                     http://www.flickr.com/photos/themadlolscientist/2542236565/
FLOSS Phenomenon

• Free/Libre Open Source Software
 “Free as in speech, free as in beer” - liberty versus cost



  • Distributed collaboration
    to develop software


  • Volunteers and sponsored
    developers


  • Community-based model
    of development



                                                              http://www.flickr.com/photos/prawnwarp/541526661/
Typical FLOSS Research Topics

• Coordination and collaboration


• Growth and evolution (social and code)


• Code quality


• Business models and firm involvement


• Motivation, leadership, success


• Culture and community


• Intellectual property and copyright
                                           http://www.flickr.com/photos/eean/519258881/
What we study @ SU

• Social aspects of FLOSS


  • What practices make some distributed work teams more effective than
    others?


  • How are these practices developed?


  • What are the dynamics through which self-organizing distributed teams
    develop and work?
Sharing FLOSS Research Artifacts

• Community: Small but growing, maybe around 400 researchers worldwide,
  with lively face-to-face interaction but relatively low listserv activity


• Data: Lots of it, and readily available, though often difficult to use for several
  reasons


• Analyses and tools: Not quite as
  easy to get, but there if you can
  find them


• Papers: Repositories are as yet
  underdeveloped, but efforts are
  underway
                                          http://www.flickr.com/photos/12698507@N08/2762563631/
FLOSS Research Community

• Handful of small research groups, mostly in UK & Europe


   • Most often found in Software Engineering departments


• International conferences
  targeted to academics,
  developers, or both


   • OSS, ICSE, FOSDEM, etc.


• IFIP WG 2.13


                                          http://www.flickr.com/photos/steevithak/2883218362/
FLOSS Research Data

• Data sources include interviews, surveys, and ethnographic fieldwork


• Digital “trace” data: archival, secondary,
  by-product of work, easy to get but hard to use


• Repositories


   • Hosting “forges” like SourceForge,
     FreshMeat, RubyForge, etc.


• RoRs: Repositories of Repositories


   • Data sources for research
We Built It...

• Motivations


  • Stop hammering forge servers, getting entire campus IPs blocked...


  • Stop reinventing the wheel!


• Adoption


  • Shared data sources
    seeing increasing use


  • Next step is harder:
    sharing tools and workflows
                                          http://www.flickr.com/photos/circulating/997909242/
RoRs: FLOSSmole

• Multiple PIs @ Syracuse, Elon, & Carnegie Mellon
  One grad student @ SU (me), a couple of undergrads @ Elon


• Public access to 300+ GB data on


  • 300K+ projects from 8 repositories


  • Flat files & SQL datamarts


  • Released via SF & GC


• 5 TB allotment on TeraGrid @ SDSC
RoRs: FLOSSmetrics

• Produced by LibreSoft with academic and corporate partners


• Public access to data for 2800+ projects


• Analyzed & raw data from CVS, email, trackers


• Tools for:


   • calculating code metrics


   • parsing trackers


   • parsing email lists
RoRs: SRDA

• SourceForge Research Data Archive


  • One PI @ University of Notre Dame


  • One massive 300 GB+ SQL db of monthly dumps from SourceForge


     • Original obtuse structure,
       regular table deprecation,
       some documentation


  • Gated access: researchers only,
    condition of data release from SF
RoRs: Emerging Sources

• Ultimate Debian Database (UDD)


  • 300 MB compressed Postgres DB,
    produced by Debian community


  • Planning to add to FLOSSmole
FLOSS Research Analyses

• When available...


   • Bespoke Scripts


   • Taverna workflows
FLOSS Research Papers

• First, there was opensource.mit.edu


   • They no longer maintain it, and gave us the data


• Work-in-progress working papers
  repository at FLOSSpapers.org


• Essential viability problem is that
  repositories require long-term
  stewardship...


   • ...which requires long-term
     commitments of funding and
     personnel, not just volunteers
FLOSS Research Collaboration

• Multiple partners involved in producing FLOSSmole & FLOSSmetrics


• Federated data sources by choice,
  starting to develop ontologies


• As yet, a Little Science domain


   • Cross-institutional collaboration
     poses many challenges


   • Usual difficulties magnified by
     general lack of resources, both
     financial and human
Latest Initiatives

• Resource-oriented


  • Expanding resources: data, research artifacts, and pedagogical materials


  • DOIs: 10.4118/*


  • Semantic data
    interoperability


• Community-oriented


  • FLOSShub.org
Evangelizing eScience

• Made presentations at OSS conferences: well received, but hard to make
  converts for several reasons


• Tried to get other research group members to use Taverna: learning overhead
  is too high for most


• Submitted a paper on eScience
  to an IS conference: rejected
  because reviewers were unable
  to adequately evaluate eScience
  as a topic, as it’s too unfamiliar


• Currently just doing our work this
  way, as an exemplar
                                            http://www.flickr.com/photos/naezmi/2418745377/
Barriers to Uptake

• Lack of agreement in research focus, theory, methods; researcher isolation


• Bimodal distribution of requisite skills


   • “I can’t possibly do that! I can’t code!”


   • “Why bother? I can code my own.
     You should too; just use Python.”
     “Overheard” on Twitter:

     Friend #1: i HATE that openoffice automatically took
     over my "open with..." defaults.

     Friend #2: @Friend #1 <opensourcedeveloper> If you
     don't like it, then why don't you submit code to change
     the behavior!? </opensourcedeveloper>
                                                               http://www.flickr.com/photos/noner/1739876378/
What I had to learn to get this far

• Taverna


• A lot more Unix terminal & XML


• Relational DB management & SQL


• More R, plus packages and
  dependency management


• Java & Eclipse - just enough to
  write my own Beanshells


• SVN & SSH


• A little bit of OWL, RDF, & SPARQL


• I would not have taken this on if I
  had known what was in store, but
  once I got started, I was hooked
                                        http://www.flickr.com/photos/sashala/292868436/
Sociotechnical Engineering

• Tools are part of the solution, thanks to brilliant CS and SE people


• Social elements are the true barrier


   • Awareness of methods and
     benefits


   • Incentive systems


   • Resistance to change
     (paradigms again)


   • Proof of concept is difficult
                                              http://www.flickr.com/photos/pinprick/3117108495/
Using Taverna for Little eScience

• Implementing analysis is usually easy


• Data handling is almost always hard


   • All data are in SQL databases, with consistent IDs


   • Lots of data manipulation is required


• Avoiding web services as much as possible


   • Infrastructure and resources are limited


   • Benefit is truly questionable: AFAIK, I am 50% of the user base...
Example: Our Recent Research

• Estimating user base and potential user interest in FLOSS projects


   • Based on common release-and-download patterns


   • Proxy for project success, a common dependent variable
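
A minimal sketch of how a baseline and user base might be estimated from a release-and-download pattern. The slides do not give the exact estimator, so everything here is an assumption for illustration: the function name `estimate_user_base`, the use of a median over the pre-release window as the baseline, and the treatment of downloads above that baseline during the post-release drop-off window as the "area under the curve".

```python
from statistics import median


def estimate_user_base(daily_downloads, release_index, window=14):
    """Estimate (baseline, user_base) around one release.

    daily_downloads: list of download counts, one per day (hypothetical input).
    release_index:   position in the list of the release day.
    window:          days used for both the baseline and the drop-off period.
    """
    # Baseline (assumed): median daily downloads in the window before the release.
    before = daily_downloads[max(0, release_index - window):release_index]
    if not before:
        raise ValueError("need at least one day of data before the release")
    baseline = median(before)

    # Drop-off period: the window of days starting on the release day.
    after = daily_downloads[release_index:release_index + window]

    # User base (assumed): downloads above baseline during the drop-off,
    # i.e. the area of the release spike, read as existing users updating.
    user_base = sum(max(0, count - baseline) for count in after)
    return baseline, user_base


if __name__ == "__main__":
    # Made-up series: steady background downloads, then a spike at release.
    downloads = [10] * 14 + [50, 40, 30, 20] + [10] * 10
    print(estimate_user_base(downloads, release_index=14, window=14))
    print(estimate_user_base(downloads, release_index=14, window=7))
```

Changing `window` from 14 to 7 days changes which days count toward the baseline and the spike, which is one reason such measures need careful handling.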

   [Figure: downloads over time across Version 0.5, Version 0.6, and Version 0.7. Annotations: area under curve is active users updating; potential user experimentation growth (good publicity?); active user base growth.]
“Normal” Download-Release Patterns

BibDesk

[Figure: downloads (y-axis, 1000 to 5000) by date (x-axis, Oct-2005 to Apr-2007); plotted measures: user_base and baseline.]
Taverna’s Download-Release Patterns

External effects!

[Figure: Taverna downloads over time, labeled "1.3.2-RC1", "+2 presentations", and "1.5.0"; two points marked "?".]
Taverna’s Estimated Baseline & User Base

14 day baseline & drop-off
Taverna’s Estimated Baseline & User Base

7 day baseline & drop-off
Interpretation

• Taverna is not a “normal” open source project


  • Speaking tours, tutorials, articles, and other events influence downloads


• What this demonstrates...


  • Care is needed with quantitative measures


  • Not all open source projects are the same


  • Taverna users are just as reactive as any


                                           http://www.flickr.com/photos/pagedooley/2121472112/
Where next?

• Adoption is a long-term agenda, as changing social practices doesn’t happen
  overnight


• For FLOSS research and our disciplinary communities


  • We will keep doing our work this way,
    and hope to draw in others

    “Won’t you come out and play?”




                                              http://www.flickr.com/photos/atiq/2658884520/
Thanks!

• Credits where they are due


  • Kevin Crowston, my advisor




  • James Howison, my collaborator




  • Everett Wiggins, my husband

More Related Content

What's hot

Data Management for Citizen Science
Data Management for Citizen ScienceData Management for Citizen Science
Data Management for Citizen ScienceAndrea Wiggins
 
Citizen Science 101: What Every Researcher Should Know About Crowdsourcing Sc...
Citizen Science 101: What Every Researcher Should Know About Crowdsourcing Sc...Citizen Science 101: What Every Researcher Should Know About Crowdsourcing Sc...
Citizen Science 101: What Every Researcher Should Know About Crowdsourcing Sc...Andrea Wiggins
 
Crowdsourcing Scientific Work: A Comparative Study of Technologies, Processes...
Crowdsourcing Scientific Work: A Comparative Study of Technologies, Processes...Crowdsourcing Scientific Work: A Comparative Study of Technologies, Processes...
Crowdsourcing Scientific Work: A Comparative Study of Technologies, Processes...Andrea Wiggins
 
Ian Thornhill Citizen Science Training Day
Ian Thornhill Citizen Science Training DayIan Thornhill Citizen Science Training Day
Ian Thornhill Citizen Science Training DayAlice Sheppard
 
Data Intensive Collaboration in Science and Engineering: CSCW workshop themes
Data Intensive Collaboration in Science and Engineering: CSCW workshop themesData Intensive Collaboration in Science and Engineering: CSCW workshop themes
Data Intensive Collaboration in Science and Engineering: CSCW workshop themesAndrea Wiggins
 
Citizen Science Training Day: Working with Citizen Scientists
Citizen Science Training Day: Working with Citizen ScientistsCitizen Science Training Day: Working with Citizen Scientists
Citizen Science Training Day: Working with Citizen ScientistsAlice Sheppard
 
Citizen science
Citizen scienceCitizen science
Citizen sciencesamar1407
 
"Breaking the Barriers to Citizen Science"
"Breaking the Barriers to Citizen Science""Breaking the Barriers to Citizen Science"
"Breaking the Barriers to Citizen Science"Alice Sheppard
 
4-H and Citizen Science Basics
4-H and Citizen Science Basics4-H and Citizen Science Basics
4-H and Citizen Science BasicsCitizenScience.org
 
Activities for citizen science at science centers
Activities for citizen science at science centersActivities for citizen science at science centers
Activities for citizen science at science centersCitizenScience.org
 
Why do citizen science at science centers?
Why do citizen science at science centers?Why do citizen science at science centers?
Why do citizen science at science centers?CitizenScience.org
 
The wider environment of open scholarship – Jisc and CNI conference 10 July ...
The wider environment of open scholarship – Jisc and CNI conference 10 July ...The wider environment of open scholarship – Jisc and CNI conference 10 July ...
The wider environment of open scholarship – Jisc and CNI conference 10 July ...Jisc
 
The Future of Research (Science and Technology)
The Future of Research (Science and Technology)The Future of Research (Science and Technology)
The Future of Research (Science and Technology)Duncan Hull
 
What is eScience, and where does it go from here?
What is eScience, and where does it go from here?What is eScience, and where does it go from here?
What is eScience, and where does it go from here?Daniel S. Katz
 
Toward a World Wise Web
Toward a World Wise WebToward a World Wise Web
Toward a World Wise Webszpak
 
Tina Phillips (Cornell Lab of Ornithology) - the DEVISE project
Tina Phillips (Cornell Lab of Ornithology) - the DEVISE projectTina Phillips (Cornell Lab of Ornithology) - the DEVISE project
Tina Phillips (Cornell Lab of Ornithology) - the DEVISE projectCitizenCyberlab
 
Accessing and Using Big Data to Advance Social Science Knowledge
Accessing and Using Big Data to Advance Social Science KnowledgeAccessing and Using Big Data to Advance Social Science Knowledge
Accessing and Using Big Data to Advance Social Science KnowledgeJosh Cowls
 

What's hot (20)

Data Management for Citizen Science
Data Management for Citizen ScienceData Management for Citizen Science
Data Management for Citizen Science
 
Citizen Science 101: What Every Researcher Should Know About Crowdsourcing Sc...
Citizen Science 101: What Every Researcher Should Know About Crowdsourcing Sc...Citizen Science 101: What Every Researcher Should Know About Crowdsourcing Sc...
Citizen Science 101: What Every Researcher Should Know About Crowdsourcing Sc...
 
Citizen Science and Inquiry
Citizen Science and InquiryCitizen Science and Inquiry
Citizen Science and Inquiry
 
Crowdsourcing Scientific Work: A Comparative Study of Technologies, Processes...
Crowdsourcing Scientific Work: A Comparative Study of Technologies, Processes...Crowdsourcing Scientific Work: A Comparative Study of Technologies, Processes...
Crowdsourcing Scientific Work: A Comparative Study of Technologies, Processes...
 
Little eScience: Exploring Collaboration in FLOSS Research

  • 2. Overview • Background • Exposition: Sociology of Science • Broad generalizations about science • Example: FLOSS Research • Little science context for eScience research • Expectations: What next? http://www.flickr.com/photos/pmtorrone/304696349/
  • 3. My Background • BA: Maths with economics • Nonprofit & IT industry work • Adult literacy, nonprofit management support, professional theatre • Web analytics • MSI: Human-computer interaction, complex systems & network science • PhD: Information science & technology
  • 4. Science • Systematic investigation for the production of knowledge • Scientific method emphasizes reproducibility • Not all phenomena are reproducible... • Many categories • Experimental, applied, social, etc. • Categories are not mutually exclusive http://www.flickr.com/photos/radiorover/419414206/
  • 5. Paradigms & Revolutions • Kuhn - Laws, theories, applications & instrumentation that create coherent traditions of scientific research • Paradigms help us direct our research, but limit our view of the world • New technologies can lead to scientific revolutions by revealing anomalies http://www.flickr.com/photos/weichbrodt/644302381/
  • 6. Normal Science • Kuhn - “normal science” is research based on broadly accepted scientific paradigms • Shared paradigms are based on rules and standards for scientific practice • Key requirement: agreement on focus and conduct of research • Ǝ(Grand Challenges)|Discipline http://www.flickr.com/photos/themadlolscientist/2421152973/
  • 7. Big Science • de Solla Price - “Big Science” is... • Inherently paradigmatic • Always normal science • Produces detailed insights into the minutiae of phenomena studied in the paradigm http://www.flickr.com/photos/31333486@N00/1883498062/
  • 8. Pre-paradigmatic Science • Paradigms require agreement on... • Epistemology • Ontology • Methodology • Most social sciences are pre-paradigmatic • Primarily exploratory research • Very little replication http://www.flickr.com/photos/askpang/327577395/
  • 9. Little Science • de Solla Price - “Little Science” is a romanticized precursor to Big Science, featuring lone, long-haired geniuses misunderstood by society, etc. • If it’s not Big Science, it’s Little Science • Pre-paradigmatic and fraught with ambiguity • Often fundamentally exploratory • Epistemological/theoretical/methodological divergence among researchers http://www.flickr.com/photos/mrjoax/2548045246/
  • 10. Social Science • Social science is real science: the goal is systematic knowledge production • Focuses on the study of the social life of human groups and individuals • IMHO, fundamentally more difficult than “hard” sciences due to infinite complexity of social phenomena • Replicability is a major challenge with respect to scientific method • Not all social science can or should aspire to replicability http://www.flickr.com/photos/smiteme/2379629501/
  • 11. Normalizing Science • Becoming a normal science requires community and convergence • Ǝ(community) != Ǝ(agreement) • Establishing grand challenges and methods are primary tasks of normalizing • Resistance to change is pervasive http://www.flickr.com/photos/9036026@N08/2949211479/
  • 12. Scientific Collaboration • Collaboration requires common focus, if not also epistemology and ontology • Challenging enough in normal sciences • Harder in pre-paradigmatic research • Economics: systemic disincentives to collaborate, versus potential benefits and ideals of science http://www.flickr.com/photos/richardsummers/542738965/
  • 13. Big Science Collaboration • LHC, CERN, etc. • Thousands of collaborators • Complex but coordinated, at least somewhat centralized • Requires shared goals and resources, plus (lots of) communication • Only happens in normal sciences http://www.flickr.com/photos/8767020@N08/531355152/
  • 14. Little Science Collaboration • A Professor & a grad student, give or take • Localized goals and resources • -> localized research practices • Small research teams • Fundamentally difficult to achieve consensus that allows larger groups • Restricts the ability to obtain funding and undertake ambitious projects http://www.flickr.com/photos/lamazone/2735939345/
  • 15. Scientific Collaboration Requirements • Shared goals • Establishes focus of research • Shared research resources • Both social and artifactual • Social aspects include training and community socialization • “we can has share?” http://www.flickr.com/photos/ryanr/142455033/
  • 16. Historical Research Artifacts • Letters, Books, Journals, Lectures • Also technologies: methods, instrumentation • Sharing? • Recordkeeping is not always a researcher’s main priority • Without records, there’s not much to share except the research outputs http://www.flickr.com/photos/smailtronic/1535870363/
  • 17. Today’s Research Artifacts • Large scale datasets, scripts, software, workflows, papers, images, video, audio, annotations, ephemera, web sites... • “Research objects” - bundling all the pieces together • Hybrids of boundary objects and touchstones • Technologies -> scientific revolution! • Open science http://www.flickr.com/photos/smiteme/2379630899/
  • 18. Example: FLOSS Research • Phenomenological & interdisciplinary • Software engineering, Information Systems, Anthropology, Sociology, CSCW, etc... • Ethos • (Idealistic) combination of open source values and scientific values http://www.flickr.com/photos/themadlolscientist/2542236565/
  • 19. FLOSS Phenomenon • Free/Libre Open Source Software • “Free as in speech, free as in beer” - liberty versus cost • Distributed collaboration to develop software • Volunteers and sponsored developers • Community-based model of development http://www.flickr.com/photos/prawnwarp/541526661/
  • 20. Typical FLOSS Research Topics • Coordination and collaboration • Growth and evolution (social and code) • Code quality • Business models and firm involvement • Motivation, leadership, success • Culture and community • Intellectual property and copyright http://www.flickr.com/photos/eean/519258881/
  • 21. What we study @ SU • Social aspects of FLOSS • What practices make some distributed work teams more effective than others? • How are these practices developed? • What are the dynamics through which self-organizing distributed teams develop and work?
  • 22. Sharing FLOSS Research Artifacts • Community: Small but growing, maybe around 400 researchers worldwide, with lively face-to-face interaction but relatively low listserv activity • Data: Lots of it, and readily available, though often difficult to use for several reasons • Analyses and tools: Not quite as easy to get, but there if you can find them • Papers: Repositories are as yet underdeveloped, but efforts are underway http://www.flickr.com/photos/12698507@N08/2762563631/
  • 23. FLOSS Research Community • Handful of small research groups, mostly in UK & Europe • Most often found in Software Engineering departments • International conferences targeted to academics, developers, or both • OSS, ICSE, FOSDEM, etc. • IFIP WG 2.13 http://www.flickr.com/photos/steevithak/2883218362/
  • 24. FLOSS Research Data • Data sources include interviews, surveys, and ethnographic fieldwork • Digital “trace” data: archival, secondary, by-product of work, easy but hard • Repositories • Hosting “forges” like SourceForge, FreshMeat, RubyForge, etc. • RoRs: Repositories of Repositories • Data sources for research
  • 25. We Built It... • Motivations • Stop hammering forge servers, getting entire campus IPs blocked... • Stop reinventing the wheel! • Adoption • Shared data sources seeing increasing use • Next step is harder: sharing tools and workflows http://www.flickr.com/photos/circulating/997909242/
  • 26. RoRs: FLOSSmole • Multiple PIs @ Syracuse, Elon, & Carnegie Mellon • One grad student @ SU (me), a couple of undergrads @ Elon • Public access to 300+ GB data on 300K+ projects from 8 repositories • Flat files & SQL datamarts • Released via SF & GC • 5 TB allotment on TeraGrid @ SDSC
  • 27. RoRs: FLOSSmetrics • Produced by LibreSoft with academic and corporate partners • Public access to data for 2800+ projects • Analyzed & raw data from CVS, email, trackers • Tools for: • calculating code metrics • parsing trackers • parsing email lists
  • 28. RoRs: SRDA • SourceForge Research Data Archive • One PI @ Notre Dame University • One massive 300 GB+ SQL db of monthly dumps from SourceForge • Original obtuse structure, regular table deprecation, some documentation • Gated access: researchers only, condition of data release from SF
  • 29. RoRs: Emerging Sources • Ultimate Debian Database (UDD) • 300 MB compressed Postgres DB, produced by Debian community • Planning to add to FLOSSmole
  • 30. FLOSS Research Analyses • When available... • Bespoke Scripts • Taverna workflows
  • 31. FLOSS Research Papers • First, there was opensource.mit.edu • They no longer maintain it, and gave us the data • Work-in-progress working papers repository at FLOSSpapers.org • Essential viability problem is that repositories require long-term stewardship... • ...which requires long-term commitments of funding and personnel, not just volunteers
  • 32. FLOSS Research Collaboration • Multiple partners involved in producing FLOSSmole & FLOSSmetrics • Federated data sources by choice, starting to develop ontologies • As yet, a Little Science domain • Cross-institutional collaboration poses many challenges • Usual difficulties magnified by general lack of resources, both financial and human
  • 33. Latest Initiatives • Resource-oriented • Expanding resources: data, research artifacts, and pedagogical materials • DOIs: 10.4118/* • Semantic data interoperability • Community-oriented • FLOSShub.org
  • 34. Evangelizing eScience • Made presentations at OSS conferences: well received, but hard to make converts for several reasons • Tried to get other research group members to use Taverna: learning overhead is too high for most • Submitted a paper on eScience to an IS conference: rejected because reviewers were unable to adequately evaluate eScience as a topic, as it’s too unfamiliar • Currently just doing our work this way, as an exemplar http://www.flickr.com/photos/naezmi/2418745377/
  • 35. Barriers to Uptake • Lack of agreement in research focus, theory, methods; researcher isolation • Bimodal distribution of requisite skills • “I can’t possibly do that! I can’t code!” • “Why bother? I can code my own. You should too; just use Python.” • “Overheard” on Twitter: • Friend #1: i HATE that openoffice automatically took over my "open with..." defaults. • Friend #2: @Friend #1 <opensourcedeveloper> If you don't like it, then why don't you submit code to change the behavior!? </opensourcedeveloper> http://www.flickr.com/photos/noner/1739876378/
  • 36. What I had to learn to get this far • Taverna • A little bit of OWL, RDF, & SPARQL • A lot more Unix terminal & XML • I would not have taken this on if I had known what was in store, but once I got started, I was hooked • Relational DB management & SQL • More R, plus packages and dependency management • Java & Eclipse - just enough to write my own Beanshells • SVN & SSH http://www.flickr.com/photos/sashala/292868436/
  • 37. Sociotechnical Engineering • Tools are part of the solution, thanks to brilliant CS and SE people • Social elements are the true barrier • Awareness of methods and benefits • Incentive systems • Resistance to change (paradigms again) • Proof of concept is difficult http://www.flickr.com/photos/pinprick/3117108495/
  • 38. Using Taverna for Little eScience • Implementing analysis is usually easy • Data handling is almost always hard • All data are in SQL databases, with consistent IDs • Lots of data manipulation is required • Avoiding web services as much as possible • Infrastructure and resources are limited • Benefit is truly questionable: AFAIK, I am 50% of the user base...
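The data handling described on this slide, with everything in SQL databases keyed by consistent IDs, can be illustrated with a small sketch. It is not the workflow used in the talk: the table names `projects` and `downloads`, the `project_id` column, and the sample rows are all hypothetical, and an in-memory SQLite database stands in for whatever database the real data lives in.

```python
import sqlite3


def downloads_per_project(conn):
    """Join two tables on a shared project ID and total the downloads.

    The schema here is a hypothetical stand-in, not a real FLOSS datamart.
    """
    query = """
        SELECT p.project_id, p.name, SUM(d.downloads) AS total
        FROM projects AS p
        JOIN downloads AS d ON d.project_id = p.project_id
        GROUP BY p.project_id, p.name
        ORDER BY p.project_id
    """
    return conn.execute(query).fetchall()


# Build a tiny in-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE projects (project_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE downloads (project_id INTEGER, day TEXT, downloads INTEGER);
    INSERT INTO projects VALUES (1, 'alpha'), (2, 'beta');
    INSERT INTO downloads VALUES
        (1, '2009-01-01', 10),
        (1, '2009-01-02', 15),
        (2, '2009-01-01', 7);
""")

rows = downloads_per_project(conn)
for project_id, name, total in rows:
    print(project_id, name, total)
```

Because the IDs are consistent across tables, the join itself is the easy part; the manipulation the slide calls hard is everything needed to get real data into this shape.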
  • 39. Example: Our Recent Research • Estimating user base and potential user interest in FLOSS projects • Based on common release-and-download patterns • Proxy for project success, a common dependent variable [Figure: downloads over time across Version 0.5, Version 0.6, and Version 0.7; annotations: area under curve is active users updating; potential user experimentation; active user base growth; growth (good publicity?)]
  • 40. “Normal” Download-Release Patterns • BibDesk [Plot: measures are downloads, user_base, and baseline; y-axis from 1000 to 5000; x-axis from Oct-2005 to Apr-2007]
  • 41. Taverna’s Download-Release Patterns • External effects! [Plot annotations: 1.3.2-RC1, +2 presentations, 1.5.0, ?, ?]
  • 42. Taverna’s Estimated Baseline & User Base • 14 day baseline & drop-off
  • 43. Taverna’s Estimated Baseline & User Base • 7 day baseline & drop-off
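The slides do not spell out how the baseline and user base are computed, so the following is only a rough sketch of one way such an estimate could work, under loudly stated assumptions: the baseline is taken as the median of daily downloads in a fixed window before a release, and the user base as the downloads above that baseline in the same-length window starting on release day. The function name `estimate_user_base`, the `window_days` parameter, and the sample series are all hypothetical.

```python
from datetime import date, timedelta
from statistics import median


def estimate_user_base(daily_downloads, release_day, window_days=14):
    """Hypothetical estimator, not the method from the talk.

    baseline  = median daily downloads in the window before the release
    user_base = downloads above baseline in the window from release day on
    """
    before = [
        daily_downloads.get(release_day - timedelta(days=d), 0)
        for d in range(1, window_days + 1)
    ]
    after = [
        daily_downloads.get(release_day + timedelta(days=d), 0)
        for d in range(window_days)
    ]
    baseline = median(before)
    user_base = sum(max(0, n - baseline) for n in after)
    return baseline, user_base


# Made-up series: a steady 100 downloads per day, with a spike that
# drops off over the four days starting at the release.
release = date(2009, 1, 15)
series = {release + timedelta(days=d): 100 for d in range(-14, 14)}
for offset, n in enumerate([500, 400, 300, 200]):
    series[release + timedelta(days=offset)] = n

baseline_7, users_7 = estimate_user_base(series, release, window_days=7)
baseline_14, users_14 = estimate_user_base(series, release, window_days=14)
print(baseline_7, users_7, baseline_14, users_14)
```

In this clean example the 7 day and 14 day windows agree, because the spike fully decays inside both. On real data with external effects such as presentations, the choice of window could change both the baseline and the estimate, which may be why the slides show both.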
  • 44. Interpretation • Taverna is not a “normal” open source project • Speaking tours, tutorials, articles, and other events influence downloads • What this demonstrates... • Care is needed with quantitative measures • Not all open source projects are the same • Taverna users are just as reactive as any http://www.flickr.com/photos/pagedooley/2121472112/
  • 45. Where next? • Adoption is a long-term agenda, as changing social practices doesn’t happen overnight • For FLOSS research and our disciplinary communities • We will keep doing our work this way, and hope to draw in others “Won’t you come out and play?” http://www.flickr.com/photos/atiq/2658884520/
  • 46. Thanks! • Credits where they are due • Kevin Crowston, my advisor • James Howison, my collaborator • Everett Wiggins, my husband