An Open Source Platform for Social Science Research


In 2016, a group of social scientists at the University of California, Berkeley received a large grant to develop tools for rigorous social science research, initially focused on collective identity formation. Jazkarta has been helping them develop Dallinger, a tool that automates experiments using large numbers of subjects recruited on platforms like Mechanical Turk. They chose Jazkarta for our web development and project management expertise, and for our familiarity with large open source software projects, which is what Dallinger aims to become. In this 2017 Plone Conference presentation, members of the Jazkarta team (David Glick, Alec Mitchell, Matthew Wilkes, and Sally Kleinfeldt) describe how we've put the lessons of Plone to work in setting up this new open source project. We also describe how the technology stack (Python, Redis, WebSockets, Heroku, AWS/Mechanical Turk/boto, Flask, PostgreSQL/SQLAlchemy, Gunicorn, pytest, gevent) has been working for us.

An Open Source Platform
for Social Science
Research
Sally Kleinfeldt
Barcelona
2017

Next Generation Social
Sciences
(NGS2)
Funded by:

"The program aims to build and evaluate new
methods and tools to advance rigorous,
reproducible social science studies at scales
necessary to develop and validate causal
models of human social behaviors."

Benefits for
• Public health
• Economics
• National security

Initial Focus
Identify causal mechanisms of
collective identity formation

Dallinger
• Crowdsourced experiments
• Abstracted into single function calls
• Can be inserted into higher-order
algorithms, for example to progressively
refine an experiment
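
To illustrate what "abstracted into single function calls" makes possible, here is a minimal sketch. It does not use Dallinger's API: `run_experiment` is a hypothetical stand-in that simulates a score instead of recruiting anyone, and `refine` is an illustrative higher-order search that calls it repeatedly to narrow in on the best group size.

```python
import random
from typing import Callable


def run_experiment(group_size: int, seed: int = 0) -> float:
    """Hypothetical stand-in for one fully automated experiment run.

    A real run would recruit participants and collect data. Here we
    simulate a noisy cooperation score that peaks at group_size == 8.
    """
    rng = random.Random(seed * 1000 + group_size)
    noise = rng.uniform(-0.02, 0.02)
    return 1.0 - abs(group_size - 8) / 10 + noise


def refine(experiment: Callable[[int], float], low: int, high: int) -> int:
    """Higher-order search: run the experiment repeatedly and narrow
    the range of group sizes toward the one with the best score."""
    while high - low > 2:
        third = (high - low) // 3
        left, right = low + third, high - third
        if experiment(left) < experiment(right):
            low = left    # the peak cannot lie below `left`
        else:
            high = right  # the peak cannot lie above `right`
    return max(range(low, high + 1), key=experiment)


best = refine(run_experiment, low=2, high=20)
print(best)
```

Because the experiment is just a callable, any optimizer or adaptive design could be swapped in for `refine` without touching the experiment itself.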

Fully Automated
• Recruits participants (Mechanical Turk)
• Obtains informed consent
• Arranges participants into a network
• Runs the experiment (Heroku)
• Coordinates communication

Fully Automated
• Records the data participants produce
• Pays participants
• Recruits new batches of participants
contingent on the structure of the
experiment
• Validates and manages the resulting data
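
The batch recruitment step can be pictured as a loop that keeps requesting participants until the experiment's structure is satisfied. The sketch below is only an assumption about the general shape of such a loop: `recruit_batch` and `fill_network` are hypothetical names, and the 80% completion rate is made up for the simulation.

```python
import random
from typing import List, Tuple


def recruit_batch(size: int, rng: random.Random) -> List[bool]:
    """Hypothetical recruiter: one flag per recruit, True if that
    participant finished the task and passed data validation."""
    return [rng.random() < 0.8 for _ in range(size)]


def fill_network(slots: int, batch_size: int, seed: int = 1) -> Tuple[int, int]:
    """Recruit in batches until every slot in the network holds a
    validated participant. Returns (batches_used, total_recruited)."""
    rng = random.Random(seed)
    filled = batches = recruited = 0
    while filled < slots:
        # Only ask for as many people as there are open slots.
        request = min(batch_size, slots - filled)
        results = recruit_batch(request, rng)
        filled += sum(results)
        recruited += request
        batches += 1
    return batches, recruited


batches, recruited = fill_network(slots=20, batch_size=8)
print(batches, recruited)
```

Participants who drop out or fail validation leave their slot open, so the next batch is sized to replace them.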

How Does It Work?
• Experiments are modeled as directed
graphs
• Experiments are like Plone add-ons being
run by the Dallinger system
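
As a rough picture of the directed-graph model, the sketch below builds a chain in which each participant can only send to the next one. The `Network` class and `build_chain` helper are hypothetical and written for this illustration; they are not Dallinger's own classes.

```python
from collections import defaultdict


class Network:
    """Minimal directed graph: nodes are participants, edges are the
    channels along which information may flow."""

    def __init__(self):
        self.nodes = []
        self.edges = defaultdict(list)  # sender -> list of receivers

    def add_node(self, node):
        self.nodes.append(node)

    def connect(self, sender, receiver):
        self.edges[sender].append(receiver)

    def transmit(self, sender, message):
        """Deliver a message to every node the sender points at."""
        return {receiver: message for receiver in self.edges[sender]}


def build_chain(participants):
    """Chain topology: each participant sends only to the next one."""
    net = Network()
    for p in participants:
        net.add_node(p)
    for sender, receiver in zip(participants, participants[1:]):
        net.connect(sender, receiver)
    return net


chain = build_chain(["p1", "p2", "p3"])
print(chain.transmit("p1", "hello"))  # {'p2': 'hello'}
print(chain.transmit("p3", "hello"))  # {} (last node has no receivers)
```

Other experiment designs would differ only in how the edges are wired, for example a fully connected group or a star around one central node.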

Games!
• All teams converged on a public goods
game
• Dallinger is the only team using a real-time
multiplayer game: Grid Universe
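
For readers unfamiliar with the term, a public goods game in its standard form works like this: each player keeps whatever they do not contribute, the pooled contributions are multiplied, and the pot is split equally. The sketch below computes payoffs under that textbook rule. The endowment and multiplier values are illustrative assumptions, not parameters from the NGS2 experiments.

```python
def public_goods_payoffs(contributions, endowment=10.0, multiplier=2.0):
    """Payoff for each player in a standard public goods game:
    payoff_i = endowment - c_i + multiplier * sum(c) / n
    """
    if any(c < 0 or c > endowment for c in contributions):
        raise ValueError("each contribution must be between 0 and the endowment")
    n = len(contributions)
    share = multiplier * sum(contributions) / n  # equal split of the pot
    return [endowment - c + share for c in contributions]


# Two players contribute everything, two free-ride.
print(public_goods_payoffs([10, 10, 0, 0]))    # [10.0, 10.0, 20.0, 20.0]
# Everyone contributes: the group as a whole does best.
print(public_goods_payoffs([10, 10, 10, 10]))  # [20.0, 20.0, 20.0, 20.0]
```

The example shows the tension the game is designed to expose: free riders earn more than contributors in a mixed group, yet full cooperation beats universal free-riding.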

Repository
• github.com/Dallinger/Dallinger

Why Jazkarta?
• Expertise in Python, Flask, PostgreSQL,
SQLAlchemy, Amazon Mechanical Turk,
boto, tox, pytest, Redis, Selenium,
PhantomJS, JavaScript, HTML, CSS…
• Expertise in project management
• Expertise in a mature open source
community - Plone!

Our Process
• Discovery meeting fall 2017
• Developed user stories
• Estimated using planning poker
• Implementing in a series of iterations

Our Team
• Alec Mitchell
• David Glick
• Matthew Wilkes
• Carlos de la Guardia
• Jesse Snyder

Lessons
• Don’t over-engineer plugin architectures
(like recruiters)
• Support live editing as much as possible
• Break backwards compatibility when
needed
• Remove references to old ways of doing
things

Lessons
• Ship lots of useful demos
• Be diligent about code reviews
• Make important approach decisions
together
• People involved in decisions should know
user needs intimately (we miss Joel Burton)

Tech Stack
• Web based, but Flask instead of Zope
• PostgreSQL instead of ZODB
• Real-time WebSockets
• Built-in deployment command

Helping Dallinger Users
• Documentation
• Slack channel
• Cookiecutter template
• Extendable base templates
• JavaScript library
• Commands for local debugging

Code Quality
• Automated lint checks
• Continuous integration with minimum
code coverage requirement
• Code review
• Regression testing of an experiment (GU)
against changes to core Dallinger

Fun Challenges
• Scaling Selenium-based bots
• Getting access to track interactions with
3rd-party sites (Chrome extension)
• Testing multiple participants in parallel
without sharing cookies
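
The cookie-isolation problem can be shown in miniature without a browser. The sketch below is not how the Dallinger bots work. It swaps Selenium for plain standard-library HTTP clients and a toy local server (`SessionHandler` and `participant` are hypothetical names) to show the underlying idea: give every simulated participant a private cookie jar so that parallel sessions never collide.

```python
import threading
import uuid
from concurrent.futures import ThreadPoolExecutor
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import HTTPCookieProcessor, ProxyHandler, build_opener


class SessionHandler(BaseHTTPRequestHandler):
    """Toy server: hands out a session cookie on first contact and
    echoes the caller's session id in the response body."""

    def do_GET(self):
        cookie = self.headers.get("Cookie", "")
        if cookie.startswith("session="):
            session_id, is_new = cookie.split("=", 1)[1], False
        else:
            session_id, is_new = uuid.uuid4().hex, True
        body = session_id.encode()
        self.send_response(200)
        if is_new:
            self.send_header("Set-Cookie", f"session={session_id}")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass


def participant(url):
    """One simulated participant with its own private cookie jar."""
    opener = build_opener(
        ProxyHandler({}),                 # ignore any system proxy settings
        HTTPCookieProcessor(CookieJar()),  # cookies are not shared
    )
    first = opener.open(url).read().decode()
    second = opener.open(url).read().decode()
    return first, second


server = ThreadingHTTPServer(("127.0.0.1", 0), SessionHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/"

with ThreadPoolExecutor(max_workers=5) as pool:
    sessions = list(pool.map(participant, [url] * 5))

server.shutdown()
server.server_close()
print(len({first for first, _ in sessions}), "distinct sessions")
```

Each participant sees the same session id on both of its requests, while no two participants share one. With real browsers the equivalent would be a separate profile or driver instance per bot.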

