BDVe Webinar Series - QROWD: The Human Factor in Big Data

The second webinar in the BDVe series on Big Data technologies presents the QROWD project. Elena Simperl (University of Southampton) will provide an overview and technical details on how human interaction and crowdsourcing can help in different steps of the data value chain, from data acquisition to data curation and completion. Examples of how to add a human in the loop in the domains of Smart Cities and Smart Transportation will be provided.

  1. 1. The human factor in big data. BDVe webinar series, November 6th 2018. Elena Simperl, University of Southampton, UK, @esimperl
  2. 2. The data economy. Big data: Volume, Veracity, Velocity, Variety • Data value chains as a driver for growth and change • Transformative impact leading to new infrastructure, businesses, politics and social interactions • Created, refined, valued and exchanged unlike any other resource • Alters the rules for markets and demands new approaches from regulators
  3. 3. Example: Disrupting transport. Smart cities have access to more data than ever to inform policy and service design. Driverless cars, electrification and connectivity are transforming the automotive industry. Machine learning and AI can help optimise traffic, support future planning and improve fuel efficiency.
  4. 4. Challenges. Data availability: • Collecting missing data • Labelling data to train and validate algorithms • Improving data quality • Integrating across sources. Data use: • Making decisions inclusively • Enabling the free flow of data • Innovating responsibly. Many of these tasks are automated, but technology has limitations. Legal, economic, social and ethical implications.
  5. 5. The human factor in big data: more and better data; training and validating algorithms; engaging and empowering citizens, customers etc.
  6. 6. Approaches: citizen sensing, urban auditing, participatory democracy, open innovation, crowdsourcing, human in the loop
  7. 7. Crowdsourcing
  8. 8. Organisations struggle to leverage the human factor. What form of crowdsourcing to choose? How to engage with the crowd? Why would the crowd care? How do we control the quality? Does it need to be in real-time? Can we afford it at scale?
  9. 9. Qrowd. Innovation action, part of the Big Data Value PPP. Started in December 2016; 3 years; 3.9M €. 8 partners from 5 European countries, coordinated by the University of Southampton. Smart city solutions. Combining crowd and computational intelligence. Piloted in transportation with a medium-sized smart city and a leading navigation and traffic management service provider.
  10. 10. Enabling data value chains. Standards-compliant, interoperable, open, no vendor lock-in. Leverages existing technology stacks. Used by industry partners. Extendable and scalable to adapt to new urban contexts. Platform for data and process (data flow) integration.
  11. 11. The human factor in Qrowd. Mix of open innovation methods to co-design pilots and encourage stakeholder participation. Value-centric approach to platform design: personal data empowerment, open source, building upon existing standards. Sustainable urban auditing through online and mobile crowdsourcing. Human-in-the-loop (HIL) architecture to improve the accuracy of predictions.
  12. 12. More than just technology. Supports deployment of human-machine workflows throughout. Interfaces to multiple crowdsourcing services. Complemented by methodology and guidelines. Data protection by design.
  13. 13. The ‘what, who, how, why’ methodology. What: • Tasks you can’t complete in-house or using computers • A question of time, budget, resources, ethics etc. Who: • Crowdsourcing ≠ ‘turkers’ • Open call, biased via choice of platforms and promotion channels • No traditional means to manage and incentivize • Crowd often has little to no context about the project. How: • Macro vs. microtasks • Complex workflows • Assessment and aggregation • Timeliness of results. Why: • Different crowds with different motivations • Incentives influence motivations • Aligning incentives
  14. 14. Using the methodology Who is it for • Organisations interested in increasing participation via crowdsourcing • Technology providers implementing HIL architectures How can it be used • Provides a process model starting with the What, followed by the Who, which then determine the How. Every What/Who/How decision impacts on the Why • Can be used with or without the Qrowd platform • Helps specify goals and decide what forms of crowdsourcing to use • Helps roll out crowdsourcing projects and use their results effectively • Helps understand motivations and incentives and their role in successful projects
  15. 15. Examples. Urban auditing: collecting up-to-date information about parking spaces in a city. Modal split: collecting training data to predict the use of different means of transport.
  16. 16. What. In general: • Something you cannot do using traditional means or that requires broader engagement • Something you cannot do (fully) automatically: a data collection or analysis task. In our examples: • Parking: We need a dataset with all parking spaces in a city (alternatively: parking availability). Traditional surveys are too costly. • Modal split: We need trips involving different means of transport and labels for each trip segment. This data is not available and is needed to train AIs.
  17. 17. What task am I trying to solve? Can I solve it via other means: buy the data, label in-house, use less/noisier data etc.?
  18. 18. Who. In general: • An open (‘unknown’) crowd • Scale helps solve the problem faster • Some tasks will have time, location or skills constraints (hence a smaller crowd, hence slower or costlier). In our examples: • Parking: people who are familiar with an urban area (e.g., the OpenStreetMap community, citizens); drivers using a SatNav; paid crowd workers; social media users • Modal split: commuters, tourists, people using transport
  19. 19. Who is my crowd? How do I recruit participants? What are my requirements? Can I find volunteers? Shall I use a crowdsourcing platform?
  20. 20. How: Process. In general: • Many ways to implement tasks: specialized platforms, social media, extension of an existing system etc. • Tasks are broken down into smaller units, undertaken in parallel by different people (this does not apply to all forms of crowdsourcing, since sometimes the breakdown is part of the solution, nor to creative tasks, underexplored problem spaces etc.) • Task assignment to match skills, preferences, and contribution history (example: random assignment vs meritocracy vs full autonomy) • Explicit vs. implicit participation (affects motivation) • Partial or independent answers are consolidated and aggregated into a complete solution (example: challenges, e.g., Netflix, vs aggregation, e.g., Wikipedia) • Real-time answers (require alternative models and incentives)
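The consolidation step on this slide (independent answers from several people aggregated into one solution) can be illustrated with a minimal sketch. The slides do not say which aggregation rule is used; majority voting with an agreement threshold is assumed here as one common choice, and the function name, the `(task_id, worker_id, label)` tuple layout and the `min_agreement` parameter are all hypothetical.

```python
from collections import Counter, defaultdict


def aggregate_majority(answers, min_agreement=0.5):
    """Consolidate redundant crowd answers by majority vote.

    answers: iterable of (task_id, worker_id, label) tuples (assumed layout).
    Returns {task_id: label}, or None for a task where the most frequent
    label does not exceed the min_agreement share of the votes.
    """
    # Group the labels submitted for each microtask.
    labels_by_task = defaultdict(list)
    for task_id, _worker_id, label in answers:
        labels_by_task[task_id].append(label)

    consolidated = {}
    for task_id, labels in labels_by_task.items():
        top_label, votes = Counter(labels).most_common(1)[0]
        share = votes / len(labels)
        # Unresolved tasks stay None so they can be re-issued to more workers.
        consolidated[task_id] = top_label if share > min_agreement else None
    return consolidated


if __name__ == "__main__":
    sample = [
        ("spot-1", "w1", "parking"),
        ("spot-1", "w2", "parking"),
        ("spot-1", "w3", "no_parking"),
        ("spot-2", "w1", "parking"),
        ("spot-2", "w2", "no_parking"),
    ]
    print(aggregate_majority(sample))
```

Returning `None` for ties keeps the quality question visible: such tasks can be sent to additional workers instead of being resolved arbitrarily.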
  21. 21. How: Process. In our example (parking): 1. Crowdsourcing platform: Virtual City Explorer tool using virtual street imagery. Participants are paid. 2. Extension of existing system: SatNav prompting user to answer questions about parking availability. Contributions could be incentivised. 3. Data collection app: i-Log app launches challenges to collect parking pictures in a city. Best pictures receive a prize.
  22. 22. Virtual City Explorer • Crowdsourcing platform for urban auditing, developed at the University of Southampton • People explore a virtual city via street imagery • They solve small tasks in return for micropayments • VCE validates answers, consolidates data and analyses user behaviour to propose optimisations
  23. 23. i-Log and QrowdLab. i-Log is an Android application for people-centric sensing, developed at the University of Trento. QrowdLab is a citizen innovation lab set up in Trento to engage with citizens on city matters. We need tools to connect with the citizens. We need data to understand patterns of behaviour and collect missing data. We need feedback on how people interact with the city and its infrastructure.
  24. 24. How: Process. In our example (modal split): • Combination of machine learning classifier, citizen sensing and labelled data collected via gamified challenges
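How a classifier and crowd-collected labels are combined is not spelled out on the slide. One common human-in-the-loop pattern, assumed here purely for illustration, is to accept confident predictions and route uncertain trip segments to people for labelling. The function names, the `(segment_id, label, confidence)` layout and the `threshold` value are hypothetical and not taken from the Qrowd platform.

```python
def route_predictions(predictions, threshold=0.8):
    """Split classifier output into accepted labels and segments for the crowd.

    predictions: list of (segment_id, label, confidence) tuples (assumed layout).
    Returns (accepted, to_crowd): a dict of confident labels and a list of
    segment ids whose confidence falls below the threshold.
    """
    accepted = {}
    to_crowd = []
    for segment_id, label, confidence in predictions:
        if confidence >= threshold:
            accepted[segment_id] = label
        else:
            to_crowd.append(segment_id)
    return accepted, to_crowd


def merge_crowd_labels(accepted, crowd_labels):
    """Combine machine-accepted labels with labels supplied by people.

    Crowd labels take precedence when a segment appears in both inputs.
    """
    merged = dict(accepted)
    merged.update(crowd_labels)
    return merged


if __name__ == "__main__":
    preds = [("seg-1", "bus", 0.95), ("seg-2", "bike", 0.55), ("seg-3", "walk", 0.80)]
    accepted, to_crowd = route_predictions(preds)
    # In a real deployment the crowd labels would come from participants,
    # for instance through a gamified challenge; here they are hard-coded.
    final = merge_crowd_labels(accepted, {"seg-2": "car"})
    print(accepted, to_crowd, final)
```

The merged labels could then be fed back as training data, which is where the crowd contribution improves later predictions.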
  25. 25. Where do I deploy crowdsourcing? Do I need a new system? How do I allocate tasks to people? Or do I let them choose freely how to contribute? How do I deal with low quality solutions? Can I recognise good solutions easily?
  26. 26. Why: money, love or glory. Love and glory reduce costs. Money and glory make the crowd move faster. Intrinsic vs extrinsic motivation: • Rewards/incentives influence motivation. Successful unpaid crowdsourcing is difficult to predict or replicate: • Highly context-specific • Not applicable to arbitrary tasks. Reward models are often easier to study and control (if performance can be reliably measured): • Not always easy to abstract from social aspects (free-riding, social pressure) • May undermine intrinsic motivation
  27. 27. Why. In our examples: Who benefits from the results? Who owns the results? How much effort does it require from the crowd? Money: different models (pay-per-time, pay-per-unit, winner-takes-it-all); define the rewards, analyse trade-offs between accuracy and costs, avoid spam. Love: OpenStreetMap, games, citizen panels. Glory: competitions, awards.
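The accuracy versus cost trade-off mentioned under "Money" can be made concrete with a small worked sketch. It rests on assumptions that are not in the slides: each task is a binary question, every worker answers correctly with the same probability, workers answer independently, and answers are combined by majority vote over an odd number of workers. All function names, prices and rates are hypothetical.

```python
from math import comb


def pay_per_unit_cost(n_tasks, redundancy, price_per_answer):
    """Total cost when every submitted answer is paid a fixed price."""
    return n_tasks * redundancy * price_per_answer


def pay_per_time_cost(n_tasks, redundancy, seconds_per_answer, hourly_rate):
    """Total cost when workers are paid for the time they spend."""
    hours = n_tasks * redundancy * seconds_per_answer / 3600
    return hours * hourly_rate


def majority_accuracy(worker_accuracy, redundancy):
    """Probability that a majority of `redundancy` workers is correct.

    Assumes a binary task, independent workers with equal accuracy,
    and an odd redundancy so that ties cannot occur.
    """
    if redundancy % 2 == 0:
        raise ValueError("redundancy must be odd to avoid ties")
    needed = redundancy // 2 + 1
    return sum(
        comb(redundancy, k)
        * worker_accuracy**k
        * (1 - worker_accuracy) ** (redundancy - k)
        for k in range(needed, redundancy + 1)
    )


if __name__ == "__main__":
    # Compare redundancy levels for 1000 tasks at a hypothetical 0.05 per answer.
    for r in (1, 3, 5):
        cost = pay_per_unit_cost(1000, r, 0.05)
        acc = majority_accuracy(0.8, r)
        print(f"redundancy={r} cost={cost:.2f} expected accuracy={acc:.3f}")
```

Under these assumptions cost grows linearly with redundancy while accuracy gains flatten, which is the kind of trade-off the slide suggests analysing before fixing the rewards.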
  28. 28. Why would anyone care to contribute? Is the task intrinsically rewarding? What would motivate people to participate? How do I sustain participation?
  29. 29. Leveraging the human factor. The most sophisticated AI systems showcase ingenious combinations of human and machine intelligence. Crowdsourcing can augment any aspect of the data value chain. Our methodology can help organisations understand how to use crowdsourcing effectively. Qrowd develops a platform with integrated crowdsourcing support to deploy hybrid data collection and analysis workflows.
  30. 30. Further reading • Qrowd project: qrowd-project.eu, @QrowdProject • Figure Eight: figure-eight.com • How to use crowdsourcing effectively, Simperl, E. (2015): https://www.liberquarterly.eu/articles/10.18352/lq.9948/ • When computers were human, David Alan Grier, 2007 • The collective intelligence genome, Malone, T. W., Laubacher, R., & Dellarocas, C. (2010). MIT Sloan Management Review, 51(3), 21. • Getting Results from Crowds: The Definitive Guide to Using Crowdsourcing to Grow, Dawson, R. and Bynghall, S. (2011). Advanced Human Technologies
