The human factor in big data
BDVe webinar series
November 6th 2018
Elena Simperl, University of Southampton, UK
@esimperl
Big data
Volume, velocity, variety, veracity
The data economy
• Data value chains as a driver for growth and change
• Transformative impact leading to new infrastructure, businesses, politics and social interactions
• Created, refined, valued and exchanged unlike any other resource
• Alters the rules for markets and demands new approaches from regulators
Example: Disrupting transport
Smart cities have access to more data than ever to
inform policy and service design
Driverless cars, electrification and connectivity are
transforming the automotive industry
Machine learning and AI can help optimise traffic,
support future planning and improve fuel efficiencies
Challenges
Data availability
• Collecting missing data
• Labelling data to train and
validate algorithms
• Improving data quality
• Integrating across sources
Data use
• Making decisions inclusively
• Enabling the free flow of data
• Innovating responsibly
Many of these tasks are automated, but technology has limitations
Legal, economic, social, ethical implications
More and better data
Training and validating algorithms
Engaging and empowering citizens,
customers etc.
The human factor in big data
Approaches
Citizen sensing
Urban auditing
Participatory democracy
Open innovation
Crowdsourcing
Human in the loop
Crowdsourcing
Organisations struggle to leverage
the human factor
• What form of crowdsourcing to choose?
• How to engage with the crowd?
• Why would the crowd care?
• How do we control the quality?
• Does it need to be in real time?
• Can we afford it at scale?
Qrowd
Innovation action, part of the Big Data Value PPP
Started in December 2016, 3 years, 3.9M €
8 partners from 5 European countries, coordinated by the
University of Southampton
Smart city solutions
Combining crowd and computational intelligence
Piloted in transportation with
A medium-sized smart city
A leading navigation and traffic management service provider
Enabling data value chains
Standards compliant, interoperable, open, no vendor lock-in
Leverages existing technology stacks
Used by industry partners
Extendable and scalable to adapt to new urban contexts
Platform for data and process (data flow) integration
The human factor in Qrowd
Mix of open innovation methods to co-design pilots and encourage
stakeholder participation
Value-centric approach to platform design: personal data empowerment,
open source, building upon existing standards
Sustainable urban auditing through online and mobile crowdsourcing
Human-in-the-loop (HIL) architecture to improve the accuracy of
predictions
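One way to read the human-in-the-loop idea is as confidence-based routing: the model keeps the predictions it is sure about and hands the rest to people. The sketch below is illustrative only. The `route` function, the 0.8 threshold and the sample items are assumptions made for this example, not part of the Qrowd platform.

```python
from typing import NamedTuple

class Prediction(NamedTuple):
    item_id: str
    label: str
    confidence: float  # model's probability for `label`, between 0 and 1

def route(predictions, threshold=0.8):
    """Split predictions into machine-accepted ones and ones sent to the crowd."""
    accepted, to_crowd = [], []
    for p in predictions:
        (accepted if p.confidence >= threshold else to_crowd).append(p)
    return accepted, to_crowd

# Hypothetical model output for three trip segments.
preds = [
    Prediction("seg-1", "bus", 0.95),
    Prediction("seg-2", "bike", 0.55),
    Prediction("seg-3", "walk", 0.81),
]
accepted, to_crowd = route(preds)
print([p.item_id for p in accepted])   # ['seg-1', 'seg-3']
print([p.item_id for p in to_crowd])   # ['seg-2']
```

Lowering the threshold reduces crowd cost but lets more uncertain predictions through, so the value would need tuning per task.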
More than just technology
Supports deployment of human-machine workflows throughout
Interfaces to multiple crowdsourcing services
Complemented by methodology and guidelines
Data protection by design
The ‘what, who, how, why’ methodology
What
• Tasks you can’t complete in-house or using computers
• A question of time, budget, resources, ethics etc.
Who
• Crowdsourcing ≠ ‘turkers’
• Open call, biased via choice of platforms and promotion
channels
• No traditional means to manage and incentivize
• The crowd often has little or no context about the project
How
• Macro vs. microtasks
• Complex workflows
• Assessment and aggregation
• Timeliness of results
Why
• Different crowds with different motivations
• Incentives influence motivations
• Aligning incentives
Using the methodology
Who is it for
• Organisations interested in increasing participation via crowdsourcing
• Technology providers implementing HIL architectures
How can it be used
• Provides a process model starting with the What, followed by the Who,
which then determine the How. Every What/Who/How decision impacts
on the Why
• Can be used with or without the Qrowd platform
• Helps specify goals and decide what forms of crowdsourcing to use
• Helps roll out crowdsourcing projects and use their results effectively
• Helps understand motivations and incentives and their role in successful
projects
Examples
Urban auditing: Collecting up-to-date information about parking spaces in a city
Modal split: Collecting training data to predict the use of different means of transport
What
In general
• Something you cannot do using traditional means or that
requires broader engagement
• Something you cannot do (fully) automatically – a data
collection or analysis task
In our examples
• Parking: We need a dataset with all parking spaces in a city
(alternatively: parking availability). Traditional surveys too costly.
• Modal split: We need trips involving different means of
transport and labels for each trip segment. This data is not
available and is needed to train AIs.
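To make the modal split requirement concrete, a labelled trip could be represented as a sequence of segments, each carrying a transport mode label. This is a minimal sketch. The class names, fields and mode vocabulary are assumptions for illustration, not the project's data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MODES = {"walk", "bike", "bus", "train", "car"}  # assumed label vocabulary

@dataclass
class Segment:
    start: str                  # ISO 8601 timestamp
    end: str
    mode: Optional[str] = None  # None until a person has labelled it

@dataclass
class Trip:
    trip_id: str
    segments: List[Segment] = field(default_factory=list)

    def unlabelled(self) -> List[Segment]:
        """Segments that still need a label from the crowd."""
        return [s for s in self.segments if s.mode is None]

# Hypothetical trip: a labelled walk followed by an unlabelled segment.
trip = Trip("trip-1", [
    Segment("2018-11-06T08:00", "2018-11-06T08:10", "walk"),
    Segment("2018-11-06T08:10", "2018-11-06T08:35"),
])
print(len(trip.unlabelled()))  # 1
```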
11/6/2018 17
What Who
How Why
What task am I trying to solve?
Can I solve it via other means: buy the data, label in-house, use less/noisier data etc.?
Who
In general
• An open (‘unknown’) crowd
• Scale helps solve problem faster
• Some tasks will have time, location or skills constraints (hence,
smaller crowd, hence slower or costlier)
In our examples
• Parking
• People who are familiar with an urban area, e.g. the OpenStreetMap community, citizens
• Drivers using a SatNav
• Paid crowd workers
• Social media users
• Modal split
• Commuters, tourists, people using transport
Who is my crowd?
How do I recruit participants?
What are my requirements?
Can I find volunteers?
Shall I use a crowdsourcing platform?
How: Process
In general
• Many ways to implement tasks: specialized platforms, social media, extension of
existing system etc.
• Tasks broken down into smaller units, undertaken in parallel by different people
• Does not apply to all forms of crowdsourcing – sometimes the breakdown is part of the
solution!
• Does not apply to creative tasks, underexplored problem spaces etc.
• Task assignment to match skills, preferences, and contribution history
• Example: random assignment vs meritocracy vs full autonomy
• Explicit vs. implicit participation
• Affects motivation
• Partial or independent answers consolidated and aggregated into complete
solution
• Example: challenges (e.g., Netflix) vs aggregation (e.g., Wikipedia)
• Real-time answers
• Require alternative models and incentives
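The consolidation of independent answers mentioned in the list above can be sketched with simple majority voting over redundant judgments. This is one common aggregation technique, shown here as an assumption rather than as the method used in the project. The function name and the sample parking answers are invented for illustration.

```python
from collections import Counter, defaultdict

def majority_vote(answers):
    """answers: iterable of (task_id, worker_id, label) tuples.
    Returns {task_id: (label, agreement)}; label is None on a tie."""
    by_task = defaultdict(list)
    for task_id, _worker_id, label in answers:
        by_task[task_id].append(label)

    result = {}
    for task_id, labels in by_task.items():
        counts = Counter(labels).most_common()
        top_label, top_count = counts[0]
        # A tie means the crowd did not agree; leave the task unresolved.
        tied = len(counts) > 1 and counts[1][1] == top_count
        result[task_id] = (None if tied else top_label, top_count / len(labels))
    return result

answers = [
    ("spot-1", "w1", "parking"), ("spot-1", "w2", "parking"),
    ("spot-1", "w3", "no parking"),
    ("spot-2", "w1", "parking"), ("spot-2", "w2", "no parking"),
]
print(majority_vote(answers))
```

Tasks that come back with `None` or a low agreement score are natural candidates for additional judgments or expert review.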
How: Process
In our example - parking
1. Crowdsourcing platform: Virtual City Explorer tool using virtual
street imagery. Participants are paid.
2. Extension of existing system: SatNav prompting user to answer
questions about parking availability. Contributions could be
incentivised.
3. Data collection app: i-Log app launches challenges to collect
parking pictures in a city. Best pictures receive a prize.
Virtual City Explorer
• Crowdsourcing platform for
urban auditing, developed at the
University of Southampton
• People explore a virtual city via
street imagery
• They solve small tasks against
micropayments
• VCE validates answers,
consolidates data and analyses
user behaviour to propose
optimisations
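The slide does not say how VCE validates answers. One widely used approach in paid crowdsourcing is to mix in tasks with known answers (gold questions) and score each contributor against them. The sketch below illustrates that general idea only. The function, the data layout and the 0.75 cut-off are assumptions and do not describe VCE's implementation.

```python
def worker_accuracy(answers, gold):
    """answers: {worker_id: {task_id: label}}; gold: {task_id: correct_label}.
    Returns each worker's accuracy on the gold tasks they answered."""
    scores = {}
    for worker, given in answers.items():
        checked = [t for t in given if t in gold]
        if checked:
            correct = sum(given[t] == gold[t] for t in checked)
            scores[worker] = correct / len(checked)
    return scores

gold = {"g1": "yes", "g2": "no"}          # tasks with known answers
answers = {
    "w1": {"g1": "yes", "g2": "no", "t7": "yes"},
    "w2": {"g1": "no", "g2": "no", "t7": "no"},
}
scores = worker_accuracy(answers, gold)
trusted = [w for w, acc in scores.items() if acc >= 0.75]  # assumed cut-off
print(scores)   # {'w1': 1.0, 'w2': 0.5}
print(trusted)  # ['w1']
```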
i-Log and QrowdLab
i-Log is an Android application developed at the University of
Trento used for people-centric sensing
QrowdLab is a citizen innovation lab set up in Trento to
engage with citizens on city matters
We need tools to connect with the citizens
We need data to understand patterns of
behaviour and collect missing data
We need feedback on how people interact with
the city and its infrastructure
How: Process
In our example – modal split
• Combination of machine learning classifier, citizen sensing and
labelled data collected via gamified challenges
Where do I deploy crowdsourcing? Do I need a new system?
How do I allocate tasks to people? Or do I let them choose freely how to contribute?
How do I deal with low quality solutions? Can I recognise good solutions easily?
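The allocation question can be made concrete by comparing two of the strategies named earlier in the deck: random assignment and merit-based assignment. This is a minimal sketch. The function names, the round-robin rule and the accuracy figures are assumptions for illustration. Full autonomy would need no assignment code at all, since people pick their own tasks.

```python
import random

def assign_random(tasks, workers, rng):
    """Random assignment: every worker is equally likely to get a task."""
    return {task: rng.choice(workers) for task in tasks}

def assign_by_merit(tasks, accuracy):
    """Merit-based assignment: hand tasks to workers in order of past
    accuracy, cycling through the ranking so the load is spread."""
    ranked = sorted(accuracy, key=accuracy.get, reverse=True)
    return {task: ranked[i % len(ranked)] for i, task in enumerate(tasks)}

tasks = ["t1", "t2", "t3"]
accuracy = {"ana": 0.9, "ben": 0.7}      # hypothetical contribution history
print(assign_by_merit(tasks, accuracy))  # {'t1': 'ana', 't2': 'ben', 't3': 'ana'}
print(assign_random(tasks, list(accuracy), random.Random(0)))
```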
Why: money, love or glory
Love and glory reduce costs
Money and glory make the
crowd move faster
Intrinsic vs extrinsic motivation
• Rewards/incentives influence motivation
Successful unpaid crowdsourcing is difficult to
predict or replicate
• Highly context-specific
• Not applicable to arbitrary tasks
Reward models often easier to study and
control (if performance can be reliably
measured)
• Not always easy to abstract from social
aspects (free-riding, social pressure)
• May undermine intrinsic motivation
Why
In our examples
Who benefits from the results?
Who owns the results?
How much effort does it require from the crowd?
Money
Different models: pay-per-time, pay-per-unit, winner-takes-all
Define the rewards, analyse accuracy vs. cost trade-offs, avoid spam
Love
OpenStreetMap, games, citizen panels
Glory
Competitions, awards
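The three payment models listed under Money lead to different total costs for the same work, which is easy to see with a small calculation. The rates, prize and workload below are made-up numbers, and amounts are kept in integer cents to avoid rounding issues. Treat this as a back-of-the-envelope sketch rather than a pricing recommendation.

```python
def pay_per_unit(units_done, cents_per_unit):
    """Total cost when each completed unit earns a fixed amount."""
    return sum(units_done.values()) * cents_per_unit

def pay_per_time(minutes_worked, cents_per_minute):
    """Total cost when contributors are paid for the time they spend."""
    return sum(minutes_worked.values()) * cents_per_minute

def winner_takes_all(prize_cents):
    """One fixed prize, regardless of how many people take part."""
    return prize_cents

units = {"w1": 120, "w2": 80}     # tasks completed per worker (hypothetical)
minutes = {"w1": 120, "w2": 90}   # minutes logged per worker (hypothetical)

costs = {
    "pay-per-unit": pay_per_unit(units, 5),
    "pay-per-time": pay_per_time(minutes, 15),
    "winner-takes-all": winner_takes_all(5000),
}
print(costs)  # {'pay-per-unit': 1000, 'pay-per-time': 3150, 'winner-takes-all': 5000}
```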
Why would anyone care to contribute?
Is the task intrinsically rewarding?
What would motivate people to participate?
How do I sustain participation?
Leveraging the human factor
The most sophisticated AI systems showcase ingenious
combinations of human and machine intelligence
Crowdsourcing can augment any aspect of the data value
chain
Our methodology can help organisations understand how
to use crowdsourcing effectively
Qrowd develops a platform with integrated crowdsourcing
support to deploy hybrid data collection and analysis
workflows
Further reading
• Qrowd project: qrowd-project.eu, @QrowdProject
• Figure Eight: figure-eight.com
• How to use crowdsourcing effectively, Simperl, E. (2015):
https://www.liberquarterly.eu/articles/10.18352/lq.9948/
• When computers were human, David Alan Grier, 2007
• The collective intelligence genome, Malone, T. W., Laubacher, R., &
Dellarocas, C. (2010). MIT Sloan Management Review, 51(3), 21.
• Getting Results from Crowds: The Definitive Guide to Using
Crowdsourcing to Grow, Dawson, R. and Bynghall, S. (2011).
Advanced Human Technologies

Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Hyundai Motor Group
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 

Recently uploaded (20)

Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 

The Human Factor in Big Data: Leveraging Crowdsourcing

  • 1. The human factor in big data BDVe webinar series November 6th 2018 Elena Simperl, University of Southampton, UK @esimperl
  • 2. Volume Veracity Velocity Variety Big data • Data value chains as driver for growth and change • Transformative impact leading to new infrastructure, businesses, politics and social interactions • Created, refined, valued and exchanged unlike any other resources • Alters the rules for markets and demands new approaches from regulators The data economy
  • 3. Example: Disrupting transport Smart cities have access to more data than ever to inform policy and service design Driverless cars, electrification and connectivity are transforming the automotive industry Machine learning and AI can help optimise traffic, support future planning and improve fuel efficiencies
  • 4. Challenges Data availability • Collecting missing data • Labelling data to train and validate algorithms • Improving data quality • Integrating across sources Data use • Making decisions inclusively • Enabling the free flow of data • Innovating responsibly Many of these tasks are automated, but technology has limitations Legal, economic, social, ethical implications
  • 5.
  • 6. More and better data Training and validating algorithms Engaging and empowering citizens, customers etc. The human factor in big data
  • 7. Approaches Citizen sensing Urban auditing Participatory democracy Open innovation Crowdsourcing Human in the loop
  • 9. Organisations struggle to leverage the human factor What form of crowdsourcing to choose? How to engage with the crowd? Why would the crowd care? How do we control the quality? Does it need to be in real-time? Can we afford it at scale?
  • 10. Qrowd Innovation action, part of the Big Data Value PPP Started in December 2016, 3 years, 3.9M € 8 partners from 5 European countries, coordinated by the University of Southampton Smart city solutions Combining crowd and computational intelligence Piloted in transportation with A medium-sized smart city A leading navigation and traffic management service provider
  • 11. Enabling data value chains Standards compliant, interoperable, open, no vendor lock-in Leverages existing technology stacks Used by industry partners Extendable and scalable to adapt to new urban contexts Platform for data and process (data flow) integration
  • 12. The human factor in Qrowd Mix of open innovation methods to co-design pilots and encourage stakeholder participation Value-centric approach to platform design: personal data empowerment, open source, building upon existing standards Sustainable urban auditing through online and mobile crowdsourcing Human-in-the-loop (HIL) architecture to improve the accuracy of predictions
  • 13. More than just technology Supports deployment of human-machine workflows throughout Interfaces to multiple crowdsourcing services Complemented by methodology and guidelines Data protection by design
  • 14. The ‘what, who, how, why’ methodology What • Tasks you can’t complete in-house or using computers • A question of time, budget, resources, ethics etc. Who • Crowdsourcing ≠ ‘turkers’ • Open call, biased via choice of platforms and promotion channels • No traditional means to manage and incentivize • The crowd often has little to no context about the project How • Macro vs. microtasks • Complex workflows • Assessment and aggregation • Timeliness of results Why • Different crowds with different motivations • Incentives influence motivations • Aligning incentives
  • 15. Using the methodology Who is it for • Organisations interested in increasing participation via crowdsourcing • Technology providers implementing HIL architectures How can it be used • Provides a process model starting with the What, followed by the Who, which then determine the How. Every What/Who/How decision impacts on the Why • Can be used with or without the Qrowd platform • Helps specify goals and decide what forms of crowdsourcing to use • Helps roll out crowdsourcing projects and use their results effectively • Helps understand motivations and incentives and their role in successful projects
  • 16. Examples Urban auditing: Collecting up-to-date information about parking spaces in a city Modal split: Collecting training data to predict the use of different means of transport
  • 17. What In general • Something you cannot do using traditional means or that requires broader engagement • Something you cannot do (fully) automatically – a data collection or analysis task In our examples • Parking: We need a dataset with all parking spaces in a city (alternatively: parking availability). Traditional surveys too costly. • Modal split: We need trips involving different means of transport and labels for each trip segment. This data is not available and is needed to train AIs. 11/6/2018 17 What Who How Why
  • 18. What task am I trying to solve? Can I solve it via other means: buy the data, label in house, use less/noisier data etc.
  • 19. Who In general • An open (‘unknown’) crowd • Scale helps solve the problem faster • Some tasks will have time, location or skills constraints (hence, smaller crowd, hence slower or costlier) In our examples • Parking • People who are familiar with an urban area e.g., OpenStreetMap community, citizens • Drivers using a SatNav • Paid crowd workers • Social media users • Modal split • Commuters, tourists, people using transport
  • 20. Who is my crowd? How do I recruit participants? What are my requirements? Can I find volunteers? Shall I use a crowdsourcing platform?
  • 21. How: Process In general • Many ways to implement tasks: specialized platforms, social media, extension of existing system etc. • Tasks broken down into smaller units, undertaken in parallel by different people • Does not apply to all forms of crowdsourcing – sometimes the breakdown is part of the solution! • Does not apply to creative tasks, underexplored problem spaces etc. • Task assignment to match skills, preferences, and contribution history • Example: random assignment vs meritocracy vs full autonomy • Explicit vs. implicit participation • Affects motivation • Partial or independent answers consolidated and aggregated into complete solution • Example: challenges (e.g., Netflix) vs aggregation (e.g., Wikipedia) • Real-time answers • Require alternative models and incentives
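The consolidation step on this slide (independent answers aggregated into one solution) can be illustrated with a small sketch. It assumes the simplest possible scheme, a majority vote over redundant answers per microtask; the slides do not say which aggregation method Qrowd uses, and the function name, task ids and labels below are hypothetical.

```python
from collections import Counter


def aggregate_majority(answers_by_task):
    """Consolidate redundant crowd answers with a simple majority vote.

    answers_by_task maps a task id to the list of labels submitted by
    different contributors. Returns {task_id: (label, agreement)}, where
    agreement is the share of contributors who chose the winning label.
    Ties go to the label that was submitted first.
    """
    results = {}
    for task_id, answers in answers_by_task.items():
        if not answers:
            continue  # nothing submitted yet for this task
        label, votes = Counter(answers).most_common(1)[0]
        results[task_id] = (label, votes / len(answers))
    return results


# Hypothetical modal-split labels: three contributors per trip segment.
answers = {
    "segment-1": ["bus", "bus", "car"],
    "segment-2": ["walk", "walk", "walk"],
    "segment-3": [],
}
consolidated = aggregate_majority(answers)
```

The agreement share is a cheap signal for deciding which tasks need more answers before the result is trusted.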
  • 22. How: Process In our example - parking 1. Crowdsourcing platform: Virtual City Explorer tool using virtual street imagery. Participants are paid. 2. Extension of existing system: SatNav prompting user to answer questions about parking availability. Contributions could be incentivised. 3. Data collection app: i-Log app launches challenges to collect parking pictures in a city. Best pictures receive a prize.
  • 23. Virtual City Explorer • Crowdsourcing platform for urban auditing, developed at the University of Southampton • People explore a virtual city via street imagery • They solve small tasks against micropayments • VCE validates answers, consolidates data and analyses user behaviour to propose optimisations
  • 24. i-Log and QrowdLab i-Log is an Android application developed at the University of Trento used for people-centric sensing QrowdLab is a citizen innovation lab set up in Trento to engage with citizens on city matters We need tools to connect with the citizens We need data to understand patterns of behaviour and collect missing data We need feedback on how people interact with the city and its infrastructure
  • 25. How: Process In our example – modal split • Combination of machine learning classifier, citizen sensing and labelled data collected via gamified challenges
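One common way to combine a classifier with crowd input is to accept confident predictions automatically and send the rest to people for labelling. The sketch below shows that pattern only; it is not taken from the Qrowd platform, and the threshold value, function name and trip ids are assumptions made for illustration.

```python
def route_predictions(predictions, threshold=0.8):
    """Split classifier output into auto-accepted labels and a crowd queue.

    predictions is a list of (item_id, label, confidence) tuples.
    Items at or above the (hypothetical) confidence threshold are accepted;
    the rest are queued for human labelling.
    """
    accepted, for_crowd = [], []
    for item_id, label, confidence in predictions:
        if confidence >= threshold:
            accepted.append((item_id, label))
        else:
            for_crowd.append(item_id)
    return accepted, for_crowd


# Hypothetical transport-mode predictions for three trip segments.
predictions = [
    ("trip-1", "bus", 0.95),
    ("trip-2", "car", 0.55),
    ("trip-3", "bike", 0.80),
]
accepted, for_crowd = route_predictions(predictions)
```

Labels collected for the queued items can then be added to the training data, which is where the human input improves later predictions.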
  • 26. Where do I deploy crowdsourcing? Do I need a new system? How do I allocate tasks to people? Or do I let them choose freely how to contribute? How do I deal with low quality solutions? Can I recognise good solutions easily?
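For the question of how to deal with low-quality solutions, a widely used technique is to mix in tasks whose answers are already known (gold questions) and to estimate each contributor's accuracy from them. The following is a minimal sketch under that assumption; the slides do not prescribe this method, and the names and the 0.7 cut-off are hypothetical.

```python
def worker_accuracy(responses, gold):
    """Estimate per-worker accuracy on tasks with a known answer.

    responses is a list of (worker_id, task_id, answer) tuples and gold maps
    task ids to their correct answers. Tasks without a gold answer are ignored.
    """
    correct, total = {}, {}
    for worker, task, answer in responses:
        if task not in gold:
            continue
        total[worker] = total.get(worker, 0) + 1
        correct[worker] = correct.get(worker, 0) + int(answer == gold[task])
    return {worker: correct[worker] / total[worker] for worker in total}


def trusted_workers(accuracy, min_accuracy=0.7):
    """Return the workers whose gold-question accuracy meets the cut-off."""
    return {worker for worker, acc in accuracy.items() if acc >= min_accuracy}


# Hypothetical parking-audit answers; g1 and g2 are gold questions.
gold = {"g1": "yes", "g2": "no"}
responses = [
    ("w1", "g1", "yes"), ("w1", "g2", "no"), ("w1", "t9", "yes"),
    ("w2", "g1", "no"), ("w2", "g2", "no"),
]
accuracy = worker_accuracy(responses, gold)
trusted = trusted_workers(accuracy)
```

Answers from contributors below the cut-off can be down-weighted or discarded before aggregation.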
  • 27. Why: money, love or glory Love and glory reduce costs Money and glory make the crowd move faster Intrinsic vs extrinsic motivation • Rewards/incentives influence motivation Successful unpaid crowdsourcing is difficult to predict or replicate • Highly context-specific • Not applicable to arbitrary tasks Reward models often easier to study and control (if performance can be reliably measured) • Not always easy to abstract from social aspects (free-riding, social pressure) • May undermine intrinsic motivation
  • 28. Why In our examples Who benefits from the results? Who owns the results? How much effort does it require from the crowd? Money Different models: pay-per-time, pay-per-unit, winner-takes-it-all Define the rewards, analyse trade-offs accuracy vs. costs, avoid spam Love OpenStreetMap, games, citizen panels Glory Competitions, awards
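The accuracy-versus-cost trade-off for paid crowds can be made concrete with a back-of-the-envelope budget. The sketch compares the pay-per-unit and pay-per-time models named on this slide; all prices, durations and the redundancy factor are made-up figures, not numbers from the talk.

```python
def cost_pay_per_unit(n_tasks, redundancy, price_per_answer):
    """Total cost when each submitted answer is paid a fixed price."""
    return n_tasks * redundancy * price_per_answer


def cost_pay_per_time(n_tasks, redundancy, seconds_per_answer, hourly_rate):
    """Total cost when contributors are paid for the time they spend."""
    hours = n_tasks * redundancy * seconds_per_answer / 3600
    return hours * hourly_rate


# Hypothetical budget: 1,000 microtasks, each answered by 3 contributors.
per_unit = cost_pay_per_unit(n_tasks=1000, redundancy=3, price_per_answer=0.05)
per_time = cost_pay_per_time(
    n_tasks=1000, redundancy=3, seconds_per_answer=30, hourly_rate=12.0
)
```

Raising the redundancy usually improves the reliability of aggregated answers, but it multiplies the cost under either model, which is the trade-off the slide asks organisations to analyse.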
  • 29. Why would anyone care to contribute? Is the task intrinsically rewarding? What would motivate people to participate? How do I sustain participation?
  • 30. Leveraging the human factor The most sophisticated AI systems showcase ingenious combinations of human and machine intelligence Crowdsourcing can augment any aspect of the data value chain Our methodology can help organisations understand how to use crowdsourcing effectively Qrowd develops a platform with integrated crowdsourcing support to deploy hybrid data collection and analysis workflows
  • 31. Further reading • Qrowd project: qrowd-project.eu, @QrowdProject • Figure Eight: figure-eight.com • How to use crowdsourcing effectively, Simperl, E. (2015): https://www.liberquarterly.eu/articles/10.18352/lq.9948/ • When computers were human, David Alan Grier, 2007 • The collective intelligence genome, Malone, T. W., Laubacher, R., & Dellarocas, C. (2010). MIT Sloan Management Review, 51(3), 21. • Getting Results from Crowds: The Definitive Guide to Using Crowdsourcing to Grow, Dawson, R. and Bynghall, S. (2011). Advanced Human Technologies