SlideShare a Scribd company logo
1 of 38
Social Mining &
Big Data Analytics
H2020 - www.sobigdata.eu
September 2015- August 2019
@SoBigData (https://twitter.com/SoBigData)
https://www.facebook.com/SoBigData
The Consortium
Delft 17 – 19 February 2016
Integrating national research
Infrastructures
SoBigData is…
A Multidisciplinary European Infrastructure on Big Data and Social
Data Mining providing an integrated ecosystem for ethic-sensitive
scientific discoveries and advanced applications of social data
mining on the various dimensions of social life, as recorded by “big
data”.
SMARTCATs
The GOAL of the Research Infrastructure
• to integrate key national infrastructures and centres of
excellence at European level in social mining and big data
analytics
• to enable cutting-edge, multi-disciplinary social mining &
responsible data science experiments leveraging the Research
Infrastructure assets: big data sets, analytical tools and
services, and data scientist skills
• to grant access (both online and on-site) to multidisciplinary
scientists, innovators, public bodies, citizen organizations,
SMEs, as well as data science students at any level of
education.
SMARTCATs
The pillars for reaching the goal
• a distributed data ecosystem for procurement, access and
curation of big social data
• a distributed platform of interoperable, social data mining
methods and associated “data scientist” skills for mining,
analysing, and visualising complex and massive datasets
• a community of multidisciplinary scientists, innovators,
public bodies, citizen organizations, SMEs, as well as data
science students at any level of education scientific, brought
together by extensive networking and innovation actions
What are doing our researcher?
• any responsible data science experiment is
composed by:
– data acquiring (open data, crowdsourcing,
crowdsensing,)
– model building (very complex validation phase),
– creation of an exploration scenario (what-if
analysis) (different validation setting),
– ….similar to many other data-driven science
process,…but data are produced by humans
Delft 17 – 19 February 2016
Exploratories
Social Mining Research Environments
tailored on specific multidisciplinary
domains
• Promotes results sharing among scientists and
communities
• Promotes the use of RI through Virtual and
Transnational Access
Big Data for Societal Debates
Polarization, controversy and topic trends on societal debates through social media
Lead by Aris Gionis and Dominic Rout
Polarized Political Debates
Monitoring Topics across Time and space
Exploratory: Big Data for City of Citizens
Lead by Roberto Trasarti
Estimating traffic fluxes on road network
A
B
C
H
W
Big Data for Well Being and Economic
Performance
Deprivation Index (in France) predicted with Mobile Phone traces
Lead by Peep Kungas
Well-being and Economic
Performance
Systemic Risk and Gender Diversity
BigData & Migration Studies
Sentiment Analysis
• Internal and external perception by country
– Index ρ - the ratio between pro refugees users and against refugees
users
– Red means a higher predominance of positive sentiment, higher ρ
– Yellow means a higher predominance of negative sentiment, lower ρ
(a) Overall. (b) Internal
perception.
(c) External
perception.
-
+
-
+
-
+
SoBigData e-Infrastructure
• An Exploratory is also a Virtual Research Environments :
– VRE are web-based, community-oriented, comprehensive, flexible, and
secure working environments
– VREs are tailored to satisfy the needs of a designated community.
• services for data and methods discovery and access
• collaboration oriented facilities enabling scientists
– DATA: different sharing policy, may be shared or not
– METHODS: web services (executed over a variety of data centers), or
downloaded (packages to be executed on DATA side)
– WORKFLOW: complex analytical process that may imply executions on
different sites. (currently only description or on-site execution on some
special analytical platforms)
Firenze, 14 Nov 2016
e-infra design
• SoBigData.eu portal
• Sobigdata.eu Catalogue: a set of
functionalities to search, index and discovery
all resources (Data, Models and workflows)
(powered by D4Science)
• Virtual research Environments (Exploratories)
functionalities to create, update and
operation (powered by D4Science):
Delft 17 – 19 February 2016
Firenze, 14 Nov 2016
Firenze, 14 Nov 2016
SoBigData example: Resource Catalogue
Search for datasets and methods
Description
Recent Activities
Action Bar
Recent
Products
Statistics
Firenze, 14 Nov 2016
VRE example: SoBigData VRE
Application
Posting messages to other VRE users
VRE
Abstract
VRE
Managers
News Feed
Top Topics
Recent
Files
Firenze, 14 Nov 2016
The ethics of SoBigData
• Gathering large quantities of data may have serious
consequences:
– consequences range from personal harm,
– to issues of autonomy, injustice and inequality.
• Making Big Data accessible is a value for democracy
• SoBigData adheres to a value-sensitive design
approach:
– design solutions to overcome ethical dilemma’s, in this
case those between the utility of the data gathered vs.
the protection of the individuals subject to the research.
Ethics in practice
• SoBigData has an ethical framework which
provides a broad overview of all the ethical
concerns of big data.
• But, as per the VSD outlook, data protection is
not only the concern of the ethicists. In order
to make the ideals of SoBigData successful,
scientific methods also need to be developed
in order embed moral principles in practice.
The ethics of SoBigData
• How do we create an infrastructure in which such
methods can be disseminated and improved
upon?
• Data Management Plan plays a key role:
– Each data has its privacy requirements and fact checks
and responsibility
• Anonymization techniques are part of the
research
• Researchers will be trained in applying the
necessary procedural safeguards
Anonymization
Service Provider
Mining and
Analytical Engine
Educating the responsible data
scientists
Based on a cooperation between ethicists and
computer
1. A Massive Online Open Course (MOOC) which
instructs all prospective researchers about the
legal and ethical dangers of big data research
and the steps they can take to minimise these;
2. A set of workflows that outline the steps
researchers can take when designing their
approach;
3. Information pop-ups which redirect researchers
to state-of-the-art ethical methods.
New challenges are coming
• One of the OECD ideals is algorithmic
transparency and the GDPR, also, says that
decision-making algorithms should
be explainable.
• But what is enough to constitute an
explanation?
• We're working on developing some sort of
template that would satisfy most people's
conceptions of what an explanation should be
• What function should a terms of use have?
Currently, SoBigData is trying to defer legal
responsibility to the Final User through the ToS,
but this is difficult.
• Example: How do we deal with Twitter's
intellectual property rights? May he scraper
violate Twitter terms of use? although of course
the above also holds. How does SoBigData relate
to data collectors terms of service
Delft 17 – 19 February 2016
New challenges are coming
SoBigData metadata structure
• A highly structured and detailed metadata
structure has been designed in order to provide
information about:
– Description of the dataset (to make it Findable)
– How the dataset has been produced
– Intellectual Property
– Privacy issues
– Who can access the data and how (terms of use,
NDA…)
• Mainly based on the DataCite standard
Delft 17 – 19 February 2016
• Copyright
Copyright is a legal right that grants the creator of an original work
exclusive rights for its use and distribution for a limited amount of time.
Copyright can exist on individual data as well as over a dataset or
database as a whole. The application of copyright to factual data and
metadata have no eligibility for copyright protection.
• License
A license is a unilateral permission by the right holder from the licensor to
the licensee to use certain rights. Licenses distinguish themselves from
contracts since the implementation of a license does not require mutual
agreement.
• Terms of Use
The terms of use are rules that one must obey in order to use the data or
service. The terms of use agreement is mainly used for legal purposes by
data providers and databases that store data. A legitimate terms of use
agreement is legally binding and may be subject to change.
Managing data: disambiguating terms
Firenze, 14 Nov 2016
There are three broad types of data:
• Primary/raw data: data coming directly from the
source;
• Derivative data: data processed from primary/raw
data;
• Metadata: reference data describing either the
primary/raw data or derivative data.
Primary/raw data and derivative data may be licensed
under different conditions and by different stakeholders
Managing data: type of data and their licenses
Firenze, 14 Nov 2016
The e-infrastructure offering discover and access may be operated by a
different actor and offered through specific terms of use.
In this case three types of licenses may be involved:
1. the one agreed between the system operator and the primary data
owner,
2. the one selected for derivative product that may differ from the one
associated with primary data, and
3. the one agreed between the system operator and the data
consumer.
All these licenses have to be captured by the “terms of use” of the e-
infrastructure, i.e., they are part of the rules a consumer must agree to
accept when using the system.
Managing data: dealing with complexity
Firenze, 14 Nov 2016
A re-use license (to specify in the ToU of the e-infrastructure) concerns at least
attribution, copyleft requirement, and control on commercial exploitation of the
dataset. Moreover it is needed to manage and apply some forms of control on access.
Virtual Research Environments (VREs) offer flexible and secure web-based,
community-centric platforms, so researchers can work together on common
challenges.
VREs terms of use are automatically composed according to the combined data and
services selected at the time of VRE definition.
• Raw data are then licensed according to the license expressed by the data
owner/custodian and expressed at time of registration of the data content to the e-
infrastructure.
• Derivative data products instead are licensed with a license compatible and legally
interoperable with the one associated with the primary data.
• It remains under the responsibility of a single user, as expressed in the VRE terms
of use, to confirm the license to associate with any produced derivative data.
Managing data:
VRE as an instrument to manage the complexity
Firenze, 14 Nov 2016
Meta data definition: Ethics
Firenze, 14 Nov 2016
Meta data definition: Intellectual Properties
Firenze, 14 Nov 2016
Il laboratorio di ricerca SoBigData.it organizza
la prima Tuscan Big Data Challenge,
un’occasione gratuita per le aziende per
migliorare il proprio business. Grazie a
un’analisi avanzata ei dati prodotti dalle
aziende o estratti da internet è possibile
ricavare informazioni utili su diversi fronti.

More Related Content

What's hot

Monthly statistics of the RDA community - February 2016
Monthly statistics of the RDA community - February 2016Monthly statistics of the RDA community - February 2016
Monthly statistics of the RDA community - February 2016Research Data Alliance
 
Research Data Alliance Member Statistics January 2016
Research Data Alliance Member Statistics January 2016Research Data Alliance Member Statistics January 2016
Research Data Alliance Member Statistics January 2016Research Data Alliance
 
Wood-RDA and-data publishing-nfdp13
Wood-RDA and-data publishing-nfdp13Wood-RDA and-data publishing-nfdp13
Wood-RDA and-data publishing-nfdp13DataDryad
 
Monthly statistics of the RDA community - March 2016
Monthly statistics of the RDA community - March 2016Monthly statistics of the RDA community - March 2016
Monthly statistics of the RDA community - March 2016Research Data Alliance
 
Research Data Alliance Member Statistics December 2015
Research Data Alliance Member Statistics December 2015Research Data Alliance Member Statistics December 2015
Research Data Alliance Member Statistics December 2015Research Data Alliance
 

What's hot (20)

Rda in a_nutshell_october_2018
Rda in a_nutshell_october_2018Rda in a_nutshell_october_2018
Rda in a_nutshell_october_2018
 
Rda in a Nutshell - February 2019
Rda in a Nutshell - February 2019Rda in a Nutshell - February 2019
Rda in a Nutshell - February 2019
 
Monthly statistics of the RDA community - February 2016
Monthly statistics of the RDA community - February 2016Monthly statistics of the RDA community - February 2016
Monthly statistics of the RDA community - February 2016
 
Research Data Alliance Member Statistics January 2016
Research Data Alliance Member Statistics January 2016Research Data Alliance Member Statistics January 2016
Research Data Alliance Member Statistics January 2016
 
Rda in a_nutshell_december_2018
Rda in a_nutshell_december_2018Rda in a_nutshell_december_2018
Rda in a_nutshell_december_2018
 
Rda in a_nutshell_aug_2018
Rda in a_nutshell_aug_2018Rda in a_nutshell_aug_2018
Rda in a_nutshell_aug_2018
 
Rda in a_nutshell_november_2018
Rda in a_nutshell_november_2018Rda in a_nutshell_november_2018
Rda in a_nutshell_november_2018
 
RDA in a nutshell May 2016
RDA in a nutshell May 2016RDA in a nutshell May 2016
RDA in a nutshell May 2016
 
Wood-RDA and-data publishing-nfdp13
Wood-RDA and-data publishing-nfdp13Wood-RDA and-data publishing-nfdp13
Wood-RDA and-data publishing-nfdp13
 
Rda in a_nutshell_january_2017
Rda in a_nutshell_january_2017Rda in a_nutshell_january_2017
Rda in a_nutshell_january_2017
 
RDA in a nutshell September 2016
RDA in a nutshell September 2016RDA in a nutshell September 2016
RDA in a nutshell September 2016
 
Monthly statistics of the RDA community - March 2016
Monthly statistics of the RDA community - March 2016Monthly statistics of the RDA community - March 2016
Monthly statistics of the RDA community - March 2016
 
Rda in a_nutshell_march_2018
Rda in a_nutshell_march_2018Rda in a_nutshell_march_2018
Rda in a_nutshell_march_2018
 
Research Data Alliance Member Statistics December 2015
Research Data Alliance Member Statistics December 2015Research Data Alliance Member Statistics December 2015
Research Data Alliance Member Statistics December 2015
 
Rda members 05 april2016
Rda members 05 april2016Rda members 05 april2016
Rda members 05 april2016
 
Rda in a nutshell - April 2019
Rda in a nutshell - April 2019Rda in a nutshell - April 2019
Rda in a nutshell - April 2019
 
Rda in a nutshell may 2019
Rda in a nutshell may 2019Rda in a nutshell may 2019
Rda in a nutshell may 2019
 
Rda in a nutshell - June 2019
Rda in a nutshell - June 2019Rda in a nutshell - June 2019
Rda in a nutshell - June 2019
 
Rda in-a-nutshell-july-2019
Rda in-a-nutshell-july-2019Rda in-a-nutshell-july-2019
Rda in-a-nutshell-july-2019
 
Rda in a_nutshell_march_2017
Rda in a_nutshell_march_2017Rda in a_nutshell_march_2017
Rda in a_nutshell_march_2017
 

Viewers also liked

Fosca Giannotti - Università di Pisa & ISTI-CNR - Big Data and Social Data Mi...
Fosca Giannotti - Università di Pisa & ISTI-CNR - Big Data and Social Data Mi...Fosca Giannotti - Università di Pisa & ISTI-CNR - Big Data and Social Data Mi...
Fosca Giannotti - Università di Pisa & ISTI-CNR - Big Data and Social Data Mi...AmbasciatadelCanada
 
Social Network Analysis Project
Social Network Analysis ProjectSocial Network Analysis Project
Social Network Analysis ProjectFrancesco Corucci
 
Dino pedreschi keynote ieee cist 2014 BIG DATA ANALYTICS & SOCIAL MINING
Dino pedreschi keynote ieee cist 2014 BIG DATA ANALYTICS & SOCIAL MININGDino pedreschi keynote ieee cist 2014 BIG DATA ANALYTICS & SOCIAL MINING
Dino pedreschi keynote ieee cist 2014 BIG DATA ANALYTICS & SOCIAL MININGieee-cist
 
Encouraging an ecological evolution of data infrastructure
Encouraging an ecological evolution of data infrastructureEncouraging an ecological evolution of data infrastructure
Encouraging an ecological evolution of data infrastructureResearch Data Alliance
 
Infrastructure, relationships, trust, and RDA
Infrastructure, relationships, trust, and RDAInfrastructure, relationships, trust, and RDA
Infrastructure, relationships, trust, and RDAResearch Data Alliance
 
Going Glocal—Polar Data in a Global Infrastructure
Going Glocal—Polar Data in a Global InfrastructureGoing Glocal—Polar Data in a Global Infrastructure
Going Glocal—Polar Data in a Global InfrastructureResearch Data Alliance
 
Educating Data Scientists: the SoBigData master experience
Educating Data Scientists: the SoBigData master experienceEducating Data Scientists: the SoBigData master experience
Educating Data Scientists: the SoBigData master experienceResearch Data Alliance
 
Removing Barriers to Data Sharing: the Research Data Alliance
Removing Barriers to Data Sharing: the Research Data AllianceRemoving Barriers to Data Sharing: the Research Data Alliance
Removing Barriers to Data Sharing: the Research Data AllianceResearch Data Alliance
 
Parsons scidatacon2016
Parsons scidatacon2016Parsons scidatacon2016
Parsons scidatacon2016Mark Parsons
 
Data Policy for Open Science
Data Policy for Open ScienceData Policy for Open Science
Data Policy for Open ScienceMark Parsons
 
Parsons citation geodata2014
Parsons citation geodata2014Parsons citation geodata2014
Parsons citation geodata2014Mark Parsons
 
CoBRA guideline : a tool to facilitate sharing, reuse, and reproducibility of...
CoBRA guideline : a tool to facilitate sharing, reuse, and reproducibility of...CoBRA guideline : a tool to facilitate sharing, reuse, and reproducibility of...
CoBRA guideline : a tool to facilitate sharing, reuse, and reproducibility of...Research Data Alliance
 
Cultural Heritage: when data are much worst than one can believe
Cultural Heritage: when data are much worst than one can believe Cultural Heritage: when data are much worst than one can believe
Cultural Heritage: when data are much worst than one can believe Research Data Alliance
 
Why Data Citation Currently Misses the Point
Why Data Citation Currently Misses the PointWhy Data Citation Currently Misses the Point
Why Data Citation Currently Misses the PointMark Parsons
 
Open Data is Not Enough: Making Data Sharing Work
Open Data is Not Enough: Making Data Sharing WorkOpen Data is Not Enough: Making Data Sharing Work
Open Data is Not Enough: Making Data Sharing WorkResearch Data Alliance
 
Stories of “Glocality"—Nations in a Global Infrastructure
Stories of “Glocality"—Nations in a Global InfrastructureStories of “Glocality"—Nations in a Global Infrastructure
Stories of “Glocality"—Nations in a Global InfrastructureResearch Data Alliance
 
Efficient and effective: can we combine both to realize high-value, open, sca...
Efficient and effective: can we combine both to realize high-value, open, sca...Efficient and effective: can we combine both to realize high-value, open, sca...
Efficient and effective: can we combine both to realize high-value, open, sca...Research Data Alliance
 

Viewers also liked (20)

Fosca Giannotti - Università di Pisa & ISTI-CNR - Big Data and Social Data Mi...
Fosca Giannotti - Università di Pisa & ISTI-CNR - Big Data and Social Data Mi...Fosca Giannotti - Università di Pisa & ISTI-CNR - Big Data and Social Data Mi...
Fosca Giannotti - Università di Pisa & ISTI-CNR - Big Data and Social Data Mi...
 
Social Network Analysis Project
Social Network Analysis ProjectSocial Network Analysis Project
Social Network Analysis Project
 
Dino pedreschi keynote ieee cist 2014 BIG DATA ANALYTICS & SOCIAL MINING
Dino pedreschi keynote ieee cist 2014 BIG DATA ANALYTICS & SOCIAL MININGDino pedreschi keynote ieee cist 2014 BIG DATA ANALYTICS & SOCIAL MINING
Dino pedreschi keynote ieee cist 2014 BIG DATA ANALYTICS & SOCIAL MINING
 
Data Policy for Open Science
Data Policy for Open ScienceData Policy for Open Science
Data Policy for Open Science
 
Encouraging an ecological evolution of data infrastructure
Encouraging an ecological evolution of data infrastructureEncouraging an ecological evolution of data infrastructure
Encouraging an ecological evolution of data infrastructure
 
Infrastructure, relationships, trust, and RDA
Infrastructure, relationships, trust, and RDAInfrastructure, relationships, trust, and RDA
Infrastructure, relationships, trust, and RDA
 
Going Glocal—Polar Data in a Global Infrastructure
Going Glocal—Polar Data in a Global InfrastructureGoing Glocal—Polar Data in a Global Infrastructure
Going Glocal—Polar Data in a Global Infrastructure
 
Educating Data Scientists: the SoBigData master experience
Educating Data Scientists: the SoBigData master experienceEducating Data Scientists: the SoBigData master experience
Educating Data Scientists: the SoBigData master experience
 
Removing Barriers to Data Sharing: the Research Data Alliance
Removing Barriers to Data Sharing: the Research Data AllianceRemoving Barriers to Data Sharing: the Research Data Alliance
Removing Barriers to Data Sharing: the Research Data Alliance
 
Research Data Alliance Overview
Research Data Alliance OverviewResearch Data Alliance Overview
Research Data Alliance Overview
 
Parsons scidatacon2016
Parsons scidatacon2016Parsons scidatacon2016
Parsons scidatacon2016
 
Data Policy for Open Science
Data Policy for Open ScienceData Policy for Open Science
Data Policy for Open Science
 
Parsons citation geodata2014
Parsons citation geodata2014Parsons citation geodata2014
Parsons citation geodata2014
 
CoBRA guideline : a tool to facilitate sharing, reuse, and reproducibility of...
CoBRA guideline : a tool to facilitate sharing, reuse, and reproducibility of...CoBRA guideline : a tool to facilitate sharing, reuse, and reproducibility of...
CoBRA guideline : a tool to facilitate sharing, reuse, and reproducibility of...
 
Cultural Heritage: when data are much worst than one can believe
Cultural Heritage: when data are much worst than one can believe Cultural Heritage: when data are much worst than one can believe
Cultural Heritage: when data are much worst than one can believe
 
Why Data Citation Currently Misses the Point
Why Data Citation Currently Misses the PointWhy Data Citation Currently Misses the Point
Why Data Citation Currently Misses the Point
 
Open Data is Not Enough: Making Data Sharing Work
Open Data is Not Enough: Making Data Sharing WorkOpen Data is Not Enough: Making Data Sharing Work
Open Data is Not Enough: Making Data Sharing Work
 
Stories of “Glocality"—Nations in a Global Infrastructure
Stories of “Glocality"—Nations in a Global InfrastructureStories of “Glocality"—Nations in a Global Infrastructure
Stories of “Glocality"—Nations in a Global Infrastructure
 
Efficient and effective: can we combine both to realize high-value, open, sca...
Efficient and effective: can we combine both to realize high-value, open, sca...Efficient and effective: can we combine both to realize high-value, open, sca...
Efficient and effective: can we combine both to realize high-value, open, sca...
 
Research Data Alliance Overview
Research Data Alliance OverviewResearch Data Alliance Overview
Research Data Alliance Overview
 

Similar to SoBigData. European Research Infrastructure for Big Data and Social Mining

Big Data Europe SC6 WS 3: Where we are and are going for Big Data in OpenScie...
Big Data Europe SC6 WS 3: Where we are and are going for Big Data in OpenScie...Big Data Europe SC6 WS 3: Where we are and are going for Big Data in OpenScie...
Big Data Europe SC6 WS 3: Where we are and are going for Big Data in OpenScie...BigData_Europe
 
Open Access Week 2017: Introduction to Open Data Policies in H2020
Open Access Week 2017: Introduction to Open Data Policies in H2020Open Access Week 2017: Introduction to Open Data Policies in H2020
Open Access Week 2017: Introduction to Open Data Policies in H2020OpenAIRE
 
General introduction to Open Data Policies H2020, influence of OD policies on...
General introduction to Open Data Policies H2020, influence of OD policies on...General introduction to Open Data Policies H2020, influence of OD policies on...
General introduction to Open Data Policies H2020, influence of OD policies on...Nancy Pontika
 
What is open data
What is open dataWhat is open data
What is open dataScott Sosna
 
Open data for UK public sector organisations
Open data for UK public sector organisationsOpen data for UK public sector organisations
Open data for UK public sector organisationsAndrew Mackenzie
 
Building blocks for fair digital society
Building blocks for fair digital societyBuilding blocks for fair digital society
Building blocks for fair digital societySitra / Hyvinvointi
 
Big Data in a Digital City. Key Insights from the Smart City Case Study
Big Data in a Digital City. Key Insights from the Smart City Case StudyBig Data in a Digital City. Key Insights from the Smart City Case Study
Big Data in a Digital City. Key Insights from the Smart City Case StudyBYTE Project
 
Enabling Smarter Cities through Internet of Things, Web of Data & Citizen Par...
Enabling Smarter Cities through Internet of Things, Web of Data & Citizen Par...Enabling Smarter Cities through Internet of Things, Web of Data & Citizen Par...
Enabling Smarter Cities through Internet of Things, Web of Data & Citizen Par...Diego López-de-Ipiña González-de-Artaza
 
DataWeek 2023 Participatory data for innovation - URBANITE v3.pdf
DataWeek 2023 Participatory data for innovation - URBANITE v3.pdfDataWeek 2023 Participatory data for innovation - URBANITE v3.pdf
DataWeek 2023 Participatory data for innovation - URBANITE v3.pdfURBANITEProject
 
Local Open Data: a perspective from local government in England 2014
Local Open Data: a perspective from local government in England 2014Local Open Data: a perspective from local government in England 2014
Local Open Data: a perspective from local government in England 2014Gesche Schmid
 
Local Open Data: A perspective from local government in England by Gesche Schmid
Local Open Data: A perspective from local government in England by Gesche SchmidLocal Open Data: A perspective from local government in England by Gesche Schmid
Local Open Data: A perspective from local government in England by Gesche SchmidOpening-up.eu
 
UNIT 1 -BIG DATA ANALYTICS Full.pdf
UNIT 1 -BIG DATA ANALYTICS Full.pdfUNIT 1 -BIG DATA ANALYTICS Full.pdf
UNIT 1 -BIG DATA ANALYTICS Full.pdfvvpadhu
 
Data ecosystems: turning data into public value
Data ecosystems:  turning data into public valueData ecosystems:  turning data into public value
Data ecosystems: turning data into public valueSlim Turki, Dr.
 
A coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonA coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonAfrican Open Science Platform
 
Closing plenary: the future of public sector websites #BPCW11
Closing plenary: the future of public sector websites #BPCW11Closing plenary: the future of public sector websites #BPCW11
Closing plenary: the future of public sector websites #BPCW11Headstar
 
Aligning stakeholders' perspectives in Open Government Data Community
Aligning stakeholders' perspectives in Open Government Data CommunityAligning stakeholders' perspectives in Open Government Data Community
Aligning stakeholders' perspectives in Open Government Data CommunityAdegboyega Ojo
 
Opportunities and methodological challenges of Big Data for official statist...
Opportunities and methodological challenges of  Big Data for official statist...Opportunities and methodological challenges of  Big Data for official statist...
Opportunities and methodological challenges of Big Data for official statist...Piet J.H. Daas
 

Similar to SoBigData. European Research Infrastructure for Big Data and Social Mining (20)

Big Data Europe SC6 WS 3: Where we are and are going for Big Data in OpenScie...
Big Data Europe SC6 WS 3: Where we are and are going for Big Data in OpenScie...Big Data Europe SC6 WS 3: Where we are and are going for Big Data in OpenScie...
Big Data Europe SC6 WS 3: Where we are and are going for Big Data in OpenScie...
 
Open Access Week 2017: Introduction to Open Data Policies in H2020
Open Access Week 2017: Introduction to Open Data Policies in H2020Open Access Week 2017: Introduction to Open Data Policies in H2020
Open Access Week 2017: Introduction to Open Data Policies in H2020
 
General introduction to Open Data Policies H2020, influence of OD policies on...
General introduction to Open Data Policies H2020, influence of OD policies on...General introduction to Open Data Policies H2020, influence of OD policies on...
General introduction to Open Data Policies H2020, influence of OD policies on...
 
What is open data
What is open dataWhat is open data
What is open data
 
Open data for UK public sector organisations
Open data for UK public sector organisationsOpen data for UK public sector organisations
Open data for UK public sector organisations
 
Applications of Big Data
Applications of Big DataApplications of Big Data
Applications of Big Data
 
Building blocks for fair digital society
Building blocks for fair digital societyBuilding blocks for fair digital society
Building blocks for fair digital society
 
Big Data in a Digital City. Key Insights from the Smart City Case Study
Big Data in a Digital City. Key Insights from the Smart City Case StudyBig Data in a Digital City. Key Insights from the Smart City Case Study
Big Data in a Digital City. Key Insights from the Smart City Case Study
 
Enabling Smarter Cities through Internet of Things, Web of Data & Citizen Par...
Enabling Smarter Cities through Internet of Things, Web of Data & Citizen Par...Enabling Smarter Cities through Internet of Things, Web of Data & Citizen Par...
Enabling Smarter Cities through Internet of Things, Web of Data & Citizen Par...
 
DataWeek 2023 Participatory data for innovation - URBANITE v3.pdf
DataWeek 2023 Participatory data for innovation - URBANITE v3.pdfDataWeek 2023 Participatory data for innovation - URBANITE v3.pdf
DataWeek 2023 Participatory data for innovation - URBANITE v3.pdf
 
Open Data is not Enough
Open Data is not EnoughOpen Data is not Enough
Open Data is not Enough
 
Local Open Data: a perspective from local government in England 2014
Local Open Data: a perspective from local government in England 2014Local Open Data: a perspective from local government in England 2014
Local Open Data: a perspective from local government in England 2014
 
Local Open Data: A perspective from local government in England by Gesche Schmid
Local Open Data: A perspective from local government in England by Gesche SchmidLocal Open Data: A perspective from local government in England by Gesche Schmid
Local Open Data: A perspective from local government in England by Gesche Schmid
 
UNIT 1 -BIG DATA ANALYTICS Full.pdf
UNIT 1 -BIG DATA ANALYTICS Full.pdfUNIT 1 -BIG DATA ANALYTICS Full.pdf
UNIT 1 -BIG DATA ANALYTICS Full.pdf
 
Data ecosystems: turning data into public value
Data ecosystems:  turning data into public valueData ecosystems:  turning data into public value
Data ecosystems: turning data into public value
 
Rdaeu russia_fg_1_july2014_final
Rdaeu  russia_fg_1_july2014_finalRdaeu  russia_fg_1_july2014_final
Rdaeu russia_fg_1_july2014_final
 
A coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon HodsonA coordinated framework for open data open science in Botswana/Simon Hodson
A coordinated framework for open data open science in Botswana/Simon Hodson
 
Closing plenary: the future of public sector websites #BPCW11
Closing plenary: the future of public sector websites #BPCW11Closing plenary: the future of public sector websites #BPCW11
Closing plenary: the future of public sector websites #BPCW11
 
Aligning stakeholders' perspectives in Open Government Data Community
Aligning stakeholders' perspectives in Open Government Data CommunityAligning stakeholders' perspectives in Open Government Data Community
Aligning stakeholders' perspectives in Open Government Data Community
 
Opportunities and methodological challenges of Big Data for official statist...
Opportunities and methodological challenges of  Big Data for official statist...Opportunities and methodological challenges of  Big Data for official statist...
Opportunities and methodological challenges of Big Data for official statist...
 

More from Research Data Alliance

The Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to IndividualsThe Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to IndividualsResearch Data Alliance
 
The Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to IndividualsThe Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to IndividualsResearch Data Alliance
 
RDA Value for Infrastructure Providers
RDA Value for Infrastructure ProvidersRDA Value for Infrastructure Providers
RDA Value for Infrastructure ProvidersResearch Data Alliance
 
The Value of the Rda Value for Organisations Performing Research
The Value of the Rda Value for Organisations Performing ResearchThe Value of the Rda Value for Organisations Performing Research
The Value of the Rda Value for Organisations Performing ResearchResearch Data Alliance
 

More from Research Data Alliance (20)

RDA in a Nutshell - September 2020
RDA in a Nutshell - September 2020RDA in a Nutshell - September 2020
RDA in a Nutshell - September 2020
 
RDA in a Nutshell - August 2020
RDA in a Nutshell - August 2020RDA in a Nutshell - August 2020
RDA in a Nutshell - August 2020
 
RDA in a Nutshell - July 2020
RDA in a Nutshell - July 2020RDA in a Nutshell - July 2020
RDA in a Nutshell - July 2020
 
RDA in a Nutshell - June 2020
RDA in a Nutshell - June 2020RDA in a Nutshell - June 2020
RDA in a Nutshell - June 2020
 
RDA in a Nutshell - May 2020
RDA in a Nutshell - May 2020RDA in a Nutshell - May 2020
RDA in a Nutshell - May 2020
 
RDA in a Nutshell - April 2020
RDA in a Nutshell - April 2020RDA in a Nutshell - April 2020
RDA in a Nutshell - April 2020
 
RDA in a Nutshell - March 2020
RDA in a Nutshell - March 2020RDA in a Nutshell - March 2020
RDA in a Nutshell - March 2020
 
RDA in a Nutshell - February 2020
RDA in a Nutshell - February 2020RDA in a Nutshell - February 2020
RDA in a Nutshell - February 2020
 
RDA in a Nutshell - January 2020
RDA in a Nutshell - January 2020RDA in a Nutshell - January 2020
RDA in a Nutshell - January 2020
 
Rda in a Nutshell - December 2019
Rda in a Nutshell - December 2019Rda in a Nutshell - December 2019
Rda in a Nutshell - December 2019
 
Rda in a Nutshell - November 2019
Rda in a Nutshell - November 2019Rda in a Nutshell - November 2019
Rda in a Nutshell - November 2019
 
RDA in a Nutshell - October 2019
RDA in a Nutshell - October 2019RDA in a Nutshell - October 2019
RDA in a Nutshell - October 2019
 
The Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to IndividualsThe Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to Individuals
 
The Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to IndividualsThe Value of the Research Data Alliance to Individuals
The Value of the Research Data Alliance to Individuals
 
RDA Value for Infrastructure Providers
RDA Value for Infrastructure ProvidersRDA Value for Infrastructure Providers
RDA Value for Infrastructure Providers
 
Rda in a nutshell september 2019
Rda in a nutshell september 2019Rda in a nutshell september 2019
Rda in a nutshell september 2019
 
The Value of the Rda Value for Organisations Performing Research
The Value of the Rda Value for Organisations Performing ResearchThe Value of the Rda Value for Organisations Performing Research
The Value of the Rda Value for Organisations Performing Research
 
RDA Value for Libraries
RDA Value for LibrariesRDA Value for Libraries
RDA Value for Libraries
 
The Value of the RDA for Funders
The Value of the RDA for FundersThe Value of the RDA for Funders
The Value of the RDA for Funders
 
Rda in a nutshell august 2019
Rda in a nutshell august 2019Rda in a nutshell august 2019
Rda in a nutshell august 2019
 

Recently uploaded

办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
While-For-loop in python used in college
While-For-loop in python used in collegeWhile-For-loop in python used in college
While-For-loop in python used in collegessuser7a7cd61
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...Florian Roscheck
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanMYRABACSAFRA2
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxBoston Institute of Analytics
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 

Recently uploaded (20)

办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
While-For-loop in python used in college
While-For-loop in python used in collegeWhile-For-loop in python used in college
While-For-loop in python used in college
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population MeanIdentifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
Call Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort ServiceCall Girls in Saket 99530🔝 56974 Escort Service
Call Girls in Saket 99530🔝 56974 Escort Service
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 

SoBigData. European Research Infrastructure for Big Data and Social Mining

  • 1. Social Mining & Big Data Analytics H2020 - www.sobigdata.eu September 2015- August 2019 @SoBigData (https://twitter.com/SoBigData) https://www.facebook.com/SoBigData
  • 3. Delft 17 – 19 February 2016 Integrating national research Infrastructures
  • 4. SoBigData is… A Multidisciplinary European Infrastructure on Big Data and Social Data Mining providing an integrated ecosystem for ethic-sensitive scientific discoveries and advanced applications of social data mining on the various dimensions of social life, as recorded by “big data”.
  • 5. SMARTCATs The GOAL of the Research Infrastructure • to integrate key national infrastructures and centres of excellence at European level in social mining and big data analytics • to enable cutting-edge, multi-disciplinary social mining & responsible data science experiments leveraging the Research Infrastructure assets: big data sets, analytical tools and services, and data scientist skills • to grant access (both online and on-site) to multidisciplinary scientists, innovators, public bodies, citizen organizations, SMEs, as well as data science students at any level of education.
  • 6. SMARTCATs The pillars for reaching the goal • a distributed data ecosystem for procurement, access and curation of big social data • a distributed platform of interoperable, social data mining methods and associated “data scientist” skills for mining, analysing, and visualising complex and massive datasets • a community of multidisciplinary scientists, innovators, public bodies, citizen organizations, SMEs, as well as data science students at any level of education scientific, brought together by extensive networking and innovation actions
  • 7. What are doing our researcher? • any responsible data science experiment is composed by: – data acquiring (open data, crowdsourcing, crowdsensing,) – model building (very complex validation phase), – creation of an exploration scenario (what-if analysis) (different validation setting), – ….similar to many other data-driven science process,…but data are produced by humans Delft 17 – 19 February 2016
  • 8.
  • 9. Exploratories Social Mining Research Environments tailored on specific multidisciplinary domains • Promotes results sharing among scientists and communities • Promotes the use of RI through Virtual and Transnational Access
  • 10. Big Data for Societal Debates Polarization, controversy and topic trends on societal debates through social media Lead by Aris Gionis and Dominic Rout
  • 11. Polarized Political Debates Monitoring Topics across Time and space
  • 12. Exploratory: Big Data for City of Citizens Lead by Roberto Trasarti
  • 13. Estimating traffic fluxes on road network A B C H W
  • 14. Big Data for Well Being and Economic Performance Deprivation Index (in France) predicted with Mobile Phone traces Lead by Peep Kungas
  • 15. Well-being and Economic Performance Systemic Risk and Gender Diversity
  • 17. Sentiment Analysis • Internal and external perception by country – Index ρ - the ratio between pro refugees users and against refugees users – Red means a higher predominance of positive sentiment, higher ρ – Yellow means a higher predominance of negative sentiment, lower ρ (a) Overall. (b) Internal perception. (c) External perception. - + - + - +
  • 18. SoBigData e-Infrastructure • An Exploratory is also a Virtual Research Environments : – VRE are web-based, community-oriented, comprehensive, flexible, and secure working environments – VREs are tailored to satisfy the needs of a designated community. • services for data and methods discovery and access • collaboration oriented facilities enabling scientists – DATA: different sharing policy, may be shared or not – METHODS: web services (executed over a variety of data centers), or downloaded (packages to be executed on DATA side) – WORKFLOW: complex analytical process that may imply executions on different sites. (currently only description or on-site execution on some special analytical platforms) Firenze, 14 Nov 2016
  • 19. e-infra design • SoBigData.eu portal • Sobigdata.eu Catalogue: a set of functionalities to search, index and discovery all resources (Data, Models and workflows) (powered by D4Science) • Virtual research Environments (Exploratories) functionalities to create, update and operation (powered by D4Science): Delft 17 – 19 February 2016
  • 22. SoBigData example: Resource Catalogue Search for datasets and methods Description Recent Activities Action Bar Recent Products Statistics Firenze, 14 Nov 2016
  • 23. VRE example: SoBigData VRE Application Posting messages to other VRE users VRE Abstract VRE Managers News Feed Top Topics Recent Files Firenze, 14 Nov 2016
  • 24. The ethics of SoBigData • Gathering large quantities of data may have serious consequences: – consequences range from personal harm, – to issues of autonomy, injustice and inequality. • Making Big Data accessible is a value for democracy • SoBigData adheres to a value-sensitive design approach: – design solutions to overcome ethical dilemma’s, in this case those between the utility of the data gathered vs. the protection of the individuals subject to the research.
  • 25. Ethics in practice • SoBigData has an ethical framework which provides a broad overview of all the ethical concerns of big data. • But, as per the VSD outlook, data protection is not only the concern of the ethicists. In order to make the ideals of SoBigData successful, scientific methods also need to be developed in order embed moral principles in practice.
  • 26. The ethics of SoBigData • How do we create an infrastructure in which such methods can be disseminated and improved upon? • Data Management Plan plays a key role: – Each data has its privacy requirements and fact checks and responsibility • Anonymization techniques are part of the research • Researchers will be trained in applying the necessary procedural safeguards
  • 28. Educating the responsible data scientists Based on a cooperation between ethicists and computer 1. A Massive Online Open Course (MOOC) which instructs all prospective researchers about the legal and ethical dangers of big data research and the steps they can take to minimise these; 2. A set of workflows that outline the steps researchers can take when designing their approach; 3. Information pop-ups which redirect researchers to state-of-the-art ethical methods.
  • 29. New challenges are coming • One of the OECD ideals is algorithmic transparency and the GDPR, also, says that decision-making algorithms should be explainable. • But what is enough to constitute an explanation? • We're working on developing some sort of template that would satisfy most people's conceptions of what an explanation should be
  • 30. • What function should a terms of use have? Currently, SoBigData is trying to defer legal responsibility to the Final User through the ToS, but this is difficult. • Example: How do we deal with Twitter's intellectual property rights? May he scraper violate Twitter terms of use? although of course the above also holds. How does SoBigData relate to data collectors terms of service Delft 17 – 19 February 2016 New challenges are coming
  • 31. SoBigData metadata structure • A highly structured and detailed metadata structure has been designed in order to provide information about: – Description of the dataset (to make it Findable) – How the dataset has been produced – Intellectual Property – Privacy issues – Who can access the data and how (terms of use, NDA…) • Mainly based on the DataCite standard Delft 17 – 19 February 2016
  • 32. • Copyright Copyright is a legal right that grants the creator of an original work exclusive rights for its use and distribution for a limited amount of time. Copyright can exist on individual data as well as over a dataset or database as a whole. The application of copyright to factual data and metadata have no eligibility for copyright protection. • License A license is a unilateral permission by the right holder from the licensor to the licensee to use certain rights. Licenses distinguish themselves from contracts since the implementation of a license does not require mutual agreement. • Terms of Use The terms of use are rules that one must obey in order to use the data or service. The terms of use agreement is mainly used for legal purposes by data providers and databases that store data. A legitimate terms of use agreement is legally binding and may be subject to change. Managing data: disambiguating terms Firenze, 14 Nov 2016
  • 33. There are three broad types of data: • Primary/raw data: data coming directly from the source; • Derivative data: data processed from primary/raw data; • Metadata: reference data describing either the primary/raw data or derivative data. Primary/raw data and derivative data may be licensed under different conditions and by different stakeholders Managing data: type of data and their licenses Firenze, 14 Nov 2016
  • 34. The e-infrastructure offering discover and access may be operated by a different actor and offered through specific terms of use. In this case three types of licenses may be involved: 1. the one agreed between the system operator and the primary data owner, 2. the one selected for derivative product that may differ from the one associated with primary data, and 3. the one agreed between the system operator and the data consumer. All these licenses have to be captured by the “terms of use” of the e- infrastructure, i.e., they are part of the rules a consumer must agree to accept when using the system. Managing data: dealing with complexity Firenze, 14 Nov 2016
  • 35. A re-use license (to specify in the ToU of the e-infrastructure) concerns at least attribution, copyleft requirement, and control on commercial exploitation of the dataset. Moreover it is needed to manage and apply some forms of control on access. Virtual Research Environments (VREs) offer flexible and secure web-based, community-centric platforms, so researchers can work together on common challenges. VREs terms of use are automatically composed according to the combined data and services selected at the time of VRE definition. • Raw data are then licensed according to the license expressed by the data owner/custodian and expressed at time of registration of the data content to the e- infrastructure. • Derivative data products instead are licensed with a license compatible and legally interoperable with the one associated with the primary data. • It remains under the responsibility of a single user, as expressed in the VRE terms of use, to confirm the license to associate with any produced derivative data. Managing data: VRE as an instrument to manage the complexity Firenze, 14 Nov 2016
  • 36. Meta data definition: Ethics Firenze, 14 Nov 2016
  • 37. Meta data definition: Intellectual Properties Firenze, 14 Nov 2016
  • 38. Il laboratorio di ricerca SoBigData.it organizza la prima Tuscan Big Data Challenge, un’occasione gratuita per le aziende per migliorare il proprio business. Grazie a un’analisi avanzata ei dati prodotti dalle aziende o estratti da internet è possibile ricavare informazioni utili su diversi fronti.