Slides by Gabriela Antunes Vieira
Course by TOON VANAGT
Founder and managing director of several
internet startups
Co-founder and board member of Fintech Belgium
Chairman of Open Knowledge Belgium
Tech entrepreneur, lean
startup coach & angel
investor
Toon Vanagt
Datanews ICT manager SME of the year 2002
@Toon Betacowork co-owner (get your free trial ! )
1 INTRODUCTION
What is Big Data? How is it classified? Where does it come from? And why is it important?
2 THE 5 Vs OF DATA
The main characteristics of Big Data
3 BIG DATA TECHNOLOGY
Big Data Technologies Landscape
4 HOW BIG DATA BECOMES SMART DATA
Turning Big Data into value.
5 SOME EXAMPLES
• What is Big Data ?
• Classification of data
• Sources of Data
• The Importance of Big Data
Data in the 21st century is like oil in the 18th century: an
immensely valuable, untapped asset…
Big data refers to data sets that are too
large or complex for traditional
data-processing application software to
adequately deal with.
The importance of Big Data depends on how
a company uses the data it collects.
If you can’t turn this data into value, it’s
useless.
Better understanding of customers
Optimization of processes
Improved security
Improved performance
MTurk aims to make accessing human intelligence simple,
scalable, and cost-effective. Businesses or developers
needing tasks done (called Human Intelligence Tasks or
“HITs”) can use the robust MTurk API to access thousands
of high quality, global, on-demand Workers—and then
programmatically integrate the results of that work
directly into their business processes and systems. MTurk
enables developers and businesses to achieve their goals
more quickly and at a lower cost than was previously
possible.
STRUCTURED DATA
Data that has been organized into a
formatted repository, typically
a database, so that its elements can
be made addressable for more
effective processing and analysis.
UNSTRUCTURED DATA
Information that either does not
have a pre-defined data model or is
not organized in a pre-defined
manner.
SEMI-STRUCTURED DATA
A form of structured data that does
not conform with the formal
structure of data models
but nonetheless contains markers to separate
semantic elements and enforce
hierarchies of records and fields
within the data.
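The three definitions above correspond to what is commonly called structured, unstructured and semi-structured data. As a rough illustration, the sketch below stores the same payment fact in each form and reads the amount back. The table name, field names and sample values are invented for this example and do not come from the course.

```python
import json
import re
import sqlite3

# Structured: a fixed schema in a database table; every field is addressable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (product_id TEXT, amount REAL, currency TEXT)")
conn.execute("INSERT INTO payments VALUES (?, ?, ?)", ("P-100", 19.99, "EUR"))
structured_amount = conn.execute(
    "SELECT amount FROM payments WHERE product_id = ?", ("P-100",)
).fetchone()[0]

# Semi-structured: no table schema, but keys and nesting act as markers
# that separate the fields and express the hierarchy.
semi_structured = json.loads(
    '{"product_id": "P-100", "payment": {"amount": 19.99, "currency": "EUR"}}'
)
semi_structured_amount = semi_structured["payment"]["amount"]

# Unstructured: free text with no data model; the value has to be
# extracted heuristically (here with a deliberately simple regex).
unstructured = "Customer paid 19.99 EUR for product P-100 yesterday."
match = re.search(r"(\d+\.\d+)\s+EUR", unstructured)
unstructured_amount = float(match.group(1)) if match else None

print(structured_amount, semi_structured_amount, unstructured_amount)
```

The further the data is from a fixed schema, the more work (and guesswork) is needed before it can be processed and analysed.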
MEDIA
• Images, videos, audio…
• Social media: Facebook, Twitter, YouTube, Instagram…
CLOUD
• Public, private or third-party cloud platforms
WEB
• Data publicly available on the web
MACHINE DATA
• Data generated from the interconnection of IoT devices
TRANSACTIONAL
• Product ID, distribution, payments…
Move up the information ladder by
asking users/patients for input
Combine, correlate and improve
quality of data sets
Bring new value from raw (open)
data sets
Visualise in new ways
Mine deeper to dig out “insights”
(not just basic statistics)
Any company can now run its
“own Google”
• Volume
• Velocity
• Variety
• Veracity
• Value
Open data is any content or
info that people are free to
use, re-use and redistribute,
without any legal,
technological or social
restriction.
Publicly accessible data
from websites, social
networks, blogs, news feeds,
product feeds (ecommerce)
and more. Re-use is often
not formalized and implicit…
Authentication/secured
access is required to use
proprietary corporate data,
personal data or device data.
• Big Data Landscape
• Big Data Tools
• Process
Many (open source) big data tools can be
relatively cheap building blocks of your
‘refinery’
https://bit.ly/2GszxZF
• Smart Data Applications
• How to start smart ?
• Big Data Challenges
Fraud detection / Prevention
Targeted ads, product placement, brand sentiment analysis
Patient monitoring, Patient Care…
Proactive equipment repair, power and consumption matching
Bandwidth allocation, Cell Tower diagnostics …
Proactive maintenance, Decreasing time, supply planning …
Outbreak detection, network intrusion detection …
Route and time planning, traffic monitoring …
• Could an expert help to sense-
check your results ?
• Can you validate hypotheses ?
• What further Data do you need ?
• What data do you have ?
• How is it used ?
• Do you have the expertise to
manage your data ?
• What data do you have and how
is it used ?
• Are you being specific enough ?
Users' weight, height, heart rate, ovulation cycles
and other data were shared
The information is collected in real time
Data is sent to FB via its Software Development Kit,
open source software tools that can be used by devs
to create mobile apps
These apps use a Facebook analytics tool called
App Events, which lets developers track user activity on
FB
Retailers are spending lots of time analyzing your data to determine how
to sell you even more.
Target has figured out how to successfully use shopper data to
determine if an individual is having a baby and when.
Everyone provides data through customer IDs tied to personally
identifiable information (PII) such as credit cards, emails, and loyalty
card numbers
Using shopping behavior data, Target could assign a pregnancy prediction
score to customers based on the purchase and purchase volume of about 25
different products in-store, regardless of baby registries.
The Target advertising team started to use the mix and match technique
that still allows for proper targeting without freaking out customers
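To make the idea of a prediction score based on purchases and purchase volume more concrete, here is a minimal sketch of one way such a score could be computed: a weighted sum of purchase volumes squashed into the range 0 to 1. This is an illustration only. The product names, weights and baseline are hypothetical placeholders, and nothing here is claimed to be Target's actual model.

```python
import math

# Hypothetical weights per product: NOT Target's real products or coefficients.
WEIGHTS = {"product_a": 0.8, "product_b": 0.5, "product_c": 0.3}
BIAS = -2.0  # hypothetical baseline so that an empty basket scores low


def prediction_score(purchases):
    """Map purchase volumes per product to a score between 0 and 1.

    `purchases` maps a product name to the quantity bought. Products
    without a weight are ignored.
    """
    z = BIAS + sum(WEIGHTS.get(product, 0.0) * qty for product, qty in purchases.items())
    return 1.0 / (1.0 + math.exp(-z))  # logistic function


low = prediction_score({"product_c": 1})
high = prediction_score({"product_a": 3, "product_b": 2})
print(round(low, 3), round(high, 3))
```

A customer whose basket contains several of the heavily weighted products gets a higher score than one with a single low-weight item, which is the kind of ranking the slide describes.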
The Global Heat Map is published by the GPS
tracking company Strava
It is built by stitching together the locations
and activities of people who use fitness devices
like Fitbits
Strava fitness map accidentally revealed the
location of secret military bases in war zones in
the Middle East by tracking soldiers' movements
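A heat map of this kind can be approximated by counting location points per grid cell. The sketch below is a simplified assumption of how such an aggregation might work, not Strava's actual pipeline; the function name, cell size and coordinates are made up for the example.

```python
import math
from collections import Counter


def heat_map(points, cell_size=0.01):
    """Count GPS points per grid cell of `cell_size` degrees.

    `points` is an iterable of (latitude, longitude) pairs. The result
    maps a (row, column) cell index to the number of points inside it.
    """
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        counts[cell] += 1
    return counts


# Made-up coordinates: three points from repeated activity in one small
# area, plus one isolated point elsewhere.
points = [
    (10.001, 20.001),
    (10.002, 20.003),
    (10.004, 20.008),
    (30.555, 40.555),
]
counts = heat_map(points)
hottest_cell, hits = counts.most_common(1)[0]
print(hottest_cell, hits)
```

Even without any names attached, cells with repeated activity stand out against an otherwise empty map, which is why aggregated and supposedly anonymous location data can still reveal sensitive places.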
Course 1 - Introduction to Big Data by Toon Vanagt ( #BigDataBXL)
Course 1 - Introduction to Big Data by Toon Vanagt ( #BigDataBXL)

More Related Content

What's hot

A Perspective from the intersection Data Science, Mobility, and Mobile Devices
A Perspective from the intersection Data Science, Mobility, and Mobile DevicesA Perspective from the intersection Data Science, Mobility, and Mobile Devices
A Perspective from the intersection Data Science, Mobility, and Mobile Devices
Yael Garten
 
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystemStrata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Shirshanka Das
 
Data Science Application in Business Portfolio & Risk Management
Data Science Application in Business Portfolio & Risk ManagementData Science Application in Business Portfolio & Risk Management
Data Science Application in Business Portfolio & Risk Management
Data Science Thailand
 
A Big Data Journey
A Big Data JourneyA Big Data Journey
A Big Data Journey
Paul Boal
 
Introduction to big data
Introduction to big dataIntroduction to big data
Introduction to big data
Richard Vidgen
 
Introduction to Data Science (Data Summit, 2017)
Introduction to Data Science (Data Summit, 2017)Introduction to Data Science (Data Summit, 2017)
Introduction to Data Science (Data Summit, 2017)
Caserta
 
How Can Analytics Improve Business?
How Can Analytics Improve Business?How Can Analytics Improve Business?
How Can Analytics Improve Business?
Inside Analysis
 
Intro to Data Science Big Data
Intro to Data Science Big DataIntro to Data Science Big Data
Intro to Data Science Big Data
Indu Khemchandani
 
Big Data Evolution
Big Data EvolutionBig Data Evolution
Big Data Evolution
itnewsafrica
 
Data Wrangling and the Art of Big Data Discovery
Data Wrangling and the Art of Big Data DiscoveryData Wrangling and the Art of Big Data Discovery
Data Wrangling and the Art of Big Data Discovery
Inside Analysis
 
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
mark madsen
 
Introduction to open data in DataOps
Introduction to open data in DataOpsIntroduction to open data in DataOps
Introduction to open data in DataOps
Dataops Ghent Meetup
 
The Five Graphs of Government: How Federal Agencies can Utilize Graph Technology
The Five Graphs of Government: How Federal Agencies can Utilize Graph TechnologyThe Five Graphs of Government: How Federal Agencies can Utilize Graph Technology
The Five Graphs of Government: How Federal Agencies can Utilize Graph Technology
Greta Workman
 
Full-Stack Data Science: How to be a One-person Data Team
Full-Stack Data Science: How to be a One-person Data TeamFull-Stack Data Science: How to be a One-person Data Team
Full-Stack Data Science: How to be a One-person Data Team
Greg Goltsov
 
Advanced Analytics and Data Science Expertise
Advanced Analytics and Data Science ExpertiseAdvanced Analytics and Data Science Expertise
Advanced Analytics and Data Science Expertise
SoftServe
 
Building a Data Platform Strata SF 2019
Building a Data Platform Strata SF 2019Building a Data Platform Strata SF 2019
Building a Data Platform Strata SF 2019
mark madsen
 
Big data-ppt
Big data-pptBig data-ppt
Big data-ppt
Nazir Ahmed
 
Machine Learning in Big Data
Machine Learning in Big DataMachine Learning in Big Data
Machine Learning in Big Data
DataWorks Summit/Hadoop Summit
 
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Shirshanka Das
 
Big data(1st presentation)
Big data(1st presentation)Big data(1st presentation)
Big data(1st presentation)
Takrim Ul Islam Laskar
 

What's hot (20)

A Perspective from the intersection Data Science, Mobility, and Mobile Devices
A Perspective from the intersection Data Science, Mobility, and Mobile DevicesA Perspective from the intersection Data Science, Mobility, and Mobile Devices
A Perspective from the intersection Data Science, Mobility, and Mobile Devices
 
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystemStrata 2016 - Architecting for Change: LinkedIn's new data ecosystem
Strata 2016 - Architecting for Change: LinkedIn's new data ecosystem
 
Data Science Application in Business Portfolio & Risk Management
Data Science Application in Business Portfolio & Risk ManagementData Science Application in Business Portfolio & Risk Management
Data Science Application in Business Portfolio & Risk Management
 
A Big Data Journey
A Big Data JourneyA Big Data Journey
A Big Data Journey
 
Introduction to big data
Introduction to big dataIntroduction to big data
Introduction to big data
 
Introduction to Data Science (Data Summit, 2017)
Introduction to Data Science (Data Summit, 2017)Introduction to Data Science (Data Summit, 2017)
Introduction to Data Science (Data Summit, 2017)
 
How Can Analytics Improve Business?
How Can Analytics Improve Business?How Can Analytics Improve Business?
How Can Analytics Improve Business?
 
Intro to Data Science Big Data
Intro to Data Science Big DataIntro to Data Science Big Data
Intro to Data Science Big Data
 
Big Data Evolution
Big Data EvolutionBig Data Evolution
Big Data Evolution
 
Data Wrangling and the Art of Big Data Discovery
Data Wrangling and the Art of Big Data DiscoveryData Wrangling and the Art of Big Data Discovery
Data Wrangling and the Art of Big Data Discovery
 
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)Architecting a Data Platform For Enterprise Use (Strata NY 2018)
Architecting a Data Platform For Enterprise Use (Strata NY 2018)
 
Introduction to open data in DataOps
Introduction to open data in DataOpsIntroduction to open data in DataOps
Introduction to open data in DataOps
 
The Five Graphs of Government: How Federal Agencies can Utilize Graph Technology
The Five Graphs of Government: How Federal Agencies can Utilize Graph TechnologyThe Five Graphs of Government: How Federal Agencies can Utilize Graph Technology
The Five Graphs of Government: How Federal Agencies can Utilize Graph Technology
 
Full-Stack Data Science: How to be a One-person Data Team
Full-Stack Data Science: How to be a One-person Data TeamFull-Stack Data Science: How to be a One-person Data Team
Full-Stack Data Science: How to be a One-person Data Team
 
Advanced Analytics and Data Science Expertise
Advanced Analytics and Data Science ExpertiseAdvanced Analytics and Data Science Expertise
Advanced Analytics and Data Science Expertise
 
Building a Data Platform Strata SF 2019
Building a Data Platform Strata SF 2019Building a Data Platform Strata SF 2019
Building a Data Platform Strata SF 2019
 
Big data-ppt
Big data-pptBig data-ppt
Big data-ppt
 
Machine Learning in Big Data
Machine Learning in Big DataMachine Learning in Big Data
Machine Learning in Big Data
 
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
Strata 2017 (San Jose): Building a healthy data ecosystem around Kafka and Ha...
 
Big data(1st presentation)
Big data(1st presentation)Big data(1st presentation)
Big data(1st presentation)
 

Similar to Course 1 - Introduction to Big Data by Toon Vanagt ( #BigDataBXL)

Big Data, Analytics and Data Science
Big Data, Analytics and Data ScienceBig Data, Analytics and Data Science
Big Data, Analytics and Data Science
dlamb3244
 
Big data unit i
Big data unit iBig data unit i
Big data unit i
Navjot Kaur
 
Content Marketing Trending Topics in Tech
Content Marketing Trending Topics in TechContent Marketing Trending Topics in Tech
Content Marketing Trending Topics in Tech
UBM (Technology)
 
Putting data science into perspective
Putting data science into perspectivePutting data science into perspective
Putting data science into perspective
Sravan Ankaraju
 
Why Data Science is Getting Popular in 2023?
Why Data Science is Getting Popular in 2023?Why Data Science is Getting Popular in 2023?
Why Data Science is Getting Popular in 2023?
kavyagaur3
 
Real Estate Big Data- Benefits & Challenges
Real Estate Big Data- Benefits & ChallengesReal Estate Big Data- Benefits & Challenges
Real Estate Big Data- Benefits & Challenges
Manish Parsuramka
 
Module 1 the power of data
Module 1 the power of dataModule 1 the power of data
Module 1 the power of data
caniceconsulting
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
IRJET Journal
 
What is big data ? | Big Data Applications
What is big data ? | Big Data ApplicationsWhat is big data ? | Big Data Applications
What is big data ? | Big Data Applications
ShilpaKrishna6
 
Lay of the Land for All Things Privacy
Lay of the Land for All Things PrivacyLay of the Land for All Things Privacy
Lay of the Land for All Things Privacy
Tinuiti
 
The Product Dev Conundrum: To Build or Buy in a Digital World?
The Product Dev Conundrum: To Build or Buy in a Digital World?The Product Dev Conundrum: To Build or Buy in a Digital World?
The Product Dev Conundrum: To Build or Buy in a Digital World?
Aggregage
 
ThingsCon: Trustable Tech Mark (27 Oct 2018, Mozfest Edition)
ThingsCon: Trustable Tech Mark (27 Oct 2018, Mozfest Edition)ThingsCon: Trustable Tech Mark (27 Oct 2018, Mozfest Edition)
ThingsCon: Trustable Tech Mark (27 Oct 2018, Mozfest Edition)
Peter Bihr
 
Data foundation for analytics excellence
Data foundation for analytics excellenceData foundation for analytics excellence
Data foundation for analytics excellence
Mudit Mangal
 
Presentation big data and social media final_video
Presentation big data and social media final_videoPresentation big data and social media final_video
Presentation big data and social media final_video
ramikaurraminder
 
Identifying the new frontier of big data as an enabler for T&T industries: Re...
Identifying the new frontier of big data as an enabler for T&T industries: Re...Identifying the new frontier of big data as an enabler for T&T industries: Re...
Identifying the new frontier of big data as an enabler for T&T industries: Re...
International Federation for Information Technologies in Travel and Tourism (IFITT)
 
Presentation To Seda Technology Programme
Presentation To Seda Technology ProgrammePresentation To Seda Technology Programme
Presentation To Seda Technology Programme
Elton050505
 
Harness Your Product Data: Better Understanding User Behavior Across Channels...
Harness Your Product Data: Better Understanding User Behavior Across Channels...Harness Your Product Data: Better Understanding User Behavior Across Channels...
Harness Your Product Data: Better Understanding User Behavior Across Channels...
Aggregage
 
Modern Product Data Workflows: Harness Your Product Data: Better Understandin...
Modern Product Data Workflows: Harness Your Product Data: Better Understandin...Modern Product Data Workflows: Harness Your Product Data: Better Understandin...
Modern Product Data Workflows: Harness Your Product Data: Better Understandin...
Hannah Flynn
 
Module 4 - Data as a Business Model - Online
Module 4 - Data as a Business Model - OnlineModule 4 - Data as a Business Model - Online
Module 4 - Data as a Business Model - Online
caniceconsulting
 
Modernizing Architecture for a Complete Data Strategy
Modernizing Architecture for a Complete Data StrategyModernizing Architecture for a Complete Data Strategy
Modernizing Architecture for a Complete Data Strategy
Cloudera, Inc.
 

Similar to Course 1 - Introduction to Big Data by Toon Vanagt ( #BigDataBXL) (20)

Big Data, Analytics and Data Science
Big Data, Analytics and Data ScienceBig Data, Analytics and Data Science
Big Data, Analytics and Data Science
 
Big data unit i
Big data unit iBig data unit i
Big data unit i
 
Content Marketing Trending Topics in Tech
Content Marketing Trending Topics in TechContent Marketing Trending Topics in Tech
Content Marketing Trending Topics in Tech
 
Putting data science into perspective
Putting data science into perspectivePutting data science into perspective
Putting data science into perspective
 
Why Data Science is Getting Popular in 2023?
Why Data Science is Getting Popular in 2023?Why Data Science is Getting Popular in 2023?
Why Data Science is Getting Popular in 2023?
 
Real Estate Big Data- Benefits & Challenges
Real Estate Big Data- Benefits & ChallengesReal Estate Big Data- Benefits & Challenges
Real Estate Big Data- Benefits & Challenges
 
Module 1 the power of data
Module 1 the power of dataModule 1 the power of data
Module 1 the power of data
 
Big Data Analytics
Big Data AnalyticsBig Data Analytics
Big Data Analytics
 
What is big data ? | Big Data Applications
What is big data ? | Big Data ApplicationsWhat is big data ? | Big Data Applications
What is big data ? | Big Data Applications
 
Lay of the Land for All Things Privacy
Lay of the Land for All Things PrivacyLay of the Land for All Things Privacy
Lay of the Land for All Things Privacy
 
The Product Dev Conundrum: To Build or Buy in a Digital World?
The Product Dev Conundrum: To Build or Buy in a Digital World?The Product Dev Conundrum: To Build or Buy in a Digital World?
The Product Dev Conundrum: To Build or Buy in a Digital World?
 
ThingsCon: Trustable Tech Mark (27 Oct 2018, Mozfest Edition)
ThingsCon: Trustable Tech Mark (27 Oct 2018, Mozfest Edition)ThingsCon: Trustable Tech Mark (27 Oct 2018, Mozfest Edition)
ThingsCon: Trustable Tech Mark (27 Oct 2018, Mozfest Edition)
 
Data foundation for analytics excellence
Data foundation for analytics excellenceData foundation for analytics excellence
Data foundation for analytics excellence
 
Presentation big data and social media final_video
Presentation big data and social media final_videoPresentation big data and social media final_video
Presentation big data and social media final_video
 
Identifying the new frontier of big data as an enabler for T&T industries: Re...
Identifying the new frontier of big data as an enabler for T&T industries: Re...Identifying the new frontier of big data as an enabler for T&T industries: Re...
Identifying the new frontier of big data as an enabler for T&T industries: Re...
 
Presentation To Seda Technology Programme
Presentation To Seda Technology ProgrammePresentation To Seda Technology Programme
Presentation To Seda Technology Programme
 
Harness Your Product Data: Better Understanding User Behavior Across Channels...
Harness Your Product Data: Better Understanding User Behavior Across Channels...Harness Your Product Data: Better Understanding User Behavior Across Channels...
Harness Your Product Data: Better Understanding User Behavior Across Channels...
 
Modern Product Data Workflows: Harness Your Product Data: Better Understandin...
Modern Product Data Workflows: Harness Your Product Data: Better Understandin...Modern Product Data Workflows: Harness Your Product Data: Better Understandin...
Modern Product Data Workflows: Harness Your Product Data: Better Understandin...
 
Module 4 - Data as a Business Model - Online
Module 4 - Data as a Business Model - OnlineModule 4 - Data as a Business Model - Online
Module 4 - Data as a Business Model - Online
 
Modernizing Architecture for a Complete Data Strategy
Modernizing Architecture for a Complete Data StrategyModernizing Architecture for a Complete Data Strategy
Modernizing Architecture for a Complete Data Strategy
 

Recently uploaded

HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
panagenda
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Safe Software
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
DianaGray10
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
KAMESHS29
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
tolgahangng
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
kumardaparthi1024
 

Recently uploaded (20)

HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
Driving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success StoryDriving Business Innovation: Latest Generative AI Advancements & Success Story
Driving Business Innovation: Latest Generative AI Advancements & Success Story
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
Serial Arm Control in Real Time Presentation
Serial Arm Control in Real Time PresentationSerial Arm Control in Real Time Presentation
Serial Arm Control in Real Time Presentation
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
GenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizationsGenAI Pilot Implementation in the organizations
GenAI Pilot Implementation in the organizations
 

Course 1 - Introduction to Big Data by Toon Vanagt ( #BigDataBXL)

  • 1. Slides by Gabriela Antunes Vieira
  • 2. Course by TOON VANAGT (@Toon): tech entrepreneur, lean startup coach & angel investor. Founder and managing director of several internet startups. Co-founder and board member of Fintech Belgium. Chairman of Open Knowledge Belgium. Datanews ICT manager SME of the year 2002. Betacowork co-owner (get your free trial!).
  • 3. 1 INTRODUCTION: What is Big Data? How is it classified? Where does it come from? And why is it important? 2 THE 5 Vs OF DATA: The main characteristics of Big Data. 3 BIG DATA TECHNOLOGY: Big Data Technologies Landscape. 4 HOW BIG DATA BECOMES SMART DATA: Turning Big Data into value. 5 SOME EXAMPLES.
  • 4. • What is Big Data? • Classification of data • Sources of data • The importance of Big Data
  • 5. Data in the 21st century is like oil in the 18th century: an immensely valuable, untapped asset…
  • 6. Big data refers to data sets that are too large or complex for traditional data-processing application software to deal with adequately.
  • 7. The importance of Big Data depends on how a company uses the data it collects. If you can’t turn this data into value, it’s useless. • Better understanding of customers • Optimization of processes • Improved security • Improved performance
  • 8. MTurk aims to make accessing human intelligence simple, scalable, and cost-effective. Businesses or developers needing tasks done (called Human Intelligence Tasks or “HITs”) can use the robust MTurk API to access thousands of high quality, global, on-demand Workers, and then programmatically integrate the results of that work directly into their business processes and systems. MTurk enables developers and businesses to achieve their goals more quickly and at a lower cost than was previously possible.
  • 9. • Structured data: data that has been organized into a formatted repository, typically a database, so that its elements can be made addressable for more effective processing and analysis. • Unstructured data: information that either does not have a pre-defined data model or is not organized in a pre-defined manner. • Semi-structured data: a form of structured data that does not conform to the formal structure of data models but nonetheless contains markers to separate semantic elements and enforce hierarchies of records and fields within the data.
  • 10. • MEDIA: images, videos, audio…; social media (Facebook, Twitter, YouTube, Instagram…) • CLOUD: public, private or third-party cloud platforms • WEB: data publicly available on the web • MACHINE DATA: data generated from the interconnection of IoT devices • TRANSACTIONAL: product ID, distribution, payments…
  • 11.
  • 12. • Move up the information ladder by asking users/patients for input • Combine, correlate and improve quality of data sets • Bring new value from raw (open) data sets • Visualise in new ways • Mine deeper to dig out “insights” (not just basic statistics) • Any company can now run its “own Google”
  • 13. • Volume • Velocity • Variety • Veracity • Value
  • 14.
  • 15. • Open data is any content or info that people are free to use, re-use and redistribute, without any legal, technological or social restriction. • Publicly accessible data from websites, social networks, blogs, news feeds, product feeds (e-commerce) and more. Re-use is often not formalized and is implicit… • Authentication/secured access is required to use proprietary corporate data, personal data or device data.
  • 16. • Big Data Landscape • Big Data Tools • Process
  • 17.
  • 18.
  • 19. Many (open source) big data tools can be relatively cheap building blocks of your ‘refinery’.
  • 20. https://bit.ly/2GszxZF
  • 21. • Smart Data Applications • How to start smart ? • Big Data Challenges
  • 22. • Fraud detection / prevention • Targeted ads, product placement, brand sentiment analysis • Patient monitoring, patient care… • Proactive equipment repair, power and consumption matching • Bandwidth allocation, cell tower diagnostics… • Proactive maintenance, decreasing time, supply planning… • Outbreak detection, network intrusion detection… • Route and time planning, traffic monitoring…
  • 23. • Could an expert help to sense-check your results? • Can you validate hypotheses? • What further data do you need? • What data do you have? • How is it used? • Do you have the expertise to manage your data? • What data do you have and how is it used? • Are you being specific enough?
  • 24.
  • 25.
  • 26.
  • 27. • Users' weight, height, heart rate, ovulation cycles and other data were shared. • The information is collected in real time. • Data is sent to FB via its Software Development Kit, open source software tools that devs can use to create mobile apps. • These apps use a Facebook analytics tool called App Events, which lets developers track user activity on FB.
  • 28. • Retailers spend a lot of time analyzing your data to determine how to sell you even more. • Target has figured out how to use shopper data to determine whether an individual is having a baby, and when. • Everyone provides data through customer IDs tied to personally identifiable information (PII) such as credit cards, emails, and loyalty card numbers. • Using shopping behavior data, Target could assign a pregnancy prediction score to customers based on the purchase and purchase volume of about 25 different products in-store, regardless of baby registries. • The Target advertising team started to use a mix-and-match technique that still allows for proper targeting without freaking out customers.
  • 29. • The Global Heat Map is published by the GPS tracking company Strava. • It is made up by sticking together the locations and activities of people who use fitness devices like Fitbits. • The Strava fitness map accidentally revealed the location of secret military bases in war zones in the Middle East by tracking soldiers' movements.
  • 30.
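
The three classes of data defined on slide 9 can be made concrete with a minimal Python sketch. It is an illustration only, not part of the course: the sample records, product IDs and variable names are all hypothetical, and it uses only the standard library.

```python
import csv
import io
import json

# Structured: fixed columns, so every element is addressable by row and field.
# (Sample data is hypothetical.)
structured_csv = "product_id,quantity,price\nA17,2,9.50\nB42,1,20.00\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))
total = sum(int(r["quantity"]) * float(r["price"]) for r in rows)

# Semi-structured: no fixed table, but keys act as markers that separate
# elements and express a hierarchy; records may carry different fields.
semi_structured_json = (
    '{"user": "anna", "orders": ['
    '{"product_id": "A17", "tags": ["gift"]}, {"product_id": "B42"}]}'
)
doc = json.loads(semi_structured_json)
product_ids = [order["product_id"] for order in doc["orders"]]

# Unstructured: free text with no pre-defined data model; anything we want
# from it has to be extracted, here with a naive substring search.
unstructured_text = "Loved the A17, arrived quickly. The B42 was a bit pricey."
mentions = [pid for pid in ("A17", "B42") if pid in unstructured_text]

print(total, product_ids, mentions)
```

The structured records can be queried directly, the semi-structured document needs navigation through its keys, and the free text needs an extraction step before it can be analysed at all.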