Understanding
Big Data Analytics -
solutions for growing businesses
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
■ 13+ yrs in IT
■ IT Service Management, Project Management,
Business development
■ Cloud Native, DevOps, Data Science, Big Data,
Genomics
■ Involved in:
● PyData Warsaw
● Data Science Summit
● DevOps Days Warsaw
● Cloud Native Warsaw
Rafał Małanij
rafal.malanij@getindata.com
Founded in 2014 by
ex-Spotify engineers.
Focus only on Big Data and
Cloud (from day 1)
Community builders (Big Data
Tech Warsaw organizers)
60+ Big Data engineers
Big Data
● Volume
● Variety
● Velocity
● Veracity
● Value
Source: Wikipedia
60% - 85%
of Big Data projects fail
(Gartner 2016/2017)
“Big data isn't a one-off project: It's a culture
of collecting, analyzing, and using data.”
Matt Asay, Infoworld.com
“Technology is the engine of digital
transformation, data is the fuel, process is the
guidance system, and organizational change
capability is the landing gear.”
https://hbr.org/2020/05/digital-transformation-comes-down-to-talent-in-4-key-areas
Data literacy
Data literacy is the ability to read, understand, create, and
communicate data as information.
Stages: Data Collection, Data Storage, Processing, Delivery
Sources:
● Clickstream
● Mobile apps
● Product systems
● Transaction system
● CRM
● Call center
● Workforce mgmt
Data Lake
● Repository for raw data
● Various types of data
○ Structured
○ Semi-structured
○ Unstructured
○ Binary
● Historical data
vs.
[Diagram: data platform components]
● Continuous Data Collection
● Data Lake
● Big Data Processing
● Data Governance
● Event Processing
● Feature engineering
● Interactive BI & Analytics
● Data Discovery
● Data Science
● Machine Learning
● Automation, Security, Monitoring, Orchestration
Data lineage
● Where data comes from
● What happened / How it was transformed
● Where data is used
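The three questions above map onto a small record per dataset. The following is a minimal sketch in Python, not tied to any particular lineage tool; the `LineageRecord` type, the `upstream` helper, and all dataset names are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class LineageRecord:
    """Lineage for one dataset: origin, transformations, and consumers."""
    dataset: str
    sources: list[str] = field(default_factory=list)          # where data comes from
    transformations: list[str] = field(default_factory=list)  # how it was transformed
    consumers: list[str] = field(default_factory=list)        # where data is used


def upstream(records: dict[str, LineageRecord], dataset: str) -> set[str]:
    """Return every dataset that `dataset` depends on, directly or indirectly."""
    seen: set[str] = set()
    stack = [dataset]
    while stack:
        current = stack.pop()
        record = records.get(current)
        if record is None:
            continue  # a raw source with no recorded lineage of its own
        for src in record.sources:
            if src not in seen:
                seen.add(src)
                stack.append(src)
    return seen


# Hypothetical example data (all names are illustrative assumptions).
records = {
    "sessions": LineageRecord(
        dataset="sessions",
        sources=["clickstream_raw"],
        transformations=["deduplicate events", "group events into sessions"],
        consumers=["daily_report"],
    ),
    "daily_report": LineageRecord(
        dataset="daily_report",
        sources=["sessions", "crm_export"],
        transformations=["join sessions with CRM accounts", "aggregate per day"],
        consumers=["BI dashboard"],
    ),
}

print(sorted(upstream(records, "daily_report")))
```

Walking the `sources` links answers "where does this data come from" across several hops; the `transformations` and `consumers` fields cover the other two questions for a single dataset.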
Degrees of intelligence
Competing on Analytics: The New Science of Winning
by Thomas H. Davenport, Jeanne G. Harris
(listed from highest to lowest competitive advantage)
Analytics
🔴 Optimization: What’s the best that can happen?
🔴 Predictive modeling: What will happen next?
🔴 Forecasting/extrapolation: What if these trends continue?
🔴 Statistical analysis: Why is this happening?
Reporting
🔴 Alerts: What actions are needed?
🔴 Query/drill-down: Where exactly is the problem?
🔴 Ad-hoc reports: How many, how often, where?
🔴 Standard reports: What happened?
Data Science vs Machine Learning
Machine Learning
ML Lifecycle
Machine Learning vs. A.I.
“Artificial intelligence is the science and engineering
of making computers behave in ways that, until recently,
we thought required human intelligence.”
Andrew Moore, Carnegie Mellon University
DevOps vs DataOps
DevOps:
● Culture
● Automation
● Lean
● Measurement
● Sharing
DataOps adds:
+ Data quality
+ Manufacturing process
https://www.dataopsmanifesto.org/
[Figure: Degrees of intelligence (Competing on Analytics: The New Science of Winning, by Thomas H. Davenport, Jeanne G. Harris), from Standard reports up to Optimization, annotated with “Technical competences” and “Possibilities”]
Interactive BI
● Reports
● Dashboards
● Drill-down reports
● SQL queries
● Tools: Excel, Power BI, QlikView, Tableau, Superset, Hive, Presto
Data Science
● Transformed and Raw data
● Machine Learning
● Tools: Jupyter, Spark, Scala/Java, R, Python, TensorFlow, etc.
Data Discovery
● Search tool for data
● What, where, who?
● Metadata
● Popularity score
● Quality and profiling
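As a rough illustration of the points above, a data discovery tool can be thought of as a search over dataset metadata, ranked by popularity. The sketch below is written under stated assumptions: the `CatalogEntry` fields, the `search` function, the sample entries, and the use of a plain usage count as the popularity score are all hypothetical, not a description of any specific product.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    name: str         # what: dataset name
    location: str     # where: storage path or table (illustrative)
    owner: str        # who: owning team or person
    description: str  # free-text metadata
    popularity: int   # assumed metric, e.g. number of recent queries


def search(catalog, term):
    """Return entries whose name or description contains `term`, most popular first."""
    needle = term.lower()
    hits = [
        entry for entry in catalog
        if needle in entry.name.lower() or needle in entry.description.lower()
    ]
    return sorted(hits, key=lambda entry: entry.popularity, reverse=True)


# Hypothetical catalog contents.
catalog = [
    CatalogEntry("user_sessions", "lake/events/sessions", "analytics",
                 "Sessions built from clickstream", 120),
    CatalogEntry("clickstream_raw", "lake/raw/clickstream", "platform",
                 "Raw clickstream events", 45),
    CatalogEntry("crm_accounts", "lake/crm/accounts", "sales",
                 "Customer accounts from CRM", 80),
]

for entry in search(catalog, "clickstream"):
    print(entry.name, entry.owner, entry.popularity)
```

Ranking by popularity puts the datasets colleagues already rely on at the top of the results. Quality and profiling information could be added as further fields on the same record.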
Lexikon @ Spotify
● Library for data and insights
● Knowledge Mgmt tool
○ People
○ Description, stats
○ Tables, Queries
https://engineering.atspotify.com/2020/02/27/how-we-improved-data-discovery-for-data-scientists-at-spotify/
Source: “Continuous Analytics: Stream Query Processing in Practice”, Michael J. Franklin, Professor, UC Berkeley, Dec 2009
https://www.slideshare.net/JoshBaer/shortening-the-feedback-loop-big-data-spain-external
Hidden Technical Debt in Machine Learning Systems -
https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
Dataism
“Dataism declares that the
universe consists of data flows,
and the value of any
phenomenon or entity is
determined by its contribution
to data processing.”
Yuval Noah Harari, “Homo Deus”
Rafał Małanij
rafal.malanij@getindata.com

Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 

Understanding Big Data Analytics - solutions for growing businesses - Rafał Małanij, GetInData

  • 1. Understanding Big Data Analytics - solutions for growing businesses
  • 2. © Copyright. All rights reserved. Not to be reproduced without prior written consent.
       Rafał Małanij, rafal.malanij@getindata.com
       ■ 13+ yrs in IT
       ■ IT Service Management, Project Management, Business development
       ■ Cloud Native, DevOps, Data Science, Big Data, Genomics
       ■ Involved in:
         ● PyData Warsaw
         ● Data Science Summit
         ● DevOps Days Warsaw
         ● Cloud Native Warsaw
  • 3. Founded in 2014 by ex-Spotify engineers. Focus only on Big Data and Cloud (from day 1). Community builders (Big Data Tech Warsaw organizers). 60+ Big Data engineers.
  • 8. Big Data (Source: Wikipedia)
       ● Volume
       ● Variety
       ● Velocity
       ● Veracity
       ● Value
  • 9. 60% - 85% of Big Data projects fail (Gartner 2016/2017)
  • 10. “Big data isn't a one-off project: It's a culture of collecting, analyzing, and using data.” Matt Asay, Infoworld.com
  • 11. “Technology is the engine of digital transformation, data is the fuel, process is the guidance system, and organizational change capability is the landing gear.” https://hbr.org/2020/05/digital-transformation-comes-down-to-talent-in-4-key-areas
  • 12. Data literacy: the ability to read, understand, create, and communicate data as information.
  • 13. Data Collection, Data Storage, Processing, Delivery
       Clickstream, Mobile apps, Product systems, Transaction system, CRM, Call center, Workforce mgmt
  • 14. Data Lake
       ● Repository for raw data
       ● Various types of data
         ○ Structured
         ○ Semi-structured
         ○ Unstructured
         ○ Binary
       ● Historical data vs.
  • 15. Continuous Data Collection, Automation, Security, Monitoring, Orchestration, Data Lake, Big Data Processing, Data Governance, Event Processing, Feature engineering, Interactive BI & Analytics, Data Discovery, Data Science, Machine Learning
  • 16. Data lineage
       ● Where data comes from
       ● What happened / How it was transformed
       ● Where data is used
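The three lineage questions on slide 16 can be illustrated with a small record type. This is only a sketch under stated assumptions: `LineageRecord`, `upstream`, and the dataset names are hypothetical and do not come from the talk or from any specific lineage tool.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class LineageRecord:
    """One hypothetical lineage entry for a single dataset."""
    dataset: str                      # the dataset being described
    sources: List[str]                # where the data comes from
    transformation: str               # what happened / how it was transformed
    consumers: List[str] = field(default_factory=list)  # where the data is used


def upstream(records: List[LineageRecord], dataset: str) -> Set[str]:
    """Return every dataset that `dataset` depends on, directly or indirectly."""
    by_name = {r.dataset: r for r in records}
    seen: Set[str] = set()
    stack = [dataset]
    while stack:
        record = by_name.get(stack.pop())
        if record is None:
            continue  # a raw source with no recorded lineage of its own
        for src in record.sources:
            if src not in seen:
                seen.add(src)
                stack.append(src)
    return seen


# Illustrative (made-up) lineage for a revenue report.
records = [
    LineageRecord("clean_orders", ["raw_orders"],
                  "drop duplicates, cast types", ["daily_revenue"]),
    LineageRecord("daily_revenue", ["clean_orders", "fx_rates"],
                  "join and aggregate by day", ["revenue_dashboard"]),
]
print(sorted(upstream(records, "daily_revenue")))
```

Walking the `sources` links answers "where does this come from?" for any dataset; a real deployment would more likely collect these records automatically from the orchestration layer than maintain them by hand.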
  • 17. Degrees of intelligence (Competing on Analytics: The New Science of Winning by Thomas H. Davenport, Jeanne G. Harris). Competitive advantage, highest first:
       Analytics
       🔴 Optimization: What’s the best that can happen?
       🔴 Predictive modeling: What will happen next?
       🔴 Forecasting/extrapolation: What if these trends continue?
       🔴 Statistical analysis: Why is this happening?
       Reporting
       🔴 Alerts: What actions are needed?
       🔴 Query/drill-down: Where exactly is the problem?
       🔴 Ad-hoc reports: How many, how often, where?
       🔴 Standard reports: What happened?
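The difference between a standard report ("What happened?") and forecasting/extrapolation ("What if these trends continue?") can be shown on a toy series. This is a minimal sketch: the function names and the sales figures are hypothetical, and a straight-line least-squares fit stands in for whatever forecasting method a real project would choose.

```python
from typing import Sequence


def standard_report(values: Sequence[float]) -> float:
    """Reporting: summarize what already happened (here, a simple total)."""
    return float(sum(values))


def linear_forecast(values: Sequence[float], steps_ahead: int = 1) -> float:
    """Forecasting: fit y = a + b*x by least squares and extend the trend.

    x is the period index 0..n-1; needs at least two observations.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
        / sum((x - mean_x) ** 2 for x in range(n))
    )
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + steps_ahead)


monthly_sales = [10.0, 12.0, 14.0, 16.0]  # made-up figures
print("total so far:", standard_report(monthly_sales))
print("next month if the trend continues:", linear_forecast(monthly_sales))
```

The report only aggregates existing rows, while the forecast needs a model and an assumption (here, that the trend is linear), which is one reason the higher rungs demand more technical competence.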
  • 18. Data Science vs Machine Learning
  • 19. Machine Learning
  • 20. ML Lifecycle
  • 21. Machine Learning vs. A.I. “Artificial intelligence is the science and engineering of making computers behave in ways that, until recently, we thought required human intelligence.” Andrew Moore, Carnegie Mellon University
  • 22. Continuous Data Collection, Automation, Security, Monitoring, Orchestration, Data Lake, Big Data Processing, Data Governance, Event Processing, Feature engineering, Interactive BI & Analytics, Data Discovery, Data Science, Machine Learning
  • 23. DevOps vs DataOps: Culture, Automation, Lean, Measurement, Sharing + Data quality + Manufacturing process (https://www.dataopsmanifesto.org/)
  • 24. Continuous Data Collection, Automation, Security, Monitoring, Orchestration, Data Lake, Big Data Processing, Data Governance, Event Processing, Feature engineering, Interactive BI & Analytics, Data Discovery, Data Science, Machine Learning
  • 25. Continuous Data Collection, Automation, Security, Monitoring, Orchestration, Data Lake, Big Data Processing, Data Governance, Event Processing, Feature engineering, Interactive BI & Analytics, Data Discovery, Data Science, Machine Learning
  • 26. Degrees of intelligence (Competing on Analytics: The New Science of Winning by Thomas H. Davenport, Jeanne G. Harris). Axes: Technical competences, Possibilities, Competitive advantage.
       Analytics
       🔴 Optimization
       🔴 Predictive modeling
       🔴 Forecasting/extrapolation
       🔴 Statistical analysis
       Reporting
       🔴 Alerts
       🔴 Query/drill-down
       🔴 Ad-hoc reports
       🔴 Standard reports
  • 27. Interactive BI
       ● Reports
       ● Dashboards
       ● Drill-down reports
       ● SQL queries
       ● Tools: Excel, Power BI, QlikView, Tableau, Superset, Hive, Presto
  • 28. Data Science
       ● Transformed and raw data
       ● Machine Learning
       ● Tools: Jupyter, Spark, Scala/Java, R, Python, TensorFlow, etc.
  • 30. Data Discovery
       ● Search tool for data
       ● What, where, who?
       ● Metadata
       ● Popularity score
       ● Quality and profiling
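A data discovery tool of the kind outlined on slide 30 can be pictured as a searchable catalog of metadata ranked by popularity. This is a minimal sketch, not how any particular product works: `DatasetEntry`, `search`, the use of a query count as the popularity score, and all sample entries are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatasetEntry:
    """Hypothetical catalog metadata for one dataset."""
    name: str          # what
    location: str      # where
    owner: str         # who
    description: str
    query_count: int   # assumed popularity signal: how often the dataset is queried


def search(catalog: List[DatasetEntry], term: str) -> List[DatasetEntry]:
    """Return entries whose name or description mentions `term`, most popular first."""
    needle = term.lower()
    hits = [
        entry for entry in catalog
        if needle in entry.name.lower() or needle in entry.description.lower()
    ]
    return sorted(hits, key=lambda entry: entry.query_count, reverse=True)


# Made-up catalog content.
catalog = [
    DatasetEntry("orders_raw", "lake/raw/orders", "ingest-team",
                 "Unprocessed order events", 12),
    DatasetEntry("orders_clean", "lake/curated/orders", "analytics-team",
                 "Deduplicated orders with typed columns", 340),
    DatasetEntry("customers", "lake/curated/customers", "crm-team",
                 "Customer master data", 95),
]
for entry in search(catalog, "orders"):
    print(entry.name, entry.owner, entry.query_count)
```

Ranking by usage is one simple way to surface the dataset that colleagues actually rely on; quality and profiling information could be added as further fields on the same entry.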
  • 31. Lexikon @ Spotify (https://engineering.atspotify.com/2020/02/27/how-we-improved-data-discovery-for-data-scientists-at-spotify/)
       ● Library for data and insights
       ● Knowledge Mgmt tool
         ○ People
         ○ Description, stats
         ○ Tables, Queries
  • 32. Continuous Data Collection, Automation, Security, Monitoring, Orchestration, Data Lake, Big Data Processing, Data Governance, Event Processing, Feature engineering, Interactive BI & Analytics, Data Discovery, Data Science, Machine Learning
  • 33. Source: “Continuous Analytics: Stream Query Processing in Practice”, Michael J Franklin, Professor, UC Berkeley, Dec 2009 and https://www.slideshare.net/JoshBaer/shortening-the-feedback-loop-big-data-spain-external
  • 35. Continuous Data Collection, Automation, Security, Monitoring, Orchestration, Data Lake, Big Data Processing, Data Governance, Event Processing, Feature engineering, Interactive BI & Analytics, Data Discovery, Data Science, Machine Learning
  • 36. Hidden Technical Debt in Machine Learning Systems - https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
  • 40. Dataism: “Dataism declares that the universe consists of data flows, and the value of any phenomenon or entity is determined by its contribution to data processing.” Yuval Noah Harari, “Homo Deus”