Analytics 101
How to build a data-driven organisation?
© Copyright. All rights reserved. Not to be reproduced without prior written consent.
■ 13+ yrs in IT
■ IT Service Management, Project Management, Business development
■ Cloud Native, DevOps, Data Science, Big Data, Genomics
■ Involved in:
● PyData Warsaw
● Data Science Summit
● DevOps Days Warsaw
● Cloud Native Warsaw
Rafał Małanij
rafal.malanij@getindata.com
Founded in 2014 by ex-Spotify engineers.
Focus only on Big Data and Cloud (from day 1)
Community builders (Big Data Tech Warsaw organizers)
60+ Big Data engineers
Digital Transformation
Big Data and AI Executive Survey 2020
by NewVantage Partners
“Technology is the engine of digital transformation,
data is the fuel, process is the guidance system, and
organizational change capability is the landing gear.”
https://hbr.org/2020/05/digital-transformation-comes-down-to-talent-in-4-key-areas
“We live in a society that is data rich and
information poor.”
Thomas J. Peters and Robert H. Waterman Jr., “In Search of Excellence: Lessons from America's Best-Run Companies”, 1982
Data-first mindset
Data Democratization
“If you torture the data long enough,
it will confess.”
Ronald H. Coase, Essays on Economics and Economists
Data University @ AirBnB
Data Literacy
Data literacy is the ability to read, understand,
create, and communicate data as information.
http://www.tylervigen.com/spurious-correlations
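One way to make the spurious-correlations point concrete is a minimal sketch like the one below. The two series are made-up numbers chosen only because both happen to grow over time; the names `ice_cream_sales` and `software_bugs_filed` are hypothetical and do not come from the talk or from the linked site.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up yearly figures for two unrelated quantities that both trend upward.
ice_cream_sales = [20, 22, 25, 27, 30, 33]
software_bugs_filed = [105, 111, 118, 126, 131, 140]

r = pearson(ice_cream_sales, software_bugs_filed)
print(f"correlation = {r:.3f}")  # very close to 1, yet there is no causal link
```

A shared trend is enough to produce a high coefficient, which is why reading a correlation as a cause is a data literacy problem rather than a tooling problem.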
Lean startup
■ Build-Measure-Learn
■ Iterative approach
■ MVP approach
■ Continuous deployment
■ Split testing
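As an illustration of the "Measure" step in split testing, here is a minimal sketch of a two-proportion z-test, one common way to compare two variants. The function name and the visitor and conversion counts are hypothetical, and the slides do not say which statistical test to use.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of variants A and B.

    Returns the z statistic and the two-sided p-value under the
    normal approximation with a pooled standard error.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 1000 visitors per variant.
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

With small samples or very low conversion rates the normal approximation becomes unreliable, so an exact test may be a better fit there.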
Holistics / dbdiagram.io
■ 2 interns
■ 2-month project
■ Brand awareness
■ Lead Generation
■ Feedback
■ Public Road Map
■ 200k users (1yr)!!!
https://www.holistics.io/blog/launch-a-product-to-200-000-users-with-0-dollars-spent-on-marketing-budget/
https://about.gitlab.com/handbook/business-ops/data-team/
Low Risk
Predictable approach
Established processes
Experimenting, failing and learning from your failures
“It's not necessary
to change. Survival
is not mandatory.”
W. Edwards Deming
■ Self-service analytics
■ Citizen Data Science
■ Continuous Delivery
■ Data Integrity
■ Use case-driven
■ Security
https://marriagebrokerauntie.com/2017/10/27/fail-if-you-have-to-but-fail-fast/
Q&A
“Without data you’re just another person with an opinion.”
W. Edwards Deming
https://towardsdatascience.com/do-you-really-have-a-data-strategy-ff08795f10ce
Analytics 101
How to build a data-driven organisation
Big Data
■ Volume
■ Variety
■ Velocity
■ Veracity
https://upload.wikimedia.org/wikipedia/commons/e/ee/Big_Data.png
“Big data isn't a one-off project: It's a culture of
collecting, analyzing, and using data.”
Matt Asay, Infoworld.com
■ Not a technology issue
■ Start small
■ Ask the right questions
■ Flexible, open data infrastructure
Stages: Data Collection, Data Storage, Processing, Delivery
Sources: Clickstream, Mobile apps, Product systems, Transaction system, CRM, Call center, Workforce mgmt
Data Lake
■ Repository for raw data
■ Various types of data
● Structured
● Semi-structured
● Unstructured
● Binary
■ Historical data
vs.
■ Continuous Data Collection
■ Automation, Security, Monitoring, Orchestration
■ Data Lake
■ Big Data Processing
■ Data Governance
■ Event Processing
■ Feature engineering
■ Interactive BI & Analytics
■ Data Discovery
■ Data Science
■ Machine Learning
Data Lineage
■ Where it comes from
■ What happened
■ Where it is used
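The three lineage questions can be sketched as a small data structure. This is only an illustration under stated assumptions: `LineageRecord`, `upstream`, and all dataset names are hypothetical and do not describe any particular lineage tool.

```python
from dataclasses import dataclass, field


@dataclass
class LineageRecord:
    """Hypothetical lineage entry for one dataset."""
    dataset: str
    sources: list = field(default_factory=list)          # where it comes from
    transformations: list = field(default_factory=list)  # what happened
    consumers: list = field(default_factory=list)        # where it is used


def upstream(records, dataset):
    """Return every dataset that `dataset` depends on, directly or indirectly."""
    seen = set()
    stack = list(records[dataset].sources) if dataset in records else []
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        if name in records:
            stack.extend(records[name].sources)
    return seen


# Hypothetical pipeline: raw_clicks -> sessions -> churn_features
records = {
    "sessions": LineageRecord(
        "sessions", ["raw_clicks"], ["group events into sessions"], ["churn_features"]
    ),
    "churn_features": LineageRecord(
        "churn_features",
        ["sessions", "crm_export"],
        ["join on customer id", "aggregate per month"],
        ["churn_model"],
    ),
}

print(sorted(upstream(records, "churn_features")))
# prints ['crm_export', 'raw_clicks', 'sessions']
```

Walking the `sources` links answers "where does it come from" for any dataset, and the same traversal over `consumers` would answer "where is it used".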
Competing on Analytics: The New Science of Winning
by Thomas H. Davenport, Jeanne G. Harris
Data Science vs Machine Learning
Machine Learning
Machine Learning vs A.I.
“Artificial intelligence is the science and engineering of making computers behave in ways that, until recently, we thought required human intelligence.”
Andrew Moore, Carnegie Mellon University
DevOps vs DataOps
Culture
Automation
Lean
Measurement
Sharing
+ Data quality
+ Manufacturing process
https://www.dataopsmanifesto.org/
Competing on Analytics: The New Science of Winning
by Thomas H. Davenport, Jeanne G. Harris
Technical competences
Possibilities
Interactive BI
■ Reports
■ Dashboards
■ Drill-down reports
■ SQL-queries
■ Tools: Excel, Power BI, QlikView, Tableau, Superset, Hive, Presto
Data Science
■ Transformed and Raw data
■ Machine Learning
■ Tools: Jupyter, Spark, Scala/Java, R, Python, TensorFlow, etc.
Data Discovery
■ Search tool for data
■ What, where, who?
■ Metadata
■ Popularity score
■ Quality and profiling
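A data discovery tool of this kind can be sketched as a searchable metadata catalog ranked by popularity. The sketch below is a toy under stated assumptions: `DatasetEntry`, `search`, the field names, and the use of a 30-day query count as the popularity score are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DatasetEntry:
    """Hypothetical catalog entry holding basic metadata."""
    name: str             # what
    location: str         # where
    owner: str            # who
    description: str
    query_count_30d: int  # assumed popularity proxy: queries in the last 30 days


def search(catalog, term):
    """Return entries whose name or description mentions `term`, most popular first."""
    term = term.lower()
    hits = [
        entry for entry in catalog
        if term in entry.name.lower() or term in entry.description.lower()
    ]
    return sorted(hits, key=lambda entry: entry.query_count_30d, reverse=True)


catalog = [
    DatasetEntry("customer_profile", "warehouse.crm", "crm-team",
                 "One row per customer with contact details", 120),
    DatasetEntry("web_clickstream", "lake.raw", "web-team",
                 "Raw page views from the website", 340),
    DatasetEntry("policy_sales", "warehouse.sales", "sales-team",
                 "Policies sold per customer and day", 410),
]

for entry in search(catalog, "customer"):
    print(entry.name, entry.owner, entry.query_count_30d)
# prints:
# policy_sales sales-team 410
# customer_profile crm-team 120
```

Ranking by usage is one simple way to surface the tables that colleagues already trust, and quality or profiling statistics could be added as further fields.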
Lexikon@Spotify
■ Library for data and insights
■ Knowledge Mgmt tool
● People
● Description, stats
● Tables, Queries
https://engineering.atspotify.com/2020/02/27/how-we-improved-data-discovery-for-data-scientists-at-spotify/
Source: “Continuous Analytics: Stream Query Processing in Practice”, Michael J Franklin, Professor, UC Berkeley, Dec 2009 and
https://www.slideshare.net/JoshBaer/shortening-the-feedback-loop-big-data-spain-external
Hidden Technical Debt in Machine Learning Systems -
https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
https://mattturck.com/data2019/
Data-driven insurer
■ Customer Data Platform - Customer360
■ Better understanding of your customer - LTV
■ Personalised marketing
■ Personalised offer, recommendation systems
■ Customer engagement, churn
■ Customer journey, lead scoring
■ Marketing costs and sales funnel optimization
Data-driven insurer
■ Customer Data Platform - Customer360
■ Internal data (transactions, CRM, call-center)
■ Clickstream
■ External data
Data-driven insurer
■ Personalised products
https://www.the-digital-insurer.com
Data-driven insurer
■ Personalised products
■ Lifestyle information
● Wearables
● Mobile apps
■ Healthcare data
● Diabetes monitoring
● Genomics
Data-driven insurer
■ Internet of Things
■ Sensors
● Fire
● Water Sensors
■ Telematics
https://www.the-digital-insurer.com
Dataism
“Dataism declares that the universe consists of data flows, and the value of any phenomenon or entity is determined by its contribution to data processing.”
Yuval Noah Harari, “Homo Deus”
Q&A

More Related Content

Similar to Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, GetInData

Conf2013 bchristensen thebig_t
Conf2013 bchristensen thebig_tConf2013 bchristensen thebig_t
Conf2013 bchristensen thebig_tBeau Christensen
 
Take the Big Data Challenge - Take Advantage of ALL of Your Data 16 Sept 2014
Take the Big Data Challenge - Take Advantage of ALL of Your Data 16 Sept 2014Take the Big Data Challenge - Take Advantage of ALL of Your Data 16 Sept 2014
Take the Big Data Challenge - Take Advantage of ALL of Your Data 16 Sept 2014pietvz
 
Data science and its potential to change business as we know it. The Roadmap ...
Data science and its potential to change business as we know it. The Roadmap ...Data science and its potential to change business as we know it. The Roadmap ...
Data science and its potential to change business as we know it. The Roadmap ...InnoTech
 
Why Big Data is Really about Small Data
Why Big Data is Really about Small DataWhy Big Data is Really about Small Data
Why Big Data is Really about Small DataHurwitz & Associates
 
Lynn wong: make a difference with big data - HP
Lynn wong: make a difference with big data - HPLynn wong: make a difference with big data - HP
Lynn wong: make a difference with big data - HPVu Hung Nguyen
 
Big data security
Big data securityBig data security
Big data securityCloudBees
 
Tim Estes - Information Systems in an Entity Centric World
Tim Estes - Information Systems in an Entity Centric WorldTim Estes - Information Systems in an Entity Centric World
Tim Estes - Information Systems in an Entity Centric WorldDigital Reasoning
 
Online text data for machine learning, data science, and research - Who can p...
Online text data for machine learning, data science, and research - Who can p...Online text data for machine learning, data science, and research - Who can p...
Online text data for machine learning, data science, and research - Who can p...Fredrik Olsson
 
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...Kai Wähner
 
Truecaller towards a data-driven company
Truecaller towards a data-driven companyTruecaller towards a data-driven company
Truecaller towards a data-driven companyGetInData
 
Good Guys vs Bad Guys: Using Big Data to Counteract Advanced Threats
Good Guys vs Bad Guys: Using Big Data to Counteract Advanced ThreatsGood Guys vs Bad Guys: Using Big Data to Counteract Advanced Threats
Good Guys vs Bad Guys: Using Big Data to Counteract Advanced ThreatsZivaro Inc
 
IBM Analytics at Scale: Because Business Outcomes Matter
IBM Analytics at Scale: Because Business Outcomes MatterIBM Analytics at Scale: Because Business Outcomes Matter
IBM Analytics at Scale: Because Business Outcomes MatterChristine O'Connor
 
big-datagroup6-150317090053-conversion-gate01.pdf
big-datagroup6-150317090053-conversion-gate01.pdfbig-datagroup6-150317090053-conversion-gate01.pdf
big-datagroup6-150317090053-conversion-gate01.pdfVirajSaud
 
Big data trends - Krzysztof Zarzycki, GetInData
Big data trends - Krzysztof Zarzycki, GetInDataBig data trends - Krzysztof Zarzycki, GetInData
Big data trends - Krzysztof Zarzycki, GetInDataGetInData
 
Big Data Developer Career Path: Job & Interview Preparation
Big Data Developer Career Path: Job & Interview PreparationBig Data Developer Career Path: Job & Interview Preparation
Big Data Developer Career Path: Job & Interview PreparationIntellipaat
 
Predicting Startup Market Trends based on the news and social media - Albert ...
Predicting Startup Market Trends based on the news and social media - Albert ...Predicting Startup Market Trends based on the news and social media - Albert ...
Predicting Startup Market Trends based on the news and social media - Albert ...GetInData
 

Similar to Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, GetInData (20)

Conf2013 bchristensen thebig_t
Conf2013 bchristensen thebig_tConf2013 bchristensen thebig_t
Conf2013 bchristensen thebig_t
 
Take the Big Data Challenge - Take Advantage of ALL of Your Data 16 Sept 2014
Take the Big Data Challenge - Take Advantage of ALL of Your Data 16 Sept 2014Take the Big Data Challenge - Take Advantage of ALL of Your Data 16 Sept 2014
Take the Big Data Challenge - Take Advantage of ALL of Your Data 16 Sept 2014
 
Thilga
ThilgaThilga
Thilga
 
Data science and its potential to change business as we know it. The Roadmap ...
Data science and its potential to change business as we know it. The Roadmap ...Data science and its potential to change business as we know it. The Roadmap ...
Data science and its potential to change business as we know it. The Roadmap ...
 
Why Big Data is Really about Small Data
Why Big Data is Really about Small DataWhy Big Data is Really about Small Data
Why Big Data is Really about Small Data
 
Lynn wong: make a difference with big data - HP
Lynn wong: make a difference with big data - HPLynn wong: make a difference with big data - HP
Lynn wong: make a difference with big data - HP
 
Big data security
Big data securityBig data security
Big data security
 
Tim Estes - Information Systems in an Entity Centric World
Tim Estes - Information Systems in an Entity Centric WorldTim Estes - Information Systems in an Entity Centric World
Tim Estes - Information Systems in an Entity Centric World
 
Presentation on Big Data
Presentation on Big DataPresentation on Big Data
Presentation on Big Data
 
Online text data for machine learning, data science, and research - Who can p...
Online text data for machine learning, data science, and research - Who can p...Online text data for machine learning, data science, and research - Who can p...
Online text data for machine learning, data science, and research - Who can p...
 
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...
How to create intelligent Business Processes thanks to Big Data (BPM, Apache ...
 
Truecaller towards a data-driven company
Truecaller towards a data-driven companyTruecaller towards a data-driven company
Truecaller towards a data-driven company
 
Good Guys vs Bad Guys: Using Big Data to Counteract Advanced Threats
Good Guys vs Bad Guys: Using Big Data to Counteract Advanced ThreatsGood Guys vs Bad Guys: Using Big Data to Counteract Advanced Threats
Good Guys vs Bad Guys: Using Big Data to Counteract Advanced Threats
 
IBM Analytics at Scale: Because Business Outcomes Matter
IBM Analytics at Scale: Because Business Outcomes MatterIBM Analytics at Scale: Because Business Outcomes Matter
IBM Analytics at Scale: Because Business Outcomes Matter
 
big-datagroup6-150317090053-conversion-gate01.pdf
big-datagroup6-150317090053-conversion-gate01.pdfbig-datagroup6-150317090053-conversion-gate01.pdf
big-datagroup6-150317090053-conversion-gate01.pdf
 
Big data by_mcal
Big data by_mcalBig data by_mcal
Big data by_mcal
 
Big data trends - Krzysztof Zarzycki, GetInData
Big data trends - Krzysztof Zarzycki, GetInDataBig data trends - Krzysztof Zarzycki, GetInData
Big data trends - Krzysztof Zarzycki, GetInData
 
Big Data Developer Career Path: Job & Interview Preparation
Big Data Developer Career Path: Job & Interview PreparationBig Data Developer Career Path: Job & Interview Preparation
Big Data Developer Career Path: Job & Interview Preparation
 
Big data
Big dataBig data
Big data
 
Predicting Startup Market Trends based on the news and social media - Albert ...
Predicting Startup Market Trends based on the news and social media - Albert ...Predicting Startup Market Trends based on the news and social media - Albert ...
Predicting Startup Market Trends based on the news and social media - Albert ...
 

More from GetInData

How do we work with customers on Big Data / ML / Analytics Projects using Scr...
How do we work with customers on Big Data / ML / Analytics Projects using Scr...How do we work with customers on Big Data / ML / Analytics Projects using Scr...
How do we work with customers on Big Data / ML / Analytics Projects using Scr...GetInData
 
How NOT to win a Kaggle competition
How NOT to win a Kaggle competitionHow NOT to win a Kaggle competition
How NOT to win a Kaggle competitionGetInData
 
How to become good Developer in Scrum Team?
How to become good Developer in Scrum Team? How to become good Developer in Scrum Team?
How to become good Developer in Scrum Team? GetInData
 
OpenLineage & Airflow - data lineage has never been easier
OpenLineage & Airflow - data lineage has never been easierOpenLineage & Airflow - data lineage has never been easier
OpenLineage & Airflow - data lineage has never been easierGetInData
 
Benefits of a Homemade ML Platform
Benefits of a Homemade ML PlatformBenefits of a Homemade ML Platform
Benefits of a Homemade ML PlatformGetInData
 
Model serving made easy using Kedro pipelines - Mariusz Strzelecki, GetInData
Model serving made easy using Kedro pipelines - Mariusz Strzelecki, GetInDataModel serving made easy using Kedro pipelines - Mariusz Strzelecki, GetInData
Model serving made easy using Kedro pipelines - Mariusz Strzelecki, GetInDataGetInData
 
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...GetInData
 
MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...GetInData
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...GetInData
 
Feast + Amundsen Integration - Mariusz Strzelecki, GetInData
Feast + Amundsen Integration - Mariusz Strzelecki, GetInDataFeast + Amundsen Integration - Mariusz Strzelecki, GetInData
Feast + Amundsen Integration - Mariusz Strzelecki, GetInDataGetInData
 
Kubernetes and real-time analytics - how to connect these two worlds with Apa...
Kubernetes and real-time analytics - how to connect these two worlds with Apa...Kubernetes and real-time analytics - how to connect these two worlds with Apa...
Kubernetes and real-time analytics - how to connect these two worlds with Apa...GetInData
 
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...GetInData
 
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInDataMonitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInDataGetInData
 
Complex event processing platform handling millions of users - Krzysztof Zarz...
Complex event processing platform handling millions of users - Krzysztof Zarz...Complex event processing platform handling millions of users - Krzysztof Zarz...
Complex event processing platform handling millions of users - Krzysztof Zarz...GetInData
 
Managing Big Data projects in a constantly changing environment - Rafał Zalew...
Managing Big Data projects in a constantly changing environment - Rafał Zalew...Managing Big Data projects in a constantly changing environment - Rafał Zalew...
Managing Big Data projects in a constantly changing environment - Rafał Zalew...GetInData
 
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...
NLP for videos: Understanding customers' feelings in videos - Albert Lewandow...GetInData
 
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInData
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInDataStrategies for on premise to Google Cloud migration - Mateusz Pytel, GetInData
Strategies for on premise to Google Cloud migration - Mateusz Pytel, GetInDataGetInData
 
Monitoring environment based on satellite data with Python and PySpark - Albe...
Monitoring environment based on satellite data with Python and PySpark - Albe...Monitoring environment based on satellite data with Python and PySpark - Albe...
Monitoring environment based on satellite data with Python and PySpark - Albe...GetInData
 
Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...
Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...
Welcome to MLOps candy shop and choose your flavour! - Mateusz Pytel & Marius...GetInData
 
Real time analytics that controls 50% of mobile network in Poland - Maciej Br...
Real time analytics that controls 50% of mobile network in Poland - Maciej Br...Real time analytics that controls 50% of mobile network in Poland - Maciej Br...
Real time analytics that controls 50% of mobile network in Poland - Maciej Br...GetInData
 

More from GetInData (20)

How do we work with customers on Big Data / ML / Analytics Projects using Scr...
How do we work with customers on Big Data / ML / Analytics Projects using Scr...How do we work with customers on Big Data / ML / Analytics Projects using Scr...
How do we work with customers on Big Data / ML / Analytics Projects using Scr...
 
How NOT to win a Kaggle competition
How NOT to win a Kaggle competitionHow NOT to win a Kaggle competition
How NOT to win a Kaggle competition
 
How to become good Developer in Scrum Team?
How to become good Developer in Scrum Team? How to become good Developer in Scrum Team?
How to become good Developer in Scrum Team?
 
OpenLineage & Airflow - data lineage has never been easier
OpenLineage & Airflow - data lineage has never been easierOpenLineage & Airflow - data lineage has never been easier
OpenLineage & Airflow - data lineage has never been easier
 
Benefits of a Homemade ML Platform
Benefits of a Homemade ML PlatformBenefits of a Homemade ML Platform
Benefits of a Homemade ML Platform
 
Model serving made easy using Kedro pipelines - Mariusz Strzelecki, GetInData
Model serving made easy using Kedro pipelines - Mariusz Strzelecki, GetInDataModel serving made easy using Kedro pipelines - Mariusz Strzelecki, GetInData
Model serving made easy using Kedro pipelines - Mariusz Strzelecki, GetInData
 
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
Creating Real-Time Data Streaming powered by SQL on Kubernetes - Albert Lewan...
 
MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...MLOps implemented - how we combine the cloud & open-source to boost data scie...
MLOps implemented - how we combine the cloud & open-source to boost data scie...
 
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
Best Practices for ETL with Apache NiFi on Kubernetes - Albert Lewandowski, G...
 
Feast + Amundsen Integration - Mariusz Strzelecki, GetInData
Feast + Amundsen Integration - Mariusz Strzelecki, GetInDataFeast + Amundsen Integration - Mariusz Strzelecki, GetInData
Feast + Amundsen Integration - Mariusz Strzelecki, GetInData
 
Kubernetes and real-time analytics - how to connect these two worlds with Apa...
Kubernetes and real-time analytics - how to connect these two worlds with Apa...Kubernetes and real-time analytics - how to connect these two worlds with Apa...
Kubernetes and real-time analytics - how to connect these two worlds with Apa...
 
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
Functioning incessantly of Data Science Platform with Kubeflow - Albert Lewan...
 
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInDataMonitoring in Big Data Platform - Albert Lewandowski, GetInData
Monitoring in Big Data Platform - Albert Lewandowski, GetInData
 


Analytics 101 - How to build a data-driven organisation? - Rafał Małanij, GetInData

  • 1. Analytics 101 How to build a data-driven organisation?
  • 2. ■ 13+ yrs in IT ■ IT Service Management, Project Management, Business development ■ Cloud Native, DevOps, Data Science, Big Data, Genomics ■ Involved in: ● PyData Warsaw ● Data Science Summit ● DevOps Days Warsaw ● Cloud Native Warsaw Rafał Małanij rafal.malanij@getindata.com
  • 3. Founded in 2014 by ex-Spotify engineers. Focus only on Big Data and Cloud (from day 1) Community builders (Big Data Tech Warsaw organizers) 60+ Big Data engineers
  • 5. Digital Transformation
  • 6. Big Data and AI Executive Survey 2020 by NewVantage Partners
  • 8. “Technology is the engine of digital transformation, data is the fuel, process is the guidance system, and organizational change capability is the landing gear.” https://hbr.org/2020/05/digital-transformation-comes-down-to-talent-in-4-key-areas
  • 9. “We live in a society that is data rich and information poor.” “In Search of Excellence: Lessons from America's Best-Run Companies”, Robert H. Waterman Jr., January 1, 1982
  • 10. Data-first mindset
  • 11. Data Democratization “If you torture the data long enough, it will confess.” Ronald H. Coase, Essays on Economics and Economists
  • 12. Data University @ Airbnb
  • 13. Data University @ Airbnb
  • 14. Data Literacy Data literacy is the ability to read, understand, create, and communicate data as information.
  • 15. http://www.tylervigen.com/spurious-correlations
  • 17. Lean startup ■ Build-Measure-Learn ■ Iterative approach ■ MVP approach ■ Continuous deployment ■ Split testing
  • 19. Holistics / dbdiagram.io ■ 2 interns ■ 2-month project ■ Brand awareness ■ Lead Generation ■ Feedback ■ Public Road Map ■ 200k users (1yr)!!!
  • 20. https://www.holistics.io/blog/launch-a-product-to-200-000-users-with-0-dollars-spent-on-marketing-budget/
  • 21. https://about.gitlab.com/handbook/business-ops/data-team/
  • 23. Low Risk Predictable approach Established processes Experimenting, failing and learning from your failures
  • 24. “It's not necessary to change. Survival is not mandatory.” W. Edwards Deming
  • 25. ■ Self-service analytics ■ Citizen Data Science ■ Continuous Delivery ■ Data Integrity ■ Use case-driven ■ Security https://marriagebrokerauntie.com/2017/10/27/fail-if-you-have-to-but-fail-fast/
  • 26. Q&A
  • 27. Q&A
  • 28. “Without data you’re just another person with an opinion.” W. Edwards Deming https://towardsdatascience.com/do-you-really-have-a-data-strategy-ff08795f10ce
  • 29. Analytics 101 How to build a data-driven organisation
  • 30. Big Data ■ Volume ■ Variety ■ Velocity ■ Veracity https://upload.wikimedia.org/wikipedia/commons/e/ee/Big_Data.png
  • 31. “Big data isn't a one-off project: It's a culture of collecting, analyzing, and using data.” Matt Asay, InfoWorld ■ Not a technology issue ■ Start small ■ Ask the right questions ■ Flexible, open data infrastructure
  • 32. Data Collection Data Storage Processing Delivery Clickstream Mobile apps Product systems Transaction system CRM Call center Workforce mgmt
  • 33. Data Lake ■ Repository for raw data ■ Various types of data ● Structured ● Semi-structured ● Unstructured ● Binary ■ Historical data vs.
  • 34. Continuous Data Collection Automation Security Monitoring Orchestration Data Lake Big Data Processing Data Governance Event Processing Feature engineering Interactive BI & Analytics Data Discovery Data Science Machine Learning
  • 35. Data Lineage ■ Where the data comes from ■ What happened to it ■ Where it is used
  • 36. Competing on Analytics: The New Science of Winning by Thomas H. Davenport, Jeanne G. Harris
  • 37. Data Science vs Machine Learning
  • 38. Machine Learning
  • 39. Machine Learning
  • 40. Machine Learning vs A.I. “Artificial intelligence is the science and engineering of making computers behave in ways that, until recently, we thought required human intelligence.” Andrew Moore, Carnegie Mellon University
  • 41. Continuous Data Collection Automation Security Monitoring Orchestration Data Lake Big Data Processing Data Governance Event Processing Feature engineering Interactive BI & Analytics Data Discovery Data Science Machine Learning
  • 42. DevOps vs DataOps Culture Automation Lean Measurement Sharing + Data quality + Manufacturing process https://www.dataopsmanifesto.org/
  • 43. Continuous Data Collection Automation Security Monitoring Orchestration Data Lake Big Data Processing Data Governance Event Processing Feature engineering Interactive BI & Analytics Data Discovery Data Science Machine Learning
  • 44. Continuous Data Collection Automation Security Monitoring Orchestration Data Lake Big Data Processing Data Governance Event Processing Feature engineering Interactive BI & Analytics Data Discovery Data Science Machine Learning
  • 45. Competing on Analytics: The New Science of Winning by Thomas H. Davenport, Jeanne G. Harris Technical competences Possibilities
  • 46. Interactive BI ■ Reports ■ Dashboards ■ Drill-down reports ■ SQL queries ■ Tools: Excel, Power BI, QlikView, Tableau, Superset, Hive, Presto
  • 47. Data Science ■ Transformed and Raw data ■ Machine Learning ■ Tools: Jupyter, Spark, Scala/Java, R, Python, TensorFlow, etc.
  • 49. Data Discovery ■ Search tool for data ■ What, where, who? ■ Metadata ■ Popularity score ■ Quality and profiling
  • 50. Lexikon@Spotify ■ Library for data and insights ■ Knowledge Mgmt tool ● People ● Description, stats ● Tables, Queries https://engineering.atspotify.com/2020/02/27/how-we-improved-data-discovery-for-data-scientists-at-spotify/
  • 51. Continuous Data Collection Automation Security Monitoring Orchestration Data Lake Big Data Processing Data Governance Event Processing Feature engineering Interactive BI & Analytics Data Discovery Data Science Machine Learning
  • 52. Source: “Continuous Analytics: Stream Query Processing in Practice”, Michael J Franklin, Professor, UC Berkeley, Dec 2009, and https://www.slideshare.net/JoshBaer/shortening-the-feedback-loop-big-data-spain-external
  • 54. Continuous Data Collection Automation Security Monitoring Orchestration Data Lake Big Data Processing Data Governance Event Processing Feature engineering Interactive BI & Analytics Data Discovery Data Science Machine Learning
  • 55. Hidden Technical Debt in Machine Learning Systems - https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
  • 57. https://mattturck.com/data2019/
  • 58. Data-driven insurer ■ Customer Data Platform - Customer360 ■ Better understanding of your customer - LTV ■ Personalised marketing ■ Personalised offer, recommendation systems ■ Customer engagement, churn ■ Customer journey, lead scoring ■ Marketing costs and sales funnel optimization
  • 59. Data-driven insurer ■ Customer Data Platform - Customer360 ■ Internal data (transactions, CRM, call-center) ■ Clickstream ■ External data
  • 60. Data-driven insurer ■ Personalised products https://www.the-digital-insurer.com
  • 61. Data-driven insurer ■ Personalised products ■ Lifestyle information ● Wearables ● Mobile apps ■ Healthcare data ● Diabetes monitoring ● Genomics
  • 62. Data-driven insurer ■ Internet of Things ■ Sensors ● Fire ● Water Sensors ■ Telematics https://www.the-digital-insurer.com
  • 64. Dataism “Dataism declares that the universe consists of data flows, and the value of any phenomenon or entity is determined by its contribution to data processing,” Yuval Noah Harari, “Homo Deus”.
  • 65. Q&A