A Practical-ish
Introduction to
Data Science
@markawest
Who Am I?
• Previously Java Developer and Architect.
• Currently building and managing a team of
Data Scientists at Bouvet Oslo.
• Leader javaBin (Norwegian Java User Group).
Agenda
What is Data
Science?
Machine
Learning
Algorithms
Practical
Example
What is Data Science?
“Data Science… is an interdisciplinary
field of scientific methods, processes,
and systems to extract knowledge or
insight from data…”
Wikipedia
[Venn diagram] Data Science at the centre of three overlapping areas:
• Computer Science/IT.
• Math and Statistics.
• Domain/Business Knowledge.
Overlaps between the areas: Machine Learning, Software Development, Traditional Research.
Data Science Process: Hypothesis Driven
1. Question
2. Data
3. Exploratory Data Analysis
4. Formal Modelling
5. Interpretation
6. Communication
7. Result
Roles Required in a Data Science Project
Data Scientist
• Prove / disprove hypotheses.
• Information and Data gathering.
• Data wrangling.
• Algorithm and ML models.
• Communication.
Data Engineer
• Build Data Driven Platforms.
• Operationalize Algorithms and Machine Learning models.
• Data Integration.
• Monitoring.
Data Visualization
• Storytelling.
• Build Dashboards and other Data visualizations.
• Provide insight through visual means.
Process Owner
• Project Management.
• Manage stakeholder expectations.
• Maintain a Vision.
• Facilitate.
• Evangelize.
Isn’t Data Science just
a rebranding of
Business Intelligence?
Data Science as an Evolution of BI
Data Sources
• Business Intelligence: Structured Data, most often from Relational Database Management Systems (RDBMS).
• Data Science Adds: Unstructured Data (log files, audio, images, emails, tweets, raw text, documents).
Available Tools
• Business Intelligence: Data Visualization, Statistics.
• Data Science Adds: Machine Learning.
Goals
• Business Intelligence: Provide support to strategic decision making, based on historical data.
• Data Science Adds: Provide business value through advanced functionality.
Source: https://www.linkedin.com/pulse/data-science-business-intelligence-whats-difference-david-rostcheck
Machine Learning: A Tool for Data Science
• Artificial Intelligence: Enabling computers to mimic human intelligence and behavior.
• Machine Learning: Algorithms allowing computers to learn, make predictions and describe data without being explicitly programmed.
• Deep Learning: Black box learning with multi-layered Neural Networks.
What is Data Science: Key Takeaways
• Data Scientists require Math and Statistics skills in addition to
traditional Software Development.
• Data Science is Hypothesis Driven.
• Data Science projects require a range of competencies/roles.
• Data Science can be seen as an evolution of Business Intelligence,
providing additional capabilities through the application of cutting
edge technologies and unstructured data.
Machine Learning
Algorithms
“Machine Learning:
Field of study that gives
computers the ability to
learn without being
explicitly programmed.”
Arthur L. Samuel
IBM Journal of Research and Development, 1959
Traditional Programming: Data + Rules → Computer → Output
Machine Learning: Data + Output → Computer → Rules
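The contrast above can be sketched in a few lines of Python. This is only an illustration: the temperature readings, the labels and the hand-picked 25.0 threshold are hypothetical, and the "learning" step is the simplest one imaginable (a single threshold placed midway between the two classes).

```python
# Traditional programming: a person supplies the rule.
def is_hot_rule(temp_c):
    return temp_c >= 25.0  # hand-picked threshold (assumption)


# Machine learning: the rule is derived from data plus known outputs.
def learn_threshold(temps, labels):
    hot = [t for t, is_hot in zip(temps, labels) if is_hot]
    cold = [t for t, is_hot in zip(temps, labels) if not is_hot]
    # Midpoint between the warmest "cold" and the coolest "hot" example.
    # Assumes the two classes do not overlap.
    return (max(cold) + min(hot)) / 2


temps = [12.0, 18.0, 21.0, 27.0, 30.0, 33.0]      # hypothetical data
labels = [False, False, False, True, True, True]  # hypothetical outputs
threshold = learn_threshold(temps, labels)


def is_hot_learned(temp_c):
    return temp_c >= threshold


print("learned threshold:", threshold)
print("26.0 is hot?", is_hot_learned(26.0))
```

The hand-written rule and the learned rule have the same shape; the difference is only where the threshold comes from.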
The Art of The Generalized Model
• Underfitted: Model overlooks underlying patterns in your training data.
• Generalized: Captures the correlations in your training data. May have an error margin.
• Overfitted: Model memorizes the training data rather than finding underlying patterns.
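A deliberately extreme sketch of the three cases, using made-up numbers that roughly follow y = 2x. The "underfitted" model ignores the input, the "overfitted" model is a lookup table of the training data, and the "generalized" model is a straight line fitted by least squares. All data and model names are assumptions chosen for illustration.

```python
def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical training and test data (roughly y = 2x).
train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [2.1, 3.9, 6.2, 7.8, 10.1]
test_x = [1.5, 2.5, 3.5, 4.5]
test_y = [3.0, 5.0, 7.0, 9.0]

mean_x = sum(train_x) / len(train_x)
mean_y = sum(train_y) / len(train_y)


def underfitted(x):
    # Overlooks the pattern: always predicts the average.
    return mean_y


slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
         / sum((x - mean_x) ** 2 for x in train_x))
intercept = mean_y - slope * mean_x


def generalized(x):
    # Captures the trend, with a small error margin on the training data.
    return intercept + slope * x


memory = dict(zip(train_x, train_y))


def overfitted(x):
    # Memorizes the training data; has nothing useful to say about new inputs.
    return memory.get(x, mean_y)


for name, model in [("underfitted", underfitted),
                    ("generalized", generalized),
                    ("overfitted", overfitted)]:
    print(name,
          "train:", round(mean_squared_error(model, train_x, train_y), 3),
          "test:", round(mean_squared_error(model, test_x, test_y), 3))
```

The memorizing model is perfect on the training data yet no better than the underfitted one on unseen data, which is the trade-off the slide describes.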
Machine Learning Types
Supervised Learning
• Model trained on historical data. Resulting model can be used to make predictions on new data.
• Use Case: Predicting a value based on patterns discovered in previous data.
Unsupervised Learning
• Algorithm finds trends and patterns in data, without prior training on historical data.
• Use Case: Describing your data based on statistical analysis.
Reinforcement Learning
• Model uses a feedback loop to iteratively improve its performance.
• Use Case: Learning how to best solve a problem based on trial and error.
Example Machine Learning Algorithms
Supervised Learning
• Regression: Linear Regression.
• Classification: Decision Trees.
Unsupervised Learning
• Clustering: K-Means Clustering.
Linear Regression
Floor Space (Feature)    House Price (Label)
1 180                    221 900
2 570                    538 000
770                      180 000
1 960                    604 000
1 680                    510 000
…                        …
…                        …
5 240                    1 225 000
[Plot labels: Trend Line, Deviation, Prediction]
Fitting a trend line: Ordinary Least Squares
$a^2 + b^2 + c^2 + d^2 + e^2 + f^2$ = sum of squared error
[Plot: deviations labelled a to f; one data point is marked "Outlier?"]
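A minimal sketch of Ordinary Least Squares on a single feature, using only the first five rows visible in the Floor Space / House Price table (the elided rows and the last row are left out). The function names are illustrative, and the closed-form slope and intercept formulas are the standard ones for one-variable least squares rather than anything shown on the slides.

```python
# First five visible rows of the table.
floor_space = [1180, 2570, 770, 1960, 1680]
house_price = [221900, 538000, 180000, 604000, 510000]


def fit_ordinary_least_squares(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squared errors."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_of_squared_errors(slope, intercept, xs, ys):
    # Each deviation is the vertical distance between a data point and the line.
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


slope, intercept = fit_ordinary_least_squares(floor_space, house_price)
best_sse = sum_of_squared_errors(slope, intercept, floor_space, house_price)


def predict(space):
    return intercept + slope * space


print("slope:", round(slope, 2), "intercept:", round(intercept, 2))
print("predicted price for floor space 2000:", round(predict(2000)))
```

Because the deviations are squared, a single point far from the line contributes disproportionately to the total, which is why outliers can pull the trend line towards themselves.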
Linear Regression Notes
Benefits
• Simple to understand.
• Transparent.
Limitations
• Outliers skew trend line.
• Doesn’t work with non-linear relationships.
Some Alternatives
• Non-linear Least Squares.
• Tree algorithms.
Decision Tree: Calculating the Best Split
Goal: Build a Decision Tree for deciding who gets a payrise this year, based on historical payrise data.
Features: Placements, Complaints, Lived in Norway. Label: Payrise.
Name     Placements  Complaints  Lived in Norway  Payrise
Don      Yes         Yes         Yes              Yes
Lewis    Yes         Yes         No               Yes
Mike     Yes         No          Yes              Yes
Danny    Yes         Yes         No               Yes
Dan      No          No          Yes              No
Elliot   Yes         No          No               Yes
Luke     Yes         No          No               Yes
Tom      Yes         Yes         No               Yes
Nathan   No          Yes         Yes              No
Owen     Yes         No          No               Yes
Decision Tree: Calculating the Best Split
Candidate splits: Placements (Yes / No), Complaints (Yes / No), Lived in Norway (Yes / No).
Number of recruiters with "Yes" for each feature, grouped by Payrise:
Recruiters  Placements  Complaints  Lived in Norway  Payrise
8           8           4           2                Yes
2           0           1           2                No
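The slides do not name the measure used to rank candidate splits. As one common choice, the sketch below scores each feature with weighted Gini impurity (lower is better) on the ten recruiters from the payrise table. Treat the criterion and the function names as assumptions.

```python
from collections import Counter

# (Placements, Complaints, Lived in Norway, Payrise), copied from the table.
ROWS = [
    ("Yes", "Yes", "Yes", "Yes"),  # Don
    ("Yes", "Yes", "No", "Yes"),   # Lewis
    ("Yes", "No", "Yes", "Yes"),   # Mike
    ("Yes", "Yes", "No", "Yes"),   # Danny
    ("No", "No", "Yes", "No"),     # Dan
    ("Yes", "No", "No", "Yes"),    # Elliot
    ("Yes", "No", "No", "Yes"),    # Luke
    ("Yes", "Yes", "No", "Yes"),   # Tom
    ("No", "Yes", "Yes", "No"),    # Nathan
    ("Yes", "No", "No", "Yes"),    # Owen
]
FEATURES = ["Placements", "Complaints", "Lived in Norway"]


def gini(labels):
    """Gini impurity: 0.0 when every label in the group is the same."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def weighted_gini(rows, feature_index):
    """Impurity of the two groups produced by a Yes/No split, weighted by group size."""
    score = 0.0
    for value in ("Yes", "No"):
        labels = [row[-1] for row in rows if row[feature_index] == value]
        if labels:
            score += len(labels) / len(rows) * gini(labels)
    return score


scores = {name: weighted_gini(ROWS, i) for i, name in enumerate(FEATURES)}
best_feature = min(scores, key=scores.get)

for name, score in scores.items():
    print(name, round(score, 3))
print("best split:", best_feature)
```

Placements separates the two outcomes completely in this data, so both of its groups are pure and it scores 0.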
Building a Decision Tree: A Bad Split?
Candidate splits: Placements (Yes / No), Complaints (Yes / No), Lived in Norway (Yes / No).
Recruiters  Placements  Complaints  Lived in Norway  Payrise
8           7           8           2                Yes
2           1           0           2                No
Decision Tree: Recursive Partitioning
Features: Outlook, Temp, Humidity, Wind. Label: Play.
Outlook   Temp  Humidity  Wind    Play
Sunny     Hot   High      Weak    No
Sunny     Hot   High      Strong  No
Overcast  Hot   High      Weak    Yes
…         …     …         …       …
…         …     …         …       …
Overcast  Mild  High      Strong  Yes
Overcast  Hot   Normal    Weak    Yes
Rain      Mild  High      Strong  No
Resulting tree, with Outlook as the first split:
• Outlook = Sunny: split on Humidity (High: No, Normal: Yes).
• Outlook = Overcast: Yes.
• Outlook = Rain: split on Wind (Strong: No, Weak: Yes).
Building a Decision Tree: Where to Stop?
#1: All data at the current leaf belongs to the same class.
#2: Maximum tree depth reached.
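Recursive partitioning and the two stopping rules can be sketched together. The example reuses the payrise table, because the weather table on the slides is partly elided. The split criterion (weighted Gini impurity), the `max_depth` default and the dictionary-based tree layout are all assumptions made for this illustration.

```python
from collections import Counter

# (Placements, Complaints, Lived in Norway, Payrise), copied from the payrise table.
ROWS = [
    ("Yes", "Yes", "Yes", "Yes"), ("Yes", "Yes", "No", "Yes"),
    ("Yes", "No", "Yes", "Yes"), ("Yes", "Yes", "No", "Yes"),
    ("No", "No", "Yes", "No"), ("Yes", "No", "No", "Yes"),
    ("Yes", "No", "No", "Yes"), ("Yes", "Yes", "No", "Yes"),
    ("No", "Yes", "Yes", "No"), ("Yes", "No", "No", "Yes"),
]
FEATURES = {"Placements": 0, "Complaints": 1, "Lived in Norway": 2}


def gini(labels):
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def weighted_gini(rows, index):
    score = 0.0
    for value in ("Yes", "No"):
        labels = [row[-1] for row in rows if row[index] == value]
        if labels:
            score += len(labels) / len(rows) * gini(labels)
    return score


def build_tree(rows, features, depth=0, max_depth=3):
    labels = [row[-1] for row in rows]
    majority = Counter(labels).most_common(1)[0][0]
    # Stop #1: all data at this leaf belongs to the same class.
    if len(set(labels)) == 1:
        return majority
    # Stop #2: maximum tree depth reached (or nothing left to split on).
    if depth >= max_depth or not features:
        return majority
    best = min(features, key=lambda name: weighted_gini(rows, features[name]))
    remaining = {name: i for name, i in features.items() if name != best}
    branches = {}
    for value in ("Yes", "No"):
        subset = [row for row in rows if row[features[best]] == value]
        # An empty branch falls back to the majority class of the parent.
        branches[value] = (build_tree(subset, remaining, depth + 1, max_depth)
                           if subset else majority)
    return {"split_on": best, "branches": branches}


tree = build_tree(ROWS, FEATURES)
print(tree)
```

On this data the first split already produces two pure leaves, so rule #1 ends the recursion after one level. Calling `build_tree(ROWS, FEATURES, max_depth=0)` triggers rule #2 instead and returns the majority class.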
Decision Tree Notes
Benefits
• White Box.
• Flexible (use for both regression and classification).
• Robust to outliers.
• Handle non-linear boundaries.
Limitations
• Susceptible to overfitting.
• Changes to where the Data is sliced can produce different results.
Some Alternatives
• Support Vector Machine.
• Logistic Regression.
• Random Forests.
K-Means Clustering
• K = The number of clusters the algorithm will try to find.
• K should be large enough to extract meaningful patterns but small enough that clusters remain clearly distinct.
• So how do we calculate K?
Sum of Squared Errors
$a^2 + b^2 + c^2 + d^2 + e^2 + f^2$ = sum of squared error
[Plot: distances labelled a to f]
Sum of Squared Errors vs. Number of Clusters
K-Means: Calculating the K value
• Scree Plots allow us to find the optimal number of clusters.
• A Scree Plot shows the Sum of Squared Errors for different numbers of clusters.
• The optimal K value is at the “Elbow” of the plot.
K-Means Demo
• Randomly allocate centroids.
• Iteration 1: Calculate cluster membership based on nearest centroid.
• Iteration 1: Move centroids to the center of their cluster.
• Iteration 2: Recalculate cluster membership based on nearest centroid.
• Iteration 2: Move centroids to the center of their cluster.
• After 6 iterations: Clusters and centroids stabilise, algorithm stops.
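The demo loop and the Sum of Squared Errors used for the Scree Plot can be sketched in plain Python. The twelve sample points, the seed and the function names are hypothetical; the starting centroids are drawn at random from the data points, which is one simple way to "randomly allocate" them.

```python
import random


def squared_distance(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def k_means(points, k, seed=0, max_iterations=100):
    """Return (centroids, clusters, sum_of_squared_errors, iterations_used)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # randomly allocate centroids
    for iteration in range(1, max_iterations + 1):
        # Calculate cluster membership based on the nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k),
                          key=lambda i: squared_distance(point, centroids[i]))
            clusters[nearest].append(point)
        # Move each centroid to the center (mean) of its cluster.
        new_centroids = []
        for old, members in zip(centroids, clusters):
            if members:
                new_centroids.append(
                    (sum(x for x, _ in members) / len(members),
                     sum(y for _, y in members) / len(members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # clusters and centroids have stabilised
        centroids = new_centroids
    sse = sum(squared_distance(point, centroid)
              for centroid, members in zip(centroids, clusters)
              for point in members)
    return centroids, clusters, sse, iteration


# Hypothetical data: three well-separated groups of four points each.
POINTS = [(1, 1), (1, 2), (2, 1), (2, 2),
          (8, 8), (8, 9), (9, 8), (9, 9),
          (1, 8), (1, 9), (2, 8), (2, 9)]

# Sum of Squared Errors for different numbers of clusters (the Scree Plot data).
for k in range(1, 7):
    _, _, sse, iterations = k_means(POINTS, k)
    print("K =", k, "SSE =", round(sse, 2), "iterations =", iterations)
```

Because the starting centroids are random, a single run can settle on a poor clustering. Trying several seeds and keeping the lowest SSE for each K is a common precaution before looking for the elbow.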
K-Means Clustering Notes
Benefits
• Fast and highly effective at uncovering basic data patterns.
• Works best for spherical, non-overlapping clusters.
Limitations
• Each data point can only be assigned to one cluster.
• Clusters are assumed to be spherical.
Some Alternatives
• Gaussian mixtures.
• Fuzzy K-Means.
Machine Learning Algorithms: Key Takeaways
• The three main types of Machine Learning are Supervised,
Unsupervised and Reinforcement Learning.
• Machine Learning is more than Neural Networks and Deep Learning.
• A successful Machine Learning Model needs to find the balance
between Overfitting and Underfitting.
• Machine Learning Algorithms are merely tools. Good results come
from understanding how they work and tuning them correctly.
Practical Example
Use Case: Titanic Passenger Survival
Goal: Build a classification model for predicting Titanic survivability.
Hypothesis: That it is possible to predict Titanic survivability based on Age, Gender and Ticket Class.
Titanic Dataset
Variable      Description
PassengerId   Unique Identifier
Survival      Survived = 1, Died = 0
Pclass        Ticket class (1, 2 or 3)
Sex           Gender (‘male’ or ‘female’)
Age           Age in years
Sibsp         Number of siblings / spouses aboard the Titanic
Parch         Number of parents / children aboard the Titanic
Ticket        Ticket number
Fare          Passenger fare
Cabin         Cabin number
Embarked      Port of Embarkation
Name          Passenger name, including honorific
Tools
Practical Example: Key Takeaways
• Scikit-learn and Jupyter Notebooks provide a free and flexible basis for starting
with Data Science. Use the Anaconda distribution to save time on installation!
• Feature Engineering is a vital skill for Data Scientists.
• Domain Knowledge is key to succeed!
• Split your data into Test and Training sets.
• Tweaking Hyperparameters may give better results (but you should be able to
explain how your tweak improved model performance).
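As a sketch of the Test and Training split mentioned above, the plain-Python helper below shuffles the rows and holds back a fraction for testing. The helper name, the 25% test fraction, the seed and the placeholder rows are all assumptions (the rows are not real Titanic records). scikit-learn offers `train_test_split` in `sklearn.model_selection` for the same purpose.

```python
import random


def split_train_test(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and hold back a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    test_size = int(round(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]


# Placeholder rows standing in for passengers; not real Titanic data.
passengers = [{"PassengerId": i, "Pclass": 1 + i % 3} for i in range(1, 21)]

train, test = split_train_test(passengers)
print(len(train), "training rows,", len(test), "test rows")
```

The model is fitted on the training rows only. The test rows are kept aside so that performance is measured on data the model has not seen, which helps to reveal overfitting.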
Tips for Getting Started with Data Science
• Become a Data Engineer!
• Learn Python or R (SQL is also useful)!
• Learn some statistical methods!
• Take an online Data Science course (e.g. Udemy DS Nano Degree)!
• Understand the Data Science process!
• Join a Meetup!
• Practice with Kaggle!
Thanks for listening!

More Related Content

What's hot

Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and Forwards
DATAVERSITY
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
Srishti44
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data science
Tharushi Ruwandika
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
DATAVERSITY
 
Introduction To Data Science
Introduction To Data ScienceIntroduction To Data Science
Introduction To Data Science
Spotle.ai
 
Driving Data Intelligence in the Supply Chain Through the Data Catalog at TJX
Driving Data Intelligence in the Supply Chain Through the Data Catalog at TJXDriving Data Intelligence in the Supply Chain Through the Data Catalog at TJX
Driving Data Intelligence in the Supply Chain Through the Data Catalog at TJX
DATAVERSITY
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
ANOOP V S
 
Data Science Full Course | Edureka
Data Science Full Course | EdurekaData Science Full Course | Edureka
Data Science Full Course | Edureka
Edureka!
 
Data Science Tutorial | Introduction To Data Science | Data Science Training ...
Data Science Tutorial | Introduction To Data Science | Data Science Training ...Data Science Tutorial | Introduction To Data Science | Data Science Training ...
Data Science Tutorial | Introduction To Data Science | Data Science Training ...
Edureka!
 
Data science Big Data
Data science Big DataData science Big Data
Data science Big Data
sreekanthricky
 
Data Science
Data ScienceData Science
Data Science
Rabin BK
 
Data Science Training | Data Science Tutorial for Beginners | Data Science wi...
Data Science Training | Data Science Tutorial for Beginners | Data Science wi...Data Science Training | Data Science Tutorial for Beginners | Data Science wi...
Data Science Training | Data Science Tutorial for Beginners | Data Science wi...
Edureka!
 
introduction to data science
introduction to data scienceintroduction to data science
introduction to data science
bhavesh lande
 
Activate Data Governance Using the Data Catalog
Activate Data Governance Using the Data CatalogActivate Data Governance Using the Data Catalog
Activate Data Governance Using the Data Catalog
DATAVERSITY
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
DATAVERSITY
 
Data Management, Metadata Management, and Data Governance – Working Together
Data Management, Metadata Management, and Data Governance – Working TogetherData Management, Metadata Management, and Data Governance – Working Together
Data Management, Metadata Management, and Data Governance – Working Together
DATAVERSITY
 
Data science presentation
Data science presentationData science presentation
Data science presentation
MSDEVMTL
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
Kujambu Murugesan
 
Agile & Data Modeling – How Can They Work Together?
Agile & Data Modeling – How Can They Work Together?Agile & Data Modeling – How Can They Work Together?
Agile & Data Modeling – How Can They Work Together?
DATAVERSITY
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
Laguna State Polytechnic University
 

What's hot (20)

Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and Forwards
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data science
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Introduction To Data Science
Introduction To Data ScienceIntroduction To Data Science
Introduction To Data Science
 
Driving Data Intelligence in the Supply Chain Through the Data Catalog at TJX
Driving Data Intelligence in the Supply Chain Through the Data Catalog at TJXDriving Data Intelligence in the Supply Chain Through the Data Catalog at TJX
Driving Data Intelligence in the Supply Chain Through the Data Catalog at TJX
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Data Science Full Course | Edureka
Data Science Full Course | EdurekaData Science Full Course | Edureka
Data Science Full Course | Edureka
 
Data Science Tutorial | Introduction To Data Science | Data Science Training ...
Data Science Tutorial | Introduction To Data Science | Data Science Training ...Data Science Tutorial | Introduction To Data Science | Data Science Training ...
Data Science Tutorial | Introduction To Data Science | Data Science Training ...
 
Data science Big Data
Data science Big DataData science Big Data
Data science Big Data
 
Data Science
Data ScienceData Science
Data Science
 
Data Science Training | Data Science Tutorial for Beginners | Data Science wi...
Data Science Training | Data Science Tutorial for Beginners | Data Science wi...Data Science Training | Data Science Tutorial for Beginners | Data Science wi...
Data Science Training | Data Science Tutorial for Beginners | Data Science wi...
 
introduction to data science
introduction to data scienceintroduction to data science
introduction to data science
 
Activate Data Governance Using the Data Catalog
Activate Data Governance Using the Data CatalogActivate Data Governance Using the Data Catalog
Activate Data Governance Using the Data Catalog
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
 
Data Management, Metadata Management, and Data Governance – Working Together
Data Management, Metadata Management, and Data Governance – Working TogetherData Management, Metadata Management, and Data Governance – Working Together
Data Management, Metadata Management, and Data Governance – Working Together
 
Data science presentation
Data science presentationData science presentation
Data science presentation
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
 
Agile & Data Modeling – How Can They Work Together?
Agile & Data Modeling – How Can They Work Together?Agile & Data Modeling – How Can They Work Together?
Agile & Data Modeling – How Can They Work Together?
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 

Similar to A Practical-ish Introduction to Data Science

GeeCon Prague 2018 - A Practical-ish Introduction to Data Science
GeeCon Prague 2018 - A Practical-ish Introduction to Data ScienceGeeCon Prague 2018 - A Practical-ish Introduction to Data Science
GeeCon Prague 2018 - A Practical-ish Introduction to Data Science
Mark West
 
JavaZone 2018 - A Practical(ish) Introduction to Data Science
JavaZone 2018 - A Practical(ish) Introduction to Data ScienceJavaZone 2018 - A Practical(ish) Introduction to Data Science
JavaZone 2018 - A Practical(ish) Introduction to Data Science
Mark West
 
NDC Oslo : A Practical Introduction to Data Science
NDC Oslo : A Practical Introduction to Data ScienceNDC Oslo : A Practical Introduction to Data Science
NDC Oslo : A Practical Introduction to Data Science
Mark West
 
Data science in business Administration Nagarajan.pptx
Data science in business Administration Nagarajan.pptxData science in business Administration Nagarajan.pptx
Data science in business Administration Nagarajan.pptx
NagarajanG35
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
DATAVERSITY
 
How to Consume Your Data for AI
How to Consume Your Data for AIHow to Consume Your Data for AI
How to Consume Your Data for AI
DATAVERSITY
 
DATA SCIENCE IS CATALYZING BUSINESS AND INNOVATION
DATA SCIENCE IS CATALYZING BUSINESS AND INNOVATION DATA SCIENCE IS CATALYZING BUSINESS AND INNOVATION
DATA SCIENCE IS CATALYZING BUSINESS AND INNOVATION
Elvis Muyanja
 
Which institute is best for data science?
Which institute is best for data science?Which institute is best for data science?
Which institute is best for data science?
DIGITALSAI1
 
Best Selenium certification course
Best Selenium certification courseBest Selenium certification course
Best Selenium certification course
KumarNaik21
 
Data science training in hyd ppt (1)
Data science training in hyd ppt (1)Data science training in hyd ppt (1)
Data science training in hyd ppt (1)
SayyedYusufali
 
Data science training institute in hyderabad
Data science training institute in hyderabadData science training institute in hyderabad
Data science training institute in hyderabad
VamsiNihal
 
Data science training in Hyderabad
Data science  training in HyderabadData science  training in Hyderabad
Data science training in Hyderabad
saitejavella
 
Data science training Hyderabad
Data science training HyderabadData science training Hyderabad
Data science training Hyderabad
Nithinsunil1
 
Data science online training in hyderabad
Data science online training in hyderabadData science online training in hyderabad
Data science online training in hyderabad
VamsiNihal
 
Data science training in hyd ppt (1)
Data science training in hyd ppt (1)Data science training in hyd ppt (1)
Data science training in hyd ppt (1)
SayyedYusufali
 
data science training and placement
data science training and placementdata science training and placement
data science training and placement
SaiprasadVella
 
online data science training
online data science trainingonline data science training
online data science training
DIGITALSAI1
 
Data science online training in hyderabad
Data science online training in hyderabadData science online training in hyderabad
Data science online training in hyderabad
VamsiNihal
 
data science online training in hyderabad
data science online training in hyderabaddata science online training in hyderabad
data science online training in hyderabad
VamsiNihal
 
Best data science training in Hyderabad
Best data science training in HyderabadBest data science training in Hyderabad
Best data science training in Hyderabad
KumarNaik21
 

Similar to A Practical-ish Introduction to Data Science (20)

GeeCon Prague 2018 - A Practical-ish Introduction to Data Science
GeeCon Prague 2018 - A Practical-ish Introduction to Data ScienceGeeCon Prague 2018 - A Practical-ish Introduction to Data Science
GeeCon Prague 2018 - A Practical-ish Introduction to Data Science
 
JavaZone 2018 - A Practical(ish) Introduction to Data Science
JavaZone 2018 - A Practical(ish) Introduction to Data ScienceJavaZone 2018 - A Practical(ish) Introduction to Data Science
JavaZone 2018 - A Practical(ish) Introduction to Data Science
 
NDC Oslo : A Practical Introduction to Data Science
NDC Oslo : A Practical Introduction to Data ScienceNDC Oslo : A Practical Introduction to Data Science
NDC Oslo : A Practical Introduction to Data Science
 
Data science in business Administration Nagarajan.pptx
Data science in business Administration Nagarajan.pptxData science in business Administration Nagarajan.pptx
Data science in business Administration Nagarajan.pptx
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
 
How to Consume Your Data for AI
A Practical-ish Introduction to Data Science

  • 5. Who Am I? • Previously Java Developer and Architect. • Currently building and managing a team of Data Scientists at Bouvet Oslo. • Leader of javaBin (Norwegian Java User Group). @markawest
  • 10. What is Data Science? Agenda: What is Data Science?, Machine Learning Algorithms, Practical Example. @markawest
  • 11. @markawest “Data Science… is an interdisciplinary field of scientific methods, processes, and systems to extract knowledge or insight from data…” Wikipedia
  • 18. @markawest Data Science Process: Hypothesis Driven. 1. Question 2. Data 3. Exploratory Data Analysis 4. Formal Modelling 5. Interpretation 6. Communication 7. Result
  • 29. @markawest Roles Required in a Data Science Project
      Data Scientist: • Prove / disprove hypotheses. • Information and Data gathering. • Data wrangling. • Algorithm and ML models. • Communication.
      Data Engineer: • Build Data Driven Platforms. • Operationalize Algorithms and Machine Learning models. • Data Integration. • Monitoring.
      Data Visualization: • Storytelling. • Build Dashboards and other Data visualizations. • Provide insight through visual means.
      Process Owner: • Project Management. • Manage stakeholder expectations. • Maintain a Vision. • Facilitate. • Evangelize.
  • 31. Isn’t Data Science just a rebranding of Business Intelligence? @markawest
  • 32. @markawest Data Science as an Evolution of BI
      Data Sources. Business Intelligence: Structured Data, most often from Relational Database Management Systems (RDBMS). Data Science adds: Unstructured Data (log files, audio, images, emails, tweets, raw text, documents).
      Available Tools. Business Intelligence: Data Visualization, Statistics. Data Science adds: Machine Learning.
      Goals. Business Intelligence: Provide support to strategic decision making, based on historical data. Data Science adds: Provide business value through advanced functionality.
      Source: https://www.linkedin.com/pulse/data-science-business-intelligence-whats-difference-david-rostcheck
  • 36. @markawest Machine Learning: A Tool for Data Science
      Artificial Intelligence: Enabling computers to mimic human intelligence and behavior.
      Machine Learning: Algorithms allowing computers to learn, make predictions and describe data without being explicitly programmed.
      Deep Learning: Black box learning with multi-layered Neural Networks.
  • 37. What is Data Science: Key Takeaways • Data Scientists require Math and Statistics skills in addition to traditional Software Development. • Data Science is Hypothesis Driven. • Data Science projects require a range of competencies/roles. • Data Science can be seen as an evolution of Business Intelligence, providing additional capabilities through the application of cutting edge technologies and unstructured data. @markawest
  • 38. Machine Learning Algorithms. Agenda: What is Data Science?, Machine Learning Algorithms, Practical Example. @markawest
  • 39. @markawest “Machine Learning: Field of study that gives computers the ability to learn without being explicitly programmed.” Arthur L. Samuel, IBM Journal of Research and Development, 1959. Traditional Programming: Data + Rules → Computer → Output. Machine Learning: Data + Output → Computer → Rules.
  • 40. The Art of The Generalized Model @markawest Underfitted: Model overlooks underlying patterns in your training data. Generalized: Captures the correlations in your training data. May have an error margin. Overfitted: Model memorizes the training data rather than finding underlying patterns.
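The three cases on this slide can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not taken from the slides: the data (roughly y = 2x with noise), the "underfitted" model that always predicts the mean, and the "overfitted" model that simply looks up the nearest training point.

```python
# Toy data (an assumption, not from the slides): y is roughly 2 * x.
train_x = [0, 1, 2, 3, 4, 5]
train_y = [1, 1, 5, 5, 9, 9]          # 2 * x with alternating +1 / -1 noise
test_x = [0.4, 1.4, 2.4, 3.4, 4.4]
test_y = [0.8, 2.8, 4.8, 6.8, 8.8]    # noise-free 2 * x


def fit_line(xs, ys):
    """Least-squares slope and intercept for a single feature."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(train_x, train_y)


def underfitted(x):
    """Overlooks the pattern: always predicts the mean training label."""
    return sum(train_y) / len(train_y)


def generalized(x):
    """Captures the trend, accepting some error on the training data."""
    return slope * x + intercept


def overfitted(x):
    """Memorizes the training data: returns the label of the nearest training point."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


for name, model in [("underfitted", underfitted),
                    ("generalized", generalized),
                    ("overfitted", overfitted)]:
    print(name,
          "train:", round(mean_squared_error(model, train_x, train_y), 3),
          "test:", round(mean_squared_error(model, test_x, test_y), 3))
```

On this toy data the memorizing model is perfect on the training set but does worse on the unseen points than the straight line, which is the trade-off the slide describes.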
  • 41. Machine Learning Types @markawest
      Supervised Learning: Model trained on historical data. Resulting model can be used to make predictions on new data. Use Case: Predicting a value based on patterns discovered in previous data.
      Unsupervised Learning: Algorithm finds trends and patterns in data, without prior training on historical data. Use Case: Describing your data based on statistical analysis.
      Reinforcement Learning: Model uses a feedback loop to iteratively improve its performance. Use Case: Learning how to best solve a problem based on trial and error.
  • 44. Example Machine Learning Algorithms @markawest Supervised Learning: Regression (Linear Regression), Classification (Decision Trees). Unsupervised Learning: Clustering (K-Means).
  • 47. Linear Regression @markawest Feature: Floor Space. Label: House Price. Plot annotations: Trend Line, Deviation, Prediction.
      Floor Space | House Price
      1 180       | 221 900
      2 570       | 538 000
      770         | 180 000
      1 960       | 604 000
      1 680       | 510 000
      …           | …
      …           | …
      5 240       | 1 225 000
  • 48. Fitting a trend line: Ordinary Least Squares @markawest Deviations a, b, c, d, e, f of each point from the trend line: a² + b² + c² + d² + e² + f² = sum of squared error. Outlier?
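As a minimal sketch of Ordinary Least Squares, the snippet below fits a trend line to the six rows that are visible in the Floor Space / House Price table (the elided "…" rows are not included). The function names are hypothetical, and the closed-form single-feature formula is one standard way to minimise the sum of squared error, not necessarily what was used in the talk.

```python
# The six visible rows of the slide's table; elided rows are left out.
floor_space = [1180, 2570, 770, 1960, 1680, 5240]
house_price = [221900, 538000, 180000, 604000, 510000, 1225000]


def fit_trend_line(xs, ys):
    """Return (slope, intercept) minimising the sum of squared error."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_of_squared_error(xs, ys, slope, intercept):
    """a² + b² + ... : the squared deviation of every point from the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


slope, intercept = fit_trend_line(floor_space, house_price)
print("slope:", round(slope, 2), "intercept:", round(intercept, 2))
print("sum of squared error:",
      round(sum_of_squared_error(floor_space, house_price, slope, intercept)))

# Prediction for a new floor space value (2000 is an arbitrary example input).
print("predicted price for 2000:", round(slope * 2000 + intercept))
```

Because each deviation is squared, a single point far from the line contributes disproportionately to the total, which is why an outlier can pull the trend line towards itself.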
  • 49. Linear Regression Notes Benefits • Simple to understand. • Transparent. Limitations • Outliers skew trend line. • Doesn’t work with non-linear relationships. Some Alternatives • Non-linear Least Squares. • Tree algorithms. @markawest
  • 51. Decision Tree: Calculating the Best Split @markawest Goal: Build a Decision Tree for deciding who gets a payrise this year, based on historical payrise data. Features: Placements, Complaints, Lived in Norway. Label: Payrise. Candidate splits: Lived in Norway (Yes / No), Complaints (Yes / No), Placements (Yes / No).
      Name   | Placements | Complaints | Lived in Norway | Payrise
      Don    | Yes        | Yes        | Yes             | Yes
      Lewis  | Yes        | Yes        | No              | Yes
      Mike   | Yes        | No         | Yes             | Yes
      Danny  | Yes        | Yes        | No              | Yes
      Dan    | No         | No         | Yes             | No
      Elliot | Yes        | No         | No              | Yes
      Luke   | Yes        | No         | No              | Yes
      Tom    | Yes        | Yes        | No              | Yes
      Nathan | No         | Yes        | Yes             | No
      Owen   | Yes        | No         | No              | Yes
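  • 55. Decision Tree: Calculating the Best Split @markawest Candidate splits: Placements (Yes / No), Complaints (Yes / No), Lived in Norway (Yes / No). Number of recruiters with a Yes for each feature, grouped by Payrise:
      Payrise | Recruiters | Placements | Complaints | Lived in Norway
      Yes     | 8          | 8          | 4          | 2
      No      | 2          | 0          | 1          | 2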
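One way to put a number on "best split" is sketched below using the recruiter table from the preceding slides. The slides do not say which measure is used; weighted Gini impurity is an assumption here (information gain is another common choice), and the helper names are hypothetical. A lower score means the split separates the Payrise labels more cleanly.

```python
from collections import Counter

# (name, placements, complaints, lived_in_norway, payrise) from the slide's table.
recruiters = [
    ("Don", "Yes", "Yes", "Yes", "Yes"),
    ("Lewis", "Yes", "Yes", "No", "Yes"),
    ("Mike", "Yes", "No", "Yes", "Yes"),
    ("Danny", "Yes", "Yes", "No", "Yes"),
    ("Dan", "No", "No", "Yes", "No"),
    ("Elliot", "Yes", "No", "No", "Yes"),
    ("Luke", "Yes", "No", "No", "Yes"),
    ("Tom", "Yes", "Yes", "No", "Yes"),
    ("Nathan", "No", "Yes", "Yes", "No"),
    ("Owen", "Yes", "No", "No", "Yes"),
]
FEATURES = {"Placements": 1, "Complaints": 2, "Lived in Norway": 3}
LABEL = 4  # column index of Payrise


def gini(rows):
    """Gini impurity of the Payrise labels in rows (0.0 means all the same)."""
    if not rows:
        return 0.0
    counts = Counter(row[LABEL] for row in rows)
    return 1.0 - sum((count / len(rows)) ** 2 for count in counts.values())


def split_impurity(rows, column):
    """Size-weighted Gini impurity of the groups produced by splitting on column."""
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row)
    return sum(len(group) / len(rows) * gini(group) for group in groups.values())


scores = {name: split_impurity(recruiters, column) for name, column in FEATURES.items()}
best_split = min(scores, key=scores.get)

for name, score in scores.items():
    print(name, round(score, 3))
print("best split:", best_split)
```

On this table the Placements split leaves two groups that each contain a single Payrise value, so its weighted impurity is zero and it is chosen as the first split.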
  • 56. Building a Decision Tree: A Bad Split? @markawest Candidate splits: Placements (Yes / No), Complaints (Yes / No), Lived in Norway (Yes / No).
      Payrise | Recruiters | Placements | Complaints | Lived in Norway
      Yes     | 8          | 7          | 8          | 2
      No      | 2          | 1          | 0          | 2
  • 57. Decision Tree: Recursive Partitioning @markawest Features: Outlook, Temp, Humidity, Wind. Label: Play.
      Outlook  | Temp | Humidity | Wind   | Play
      Sunny    | Hot  | High     | Weak   | No
      Sunny    | Hot  | High     | Strong | No
      Overcast | Hot  | High     | Weak   | Yes
      …        | …    | …        | …      | …
      …        | …    | …        | …      | …
      Overcast | Mild | High     | Strong | Yes
      Overcast | Hot  | Normal   | Weak   | Yes
      Rain     | Mild | High     | Strong | No
      Tree: Outlook = Overcast: Yes. Outlook = Sunny: split on Humidity (High: No, Normal: Yes). Outlook = Rain: split on Wind (Strong: No, Weak: Yes).
  • 58. Building a Decision Tree: Where to Stop? @markawest #1: All data at the current leaf belongs to the same class. (Tree diagram: Outlook, with Humidity and Wind sub-splits.)
  • 59. Building a Decision Tree: Where to Stop? @markawest #2: Maximum tree depth reached. (Tree diagram: Outlook, with Humidity and Wind sub-splits.)
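Recursive partitioning with the two stopping rules above can be sketched as follows. The weather table on the slide is partly elided, so this sketch reuses the complete recruiter table from the earlier slides instead. The `build_tree` function, its `max_depth` default, and the use of weighted Gini impurity to choose each split are all assumptions made for illustration.

```python
from collections import Counter

COLUMNS = ("Placements", "Complaints", "Lived in Norway", "Payrise")
# One string per recruiter, in slide order (Don ... Owen); Y = Yes, N = No.
RAW = ["YYYY", "YYNY", "YNYY", "YYNY", "NNYN", "YNNY", "YNNY", "YYNY", "NYYN", "YNNY"]
ROWS = [dict(zip(COLUMNS, ("Yes" if c == "Y" else "No" for c in row))) for row in RAW]


def gini(rows, label):
    counts = Counter(row[label] for row in rows)
    return 1.0 - sum((n / len(rows)) ** 2 for n in counts.values())


def split_impurity(rows, feature, label):
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row)
    return sum(len(g) / len(rows) * gini(g, label) for g in groups.values())


def build_tree(rows, features, label, depth=0, max_depth=3):
    """Return a class name (leaf) or {feature: {value: subtree}}."""
    labels = [row[label] for row in rows]
    majority = Counter(labels).most_common(1)[0][0]
    # Stop rule #1: all data at the current leaf belongs to the same class.
    if len(set(labels)) == 1:
        return majority
    # Stop rule #2: maximum tree depth reached (or no features left to split on).
    if depth >= max_depth or not features:
        return majority
    best = min(features, key=lambda f: split_impurity(rows, f, label))
    remaining = [f for f in features if f != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        branches[value] = build_tree(subset, remaining, label, depth + 1, max_depth)
    return {best: branches}


features = ["Placements", "Complaints", "Lived in Norway"]
print(build_tree(ROWS, features, "Payrise"))
print(build_tree(ROWS, features, "Payrise", max_depth=0))
```

With the default depth the recursion ends after one split because both branches are pure (rule #1); with `max_depth=0` rule #2 fires immediately and the tree collapses to the majority class.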
  • 60. Decision Tree Notes Benefits • White Box. • Flexible (use for both regression and classification). • Robust to outliers. • Handle non-linear boundaries. Limitations • Susceptible to overfitting. • Changes to where the Data is sliced can produce different results. Some Alternatives • Support Vector Machine. • Logistic Regression. • Random Forests. @markawest
  • 62. K-Means Clustering @markawest • K = The number of clusters the algorithm will try to find. • K should be large enough to extract meaningful patterns but small enough that clusters remain clearly distinct. • So how do we choose K?
  • 63. Sum of Squared Errors @markawest a² + b² + c² + d² + e² + f² = sum of squared errors [Diagram: data points at distances a, b, c, d, e, f from the cluster centroid]
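    As a small worked example of the formula, the sketch below computes the sum of squared errors for one cluster, taking each error to be the straight-line distance from a point to the cluster centroid. The four points are made up for illustration and the helper names are hypothetical.

```python
def centroid_of(points):
    """Mean position of a list of equally sized coordinate tuples."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def sum_of_squared_errors(points, centroid):
    """Sum over all points of the squared distance to the centroid."""
    return sum(
        sum((p - c) ** 2 for p, c in zip(point, centroid))
        for point in points
    )


# Made-up cluster: the four corners of a square around (2, 2).
cluster = [(1.0, 1.0), (3.0, 1.0), (1.0, 3.0), (3.0, 3.0)]
centre = centroid_of(cluster)
sse = sum_of_squared_errors(cluster, centre)
print(centre, sse)  # (2.0, 2.0) 8.0
```

    Each corner sits at squared distance 1² + 1² = 2 from the centre, so the four terms add up to 8.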
  • 64. Sum of Squared Errors vs. Amount of Clusters @markawest
  • 65. Sum of Squared Errors vs. Amount of Clusters @markawest
  • 66. Sum of Squared Errors vs. Amount of Clusters @markawest
  • 67. K-Means: Calculating the K value @markawest • Scree Plots allow us to find optimal number of clusters. • Shows the Sum of Squared Errors for different numbers of clusters. • The optimal K value is at the “Elbow” of the plot.
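    The elbow idea can be sketched without any plotting. The example below uses made-up one-dimensional data with three obvious groups. To keep the numbers deterministic it finds the lowest possible SSE for each K by exhaustive search over contiguous groupings, which stands in for running K-Means itself. The elbow rule (stop at the first K where adding one more cluster removes less than 10% of the K = 1 error) is an assumed heuristic, not something stated on the slides.

```python
from itertools import combinations


def sse(cluster):
    mean = sum(cluster) / len(cluster)
    return sum((x - mean) ** 2 for x in cluster)


def best_sse(points, k):
    """Lowest total SSE over all ways to cut the sorted points into k groups."""
    pts = sorted(points)
    best = float("inf")
    for cuts in combinations(range(1, len(pts)), k - 1):
        bounds = (0, *cuts, len(pts))
        total = sum(sse(pts[a:b]) for a, b in zip(bounds, bounds[1:]))
        best = min(best, total)
    return best


def elbow(curve, threshold=0.1):
    """First K where one more cluster saves less than threshold * SSE at the smallest K."""
    ks = sorted(curve)
    baseline = curve[ks[0]]
    for k, next_k in zip(ks, ks[1:]):
        if curve[k] - curve[next_k] < threshold * baseline:
            return k
    return ks[-1]


# Made-up data: three tight groups around 1, 5 and 9.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7, 9.0, 9.2, 8.8]
curve = {k: best_sse(points, k) for k in range(1, 7)}
for k in sorted(curve):
    print(k, round(curve[k], 3))
print("elbow at K =", elbow(curve))
```

    The printed values fall steeply up to K = 3 and then flatten, which is the shape a scree plot shows at its elbow.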
  • 68. K-Means Demo Randomly allocate centroids @markawest
  • 69. K-Means Demo Randomly allocate centroids @markawest
  • 70. K-Means Demo Iteration 1: Calculate cluster membership based on nearest centroid @markawest
  • 71. K-Means Demo Iteration 1: Move centroids to the center of their cluster @markawest
  • 72. K-Means Demo Iteration 2: Move centroids to the center of their cluster @markawest
  • 73. K-Means Demo Iteration 2: Recalculate cluster membership based on nearest centroid @markawest
  • 74. K-Means Demo After 6 iterations: Clusters and centroids stabilise, algorithm stops @markawest
  • 75. K-Means Clustering Notes Benefits • Fast and highly effective at uncovering basic data patterns. • Works best for spherical, non-overlapping clusters. Limitations • Each data point can only be assigned to one cluster. • Clusters are assumed to be spherical. Some Alternatives • Gaussian mixtures. • Fuzzy K-Means. @markawest
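    The demo steps (random centroids, assign each point to its nearest centroid, move each centroid to the centre of its cluster, repeat until nothing changes) can be written as a short loop. This is a minimal sketch in plain Python with made-up points. Picking the starting centroids from the data points themselves is an assumption, and the function names are hypothetical.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def mean_point(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, seed=0, max_iter=100):
    rng = random.Random(seed)
    # Randomly allocate centroids (here: k distinct data points).
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Calculate cluster membership based on the nearest centroid.
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:
            break  # Memberships stopped changing: the clusters are stable.
        labels = new_labels
        # Move each centroid to the centre of its cluster.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # An empty cluster keeps its previous centroid.
                centroids[i] = mean_point(members)
    return centroids, labels


# Made-up data: two tight, well-separated groups.
POINTS = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = k_means(POINTS, k=2)
print(centroids, labels)
```

    The cluster numbers themselves are arbitrary and depend on the random start. What matters is which points end up sharing a label.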
  • 76. Machine Learning Algorithms: Key Takeaways @markawest • The three main types of Machine Learning are Supervised, Unsupervised and Reinforcement Learning. • Machine Learning is more than Neural Networks and Deep Learning. • A successful Machine Learning Model needs to find the balance between Overfitting and Underfitting. • Machine Learning Algorithms are merely tools. Good results come from understanding how they work and tuning them correctly.
  • 77. Practical Example What is Data Science? Machine Learning Algorithms Practical Example @markawest
  • 78. Use Case: Titanic Passenger Survival @markawest Goal: Build a classification model for predicting Titanic survivability.
  • 79. Hypothesis That it is possible to predict Titanic survivability based on Age, Gender and Ticket Class. @markawest
  • 80. Titanic Dataset @markawest
      Variable     Description
      PassengerId  Unique Identifier
      Survival     Survived = 1, Died = 0
      Pclass       Ticket class (1, 2 or 3)
      Sex          Gender (‘male’ or ‘female’)
      Age          Age in years
      Sibsp        Number of siblings / spouses aboard the Titanic
      Parch        Number of parents / children aboard the Titanic
      Ticket       Ticket number
      Fare         Passenger fare
      Cabin        Cabin number
      Embarked     Port of Embarkation
      Name         Passenger name, including honorific
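    Because the Name column includes an honorific, one plausible piece of feature engineering is to pull that honorific out as its own feature and to turn Sex into a number. The sketch below assumes names follow the "Surname, Title. Given names" layout of the public Kaggle Titanic data. The sample records and the helper names `extract_title` and `encode_sex` are illustrative assumptions, not taken from the talk.

```python
import re

# Assumes the "Surname, Title. Given names" layout of the Kaggle Titanic data.
TITLE_PATTERN = re.compile(r",\s*([^.]+)\.")


def extract_title(name):
    """Return the honorific found in a passenger name, or None if absent."""
    match = TITLE_PATTERN.search(name)
    return match.group(1).strip() if match else None


def encode_sex(sex):
    """Map the Sex column ('male' / 'female') to 0 / 1."""
    return 1 if sex == "female" else 0


# Illustrative records in the assumed format.
passengers = [
    {"Name": "Braund, Mr. Owen Harris", "Sex": "male"},
    {"Name": "Heikkinen, Miss. Laina", "Sex": "female"},
]
features = [
    {"Title": extract_title(p["Name"]), "SexCode": encode_sex(p["Sex"])}
    for p in passengers
]
print(features)
```

    A model can then use Title and SexCode alongside Age and Pclass, which are already numeric.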
  • 83. Practical Example: Key Takeaways @markawest • Scikit-learn and Jupyter Notebooks provide a free and flexible basis for starting with Data Science. Use the Anaconda distribution to save time on installation! • Feature Engineering is a vital skill for Data Scientists. • Domain Knowledge is key to success! • Split your data into Test and Training sets. • Tweaking Hyperparameters may give better results (but you should be able to explain how your tweak improved model performance).
  • 84. Tips for Getting Started with Data Science @markawest • Become a Data Engineer! • Learn Python or R (SQL is also useful)! • Learn some statistical methods! • Take an online Data Science course (e.g. Udemy DS Nano Degree)! • Understand the Data Science process! • Join a Meetup! • Practice with Kaggle!
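    Splitting data into Training and Test sets means the model is scored on rows it never saw while learning. In scikit-learn this is normally done with `sklearn.model_selection.train_test_split`. The dependency-free sketch below shows the same idea, with a hypothetical helper name, a made-up dataset and an assumed 25% test share.

```python
import random


def split_train_test(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of rows and hold back a fraction of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Made-up stand-in for a dataset: 20 row identifiers.
rows = list(range(20))
train, test = split_train_test(rows)
print(len(train), len(test))  # 15 5
```

    Fixing the seed keeps the split reproducible, which helps when explaining why a hyperparameter tweak changed the score.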

Editor's Notes

  1. But first, who the devil am I? As you can see from my Twitter handle, my name is Mark West, and I’m an Englishman living here in Oslo, Norway.
  2. For me, speaking is a hobby that I do to learn and to share my own knowledge and experiences. In the past couple of years I have spoken at a range of conferences across Europe and the US. The good news is that this is the first time I have spoken at NDC. This is also the first time I have given this specific talk, so I am excited to hear your feedback. So let’s get started!
  5. Here is the Agenda for my talk. As you can see it is split into four sections.
  6. I’ll then go on to define what Data Science is, what parts are most relevant for us, and how Data Science is linked with Machine Learning and Artificial Intelligence. I’ll also talk about the drivers behind Data Science projects and the roles that these projects require.
  7. Machine Learning is the most popular application of Data Science at the moment, and I’ll therefore spend some time defining the categories and types of Machine Learning algorithms, and give you some examples of the most commonly used algorithms.
  8. Finally I will show you a practical example of applied Data Science, and show you how Data Science is more than just Machine Learning.
  9. Right, so what’s the motivation? Why am I here today?
  10. Tip : Possibly replace this with Bouvet’s own methodology if it is ready and good enough.
  11. OK, so let’s move on to the second part of my talk – Machine Learning algorithms.
  12. Machine Learning is all about giving computers a framework to create their own logic or rules, without these being programmed by a human. Look at it as an inversion of control when compared to traditional programming.
  13. An underfitted model is likely to neglect significant trends, which would cause it to yield less accurate predictions for both current and future data. An overfitted model would yield highly accurate predictions for the current data, but would be less generalizable to future data. But when parameters are tuned just right, such as shown in Figure 2b, the algorithm strikes a balance between identifying major trends and discounting minor variations, rendering the resulting model well-suited for making predictions. Note – more complex models are prone to overfitting.
  14. Note that reinforcement learning continuously improves itself, whereas supervised and unsupervised models will have to be built again to reflect new data. So if your use case requires you to
  15. Other popular forms of Regression Model include Non-Linear Regression, which is used for modelling non-linear trend lines, and Logistic Regression, which is a form of Regression where the trend line is used to separate data points into classes.
  16. Multicollinearity: You go to see a rock and roll band with two great guitar players. You're eager to see which one plays best. But on stage, they're both playing furious leads at the same time! When they're both playing loud and fast, how can you tell which guitarist has the biggest effect on the sound? Even though they aren't playing the same notes, what they're doing is so similar it's difficult to tell one from the other.
  23. Why Random Forests: As decision trees are grown by splitting data points into homogeneous groups, a slight change in the data could trigger changes to the split and result in a different tree. As decision trees also aim for the best way to split data points each time, they are vulnerable to overfitting (see Chapter 1.3). Inaccuracy: Using the best binary question to split the data at the start might not lead to the most accurate predictions. Sometimes, less effective splits used initially may lead to better predictions subsequently.
  24. More Data beats complex algorithms : It’s all about the DATA!!!! Garbage in, Garbage out!!
  26. survival – Did the passenger survive? pclass – Which sex age sibsp parch ticket Fare cabin embarked