Intro to
Python:
Build a
Predictive
Model
Introductions
➔ What's your name?
➔ What brought you here today?
➔ What is your programming experience?
We train developers and
data scientists through
1x1 mentorship and
project-based learning.
Guaranteed.
About
Thinkful
Learn
by
Doing
➔ Why is Data Science a thing?
➔ What is Python?
➔ How do we use it with a real
world project?
➔ How do I learn more?
What
is
a
Data
Scientist?
“[LinkedIn] was like arriving at a conference
reception and realizing you don’t know anyone. So
you just stand in the corner sipping your drink —
and you probably leave early.”
— LinkedIn Manager, June 2006
Example:
LinkedIn
2006
➔ A data scientist joined LinkedIn in 2006, when it had
only 8M users (450M in 2016)
➔ Started experiments to predict
people’s networks
➔ Engineers were dismissive: “you
can already import your address
book”
Enter:
Data
Scientist
➔ Frame the question
➔ Collect the raw data
➔ Process the data
➔ Explore the data
➔ Communicate results
The
Process:
LinkedIn
Example
➔ What questions do we want to answer?
◆ Who?
◆ What?
◆ When?
◆ Where?
◆ Why?
◆ How?
Case:
Frame
the
Question
➔ What connections (type and number) lead to higher
user engagement?
➔ Which connections do people want to make but are
currently limited from making?
➔ How might we predict these types of connections with
limited data from the user?
Case:
Frame
the
Question
➔ What data do we need to
answer these questions?
Case:
Collect
the
Data
➔ Connection data (who is connected to whom?)
➔ Demographic data (what is the profile of
the connection?)
➔ Engagement data (how do they use the site?)
Case:
Collect
the
Data
➔ How is the data “dirty”
and how can we clean it?
Case:
Process
the
Data
➔ User input
➔ Redundancies
➔ Feature changes
➔ Data model changes
Case:
Process
the
Data
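A minimal sketch of what cleaning "user input" and "redundancies" can look like in pandas. The column names and sample rows are invented for illustration; they are not LinkedIn data.

```python
import pandas as pd

# Hypothetical raw profile rows: free-text user input with inconsistent
# casing, stray whitespace, and one duplicated record.
raw = pd.DataFrame({
    "user_id": [1, 2, 2, 3],
    "company": [" Acme Corp", "acme corp", "acme corp", "Globex "],
})

cleaned = raw.copy()
# Normalize user input: trim whitespace and lowercase the text.
cleaned["company"] = cleaned["company"].str.strip().str.lower()
# Remove redundancies: drop rows that are exact duplicates after normalizing.
cleaned = cleaned.drop_duplicates().reset_index(drop=True)

print(cleaned)
```

Normalizing before de-duplicating matters here: rows that differ only by casing or spacing would otherwise survive as separate records.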
➔ What are the meaningful
patterns in the data?
Case:
Explore
the
Data
➔ Triangle closing
➔ Time Overlaps
➔ Geographic Overlaps
Case:
Explore
the Data
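Triangle closing can be sketched as "suggest people who share connections with you." The graph below is made up, and the function name `suggest_connections` is hypothetical; this illustrates the idea only and is not LinkedIn's actual method.

```python
from collections import Counter

# Hypothetical connection graph: each user maps to the set of users
# they are already connected to (connections are mutual).
connections = {
    "ana": {"ben", "cara"},
    "ben": {"ana", "cara", "dev"},
    "cara": {"ana", "ben", "dev"},
    "dev": {"ben", "cara"},
}


def suggest_connections(user, graph):
    """Rank people the user is not connected to by number of mutual connections."""
    mutual_counts = Counter()
    for friend in graph[user]:
        for friend_of_friend in graph[friend]:
            # Skip the user themselves and anyone they already know.
            if friend_of_friend != user and friend_of_friend not in graph[user]:
                mutual_counts[friend_of_friend] += 1
    return mutual_counts.most_common()


print(suggest_connections("ana", connections))  # [('dev', 2)]
```

Here "ana" and "dev" are not connected but share two connections, so connecting them would close two triangles.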
➔ How do we communicate this?
➔ To whom?
Case:
Communicate
Findings
➔ Marketing - sell X more ad space, results in X more
impressions per day
➔ Product - build X more features
➔ Development - grow our team by X
➔ Sales - attract X more premium accounts
➔ C-Level - more revenue, growth from 8M to 450M users in 10 years
Case:
Communicate
Findings
The
Result
Python for Programming
➔ Great for Data Science
➔ Robotics
➔ Web Development
(Python/Django)
➔ Automation
Let’s
Learn
Python
Let’s
Learn
Python
➔ Our model is going to be a Decision Tree
➔ Decision Trees predict the most likely outcome
based on input
➔ Like a computer building a version of 20
questions
The
Model
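To make the "20 questions" idea concrete, here is a decision tree written by hand as nested questions. The function name `should_play_golf` and the rules are assumptions for illustration; a trained tree picks its own questions from the data and may ask different ones.

```python
def should_play_golf(outlook, humidity, windy):
    """A hand-written decision tree: each `if` is one question about the day."""
    # Question 1: is it overcast?
    if outlook == "overcast":
        return "yes"
    # Question 2 (sunny days): is the humidity normal?
    if outlook == "sunny":
        return "yes" if humidity == "normal" else "no"
    # Question 3 (remaining case, rainy days): is it windy?
    return "no" if windy == "true" else "yes"


print(should_play_golf("sunny", "high", "false"))   # no
print(should_play_golf("rainy", "normal", "false"))  # yes
```

A decision tree library does the same thing automatically: it chooses which question to ask at each step by looking at example data.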
Decision
Trees:
Golf?
➔ We’ll build this model in
Colaboratory, a Google-hosted
Python notebook
➔ Go to:
Colab.research.google.com
➔ Add the Colaboratory extension
to your G-suite
The
Notebook
➔ Go to: bit.ly/glf-dt
➔ Click File
➔ Select Save a Copy in Drive
➔ This is your personal version of
the notebook--let’s get started!
The
Notebook
from sklearn import tree
import pandas as pd
➔ Import the tree module from
the scikit-learn (sklearn) Python package
➔ Import pandas to create our
DataFrame
➔ bit.ly/sklearn-python
Code
Block 1
golf_df = pd.DataFrame()
golf_df['Outlook'] = ['sunny', 'sunny', 'overcast', 'rainy', 'rainy','rainy',
'overcast', 'sunny', 'sunny', 'rainy', 'sunny', 'overcast',
'overcast', 'rainy']
golf_df['Temperature'] = ['hot', 'hot', 'hot', 'mild', 'cool', 'cool', 'cool',
'mild', 'cool', 'mild', 'mild', 'mild', 'hot', 'mild']
golf_df['Humidity'] = ['high', 'high', 'high', 'high', 'normal', 'normal',
'normal','high', 'normal', 'normal', 'normal', 'high',
'normal', 'high']
golf_df['Windy'] = ['false', 'true', 'false', 'false', 'false', 'true', 'true',
'false', 'false', 'false', 'true', 'true', 'false', 'true']
➔ Load in our seed data
Code
Block 2
golf_df['Play'] = ['no', 'no', 'yes', 'yes', 'yes', 'no', 'yes', 'no',
'yes', 'yes', 'yes','yes', 'yes', 'no']
golf_df
➔ This is our target data
➔ Print the data frame to make sure our table
looks accurate
Code
Block 2
one_hot_data = pd.get_dummies(golf_df[['Outlook', 'Temperature', 'Humidity', 'Windy']])
one_hot_data
➔ Sklearn Tree needs numerical data
➔ To make this conversion we use a method called One Hot
Encoding
➔ Print the one_hot_data to see the newly encoded data frame
Code
Block 3
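A small sketch of what One Hot Encoding does, using a toy one-column frame (the `toy` data is made up for illustration). Each category becomes its own column, and each row gets a 1 in the column that matches its value.

```python
import pandas as pd

# Toy data: a single categorical column with three possible values.
toy = pd.DataFrame({"Outlook": ["sunny", "overcast", "rainy", "sunny"]})

# get_dummies creates one column per category, named <column>_<value>.
encoded = pd.get_dummies(toy[["Outlook"]])

print(list(encoded.columns))
# Cast to int so the table shows 0/1 regardless of pandas version.
print(encoded.astype(int))
```

Depending on the pandas version, the encoded values display as 0/1 or as False/True; scikit-learn accepts either.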
clf = tree.DecisionTreeClassifier()
clf_train = clf.fit(one_hot_data, golf_df['Play'])
➔ We create an untrained DecisionTreeClassifier and
assign it to the variable clf
➔ We fit the decision tree with our
one_hot_data and our target column 'Play';
fit returns the fitted classifier, which we
assign to clf_train
Code
Block 4
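# Export the fitted tree as Graphviz DOT text, labeling each question with our column names
print(tree.export_graphviz(clf_train, out_file=None, feature_names=list(one_hot_data.columns)))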
➔ SKLearn is automatically creating our
Decision Tree questions for us (Example: Do I
play golf when it is Overcast?)
➔ Paste the printed output
into: webgraphviz.com
Code
Block 4
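As an alternative to pasting text into a website, scikit-learn can print the tree's questions as plain text with `tree.export_text`. This sketch rebuilds the same seed data so it runs on its own; `random_state=0` is an addition here to make repeated runs build the same tree.

```python
import pandas as pd
from sklearn import tree

# Same seed data as the deck, written compactly.
golf_df = pd.DataFrame({
    "Outlook": ["sunny", "sunny", "overcast", "rainy", "rainy", "rainy", "overcast",
                "sunny", "sunny", "rainy", "sunny", "overcast", "overcast", "rainy"],
    "Temperature": ["hot", "hot", "hot", "mild", "cool", "cool", "cool",
                    "mild", "cool", "mild", "mild", "mild", "hot", "mild"],
    "Humidity": ["high", "high", "high", "high", "normal", "normal", "normal",
                 "high", "normal", "normal", "normal", "high", "normal", "high"],
    "Windy": ["false", "true", "false", "false", "false", "true", "true",
              "false", "false", "false", "true", "true", "false", "true"],
    "Play": ["no", "no", "yes", "yes", "yes", "no", "yes",
             "no", "yes", "yes", "yes", "yes", "yes", "no"],
})

one_hot_data = pd.get_dummies(golf_df[["Outlook", "Temperature", "Humidity", "Windy"]])
clf = tree.DecisionTreeClassifier(random_state=0)
clf.fit(one_hot_data, golf_df["Play"])

# Print the learned questions as indented text, using our column names.
rules = tree.export_text(clf, feature_names=list(one_hot_data.columns))
print(rules)
```

Each indented line is one question, such as whether an encoded column is at most 0.5 (meaning "is this category absent?"), and each leaf shows the predicted class.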
# sunny, hot, normal, true
prediction = clf_train.predict([[0, 0, 1, 0, 1, 0, 0, 1, 0, 1]])
print(prediction)
Prediction
➔ Now we give our inputs in the same binary (0/1)
format
➔ The values must follow the column order of
one_hot_data, so we use it as our guide to test an input
➔ Print our prediction
Code
Block 5
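Hand-typing ten 0/1 values is easy to get wrong. A sketch of one way to build the same input by name: encode the new day with `get_dummies`, then align it to the training columns with `reindex`. The variable names `new_day` and `new_day_encoded` are hypothetical, and `random_state=0` is an addition for repeatability.

```python
import pandas as pd
from sklearn import tree

# Same seed data as the deck, written compactly.
golf_df = pd.DataFrame({
    "Outlook": ["sunny", "sunny", "overcast", "rainy", "rainy", "rainy", "overcast",
                "sunny", "sunny", "rainy", "sunny", "overcast", "overcast", "rainy"],
    "Temperature": ["hot", "hot", "hot", "mild", "cool", "cool", "cool",
                    "mild", "cool", "mild", "mild", "mild", "hot", "mild"],
    "Humidity": ["high", "high", "high", "high", "normal", "normal", "normal",
                 "high", "normal", "normal", "normal", "high", "normal", "high"],
    "Windy": ["false", "true", "false", "false", "false", "true", "true",
              "false", "false", "false", "true", "true", "false", "true"],
    "Play": ["no", "no", "yes", "yes", "yes", "no", "yes",
             "no", "yes", "yes", "yes", "yes", "yes", "no"],
})
features = ["Outlook", "Temperature", "Humidity", "Windy"]
one_hot_data = pd.get_dummies(golf_df[features])
clf = tree.DecisionTreeClassifier(random_state=0).fit(one_hot_data, golf_df["Play"])

# Describe the new day by name instead of typing the 0/1 vector by hand.
new_day = pd.DataFrame([{"Outlook": "sunny", "Temperature": "hot",
                         "Humidity": "normal", "Windy": "true"}])
# Encode it, then align to the training columns; categories that do not
# appear in new_day are filled with 0.
new_day_encoded = (
    pd.get_dummies(new_day)
    .reindex(columns=one_hot_data.columns, fill_value=0)
    .astype(int)
)

print(new_day_encoded.iloc[0].tolist())
prediction = clf.predict(new_day_encoded)
print(prediction)
```

The encoded row comes out as the same ten values typed by hand in the slide. With only 14 training rows, several candidate splits can tie, so the predicted label may differ between runs unless `random_state` is fixed.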
Our model has a few weaknesses:
➔ Limited inputs (only 14 rows and four features)
➔ Assumptions
Shortcomings
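One way to see the effect of limited inputs is to test the model on days it has not seen. This sketch uses leave-one-out cross-validation from scikit-learn, which is not part of the original notebook: it trains on 13 days and tests on the remaining one, 14 times over. `random_state=0` is an addition for repeatability.

```python
import pandas as pd
from sklearn import tree
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Same seed data as the deck, written compactly.
golf_df = pd.DataFrame({
    "Outlook": ["sunny", "sunny", "overcast", "rainy", "rainy", "rainy", "overcast",
                "sunny", "sunny", "rainy", "sunny", "overcast", "overcast", "rainy"],
    "Temperature": ["hot", "hot", "hot", "mild", "cool", "cool", "cool",
                    "mild", "cool", "mild", "mild", "mild", "hot", "mild"],
    "Humidity": ["high", "high", "high", "high", "normal", "normal", "normal",
                 "high", "normal", "normal", "normal", "high", "normal", "high"],
    "Windy": ["false", "true", "false", "false", "false", "true", "true",
              "false", "false", "false", "true", "true", "false", "true"],
    "Play": ["no", "no", "yes", "yes", "yes", "no", "yes",
             "no", "yes", "yes", "yes", "yes", "yes", "no"],
})
X = pd.get_dummies(golf_df[["Outlook", "Temperature", "Humidity", "Windy"]])
y = golf_df["Play"]

clf = tree.DecisionTreeClassifier(random_state=0)
# Each fold trains on 13 days and tests on the single day left out,
# so every score is either 1.0 (correct) or 0.0 (wrong).
scores = cross_val_score(clf, X, y, cv=LeaveOneOut())

print(f"Held-out accuracy: {scores.mean():.2f} over {len(scores)} folds")
```

A tree can fit its own training rows perfectly and still miss on held-out days, which is why accuracy on unseen data is the more honest measure.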
Ways
to
Learn
Data
Science
➔ Start with Python and Statistics
➔ Personal Program Manager
➔ Unlimited Q&A Sessions
➔ Student Slack Community
➔ bit.ly/freetrial-ds
Thinkful
Two-Week
Free
Trial
The
Student
Experience
Marnie Boyer, Thinkful Graduate
Capstone
Wolfgang Hall, Thinkful Graduate
Capstone
➔ bit.ly/tf-event-feedback
Survey
