Python for Data Science
Trends and Use Cases
WeCloudData
Education | Career | Consulting
• Analytics Bootcamp
• Career Services
• Mentorship
• ISA
• Diploma Programs
• Part-time Programs
• Corporate Training
• Data Science
• Machine Learning
• Big Data
• Cloud
DS Career Panel
Join our kick a**
instructor team!
Help
corporates
upskill their
employees
Data/Cloud
Skills Training
for Canadians
DS Diploma
Toronto Institute of
Data Science
Training
Reskill | Upskill
AI Bootcamp
Communities
Meetups
Networking
Hiring Event
AI Expert
Instructing
Consulting
DS/AI
DE
Cloud
6 months
Success rate
89%
75k Salary
Project-based
Training
Upcoming Events
40% Referral
by WCD
Bring real-world
client projects to
the classroom
Apache Spark Event
ML/AI Workshops
Data Science Part-Time
Learning Path
Prerequisites
Data Science
Learning Path
• ML algorithms
• 2 Projects
• Interview Practice
Applied ML
• Data wrangling
• Data Visualization
• Predictive Modeling
Data Science
w/ Python
• Big data tools
• ML at scale
• ML deployment
• Job referrals
Big Data
Python
Foundation
SQL for
Data Science
Scala & Spark for DE
Linux Command Line
Docker | Kubernetes
Scala Programming
Spark In Depth
ETL for DE
Hadoop | Hive | Presto
Data Ingestion & Integration
Talend
Airflow & Pipelines
Real-time Analytics
Apache Kafka
Spark Streaming
Apache Flink
Apache Beam
Spark for DE
Big Data & ETL
Real-time Analytics
Learn to build data pipelines, scale
data processing with big data tools,
and deploy real-time
applications and machine learning
models at scale.
Data Engineering
Learning Path
Data Engineering Part-Time
Part-time Program
AWS Big Data - Part-Time
Learning Path
Learn AWS big data tools and
platforms and get certified as AWS
Certified Big Data Specialist
Cloud Computing
AWS Track
Learn AWS Big
Data Tools
Hands-on
Project
Certification
Exam Prep
02/02/2020 | 10/12/2019
Learn AWS
Solution Architect
Hands-on
Project
Certification
Exam Prep
Applied Deep Learning
Applied AI – Part-Time
Learning Path
Artificial Intelligence
Program
Deep Learning for NLP
Deep Learning Capstone
Machine Learning in Healthcare
https://www.youtube.com/watch?v=39rSzfpYsvA
$P(\text{Get Interview}) = 0.4\,S + 0.25\,E + 0.25\,R + 0.1\,N$
$P(\text{Ace Skills}) = 0.25\,S + 0.3\,C + 0.4\,B + 0.05\,P$
$P(\text{Offer}) = P(\text{Get Interview}) \times P(\text{Ace Interview})$
Landing a Data Scientist Job
Key Factors
S = Skills
E = Experience
R = Resume
N = Network
C = Communication
B = Business Cases
P = Preparation
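The weighting scheme on this slide can be tried out numerically. The sketch below uses the slide's weights; the idea that each factor is scored between 0 and 1, the factor names used as dictionary keys, and the example candidate are all assumptions made for illustration.

```python
# Weights come from the slide; scoring each factor on a 0..1 scale is an assumption.
INTERVIEW_WEIGHTS = {"skills": 0.4, "experience": 0.25, "resume": 0.25, "network": 0.1}
ACE_WEIGHTS = {"skills": 0.25, "communication": 0.3, "business_cases": 0.4, "preparation": 0.05}


def weighted_score(weights, scores):
    """Weighted sum of factor scores; a factor missing from `scores` counts as 0."""
    return sum(weight * scores.get(name, 0.0) for name, weight in weights.items())


def p_offer(scores):
    """P(Offer) as the product of the two weighted sums from the slide."""
    return weighted_score(INTERVIEW_WEIGHTS, scores) * weighted_score(ACE_WEIGHTS, scores)


# Hypothetical candidate, for illustration only.
candidate = {
    "skills": 0.8,
    "experience": 0.5,
    "resume": 0.6,
    "network": 0.2,
    "communication": 0.7,
    "business_cases": 0.5,
    "preparation": 0.9,
}

p_interview = weighted_score(INTERVIEW_WEIGHTS, candidate)
p_ace = weighted_score(ACE_WEIGHTS, candidate)
offer = p_offer(candidate)
```

Because Skills carries weight in both sums, raising it moves both factors of the product at once.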
Data Science Immersive
(PCC Approved Diploma Program)
Prerequisites
Data Science
Learning Path
• ML algorithms
• 2 Projects
• Interview Practice
Applied ML
• Data wrangling
• Data Visualization
• Predictive Modeling
Data Science
w/ Python
• Big data tools
• ML at scale
• ML deployment
• Job referrals
Big Data
Python
Foundation
SQL for
Data Science
+
Experience
Industry Intern
Consulting Project
+
Career Support
Resume
Referral (50%)
Training
Data Science Immersive
(PCC Approved Diploma Program)
Python
• Py: Basics
• Py: DataTypes
• Py: Strings
• Py: Functions
• Py: Class
• Py: IDEs
(PyCharm)
W2
W3 W1
Learning to
Code
• SQL
• Linux | Docker
• GitHub
• AWS
Data Science w/ Python
• Py: Functions
• Py: Class/OOP
• DS: NumPy
• DS: Pandas
• DS: Viz
• DS: API
• Scraping Project
ML: Classifier
• ML: KNN
• ML: Logistic
• ML: SVM
• ML: Evaluation
• ML: Cross-val
W4 W5
ML: Classifier
• ML: Trees
• ML: Ensembles
• ML: Tuning
• ML: Imbalanced
• ML: Pipeline
Review Week
• Review
• SQL Quiz
• ML Quiz
• Interview Practice
• ML Project #1
W6
12-week Diploma Program
Data Science Diploma Program – Jan 2020
Syllabus
Big Data
• BD: Spark DF
• BD: NoSQL
• Interview Practice
W11
W12 W10
Big Data
• Big Data Project
• Spark Machine
Learning
• Model
Deployment
• Rest API
• Model in
Production
Big Data
• BD: Hadoop
• BD: Hive
• BD: SQL on
Hadoop
• BD: Spark
ML: Regression
• Py: Pandas Adv
• ML: Stats
• ML: Linear Algebra
• ML: Optimization
• ML: Regression
W7
ML: Clustering/NLP
• ML: Text Processing
• ML: Topic Model
• ML: Clustering
• ML: Dimension Reduction
• Interview Practice
• Client Project Kickoff
W8
ML: Neural Net
• ML: Neural Net
• ML: Keras
• ML: CNN
• ML Project #2
• Interview Practice
W9
Client Project Career/Referral
Other Bootcamps
Learning Environment
Lab Environment (Tools & Platforms)
Python | SQL, Cloud | Big Data, Machine Learning
Hands-on Project
Bring real industry-level project experience to the classroom
By working on real projects, we mean
• You will be helping startups set up data pipelines in AWS
• You will be working on forecast models to optimize inventories for
hundreds of millions of device sales
• Your customer segmentation models will shape how a startup manages
marketing campaigns
• You will help the client save AWS cost by 200% by migrating computing to
Apache Spark
• Your machine learning models will help companies retain high value
customers
• Your work will be presented to the CEOs
153k 13
Market
Research
Student Success
Job Placement
Job placement: 56% within 2 months, 89% within 6 months
Data Scientist
Security Analyst
Senior Analyst
Data Scientist
Data Engineer
70k 0 New grad
98k 2 FSA
73k 0 New Grad
63k 0 New Grad
78k 3 PWC
Sr Data Scientist
83k Salary
50%
Referral
by WCD
120k 13 QAData Scientist
Data Scientist 80k 2 Data Analyst
100k 0
Geology (New
Grad0
Data Scientist
70k 0
Statistics (New
Grad)
ML Engineer
Data Science Job Market
Data Scientist
Coding/Tools: Linux, Python/Scala/Java, Cloud (AWS), Hadoop, Spark
Math/ML: Statistics, Linear Algebra, Regression, Classification, Clustering, NLP
Storytelling: Presentation, Use cases, Project Mgmt, Communications
Data Science
Essential Skills
Business Domain Knowledge
Data is a language—every company, if not
every business unit, speaks its own dialect.
Data Scientist
The Types
Operational DS
Focus: data wrangling, working with
large/small messy data, building
predictive models
Strength: data handling, tools, business
knowledge
ML Engineer
Focus: ML model deployment, data
pipelines
Strength: coding, algorithms, machine
learning, platforms and tools
ML Researcher
Focus: algorithm development,
research, IP
Strength: ML/DL algorithms,
implementation, research
DS Product Manager
Focus: product strategy, business
communications, project management
Strength: product sense, business
requirements, DS acumen
Data Jobs in Canada
Job Categories and Cities
Data Jobs in Canada
Industries – Data Scientist
Data Jobs in Canada
Industries – Data Analyst
Data Jobs in Canada
Industries – Data Engineer
Data Jobs in Canada
Industries – ML Engineer
Data Jobs in Canada
SQL is among the most wanted skills
Data Jobs in Canada
Skills – Data Analyst
Data Jobs in Canada
Skills – Data Engineer
Data Jobs in Canada
Skills – ML Engineer
Data Science
Salaries
Data Science Learning Path
Resources
Python
Coding Practice
Coding & Interviews
• LeetCode
• HackerRank
Book Statistics Online Courses
Udemy
• Complete Python Bootcamp
Datacamp
• Introduction to Python
Data Science
Importance of foundations
Data Science
Machine
Learning
Big Data
Data
Engineering
Deep
Learning
ML
Engineering
Focus on one programming language at a time
• Get good at it
Must have skills
• Python
• SQL
Data Science
What’s next?
Prerequisites
Data Science
Learning Path
• ML algorithms
• 2 Projects
• Interview Practice
Applied ML
• Data wrangling
• Data Visualization
• Predictive Modeling
Data Science
w/ Python
• Big data tools
• ML at scale
• ML deployment
• Job referrals
Big Data
Python
Foundation
SQL for
Data Science
Nov 16 | Nov 3 | Nov 16 | Nov 23 | Oct 19
Python
Why Python?
The ecosystem
Targeting
Profiles
Personalization
Parse/Filter Classify Synthesize
POI Prediction
Ontology/Taxonomy
Contexts
URL Parsing
POI Database
Context Extraction
Topic Modeling
Content Classification
Location Classify
Signal Aggregation
Taste Merging
Taste Scoring
Data Data Science Pipelines Data Product
POI Context Builder
Rule-based Predictor
ML Predictor
Location Attributes
Home/Work Predictor
Co-location Location
Graph
• sklearn
• gensim
• nltk
• mrjob
• PySpark
• PySpark
Why Python?
Python in a data science project
Python Data Management
Structured Data with Pandas DataFrame
Row Index    area      population
California   423967    38332521
Florida      170312    19552860
Illinois     149995    12882135
New York     141297    19651127
Texas        695662    26448193
A DataFrame has a row index, a column index, and values; each row and each column is a Series.
# row access returns Series
states.loc['Florida']
area          170312
population    19552860
# column access returns Series
states['population']
California   38332521
Florida      19552860
Illinois     12882135
New York     19651127
Texas        26448193
# index based selection returns DataFrame
states.iloc[1:3, :1]
Row Index    area
Florida      170312
Illinois     149995
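The three selection styles on this slide can be reproduced with a small DataFrame. This is a minimal sketch that assumes the six-digit values are areas and the eight-digit values are populations, and that `area` is the first column.

```python
import pandas as pd

# Values are taken from the slide; the column assignment is an assumption.
states = pd.DataFrame(
    {
        "area": [423967, 170312, 149995, 141297, 695662],
        "population": [38332521, 19552860, 12882135, 19651127, 26448193],
    },
    index=["California", "Florida", "Illinois", "New York", "Texas"],
)

florida = states.loc["Florida"]      # label-based row access returns a Series
population = states["population"]    # column access returns a Series
subset = states.iloc[1:3, :1]        # position-based slicing returns a DataFrame

print(florida)
print(population)
print(subset)
```

Note that `iloc` slices exclude the end position, so `1:3` selects the second and third rows only.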
Python Data Management
Pandas - GroupBy()
Input (DataFrame)
City      Ticket Sales
Toronto   100
Montreal  50
Toronto   20
Halifax   40
Montreal  30
Halifax   60
Split (DataFrameGroupBy)
Toronto   100, 20
Montreal  50, 30
Halifax   40, 60
Apply (mean) (DataFrameGroupBy)
Toronto   60
Montreal  40
Halifax   50
Combine (DataFrame)
City      Ticket Sales
Toronto   60
Montreal  40
Halifax   50
import pandas as pd
df = pd.DataFrame({'city': ['Toronto', 'Montreal', 'Toronto', 'Halifax',
                            'Montreal', 'Halifax'],
                   'sales': [100, 50, 20, 40, 30, 60]})
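The split, apply, combine steps can be run directly on the slide's data. In this sketch the per-city results on the slide (60, 40, 50) are reproduced with a mean; a sum is shown alongside for comparison. Passing `sort=False` is a choice made here to keep cities in first-seen order.

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["Toronto", "Montreal", "Toronto", "Halifax", "Montreal", "Halifax"],
    "sales": [100, 50, 20, 40, 30, 60],
})

# Split rows by city, apply an aggregate to each group, combine into one result.
grouped = df.groupby("city", sort=False)["sales"]
mean_sales = grouped.mean()
total_sales = grouped.sum()

print(mean_sales)
print(total_sales)
```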
Python Data Management
Pandas - Join/Merge
# 1-to-1 join
pd.merge(employee, hr, how='inner', on='employee')
Other features
• Pivot Tables
• Window Functions
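The 1-to-1 join can be sketched with two small tables. The contents of `employee` and `hr` below are hypothetical; only the frame names and the `pd.merge` call come from the slide. The `validate` argument is an addition that makes pandas raise an error if the key is not unique on both sides.

```python
import pandas as pd

# Hypothetical tables sharing an "employee" key column.
employee = pd.DataFrame({
    "employee": ["Ana", "Ben", "Cy"],
    "group": ["Data", "Cloud", "Data"],
})
hr = pd.DataFrame({
    "employee": ["Ben", "Ana", "Dee"],
    "hire_year": [2018, 2019, 2020],
})

# An inner join keeps only the keys present in both tables.
joined = pd.merge(employee, hr, how="inner", on="employee", validate="one_to_one")
print(joined)
```

Switching `how` to `'left'`, `'right'`, or `'outer'` keeps unmatched rows from one or both sides and fills the missing columns with NaN.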
Data Visualizations
Matplotlib, Seaborn, Plotly, Bokeh
Python
Plotly Dash
Python Data Munging
Database Integration
from mrjob.job import MRJob

class MRWordCount(MRJob):
    def mapper(self, _, line):
        # emit (word, 1) for every word in the input line
        for word in line.split():
            yield (word, 1)

    def reducer(self, word, counts):
        # sum the counts collected for each word
        yield (word, sum(counts))

if __name__ == '__main__':
    MRWordCount.run()
import sys
from operator import add
from pyspark import SparkContext

if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: wordcount <file>", file=sys.stderr)
        sys.exit(-1)
    sc = SparkContext(appName="PythonWordCount")
    lines = sc.textFile(sys.argv[1], 1)
    # split lines into words, pair each word with 1, then sum per word
    counts = lines.flatMap(lambda x: x.split(' ')) \
                  .map(lambda x: (x, 1)) \
                  .reduceByKey(add)
    # keep the 50 most frequent words
    output = sorted(counts.collect(), key=lambda x: x[1], reverse=True)[:50]
    for (word, count) in output:
        print("%s: %i" % (word, count))
    sc.stop()
API Support for Big Data Platforms
Hadoop/Spark
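What both word-count jobs compute can be sketched locally without a cluster. The version below is a plain-Python illustration of the map, shuffle, and reduce phases; it is not how Hadoop or Spark execute the job, and the function names are chosen here for illustration.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for each word in a line."""
    for word in line.split():
        yield word, 1


def reducer(word, counts):
    """Reduce phase: sum all counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    # Shuffle phase: group the emitted values by key.
    grouped = defaultdict(list)
    for line in lines:
        for word, one in mapper(line):
            grouped[word].append(one)
    return dict(reducer(word, counts) for word, counts in grouped.items())


result = word_count(["spark and hadoop", "spark and python"])
print(result)
```

On a cluster the grouping step is what moves data between machines, which is why the frameworks handle it for you.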
Machine Learning
Sklearn | Gensim | NLTK
ML at Scale
PySpark ML
Data Engineering
Data Pipelines with Airflow
Deep Learning
Strong Python Support
Python
IoT/Robotics
Python
Application
Data Science Trends
Trends
Dask
Trends
Auto ML
https://towardsdatascience.com/automl-for-data-enthusiasts-30582b660cda
Trends
Model Explainability
Trends
Featuretools
Trends
Model Deployment in Cloud
SageMaker | EMR | ECR | S3 | Notebook | Transform | Inference
1. ETL on EMR using Spark
2. Save model to S3: s3://weclouddata/models/gbm20190612
3. Start notebook instance and deploy model
4. Start SageMaker Spark container for prediction API
SageMaker Spark ML Container
Trends
Kubernetes
On-Prem
HDFS | S3 | Azure Blob Storage | Google Cloud Storage
YARN | Mesos | Kubernetes
MapReduce | Spark Core
Spark DataFrame
SQL | ML | Structured Streaming | GraphFrame
Hive | Mahout | Pig
Impala | Presto | Kylin
Trends
ML on Kubernetes
Learn Python and Data Science with WeCloudData
Python Programming
Why Python?
• Python is the most popular data
science and AI programming
language
• Many employers prefer candidates
with Python skills
• Mastering Python opens up not only
Data Scientist jobs but also
Data Engineer and DevOps roles
Python Programming
Syllabus
Pre-course: Installation & Preview
• Python Installation
• Jupyter Introduction
• Python Introduction
• DS Introduction
• Twitter Dev API Setup
Day 1: Python Basics
• Python use cases
• Branching
• Loops
• Data Types: list, tuple, set
• Functions
• Lab: social media analytics with Python
Day 2: Intermediate Python
• Data Types: String
• Data Types: Dictionary
• Comprehensions
• Regular expression
• Modules & Packages
• Class and Object
• Interview: Prepare for Python Interview Tests
• Lab: Class and object
Day 3: Python Data Analysis
• Pandas introduction
• Intro to visualization with Python
• Accessing database with Python
• Use case: Python for ML and AI
• Project: Building your first ML algorithm with Python
Data Science with Python
Syllabus (Weekend Cohort – 8 sessions/32 hours)
Self-paced Lectures
• Python review
• Intro to Pandas
• Intro to Visualization
W1 to W6
Data Collection
• Web scraping basics
• BeautifulSoup
• Selenium
• Project #1 kickoff
• Matplotlib Review
EDA & Data Visualizations
• Project #1 Presentation
• Seaborn | Plotly
• Map Visualization
• Building analytics dashboard with Dash
• Project #2 kickoff
Predictive Modeling
• Project #2 Presentation
• Predictive modeling lifecycle
• Introduction to sklearn
• Regression analysis
Data Science
• Data Science Intro
• Analyze Toronto Open Data
Data Wrangling
• Advanced Pandas (Merge and Joins)
• Advanced Aggregations
• Querying databases
• Reporting with Pandas and Pivot
Statistics and Linear Algebra
• Intro to Statistics and Linear Algebra
• Scipy for statistics
• Numpy for linear algebra
• Time series forecasting with Prophet
W7: Predictive Modeling
• Classification models
• Model evaluation
• Predicting Toronto TTC delay using Sklearn
W8: Final Review
• Course review
• Introduction to Machine Learning
• Introduction to Big Data
Data Science with Python
Hands-on Projects
This course is instructor-led and project-based. Students will be able to apply the data science
skills acquired during the lectures to 2 hands-on projects. The 2 projects will make your
2 Data Science Projects
• Web Data Analytics
• Data Storytelling (Dashboard + Heroku Deployment)
Data Collection: BeautifulSoup, Selenium
Data Cleaning: Pandas, Matplotlib
Data Analysis: Matplotlib, Pandas, SQLAlchemy
Storytelling: Insight Analysis, Presentations
App Deployment: Heroku, Flask
Visualization: Dash, Plotly
Project #1: Web Data Analysis
Project #2: Data Storytelling
Project 2 Demo
Data Science with Python
Student Project Demo
Analyzing the Top Travel Influencers on Instagram
Fishing Online
1. Open a webpage and get the HTML of the search results page
2. Locate page elements by XPath
3. Extract target data
4. Next URL? Yes: repeat from step 1. No: process data using pandas and save it into a .csv file
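The crawl loop in this flowchart can be sketched offline. To keep the example runnable without a browser or network, it swaps in stand-ins: an in-memory `PAGES` dictionary replaces the website, the standard library's `xml.etree.ElementTree` (which supports a subset of XPath) replaces Selenium, and the `csv` module replaces pandas for the final save. The page markup, URLs, and product names are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical search-result pages, keyed by URL (stand-in for a real site).
PAGES = {
    "/search?page=1": (
        "<html><body>"
        "<div class='item'><span class='name'>Spinning rod</span>"
        "<span class='price'>49.99</span></div>"
        "<div class='item'><span class='name'>Tackle box</span>"
        "<span class='price'>19.50</span></div>"
        "<a id='next' href='/search?page=2'>Next</a>"
        "</body></html>"
    ),
    "/search?page=2": (
        "<html><body>"
        "<div class='item'><span class='name'>Fishing line</span>"
        "<span class='price'>7.25</span></div>"
        "</body></html>"
    ),
}


def fetch_html(url):
    """Stand-in for opening a webpage and getting its HTML."""
    return PAGES[url]


def crawl(start_url):
    rows = []
    url = start_url
    while url is not None:
        root = ET.fromstring(fetch_html(url))
        # Locate page elements by XPath and extract the target data.
        for item in root.findall(".//div[@class='item']"):
            rows.append({
                "name": item.findtext("span[@class='name']"),
                "price": float(item.findtext("span[@class='price']")),
            })
        # Next URL? Follow it if present, otherwise stop.
        next_link = root.find(".//a[@id='next']")
        url = next_link.get("href") if next_link is not None else None
    return rows


def to_csv(rows):
    """Serialize the extracted rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "price"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = crawl("/search?page=1")
csv_text = to_csv(rows)
print(csv_text)
```

Real pages are rarely well-formed XML, so an actual crawler would typically need an HTML-tolerant parser or a browser driver in place of `ElementTree`.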
Learn Data Science
Understand the big picture
Web Crawler Project - Aritzia
Student Project Demo
Content
• Motivation
• Data
• Interesting Findings
• Conclusion
• Challenges
Info & Motivation
• Type: Public
• Traded as: TSX: ATZ
• Industry: Fashion
• Founded: 1984
• Founder: Brian Hill
• Headquarters: Vancouver, British Columbia, Canada
• Products: Clothing
Website
Dataframe
• Total data: 856,452
• Date range: 2019-06-08 21:53:50 ~ 2019-06-20 13:57:19
• File numbers: 30
crawler.py
Interesting Findings
• Categories & Brand
• Price Distribution
• Top 20 Colors
• Weekdays vs. Weekend - Avg Stock
• On Sale event - Discount %
• Price Change vs. Stock Correlation
Category Distribution
Brand Distribution Vs. Brand Average Price
Top 20 Colors
Weekdays Vs Weekend - Avg Stock
SALE !
Discount % of Each Brand
Top 10 products – stock change
Conclusion
• Business casual clothes are priced higher than other categories
• More transactions/purchases happen on weekends
• Sale event: good deals on famous brands
• Promotions influence stock change
Challenges
• Save data as a tree structure (.json)
• Load data
• Move root node properties to child nodes
• Data analysis using Pandas
• Visualization - Plotly (multi-chart types)
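Moving root-node properties down to child nodes is the kind of flattening that `pandas.json_normalize` handles. The sketch below is one possible approach, not necessarily the one used in the student project; the record layout and every field name (`variants`, `color`, `stock`, and so on) are hypothetical.

```python
import pandas as pd

# Hypothetical tree-structured records: product-level properties at the root,
# one child node per colour variant.
catalog = [
    {
        "name": "Blazer A",
        "brand": "Brand X",
        "category": "Jackets",
        "variants": [
            {"color": "Black", "price": 198.0, "stock": 4},
            {"color": "Navy", "price": 198.0, "stock": 1},
        ],
    },
    {
        "name": "Tee B",
        "brand": "Brand Y",
        "category": "T-Shirts",
        "variants": [
            {"color": "White", "price": 35.0, "stock": 12},
        ],
    },
]

# One row per child node; the root properties listed in `meta` are copied onto each row.
flat = pd.json_normalize(catalog, record_path="variants", meta=["name", "brand", "category"])
print(flat)
```

Once the data is flat, ordinary groupby and plotting calls apply, for example total stock per brand.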
Next Step
• Detailed size distribution of
brands / products
• Influence of discount
strength
• Stock refill timing
• Long-term data analysis
(winter vs. summer)
Data Science with Python - WeCloudData
Data Science with Python - WeCloudData

More Related Content

What's hot

MLCommons: Better ML for Everyone
MLCommons: Better ML for EveryoneMLCommons: Better ML for Everyone
MLCommons: Better ML for EveryoneDatabricks
 
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex ChallengeDataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex ChallengeDataiku
 
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...Rodney Joyce
 
Conversational AI with Transformer Models
Conversational AI with Transformer ModelsConversational AI with Transformer Models
Conversational AI with Transformer ModelsDatabricks
 
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital OneUsing H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital OneSri Ambati
 
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...Sri Ambati
 
FlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at HumanaFlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at HumanaDatabricks
 
AI from your data lake: Using Solr for analytics
AI from your data lake: Using Solr for analyticsAI from your data lake: Using Solr for analytics
AI from your data lake: Using Solr for analyticsDataWorks Summit
 
Intro to graphs for HR analytics
Intro to graphs for HR analyticsIntro to graphs for HR analytics
Intro to graphs for HR analyticsRik Van Bruggen
 
Open-BDA - Big Data Hadoop Developer Training 10th & 11th June
Open-BDA - Big Data Hadoop Developer Training 10th & 11th JuneOpen-BDA - Big Data Hadoop Developer Training 10th & 11th June
Open-BDA - Big Data Hadoop Developer Training 10th & 11th JuneInnovative Management Services
 
Building Data Science into Organizations: Field Experience
Building Data Science into Organizations: Field ExperienceBuilding Data Science into Organizations: Field Experience
Building Data Science into Organizations: Field ExperienceDatabricks
 
H2O World - H2O Deep Learning with Arno Candel
H2O World - H2O Deep Learning with Arno CandelH2O World - H2O Deep Learning with Arno Candel
H2O World - H2O Deep Learning with Arno CandelSri Ambati
 
Transforming AI with Graphs: Real World Examples using Spark and Neo4j
Transforming AI with Graphs: Real World Examples using Spark and Neo4jTransforming AI with Graphs: Real World Examples using Spark and Neo4j
Transforming AI with Graphs: Real World Examples using Spark and Neo4jFred Madrid
 
Big data-science-oanyc
Big data-science-oanycBig data-science-oanyc
Big data-science-oanycOpen Analytics
 
Spark Summit EU 2017 - Preventing revenue leakage and monitoring distributed ...
Spark Summit EU 2017 - Preventing revenue leakage and monitoring distributed ...Spark Summit EU 2017 - Preventing revenue leakage and monitoring distributed ...
Spark Summit EU 2017 - Preventing revenue leakage and monitoring distributed ...Flavio Clesio
 
ML-Ops: From Proof-of-Concept to Production Application
ML-Ops: From Proof-of-Concept to Production ApplicationML-Ops: From Proof-of-Concept to Production Application
ML-Ops: From Proof-of-Concept to Production ApplicationHunter Carlisle
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Neo4j GraphDay Seattle- Sept19- neo4j basic training
Neo4j GraphDay Seattle- Sept19- neo4j basic trainingNeo4j GraphDay Seattle- Sept19- neo4j basic training
Neo4j GraphDay Seattle- Sept19- neo4j basic trainingNeo4j
 
H2O World - Solving Customer Churn with Machine Learning - Julian Bharadwaj
H2O World - Solving Customer Churn with Machine Learning - Julian BharadwajH2O World - Solving Customer Churn with Machine Learning - Julian Bharadwaj
H2O World - Solving Customer Churn with Machine Learning - Julian BharadwajSri Ambati
 
Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist SoftServe
 

What's hot (20)

MLCommons: Better ML for Everyone
MLCommons: Better ML for EveryoneMLCommons: Better ML for Everyone
MLCommons: Better ML for Everyone
 
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex ChallengeDataiku at SF DataMining Meetup - Kaggle Yandex Challenge
Dataiku at SF DataMining Meetup - Kaggle Yandex Challenge
 
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...
Data Science for Dummies - Data Engineering with Titanic dataset + Databricks...
 
Conversational AI with Transformer Models
Conversational AI with Transformer ModelsConversational AI with Transformer Models
Conversational AI with Transformer Models
 
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital OneUsing H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
Using H2O for Mobile Transaction Forecasting & Anomaly Detection - Capital One
 
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
Drive Away Fraudsters With Driverless AI - Venkatesh Ramanathan, Senior Data ...
 
FlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at HumanaFlorenceAI: Reinventing Data Science at Humana
FlorenceAI: Reinventing Data Science at Humana
 
AI from your data lake: Using Solr for analytics
AI from your data lake: Using Solr for analyticsAI from your data lake: Using Solr for analytics
AI from your data lake: Using Solr for analytics
 
Intro to graphs for HR analytics
Intro to graphs for HR analyticsIntro to graphs for HR analytics
Intro to graphs for HR analytics
 
Open-BDA - Big Data Hadoop Developer Training 10th & 11th June
Open-BDA - Big Data Hadoop Developer Training 10th & 11th JuneOpen-BDA - Big Data Hadoop Developer Training 10th & 11th June
Open-BDA - Big Data Hadoop Developer Training 10th & 11th June
 
Building Data Science into Organizations: Field Experience
Building Data Science into Organizations: Field ExperienceBuilding Data Science into Organizations: Field Experience
Building Data Science into Organizations: Field Experience
 
H2O World - H2O Deep Learning with Arno Candel
H2O World - H2O Deep Learning with Arno CandelH2O World - H2O Deep Learning with Arno Candel
H2O World - H2O Deep Learning with Arno Candel
 
Transforming AI with Graphs: Real World Examples using Spark and Neo4j
Transforming AI with Graphs: Real World Examples using Spark and Neo4jTransforming AI with Graphs: Real World Examples using Spark and Neo4j
Transforming AI with Graphs: Real World Examples using Spark and Neo4j
 
Big data-science-oanyc
Big data-science-oanycBig data-science-oanyc
Big data-science-oanyc
 
Spark Summit EU 2017 - Preventing revenue leakage and monitoring distributed ...
Spark Summit EU 2017 - Preventing revenue leakage and monitoring distributed ...Spark Summit EU 2017 - Preventing revenue leakage and monitoring distributed ...
Spark Summit EU 2017 - Preventing revenue leakage and monitoring distributed ...
 
ML-Ops: From Proof-of-Concept to Production Application
ML-Ops: From Proof-of-Concept to Production ApplicationML-Ops: From Proof-of-Concept to Production Application
ML-Ops: From Proof-of-Concept to Production Application
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Neo4j GraphDay Seattle- Sept19- neo4j basic training
Neo4j GraphDay Seattle- Sept19- neo4j basic trainingNeo4j GraphDay Seattle- Sept19- neo4j basic training
Neo4j GraphDay Seattle- Sept19- neo4j basic training
 
H2O World - Solving Customer Churn with Machine Learning - Julian Bharadwaj
H2O World - Solving Customer Churn with Machine Learning - Julian BharadwajH2O World - Solving Customer Churn with Machine Learning - Julian Bharadwaj
H2O World - Solving Customer Churn with Machine Learning - Julian Bharadwaj
 
Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist
 

Similar to Data Science with Python - WeCloudData

Big Data for Data Scientists - Info Session
Big Data for Data Scientists - Info SessionBig Data for Data Scientists - Info Session
Big Data for Data Scientists - Info SessionWeCloudData
 
How to Start a Career in Data Science - Jovian.ml
How to Start a Career in Data Science - Jovian.ml How to Start a Career in Data Science - Jovian.ml
How to Start a Career in Data Science - Jovian.ml Aakash N S
 
Building Better Analytics Workflows (Strata-Hadoop World 2013)
Building Better Analytics Workflows (Strata-Hadoop World 2013)Building Better Analytics Workflows (Strata-Hadoop World 2013)
Building Better Analytics Workflows (Strata-Hadoop World 2013)Wes McKinney
 
Team Data Science Process Presentation (TDSP), Aug 29, 2017
Team Data Science Process Presentation (TDSP), Aug 29, 2017Team Data Science Process Presentation (TDSP), Aug 29, 2017
Team Data Science Process Presentation (TDSP), Aug 29, 2017Debraj GuhaThakurta
 
Hadoop meets Agile! - An Agile Big Data Model
Hadoop meets Agile! - An Agile Big Data ModelHadoop meets Agile! - An Agile Big Data Model
Hadoop meets Agile! - An Agile Big Data ModelUwe Printz
 
A practical guidance of the enterprise machine learning
A practical guidance of the enterprise machine learning A practical guidance of the enterprise machine learning
A practical guidance of the enterprise machine learning Jesus Rodriguez
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data EngineeringDurga Gadiraju
 
Data Engineering Course Syllabus - WeCloudData
Data Engineering Course Syllabus - WeCloudDataData Engineering Course Syllabus - WeCloudData
Data Engineering Course Syllabus - WeCloudDataWeCloudData
 
DATA SCIENCE TRAINING IN CHENNAI
DATA SCIENCE TRAINING IN CHENNAIDATA SCIENCE TRAINING IN CHENNAI
DATA SCIENCE TRAINING IN CHENNAIshivajirao12345
 
Data Science Training in Chennai
Data Science Training in ChennaiData Science Training in Chennai
Data Science Training in ChennaiSLAJobs Chennai
 
DATA SCIENCE TRAINING IN CHENNAI
DATA SCIENCE TRAINING IN CHENNAIDATA SCIENCE TRAINING IN CHENNAI
DATA SCIENCE TRAINING IN CHENNAIkanimozhikannan1
 
DATA SCIENCE TRAINING IN CHENNAI
DATA SCIENCE TRAINING IN CHENNAIDATA SCIENCE TRAINING IN CHENNAI
DATA SCIENCE TRAINING IN CHENNAIvinothraja12345
 
DATA SCIENCE TRAINING IN CHENNAI
DATA SCIENCE TRAINING IN CHENNAIDATA SCIENCE TRAINING IN CHENNAI
DATA SCIENCE TRAINING IN CHENNAIshivajirao12345
 
awari-ds-aula1.pdf
awari-ds-aula1.pdfawari-ds-aula1.pdf
awari-ds-aula1.pdfMarcos993896
 
BBBT Watson Data Platform Presentation
BBBT Watson Data Platform PresentationBBBT Watson Data Platform Presentation
BBBT Watson Data Platform PresentationRitika Gunnar
 
Neo4j GraphTalk Oslo - Building Intelligent Solutions with Graphs
Neo4j GraphTalk Oslo - Building Intelligent Solutions with GraphsNeo4j GraphTalk Oslo - Building Intelligent Solutions with Graphs
Neo4j GraphTalk Oslo - Building Intelligent Solutions with GraphsNeo4j
 

Data Science with Python - WeCloudData

  • 1. Python for Data Science: Trends and Use Cases. WeCloudData (@WeCloudData, tordatascience, weclouddata)
  • 2. WeCloudData: Education | Career | Consulting • Analytics Bootcamp • Career Services • Mentorship • ISA • Diploma Programs • Part-time Programs • Corporate Training • Data Science • Machine Learning • Big Data • Cloud
  • 3. DS Career Panel Join our kick a** instructor team! Help corporates upskill their employees Data/Cloud Skills Training for Canadians DS Diploma Toronto Institute of Data Science Training Reskill | Upskill AI Bootcamp Communities Meetups Networking Hiring Event AI Expert Instructing Consulting DS/AI DE Cloud 6 months Success rate 89% 75k Salary Project-based Training Upcoming Events 40% Referral by WCD Bring real-world client projects to the classroom Apache Spark Event ML/AI Workshops
  • 4. Data Science Part-Time Learning Path Prerequisites Data Science Learning Path • ML algorithms • 2 Projects • Interview Practice Applied ML • Data wrangling • Data Visualization • Predictive Modeling Data Science w/ Python • Big data tools • ML at scale • ML deployment • Job referrals Big Data Python Foundation SQL for Data Science
  • 5. Data Engineering Learning Path (Data Engineering Part-Time Program). Learn to build data pipelines, scale data processing with big data tools, and deploy real-time applications and machine learning models at scale. Scala & Spark for DE: Linux Command Line • Docker | Kubernetes • Scala Programming • Spark In Depth. ETL for DE (Big Data & ETL): Hadoop | Hive | Presto • Data Ingestion & Integration • Talend • Airflow & Pipelines. Real-time Analytics: Apache Kafka • Spark Streaming • Apache Flink • Apache Beam
  • 6. AWS Big Data - Part-Time Learning Path Learn AWS big data tools and platforms and get certified as an AWS Certified Big Data Specialist Cloud Computing AWS Track Learn AWS Big Data Tools Hands-on Project Certification Exam Prep 02/02/2020 10/12/2019 Learn AWS Solution Architect Hands-on Project Certification Exam Prep
  • 7. Applied Deep Learning Applied AI – Part-Time Learning Path Artificial Intelligence Program Deep Learning for NLP Deep Learning Capstone Machine Learning in Healthcare https://www.youtube.com/watch?v=39rSzfpYsvA
  • 8. Landing a Data Scientist Job: Key Factors. P(Get Interview) = 0.4 S + 0.25 E + 0.25 R + 0.1 N. P(Ace Skills) = 0.25 S + 0.3 C + 0.4 B + 0.05 P. P(Offer) = P(Get Interview) x P(Ace Interview). Factors: S = Skills, E = Experience, R = Resume, N = Network, C = Communication, B = Business Cases, P = Preparation
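The weighted sums on this slide can be sketched in a few lines of Python. The weights are taken from the slide. The factor scores for the sample candidate are hypothetical values between 0 and 1, and treating the slide's P(Ace Skills) as the P(Ace Interview) term in the product is an assumption.

```python
# Weights as given on the slide; keys are the factor letters.
INTERVIEW_WEIGHTS = {"S": 0.4, "E": 0.25, "R": 0.25, "N": 0.1}
ACE_WEIGHTS = {"S": 0.25, "C": 0.3, "B": 0.4, "P": 0.05}


def weighted_score(weights, scores):
    """Weighted sum of factor scores (each score assumed to lie in [0, 1])."""
    return sum(weight * scores[factor] for factor, weight in weights.items())


def p_offer(scores):
    """P(Offer) = P(Get Interview) x P(Ace Interview), following the slide."""
    p_interview = weighted_score(INTERVIEW_WEIGHTS, scores)
    p_ace = weighted_score(ACE_WEIGHTS, scores)  # assumed equal to P(Ace Skills)
    return p_interview * p_ace


# Hypothetical candidate: strong skills, weak network.
candidate = {"S": 0.8, "E": 0.5, "R": 0.6, "N": 0.2, "C": 0.7, "B": 0.5, "P": 0.9}
print(round(p_offer(candidate), 4))
```

Because both sets of weights sum to 1, each weighted sum stays between 0 and 1 whenever the factor scores do.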
  • 9. Data Science Immersive (PCC Approved Diploma Program)
  • 10. Prerequisites Data Science Learning Path • ML algorithms • 2 Projects • Interview Practice Applied ML • Data wrangling • Data Visualization • Predictive Modeling Data Science w/ Python • Big data tools • ML at scale • ML deployment • Job referrals Big Data Python Foundation SQL for Data Science
  • 11. Prerequisites • ML algorithms • 2 Projects • Interview Practice Applied ML • Data wrangling • Data Visualization • Predictive Modeling Data Science w/ Python • Big data tools • ML at scale • ML deployment • Job referrals Big Data Python Foundation SQL for Data Science + Experience: Industry Intern, Consulting Project + Career Support: Resume, Referral (50%). P(Get Interview) = 0.4 S + 0.25 E + 0.25 R + 0.1 N. P(Ace Skills) = 0.25 S + 0.3 C + 0.4 B + 0.05 P. Factors: S = Skills, E = Experience, R = Resume, N = Network, C = Communication, B = Business Cases, P = Preparation. Training Data Science Immersive (PCC Approved Diploma Program)
  • 12. Python • Py: Basics • Py: DataTypes • Py: Strings • Py: Functions • Py: Class • Py: IDEs (PyCharm) W2 W3 W1 Learning to Code • SQL • Linux | Docker • GitHub • AWS Data Science w/ Python • Py: Functions • Py: Class/OOP • DS: NumPy • DS: Pandas • DS: Viz • DS: API • Scraping Project ML: Classifier • ML: KNN • ML: Logistic • ML: SVM • ML: Evaluation • ML: Cross-val W4 W5 ML: Classifier • ML: Trees • ML: Ensembles • ML: Tuning • ML: Imbalanced • ML: Pipeline Review Week • Review • SQL Quiz • ML Quiz • Interview Practice • ML Project #1 W6 12-week Diploma Program Data Science Diploma Program – Jan 2020 Syllabus Big Data • BD: Spark DF • BD: NoSQL • Interview Practice W11 W12 W10 Big Data • Big Data Project • Spark Machine Learning • Model Deployment • REST API • Model in Production Big Data • BD: Hadoop • BD: Hive • BD: SQL on Hadoop • BD: Spark ML: Regression • Py: Pandas Adv • ML: Stats • ML: Linear Algebra • ML: Optimization • ML: Regression W7 ML: Clustering/NLP • ML: Text Processing • ML: Topic Model • ML: Clustering • ML: Dimension Reduction • Interview Practice • Client Project Kickoff W8 ML: Neural Net • ML: Neural Net • ML: Keras • ML: CNN • ML Project #2 • Interview Practice W9
  • 13. Data Science Diploma Program – Jan 2020 Syllabus Python • Py: Basics • Py: DataTypes • Py: Strings • Py: Functions • Py: Class • Py: IDEs (PyCharm) W2 W3 W1 Learning to Code • SQL • Linux | Docker • GitHub • AWS Data Science w/ Python • Py: Functions • Py: Class/OOP • DS: NumPy • DS: Pandas • DS: Viz • DS: API • Scraping Project ML: Classifier • ML: KNN • ML: Logistic • ML: SVM • ML: Evaluation • ML: Cross-val W4 W5 ML: Classifier • ML: Trees • ML: Ensembles • ML: Tuning • ML: Imbalanced • ML: Pipeline Review Week • Review • SQL Quiz • ML Quiz • Interview Practice • ML Project #1 W6 12-week Diploma Program Big Data • BD: Spark DF • BD: NoSQL • Interview Practice W11 W12 W10 Big Data • Big Data Project • Spark Machine Learning • Model Deployment • REST API • Model in Production Big Data • BD: Hadoop • BD: Hive • BD: SQL on Hadoop • BD: Spark ML: Regression • Py: Pandas Adv • ML: Stats • ML: Linear Algebra • ML: Optimization • ML: Regression W7 ML: Clustering/NLP • ML: Text Processing • ML: Topic Model • ML: Clustering • ML: Dimension Reduction • Interview Practice • Client Project Kickoff W8 ML: Neural Net • ML: Neural Net • ML: Keras • ML: CNN • ML Project #2 • Interview Practice W9 Client Project Career/Referral Other Bootcamps
  • 14. Learning Environment Lab Environment (Tools & Platforms) Python | SQL Cloud | Big Data Machine Learning
  • 15. Hands-on Project Bring real industry-level project experience to the classroom By working on real projects, we mean • You will be helping startups set up data pipelines in AWS • You will be working on forecast models to optimize inventories for hundreds of millions of device sales • Your customer segmentation models will shape how a startup manages marketing campaigns • You will help the client save AWS cost by 200% by migrating computing to Apache Spark • Your machine learning models will help companies retain high-value customers • Your work will be presented to the CEOs
  • 16. 153k 13 Market Research Student Success Job Placement 6 months 2 months 56% 89% Data Scientist Security Analyst Senior Analyst Data Scientist Data Engineer 70k 0 New grad 98k 2 FSA 73k 0 New Grad 63k 0 New Grad 78k 3 PWC Sr Data Scientist 83k Salary 50% Referral by WCD 120k 13 QA Data Scientist Data Scientist 80k 2 Data Analyst 100k 0 Geology (New Grad) Data Scientist 70k 0 Statistics (New Grad) ML Engineer
  • 19. Coding/Tools Math/ML Storytelling Data Scientist Linux Python/Scala/Java Cloud (AWS) Hadoop, Spark Statistics Linear Algebra Regression Classification Clustering NLP Presentation Use cases Project Mgmt Communications Data Science Essential Skills Business Domain Knowledge Data is a language—every company, if not every business unit, speaks its own dialect.
  • 20. Data Scientist: The Types. Operational DS Focus: data wrangling, working with large/small messy data, building predictive models Strength: data handling, tools, business knowledge ML Engineer Focus: ML model deployment, data pipelines Strength: coding, algorithms, machine learning, platforms and tools ML Researcher Focus: algorithm development, research, IP Strength: ML/DL algorithms, implementation, research DS Product Manager Focus: product strategy, business communications, project management Strength: product sense, business requirements, DS acumen
  • 21. Data Jobs in Canada Job Categories and Cities
  • 22. Data Jobs in Canada Industries – Data Scientist
  • 23. Data Jobs in Canada Industries – Data Analyst
  • 24. Data Jobs in Canada Industries – Data Engineer
  • 25. Data Jobs in Canada Industries – ML Engineer
  • 26. Data Jobs in Canada SQL is among the most wanted skills
  • 27. Data Jobs in Canada Skills – Data Analyst
  • 28. Data Jobs in Canada Skills – Data Engineer
  • 29. Data Jobs in Canada Skills – ML Engineer
  • 32. Resources Python Coding Practice Coding & Interviews • LeetCode • HackerRank Book Statistics Online Courses Udemy • Complete Python Bootcamp Datacamp • Introduction to Python
  • 33. Data Science Importance of foundations Data Science Machine Learning Big Data Data Engineering Deep Learning ML Engineering Focus on one programming language at a time • Get good at it Must have skills • Python • SQL
  • 34. Data Science What’s next? Prerequisites Data Science Learning Path • ML algorithms • 2 Projects • Interview Practice Applied ML • Data wrangling • Data Visualization • Predictive Modeling Data Science w/ Python • Big data tools • ML at scale • ML deployment • Job referrals Big Data Python Foundation SQL for Data Science Nov 16 Nov 3 Nov 16 Nov 23 Oct 19
  • 37. Why Python? Python in a data science project. Targeting Profiles Personalization Parse/Filter Classify Synthesize POI Prediction Ontology/Taxonomy Contexts URL Parsing POI Database Context Extraction Topic Modeling Content Classification Location Classify Signal Aggregation Taste Merging Taste Scoring Data Data Science Pipelines Data Product POI Context Builder Rule-based Predictor ML Predictor Location Attributes Home/Work Predictor Co-location Location Graph • sklearn • gensim • nltk • mrjob • PySpark • PySpark
  • 38. Python Data Management: Structured Data with Pandas DataFrame
      A DataFrame has a row index, a column index, and values; each row and each column is a Series.

      Row Index     area      population
      California    423967    38332521
      Florida       170312    19552860
      Illinois      149995    12882135
      New York      141297    19651127
      Texas         695662    26448193

      # row access returns Series
      states.loc['Florida']
          area          170312
          population    19552860

      # column access returns Series
      states['area']
          California    423967
          Florida       170312
          Illinois      149995
          New York      141297
          Texas         695662

      # index based selection returns DataFrame
      states.iloc[1:3, :1]
          Row Index     area
          Florida       170312
          Illinois      149995
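A minimal runnable sketch of the three access patterns on this slide, using the slide's five states. The column names `area` and `population` are assigned here by the magnitude of the values (the smaller figure as area), which is an assumption about the slide's intent.

```python
import pandas as pd

# Values from the slide; the column naming is an assumption.
states = pd.DataFrame(
    {
        "area": [423967, 170312, 149995, 141297, 695662],
        "population": [38332521, 19552860, 12882135, 19651127, 26448193],
    },
    index=["California", "Florida", "Illinois", "New York", "Texas"],
)

florida = states.loc["Florida"]   # label-based row access returns a Series
area = states["area"]             # column access returns a Series
subset = states.iloc[1:3, :1]     # position-based slicing returns a DataFrame

print(type(florida).__name__, type(area).__name__, type(subset).__name__)
print(subset)
```

Note that `iloc` slices exclude the end position, so `1:3` selects the second and third rows (Florida and Illinois) and `:1` keeps only the first column.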
  • 39. Python Data Management: Pandas - GroupBy()
      df = pd.DataFrame({'city' : ['Toronto', 'Montreal', 'Toronto', 'Halifax', 'Montreal', 'Halifax'], 'sales' : [100, 50, 20, 40, 30, 60]})
      Input (DataFrame), City / Ticket Sales: Toronto 100, Montreal 50, Toronto 20, Halifax 40, Montreal 30, Halifax 60
      Split (DataFrameGroupBy): Toronto 100, 20 | Montreal 50, 30 | Halifax 40, 60
      Apply (mean) (DataFrameGroupBy): Toronto 60 | Montreal 40 | Halifax 50
      Combine (DataFrame), City / Ticket Sales: Toronto 60, Montreal 40, Halifax 50
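The split-apply-combine steps can be reproduced with the slide's own data. The per-city results shown on the slide (60, 40, 50) equal the per-city mean of ticket sales, so this sketch uses `mean()` and shows `sum()` alongside for comparison. The manual loop is only an illustration of what `groupby` does internally, not how pandas implements it.

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["Toronto", "Montreal", "Toronto", "Halifax", "Montreal", "Halifax"],
    "sales": [100, 50, 20, 40, 30, 60],
})

# One-liners: split by city, apply an aggregation, combine into a Series.
mean_sales = df.groupby("city")["sales"].mean()
total_sales = df.groupby("city")["sales"].sum()

# The same three steps spelled out by hand.
pieces = {}
for city, group in df.groupby("city"):     # split
    pieces[city] = group["sales"].mean()   # apply
combined = pd.Series(pieces, name="sales") # combine

print(mean_sales)
print(total_sales)
```

By default `groupby` sorts the group keys, so the result is ordered Halifax, Montreal, Toronto rather than in order of first appearance.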
  • 40. Python Data Management: Pandas - Join/Merge
      # 1-to-1 join
      pd.merge(employee, hr, how='inner', on='employee')
      Other features • Pivot Tables • Window Functions
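A small sketch of the 1-to-1 join, followed by one pivot table and one window-style calculation. The contents of `employee` and `hr` are hypothetical, since the slide shows only the `pd.merge` call; the `validate` argument is an optional addition that makes pandas raise an error if the join keys turn out not to be unique on both sides.

```python
import pandas as pd

# Hypothetical tables sharing the key column "employee".
employee = pd.DataFrame({
    "employee": ["Bob", "Jake", "Lisa", "Sue"],
    "group": ["Accounting", "Engineering", "Engineering", "HR"],
})
hr = pd.DataFrame({
    "employee": ["Lisa", "Bob", "Jake", "Sue"],
    "hire_date": [2004, 2008, 2012, 2014],
})

# 1-to-1 inner join on the shared key.
joined = pd.merge(employee, hr, how="inner", on="employee", validate="one_to_one")

# Pivot table: earliest hire year per group.
earliest = pd.pivot_table(joined, values="hire_date", index="group", aggfunc="min")

# Window-style calculation: seniority rank within each group.
joined["hire_rank"] = joined.groupby("group")["hire_date"].rank(method="first")

print(joined)
print(earliest)
```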
  • 45. API Support for Big Data Platforms: Hadoop/Spark

      # Hadoop MapReduce word count with mrjob
      from mrjob.job import MRJob

      class MRWordCount(MRJob):
          def mapper(self, _, line):
              # emit (word, 1) for every word in the input line
              for word in line.split():
                  yield (word, 1)

          def reducer(self, word, counts):
              # sum the counts emitted for each word
              yield (word, sum(counts))

      if __name__ == '__main__':
          MRWordCount.run()

      # Spark word count with PySpark
      import sys
      from operator import add
      from pyspark import SparkContext

      if __name__ == "__main__":
          if len(sys.argv) != 2:
              print("Usage: wordcount <file>", file=sys.stderr)
              sys.exit(-1)
          sc = SparkContext(appName="PythonWordCount")
          lines = sc.textFile(sys.argv[1], 1)
          counts = (lines.flatMap(lambda x: x.split(' '))
                         .map(lambda x: (x, 1))
                         .reduceByKey(add))
          # print the 50 most frequent words
          output = sorted(counts.collect(), key=lambda x: x[1], reverse=True)[:50]
          for (word, count) in output:
              print("%s: %i" % (word, count))
          sc.stop()
  • 46. Machine Learning Sklearn | Gensim | NLTK
  • 57. Trends: Model Deployment in Cloud (SageMaker, EMR, ECR, S3; Notebook, Transform, Inference) 1. ETL on EMR using Spark 2. Save Model to S3 (s3://weclouddata/models/gbm20190612) 3. Start notebook instance and deploy model (SageMaker Spark ML Container) 4. Start SageMaker Spark container for prediction API (SageMaker Spark ML Container)
  • 58. Trends Kubernetes On-Prem HDFS S3 Azure Blob Storage Google Cloud Storage YARN Mesos Kubernetes MapReduce Spark Core Spark DataFrame SQL ML Structured Streaming Graph Frame Hive Mahout Pig Impala Presto Kylin
  • 60. Learn Python and Data Science with WeCloudData
  • 61. Python Programming Why Python? • Python is the most popular data science and AI programming language • Many employers prefer candidates with Python skills • Mastering Python opens doors not only to Data Scientist jobs but also to Data Engineer and DevOps roles
  • 62. Python Programming Syllabus
      Pre-course (Installation & Preview): • Python Installation • Jupyter Introduction • Python Introduction • DS Introduction • Twitter Dev API Setup
      Day 1 (Python Basics): • Python use cases • Branching • Loops • Data Types: list, tuple, set • Functions • Lab: social media analytics with Python
      Day 2 (Intermediate Python): • Data Types: String • Data Types: Dictionary • Comprehensions • Regular expression • Modules & Packages • Class and Object • Interview – Prepare for Python Interview Tests • Lab – Class and object
      Day 3 (Python Data Analysis): • Pandas introduction • Intro to visualization with Python • Accessing database with Python • Use case: Python for ML and AI • Project: Building your first ML algorithm with Python
  • 63. • Web scraping basics • BeautifulSoup • Selenium • Project #1 kickoff • Matplotlib Review Data Collection Signup • Project #1 Presentation • Seaborn | Plotly • Map Visualization • Building analytics dashboard with Dash • Project #2 kickoff EDA & Data Visualizations • Project #2 Presentation • Predictive modeling lifecycle • Introduction to sklearn • Regression analysis Predictive Modeling • Data Science Intro • Analyze Toronto Open Data Data Science • Advanced Pandas (Merge and Joins) • Advanced Aggregations • Querying databases • Reporting with Pandas and Pivot Data Wrangling • Intro to Statistics and Linear Algebra • Scipy for statistics • Numpy for linear algebra • Time series forecasting with Prophet Statistics and Linear Algebra W1 W3 W5 W2 W4 W6 Final Review Data Science with Python Syllabus (Weekend Cohort – 8 sessions/32 hours) • Python review • Intro to Pandas • Intro to Visualization Self-paced Lectures • Classification models • Model evaluation • Predicting Toronto TTC delay using Sklearn Predictive Modeling W7 • Course review • Introduction to Machine Learning • Introduction to Big Data Final Review W8
  • 64. Data Science with Python Hands-on Projects This course is instructor-led and project-based. Students will be able to apply the data science skills acquired during the lectures to 2 hands-on projects. The 2 projects will make your 2 Data Science Projects • Web Data Analytics • Data Storytelling (Dashboard + Heroku Deployment) Data Collection BeautifulSoup Selenium Data Cleaning Pandas Matplotlib Data Analysis Matplotlib Pandas SQLAlchemy Story telling Insight Analysis Presentations App Deployment Heroku Flask Visualization Dash Plotly Project #1: Web Data Analysis Project #2: Data Storytelling Project 2 Demo
  • 65. Data Science with Python Student Project Demo
  • 66. Data Science with Python Student Project Demo Analyzing the Top Travel Influencers on Instagram
  • 67. Data Science with Python Student Project Demo
  • 68. Data Science with Python Student Project Demo: Fishing Online. Workflow: 1. Open a webpage and get HTML of search results page 2. Locate page elements by XPath 3. Extract target data 4. Next URL? Yes: repeat with the next page. No: process data using pandas and save data into .csv file
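The crawl loop in this workflow can be sketched without any network access. Everything concrete below is hypothetical: the pages live in an in-memory dictionary instead of being fetched over HTTP, the markup and class names are made up, and the standard library's `xml.etree.ElementTree` (which supports a limited XPath subset on well-formed markup) and `csv` module stand in for the browser automation and pandas steps named on the slide.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical in-memory "site": URL -> well-formed HTML.
PAGES = {
    "/search?page=1": (
        "<html><body>"
        "<div class='item'><span class='name'>Rod</span><span class='price'>25.00</span></div>"
        "<div class='item'><span class='name'>Reel</span><span class='price'>40.50</span></div>"
        "<a class='next' href='/search?page=2'>Next</a>"
        "</body></html>"
    ),
    "/search?page=2": (
        "<html><body>"
        "<div class='item'><span class='name'>Lure</span><span class='price'>5.25</span></div>"
        "</body></html>"
    ),
}


def scrape(start_url):
    """Follow 'next' links until none is left, collecting one row per item."""
    rows, url = [], start_url
    while url is not None:
        root = ET.fromstring(PAGES[url])                    # get the page HTML
        for item in root.findall(".//div[@class='item']"):  # locate by XPath
            rows.append({                                   # extract target data
                "name": item.find("span[@class='name']").text,
                "price": float(item.find("span[@class='price']").text),
            })
        next_link = root.find(".//a[@class='next']")        # next URL?
        url = next_link.get("href") if next_link is not None else None
    return rows


def to_csv(rows):
    """Serialize the collected rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "price"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = scrape("/search?page=1")
csv_text = to_csv(rows)
print(csv_text)
```

Real-world HTML is rarely well-formed XML, so an actual crawler would use a tolerant HTML parser or a browser driver for the XPath step; the loop structure stays the same.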
  • 69. Data Science with Python Student Project Demo
  • 70. Learn Data Science Understand the big picture
  • 71. Web Crawler Project - Aritzia Student Project Demo
  • 73. Info & Motivation • Type: Public • Traded as: TSX: ATZ • Industry: Fashion • Founded: 1984 • Founder: Brian Hill • Headquarters: Vancouver, British Columbia, Canada • Products: Clothing
  • 75. Dataframe • Total data: 856,452 • Date range: 2019-06-08 21:53:50 ~ 2019-06-20 13:57:19 • File numbers: 30
  • 77. Interesting Findings • Categories & Brand • Price Distribution • Top 20 Colors • Weekdays Vs Weekend - Avg Stock • On Sale event - Discount% • Price Change Vs Stock Correlation
  • 79. Brand Distribution Vs. Brand Average Price
  • 81. Weekdays Vs Weekend - Avg Stock
  • 83. Discount % of Each Brand
  • 84. Top 10 products – stock change
  • 85. Conclusion • Business casual clothes are priced higher than other categories • More transactions/purchases happen on weekends • Sale event: good deals for famous brands • Promotion influences stock change
  • 86. Challenges • Save data as tree structure (.json) • Load data • Move root node properties to child nodes • Data analysis using Pandas • Visualization - Plotly (multi-chart types)
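One possible way to handle the tree-structure challenge is `pandas.json_normalize`, which copies root-level properties onto each child record. This is a sketch under assumptions: the field names (`scraped_at`, `category`, `products`, and the per-product keys) and values are hypothetical, and the slides do not say which approach the student project actually used.

```python
import json
import pandas as pd

# Hypothetical tree-shaped record: root-level properties plus child nodes.
raw = json.dumps({
    "scraped_at": "2019-06-08 21:53:50",
    "category": "dresses",
    "products": [
        {"name": "Product A", "price": 110.0, "stock": 12},
        {"name": "Product B", "price": 148.0, "stock": 3},
    ],
})

tree = json.loads(raw)  # load the saved .json content

# One row per child node, with the root properties repeated on every row.
flat = pd.json_normalize(tree, record_path="products", meta=["scraped_at", "category"])
flat["scraped_at"] = pd.to_datetime(flat["scraped_at"])

print(flat)
```

Once the data is flat, the usual Pandas grouping and plotting tools apply directly to the resulting table.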
  • 87. Next Steps • Detailed size distribution of brands / products • Influence of discount strength • Stock refill timing • Long-term data analysis (winter vs. summer)