Python Implementation in Data Science
Speaker Bio
Andi Mardinsyah
(Data Scientist, Telkom Indonesia)
• University of Indonesia - Electrical Engineering, majoring in Computer System
• Offenburg University - Communication and Media Engineering
Python
• Interpreted, high-level, general-purpose programming language
• Supports multiple programming paradigms, including structured (particularly procedural), object-oriented, and functional programming
• Comprehensive standard library
Data Science
• Data science enables us to take data and transform it into meaningful information that can help us make decisions.
• Data science is interdisciplinary and combines other well-known fields such as probability, statistics, analytics, and computer science.
Data scientists may work on strategy, decision making, and the implementation of analysis in a wide range of ways.
The role of a data scientist may look very different depending on what company you’re working for and what business domain you’re working in!
Data Science Roles
Data Analyst
• Knowledge of databases
• Ability to query data (SQL, NoSQL)
• Ability to describe data (trends, changes, etc.)
• Ability to use visualisation tools (Tableau, Power BI)
• Fluent in spreadsheets (Excel, Sheets)
• Ability to present data (slides, dashboards)
• Business acumen
Data Scientist/Research Scientist
• Data Modeling (Statistical/Machine Learning)
• Conduct Research (Statistics)
• Experiment (A/B testing)
• Extract insights, Tell Story
• Programming (Python, R, ..)
Data Engineer
• Knowledge of database architecture
• Knowledge of cloud platforms
• Data pipeline (ETL)
• Programming (Python, Scala, Java..)
Machine Learning Engineer
• State-of-the-art machine learning models
• Deep Learning
• Computer Vision, NLP
• Model deployment
Product Analyst
Business Intelligence
Data Science (in most of the industries)
++ Product
++ Business
++ Statistics
++ Software Engineering
++ Data Engineering
Data Science Process
Data Science Tools
Essential To Start With
Online Tutorial
Google search in action!
Data Science Skills
Data Science Data Gathering
Data Science Coding
Data Science Mathematics & Statistics
Data Science Data Visualisation & Communication
Data Science Exploratory Data Analysis
• Comparison Analysis
• Distribution Analysis
• Trend Analysis
• Ranking Analysis
• Variance Analysis
• Contribution Analysis
• Frequency Analysis
• Recency, Frequency, Monetary Analysis
• Correlation Analysis
• Pareto Analysis
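As one illustration of the analyses listed above, here is a minimal sketch of a Recency, Frequency, Monetary (RFM) table using only the Python standard library. The transaction records, the customer IDs, and the `rfm_table` helper are hypothetical and exist only for this example; in practice the same aggregation would usually be done with a dataframe library.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction log: (customer_id, purchase_date, amount).
transactions = [
    ("C1", date(2024, 1, 5), 120.0),
    ("C1", date(2024, 3, 1), 80.0),
    ("C2", date(2023, 11, 20), 300.0),
    ("C3", date(2024, 2, 14), 50.0),
    ("C3", date(2024, 2, 28), 45.0),
    ("C3", date(2024, 3, 10), 60.0),
]


def rfm_table(rows, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchase_date, amount in rows:
        previous = last_purchase.get(customer_id)
        # Recency is measured from the most recent purchase.
        if previous is None or purchase_date > previous:
            last_purchase[customer_id] = purchase_date
        frequency[customer_id] += 1        # number of purchases
        monetary[customer_id] += amount    # total amount spent
    return {
        cid: ((as_of - last_purchase[cid]).days, frequency[cid], monetary[cid])
        for cid in last_purchase
    }


rfm = rfm_table(transactions, as_of=date(2024, 3, 31))
for customer_id, (recency, freq, money) in sorted(rfm.items()):
    print(customer_id, recency, freq, money)
```

A low recency value with high frequency and monetary values points to an engaged customer; the opposite pattern may indicate a customer worth reviewing for churn risk.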
Data Science Machine Learning
Data Science
https://teachablemachine.withgoogle.com/
Training, testing?
Train
Model
Testing
Split
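The split, train, and test idea can be sketched with the standard library alone. The labelled rows, the `split_train_test` helper, and the majority-class "model" below are assumptions made for illustration, not something shown on the slide; libraries such as scikit-learn offer a ready-made `train_test_split` for real projects.

```python
import random
from collections import Counter


def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatable splits
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical labelled rows: (monthly_usage_hours, churned_label).
rows = [(5, 1), (40, 0), (8, 1), (55, 0), (35, 0),
        (3, 1), (60, 0), (45, 0), (10, 1), (50, 0)]

# Split: hold out 20% of the rows for testing.
train, test = split_train_test(rows, test_fraction=0.2)

# Train / Model: a majority-class baseline fitted on the training rows only.
majority_label = Counter(label for _, label in train).most_common(1)[0][0]

# Testing: accuracy measured on rows the model never saw.
accuracy = sum(label == majority_label for _, label in test) / len(test)
print(len(train), len(test), accuracy)
```

Keeping the test rows out of training is the point of the split: the accuracy then estimates how the model behaves on unseen data.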
Pandas Example
Indexing: df = df.set_index('PassengerId')
Drop missing values: df = df.dropna()
Seaborn example
sns.scatterplot(x='total_bill', y='tip', data=tips)
Use Case Marketing Analytics
Use Case Customer Analytics
Use Case User Analytics
Active User per Province
New User vs Churn User
Active User (by Payment)
New User by Regional
Churn User by Regional
Active User by LOS
Active User by City (Top 10)
Active User by Regional
Churn User by City (Top 10)
Active User by Group Age
Active User by Gender
Active User by Indihome Product Type
Active User by Urban Rural
Active User by Location Category
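Most of the breakdowns listed above are group-and-count operations. A minimal sketch with `collections.Counter` follows; the user records, field names, and status values are hypothetical and stand in for whatever the real user table contains.

```python
from collections import Counter

# Hypothetical user records; field names and values are illustrative only.
users = [
    {"id": 1, "province": "Jawa Barat", "city": "Bandung", "status": "active"},
    {"id": 2, "province": "Jawa Barat", "city": "Bogor", "status": "churn"},
    {"id": 3, "province": "DKI Jakarta", "city": "Jakarta Selatan", "status": "active"},
    {"id": 4, "province": "Jawa Barat", "city": "Bandung", "status": "active"},
    {"id": 5, "province": "Bali", "city": "Denpasar", "status": "new"},
    {"id": 6, "province": "DKI Jakarta", "city": "Jakarta Selatan", "status": "churn"},
]


def count_by(records, key, status):
    """Count records with the given status, grouped by one field."""
    return Counter(r[key] for r in records if r["status"] == status)


# Active User per Province
active_per_province = count_by(users, "province", "active")

# Active User by City (Top 10)
top_active_cities = count_by(users, "city", "active").most_common(10)

# New User vs Churn User
new_vs_churn = Counter(
    u["status"] for u in users if u["status"] in ("new", "churn")
)

print(active_per_province)
print(top_active_cities)
print(new_vs_churn)
```

Changing the `key` argument (for example to a gender or product-type field) would give the other breakdowns in the same way.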
Use Case Churn Prevention
Strategy to Improve Customer Retention
1. Gather available data on customer behavior, transactions, demographics, and usage patterns.
2. Use these data points to predict which customer segments are likely to churn.
3. Create a model of the business's risk tolerance with respect to churn probability.
4. Design an intervention model to consider how the level of intervention could affect the churn percentages and customer lifetime value (CLV).
5. Implement effective experimentation across multiple customer segments to reduce churn and promote retention.
6. Rinse and repeat from Step 1 (cognitive churn management is a continuous process, not a once-a-year exercise).
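Step 4 can be made concrete with a simple expected-value calculation. This is only a sketch under stated assumptions: the formula, the `save_rate` parameter (the chance that an intervention retains a customer who would otherwise churn), and all numbers are hypothetical and are not taken from the slides.

```python
def expected_gain(churn_probability, clv, save_rate, cost):
    """Expected net value of intervening for one customer.

    Assumes the intervention retains a churning customer with
    probability save_rate, and that a retained customer is worth clv.
    """
    return churn_probability * save_rate * clv - cost


def should_intervene(churn_probability, clv, save_rate, cost):
    """Intervene only when the expected gain is positive."""
    return expected_gain(churn_probability, clv, save_rate, cost) > 0


# Hypothetical customers: same CLV and offer cost, different churn risk.
for probability in (0.60, 0.05):
    decision = should_intervene(probability, clv=1000.0, save_rate=0.3, cost=50.0)
    print(probability, decision)
```

Under these assumptions, a costlier intervention only pays off for customers with a higher churn probability or a higher CLV, which is the trade-off the intervention model is meant to capture.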
Customer Historical Data
• Customer Behavior
• Transactions
• Demographics
• Usage Pattern
Machine Learning
• Machine Learning Model
• AI Model
• Deep Learning Model
Churn Predictive Model
• Probability Risk
Churn Clustering
• High Risk
• Medium Risk
• Low Risk
Campaign Planning
• Campaign based on Risk Level
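The last stages of this flow (churn probability, risk level, campaign) can be sketched as follows. Note that this uses fixed probability thresholds rather than a clustering algorithm, and that the thresholds, the campaign names, and the scores are hypothetical values chosen for illustration.

```python
def risk_level(churn_probability, medium_from=0.3, high_from=0.7):
    """Map a churn probability to a risk level using fixed thresholds."""
    if not 0.0 <= churn_probability <= 1.0:
        raise ValueError("churn_probability must be between 0 and 1")
    if churn_probability >= high_from:
        return "High Risk"
    if churn_probability >= medium_from:
        return "Medium Risk"
    return "Low Risk"


# Hypothetical campaign assigned to each risk level.
CAMPAIGNS = {
    "High Risk": "personal retention call",
    "Medium Risk": "discount offer",
    "Low Risk": "regular newsletter",
}

# Hypothetical churn probabilities produced by a predictive model.
scores = {"C1": 0.85, "C2": 0.45, "C3": 0.10}

plan = {}
for customer_id, probability in scores.items():
    level = risk_level(probability)
    plan[customer_id] = (level, CAMPAIGNS[level])

for customer_id, (level, campaign) in plan.items():
    print(customer_id, level, campaign)
```

In practice the thresholds would be tuned to the business's risk tolerance, or replaced by a clustering step over the predicted probabilities.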
Use Case Recommendation Engine

Bootcamp python-1
