Data Science Life Cycle
Presented By:
Praanav Bhowmik
Durgesh Gupta
Agenda:
● What is Data Science?
● Qualities of Data Scientist
● Job Role & Skill Sets
● Data Science Life Cycle
○ Business Understanding
○ Data acquisition and understanding
○ Modeling
○ Deployment
○ Customer Acceptance
○ Monitoring & Maintenance
● Contribution of Data Scientist to the Organization.
Introduction
● Data science combines multiple disciplines, using statistics, data analysis, and
machine learning to analyze data and to extract information, knowledge, and
insights from it.
● Data science is about data gathering, analysis, and decision-making.
● Its key activities are finding patterns in data through analysis and visualization
and making predictions about the future.
● It uses theory and methods to provide concrete, actionable solutions to
complex problems.
Qualities of Data Scientist
● Discover valuable insights in large amounts of data, which can then be used to
shape company strategies and achieve business objectives.
● Empower management to make smarter decisions.
● Make it easier to achieve business goals.
● Challenge the workforce to embrace data.
● Refine target audiences.
● Identify new revenue opportunities.
● Bring an analytical mind and business acumen.
Job Responsibilities
● Fetch information from various sources and analyze it to get a clear
understanding of how an organization performs.
● Use statistical and analytical methods plus AI tools to automate specific
processes within the organization and develop smart solutions to business
challenges.
● Build predictive models and machine learning algorithms.
● Present information using data visualization tools.
● Propose solutions and strategies to tackle business challenges.
Skill Set Required
● Mathematics: Data scientists need mathematics to process and structure the data
they're dealing with.
● Probability & Statistics: Statistics allows data scientists to slice and dice data,
extracting the insights needed to draw reasonable conclusions.
● Programming: A data scientist needs to know programming languages such as
Python for writing scripts for data manipulation, analysis, and visualization, and
others such as R, Java, or C to achieve specific goals.
● Data Management: Ability to extract data from relational databases, non-relational
databases, and unstructured sources.
● Machine Learning / Deep Learning: Knowledge of ML algorithms for building models.
● Cloud Computing: Ability to use the data and machine learning services and
frameworks offered by cloud service providers.
Data Science Life Cycle
Business Understanding
● The complete cycle revolves around the enterprise goal.
● Identify the key business variables that the analysis needs to predict.
● Define the project goals by asking and refining "sharp" questions that are relevant,
specific, and unambiguous.
● Find the relevant data that helps you answer the questions that define the
objectives of the project.
Data Acquisition and Understanding
● Real-world data sets are often noisy, have missing values, or contain a host of
other discrepancies.
● The aim is to produce a clean, high-quality data set whose relationship to the
target variables is understood.
● Develop a solution architecture for the data pipeline that refreshes and scores the
data regularly.
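As an illustration of the cleaning step, here is a minimal sketch that uses only the Python standard library. The records, the column names (`age`, `income`, `churned`), and the cleaning policy (drop rows with no target, fill missing features with the column median) are all assumptions made for the example, not something the slides prescribe.

```python
from statistics import median

# Hypothetical raw records; None marks a missing value.
raw_rows = [
    {"age": 34, "income": 52000.0, "churned": 0},
    {"age": None, "income": 61000.0, "churned": 1},
    {"age": 45, "income": None, "churned": 0},
    {"age": 29, "income": 48000.0, "churned": None},  # target missing
    {"age": 51, "income": 75000.0, "churned": 1},
]


def clean(rows, feature_names, target_name):
    """Drop rows without a target, then fill missing features with the column median."""
    # Copy each row so the raw data stays untouched.
    labeled = [dict(r) for r in rows if r[target_name] is not None]
    for name in feature_names:
        observed = [r[name] for r in labeled if r[name] is not None]
        fill_value = median(observed)
        for r in labeled:
            if r[name] is None:
                r[name] = fill_value
    return labeled


clean_rows = clean(raw_rows, ["age", "income"], "churned")
```

Median imputation is only one possible policy. Which rows to drop and how to fill gaps depends on how each variable relates to the target, which is exactly what this phase is meant to establish.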
Modeling
● Determine the optimal data features for the machine-learning model.
● Create an informative machine-learning model that predicts the target most
accurately.
● The process for model training includes the following steps:
○ Split the input data randomly for modeling into a training data set and a test
data set.
○ Build the models by using the training data set.
○ Evaluate the models on the training and the test data sets. Use a series of
competing machine-learning algorithms, each with its associated tuning
parameters (known as a parameter sweep), that are geared toward
answering the question of interest with the current data.
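The training steps above can be sketched in plain Python. Everything here is an assumption chosen for illustration: the synthetic one-dimensional data, the k-nearest-neighbour classifier, and the candidate values of `k` that stand in for the parameter sweep. For brevity the sketch picks the best `k` by test accuracy, whereas a real project would normally select on a separate validation split or with cross-validation.

```python
import random


def make_data(n, seed=0):
    """Synthetic 1-D data: label is 1 when x > 0.5, with about 10% of labels flipped."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.1:
            y = 1 - y
        data.append((x, y))
    return data


def train_test_split(data, test_fraction=0.25, seed=0):
    """Shuffle a copy of the data and split it into training and test sets."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (k is odd, so no ties)."""
    neighbours = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(label for _, label in neighbours)
    return int(votes * 2 > k)


def accuracy(train, data, k):
    """Fraction of points in data that the k-NN model built on train labels correctly."""
    correct = sum(knn_predict(train, x, k) == y for x, y in data)
    return correct / len(data)


data = make_data(200)
train_set, test_set = train_test_split(data)

# Parameter sweep: evaluate each candidate k on both data sets.
results = {}
for k in (1, 3, 5, 9, 15):
    results[k] = (accuracy(train_set, train_set, k), accuracy(train_set, test_set, k))

best_k = max(results, key=lambda k: results[k][1])
```

Comparing the two numbers stored for each `k` shows why both data sets are evaluated: a model that scores much higher on the training set than on the test set is overfitting.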
Deployment
● Deploy models with a data pipeline to a production or production-like
environment for final user acceptance.
● After you have a set of models that perform well, you can operationalize them for
other applications to consume. Depending on the business requirements,
predictions are made either in real time or on a batch basis.
● To deploy models, you expose them through an open API.
● The API enables the model to be easily consumed from various applications.
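A minimal sketch of exposing a model through an HTTP API, using only Python's standard `http.server` module. The model is a placeholder linear score with made-up weights, and the names `predict`, `predict_batch`, `handle_payload`, `PredictionHandler`, and `serve` are hypothetical. A production deployment would load a trained model and would typically run behind a proper web framework or a managed serving platform.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder model: a fixed linear score. A real deployment would load a trained model.
WEIGHTS = [0.4, -0.2, 0.1]
BIAS = 0.05


def predict(features):
    """Score one feature vector (real-time use)."""
    if len(features) != len(WEIGHTS):
        raise ValueError(f"expected {len(WEIGHTS)} features, got {len(features)}")
    return BIAS + sum(w * x for w, x in zip(WEIGHTS, features))


def predict_batch(rows):
    """Score many feature vectors at once (batch use)."""
    return [predict(row) for row in rows]


def handle_payload(body):
    """Turn a JSON request body into an (HTTP status, JSON response) pair."""
    try:
        features = json.loads(body)["features"]
        return 200, json.dumps({"prediction": predict(features)})
    except (ValueError, KeyError, TypeError) as exc:
        return 400, json.dumps({"error": str(exc)})


class PredictionHandler(BaseHTTPRequestHandler):
    """Answers POST requests whose body looks like {"features": [..]}."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_payload(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload.encode("utf-8"))


def serve(port=8000):
    """Start the API on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), PredictionHandler).serve_forever()
```

Calling `serve()` starts the endpoint for real-time requests, while `predict_batch` can be called from a scheduled job when predictions are produced on a batch basis. Keeping the request logic in `handle_payload` lets it be tested without a running server.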
Customer Acceptance
● Confirm that the pipeline, the model, and their deployment in a production
environment satisfy the customer's objectives.
● The customer should validate that the system meets their business needs and
that it answers the questions with acceptable accuracy to deploy the system to
production for use by their client's application.
● The project is handed off to the entity responsible for operations.
Monitoring & Maintenance
● The final but continuous phase of ML development is model monitoring and
maintenance.
● Post-deployment, you need to monitor your model to ensure it continues to
perform as expected.
● An ML model requires regular tuning and updating to meet performance
expectations.
● Failing to perform this essential step may result in diminishing model accuracy
over time.
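One simple way to monitor a deployed model is to track its accuracy over a sliding window of recent predictions, once the true outcomes become known. The sketch below assumes that approach. The class name `AccuracyMonitor`, the window size, and the 0.8 threshold are illustrative choices, not values from the slides.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent predictions and flag degradation."""

    def __init__(self, window_size=100, min_accuracy=0.8):
        self.outcomes = deque(maxlen=window_size)  # True when a prediction was correct
        self.min_accuracy = min_accuracy

    def record(self, predicted, actual):
        """Store whether one prediction matched the observed outcome."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Accuracy over the current window, or None before any outcome is seen."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """True once the window is full and accuracy is below the threshold."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.min_accuracy


monitor = AccuracyMonitor(window_size=5, min_accuracy=0.8)
for predicted, actual in [(1, 1), (0, 0), (1, 1), (1, 1), (0, 0)]:
    monitor.record(predicted, actual)
healthy = monitor.needs_retraining()   # window accuracy is 1.0

for predicted, actual in [(1, 0), (0, 1)]:  # the model starts to miss
    monitor.record(predicted, actual)
degraded = monitor.needs_retraining()  # window accuracy is now 0.6
```

A flag from `needs_retraining()` would then trigger the tuning and updating described above. Waiting for a full window avoids raising alerts on the first few predictions.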
Contribution to the Organization
Contribution
● Data Science helps businesses monitor, manage, and collect performance
measures to improve decision-making across the organization.
● Companies may use trend analysis to make critical decisions that improve consumer
engagement and corporate performance and boost revenue.
● Data Science models make use of current data and may simulate a variety of
operations. As a result, businesses may look for candidates with a professional
certificate who have studied the best courses for data analytics.
● Data Science assists firms in identifying and refining target audiences by integrating
existing data with additional data points to provide meaningful insights.
References
● Microsoft Data Science Life Cycle Overview
● Data Science Introduction-W3Schools
Thank You
