Data Science Life Cycle
Presented By:
Praanav Bhowmik
Durgesh Gupta
Agenda:
● What is Data Science?
● Qualities of Data Scientist
● Job Role & Skill Sets
● Data Science Life Cycle
○ Business Understanding
○ Data acquisition and understanding
○ Modeling
○ Deployment
○ Customer Acceptance
○ Monitoring & Maintenance
● Contribution of Data Scientist to the Organization.
Introduction
● Data Science combines multiple disciplines, using statistics, data analysis,
and machine learning to analyze data and extract information, knowledge,
and insights from it.
● Data Science is about data gathering, analysis, and decision-making.
● Its key activities are finding patterns in data through analysis and
visualization, and making predictions about the future.
● It applies theory and methods to provide concrete, actionable solutions to
complex problems.
Qualities of Data Scientist
● Discover valuable insights from huge amounts of data, which can then be used to
shape company strategies and achieve business objectives.
● Empower management to make smarter decisions.
● Make it easier to achieve business goals.
● Challenge the workforce to embrace data.
● Refine target audiences.
● Identify new revenue opportunities.
● Analytical mind and business acumen
Job Responsibilities
● Fetch information from various sources and analyze it to get a clear
understanding of how an organization performs.
● Use statistical and analytical methods, plus AI tools, to automate specific
processes within the organization and develop smart solutions to business
challenges.
● Build predictive models and machine learning algorithms.
● Present information using data visualization tools.
● Propose solutions and strategies to tackle business challenges.
Skill Set Required
● Mathematics: Data scientists use mathematics to process and structure the data they’re
dealing with.
● Probability & Statistics: Statistics lets data scientists slice and dice data,
extracting the insights needed to draw reasonable conclusions.
● Programming: A data scientist needs to know several programming languages: Python
for writing scripts for data manipulation, analysis, and visualization, and others such as
R, Java, or C to achieve specific goals.
● Data Management: Ability to extract data from relational databases, non-relational
databases, and unstructured sources.
● Machine Learning / Deep Learning: Ability to apply ML algorithms to build the model.
● Cloud Computing: Ability to use the data and machine learning services and
frameworks offered by cloud service providers.
Data Science Life Cycle
Business Understanding
● The complete cycle revolves around the enterprise goal.
● Identify the key business variables that the analysis needs to predict.
● Define the project goals by asking and refining "sharp" questions that are relevant,
specific, and unambiguous.
● Find the relevant data that helps you answer the questions that define the
objectives of the project.
Data Acquisition and Understanding
● Real-world data sets are often noisy, have missing values, or contain a host of
other discrepancies.
● The aim is to produce a clean, high-quality data set whose relationship to the
target variables is understood.
● Develop a solution architecture of the data pipeline that refreshes and scores the
data regularly.
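As a rough illustration of the cleaning step, the sketch below drops rows that have no target value and fills missing features with the column median. The column names (`age`, `income`, `churned`), the sample rows, and the choice of median imputation are assumptions made for this example; a real project would pick a strategy suited to its data.

```python
from statistics import median

def clean_rows(rows, feature_keys, target_key):
    """Drop rows without a target, then fill missing features with the column median."""
    # Rows with no target value cannot be used for supervised learning.
    labeled = [dict(r) for r in rows if r.get(target_key) is not None]
    for key in feature_keys:
        observed = [r[key] for r in labeled if r.get(key) is not None]
        if not observed:
            continue  # nothing to impute from; leave the column untouched
        fill = median(observed)
        for r in labeled:
            if r.get(key) is None:
                r[key] = fill
    return labeled

# Hypothetical raw data with gaps in both the features and the target.
raw = [
    {"age": 34, "income": 52000, "churned": 0},
    {"age": None, "income": 61000, "churned": 1},
    {"age": 45, "income": None, "churned": 0},
    {"age": 29, "income": 48000, "churned": None},
]
clean = clean_rows(raw, ["age", "income"], "churned")
print(clean)
```

The function copies each row before filling it, so the raw data stays available for comparison with the cleaned set.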
Modeling
● Determine the optimal data features for the machine-learning model.
● Create an informative machine-learning model that predicts the target most
accurately.
● The process for model training includes the following steps:
○ Split the input data randomly for modeling into a training data set and a test
data set.
○ Build the models by using the training data set.
○ Evaluate the models on both the training and the test data sets. Use a series
of competing machine-learning algorithms, along with their associated tuning
parameters (known as a parameter sweep), that are geared toward
answering the question of interest with the current data.
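The training steps above can be sketched in a few lines. This example uses synthetic one-dimensional data and a hand-written k-nearest-neighbours classifier, both chosen only to keep the sketch self-contained; the data, the candidate values of `k`, and the helper names are assumptions, not part of the original slides.

```python
import random

def split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > len(nearest) else 0

def accuracy(train, rows, k):
    """Fraction of rows whose label the k-NN model predicts correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in rows)
    return hits / len(rows)

# Synthetic data (hypothetical): the label is 1 when the feature exceeds 5.
rng = random.Random(42)
data = [(x, int(x > 5)) for x in (rng.uniform(0, 10) for _ in range(200))]

train, test = split(data)

# Parameter sweep: score every candidate k on the training and the test set.
sweep = {k: (accuracy(train, train, k), accuracy(train, test, k)) for k in (1, 3, 5, 15)}
best_k = max(sweep, key=lambda k: sweep[k][1])

for k, (train_acc, test_acc) in sweep.items():
    print(f"k={k:2d} train={train_acc:.2f} test={test_acc:.2f}")
print("selected k:", best_k)
```

Comparing the training and test scores side by side helps reveal overfitting. In practice the sweep is often scored on a separate validation set or with cross-validation, so that the test set remains an unbiased final check.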
Deployment
● Deploy models with a data pipeline to a production or production-like
environment for final user acceptance.
● After you have a set of models that perform well, you can operationalize them for
other applications to consume. Depending on the business requirements,
predictions are made either in real time or on a batch basis.
● To deploy models, you expose them through an open API.
● The API enables the model to be easily consumed from various applications.
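One way to picture exposing a model through an API is a small HTTP endpoint that accepts JSON features and returns a prediction. The sketch below uses only Python's standard library; the `/predict` path, the feature names, and the fixed linear scoring rule standing in for a trained model are all assumptions for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    """Stand-in for a trained model: a fixed linear score with a threshold."""
    score = 0.8 * features["usage_hours"] - 0.5 * features["support_tickets"]
    return {"score": round(score, 3), "label": int(score > 1.0)}

def handle_predict(body: bytes):
    """Turn a raw JSON request body into (HTTP status, JSON response bytes)."""
    try:
        result = predict(json.loads(body))
        return 200, json.dumps(result).encode()
    except (ValueError, KeyError, TypeError) as exc:
        # Malformed JSON or missing features are reported as a client error.
        return 400, json.dumps({"error": str(exc)}).encode()

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

def serve(port=8000):
    """Block and serve real-time predictions until interrupted."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the real-time endpoint; a batch job could instead call `predict` directly over a list of records. Keeping the request handling in `handle_predict` separate from the server class makes the logic easy to test without opening a socket.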
Customer Acceptance
● Confirm that the pipeline, the model, and their deployment in a production
environment satisfy the customer's objectives.
● The customer should validate that the system meets their business needs and
answers the questions with acceptable accuracy, so that it can be deployed to
production for use by their client's application.
● The project is handed off to the entity responsible for operations.
Monitoring & Maintenance
● The final but continuous phase of ML development is model monitoring and
maintenance.
● Post-deployment, you need to monitor your model to ensure it continues to
perform as expected.
● An ML model requires regular tuning and updating to meet performance
expectations.
● Failing to perform this essential step may result in diminishing model accuracy
over time.
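A minimal sketch of post-deployment monitoring, assuming the true outcomes eventually become available so that predictions can be checked: it tracks accuracy over a sliding window of recent predictions and flags the model once accuracy falls below a threshold. The class name, window size, and threshold are hypothetical choices for this example.

```python
from collections import deque

class AccuracyMonitor:
    """Track accuracy over the most recent predictions and flag degradation."""

    def __init__(self, window=100, threshold=0.8):
        self.outcomes = deque(maxlen=window)  # True where the prediction matched
        self.threshold = threshold

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else None

    def needs_retraining(self):
        # Only judge once the window is full, to avoid alarms on tiny samples.
        full = len(self.outcomes) == self.outcomes.maxlen
        return full and self.accuracy() < self.threshold

monitor = AccuracyMonitor(window=10, threshold=0.8)
for _ in range(10):
    monitor.record(1, 1)   # the model is still correct
healthy = monitor.needs_retraining()

for _ in range(4):
    monitor.record(1, 0)   # behaviour shifts and predictions start to miss
degraded = monitor.needs_retraining()

print(healthy, degraded, monitor.accuracy())
```

The sliding window means old results age out, so the monitor reflects how the model performs now rather than over its whole lifetime.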
Contribution to the Organization
Contribution
● Data Science helps businesses monitor, manage, and collect performance
measures to improve decision-making across the organization.
● Companies may use trend analysis to make critical decisions that improve consumer
engagement and corporate performance and boost revenue.
● Data Science models make use of current data and may simulate a variety of
operations. As a result, businesses may look for candidates with a professional
certificate who have studied the best courses for data analytics.
● Data Science assists firms in identifying and refining target audiences by integrating
existing data with additional data points to provide meaningful insights.
References
● Microsoft Data Science Life Cycle Overview
● Data Science Introduction-W3Schools
Thank You
