Radhika (30323U09065).pptx data science with python
1.
1. Introduction to data science
2. Python for data science
3. Explore machine learning using Python
4. Data visualization using Python
5. Exploratory data analysis
ARCOT SRI MAHAALAKSHMI WOMEN’S
COLLEGE
Subject code:23UNM40A
Advanced Data Science with Python
2.
INTRODUCTION TO DATA SCIENCE
Definition: Interdisciplinary field using scientific methods to
extract insights from data
Combines: Statistics, Computer Science, Domain Knowledge
Real-world Applications:
Healthcare (predictive diagnosis)
Business (customer behavior analysis)
Social Media (recommendation systems)
3.
KEY COMPONENTS OF DATA SCIENCE
Data Collection: Gathering raw data from multiple sources
Data Cleaning: Removing errors, filling missing values
Exploratory Data Analysis (EDA): Understanding patterns and
trends
Modeling: Using algorithms (like regression, decision trees)
Evaluation: Measuring model performance
Deployment: Integrating the model into real-world applications
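The first five components can be sketched end to end as below. This is a minimal illustration only: the data is synthetic (an assumed linear relation between hypothetical `hours` and `score` columns), linear regression is just one possible model, and deployment is left out.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# 1. Data collection: synthetic data stands in for a real source (assumption).
rng = np.random.default_rng(seed=0)
hours = rng.uniform(0, 10, size=200)
score = 5.0 * hours + 20.0 + rng.normal(0, 2.0, size=200)
df = pd.DataFrame({"hours": hours, "score": score})

# Knock out a few values so there is something to clean.
df.loc[df.index[:10], "hours"] = np.nan

# 2. Data cleaning: drop rows with missing values.
clean = df.dropna()

# 3. EDA: summary statistics and the correlation between the two columns.
print(clean.describe())
print(clean.corr())

# 4. Modeling: fit a linear regression on a training split.
X_train, X_test, y_train, y_test = train_test_split(
    clean[["hours"]], clean["score"], test_size=0.25, random_state=0
)
model = LinearRegression().fit(X_train, y_train)

# 5. Evaluation: R^2 on the held-out split.
r2 = r2_score(y_test, model.predict(X_test))
print(f"R^2 on test split: {r2:.3f}")
```

Holding out a test split keeps the evaluation step honest: the score is measured on rows the model never saw during fitting.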
4.
TOOLS AND SKILLS REQUIRED
Programming: Python, R
Libraries: Pandas, NumPy, Matplotlib, Scikit-learn
Databases: SQL, NoSQL
Machine Learning: Supervised & Unsupervised Learning
Soft Skills: Critical Thinking, Communication, Business Understanding
5.
PYTHON FOR DATA SCIENCE
Easy to learn and readable syntax
Large community and vast number of libraries
Integration with data tools (like Jupyter, SQL, Hadoop)
Widely used in machine learning, AI, and big data
6.
ESSENTIAL PYTHON LIBRARIES
NumPy – numerical operations and array handling
Pandas – data manipulation and analysis
Matplotlib / Seaborn – data visualization
Scikit-learn – machine learning algorithms
TensorFlow / PyTorch – deep learning frameworks
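A small sketch of what the first two libraries are typically used for. The temperature values and city labels are made up for illustration.

```python
import numpy as np
import pandas as pd

# NumPy: vectorized arithmetic on a whole array, no explicit loop.
temps_c = np.array([21.5, 23.0, 19.8, 25.1])
temps_f = temps_c * 9 / 5 + 32

# pandas: labeled, tabular data with grouping and aggregation.
df = pd.DataFrame(
    {
        "city": ["A", "A", "B", "B"],
        "temp_c": temps_c,
    }
)
mean_by_city = df.groupby("city")["temp_c"].mean()

print(temps_f)
print(mean_by_city)
```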
7.
SAMPLE PYTHON WORKFLOW IN DATA SCIENCE
1. Import libraries
import pandas as pd
import matplotlib.pyplot as plt
2. Load dataset
data = pd.read_csv('data.csv')
3. Clean & analyze
data.dropna(inplace=True)
print(data.describe())
4. Visualize data
data['age'].hist()
plt.show()
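The workflow above assumes a file named data.csv with an age column. A variant that runs without that file is sketched below. The CSV text, the names in it, and the output file name age_hist.png are all hypothetical stand-ins.

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical CSV content standing in for data.csv.
csv_text = """name,age
Asha,23
Bala,
Chitra,31
Deepa,27
"""

# Load dataset (from memory instead of from disk).
data = pd.read_csv(io.StringIO(csv_text))

# Clean & analyze: drop the row with the missing age, then summarize.
data = data.dropna()
print(data.describe())

# Visualize: histogram of the age column, saved to a file.
ax = data["age"].hist()
ax.set_xlabel("age")
plt.savefig("age_hist.png")
```

Assigning the result of dropna() back to data does the same job as inplace=True and tends to be easier to follow in longer scripts.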
8.
EXPLORE MECHANIC LEARNING USING PYTHON
Definition: Mechanic Learning here means applying machine learning (ML)
techniques to mechanical systems for analysis, prediction, and optimization.
Applications:
Predictive maintenance
Fault detection
Performance optimization
Why Python?
Rich libraries (NumPy, SciPy, Scikit-learn, TensorFlow)
Easy data manipulation and visualization
9.
TOOLS AND TECHNIQUES
Title: Python Tools for Mechanic Learning
Content:
Data Handling: pandas, numpy
Modeling: scikit-learn, tensorflow, keras
Visualization: matplotlib, seaborn
Example Techniques:
Regression for load prediction
Classification for fault detection
Clustering for operational modes
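The three example techniques can be sketched with scikit-learn as below. All data is synthetic, and the variable names (`speed`, `load`, `temperature`, and so on) and the rules generating them are assumptions chosen for illustration, not measurements from a real machine.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(seed=1)

# Regression: predict load from speed (synthetic linear relation plus noise).
speed = rng.uniform(500, 1500, size=100).reshape(-1, 1)
load = 0.02 * speed.ravel() + 3.0 + rng.normal(0, 0.5, size=100)
reg = LinearRegression().fit(speed, load)

# Classification: a reading counts as faulty when temperature exceeds 80
# (a made-up labeling rule).
temperature = rng.uniform(40, 100, size=(200, 1))
faulty = (temperature.ravel() > 80).astype(int)
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(temperature, faulty)

# Clustering: two well-separated operating modes in (speed, power) space.
idle = rng.normal(loc=[600, 1.0], scale=[20, 0.1], size=(50, 2))
full = rng.normal(loc=[1400, 5.0], scale=[20, 0.1], size=(50, 2))
modes = KMeans(n_clusters=2, n_init=10, random_state=0).fit(np.vstack([idle, full]))

print("Estimated load per unit speed:", reg.coef_[0])
print("Fault prediction at 90 and 50 degrees:", clf.predict([[90.0], [50.0]]))
print("Cluster sizes:", np.bincount(modes.labels_))
```

Regression and classification are supervised (they need a target column), while clustering is unsupervised and only groups the readings.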
10.
CASE STUDY AND BENEFITS
1. Content:
Case Study: Vibration data used to detect motor faults using an SVM classifier.
2. Process:
Collect sensor data
Preprocess with Python
Train ML model
Evaluate performance
3. Benefits:
Cost saving
Reduced downtime
Intelligent systems
4. Visuals: Graph showing prediction vs actual, or a sensor-to-model flowchart
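A minimal sketch of the case-study process, under stated assumptions: the vibration traces are synthetic, a fault is modeled as an extra 120 Hz component, and the two features (RMS and peak-to-peak amplitude) are illustrative choices. The slides do not say which features or data the original study used.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(seed=42)
t = np.linspace(0.0, 1.0, 1000, endpoint=False)  # 1 s sampled at 1 kHz


def make_signal(faulty: bool) -> np.ndarray:
    """Synthetic vibration trace; a fault adds a 120 Hz component (assumption)."""
    signal = np.sin(2 * np.pi * 50 * t) + rng.normal(0, 0.2, size=t.size)
    if faulty:
        signal += 0.8 * np.sin(2 * np.pi * 120 * t)
    return signal


def extract_features(signal: np.ndarray) -> list[float]:
    """Two simple time-domain features: RMS and peak-to-peak amplitude."""
    rms = float(np.sqrt(np.mean(signal**2)))
    return [rms, float(np.ptp(signal))]


# Collect and preprocess: 100 healthy (label 0) and 100 faulty (label 1) traces.
labels = np.array([0] * 100 + [1] * 100)
features = np.array([extract_features(make_signal(bool(y))) for y in labels])

# Train the SVM classifier and evaluate it on a held-out split.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.3, random_state=0, stratify=labels
)
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
svm.fit(X_train, y_train)
accuracy = accuracy_score(y_test, svm.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```

Scaling the features before the SVM matters because the RBF kernel is distance based, so a feature with a larger numeric range would otherwise dominate.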
11.
DATA VISUALIZATION USING PYTHON
Title: What is Data Visualization?
Content:
Definition: Graphical representation of data to identify patterns, trends, and insights.
Importance:
Makes data easier to understand
Helps in better decision making
Why Python?
Simple syntax and powerful libraries
Popular tools: matplotlib, seaborn, plotly, pandas
Visuals: Comparison image (raw data table vs bar chart)
12.
POPULAR PYTHON LIBRARIES
Title: Python Libraries for Data Visualization
Content:
Matplotlib: Basic plotting (line, bar, scatter)
Seaborn: Statistical plots (heatmaps, boxplots)
Plotly: Interactive visualizations
Pandas: Built-in plotting for DataFrames
Code Example:
import seaborn as sns
tips = sns.load_dataset('tips')  # example dataset provided by seaborn (fetched on first use)
sns.boxplot(x='day', y='total_bill', data=tips)
Visuals: Side-by-side visuals of different plot types
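The basic Matplotlib plot types and the pandas built-in plotting listed above can be put side by side as sketched below. The monthly sales figures and the output file name plot_types.png are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up monthly figures, for illustration only.
df = pd.DataFrame(
    {"month": [1, 2, 3, 4, 5, 6], "sales": [12, 15, 14, 18, 21, 19]}
)

fig, axes = plt.subplots(1, 3, figsize=(12, 3))

axes[0].plot(df["month"], df["sales"])  # Matplotlib line plot
axes[0].set_title("Line")

axes[1].bar(df["month"], df["sales"])  # Matplotlib bar chart
axes[1].set_title("Bar")

df.plot.scatter(x="month", y="sales", ax=axes[2])  # pandas built-in plotting
axes[2].set_title("Scatter (pandas)")

fig.tight_layout()
fig.savefig("plot_types.png")
```

pandas plotting draws through Matplotlib, which is why the scatter plot can be placed on an existing Matplotlib axis via the ax argument.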
13.
APPLICATIONS AND USE CASES
Title: Real-World Applications
Content:
Business Analytics: Sales trend visualization
Healthcare: Patient data visualization
Machine Learning: Model performance plots
Finance: Stock market trends
Visuals: Example dashboard/chart grid
14.
EXPLORATORY DATA ANALYSIS
Title: What is Exploratory Data Analysis?
Content:
Definition: EDA is the process of analyzing data sets to summarize their main
characteristics.
Purpose:
Understand data structure
Detect outliers and missing values
Find patterns and relationships
Why Python?
Powerful tools like pandas, matplotlib, seaborn, plotly
Visuals: EDA pipeline (Load → Clean → Visualize → Analyze)
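The three purposes listed above map onto a few pandas calls, sketched below. The table is made up (one missing value and one obvious outlier), and the 1.5 × IQR rule is just one common way to flag outliers.

```python
import numpy as np
import pandas as pd

# Made-up sample data with one missing value and one obvious outlier.
df = pd.DataFrame(
    {
        "height_cm": [160, 165, 170, 172, 168, np.nan, 250],
        "weight_kg": [55, 60, 68, 70, 65, 62, 66],
    }
)

# Understand data structure.
print(df.dtypes)
print(df.describe())

# Detect missing values (count per column).
missing = df.isna().sum()

# Detect outliers with the 1.5 * IQR rule.
q1, q3 = df["height_cm"].quantile([0.25, 0.75])
iqr = q3 - q1
is_outlier = (df["height_cm"] < q1 - 1.5 * iqr) | (df["height_cm"] > q3 + 1.5 * iqr)
outliers = df[is_outlier]

# Find relationships: pairwise correlation between numeric columns.
corr = df.corr()

print(missing)
print(outliers)
print(corr)
```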
15.
BENEFITS AND USE CASES
Title: EDA in Action
Content:
Use Case: Analyzing Titanic dataset
Found that age and passenger class affect survival
Detected missing age values
Benefits:
Drives data cleaning and model prep
Reveals hidden patterns
Supports better decision-making
Visuals: Titanic survival barplot or correlation matrix
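The kind of checks behind this use case can be sketched as below. The rows are made up and are NOT real Titanic records, so the numbers printed carry no meaning. The column names (`survived`, `pclass`, `age`) follow the copy of the dataset distributed with seaborn, and the split into child and adult at age 12 is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration; NOT real Titanic records.
# With network access, seaborn's sns.load_dataset("titanic") loads the real table.
titanic = pd.DataFrame(
    {
        "survived": [1, 0, 1, 0, 0, 1, 0, 1],
        "pclass": [1, 3, 1, 3, 2, 2, 3, 1],
        "age": [29.0, np.nan, 4.0, 40.0, 35.0, np.nan, 22.0, 50.0],
    }
)

# Detect missing age values.
missing_age = int(titanic["age"].isna().sum())

# Survival rate by passenger class.
survival_by_class = titanic.groupby("pclass")["survived"].mean()

# Survival rate by age band; rows without an age fall out of the grouping.
age_band = pd.cut(titanic["age"], bins=[0, 12, 120], labels=["child", "adult"])
survival_by_age = titanic.groupby(age_band, observed=True)["survived"].mean()

print("Missing ages:", missing_age)
print(survival_by_class)
print(survival_by_age)
```

The groupby results are the tables a survival barplot would be drawn from.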