Textbooks
Data Analysis and Visualization Using Python by Dr Ossama, 2018
Python for Data Analysis by Wes McKinney, 2022
Ultimate Python Libraries for Data Analysis and Visualization by Abhinaba Banerjee, 2024
Python Data Science Handbook by Jake VanderPlas, 2022
3.
Summary of course objectives:
Understand and write programs to store and manipulate data and measurements.
Implement the fundamental concepts of interactive visualization of data.
Implement common data transformations and statistical analysis.
Demonstrate current machine learning techniques for prediction and knowledge discovery.
4.
Your intentions/expectations?
In what ways do you think this course could help your professional development?
What topics are you most interested in?
What suggestions do you have for the instructors and the course?
5.
Introduction to Data Analysis
Overview of key Python libraries for data analysis
Libraries provide built-in functions and modules
Extensive range of functionality available
6.
Types of Python Data Analysis Libraries
Categorized into three main groups:
Scientific Computing Libraries
Data Visualization Libraries
Machine Learning Libraries
Pandas - Data Manipulation
Provides data structures and tools for analysis
Fast access to structured data
Key feature: DataFrame (2D table with rows & columns)
Supports easy indexing by label or by position
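The DataFrame and its indexing can be sketched in a few lines. This is a minimal illustration, not course material: the sensor names and readings are invented, and only standard pandas calls (`DataFrame`, `set_index`, `loc`, `iloc`, boolean masks) are used.

```python
import pandas as pd

# Hypothetical measurements, invented purely for illustration.
df = pd.DataFrame(
    {
        "sensor": ["a", "b", "c"],
        "temperature_c": [21.5, 19.0, 23.2],
        "humidity_pct": [40, 55, 38],
    }
).set_index("sensor")  # use the sensor name as the row label

# Label-based indexing: one cell, by row label and column name.
temp_b = df.loc["b", "temperature_c"]

# Position-based indexing: the first row, returned as a Series.
first_row = df.iloc[0]

# Boolean indexing: keep only rows warmer than 20 degrees.
warm = df[df["temperature_c"] > 20]

print(df.shape)          # (3, 2): three rows, two columns
print(temp_b)            # 19.0
print(list(warm.index))  # ['a', 'c']
```

`loc` selects by label and `iloc` by integer position; mixing the two up is a common source of off-by-one surprises, so it helps to pick one deliberately.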
9.
NumPy - Array-Based Computation
Uses arrays as primary input/output
Supports matrix operations
Enables fast array processing with minimal coding
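A short sketch of what array-based computation looks like in practice. The values and variable names below are made up for illustration; the point is that whole-array arithmetic and matrix products need no explicit loops.

```python
import numpy as np

# Hypothetical 2x2 table of readings and a weight per column.
readings = np.array([[1.0, 2.0],
                     [3.0, 4.0]])
weights = np.array([0.5, 1.5])

# Element-wise arithmetic applies to every entry at once.
scaled = readings * 10

# Matrix-vector product: each row is combined with the weights.
weighted = readings @ weights

# Reduction along an axis: the mean of each column.
col_means = readings.mean(axis=0)

print(scaled[1, 1])  # 40.0
print(weighted)      # [3.5 7.5]
print(col_means)     # [2. 3.]
```

The same operations written with nested Python lists would need explicit loops and would typically run much slower on large inputs.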
10.
SciPy - Advanced Math Functions
Includes functions for:
Optimization
Linear algebra
Signal processing
Statistics
Often used alongside Matplotlib for data visualization
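One small call from each of the four areas listed above gives a feel for the library. This is only a sketch: the inputs are toy values chosen so the results are easy to check by hand, and each call is just one of many functions in its submodule.

```python
import numpy as np
from scipy import linalg, optimize, signal, stats

# Optimization: find the x that minimizes (x - 2)^2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)

# Linear algebra: solve 3x + y = 9 and x + 2y = 8.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
solution = linalg.solve(A, b)

# Signal processing: locate local maxima in a short sequence.
samples = np.array([0.0, 1.0, 0.0, 2.0, 0.0, 3.0, 0.0])
peaks, _ = signal.find_peaks(samples)

# Statistics: Pearson correlation of two perfectly linear series.
r, p_value = stats.pearsonr(np.array([1.0, 2.0, 3.0, 4.0]),
                            np.array([2.0, 4.0, 6.0, 8.0]))

print(round(result.x, 6))  # 2.0
print(solution)            # [2. 3.]
print(peaks)               # [1 3 5]
print(round(r, 6))         # 1.0
```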
General-Purpose Datasets
Kaggle - Large collection for ML & AI
Google Dataset Search - Aggregated datasets
UCI ML Repository - Academic & research-focused
Data.gov - U.S. government open data
data.world - Community-driven datasets
19.
Big Data & Business Datasets
AWS Open Data Registry - AI & Cloud datasets
Google Cloud Public Datasets - Cloud-based analytics
Microsoft Azure Open Datasets - AI & Business Intelligence
FiveThirtyEight - Political, sports, and social data
20.
Health & Science Datasets
WHO Data - Global health datasets
CDC Open Data - Public health and disease-related datasets
PhysioNet - Clinical and physiological health data
21.
Finance & Economics Datasets
World Bank Open Data - Economic indicators
IMF Data - Macroeconomic & financial statistics
Quandl (now Nasdaq Data Link) - Market & financial data
22.
Geospatial & Environmental Datasets
NASA Earthdata - Climate & satellite imagery
OpenStreetMap - Free geospatial data
USGS EarthExplorer - Remote sensing data
23.
Choosing the Right Dataset
Select based on project needs
Consider dataset quality, size, and source credibility
Use platforms like Kaggle, Google Dataset Search, and government repositories
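Size and basic quality can be checked quickly once a candidate dataset is loaded. The helper below is a hypothetical sketch (the name `summarize_dataset` and the toy table are not from the course); it reports row and column counts, the fraction of missing cells, and the number of duplicated rows. Source credibility cannot be measured this way and still has to be judged from the publisher and documentation.

```python
import pandas as pd


def summarize_dataset(df: pd.DataFrame) -> dict:
    """Return simple size and quality indicators for a candidate dataset."""
    return {
        "rows": len(df),
        "columns": df.shape[1],
        # Share of all cells that are missing (NaN or None).
        "missing_fraction": float(df.isna().to_numpy().mean()),
        # Rows that exactly repeat an earlier row.
        "duplicate_rows": int(df.duplicated().sum()),
    }


# Tiny made-up table standing in for a downloaded dataset.
candidate = pd.DataFrame(
    {
        "country": ["A", "B", "B", None],
        "gdp": [1.0, 2.0, 2.0, 4.0],
    }
)

summary = summarize_dataset(candidate)
print(summary)
```

For the toy table this reports 4 rows, 2 columns, one missing cell out of eight, and one duplicated row. What counts as an acceptable level of missing or duplicated data depends on the project.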