INTRO TO DATA ANALYSIS WITH PYTHON
Mohamed Ramadan
Learning Objectives
We will learn about:
◦ A problem requiring data analysis.
◦ A dataset to be analyzed in Python.
◦ Overview of Python packages for data analysis.
◦ Importing and exporting data in Python.
◦ Basic insights from the dataset.
Problem: Used car appraisal
Question: Can we estimate the price of used cars?
Why Data Analysis
◦ Data is everywhere.
◦ Data analysis / data science helps us answer questions from data.
◦ Data analysis plays an important role in:
  ◦ Discovering useful information.
  ◦ Predicting the future or the unknown.
Tom wants to sell his car
How much money should Tom sell his car for?
The price he sets should not be too high, but not too low either.
Estimate used car prices
How can we help Tom determine the best price for his car?
◦ Is there data on the prices of other cars and their characteristics?
◦ What features of cars affect their prices?
  ◦ Color, brand, horsepower, or something else?
◦ Asking the right questions in terms of data.
Data on used car prices
1,158,audi,gas,turbo,four,sedan,fwd,front,105.80,192.70,71.40,55.90,3086,ohc,five,131,mpfi,3.13,3.40,8.30,140,5500,17,20,23875
0,?,audi,gas,turbo,two,hatchback,4wd,front,99.50,178.20,67.90,52.00,3053,ohc,five,131,mpfi,3.13,3.40,7.00,160,5500,16,22,?
2,192,bmw,gas,std,two,sedan,rwd,front,101.20,176.80,64.80,54.30,2395,ohc,four,108,mpfi,3.50,2.80,8.80,101,5800,23,29,16430
0,192,bmw,gas,std,four,sedan,rwd,front,101.20,176.80,64.80,54.30,2395,ohc,four,108,mpfi,3.50,2.80,8.80,101,5800,23,29,16925
0,188,bmw,gas,std,two,sedan,rwd,front,101.20,176.80,64.80,54.30,2710,ohc,six,164,mpfi,3.31,3.19,9.00,121,4250,21,28,20970
https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data
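The raw sample is easier to reason about once it is split into records. The sketch below is only an illustration using Python's standard csv module. It assumes that each record occupies one line with 26 comma-separated fields and that "?" marks a missing value; the variable names are chosen here and are not part of the slides.

```python
import csv
import io

# Two records from the sample, one per line (assumed layout of the file).
raw = (
    "1,158,audi,gas,turbo,four,sedan,fwd,front,105.80,192.70,71.40,55.90,"
    "3086,ohc,five,131,mpfi,3.13,3.40,8.30,140,5500,17,20,23875\n"
    "0,?,audi,gas,turbo,two,hatchback,4wd,front,99.50,178.20,67.90,52.00,"
    "3053,ohc,five,131,mpfi,3.13,3.40,7.00,160,5500,16,22,?\n"
)

rows = list(csv.reader(io.StringIO(raw)))

for row in rows:
    # Positions (0-based) where the value is the "?" placeholder.
    missing = [i for i, value in enumerate(row) if value == "?"]
    print(len(row), "fields; missing at positions", missing)
```

The second record has no normalized-losses value and no price, which is the kind of gap that has to be handled before any analysis.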
Attributes in the dataset
No.  Attribute name      Attribute range
1    symboling           -3, -2, -1, 0, 1, 2, 3
2    normalized-losses   continuous from 65 to 256
3    make                audi, bmw, etc.
4    fuel-type           diesel, gas
5    aspiration          std, turbo
6    num-of-doors        four, two
7    body-style          sedan, hatchback, etc.
8    drive-wheels        4wd, fwd, rwd
9    engine-location     front, rear
10   wheel-base          continuous from 86.6 to 120.9
11   length              continuous from 141.1 to 208.1
12   width               continuous from 60.3 to 72.3
13   height              continuous from 47.8 to 59.8
14   curb-weight         continuous from 1488 to 4066
15   engine-type         dohc, dohcv, etc.
16   num-of-cylinders    eight, five, four, six, three, twelve, two
17   engine-size         continuous from 61 to 326
18   fuel-system         1bbl, 2bbl, 4bbl, idi, etc.
19   bore                continuous from 2.54 to 3.94
20   stroke              continuous from 2.07 to 4.17
21   compression-ratio   continuous from 7 to 23
22   horsepower          continuous from 48 to 288
23   peak-rpm            continuous from 4150 to 6600
24   city-mpg            continuous from 13 to 49
25   highway-mpg         continuous from 16 to 54
26   price               continuous from 5118 to 45400 (target)
Python packages for data science
Scientific computing libraries in Python
1. Scientific computing libraries
◦ Pandas (data structures & tools)
◦ NumPy (arrays & matrices)
◦ SciPy (integrals, solving differential equations, optimization)
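To make the split between NumPy and pandas concrete, here is a minimal sketch. It takes the horsepower and price of three cars from the raw data sample; the derived price_per_hp column is a made-up quantity used only for illustration.

```python
import numpy as np
import pandas as pd

# Horsepower and price of three cars taken from the raw data sample.
horsepower = np.array([140, 101, 121])
price = np.array([23875, 16430, 20970])

# NumPy: element-wise arithmetic on whole arrays, no explicit loop.
price_per_hp = price / horsepower

# pandas: a labelled table (DataFrame) built from those arrays.
cars = pd.DataFrame({"horsepower": horsepower, "price": price})
cars["price_per_hp"] = price_per_hp.round(2)
print(cars)
```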
Visualization libraries in Python
2. Visualization libraries
◦ Matplotlib (plots & graphs, most popular)
◦ Seaborn (plots: heat maps, time series, violin plots)
Algorithmic libraries in Python
3. Algorithmic libraries
◦ Scikit-learn (machine learning: regression, classification, clustering, …)
◦ Statsmodels (explore data, estimate statistical models, and perform statistical tests)
Importing and Exporting Data
in Python
Importing Data
◦ The process of loading and reading data into Python from various sources.
◦ Two important properties:
  ◦ Format
    ◦ Various formats: csv, xlsx, json, hdf, etc.
  ◦ File path of the dataset
    ◦ Computer: /Desktop/mydata.csv
    ◦ Internet: https://archive.ics.uci.edu/autos/imports-85.data
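The two properties map directly onto a pandas call: the format selects the reader function and the path tells it where to look. The sketch below assumes a small, hypothetical CSV file written to a temporary directory, standing in for a local file such as /Desktop/mydata.csv.

```python
import os
import tempfile

import pandas as pd

# Hypothetical local file used only for this illustration.
tmp_dir = tempfile.mkdtemp()
local_path = os.path.join(tmp_dir, "mydata.csv")
with open(local_path, "w") as f:
    f.write("make,price\naudi,23875\nbmw,16430\n")

# Format: CSV, so the reader is read_csv. Path: a location on this computer.
df_local = pd.read_csv(local_path)
print(df_local)

# A URL is passed the same way. Left as a comment because it needs network access:
# df_remote = pd.read_csv("https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data", header=None)
```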
Importing CSV into Python
Installing pandas
$ pip install pandas
Importing CSV using pandas
import pandas as pd
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/autos/imports-85.data'
df = pd.read_csv(url)  # by default the first line of the file becomes the column names
Reading CSV without headers
df = pd.read_csv(url, header=None)  # this file has no header row, so keep every line as data
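The effect of header=None can be checked without downloading anything. This sketch feeds two records from the raw data sample to read_csv through an in-memory buffer; the buffer and the variable names are assumptions made for the illustration.

```python
import io

import pandas as pd

# Two records from the raw data sample, one per line.
raw = (
    "2,192,bmw,gas,std,two,sedan,rwd,front,101.20,176.80,64.80,54.30,"
    "2395,ohc,four,108,mpfi,3.50,2.80,8.80,101,5800,23,29,16430\n"
    "0,192,bmw,gas,std,four,sedan,rwd,front,101.20,176.80,64.80,54.30,"
    "2395,ohc,four,108,mpfi,3.50,2.80,8.80,101,5800,23,29,16925\n"
)

# Default: the first record is consumed as column names and lost as data.
with_default = pd.read_csv(io.StringIO(raw))

# header=None: every line is data and the columns are numbered 0..25.
without_header = pd.read_csv(io.StringIO(raw), header=None)

print(with_default.shape)
print(without_header.shape)
```

With the default setting one car silently disappears into the header, which is why header=None matters for this file.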
Printing the dataframe
◦ df prints the entire data frame (not recommended for large datasets).
◦ df.head(n) shows the first n rows of the data frame (n defaults to 5).
◦ df.tail(n) shows the last n rows of the data frame.
df.head()
   0  1    2       3    …  22   23     24    25
0  3  ?    AUDI    Gas  …  Rwd  Front  Mpfi  13495
1  3  ?    AUDI    Gas  …  Rwd  Front  Mpfi  16500
2  1  ?    ROMIRO  Gas  …  Fwd  Front  Mpfi  16500
3  2  164  ROMIRO  Gas  …  Fwd  Front  Mpfi  13900
4  2  164  BMW     Gas  …  4wd  front  mpfi  17450
Header: the top row (0, 1, 2, …, 25) holds the column labels. With header=None, pandas numbers the columns.
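A quick way to see what head and tail return is to try them on a small frame. The ten-row frame below is hypothetical and exists only to make the row counts visible.

```python
import pandas as pd

# Hypothetical ten-row frame; the "row" column just counts 0..9.
df = pd.DataFrame({"row": range(10)})

first_five = df.head()    # no argument: first 5 rows
first_three = df.head(3)  # first 3 rows
last_two = df.tail(2)     # last 2 rows

print(first_three)
print(last_two)
```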
Adding headers
# one name per column, in file order (26 in total)
headers = ['symboling', 'normalized-losses', 'make', 'fuel-type', 'aspiration',
           'num-of-doors', 'body-style', 'drive-wheels', 'engine-location', 'wheel-base',
           'length', 'width', 'height', 'curb-weight', 'engine-type', 'num-of-cylinders',
           'engine-size', 'fuel-system', 'bore', 'stroke', 'compression-ratio',
           'horsepower', 'peak-rpm', 'city-mpg', 'highway-mpg', 'price']
df.columns = headers
df.head(5)
   symboling  normalized-losses  make    fuel-type  …  num-of-doors  engine-location  fuel-system  price
0  3          ?                  AUDI    Gas        …  4             Front            Mpfi         13495
1  3          ?                  AUDI    Gas        …  4             Front            Mpfi         16500
2  1          ?                  ROMIRO  Gas        …  2             Front            Mpfi         16500
3  2          164                ROMIRO  Gas        …  4             Front            Mpfi         13900
4  2          164                BMW     Gas        …  4             front            mpfi         17450
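The same result can be reached in one step at read time. The sketch below is an alternative to the slide's approach, not the author's method: it uses the names and na_values parameters of read_csv so that the columns are labelled on load and "?" is read as a missing value. The two records come from the raw data sample and are read from an in-memory buffer.

```python
import io

import pandas as pd

# Column names in file order, as listed in the attribute table.
headers = [
    "symboling", "normalized-losses", "make", "fuel-type", "aspiration",
    "num-of-doors", "body-style", "drive-wheels", "engine-location",
    "wheel-base", "length", "width", "height", "curb-weight", "engine-type",
    "num-of-cylinders", "engine-size", "fuel-system", "bore", "stroke",
    "compression-ratio", "horsepower", "peak-rpm", "city-mpg",
    "highway-mpg", "price",
]

raw = (
    "1,158,audi,gas,turbo,four,sedan,fwd,front,105.80,192.70,71.40,55.90,"
    "3086,ohc,five,131,mpfi,3.13,3.40,8.30,140,5500,17,20,23875\n"
    "0,?,audi,gas,turbo,two,hatchback,4wd,front,99.50,178.20,67.90,52.00,"
    "3053,ohc,five,131,mpfi,3.13,3.40,7.00,160,5500,16,22,?\n"
)

# names= labels the columns (and implies there is no header row);
# na_values="?" turns the placeholder into NaN so numeric columns stay numeric.
df = pd.read_csv(io.StringIO(raw), names=headers, na_values="?")

print(df[["make", "normalized-losses", "price"]])
print(df["price"].isna().sum(), "row(s) without a price")
```

Assigning df.columns afterwards, as on the slide, gives the same labels; the read-time version additionally avoids "?" strings ending up in numeric columns.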
Exporting a pandas data frame to CSV
Preserve your progress at any time by saving the modified dataset:
path = '/User/../mydata.csv'
df.to_csv(path)
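A save is only useful if the file reads back cleanly, so a round trip is a reasonable check. This sketch assumes a small hypothetical frame and a temporary file path. The index=False argument is a choice made here: without it, to_csv also writes the row index as an extra first column.

```python
import os
import tempfile

import pandas as pd

# Hypothetical two-row frame standing in for the modified dataset.
df = pd.DataFrame({"make": ["audi", "bmw"], "price": [23875, 16430]})

# Temporary location used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "mydata.csv")

# Write without the row index, then read the file back.
df.to_csv(path, index=False)
restored = pd.read_csv(path)

print(restored)
```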
Exporting to different formats in Python
pandas pairs each supported format with a read function and a write method:
Data format  Read             Write
CSV          pd.read_csv()    df.to_csv()
JSON         pd.read_json()   df.to_json()
Excel        pd.read_excel()  df.to_excel()
SQL          pd.read_sql()    df.to_sql()
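Two of the pairs in the table can be exercised with the standard library alone. The sketch below assumes a small hypothetical frame, a JSON string held in memory, and an in-memory SQLite database with a table named cars (a name chosen here). Excel is left out because to_excel needs an additional engine package such as openpyxl.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical two-row frame used for both round trips.
df = pd.DataFrame({"make": ["audi", "bmw"], "price": [23875, 16430]})

# JSON: with no path given, to_json returns the JSON text as a string.
json_text = df.to_json(orient="records")
from_json = pd.read_json(io.StringIO(json_text))

# SQL: write to, then query, an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
df.to_sql("cars", conn, index=False)
from_sql = pd.read_sql("SELECT * FROM cars", conn)
conn.close()

print(from_json)
print(from_sql)
```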
THANK YOU
Mohamed Ramadan
Contact
