Pandas
•Open-source Python Library
•Data manipulation and analysis tool using its powerful data structures.
•Created by Wes McKinney, who started the project in 2008
•Pandas can read data from a source such as a CSV file into a DataFrame, a table-like structure
•With the data in a DataFrame, you can ask statistical questions such as:
•What's the average, median, max, or min of each column?
•Does column A correlate with column B?
•What does the distribution of data in column C look like?
•Clean the data by doing things like removing missing values and
filtering rows or columns by some criteria
•Visualize the data with help from Matplotlib. Plot bars, lines,
histograms, bubbles, and more.
•Store the cleaned, transformed data back into a CSV, other file or
database
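The workflow above can be sketched roughly as follows. This is a minimal illustration, not part of the original slides: the CSV content, the column names, and the age > 30 filter are made up for the example, and the data is read from an in-memory string so the snippet runs without any file on disk.

```python
import io
import pandas as pd

# Hypothetical CSV content, inlined so the example is self-contained.
csv_text = """name,age,rating
Steve,32,3.45
Lia,28,2.40
Vin,45,
Katie,38,4.60
"""

df = pd.read_csv(io.StringIO(csv_text))

# Summary statistics for the numeric columns.
mean_age = df["age"].mean()
median_age = df["age"].median()
max_rating = df["rating"].max()

# Correlation between two columns (rows with missing values are skipped).
age_rating_corr = df["age"].corr(df["rating"])

# Clean: drop rows with missing values, then keep rows matching a criterion.
cleaned = df.dropna()
over_30 = cleaned[cleaned["age"] > 30]

# Store the result back as CSV text (pass a file path to write to disk instead).
csv_out = over_30.to_csv(index=False)
print(csv_out)
```

For the visualization step, a call such as df["age"].plot(kind="hist") should draw a histogram, provided Matplotlib is installed.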
Pandas
• From command prompt
•pip install pandas
• From Jupyter notebook
• !pip install pandas
•Then import pandas in the notebook before using it (pd is the conventional alias):
• import pandas as pd
Pandas
•Pandas has historically provided three data structures (Panel was deprecated in version 0.20 and removed in 0.25):
•Series
•DataFrame
•Panel
Data Structure   Description
Series           1D labeled, homogeneous array; size-immutable.
DataFrame        2D labeled, size-mutable tabular structure with potentially heterogeneously typed columns.
Panel            3D labeled, size-mutable array.
Pandas
Series
Homogeneous data
Size Immutable
Values of Data Mutable
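Two of the Series properties listed above can be illustrated with a short sketch. The values and labels here are invented for the example.

```python
import pandas as pd

# Homogeneous: mixed int/float input is stored under one common dtype (float64).
mixed = pd.Series([1, 2.5, 3])

# Values are mutable: an element can be changed in place,
# while the labels and the length stay the same.
s = pd.Series([1, 2, 3], index=["a", "b", "c"])
s["a"] = 10
```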
DataFrame is a two-dimensional array with heterogeneous data.
For example,
Name    Age   Gender   Rating
Steve   32    Male     3.45
Lia     28    Female   2.40
The data is represented in rows and columns. Each column represents an attribute
and each row represents a person.
Data Type of Columns
The data types of the four columns are as follows:
Column   Type
Name     String
Age      Integer
Gender   String
Rating   Float
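As a sketch, the example table can be rebuilt as a DataFrame and its column types inspected. The variable names are chosen for illustration only.

```python
import pandas as pd

# Rebuild the two-row example table; each dict key becomes a column.
df = pd.DataFrame({
    "Name": ["Steve", "Lia"],
    "Age": [32, 28],
    "Gender": ["Male", "Female"],
    "Rating": [3.45, 2.40],
})

# One dtype per column: integer for Age, float for Rating, and a
# string/object dtype for the text columns (the exact name depends on
# the pandas version).
print(df.dtypes)

# Each row describes one person; select the first row by position.
first_person = df.iloc[0]
```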
Pandas
DataFrame:
•Key Points
•Heterogeneous data
•Size Mutable
•Data Mutable
•Panel
A Panel is a three-dimensional data structure with heterogeneous data. It is hard to
depict graphically, but it can be pictured as a container of DataFrames.
Key Points
Heterogeneous data
Size Mutable
Data Mutable
Pandas
•Series is a one-dimensional labeled array capable of holding data of any type (integer,
string, float, python objects, etc.).
•The axis labels are collectively called index.
A pandas Series can be created using the following constructor:
pandas.Series(data, index, dtype, copy)
The parameters of the constructor are as follows:
Sr.No   Parameter & Description
1       data : takes various forms, such as an ndarray, list, dict, or scalar constant.
2       index : values must be hashable (they need not be unique) and the same length as data. Defaults to a RangeIndex (0, 1, ..., n-1) if no index is passed.
3       dtype : the data type. If None, the data type is inferred.
4       copy : whether to copy the input data. Default False.
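A minimal sketch of the four constructor parameters in use. The values and labels are invented for illustration.

```python
import numpy as np
import pandas as pd

# data as a list, with explicit labels and an explicit dtype.
s = pd.Series(data=[10, 20, 30], index=["a", "b", "c"], dtype="float64")

# Values are looked up by label; the labels together form the index.
value_b = s["b"]

# Index labels must be hashable but may repeat.
dup = pd.Series([1, 2, 3], index=["x", "x", "y"])
x_values = dup["x"]  # selecting a repeated label returns a Series

# copy=True asks pandas to copy the input array rather than share it.
arr = np.array([1, 2, 3])
copied = pd.Series(arr, copy=True)
arr[0] = 99  # does not affect `copied`
```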
Pandas
A series can be created using various inputs like −
Array
Dict
Scalar value or constant
Create a Series from ndarray
If data is an ndarray, then the index passed must be of the same length. If no index is
passed, the default index is range(n), where n is the array length, i.e.,
[0, 1, 2, ..., len(array) - 1].
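The three kinds of input listed above might be used as follows. This is a sketch with invented values, not an exhaustive list of what the constructor accepts.

```python
import numpy as np
import pandas as pd

# From an ndarray with no index: labels default to 0..len(array)-1.
arr = np.array(["a", "b", "c", "d"])
s_default = pd.Series(arr)

# From an ndarray with an explicit index of the same length.
s_labeled = pd.Series(arr, index=[100, 101, 102, 103])

# From a dict: the keys become the index labels.
s_dict = pd.Series({"a": 0.0, "b": 1.0, "c": 2.0})

# From a scalar: the value is repeated for every label in the index.
s_scalar = pd.Series(5, index=[0, 1, 2, 3])
```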
Pandas
•A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular
fashion in rows and columns.
•Features of DataFrame
•Columns can be of different types
•Size-mutable
•Labeled axes (rows and columns)
•Can perform arithmetic operations on rows and columns
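The features above can be sketched on a small DataFrame. The column names, row labels, and values are assumptions made for the example.

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [1, 2, 3], "b": [10.0, 20.0, 30.0]},  # an integer and a float column
    index=["r1", "r2", "r3"],                   # row labels
)

# Size-mutable: a new column can be added in place.
df["total"] = df["a"] + df["b"]  # element-wise arithmetic between columns

# Arithmetic along each axis.
column_sums = df.sum(axis=0)  # one value per column
row_sums = df.sum(axis=1)     # one value per row

# Columns can also be removed again.
df = df.drop(columns=["total"])
```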
Pandas
•pandas.DataFrame
•A pandas DataFrame can be created using the following constructor:
•pandas.DataFrame(data, index, columns, dtype, copy)
•Create DataFrame
•A pandas DataFrame can be created using various inputs like:
•Lists
•dict
•Series
•NumPy ndarrays
•Another DataFrame
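One possible way to build a DataFrame from each of the inputs listed above. All names and values are invented for illustration.

```python
import numpy as np
import pandas as pd

# From a list of lists, naming the columns explicitly.
from_lists = pd.DataFrame([["Alex", 10], ["Bob", 12]], columns=["Name", "Age"])

# From a dict of equal-length lists: keys become column names.
from_dict = pd.DataFrame({"Name": ["Tom", "Jack"], "Age": [28, 34]})

# From a dict of Series: rows are aligned on the union of the indexes,
# and missing entries become NaN.
from_series = pd.DataFrame({
    "one": pd.Series([1, 2, 3], index=["a", "b", "c"]),
    "two": pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"]),
})

# From a 2D NumPy ndarray.
from_array = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["x", "y"])

# From another DataFrame; copy=True gives an independent copy.
from_frame = pd.DataFrame(from_dict, copy=True)
```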
