Series and DataFrame in Pandas
Dr.R.SUNDAR
CSE DEPT
MITS
pandas in Series and data frame
• In pandas, the primary data structure used to store and manipulate data is called
a "DataFrame."
• A DataFrame is a two-dimensional, tabular data structure with labeled axes (rows
and columns). It is similar to a spreadsheet or a SQL table.
• Each column in a DataFrame can have a different data type, and you can perform
various operations like filtering, grouping, aggregation, and more on the data
stored in a DataFrame.
• Pandas also provides another data structure called a "Series," which is essentially
a one-dimensional array-like object but with an associated index.
• Series can be thought of as a single column of a DataFrame.
• These data structures, DataFrame and Series, are powerful tools for data
manipulation and analysis in Python, and they are an essential part of the pandas
library.
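The relationship between the two structures can be sketched as follows. This is a minimal illustration, and the column names and values are made up for the example: each column has its own data type, and selecting one column gives back a Series that shares the DataFrame's row index.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "score": [1.5, 2.0, 3.5],
    "passed": [False, True, True],
})

# Each column can hold a different data type.
print(df.dtypes)

# Selecting one column returns a Series with the same row index.
score = df["score"]
print(type(score).__name__)            # Series
print(score.index.equals(df.index))    # True
```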
Python Pandas Series
1. Series from a list of characters
import pandas as pd
# a simple list of characters (named chars to avoid shadowing the built-in list)
chars = ['g', 'e', 'e', 'k', 's']
# create a Series from the list
res = pd.Series(chars)
print(res)
2. Series from a list of integers
import pandas as pd
# a simple list of integers
numbers = [1, 2, 3, 4, 5]
# create a Series from the list
res = pd.Series(numbers)
print(res)
3. Series from a dictionary
import pandas as pd
# the dictionary keys become the index labels of the Series
dic = {'Id': 1013, 'Name': 'Mohe',
       'State': 'Manipur', 'Age': 24}
res = pd.Series(dic)
print(res)
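When a Series is built from a dictionary, the keys become the index labels. The short sketch below, using a made-up record, shows how to inspect the index and look up a value by label.

```python
import pandas as pd

# Hypothetical record, used only for illustration.
record = {"Id": 1013, "Name": "Mohe", "Age": 24}
ser = pd.Series(record)

print(list(ser.index))   # ['Id', 'Name', 'Age']
print(ser["Name"])       # Mohe
print(ser.dtype)         # object, because the values mix integers and strings
```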
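4. DataFrame from a list of strings
import pandas as pd
# list of strings
lst = ['python', 'For', 'series', 'and',
       'Dataframe', 'is', 'object']
# call the DataFrame constructor on the list
df = pd.DataFrame(lst)
# display() is available in Jupyter/IPython; use print(df) in a plain script
display(df)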
• DataFrame from a dictionary of lists:
import pandas as pd
# initialise the data as a dictionary of lists
data = {'Name': ['Tom', 'nick', 'krish', 'jack'],
        'Age': [20, 21, 19, 18]}
# create the DataFrame
df = pd.DataFrame(data)
display(df)
• Selecting columns from a DataFrame:
import pandas as pd
# define a dictionary containing employee data
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Age': [27, 24, 22, 32],
        'Address': ['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
        'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}
# convert the dictionary into a DataFrame
df = pd.DataFrame(data)
# select two columns
print(df[['Name', 'Qualification']])
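The type of the result depends on how the columns are requested. As a small sketch with a shortened version of the employee data: single brackets with one name return a Series, while a list of names returns a DataFrame, even when the list holds only one name.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jai", "Princi"],
    "Age": [27, 24],
    "Qualification": ["Msc", "MA"],
})

one_col = df["Name"]                        # single name: a Series
sub_df = df[["Name"]]                       # list with one name: a DataFrame
two_cols = df[["Name", "Qualification"]]    # list with two names: a DataFrame

print(type(one_col).__name__, type(sub_df).__name__)   # Series DataFrame
print(two_cols.shape)                                   # (2, 2)
```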
• Selecting rows that satisfy a condition:
from pandas import DataFrame
# create a DataFrame
Data = {'Name': ['Mohe', 'Shyni', 'Parul', 'Sam'],
        'ID': [12, 43, 54, 32],
        'Place': ['Delhi', 'Kochi', 'Pune', 'Patna']}
df = DataFrame(Data, columns=['Name', 'ID', 'Place'])
# print the original DataFrame
print("Original data frame:\n")
display(df)
# select the rows where Name equals 'Mohe'
select_prod = df.loc[df['Name'] == 'Mohe']
print("\n")
# print the rows selected by the condition
print("Selecting rows:\n")
display(select_prod)
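The condition inside loc is itself a boolean Series with one value per row. The sketch below reuses the same data to show that mask, and then combines two conditions. The second filter is an added illustration and does not come from the slides.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Mohe", "Shyni", "Parul", "Sam"],
    "ID": [12, 43, 54, 32],
    "Place": ["Delhi", "Kochi", "Pune", "Patna"],
})

mask = df["Name"] == "Mohe"       # boolean Series, one value per row
print(mask.tolist())              # [True, False, False, False]

# Combine conditions with & (and) or | (or); wrap each one in parentheses.
selected = df.loc[(df["ID"] > 30) & (df["Place"] != "Pune")]
print(selected["Name"].tolist())  # ['Shyni', 'Sam']
```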
• Selecting columns and displaying the result:
from pandas import DataFrame
# create a DataFrame
Data = {'Name': ['Mohe', 'Shyni', 'Parul', 'Sam'],
        'ID': [12, 43, 54, 32],
        'Place': ['Delhi', 'Kochi', 'Pune', 'Patna']}
df = DataFrame(Data, columns=['Name', 'ID', 'Place'])
# print the original DataFrame
print("Original data frame:")
display(df)
print("Selected columns:")
display(df[['Name', 'ID']])
Indexing & Selecting & Filtering in Pandas
import numpy as np
import pandas as pd
obj = pd.Series(np.arange(5), index=['a', 'b', 'c', 'd', 'f'])
obj
• Working with the index in a Series
• Select a value by its index label.
obj['c']
• Select a value by its integer position. Use iloc here, because obj[2] on a Series with string labels is deprecated in recent pandas versions.
obj.iloc[2]
• Slice the data by position.
obj[0:3]
• Selecting in Series
• Select specific rows by label.
obj[['a', 'c']]
• Select specific rows by position.
obj.iloc[[0, 2]]
• Filtering in Series
• Keep the values less than 2.
obj[obj < 2]
• Slice by label. Unlike a positional slice, a label slice includes its end label.
obj['a':'c']
• Assign a value to a sliced piece.
obj['b':'c'] = 5
obj
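The difference between label-based and position-based slicing is easy to miss, so here is a minimal sketch on the same Series. It uses loc and iloc to make the intent explicit, which is a stylistic choice for this example rather than something the slides require.

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(5), index=["a", "b", "c", "d", "f"])

by_label = obj.loc["a":"c"]    # label slice: includes the end label 'c'
by_position = obj.iloc[0:2]    # positional slice: excludes position 2

print(by_label.tolist())       # [0, 1, 2]
print(by_position.tolist())    # [0, 1]
print(obj.iloc[2])             # 2
print(obj[obj < 2].tolist())   # [0, 1]
```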
Selecting in DataFrame
• To show how indexing works on a DataFrame, first create one.
data = pd.DataFrame(np.arange(16).reshape(4, 4),
                    index=['London', 'Paris', 'Berlin', 'India'],
                    columns=['one', 'two', 'three', 'four'])
data
• Select the column named two.
data['two']
• Select more than one column.
data[['one', 'two']]
• Slice the rows.
data[:3]
Filtering in DataFrame
• Keep the rows where column four is greater than 5.
data[data['four'] > 5]
• Assign 0 to every value less than 5.
data[data < 5] = 0
data
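The assignment data[data < 5] = 0 modifies data in place. A minimal sketch, assuming you would rather keep the original values, uses DataFrame.mask to build a new DataFrame instead. The name cleaned is hypothetical.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape(4, 4),
                    index=["London", "Paris", "Berlin", "India"],
                    columns=["one", "two", "three", "four"])

mask = data < 5                 # DataFrame of booleans with the same shape
cleaned = data.mask(mask, 0)    # replace values where mask is True; data is unchanged

print(cleaned.loc["London"].tolist())   # [0, 0, 0, 0]
print(cleaned.loc["Paris"].tolist())    # [0, 5, 6, 7]
print(data.loc["London"].tolist())      # [0, 1, 2, 3]
```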
Selecting with iloc and loc methods
• Use iloc to select a row by its integer position.
data.iloc[1]
• Select specific columns of that row.
data.iloc[1, [1, 2, 3]]
• Select specific columns of multiple rows.
data.iloc[[1, 3], [1, 2, 3]]
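To make the results of these iloc calls concrete, the sketch below rebuilds the DataFrame with its original values and prints what each selection returns. The positional slice in the last selection is an added illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape(4, 4),
                    index=["London", "Paris", "Berlin", "India"],
                    columns=["one", "two", "three", "four"])

row = data.iloc[1]                      # second row (Paris) as a Series
cells = data.iloc[[1, 3], [1, 2, 3]]    # rows 1 and 3, columns 1 to 3
block = data.iloc[:2, 1:3]              # positional slices exclude the end position

print(row.tolist())             # [4, 5, 6, 7]
print(cells.values.tolist())    # [[5, 6, 7], [13, 14, 15]]
print(block.values.tolist())    # [[1, 2], [5, 6]]
```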
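Selecting with loc
• The loc method selects by row and column labels rather than integer positions.
data.loc['Paris', ['one', 'two']]
• A label slice includes its end label, so this returns column four for London and Paris.
data.loc[:'Paris', 'four']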
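As a rough comparison, the same cells can be reached by label with loc or by position with iloc. The sketch below uses the DataFrame with its original values. The boolean filter at the end is an added illustration of combining a row condition with a column selection.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape(4, 4),
                    index=["London", "Paris", "Berlin", "India"],
                    columns=["one", "two", "three", "four"])

by_label = data.loc["Paris", ["one", "two"]]
by_position = data.iloc[1, [0, 1]]
print(by_label.equals(by_position))    # True

label_slice = data.loc[:"Paris", "four"]    # includes the 'Paris' row
print(label_slice.tolist())                 # [3, 7]

# loc also accepts a boolean condition for the rows.
filtered = data.loc[data["four"] > 5, ["one", "four"]]
print(filtered.index.tolist())              # ['Paris', 'Berlin', 'India']
```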
