This document covers the pandas Series and DataFrame data structures. It gives examples of creating a Series from a list or a dictionary, and a DataFrame from a list or a dictionary of lists. It demonstrates indexing, selecting, and filtering a Series by label or by position. For DataFrames, it shows how to select by position with .iloc and by label with .loc, filter rows based on conditions, and assign values. A Series is a single labeled column, while a DataFrame can hold a different data type in each column, which makes both useful for data analysis in Python.
2. pandas in Series and data frame
• In pandas, the primary data structure used to store and manipulate data is called
a "DataFrame."
• A DataFrame is a two-dimensional, tabular data structure with labeled axes (rows
and columns). It is similar to a spreadsheet or a SQL table.
• Each column in a DataFrame can have a different data type, and you can perform
various operations like filtering, grouping, aggregation, and more on the data
stored in a DataFrame.
• pandas also provides another data structure called a "Series," which is a
one-dimensional array-like object with an associated index.
• Series can be thought of as a single column of a DataFrame.
• These data structures, DataFrame and Series, are powerful tools for data
manipulation and analysis in Python, and they are an essential part of the pandas
library.
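As a minimal sketch of the points above, the following builds a small DataFrame whose columns hold different data types and pulls one column out as a Series. The column names and values are made up for illustration and do not come from the slides.

```python
import pandas as pd

# Hypothetical sample data: each column has its own data type.
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen"],   # strings
    "age": [31, 25, 40],               # integers
    "score": [88.5, 92.0, 79.5],       # floats
})

# One dtype per column.
print(df.dtypes)

# A single column is a Series that shares the DataFrame's row index.
ages = df["age"]
print(type(ages).__name__)

# Filtering and aggregation work directly on the columns.
over_30 = df[df["age"] > 30]
mean_score = df["score"].mean()
print(over_30)
print(mean_score)
```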
3. Python Pandas Series
1. import pandas as pd
• # a simple char list
• chars = ['g', 'e', 'e', 'k', 's']
• # create a Series from a char list (avoid naming it "list", which hides the built-in)
• res = pd.Series(chars)
• print(res)
2. import pandas as pd
• # a simple int list
• numbers = [1, 2, 3, 4, 5]
• # create a Series from an int list
• res = pd.Series(numbers)
• print(res)
4. pandas in Series and data frame
3. import pandas as pd
• dic = { 'Id': 1013, 'Name': 'Mohe',
'State': 'Manipur','Age': 24}
• res = pd.Series(dic)
• print(res)
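The examples so far differ mainly in how the index is built. The sketch below, using made-up values, shows the three common cases: a default integer index, explicit labels passed through the `index` argument, and dictionary keys used as labels.

```python
import pandas as pd

# From a list: pandas assigns a default integer index 0..n-1.
from_list = pd.Series([10, 20, 30])
print(list(from_list.index))

# From a list with explicit labels passed through the index argument.
labeled = pd.Series([10, 20, 30], index=["x", "y", "z"])
print(labeled["y"])

# From a dict: the keys become the index labels.
record = pd.Series({"Id": 1013, "Name": "Mohe", "Age": 24})
print(record["Name"])

# Mixed value types are stored with the generic object dtype.
print(record.dtype)
```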
4. import pandas as pd
• # list of strings
• lst = ['python', 'For', 'series', 'and',
• 'Dataframe', 'is', 'object']
• # Calling DataFrame constructor on list
• df = pd.DataFrame(lst)
• display(df)
5. pandas in Series and data frame
• import pandas as pd
• # initialise a dictionary of lists
• data = {'Name':['Tom', 'nick', 'krish', 'jack'],
'Age':[20, 21, 19, 18]}
• # Create DataFrame
• df = pd.DataFrame(data)
• display(df)
6. pandas in Series and data frame
• import pandas as pd
• # Define a dictionary containing employee data
• data = {'Name':['Jai', 'Princi', 'Gaurav', 'Anuj'],
• 'Age':[27, 24, 22, 32],
• 'Address':['Delhi', 'Kanpur', 'Allahabad', 'Kannauj'],
• 'Qualification':['Msc', 'MA', 'MCA', 'Phd']}
• # Convert the dictionary into DataFrame
• df = pd.DataFrame(data)
• # select two columns
• print(df[['Name', 'Qualification']])
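The double brackets in the last line matter. As a small sketch reusing part of the employee data above, single brackets with one name return a Series, while a list of names returns a DataFrame.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Jai", "Princi", "Gaurav", "Anuj"],
    "Age": [27, 24, 22, 32],
    "Qualification": ["Msc", "MA", "MCA", "Phd"],
})

# Single brackets with one column name return a Series.
names = df["Name"]

# Double brackets (a list of names) return a DataFrame,
# even when the list holds only one column.
names_frame = df[["Name"]]
subset = df[["Name", "Qualification"]]

print(type(names).__name__, type(names_frame).__name__)
print(subset.shape)
```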
7. pandas in Series and data frame
• from pandas import DataFrame
• # Creating a data frame
• Data = {'Name': ['Mohe', 'Shyni', 'Parul', 'Sam'],
• 'ID': [12, 43, 54, 32],
• 'Place': ['Delhi', 'Kochi', 'Pune', 'Patna']
• }
• df = DataFrame(Data, columns = ['Name', 'ID', 'Place'])
• # Print original data frame
• print("Original data frame:\n")
• display(df)
• # Select the rows where Name is 'Mohe'
• select_prod = df.loc[df['Name'] == 'Mohe']
• print("\n")
• # Print selected rows based on the condition
• print("Selecting rows:\n")
• display(select_prod)
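The row selection above works through a boolean mask. The sketch below rebuilds the same data frame, shows the mask itself, and adds a second, illustrative condition (the `ID > 30` threshold is an assumption chosen for the example).

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Mohe", "Shyni", "Parul", "Sam"],
    "ID": [12, 43, 54, 32],
    "Place": ["Delhi", "Kochi", "Pune", "Patna"],
})

# The comparison produces a boolean Series, one True/False per row.
mask = df["Name"] == "Mohe"
print(mask.tolist())

# .loc keeps only the rows where the mask is True.
print(df.loc[mask])

# Combine conditions with & (and) or | (or); each condition needs parentheses.
big_ids = df.loc[(df["ID"] > 30) & (df["Place"] != "Pune")]
print(big_ids)
```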
8. pandas in Series and data frame
• from pandas import DataFrame
• # Creating a data frame
• Data = {'Name': ['Mohe', 'Shyni', 'Parul', 'Sam'],
• 'ID': [12, 43, 54, 32],
• 'Place': ['Delhi', 'Kochi', 'Pune', 'Patna']
• }
• df = DataFrame(Data, columns = ['Name', 'ID', 'Place'])
• # Print original data frame
• print("Original data frame:")
• display(df)
• print("Selected column: ")
• display(df[['Name', 'ID']] )
9. Indexing & Selecting & Filtering in Pandas
• import numpy as np
• import pandas as pd
• obj=pd.Series(np.arange(5),index=['a','b','c','d','f'])
• obj
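• Working with the index in a Series
• Select by label.
• obj['c']
• Select by position with iloc (a plain obj[2] is deprecated in recent pandas versions when the index holds labels).
• obj.iloc[2]
• Slice the data by position.
• obj[0:3]
• Selecting in Series
• Select specific rows by label.
• obj[['a', 'c']]
• Select specific rows by position.
• obj.iloc[[0, 2]]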
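To keep label-based and position-based access apart, one option is to spell them out with `.loc` and `.iloc`. This is a sketch on the same Series as above, not the only way to write these selections.

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(5), index=["a", "b", "c", "d", "f"])

# .loc always selects by label.
by_label = obj.loc["c"]
several_labels = obj.loc[["a", "c"]]

# .iloc always selects by integer position (0-based).
by_position = obj.iloc[2]
several_positions = obj.iloc[[0, 2]]
first_three = obj.iloc[0:3]   # the end position is excluded

print(by_label, by_position)
print(several_labels.tolist(), several_positions.tolist())
print(first_three.tolist())
```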
10. Filtering in Series
• obj[obj < 2]  # values less than 2
• Slice the values by label (both endpoints are included).
• obj['a':'c']
• Assign a value to the sliced piece.
• obj['b':'c'] = 5
• obj
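The steps on this slide can be sketched as one runnable block. It also contrasts a label slice, which includes the end label, with a positional slice, which excludes the end position.

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(5), index=["a", "b", "c", "d", "f"])

# Boolean filtering keeps the entries where the condition is True.
small_values = obj[obj < 2].tolist()
print(small_values)

# Label slices include both endpoints ...
label_values = obj["a":"c"].tolist()
# ... while positional slices exclude the end position.
position_values = obj.iloc[0:2].tolist()
print(label_values, position_values)

# Assigning to a label slice changes obj in place.
obj["b":"c"] = 5
final_values = obj.tolist()
print(final_values)
```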
11. Selecting in DataFrame
• To show how to index a DataFrame, first create one.
• data = pd.DataFrame(np.arange(16).reshape(4, 4),
index=['London', 'Paris', 'Berlin', 'India'],
columns=['one', 'two', 'three', 'four'])
• data
• Select the column named two.
• data['two']
• select more than one column
• data[['one','two']]
• slice the rows.
• data[:3]
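Square brackets on a DataFrame behave differently depending on what goes inside them. As a sketch on the same frame: a label selects a column, while a slice selects rows by position.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=["London", "Paris", "Berlin", "India"],
    columns=["one", "two", "three", "four"],
)

# A column label inside [] selects a column (returned as a Series).
col_two = data["two"]
print(col_two.tolist())

# A slice inside [] selects rows by position instead.
first_rows = data[:3]
print(first_rows.index.tolist())

# Chaining the two picks the columns first, then the rows.
top_left = data[["one", "two"]][:2]
print(top_left)
```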
12. Filtering in DataFrame
• data[data['four']>5]
• Assign a value to the elements that match a condition.
• data[data < 5] = 0
• data
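The two operations on this slide differ in one respect: filtering returns a new object, while assignment through a boolean frame changes `data` in place. The sketch below shows both, and uses `DataFrame.mask` as one possible way to get the replaced values without touching the original.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=["London", "Paris", "Berlin", "India"],
    columns=["one", "two", "three", "four"],
)

# Keep the rows where column 'four' is greater than 5.
rows = data[data["four"] > 5]
print(rows.index.tolist())

# Comparing the whole frame gives a boolean DataFrame of the same shape.
below_five = data < 5

# mask() returns a new DataFrame with the matching cells replaced.
replaced = data.mask(below_five, 0)

# Assigning through the boolean frame changes data itself.
data[below_five] = 0
print(data)
```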
• Selecting with iloc and loc methods
• Use the iloc method to select a row by its integer position.
• data.iloc[1]
• Select specific columns of that row.
• data.iloc[1, [1, 2, 3]]
• Select specific columns of multiple rows.
• data.iloc[[1, 3], [1, 2, 3]]
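The iloc calls above can be tried on a freshly built copy of the frame. Note this sketch starts from the original values, before the assignment of 0 on the previous slide, so the numbers differ from what the slides would show at this point.

```python
import numpy as np
import pandas as pd

# Fresh copy of the frame, without the earlier in-place assignment.
data = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=["London", "Paris", "Berlin", "India"],
    columns=["one", "two", "three", "four"],
)

# Second row (position 1), returned as a Series.
row = data.iloc[1]

# Columns at positions 1, 2, 3 of that row.
cells = data.iloc[1, [1, 2, 3]]

# The same columns for the rows at positions 1 and 3.
block = data.iloc[[1, 3], [1, 2, 3]]

# Positional slices also work; the end position is excluded.
corner = data.iloc[:2, :2]

print(row.tolist())
print(cells.tolist())
print(block)
print(corner)
```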
13. Filtering in DataFrame
• The loc method selects by label, so pass row and column names.
• data.loc['Paris', ['one', 'two']]
• data.loc[:'Paris', 'four']  # label slices include the end label
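For comparison, here is a sketch of the loc calls next to their positional equivalent, again on a fresh copy of the frame with its original values. The last line, which combines a row condition with a column list, is an extra illustration not taken from the slides.

```python
import numpy as np
import pandas as pd

# Fresh copy of the frame, without the earlier in-place assignment.
data = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=["London", "Paris", "Berlin", "India"],
    columns=["one", "two", "three", "four"],
)

# Row 'Paris', columns 'one' and 'two'.
by_name = data.loc["Paris", ["one", "two"]]

# Label slice: every row up to and including 'Paris', column 'four'.
up_to_paris = data.loc[:"Paris", "four"]

# The positional equivalent needs an end of 2 to cover the same two rows.
same_by_position = data.iloc[:2, 3]

# loc also accepts a boolean condition for the rows.
filtered = data.loc[data["four"] > 5, ["one", "four"]]

print(by_name.tolist())
print(up_to_paris.tolist(), same_by_position.tolist())
print(filtered)
```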