2. What is Pandas?
Pandas is an open-source Python library that provides high-performance,
easy-to-use data structures and data analysis tools for the Python
programming language.
Pandas is well suited for different kinds of data, such as:
a. Ordered and unordered time series data
b. Unlabeled data
c. Any other form of observational or statistical data sets
Data Structures in Pandas
1. Series (1D)
2. DataFrame (2D)
3. Series
A Series is a one-dimensional labeled array that can hold any type of data.
You can create a Series by using the following constructor:
pandas.Series(data, index, dtype, copy)
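import pandas as pd
# An empty Series; the dtype is given explicitly so the result does not depend on the pandas version
s = pd.Series(dtype="float64")
print(s)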
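As a minimal sketch of how the constructor arguments fit together, the example below passes data, index and dtype by keyword and leaves copy at its default. The values and labels are made up for illustration.

```python
import pandas as pd

# data: the values; index: one label per value; dtype: the element type.
# copy is left at its default here.
s = pd.Series(data=[10, 20, 30], index=["x", "y", "z"], dtype="float64")

print(s)
print(s.dtype)
print(list(s.index))
```

The index must have exactly as many labels as there are values; otherwise pandas raises a ValueError.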
4. 1. Creating a series from a list
arr = [0,1,2,3,4]
s1 = pd.Series(arr)
s1
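2. With an ordered list as the index
# The index needs one label per element, so five labels for the five values in arr
order = [1, 2, 3, 4, 5]
s1 = pd.Series(arr, index=order)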
s1
3. With NumPy
import numpy as np
n = np.random.randn(5)
# Five labels to match the five random values
index = ['a', 'b', 'c', 'd', 'e']
s2 = pd.Series(n, index=index)
s2
5. 4. With a dict
# The dict keys become the index labels
data = {'a': 0, 'b': 1, 'c': 2}
s3 = pd.Series(data)
print(s3)
5. Modifying and Slicing
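print(s1)
# s1 has five elements, so the new index needs five labels
s1.index = ['A', 'B', 'C', 'D', 'E']
s1
# Slice the first three elements by position
a = s1[:3]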
Elements can also be added with pd.concat (Series.append was removed in pandas 2.0) and removed by label with drop.
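A minimal sketch of adding and removing elements, assuming a five-element Series with made-up labels. It uses pd.concat in place of the older append method, and drop to remove a label.

```python
import pandas as pd

s = pd.Series([0, 1, 2, 3, 4], index=["A", "B", "C", "D", "E"])

# Add an element: pd.concat joins two Series end to end.
extra = pd.Series([5], index=["F"])
longer = pd.concat([s, extra])

# Remove an element by label: drop returns a new Series
# and leaves `longer` unchanged.
shorter = longer.drop("A")

print(longer)
print(shorter)
```

Both calls return new objects, so the original Series s still has its five elements afterwards.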
Series operations: add, sub, mul and div for arithmetic, and del to delete an element.
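The operations listed above can be sketched as follows. The two Series and their labels are illustrative; the methods add, sub, mul and div match values by index label.

```python
import pandas as pd

a = pd.Series([10, 20, 30], index=["x", "y", "z"])
b = pd.Series([1, 2, 3], index=["x", "y", "z"])

total = a.add(b)       # same result as a + b
difference = a.sub(b)  # same result as a - b
product = a.mul(b)     # same result as a * b
quotient = a.div(b)    # same result as a / b

# del removes the element with the given label, in place.
del a["z"]

print(total, difference, product, quotient, a, sep="\n")
```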
To retrieve data using labels, index the Series with a label, for example s1['A'].
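A short sketch of label-based retrieval, assuming a Series with the made-up labels A to E. It shows a single label, a list of labels, and a label slice through .loc.

```python
import pandas as pd

s = pd.Series([0, 1, 2, 3, 4], index=["A", "B", "C", "D", "E"])

first = s["A"]           # a single label gives a scalar
subset = s[["A", "C"]]   # a list of labels gives a Series
span = s.loc["B":"D"]    # a label slice includes both endpoints

print(first)
print(subset)
print(span)
```

Unlike positional slicing such as s[:3], a label slice keeps the end label.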
6. DataFrame
A DataFrame is a two-dimensional labeled data
structure in which data is arranged in rows
and columns. You can create a
DataFrame using the following constructor:
pandas.DataFrame(data, index, columns, dtype, copy)
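To make the constructor arguments concrete, here is a minimal sketch with made-up values: data supplies the rows, index the row labels, columns the column labels, and dtype the element type. copy is left at its default.

```python
import pandas as pd

df = pd.DataFrame(
    data=[[1, 2], [3, 4], [5, 6]],   # one inner list per row
    index=["r1", "r2", "r3"],        # row labels
    columns=["a", "b"],              # column labels
    dtype="float64",                 # element type for all columns
)

print(df)
print(df.shape)
```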
7. Basic Operations on DataFrames
1. Create a DataFrame from a list
data = [1, 2, 3, 4, 5]
x = pd.DataFrame(data)
print(x)
2. 2D DataFrame
# Each inner list is one row: [name, age]
data = [['John', 22], ['carter', 25], ['herold', 33]]
x = pd.DataFrame(data, columns=['Name', 'Age'])
print(x)
3. From a dictionary of Series
d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
     'two': pd.Series([4, 5, 6, 7], index=['a', 'b', 'c', 'd'])}
x = pd.DataFrame(d)
print(x)
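In the dictionary-of-Series example, the two Series have different index labels. The sketch below rebuilds the same DataFrame and checks how pandas handles the missing label: the rows are the union of both indexes, and column 'one' gets NaN at 'd'.

```python
import pandas as pd

d = {
    "one": pd.Series([1, 2, 3], index=["a", "b", "c"]),
    "two": pd.Series([4, 5, 6, 7], index=["a", "b", "c", "d"]),
}
x = pd.DataFrame(d)

# True where column 'one' has no value for that row label.
missing = x["one"].isna()

print(x)
print(missing)
```

Because NaN is a floating-point value, column 'one' is stored as float64 even though its inputs were integers.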
8. Practice Program
Write a Pandas program to join the two given DataFrames along columns and assign
all data. Suppose you have:
student_data1:
student_id  name  marks
01          john  40
02          aza   50
03          zen   20
student_data2:
student_id  name  marks
04          ditu  10
05          mark  30
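One possible approach to the exercise, offered as a sketch rather than the intended solution, is pd.concat. It assumes student_id is stored as a string to keep the leading zeros. axis=1 joins along columns as the task asks; axis=0 is shown for comparison because the two tables share the same columns.

```python
import pandas as pd

student_data1 = pd.DataFrame({
    "student_id": ["01", "02", "03"],
    "name": ["john", "aza", "zen"],
    "marks": [40, 50, 20],
})
student_data2 = pd.DataFrame({
    "student_id": ["04", "05"],
    "name": ["ditu", "mark"],
    "marks": [10, 30],
})

# axis=1 places the frames side by side, matching rows by index.
# student_data2 has no row 2, so its columns are NaN there.
side_by_side = pd.concat([student_data1, student_data2], axis=1)

# axis=0 stacks the rows; ignore_index renumbers them 0..4.
stacked = pd.concat([student_data1, student_data2], axis=0, ignore_index=True)

print(side_by_side)
print(stacked)
```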