This document discusses importing and interfacing CSV files with Python Pandas DataFrames. It explains that Pandas DataFrames allow for querying and calculations on tabular data, and that CSV files are commonly used to store scientific data with values separated by commas. It then demonstrates how to import CSV files into DataFrames using the Pandas read_csv function, specifying options such as the file path, column separator, and header row. It also shows how to export DataFrames to CSV files using the to_csv method.
2. Pandas DataFrame and CSV files: Interfacing
• A Pandas DataFrame is a two-dimensional
structure with rows and columns.
• Due to its inherent tabular structure, we
can query and run calculations on pandas
DataFrames across an entire row, an entire
column, or a specific cell or series of cells
based on either location or attribute
values.
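The kinds of selection listed above can be sketched as follows. This is a minimal illustration, assuming a small hypothetical marks table; the names and values are invented and do not come from the file used later in the slides.

```python
import pandas as pd

# Hypothetical marks table, invented for illustration
df = pd.DataFrame(
    {"Name": ["Asha", "Ravi", "Meena"], "Math": [22, 18, 25], "Science": [20, 24, 19]}
)

row = df.iloc[1]             # an entire row, selected by position
column = df["Math"]          # an entire column, selected by label
cell = df.loc[2, "Science"]  # one cell, selected by row label and column label
high = df[df["Math"] > 20]   # a series of rows, selected by attribute value

print(row["Name"])
print(column.sum())
print(cell)
print(high["Name"].tolist())
```

Here iloc works on location, while loc and the boolean filter work on labels and values.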
• CSV files are a very common file format
used to collect and organize scientific
data.
• CSV files use commas (or some other
delimiter such as tabs or semicolons)
to separate values.
• CSV files also support labeled names for
the columns, referred to as headers. This
means that CSV files can easily support
multiple columns of related data.
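To make the structure of a CSV file concrete, here is a small sketch that parses CSV text with Python's built-in csv module. The file contents are hypothetical and held in a string, so no file on disk is assumed.

```python
import csv
import io

# Hypothetical file contents; the first line is the header row
text = "Roll_no,Name,Math\n1,Asha,22\n2,Ravi,18\n"

rows = list(csv.reader(io.StringIO(text)))
header, records = rows[0], rows[1:]
print(header)
print(records)

# The same data with semicolons as the delimiter
semi_text = text.replace(",", ";")
semi_rows = list(csv.reader(io.StringIO(semi_text), delimiter=";"))
print(semi_rows == rows)
```

Every value is read as text at this stage. Converting columns to numbers is one of the things pandas does for us when it reads the file.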
3. CSV files: Comma Separated Values
We will learn how to import tabular data from text files (.csv) into pandas
dataframes, so we can take advantage of the benefits of working with pandas
dataframes.
5. Importing CSV files using Pandas DataFrame
import pandas as pd
# read_csv is the command to read a CSV file; its argument is the storage path of the file
result = pd.read_csv("C:/Users/Neeru/Desktop/Docs/b.csv")
# result is the DataFrame
print(result)
6. read_csv : Syntax
DataFrame = pd.read_csv("filepath", sep=",", header=0)
• filepath is the name of the comma separated data file along with its path.
• sep specifies whether the values are separated by comma, semicolon, tab, or any
other character. The default value for sep is a comma.
• header specifies the number of the row whose values are to be used as the
column names. It also marks the start of the data to be fetched. Rows are counted
from 0. By default the first row is used, which is equivalent to header=0, unless
names is passed.
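As a quick check of the syntax above, the sketch below reads the same text twice, once with the defaults and once with sep and header written out. It assumes hypothetical CSV text wrapped in io.StringIO in place of a real file path.

```python
import io
import pandas as pd

# Hypothetical CSV text standing in for a file on disk
csv_text = "Roll_no,Name,Math\n1,Asha,22\n2,Ravi,18\n"

default_df = pd.read_csv(io.StringIO(csv_text))
explicit_df = pd.read_csv(io.StringIO(csv_text), sep=",", header=0)

print(list(default_df.columns))
print(default_df.equals(explicit_df))
```

Both calls give the same DataFrame, so sep and header only need to be written out when the file differs from the defaults.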
7. Variation in sep parameter
import pandas as pd
# sep="," states the comma separator explicitly (it is also the default)
result = pd.read_csv("C:/Users/Neeru/Desktop/Docs/b.csv", sep=",", header=0)
print(result)
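The sep parameter matters most when the file does not use commas. A minimal sketch, assuming hypothetical semicolon-separated text, shows what happens with and without the right separator.

```python
import io
import pandas as pd

# Hypothetical semicolon-separated text
semi_text = "Roll_no;Name;Math\n1;Asha;22\n2;Ravi;18\n"

right = pd.read_csv(io.StringIO(semi_text), sep=";")
wrong = pd.read_csv(io.StringIO(semi_text))  # default comma separator

print(right.shape)
print(wrong.shape)
```

With the default separator each line is read as a single value, so the DataFrame ends up with one column. For tab-separated files the corresponding setting is sep="\t".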
8. Variation in the header parameter
import pandas as pd
# header=1 takes the column names from the second row; the row above it is skipped
result = pd.read_csv("C:/Users/Neeru/Desktop/Docs/b.csv", sep=",", header=1)
print(result)
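The effect of changing header can be seen by reading the same text with header=0 and header=1. The text below is hypothetical and only meant to show how the rows shift.

```python
import io
import pandas as pd

# Hypothetical CSV text with the column names in row 0
csv_text = "Roll_no,Name,Math\n1,Asha,22\n2,Ravi,18\n3,Meena,25\n"

first_row_header = pd.read_csv(io.StringIO(csv_text), header=0)
second_row_header = pd.read_csv(io.StringIO(csv_text), header=1)

print(list(first_row_header.columns), len(first_row_header))
print(list(second_row_header.columns), len(second_row_header))
```

With header=1 the original names are dropped, the first record supplies the column names, and only the rows after it are kept as data. This setting is mainly useful when a file has an extra line above the real header.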
9. Using the names parameter
We can specify our own column names using the parameter names while creating
the DataFrame using the read_csv() function.
import pandas as pd
result = pd.read_csv("C:/Users/Neeru/Desktop/Docs/b.csv", header=0,
                     names=["Roll_no", "Studname", "Math", "English", "Science", "Arts"])
print(result)
Note: We need to specify header=0 here. Otherwise the file's own header row
is read as the first row of data, and there will be a double heading.
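The double heading can be reproduced with a short sketch. It assumes hypothetical CSV text whose own header uses the short names R, N and M, which are then replaced through names.

```python
import io
import pandas as pd

# Hypothetical CSV text with short column names in its header row
csv_text = "R,N,M\n1,Asha,22\n2,Ravi,18\n"
new_names = ["Roll_no", "Studname", "Math"]

replaced = pd.read_csv(io.StringIO(csv_text), header=0, names=new_names)
doubled = pd.read_csv(io.StringIO(csv_text), names=new_names)

print(replaced.shape)
print(doubled.shape)
print(doubled.iloc[0].tolist())
```

Without header=0 the old header line shows up as the first data row, and the numeric columns are read as text because of it.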
10. Changing Index column
import pandas as pd
# index_col uses the named column as the row index instead of the default 0, 1, 2, ...
result = pd.read_csv("C:/Users/Neeru/Desktop/Docs/b.csv", index_col='Enroll_no')
print(result)
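Once a column is used as the index, rows can be looked up by its values. A minimal sketch, assuming hypothetical text with an Enroll_no column:

```python
import io
import pandas as pd

# Hypothetical CSV text; Enroll_no is assumed to hold unique values
csv_text = "Enroll_no,Name,Math\n101,Asha,22\n102,Ravi,18\n"

df = pd.read_csv(io.StringIO(csv_text), index_col="Enroll_no")

print(df.index.tolist())
print(list(df.columns))
print(df.loc[102, "Name"])
```

The index column no longer appears among the regular columns, and loc now accepts enrolment numbers as row labels.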
11. Using DataFrame functions
import pandas as pd
result = pd.read_csv("C:/Users/Neeru/Desktop/Docs/b.csv")
print(result.head(2))  # first two rows
print(result.tail(2))  # last two rows
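The behaviour of head and tail can be checked on a small hypothetical DataFrame with six rows, built in memory so that no file is needed.

```python
import pandas as pd

# Hypothetical six-row table
df = pd.DataFrame({"Roll_no": [1, 2, 3, 4, 5, 6], "Math": [22, 18, 25, 20, 15, 19]})

print(df.head(2)["Roll_no"].tolist())  # first two rows
print(df.tail(2)["Roll_no"].tolist())  # last two rows
print(len(df.head()))                  # with no argument, five rows are returned
```

Both functions return a new DataFrame and leave the original unchanged.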
14. Adding a new column
import pandas as pd
result = pd.read_csv("C:/Users/Neeru/Desktop/Docs/b.csv")
# The list needs one value per existing row
result["Marks6"] = [22, 11, 13, 15]
print(result)
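A new column can also be calculated from existing ones, and a list of the wrong length is rejected. The sketch below assumes a hypothetical four-row table.

```python
import pandas as pd

# Hypothetical four-row table
df = pd.DataFrame({"Math": [22, 21, 14, 20], "Science": [25, 22, 21, 18]})

# A column computed from two existing columns
df["Total"] = df["Math"] + df["Science"]
print(df["Total"].tolist())

# A list with three values does not fit a table with four rows
length_error = False
try:
    df["Extra"] = [1, 2, 3]
except ValueError:
    length_error = True
print(length_error)
```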
15. Adding a new row
import pandas as pd
result = pd.read_csv("C:/Users/Neeru/Desktop/Docs/b.csv")
# The list needs one value per column; label 4 adds a new row if it is not already in the index
result.loc[4] = [22, 24, 25, 20.20]
print(result)
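Two ways of adding rows can be sketched as follows, assuming a small hypothetical table with the default 0, 1, 2, ... index. Using len(df) as the label avoids overwriting an existing row, and pd.concat is an alternative when several rows arrive together.

```python
import pandas as pd

# Hypothetical two-row table with the default index
df = pd.DataFrame({"Name": ["Asha", "Ravi"], "Math": [22, 18]})

# Add one row at the next free label
df.loc[len(df)] = ["Meena", 25]

# Add rows held in another DataFrame and renumber the index
more = pd.DataFrame({"Name": ["Kabir"], "Math": [20]})
df = pd.concat([df, more], ignore_index=True)

print(df["Name"].tolist())
print(df.index.tolist())
```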
16. Exporting DataFrames to CSV
import pandas as pd
marksUT= {'Name':['Raman','Zuhaire','Mishti','Drovya'],
'UT':[1,2,3,4],
'Maths':[22,21,14,20],
'Science':[25,22,21,18],
'S.St':[18,17,15,22],
'Hindi':[20,22,15,17],
'Eng':[24,23,13,20]
}
# DataFrame creation
pd1 = pd.DataFrame(marksUT)
# Command to write to a CSV file; path_or_buf is the file path
pd1.to_csv(path_or_buf='C:/Users/Neeru/Desktop/Docs/dd.csv', sep=',')
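By default to_csv also writes the row index as an unnamed first column. The sketch below, which assumes a shortened hypothetical version of the marks table, writes to a string instead of a file so the result can be inspected, and then reads it back.

```python
import io
import pandas as pd

# Shortened, hypothetical version of the marks table
df = pd.DataFrame({"Name": ["Raman", "Zuhaire"], "Maths": [22, 21]})

# With no path, to_csv returns the CSV text as a string
with_index = df.to_csv()
without_index = df.to_csv(index=False)

print(with_index.splitlines()[0])
print(without_index.splitlines()[0])

# Reading the text back gives the original table
restored = pd.read_csv(io.StringIO(without_index))
print(restored.equals(df))
```

Passing index=False is a common choice when the index is just the default row numbering, since it keeps the extra column out of the exported file.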