2. STRUCTURED TEXT FILES
• Simple text files are a collection of lines, each ending with a
newline escape sequence (for example '\n').
• There is no definitive way to identify specific pieces of information
unless there is a specified format to the file.
• Ex. /etc/passwd
username:*:UID:GID:name:home path:shell
• However, there are several structured file formats:
• Tab Delimited – values separated with a tab
• CSV – values separated with a ','
• HTML/XML – values enclosed in tags, '< >'
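As a minimal sketch of why a specified format matters, the snippet below splits one line in the /etc/passwd layout on its ':' delimiter. The sample line and its values are hypothetical, made up for illustration only.

```python
# Hypothetical sample line in the /etc/passwd layout (not from a real system).
line = "jdoe:*:1001:1001:Jane Doe:/home/jdoe:/bin/bash\n"

# Drop the trailing newline, then split on the ':' delimiter.
fields = line.rstrip("\n").split(":")

# Because the format is specified, each position has a known meaning.
username, password, uid, gid, name, home, shell = fields

print(username)
print(int(uid))
print(shell)
```

Without the agreed field order, nothing in the text itself says that the third value is a UID or that the last one is a shell.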
3. COMMA SEPARATED VALUES
• Delimited files are a common format often used as an exchange
format for spreadsheets and databases.
• Each line in a CSV file represents a row in the spreadsheet.
• Usually there is a header row that denotes each of the column names.
• Since CSV files are formatted text files, each line still ends with an
end-of-line escape sequence.
ID         Term    Course  Grade
800412564  201652  ISY150  A
800798465  201652  CIS120  A
800125498  201652  CIS120  C
800174658  201652  CIS150  F
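To make the header idea concrete, here is a sketch that writes the sample table as CSV text and reads it back by column name. It uses csv.DictReader and io.StringIO from the standard library; neither appears in the original slides, so treat this as one possible approach rather than the method the slides teach.

```python
import csv
import io

# The sample table as CSV text: header line first, then one line per row.
csv_text = (
    "ID,Term,Course,Grade\n"
    "800412564,201652,ISY150,A\n"
    "800798465,201652,CIS120,A\n"
    "800125498,201652,CIS120,C\n"
    "800174658,201652,CIS150,F\n"
)

# io.StringIO lets the csv module read a string as if it were an open file.
reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)

# The header row supplies the keys, so values can be looked up by column name.
for row in rows:
    print(row["ID"], row["Course"], row["Grade"])
```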
4. MANIPULATING CSV FILES VS. PLAIN TEXT FILES
• Since CSV files are just formatted text files, the process to read them
is similar to processing text files.
• Create a file stream, create a reader/writer object, process the
reader/writer, close the stream
• When a CSV file is read in, each row is returned as a list (array).
Each element of the list is a single field value, so the line does not
need to be split manually.
• Python has a dedicated module for processing CSV files
• Code: import csv
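The four steps above (create stream, create reader/writer, process, close) can be sketched for both directions as follows. The file name grades.csv and the temporary directory are assumptions for illustration; the with statement closes the stream automatically, and newline="" follows the csv module's recommendation for Python 3.

```python
import csv
import os
import tempfile

# Hypothetical file in a temporary directory so nothing real is overwritten.
path = os.path.join(tempfile.mkdtemp(), "grades.csv")

# Write: open the stream, create the writer, write rows, close (via 'with').
with open(path, "w", newline="") as out_file:
    writer = csv.writer(out_file)
    writer.writerow(["ID", "Term", "Course", "Grade"])
    writer.writerow(["800412564", "201652", "ISY150", "A"])

# Read: open the stream, create the reader, collect rows, close (via 'with').
with open(path, "r", newline="") as in_file:
    reader = csv.reader(in_file)
    rows = [row for row in reader]

# Each row is already a list of fields; no manual split is needed.
print(rows[1])
```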
5. READ CSV EXAMPLE
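import csv
exFile = open('example.csv', 'r')   # create the file stream
exReader = csv.reader(exFile)       # create the reader object
for row in exReader:
    print(row)                      # each row is a list of strings
exFile.close()                      # close the stream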
import csv
exFile = open('example.csv', 'r')
exReader = csv.reader(exFile)
exReader = list(exReader)           # load all rows so they can be indexed
for i in range(0, 10, 1):           # first 10 rows; assumes at least 10 exist
    print(exReader[i])
exFile.close()
7. PROCESS CSV FILES IN A DIRECTORY EXAMPLE
import csv, os
docDir = os.path.expanduser('~/Documents')  # os.listdir does not expand '~'
for currFile in os.listdir(docDir):
    if not currFile.endswith('.csv'):
        continue                            # skip non-CSV files
    # process csv file
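One way the "process csv file" step might be filled in is sketched below: it counts the rows in every .csv file of a directory. The function name count_rows_per_csv, the demo file names, and the temporary directory are all hypothetical; os.listdir returns bare file names, so os.path.join is used to build a full path before opening.

```python
import csv
import os
import tempfile

def count_rows_per_csv(directory):
    """Return {file name: number of rows} for every .csv file in directory."""
    counts = {}
    for curr_file in sorted(os.listdir(directory)):
        if not curr_file.endswith(".csv"):
            continue  # skip anything that is not a CSV file
        full_path = os.path.join(directory, curr_file)
        with open(full_path, "r", newline="") as ex_file:
            counts[curr_file] = sum(1 for _ in csv.reader(ex_file))
    return counts

# Demo on a throwaway directory holding one CSV file and one other file.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "a.csv"), "w", newline="") as f:
    csv.writer(f).writerows([["ID", "Grade"], ["1", "A"], ["2", "B"]])
with open(os.path.join(demo_dir, "notes.txt"), "w") as f:
    f.write("not a csv file\n")

result = count_rows_per_csv(demo_dir)
print(result)
```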
Editor's Notes
CSV vs. Excel and other spreadsheets
No types – all strings
No fonts, sizes or colors
No multiple worksheets
No cell widths or heights
No merged cells
No images or charts
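The "no types, all strings" point can be seen in a short sketch: every value the csv module returns is a str, so numeric columns must be converted explicitly. The sample row reuses the table from the earlier slide; treating ID and Term as integers is an assumption made for illustration.

```python
import csv
import io

csv_text = "ID,Term,Course,Grade\n800412564,201652,ISY150,A\n"
reader = csv.reader(io.StringIO(csv_text))

header = next(reader)  # first line: column names
row = next(reader)     # second line: one data row

# Every field comes back as a string, even the ones that look numeric.
print([type(value).__name__ for value in row])

# Convert explicitly where a number is needed (assumed here for ID and Term).
student_id = int(row[0])
term = int(row[1])
print(student_id + 1)
```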