1. Computer Science 151
An introduction to the art of computing
Copy of Lists Lecture
Rudy Martinez
2. Homework 2c is posted
• Create a file (homework2c.py).
• Your program should open the homework2.csv and averagedata.csv data files.
• You may use any data structure you like; however, think about how you need to read, access,
and write the data for the statistics you need. You would be wise to reuse code you have already written.
• You will create a yearly standard deviation of TMIN and TMAX. You should use the following equations to
compute the average ($\bar{x}$) and the standard deviation ($\sigma$).
• $\bar{x} = \frac{\sum x}{n}$ and $\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
• Write a comma separated value (CSV) file to disk (statistics.csv) that holds the following data:
• Year, avgTmax, avgTmin, StdDevMax, StdDevMin
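The two equations above can be checked by hand on a handful of numbers. The following is a minimal sketch using made-up values; the function names `mean` and `sample_std_dev` are illustrative and are not part of the assignment.

```python
import math


def mean(values):
    """Average: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


def sample_std_dev(values):
    """Standard deviation with n - 1 in the denominator."""
    avg = mean(values)
    squared_diffs = 0.0
    for x in values:
        squared_diffs += (x - avg) ** 2
    return math.sqrt(squared_diffs / (len(values) - 1))


# Made-up values, for illustration only
temps = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
avg_temp = mean(temps)
std_temp = sample_std_dev(temps)
print(avg_temp, std_temp)
```

Because the denominator is n - 1, a year needs at least two readings before its standard deviation can be computed.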
3/26/2019CS151SP19
3. Homework 2c
• You will be given code to create a graph; however, you must make sure that your data is in the correct
format to be understood by the graphing code.
• You will write a report on your observations of the data and the outcome, using the graph
created by your code. This should be one to two short paragraphs, not to exceed one page, on what you
think the data is saying. Approach this assignment as if you were analyzing
the data for a boss or management team.
• What can you generalize about the temperature over the time covered by the data series?
• What kind of temperatures can you project for the next 10, 20, or 30 years given this data and the standard
deviation?
• Does the data make sense in terms of what you know about Albuquerque's weather?
• You will turn in your homework2c.py file on Learn. You should also upload a PDF of your report
observations.
4. Python and Lists
• Python tries to be smart with
managing memory.
• Lists can be long so why make a
complete copy?
myList = [ 1, 2, 3, 4, 5]
myCopy = myList
print('myList', myList)
print('myCopy', myCopy)
---------------------------------------
myList [1, 2, 3, 4, 5]
myCopy [1, 2, 3, 4, 5]
5. Python and Lists
• Assignment does not copy the list. Python only makes myCopy
another reference (a 'pointer') to the same list.
• How do we know this?
myList = [ 1, 2, 3, 4, 5]
myCopy = myList
myList.append(6)
print('myList', myList)
print('myCopy', myCopy)
-------------------------------------------
myList [1, 2, 3, 4, 5, 6]
myCopy [1, 2, 3, 4, 5, 6]
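Another way to confirm that both names refer to one list is Python's `is` operator, which compares object identity rather than contents. A short sketch (the variable names here are illustrative):

```python
my_list = [1, 2, 3, 4, 5]
my_copy = my_list            # no new list is created here

# 'is' is True only when both names refer to the very same object
same_object = my_copy is my_list

my_list.append(6)            # a change made through one name...
seen_through_copy = my_copy  # ...is visible through the other

print(same_object)
print(seen_through_copy)
```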
6. Enter Deep Copy
• How do we create a real copy then?
import copy
newList = copy.deepcopy(list1)
• The above code will copy the list element by element and place
it into a whole new list
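For a flat list of numbers, a shallow copy such as `copy.copy(myList)` is also independent of the original. The difference shows up with nested lists, such as rows read from a CSV file. A small sketch with made-up rows:

```python
import copy

# Made-up rows in the form [year, tmax, tmin]
rows = [[1994, 68, 38], [1995, 48, 25]]

shallow = copy.copy(rows)      # new outer list, same inner lists
deep = copy.deepcopy(rows)     # new outer list and new inner lists

rows[0][1] = 99                # change a value inside the first row
rows.append([1996, 51, 16])    # add a row to the original only

print(shallow)  # still 2 rows, but the first row shows the 99
print(deep)     # 2 rows, untouched
```

The shallow copy shares its inner lists with the original, so only `copy.deepcopy` protects nested data from later changes.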
7. Enter Deep Copy
• Now we see that deepcopy creates
an independent copy
• How do we know this?
import copy
myList = [1, 2, 3, 4, 5]
myrealCopy = copy.deepcopy(myList)
myList.append(6)
print('myList', myList)
print('myrealCopy', myrealCopy)
-------------------------------------------
myList [1, 2, 3, 4, 5, 6]
myrealCopy [1, 2, 3, 4, 5]
8. Why does this matter?
• If a for loop adds items to a list and you store that list in a new list or dictionary,
what gets stored is a reference (pointer) to the list, not a copy of it.
9. Example
myList = [ 1, 2, 3, 4, 5]
myDictionary = {}
for i in range(0, len(myList), 1):
    myDictionary[str(i)] = myList
10. Example
myList = [ 1, 2, 3, 4, 5]
myDictionary = {}
for i in range(0, len(myList), 1):
    myDictionary[str(i)] = myList
This is what you might think is happening: each key gets its own copy of the list.
11. Example
myList = [ 1, 2, 3, 4, 5]
myDictionary = {}
for i in range(0, len(myList), 1):
    myDictionary[str(i)] = myList
This is what really happens: every key refers to the same list.
12. Example
myList = [ 1, 2, 3, 4, 5]
myDictionary = {}
for i in range(0, len(myList), 1):
    myDictionary[str(i)] = myList
# Append one more item to myList
myList.append(6)
This is what really happens: the new item shows up under every key, because they all refer to the same list.
13. Example fixed!
import copy
myList = [1, 2, 3, 4, 5]
myDictionary = {}
for i in range(0, len(myList), 1):
    myDictionary[str(i)] = copy.deepcopy(myList)
# Append one more item to myList
myList.append(6)
This is what really happens: each key holds its own copy, so the new item appears only in myList.
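The two versions of the example can be run side by side to see the difference. This sketch builds one dictionary that stores the list directly and one that stores deep copies; the names `shared` and `copies` are illustrative.

```python
import copy

my_list = [1, 2, 3, 4, 5]
shared = {}
copies = {}
for i in range(len(my_list)):
    shared[str(i)] = my_list                  # same list stored every time
    copies[str(i)] = copy.deepcopy(my_list)   # a separate list per key

my_list.append(6)

print(shared["0"])   # includes the 6
print(copies["0"])   # does not include the 6
```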
14. Calculating standard deviation
• You already have the list of averages ($\bar{x}$) in averages.csv
• You will need to get each value x from your cleaned data in homework2.csv
• Then, as when you calculated the sums to get the averages, you will need to sort the data
by year.
• You could copy your for loop that did the summation, except you need to
figure out how to store the calculation
$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
15. This is where deepcopy comes in
• In the for loop you can calculate $(x - \bar{x})^2$
sum += math.pow((myList[i][1] - myAvg), 2)
• To square a number use math.pow(number, 2)
• To cube, change 2 to 3
• myList[i][1]
• Assumes myList was loaded from the original file, homework2.csv
• Year, Tmax, Tmin
• myAvg
• You already have the averages from homework2b, in averages.csv
16. This is where deepcopy comes in
• Load the averages into a dictionary
• Year: [Tmax, Tmin]
• Load homework2.csv into a list
• Year, Tmax, Tmin
• Create a dictionary to hold each year and a list of the daily temps for that year (you need two, one for Tmax and one for Tmin)
• Year: [maxTempList]
• 1994: ['68', '38', '67', …, '25']
• 1995: ['48', '25', '51', …, '16']
• Now you can calculate standard deviation per year
• More on this later
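The grouping step above can be sketched as follows. The rows are made up, and the standard library's `statistics.stdev` (which uses the same n - 1 formula) stands in for the summation loop you may be expected to write yourself for the homework.

```python
import statistics

# Made-up rows in the form Year, Tmax, Tmin (strings, as read from a CSV file)
rows = [
    ["1994", "68", "38"],
    ["1994", "67", "40"],
    ["1994", "25", "10"],
    ["1995", "48", "25"],
    ["1995", "51", "16"],
]

max_by_year = {}
min_by_year = {}
for year, tmax, tmin in rows:
    # Start an empty list the first time a year is seen
    if year not in max_by_year:
        max_by_year[year] = []
        min_by_year[year] = []
    # Convert the strings to numbers before storing them
    max_by_year[year].append(float(tmax))
    min_by_year[year].append(float(tmin))

std_dev_max = {}
std_dev_min = {}
for year in max_by_year:
    std_dev_max[year] = statistics.stdev(max_by_year[year])
    std_dev_min[year] = statistics.stdev(min_by_year[year])

print(std_dev_max)
print(std_dev_min)
```

Each year gets a fresh list of its own, so appending to one year's list never changes another year's data.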
18. Code for graphing
This code will be provided to you; however, you will need to have your data in the correct format*:
plt.plot takes two lists and plots them.
plt.plot(yearList, maxTempList, label='TMax')
plt.plot(yearList, minTempList, label='TMin')
plt.plot(yearList, plotAvgMax, label='Max Average')
plt.plot(yearList, plotAvgMin, label='Min Average')
yearList is just a list of the years from the data file.
maxTempList and minTempList are lists of values from homework2b or averages.csv.
plotAvgMax and plotAvgMin each hold one single value; however, the plotter wants a 1:1 match with yearList, so the value is
repeated 25 times.
* This is the current format; it may change slightly.
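One way to get the data into that shape is to walk a dictionary of averages in year order and build the parallel lists. This is a sketch under assumptions: the values are made up, and `avg_dict` is assumed to map each year to [Tmax, Tmin]. The plotting calls themselves are left out so the example runs without the graphing library.

```python
# Made-up averages in the form Year: [Tmax, Tmin]
avg_dict = {
    "1994": [70.1, 42.3],
    "1995": [71.0, 43.2],
    "1996": [69.5, 41.8],
}

year_list = []
max_temp_list = []
min_temp_list = []
for year in sorted(avg_dict):
    year_list.append(int(year))
    max_temp_list.append(avg_dict[year][0])
    min_temp_list.append(avg_dict[year][1])

# Every list plotted against year_list must have the same length
lengths_match = len(year_list) == len(max_temp_list) == len(min_temp_list)
print(year_list, max_temp_list, min_temp_list, lengths_match)
```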
19. List Comprehensions
new_list = [expression for_loop_one_or_more conditions]
# We can use len(avgDict) here because we cleaned the data
avgMax = sumMax / len(avgDict)
avgMin = sumMin / len(avgDict)
# Syntactic sugar for creating a list of values all set to one value
plotAvgMax = [avgMax for i in range(len(avgDict))]
plotAvgMin = [avgMin for i in range(len(avgDict))]
We take the value avgMax and repeat it in a for loop, once for each i from 0 up to the length of avgDict.
plotAvgMax = [74.3993317088547, … , 74.3993317088547]
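To see what the comprehension is shorthand for, it helps to write the same thing as an ordinary for loop. This sketch uses a made-up average and the 25 years mentioned on the graphing slide; it also shows a comprehension with a condition, matching the general pattern at the top of this slide.

```python
avg_max = 74.4   # made-up value standing in for avgMax
n = 25           # number of years, as on the graphing slide

# List comprehension that repeats one value n times
plot_avg_max = [avg_max for i in range(n)]

# The same thing written as an ordinary for loop
loop_version = []
for i in range(n):
    loop_version.append(avg_max)

# Repeating a single value can also be done with list multiplication
repeated = [avg_max] * n

# A comprehension with a condition keeps only the items that pass the test
even_numbers = [i for i in range(10) if i % 2 == 0]

print(plot_avg_max == loop_version == repeated)
print(even_numbers)
```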