1. Computer Science 151
An introduction to the art of
computing
Graphing Lecture 2
Rudy Martinez
2. Graphing Libraries
• Matplotlib is a Python 2D plotting library which produces publication
quality figures in a variety of hardcopy formats and interactive
environments across platforms.
• De Facto standard for general graphing
• Bokeh is an interactive visualization library that targets modern web
browsers for presentation.
• Designed for high performance over large datasets.
• Seaborn is a Python data visualization library based on matplotlib. It
provides a high-level interface for drawing attractive and informative
statistical graphics.
• Many others …
4/8/2019CS151SP19
3. Pyplot
• Pyplot is a collection of command style functions that make
matplotlib work like MATLAB.
Code Example:
import matplotlib.pyplot as plt
plt.plot([1, 2, 3, 4])
plt.ylabel('Numbers')
plt.show()
4. Plotting Functions
Function / Description
pyplot.bar(x-value, y-value)
pyplot.bar([x-values], [y-values])
    Plots a single bar on the graph, or multiple bars when the x- and y-values are provided as lists.
pyplot.plot([x-coords], [y-coords])
pyplot.plot([x-coords], [y-coords], format)
    Plots a line graph. The color and style of the line can be specified with a format string.
pyplot.grid(True)
    Adds a grid to the graph.
pyplot.xlim(min, max)
pyplot.ylim(min, max)
    Sets the range of x- or y-values shown on the graph.
pyplot.title(text)
    Adds a title to the graph.
pyplot.xlabel(text)
pyplot.ylabel(text)
    Adds a label below the x-axis or to the left of the y-axis.
pyplot.legend([label1, label2, ...])
    Adds a legend for multiple lines.
pyplot.xticks([x-coord1, x-coord2, ...], [label1, label2, ...])
    Adds labels below the tick marks along the x-axis.
pyplot.yticks([y-coord1, y-coord2, ...], [label1, label2, ...])
    Adds labels to the left of the tick marks along the y-axis.
pyplot.show()
    Displays the plot.
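A minimal sketch that exercises several of the functions above on one figure. The data and labels are made up for illustration. The Agg backend is an assumption chosen so the sketch runs without a display; drop that line and call plt.show() to get an interactive window instead.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt

# Made-up sample data, for illustration only.
x_values = [1, 2, 3, 4]
squares = [1, 4, 9, 16]
doubles = [2, 4, 6, 8]

fig = plt.figure()
plt.plot(x_values, squares, "r--")  # format string: red dashed line
plt.plot(x_values, doubles, "b-")   # format string: blue solid line
plt.grid(True)
plt.xlim(0, 5)
plt.ylim(0, 20)
plt.title("Sample plot")
plt.xlabel("x")
plt.ylabel("y")
plt.legend(["squares", "doubles"])  # one label per plotted line, in order
plt.xticks(x_values, ["one", "two", "three", "four"])
plt.yticks([0, 10, 20], ["low", "mid", "high"])

# Write the PNG into memory instead of a file so the sketch leaves nothing behind.
buffer = io.BytesIO()
plt.savefig(buffer, format="png")
```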
5. Creating a Line Graph
• The following line lets us refer to matplotlib.pyplot by the shorter
name plt
import matplotlib.pyplot as plt
# Set plot size (optional)
plt.figure(figsize=(20, 10))
plt.style.use('ggplot')
6. Example Cont.
plt.plot(X, Y, color='red', label='num')
_______________________
plt.plot(yearList, Avg_Tmax, label='Tmax')
plt.plot(yearList, Avg_Tmin, label='Tmin')
• plt is our plot
• .plot() is a function that will plot
a line chart.
• X,Y is the data to be plotted,
usually a list
• color manually controls the
color of the plotted line
• label adds the given string to
the legend when plt.legend() is called.
7. Example Cont.
plt.plot(yearList, Avg_Tmax, label='Tmax')
plt.plot(yearList, Avg_Tmin, label='Tmin')
• yearList is my list of years. I
called .keys() on my dictionary,
which yields the years (wrap it in
list() to get an actual list).
• Avg_Tmax is my list of Tmax
values from averagedata.csv.
• Avg_Tmin is my list of Tmin
values, also from averagedata.csv.
• Note we are not graphing
standard deviation.
8. Example Cont.
plt.plot(yearList, Avg_Tmax, label='Tmax')
plt.plot(yearList, Avg_Tmin, label='Tmin')
plt.plot(yearList, plotAvgMax, label='Max Avg')
plt.plot(yearList, plotAvgMin, label='Min Avg')
• plotAvgMax is the average of all
the years plotted on the graph.
• It's a single value, but plt.plot()
needs one y-value for every x-value,
so the value must be repeated once
per year.
9. List Comprehension
• We need to get a single value into a list of size N.
• We could do a for loop
plotAvgMax = []
for i in range(0, len(AVGDATA), 1):
    plotAvgMax.append(avgMax)
• Every element of plotAvgMax is then a copy of avgMax.
• avgMax is the sum of all the yearly averages divided by the number of
values. (typical arithmetic mean)
10. List Comprehension
• We need to get a single value into a list of size N.
• The most Pythonic way is to use a list comprehension:
plotAvgMax = [avgMax for i in range(len(AVGDATA))]
• This then gives us what we wanted.
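A small sketch comparing the loop and the list comprehension side by side. AVGDATA and avgMax here are hypothetical stand-ins with made-up values, not the homework data. The third form, sequence repetition, is an alternative not shown in the lecture; it builds the same list when the repeated value is immutable, such as a float.

```python
# Hypothetical stand-ins for the lecture's variables (made-up values).
AVGDATA = {1995: [70.1, 43.2], 1996: [71.3, 44.0], 1997: [69.8, 42.9]}
avgMax = 70.4

# Loop version: append one copy of avgMax per entry in AVGDATA.
loop_version = []
for _ in range(len(AVGDATA)):
    loop_version.append(avgMax)

# List-comprehension version, as in the lecture.
plotAvgMax = [avgMax for _ in range(len(AVGDATA))]

# Alternative (not from the lecture): sequence repetition.
repeated = [avgMax] * len(AVGDATA)

print(plotAvgMax)  # → [70.4, 70.4, 70.4]
```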
11. Back to graphing
# Set Title
plt.title('Albuquerque, NM NOAA Values')
# Set axis labels
plt.xlabel('Year')
plt.ylabel('Temperature (F)')
# Add a legend
plt.legend()
# Save the file for your report
plt.savefig('Homework2cplot.png')
# Show the plot
plt.show()
12. Algorithm
• Create Variables
• Dictionary to hold Year: Tmax
• Dictionary to hold Year: Tmin
• Dictionary to hold average data from averagedata.csv (year: [tmax, tmin])
• List to hold all the Tmax values from homework2.csv
• List to hold all the Tmin values from homework2.csv
13. Algorithm
• Load Data from homework2.csv into my two dictionaries
• For every year
• If the year has not changed:
• Append Tmax to list
• Append Tmin to list
• Else the year has changed:
• Store Year: Tmax list (deep copy) in the Max dictionary
• Store Year: Tmin list (deep copy) in the Min dictionary
• Clear the Max list
• Clear the Min list
• Set new year
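The grouping steps above can be sketched as follows. The rows are made-up (year, tmax, tmin) tuples standing in for homework2.csv, and the variable names are illustrative. Two assumptions are worth noting: list() makes a shallow copy, which is enough here because the lists hold plain floats, and the last year has to be stored after the loop ends because no later year change triggers it.

```python
# Made-up (year, tmax, tmin) rows standing in for homework2.csv.
daily_rows = [
    (1995, 50.0, 30.0),
    (1995, 54.0, 32.0),
    (1996, 60.0, 35.0),
    (1996, 62.0, 37.0),
    (1996, 64.0, 39.0),
]

tmax_by_year = {}  # year: [daily Tmax values]
tmin_by_year = {}  # year: [daily Tmin values]
max_list = []
min_list = []
current_year = daily_rows[0][0]

for year, tmax, tmin in daily_rows:
    if year != current_year:
        # The year has changed: store copies of the finished lists,
        # clear them, and set the new year.
        tmax_by_year[current_year] = list(max_list)
        tmin_by_year[current_year] = list(min_list)
        max_list.clear()
        min_list.clear()
        current_year = year
    max_list.append(tmax)
    min_list.append(tmin)

# Store the final year, which never sees a "year has changed" event.
tmax_by_year[current_year] = list(max_list)
tmin_by_year[current_year] = list(min_list)

print(tmax_by_year)  # → {1995: [50.0, 54.0], 1996: [60.0, 62.0, 64.0]}
```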
14. Algorithm
• Need to load data from averagedata.csv into dictionary.
• For every row in averagedata:
• Update the dictionary with Year: [Tmax, Tmin], i.e. row[0]: [row[1], row[2]]
• For graphing, build a list with Avg_Tmax.append(float(row[1]))
• For graphing, build a list with Avg_Tmin.append(float(row[2]))
• Sum Tmax values
• Sum Tmin values
• Create avgMax = sumTmax/len(Avg_Tmax)
• Create avgMin = sumTmin/len(Avg_Tmin)
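A minimal sketch of the loading and averaging steps above. It assumes averagedata.csv has three columns (year, Tmax, Tmin) and no header row; the in-memory sample text and its values are made up so the sketch runs without the real file. Storing floats in the dictionary is also a choice made here for convenience.

```python
import csv
import io

# Made-up stand-in for averagedata.csv: year, Tmax, Tmin (no header row assumed).
sample_csv = """1995,70.0,42.0
1996,72.0,44.0
1997,74.0,46.0
"""

avg_data = {}  # year: [tmax, tmin]
Avg_Tmax = []  # yearly Tmax averages, for graphing
Avg_Tmin = []  # yearly Tmin averages, for graphing

for row in csv.reader(io.StringIO(sample_csv)):
    tmax = float(row[1])
    tmin = float(row[2])
    avg_data[row[0]] = [tmax, tmin]
    Avg_Tmax.append(tmax)
    Avg_Tmin.append(tmin)

# Arithmetic mean over all years.
avgMax = sum(Avg_Tmax) / len(Avg_Tmax)
avgMin = sum(Avg_Tmin) / len(Avg_Tmin)

# The dictionary keys give the x-axis values, in file order.
yearList = list(avg_data.keys())

print(avgMax, avgMin)  # → 72.0 44.0
```

With the real file, open it with open('averagedata.csv', newline='') and pass the file object to csv.reader in place of the io.StringIO object.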
15. Algorithm
• Now manual calc of Std Dev.
• For every year in Tmax dictionary (from averagedata.csv)
• For every item in Tmax list (from homework2.csv)
• Max_sum += math.pow((item - average tmax value), 2)
• Max sigma = math.sqrt(max_sum / (length of Tmax list))
• Update a new dictionary with {year: max sigma}
• Do the same for Tmin
• You should now have two dictionaries, one for Tmax and one for Tmin, each with year: sigma
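The manual calculation above can be sketched as follows for Tmax. The daily values are made up, and the sketch assumes the "average tmax value" for a year is the mean of that year's daily readings. Note that the running sum is reset for each year. Dividing by the length of the list gives the population standard deviation, which is what statistics.pstdev computes, so it serves as a check.

```python
import math
import statistics

# Made-up daily Tmax readings per year (year: [daily values]).
tmax_by_year = {1995: [50.0, 54.0], 1996: [60.0, 62.0, 64.0]}

max_sigma = {}  # year: standard deviation of that year's Tmax values
for year, values in tmax_by_year.items():
    mean = sum(values) / len(values)  # assumed: yearly average of the daily values
    max_sum = 0.0                     # reset the running sum for each year
    for item in values:
        max_sum += math.pow(item - mean, 2)
    max_sigma[year] = math.sqrt(max_sum / len(values))

# Check the manual result against the library's population standard deviation.
for year, values in tmax_by_year.items():
    assert math.isclose(max_sigma[year], statistics.pstdev(values))
```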
16. Algorithm
• Calculate std deviation using statistics package
• For every year in list of Tmax daily values
• Max sigma.update({year: statistics.stdev(Tmax[year])})
• Min sigma.update({year: statistics.stdev(Tmin[year])})
• Remember stdev takes a list. Earlier, in one of the for loops, you
should have collected every TMAX value and every TMIN value
into a TMAX dictionary with year: [list of tmax values] and a
TMIN dictionary with year: [list of tmin values]
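A short sketch of the same step with the statistics module, using made-up year: [daily values] dictionaries. One detail to keep in mind: statistics.stdev is the sample standard deviation (it divides by N - 1), so its results differ slightly from a calculation that divides by N.

```python
import statistics

# Made-up year: [daily values] dictionaries, for illustration only.
tmax_by_year = {1995: [50.0, 54.0], 1996: [60.0, 62.0, 64.0]}
tmin_by_year = {1995: [30.0, 32.0], 1996: [35.0, 37.0, 39.0]}

max_sigma = {}
min_sigma = {}
for year in tmax_by_year:
    # stdev needs at least two values per year, otherwise it raises StatisticsError.
    max_sigma.update({year: statistics.stdev(tmax_by_year[year])})
    min_sigma.update({year: statistics.stdev(tmin_by_year[year])})
```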
17. Algorithm
• Now create the graph.
• plt.plot(yearList, Avg_Tmax, label='TMax')
• plt.plot(yearList, Avg_Tmin, label='TMin')
• plt.plot(yearList, plotAvgMax, label='Max Average')
• plt.plot(yearList, plotAvgMin, label='Min Average')
• Here yearList is easy: call .keys() on the Tmax dictionary
• Avg_Tmax/Tmin were created when we read in the for loop for
average data.
• Use given list comprehension to generate plotAvgMax/Min
18. Algorithm
• Write Report!
• What is the data?
• Format
• What did you do to the data? Any formatting or cleaning?
• A short description of your methodology:
• I used a program to analyze the above data ….
• Print the graph in the report
• What is the graph telling you about 25 years of temperature data?
• You have the standard deviation. Can you tell me what that number means in terms of
temperature from year to year?
• The temperatures are just numbers. What do they really mean to you, in terms of
subjective assessment?
19. Turn in
• Homework2c.py file to learn.
• Report in pdf format.
• No longer than 1 page.
20. Final Project
• You will receive a data file (csv format) of Sunspot data.
• Utilizing what you have learned in Homework2 you will answer the
following questions in a report:
• Are there any discernable cycles for sunspots?
• If yes, what is this cycle?
• Can you prove it with a graph?
• What are the statistics for each cycle?
• Mean number per cycle
• Std deviation per cycle