Python Homework Help

For any help regarding Python Homework Help
Visit: https://www.programminghomeworkhelp.com/
Email: support@programminghomeworkhelp.com
Call: +1 678 648 4277

Quick Reference
D = { } – creates an empty dictionary
D = {key1:value1, …} – creates a non-empty dictionary
D[key] – returns the value that’s mapped to by key. (What if there’s no such key?)
D[key] = newvalue – maps newvalue to key. Overwrites any previous value.
del D[key] – deletes the mapping with that key from D.
len(D) – returns the number of entries (mappings) in D.
x in D, x not in D – checks whether the key x is in the dictionary D.
D.items( ) – returns the entries as a list of (key, value) tuples.
for k in D – iterates over all of the keys in D.
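The operations in the quick reference can be tried out on a small dictionary. The sketch below uses made-up item names and prices, and is written so that it also runs on Python 3 (where items( ) returns a view rather than a list, hence the call to sorted). It also answers the question about a missing key: the lookup raises KeyError.

```python
# Toy dictionary; the names and prices are made up for illustration.
prices = {"hammer": 10, "drill": 50}

prices["nail"] = 1           # add a new mapping
prices["hammer"] = 12        # overwrite an existing value
del prices["drill"]          # remove a mapping

count = len(prices)          # number of entries left
has_nail = "nail" in prices  # membership tests look at keys, not values
pairs = sorted(prices.items())  # (key, value) tuples, sorted for a stable order

# Looking up a missing key raises KeyError rather than returning None.
try:
    prices["saw"]
    missing_key_raises = False
except KeyError:
    missing_key_raises = True

keys_seen = sorted(k for k in prices)  # iterating a dict yields its keys
```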
Problems

Problem 1 – Inventory Finder
Download the inventory.py file. The file shows eight different items, each having a
name, a price and a count, like so:
HAMMER = "hammer"
HAMMER_PRICE = 10
HAMMER_COUNT = 100

We’re going to consider that customers generally come in with an idea of how much
money they want to spend. So we’re going to think of items as either CHEAP (under
$20), MODERATE (between $20 and $100) or EXPENSIVE (over $100).
First, fill in the variable inventory so that all of the data for the eight items is inside
inventory. Make sure that you maintain the notion of CHEAP, MODERATE and
EXPENSIVE. Then, implement the function get_items that takes a cheapness and returns
a list of information about each item that falls under that category, as the function’s
docstring describes.
Important: there should NOT be a loop inside this function. Our inventory is small, but
for a giant store, the inventory will be big. The store can’t afford to waste time looping
over all of the inventory every time a customer has a request.
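One way to meet the no-loop requirement is to group the items by category when the dictionary is built, so that a request becomes a single lookup. The sketch below is only an illustration of that idea: by_category, items_in and the three items are hypothetical and are not the lab's inventory.

```python
# Hypothetical data: two categories and three items, not the lab's inventory.
by_category = {
    "cheap": {"tape": (3, 200), "glue": (5, 80)},
    "expensive": {"ladder": (120, 4)},
}

def items_in(category):
    """Return (item, (price, count)) pairs for one category, with no loop."""
    # One dictionary lookup reaches the whole group; list() keeps the result
    # a real list on Python 3, where items() returns a view object.
    return list(by_category[category].items())

cheap_items = items_in("cheap")
```

The grouping work is paid once, when the dictionary is written, rather than on every request.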
When you’re finished, just run the program. All of the testing lines should print True.
Problem 2 – Indexing the Web, Part 2
So we have a working search engine. That’s great! But how do we know which sites are
better than others? Right now, the engine just returns the sites in an arbitrary order
(remember, dictionaries are unordered). In this problem, we’ll implement a ranking
system.

What will we rank based on? Google used an innovative ranking system that ranked a
page higher if more *other* pages linked to it. We can’t do that unfortunately, because
that requires a considerable understanding of graph theory, so what else can we do? Well,
before Google, most engines ranked based on either the frequency (i.e. number of hits)
of search terms inside the page, or by the percentage of those search terms within the
page’s text. We’ll go with the frequency arbitrarily – we found after Google that neither of
these measures is particularly good, and neither has a clear advantage over the other.
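To make the two pre-Google measures concrete, here is a minimal sketch that computes both for one term in one piece of text. The function names and the sample sentence are assumptions made for illustration; words are split on whitespace only, which is a simplification.

```python
def frequency(term, text):
    """Number of times term appears among the whitespace-separated words."""
    return text.lower().split().count(term.lower())

def percentage(term, text):
    """Share of the words in text that equal term, as a percentage."""
    words = text.lower().split()
    if not words:
        return 0.0
    return 100.0 * words.count(term.lower()) / len(words)

page = "python is fun and python is free"
hits = frequency("python", page)     # raw number of hits
share = percentage("python", page)   # hits relative to the page length
```

A long page can score a high frequency with a low percentage, which is one reason the two measures can rank the same pages differently.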
To begin, download the following files:
webindexer2.py – this is the file in which you’ll write all of your code.
websearch2.py – this completed program is an updated search engine that will use your
new index with a ranking system.
Again, take a look at the main program, websearch2.py. It’s almost identical to the
previous version, but you can see that it now expects to have tuples of (site, frequency)
rather than just the sites themselves. This way, it is able to display how many hits each
site has. It also expects that the sites are already sorted/ranked from highest to lowest
frequency.
So let’s take a look at webindexer2.py. Again, it’s almost identical to the previous version,
but the descriptions for the search functions now state that frequency is returned along
with each site, and the sites are sorted by rank.

In order to rank each site by the frequency of search terms in it, we’ll have to store the
information in our index.
To begin, you can copy your functions’ code from webindexer1.py into webindexer2.py,
but you don’t have to.
Task 1 – Implement the index_site function. What information will each site in the index
need to store with it? What’s the best way to store this information? If we have more than
one choice, which choice is mutable, and which one is immutable? While we’re building
the index, we’ll be repeatedly making changes, so which choice is better?
Hints: If you’re stuck, think very logically. When I’m searching, I have a word. I want to be
able to look up this word and get what information? The information needs to be enough
for me to sort it.
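The mutable-versus-immutable question can be checked directly. The sketch below stores one site's count first in a tuple and then in a dictionary, and tries to increment each; the site name is made up, and this is only one way to look at the choice.

```python
# Hypothetical index fragment for a single word; the site name is made up.
counts_as_tuple = ("http://a.example", 1)
counts_as_dict = {"http://a.example": 1}

# A tuple is immutable: changing the count in place fails with TypeError.
try:
    counts_as_tuple[1] += 1
    tuple_updated = True
except TypeError:
    tuple_updated = False

# A dict is mutable: the count can be bumped where it lives.
counts_as_dict["http://a.example"] += 1
```

With the tuple, every update would mean building a replacement tuple and storing it again; with the dictionary, the update is a single statement.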
Now that we’ve taken care of indexing, we can again move on to searching. And again, we’ll
tackle one word first before multiple words. This should be very similar to your previous
function, but we have to do one additional thing: sort the results based on frequency.
Task 2 – Implement the search_single_word function. We have to return a list of (site,
frequency) tuples. If we have a list L of these tuples, to sort them, do this:
L.sort(key = lambda pair: pair[1], reverse = True)

Don’t worry about what this means yet, but if you’re interested, we can explain.
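For the curious, here is a short sketch of what that sort statement does, using made-up (site, frequency) pairs. The helper second_item is a hypothetical name introduced only to show that the lambda is an ordinary function written inline.

```python
# Made-up (site, frequency) pairs.
results = [("a.example", 2), ("b.example", 7), ("c.example", 4)]

def second_item(pair):
    """Return the frequency part of a (site, frequency) pair."""
    return pair[1]

# key= names a function applied to each element to get its sort value;
# reverse=True puts the largest value first.
results.sort(key=second_item, reverse=True)

# lambda pair: pair[1] is the same function written inline without a name.
same = sorted(results, key=lambda pair: pair[1], reverse=True)
```

Note that L.sort changes the list in place and returns None, whereas sorted builds and returns a new list.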
And again, now that we can handle one word, we’ll handle multiple words. The same
logic applies as before, but again, we have to sort the results before returning them.
Task 3 – Implement the search_multiple_words function. The argument words is a list,
not a string. Make sure you don’t return duplicate sites in your list! And as before, make
sure you sort the list (using the same statement as above).
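One way to avoid duplicate sites is to collect the results in a dictionary keyed by site before turning them back into a list. The sketch below assumes that a site matching several words should get the sum of its frequencies, which is a design choice rather than the only option; hits_for_word and combine are hypothetical stand-ins for the index and the search function.

```python
# Hypothetical per-word results, as a single-word search might return them.
hits_for_word = {
    "python": [("a.example", 3), ("b.example", 1)],
    "class": [("b.example", 4), ("c.example", 2)],
}

def combine(words):
    """Merge per-word results into one ranked list with one entry per site."""
    totals = {}
    for word in words:
        for site, freq in hits_for_word.get(word, []):
            # A site already seen gets its count added, not a second entry.
            totals[site] = totals.get(site, 0) + freq
    ranked = list(totals.items())
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked

ranked = combine(["python", "class"])
```

Because dictionary keys are unique, each site can appear only once in the final list.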
You should now have a working indexer with a ranking system, so run websearch2.py
and try it out! And for some real fun, don’t use the smallest set of files. Use the 20 set
or the 50 set to see the ranking really come into play.
As before, on the next page, I’ve pasted my output for a few searches from the
mitsites20.txt file. If your output is quite different, you may have done something wrong.
If it’s just slightly different, it may just be a change in the pages (e.g. web.mit.edu) from
when I indexed the site to when you did.

# inventory.py
# Stub file for lab 10, problem 1
#
# 6.189 - Intro to Python
# IAP 2008 - Class 8
CHEAP = "cheap" # less than $20
MODERATE = "moderate" # between $20 and $100
EXPENSIVE = "expensive" # more than $100
HAMMER = "hammer"
HAMMER_PRICE = 10
HAMMER_COUNT = 100
SCREW = "screw"
SCREW_PRICE = 1
SCREW_COUNT = 1000
NAIL = "nail"
NAIL_PRICE = 1
NAIL_COUNT = 1000

SCREWDRIVER = "screwdriver"
SCREWDRIVER_PRICE = 8
SCREWDRIVER_COUNT = 100
DRILL = "drill"
DRILL_PRICE = 50
DRILL_COUNT = 20
WORKBENCH = "workbench"
WORKBENCH_PRICE = 150
WORKBENCH_COUNT = 5
HANDSAW = "handsaw"
HANDSAW_PRICE = 15
HANDSAW_COUNT = 50
CHAINSAW = "chainsaw"
CHAINSAW_PRICE = 80
CHAINSAW_COUNT = 30
# You should put all of the stuff above logically into this dictionary.
# You can just put it all in right here, like shown.
# Try to use only one *variable*, called inventory here.

inventory = { # key1 : value1, note how I can continue on the next line,
              # key2 : value2, I don't need a backslash or anything.
              # key3 : value3
            }

def get_items(cheapness):
    """ Return a list of (item, (price, count)) tuples that are the given
    cheapness. Note that the second element of the tuple is another tuple. """
    # your code here
    return [] # delete this

# Testing
cheap = get_items(CHEAP)
print type(cheap) is list
print len(cheap) == 5
print (HAMMER, (HAMMER_PRICE, HAMMER_COUNT)) in cheap
print (NAIL, (NAIL_PRICE, NAIL_COUNT)) in cheap
print (SCREW, (SCREW_PRICE, SCREW_COUNT)) in cheap
print (SCREWDRIVER, (SCREWDRIVER_PRICE, SCREWDRIVER_COUNT)) in cheap
print (HANDSAW, (HANDSAW_PRICE, HANDSAW_COUNT)) in cheap

moderate = get_items(MODERATE)
print type(moderate) is list
print len(moderate) == 2
print (DRILL, (DRILL_PRICE, DRILL_COUNT)) in moderate
print (CHAINSAW, (CHAINSAW_PRICE, CHAINSAW_COUNT)) in moderate
expensive = get_items(EXPENSIVE)
print type(expensive) is list
print len(expensive) == 1
print (WORKBENCH, (WORKBENCH_PRICE, WORKBENCH_COUNT)) in expensive
# webindexer2.py
#
from urllib import urlopen
from htmltext import HtmlTextParser
FILENAME = "smallsites.txt"

index = {}
def get_sites():
    """ Return all the sites that are in FILENAME. """
    sites_file = open(FILENAME)
    sites = []
    for site in sites_file:
        sites.append("http://" + site.strip())
    return sites

def read_site(site):
    """ Attempt to read the given site. Return the text of the site if
    successful, otherwise returns False. """
    try:
        connection = urlopen(site)
        html = connection.read()
        connection.close()
    except:
        return False
    parser = HtmlTextParser()
    parser.parse(html)
    return parser.get_text()

def index_site(site, text):
    """ Index the given site with the given text. """
    # YOUR CODE HERE #
    pass # delete this when you write your code

def search_single_word(word):
    """ Return a list of (site, frequency) tuples for all sites containing the
    given word, sorted by decreasing frequency. """
    # YOUR CODE HERE #

def search_multiple_words(words):
    """ Return a list of (site, frequency) tuples for all sites containing any
    of the given words, sorted by decreasing frequency. """
    # YOUR CODE HERE #

def build_index():
    """ Build the index by reading and indexing each site. """
    for site in get_sites():
        text = read_site(site)
        while text == False:
            text = read_site(site) # keep attempting to read until successful
        index_site(site, text)

Solutions
# inventory.py
#
CHEAP = "cheap" # less than $20
MODERATE = "moderate" # between $20 and $100
EXPENSIVE = "expensive" # more than $100
HAMMER = "hammer"
HAMMER_PRICE = 10
HAMMER_COUNT = 100
SCREW = "screw"
SCREW_PRICE = 1
SCREW_COUNT = 1000
NAIL = "nail"
NAIL_PRICE = 1
NAIL_COUNT = 1000

SCREWDRIVER = "screwdriver"
SCREWDRIVER_PRICE = 8
SCREWDRIVER_COUNT = 100
DRILL = "drill"
DRILL_PRICE = 50
DRILL_COUNT = 20
WORKBENCH = "workbench"
WORKBENCH_PRICE = 150
WORKBENCH_COUNT = 5
HANDSAW = "handsaw"
HANDSAW_PRICE = 15
HANDSAW_COUNT = 50
CHAINSAW = "chainsaw"
CHAINSAW_PRICE = 80
CHAINSAW_COUNT = 30
# You should put the stuff logically into this dictionary.
# You can just put it all in right here, like shown.
# Try to use only one *variable*, called inventory here.

inventory = {
    CHEAP : {
        HAMMER : (HAMMER_PRICE, HAMMER_COUNT),
        NAIL : (NAIL_PRICE, NAIL_COUNT),
        SCREW : (SCREW_PRICE, SCREW_COUNT),
        SCREWDRIVER : (SCREWDRIVER_PRICE, SCREWDRIVER_COUNT),
        HANDSAW : (HANDSAW_PRICE, HANDSAW_COUNT)
    },
    MODERATE : {
        DRILL : (DRILL_PRICE, DRILL_COUNT),
        CHAINSAW : (CHAINSAW_PRICE, CHAINSAW_COUNT)
    },
    EXPENSIVE : {
        WORKBENCH : (WORKBENCH_PRICE, WORKBENCH_COUNT)
    }
}

def get_items(cheapness):
    """ Return a list of (item, (price, count)) tuples that are the given
    cheapness. Note that the second element of the tuple is another tuple. """
    return inventory[cheapness].items()

# Testing
cheap = get_items(CHEAP)
print type(cheap) is list
print len(cheap) == 5
print (HAMMER, (HAMMER_PRICE, HAMMER_COUNT)) in cheap
print (NAIL, (NAIL_PRICE, NAIL_COUNT)) in cheap
print (SCREW, (SCREW_PRICE, SCREW_COUNT)) in cheap
print (SCREWDRIVER, (SCREWDRIVER_PRICE, SCREWDRIVER_COUNT)) in cheap
print (HANDSAW, (HANDSAW_PRICE, HANDSAW_COUNT)) in cheap
moderate = get_items(MODERATE)
print type(moderate) is list
print len(moderate) == 2
print (DRILL, (DRILL_PRICE, DRILL_COUNT)) in moderate
print (CHAINSAW, (CHAINSAW_PRICE, CHAINSAW_COUNT)) in moderate
expensive = get_items(EXPENSIVE)
print type(expensive) is list
print len(expensive) == 1
print (WORKBENCH, (WORKBENCH_PRICE, WORKBENCH_COUNT)) in expensive

# webindexer2.py
#
from urllib import urlopen
from htmltext import HtmlTextParser
FILENAME = "smallsites.txt"
index = {}
def get_sites():
    """ Return all the sites that are in FILENAME. """
    sites_file = open(FILENAME)
    sites = []
    for site in sites_file:
        sites.append("http://" + site.strip())
    return sites

def read_site(site):
    """ Attempt to read the given site. Return the text of the site if
    successful, otherwise returns False. """
    try:
        connection = urlopen(site)
        html = connection.read()
        connection.close()
    except:
        return False
    parser = HtmlTextParser()
    parser.parse(html)
    return parser.get_text()

def index_site(site, text):
    """ Index the given site with the given text. """
    words = text.lower().split()
    for word in words:
        if word not in index: # case 1: haven't seen word anywhere
            index[word] = {site:1} # make a new entry for the word
        elif site not in index[word]: # case 2: haven't seen word on this site
            index[word][site] = 1 # make a new entry for this site
        else: # case 3: seen this word on this site
            index[word][site] += 1 # increment the frequency by 1

def search_single_word(word):
    """ Return a list of (site, frequency) tuples for all sites containing the
    given word, sorted by decreasing frequency. """
    if word not in index:
        return []
    L = index[word].items()
    L.sort(key = lambda pair: pair[1], reverse = True) # rank by frequency
    return L

def search_multiple_words(words):
    """ Return a list of (site, frequency) tuples for all sites containing any
    of the given words, sorted by decreasing frequency. """
    all_sites = {}
    for word in words:
        for site, freq in search_single_word(word):
            if site not in all_sites: # case 1: haven't included this site
                all_sites[site] = freq # make a new entry for site, freq
            else: # case 2: have included this site
                all_sites[site] += freq # add the frequencies
    L = all_sites.items()
    L.sort(key = lambda pair: pair[1], reverse = True) # rank by frequency
    return L

def build_index():
    """ Build the index by reading and indexing each site. """
    for site in get_sites():
        text = read_site(site)
        while text == False:
            text = read_site(site) # keep attempting to read until successful
        index_site(site, text)
