Contents:
1. Direct Address Table
2. Hashing
3. Characteristics of a good hash function
4. Collision Resolution using Chaining and Probing
5. Static vs Dynamic Hashing
6. Extendible Hashing
7. B+ tree vs Hashing
Hash Tables
The memory available to maintain the symbol table is assumed to be sequential. This memory is referred to as the hash table, HT. The term bucket denotes a unit of storage that can store one or more records. A bucket is typically one disk block size but could be chosen to be smaller or larger than a disk block.
If the number of buckets in a Hash table HT is b, then the buckets are designated HT(0), ... HT(b-1). Each bucket is capable of holding one or more records. The number of records a bucket can store is known as its slot-size. Thus, a bucket is said to consist of s slots, if it can hold s number of records in it.
A function used to compute the address of a record in the hash table is known as a hash function. Usually s = 1, in which case each bucket holds exactly one record.
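The bucket and slot terminology above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (hash_address, make_table, insert, find) are hypothetical, keys are assumed to be non-negative integers, and the division method (key mod b) stands in for whatever hash function is actually chosen.

```python
def hash_address(key, b):
    """Division-method hash (an assumption): map an integer key to 0 .. b-1."""
    return key % b


def make_table(b):
    """Create b empty buckets, HT(0) .. HT(b-1)."""
    return [[] for _ in range(b)]


def insert(table, key, record, s=1):
    """Store (key, record) in its bucket; return False if all s slots are taken."""
    bucket = table[hash_address(key, len(table))]
    if len(bucket) >= s:
        return False  # bucket overflow: a collision-resolution scheme would take over
    bucket.append((key, record))
    return True


def find(table, key):
    """Return the record stored under key, or None if it is not in its bucket."""
    for stored_key, record in table[hash_address(key, len(table))]:
        if stored_key == key:
            return record
    return None


table = make_table(5)                   # b = 5 buckets
first = insert(table, 12, "record A")   # 12 % 5 == 2, bucket HT(2) is empty
second = insert(table, 17, "record B")  # 17 % 5 == 2, HT(2) is full when s = 1
```

With s = 1 the second insert is rejected because both keys map to HT(2); that situation is a collision, which chaining or probing would have to resolve.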
Hashing notes data structures (HASHING AND HASH FUNCTIONS), Kuntal Bhowmick
A Hash table is a data structure used for storing and retrieving data very quickly. Insertion of data in the hash table is based on the key value. Hence every entry in the hash table is associated with some key.
HASHING AND HASH FUNCTIONS, HASH TABLE REPRESENTATION, HASH FUNCTION, TYPES OF HASH FUNCTIONS, COLLISION, COLLISION RESOLUTION, CHAINING, OPEN ADDRESSING – LINEAR PROBING, QUADRATIC PROBING, DOUBLE HASHING
Lab 3: Set Working Directory, Scatterplots and Introduction to
Linear Regression
Chao-yo Cheng
Zsuzsanna Magyar
January 16, 2016
1 Section objectives
In this section we will use HW2.dta. This dataset is a small set of variables from the larger
“Maddison dataset”.[1] By the end, you should be comfortable using commands to import
your .dta file, making (somewhat) fancy scatterplots, and running regressions.
2 Commands
In this lab, you should become familiar with the following commands.
cd
use
regress
and
twoway scatter
lfit
scheme ()
3 Set working directory and quickly running a .do file
• Log in and open Stata.
• Log in to the class website. Save the Homework 2 data in “My Documents” (or any folder
that works for you).
• Open a .do file. First set the working directory on Stata. Type:
cd "insert path address"
To get the address of your path, right click on “My documents” and press “Copy address”
so you can copy this inside your “” after cd.
[1] See here for more information: http://www.ggdc.net/maddison/maddison-project/home.htm.
• To import the data, use the use command. Type:
use HW2.dta
• Check whether or not the data has been imported properly.
• Now you know how to open your data using code. This means you can quickly run your
.do file on a clean dataset using the clear all command at the top of the .do file.
This will save you the trouble of opening a fresh dataset once your do file is finished.
• To sum up, at the top of your .do file, type
clear all
cd "path address"
use HW2.dta
Question 1. How many variables are in the dataset, and how many observations are there?
4 Scatterplots
• Everything after the “,” in a graphical command is an option. The variables being
graphed come before the comma.
• Sometimes it is nice to use a scheme for your scatterplot so it looks simpler. Here we
use the scheme (s1mono). Schemes determine the overall look of a graph.
• To draw a scatterplot with observation labels and titles for the y and x axes, type:
twoway (scatter gdppc_2000 gdppc_1500, mlabel(country)), ///
scheme (s1mono) ///
ytitle("GDP per capita 2000") ///
xtitle("GDP per capita 1500") ///
title("Scatterplot of GDP per capita 1500 versus 2000")
• Add a line of best fit using lfit command. Type:
twoway (scatter gdppc_2000 gdppc_1500, mlabel(country)) ///
(lfit gdppc_2000 gdppc_1500, color(blue)), ///
scheme (s1mono) ///
ytitle("GDP per capita 2000") ///
xtitle("GDP per capita 1500") ///
title("Scatterplot of GDP per capita 1500 versus 2000")
Question 2. What relationship does the slope of the fitted line indicate?
5 Linear regression
• The dependent variable (or outcome variable) is what we are trying to explain. It is
also called the “outcome” or Y.
• The explanatory (or independent) variables are what we use to do the explaining. These
variables are also called predictors, as we think they are trying to predict the dependent
variable.
• The command fo ...
CS151 Deep copy
1. Computer Science 151
An introduction to the art of
computing
Copy of Lists Lecture
Rudy Martinez
2. Homework 2c is posted
• Create a file (homework2c.py).
• Your program should open the homework2.csv and averagedata.csv data file.
• You may use any data structure you like, however, you should think about how you need to read, access,
and write the data for the statistics you need. You would be wise to utilize code you have already written.
• You will create a yearly Standard Deviation of TMIN and TMAX. You should use the following equations to
compute the average ($\bar{x}$) and standard deviation ($\sigma$):
• $\bar{x} = \frac{\sum x}{n}$ and $\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
• Write a comma separated value (CSV) file to disk (statistics.csv) that holds the following data:
• Year, avgTmax, avgTmin, StdDevMax, StdDevMin
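The two equations translate almost directly into Python. The sketch below is one possible way to write them, not the required homework solution; the function names and the sample temperatures are made up for illustration.

```python
import math


def mean(values):
    """Average: the sum of the values divided by how many there are (n)."""
    return sum(values) / len(values)


def sample_std(values):
    """Standard deviation with n - 1 in the denominator, as in the equation above."""
    avg = mean(values)
    squared_diffs = sum((x - avg) ** 2 for x in values)
    return math.sqrt(squared_diffs / (len(values) - 1))


temps = [68.0, 38.0, 67.0, 25.0]  # made-up sample values, not real homework data
print("average:", mean(temps))
print("standard deviation:", sample_std(temps))
```

Note that the formula needs at least two values, because n - 1 would otherwise be zero.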
3/26/2019CS151SP19
3. Homework 2c
• You will be given code to create a graph, however, you must make sure that your data is in the correct
format to be understood by the graphing code.
• You will write a report on your observations of the data and the outcome, you should use the graph
created in your code. This will be a short one to two paragraphs not to exceed one page on what you
think the data is saying. Approach this as an assignment from the point of view that you are analyzing
this data for a boss or management team.
• What can you generalize about the temperature over the time in the data series?
• What kind of temperatures can you generalize for the next 10, 20, 30 years given this data and the standard
deviation?
• Does the data make sense in terms of what you know about Albuquerque’s weather?
• You will turn in your homework2c.py file on learn; you should also upload a pdf of your report
observations.
4. Python and Lists
• Python tries to be smart with
managing memory.
• Lists can be long so why make a
complete copy?
myList = [ 1, 2, 3, 4, 5]
myCopy = myList
print('myList', myList)
print('myCopy', myCopy)
---------------------------------------
myList [1, 2, 3, 4, 5]
myCopy [1, 2, 3, 4, 5]
5. Python and Lists
• Now we see that python only
creates a pointer as a ‘copy’
• How do we know this?
myList = [ 1, 2, 3, 4, 5]
myCopy = myList
myList.append(6)
print('myList', myList)
print('myCopy', myCopy)
-------------------------------------------
myList [1, 2, 3, 4, 5, 6]
myCopy [1, 2, 3, 4, 5, 6]
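Another way to confirm that both names point at one list is to ask Python directly with the `is` operator. The short sketch below uses hypothetical variable names and is only meant to illustrate the point made on this slide.

```python
my_list = [1, 2, 3, 4, 5]
my_copy = my_list  # no new list is created; both names refer to the same object

same_object = my_copy is my_list  # identity check, not a value comparison
my_copy.append(6)                 # changing it through one name ...

print(same_object)
print(my_list)                    # ... is visible through the other name
```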
6. Enter Deep Copy
• How do we create a real copy then?
import copy
newList = copy.deepcopy(list1)
• The above code will copy the list element by element and place
it into a whole new list
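For a list of lists, such as rows read from a CSV file, the difference between a shallow copy and a deep copy becomes visible. The sketch below is illustrative only; the rows are made-up values in a Year, tmax, tmin layout.

```python
import copy

rows = [[1994, 68, 38], [1995, 48, 25]]  # made-up rows: year, tmax, tmin

shallow = copy.copy(rows)   # new outer list, but the inner rows are shared
deep = copy.deepcopy(rows)  # new outer list and new inner rows

rows[0][1] = 99             # change a value inside an existing row
rows.append([1996, 51, 16]) # add a new row to the original only

print(len(shallow), len(deep))  # neither copy gained the appended row
print(shallow[0][1])            # the shallow copy sees the changed value
print(deep[0][1])               # the deep copy keeps the original value
```

For a flat list of numbers, as in the slides, both kinds of copy behave the same; deepcopy matters once the list contains other lists.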
7. Enter Deep Copy
• Now we see that deepcopy creates a real, independent copy
• How do we know this?
import copy
myList = [ 1, 2, 3, 4, 5]
myrealCopy = copy.deepcopy(myList)
myList.append(6)
print('myList', myList)
print('myrealCopy', myrealCopy)
-------------------------------------------
myList [1, 2, 3, 4, 5, 6]
myrealCopy [1, 2, 3, 4, 5]
8. Why does this matter?
• If you have a for loop that adds items to a list and you store that list in
a new list or dictionary, you get a pointer to the list, not a copy of it.
9. Example
myList = [ 1, 2, 3, 4, 5]
myDictionary = {}
for i in range(0, len(myList), 1):
    myDictionary[str(i)] = myList
10. Example
myList = [ 1, 2, 3, 4, 5]
myDictionary = {}
for i in range(0, len(myList), 1):
    myDictionary[str(i)] = myList
This is what you might think is happening: each dictionary entry holds its own copy of the list.
11. Example
myList = [ 1, 2, 3, 4, 5]
myDictionary = {}
for i in range(0, len(myList), 1):
    myDictionary[str(i)] = myList
This is what really happens: every dictionary entry points to the same list.
12. Example
myList = [ 1, 2, 3, 4, 5]
myDictionary = {}
for i in range(0, len(myList), 1):
    myDictionary[str(i)] = myList
# Add one to myList
myList.append(6)
This is what really happens: the appended value shows up under every dictionary key.
13. Example fixed!
import copy
myList = [ 1, 2, 3, 4, 5]
myDictionary = {}
for i in range(0, len(myList), 1):
    myDictionary[str(i)] = copy.deepcopy(myList)
# Add one to myList
myList.append(6)
This is what really happens: each entry holds its own copy, so the append does not change the dictionary.
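The broken and the fixed version can be run side by side to see the difference. This is a small sketch with hypothetical names (aliased, copied), not part of the homework code.

```python
import copy

my_list = [1, 2, 3, 4, 5]
aliased = {}  # every value will point at my_list itself
copied = {}   # every value will be an independent deep copy

for i in range(len(my_list)):
    aliased[str(i)] = my_list
    copied[str(i)] = copy.deepcopy(my_list)

# Add one to my_list after the dictionaries were filled
my_list.append(6)

print(aliased["0"])  # shows the appended 6, because it is the same list
print(copied["0"])   # still the five original values
```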
14. Calculating standard deviation
• You already have the list of averages ($\bar{x}$) in averages.csv
• You will need to get ‘x’ from your cleaned data in homework2.csv
• Then like calculating the sum to get an average you will need to sort
by year.
• You could copy your for loop that did the summation, except you need to
figure out how to store the calculation
$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
15. This is where deepcopy comes in
• In the for loop you can calculate $\sum (x - \bar{x})^2$
sum += math.pow((myList[i][1] - myAvg), 2)
• To square a number use math.pow(number,2)
• To cube change 2 to 3
• myList[i][1]
• myList is loaded from the original file, homework2.csv
• Year, tmax, tmin
• myAvg
• You already have averages from homework2b, averages.csv
16. This is where deepcopy comes in
• Load Averages into dictionary
• Year: [Tmax, Tmin]
• Load homework2.csv into list
• Year, tmax, tmin
• Create dictionary to hold Year and a list of the temps for that year (you need two, tmax and tmin)
• Year: [maxTemplist]
• 1994: ['68', '38', '67', …, '25']
• 1995: ['48', '25', '51', …, '16']
• Now you can calculate standard deviation per year
• More on this later
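The steps above can be sketched as follows. This is one possible layout, not the required solution: the rows are made-up stand-ins for homework2.csv, the dictionary names are hypothetical, and `statistics.stdev` from the standard library is used here because it computes the same n - 1 standard deviation as the equation in the assignment.

```python
import statistics

# Made-up rows in the Year, tmax, tmin layout, kept as strings as a CSV reader returns them
rows = [
    ["1994", "68", "38"],
    ["1994", "67", "25"],
    ["1994", "72", "40"],
    ["1995", "48", "25"],
    ["1995", "51", "16"],
    ["1995", "60", "30"],
]

max_by_year = {}  # Year: [list of tmax values]
min_by_year = {}  # Year: [list of tmin values]
for year, tmax, tmin in rows:
    # setdefault creates a fresh list the first time a year is seen
    max_by_year.setdefault(year, []).append(float(tmax))
    min_by_year.setdefault(year, []).append(float(tmin))

# One standard deviation per year, for tmax and for tmin
std_max = {year: statistics.stdev(temps) for year, temps in max_by_year.items()}
std_min = {year: statistics.stdev(temps) for year, temps in min_by_year.items()}

for year in std_max:
    print(year, std_max[year], std_min[year])
```

Because each year gets its own new list, no two dictionary entries share a list, which avoids the aliasing problem shown earlier.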
18. Code for graphing
This code will be provided to you; however, you will need to have your data in the correct format*:
Each plot call takes two lists and plots them.
plt.plot(yearList, maxTempList, label='TMax')
plt.plot(yearList, minTempList, label='TMin')
plt.plot(yearList, plotAvgMax, label='Max Average')
plt.plot(yearList, plotAvgMin, label='Min Average')
yearList is just a list of the years from the data file.
maxTempList and minTempList are lists of values from homework2b or the averages.csv
plotAvgMax and plotAvgMin each hold one single value; however, the plotter wants a 1:1 match, so the value is
repeated 25 times.
* This is the current format; it may change slightly.
19. List Comprehensions
new_list = [expression for_loop_one_or_more conditions]
# We can use len(avgDict) here because we cleaned the data
avgMax = sumMax/len(avgDict)
avgMin = sumMin/len(avgDict)
# Syntactic Sugar for creating an array of values set to one value
plotAvgMax = [avgMax for i in range(len(avgDict))]
plotAvgMin = [avgMin for i in range(len(avgDict))]
We take the value avgMax and repeat it once for each i from 0 to len(avgDict) - 1.
plotAvgMax = [74.3993317088547, … , 74.3993317088547]
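A short sketch of the pattern, including the optional condition part. The values here are placeholders chosen for illustration, not results from the homework data; the count of 25 follows the graphing slide.

```python
avg_max = 74.4  # placeholder average, not a computed result
n_years = 25    # the graphing slide says the value is repeated 25 times

# Comprehension form: one copy of avg_max per loop iteration
plot_avg_max = [avg_max for _ in range(n_years)]

# Equivalent shortcut for repeating a single number
plot_avg_max_alt = [avg_max] * n_years

# A comprehension with a condition keeps only the items that pass the test
temps = [68, 38, 67, 25]
warm = [t for t in temps if t > 50]

print(len(plot_avg_max), plot_avg_max == plot_avg_max_alt)
print(warm)
```

The `[value] * n` shortcut is safe for numbers; for a list of lists it would repeat a pointer to the same inner list, which is the aliasing issue covered earlier.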