The document discusses various algorithms for searching and sorting data. It begins by describing sequential search on both ordered and unordered files, noting that sequential search on an ordered file is faster only when the item is not found. It then introduces binary search, which is much faster than sequential search for large ordered lists, with a runtime of O(log n). Two sorting algorithms are also covered - bubble sort and insertion sort. Both run in O(n^2) time in the worst case, though insertion sort is generally faster in practice. The document provides examples and pseudocode to illustrate how the algorithms work.
1. CMSC 104, Version 8/06 1
L24Searching&Sorting.ppt
Searching and Sorting
Topics
• Sequential Search on an Unordered File
• Sequential Search on an Ordered File
• Binary Search
• Bubble Sort
• Insertion Sort
Reading
• Sections 6.6 - 6.8
Common Problems
• There are some very common problems that
we use computers to solve:
o Searching through a lot of records for a specific
record or set of records
o Placing records in order, which we call sorting
• There are numerous algorithms to perform
searches and sorts. We will briefly explore a
few common ones.
Searching
• A question you should always ask when selecting a
search algorithm is “How fast does the search have
to be?” The reason is that, in general, the faster
the algorithm is, the more complex it is.
• Bottom line: you don't always need, or want, to use the fastest algorithm.
• Let’s explore the following search algorithms,
keeping speed in mind.
o Sequential (linear) search
o Binary search
Sequential Search on an Unordered File
• Basic algorithm:
Get the search criterion (key)
Get the first record from the file
While ( (record != key) and (still more records) )
Get the next record
End_while
• When do we know that there wasn’t a
record in the file that matched the key?
Sequential Search on an Ordered File
• Basic algorithm:
Get the search criterion (key)
Get the first record from the file
While ( (record < key) and (still more records) )
Get the next record
End_while
If ( record = key )
Then success
Else there is no match in the file
End_else
• When do we know that there wasn’t a record in
the file that matched the key?
Sequential Search of Ordered vs. Unordered List
• Let’s do a comparison.
• If the order was ascending alphabetical on
customer’s last names, how would the search for
John Adams on the ordered list compare with the
search on the unordered list?
o Unordered list
– if John Adams was in the list?
– if John Adams was not in the list?
o Ordered list
– if John Adams was in the list?
– if John Adams was not in the list?
Ordered vs. Unordered (con’t)
• How about George Washington?
o Unordered
– if George Washington was in the list?
– If George Washington was not in the list?
o Ordered
– if George Washington was in the list?
– If George Washington was not in the list?
• How about James Madison?
Ordered vs. Unordered (con't)
• Observation: the search is faster on an
ordered list only when the item being
searched for is not in the list.
• Also, keep in mind that the list has to first be
placed in order for the ordered search.
• Conclusion: the efficiency of these
algorithms is roughly the same.
• So, if we need a faster search, we need a
completely different algorithm.
• How else could we search an ordered file?
Binary Search
• If we have an ordered list and we know
how many things are in the list (i.e., number
of records in a file), we can use a different
strategy.
• The binary search gets its name because
the algorithm continually divides the list into
two parts.
How a Binary Search Works
Always look at the center value. Each time, you get to discard half of the remaining list.
Is this fast?
How Fast is a Binary Search?
• Worst case: 11 items in the list took 4 tries
• How about the worst case for a list with 32
items ?
o 1st try - list has 16 items
o 2nd try - list has 8 items
o 3rd try - list has 4 items
o 4th try - list has 2 items
o 5th try - list has 1 item
What’s the Pattern?
• List of 11 took 4 tries
• List of 32 took 5 tries
• List of 250 took 8 tries
• List of 512 took 9 tries
• 32 = 2^5 and 512 = 2^9
• 8 < 11 < 16, i.e., 2^3 < 11 < 2^4
• 128 < 250 < 256, i.e., 2^7 < 250 < 2^8
A Very Fast Algorithm!
• How long (worst case) will it take to find an
item in a list 30,000 items long?
2^10 = 1024    2^13 = 8192
2^11 = 2048    2^14 = 16384
2^12 = 4096    2^15 = 32768
• So, it will take only 15 tries!
Lg n Efficiency
• We say that the binary search algorithm runs in log2 n time. (Also written as lg n.)
• Lg n means the log to the base 2 of some value of n.
• 8 = 2^3, so lg 8 = 3; 16 = 2^4, so lg 16 = 4
• No search algorithm that works by comparing items can run faster than lg n time.
Sorting
• So, the binary search is a very fast
search algorithm.
• But, the list has to be sorted before we
can search it with binary search.
• To be really efficient, we also need a
fast sort algorithm.
Common Sort Algorithms
Bubble Sort Heap Sort
Selection Sort Merge Sort
Insertion Sort Quick Sort
• There are many known sorting algorithms. Bubble sort is among the slowest, running in n^2 time. Quick sort is among the fastest, running in n lg n time on average.
• As with searching, the faster the sorting algorithm,
the more complex it tends to be.
• We will examine two sorting algorithms:
o Bubble sort
o Insertion sort
Bubble Sort - Let’s Do One!
Sort this list: C P G A T O B
Bubble Sort Code
void bubbleSort (int a[ ] , int size)
{
int i, j, temp;
for ( i = 0; i < size; i++ ) /* controls passes through the list */
{
for ( j = 0; j < size - 1; j++ ) /* performs adjacent comparisons */
{
if ( a[ j ] > a[ j+1 ] ) /* determines if a swap should occur */
{
temp = a[ j ]; /* swap is performed */
a[ j ] = a[ j + 1 ];
a[ j+1 ] = temp;
}
}
}
}
Insertion Sort
• Insertion sort is slower than quick sort, but
not as slow as bubble sort, and it is easy to
understand.
• Insertion sort works the same way as
arranging your hand when playing cards.
o Out of the pile of unsorted cards that were
dealt to you, you pick up a card and place it in
your hand in the correct position relative to the
cards you’re already holding.
Arranging Your Hand
Pick up the cards one at a time, inserting each into its correct position. The hand grows like this:
7 → 5 7 → 5 6 7 → 5 6 7 K → 5 6 7 8 K
Insertion Sort
Start with the unsorted list: 7 5 6 K.
Look at the 2nd item - 5. Compare 5 to 7.
5 is smaller, so move 5 to temp, leaving an empty slot in position 2.
Move 7 into the empty slot, leaving position 1 open.
Move 5 into the open position. The list is now 5 7 6 K.
Insertion Sort (con’t)
Look at the next item - 6. Compare to the 1st - 5.
6 is larger, so leave 5 where it is.
Compare to the next - 7.
6 is smaller, so move 6 to temp, leaving an empty slot.
Move 7 into the empty slot, leaving position 2 open.
Move 6 to the open 2nd position. The list is now 5 6 7 K.
Insertion Sort (con’t)
Look at next item - King.
Compare to 1st - 5.
King is larger, so
leave 5 where it is.
Compare to next - 6.
King is larger, so
leave 6 where it is.
Compare to next - 7.
King is larger, so
leave 7 where it is.
The list is unchanged: 5 6 7 K.
Insertion Sort (con’t)
Look at the last item - 8. Compare to 5, then 6, then 7 - 8 is larger, so they stay where they are.
Compare to King - 8 is smaller, so move 8 to temp, move King into the empty slot, and move 8 into the open position.
The list is now fully sorted: 5 6 7 8 K.
Courses at UMBC
• Data Structures - CMSC 341
o Some mathematical analysis of various
algorithms, including sorting and searching
• Design and Analysis of Algorithms - CMSC 441
o Detailed mathematical analysis of various
algorithms
• Cryptology - CMSC 443
o The study of making and breaking codes