Algorithms and
Data Structures
Of the many subfields within computerscience, algorithmsand data
structures may be the most fundamental—it seems to characterize computer
science like perhaps no other. What is involved in the study of algorithms and
data structures?
Importance
Computer Science encompasses theoretical and practical approaches to
computation. It's been said that the four essential subdisciplines are:
 Theory of Computation
 Algorithms and Data Structures
 Programming Methodology and Languages
 Computer Elements and Architecture
The ACM has identified eighteen knowledge areas of Computer Science
as:
Algorithms and
Complexity
AL Algorithmic Analysis; Algorithmic Strategies;
Data Structures and Algorithms; Automata;
Computability;Computational Complexity.
Architecture and
Organization
AR Digital Logic and Digital Systems; Machine-
level Data Representation; Assembly-Level
Machine Organization; Memory System
Organization and Architecture; Interfacing and
Communication; Functional Organization;
Multiprocessing and Alternative Architectures;
Performance Enhancements.
Computational
Science
CN Modeling and Simulation; Processing;
Interactive Visualization; Data, Information and
Knowledge; Numeric Analysis; Symbolic
Computation; Mathematical Modeling; High-
Performance Computing.
Discrete
Structures
DS Logic; Sets, Relations, and Functions; Proof
Techniques; Basics of Counting; Graphs and
Trees; Discrete Probability.
Graphics and
Visualization
GV Media Applications; Digitization; Color Models;
Rendering; Modeling; Animation; Visualization.
Human-
Computer
HCI Interactive Systems; Testing; Collaboration and
Communication; Statistical Methods; Human
Interaction Factors; Mixed, Augmented, and Virtual Reality.
Information
Assurance and
Security
IAS Defensive Programming; Threats and Attacks;
Network Security; Cryptography;Web Security;
Platform Security; Policy and Governance;
Digital Forensics; Secure Software Engineering.
Information
Management
IM Database Systems; Data Modeling; Indexing;
Key-Value, Document, Relational, and Graph
Databases; Query Languages; Transaction
Processing; Distributed Databases; Physical
Database Design; Data Mining; Information
Storage and Retrieval; Multimedia Systems.
Intelligent
Systems
IS Knowledge Representation and Reasoning;
Search; Machine Learning; Reasoning Under
Uncertainty; Agents; Natural Langauge
Processing; Robotics; Speech Recognition and
Synthesis; Perception and Computer Vision.
Networking and
Communication
NC Networked Applications; Reliable Data
Delivery; Routing and Forwarding; Local and
Wide Area Networks; Reource Allocation;
Mobility; Social Networking.
Operating
Systems
OS Operating System Organization; Concurrency;
Scheduling and Dispatch; Memory
Management; Security and Protection; Virtual
Machines; Device Management; File Systems;
Realtime and Embedded Systems; Fault
Tolerance; System Performance and Evaluation.
Platform-Based
Development
PBD Web Platforms; Mobile Platforms; Industrial
Platforms; game Platforms.
Parallel and
Distributed
Computing
PD Parallel Decomposition; Communication and
Coordination; Parallel Algorithms, Analysis, and
Programming; Parallel Architecture; Parallel
Performance; Distributed Systems; Cloud
Computing; Formal Models and Semantics.
Programming
Languages
PL Object Oriented Programming; Functional
Programming; Event-Driven and Reactive
Programming; Type Systems; Program
Representation; Language Translation and
Execution; Syntax Analysis; Semantic Analysis;
Code Generation; Runtime Systems; Static
Analysis; Concurrency and Parallelism; Type
Systems; Formal Semantics; Language
Pragmatics; Logic Programming.
Software
Development
Fundamentals
SDF Algorithms and Design; Fundamental
Programming Concepts; Fundamental Data
Structures; Development Methods.
Software
Engineering
SE Software Processes; Project Management;
Tools and Environments; Requirements
Engineering; Software Design; Software
Construction; Software Verification and
Validation; Software Evolution; Software
Reliability;Formal Methods.
Systems
Fundamentals
SF Computational Paradigms; Cross-Layer
Communications; State and State Machines;
Parallelism; Evaluation; Resource Allocation and
Scheduling; Proximity; Virtualization and
Isolation; Reliability through Redundancy;
Quantitative Evaluation.
Social Issues and
Professional
Practice
SP Social Context; Analytic Tools; Professional
Ethics; Intellectual Property; Privacy and Civil
Liberties; Professional Communication;
Sustainability; History; Economics of
Computing; Security Policies, Law, and Crime.
We study data structures and algorithms because:
 Computing systems are concerned with the storage and retrieval
of information.
 For systems to be economical the data must be organized (into
data structures) in such a way as to support efficient manipulation
(by algorithms).
 Choosing the wrong algorithms and data structures makes a
program slow at best and unmaintainable and insecure at worst.
Topics
While it's possible to study algorithms and data structures exclusively at
a theoretical level, we often study them together with introductory
software engineering concepts; the major topics, grouping them
roughly as follows:
 Data: data types, abstract data types, data structures (arrays and
links), classes, APIs, collections, OO.
 Algorithms: What is an algorithm?, algorithmic patterns and
paradigms (recursion, backtracking, search, etc.), complexity analysis.
 Software development: abstraction, efficiency, robustness,
classification, inheritance and composition, polymorphism, software
life cycle, methodology, modeling languages, design patterns, and
testing.
Data Types
During the design of a system, before the algorithms are written and
before the data structures are sketched out, we must first identify the
kinds of objects we need and how they should behave.
A type is a collection of values, like
[Math Processing Error]
[Math Processing Error]
[Math Processing Error]
[Math Processing Error]
[Math Processing Error]
[Math Processing Error]
[Math Processing Error]
[Math Processing Error]
[Math Processing Error]
[Math Processing Error]
[Math Processing Error]
[Math Processing Error]
[Math Processing Error]
[Math Processing Error]
A data type is a type together with operations. An abstract data type
(ADT) is a data type in which the internal structure and internal
workings of the objects of the type are unknown to users of the type.
Users can only see the effects of the operations.
Types and operations can be given in pictures...
... or with signatures
CLUBS : Suit
HEARTS : Suit
DIAMONDS : Suit
SPADES : Suit
ACE : Rank
TWO : Rank
THREE : Rank
FOUR : Rank
FIVE : Rank
SIX : Rank
SEVEN : Rank
EIGHT : Rank
NINE : Rank
TEN : Rank
JACK : Rank
QUEEN : Rank
KING : Rank
makeCard : Suit × Rank → Card
getSuit : Card → Suit
getRank : Card → Rank
newDeck : Deck
positionOf : Deck × Card → int
cardAt : Deck × int → Card
topCard : Deck → Card
shuffle : Deck → Deck
split : Deck → Deck
Exercise: Draw a picture, and enumerate operations, for the type of
integer lists. An integer list is a sequence of zero or more integers with
operations such as obtaining the length, adding an item to a given
position, removing the nth item, reversing, finding the position of a
given item, etc.
Collections
A collection is a group of elements treated a single object. Most ADTs
seem to be collection types. Collection types can be
 Homogeneous or heterogeneous
 Unordered or ordered
 Linear, Hierarchical, Networked, or Non-Positional
The latter classification looks like this:
Classification Topology Examples
SET
(NON-
POSITIONAL)
Unrelated items, such as people
in a class or a hand of cards.
SEQUENCE
(LINEAR)
One-to-one relationship, as in a
printer queue, ring LAN, or a line
at the deli.
TREE
(HIERARCHICAL)
One-to-many relationship, as in
a family tree, organization chart,
expression tree, or folders and
files in a file system.
GRAPH
(NETWORK)
Many-to-many relationship, as
in a road map, schedule, belief
network, or communication net.
Operations common to most collections include
 Creation and destruction
 Adding, removing, replacing, and searching for elements
 Getting the size, or figuring out if it is empty or full
 Enumeration (traversal)
 Equality testing
 Cloning
 Serializing
Data Structures
A data structure is an arrangement of data for the purpose of being
able to store and retrieve information. Usually a data structure is some
physical representation of an ADT.
Example: A list (data type) can be represented by an array (data
structure) or by attaching a link from each element to its successor
(another kind of data structure).
Building Blocks
Data structures are built with
 Records (a.k.a. structs, classes, hashes), which group entities into a
single unit with named fields
 Arrays, which store fixed size entities into indexed, contiguous
slots in memory
 Links (a.k.a. references or pointers)
Choosing the Right Data Structure
The data structure chosen for a data type must meet the resource
constraints of the problem. Figure out the constraints before you start
coding!
Example: In an accounting system,
 Creation and deletion of a new account can be slow
 Lookup and update of the balance of a given account must be
extremely fast.
These are the extremely important questions you must ask to determine
acceptable tradeoffs:
 Can the data structure be completely filled at the beginning, or
will there be insertions intermingled with deletions, lookups, updates,
and other operations?
 Can items be deleted from the structure?
 Will the items always be processed in a well-defined order, or will
random-access have to be supported?
The Classics
History is important. Over time, many data structures have become
well-known (for variaous) reasons. Some of these are:
 The Skip List
 The Cartesian Tree
 The Judy Array
 ...
Wait, I can't make up a great list, but Wikipedia can. Check out
Wikpedia's List of Data Structures.
Algorithms
We design algorithms to carry out the desired behavior of a software
system. Good programmers tend to use stepwise refinement when
developing algorithms, and understand the tradeoffs between different
algorithms for the same problem.
Classifying Algorithms
There are several classification schemes for algorithms:
 By problem domain: numeric, text processing (matching,
compression, sequencing), cryptology, fair division, sorting,
searching, computational geometry, networks (routing, connectivity,
flow, span), computer vision, machine learning.
 By design strategy: divide and conquer, greedy, algebraic
transformation, dynamic programming, linear programming, brute
force (exhaustive search), backtracking,branch and bound, heuristic
search, genetic algorithms, simulated annealing, particle swarm, Las
Vegas, Monte Carlo, quantum algorithms.
 By complexity: constant, logarithmic, linear, linearithmic,
quadratic, cubic, polynomial, exponential.
 Around the central data structures and their behavior: whether
(1) organizational: set, linear, hierachical, or network; or (2)
implementational: hashtables, skip lists, red-black trees, a-b trees,
cartesian trees, neural networks, Bayesian networks, swarms.
 By one of many implementation dimensions: sequential vs.
parallel, recursive vs. iterative, deterministic vs. non-deterministic,
exact vs. approximate, ....
Kinds of Problems
This classification has proved useful:
Problem
Type
Description
Decision Given [Math Processing Error] and [Math Processing
Error], is it true that [Math Processing Error]?
Functional Given [Math Processing Error] and [Math Processing
Error], find [Math Processing Error]
Constraint Given [Math Processing Error] and [Math Processing
Error], find an [Math Processing Error] such that [Math
Processing Error]
Optimization Given [Math Processing Error], find the [Math Processing
Error] such that [Math Processing Error] is less than or
equal to [Math Processing Error] for all [Math Processing
Error].
Choosing the Right Algorithm
Choosing the right algorithm is both
 an art, because cleverness, insight, ingenuity, and dumb luck are
often required to find efficient algorithms for certain problems
 a science, because principles of algorithm analysis and widely
applicable algorithmic patterns have been developed over time for
you to use.
The Classics
History is important. Over time, many algorithms have become well-
known (for variaous) reasons. Some of these are:
 Euclid's GCD
 The Fast Fourier Transform
 Miller-Rabin Primality Test
 ...
Wait, I can't make up a great list, but Wikipedia can. Check out
Wikpedia's List of Algorithms.

Algorithms and Data Structures~hmftj

  • 1.
    Algorithms and Data Structures Ofthe many subfields within computerscience, algorithmsand data structures may be the most fundamental—it seems to characterize computer science like perhaps no other. What is involved in the study of algorithms and data structures? Importance Computer Science encompasses theoretical and practical approaches to computation. It's been said that the four essential subdisciplines are:  Theory of Computation  Algorithms and Data Structures  Programming Methodology and Languages  Computer Elements and Architecture
  • 2.
    The ACM hasidentified eighteen knowledge areas of Computer Science as: Algorithms and Complexity AL Algorithmic Analysis; Algorithmic Strategies; Data Structures and Algorithms; Automata; Computability;Computational Complexity. Architecture and Organization AR Digital Logic and Digital Systems; Machine- level Data Representation; Assembly-Level Machine Organization; Memory System Organization and Architecture; Interfacing and Communication; Functional Organization; Multiprocessing and Alternative Architectures; Performance Enhancements. Computational Science CN Modeling and Simulation; Processing; Interactive Visualization; Data, Information and Knowledge; Numeric Analysis; Symbolic Computation; Mathematical Modeling; High- Performance Computing. Discrete Structures DS Logic; Sets, Relations, and Functions; Proof Techniques; Basics of Counting; Graphs and Trees; Discrete Probability. Graphics and Visualization GV Media Applications; Digitization; Color Models; Rendering; Modeling; Animation; Visualization. Human- Computer HCI Interactive Systems; Testing; Collaboration and Communication; Statistical Methods; Human
  • 3.
    Interaction Factors; Mixed,Augmented, and Virtual Reality. Information Assurance and Security IAS Defensive Programming; Threats and Attacks; Network Security; Cryptography;Web Security; Platform Security; Policy and Governance; Digital Forensics; Secure Software Engineering. Information Management IM Database Systems; Data Modeling; Indexing; Key-Value, Document, Relational, and Graph Databases; Query Languages; Transaction Processing; Distributed Databases; Physical Database Design; Data Mining; Information Storage and Retrieval; Multimedia Systems. Intelligent Systems IS Knowledge Representation and Reasoning; Search; Machine Learning; Reasoning Under Uncertainty; Agents; Natural Langauge Processing; Robotics; Speech Recognition and Synthesis; Perception and Computer Vision. Networking and Communication NC Networked Applications; Reliable Data Delivery; Routing and Forwarding; Local and Wide Area Networks; Reource Allocation; Mobility; Social Networking. Operating Systems OS Operating System Organization; Concurrency; Scheduling and Dispatch; Memory Management; Security and Protection; Virtual Machines; Device Management; File Systems;
  • 4.
    Realtime and EmbeddedSystems; Fault Tolerance; System Performance and Evaluation. Platform-Based Development PBD Web Platforms; Mobile Platforms; Industrial Platforms; game Platforms. Parallel and Distributed Computing PD Parallel Decomposition; Communication and Coordination; Parallel Algorithms, Analysis, and Programming; Parallel Architecture; Parallel Performance; Distributed Systems; Cloud Computing; Formal Models and Semantics. Programming Languages PL Object Oriented Programming; Functional Programming; Event-Driven and Reactive Programming; Type Systems; Program Representation; Language Translation and Execution; Syntax Analysis; Semantic Analysis; Code Generation; Runtime Systems; Static Analysis; Concurrency and Parallelism; Type Systems; Formal Semantics; Language Pragmatics; Logic Programming. Software Development Fundamentals SDF Algorithms and Design; Fundamental Programming Concepts; Fundamental Data Structures; Development Methods. Software Engineering SE Software Processes; Project Management; Tools and Environments; Requirements Engineering; Software Design; Software
  • 5.
    Construction; Software Verificationand Validation; Software Evolution; Software Reliability;Formal Methods. Systems Fundamentals SF Computational Paradigms; Cross-Layer Communications; State and State Machines; Parallelism; Evaluation; Resource Allocation and Scheduling; Proximity; Virtualization and Isolation; Reliability through Redundancy; Quantitative Evaluation. Social Issues and Professional Practice SP Social Context; Analytic Tools; Professional Ethics; Intellectual Property; Privacy and Civil Liberties; Professional Communication; Sustainability; History; Economics of Computing; Security Policies, Law, and Crime. We study data structures and algorithms because:  Computing systems are concerned with the storage and retrieval of information.  For systems to be economical the data must be organized (into data structures) in such a way as to support efficient manipulation (by algorithms).  Choosing the wrong algorithms and data structures makes a program slow at best and unmaintainable and insecure at worst. Topics
  • 6.
    While it's possibleto study algorithms and data structures exclusively at a theoretical level, we often study them together with introductory software engineering concepts; the major topics, grouping them roughly as follows:  Data: data types, abstract data types, data structures (arrays and links), classes, APIs, collections, OO.  Algorithms: What is an algorithm?, algorithmic patterns and paradigms (recursion, backtracking, search, etc.), complexity analysis.  Software development: abstraction, efficiency, robustness, classification, inheritance and composition, polymorphism, software life cycle, methodology, modeling languages, design patterns, and testing. Data Types During the design of a system, before the algorithms are written and before the data structures are sketched out, we must first identify the kinds of objects we need and how they should behave. A type is a collection of values, like [Math Processing Error] [Math Processing Error] [Math Processing Error] [Math Processing Error] [Math Processing Error] [Math Processing Error]
  • 7.
    [Math Processing Error] [MathProcessing Error] [Math Processing Error] [Math Processing Error] [Math Processing Error] [Math Processing Error] [Math Processing Error] [Math Processing Error] A data type is a type together with operations. An abstract data type (ADT) is a data type in which the internal structure and internal workings of the objects of the type are unknown to users of the type. Users can only see the effects of the operations. Types and operations can be given in pictures...
  • 8.
    ... or withsignatures CLUBS : Suit HEARTS : Suit DIAMONDS : Suit SPADES : Suit ACE : Rank TWO : Rank THREE : Rank FOUR : Rank FIVE : Rank SIX : Rank SEVEN : Rank EIGHT : Rank NINE : Rank TEN : Rank JACK : Rank QUEEN : Rank KING : Rank makeCard : Suit × Rank → Card getSuit : Card → Suit getRank : Card → Rank newDeck : Deck positionOf : Deck × Card → int cardAt : Deck × int → Card topCard : Deck → Card shuffle : Deck → Deck split : Deck → Deck
  • 9.
    Exercise: Draw apicture, and enumerate operations, for the type of integer lists. An integer list is a sequence of zero or more integers with operations such as obtaining the length, adding an item to a given position, removing the nth item, reversing, finding the position of a given item, etc. Collections A collection is a group of elements treated a single object. Most ADTs seem to be collection types. Collection types can be  Homogeneous or heterogeneous  Unordered or ordered  Linear, Hierarchical, Networked, or Non-Positional The latter classification looks like this: Classification Topology Examples SET (NON- POSITIONAL) Unrelated items, such as people in a class or a hand of cards. SEQUENCE (LINEAR) One-to-one relationship, as in a printer queue, ring LAN, or a line at the deli.
  • 10.
    TREE (HIERARCHICAL) One-to-many relationship, asin a family tree, organization chart, expression tree, or folders and files in a file system. GRAPH (NETWORK) Many-to-many relationship, as in a road map, schedule, belief network, or communication net. Operations common to most collections include  Creation and destruction  Adding, removing, replacing, and searching for elements  Getting the size, or figuring out if it is empty or full  Enumeration (traversal)  Equality testing  Cloning  Serializing Data Structures A data structure is an arrangement of data for the purpose of being able to store and retrieve information. Usually a data structure is some physical representation of an ADT.
  • 11.
    Example: A list(data type) can be represented by an array (data structure) or by attaching a link from each element to its successor (another kind of data structure). Building Blocks Data structures are built with  Records (a.k.a. structs, classes, hashes), which group entities into a single unit with named fields  Arrays, which store fixed size entities into indexed, contiguous slots in memory  Links (a.k.a. references or pointers) Choosing the Right Data Structure The data structure chosen for a data type must meet the resource constraints of the problem. Figure out the constraints before you start coding! Example: In an accounting system,  Creation and deletion of a new account can be slow  Lookup and update of the balance of a given account must be extremely fast. These are the extremely important questions you must ask to determine acceptable tradeoffs:
  • 12.
     Can thedata structure be completely filled at the beginning, or will there be insertions intermingled with deletions, lookups, updates, and other operations?  Can items be deleted from the structure?  Will the items always be processed in a well-defined order, or will random-access have to be supported? The Classics History is important. Over time, many data structures have become well-known (for variaous) reasons. Some of these are:  The Skip List  The Cartesian Tree  The Judy Array  ... Wait, I can't make up a great list, but Wikipedia can. Check out Wikpedia's List of Data Structures. Algorithms We design algorithms to carry out the desired behavior of a software system. Good programmers tend to use stepwise refinement when developing algorithms, and understand the tradeoffs between different algorithms for the same problem. Classifying Algorithms
  • 13.
    There are severalclassification schemes for algorithms:  By problem domain: numeric, text processing (matching, compression, sequencing), cryptology, fair division, sorting, searching, computational geometry, networks (routing, connectivity, flow, span), computer vision, machine learning.  By design strategy: divide and conquer, greedy, algebraic transformation, dynamic programming, linear programming, brute force (exhaustive search), backtracking,branch and bound, heuristic search, genetic algorithms, simulated annealing, particle swarm, Las Vegas, Monte Carlo, quantum algorithms.  By complexity: constant, logarithmic, linear, linearithmic, quadratic, cubic, polynomial, exponential.  Around the central data structures and their behavior: whether (1) organizational: set, linear, hierachical, or network; or (2) implementational: hashtables, skip lists, red-black trees, a-b trees, cartesian trees, neural networks, Bayesian networks, swarms.  By one of many implementation dimensions: sequential vs. parallel, recursive vs. iterative, deterministic vs. non-deterministic, exact vs. approximate, .... Kinds of Problems This classification has proved useful: Problem Type Description
  • 14.
    Decision Given [MathProcessing Error] and [Math Processing Error], is it true that [Math Processing Error]? Functional Given [Math Processing Error] and [Math Processing Error], find [Math Processing Error] Constraint Given [Math Processing Error] and [Math Processing Error], find an [Math Processing Error] such that [Math Processing Error] Optimization Given [Math Processing Error], find the [Math Processing Error] such that [Math Processing Error] is less than or equal to [Math Processing Error] for all [Math Processing Error]. Choosing the Right Algorithm Choosing the right algorithm is both  an art, because cleverness, insight, ingenuity, and dumb luck are often required to find efficient algorithms for certain problems  a science, because principles of algorithm analysis and widely applicable algorithmic patterns have been developed over time for you to use. The Classics History is important. Over time, many algorithms have become well- known (for variaous) reasons. Some of these are:
  • 15.
     Euclid's GCD The Fast Fourier Transform  Miller-Rabin Primality Test  ... Wait, I can't make up a great list, but Wikipedia can. Check out Wikpedia's List of Algorithms.