Basic terminologies & asymptotic notations

Data Structures
Basic Terminologies & Asymptotic Notations

Data Structures
“Clever” ways to organize information in order to enable efficient computation
– What do we mean by clever?
– What do we mean by efficient?

Picking the best Data Structure for the job
• The data structure you pick needs to support the operations you need
• Ideally it supports the operations you will use most often in an efficient manner
• Examples of operations:
– A List with operations insert and delete
– A Stack with operations push and pop

Terminology
• Abstract Data Type (ADT)
– Mathematical description of an object with a set of operations on the object. Useful building block.
• Algorithm
– A high-level, language-independent description of a step-by-step process
• Data structure
– A specific family of algorithms for implementing an abstract data type.
• Implementation of data structure
– A specific implementation in a specific language

Terminology
• Data
Data refers to a value or a set of values,
e.g. marks obtained by the students.
• Data type
A data type is a classification identifying one of various types of data, such as floating-point, integer, or Boolean, that determines the possible values for that type, the operations that can be done on values of that type, and the way values of that type can be stored.

Terminology
• Primitive data type
These are basic data types that the programming language provides with built-in support. They are native to the language and are typically supported directly by the machine.
• Variable
A variable is a symbolic name given to some known or unknown quantity or information, so that the name can be used independently of the information it represents.

Terminology
• Record
A collection of related data items is known as a record. The elements of a record are usually called fields or members. Records are distinguished from arrays by the fact that their number of fields is typically fixed, each field has a name, and each field may have a different type.
• Program
A sequence of instructions that a computer can interpret and execute.
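As a small illustration of the record definition, here is a sketch in C++. The type `StudentRecord`, its field names, and the helper `marksOf` are hypothetical and chosen only to echo the "marks obtained by the students" example; they are not part of the original slides.

```cpp
#include <string>

// Hypothetical record: a fixed set of named fields, each with its own type.
struct StudentRecord {
    std::string name;   // field of string type
    int rollNumber;     // field of integer type
    double marks;       // field of floating-point type
};

// Fields are accessed by name, not by position as in an array.
double marksOf(const StudentRecord& r) {
    return r.marks;
}
```

An array, by contrast, would hold many values of one type and index them by position.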

Terminology examples
• A stack is an abstract data type supporting push, pop and isEmpty operations
• A stack data structure could use an array, a linked list, or anything that can hold data
• One stack implementation is java.util.Stack; another is java.util.LinkedList
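The distinction between the ADT and the data structure can be sketched in C++ (the slides name Java classes; the C++ names `IntStack`, `ArrayStack`, and `LinkedStack` below are hypothetical and used only for illustration). The abstract class states the operations; the two subclasses are different data structures that provide them.

```cpp
#include <memory>
#include <utility>
#include <vector>

// The ADT: only the operations are specified, not the storage.
class IntStack {
public:
    virtual ~IntStack() = default;
    virtual void push(int value) = 0;
    virtual int pop() = 0;              // precondition: !isEmpty()
    virtual bool isEmpty() const = 0;
};

// Data structure 1: a growable array; the top is the last element.
class ArrayStack : public IntStack {
public:
    void push(int value) override { items_.push_back(value); }
    int pop() override {
        int top = items_.back();
        items_.pop_back();
        return top;
    }
    bool isEmpty() const override { return items_.empty(); }
private:
    std::vector<int> items_;
};

// Data structure 2: a singly linked list; the top is the head node.
class LinkedStack : public IntStack {
public:
    void push(int value) override {
        auto node = std::make_unique<Node>();
        node->value = value;
        node->next = std::move(head_);   // old head becomes second node
        head_ = std::move(node);
    }
    int pop() override {
        int top = head_->value;
        head_ = std::move(head_->next);  // unlink and free the old head
        return top;
    }
    bool isEmpty() const override { return head_ == nullptr; }
private:
    struct Node {
        int value = 0;
        std::unique_ptr<Node> next;
    };
    std::unique_ptr<Node> head_;
};
```

Code written against `IntStack` works with either implementation, which is the point of separating the ADT from the data structure.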

Concepts vs. Mechanisms
Abstract:
• Pseudocode
• Algorithm
– A sequence of high-level, language-independent operations, which may act upon an abstracted view of data.
• Abstract Data Type (ADT)
– A mathematical description of an object and the set of operations on the object.
Concrete:
• Specific programming language
• Program
– A sequence of operations in a specific programming language, which may act upon real data in the form of numbers, images, sound, etc.
• Data structure
– A specific way in which a program’s data is represented, which reflects the programmer’s design choices/goals.

Why So Many Data Structures?
Ideal data structure:
“fast”, “elegant”, memory efficient
Generates tensions:
– time vs. space
– performance vs. elegance
– generality vs. simplicity
– one operation’s performance vs. another’s
The study of data structures is the study of tradeoffs. That’s why we have so many of them!

Data Structures
Asymptotic Analysis

Algorithm Analysis: Why?
• Correctness:
– Does the algorithm do what is intended?
• Performance:
– What is the running time of the algorithm?
– How much storage does it consume?
• Different algorithms may be correct
– Which should I use?

Recursive algorithm for sum
• Write a recursive function to find the sum of the first n integers stored in array v.
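One possible answer to this exercise, as a minimal sketch: it assumes 0-based indexing and that the sum of zero elements is 0, which matches the base case sum(v,0) = 0 used in the correctness proof in these slides.

```cpp
// Returns v[0] + v[1] + ... + v[n-1]; the sum of zero elements is 0.
int sum(const int v[], int n) {
    if (n == 0) {
        return 0;                     // base case: nothing to add
    }
    return v[n - 1] + sum(v, n - 1);  // last element plus the sum of the first n-1
}
```

For an array `{3, 1, 4, 1, 5}`, calling `sum(v, 3)` adds only the first three elements.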

Proof by Induction
• Basis Step: The algorithm is correct for a base case or two by inspection.
• Inductive Hypothesis (n=k): Assume that the algorithm works correctly for the first k cases.
• Inductive Step (n=k+1): Given the hypothesis above, show that the k+1 case will be calculated correctly.

Program Correctness by Induction
• Basis Step:
sum(v,0) = 0.
• Inductive Hypothesis (n=k):
Assume sum(v,k) correctly returns the sum of the first k elements of v, i.e. v[0]+v[1]+…+v[k-1]
• Inductive Step (n=k+1):
sum(v,k+1) returns
v[k]+sum(v,k) = (by inductive hyp.)
v[k]+(v[0]+v[1]+…+v[k-1]) =
v[0]+v[1]+…+v[k-1]+v[k]

Algorithms vs Programs
• Proving correctness of an algorithm is very important
– a well-designed algorithm is guaranteed to work correctly and its performance can be estimated
• Proving correctness of a program (an implementation) is fraught with weird bugs
– Abstract Data Types are a way to bridge the gap between mathematical algorithms and programs

Comparing Two Algorithms
GOAL: Sort a list of names
“I’ll buy a faster CPU”
“I’ll use C++ instead of Java – wicked fast!”
“Ooh look, the –O4 flag!”
“Who cares how I do it, I’ll add more memory!”
“Can’t I just get the data pre-sorted??”

Comparing Two Algorithms
• What we want:
– Rough Estimate
– Ignores Details
• Really, independent of details
– Coding tricks, CPU speed, compiler optimizations, …
– These would help any algorithms equally
– Don’t just care about running time – not a good enough measure

Big-O Analysis
• Ignores “details”
• What details?
– CPU speed
– Programming language used
– Amount of memory
– Compiler
– Order of input
– Size of input … sorta.

Analysis of Algorithms
• Efficiency measure
– how long the program runs: time complexity
– how much memory it uses: space complexity
• Why analyze at all?
– Decide what algorithm to implement before actually doing it
– Given code, get a sense for where bottlenecks must be, without actually measuring it

Asymptotic Analysis
• Complexity as a function of input size n
$T(n) = 4n + 5$
$T(n) = 0.5\,n \log n - 2n + 7$
$T(n) = 2^n + n^3 + 3n$
• What happens as n grows?

Why Asymptotic Analysis?
• Most algorithms are fast for small n
– Time difference too small to be noticeable
– External things dominate (OS, disk I/O, …)
• BUT n is often large in practice
– Databases, internet, graphics, …
• Difference really shows up as n grows!

Exercise - Searching
bool ArrayFind( int array[], int n, int key ) {
    // Insert your algorithm here
}
Example array: 2 3 5 16 37 50 73 75 126
What algorithm would you choose to implement this code snippet?

Analyzing Code
Basic Java operations: Constant time
Consecutive statements: Sum of times
Conditionals: Larger branch plus test
Loops: Sum of iterations
Function calls: Cost of function body
Recursive functions: Solve recurrence relation
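To see how these rules combine, here is a small annotated sketch. The function `countNegatives` is a hypothetical example, not taken from the slides, and the per-line costs are rough counts where "1" stands for one constant-time step.

```cpp
// Counts how many elements of array[0..n-1] are negative.
int countNegatives(const int array[], int n) {
    int count = 0;                  // basic operation: 1
    for (int i = 0; i < n; i++) {   // loop: sum of the costs of n iterations
        if (array[i] < 0) {         // conditional: test (1) plus larger branch (1)
            count++;
        }
    }
    return count;                   // basic operation: 1
}
// Consecutive statements add up: about 1 + n * (constant) + 1 steps, which is linear in n.
```

The exact constant per iteration depends on what is counted as a step; the rules only fix the shape of the total.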

Linear Search Analysis
bool LinearArrayFind( int array[], int n, int key ) {
    for( int i = 0; i < n; i++ ) {
        if( array[i] == key ) {
            // Found it!
            return true;
        }
    }
    return false;
}
Best Case:
Worst Case:

Binary Search Analysis
bool BinArrayFind( int array[], int low,
                   int high, int key ) {
    // The subarray is empty
    if( low > high ) return false;
    // Search this subarray recursively
    int mid = (high + low) / 2;
    if( key == array[mid] ) {
        return true;
    } else if( key < array[mid] ) {
        return BinArrayFind( array, low,
                             mid-1, key );
    } else {
        return BinArrayFind( array, mid+1,
                             high, key );
    }
}
Best case:
Worst case:

Solving Recurrence Relations
1. Determine the recurrence relation. What is/are the base case(s)?
2. “Expand” the original relation to find an equivalent general expression in terms of the number of expansions.
3. Find a closed-form expression by setting the number of expansions to a value which reduces the problem to a base case.
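As an illustration, the three steps can be applied to recursive binary search. The constants are assumptions: 4 steps per call and a base-case cost of T(1) = 4, chosen so that the result agrees with the worst-case count 4 log n + 4 quoted for binary search in these slides. For simplicity n is taken to be a power of 2.

```latex
\begin{align*}
\text{Step 1:}\quad & T(1) = 4, \qquad T(n) = 4 + T(n/2) \\
\text{Step 2:}\quad & T(n) = 4 + \bigl(4 + T(n/4)\bigr) = 8 + T(n/4) = \dots = 4k + T(n/2^k) \\
\text{Step 3:}\quad & \text{set } k = \log_2 n \text{ so that } n/2^k = 1: \\
                    & T(n) = 4\log_2 n + T(1) = 4\log_2 n + 4
\end{align*}
```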


Linear Search vs Binary Search
             Linear Search   Binary Search
Best Case    4 at [0]        4 at [middle]
Worst Case   3n+2            4 log n + 4
So … which algorithm is better?
What tradeoffs can you make?
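To get a feel for the worst-case row, the two formulas can be evaluated directly. This is a sketch that assumes the logarithm is base 2; the function names `linearWorst` and `binaryWorst` are hypothetical.

```cpp
#include <cmath>

// Worst-case step counts as given in the table: 3n + 2 and 4 log2(n) + 4.
double linearWorst(double n) { return 3.0 * n + 2.0; }
double binaryWorst(double n) { return 4.0 * std::log2(n) + 4.0; }
```

For n = 1024 these give 3074 steps for linear search and 44 for binary search. One tradeoff to keep in mind is that binary search only works when the array is already sorted.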

Fast Computer vs. Slow Computer

Fast Computer vs. Smart Programmer (round 1)

Fast Computer vs. Smart Programmer (round 2)

Asymptotic Analysis
• Asymptotic analysis looks at the order of the running time of the algorithm
– A valuable tool when the input gets “large”
– Ignores the effects of different machines or different implementations of an algorithm
• Intuitively, to find the asymptotic runtime, throw away the constants and low-order terms
– Linear search is $T(n) = 3n + 2 \in O(n)$
– Binary search is $T(n) = 4 \log_2 n + 4 \in O(\log n)$
Remember: the fastest algorithm has the slowest growing function for its runtime

Asymptotic Analysis
• Eliminate low-order terms
– $4n + 5 \Rightarrow$
– $0.5\,n \log n + 2n + 7 \Rightarrow$
– $n^3 + 2^n + 3n \Rightarrow$
• Eliminate coefficients
– $4n \Rightarrow$
– $0.5\,n \log n \Rightarrow$
– $n \log n^2 \Rightarrow$
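For reference, one way to complete these exercises is sketched below. In the third line the exponential term dominates the polynomial ones, and in the last line the identity $\log n^2 = 2 \log n$ turns the exponent into a coefficient that is then dropped.

```latex
\begin{align*}
4n + 5 &\Rightarrow 4n \\
0.5\,n \log n + 2n + 7 &\Rightarrow 0.5\,n \log n \\
n^3 + 2^n + 3n &\Rightarrow 2^n \\[4pt]
4n &\Rightarrow n \\
0.5\,n \log n &\Rightarrow n \log n \\
n \log n^2 = 2\,n \log n &\Rightarrow n \log n
\end{align*}
```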

Properties of Logs
• $\log AB = \log A + \log B$
• Proof:
$A = 2^{\log_2 A}, \quad B = 2^{\log_2 B}$
$AB = 2^{\log_2 A} \cdot 2^{\log_2 B} = 2^{\log_2 A + \log_2 B}$
$\therefore \log_2 AB = \log_2 A + \log_2 B$
• Similarly:
– $\log(A/B) = \log A - \log B$
– $\log(A^B) = B \log A$
• Any log is equivalent to log-base-2 (up to a constant factor)

Order Notation: Intuition
Although not yet apparent, as n gets “sufficiently large”, f(n) will be “greater than or equal to” g(n)
$f(n) = n^3 + 2n^2$
$g(n) = 100n^2 + 1000$
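How large is "sufficiently large" for this pair? A brute-force scan gives the answer. This is a sketch; the helper name `firstCrossover` is hypothetical, and `f` and `g` are the two functions from the slide.

```cpp
// The two functions from the slide.
long long f(long long n) { return n * n * n + 2 * n * n; }
long long g(long long n) { return 100 * n * n + 1000; }

// Scans upward from n = 1 and returns the first n with f(n) >= g(n).
// Beyond that point f stays ahead, since f(n) - g(n) = n^2 (n - 98) - 1000
// keeps increasing once n exceeds 98.
long long firstCrossover() {
    long long n = 1;
    while (f(n) < g(n)) {
        n++;
    }
    return n;
}
```

With these definitions the scan stops at n = 99: g is larger for every n up to 98, and f is larger from 99 onward.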

Definition of Order Notation
• Upper bound: T(n) = O(f(n)) Big-O
There exist positive constants c and n’ such that
T(n) ≤ c f(n) for all n ≥ n’
• Lower bound: T(n) = Ω(g(n)) Omega
There exist positive constants c and n’ such that
T(n) ≥ c g(n) for all n ≥ n’
• Tight bound: T(n) = θ(f(n)) Theta
When both hold:
T(n) = O(f(n))
T(n) = Ω(f(n))

Definition of Order Notation
O( f(n) ): a set or class of functions
g(n) ∈ O( f(n) ) iff there exist positive constants c and n0 such that:
g(n) ≤ c f(n) for all n ≥ n0
Example:
$100n^2 + 1000 \le 5\,(n^3 + 2n^2)$ for all n ≥ 19
So g(n) ∈ O( f(n) )
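The witnesses c = 5 and n0 = 19 from this example can be sanity-checked numerically. The sketch below uses hypothetical helper names (`gExample`, `fExample`, `boundHolds`); a finite scan is only a check over the tested range, not a proof for all n.

```cpp
// g(n) = 100n^2 + 1000 and f(n) = n^3 + 2n^2, as in the example.
long long gExample(long long n) { return 100 * n * n + 1000; }
long long fExample(long long n) { return n * n * n + 2 * n * n; }

// Returns true if g(n) <= c * f(n) for every n in [n0, upTo].
bool boundHolds(long long c, long long n0, long long upTo) {
    for (long long n = n0; n <= upTo; n++) {
        if (gExample(n) > c * fExample(n)) {
            return false;
        }
    }
    return true;
}
```

Starting the scan at 18 instead of 19 fails, because g(18) = 33400 exceeds 5 f(18) = 32400, so 19 is the smallest n0 that works with c = 5.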

Order Notation: Example
$100n^2 + 1000 \le 5\,(n^3 + 2n^2)$ for all n ≥ 19
So g(n) ∈ O( f(n) )

Some Notes on Notation
• Sometimes you’ll see
g(n) = O( f(n) )
• This is equivalent to
g(n) ∈ O( f(n) )
• What about the reverse?
O( f(n) ) = g(n)

Big-O: Common Names
– constant: $O(1)$
– logarithmic: $O(\log n)$ ($\log_k n,\ \log n^2 \in O(\log n)$)
– linear: $O(n)$
– log-linear: $O(n \log n)$
– quadratic: $O(n^2)$
– cubic: $O(n^3)$
– polynomial: $O(n^k)$ (k is a constant)
– exponential: $O(c^n)$ (c is a constant > 1)

Meet the Family
• O( f(n) ) is the set of all functions asymptotically less than or equal to f(n)
– o( f(n) ) is the set of all functions asymptotically strictly less than f(n)
• Ω( f(n) ) is the set of all functions asymptotically greater than or equal to f(n)
– ω( f(n) ) is the set of all functions asymptotically strictly greater than f(n)
• θ( f(n) ) is the set of all functions asymptotically equal to f(n)

Meet the Family, Formally
• g(n) ∈ O( f(n) ) iff
There exist positive constants c and n0 such that g(n) ≤ c f(n) for all n ≥ n0
– g(n) ∈ o( f(n) ) iff
For every c > 0 there exists an n0 such that g(n) < c f(n) for all n ≥ n0
Equivalent to: $\lim_{n\to\infty} g(n)/f(n) = 0$
• g(n) ∈ Ω( f(n) ) iff
There exist positive constants c and n0 such that g(n) ≥ c f(n) for all n ≥ n0
– g(n) ∈ ω( f(n) ) iff
For every c > 0 there exists an n0 such that g(n) > c f(n) for all n ≥ n0
Equivalent to: $\lim_{n\to\infty} g(n)/f(n) = \infty$
• g(n) ∈ θ( f(n) ) iff
g(n) ∈ O( f(n) ) and g(n) ∈ Ω( f(n) )

Big-Omega et al. Intuitively
Asymptotic Notation   Mathematics Relation
O                     ≤
Ω                     ≥
θ                     =
o                     <
ω                     >

Pros and Cons of Asymptotic Analysis

Perspective: Kinds of Analysis
• Running time may depend on actual data input, not just length of input
• Distinguish
– Worst Case
• Your worst enemy is choosing input
– Best Case
– Average Case
• Assumes some probabilistic distribution of inputs
– Amortized
• Average time over many operations

Types of Analysis
Two orthogonal axes:
– Bound Flavor
• Upper bound (O, o)
• Lower bound (Ω, ω)
• Asymptotically tight (θ)
– Analysis Case
• Worst Case (Adversary)
• Average Case
• Best Case
• Amortized

$16n^3 \log_8(10n^2) + 100n^2 = O(n^3 \log n)$
• Eliminate low-order terms
• Eliminate constant coefficients
$16n^3 \log_8(10n^2) + 100n^2$
$16n^3 \log_8(10n^2)$
$n^3 \log_8(10n^2)$
$n^3 \left(\log_8 10 + \log_8 n^2\right)$
$n^3 \log_8 10 + n^3 \log_8 n^2$
$n^3 \log_8 n^2$
$2n^3 \log_8 n$
$n^3 \log_8 n$
$n^3 \log_8(2) \log n$
$n^3 \log(n)/3$
$n^3 \log n$
