21. Space complexity. [Table: for input size n = 10, 100, 1K, 10K, 100K, O(1) space stays at a constant c, while O(n) space grows as 10, 100, 1,000, 10,000, 100,000.]
31. Multi-tape Turing Machines: Informal Description. We add a finite number of tapes, each with its own head, all driven by a single control. [Figure: a control unit connected to Tape 1 (a1 a2 …, head 1) and Tape 2 (a1 a2 …, head 2).]
34. Solve by 2-tape Turing Machine M2. [Figure: two configurations of M2, before and after a step. Tape 1 holds a b a b and Tape 2 holds a b a; the state changes from s to s'.]
35. Using States to “Remember” Information. Equivalent configuration in a Turing Machine M: a b a b a b # a b
39. Definition: A Non-Deterministic TM is a 7-tuple T = (Q, Σ, Γ, δ, q0, q_accept, q_reject), where: Q is a finite set of states; Σ is the input alphabet, where the blank symbol ␣ ∉ Σ; Γ is the tape alphabet, where ␣ ∈ Γ and Σ ⊆ Γ; δ : Q × Γ → Pow(Q × Γ × {L, R}) is the transition function; q0 ∈ Q is the start state; q_accept ∈ Q is the accept state; q_reject ∈ Q is the reject state, and q_reject ≠ q_accept.
48. A language is in NP if and only if there exist polynomial-length certificates for membership in the language that can be checked in polynomial time. SAT is in NP because a satisfying assignment is a polynomial-length certificate that a formula is satisfiable.
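To make the certificate idea concrete, here is a minimal sketch of a polynomial-time checker for SAT certificates. The clause encoding (signed integers, where 3 means x3 and -3 means NOT x3) and the function name are assumptions made for this illustration, not something the slides define.

```python
def verify_sat_certificate(clauses, assignment):
    """Check a claimed satisfying assignment in time linear in the formula size.

    clauses: list of clauses, each a list of nonzero ints (assumed encoding).
    assignment: dict mapping variable number -> bool (the certificate).
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


# (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1 OR NOT x3)
formula = [[1, -2], [2, 3], [-1, -3]]
good_certificate = {1: True, 2: True, 3: False}
bad_certificate = {1: True, 2: False, 3: True}

print(verify_sat_certificate(formula, good_certificate))  # True
print(verify_sat_certificate(formula, bad_certificate))   # False
```

Checking a certificate is easy; the open question is whether finding one can also be done in polynomial time.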
51. The World by Karp. [Figure: nested complexity classes.] In P: 2-SAT, Shortest-Path, Minimum-Cut, Arc-Cover. In NP-Complete (NPC): SAT, Clique, Hamiltonian-Circuit, Chromatic Number, ... NP-Hard: Equivalence of Regular Expressions, Equivalence of ND Finite Automata, Context-Sensitive Recognition. In NP with status unknown (?): Linear-Inequalities, Graph-Isomorphism, Non-Primes.
92. BLAST Step 1: Build the hash table for Sequence A (3-tuple example). For DNA sequences: Seq. A = AGATCGAT (positions 1 to 8). The hash table ranges over all 3-tuples AAA, AAC, ..., TTT; its non-empty entries are AGA: 1; ATC: 3; CGA: 5; GAT: 2, 6; TCG: 4. For protein sequences: Seq. A = ELVIS. For each 3-tuple w of A (ELV, LVI, VIS), add xyz to the hash table if Score(xyz, w) ≥ T.
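The DNA case of Step 1 can be sketched as follows. This is an illustrative sketch rather than BLAST's actual implementation; the function name `build_kmer_index` is hypothetical, and positions are 1-based to match the slide.

```python
from collections import defaultdict


def build_kmer_index(seq, k=3):
    """Map every k-tuple of seq to the list of 1-based positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i + 1)
    return dict(index)


index_a = build_kmer_index("AGATCGAT")
for kmer in sorted(index_a):
    print(kmer, index_a[kmer])
# AGA [1]
# ATC [3]
# CGA [5]
# GAT [2, 6]
# TCG [4]
```

A Python dict stores only the 3-tuples that occur, whereas the slide's table has a slot for every possible 3-tuple.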
94. BLAST Step 2: Scan sequence B for hits. Step 3: Extend hits. Terminate if the score of the extension fades away. (That is, when we reach a segment pair whose score falls a certain distance below the best score found for shorter extensions.) BLAST 2.0 saves the time spent in extension, and considers gapped alignments.
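Steps 2 and 3 can be sketched in the same spirit. The code below is a simplified illustration, not BLAST itself: it extends only to the right, without gaps, and the scoring values (`match`, `mismatch`) and the drop-off threshold `drop` are hypothetical parameters chosen for the example. Positions here are 0-based.

```python
def find_hits(seq_a, seq_b, k=3):
    """Step 2: return (i, j) pairs where the k-tuple at seq_a[i:] equals the one at seq_b[j:]."""
    index = {}
    for i in range(len(seq_a) - k + 1):
        index.setdefault(seq_a[i:i + k], []).append(i)
    hits = []
    for j in range(len(seq_b) - k + 1):
        for i in index.get(seq_b[j:j + k], []):
            hits.append((i, j))
    return hits


def extend_right(seq_a, seq_b, i, j, k=3, match=1, mismatch=-1, drop=2):
    """Step 3 (rightward only): stop once the score falls `drop` below the best so far.

    Returns (best_score, length_of_best_extension).
    """
    score = best = k * match
    length = best_len = k
    while i + length < len(seq_a) and j + length < len(seq_b):
        score += match if seq_a[i + length] == seq_b[j + length] else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score >= drop:
            break
    return best, best_len


seq_a, seq_b = "AGATCGAT", "GATCGGG"
print(find_hits(seq_a, seq_b))           # [(1, 0), (5, 0), (2, 1), (3, 2)]
print(extend_right(seq_a, seq_b, 1, 0))  # (5, 5): GATCG matches, then the score fades
```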
106. An Introduction to Graph Theory: Definitions and Examples. G = (V, E). A graph is either undirected or directed. Terms illustrated: isolated vertex, adjacent vertices, loop, multiple edges. Simple graph: an undirected graph without loops or multiple edges. Degree of a vertex: the number of edges incident to it (indegree and outdegree in a directed graph).
107. Walks from x to y, illustrated on a graph with vertices a, b, c, d, e. Path: no vertex can be repeated (a-b-c-d-e). Trail: no edge can be repeated (a-b-c-d-e-b-d). Walk: no restriction (a-b-d-a-b-c). Closed if x = y. Closed trail: circuit (a-b-c-d-b-e-d-a, drawn in one stroke without lifting the pen). Closed path: cycle (a-b-c-d-a). Length: the number of edges in the path, trail, or walk.
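The definitions above translate directly into a small classifier. This is a sketch for undirected graphs; the function name and the returned labels are choices made for this example, and the input is simply the vertex sequence of the walk.

```python
def classify_walk(vertices):
    """Return (kind, length) for a walk given as a sequence of vertices."""
    edges = [frozenset(pair) for pair in zip(vertices, vertices[1:])]
    closed = len(vertices) > 1 and vertices[0] == vertices[-1]
    # For a closed walk the repeated start/end vertex does not count as a repetition.
    inner = vertices[:-1] if closed else vertices
    distinct_vertices = len(set(inner)) == len(inner)
    distinct_edges = len(set(edges)) == len(edges)
    if distinct_vertices and distinct_edges:
        kind = "cycle" if closed else "path"
    elif distinct_edges:
        kind = "circuit" if closed else "trail"
    else:
        kind = "closed walk" if closed else "walk"
    return kind, len(edges)


print(classify_walk("abcde"))     # ('path', 4)
print(classify_walk("abcdebd"))   # ('trail', 6)
print(classify_walk("abdabc"))    # ('walk', 5)
print(classify_walk("abcda"))     # ('cycle', 4)
print(classify_walk("abcdbeda"))  # ('circuit', 7)
```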
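108. Any walk from a to b can be reduced to a path from a to b: whenever a vertex x is repeated, remove the cycle between its repeated occurrences. Def 11.4: Let G = (V, E) be an undirected graph. We call G connected if there is a path between any two distinct vertices of G. [Figure: two graphs on vertices a, b, c, d, e; the second is disconnected with two components.]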
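Connectivity as defined in Def 11.4 can be tested by a breadth-first search that collects the components. The sketch below uses a hypothetical edge list, since the edges of the slide's figure are not recoverable from the transcript.

```python
from collections import deque


def connected_components(vertices, edges):
    """Return the components of an undirected graph as sorted vertex lists."""
    adjacent = {v: set() for v in vertices}
    for u, v in edges:
        adjacent[u].add(v)
        adjacent[v].add(u)
    seen, components = set(), []
    for start in vertices:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            u = queue.popleft()
            component.append(u)
            for w in adjacent[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(sorted(component))
    return components


def is_connected(vertices, edges):
    return len(connected_components(vertices, edges)) <= 1


# Hypothetical example: a-b-c form one component, d-e another.
example_edges = [("a", "b"), ("b", "c"), ("d", "e")]
print(connected_components("abcde", example_edges))  # [['a', 'b', 'c'], ['d', 'e']]
print(is_connected("abcde", example_edges))          # False
```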
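111. Subgraphs, Complements, and Graph Isomorphism. A subgraph of G = (V, E) with vertex set V1 is a spanning subgraph if V1 = V, and an induced subgraph if it includes all edges of E with both endpoints in V1. [Figure: a graph on vertices a, b, c, d, e with example subgraphs.]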
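112. Subgraphs, Complements, and Graph Isomorphism. Def. 11.11: the complete graph K_n has n vertices and an edge between every pair of distinct vertices. [Figure: K_5 on vertices a, b, c, d, e.] Def. 11.12: the complement of a graph G has the same vertices as G and exactly the edges that are missing from G. [Figure: a graph G on vertices a, b, c, d, e and its complement.]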
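114. Subgraphs, Complements, and Graph Isomorphism. Ex. 11.8: [Figure: two graphs, one on vertices a to j and one on vertices q to z.] The correspondence a-q, b-v, c-u, d-y, e-r, f-w, g-x, h-t, i-z, j-s shows that they are isomorphic. Ex. 11.9: [Figure: two graphs.] One has two vertices of degree 2 and the other has three, so they are not isomorphic. Can you think of an algorithm for testing isomorphism?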
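One possible answer to the closing question is a brute-force test: reject quickly when cheap invariants such as the degree sequence differ (as in Ex. 11.9), and otherwise try every vertex correspondence. The sketch below assumes simple undirected graphs; it takes factorial time in the worst case and is meant only to illustrate the idea. The example graphs are hypothetical.

```python
from itertools import permutations


def degree_sequence(vertices, edges):
    degree = {v: 0 for v in vertices}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(degree.values())


def find_isomorphism(v1, e1, v2, e2):
    """Return a vertex mapping from graph 1 to graph 2, or None if none exists."""
    set1 = {frozenset(e) for e in e1}
    set2 = {frozenset(e) for e in e2}
    if len(v1) != len(v2) or len(set1) != len(set2):
        return None
    if degree_sequence(v1, e1) != degree_sequence(v2, e2):
        return None  # same argument as Ex. 11.9
    for image in permutations(v2):
        mapping = dict(zip(v1, image))
        # With equal edge counts, mapping every edge onto an edge is enough.
        if all(frozenset((mapping[u], mapping[v])) in set2 for u, v in e1):
            return mapping
    return None


path_1 = [("a", "b"), ("b", "c"), ("c", "d")]
path_2 = [("x", "w"), ("w", "z"), ("z", "y")]
star = [("w", "x"), ("w", "y"), ("w", "z")]
print(find_isomorphism("abcd", path_1, "wxyz", path_2) is not None)  # True
print(find_isomorphism("abcd", path_1, "wxyz", star))                # None
```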
124. Weighted Bipartite Matching. Given a weighted bipartite graph, find a matching with maximum total weight. It is not necessarily a maximum-size matching. [Figure: a bipartite graph with vertex classes A and B.]
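The difference between maximum weight and maximum size shows up even on a tiny instance. The sketch below is an exhaustive search with memoization, not an efficient matching algorithm: it is exponential in the size of the right-hand side and suitable only for small examples. The weight-matrix encoding (with `None` for a missing edge) is an assumption made for this illustration.

```python
from functools import lru_cache


def max_weight_matching(weights):
    """weights[i][j] is the weight of edge (left i, right j), or None if absent.

    Returns (total_weight, pairs), where pairs is a tuple of (i, j) edges.
    """
    n_left = len(weights)
    n_right = len(weights[0]) if weights else 0

    @lru_cache(maxsize=None)
    def best(i, used):
        if i == n_left:
            return 0, ()
        total, pairs = best(i + 1, used)  # option: leave left vertex i unmatched
        for j in range(n_right):
            w = weights[i][j]
            if w is not None and not used & (1 << j):
                sub_total, sub_pairs = best(i + 1, used | (1 << j))
                if w + sub_total > total:
                    total, pairs = w + sub_total, ((i, j),) + sub_pairs
        return total, pairs

    return best(0, 0)


# Left 0 - right 0 has weight 1, left 1 - right 0 has weight 10, left 1 - right 1 has weight 1.
example_weights = [[1, None], [10, 1]]
print(max_weight_matching(example_weights))  # (10, ((1, 0),))
```

In this example the maximum-size matching uses two edges of total weight 2, while the maximum-weight matching is the single edge of weight 10.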
144. Karger’s Min-Cut Algorithm. Example: [Figure: a graph on vertices A, B, C, D, E, F, G.]
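A minimal sketch of Karger's random contraction follows. The edge list is hypothetical (the edges of the slide's example graph are not recoverable from the transcript), the graph is assumed to be connected, and the number of trials and the seed are arbitrary choices. Contraction is implemented by processing the edges in a random order and merging endpoints with a union-find structure, which is a standard equivalent formulation of repeatedly contracting a random remaining edge.

```python
import random


def contract_once(edges, rng):
    """One run of random contraction; returns the size of the cut it finds."""
    parent = {}
    for u, v in edges:
        parent.setdefault(u, u)
        parent.setdefault(v, v)

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    remaining = len(parent)
    order = list(edges)
    rng.shuffle(order)
    for u, v in order:
        if remaining == 2:
            break
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # skip edges that became self-loops
            parent[root_u] = root_v
            remaining -= 1
    return sum(1 for u, v in edges if find(u) != find(v))


def min_cut(edges, trials=300, seed=0):
    """Repeat the contraction and keep the smallest cut seen."""
    rng = random.Random(seed)
    return min(contract_once(edges, rng) for _ in range(trials))


# Hypothetical graph: triangle A-B-C and a denser part D-E-F-G, joined by two edges.
example_graph = [
    ("A", "B"), ("B", "C"), ("A", "C"),
    ("D", "E"), ("E", "F"), ("F", "G"), ("G", "D"), ("D", "F"),
    ("C", "D"), ("B", "E"),
]
print(min_cut(example_graph))
```

A single run can return a cut that is larger than the minimum, which is why the algorithm is repeated; it never returns a value below the true minimum cut.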
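158. Example of a Decision Tree. Training data attributes: Refund (categorical), MarSt (categorical), TaxInc (continuous), plus the class. Model: a decision tree with splitting attributes Refund, MarSt, TaxInc. Refund = Yes: NO. Refund = No: test MarSt. MarSt = Married: NO. MarSt = Single or Divorced: test TaxInc. TaxInc < 80K: NO. TaxInc > 80K: YES.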
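159. Another Example of Decision Tree. Same attributes: Refund (categorical), MarSt (categorical), TaxInc (continuous), plus the class. MarSt = Married: NO. MarSt = Single or Divorced: test Refund. Refund = Yes: NO. Refund = No: test TaxInc. TaxInc < 80K: NO. TaxInc > 80K: YES. There could be more than one tree that fits the same data!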
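The two trees can be written out as nested conditionals to see that they encode the same classifier. This is a sketch: the function names are hypothetical, and the handling of an income of exactly 80K is an assumption, since the slides only label the branches "< 80K" and "> 80K".

```python
def classify_refund_first(refund, marital_status, taxable_income):
    """Tree from the first slide: Refund at the root."""
    if refund == "Yes":
        return "NO"
    if marital_status == "Married":
        return "NO"
    # Single or Divorced; exactly 80K is treated as NO (assumption).
    return "YES" if taxable_income > 80_000 else "NO"


def classify_marst_first(refund, marital_status, taxable_income):
    """Tree from the second slide: MarSt at the root."""
    if marital_status == "Married":
        return "NO"
    if refund == "Yes":
        return "NO"
    return "YES" if taxable_income > 80_000 else "NO"


print(classify_refund_first("No", "Single", 95_000))   # YES
print(classify_marst_first("No", "Married", 120_000))  # NO
```

Both functions return YES only for a record with no refund, a marital status of Single or Divorced, and taxable income above 80K, which is why either tree fits the same training data.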
184. Suffix Trees. S = MALAYALAM$, positions 1 to 10. Paths from root to leaves represent all suffixes of S. [Figure: the suffix tree of S, with edge labels such as A, LA, M, YALAM$, ALAYALAM$, M$ and $; the leaves, left to right, carry the suffix start positions 6, 2, 8, 4, 7, 3, 1, 9, 5, 10.]
185. Suffix Tree. [Figure: the same suffix tree of MALAYALAM$, with leaves 6, 2, 8, 4, 7, 3, 1, 9, 5, 10.]
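188. Finding a Pattern in a String. Find “ALA”: follow the path from the root that spells ALA; the leaves below that point give the matches. [Figure: the suffix tree of MALAYALAM$ with the path for ALA highlighted; leaves 6, 2, 8, 4, 7, 3, 1, 9, 5, 10.] Two matches, at 6 and 2.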
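189. Edge Encoding. S = MALAYALAM$, positions 1 to 10. Each edge label is stored as a pair (i, j) of positions in S rather than as a string: for example (3, 4) for LA, (5, 10) for YALAM$, (9, 10) for M$, and (10, 10) for $. [Figure: the suffix tree with edges labeled (10, 10), (5, 10), (1, 1), (2, 10), (2, 2), (3, 4), (9, 10); leaves 6, 2, 8, 4, 7, 3, 1, 9, 5, 10.]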
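190. Naïve Suffix Tree Construction. Before starting: why exactly do we need this $, which is not part of the alphabet? Suffixes of S by start position: 1 MALAYALAM$, 2 ALAYALAM$, 3 LAYALAM$, 4 AYALAM$, 5 YALAM$, 6 ALAM$, 7 LAM$, 8 AM$, 9 M$, 10 $.
191. Naïve Suffix Tree Construction. Insert the suffixes one at a time: 1 MALAYALAM$, 2 ALAYALAM$, 3 LAYALAM$, 4 AYALAM$, etc. [Figure: the tree after the first four insertions. MALAYALAM$ leads to leaf 1, LAYALAM$ leads to leaf 3, and a shared edge A branches into LAYALAM$ (leaf 2) and YALAM$ (leaf 4).] Remaining suffixes: 5 YALAM$, 6 ALAM$, 7 LAM$, 8 AM$, 9 M$, 10 $.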
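The naïve construction can be sketched with an uncompressed suffix trie, in which every edge carries a single character. This is a simplification of the suffix tree on the slides (which merges chains of single-child nodes into one edge) and uses quadratic space; the dictionary layout and the `LEAF` marker are choices made for this illustration.

```python
LEAF = "leaf"  # marker key; cannot clash with the single-character edge labels


def build_suffix_trie(text):
    """Insert every suffix of text, character by character; leaves hold 1-based starts."""
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
        node[LEAF] = start + 1
    return root


def find_occurrences(trie, pattern):
    """Walk down the pattern, then collect every leaf below that point."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return []
        node = node[ch]
    found, stack = [], [node]
    while stack:
        current = stack.pop()
        for key, child in current.items():
            if key == LEAF:
                found.append(child)
            else:
                stack.append(child)
    return sorted(found)


trie = build_suffix_trie("MALAYALAM$")
print(find_occurrences(trie, "ALA"))  # [2, 6]
print(find_occurrences(trie, "LAM"))  # [7]
```

The terminator $ guarantees that no suffix is a prefix of another, so each suffix ends at its own leaf; without it, the suffix M (position 9) would end in the middle of the path for MALAYALAM.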
195. Suffix Array: Reducing Space. S = MALAYALAM$, positions 1 to 10. Suffix array: lexicographic ordering of suffixes. Derive the Longest Common Prefix (lcp) array: the lcp is computed for successive pairs. Suffixes 6 and 2 share “ALA”; suffixes 2 and 8 share just “A”. Suffixes as listed on the slide, with start position and lcp with the previous entry: $ (10, -), YALAM$ (5, 0), M$ (9, 0), MALAYALAM$ (1, 1), LAYALAM$ (3, 0), LAM$ (7, 2), AYALAM$ (4, 0), AM$ (8, 1), ALAYALAM$ (2, 1), ALAM$ (6, 3).
196. Example. Text: M A L A Y A L A M $ at positions 1 to 10. Suffix array (the same suffixes read in the opposite order): 6 (ALAM$), 2 (ALAYALAM$), 8 (AM$), 4 (AYALAM$), 7 (LAM$), 3 (LAYALAM$), 1 (MALAYALAM$), 9 (M$), 5 (YALAM$), 10 ($). lcp array: 3 1 1 0 2 0 1 0 0.
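The suffix array and lcp array for MALAYALAM$ can be reproduced with a short sketch. It sorts the suffixes directly, which is simple but far from the fastest construction, and it assumes that $ sorts after every letter; that ordering is inferred from the order in which the slides list the suffixes, not stated by them. Positions are 1-based.

```python
def suffix_key(suffix):
    # Assumption: '$' sorts after all letters. '~' comes after 'A'-'Z' in ASCII.
    return suffix.replace("$", "~")


def build_suffix_array(text):
    """Start positions (1-based) of the suffixes in sorted order."""
    return sorted(range(1, len(text) + 1), key=lambda p: suffix_key(text[p - 1:]))


def common_prefix_length(a, b):
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n


def build_lcp_array(text, suffix_array):
    """lcp of each pair of successive suffixes in the suffix array."""
    return [
        common_prefix_length(text[p - 1:], text[q - 1:])
        for p, q in zip(suffix_array, suffix_array[1:])
    ]


text = "MALAYALAM$"
sa = build_suffix_array(text)
lcp = build_lcp_array(text, sa)
print(sa)   # [6, 2, 8, 4, 7, 3, 1, 9, 5, 10]
print(lcp)  # [3, 1, 1, 0, 2, 0, 1, 0, 0]
```

The suffix array stores only n integers, and together with the lcp array it recovers much of what the suffix tree provides.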
202. Start with a bit array, initially all 0. Each element of S (x1, x2, ...) is hashed k times, and each hash location is set to 1. To check if y is in S, check the k hash locations of y. If a 0 appears, y is not in S. If only 1s appear, conclude that y is in S; this may yield a false positive.
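The structure described on this slide is commonly known as a Bloom filter. Here is a minimal sketch of it; deriving the k hash functions from SHA-256 with an index prefix, and the default sizes, are assumptions made for the example rather than anything the slides prescribe.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [0] * num_bits  # initially all 0

    def _locations(self, item):
        # k hash locations, obtained by salting one hash function with i (assumption).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for location in self._locations(item):
            self.bits[location] = 1

    def might_contain(self, item):
        # A single 0 proves absence; all 1s may be a false positive.
        return all(self.bits[location] for location in self._locations(item))


bloom = BloomFilter()
for element in ("x1", "x2"):
    bloom.add(element)
print(bloom.might_contain("x1"))  # True
print(bloom.might_contain("x2"))  # True
```

Elements that were added are always reported as present; an element that was never added is usually reported as absent, but can collide with set bits and be reported as present.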
224. Range trees. The query time: querying a 1D tree requires O(log n + k) time. How many 1D trees (associated structures) do we need to query? At most 2 × (height of T) = 2 log n. Each 1D query requires O(log n + k') time, where k' is the number of points it reports. The answer to the query is the union of the answers to the subqueries, so k = ∑k'. Query time = O(log² n + k). [Figure: the query range [x, x'] on the tree.]
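The query structure analysed above can be sketched as a small 2D range tree. This is an illustrative simplification under stated assumptions: the associated structure of each node is a list sorted by y (queried with binary search) rather than a 1D tree, the class and function names are hypothetical, and the input must be non-empty.

```python
from bisect import bisect_left, bisect_right


class RangeTreeNode:
    def __init__(self, points_by_x):
        self.x_min = points_by_x[0][0]
        self.x_max = points_by_x[-1][0]
        # Associated structure: the node's points sorted by y.
        self.by_y = sorted(points_by_x, key=lambda p: p[1])
        self.ys = [p[1] for p in self.by_y]
        self.left = self.right = None
        if len(points_by_x) > 1:
            mid = len(points_by_x) // 2
            self.left = RangeTreeNode(points_by_x[:mid])
            self.right = RangeTreeNode(points_by_x[mid:])


def build_range_tree(points):
    return RangeTreeNode(sorted(points))


def query(node, x1, x2, y1, y2):
    """Report all points with x1 <= x <= x2 and y1 <= y <= y2."""
    if node is None or node.x_max < x1 or node.x_min > x2:
        return []
    if x1 <= node.x_min and node.x_max <= x2:
        # Whole subtree lies inside the x-range: one 1D query on y.
        lo = bisect_left(node.ys, y1)
        hi = bisect_right(node.ys, y2)
        return node.by_y[lo:hi]
    return query(node.left, x1, x2, y1, y2) + query(node.right, x1, x2, y1, y2)


points = [(1, 5), (2, 3), (3, 8), (4, 1), (5, 6), (6, 2), (7, 7), (8, 4)]
tree = build_range_tree(points)
print(sorted(query(tree, 2, 6, 2, 6)))  # [(2, 3), (5, 6), (6, 2)]
```

Each subtree that lies fully inside the x-range is answered by one 1D query on its associated structure, which mirrors the "at most 2 log n subqueries" argument on the slide.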
Editor's Notes
We usually use the matrix to hold the cost of the edges in the bipartite graph. First we need a matrix of the costs of the workers doing the jobs.
Let us now talk about a more sophisticated data structure: range trees. The 1-D case is straightforward; even a sorted list of the points would suffice. But sorting does not generalize to higher dimensions, so we use binary trees instead. Build a perfectly balanced binary tree on the sorted list of points. The input points are stored in the leaves (all leaves are linked in a list), and each internal node stores the highest value in its left subtree. Comparing the query boundary with this value will help us reach the first point falling in the query. Consider the following example … Query time is O(log n + k) for the reporting case, and O(log n) for counting.
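For the 1-D case, the sorted-list variant mentioned above can be sketched with binary search. This illustrates the stated bounds (O(log n + k) for reporting, O(log n) for counting) rather than the balanced-tree layout itself; the function names and sample points are hypothetical.

```python
from bisect import bisect_left, bisect_right


def range_report(sorted_points, low, high):
    """All points p with low <= p <= high: O(log n) search plus O(k) to copy."""
    first = bisect_left(sorted_points, low)
    last = bisect_right(sorted_points, high)
    return sorted_points[first:last]


def range_count(sorted_points, low, high):
    """Number of points in [low, high], in O(log n)."""
    return bisect_right(sorted_points, high) - bisect_left(sorted_points, low)


sorted_points = sorted([37, 3, 19, 62, 10, 23, 49, 30])
print(range_report(sorted_points, 18, 40))  # [19, 23, 30, 37]
print(range_count(sorted_points, 18, 40))   # 4
```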