Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our User Agreement and Privacy Policy.

Slideshare uses cookies to improve functionality and performance, and to provide you with relevant advertising. If you continue browsing the site, you agree to the use of cookies on this website. See our Privacy Policy and User Agreement for details.

Like this presentation? Why not share!

- Semi fragile watermarking by Yash Diwakar 39 views
- [AI07] Revolutionizing Image Proces... by de:code 2017 214 views
- Optimization in deep learning by Jeremy Nixon 102 views
- портфоліо Бабич О.А. by Сергей Жулавник 168 views
- Computer vision, machine, and deep ... by Igi Ardiyanto 675 views
- Face recognition and deep learning ... by BAINIDA 1855 views

No Downloads

Total views

1,304

On SlideShare

0

From Embeds

0

Number of Embeds

3

Shares

0

Downloads

56

Comments

0

Likes

1

No embeds

No notes for slide

- 1. Pattern Recognition and Machine Learning<br />Chapter 8: graphical models<br />
- 2. Bayesian Networks<br />Directed Acyclic Graph (DAG)<br />
- 3. Bayesian Networks<br />General Factorization<br />
- 4. Bayesian Curve Fitting (1) <br />Polynomial<br />
- 5. Bayesian Curve Fitting (2) <br />Plate<br />
- 6. Bayesian Curve Fitting (3) <br />Input variables and explicit hyperparameters<br />
- 7. Bayesian Curve Fitting —Learning<br />Condition on data<br />
- 8. Bayesian Curve Fitting —Prediction<br />Predictive distribution: <br />where<br />
- 9. Generative Models<br />Causal process for generating images<br />
- 10. Discrete Variables (1)<br />General joint distribution: K 2 { 1 parameters<br />Independent joint distribution: 2(K{ 1) parameters<br />
- 11. Discrete Variables (2)<br />General joint distribution over M variables: KM{ 1 parameters<br />M -node Markov chain: K{ 1 + (M{ 1) K(K{ 1) parameters<br />
- 12. Discrete Variables: Bayesian Parameters (1)<br />
- 13. Discrete Variables: Bayesian Parameters (2)<br />Shared prior<br />
- 14. Parameterized Conditional Distributions<br />If are discrete, <br />K-state variables, <br /> in general has O(K M) parameters.<br />The parameterized form<br />requires only M+ 1 parameters <br />
- 15. Linear-Gaussian Models<br />Directed Graph<br />Vector-valued Gaussian Nodes<br />Each node is Gaussian, the mean is a linear function of the parents.<br />
- 16. Conditional Independence<br />a is independent of b given c<br />Equivalently<br />Notation<br />
- 17. Conditional Independence: Example 1<br />
- 18. Conditional Independence: Example 1<br />
- 19. Conditional Independence: Example 2<br />
- 20. Conditional Independence: Example 2<br />
- 21. Conditional Independence: Example 3<br />Note: this is the opposite of Example 1, with c unobserved.<br />
- 22. Conditional Independence: Example 3<br />Note: this is the opposite of Example 1, with c observed.<br />
- 23. “Am I out of fuel?”<br />B = Battery (0=flat, 1=fully charged)<br />F = Fuel Tank (0=empty, 1=full)<br />G = Fuel Gauge Reading (0=empty, 1=full)<br />and hence<br />
- 24. “Am I out of fuel?”<br />Probability of an empty tank increased by observing G = 0. <br />
- 25. “Am I out of fuel?”<br />Probability of an empty tank reduced by observing B = 0. <br />This referred to as “explaining away”.<br />
- 26. D-separation<br /><ul><li>A, B, and C are non-intersecting subsets of nodes in a directed graph.
- 27. A path from A to B is blocked if it contains a node such that either</li></ul>the arrows on the path meet either head-to-tail or tail-to-tail at the node, and the node is in the set C, or<br />the arrows meet head-to-head at the node, and neither the node, nor any of its descendants, are in the set C.<br /><ul><li>If all paths from A to B are blocked, A is said to be d-separated from B by C.
- 28. If A is d-separated from B by C, the joint distribution over all variables in the graph satisfies .</li></li></ul><li>D-separation: Example<br />
- 29. D-separation: I.I.D. Data<br />
- 30. Directed Graphs as Distribution Filters<br />
- 31. The Markov Blanket<br />Factors independent of xi cancel between numerator and denominator.<br />
- 32. Markov Random Fields<br />Markov Blanket<br />
- 33. Cliques and Maximal Cliques<br />Clique<br />Maximal Clique<br />
- 34. Joint Distribution<br />where is the potential over clique C and <br />is the normalization coefficient; note: MK-state variables KM terms in Z.<br />Energies and the Boltzmann distribution<br />
- 35. Illustration: Image De-Noising (1)<br />Original Image<br />Noisy Image<br />
- 36. Illustration: Image De-Noising (2)<br />
- 37. Illustration: Image De-Noising (3)<br />Noisy Image<br />Restored Image (ICM)<br />
- 38. Illustration: Image De-Noising (4)<br />Restored Image (Graph cuts)<br />Restored Image (ICM)<br />
- 39. Converting Directed to Undirected Graphs (1)<br />
- 40. Converting Directed to Undirected Graphs (2)<br />Additional links<br />
- 41. Directed vs. Undirected Graphs (1)<br />
- 42. Directed vs. Undirected Graphs (2)<br />
- 43. Inference in Graphical Models<br />
- 44. Inference on a Chain<br />
- 45. Inference on a Chain<br />
- 46. Inference on a Chain<br />
- 47. Inference on a Chain<br />
- 48. Inference on a Chain<br />To compute local marginals:<br /><ul><li>Compute and store all forward messages, .
- 49. Compute and store all backward messages, .
- 50. Compute Z at any node xm
- 51. Computefor all variables required.</li></li></ul><li>Trees<br />Undirected Tree<br />Directed Tree<br />Polytree<br />
- 52. Factor Graphs<br />
- 53. Factor Graphs from Directed Graphs<br />
- 54. Factor Graphs from Undirected Graphs<br />
- 55. The Sum-Product Algorithm (1)<br />Objective:<br />to obtain an efficient, exact inference algorithm for finding marginals;<br />in situations where several marginals are required, to allow computations to be shared efficiently.<br />Key idea: Distributive Law<br />
- 56. The Sum-Product Algorithm (2)<br />
- 57. The Sum-Product Algorithm (3)<br />
- 58. The Sum-Product Algorithm (4)<br />
- 59. The Sum-Product Algorithm (5)<br />
- 60. The Sum-Product Algorithm (6)<br />
- 61. The Sum-Product Algorithm (7)<br />Initialization<br />
- 62. The Sum-Product Algorithm (8)<br />To compute local marginals:<br /><ul><li>Pick an arbitrary node as root
- 63. Compute and propagate messages from the leaf nodes to the root, storing received messages at every node.
- 64. Compute and propagate messages from the root to the leaf nodes, storing received messages at every node.
- 65. Compute the product of received messages at each node for which the marginal is required, and normalize if necessary.</li></li></ul><li>Sum-Product: Example (1)<br />
- 66. Sum-Product: Example (2)<br />
- 67. Sum-Product: Example (3)<br />
- 68. Sum-Product: Example (4)<br />
- 69. The Max-Sum Algorithm (1)<br />Objective: an efficient algorithm for finding <br />the value xmax that maximises p(x);<br />the value of p(xmax).<br />In general, maximum marginals joint maximum.<br />
- 70. The Max-Sum Algorithm (2)<br />Maximizing over a chain (max-product)<br />
- 71. The Max-Sum Algorithm (3)<br />Generalizes to tree-structured factor graph<br />maximizing as close to the leaf nodes as possible<br />
- 72. The Max-Sum Algorithm (4)<br />Max-Product Max-Sum<br />For numerical reasons, use<br />Again, use distributive law <br />
- 73. The Max-Sum Algorithm (5)<br />Initialization (leaf nodes)<br />Recursion<br />
- 74. The Max-Sum Algorithm (6)<br />Termination (root node)<br />Back-track, for all nodes i with l factor nodes to the root (l=0) <br />
- 75. The Max-Sum Algorithm (7)<br />Example: Markov chain<br />
- 76. The Junction Tree Algorithm<br /><ul><li>Exact inference on general graphs.
- 77. Works by turning the initial graph into a junction tree and then running a sum-product-like algorithm.
- 78. Intractable on graphs with large cliques.</li></li></ul><li>Loopy Belief Propagation<br /><ul><li>Sum-Product on general graphs.
- 79. Initial unit messages passed across all links, after which messages are passed around until convergence (not guaranteed!).
- 80. Approximate but tractable for large graphs.
- 81. Sometime works well, sometimes not at all.</li>

No public clipboards found for this slide

×
### Save the most important slides with Clipping

Clipping is a handy way to collect and organize the most important slides from a presentation. You can keep your great finds in clipboards organized around topics.

Be the first to comment