
Bayesian Networks (Directed Acyclic Graphical Models)



Presentation Transcript


  1. Bayesian Networks (Directed Acyclic Graphical Models). [Figure: Coin1 → Bell ← Coin2] The situation of a bell that rings whenever the outcomes of two coins are equal cannot be well represented by undirected graphical models: conditioning on the bell induces a dependency between the two coins, so an undirected model would need a clique over all three variables.

  2. Bayesian Networks (BNs): examples of models for diseases, symptoms & risk factors. One variable for all diseases (values are diseases); one variable per disease (values are True/False). Naïve Bayesian Networks versus bipartite BNs.

  3. Boundary Basis for Dependency Models. Let M be a dependency model over U = {X1,…,Xn} and let d be an ordering of these elements. A boundary basis of M w.r.t. d is a set of independence statements I(Xi, Bi, Ui − Bi) that hold in M, where Bi ⊆ Ui and Ui = {X1,X2,…,Xi−1}, i = 1,…,n. A boundary basis is minimal if every Bi is minimal. Example I: What is the boundary basis for P(X1,X2,X3,X4) = P(X1)P(X2|X1)P(X3|X2)P(X4|X3)?

  4. Example I. [Figure: boundary DAG X1 → X2 → X3 → X4] A boundary basis and a boundary DAG for P(X1,X2,X3,X4) = P(X1)P(X2|X1)P(X3|X2)P(X4|X3): I(X3, X2, X1) and I(X4, X3, {X1, X2}). The directed acyclic graph (DAG) created by assigning each vertex Xi the parents Bi is called the boundary DAG of M relative to order d.

  5. Example II. [Figure: boundary DAG Coin1 → Bell ← Coin2] A boundary basis and a boundary DAG for P(coin1,coin2,bell) = P(coin1)P(coin2)P(bell|coin1,coin2): I(coin1, { }, coin2).
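The induced dependency in Example II can be checked numerically straight from the factorization. The sketch below is illustrative only (the slides give no numbers); it assumes two fair coins and a deterministic bell, and verifies that I(coin1, { }, coin2) holds while the coins become dependent once the bell is observed.

```python
from itertools import product

# Illustrative numbers: two fair coins, and a bell that rings iff the coins match.
p_coin = {0: 0.5, 1: 0.5}
def p_bell(bell, c1, c2):                        # p(bell | coin1, coin2)
    return 1.0 if bell == int(c1 == c2) else 0.0

# Joint distribution built from the factorization P(c1) P(c2) P(bell | c1, c2).
joint = {(c1, c2, bell): p_coin[c1] * p_coin[c2] * p_bell(bell, c1, c2)
         for c1, c2, bell in product([0, 1], repeat=3)}

def prob(pred):
    """Probability of the event described by the predicate pred(c1, c2, bell)."""
    return sum(p for (c1, c2, bell), p in joint.items() if pred(c1, c2, bell))

# I(coin1, { }, coin2): marginally, P(c1=1, c2=1) = P(c1=1) P(c2=1).
assert abs(prob(lambda c1, c2, b: c1 == 1 and c2 == 1)
           - prob(lambda c1, c2, b: c1 == 1) * prob(lambda c1, c2, b: c2 == 1)) < 1e-12

# But given the bell, the coins become dependent:
# P(c1=1, c2=1 | bell=1) = 0.5 while P(c1=1 | bell=1) P(c2=1 | bell=1) = 0.25.
pb = prob(lambda c1, c2, b: b == 1)
lhs = prob(lambda c1, c2, b: c1 == 1 and c2 == 1 and b == 1) / pb
rhs = (prob(lambda c1, c2, b: c1 == 1 and b == 1) / pb) * \
      (prob(lambda c1, c2, b: c2 == 1 and b == 1) / pb)
print(lhs, rhs)                                  # 0.5 vs 0.25
```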

  6. Example III. [Figure: DAG over V, S, T, L, B, A, X, D] • In the order V,S,T,L,B,A,X,D, we have a boundary basis: • I( S, { }, V ) • I( T, V, S ) • I( L, S, {T,V} ) • … • I( X, A, {V,S,T,L,B,D} ) Does I( {X,D}, A, V ) also hold in the dependency model P?

  7. Definitions. Can we define "independence" ID(X,Z,Y) graphically so that it answers these probabilistic independence questions? 1. A Directed Acyclic Graph (DAG) D=(U,E) is an I-map of a dependency model M over U if ID(X,Z,Y) ⇒ IM(X,Z,Y) for all disjoint subsets X, Y, Z of U. 2. D is a minimal I-map of M if, by removing any edge, D ceases to be an I-map. 3. D is a perfect map of M if ID(X,Z,Y) ⇔ IM(X,Z,Y) for all disjoint subsets X, Y, Z of U.

  8. From Separation in UGs To d-Separation in DAGs

  9. Paths. [Figure: DAG over V, S, T, L, B, A, X, D] • Intuition: dependency must "flow" along paths in the graph. • A path is a sequence of neighboring variables. Examples: • X ← A → D ← B • A ← L ← S → B
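A path here ignores edge directions. As a small illustration, the sketch below enumerates all simple paths between two nodes; the edge set is an assumption read off the local distributions shown on the later "Bayesian Networks with numbers" slide (p(t|v), p(l|s), p(b|s), p(a|t,l), p(x|a), p(d|a,b)).

```python
# Edges of the example DAG, inferred from the local distributions on the numeric slide.
EDGES = [("V", "T"), ("S", "L"), ("S", "B"),
         ("T", "A"), ("L", "A"), ("A", "X"), ("A", "D"), ("B", "D")]

def undirected_paths(src, dst, edges=EDGES, path=None):
    """Enumerate simple paths, ignoring edge direction (dependency can flow either way)."""
    path = path or [src]
    if src == dst:
        yield list(path)
        return
    for u, v in edges:
        for a, b in ((u, v), (v, u)):            # treat each edge as traversable both ways
            if a == src and b not in path:
                yield from undirected_paths(b, dst, edges, path + [b])

for p in undirected_paths("X", "B"):
    print(" - ".join(p))                         # prints X - A - D - B and X - A - L - S - B
```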

  10. Path blockage • Every path is classified given the evidence: • active -- creates a dependency between the end nodes • blocked – does not create a dependency between the end nodes Evidence means the assignment of a value to a subset of nodes.

  11. Path Blockage. Three cases: • Common cause. [Figure: common-cause path L ← S → B; the path is blocked when S is observed and active otherwise]

  12. Path Blockage. Three cases: • Common cause • Intermediate cause. [Figure: intermediate-cause path S → L → A; the path is blocked when L is observed and active otherwise]

  13. Path Blockage. Three cases: • Common cause • Intermediate cause • Common effect. [Figure: common-effect path T → A ← L, with X a descendant of A; the path is active when A or its descendant X is observed and blocked otherwise]

  14. Definition of Path Blockage. [Figure: collider configuration T → A ← L] Definition: A path is active, given evidence Z, if • whenever the path contains the configuration T → A ← L (a common effect), either A or one of its descendants is in Z, and • no other nodes on the path are in Z. Definition: A path is blocked, given evidence Z, if it is not active. Definition: X is d-separated from Y, given Z, if every path between a node in X and a node in Y is blocked, given Z.
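The path-blocking definition can be operationalized. One standard, equivalent test (not the only one) is: restrict the DAG to the ancestors of X ∪ Y ∪ Z, moralize it (connect co-parents, drop edge directions), delete Z, and check whether X and Y are disconnected. A minimal sketch, assuming the DAG is given as a dictionary mapping each node to its parents:

```python
from itertools import combinations

def ancestors(nodes, parents):
    """All nodes in `nodes` plus their ancestors in the DAG given by `parents`."""
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return seen

def d_separated(x, y, z, parents):
    """True iff every path between x and y is blocked by z (x, y, z are sets of nodes)."""
    keep = ancestors(set(x) | set(y) | set(z), parents)
    # Moralize the ancestral graph: connect co-parents, then drop edge directions.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = [p for p in parents.get(child, ()) if p in keep]
        for p in ps:
            adj[p].add(child); adj[child].add(p)
        for p, q in combinations(ps, 2):
            adj[p].add(q); adj[q].add(p)
    # Remove the conditioning set and test reachability from x to y.
    blocked = set(z)
    frontier, reached = [n for n in x if n not in blocked], set()
    while frontier:
        n = frontier.pop()
        if n in reached or n in blocked:
            continue
        reached.add(n)
        frontier.extend(adj[n] - reached)
    return not (reached & set(y))
```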

  15. d-Separation

  16. ID(T,S|) = yes S V L T B A X D Example

  17. ID (T,S |) = yes ID(T,S|D) = no S V L T B A X D Example

  18. ID (T,S |) = yes ID(T,S|D) = no ID(T,S|{D,L,B}) = yes S V L T B A X D Example

  19. Example. [Figure: DAG over V, S, T, L, B, A, X, D] • In the order V,S,T,L,B,A,X,D, we get from the boundary basis: • ID( S, { }, V ) • ID( T, V, S ) • ID( L, S, {T,V} ) • … • ID( X, A, {V,S,T,L,B,D} )

  20. Main Result - Soundness

  21. Bayesian Networks (Directed Acyclic Graphical Models). Definition: Given a probability distribution P on a set of variables U, a DAG D = (U,E) is called a Bayesian Network of P iff D is a minimal I-map of P.

  22. The first claim holds because any probability distribution is a semi-graphoid (satisfying symmetry, decomposition, contraction, and weak union).

  23. The second claim, uniqueness of the parent sets, holds because: • I(X, Z∪W1, Y∪W2) and I(X, Z∪W2, Y∪W1) ⇒ I(X, Z, Y∪W1∪W2) • Proof: • (1) I(X, Z∪W1, Y∪W2). Given. • (2) I(X, Z∪W2, Y∪W1). Given. • (3) I(X, Z∪W1∪W2, Y) by weak union from (1). • (4) I(X, Z∪Y∪W1, W2) by weak union from (1). • (5) I(X, Z∪Y∪W2, W1) by weak union from (2). • (6) I(X, Z∪Y, W1∪W2) by intersection from (4) and (5). • ⇒ I(X, Z, Y∪W1∪W2) by intersection from (3) and (6).

  24. d-separation. The definition of ID(X, Z, Y) is such that: Soundness [Theorem 9]: ID(X, Z, Y) = yes implies that IP(X, Z, Y) follows from the boundary Basis(D). Completeness [Theorem 10]: ID(X, Z, Y) = no implies that IP(X, Z, Y) does not follow from the boundary Basis(D).

  25. Revisiting Example III. [Figure: DAG over V, S, T, L, B, A, X, D] So does IP( {X,D}, A, V ) hold? It is enough to check d-separation!

  26. Bayesian Networks with numbers. [Figure: DAG over V, S, T, L, B, A, X, D, annotated with its local distributions] p(v), p(s), p(t|v), p(l|s), p(b|s), p(a|t,l), p(x|a), p(d|a,b)

  27. Bayesian Network (cont.) [Figure: the same DAG with local distributions p(v), p(s), p(t|v), p(l|s), p(b|s), p(a|t,l), p(x|a), p(d|a,b)] Each Directed Acyclic Graph defines a factorization of the form P(x1,…,xn) = ∏i P(xi | pai); here P(v,s,t,l,b,a,x,d) = p(v) p(s) p(t|v) p(l|s) p(b|s) p(a|t,l) p(x|a) p(d|a,b).
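In code, the factorization means the probability of a full assignment is simply the product of one local-distribution entry per node. The numbers below are made up for illustration (the slides do not give them); PARENTS is the parent dictionary from the d-separation example above.

```python
# Hypothetical local distributions for the example DAG; each table maps an assignment of
# the node's parents (in the order given by PARENTS) to p(node = 1 | parents).
CPTS = {
    "V": {(): 0.01}, "S": {(): 0.50},
    "T": {(0,): 0.01, (1,): 0.05},                                   # p(t | v)
    "L": {(0,): 0.01, (1,): 0.10},                                   # p(l | s)
    "B": {(0,): 0.30, (1,): 0.60},                                   # p(b | s)
    "A": {(0, 0): 0.00, (0, 1): 1.00, (1, 0): 1.00, (1, 1): 1.00},   # p(a | t, l)
    "X": {(0,): 0.05, (1,): 0.98},                                   # p(x | a)
    "D": {(0, 0): 0.10, (0, 1): 0.80, (1, 0): 0.70, (1, 1): 0.90},   # p(d | a, b)
}

def joint_prob(assignment):
    """P(x1, ..., xn) = prod_i P(xi | pa_i): one factor per node of the DAG."""
    prob = 1.0
    for node, cpt in CPTS.items():
        p_one = cpt[tuple(assignment[p] for p in PARENTS[node])]
        prob *= p_one if assignment[node] == 1 else 1.0 - p_one
    return prob

print(joint_prob({"V": 0, "S": 1, "T": 0, "L": 0, "B": 1, "A": 0, "X": 0, "D": 1}))
```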

  28. Independence in Bayesian networks. (*) IP( Xi ; {X1,…,Xi−1} \ Pai | Pai ) for every variable Xi. This set of independence assertions is denoted Basis(G). All other independence assertions that are entailed by (*) are derivable using the semi-graphoid axioms.

  29. Local distributions – Asymmetric independence. [Figure: Tuberculosis (Yes/No) and Lung Cancer (Yes/No) are parents of Abnormality in Chest (Yes/No), with local distribution p(A|T,L)] Table:
  p(A=y | L=n, T=n) = 0.02
  p(A=y | L=n, T=y) = 0.60
  p(A=y | L=y, T=n) = 0.99
  p(A=y | L=y, T=y) = 0.99
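The same table can be written as a lookup that makes the asymmetry explicit: once L = y, the value of T no longer changes p(A), even though A depends on T in general. This kind of independence is not visible in the graph structure itself.

```python
# p(A=y | L, T) from the table above, keyed by (L, T).
p_A_yes = {("n", "n"): 0.02, ("n", "y"): 0.60,
           ("y", "n"): 0.99, ("y", "y"): 0.99}

# Asymmetric independence: in the context L = y, A no longer depends on T ...
assert p_A_yes[("y", "n")] == p_A_yes[("y", "y")]
# ... but in the context L = n it does.
assert p_A_yes[("n", "n")] != p_A_yes[("n", "y")]
```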

  30. COROLLARY 4: D is an I-map of P iff each variable X is conditionally independent in P of all its non-descendants, given its parents. Proof (only if): X is d-separated from all its non-descendants, given its parents. Since D is an I-map, by the soundness theorem the claim holds. Proof (if): If each variable X is conditionally independent of all its non-descendants given its parents, then by decomposition it is also independent, given its parents, of its predecessors in a particular order d; hence D is an I-map of P by the soundness theorem.
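Corollary 4 can be spot-checked on the example DAG with the earlier d_separated sketch and PARENTS dictionary: every node should be d-separated from its non-descendants given its parents.

```python
def descendants(node, parents):
    """Nodes reachable from `node` by following child edges."""
    children = {n: [c for c, ps in parents.items() if n in ps] for n in parents}
    out, stack = set(), list(children[node])
    while stack:
        c = stack.pop()
        if c not in out:
            out.add(c)
            stack.extend(children[c])
    return out

# Corollary 4 on the example DAG: every node is d-separated from its
# non-descendants given its parents (uses d_separated and PARENTS from above).
for node in PARENTS:
    non_desc = set(PARENTS) - descendants(node, PARENTS) - {node} - set(PARENTS[node])
    assert d_separated({node}, non_desc, set(PARENTS[node]), PARENTS)
```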

  31. COROLLARY 5: If D=(U,E) is a boundary DAG of P constructed in some order d, then any topological order d' of U will yield the same boundary DAG of P. (Hence the construction order can be forgotten.) Proof: By Corollary 4, each variable X is d-separated from all its non-descendants, given its parents in the boundary DAG of P. In particular, by decomposition, X is independent, given its parents, of all variables that precede it in any topological order d'.

  32. Extension of the Markov Chain Property: I(Xk, Xk−1, {X1,…,Xk−2}) ⇒ I(Xk, {Xk−1, Xk+1}, {X1,…,Xk−2, Xk+2,…,Xn}). This holds due to the soundness theorem; the converse holds when Intersection is assumed. Markov Blankets in DAGs.
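In a general DAG the role of the two chain neighbors is played by the Markov blanket: a node's parents, its children, and its children's other parents. A small sketch (again reusing PARENTS and d_separated from above) checks that each node is d-separated from everything outside its blanket, given the blanket:

```python
def markov_blanket(node, parents):
    """Parents, children, and the children's other parents of `node`."""
    children = {c for c, ps in parents.items() if node in ps}
    co_parents = {p for c in children for p in parents[c]}
    return (set(parents[node]) | children | co_parents) - {node}

# E.g. the blanket of A is {T, L, X, D, B}; given it, A is d-separated from the rest.
for node in PARENTS:
    mb = markov_blanket(node, PARENTS)
    rest = set(PARENTS) - mb - {node}
    assert d_separated({node}, rest, mb, PARENTS)
```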

  33. Consequence: There is no improvement to d-separation, and no independence statement escapes graphical representation. Reasoning: (1) If there were an independence statement σ not shown by d-separation, then σ would have to be true in all distributions that satisfy the basis. But Theorem 10 states that there exists a distribution that satisfies the basis and violates σ. (2) Same argument. [Note that (2) is a stronger claim.]
