A Brief Introduction to Graphical Models and Bayesian Networks
By Kevin Murphy, 1998.

Graphical models are a marriage between probability theory and graph theory. They provide a natural tool for dealing with two problems that occur throughout applied mathematics and engineering, uncertainty and complexity, and in particular they are playing an increasingly important role in the design and analysis of machine learning algorithms. Fundamental to the idea of a graphical model is the notion of modularity: a complex system is built by combining simpler parts. Probability theory provides the glue whereby the parts are combined, ensuring that the system as a whole is consistent, and providing ways to interface models to data. The graph theoretic side of graphical models provides both an intuitively appealing interface by which humans can model highly-interacting sets of variables, as well as a data structure that lends itself naturally to the design of efficient general-purpose algorithms.
Many of the classical multivariate probabilistic systems studied in fields such as statistics, systems engineering, information theory, pattern recognition and statistical mechanics are special cases of the general graphical model formalism; examples include mixture models, factor analysis, hidden Markov models, Kalman filters and Ising models. The graphical model framework provides a way to view all of these systems as instances of a common underlying formalism. This view has many advantages; in particular, specialized techniques that have been developed in one field can be transferred between research communities and exploited more widely. Moreover, the graphical model formalism provides a natural framework for the design of new systems. (Michael Jordan, 1998)

This tutorial
Note: Dan Hammerstrom has made a pdf version of this web page. I also have a closely related tutorial in postscript or pdf format.

We will briefly discuss the following topics.
Representation, or, what exactly is a graphical model?
Inference, or, how can we use these models to efficiently answer probabilistic queries?
Learning, or, what do we do if we don't know what the model is?
Decision theory, or, what happens when it is time to convert beliefs into actions?
Applications, or, what's this all good for, anyway?

Articles in the popular press
The following articles provide less technical introductions.
LA Times article (10/28/96) about Bayes nets.
Economist article (3/22/01) about Microsoft's application of BNs.
Economist article (9/30/00) about Bayesian approaches to clinical trials.
New York Times article (4/28/01) about Bayesian statistics.

Note: Despite the name, Bayesian networks do not necessarily imply a commitment to Bayesian statistics. Rather, they are so called because they use Bayes' rule for probabilistic inference, as we explain below.

Other sources of technical information
Nebula encyclopedia entry on Bayesian inference.
Zoubin Ghahramani's introduction to Bayesian machine learning.
Tom Minka's tutorial notes on Bayesian statistics and other topics.
AUAI homepage (Association for Uncertainty in Artificial Intelligence), the UAI mailing list, and the UAI proceedings.
My list of recommended reading.
Bayes Net software packages
My Bayes Net Toolbox for Matlab.
Tutorial slides on graphical models and BNT, presented to the Mathworks, May 2003.
List of other Bayes net tutorials.

Representation
Probabilistic graphical models are graphs in which nodes represent random variables, and the (lack of) arcs represent conditional independence assumptions. Hence they provide a compact representation of joint probability distributions. Undirected graphical models, also called Markov Random Fields (MRFs) or Markov networks, have a simple definition of independence: two sets of nodes A and B are conditionally independent given a third set, C, if all paths between the nodes in A and B are separated by a node in C. By contrast, directed graphical models, also called Bayesian Networks or Belief Networks (BNs), have a more complicated notion of independence, which takes into account the directionality of the arcs, as we explain below. Note that, despite the name, Bayesian networks do not necessarily imply a commitment to Bayesian methods; rather, they are so called because they use Bayes' rule for inference, as we will see below. Click here for a quick review of Bayes' rule.
Undirected graphical models are more popular with the physics and vision communities, and directed models are more popular with the AI and statistics communities. (It is possible to have a model with both directed and undirected arcs, which is called a chain graph.) For a careful study of the relationship between directed and undirected graphical models, see the books by Pearl (1988), Whittaker (1990), and Lauritzen (1996). Although directed models have a more complicated notion of independence than undirected models, they do have several advantages. The most important is that one can regard an arc from A to B as indicating that A causes B (see the discussion on causality). This can be used as a guide to construct the graph structure. In addition, directed models can encode deterministic relationships, and are easier to learn (fit to data). In the rest of this tutorial, we will only discuss directed graphical models, i.e., Bayesian networks. In addition to the graph structure, it is necessary to specify the parameters of the model. For a directed model, we must specify the Conditional Probability Distribution (CPD) at each node.
If the variables are discrete, this can be represented as a table (CPT), which lists the probability that the child node takes on each of its different values for each combination of values of its parents. Consider the following example, in which all nodes are binary, i.e., have two possible values, which we will denote by T (true) and F (false). We see that the event "grass is wet" (W=true) has two possible causes: either the water sprinkler is on (S=true) or it is raining (R=true). The strength of this relationship is shown in the table. For example, we see that Pr(W=true | S=true, R=false) = 0.9 (second row), and hence Pr(W=false | S=true, R=false) = 1 - 0.9 = 0.1, since each row must sum to one. Since the C node has no parents, its CPT specifies the prior probability that it is cloudy (in this case, 0.5). The simplest conditional independence relationship encoded in a Bayesian network can be stated as follows: a node is independent of its ancestors given its parents, where the ancestor/parent relationship is with respect to some fixed topological ordering of the nodes.
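To make the CPT representation concrete, here is a sketch in Python (not part of the original tutorial). Only Pr(C=T)=0.5 and Pr(W=T|S=T,R=F)=0.9 are stated in the text above; the remaining entries are the values conventionally used for this example, and should be read as assumptions.

```python
import numpy as np

# CPTs for the cloudy/sprinkler/rain/wet-grass network. Values other than
# P(C=T)=0.5 and P(W=T|S=T,R=F)=0.9 are assumed, not stated in the text.
p_c = np.array([0.5, 0.5])                  # P(C); index = C in {F, T}
p_s_given_c = np.array([[0.5, 0.5],         # P(S | C=F)
                        [0.9, 0.1]])        # P(S | C=T); columns = S in {F, T}
p_r_given_c = np.array([[0.8, 0.2],         # P(R | C=F)
                        [0.2, 0.8]])        # P(R | C=T)
p_w_given_sr = np.array([[[1.0, 0.0],       # P(W | S=F, R=F)
                          [0.1, 0.9]],      # P(W | S=F, R=T)
                         [[0.1, 0.9],       # P(W | S=T, R=F)
                          [0.01, 0.99]]])   # P(W | S=T, R=T)

# Each row of a CPT is a distribution over the child, so it must sum to one.
for cpt in (p_c, p_s_given_c, p_r_given_c, p_w_given_sr):
    assert np.allclose(cpt.sum(axis=-1), 1.0)
```

Indexing `p_w_given_sr[s, r]` returns the distribution over W for that parent combination, e.g. `p_w_given_sr[1, 0]` is the row for S=T, R=F.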
By the chain rule of probability, the joint probability of all the nodes in the graph above is

P(C, S, R, W) = P(C) P(S|C) P(R|C,S) P(W|C,S,R)

By using conditional independence relationships, we can rewrite this as

P(C, S, R, W) = P(C) P(S|C) P(R|C) P(W|S,R)

where we were allowed to simplify the third term because R is independent of S given its parent C, and the last term because W is independent of C given its parents S and R. We can see that the conditional independence relationships allow us to represent the joint more compactly. Here the savings are minimal, but in general, if we had n binary nodes, the full joint would require O(2^n) space to represent, whereas the factored form would require O(n 2^k) space, where k is the maximum fan-in of a node. And fewer parameters makes learning easier.

Inference
The most common task we wish to solve using Bayesian networks is probabilistic inference. For example, consider the water sprinkler network, and suppose we observe the fact that the grass is wet. There are two possible causes for this: either it is raining, or the sprinkler is on. Which is more likely?
We can use Bayes' rule to compute the posterior probability of each explanation (where 0 = false and 1 = true):

Pr(S=1 | W=1) = Pr(S=1, W=1) / Pr(W=1) = 0.4298
Pr(R=1 | W=1) = Pr(R=1, W=1) / Pr(W=1) = 0.7079

where Pr(W=1) is a normalizing constant, equal to the probability (likelihood) of the data. So we see that it is more likely that the grass is wet because it is raining: the likelihood ratio is 0.7079 / 0.4298 = 1.647.

Explaining away
In the above example, notice that the two causes "compete" to explain the observed data. Hence S and R become conditionally dependent given that their common child, W, is observed, even though they are marginally independent. For example, suppose the grass is wet, but that we also know that it is raining. Then the posterior probability that the sprinkler is on goes down:

Pr(S=1 | W=1, R=1) = 0.1945

This is called "explaining away". In statistics, this is known as Berkson's paradox, or "selection bias". For a dramatic example of this effect, consider a college which admits students who are either brainy or sporty (or both!). Let C denote the event that someone is admitted to college, which is made true if they are either brainy (B) or sporty (S). Suppose in the general population, B and S are independent.
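The posterior computation above can be reproduced by brute-force marginalization over the joint. The sketch below assumes the conventional CPT values for this network (only some of them appear in the text); with those values it recovers the 0.4298, 0.7079 and 0.1945 quoted above.

```python
import itertools

# Assumed standard CPTs for the water sprinkler network; only P(C=1)=0.5
# and P(W=1|S=1,R=0)=0.9 are stated explicitly in the text.
P_C = {1: 0.5, 0: 0.5}
P_S_C = {0: {1: 0.5, 0: 0.5}, 1: {1: 0.1, 0: 0.9}}              # P(S|C)
P_R_C = {0: {1: 0.2, 0: 0.8}, 1: {1: 0.8, 0: 0.2}}              # P(R|C)
P_W1_SR = {(0, 0): 0.0, (0, 1): 0.9, (1, 0): 0.9, (1, 1): 0.99}  # P(W=1|S,R)

def joint(c, s, r, w):
    """P(C,S,R,W) = P(C) P(S|C) P(R|C) P(W|S,R), the factored form."""
    pw1 = P_W1_SR[(s, r)]
    return P_C[c] * P_S_C[c][s] * P_R_C[c][r] * (pw1 if w else 1 - pw1)

def prob(**evidence):
    """Marginal probability of the evidence, by summing the joint."""
    total = 0.0
    for c, s, r, w in itertools.product([0, 1], repeat=4):
        assignment = dict(c=c, s=s, r=r, w=w)
        if all(assignment[k] == v for k, v in evidence.items()):
            total += joint(c, s, r, w)
    return total

p_s_given_w = prob(s=1, w=1) / prob(w=1)              # ~ 0.4298
p_r_given_w = prob(r=1, w=1) / prob(w=1)              # ~ 0.7079
p_s_given_wr = prob(s=1, w=1, r=1) / prob(w=1, r=1)   # ~ 0.1945, explaining away
```

Observing R=1 drives the posterior on S down from 0.4298 to 0.1945, which is the explaining away effect in numbers.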
We can model our conditional independence assumptions using a graph which is a V structure, with arrows pointing down:

  B   S
   \ /
    v
    C

Now look at a population of college students (those for which C is observed to be true). It will be found that being brainy makes you less likely to be sporty, and vice versa, because either property alone is sufficient to explain the evidence on C (i.e., P(S=1 | C=1, B=1) <= P(S=1 | C=1)). Try this little BNT demo!

Top-down and bottom-up reasoning
In the water sprinkler example, we had evidence of an effect (wet grass), and inferred the most likely cause. This is called diagnostic, or "bottom up", reasoning, since it goes from effects to causes; it is a common task in expert systems. Bayes nets can also be used for causal, or "top down", reasoning. For example, we can compute the probability that the grass will be wet given that it is cloudy. Hence Bayes nets are often called "generative" models, because they specify how causes generate effects.

Causality
One of the most exciting things about Bayes nets is that they can be used to put discussions about causality on a solid mathematical basis. One very interesting question is:
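A tiny numerical illustration of this selection effect, with made-up numbers: take B and S to be independent fair coins, and admission C = B OR S.

```python
import itertools

# Hypothetical numbers: B and S are independent Bernoulli(0.5), C = B or S.
p_b, p_s = 0.5, 0.5

def p(b, s):
    """Joint P(B=b, S=s) in the general population (independent)."""
    return (p_b if b else 1 - p_b) * (p_s if s else 1 - p_s)

# Condition on admission: keep only assignments with C = 1, i.e. b or s.
admitted = [(b, s) for b, s in itertools.product([0, 1], repeat=2) if b or s]
z = sum(p(b, s) for b, s in admitted)                       # P(C=1) = 0.75

p_s_given_c = sum(p(b, s) for b, s in admitted if s) / z    # P(S=1|C=1) = 2/3
p_s_given_cb = (sum(p(b, s) for b, s in admitted if s and b)
                / sum(p(b, s) for b, s in admitted if b))   # P(S=1|C=1,B=1) = 1/2

# Among admitted students, learning B=1 lowers the probability of S=1.
assert p_s_given_cb < p_s_given_c
```

So although B and S are independent in the population, they are negatively correlated in the admitted subpopulation.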
can we distinguish causation from mere correlation? The answer is "sometimes", but you need to measure the relationships between at least three variables; the intuition is that one of the variables acts as a "virtual control" for the relationship between the other two, so we don't always need to do experiments to infer causality. See the following books for details.

Causality: Models, Reasoning and Inference, Judea Pearl, 2000, Cambridge University Press.
Causation, Prediction and Search, Spirtes, Glymour and Scheines, 2001 (2nd edition), MIT Press.
Cause and Correlation in Biology, Bill Shipley, 2000, Cambridge University Press.
Computation, Causation and Discovery, Glymour and Cooper (eds.), 1999, MIT Press.

Conditional independence in Bayes Nets
In general, the conditional independence relationships encoded by a Bayes Net are best explained by means of the "Bayes Ball" algorithm, due to Ross Shachter, which is as follows: two sets of nodes A and B are conditionally independent (d-separated) given a set C if and only if there is no way for a ball to get from A to B in the graph, where the allowable movements of the ball are shown below.
Hidden nodes are nodes whose values are not known, and are depicted as unshaded; observed nodes (the ones we condition on) are shaded. The dotted arcs indicate the direction of flow of the ball. The most interesting case is the first column, when we have two arrows converging on a node X (so X is a "leaf" with two parents). If X is hidden, its parents are marginally independent, and hence the ball does not pass through (the ball being turned around is indicated by the curved arrows); but if X is observed, the parents become dependent, and the ball does pass through, because of the explaining away phenomenon. Notice that, if this graph were undirected, the child would always separate the parents; hence when converting a directed graph to an undirected graph, we must add links between "unmarried" parents who share a common child (i.e., "moralize" the graph), to prevent us reading off incorrect independence statements. Now consider the second column, in which we have two diverging arrows from X (so X is a "root"). If X is hidden, the children are dependent, because they have a hidden common cause, so the ball passes through.
If X is observed, its children are rendered conditionally independent, so the ball does not pass through. Finally, consider the case in which we have one incoming and one outgoing arrow to X. It is intuitive that the nodes upstream and downstream of X are dependent iff X is hidden, because conditioning on a node breaks the graph at that point.

Bayes nets with discrete and continuous nodes
The introductory example used nodes with categorical values and multinomial distributions. It is also possible to create Bayesian networks with continuous-valued nodes. The most common distribution for such variables is the Gaussian. For discrete nodes with continuous parents, we can use the logistic/softmax distribution. Using multinomials, conditional Gaussians, and the softmax distribution, we have a rich toolbox for making complex models. Some examples are shown below; for details, click here. (Circles denote continuous-valued random variables, squares denote discrete rv's, clear means hidden, and shaded means observed.) For more details, see this excellent paper: A Unifying Review of Linear Gaussian Models, Sam Roweis and Zoubin Ghahramani, Neural Computation 11(2), 1999, pp.
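The blocking and moralization rules described earlier suggest a mechanical d-separation test: restrict the graph to the ancestors of the query nodes, moralize it, delete the conditioning nodes, and check connectivity. This "moralized ancestral graph" formulation is an equivalent alternative to running Bayes Ball; the sketch below applies it to the water sprinkler graph.

```python
# d-separation via the moralized ancestral graph criterion: A and B are
# d-separated given C iff they are disconnected after (1) restricting to
# ancestors of A, B and C, (2) "marrying" co-parents, (3) deleting C.
parents = {'C': [], 'S': ['C'], 'R': ['C'], 'W': ['S', 'R']}  # sprinkler DAG

def ancestors(nodes):
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents[n])
    return seen

def d_separated(a, b, given):
    keep = ancestors({a, b} | set(given))
    edges = set()
    for child in keep:
        ps = [p for p in parents[child] if p in keep]
        for p in ps:
            edges.add(frozenset((p, child)))        # undirected parent-child
        for p in ps:
            for q in ps:
                if p != q:
                    edges.add(frozenset((p, q)))    # moralization: marry parents
    edges = {e for e in edges if not (e & set(given))}  # delete observed nodes
    reached, frontier = {a}, [a]
    while frontier:                                  # plain reachability search
        n = frontier.pop()
        for e in edges:
            if n in e:
                for m in e - {n}:
                    if m not in reached:
                        reached.add(m)
                        frontier.append(m)
    return b not in reached

assert d_separated('S', 'R', ['C'])            # common cause observed: blocked
assert not d_separated('S', 'R', ['C', 'W'])   # observing W: explaining away
assert d_separated('W', 'C', ['S', 'R'])       # parents screen off C
```

Note how observing W adds the moralization edge S-R to the relevant subgraph, which is exactly the explaining-away dependence.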
305-345.

Temporal models
Dynamic Bayesian Networks (DBNs) are directed graphical models of stochastic processes. They generalise hidden Markov models (HMMs) and linear dynamical systems (LDSs) by representing the hidden (and observed) state in terms of state variables, which can have complex interdependencies. The graphical structure provides an easy way to specify these conditional independencies, and hence to provide a compact parameterization of the model. Note that "temporal Bayesian network" would be a better name than "dynamic Bayesian network", since it is assumed that the model structure does not change, but the term DBN has become entrenched. We also normally assume that the parameters do not change, i.e., the model is time-invariant. However, we can always add extra hidden nodes to represent the current "regime", thereby creating mixtures of models to capture periodic non-stationarities. There are some cases where the size of the state space can change over time, e.g., tracking a variable, but unknown, number of objects. In this case, we need to change the model structure over time.
Hidden Markov Models (HMMs)
The simplest kind of DBN is a Hidden Markov Model (HMM), which has one discrete hidden node and one discrete or continuous observed node per slice. We illustrate this below. As before, circles denote continuous nodes, squares denote discrete nodes, clear means hidden, shaded means observed. We have "unrolled" the model for 4 time slices; the structure and parameters are assumed to repeat as the model is unrolled further. Hence to specify a DBN, we need to define the intra-slice topology (within a slice), the inter-slice topology (between two slices), as well as the parameters for the first two slices. (Such a two-slice temporal Bayes net is often called a 2TBN.) Some common variants on HMMs are shown below.

Linear Dynamical Systems (LDSs) and Kalman filters
A Linear Dynamical System (LDS) has the same topology as an HMM, but all the nodes are assumed to have linear-Gaussian distributions, i.e.,

x(t+1) = A x(t) + w(t),    w ~ N(0, Q),    x(0) ~ N(init_x, init_V)
y(t)   = C x(t) + v(t),    v ~ N(0, R)

The Kalman filter is a way of doing online filtering in this model. Some simple variants of LDSs are shown below.
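One predict/update cycle of the Kalman filter for the LDS above can be written in a few lines. The numerical values of A, C, Q, R and the initial state below are made up for illustration.

```python
import numpy as np

# One predict/update step of the Kalman filter for the LDS
#   x(t+1) = A x(t) + w(t),  w ~ N(0, Q)
#   y(t)   = C x(t) + v(t),  v ~ N(0, R)
# All numerical values below are illustrative, not from the tutorial.

def kalman_step(A, C, Q, R, x, V, y):
    # Predict the next state mean and covariance.
    x_pred = A @ x
    V_pred = A @ V @ A.T + Q
    # Update with the observation y.
    S = C @ V_pred @ C.T + R                 # innovation covariance
    K = V_pred @ C.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ (y - C @ x_pred)
    V_new = (np.eye(len(x)) - K @ C) @ V_pred
    return x_new, V_new

# A scalar example: random walk observed through unit-variance noise.
A = np.array([[1.0]]); C = np.array([[1.0]])
Q = np.array([[0.1]]); R = np.array([[1.0]])
x, V = np.array([0.0]), np.array([[1.0]])    # init_x, init_V
x, V = kalman_step(A, C, Q, R, x, V, np.array([1.0]))
```

After one observation y=1, the state estimate moves part of the way toward the measurement and the posterior variance shrinks below the predicted variance.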
The Kalman filter has been proposed as a model for how the brain integrates visual cues over time to infer the state of the world, although the reality is obviously much more complicated. The main point is not that the Kalman filter is the right model, but that the brain is combining bottom-up and top-down cues. The figure below is from a paper called "A Kalman Filter Model of the Visual Cortex", by P. Rao, Neural Computation 9(4):721-763, 1997.

More complex DBNs
It is also possible to create temporal models with much more complicated topologies, such as the Bayesian Automated Taxi (BAT) network shown below. (For simplicity, we only show the observed leaves for slice 2. Thanks to Daphne Koller for providing this figure.) When some of the observed nodes are thought of as inputs (actions), and some as outputs (percepts), the DBN becomes a POMDP. See also the section on decision theory below.

A generative model for generative models
The figure below, produced by Zoubin Ghahramani and Sam Roweis, is a good summary of the relationships between some popular graphical models.

INFERENCE
A graphical model specifies a complete joint probability distribution (JPD) over all the variables.
Given the JPD, we can answer all possible inference queries by marginalization (summing out over irrelevant variables), as illustrated in the introduction. However, the JPD has size O(2^n), where n is the number of nodes, and we have assumed each node can have 2 states. Hence summing over the JPD takes exponential time. We now discuss more efficient methods.

Variable elimination
For a directed graphical model (Bayes net), we can sometimes use the factored representation of the JPD to do marginalisation efficiently. The key idea is to "push sums in" as far as possible when summing (marginalizing) out irrelevant terms, e.g., for the water sprinkler network:

P(W=w) = sum_c sum_s sum_r P(C=c) P(S=s|C=c) P(R=r|C=c) P(W=w|S=s,R=r)
       = sum_c P(C=c) sum_s P(S=s|C=c) sum_r P(R=r|C=c) P(W=w|S=s,R=r)

Notice that, as we perform the innermost sums, we create new terms, which need to be summed over in turn, e.g.,

T1(c,w,s) = sum_r P(R=r|C=c) P(W=w|S=s,R=r)

Continuing in this way,

P(W=w) = sum_c P(C=c) T2(c,w),   where   T2(c,w) = sum_s P(S=s|C=c) T1(c,w,s)

This algorithm is called Variable Elimination. The principle of distributing sums over products can be generalized greatly to apply to any commutative semiring. This forms the basis of many common algorithms, such as Viterbi decoding and the Fast Fourier Transform. For details, see

R. McEliece and S. M. Aji, 2000. The Generalized Distributive Law, IEEE Trans. Inform. Theory, vol.
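The same sum-pushing can be done directly in code. This sketch eliminates R, then S, then C to compute the marginal on W, using the conventional CPT values for the sprinkler network (assumed, since the tables themselves are in a figure).

```python
# Variable elimination for P(W) in the sprinkler network: push sums in,
# eliminating R, then S, then C. CPT values are the assumed standard ones.
P_C = {1: 0.5, 0: 0.5}
P_S_C = {0: {1: 0.5, 0: 0.5}, 1: {1: 0.1, 0: 0.9}}
P_R_C = {0: {1: 0.2, 0: 0.8}, 1: {1: 0.8, 0: 0.2}}
P_W1_SR = {(0, 0): 0.0, (0, 1): 0.9, (1, 0): 0.9, (1, 1): 0.99}

def P_W_SR(w, s, r):
    return P_W1_SR[(s, r)] if w else 1 - P_W1_SR[(s, r)]

def marginal_w(w):
    # Innermost sum creates a new term T1(c, s) = sum_r P(r|c) P(w|s,r).
    T1 = {(c, s): sum(P_R_C[c][r] * P_W_SR(w, s, r) for r in (0, 1))
          for c in (0, 1) for s in (0, 1)}
    # Next sum: T2(c) = sum_s P(s|c) T1(c, s).
    T2 = {c: sum(P_S_C[c][s] * T1[(c, s)] for s in (0, 1)) for c in (0, 1)}
    # Finally, P(w) = sum_c P(c) T2(c).
    return sum(P_C[c] * T2[c] for c in (0, 1))

p_w1 = marginal_w(1)   # with these CPTs, P(W=1) = 0.6471
assert abs(marginal_w(0) + p_w1 - 1.0) < 1e-12
```

Note that no term ever involves more than two free variables, which is the whole point: the work is bounded by the size of the largest intermediate factor, not the full joint.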
46, no. 2 (March 2000), pp. 325-343.
F. R. Kschischang, B. J. Frey and H. A. Loeliger, 2001. Factor graphs and the sum-product algorithm, IEEE Transactions on Information Theory, February 2001.

The amount of work we perform when computing a marginal is bounded by the size of the largest term that we encounter. Choosing a summation (elimination) ordering to minimize this is NP-hard, although greedy algorithms work well in practice.

Dynamic programming
If we wish to compute several marginals at the same time, we can use Dynamic Programming (DP) to avoid the redundant computation that would be involved if we used variable elimination repeatedly. If the underlying undirected graph of the BN is acyclic (i.e., a tree), we can use a local message passing algorithm due to Pearl. This is a generalization of the well-known forwards-backwards algorithm for HMMs (chains). For details, see

Probabilistic Reasoning in Intelligent Systems, Judea Pearl, 1988 (2nd ed.).
Fusion and propagation with multiple observations in belief networks, Peot and Shachter, AI 48 (1991), pp. 299-318.
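On an HMM (a chain), this message passing specializes to the forwards-backwards recursions. Below is a minimal forward pass, with made-up 2-state HMM parameters, that computes the likelihood of an observation sequence and checks it against brute-force enumeration over hidden paths.

```python
import itertools
import numpy as np

# Forward algorithm for a 2-state HMM; all parameter values are made up.
pi = np.array([0.6, 0.4])          # initial state distribution
A = np.array([[0.7, 0.3],          # A[i, j] = P(next state = j | current = i)
              [0.2, 0.8]])
B = np.array([[0.9, 0.1],          # B[i, k] = P(observation = k | state = i)
              [0.3, 0.7]])

def likelihood(obs):
    """P(obs sequence) via the forward messages alpha_t (linear time)."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return float(alpha.sum())

def brute(obs):
    """Same quantity by summing over all hidden paths (exponential time)."""
    total = 0.0
    for path in itertools.product([0, 1], repeat=len(obs)):
        p = pi[path[0]] * B[path[0], obs[0]]
        for t in range(1, len(obs)):
            p *= A[path[t - 1], path[t]] * B[path[t], obs[t]]
        total += p
    return float(total)

obs = [0, 1, 1, 0]
assert abs(likelihood(obs) - brute(obs)) < 1e-12
```

The forward pass does O(T) work on a chain of length T, versus O(2^T) for the enumeration it replaces; the backward pass is the mirror-image recursion needed for smoothed marginals.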
If the BN has undirected cycles (as in the water sprinkler example), local message passing algorithms run the risk of double counting, e.g., the information from S and R flowing into W is not independent, because it came from a common cause, C. The most common approach is therefore to convert the BN into a tree, by clustering nodes together, to form what is called a junction tree, and then running a local message passing algorithm on this tree. The message passing scheme could be Pearl's algorithm, but it is more common to use a variant designed for undirected models. For more details, click here.

The running time of the DP algorithms is exponential in the size of the largest cluster (these clusters correspond to the intermediate terms created by variable elimination). This size is called the induced width of the graph. Minimizing this is NP-hard.

Approximation algorithms
Many models of interest, such as those with repetitive structure, as in multivariate time series or image analysis, have large induced width, which makes exact inference very slow. We must therefore resort to approximation techniques.
Unfortunately, approximate inference is #P-hard, but we can nonetheless come up with approximations which often work well in practice. Below is a list of the major techniques.

Variational methods. The simplest example is the mean-field approximation, which exploits the law of large numbers to approximate large sums of random variables by their means. In particular, we essentially decouple all the nodes, and introduce a new parameter, called a variational parameter, for each node, and iteratively update these parameters so as to minimize the cross-entropy (KL distance) between the approximate and true probability distributions. Updating the variational parameters becomes a proxy for inference. The mean-field approximation produces a lower bound on the likelihood. More sophisticated methods are possible, which give tighter lower (and upper) bounds.

Sampling (Monte Carlo) methods. The simplest kind is importance sampling, where we draw random samples x from P(X), the (unconditional) distribution on the hidden variables, and then weight the samples by their likelihood, P(y|x), where y is the evidence.
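As a concrete sketch of this importance-sampling recipe (again with the illustrative water-sprinkler CPTs, which are assumptions here): sample the hidden nodes from the prior, weight each sample by the likelihood of the evidence W=1, and form a weighted average to estimate a posterior.

```python
import random

random.seed(0)

def sample_hidden():
    # Draw (C, S, R) from the prior P(X) of the sprinkler network.
    c = random.random() < 0.5
    s = random.random() < (0.1 if c else 0.5)
    r = random.random() < (0.8 if c else 0.2)
    return c, s, r

def p_wet(s, r):  # likelihood P(W=1 | S=s, R=r)
    return [[0.0, 0.9], [0.9, 0.99]][s][r]

# Estimate P(R=1 | W=1) as a likelihood-weighted average over prior samples.
num = den = 0.0
for _ in range(200_000):
    c, s, r = sample_hidden()
    w = p_wet(s, r)
    num += w * r
    den += w
print(num / den)  # close to the exact posterior, about 0.708
```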
A more efficient approach in high dimensions is called Monte Carlo Markov Chain (MCMC), and includes as special cases Gibbs sampling and the Metropolis-Hastings algorithm.

Loopy belief propagation. This entails applying Pearl's algorithm to the original graph, even if it has loops (undirected cycles). In theory, this runs the risk of double counting, but Yair Weiss and others have proved that in certain cases (e.g., a single loop), events are double counted equally, and hence cancel to give the right answer. Belief propagation is equivalent to exact inference on a modified graph, called the universal cover or unwrapped computation tree, which has the same local topology as the original graph. This is the same as the Bethe and cavity/TAP approaches in statistical physics. Hence there is a deep connection between belief propagation and variational methods that people are currently investigating.

Bounded cutset conditioning. By instantiating subsets of the variables, we can break loops in the graph. Unfortunately, when the cutset is large, this is very slow. By instantiating only a subset of values of the cutset, we can compute lower bounds on the probabilities of interest.
Alternatively, we can sample the cutsets jointly, a technique known as block Gibbs sampling.

Parametric approximation methods. These express the intermediate summands in a simpler form, e.g., by approximating them as a product of smaller factors. "Minibuckets" and the Boyen-Koller algorithm fall into this category.

Approximate inference is a huge topic: see the references for more details.

Inference in DBNs

The general inference problem for DBNs is to compute P(X(i, t0) | y(:, t1:t2)), where X(i, t) represents the i'th hidden variable at time t and Y(:, t1:t2) represents all the evidence between times t1 and t2. (In fact, we often also want to compute joint distributions of variables over one or more time slices.) There are several special cases of interest, illustrated below. The arrow indicates t0; it is X(t0) that we are trying to estimate. The shaded region denotes t1:t2, the available data.

Here is a simple example of inference in an LDS. Consider a particle moving in the plane at constant velocity subject to random perturbations in its trajectory.
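The filtered estimates in this kind of example are produced by the Kalman filter. Below is a minimal one-dimensional sketch (position-plus-velocity state; the noise variances, priors, and pure-Python matrix algebra are illustrative assumptions, not the exact two-dimensional setup used for the figures):

```python
import random

q, r = 0.01, 1.0   # process / observation noise variances (assumed)

def kalman_filter(ys):
    # State is [position, velocity] with dynamics F = [[1, 1], [0, 1]]
    # and scalar observation y = position + noise (H = [1, 0]).
    x = [0.0, 0.0]
    P = [[10.0, 0.0], [0.0, 10.0]]   # vague prior covariance
    estimates = []
    for y in ys:
        # Predict: x <- F x, P <- F P F' + Q (Q = q I for simplicity).
        x = [x[0] + x[1], x[1]]
        P = [[P[0][0] + P[0][1] + P[1][0] + P[1][1] + q, P[0][1] + P[1][1]],
             [P[1][0] + P[1][1], P[1][1] + q]]
        # Update: scalar innovation variance S, Kalman gain K = P H' / S.
        S = P[0][0] + r
        K = [P[0][0] / S, P[1][0] / S]
        resid = y - x[0]
        x = [x[0] + K[0] * resid, x[1] + K[1] * resid]
        P = [[(1 - K[0]) * P[0][0], (1 - K[0]) * P[0][1]],
             [P[1][0] - K[1] * P[0][0], P[1][1] - K[1] * P[0][1]]]
        estimates.append(x[0])
    return estimates, P

random.seed(0)
# Noisy observations of a particle starting at 10 with velocity 1.
ys = [10.0 + t + random.gauss(0.0, 1.0) for t in range(1, 16)]
est, P = kalman_filter(ys)
print(est[-1], P[0][0])  # final position estimate; posterior variance
```

As the text notes below for the 2-D case, the posterior covariance settles quickly toward its steady-state (Riccati) value, which here is always below the observation noise variance r.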
The new position (x1, x2) is the old position plus the velocity (dx1, dx2) plus noise w:

[ x1(t)  ]   [ 1 0 1 0 ] [ x1(t-1)  ]   [ wx1  ]
[ x2(t)  ] = [ 0 1 0 1 ] [ x2(t-1)  ] + [ wx2  ]
[ dx1(t) ]   [ 0 0 1 0 ] [ dx1(t-1) ]   [ wdx1 ]
[ dx2(t) ]   [ 0 0 0 1 ] [ dx2(t-1) ]   [ wdx2 ]

We assume we only observe the position of the particle:

[ y1(t) ]   [ 1 0 0 0 ] [ x1(t)  ]   [ vx1 ]
[ y2(t) ] = [ 0 1 0 0 ] [ x2(t)  ] + [ vx2 ]
                        [ dx1(t) ]
                        [ dx2(t) ]

Suppose we start out at position (10, 10) moving to the right with velocity (1, 0). We sampled a random trajectory of length 15. Below we show the filtered and smoothed trajectories. The mean squared error of the filtered estimate is 4.9; for the smoothed estimate it is 3.2. Not only is the smoothed estimate better, but we know that it is better, as illustrated by the smaller uncertainty ellipses; this can help in, e.g., data association problems. Note how the smoothed ellipses are larger at the ends, because these points have seen less data. Also, note how rapidly the filtered ellipses reach their steady-state (Riccati) values. See my Kalman filter toolbox for more details.

LEARNING

One needs to specify two things to describe a BN:
the graph topology (structure) and the parameters of each CPD. It is possible to learn both of these from data. However, learning structure is much harder than learning parameters. Also, learning when some of the nodes are hidden, or we have missing data, is much harder than when everything is observed. This gives rise to 4 cases:

Structure   Observability   Method
---------   -------------   ------------------------------
Known       Full            Maximum Likelihood Estimation
Known       Partial         EM (or gradient ascent)
Unknown     Full            Search through model space
Unknown     Partial         EM + search through model space

Known structure, full observability

We assume that the goal of learning in this case is to find the values of the parameters of each CPD which maximizes the likelihood of the training data, which contains N cases (assumed to be independent). The normalized log-likelihood of the training set D is a sum of terms, one for each node. We see that the log-likelihood scoring function decomposes according to the structure of the graph, and hence we can maximize the contribution to the log-likelihood of each node independently (assuming the parameters in each node are independent of the other nodes). Consider estimating the Conditional Probability Table for the W node.
If we have a set of training data, we can just count the number of times the grass is wet when it is raining and the sprinkler is on, N(W=1, S=1, R=1), the number of times the grass is wet when it is raining and the sprinkler is off, N(W=1, S=0, R=1), etc. Given these counts (which are the sufficient statistics), we can find the Maximum Likelihood Estimate of the CPT as follows:

P(W=w | S=s, R=r) = N(W=w, S=s, R=r) / N(S=s, R=r)

where the denominator is N(S=s, R=r) = N(W=0, S=s, R=r) + N(W=1, S=s, R=r). Thus "learning" just amounts to "counting" (in the case of multinomial distributions). For Gaussian nodes, we can compute the sample mean and variance, and use linear regression to estimate the weight matrix. For other kinds of distributions, more complex procedures are necessary.

As is well known from the HMM literature, ML estimates of CPTs are prone to sparse data problems, which can be solved by using (mixtures of) Dirichlet priors (pseudo counts). This results in a Maximum A Posteriori (MAP) estimate. For Gaussians, we can use a Wishart prior, etc.
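A quick numerical check of this counting recipe: sample complete cases from the illustrative sprinkler CPTs (assumed numbers, as in the example above) and recover P(W=1 | S, R) from the sufficient statistics.

```python
import random
from collections import Counter

random.seed(0)

def sample_case():
    # Draw one complete (S, R, W) case from the illustrative sprinkler CPTs.
    c = random.random() < 0.5
    s = random.random() < (0.1 if c else 0.5)
    r = random.random() < (0.8 if c else 0.2)
    w = random.random() < [[0.0, 0.9], [0.9, 0.99]][s][r]
    return s, r, w

counts = Counter(sample_case() for _ in range(100_000))
mle = {}
for s in (False, True):
    for r in (False, True):
        n1, n0 = counts[(s, r, True)], counts[(s, r, False)]
        mle[(s, r)] = n1 / (n0 + n1)   # N(W=1,S=s,R=r) / N(S=s,R=r)
print(mle)  # close to the generating values 0.0, 0.9, 0.9, 0.99
```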
Known structure, partial observability

When some of the nodes are hidden, we can use the EM (Expectation Maximization) algorithm to find a (locally) optimal Maximum Likelihood Estimate of the parameters. The basic idea behind EM is that, if we knew the values of all the nodes, learning (the M step) would be easy, as we saw above. So in the E step, we compute the expected values of all the nodes using an inference algorithm, and then treat these expected values as though they were observed (distributions). For example, in the case of the W node, we replace the observed counts of the events with the number of times we expect to see each event:

P(W=w | S=s, R=r) = E N(W=w, S=s, R=r) / E N(S=s, R=r)

where E N(x) is the expected number of times event x occurs in the whole training set, given the current guess of the parameters. These expected counts can be computed as follows:

E N(.) = E sum_k I(. | D(k)) = sum_k P(. | D(k))

where I(x | D(k)) is an indicator function which is 1 if event x occurs in training case k, and 0 otherwise. Given the expected counts, we maximize the parameters, and then recompute the expected counts, etc.
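The E-step/M-step loop for a single CPT entry can be sketched on made-up data. Here W is missing in some cases; each such case contributes its posterior probability P(W=1 | s, r) under the current parameters as a fractional "expected count" (the cases and starting parameter below are invented for illustration):

```python
# Five cases with S=1, R=0; w = None means W was not observed.
cases = [(1, 0, 1), (1, 0, None), (1, 0, 1), (1, 0, 0), (1, 0, None)]
theta = 0.5   # current guess of P(W=1 | S=1, R=0)

for _ in range(20):   # EM iterations
    # E step: expected counts E N(W=1,S=1,R=0) and E N(S=1,R=0).
    exp_n1 = sum(theta if w is None else float(w) for _, _, w in cases)
    exp_n = float(len(cases))
    # M step: re-estimate the CPT entry from the expected counts.
    theta = exp_n1 / exp_n
print(round(theta, 4))  # → 0.6667
```

Because W has no observed children in this toy fragment, the fixed point (2/3) coincides with simply ignoring the incomplete cases; with observed children, the E step's posterior would pull the answer elsewhere.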
This iterative procedure is guaranteed to converge to a local maximum of the likelihood surface. It is also possible to do gradient ascent on the likelihood surface (the gradient expression also involves the expected counts), but EM is usually faster (since it uses the natural gradient) and simpler (since it has no step size parameter and takes care of parameter constraints automatically, e.g., the "rows" of the CPT having to sum to one). In any case, we see that when nodes are hidden, inference becomes a subroutine which is called by the learning procedure; hence fast inference algorithms are crucial.

Unknown structure, full observability

We start by discussing the scoring function which we use to select models; we then discuss algorithms which attempt to optimize this function over the space of models, and finally examine their computational and sample complexity.

The objective function used for model selection

The maximum likelihood model will be a complete graph, since this has the largest number of parameters, and hence can fit the data the best. A well-principled way to avoid this kind of over-fitting is to put a prior on models, specifying that we prefer sparse models.
Then, by Bayes' rule, the MAP model is the one that maximizes Pr(G|D). Taking logs, we find

log Pr(G|D) = log Pr(D|G) + log Pr(G) + c

where c = -log Pr(D) is a constant independent of G. The effect of the structure prior Pr(G) is equivalent to penalizing overly complex models. However, this is not strictly necessary, since the marginal likelihood term Pr(D|G) = int_theta Pr(D|G, theta) Pr(theta|G) dtheta has a similar effect of penalizing models with too many parameters (this is known as Occam's razor).

Search algorithms for finding the best model

The goal of structure learning is to learn a dag (directed acyclic graph) that best explains the data. This is an NP-hard problem, since the number of dags on N variables is super-exponential in N. (There is no closed form formula for this, but to give you an idea, there are 543 dags on 4 nodes, and O(10^18) dags on 10 nodes.)

If we know the ordering of the nodes, life becomes much simpler, since we can learn the parent set for each node independently (since the score is decomposable), and we don't need to worry about acyclicity constraints. For each node, there are at most sum_{k=0}^{n} choose(n, k) = 2^n sets of possible parents, which can be arranged in a lattice as shown below for n = 4.
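The 543 and O(10^18) figures can be reproduced with Robinson's recurrence for the number of dags on n labelled nodes (a standard combinatorial result, quoted here as an aside, not from the text):

```python
from math import comb

def num_dags(n, _memo={0: 1}):
    # Robinson's recurrence: a(n) = sum_k (-1)^(k+1) C(n,k) 2^(k(n-k)) a(n-k).
    if n not in _memo:
        _memo[n] = sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k))
                       * num_dags(n - k) for k in range(1, n + 1))
    return _memo[n]

print(num_dags(4))   # → 543
print(num_dags(10))  # about 4.2 * 10^18
```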
The problem is to find the highest scoring point in this lattice.

There are three obvious ways to search this graph: bottom up, top down, or middle out. In the bottom-up approach, we start at the bottom of the lattice, and evaluate the score at all points in each successive level. We must decide whether the gains in score produced by a larger parent set are worth it. The standard approach in the reconstructibility analysis (RA) community uses the fact that chi^2(X,Y) approx I(X,Y) N ln 4, where N is the number of samples and I(X,Y) is the mutual information (MI) between X and Y. Hence we can use a chi^2 test to decide whether an increase in the MI score is statistically significant. (This also gives us some kind of confidence measure on the connections that we learn.) Alternatively, we can use a BIC score.

Of course, if we do not know if we have achieved the maximum possible score, we do not know when to stop searching, and hence we must evaluate all points in the lattice (although we can obviously use branch-and-bound). For large n, this is computationally infeasible, so a common approach is to only search up until level K, i.e., assume a bound on the maximum number of parents of each node, which takes O(n^K) time.

The obvious way to avoid the exponential cost (and the need for a bound, K) is to use heuristics to avoid examining all possible subsets. (In fact, we must use heuristics of some kind, since the problem of learning optimal structure is NP-hard [Chickering95].) One approach in the RA framework, called Extended Dependency Analysis (EDA) [Conant88], is as follows. Start by evaluating all subsets of size up to two, keep all the ones with significant (in the chi^2 sense) MI with the target node, and take the union of the resulting set as the set of parents.

The disadvantage of this greedy technique is that it will fail to find a set of parents unless some subset of size two has significant MI with the target variable. However, a Monte Carlo simulation in [Conant88] shows that most random relations have this property. In addition, highly interdependent sets of parents (which might fail the pairwise MI test) violate the causal independence assumption, which is necessary to justify the use of noisy-OR and similar CPDs.
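The approximation chi^2(X,Y) approx I(X,Y) N ln 4 used in the test above (with MI measured in bits; it amounts to chi^2 approx G, the likelihood-ratio statistic) can be checked on a made-up 2x2 table of counts:

```python
from math import log

table = [[40, 10], [20, 30]]          # made-up counts for (X, Y)
N = sum(map(sum, table))
px = [sum(row) / N for row in table]          # marginal of X
py = [sum(col) / N for col in zip(*table)]    # marginal of Y

# Pearson chi^2 statistic against the independence model.
chi2 = sum((table[i][j] - N * px[i] * py[j]) ** 2 / (N * px[i] * py[j])
           for i in range(2) for j in range(2))
# Empirical mutual information in bits.
mi_bits = sum(table[i][j] / N * log(table[i][j] / N / (px[i] * py[j]), 2)
              for i in range(2) for j in range(2) if table[i][j])
print(round(chi2, 2), round(N * log(4) * mi_bits, 2))  # → 16.67 17.26
```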
An alternative technique, popular in the UAI community, is to start with an initial guess of the model structure (i.e., at a specific point in the lattice), and then perform local search, i.e., evaluate the score of neighboring points in the lattice, and move to the best such point, until we reach a local optimum. We can use multiple restarts to try to find the global optimum, and to learn an ensemble of models. Note that, in the partially observable case, we need to have an initial guess of the model structure in order to estimate the values of the hidden nodes, and hence the expected score of each model; starting with the fully disconnected model (i.e., at the bottom of the lattice) would be a bad idea, since it would lead to a poor estimate.

Unknown structure, partial observability

Finally, we come to the hardest case of all, where the structure is unknown and there are hidden variables and/or missing data. In this case, to compute the Bayesian score, we must marginalize out the hidden nodes as well as the parameters.
Since this is usually intractable, it is common to use an asymptotic approximation to the posterior called BIC (Bayesian Information Criterion), which is defined as follows:

log Pr(D|G) approx log Pr(D|G, hat{Theta}_G) - (log N / 2) #G

where N is the number of samples, hat{Theta}_G is the ML estimate of the parameters, and #G is the dimension of the model. (In the fully observable case, the dimension of a model is the number of free parameters. In a model with hidden variables, it might be less than this.) The first term is just the likelihood and the second term is a penalty for model complexity. (The BIC score is identical to the Minimum Description Length (MDL) score.)

Although the BIC score decomposes into a sum of local terms, one per node, local search is still expensive, because we need to run EM at each step to compute hat{Theta}. An alternative approach is to do the local search steps inside of the M step of EM; this is called Structural EM, and provably converges to a local maximum of the BIC score (Friedman, 1997).

Inventing new hidden nodes

So far, structure learning has meant finding the right connectivity between pre-existing nodes.
A more interesting problem is inventing hidden nodes on demand. Hidden nodes can make a model much more compact, as we see below.

(a) A BN with a hidden variable H. (b) The simplest network that can capture the same distribution without using a hidden variable (created using arc reversal and node elimination). If H is binary and the other nodes are trinary, and we assume full CPTs, the first network has 45 independent parameters, and the second has 708.

The standard approach is to keep adding hidden nodes one at a time, to some part of the network (see below), performing structure learning at each step, until the score drops. One problem is choosing the cardinality (number of possible values) for the hidden node, and its type of CPD. Another problem is choosing where to add the new hidden node. There is no point making it a child, since hidden children can always be marginalized away, so we need to find an existing node which needs a new parent, when the current set of possible parents is not adequate.

[Ramachandran98] use the following heuristic for finding nodes which need new parents:
they consider a noisy-OR node which is nearly always on, even if its non-leak parents are off, as an indicator that there is a missing parent. Generalizing this technique beyond noisy-ORs is an interesting open problem. One approach might be to examine H(X | Pa(X)): if this is very high, it means the current set of parents are inadequate to "explain" the residual entropy; if Pa(X) is the best (in the BIC or chi^2 sense) set of parents we have been able to find in the current model, it suggests we need to create a new node and add it to Pa(X).

A simple heuristic for inventing hidden nodes in the case of DBNs is to check if the Markov property is being violated for any particular node. If so, it suggests that we need connections to slices further back in time. Equivalently, we can add new lag variables and connect to them.

Of course, interpreting the "meaning" of hidden nodes is always tricky, especially since they are often unidentifiable, e.g., we can often switch the interpretation of the true and false states (assuming for simplicity that the hidden node is binary), provided we also permute the parameters appropriately.
(Symmetries such as this are one cause of the multiple maxima in the likelihood surface.)

Further reading on learning

The following are good tutorial articles.

W. L. Buntine, 1994. "Operations for Learning with Graphical Models", J. AI Research, 159-225.

D. Heckerman, 1996. "A tutorial on learning with Bayesian networks", Microsoft Research tech. report, MSR-TR-95-06.

Decision Theory

It is often said that "Decision Theory = Probability Theory + Utility Theory". We have outlined above how we can model joint probability distributions in a compact way by using sparse graphs to reflect conditional independence relationships. It is also possible to decompose multi-attribute utility functions in a similar way: we create a node for each term in the sum, which has as parents all the attributes (random variables) on which it depends; typically, the utility node(s) will have action node(s) as parents, since the utility depends both on the state of the world and the action we perform. The resulting graph is called an influence diagram.
In principle, we can then use the influence diagram to compute the optimal (sequence of) action(s) to perform so as to maximize expected utility, although this is computationally intractable for all but the smallest problems.

Classical control theory is mostly concerned with the special case where the graphical model is a Linear Dynamical System and the utility function is negative quadratic loss, e.g., consider a missile tracking an airplane: its goal is to minimize the squared distance between itself and the target. When the utility function and/or the system model becomes more complicated, traditional methods break down, and one has to use reinforcement learning to find the optimal policy (mapping from states to actions).

Applications

The most widely used Bayes Nets are undoubtedly the ones embedded in Microsoft's products, including the Answer Wizard of Office 95, the Office Assistant (the bouncy paperclip guy) of Office 97, and over 30 Technical Support Troubleshooters.

BNs originally arose out of an attempt to add probabilities to expert systems, and this is still the most common use for BNs. A famous example is QMR-DT, a decision-theoretic reformulation of the Quick Medical Reference (QMR) model.
Here, the top layer represents hidden disease nodes, and the bottom layer represents observed symptom nodes. The goal is to infer the posterior probability of each disease given all the symptoms (which can be present, absent or unknown). QMR-DT is so densely connected that exact inference is impossible. Various approximation methods have been used, including sampling, variational and loopy belief propagation.

Another interesting fielded application is the Vista system, developed by Eric Horvitz. The Vista system is a decision-theoretic system that has been used at NASA Mission Control Center in Houston for several years. The system uses Bayesian networks to interpret live telemetry and provides advice on the likelihood of alternative failures of the space shuttle's propulsion systems. It also considers time criticality and recommends actions of the highest expected utility. The Vista system also employs decision-theoretic methods for controlling the display of information to dynamically identify the most important information to highlight. Horvitz has gone on to attempt to apply similar technology to Microsoft products, e.g., the Lumiere project.

Special cases of BNs were independently invented by many different communities, for use in, e.g.,
SENT , , genetics NN linkage NN analysis NN , , speech NN recognition NN HMMs NP , , tracking VVG Kalman NP fitering NN , , data NN compression NN density NN estimation NN and CC coding VVG turbocodes NNS , , etc FW . SENT For IN examples NNS of IN other JJ applications NNS , , see VVP the DT special JJ issue NN of IN Proc NP . SENT ACM NP 38 CD 3 CD , , 1995 CD , , and CC the DT Microsoft NP Decision NN Theory NN Group NP page NN . SENT Applications NNS to TO biology NN This DT is VBZ one CD of IN the DT hottest JJS areas NNS . SENT For IN a DT review NN , , see VV Inferring VVG cellular JJ networks NNS using VVG probabilistic JJ graphical JJ models NNS Science NN , , Nir NP Friedman NP , , v NN 303 CD p NN 799 CD , , 6 CD Feb NN 2004 CD . SENT Recommended JJ introductory JJ reading NN Books NNS In IN reverse JJ chronological JJ order NN bold JJ means NNS particularly RB recommended VVD F NN . SENT V NN . SENT Jensen NP . SENT Bayesian NP Networks NPS and CC Decision NN Graphs NNS . SENT Springer NP . SENT 2001 CD . SENT Probably RB the DT best RBS introductory JJ book NN available JJ . SENT D SYM . SENT Edwards NP . SENT Introduction NN to TO Graphical JJ Modelling VVG , , 2 CD nd NN ed NP . SENT Springer NP Verlag NP . SENT 2000 CD . SENT Good JJ treatment NN of IN undirected JJ graphical JJ models NNS from IN a DT statistical JJ perspective NN . SENT J NP . SENT Pearl NP . SENT Causality NN . SENT Cambridge NP . SENT 2000 CD . SENT The DT definitive JJ book NN on IN using VVG causal JJ DAG NN modeling NN . SENT R SYM . SENT G NP . SENT Cowell NP , , A NP . SENT P NN . SENT Dawid NP , , S NP . SENT L NP . SENT Lauritzen NP and CC D NP . SENT J NP . SENT Spiegelhalter NP . SENT Probabilistic JJ Networks NNS and CC Expert JJ Systems NPS . SENT Springer NP Verlag NP . SENT 1999 CD . SENT Probably RB the DT best JJS book NN available JJ , , although IN the DT treatment NN is VBZ restricted VVN to TO exact JJ inference NN . SENT M NP . SENT I PP . 
SENT Jordan NP ed NP . SENT Learning VVG in IN Graphical JJ Models NNS . SENT MIT NP Press NP . SENT 1998 CD . SENT Loose JJ collection NN of IN papers NNS on IN machine NN learning NN , , many RB related VVN to TO graphical JJ models NNS . SENT One CD of IN the DT few JJ books NNS to TO discuss VV approximate JJ inference NN . SENT B LS . SENT Frey NP . SENT Graphical JJ models NNS for IN machine NN learning NN and CC digital JJ communication NN , , MIT NP Press NP . SENT 1998 CD . SENT Discusses VVZ pattern NN recognition NN and CC turbocodes NNS using VVG directed VVN graphical JJ models NNS . SENT E SYM . SENT Castillo NP and CC J NP . SENT M NP . SENT Gutierrez NP and CC A NP . SENT S NP . SENT Hadi NP . SENT Expert NN systems NNS and CC probabilistic JJ network NN models NNS . SENT Springer NP Verlag NP , , 1997 CD . SENT A DT Spanish JJ version NN is VBZ available JJ online JJ for IN free JJ . SENT F SYM . SENT Jensen NP . SENT An DT introduction NN to TO Bayesian NP Networks NPS . SENT UCL NP Press NP . SENT 1996 CD . SENT Out RB of IN print NN . SENT Superceded VVD by IN his PP$ 2001 CD book NN . SENT S NP . SENT Lauritzen NP . SENT Graphical JJ Models NNS , , Oxford NP . SENT 1996 CD . SENT The DT definitive JJ mathematical JJ exposition NN of IN the DT theory NN of IN graphical JJ models NNS . SENT S NP . SENT Russell NP and CC P NN . SENT Norvig NP . SENT Artificial JJ Intelligence NP . SENT A DT Modern NP Approach NN . SENT Prentice NP Hall NP . SENT 1995 CD . SENT Popular JJ undergraduate JJ textbook NN that WDT includes VVZ a DT readable JJ chapter NN on IN directed VVN graphical JJ models NNS . SENT J NP . SENT Whittaker NP . SENT Graphical JJ Models NNS in IN Applied NP Multivariate NP Statistics NP , , Wiley NP . SENT 1990 CD . SENT This DT is VBZ the DT first JJ book NN published VVN on IN graphical JJ modelling VVG from IN a DT statistics NNS perspective NN . SENT R SYM . SENT Neapoliton NP . 
SENT Probabilistic JJ Reasoning NN in IN Expert JJ Systems NPS . SENT John NP Wiley NP Sons NPS . SENT 1990 CD . SENT J NP . SENT Pearl NP . SENT Probabilistic JJ Reasoning NN in IN Intelligent JJ Systems NPS . SENT Networks NNS of IN Plausible JJ Inference NN . SENT Morgan NP Kaufmann NP . SENT 1988 CD . SENT The DT book NN that WDT got VVD it PP all RB started VVD . SENT A DT very RB insightful JJ book NN , , still RB relevant JJ today NN . SENT Review NN articles NNS P NN . SENT Smyth NP , , 1998 CD . SENT Belief NN networks NNS , , hidden JJ Markov NP models NNS , , and CC Markov NP random JJ fields NNS . SENT a DT unifying JJ view NN , , Pattern NN Recognition NP Letters NP . SENT E SYM . SENT Charniak NP , , 1991 CD . SENT Bayesian NP Networks NPS without IN Tears NNS , , AI VVP magazine NN . SENT Sam NP Roweis NP Zoubin NP Ghahramani NP , , 1999 CD . SENT A DT Unifying JJ Review NP of IN Linear NP Gaussian JJ Models NNS , , Neural JJ Computation NN 11 CD 2 CD 1999 CD pp NP . SENT 305 CD 345 CD Exact JJ Inference NN C NP . SENT Huang NP and CC A NP . SENT Darwiche NP , , 1996 CD . SENT Inference NN in IN Belief NN Networks NNS . SENT A DT procedural JJ guide NN , , Intl NP . SENT J NP . SENT Approximate JJ Reasoning NN , , 15 CD 3 CD . SENT 225 CD 263 CD . SENT R SYM . SENT McEliece NP and CC S NP . SENT M NP . SENT Aji NP , , 2000 CD . SENT The DT Generalized VVN Distributive NN Law NN , , IEEE NP Trans NP . SENT Inform VV . SENT Theory NN , , vol NP . SENT 46 CD , , no RB . SENT 2 CD March NP 2000 CD , , pp NP . SENT 325 CD 343 CD . SENT F SYM . SENT Kschischang NP , , B NP . SENT Frey NP and CC H NP . SENT Loeliger NP , , 2001 CD . SENT Factor NN graphs NNS and CC the DT sum NN product NN algorithm NN , , IEEE NP Transactions NNS on IN Information NP Theory NN , , February NP , , 2001 CD . SENT M NP . SENT Peot NP and CC R NP . SENT Shachter NP , , 1991 CD . 
SENT Fusion NN and CC propogation NN with IN multiple JJ observations NNS in IN belief NN networks NNS , , Artificial JJ Intelligence NP , , 48 CD . SENT 299 CD 318 CD . SENT Approximate JJ Inference NN M NP . SENT I PP . SENT Jordan NP , , Z NP . SENT Ghahramani NP , , T NN . SENT S NP . SENT Jaakkola NP , , and CC L NP . SENT K NP . SENT Saul NP , , 1997 CD . SENT An DT introduction NN to TO variational JJ methods NNS for IN graphical JJ models NNS . SENT D SYM . SENT MacKay NP , , 1998 CD . SENT An DT introduction NN to TO Monte NP Carlo NP methods NNS . SENT T NN . SENT Jaakkola NP and CC M NP . SENT Jordan NP , , 1998 CD . SENT Variational JJ probabilistic JJ inference NN and CC the DT QMR NP DT NP database NN Learning NP W NP . SENT L NP . SENT Buntine NP , , 1994 CD . SENT Operations NNS for IN Learning NP with IN Graphical JJ Models NNS , , J NP . SENT AI VVZ Research NP , , 159 CD 225 CD . SENT D SYM . SENT Heckerman NP , , 1996 CD . SENT A DT tutorial NN on IN learning VVG with IN Bayesian NP networks NNS , , Microsoft NP Research NP tech NN . SENT report NN , , MSR NP TR NP 95 CD 06 CD . SENT DBNs NP L NP . SENT R SYM . SENT Rabiner NP , , 1989 CD . SENT A DT Tutorial NN in IN Hidden NP Markov NP Models NNS and CC Selected VVD Applications NP in IN Speech NN Recognition NP , , Proc NP . SENT of IN the DT IEEE NP , , 77 CD 2 CD . SENT 257 CD 286 CD . SENT Z SYM . SENT Ghahramani NP , , 1998 CD . SENT Learning VVG Dynamic JJ Bayesian NP Networks NPS In IN C NP . SENT L NP . SENT Giles NP and CC M NP . SENT Gori NP eds NPS . SENT , , Adaptive JJ Processing VVG of IN Sequences NNS and CC Data NP Structures NNS . SENT Lecture NN Notes NNS in IN Artificial JJ Intelligence NP , , 168 CD 197 CD . SENT Berlin NP . SENT Springer NP Verlag NP . SENT