Bayesian Phylogenetics

Workshop on Molecular Evolution, Woods Hole, Massachusetts, 4 Aug 2019
Paul O. Lewis, Department of Ecology & Evolutionary Biology
See also 20 & 27 June 2018 at

Bayesian inference

Joint probabilities
10 marbles in a bag, sampled with replacement. Each marble is black (B) or white (W) and solid (S) or dotted (D):
Pr(B,S) = 0.4 (black, solid)
Pr(W,S) = 0.1 (white, solid)
Pr(B,D) = 0.2 (black, dotted)
Pr(W,D) = 0.3 (white, dotted)

Conditional probabilities
What is the probability that a marble is black given that it is dotted? Five marbles satisfy the condition (D), and 2 of those 5 are black (B), so
$\Pr(B \mid D) = \frac{2}{5}$

Marginal probabilities
Marginalizing over color yields the total probability that a marble is dotted (D):
Pr(D) = Pr(B,D) + Pr(W,D) = 0.2 + 0.3 = 0.5
Marginalization involves summing all joint probabilities containing D.

Marginalization
Joint probabilities arranged by pattern (rows) and color (columns):
      B          W
D   Pr(D,B)    Pr(D,W)
S   Pr(S,B)    Pr(S,W)

Marginalizing over colors
The marginal probability of being dotted is the sum of all joint probabilities involving dotted marbles (the D row): Pr(D) = Pr(D,B) + Pr(D,W).

Marginalizing over "dottedness"
The marginal probability of being a white marble is the sum down the W column: Pr(W) = Pr(D,W) + Pr(S,W).

Bayes' rule
The joint probability Pr(B,D) can be written as the product of a conditional probability and the probability of that condition:
$\Pr(B, D) = \Pr(B \mid D)\,\Pr(D)$
Either B or D can be the condition:
$\Pr(B, D) = \Pr(D \mid B)\,\Pr(B)$

Bayes' rule
Equate the two ways of writing Pr(B,D):
$\Pr(B \mid D)\,\Pr(D) = \Pr(D \mid B)\,\Pr(B)$
Divide both sides by Pr(D):
$\frac{\Pr(B \mid D)\,\Pr(D)}{\Pr(D)} = \frac{\Pr(D \mid B)\,\Pr(B)}{\Pr(D)}$
The result is Bayes' rule:
$\Pr(B \mid D) = \frac{\Pr(D \mid B)\,\Pr(B)}{\Pr(D)}$

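Bayes' rule
For the marbles, Pr(D | B) = 1/3, Pr(B) = 3/5, and Pr(D) = 1/2, so
$\Pr(B \mid D) = \frac{\Pr(D \mid B)\,\Pr(B)}{\Pr(D)} = \frac{\frac{1}{3} \cdot \frac{3}{5}}{\frac{1}{2}} = \frac{2}{5}$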
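
The marble arithmetic can be checked mechanically. The following is a minimal sketch, not part of the original slides; the function and variable names are made up for illustration, and exact fractions are used so that the results match the hand calculation.

```python
from fractions import Fraction

# Joint probabilities from the marble example
# (B/W = black/white, S/D = solid/dotted).
joint = {
    ("B", "S"): Fraction(4, 10),
    ("W", "S"): Fraction(1, 10),
    ("B", "D"): Fraction(2, 10),
    ("W", "D"): Fraction(3, 10),
}


def marginal_color(color):
    """Marginalize over pattern: sum every joint probability with this color."""
    return sum(p for (c, _), p in joint.items() if c == color)


def marginal_pattern(pattern):
    """Marginalize over color: sum every joint probability with this pattern."""
    return sum(p for (_, s), p in joint.items() if s == pattern)


def pattern_given_color(pattern, color):
    """Conditional probability Pr(pattern | color) = joint / marginal."""
    return joint[(color, pattern)] / marginal_color(color)


pr_D = marginal_pattern("D")                # Pr(D)
pr_B = marginal_color("B")                  # Pr(B)
pr_D_given_B = pattern_given_color("D", "B")  # Pr(D | B)

# Bayes' rule: Pr(B | D) = Pr(D | B) Pr(B) / Pr(D)
pr_B_given_D = pr_D_given_B * pr_B / pr_D

# Direct route for comparison: Pr(B | D) = Pr(B, D) / Pr(D)
pr_B_given_D_direct = joint[("B", "D")] / pr_D
```

Both routes give the same value, which is the point of Bayes' rule: it only rearranges the two ways of factoring the joint probability.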

Bayes' rule (variations)
Pr(D) is the marginal probability of being dotted. To compute it, we marginalize over colors.

Bayes' rule (variations)
$\Pr(B \mid D) = \frac{\Pr(D \mid B)\,\Pr(B)}{\Pr(B, D) + \Pr(W, D)} = \frac{\Pr(D \mid B)\,\Pr(B)}{\Pr(D \mid B)\,\Pr(B) + \Pr(D \mid W)\,\Pr(W)} = \frac{\Pr(D \mid B)\,\Pr(B)}{\sum_{\theta \in \{B, W\}} \Pr(D \mid \theta)\,\Pr(\theta)}$

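Bayes' rule in statistics
$\Pr(\theta \mid D) = \frac{\Pr(D \mid \theta)\,\Pr(\theta)}{\sum_{\theta} \Pr(D \mid \theta)\,\Pr(\theta)}$
where
Pr(θ | D) is the posterior probability of hypothesis θ,
Pr(D | θ) is the likelihood of hypothesis θ,
Pr(θ) is the prior probability of hypothesis θ, and
the denominator is the marginal probability of the data (marginalizing over hypotheses).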

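Paternity example
$\Pr(\theta \mid D) = \frac{\Pr(D \mid \theta)\,\Pr(\theta)}{\sum_{\theta} \Pr(D \mid \theta)\,\Pr(\theta)}$
Genotypes: father A-, mother aa, child Aa. The two hypotheses are the possible genotypes of the father.

                      θ1     θ2     Row sum
Genotypes             AA     Aa     ---
Prior                 1/2    1/2    1
Likelihood            1      1/2    ---
Prior × Likelihood    1/2    1/4    3/4
Posterior             2/3    1/3    1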
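
The rows of the paternity table follow from one short calculation. This is an illustrative sketch only; the `posterior` helper is a hypothetical name, not something from the slides.

```python
from fractions import Fraction


def posterior(priors, likelihoods):
    """Discrete Bayes' rule.

    Returns (posterior probabilities, marginal probability of the data).
    """
    # Numerator of Bayes' rule for each hypothesis: prior x likelihood.
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    # Denominator: marginalize over hypotheses.
    marginal = sum(unnormalized.values())
    return {h: u / marginal for h, u in unnormalized.items()}, marginal


# Hypotheses are the father's possible genotypes.
priors = {"AA": Fraction(1, 2), "Aa": Fraction(1, 2)}
# Probability of an Aa child with an aa mother under each hypothesis.
likelihoods = {"AA": Fraction(1), "Aa": Fraction(1, 2)}

post, marginal = posterior(priors, likelihoods)
```

Here `marginal` corresponds to the 3/4 in the "Prior × Likelihood" row sum, and `post` to the "Posterior" row.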

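Bayes' rule: continuous case
$p(\theta \mid D) = \frac{p(D \mid \theta)\,p(\theta)}{\int p(D \mid \theta)\,p(\theta)\,d\theta}$
where p(θ | D) is the posterior probability density, p(D | θ) is the likelihood, p(θ) is the prior probability density, and the denominator is the marginal probability of the data.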

If you had to guess...
Not knowing anything about my archery abilities, draw a curve representing your view of the chances of my arrow landing a distance d centimeters from the center of the target.
[Figure: photo of an archery target (photo by Tracy Heath), scale label "1 meter"; horizontal axis: d (centimeters from target center)]

Case 1: assume I have talent
An informative prior (low variance) that says most of my arrows will fall within 20 cm of the center (thanks for your confidence!).
[Figure: prior density curve over the target; horizontal axis: d (cm)]

Case 2: assume I have a talent for missing the target!
Also an informative prior, but one that says most of my arrows will fall within a narrow range just outside the entire target!
[Figure: prior density curve over the target; horizontal axis: d (cm)]

Case 3: assume I have no talent
This is a vague prior: its high variance reflects nearly total ignorance of my abilities, saying that my arrows could land nearly anywhere!
[Figure: prior density curve over the target; horizontal axis: d (cm)]

A matter of scale
Notice that I haven't provided a scale for the vertical axis. What exactly does the height of this curve mean? For example, does the height of the dotted line represent the probability that my arrow lands 60 cm from the center of the target?
[Figure: density curve with a dotted vertical line at d = 60 cm]

Probabilities are associated with intervals
Probabilities are attached to intervals (i.e. ranges of values), not individual values. The probability of any given point (e.g. d = 60.0) is zero! However, we can ask about the probability that d falls in a particular interval, e.g. 50.0 < d < 65.0.

Probabilities vs. probability densities
[Figure: a probability density function]
Note: the height of this curve does not represent a probability (if it did, it would not exceed 1.0).

Densities of various substances
Substance    Density (g/cm³)
Cork         0.24
Aluminum     2.7
Gold         19.3
Density does not equal mass: mass = density × volume.

A brick with varying density
[Figure: density plotted against distance from the left end of a brick, with gold on the left end and aluminum on the right end]

Integrating a density yields a probability
The area of a rectangle of infinitesimal width dθ under the curve is p(θ) dθ.
$1.0 = \int p(\theta)\,d\theta$
The density curve is scaled so that the value of this integral (i.e. the total area) equals 1.0.
[Figure: density p(θ) plotted against θ]

Integrating a density yields a probability
The area of a rectangle of infinitesimal width dθ under the curve is p(θ) dθ.
$0.39109 = \int_{1}^{2} p(\theta)\,d\theta$
The area under the density curve from 1 to 2 is the probability that θ is between 1 and 2.
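
The "sum of thin rectangles" picture can be turned into a few lines of code. The sketch below assumes an exponential density purely for illustration; the text here does not say which density the slide plots, so the interval probability computed below is not expected to match the 0.39109 shown above. All names are hypothetical.

```python
import math


def exponential_density(theta, rate=1.0):
    """Assumed example density (not the one on the slide): rate * exp(-rate * theta)."""
    return rate * math.exp(-rate * theta) if theta >= 0 else 0.0


def integrate(density, lower, upper, n_rectangles=100_000):
    """Midpoint rule: add up rectangle areas p(theta) * d_theta."""
    width = (upper - lower) / n_rectangles
    heights = (density(lower + (i + 0.5) * width) for i in range(n_rectangles))
    return sum(heights) * width


# Total area under the density (the tail beyond 50 is negligible here).
total_area = integrate(exponential_density, 0.0, 50.0)

# Probability that theta lies between 1 and 2, numerically and in closed form.
prob_1_to_2 = integrate(exponential_density, 1.0, 2.0)
exact_1_to_2 = math.exp(-1.0) - math.exp(-2.0)

# A density height is not a probability: with rate 2 the curve starts at 2.0.
height_at_zero = exponential_density(0.0, rate=2.0)
```

The last line illustrates the earlier warning: a density can exceed 1.0 at a point while every interval probability still lies between 0 and 1.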

Archery priors revisited
These density curves are all variations of a gamma probability distribution:
mean = 1.732, variance = 3
mean = 60, variance = 3
mean = 200, variance = 40000
We could have used a gamma distribution to specify each of the prior probability distributions for the archery example. Note that higher variance means less informative.
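
A gamma distribution is usually specified by a shape and a scale rather than by a mean and a variance. The sketch below converts between the two, assuming the shape/scale parameterization in which mean = shape × scale and variance = shape × scale². The helper names are illustrative and do not come from the slides.

```python
import math


def gamma_shape_scale(mean, variance):
    """Solve mean = shape * scale and variance = shape * scale**2 for (shape, scale)."""
    shape = mean ** 2 / variance
    scale = variance / mean
    return shape, scale


def gamma_density(x, shape, scale):
    """Gamma probability density, computed on the log scale for numerical stability."""
    if x <= 0:
        return 0.0
    log_density = (
        (shape - 1.0) * math.log(x)
        - x / scale
        - math.lgamma(shape)
        - shape * math.log(scale)
    )
    return math.exp(log_density)


# The three (mean, variance) pairs listed for the archery priors.
archery_priors = [(1.732, 3.0), (60.0, 3.0), (200.0, 40000.0)]
parameters = [gamma_shape_scale(m, v) for m, v in archery_priors]
```

Under this parameterization the first and third curves have shape (approximately) 1, that is, they are exponential distributions with very different scales, while the second has a large shape and is tightly concentrated around 60.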

Usually there are many parameters...
A 2-parameter example:
$p(\theta, \phi \mid D) = \frac{p(D \mid \theta, \phi)\,p(\theta)\,p(\phi)}{\int_{\theta} \int_{\phi} p(D \mid \theta, \phi)\,p(\theta)\,p(\phi)\,d\phi\,d\theta}$
Here p(θ, ϕ | D) is the posterior probability density, p(D | θ, ϕ) is the likelihood, p(θ) p(ϕ) is the prior density, and the denominator is the marginal probability of the data.
An analysis of 100 sequences under the simplest model (JC69) requires 197 branch length parameters. The denominator would require a 197-fold integral inside a sum over all possible tree topologies! It would thus be nice to avoid having to calculate the marginal probability of the data...
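
To get a feel for the size of that sum, the sketch below counts branches and topologies. It assumes unrooted, fully bifurcating trees, for which a tree with n tips has 2n − 3 branches and there are (2n − 5)!! distinct labeled topologies. That counting formula is a standard result rather than something stated on the slide, and the function names are hypothetical.

```python
def num_branches_unrooted(n_tips):
    """Branch count of an unrooted bifurcating tree with n_tips tips."""
    if n_tips < 3:
        raise ValueError("need at least 3 tips")
    return 2 * n_tips - 3


def num_unrooted_topologies(n_tips):
    """(2n - 5)!! = 3 * 5 * ... * (2n - 5) labeled unrooted bifurcating topologies."""
    if n_tips < 3:
        raise ValueError("need at least 3 tips")
    count = 1
    for odd in range(3, 2 * n_tips - 4, 2):
        count *= odd
    return count


branches_100 = num_branches_unrooted(100)       # one branch length per branch
topologies_100 = num_unrooted_topologies(100)   # terms in the sum over topologies
digits_100 = len(str(topologies_100))           # order of magnitude of that count
```

Each topology contributes its own 197-dimensional integral, which is why computing the marginal probability of the data directly is not an option.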

Markov chain Monte Carlo (MCMC)

For more complex problems, we might settle for a good approximation to the posterior distribution.
[Figure: prior and posterior density curves]

MCMC robot's rules
Uphill steps are always accepted.
Slightly downhill steps are usually accepted.
Drastic "off the cliff" downhill steps are almost never accepted.
With these rules, it is easy to see why the robot tends to stay near the tops of hills.

Actual rules (Metropolis algorithm)
Uphill steps are always accepted because R > 1: currently at 1.0 m, proposed at 2.3 m, R = 2.3/1.0 = 2.3.
Slightly downhill steps are usually accepted because R is near 1: currently at 6.2 m, proposed at 5.7 m, R = 5.7/6.2 = 0.92.
Drastic "off the cliff" downhill steps are almost never accepted because R is near 0: currently at 6.2 m, proposed at 0.2 m, R = 0.2/6.2 = 0.03.
The robot takes a step if a Uniform(0,1) random deviate is less than or equal to R.
Metropolis et al. 1953. Equation of state calculations by fast computing machines. J. Chem. Physics 21(6):1087-1092.
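
The robot's rule fits in a few lines. The following is a minimal sketch of a Metropolis sampler under stated assumptions: the landscape `hill` (an unnormalized standard normal curve), the uniform step proposal, and every name below are illustrative choices, not taken from the slides.

```python
import math
import random


def acceptance_ratio(current_height, proposed_height):
    """R = height at the proposed spot / height at the current spot."""
    return proposed_height / current_height


def takes_step(current_height, proposed_height, rng):
    """Accept when a Uniform(0,1) deviate is <= R; uphill moves (R > 1) always pass."""
    return rng.random() <= acceptance_ratio(current_height, proposed_height)


def hill(x):
    """Hypothetical landscape: an unnormalized standard normal density."""
    return math.exp(-0.5 * x * x)


def run_robot(n_steps, step_size=1.0, start=0.0, seed=1):
    """Walk the landscape, recording the robot's position after every iteration."""
    rng = random.Random(seed)
    position = start
    visited = []
    for _ in range(n_steps):
        # Symmetric proposal: a step of up to step_size in either direction.
        proposal = position + rng.uniform(-step_size, step_size)
        if takes_step(hill(position), hill(proposal), rng):
            position = proposal
        visited.append(position)  # a rejected step repeats the current position
    return visited


samples = run_robot(20_000)
sample_mean = sum(samples) / len(samples)
sample_var = sum((x - sample_mean) ** 2 for x in samples) / len(samples)
```

Because only the ratio of heights enters the rule, the landscape never has to be normalized. The recorded positions should have mean and variance close to those of a standard normal distribution (0 and 1).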

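Cancellation of marginal likelihood
When calculating the ratio (R) of posterior densities, the marginal probability of the data cancels. Apply Bayes' rule to both top and bottom:
$\frac{p(\theta^* \mid D)}{p(\theta \mid D)} = \frac{\frac{p(D \mid \theta^*)\,p(\theta^*)}{p(D)}}{\frac{p(D \mid \theta)\,p(\theta)}{p(D)}} = \frac{p(D \mid \theta^*)\,p(\theta^*)}{p(D \mid \theta)\,p(\theta)}$
The left side is the posterior odds; the right side is the likelihood ratio times the prior odds.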
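
The cancellation can be seen numerically by reusing the paternity numbers (priors 1/2 and 1/2, likelihoods 1 and 1/2). This is an illustrative sketch with hypothetical names: the posterior odds are computed once with the marginal probability of the data and once without it.

```python
from fractions import Fraction


def posterior_odds(lik_star, prior_star, lik, prior):
    """Likelihood ratio times prior odds; p(D) never appears."""
    return (lik_star / lik) * (prior_star / prior)


# Paternity example: theta* = AA, theta = Aa.
prior_AA, prior_Aa = Fraction(1, 2), Fraction(1, 2)
lik_AA, lik_Aa = Fraction(1), Fraction(1, 2)

# Route 1: normalize each posterior by the marginal probability of the data.
marginal = lik_AA * prior_AA + lik_Aa * prior_Aa
ratio_with_marginal = (lik_AA * prior_AA / marginal) / (lik_Aa * prior_Aa / marginal)

# Route 2: skip the marginal entirely.
ratio_without_marginal = posterior_odds(lik_AA, prior_AA, lik_Aa, prior_Aa)
```

Both routes give the same ratio, so an MCMC sampler that only ever needs R can ignore the marginal probability of the data.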

Target vs. Proposal Distributions
"Good" proposal distribution. The proposal distribution is used by the robot to choose the next spot to step, and is separate from the target distribution. The target is usually the posterior distribution.
[Figure: trace plot from Tracer (an app for generating trace plots from MCMC output): log(posterior) against MCMC iteration, 0 to 1000]
White noise appearance is a sign of good mixing.

Target vs. Proposal Distributions
"Baby steps" proposal distribution.
[Figure: trace plot of log(posterior) against MCMC iteration, 0 to 1000]
Big waves in the trace plot indicate the robot is crawling around.

Target vs. Proposal Distributions
"Overly bold" proposal distribution.
[Figure: trace plot of log(posterior) against MCMC iteration, 0 to 1000]
Plateaus in the trace plot indicate the robot is often stuck in one place.
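
The three proposal behaviors can be compared through their acceptance rates. The sketch below is only illustrative: the target (a standard normal, on the log scale), the normal proposal, the three step sizes, and all names are assumptions made for the example.

```python
import math
import random


def log_target(x):
    """Assumed target for illustration: log of an unnormalized standard normal density."""
    return -0.5 * x * x


def acceptance_rate(step_size, n_steps=5000, seed=7):
    """Fraction of proposals accepted by a Metropolis sampler with this step size."""
    rng = random.Random(seed)
    x = 0.0
    accepted = 0
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)  # symmetric normal proposal
        log_r = log_target(proposal) - log_target(x)
        # Working with log R avoids underflow for very unlikely proposals.
        if log_r >= 0 or rng.random() <= math.exp(log_r):
            x = proposal
            accepted += 1
    return accepted / n_steps


step_sizes = {"baby steps": 0.01, "good": 2.4, "overly bold": 100.0}
rates = {name: acceptance_rate(size) for name, size in step_sizes.items()}
```

Baby steps are accepted almost every time but barely move the robot (the big waves in the trace plot), while overly bold steps are almost always rejected (the plateaus). A useful proposal sits between the two extremes.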

MCRobot (or "MCMC Robot")
The JavaScript version used today will run in most web browsers and is available here:
Free app for Windows or iPhone/iPad available from (also see John Huelsenbeck's iMCMC app for macOS:

Metropolis-coupled Markov chain Monte Carlo (MCMCMC)
Sometimes the robot needs some help. MCMCMC introduces helpers in the form of "heated chain" robots that can act as scouts.
Geyer, C. J. 1991. Markov chain Monte Carlo maximum likelihood for dependent data. Pages 156-163 in Computing Science and Statistics (E. Keramidas, ed.).

Heated chains act as scouts for the cold chain
The cold chain robot can easily make this jump because it is uphill. The hot chain robot can also make this jump with high probability because it is only slightly downhill.
[Figure: the cold landscape and the flatter heated landscape]

Heated chains act as scouts for the cold chain
Swapping places means both robots can cross the valley, but this is more important for the cold chain because its valley is much deeper.
[Figure: the cold landscape and the flatter heated landscape]
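
A two-chain version of this idea can be sketched as follows. Everything here is an assumption made for illustration: the two-hill landscape, heating by raising the target to a power β (β = 1 for the cold chain, β = 0.05 for the heated one), the step size, and the function names. It is not the implementation used by any particular phylogenetics program.

```python
import math
import random


def log_target(x):
    """Hypothetical two-hill landscape: normal-shaped hills at -4 and +4 (sd 0.5)."""
    a = -0.5 * ((x + 4.0) / 0.5) ** 2
    b = -0.5 * ((x - 4.0) / 0.5) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))  # stable log-sum-exp


def swap_log_ratio(x_cold, x_hot, beta_cold, beta_hot):
    """Log of the Metropolis ratio for exchanging the positions of two chains."""
    return (beta_cold - beta_hot) * (log_target(x_hot) - log_target(x_cold))


def mc3(n_steps, betas=(1.0, 0.05), step_size=1.0, seed=3):
    """Run one cold and one heated chain; return the cold chain's positions."""
    rng = random.Random(seed)
    positions = [-4.0 for _ in betas]  # both robots start on the left hill
    cold_samples = []
    for _ in range(n_steps):
        # Ordinary Metropolis update of each chain on its own flattened landscape.
        for i, beta in enumerate(betas):
            proposal = positions[i] + rng.gauss(0.0, step_size)
            log_r = beta * (log_target(proposal) - log_target(positions[i]))
            if log_r >= 0 or rng.random() < math.exp(log_r):
                positions[i] = proposal
        # Propose that the cold and heated robots trade places.
        log_swap = swap_log_ratio(positions[0], positions[1], betas[0], betas[1])
        if log_swap >= 0 or rng.random() < math.exp(log_swap):
            positions[0], positions[1] = positions[1], positions[0]
        cold_samples.append(positions[0])
    return cold_samples


cold = mc3(20_000)
```

On its own, the cold chain would be very unlikely to cross the deep valley at 0 with steps of this size. With swaps, it inherits the heated chain's crossings, so `cold` should contain positions on both hills. Only the cold chain's positions are kept as samples from the target.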

The Hastings ratio
Suppose the probability of proposing a spot to the right (2/3) is twice that of proposing a spot to the left (1/3).
Hastings, W. K. 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57:97-109.

The Hastings ratio
[Figure: example in which proposals were biased toward due east, but the Hastings ratio was not used to modify acceptance probabilities]

The Hastings ratio
Propose left with probability 1/3 and right with probability 2/3; accept left with probability 2/3 and right with probability 1/3. Accepting half as many right proposals as left proposals serves to balance the proposal probabilities.

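Hastings ratio
$R = \left[ \frac{p(D \mid \theta^*)\,p(\theta^*)}{p(D \mid \theta)\,p(\theta)} \right] \left[ \frac{q(\theta \mid \theta^*)}{q(\theta^* \mid \theta)} \right]$
The first factor is the posterior ratio and the second is the Hastings ratio. Note that the Hastings ratio is 1.0 if q(θ* | θ) = q(θ | θ*).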
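
For the left/right example, the Hastings ratio and the resulting acceptance probabilities can be worked out exactly. The sketch below assumes a flat landscape (posterior ratio 1) and the usual min(1, R) acceptance rule; the names are hypothetical. Under these assumptions the acceptance probabilities come out as 1 (left) and 1/2 (right), the same 2:1 ratio as in the example above.

```python
from fractions import Fraction

# Biased proposal from the example: right is proposed twice as often as left.
PROPOSE = {"left": Fraction(1, 3), "right": Fraction(2, 3)}
OPPOSITE = {"left": "right", "right": "left"}


def hastings_ratio(direction):
    """q(theta | theta*) / q(theta* | theta): reverse-move over forward-move probability."""
    return PROPOSE[OPPOSITE[direction]] / PROPOSE[direction]


def acceptance_probability(direction, posterior_ratio=Fraction(1)):
    """min(1, posterior ratio x Hastings ratio)."""
    return min(Fraction(1), posterior_ratio * hastings_ratio(direction))


# Net rate of actually moving in each direction on a flat landscape.
flow_right = PROPOSE["right"] * acceptance_probability("right")
flow_left = PROPOSE["left"] * acceptance_probability("left")
```

The two flows are equal, so the bias in the proposals no longer pushes the robot in one direction. Without the Hastings ratio, the flow to the right would be twice the flow to the left.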