{"id":1476,"date":"2022-06-24T23:30:00","date_gmt":"2022-06-25T03:30:00","guid":{"rendered":"http:\/\/aristotle2digital.blogwyrm.com\/?p=1476"},"modified":"2022-06-20T16:33:51","modified_gmt":"2022-06-20T20:33:51","slug":"metropolis-hastings-ii","status":"publish","type":"post","link":"https:\/\/aristotle2digital.blogwyrm.com\/?p=1476","title":{"rendered":"Metropolis-Hastings II"},"content":{"rendered":"\n<p>Last month\u2019s posting introduced the Metropolis-Hastings (MH) method for generating a sample distribution from just about any functional form.&nbsp; This method can in handy particularly for methods that are analytically intractable so that the inverse-transform method would be useless and for which the accept-reject method would be too inefficient.&nbsp;<\/p>\n\n\n\n<p>The Metropolis-Hastings algorithm, which is particular instance of the broader Markov Chain Monte Carlo (MCMC) methods, brings us the desired flexibility at the expense of being computationally more involved and expensive.&nbsp; As a reminder, the algorithm works as follows:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Setup<ul><li>Select the distribution $f(x)$ from which a set random numbers are desired<\/li><\/ul><ul><li>Arbitrarily assign a value to a random realization $x_{start}$ of the desired distribution consistent with its range<\/li><\/ul><ul><li>Select a simple distribution $g(x|x_{cur})$, centered on the current value of the realization from which to draw candidate changes<\/li><\/ul><\/li><li>Execution<ul><li>Draw a change from $g(x|x_{cur})$<\/li><\/ul><ul><li>Create a candidate new realization $x_{cand}$<\/li><\/ul><ul><li>Form the ratio $R_f = f(x_{cand})\/f(x_{cur})$<\/li><\/ul><ul><li>Make a decision as the keeping or rejecting $x_{cand}$ as the new state to transition to according to the Metropolis-Hastings rule<ul><li>If $R_f \\geq 1$ then keep the change and let $x_{cur} = x_{cand}$<\/li><\/ul><ul><li>If $R_f &lt; 1$ then draw a random number $q$ from the unit uniform distribution and make the transition if $q &lt; R_f$<\/li><\/ul><\/li><\/ul><\/li><\/ul>\n\n\n\n<p>and the explicit promise of this month\u2019s installment was the explanation as to why this set of rules actually works.&nbsp;<\/p>\n\n\n\n<p>Technically speaking, the algorithm, as written above, is strictly the Metropolis-Hastings algorithm for the case where the simple distribution $g(x|x_{cur})$ is symmetric; the modification for the case of non-symmetric distributions will emerge naturally in the course of justification for the rules.<\/p>\n\n\n\n<p>The arguments presented here closely follow the excellent online presentation by ritvikmath:<\/p>\n\n\n\n<iframe loading=\"lazy\" width=\"560\" height=\"315\" src=\"https:\/\/www.youtube.com\/embed\/yCv2N7wGDCw\" title=\"YouTube video player\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture\" allowfullscreen><\/iframe>\n\n\n\n<p>To start, we relate our distribution $f(x)$ to the corresponding probability by<\/p>\n\n\n\n<p>\\[ p(x) = \\frac{f(x)}{N_C} \\; , \\]<\/p>\n\n\n\n<p>where $N_C$ is the normalization constant that is usually impossible to calculate analytically either because the functional form of $f(x)$ possess no analytic anti-derivative or the distribution is defined in high dimensions or both.<\/p>\n\n\n\n<p>The MH algorithm pairs $p(x)$ with a known probability distribution $g(x)$ but in a way such that the algorithm \u2018learns\u2019 where to concentrate its effort by having $g(x)$ follow the current state around (in contrast to the accept-reject method where $g(x)$ is stationary).&nbsp; At each step, the transition from the current state $a$ to the proposed or candidate next state $b$ happens according to the transition probability<\/p>\n\n\n\n<p>\\[ T(a \\rightarrow b) = g(b|a) A(a \\rightarrow b) \\; , \\]<\/p>\n\n\n\n<p>where $g(b|a)$ is conditional draw from $g(x)$ centered on state $a$ (i.e., $g$ follows the current state around) and the transition probability $A(a \\rightarrow b)$ is the MH decision step.&nbsp; The right-hand-side of the transition probability directly maps $g(b|a)$ and $A(a \\rightarrow b)$ to the first two and the last two sub-bullets of the execution step above, respectively.&nbsp;<\/p>\n\n\n\n<p>To derive the MH decision step correctly describes $A(a \\rightarrow b)$, we impose detailed balance,<\/p>\n\n\n\n<p>\\[ p(a) T(a \\rightarrow b) = p(b) T(b \\rightarrow a) \\; \\]<\/p>\n\n\n\n<p>guaranteeing that the MCMC will eventually result in a stationary distribution.&nbsp; That detailed balance give stationary distributions of the Markov chain is most easily seen by summing both sides over $a$ to arrive at a matrix analog of the stationary requirement ${\\bar S}^* = M {\\bar S}^*$ used in the <a href=\"http:\/\/aristotle2digital.blogwyrm.com\/?p=1455\">Sunny Day Markov<\/a> chain post.<\/p>\n\n\n\n<p>The next step involve a straightforward plugging-in of earlier relations to yield<\/p>\n\n\n\n<p>\\[ \\frac{f(a)}{N_C} g(b|a) A(a \\rightarrow b) = \\frac{f(b)}{N_C} g(a|b) A(b \\rightarrow a) \\; . \\]<\/p>\n\n\n\n<p>&nbsp;Collecting all of the $A$\u2019s over on one side gives<\/p>\n\n\n\n<p>\\[ \\frac{A(a \\rightarrow b)}{A(b \\rightarrow a)} = \\frac{f(b)}{f(a)} \\frac{g(a|b)}{g(b|a)} \\; . \\]<\/p>\n\n\n\n<p>Next define, for convenience, the two ratios<\/p>\n\n\n\n<p>\\[ R_f = \\frac{f(b)}{f(a)} \\; \\]<\/p>\n\n\n\n<p>and<\/p>\n\n\n\n<p>\\[ R_g = \\frac{ g(a|b) }{ g(b|a) }&nbsp; \\; . \\]<\/p>\n\n\n\n<p>Note that $R_f$ is the ratio defined earlier and that $R_g$ is exactly 1 when the $g(x)$ is a symmetric distribution.<\/p>\n\n\n\n<p>Now, we analyze two separate cases to make sure that the transitions $A(a \\rightarrow b)$ and $A(b \\rightarrow a)$ can actually satisfy detailed-balance.&nbsp;<\/p>\n\n\n\n<p>First, if $R_f R_g &nbsp;&lt; 1$ then detailed balance will hold if we assign $A(a \\rightarrow b) = R_f R_g$ and $A(b \\rightarrow a) = 1$.&nbsp; These conditions mean that we can interpret $A(a \\rightarrow b)$ as a probability for making the transition and we can then draw a number $q$ from a uniform distribution to see if we beat the odds $q \\leq R_f R_g$, in which case we transition to the state $b$.<\/p>\n\n\n\n<p>Second, if $R_f R_g \\geq 1$ then detailed balance will hold is we assign $A(a \\rightarrow b) = 1$ and $A(b \\rightarrow a) = R_f R_g$.&nbsp; These conditions mean that we can interpret $A(a \\rightarrow b)$ is being a demand, with 100% certainty, that we should make the change.<\/p>\n\n\n\n<p>These two cases, which comprise the MH decision rule, are often summarized as<\/p>\n\n\n\n<p>\\[ A(a \\rightarrow b) = Min(1,R_f R_g) \\; \\]<\/p>\n\n\n\n<p>or, in the case when $g(x)$ is symmetric<\/p>\n\n\n\n<p>\\[ A(a \\rightarrow b) = Min\\left(1,\\frac{f(b)}{f(a)}\\right) \\; , \\]<\/p>\n\n\n\n<p>which is the form used in the algorithm listing above.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Last month\u2019s posting introduced the Metropolis-Hastings (MH) method for generating a sample distribution from just about any functional form.&nbsp; This method can in handy particularly for methods that are analytically&#8230; <a class=\"read-more-button\" href=\"https:\/\/aristotle2digital.blogwyrm.com\/?p=1476\">Read more &gt;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-1476","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/posts\/1476","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1476"}],"version-history":[{"count":0,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/posts\/1476\/revisions"}],"wp:attachment":[{"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1476"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1476"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1476"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}