{"id":1394,"date":"2021-12-24T23:30:00","date_gmt":"2021-12-25T04:30:00","guid":{"rendered":"http:\/\/aristotle2digital.blogwyrm.com\/?p=1394"},"modified":"2022-01-25T12:46:52","modified_gmt":"2022-01-25T17:46:52","slug":"monte-carlo-integration-part-3","status":"publish","type":"post","link":"https:\/\/aristotle2digital.blogwyrm.com\/?p=1394","title":{"rendered":"Monte Carlo Integration &#8211; Part 3"},"content":{"rendered":"\n<p>The last post explored the error analysis for Monte Carlo integration in detail.\u00a0 Numerical experiments indicated that the error in the algorithm goes as $1\/\\sqrt{N}$, where $N$ are the number of Monte Carlo trials used to make the estimate.\u00a0 Further analysis of the sampling statistics led to a fundamental formula describing the estimated error of the estimated value $\\langle I \\rangle$ of the integral as<\/p>\n\n\n\n<p>\\[ \\sigma_{\\langle I \\rangle } = \\frac{\\sigma}{\\sqrt{N}} \\; , \\]<\/p>\n\n\n\n<p>where $\\sigma$ is the standard deviation of the random process used to evaluate the integral.&nbsp; In other words, we could report the estimated value of some integral as<\/p>\n\n\n\n<p>\\[ I = \\int_a^{b} \\, dx f(x) \\approx \\frac{1}{N} \\sum_{i} \\frac{f_i(x)}{p_i(x)} \\pm \\frac{\\sigma}{\\sqrt{N}} \u00a0\\; , \\]<\/p>\n\n\n\n<p>where $p(x)$ is some known probability distribution.\u00a0 We report this estimate with the understanding that the last term is an estimated $1\\sigma$ error of the sampling mean, which is expected to \u2018look\u2019 Gaussian as a consequence of the central limit theorem.<\/p>\n\n\n\n<p>While it is attractive to think that an appropriate strategy to improve the estimate of $I$ is to crank up the number of Monte Carlo trials so that the error is as small as desired doing so completely ignores the other \u2018knob\u2019 that can be turned in the error equation, namely the value of $\\sigma$.<\/p>\n\n\n\n<p>At first, it isn\u2019t obvious how to turn this knob and why we should even consider figuring out how.&nbsp; Motivation comes when we reflect on the fact that in the most important and interesting cases the integrand $f(x)$ may be very or extremely computationally expensive to evaluate.&nbsp; In those cases, we might look for a way to make every Monte Carlo trial count as maximally as it can.&nbsp; Doing so may cut down on run times to get a desired accuracy from days to.<\/p>\n\n\n\n<p>Given that motivation, the next question is how.&nbsp; An illustrative example helps.&nbsp; Consider the integral of a constant integrand, say $f(x)=2$ over the range $0$ to $1$ sampled with a uniform probability distribution.&nbsp; The estimate of the integral becomes<\/p>\n\n\n\n<p>\\[ I = \\int_0^{1} \\, dx 2 \\approx \\frac{1}{N} \\sum_{i} 2 = \\frac{2N}{N} = 2 \\; , \\]<\/p>\n\n\n\n<p>regardless of the number of trials.&nbsp; In addition since each trial returns the same value, the sample standard deviation is $\\sigma = 0$.&nbsp; The estimate is exact and independent of the number of trials.&nbsp;<\/p>\n\n\n\n<p>Our strategy for the general $f(x)$ is quite clear; we want to pick the probability measure such that the ratio $f(x)\/p(x) = \\textrm{constant}$.&nbsp; This strategy is termed importance sampling because the samples are preferentially placed in regions where $f(x)$ is bigger. This isn\u2019t always possible but let\u2019s examine another case where we can actually do this.&nbsp;<\/p>\n\n\n\n<p>Let\u2019s estimate the integral<\/p>\n\n\n\n<p>\\[ I = \\int_{0}^{1} \\, dx \\left(1 &#8211; \\frac{x}{2} \\right) &nbsp;= \\frac{3}{4} \\; .\\]<\/p>\n\n\n\n<p>The integrand is linear across the range taking on the values of $1$ when $x=0$ and $1\/2$ for $x=1$.&nbsp; We want our probability distribution also be linear over this range but to be normalized.&nbsp; A reasonable choice is to simply normalize the integrand directly, giving<\/p>\n\n\n\n<p>\\[ p(x) = \\frac{4}{3}\\left( 1 &#8211; \\frac{x}{2} \\right) \\; .\\]<\/p>\n\n\n\n<p>The ratio $f(x)\/p(x) = \\frac{3}{4}$ and our expectation is that the Monte Carlo integration should give the exact result.&nbsp;<\/p>\n\n\n\n<p>The only detail missing is how to produce a set of numbers taken from that probability distribution.&nbsp; To answer this question let\u2019s follow the discussion in Chapter 8 of Koonin\u2019s <em>Computational Physics<\/em> and rewrite the original integral in a suggestive way<\/p>\n\n\n\n<p>\\[ \\int_0^{1} \\, dx f(x) = \\int_0^{1} \\, dx \\, p(x) \\frac{f(x)}{p(x)} \\; . \\]<\/p>\n\n\n\n<p>Now, define the cumulative distribution as<\/p>\n\n\n\n<p>\\[ y(x) = \\int_0^x \\, dx\u2019 p(x\u2019) \\; \\]<\/p>\n\n\n\n<p>and make a change of variables from $x$ to $y$.&nbsp; Using Leibniz\u2019s rule,<\/p>\n\n\n\n<p>\\[ \\frac{dy}{dx} = p(x) \\; , \\]<\/p>\n\n\n\n<p>which gives $dx = dy\/p(x)$.&nbsp; Turning to the limits of the integral, we find $y(x=0) = 0$ and $y(x=1) = 1$ and the new form of the integral is<\/p>\n\n\n\n<p>\\[ \\int_0^{1} \\, dy \\frac{f(y)}{p(y)} \\; .\\]<\/p>\n\n\n\n<p>We can now tackle this integral by sampling $y$ uniformly over the interval $[0,1]$ as if the integral were originally written this way. &nbsp;&nbsp;<\/p>\n\n\n\n<p>Since $y$ is uniformly distributed, if an anti-derivative exist for $p(x)$ and we can invert the result to express $x=x(y)$ we will automatically get the desired distribution for random realizations of $p(x)$.&nbsp; For the case at hand, $p(x) = \\frac{4}{3}\\left(1 &#8211; \\frac{x}{2}\\right)$.&nbsp; The expression for $y$ is then<\/p>\n\n\n\n<p>\\[ y = \\frac{4}{3} \\left(x &#8211; \\frac{x^2}{4} \\right) \\; .\\]<\/p>\n\n\n\n<p>This quadratic equation can be solved for $x$ to give<\/p>\n\n\n\n<p>\\[ x = 2- (4- 3y)^{1\/2} \\; .\\]<\/p>\n\n\n\n<p>The following figure shows the distribution of $x$ as a random variable given a 100,000 uniformly distributed random numbers representing $y$.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"640\" height=\"480\" src=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2021\/12\/Linear_distribution.png\" alt=\"\" class=\"wp-image-1393\" srcset=\"https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2021\/12\/Linear_distribution.png 640w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2021\/12\/Linear_distribution-300x225.png 300w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/figure>\n\n\n\n<p>All of this machinery is easily implemented in python to provide a side-by-side comparison between directly sampling the original integral using both a uniform distribution and the importance sampling using the linear distribution.<\/p>\n\n\n\n<div class=\"myQuoteDiv\">\n<pre class=\"wp-block-preformatted\">\nimport matplotlib as mpl\nimport numpy&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; as np\nimport pandas&nbsp;&nbsp;&nbsp;&nbsp; as pd\nimport scipy&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; as sp\nimport matplotlib.pyplot as plt\nNs&nbsp;&nbsp; = [10,20,50,100,200,500,1_000,2_000,5_000,10_000,100_000]\nxmin = 0\nxmax = 1\ndf&nbsp;&nbsp; = pd.DataFrame()\ndef f(x): return (1.0 - x\/2.0)\ndef w(x): return (4.0-2.0*x)\/3.0\ndef x(x): return 2.0 - np.sqrt(4.0 - 3.0*x)\nfor N in Ns:\n&nbsp;&nbsp;&nbsp; y&nbsp;&nbsp; = np.random.random((N,))\n&nbsp;&nbsp;&nbsp; Fu&nbsp; = f(y)\n&nbsp;&nbsp;&nbsp; X&nbsp;&nbsp; = x(y)\n&nbsp;&nbsp;&nbsp; Wx&nbsp; = w(X)\n&nbsp;&nbsp;&nbsp; Fx&nbsp; = f(X)\n&nbsp;&nbsp;&nbsp; row = {'N':N,'I_U':np.mean(Fu),'sig_I_U':np.std(Fu)\/np.sqrt(N),\\\n&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 'I_K':np.mean(Fx\/Wx),'sig_I_K':np.std(Fx\/Wx)\/np.sqrt(N)}\n&nbsp;&nbsp;&nbsp; df = df.append(row,ignore_index=True)\nprint(df)<\/pre>\n<\/div>\n\n\n\n<p>The following table shows the results from this experiment (i.e. the contents of the pandas dataframe).<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td>$N$<\/td><td>$I_U$<\/td><td>$\\sigma_{I_U}$<\/td><td>$I_L$<\/td><td>$\\sigma_{I_L}$<\/td><\/tr><tr><td>10<\/td><td>0.811536<\/td><td>0.052912<\/td><td>0.75<\/td><td>0.0<\/td><\/tr><tr><td>20<\/td><td>0.787659<\/td><td>0.033866<\/td><td>0.75<\/td><td>0.0<\/td><\/tr><tr><td>50<\/td><td>0.768951<\/td><td>0.020441<\/td><td>0.75<\/td><td>0.0<\/td><\/tr><tr><td>100<\/td><td>0.755852<\/td><td>0.014472<\/td><td>0.75<\/td><td>0.0<\/td><\/tr><tr><td>200<\/td><td>0.752533<\/td><td>0.010497<\/td><td>0.75<\/td><td>0.0<\/td><\/tr><tr><td>500<\/td><td>0.749750<\/td><td>0.006580<\/td><td>0.75<\/td><td>0.0<\/td><\/tr><tr><td>1000<\/td><td>0.749777<\/td><td>0.004602<\/td><td>0.75<\/td><td>0.0<\/td><\/tr><tr><td>2000<\/td><td>0.745935<\/td><td>0.003184<\/td><td>0.75<\/td><td>0.0<\/td><\/tr><tr><td>5000<\/td><td>0.748036<\/td><td>0.002035<\/td><td>0.75<\/td><td>0.0<\/td><\/tr><tr><td>10000<\/td><td>0.751155<\/td><td>0.001438<\/td><td>0.75<\/td><td>0.0<\/td><\/tr><tr><td>100000<\/td><td>0.749597<\/td><td>0.000457<\/td><td>0.75<\/td><td>0.0<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>It is clear that the estimate of the integral $I_U$ slowly converges to the exact value of $3\/4$ but that by matching the importance sampling weight $p(x)$ such that $f(x)\/p(x)$ is a constant, we get an exact value $I_L$ of the integral with no uncertainty.<\/p>\n\n\n\n<p>Of course, this illustrative example is not meant to be considered a triumph since we had to know the exact value integral to determine $p(x)$ in the first place.&nbsp; But it does point to applications where we either don\u2019t or can\u2019t evaluate the integral.&nbsp;<\/p>\n\n\n\n<p>In the next post, we\u2019ll explore some of these applications by looking at more complex integrals and discuss some of the specialized algorithms designed to exploit importance sampling.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The last post explored the error analysis for Monte Carlo integration in detail.\u00a0 Numerical experiments indicated that the error in the algorithm goes as $1\/\\sqrt{N}$, where $N$ are the number&#8230; <a class=\"read-more-button\" href=\"https:\/\/aristotle2digital.blogwyrm.com\/?p=1394\">Read more &gt;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-1394","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/posts\/1394","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1394"}],"version-history":[{"count":0,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/posts\/1394\/revisions"}],"wp:attachment":[{"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1394"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1394"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1394"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}