{"id":1482,"date":"2022-07-29T23:30:40","date_gmt":"2022-07-30T03:30:40","guid":{"rendered":"http:\/\/aristotle2digital.blogwyrm.com\/?p=1482"},"modified":"2022-07-04T06:31:56","modified_gmt":"2022-07-04T10:31:56","slug":"multivariate-sampling-and-covariance","status":"publish","type":"post","link":"https:\/\/aristotle2digital.blogwyrm.com\/?p=1482","title":{"rendered":"Multivariate Sampling and Covariance"},"content":{"rendered":"\n<p>Our journey through various Monte Carlo techniques has been rich and rewarding but has been confined almost exclusively to one-dimensional sampling.&nbsp; Even the case of the sunny\/rainy-day multi-state Markov Chain (<a href=\"http:\/\/aristotle2digital.blogwyrm.com\/?p=1431\">prelude<\/a> and <a href=\"http:\/\/aristotle2digital.blogwyrm.com\/?p=1455\">main act<\/a>) was merely a precursor to the setup and analysis of the Metropolis-Hastings algorithm that was then applied in one dimension.&nbsp; This month, we step off the real number line and explore the plane to understand a bit of how multivariate sampling works and the meaning of covariance.<\/p>\n\n\n\n<p>To start, let\u2019s imagine that we are measuring the height and weight of some mythical animal, say the xandu of C.S. 
Friedman\u2019s <em><a href=\"https:\/\/en.wikipedia.org\/wiki\/Coldfire_Trilogy\">Coldfire Trilogy<\/a><\/em>.&nbsp; Further suppose that we\u2019ve measured 5000 members of the population and recorded their height and weight.&nbsp; The following figure shows the distribution of these two attributes plus the quantitative statistics of the mean and standard deviation of each.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"857\" height=\"707\" src=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Weight_Height_compare.png\" alt=\"\" class=\"wp-image-1484\" srcset=\"https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Weight_Height_compare.png 857w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Weight_Height_compare-300x247.png 300w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Weight_Height_compare-768x634.png 768w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Weight_Height_compare-810x668.png 810w\" sizes=\"auto, (max-width: 857px) 100vw, 857px\" \/><\/figure><\/div>\n\n\n\n<p>While these statistics give us a good description of the distribution of the weight or the height of the xandu population, they don\u2019t say anything about the connection between these two attributes.&nbsp; Generally, we would expect that the larger an animal is, the heavier that animal will be, all other things being equal.&nbsp; To better understand the relationship between these two population characteristics, we first make a scatter plot with their weight arbitrarily displayed along the x-axis ($X$ will hereafter be a stand-in symbol for weight) and their height along the y-axis ($Y$ for height).<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"857\" height=\"769\" 
src=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Weight_Height_scatter.png\" alt=\"\" class=\"wp-image-1489\" srcset=\"https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Weight_Height_scatter.png 857w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Weight_Height_scatter-300x269.png 300w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Weight_Height_scatter-768x689.png 768w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Weight_Height_scatter-810x727.png 810w\" sizes=\"auto, (max-width: 857px) 100vw, 857px\" \/><\/figure><\/div>\n\n\n\n<p>It is obvious that the two attributes are correlated.&nbsp; To make that observation quantitative we first define the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Covariance_matrix\">covariance matrix<\/a> between the two series.&nbsp; Let\u2019s remind ourselves that the variance of a set of $N$ values of the random variable $X$, denoted by ${\\mathbf X} = \\left\\{ x_1, x_2, \\ldots , x_N\\right\\}$, is defined as<\/p>\n\n\n\n<p>\\[ {\\sigma_X}^2 = \\frac{1}{N} \\sum_{i=1}^N \\left( x_i - {\\bar x} \\right)^2 \\; , \\]<\/p>\n\n\n\n<p>where ${\\bar x}$ is the mean value given by<\/p>\n\n\n\n<p>\\[ {\\bar x} = \\frac{1}{N} \\sum_{i=1}^N x_i \\; . \\]<\/p>\n\n\n\n<p>We can express these relations more compactly by defining the common notation of an expectation value of $Q$ as<\/p>\n\n\n\n<p>\\[ E[Q] = \\frac{1}{N} \\sum_{i=1}^N q_i \\; , \\]<\/p>\n\n\n\n<p>where $Q$ is some arbitrary set of $N$ values denoted as $q_i$.&nbsp; With this definition, the variance becomes<\/p>\n\n\n\n<p>\\[ {\\sigma_X}^2 = E \\left[ (X - {\\bar x})^2 \\right] \\; . 
\\]<\/p>\n\n\n\n<p>The covariance is then the amount the set $X$ moves with (or covaries with) the set $Y$, defined by<\/p>\n\n\n\n<p>\\[ Cov(X,Y) = E \\left[ (X - {\\bar x}) (Y - {\\bar y} ) \\right] \\; .\\]<\/p>\n\n\n\n<p>Note that the usual variance is now expressed as<\/p>\n\n\n\n<p>\\[ {\\sigma_X}^2 = Cov(X,X) = E \\left[ (X - {\\bar x}) (X - {\\bar x}) \\right] = E \\left[ (X - {\\bar x})^2 \\right] \\; , \\]<\/p>\n\n\n\n<p>which is the same as before.<\/p>\n\n\n\n<p>There are four combinations: $Cov(X,X)$, $Cov(X,Y)$, $Cov(Y,X)$, and $Cov(Y,Y)$.&nbsp; Because the covariance is symmetric in its arguments, $Cov(Y,X) = Cov(X,Y)$, so only three of them are distinct.&nbsp; For reasons that will become clearer below, it is convenient to display these four values in the covariance matrix<\/p>\n\n\n\n<p>\\[ {\\mathcal P}_{XY} = \\left[ \\begin{array}{cc} Cov(X,X) &amp; Cov(X,Y) \\\\ Cov(Y,X) &amp; Cov(Y,Y) \\end{array} \\right] \\; , \\]<\/p>\n\n\n\n<p>where it is understood that in real computations the two off-diagonal terms are equal.<\/p>\n\n\n\n<p>The covariance matrix computed for the data above is<\/p>\n\n\n\n<p>\\[ {\\mathcal P}_{XY} = \\left[ \\begin{array}{cc} 19.22857412 &amp; 17.51559386 \\\\ 17.51559386 &amp; 39.54157185 \\end{array} \\right ] \\; . 
\\]<\/p>\n\n\n\n<p>The large off-diagonal components, relative to the size of the on-diagonal components, reflect the correlation between weight and height of the xandu.&nbsp; The actual correlation coefficient is available from the correlation matrix ${\\bar {\\mathcal P}}_{XY}$, obtained by dividing each covariance by the standard deviations of its two arguments<\/p>\n\n\n\n<p>\\[ {\\bar {\\mathcal P}}_{XY} = \\left[ \\begin{array}{cc} Cov(X,X)\/{\\sigma_X}^2 &amp; Cov(X,Y)\/(\\sigma_X \\sigma_Y) \\\\ Cov(Y,X)\/(\\sigma_Y \\sigma_X) &amp; Cov(Y,Y)\/{\\sigma_Y}^2 \\end{array} \\right] \\; .\\]<\/p>\n\n\n\n<p>Substituting the values in yields<\/p>\n\n\n\n<p>\\[ {\\bar {\\mathcal P}}_{XY} = \\left[ \\begin{array}{cc} 1.00000 &amp; 0.63577 \\\\ 0.63577 &amp; 1.00000 \\end{array} \\right ] \\; . \\]<\/p>\n\n\n\n<p>The value of 0.63577 shows that the weight and height are moderately correlated; its square, $0.63577^2 \\approx 0.40$, says that about 40% of the variance of the one is explained by a linear dependence on the other.<\/p>\n\n\n\n<p>With these results in hand, let\u2019s return to why it is convenient to display the individual covariances in matrix form.&nbsp; Doing so allows us to use the power of linear algebra to find the independent variations, $A$ and $B$, of the combined data set.&nbsp; When the weight-height data are experimentally measured, the independent variations are found by diagonalizing the covariance matrix.&nbsp; The resulting eigenvectors indicate the directions of independent variation (i.e., $A$ and $B$ will be expressed as some linear combination of $X$ and $Y$) and the eigenvalues are the corresponding variances ${\\sigma_A}^2$ and ${\\sigma_B}^2$.&nbsp; The eigenvector\/eigenvalue decomposition rarely yields results that are easily interpreted, because the units of the two variables differ, but, nonetheless, one can think of the resulting directions as arising from a rotation in the weight-height space.<\/p>\n\n\n\n<p>Understanding is best achieved by using a contrived example where we start 
with known population parameters for $A$ and $B$ and \u2018work forward\u2019.&nbsp; For this example, $\\mu_A = 45$ and $\\mu_B = 130$ with the corresponding standard deviations of $\\sigma_A = 3$ and $\\sigma_B = 7$, respectively.&nbsp; These data are then constructed so that they differ from $X$ and $Y$ by a 30-degree rotation.&nbsp; In other words, the matrix equation<\/p>\n\n\n\n<p>\\[ \\left[ \\begin{array}{c} X \\\\ Y \\end{array} \\right] = \\left[ \\begin{array}{cc} \\cos \\theta &amp; \\sin \\theta \\\\ -\\sin \\theta &amp; \\cos \\theta \\end{array} \\right] \\left[ \\begin{array}{c} A \\\\ B \\end{array} \\right] \\; \\]<\/p>\n\n\n\n<p>relates the variations in $A$ and $B$ to the variation in $X$ and $Y$, where $\\theta = 30^{\\circ}$.&nbsp; It also relates the mean values by<\/p>\n\n\n\n<p>\\[ \\left[ \\begin{array}{c} \\mu_X \\\\ \\mu_Y \\end{array} \\right] = \\left[ \\begin{array}{cc} \\cos \\theta &amp; \\sin \\theta \\\\ -\\sin \\theta &amp; \\cos \\theta \\end{array} \\right] \\left[ \\begin{array}{c} \\mu_A \\\\ \\mu_B \\end{array} \\right] \\; , \\]<\/p>\n\n\n\n<p>yielding values of $\\mu_X = 103.97114317$ and $\\mu_Y = 90.08330249$.&nbsp; Note that these are exact values and that the means calculated earlier are estimates based on the sampling.<\/p>\n\n\n\n<p>The exact covariance can also be determined from<\/p>\n\n\n\n<p>\\[ {\\mathcal P}_{XY} = M {\\mathcal P}_{AB} M^{T} \\; , \\]<\/p>\n\n\n\n<p>where $M$ is the rotation matrix used above, the diagonal matrix ${\\mathcal P}_{AB} = diag({\\sigma_A}^2,{\\sigma_B}^2)$ reflects the fact that $A$ and $B$ are independent random variables, and the \u2018$T$\u2019 superscript means the transpose.&nbsp; The exact value is<\/p>\n\n\n\n<p>\\[ {\\mathcal P}_{XY} = \\left[ \\begin{array}{cc} 19.0 &amp; 17.32050808 \\\\ 17.32050808 &amp; 39.0 \\end{array} \\right] \\; . 
\\]<\/p>\n\n\n\n<p>Again, note the excellent agreement with what was estimated from the sampling.<\/p>\n\n\n\n<p>These methods generalize to arbitrary dimension straightforwardly.<\/p>\n\n\n\n<p>Finally, all of the sampling here was done using numpy, which conveniently has a multivariate sampling method.&nbsp; The only drawback is that it requires knowledge of the exact means and covariances, so the above method of mapping $A$ and $B$ to $X$ and $Y$ is necessary.&nbsp; It is instructive to see how well the multivariate sampling matches the method used above in both the scatter plot<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"857\" height=\"713\" src=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Numpy_Multivariate_Weight_Height.png\" alt=\"\" class=\"wp-image-1487\" srcset=\"https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Numpy_Multivariate_Weight_Height.png 857w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Numpy_Multivariate_Weight_Height-300x250.png 300w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Numpy_Multivariate_Weight_Height-768x639.png 768w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Numpy_Multivariate_Weight_Height-810x674.png 810w\" sizes=\"auto, (max-width: 857px) 100vw, 857px\" \/><\/figure><\/div>\n\n\n\n<p>and in the individual distributions for weight<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"857\" height=\"622\" src=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Weight_compare.png\" alt=\"\" class=\"wp-image-1486\" srcset=\"https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Weight_compare.png 857w, 
https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Weight_compare-300x218.png 300w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Weight_compare-768x557.png 768w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Weight_compare-810x588.png 810w\" sizes=\"auto, (max-width: 857px) 100vw, 857px\" \/><\/figure><\/div>\n\n\n\n<p>and height<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"857\" height=\"602\" src=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Height_compare.png\" alt=\"\" class=\"wp-image-1485\" srcset=\"https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Height_compare.png 857w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Height_compare-300x211.png 300w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Height_compare-768x539.png 768w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2022\/07\/Height_compare-810x569.png 810w\" sizes=\"auto, (max-width: 857px) 100vw, 857px\" \/><\/figure><\/div>\n","protected":false},"excerpt":{"rendered":"<p>Our journey through various Monte Carlo techniques has been rich and rewarding but has been confined almost exclusively to one-dimensional sampling.&nbsp; Even the case of the sunny\/rainy-day multi-state Markov Chain&#8230; <a class=\"read-more-button\" href=\"https:\/\/aristotle2digital.blogwyrm.com\/?p=1482\">Read more 
&gt;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-1482","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/posts\/1482","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1482"}],"version-history":[{"count":5,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/posts\/1482\/revisions"}],"predecessor-version":[{"id":1493,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/posts\/1482\/revisions\/1493"}],"wp:attachment":[{"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1482"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1482"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1482"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}