{"id":601,"date":"2017-02-24T23:30:58","date_gmt":"2017-02-25T04:30:58","guid":{"rendered":"http:\/\/aristotle2digital.blogwyrm.com\/?p=601"},"modified":"2018-05-22T11:32:25","modified_gmt":"2018-05-22T15:32:25","slug":"knowing-when-to-stop","status":"publish","type":"post","link":"https:\/\/aristotle2digital.blogwyrm.com\/?p=601","title":{"rendered":"Knowing When to Stop"},"content":{"rendered":"<p>Imagine that it&#8217;s Christmas Eve and, due to some poor planning on your part, you find yourself short of a few gifts \u2013 gifts for key people in your life.\u00a0 You reckon that you have no choice but to go out to the mall and fight all the other last minute shoppers to find those special trinkets that will bring smiles to all those faces you would rather not look at when they are frowning.\u00a0 You know parking will be a delicate issue with few choices available at any given time and, as you enter the lot, you happen to see a space about one foot shy of a football-field\u2019s distance to the mall entrance.\u00a0 Should you take it or is there a better one closer?<\/p>\n<p>If you take the space, you are in for a long walk to and fro as well as a waste of your time &#8211; and maybe, just maybe, the gifts will be gone by the time you get there.\u00a0 If you pass by the space you run the risk of not finding a closer space and, most likely, this space will not be there when you circle back.<\/p>\n<p>In a nutshell, this type of problem is best described under the heading \u2018knowing when it is time to settle\u2019.\u00a0 It has broad applications in wide ranging fields; any discipline where decision making is done within the context of uncertainty mixed with a now-or-never flavor falls under this heading.<\/p>\n<p>Within the computing and mathematical communities, this scenario is dubbed <a href=\"https:\/\/en.wikipedia.org\/wiki\/Secretary_problem\">The Secretary Problem<\/a> and has been widely studied.\u00a0 The article <em><a href=\"https:\/\/people.math.gatech.edu\/~hill\/publications\/PAPER%20PDFS\/AmSciKnowingWhenToStop_Online2009.pdf\">Knowing When to Stop<\/a><\/em> by Theodore Hill, published by The American Scientist, presents a nice introduction and discussion of the problem within many of the real world applications.\u00a0 The aim of this month\u2019s column is to look at some realizations of the problem within a computing context, and to look at some variations that lead to some interesting deviations from the common wisdom.\u00a0 The code and approach presented here are strongly influenced by the article <em><a href=\"https:\/\/msdn.microsoft.com\/en-us\/magazine\/mt763238.aspx\">The Secretary Problem<\/a><\/em> by James McCaffrey in the Test Run column of MSDN Magazine.\u00a0 All of the code presented and all of the results were produced in a Jupyter notebook using Python 2.7 and the standard suite of numpy and matplotlib.<\/p>\n<p>The basic notion of the Secretary Problem is that a company is hiring for the position of secretary and they have received a pool of applicants.\u00a0 Since it is expensive to interview and vet applicants and there is a lost opportunity cost for each day the position goes unfilled, the company would like to fill the position as soon as possible.\u00a0 On the other hand, the company doesn\u2019t want to settle for\u00a0a poor candidate if a more suitable one would be found with a bit more searching.\u00a0 And, overall, what expectations should the company have for the qualifications of the secretary; perhaps the market is bad all over.<\/p>\n<p>Within a fairly stringent set of assumptions, there is a way to maximize the probability of selecting the best choice by using the 1\/e stopping rule.\u00a0 To illustrate the method, imagine that 10 applicants seek the position.\u00a0 Divide the applicant pool up into a testing pool and a selection pool, where the size of the testing pool is determined (to within some rounding or truncation scheme) by dividing the total number of applicants by e, the\u00a0base of the natural logarithms. Using truncation, the testing pool has 3 members and the selection pool has 7.<\/p>\n<p><a href=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Problem_pool.jpg\" rel=\"attachment wp-att-608\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-608\" src=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Problem_pool.jpg\" alt=\"Secretary Problem_pool\" width=\"857\" height=\"128\" srcset=\"https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Problem_pool.jpg 857w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Problem_pool-300x45.jpg 300w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Problem_pool-768x115.jpg 768w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Problem_pool-810x121.jpg 810w\" sizes=\"auto, (max-width: 857px) 100vw, 857px\" \/><\/a><\/p>\n<p>The testing pool is interviewed and the applicants assessed and scored.\u00a0 This sampling of the applicant pool serves to survey the entire pool.\u00a0 The highest score from the testing pool sets a threshold that must be met or exceeded (hopefully) by an applicant within the additional population found in the selection pool. \u00a0The first applicant from the selection pool to meet or exceed the threshold is selected; this may or may not be the best overall candidate. Following this approach, and using the additional assumption that each applicant is scored uniquely, the probability is\u00a036.8% chance of getting the best applicant (interestingly, this percentage is also 1\/e).<\/p>\n<p>This decision-making framework has three possible responses:\u00a0 it can find the best applicant, it can settle on a sub-optimal applicant, or it can fail to find any applicant that fits the bill.\u00a0 This later case occurs when all the best applicants are in the Testing Pool and no applicants in the Selection Pool can match or exceed the threshold.<\/p>\n<p>To test the 1\/e rule, I developed code in Python within the Jupyter notebook framework. \u00a0The key function is the one that sets up the initial applicant pool.\u00a0 This function<\/p>\n<div class=\"myQuoteDiv\" style=\"width: 600px;\">\n<pre>def generate_applicants(N,flag='uniform'):\r\n    if flag == 'integer':\r\n        pool = []\r\n        for i in range(0,N):\r\n            pool.append(np.random.randint(10*N))\r\n        return np.array(pool)\r\n    if flag == 'normal':\r\n        temp          = np.abs(np.random.randn(N))\r\n        return np.floor(temp\/np.max(temp)*100.0)\/10.0\r\n    if flag == 'uniform':\r\n        return np.floor(np.random.rand(N)*100.0)\/10.0\r\n    else:\r\n        print \"Didn't understand your specification - using uniform distribution\"\r\n        return np.floor(np.random.rand(N)*100.0)\/10.0\r\n<\/pre>\n<\/div>\n<p>sets the scores of the applicants in one of three ways.\u00a0 The first method, called \u2018integer\u2019, assigns an integer to each applicant based on a uniform probability distribution.\u00a0 The selected range is chosen to be 10 times larger than the number of applicants, effectively guaranteeing that no two applicants have the same score.\u00a0 The second, called \u2018normal\u2019, assigns a score from the normal distribution.\u00a0 This approach also effectively guarantees that no two applicants have the same score.\u00a0 The occasions where both methods violate the assumption of uniqueness form a very small subset of the whole.\u00a0 The third method, called \u2018uniform\u2019, distributes scores uniformly but \u2018quantizes\u2019 the score to a discrete set.\u00a0 This last method is used to test the importance of the assumption of a unique score for each applicant.<\/p>\n<p>A specific applicant pool and the application of the 1\/e rule can be regarded as an individual Monte Carlo trial.\u00a0 Each trial is repeated a large number of times to assemble the statistics for analysis.\u00a0 The statistics comprise the number of times the best applicant is found, the number of times no suitable applicant is found, and the number of times a sub-optimal applicant is found and how far from the optimum said applicant is.\u00a0 This last statistic is called the settle value, since this is what the company has had to settle for.<\/p>\n<p>The following figure shows the percentage of times that each method finds an optimal candidate from the selection pool by using the 1\/e stopping rule.<\/p>\n<p><a href=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Total-Success.png\" rel=\"attachment wp-att-607\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-607\" src=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Total-Success.png\" alt=\"Secretary - Total Success\" width=\"546\" height=\"383\" srcset=\"https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Total-Success.png 546w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Total-Success-300x210.png 300w\" sizes=\"auto, (max-width: 546px) 100vw, 546px\" \/><\/a><\/p>\n<p>Note that for the two methods where duplication is nearly impossible (integer and normal), the percent of total success remains, to within Monte Carlo error, at the theoretically derived value of about 36.8 %.\u00a0 In contrast, the uniform method, which enjoys a quantized scoring system, shoots upwards to a total success rate of 100%.\u00a0 The reason that explains this behavior is that with a quantized scoring system there is only a discrete set of values any applicant can achieve.\u00a0 Once the number of applicants gets great enough, the testing pool perfectly characterizes the whole.\u00a0\u00a0 And while the number of applicants needed to achieve this higher percentage is impractical for finding a secretary (who really wants 640 applicants interviewing for the position) the application to other problems is obvious.\u00a0 There is really no reason that a decision process should always hinge on the difference between two choices of less than a fraction of the overall score.\u00a0 This fact also explains why businesses typically \u2018look\u2019 at the market and pay careful attention to who is hiring whom.<\/p>\n<p>For completeness, the following figures show the analogous behavior for the partial success percentage<\/p>\n<p><a href=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Partial-Success.png\" rel=\"attachment wp-att-606\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-606\" src=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Partial-Success.png\" alt=\"Secretary - Partial Success\" width=\"555\" height=\"383\" srcset=\"https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Partial-Success.png 555w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Partial-Success-300x207.png 300w\" sizes=\"auto, (max-width: 555px) 100vw, 555px\" \/><\/a><\/p>\n<p>and the total failure scenarios<\/p>\n<p><a href=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Failure.png\" rel=\"attachment wp-att-605\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-605\" src=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Failure.png\" alt=\"Secretary - Failure\" width=\"555\" height=\"383\" srcset=\"https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Failure.png 555w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Failure-300x207.png 300w\" sizes=\"auto, (max-width: 555px) 100vw, 555px\" \/><\/a><\/p>\n<p>An interesting corollary is to ask, in the case of partial success, how much short of optimal did the decision process fall in the process of settling on a sub-optimal selection.\u00a0 The following figures shows histograms for 10, 80, and 640 applicants in the applicant pool for those cases where the decision process had to settle for a sub-optimal choice, for the normal and uniform cases, respectively.\u00a0 As expected, there is an improvement in how far from the maximum the decision falls as the testing pool size increases but, even with 640 applicants, the normal process has a significant probability of falling short by 20% or more.<\/p>\n<p><a href=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Settle-Normal.png\" rel=\"attachment wp-att-603\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-603\" src=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Settle-Normal.png\" alt=\"Secretary - Settle Normal\" width=\"1262\" height=\"455\" srcset=\"https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Settle-Normal.png 1262w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Settle-Normal-300x108.png 300w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Settle-Normal-768x277.png 768w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Settle-Normal-1024x369.png 1024w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Settle-Normal-810x292.png 810w\" sizes=\"auto, (max-width: 1262px) 100vw, 1262px\" \/><\/a><\/p>\n<p>In contrast, the distribution for the uniform scoring quickly collapses, so that\u00a0the amount that the settled-upon candidate falls from the optimum is essentially within 5% even with a moderately sized applicant pool.\u00a0 Again, this behavior is due to the quantized scoring, which more accurately reflects real world scenarios.<\/p>\n<p><a href=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Settle-Quantized.png\" rel=\"attachment wp-att-604\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-604\" src=\"http:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Settle-Quantized.png\" alt=\"Secretary - Settle Quantized\" width=\"1262\" height=\"455\" srcset=\"https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Settle-Quantized.png 1262w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Settle-Quantized-300x108.png 300w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Settle-Quantized-768x277.png 768w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Settle-Quantized-1024x369.png 1024w, https:\/\/aristotle2digital.blogwyrm.com\/wp-content\/uploads\/2017\/02\/Secretary-Settle-Quantized-810x292.png 810w\" sizes=\"auto, (max-width: 1262px) 100vw, 1262px\" \/><\/a><\/p>\n<p>At this point, there are two observations\u00a0worth making in brief.\u00a0 First, the core assumption of the original problem, that all applicants can be assigned a unique score, is worth throwing away.\u00a0 Even if its adoption was crucial in deriving the 1\/e stopping rule, real world applications simply do not admit a clear, unambiguous way to assign unique scores.\u00a0 Second, it is, perhaps, astonishing how much richness is hidden in something so mundane as hiring a qualified candidate. Of course, this is to be expected, since good help is hard to find.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Imagine that it&#8217;s Christmas Eve and, due to some poor planning on your part, you find yourself short of a few gifts \u2013 gifts for key people in your life.\u00a0&#8230; <a class=\"read-more-button\" href=\"https:\/\/aristotle2digital.blogwyrm.com\/?p=601\">Read more &gt;<\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-601","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/posts\/601","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=601"}],"version-history":[{"count":0,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=\/wp\/v2\/posts\/601\/revisions"}],"wp:attachment":[{"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=601"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=601"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/aristotle2digital.blogwyrm.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=601"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}