
Making Rational Decisions

I recently came across an interesting method for putting qualitative and quantitative data on a common footing, providing a mathematically supported framework for making complicated decisions where many criteria are involved.  The method is called the Analytic Hierarchy Process.

The Analytic Hierarchy Process (AHP), invented by Thomas L. Saaty in the 1970s, uses a technique based on matrices and eigenvectors to structure complex decision making when large sets of alternatives and criteria are involved and/or when some of the criteria are described by attributes that cannot be assigned objective rankings.  It is especially useful in group-based decision making since it allows the judgments of disparate stakeholders, often with quite different points-of-view, to be considered in a dispassionate way.

In a nutshell, the AHP consists of three parts: the objective, the criteria, and the alternatives.  Criteria can be sub-divided as finely as desired, with the obvious, concomitant cost of more complexity in the decision-making process.  Each alternative is then assigned a value for each criterion, and each criterion is given a weighting.  The assessments are normalized and matrix methods are used to link the relative values and weightings to give a ranking.  Graphically, these parts are usually presented in a hierarchical chart that looks something like:

[Figure: the AHP hierarchy, with the objective at the top, the criteria in the middle, and the alternatives at the bottom]

A nice tutorial by Haas and Meixner, entitled An Illustrated Guide to the Analytic Hierarchy Process, covers the method, and this posting is patterned closely after their slides.  The decision-making process that they address is buying a car.  This is the objective (‘the what’) that we seek to accomplish.  We will use three criteria when selecting the car to buy:  Style, Reliability, and Fuel Economy.

Two of these criteria, Style and Reliability, are qualitative or, at least, semi-qualitative, whereas the Fuel Economy is quantitative.  Our alternatives/selections for the cars will be AutoFine, BigMotors, CoolCar, and Dynamix.

The first step is to assign numerical labels to the qualitative criteria.  We will use a 1-9 scale for Style and Reliability.  Since we are weighing judgments, the absolute values of these scores are meaningless.  Instead, the labels indicate the relative ranking.  For example, we can assume that the 1-9 scale can be interpreted as:

  • 1 – perfectly equal
  • 3 – moderately more important/moderately better
  • 5 – strongly more important/strongly better
  • 7 – very strongly more important/very strongly better
  • 9 – extremely more important/extremely better

with the even-numbered values shading slightly stronger than the odd values that precede them.  This ranking scheme can be used to assign weightings to the criteria relative to each other (for example, Style is almost strongly more important than Reliability – 4/1) and to weigh the alternatives against each other in a particular criterion (for example, AutoFine is moderately better than CoolCar in Reliability).

To be concrete, let’s suppose our friend Michael is looking to buy a car.  We interview Michael and find that he feels that:

  • Style is half as important as Reliability
  • Style is 3 times more important than Fuel Economy
  • Reliability is 4 times more important than Fuel Economy

Based on these responses, we construct a weighting table

               Style   Reliability   Fuel Economy
Style           1/1        1/2           3/1
Reliability                1/1           4/1
Fuel Economy                             1/1

where the first number in each entry corresponds to the row and the second to the column.  So the ‘4/1’ entry encodes the statement that Reliability is 4 times more important than Fuel Economy.  The omitted entries below the diagonal are simply the reciprocals of the ones above (e.g. 1/2 goes to 2/1).

This table is converted to a weighting matrix $\mathbf{W}$ which numerically looks like

\[ {\mathbf W} = \left[ \begin{array}{ccc} 1.000 & 0.5000 & 3.000 \\ 2.000 & 1.000 & 4.000 \\ 0.3333 & 0.2500 & 1.000 \end{array} \right] \; . \]
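Since only the upper triangle of the table needs to be elicited, the rest of the matrix can be filled in mechanically with the reciprocals.  A minimal Python/NumPy sketch of this step (my own illustration, not code from Haas and Meixner) might look like:

import numpy as np

def reciprocal_matrix(upper, n):
    """Build a full pairwise-comparison matrix from upper-triangle judgments.

    `upper` maps (row, col) pairs with row < col to the elicited values; the
    diagonal is set to 1 and the lower triangle holds the reciprocals."""
    m = np.eye(n)
    for (i, j), value in upper.items():
        m[i, j] = value
        m[j, i] = 1.0 / value
    return m

# Michael's judgments, with the criteria ordered Style, Reliability, Fuel Economy
W = reciprocal_matrix({(0, 1): 1/2, (0, 2): 3.0, (1, 2): 4.0}, n=3)
print(W)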

In a similar fashion, we interview Michael for his judgments of each automobile model against each criterion and find the corresponding weighting matrices for Style and Reliability:

\[ {\mathbf S} = \left[ \begin{array}{cccc} 1.000 & 0.2500 & 4.000 & 0.1667 \\ 4.000 & 1.000 & 4.000 & 0.2500 \\ 0.2500 & 0.2500 & 1.000 & 0.2000 \\ 6.000 & 4.000 & 5.000 & 1.000 \end{array} \right] \; \]

and

\[ {\mathbf R} = \left[ \begin{array}{cccc} 1.000 & 2.000 & 5.000 & 1.000 \\ 0.5000 & 1.000 & 3.000 & 2.000 \\ 0.2000 & 0.3333 & 1.000 & 0.2500 \\ 1.000 & 0.5000 & 4.000 & 1.000 \end{array} \right] \; . \]

Finally, we rank the fuel economy for each alternative.  Here we don’t need to depend on Michael’s judgment and can simply look up the CAFE standards to find

\[ {\mathbf F} = \left[ \begin{array}{c}34\\27\\24\\28 \end{array} \right] \, \mathrm{mpg} \; . \]

Saaty’s method directs us to first find, for each of the $4 \times 4$ criteria matrices and for the $3 \times 3$ weighting matrix, the eigenvector corresponding to its largest eigenvalue.  Note that the Fuel Economy is already in vector form.  The L1 norm is used so that each vector is normalized by the sum of its elements.  The resulting vectors are:

\[ {\mathbf vW} = \left[ \begin{array}{c}0.3196\\0.5584\\0.1220 \end{array} \right] \; , \]

\[ {\mathbf vS} = \left[ \begin{array}{c}0.1163\\0.2473\\0.0599\\0.5764 \end{array} \right] \; , \]

\[ {\mathbf vR} = \left[ \begin{array}{c}0.3786\\0.2901\\0.0742\\0.2571 \end{array} \right] \; , \]

and

\[ {\mathbf vF} = \left[ \begin{array}{c}0.3009\\0.2389\\0.2124\\0.2479 \end{array} \right] \; . \]
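These vectors are easy to reproduce numerically.  The following NumPy sketch (again my own, with the matrices copied from above) should return the quoted eigenvectors to the precision shown:

import numpy as np

def principal_eigenvector(matrix):
    """Return the eigenvector of the largest eigenvalue, normalized to unit L1 norm."""
    eigenvalues, eigenvectors = np.linalg.eig(matrix)
    v = np.real(eigenvectors[:, np.argmax(np.real(eigenvalues))])
    return v / v.sum()

W = np.array([[1.0, 0.5, 3.0], [2.0, 1.0, 4.0], [0.3333, 0.25, 1.0]])
S = np.array([[1.0, 0.25, 4.0, 0.1667], [4.0, 1.0, 4.0, 0.25],
              [0.25, 0.25, 1.0, 0.2], [6.0, 4.0, 5.0, 1.0]])
R = np.array([[1.0, 2.0, 5.0, 1.0], [0.5, 1.0, 3.0, 2.0],
              [0.2, 0.3333, 1.0, 0.25], [1.0, 0.5, 4.0, 1.0]])
F = np.array([34.0, 27.0, 24.0, 28.0])

vW, vS, vR = (principal_eigenvector(M) for M in (W, S, R))
vF = F / F.sum()   # Fuel Economy is already a vector; it only needs normalizing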

A $4 \times 3$ matrix is formed whose columns are ${\mathbf vS}$, ${\mathbf vR}$, and ${\mathbf vF}$; this matrix is then multiplied into ${\mathbf vW}$ to give the final ranking.  Doing this gives:

AutoFine 0.2853
BigMotors 0.2702
CoolCar 0.0865
Dynamix 0.3580
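In code, this last step is a single matrix-vector product.  A short sketch, using the normalized eigenvectors quoted above:

import numpy as np

vW = np.array([0.3196, 0.5584, 0.1220])
vS = np.array([0.1163, 0.2473, 0.0599, 0.5764])
vR = np.array([0.3786, 0.2901, 0.0742, 0.2571])
vF = np.array([0.3009, 0.2389, 0.2124, 0.2479])

criteria = np.column_stack([vS, vR, vF])   # 4 x 3: one column per criterion
ranking = criteria @ vW                    # weight each column by its criterion weight
for car, score in zip(["AutoFine", "BigMotors", "CoolCar", "Dynamix"], ranking):
    print(f"{car:10s} {score:.4f}")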

 

So from the point-of-view of our criteria, Dynamix is the way to go.  Of course, we haven’t figured in cost.  To this end, Haas and Meixner recommend scaling these results by the cost to get a value.  This is done straightforwardly as shown in the following table.

            Ranking   Cost      Normalized Cost   Value = Ranking/Normalized Cost
AutoFine    0.2853    $20,000   0.2899            0.984
BigMotors   0.2702    $15,000   0.2174            1.243
CoolCar     0.0865    $12,000   0.1739            0.497
Dynamix     0.3580    $22,000   0.3188            1.122
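The value column is just the ranking divided by each car’s share of the total cost.  A sketch of that computation:

import numpy as np

ranking = np.array([0.2853, 0.2702, 0.0865, 0.3580])
cost = np.array([20_000, 15_000, 12_000, 22_000])

normalized_cost = cost / cost.sum()        # each car's share of the total cost
value = ranking / normalized_cost
for car, v in zip(["AutoFine", "BigMotors", "CoolCar", "Dynamix"], value):
    print(f"{car:10s} {v:.3f}")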

 

With this new data incorporated, we now decide that BigMotors gives us the best value for the money.  Whether our friend Michael will follow either of these two recommendations is, of course, only answerable by him, but at least the AHP gives us a rational way of weighing the facts.  I suspect that Aristotle would have been pleased.

Technique and Judgment – Distinguishing ‘How’ and ‘What’

I suppose this column grew out of a confluence of things that made me realize yet another limitation of computers in their gallant attempt to unseat the human race in its mastery of the planet.  This limitation, unfortunately, also affects the human being first learning a new skill.  What is this limitation, you may ask – it’s the inability to distinguish technique from judgment.  Fortunately, humans can grow out of it; computers, not so much (that is to say, not at all).

On the face of it, this limitation seems to be a mismatching of concepts bordering on a non sequitur.  After all, technique is how one does a thing, whereas judgment centers on the ability to decide or conclude.  What do the two have to do with each other?

To illustrate, let’s consider the average student in one of the STEM programs at a university.  The student spends large amounts of time in mathematical courses learning the techniques associated with Calculus, Linear Algebra, Differential Equations, Vector Analysis and the like.  A good student earning good grades succeeds at tests with questions of the sort:

Given a vector field $\vec f(x,y,z) = x^2 \hat \imath + \sin(y) \hat \jmath + \frac{1}{3 z^3} \hat k$ compute the divergence $\nabla \cdot \vec f(x,y,z)$

A successful completion of this problem leads to the answer:

$\nabla \cdot \vec f(x,y,z) = 2x + \cos(y) - \frac{1}{z^4}$

demonstrating that the student knows how to compute a divergence.  To be sure, this skill, and the others listed above, are important skills to have and are nothing to sneeze at, but they don’t take one far enough.  Without the judgment of knowing what to do, the technique of how to do it becomes nearly useless.

To illustrate this, our student, having gotten straight A’s in all her subjects, now moves on to an engineering class where she is asked to solve a problem in electricity and magnetism that says something like

Given the following distribution of charges and arrangements of conducting surfaces, compute the electric field.

Suddenly there is no specification of what technique to use, no indication of how to solve the problem.  All that is being asked is a ‘what’ – what is the electric field?  Prudent judgment is needed to figure out how to solve the problem.  And here we find the biggest stumbling block for the human (and a nearly insurmountable obstacle for current computing).

Lawvere and Schanuel, in their book Conceptual Mathematics: a first introduction to categories, summarize this distinction when they note

There will be little discussion about how to do specialized calculations, but much about the analysis that goes into deciding what calculations need to be done, and in what order.  Anyone who has struggled with a genuine problem without having been taught an explicit method knows that this is the hardest part.

– Lawvere and Schanuel

Of course, any technique, sufficiently refined, can be ‘taught’ to a computer.  In the example from Vector Calculus discussed above, any number of computer algebra programs can be used to compute the divergence of the vector field $\vec f(x,y,z)$.  Almost nothing exists in the way of computing judgment to determine the best strategy to solve a ‘what’ type problem.  Even the most sophisticated expert systems fail to compete with competent human judgment unless the application is heavily structured.
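For instance, a few lines of SymPy (one of many computer algebra packages that could be used here) handle the technique part of the exam question above:

import sympy as sp

x, y, z = sp.symbols('x y z')

# the vector field from the exam question above
f = (x**2, sp.sin(y), 1 / (3 * z**3))

# divergence: sum the partial derivative of each component with respect to its variable
divergence = sum(sp.diff(component, variable) for component, variable in zip(f, (x, y, z)))
print(sp.simplify(divergence))   # an expression equivalent to 2*x + cos(y) - 1/z**4

The judgment of knowing that a divergence is the thing to compute is, of course, nowhere in those lines.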

The distinction between the ‘what’ and the ‘how’, between the judgment needed to determine what and when to do a thing and the technique needed to perform the thing, is often complicated and subtle – even for the human.  Much like the intricate interplay between language and thought, the interaction between judgment and technique has no clean lines.  In fact, viewed from a certain perspective, technique can be thought of as the language of doing and judgment as the thought associated with what should be done.  How we do a thing often affects what we think can be done.  That’s what makes it fun being alive!

Thinking and Language Ex Machina

I had an occasion recently to watch the new science fiction movie called Ex Machina.  While the movie was okay, there were some interesting points raised that are worth at least a brief commentary.

The story centers around a genius computer scientist/programmer/inventor (enough… you get the picture) by the name of Nathan.  He’s some sort of strange mix between Elon Musk, Tony Stark, and Sergey Brin, whom he most physically resembles and is clearly patterned after in terms of backstory.  It seems that Nathan has invented an artificially intelligent ‘woman’ by the name of Ava whom he wants to subject to a Turing Test.  However, no simple dialog version of the Turing Test is good enough for Nathan, and he recruits a bright employee named Caleb (no one has a last name in this movie) from his multi-national company to administer a super version of the test.  The problem is that Caleb can’t know the real nature of the test, since Nathan wants to see if Ava can forge a real emotional connection with Caleb.  And thus begins a psychological thriller that starts with Nathan manipulating the interactions between Ava and Caleb and ends with the question of just who is manipulating whom.  Overall, the movie is mostly telegraphed and predictable, ending with the usual but unspoken admonition that there are certain things man is not meant to mess with – certain genies that, once let out of the bottle, are impossible to recapture.  Nonetheless, a few ideas stood out as intriguing.

About a third of the way into the movie, Nathan takes to explaining some of the technical details to an eager Caleb – details on just how Nathan was able to model such a human-looking Ava.  The two major technical challenges discussed were making her speech natural and making her thinking adaptable.  Nathan says he solved the natural language problem by eavesdropping on every cell-phone conversation and using all those data as a reference (perhaps by training a neural net – the fine points weren’t discussed).  On the thinking front, he claims to have watched how web surfers travel through the World Wide Web.

Most of us expect that our journeys in cyberspace are quietly monitored by big-data marketers, salivating to see what we are interested in and thereby serve up those ‘customers who purchased x also purchased y’ prompts.  What is tantalizing about the discussion in Ex Machina is the idea that Nathan wasn’t interested in what the users were viewing on the web but in how they were viewing it.  In other words, the patterns of clicks and views reveal more about how we think than what we think.

The discussion reminded me of a game I am fond of playing.  There is no winning or losing.  All that is required is for each participant to say a single word that comes into their mind based on the word previously uttered.  Obviously, someone starts with an arbitrary word, but from then on it is link by link by link.  Any player is allowed to challenge a word uttered by another player for an explanation of how it links with the previous word.  Most players start reluctantly but soon join in with enthusiasm, and I have found it to be a great way to gain insight into how friends and family think.

The technique that Nathan is describing is like the game described above (it has no formal name) writ large.  Analysis of how people link pages and images together is yet another facet of that ongoing debate over how much language affects thinking versus how much thinking affects language.  Except in this context, the answers are no longer strictly couched in terms of philosophical arguments and mathematical hypotheses.  Suddenly, with this subtle but significant shift in thought, new vistas open where linguistics can truly be tackled in an experimental framework.  The possibilities are staggering.

The Red Box of Ethical Conundrums

A couple of weeks ago I wrote about the detective story as a playground for exploring the philosophical question of double effect.  This week I thought I would discuss an excellent example in this category – The Red Box by Rex Stout.

[Image: cover of The Red Box by Rex Stout]

Unfortunately there is no easy way to discuss the ethical dilemma without spoiling the story so beware.

At the center of the story is the Frost family and the vast Frost fortune.  The Frost family consists of two major branches.  On one side is Llewellyn Frost, a shallow young man who produces flop plays on Broadway, and his father Dudley, who is a babbling blowhard.  Neither has much more than their bluster to their name.  On the other side is Helen Frost, Dudley’s niece and the heiress to Dudley’s brother’s immense fortune, and her mother, Calida Frost, Dudley’s sister-in-law.

Caught in the orbit of the Frosts are a wide variety of people, the two key figures being Boyden McNair and Perren Gebert.  McNair is a successful designer of women’s clothing, and he owns a fashionable boutique in lower Manhattan where Helen works as a fashion model.  Perren Gebert is a useless dandy who is an old friend of Calida Frost and a would-be suitor for Helen.  Unlike McNair, Gebert has no visible means of support.

The story starts with Llewellyn Frost bullying the famous literary detective Nero Wolfe into investigating the death of Molly Lauck, a fashion model in McNair’s boutique.  About a week earlier, between runway calls, Molly, Helen, and a third girl by the name of Thelma Mitchell snuck away to a deserted spot to rest.  Molly produces a box of candy that she ‘swiped’ for them to have a snack.  One poisoned Jordan almond later and Molly is no more.  Llewellyn’s aim is for Wolfe to find Molly’s killer as a way of pulling Helen away from her job at Boyden McNair’s shop, thus opening a door for Llewellyn to rid himself of McNair as a perceived rival for Helen’s affection.  It seems that Llewellyn is so smitten with his cousin that he has researched whether it would be licit for them to marry.

As Wolfe investigates, he finds that the box of poisoned candy was intended for Boyden McNair.  McNair, recognizing his peril, begins to put his affairs in order and even goes so far as to name Wolfe in his will as the executor of his estate.  Shortly afterwards, Boyden McNair visits Wolfe’s office to explain the situation and to ask Wolfe to consent to the arrangement.  Boyden hints that the danger to him is rooted long ago in the past, around the time of the birth of his daughter Glenna, who was born a month before Helen Frost.  McNair explains how his wife died during childbirth and how Glenna was lost to him a few months later.  He goes on to tell Wolfe that, should he die, all the proof Wolfe needs can be found in his red box.  Unfortunately, just before McNair reveals where the box is hidden, he also dies from poison, this time expertly put into a bottle of aspirin that he carried around.

With no other obvious recourse, Wolfe sets his agents to the task of finding McNair’s red box.  The New York police are also eager to find the box as are the Frosts.  Indeed, everyone is so anxious to see what truth lies in the red box, that the box itself begins to take on a life of its own.

Meanwhile, Perren Gebert begins to push hard for the hand of Helen in marriage.  He even goes so far as to try to find the red box for himself, which is quite remarkable considering his layabout status.  He is rewarded for his effort by becoming the third victim of the poisoner.

With three deaths on hand and no red box, Wolfe resorts to a clever plan.  Realizing that the red box has now become an incredibly powerful psychological symbol, he pays an artisan to manufacture a faux red box.  He then gathers the suspects and begins to reveal the truth all the while with the counterfeit red box in plain view.

The truth he reveals is this.  The real Helen Frost died those many years ago.  Being penniless and a recent widower, Boyden McNair agreed to allow Calida Frost to raise Glenna as Helen.  Calida is the one who suggested this arrangement, because she secretly saw it as a way of keeping control of the vast Frost fortune – a fortune that she was disinherited from due to her marital infidelity years earlier.  Gebert, who was witness to the switch, offered his silence in return for a monthly stipend from Calida.  Having identified the motive for the murders, Wolfe reveals Calida as the culprit and convincingly hoodwinks the gathering into believing that all of this is supported by the evidence he found in the red box.

However, Wolfe really has no tangible evidence, as he has never been able to locate the red box.  To keep his gambit alive, he hands the red box to Calida.  Upon opening it, she finds a bottle of oil of bitter almond, a potent poison.  Believing herself to be trapped, she quickly swallows the poison and kills herself.

And here we have the ethical conundrum of double effect:  was Wolfe justified in cornering Calida Frost into committing suicide?

While there is no easy answer to this question, I would argue that he was.  To support this, let’s look at the four conditions that must be met for the application of double effect, starting with the easiest and progressing to the hardest.

The agent must intend good.  This point is the easiest and most supportable of the four.  Wolfe gained in no way, neither financially, personally, nor professionally, by the death of Calida Frost.  She posed no threat to him physically.  Using the fake red box as a prop actually resulted in less growth in his professional reputation than would have occurred if all concerned knew that he had deduced the story merely from the scattered bits of clues that had emerged since Molly Lauck’s death.

The end must be an immediate consequence and not a means to some other end.  This point is also fairly easy to support.  The end was putting a stop to the reign of terror brought on by a murderer.  That end was achieved by the death of the murderer as an immediate consequence of presenting the poison to Calida Frost.

The good must outweigh the bad.  This point is a bit trickier to affirm in Wolfe’s favor.  Clearly taking a murderer off the street, in particular one so cunning and sneaky as to have killed three people by poison, is a good thing.  And considering that the period in which this story is set also had mandatory capital punishment, it seems that this should be straightforward.  But the people of the city of New York were the only proper authority for administering this type of justice.  By circumventing their proper role and becoming, in essence, a vigilante, Wolfe has visited some bad on the system of justice as a whole.  Nonetheless, I believe that the good outweighs the bad in this case.

The action must be good or at least morally neutral.  Here is the most difficult point to justify.  Is offering poison to someone, with the intention that they drink it and thereby die, ever a good thing?  This point clearly dovetails with the one listed just above, but it differs in a subtle way.  In the question about the good outweighing the bad, the central notion was the good and bad done to society.  Here the central question seems to be the intrinsic one about the violence Wolfe may have done to his own soul.  It was a hard choice, and while I understand why Wolfe made the choice he did, I still have a hard time becoming comfortable with this way of stopping a killer.  Nonetheless, I think that, when all things are considered, Wolfe’s actions were supportable by the doctrine of double effect – I just don’t think I could have done it were I in his shoes.

Double Effects and Detectives

I’ve used the detective story as a model for talking about and modeling epistemological questions in earlier posts, but I was recently inspired to use the murder mystery for a different kind of philosophical exploration – the question of double effect.

The principle of double effect, introduced by Thomas Aquinas, defines under what conditions it is permissible to perform an action that does good for some but which results in harmful side effects for others.  Hence the term ‘double’ in the name.

The philosophy surrounding double effect is very much Aristotelian in that there is a kind of virtue to this principle.  Aristotle’s point-of-view is that a virtue is achieved when a being performs just the right amount of the activity characteristic of that being’s existence.  A soldier has the virtue of soldiering when he is neither too timid nor too foolhardy.  Justice then flows from virtue in that all pieces in the system work harmoniously, each being in just the right place it needs to be and performing exactly the way it should perform.

Double effect dovetails with the notion of virtue since it seeks to balance the good an action may perform with the bad that may also result.  A popular example of the double effect framework is the classic ethical conundrum about the passengers on a railcar.  While there are many variations that differ in minor details, they all agree on the central notions.  A runaway railcar is heading to certain doom, spelling inevitable death for the five passengers who are sadly aboard.  An innocent bystander finds himself in the position to save the unhappy 5 by switching the train to a safer track, but doing so will result in the death of a single passenger who is stuck on the other track.  What does our bystander do?

According to the principle of double effect, our bystander can legitimately pull the switch and kill the single guy to save the 5 if the following conditions are met:

  • The action (pulling the switch) has to be good or at least morally neutral
  • The agent (the bystander) must intend to do good (i.e. not just taking advantage of the situation)
  • The end (saving the 5) must be an immediate consequence and not result from the means (killing of the 1)
  • The good (saving the 5) must outweigh the bad (killing the 1)

Of course, the application of these rules to various situations can be quite tricky and their application is hotly debated by philosophers, usually by the construction of hypothetical situations where the agent is placed in different and complicated situations.

And here we come to one of the many uses of the detective story – the construction of realistic and compelling narratives that allow us to explore the possibilities in a way that mere academic constructions lack.  Some of the most interesting questions about justice and double effect come to light in these ‘enjoyable hypotheticals’.  Should the detective bring the criminal to justice when the crime has a moral underpinning (e.g. Jean Valjean in Victor Hugo’s Les Misérables)?  Or perhaps the moral quandary is whether the detective should shoot a criminal he knows to be guilty of a heinous crime to prevent ‘some lawyer from getting him off on a technicality’ (Captain Dudley Smith to Officer Edmund Exley in James Ellroy’s L.A. Confidential).

Well, there is plenty of material in the genre to play with, and, from time to time, this column will explore some of the philosophical questions raised.

Representing Time

Time is a curious thing.  John Wheeler is credited with saying that

Time is defined so that motion looks simple

Certainly this is a common, if unacknowledged, sentiment that pervades all of the physical sciences.  The very notion of an equation of motion for a dynamical system rests upon the representation of time as a continuous, infinitely divisible parameter.  Time is considered as either the sole independent variable for ordinary differential equations or as the only ‘time-like’ parameter for partial differential equations.  Regardless, the basic equation of motion for most physical processes takes the form

\[ \frac{d}{dt} \bar S = \bar f(\bar S; t) \]

where the state $\bar S$ can be just about anything imagined and the generalized force $\bar f$ depends on the state and, possibly, the time.  By its very structure, this generic form implies that we think of time as something that can be as finely divided as we wish – how else can any sense be made of the $\frac{d}{dt}$ operator?

Even in the more modern implementations of cellular automata, where the time updates occur at discrete instants, we still think of the computational system as representing a continuous process sampled at evenly spaced times.
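To make that concrete, here is a minimal, purely illustrative forward-Euler sketch of how such an equation is actually advanced on a machine, one finite step at a time (the decay model and step size are hypothetical choices of mine):

import numpy as np

def euler_step(state, force, t, dt):
    """One forward-Euler update of dS/dt = f(S; t) using a finite time step dt."""
    return state + dt * force(state, t)

# hypothetical example: exponential decay, f(S; t) = -S
state = np.array([1.0])
dt = 0.01                      # the step below which time is never subdivided
for i in range(100):
    state = euler_step(state, lambda s, t: -s, i * dt, dt)
print(state)                   # about 0.366; the exact answer exp(-1) is about 0.368

However small dt is made, the machine never sees anything between t and t + dt.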

The very notion of continuous time is inherited from the idea of motion, and here I believe that Wheeler’s aphorism is on target.  The original definition of time is based on the motion of the Earth about its axis, with the location of the Sun in the sky moving continuously as the day winds forward.  As timekeeping evolved, devices like the sundial either abstracted the Sun’s apparent motion into something more easily measured or replaced that motion with something more easily controlled, like a clock.  Thus time, for most of us, takes on the form of the moving hands of the analog clock.

[Image: an analog clock face]

The location of the hands is a continuous function of time, with the position of the tip of the hour and minute (and perhaps second) hands relative to high noon varying something like $\sin(\omega t)$, where the angular frequency $\omega$ is taken to be negative to get the handedness correct.

But as timekeeping has evolved, does this notion continue to be physical?  Specifically, how should we think about the pervasive digital clock

[Image: a digital clock display]

and the underlying concepts of digital timekeeping on a computer?

Originally, many computer systems were designed to inherit this human notion of ‘time as motion’, and time is internally represented in many machines as a double-precision floating-point number.  But does this make sense – either from the philosophical view or the computing view?

Let’s consider the last point first.  Certainly, the force models $\bar f(\bar S;t)$ used in dynamical systems require a continuous time in the calculus, but they clearly cannot get such a time in the finite precision of any computing machine.  At some level, all such models have to settle for a time step just above a certain threshold that is tailored to the specific application.  So the implementation of a continuous time expressed in terms of a floating-point variable should be replaced with one or more integers that count the multiples of this threshold time in a discrete way.

What is meant by one or more integers is best understood in terms of an example.  Astronomical models of the motion of celestial objects are usually expressed in terms of the Julian date and fractions thereof.  Traditional computing approaches would dictate that the time, call it $JD$, would be given by a floating-point number whose integer part is the number of whole days and whose fractional part is the hours, minutes, seconds, milliseconds, and so on, added together appropriately and then divided by 86400 to get the corresponding fraction of a day.  Conceptually, this means that we take a set of integers and then contort them into a single floating-point number.  But this approach not only involves a set of unnecessary mental gymnastics but is actually subject to error in the numerical sense.
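A sketch of the packing (the particular date and time below are hypothetical, chosen only to illustrate the rounding) might look like:

def to_float_jd(days, hours, minutes, seconds, millis):
    """Pack whole days plus a time of day into a single floating-point Julian date."""
    fraction = (hours * 3600 + minutes * 60 + seconds + millis / 1000.0) / 86400.0
    return days + fraction

jd2 = to_float_jd(57000, 12, 34, 56, 788)
jd1 = to_float_jd(57000, 12, 34, 56, 789)
delta_t = 0.001 / 86400.0      # one millisecond expressed as a fraction of a day

# In exact arithmetic jd1 == jd2 + delta_t; in double precision the two sides
# typically differ by a small residual instead of matching bit for bit.
print(jd1 - (jd2 + delta_t))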

Consider the following two modified Julian dates, represented by their integer values for days, hours, minutes, seconds, and milliseconds and by their corresponding floating point representations

[Image: the two modified Julian dates, written out as integer day/hour/minute/second/millisecond fields and as floating-point numbers]

In an arbitrary-precision computation, the relation $JD_1 = JD_2 + \Delta T$ would hold exactly, but a quick scan over the last two digits of the three numbers involved shows that the floating-point representation doesn’t capture the sum exactly.

Of course, this should come as no surprise, since it is an expected limitation of floating-point arithmetic.  The only way to determine whether two times are equal using the floating-point method is to difference the two times in question, take the absolute value of the result, and declare sameness if the value is less than some tolerance.  Critics will be quick to point out that this fuzziness is the cost of fast performance and that this consideration outweighs exactness, but this is really just a tacit admission of the existence of a threshold time below which one does not need to probe.

Arbitrary precision, in the form of a sufficient set of integers (as used above), circumvents this problem but only to a point.  One cannot have an infinite number of integers to capture the smallest conceivable sliver of time. Practically, both memory and performance considerations limit the list of integers in the set to be relatively small.  And so we again have a threshold time below which we cannot represent a change.
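A minimal sketch of such an integer-based representation (again with hypothetical values; a real system would pick its own epoch and resolution) could be:

from dataclasses import dataclass

@dataclass(frozen=True)
class DiscreteTime:
    """Time as whole days past an epoch plus whole milliseconds within the day."""
    days: int
    millis_of_day: int          # 0 .. 86_399_999

    def plus_millis(self, dm):
        total = self.millis_of_day + dm
        return DiscreteTime(self.days + total // 86_400_000, total % 86_400_000)

t2 = DiscreteTime(days=57000, millis_of_day=45_296_788)   # 12:34:56.788
t1 = t2.plus_millis(1)

# Integer arithmetic is exact, so equality needs no tolerance.
print(t1 == DiscreteTime(days=57000, millis_of_day=45_296_789))   # True

The price, as noted above, is that the millisecond (or whatever resolution is chosen) becomes the sliver of time below which nothing can be represented.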

And so we arrive at the contemplation of the first problem.  Is there really any philosophical ground on which we can stand that says that a continuous time is required?  Certainly the calculus requires continuity at the smallest of scales, but is the calculus truth or a tool?  Newton’s laws can only be explored to a fairly limited level before the laws of quantum mechanics become important.  But are the laws of quantum mechanics really laws in continuous time?  Or is Schrödinger’s equation an approximation to the underlying truth?  The answer to these questions, I suppose, is a matter of time.

A Heap of Equivocation

As I write this week’s entry to Aristotle to Digital, I am reflecting on the life and times of Yogi Berra, who just died at the ripe old age of 90.  I fervently hope that he is resting in peace.  In my opinion, he earned it.

In an earlier column, published about a year ago, I wrote about Yogi Berra Logic, as I termed the legendary witticisms of one of the greatest catchers to have ever played the game of baseball.  In tribute to his life and passing, I thought I would revisit that whimsical posting with some more thoughts on what gives Yogisms their timeless attraction and talk a little about some other playful uses of natural language.

Before I go deeply into these points, I would like to correct the record about Yogisms.  Several people have used the word malapropisms to describe the various nuggets of thought that he would utter.  This is an incorrect application of malapropism, which is defined as:

malapropism – the mistaken use of a word in place of a similar-sounding one, often with unintentionally amusing effect, as in, for example, “dance a flamingo” (instead of flamenco).

I’m not saying that Yogi never used a malapropism in his life.  I am saying that most, if not all, of his Yogisms don’t fall into this category.  Rather they fall into the category of equivocal speech.  The meaning of words changes, often extremely quickly, from one part of the Yogism to another and one has to read them with the various contexts that they span in mind.

A host of Yogisms can be found at the Yogi Berra Museum and Learning Center’s list of Yogisms.  To illustrate the point of equivocation in some detail, take the Yogism

The future ain’t what it used to be

– Yogi Berra

On the surface, this expression doesn’t seem to have any meaning and would surely throw any natural language analysis software for a loop in trying to assign one.  And yet, there actually is at least a little meaning as evidenced by the smile, chuckle, chortle, snicker, or belly-laugh that each of us has as a reaction upon reading it.

But surface impressions are rarely more than skin deep (a Yogism of my own, perhaps?) and with a bit of imagination we can easily parse out some meaning and, perhaps, even profound meaning.  I base this expectation on the fact that Yogi Berra was not a stupid man by any measure – his accomplishments alone should testify to that – and that his Yogisms strike a chord in so many people’s minds.

There are, at least, two fairly poignant meanings that can be mined with a fair amount of confidence from the Yogism above.  The first is that the hopes and aspirations for the future that filled his head at a younger age are now replaced with far less hopeful ideas of what the future holds now that he has grown older.  In other words, the next 20 years looked brighter to him when he was younger compared to how he perceives the same 20-year span into the future now that he is an older man.  The second is that when he was younger, say 25 years old, and looking forward to what the world would offer when he was 40, he had huge dreams of what might come true.  Now that he has turned 40, he’s found that ‘the future’ wasn’t as wonderful as he imagined it might be.

Notice the structure of this particular Yogism. It invokes these two ideas compactly and with humor in a way that a plainer and more logical composition that avoided equivocation cannot do.  It’s a masterpiece of natural language if not of pure predicate logic and I think we should be thankful for that.

I don’t know with certainty but I suspect that the next example of natural language gymnastics would have likely captured Yogi’s fancy as well.  It is known as the continuum fallacy.

In the continuum fallacy, natural language is used to allow one to cross a fuzzy line without even knowing one is doing it.  One form of the continuum fallacy (really the sorites paradox, but they are essentially the same thing) reads something like this.

  • We can all agree that 1,000,000 grains of sand can be called a heap
  • We can also agree that if we take 1 grain away from this heap, it’s still a heap
  • Then we can also agree that 999,999 grains of sand can also be called a heap
  • And in continuing in this fashion we can soon arrive at the idea that a heap of sand need not have any sand in it at all.

In the general explanations of why this line of argumentation is a fallacy, analysts cite the vague nature of the definition of a heap (the vagueness of predicates) as the reason.  Certainly it is true that some poorly defined (or undefinable?) line exists where the heap turns into a non-heap.

This ‘paradox’ is not confined to linguistics.  Take the image below.

[Image: a smooth color gradient from red through orange to yellow]

The color gradient from red to yellow is so gradual that it is hard to say that any single color is really different from its neighbors and yet red is not yellow and yellow is not red.  And where does the orange begin and end?

This vagueness seems to be a universal feature that is built into most everything.  And while it may throw linguists, logicians, perceptual psychologists, and computers into a tizzy, I suspect that Berra, the playful king of vagueness, would have had as much fun with this as with uttering Yogisms.

Fallacies, Authority, and Common Sense

Logical fallacies are everywhere.  Just make a search online using the string ‘logical fallacies list’ (e.g., here, here, and here), and you’ll come across many lists citing the many fallacies that an arguer can employ and explaining why they are wrong, bad, or otherwise socially unacceptable.  The authors of such lists argue that it is desirable, when crafting a valid argument, to avoid as many of these fallacies as possible and, when consuming an argument, to be sensitive to their presence.

And yet the number of fallacies in day-to-day discourse never seems to diminish.  So, clearly, people aren’t getting the message.

Of course, not everyone making an argument is really interested in making their argument valid.  Certainly, politicians are interested more in getting votes or passing their particular bills into law than they are ever interested in logic and logical fallacies.  Advertisers also bend the rules of good logic to make their product stand out so that potential customers will select their product over a competitor’s.  So people who fall into these classes reject the message because embracing it would compromise their goals.

But there is another facet worth considering as well.  There is a possibility that people do get the message and simply reject it since they judge that the message itself is flawed.  Is it possible that some people’s common sense allows them, perhaps unconsciously, to see that some arguments about fallacies are themselves fallacious?  Is it possible that some people who argue about avoiding fallacies are engaging in fallacies about fallacies?

Now, before I explain how some arguments about fallacies can be fallacious, I would like to clarify a couple of points.  First, I think the best definition of a fallacy is provided by the Stanford Encyclopedia of Philosophy, which says that a fallacy is a deceptively bad argument: an argument whose conclusion does not follow from the premises being offered and for which it is not manifestly obvious why.  Second, that definition, while the best out there, is still fairly inadequate.  The reason is that, if one can detect the fallacy, how deceptive is it actually?  The point here is that the very concept of a fallacy is a slippery one and, in fact, there is substantial controversy about the nature of fallacies, as can be seen from the long discussion here.

So, for the sake of this post, I am going to argue that a fallacy is a bad argument that is deceptive for people who are not trained in detecting and correcting it.

Some fallacies are relatively easy to detect and fix.   The simplest ones seem to originate in deductive reasoning.  The following example of the fallacy of the undistributed middle comes from syllogistic logic:

All dogs have fur
My cat has fur
Therefore my cat is a dog

These types of errors are easy to see even if they are not easy to explain.  These types of fallacies are benign because they aren’t very deceptive.

A much more common and truly deceptive fallacy comes in the form of equivocation, where the meaning of a term changes mid-argument and, if one isn’t careful, one misses it and becomes either confused or, worse, convinced of an invalid conclusion.

When the argument is simple, equivocation is fairly easy to find, as in this example:

The end of life is death.
Happiness is the end of life.
So, death is happiness.

Clearly the word ‘end’ in the first line means termination or cessation, whereas the word ‘end’ in the second means goal or purpose.  When the argument is much longer or involves an emotional subject, it is much harder to detect equivocation.  As an example on that front, I once read a blog post (unfortunately I can’t source it anymore) where the author was celebrating a story in which an Amish man boarded a bus and challenged the people onboard about television.  As the story goes, the Amish man asked how many of the passengers had a TV, and every hand went up.  He then asked how many of them thought TV was bad, and almost every hand went up as well.  He then asked why, if they thought it was bad, they tolerated a TV in their homes.  The blogger obviously didn’t notice or care that the definition of TV had changed from the first question, where it meant the device, to the second question, where it meant the programming.  All that mattered was the emotional delivery.

Perhaps the trickiest kind of fallacy concerns appeals to authority.  And it is in this case that we find the fertile ground where the fallacy of fallacies grows.

An appeal to authority can actually be a reasonable thing to do when dealing with custom, or policy, or doctrine.  As long as the authority is proper, the appeal can be a solid piece in an argument.  When the appeal is to the authority of the public or to someone whose motives are questionable, then the appeal to authority becomes a fallacy.

That said, an appeal to authority is never valid when it comes to science.  Nonetheless, it is a commonplace appeal offered by those who talk about ‘settled science’.  They tell us that a scientific conclusion is valid based solely on the idea that ‘X percent of the scientists in the world agree on proposition Y’.  They also tell us that anyone who objects is necessarily engaged in a logical fallacy by either ignoring a proper appeal to authority (the X percent of scientists who believe proposition Y) or by making an incorrect appeal to authority (the 100 – X percent of scientists who reject proposition Y).

To my way of thinking, as a physicist, this type of argument goes against common sense and is just wrong.  Consider the case in physics at the turn of the 20th century.  A majority of scientists felt that mankind had basically all the rules in place.  The science of mechanics was well understood in terms of Newton and his 3 laws, and the sciences of electricity, magnetism, and optics had just been united by Maxwell.  Sure, there was this pesky little problem with the ultraviolet catastrophe, but the majority of scientists were willing to ignore it or believe that a small tweak was all that was needed to fix things.  Of course, that ‘small tweak’ ushered in the science of quantum mechanics that forever changed the way we think about science and philosophy.

Now a careful reader may argue that I indulged in a logical fallacy of my own about the majority of scientists when I pronounced that they were willing to ignore the problem or believe that all that was needed was a small tweak.  After all, was I there to interview each and every one of them?  But that assessment is backed up by an overwhelming amount of evidence showing that Planck’s advance came as a surprise to the physics community.

So what to make of those ‘settled science’ folk? Well they seem to want to ignore the logic underpinning the scientific method by appealing to authority as if scientific conclusions are immutable as long as they are based on a kind of popularity.  They also use the form and structure developed to explain logical fallacies as an additional appeal to authority (in this case to the community of logicians rather than scientists) to dismiss anyone who believes the contrary to their doctrine as being illogical.   And here they commit a two-fold error.  By failing to recognize that there is no certainty encompassing either science or logic in their entirety, these individuals use the machinery of avoiding fallacies as a logical fallacy itself. They look on those who support the doctrine as pure in motive and look upon those who reject it as either corrupt or unqualified and stupid.  They heap on layer after layer of emotionalism while telling their critics that they are mired in emotional thinking.

Fortunately, it seems, the human mind has a built-in safety valve in the form of common sense that allows us to reject these fallacies of fallacies even if we don’t know why we do it.  I suppose, intrinsic to the human condition, is a natural skepticism about just how far logic can take us.  After all, it is a tool, not a god, and we should treat it as such.

The Power of Imagination

A couple of weeks ago, I wrote about the subtle difficulties surrounding the mathematics and programming of vectors.  The representation of a generic vector by a column array made the situation particularly confusing, as one type of vector was being used to represent another type.  The central idea in that post was that the representation of an object can be very seductive; it can cloud how you think about the object, or use it, or program it.

Well, this idea about representations has, itself, proven to be seductive, and has led me to think about the human capacity that allows imagination to imbue representations of things with a life of their own.

To set the stage for this exploration, consider the well-known painting entitled The Treachery of Images by the Belgian painter Magritte.

[Image: Magritte’s painting The Treachery of Images]

The translation of the text in French at the bottom of the painting reads “This is not a pipe.”  Magritte’s point is that the image of the pipe is a representation of the idea of a pipe but is not a pipe itself; hence his choice of the word ‘treachery’ in the title of his painting.

Of course, this is exactly the point I was making in my earlier post, but a complication in my thinking arose that sheds a great deal of light on the human condition and has implications for true machine sentience.

I was reading Scott McCloud’s book Understanding Comics when I came to a section on what makes sequential art so compelling.  In that section, McCloud talks about the inherent mystery that allows a human, virtually any human old enough to read, to imagine many things while reading a comic.  Some of the things that the reader imagines include:

  • Action takes place in the gutters between the panels
  • Written dialog is actually being spoken
  • Strokes of pencil and pen and color are actually things

You, dear reader, are also engaging in this kind of imagining.  The words you are reading – words that I once typed – are not even pen and pencil strokes on a page.  The whole concept of page and stroke is, of course, virtual: tracings of different flows of current and voltage humming through micro-circuitry in your computer.

Not only is that painting of Magritte’s shown above not a pipe, it’s not a painting.  It is simply a play of electronic signals on a computer monitor and a physiological response in the eye.  And yet, how is it that it is so compelling?

What is the innate capacity of the human mind and the human soul to be moved by arrangements of ink on a page, by the juxtaposition of glyphs next to each other, by movement of light and color on a movie screen, by the modulated frequencies that come out of speakers and headphones?  In other words, what is the human capacity that breathes life into the signs and signals that surrounds us?

Surely someone will rejoin “it’s a by-product of evolution” or “it’s just the way we are made”.  But these types of responses, as reasonable as they may be, do nothing to address the root faculty of imagination.  They do nothing to address the creativity and the connectivity of the human mind.

As a whimsical example, consider this take on Magritte’s famous painting, inspired by the world of videogames.

[Image: a parody of Magritte’s painting featuring a Mario Brothers pipe]

Humans have that amazing ability to connect to different ideas by some tenuous association to find a marvelous (or at least funny) new thing.  The connections that lead from the ‘pipe’ you smoke to the virtual ‘pipe’ in Mario Brothers are obvious to anyone who has been exposed to both of them in context.  And yet, how do you explain them to someone who hasn’t?  Even more interesting:  how do you enable a machine to make the same connection, to find the imagery funny?  In short, how can we come to understand imagination and, perhaps, imitate it?

Maybe we really don’t want machines that actually emulate human creativity, but we won’t know or understand the limitations of machine intelligence without more fully exploring our own.  And surely one vital component of human intelligence is the ability to flow through the treachery of images into the power of imagination.

Balance and Duality

There is a commonly used device in literature in which big, important events start small.  I don’t know if that’s true.  I don’t know if small things are heralds of momentous things, but I do know that I received a fairly big shock from a small, almost ignorable footnote in a book.

I was reading through Theory and Problems in Logic, by John Nolt and Dennis Rohatyn, when I discovered the deadly aside.  But before I explain what surprised me so, let me say a few words about the work itself.  This book, for those who don’t know, is a Schaum’s Outline.  Despite that, it is actually a well-constructed outline of logic.  The explanations and examples are quite useful and the material is quite comprehensive.  I think that the study of logic lends itself quite nicely to the whole Schaum’s approach, since examples seem to be the heart of learning logic, and the central place where logicians tangle is over some controversial argument or curious sentence like ‘this sentence is false’.

As I was skimming Nolt and Rohatyn’s discussion about how to evaluate arguments, I came across this simple exercise:

Is the argument below deductive?

Tommy T. reads The Wall Street Journal
$\therefore$ Tommy T. is over 3 months old.

– Nolt and Rohatyn, Theory and Problems in Logic

Their answer (which is the correct one) is that the argument above is not deductive.  At the heart of their explanation for why it isn’t deductive is the fact that while it is highly unlikely that anyone 3 months old or younger could read The Wall Street Journal, nothing completely rules it out.  Since the concept of probability enters into the argument, it cannot be deductive.

So far so good.  Of course, this is an elementary argument so I didn’t expect any surprises.

Nolt and Rohatyn go on to say that this example can be made deductive by the inclusion of an additional premise.  This is the standard fig-leaf of logicians, mathematicians, and, to a lesser extent, scientists the world over.  If at first your argument doesn’t succeed, redefine success by axiomatically ruling out all the stuff you don’t like.  Not that that approach is necessarily bad; it is a standard way of making problems more manageable, but it usually causes confusion in those not schooled in the art.

For their particular logical legerdemain, they amend the argument to read

All readers of The Wall Street Journal are over 3 months old.
Tommy T. reads The Wall Street Journal
$\therefore$ Tommy T. is over 3 months old.

– Nolt and Rohatyn, Theory and Problems in Logic

This argument is now deductive because they refuse to allow the possibility (no matter how low in probability) that those amongst us who are 3 months old or younger can read The Wall Street Journal.  By simple pronouncement, they elevate to metaphysical certitude the idea that such youngsters can’t.

Again there are really no surprises here and this technique is a time honored one.  It works pretty well when groping one’s way through a physical theory where one may make a pronouncement that nature forbids or allows such and such, and then one looks for the logical consequences of such a pronouncement.  But a caveat is in order.  This approach is most applicable when a few variables have been identified and/or isolated as being the major cause of the phenomenon that is being studied.  Thus it works better the simpler the system under examination is.  It is more applicable to the study of the electron than it is to the study of a molecule.  It is more applicable to the study of the molecule than to an ensemble of molecules and so on.  By the time we are attempting to apply it to really complex systems (like a 3-month old) its applicability is seriously in doubt.

Imagine, then, my surprise at the innocent little footnote associated with this exercise, which reads

There is, in fact, a school of thought known as deductivism which holds that all of what we are here calling “inductive arguments” are mere fragments which must be “completed” in this way before analysis, so there are no genuine inductive arguments

– Nolt and Rohatyn, Theory and Problems in Logic

Note the language used by the pair of logicians.  Not that the deductivism school of thought wants to minimize the use of inductive arguments or maximize the use of deductive ones.  Not that its adherents want to limit the abuses that occur in inductive arguments.  Nothing so cautious as that.  Rather the blanket statement that “there are no genuine inductive arguments.”

A few minutes of exploring on the internet led me to a slightly deeper understanding of the school of deductivism, but only marginally so.  What could be meant by no genuine inductive arguments?  A bit more searching led me to some arguments due to Karl Popper (see the earlier column on Black Swan Science).

These arguments, as excerpted from Popper’s The Logic of Scientific Discovery, roughly summarized, center on his uneasiness with inductive methods as applied to the empirical sciences.  In his view, an inference is called inductive if it proceeds from singular statements to universal statements.  As his example, we again see the black-swan/white-swan discussion gliding to the front.  His concern is for the ‘problem of induction’ defined as

[t]he question whether inductive inferences are justified, or under what conditions…

-Karl Popper, The Logic of Scientific Discovery

Under his analysis, Popper finds that any ‘principle of induction’ that would solve the problem of induction is doomed to failure, since it would necessarily be a synthetic statement rather than an analytic one and so would itself need to be justified from experience.  From this observation, one would then need a ‘meta principle of induction’ to justify the principle of induction, a ‘meta-meta principle of induction’ to justify that one, and so on, into an infinite regress.

Having established this initial work, Popper jumps into his argument for deductivism with the very definite statement

My own view is that the various difficulties of inductive logic here sketched are insurmountable.

-Karl Popper, The Logic of Scientific Discovery

And off he goes. By the end, he has constructed an argument that banishes inductive logic from the scientific landscape, using what, in my opinion, amounts to a massive redefinition of terms.

I’ll not try to present any more of his argument.  The interested reader can follow the link above and read the excerpt in its entirety.  I would like to try to ask a related but, in my view, more human question.  To what end is all this work leading?  I recognize that it is important to understand how well a scientific theory is supported.  It is also important to understand the limits of knowledge and logic.  But surely, human understanding and knowledge are not limited by our scientific theories, nor are they adequately described by formal logic.  Somehow, human understanding is a balance between intuition and logic, between deduction and induction.

Popper’s critique sounds too much like someone obsessing over getting the thinking just so without stopping to ask whether such a task is worth it.  Scientific discovery happens without the practitioners knowing exactly how it happens and what to call each step.  Should that be enough?

Of course, objectors to my point-of-view will be quick to point out all the missteps that logicians can see in the workings of science – all the black swans that fly in the face of a white-swan belief.  My retort is simply “so what?”

Human existence is not governed solely by logic nor should it be.  If it were, a part of the population would be frozen in indecision because terms were not defined properly, another part would be stuck in an infinite loop, and the last part would be angrily arguing with itself over the proper structure.  There is a duality between induction and deduction that works for the human race – a time to generalize from the specific to the universal and a time to deduce from the universal to the specific.

Perhaps someday, someone will perfect deductivism in such a way so that scientific discovery can happen efficiently without all the drama and controversy and uncertainty.  Maybe… but I doubt it.  After all, we know that we humans aren’t perfect – why should we expect one of our enterprises to be perfectible?