Latest Posts

Finding the Socratic Method

There is a standard debate in mathematics about the application of the terms ‘invention’ versus ‘discovery’.  It resurfaced the other day when a colleague and I were talking about some mathematical graffiti that adorned a door jamb in the conference room in which we were meeting.  This graffiti took the form of some mathematical symbols printed on a magnet held in place at the top part of the door.  None of us in the room were able to determine what, if any, message was being sent but, in the process of discussing the possible meaning, my colleague said, in passing, that Pythagoras had invented the theorem that bears his name.  I questioned whether the verb should have been ‘discovered’ rather than ‘invented’.  We spent a few minutes discussing that point, then we gave up altogether and went our separate ways. On the drive home that evening I began to think about the proper use of those two words and I finished up wondering if Socrates invented or discovered his famous method.

To understand the distinction between ‘invent’ and ‘discover’, let’s return to the Pythagorean Theorem for a moment.  Almost everyone knows the theorem, namely that the square of the length of the hypotenuse of a right triangle is equal to the sum of the squares of the lengths of the other two sides.  The most common proof, and, in my opinion, the most elegant, draws a right triangle with squares of the appropriate areas on each side.  I’ve provided such a diagram for a 3-4-5 Pythagorean triangle below.

[Figure: a 3-4-5 right triangle with a square drawn on each side]

The proof proceeds by laying the larger of the two squares on the perpendicular sides onto the square on the hypotenuse

[Figure: the larger leg square overlaid on the square on the hypotenuse]

and then adding up the remaining area and showing that it is equal to the area of the smaller square

[Figure: the leftover area rearranged to match the area of the smaller square]

Of course, the steps shown above were quickly done in PowerPoint graphics and there is no reason for a skeptic to actually accept them as proof.  But the doubters can go at this ‘proof’ with whatever vigor they desire.  The answer will always be the same:  $a^2 + b^2 = c^2$.

And that brings us to the point about invention versus discovery.  I would argue that the Pythagorean Theorem is an exercise in discovery.  The finding that all right triangles satisfy it is exactly the point: we find, or discover, that all right triangles obey this relationship.

Contrast this with Edison’s invention of the light bulb.  I say invention because of a number of factors.  First, there is the particular form of the object in question.  The base, with its two contacts, one at the bottom center and one on the periphery, is a choice of form that could have been made many other ways.  The shape of the bulb itself is only one suggestion of what could be done given the state of the art of glass blowing at the time of its introduction to society.  Second, there is the particular design and implementation.  The placement of the filament, and the current and voltage running through it, were all carefully chosen to meet specific goals or requirements.  The materials that comprise all the parts were chosen to provide the maximum economy based on availability and convenience in pre-existing manufacturing processes. So I would argue that, while it is proper to say that the laws and properties of electricity were discovered, the light bulb is a true invention.

So far so good, but what about Socrates?  Did he discover or invent the Socratic Method?  As a quick review, a few brief words are in order about what I mean when I say the Socratic Method (ironically, if you follow the link to the Wikipedia article, you’ll find that both ‘invention’ and ‘discovery’ are used in describing Socrates’s contribution).

The Socratic Method is best explained by the Platonic dialog called Euthyphro.  In this dialog, we find Socrates and Euthyphro both showing up at the Athenian court but for very different reasons.  Socrates is answering a call by the court to make account of his ‘criminal ways’ whereas Euthyphro intends to petition the court to bring a charge of murder against his father for the death of a slave in his possession.  The two meet on the steps leading inside and exchange their reasons for being there.  Socrates expresses surprise that Euthyphro is accusing his father of murder since the slave in question died from being imprisoned for murdering another slave.  Euthyphro says that he is compelled to this course of action by his piety.  That’s all the prompting Socrates needs, and soon the two are engaged in a discussion where Socrates asks a question like ‘what is piety’ and Euthyphro attempts to answer with a response like ‘what is pleasing to the gods’.  After each answer, Socrates questions some new part of the response as a way of sharpening the reasoning behind it.

The Socratic Method is a way of examining the logical content of a statement by carefully examining the basic notions that make up that statement.  So, asking what ‘piety’, ‘pleasing’, and ‘gods’ mean is a way of finding the truth. Generally, when the method is applied, we are more apt to find out what a particular concept, like ‘piety’, isn’t, rather than finding out what it is.  Most of the dialogs (and, for that matter, modern applications) end with both parties departing before the full meaning has been established but at least with a clearer picture of what is not meant.

So, with all the preliminaries out of the way, the key question to grapple with is whether Socrates invented this method or discovered it.  My vote is for discovery.  I say this mostly because of the universal nature of this mode of inquiry, but partly because Socrates believed in Truth in the most absolute sense.  If he invented this type of intellectual exploration, then the application of it would be necessarily limited to those contexts where its design matched a particular cast of mind or cultural milieu. The fact that it is a successful philosophical pursuit the world over is testament to its ability to transcend the accidentals of human culture.  The fact that it was fashioned with the goal of discovering Truth through logic and reason and that Socrates believed in Truth leads me to believe that he would agree that he discovered his method.

I am willing to say that Plato invented the particular encounters presented in the dialogs and that Socrates invented the accidental trappings whereby he applied his method to the Athenian society.  Recognizing these inventions is equivalent to recognizing the invention of the particular symbols for denoting an algebraic quantity as ‘$a$’ or multiplication by ‘$\times$’ or equality as ‘$=$’.  Writing $c \times c = a \times a + b \times b$ versus saying ‘the square of the hypotenuse is equal to the sum of the squares of the other two sides’ are two different, invented ways for expressing the same truth.

Language and Metaphor

Why do we quote movies?  I often think about this fascination we have as a culture with repeating and reliving moments from our favorite films.  Large numbers of clips exist on YouTube devoted to capturing that magical moment when a character utters an unforgettable line.  But why? What is it that drives us to remember a great line or identify with the person who uttered it?

This question came up over the dinner table one night and, as I reflected on it, my thoughts were drawn to an explanation about speed reading that I had once come across.  The author went to great trouble to make the point that the trick to speed reading was to see groups of words in a sentence as a single chunk of information.

To understand that point, consider the words you read in a sentence. To make it concrete, let’s take the word ‘understand’.  When you read the word ‘understand’, you are seeing a group of 11 individual letters, but are you really conscious of each letter at a time?  Do you really see a ‘u’ followed by an ‘n’ followed by a ‘d’ and so on?  No.  What each of us with any sophistication in reading accomplishes is to perceive these 11 letters as a unit that conveys the word ‘understand’.  Of course, this is why we can read a sentence with a misspelling quite comfortably and often we may not even notice.

This idea of chunking and clumping comfortably scales upward to where common phrases can be consumed with a glance without a detailed examination of the individual words.  There are two common approaches.  The first is to abbreviate the phrase into some short form; expressions like ‘LOL’ or ‘BTW’ or any of the other text- and chat-speak coinages are excellent examples.  The other approach is to pick lyrical or repetitive expressions to encapsulate a phrase; the famous ‘rosy-fingered dawn’ from the Homeric epics and ‘ready, set, go!’ are examples of these kinds, respectively.

But there is an even more compelling approach – the concept of the metaphor.  The idea here is that we liken an object to another object, one with well-known properties.  The properties of the second object are then inherited by the first simply by the equating of the names.  Some examples of this include the following sentences.

  • ‘That guy’s a Benedict Arnold for leaving our team for that other one.’
  • ‘How was the date last night, Romeo?’
  • ‘Stay away from her, she’s Mata Hari through and through!’
  • ‘That man is death itself’.

I think that our desire to quote movies is indicative of this: by repeating the dialog of the movie, the quote itself becomes a cultural metaphor for the feelings and experiences expressed in the movie.  This idea was brilliantly explored in the Star Trek: The Next Generation episode Darmok.

In this episode, the crew of the Enterprise are coaxed into a meeting with a Tamarian ship in orbit around a mostly unexplored planet.  Despite their universal translator, no ship of the Federation had ever been able to crack the Tamarian language.  The words themselves could be translated, but the syntax and context seemed to be utterly devoid of meaning.  The Tamarian captain, Dathon, seeing that this meeting was no different from previous encounters and that the ordinary avenues for communication were not working, arranged for himself and Captain Picard to be trapped on a planet.

On that planet, both captains were confronted by a dangerous creature.  This shared danger spurred a meeting of the minds, and understanding eventually dawned on Picard.  The Tamarian race thought and communicated in metaphors.  They would say statements like ‘Temba, his arms wide’ to mean the concept of giving or generosity.  Back on the Enterprise, the crew had also come to a similar epiphany.  By analogy, they constructed a Tamarian-like way of expressing romance by saying ‘Juliet on her balcony’, but they lamented that without the proper context, in this case Shakespeare’s tragic play about Romeo and Juliet, one didn’t know who Juliet was and why she was on her balcony.

The episode closes with Dathon dying from the wounds inflicted by the creature, and with Picard arriving back aboard the Enterprise just in time to make peace between the Tamarians and the Federation by speaking their language.

The episode left some dangling ideas.  How do the Tamarians specify an offer that involves a choice between many things, or how would an abstract idea, like giving someone his freedom, be expressed?  Nonetheless, it was a creative and insightful way of exploring how powerful metaphor can be and how abstract the thought that lies behind it can be.

So, the next time you quote a movie, give a thought to the metaphor that you are tapping into and take a moment to marvel at the miracle of speech and thought.

Aces High

This week’s column has a three-fold inspiration.  First, many of the most exciting and controversial philosophical endeavors involve arguing from incomplete or imprecise information to a general conclusion.  These inferences fall under the heading of either inductive or abductive reasoning, and most of the real sticking points in modern society (or any society for that matter) revolve around how well a conclusion is supported by fact and argument.  The second source comes from my recent dabbling in artificial intelligence.  I am currently taking the edx course CS188.1x, and the basic message is that the real steps forward in the AI landscape came after the realization that computers must be equipped to handle incomplete, noisy, and inconsistent data.  Statistical analysis and inference deals, by design, with such data, and its use allows an algorithm to make a rational decision in such circumstances.  The final inspiration came from a fall meeting of my local AAPT chapter in which Carl Mungan of the United States Naval Academy gave a talk.  His discussion of the two aces problem was nicely done, and I decided to see how to produce code in Python that would ‘experimentally’ verify the results.

Before presenting the various pieces of code, let’s talk a bit about the problem.  This particular problem is due to Boas in her book Mathematical Methods in the Physical Sciences, and goes something like this:

What is the probability that, when being dealt two random cards from a standard deck, the two cards will be two aces?  Suppose that one of the cards is known to be an ace, what is the probability?  Suppose that one of the cards is known to be the ace of spades, what is the probability?

 

The first part of this three-fold problem is very well defined and relatively simple to calculate.  The last two parts require some clarification of the phrase ‘is known to be’.  The first possible interpretation is that the player separates out the aces from the deck and either chooses one of them at random (part 2) or chooses the ace of spades (part 3).  He returns the remaining three aces to the deck, which he subsequently shuffles prior to drawing the second card.  I will refer to this method as the solitaire option.  The second possible interpretation (due to Mungan) is that a dealer draws two cards at random and examines them both while keeping their identity hidden from the player.  If one of the cards is an ace (part 2) or is the ace of spades (part 3), the dealer then gives the cards to the player.  Parts 2 and 3 of the question then ask for the probability that the hands that pass this inspection step actually have two aces.  Since this last process involves assistance from the dealer, I will refer to it as the assisted option.  Finally, all probabilities will be understood from the frequentist point of view.  That is to say that each probability will be computed as a ratio of desired outcomes to all possible outcomes.

With those preliminary definitions out of the way, let’s compute some probabilities by determining various two-card hand outcomes.

First let’s calculate the number of possible two-card hands.  Since there are 52 choices for the first card and 51 for the second, there are 52×51/2 = 1326 possible two-card hands.  Of these, there are 48×47/2 = 1128 possible two-card hands with no aces, since the set of cards without the aces comprises 48 cards.  Notice that, in both of these computations, we divide by 2 to account for the fact that order doesn’t matter.  For example, a 2 of clubs as the first card and a 3 of diamonds as the second is the same hand as a 3 of diamonds as the first card and a 2 of clubs as the second.

[Figure: the 48 non-ace cards from which the no-ace hands are drawn]

Likewise, there are 48×4 = 192 ways of getting a hand with only 1 ace.  Note that there is no need to divide by 2, since the two cards are drawn from two different sets.

[Figure: one-ace hands pair one of the 4 aces (the green set) with one of the 48 non-ace cards]

Finally, there are only 6 ways to get a two-ace hand.  These are the 6 unique pairs that can be constructed from the green set shown in the above figure.

As a check, we should sum the sizes of the individual subsets and confirm that the total equals the number of possible two-card hands. This sum is 1128 + 192 + 6 for no-ace, one-ace, and two-ace hands, and it totals 1326, which is indeed the total number of two-card hands.  So, the division into subsets is correct.
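
For readers who want to check the bookkeeping directly, here is a small brute-force enumeration (just a sketch, separate from the Monte Carlo code further below) that counts the hands in each subset:

from itertools import combinations

suits = ['S', 'H', 'D', 'C']
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
deck  = [r + s for s in suits for r in ranks]

counts = {0: 0, 1: 0, 2: 0}
for hand in combinations(deck, 2):            # all 1326 unordered two-card hands
    num_aces = sum(card.startswith('A') for card in hand)
    counts[num_aces] += 1

print(counts)                 # {0: 1128, 1: 192, 2: 6}
print(sum(counts.values()))   # 1326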

With the size of these sets well understood, it is reasonably easy to calculate the probabilities asked for in the Boas problem.  In addition, we’ll be in a position to determine something about the behavior of the algorithms developed to model the assisted option.

For part 1, the probability is easy to find as the ratio of all possible two-ace hands (6 of these) to the number of all possible two-card hands (1326 of these).  Calculating this ratio gives 6/1326 = 0.004525 as the probability of pulling a two-ace hand from a random draw from a well-shuffled standard deck.

For parts 2 and 3 in the solitaire option, the first card is either given to be an ace or the ace of spades.  In both cases, the probability of getting another ace is the fraction of times that one of the three remaining aces is pulled from the deck that now holds 51 cards.  The probability is then 3/51 or 0.05882.

The answers for the assisted option for parts 2 and 3 are a bit more subtle.  For part 2, the dealer assists by winnowing the possible set down from the 1326 possibilities associated with all two-card hands to only the 192 possibilities associated with one-ace hands plus the 6 possibilities associated with two-ace hands.  The correct ratio is 6/198 = 0.03030 as the probability for getting a hand with two aces when it is known that one card is an ace.  For part 3, the dealer is even more zealous, winnowing the set down to hands containing the ace of spades.  After he is done, there are only 51 possibilities, of which 3 are winners, and so the correct ratio is 3/51 = 0.05882 as the probability of getting a hand with two aces when it is known that one card is the ace of spades.
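
These ratios can also be written down exactly; a quick sketch using Python’s fractions module (just a check, not part of the Monte Carlo modeling below) gives:

from fractions import Fraction

total    = Fraction(52 * 51, 2)          # 1326 possible two-card hands
two_aces = Fraction(6)                   # two-ace hands
one_ace  = Fraction(192)                 # exactly-one-ace hands

print(two_aces / total)                  # part 1: 1/221, about 0.004525
print(Fraction(3, 51))                   # solitaire option, parts 2 and 3: 1/17, about 0.05882
print(two_aces / (one_ace + two_aces))   # assisted option, part 2: 1/33, about 0.03030
print(Fraction(3, 51))                   # assisted option, part 3: 1/17, about 0.05882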

All of these probabilities are easily checked by writing computer code.  I’ve chosen Python because it is very easy to perform string concatenation and to append and remove items from lists.  The basic data structure is the deck, which is a list of 52 cards constructed from 4 suits (‘S’,’H’,’D’,’C’), ranked in traditional bridge order, and 13 cards (‘A’,’2’,’3’,’4’,’5’,’6’,’7’,’8’,’9’,’10’,’J’,’Q’,’K’).  Picking a card at random is done by importing the random package and using random.choice, which returns a random element of the list passed into it as an argument.  Using the random package in the code turns the computer modeling into a Monte Carlo simulation.  For all cases modeled, I set the number of Monte Carlo trials to N = 1,000,000.

Code to model part 1 and parts 2 and 3 for the solitaire option is easy to implement and understand, so I don’t include it here.  The Monte Carlo results (10 cases each with N trials) certainly support the analysis done above, but, if one didn’t know how to do the combinatorics, one would only be able to conclude that the part 1 probability is approximately 0.0045204 +/- 0.00017.
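
For readers who would like to see it anyway, here is one minimal sketch of what such solitaire-option code might look like (a reconstruction for parts 2 and 3 only, using the same rank-plus-suit card strings as the listing below):

import random

suits = ['S', 'H', 'D', 'C']
ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
full_deck = [r + s for s in suits for r in ranks]

def solitaire_probability(num_mc, spades_only=False):
    # part 2: the first card is a randomly chosen ace; part 3: it is the ace of spades.
    # The second card is then drawn at random from the remaining 51 cards.
    hits = 0
    for _ in range(num_mc):
        first = 'AS' if spades_only else random.choice(['AS', 'AH', 'AD', 'AC'])
        second = first
        while second == first:              # re-draw until we get one of the other 51 cards
            second = random.choice(full_deck)
        if second.startswith('A'):          # the hand ends up with two aces
            hits += 1
    return hits / num_mc

print(solitaire_probability(1000000))                    # part 2: about 3/51 = 0.0588
print(solitaire_probability(1000000, spades_only=True))  # part 3: also about 0.0588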

The code to model parts 2 and 3 for the assisted option is a bit more involved because the dealer (played by the computer) has to draw a hand and then either accept or reject it.  Of the N Monte Carlo trials drawn, what percentage of them will be rejected?  For part 2, this amounts to determining the ratio of two-card hands that have at least one ace to all the possible two-card hands.  This ratio is 198/1326 = 0.1493.  So, roughly 85 % of the time the dealer is wasting my time and his.  This lack of economy becomes more pronounced when the dealer rejects anything without the ace of spades.  In the listing below, the ace_of_spades function accepts a hand only when the first card drawn is the ace of spades, which happens 1/52 = 0.01923 of the time (by symmetry, checking only the first card does not change the conditional probability), so approximately 98% of the time the dealer is throwing away the dealt cards because they don’t meet the standard.  In both cases, the Monte Carlo results support the combinatoric analysis with 0.03031 +/- 0.0013 and 0.05948 +/- 0.0044.

import random

suits = ['S','H','D','C']   # bridge order, as described in the text
cards = ['A','2','3','4','5','6','7','8','9','10','J','Q','K']

def make_deck():
    # build the 52-card deck as rank+suit strings, e.g. 'AS' or '10D'
    deck = []
    for s in suits:
        for c in cards:
            deck.append(c+s)
    return deck

def one_card_ace(num_mc):
    # assisted option, part 2: the dealer keeps only hands with at least one ace
    num_times = 0    # hands accepted by the dealer
    num_good  = 0    # accepted hands that hold two aces
    deck = make_deck()
    for i in range(num_mc):
        card1 = random.choice(deck)
        deck.remove(card1)
        card2 = random.choice(deck)
        deck.append(card1)                 # restore the deck for the next trial
        if card1.find('A') == 0 or card2.find('A') == 0:
            num_times = num_times + 1
            if card1.find('A') == 0 and card2.find('A') == 0:
                num_good = num_good + 1
    return [num_times, num_good]

def ace_of_spades(num_mc):
    # assisted option, part 3: the dealer keeps only hands whose first card is the
    # ace of spades (checking just the first card halves the acceptances but, by
    # symmetry, leaves the conditional probability of two aces unchanged)
    num_times = 0
    num_good  = 0
    deck = make_deck()
    for i in range(num_mc):
        card1 = random.choice(deck)
        deck.remove(card1)
        card2 = random.choice(deck)
        deck.append(card1)
        if card1.find('AS') == 0:
            num_times = num_times + 1
            if card2.find('A') == 0:
                num_good = num_good + 1
    return [num_times, num_good]

Notice that the uncertainty in the Monte Carlo results grows larger in part 2 and even larger in part 3.  This reflects the fact that the dealer only really affords us about 150,000 and 19,000 trials of useful information due to the rejection process.

Finally, there are a couple of philosophical points to touch on briefly.  First, the Monte Carlo results certainly support the frequentist point of view, but they are not actual proofs of the results.  Even more troubling is that a set enumeration, such as given above, is not a proof of the probabilities, either.  It is a compelling argument and an excellent model, but it presupposes that the probability should be calculated by the ratios as above.  Fundamentally, there is no way to actually prove the assumption that set ratios give us the correct probabilities.  This assumption rests on the belief that all possible two-card hands are equally likely.  This is a very reasonable assumption, but it is an assumption nonetheless. Second, there is often an accompanying statement along the lines that, the more that is known, the higher the likelihood of the result.  For example, knowing that one of the cards was an ace increased the likelihood that both were aces by a factor of 6.7.  While true, this statement is a bit misleading, since, in order to know, the dealer had to face the more realistic odds that 85 percent of the time he would be rejecting the hand.  So, as the player, our uncertainty was reduced only at the expense of a great deal of effort done on our behalf by another party.  This observation has implications for statistical inference that I will explore in a future column.

Random Thoughts on Randomness

One of the questions I like to indulge, from time to time, is “what would Aristotle’s reaction be to some fixture of modern life?” This week I decided to ponder what the famous Macedonian would say about the use of randomness, statistics, and probabilistic reasoning that pervades our daily existence.

On one hand, I think that he would be intrigued by the very way that randomness is built into nature and our interactions with her. Quantum mechanics has randomness at its very core. This type of randomness, sometimes referred to as aleatory (meaning depending on the dice) uncertainty, is (or at least seems to be) an inescapable facet of how matter works at its most basic level. The outcome of any experiment is probabilistic no matter how precisely the setup is controlled. I imagine the fascination and, perhaps, the disbelief that he would have displayed when first encountering the Heisenberg uncertainty principle. In fact, since I am fantasizing here, it would be really interesting to see these two great men sit down for lunch and a long philosophical discussion about the meaning of uncertainty and causality.

I also think that Aristotle would have quickly grasped how we apply uncertainty in our knowledge, sometimes referred to as epistemological uncertainty, to a host of important everyday activities.

I envision him nodding his head in approval to the sciences of chemistry, thermodynamics, and statistical mechanics. In each of these disciplines, the sheer amount of data needed to describe the motion of the individual atoms, molecules, or components is overwhelming, and one must sacrifice complete knowledge of the small in order to get good knowledge of the all. No doubt, the connection between information and entropy would consume much of his attention as he explored the combinatoric possibilities of very large numbers of objects.

Aristotle would also be at home in the discipline of statistics and estimation theory. As a skeptic of the universal validity of the Platonic forms, he well knew the individual differences in development and form of everything on the Earth. I picture him exclaiming, with satisfaction, “Yes! That’s how it should be done,” when he saw how statistical estimation theory is applied to turning noisy measurements into adequate conclusions. All the modern applications, from quality control to data fitting, would have caught his fancy.

Computing applications of randomness and uncertainty would also meet with his approval. Numerical techniques, such as the Monte Carlo method, that use random samples as a way of ferreting out details of a process would hold a particular fascination for him, and I imagine him spending hours playing with cellular automata simulations, such as Conway’s game of life. Most of all, the growing consensus in the artificial intelligence community that uncertainty is the vital component to producing machines that make rational decisions would have been pure delight for the Philosopher.

All that said, I think Aristotle would recoil in horror at that one very large component of randomness in modern life – the public’s approach to the scientific study.

Everywhere we go we are confronted by a headline, radio broadcast, news report, or tweet that says some study performed somewhere has determined something. Often times the public, even that component that is highly educated, consumes these studies with little or no heed. Most are unaware of how these studies are performed and how weak the inferences (i.e., conclusions) actually are.

If Aristotle were sitting amongst us when the news of the latest study were to break, he would be the first to caution us to be clear in our thinking and skeptical of the results. As a logician, he would know the difficulty in inferring a cause from the observed effect. And as a member of the Athenian polis, he would know better than we the various conflicting political agendas that drive people to various conclusions.

But the most grievous complaint that Aristotle would level against modern scientific studies would come only after he returned to the academy to see just how ‘the sausage is made’. He would be shocked and repulsed by the bastardization of the academic process. Not because of the political pressures and the conflicts of interest due to funding or the lack of it. No, he would be repulsed with how little regard our institutions of higher learning have for the concept of critical thinking.

To be a skeptic, to doubt a conclusion, to question and probe are no longer virtues (in the strictest Aristotelian sense). These traits brand their bearer not as a philosopher but as a dullard or, worse, a denier. Graduate students, post docs, and professors worldwide are increasingly drifting further away from intellectual integrity. It is enough, they say, to find a positive or negative correlation in some data, perform a hypothesis test, and publish a result (see, e.g., the critiques leveled by Michael Starbird in Meaning from Data: Statistics Made Clear). Damn critical thinking and the discomfort that it brings, they say. Aristotle, of course, would say differently.

The Dangers of Being Equal

One of the favorite themes of this blog is how language and reasoning affect each other, sometimes to the detriment of both.  The overlap between logical reasoning and mathematical language is particularly rife with possibilities for confusion because of the way certain concepts are used contextually.  In an earlier post I discussed the seductive properties of the humble symbol $\infty$.  A far more deadly symbol is the highly overloaded glyph described by two parallel horizontal lines – the equal sign ‘$=$’.

There are so many contextual uses of the equal sign that it is hard to know where to start.  And each and every one of them is sinister to the untrained.  Like some kind of bizarre initiation ritual, we subject students of all stripes to this ambiguous notation, and then we get frustrated when they don’t grasp the subtle distinctions and shaded nuances of meaning that we take for granted.  This situation closely parallels the experiences many of us have had learning how to swim, or ride a bike, or ice-skate, or drive.  Those of us who know how to do something, often can’t remember how hard it is to learn when you don’t know.

Of course, this situation is not unprecedented in language.  A simple internet search using the string ‘word with the most definitions’ returns the following statement from Dictionary.com

“Set” has 464 definitions in the Oxford English Dictionary. “Run” runs a distant second, with 396. Rounding out the top ten are “go” with 368, “take” with 343, “stand” with 334, “get” with 289, “turn” with 288, “put” with 268, “fall” with 264, and “strike” with 250.

So, functionally overloading a word with numerous meanings, some of them very closely related and some of them quite distinct, is commonplace.

What makes the equal sign so frustrating is that it is mostly applied in highly technical fields where shades of meaning in thought can have large implications for outcomes.  Consider the differences in meaning in the following equations:
\[ \pi = \frac{C}{D} = \frac{C}{2 r} \]
and
\[ \pi = \ln\left( i^{-2i} \right) \]
and
\[ \pi = \sqrt{6 \sum_{n=1}^{\infty} \frac{1}{n^2}} \; . \]

Each of them tells us something about the irrational number $\pi$, but in very different ways.  In the first equation, we think of $\pi$ as the assigned value for the ratio between the circumference of a circle, $C$, and its diameter, $D$.  This concept is purely geometric, and can be explored with rulers and compasses and pieces of paper. In some sense, it can even be regarded as a causative relation, telling us that, if we make a circle of radius $r$, then we are making an object whose perimeter is a distance $C = 2 \pi r$.  The second equation is an identity in the purest sense of that term.  It boldly states that one of the many disguises of $\pi$ is an algebraic expression involving the natural logarithm and the imaginary number $i$.  The final equation is neither an assignment nor an identity, but a set of instructions saying ‘if you want to know how to compute $\pi$ to some accuracy, then set up a computing process that takes the first $n$ integers and combines them in this funny way.’

The science of computing has long recognized that the usual ambiguity of human language would be inadequate for machine instructions.  All programming languages to which I’ve been exposed clearly distinguish between the concepts of assignment, equivalence, and function definition.  Using the pi equations above, one might express them in the programming languages Python and Maxima as follows, giving each equation first and then its Python and Maxima forms.

\[ \small  \pi = \frac{C}{2r} \]

Python:  pi = C/(2*r)
Maxima:  pi : C/(2*r)

\[ \small \pi = \ln\left(i^{-2i}\right) \]

Python:  pi == ln(i**(-2*i))
Maxima:  pi = ln(i**(-2*i))

\[ \small \pi = \sqrt{6 \sum_{n=1}^{\infty} \frac{1}{n^2}} \]

Python:

def sum_sqr(n):
    sum = 0
    for i in range(1,n+1):
        sum = sum + 1.0/(i*i)
    return sum

def approx_pi(n):
    sum = sum_sqr(n)
    return (6*sum)**(0.5)

Maxima:

calc_pi(n) :=
  block([sum],
    sum : 0,
    for i : 1 thru n do
      sum : sum + 1.0/(i*i),
    sqrt(6*sum));
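
As a quick numerical sanity check on the series form, here is a small self-contained sketch (it simply re-implements the two Python functions above with a driver loop) showing the approximation creeping toward $\pi$ as more terms are kept:

import math

def sum_sqr(n):
    # partial sum of 1/k^2 for k = 1..n
    total = 0.0
    for k in range(1, n + 1):
        total += 1.0 / (k * k)
    return total

def approx_pi(n):
    # Basel series: pi = sqrt(6 * sum of 1/k^2), truncated after n terms
    return math.sqrt(6.0 * sum_sqr(n))

for n in (10, 1000, 100000):
    print(n, approx_pi(n))   # creeps up toward 3.14159... as n grows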

Note that, in each case, there is a clear syntactical difference between assignment (‘$=$’ or ‘$:$’), the conditional test for identity (‘$==$’ or ‘$=$’), and functional definition (‘def’ or ‘$:=$’).  For anyone who’s been programming for some time, switching back and forth between these ideas of assignment, equivalence, and definition is relatively effortless, but for the beginner it is one of the hardest concepts he will have to learn.

The situation is even more complex in the physical sciences, for two primary reasons.  First, and foremost, because man has been investigating the physical world longer than he has been writing computer programs.  As a result, there has been more time for man to layer different meanings and subtle distinctions.  Second, computers are intrinsically stupid and require a high degree of precision and clarity to function.  A nice discussion of this last point can be found in the prologue of the book Functional Differential Geometry by Sussman and Wisdom.

As an example, let’s look at perhaps the most famous physical statement – Newton’s second law.  Many people, even those lacking formal training in science, know that the expression of the law is ‘force equals mass times acceleration’ or, in mathematical terms,

\[ \vec F = m \vec a \; . \]

But what does the equal sign here mean?  The concept of a force tells us that it is a vector quantity that transforms like a position vector.  That means that a force relationship is the same in all frames.  For example, the balancing of the pulls from three ropes tied to an object such that the object doesn’t move is an equilibrium condition that is independent of the frame in which it is expressed.  An accelerating observer will come to the same conclusion as an inertial observer. So the force on the left-hand side of $\vec F = m \vec a$ is geometric in its meaning.

On the other hand, we understand that the acceleration appearing on the right-hand side is kinematic.  It describes an object’s motion and it’s the kind of thing measured with rulers and clocks.  It is fundamentally frame dependent when described by an accelerating observer.  Just imagine the visual perception of someone on a merry-go-round.  The mass, which measures the object’s unwillingness to move under influence of a force, simply scales the acceleration and can be regarded as constant.

So how do we reconcile what the equal sign means here?   On one side is a geometric quantity as immutable and placid as a mountain.  The other side is as ephemeral as rising mist or running water, flowing to and fro.  How can they actually be equal?

Well, the answer is that the equal sign should be regarded as relating cause and effect.  If we regard the force as known (e.g., Newton’s universal law of gravity), then the equal sign allows us to deduce the resulting motion once the force is applied.  If we regard the acceleration as known (e.g., we film the motion and do a frame analysis), we can infer (via abductive reasoning) the force that caused it.
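
To see this cause-to-effect reading in action, here is a minimal sketch (the spring force law, mass, step size, and initial conditions are just assumed values chosen for illustration) in which a known force is fed through $\vec F = m \vec a$ to deduce the motion:

m, k = 1.0, 4.0              # assumed mass and spring constant
x, v = 1.0, 0.0              # assumed initial position and velocity
dt = 0.001                   # time step for a simple Euler integration

for _ in range(5000):        # integrate for 5 seconds
    a = -k * x / m           # the 'equals' sign read causally: known force gives acceleration
    x = x + v * dt           # the kinematics then follows from the deduced acceleration
    v = v + a * dt

print(x, v)                  # compare with the exact solution x(t) = cos(2 t) at t = 5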

Clearly, the innocent-looking ‘$=$’ packs a lot more meaning than at first it appears. It is interesting to ponder why it is that the shortest of strings, such as ‘$\infty$’, or ‘set’, or ‘$=$’, have the longest and deepest of meanings. Maybe it reflects on the subtlety of the human mind.

It Isn’t Elementary

I suppose that this post grew out of a recent opportunity I had to re-watch the M*A*S*H television series. One particular episode, entitled The Light That Failed, finds the army surgeons and nurses of the 4077th unit suffering a brutal Korean winter and desperately low on supplies.  The supply truck soon arrives bearing a cargo more suited for a military unit in Guam or the Philippines than for mainland Asia.  As the troops are left wondering what they can do with mosquito netting, ice cream churns, and swim fins when the temperature is hovering around freezing, someone notices that one of the doctors, B J Hunnicutt, has received a very rare object – a book.

And this is not just any book, but a murder mystery by the famed writer Abigail Porterfield called The Rooster Crowed at Midnight.  Either out of the goodness of his heart or out of a desire to end the constant nagging (probably both), B J decides to tear portions of the book out so that it can circulate throughout the camp.  As the old saying goes, no good deed goes unpunished, and soon he discovers that the last page of the book, in which all is revealed, is missing.  And thus begins the great debate as to who committed the murders, how they did it, and why.

The team comes up with many answers, all of which are first widely embraced as the solution and then scuttled when someone gives a counterexample that pokes a hole in the theory.  They eventually place a long distance phone call to the author herself, now residing in Australia, to get the answer.  But even this authoritative voice doesn’t quell the skepticism.  Shortly after they ring off, Col. Potter, the commanding officer, points out that Abigail Porterfield’s own solution can’t be true.  The episode closes with the delivery of the much-needed supplies, and some comic hijinks, but with no satisfactory explanation as to who the culprit was.

I was in middle school when I first saw that episode and it left a lasting impression on me.  For many years I carried misconceptions about mystery stories and I wondered why anyone would ever read them.  In particular, I held a very skewed idea about deductive reasoning and what can and cannot be determined from the evidence.  With the perspective of years (really decades) I am both happy and disappointed to say that I was not alone in my poor understanding of what logic and reason are capable of achieving.

Let’s talk about deductive logic first.  The basic idea behind deductive logic is that the conclusion is guaranteed to be true if the premises are true.  It is a strong approach to logic, since it argues from first principles that apply to a broad class or set of objects and, from those, narrows down a conclusion about a specific object.  In a pictorial sense, deductive logic can be thought of in terms of Venn diagrams.  If we want to conclude something about an object, we simply need to know into what categories or classes this object falls, and we will be able to conclude something exact about it by noting where all of the various categories to which it belongs intersect.

Deductive reasoning is, unfortunately, also limited by the fact that we are not born with, nor does anyone have, the universe’s user manual that spells out in detail what attributes each object has and into what categories they may be grouped.  So, the standard objections that are raised in deductive logic fall squarely on disagreements about the truth of one or more premises.

For example, the syllogism

  • All men are mammals
  • George is a man
  • Therefore George is a mammal

is a logically correct deduction, since the conclusion follows from the premises and it is true since the premises are true (or at least we regard them to be true). The syllogism

  • All men are mammals
  • George is a mammal
  • Therefore George is a man

is invalid, even though its premises are true, since the conclusion does not follow from them: being a mammal does not make George a man (he could just as well be a gorilla).  In contrast, the syllogism

  • All white birds are man-eaters
  • All swans are white birds
  • Therefore all swans are man-eaters

is perfectly valid, since the conclusion follows from the premises, but is not true since neither premise is true (or so I hope!).

All of this should be familiar.  But what to make of this ‘deduction’ (B J’s syllogism) made by B J Hunnicutt in the episode:

  • Lord Cheevers was murdered in the locked library in Huntley Manor
  • Randolf had motive for murdering Lord Cheevers
  • Randolf played in Huntley Manor as a child
  • Randolf would have known if there were secret passages in Huntley Manor
  • Therefore Randolf was the murderer.

Is this really a deduction as he claimed?  According to the novel, the first three premises are true.  The fourth premise is certainly plausible but is not necessarily true.  How then should we feel about the conclusion?  What kind of logic is this if it is not deductive?  Suppose we knew that Randolf was the murderer (e.g., we caught him in the act). What can we infer about the fourth premise?

Before answering these questions, consider what would happen if we were to modify the argument a bit to simplify the various possibilities.  The syllogism (B J’s syllogism revised) now reads

  • Lord Cheevers was murdered in the locked library in Huntley Manor
  • Randolf had motive for murdering Lord Cheevers
  • Randolf played in Huntley Manor as a child
  • Randolf knew there was a secret passage from the study to the library
  • Randolf was seen entering the empty study just before the murder
  • Therefore Randolf was the murderer.

This argument is certainly a stronger one than the first one proffered, but it isn’t really conclusive.  But, again, how should we feel about the conclusion?  What kind of logic is this?

In both cases, we know that the conclusion is not iron-clad; that it doesn’t necessarily follow from the premises.  But just like those fictional characters in M*A*S*H, we are often faced with the need to draw a conclusion from a set of premises that do not completely ‘nail down’ an unequivocal conclusion.

The type of logic that deals with uncertainty falls under the broad descriptions of inductive and abductive reasoning.  Inductive reasoning allows us to draw a plausible conclusion ‘B’ from a set of premises ‘A’ without ‘B’ necessarily following from ‘A’.  Abductive reasoning allows us to infer the premise ‘A’ based on our knowledge that outcome ‘B’ has occurred.

In the M*A*S*H examples given above, B J’s revised syllogism is an example of inductive reasoning.  All the necessary ingredients are there for Randolf to have committed the crime but there is not enough evidence to inescapably conclude that he did. We can infer that Randolf is the killer but we can’t conclude that with certainty.

B J’s original syllogism is a lot more complicated.  It involved elements of both inductive and abductive reasoning.  If we believe Randolf is guilty, we might then try to establish that there were secret passages in Huntley Manor that connected the locked library to some other room in the mansion.  We would then have to also establish, maybe through eyewitness testimony, that Randolf knew of the passages (e.g., an old servant recalls showing it to a young Randolf).  Even still, all we would be doing is establishing the premises with more certainty.  The conclusion of his guilt would still not necessarily follow.  If, on the other hand, we knew that he was guilty, perhaps he was seen by someone looking into the library from outside, we might abductively infer that there was a secret passage and that Randolf knew of its existence.

So here it is, a great irony of life.  It’s decades after I first watched an episode of M*A*S*H that turned me off of mystery stories for a long time, and I find myself using that episode as a model for discussing logic and reason.  That, I can’t figure out.

The Language of Infinity

“How language shapes thought and thought shapes language” is an age-old question in linguistics and philosophy.  I’m not in any position to give a definitive answer, nor, I suspect, is anyone else.  Having taught math and physics at the university level, I am willing to offer some thoughts about how the language of mathematics and the symbols and glyphs used to turn mathematical concepts into written words shape how people think and solve problems.

In this blog I will be focusing on the concept of infinity and the philosophic implications that come from using it.  But before I get to infinity directly, I would like to discuss, by way of a warm-up exercise, how the use of the symbol ‘x’ throws off a lot of beginning students.

When describing a function or mapping between two sets of real numbers, without a doubt the most common notation that teachers use is to allow the symbol ‘x’, called the independent variable, to be any member of the initial set, and the symbols ‘y’ and ‘f(x)’ to be the corresponding member of the target set and the function that generates it.  The symbolic equation ‘y = f(x)’ becomes so rigidly fixed in some students’ minds that the idea that the symbols ‘x’ or ‘y’ could be replaced with any other symbol, say ‘y’ and ‘z’, never occurs to them.  I myself have had students come and ask if their book has a typo when it asks them to solve ‘x = f(y)’ or integrate ‘f(y) dy’ or the like (once this happened while I was out to dinner at Olive Garden with my family – but that is a story for another day).

There is no easy way to fix this problem, as there is a kind of catch-22 in the teaching of mathematics.  On one hand, the mapping between sets exists as a pictorial relation between ‘clouds’ and ‘the points within them’

[Figure: a mapping drawn as arrows between points in two set ‘clouds’]

without the need for written glyphs.  On the other hand, an initial set of well-defined symbols keeps confusion to a minimum and allows the student to focus on the concepts without all the possible freedom of choice in notation getting in the way.  (Note:  a reader comfortable with classic philosophy may point out that a mapping between sets can be abstracted even further, perhaps to the notion of a Platonic form, but this is a side issue.)

Okay, with the appetizer firmly digesting in our minds, let’s turn to perhaps the most confusing symbol in all of mathematics, the symbol for infinity, ‘$\infty$’. This symbol, which looks like the number ‘8’ passed out after a night of heavy drinking, seduces students and instructors alike into all sorts of bad thoughts.

How does it have this power, you may ask? Well, its very shape, small and compact and slightly numberish, encourages our minds to treat it like all other numbers. There are literally countless examples of infinity masquerading as a plain number, much like a wolf in sheep’s clothing. One of the most egregious examples is the innocent-looking expression

\[ \int_0^\infty e^{-x^2} dx = \frac{\sqrt{\pi}}{2} \]

where ‘∞’ is compared side-by-side with the number ‘0’. There is perhaps a more palatable way of writing the integral as

\[ \lim_{a \rightarrow \infty} \int_0^a e^{-x^2} dx = \frac{\sqrt{\pi}}{2} \]

but it still looks like the number ‘a’ can be thought of as becoming or approaching ‘∞’. A seasoned practitioner actually knows that both expressions are really shorthand for something much more involved. I will summarize what this more involved thing is in one short and sweet sentence. Infinity is a process that you have the freedom to perform as many times as you like. Or even shorter: Infinity is an inexhaustible process.

Take a moment to think that through. Are you back with me? If you don’t see the wisdom in that maxim consider either form of the integral expression listed above. In both cases, what is really being said is the following. Pick an upper bound on the integral (call it ‘a’ to be consistent with the second form). Evaluate the integral for that value of ‘a’. Record the result. Now increase ‘a’ a bit more, maybe by doubling it or multiplying it by 10 or however you like, as long as it is bigger. Now evaluate the integral again and record the result. Keep at it until one of several things has happened: 1) the difference in the recorded values has gotten smaller than some threshold, 2) you run out of time, or 3) you run out of patience and decide to go do something else. The term infinity is simply meant to say that you have the freedom to decide when you stop and you also have the freedom to resume whenever you like and continue onwards.
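
To make that recipe concrete, here is a minimal sketch (using a simple trapezoid rule with an arbitrary doubling schedule and stopping tolerance) that treats the upper limit of the Gaussian integral above as exactly this kind of inexhaustible process:

import math

def gaussian_integral(a, steps=100000):
    # trapezoid-rule estimate of the integral of exp(-x^2) from 0 to a
    h = a / steps
    total = 0.5 * (1.0 + math.exp(-a * a))
    for k in range(1, steps):
        x = k * h
        total += math.exp(-x * x)
    return total * h

a, previous = 1.0, 0.0
while True:
    current = gaussian_integral(a)
    if abs(current - previous) < 1e-9:    # the recorded values have stopped changing
        break
    previous, a = current, 2.0 * a        # otherwise push the upper bound out and repeat

print(current, math.sqrt(math.pi) / 2)    # both are approximately 0.8862269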

If you are new to calculus, you will no doubt find this short sentence definition somewhat at odds with what your instructors have told you. Where are all the formal limits and precise nomenclature? Where is all the fancy machinery? If you are an old hand at calculus you may even be offended by the use of the words ‘process’, ‘freedom’, or ‘inexhaustible’. But this sentiment is exactly at the heart of the Cauchy delta-epsilon formalism, and the casual nomenclature has the advantage of ruthlessly demolishing the ‘high-brow’ language of mathematics to bring what is really a simple idea back to its rightful place as an everyday tool in the thinking person’s toolbox.

On the other hand you may be thinking that everyone knows this and that I am making a mountain out of a mole hill. If you fall into that camp, consider this video about the properties of zero by the Numberphiles.

I must admit I like many of the Numberphile’s videos, but this one made me shake my head. They allowed language to affect their thinking, and they were seduced by the evil camouflage powers of infinity. They go to great trouble to explain why you can’t divide by zero, and they note that people say “isn’t dividing by zero just infinity?” and they point out it isn’t that simple.

The problem is, it is that simple! Dividing by zero is infinity as understood by the maxim above. The Numberphiles prove this fact themselves. At about a minute into the video, one of their lot begins to explain how multiplication is ‘glorified addition’ and division is ‘glorified subtraction’. The argument for ‘glorified subtraction’ goes something like this.

If one wishes to divide 20 by 4, then one keeps subtracting 4 until one is left with a number smaller than 4 (in this case zero).  The number of times one engages in this subtraction process is the answer, with whatever piece is left over being the remainder.  So dividing the number 17 by 5 is a shorthand for subtracting 5 from 17 three times and finding that one has 2 left over.  So one then says 17/5 = 3 with a remainder of 2.
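
Here is a small sketch of that ‘glorified subtraction’ in code (my own illustration, not anything from the video); the loop simply counts how many subtractions fit:

def divide_by_subtraction(dividend, divisor):
    # repeated subtraction: count how many times divisor fits into dividend
    count, remainder = 0, dividend
    while remainder >= divisor:
        remainder -= divisor
        count += 1
    return count, remainder

print(divide_by_subtraction(17, 5))   # (3, 2): 17/5 = 3 with a remainder of 2
# divide_by_subtraction(20, 0) never returns: subtracting 0 leaves 20 unchanged every
# time, which is exactly the inexhaustible process the text identifies with infinity.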

The same bloke (I use bloke because of their cool English or Australian or whatever accents) then says that 20 divided by 0 goes on forever because each time you subtract 0 you are left with 20. Here then is the inexhaustible process that lives at the very heart of infinity. Sadly, while he looks like he is about to hit the bull’s-eye (at 2:20 he even says infinity isn’t a number), his aim goes horribly awry at the last moment when he objects that the expression ‘1/0 = ∞’ can’t be true because one could then go on to say ‘1/0 = ∞ = 2/0’, from which one could conclude ‘1=2’.

This is, of course, a nonsensical objection since the expression ‘1/0 = ∞’ is a shorthand for saying ‘the glorified subtraction of 0 from 1 (in the sense used above) is an inexhaustible process.’  It is no more meaningful to say that this process is the same as the ‘glorified subtraction of 0 from 2’ than it is to say that ‘1/0’ is the same as any other inexhaustible process, like halving a non-zero number over and over until you reach zero.

The fact that the words ‘0’, ‘1’, and ‘∞’ and the sentence ‘1/0 = ∞’ result in an illogical conclusion is an important warning about the power of language to shape thought. The Numberphile guys had all the right ideas but they came up with a wrong result.

Philosophy, Immanuel Kant, and Murder Mysteries – Part 2

In the last post we discussed the epistemological divisions in philosophy between a priori and a posteriori knowledge and the divisions due to Kant between the notions of analytic and synthetic statements.  As a brief reminder, a priori knowledge stems from first principles and can be understood using the human capacity to grasp the essential nature of things.  A posteriori knowledge is obtained only after examining a thing and coming to a conclusion about its nature – a conclusion that cannot be grasped by reason alone.  An analytic statement is one which is true and in which the subject contains the predicate (that is to say, loosely, that one defines the other), while a synthetic statement is one that is neither false nor analytic.

On the surface there seems to be such a strong tie between a priori knowledge and analytic statements, on one hand, and between a posteriori knowledge and synthetic statements, on the other, that there is a temptation to equate the two concepts in each case.  Thus one might want to say that all statements of a priori knowledge are analytic and all statements of a posteriori knowledge are synthetic.

But as is usually the case with logic when examined very carefully, ideas that seem rock-solid based on a casual examination become a lot more uncertain when looked at more thoroughly.  However, these kinds of abstract examinations are often dry.  So for this post we’ll try to apply these ideas to the popular medium of the murder mystery.

What should be said about the murder mystery?  I think that if Aristotle were alive today one of his favorite pastimes would be reading and/or writing murder mysteries.  This should come as no surprise since Aristotle is credited with formalizing logic, and logic and solving mysteries go hand-in-hand.  The murder mystery, or detective story as it is also called (not all the crimes are murders – only the most enjoyable ones), is an individual study in epistemology.  At its heart is the idea of pronouncing a statement of truth; of disclosing ‘whodunnit’.

Consider the analysis of G. K. Chesterton, one of the twentieth century’s most profound thinkers and prolific authors, who penned dozens of works of analysis, philosophy, and social criticism.   Chesterton, who was at home with logic and critical thinking in its many forms, was particularly fond of the detective story and wrote often about it.  One of his notable observations was:

The essence of a mystery tale is that we are suddenly confronted with a truth which we have never suspected and yet can see to be true.

 

Rex Stout, the author of over 70 detective stories, had the following very nice description of the detecting process.  Speaking about his gourmand and rotund detective Nero Wolfe, Archie Goodwin, Wolfe’s assistant, has this to say about his boss’s moments of genius:

I knew what was going on, something was happening so fast inside of him and so much ground was being covered, the whole world in a flash, that no one else could ever really understand it even if he had tried his best to explain, which he never did. Sometimes, when he felt patient, he explained to me and it seemed to make sense, but I realized afterward that that was only because the proof had come and so I could accept it.  I said to Saul Panzer once that it was like being with him in a dark room which neither of you has ever seen before, and he describes all of its contents to you, and then when the light is turned on his explanation of how he did it seems sensible because you see everything there before you just as he described.

 

If a detective story is an individual study in epistemology, then it should be possible to examine each detective in terms of where they fall in the division between a priori and a posteriori knowledge and between analytic and synthetic statements of truth.  In this way, maybe we can shed some light on the thornier sides of this debate and also have some fun doing it.

Before examining some of the great literary detectives, let me state that none of them are purely one way or another.  There is no author of detective fiction (at least not one I would want to read) who would believe that crime can be solved purely by thinking about the world from first principles, nor who would believe that crime can be solved solely by the dry gathering of facts.  It is the interplay between the two extremes that is the engine of discovery and truth detection.  Nonetheless, each of these detectives leans, as does the author who sits behind their adventures, more towards one extreme or the other.

We can envision a categorization scheme for detectives where each is placed on a two-dimensional grid.  To the left is the extreme of the synthetic and to the right the extreme of the analytic.  At the bottom is a posteriori knowledge whereas at the top is a priori knowledge.  An empty grid looks like

[Figure: an empty grid, with synthetic (left) to analytic (right) across and a posteriori (bottom) to a priori (top) up]

and placing a detective in the top right means that he depends more heavily on analytic a priori methods to solve crime than by other means.

Our task is then to debate, and argue, and wrestle with where to place each.  I won’t pretend to have a well-conceived and impregnable argument for what I present below.  Rather I offer it as food for thought and, perhaps, the basis of some really enjoyable discussions with family and friends.

The easiest place to start is with Sherlock Holmes.  For this discussion, I will be dealing only with Holmes in his original incarnation as conceived by Sir Arthur Conan Doyle and not some of the more modern adaptations.  The sleuth of 221B Baker Street often solved the mysteries confronting him through observations correlated with dry or obscure facts.  Red clay from a particular quarry in northern England combined with an encyclopedic knowledge of the British Rail time tables were a more common route to the solution than ponderings about human nature.  Thus we can classify him as predominantly synthetic and a posteriori.

The two famous creations of Agatha Christie, Hercule Poirot and Miss Marple, are cut from a decidedly different cloth.  Both of these sleuths depended heavily on their knowledge of human nature and often worked from motive to solution.  Clearly they are both analytic, but it seems to me that Poirot starts more from well-articulated first principles and methodical deduction than his female counterpart.  Poirot can explain exactly how he arrived at his conclusions (consider his ‘mentoring’ of Doctor Sheppard in The Murder of Roger Ackroyd), even if he often won’t, and he needs only the bare facts to proceed (The Disappearance of Mr. Davenheim).  In contrast, Miss Marple relies on a lifetime spent examining human nature ‘under a microscope’ in her village of St. Mary Mead.  As she explains to Sir Henry Clithering, her knowledge is akin to that of an Egyptologist who, thanks to a lifetime of handling Egyptian scarabs, can tell when one is genuine and another is a cheap knockoff, even if he can’t explain how.  She often jumps to the solution and then gathers or reconciles facts only later (Death by Drowning).  Thus I would be inclined to place Poirot in the analytic and a priori sector and Miss Marple just below him, somewhat into the a posteriori half.

Nero Wolfe, already mentioned above, is more difficult to place.  He seems to slide back and forth between the extremes, showing the greater fluidity early on in Stout’s writing.  In some cases, he is clearly synthetic in his approach.  Consider Fer-de-Lance, where he asks a golf club salesman to demonstrate how to swing a club in order to confirm his suspicions about the delivery method of a poison dart, or The Rubber Band, where he spots a connection between two usages of the word ‘rubber’ to impeach the murderer’s alibi.  In other cases, including Christmas Party and Death of a Doxy, he relies solely on his understanding of human nature and his ability to play upon a murderer’s irresistible compulsion to force a conviction.  I place him nearly equally balanced between analytic and synthetic and tipping more toward a posteriori than a priori.

The final two detectives I’ll discuss both happen to be Roman Catholic clerics: Father Brown, the creation of G. K. Chesterton, and Brother William of Baskerville from Umberto Eco’s brilliant novel The Name of the Rose.  There is some irony here in that Chesterton was a devout Catholic and Eco is a self-declared atheist.  Nonetheless, both detectives depend on their training in philosophy (with particular emphasis on Thomas Aquinas) and on the intellectual and theological traditions of the Catholic Church to find solutions to their mysteries.  Father Brown is deeply logical and a staunch defender of reason (The Blue Cross) but is prone to inspired deductions where, as Chesterton puts it (The Queer Feet):

…in that instant he had lost his head. His head was always most valuable when he had lost it. In such moments he put two and two together and made four million. Often the Catholic Church (which is wedded to common sense) did not approve of it. Often he did not approve of it himself. But it was real inspiration — important at rare crises — when whosoever shall lose his head the same shall save it.

 

In contrast, Brother William seems to take a more measured approach.  At times he is quite proud of and comfortable in his use of logic, as in the affair of Brunellus the horse when he and Adso, his novice, approach the unnamed abbey where the bulk of the book is set.  At other times, he seems to despair of ever knowing anything or, at least, anything with certainty, as in his explanation to Adso of how he got the right answer from the wrong approach.  (An aside: the whole discussion associated with penetrating and navigating the labyrinth is delightful reading and worth studying.)

All things considered, I tend to plop Father Brown down into that controversial region where synthetic a priori knowledge sits and I place Brother William firmly in the center.

My final diagram looks like:

Filled_Detective_Grid
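
For readers who like to tinker, here is a minimal sketch, in Python, of the grid as I’ve filled it in.  The coordinates are nothing more than my own rough guesses on a scale from -1 to +1; the names and numbers are illustrative, not anything canonical.

```python
# A minimal sketch of the detective grid described above.
# Coordinates are illustrative guesses on a -1..+1 scale:
#   x: -1 = purely synthetic, +1 = purely analytic
#   y: -1 = purely a posteriori, +1 = purely a priori

detectives = {
    "Sherlock Holmes":        (-0.7, -0.6),  # synthetic, a posteriori
    "Hercule Poirot":         ( 0.7,  0.5),  # analytic, a priori
    "Miss Marple":            ( 0.6, -0.3),  # analytic, leaning a posteriori
    "Nero Wolfe":             ( 0.0, -0.3),  # balanced, tipping a posteriori
    "Father Brown":           (-0.5,  0.6),  # the synthetic a priori region
    "William of Baskerville": ( 0.0,  0.0),  # firmly in the center
}

def quadrant(x, y):
    """Return a rough label for a position on the grid."""
    horiz = "analytic" if x > 0 else "synthetic" if x < 0 else "balanced"
    vert = "a priori" if y > 0 else "a posteriori" if y < 0 else "centered"
    return f"{horiz} / {vert}"

for name, (x, y) in detectives.items():
    print(f"{name:25s} -> {quadrant(x, y)}")
```

Shifting any of these numbers is exactly the sort of argument I hope the diagram provokes.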

Obviously, I’ve ignored a host of beloved literary detectives, including C. Auguste Dupin, Perry Mason, Ellery Queen, Lord Peter Wimsey, and Sam Spade.  Leave a comment telling me where on the diagram you would place your favorites and why.

Philosophy, Immanuel Kant, and Murder Mysteries – Part 1

I suppose that the genesis of this post comes from one of my current study projects.  Over the past several months, I’ve been slowly working my way through Harry Gensler’s really fine book ‘An Introduction to Logic’, 2nd edition.  As is the case when I learn anything, I find that my mind automatically associates many things with many things.  It seems to me a good strategy, because I remember the information much better and can apply it with greater ease.  (This should be contrasted with the way I was taught or learned history – I still don’t know what the Battle of Hastings was, why I should care, and how it affects my life.)

Anyway, Chapter 3 of Gensler’s book deals with definitions and what is essentially epistemology, although I don’t believe that Gensler ever mentions that term explicitly. The most interesting part of that discussion is the presentation of the categories of definition attributed to Immanuel Kant and how they mesh with the two philosophical divisions of knowledge that are traditionally recognized.

Kant divides statements into two categories:

Analytic statements:   Statements whose subject contains its predicate or that are self-contradictory to deny.
Synthetic statements: Statements that are neither analytic nor self-contradictory.

Traditionally, philosophers recognize two kinds of knowledge, which are defined as:

A posteriori knowledge: Empirical knowledge based on sense experience.
A priori knowledge:  Rational knowledge based solely on intellect.

No doubt a few examples are in order to make these concepts clearer.  The examples that Gensler provides (and which I believe an anonymous Wikipedia contributor lifted without attribution) tend to feature the noun ‘bachelor’.

Examples of analytic and synthetic statements are:

All bachelors are unmarried. (analytic)
Daniel is a bachelor. (synthetic)

The first statement is analytic, since its subject ‘bachelor’ already contains ‘unmarried’ (that is to say, the subject contains its predicate as an attribute).  The second statement is clearly synthetic, since the word ‘Daniel’ does not contain ‘bachelor’, nor is the statement self-contradictory, as it would be if ‘Daniel’ were replaced by ‘Stacey’ (assuming the usual gender denotations of those names).

The following statements are examples of a posteriori and a priori knowledge:

Some bachelors are happy. (a posteriori)
All bachelors are unmarried. (a priori)

 

The first piece of knowledge, that ‘some bachelors are happy’, can only be obtained by going out, meeting bachelors, and determining (through whatever mechanism we like) that some of them are happy.  The second bit of knowledge is based on our ability to see the essential definition of the word ‘bachelor’.
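
To keep the two distinctions straight, I find it helps to think of them as two independent labels rather than one.  Here is a toy sketch in Python (my own illustration, not Gensler’s or Kant’s formalism) that tags the bachelor examples along both axes:

```python
from dataclasses import dataclass

@dataclass
class Statement:
    text: str
    analytic: bool   # True if the subject already contains the predicate
    a_priori: bool   # True if knowable by reason alone, without sense experience

examples = [
    Statement("All bachelors are unmarried.", analytic=True,  a_priori=True),
    Statement("Daniel is a bachelor.",        analytic=False, a_priori=False),
    Statement("Some bachelors are happy.",    analytic=False, a_priori=False),
]

for s in examples:
    kind = ("analytic" if s.analytic else "synthetic") \
           + " / " + ("a priori" if s.a_priori else "a posteriori")
    print(f"{s.text:35} {kind}")
```

Nothing in this encoding forces the two flags to agree, which is exactly the question the rest of this post worries about.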

Obviously, there is an extremely close tie between a statement being analytic and a piece of knowledge being a priori.  There is also a very close tie between a synthetic statement and a piece of a posteriori knowledge (but, I would argue, not as close as the association between analytic and a priori).  Thus, there is a tendency in philosophy to equate the two terms in each case, and to say that all statements of a priori knowledge are analytic, and that all statements of a posteriori knowledge are synthetic.

This seems to be a natural conclusion, and one may be tempted to dismiss the idea that some statements of a priori knowledge can be synthetic, or that some statements of a posteriori knowledge can be analytic. This dismissal is also supported, at least superficially, by the common notion that all of our mathematics is a priori knowledge and all of our science is based on a posteriori knowledge.

The problem arises when one starts to examine certain statements that, while not quite self-referential, at least talk about one another or, more precisely, explicitly talk about the nature of knowledge.

As a possible example of an analytic statement of a posteriori knowledge, consider the sentence ‘the value of pi is about 5% larger than 3’.  That there is a constant of proportionality between the diameter and the circumference of a circle is certainly an analytic statement of a priori knowledge, but the determination of the actual value (or some decimal approximation to it) is not.  Okay, so maybe there is such a thing as an analytic statement of a posteriori knowledge, although Gensler leaves the door open for doubt when he says

“But perhaps any analytic statement that is known a posteriori also could be known a priori.”
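
As an aside, the figure for pi quoted above is easy to verify; here is a throwaway snippet, purely for illustration:

```python
import math

# By what percentage does pi exceed 3?
excess_pct = (math.pi - 3) / 3 * 100
print(f"pi is about {excess_pct:.1f}% larger than 3")  # prints roughly 4.7
```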

 

But, apparently, the real drama in the philosophical world (I must admit I have fanciful images of Plato and Aristotle, dressed in wrestling tights, squaring off in a steel-cage match) is over whether there is credible evidence to support the claim of a synthetic statement of a priori knowledge.  Such a statement Q would be one that is neither self-contradictory to affirm nor to deny, that is true, and that we know to be true using only our reason.

Trying to further explain where such a brain-twisting idea can arise, Gensler asks us to consider two types of philosophers: empiricists and rationalists.  According to his discussion, the empiricist denies the possibility of synthetic a priori knowledge, while the rationalist admits such a possibility.  The crux seems to come in the examination of the empiricist’s point of view.  The first observation is that an empirical point of view seems to equate the experiences of the senses with the actualities of the world.  An empiricist is inclined to say something like

“I perceive an object to be red, therefore it is a red object.”

 

Of course the empiricist seems to have no mechanism for embracing the idea that an object is actually red when it is perceived as red, except to resort to what seems to be synthetic a priori knowledge.  It is synthetic because nothing in how the terms are defined requires an object that is perceived as red to actually be red.  It is a priori because we use our reason to conclude that it is a tenable assumption that all objects perceived as red are, indeed, red.

Perhaps even more interesting is the position the empiricist takes on synthetic a priori knowledge in the first place.  To say

“There is no such thing as synthetic a priori knowledge”

 

seems itself to be an example of synthetic a priori knowledge, at least insofar as one is willing to agree that the statement, if true, is not true by virtue of the definitions of the terms ‘synthetic’ and ‘a priori’, and is therefore synthetic, and that the statement, if true, cannot be determined to be so by our sense experiences, and so must be a priori.

Okay, so what does any of this have to do with murder mysteries?  Well, as I mentioned above, whenever I am learning something, I employ a personal strategy of associating things I understand with things I am trying to grasp.  As I was reading Gensler’s presentation, I couldn’t help but wonder how mystery writers employ these points to amuse, entertain, and sometimes baffle us.

So, next time, I will apply some of these concepts to some of the world’s most famous fictional detectives.  We’ll have a chance to see if Sherlock Holmes is synthetic or analytic.  We’ll ask how many of Hercule Poirot’s little grey cells depend on a priori versus a posteriori knowledge.  We’ll examine whether Miss Marple’s understanding of human nature springs from analytic a posteriori knowledge.  And we’ll explore how logic, reason, and epistemology figure into two of the twentieth century’s most philosophical writers, G. K. Chesterton and Umberto Eco, through their excellent characters Father Brown and Brother William of Baskerville.

A New Turing Test

A common theme that’s been explored in this column for some time is the idea that, at the current state of the art, machine intelligence is clearly inferior to human intelligence, to the point that the term machine intelligence should perhaps be regarded as an oxymoron.  That isn’t to say that an actual thinking or sentient machine wouldn’t be welcome, or that it should be greeted with fear and shunned as an abomination.  Rather, the claim is based on a cold-eyed, unemotional assessment of where we stand and of the huge gap in time and technology that separates us from the apocalyptic stories featured in The Terminator or Demon Seed.

As a case in point that illustrates this gap, consider the evolution of machine automation.

The concept of machine automation is nothing new; it dates back as far as man has made machines.  But the perceived threat of machine automation as being harmful to mankind seems to have its genesis shortly after the beginning of the Industrial Revolution.

While historians don’t agree on exactly when and how the Industrial Revolution began, it is clear that it got its start in Great Britain sometime around 1780, and eventually completed its spread to most of the Western World by 1840.  During that time, machines of all varieties were invented to perform activities that were originally performed by hand.  Of course there was a backlash by certain segments of society, and perhaps the most notable was by the Luddites in England.

The Luddites were a group of textile artisans who felt threatened by the invention of knitting, spinning, and weaving machines that made textile manufacturing accessible to lower-skill workers.  During the period from about 1811 to 1817, they were known for destroying factory machinery and protesting the encroachment of machines into their economic sphere.  And while it is true that these machines probably weakened the textile artisans’ position in society, nowhere do we hear a claim that the machine dislodged the clothing designer.  Nor do we hear that the designers of the machines didn’t know how to weave, spin, or knit.  What we hear is simply that the machines did the same job as the humans, faster and with fewer errors (and, of course, ultimately cheaper), using the techniques developed by human beings.

The idea of machine automation as a threat has waxed and waned over the years.  Another example can be found starting in the late 1950s, when workers in office settings felt threatened by the rise of the computer as a business machine.  The delightful movie Desk Set, starring Spencer Tracy and Katharine Hepburn, is a comical romp through the fears and realities of machines displacing human beings through automation.  Tracy plays Richard Sumner, a ‘Methods Engineer’ (computer scientist) who has been hired to provide a computer for the research department of a major television network in New York in advance of a corporate merger.  Hepburn plays Bunny Watson (I’m not making that name up), the head of the research department, whose entire contingent is filled with smart, witty, and attractive women.  Comic hijinks ensue, and Sumner and Watson end up falling in love, but there is a very level-headed message toward the end of the movie, when Sumner explains that the purpose of his computer is to store and retrieve data so as to free the women for tasks best suited to a human being since, as he likes to say, ‘no machine can evaluate.’

Fast forward to today.  We don’t just have machines that weave fabric or collate research data.  We have machines that act as AI players in video games; that automate complex robotic manufacturing; that print 3-D objects; that control computer updates and traffic lights and billing notices and hundreds of other things.  But they only do what we ourselves have taught them to do.

Nowhere is there a record of a single machine inventing a new process, designing a new object, or developing a new idea.  True, they assist us in all of these tasks, but they do so using the well-defined methodologies that we taught them.  True, they allow us to comb through vast amounts of data and gain deeper insight than we could have gotten by going through the data by hand.  But the patterns they search for and the insights they help bring to light are fundamentally what we put in.

In other words, they do what we tell them to do precisely how we told them to do it.  That we are sometimes surprised by their results shouldn’t come as a shock.  It is well known that any set of rules of reasonable complexity, each seemingly understandable and sensible on its own, can produce unforeseen results when the rules interact.  Ask any person harmed by the unintended consequence of a law written by people, administered by people, and acting on people.  This doesn’t mean that the law itself somehow became intelligent or achieved sentience, but rather that we were too busy or too rushed or simply too stupid to think it all the way through.

So I would propose a new type of Turing test to mark the beginning of the era when machines can think on their own.  For those who don’t know, the original Turing test consists of a remote dialog between a human and a second party that the human knows may be either another human or a machine.  If, after some period of time, the human is unable to tell that the second party is a machine, then the machine has passed the test, successfully mimicking human responses in the dialog.

In my test, the human goes to the second party with a vague set of requirements for a new thing (a process, a widget, a tool, whatever).  If the second party can come back with a design that meets those requirements, or explain why they can’t be met, then that second party is intelligent, whether or not it is human.  When that happens, let me know… I would like to hire it.