Latest Posts

The Axiom of Choice

I was a graduate student when I first heard of the Axiom of Choice.  The mention, by one of the faculty, was in passing and, since it didn’t seem to impact my particular field and my desire to graduate was high, I didn’t pursue exactly what it was.  All that I was left with was a lasting impression of the awe (or was it frustration or, perhaps, disgust), for lack of a better term, that I heard in his voice.  That impression has stayed with me for years, and I thought that this month’s column should be devoted to an introduction to the Axiom of Choice.

Years later, I am in a better position to understand the mix of emotions that came from that faculty member when it came to the Axiom of Choice.  Said simply, the Axiom of Choice may very well be the most controversial of topics in modern mathematics.

The Axiom of Choice is a part of axiomatic set theory, the branch of mathematics that studies sets and attempts to distill their essential properties, from the most mundane finite sets, to the more difficult but commonly used infinite sets like the number line, and beyond (transfinite sets).

Axiomatic set theory grew out of the desire to eliminate the paradoxes of naive set theory, the most famous example being Russell’s paradox.  Toward that goal, several axiomatic systems (or models) cropped up.  The most commonly used system is the Zermelo-Fraenkel model with the Axiom of Choice thrown in (denoted ZFC).

The importance of the Axiom of Choice is immediately apparent in that it is the only axiom that has the pride of place (or shame) next to the names of the men who founded this system.  The following table is a synthesis of the articles on the ZFC system from the Encyclopedic Dictionary of Mathematics (EDM-2) and the Wikipedia article on axiomatic set theory.  The first number is taken from the EDM-2 ordering and the second from Wikipedia; likewise for the name, with EDM-2’s nomenclature coming first and the Wikipedia name (if different) coming after.

1. (1) Axiom of Extensionality: Sets formed by the same elements are equal.
2. (4) Axiom of the Unordered Pair (Axiom of Pairing): For any sets A and B, a set exists whose only elements are A and B.
3. (5) Axiom of the Sum Set (Axiom of Union): For any set A, a set exists whose elements are all the elements of the members of A.  For example, the union over {{1,2},{2,3}} is {1,2,3}.
4. (8) Axiom of the Power Set: For any set A, a set exists whose elements are all the possible subsets of A (i.e., the power set of A).
5. (–) Axiom of the Empty Set: There exists an empty set {}.  (Note that this axiom is not included as a separate axiom in the Wikipedia article nor in the PBS Infinite Series video, but is mentioned in passing under Axiom 8 in the former.)
6. (7) Axiom of Infinity: There exists a set having infinitely many members (e.g. one containing all the natural numbers: 0 = {}, 1 = {0} = {{}}, 2 = {0,1} = {{},{{}}}, and so on).
7. (3) Axiom of Separation (Axiom of Specification): Given a set A and a condition, the elements of A satisfying that condition themselves form a set.
8. (6) Axiom of Replacement: The image of a set under any definable function is also a set.
9. (2) Axiom of Regularity (Axiom of Foundation): Every non-empty set X contains a member Y such that X and Y are disjoint sets.
10. (9) Axiom of Choice (Well-ordering Theorem): Let X be a set whose members are all non-empty.  Then there exists a choice function f from X to the union of the members of X such that for all Y in X, f(Y) is in Y.
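For a finite family of sets, no axiom is needed: a choice function can be written down directly, which is why the controversy only bites for infinite families.  A minimal Python sketch (the function name and the use of min as the picking rule are my own choices) illustrates the defining property that f(Y) is in Y:

```python
# For a FINITE family of non-empty sets, a choice function can be
# constructed explicitly -- no Axiom of Choice required.  Each set Y
# in the family is simply mapped to one of its own elements.

def choice_function(family):
    """Return a dict f with f[Y] in Y for every non-empty Y in family."""
    f = {}
    for Y in family:
        if not Y:
            raise ValueError("every member of the family must be non-empty")
        f[Y] = min(Y)   # any deterministic pick works for a finite set
    return f

# frozenset is used because dict keys must be hashable.
family = [frozenset({1, 2}), frozenset({2, 3}), frozenset({7})]
f = choice_function(family)
for Y in family:
    assert f[Y] in Y   # the defining property of a choice function
```

The axiom asserts that such an f exists even when the family is infinite and no rule like min is available.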

On the face of it, the Axiom of Choice may seem as technically innocuous as its fellow axioms and, indeed, when applied to finite sets it can be deduced from the other axioms.  Only when dealing with infinite sets does the Axiom of Choice become distinct, and this is where the controversy arises.

Before discussing that point, it is worthwhile recalling that infinite sets are not strictly the province of the erudite mathematician.  Much of physics is built on the idea of an infinite set representing a physical process (e.g. Fourier series).  And infinite sets, or more properly their avoidance, were used in classical philosophy for all sorts of arguments (e.g. Aquinas’s arguments for the existence of God).  So, the implications of the ZFC system may range further than a simple technical point on the correctness of infinite sets.

That said, the controversy surrounding the Axiom of Choice is both serious and interesting.  The first problem the Axiom of Choice presents is that while it tells us there is a way of selecting a member from each set, it doesn’t tell us how.  In other terms, the Axiom of Choice is not constructive and, so, we are left saying that we can choose but we don’t know how the choice is made or even, sometimes, what is chosen.  It is as if a miracle happened.

A direct consequence of this miracle, as discussed in the PBS Infinite Series video, is that the Axiom of Choice can produce non-measurable sets from perfectly measurable ones.  As shown in the video, a set S has, by one argument, a measure between 1 and 3 but, by another argument, a measure that must be zero.  So, S is a non-measurable set.

Even more disturbing is the Banach-Tarski Paradox.  The essence of this paradox is that a 3-dimensional ball can be taken apart into a finite number of disjoint pieces, and these pieces can then be reassembled to make two balls of the same size as the original.  We are supposed to be comforted by the fact that, in this process, the disjoint pieces are non-measurable sets.  Details explaining the process are found in the Vsauce video linked below.

I’ll close by saying that, after really thinking about the Axiom of Choice and all of the controversy that comes in its wake, I now have a better idea of the mix of feelings of that academic mentioned at the beginning.  I must also confess a deep concern with mathematical logic applied to infinite sets.  Unfortunately, that concern seems to have no immediate mitigation, which means researchers and, for that matter, the whole human race still have a long way to go in understanding infinity.

The Three Acts of the Mind

The last post presented the concept of a hybrid AI or, perhaps more correctly said, a hybrid intelligent system which mixed and matched various tools, developed in computer science, to mimic the Three Acts of the Mind.  The idea is that to best mimic the intelligent behavior of humans one must understand what is being mimicked (the human mind) so that one may most closely align one’s mimicry (machine behavior) to the object being mimicked.  Of course, the last sentence is a bit fanciful in its language but its message is quite serious – to best duplicate the behavior of the human mind first figure out what is going on within the human mind.  And that is the point of this post.

To be clearer, neither the dark corners of the human mind, explored by Freud, nor the nearly inexplicable connections between minds and society at large, posited by Jung, nor the red hot passions presented in oh so many of the Greek tragedies are being pursued here.  Only that smallest sliver of human thought, the rational and ordered part, is being discussed, because this is the only part that can perhaps be mimicked by a machine.  And the description of the rational mind that will be employed is the Three Acts of the Mind.

The go-to reference on this is Peter Kreeft’s Socratic Logic.  The entire textbook is built around the Three Acts of the Mind and concentrates on classical, Aristotelian logic rather than the usual symbolic propositional logic (more on this in a future post).  Despite the fact that the book boasts nearly 400 pages of content, the basic principles are summarized in a few pages in Section 5.  A condensed form of that summary is presented below.

The situation is bleaker when considering online references.  There isn’t a large amount of material to be found on the web concerning the Three Acts of the Mind.  One possible explanation is that the Three Acts are so obvious as first principles that there really isn’t anything to say about them.  But careful reflection should lead one to realize that this argument is vacuous, since it is the first principles of any philosophy that always receive the most scrutiny because they can’t be proved; they must either be accepted or rejected at face value.  Anyway, for those who can’t obtain Kreeft’s book, Dr. Christopher Anadale from Mount Saint Mary’s provides a nice, succinct summary.

Both Kreeft and Anadale agree on the basic observations of human thought that underpin the Three Acts, namely that:

  1. Human beings think
  2. Human thought has structure
  3. That structure is objective and knowable

These three assumptions are the hunting license that allows us to go off and categorize the acts of the human mind.  Briefly stated the Three Acts are (in order from least to most complex):

Act of Understanding – grasp one object of thought,

Act of Judgement – combine two objects of thought into a proposition with a subject and predicate,

Act of Reasoning – combine two or more propositions into a reasoned argument producing a conclusion.

Several things should be noted for each act.  For the First Act of Understanding, it is important that the term or concept be clearly defined.  Nothing obscures human communication so much as two individuals using the same word to mean different things, and nothing is so tricky as a logical argument where the meaning of a term changes somewhere between the beginning and the end.  For the Second Act of Judgement there is little to be said in general.  The key point here is being able to see whether a proposition is true or false, and this is a difficult task with no fixed rules for figuring out the truth value of a claim.  For the Third Act of Reasoning, deductive arguments can be algorithmically determined to be valid or invalid.  Inductive arguments can also be analyzed, but only up to a degree of probability, since an inductive argument can only render conclusions that are true within a confidence interval.  It should also be noted that the Act of Reasoning, with its arguments and conclusions, is the activity that actually produces new knowledge, conjectures, hypotheses, and actions.
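The claim that deductive arguments can be checked algorithmically is easy to illustrate for propositional logic: a brute-force truth-table search settles validity.  The Python sketch below (the helper names are my own, not Kreeft’s) verifies modus ponens and rejects the classic fallacy of affirming the consequent:

```python
from itertools import product

# Brute-force truth-table check: an argument is valid iff every
# assignment that makes all the premises true also makes the
# conclusion true.
def is_valid(premises, conclusion, variables):
    for values in product([False, True], repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(p(env) for p in premises) and not conclusion(env):
            return False   # found a counterexample assignment
    return True

# Modus ponens: from P and P -> Q, conclude Q.  (Valid.)
premises = [lambda e: e["P"], lambda e: (not e["P"]) or e["Q"]]
assert is_valid(premises, lambda e: e["Q"], ["P", "Q"])

# Affirming the consequent: from Q and P -> Q, conclude P.  (Invalid.)
premises2 = [lambda e: e["Q"], lambda e: (not e["P"]) or e["Q"]]
assert not is_valid(premises2, lambda e: e["P"], ["P", "Q"])
```

No comparable mechanical test exists for judging whether an individual premise is true, which is exactly the point made about the Second Act.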

Now, the Three Acts can be a bit obscure when stated in an abstract way, so the following table, adapted from a table in Kreeft’s book, should help.  Note that some rows have been omitted, simply for brevity, and that a new row, proposing a possible machine equivalent, has been added.

1st Act – Understanding | 2nd Act – Judgement | 3rd Act – Reasoning
Logical Expression: Term | Proposition | Argument
Linguistic Expression: Word or phrase | Declarative sentence | Paragraph
Example: “book” or “football game” | “All books have pages with words.”  “A football game is a sport.” | “All books have pages with words.  Too Many Cooks has pages with words.  Therefore, Too Many Cooks is a book.”
Structural Parts: None | Subject term and predicate term | Premises (propositions) and a conclusion (also a proposition)
Question Answered: What it is | Whether it is | Why it is
Aspect of Reality: Essence | Existence | Cause
Good When: Clear (unambiguous) | True | Valid
Bad When: Unclear (ambiguous) | False | Invalid
Possible Machine Equivalent: Recognized percept | Classification | Syllogism and action

Of course, there is a lot more to be said about the Three Acts but the above sets the generally agreed upon framework from which other analyses spring.

Hybrid AI

There is certainly a lot of excitement throughout the tech community about the promise of artificial intelligence or AI, as it is more commonly known.  And while many of the advances are impressive compared to where computer science was only a decade ago there is a lot more hype than fact in many of the more outlandish claims being made.  Skynet from the Terminator Movies or the Machines from the Matrix are not soon to take over nor are they likely to do so for many generations.  However, there is a distinct possibility that AI may actually be able to make competent decisions in the near future but only if the community takes a broader focus than it is apparently taking right now.

Now some may object that AI is enabling all sorts of important activities that would not otherwise work.  Reports of scientific discoveries, business approaches, and computational improvements abound, so why can’t one conclude that AI is making competent decisions now?  The crux of the matter is the definition of what AI is and does, and what competent means.  To flesh out this argument, let’s take a small digression to discuss what is commonly meant by AI and what it is capable of doing now.

Many modern books on AI and a spate of YouTube videos extol the recent advances in AI, with particular attention on algorithms like convolutional neural nets and the vast improvements they offer in image classification, or on support vector machines or k-means++ algorithms and the power they offer in clustering data.  For example, one of my favorite videos on the subject is But what is a Neural Network? by 3Blue1Brown.

Grant Sanderson (the voice and vision behind 3Blue1Brown) starts off his video by discussing the remarkable functioning of the human visual cortex, which can recognize all sorts of different renderings of the number 3.  For example, each of the following glyphs shows the string “3” printed in a different font.

Most people (and their remarkable visual system) can tell that each character is a different rendering of the same number.  And, as Grant details in his video, convolutional neural networks seem to be able to perform the same recognition. 

He moves on to discussing how a neural net can be used to encode similar kinds of pattern recognition for a machine, allowing it to recognize edges, loops, and so on, and how within its multiple layers can be found the ability to report, with very high levels of certainty, that each rendering corresponds to the same underlying digit.

This is quite a step forward in machine vision but does it really constitute artificial intelligence or decision making?  Sure, the algorithms can comb through vast amounts of data looking for a prescribed pattern and can ‘decide’ when they have found a candidate.  And these results should not be surprising because, after all, neural nets were designed to mimic the human cortex, and who is to say that the training the net receives and the way it decomposes images into parts doesn’t mimic what is done in the brain?

Despite those arguments, it is philosophically hard to say that the AI can even come close to making competent decisions. 

There are two reasons for this assertion, one technical and one philosophical.  On the technical front, the best operative definition of artificial intelligence is that of the rational agent taken from Artificial Intelligence: A Modern Approach by Russell and Norvig.  Under this definition, the machine must not only recognize the desired pattern (e.g. deciding that it sees a “3”) but must also perform an appropriate action based on that recognition.

To understand how a rational agent takes the appropriate action, Russell and Norvig assert that a rational agent receives a stimulus, called a percept (in the case above, the pixel values of an image of the rendering of “3”), and then acts so as to maximize the value of some performance metric based on the percept, the sequence of percepts up to that point (i.e. some notion of memory), its knowledge about its environment, and the rules that it has for actuating whatever actions it can take.
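As a rough illustration of this definition, here is a toy Python sketch of such an agent loop.  Every name and the toy scoring rule are my own inventions, not code from Russell and Norvig; the point is only that the agent keeps a percept history and picks the action that maximizes its performance metric:

```python
# Minimal sketch of the rational-agent loop: percept in, best-scoring
# action out, with the percept history retained as memory.

class RationalAgent:
    def __init__(self, actions, score):
        self.actions = actions          # the actions the agent can take
        self.score = score              # performance metric
        self.history = []               # sequence of percepts (memory)

    def act(self, percept):
        self.history.append(percept)
        # choose the action with the best score, given the current
        # percept and the whole percept history
        return max(self.actions,
                   key=lambda a: self.score(a, percept, self.history))

# Toy environment: percepts are digits the agent "sees"; the agent
# announces a digit only once it has seen it twice, else asks again.
def score(action, percept, history):
    if action == "announce":
        return 1.0 if history.count(percept) >= 2 else 0.0
    return 0.5                          # "ask again" is the safe default

agent = RationalAgent(["announce", "ask_again"], score)
print(agent.act("3"))   # -> ask_again  (seen only once)
print(agent.act("3"))   # -> announce   (memory makes it confident)
```

Note that the second call gives a different answer than the first purely because of the stored percept history, which is exactly what simple pattern matchers lack.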

Simply pattern matching and ‘deciding’ that the pattern is either seen or not doesn’t really meet the definition of a rational agent.  For example, no sequence of previous percepts (excepting the initial training) is used by the machine learning techniques currently being enthusiastically pursued.  The current systems aren’t capable of continuous adjustment as situations change.  To pattern match a “3”, the system needs to find two half-loops stacked one upon the other.  If, gradually, the adoption of other styles, say the Roman numeral III, became popular in the percept stream, the net would be stymied.

In addition, it is difficult to call a binary sorting a ‘real decision’.  Certainly, it is useful to have such a sorting algorithm that can look at vast amounts of ‘noisy’ data and point out the parts that are of most interest but the judgement of what is of interest still sits with the human element.   And, to be sure, this is an important step forward, but it doesn’t really constitute ‘learning’ or ‘rational’ in the human sense.

And this brings up the second point.  Traditional philosophy recognizes Three Acts of the Mind.  Roughly the three acts break down as follows (using our “3” once again).  First Act:  the system recognizes the “3” in a sentence.  Second Act:  the system recognizes the meaning of the sentence “Start the process at 3.” Third Act:  the system can reason through chaining of statements to answer “Why didn’t the process start at 3?”.   At best, what has been accomplished corresponds to the initial, baby steps into the First Act.

For a system to really exhibit some semblance of rationality, it must mimic the Three Acts; there needs to be a hierarchy of different types of agents all working together.  A possible example of this would be a system as pictured below:

At the bottom would be some set of agents to perform the First Act, perhaps two differently trained convolutional neural nets, or a convolutional neural net and a support vector machine, or whatever other combination one could imagine.  This layer, like the others, should be tool agnostic, focusing on what should be done, not how.  The second layer might have an expert system to interpret the percepts from the lower layer within the context that the system finds itself in (effectively answering Russell and Norvig’s requirement that the rational agent know its percept history and its environment).  This layer should also have some way of swapping the tools in the lowest layer or adjusting their operating parameters, allowing it to change how it looks at a problem.  Perhaps an Analytical Hierarchy Process or a Genetic Algorithm could be employed to weight the performance of the tools.  Finally, in the third layer should be some set of tools for chaining the results from the second layer so that rational decisions can be made.  Here the tool set is far more speculative.  Perhaps a different kind of expert system, or AHP, or, perhaps, an A* algorithm could be used.  It really doesn’t matter what the tools are but rather what they do.  This seems to be the only blueprint for achieving a real AI.
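The three-layer scheme just described can be sketched in code.  Everything here is hypothetical: the classes and the stand-in ‘recognizers’ are placeholders for real nets, expert systems, and planners, and only the shape of the data flow between layers is meant seriously:

```python
# Speculative sketch of the three-layer hybrid system: perception
# agents feed an interpreting layer, which feeds a reasoning layer
# that chains the results into a decision.

class PerceptionLayer:                      # First Act: recognize
    def __init__(self, recognizers):
        self.recognizers = recognizers      # e.g. two differently trained nets
    def perceive(self, raw):
        return [r(raw) for r in self.recognizers]

class InterpretationLayer:                  # Second Act: judge in context
    def __init__(self):
        self.history = []                   # percept history = memory of environment
    def interpret(self, percepts):
        self.history.append(percepts)
        # naive majority vote stands in for an expert system
        return max(set(percepts), key=percepts.count)

class ReasoningLayer:                       # Third Act: chain to a decision
    def __init__(self, rules):
        self.rules = rules                  # {judgement: action}
    def decide(self, judgement):
        return self.rules.get(judgement, "defer_to_human")

# Wire the layers together with trivial stand-in recognizers.
perception = PerceptionLayer([lambda raw: raw.strip(),
                              lambda raw: raw.lower().strip()])
interpretation = InterpretationLayer()
reasoning = ReasoningLayer({"3": "start_process_at_3"})

percepts = perception.perceive(" 3 ")
decision = reasoning.decide(interpretation.interpret(percepts))
print(decision)   # -> start_process_at_3
```

The tool-agnostic point survives even in this toy: any layer’s internals can be swapped so long as its inputs and outputs keep the same shape.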

 

Pseudoscience and Science?

 

The roots of this post date back over three decades.  Shortly after I started full-time employment and had some money to spend, I decided to get a subscription to Scientific American.  At the time, I thought that it was an excellent publication devoted to pure science.  With time I began to see the business side of it, the need to generate ‘science excitement’ even when there was perhaps none.  Over the years, I became much more savvy in seeing the ebb and flow, the highs and the lows of their editorial policy.  And I learned as much, or even more, from the bad treatments as I did from the good ones, since the former challenge one’s ability to identify, isolate, and explain what went wrong.

One particular low, which caused me to let my subscription lapse, was a provocative piece on pseudoscience.  In principle, I have no beef with anyone taking sloppy thinking and evidence-poor arguments to task.  I have had, on many occasions, the need to roll up my intellectual sleeves and rip apart poor scientific logic, whether it was someone else’s (I’ll discuss a particularly interesting example later on) or my own.  But that’s not what the author did.  Rather than focus on the ‘truth content’ of the claims, he focused on the ‘stupidity’ of the ideas and made light of those who held them and I have a big problem with that.

As a typical example, the author mocked the concept of alien abduction.  You may be remarking to yourself that alien abduction should be mocked; just look at the people who assert that they are victims.  But we shouldn’t be mocking people in general, nor should we mock an idea, no matter how far-fetched.  The assertion that people are taken off the earth and placed on alien spacecraft and are subjected to who knows what else is neither a scientific statement nor an unscientific one; it is ascientific.

Admittedly, the word ‘ascientific’ is one that I’ve coined for this discussion but I think it serves the need well.  No simple statement that can be uttered, such as ‘the sky is blue’ or ‘the sky is red’, can have any scientific content, merely truth content.  We add the ‘science’ to these assertions by the way we investigate the claim made in the statement.  In other words, the word science should mean the process by which we add a truth content to an assertion not the truth of the assertion itself.

The ‘science’ begins with the very obvious but often forgotten steps of formulating a question and then gathering what evidence exists.  (The Wikipedia article on the scientific method is amazingly detailed and explanatory on these points as well as the others that span the general principles of the method.)  In the case of alien abductions, we start with the claim at face value and look at what evidence exists.  Admittedly, the evidence that its proponents muster is the worst kind of evidence, riddled with eyewitness testimony and scant bits of circumstantial evidence, and no reasonable analysis would credit it.

So, based on the evidence, we can conclude that there is no support for the statement ‘alien abductions take place’ and that the truth content of any claim of victimhood is low.  But that doesn’t mean we have disproved the statement as a whole, any more than Europeans of nearly 350 years ago could have proved the statement ‘some swans are black’ to be false.  The tests we perform merely set a limit on the likelihood of the particular assertions that have been made.

To be more concrete, I’ll draw from personal experience and talk about two of my encounters with people who believed in the explanatory and predictive power of astrology.  Both of these encounters occurred when I worked for a small technology firm, and the people involved were highly trained STEM professionals (oddly enough, they didn’t know each other).  Once the individual in question brought up their belief in astrology, I thought I would put the power of the system to a test.  Rather than reveal my ‘star sign’ and get a stream of vague statements that could be interpreted in just about any way, I asked the person to predict my astrological sign given what they knew about my behavior, attitudes, and habits.  In both cases, the adherent started by suggesting what I might be and eventually ruled out the actual sign under which I was born.  They might have had better odds simply by spinning a zodiac spinner.

In both of these areas (alien abduction and astrological predictions), we are on firm ground saying that, since we’ve examined a large number of such claims and found little to no evidence supporting them, it is unlikely we will ever find anyone with a credible set of facts that support these claims.  But we need to recognize that: 1) our conclusions are not deductive proof but merely the product of statistical inference, and 2) the individuals making these claims are real people, and we must treat them with rudimentary respect if we ever hope to persuade them that these positions are very likely wrong.

Now some might suggest that I’m splitting hairs with this fine distinction between the claim and the truth of the claim.  What practical implications could warrant such a careful treatment?  One prominent example springs to mind – the debate over whether to vaccinate.

Before digging in, let me state categorically that, in my opinion, the benefits of vaccines outweigh their risks and I endorse getting children vaccinated.  But I am not ignorant of the fact that in the current debate there are actually two pseudoscience positions.  The first is the obvious position taken in this debate by the anti-vaccine crowd, best attested to by the recent and ongoing outbreak of measles in the US.  The other pseudoscience position is held by the medical professionals who assert that vaccines are perfectly safe by citing ‘medical studies’.

Much of what passes for medical studies, or statistical studies in most disciplines for that matter, themselves border on pseudoscience.  As William Wilson so nicely puts it in his opening paragraph in the May 2016 article of First Things: “The problem with science is that so much of it simply isn’t.”  In a nutshell, his points are that most studies are unreproducible (and therefore unfalsifiable) and are built on the flimsiest of statistical interpretation.

The public is aware of these problems even if they are unable to articulate their concerns precisely in a way that would make a philosopher happy.  Therefore, it is hardly a surprise when they hold pseudoscientific positions that cause unneeded misery and death.  The science community needs to put its own house in order by raising standards on publication before turning its critical eye somewhere else.

Teaching a Machine to Ghoti

English is a notoriously difficult language to spell and pronounce.  If it challenges the organic learner’s ability to grasp and communicate, it is more than doubly a source of frustration for the programmer trying to teach bits of silicon, metal, and plastic to appear that they speak naturally.

A famous demonstration of the problems readily encountered in today’s lingua franca, attributed to William Ollier Jr., is the spelling of fish as “ghoti”.  The proper pronunciation of this awkward looking sequence of letters strikes many people as something like “goatey”, an adjective that could be used to describe a thing with the essence or behavior of the goat.  But Ollier argues for the fish phonetic as follows:

  • the letters ‘gh’ are to be pronounced as they are in the word enough
  • the letter ‘o’ is to be pronounced as it is in the word women
  • the letters ‘ti’ are to be pronounced as they are in the word nation.

Short, simple, and eminently confusing!

Ollier’s first donor word comes from a delightfully ridiculous family of words all sporting the ‘ough’ combination of letters and all having subtly or radically different pronunciations.  Consider the following list:

  • bough – a part of a tree; pronounced as [bou]
  • bought – the past tense of to buy; pronounce as [bôt]
  • cough – an action of the mouth and lungs; pronounced as [käf]
  • dough – a mixture of water and flour used in baking; pronounced as [dō]
  • enough – just the right amount of a required thing; pronounced [iˈnəf]
  • through – to pass into and beyond; pronounced [THro͞o]

This list contains six distinct pronunciations for the same four-letter combination, representing an astonishing rate of one and a half distinct pronunciations per letter, most likely the greatest variability in the English language, possibly the world.  And it is by no means exhaustive (even if putting it together exhausted the writer).  And there are a host of other ‘ough’ words that didn’t make it onto the list simply because they offered nothing new.

For example, consider that fought, ought, sought, and thought all rhyme with bought and bring nothing new to the list, even if they bring headaches to people learning the English tongue.  Similarly, the ‘ough’ in drought sounds the same as it does in bough, despite the fact that trees need water (or perhaps because of it).  Likewise, though’s similarity to dough consigns it to a mere mention in this paragraph rather than a place of honor (or is it shame) within the bullets above.  And it is tough that tough and rough weren’t unique enough to make the cut.
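The six-way list above can be recast as the kind of exception table a text-to-speech rule might consult.  The phonetic strings below simply follow the informal notation used in the bullets; the dictionary name is my own:

```python
# The 'ough' family as a lookup table: six entries, six distinct
# sounds for one and the same four-letter combination.

OUGH_PRONUNCIATIONS = {
    "bough":   "ou",
    "bought":  "ô",
    "cough":   "äf",
    "dough":   "ō",
    "enough":  "əf",
    "through": "o͞o",
}

# Every value is distinct -- no rule can be derived from the spelling
# alone, so a speech system must memorize each word as an exception.
assert len(set(OUGH_PRONUNCIATIONS.values())) == 6
```

Words like fought or though would be handled by mapping them onto an existing entry (bought and dough respectively), which is precisely why they add nothing new to the list.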

But perhaps the most interesting no-show on the list is slough.  This word is two-faced having two quite different sounds depending on whether it is a noun or a verb:

  • slough – a swamp or mire; pronounced [slou, slo͞o]
  • slough – to shed or cast off; pronounced [sləf]

This example is, by no means, unique.  English is just brimming with curious words that lie in wait trying to trick the speaker.  There are the heterographs, which have the same pronunciation but different spellings and meanings.  A particularly sinister example is the set of to, too, and two.  There are the homonyms, which share both the same pronunciation and spelling but also mean different things.  The clause, the rose rose to glory in the garbage dump, is a fine example.  Together the heterographs and the homonyms form the set of homophones: words that are pronounced the same but which mean different things (regardless of spelling).  Homonyms are also often called homographs, thus serving to bring us to the next category, the synonyms: words with different spellings and pronunciations but which have the same meaning.  Two closely related categories that seem to have no official designation are the spelling variants and the speaking variants.  Into the former go all those weird words like color and colour, saber and sabre, and normalize and normalise.  In the latter category, one finds words like often, in which the speaker may include or omit the ‘t’ sound.

All of these categories can cause problems for speakers, natural and artificial though they may be.  But no category seems to engender as much confusion and consternation as the heteronyms.  Classic examples in this category include words like bass, minute, and tear, which are discussed below.

Together the heteronyms and homonyms make up the homographs; words with different meanings but the same spelling (regardless of pronunciation).  The following Venn diagram (based on the one in the Wikipedia references) helps one keep score.

Heteronyms are particularly problematic for computer-generated readings of the written word because context changes the pronunciation and meaning in ways for which it seems hard to find solid rules that work in all instances.

Consider the ridiculous sentence:

The bass played the bass to the applause of the crowd.

Both instances of bass are nouns, but it seems clear that the musical instrument can’t be the one doing the playing, so maybe a programmer can write a rule to account for this case or, since it is unlikely that the previous sentence will show up in anything worth writing specialized code to handle, maybe it is ignored altogether.

The sentence:

To a man with too many things to do a minute is a minute fraction of time.

may be an entirely different story.  There is a reasonable chance that a sentence containing both the noun and adjective forms of minute will be written, and the word order in the sentence is unlikely to indicate which is which in nearly all cases.  Still, the local association of the article a just before the noun form and the occurrence of the noun fraction just after the adjective form may be enough of a pattern to write a rule.
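That proposed local-context rule can be sketched directly.  The function name and the informal phonetic spellings are my own stand-ins; the logic is just the pattern described above:

```python
# Local-context rule for 'minute': the noun 'fraction' just after
# suggests the adjective [my-NOOT]; an article just before suggests
# the noun [MIN-it].

def pronounce_minute(words, i):
    """Guess the pronunciation of words[i] == 'minute' from its neighbors."""
    before = words[i - 1].lower() if i > 0 else ""
    after = words[i + 1].lower() if i + 1 < len(words) else ""
    if after == "fraction":
        return "my-NOOT"          # adjective: "a minute fraction"
    if before in ("a", "the"):
        return "MIN-it"           # noun: "a minute" of time
    return "MIN-it"               # default to the commoner noun form

sentence = ("To a man with too many things to do "
            "a minute is a minute fraction of time").split()
readings = [pronounce_minute(sentence, i)
            for i, w in enumerate(sentence) if w == "minute"]
print(readings)   # -> ['MIN-it', 'my-NOOT']
```

Note that the 'fraction' check must come first, since both occurrences are preceded by the article a; word order alone would mislead the rule.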

And then there is the voluminous list of noun-verb heteronyms, a sample of which (drawn from a full list of two-syllable examples) is given here:

  • Address and address
  • Bow and bow
  • Buffet and buffet
  • Desert and desert
  • Dove and dove
  • Lead and lead
  • Present and present
  • Project and project
  • Row and row
  • Slough and slough
  • Tear and tear
  • Wind and wind

Writing rules for a machine to naturally speak any sentence with any of these is difficult.  In some cases the word order makes it easier. For example:

He will bow when he receives the bow.

guarantees that the subject (he) will be followed by the verb form of bow rather than the noun form.

But other sentences are not so obvious.  Consider these sentences involving the very treacherous word tear, which is a sort of palindrome of noun-verb heteronyms, since both the [ter] and the [tir] forms can each be a noun and a verb.

He will tear open the screen to let in the breeze that will cause him to have a tear in his eye.

and

His will open a tear in the screen to let in the breeze that will cause his eye to tear.

Definitely more difficult, but perhaps doable by scanning the sentence for the helper verbs like will and to.
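That helper-word scan can be prototyped in a few lines.  The sketch below is a deliberately naive toy (the function name tag_tear and the tiny helper list are my own inventions, not part of any real text-to-speech system); it tags an occurrence of tear as a verb when it immediately follows will or to, and as a noun otherwise:

```python
# Naive heteronym tagger: classify each occurrence of "tear" as a verb
# when it directly follows a helper word ("will", "to"), else as a noun.
# This is a toy rule, not a production part-of-speech tagger.
HELPERS = {"will", "to"}

def tag_tear(sentence):
    words = sentence.lower().replace(".", "").split()
    tags = []
    for i, w in enumerate(words):
        if w == "tear":
            if i > 0 and words[i - 1] in HELPERS:
                tags.append("verb")   # e.g. "will tear", "to tear"
            else:
                tags.append("noun")   # e.g. "a tear"
    return tags
```

Applied to the two tear sentences above, the rule sorts the verb and noun uses correctly, but a rule this crude would obviously misfire as soon as a helper word serves some other grammatical role.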

But the fun doesn’t stop there.  Consider the following command.

Give the address.

Is this a command telling an unidentified person to hand over where he lives, or to give a speech to an audience?  There is simply no way of knowing without comprehending the sentences around it.

And there you have it, a brief but bewildering dip into what makes learning English a tricky enterprise.  A relatively small set of rules may cover a majority of sentences encountered, but to reach fluency an enormous number of special cases and exceptions must be mastered, some of which require a non-local analysis of the text to ensure correctness.  And while teaching a machine to read and speak English aloud is definitely a noble goal for the computer scientist (and especially beneficial for the seeing-impaired), it is one that is likely to come with a whole host of frustrations for years to come.  It is amazing that any of us can communicate with each other at all.

Real Reality

The year was 1995.  A short-lived and largely unnoticed television show premiered on Fox from March to May of that year.  The show centered on a young woman named Sydney Bloom (played by Lori Singer, who also appeared as Cassandra in Warlock), whose father bequeathed her a strange ability.  She was able to enter VR.5, the fifth level of virtual reality, which consisted of her entering with total perception into a cyberspace world whose realism matched our own day-to-day reality.

The premise and promise of the show was that a fully immersive digital world was just around the corner.  All that was needed was the familiar trappings of eye-gear, power-glove, computer, and cheesy special effects.  Those modest ingredients allowed her to totally connect with a different reality or, at least, another facet of our reality.

Just a scant three years earlier, the entertainment industry subjected the film-going public to the cringe-worthy spectacle that was the horror movie The Lawnmower Man.  Loosely based on a Stephen King short story, the film follows the tragedy that ensues when a scientist (played by Pierce Brosnan) attempts to boost the brain power of a developmentally challenged man (played by Jeff Fahey) through a combined use of drugs and virtual reality.

Here we see a fairly early representation of tactile suits that are designed to simulate the sense of touch, as well as the concept of an avatar to represent one’s persona in the virtual world and the recurring theme of megalomaniacal behavior brought on by unfettered cyber-license.

Without a doubt, the most esteemed member of all virtual reality stories must be The Matrix.

Premiering in 1999 under the shadow of The Phantom Menace, the first Star Wars movie in over 15 years, The Matrix was largely overlooked in the months leading up to its release, but the public reception of its storyline and the philosophical questions that it raised made it a classic of American film.  It set the bar for what virtual reality would mean in the science fiction genre, and the tropes, styles, and themes that the entire trilogy used continue to shape VR storytelling to this day.

During the 1990s, it seemed that everywhere you looked the concept of a totally encompassing virtual reality was a narrative staple in almost every type of tale.  Even an entire episode of Murder, She Wrote was devoted to it as a vital clue to solving a crime.  And it isn’t hard to see why VR showed up as a story element in so many places; the concept was flashy and popular, and the alternate reality it represents is, at its heart, exactly what storytelling is about – exploring the ‘what ifs’ and ‘what may have beens’ of life.

But the actualization of VR in real life has fallen far short of the promise of these tales.  As with stories involving space travel and artificial intelligence before it, it is far easier to imagine a far-off future built of our hopes than it is to realize the more modest goals actually within reach.  All of these are instances of intoxicating fantasies in which the mind constructs an idealistic and totally unachievable future by completely disregarding cold, hard facts.

Fortunately, there are realists who are quite willing to redirect ideas and effort toward a future that is not only achievable but actually useful.  Space travel doesn’t take us boldly where no man has gone before, but it does let us examine land usage, water pollution, and habitat change from a global vantage point.  It connects distant points on the globe, helps us remain found (rather than getting lost), and makes instantaneous communications a reality.  AI no longer means machine sentience, with all the dreams of utopia and the sinister overtones of dystopia, but rather smart agents that act in specific circumstances and within a well-defined realm of applicability faster and more reliably than a human can.  Modern AI techniques help us make smarter use of limited resources, connect disparate things together to make informed decisions, and manage the ever-increasing scale of modern technological life.

Likewise, it seems that VR is moving away from the pipe dream of total immersion and towards a future where the digital supplements the actual.  A panel I attended at Dragon Con a few years back presented an excellent application of augmented reality.  Imagine you are a network engineer and there is a problem with an onsite telecommunications system of one of your customers.   Despite your expertise, qualifications, and certifications, if you’ve never seen the client’s telecom closet, which may look a lot like this,

it will take you some time just to grasp where each blue cable is going to or coming from before you can even begin to troubleshoot which ones are at the root of the problem.  The network topology may be correctly documented on the appropriate diagrams, but wading through all the details, trying to match up some lines on a piece of paper to a real-world length of cable in order to find the one circuit that you believe is the culprit, is a long, tedious process.  Now suppose that instead of a static network diagram, you are able to don VR goggles that overlay portions of the network diagram on top of the image of the real thing in front of your eyes.  Suddenly you have a live, heads-up display to assist your navigation of the network design that rivals the best tutorials that videogames have to offer.  This use of VR is so promising that Koch Industries is touting it in their newest press release.

There are other equally humdrum but interesting applications.  One that is particularly near and dear to my heart is the use of VR to explore complicated 3D data.  I had a recent occasion to look at the tangled magnetic field patterns that arise in certain geophysical situations, and a few minutes in VR space did a lot more for my understanding than months of looking at 2D slices and projections.

So, maybe there is a future after all for VR, just not the one that futurists and science fiction writers envisioned.  Never would have seen that coming.

Chaos Game Part 4 – The Game as Art

This month’s installment will be the final column of this series on the chaos game.  As befits any swan song, this ending should be artistic and dramatic and awe-inspiring.  Unfortunately, the chaos game isn’t really dramatic or awe-inspiring – at least for most people – but it can be quite artistic, producing patterns that are beautiful and pleasing to the eye.  So, this article will mostly be a tour of our local digital museum, displaying the various works of art that one can produce using the code heretofore presented, either as-is or with some minor modifications.

The first entry in our tour is the Barnsley fern, a self-similar structure invented by Michael Barnsley to resemble the black spleenwort fern.  There are four entries in the set of affine mappings that comprise the game:

barnsley_fern = np.array([[0.01,   0,      0,    0,0.16,0,   0],
                          [0.86, 0.85,  0.04,-0.04,0.85,0,1.60],
                          [0.93, 0.20, -0.26, 0.23,0.22,0,1.60],
                          [1.00,-0.15,  0.28, 0.26,0.24,0,0.44]])

When run through the chaos game, a remarkably pleasing fern frond is found (alliteration, a sure sign of class).

Surprisingly, one of the affine mappings in the table has only a 1 percent chance of being selected. Modification of the first entry in the table of affine mappings from 0.01 to 0.2,

barnsley_fern_2 = np.array([[0.20,   0,      0,    0,0.16,0,   0],
                            [0.86, 0.85,  0.04,-0.04,0.85,0,1.60],
                            [0.93, 0.20, -0.26, 0.23,0.22,0,1.60],
                            [1.00,-0.15,  0.28, 0.26,0.24,0,0.44]])

thus increasing the probability of using the first mapping at the expense of the second, results in a slightly less luxuriant, or mangier, but nonetheless quite similar frond.

This result strongly supports the conclusion from the last post that the chaos game operating on this set of affine maps is quite stable against changes.  Play of this kind also opens the door to the idea of a meta chaos game where the exact set of affine maps used at any iteration is either subjected to small random variations or is picked from a larger set.  In the latter case, one can imagine a meta algorithm selecting between the barnsley_fern and barnsley_fern_2 tables for each frond as part of a larger plant or switching between the two at random.  Perhaps this type of numerical experiment will form the subject of a future column.
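As a rough sketch of what such a meta game might look like (this is my own illustrative code, not part of the series so far; the coin-flip between tables and the step helper are assumptions), each iteration first picks which table is in force and then takes an ordinary chaos-game step:

```python
import numpy as np

# Meta chaos game sketch: at each iteration, a coin flip decides which
# table of affine maps (the original fern or the modified one) drives the step.
barnsley_fern = np.array([[0.01,    0,     0,     0, 0.16, 0,    0],
                          [0.86,  0.85,  0.04, -0.04, 0.85, 0, 1.60],
                          [0.93,  0.20, -0.26,  0.23, 0.22, 0, 1.60],
                          [1.00, -0.15,  0.28,  0.26, 0.24, 0, 0.44]])
barnsley_fern_2 = barnsley_fern.copy()
barnsley_fern_2[0, 0] = 0.20   # boost the first map's selection probability

def step(point, table, r):
    # Same iteration rule as the Transform function from Part 1:
    # find the first row whose cumulative probability exceeds r, then apply it.
    N = np.searchsorted(table[:, 0], r)
    x = table[N, 1]*point[0] + table[N, 2]*point[1] + table[N, 5]
    y = table[N, 3]*point[0] + table[N, 4]*point[1] + table[N, 6]
    return np.array([x, y])

rng = np.random.default_rng(0)
pt = np.zeros(2)
pts = np.zeros((10000, 2))
for i in range(len(pts)):
    table = barnsley_fern if rng.random() < 0.5 else barnsley_fern_2
    pt = step(pt, table, rng.random())
    pts[i] = pt
```

The result interleaves points drawn toward both attractors; since the two tables differ only in their probabilities, the combined cloud still looks unmistakably fern-like.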

But now back to the tour.

Other mathematicians, professional and amateur alike, have played with the set of affine mappings to create similar plant-like results.  An interesting example, called here the Flatter Fern, has as its set of affine mappings

flatter_fern = np.array([[0.02,0,0,0,0.25,0,-0.4],
                         [0.86,0.95,0.005,-0.005,0.93,-0.002,0.5],
                         [0.93,0.035,-0.2,0.16,0.04,-0.09,0.02],
                         [1.00,-0.04,0.2,0.16,0.04,0.083,0.12]])

Using this table in the chaos game gives

As in the previous case, the structure is self-similar but each of the leaf groups is thinner and straighter than in the Barnsley fern.

Keeping with the botanical theme, another fan favorite is the Fractal Tree.  Its set of affine maps is

fractal_tree  = np.array([[0.05,   0,    0,    0, 0.5, 0,  0],
                          [0.45,0.42,-0.42, 0.42,0.42, 0,0.2],
                          [0.85,0.42, 0.42,-0.42,0.42, 0,0.2],
                          [1.00, 0.1,    0,    0, 0.1, 0,0.2]])

Running the chaos game with this table gives something that looks more like broccoli than it does a tree.

Once again, one can play with the stability of the game by adjusting the relative probabilities.  By upping the probability of the first map of the set from 0.05 to 0.25, giving the following table

fractal_puff  = np.array([[0.25,   0,    0,    0, 0.5, 0,  0],
                          [0.45,0.42,-0.42, 0.42,0.42, 0,0.2],
                          [0.85,0.42, 0.42,-0.42,0.42, 0,0.2],
                          [1.00, 0.1,    0,    0, 0.1, 0,0.2]])

the chaos game produces a Fractal Puff

which suggests that the outer edge is soft, fuzzy, and worn away.

In the next wing of the museum awaits a curious set of geometric shapes.  The one we will examine in detail is the Square, whose set of affine maps is displayed in the following table.

square  = np.array([[0.25,0.5,0,0,0.5, 1, 1],
                    [0.50,0.5,0,0,0.5,50, 1],
                    [0.75,0.5,0,0,0.5, 1,50],
                    [1.00,0.5,0,0,0.5,50,50]])

In this case, running the chaos game with this table results in a uniformly filled region.

The trick to making a fractal appear is to restrict the selection of the vertex so that the same vertex cannot be picked twice in a row.  The easiest modification to the code used in these explorations is to create a new transformation function

def Transform2(point,table,r,prev_N):
    x0 = point[0]
    y0 = point[1]
    # select the affine map from the cumulative probabilities in column 0
    for i in range(len(table)):
        if r <= table[i,0]:
            N = i
            break
    if N != prev_N:
        x = table[N,1]*x0 + table[N,2]*y0 + table[N,5]
        y = table[N,3]*x0 + table[N,4]*y0 + table[N,6]
    else:
        # same vertex picked twice in a row: flag the point as invalid
        x = np.nan
        y = np.nan

    return np.array([x,y]), N

that is aware of the previous vertex (the variable prev_N) and returns a null result when the same vertex is chosen twice in a row.  Testing the return value keeps the ‘bad’ points out of the results, giving the following pattern
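For completeness, a minimal driver for the restricted game might look like the following sketch (the filtering arrangement is my own; Transform2 is repeated here so the snippet stands alone):

```python
import numpy as np

# Square table: four vertices, each with a 25% chance of selection.
square = np.array([[0.25, 0.5, 0, 0, 0.5,  1,  1],
                   [0.50, 0.5, 0, 0, 0.5, 50,  1],
                   [0.75, 0.5, 0, 0, 0.5,  1, 50],
                   [1.00, 0.5, 0, 0, 0.5, 50, 50]])

def Transform2(point, table, r, prev_N):
    # Pick the map from the cumulative probabilities in column 0.
    for i in range(len(table)):
        if r <= table[i, 0]:
            N = i
            break
    if N != prev_N:   # forbid choosing the same vertex twice in a row
        x = table[N, 1]*point[0] + table[N, 2]*point[1] + table[N, 5]
        y = table[N, 3]*point[0] + table[N, 4]*point[1] + table[N, 6]
    else:
        x = y = np.nan
    return np.array([x, y]), N

rng = np.random.default_rng(1)
pt, prev_N = np.zeros(2), -1
points = []
for _ in range(20000):
    new_pt, N = Transform2(pt, square, rng.random(), prev_N)
    prev_N = N
    if np.isfinite(new_pt).all():   # keep only the 'good' points
        pt = new_pt
        points.append(pt)
points = np.array(points)
```

The rejected draws simply waste an iteration; only the accepted points advance the game and land in the plotted set.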

The Wikipedia article on the chaos game has a stunning gallery of geometric shapes that result from similar types of rules restricting vertices.  Particularly interesting are the pentagon examples by Edward Haas.

The first one shows the pattern that results from using a set of maps analogous to the square’s and restricting the vertex to be strictly different from the one before.

In Haas’s second case, the pattern results from using the same table as his first case, but with the restriction that the new vertex cannot be 1 or 4 places away from the two previously chosen vertices.

While both cases exhibit five-fold symmetry, the differences that arise solely due to the restriction on the allowed vertices are startling.

The final exhibit is a mix of the botanical and the geometric.  The Fractal Tree table was run with the vertex restriction rule applied to the square.  The resulting Fractal Twig is beautiful in its brutal desolation (I’ve always wanted to talk like a snooty, pretentious modern art dealer)

So, it seems as if the sky is the limit in creating digital art using the chaos game.  I suspect we’ve only scratched the surface.

Chaos Game Part 3 – Stability of the Game

Now having introduced the chaos game and analyzed how a single iteration works, it is worth taking stock of what we actually know.  Clearly the repeated application of the affine maps causes points to collect only in certain places in the plane, creating the self-similar fractals that are often called the attractors of the map, or strange attractors.  We’ve also established what the parameters of the affine map do in the particular case of the Sierpinski triangle, getting a sense of how those parameters map to the rules discussed in the Numberphile video.  But the exact mechanism for why only certain points are hit rests on some specialized mathematics, and little intuition can be gained from pursuing that study without a lot of initial effort devoted to understanding the ‘nuts and bolts’.

So, this post is devoted to an experimental approach to the chaos game that produces the Sierpinski triangle.  Each experiment performed involves changing one or more of the parameters that make up the affine maps used in the game and then observing what happens to the pattern produced.  Doing so will help us understand the stability of the map.

Stability can be defined in numerous ways.  There are definite schools of thought with different interpretations.  For this exploration of the chaos game, I will define two types of stability:  deterministic and stochastic.

Deterministic stability asks over what range of parameters the chaos game yields fractal-like results.  The parameters themselves are chosen at the beginning of the game and remain fixed during its progression.   The idea is to answer such questions as:  How accurate do the parameters need to be?  How many different shapes or patterns result?  How many different types of games are there?

To get a sense of what answers may result and to clarify what is meant by ‘types of games’, consider the investigation detailed in the last post.  As a direct byproduct of understanding a single iteration, the significance of the constant terms $f^{(p)}$ and $g^{(p)}$ used in the three Sierpinski affine maps became clear; multiplying these parameters by two gave the coordinates of the vertices of the triangle.  The same Sierpinski triangle resulted even when these values were changed.  Thus, the family of all possible values for $f^{(p)}$ and $g^{(p)}$ forms an equivalence class where the scale and overall geometry may change but the shape is undeniably a Sierpinski triangle.  I term this equivalence class a single game.

In contrast, stochastic stability involves regarding the parameters of the affine maps as random variables, which vary at each iteration.  The time history of each random variable forms a unique realization that is different each time the game is played.  This aspect can be quite important because noise can creep into the game in a variety of scenarios.  Computationally, the finite-precision floating point representation of the rational numbers in the game is not exact, and small errors arise as the game progresses through its iterations.  Physically, an analog of the game in the material world is a messy affair, with the parameters of the maps only inexactly realized due to all the small but unmodelled effects.

One possible way of implementing a stochastic game would be to generalize the deterministic expression for a new point $(x,y)$ in the Sierpinski triangle game,

\[ \left[ \begin{array}{c} x \\ y \end{array} \right] = \left[ \begin{array}{cc} 1/2 & 0 \\ 0 & 1/2 \end{array} \right] \left[ \begin{array}{c} x_{current} \\ y_{current} \end{array} \right] + \frac{1}{2} \left[ \begin{array}{c} x_{vertex} \\ y_{vertex} \end{array} \right] \; , \]

(where $(x_{vertex},y_{vertex})$ are the coordinates of one of the three vertices of the triangle, chosen at random, and $(x_{current},y_{current})$ are the current point’s coordinates) to a physical system by making the transformation matrix look like

\[ \left[ \begin{array}{cc} 1/2 & 0 \\ 0 & 1/2 \end{array} \right] \rightarrow \left[ \begin{array}{cc} 1/2 + \eta_{11} & \eta_{12} \\ \eta_{21} & 1/2 + \eta_{22} \end{array} \right] \; , \]

where the $\eta_{ij}$ are random variables, with some specified moments, reflecting noise that is ubiquitous in the world.
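As a concrete sketch of such a noisy game (my own illustration, assuming small zero-mean Gaussian noise of strength sigma for the $\eta_{ij}$), the transformation matrix can be jittered at every iteration:

```python
import numpy as np

# Stochastic Sierpinski step: the nominal (1/2) I scaling matrix is perturbed
# by zero-mean Gaussian noise eta_ij, redrawn at every iteration.
vertices = np.array([[2.0, 2.0], [2.0, 100.0], [100.0, 100.0]])
rng = np.random.default_rng(42)
sigma = 0.01   # assumed noise strength

pt = np.zeros(2)
pts = np.zeros((5000, 2))
for i in range(len(pts)):
    eta = sigma * rng.standard_normal((2, 2))
    M = 0.5 * np.eye(2) + eta          # noisy transformation matrix
    v = vertices[rng.integers(3)]      # vertex chosen at random
    pt = M @ pt + 0.5 * v              # same step rule as the clean game
    pts[i] = pt
```

Because the perturbed matrix remains a contraction for small sigma, the cloud of points still hugs the Sierpinski attractor, just with slightly blurred edges.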

For this post, we’ll content ourselves with exploring deterministic stability for the Sierpinski triangle, looking at what results from changing the values in the transformation matrix and the relative probabilities with which each vertex is selected.  In order to facilitate the experiments, each game’s results will be over-plotted (in black) on the results from the standard Sierpinski triangle (in red).

In the first experiment, we’ll change the transformation matrix so that the new point is moved only one third of the way along the line from the current point to the vertex (a variation actually suggested in the Numberphile video).  This modification amounts to the following change in the transformation matrix

\[ \left[ \begin{array}{cc} 1/2 & 0 \\ 0 & 1/2 \end{array} \right] \rightarrow \left[ \begin{array}{cc} 1/3 & 0 \\ 0 & 1/3 \end{array} \right] \; . \]

Running the game gives

At first glance, lowering the values in the transformation matrix seems to have simply shrunk the scale of the triangle, but closer inspection shows a fundamental change in the geometry.  The missing ‘center piece’ is now a squat hexagon instead of a triangle, as is more easily seen in the following annotated figure.

This small change has resulted in a new, but closely related game.

Scaling up the values in the transformation matrix from $1/2$ to $3/4$ leads to no discernible structure (with the original triangle now peeking through in the lower left corner).

It appears that scaling the values in the transformation matrix above $1/2$ causes the points to overlap, suggesting that $1/2$ is some sort of divider or separatrix between non-overlapping and overlapping points.  We might expect that a value slightly above one half, say $0.55$, would begin to blur the Sierpinski triangle by creating a small amount of overlap.  Running the game with this value yields

Some of the overlap regions are highlighted with red circles and the remaining ones can be deduced from the symmetry exhibited by this self-similar fractal.

Likewise, a value of $0.45$ produces a triangle where the sub-triangles no longer smoothly join together but where there are gaps.

If this process were to occur in a physical system (and certain researchers are constructing such systems – see Physicists wrangled electrons into a quantum fractal), then the scales selected in the transformation matrix must stay confined to a relatively narrow range of parameter space (roughly $0.45$ to $0.55$) before the results become so different as to suggest that the system is fundamentally different.  That said, the geometry produced by the game, regarded strictly in a visual sense, is remarkably resilient.  Triangles within triangles result over a fairly wide range of values.

The final experiment shows that the game is even more tolerant of changes in the probabilities that determine how often each vertex is picked.  In the original implementation of the game, each vertex had a $1/3$ chance of being the one selected.  To see how sensitive the game is in this respect, the relative probabilities were adjusted so that one vertex had an 80% chance of being selected while each of the remaining two had only a 10% chance.  As the figure below shows, the result of this change is simply to ‘ghost out’ the less likely vertices.
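In table form, that biased game corresponds to cumulative probabilities of 0.8, 0.9, and 1.0 (my own transcription of the 80/10/10 split; only the first column differs from the original Sierpinski table):

```python
import numpy as np

# Sierpinski table with an 80/10/10 split among the three vertices;
# only the cumulative-probability column differs from the original.
sierpinski_biased = np.array([[0.80, 0.5, 0, 0, 0.5,  1,  1],
                              [0.90, 0.5, 0, 0, 0.5,  1, 50],
                              [1.00, 0.5, 0, 0, 0.5, 50, 50]])
```

Because the table stores cumulative values, the per-row probabilities are the successive differences: 0.80, 0.10, and 0.10.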

The conclusion here is straightforward.  The chaos game method of producing the Sierpinski triangle is remarkably stable to a range of deterministic changes in the parameter set.  This robustness makes the chaos game a convenient numerical laboratory for exploring other emergent fractals, some of which we will see in the next post.

Chaos Game Part 2 – How It Works

Last month’s blog introduced the idea of the chaos game, a simple and compact technique for generating deterministic fractals using a Monte Carlo-like random process.  The central ingredients of the chaos game are an affine transformation of the plane

\[ \left[ \begin{array}{c} x_i \\ y_i \end{array} \right] = \left[ \begin{array}{cc} a^{(p)} & b^{(p)} \\ c^{(p)} & d^{(p)} \end{array} \right] \left[ \begin{array}{c} x_{i-1} \\ y_{i-1} \end{array} \right] + \left[ \begin{array}{c} f^{(p)} \\ g^{(p)} \end{array} \right] \; , \]

which expresses the current point (labeled by the index $i$) in terms of the previous point (labeled by the index $i-1$) and the known, fixed parameters $\{a^{(p)}, b^{(p)}, c^{(p)}, d^{(p)}, f^{(p)}, g^{(p)}\}$.  The index $p$ is an integer label for the various affine transformations available and is chosen at random at each step, according to some pre-defined probability distribution.  For the Sierpinski triangle discussed in the previous post, $p$ can take on the values (1,2,3), each with probability 1/3.

It is convenient to package all the information into a table, with each row corresponding to one of the affine transformations.  For the Sierpinski triangle, the table is given by

Cumulative Probability    a     b     c     d     f     g
         0.33            0.5    0     0    0.5    1     1
         0.66            0.5    0     0    0.5    1    50
         1.00            0.5    0     0    0.5   50    50

which is directly translated into the numpy array

sierpinski_triangle = np.array([[0.33,0.5,0,0,0.5, 1, 1],
                                [0.66,0.5,0,0,0.5, 1,50],
                                [1.00,0.5,0,0,0.5,50,50]])

used for simulation (details in the previous post).

The above material, which is a quick recap of last month’s entry, sets the stage for this post, which focuses on understanding how the structures emerge from such a simple set of rules.

A good starting point for that understanding is the Numberphile video on the chaos game

The presenter, a mathematician by the name of Ben Sparks, starts by showing how the chaos game for the Sierpinski triangle is played by examining single iterations performed by hand.   He performs a single iteration by taking the initial point, rolling a six-sided die (yes, die is the correct singular – like Sparks, I find that grammatical convention confusing as well) to randomly pick one of the three vertices, connecting the point to the chosen vertex with a straight line, and then placing the new point halfway along that line.

To map this procedure to a mathematical algorithm, start with the situation pictured below.

Assuming that vertex C is chosen at random, the line joining the current point to the vertex is given by

\[ \left[\begin{array}{c} x \\ y\end{array}\right] = \left[ \begin{array}{c} x_p \\ y_p \end{array} \right] + \lambda \left[\begin{array}{c} x_C - x_p \\ y_C - y_p\end{array}\right] \; , \]

where the parameter $\lambda$ ranges from 0 to 1, with 0 corresponding to the current point and 1 to vertex C.  The point halfway from the current point to vertex C has coordinates given by substituting $\lambda = 1/2$ to yield

\[ \left[\begin{array}{c} x_{1/2} \\ y_{1/2}\end{array}\right] = \frac{1}{2}\left[ \begin{array}{c} x_p \\ y_p\end{array}\right] + \frac{1}{2} \left[\begin{array}{c} x_C \\ y_C\end{array}\right] \; . \]

By comparing this formal transformation with the rows of the table, it is obvious that the vertices of the triangle used in the chaos game are:  A – (2,2), B – (2,100), C – (100,100).  This is nicely confirmed by putting the axes back on the figures and forgoing the aesthetic transformation used last month.

The limits of the figure are [2,100] for the x-axis and y-axis and the triangle fits perfectly into the upper left half.
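That reading of $f$ and $g$ can also be checked numerically: each vertex is the fixed point of its own map, since $q = q/2 + (f,g)$ implies $q = 2(f,g)$.  A quick verification snippet (my own, not part of the original code):

```python
import numpy as np

sierpinski_triangle = np.array([[0.33, 0.5, 0, 0, 0.5,  1,  1],
                                [0.66, 0.5, 0, 0, 0.5,  1, 50],
                                [1.00, 0.5, 0, 0, 0.5, 50, 50]])

# The fixed point of q -> M q + (f, g) with M = I/2 is q = 2 (f, g),
# i.e. the vertex associated with that map.
for row in sierpinski_triangle:
    f, g = row[5], row[6]
    q = 2 * np.array([f, g])                               # candidate vertex
    M = np.array([[row[1], row[2]], [row[3], row[4]]])
    assert np.allclose(M @ q + np.array([f, g]), q)        # q is a fixed point
```

The three fixed points come out to (2,2), (2,100), and (100,100), matching the vertices read off the re-axised figure.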

Further confirmation is obtained by changing the values of $f$ and $g$ for all three affine transformations so that the vertices are now located at: A – (0,0), B – (0,20), C – (20,20).  Rerunning the chaos game yields the following fractal

Here the limits on the plot are [0,20] for both the x- and y-axes.

With a firm understanding of what the parameters $f$ and $g$ do, we next turn our attention to the matrix portion of the transformation.  Referring to the half-way point equation above, the first term on the right-hand side can be recast as

\[ \frac{1}{2} \left[ \begin{array}{c} x_p \\ y_p \end{array} \right] = \left[ \begin{array}{cc} 1/2 & 0 \\ 0 & 1/2 \end{array}\right] \left[\begin{array}{c} x_p \\ y_p\end{array} \right] \; ,\]

thus explaining what the parameters $a$, $b$, $c$, and $d$ describe.  For example, setting $a = 1/3 = d$ while keeping $b = 0 = c$ shrinks the current point’s contribution from one half to one third.  (Strictly speaking, moving the new point exactly one third of the way along the line to the chosen vertex would require $a = d = 2/3$ with $f$ and $g$ reduced to a third of the vertex coordinates; the simpler $a = d = 1/3$ change is the variant taken up in a later post.)
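In table form, the $a = 1/3 = d$ change amounts to the following (my own transcription; only the $a$ and $d$ columns differ from the Sierpinski table):

```python
import numpy as np

# Sierpinski table with a = d = 1/3: the current point's contribution to
# each new point is reduced from one half to one third.
sierpinski_thirds = np.array([[0.33, 1/3, 0, 0, 1/3,  1,  1],
                              [0.66, 1/3, 0, 0, 1/3,  1, 50],
                              [1.00, 1/3, 0, 0, 1/3, 50, 50]])
```

Feeding this table to the same Transform/loop machinery requires no other code changes, which is the main attraction of the table-driven design.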

Finally, the cumulative probabilities can be modified as well to create additional realizations of the game.

Next post will explore some of the various new fractals that can be obtained by fiddling with the matrix component of the affine transformation and the cumulative probabilities as a way of investigating the stability of the game to variations.

Chaos Game Part 1 – The Basics

As is typical of many of the best things in life, discovery of this month’s topic involved quite a bit of serendipity.  It all started with a search for a quick reminder on how to solve a numerical set of coupled ODEs using scipy/Python and it ended with the results seen here on the chaos game.

Since it is possible that the reader is unfamiliar with this term (as was the author), a quick introduction is warranted.  In short, the chaos game is a random method for producing a deterministic fractal shape.  Its operation stands in sharp contrast to the traditional way of drawing a fractal deterministically by hand: the chaos game teases out the fractal’s attractor by iteratively applying a set of affine transformations, each chosen at random, to a sequence of points starting from a randomly chosen initial point.

Mathematically, the chaos game falls under the broader heading of an Iterated Function System (IFS), where the $i$th point in the plane, $q_i$, is given by

\[ q_i = M^{(p)} q_{i-1} + f^{(p)} \; \]

or, in terms of arrays

\[ \left[ \begin{array}{c} x_i \\ y_i \end{array} \right] = \left[ \begin{array}{cc} a^{(p)} & b^{(p)} \\ c^{(p)} & d^{(p)} \end{array} \right] \left[ \begin{array}{c} x_{i-1} \\ y_{i-1} \end{array} \right] + \left[ \begin{array}{c} f^{(p)} \\ g^{(p)} \end{array} \right] \; .\]

The index $p$ is an integer to label the various affine transformations available, and is chosen at random at each step according to some pre-defined probability distribution.

Before getting to the Python code, which is easy to implement, it is instructive to compare and contrast the traditional method and the chaos game in generating the well-known Sierpinski Triangle fractal.

The traditional way to draw the Sierpinski Triangle starts with an equilateral triangle.

A similar triangle half in size (or a quarter in area) is removed from the middle of the larger triangle leaving an arrangement of three smaller triangles.

The process can then be applied to these smaller triangles yielding another stage of refinement.  This in turn results in a larger set of smaller triangles to which the process can be applied yet again.

Continuing on indefinitely gives the Sierpinski Triangle in all its glory.

However, there is a fundamental limit on how much detail can be drawn by hand or by computer, and the amount of work needed to obtain each successively more exact level of detail grows rapidly.

In the chaos game, the programming is much simpler and, due to the random nature of the process, all levels of detail are present simultaneously.  Central to the chaos game approach are the three affine transformations:

\[ T1: \left[ \begin{array}{c} x_i \\ y_i \end{array} \right] = \left[ \begin{array}{cc} 0.5 & 0 \\ 0 & 0.5 \end{array} \right] \left[ \begin{array}{c} x_{i-1} \\ y_{i-1} \end{array} \right] + \left[ \begin{array}{c} 1 \\ 1 \end{array} \right] \; ,\]

\[ T2: \left[ \begin{array}{c} x_i \\ y_i \end{array} \right] = \left[ \begin{array}{cc} 0.5 & 0 \\ 0 & 0.5 \end{array} \right] \left[ \begin{array}{c} x_{i-1} \\ y_{i-1} \end{array} \right] + \left[ \begin{array}{c} 1 \\ 50 \end{array} \right] \; , \]

and

\[ T3: \left[ \begin{array}{c} x_i \\ y_i \end{array} \right] = \left[ \begin{array}{cc} 0.5 & 0 \\ 0 & 0.5 \end{array} \right] \left[ \begin{array}{c} x_{i-1} \\ y_{i-1} \end{array} \right] + \left[ \begin{array}{c} 50 \\ 50 \end{array} \right] \; .\]

At each stage of the iteration, one of these transformations is selected with probability 1/3; that is, each is equally likely to be chosen.

Four basic pieces of code are used in the implementation.

The first is a table that encodes the selection probability and the parameters for each affine transformation used in the game (i.e., each row of the table encodes the likelihood and the numbers associated with $T1$, $T2$, and so on).  The table encoding for the Sierpinski Triangle is:

import numpy as np

sierpinski_triangle = np.array([[0.33,0.5,0,0,0.5, 1, 1],
                                [0.66,0.5,0,0,0.5, 1,50],
                                [1.00,0.5,0,0,0.5,50,50]])

Second is the Transform iteration function that takes as input the current point, the table describing the game, and a random number simulating the probability at each step.

def Transform(point,table,r):
    x0 = point[0]
    y0 = point[1]

    # select the affine map whose cumulative probability bracket contains r
    for i in range(len(table)):
        if r <= table[i,0]:
            N = i
            break

    # apply the selected affine transformation to the current point
    x = table[N,1]*x0 + table[N,2]*y0 + table[N,5]
    y = table[N,3]*x0 + table[N,4]*y0 + table[N,6]

    return np.array([x,y])

Third is the looping code that carries out the chaos game through successive applications of Transform.  The code, which for convenience wasn’t encapsulated in a function, looks like

N     = 100000
shape = np.zeros((N,2))
probs = np.random.random(N)
game  = sierpinski_triangle

for i in range(1,N):
    shape[i,:] = Transform(shape[i-1,:],game,probs[i])

The parameter N sets the number of desired iterations; the numpy array shape pre-allocates the space for the resulting points and initializes the starting point to the origin.  The random numbers are also precomputed, since one bulk call to numpy’s (np) random generator is faster than a separate call at each step.  The game variable points at the particular table encoding the transformations, here the sierpinski_triangle table defined above.  Because each new point depends on the previous one, the loop must run sequentially, and since N = 100000 takes less than a second of computation there wasn’t a compelling reason to obscure the code in pursuit of speed.

The final step is a plot of the results.  The code that performs this is given by

import matplotlib.pyplot as plt

start = 0

fig_shape = plt.figure(figsize=(6,6))
ax_shape  = fig_shape.add_subplot(1,1,1)
ax_shape.scatter(shape[start:,0],shape[start:,1],marker='.',color='k',s=0.01)
ax_shape.axes.set_frame_on(False)
ax_shape.xaxis.set_visible(False)
ax_shape.yaxis.set_visible(False)

The parameter start sets the number of initial points to ignore.  This is sometimes useful to suppress the first few stray points produced before the game converges onto the fractal attractor.

All of these pieces were put together in a Jupyter notebook.  The resulting image is

Note that the Sierpinski Triangle is rotated 45 degrees from the more usual ‘artistic’ expression.  That minor issue is easily rectified using a simple rotation matrix

c45 = 1.0/np.sqrt(2.0)
s45 = 1.0/np.sqrt(2.0)
A   = np.array([[c45,s45],[-s45,c45]])

new_shape = np.array([A.dot(shape[i,:]) for i in range(len(shape))])

where the parameters c45 and s45 are shorthand for $\cos(45 {}^{\circ}) = 1/\sqrt{2}$ and $\sin(45 {}^{\circ}) = 1/\sqrt{2}$, respectively.

The resulting image is now more aesthetically pleasing.

Making it an equilateral triangle is left as an exercise for the reader.

Note that I’ve played with the color scheme freely and inconsistently from image to image.  Picking and playing with colors is one of the most fun things about exploring the chaos game.

Future columns will look more closely at the chaos game, why it works the way it does and some of the beautiful images that can be produced.