Optimization with J

Richard L.W. Brown
York University


1  Introduction to J
    1.1  J as a Calculator
        1.1.1  Nouns and Verbs
        1.1.2  Order of Execution
        1.1.3  Infinity, Complex Numbers, Rationals, Extended Integers
        1.1.4  Pronouns, Proverbs, Adverbs
    1.2  Arrays
        1.2.1  Ranks and Items
        1.2.2  Creating Arrays
        1.2.3  Verbs and the Rank Conjunction
    1.3  Programs
        1.3.1  Tacit and Explicit Programs
        1.3.2  Binding and Composition
        1.3.3  Forks and Hooks
        1.3.4  Constant Functions, Inverse Functions, Identity Functions
2  Game Theory
    2.1  Two-Person Zero-Sum Games
        2.1.1  Matrices
        2.1.2  Entering the Game Matrix into J
        2.1.3  Solving Small Games
        2.1.4  The Tennis Serve Game
        2.1.5  Larger Game Matrices


These are notes prepared for two concurrent courses: Math 5500 Topics in Mathematics for Teachers (Optimization) and Math 4570 Applied Optimization.

The purpose of the notes is to provide a tutorial in the use of the `J calculator' as a tool for computing solutions to a variety of optimization problems.

J version 4.05 for Windows 95/98/NT is available at no cost at:
You download a single file of size about 2.7 megs and then double-click this file to install J on your computer. There are also versions of J for the Mac, for Linux, and for Windows CE (now called Pocket PC). The on-line help menu is rather complete and includes a manual for J as well as demos and tutorials (called labs).

J is a programming language developed from A Programming Language (APL) by Ken Iverson (father of APL) and Roger Hui. Like APL, J has primitive functions for operating on arrays without using indexes or looping. For example, if A and B are two matrices of the same shape then A+B computes the elementwise sum of A and B, and <./>./A computes the minimum of the column maximums of A. These powerful features of the J language allow for concise expression of many optimization algorithms.

We have selected J because computing with J is more in the spirit of using a calculator to evaluate formulas than of writing a computer program. Computing with J may be compared to using a spreadsheet. In a spreadsheet, the user types numbers and formulas into cells arranged in a rectangular grid. Together, these numbers and formulas represent a computer program for some calculation. In J, numbers are entered into arrays and formulas are defined to operate on these arrays. For example, to sum a list of numbers 20.21, 28.01, 173.07, 65.16 in a spreadsheet, one would type the numbers into a row (or column) of cells. Then one would give a name to this range of cells, say payments. And finally, one would select a new cell and type into it a formula of the form =SUM(payments). In that cell would appear the sum of the payments, 286.45. In J, one would enter the numbers into a vector and then apply a summation formula.

   payments=: 20.21  28.01  173.07  65.16
   sum =: +/
   sum payments

It has been widely noted that many people who are comfortable writing formulas into spreadsheets (or using the functions on a calculator to evaluate formulas) are not so comfortable at writing a programs for a computer. There is a good reason for this. The original problem that gave rise to the calculation was either itself a mathematical problem (add these four numbers) or a practical problem (compute the bill when 4 items are purchased) that is formulated into a mathematical problem (add the four prices). Therefore, we can readily accept that a mathematical formula is needed. But to write a computer program, a new level of formulation must be added. We must store the numbers in a sequence of computer memory cells v[i] using an index i. We must name a computer memory cell S to hold the partial sums and initialize it to zero and we must set an index i equal to zero. Then we enter a loop. Each time we enter the loop, we increment the index, fetch a number using the index, and add it to the contents of the memory cell holding the partial sum. These steps take us quite far from the original problem. And there are many ways of making errors, even in such a simple program!

In writing programs, we will be emphasizing a style called `functional programming' in which functions are defined and then combined to give more complicated functions until the final program is achieved. For example, we know that the arithmetic mean of a list of numbers is the sum divided by the number of items. Here is how that is programmed in J.

   sum =: +/
   am =: sum % #
   am 7 9 2 1 4
   am 3r5 2r3 17r7

(The second example above shows the arithmetic mean of three rational numbers. It says that the arithmetic mean of 3/5, 2/3, 17/7 is 388/315.)

We will also occasionally use the more traditional programming style of making a sequence of commands that operate on data and assign the resulting values to variables. The sequence of commands will usually have some conditional branches or loops. This command-style programming is often called `imperative programming'. But in the latter case, we will streamline the imperative program by using many functional parts so that the ideal of using just functions (or just `formulas') is not far away.


Toronto, Ontario, September, 2000

Chapter 1
Introduction to J

1.1  J as a Calculator

J operates in calculator mode . That is, the user is presented with an execution window (entitled 1.ijx or 2.ijx, etc.) in which typed statements are executed after the user types <return>. The execution window has a number of helpful features. The user can move the cursor up the screen to a line that was entered previously and press <return>. This will copy the line down to the current line. There it may be edited and executed. Similarly, the usual `cut and paste' methods may be used to retrieve parts of the session and paste them into the current line. Or a sequence of previous lines can be edited, then selected with the mouse, and then run using the Run Selection item from the Run menu. In effect, it is like a calculator that allows the used to correct mistakes and retry edited earlier work.

1.1.1  Nouns and Verbs

The simplest calculations are done by entering data (nouns) and operating on the data with operations or functions (verbs). The examples below are formatted just as they would appear in a real J session.

It will be of great help to sit at a computer with a J session running and do these and similar calculations as you read this section.


Note that in J, a negative number is denoted by an underbar rather than a minus sign. The underbar is part of the notation for a negative number. The minus sign, by contrast, is an `action word' (verb) that either means subtraction or negation as shown above. Note that a verb in J can have both a dyadic meaning (like subtract) and a monadic meaning (like negate). Another example is the verb % whose dyadic meaning is division and whose monadic meaning is reciprocal. The dyadic meaning of ^ is exponent whereas the monadic meaning is exponential: ^x means ex where e, `Euler's number', is the base of natural logarithms, e = 2.71828.

1.1.2  Order of Execution

J has many more primitive (predefined) verbs than those shown above. Therefore, there is no heirarchy of execution. (In other words, for example, multiplication is not done before addition!) The order of execution is the same as in composition of functions in mathematics: f g h 5 means f(g(h(5))). You can alter the order of execution with parentheses.

So remember: expressions are evaluated from right to left unless parentheses are used. This is a simple rule and easy to apply - but it does lead to many mistakes by beginners. Consider these examples:


1.1.3  Infinity, Complex Numbers, Rationals, Extended Integers

The following examples show some calculations involving infinite, negative infinity, complex numbers, rationals, and extended integers (larger that the default integer limit of 231).

   _ + 2        NB. infinity plus 2 is infinity
   _1%0         NB. _1 over 0 is negative infinity
   _2^0.5       NB. _2 to power 0.5 is 0+i(sqrt(2))
   2r3 + 4r7    NB. 2/3 + 4/7 is 26/21
   2^50         NB. 2^50 overflows to floating point
   2^ x: 50     NB. x: 50 makes exponent, hence result, extended

1.1.4  Pronouns, Proverbs, Adverbs

Continuing the sample session, we show how to assign names to data (nouns) and names to verbs. (The names for nouns are sometimes called pronouns and the names for verbs are sometimes called proverbs. But, by abuse of language, we still refer to the names as nouns and verbs.) Thus, in the session that follows, we should call 1.2 4 5 a noun, v a [pro]noun, +/ a verb, and sum a [pro]verb.

An adverb modifies the meaning of a verb. The most commonly used adverb in J is /. When it is used monadically, it inserts the verb (to its left) between the items of its right argument. Thus, whereas 2+3 means `2 plus 3', +/ 2 3 4 means `2 plus 3 plus 4' or `sum of 2, 3, 4'. (You can read +/ 2 3 4 as `plus over 2 3 4'.) Similarly, */2 3 4 is `2 times 3 times 4'. When used dyadically, it generates a table such as a plus-table or times-table.

   v =: 1.2 4 5
2.2 5 6
   v+10 11 12
11.2 14 17
   sum =: +/
   sum v
   1 2 3 4 +/ v
3.2  6  7
4.2  7  8
5.2  8  9
6.2  9 10
   i. 6
0 1 2 3 4 5
   (i. 6) */ (i. 6)
0 0  0  0  0  0
0 1  2  3  4  5
0 2  4  6  8 10
0 3  6  9 12 15
0 4  8 12 16 20
0 5 10 15 20 25
   2 <. 3        NB. lesser of
   <./v          NB. lesser over is min
   min =: <./
   min v
   max =: >./
   max v
   n =: i.15
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
   +/%!n         NB. sum of reciprocals of factorials

In J, a primitive operation is denoted by a single symbol (like =) or by following such a symbol by a period (like =.) or a colon (like =:). Thus the expression <. is to be taken as one operation. As a dyadic verb, it means `lesser of'. It can be used to define a monadic minimum verb as shown above. And its dual, >. (`greater of') can be used to define a maximum verb. Note how the monadic verb i. generates an index set starting at zero. The line +/%!n gives the sum of the reciprocals of the factorials of this index set; in mathematical notation, that is


n = 0 
and this sum is approximately Euler's number e = 2.71828.

1.2  Arrays

J is especially strong in its ability to handle arrays such as vectors, matrices, and arrays of higher dimension. Typically one does arithmetic on arrays by applying verbs to entire arrays rather than using for-loops with indicies on the individual elements of the arrays.

1.2.1  Ranks and Items

In J the dimension of an array is called its rank. (This is a different meaning of `rank' from the `rank of a matrix' in mathematics.) In J, arrays of any finite rank are allowed. For example, vectors are arrays of rank one, matrices are arrays of rank two. An array of rank three, say an array of shape 4 by 5 by 6, can be thought of as 4 matrices each of size 5 by 6. Corresponding to each value of the leading index is an item of the array. Thus the items of a vector are its elements. The items of a matrix are its rows. The items of an array of rank three of shape 4 by 5 by 6 are 4 matrices each of shape 5 by 6.

In mathematics, a vector is a special case of a matrix that has just one row. That is not true in J. In J there is a difference between an n-vector (which has rank 1) and a matrix of shape 1 by n (which therefore has rank 2).

1.2.2  Creating Arrays

We have seen above that vectors can be created by listing their elements. We also have seen that the verb i. can be used to create a vector and that matrices can be created by making tables such as addition tables or multiplication tables. Now we look at some other ways of creating matrices.

In the following sample session, a 4 by 5 matrix A is defined by using the shape verb $ to reshape a 20-element vector. (The extra spacing in the vector is unnecessary but helps the reader to see how the vector data is broken up into the rows of the matrix.) Then a 5 by 3 matrix B is defined by combining five 3-vectors. (Note the use of ,: to create a rank-2 array from the two rightmost vectors followed by the use of , to add the remaining rows on top. If you use all commas then you just get a vector with 15 elements.) Next we create a 2 by 4 matrix C by using ,. to combine the columns.

Various operations on the arrays A B C create other arrays. A few samples, with comments, are shown.

   A =: 4 5$2 1 7 3 9  6 1 9 9 2  2 7 1 3 8  8 3 2 5 1
2 1 7 3 9
6 1 9 9 2
2 7 1 3 8
8 3 2 5 1
   B =: 2 0 1, 8 _1 2, 3 1 4, 5 2 0,: 1 2 3
2  0 1
8 _1 2
3  1 4
5  2 0
1  2 3
   C =: 8 1,. 6 2,. 4 5,. 3 4
8 6 4 3
1 2 5 4
   0 3 { A           NB. 0-row, 3-row from A
2 1 7 3 9
8 3 2 5 1
   (0 1;1 2){A       NB. (0 ,1)- and (1,2)-elements from A
1 9
   2 3 {. A          NB. 2 rows, 3 columns take from A
2 1 7
6 1 9
   2 _3 {. A         NB. 2 rows, last 3 columns take from A
7 3 9
9 9 2
   +/B               NB. plus between rows of B (col sums)
19 4 10
   >./C              NB. greater between rows of C (col max's)
8 6 5 4
   <./>./A           NB. min of col max's of A (minimax as in game theory)
   C +/ . * A        NB. C usual-matrix-product A
84 51 120 105 119
56 50  38  56  57
   C <./ . + A       NB. C min-plus-matrix-product  A
6 6 5 7 4
3 2 6 4 4

Notice that the matrix product is denoted by +/ . * and hence there is explicit reference to the two operations of matrix product: each term in the answer matrix is a sum of products of terms from the factors. Because the operations of plus and times are explicit in the notation, they can be changed to other operations if we wish.

The last example above shows a matrix product using a minimum of sums instead of a sum of products. Is this type of matrix product useful? Imagine a diagram of points and lines. There are 11 points arranged as a group of 2 points, a group of 4 points and a group of 5 points. The matrix C gives the lengths of the 8 lines from the first two points to the next four points. The matrix A gives the lengths of the 20 lines from the group of 4 points to the group of 5 points. Then the min-plus matrix product computed above finds the lengths of the shortest paths from each of the first two points to each of the last 5 points. (Each path passes through the middle 4 points.) This is an important type of optimization problem: finding shortest paths through a network.

1.2.3  Verbs and the Rank Conjunction

How do verbs operate on arrays? Arithmetic verbs like plus and times operate on the elements of arrays. For example the expression 3 2 7 + 4 9 1 yields the answer 7 11 8 which is the elementwise sum of the two arrays. We say that arithmetic verbs have rank zero and they act on 0-cells of arrays.

Can you add (multiply etc.) two arrays of different shapes? Yes, but only under these conditions: 1) one of the arrays is a scalar (single number) or 2) if the shape of one array is a prefix of the shape of the other array. For example an array of shape 2 by 3 could be added to an array of shape 2 by 3 by 4 because the shape of the first array is a prefix of the shape of the second array.

   1 2 3 + 4 5           NB. different shapes
| length error
|   1 2 3    +4 5

   i. 2 3
0 1 2
3 4 5
   100* i. 2 3           NB. 1) scalar
  0  100  200
300  400  500
   i. 2 3 4
 0  1  2  3
 4  5  6  7
 8  9 10 11

12 13 14 15
16 17 18 19
20 21 22 23
   (100*i.2 3)+i.2 3 4   NB. 2) prefix
  0   1   2   3
104 105 106 107
208 209 210 211

312 313 314 315
416 417 418 419
520 521 522 523

Not all verbs have rank 0. Consider the expression 3 2 7 , 4 9 1 which evaluates to 3 2 7 4 9 1. This shows that the ,-verb joins two whole arrays and does not work elementwise. This verb has infinite rank. Every verb in J has a default rank and applies to cells of that rank. To explain this further we need to know more about cells.

What are the cells of an array? We have already called the elements of the array its 0-cells. Consider an array of rank 3 and shape 2 by 3 by 4 such as in the example directly above. It has (2)(3)(4) = 24 elements = 0-cells. It has (2)(3) = 6 1-cells each of length 4. It has 2 2-cells each of shape 3 by 4. It has one 3-cell of shape 2 by 3 by 4. For a general array, the shape of each k-cell is the last k elements of the shape of the array.

Note that the items of an array of shape 2 by 3 are its two 1-cells (rows) each of size 3 but the items of an array of shape 2 by 3 by 4 are its two 2-cells each of shape 3 by 4. In J you may also talk about (-1)-cells, (-2)-cells, and so on. A (-k)-cell has the shape you get by dropping the first k elements of the shape of the array. Thus for any array the items are always (-1)-cells.

The rank of a function can be modified with the rank conjunction ". Consider the examples below:

   3 2 7 , 4 9 1         NB. append arrays
3 2 7 4 9 1
   3 2 7 (,"0) 4 9 1     NB. append 0-cells
3 4
2 9
7 1
   3 2 7 (,"0 1) 4 9 1   NB. append 0-cells to 1-cells
3 4 9 1
2 4 9 1
7 4 9 1
   3 2 7 (,"1 0) 4 9 1   NB. append 1-cells to 0-cells
3 2 7 4
3 2 7 9
3 2 7 1
   7 2 9 (+"0 1) 4 5     NB. add 0-cells to 1-cells
11 12
 6  7
13 14

Thus the rank conjunction add great flexibility to the way that verbs can act on arrays.

1.3  Programs

A computer program is a sequence of instructions that takes an input and creates an output. Thus a computer program is similar to a function in mathematics. In J, functions are action-words called verbs (or adverbs or conjunctions). Therefore, from the point of view of J, a program is a user-defined verb (or adverb or conjunction).

1.3.1  Tacit and Explicit Programs

Suppose we are given a list of numbers (3, 5, 2, 9, 11) and we wish to write a program that will take this list as input and will output the product of the numbers. The simplest way to do this in J is shown below.

   prod =: */         NB. This is the program.
   prod 3 5 2 9 11    NB. Here it is in action.

This form of programming is called tacit programming because the program definition makes no explicit reference to the input.

Next we give an explicit version of the program. Moreover, we will write this program in a more traditional style that you might find in BASIC or C programming.

   prod =: monad define
n =. # y.
p =. 1
i =. 0
while. i < n do.
 p =. p*(i{y.)
 i =. i+1
   prod 3 5 2 9 11

There are several things to observe. Our program is a monadic verb that takes one argument (the list of numbers) so we define a monad. We must explicitly refer to the input using the special variable name (y.) that denotes a right hand argument. (A dyadic verb has both a left and a right argument denoted respectively by (x.) and (y.).) In the various lines of the program we used local assignment (=.) instead of global assignment (=:) so that the variable names n p i would exist only inside the function and would not conflict with any global definitions we might have. (e.g. if we had previously defined p =: 0.2 0.9 then defining and running the above function would not conflict with or in any way affect this definition.)

J has many features that support tacit programming and we will emphasize this style of programming in these notes. For a person trained in other programming languages, the explicit form may be more familiar; but that is no reason not to learn `the J-way'!

1.3.2  Binding and Composition

One simple way to create a monadic verb is to bind an argument to a dyadic verb. Here are some examples.

   cube =: ^&3
   cube 5
   half =: %&2
   half 2 3 4
1  1.5  2

Another way to form a verb is to use composition. J provides (@) and (@:) for composition. The difference is that the latter gives a verb with default rank infinity whereas the former inherits its rank from the first verb in the composition. It turns out that we usually want the latter.

   of =: @:
   f =: half of cube
   g =: cube of half
   f 10
   g 10
   h =: cube of +
   2 h 3        NB. (2+3)^3

The last example above shows how the (@:)-conjunction can be used with a dyadic verb. There is another composition that is useful with a dyadic verb and this is shown below.

   k =: +&cube
   2 k 3         NB. (2^3) + (3^3)

1.3.3  Forks and Hooks

Consider the following J code:

   f =: half + cube
   f 4               NB. (4%2) + (4^3)
   sum =: +/
   am =: sum % #     NB. arithmetic mean
   am 2 3 4 5
   dev =: - am       NB. subtract arithmetic mean
   dev 2 3 4 5       NB. deviations from the mean
_1.5 _0.5 0.5 1.5

The first construction above is called a fork in J. But it corresponds to a common mathematical notation: given two functions, you can make a new function from them by adding them or multiplying them or dividing them etc. The general case is that a train of three verbs of the form (f g h) is a new verb defined, in the monadic case, by ((f g h) y) is (f(y) g h(y)).

The last construction has a train of two verbs (- and am) and this is called a hook. Its meaning is, perhaps, a little less obvious that the fork. In the example, it means `subtract arithmetic mean'. This is the operation you want when you want the list of deviations from the mean. There is a temptation to misread the hook as `negative arithmetic mean'. But that would be written as (- of am) or just as (-@:am).

Note that forks and hooks have rank infinity by default. So if you need some other rank, you must specify it explicitly.

1.3.4  Constant Functions, Inverse Functions, Identity Functions

The art of tacit programming is to combine simple functions together to construct the function you want. Among the simple functions we can use as building blocks are constant functions and identity functions.

A constant function is a function which always gives the same constant value no matter what the argument. You can make a constant function f with value 10 by writing (f =: 10"_). This defines a constant function of infinite rank. So if you evaluate it on an array of any rank, the result is just the one number 10. At the other extreme, you could define (f =: 10"0) to make a constant function with value 10 and rank 0. If you apply this to an array, it converts each 0-cell to 10.

For the values _9 ... _1 0 1 ... 9 there are special alternative names for the infinite rank constant functions, namely, _9: ... _1: 0: 1: ... 9: and that is convenient because these small constants are used so often.

The identity functions are right and left square bracket. See the examples below.

   c32 =: 32"_           NB. constant 32
   C2F =: 1.8&* + c32    NB. Celsius to Fahrenheit function
   C2F _40 _30 _10 0 10 20 30 40
_40 _22 14 32 50 68 86 104
   C2F^:_1 (_40 _22 14 32 50 68 86 104)  NB. inverse is F2C
_40 _30 _10 0 10 20 30 40
   ] 10                  NB. identity function
   (1: + ] + ]^2:) 10    NB. polynomial
   2 ] 3                 NB. dyadic ] is right
   2 [ 3                 NB. dyadic [ is left

In the J code above, note that J is able to compute the inverse of a user-defined function provided that an inverse exists and that the function definition is simple enough to fall within J's list of `obviously invertible' function combinations. The conjunction (^:) is used to give powers of functions, including negative powers, with the inverse as a special case. In the example above, parentheses are needed to separate the exponent _1 from the input vector of Fahrenheit temperatures.

Chapter 2
Game Theory

Game theory is the theory of competition. The competing `players' may be people playing parlor games or corporations competing in a market or animals or plants competing for survival in nature. There is a game theory model suitable for each of these situations.

Games can be classified in several ways. One way is by the number of players (or competing interests). We exclude situations with only one player. Therefore, the smallest number of players is two. Two-person games differ in an important way from N-person games with N > 2: in the latter case, various groups of players may form coalitions - that is, some players may gang up on others. Hence the two-person game is essentially simpler.

Next, games may be purely competitive or partly competitive and partly cooperative. In purely competitive games, the total payoffs to all players is zero so that a player's winnings come at the expense of the other player's. These are called zero-sum games. In non-zero-sum games, it may be possible for all players to win or for all to lose.

The simplest type of game is the two-person zero-sum game. Moreover, this is the foundation of the theory of more complex types of games.

2.1  Two-Person Zero-Sum Games

A two-person zero-sum game is usually described by a set of rules of play. These rules may involve jargon words (bishops, rooks, or aces, deuces, or dice, or game boards etc.) that are specific to that game. It is important to have a single uniform notation that can be used to study all games. That notation is the matrix.

2.1.1  Matrices

In a two-person zero-sum game, it is sufficient to show the outcomes for one player, say Player A. For the outcomes for the other Player B will be just the negatives of the outcomes for Player A. We create a matrix with one row for each strategy of Player A and one column for each strategy of Player B and we fill the matrix with the outcomes to Player A.

`Any matrix is a game.'

Any matrix can be interpreted as a finite two-person zero-sum game. Player A selects a row, Player B selects a column and then the payoff (paid to Player A by Player B) is the matrix entry in that row and column. Note that both players must make their choices without knowledge of the other player's choice. Otherwise the optimal strategy is obvious. Consider for example the game `Even-Odd'. The matrix of Even-Odd comes from the following set of rules. There are two players named Even and Odd. Each player shows either one or four fingers. (They do this at the same time!) If the total number of fingers showing is even, Even wins. Otherwise Odd wins. The amount a player wins is the total number of fingers showing. Letting Player A (the row chooser) be Even, the matrix of the game is


The first strategy of Player A will either win 2 or lose 5 depending on what column is chosen by Player B. The second strategy of Player A will either lose 5 or win 8. Which is the best row for Player A to choose? Some will say the second row is better because the possible loss is 5 in each row but the second row has a possible gain of 8. But this is fallacious reasoning because it ignores what Player B is likely to do. Player B would prefer the first column because it avoids the disastrous loss of 8 to Player A. given that Player B is likely to prefer the first column, then Player A should prefer the first row. But if Player B anticipate this reasoning, then Player B may in fact play the second column. And if Player A has anticipated all of the above reasoning, then Player A may indeed play the second row after all! And so the reasoning continues, going in circles without a conclusion!

Later, we will discover that this game has a mixed strategy optimal solution in which Player A should select the rows at random using probabilities of (0.65, 0.35) respectively and Player B should use the same probability distribution for the columns. On the average, Player A loses $0.45 per play to Player B.

Instead of the matrix above, we could use any matrix to define the game. One player chooses a row, the other simultaneously chooses a column and then the resulting matrix entry is paid to the row chooser by the column chooser. And if this entry is negative then the `payment' means the row chooser loses and the column chooser wins.

Such a game is called a finite two-person zero-sum game because each of the players has a finite number of choices.

`Any game is a matrix.'

Of course by `any game' we mean `any finite two-person zero-sum game'. By `finite' we mean that each player must have a finite number of strategies . If you think about some common parlor games like Bridge (played by four people who are really two teams playing as the two `players') or Chess or Go or Poker then it is not obvious how these games are matrices! The key idea is the concept of a strategy. In common language, a strategy is a plan of action that helps a player choose moves. In game theory language, a strategy is a precise rule that a player can use to select a move under any situation that may arise during the game.

Obviously, a strategy may be complicated in a game like Chess. On the other hand, some Chess strategies are simple to describe. Here is a simple (but useless!) strategy in Chess. Number the squares from 1 to 64. Number the Chess pieces from 1 to 20 for each player. To make a move, select the piece with the smallest number that has a legal move and then make the legal move that lands on the square with smallest number. In fact, nowadays, strategies for Chess and other parlor games are sold in glossy shrink-wrapped packages - available at your nearest software retailer. Because a computer program that plays Chess is nothing more than one Chess strategy in the game theory sense of the word `strategy'. (This assumes the computer program will always make the same move from a given position. If the program makes some random decisions then the program represents a mixed strategy. If the program learns as it plays and changes its strategy then the program is an algorithm for selecting strategies based on experience.)

If a game has rules that limit play to a finite number of moves then there will be a finite number of strategies possible. Make a list of all strategies for each player. Use these two lists to label the rows and columns of a matrix. Fill in the entries of the matrix by playing each row against each column and recording the outcome. (In case the game has random moves, it is necessary to play the same strategies against each other many times and then average the results.) The result is the matrix form of the game, called the normal form in game theory. The normal form is `normalized' in the sense that all games are brought to the same form, namely a matrix, and are stripped of their idiosyncratic rules about 'boards' or 'pieces' or 'cards' and the like.

Exercise: Consider the game of Chess limited to just one move by White followed by just one move by Black. Clearly White has 20 possible opening moves and therefore 20 possible strategies. Black also has 20 possible opening moves. How many strategies does Black have? If Black's strategies were all written down in one or more books (produced in the style of telephone directories) how much (approximately) would these books weigh?

2.1.2  Entering the Game Matrix into J

For the game Even-Odd, the game matrix may be entered in several ways: by rows, by columns, or by reshaping a vector.

   eo =: 2 _5 ,: _5 8    NB. first row above second row
   eo =: 2 _5 ,. _5 8    NB. first col beside second col
   eo =: 2 2$2 _5 _5 8   NB. or 2 by 2 reshape of vector

Consider the following game `Think Small': Each player picks a number from 1 to 5. If the numbers are the same the payoff is zero. If the numbers are different then the player with the smaller number gets a payoff of 1 unless the smaller number is exactly one less than the larger number. In the latter case, the player with the larger number gets a payoff of 2. The game matrix is


One way to enter the game matrix into J is to build up a long vector of rows in convenient steps and then reshape at the end as shown below.

   M =: 0 _2 1 1 1   2 0 _2 1 1  _1 2 0 _2 1
   M =: M, _1 _1 2 0 1  _1 _1 _1 2 0
   M =: 5 5$M
 0 _2  1  1  1
 2  0 _2  1  1
_1  2  0 _2  1
_1 _1  2  0 _2
_1 _1 _1  2  0

Once the matrix is entered into the computer, calculations may be done on it. For example, <./>./M will give the minimum of the column maximums of M.

Calculating the Game Matrix

If the game matrix is large, we probably want to compute it rather than typing it in by hand. Let us start with a trivial example. Suppose Player A gets to choose a number from1 to 5 and Player B gets to choose a number from 1 to 7. The outcome to Player A is the sum of these two chosen numbers. (This is a silly game but it illustrates the method of computing the game matrix.) Then the game matrix is computed as an addition table.

   SA =: 1 2 3 4 5
   SB =: 1 2 3 4 5 6 7
   SA +/ SB
2 3 4 5  6  7  8
3 4 5 6  7  8  9
4 5 6 7  8  9 10
5 6 7 8  9 10 11
6 7 8 9 10 11 12

Now we can use this same type of calculation for any game matrix. Replace SA and SB with lists of strategies for the Players A and B respectively. Let g be a dyadic function that takes a strategy of Player A as a left argument and a strategy of Player B as a right argument and computes the outcome to Player A. Then SA g/ SB computes the game matrix.

Let us see how this works in the Think-Small game.

   SA =: 1 2 3 4 5
   SB =: 1 2 3 4 5
   g =: <                NB. first try
   SA g/ SB
0 1 1 1 1
0 0 1 1 1
0 0 0 1 1
0 0 0 0 1
0 0 0 0 0
   g =: >                NB. second try
   SA g/ SB
0 0 0 0 0
1 0 0 0 0
1 1 0 0 0
1 1 1 0 0
1 1 1 1 0
   g =: (<->)"0          NB. need to specify rank 0
   SA g/ SB
 0  1  1  1  1
_1  0  1  1  1
_1 _1  0  1  1
_1 _1 _1  0  1
_1 _1 _1 _1  0
   g =: ([=1:+])"0       NB. left=1+right
   SA g/ SB
0 0 0 0 0
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
   g =: ((<->)+3:*([=1:+])-(]=1:+[))"0 
   SA g/ SB
 0 _2  1  1  1
 2  0 _2  1  1
_1  2  0 _2  1
_1 _1  2  0 _2
_1 _1 _1  2  0

Note: We have shown a general approach to calculating a game matrix. But in any particular example, there may be some easier way of doing the calculation! You are not obliged to follow the method shown above.

2.1.3  Saddlepoints

For any game, the safest row to choose is the row whose minimum entry is maximal. The maximum of the row minima is called the maximin. It represents the security level of Player A. By selecting a row which has the maximum row minimum, Player A is sure of getting at least the maximin value no matter what Player B does.

Dually, Player B's safest column is the one for which the column maximum is minimized. This value is called the minimax (minimum of column maxima). Player B can hold Player A down to at most this value. Thus the maximin is a lower bound and the minimax is an upper bound on how much Player A can win. Thus we always have maximin minimax.

A game has a saddlepoint (and hence optimal pure strategy solutions for both players) if and only if the maximin = the minimax. Below are J functions for useful in this context.

   of =: @:
   max =: >./
   min =: <./
   rowmins =: min"1
   colmaxs =: max
   minimax =: min of colmaxs
   maximin =: max of rowmins
   sp =: maximin = minimax
   game1 =: 2 3$ 3 _2 0 _4 1 1
 3 _2  0
_4  1  1 
   maximin game1
   minimax game1
   sp game1        NB. result 1=true, 0=false
   game2 =: 3 4$11 _8 0 _11 5 3 1 9 2 7 _2 6
11 _8  0 _11
 5  3  1   9
 2  7 _2   6
   maximin game2
   minimax game2
   sp game2


2.1.5  The Tennis Serve Game

Suppose a tennis player has a big serve and a small serve. The big serve is not very reliable: it is good only 20% of the time. But when it is good, it is very effective. It wins a point 80% of the time if the receiver was standing back and 96% of the time if the receiver has moved up closer to the net. The server also has a reliable but weak serve. This serve is good 90% of the time but wins a point only 50% of the time when the receiver is standing back and only 30% of the time when the receiver has moved up. According to tennis rules, if the first serve is bad, the server may take a second serve. The problem is to find the best strategy for the server and also find the best strategy for the receiver. We also want to find the probability that the server will win a point.

The Game Matrix

The first step is to calculate the game matrix. The server has four strategies because on the first serve, either the big or the small serve can be used and then on the second serve (if necessary) either the big or the small serve can be used. Similarly, the receiver has four strategies because the receiver can take a back or up position for the first serve and has the same two choices for the second serve.

Exercise: Assume that the server's strategy choices are BB, BS, SB, SS. (For example the third strategy is to start with the small serve S and if it is a fault, use the big serve B as the second serve.) Assume the receiver's strategies are bb, bu, ub, uu. (For example, the second strategy is to play back for the first serve and, if there is a second serve, play up for that.) Compute the 16 entries in the 4 by 4 game matrix.

Solution: Let pi be the probability that the serve of type-i is good. (Thus pB = 0.2 and pS = 0.9.) Let qij be the probability that if the serve of type-i is good it will win a point for the server against a receiver in position-j. (Thus, for example, qBu = 0.96 and qSb = 0.5.) Then the formula for the probability of the server winning a point with circumstances of (i,i) against (j,j) is

The first term comes from the first serve. Then if the first serve is bad, with probability (1-pi), the second serve occurs.

Using this formula, you can compute the 16 entries of the game matrix. For example, if (i,i) = (B,S) and if (j,j) = (b,u) then the probability the server wins is

(0.2)(0.8)+(1-0.2)(0.9)(0.3) = 0.376

We can use J to calculate all the probability-server-wins entries at once, as is shown below. (Warning! This is not beginner's J so don't be alarmed if you don't understand some aspects of the result. For clarity, we show the calculation in pieces and display some partial results.)

   p =: 0.2 0.9
   q =: 0.8 0.96 ,: 0.5 0.3
0.8  0.96
0.5  0.3
   pq =: p*q
0.16 0.192
0.45  0.27
   psw =: pq + (1 - p,.p)*/pq
0.288 0.3136
 0.52  0.376

 0.32 0.3456
0.552  0.408

0.466 0.4692
0.495  0.477

0.286 0.2892
0.315  0.297
2 2 2 2
0.288 0.3136
 0.52  0.376
0.466 0.4692
0.495  0.477

 0.32 0.3456
0.552  0.408
0.286 0.2892
0.315  0.297
0.288 0.3136  0.32 0.3456
 0.52  0.376 0.552  0.408
0.466 0.4692 0.286 0.2892
0.495  0.477 0.315  0.297

In the above, psw (probability server wins) is an array of shape 2 by 2 by 2 by 2 that has all the correct probabilities. We then use some additional steps to rearrange this data into a 4 by 4 matrix.

The Solution of the Tennis Serve Game

2.1.6  Larger Game Matrices

For small matrices, entering the matrix as shown above is probably the best method. However, it is certainly possible to ask the computer to help create the game matrix. This can be useful for larger games.

If the game matrix is already known, but typing it into the computer is a major chore, then it may be possible to use J to provide some shortcuts. To illustrate this idea, we show how we could use some J constructions to enter the 3 by 3 version of the matrix above. We begin by defining n =: 3 and then use the variable n in the formulas. Thus, our method below could be used to enter the matrix for any version of this game where the players choose numbers from 1 to n for any value of n. The basic idea of the construction is to reshape a vector into a matrix. For example, 3 3$1 0 0 is a 3 by 3 matrix each row of which is (1,0,0). But 3 3$1 0 0 0 is a 3 by 3 matrix with first row (1,0,0), second row (0,1,0), and third row (0,0,1) - that is, a 3 by 3 identity matrix. Now we use a more complicated version of this trick. We start with a vector with n entries of -1 followed by (2,0,-2) followed by (n-1) entries of 1. We reshape this into an n by (1+2n) matrix. Finally, we take all n rows and the last n columns to get our desired game matrix.

   n =: 3
   [M =: (n,1+2*n)$(n$_1)(2 0 _2),(n-1)$1
_1 _1 _1  2  0 _2  1
 1 _1 _1 _1  2  0 _2
 1  1 _1 _1 _1  2  0
   [M =: (n,-n){.M
 0 _2  1
 2  0 _2
_1  2  0

In the above, we are using the verb [ at the left of each line to display M as we compute it; if we omit [ then M is defined properly but not shown on the display.

File translated from TEX by TTH, version 2.78.
On 2 Oct 2000, 15:26.