## Repair of (8.11)

The phenomenon of ``capture of free variables'' makes it necessary to change our definition of ``(con)textual substitution'' in predicate logic. The authors make a horrible mess of this with (8.11). We will fix 8.11 here.

### Wffs and sentences

We do not need this for the repair of 8.11, but it must be mentioned somewhere, so here goes:

Here is a definition we could have mentioned earlier:

A boolean expression is also called a formula, or a well-formed formula, or just a wff.

A wff that has no free occurrences of object variables (i.e. nonboolean variables) in it is called a sentence. (It is also sometimes called a "closed formula".)

This fits with our use of "sentence" in propositional logic -- there, we had only boolean variables, so there were no occurrences of object variables at all, free or bound, back then.

We have seen that one only wants, ever, to substitute for free occurrences of object variables in expressions (the bound occurrences in some sense are not even "really there").

One wants (9.13), Instantiation,

```For any boolean expression P,
and any expression e and variable x
of the same nonboolean syntactic type,

(\forall x  P)  ==>  P[x:=e]
```
to be a theorem scheme, and, of course, therefore, all such expressions to be ``valid formulas'' (i.e., T in all interpretations). (See below for a discussion of 9.13.) But if we blindly take ``substitute'' to mean ``mechanically replace all occurrences of the variable, whether free or not'', then we very quickly end up in absurdity.

Example 1:

In (9.13), take P to be the wff (the range is "true", so we have left it out and also left out the |: ).

```              (\exists x   x = 3) .
```
Then the antecedent in (9.13) is just this already "closed" formula quantified again universally over x . (Semantically, this second quantification does nothing.) But if we take the expression e in the consequent of (9.13) to be 5, and if we do the substitution in a bonehead, unthinking way, replacing all occurrences of x by 5, whether free or not, in P, then we get that the following is to be a theorem!:
```      (\forall x|: (\exists x  x=3))   ==>   (\exists x  5=3)
```
Now in the usual interpretation in the semantic type INTEGER', the antecedent here just says there is an integer that is 3, which most of us would agree with. The consequent we would not agree with....

More subtle is the phenomenon of ``capture'' of free variables upon unthinking substitution. In many traditional logics, our Theorem (9.13), Instantiation, is taken as an axiom:
```    (\forall x |: P)  ==>  P[x:=e]
```
Semantically this just says:
```"If something holds for ALL x, then it
holds for any particular ``value'' of x."

But if the substitution indicated here is done naively, then we get
nonsense. Here is an example (this can be found in Enderton's
A Mathematical Introduction to Logic):

Example 2:

Let  P  be   ¬(\forall y   x=y) .

Then the
antecedent in our (9.13) becomes a wff that we would interpret
roughly thus:

For all elements x of the universe of discourse (the UD, the set
we are interested in and are interpreting our wff's in), it is NOT
the case that for all elements y of the UD, x is y.

Now, this is clearly a true statement if the UD has at least two
elements in it (an UD always has at least one element in it, by
definition -- though GS do not talk about UD's...), i.e., almost
always this antecedent interprets as ``T''.

But the consequent, if we do the substitution carelessly, with e being
the perfectly good expression y, says
¬(\forall y    y=y) ,

and with the usual interpretation of =
as equality of elements in an UD, this statement is NEVER ``T''.
So (9.13) ends up being almost always F....

What has happened here is that the free occurrence of x in P sits
in the ``scope'' of the quantifier \forall y , so that when we
replace that x by the expression e, the free (i.e., real!
as opposed to a bound occurrence, which is not "really" there)
occurrence of y in the expression e (i.e., in the expression y)
ends up becoming bound by the quantifier in P after the substitution.
``Capture'' has taken place. Semantic damage has been done.

Enderton does not spend any time/space explaining exactly why ``capture''
is bad IN GENERAL, and I won't pretend that I understand that. In the
above example, we should be able to use any e and substitute
carelessly,
provided that the
e does not contain free
occurrences of y. (Somehow the ``y'' in the universal quantification
over y isn't really y; it is just a placeholder. Allowing the capture
as above clearly does semantic damage; I cannot articulately explain
exactly WHY this should be. I think it has to do with allowing
a y that is not really a y to come in contact, in P[x:=e], with a
y that really is a y, but ceases to be once it lands in the scope of
dummy y.)

Anyway, here is the repair of (8.11).

We define ``contextual substitution'' (i.e., ``safe'' substitution,
i.e., ``sensitive'' substitution, i.e., substitution done with a view
to the  context in which it is done, not just blindly)
recursively
(or is this inductively?...) as follows:

For any variable z (I should say bold z but will not) and expression E
of the same syntactic type as z,
c[z:=E] is c (for any constant symbol c);

y[z:=E] is    y,  if z is not y,  and is

E,   if z is y   (y a variable);

f(P_1, ... , P_n)[z:=E]    is   f(P_1[z:=E], ... , P_n[z:=E]),

where   f   is  an n-place function symbol.

Note that the f here could be one of our five boolean connectives,
and could be written as prefix or infix connective,
for instance, if f is of type BOOL.

The only hard part of the definition of contextual substitution is
explaining how it works with respect to quantification:
Let y be an object variable , R, P, E be expressions, R of syntactic
type BOOL, x a variable, E an expression of the same syntactic type
as x (I do not use F because of F's being a truth value).

Let * be a quantifier of the right syntactic type (* is of course
either the universal or the existential quantifier, if P is of type
BOOL).

Then the contextual substitution  (*y | R: P)[x:=E]

is defined as follows:

(Denote the expression  (*y | R: P)   by   Q.)

(i) It is just Q, if x does not occur free (d.n.o.f.) in Q.

In particular, if x is y, then the above substn. by definition
does NOTHING. But quite generally, it is not a question of whether
x is y or not -- the first question to be answered is:

Does x occur freely in Q? If the answer is ``No'', then the substn.
above does NOTHING to Q. (This makes perfect semantic sense; see the
discussion a few dozen lines above.)

Notice that this is already different from the
horrible (8.11) and the examples following it in the book. We
recommend essentially ignoring the book and those examples.
There are further mistakes in the examples right after 8.11.
(Missing parentheses.)

(ii) If x DOES occur freely in Q,
but y does not occur freely in E,

then   Q[x:=E]   is recursively defined to be

(*y|  R[x:=E]:  P[x:=E]) .

Here, what we are saying is that at least as far as the object
variable y is concerned, no capture will take place if we push the
substitution past the quantifier *y into the range and body of the
quantification. Of course, in executing THOSE substitutions, one
again must worry about capture with respect to OTHER possible
quantifications.

(iii) If x occurs free in Q and y occurs free in E

(so that capture WOULD
take place upon careless ``substitution''

Q[x:=E] is by definition

(*w| R[y:=w][x:=E] :  P[y:=w][x:=E]) ,

where w is an object variable of the same type as y, THAT DOES NOT
APPEAR AT ALL (even just as a dummy in a quantifier) in
either Q or E. The book calls the w a ``fresh variable'' -- but
I am being even fresher here than the book -- I want to avoid all
appearances in my expressions, not just all occurrences.

It is worrisome that the particular w to choose above here is
left completely open -- we no longer have a well-defined
operation of contextual substn. There are solutions to this,
to be found in certain logic texts, but they all seem too
technical to bother our students with. One should just remark
that different  choices of w lead to different expressions,
but that all such are syntactically equivalent (i.e., if L and M
are two expressions arising from Q above by different choices of
fresh w, then   L = M   is a theorem  (here = is the equality symbol
for the syntactic type of Q, which of course is   == ,
if Q is of type  BOOL).

The idea here is that we allow the substitution to be pushed past the
quantifier *y, provided we first change the dummy y, that is causing
capture, into a dummy w that cannot conceivably cause any
capture.

Note that the [y:=w] above in two places will, because of the
choice of w from ``out of left field'', never involve further
influxes of fresh variables; it can be done mechanically acc. to
the recursive definition of contextual substn.

With the book's horrible, vague, incomplete 8.11, one does not
even have that for a variable x and an expression P, the
expression P[x:=x] is just P. And yet the authors clearly
tacitly assume this in their "proof" of 9.16, at the
bottom of page 162.
This situation is intolerable.

With OUR version of 8.11, however, we do have
that   P[x:=x]   is just  P , for any expression P.

One merely
notes that if "y"  in the above discussion is x, then the
substn. does nothing (case (i)), and if it is not x, then we
are in case (ii), since y d.n.o.f. in x in this case, so we
push the substitution past the quantifier and we are done by
induction.

Examples:

(i)  (\exists z  ( C(z,w) /\ (\forall x  B(x)) ) )  [x:= 43*u + 19]

is just

(\exists z  ( C(z,w) /\ (\forall x  B(x)) ) ) ,

because x dnof in the above expression, so we are in case (i)
of the definition.

(ii) (\forall j|  i < j : D(i)) [i:= k+2]   is just

(\forall j|  k+2 < j : D(k+2)) ,

because i does occur free(ly) in the quantification (twice),
but j dnof in  k+2, so no capture occurs if we slip the
substitution past the quantifier into the range and body
separately and substitute there; this we have done above.
We are in case (ii) of the definition.

Here, D is a one-place predicate letter, i.e., syntactically
takes (say) integer input and produces boolean output. If you
don't feel comfortable with D(i), replace it with something like
" i > 19 "  or  " i = m+3 ".

(iii)  (\forall j|  i < j :   D(i))  [i:= j+2]   is

(\forall n|  (i < n)[i:= j+2] :   D(i)[j:=n][i:= j+2]),

which is

(\forall n|  j+2 < n : D(j+2))  --

Now we are in case (iii) of the definition. (Verify this.)
So we need a fresh variable (I picked n).
Here, replacing j by n in both range and body above was easy
because they did not themselves involve further quantifications.
The n here appeared nowhere in the quantification
into which I was substituting, and did not appear
at all in the expression being substituted.

Here is the way I currently think of substitution in a quantification.
The quantified expression (*y| R: P)
is a little world unto itself. There are a
particular quantifier * and
a particular dummy y
at the "door".
Suddenly
a visitor [x:=E] knocks at the door. First question the doorman asks
is, "What is your variable"? The visitor says "x". The doorman turns
to the people in the little room and says, "Any free occurrences of x
here?"

If the answer comes back "No", then the doorman eats the
visitor and nothing changes inside the little room. (Or he sends the
visitor away and the room remains unchanged, if you object to
cannibalism.)
If
the visitor what his E is. If the E has no free occ's of y in it, the
doorman smiles and beckons the visitor in and tells him to go to the
doors of both rooms R and P (where he should be ready to answer the
same questions before being allowed to pass through THEIR doors, if
they have doors (i.e., if R or P
ITSELF involves
quantified expressions).
(This is case (ii) above.)
If instead the E has free occ's of y in it, the doorman again
smiles but says, "Pardon me, but we have servants
(dummy variables, with duties to perform)
inside who look
just like some of the people in your group E,
and we would not want you to get emotionally involved
with any of them; that can get messy. Please wait while we fire our
present staff and hire new ones (fresh variable w
for the dummy y, and for
all (free)
occurrences of that dummy in R and P) to avoid this
problem." Then he beckons the visitor
in and directs him to the rooms with the freshly hired staff in them,
where he must be prepared for the same questioning if further
quantifications occur in R or P.

```