Repair of (8.11)

The phenomenon of ``capture of free variables'' makes it necessary to change our definition of ``(con)textual substitution'' in predicate logic. The authors make a horrible mess of this with (8.11). We will fix 8.11 here.

Wffs and sentences

We do not need this for the repair of 8.11, but it must be mentioned somewhere, so here goes:

Here is a definition we could have mentioned earlier:

A boolean expression is also called a formula, or a well-formed formula, or just a wff.

A wff that has no free occurrences of object variables (i.e. nonboolean variables) in it is called a sentence. (It is also sometimes called a "closed formula".)

This fits with our use of "sentence" in propositional logic -- there, we had only boolean variables, so there were no occurrences of object variables at all, free or bound, back then.


We have seen that one only wants, ever, to substitute for free occurrences of object variables in expressions (the bound occurrences in some sense are not even "really there").

One wants (9.13), Instantiation,

For any boolean expression P, 
and any expression e and variable x 
of the same nonboolean syntactic type,

              (\forall x  P)  ==>  P[x:=e]       
to be a theorem scheme, and, of course, therefore, all such expressions to be ``valid formulas'' (i.e., T in all interpretations). (See below for a discussion of 9.13.) But if we blindly take ``substitute'' to mean ``mechanically replace all occurrences of the variable, whether free or not'', then we very quickly end up in absurdity.

Example 1:

In (9.13), take P to be the wff (the range is "true", so we have left it out and also left out the |: ).

              (\exists x   x = 3) . 
Then the antecedent in (9.13) is just this already "closed" formula quantified again universally over x . (Semantically, this second quantification does nothing.) But if we take the expression e in the consequent of (9.13) to be 5, and if we do the substitution in a bonehead, unthinking way, replacing all occurrences of x by 5, whether free or not, in P, then we get that the following is to be a theorem!:
      (\forall x|: (\exists x  x=3))   ==>   (\exists x  5=3) 
Now in the usual interpretation in the semantic type INTEGER', the antecedent here just says there is an integer that is 3, which most of us would agree with. The consequent we would not agree with....


More subtle is the phenomenon of ``capture'' of free variables upon unthinking substitution. In many traditional logics, our Theorem (9.13), Instantiation, is taken as an axiom:
    (\forall x |: P)  ==>  P[x:=e]
Semantically this just says:
"If something holds for ALL x, then it
holds for any particular ``value'' of x." 

But if the substitution indicated here is done naively, then we get nonsense. Here is an example (this can be found in Enderton's A Mathematical Introduction to Logic):

Example 2:

Let  P  be   ¬(\forall y   x=y) .
Then the antecedent in our (9.13) becomes a wff that we would interpret roughly thus:

For all elements x of the universe of discourse (the UD, the set we are interested in and are interpreting our wff's in), it is NOT the case that for all elements y of the UD, x is y.

Now, this is clearly a true statement if the UD has at least two elements in it (an UD always has at least one element in it, by definition -- though GS do not talk about UD's...), i.e., almost always this antecedent interprets as ``T''.

But the consequent, if we do the substitution carelessly, with e being the perfectly good expression y, says

   ¬(\forall y    y=y) , 
and with the usual interpretation of = as equality of elements in an UD, this statement is NEVER ``T''. So (9.13) ends up being almost always F....


What has happened here is that the free occurrence of x in P sits in the ``scope'' of the quantifier \forall y , so that when we replace that x by the expression e, the free (i.e., real! as opposed to a bound occurrence, which is not "really" there) occurrence of y in the expression e (i.e., in the expression y) ends up becoming bound by the quantifier in P after the substitution. ``Capture'' has taken place. Semantic damage has been done.


Enderton does not spend any time/space explaining exactly why ``capture'' is bad IN GENERAL, and I won't pretend that I understand that. In the above example, we should be able to use any e and substitute carelessly, provided that the e does not contain free occurrences of y. (Somehow the ``y'' in the universal quantification over y isn't really y; it is just a placeholder. Allowing the capture as above clearly does semantic damage; I cannot articulately explain exactly WHY this should be. I think it has to do with allowing a y that is not really a y to come in contact, in P[x:=e], with a y that really is a y, but ceases to be once it lands in the scope of dummy y.)


Anyway, here is the repair of (8.11).

We define ``contextual substitution'' (i.e., ``safe'' substitution, i.e., ``sensitive'' substitution, i.e., substitution done with a view to the context in which it is done, not just blindly) recursively (or is this inductively?...) as follows:

For any variable z (I should say bold z but will not) and expression E of the same syntactic type as z,

c[z:=E] is c (for any constant symbol c);

y[z:=E] is    y,  if z is not y,  and is  
            
              E,   if z is y   (y a variable);

f(P_1, ... , P_n)[z:=E]    is   f(P_1[z:=E], ... , P_n[z:=E]),

where   f   is  an n-place function symbol.

Note that the f here could be one of our five boolean connectives, and could be written as prefix or infix connective, for instance, if f is of type BOOL.

The only hard part of the definition of contextual substitution is explaining how it works with respect to quantification:

Let y be an object variable , R, P, E be expressions, R of syntactic
type BOOL, x a variable, E an expression of the same syntactic type 
as x (I do not use F because of F's being a truth value).

Let * be a quantifier of the right syntactic type (* is of course 
either the universal or the existential quantifier, if P is of type
BOOL). 

Then the contextual substitution  (*y | R: P)[x:=E]

is defined as follows:

(Denote the expression  (*y | R: P)   by   Q.)



(i) It is just Q, if x does not occur free (d.n.o.f.) in Q.


In particular, if x is y, then the above substn. by definition
does NOTHING. But quite generally, it is not a question of whether
x is y or not -- the first question to be answered is:

Does x occur freely in Q? If the answer is ``No'', then the substn.
above does NOTHING to Q. (This makes perfect semantic sense; see the
discussion a few dozen lines above.)

Notice that this is already different from the
horrible (8.11) and the examples following it in the book. We
recommend essentially ignoring the book and those examples.
There are further mistakes in the examples right after 8.11.
(Missing parentheses.)


(ii) If x DOES occur freely in Q, 
        but y does not occur freely in E,

        then   Q[x:=E]   is recursively defined to be 
     
        (*y|  R[x:=E]:  P[x:=E]) .


Here, what we are saying is that at least as far as the object
variable y is concerned, no capture will take place if we push the
substitution past the quantifier *y into the range and body of the
quantification. Of course, in executing THOSE substitutions, one 
again must worry about capture with respect to OTHER possible
quantifications.


(iii) If x occurs free in Q and y occurs free in E

      (so that capture WOULD
       take place upon careless ``substitution'' 
       of E for free occurrences of x),           then

               Q[x:=E] is by definition 

           (*w| R[y:=w][x:=E] :  P[y:=w][x:=E]) ,


where w is an object variable of the same type as y, THAT DOES NOT
APPEAR AT ALL (even just as a dummy in a quantifier) in 
either Q or E. The book calls the w a ``fresh variable'' -- but
I am being even fresher here than the book -- I want to avoid all
appearances in my expressions, not just all occurrences.

It is worrisome that the particular w to choose above here is
left completely open -- we no longer have a well-defined
operation of contextual substn. There are solutions to this,
to be found in certain logic texts, but they all seem too
technical to bother our students with. One should just remark
that different  choices of w lead to different expressions,
but that all such are syntactically equivalent (i.e., if L and M
are two expressions arising from Q above by different choices of
fresh w, then   L = M   is a theorem  (here = is the equality symbol
for the syntactic type of Q, which of course is   == ,  
if Q is of type  BOOL).

The idea here is that we allow the substitution to be pushed past the
quantifier *y, provided we first change the dummy y, that is causing
capture, into a dummy w that cannot conceivably cause any
capture.

Note that the [y:=w] above in two places will, because of the 
choice of w from ``out of left field'', never involve further 
influxes of fresh variables; it can be done mechanically acc. to
the recursive definition of contextual substn. 

With the book's horrible, vague, incomplete 8.11, one does not
even have that for a variable x and an expression P, the 
expression P[x:=x] is just P. And yet the authors clearly 
tacitly assume this in their "proof" of 9.16, at the 
bottom of page 162.
This situation is intolerable. 

With OUR version of 8.11, however, we do have 
that   P[x:=x]   is just  P , for any expression P.

One merely 
notes that if "y"  in the above discussion is x, then the 
substn. does nothing (case (i)), and if it is not x, then we
are in case (ii), since y d.n.o.f. in x in this case, so we 
push the substitution past the quantifier and we are done by 
induction.

Examples:

(i)  (\exists z  ( C(z,w) /\ (\forall x  B(x)) ) )  [x:= 43*u + 19]    

                    is just

     (\exists z  ( C(z,w) /\ (\forall x  B(x)) ) ) ,

because x dnof in the above expression, so we are in case (i) 
of the definition.
     

(ii) (\forall j|  i < j : D(i)) [i:= k+2]   is just

     (\forall j|  k+2 < j : D(k+2)) ,

because i does occur free(ly) in the quantification (twice),
but j dnof in  k+2, so no capture occurs if we slip the
substitution past the quantifier into the range and body 
separately and substitute there; this we have done above.
We are in case (ii) of the definition.

Here, D is a one-place predicate letter, i.e., syntactically
takes (say) integer input and produces boolean output. If you
don't feel comfortable with D(i), replace it with something like
" i > 19 "  or  " i = m+3 ".


(iii)  (\forall j|  i < j :   D(i))  [i:= j+2]   is

       (\forall n|  (i < n)[i:= j+2] :   D(i)[j:=n][i:= j+2]),

       which is

       (\forall n|  j+2 < n : D(j+2))  --

Now we are in case (iii) of the definition. (Verify this.)
So we need a fresh variable (I picked n).
Here, replacing j by n in both range and body above was easy
because they did not themselves involve further quantifications.
The n here appeared nowhere in the quantification 
into which I was substituting, and did not appear
at all in the expression being substituted.




Here is the way I currently think of substitution in a quantification. The quantified expression (*y| R: P) is a little world unto itself. There are a particular quantifier * and a particular dummy y at the "door".

Suddenly a visitor [x:=E] knocks at the door. First question the doorman asks is, "What is your variable"? The visitor says "x". The doorman turns to the people in the little room and says, "Any free occurrences of x here?"

If the answer comes back "No", then the doorman eats the visitor and nothing changes inside the little room. (Or he sends the visitor away and the room remains unchanged, if you object to cannibalism.)

If the answer comes back "Yes," then the doorman asks the visitor what his E is. If the E has no free occ's of y in it, the doorman smiles and beckons the visitor in and tells him to go to the doors of both rooms R and P (where he should be ready to answer the same questions before being allowed to pass through THEIR doors, if they have doors (i.e., if R or P ITSELF involves quantified expressions). (This is case (ii) above.)

If instead the E has free occ's of y in it, the doorman again smiles but says, "Pardon me, but we have servants (dummy variables, with duties to perform) inside who look just like some of the people in your group E, and we would not want you to get emotionally involved with any of them; that can get messy. Please wait while we fire our present staff and hire new ones (fresh variable w for the dummy y, and for all (free) occurrences of that dummy in R and P) to avoid this problem." Then he beckons the visitor in and directs him to the rooms with the freshly hired staff in them, where he must be prepared for the same questioning if further quantifications occur in R or P.