Subject: Re: EPR Approach to Intro Stat:  
             Relationships Between Variables

   Date: July 30, 1996 11:30 EDT

   From: "Donald Macnaughton" <donmac@matstat.com>
                     (formerly donmac@hookup.net)

     To: edstat-l@jse.stat.ncsu.edu

     Cc: "George Zeliger" <ZELIGER.G.197935@asqcnet.org>, 
         "Milo Schield" <schield@augsburg.edu>

On July 23, 1996 George Zeliger wrote

> "Donald Macnaughton" <donmac@hookup.net> wrote:
>
>> I agree that explanation (sometimes referred to as "under-
>> standing" or sometimes linked to the concept of "under-
>> standing") is an important goal for many *researchers*.
>  (snip)
>> However, I don't think that we should concentrate on the goals
>> of *researchers*.  Instead, I think we should concentrate on
>> the goals of empirical research *for society*.
>
> Aren't researches [sic] a part of society?  And in many cases,
> the best educated and most qualified (for political reasons, I
> don't say "most intelligent") part?

Empirical researchers are certainly an important part of society.
However, if we wish to maximize the social benefits of empirical
research (as opposed to maximizing the benefits to empirical
researchers), we should concentrate on the goals of empirical
research for society.  Of course, researchers (in their role as
members of society) have a special duty (in view of their extra
knowledge) to help society determine its goals.

>> We should concentrate on the goals for society because if re-
>> searchers are working to satisfy societal goals (which here
>> include commercial goals), the researchers are more likely to
>> get funded, and their work is more likely to make a useful
>> social contribution.
>
> That society has limited resources and, therefore, the funds
> will always be limited, is well known.  However, it seems that
> a dangerous idea is hidden in the above statement.
> Who determines what are societal goals?  Who evaluates whether
> the contribution is useful, and how much useful?  What is the
> relationship between moral principles and what appears to be
> "societal goals" -- that, let's admit it openly, usually are
> formulated by politicians in power?

Although the question of who determines *specific* societal goals 
is very important, I suggest that this question is not relevant 
for the present newsgroup/mail-list thread.  Here we are solely 
concerned with the relative importance of the following two *gen-
eral* goals of empirical research:  
- prediction (including control) and 
- explanation (including understanding). 

My purpose in discussing the relative importance of the two goals 
is to answer a July 16 post from Bryan Griffin, who criticized 
the EPR approach for ignoring the concept of explanation in favor 
of the concepts of prediction and control.


RESTATEMENT OF A THOUGHT EXPERIMENT
In my response to Griffin (Macnaughton 1996) I give an argument 
in terms of a thought experiment.  The goal of the argument is to 
demonstrate that prediction (control) is a more important socie-
tal goal than explanation (understanding).  George's post criti-
cizes this argument.  To help readers consider the criticism and 
evaluate my response, I now restate the thought experiment from 
the earlier post:

Which of the following two goals of empirical research is more 
important *for society*:  
- prediction (including control) or 
- explanation (including understanding)?

Consider a thought experiment:

a. Suppose you are offered a technique that yields substantially 
   more accurate predictions (or controls) in some area of life 
   than are now available.  But suppose the technique provides 
   *absolutely* no explanation or understanding of how the pre-
   dictions work or how the phenomena under study work.  If you 
   can use the predictions to your benefit, you will probably be 
   very interested in using them even though you don't understand 
   how they work.  (Of course, most people would prefer that the 
   prediction ability be *accompanied* by explanations and under-
   standing because the explanations and understanding would make 
   us more comfortable that the predictions are reliable.  But if 
   we can't have the explanations and understanding, so be it--we 
   will still be pleased to take advantage of the predictions.)  
   Thus accurate predictions (or controls) have high societal 
   value, regardless of whether they are accompanied by explana-
   tions or understanding.

b. Suppose we somehow know that certain explanations (or under-
   standing), although correct about the past, contain *abso-
   lutely* no information relevant to the prediction or control 
   of variables in the future.  (Thus the explanations would ap-
   pear to be of no practical use.)  Most practical people (and 
   most funding agencies) will not be interested in these expla-
   nations and will be unwilling to support efforts to obtain
   further such explanations.  That is, explanations (or under-
   standing) with no hope of future prediction or control have
   low societal value.  

Since accurate predictions have high societal value *regardless* 
of whether they are accompanied by explanations, but correct ex-
planations have low societal value *unless* they are accompanied 
by accurate predictions (or hope of predictions), therefore pre-
diction (control) is more important to society than explanation 
(understanding).

Therefore, it is reasonable to emphasize prediction and control 
of the values of variables in the introductory statistics course.  


GEORGE'S CRITICISM OF THE THOUGHT EXPERIMENT
George begins his criticism of the thought experiment as follows:

>>     ( snip )
>> a. Suppose you are offered a technique that yields
>>    substantially more accurate predictions (or controls) in
>>    some area of life than are now available.
>
> How do you know that the technique _does_ yield more accurate
> predictions, if you don't understand how it works?  From
> experiments?  But how do you know that the next experiment
> won't completely destroy your previous opinion about the
> technique?

In the thought experiment, knowing *how* you know that the
technique yields more accurate predictions is unnecessary.  That
is, the point of paragraph a. is to demonstrate the truth of the
conditional proposition
                         If P then Q
where
P = "a new technique is available that yields substantially more
    accurate predictions with no hope of explanation"
and
Q = "most people will be interested in using the new technique".

The point is *not* to discuss how to determine the truth of prop-
osition P.  The question of how to determine the truth of propo-
sition P is irrelevant for determining the truth of the condi-
tional statement "If P then Q".  

(I expect readers to determine the "truth" of the conditional 
statement "If P then Q" by simply using their intuition.  Alter-
natively, we can formalize the determination of the truth of the 
statement by doing an appropriate survey of people.)

>      ( snip )
>> b. Suppose you somehow know that certain explanations (or
>>    understanding), although correct about the past, contain
>>    *absolutely* no information relevant to the prediction or
>>    control of variables in the future.
>
> Could you please give at least one real life example?  

It is difficult to give a completely realistic example because it 
is impossible to *know* that either
- a certain explanation is correct about the past or
- a certain explanation contains absolutely no information rele-
  vant to prediction or control of a variable in the future.

Nevertheless, consider an example from medicine where it might be 
somehow known that a certain virus caused a disease in the past 
but it is somehow also known that the virus was eradicated, and 
nothing could be learned about the virus or its eradication that 
would be useful in the future.  

Note that the unrealistic nature of this example does not invalidate
my argument.  That is, paragraph b. asks the reader (in a way 
that is parallel to paragraph a.) to consider the truth of the 
conditional statement 
                          If R then S
where
R = "it is known that an explanation, although correct about the 
    past, contains absolutely no help in predicting in the
    future"
and
S = "most people will not be interested in the explanation, and
    will not be interested in supporting efforts to obtain fur-
    ther such explanations".

As I discussed above for paragraph a., when determining the truth 
of the conditional statement it is not necessary to know how to 
determine the truth of the antecedent proposition (proposition R) 
in the conditional statement.


ARE EXPLANATIONS AND UNDERSTANDING THE SAME?
> Explanations and understanding are not the same; 

Whether explanations and understanding are the same depends on
how one chooses to define the two concepts.  However, I don't
think it's necessary to discuss the definitions because I believe
my argument works with most common definitions, regardless of
whether they make the two concepts the same as or different from
each other.


"REAL" UNDERSTANDING LEADS TO PREDICTION
> real understanding _always_ provide some knowledge concerning
> prediction -- sometimes, that our understanding is not suffi-
> cient for reliable prediction, but in such cases we always get
> an idea as to in what direction to continue our research [sic].

I shall assume that George means the following:

Real understanding _always_ provides some knowledge that can lead 
to predictions, although the knowledge may sometimes be insuffi-
cient for highly accurate predictions.  If understanding provides 
inaccurate predictions, it will always suggest a direction for 
further research that will allow us to increase the accuracy of 
the predictions.

What is "real" understanding?  How should we distinguish "real" 
understanding from other understanding?  

In characterizing real understanding we can't appeal to some
"true" reality that is "out there" and say that real
understanding is understanding that correctly reflects that
reality.  We can't take this approach because there is no known
way of determining what the true reality is.  Thus we can never
tell whether we have real understanding, and the concept becomes
strictly theoretical, never empirically verifiable, and therefore
of little use except in philosophical discussions.

The only way I know of determining the "realness" of understand-
ing is through the ability of the understanding to make accurate 
predictions.  Thus the essence of real understanding would seem 
to be the ability to make accurate predictions.  Given this, it 
is only a short step to conclude that the *goal* of all under-
standing is to make accurate predictions (and exercise accurate 
control).  

If the goal of understanding is to make accurate predictions,
then understanding is subordinate to prediction.

If the goal of understanding is *not* to make accurate predic-
tions (and exercise accurate control), then what is the goal of 
understanding?  

(It seems unlikely that understanding is an end in itself because
natural selection, as it applies to shaping human priorities,
would not define a human goal that is an end in itself.  Instead,
all important human goals can be expected to have
survival value--either for individuals or for society.  Making 
accurate predictions [and exercising accurate control] has obvi-
ous survival value, both for individuals and for society.)  


RETURN TO THE THOUGHT EXPERIMENT
George next quotes the conclusion of the thought experiment and 
cites an interesting apparent counterexample:

>> Since accurate predictions have high societal value *regard-
>> less* of whether they are accompanied by explanations, but
>> correct explanations have low societal value *unless* they
>> are accompanied by accurate predictions (or hope of predic-
>> tions), therefore prediction (control) is more important to
>> society than explanation (understanding).
>
> As is well known, [the] Ptolemaic system was maybe difficult
> for understanding and explanation, but quite good for predic-
> tion.  So, from Donald's point of view, Copernicus was waisting
> [sic] time (fortunately for us, the society, he had his own
> funds),

Copernicus' work is an interesting special case, because (as 
George suggests) the Copernican system and the Ptolemaic system 
give equally accurate predictions.  The only advantage of the 
Copernican system over the Ptolemaic system is that the 
Copernican system is simpler:  it reduces the number of circles 
required to describe the apparent movement of the heavens from 
around eighty to forty-eight (Mason 1962, 129-131).

Since the Copernican system does not give more accurate predic-
tions than the Ptolemaic system, society does not value 
Copernicus' work for improved prediction ability.  Instead, we 
value his work because it gives us a simpler model.  The
preference for the simpler of two competing models is a
reflection of the principle of parsimony, which states that if
two competing models make equally accurate predictions, then (for
economy of mental labor) we should choose the simpler of the two
models.
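
The principle of parsimony as stated above can be sketched as a
simple selection rule.  The following is only an illustration:
the names Model and choose_model are hypothetical, the circle
counts are the ones cited from Mason, and the equal error value
is a placeholder assumption, not a measured quantity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    prediction_error: float  # smaller is better; arbitrary units
    complexity: int          # here, the number of circles in the model


def choose_model(models, tolerance=0.0):
    """Prefer accuracy first; among the models whose error is within
    `tolerance` of the best error, return the simplest one."""
    best_error = min(m.prediction_error for m in models)
    candidates = [m for m in models
                  if m.prediction_error - best_error <= tolerance]
    return min(candidates, key=lambda m: m.complexity)


# Assumption: both systems are given the same placeholder error,
# reflecting the claim that they predict equally well.
ptolemaic = Model("Ptolemaic", prediction_error=1.0, complexity=80)
copernican = Model("Copernican", prediction_error=1.0, complexity=48)

print(choose_model([ptolemaic, copernican]).name)
```

Note that in this sketch simplicity only breaks ties:  a more
complex model that predicts more accurately is still chosen over
a simpler, less accurate one.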

Copernicus' brilliant stroke was to transform the origin of the 
coordinate system from the earth to the sun, thereby deriving a 
substantially simpler system.  Such transformations occur rela-
tively infrequently in science, but are certainly important.  

On the other hand, a much more frequent occurrence is the discov-
ery of a previously unknown relationship between variables, or 
the discovery of a new aspect of a known relationship, both of 
which increase our prediction ability.  (Of course, both the 
Ptolemaic and Copernican systems are systems of relationships be-
tween variables, where the variables represent the positions of 
the sun, the stars, the planets, and time.)

Continuing the same sentence in which he refers to Copernicus
wasting time, George writes:

> Galileo struggled and suffered in vain, Einstein was nothing
> but a funny guy.  What to say about Sir Isaac?  When he managed
> the Royal Mint, he was OK, but before that he stupidly scrib-
> bled paper.  
> In fact, all those giants' scientific deeds began with attempts
> to _understand_ -- which eventually lead to brilliant predic-
> tions and unprecedented opportunities to control.

As I say at the beginning of my July 18 post, I agree that under-
standing is an important goal for many researchers.  However, I 
don't believe that society reveres great scientists for the *un-
derstanding* that they provide.  Instead, we revere them because 
they have given us the ability to make brilliant predictions, and 
they have given us unprecedented opportunities for control.

For example, Galileo taught us how to correctly predict the speed
of a falling body as a function of time (v = gt), Einstein taught
us how to correctly predict the amount of energy equivalent to
the mass of a body (E = m c-squared), and Newton taught us how to
correctly predict the acceleration of a body as a function of its
mass and the applied force (f = ma).  Similarly, we value other
important contributions of these scientists because they give us 
the ability to predict or control.
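
To make the point concrete, the three relationships named above
can each be written as a small prediction function.  This is only
an illustrative sketch:  the function names and the rounded value
used for g are my assumptions, and the falling-body formula
ignores air resistance.

```python
# Each function predicts the value of one variable from the values
# of others, using a relationship between variables.

G = 9.81           # m/s^2, approximate gravity near the earth's surface
C = 299_792_458.0  # m/s, speed of light in a vacuum


def falling_speed(t_seconds: float) -> float:
    """Galileo: speed of a body falling from rest, v = g t."""
    return G * t_seconds


def energy_of_mass(mass_kg: float) -> float:
    """Einstein: energy equivalent of a mass, E = m c^2."""
    return mass_kg * C ** 2


def acceleration(force_newtons: float, mass_kg: float) -> float:
    """Newton: f = m a, rearranged to predict a = f / m."""
    return force_newtons / mass_kg


if __name__ == "__main__":
    print(falling_speed(2.0))       # speed in m/s after 2 seconds
    print(energy_of_mass(0.001))    # joules equivalent to one gram
    print(acceleration(10.0, 2.0))  # m/s^2 for 10 N applied to 2 kg
```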

What is predicted and controlled is the values of variables
(values of properties of entities).  The prediction and control
take place through knowledge of relationships between variables.

(Consider all the physics equations that Galileo, Einstein or 
Newton ever wrote.  Are these not all prediction equations in the 
form of statements of relationships between variables?)  

If the work of Galileo, Einstein, and Newton had not led to the 
ability to predict and control, then (regardless of the amount of 
explanation and understanding they provided) I believe that soci-
ety would not be much interested in their work.  

That is, the bottom line of empirical research for society is
*not* explanation or understanding, but prediction and control.

>> Finally, it's important to note that explanations and under-
>> standing are sometimes very helpful as middle steps in discov-
>> ering how to predict or control the values of variables.  So
>> I'm not saying that explanation and understanding are unimpor-
>> tant.
>
> Yes, you are saying that -- implicitely, [sic] but clearly (I
> really like your "sometimes"), and your next statement confirms
> it:

In retrospect, I should have used the word "often" instead of the 
word "sometimes".  I reiterate that I am not saying that explana-
tion and understanding are unimportant--I am only saying that 
they are subordinate goals to the goal of prediction and control.

>> However, in view of the arguments above, I maintain that, from
>> the point of view of society, the goals of explanation and
>> understanding are subordinate to the goals of prediction and
>> control.
>
> Why do you think that what you are saying really represents the
> point of view of society?  

I base my conclusions on the argument in the thought experiment 
above.


IMPLICATIONS
The field of statistics holds the keys to a powerful set of gen-
eral tools for predicting and controlling.  Thus if I'm right 
that prediction and control are society's bottom line for empiri-
cal research, then this observation has great import for the 
field.  If we properly emphasize the concepts of predicting and 
controlling in the introductory statistics course, we can dra-
matically change students' perception of the field of statistics.  
The change is from viewing the field as "the worst course taken 
in college" to viewing it as a fascinating cornerstone of all em-
pirical research, including all scientific research.


LINK
The above points are part of a broader discussion of an approach 
to the introductory statistics course available at

                   http://www.matstat.com/teach/

-----------------------------------------------------------
Donald B. Macnaughton      MatStat Research Consulting Inc.
donmac@matstat.com         Toronto, Canada
-----------------------------------------------------------


REFERENCES
Macnaughton, D. B. (1996) "Response to comments by Brian
  Griffin."  Posted to the sci.stat.edu UseNet newsgroup (= e-
  mail list EdStat-L) under the title of "Re: EPR Approach to
  Intro Stat: Relationships Between Variables" on July 18, 1996.
  Available at the web site above.
Mason, S. F. (1962) _A History of the Sciences._ New York:
  Collier Books.


ACKNOWLEDGMENT
I thank Milo Schield for probing questions that stimulated some 
of the ideas in this post.