[Inquiry] Re: What Is Information That A Sign May Bear It? --
Discussion
Jon Awbrey
jawbrey at att.net
Sun Feb 1 23:40:12 CST 2004
o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o
WIS. Discussion Note 16
o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o
HT = Hugh Trenchard
JA = Jon Awbrey
Hugh,
A bit too late in the day for a detailed response, so I will just
make a few general remarks and look at things again in the morning.
Let me remind myself what I wrote in that first note:
Cf: WIS 2. http://suo.ieee.org/ontology/msg05296.html
In part:
JA: Speaking of recurring visitations, here is one that various among you
will have seen somewhere varying between one and three or seven times.
It is the simplest way I know of explaining information theory as one
finds it during the middle ages of its buzzy boom, circa 1920-1970 or
thereabouts. This picture of information is not quite general enough
to fully cover what Peirce had in mind at its conception, but it does
the job under many commonly occurring circumstances, and so it repays
careful consideration even in the more general context of logic taken
as a form of normative semiotics.
So my sketches were an attempt to illustrate a couple of issues
from the standard sort of Hartley-Shannon information theory as
it developed in the 20th Century. I was trying to lay all this
out in such a way that I'd eventually be able to relate it to
Peirce's ideas about information, inquiry, logic, and signs.
But that, as they say, is a work in progress.
Consequently, I can't see any reason to set up false oppositions
between Bayes and Peirce here, since it'd be like trying to have
a Super Bowl game between the Red Wings and the Tigers. It's just
the wrong ballpark for that.
HT: Ok. Just a last note, and I won't push the issue further.
HT: In looking back at your note regarding channel capacities and
the reduction of uncertainties by choosing a particular path
at successive steps etc. (WIS Note 4, I believe), you begin
by setting out a Scene of Uncertainty. Then you indicate
a set of options "fan[ning] out before me." You assign 5
options, for the sake of example.
That was here:
JA: Against this backdrop one finds oneself cast as
a protagonist on a "scene of uncertainty" (SOU).
I picture this as a juncture where I have a set
of n options that fan out before me. It may be
a question of "What is true?", or "What to do?",
or "What to hope?", where the last is a codebit
for "What regulative principle has any chance?",
but the main uncertainty is that I am called on
to make a choice and often do not have any clue
what is fit to pick. (By the way, this picture
of the human practical fix is credited to Kant.)
JA: Just to make up a discrete example let us suppose
that the cardinality of this choice is a finite n,
and just to make it fully concrete let us say n=5.
Here is the picture that I would have in mind for
such a situation:
o-------------------------------------------------o
|                                                 |
|           ?     ?     ?     ?     ?             |
|           o     o     o     o     o             |
|                                                 |
|             o    o    o    o    o               |
|                                                 |
|               o   o   o   o   o                 |
|                                                 |
|                 o  o  o  o  o                   |
|                                                 |
|                   o o o o o                     |
|                                                 |
|                     ooooo                       |
|                                                 |
|                       @          n = 5          |
|                                                 |
o-------------------------------------------------o
Figure 1. Juncture of Degree 5
JA: This pictures a juncture, represented by "@",
where there are n options for the outcome of
a conduct, and I do not have a clue which it
must be. In a sense the degree of this node,
in this case n = 5, measures the uncertainty
that I have at this point.
JA: As best I can figure, this is the minimal sort of
setting in which a sign can make any sense at all.
A sign has significance for an agent, interpreter,
or observer because its actualization, its being
given or its being present, serves to reduce the
uncertainty of a decision that the agent has to
make, whether it concerns the actions that the
agent ought to take in order to achieve some
objective of interest, or whether it concerns
the predicates that the agent ought to treat
as being true of some object in the world.
JA: The way that signs come into this setting,
to make the scene, as one used to say, is
something that I could picture as follows:
o-------------------------------------------------o
|                                                 |
|              k_1 = 3        k_2 = 2             |
|           o-----o-----o     o-----o             |
|                "A"            "B"               |
|             o----o----o    o----o               |
|                                                 |
|               o---o---o   o---o                 |
|                                                 |
|                 o--o--o  o--o                   |
|                                                 |
|                   o-o-o o-o                     |
|                                                 |
|                     ooooo                       |
|                                                 |
|                       @          n = 5          |
|                                                 |
o-------------------------------------------------o
Figure 2. Partition of Degrees 3 and 2
JA: This illustrates a situation of uncertainty
that has been augmented by a classification.
HT: You go on to "contemplate making another decision after the
present issue has been decided", and arrive at a compound
uncertainty represented by the 5 options at the first step,
reduced to 2 at the next step.
The next phrase that you quote was made in the context of discussing
a different example, which was meant to illustrate a different issue:
JA: As a matter of fact, at least in this discrete type of case, it would
be possible to use the degree of the node as a measure of uncertainty,
but it would operate as a multiplicative measure rather than the sort
of additive measure that we would normally prefer. To illustrate how
this would work out, let us consider an easier example, one where the
degree of the choice point is 4.
o-------------------------------------------------o
|                                                 |
|           ?     ?           ?     ?             |
|           o     o           o     o             |
|                                                 |
|             o    o         o    o               |
|                                                 |
|               o   o       o   o                 |
|                                                 |
|                 o  o     o  o                   |
|                                                 |
|                   o o   o o                     |
|                                                 |
|                     oo oo                       |
|                                                 |
|                       @          n = 4          |
|                                                 |
o-------------------------------------------------o
Figure 3. Juncture of Degree 4
JA: Suppose that we contemplate making another decision after
the present issue has been decided, one that has a degree
of 2 in every case. The compound situation looks like so:
o-------------------------------------------------o
|                                                 |
|         o   o o   o       o   o o   o           |
|          \ /   \ /         \ /   \ /            |
|           o     o           o     o    n_2 = 2  |
|                                                 |
|             o    o         o    o               |
|                                                 |
|               o   o       o   o                 |
|                                                 |
|                 o  o     o  o                   |
|                                                 |
|                   o o   o o                     |
|                                                 |
|                     oo oo                       |
|                                                 |
|                       @                n_1 = 4  |
|                                                 |
o-------------------------------------------------o
Figure 4. Compound Junctures of Degrees 4 and 2
JA: This depicts the fact that the compound uncertainty, 8,
is the product of the two component uncertainties, 4 x 2.
To convert this to an additive measure, we simply take the
logarithms to a convenient base, say 2, and thus we arrive
at the not too astounding fact that the uncertainty of the
first choice is 2 bits, the uncertainty of the next choice
is 1 bit, and the compound uncertainty is 3 = 2 + 1 bits.
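The arithmetic in that last passage can be checked with a minimal sketch. Python is used only for illustration, and the helper name `bits` is hypothetical, not something from the original notes. It assumes equally likely options at each juncture, as the example stipulates.

```python
import math

def bits(n: int) -> float:
    """Uncertainty, in bits, of a choice among n equally likely options."""
    return math.log2(n)

n_1, n_2 = 4, 2          # degrees of the first and second junctures (Figure 4)
compound = n_1 * n_2     # multiplicative measure: number of compound outcomes

first_bits = bits(n_1)           # 2.0
second_bits = bits(n_2)          # 1.0
compound_bits = bits(compound)   # 3.0

# The logarithm turns the product 4 x 2 into the sum 2 + 1.
print(compound, first_bits, second_bits, compound_bits)
```

The degrees multiply while their logarithms add, which is the only point the sketch is meant to show.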
Now, there appears to be some sort of misunderstanding at this point.
I don't see the application of Bayes' Rule here, as my pictures were
meant to illustrate a more primitive situation than the sort of case
where Bayesian diagnosis comes up.
HT: It is at this point that I see a similarity to the Bayesian Rule, to defer to
your description. The initial 5 options you present can, if I comprehend
a possible application of Bayes' Rule properly (which I very well may not),
be represented by probabilities, and this is the "prior probability". Here
I understand we are, under Bayes' theorem, obliged to compute or estimate
the prior probability of one of the five options actually occurring (1/5
I suppose).
The set-up of the problem stipulated equally likely outcomes
for the prior state of information. Here one is thinking of
a uniform prior distribution, that is, with maximum entropy.
Probability
... ^
0.6 |
0.5 |
0.4 |
0.3 |
0.2 o-o-o-o-o-o
0.1 |         |
0.0 o-o-o-o-o-o-> Outcome Number x
     1 2 3 4 5
HT: But by the second step, presumably one has obtained information which
crystallizes the prior probability. It seems to me the Bayes theorem
allows you to factor the new information received at the point of the
second step, so you will have a more accurate quantification of the
"uncertainty" given the combination of the prior probability and
the new information received at the second step.
There is no need to apply any fancy theory here, as I simply
gave you the data that says what the possible outcomes are
after receiving sign A and what the possible outcomes are
after receiving sign B. This data gives you the two
posterior distributions that are shown below:
Prob(x|A)
... ^
0.6 |
0.5 |
0.4 |
1/3 o-o-o-o
0.2 |     |
0.1 |     |
0.0 o-o-o-o-o-o-> Outcome x
     1 2 3 4 5
Prob(x|B)
... ^
0.6 |
0.5 |     o-o-o
0.4 |     |   |
0.3 |     |   |
0.2 |     |   |
0.1 |     |   |
0.0 o-o-o-o-o-o-> Outcome x
     1 2 3 4 5
In each case, by any suitable measure of uncertainty (or entropy)
that one picks, one can say the reception of a sign is associated
with a reduction in the uncertainty measure of the distribution.
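To put numbers on that reduction, here is a minimal sketch that takes Shannon entropy in bits as the measure of uncertainty. That choice of measure is an assumption made for illustration, as are the function and variable names. The three distributions are the uniform prior and the two posteriors plotted above, with outcomes 1 to 3 under sign A and outcomes 4 and 5 under sign B.

```python
import math

def entropy_bits(dist):
    """Shannon entropy, in bits, of a distribution given as a list of probabilities."""
    # Terms with p == 0 contribute nothing, so they are skipped.
    return -sum(p * math.log2(p) for p in dist if p > 0)

prior  = [1/5, 1/5, 1/5, 1/5, 1/5]   # before any sign is received
post_A = [1/3, 1/3, 1/3, 0, 0]       # Prob(x|A)
post_B = [0, 0, 0, 1/2, 1/2]         # Prob(x|B)

h_prior = entropy_bits(prior)    # log2(5)
h_A = entropy_bits(post_A)       # log2(3)
h_B = entropy_bits(post_B)       # log2(2) = 1

print(h_prior, h_A, h_B)
print("reduction on A:", h_prior - h_A)
print("reduction on B:", h_prior - h_B)
```

The entropies come out to roughly 2.32, 1.58, and 1.00 bits, so either sign lowers the uncertainty, and B lowers it more than A.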
HT: You note that any application of the Bayesian theorem must start by
"staking out your universe of discourse [and] ... must already know the
conditional probabilities". But in your scenario involving 5 options,
isn't the conditional probability arrived at simply by looking at the
number of choices (i.e. 1/5) followed by the number of choices at the
second step (1/2)? In terms of the "information" that allows us to
get to step two, and the two choices which follow, is there something
implicit in the fact that we've reduced the choices to two at the
second step; i.e., is that "information" which can be plugged into
Bayes theorem to arrive at a measure of the "uncertainty" of the
situation, in a somewhat parallel fashion to what Peirce has done?
In setting up these examples, I simply stipulated all of the probabilities
up front, either explicitly or by telling you to assume equal likelihoods.
All of that was just the data of the example, and needs no application of
Bayes' rule or any other theory. I can state the posterior probabilities
in conditional form if I choose, Prob(x|A) and Prob(x|B), but unless you
ask something like Prob(A|x) or Prob(B|x), Bayes' rule doesn't come up. Then
again, you can answer those questions simply by looking at the pictures.
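For anyone who wants to see the inversion done explicitly, the following sketch computes Prob(A|x) and Prob(B|x) by Bayes' rule from the stipulated data. It assumes that each outcome falls under exactly one sign, as the partition in Figure 2 depicts, and all names in it are hypothetical. Exact fractions are used so the results are not blurred by rounding.

```python
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5]
prior_x = {x: Fraction(1, 5) for x in outcomes}      # uniform prior, as stipulated
sign_of = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B"}   # assumed reading of Figure 2

# Prob(s): total prior weight of the outcomes classified under sign s.
prob_sign = {}
for x, s in sign_of.items():
    prob_sign[s] = prob_sign.get(s, Fraction(0)) + prior_x[x]

def prob_x_given_sign(x, s):
    """Prob(x|s): the prior restricted to the outcomes under s, renormalized."""
    return prior_x[x] / prob_sign[s] if sign_of[x] == s else Fraction(0)

def prob_sign_given_x(s, x):
    """Bayes' rule: Prob(s|x) = Prob(x|s) * Prob(s) / Prob(x)."""
    return prob_x_given_sign(x, s) * prob_sign[s] / prior_x[x]

for x in outcomes:
    print(x, prob_sign_given_x("A", x), prob_sign_given_x("B", x))
```

Under these assumptions Prob(A|x) is 1 for x = 1, 2, 3 and 0 otherwise, with Prob(B|x) the complement, which is just what the pictures show at a glance.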
HT: You note that Bayes theorem is akin to a "two-step fox-trot",
while Peirce is a "three-step Waltz", but all I am wondering
is if there is some scope for combining the two approaches for
perhaps a yet more accurate -- or perhaps synthesized -- measure
of the information quantity in certain situations.
Okay, that was too cryptic. I was not trying to create any opposition
between Bayes and Peirce. Properly understood there's no real problem
about the statistics. Here I was talking about different models of the
inquiry process, which is a bigger issue as to how one uses statistics:
The 2-step model thinks in terms of induction and deduction, while the
3-step model adds a step up front, "abductive" hypothesis formation.
See:
| Awbrey & Awbrey, "Interpretation as Action: The Risk of Inquiry"
| http://www.chss.montclair.edu/inquiry/fall95/awbrey.html
Have to stop here ...
Jon Awbrey
HT: It seems to me you are saying the two approaches are rather incompatible.
But isn't it one of the aims of mathematicians to find grand unifying
theories, to bridge disconnected formalistic islands, ultimately so
we may see more clearly the grand platonic mathemosphere? If this
is not an aim of mathematicians, then surely it is a necessity in
order for certain things to be proven -- Andrew Wiles in proving
Fermat's theorem comes to mind (obviously among myriad other
examples.) Incidentally, it is from this broad philosophical
perspective that one such as myself can love and appreciate
mathematics without necessarily toiling through its rigours.
HT: Maybe a Peirce-Bayes synthesis is not possible -- again, I am in no
position to question your wisdom on the question, and I do not wish
to bog you down with questions that ultimately do not advance your
cause -- in which case I won't pursue the question further.
o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o
http://www.cs.bsu.edu/homepages/mighty/history.html
o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o