[Inquiry] Re: Theme One Program -- Exposition

Jon Awbrey jawbrey at att.net
Thu Feb 10 08:36:11 CST 2005


o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o

TOP.  Expository Note 11

o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o

3.2.  Literal Cacti

| There are no strings,
| There are no strings,
| There are no strings on me!

Let us return to our littlest example
of a two-level formal language, and
pick up on the way that Theme One
stores information about the
literal level L_2.

Example 2.  Two-Level Formal Language:  am

Recall the two-level formal language L = (L_1, L_2)
whose only word, or member of L_1, is the sequence
of characters <"a", "m">, and whose only phrase, or
member of L_2, is the singleton word sequence <"am">.

Formally speaking, we have the following data:

   L_1 = {<"a", "m">} c !A!*.

   L_2 = {<"am">} = {<<"a", "m">>} c !A!**.

Example 2.  Literal Level:  am

Theme One stores its data about the literal level of a 2-level
formal language in the form of a "literal cactus".  As always,
one may choose to view this information as an abstract graph,
as a concrete pointer graph, or as a text file consisting of
the characters that encode a traversal string for the graph.

Things begin to get a little bit complicated at this point.
Just as the lexical cactus that stores the data for L_1 is
painted with single characters from the alphabet !A!, the
literal cactus that stores the data for L_2 needs to be
painted with words from L_1.  This is achieved, not by
recording the words themselves as character strings in
the literal cactus, but by storing just their "ideas",
that is, pointers to forms in the lexical cactus that
serve as a type of "hash codes" for the words in L_1.
These indirect identifier references are recorded in
the "alias" or the 'as' fields of the paint-bearing
forms in the literal cactus.

In order to see how these "hash nodes" work, we shall need
to drive a stratum deeper into the concrete data structure
that supports both the lexical and the literal data bases.

The display below shows a memory dump of the index structure
that is formed in the relevant piece of computer memory when
the Indexer, starting from scratch, has taken up the initial
segment of a data stream that begins "a", "m", " ", " ".

   ( dump index (
      1003407    (         0   1003510   1003407       0
      1003201    (         0   1005513   1004006       1
      1005513    a         0         0   1005700       1    a
      1005700    (         0   1005803   1006112       1
      1005803    m         0         0   1005906       1    m
      1005906    ,         0         0   1006215       1
      1006215    (         0   1006402   1006009       0
      1006402    )         0   1006215   1006402       0
      1006009    )         0   1005700   1005803       0
      1006112    ,         0         0   1002911       0
      1002911    (         0   1003014   1003304       0
      1003014    )         0   1002911   1003014       0
      1003304    )         0   1003201   1005513       0
      1004006    (         0   1007001   1003510       1
      1007001        1005906         0   1007104       1    am
      1007104    ,         0         0   1003800       1
      1003800    (         0   1003903   1004109       0
      1003903    )         0   1003800   1003903       0
      1004109    )         0   1004006   1007001       0
      1003510    )         0   1003407   1003201       0
   ))

The first column of the index dump shows the address of the form.
The second column shows the character in the form's 'sign' field.
The third, fourth, and fifth columns list the addresses in the
form's 'as' field, 'on' field, and 'by' field, respectively.
The sixth column shows the number in the form's 'code' field.
The last column highlights the identifier, if any, that is
associated with a paint-bearing lexical or literal form.

Figure 10 plots the data of the index dump as a graph,
showing the shape and some of the concrete details of
the graph-theoretical data structure that is built up
in memory when the Indexer has taken up the incept of
a data stream that begins "a", "m", " ", " ".

                                                  o-----o
                                          o-------|--o  |
                                          |  o----o  |  |
                                          o->| )0 |--o  |
                                             |6402|     |
                                             o----o     |
                                             ^   o------o
                                             |  /    o-----o          o-----o
                     o-----------------------|-/-----|--o  |  o-------|--o  |
                     |  o----o  o----o  o----o< o----o  |  |  |  o----o  |  |
                     o->| m1 |->| ,1 |->| (0 |->| )0 |--o  |  o->| )0 |--o  |
                        |5803|  |5906|  |6215|  |6009|     |     |3014|     |
                        o----o  o----o  o----o  o----o     |     o----o     |
                        ^       ^                          |     ^   o------o
                        |  o----|--------------------------o     |  /    o-----o
        o---------------|-/-----|--------------------------------|-/-----|--o  |
        |  o----o  o----o<      |       o----o              o----o< o----o  |  |
        o->| a1 |->| (1 |-------|------>| ,0 |------------->| (0 |->| )0 |--o  |
           |5513|  |5700|       |       |6112|              |2911|  |3304|     |
           o----o  o----o       o       o----o              o----o  o----o     |
           ^                   o-\---------------------------------------------o
           |                  /   \
           |                 /     \                                  o-----o
           |                /       \                         o-------|--o  |
           |               /         \                        |  o----o  |  |
           |              /           \                       o->| )0 |--o  |
           |             /             \                         |3903|     |
           |            /               \                        o----o     |
           |           /                 \                       ^   o------o
           |          /                   \                      |  /    o-----o
           |         /                   o-\---------------------|-/-----|--o  |
           |        /                    |  o----o  o----o  o----o< o----o  |  |
           |       /                     o->|  1 |->| ,1 |->| (0 |->| )0 |--o  |
           |      /                         |7001|  |7104|  |3800|  |4109|     |
           |     /                          o----o  o----o  o----o  o----o     |
           |    /                           ^    o-----------------------------o
           |   /                            |   /
           |  /                             |  /                      o-----o
   o-------|-/------------------------------|-/-----------------------|--o  |
   |  o----o<                          o----o<                   o----o  |  |
   o->| (1 |-------------------------->| (1 |------------------->| )0 |--o  |
      |3201|                           |4006|                    |3510|     |
      o----o                           o----o                    o----o     |
      ^                                ^                         ^   o------o
      |                                |                         |  /
      @                                @                o--------|-/-o
     lex                              lit               |   o----o<  |
                                                        o-->| (0 |---o
                                                            |3407|
                                                            o----o
                                                            ^
                                                            |
                                                            @

   Figure 10.  Index Graph:  am

In Figure 10, the pointer marked "lex" points to the root form of
the lexical cactus for L_1, while the pointer marked "lit" points
to the root form of the literal cactus for L_2.  The lex and lit
root forms are lodged in a piece of utility data structure that
serves as a convenient dopstick or palette for keeping them
together during processing.

The thing to notice in Figure 10 is the "alias" (or is it "alibi"?)
pointer that extends from the literal cactus to the lexical cactus.
In this particular case, the literal form # 7001 has an 'as' field
that points to the lexical form # 5906.  The latter form serves in
effect as the environmentally unique "hash code" for the word "am".

Figure 11 demonstrates, in the case of our present example, a quick way to
sketch the abstract versions of the lexical and literal cacti, and it also
gives the text file representations of the corresponding traversal strings.

      o
      |
   mo-o o                  o
    |/ /              am   |
   ao-o                 `o-o
    |/                   |/
    @                    @
   lex                  lit

   am.lex  =  (1 a1 (1 m1 ,1 (0 )0 )0 ,0 (0 )0 )0
   am.lit  =  (1 am 1 ,1 (0 )0 )0

   Figure 11.  Lexical Cactus + Literal Cactus:  am

Jon Awbrey

o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o
inquiry e-lab: http://stderr.org/pipermail/inquiry/
o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o~~~~~~~~~o



More information about the Inquiry mailing list