Glasgow 1993: The imagery debate revisited: A computational perspective

http://www.jimdavies.org/summaries/

Glasgow, J. I. (1993). The imagery debate revisited: A computational perspective. Computational Intelligence 9:4, 309--333.

@Article{,
  author = 	 {Glasgow, Janice I.},
  title = 	 {The imagery debate revisited: A computational perspective},
  journal = 	 {Computational Intelligence},
  year = 	 {1993},
  volume = 	 {9},
  number = 	 {4},
  pages = 	 {309--333}
}

Author of the summary: Jim Davies, 2004, jim@jimdavies.org

Cite this paper for:

Spatial similarity between two images is measured by the number of image transformations needed to make them equivalent. [324]

"Symbolic arrays provide a compact and coherent representation that can be directly inspected and transformed to retrieve spatial properties that are only implicit in the descriptive representation." [329]

Kosslyn: depictive (surface representation, quasi-pictorial, that occurs in the visual buffer) and descriptive (storing image information as propositions in LTM) [310].

descriptionalist: One who does not believe in internal human depictive representations (e.g. Pylyshyn). Hinton believes that there is a viewer-centered description attached to object-centered structural descriptions of images.

A visual image "can be thought of as modality specific and provides a literal encoding that preserves properties such as shape and relative size." Spatial representations "specify the layout of objects within a scene and preserve spatial and topological properties." [311] Kosslyn and others refer to this as a what/where distinction.

Tye: mental images as interpreted, symbol-filled arrays. [312]

Johnson-Laird 1983: 1) descriptive (propositional) 2) mental model (structural and spatial) 3) visual image (perceptual, viewer-centered representation). He argues we would have these not for greater expressiveness, but for the same reason we have high-level programming languages.

Glasgow likes the Johnson-Laird view.

Marr 1982: Primal sketch, 2.5d sketch, 3d sketch [313]

"In particular, computational imagery is concerned with the reconstruction of image representations to facilitate the retrieval of visual and spatial information that was not explicitly encoded in long-term memory." (reperception?)

visual thinking: what the image looks like
spatial thinking: where an object is located relative to other objects in a scene (complex image).

DESCRIPTIVE REPRESENTATION: The LTM storage. It's hierarchically organized. Information can be accessed from it using standard retrieval, procedural attachment and inheritance. (summary author question: but what is the content?) A hierarchical network model for semantic memory. [314] Uses AKO (a kind of) and part-of links to make two hierarchies. (summary author's note: AKO is clearly not visual or spatial. Part-of kind of is.) Represented with frames. Parts of an image are stored with indexes to their places in the spatial array (see below). Image concepts contain slots for specifying location and orientation of the instance within its context, the values of which are used to generate the spatial rep. So the sentences taken as input for a person, e.g.

"The spoon is to the left of the knife.
The plate is to the right of the knife.
The fork is in front of the spoon.
The cup is in front of the knife."

would generate the following spatial representation:

-------------------------
| spoon | knife | plate |
-------------------------
| fork  | cup   |       |
-------------------------

and this spatial image is stored in LTM as the following frame:

Framename: place-setting
AKO: image
Parts/Location: spoon(1,1) knife(1,2) plate(1,3) fork(2,1) cup(2,2)
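
A minimal Python sketch of how a frame like the one above could be expanded into the symbolic array. The dictionary layout and the name frame_to_array are illustrative assumptions, not Glasgow's Nial implementation; coordinates are read as 1-indexed (row, column), as in the frame.

```python
# Hypothetical frame layout: slot names mirror the summary, not Glasgow's code.
place_setting = {
    "framename": "place-setting",
    "ako": "image",
    "parts": {
        "spoon": (1, 1), "knife": (1, 2), "plate": (1, 3),
        "fork": (2, 1), "cup": (2, 2),
    },
}


def frame_to_array(frame):
    """Build a 2-D symbolic array (list of rows) from 1-indexed (row, col) slots."""
    parts = frame["parts"]
    n_rows = max(r for r, _ in parts.values())
    n_cols = max(c for _, c in parts.values())
    array = [[None] * n_cols for _ in range(n_rows)]
    for symbol, (r, c) in parts.items():
        array[r - 1][c - 1] = symbol  # empty cells stay None
    return array


setting = frame_to_array(place_setting)
for row in setting:
    print(row)
```

The sketch makes the informational equivalence raised in the next note concrete: the array contains nothing that the coordinate slots do not, it only lays the same facts out so that neighbours can be read off by position.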

(summary author's question: If the above is the descriptive representation, it appears that the spatial isn't generated by the descriptive, but the descriptive has the spatial representation directly in memory. It's a matter of what information is made explicit-- if spoon is at 1, 1, it's informationally equivalent to the diagram with the spoon in the upper left corner, isn't it? What is it doing in the LTM section if it's spatial? )

SPATIAL REPRESENTATION: The image components in a multidimensional, symbolic array that preserves spatial and topological properties. Arrays can be embedded in one another to express structural hierarchy. Functions transform and inspect arrays. The relations covered are relative location (left-of) and topological (inside-of, adjacent-to).

(summary author's question: If it's an array, how does it explicitly represent left-of? Don't you just have coordinates and "left-of" needs to be inferred based on those coordinates?)
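
One possible reading of the question above, sketched in Python: the array stores only positions, and a relation such as left-of is an inspection function computed from them on demand. The function names and the definition used here (same row, smaller column index) are assumptions made for illustration, not definitions taken from the paper.

```python
def locate(array, symbol):
    """Return the 0-indexed (row, col) of a symbol, or None if it is absent."""
    for r, row in enumerate(array):
        for c, cell in enumerate(row):
            if cell == symbol:
                return (r, c)
    return None


def left_of(array, a, b):
    """True if a is in the same row as b and in a column to its left."""
    pa, pb = locate(array, a), locate(array, b)
    if pa is None or pb is None:
        return False
    return pa[0] == pb[0] and pa[1] < pb[1]


setting = [["spoon", "knife", "plate"],
           ["fork", "cup", None]]
print(left_of(setting, "spoon", "plate"))
print(left_of(setting, "plate", "knife"))
```

On this reading left-of is never stored; it is cheap to derive because the comparison is a single integer test on column indexes.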

"As each symbol in the array corresponds to a frame in long-term memory, we can focus attention on a particular subimage and have it replaced by its symbolic array of parts."

(summary author's question: I think this means that LTM has both descriptive and spatial representations in it, although in the initial description of descriptive and spatial, descriptive seemed to be distinguished by the fact that it's in LTM. Also, in the descriptive description, it says the spatial is generated from it, implying that it's a temporary working memory representation.)

VISUAL REPRESENTATION: the space occupied by an image as an occupancy array (in contrast with a symbolic array, I guess). Functions get volume, shape, and relative distance. [314]
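
A rough Python sketch of what inspecting an occupancy array might look like. The grid, the function names, and the use of a 2-D slice in place of a 3-D array are all assumptions for illustration; the summary does not give the paper's actual functions.

```python
# 1 marks an occupied cell, 0 an empty one; a 2-D slice stands in for a 3-D grid.
occupancy = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]


def volume(grid):
    """Count occupied cells (area in 2-D, volume in 3-D)."""
    return sum(sum(row) for row in grid)


def extent(grid):
    """Return (height, width) of the smallest box enclosing the occupied cells.

    Assumes at least one cell is occupied.
    """
    rows = [r for r, row in enumerate(grid) if any(row)]
    cols = [c for c in range(len(grid[0])) if any(row[c] for row in grid)]
    return (rows[-1] - rows[0] + 1, cols[-1] - cols[0] + 1)


print(volume(occupancy))
print(extent(occupancy))
```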

The spatial representation is depictive because it satisfies Glasgow's criteria for depictive representations: [317] "If D is a depictive representation of an image I then (1) there must be a mapping of the parts in D (e.g., symbols or pixels) to parts in I, and (2) it must be possible to define a correspondence between the properties or relations of parts of D and properties or relations of parts of I (e.g., shape, relative location, volume)."

(Summary author's note: I don't understand this well enough to know how this definition disqualifies descriptive representations too. Most consider above(a, b) to be descriptive, right?)

Vagueness can be represented spatially:

+--------------+
| Mary  | John |
+--------------+

If the set of relations R = {beside}, then the above symbolic array commits to the beside relation, leaving the left-of relation ambiguous.
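
A small Python sketch of this idea, under the assumption that the relation set R can be modeled as a table of interpretation functions over array positions. The names R and holds are hypothetical. A relation missing from R is reported as undetermined (None) rather than false, even though the cell order would be enough to compute it.

```python
def position(array, symbol):
    """Index of a symbol in a one-dimensional symbolic array."""
    return array.index(symbol)


# Hypothetical relation set R: only "beside" is given an interpretation.
R = {
    "beside": lambda arr, a, b: abs(position(arr, a) - position(arr, b)) == 1,
}


def holds(relations, name, arr, a, b):
    """Return True/False if the relation is interpreted, None if it is left open."""
    if name not in relations:
        return None
    return relations[name](arr, a, b)


row = ["Mary", "John"]
print(holds(R, "beside", row, "Mary", "John"))
print(holds(R, "left-of", row, "Mary", "John"))
```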

(Summary author's note: The symbolic array has what I will call "base" relations, which are the fundamental array relationships. The set of relations R is a mapping of those array relationships to things like "beside," "left-of," etc. Thus you could have an array that is just a list of colors, in no order at all-- if there are no relations defined, then the base relations (that blue is next to green in the array, for example) mean absolutely nothing.)

But you can't represent beside(mary, john) and beside(mary, tom) without also committing to either beside(john,tom) or -beside(john,tom).

(Summary author's note: But you could represent this with two arrays.)

The symbolic array is for reasoning about spatial domains.

SYSTEM: Nial (high-level programming language for computational imagery).
(Glasgow 1990) [318]

Ullman 1984: spatial parallelism is being able to apply the same operations to different parts of the image. Functional parallelism is when different operations are applied to a single image area.

Array theory supports parallelism in several ways. The EACH function applies the same function to a list of areas, and the ATLAS function applies a list of functions to an area.
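
The shape of these two operations can be sketched in Python. These are sequential stand-ins written for illustration, with lower-case names of my own choosing; they are not Nial and do not claim to reproduce its semantics. The point is only that each application is independent of the others, which is what makes parallel evaluation possible.

```python
def each(f, areas):
    """Apply one function to every area (spatial parallelism in Ullman's sense)."""
    return [f(area) for area in areas]


def atlas(functions, area):
    """Apply every function in a list to one area (functional parallelism)."""
    return [f(area) for f in functions]


areas = [[1, 2, 3], [4, 5], [6]]
print(each(len, areas))
print(atlas([len, sum, max], [4, 5]))
```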

Glasgow shows how a (spatial) symbolic array can be more efficient than propositions. If we have the sentences:
John is older than Mary
Robb is younger than Mary
Robb is older than Jane
Mark is younger than Jane

it could be represented propositionally with older(john,mary) etc. To reason out whether mary or mark is older, you would go through, presumably, a logical inference procedure. In general, the number of required inferences is proportional to the distance between terms in the linear ordering. But if you represent it spatially, e.g.

array(john mary robb jane mark)

Then knowing who is older, mark or mary, is simply inferring if one is to the left of the other.

(summary author's note: This is assuming, I assume, that leftness is mapped to olderness. How can you tell which is leftmost? In an array, you know a number that is associated with each cell. In this example mary is in cell 2, and mark is in cell 5. So it's simple to say that 2 is less than 5, thus mary is older than mark. I'm thinking that it's the numerical code that maps to some meaning (usually visuo-spatial, but in this case age) that gives arrays their power. That is, the quantification is a useful intermediate representation. A question I have is this: how do we know that knowing that 2 is less than 10, for example, is computationally cheap?)

Now, you could instead do at-location(john,1), at-location(mary,2) etc. and reason: IF at-location(x,y) AND at-location(i,j) AND less-than(y,j) THEN older(x,i). This works fine for the question at hand, but what about "which people are older than robb?" In an array, you just take the sublist to the left. Since the propositions are unordered, it requires a search of the entire list to get the answer. [322]
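
The contrast can be sketched in Python. This assumes the array is ordered from oldest to youngest, as the four sentences imply, so "older" corresponds to a smaller index. The function names and the link-following procedure are illustrative assumptions, not the paper's code.

```python
# Ordered oldest to youngest, as the four sentences imply.
ages = ["john", "mary", "robb", "jane", "mark"]


def older(array, a, b):
    """a is older than b if a sits at a smaller index."""
    return array.index(a) < array.index(b)


def older_than(array, person):
    """Everyone older than `person` is the slice on the older side of them."""
    return array[:array.index(person)]


# Propositional counterpart: unordered facts older(x, y), chained by transitivity.
facts = {("john", "mary"), ("mary", "robb"), ("robb", "jane"), ("jane", "mark")}


def older_by_inference(facts, a, b):
    """Follow older(x, y) links from a; return the number of links used, or None."""
    successors = dict(facts)  # each person has at most one direct successor here
    steps, current = 0, a
    while current in successors:
        current = successors[current]
        steps += 1
        if current == b:
            return steps
    return None


print(older(ages, "mary", "mark"))
print(older_than(ages, "robb"))
print(older_by_inference(facts, "mary", "mark"))
```

The array query is one index comparison or one slice, while the chained version uses a number of links that grows with the distance between the two names in the ordering.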

Spatial similarity between two images is measured by the number of image transformations needed to make them equivalent. [324]

Symbolic arrays also help with re-orientation (the frame problem). If you have a description of what's to the right of you and what's in front of you in a room, determining the changes necessary when you turn is easier with an array than with just the descriptive representation. It also allows easy identification of the absence of objects in a certain area.[326]
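
A hedged Python sketch of the re-orientation case. It assumes a viewer-centered 2-D array whose top row is "in front" and whose columns run left to right, with the viewer in the middle cell; the room contents and function names are invented for illustration. Turning 90 degrees to the right then amounts to rotating the array a quarter turn counter-clockwise, so what was on the right ends up in front.

```python
def turn_right(array):
    """Re-express a viewer-centered array after the viewer turns 90 degrees right.

    The old right-hand column becomes the new front (top) row, which is a
    quarter-turn counter-clockwise rotation of the array.
    """
    return [list(col) for col in zip(*array)][::-1]


def is_empty(array, row, col):
    """Absence check: nothing is recorded at this cell."""
    return array[row][col] is None


# Hypothetical room: top row is in front of the viewer ("me").
room = [
    [None, "door", None],
    ["desk", "me", "window"],
    [None, "bed", None],
]

turned = turn_right(room)
for row in turned:
    print(row)
print(is_empty(turned, 0, 0))
```

With a purely descriptive encoding, each stored relation (in-front-of, right-of, and so on) would have to be rewritten individually after the turn; here one transformation updates them all at once.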

Operations on spatial images:
Superimpose: Place an array with symbol x on another array with symbol x, extending the array. [328]
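
One way superimpose could work, as a Python sketch. The alignment rule (shift the second array so the shared symbol coincides, then take the union of cells) and the conflict check are assumptions of mine; the summary gives only the one-line description above.

```python
def to_cells(array):
    """Map (row, col) -> symbol for every non-empty cell."""
    return {(r, c): s for r, row in enumerate(array)
            for c, s in enumerate(row) if s is not None}


def superimpose(a, b, anchor):
    """Overlay b on a so that the shared symbol `anchor` coincides."""
    cells_a, cells_b = to_cells(a), to_cells(b)
    # Each unpacking raises ValueError unless the anchor occurs exactly once.
    (ra, ca), = [p for p, s in cells_a.items() if s == anchor]
    (rb, cb), = [p for p, s in cells_b.items() if s == anchor]
    merged = dict(cells_a)
    for (r, c), s in cells_b.items():
        pos = (r - rb + ra, c - cb + ca)
        if merged.get(pos, s) != s:
            raise ValueError(f"conflict at {pos}")
        merged[pos] = s
    # Shift so the smallest row/column becomes 0, then rebuild a rectangular array.
    r0 = min(r for r, _ in merged)
    c0 = min(c for _, c in merged)
    n_rows = max(r for r, _ in merged) - r0 + 1
    n_cols = max(c for _, c in merged) - c0 + 1
    out = [[None] * n_cols for _ in range(n_rows)]
    for (r, c), s in merged.items():
        out[r - r0][c - c0] = s
    return out


left = [["spoon", "knife"]]
right = [["knife", "plate"]]
print(superimpose(left, right, "knife"))
```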

A thesis of this paper: "Symbolic arrays provide a compact and coherent representation that can be directly inspected and transformed to retrieve spatial properties that are only implicit in the descriptive representation." [329]

It's better stated on 330: "the symbolic array representation is not computationally equivalent to a descriptive representation. Indeed, for many imagery-related tasks the structure imposed by an array permits problem-solving strategies that are computationally preferable, both in ease of programming and efficiency, to those that could be designed for descriptive representations."

Summary author's notes:

On pg 314 she lists propositions and then shows the spatial image generated from them. But above on the page she says that propositions like "x is left of y" are gleaned from the spatial image. Where in memory would "x is to the left of y" go? In the descriptive or in the spatial? Seems neither, since the descriptive is merely parts and AKO hierarchies with indexes to spatial arrays.

Last modified: Tue May 13 10:28:57 EDT 2003