D. Lenat and R. Guha (1990) Building Large Knowledge Based Systems: Representation and Inference in the Cyc Project. Addison-Wesley Publishing.

Author of the summary: Jim Davies, 1999, jim@jimdavies.org

Cite this paper for:

  author =       "Douglas B. Lenat and R. V. Guha",
  year =         "1990",
  title =        "Building Large Knowledge-Based Systems: Representation
                 and Inference in the {CYC} Project",
  publisher =    "Addison-Wesley",
  address =      "Reading, Massachusetts",
  note =         "\iindex{Lenat, D. B.}\iindex{Guha, R.}",

Programs often appear more intelligent than they are, and they do not know when they are outside their area of competence. A system needs knowledge both to learn and to avoid being brittle. How do you get it? By encoding the millions of pieces of knowledge that an average person knows. Brittleness can be overcome by drawing on specialized knowledge, by falling back on increasingly general knowledge, or by analogizing to specific but superficially disparate knowledge.
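
The second strategy, falling back on increasingly general knowledge, can be illustrated with a small sketch. This is not Cyc's mechanism; the concepts, slot names, and the `lookup` helper are all hypothetical, chosen only to show the idea of trying specific knowledge first and admitting ignorance when nothing applies.

```python
# Hypothetical toy knowledge base: each concept names its more general parent
# and may carry facts known at that level.
GENERALIZES_TO = {"Penguin": "Bird", "Bird": "Animal", "Animal": None}
FACTS = {
    "Penguin": {"canFly": False},
    "Bird": {"canFly": True, "laysEggs": True},
    "Animal": {"isAlive": True},
}

def lookup(concept, slot):
    """Return (value, source concept), trying the most specific knowledge first."""
    current = concept
    while current is not None:
        facts = FACTS.get(current, {})
        if slot in facts:
            return facts[slot], current
        current = GENERALIZES_TO.get(current)
    # Out of competence: report that nothing is known rather than guess.
    return None, None

print(lookup("Penguin", "canFly"))    # answered from specialized knowledge
print(lookup("Penguin", "laysEggs"))  # answered by falling back to Bird
print(lookup("Penguin", "color"))     # nothing known at any level
```

Returning the source concept alongside the value lets a caller see how general the knowledge behind an answer was.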

Analogy is a vague word that covers many different kinds of inference on different occasions. These inferences are very important, and a large knowledge base is necessary for doing them right.

Complex properties can get you into trouble. For example, lays-eggs-in-water should be broken down into more primitive elements; otherwise you need an enormous number of rules to cover every such compound property. For this reason, many small-scale knowledge bases won't easily scale up to larger ones (p16).
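
A minimal sketch of the decomposition idea follows. The slot names (`reproductionEvent`, `action`, `location`) and the example animals are hypothetical, not taken from the book; the point is only that one general rule about egg laying can serve every location once the compound property is split into parts.

```python
# Hypothetical decomposition: the opaque property "lays-eggs-in-water" is
# replaced by a structured event built from more primitive pieces.
frog = {"reproductionEvent": {"action": "LayingEggs", "location": "Water"}}
turtle = {"reproductionEvent": {"action": "LayingEggs", "location": "Land"}}

def lays_eggs(animal):
    """One general rule that applies wherever the eggs are laid."""
    return animal.get("reproductionEvent", {}).get("action") == "LayingEggs"

def lays_eggs_in(animal, place):
    """The compound property, recovered from the primitive pieces."""
    event = animal.get("reproductionEvent", {})
    return lays_eggs(animal) and event.get("location") == place
```

With an opaque `lays-eggs-in-water` flag, a rule about egg layers in general would have to be restated for every lays-eggs-in-X property.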

Expert systems understand their domain at one level of abstraction and cannot go to a deeper level when they need to. Thus they understand very little about the domain, and their input and output are meaningful largely only to their users. (p19)

This background knowledge will require millions of frame-like entities.

The Cyc project is an attempt to build such a system. The idea is to get the top layers of the global ontology right, since they provide a good framework for everything else. The first part is about 10 million entries, which is Minsky's estimate of how many things go into long-term memory between the ages of 0 and 8. Ten to fifty million entries will probably do for the second part. The biggest problems have been the misuse of concepts, which makes the knowledge base inconsistent, and the re-entering of concepts that are already there. One way to address this is to have Cyc make analogies of its own. Each analogy it finds is either (1) a good analogy, (2) a sign that two concepts need to be differentiated better, or (3) a sign that the same thing has been entered twice. (p21)
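
The three outcomes can be illustrated with a rough sketch. The summary does not say how Cyc finds its analogies, so this example swaps in a much cruder technique, Jaccard overlap of (slot, value) pairs, purely as a stand-in. The units, the threshold, and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical units, each described by a set of (slot, value) pairs.
UNITS = {
    "Car": {("instanceOf", "Vehicle"), ("hasPart", "Wheel"), ("hasPart", "Engine")},
    "Automobile": {("instanceOf", "Vehicle"), ("hasPart", "Wheel"), ("hasPart", "Engine")},
    "Bicycle": {("instanceOf", "Vehicle"), ("hasPart", "Wheel"), ("hasPart", "Pedal")},
    "Texas": {("instanceOf", "State")},
}

def overlap(a, b):
    """Jaccard similarity of two sets of (slot, value) pairs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def review_candidates(units, threshold=0.5):
    """Yield (name1, name2, score, note) for pairs worth a human look."""
    for (n1, s1), (n2, s2) in combinations(sorted(units.items()), 2):
        score = overlap(s1, s2)
        if score == 1.0:
            yield n1, n2, score, "possible duplicate entry"
        elif score >= threshold:
            yield n1, n2, score, "analogy, or concepts need differentiating"

for row in review_candidates(UNITS):
    print(row)
```

A person still has to decide which of the three cases each flagged pair falls into; the sketch only narrows down where to look.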

Choosing the primitives of the ontology is called ontological engineering (p23).

NLU (natural language understanding) would be a great way to acquire this knowledge, but NLU itself requires much of consensus reality in order to work. Machine pattern recognition will not work either, because you need to know a lot to learn a lot.


Cyc has three pieces: 1. the knowledge base, 2. the browsing/entering environment, and 3. the representation language. (p28)

CycL, the representation language, is basically frame based. A unit (here, apparently the one for Texas) is a set of slots with their entries:

 capital: (Austin)
 residents:  (Doug, Guha, Mary)
 stateOf: (UnitedStatesOfAmerica)
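
The frame above can be sketched as a plain mapping from slot names to lists of entries. This is only an illustration of the frame idea, not CycL itself; the variable name `texas` is inferred from the later SeeUnit examples, and `get_entries` is a hypothetical helper.

```python
# A unit (frame) as a mapping from slot names to lists of entries.
# The unit name "texas" is an assumption inferred from the surrounding examples.
texas = {
    "capital": ["Austin"],
    "residents": ["Doug", "Guha", "Mary"],
    "stateOf": ["UnitedStatesOfAmerica"],
}

def get_entries(unit, slot):
    """Return the entries on a slot, or an empty list if the slot is absent."""
    return unit.get(slot, [])

print(get_entries(texas, "residents"))
```

Looking up a slot is a single dictionary access, which gives a feel for why frame lookups are fast.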

CycL also has a constraint language (a predicate calculus), which sits on top of the frames and can express things like "Siblings almost never have the same first names" and "Bill is either a terrific fisherman or a terrific liar." Though predicate calculus is more expressive, the advantage of frames is speed. (p36)
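
To see why a statement like the one about Bill does not fit in a frame, consider this sketch. A disjunction cannot be stored as a single slot entry, so it is written here as a predicate that is checked against the frames. The slot name `terrificAt`, the `frames` store, and both functions are hypothetical and are not CycL syntax.

```python
# Hypothetical frame store: unit name -> slot -> set of entries.
frames = {"Bill": {"terrificAt": set()}}

def holds(unit, slot, value):
    """True if value is an entry on the given slot of the given unit."""
    return value in frames.get(unit, {}).get(slot, set())

def bill_constraint():
    """'Bill is either a terrific fisherman or a terrific liar.'"""
    return holds("Bill", "terrificAt", "Fishing") or holds("Bill", "terrificAt", "Lying")
```

The constraint says nothing about which disjunct is true, so neither `Fishing` nor `Lying` can simply be written onto Bill's frame; the frames can only be tested against it.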

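There are units, slot units (units that describe a slot, such as color), and SeeUnits. A SeeUnit is a unit that says something about one slot of another unit. In the two examples below, the first SeeUnit annotates the residents slot of Texas, and the second annotates the rateOfChange slot of the first: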

 instanceOf: (SeeUnit)
 modifiesUnit: (Texas)
 modifiesSlot: (residents)
 rateOfChange: ( )
 cardinality: (10000000)

 instanceOf: (SeeUnit)
 modifiesUnit: (SeeUnitFor-residents.Texas)
 modifiesSlot: (rateOfChange)
 qualitativeValue: (Low)
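
The two SeeUnits above can be sketched as ordinary frames that point at the unit and slot they annotate. The name of the second SeeUnit is hypothetical (the source does not give one), as is the `see_units_for` helper.

```python
# Units stored by name; each unit maps slot names to lists of entries.
units = {
    "Texas": {"residents": ["Doug", "Guha", "Mary"]},
    "SeeUnitFor-residents.Texas": {
        "instanceOf": ["SeeUnit"],
        "modifiesUnit": ["Texas"],
        "modifiesSlot": ["residents"],
        "rateOfChange": [],
        "cardinality": [10000000],
    },
    # Hypothetical name for the second SeeUnit; the source leaves it unnamed.
    "SeeUnitFor-rateOfChange.SeeUnitFor-residents.Texas": {
        "instanceOf": ["SeeUnit"],
        "modifiesUnit": ["SeeUnitFor-residents.Texas"],
        "modifiesSlot": ["rateOfChange"],
        "qualitativeValue": ["Low"],
    },
}

def see_units_for(unit_name, slot):
    """Names of SeeUnits that annotate the given slot of the given unit."""
    return [
        name for name, frame in units.items()
        if "SeeUnit" in frame.get("instanceOf", [])
        and unit_name in frame.get("modifiesUnit", [])
        and slot in frame.get("modifiesSlot", [])
    ]

print(see_units_for("Texas", "residents"))
```

Because a SeeUnit is itself a unit, it can be annotated by another SeeUnit, which is how the qualitative rate of change gets attached here.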

There are also SlotEntry-Details, which comment on one particular relationship, such as the fact that Guha lives in Austin. For example, you can say that it becameTrueIn: (1987).

Each Unit also has (p39)

  1. a truth value
  2. a justification for the truth value
  3. dependencies: what depends on its being true
  4. properties that each entry v1 on the value inherits just by virtue of being there
  5. attitudes of agents toward this proposition
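
The five items above can be pictured as fields on a record. This is a rough sketch only: the class name, field names, string truth values, and the `retract` helper are all hypothetical and do not reflect how Cyc actually stores this bookkeeping.

```python
from dataclasses import dataclass, field

@dataclass
class AnnotatedValue:
    """One proposition plus the bookkeeping listed above (names are hypothetical)."""
    proposition: tuple                                        # e.g. ("Texas", "residents", "Guha")
    truth_value: str = "True"                                 # 1. truth value
    justification: list = field(default_factory=list)         # 2. why it is believed
    dependents: list = field(default_factory=list)            # 3. what depends on it
    inherited_properties: dict = field(default_factory=dict)  # 4. inherited by each entry
    attitudes: dict = field(default_factory=dict)             # 5. agent -> attitude

def retract(value):
    """Mark a proposition as unknown and return the dependents that need rechecking."""
    value.truth_value = "Unknown"
    return list(value.dependents)

lives_in_texas = AnnotatedValue(
    ("Texas", "residents", "Guha"),
    justification=["entered by hand"],
    dependents=["GuhaPaysTexasTaxes"],
)
print(retract(lives_in_texas))
```

Recording dependents alongside the justification is what makes it possible to find out what else must be reconsidered when a belief is withdrawn.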

Summary author's notes:

Last modified: Fri Aug 27 17:04:33 EDT 1999