The term "artificial intelligence" was coined by John McCarthy (who later developed the LISP programming language, widely used in AI research) as the name of a workshop held at Dartmouth in the summer of 1956.
While there does not appear to be any consensus about the definition of AI, we can perhaps categorize the various definitions under three headings:
Elaine Rich's definition of AI is an example of a Weak AI definition:
Artificial Intelligence (A.I.) is the study of how to make computers do things at which, at the moment, people are better. [1]
Another way of categorizing the differing views about AI was proposed by W. Ross Ashby in his 1948 (!) paper "Design for a Brain" (published as a book in 1952):
Early work on learning followed the brain paradigm, seeking to model human brain functionality at the level of the neuron. Those adopting the performance paradigm worked primarily on problem solving.
Research on the brain paradigm waned in the 1970s due to both practical and theoretical limitations:
Most of the successes of AI in the 1970s and 1980s were due to research in the performance paradigm, based on Allen Newell and Herbert Simon's Physical Symbol System Hypothesis:
A physical symbol system has the necessary and sufficient means for general intelligent action. [3]
Newell and Simon viewed intelligence as symbol manipulation, and hypothesized that it didn't matter what physical medium -- brain, paper, or computer -- was used to do the symbol manipulation. Computation is done on symbol structures which can be interpreted as representing situations in the "real world" (see Knowledge Representation below) provided that the system obeys the law of representation:
decode[encode(T)(encode(S1))] = T(S1)
where S1 is some initial state in the real world which is transformed by some action T into a new state S2. [4]
The law of representation states that our physical symbol system models the real world in a valid way if:
Image based on [5]
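The law of representation can be checked mechanically on a toy example. The following is a minimal sketch, assuming a hypothetical blocks world in which a state maps each block to whatever it rests on; the function names (`move_in_world`, `encode_state`, `decode_state`, `encode_action`) are illustrative and are not taken from Newell or from Pfeifer and Scheier.

```python
# Hypothetical blocks world: a state maps each block to what it rests on.

def move_in_world(block, dest):
    """The real-world action T: put `block` on `dest`."""
    def T(state):
        new_state = dict(state)
        new_state[block] = dest
        return new_state
    return T

def encode_state(state):
    """encode(S): represent a world state as a set of symbolic facts."""
    return frozenset(("on", block, support) for block, support in state.items())

def decode_state(facts):
    """decode: turn a set of symbolic facts back into a world state."""
    return {block: support for (_, block, support) in facts}

def encode_action(block, dest):
    """encode(T): the symbolic operator that corresponds to the action T."""
    def operator(facts):
        kept = {fact for fact in facts if fact[1] != block}
        return frozenset(kept | {("on", block, dest)})
    return operator

S1 = {"A": "table", "B": "A", "C": "table"}   # initial real-world state
T = move_in_world("B", "C")                   # action: move B onto C

real = T(S1)                                                    # T(S1)
symbolic = decode_state(encode_action("B", "C")(encode_state(S1)))

law_holds = symbolic == real
print(law_holds)
```

Here the symbolic route (encode, apply the operator, decode) and the real-world route arrive at the same state, so the law of representation is satisfied for this action.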
A serious problem with physical symbol systems was pointed out by McCarthy and Hayes [6], which was termed the frame problem: How does one keep track of the frame of reference of an operation (transformation)? In particular, what changes and what stays the same when an operator is applied to the representation of a state?
A simple version of the problem is illustrated in the following figure: When we move block B, does block C move too?

What about this version of the problem?

It appears that our physical symbol system broke the law of representation in this situation.
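Since the figures are not reproduced here, the following is only a sketch of one way the law can fail, assuming that block C is stacked on block B and that the symbolic operator updates nothing but the block it was asked to move. The state layout, the place names, and the helper functions are all hypothetical, and encoding and decoding are taken to be the identity to keep the example short.

```python
# Each block maps to (place, support). C rests on B in this assumed setup.

def blocks_above(state, block):
    """Return every block resting directly or indirectly on `block`."""
    above = []
    frontier = [block]
    while frontier:
        base = frontier.pop()
        for name, (_, support) in state.items():
            if support == base:
                above.append(name)
                frontier.append(name)
    return above

def real_move(state, block, new_place):
    """The world: moving a block carries along whatever is stacked on it."""
    new_state = dict(state)
    new_state[block] = (new_place, "table")
    for name in blocks_above(state, block):
        new_state[name] = (new_place, state[name][1])
    return new_state

def naive_symbolic_move(state, block, new_place):
    """The model: only the fact about the moved block is updated."""
    new_state = dict(state)
    new_state[block] = (new_place, "table")
    return new_state

S1 = {"A": ("left", "table"), "B": ("middle", "table"), "C": ("middle", "B")}

world = real_move(S1, "B", "right")
model = naive_symbolic_move(S1, "B", "right")

law_holds = world == model
print(world["C"], model["C"], law_holds)
```

In the world C ends up at the new place along with B, while the model still records C at the old place, so the two routes disagree. Repairing the operator for this one case is easy; the frame problem is that the designer must anticipate every such side effect in advance.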
Daniel Dennett provided a classic example of the frame problem:
Once upon a time there was a robot, named R1 by its creators. Its only task was to fend for itself. One day its designers arranged for it to learn that its spare battery, its precious energy supply, was locked in a room with a time bomb set to go off soon. R1 located the room, and the key to the door, and formulated a plan to rescue its battery. There was a wagon in the room, and the battery was on the wagon, and R1 hypothesized that a certain action which it called PULLOUT(WAGON, ROOM) would result in the battery being removed from the room. Straightaway it acted, and did succeed in getting the battery out of the room before the bomb went off. Unfortunately, however, the bomb was also on the wagon. R1 knew that the bomb was on the wagon in the room, but didn't realize that pulling the wagon would bring the bomb out along with the battery. Poor R1 had missed that obvious implication of its planned act.
Back to the drawing board. "The solution is obvious," said the designers. "Our next robot must be made to recognize not just the intended implications of its acts, but also the implications about their side-effects, by deducing these implications from the descriptions it uses in formulating its plans." They called their next model, the robot-deducer, R1D1. They placed R1D1 in much the same predicament that R1 had succumbed to, and as it too hit upon the idea of PULLOUT(WAGON, ROOM) it began, as designed, to consider the implications of such a course of action. It had just finished deducing that pulling the wagon out of the room would not change the colour of the room's walls, and was embarking on a proof of the further implication that pulling the wagon out would cause its wheels to turn more revolutions than there were wheels on the wagon . . . when the bomb exploded.
Back to the drawing board. "We must teach it the difference between relevant implications and irrelevant implications," said the designers, "and teach it to ignore the irrelevant ones." So they developed a method of tagging implications as either relevant or irrelevant to the project at hand, and installed the method in their next model, the robot-relevant-deducer, R2D1 for short. When they subjected R2D1 to the test that had so unequivocally selected its ancestors for extinction, they were surprised to see it sitting, Hamlet-like, outside the room containing the ticking bomb, the native hue of its resolution sicklied o'er with the pale cast of thought, as Shakespeare (and more recently Fodor) has aptly put it. "Do something!" they yelled at it. "I am," it retorted. "I'm busily ignoring some thousands of implications I have determined to be irrelevant. Just as soon as I find an irrelevant implication, I put it on the list of those I must ignore, and . . ." the bomb went off. [7]
Interest in the brain paradigm revived in the 1980s after a new, more powerful model of neuron functioning -- the neural network -- was developed and connectionist architectures (such as Danny Hillis's Connection Machine) were being designed and built. A neural network is a computational model which uses many relatively simple interconnected units (nodes) working in parallel. Each node is activated when the weighted sum of its inputs reaches a certain threshold level. The connectivity of the network -- how the nodes are connected to each other, and how many layers of nodes with similar characteristics are used -- is specified as part of the design of the neural network. Each connection between one node and another is assigned an initial weight (typically chosen at random), and these weights are modified by a learning rule as the neural network is trained to perform some task. The activation of a node is thus determined by a weighted sum of its inputs, which arrive over its connections from other nodes.
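The description above can be made concrete with a single threshold node. This is a minimal sketch, assuming the classic perceptron update as the learning rule (the text does not name a specific rule) and the logical AND function as the training task; `activate`, `train`, and every parameter value are illustrative choices.

```python
import random

def activate(weights, bias, inputs, threshold=0.0):
    """A node fires (1) when the weighted sum of its inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total >= threshold else 0

def train(samples, rate=0.1, max_epochs=1000, seed=0):
    """Perceptron rule (an assumed choice): nudge each weight by rate * error * input."""
    rng = random.Random(seed)
    n_inputs = len(samples[0][0])
    # Initial weights are chosen at random, as described in the text.
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in samples:
            error = target - activate(weights, bias, inputs)
            if error != 0:
                mistakes += 1
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
                bias += rate * error
        if mistakes == 0:      # every training sample is now classified correctly
            break
    return weights, bias

# Training data for logical AND: (inputs, desired output).
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(AND)
predictions = [activate(weights, bias, inputs) for inputs, _ in AND]
print(predictions)
```

A single node of this kind can only learn linearly separable functions such as AND; networks with several layers of nodes and other learning rules are needed for harder tasks.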
Several properties of neural networks have led to their widespread use in cognitive science and a variety of fields of application such as signal processing, pattern recognition, and optimization:
Learning -- Neural networks learn to improve their performance of a task during a training period. Learning is an intrinsic property of the model.
Robustness -- Neural networks tend to tolerate noisy input and behave appropriately in new situations.
Emergence -- Neural networks often exhibit behaviors which were not programmed into the system. For example, the NETtalk system, a neural network that learns to pronounce English text, starts to behave as if it had learned the rules of English pronunciation -- it can correctly pronounce words it has not seen before -- although there are no rules built into the network. (Internally, certain nodes start distinguishing between vowels and consonants, although the vowel-consonant distinction was not preprogrammed and was not explicitly taught.)
The problems associated with classical AI approaches, including the frame problem, led some researchers to study intelligence from a completely different vantage point. They employed a synthetic methodology -- the creation of artificial systems that modelled certain aspects of natural systems -- to create autonomous agents, mobile robots that behave in the real world without the intervention of a human. The robots have sensors to perceive the environment, and they perform actions that change the environment.
Some key concepts of this approach, which some have dubbed embodied cognitive science, are:
Embodiment -- Autonomous agents are real physical agents which must interact with their environments. While this complicates some design issues (since the agent must be able to react to unexpected events), other issues are simplified. For example, the frame problem is effectively eliminated -- the agent can use its sensors to find out the state of the world after it performs some action. Some researchers have asserted that intelligence can only emerge from embodied agents. [8]
Situatedness -- "An agent is 'situated' if it can acquire information about the current situation through its sensors in interaction with the environment. . . . [Situated agents] are much better at performing in real time because they exploit the system-environment interaction and therefore minimize the amount of world modeling required." [9]
Emergence -- The goal of embodied cognitive science is to design agents that display emergent behaviors, especially the sort due to agent-environment interaction. This is achieved by designing at a low level rather than a high one. For example, rather than programming a robot to be attracted to light, the robot's light sensors could be wired to its drive motors so that it turns toward a light source.
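The light-seeking example can be sketched as a small simulation. This is only an illustration under stated assumptions: a two-wheeled robot with two forward-facing light sensors, each wired to the motor on the opposite side (crossed wiring in the style of Braitenberg's vehicles, which the text does not specify), and an invented intensity formula. All names and parameter values are hypothetical.

```python
import math

def light_intensity(sensor_xy, light_xy):
    """Assumed sensor model: the reading falls off with squared distance."""
    dx = light_xy[0] - sensor_xy[0]
    dy = light_xy[1] - sensor_xy[1]
    return 1.0 / (1.0 + dx * dx + dy * dy)

def step(pose, light_xy, dt=0.1, gain=20.0, wheel_base=0.5, sensor_reach=0.5):
    """Advance the robot one time step. There is no 'seek light' rule anywhere."""
    x, y, heading = pose
    # The two sensors sit ahead of the robot, 45 degrees to either side.
    left_angle = heading + math.pi / 4
    right_angle = heading - math.pi / 4
    left_sensor = (x + sensor_reach * math.cos(left_angle),
                   y + sensor_reach * math.sin(left_angle))
    right_sensor = (x + sensor_reach * math.cos(right_angle),
                    y + sensor_reach * math.sin(right_angle))
    left_reading = light_intensity(left_sensor, light_xy)
    right_reading = light_intensity(right_sensor, light_xy)
    # Crossed wiring: each sensor drives the motor on the opposite side.
    left_motor = gain * right_reading
    right_motor = gain * left_reading
    # Differential-drive kinematics: unequal wheel speeds turn the robot.
    speed = (left_motor + right_motor) / 2.0
    turn_rate = (right_motor - left_motor) / wheel_base
    heading += turn_rate * dt
    x += speed * math.cos(heading) * dt
    y += speed * math.sin(heading) * dt
    return (x, y, heading)

start = (0.0, 0.0, 0.0)      # at the origin, facing along the +x axis
light = (0.0, 5.0)           # light source off to the robot's left

after_one_step = step(start, light)
print(after_one_step)
```

Because the sensor nearer the light produces the stronger reading and drives the far-side wheel faster, the robot's heading rotates toward the light. The attraction is a consequence of the wiring and the environment, not of any explicit instruction.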
Declarative representations are based on the assumption that complex entities can be described as collections of attributes and their associated values. They typically use a slot-and-filler notation to represent the values of various attributes (characteristics, features) of the objects and concepts being represented. For example, in slot-assertion notation (or predicate notation), we might represent the sentence "Mary is the mother of Bobby" as:
(mother-of Mary Bobby)
In Object-Attribute-Value (OAV) notation, we could represent information about a block in the "blocks-world" microworld as
(B1 shape cube)
(B1 color green)
(B1 left-of B2)
(B1 supports B3)
or in a property list, which associates with each object a list of attribute-value pairs:
(B1
  (shape cube)
  (color green)
  (left-of B2)
  (supports B3)
)
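The two notations carry the same information, which a short sketch can make explicit. The code below assumes the OAV facts are stored as Python tuples and groups them into one attribute-value dictionary per object; `to_property_lists` and `lookup` are hypothetical helper names.

```python
# OAV facts from the blocks-world example, written as (object, attribute, value).
oav_facts = [
    ("B1", "shape", "cube"),
    ("B1", "color", "green"),
    ("B1", "left-of", "B2"),
    ("B1", "supports", "B3"),
]

def to_property_lists(triples):
    """Group OAV triples into a property list (attribute -> value) per object."""
    property_lists = {}
    for obj, attribute, value in triples:
        property_lists.setdefault(obj, {})[attribute] = value
    return property_lists

def lookup(property_lists, obj, attribute):
    """Return the value stored for an attribute, or None if it is not recorded."""
    return property_lists.get(obj, {}).get(attribute)

property_lists = to_property_lists(oav_facts)
print(property_lists["B1"])
print(lookup(property_lists, "B1", "color"))
```

The property list makes it cheap to retrieve everything known about one object, while the flat OAV triples are easier to search by attribute or value.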
Each of the following four methods of representing knowledge uses a notation like one of those above, along with a set of inference rules, to extract knowledge from the knowledge base to solve problems.
An alternative method of representing knowledge uses only a small amount of information stored in declarative notation (as facts), representing most of its knowledge in a set of procedures which use the set of facts to derive additional facts, to verify assertions, or to accomplish tasks. For example, we would not store the fact that 2 + 3 = 5, but represent the concept of addition in a procedure which, given two numbers, would compute their sum.
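A minimal sketch of the procedural style, assuming a tiny fact base in the slot-assertion notation used earlier: the sum 2 + 3 = 5 is never stored but computed on demand, and a procedure derives a new fact from stored ones. The fact about "Ann" and the function names are hypothetical additions for illustration.

```python
# A small set of stored facts; everything else is computed by procedures.
facts = {
    ("mother-of", "Mary", "Bobby"),
    ("mother-of", "Ann", "Mary"),    # hypothetical extra fact for the example
}

def add(a, b):
    """The concept of addition as a procedure: no table of sums is stored."""
    return a + b

def mothers_of(facts, child):
    """Use the stored facts to find the mother(s) recorded for `child`."""
    return {m for (relation, m, c) in facts if relation == "mother-of" and c == child}

def maternal_grandmothers_of(facts, child):
    """Derive facts that are not stored anywhere: the mothers of the child's mother."""
    return {g for m in mothers_of(facts, child) for g in mothers_of(facts, m)}

def verify(facts, relation, subject, obj):
    """Verify an assertion against the stored facts."""
    return (relation, subject, obj) in facts

print(add(2, 3))
print(maternal_grandmothers_of(facts, "Bobby"))
print(verify(facts, "mother-of", "Mary", "Bobby"))
```

The grandmother relationship appears nowhere in the fact base; it exists only as the behavior of a procedure, which is the trade-off this representation makes.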
Winograd's SHRDLU uses procedural representation:
The definition of every word is a program which is called at an appropriate point in the analysis, and which can do arbitrary computations involving the sentence and the present physical situation. [14]
No knowledge representation is completely declarative or completely procedural -- these categories represent the endpoints of a continuum of possible representations.
1. Elaine Rich, Artificial Intelligence, McGraw-Hill, 1983, p. 1.
2. Minsky and Papert, Perceptrons, MIT Press, 1969.
3. Newell and Simon, "Computer Science as Empirical Inquiry: Symbols and Search", Communications of the ACM, vol 19, no. 3, Mar. 1976, p. 116.
4. A. Newell, Unified theories of cognition, Harvard University Press, 1990, cited in Pfeifer and Scheier, Understanding Intelligence, MIT Press, 1999.
5. Pfeifer and Scheier, ibid., Figure 2.5, page 45.
6. McCarthy and Hayes, "Some philosophical problems from the standpoint of artificial intelligence," Machine Intelligence, 4 (1969), pp. 463-502 [cited in Pfeifer and Scheier].
7. Daniel Dennett, "Cognitive wheels: The frame problem of AI," in C. Hookway (Ed.), Minds, machines and evolution, Baen Books, 1987, pp. 41-42.
8. Rodney Brooks, "Intelligence without representation," Artificial Intelligence, 47 (1991), pp. 139-160.
9. Pfeifer and Scheier, ibid., page 72.
10. Quillian, 1968.
11. Schank, 1973.
12. Minsky, 1975.
13. Schank and Abelson, 1977.
14. Winograd, "A Procedural Model of Language Understanding", in Computer Models of Thought and Language, p. 170, cited in Hofstadter, Goedel, Escher, Bach.
Copyright © 2000 Jonathan Mohr