Chapter 2, Paradigms of AI Programming: Case Studies in Common Lisp, Peter Norvig.
Consider a portion of rules:
This grammar description is called a "context-free phrase-structure grammar". The idea is that anywhere we want a sentence, we can generate a noun phrase followed by a verb phrase. Anywhere a noun phrase has been specified, we generate instead an article followed by a noun. Anywhere an article has been specified, we generate either "the", "a", or some other article.
The formalism is "context-free" because the rules apply anywhere regardless of the surrounding words, and the approach is "generative" because the rules as a whole define the complete set of sentences in a language (and by contrast the set of nonsentences as well).
For example, the derivation of a single sentence using these rules can be written as follows:
We now want to develop a short Lisp program that generates random sentences from a phrase-structure grammar. The most straightforward approach is to represent each grammar rule by a seperate Lisp function.
(defun sentence ()
(append (noun-phrase) (verb-phrase)))
(defun noun-phrase ()
(append (article) (noun)))
(defun verb-phrase ()
(append (verb) (noun-phrase)))
(defun article ()
(one-of '(the a)))
(defun noun ()
(one-of '(man ball woman table)))
(defun verb ()
(one-of '(hit look saw liked)))
Each of these functions has an empty parameter list, (). That means the functions take no arguments. This is unusual because, strictly speaking, a function with no arguments would always return the same thing, so we would use a constant instead.
However, these functions make use of the "random" function (as we will see shortly), and thus can return different results even with no arguments. Thus, they are not functions in the mathematical sense, but they are still call functions in Lisp, since they return a value.
All that remains now is to define the function "one-of". It takes a list of possible choices as an argument, chooses one of these at random, and return a one-element list of the element chosen. The last part is so that all functions in the grammar will return a list of words. By that way, we can freely apply "append" to any category.
(defun one-of (set)
"Pick one element of set, and make a list of it."
(list (random-elt set)))
(defun random-elt (choice)
"Choose an element for a list at random."
(elt choice (random (length choice))))
There are two new functions here, "elt" and "random". "elt" picks an element out of a list, when the first argument is the list, and the second is the position of the element we want to pick out of that list. Note that the position for "elt" function begins at 0, so (elt choices 0) would return the first element of the list, and (elt choices 1) would return the second.
The expression (random n) returns an integer from 0 to n-1, so that (random 4) would return either 0, 1, 2 or 3.
Now we can test the porgram by generating a few random sentences, along with a noun phrase and a verb phrase:

If we trace our program, we will see the following:

Suppose that we now want to allow noun phrases to be modified by an indefinite number of adjectives and an indefinite number of prepositional phrases. In generative context-free grammar notation, we might have the following rules:
In the above notation, 0 indicates a choice of nothing at all, a comma indicates a choice of several alternatives, and the asterisk is nothing special--as in Lisp, it's just part of the name of a symboil. However, the convention used here is that name ending in an asterisk denotes zero or more repetitions of the underlying name. That is, PP* denotes zero or more repetitions of PP. This is known as "Kleene star" notation, named after the mathematician "Stephen Cole Kleene".
The problem is that the rules for Adj* and PP* contain choices that we would have to represent as some kind of conditional in Lisp. For example:
(defun adj* ()
(if (= (random 2) 0) nil (append (adj) (adj*))))
(defun pp* ()
(if (random-elt '(t nil)) (append (pp) (pp*)) nil))
(defun noun-phrase ()
(append (article) (adj*) (noun) (pp*)))
(defun pp ()
(append (prep) (noun-phrase)))
(defun adj ()
(one-of '(big little blue green long)))
(defun prep ()
(one-of '(to in by with on)))
Note that either implementation of Adj* and PP* would work. Now let's look at the results of our complex grammar:

An Alternative implementation of the above generative context-free grammar would concentrate on making it easy to write grammar rules and would worry later about how they will be processed. Let's look again at he original grammar rules:
Each rule consists of an arrow with a symbol on the left-hand side and something on the right-hand side. The complication is that there can be two kinds of right-hand sides: a concatenated list of symbols, as in "Noun-Phrase => Article + Noun", or a list of alternative words, as in "Noun => man, ball, ..." We can account for these possibilities by deciding that every rule will have a list of possibilities on the right-hand side, and that a concatenated list, for example "Article + Noun", will be represented as a Lisp list, for example "(Article Noun)". The list of rules can then be now represented as follows:
(defparameter *simple-grammar*
"A grammar for a trivial subset of English."
'((sentence -> (noun-phrase verb-phrase))
(noun-phrase -> (Article Noun))
(verb-phrase -> (Verb noun-phrase))
(Article -> the a)
(Noun -> man ball woman table)
(Verb -> hit took saw liked)))
(defvar *grammar* *simple-grammar
"The grammar used by generate. Initially, this is *simple-grammar*,
but we can switch to other grammars.")
Note that the Lisp version of this rules closely mimics the original version. In particular, the symbol "->" included, even though it serves no real purpose; it is purely decorative.
The special forms defvar and defparameter both introduce special variables and assign a value to them. The difference is that a variable, like *grammar*, is routinely changed during the course of running the program. A parameter, like *simple-grammar*, on the other hand, will normally stay constant. A change to a parameter is considered a change to the program, not a change by the program.
Once the list of rules has been defined, it can be used to find the possible rewrites of a given category symbol. The function assoc is designed for just this sort of task. It takes two arguments, a "key" and a list of lists, and returns the first element of the list of lists that starts with the key. If there is none, it return nil.
> (assoc 'noun *grammar*)
(NOUN -> MAN BALL WOMAN TABLE)
Although rules are quite simply implemented as lists, it is a good idea to impose a layer of abstraction by defining functions to operate on the rules. We will now develop three functions: one to get the right-hand side of the rule, one for the left-hand side, and one to look up all the possible rewrites (right-hand sides) for a category.
(defun rule-lhs (rule)
"The left-hand side of a rule."
(first rule))
(defun rule-rhs (rule)
"The right-hand side of a rule."
(rest (rest rule)))
(defun rewrites (category)
"Return a list of the possible rewrites for this category."
(rule-rhs (assoc category *grammar*)))
We are now ready to address the main problem: defining a function that will generate sentences (or noun-phrases, or other category). We will call this function generate. It will have to contend with three cases:
(defun generate (phrase)
"Generate a random sentence or phrase."
(cond ((listp phrase) (mappend #'generate phrase))
((rewrites phrase) (generate (random-elt (rewrites phrase))))
(t (list phrase))))
From the above generate function, the first clause handles the third case, while the second handles the first, and the third handles the second. Note that we use the mappend function that apply fn to each element of list and append the results.
(defun mappend (fn the-list)
(apply #'append (mapcar fn the-list)))
We can see that the generate function and its companies are very short, but dense with information: the craft of programming includes knowing what not to write, as well as what to write.
This style of programming is called data-driven programming, because the data (the list of rewrites associated with a category) drives what the program does next. It is a natural and easy-touse style in Lisp, leading ro concise and extensible programs, because it is always possible to add a new piece of data with a new association without having to modify the original program.
Let's look at some examples when generate is in used.

There are many possible ways to write generate. The following version uses if instead of cond:
(defun generate (phrase)
"Generating a random sentences or phrase."
(if (listp phrase)
(mappend #'generate phrase)
(let ((choices (rewrites phrase)))
(if (null choices) (list phrase)
(generate (random-elt choices))))))
This version uses the special form let, which introduces a new variable (in this case, choices) and also binds the variable to a value. In this case, introducing the variable saves us from calling the function rewrites twice, as was done in cond version of generate.
Write a version of generate that uses cond but avoids calling rewrites twice.
We now define a new grammar in rule-based approach that includes adjectives, prepositional phrases, proper names, and pronouns. We can then apply the generate function without modification to this new grammar.
(defparameter *bigger-grammar*
'((sentence -> (noun-phrase verb-phrase))
(noun-phrase -> (Article Adj* Noun PP*) (Name) (Pronoun))
(verb-phrase -> (Verb noun-phrase PP*))
(PP* -> () (PP PP*))
(Adj* -> () (Adj Adj*))
(PP -> (Prep noun-phrase))
(Prep -> to in by with on)
(Adj -> big little blue green long)
(Article -> the a)
(Name -> Pat Kim Lee Terry Robin)
(Noun -> man ball woman table)
(Verb -> hit took saw liked)
(Pronoun -> he she it these those that)))
(setf *grammar* *bigger-grammar*)
And then, let's see the results:

From the output, it is clear that the program does not distinguish sensible from silly output.
One advantage of representing information in a declarative manner--as rules or facts rather than as Lisp functions--is that it can be easier to use the information for multiple purposes. Suppose we wanted a function that would generate not just the list of words in a sentence, but a representation of the complete syntax of a sentence. For example, instead of the list (a woman took a ball), we want to get the nested list:
(sentence (noun-phrase (article a) (noun woman))
(verb-phrase (verb took)
(noun-phrase (article a) (noun ball))))
This corresponds to the tree that linguists draw as the following:

Using a "straightforward solution" approach, we would be stuck; we would have to rewrite every function to generate the additional structure. With the "rule-based solution" version we could keep the grammar as it is and just write one new function: a version of generate that produces nested list. The two changes are to cons the category onto the front of each rewrite, and then not to append together the results but rather just list them with mapcar:
Last modified: Mon Jan 18 10:57:49 ICT 1999