Coy - Like Carp, Only Prettier

Damian Conway

School of Computer Science and Software Engineering
Monash University
Clayton 3168, Australia

damian@csse.monash.edu.au
http://www.csse.monash.edu.au/~damian

 

Abstract

Before use Coy: run
code...read rebuke. After use
Coy: run code...haiku!

Introduction

Error messages
strewn across my terminal.
A vein starts to throb.

Their reproof adds the
injury of insult to
the shame of failure.

When a program dies
what you need is a moment
of serenity.

The Coy.pm
module brings tranquillity
to your debugging.

The module alters
the behaviour of die and
warn (and croak and carp).

It also provides
transcend and enlighten, two
Zen alternatives.

Like Carp.pm,
Coy reports errors from the
caller's point-of-view.

But it prefaces
the bad news of failure with
a soothing poem.

Haiku as error messages

The use of haiku
to couch an error message
is by no means new.

The easiest way
to ornament errors is
with a "canned" haiku.

Salon magazine[1]
suggested this approach in
1998.

They asked readers to
submit error messages
written as haiku.

The winning entries
are now widely known. The best
of them is perhaps:

Three things are certain:
Death, taxes, and lost data.
Guess which has occurred.

But just as canned fish
soon grow less appetizing,
so too canned poems.

Inevitably,
constant repetition robs
them of their piquance.

Besides, there are too
many error messages
that need a haiku.

Perl's diagnostics
alone would require just
under 500.

And, of course, there's an
endless supply of user-
defined messages.

Synthetic haiku

Coy's haiku are not
"canned". They are generated
freshly every time.

But it's not the first
system designed to create
synthetic haiku.

The Internet is
awash with generators
of Japanese verse.

A simple search[2] finds
100,000 links for:
"generate haiku".

Silicon Graphics
rigged a lava-lamp[3] to build
random-word verses:

i think i'm wasted
i'll wax the cats. cool clear earth
pigs are smarter. crash

That is one of its
clearer efforts. Mostly it
just spouts gibberish.

In contrast, Garret
Kaminaga[4] created
a "haiku grammar".

Its simple rules (see
Figure 1) expand to give
correct syllables.
 

haiku: 
five_line seven_line five_line
five_line: 
one four | one three one | one one three | one two two | 
one two one one | one one two one | four one | five 
seven_line: 
one one five_line | two five_line | five_line one one | five_line two 
one: 
red | white | black | sky | dawns | breaks | falls | cranes | 
rain | pool | my | your | sun | clouds | tree | Zen
two: 
drifting | purple | mountains | faces | empty | temple | 
ocean | thinking | zooming | rushing | over | ricefields
three: 
peasant farms | computer | sashimi | fishing boats | ethernet
four: 
CD Player | aluminum | yakitori | chrysanthemums
five: 
resolutional | rolling foothills rise
Figure 1: Haiku
generating grammar (by
G. Kaminaga).

But they don't encode
any English grammar, so
the results are poor:

empty computer
yakitori to empty
your chrysanthemums

Lisa Schmeiser and
Cliff Maier string together
random text fragments[5].

These flow better, but
still betray a tell-tale "stream-
of-consciousness" feel:

bagels in the morn
my dreams leaving me empty
blame the mosquitos

Richard Decker and
Stuart Hirshfield[6] also use
grammars for haiku.

But theirs are based on
real English sentence structures
(as in Figure 2).

As a result, they
generate quite plausible
(and lovely) haiku:

A liquid summer
wind. Under the gusty sky
a storm whispers. [sic]
 
haiku: 
form1 | form2 | form3
form1: 
article adjective noun 
article noun verb preposition article noun 
adjective adjective noun
form2: 
noun preposition article noun 
article adjective noun preposition article noun 
adjective noun
form3: 
article adjective adjective noun 
preposition article adjective noun 
article noun verb
noun: 
waterfall | river | breeze | moon | rain | wind | sea | sky | storm
verb: 
shakes | drifts | has stopped | struggles | whispers | grows | flys
adjective: 
liquid | gusty | flowing | autumn | hidden | bitter | misty | summer
Figure 2: Haiku
generating grammar (by
Decker and Hirshfield).
 
Unfortunately,
their English grammar doesn't
encode syllables.

Consequently, most
of the haiku they produce
don't scan correctly.

As these samples show,
a haiku generator
must balance two things:

It must use correct
English syntax and it has
to track syllables.

Structure of the Coy.pm Generator

Coy's data-driven
poem generator has
five main components:

vocabulary,
associations, grammar,
inflection, metre.

The following five
subsections describe each of
these tasks in detail.

Encoding vocabulary

Ultimately, a
haiku is just a sequence
of well-chosen words.

Coy's words are stored in
a hierarchical, cross-linked
vocabulary.

Figure 3 shows an
abbreviated sample
of the database.
 

$database = {
    duck => {
        category => [ "bird" ],
        sound    => [ "quacks", ],
        act      => { swims => { location     => "suraquatic",
                                 direction    => "horizontal",
                                 synonyms     => [ "paddles" ],
                                 associations => "sink wet", }, },
    },
    fox => {
        category => [ "animal", "hunter" ],
        sound    => [ "barks" ],
        act      => { trots => { location     => "terrestrial",
                                 associations => "smart problem", }, },
    },
    lover => {
        category => [ "human" ],
        sound    => [ "sighs", "laughs" ],
        minimum  => 2,
        maximum  => 2,
        act      => {  kisses   => { location     => "terrestrial",
                                     associations => "connection", },
                       quarrels => { location     => "terrestrial", 
                                     associations => "argument", }, },
    },
};
Figure 3: Structure 
of the Coy hierarchical 
vocabulary.
 
The top level of
word categorization
is by subject nouns.

For each such noun, a
set of categories and
sounds is then given.

The categories
relate the noun to standard
actions (see below).

The sounds are used as
verbs to generate clauses
describing noises.

In addition, a
list of more general verb forms
("act") is specified.

Each such verb, listed
in third person singular,
may take attributes.

These attributes list
constraints on the verb's usage
(such as location).

The entry for "duck" =>
"swims", for instance, locates it
as "suraquatic".

Other attributes
limit the subject count for
particular verbs.

"lover" => "kisses", for
example, is limited
to exactly 2.

(What can I say? It's
a very traditional
style of poetry.)

Nouns and verbs may be
given synonym lists to
cut repetition.

Verbs also list their
associations (see the
following section).

Many common verbs
can be applied across a
general class of nouns.

For example, all
nouns representing fish can
take the verb to swim.

Relationships of
this type can be stored in Coy's
vocabulary.

A noun's entry can
specify categories
to which it belongs.

Such categories
are listed separately
in the database.

Each is formatted
like a noun: with verbs, sounds, and
associations.

When the database
is loaded, categories
are "distributed".

Coy identifies
noun entries specified with
a category.

It then adds to such
entries the category's
list of attributes.
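
A minimal sketch
of that distribution step
might look like this one.

The sub and hash names
are hypothetical; Coy's
own code may differ.

```perl
use strict;
use warnings;

# Hypothetical category table, assumed to be stored apart from the nouns.
my %category = (
    bird => { sound => [ "sings" ],
              act   => { flies => { location => "aerial" } } },
);

# Hypothetical noun entry, modelled loosely on Figure 3.
my %noun = (
    duck => { category => [ "bird" ],
              sound    => [ "quacks" ],
              act      => { swims => { location => "suraquatic" } } },
);

# Copy each category's sounds and acts into every noun that names it.
# A noun's own entry wins when both define the same verb.
sub distribute {
    my ($nouns, $categories) = @_;
    for my $entry (values %$nouns) {
        for my $name (@{ $entry->{category} || [] }) {
            my $cat = $categories->{$name} or next;
            push @{ $entry->{sound} }, @{ $cat->{sound} || [] };
            for my $verb (keys %{ $cat->{act} || {} }) {
                $entry->{act}{$verb} //= $cat->{act}{$verb};
            }
        }
    }
    return $nouns;
}

distribute(\%noun, \%category);
print join(", ", sort keys %{ $noun{duck}{act} }), "\n";   # flies, swims
```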

Finding associations

Coy's next component
is an association
selection system.

This system ensures
that the haiku relates to
the error message.

The message is first
scanned to find significant
words (principally nouns).

These words are found by
deleting "stop words" from the
original text.

The remaining words
become a "filter" for the
vocabulary.

Coy then expands this
filter by augmenting words
with their synonyms.

Each word selected
for the haiku is compared
against the filter.

If the selected
word's associations don't
match, it's rejected.

This leads to problems
though, if the filter words are
too unusual.

In extreme cases,
they may filter out the whole
vocabulary.

To prevent this, Coy
can turn the word filter off
temporarily.

It does so if the
selection success rate falls
below 5%.

That allows words to
be chosen, so that haiku
creation proceeds.

When the selection
rate rises again, Coy turns
the filter back on.

This balances the
desire for relevance with
the need to progress.
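
The filter steps can
be sketched in Perl as follows
(with assumed details).

The stop words, the subs,
and the synonym table
are all invented.

It tests a single
pool, where Coy tracks the success
rate of selection.

```perl
use strict;
use warnings;

# Hypothetical stop list and synonym table; real ones would be far larger.
my %stop    = map { $_ => 1 } qw(a an the of in to is not no);
my %synonym = ( bad => [ "wrong" ], argument => [ "quarrel", "fight" ] );

# Build the filter: significant message words plus their synonyms.
sub make_filter {
    my ($message) = @_;
    my %filter;
    my @words = grep { !$stop{$_} } map { lc($_) } $message =~ /([A-Za-z]+)/g;
    for my $word (@words) {
        $filter{$_} = 1 for $word, @{ $synonym{$word} || [] };
    }
    return \%filter;
}

# A candidate is relevant if any of its associations is in the filter.
sub relevant {
    my ($associations, $filter) = @_;
    my @hits = grep { $filter->{$_} } split ' ', $associations;
    return scalar @hits;
}

# Choose one candidate (name => associations). If the share that passes
# the filter falls below $threshold, ignore the filter for this choice.
sub choose {
    my ($candidates, $filter, $threshold) = @_;
    my @names = sort keys %$candidates;
    return unless @names;
    my @pass = grep { relevant($candidates->{$_}, $filter) } @names;
    my @pool = (scalar(@pass) / scalar(@names) < $threshold) ? @names : @pass;
    return $pool[ rand @pool ];
}

my %acts = ( kisses => "connection", quarrels => "argument" );
print choose(\%acts, make_filter("Bad argument"), 0.05), "\n";   # quarrels
```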

Encoding grammatical structures

The third component
generates the haiku, by
filling in templates.

Those templates encode
various grammatical
structures for haiku.

The generator
selects one and fills it in
with relevant words.

Figure 4 shows a
few of the grammatical
templates Coy uses.
 

haiku_fragment:
sentence | description | exclamation 
sentence: 
noun verb | 
noun verb direction | 
noun verb location
description: 
noun location | 
pres_participle noun
exclamation: 
noun
verb: 
simple_present | 
pres_participle
Figure 4: Sample 
English grammar templates used 
by Coy.pm.
 

Note that the grammar
has no terminals. They're drawn
from the database.

Templates are chosen
at random, as often as
needed (see below).

The chosen template
is then filled in with "filtered"
semi-random words.

The noun to be used
is randomly selected,
and constrains the verb.

The verb is chosen
from those specified for that
particular noun.

Any other parts
of the grammar are likewise
constrained by the verb.

These are typically
adverbial phrases of
place or direction.

For instance, suppose
the filtered noun chosen is
the word hummingbird.

Immediately
this constrains the verb to words
like flies, darts, or nests.

If flies were chosen,
that would then constrain the place
to be aerial.

Whereas, if nests were
chosen, the place would have to
be arboreal.

Note that Coy needs no
A.I. techniques to enforce
these sequenced constraints.

The hierarchical
vocabulary structure
itself ensures them.
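
Here is a small sketch
of how nested hashes can
enforce those constraints.

The data and names
are hypothetical, not
drawn from Coy itself.

```perl
use strict;
use warnings;

# Hypothetical entry, in the style of Figure 3.
my %vocab = (
    hummingbird => {
        act => { flies => { location => "aerial"   },
                 nests => { location => "arboreal" } },
    },
);

# Hypothetical phrase for each kind of location.
my %place = ( aerial => "above the meadow", arboreal => "in an old elm" );

# Fill the template "noun verb location". The verb is looked up under
# the noun, and the location under the verb, so only valid
# combinations can ever be reached.
sub fill_sentence {
    my ($noun, $pick) = @_;            # $pick chooses one item from a list
    my $acts     = $vocab{$noun}{act};
    my $verb     = $pick->(sort keys %$acts);
    my $location = $acts->{$verb}{location};
    return "a $noun $verb $place{$location}";
}

print fill_sentence("hummingbird", sub { $_[0] }), "\n";
# a hummingbird flies above the meadow
```

Each later choice is
looked up inside the last one,
so none can conflict.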

Word inflections

But selecting the
right parts of speech--and words to
match--is not enough.

The module must then
adjust the selected words'
grammatical form.

Specifically, the
words used must be inflected
for number and tense.

Lingua::EN::Inflect
is used to supply correct
noun/verb agreement

(specifically, the
exportable PL_N
and PL_V subs).

Currently, tense is
restricted to the present
or continuous.

That's not a problem
though--most haiku are written
in those two tenses.

Verbs are stored in the
vocabulary in the
present tense only.

Lingua::EN::Inflect
can now inflect present tense
to continuous.

This transformation
is provided via the
new PART subroutine.

(Inflecting present
participles is harder
than it might first seem.

Consider the verbs:
bat, combat, eat, bite, fulfil(l),
lie, sortie, and ski.)
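
To see why, here is
a naive rule-based sketch for
the participle.

It is not the code
found in Lingua::EN::Inflect;
the rules are assumed.

```perl
use strict;
use warnings;

# Cases the rules below would get wrong (an illustrative list only).
my %irregular = ( sortie => "sortieing", fulfil => "fulfilling" );

# Naive present-participle rules. An assumption-laden sketch:
# this is not the algorithm used by Lingua::EN::Inflect.
sub participle {
    my ($verb) = @_;
    return $irregular{$verb} if exists $irregular{$verb};
    return $1 . "ying" if $verb =~ /^(.*)ie$/;             # lie  -> lying
    return $1 . "ing"  if $verb =~ /^(.*[^aeiou])e$/;      # bite -> biting
    return $verb . $1 . "ing"                              # bat  -> batting
        if $verb =~ /^[^aeiou]*[aeiou]([^aeiouwxy])$/;     # (one short vowel)
    return $verb . "ing";                                  # eat, ski, combat
}

print join(" ", map { participle($_) } qw(bat combat eat bite lie ski)), "\n";
# batting combating eating biting lying skiing
```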

Enforcing metrical constraints

The four components
above ensure the haiku
parses and makes sense.

However, there's no
guarantee that the result
scans 5-7-5.

To ensure perfect
metre, each selected word's
syllables are checked.

This occurs whilst the
grammar templates are filled in
(as words are filtered).

The selector tracks
the progressive syllable
count of the words used.

If the count exceeds
17, the selector
can reject a word.

The selection can
also backtrack further, if
that's necessary.

In some cases this
might cause the template itself
to be rejected.

The template-filling
process then repeats until
the full haiku scans.
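
A rough sketch of that
check follows. The vowel-group
counter is a guess;

Coy's own counter and
its backtracking selector
are not reproduced.

The sub names used here
(syllables and lay_out) are
hypothetical.

```perl
use strict;
use warnings;

# Heuristic count: one syllable per group of vowels, after dropping
# a final silent "e" (but keeping endings such as "-ble" or "-tle").
sub syllables {
    my ($word) = @_;
    my $w = lc $word;
    $w =~ s/[^a-z]//g;
    $w =~ s/e$// unless $w =~ /[^aeiouy]le$/;
    my $count = () = $w =~ /[aeiouy]+/g;
    return $count || 1;
}

# Lay words out as 5-7-5. Returns three lines, or an empty list if a
# word straddles a line break or the total is not 17 syllables.
sub lay_out {
    my @words  = @_;
    my @target = (5, 7, 5);
    my @lines  = ([], [], []);
    my ($line, $used) = (0, 0);
    for my $word (@words) {
        return if $line > 2;                  # words left over: too long
        $used += syllables($word);
        return if $used > $target[$line];     # overshoots this line: reject
        push @{ $lines[$line] }, $word;
        ($line, $used) = ($line + 1, 0) if $used == $target[$line];
    }
    return if $line != 3;                     # ran out of words: too short
    return map { join " ", @$_ } @lines;
}

my $text = "A pair of lovers quarrel beside a stream. Four thrushes fly away.";
print "$_\n" for lay_out(split ' ', $text);
```

A selector would
call this check on each attempt,
retrying on failure.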

The Coy.pm interface

Coy.pm works
by assigning a handler
to $SIG{__DIE__}.

That handler passes
the error it receives through
the generator.

It then re-calls die
with the resulting haiku
as its argument.

The same approach is
applied to $SIG{__WARN__},
to catch warnings too.

As a result, all
exceptions thrown by die, warn,
croak, or carp are caught.

The Coy.pm
module also exports two
extra subroutines.

These are transcend and
enlighten, which lend a Zen
overtone to code.

Internally though,
these two subs are just wrappers
around croak and carp.
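
The handler scheme can
be sketched as shown below, with
a stand-in poem.

haiku_for is a
hypothetical sub, not
part of the module.

```perl
use strict;
use warnings;

# Stand-in for the generator: always returns one fixed poem (the first
# haiku quoted under "Sample results") instead of composing a new one.
sub haiku_for {
    my ($message) = @_;
    return "A pair of lovers\nquarrel beside a stream. Four\nthrushes fly away.\n";
}

# Re-throw each exception with a poem in front of the original text.
# Perl disables the __DIE__ and __WARN__ hooks while their handlers
# run, so calling die or warn here does not recurse.
$SIG{__DIE__}  = sub { die  haiku_for($_[0]) . "\n" . $_[0] };
$SIG{__WARN__} = sub { warn haiku_for($_[0]) . "\n" . $_[0] };

eval { die "Bad argument\n" };
my $error = $@;
print $error;
```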

Sample results

This section gives some
examples of the haiku
that Coy produces.

Given the error
message: die "Bad argument",
Coy generates this:

A pair of lovers
quarrel beside a stream. Four
thrushes fly away.

Note the allusion
to the bad argument in
the error message.

Haiku are never
repeated. A second die
"Bad argument" gives:

Two old men fighting
under a sycamore tree.
Homer Simpson sighs.

In contrast, for a
croak "Missing file", Coy reflects
the sense of loss with:

Bankei weeping by
a lake. Ryonen dying.
Seven howling bears.

Coy cannot always
achieve this high level of
(oblique) relevance.

For example, it
also produced this response
to croak "Missing file":

A swallow nesting
in the branches of an elm
tree. A waiting fox.

Sometimes Coy's output
suggests a macabre sense
of humour, as in:

A wolf leaps under
a willow. Two old men sit
under the willow.

In other cases,
its inscrutability
is most authentic:

Two young women near
Bill Clinton's office. A cat
waiting by a pond.
 

Current limitations

In its current form,
the module has four problems
and limitations.

This limits the range
of topics that the haiku
produced can cover.

That in turn leads to
tell-tale repetition (which
fails the Turing test).

Extending the range
of words Coy.pm can
use is no problem

(though finding the time
and the creativity
required may be).

Hence it's often not
able to find relevant
words for a message.

This leads to haiku
utterly unrelated
to the error text.

Again, there is no
technical difficulty
in adding more links:

Defining enough
associations isn't
hard, just tedious.

This leads to haiku
that are (structurally, at
least) monotonous.

Yet again, this needs
no technical solution,
just time and effort.

Of course, such enhanced
templates might require richer
vocabulary.

For example, verb
predicates would need extra
database structure:

Each verb entry would
have to be extended with
links to object nouns.

The algorithmic
syllable counter is still
being developed.

It is currently
around 92%
accurate (per word).

This means that correct
syllable counts for haiku
can't be guaranteed.

Syllable counts for
single words are correct to
±1.

In a multi-word
haiku these errors cancel
out in most cases.

Thus, the haiku tend
to be correct within one
or two syllables.

As the syllable
counter slowly improves, this
problem will abate.

Future work

The Coy.pm
module still has ample scope
for development.

Increasing Coy's range
of vocabulary is
clearly essential.

Both its content and
its cross-linked structure need to
be greatly enhanced.

This, in turn, would make
it possible to extend
the grammar templates.

Some new syntactic
formats would also provide
more variety.

A side-effect might
be haiku that are smoother
and less fragmented.

The problem, of course,
is the effort required
to add this data.

Automation of
data acquisition might
be one solution.

Coy could use data
from some general semantic
lexicon system.

For instance, it might
be possible to adapt
data from WordNet[7].

Conclusion

The Coy.pm
module frames errors within
synthetic haiku.

This reduces the
stress induced in the user
when a program fails.

The haiku are fresh
each time, and related to
the error message.

They conform to the
rules of English grammar and
Japanese metre.

As usual, the
module is available
via the CPAN.

References

  1. http://www.salonmagazine.com/21st/chal/1998/02/10chal2.html
  2. http://www.altavista.com/cgi-bin/query?q=generate+haiku
  3. http://lavarand.sgi.com/cgi-bin/haiku.cgi
  4. http://www.cs.stanford.edu/~zelenski/rsg/grammars/Haiku.g
  5. http://inp.cie.rpi.edu/cgi-bin/haiku
  6. http://odyssey.thomson.com/brookscole/compsci/aeonline/course/9/2/index.html
  7. http://www.cogsci.princeton.edu/~wn/