I have a reasonable amount of experience with GA’s (PhD + some
industrial experience). This sounds interesting, but unfortunately I
don’t have time to get involved at this point. However, I am more than
willing to give you a few pointers…
In this sort of case you might wish to consider a system known as a
genetic algorithm classifier. These are detailed quite well in David
Goldberg’s book (he was the originator of the technique, or at least
the first implementor if IIRC).
There are 4 components to the system. The first is a set of rules,
a set of trinary strings or arrays (true, false, widcard). Each
element refers to one particular condition (and these conditions are
hard-coded). Each rule is of the form condition -response, i.e. if the
condition holds , then perform response.
The second component is a blackboard, from which messages can be read
and responses posted. Usually messages are eith inputs from the
outside world (e.g. what cards have been played in this round), or
responses from rules. Generally, rules check the conditions on the
blackboard, and then post a response, although you may wish to have
rules which read the blackboard and cause actions (e.g. play K
The third component is a credit assignment algorithm (usually known as
the bucket brigade algorithm). Usually this is the hardest bit to get
to work. The idea here is to try and rank the ruleset in order of
their utility. This is usually done as follows:
Each rule has a strength (initially assigned). Each cycle (after any
card is played?) each rule is checked against the blackboard. Any rule
whose conditions are met is offered a chance to operate. Each of these
rules bids a portion of its utility for the right to operate. One or
more rules are then chosen to operate using a roulette wheel algorithm
based on their bid utility (so the rule “Elvis is alive and playing in
the Ring’O Bells tonight” is occasionally taken out for a test drive!
- and we end up in the pub!). These rules then pay their bid to
operate (their responses are activated). Note that generally, we apply
the tax in a progressive manner, so we tax rules based on their
specificity - the more specific they are (fewer wildcards), the harder
they are taxed, so we both prefer specific rules to general ones, and
we make the cost of them getting things wrong higher.
What do we do with the tax. Well, it gets split between the rules that
were applied in the previous cycle (possibly minus a decay constant).
But we also introduce extra credit into the system every time a
positive result happens (e.g. we win a trick). This means that the
rules that lead to good results tend to increase in utility over time,
and the the bad ones decrease in utility; with the result that a
natural hierarchy or rules develops from any given initial ruleset
(e.g. randomly chosen). Over time, this algorithm will reward the
rules which are intermediate steps to good outcomes, because the
rewards get passed back up the “bucket brigade” by the taxation
system. However, this algorithm can have problems in that it can be
difficult to reward the earlier events inlong chains of events.
The fourth and final component is the genetic algorithm. Every once in
a while (at a slower rate than the bucket brigade algorithm runs at),
we use a GA to update the ruleset, replacing a couple of the weakest
rules using a GA (maybe 5% of the ruleset each GA epoch).
Hope this helps (I have to go and catch a train home now!)
OK, I have more useful things to be doing.
But sometimes I like to code just for fun.
I can admit that here, right?
What I had in mind for the Spades project
was originally a genetic algorithm type
That is, you define a genome based on some
options for bidding and playing, and you
play the members of the population against
each other as they evolve.
I think it’s interesting from a GA standpoint
(disclaimer: I’ve never written a GA program).
For one thing, it’s not clear that a simple
tournament would suffice to find optimal
strategies. You might just find a “local
maximum” where the surviving members can beat
the previous generation, but couldn’t
necessarily have beaten some earlier
If someone wants to work on it, let me know (the
GA or the cardbots). Otherwise, I’ll probably just
implement 40% of each one and quit.