The whole 'Spades' thing: GA and cardbots

OK, I have more useful things to be doing.

But sometimes I like to code just for fun.
I can admit that here, right? :slight_smile:

What I had in mind for the Spades project
was originally a genetic algorithm type
of thing.

That is, you define a genome based on some
options for bidding and playing, and you
play the members of the population against
each other as they evolve.

I think it’s interesting from a GA standpoint
(disclaimer: I’ve never written a GA program).
For one thing, it’s not clear that a simple
tournament would suffice to find optimal
strategies. You might just find a “local
maximum” where the surviving members can beat
the previous generation, but couldn’t
necessarily have beaten some earlier
generation.

For another thing, the game is typically
played with partners. So we want to mix things
up so that partners have the chance to be
complementary or to shoot each other in the
foot. It’s conceivable that an algorithm A
might fare very well when partnered with its
own kind, but an algorithm B might do better
with an arbitrary partner and thus win
overall.

Too much for my poor brain.

Another interesting idea:

Implement a ‘server’ (say, via DRb) that does
nothing but referee games – it shuffles and
deals, tracks bids, keeps score, kills players
who break the rules, etc.

Clients connect and act as players. This is
like the old C-bots or whatever (there must be
a million variations). You’re pitting your code
against someone else’s code.

A player class need only have a fairly simple
interface. It would need to respond to messages
like ‘bid’ and ‘play’ and so on. Perhaps we
could also have a method like ‘broken_rule’ so
that the server can tell the client it’s done
something illegal.

If it’s all done via DRb, there’s no chance (I
think?) for a client to cheat, say, by looking
into the object space and finding his opponent’s
cards.

I’m really just dreaming, but I do think it would
make a fun project.

If someone wants to work on it, let me know (the
GA or the cardbots). Otherwise, I’ll probably just
implement 40% of each one and quit. :slight_smile:

Cheers,
Hal

Is there a reason to use GA instead of genetic programming (GP)?
This sounds like the sort of problem GP is specifically good for.

James

···

-----Original Message-----
From: Hal E. Fulton [mailto:hal9000@hypermetrics.com]
Sent: Thursday, August 08, 2002 12:52 AM
To: ruby-talk ML
Subject: The whole ‘Spades’ thing: GA and cardbots

OK, I have more useful things to be doing.

But sometimes I like to code just for fun.
I can admit that here, right? :slight_smile:

What I had in mind for the Spades project
was originally a genetic algorithm type
of thing.

That is, you define a genome based on some
options for bidding and playing, and you
play the members of the population against
each other as they evolve.

I have a reasonable amount of experience with GA’s (PhD + some
industrial experience). This sounds interesting, but unfortunately I
don’t have time to get involved at this point. However, I am more than
willing to give you a few pointers…

In this sort of case you might wish to consider a system known as a
genetic algorithm classifier. These are detailed quite well in David
Goldberg’s book (he was the originator of the technique, or at least
the first implementor if IIRC).

There are 4 components to the system. The first is a set of rules,
expressed as
a set of trinary strings or arrays (true, false, widcard). Each
element refers to one particular condition (and these conditions are
hard-coded). Each rule is of the form condition -response, i.e. if the
condition holds , then perform response.

The second component is a blackboard, from which messages can be read
and responses posted. Usually messages are eith inputs from the
outside world (e.g. what cards have been played in this round), or
responses from rules. Generally, rules check the conditions on the
blackboard, and then post a response, although you may wish to have
rules which read the blackboard and cause actions (e.g. play K
spades).

The third component is a credit assignment algorithm (usually known as
the bucket brigade algorithm). Usually this is the hardest bit to get
to work. The idea here is to try and rank the ruleset in order of
their utility. This is usually done as follows:

Each rule has a strength (initially assigned). Each cycle (after any
card is played?) each rule is checked against the blackboard. Any rule
whose conditions are met is offered a chance to operate. Each of these
rules bids a portion of its utility for the right to operate. One or
more rules are then chosen to operate using a roulette wheel algorithm
based on their bid utility (so the rule “Elvis is alive and playing in
the Ring’O Bells tonight” is occasionally taken out for a test drive!

  • and we end up in the pub!). These rules then pay their bid to
    operate (their responses are activated). Note that generally, we apply
    the tax in a progressive manner, so we tax rules based on their
    specificity - the more specific they are (fewer wildcards), the harder
    they are taxed, so we both prefer specific rules to general ones, and
    we make the cost of them getting things wrong higher.

What do we do with the tax. Well, it gets split between the rules that
were applied in the previous cycle (possibly minus a decay constant).
But we also introduce extra credit into the system every time a
positive result happens (e.g. we win a trick). This means that the
rules that lead to good results tend to increase in utility over time,
and the the bad ones decrease in utility; with the result that a
natural hierarchy or rules develops from any given initial ruleset
(e.g. randomly chosen). Over time, this algorithm will reward the
rules which are intermediate steps to good outcomes, because the
rewards get passed back up the “bucket brigade” by the taxation
system. However, this algorithm can have problems in that it can be
difficult to reward the earlier events inlong chains of events.

The fourth and final component is the genetic algorithm. Every once in
a while (at a slower rate than the bucket brigade algorithm runs at),
we use a GA to update the ruleset, replacing a couple of the weakest
rules using a GA (maybe 5% of the ruleset each GA epoch).

Hope this helps (I have to go and catch a train home now!)

Steve

OK, I have more useful things to be doing.

But sometimes I like to code just for fun.
I can admit that here, right? :slight_smile:

What I had in mind for the Spades project
was originally a genetic algorithm type
of thing.

That is, you define a genome based on some
options for bidding and playing, and you
play the members of the population against
each other as they evolve.

I think it’s interesting from a GA standpoint
(disclaimer: I’ve never written a GA program).
For one thing, it’s not clear that a simple
tournament would suffice to find optimal
strategies. You might just find a “local
maximum” where the surviving members can beat
the previous generation, but couldn’t
necessarily have beaten some earlier
generation.

···

If someone wants to work on it, let me know (the
GA or the cardbots). Otherwise, I’ll probably just
implement 40% of each one and quit. :slight_smile:

Cheers,
Hal

Sorry, I’ve always been unclear on the difference.
As I said, I’ve done some reading but not much else.

I thought GA was data-centric and GP was code-centric?
Of course, in Ruby, either might be as easy.

Hal

···

----- Original Message -----
From: " JamesBritt" james@jamesbritt.com
To: “ruby-talk ML” ruby-talk@ruby-lang.org
Sent: Thursday, August 08, 2002 10:17 AM
Subject: RE: The whole ‘Spades’ thing: GA and cardbots

-----Original Message-----
From: Hal E. Fulton [mailto:hal9000@hypermetrics.com]
Sent: Thursday, August 08, 2002 12:52 AM
To: ruby-talk ML
Subject: The whole ‘Spades’ thing: GA and cardbots

OK, I have more useful things to be doing.

But sometimes I like to code just for fun.
I can admit that here, right? :slight_smile:

What I had in mind for the Spades project
was originally a genetic algorithm type
of thing.

That is, you define a genome based on some
options for bidding and playing, and you
play the members of the population against
each other as they evolve.

Is there a reason to use GA instead of genetic programming (GP)?
This sounds like the sort of problem GP is specifically good for.

Is there a reason to use GA instead of genetic programming (GP)?
This sounds like the sort of problem GP is specifically good for.

Sorry, I’ve always been unclear on the difference.
As I said, I’ve done some reading but not much else.

I thought GA was data-centric and GP was code-centric?
Of course, in Ruby, either might be as easy.

GP operates at the function/terminal level of a functional programming
language, evolving code through breeding and mutation of actual programs.
The particular set of functions and terminals are used as raw materials for
generating programs to solve a given problem.

My understanding of GA is that it operates at a lower level, defining
genotypes that are passed to some other program to be expressed.

There’s a Ruby GP implementation at
http://akimichi.homeunix.net/~emile/aki/program/gp/

though the site appears to be down right now. I did play with it some
months ago, and it’s quite nice.

James

···

Hal