[SUMMARY] Lost Cities (#51)

James_Edward_Gray_II · 20 October 2005 13:01

There's nothing too tough in this quiz, but it turned out to be pretty time
consuming for me. Not because it required a ton of code for a solution, but
because I kept playing against my solution and tweaking its behavior. Of
course, as Daniel Sheppard pointed out, I should have tried him against
DumbPlayer a little more:

  $ ruby lost_cities.rb localhost 61676 risk_player.rb
  Final Score: -21 (You) vs. -60 (Opponent).
  Congratulations, you win.
  $ ruby lost_cities.rb localhost 61676 risk_player.rb
  Final Score: -32 (You) vs. -1 (Opponent).
  I'm sorry, you lose.
  $ ruby lost_cities.rb localhost 61676 risk_player.rb
  Final Score: -43 (You) vs. -43 (Opponent).
  The game is a draw.
  $ ruby lost_cities.rb localhost 61676 risk_player.rb
  Final Score: -51 (You) vs. 5 (Opponent).
  I'm sorry, you lose.

"You" above would be RiskPlayer and "Opponent" is DumbPlayer. I guess
DumbPlayer isn't so dumb and RiskPlayer doesn't take enough risks.

There were also some issues with the server and DumbPlayer that may have made it
harder for people to mess around with this problem. I apologize for that.

Daniel's own solution is very interesting, but quite a bit of code to show here.
Let me see if I can hit a highlight or two. First, I'll let Daniel explain how
the code was built:

  Designing an AI for a game which you've never played is a pretty
  daunting task... So I let the computer do the work with a little bit
  of genetics.

  The rule_player.rb file contains the basic framework for parsing the
  input and storing the knowledge and also the rules, as well as an
  extra player named "MultiplierPlayer" which allows the rules to be
  given different weightings

  The breeder.rb file contains the breeding system. It creates a bunch
  of MultiplierPlayers with different weightings and plays them against
  each other round-robin style. The 2 players with the most wins under
  their belts get bred to form extra players and the worst players get
  dropped. The players are then saved off in a yaml file. Being yaml,
  it's easy to edit, so if you think you know better than the breeder,
  it's easy to throw your figures into the race.

  ...

That YAML file is really neat. Here's a peek at a small slice of it:

···

---
  -
    -
      - 1
      - 1
      - 0
      - 0
      - 0
      - 0
      - 0
  # ...
  -
    - Rules::PlayLowestFirst
    - Rules::MaximumScoreEndGame
    - Rules::IgnoreUnusable
    - Rules::DiscardUnusable
    - Rules::DepriveOpponent
    - Rules::AvoidLateInvestment
    - Rules::ExpectedInvestmentValue

The bottom section has the rules that the genetics system is considering. The
top Array lists the weights the winner settled on, favoring the first two rules
clearly.

The player that puts the data to use is trivial:

  require 'rule_player'
  require 'yaml'

  class BredPlayer < MultiplierPlayer
      def initialize
          raise "No Data File" unless File.exist?('data2.yaml')
          multipliers, rule_names = File.open('data2.yaml','r') do |f|
              YAML::load(f)
          end
          super(rule_names, multipliers[0])
      end
  end

This is all designed to work with a testing framework Daniel provided, of
course. The really interesting part of Daniel's code, to me, was breeder.rb.
If you're at all interested in genetic programming, do look around in there. It
even has comments to drive you through the evolution process. Neat stuff.

Though my solution turns out to be a pretty bad player, it does have the
elements any smarter solution would need. Its flaw is in the logic, which is
just me being a bad teacher for computer strategy. Let's look past that and
examine the code anyway:

  #!/usr/local/bin/ruby -w

  class RiskPlayer < Player
    def self.card_from_string( card )
      value, land = card[0..-2], card[-1, 1].downcase
      Game::Card.new( value[0] == ?I ? value : value.to_i,
                      Game::LANDS.find { |l| l[0, 1] == land } )
    end

    def initialize
      @piles = Hash.new do |piles, player|
        piles[player] = Hash.new { |pile, land| pile[land] = Array.new }
      end

      @deck_size = 60
      @hand = nil

      @last_dicard = nil

      @action = nil

      @done = false
    end

    # ...

There's nothing tricky in there. The class method is a helper for converting a
String like "InvV" into and actual card object.

Then initialize() just prepares the instance data this class needs to track.
The @piles variable looks a little ugly because I wanted it to invent the data
structure as needed. It's just a Hash containing three keys: :me, :them, and
:discards. Each of those is a Hash that contains an Array pile for each land
type.

Now the quiz requires a parser for the protocol. Here's the easiest approach I
could come up with:

    def show( game_data )
      if @done
        puts game_data
      else
        if game_data =~ /^(Your?)(?: opponent)? (play|discard)s? the (\w+)/
          card = self.class.card_from_string($3)
          if $2 == "play"
            if $1 == "You"
              @piles[:me][card.land] << card
            else
              @piles[:them][card.land] << card
            end
          else
            @piles[:discards][card.land] << card
          end

          @last_discard = nil if $1 == "Your"
        end
        if game_data =~ /^You(?:r opponent)? picks? up the (\w+)/
          @piles[:discards][self.class.card_from_string($1).land].pop
        end

        if game_data =~ /^\s*Deck:\s+#+\s+$(\d+)$/
          @deck_size = $1.to_i
        end
        if game_data =~ /^\s*Hand:((?:\s+\w+)+)/
          @hand = $1.strip.split.map { |c| self.class.card_from_string(c) }
        end

        if game_data.include?("Your play?")
          @action = :play_card
        elsif game_data.include?("Draw from?")
          @action = :draw_card
        end

        @done = true if game_data.include?("Game over.")
      end
    end

The server notifies you when everything happens and I think it's easier to just
track those notifications than it is to track your own moves and/or parse the
game board to figure out where everything is. The messages are easy to break
down with light Regexp usage.

The else branch of that top level if statement handles the parsing. First it
breaks down play/discard messages and places the card on the indicated pile. It
pops cards off the discard piles too, as needed. The next bit reads the deck
size and your hand from the game board. The third section tracks whether you
were asked to play or draw and the final line watches for a "Game over." which
causes the code to print the final score (if branch at the top of the method).

With that in place, we can move our focus to responding to play requests:

    # ...

    def move
      send(@action)
    end

    private

    def play_card
      plays, discards = @hand.partition { |card| playable? card }

      if plays.empty?
        discard_card(discards)
      else
        risks = analyze_risks(plays)
        risk = risks.max { |a, b| a.last <=> b.last }

        return discard_card(@hand) if risk.last < 0

        land = risks.max { |a, b| a.last <=> b.last }.first.land
        play = plays.select { |card| card.land == land }.
                  sort_by { |c| c.value.is_a?(String) ? 0 : c.value }.first
        "#{play.value}#{play.land[0, 1]}".sub("nv", "")
      end
    end

    def discard_card( choices )
      discard = choices.sort_by do |card|
        [ playable?(card) ? 1 : 0, playable?(card, :them) ? 1 : 0,
          card.value.is_a?(String) ? 0 : card.value ]
      end.first

      @last_discard = discard
      "d#{discard.value}#{discard.land[0, 1]}".sub("nv", "")
    end

    def draw_card
      want = @piles[:discards].find do |land, cards|
        not @piles[:me][land].empty? and
        cards.last != @last_discard and cards.any? { |card| playable?(card) }
      end
      if want
        want.first[0, 1]
      else
        "n"
      end
    end

    def playable?( card, who = :me )
      @piles[who][card.land].empty? or
      @piles[who][card.land].last.value.is_a?(String) or
      ( not card.value.is_a?(String) and
        @piles[who][card.land].last.value < card.value )
    end

    # ...

First, move() calls the correct action method based on what the server last
requested (:play_card or :draw_card).

The main action method, play_card(), splits hands into playable cards and
discards. It analyzes the risks of each play and tries to find something fairly
safe. If it has no plays or doesn't like its choices, a handoff is made to
discard_card().

The discard method just sorts available discards by some general criteria: Can
I play this? Can my opponent? How much is it worth? It tries to find
something it can't play, the opponent can't play, and that is low in value. It
tosses that.

The other action method, draw_card(), just scans the discard piles for cards it
wants. If it sees a goody, it will pull from that pile. Otherwise it defaults
to a draw from the deck.

Finally, playable?() is just a tool that will tell you if a card can be played
currently, by the indicated player.

The missing piece is the risk analysis method:

    # ...

    def analyze_risks( plays )
      plays.inject(Hash.new) do |risks, card|
        risks[card] = 0

        me_total = ( @piles[:me][card.land] +
                     plays.select { |c| c.land == card.land }
                     ).inject(0) do |total, c|
          if c.value.is_a? String
            total
          else
            total + c.value
          end
        end
        risks[card] += 20 - me_total

        them_total = @piles[:them][card.land].inject(0) do |total, c|
          if c.value.is_a? String
            total
          else
            total + c.value
          end
        end
        high = card.value.is_a?(String) ? 2 : card.value
        risks[card] += ( (high..10).inject { |sum, n| sum + n }
                         - (me_total + them_total) ) / 2

        if @piles[:me][card.land].empty?
          lands_played = @piles[:me].inject(0) do |count, (land, cards)|
            if cards.empty?
              count
            else
              count + 1
            end
          end

          risks[card] -= (lands_played + 1) * 5
        end

        risks
      end
    end
  end

WARNING: The above code is what needs tweaking to make the AI player smarter.

This method could do whatever thinking is required. It's just expected to
return a Hash of playable cards (keys) and their ratings (values). The
play_card() method will play the highest rated card, as long as it isn't
negative.

I tried to distill a little of how I play down into computer terms here. The
method takes into account how many points it has in a given pile and how many
more it could play from its hand. It also adds up the total of the cards still
at large (above the highest play it can make), and assumes it could luck into
half of those. Finally, it adds a penalty to the rating for each new pile
started. It's usually a mistake to play too many piles in Lost Cities, because
you don't have time to finish them all off. Again, this logic needs further
refinement.

For yet another spin on rules, Adam Shelly sent in a solution yesterday that has
a whole bunch of criteria it bases decisions on. Check out this list:

  # ...

  #rules affecting play/hold decision :
  #positive values mean play, negative mean hold
  @prules = {:rule_inSequence=>[0.6,0.8],
             :rule_lowCard=>[0.1,0.0],
             :rule_lowCards=>[0.2,0.0],
             :rule_highCard=>[-0.3,0.1],
             :rule_highCards=>[-0.2,0.2],
             :rule_investments=>[0.1,-0.2],
             :rule_onInvestments=>[0.5,0.7],
             :rule_holdingInvestments=>[-0.2,0.0],
             :rule_investmentWithHope=>[0.5,0.3],
             :rule_investmentWithoutHope=>[-0.6,-1.0],
             :rule_group10=>[0.5,-0.4],
             :rule_group15=>[0.6,-0.3],
             :rule_group20=>[0.7,-0.2],
             :rule_group25=>[0.9,-0.1],
             :rule_total20 =>[0.35,1.0],
             :rule_total25 =>[0.6,1.0],
             :rule_suitStarted=>[0.7,0.9],
             :rule_closeToPrevious=>[0.4,0.5],
             :rule_multiplier2=>[0.4,0.8],
             :rule_multiplier3=>[0.5,0.9],
             :rule_onUnplayed=>[-0.5,-1.0],
             :rule_heHasPlayed=>[-0.1,0.0],
             :rule_heHasPlayed10=>[-0.2,0.0],
             :rule_heHasPlayed20=>[-0.3,0.0],
             :rule_handNegative=>[0.5,0.9],
             :rule_mustPlays=>[-0.3,1.0],
             :rule_lowerInHand=>[-0.5,-0.4],
             :rule_highestInHand=>[-0.1,-0.01],
             :rule_2followsInvest=>[0.3,0.5],
             :rule_finishGame=>[0.0,2.0],
             :rule_possibleBelow=>[-0.2,-0.05],
             :rule_possibleManyBelow=>[-0.4,-0.1]}
  #rules affecting keep/discard decision :
  #positive values mean keep, negative mean discard
  @drules = {:rule_useless2me=>[-0.5, 0.1],
             :rule_useless2him=>[-0.2,0.1],
             :rule_useful2him=>[0.4,0.5],
             :rule_useful2me=>[0.3,0.3],
             :rule_heHasPlayed=>[0.1,0.3],
             :rule_singleton=>[-0.2,-0.1],
             :rule_noPartners=>[-0.3,-0.3],
             :rule_wantFromDiscard=>[0.3,0.5],
             :rule_belowLowestPlayable=>[-0.2,0.0],
             :rule_dontDiscardForever=>[0.5,1]}

  # ...

Those numbers are weights, one set for early in the game and another for late.
Cards are ranked by these criteria which allows plays/discards to be found.
These weights are hand tuned, from Adam's experience.

Many thanks to all who fiddled with the game and especially to those who even
tried to build a player.

Tomorrow we have the very core of what makes programmers into programmers...
Talking barnyard animals, of course.

Topic		Replies	Views
[QUIZ] Lost Cities (#51) ruby-talk	5	94	17 October 2005
[QUIZ] Lost Cities (#51) ruby-talk	0	64	17 October 2005
[SOLUTION] Lost Cities (#51) ruby-talk	1	64	19 October 2005
[QUIZ] Lost Cities (#51) ruby-talk	0	73	17 October 2005
[SOLUTION] Lost Cities (#51) ruby-talk	3	84	19 October 2005

[SUMMARY] Lost Cities (#51)

Related topics