Chain, Chain, Chain (was Re: zed shaw zed shaw zed shaw)

Okay, I've been playing with doing this programmatically. I'm not yet
convinced that it's bug free, but I've found some rather long movie
title chains using the non-imdb list posted earlier:

10 ITEMS OR<LESS> THAN<ZERO><DAY> FOR<NIGHT> AND<DAY> OF
THE<DEAD><BANG> BANG YOURE<DEAD><END> OF<DAYS> OF<HEAVEN> CAN<WAIT>
UNTIL<DARK><BLUE><CAR> 54 WHERE ARE<YOU> CAN COUNT ON<ME> MYSELF<I> AM
TRYING TO BREAK YOUR<HEART><CONDITION><RED><DAWN> OF THE<DEAD><HEAT>
AND<DUST> TO<GLORY><ROAD><HOUSE> OF<DRACULA> DEAD AND LOVING<IT> CAME
FROM BENEATH THE<SEA> OF<LOVE> AND<DEATH> BECOMES<HER> MAJESTY
MRS<BROWN><SUGAR> AND<SPICE><WORLD>
TRADE<CENTER><STAGE><FRIGHT><NIGHT> AND THE<CITY> OF<ANGELS> WITH
DIRTY<FACES> OF<DEATH><SHIP> OF<FOOLS> RUSH<IN>
COLD<BLOOD><BEACH><PARTY><GIRL> IN THE<CADILLAC><MAN> OF THE<HOUSE>
OF<FRANKENSTEIN> AND THE MONSTER FROM<HELL><NIGHT> FALLS ON<MANHATTAN>
MURDER<MYSTERY><ALASKA> SPIRIT OF THE<WILD><BILL> AND TEDS
BOGUS<JOURNEY> TO THE CENTER OF THE<EARTH> GIRLS ARE<EASY> COME
EASY<GO><NOW> YOU SEE HIM NOW YOU<DONT> BOTHER TO<KNOCK><OFF>
THE<BLACK> AND<WHITE> WATER<SUMMER><LOVERS> AND OTHER<STRANGERS> WHEN
WE<MEET> JOE<BLACK> HAWK<DOWN> TO<YOU> CANT TAKE IT WITH<YOU> LIGHT UP
MY<LIFE> AS A<HOUSE><PARTY><MONSTER><HOUSE> PARTY<3> NINJAS KICK<BACK>
TO<SCHOOL> OF<ROCK><STAR> TREK IV THE VOYAGE<HOME><ALONE> IN
THE<DARK><CITY> OF<JOY><RIDE> THE HIGH<COUNTRY><LIFE>
IS<BEAUTIFUL><GIRLS> GIRLS<GIRLS> WILL BE<GIRLS> JUST WANT TO
HAVE<FUN> AND FANCY<FREE> WILLY 2 THE ADVENTURE<HOME> ALONE<3> NINJAS
KNUCKLE<UP> CLOSE AND<PERSONAL><BEST> OF THE<BEST><MEN> CRY<BULLETS>
OVER<BROADWAY> DANNY<ROSE><RED><EYE> FOR AN<EYE> OF<GOD> TOLD ME<TO>
DIE<FOR> YOUR EYES<ONLY> THE STRONG SURVIVE A CELEBRATION
OF<SOUL><FOOD> OF<LOVE> WALKED<IN> AND<OUT><COLD><FEVER><PITCH><BLACK>
LIKE<ME> WITHOUT<YOU> ONLY LIVE<ONCE> IN THE<LIFE> OR SOMETHING
LIKE<IT> HAPPENED AT THE WORLDS<FAIR><GAME> OF<DEATH> WISH V THE FACE
OF<DEATH><WISH> UPON A<STAR> TREK THE MOTION<PICTURE><BRIDE>
OF<FRANKENSTEIN> MEETS THE WOLF<MAN> ON<FIRE> IN THE<SKY><HIGH>
SCHOOL<HIGH><SPIRITS> OF THE<DEAD> OF<NIGHT><MOTHER><NIGHT> OF THE
LIVING<DEAD> MAN<WALKING> AND<TALKING> ABOUT<SEX> AND THE
OTHER<MAN><TROUBLE> EVERY<DAY> OF THE<WOMAN> ON<TOP><GUN><CRAZY>
AS<HELL> UP IN<HARLEM> RIVER<DRIVE> ME<CRAZY><PEOPLE>
WILL<TALK><RADIO><DAYS> OF THUNDER

That's 175 titles strung together. The <> bracketed words are the
ones which are the last of one title and the first of another. Lots
of two-word titles show up in this chain.
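
For readers who want to reproduce the notation, here is a minimal sketch of how a list of already-linked titles could be rendered with the <> brackets. The method name bracket_chain is made up for illustration and is not part of the program posted later in the thread; it assumes each title's first word equals the previous title's last word.

```ruby
# Render an ordered list of linked titles in the <> notation used above.
# bracket_chain is a hypothetical helper, not taken from the thread's program.
def bracket_chain(titles)
  # Check that every adjacent pair really shares a word.
  titles.each_cons(2) do |a, b|
    unless a.split.last == b.split.first
      raise ArgumentError, "#{a.inspect} does not link to #{b.inspect}"
    end
  end

  titles.drop(1).inject(titles.first.to_s) do |text, title|
    rest = title.split.drop(1).join(' ')
    # Wrap the last word so far in <> and glue it to the word before it.
    linked = text.sub(/\s(\S+)\z/, '<\1>')
    rest.empty? ? linked : "#{linked} #{rest}"
  end
end

puts bracket_chain(["10 ITEMS OR LESS", "LESS THAN ZERO", "ZERO DAY", "DAY FOR NIGHT"])
# prints: 10 ITEMS OR<LESS> THAN<ZERO><DAY> FOR NIGHT
```

A two-word title such as ZERO DAY contributes nothing but shared words, which is why runs like <ZERO><DAY> appear back to back.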

By the way, a word about IMDB and data scraping. Quite a few years
ago, as an exercise in learning Java, I decided to write a program
which would look for "six degrees of Kevin Bacon" links between actors
in the IMDB. This was a spare moment project at work. After a few
days, I got an e-mail from IMDB saying that they had detected my
'robot' and had disabled access to IMDB from my ip address, which to
them was the proxy server for the company (Object Technology
International). So OTI didn't have access to IMDB for a few days
while they considered my contrite reply and promise to cease and
desist.

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

Saw that one in the theaters. Save your money and wait for the torrent.

TwP

On Jan 5, 2008 2:33 PM, Rick DeNatale <rick.denatale@gmail.com> wrote:

10 ITEMS OR<LESS> THAN<ZERO><DAY> FOR<NIGHT> AND<DAY> OF

<snip>

WILL<TALK><RADIO><DAYS> OF THUNDER

Rick DeNatale wrote:

Okay, I've been playing with doing this programmatically. I'm not yet
convinced that it's bug free, but I've found some rather long movie
title chains using the non-imdb list posted earlier:

10 ITEMS OR<LESS> THAN<ZERO><DAY> FOR<NIGHT> AND<DAY> OF

<snip>

WILL<TALK><RADIO><DAYS> OF THUNDER

Why didn't it find Thunder Road[1]? (And then maybe Road Trip to Bountiful...)

[1] http://us.imdb.com/title/tt0052293/

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

> Okay, I've been playing with doing this programmatically. I'm not yet
> convinced that it's bug free, but I've found some rather long movie
> title chains using the non-imdb list posted earlier:

code!

Why didn't it find Thunder Road[1]? (And then maybe Road Trip to
Bountiful...)

[1] http://us.imdb.com/title/tt0052293/

This is actually a very interesting question. I think the answer is
probably "Travelling Salesman Problem."

--
Giles Bowkett

Podcast: http://hollywoodgrit.blogspot.com
Blog: http://gilesbowkett.blogspot.com
Portfolio: http://www.gilesgoatboy.org
Tumblelog: http://giles.tumblr.com

Rick DeNatale wrote:
> Okay, I've been playing with doing this programmatically. I'm not yet
> convinced that it's bug free, but I've found some rather long movie
> title chains using the non-imdb list posted earlier:
>
> 10 ITEMS OR<LESS> THAN<ZERO><DAY> FOR<NIGHT> AND<DAY> OF
<snip>
> WILL<TALK><RADIO><DAYS> OF THUNDER

Why didn't it find Thunder Road[1]? (And then maybe Road Trip to
Bountiful...)

Well, first of all, because Thunder Road isn't on that list.

Second, because that's the longest chain the program found in the time
I gave it to run. As I said, the program isn't of the quality that I'd
prefer to share, but what the heck. It's a pretty dumb search, and I
have it print a chain only the first time it finds one longer (in
terms of the number of movies) than any it had found before. It's
full of "the simplest thing that could possibly work" decisions, with
the idea of making it work before trying to make it fast.

require 'net/http'
class Title

  @first_words = Hash.new {|h,k| h[k] = []}

  def self.register(title)
    @first_words[title.first_word] << title if title.first_word
  end

  def self.process_lines(string)
    string.each_line do |line|
      self.new(line)
    end
  end

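  def self.generate_titles
    result = []
    @first_words.keys.sort.each do |fw|
      all_starting_with(fw).each do |title|
        # chains returns an array and never yields, so iterate over it
        title.chains.each do |chain|
          result << chain
        end
      end
    end
    # return the collected chains rather than the sorted key list
    result
  end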

  def self.all_starting_with(word)
    @first_words[word]
  end

  def print_chain(chain)
    # Fold the titles into one string, bracketing each shared word
    text = chain.inject(chain.first.first_word) { |chained_title, title| title.chain_to(chained_title) }
    puts "#{chain.length}: #{text}"
  end

  def chain_to(string)
    "#{string.sub(/\s(\S+)$/,'<\1>')} #{@words[1...@words.length].join(' ')}"
  end

  def self.max_chain?(chain)
    @max ||= 0
    if chain.length > @max
      @max = chain.length
      true
    else
      false
    end
  end

  def chains(chain = [])
    root_chain = chain << self
    raise "Duplicate title #{print_chain(root_chain)}" unless root_chain.length == root_chain.uniq.length
    print_chain(root_chain) if self.class.max_chain?(root_chain)
    result = []
    self.class.all_starting_with(self.last_word).each do |title|
      unless root_chain.include?(title)
        title.chains(root_chain.dup).each do |chain|
          result << chain
        end
      end
    end
    result
  end

  def initialize(title_string)
    @words = title_string.strip.split
    self.class.register(self)
  end

  def first_word
    @words.first
  end

  def last_word
    @words.last
  end

  def to_s
    @words.join(' ')
  end

  def inspect
    to_s
  end
end
# Net::HTTP.get(host, path) returns the response body as a String
x = Net::HTTP.get("itafullsite.dev.neptuneweb.com", "/careers/puzzles/MOVIES.LST")
Title.process_lines(x)
p Title.generate_titles
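
To sanity-check the idea without downloading the list, the same kind of dumb depth-first search can be sketched on a handful of titles. This is a simplified stand-in and not Rick's program: longest_chain is a made-up name, it returns the best chain instead of printing each new record, and the sample titles are just ones that appear in the chain at the top of the thread. Being exhaustive, it is only practical for small lists.

```ruby
# Exhaustive depth-first search for the longest title chain in a small list.
# longest_chain is a hypothetical helper, not part of the Title class above.
def longest_chain(titles)
  words = titles.map(&:split).reject(&:empty?)

  # Index title positions by first word so successors are a hash lookup.
  by_first = Hash.new { |h, k| h[k] = [] }
  words.each_with_index { |w, i| by_first[w.first] << i }

  best = []
  extend_chain = lambda do |chain|
    # Remember a chain only when it beats the longest seen so far.
    best = chain.dup if chain.length > best.length
    by_first[words[chain.last].last].each do |i|
      next if chain.include?(i) # each title may be used only once
      chain.push(i)
      extend_chain.call(chain)
      chain.pop
    end
  end

  words.each_index { |i| extend_chain.call([i]) }
  best.map { |i| words[i].join(' ') }
end

sample = ["DAY FOR NIGHT", "LESS THAN ZERO", "NIGHT AND DAY", "ZERO DAY", "10 ITEMS OR LESS"]
p longest_chain(sample)
# prints: ["10 ITEMS OR LESS", "LESS THAN ZERO", "ZERO DAY", "DAY FOR NIGHT", "NIGHT AND DAY"]
```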

On Jan 5, 2008 5:05 PM, Joel VanderWerf <vjoel@path.berkeley.edu> wrote:

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

Giles Bowkett wrote:

Why didn't it find Thunder Road[1]? (And then maybe Road Trip to
Bountiful...)

[1] http://us.imdb.com/title/tt0052293/

This is actually a very interesting question. I think the answer is
probably "Travelling Salesman Problem."

This is not an optimization problem (not as stated so far, anyway).

There must be a bug somewhere--in IMDB, or in the script.

--
        vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

[some sadly uncited person wrote]

> Okay, I've been playing with doing this programmatically. I'm not
> yet convinced that it's bug free, but I've found some rather long
> movie title chains using the non-imdb list posted earlier:

I'm surprised no-one's found a loop yet. Is there one? What's the
shortest?

Regards,

Jeremy Henty

>> Why didn't it find Thunder Road[1]? (And then maybe Road Trip to
>> Bountiful...)
>>
>> [1] http://us.imdb.com/title/tt0052293/
>
> This is actually a very interesting question. I think the answer is
> probably "Travelling Salesman Problem."

This is not an optimization problem (not as stated so far, anyway).

There must be a bug somewhere--in IMDB, or in the script.

OK. But why it didn't find X combination when it did find Y
combination, surely that's a complex question to answer.

--
Giles Bowkett

Podcast: http://hollywoodgrit.blogspot.com
Blog: http://gilesbowkett.blogspot.com
Portfolio: http://www.gilesgoatboy.org
Tumblelog: http://giles.tumblr.com

The shortest is easy:

File.readlines('MOVIES.LST').find_all {|line| line.split.first ==
line.split.last}
["\n", "AREMEMBER\n", "AUTHOR AUTHOR\n", "BEST OF THE BEST\n",
"BREAKER BREAKER\n", "BUDDY BUDDY\n", "CHUD II BUD THE CHUD\n",
"CORRINA CORRINA\n", "DEATH WISH V THE FACE OF DEATH\n", "DIE MOMMIE
DIE\n", "DIE MONSTER DIE\n", "DREAM A LITTLE DREAM\n", "EAST IS
EAST\n", "EYE FOR AN EYE\n", "GIRLS GIRLS GIRLS\n", "GIRLS WILL BE
GIRLS\n", "HIGH SCHOOL HIGH\n", "JULIA AND JULIA\n", "JUNGLE 2
JUNGLE\n", "KRAMER VS KRAMER\n", "LADYBIRD LADYBIRD\n", "LIAR LIAR\n",
"MELINDA AND MELINDA\n", "MOMENT BY MOMENT\n", "MURDER AND MURDER\n",
"NIAGARA NIAGARA\n", "NIGHTBREED \n", "SCREAM BLACULA SCREAM\n",
"SISTER MY SISTER\n", "SUNDAY BLOODY SUNDAY\n", "THEGUEST\n", "THEY
SHOOT HORSES DONT THEY\n", "TIME AFTER TIME\n", "TORA TORA TORA\n",
"WRONGSTAND\n", "YI YI\n", "YOU CANT TAKE IT WITH YOU\n"]

On Jan 5, 2008 6:14 PM, Jeremy Henty <onepoint@starurchin.org> wrote:

[some sadly uncited person wrote]

>> > Okay, I've been playing with doing this programmatically. I'm not
>> > yet convinced that it's bug free, but I've found some rather long
>> > movie title chains using the non-imdb list posted earlier:

I'm surprised no-one's found a loop yet. Is there one? What's the
shortest?

Giles Bowkett wrote:

Why didn't it find Thunder Road[1]? (And then maybe Road Trip to
Bountiful...)

[1] http://us.imdb.com/title/tt0052293/

This is actually a very interesting question. I think the answer is
probably "Travelling Salesman Problem."

This is not an optimization problem (not as stated so far, anyway).

There must be a bug somewhere--in IMDB, or in the script.

OK. But why it didn't find X combination when it did find Y
combination, surely that's a complex question to answer.

Actually, what I overlooked is that Rick's original post said he used a "non-imdb list", and I found "Thunder Road" on imdb. Probably his list just doesn't include this one.
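
A quick way to confirm whether a given title is on the list is a sketch like the following. The helper name on_list? is made up, and it assumes the list stores titles in upper case with one title per line, as the output elsewhere in this thread suggests; it does not strip punctuation.

```ruby
# Check whether a title appears in the list, ignoring case and extra spaces.
# on_list? is a hypothetical helper; the upper-case format is an assumption.
def on_list?(lines, title)
  wanted = title.upcase.split.join(' ')
  lines.any? { |line| line.split.join(' ') == wanted }
end

# With a local copy of the list this would be called as:
#   on_list?(File.readlines('MOVIES.LST'), "Thunder Road")
p on_list?(["TOP GUN\n", "DAYS OF THUNDER\n"], "Thunder Road")
```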

--
        vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

D'oh! Neat! Any twofers? (Nitpick: the code should eliminate single
word titles.)

Jeremy Henty
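
Both points could be handled along these lines. This is only a sketch: self_loops and two_title_loops are made-up names, and the ALPHA/BETA titles in the sample are placeholders rather than entries from MOVIES.LST. Single-word titles and blank lines are dropped in both methods.

```ruby
# Titles that loop back on themselves, skipping blank lines and
# single-word titles (which match trivially).
def self_loops(lines)
  lines.map(&:split)
       .select { |w| w.length > 1 && w.first == w.last }
       .map { |w| w.join(' ') }
end

# "Twofers": pairs of distinct titles where each one's last word is the
# other's first word.
def two_title_loops(lines)
  titles = lines.map(&:split).select { |w| w.length > 1 }.uniq
  by_first = titles.group_by(&:first)
  loops = []
  titles.each do |a|
    by_first.fetch(a.last, []).each do |b|
      # (a <=> b) < 0 reports each unordered pair once and skips a == b
      loops << [a.join(' '), b.join(' ')] if b.last == a.first && (a <=> b) < 0
    end
  end
  loops
end

sample = ["\n", "AREMEMBER\n", "LIAR LIAR\n", "HOUSE PARTY\n",
          "PARTY MONSTER\n", "ALPHA BETA\n", "BETA ALPHA\n"]
p self_loops(sample)       # prints: ["LIAR LIAR"]
p two_title_loops(sample)  # prints: [["ALPHA BETA", "BETA ALPHA"]]
```

Note that two titles which are each self-loops on the same word (for example GIRLS GIRLS GIRLS and GIRLS WILL BE GIRLS) would also be reported as a twofer by this sketch.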

On 2008-01-06, Chris Shea <chris@ruby.tie-rack.org> wrote:

On Jan 5, 2008 6:14 PM, Jeremy Henty <onepoint@starurchin.org> wrote:

I'm surprised no-one's found a loop yet. Is there one? What's the
shortest?

The shortest is easy:

File.readlines('MOVIES.LST').find_all {|line| line.split.first ==
line.split.last}
<snip>

>> There must be a bug somewhere--in IMDB, or in the script.
>
> OK. But why it didn't find X combination when it did find Y
> combination, surely that's a complex question to answer.

Actually, what I overlooked is that Rick's original post said he used a
"non-imdb list", and I found "Thunder Road" on imdb. Probably his list
just doesn't include this one.

Ah, OK. I was wondering. Yeah, Rick was using (I think) the official
list from ITA. I posted it the other day.

--
Giles Bowkett

Podcast: http://hollywoodgrit.blogspot.com
Blog: http://gilesbowkett.blogspot.com
Portfolio: http://www.gilesgoatboy.org
Tumblelog: http://giles.tumblr.com