[QUIZ] Longest Repeated Substring (#153)

> [Note: parts of this message were removed to make it a legal post.]
>
> Hi Alex,
>
> Have the same problem with you :slight_smile:
>
> I think we should understand the 'adjacent' as 'not overlapped adjacent'
> here. Each time we take two adjacent suffix, we check whether they
> overlapped first - if so, make s2 point to next suffix, until we found a
not
> overlapped one.

Hm... but every pair of suffixes _are_ overlapped just 'cause all of
them last to the end of the text, so we cannot 'check first'.
Anyway...

Yes u'r right ... I should have described more clear: take two adjacent,
check whether the "head" of them overlapped:

overlap?(s1[0, current_longest_substr.length], s2[0,
current_longest_substr.length])

Thanks
Jan

···

On Jan 23, 2008 7:39 PM, Alex Shulgin <alex.shulgin@gmail.com> wrote:

On Jan 22, 10:51 am, Jan <jan.h....@gmail.com> wrote:

> For aaaa: on the step of comparing aa and aaa, we found they overlapped,
so
> we keep s1 = aa, but make s2 = aaaa, the one next to aaa.

This might be an appropriate hack--I'll try it when I have a
minute. :slight_smile:

--
Cheers,
Alex

--
Fly on wheels.

http://www.fordham.edu/halsall/ancient/homer-illiad.txt

a @ http://codeforpeople.com/

···

On Jan 18, 2008, at 4:06 PM, James Gray wrote:

On Jan 18, 2008, at 3:42 PM, Radosław Bułat wrote:

My question is (I'm not familiar with RubyQuiz too much :)): episode
focus on algorithm (speed) or source code (readable)?

Hopefully both. :slight_smile:

How long input string I could expect?

Anybody found a good long text to work with yet? The text of the U.S. Constitution perhaps? Tristram Shandy?

I would say that you should handle the biggest text you possibly can. :slight_smile:

James Edward Gray II

--
we can deny everything, except that we have the possibility of being better. simply reflect on that.
h.h. the 14th dalai lama

There's also War and Peace: http://www.gutenberg.org/dirs/etext01/wrnpc12.txt

$ wc wrnpc12.txt
  67403 564514 3285165 wrnpc12.txt

···

On Sat, 2008-01-19 at 11:12 +0900, ara.t.howard wrote:

On Jan 18, 2008, at 4:06 PM, James Gray wrote:

> On Jan 18, 2008, at 3:42 PM, Radosław Bułat wrote:
>
>>>> My question is (I'm not familiar with RubyQuiz too much :)):
>>>> episode
>>>> focus on algorithm (speed) or source code (readable)?
>>>
>>> Hopefully both. :slight_smile:
>>
>> How long input string I could expect?
>
> Anybody found a good long text to work with yet? The text of the
> U.S. Constitution perhaps? Tristram Shandy?
>
> I would say that you should handle the biggest text you possibly
> can. :slight_smile:
>
> James Edward Gray II

http://www.fordham.edu/halsall/ancient/homer-illiad.txt

a @ http://codeforpeople.com/
--
we can deny everything, except that we have the possibility of being
better. simply reflect on that.
h.h. the 14th dalai lama

For The Illiad, I got:
' who are
perishing and coming to a bad end. We will, however, since you so
bid us, refrain from actual fighting, but we will make serviceable
suggestions to the Argives' (168 characters)

···

On Jan 18, 8:12 pm, "ara.t.howard" <ara.t.how...@gmail.com> wrote:

On Jan 18, 2008, at 4:06 PM, James Gray wrote:

> On Jan 18, 2008, at 3:42 PM, Radosław Bułat wrote:

>>>> My question is (I'm not familiar with RubyQuiz too much :)):
>>>> episode
>>>> focus on algorithm (speed) or source code (readable)?

>>> Hopefully both. :slight_smile:

>> How long input string I could expect?

> Anybody found a good long text to work with yet? The text of the
> U.S. Constitution perhaps? Tristram Shandy?

> I would say that you should handle the biggest text you possibly
> can. :slight_smile:

> James Edward Gray II

http://www.fordham.edu/halsall/ancient/homer-illiad.txt

a @http://codeforpeople.com/
--
we can deny everything, except that we have the possibility of being
better. simply reflect on that.
h.h. the 14th dalai lama

For War & Peace:
'

The Project Gutenberg Literary Archive Foundation has been ' (63
characters)

That one took awhile - 5m29s real time vs 1m2s real time for the
Illiad - though the increase is close to linear given the number of
characters in each text (3285165 vs. 806921).

···

On Jan 18, 8:27 pm, fw <fwmailingli...@gmail.com> wrote:

On Sat, 2008-01-19 at 11:12 +0900, ara.t.howard wrote:
> On Jan 18, 2008, at 4:06 PM, James Gray wrote:

> > On Jan 18, 2008, at 3:42 PM, Radosław Bułat wrote:

> >>>> My question is (I'm not familiar with RubyQuiz too much :)):
> >>>> episode
> >>>> focus on algorithm (speed) or source code (readable)?

> >>> Hopefully both. :slight_smile:

> >> How long input string I could expect?

> > Anybody found a good long text to work with yet? The text of the
> > U.S. Constitution perhaps? Tristram Shandy?

> > I would say that you should handle the biggest text you possibly
> > can. :slight_smile:

> > James Edward Gray II

>http://www.fordham.edu/halsall/ancient/homer-illiad.txt

> a @http://codeforpeople.com/
> --
> we can deny everything, except that we have the possibility of being
> better. simply reflect on that.
> h.h. the 14th dalai lama

There's also War and Peace:http://www.gutenberg.org/dirs/etext01/wrnpc12.txt

$ wc wrnpc12.txt
  67403 564514 3285165 wrnpc12.txt

Using Eric's code, for "The Iliad" downloaded from the Gutenberg project...

"On this the rest of the Acheans with one voice were for respecting
the priest and taking the ransom that he offered; but not so
Agamemnon, who spoke fiercely to him and sent him roughly away. " (193
characters)

Todd

···

2008/1/18 yermej <yermej@gmail.com>:

For The Illiad, I got:
' who are
perishing and coming to a bad end. We will, however, since you so
bid us, refrain from actual fighting, but we will make serviceable
suggestions to the Argives' (168 characters)

>
>
> > >>>> My question is (I'm not familiar with RubyQuiz too much :)):
> > >>>> episode
> > >>>> focus on algorithm (speed) or source code (readable)?
>
> > >>> Hopefully both. :slight_smile:
>
> > >> How long input string I could expect?
>
> > > Anybody found a good long text to work with yet? The text of the
> > > U.S. Constitution perhaps? Tristram Shandy?
>
> > > I would say that you should handle the biggest text you possibly
> > > can. :slight_smile:
>
> > > James Edward Gray II
>
> >http://www.fordham.edu/halsall/ancient/homer-illiad.txt
>
> > a @http://codeforpeople.com/
> > --
> > we can deny everything, except that we have the possibility of being
> > better. simply reflect on that.
> > h.h. the 14th dalai lama
>
> There's also War and Peace:http://www.gutenberg.org/dirs/etext01/wrnpc12.txt
>
> $ wc wrnpc12.txt
> 67403 564514 3285165 wrnpc12.txt

For War & Peace:
'

The Project Gutenberg Literary Archive Foundation has been ' (63
characters)

That one took awhile - 5m29s real time vs 1m2s real time for the
Illiad

So we need to take care of unicode too?

···

2008/1/19 yermej <yermej@gmail.com>:

On Jan 18, 8:27 pm, fw <fwmailingli...@gmail.com> wrote:
> On Sat, 2008-01-19 at 11:12 +0900, ara.t.howard wrote:
> > On Jan 18, 2008, at 4:06 PM, James Gray wrote:
> > > On Jan 18, 2008, at 3:42 PM, Radosław Bułat wrote:
- though the increase is close to linear given the number of
characters in each text (3285165 vs. 806921).

--
http://ruby-smalltalk.blogspot.com/

---
Whereof one cannot speak, thereof one must be silent.
Ludwig Wittgenstein

I've got the same output for both examples. Current times are 2.767s and 12.894s elapsed on a 3GHz processor.

Ruby's clever copy-on-write string handling really shines here.

Dave

···

On Jan 18, 2008, at 9:59 PM, yermej wrote:

That one took awhile - 5m29s real time vs 1m2s real time for the
Illiad - though the increase is close to linear given the number of
characters in each text (3285165 vs. 806921).

In my copy of the Illiad (there's a phrase I never thought I'd use!), those two sections differ in whitespace / linebreaks:

   On this the rest of the Achaeans with one voice were for
respecting the priest and taking the ransom that he offered; but not
so Agamemnon, who spoke fiercely to him and sent him roughly away.
"Old man," said he, "let me not find you tarrying about our ships, nor
yet coming hereafter.

and

   "On this the rest of the Achaeans with one voice were for respecting
the priest and taking the ransom that he offered; but not so
Agamemnon, who spoke fiercely to him and sent him roughly away. So
he went back in anger, and Apollo, who loved him dearly, heard his
prayer.

Did you get yours from http://www.fordham.edu/halsall/ancient/homer-illiad.txt ?

/dh

···

On 21 Jan 2008, at 15:12, Todd Benson wrote:

2008/1/18 yermej <yermej@gmail.com>:

For The Illiad, I got:
' who are
perishing and coming to a bad end. We will, however, since you so
bid us, refrain from actual fighting, but we will make serviceable
suggestions to the Argives' (168 characters)

Using Eric's code, for "The Iliad" downloaded from the Gutenberg project...

"On this the rest of the Acheans with one voice were for respecting
the priest and taking the ransom that he offered; but not so
Agamemnon, who spoke fiercely to him and sent him roughly away. " (193
characters)

Todd

As always, it's your choice. That's a nice bonus though.

James Edward Gray II

···

On Jan 19, 2008, at 5:29 AM, Robert Dober wrote:

So we need to take care of unicode too?

That one took awhile - 5m29s real time vs 1m2s real time for the
Illiad - though the increase is close to linear given the number of
characters in each text (3285165 vs. 806921).

I've got the same output for both examples. Current times are 2.767s and 12.894s elapsed on a 3GHz processor.

Ruby's clever copy-on-write string handling really shines here.

Hmm, I was happy with my solution (similiar times to yermej on 1.8GHz) until I saw Dave's times. I'm really looking forward to seeing the code.

/dh

···

On 19 Jan 2008, at 16:21, Dave Thomas wrote:

On Jan 18, 2008, at 9:59 PM, yermej wrote:

>>
>> For The Illiad, I got:
>> ' who are
>> perishing and coming to a bad end. We will, however, since you so
>> bid us, refrain from actual fighting, but we will make serviceable
>> suggestions to the Argives' (168 characters)
>>
>>
>
> Using Eric's code, for "The Iliad" downloaded from the Gutenberg
> project...
>
> "On this the rest of the Acheans with one voice were for respecting
> the priest and taking the ransom that he offered; but not so
> Agamemnon, who spoke fiercely to him and sent him roughly away. " (193
> characters)
>
> Todd
>
In my copy of the Illiad (there's a phrase I never thought I'd use!),
those two sections differ in whitespace / linebreaks:

   On this the rest of the Achaeans with one voice were for
respecting the priest and taking the ransom that he offered; but not
so Agamemnon, who spoke fiercely to him and sent him roughly away.
"Old man," said he, "let me not find you tarrying about our ships, nor
yet coming hereafter.

and

   "On this the rest of the Achaeans with one voice were for respecting
the priest and taking the ransom that he offered; but not so
Agamemnon, who spoke fiercely to him and sent him roughly away. So
he went back in anger, and Apollo, who loved him dearly, heard his
prayer.

Did you get yours from http://www.fordham.edu/halsall/ancient/homer-illiad.txt

http://www.gutenberg.org/dirs/etext00/iliad10.txt

It goes to show that any text on the "web" is spurious :slight_smile:

Side note and totally off topic... why do people like to use two el's
in "The Iliad"?

  ?

/dh

Todd

···

On Jan 21, 2008 9:49 AM, Denis Hennessy <denis@hennessynet.com> wrote:

On 21 Jan 2008, at 15:12, Todd Benson wrote:
> 2008/1/18 yermej <yermej@gmail.com>:

Long version: because the capital-I looks identical to the lower case L so people think they've seen two 'L's. Of course they also heard an 'I' so they insert that first.

Short version: Iggnorance.

/dh

···

On 22 Jan 2008, at 00:55, Todd Benson wrote:

Side note and totally off topic... why do people like to use two el's
in "The Iliad"?

Okay, this is an ignorant reply. I'm really sorry to the people that
actually want to program. Just delete this :slight_smile:

Your previous link has two "L's". And many of the other posts do as
well. And, I am pretty sure it has nothing to do with fonts.

Todd

···

On Jan 22, 2008 7:06 AM, Denis Hennessy <denis@hennessynet.com> wrote:

On 22 Jan 2008, at 00:55, Todd Benson wrote:
> Side note and totally off topic... why do people like to use two el's
> in "The Iliad"?

Long version: because the capital-I looks identical to the lower case
L so people think they've seen two 'L's. Of course they also heard an
'I' so they insert that first.

Short version: Iggnorance.

I meant "this will be an ignorant reply". It is hard to throw candor
correctly over the internet.

Has anyone figured out why my text result differs from others?
Spacing/newlines, obviously, is a factor, but I get the same result on
every machine. But then, if you think about it, then to use something
like this, you _really_ have to trust the data.

Todd

···

On Jan 22, 2008 3:25 PM, Todd Benson <caduceass@gmail.com> wrote:

On Jan 22, 2008 7:06 AM, Denis Hennessy <denis@hennessynet.com> wrote:
> On 22 Jan 2008, at 00:55, Todd Benson wrote:
> > Side note and totally off topic... why do people like to use two el's
> > in "The Iliad"?
>
> Long version: because the capital-I looks identical to the lower case
> L so people think they've seen two 'L's. Of course they also heard an
> 'I' so they insert that first.
>
> Short version: Iggnorance.

Okay, this is an ignorant reply. I'm really sorry to the people that
actually want to program. Just delete this :slight_smile:

Your previous link has two "L's". And many of the other posts do as
well. And, I am pretty sure it has nothing to do with fonts.

I think, it's "Ilias" in ancient Greek anyway. It's quite a common
mistake/transliteration.