Split on '' (and another for split -1)

Here's a generic routine I'm working on:

  class String
    def last=(str, separator=$/)
      separator = '' unless separator
      raise "separator must be a String" unless String === separator
      s = self.split(separator, -1)
      s[-1] = str
      self.replace(s.join(separator))
    end
  end

  $/ = "\n"
  s = "ab\nc"
  s.last = "123"
  p s
  # => "ab\n123"

Now try:

  $/ = ""
  s = "abc"
  s.last = "123"
  p s

Unfortunately this does not give a congruent result.

As an aside, this relates to the split -1 since any generic routine will not
know what the split separator is and therefore will require the -1. Though my
analysis is not complete, I continue to find that in most cases the -1
parameter is either required, or at worst, inconsequential.

T.
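
To make the incongruity concrete, here is a minimal sketch of the two cases, assuming a current Ruby. The helper replace_last is hypothetical: it mirrors the body of String#last= above but takes the separator explicitly instead of reading $/.

```ruby
# Hypothetical helper mirroring the body of String#last= above, with the
# separator passed explicitly instead of being read from $/.
def replace_last(text, str, separator)
  parts = text.split(separator, -1)  # -1 keeps trailing empty fields
  parts[-1] = str
  parts.join(separator)
end

p "ab\nc".split("\n", -1)             # => ["ab", "c"]
p replace_last("ab\nc", "123", "\n")  # => "ab\n123"

# Splitting on '' with a negative limit yields an extra trailing "",
# so the new value is appended instead of replacing "c".
p "abc".split('', -1)                 # => ["a", "b", "c", ""]
p replace_last("abc", "123", '')      # => "abc123"

# Without the limit, the trailing "" is dropped.
p "abc".split('')                     # => ["a", "b", "c"]
```

So the newline case replaces the last field, while the '' case leaves every character in place and appends.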

["trans. (T. Onoma)" <transami@runbox.com>, 2004-12-27 20.31 CET]

> Here's a generic routine I'm working on:
>
>   class String
>     def last=(str, separator=$/)
>       separator = '' unless separator
>       raise "separator must be a String" unless String === separator
>       s = self.split(separator, -1)
>       s[-1] = str
>       self.replace(s.join(separator))
>     end
>   end
>
>   $/ = "\n"
>   s = "ab\nc"
>   s.last = "123"
>   p s
>   # => "ab\n123"
>
> Now try:
>
>   $/ = ""
>   s = "abc"
>   s.last = "123"
>   p s
>
> Unfortunately this does not give a congruent result.

$/="" means paragraph mode, and it is acknowledged(?) by IO#gets,
#readlines, String#each, #to_a, etc. Maybe you should do the same?

And that solves your problems with split('') ;)))).

trans. (T. Onoma) wrote:

> Here's a generic routine I'm working on:
>
>   class String
>     def last=(str, separator=$/)
>       [...]
>     end
>   end

AFAIK there is no way of supplying the separator in that case except using .send(). I have done an RCR for this, but it was not commonly wanted at that time.
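
A minimal sketch of that limitation, assuming a current Ruby. The class Line and its text attribute are hypothetical names used only for illustration, and the setter defaults to "\n" instead of $/.

```ruby
# Hypothetical class with a two-parameter setter, shaped like String#last=.
class Line
  attr_reader :text

  def initialize(text)
    @text = text
  end

  def last=(str, separator = "\n")
    parts = @text.split(separator, -1)
    parts[-1] = str
    @text = parts.join(separator)
  end
end

# Assignment syntax passes exactly one argument, so the default separator
# is used; "a,b,c" contains no newline and is replaced as a whole.
line = Line.new("a,b,c")
line.last = "z"
p line.text  # => "z"

# Only an explicit method call can supply the second argument.
line = Line.new("a,b,c")
line.send(:last=, "z", ",")
p line.text  # => "a,b,z"
```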

Boy, that's a real side-splitter! Watch me slap my knee! ;)))).

But seriously, you can call it anything you wish. It does not change the
behavior.

"Moreover you have a peculiar definition of paragraph. ".split('')
=> ["M", "o", "r", "e", "o", "v", "e", "r", " ", "y", "o", "u", " ", "h", "a",
"v", "e", " ", "a", " ", "p", "e", "c", "u", "l", "i", "a", "r", " ", "d",
"e", "f", "i", "n", "i", "t", "i", "o", "n", " ", "o", "f", " ", "p", "a",
"r", "a", "g", "r", "a", "p", "h", ".", " "]

A more informative explanation of this "paragraph mode" might actually be
helpful.

T.

···

On Monday 27 December 2004 03:33 pm, Carlos wrote:

$/="" means paragraph mode, and it is acknowledged(?) by IO#gets,
#readlines, String#each, #to_a, etc. Maybe you should do the same?

And that solves your problems with split('') ;)))).

At least one can set $/ beforehand, I guess.

T.

···

On Monday 27 December 2004 05:21 pm, Florian Gross wrote:

trans. (T. Onoma) wrote:
> Here's a generic routine I'm working on:
>
> class String
> def last=(str, separator=$/)
> [...]
> end
> end

AFAIK there is no way of supplying the separator in that case except
using .send(). I have done an RCR for this, but it was not commonly
wanted at that time.

I didn't see it offhand. Which RCR # is it?

In doing so would there be a conflict with parallel assignment?

T.
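
On the parallel-assignment question, a small sketch (assuming a current Ruby; Box is a hypothetical class) of what the existing syntax already does with several right-hand values: they are collected into one array and passed to the setter as a single argument, which is the behavior any extension of the syntax would have to coexist with.

```ruby
# Hypothetical class with an ordinary one-parameter setter.
class Box
  attr_reader :value

  def value=(v)
    @value = v
  end
end

box = Box.new

# Several right-hand values arrive as one Array argument,
# not as separate arguments.
box.value = "123", "\n"
p box.value  # => ["123", "\n"]
```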

···

On Monday 27 December 2004 05:21 pm, Florian Gross wrote:

trans. (T. Onoma) wrote:
> Here's a generic routine I'm working on:
>
> class String
> def last=(str, separator=$/)
> [...]
> end
> end

AFAIK there is no way of supplying the separator in that case except
using .send(). I have done an RCR for this, but it was not commonly
wanted at that time.

["trans. (T. Onoma)" <transami@runbox.com>, 2004-12-27 22.25 CET]

> A more informative explanation of this "paragraph mode" might actually be
> helpful.

require 'pp'

$/=""
pp <<'EOT'.to_a

A line feed separates lines. For example this one ->
<- divides these two lines. Two or more line feeds
separate paragraphs (that is, the regular expression
/\n\n+/). Here ends the first paragraph:

And here begins the second.

Third.

Fourth.
EOT

=>
["\nA line feed separates lines. For example this one ->\n<- divides these
two lines. Two or more line feeds\nseparate paragraphs (that is, the regular
expression\n/\\n\\n+/). Here ends the first paragraph:\n\n",
"And here begins the second.\n\n",
"Third.\n\n\n",
"Fourth.\n"]

Good luck.

Okay, I looked up what you were trying to explain to me. Sigh, it makes it
even more complex.

Firstly, the issue with -1 is still the actual problem I was pointing out:
i.e. it appends an "" to the end of the array when split on "".

And now thanks to "paragraph mode" I have another problem. While I was trying
to make #first= and #last= work congruently with #each, which uses $/, it
seems that paragraph mode actually subverts the potential for splitting on ''
as a character mode. To do so one must use // instead, but...

  $/ = //
  TypeError: value of $/ must be String

Why not just have #each_paragraph for a "paragraph mode"?

T.
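
A sketch of what such a method could look like, assuming a current Ruby. String#each_paragraph is hypothetical and not part of Ruby's String API; here it simply wraps String#each_line with the empty-string separator, and character-wise iteration is shown through String#each_char for contrast.

```ruby
class String
  # Hypothetical method, not part of Ruby: yields each paragraph, using the
  # paragraph mode that each_line selects for an empty-string separator.
  def each_paragraph(&block)
    return enum_for(:each_paragraph) unless block
    each_line('', &block)
    self
  end
end

p "one\n\ntwo\n".each_paragraph.to_a  # => ["one\n\n", "two\n"]

# Character-wise iteration needs no separator at all.
p "abc".each_char.to_a                # => ["a", "b", "c"]
```

With separate entry points for paragraphs and characters, the empty string would not have to carry both meanings.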

···

On Monday 27 December 2004 04:25 pm, trans. (T. Onoma) wrote:

A more informative explanation of this "paragraph mode" might actually be
helpful.

trans. (T. Onoma) wrote:

> AFAIK there is no way of supplying the separator in that case except
> using .send(). I have done an RCR for this, but it was not commonly
> wanted at that time.

> I didn't see it offhand. Which RCR # is it?

http://www.rcrchive.net/rcr/show/157

At the time I submitted it there was not much demand for it. If that changes I could resubmit it under the new format.

Thanks, Carlos.

T.

···

On Monday 27 December 2004 04:48 pm, Carlos wrote:

["trans. (T. Onoma)" <transami@runbox.com>, 2004-12-27 22.25 CET]

> A more informative explanation of this "paragraph mode" might actually be
> helpful.

require 'pp'

$/=""
pp <<'EOT'.to_a

A line feed separates lines. For example this one ->
<- divides these two lines. Two or more line feeds
separate paragraphs (that is, the regular expression
/\n\n+/). Here ends the first paragraph:

And here begins the second.

Third.

Fourth.
EOT

=>
["\nA line feed separates lines. For example this one ->\n<- divides these
two lines. Two or more line feeds\nseparate paragraphs (that is, the
regular expression\n/\\n\\n+/). Here ends the first paragraph:\n\n",
"And here begins the second.\n\n",
"Third.\n\n\n",
"Fourth.\n"]

Good luck.

Another thought:

  $/ = :paragraph

T.

···

On Monday 27 December 2004 04:56 pm, trans. (T. Onoma) wrote:

Why not just have #each_paragraph for a "paragraph mode"?