Extracting vowels and consonants using regular expression

Thanks Robert, Corey, Philip, Stephano, et al., for all of the great
suggestions. However, they all seem to ignore the conditional nature of 'y'
as a vowel. I would like the regex to treat 'y' as a vowel when there is no
other vowel either before or after it. The string I used initially was
drawn from the following Perl code that accomplishes this (found at
http://www.perlmonks.org/?node_id=592867):

my @vowels = ( /[aeiou]|(?<![aeiou])y(?![aeiou])/gi );

The "(?<!...)" is a "zero-width negative-look-behind assertion".

The "(?!...)" is a "zero-width negative-look-ahead assertion".

Together, they match the condition of treating "y as a vowel only if there
is no other vowel before or after it."
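
For readers on a newer Ruby, here is a minimal sketch of how the two assertions behave together. It assumes Ruby 1.9 or later, where the built-in engine accepts look-behind; the constant and method names are made up for illustration.

```ruby
# Assumes Ruby 1.9 or later, whose regex engine supports look-behind.
# Y_AWARE_VOWEL and y_aware_vowels are hypothetical names for this sketch.
Y_AWARE_VOWEL = /[aeiou]|(?<![aeiou])y(?![aeiou])/i

def y_aware_vowels(str)
  str.scan(Y_AWARE_VOWEL)
end

p y_aware_vowels("rhythm")  # => ["y"]       ('y' sits between consonants)
p y_aware_vowels("yellow")  # => ["e", "o"]  ('y' is followed by a vowel)
p y_aware_vowels("play")    # => ["a"]       ('y' is preceded by a vowel)
```

Neither assertion consumes any characters, so the letters around the 'y' remain available for later matches.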

This was my attempt at converting the Perl fragment to Ruby syntax:

scan(/[aeiou]|(?![aeiou])y(?![aeiou])/i)

I have since discovered that Ruby 1.8 lacks regex look-behind assertions --
which explains why the code was failing. As a result, I have fallen back to
the following, which currently ignores the 'y':

class String
  def vowels
    scan(/[aeiou]/i)
  end
  def consonants
    scan(/[^aeiou]/i)
  end
end

Any ideas on how to modify this to include the conditional treatment of "y
as a vowel only if there is no other vowel before or after it"?

(i.e., is there a way to simulate the Perl "zero-width negative-look-behind"
and "zero-width negative-look-ahead" assertions for 'y' in Ruby 1.8?)
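
One possible way to get the same effect without look-behind is sketched below. It is not taken from the thread: it lets each vowel swallow a directly following 'y', so that 'y' can never match on its own, and uses only look-ahead, which Ruby 1.8 does accept. The method name is hypothetical.

```ruby
# Sketch: emulate the negative look-behind by letting a vowel consume the
# 'y' that follows it. vowels_without_lookbehind is a hypothetical name.
def vowels_without_lookbehind(str)
  # Group 1: a plain vowel; a trailing 'y' is matched but not captured.
  # Group 2: a 'y' with no vowel after it. A vowel before it is impossible,
  #          because such a 'y' was already swallowed by the first branch.
  str.scan(/([aeiou])y?|(y)(?![aeiou])/i).map { |vowel, y| vowel || y }
end

p vowels_without_lookbehind("rhythm")  # => ["y"]
p vowels_without_lookbehind("yellow")  # => ["e", "o"]
p vowels_without_lookbehind("play")    # => ["a"]
```

Because scan returns one [group1, group2] pair per match, the map keeps whichever group actually matched.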

···

On 2/2/08 4:17 PM, in article 335e48a90802021417n4ba96a11i746bd5118f18464e@mail.gmail.com, "Robert Dober" <robert.dober@gmail.com> wrote:

Oh sorry, using 1.9 exclusively now...
Robert

Interesting. I guess you could post-process a bit on the two sets that you
get back. A regular expression that can handle the y would be good, I guess.
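
One reading of the post-processing idea, as a rough sketch rather than a tested solution: walk the string once and decide each 'y' by looking at its neighbours by hand, with no look-around at all. The method names are hypothetical.

```ruby
# Sketch only: classify letters without any look-around assertions.
# vowel_letter? and split_vowels_consonants are hypothetical helper names.
def vowel_letter?(ch)
  !ch.nil? && "aeiou".include?(ch.downcase)
end

def split_vowels_consonants(str)
  chars = str.split(//)
  vowels = []
  consonants = []
  chars.each_with_index do |ch, i|
    next unless ch =~ /[a-z]/i          # skip spaces, digits, punctuation
    if vowel_letter?(ch)
      vowels << ch
    elsif ch.downcase == "y"
      before = i > 0 ? chars[i - 1] : nil
      after  = chars[i + 1]
      # 'y' is a consonant when a vowel touches it, otherwise a vowel.
      if vowel_letter?(before) || vowel_letter?(after)
        consonants << ch
      else
        vowels << ch
      end
    else
      consonants << ch
    end
  end
  [vowels, consonants]
end

p split_vowels_consonants("yellow")  # => [["e", "o"], ["y", "l", "l", "w"]]
```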

···

On Feb 2, 2008 5:40 PM, Donovan Dillon <donovan.dillon@verizon.net> wrote:

--

The Internet's Premiere source of information about Corey Haines

Donovan Dillon wrote:

they all seem to ignore the conditional nature of 'y'
as a vowel.

What about diphthongs? Technically a diphthong is *one* vowel
made up of two letters. The rules vary by language; Spanish
even has triphthongs, e.g. in Raoul. A diphthong/triphthong
occurs wherever a succession of vowel symbols doesn't contain
a syllable break.
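
As a very crude first approximation of this idea (a sketch, not a linguistic solution), every run of adjacent vowel letters could be treated as a single unit. The method name is hypothetical, and the approach wrongly merges adjacent vowels that do have a syllable break between them.

```ruby
# Crude sketch: treat each run of adjacent vowel letters as one unit.
# vowel_groups is a hypothetical name; no syllable analysis is attempted.
def vowel_groups(str)
  str.scan(/[aeiou]+/i)
end

p vowel_groups("Raoul")      # => ["aou"]
p vowel_groups("beautiful")  # => ["eau", "i", "u"]
```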

Perhaps this would be a good Ruby Quiz?

Clifford Heath.

Oniguruma is the default regex library for Ruby 1.9. It includes
look-behind assertions (woohoo!) ... and ... it turns out that there is a
gem available, so you don't have to upgrade to Ruby 1.9 or monkey
around with creating a custom Ruby 1.8.x build.

The gem relies upon a C library that can be downloaded from here:
http://www.geocities.jp/kosako3/oniguruma/

After installing the library, oniguruma installation is a breeze
using:

sudo gem install -r oniguruma

However, my progress has come to a screeching halt as I am now
receiving the following error:

** Starting Mongrel listening at 0.0.0.0:3000
** Starting Rails with development environment...
Exiting
/usr/local/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `gem_original_require': ./lib/string_extensions.rb:4: undefined (?...) sequence: /[aeiou]|(?<![aeiou])y(?![aeiou])/ (SyntaxError)
./lib/string_extensions.rb:8: undefined (?...) sequence: /![aeiou]|(?<=[aeiou])y(?=[aeiou])/
        from /usr/local/lib/ruby/site_ruby/1.8/rubygems/custom_require.rb:27:in `require'

It seems to be complaining about the look-behind and look-ahead
assertions in the following code fragment (which oniguruma is supposed
to support):

class String
  def vowels
    scan(/[aeiou]|(?<![aeiou])y(?![aeiou])/i)
  end
  def consonants
    scan(/![aeiou]|(?<=[aeiou])y(?=[aeiou])/i)
  end
end

According to the Oniguruma reference (oniguruma/doc/RE.txt), the
look-behind and look-ahead syntax that I am using appears to be correct
(see section 7, Extended groups).

This suggests either:

A. Ruby may be using the default regexp library instead of the
oniguruma regexp library,

B. The oniguruma regexp library is not accessible via the 'scan'
method, or

C. Something else entirely

... hmmm ... <scratches head/>

Any suggestions?
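
One way to test option A, offered as a sketch: the SyntaxError is raised while the file is being loaded, which suggests that /.../ literals are still compiled by Ruby's built-in engine regardless of which gems are installed. Building the pattern at run time with Regexp.new makes that easy to check without aborting the load. The method name is hypothetical.

```ruby
# Sketch: ask the built-in Regexp engine whether it accepts look-behind.
# Regexp.new compiles at run time, so a failure can be rescued instead of
# surfacing as a SyntaxError while the file is parsed.
# builtin_lookbehind? is a hypothetical helper name.
def builtin_lookbehind?
  Regexp.new('(?<![aeiou])y')
  true
rescue RegexpError
  false
end

puts "built-in look-behind support: #{builtin_lookbehind?}"
```

If this reports false, the literal regexps in string_extensions.rb cannot work as written, whatever the gem provides through its own classes.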

Good idea -- this is made for Ruby Quiz. How does one submit a problem?

···

On Feb 2, 5:41 pm, Clifford Heath <n...@spam.please.net> wrote:

Donovan Dillon wrote:
> they all seem to ignore the conditional nature of 'y'
> as a vowel.

What about diphthongs? Technically a diphthong is *one* vowel
made up of two letters. The rules vary by language; Spanish
even has triphthongs, e.g. in Raoul. A diphthong/triphthong
occurs wherever a succession of vowel symbols doesn't contain
a syllable break.

Perhaps this would be a good Ruby Quiz?

Clifford Heath.

Thanks for all the help, everyone. The problem was solved with help
from pullmonkey on Rails Forum! Here is the solution:

Objective:

1. Extract vowels and consonants from a string
2. Handle the conditional treatment of 'y' as a vowel under the
following circumstances:
     - y is a vowel if it is surrounded by consonants
     - y is a consonant if it is adjacent to a vowel

Here is the code that works:

  require 'oniguruma'   # provided by the oniguruma gem

  # Vowels: a, e, i, o, u, plus 'y' when no vowel is adjacent to it.
  def vowels(name_str)
    reg = Oniguruma::ORegexp.new('[aeiou]|(?<![aeiou])y(?![aeiou])')
    reg.match_all(name_str).to_s.scan(/./)
  end

  # Consonants: the unconditional ones, plus 'y' when a vowel precedes or
  # follows it. The pattern string must stay on a single line.
  def consonants(name_str)
    reg = Oniguruma::ORegexp.new('[bcdfghjklmnpqrstvwxz]|(?<=[aeiou])y|y(?=[aeiou])')
    reg.match_all(name_str).to_s.scan(/./)
  end

(Note: the .scan(/./) can be eliminated to return an array.)

The major problem was getting the code to accurately treat "y" as a
consonant. The key to solving this problem was to:

1. define unconditional consonants explicitly (i.e.,
[bcdfghjklmnpqrstvwxz]) -- not as [^aeiou], which automatically includes
"y" and thus overrides any conditional treatment of "y" that follows

2. define the conditional "y" regexp assertions independently, i.e.,
"|(?<=[aeiou])y|y(?=[aeiou])" -- not "|(?<=[aeiou])y(?=[aeiou])",
which only matches "y" preceded AND followed by a vowel, not preceded
OR followed by a vowel
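
The difference between the two forms can be seen in a short sketch. It uses Ruby's built-in Regexp (so it assumes Ruby 1.9 or later) rather than the Oniguruma gem API, and the constant names are made up for illustration.

```ruby
# Assumes Ruby 1.9+ built-in Regexp; constant names are hypothetical.
BOTH_SIDES  = /(?<=[aeiou])y(?=[aeiou])/i      # vowel before AND after
EITHER_SIDE = /(?<=[aeiou])y|y(?=[aeiou])/i    # vowel before OR after
NEGATED     = /[^aeiou]/i                      # matches every 'y'

p "yellow".scan(BOTH_SIDES)   # => []      ('y' has no vowel before it)
p "yellow".scan(EITHER_SIDE)  # => ["y"]
p "mayor".scan(BOTH_SIDES)    # => ["y"]   (vowel on both sides)
p "rhythm".scan(NEGATED)      # => ["r", "h", "y", "t", "h", "m"]

# Explicit consonant list (including 'z') plus the conditional 'y'.
CONSONANT = /[bcdfghjklmnpqrstvwxz]|(?<=[aeiou])y|y(?=[aeiou])/i
p "rhythm".scan(CONSONANT)    # => ["r", "h", "t", "h", "m"]
```

The negated class claims the 'y' in "rhythm" before any conditional alternative is tried, which is why the consonants have to be listed explicitly.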

HTH.

···

On Feb 3, 11:42 pm, Dondi <Donovan.Dil...@gmail.com> wrote:
