Perl multiple match RE in Ruby?

Have this in Perl, would prefer it in Ruby.

my @A = ($_[0] =~ /([-.]|\d+|[^-.\d]+)/g);

Need to take String and get Array of all successive matches against string.

Regex itself is fine, but hope there’s an idiomatic way to run this without
building a loop using MatchData#[1] and MatchData#post_match. Which I can
do, but it seems clumsy.

Thanks…
-michael

Michael C. Libby x@ichimunki.com http://www.ichimunki.com/ http://www.ichimunki.com/public_key.txt

Hi,

michael libby x@ichimunki.com writes:

Have this in Perl, would prefer it in Ruby.

my @A = ($_[0] =~ /([-.]|\d+|[^-.\d]+)/g);

Need to take String and get Array of all successive matches against string.

How about String#scan ?

% ruby -e 'p "a b c".scan(/\w/)'
["a", "b", "c"]

···


eban

michael libby x@ichimunki.com writes:

Have this in Perl, would prefer it in Ruby.

my @A = ($_[0] =~ /([-.]|\d+|[^-.\d]+)/g);

Need to take String and get Array of all successive matches against string.

You’re in luck:

$_[0].scan( /([-.]|\d+|[^-.\d]+)/ )

You can also pass String#scan a block to execute on each match. See
http://www.rubycentral.com/book/ref_c_string.html#String.scan for
details.
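
One detail worth checking when porting the Perl pattern directly: as far as I know, String#scan returns a flat array of strings only when the pattern has no capture groups. With a group, as in the Perl original, each match comes back as an array of its groups. A minimal sketch (the sample string and variable names are made up for illustration):

```ruby
version = "1.1-4b"   # hypothetical sample input

# No capture group: scan returns a flat array of matched substrings.
flat = version.scan(/[-.]|\d+|[^-.\d]+/)
p flat

# With a capture group: each match becomes an array of its groups.
nested = version.scan(/([-.]|\d+|[^-.\d]+)/)
p nested

# Block form: the block is called once per match.
digits = []
version.scan(/\d+/) { |m| digits << m.to_i }
p digits
```

Dropping the parentheses (or calling flatten on the result) gives the flat list of tokens that the Perl list assignment produces.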

Dan

···


http://www.dfan.org

That did the trick! This is why I love Ruby, so easy to get stuff done. The
Ruby version is SO much more readable than the Perl too.

So here’s my adaptation of Perl’s Sort::Versions. Anything I’m missing?
Other than the documentation, that is.

class Sort
  class Versions
    def Versions.versioncmp(version_a, version_b)
      vre = /[-.]|\d+|[^-.\d]+/
      ax = version_a.scan(vre)
      bx = version_b.scan(vre)

      while (ax.length>0 && bx.length>0) do
        a = ax.shift
        b = bx.shift

        if( a == b )                 then next
        elsif (a == '-' && b == '-') then next
        elsif (a == '-')             then return -1
        elsif (b == '-')             then return 1
        elsif (a == '.' && b == '.') then next
        elsif (a == '.' )            then return -1
        elsif (b == '.' )            then return 1
        elsif (a =~ /^\d+$/ && b =~ /^\d+$/) then
          if( a =~ /^0/ or b =~ /^0/ ) then
            return a.to_s.upcase <=> b.to_s.upcase
          end
          return a.to_i <=> b.to_i
        else
          return a.upcase <=> b.upcase
        end
      end
      return version_a <=> version_b
    end

    def Versions.sort_versions(list)
      return list.sort{|a,b| Sort::Versions.versioncmp(a,b)}
    end
  end
end

puts Sort::Versions::sort_versions( %w{ 1.1.6 2.3 1.1a 3.0 1.5 1 2.4 1.1-4
2.3.1 1.2 2.3.0 1.1-3 2.4b 2.4 2.40.2 2.3a.1 3.1 0002 1.1-5 1.1.a 1.06} )


Michael C. Libby x@ichimunki.com http://www.ichimunki.com/ http://www.ichimunki.com/public_key.txt
···

On Monday 21 October 2002 22:52, WATANABE Hirofumi wrote:

How about String#scan ?

michael libby x@ichimunki.com writes:

···

On Monday 21 October 2002 22:52, WATANABE Hirofumi wrote:

How about String#scan ?

That did the trick! This is why I love Ruby, so easy to get stuff done. The
Ruby version is SO much more readable than the Perl too.

While I agree, to be fair to Perl, String#scan in this instance is
equivalent to String#split, which is Perl’s split.

Of course, String#scan does do things that Perl’s split can’t do, but
what you’re doing isn’t one of them. :)


The warly race may riches chase,
An’ riches still may fly them, O;
An’ tho’ at last they catch them fast,
Their hearts can ne’er enjoy them, O.

So here’s my adaptation of Perl’s Sort::Versions. Anything I’m missing?
Other than the documentation, that is.

Kewl! Three things:

Why define your own sort, when you can just pass the comparison method to
the normal sort method? (I may be missing something here, though.)

Second, modules might fit the bill better than classes.

Third, I’d like to suggest some API changes, to make it less Perl-like and
more Ruby-like:

class Sort
class Versions
def Versions.versioncmp(version_a, version_b)
end
def Versions.sort_versions(list)
end
end
end

I’d make it:

module Version

  def self.cmp( a, b )
    # Leading :: so the lookup reaches the top-level Sort class,
    # not the Version::Sort module defined below.
    ::Sort::Versions.versioncmp( a, b )
  end

  def self.sort( list )
    list.sort { |a,b| Version.cmp(a,b) }
  end

  def self.sort!( list )
    list.sort! { |a,b| Version.cmp(a,b) }
  end

  module Cmp
    def version_cmp( b )
      Version.cmp( self, b )
    end
  end

  module Sort
    def version_sort
      Version.sort( self )
    end
    def version_sort!
      Version.sort!( self )
    end
  end

end

Version.cmp( "0.77", "1.3.5" )

# include (rather than extend) so that String instances get version_cmp
class String; include Version::Cmp; end
"0.1.3".version_cmp( "0.2.32" )

Version.sort(["0.22", "0.1", "1.4"])
Version.sort!(["0.22", "0.1", "1.4"])

# likewise for Array instances
class Array; include Version::Sort; end
["0.22", "0.1", "1.4"].version_sort
["0.22", "0.1", "1.4"].version_sort!

This naturally does break the CPAN-equivalence, which may be a nice thing
to have, so maybe make a separate hierarchy: CPAN::Sort::Versions that
refers to the same implementation.
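
For the mixin part of this proposal, how the module gets attached matters: `include` inside a class gives the methods to instances, while `extend` on the class gives them to the class object itself. A small sketch of the difference (Greeter, Included and Extended are hypothetical names used only for illustration):

```ruby
module Greeter          # hypothetical module, for illustration only
  def greet
    "hello from #{inspect}"
  end
end

class Included; include Greeter; end   # instances gain #greet
class Extended; extend Greeter; end    # the class object itself gains .greet

p Included.new.respond_to?(:greet)
p Included.respond_to?(:greet)
p Extended.respond_to?(:greet)
p Extended.new.respond_to?(:greet)
```

So for calls like "0.1.3".version_cmp(...) to work on string instances, the module would need to be included into String rather than used to extend it.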

– Nikodemus

···

On Tue, 22 Oct 2002, michael libby wrote:

I refuse to have a battle of wits with an unarmed person.

I was all set to refute this. Then I read ‘perldoc -f split’ and discovered
that in Perl’s split() if /EXPR/ contains capturing parentheses it will
return the match as well as splitting on it. It works this way in Ruby,
too. Thanks for pointing this out.
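
A quick way to see this in Ruby, as a rough sketch (the sample string is made up): with the capturing pattern, split returns the captured tokens along with the empty pieces between them, so the empties have to be dropped before the result lines up with scan.

```ruby
vre = /([-.]|\d+|[^-.\d]+)/
version = "1.1a-3"   # hypothetical sample input

# split with a capturing pattern returns the captured text as well as the
# (here empty) pieces between matches, so drop the empty strings.
via_split = version.split(vre).reject { |piece| piece.empty? }

# scan returns one-element arrays when the pattern has a group; flatten them.
via_scan = version.scan(vre).flatten

p via_split
p via_scan
p via_split == via_scan
```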

-michael

Michael C. Libby x@ichimunki.com http://www.ichimunki.com/ http://www.ichimunki.com/public_key.txt
···

On Tuesday 22 October 2002 04:36, Simon Cozens wrote:

michael libby x@ichimunki.com writes:

On Monday 21 October 2002 22:52, WATANABE Hirofumi wrote:

How about String#scan ?

That did the trick! This is why I love Ruby, so easy to get stuff
done. The Ruby version is SO much more readable than the Perl too.

While I agree, to be fair to Perl, String#scan in this instance is
equivalent to String#split, which is Perl’s split.

Of course, String#scan does do things that Perl’s split can’t do, but
what you’re doing isn’t one of them. :)

Why define your own sort, when you can just pass the comparison method
to normal sort-method? (I may be missing something here, though.)

I guess because it makes a convenient wrapper?

Second, modules might fit the bill better than classes.
[snip]

They certainly would the way I had it. Your suggestions give a lot of food
for thought.

My natural tendency is to want the version_cmp method in String and
version_sort in Array… would it make more sense to simply do:

class String
def version_cmp(b)
#compare(self,b)
end
end

class Array
def version_sort
self.sort{|a,b| a.to_s.version_cmp(b.to_s)}
end
end

That would save the step of extending those classes in code.

-michael

Michael C. Libby x@ichimunki.com http://www.ichimunki.com/ http://www.ichimunki.com/public_key.txt
···

On Tuesday 22 October 2002 05:02, Nikodemus Siivola wrote:

This is something I haven’t been overly concerned with in my
reimplementations of Text::Format and MIME::Types. Ultimately, I
think that the Ruby implementation should have a different API
because Ruby allows for some things that Perl doesn’t,
expression-wise.

-austin
– Austin Ziegler, austin@halostatue.ca on 2002.10.22 at 11.33.44

···

On Tue, 22 Oct 2002 19:02:13 +0900, Nikodemus Siivola wrote:

This naturally does break the CPAN-equivalence, which may be a
nice thing to have, so maybe make a separate hierarchy:
CPAN::Sort::Versions that refers to the same implementation.

My natural tendency is to want the version_cmp method in String and
version_sort in Array… would it make more sense to simply do:

That would save the step of extending those classes in code.

How about adding that to the Version module inside a BuiltinExt module, so
that to extend the classes you would just:

require 'version'
include Version::BuiltinExt

But those who do not like to see String and Array extended, or would like
to extend some other String- or Array-like class, could still use:

Version.cmp(a,b)

or

MyClass.extend Version::Cmp

This (extension api design) is one area that would really deserve a “Best
Practices” doc.
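
One way the opt-in idea could be wired up, as a sketch only: the `included` hook, the stand-in comparison, and the exact module layout below are my assumptions, not something settled in this thread. Including Version::BuiltinExt triggers the hook, which mixes the helper modules into String and Array.

```ruby
module Version
  # Stand-in comparison: numeric fields only, split on "." or "-".
  # (The thread's full comparison rules are not reproduced here.)
  def self.cmp(a, b)
    a.split(/[-.]/).map { |f| f.to_i } <=> b.split(/[-.]/).map { |f| f.to_i }
  end

  module Cmp
    def version_cmp(other)
      Version.cmp(self, other)
    end
  end

  module Sort
    def version_sort
      sort { |a, b| Version.cmp(a, b) }
    end
  end

  # Hypothetical opt-in module: including it anywhere patches the builtins.
  module BuiltinExt
    def self.included(_base)
      String.send(:include, Version::Cmp)
      Array.send(:include, Version::Sort)
    end
  end
end

# Without the opt-in, the module-level API still works:
p Version.cmp("0.77", "1.3.5")

# Opting in (in real use this would follow require 'version'):
include Version::BuiltinExt
p "0.1.3".version_cmp("0.2.32")
p ["0.22", "0.1", "1.4"].version_sort
```

Code that never includes BuiltinExt leaves String and Array untouched and can keep calling Version.cmp directly.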

– Nikodemus

···

On Tue, 22 Oct 2002, michael libby wrote:

I agree. Ruby API should be idiomatic Ruby, and oftentimes this means that
the entire name / class / module hierarchy should be different.

– Nikodemus

···

On Wed, 23 Oct 2002, Austin Ziegler wrote:

reimplementations of Text::Format and MIME::Types. Ultimately, I
think that the Ruby implementation should have a different API

I’ll throw mine in too; I wrote it for rpkg. This makes Version a
Comparable object. (Have a look at the tests for usage examples.)

class Version
  include Comparable

  def Version.[](*args)
    new(*args)
  end

  def initialize(s, separators = '.-')
    separators = separators.split('')

    items_regex_src = separators.collect {|sep| Regexp.escape(sep)}.join("|")
    seps_regex_src  = separators.collect {|sep| Regexp.escape(sep)}.join

    items_regex = Regexp.compile(items_regex_src)
    seps_regex = /[^#{seps_regex_src}]/

    @v = s.split(items_regex).collect {|n| Integer(n) rescue n}
    @sep = s.gsub(seps_regex, '').split('')
  end

  def to_a
    @v
  end

  def to_s
    s = ''
    @sep.each_with_index do |sep, i|
      s << "#{@v[i]}#{sep}"
    end
    s << "#{@v.last}"
    return s
  end

  def inspect
    @v.inspect
  end

  def [](n)
    @v[n]
  end

  def <=>(other)
    raise unless other.is_a? Version

    comp = 0
    @v.each_with_index do |n, i|
      if n and other[i]
        comp = n <=> other[i]
      elsif n.nil?
        comp = -1
      elsif other[i].nil?
        comp = +1
      else
        raise "This should never happen!"
      end

      if comp == 0
        next
      else
        break
      end
    end

    return comp
  end
end

if $0 == __FILE__
require 'test/unit'

class Version
def separators
  @sep
end
end

class TestVersion < Test::Unit::TestCase
def test_version_to_array
  v = Version['0.1.0']
  assert_equal [0, 1, 0], v.to_a
end

def test_can_access_revision_number 
  v = Version['0.1.0']
  assert_equal 0, v[0]
  assert_equal 1, v[1]
  assert_equal 0, v[2]
end

def test_non_dot_separators 
  v = Version['0.1.0-20021099']
  
  assert_equal 0, v[2]
  assert_equal 20021099, v[3]
end

def test_can_mix_numbers_and_strings
  v = Version['0.1.4-unstable']

  assert_equal 0, v[0]
  assert_equal 1, v[1]
  assert_equal 4, v[2]
  assert_equal 'unstable', v[3]
end

def test_can_compare_equal_versions
  v1 = Version['0.1.0']
  v2 = Version['0.1.0']

  assert_equal v1 <=> v2, 0
  assert v1 == v2
  assert_equal v1, v2
end

def test_can_compare_different_versions
  v1 = Version['0.1.0']
  v2 = Version['0.1.2']

  assert_equal v1 <=> v2, -1
  assert_equal v2 <=> v1, 1
  assert v1 < v2
  assert v2 > v1
end

def test_can_compare_different_versions_with_different_number_of_items 
  v1 = Version['0.1.0']
  v2 = Version['0.1.0.2']

  assert v2 > v1

  v1 = Version['0.1.0']
  v2 = Version['0.1.0-20021010']

  assert v2 > v1
  
  v1 = Version['0.1.0']
  v2 = Version['0.1.0-unstable']

  assert v2 > v1
end

def test_can_reconstruct_version_string 
  s = '0.1.0-unstable-20021010'
  v = Version[s]

  assert_equal s, v.to_s
end

def test_finds_separators 
  s = '0.1.0-unstable-20021010'
  v = Version[s]

  assert_equal ['.', '.', '-', '-'], v.separators
end

def test_unique_version_item_accepted 
  v = Version['0123_done']

  assert_equal ["0123_done"], v.to_a      
end

def test_allow_alphabetic_only_versions 
  v = Version['cvs']

  assert_equal ["cvs"], v.to_a
end

end
end
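
To show what including Comparable buys, here is a small sketch using a simplified stand-in class (SimpleVersion is hypothetical and handles numeric fields only; it is not the Version class above). Defining `<=>` is enough to get sort, max, between? and the comparison operators.

```ruby
# Simplified stand-in for the Version class above: numeric fields only.
class SimpleVersion
  include Comparable
  attr_reader :fields

  def initialize(s)
    @fields = s.split(/[-.]/).map { |f| Integer(f, 10) }
  end

  # Array#<=> compares field by field, then by length.
  def <=>(other)
    fields <=> other.fields
  end

  def to_s
    fields.join(".")
  end
end

versions = %w[1.10 1.2 1.9.1].map { |s| SimpleVersion.new(s) }
sorted = versions.sort.map { |v| v.to_s }
puts sorted.join(" ")
puts versions.max
puts SimpleVersion.new("1.5").between?(versions[1], versions[0])
```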

···

On Tue, Oct 22, 2002 at 09:03:14PM +0900, Nikodemus Siivola wrote:

That would save the step of extending those classes in code.

How about adding that to the Version module inside a BuiltinExt module, so
that to extend the classes you would just:

[and included a great script for a Version class, none of which is
reproduced here]

This is great! Now we can add things like Version#succ, #is_beta?, or
whatever.

I did find some flaws, so I added test cases from the CPAN module docs and
fixed the failed tests. Patch enclosed.

I’m guessing since you wrote most of it, you’d want to maintain it (as a
separate package from rpkg)? If not, I’ll volunteer to take it on.

And what’s up with the English version of RAA? Can’t get to it at all! (Not
a nag, just an alert.)

-michael

ichimunki@greyhound:~/ruby-misc/raa-stuff/sort_versions$ diff -u version_mirra.rb version_mirra_libby.rb
--- version_mirra.rb 2002-10-23 20:16:05.000000000 -0500
+++ version_mirra_libby.rb 2002-10-23 20:15:11.000000000 -0500
@@ -22,6 +22,10 @@
     @v
   end
 
+  def length
+    @v.length
+  end
+
   def to_s
     s = ''
     @sep.each_with_index do |sep, i|
@@ -45,13 +49,13 @@
     comp = 0
     @v.each_with_index do |n, i|
       if n and other[i]
-        comp = n <=> other[i]
+        if n.type == other[i].type
+          comp = n <=> other[i]
+        else
+          comp = n.to_s <=> other[i].to_s
+        end
       elsif n.nil?
-        comp = -1
-      elsif other[i].nil?
-        comp = +1
-      else
-        raise "This should never happen!"
+        comp = +1
       end
 
       if comp == 0
@@ -61,6 +65,10 @@
       end
     end
 
+    if comp == 0
+      comp = @v.length <=> other.length
+    end
+
     return comp
   end
 end
@@ -165,5 +173,57 @@
 
     assert_equal ["cvs"], v.to_a
   end
+
+  #test from CPAN version module
+  def test_CPAN_both_numeric
+    v1 = Version['1.1']
+    v2 = Version['1.2']
+    assert v1 < v2
+
+    v1 = Version['1.1']
+    v2 = Version['1.1.1']
+    assert v1 < v2
+
+    v1 = Version['1']
+    v2 = Version['2']
+    assert v1 < v2
+
+    v1 = Version['1']
+    v2 = Version['0002']
+    assert v1 < v2
+
+    v1 = Version['1.5']
+    v2 = Version['1.06']
+    assert v1 < v2
+  end
+
+  def test_CPAN_string_and_numeric
+    v1 = Version['1.1a']
+    v2 = Version['1.2']
+    assert v1 < v2
+
+    v1 = Version['1.1']
+    v2 = Version['1.1a']
+    assert v1 < v2
+
+    v1 = Version['1.1.a']
+    v2 = Version['1.1a']
+    assert v1 < v2
+
+    v1 = Version['1']
+    v2 = Version['a']
+    assert v1 < v2
+  end
+
+  def test_CPAN_both_strings
+    v1 = Version['a']
+    v2 = Version['b']
+    assert v1 < v2
+  end
+
+  def test_can_count_parts_of_version
+    v = Version['1.2-3.4']
+    assert v.length == 4
+  end
 end
 end

Michael C. Libby x@ichimunki.com http://www.ichimunki.com/ http://www.ichimunki.com/public_key.txt
···

On Wednesday 23 October 2002 04:13, Massimiliano Mirra wrote:

This is great! Now we can add things like Version#succ, #is_beta?, or
whatever.

I did find some flaws, so I added test cases from the CPAN module docs and
fixed the failed tests. Patch enclosed.

Man, I love open source. :) Thanks.

I’m guessing since you wrote most of it, you’d want to maintain it (as a
separate package from rpkg)? If not, I’ll volunteer to take it on.

I prefer to keep a version.rb in the rpkg distribution, as I want rpkg
to be as self contained as possible. Would you be interested in
maintaining a mixin to Version that provides extra functionality such
as the #is_beta? and #succ that you propose?

That way, rpkg would keep requiring ‘rpkg/version’, while users of the
version.rb ‘on steroids’, or those that don’t want to install rpkg,
would just require ‘version’, which includes both the base Version
class (that we’d have to mirror) and the extensions. With such setup,
if ever a Version comes into the standard distribution, no user code
would have to be modified. Mmmh, I know that sounds complicated, but
it’s easier done than said. :)

Massimiliano

···

On Thu, Oct 24, 2002 at 10:31:13AM +0900, michael libby wrote: