Strange unicode regex behavior with Ruby 2.0

Hi,

the following code behaves strangely in Ruby 2.0, and differently from
1.9:

puts "ä".match(/[\p{Word}]/).inspect
puts "ä".match(/[\p{Word}\s]/).inspect

Result on 1.9.3-p194:
#<MatchData "ä">
#<MatchData "ä">

Result on 2.0.0-p0:
#<MatchData "ä">
nil

Any ideas what's going on there?

I have attached the ruby code as a file, in case there are any problems
with email charset conversion.

Thanks!
Andreas

Attachments:
http://www.ruby-forum.com/attachment/8267/test_regex.rb.txt

--
Posted via http://www.ruby-forum.com/.

Hey Andreas,

Lately, there have been some discussions on ruby-core (the mailing list
dedicated to the core implementers of MRI). It's possible that this bug is
being addressed at the moment. These are the most recent messages there:

http://blade.nagaokaut.ac.jp/ruby/ruby-core/53601-53800.shtml#latest

I don't really know what could be causing this difference. :(

-----
Carlos Agarie
Skype: carlos.agarie

Control engineering
Polytechnic School, University of São Paulo, Brazil
Computer engineering
Embry-Riddle Aeronautical University, USA

2013/3/26 Andreas S. <lists@ruby-forum.com>


Quoting Andreas S. (lists@ruby-forum.com):

Result on 1.9.3-p194:
#<MatchData "ä">
#<MatchData "ä">

Result on 2.0.0-p0:
#<MatchData "ä">
nil

Any ideas what's going on there?

If a bug was introduced, it appears to have been fixed already on trunk:

$ ruby -v
ruby 2.1.0dev (2013-03-25 trunk 39928) [x86_64-linux]
$ ruby test_regex.rb
#<MatchData "ä">
#<MatchData "ä">
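To make this kind of comparison easier to repeat on other interpreters, the test can print the Ruby version next to each result. This is only a sketch: the `PATTERNS` name and the labels are made up for illustration, and the two regexes are the ones from the original post.

```ruby
# encoding: utf-8
# Sketch: run the two patterns from the original post and print the
# interpreter version, so outputs from different Rubies can be compared.
# PATTERNS and its labels are hypothetical names, not from test_regex.rb.
PATTERNS = {
  "property only"    => /[\p{Word}]/,
  "property plus \\s" => /[\p{Word}\s]/
}

puts "ruby #{RUBY_VERSION} (patchlevel #{RUBY_PATCHLEVEL})"
PATTERNS.each do |label, pattern|
  # MatchData#inspect on a match, "nil" when the pattern does not match
  puts "#{label}: #{'ä'.match(pattern).inspect}"
end
```

On an affected interpreter the second line should show nil, as in the 2.0.0-p0 output quoted above.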

Carlo

Subject: Strange unicode regex behavior with Ruby 2.0
  Date: Wed 27 Mar 13 04:32:14 +0900

--
  * If the Way and its Virtue had not been set aside,
* K * Carlo E. Prelz - fluido@fluido.as what need would there be
  * to talk so much about love and righteousness? (Chuang-Tzu)

Thanks! Guess I will use a workaround for now and wait for 2.1.
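
One possible workaround is to avoid putting the property and \s into the same bracket expression and use an alternation instead. This is only a sketch: it assumes the alternation form is not affected by the 2.0.0-p0 problem, which nobody in this thread has confirmed, and `WORD_OR_SPACE` is just a name picked for illustration.

```ruby
# encoding: utf-8
# Hypothetical workaround sketch: match a word character OR whitespace
# via alternation instead of the combined class /[\p{Word}\s]/.
# Assumption: the alternation is not hit by the 2.0.0-p0 behavior.
WORD_OR_SPACE = /(?:\p{Word}|\s)/

puts "ä".match(WORD_OR_SPACE).inspect   # word character
puts " ".match(WORD_OR_SPACE).inspect   # whitespace
puts "-".match(WORD_OR_SPACE).inspect   # neither, so no match is expected
```

It is worth running this on the affected interpreter before relying on it.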
