Well, I did replace the regex engine with Oniguruma, and found two bugs:
a) RegEx strings > 16 KB result in an "undefined bytecode (bug)" error. This
happens when I create a regex with
NAMES = <<TEXT
... 3000 names ...
TEXT
s1 = (NAMES.to_a.map { |sWord| sWord.strip }).join('|')
@reTitlePatterns = /(?i)\b(#{s1})\b/
The standard RegEx engine has no problems.
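To narrow down (a) without the real name list, something like the following sketch could be used. It is only an illustration: the generated names and the helpers build_source and probe are made up here, and it is written for a current Ruby, so the syntax would need adapting for 1.8.1. It just reports whether the pattern compiles at each size.

```ruby
# Build the same kind of alternation as in the report, but from
# synthetic names (hypothetical stand-ins for the real NAMES list).
def build_source(count)
  words = (1..count).map { |i| format('name%05d', i) }
  "(?i)\\b(#{words.join('|')})\\b"
end

# Try to compile the pattern for each size and record what happened.
def probe(counts)
  counts.map do |count|
    source = build_source(count)
    begin
      regex = Regexp.new(source)
      # A quick sanity match so we know the compiled regex is usable.
      matched = !(regex =~ 'xx NAME00001 yy').nil?
      { count: count, length: source.length, ok: true, matched: matched }
    rescue RegexpError => e
      { count: count, length: source.length, ok: false, error: e.message }
    end
  end
end

probe([100, 1000, 2000, 3000]).each do |result|
  status = result[:ok] ? 'compiled' : "failed: #{result[:error]}"
  puts "#{result[:count]} names, #{result[:length]} chars: #{status}"
end
```

Stepping the count up and down should show whether the failure really depends on the pattern length or on something else in the name list.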
b) The following code does not work as I expect it to:
VALIDNAMECHAR = '[A-Za-z\xa2\xc0-\xff]'
puts /^(\w{2,}).* (\w{2,}?)$/.match("Hallo Welt") --> "Hallo"
puts /^(\w{2,}).* (#{VALIDNAMECHAR}{2,}?)$/.match("Hallo Welt") --> nil
Again, the standard RegEx engine has no problems. Which one is wrong?
Just replacing "\w" with [A-Za-z\xa2\xc0-\xff] should not cause any
problems...
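For (b), printing the captures instead of the MatchData itself makes the two cases easier to compare. A minimal sketch for a current Ruby, assuming that byte-wise matching (Regexp::NOENCODING, like the /n flag) is the intended reading of the \xNN ranges. The helper captures_for and the two pattern constants are names made up for this example.

```ruby
VALIDNAMECHAR = '[A-Za-z\xa2\xc0-\xff]'

# Same two patterns as above, kept as strings so they can be compiled
# with an explicit encoding option.
WORD_PATTERN  = '^(\w{2,}).* (\w{2,}?)$'
CLASS_PATTERN = "^(\\w{2,}).* (#{VALIDNAMECHAR}{2,}?)$"

# Compile the pattern as raw bytes and return the capture groups,
# or nil when the pattern does not match at all.
def captures_for(source, text)
  match = Regexp.new(source, Regexp::NOENCODING).match(text)
  match && match.captures
end

[WORD_PATTERN, CLASS_PATTERN].each do |source|
  puts "#{source} => #{captures_for(source, 'Hallo Welt').inspect}"
end
```

If both patterns give the same captures here, the difference I see would point at the character-class handling in the replaced engine rather than at the pattern itself.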
Environment: Ruby 1.8.1 with the regex engine replaced by Oniguruma, on
Win32.
Christian