Sorry, I realise I should have probably given some code to exemplify
what's in my head:
str1 = "ABC12A2012"
if str1 =~ /^([a-zA-Z]+)(.*?)(?:KEYWORD)?$/
  p $1, $2 # => "ABC", "12A2012"
end
str2 = "ABC13B2013KEYWORD"
if str2 =~ /^([a-zA-Z]+)(.*?)(?:KEYWORD)?$/
  p $1, $2 # => "ABC", "13B2013"
end
To get it to work I:
* made a non-capturing group, using the (?: ... ) syntax, so the
entire string 'KEYWORD' can be marked as optional.
* made the middle _anything_ matcher (.*) non-greedy (so it doesn't
automatically also capture a trailing KEYWORD) by appending a question
mark
* tacked a $ end-of-line marker after the optional KEYWORD thing, to
ensure that the non-greedy _anything_ matcher actually consumes the
rest of the string; otherwise it is quite happy to match "", because
everything after it is optional
* also put a ^ start-of-line marker, since my if-statement is
simultaneously validating the string as well as extracting the match.
If you only care about extraction, you can leave it out. (Strictly
speaking, in Ruby ^ and $ anchor at line boundaries; use \A and \z if
the string might contain newlines.)
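To see what each of those pieces is doing, here is a small sketch that
removes them one at a time. The variable names (full, greedy, no_anchor,
bare) are just mine for illustration, and it uses String#[] with a
capture index instead of =~ to keep it short:

```ruby
str = "ABC13B2013KEYWORD"

# The full pattern from above: lazy middle group, optional KEYWORD, $ anchor.
full = str[/^([a-zA-Z]+)(.*?)(?:KEYWORD)?$/, 2]

# Greedy middle group: (.*) swallows the trailing KEYWORD as well.
greedy = str[/^([a-zA-Z]+)(.*)(?:KEYWORD)?$/, 2]

# No $ anchor: the lazy group is satisfied by matching nothing at all.
no_anchor = str[/^([a-zA-Z]+)(.*?)(?:KEYWORD)?/, 2]

# No (?: ... ) group: the ? applies only to the final D, so "KEYWOR"
# is still required and a string without it does not match.
bare = "ABC12A2012"[/^([a-zA-Z]+)(.*?)KEYWORD?$/, 2]

p full      # => "13B2013"
p greedy    # => "13B2013KEYWORD"
p no_anchor # => ""
p bare      # => nil
```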
You could also wrap the non-capturing keyword group in capturing
parens if you want to extract the keyword, thus:
str1 = "ABC12A2012"
if str1 =~ /^([a-zA-Z]+)(.*?)((?:KEYWORD)?)$/
  p $1, $2, $3 # => "ABC", "12A2012", ""
end
str2 = "ABC13B2013KEYWORD"
if str2 =~ /^([a-zA-Z]+)(.*?)((?:KEYWORD)?)$/
  p $1, $2, $3 # => "ABC", "13B2013", "KEYWORD"
end
If you don't like using the if-statement structure, you can also use
String#match, which returns a MatchData object. You can access the
groups using array syntax:
str1 = "ABC12A2012"
md = str1.match(/^([a-zA-Z]+)(.*?)((?:KEYWORD)?)$/)
p md[1] # => "ABC"
p md[2] # => "12A2012"
etc.
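One thing to watch with String#match is that it returns nil when the
regexp doesn't match, so it's worth checking before indexing. A rough
sketch (the constant name PATTERN and the "12345" input are made up for
the example):

```ruby
PATTERN = /^([a-zA-Z]+)(.*?)((?:KEYWORD)?)$/

# A matching string: MatchData#captures returns all groups as an array.
md = "ABC13B2013KEYWORD".match(PATTERN)
p md.captures # => ["ABC", "13B2013", "KEYWORD"]
p md[0]       # => "ABC13B2013KEYWORD" (the whole match)

# A string with no leading letters fails the pattern, so match gives nil.
bad = "12345".match(PATTERN)
p bad # => nil
```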
String#split, on the other hand, breaks the string up by chopping out
any parts of the string that match the regexp, and returning the
remaining chunks as an array. For example:
str2.split(/[0-9]+/) # => ["ABC", "B", "KEYWORD"]
That is, it chops out the numbers, and returns the bits in between.
I'd have to think a lot harder about how you're getting what you're
getting with str2, but I'm already pretty sure the actual question can
be resolved using =~ or String#match.
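As a first guess at what split is doing with your regexp, here's a
small experiment. The behaviour I'm relying on is that split includes
captured groups in the array it returns, and that KEYWORD? only makes
the final D optional; the variable names are mine:

```ruby
re = /([a-zA-Z]+)(.*)(KEYWORD?)/

# The regexp matches the whole string, so the only "chunk" is the empty
# text before the match, followed by the three captured groups.
with_keyword = "ABC13B2013KEYWORD".split(re)

# No literal "KEYWOR" here, so the regexp never matches and there is
# nothing to split on.
without_keyword = "ABC12A2012".split(re)

p with_keyword    # => ["", "ABC", "13B2013", "KEYWORD"]
p without_keyword # => ["ABC12A2012"]
```

If that's right, the leading "" is the (empty) text in front of a match
that starts at the very first character, and str1 comes back whole
because the regexp never matches it.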
On 13 June 2012 11:30, cyber c. <lists@ruby-forum.com> wrote:
Hi,
I need to match
str1 = "ABC12A2012"
str2 = "ABC13B2013KEYWORD"
i have to extract,
word1 = ABC, word2 = 12A2012 for str1
word1 = ABC, word2 = 13B2013 for str2
i have used
str.split(/([a-zA-Z]+)(.*)(KEYWORD?)/)
for str2 i get substrings as
"","ABC","12A2012","KEYWORD" -- Why did the nil sub string popped up?
for str1 i get only 1 substring which is the entire string
Any one can spot the problem , please let me know.
--
Posted via http://www.ruby-forum.com/.
--
Matthew Kerwin, B.Sc (CompSci) (Hons)
http://matthew.kerwin.net.au/
ABN: 59-013-727-651
"You'll never find a programming language that frees
you from the burden of clarifying your ideas." - xkcd