Joe Peck wrote:
> Regular expressions can be used to test whether a string belongs to a
> certain regular language, which is a subset of all possible languages
> (where a language is a set of strings). Regular expressions are
> equivalent to finite state automata in this respect. A finite state
> automaton can only be in a finite number of states, but you'd like to
> match a possibly infinitely large number of [joe][/joe] pairs. The FSA
> would need a new state for every extra [joe] it reads to remember it
> still needs to consume a matching [/joe] for it.
>
> If this sounds like Chinese, just remember regexps aren't keen on
> matching this sort of stuff. Stacks on the other hand seem to be custom
> designed for these purposes.
>
> A.
It doesn't sound like Chinese.
It wouldn't have to be an infinite number of states. Let's say these
are the states:
State 1 - no [joe] yet. If it finds [joe], it goes to state 2. If it
finds [/joe], it fails.
State 2 - [joe] found but no matching [/joe] yet. If it finds [joe]
again in this state, it fails. If it finds [/joe], it increments the
count by 1 and moves to state 1.
If the count goes above 3, it fails.
But maybe I'll use something besides a regexp, although I thought there
would be a pretty easy way to do it.
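
A rough sketch of those two states in Ruby, in case it helps. The names joe_pairs_ok? and max_pairs are made up for illustration, and treating an unclosed [joe] at the end of the string as a failure is an assumption that the description above doesn't spell out.

```ruby
# Hypothetical helper: walks the tags using the two states described above.
# max_pairs defaults to 3, from the "if the count goes above 3, it fails" rule.
def joe_pairs_ok?(str, max_pairs = 3)
  state = :outside   # state 1: no [joe] waiting for its [/joe]
  count = 0          # completed [joe] ... [/joe] pairs so far

  str.scan(%r{\[/?joe\]}) do |tag|
    if state == :outside
      return false unless tag == "[joe]"    # a stray [/joe] fails
      state = :inside                       # move to state 2
    else
      return false unless tag == "[/joe]"   # a second [joe] fails
      count += 1
      return false if count > max_pairs
      state = :outside                      # back to state 1
    end
  end

  state == :outside   # assumption: a dangling [joe] at the end also fails
end

p joe_pairs_ok?("ok [joe] ok [/joe] right")    # true
p joe_pairs_ok?("bad [joe] [joe] foo [/joe]")  # false
```

The regexp only finds the tags here; the state and the counter live in ordinary Ruby variables.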
--
Posted via http://www.ruby-forum.com/.
If a regular expression can't do it alone, does that mean we can't use
a regular expression at all?
No. We'll still use a regexp to pull out the tags and add some code to
help it.
If all the pairs are matched, then partitioning the tags into openers
and closers and zipping the two lists back together gives us the
original sequence of tags again.
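[
  "ok [joe] ok [/joe] right",
  "ok [joe] [/joe] [joe] foo [/joe]",
  "bad [joe] [/joe] foo [joe]",
  "bad [joe] [/joe] foo [/joe]",
  "bad [joe] [joe] foo [/joe]",
  "bad [joe] [joe] ",
  "bad [/joe] [joe] ",
  "bad [/joe] [/joe] "
].each { |s|
  # Collect every [joe] and [/joe] tag in order of appearance.
  ary = s.scan( %r{\[/?joe\]} )
  p ary
  # Split the tags into openers and closers, then interleave them again.
  # Only a strictly alternating [joe], [/joe], ... list survives unchanged.
  if ary == ary.partition{|t| "[joe]"==t}.inject{|a,b| a.zip(b)}.flatten
    puts "good"
  else
    puts "bad"
  end
}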
--- output -----
["[joe]", "[/joe]"]
good
["[joe]", "[/joe]", "[joe]", "[/joe]"]
good
["[joe]", "[/joe]", "[joe]"]
bad
["[joe]", "[/joe]", "[/joe]"]
bad
["[joe]", "[joe]", "[/joe]"]
bad
["[joe]", "[joe]"]
bad
["[/joe]", "[joe]"]
bad
["[/joe]", "[/joe]"]
bad
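
The one-line test in the loop packs three steps together. Unpacked for one of the failing inputs, it looks roughly like this; the variable names opens, closes and rebuilt are only for illustration and are not part of the code above.

```ruby
# Tags scanned from "bad [joe] [joe] foo [/joe]"
ary = ["[joe]", "[joe]", "[/joe]"]

# Step 1: partition splits the tags into openers and everything else.
opens, closes = ary.partition { |t| t == "[joe]" }
# opens  => ["[joe]", "[joe]"]
# closes => ["[/joe]"]

# Step 2: zip pairs the i-th opener with the i-th closer,
# padding with nil when the closers run out.
# Step 3: flatten turns the pairs back into a single list.
rebuilt = opens.zip(closes).flatten
# rebuilt => ["[joe]", "[/joe]", "[joe]", nil]

# The rebuilt list differs from the scanned one, so the string is rejected.
puts(ary == rebuilt ? "good" : "bad")   # prints "bad"
```

The same comparison also rejects nested tags such as [joe] [joe] [/joe] [/joe], because the rebuilt list always alternates opener, closer, opener, closer.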