this is not exactly a ruby problem, i hope people will not consider
this spamming.
does anyone know of any reference or an algorithm to build a regular
expression (a finite automaton, in general) that would match a given
set of strings? i of course mean a regexp that would match just the
given strings, and not any other. so solutions like /^.*$/ are not
acceptable
this is not exactly a ruby problem, i hope people will not consider
this spamming.
does anyone know of any reference or an algorithm to build a regular
expression (a finite automaton, in general) that would match a given
set of strings? i of course mean a regexp that would match just the
given strings, and not any other. so solutions like /^.*$/ are not
acceptable
thanks
konstantin
nice challenge, reminds me a bit of the 4th rubyquiz :>
ako... schrieb:
>does anyone know of any reference or an algorithm to build a regular
>expression (a finite automaton, in general) that would match a given
>set of strings? i of course mean a regexp that would match just the
>given strings, and not any other. so solutions like /^.*$/ are not
>acceptable
This is essentially an information compression problem. There's
always a trivial solution:
s = %w{find only these strings}
r = Regexp.new(s.map {|w| '\A' + w + '\Z'}.join('|'))
The _real_ problem is finding the shortest possible regexp.
i suspect that from your trivial solution we can arrive at the final
solution by just performing a DFA minimization. the same thing that lex
and yacc do when they generate code. so i guess i have answered my own
question. thanks for helping me. your trivial solution just made my
brain work.
i suspect that from your trivial solution we can arrive at the final
solution by just performing a DFA minimization. the same thing that
lex and yacc do when they generate code. so i guess i have answered
my own question. thanks for helping me. your trivial solution just
made my brain work.
Another option is to build up a tree and create the RX from that. I once
wrote that: