In search of a kind of converse to sub/gsub

Dear all,

while reading the thread about ggsub, I recalled a problem I once tried to
code but abandoned for lack of time.
Now, if you can't do it yourself, why not let others work for you :wink: ?

When correcting a text for typos, one can minimize the Levenshtein distance
(the number of single-character insertions, deletions and substitutions
needed to turn one string into another) between the misspelled word and
each entry in a list of correctly spelled words.
However, many words are short, and there are a lot of "correct" English
words you can produce by, say, changing three letters in a three-letter word.
Yet typos are much more specific: they are essentially transpositions
('ab' -> 'ba') or misspellings that come from hitting nearby keys on the
keyboard.
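
To make the distinction concrete, here is a minimal sketch of an edit distance
that counts an adjacent transposition as a single change (the "optimal string
alignment" variant of Levenshtein). The method name typo_distance is made up
for this illustration, and keyboard-neighbour weighting is not attempted.

```ruby
# Edit distance where swapping two adjacent characters costs 1.
# typo_distance is a hypothetical name, not part of any library.
def typo_distance(a, b)
  rows = a.length + 1
  cols = b.length + 1
  # d[i][j] = distance between the first i chars of a and the first j chars of b
  d = Array.new(rows) { |i| Array.new(cols) { |j| i.zero? ? j : (j.zero? ? i : 0) } }

  (1...rows).each do |i|
    (1...cols).each do |j|
      cost = a[i - 1] == b[j - 1] ? 0 : 1
      d[i][j] = [
        d[i - 1][j] + 1,        # deletion
        d[i][j - 1] + 1,        # insertion
        d[i - 1][j - 1] + cost  # substitution (or match)
      ].min
      # adjacent transposition: 'ab' -> 'ba'
      if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1]
        d[i][j] = [d[i][j], d[i - 2][j - 2] + 1].min
      end
    end
  end
  d[a.length][b.length]
end
```

With this, "hte" is one step away from "the", whereas plain Levenshtein
distance would count two substitutions.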

Now I am too lazy to identify all these possibilities and I am searching for
some method of the form

class String
  def find_what_to_replace(other)
    ...
    return replace_what_substring, replace_by
  end
end

such that for

replace_what_substring, replace_it_by = string1.find_what_to_replace(string2)

the statement

string2 == string1.gsub(replace_what_substring,replace_it_by)

holds true.
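
One possible way to sketch such a method, offered as an assumption rather than
a finished solution: strip the common prefix and suffix of the two strings,
take the differing middle parts as the candidate pair, and widen the pattern
until gsub reproduces the target (gsub replaces every occurrence, so a short
pattern may match in more places than intended). The sketch assumes the
replacement contains no backslashes, which gsub would interpret specially.

```ruby
class String
  # Sketch only: returns [what, by] such that gsub(what, by) == other,
  # assuming `other` contains no backslash sequences.
  def find_what_to_replace(other)
    max = [length, other.length].min

    # length of the common prefix
    prefix = 0
    prefix += 1 while prefix < max && self[prefix] == other[prefix]

    # length of the common suffix, not overlapping the prefix
    suffix = 0
    suffix += 1 while suffix < max - prefix && self[-1 - suffix] == other[-1 - suffix]

    start = prefix          # first index of the differing part in self
    stop  = length - suffix # one past its last index in self

    loop do
      what = self[start...stop]
      by   = other[start...(other.length - (length - stop))]
      whole = start.zero? && stop == length
      # accept when gsub reproduces other, or when nothing is left to widen
      return [what, by] if whole || gsub(what, by) == other
      # otherwise add one character of context, left first, then right
      if start > 0
        start -= 1
      else
        stop += 1
      end
    end
  end
end
```

For a transposition such as "hte" versus "the" this yields the pair "ht" and
"th"; for "banana" versus "banena" the pattern has to grow to "ana" before a
global replacement gives the intended result.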

Has anybody done that already?

Best regards,

Axel

Axel asked for a method find_what_to_replace

... such that for

replace_what_substring, replace_it_by = string1.find_what_to_replace(string2)

the statement

string2 == string1.gsub(replace_what_substring,replace_it_by)

holds true.

Has anybody done that already?

Does Diff::LCS do what you want? It returns the differences between strings.
(You can get it from http://rubyforge.org/frs/?group_id=84 or via RubyGems.)

Cheers,
Dave