Hi all, I'm Ikoma, and this is my first post to this mailing list.
I'm very pleased to announce the first public release of regextest,
version 0.1.5, on RubyGems / Bitbucket.
It generates sample strings that match a specified regular expression.
require "regextest"
/\d{5}/.sample #=> "62853"
5.times.map { /\w{5}/.sample } #=> ["mCcA5", "1s3Ae", "9HYbe", "x3T0A", "TJHlQ"]
Unlike other similar tools/libraries (*1), it recognizes anchors (\b, \A,
\z, etc.), Unicode character classes (Hiragana, Han, Hangul, Tamil,
Kannada, etc.), and extended groups.
/.\b.\b.\b.\b\w/.sample #=> "W!m;4"
/[\p{greek}&&\p{upper}]+/.sample #=> "ΥΣΈΥΪΕΓΧΖ"
/(?x) G (o O(?-x)oO) g L/.sample #=> "GoOoOgL"
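A useful property to keep in mind when reading these samples is that each one should be accepted in full by the regex that produced it. The sketch below checks this with core Ruby only (it does not call regextest). The sample strings are copied from this post, and `full_match?` is a helper name made up for illustration.

```ruby
# Sample strings copied from this post, keyed by the regex that produced them.
posted_samples = {
  /\d{5}/                   => "62853",
  /.\b.\b.\b.\b\w/          => "W!m;4",
  /[\p{greek}&&\p{upper}]+/ => "ΥΣΈΥΪΕΓΧΖ",
  /(?x) G (o O(?-x)oO) g L/ => "GoOoOgL",
}

# Hypothetical helper: true when the regex matches the whole string,
# not just a part of it.
def full_match?(regex, string)
  m = regex.match(string)
  !m.nil? && m[0] == string
end

posted_samples.each do |regex, sample|
  puts "#{regex.inspect} accepts #{sample.inspect}: #{full_match?(regex, sample)}"
end
```

The same kind of check could be run over freshly generated samples as a quick sanity test of a pattern.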
In addition, it can generate strings from Ruby's "not-so-regular"
expressions, such as look-ahead/look-behind, conditionals, reluctant
repeats, and the so-called "Tanaka Akira special" (*2). That is, with this
library you can generate sample strings of any language (XML, etc.) if you
can describe the syntax of the language as a regex :-).
/(?=[a-z])\w{5}(?<=_\d)/.sample #=> "nCc_0"
palindrome = /\A(?<a>|.|(?:(?<b>.)\g<a>\k<b+0>))\z/
palindrome.sample #=> "a]r\\CC\\r]a"
xml = Regexp.compile(<<'__REGEXP__'.strip, Regexp::EXTENDED)
(?<element> \g<stag> \g<content>* \g<etag> ){0}
(?<stag> < \g<name> \s* > ){0}
(?<name> [a-zA-Z_:]+ ){0}
(?<content> [^<&]+ (\g<element> | [^<&]+)* ){0}
(?<etag> </ \k<name+1> >){0}
\g<element>
__REGEXP__
xml.sample #=> "<NB\v>b/e}:<un\r>\"A<a\r>[<IL \r\n></IL>o(</a>q</un></NB>"
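The XML regex above relies on an Onigmo idiom: `(?<name> ... ){0}` defines a named subexpression without matching anything at that point, and `\g<name>` calls it, possibly recursively. As a smaller illustration of the same idiom, here is a sketch that accepts balanced parentheses. The pattern and the names `paren` and `balanced` are made up for this example and are not part of regextest.

```ruby
# (?<paren> ... ){0} only defines the group; \g<paren> calls it, and the
# call inside the definition makes the pattern recursive.
balanced = /
  (?<paren> \( (?: [^()] | \g<paren> )* \) ){0}
  \A \g<paren> \z
/x

["(a(b)c)", "()", "(a(b)c", "a(b)"].each do |text|
  puts "#{text.inspect}: #{balanced.match?(text)}"
end
```

The first two strings are accepted and the last two are rejected, because the whole input must be a single balanced group.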
You can try the sample application ( http://goo.gl/5miiF4 ) without
installing anything or writing a script. It provides a Rubular(*3)-like
regex testing service. The difference, as you will notice, is that your
regexes can be checked without entering test strings.
The objective of the tool is to show what kinds of strings a specified
regex can match. Therefore, its main use case is probably testing.
Speaking for myself, I was very surprised that some of my regexes matched
unexpected strings, and sometimes those matches were harmful to the target
program.
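To make the testing use case concrete, here is a minimal sketch of how generated samples might be screened. It assumes the samples would come from `Regexp#sample` as shown earlier in this post, but to stay runnable without the gem it works on a hard-coded list. The names `loose_ipv4`, `valid_ipv4?`, and `surprising_samples` are hypothetical.

```ruby
# A loose pattern that looks like an IPv4 matcher but accepts octets above 255.
loose_ipv4 = /\A\d{1,3}(?:\.\d{1,3}){3}\z/

# Hypothetical stricter check that the target program is assumed to need.
def valid_ipv4?(text)
  octets = text.split(".", -1)
  octets.size == 4 && octets.all? { |o| o.match?(/\A\d{1,3}\z/) && o.to_i <= 255 }
end

# Return the samples that the regex accepts but the stricter check rejects.
def surprising_samples(regex, samples)
  samples.select { |s| regex.match?(s) && !yield(s) }
end

# With regextest installed, the list could instead be built with something
# like: samples = 20.times.map { loose_ipv4.sample }
samples = ["10.0.0.1", "999.12.7.300", "192.168.1.255", "256.1.1.1"]

surprising = surprising_samples(loose_ipv4, samples) { |s| valid_ipv4?(s) }
puts surprising.inspect  # prints ["999.12.7.300", "256.1.1.1"]
```

Any string that ends up in `surprising` is one the regex lets through even though the program behind it would not handle it correctly.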
*NOTE*: it is impossible to generate a string from an arbitrary regex
within a given time period (the same is true of matching with the Regexp
class). The tool returns an error if it fails to generate a string. As of
now, there are some major (and many minor) restrictions / bugs.
See the issue tracker
( https://bitbucket.org/ikomamik/regextest/issues?status=new&status=open )
for more details. I would like to improve functionality / reliability on
demand.
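Because generation can fail or take too long, callers may want a guard around it. The sketch below is one way to do that with Ruby's standard `timeout` library. It assumes a failure surfaces as an exception (the post only says that the tool returns an error), and `sample_or_nil` is a hypothetical helper, not part of regextest.

```ruby
require "timeout"

# Hypothetical helper: run the given block and return its value, or nil if
# the block raises or does not finish within limit_seconds.
def sample_or_nil(limit_seconds: 2)
  Timeout.timeout(limit_seconds) { yield }
rescue Timeout::Error, StandardError => e
  warn "sample generation failed: #{e.class}: #{e.message}"
  nil
end

# With regextest installed, the block would be something like { regex.sample }.
ok       = sample_or_nil { "62853" }
failed   = sample_or_nil { raise ArgumentError, "unsupported syntax" }
too_slow = sample_or_nil(limit_seconds: 0.1) { sleep 1 }

puts [ok, failed, too_slow].inspect
```

Returning nil keeps a test loop going when a single pattern cannot be handled; a stricter caller could re-raise instead.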
Homepage: https://bitbucket.org/ikomamik/regextest
License: 2-clause BSD license
View on registry: https://rubygems.org/gems/regextest
Documentation: http://www.rubydoc.info/gems/regextest/0.1.5
Any comments, reports, or PRs are very welcome.
It would be my great pleasure if this tool helps improve your productivity.
Enjoy!
Mikio Ikoma
*1 Similar tools/libraries
- String-Random at CPAN
http://search.cpan.org/~shlomif/String-Random-0.29/lib/String/Random.pm
- SDL Regex Fuzzer of Microsoft
https://www.microsoft.com/en-us/download/details.aspx?id=20095
and many others
*2 "Not-so-regular" expression:
https://github.com/k-takata/Onigmo/blob/master/doc/RE
*3 Rubular:
http://rubular.com/