[ANN] regextest 0.1.5 Released!

Mikio_Ikoma · 3 September 2016 22:57

# Very sorry for the double posting, if you've received this.

Hi all, I'm Ikoma, and this is my first post to this ML.

I'm very pleased to announce the first public release of
regextest 0.1.5 on Rubygems / BitBucket.

It generates a sample string that matches a specified regular
expression.

  require "regextest"

  /\d{5}/.sample #=> "62853"
  5.times.map{/\w{5}/.sample} #=> ["mCcA5", "1s3Ae", "9HYbe", "x3T0A", "TJHlQ"]

Unlike other similar tools/libraries(*1), it recognizes anchors (\b, \A, \z,
etc.), Unicode character classes (Hiragana, Han, Hangul, Tamil, Kannada, etc.),
and extended groups.

  /.\b.\b.\b.\b\w/.sample #=> "W!m;4"
  /[\p{greek}&&\p{upper}]+/.sample #=> "ΥΣΈΥΪΕΓΧΖ"
  /(?x) G (o O(?-x)oO) g L/.sample #=> "GoOoOgL"
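For readers without the gem installed, the outputs quoted above can be sanity-checked with stock Ruby. This is only a sketch: it treats the strings printed in the announcement as fixed data (real `sample` output is random) and checks each one against its pattern with `Regexp#match?`. The name `QUOTED_SAMPLES` is made up for illustration and is not part of regextest.

```ruby
# Pattern / output pairs copied from the announcement.
# QUOTED_SAMPLES is an illustrative name, not part of regextest.
QUOTED_SAMPLES = [
  [/\d{5}/,                   "62853"],
  [/.\b.\b.\b.\b\w/,          "W!m;4"],
  [/[\p{greek}&&\p{upper}]+/, "ΥΣΈΥΪΕΓΧΖ"],
  [/(?x) G (o O(?-x)oO) g L/, "GoOoOgL"]
]

# Report whether each quoted output is accepted by its own pattern.
QUOTED_SAMPLES.each do |pattern, text|
  puts "#{pattern.inspect} matches #{text.inspect}: #{pattern.match?(text)}"
end
```

The same loop could presumably be run over fresh `sample` results once the gem is installed, which would turn it into a simple round-trip check.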

In addition, it can generate strings from Ruby's "not-so-regular"
expressions, such as look-ahead/look-behind, conditionals, reluctant
repetition, and the so-called "Tanaka Akira special"(*2). That is, with
this library you can generate sample strings of any language
(XML etc.), provided you can describe the syntax of that language as
a regex :-).

  /(?=[a-z])\w{5}(?<=_\d)/.sample #=> "nCc_0"

  palindrome = /\A(?<a>|.|(?:(?<b>.)\g<a>\k<b+0>))\z/
  palindrome.sample #=> "a]r\\CC\\r]a"

  xml = Regexp.compile(<<'__REGEXP__'.strip, Regexp::EXTENDED)
    (?<element> \g<stag> \g<content>* \g<etag> ){0}
    (?<stag> < \g<name> \s* > ){0}
    (?<name> [a-zA-Z_:]+ ){0}
    (?<content> [^<&]+ (\g<element> | [^<&]+)* ){0}
    (?<etag> </ \k<name+1> >){0}

    \g<element>
  __REGEXP__
  xml.sample #=> "<NB\v>b/e}:<un\r>\"A<a\r>[<IL\r\n></IL>o(</a>q</un></NB>"

You can try the sample application ( http://goo.gl/5miiF4 ) without
installing anything or writing a script. It provides a Rubular(*3)-like
regex testing service. The difference, as you will notice, is that your
regexes can be checked without entering test strings.

The objective of the tool is to show what kinds of strings a
specified regex can match. Therefore, the main use case of the
tool is probably testing. Speaking for myself, I was very surprised
that some of my regexes could match unexpected strings, and that
some of those strings were harmful to the target program.
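A well-known Ruby instance of this kind of surprise, given here as an illustration rather than as one of the author's own cases: `^` and `$` anchor at line boundaries, so a validation regex written with them accepts multi-line input that `\A` and `\z` reject. The names `loose` and `strict` and the input string are invented for this sketch.

```ruby
loose  = /^\d+$/     # anchors at the start/end of any line
strict = /\A\d+\z/   # anchors at the start/end of the whole string

# Hypothetical hostile input: a valid first line followed by a payload.
input = "42\n<script>alert(1)</script>"

puts loose.match?(input)   # true: the first line alone satisfies the pattern
puts strict.match?(input)  # false
```

Seeing a generated sample that contains a newline would be one way to notice such a problem before it reaches production.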

*NOTE*: it is impossible to generate a string from every regex
within a given time period (the Regexp class faces the same
difficulty when analyzing a regex). The tool returns an error if it
fails to generate a string. As of now, there are some major (and many
minor) restrictions / bugs. See the issue tracker
( https://bitbucket.org/ikomamik/regextest/issues?status=new&status=open )
for more details. I would like to improve functionality /
reliability on demand.

Homepage: https://bitbucket.org/ikomamik/regextest
License: 2-clause BSD license
View on registry: https://rubygems.org/gems/regextest
Documentation: http://www.rubydoc.info/gems/regextest/0.1.5

Any comments, reports, or PRs are very welcome.

I would be very pleased if this tool helps you improve your
productivity.

Enjoy!

Mikio Ikoma

*1 Similar tools/libraries
- String-Random at CPAN
http://search.cpan.org/~shlomif/String-Random-0.29/lib/String/Random.pm
- SDL Regex Fuzzer of Microsoft
https://www.microsoft.com/en-us/download/details.aspx?id=20095
and many others

*2 "Not-so-regular" expression:
https://github.com/k-takata/Onigmo/blob/master/doc/RE

*3 Rubular:
http://rubular.com/

Victor_Shepelev · 4 September 2016 07:25

Hey, really cool!

You may be interested: I've posted a link to Reddit, and there are
some interesting comments there:
https://www.reddit.com/r/ruby/comments/50zv03/regextest_generates_sample_string_that_matches/

Unsubscribe: <mailto:ruby-talk-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-talk>

Robert_K1 · 5 September 2016 06:12

Thank you for sharing! I struggle to understand the utility of this.
Can you explain a bit more how you imagine this to be used during
testing - especially with the limitation that sample generation will
not always be successful? Thank you!

Kind regards

robert


--
[guy, jim, charlie].each {|him| remember.him do |as, often| as.you_can
- without end}
http://blog.rubybestpractices.com/

Mikio_Ikoma · 4 September 2016 08:40

Hi Victor!

Thank you very much for posting to Reddit. Yes, Reddit is the right place to
discuss this. I had forgotten about Reddit since it is not so popular here
(in Japan). I will respond to the comments there too. (In fact, I was
disappointed that I failed to post the release announcement last night.)

Again, thank you very much for your interest in my tool and for letting it
go west!

Cheers!

Ikoma


Mikio_Ikoma · 5 September 2016 15:06

Thank you for taking the time to think about use cases for my tool.
Yes, I would like to discuss how to use the tool here.

Let's start with the limitation that the tool sometimes fails to generate a
string. Yes, that is true. However, the failure rate is not so high. As of
now, it fails on 5.5% of Onigmo's test suite (which contains many strange
regexes of a kind we rarely see elsewhere), and the rate is getting lower
day by day.

When I started to develop the tool, I wrote a manifesto (in Japanese).
The first two items are:

- The tool is for ordinary programmers / testers, rather than regex
implementors or researchers.
- The tool shall return proper string(s) that match the specified regex. If
it fails, it shall return a proper message to the user.

By the end of this year, I expect many programmers / testers will be able to
use the tool without any trouble for learning, programming and testing.

Ah, sorry, I forgot your question: how to use it in testing. I think there
are at least two cases.

1. Unit testing
Many regexes in code are tested only against intended strings, either by
in-house testing or with Rubular or a similar regex tester. In the field,
however, many unexpected strings arrive from the real world. Regextest can be
used in your test program, or as the engine of an existing regex tester. Please
check my sample implementation of a regex tester ( http://goo.gl/5miiF4 ).
Furthermore, like Microsoft's SDL Regex Fuzzer, regextest can be used to
detect problematic regexes such as /(\d+)+\1$/. Regextest reports a timeout
exception for such a regex.

2. Functional testing
In my experience, many functional specifications define formats using
regex notation (even if the implementation doesn't use regexes). In this case,
the regexes are usually simple, and it is easy to use Regextest for
functional testing, for example:

/[A-Z]\w{0,31}/.enumerate.each do | var_name |
  do_test(var_name)
end
# NOTE: enumerate method is not implemented yet!
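Since `enumerate` does not exist yet, here is a rough stand-in in plain Ruby for this one pattern. It is hand-written for `/[A-Z]\w{0,n}/` and says nothing about how regextest works or will work; `each_identifier`, `HEAD_CHARS` and `TAIL_CHARS` are hypothetical names, and `\w` is taken to be the ASCII set `[a-zA-Z0-9_]`.

```ruby
HEAD_CHARS = ("A".."Z").to_a
TAIL_CHARS = [*"a".."z", *"A".."Z", *"0".."9", "_"]

# Yields every string matching /\A[A-Z]\w{0,max_tail}\z/, shortest first.
# Returns an Enumerator when called without a block, so callers can stop early.
def each_identifier(max_tail)
  return enum_for(:each_identifier, max_tail) unless block_given?

  (0..max_tail).each do |tail_len|
    HEAD_CHARS.each do |head|
      # repeated_permutation(0) yields one empty array, giving the bare head.
      TAIL_CHARS.repeated_permutation(tail_len) do |tail|
        yield head + tail.join
      end
    end
  end
end

# The full {0,31} space is far too large to exhaust, so only take a prefix.
each_identifier(31).first(5).each { |name| puts name }
```

Enumerating shortest-first keeps the early values readable, but it also means long names are never reached in practice, so a real test would probably combine a short prefix with some random longer samples.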

If you want, it is comparatively easy to implement all-pairs (pairwise)
testing, as below.

reg = /(?<os>   Linux  | Windows | MacOS  ){0}
       (?<lang> C      | ja      | zh     ){0}
       (?<brwz> Chrome | IE      | Safari ){0}

       test \g<os>, \g<lang>, \g<brwz>
      /x
reg.enumerate.pairs(2).each do | a_test |
  do_test(a_test)
end
# NOTE: enumerate/pairs methods are not implemented yet!
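To make the intent of the (not yet implemented) `enumerate.pairs(2)` more concrete: all-pairs testing picks a subset of the full cross product such that every pair of values taken from two different parameters appears in at least one test. The sketch below does this greedily in plain Ruby. It is one common approach, not regextest's algorithm, and `PARAMS`, `pairs_of` and `pairwise_tests` are hypothetical names.

```ruby
PARAMS = {
  os:   %w[Linux Windows MacOS],
  lang: %w[C ja zh],
  brwz: %w[Chrome IE Safari]
}

# All two-element combinations in one test case; each value is tagged with
# its position so that equal strings in different parameters stay distinct.
def pairs_of(test)
  test.each_with_index.to_a.combination(2).to_a
end

# Greedy all-pairs: repeatedly pick the candidate covering the most uncovered pairs.
def pairwise_tests(params)
  first, *rest = params.values
  candidates = first.product(*rest)
  uncovered = candidates.flat_map { |t| pairs_of(t) }.uniq
  chosen = []
  until uncovered.empty?
    best = candidates.max_by { |t| (pairs_of(t) & uncovered).size }
    chosen << best
    uncovered -= pairs_of(best)
  end
  chosen
end

tests = pairwise_tests(PARAMS)
tests.each { |os, lang, brwz| puts "test #{os}, #{lang}, #{brwz}" }
puts "#{tests.size} tests instead of #{PARAMS.values.map(&:size).reduce(:*)}"
```

The greedy choice does not guarantee the smallest possible set, but it always terminates with every pair covered, which is the property that matters for this kind of testing.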

Although the latter case may be impractical for testers, I hope this shows
that the tool can be used in many cases, even in the functional testing phase.

I would be very glad if this helps.

Cheers!


Doug_ClymerOlson · 5 September 2016 20:26

I can see the value of using this as a method to generate randomized/fuzz
input for a function.

For (a contrived) example, the following regex:

^(GET|POST) [\w\/\-]+ (HTTP\/([1-3](\.\d)?))$

Could be used to generate a list of strings:

POST -jvKm6NULzjuV-XD HTTP/3.7
GET jf-qbgzYuslC HTTP/2
POST wRhIXV/FuLNeQZMH HTTP/3.3
POST d4avtPkvMuSNoAwc/8bZDy_hR HTTP/2.0
...

Which could be used to test my contrived function:

def decode_http_request(request)
  request =~ /^(\w+) (\S+) ([A-Z]+)(\/([\d\.]+))/
  { method: $1.downcase.to_sym, path: $2, protocol: $3, version: $5 }
end

Given a more complex scenario, it would be helpful to have input for tests
generated by a pattern rather than relying on my ability to think of all
the possible combinations (though I think this is still important to
attempt). It would be good to use a much larger set of tests than what I
showed above.
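Putting the pieces above together as a runnable sketch: the four strings listed above are used as fixed inputs (with regextest installed they could presumably come from calling `sample` on the generator pattern instead), and each one is fed to the function. `GENERATOR` and `SAMPLES` are illustrative names, and the function is rewritten slightly to return `nil` on input it cannot parse.

```ruby
GENERATOR = /^(GET|POST) [\w\/\-]+ (HTTP\/([1-3](\.\d)?))$/

# The example strings from the post, used here as fixed test data.
SAMPLES = [
  "POST -jvKm6NULzjuV-XD HTTP/3.7",
  "GET jf-qbgzYuslC HTTP/2",
  "POST wRhIXV/FuLNeQZMH HTTP/3.3",
  "POST d4avtPkvMuSNoAwc/8bZDy_hR HTTP/2.0"
]

def decode_http_request(request)
  match = /^(\w+) (\S+) ([A-Z]+)(\/([\d\.]+))/.match(request)
  return nil unless match  # not a recognizable request line

  { method: match[1].downcase.to_sym, path: match[2],
    protocol: match[3], version: match[5] }
end

SAMPLES.each do |line|
  raise "sample does not fit the generator pattern" unless GENERATOR.match?(line)
  p decode_http_request(line)
end
```

Using `MatchData` instead of `$1`..`$5` avoids a `NoMethodError` on `nil` when the input does not match, which is exactly the kind of case generated input tends to expose.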

Doug


Robert_K1 · 5 September 2016 15:29

> By the end of this year, I expect many programmers / testers can use the
> tool without any troubles for learning, programming and testing.

We'll see.

> Ah, sorry, I forgot your question, how to use it in testing. I think there
> are two or more cases.
>
> 1. Unit testing
> Many regexes in code are tested by intended strings by in-house testing or
> by using Rubular or similar regex tester. However, in the field, many
> unexpected strings are inputting from real world. Regextest can be used in
> your test program or used as an engine of existing regex tester. Please
> check my sample implementation of regex tester ( http://goo.gl/5miiF4 ).
> Furthermore, as Microsoft's SDL Regex Fuzzer is, regextest can be use to
> detect fuzzing codes such as /(\d+)+\1$/. Regextest return timeout exception
> for such fuzzing code.
>
> 2. Functional testing
> As far as I'm concerned, many functional documents define the spec using
> regex notation (even if the implementation don't use regex). In this case,
> the described regexes are usually simple and easy to use Regextest for
> functional testing such as
>
> /[A-Z]\w{0,31}/.enumerate.each do | var_name |
> do_test(var_name)
> end
> # NOTE: enumerate method is not implemented yet!

Makes sense.

> If you want, it is comparatively easy to implement all-pairs (pairwise)
> testing as below.
>
> reg = /(?<os> Linux | Windows | MacOS ){0}
>         (?<lang> C | ja | zh ){0}
>         (?<brwz> Chrome| IE | Saffari){0}
>
>         test \g<os>, \g<lang>, \g<brwz>
>        /x
> reg.enumerate.pairs(2).each do | a_test |
>    do_test(a_test)
> end
> # NOTE: enumerate/pairs methods are not implemented yet!

I am not sure I grok the intention of the code above.

> Although the latter case may be impractical for testers, I hope you may
> understand you can use it in many cases even in functional testing phase.
>
> I'm very glad if this helps.

Thanks for taking the time to write this! I can see some value in
generating a set of (potentially long) strings that are supposed to
match and throwing it at the regex to see how it behaves (i.e.
properly matches in reasonable time).

To me it seems the main functional testing for regular expressions
usually consists of providing a set of strings that the expression is
supposed to match and another set which it is not supposed to match to
make sure the regexp does what it is supposed to do. Usually you want
to carefully choose strings to match and not match based on knowledge
about edge cases and the use case's requirements for matching.

Kind regards

robert


Robert_K1 · 5 September 2016 21:14

> Given a more complex scenario, it would be helpful to have input for tests
> generated by a pattern rather than relying on my ability to think of all the
> possible combinations (though I think this is still important to attempt).

Exactly: a human thinks of legal and illegal sequences and uses these
to test against.

> It would be good to use a much larger set of tests than what I showed above.

But why would you use a different regex than the one in the method to
generate strings? If one does not reuse the regex in the method then
one could use any kind of generator to create all these strings. Or is
the point that a regex makes for a very compact generator?

I should probably get some sleep. Maybe tomorrow I'll understand.

Kind regards

robert


Mikio_Ikoma · 5 September 2016 22:25

> But why would you use a different regex than the one in the method to
> generate strings?

The example Doug showed is a case of functional testing. The test is very
simple, but it could be used to examine many implementations of HTTP header
analysis.

By the way, Klemme-san, how about

a_regex.enumerate.bounds.each {| edge_value | ....}

a_regex.enumerate.out_bounds.each {| invalid_value | ....}

This will be possible in the near future.
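As a guess at what such `bounds` / `out_bounds` methods might return (they do not exist, so everything here is hypothetical), consider the pattern `/\A[A-Z]{2,10}\z/`. Boundary-value analysis would pick strings at the edges of the character range and of the length range, plus strings just outside them. The values below are chosen by hand; `PATTERN`, `in_bounds` and `out_bounds` are illustrative names.

```ruby
PATTERN = /\A[A-Z]{2,10}\z/

# Hand-picked edge values; a tool would derive these from the pattern itself.
in_bounds  = ["A" * 2, "Z" * 2, "A" * 10, "Z" * 10]
out_bounds = ["A" * 1, "A" * 11,   # one too short / one too long
              "@" * 2, "[" * 2]    # the characters just before "A" and just after "Z"

in_bounds.each  { |s| puts "#{s.inspect} accepted: #{PATTERN.match?(s)}" }
out_bounds.each { |s| puts "#{s.inspect} accepted: #{PATTERN.match?(s)}" }
```

The out-of-bounds half is the part that a plain sample generator cannot provide, since by definition it only produces matching strings.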

Regarding equivalence partitioning, the tool can take a character range
such as [A-Z] or a length such as {2,10} into account. However, there is no
(simple) means to express 0..255. It would be better if regex supported a
syntax such as (?{0..255}) (which would generate 0/1/2/.../255).
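For comparison, this is what 0..255 looks like when spelled out as an ordinary regex today, which illustrates why a dedicated range syntax would be convenient. The pattern is the usual hand-written decomposition by digit count, and `BYTE` is just an illustrative name.

```ruby
# 250-255 | 200-249 | 100-199 | 0-99 (no leading zeros)
BYTE = /\A(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)\z/

accepted = (0..300).select { |n| BYTE.match?(n.to_s) }
puts accepted.first  # 0
puts accepted.last   # 255
puts accepted.size   # 256
```

Every numeric range needs its own decomposition of this kind, so the regex says little about the intended bounds to a reader or to a tool trying to find partition edges.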

Cheers!

