Hi

How can I build the regex

/(a and (b or c))/ # (in one term)

Is there an 'and' symbol, like | ?

Opti

In regexps, "and" is implicit. For example:

/foo/

means: find an "f" somewhere, followed by an "o", in turn followed by another "o".

Hi!

All consecutive tokens in a regex are implicitly joined with your 'and' symbol.

/ab/ matches 'ab'

/ac/ matches 'ac'

/a[bc]/ matches 'ab' and 'ac'
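
A minimal Ruby sketch of the implicit "and" described above (the sample strings and the helper name `matching` are made up for illustration): concatenated tokens must appear one directly after the other, and a character class or a group with | supplies the "or".

```ruby
SAMPLES = %w[ab ac ad ba].freeze

# Hypothetical helper: the samples that a given regex matches.
def matching(regex)
  SAMPLES.select { |s| s.match?(regex) }
end

# "a", directly followed by "b" or "c" (character class).
p matching(/a[bc]/)     # => ["ab", "ac"]

# The same condition written with a group and the | operator.
p matching(/a(?:b|c)/)  # => ["ab", "ac"]
```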

Yunzhe

-----Original Messages-----

From: "Die Optimisten" <inform@die-optimisten.net>

Sent Time: 2022-08-09 17:08:33 (Tuesday)

To: Ruby-Talk <ruby-talk@ruby-lang.org>

Cc:

Subject: regex

Unsubscribe: <mailto:ruby-talk-request@ruby-lang.org?subject=unsubscribe>

<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-talk>

There's no "and" operator in regexps, but you can use a lookahead, which kind of simulates one, though probably at some performance cost.

[4] pry(main)> /(?=hello)(hi|howdy|hello)/.match? "hi"

=> false

[5] pry(main)> /(?=hello)(hi|howdy|hello)/.match? "hello"

=> true

[6] pry(main)> /(?=hello)(hi|howdy|hello)/.match? "howdy"

=> false

[7] pry(main)>

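Only "kind of", because length matters: the lookahead and the pattern after it can match different lengths. Note that I used + in the lookahead, yet "Hello" still matches below, because the leading "H" alone satisfies it.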

[14] pry(main)> /(?=[A-Z]+)((?i)hello|world)/.match? "hello"

=> false

[15] pry(main)> /(?=[A-Z]+)((?i)hello|world)/.match? "HELLO"

=> true

[16] pry(main)> /(?=[A-Z]+)((?i)hello|world)/.match? "Hello"

=> true

[17] pry(main)>

To simulate "and not" you can use negative lookahead:

[17] pry(main)> /(?![A-Z]+)((?i)hello|world)/.match? "Hello"

=> false

[18] pry(main)> /(?![A-Z]+)((?i)hello|world)/.match? "HELLO"

=> false

[19] pry(main)> /(?![A-Z]+)((?i)hello|world)/.match? "hello"

=> true

[20] pry(main)>
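
Applied to the original question, the lookahead idea could look roughly like the sketch below. It assumes that "a and (b or c)" means "the string contains a, and also contains b or c, in any order", and that a, b, c are the literal characters; the constant and method names are invented for the example.

```ruby
# Two lookaheads anchored at the start of the string: each one scans ahead
# independently, so the order of the parts does not matter.
# The /m flag lets . match newlines as well.
A_AND_B_OR_C = /\A(?=.*a)(?=.*(?:b|c))/m

# Hypothetical helper wrapping the check.
def a_and_b_or_c?(text)
  text.match?(A_AND_B_OR_C)
end

p a_and_b_or_c?("a something b")  # => true
p a_and_b_or_c?("c something a")  # => true
p a_and_b_or_c?("a only")         # => false
p a_and_b_or_c?("b only")         # => false
```

Each part appears only once in the pattern, at the price of scanning the string once per lookahead.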


Hello!

So for (a and (b or c)) we have to search for (a before (b|c)) OR ((b|c) before a).

Can this somehow be reduced? (There is no 'before-or-after' operator...)

Are performance and memory use better when using separate calls ( /a/.match?... and /b|c/.match?... )?

Thank you

Opti
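
The two approaches asked about here can be put side by side in a short sketch. It assumes a, b and c are the literal single characters "a", "b" and "c"; the constant and method names are invented for the example. It only shows that both give the same answer on these inputs and makes no claim about speed or memory; Ruby's standard `benchmark` library would be one way to measure that on real data.

```ruby
# One regex that spells out both orders.
BOTH_ORDERS = /a.*[bc]|[bc].*a/

def one_regex?(text)
  text.match?(BOTH_ORDERS)
end

# Two separate checks joined with Ruby's own "and".
def two_checks?(text)
  text.match?(/a/) && text.match?(/[bc]/)
end

["a something b", "c something a", "a only", "b or c"].each do |text|
  puts "#{text.inspect}: #{one_regex?(text)} / #{two_checks?(text)}"
end
```

For multi-line input the two can differ, because . does not match a newline unless the regex has the /m flag.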

You can test each option in one regex:

"a something b".match(/a.*[bc]|[bc].*a/)

"c something a".match(/a.*[bc]|[bc].*a/)

Hi,

Such an 'and' operator does not exist in Regular Expressions (as defined by

Kleene) and also does not exist in any Regexp engine I have ever heard

of. The reason is quite simple: it does not make sense because it can

never match.

A symbol can never be two symbols at the same time. Your proposed

Regex is looking for a symbol that is at the same time an `a` and also

either a `b` or a `c`. But that is not possible: if the symbol is an

`a`, then it cannot possibly be a `b` or a `c`, and if the symbol is a

`b` or a `c`, then it cannot possibly be an `a`.

To formalize it a little:

Let A be a regular expression that matches exactly one symbol.

Let B ≠ A be a regular expression that matches exactly one, different symbol.

∀A, B: There can never be a string that is recognized by the regular expression A∧B.

This follows directly from the argument above: a single symbol cannot be two different symbols at once. Note that it holds only for such single-symbol expressions. For arbitrary regular expressions the intersection can be non-empty; for example, `a|b` and `a|c` are different expressions, yet both recognize the string "a".

OTOH, it is easy to see that for A=B, i.e. for the regular expression

A∧A, there *are* strings that are recognized by it, namely exactly the

set of strings that are recognized by A.

Cheers

Die Optimisten <inform@die-optimisten.net> wrote:

Hello,

thanks for your answer;

I should have written that a,b,c are placeholders for strings;

but ... even if they're one-char strings: why 'can never match'?

# Also, a could be the same as b or c.....

But what I meant was: / (a.*(b|c)) | ((b|c).*a) / # with or without '.*' ...

* Can this be simplified (without having to write a, b, c twice)?

Opti

PS: What does OTOH mean?


Hello,

Not sure if I understood you correctly, nor whether this is an appropriate solution, but

you could do something like ((a).*(b|c)) | ((?3).*(?2))

In Ruby it would look like

/((a).*(b|c))|((\g<3>).*(\g<2>))/

Here (a) would be capture group 2 and (b|c) would be capture group 3; that's why we later call them as (\g<3>) and (\g<2>), to reverse their order.

I'm in no way an expert (just learning, tbf), so others should say whether this is a good idea.

Cheers
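
The same subexpression-call idea can be written with named groups, which some may find easier to read than numbers. This is a sketch only: the group names `first` and `second` and the sample strings are made up, and a, b, c are taken to be literal characters.

```ruby
# \g<name> re-runs the pattern of the named group; it does not refer to the
# text that group captured.
EITHER_ORDER = /(?<first>a).*(?<second>b|c)|\g<second>.*\g<first>/

p "a something b".match?(EITHER_ORDER)  # => true
p "c something a".match?(EITHER_ORDER)  # => true
p "a only".match?(EITHER_ORDER)         # => false
```

Each of a, b and c is written only once; the second alternative reuses them through the calls.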


I think the answer is much simpler than many have put it.

First, assuming no “to-be-ignored” strings between a, b, and c: /a(b|c)/ or, without capturing, /a(?:b|c)/

Second, if you want to allow extra characters between the two strings: /a.*(b|c)/ or /a.*(?:b|c)/

The parentheses are required so that the “a” (or “a.*”) is not grouped with the “b” in the alternation: /(a)b|c/ would match “ab” or a lone “c”, because | has the lowest precedence. Likewise, if a, b, or c are complex Regexps themselves, you may need to add parentheses around them to make them atomic relative to the “|”.
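
A quick way to see why the grouping matters, as a sketch with the literal characters "a", "b" and "c" (the constant names are invented here): without parentheses, | splits the whole pattern, so the "a" belongs to the left alternative only.

```ruby
GROUPED   = /a(?:b|c)/  # "a", then "b" or "c"
UNGROUPED = /ab|c/      # "ab", or a lone "c"

p "ac".match?(GROUPED)    # => true
p "c".match?(GROUPED)     # => false
p "c".match?(UNGROUPED)   # => true
```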
