Regular expressions

Hi folks,
I'm racking my brain because I don't understand how regular expressions
work.

I just want to validate a username, where:
username -> valid
user.name -> valid

Everything else is invalid.

Please let me know the regexp to do this.

···

--
Posted via http://www.ruby-forum.com/.

J. mp wrote:

Hi folks,
I'm racking my brain because I don't understand how regular expressions
work.

I just want to validate a username, where:
username -> valid
user.name -> valid

Everything else is invalid.

Please let me know the regexp to do this.

I think more details are necessary. What characters are allowed in "username"? Just alphabetic? Alphabetic plus numbers? Anything else? Is there a minimum number of characters? A maximum? Just Latin characters? Similarly for "user.name": is it the same as "username" except with a period? Does there have to be exactly four characters before the period and four after it? Any other constraints?

To use regular expressions you must be able to precisely state what a "match" means.

J. mp wrote:

Hi folks,
I'm racking my brain because I don't understand how regular expressions
work.

I just want to validate a username, where:
username -> valid
user.name -> valid

Everything else is invalid.

  Just to get you started:

  /[a-z]+(\.[a-z]+)?/
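
  One thing to watch with that starting point, shown as a rough sketch: without anchors the pattern matches any substring, so for validation it probably wants \A and \z around it. The constant names LOOSE and ANCHORED are made up for this example.

```ruby
# The starting pattern as posted: it can match anywhere inside the string.
LOOSE = /[a-z]+(\.[a-z]+)?/
# Same pattern anchored to the whole string (ANCHORED is a hypothetical name).
ANCHORED = /\A[a-z]+(\.[a-z]+)?\z/

%w[username user.name .username user..name user.name.].each do |name|
  loose    = !(name =~ LOOSE).nil?
  anchored = !(name =~ ANCHORED).nil?
  puts format("%-12s loose=%-5s anchored=%s", name, loose, anchored)
end
```

  With the anchors in place, only "username" and "user.name" from that list should pass.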

  Vince

···

--
Vincent Fourmond, PhD student (not for long anymore)
http://vincent.fourmond.neuf.fr/

I'm trying to install ruby-1.8.5-p12 on my CentOS4 system.

  % uname -rsvp
  Linux 2.6.9-022stab078.23-enterprise #1 SMP Thu Oct 19 14:54:39 MSD 2006 i686

I do the following steps with no problem:

  ./configure --prefix=/usr --enable-shared
  make

But then, when I do "make check", the test run hangs forever after a few
hundred characters (mostly dots) print out.

When I finally do a Control-C, I get the following stack trace:

  /readline/test_readline.rb:20:in `readline': Interrupt
  from ./readline/test_readline.rb:20:in `test_readline'
  from ./readline/test_readline.rb:72:in `replace_stdio'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/open-uri.rb:32:in `open_uri_original_open'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/open-uri.rb:32:in `open'
  from ./readline/test_readline.rb:66:in `replace_stdio'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/open-uri.rb:32:in `open_uri_original_open'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/open-uri.rb:32:in `open'
  from ./readline/test_readline.rb:65:in `replace_stdio'
   ... 15 levels...
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/test/unit/ui/testrunnerutilities.rb:29:in `run'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/test/unit/autorunner.rb:200:in `run'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/test/unit/autorunner.rb:13:in `run'
  from runner.rb:7
  make: *** [test-all] Error 1

If I'm reading this correctly, it looks like open_uri_original_open is
somehow being called recursively and repeatedly failing.

Is this something I need to be concerned about?

Thanks.

···

--
Lloyd Zusman
ljz@asfast.com
God bless you.

Vincent Fourmond wrote:

J. mp wrote:

Hi folks,
I'm racking my brain because I don't understand how regular expressions
work.

I just want to validate a username, where:
username -> valid
user.name -> valid

Everything else is invalid.

  Just to get you started:

  /[a-z]+(\.[a-z]+)?/

  Vince

First of all, thanks for the attention.
More details:

max size allowed is 30
min size allowed is 5

the following chars are allowed:
- _ . (hyphen, underscore, period)

these chars are not allowed as the first or last char

any alphabetic char, English chars only
case insensitive
no numbers

e.g.
.username -> invalid
user-name -> valid
_username -> invalid
user.name -> valid
user_name -> valid

Basically I want to allow the same pattern allowed for emails, but before
the @ char :)

···

--
Posted via http://www.ruby-forum.com/.

Lloyd Zusman wrote:

I'm trying to install ruby-1.8.5-p12 on my CentOS4 system.

  % uname -rsvp
  Linux 2.6.9-022stab078.23-enterprise #1 SMP Thu Oct 19 14:54:39 MSD 2006 i686

I do the following steps with no problem:

  ./configure --prefix=/usr --enable-shared
  make

But then, when I do "make check", the test run hangs forever after a few
hundred characters (mostly dots) print out.

When I finally do a Control-C, I get the following stack trace:

  /readline/test_readline.rb:20:in `readline': Interrupt
  from ./readline/test_readline.rb:20:in `test_readline'
  from ./readline/test_readline.rb:72:in `replace_stdio'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/open-uri.rb:32:in `open_uri_original_open'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/open-uri.rb:32:in `open'
  from ./readline/test_readline.rb:66:in `replace_stdio'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/open-uri.rb:32:in `open_uri_original_open'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/open-uri.rb:32:in `open'
  from ./readline/test_readline.rb:65:in `replace_stdio'
   ... 15 levels...
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/test/unit/ui/testrunnerutilities.rb:29:in `run'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/test/unit/autorunner.rb:200:in `run'
  from /usr/local/src/ruby/ruby-1.8.5-p12/lib/test/unit/autorunner.rb:13:in `run'
  from runner.rb:7
  make: *** [test-all] Error 1

If I'm reading this correctly, it looks like open_uri_original_open is
somehow being called recursively and repeatedly failing.

Is this something I need to be concerned about?

Thanks.

You may have to do "make install" before "make check". Did you do it that way?

···

--
M. Edward (Ed) Borasky, FBG, AB, PTA, PGS, MS, MNLP, NST, ACMC(P)
http://borasky-research.blogspot.com/

If God had meant for carrots to be eaten cooked, He would have given rabbits fire.

J. mp wrote:

Vincent Fourmond wrote:
  

J. mp wrote:
    

Hi folks,
I'm racking my brain because I don't understand how regular expressions
work.

I just want to validate a username, where:
username -> valid
user.name -> valid

Everything else is invalid.
      

  Just to get you started:

  /[a-z]+(\.[a-z]+)?/

  Vince
    
First of all, thanks for the attention.
More details:

max size allowed is 30
min size allowed is 5

the following chars are allowed:
- _ . (hyphen, underscore, period)

these chars are not allowed as the first or last char

any alphabetic char, english chars only
case insensitive
no numbers

eg
.username ->invalid
user-name -> valid
_username -> invalid
user.name -> valid
user_name ->valid

/\A[[:alpha:]][-_.[:alpha:]]{3,28}[[:alpha:]]\z/

basically I want to allow the same pattern allowed for emails, but before the @ char :)

The above regexp does not do this. Certainly you can have numbers in your email address, for example. Basically anything is allowed before the @. Google "regular expression email address" for extensive discussions about this.
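
For anyone who wants to try that pattern against the examples listed in the thread, here is a small sketch. The constant name USERNAME and the extra samples ("abcd", "user1name") are my own additions for illustration. One caveat, as far as I know: on current Rubies [[:alpha:]] can also match non-English letters in UTF-8 strings, so [a-zA-Z] may be closer to the "English chars only" rule.

```ruby
# A letter, then 3-28 letters/hyphens/underscores/periods, then a letter,
# which gives a total length of 5 to 30 characters.
USERNAME = /\A[[:alpha:]][-_.[:alpha:]]{3,28}[[:alpha:]]\z/

samples = %w[username user.name user-name user_name .username _username abcd user1name]
samples.each do |name|
  verdict = (name =~ USERNAME) ? "valid" : "invalid"
  puts "#{name.ljust(10)} -> #{verdict}"
end
```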

/\A[a-z][a-z_.-]{3,28}[a-z]\Z/i

Translated, that says:
* start at the beginning
* find any letter
* followed by 3-28 characters that are letters, underscores, periods, or hyphens
* followed by a letter
* followed by the end of the string
* oh, and be case insensitive, please

Note that, per your exact instructions, this allows:
  u_s_e_r_n_a_m_e
  u____________________________e
  z._-_.z
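
A side note on the end anchor, as a small sketch with simplified patterns: in Ruby, \Z also matches just before a trailing newline, while \z only matches at the very end of the string. For validating user input, \z is probably the safer choice. The constant names below are only for this example.

```ruby
# Simplified patterns to isolate the difference between \Z and \z.
BIG_Z   = /\A[a-z]+\Z/
SMALL_Z = /\A[a-z]+\z/

input = "username\n"   # note the trailing newline

p(input =~ BIG_Z)      # 0   (\Z tolerates one trailing newline)
p(input =~ SMALL_Z)    # nil (\z does not)
```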

···

On Feb 11, 4:21 pm, "J. mp" <joaomiguel.pere...@gmail.com> wrote:

max size allowed is 30
min size allowed is 5

the following chars are allowed:
- _ . (hyphen, underscore, period)

these chars are not allowed as the first or last char

"M. Edward (Ed) Borasky" <znmeb@cesmail.net> writes:

Lloyd Zusman wrote:

[ ... ]

But then, when I do "make check", the test run hangs forever after a few
hundred characters (mostly dots) print out.

[ ... ]

You may have to do "make install" before "make check". Did you do it
that way?

No, I didn't. I have always thought that the autoconf convention is to
perform "make check" first, and to use its result to decide whether to
then do the "make install". It seems incorrect to install the software
and then check it. If "make check" fails miserably, it will be a big
headache to then try to uninstall everything.

But I took a chance, and I did do the "make install" after all. Then, I
did a "make check", and it hung in exactly the same manner as before.
Luckily, ruby seems to work for all of my usual scripts, but I don't
know whether there might be something fundamentally wrong which will
bite me later.

···

--
Lloyd Zusman
ljz@asfast.com
God bless you.

Timothy Hunter wrote:

J. mp wrote:

user.name ->valid

/\A[[:alpha:]][-_.[:alpha:]]{3,28}[[:alpha:]]\z/

basically I want to allow the same pattern allowed for emails, but before
the @ char :)

The above regexp does not do this. Certainly you can have numbers in
your email address, for example. Basically anything is allowed before
the @. Google "regular expression email address" for extensive
discussions about this.

It works well.
Thanks a lot

···

--
Posted via http://www.ruby-forum.com/.

And if you are being pedantic, RFC2822 doesn't allow E-mail addresses to
contain two dots next to each other, unless the local-part is quoted.

···

On Mon, Feb 12, 2007 at 08:56:12AM +0900, Timothy Hunter wrote:

/\A[[:alpha:]][-_.[:alpha:]]{3,28}[[:alpha:]]\z/

>basically I want to allow the same pattern allowed for emails, but before
>the @ char :)
>
>
The above regexp does not do this. Certainly you can have numbers in
your email address, for example. Basically anything is allowed before
the @. Google "regular expression email address" for extensive
discussions about this.

Gavin Kistner wrote:

···

On Feb 11, 4:21 pm, "J. mp" <joaomiguel.pere...@gmail.com> wrote:

max size allowed is 30
min size allowed is 5

the following chars are allowed:
- _ . (hyphen, underscore, period)

these chars are not allowed as the first or last char

/\A[a-z][a-z_.-]{3,28}[a-z]\Z/i

Translated, that says:
* start at the beginning
* find any letter
* followed by 3-28 characters that are letters, underscores, periods, or hyphens
* followed by a letter
* followed by the end of the string
* oh, and be case insensitive, please

Note that, per your exact instructions, this allows:
  u_s_e_r_n_a_m_e
  u____________________________e
  z._-_.z

Oh damn! The first should be allowed, but the second and the third should
not be allowed.
Thanks

--
Posted via http://www.ruby-forum.com/.

Lloyd Zusman wrote:

"M. Edward (Ed) Borasky" <znmeb@cesmail.net> writes:

Lloyd Zusman wrote:
    

[ ... ]

But then, when I do "make check", the test run hangs forever after a few
hundred characters (mostly dots) print out.

[ ... ]

You may have to do "make install" before "make check". Did you do it
that way?
    
No, I didn't. I have always thought that the autoconf convention is to
perform "make check" first, and to use its result to decide whether to
then do the "make install". It seems incorrect to install the software
and then check it. If "make check" fails miserably, it will be a big
headache to then try to uninstall everything.
  

Yeah ... that's the way it's supposed to work -- check first, then install. But I usually make a new home in /opt for testing stuff anyhow, rather than letting it default into /usr/local. And I have had instances where things broke in "make check" that didn't break after "make install" because of some path issues. I'll take this as encouragement to hunt them down and document them. :)

···

--
M. Edward (Ed) Borasky, FBG, AB, PTA, PGS, MS, MNLP, NST, ACMC(P)
http://borasky-research.blogspot.com/

If God had meant for carrots to be eaten cooked, He would have given rabbits fire.

Brian Candler wrote:

···

On Mon, Feb 12, 2007 at 08:56:12AM +0900, Timothy Hunter wrote:

/\A[[:alpha:]][-_.[:alpha:]]{3,28}[[:alpha:]]\z/

>basically I want to allow the same pattern allowed for emails, but before
>the @ char :)
>
>
The above regexp does not do this. Certainly you can have numbers in
your email address, for example. Basically anything is allowed before
the @. Google "regular expression email address" for extensive
discussions about this.

And if you are being pedantic, RFC2822 doesn't allow E-mail addresses to
contain two dots next to each other, unless the local-part is quoted.

OK, thanks all. I need a regexp that does what I described before, but without
dots, hyphens and underscores one after another, and not at the start
or at the end.
Thanks

--
Posted via http://www.ruby-forum.com/.

>
> Note that, per your exact instructions, this allows:
> u_s_e_r_n_a_m_e
> u____________________________e
> z._-_.z

Oh damn! The first should be allowed, but the second and the third should
not be allowed.

The regexp could be extended to handle this, but it gets ever more
convoluted and unreadable. You'd be better off doing a separate check
for a !~ /[^A-Za-z]{2,}/ (that is, "a does not contain two
non-alphabetic chars in a row"):

tests = %w( u_s_e_r_n_a_m_e
            u____________________________e
            z._-_.z
)
=> ["u_s_e_r_n_a_m_e", "u____________________________e", "z._-_.z"]

tests.each {|a| p [a, a !~ /[^A-Za-z]{2,}/]}

["u_s_e_r_n_a_m_e", true]
["u____________________________e", false]
["z._-_.z", false]
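
Putting the two checks together might look roughly like this. It is only a sketch: the helper name valid_username? and the constant names are hypothetical, and the shape pattern assumes the underscore belongs in the character class, as the stated requirements suggest.

```ruby
# Overall shape: starts and ends with a letter, 5-30 chars in total,
# with only letters, underscores, periods and hyphens in between.
SHAPE = /\A[a-z][a-z_.-]{3,28}[a-z]\z/i
# Two or more non-letters in a row.
DOUBLE_PUNCT = /[^A-Za-z]{2,}/

# Hypothetical helper: true only if both checks pass.
def valid_username?(name)
  !(name =~ SHAPE).nil? && name !~ DOUBLE_PUNCT
end

%w[u_s_e_r_n_a_m_e u____________________________e z._-_.z user.name].each do |name|
  p [name, valid_username?(name)]
end
```

Keeping the two checks separate makes each pattern easy to read, at the cost of scanning the string twice.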

Also, play around with The Regex Coach - interactive regular expressions

martin

···

On 2/12/07, J. mp <joaomiguel.pereira@gmail.com> wrote:

Gavin Kistner wrote:
> /\A[a-z][a-z_.-]{3,28}[a-z]\Z/i

[snip]

> Note that, per your exact instructions, this allows:
> u_s_e_r_n_a_m_e
> u____________________________e
> z._-_.z

Oh damn! The first should be allowed, but the second and the third should
not be allowed.

OK, but *why* aren't they allowed? You haven't described exactly what
your requirements are. Is it because you can't have two non-letters in
a row? Is it because the string must contain at least three letters?

BTW, where are these requirements coming from? Are these business
requirements that must be enforced? Are you just making up what you
think people should probably have to use as a name? Or are you just
trying to learn regexp?

···

On Feb 12, 3:49 am, "J. mp" <joaomiguel.pere...@gmail.com> wrote:

"M. Edward (Ed) Borasky" <znmeb@cesmail.net> writes:

Lloyd Zusman wrote:

"M. Edward (Ed) Borasky" <znmeb@cesmail.net> writes:

[ ... ]

You may have to do "make install" before "make check". Did you do it
that way?

No, I didn't. I have always thought that the autoconf convention is to
perform "make check" first, and to use its result to decide whether to
then do the "make install". [ ... ]

Yeah ... that's the way it's supposed to work -- check first, then
install. But I usually make a new home in /opt for testing stuff anyhow,
rather than letting it default into /usr/local. And I have had instances
where things broke in "make check" that didn't break after "make
install" because of some path issues. I'll take this as encouragement to
hunt them down and document them. :)

Yes, I see how that approach can be helpful. I just usually do the
following: if "make check" fails, don't even try the install. I've
never seen a case where the check succeeded and the software blew up
after installation ... although I know that this certainly could happen,
and your procedure would catch this problem.

In any case, this time I got the same error during the "make check" both
before and after the installation.

···

--
Lloyd Zusman
ljz@asfast.com
God bless you.

Gavin Kistner wrote:

Gavin Kistner wrote:
> /\A[a-z][a-z_.-]{3,28}[a-z]\Z/i

[snip]

> Note that, per your exact instructions, this allows:
> u_s_e_r_n_a_m_e
> u____________________________e
> z._-_.z

Oh damn! The first should be allowed, but the second and the third should
not be allowed.

OK, but *why* aren't they allowed? You haven't described exactly what
your requirements are. Is it because you can't have two non-letters in
a row? Is it because the string must contain at least three letters?

BTW, where are these requirements coming from? Are these business
requirements that must be enforced? Are you just making up what you
think people should probably have to use as a name? Or are you just
trying to learn regexp?

It's a business requirement. The user name will be used before the
domain. For example,
I have the domain http://somedomain.com, and each user will have a unique URL
such as:
http://user.name.somedomain.com
http://david_coperfield.somedomain.com
http://andreas-blast.somedomain.com

This is my business requirement, so I can only allow user names that can
be used in a URI.

Thanks all again,

···

On Feb 12, 3:49 am, "J. mp" <joaomiguel.pere...@gmail.com> wrote:

--
Posted via http://www.ruby-forum.com/.

Gavin Kistner wrote:
> OK, but *why* aren't they allowed? You haven't described exactly what
> your requirements are. Is it because you can't have two non-letters in
> a row? Is it because the string must contain at least three letters?

You didn't answer these questions.

> BTW, where are these requirements coming from? Are these business
> requirements that must be enforced? Are you just making up what you
> think people should probably have to use as a name? Or are you just
> trying to learn regexp?

It's a business requirement. The user name will be used before the
domain. For example,
I have the domain http://somedomain.com, and each user will have a unique URL
such as:
http://user.name.somedomain.com
http://david_coperfield.somedomain.com
http://andreas-blast.somedomain.com

This is my business requirement, so I can only allow user names that can
be used in a URI.

So the question is, what is legal in that part of a URI? The best
resource I can find is RFC 3986 [1], and it says:
"The most common name registry mechanism is the Domain Name System
(DNS). A registered name intended for lookup in the DNS uses the
syntax defined in Section 3.5 of [RFC1034] and Section 2.1 of
[RFC1123]."

Section 2.1 of RFC 1123 [2] says:
"The syntax of a legal Internet host name was specified in RFC-952
[DNS:4]. One aspect of host name syntax is hereby changed: the
restriction on the first character is relaxed to allow either a letter
or a digit. Host software MUST support this more liberal syntax.

Host software MUST handle host names of up to 63 characters and SHOULD
handle host names of up to 255 characters."

RFC 952 [3] says:
"<domainname> ::= <hname>
<hname> ::= <name>*["."<name>]
<name> ::= <let>[*[<let-or-digit-or-hyphen>]<let-or-digit>]"

So, my reading of that (and I'm not an expert) is that a machine name
MAY have digits in it (including at the start or end), may NOT have
underscores, and may be pretty darn long. (Though it makes sense to
put some sort of bound on it - if you think 30 chars is OK, so be it.)

A regexp for this, allowing multiple dotted names joined together:

# Regexp for a single name
/[a-z\d](?:[a-z\d-]*[a-z\d])?/i

# Regexp for 1 or more of those joined by periods
/(?:[a-z\d](?:[a-z\d-]*[a-z\d])?)(?:\.[a-z\d](?:[a-z\d-]*[a-z\d])?)*/i
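
As posted, those patterns have no anchors, so for validation they would need \A and \z around them. A rough sketch of how that could look follows; the names LABEL, HOSTNAME and valid_subdomain? are hypothetical, and the 30-character limit is simply the bound mentioned earlier in the thread, not something taken from the RFCs.

```ruby
# One name: starts and ends with a letter or digit, hyphens allowed inside.
LABEL = /[a-z\d](?:[a-z\d-]*[a-z\d])?/i
# One or more names joined by periods, anchored to the whole string.
# Interpolating a Regexp keeps its /i flag inside the combined pattern.
HOSTNAME = /\A#{LABEL}(?:\.#{LABEL})*\z/

# Hypothetical helper; max_length defaults to the 30 chars from the thread.
def valid_subdomain?(name, max_length = 30)
  name.length <= max_length && !(name =~ HOSTNAME).nil?
end

%w[user.name andreas-blast david_coperfield -user user- user..name].each do |name|
  p [name, valid_subdomain?(name)]
end
```

Under this reading, "david_coperfield" would be rejected because of the underscore.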

[1] http://www.gbiv.com/protocols/uri/rfc/rfc3986.html
[2] http://rfc-ref.org/RFC-TEXTS/1123/chapter2.html#sub1
[3] http://rfc.net/rfc952.html#sA.

···

On Feb 12, 8:25 am, "J. mp" <joaomiguel.pere...@gmail.com> wrote:

Gavin Kistner wrote:

···

On Feb 12, 8:25 am, "J. mp" <joaomiguel.pere...@gmail.com> wrote:

Gavin Kistner wrote:
> OK, but *why* aren't they allowed. You haven't described exactly what

So, my reading of that (and I'm not an expert) is that a machine name
MAY have digits in it (including at the start or end), may NOT have
underscores, and may be pretty darn long. (Though it makes sense to
put some sort of bound on it - if you think 30 chars is OK, so be it.)

A regexp for this, allowing multiple dotted names joined together:

# Regexp for a single name
/[a-z\d](?:[a-z\d-]*[a-z\d])?/i

# Regexp for 1 or more of those joined by periods
/(?:[a-z\d](?:[a-z\d-]*[a-z\d])?)(?:\.[a-z\d](?:[a-z\d-]*[a-z\d])?)*/i

[1] http://www.gbiv.com/protocols/uri/rfc/rfc3986.html
[2] http://rfc-ref.org/RFC-TEXTS/1123/chapter2.html#sub1
[3] http://rfc.net/rfc952.html#sA.

So, Gavin, your last regex allows only valid host names in a URI? I'm
sorry for not reading the RFC before. My requirement is what I said: the
user name will act as part of a URI, so I should allow any combination
of chars that is valid for the first part of a URI.

--
Posted via http://www.ruby-forum.com/.