New Ruby questions

Jeff_Massung · 13 April 2004 22:14

I’ve just started Ruby a couple days ago (man this is cool). Coming from
the embedded world of Forth and C, being able to do some string parsing
easily is what I’m looking forward to. So, onto a couple questions that I
can’t seem to find the answers to:

Quickly, the “teach myself” project is a simple ARM assembler. I’ve done
it numerous times, and it is something I’m familiar and comfortable with
(let alone something I need ATM).

How can a regexp get the longer of two possibilities that are
ambiguous? For example, I need to be able to strip out register names:

registers = ‘(r0|r1|r2|r3…|r10|r11|r12…)’

In the regular expression, how can I get it to find r10 or r11 instead of
stopping at r1? I get the same problem with opcode mnemonics (like B
instead of BX or BL).

I don’t seem to understand the @ conventions for variable names. Is
this just a naming convention that people stick to (like name system
names in Lisp) but that isn’t required?

I noticed that if I have a global variable:

x = 10

In some function:

def show_x
  print x
end

This fails (unknown local x). But if I use @x in both cases it works just
fine. Can someone explain this to me? Also, what about @@ variables?
Likewise, is there @@@ or @@@@ ?

Thanks in advance.

–
Best regards,
Jeff jma[at]mfire.com
http://www.jm-basic.com

John_Platte · 13 April 2004 22:28

The @ prefix marks an instance variable, which is global to the
instance of the class you’re in.

Ruby’s a very object-oriented language. If you don’t think you’re in a
class, then you’re scoped in the default top-level object (main, an
instance of Object). So your @x is an instance variable of that
top-level object.

@@ is a class variable, available to all instances of that class. It
will seem to work like @ in the top-level scope.

Check out http://www.rubycentral.com/book/tut_classes.html for a much
better rundown of this stuff.
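
A minimal sketch of the behaviour described above, assuming the code is run as a plain script from the top level. The method names show_ivar and show_local are made up for illustration.

```ruby
# At the top level, self is the object `main`, an instance of Object.
@x = 10   # instance variable stored on main
x  = 10   # plain local variable; a `def` body cannot see it

def show_ivar
  @x      # self is still main when this is called from the top level
end

def show_local
  x       # neither a local in this method nor a method named x
rescue NameError => e
  "#{e.class} for x"
end

puts show_ivar    # 10
puts show_local   # NameError for x
```

The @x version only works because both the assignment and the method call happen with the same self. Inside a class of your own, the top-level @x would not be visible.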

On 2004 Apr 13, at 17:14, Jeff Massung wrote:

I noticed that if I have a global variable:

x = 10

In some function:

def show_x
print x
end

This fails (unknown local x). But if I use @x in both cases it works
just
fine. Can someone explain this to me? Also, what about @@ variables?
Likewise, is there @@@ or @@@@ ?

–
Ryan “John” Platte
Custom services, NIKA Consulting
http://nikaconsulting.com/

Joel_VanderWerf1 · 13 April 2004 22:35

Jeff Massung wrote:

How can a regexp get the longer of two possibilities that are
ambiguous? For example, I need to be able to strip out register names:

registers = ‘(r0|r1|r2|r3…|r10|r11|r12…)’

In the regular expression, how can I get it to find r10 or r11 instead of
stopping at r1? I get the same problem with opcode mnemonics (like B
instead of BX or BL).

Is this what you want to do?

irb(main):001:0> registers = '(r0|r1|r2|r3…|r10|r11|r12…)'
=> "(r0|r1|r2|r3…|r10|r11|r12…)"
irb(main):002:0>
irb(main):003:0* registers.scan(/r\d+/)
=> ["r0", "r1", "r2", "r3", "r10", "r11", "r12"]

Armin_Roehrl · 13 April 2004 22:43

How can a regexp get the longer of two possibilities that are
ambiguous? For example, I need to be able to strip out register names:

registers = ‘(r0|r1|r2|r3…|r10|r11|r12…)’

In the regular expression, how can I get it to find r10 or r11 instead of
stopping at r1?

register =~ /(r\d{2,})/
p $1 #-> r10

\d is for a digit character
{m,n} at least m and at most n repetitions of the preceding

more info:
http://dev.faeriemud.org/ruby-uguide/uguide05.html
http://www.rubycentral.com/ref/ref_c_regexp.html

Simon_Strandgaard1 · 13 April 2004 22:49

[snip]

Quickly, the “teach myself” project is a simple ARM assembler. I’ve done
it numerous times, and it is something I’m familiar and comfortable with
(let alone something I need ATM).

sounds interesting… BTW: are you going to follow the test-first paradigm?

How can a regexp get the longer of two possibilities that are
ambiguous? For example, I need to be able to strip out register names:

registers = ‘(r0|r1|r2|r3…|r10|r11|r12…)’

In the regular expression, how can I get it to find r10 or r11 instead of
stopping at r1? I get the same problem with opcode mnemonics (like B
instead of BX or BL).

There are many ways to do this…

I recommend reading up on ‘lexing’… a phase where you transform
the source code into tokens, with all whitespace stripped off.
Then you have your register names as you want them.

However, you asked for a regexp… you can place word-boundary anchors
around the registers. Also place the longest names first in the
alternation. But I would rather recommend the lexing path.
registers = / \b (?: r10 | r11 | r12 | … r0 | r1 | r2 ) \b /x
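
One way to avoid ordering the alternation by hand is to build it from a list. This is only a sketch: the register list r0 to r15 and the constant names REGISTER_NAMES and REGISTER_RE are assumptions for illustration, not taken from the post.

```ruby
# Hypothetical register list; a real ARM assembler would likely also
# want aliases such as sp, lr and pc.
REGISTER_NAMES = (0..15).map { |n| "r#{n}" }

# Sort longest names first so "r10" is tried before "r1", then guard
# both ends with \b so "r100" or "car1" do not match at all.
LONGEST_FIRST = REGISTER_NAMES.sort_by { |name| -name.length }
REGISTER_RE   = /\b(?:#{Regexp.union(LONGEST_FIRST)})\b/

p "add r0,r10,r2".scan(REGISTER_RE)   # => ["r0", "r10", "r2"]
p "r100" =~ REGISTER_RE               # => nil
```

With the \b anchors in place the ordering is no longer strictly needed, since a short match that is followed by a digit fails and the engine backtracks. Sorting just saves the backtracking.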

BTW: welcome to Ruby

On Tue, 13 Apr 2004 23:10:11 +0000, Jeff Massung wrote:

–
Simon Strandgaard

Ara.T.Howard · 13 April 2004 22:54

I’ve just started Ruby a couple days ago (man this is cool). Coming from the
embedded world of Forth and C, being able to do some string parsing easily
is what I’m looking forward to. So, onto a couple questions that I can’t
seem to find the answers to:

welcome aboard.

Quickly, the “teach myself” project is a simple ARM assembler. I’ve done it
numerous times, and it is something I’m familiar and comfortable with (let
alone something I need ATM).

How can a regexp get the longer of two possibilities that are ambiguous?
For example, I need to be able to strip out register names:

registers = ‘(r0|r1|r2|r3…|r10|r11|r12…)’

In the regular expression, how can I get it to find r10 or r11 instead of
stopping at r1? I get the same problem with opcode mnemonics (like B instead
of BX or BL).

you need to make a pattern that says “a register string is an ‘r’ followed by
one, or more, digits”. pattern matching is greedy by default, so that alone
will do it

irb(main):001:0> register_pat = /r\d+/o
=> /r\d+/

irb(main):002:0> "r1"[register_pat]
=> "r1"

irb(main):003:0> "r11"[register_pat]
=> "r11"

irb(main):004:0> "r111"[register_pat]
=> "r111"

however, pattern matching is trickier than it looks:

irb(main):005:0> "car_number1"[register_pat]
=> "r1"

you probably want to say something like: “a register is the string ‘r’
followed by one, or more, digits, but only when the whole thing is bounded by
non-word chars”

irb(main):007:0> "r1"[pat]
=> "r1"

irb(main):008:0> "carr1"[pat]
=> nil

in general, it’s best to always anchor patterns somehow.
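
The line that defined pat is missing from the session above. As a guess at what was meant, and not a recovery of the original, a pattern with a word boundary on each side behaves the same way on those inputs. The name REGISTER_PAT is made up for this sketch.

```ruby
# Assumed reconstruction: \b requires a word boundary on each side, so
# the "r" must not be glued to a preceding letter, digit or underscore.
REGISTER_PAT = /\br\d+\b/

p "r1"[REGISTER_PAT]            # => "r1"
p "carr1"[REGISTER_PAT]         # => nil
p "car_number1"[REGISTER_PAT]   # => nil
p "mov r11, r2"[REGISTER_PAT]   # => "r11"
```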

I don’t seem to understand the @ convensions for variable names. Is this
just a naming convension that people stick to (like name system names in
Lisp) but aren’t required?

I noticed that if I have a global variable:

x = 10

In some function:

def show_x
  print x
end

This fails (unknown local x). But if I use @x in both cases it works just
fine. Can someone explain this to me? Also, what about @@ variables?
Likewise, is there @@@ or @@@@ ?

‘@’ means instance var - it’s OO speak for my var. eg:

~ > cat foo.rb
class Car
  attr :model
  def initialize model
    @model = model
  end
end

audi = Car.new 'audi'
p audi.model

jag = Car.new 'jaguar'
p jag.model

~ > ruby foo.rb
"audi"
"jaguar"

‘@@’ means class var. they could be thought of as ‘static storage on a class
hierarchy basis’. eg:

~ > cat foo.rb
class Widget
  @@total_widgets = 0

  def Widget.n_widgets
    @@total_widgets
  end

  attr :name
  def initialize name
    @name = name
    @@total_widgets += 1
  end
end

widgets = []
42.times{|i| widgets << Widget.new(i) }

p widgets.size
p Widget.n_widgets

~ > ruby foo.rb
42
42

there are also, ‘class instance vars’, but that’s a longer story…
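
For the curious, a short sketch of that longer story. A class instance variable is an ordinary @ variable that lives on the class object itself, so unlike @@ it is not shared with subclasses. The names Widget, Gadget, total and count are chosen for illustration only.

```ruby
class Widget
  @@total = 0   # class variable: one slot shared by Widget and its subclasses
  @count  = 0   # class instance variable: belongs to the Widget class object only

  class << self
    attr_accessor :count   # reader and writer for @count on each class object
  end

  def self.total
    @@total
  end

  def initialize
    @@total += 1
    # to_i turns the nil of a subclass that has no @count yet into 0
    self.class.count = self.class.count.to_i + 1
  end
end

class Gadget < Widget
end

3.times { Widget.new }
2.times { Gadget.new }

p Widget.total   # => 5
p Gadget.total   # => 5
p Widget.count   # => 3
p Gadget.count   # => 2
```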

there is no @@@+

Thanks in advance.

no problem. you’ll find this (the pickaxe) most useful:

http://www.pragmaticprogrammer.com/ruby/downloads/book.html

i use it, literally, every day. i always use the electronic copy, but bought
three and gave them away to support the authors - good to support them too, i
hear they are working on a new release…

cheers.

-a

On Tue, 13 Apr 2004, Jeff Massung wrote:

===============================================================================

EMAIL :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
PHONE :: 303.497.6469
ADDRESS :: E/GC2 325 Broadway, Boulder, CO 80305-3328
URL :: Solar-Terrestrial Physics Data | NCEI
TRY :: for l in ruby perl;do $l -e “print "\x3a\x2d\x29\x0a"”;done
===============================================================================

Bill_Kelly · 13 April 2004 22:56

I’ve just started Ruby a couple days ago (man this is cool). Coming from
the embedded world of Forth and C, being able to do some string parsing
easily is what I’m looking forward to. So, onto a couple questions that I
can’t seem to find the answers to:

26952 DUP 255 AND EMIT 256 / EMIT (

Quickly, the “teach myself” project is a simple ARM assembler. I’ve done
it numerous times, and it is something I’m familiar and comfortable with
(let alone something I need ATM).

How can a regexp get the longer of two possibilities that are
ambiguous? For example, I need to be able to strip out register names:

registers = ‘(r0|r1|r2|r3…|r10|r11|r12…)’

In the regular expression, how can I get it to find r10 or r11 instead of
stopping at r1? I get the same problem with opcode mnemonics (like B
instead of BX or BL).

Try putting the longer strings first in the alternation
sequence…

irb --simple-prompt

puts $& if "r11" =~ /r1|r2|r11/ # shorter strings first
r1

puts $& if "r11" =~ /r11|r2|r1/ # longer strings first
r11

Alternatively, you could use a lookahead assertion to
specify that the character following the token is some
kind of token-separating character, like…

This “\b” insists that a “word boundary” occurs at that point
in the pattern:

puts $& if "r11" =~ /(r1|r2|r11)\b/
r11

I think the “\b” is essentially equivalent to:

puts $& if "r11" =~ /(r1|r2|r11)(?![A-Za-z0-9_])/
r11
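
The same two fixes carry over to the opcode mnemonics from the original question. A small sketch, assuming just the three branch mnemonics B, BX and BL; the constant names are made up for illustration.

```ruby
# Alternation picks the first alternative that lets the match succeed,
# not the longest one.
SHORT_FIRST = /B|BX|BL/
LONG_FIRST  = /BX|BL|B/

# With a negative lookahead, "B" followed by "X" fails, so the engine
# backtracks and tries "BX" even though "B" is listed first.
GUARDED = /(?:B|BX|BL)(?![A-Za-z0-9_])/

p "BX lr"[SHORT_FIRST]   # => "B"
p "BX lr"[LONG_FIRST]    # => "BX"
p "BX lr"[GUARDED]       # => "BX"
```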

Regards,

Bill

From: “Jeff Massung” jma@NOSPAM.mfire.com

Mark_Hubbart · 13 April 2004 22:59

I’ve just started Ruby a couple days ago

welcome!

How can a regexp get the longer of two possibilities that are
ambiguous? For example, I need to be able to strip out register names:

registers = ‘(r0|r1|r2|r3…|r10|r11|r12…)’

In the regular expression, how can I get it to find r10 or r11 instead
of
stopping at r1? I get the same problem with opcode mnemonics (like B
instead of BX or BL).

Are you wanting a regex that matches register names? here’s a couple
ways:

# A bit lazy, but less complicated: matches 1 or 2 digit numbers
# the \d means match any single digit; following the second \d with
# a ? means that it should match it if it's there, but it is optional.
register_regex = /r\d\d?/

# A bit more complex, but more precise; matches up to "r42"
# rough translation: "match an 'r', followed by either a digit
# followed by a non-digit; or a number 1-3 followed by one digit
# and a non-digit; or the number 4 followed by a 0, 1, or 2
# and a non-digit."
register_regex = /r(?:\d\D|[1-3]\d\D|4[012]\D)/
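
One thing to watch with the second pattern: \D has to match a real character, so the match includes whatever follows the register, and a register at the very end of the string does not match. A possible variant, offered only as a sketch, swaps \D for a lookahead. The constant names are made up for illustration.

```ruby
# The pattern from the post: each alternative ends in \D, so it needs
# (and consumes) one extra non-digit character after the number.
ORIGINAL = /r(?:\d\D|[1-3]\d\D|4[012]\D)/

# Assumed variant: \b and (?!\d) check the surroundings without
# consuming them, so only the register name itself is matched.
LOOKAHEAD = /\br(?:4[0-2]|[1-3]\d|\d)(?!\d)/

p "r5,"[ORIGINAL]     # => "r5,"
p "r5"[ORIGINAL]      # => nil
p "r5"[LOOKAHEAD]     # => "r5"
p "r42,"[LOOKAHEAD]   # => "r42"
p "r43"[LOOKAHEAD]    # => nil
```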

If you are going to be doing lots of regex stuff, I highly recommend
going out and getting a small pocket regex reference. When I
started working with them, I got a little Perl-oriented one, but it
works just as well for Ruby, PHP, or most other implementations. I just
marked it up to show what was available in each language.

cheers,
–Mark

On Apr 13, 2004, at 3:14 PM, Jeff Massung wrote:

Tim_Hunter · 13 April 2004 23:14

Try this regexp instead: /r\d+/

For example,

irb(main):007:0> m = /r\d+/.match('r13')
=> #<MatchData:0x401ef134>
irb(main):008:0> p m[0]
"r13"
=> nil
irb(main):009:0>

This regular expression matches ‘r’ followed by one or more
digits. Notice that the regexp matched the whole string, not just ‘r1’. By
default, regexps are “greedy”, that is, they want to match as many
characters as they can. Of course that’s the behavior you’re looking for.

Your original expression “(r0|r1|r2…” uses the alternation (“|”)
operator to match one of a set of regular expressions. The regexp “a|b”
tries “a” first and if it doesn’t match then it tries “b”.

On Tue, 13 Apr 2004 22:10:11 +0000, Jeff Massung wrote:

I’ve just started Ruby a couple days ago (man this is cool). Coming from
the embedded world of Forth and C, being able to do some string parsing
easily is what I’m looking forward to. So, onto a couple questions that I
can’t seem to find the answers to:

Quickly, the “teach myself” project is a simple ARM assembler. I’ve done
it numerous times, and it is something I’m familiar and comfortable with
(let alone something I need ATM).

How can a regexp get the longer of two possibilities that are
ambiguous? For example, I need to be able to strip out register names:

registers = ‘(r0|r1|r2|r3…|r10|r11|r12…)’

In the regular expression, how can I get it to find r10 or r11 instead of
stopping at r1? I get the same problem with opcode mnemonics (like B
instead of BX or BL).

Ptkwt · 13 April 2004 23:24

In article 107opa3ca462a41@corp.supernews.com,

I’ve just started Ruby a couple days ago (man this is cool). Coming from
the embedded world of Forth and C, being able to do some string parsing
easily is what I’m looking forward to.

Welcome. So how did you find out about Ruby?

So, onto a couple questions that I
can’t seem to find the answers to:

Quickly, the “teach myself” project is a simple ARM assembler.

Sounds like a good project.

I’ve done
it numerous times, and it is something I’m familiar and comfortable with
(let alone something I need ATM).

How can a regexp get the longer of two possibilities that are
ambiguous? For example, I need to be able to strip out register names:

registers = ‘(r0|r1|r2|r3…|r10|r11|r12…)’

In the regular expression, how can I get it to find r10 or r11 instead of
stopping at r1? I get the same problem with opcode mnemonics (like B
instead of BX or BL).

I’ll let someone who is quicker with regexen answer that one.

I don’t seem to understand the @ convensions for variable names. Is
this just a naming convension that people stick to (like name system
names in Lisp) but aren’t required?

Well, no, it’s not just convention. The ‘@’ implies instance scope.

class SomeClass
  def initialize
    @foo = "some value"
  end

  def to_s
    @foo.to_s
  end
end

The @foo is accessible to the to_s method. @foo is an instance variable.

I noticed that if I have a global variable:

x = 10

No, you’ve got a regular variable, if you want a global variable, put a
‘$’ in front, like so:

$x = 10

In some function:

def show_x
print x

x inside of the method ‘show_x’ is different from the ‘x’ at the scope
above.

end

This fails (unknown local x). But if I use @x in both cases it works just
fine. Can someone explain this to me?

if you used @x in both cases it would work because @x would be accessible
from the current instance scope which happens to be an instance of
‘Object’ at this point.

Also, what about @@ variables?

‘@@’ variables have class scope. So in this case:

class Foo
  @@var = "Foo!"
  def var
    @@var
  end

  def var=(val)
    @@var = val
  end
end

a = Foo.new
b = Foo.new
puts a.var #=> Foo!
puts b.var #=> Foo!
# they're the same
a.var = "Bar!"
puts b.var #=> Bar!
# still the same

a and b are instances of class Foo and @@var is the same in both a and b
because @@var has class scope (all instances of Foo share @@var).

Likewise, is there @@@ or @@@@ ?

Luckily, we don’t have these

Phil

Jeff Massung jma@NOSPAM.mfire.com wrote:

Jeff_Massung · 13 April 2004 22:39

John Platte john.platte@nikaconsulting.com wrote in news:E3F08E93-8D99-
11D8-B4A8-000A95EB0812@nikaconsulting.com:

Check out http://www.rubycentral.com/book/tut_classes.html for a much
better rundown of this stuff.

Thanks! Feel better already

–
Best regards,
Jeff jma[at]mfire.com
http://www.jm-basic.com

Robert · 14 April 2004 08:04

“Armin Roehrl” armin@xss.de wrote in message
news:407C6D08.2080509@xss.de…

How can a regexp get the longer of two possibilities that are
ambiguous? For example, I need to be able to strip out register names:

registers = ‘(r0|r1|r2|r3…|r10|r11|r12…)’

In the regular expression, how can I get it to find r10 or r11 instead
of
stopping at r1?

register =~ /(r\d{2,})/

I guess he rather wants {1,2}

some_string.scan(/r\d{1,2}/) do |reg|
  puts "Found register: #{reg}"
end

You can even convert that to a number easily:

some_string.scan(/r(\d{1,2})/) do |match|
  no = match[0].to_i
  puts "Found register: #{no}"
end

Regards

robert

Jeff_Massung · 14 April 2004 17:14

Simon Strandgaard neoneye@adslhome.dk wrote in
news:pan.2004.04.13.22.41.09.300793@adslhome.dk:

[snip]

Quickly, the “teach myself” project is a simple ARM assembler. I’ve
done it numerous times, and it is something I’m familiar and
comfortable with (let alone something I need ATM).

sounds interesting… BTW: are you going to follow the test-first paradigm?

Hehe, of course.

In the regular expression, how can I get it to find r10 or r11
instead of stopping at r1? I get the same problem with opcode
mnemonics (like B instead of BX or BL).

There are many ways to do this…

I recommend reading up on ‘lexing’… a phase where you
transform the source code into tokens, with all whitespace stripped
off. Then you have your register names as you want them.

That was my approach in C. It works fine, but using regexps (so far),
I’ve been able to chip away at a source line easily. Assembly is
simple, but ARM assembly can be a bit of a headache to parse.

For example, I’ve been able to:

@source = "test: add r0,r1,r2"

if (@source =~ /(\w+):\s+(.*)/) == 0
  @label = $1
  @source = $2
end

And then continue on, using regexps on the rest of @source (stripping
the opcode from the operands), etc. It actually seems to be rather
efficient (programmatically). Whether it is efficient processor-wise, at
this point, I don’t really care. I’m more learning than optimizing.
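
The step-by-step stripping described above could also be folded into a single match. This is only a sketch under simplifying assumptions: an optional "label:", then an optional opcode, then comma-separated operands, with no comments and no bracketed addressing modes. LINE_RE and parse_line are made-up names.

```ruby
# Hypothetical line format: optional "label:", optional opcode, then
# comma-separated operands. Real ARM syntax needs more than this.
LINE_RE = /\A\s*(?:(\w+):)?\s*(?:(\w+)\s*(.*?))?\s*\z/

def parse_line(source)
  m = LINE_RE.match(source)
  return nil unless m

  # m[3] is nil when the line has no opcode, so to_s makes split safe.
  { label: m[1], opcode: m[2], operands: m[3].to_s.split(/\s*,\s*/) }
end

line = parse_line("test: add r0,r1,r2")
puts line[:label]              # test
puts line[:opcode]             # add
puts line[:operands].inspect   # ["r0", "r1", "r2"]
```

A line with no label comes back with label set to nil, and a label-only line such as "loop:" comes back with a nil opcode and an empty operand list.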

However, you asked for a regexp… you can place word-boundary
anchors around the registers. Also place the longest names first in
the alternation. But I would rather recommend the lexing path.
registers = / \b (?: r10 | r11 | r12 | … r0 | r1 | r2 ) \b /x

I’ll give that a try.

BTW: welcome to Ruby

Thanks!

And a hearty thanks to everyone else who replied. I was rather surprised
at the time taken by everyone to explain the OOP concepts behind @ and @@
(I do know C++ very well, but the generosity of the community here was
rather – delightfully overwhelming)

It appears all I really was “missing” was the concept that all code is
always in a class, whether some overall Ruby class or one that I make
myself. And that @@ is static.

Lastly, though. @ is instance, @@ is static, and $ is global. What is
nothing?

x = 10

Where does x reside in namespace, etc? How do I access it? Is it
considered just local to the current definition?

Really, thanks all!

On Tue, 13 Apr 2004 23:10:11 +0000, Jeff Massung wrote:

–
Best regards,
Jeff jma[at]mfire.com
http://www.jm-basic.com

Mark_Hubbart · 14 April 2004 18:21

x is just a plain local variable; it is inaccessible outside the
current scope. Once it goes out of scope (i.e., you close the class,
module, or function definition) its current value is gone forever.

(unless you use continuations or fiddle with bindings, but that’s
another thing altogether)
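
A small sketch of where a plain local can and cannot be seen. It uses a lambda rather than continuations or bindings, since a closure is the everyday way to keep a local reachable. The method names are made up for illustration.

```ruby
x = 10

# `def` starts a fresh scope, so the outer x does not exist in here.
def x_visible_in_def?
  !defined?(x).nil?
end

# A block or lambda is a closure: it keeps access to the locals that
# were in scope where it was written.
read_x = -> { x }

# Passing the value in as an argument is usually the simplest option.
def show(value)
  "x = #{value}"
end

p x_visible_in_def?   # => false
p read_x.call         # => 10
p show(x)             # => "x = 10"
```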

cheers,
Mark

On Apr 14, 2004, at 10:14 AM, Jeff Massung wrote:

It appears all I really was “missing” was the concept that all code is
always in a class, whether some overall Ruby class or one that I make
myself. And that @@ is static.

Lastly, though. @ is instance, @@ is static, and $ is global. What is
nothing?

x = 10

Where does x reside in namespace, etc? How do I access it? Is it
considered just local to the current definition?
