Ruby Code Parsing

Jonathan_Bale · 18 August 2010 20:31

I have a Perl friend asking me questions about how Ruby parses its code.
Things like: how does it know where a statement ends, and where can and
can't you put newlines? He is basically trying to understand how Ruby
differs from Perl and from Python in the way its code is parsed.

I have a good /practical/ understanding of how it works, but have
difficulty delineating in one e-mail all the rules of how it works. Is
there a particular document (please give specific section) on the
Internet that I could link him to that explains all the rules?

--
Posted via http://www.ruby-forum.com/.

Ryan_Davis1 · 18 August 2010 22:26

Well... You can look at ruby_parser.y in the ruby_parser gem. It is "only" ~1800 lines long. I wouldn't recommend it tho, except as evidence that it is statically parseable.

The long and the short of it is that it is somewhere between perl (which basically can't be statically parsed) and python (which is highly regular, but which I can't mentally parse, because things like list comprehensions are anything but comprehensible to me).
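
As a rough illustration of what "statically parseable" means here, the sketch below builds a syntax tree from Ruby source text without running it. It uses Ripper from Ruby's standard library rather than the ruby_parser gem mentioned above; that substitution is an assumption made so the example needs no gem install. The helper name `statically_parseable?` and the global `$parsed_only` are made up for the example.

```ruby
require 'ripper'

# Hypothetical helper: true when the text parses as Ruby.
# Ripper.sexp returns a nested array describing the syntax tree,
# or nil when the text contains a syntax error.
def statically_parseable?(source)
  !Ripper.sexp(source).nil?
end

# This text would set a global variable if it were ever executed.
source = "$parsed_only = :executed\nputs 1 +\n  2\n"

tree = Ripper.sexp(source)
puts tree.first.inspect                       # root node type of the tree
puts $parsed_only.inspect                     # still nil: nothing was run
puts statically_parseable?("puts 1 +\n  2")   # operand is on the next line
puts statically_parseable?("puts 1 +")        # operand never arrives
```

The point is only that the tree comes from the text alone; the assignment inside `source` is never executed.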

Caleb_Clausen1 · 22 August 2010 23:38

> I have a Perl friend asking me questions about how ruby parses its code.
> Things like, how does it know where a statement ends and where can you
> and can't you put new lines -- basically trying to understand how Ruby
> differs from Perl and from Python in regard to parsing Ruby code.

Statements generally end with a semicolon or a newline, but not all
newlines are statement terminators. A newline is treated as whitespace
if it is preceded by an operator or some other token (e.g. '(', '[', '{')
that requires an expression to follow it. There are some other
special cases.
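
A minimal sketch of those rules in plain Ruby. The method name `newline_examples` is made up for illustration, and the leading-dot case assumes Ruby 1.9 or later.

```ruby
# Hypothetical method that exercises each rule and returns the results.
def newline_examples
  # A newline after an operator does not end the statement,
  # because the operator still needs a right-hand operand.
  sum = 1 +
        2

  # A newline right after a complete expression does end the statement;
  # the next line is parsed as a separate expression (unary plus on 2).
  partial = 1
  + 2

  # Inside an open bracket, newlines are treated as whitespace.
  list = [
    1,
    2,
    3
  ]

  # A semicolon ends a statement without a newline.
  a = 1; b = 2

  # One of the special cases: a line that starts with a dot
  # continues the previous expression as a method call.
  shout = "ruby"
    .upcase

  { sum: sum, partial: partial, list: list, pair: [a, b], shout: shout }
end

p newline_examples
```

Note that `partial` ends up as 1, not 3: the second line is a statement of its own whose value is discarded.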

> I have a good /practical/ understanding of how it works, but have
> difficulty delineating in one e-mail all the rules of how it works. Is
> there a particular document (please give specific section) on the
> Internet that I could link him to that explains all the rules?

Ruby is designed to be easy for humans to read, which means it is not
easy for computers to parse. As opposed to say, perl, which is hard
for both humans and computers to parse. If you were to write down in
one email a complete set of rules, it would be a pretty long email.

You might take a look at the Ruby draft standard, which is fairly
complete. For a lot of stuff, though, it just gives BNF with no verbal
description.

Suraj_Kurapati2 · 24 August 2010 04:16

Jonathan Bale wrote:

> I have a Perl friend asking me questions about how ruby parses its code.
>
> Is there a particular document (please give specific section) on the
> Internet that I could link him to that explains all the rules?

How about the "To Ruby From Perl" mini-guide on the Ruby website?

http://www.ruby-lang.org/en/documentation/ruby-from-other-languages/to-ruby-from-perl/

It doesn't explain *all* the rules, but it's a worthy starting point.

Cheers.

Chad_Perrin · 23 August 2010 03:12

I find Perl pretty easy to parse. Maybe yours is a personal problem.

On Mon, Aug 23, 2010 at 08:38:40AM +0900, Caleb Clausen wrote:

> Ruby is designed to be easy for humans to read, which means it is not
> easy for computers to parse. As opposed to say, perl, which is hard
> for both humans and computers to parse. If you were to write down in
> one email a complete set of rules, it would be a pretty long email.

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

Ryan_Davis1 · 23 August 2010 07:20

No, you do not find perl pretty easy to parse. Last time I checked, you can NOT statically parse perl. You have to evaluate it in order to get a proper parse. So, you find it pretty easy to evaluate. No easy task for me, I'll admit, but there is no reason for you to be an ass about it.

Mark_Thomas · 23 August 2010 13:50

There's a saying in the Perl community... "Only perl can parse
Perl." (capitalization intentional--that is, "perl" is the executable
and "Perl" is the language).

On Aug 23, 3:20 am, Ryan Davis <ryand-r...@zenspider.com> wrote:

> No, you do not find perl pretty easy to parse. Last time I checked, you can NOT statically parse perl. You have to evaluate it in order to get a proper parse.

Chad_Perrin · 23 August 2010 15:10

Regardless of nitpicky phrasing, I found your comment about Perl kind of
ass-ish. It seems kind of ironic to be called an ass for pointing out
that your problems with Perl may not be others' problems, when you make a
categorical deprecating statement about Perl.

Tony_Arcieri4 · 23 August 2010 15:16

I think Ryan was pointing out Perl can't be parsed:

http://www.perlmonks.org/?node_id=663393

--
Tony Arcieri
Medioh! A Kudelski Brand

Caleb_Clausen1 · 23 August 2010 16:35

First of all, do not confuse me with Ryan.

My statement about perl was perfectly accurate, in addition to being
deprecatory. Although perl has been an interesting experiment in
language design, it can at best be seen as a rough draft for ruby. No
one, not even its mother, has ever claimed perl is beautiful. Why do
you think the symbol of perl is a camel?

What I mostly find unreadable about perl code are the many uses of
sigils, particularly dollar signs which interrupt the flow of the
narrative in perl programs. The special funnily-named global variables
($_, $\, etc) are pretty ugly too. Of course, ruby has these as well,
but they just don't seem to be needed as much. The fact that some
variables sometimes start with a $ and other times with a @ is also
disturbing. None of these are the reasons that make perl difficult for
programs to parse; those are entirely different problems.

Ryan_Davis1 · 24 August 2010 06:38

There is nothing nitpicky about it. YOU _can't_ parse perl, yet you claim you can and that it is pretty easy.

If you think it is so easy, here is a challenge: write a parser for perl in ruby. It can't be that hard, right? I have one in ruby, and it "only" took 145 commits (so far) to do. I don't care what type of parser style you choose (LALR, RDP, PEG, etc), and I don't care if you choose to port someone else's parser to ruby... but you can NOT evaluate/interpret the code in order to determine your parse trees. It has to be a 100% static parse.

P.S. Even tho it was Caleb who made the "deprecating statement" about perl (nice double-entendre considering we're all still waiting for perl 6... 10 years later), I totally agree with him on this one.

Chad_Perrin · 23 August 2010 20:47

> First of all, do not confuse me with Ryan.

You're right -- I apologize. That was sloppy of me.

> My statement about perl was perfectly accurate, in addition to being
> deprecatory. Although perl has been an interesting experiment in
> language design, it can at best be seen as a rough draft for ruby. No
> one, not even its mother, has ever claimed perl is beautiful. Why do
> you think the symbol of perl is a camel?

"it can at best be seen as a rough draft for ruby"

. . . except for the things it does better, I suppose. I find both Ruby
and Perl quite useful, each more so in some contexts than others, so that
neither of them feels like it should be tossed out of my toolbox, and
neither is always a better choice than the other.

> What I mostly find unreadable about perl code are the many uses of
> sigils, particularly dollar signs which interrupt the flow of the
> narrative in perl programs. The special funnily-named global variables
> ($_, $\, etc) are pretty ugly too. Of course, ruby has these as well,
> but they just don't seem to be needed as much. The fact that some
> variables sometimes start with a $ and other times with a @ is also
> disturbing. None of these are the reasons that make perl difficult for
> programs to parse; those are entirely different problems.

Sigils serve as a common kicking-dog for people looking for excuses to
dislike Perl, I've noticed. What you find disturbing, however, others
sometimes find useful within the context of Perl's syntactic model. As
for those "funnily-named global variables", they are often better used
implicitly than explicitly so that they are rarely needed (to be seen) in
Perl, too.

Of course, if you never grasped that, it might help explain your aversion
to a language that has them, I suppose.

Chad_Perrin · 24 August 2010 16:43

You apparently think I'm saying my brain has an implementation of the
perl runtime installed on it. By that standard, none of us can parse
Perl, or Ruby, or Python, or C, or Scheme, or any other nontrivial
computer programming language.

I'm just saying that I can look at some Perl code and figure out what
it's doing -- a necessary task if you're going to try to maintain some
Perl code.

Ryan_Davis1 · 24 August 2010 21:46

And I'm "just saying" that you were being an ass to Caleb when you suggested his comment about perl to be a personal problem. His comment is a fact. Perl is difficult for both man and machine to parse. I'm also "just saying" that you're wrong when you said that you found perl pretty easy to parse. You don't. You can't. That's been pointed out several times and you dance around it. You look at perl and "figure out what it's doing", meaning: you EVALUATE it in your head in order to determine what the code is doing. If you don't, then you're not as good as you think you are and perl is a lot more difficult than you realize. (or as a possible alternative, you're only dealing with kindergarten-level perl--which is a good thing really.)

By the standard you think I'm setting, MANY of us can mentally statically parse ruby, python, most C, and especially scheme, just like MANY of us learned how to diagram natural languages like English in grade school.

In increasing order of difficulty to parse: scheme, python, C, ruby.

Notice that perl isn't on that list.

Chad_Perrin · 25 August 2010 06:54

Yes, I saw all this.

No, I don't think it's productive to have this discussion on the Ruby
list.

Anything else I might have to say . . . won't be said.
