What's the matter with my program: Web Analysis

Pen_Ttt · 23 March 2010 04:18

HTMLRegexp =/(<!--.*?--\s*>)|
(<(?:[^"'>]*|"[^"]*"|'[^']*')+>)|
([^<]*)/xm

data =DATA.read

data.scan(HTMLRegexp){|match|
comment,tag,tdata=match[0..2]
if comment
   p ["Comment",comment]
elseif tag
  p ["Tag",tag]
elseif tdata
  tdata.gsub!(/\s+/,"")
  tdata.sub!(/ $/,"")
  p [ "TextData",tdata] unless tdata.empty?

end
}
_END_
<!DOCTYPE HTML>
<HTML>
<BODY>
    < A name="FOO" href="foo" attr >foo</A>
    < A name="BAR" href="bar" attr >bar</A>
    < A name=BAZ href=baz attr >baz</A>
    
  <BODY>
</HTML>

When I run it, the output is:
syntax error, unexpected '<', expecting $end
<!DOCTYPE HTML>
^
What's the problem? How can I solve it?

--
Posted via http://www.ruby-forum.com/.

Thomas_P · 23 March 2010 06:48

it's __END__ not _END_

-Thomas
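
For context: Ruby stops parsing at a line that consists of exactly __END__ (two underscores on each side) and makes the rest of the file readable through the DATA constant. With _END_, the HTML that follows is parsed as Ruby code, which would explain the syntax error at '<'. Here is a minimal sketch that writes two tiny scripts to temporary files and runs them. The helper name run_script and the temp-file prefix are made up for illustration and are not part of the original program.

```ruby
require "open3"
require "rbconfig"
require "tempfile"

# Hypothetical helper: save the source as a script, run it with the current
# Ruby interpreter, and return [combined stdout/stderr, success flag].
def run_script(source)
  Tempfile.create(["end_marker_demo", ".rb"]) do |file|
    file.write(source)
    file.flush
    output, status = Open3.capture2e(RbConfig.ruby, file.path)
    [output, status.success?]
  end
end

# The scripts are built from escaped strings so that no line of this
# example itself consists of the end marker.
good_script = "print DATA.read\n__END__\n<HTML>\n"
bad_script  = "print DATA.read\n_END_\n<HTML>\n"

good_output, good_ok = run_script(good_script)
_bad_output, bad_ok  = run_script(bad_script)

puts "with __END__: ok=#{good_ok}, DATA=#{good_output.inspect}"
puts "with _END_:   ok=#{bad_ok}"
```

The second script should fail before any of it runs, because the line after _END_ is not valid Ruby.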

--
Thomas Preymesser
thopre@gmail.com

Glenn_Jackman · 23 March 2010 17:20

At 2010-03-23 12:18AM, "Pen Ttt" wrote:
> HTMLRegexp =/(<!--.*?--\s*>)|
> (<(?:[^"'>]*|"[^"]*"|'[^']*')+>)|
> ([^<]*)/xm

Please consider using a real HTML parser, such as Nokogiri.

>   if comment
>     p ["Comment",comment]
>   elseif tag

"elsif" not "elseif"

--
Glenn Jackman
Write a wise saying and your name will live forever. -- Anonymous

Pen_Ttt · 23 March 2010 10:26

Thomas Preymesser wrote:
> it's __END__ not _END_
>
> -Thomas

I changed _END_ to __END__, but it didn't help.
Please run it on your computer to see what happens. Thank you.


Pen_Ttt · 24 March 2010 01:48

There were two bugs in my first program:
1. It's __END__, not _END_
2. It's "elsif", not "elseif"
I changed my program to:
#the filename is: /home/pt/htmlscan_test.rb
HTMLRegexp =/(<!--.*?--\s*>)|
(<(?:[^"'>]*|"[^"]*"|'[^']*')+>)|
([^<]*)/xm

data =DATA.read

data.scan(HTMLRegexp){|match|
  comment,tag,tdata=match[0..2]
  if comment
    p ["Comment",comment]
  elsif tag
    p ["Tag",tag]
  elsif tdata
    tdata.gsub!(/\s+/,"")
    tdata.sub!(/ $/,"")
    p ["TextData",tdata] unless tdata.empty?
  end
}
__END__
<!DOCTYPE HTML>
<HTML>
<BODY>
    < A name="FOO" href="foo" attr >foo</A>
    < A name="BAR" href="bar" attr >bar</A>
    < A name=BAZ href=baz attr >baz</A>
    
  <BODY>
</HTML>

There is another problem, too.
It runs in NetBeans IDE 6.8 and gives the correct output:
["Tag", "<!DOCTYPE HTML>"]
["Tag", "<HTML>"]
["Tag", "<BODY>"]
["Tag", "< A name=\"FOO\" href=\"foo\" attr >"]
["TextData", "foo"]
["Tag", "</A>"]
["Tag", "< A name=\"BAR\" href=\"bar\" attr >"]
["TextData", "bar"]
["Tag", "</A>"]
["Tag", "< A name=BAZ href=baz attr >"]
["TextData", "baz"]
["Tag", "</A>"]
["Comment", ""]
["Tag", "<BODY>"]
["Tag", "</HTML>"]

But when I run it in a terminal:
pt@pt-laptop:~$ ruby /home/pt/htmlscan_test.rb
/home/pt/htmlscan_test.rb:20: syntax error, unexpected '<', expecting $end
<!DOCTYPE HTML>
^

What's the matter?
Can you try it on your computer?
Please help me.


Thomas_P · 23 March 2010 11:00

On 23 March 2010 11:26, Pen Ttt <myocean135@yahoo.cn> wrote:
> Thomas Preymesser wrote:
> > it's __END__ not _END_
> >
> > -Thomas
>
> I changed _END_ to __END__, but it didn't help.
> Please run it on your computer to see what happens. Thank you.

http://en.wikibooks.org/wiki/Ruby_Programming/Syntax/Control_Structures#if

--
Thomas Preymesser
thopre@gmail.com

Brian_Candler · 23 March 2010 14:06

Pen Ttt wrote:
> Please run it on your computer to see what happens. Thank you.

z.rb:11: undefined method `elseif' for main:Object (NoMethodError)

That's trying to tell you something.
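
To spell out the hint: elseif is not a Ruby keyword, so Ruby reads "elseif tag" as a call to a method named elseif, sitting inside the body of the if. That is why the script parses, and why it only fails once the first branch is actually taken. A minimal sketch follows. The method names classify_broken and classify are invented for illustration and do not appear in the original script.

```ruby
# `elseif tag` below is parsed as the method call elseif(tag), and every
# line up to `end` belongs to the body of the single `if`.
def classify_broken(comment, tag)
  result = nil
  if comment
    result = ["Comment", comment]
    elseif tag
    result = ["Tag", tag]
  end
  result
end

# Same logic with the real keyword, `elsif`.
def classify(comment, tag)
  if comment
    ["Comment", comment]
  elsif tag
    ["Tag", tag]
  end
end

# When the `if` condition is true, the body runs and the bogus call fails.
broken_error =
  begin
    classify_broken("<!-- note -->", nil)
    nil
  rescue NoMethodError => e
    e
  end

p classify_broken(nil, "<HTML>")  # condition is false, so the body is skipped
p broken_error.name               # name of the method Ruby could not find
p classify(nil, "<HTML>")
```

With the misspelling, tags and text are silently ignored, and the error appears only when a comment is matched.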


Josh_Cheek · 24 March 2010 02:37

It works for me on 1.8.6, 1.8.7, and 1.9.1
http://img13.imageshack.us/img13/122/picture1svu.png


Pen_Ttt · 24 March 2010 02:57

I restarted my computer, ran the script again, and got the correct output.
There is still something I would like to understand. What do these two lines mean?

data.scan(HTMLRegexp){|match|
comment,tag,tdata=match[0..2]

I would like to read some material about the String#scan method, because this part of the script is hard for me to follow.
Can you explain it to me?


Josh_Cheek · 24 March 2010 03:01

Scan docs are here (this is 1.8.6)
http://ruby-doc.org/core/classes/String.html#M000812
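
In short: when the pattern contains capture groups, String#scan produces one array per match with one slot per group, and with a block it yields each of those arrays in turn. Only the group belonging to the alternative that matched is non-nil, which is what the if/elsif chain tests. The sketch below uses a simplified stand-in pattern (an assumption, not the HTMLRegexp from the script) and collects the results in a hypothetical events array instead of printing them.

```ruby
# Simplified stand-in for HTMLRegexp: three alternatives, one group each.
pattern = /(<!--.*?-->)|(<[^>]*>)|([^<]+)/m

html = "<p>hi</p><!-- note -->"

# Without a block, scan returns all matches as arrays of group captures.
rows = html.scan(pattern)
p rows

# With a block, each array is passed to the block as `match`.
# Parallel assignment unpacks its three slots into three variables;
# match[0..2] is simply the whole array here, since there are three groups.
events = []
html.scan(pattern) do |match|
  comment, tag, tdata = match[0..2]
  if comment
    events << [:comment, comment]
  elsif tag
    events << [:tag, tag]
  elsif tdata
    events << [:text, tdata]
  end
end
p events
```

Because the block receives an array, the block parameters could also be written as |comment, tag, tdata| directly, which makes the match[0..2] slice unnecessary.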

