Reading website

Hey!
I have been going through some basic IO stuff and want to get into
something more "advanced".
I would like to make a script that asks for a website, takes the
information (the text) on that website and puts it into a text file..

So now I'm wondering.. how do I tell the script to download/read
text/content from a website? (Of course I do not want
anyone to actually write this script for me, I want to write it myself,
but I'm a newbie and don't know which libs etc. to read up on.)

Thanks!

You are looking for the open-uri library. See if these docs are enough to get you going:

   http://www.ruby-doc.org/stdlib/libdoc/open-uri/rdoc/index.html

James Edward Gray II

···

On Aug 17, 2006, at 2:40 PM, fabsy wrote:

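Just

require "open-uri"

# On Ruby 3.0 and later, call URI.open(url) instead; plain open no longer accepts URLs.
open(url) do |io|
  # work with io here, e.g. io.read
end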
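
For anyone reading this on a current Ruby, here is a minimal sketch of the same idea. It assumes Ruby 2.5 or newer, where open-uri provides URI.open. The helper name page_text and its opener parameter are made up for this illustration; the parameter is only there so the method can be tried without a network connection.

```ruby
require "open-uri"
require "stringio"

# Hypothetical helper: returns the body behind +url+ as a String.
# +opener+ defaults to URI.open from open-uri; any object that responds to
# call(url) and yields an IO-like object can stand in for it.
def page_text(url, opener: URI.method(:open))
  opener.call(url) { |io| io.read }
end

# Offline stand-in that yields canned content instead of hitting the network.
fake_opener = ->(_url, &block) { block.call(StringIO.new("<html>hello</html>")) }

puts page_text("http://example.com/", opener: fake_opener)
```

Leaving out the opener argument, as in page_text("http://example.com/"), would perform a real HTTP request through URI.open.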

···

On 8/17/06, fabsy <fabbyfabs@gmail.com> wrote:

fabsy wrote:

Hey!
I have been going through some basic IO stuff and want to get into
something more "advanced".
I would like to make a script who asks for a website and takes the
information (the text) on the website and put it into a textfile..

Have a look at Programming Ruby: The Pragmatic Programmer's Guide, the section
titled 'Network and Web', for the low level details.

open-uri is a nice wrapper that makes web IO as simple as file IO.

Look into Mechanize/Hpricot/etc if you want to get fancy 9^)

Cheers

I'm a little late to this party, and it might be a little advanced for you yet, but if you're scripting against websites you can do some pretty sick stuff using 'mechanize' (http://mechanize.rubyforge.org/) and 'hpricot' (http://code.whytheluckystiff.net/hpricot/).

Mechanize can click links and fill out forms. Hpricot can look for text/tags in the resulting HTML. And Mechanize supports pluggable parsers, which lets you drop Hpricot right into Mechanize. It's so buttery... :)
-Mat

···

On Aug 17, 2006, at 3:40 PM, fabsy wrote:

Thanks!
I've got it working..

---
require "open-uri"

print 'Skriv in adress: '   # Swedish for "Enter address: "
addr = gets.chomp

open(addr) do |addr|
  fil = File.open('file', 'a')
  fil << addr.read
  fil.close
end
---

Is that a good way to write it?
I also saw that the first "word" in the file 'file' was
#<StringIO:0x25350>.. What does that mean?

It means you wrote an input stream object into the file, which seems weird. Wasn't there anything in 'file' before this script started? You opened it in append mode, so that might be a leftover from some other script you wrote that used that file. Delete it, or open the file with the 'w' flag instead of 'a', and run it again.

<rant>
Also, you might want to avoid open(addr) {|addr| ... }, reusing the same variable name. It's probably not the issue here, but shadowing variables is inherently confusing. YMMV.
</rant>

David Vallner
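
A minimal sketch of the two suggestions above (the 'w' flag and distinct variable names), plus a guess at where the stray text came from. The helper name save_text is hypothetical, and a StringIO stands in for the object that open(addr) yields so the snippet runs without a network connection.

```ruby
require "stringio"
require "tmpdir"

# Hypothetical helper: copies everything readable from +source+ into +path+.
# Mode "w" truncates the file first, so leftovers from earlier runs disappear;
# the block form of File.open closes the file even if an error is raised.
def save_text(source, path)
  File.open(path, "w") { |file| file << source.read }
end

page = StringIO.new("page body")   # stand-in for the stream open(addr) yields

Dir.mktmpdir do |dir|
  target = File.join(dir, "file")
  save_text(page, target)
  puts File.read(target)           # the content itself

  # Appending the object instead of its content writes the object's to_s,
  # which is how text like #<StringIO:0x...> can end up in a file.
  File.open(target, "w") { |file| file << page }
  puts File.read(target)
end
```

The second write only illustrates the mistake; an earlier version of a script doing fil << addr without .read would leave exactly that kind of text behind in a file opened for append.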

···

On Thu, 17 Aug 2006 22:15:10 +0200, fabsy <fabbyfabs@gmail.com> wrote:

Aha! Yes, it was something old.
The file was opened in append mode..

And sorry for all the questions (I'm really trying to learn) :)
But is there anything like this in Ruby?

--VB code--

If InStr(1, Text1.Text, "String to find", 1) > 0 Then Msgbox "I found the string!"

--------

( If InStr(Start,String1,String2,CompareMode)>0 Then Msgbox "I found the string!")

Does VB index strings starting with 1?

For very simple searching, use String#include?

  >> "foobar".include? "ooba"
  => true

If you want to check if there's a substring to be found -after- (and not including) the second character, for example, then I'd use a substring:

  >> "foobar"[2..-1].include? "ooba"
  => false

  >> "foobar"[2..-1].include? "obar"
  => true

What VB's CompareMode is, I have no idea. Case sensitivity? If so, I'd personally just call String#downcase on both strings in the conditional.

And, of course, there are always regular expressions, the Swiss Army knife of string searching, but those are for a longer discussion, and you can probably find material out there that explains them rather nicely.

David Vallner
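
Putting those pieces together, a rough Ruby counterpart to the InStr check could look like the sketch below. The method name found? and its keyword arguments are invented for this example, and start is zero-based as usual in Ruby, not one-based as in VB.

```ruby
# Hypothetical helper mirroring the VB check: does +needle+ occur in +text+
# at or after the zero-based offset +start+?
def found?(text, needle, start: 0, ignore_case: false)
  if ignore_case
    text = text.downcase
    needle = needle.downcase
  end
  tail = text[start..-1]            # nil when start is past the end
  !tail.nil? && tail.include?(needle)
end

puts found?("foobar", "ooba")                                # true
puts found?("foobar", "ooba", start: 2)                      # false
puts found?("foobar", "OBAR", start: 2, ignore_case: true)   # true
```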

···

On Thu, 17 Aug 2006 23:55:05 +0200, fabsy <fabbyfabs@gmail.com> wrote:

Ok!
But that doesn't tell me the position of the string..
Say that you have a text file with the sentence "Could you pass me the
milk please?" and I search for the string "milk". Then I want to
know the position where the search word starts so I can extract the
word into a variable.... How do I do that?

And yes, VB string indexes start at 1..

···

On Thu, 17 Aug 2006 23:55:05 +0200, fabsy <fabbyfabs@gmail.com> wrote:

Use String#index, or String#=~. The former works with strings and regular expressions, the latter requires regular expressions, but I think it also saves the match data into the special global variable I don't use or remember the name of... Anyhoo:

  >> "foobar".index "ooba"
  => 1

  >> "foobar".index "bar"
  => 3

  >> "foobar".index /bar/
  => 3

  >> "foobar".index /ooba/
  => 1

For word extraction, I'm usually very, very bad at doing it with substring indexes, so I abuse regexps wildly each and every time I need to get some piece of text out of another one.

David Vallner
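
Applied to the milk sentence from the question, a small sketch might look like this. The helper names find_word and find_pattern are made up for the example; the first uses String#index as described above, the second swaps in String#match, which returns a MatchData object carrying both the offset and the matched text.

```ruby
# Hypothetical helper: position and text of the first occurrence of +word+,
# or nil when it does not occur. Positions are zero-based.
def find_word(text, word)
  pos = text.index(word)
  pos && [pos, text[pos, word.length]]
end

# Hypothetical helper: the same idea with a regular expression.
def find_pattern(text, regexp)
  match = text.match(regexp)        # MatchData, or nil when nothing matches
  match && [match.begin(0), match[0]]
end

sentence = "Could you pass me the milk please?"
p find_word(sentence, "milk")       # [22, "milk"]
p find_pattern(sentence, /mi\w+/)   # [22, "milk"]
p find_word(sentence, "juice")      # nil
```

The special global mentioned above is $~, which is also reachable as Regexp.last_match after a successful =~.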

···

On Fri, 18 Aug 2006 00:40:11 +0200, fabsy <fabbyfabs@gmail.com> wrote: