Reading String Data as a File

I use Net::HTTP to collect some data as a string. I now need to pass
that string data to a Ruby method that is expecting to receive the data
from a file (i.e., the method expects the data to be stored in a file
and to have a path to the file passed to it as a parameter). Is there
any way to resolve this dilemma short of writing the string data to a
file and then reading it in from the file?

Thanks for any input.

      ... doug


--
Posted via http://www.ruby-forum.com/.

ri StringIO


On Jun 28, 2010, at 17:43 , Doug Jolley wrote:

I use Net::HTTP to collect some data as a string. I now need to pass
that string data to a Ruby method that is expecting to receive the data
from a file (i.e., the method expects the data to be stored in a file
and to have a path to the file passed to it as a parameter). Is there
any way to resolve this dilemma short of writing the string data to a
file and then reading it in from the file?

Ryan Davis wrote:

ri StringIO

That will work if the code in question will accept an open File/IO
object as an argument.
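
For illustration, a minimal sketch of the StringIO case. Here count_lines is a hypothetical stand-in for a method that reads from whatever IO-like object it is handed; it is not code from this thread.

```ruby
require "stringio"

# Hypothetical stand-in for a method that accepts an already-open,
# IO-like object instead of a pathname.
def count_lines(io)
  count = 0
  io.each_line { count += 1 }
  count
end

# Data that would otherwise have come from Net::HTTP.
body = "first line\nsecond line\n"

# StringIO wraps the string in an IO-like interface; no file is written.
puts count_lines(StringIO.new(body))
```

This only helps when the receiving method calls IO methods such as each_line or read on its argument.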

If it takes only a pathname argument, then you're stuck with writing the
data to a file (ri Tempfile may help).

If you have control of the target code, then refactor it. e.g.

class Foo
  # original entry point
  def read_file(pathname)
    File.open(pathname,"rb") { |f| read_io(f) }
  end

  # entry point for already-open object, e.g. STDIN, a StringIO etc.
  def read_io(io)
    io.each_line { ... }
  end
end


If it takes only a pathname argument, then you're
stuck with writing the data to a file

Unfortunately that is precisely my case and that is precisely what I was
trying to avoid. (And, unfortunately, I don't have any control over the
target code.)

Interestingly, a post that I found seemed to say that I could use the
StringIO approach in the case where a pathname argument was required.
The post said:

An easy way to work with a string in a method that is expecting
a file is to create a new StringIO object and pass the result to
the method requiring a file type. For example:

some_method(StringIO.new("Your string here"))

He did say, "file". It's just that usually methods that follow that
form are expecting a path. Anyway, as one might expect, it didn't work
for me. I get the following error:

./test1:5:in `read': can't convert StringIO into String (TypeError)

As Ryan says, I guess that I'm stuck with writing this out to a temp file.
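
The error can be reproduced in isolation. A small sketch, assuming the target ends up opening its argument as a filename with something like File.read (the thread does not show the target's code, so path_only_reader is hypothetical):

```ruby
require "stringio"

# Hypothetical path-only method, assumed to treat its argument as a filename.
def path_only_reader(path)
  File.read(path)
end

begin
  path_only_reader(StringIO.new("Your string here"))
rescue TypeError => e
  # The exact wording of the message differs between Ruby versions.
  puts "TypeError: #{e.message}"
end
```

A StringIO cannot be converted to a path, so any method that hands its argument straight to File.open or File.read will fail this way.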

Thanks to all who responded to my inquiry.

           ... doug


It's Ruby. You can always patch or alias_method_chain the target code if
you're willing to bear some slight brittleness.
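
A rough sketch of what such a patch might look like. Target and its parse_file method are hypothetical stand-ins for the third-party code, and plain alias_method is used here rather than ActiveSupport's alias_method_chain. The patch has to repeat the original's processing step, which is where the brittleness comes from.

```ruby
require "stringio"

# Hypothetical third-party class that only accepts a pathname.
class Target
  def parse_file(path)
    File.open(path, "rb") { |f| f.read.upcase }
  end
end

# Reopen the class: keep the original under a new name and let the
# replacement accept IO-like objects as well as pathnames.
class Target
  alias_method :parse_file_from_path, :parse_file

  def parse_file(source)
    if source.respond_to?(:read)
      source.read.upcase   # duplicates the original's logic (the brittle part)
    else
      parse_file_from_path(source)
    end
  end
end

puts Target.new.parse_file(StringIO.new("layout text"))
```

If the original method changes in a later release, the duplicated branch silently goes out of date.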


On Tue, Jun 29, 2010 at 11:50 AM, Doug Jolley <ddjolley@gmail.com> wrote:

Unfortunately that is precisely my case and that is precisely what I was
trying to avoid. (And, unfortunately, I don't have any control over the
target code.)

--
Tony Arcieri
Medioh! A Kudelski Brand

It's Ruby. You can always patch or alias_method_chain the target code
if you're willing to bear some slight brittleness.

Good point. I've been considering whether I should re-think my position
that the underlying code is inaccessible. The truth is, the block of
data that I have in memory is actually a Rails layout. I was reluctant
to mention the Rails aspects in this forum. So, I don't know if I could
ever figure out what would need to be done; but, your idea is definitely
a good one. Thanks for the input.

      ... doug


That is EXACTLY what I was coming back to say... Tony beat me to it.


On Jun 29, 2010, at 14:20 , Tony Arcieri wrote:

On Tue, Jun 29, 2010 at 11:50 AM, Doug Jolley <ddjolley@gmail.com> wrote:

Unfortunately that is precisely my case and that is precisely what I was
trying to avoid. (And, unfortunately, I don't have any control over the
target code.)

It's Ruby. You can always patch or alias_method_chain the target code if
you're willing to bear some slight brittleness.

Is this always possible? Wouldn't you need some knowledge of the
inner workings of the target code? In this case for example, does it
open the file with File.open or maybe with File.foreach?

This is an interesting point of interface design: usually it is more
convenient to just pass a file name somewhere and that method opens
the file (or URL) and reads the data. But from a modularity point of
view it is generally better to pass an open IO like instance.

You can nicely layer this e.g.

class X
  # convenience method that will open the file for you
  def read_file(path)
    File.open(path) do |io|
      read io
    end
  end

  # yet another convenience method
  def read_url(url)
    ...
  end

  # read the data
  def read(io)
     io.each_line do |line|
       # whatever
     end
  end
end

The only drawback here is the additional method needed but convenience
comes at a price. :-)

Kind regards

robert


2010/6/29 Tony Arcieri <tony.arcieri@medioh.com>:

On Tue, Jun 29, 2010 at 11:50 AM, Doug Jolley <ddjolley@gmail.com> wrote:

Unfortunately that is precisely my case and that is precisely what I was
trying to avoid. (And, unfortunately, I don't have any control over the
target code.)

It's Ruby. You can always patch or alias_method_chain the target code if
you're willing to bear some slight brittleness.

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Robert Klemme wrote:

It's Ruby. You can always patch or alias_method_chain the target code if
you're willing to bear some slight brittleness.

Is this always possible? Wouldn't you need some knowledge of the
inner workings of the target code? In this case for example, does it
open the file with File.open or maybe with File.foreach?

You simply find that part of the code, and replace the offending
method(s) with something else. In the limit, you replace everything with
your own code :-)

It would be convenient to be able to mock out File and Dir with a
virtual, in-RAM filesystem. I'm not aware of a library which does that,
but in principle I think it could be done.

This is an interesting point of interface design: usually it is more
convenient to just pass a file name somewhere and that method opens
the file (or URL) and reads the data. But from a modularity point of
view it is generally better to pass an open IO like instance.

Definitely. The original csv.rb in ruby 1.8 got this very badly wrong.

The new (faster_csv) interface is capable of this, but it suffers from
missing documentation. IIRR you have to do something like

FasterCSV.new($stdin).each do |row|
  p row
end

Since the documented "primary" interface is
FasterCSV.foreach("path/to/file.csv"), you have to dig through the code
to work out how to handle an open stream.
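
For reference, a small sketch of the same idea with the standard csv library, which is what FasterCSV became in Ruby 1.9. It assumes require "csv" is available (on recent Rubies csv ships as a bundled gem) and uses a StringIO in place of $stdin so the example is self-contained; rows_from is a name made up for this sketch.

```ruby
require "csv"
require "stringio"

# Any IO-like object works here: $stdin, an open File, or a StringIO.
def rows_from(io)
  rows = []
  CSV.new(io).each { |row| rows << row }
  rows
end

p rows_from(StringIO.new("a,b,c\n1,2,3\n"))
```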


Robert Klemme wrote:

It's Ruby. You can always patch or alias_method_chain the target code if
you're willing to bear some slight brittleness.

Is this always possible? Wouldn't you need some knowledge of the
inner workings of the target code? In this case for example, does it
open the file with File.open or maybe with File.foreach?

You simply find that part of the code, and replace the offending
method(s) with something else. In the limit, you replace everything with
your own code :-)

That's what I always wanted to do - seems I have to resurrect my
WorldDomination gem. :-)

It would be convenient to be able to mock out File and Dir with a
virtual, in-RAM filesystem. I'm not aware of a library which does that,
but in principle I think it could be done.

Well, /tmp is in memory on many systems and writing a small file is
also a mostly in memory operation. Of course, this is not as cheap as
doing it completely in userland but probably sufficient for many
applications (although it's not really nice). At least one can use
Tempfile for this, e.g.

Tempfile "prefix", "/tmp" do |io|
  io.write everything

  io.seek 0
  whatever_load_routine io
end

This is an interesting point of interface design: usually it is more
convenient to just pass a file name somewhere and that method opens
the file (or URL) and reads the data. But from a modularity point of
view it is generally better to pass an open IO like instance.

Definitely. The original csv.rb in ruby 1.8 got this very badly wrong.

The new (faster_csv) interface is capable of this, but it suffers from
missing documentation. IIRR you have to do something like

FasterCSV.new($stdin).each do |row|
  p row
end

Since the documented "primary" interface is
FasterCSV.foreach("path/to/file.csv"), you have to dig through the code
to work out how to handle an open stream.

Or have the idea to look at "ri CSV.new"...

Thanks for the hint. This is good to know.

Cheers

robert


2010/6/30 Brian Candler <b.candler@pobox.com>:


Robert Klemme wrote:

This is an interesting point of interface design: usually it is more
convenient to just pass a file name somewhere and that method opens
the file (or URL) and reads the data. But from a modularity point of
view it is generally better to pass an open IO like instance.

Definitely. The original csv.rb in ruby 1.8 got this very badly wrong.

The new (faster_csv) interface is capable of this, but it suffers from
missing documentation.

I agree that FasterCSV's documentation isn't perfect. I'm pretty sure all of its functions are documented, but you would need to read the API like a novel to find them. I've been trying more tutorial style documentation lately, but there again it's hard to reference what you specifically want to know.

I'm open to suggestions and I do take patches.

IIRR you have to do something like

FasterCSV.new($stdin).each do |row|
  p row
end

That works, yes.

Since the documented "primary" interface is
FasterCSV.foreach("path/to/file.csv"), you have to dig through the code
to work out how to handle an open stream.

That's mostly due to a pet peeve of mine. I often see code that slurps when foreach() would have worked fine. That's why I try to push that as a first choice.

Do you think it would help if I added Wrapping an IO under the Shortcut Interface on this page?

http://fastercsv.rubyforge.org/classes/FasterCSV.html

James Edward Gray II


On Jun 30, 2010, at 7:53 AM, Brian Candler wrote:

Robert Klemme wrote:

At least one can use Tempfile for this, e.g.

Tempfile "prefix", "/tmp" do |io|
  io.write everything

  io.seek 0
  whatever_load_routine io
end

or rather:

Tempfile.open "prefix", "/tmp" do |io|
  io.write everything
  io.flush
  whatever_load_routine io.path
end


James Edward Gray II wrote:

I'm open to suggestions and I do take patches.

Specifically, I'd like to see how to parse CSV from stdin. You provide
an example in the opposite direction:

# FCSV($stderr) { |csv_err| csv_err << %w{my data here} } # to $stderr

A bit more experimentation suggests that

    FCSV($stdin).each { |a,b,c| p a,b,c }

works, so if that's a reasonable way to drive the library, I'd like to
see that mentioned under shortcuts. (I thought I'd tried that before and
it failed, but I must have done something different)


That's mostly due to a pet peeve of mine. I often see code that
slurps when foreach() would have worked fine. That's why I try to
push that as a first choice.

I wholeheartedly agree.

Do you think it would help if I added Wrapping an IO under the
Shortcut Interface on this page?

http://fastercsv.rubyforge.org/classes/FasterCSV.html

+1

  robert


On 30.06.2010 17:05, James Edward Gray II wrote:


Robert Klemme wrote:

At least one can use Tempfile for this, e.g.

Tempfile "prefix", "/tmp" do |io|
  io.write everything

  io.seek 0
  whatever_load_routine io
end

or rather:

Tempfile.open "prefix", "/tmp" do |io|
  io.write everything
  io.flush

I'd rather io.close instead of io.flush to release resources as soon
as possible.

  whatever_load_routine io.path
end

Ooops! Yes, of course. I copied the wrong example. Sorry for my confusion.

Cheers

robert


2010/6/30 Brian Candler <b.candler@pobox.com>:


Better?

http://fastercsv.rubyforge.org/classes/FasterCSV.html

James Edward Gray II


On Jun 30, 2010, at 10:41 AM, Brian Candler wrote:

James Edward Gray II wrote:

I'm open to suggestions and I do take patches.

Specifically, I'd like to see how to parse CSV from stdin. You provide
an example in the opposite direction:

# FCSV($stderr) { |csv_err| csv_err << %w{my data here} } # to $stderr

On Jun 30, 2010, at 11:35 AM, Robert Klemme wrote:

On 30.06.2010 17:05, James Edward Gray II wrote:

Do you think it would help if I added Wrapping an IO under the
Shortcut Interface on this page?

http://fastercsv.rubyforge.org/classes/FasterCSV.html

+1

Robert Klemme wrote:

or rather:

Tempfile.open "prefix", "/tmp" do |io|
  io.write everything
  io.flush

I'd rather io.close instead of io.flush to release resources as soon
as possible.

But tempfile will want to close itself using the block form anyway.

In most versions of ruby, Tempfile with a block returns nil. A change
was committed so that it returns the (closed) object, but that hasn't
made it into either of the versions I have lying around here.

tf = Tempfile.open("aaa","/tmp") { puts "hello"; 123 }

hello
=> nil


2010/6/30 Brian Candler <b.candler@pobox.com>:


Perfect! Do you think it is a good idea to also allow an IO as
argument to foreach so we can save a block?

FCSV($stdin) { |csv_in| csv_in.each { |row| p row } } # from $stdin

would become

FCSV.foreach($stdin) { |row| p row } # from $stdin

Kind regards

robert


2010/7/1 James Edward Gray II <james@graysoftinc.com>:


Better?

http://fastercsv.rubyforge.org/classes/FasterCSV.html


James Edward Gray II wrote:

Better?

http://fastercsv.rubyforge.org/classes/FasterCSV.html

Yes, that's just the reminder I need :-) Thanks.


Robert Klemme wrote:

or rather:

Tempfile.open "prefix", "/tmp" do |io|
  io.write everything
  io.flush

I'd rather io.close instead of io.flush to release resources as soon
as possible.

But tempfile will want to close itself using the block form anyway.

Yes, but later. This can make a difference if you are low on file descriptors. And you do not risk weird effects by the same process opening the file twice.
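
Putting the pieces of this sub-thread together, a sketch of the Tempfile route with an early close. Here load_from_path is a hypothetical stand-in for the path-only target, and load_from_string is a name made up for this example. The result is captured in a local variable rather than taken from the return value of Tempfile.open, since that return value differs between Ruby versions.

```ruby
require "tempfile"

# Hypothetical stand-in for the third-party method that insists on a pathname.
def load_from_path(path)
  File.read(path)
end

def load_from_string(data)
  result = nil
  Tempfile.open("prefix") do |io|
    io.write(data)
    io.close                           # flush and release the descriptor now
    result = load_from_path(io.path)   # the file itself still exists here
  end
  result
end

puts load_from_string("layout text\n")
```

Closing inside the block means the target is the only code holding the file open while it reads.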

In most versions of ruby, Tempfile with a block returns nil. A change
was committed so that it returns the (closed) object, but that hasn't
made it into either of the versions I have lying around here.

tf = Tempfile.open("aaa","/tmp") { puts "hello"; 123 }

hello
=> nil

The non-block form obviously returns the Tempfile instance, and if you want it to be returned from the block, what stops you from returning it explicitly?

IMHO the method with block should return whatever the implementor of the block chooses. That is far more reusable than always returning the Tempfile. Most of the time the Tempfile instance is of no use anyway since it is closed then.

Kind regards

  robert


On 30.06.2010 17:31, Brian Candler wrote:

2010/6/30 Brian Candler<b.candler@pobox.com>:
