Join all text files in a folder, with a single line of Ruby code

I wrote a single line of Ruby code that joins all the text files in a
folder into one big file. I ran some tests, and it works!
Does anyone know a better way, or another 'Ruby Way', to do that?

File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

Thanks everyone!

www.twitter.com/luisbebop

luisbebop wrote:

I wrote a single line of Ruby code that joins all the text files in a
folder into one big file. I ran some tests, and it works!
Does anyone know a better way, or another 'Ruby Way', to do that?

File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

system("cat *.txt >bigfile")

:-?

···

--
Posted via http://www.ruby-forum.com/.

Hi --

···

On Sat, 25 Oct 2008, luisbebop wrote:

I wrote a single line of Ruby code that joins all the text files in a
folder into one big file. I ran some tests, and it works!
Does anyone know a better way, or another 'Ruby Way', to do that?

File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

You can use read rather than readlines, and save a loop:

   Dir["*.txt"].each {|f| merged_file.print(File.read(f)) }

or similar.
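
For reference, here is a minimal sketch of how that fragment might look as a complete snippet. The method name merge_with_read and its parameters are made up for illustration; the fragment above only shows the inner loop and assumes merged_file comes from an enclosing File.open block. The sort call is an added assumption to make the order predictable.

```ruby
# Hypothetical helper: wraps the read-based loop in the File.open block
# that supplies merged_file.
def merge_with_read(pattern, target)
  File.open(target, 'w') do |merged_file|
    # File.read returns the whole file as one string, so no inner
    # per-line loop is needed.
    Dir[pattern].sort.each { |f| merged_file.print(File.read(f)) }
  end
end
```

Calling merge_with_read('*.txt', 'bigfile') from the folder in question would then correspond to the original one-liner.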

David


luisbebop wrote:

I wrote a single line of Ruby code that joins all the text files in a
folder into one big file. I ran some tests, and it works!
Does anyone know a better way, or another 'Ruby Way', to do that?

File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

"We don't need no stinkin' loops!"

ruby -e"puts ARGF.to_a" *.txt >merged

"We still don't need no stinkin' loops!"

File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}

William James wrote:

"We still don't need no stinkin' loops!"

File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}

Note that 'puts' will add a newline to the end of each file which
doesn't already have one. If you don't want this, use 'print' or 'write'
instead.
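
A small sketch of the difference, using StringIO in place of a real file so it can be run anywhere. The helper name join_with and the sample strings are invented for this illustration.

```ruby
require 'stringio'

# Hypothetical helper: writes every chunk with the given IO method
# (:puts or :print) and returns what ended up in the buffer.
def join_with(method_name, chunks)
  out = StringIO.new
  chunks.each { |chunk| out.public_send(method_name, chunk) }
  out.string
end

# The second "file" lacks a trailing newline.
chunks = ["ends with newline\n", "no trailing newline"]

with_puts  = join_with(:puts, chunks)   # newline appended to the second chunk
with_print = join_with(:print, chunks)  # chunks written exactly as they are
```

With puts the result is one byte longer than the inputs combined; with print it is byte-for-byte the concatenation.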

···


Why not directly invoke "cat" from the shell prompt? :-)

Kind regards

  robert

···

On 25.10.2008 13:56, Brian Candler wrote:

luisbebop wrote:

I wrote a single line of Ruby code that joins all the text files in a
folder into one big file. I ran some tests, and it works!
Does anyone know a better way, or another 'Ruby Way', to do that?

File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

system("cat *.txt >bigfile")

Without loops, it's very nice!

···

On Oct 25, 11:50 am, "William James" <w_a_x_...@yahoo.com> wrote:

"We don't need no stinkin' loops!"

ruby -e"puts ARGF.to_a" *.txt >merged

"We still don't need no stinkin' loops!"

File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}

luisbebop wrote:

I wrote a single line of Ruby code that joins all the text files in a
folder into one big file. I ran some tests, and it works!
Does anyone know a better way, or another 'Ruby Way', to do that?

File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

"We don't need no stinkin' loops!"

ruby -e"puts ARGF.to_a" *.txt >merged

There's also

ruby -e '$stdout.write(ARGF.read)' *.txt >merged
ruby -e 'File.open("out","w") {|io| io.write(ARGF.read)}' *.txt

"We still don't need no stinkin' loops!"

File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}

That's vastly inefficient since it reads all the files into memory before writing a single byte. This is not necessary. You can at least improve to

File.open("mrg","w"){|f|Dir['*.txt'].each{|nm|f.write(File.read(nm))}}

But a proper solution (i.e. one that deals with arbitrarily large files) would use a fixed buffer size, though that looks ugly on a single line...
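
Spread over several lines, a fixed-buffer version might look like the sketch below. The names merge_in_chunks and CHUNK_SIZE, the 64 KB buffer size, the sort call, and the check that skips the target file are all assumptions made for this illustration, not something from the posts above.

```ruby
# Assumed buffer size; any fixed size keeps memory use bounded.
CHUNK_SIZE = 64 * 1024

# Hypothetical helper: copies every file matching pattern into target,
# holding at most chunk_size bytes in memory at a time.
def merge_in_chunks(pattern, target, chunk_size = CHUNK_SIZE)
  File.open(target, 'wb') do |out|
    Dir.glob(pattern).sort.each do |name|
      # Skip the output file in case its name matches the pattern.
      next if File.expand_path(name) == File.expand_path(target)

      File.open(name, 'rb') do |input|
        # read(n) returns nil at end of file, which ends the loop.
        while (chunk = input.read(chunk_size))
          out.write(chunk)
        end
      end
    end
  end
end
```

On Ruby 1.9 and later, IO.copy_stream(name, out) could probably replace the inner File.open block, since it also copies without loading a whole file into memory.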

Kind regards

  robert

···

On 25.10.2008 15:47, William James wrote:

William James wrote:

"We don't need no stinkin' loops!"

ruby -e"puts ARGF.to_a" *.txt >merged

Cheating a bit:

ARGV.replace Dir['*']; print ARGF.read

Not recommended, though, since it reads all the data into memory and steps on ARGV.

···

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Brian Candler wrote:

luisbebop wrote:
  

I wrote a single line of Ruby code that joins all the text files in a
folder into one big file. I ran some tests, and it works!
Does anyone know a better way, or another 'Ruby Way', to do that?

File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }
    
system("cat *.txt >bigfile")
  
and on Windows
system("copy /b *.txt bigfile")
(make sure that the bigfile name doesn't match the pattern, so use bigfile rather than bigfile.txt)

Cheers,
Mohit.
10/26/2008 | 5:49 PM.

Because I'm learning Ruby, and I want to see some snippets that do
tasks in a single line of Ruby code.
Doing it directly from the prompt is no fun!

···

On Oct 25, 12:40 pm, Robert Klemme <shortcut...@googlemail.com> wrote:

On 25.10.2008 13:56, Brian Candler wrote:

> system("cat *.txt >bigfile")

Why not directly invoke "cat" from the shell prompt? :-)

Kind regards

    robert

I get your point: we need just one loop to be more efficient.
Really, I don't need to deal with arbitrarily large files.
As I said, the main goals here are to use Ruby (without shell
commands) and to do it in one line of code.
Thanks :-)

···

On Oct 25, 1:24 pm, Robert Klemme <shortcut...@googlemail.com> wrote:

On 25.10.2008 15:47, William James wrote:

> "We don't need no stinkin' loops!"

> ruby -e"puts ARGF.to_a" *.txt >merged

There's also

ruby -e '$stdout.write(ARGF.read)' *.txt >merged
ruby -e 'File.open("out","w") {|io| io.write(ARGF.read)}' *.txt

> "We still don't need no stinkin' loops!"

> File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}

That's vastly inefficient since it reads all the files into memory
before writing a single byte. This is not necessary. You can at least
improve to

File.open("mrg","w"){|f|Dir['*.txt'].each{|nm|f.write(File.read(nm))}}

But a proper solution (i.e. one that deals with arbitrarily large files)
would use a fixed buffer size, though that looks ugly on a single line...

Kind regards

    robert

Hi,

At Sun, 26 Oct 2008 04:08:48 +0900,
Joel VanderWerf wrote in [ruby-talk:318574]:

> ruby -e"puts ARGF.to_a" *.txt >merged

Cheating a bit:

ARGV.replace Dir['*']; print ARGF.read

Not recommended, though, since it reads all the data into memory and
steps on ARGV.

ruby -pe 'BEGIN{ARGV.replace Dir["*"]}'

···

--
Nobu Nakada

If we're into fast and ugly...

We need a Ruby interface to Linux "splice"...

   splice() moves data between two file descriptors without copying
   between kernel address space and user address space. It transfers up
   to len bytes of data from the file descriptor fd_in to the file
   descriptor fd_out, where one of the descriptors must refer to a pipe.

See "man splice" for more.

John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : john.carter@tait.co.nz
New Zealand

···

On Sun, 26 Oct 2008, Robert Klemme wrote:

File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}

That's vastly inefficient since it reads all the files into memory before writing a single byte. This is not necessary. You can at least improve to

File.open("mrg","w"){|f|Dir['*.txt'].each{|nm|f.write(File.read(nm))}}

Nobuyoshi Nakada wrote:

Hi,

At Sun, 26 Oct 2008 04:08:48 +0900,
Joel VanderWerf wrote in [ruby-talk:318574]:

ruby -e"puts ARGF.to_a" *.txt >merged

Cheating a bit:

ARGV.replace Dir['*']; print ARGF.read

Not recommended, though, since it reads all the data into memory and steps on ARGV.

ruby -pe 'BEGIN{ARGV.replace Dir["*"]}'

Very nice! But if you are going that far, why not go all the way:

ruby -pe'1' *

···

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

ruby -pe'1' *

Can you explain? Sorry, but I didn't understand.

Thanks :-)

···

On Oct 26, 4:48 pm, Joel VanderWerf <vj...@path.berkeley.edu> wrote:

Nobuyoshi Nakada wrote:
> Hi,

> At Sun, 26 Oct 2008 04:08:48 +0900,
> Joel VanderWerf wrote in [ruby-talk:318574]:
>>> ruby -e"puts ARGF.to_a" *.txt >merged
>> Cheating a bit:

>> ARGV.replace Dir['*']; print ARGF.read

>> Not recommended, though, since it reads all the data into memory and
>> steps on ARGV.

> ruby -pe 'BEGIN{ARGV.replace Dir["*"]}'

Very nice! But if you are going that far, why not go all the way:

ruby -pe'1' *

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

luisbebop wrote:

ruby -pe'1' *

Can you explain? Sorry, but I didn't understand.

If you run this in a shell, the * expands to all files. The -p switch means "for each line in the files on the command line, store the line into $_, and print $_". Usually, you want to use -e'some code' to operate on $_. In this case, the '1' is a no-op, so it just prints the line without changing it. HTH.
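
As a rough sketch of what that loop amounts to, written as ordinary Ruby: the method name passthrough and its parameters are invented for illustration, and a local variable stands in for $_ so the method can also be fed something other than ARGF.

```ruby
require 'stringio'

# Hypothetical stand-in for `ruby -pe '1' files...`.
# input defaults to ARGF, which reads the files named on the command line.
def passthrough(input = ARGF, output = $stdout)
  while (line = input.gets)  # -p: fetch the next line (Ruby keeps it in $_)
    # The -e script would run here; the literal 1 does nothing.
    output.print(line)       # -p: print the line once the script has run
  end
end
```

Saved in a script with a final call to passthrough, running it as `ruby script.rb *.txt` should behave much like the one-liner.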

···

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

# luisbebop wrote:
# >> ruby -pe'1' *
# >
# > Can you explain ? Sorry, but I didn't understand.
# If you run this in a shell, the * expands to all files.
# The -p switch means "for each line in the files on the
# command line, store the line into $_, and print $_.
# Usually, you want to use -e'some code' to operate
# on $_. In this case, the '1' is a no-op, so it just
# prints the line without changing it. HTH.

which also means,

  ruby -pe '' *

···

From: Joel VanderWerf [mailto:vjoel@path.berkeley.edu]

Peña wrote:

From: Joel VanderWerf [mailto:vjoel@path.berkeley.edu]
# luisbebop wrote:
# >> ruby -pe'1' *
# >
# > Can you explain ? Sorry, but I didn't understand.
# If you run this in a shell, the * expands to all files.
# The -p switch means "for each line in the files on the
# command line, store the line into $_, and print $_.
# Usually, you want to use -e'some code' to operate
# on $_. In this case, the '1' is a no-op, so it just
# prints the line without changing it. HTH.

which also means,

  ruby -pe '' *

Ah, you're right. I tried

ruby -pe'' *

but that failed. With the extra space it works.
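
A likely explanation, offered as an assumption rather than something verified against that exact setup: the shell glues the empty '' onto -pe, so -e gets no argument of its own and takes the first file name as the script. Ruby's standard Shellwords module splits words the way a POSIX shell does, so it can show the difference. The file names a.txt and b.txt are hypothetical.

```ruby
require 'shellwords'

# Without the space, the empty quotes vanish into the -pe word,
# so the next word (a file name) becomes the argument of -e.
without_space = Shellwords.split("ruby -pe'' a.txt b.txt")

# With the space, the empty string survives as a separate word
# and serves as the (empty) script for -e.
with_space = Shellwords.split("ruby -pe '' a.txt b.txt")
```

Printing both arrays with p shows that only the second one contains an empty-string element after "-pe".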

···

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Hi,

At Mon, 27 Oct 2008 14:43:58 +0900,
Joel VanderWerf wrote in [ruby-talk:318648]:

Ah, you're right. I tried

ruby -pe'' *

but that failed. With the extra space it works.

I often use -ep to get rid of the quotes and the "unused literal"
warning.

···

--
Nobu Nakada