Join all text files in a folder, with a single line of Ruby code

luisbebop · 25 October 2008 11:44

I wrote a single line of Ruby that joins all the text files in a
folder into one big file. I ran some tests, and it works!
Does anyone know a better way, or a more 'Ruby Way', to do that?

File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

Thanks everyone!

www.twitter.com/luisbebop

Brian_Candler · 25 October 2008 11:56

luisbebop wrote:

I did a single line of code in Ruby, which joins all text files in a
folder to a bigfile. I got some tests, and it's works!
Does anyone knows a better way, or other 'Ruby Way' to do that ?

File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

system("cat *.txt >bigfile")

:-?

--
Posted via http://www.ruby-forum.com/.

David_A_Black1 · 25 October 2008 12:55

Hi --

On Sat, 25 Oct 2008, luisbebop wrote:

I did a single line of code in Ruby, which joins all text files in a
folder to a bigfile. I got some tests, and it's works!
Does anyone knows a better way, or other 'Ruby Way' to do that ?

File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

You can use read rather than readlines, and save a loop:

Dir["*.txt"].each {|f| merged_file.print(File.read(f)) }

or similar.
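
For reference, the read-based version could be written out in full roughly as below. This is only a sketch: the method name merge_text_files and its two parameters are made up for illustration, and both the sort and the check that skips the output file are extra assumptions, not part of the suggestion above.

```ruby
# Hypothetical helper: concatenate every file matching `pattern` into `target`.
def merge_text_files(pattern, target)
  File.open(target, 'w') do |merged_file|
    # Sort so the order of the pieces is predictable (an assumption of this sketch).
    Dir[pattern].sort.each do |f|
      # Skip the output file itself in case its name matches the pattern.
      next if File.expand_path(f) == File.expand_path(target)
      # One read per file instead of one loop iteration per line.
      merged_file.print(File.read(f))
    end
  end
end
```

It would be called as merge_text_files('*.txt', 'bigfile').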

David

--
Rails training from David A. Black and Ruby Power and Light:
   Intro to Ruby on Rails January 12-15 Fort Lauderdale, FL
   Advancing with Rails January 19-22 Fort Lauderdale, FL *
   * Co-taught with Patrick Ewing!
See http://www.rubypal.com for details and updates!

W_James · 25 October 2008 13:50

luisbebop wrote:

I did a single line of code in Ruby, which joins all text files in a
folder to a bigfile. I got some tests, and it's works!
Does anyone knows a better way, or other 'Ruby Way' to do that ?

File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

"We don't need no stinkin' loops!"

ruby -e"puts ARGF.to_a" *.txt >merged

"We still don't need no stinkin' loops!"

File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}

Brian_Candler · 25 October 2008 14:03

William James wrote:

"We still don't need no stinkin' loops!"

File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}

Note that 'puts' will add a newline to the end of each file which
doesn't already have one. If you don't want this, use 'print' or 'write'
instead.
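
A small illustration of that difference, using in-memory StringIO objects in place of real files (the variable names and sample strings are invented for this sketch):

```ruby
require 'stringio'

with_puts  = StringIO.new
with_write = StringIO.new

# Two pretend file contents: the first lacks a trailing newline.
chunks = ["no newline at end", "has newline\n"]

# puts flattens the array and adds "\n" to any element that lacks one.
with_puts.puts(chunks)

# write copies the bytes exactly as they are.
chunks.each { |c| with_write.write(c) }

p with_puts.string   # "no newline at end\nhas newline\n"
p with_write.string  # "no newline at endhas newline\n"
```

So with puts the merged file can end up a few bytes longer than the sum of its inputs, while write (or print, with default settings) keeps it byte-for-byte.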


Robert_K1 · 25 October 2008 14:40

Why not directly invoke "cat" from the shell prompt?

Kind regards

robert

On 25.10.2008 13:56, Brian Candler wrote:

luisbebop wrote:

I did a single line of code in Ruby, which joins all text files in a
folder to a bigfile. I got some tests, and it's works!
Does anyone knows a better way, or other 'Ruby Way' to do that ?

File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

system("cat *.txt >bigfile")

luisbebop · 25 October 2008 15:10

Without loops, it's very nice!

On Oct 25, 11:50 am, "William James" <w_a_x_...@yahoo.com> wrote:

luisbebop wrote:
> I did a single line of code in Ruby, which joins all text files in a
> folder to a bigfile. I got some tests, and it's works!
> Does anyone knows a better way, or other 'Ruby Way' to do that ?

> File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

"We don't need no stinkin' loops!"

ruby -e"puts ARGF.to_a" *.txt >merged

"We still don't need no stinkin' loops!"

File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}

Robert_K1 · 25 October 2008 15:24

luisbebop wrote:

I did a single line of code in Ruby, which joins all text files in a
folder to a bigfile. I got some tests, and it's works!
Does anyone knows a better way, or other 'Ruby Way' to do that ?

File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

"We don't need no stinkin' loops!"

ruby -e"puts ARGF.to_a" *.txt >merged

There's also

ruby -e '$defout.write(ARGF.read)' *.txt >merged
ruby -e 'File.open("out","w") {|io| io.write(ARGF.read)}' *.txt

"We still don't need no stinkin' loops!"

File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}

That's vastly inefficient since it reads all the files into memory before writing a single byte. This is not necessary. You can at least improve to

File.open("mrg","w"){|f|Dir['*.txt'].each{|nm|f.write(File.read(nm))}}

A proper solution (i.e. one that deals with arbitrarily large files) would use a fixed buffer size, but that looks ugly on a single line...
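
Spread over several lines, a fixed-buffer version might look like the sketch below. The helper name merge_in_chunks, the CHUNK_SIZE constant and the 64 KiB value are all assumptions picked for illustration, as is the check that excludes the output file from the inputs.

```ruby
# Arbitrary buffer size chosen for this sketch.
CHUNK_SIZE = 64 * 1024

# Hypothetical helper: append each matching file to `target`, never holding
# more than `chunk_size` bytes of file data in memory at a time.
def merge_in_chunks(pattern, target, chunk_size = CHUNK_SIZE)
  # Collect the inputs first and leave out the output file if it matches.
  sources = Dir[pattern].sort.reject do |nm|
    File.expand_path(nm) == File.expand_path(target)
  end

  File.open(target, 'wb') do |out|
    buffer = String.new # reused for every read
    sources.each do |nm|
      File.open(nm, 'rb') do |src|
        # read(length, buffer) fills the buffer and returns nil at end of file.
        out.write(buffer) while src.read(chunk_size, buffer)
      end
    end
  end
end
```

Memory use then stays around one buffer regardless of how large the input files are, at the cost of giving up the one-liner.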

Kind regards

robert

On 25.10.2008 15:47, William James wrote:

Joel_VanderWerf1 · 25 October 2008 19:08

William James wrote:

"We don't need no stinkin' loops!"

ruby -e"puts ARGF.to_a" *.txt >merged

Cheating a bit:

ARGV.replace Dir['*']; print ARGF.read

Not recommended, though, since it reads all the data into memory and steps on ARGV.

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Mohit_Sindhwani1 · 26 October 2008 09:49

Brian Candler wrote:

luisbebop wrote:


I did a single line of code in Ruby, which joins all text files in a
folder to a bigfile. I got some tests, and it's works!
Does anyone knows a better way, or other 'Ruby Way' to do that ?

File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

system("cat *.txt >bigfile")

and on Windows
system("copy /b *.txt bigfile")
(make sure that the bigfile name doesn't match the pattern, so use bigfile rather than bigfile.txt)

Cheers,
Mohit.
10/26/2008 | 5:49 PM.

luisbebop · 25 October 2008 15:08

Because I'm learning Ruby, and I want to see some snippets that do
tasks in a single line of Ruby code.
Doing it directly from the prompt is no fun!

On Oct 25, 12:40 pm, Robert Klemme <shortcut...@googlemail.com> wrote:

On 25.10.2008 13:56, Brian Candler wrote:

> luisbebop wrote:
>> I did a single line of code in Ruby, which joins all text files in a
>> folder to a bigfile. I got some tests, and it's works!
>> Does anyone knows a better way, or other 'Ruby Way' to do that ?

>> File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

> system("cat *.txt >bigfile")

Why not directly invoke "cat" from the shell prompt?

Kind regards
    robert

luisbebop · 25 October 2008 16:19

I get your point. We need just one loop to be more efficient.
Really, I don't need to deal with arbitrarily large files.
As I said, the main goals here are: use Ruby (without shell
commands), and one line of code.
Thanks

On Oct 25, 1:24 pm, Robert Klemme <shortcut...@googlemail.com> wrote:

On 25.10.2008 15:47, William James wrote:

> luisbebop wrote:

>> I did a single line of code in Ruby, which joins all text files in a
>> folder to a bigfile. I got some tests, and it's works!
>> Does anyone knows a better way, or other 'Ruby Way' to do that ?

>> File.open('bigfile','w') { |mergedFile| Dir.glob("*.txt").each { |file| File.readlines(file).each { |line| mergedFile << line } } }

> "We don't need no stinkin' loops!"

> ruby -e"puts ARGF.to_a" *.txt >merged

There's also

ruby -e '$defout.write(ARGF.read)' *.txt >merged
ruby -e 'File.open("out","w") {|io| io.write(ARGF.read)}' *.txt

> "We still don't need no stinkin' loops!"

> File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}

That's vastly inefficient since it reads all the files into memory
before writing a single byte. This is not necessary. You can at least
improve to

File.open("mrg","w"){|f|Dir['*.txt'].each{|nm|f.write(File.read(nm))}}

But a proper solution (i.e. one that deals with arbitrary large files)
would use a fixed buffer size - but that looks ugly on a single line...

Kind regards
    robert

Nobuyoshi_Nakada1 · 26 October 2008 14:06

Hi,

At Sun, 26 Oct 2008 04:08:48 +0900,
Joel VanderWerf wrote in [ruby-talk:318574]:

> ruby -e"puts ARGF.to_a" *.txt >merged

Cheating a bit:

ARGV.replace Dir['*']; print ARGF.read

Not recommended, though, since it reads all the data into memory and
steps on ARGV.

ruby -pe 'BEGIN{ARGV.replace Dir["*"]}'

--
Nobu Nakada

John_Carter · 28 October 2008 00:18

If we're into fast and ugly...

We need a Ruby interface to Linux "splice"...

   splice() moves data between two file descriptors without copying
        between kernel address space and user address space. It transfers up
        to len bytes of data from the file descriptor fd_in to the file
        descriptor fd_out, where one of the descriptors must refer to a pipe.

See "man splice" for more.
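
As far as I know the standard library has no direct wrapper for splice. The nearest built-in is probably IO.copy_stream (Ruby 1.9 and later), which copies from one stream to another without building the data up as a Ruby string first; whether the interpreter uses a kernel-level copy underneath is an implementation detail that depends on platform and version. A minimal sketch, with the helper name merge_with_copy_stream invented for illustration:

```ruby
# Hypothetical helper: let IO.copy_stream move the bytes of each input file.
def merge_with_copy_stream(pattern, target)
  # Collect the inputs first and leave out the output file if it matches.
  sources = Dir[pattern].sort.reject do |nm|
    File.expand_path(nm) == File.expand_path(target)
  end

  File.open(target, 'wb') do |out|
    # copy_stream accepts a file name as the source and an IO as the destination.
    sources.each { |nm| IO.copy_stream(nm, out) }
  end
end
```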

John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : john.carter@tait.co.nz
New Zealand

On Sun, 26 Oct 2008, Robert Klemme wrote:

File.open("mrg","w"){|f|f.puts Dir['*.txt'].map{|nm|IO.read nm}}

That's vastly inefficient since it reads all the files into memory before writing a single byte. This is not necessary. You can at least improve to

File.open("mrg","w"){|f|Dir['*.txt'].each{|nm|f.write(File.read(nm))}}

Joel_VanderWerf1 · 26 October 2008 18:48

Nobuyoshi Nakada wrote:

Hi,

At Sun, 26 Oct 2008 04:08:48 +0900,
Joel VanderWerf wrote in [ruby-talk:318574]:

ruby -e"puts ARGF.to_a" *.txt >merged

Cheating a bit:

ARGV.replace Dir['*']; print ARGF.read

Not recommended, though, since it reads all the data into memory and steps on ARGV.

ruby -pe 'BEGIN{ARGV.replace Dir["*"]}'

Very nice! But if you are going that far, why not go all the way:

ruby -pe'1' *


luisbebop · 27 October 2008 01:55

ruby -pe'1' *

Can you explain ? Sorry, but I didn't understand.

Thanks

On Oct 26, 4:48 pm, Joel VanderWerf <vj...@path.berkeley.edu> wrote:

Nobuyoshi Nakada wrote:
> Hi,

> At Sun, 26 Oct 2008 04:08:48 +0900,
> Joel VanderWerf wrote in [ruby-talk:318574]:
>>> ruby -e"puts ARGF.to_a" *.txt >merged
>> Cheating a bit:

>> ARGV.replace Dir['*']; print ARGF.read

>> Not recommended, though, since it reads all the data into memory and
>> steps on ARGV.

> ruby -pe 'BEGIN{ARGV.replace Dir["*"]}'

Very nice! But if you are going that far, why not go all the way:

ruby -pe'1' *

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Joel_VanderWerf1 · 27 October 2008 02:43

luisbebop wrote:

ruby -pe'1' *

Can you explain ? Sorry, but I didn't understand.

If you run this in a shell, the * expands to all files in the current directory. The -p switch means "for each line in the files on the command line, store the line into $_, run the script, and print $_". Usually, you want to use -e'some code' to operate on $_. In this case, the '1' is a no-op, so it just prints each line without changing it. HTH.
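
Written out by hand, the loop that -p wraps around the script behaves roughly like the sketch below. The helper name print_each_line and its parameters are invented for illustration; the real switch works through gets and $_ rather than an explicit list of paths.

```ruby
# Hypothetical helper mirroring the implicit loop of `ruby -p`.
def print_each_line(paths, out = $stdout)
  paths.each do |path|
    File.foreach(path) do |line|
      # With -p, the -e script runs at this point and may modify the line
      # (held in $_); a script of '1' changes nothing.
      out.print(line)
    end
  end
end
```

Calling print_each_line(Dir['*.txt']) and redirecting the output to a file would give the same kind of merge as the command above.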


_Pena_Botp1 · 27 October 2008 05:13

From: Joel VanderWerf [mailto:vjoel@path.berkeley.edu]
# luisbebop wrote:
# >> ruby -pe'1' *
# >
# > Can you explain ? Sorry, but I didn't understand.
# If you run this in a shell, the * expands to all files.
# The -p switch means "for each line in the files on the
# command line, store the line into $_, and print $_.
# Usually, you want to use -e'some code' to operate
# on $_. In this case, the '1' is a no-op, so it just
# prints the line without changing it. HTH.

which also means,

ruby -pe '' *

Joel_VanderWerf1 · 27 October 2008 05:43

Peña wrote:

From: Joel VanderWerf [mailto:vjoel@path.berkeley.edu]
# luisbebop wrote:
# >> ruby -pe'1' *
# >
# > Can you explain ? Sorry, but I didn't understand.
# If you run this in a shell, the * expands to all files.
# The -p switch means "for each line in the files on the
# command line, store the line into $_, and print $_.
# Usually, you want to use -e'some code' to operate
# on $_. In this case, the '1' is a no-op, so it just
# prints the line without changing it. HTH.

which also means,

ruby -pe '' *

Ah, you're right. I tried

ruby -pe'' *

but that failed. With the extra space it works.


Nobuyoshi_Nakada1 · 27 October 2008 06:39

Hi,

At Mon, 27 Oct 2008 14:43:58 +0900,
Joel VanderWerf wrote in [ruby-talk:318648]:

Ah, you're right. I tried

ruby -pe'' *

but that failed. With the extra space it works.

I often use -ep to get rid of quotes and "unused literal"
warning.

