One-liner removing duplicate lines

Damien_Wyart · 5 October 2005 20:11

Hello,

Converting from Perl to Ruby, I am trying to find an equivalent to this
Perl one-liner, which removes duplicate lines from a file (without sorting
it first):

perl -ne'$s{$_}++||print' infile >outfile

I guess the uniq method could be used, but I can't work out how.

Many thanks in advance,

--
Damien Wyart
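
For readers who do not read Perl: the one-liner keeps a hash of lines seen so far and prints a line only when its previous count was zero. A minimal Ruby sketch of that logic follows. The helper name `dedupe_stream` and the sample data are assumptions made for illustration, not something from the thread.

```ruby
require "stringio"

# Hypothetical helper: copy lines from `input` to `output`, skipping any
# line that has already been seen. The original order is preserved.
def dedupe_stream(input, output)
  seen = Hash.new(0)                 # line => times seen so far
  input.each_line do |line|
    # Perl's post-increment yields the old count; 0 means "first time".
    output.print(line) if seen[line].zero?
    seen[line] += 1
  end
end

out = StringIO.new
dedupe_stream(StringIO.new("b\na\nb\nc\na\n"), out)
print out.string   # prints b, a, c on separate lines
```

Any object that responds to `each_line` (a `File`, `$stdin`, `ARGF`) should work as `input`.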

Ryan_Leavengood · 5 October 2005 20:25

I tried creating a version that mimics the Perl one (because Ruby also
has the -n option), but in the end this seemed easier (and much more
readable):

ruby -e "puts IO.readlines(ARGV[0]).uniq" infile > outfile

So you are right about using uniq.

Ryan
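
A small illustration of why `uniq` fits the "without sorting" requirement: `Array#uniq` keeps the first occurrence of each element and leaves the remaining elements in their original order. The sample lines are made up.

```ruby
lines = ["pear\n", "apple\n", "pear\n", "fig\n", "apple\n"]

unique = lines.uniq   # first occurrences only, original order kept
puts unique           # prints pear, apple, fig on separate lines
```

Note that comparison is exact, so a final line without a trailing newline is treated as different from the same text followed by a newline.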

Eric_Mahurin1 · 5 October 2005 20:31

Here is a pretty close translation that does what you want:

ruby -ne 's||={};s[$_]||print;s[$_]=true'

Simon_Kroger · 5 October 2005 20:37

Damien Wyart wrote:

> I guess uniq method could be used, but I can't find how.

True,

File.open('outfile', 'w'){|out| out << IO.readlines('infile').uniq.join}

cheers

Simon

Christian_Neukirche1 · 5 October 2005 20:39

Damien Wyart <damien.wyart@free.fr> writes:

> perl -ne'$s{$_}++||print' infile >outfile

ruby -ne 'BEGIN{$s={}};$s[$_] ||= !puts($_)'

Not exactly nice, though.

--
Christian Neukirchen <chneukirchen@gmail.com> http://chneukirchen.org

Vincent_Foley · 5 October 2005 20:41

How about the uniq(1) program? uniq infile > outfile

Gyoung-Yoon_Noh1 · 5 October 2005 21:24

ruby -ne 'BEGIN{$s={}};$s[$_]=nil;END{puts $s.keys}' infile > outfile

--
http://nohmad.sub-port.net

W_James · 6 October 2005 02:11

Damien Wyart wrote:

perl -ne'$s{$_}++||print' infile >outfile

awk '!a[$0]++' infile >outfile

Damien_Wyart · 6 October 2005 04:31

Many thanks to everyone who responded; the answers are very interesting
and enlightening!

--
Damien Wyart

ToRA · 11 October 2005 15:56

ruby -e 'require "set" ; s = Set.new ; ARGF.each_line {|z| s.add?(z) and puts(z) }' input > output

or even more verbose

ruby -e 'require "set" ; s = Set.new ; ARGF.each_line {|z| if s.add?(z) then puts(z) end }' input > output

Tris
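
Both variants rely on the return value of `Set#add?`: it returns the set itself (truthy) when the element was newly added, and `nil` when the element was already present. A short sketch with a made-up value:

```ruby
require "set"

seen = Set.new

first  = seen.add?("a\n")   # newly added: returns the set itself (truthy)
second = seen.add?("a\n")   # already present: returns nil

puts "first call added the line"    if first
puts "second call was a duplicate"  if second.nil?
```

This is what lets the membership test and the insertion happen in a single call inside the block.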

Stefan_Lang · 5 October 2005 20:31

or:
ruby -e 'puts ARGF.readlines.uniq' infile > outfile

--
Stefan

Ryan_Leavengood · 5 October 2005 20:34

Just for the sake of comparison, here is the more "Perl-like" version
(single-quoted so that a Unix shell does not expand $_ itself):

ruby -ne 's||={};s[$_]||print;s[$_]=1' infile > outfile

Maybe some Ruby golfers can shorten it some more, but since Ruby lacks
some of the more terse (and obfuscating) features of Perl, it may not
be possible.

Ryan

James_Edward_Gray_II · 5 October 2005 20:34

IO.readlines slurps the whole file, though, of course, so mind your memory requirements.

Here's a more direct translation (untested):

ruby -ne 'BEGIN { $lines = Hash.new(0) }; print if ($lines[$_] += 1) == 1' infile > outfile

James Edward Gray II
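
To make the memory point concrete, here is one possible streaming variant. It is a sketch that assumes Ruby 2.4 or later (for `Enumerator::Lazy#uniq`); the helper name `print_unique_lines` and the temporary file are illustrative only. It reads one line at a time, so only the distinct lines seen so far are retained rather than the whole file.

```ruby
require "stringio"
require "tempfile"

# Hypothetical helper (not from the thread): print each distinct line of
# the file at `path` once, reading the file line by line.
# Assumes Ruby 2.4+ for Enumerator::Lazy#uniq.
def print_unique_lines(path, out = $stdout)
  # File.foreach without a block returns an Enumerator over the lines;
  # .lazy avoids building an intermediate array of all lines.
  File.foreach(path).lazy.uniq.each { |line| out.print(line) }
end

out = StringIO.new
Tempfile.create("dedupe") do |file|
  file.write("x\ny\nx\nz\n")
  file.flush
  print_unique_lines(file.path, out)
end
print out.string   # prints x, y, z on separate lines
```

Memory use still grows with the number of distinct lines, since each one has to be remembered to detect later repeats.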

Lou_Scoras · 5 October 2005 21:16

On 10/5/05, Vincent Foley <vfoley@gmail.com> wrote:

> How about the uniq(1) program? uniq infile > outfile

Damien Wyart wrote:

> Converting from Perl to Ruby, I am trying to find an equivalent to this
> Perl one-liner removing duplicate lines in a file (without sorting it at
> first) :

He doesn't want to sort the file first =)

Jeremy_Kemper1 · 6 October 2005 02:18

My head a splode. Old school.

Regards,
jeremy

On Oct 5, 2005, at 7:11 PM, William James wrote:

> awk '!a[$0]++' infile >outfile

Damien_Wyart · 6 October 2005 04:31

* "William James" <w_a_x_man@yahoo.com> in comp.lang.ruby:

awk '!a[$0]++' infile >outfile

This one is very nice, thanks ! I had an Awk version which was slightly
longer.

--
Damien Wyart

Damien_Wyart · 6 October 2005 04:32

* "Vincent Foley" <vfoley@gmail.com> in comp.lang.ruby:

How about the uniq(1) program? uniq infile > outfile

uniq(1) only removes adjacent duplicate lines, i.e. you have to run
sort(1) first, and then the initial order of the lines is not kept.

--
Damien Wyart
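
The difference can be sketched in Ruby. The helper `adjacent_uniq` below is a hypothetical stand-in for what uniq(1) does by default (it drops a line only when it equals the line directly before it), and the sample data is made up. `chunk_while` needs Ruby 2.3 or later.

```ruby
# Hypothetical helper mimicking uniq(1): collapse runs of equal
# neighbouring lines, but leave non-adjacent repeats alone.
def adjacent_uniq(lines)
  lines.chunk_while { |a, b| a == b }.map(&:first)
end

lines = %w[b a a b c]

p adjacent_uniq(lines)   # => ["b", "a", "b", "c"]  (second "b" survives)
p lines.uniq             # => ["b", "a", "c"]       (order kept)
p lines.sort.uniq        # => ["a", "b", "c"]       (order lost)
```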

Robert · 6 October 2005 09:11

Damien Wyart wrote:

Many thanks to everyone who responded, the answers are very
interesting and enlightening !

As far as I can see, no one used the Hash with block feature. So here it
is:

ruby -e 'h=Hash.new(){|ha,l| puts l;ha[l.freeze]=1};ARGF.each {|l| h[l]}'

(David, I'm sorry again this is no #inject solution.)

Kind regards

robert
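
The trick works because a Hash default block runs only when a missing key is looked up. Once the block stores the key, later lookups of the same line no longer trigger it. A small sketch, collecting into an array instead of printing so the effect is easy to inspect (the variable names and data are illustrative):

```ruby
printed = []

seen = Hash.new do |hash, line|
  printed << line      # side effect runs once per distinct key
  hash[line] = true    # store the key so the block is not called again
end

%w[a b a c b].each { |line| seen[line] }   # plain lookups drive everything

p printed   # => ["a", "b", "c"]
```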

Mark_Hubbart1 · 8 October 2005 16:37

Here's a derived version (is this really Ruby?):

ruby -e'$><<[*$<].uniq.join' infile > outfile

cheers,
Mark

Stefan_Lang · 5 October 2005 20:39

ruby -ne 'a||={};a[$_]||=(print;1)' infile > outfile

--
Stefan
