Hi Jordan,
I didn't know about each_with_index until after I posted my last
message and read more on Ruby. Clearly I have to do more reading,
but I have found that one of the best ways to learn is to do.
> There's no built-in way that I'm aware of. You have to iterate over
> the array yourself. If you want all the indices you could do something
> like...
> indices = []
> ['aaaa', '>bbbb', '>cccc'].each_with_index { | e, i |
>   indices << i if e =~ /^>/
> }
> p indices # => [1, 2]
> But given the description of what you're trying to do in the other
> thread, you probably just want to use Array#reject...
> a = ['aaaa', '>bbbb', 'cccc'].reject { | e | e =~ /^>/ }
> p a # => ["aaaa", "cccc"]
This would delete only the one element, but I am trying to delete a range
of data (a record). I may have duplicate records, so I am trying to get
rid of them. They have different identifiers, each starting with a '>'.
Here's a test file that mimics this:
>88888/Bla08/the/rest8
888888888888888
888888888888888
888888888888888
888888888888888
888888888888888
88888 -- last line --
>77777/Bla07/the/rest7
777777777777777
777777777777777
777777777777777
777777777777777
777777777777777
77777 -- last line --
>66666/Bla06/the/rest6
666666666666666
666666666666666
666666666666666
666666666666666
666666666666666
66666 -- last line --
>77777/Bla07/the/rest7
777777777777777
777777777777777
777777777777777
777777777777777
777777777777777
77777 -- last line --
>
(I add the last > and later remove it)
So, this is what I came up with (with suggestions from you):
######################################
# delete duplicate records
######################################
def deleteDuplicates(data, dups)
  dups.each do |name|
    puts "\n****deleting duplicate \"#{name}\"...\n"
    s = data.index(name)
    e = 0
    data[s+1..-1].each_with_index{ |v, i|
      if v =~ /^>/
        e = i
        break
      end
    }
    puts "deleting ... ", data[s..s+e], "..done"
    data.slice!(s..s+e)
  end
  data
end
######################################
What do you think? It seems to work, but I'm always interested in
learning to do things better.
Thanks again!
Esmail
Hi Esmail,
A couple points:
- It's not very efficient to do all that iteration and slicing.
- The regexp won't work since #each and #each_with_index iterate over
lines and not characters (so v == "\n>...", so /^\n>/ would be needed).
- #index returns nil if there is no matching index (error when you get
to s+1 in that case).
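To illustrate the last point, here is a minimal sketch of the nil case and one way to guard against it. The sample data and the name ">missing" are made up for illustration, and the block form of Array#index is assumed to be available (Ruby 1.8.7 or later):

```ruby
# Hypothetical sample data: two short records plus the trailing ">" marker
data = [">a/1", "aaa", ">b/2", "bbb", ">"]

# Array#index returns nil when nothing matches
p data.index(">missing")

[">b/2", ">missing"].each do |name|
  s = data.index(name)
  next if s.nil?   # skip names that are not in the data instead of failing on s + 1

  # Offset of the next header after s; if there is none, delete to the end
  e = data[s + 1..-1].index { |v| v =~ /^>/ } || data.size - s - 1
  data.slice!(s..s + e)
end

p data
```

With the guard in place, a name that is not found is simply skipped rather than raising NoMethodError.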
How about using Array#uniq, as in:
def no_dups(path)
  IO.read(path).split("\n>").uniq.join("\n>")
end
fixed = no_dups("testfile")
puts fixed
# =>
>88888/Bla08/the/rest8
888888888888888
888888888888888
888888888888888
888888888888888
888888888888888
88888 -- last line --
>77777/Bla07/the/rest7
777777777777777
777777777777777
777777777777777
777777777777777
777777777777777
77777 -- last line --
>66666/Bla06/the/rest6
666666666666666
666666666666666
666666666666666
666666666666666
666666666666666
66666 -- last line --
>
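If you want to try the idea without a file on disk, it can be sketched on an in-memory string. The text below is a shortened, made-up stand-in for the test file, and it assumes records are separated by a newline followed by '>':

```ruby
# Shortened stand-in for the test file (hypothetical data), including the
# trailing ">" marker and a final newline
text = ">8/Bla08\n888\n8 -- last line --\n" \
       ">7/Bla07\n777\n7 -- last line --\n" \
       ">6/Bla06\n666\n6 -- last line --\n" \
       ">7/Bla07\n777\n7 -- last line --\n" \
       ">\n"

records = text.split("\n>")       # one string per record
fixed = records.uniq.join("\n>")  # drop exact duplicates, then reassemble
puts fixed
```

Note that uniq compares whole records, so two records only count as duplicates if every line matches. The trailing '>' matters here too: without it the final record would keep its trailing newline and would no longer compare equal to an earlier copy.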
Regards,
Jordan
On Dec 27, 7:17 am, Esmail <ebonak_de...@hotmail.com> wrote: