Retrieving and copying elements from an array

If I have an array like this:

["category: cat1",
"item1, item2, item3",
"category: cat2",
"item1",
"category: cat3",
"item1, item2, item3, item4",]

How can I have a new array like this:

[["cat1", "item1", "item2", "item3"],
["cat2", "item1"],
["cat3", "item1", "item2", "item3", "item4"]
]

Thanks for the help.

···

--
Posted via http://www.ruby-forum.com/.

["category: cat1",
"item1, item2, item3",
"category: cat2",
"item1",
"category: cat3",
"item1, item2, item3, item4",]

How can I have a new array like this:

[["cat1", "item1", "item2", "item3"],
["cat2", "item1"],
["cat3", "item1", "item2", "item3", "item4"]
]

If your initial array is called 'list' :

result = []
list.each_slice(2) {|i, j| result.push(i.sub(/category: /, '')); b.push(*j.split(', '))}

Iterate over your list in pairs (each_slice). For each pair, remove
'category: ' from the first element, split the second element on ', ',
and append the results to an array.
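
For reference, here is a minimal sketch of the each_slice idea that produces the nested array asked for in the original question. It assumes the array strictly alternates between a "category: ..." string and a single comma-separated item string, as in the first post; the names list and nested are only illustrative.

```ruby
# Assumption: elements come in strict pairs, a category string followed by
# one comma-separated string of items (the shape shown in the first post).
list = [
  "category: cat1",
  "item1, item2, item3",
  "category: cat2",
  "item1",
  "category: cat3",
  "item1, item2, item3, item4",
]

# each_slice(2) without a block returns an enumerator, so map can build
# one sub-array per (category, items) pair.
nested = list.each_slice(2).map do |category, items|
  [category.sub(/\Acategory: /, ''), *items.split(', ')]
end

p nested
```

If the real data does not pair up this way (for example, one item per line), the pairs will be misaligned and this sketch will give wrong results.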

···

--
Anurag Priyam
http://about.me/yeban/

arr.grep(/category:.*/).map{|a| [arr.at(arr.index(a) +1)]}

This will work as long as there is only one element after each 'category' entry.
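
A quick sketch of what that one-liner returns on the array from the first post (arr is assumed to hold that array; followers is just an illustrative name). It picks the element after each category and wraps it in its own array. It does not include the category name or split the items.

```ruby
# Assumption: arr is the array from the first post.
arr = [
  "category: cat1",
  "item1, item2, item3",
  "category: cat2",
  "item1",
  "category: cat3",
  "item1, item2, item3, item4",
]

# grep selects the category strings; for each one, look up its position
# and take the element that follows it.
followers = arr.grep(/category:.*/).map { |a| [arr.at(arr.index(a) + 1)] }

p followers
```

Because arr.index returns the first match, this also relies on every category string being unique.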

···

--
Posted via http://www.ruby-forum.com/.

Thanks for reply, Anurag. Can't get this to work though:

irb(main):001:0> lines = File.readlines('test.txt')
=> ["category: cat1\n", " item1\n", " item2\n", " item3\n", "category:
cat2\n", " item1\n", "category: cat3\n", " item1\n", " item2\n", "
item3\n", " item4\n", "\n"]
irb(main):002:0> puts lines
category: cat1
item1
item2
item3
category: cat2
item1
category: cat3
item1
item2
item3
item4

=> nil
irb(main):003:0> result = []
=> []
irb(main):004:0> lines.each_slice(2) { |i, j|
result.push(i.sub(/category: /, '')); b.push(*j.split(', '))}
NameError: undefined local variable or method `b' for main:Object
        from (irb):4:in `block in irb_binding'
        from (irb):4:in `each'
        from (irb):4:in `each_slice'
        from (irb):4
        from /usr/local/bin/irb:12:in `<main>'
irb(main):005:0> lines.each_slice(2) { |i, j|
result.push(i.sub(/category: /, '')); j.push(*j.split(', '))}
NoMethodError: undefined method `push' for " item1\n":String
        from (irb):5:in `block in irb_binding'
        from (irb):5:in `each'
        from (irb):5:in `each_slice'
        from (irb):5
        from /usr/local/bin/irb:12:in `<main>'

···

--
Posted via http://www.ruby-forum.com/.

Josh: Thanks for the reply and link to your github example. The thing
is, this data is coming from a text file. An export from an MS Access
database. I wouldn't choose to save in that format.

···

--
Posted via http://www.ruby-forum.com/.

That works great, thanks Josh. A couple of questions if you don't mind.

1. What is the purpose of the [] in line below? Does it mean collect
whatever matches into an array?

categories << [ line.sub(/^category: /,'').chomp ]

2. I've tried to achieve the same result using an existing array, rather
than reading from the file and I'm stuck. I'm using JRuby 1.6RC1 and
getting this error about NilClass. Any ideas?

irb(main):038:0> arr2 = []
irb(main):039:0> arr
=> [["cat1", "1", "2", "3"], ["cat2", "1", "2"], ["cat3", "1", "2"]]

irb(main):040:0> arr.map do |item|
irb(main):041:1* if item =~ /^cat/
irb(main):042:2> arr2 << [ item ]
irb(main):043:2> else
irb(main):044:2* arr2.last << item
irb(main):045:2> end
irb(main):046:1> end

NoMethodError: undefined method `<<' for nil:NilClass
        from (irb):44:in `evaluate'
        from org/jruby/RubyArray.java:2460:in `collect'
        from (irb):40:in `evaluate'
        from org/jruby/RubyKernel.java:1091:in `eval'
        from /opt/jruby/lib/ruby/1.8/irb.rb:158:in `eval_input'
        from /opt/jruby/lib/ruby/1.8/irb.rb:271:in `signal_status'
        from /opt/jruby/lib/ruby/1.8/irb.rb:270:in `signal_status'
        from /opt/jruby/lib/ruby/1.8/irb.rb:155:in `eval_input'
        from org/jruby/RubyKernel.java:1421:in `loop'
        from org/jruby/RubyKernel.java:1194:in `rbCatch'
        from /opt/jruby/lib/ruby/1.8/irb.rb:154:in `eval_input'
        from /opt/jruby/lib/ruby/1.8/irb.rb:71:in `start'
        from org/jruby/RubyKernel.java:1194:in `rbCatch'
        from /opt/jruby/lib/ruby/1.8/irb.rb:70:in `start'

irb(main):047:0> arr
=> [["cat1", "1", "2", "3"], ["cat2", "1", "2"], ["cat3", "1", "2"]]
irb(main):048:0> arr2
=> []
irb(main):049:0> arr2.empty?
=> true

irb(main):052:0> arr.each do |item|
irb(main):053:1* if item =~ /^cat/
irb(main):054:2> arr2 << item
irb(main):055:2> else
irb(main):056:2* arr2.last << item
irb(main):057:2> end
irb(main):058:1> end

NoMethodError: undefined method `<<' for nil:NilClass
        from (irb):56:in `evaluate'
        from org/jruby/RubyArray.java:1671:in `each'
        from (irb):52:in `evaluate'
        from org/jruby/RubyKernel.java:1091:in `eval'
        from /opt/jruby/lib/ruby/1.8/irb.rb:158:in `eval_input'
        from /opt/jruby/lib/ruby/1.8/irb.rb:271:in `signal_status'
        from /opt/jruby/lib/ruby/1.8/irb.rb:270:in `signal_status'
        from /opt/jruby/lib/ruby/1.8/irb.rb:155:in `eval_input'
        from org/jruby/RubyKernel.java:1421:in `loop'
        from org/jruby/RubyKernel.java:1194:in `rbCatch'
        from /opt/jruby/lib/ruby/1.8/irb.rb:154:in `eval_input'
        from /opt/jruby/lib/ruby/1.8/irb.rb:71:in `start'
        from org/jruby/RubyKernel.java:1194:in `rbCatch'
        from /opt/jruby/lib/ruby/1.8/irb.rb:70:in `start'
irb(main):059:0>

···

--
Posted via http://www.ruby-forum.com/.

You've been very helpful. Thanks again Josh.

···

--
Posted via http://www.ruby-forum.com/.

Thanks for reply, Anurag. Can't get this to work though:

irb(main):001:0> lines = File.readlines('test.txt')
=> ["category: cat1\n", " item1\n", " item2\n", " item3\n", "category:
cat2\n", " item1\n", "category: cat3\n", " item1\n", " item2\n", "
item3\n", " item4\n", "\n"]
irb(main):002:0> puts lines
category: cat1
item1
item2
item3
category: cat2
item1
category: cat3
item1
item2
item3
item4

=> nil
irb(main):003:0> result = []
=> []
irb(main):004:0> lines.each_slice(2) { |i, j|
result.push(i.sub(/category: /, '')); b.push(*j.split(', '))}
NameError: undefined local variable or method `b' for main:Object

My bad; typed in wrong. The 'b' should be 'result' - we want to
collect the processed element in the same array.

···

result = []
lines.each_slice(2) {|i, j| result.push(i.sub(/category: /, '')); result.push(*j.split(', '))}

--
Anurag Priyam
http://about.me/yeban/

I recommend you don't store your data like this; it is fragile and error
prone. You can see your data already does not look like what you described in
your first post: each item in cat1 is on its own line (i.e. index 1 in your
first post is "item1, item2, item3", but in your actual data it is "item1\n").
So even after you fix the part where he wrote b.push instead of result.push,
it will still be wrong.

I recommend using a real data format such as YAML, XML, or JSON. It's
actually much easier to get started with this than you might think: you can
just build the data in memory how you want it to look, then tell YAML to
convert it, and store it in a file. Ta-da, a valid YAML representation of your
data. I posted an example on GitHub that uses this data to show how I would
handle it with YAML instead of an ad hoc format.

It is slightly different in that I read them into hashes, because I dislike
storing category and items in the same array -- if it were me, I might even
go a step further and store them in a struct instead of a hash.
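
The "build it in memory, then tell YAML to convert it" step might look roughly like the sketch below. This is not the GitHub example mentioned above; the hash layout with 'category' and 'items' keys is an assumption made for illustration.

```ruby
require 'yaml'

# Hypothetical in-memory layout: one hash per category, with string keys.
data = [
  { 'category' => 'cat1', 'items' => ['item1', 'item2', 'item3'] },
  { 'category' => 'cat2', 'items' => ['item1'] },
  { 'category' => 'cat3', 'items' => ['item1', 'item2', 'item3', 'item4'] },
]

yaml_text = YAML.dump(data)       # serialize; this string is what you would store in a file
restored  = YAML.load(yaml_text)  # parse it back into plain arrays and hashes

puts yaml_text
```

Reading the file back later is then a single YAML.load call, with no custom parsing of category lines.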

···

On Thu, Jan 13, 2011 at 2:19 PM, Simon Harrison <simon@simonharrison.net> wrote:

Thanks for reply, Anurag. Can't get this to work though:

irb(main):001:0> lines = File.readlines('test.txt')
=> ["category: cat1\n", " item1\n", " item2\n", " item3\n", "category:
cat2\n", " item1\n", "category: cat3\n", " item1\n", " item2\n", "
item3\n", " item4\n", "\n"]
irb(main):002:0> puts lines
category: cat1
item1
item2
item3
category: cat2
item1
category: cat3
item1
item2
item3
item4

=> nil
irb(main):003:0> result = []
=> []
irb(main):004:0> lines.each_slice(2) { |i, j|
result.push(i.sub(/category: /, '')); b.push(*j.split(', '))}
NameError: undefined local variable or method `b' for main:Object
       from (irb):4:in `block in irb_binding'
       from (irb):4:in `each'
       from (irb):4:in `each_slice'
       from (irb):4
       from /usr/local/bin/irb:12:in `<main>'
irb(main):005:0> lines.each_slice(2) { |i, j|
result.push(i.sub(/category: /, '')); j.push(*j.split(', '))}
NoMethodError: undefined method `push' for " item1\n":String
       from (irb):5:in `block in irb_binding'
       from (irb):5:in `each'
       from (irb):5:in `each_slice'
       from (irb):5
       from /usr/local/bin/irb:12:in `<main>'

--
Posted via http://www.ruby-forum.com/.

Hi, Simon. Okay, well, if we assume that all data will be nested below a
category, that a category is denoted by "category: ", that there isn't leading
or trailing whitespace, and that the data under a category is given one item
per line, then this should work with your data format.

# goal format for the data, as given in the original post
goal = [
  ["cat1", "item1", "item2", "item3"],
  ["cat2", "item1"],
  ["cat3", "item1", "item2", "item3", "item4"]
]

categories = Array.new
File.foreach "test.txt" do |line|
  if line =~ /^category:/
    categories << [ line.sub(/^category: /,'').chomp ]
  else
    categories.last << line.strip
  end
end

goal == categories # => true

puts File.read('test.txt')
# >> category: cat1
# >> item1
# >> item2
# >> item3
# >> category: cat2
# >> item1
# >> category: cat3
# >> item1
# >> item2
# >> item3
# >> item4
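
A variant of the loop above that can be tried without test.txt, using the lines from the File.readlines output posted earlier in the thread. Skipping blank lines is an added assumption (the posted readlines output ends with a lone "\n"), and the variable names are illustrative.

```ruby
# Lines as returned by File.readlines in the earlier irb session,
# including the leading spaces and the trailing blank line.
lines = [
  "category: cat1\n", " item1\n", " item2\n", " item3\n",
  "category: cat2\n", " item1\n",
  "category: cat3\n", " item1\n", " item2\n", " item3\n", " item4\n",
  "\n",
]

categories = []
lines.each do |line|
  text = line.strip
  next if text.empty?  # assumption: blank lines carry no data

  if text =~ /\Acategory:/
    # start a new group whose first element is the category name
    categories << [text.sub(/\Acategory:\s*/, '')]
  else
    # raises NoMethodError if an item appears before any category line
    categories.last << text
  end
end

p categories
```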

···

On Fri, Jan 14, 2011 at 3:03 PM, Simon Harrison <simon@simonharrison.net> wrote:

Josh: Thanks for the reply and link to your github example. The thing
is, this data is coming from a text file. An export from an MS Access
database. I wouldn't choose to save in that format.

--
Posted via http://www.ruby-forum.com/.

That works great, thanks Josh. A couple of questions if you don't mind.

1. What is the purpose of the [] in line below? Does it mean collect
whatever matches into an array?

categories << [ line.sub(/^category: /,'').chomp ]

Yes, but not whatever matches. The call to #sub, with the second arg being
an empty string, says to remove "category: " from the string if it is at
the beginning, and the chomp removes the newline. So if line is
"category: cat1\n", then line.sub(/^category: /,'').chomp will return "cat1".
The [] then wraps that string in a new Array, which gets appended to categories.
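
Step by step, that line can be sketched as follows. The sample line and the names name and categories are just for illustration.

```ruby
line = "category: cat1\n"

# sub strips the "category: " prefix; chomp drops the trailing newline.
name = line.sub(/^category: /, '').chomp

categories = []
categories << [name]        # [] creates a new one-element array for this category
categories.last << "item1"  # later items are appended to that same inner array

p name
p categories
```

Without the [] the string itself would be pushed, and categories.last << "item1" would append text to the string instead of adding a new element.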

2. I've tried to achieve the same result using an existing array, rather
than reading from the file and I'm stuck. I'm using JRuby 1.6RC1 and
getting this error about NilClass. Any ideas?

irb(main):038:0> arr2 = []
irb(main):039:0> arr
=> [["cat1", "1", "2", "3"], ["cat2", "1", "2"], ["cat3", "1", "2"]]

irb(main):040:0> arr.map do |item|
irb(main):041:1* if item =~ /^cat/
irb(main):042:2> arr2 << [ item ]
irb(main):043:2> else
irb(main):044:2* arr2.last << item
irb(main):045:2> end
irb(main):046:1> end

You are right on, here, just getting confused about your data format, again.
Your code will work correctly if arr is an array of the lines of your file,
such as you would get with File.readlines.

In other words, in your irb example,
arr is [["cat1", "1", "2", "3"], ["cat2", "1", "2"], ["cat3", "1", "2"]]

but in mine, it was read in straight from the file, so it would be
["cat1", "1", "2", "3", "cat2", "1", "2", "cat3", "1", "2"]

If you fix that, it will work correctly.
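
A minimal sketch of that fix, assuming arr is the nested array from the irb session: flatten it back into a flat list first, then run the same loop.

```ruby
# The nested array from the irb session.
arr = [["cat1", "1", "2", "3"], ["cat2", "1", "2"], ["cat3", "1", "2"]]

# Each element of arr is an Array, not a String, so the /^cat/ test never
# matches. Flattening gives the flat list of strings the loop expects.
flat = arr.flatten

arr2 = []
flat.each do |item|
  if item =~ /^cat/
    arr2 << [item]      # start a new group
  else
    arr2.last << item   # add to the most recent group
  end
end

p flat
p arr2
```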

As a side note, you are calling arr.map (see the Enumerable documentation),
but what you really mean is arr.each (see the Array documentation).
It isn't harming anything, but it is misleading, because map implies you are
trying to create a new array by collecting the results of the block for
each element, when really you are just trying to iterate.
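
The difference is easy to see in a small sketch (the words array is made up for illustration):

```ruby
words = ["cat1", "cat2"]

# map collects the block's return values into a new array.
upcased = words.map { |w| w.upcase }

# each discards the block's return values and returns the receiver itself.
returned = words.each { |w| w.upcase }

p upcased
p returned.equal?(words)
```

So when the block is only there for its side effects, such as appending to arr2, each states the intent more accurately.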

···

On Sat, Jan 15, 2011 at 8:10 AM, Simon Harrison <simon@simonharrison.net> wrote: