newbieQ: new array from old array w regex

Hi,

As an exercise to learn to use Ruby, I am trying to search large xml files
to extract just the zip codes from the author address nodes in the file:

require "rexml/document"
include REXML
doc = Document.new File.new("JPS.xml")
mylist2=mylist=[]
doc.elements.each("RECORDS/RECORD/AUTHOR_ADDRESS") {|element| mylist.push element.text}
puts mylist.each {|m| /b\d{5}-\d{4}\b|\b\d{5}\b/.match(m)}

This only gives me the entire contents of each address node. How do I
extract only the zip codes (preferably putting them in an array)? My next
step is to end up with a hash table of unique zip codes and their frequency
of occurrence...

Thanks

CLS

Hi,

At Fri, 10 Jun 2005 13:15:28 +0900,
Charles L. Snyder wrote in [ruby-talk:145032]:

This only gives me the entire contents of the address node - how do I
extract only the zip codes (preferably putting them in an array). My next
step is to end up with a hash table of unique zipcodes and their frequency
of occurrence...

Array#each just returns the receiver itself and ignores the block's
result, so puts prints every address unchanged. Try grep instead:

   puts mylist.grep(/b\d{5}-\d{4}\b|\b\d{5}\b/)
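To make the difference concrete, here is a small sketch. The `addresses` array is made-up sample data standing in for the AUTHOR_ADDRESS texts (it is not taken from JPS.xml), and the regex is written with the leading `\b` that seems to have been intended.

```ruby
# Hypothetical sample data; the real list would come from the XML file.
addresses = [
  "Dept. of Physics, Springfield, IL 62704-1234 USA",
  "Inst. of Chemistry, Kyoto, Japan",
  "Univ. Hospital, Boston, MA 02115 USA"
]
zip = /\b\d{5}-\d{4}\b|\b\d{5}\b/

# each discards the block's value and returns the receiver itself.
each_result = addresses.each { |a| zip.match(a) }

# grep returns the elements that match, still as whole strings.
grep_result = addresses.grep(zip)
```

So `grep` filters out addresses without a zip code, but each surviving element is still the full address string.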

--
Nobu Nakada

nobuyoshi nakada wrote:

Enumerable#each just returns the receiver itself.

   puts mylist.grep(/b\d{5}-\d{4}\b|\b\d{5}\b/)

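You still get the complete string, because grep returns the matching
elements rather than the matched text. This one might work better:

puts mylist.inject([]) {|res,e| res << $& if /\b\d{5}(?:-\d{4})?\b/ =~ e; res}

(inject needs [] as its initial value; without it the first address
string becomes the accumulator.)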
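As an alternative sketch for the stated goal (an array of zip codes, then a hash of frequencies), `String#scan` can collect every match per address. The sample `addresses` below are invented for illustration, and `flat_map` requires a Ruby newer than the one current when this thread was written; on an older Ruby, `map { ... }.flatten` should give the same result.

```ruby
# Hypothetical sample data; the real list would come from the XML file.
addresses = [
  "Dept. of Physics, Springfield, IL 62704-1234 USA",
  "Univ. Hospital, Boston, MA 02115 USA",
  "Medical School, Boston, MA 02115",
  "Inst. of Chemistry, Kyoto, Japan"
]
zip = /\b\d{5}(?:-\d{4})?\b/

# scan returns all matches in one string (the group is non-capturing,
# so each match comes back as a plain string); flat_map merges the
# per-address arrays into one flat array.
zips = addresses.flat_map { |a| a.scan(zip) }

# A hash with a default of 0 counts how often each zip code occurs.
freq = Hash.new(0)
zips.each { |z| freq[z] += 1 }
```

Unlike the inject version, this picks up more than one zip code per address if several are present. Note that the pattern matches any standalone five-digit number, so real address data may need a stricter rule.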

Also, could it be that the leading backslash for the word boundary was
missing in the original regex? It starts with /b\d{5} where /\b\d{5} was
probably intended.
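A quick sketch of what the missing backslash changes, using invented test strings: without it, the first alternative demands a literal letter "b" in front of the digits.

```ruby
broken = /b\d{5}-\d{4}\b|\b\d{5}\b/   # as posted: literal "b"
fixed  = /\b\d{5}-\d{4}\b|\b\d{5}\b/  # with the word boundary

addr = "Springfield, IL 62704-1234"   # made-up address

# The broken first alternative cannot match here, so the regex falls
# back to the plain five-digit alternative and drops the +4 part.
broken_match = broken.match(addr)[0]
fixed_match  = fixed.match(addr)[0]

# A literal "b" directly before the digits satisfies the broken regex,
# while the fixed one finds no word boundary there and does not match.
odd = "Lab12345-6789"
broken_odd = broken.match(odd)
fixed_odd  = fixed.match(odd)
```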

Kind regards

    robert