[ANN] csvhuman v0.1 - read tabular data in the CSV Humanitarian eXchange Language (HXL) format


(Gerald Bauer) #1

Hello,

  I've put together a first version of the csvhuman library / gem [1]
  that adds support for the Humanitarian eXchange Language (HXL) [2]
  to ruby and lets you read tabular data in the comma-separated
  values (CSV) format with HXL hashtags.

   Questions and comments welcome.

[1] https://github.com/csvreader/csvhuman
[2] https://github.com/csvspecs/csv-hxl

  Cheers. Prost.

PS:

Usage

Pass in an array of arrays (or a stream that responds to `#each` and
yields arrays of strings). Example:

pp CsvHuman.parse( [[ "Organisation", "Cluster", "Province" ],   ## or use HXL.parse
                    [ "#org", "#sector", "#adm1" ],
                    [ "Org A", "WASH", "Coastal Province" ],
                    [ "Org B", "Health", "Mountain Province" ],
                    [ "Org C", "Education", "Coastal Province" ],
                    [ "Org A", "WASH", "Plains Province" ]] )

resulting in:

[{"org" => "Org A", "sector" => "WASH",      "adm1" => "Coastal Province"},
 {"org" => "Org B", "sector" => "Health",    "adm1" => "Mountain Province"},
 {"org" => "Org C", "sector" => "Education", "adm1" => "Coastal Province"},
 {"org" => "Org A", "sector" => "WASH",      "adm1" => "Plains Province"}]
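
To make the example above easier to follow, here is a rough sketch in plain ruby of the idea behind it: find the first row whose cells look like hashtags, skip everything above it, and use the tags (without the leading `#`) as keys for the rows that follow. The helper names `hashtag_row_index` and `simple_records` are made up for this illustration and are not part of csvhuman; the real reader may work differently.

```ruby
# Illustration only: hypothetical helpers, NOT csvhuman's implementation.

# Return the index of the first row in which every non-empty cell
# starts with "#", or nil if there is no such row.
def hashtag_row_index(rows)
  rows.index do |row|
    cells = row.map { |cell| cell.to_s.strip }.reject(&:empty?)
    !cells.empty? && cells.all? { |cell| cell.start_with?("#") }
  end
end

# Turn the rows after the hashtag row into hashes keyed by tag.
# Rows above the hashtag row (human-readable headers) are ignored.
def simple_records(rows)
  idx = hashtag_row_index(rows)
  return [] if idx.nil?

  keys = rows[idx].map { |cell| cell.to_s.strip.sub(/\A#/, "") }
  rows.drop(idx + 1).map { |row| keys.zip(row).to_h }
end

rows = [[ "Organisation", "Cluster", "Province" ],
        [ "#org", "#sector", "#adm1" ],
        [ "Org A", "WASH", "Coastal Province" ]]

p simple_records(rows)
```

This simple version assumes every column carries exactly one unique tag; repeated or missing tags need extra handling.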

Or pass in the text. Example:

pp CsvHuman.parse( <<TXT )      ## or use HXL.parse
  What,,,Who,Where,For whom,
  Record,Sector/Cluster,Subsector,Organisation,Country,Males,Females,Subregion
  ,#sector+en,#subsector,#org,#country,#sex+#targeted,#sex+#targeted,#adm1
  001,WASH,Subsector 1,Org 1,Country 1,100,100,Region 1
  002,Health,Subsector 2,Org 2,Country 2,,,Region 2
  003,Education,Subsector 3,Org 3,Country 2,250,300,Region 3
  004,WASH,Subsector 4,Org 1,Country 3,80,95,Region 4
TXT

resulting in:

[{"sector+en"    => "WASH",
  "subsector"    => "Subsector 1",
  "org"          => "Org 1",
  "country"      => "Country 1",
  "sex+targeted" => ["100", "100"],
  "adm1"         => "Region 1"},
 {"sector+en"    => "Health",
  "subsector"    => "Subsector 2",
  "org"          => "Org 2",
  "country"      => "Country 2",
  "sex+targeted" => ["", ""],
  "adm1"         => "Region 2"},
 {"sector+en"    => "Education",
  "subsector"    => "Subsector 3",
  "org"          => "Org 3",
  "country"      => "Country 2",
  "sex+targeted" => ["250", "300"],
  "adm1"         => "Region 3"},
 {"sector+en"    => "WASH",
  "subsector"    => "Subsector 4",
  "org"          => "Org 1",
  "country"      => "Country 3",
  "sex+targeted" => ["80", "95"],
  "adm1"         => "Region 4"}]
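
Three things stand out in the result above: the first column has no hashtag and is dropped, `#sex+#targeted` shows up as the key `sex+targeted`, and the two columns sharing that tag are collected into one array. A small sketch in plain ruby of how such a mapping could be done follows. `normalize_tag` and `build_record` are hypothetical names used only here, and the normalization rule (strip every `#` and all whitespace, downcase) is an assumption chosen to match the output shown, not the rule csvhuman necessarily applies.

```ruby
# Illustration only: hypothetical helpers, NOT csvhuman's implementation.

# Assumed normalization: drop all "#" characters and whitespace, downcase.
def normalize_tag(cell)
  cell.to_s.downcase.gsub("#", "").gsub(/\s+/, "")
end

# Build one record (hash) from a hashtag row and a data row.
#  - columns without a tag are skipped
#  - columns whose tag appears more than once are collected into an array
def build_record(tags, row)
  keys   = tags.map { |tag| normalize_tag(tag) }
  record = {}

  keys.each_with_index do |key, i|
    next if key.empty?                    # untagged column

    value = row[i].to_s
    if keys.count(key) > 1
      (record[key] ||= []) << value       # repeated tag => array of values
    else
      record[key] = value
    end
  end

  record
end

tags = ["", "#sector+en", "#subsector", "#org", "#country",
        "#sex+#targeted", "#sex+#targeted", "#adm1"]
row  = ["002", "Health", "Subsector 2", "Org 2", "Country 2", "", "", "Region 2"]

p build_record(tags, row)
```

Note that empty cells under a repeated tag are kept as empty strings, so the array always has one entry per column.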

More ways to use the reader:

csv = CsvHuman.new( recs )
csv.each do |rec|
  pp rec
end

pp csv.read

CsvHuman.parse( recs ).each do |rec|
  pp rec
end

pp CsvHuman.read( "./test.csv" )

CsvHuman.foreach( "./test.csv" ) do |rec|
  pp rec
end

#...

or use the `HXL` alias:

hxl = HXL.new( recs )
hxl.each do |rec|
  pp rec
end

pp hxl.read

HXL.parse( recs ).each do |rec|
  pp rec
end

pp HXL.read( "./test.csv" )

HXL.foreach( "./test.csv" ) do |rec|
  pp rec
end

#...

Note: More aliases for `CsvHuman`, `HXL`? Yes, you can use
`CsvHum`, `CSV_HXL`, `CSVHXL` too.
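
For readers new to ruby: aliases like these are commonly nothing more than extra constants pointing at the same class object. The sketch below shows the general technique with a made-up stand-in class (`TabularReader` and `TR` are hypothetical names); it is not taken from the csvhuman source.

```ruby
# Illustration only: a hypothetical stand-in class, NOT csvhuman itself.
class TabularReader
  def self.parse(rows)
    rows.to_a
  end
end

# A class alias is just a second constant bound to the same object.
TR = TabularReader

p TR.equal?(TabularReader)    # same object, not a copy
p TR.parse([["a", "b"]])      # class methods are reachable via either name
```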