csvrecord - read in comma-separated values (csv) records with typed structs / schemas


(Gerald Bauer) #1

Hello,

  I've put together a new library / gem called csvrecord [1] that
lets you read in comma-separated values (csv) records
with typed structs / schemas. Example:

beer.csv:

Brewery,City,Name,Abv
Andechser Klosterbrauerei,Andechs,Doppelbock Dunkel,7%
Augustiner Bräu München,München,Edelstoff,5.6%
Bayerische Staatsbrauerei Weihenstephan,Freising,Hefe Weissbier,5.4%
Brauerei Spezial,Bamberg,Rauchbier Märzen,5.1%
Hacker-Pschorr Bräu,München,Münchner Dunkel,5.0%
Staatliches Hofbräuhaus München,München,Hofbräu Oktoberfestbier,6.3%

Step 1: Define a (typed) struct for the comma-separated values (csv)
records. Example:

   require 'csvrecord'

   Beer = CsvRecord.define do
     field :brewery ## note: default type is :string
     field :city
     field :name
     field :abv, Float ## allows type specified as class (or use :float)
   end

   # or in "classic" style:

   class Beer < CsvRecord::Base
     field :brewery
     field :city
     field :name
     field :abv, Float
   end

Step 2: Read in the comma-separated values (csv) datafile. Example:

   beers = Beer.read( 'beer.csv' ).to_a

   puts "#{beers.size} beers:"
   pp beers

pretty prints (pp):

6 beers:
[#<Beer:0x302c760
    @abv = 7.0,
    @brewery = "Andechser Klosterbrauerei",
    @city = "Andechs",
    @name = "Doppelbock Dunkel">,
#<Beer:0x3026fe8
    @abv = 5.6,
    @brewery = "Augustiner Br\u00E4u M\u00FCnchen",
    @city = "M\u00FCnchen",
    @name = "Edelstoff">,
...
]

Or loop over the records. Example:

   Beer.read( 'beer.csv' ).each do |rec|
     puts "#{rec.name} (#{rec.abv}%) by #{rec.brewery}, #{rec.city}"
   end

printing:

Doppelbock Dunkel (7.0%) by Andechser Klosterbrauerei, Andechs
Edelstoff (5.6%) by Augustiner Bräu München, München
Hefe Weissbier (5.4%) by Bayerische Staatsbrauerei Weihenstephan, Freising
Rauchbier Märzen (5.1%) by Brauerei Spezial, Bamberg
Münchner Dunkel (5.0%) by Hacker-Pschorr Bräu, München
Hofbräu Oktoberfestbier (6.3%) by Staatliches Hofbräuhaus München, München

Or create new records from scratch. Example:

   beer = Beer.new( brewery: 'Andechser Klosterbrauerei',
                    city: 'Andechs',
                    name: 'Doppelbock Dunkel' )

And so on and so forth.
Happy hacking and data wrangling with ruby. Cheers. Prost.

[1] https://github.com/csv11/csvrecord


(botp) #2

cool. how does this differ from ruby csv's converters?
would probably be great if CsvRecord::Base could inherit from
ActiveRecord::Base, so one could treat csv like AR models too, no?

kind regards
--botp


On Mon, Aug 13, 2018 at 2:07 AM, Gerald Bauer <gerald.bauer@gmail.com> wrote:

   class Beer < CsvRecord::Base
     field :brewery
     field :city
     field :name
     field :abv, Float
   end


(Gerald Bauer) #3

Hello,

how does this [CsvRecord class] differ from ruby csv's converters?

   CsvRecord.read differs from CSV.read in that it always returns typed
structs, so you always need a schema with data types.
CSV.read "just" returns an array of arrays of strings (or, with
headers: true, a CSV::Table of CSV::Row) with untyped values, thus:

    require 'csv'

    rows = CSV.read( "beer.csv", headers: true )   # headers: true allows lookup by column name
    pp rows[0]['Abv']
    # => "7%" # is a string

    # vs

    beers = Beer.read( "beer.csv" ).to_a
    pp beers[0].abv
    # => 7.0 # is a float as specified in the schema (field/column definition)
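
   For comparison, here is a rough sketch of what the csv standard
library's converters look like for the same data. It uses only the
stdlib; the abv_to_float lambda is a name made up for this example
and is not part of csvrecord. The point is that the conversion lives
in a per-call option rather than in a reusable schema:

```ruby
require 'csv'

# Two rows from beer.csv, inlined so the sketch is self-contained.
data = <<~CSV
  Brewery,City,Name,Abv
  Andechser Klosterbrauerei,Andechs,Doppelbock Dunkel,7%
  Augustiner Bräu München,München,Edelstoff,5.6%
CSV

# Hypothetical custom converter: turn only the Abv column into a Float.
# A two-argument converter receives the raw value plus a field info
# object that carries the column header.
abv_to_float = ->(value, info) { info.header == 'Abv' ? value.to_f : value }

table = CSV.parse( data, headers: true, converters: [abv_to_float] )

first_abv  = table[0]['Abv']    # converted by abv_to_float
first_city = table[0]['City']   # left untouched, still a String
```

The rows are still CSV::Row objects accessed by header string, not
structs with attribute readers.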

   Note: CsvRecord uses CSV.read "under the hood", so it's a kind of
"typed structs" wrapper with a lot of convenience methods (values,
parse, to_h, to_csv, field_names, field_types, etc.)
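
   To illustrate the "typed structs wrapper" idea only, here is a
minimal sketch built on the stdlib. This is not how csvrecord is
actually implemented; BeerRow, SCHEMA and read_beers are hypothetical
names invented for the example, and the header lookup simply assumes
that column names are the capitalized field names:

```ruby
require 'csv'

# Hypothetical stand-in for a csvrecord class: a plain Struct.
BeerRow = Struct.new( :brewery, :city, :name, :abv, keyword_init: true )

# Hypothetical schema: field name => conversion applied to the raw string.
SCHEMA = {
  brewery: ->(v) { v },
  city:    ->(v) { v },
  name:    ->(v) { v },
  abv:     ->(v) { v.to_f },   # "5.1%" becomes 5.1
}

# Parse csv text with the stdlib, then map every row to a typed struct.
def read_beers( text )
  CSV.parse( text, headers: true ).map do |row|
    values = SCHEMA.map do |field, convert|
      # assumes the column header is the capitalized field name
      [field, convert.call( row[field.to_s.capitalize] )]
    end
    BeerRow.new( **values.to_h )
  end
end

data = <<~CSV
  Brewery,City,Name,Abv
  Brauerei Spezial,Bamberg,Rauchbier Märzen,5.1%
CSV

beers = read_beers( data )
first = beers[0]
```

Here first.abv is a Float and first.city a String, which is roughly
the kind of result the schema definition buys you.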

[CsvRecord::Base] may inherit from ActiveRecord::Base, so one could treat csv like AR models too, no?

   Good point. Again they are different. ActiveRecord has its own
database schema / attributes. Using csvpack [1] (the tabular data
package) you can, however, for your convenience auto-generate
ActiveRecord classes and migrations (tables) from the tabular
datapackage schema (JSON Schema). That was kind of the start of the
exercise :-) for CsvRecord.

   To sum up - use CsvRecord for comma-separated values (csv) data
imports or data "wrangling"
   and use ActiveRecord for SQL queries / analysis and more. In the
good old unix tradition - they work together but each has its own
(limited / focused) purpose.

    Cheers. Prost.

[1] https://github.com/csv11/csvpack


2018-08-13 5:35 GMT+02:00 botp <botpena@gmail.com>:

cool. how does this differ from ruby csv's converters?
would probably be great if CsvRecord::Base could inherit from
ActiveRecord::Base, so one could treat csv like AR models too, no?

kind regards
--botp
